production SAL

2021-03-21
10:25	<_joe_>	restarting gerrit on gerrit1001, using 45G of reserved memory	[production]
09:22	<elukey>	install apache2-bin-dbgsym on gerrit1001 - T277127	[production]
08:50	<qchris>	Restarting apache on gerrit1001 again (all apache workers busy again) see T277127	[production]
08:18	<qchris>	Restarting apache on gerrit1001 (all apache workers busy)	[production]
2021-03-20
00:22	<tzatziki>	altering emails for STei (WMF) and SGrabarczuk (WMF)	[production]
2021-03-19
21:11	<mutante>	scandium - stop apache and rerun puppet which fails after reimaging because it tries to run an nginx on port 80 which is already used by apache T268248	[production]
20:31	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on scandium.eqiad.wmnet with reason: REIMAGE	[production]
20:29	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on scandium.eqiad.wmnet with reason: REIMAGE	[production]
20:15	<mutante>	scandium - reimaging with buster	[production]
20:14	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on scandium.eqiad.wmnet with reason: reimage	[production]
20:14	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on scandium.eqiad.wmnet with reason: reimage	[production]
20:11	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2245.codfw.wmnet	[production]
19:55	<dzahn@cumin1001>	START - Cookbook sre.hosts.decommission for hosts mw2245.codfw.wmnet	[production]
19:53	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2244.codfw.wmnet	[production]
19:53	<legoktm@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host lists1002.wikimedia.org	[production]
19:50	<mutante>	testreduce1001 - confirmed MariaDB @@datadir is /srv/data/mysql and deleting /var/lib/mysql (T277580)	[production]
19:40	<dzahn@cumin1001>	START - Cookbook sre.hosts.decommission for hosts mw2244.codfw.wmnet	[production]
19:39	<dzahn@cumin1001>	conftool action : set/pooled=inactive; selector: name=mw2245.codfw.wmnet	[production]
19:39	<legoktm@cumin1001>	START - Cookbook sre.ganeti.makevm for new host lists1002.wikimedia.org	[production]
19:39	<dzahn@cumin1001>	conftool action : set/pooled=inactive; selector: name=mw2244.codfw.wmnet	[production]
19:37	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2252.codfw.wmnet,service=canary	[production]
19:37	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2251.codfw.wmnet,service=canary	[production]
19:33	<dzahn@cumin1001>	conftool action : set/weight=1; selector: name=mw2252.codfw.wmnet,service=canary	[production]
19:33	<dzahn@cumin1001>	conftool action : set/weight=1; selector: name=mw2251.codfw.wmnet,service=canary	[production]
19:24	<mutante>	deploy2002 - re-enabled puppet, reverted patch of scap-sync-master	[production]
18:46	<mutante>	deploy2002 - disable puppet, copy modified version of scap-master-sync over it that does not --exclude="*/cache/l10n/.cdb" (for T275826)	[production]
16:01	<effie>	upgrade memcached on mc-gp200*	[production]
12:36	<klausman@cumin2001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE	[production]
12:34	<klausman@cumin2001>	START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE	[production]
12:10	<effie>	upgrade memcached on mc1026,mc2026	[production]
11:37	<klausman@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
11:37	<klausman@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
11:36	<klausman@deploy1002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.	[production]
11:36	<klausman@deploy1002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.	[production]
11:30	<klausman@deploy1002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.	[production]
11:29	<klausman@deploy1002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.	[production]
11:29	<klausman@deploy1002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.	[production]
11:29	<klausman@deploy1002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.	[production]
11:29	<klausman@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
11:29	<klausman@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
11:27	<akosiaris@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
11:27	<akosiaris@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
11:20	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE	[production]
11:18	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE	[production]
10:45	<klausman@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
10:45	<klausman@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
10:42	<moritzm>	installing dbmonitor1002 T224589	[production]
10:42	<klausman@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
10:42	<klausman@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
10:41	<klausman@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]