production SAL

2023-06-13
12:45	<akosiaris@deploy1002>	helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply	[production]
12:45	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance	[production]
12:45	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance	[production]
12:45	<akosiaris@deploy1002>	helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
12:44	<akosiaris@deploy1002>	helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply	[production]
12:44	<akosiaris@deploy1002>	helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
12:44	<akosiaris@deploy1002>	helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply	[production]
12:35	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1225.eqiad.wmnet with reason: Maintenance	[production]
12:35	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1225.eqiad.wmnet with reason: Maintenance	[production]
12:31	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P49419 and previous config saved to /var/cache/conftool/dbconfig/20230613-123117-ladsgroup.json	[production]
12:29	<fabfur@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4049.ulsfo.wmnet	[production]
12:28	<fabfur@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4041.ulsfo.wmnet	[production]
12:26	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1223.eqiad.wmnet with reason: Maintenance	[production]
12:25	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1223.eqiad.wmnet with reason: Maintenance	[production]
12:18	<fabfur>	reboot cp4041 and cp4049 for kernel upgrade (T335835)	[production]
12:18	<fabfur@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cp4041.ulsfo.wmnet	[production]
12:18	<fabfur@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cp4049.ulsfo.wmnet	[production]
12:16	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P49418 and previous config saved to /var/cache/conftool/dbconfig/20230613-121611-ladsgroup.json	[production]
12:15	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
12:15	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
12:15	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1212.eqiad.wmnet with reason: Maintenance	[production]
12:15	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1212.eqiad.wmnet with reason: Maintenance	[production]
12:09	<hashar>	Restarted Zuul CI due to T309376	[production]
12:06	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1198.eqiad.wmnet with reason: Maintenance	[production]
12:05	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1198.eqiad.wmnet with reason: Maintenance	[production]
11:56	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1189.eqiad.wmnet with reason: Maintenance	[production]
11:56	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1189.eqiad.wmnet with reason: Maintenance	[production]
11:46	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1175.eqiad.wmnet with reason: Maintenance	[production]
11:46	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1175.eqiad.wmnet with reason: Maintenance	[production]
11:45	<Amir1>	cat wikis_having_stubs | xargs -I {} bash -c 'echo {}; touch /home/ladsgroup/{}.undo.sql; chmod 777 /home/ladsgroup/{}.undo.sql; mwscript maintenance/storage/moveToExternal.php --wiki={} --end 200000000 --undo /home/ladsgroup/{}.undo.sql DB cluster26' (T299387)	[production]
11:43	<fabfur@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4048.ulsfo.wmnet	[production]
11:42	<fabfur@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4040.ulsfo.wmnet	[production]
11:41	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1019,lvs2013} and A:lvs (T329049)	[production]
11:40	<hnowlan@cumin1001>	START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1019,lvs2013} and A:lvs (T329049)	[production]
11:37	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1020,lvs2014} and A:lvs (T329049)	[production]
11:37	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1166.eqiad.wmnet with reason: Maintenance	[production]
11:37	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1166.eqiad.wmnet with reason: Maintenance	[production]
11:36	<ladsgroup@deploy1002>	Finished scap: Backport for [[gerrit:929648|moveToExternal: Also check for utf8 encoding before trying to convert]] (duration: 09m 59s)	[production]
11:35	<hnowlan@cumin1001>	START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1020,lvs2014} and A:lvs (T329049)	[production]
11:32	<fabfur>	reboot cp4040 and cp4048 for kernel upgrade (T335835)	[production]
11:32	<fabfur@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cp4040.ulsfo.wmnet	[production]
11:32	<fabfur@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cp4048.ulsfo.wmnet	[production]
11:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P49417 and previous config saved to /var/cache/conftool/dbconfig/20230613-113111-root.json	[production]
11:28	<ladsgroup@deploy1002>	ladsgroup: Backport for [[gerrit:929648|moveToExternal: Also check for utf8 encoding before trying to convert]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet	[production]
11:27	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1150.eqiad.wmnet with reason: Maintenance	[production]
11:27	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1150.eqiad.wmnet with reason: Maintenance	[production]
11:26	<ladsgroup@deploy1002>	Started scap: Backport for [[gerrit:929648|moveToExternal: Also check for utf8 encoding before trying to convert]]	[production]
11:26	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2118.codfw.wmnet with reason: Maintenance	[production]
11:26	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db2118.codfw.wmnet with reason: Maintenance	[production]
11:26	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1181.eqiad.wmnet with reason: Maintenance	[production]