production SAL

2022-04-27
15:45	<nokafor@deploy1002>	Started deploy [airflow-dags/analytics@6684963]: (no justification provided)	[production]
15:44	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P26723 and previous config saved to /var/cache/conftool/dbconfig/20220427-154449-ladsgroup.json	[production]
15:39	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host thanos-be2004.codfw.wmnet	[production]
15:36	<kevinbazira@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .	[production]
15:35	<kevinbazira@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .	[production]
15:33	<kevinbazira@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .	[production]
15:31	<kevinbazira@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .	[production]
15:29	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1138 (T306560)', diff saved to https://phabricator.wikimedia.org/P26722 and previous config saved to /var/cache/conftool/dbconfig/20220427-152944-ladsgroup.json	[production]
15:02	<moritzm>	installing mariadb-10.5 updates (as packaged in Debian Bullseye, unrelated to wmf-mariadb packages)	[production]
14:57	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2003.codfw.wmnet	[production]
14:54	<razzi@cumin1001>	conftool action : set/pooled=inactive; selector: service=cloudceph,name=cloudcephmon1003.eqiad.wmnet	[production]
14:54	<razzi@cumin1001>	conftool action : set/pooled=no; selector: service=cloudceph,name=cloudcephmon1003.eqiad.wmnet	[production]
14:53	<razzi@cumin1001>	conftool action : set/pooled=yes; selector: service=cloudceph,name=cloudcephmon1003.eqiad.wmnet	[production]
14:47	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host thanos-be2003.codfw.wmnet	[production]
14:47	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
14:46	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
14:46	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
14:46	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
14:43	<ladsgroup@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:786976|Enable videojs in eswiki (T303785 T248418)]] (duration: 00m 51s)	[production]
14:42	<moritzm>	imported cas 6.4.6.3 to apt.wikimedia.org	[production]
14:27	<kevinbazira@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .	[production]
14:23	<kevinbazira@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .	[production]
14:22	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2002.codfw.wmnet	[production]
14:21	<kevinbazira@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .	[production]
14:12	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1138 (T306560)', diff saved to https://phabricator.wikimedia.org/P26720 and previous config saved to /var/cache/conftool/dbconfig/20220427-141215-ladsgroup.json	[production]
14:12	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance	[production]
14:12	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance	[production]
14:09	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host thanos-be2002.codfw.wmnet	[production]
14:08	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1100.eqiad.wmnet with reason: Maintenance	[production]
14:08	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 10:00:00 on db1100.eqiad.wmnet with reason: Maintenance	[production]
14:07	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1173.eqiad.wmnet with reason: Maintenance	[production]
14:07	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 10:00:00 on db1173.eqiad.wmnet with reason: Maintenance	[production]
14:07	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus6001.drmrs.wmnet	[production]
14:03	<klausman@deploy1002>	helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.	[production]
14:03	<klausman@deploy1002>	helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.	[production]
14:00	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host prometheus6001.drmrs.wmnet	[production]
13:59	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4001.ulsfo.wmnet	[production]
13:54	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host prometheus4001.ulsfo.wmnet	[production]
13:53	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5001.eqsin.wmnet	[production]
13:51	<jayme@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
13:47	<jayme@cumin1001>	START - Cookbook sre.dns.netbox	[production]
13:46	<moritzm>	rebalance ganeti-test after adding new bullseye node T306499	[production]
13:46	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host prometheus5001.eqsin.wmnet	[production]
13:45	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3001.esams.wmnet	[production]
13:43	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1122 (T306560)', diff saved to https://phabricator.wikimedia.org/P26719 and previous config saved to /var/cache/conftool/dbconfig/20220427-134308-ladsgroup.json	[production]
13:40	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host prometheus3001.esams.wmnet	[production]
13:37	<mvernon@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2040.codfw.wmnet with OS bullseye	[production]
13:36	<jmm@cumin2002>	END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti-test2001.codfw.wmnet to ganeti-test01.svc.codfw.wmnet	[production]
13:35	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298556)', diff saved to https://phabricator.wikimedia.org/P26718 and previous config saved to /var/cache/conftool/dbconfig/20220427-133537-ladsgroup.json	[production]
13:35	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti-test2001.codfw.wmnet to ganeti-test01.svc.codfw.wmnet	[production]