production SAL

2023-05-05
13:20	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P47766 and previous config saved to /var/cache/conftool/dbconfig/20230505-132050-ladsgroup.json	[production]
13:14	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cephosd1004.eqiad.wmnet	[production]
13:14	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1003.eqiad.wmnet	[production]
13:13	<andrewbogott>	rebooting cloudbackup2001.codfw.wmnet, unresponsive	[production]
13:05	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P47765 and previous config saved to /var/cache/conftool/dbconfig/20230505-130544-ladsgroup.json	[production]
13:05	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cephosd1003.eqiad.wmnet	[production]
13:05	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1002.eqiad.wmnet	[production]
12:57	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cephosd1002.eqiad.wmnet	[production]
12:56	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1001.eqiad.wmnet	[production]
12:50	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1188 (T335845)', diff saved to https://phabricator.wikimedia.org/P47764 and previous config saved to /var/cache/conftool/dbconfig/20230505-125038-ladsgroup.json	[production]
12:46	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host cephosd1001.eqiad.wmnet	[production]
12:44	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1188 (T335845)', diff saved to https://phabricator.wikimedia.org/P47763 and previous config saved to /var/cache/conftool/dbconfig/20230505-124412-ladsgroup.json	[production]
12:44	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance	[production]
12:43	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance	[production]
12:43	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1182 (T335845)', diff saved to https://phabricator.wikimedia.org/P47762 and previous config saved to /var/cache/conftool/dbconfig/20230505-124349-ladsgroup.json	[production]
12:31	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet	[production]
12:28	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P47761 and previous config saved to /var/cache/conftool/dbconfig/20230505-122843-ladsgroup.json	[production]
12:24	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet	[production]
12:13	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P47760 and previous config saved to /var/cache/conftool/dbconfig/20230505-121336-ladsgroup.json	[production]
12:06	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1001.eqiad.wmnet	[production]
11:59	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host an-mariadb1001.eqiad.wmnet	[production]
11:58	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-db1002.eqiad.wmnet	[production]
11:58	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1182 (T335845)', diff saved to https://phabricator.wikimedia.org/P47759 and previous config saved to /var/cache/conftool/dbconfig/20230505-115830-ladsgroup.json	[production]
11:52	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host an-db1002.eqiad.wmnet	[production]
11:51	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1182 (T335845)', diff saved to https://phabricator.wikimedia.org/P47758 and previous config saved to /var/cache/conftool/dbconfig/20230505-115126-ladsgroup.json	[production]
11:51	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance	[production]
11:51	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance	[production]
11:26	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P47757 and previous config saved to /var/cache/conftool/dbconfig/20230505-112649-ladsgroup.json	[production]
11:26	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P47756 and previous config saved to /var/cache/conftool/dbconfig/20230505-112605-ladsgroup.json	[production]
11:11	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P47755 and previous config saved to /var/cache/conftool/dbconfig/20230505-111145-ladsgroup.json	[production]
11:11	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P47754 and previous config saved to /var/cache/conftool/dbconfig/20230505-111100-ladsgroup.json	[production]
10:56	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P47753 and previous config saved to /var/cache/conftool/dbconfig/20230505-105640-ladsgroup.json	[production]
10:55	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P47752 and previous config saved to /var/cache/conftool/dbconfig/20230505-105555-ladsgroup.json	[production]
10:41	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P47751 and previous config saved to /var/cache/conftool/dbconfig/20230505-104135-ladsgroup.json	[production]
10:41	<moritzm>	installing wireshark security updates	[production]
10:40	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P47750 and previous config saved to /var/cache/conftool/dbconfig/20230505-104050-ladsgroup.json	[production]
09:28	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1170.eqiad.wmnet with reason: Host sad (T336033)	[production]
09:28	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1170.eqiad.wmnet with reason: Host sad (T336033)	[production]
09:14	<Amir1>	power cycled db1170	[production]
09:10	<marostegui>	Failover m2-master from dbproxy1013 to dbproxy1015	[production]
09:08	<hnowlan@deploy1002>	Finished deploy [restbase/deploy@8aba801]: deploying to host missing from configs (duration: 01m 22s)	[production]
09:06	<hnowlan@deploy1002>	Started deploy [restbase/deploy@8aba801]: deploying to host missing from configs	[production]
08:58	<XioNoX>	deploy CR914772 on all hosts running Bird	[production]
08:15	<godog>	delete wal and chunks_head from prometheus5002 and prometheus4002 to let prometheus start back up and not crashloop - T309979	[production]
08:07	<jmm@cumin2002>	END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host netflow2003.codfw.wmnet with OS bookworm	[production]
08:05	<hashar@deploy1002>	Finished deploy [integration/docroot@78e6f40]: build: Updating eslint-config-wikimedia to 0.25.0 (duration: 00m 13s)	[production]
08:04	<hashar@deploy1002>	Started deploy [integration/docroot@78e6f40]: build: Updating eslint-config-wikimedia to 0.25.0	[production]
07:32	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 12:00:00 on db1106.eqiad.wmnet with reason: Maintenance	[production]
07:31	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 12 days, 12:00:00 on db1106.eqiad.wmnet with reason: Maintenance	[production]
07:31	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 12:00:00 on db1132.eqiad.wmnet with reason: Maintenance	[production]