production SAL

2022-12-20 §
14:17	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage	[production]
14:16	<moritzm>	installing jackson-databind security updates	[production]
14:14	<btullis@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage	[production]
14:10	<moritzm>	installing ruby-rails-html-sanitizer security updates	[production]
13:04	<aikochou@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .	[production]
12:58	<aikochou@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .	[production]
12:13	<btullis@cumin1001>	START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS bullseye	[production]
11:31	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1001.wikimedia.org	[production]
11:26	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host lists1001.wikimedia.org	[production]
11:16	<moritzm>	installing apache2 security updates on Buster	[production]
11:03	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host archiva1002.wikimedia.org	[production]
10:59	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host archiva1002.wikimedia.org	[production]
10:55	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1002.eqiad.wmnet	[production]
10:51	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host matomo1002.eqiad.wmnet	[production]
10:42	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-web1001.eqiad.wmnet	[production]
10:35	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host an-web1001.eqiad.wmnet	[production]
10:34	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-tool1009.eqiad.wmnet	[production]
10:29	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host an-tool1009.eqiad.wmnet	[production]
10:29	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-tool1008.eqiad.wmnet	[production]
10:25	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host an-tool1008.eqiad.wmnet	[production]
10:24	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-tool1010.eqiad.wmnet	[production]
10:19	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host an-tool1010.eqiad.wmnet	[production]
10:16	<moritzm>	rebalance ganeti cluster in ulsfo after adding new node and decom of the old hardware T317247	[production]
10:06	<oblivian@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mw-web: apply	[production]
10:06	<oblivian@deploy1002>	helmfile [eqiad] START helmfile.d/services/mw-web: apply	[production]
10:05	<oblivian@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-web: apply	[production]
10:05	<oblivian@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-web: apply	[production]
09:48	<jmm@cumin2002>	END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4007.ulsfo.wmnet to cluster ulsfo and group 1	[production]
09:47	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti4007.ulsfo.wmnet to cluster ulsfo and group 1	[production]
08:45	<jmm@cumin2002>	END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4007.ulsfo.wmnet to cluster ulsfo and group 1	[production]
08:45	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti4007.ulsfo.wmnet to cluster ulsfo and group 1	[production]
08:40	<jmm@cumin2002>	END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4007.ulsfo.wmnet to cluster ulsfo and group 1	[production]
08:40	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti4007.ulsfo.wmnet to cluster ulsfo and group 1	[production]
08:38	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet	[production]
08:32	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet	[production]
04:10	<bking@cumin1001>	END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0)	[production]
03:56	<bking@cumin1001>	START - Cookbook sre.wdqs.reboot	[production]
02:02	<bking@cumin1001>	END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)	[production]
01:50	<bking@cumin1001>	END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0)	[production]
00:40	<bking@cumin1001>	START - Cookbook sre.wdqs.reboot	[production]
00:38	<bking@cumin1001>	START - Cookbook sre.wdqs.reboot	[production]
00:27	<bking@cumin1001>	END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0)	[production]
2022-12-19 §
23:50	<bking@cumin1001>	START - Cookbook sre.wdqs.reboot	[production]
23:32	<ryankemper@puppetmaster1001>	conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2009.*	[production]
23:32	<ryankemper@puppetmaster1001>	conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2010.*	[production]
23:32	<ryankemper@puppetmaster1001>	conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2011.*	[production]
23:32	<ryankemper>	[WDQS] Temporarily removing wdqs20[09-12] from pybal; these are new hosts that aren't ready for service until data reload has completed (long-running process). In meantime, remove these so they don't factor into pybal's depool threshold	[production]
23:30	<ryankemper@puppetmaster1001>	conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2012.*	[production]
23:30	<bking@cumin1001>	END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0)	[production]
23:07	<bking@cumin1001>	END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0)	[production]