production SAL

2023-01-10
14:28	<jayme@cumin1001>	START - Cookbook sre.ganeti.reimage for host kubestagetcd2001.codfw.wmnet with OS bullseye	[production]
14:26	<jiji@cumin1001>	START - Cookbook sre.dns.netbox	[production]
14:25	<zabe@deploy1002>	Finished scap: Backport for [[gerrit:877268|[config]: GDI Safety Survey Wave 4 (T325136)]] (duration: 17m 42s)	[production]
14:21	<bking@cumin1001>	START - Cookbook sre.hosts.reboot-single for host apifeatureusage2001.codfw.wmnet	[production]
14:19	<claime>	Pausing reboots of eqiad appservers for deployments	[production]
14:18	<cgoubert@cumin1001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw[1369-1372].eqiad.wmnet	[production]
14:18	<cgoubert@cumin1001>	START - Cookbook sre.hosts.remove-downtime for mw[1369-1372].eqiad.wmnet	[production]
14:14	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apifeatureusage1001.eqiad.wmnet	[production]
14:11	<jiji@cumin1001>	START - Cookbook sre.hosts.decommission for hosts mc2036.codfw.wmnet	[production]
14:10	<cgoubert@cumin1001>	END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97)	[production]
14:09	<zabe@deploy1002>	zabe and essexigyan: Backport for [[gerrit:877268|[config]: GDI Safety Survey Wave 4 (T325136)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet	[production]
14:07	<zabe@deploy1002>	Started scap: Backport for [[gerrit:877268|[config]: GDI Safety Survey Wave 4 (T325136)]]	[production]
14:07	<jayme@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: Reinitialize staging-codfw with k8s 1.23	[production]
14:06	<bking@cumin1001>	START - Cookbook sre.hosts.reboot-single for host apifeatureusage1001.eqiad.wmnet	[production]
14:06	<jayme@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 6 hosts with reason: Reinitialize staging-codfw with k8s 1.23	[production]
14:03	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc2035.codfw.wmnet	[production]
14:03	<jiji@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
14:03	<jiji@cumin1001>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc2035.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1001"	[production]
13:49	<jiji@cumin1001>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc2035.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1001"	[production]
13:46	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1002.eqiad.wmnet with OS bullseye	[production]
13:46	<btullis@cumin1001>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1001"	[production]
13:46	<jiji@cumin1001>	START - Cookbook sre.dns.netbox	[production]
13:44	<godog>	delete grafana dashboards from "sre dashboards for deletion" folder - T178690	[production]
13:43	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2049.codfw.wmnet	[production]
13:37	<jiji@cumin1001>	START - Cookbook sre.hosts.decommission for hosts mc2035.codfw.wmnet	[production]
13:36	<jiji@cumin1001>	START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet	[production]
13:34	<btullis@cumin1001>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1001"	[production]
13:26	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2001.wikimedia.org	[production]
13:22	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host irc2001.wikimedia.org	[production]
13:19	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1002.eqiad.wmnet with reason: host reimage	[production]
13:16	<btullis@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1002.eqiad.wmnet with reason: host reimage	[production]
13:08	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetdb-test2001.codfw.wmnet	[production]
13:08	<jmm@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
13:08	<jmm@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetdb-test2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"	[production]
12:59	<btullis@cumin1001>	START - Cookbook sre.hosts.reimage for host cephosd1002.eqiad.wmnet with OS bullseye	[production]
12:59	<btullis@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cephosd1002.eqiad.wmnet with OS bullseye	[production]
12:56	<jmm@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetdb-test2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"	[production]
12:53	<jmm@cumin2002>	START - Cookbook sre.dns.netbox	[production]
12:50	<cgoubert@cumin1001>	START - Cookbook sre.hosts.reboot-cluster	[production]
12:50	<cgoubert@cumin1001>	END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97)	[production]
12:50	<cgoubert@cumin1001>	START - Cookbook sre.hosts.reboot-cluster	[production]
12:50	<jmm@cumin2002>	START - Cookbook sre.hosts.decommission for hosts puppetdb-test2001.codfw.wmnet	[production]
12:49	<claime>	Starting rolling reboot of eqiad appservers	[production]
12:47	<btullis@cumin1001>	END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes	[production]
12:36	<btullis@cumin1001>	START - Cookbook sre.hosts.reimage for host cephosd1002.eqiad.wmnet with OS bullseye	[production]
12:34	<btullis@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cephosd1002.eqiad.wmnet with OS bullseye	[production]
12:31	<oblivian@deploy1002>	helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync	[production]
12:31	<oblivian@deploy1002>	helmfile [codfw] [canary] DONE helmfile.d/services/mw-jobrunner : sync	[production]
12:31	<oblivian@deploy1002>	helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync	[production]
12:31	<oblivian@deploy1002>	helmfile [codfw] [canary] START helmfile.d/services/mw-jobrunner : sync	[production]