production SAL

2024-01-10
19:00	<topranks>	disabling OSPF connection from mr1-codfw to codfw core routers T348164	[production]
18:40	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply	[production]
18:38	<filippo@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade	[production]
18:37	<filippo@cumin1002>	START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade	[production]
18:37	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply	[production]
18:37	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply	[production]
18:36	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply	[production]
18:35	<filippo@cumin1002>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for prometheus2005.codfw.wmnet	[production]
18:35	<filippo@cumin1002>	START - Cookbook sre.hosts.remove-downtime for prometheus2005.codfw.wmnet	[production]
18:24	<sukhe>	stop pybal on lvs2013: T352758	[production]
17:59	<filippo@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade	[production]
17:58	<filippo@cumin1002>	START - Cookbook sre.hosts.downtime for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade	[production]
17:54	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye	[production]
17:47	<pfischer@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
17:46	<pfischer@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
17:44	<pfischer@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
17:44	<pfischer@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
17:40	<sukhe@cumin2002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet	[production]
17:34	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage	[production]
17:31	<kamila@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage	[production]
17:28	<sukhe@cumin2002>	START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet	[production]
17:27	<sukhe>	enable puppet on lvs2014: T352758	[production]
17:16	<kamila@cumin1002>	START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye	[production]
17:15	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye	[production]
17:14	<cmooney@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
17:14	<cmooney@cumin1002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"	[production]
17:14	<cmooney@cumin1002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"	[production]
17:09	<cmooney@cumin1002>	START - Cookbook sre.dns.netbox	[production]
16:55	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage	[production]
16:52	<kamila@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage	[production]
16:37	<kamila@cumin1002>	START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye	[production]
16:36	<godog>	upgrade prometheus on prometheus2006 - T354399	[production]
16:32	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply	[production]
16:32	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply	[production]
16:31	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply	[production]
16:31	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply	[production]
16:30	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply	[production]
16:29	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply	[production]
16:25	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot	[production]
16:25	<kamila@cumin1002>	START - Cookbook sre.hosts.downtime for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot	[production]
16:22	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye	[production]
16:20	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply	[production]
16:02	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage	[production]
16:00	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye	[production]
15:59	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye	[production]
15:57	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye	[production]
15:57	<kamila@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage	[production]
15:41	<jmm@cumin2002>	END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::data	[production]
15:41	<kamila@cumin1002>	START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye	[production]
15:40	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage	[production]