production SAL

2023-06-13
16:25	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1156.eqiad.wmnet with reason: Maintenance	[production]
16:25	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1156.eqiad.wmnet with reason: Maintenance	[production]
16:25	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1146.eqiad.wmnet with reason: Maintenance	[production]
16:24	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1146.eqiad.wmnet with reason: Maintenance	[production]
16:24	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1139.eqiad.wmnet with reason: Maintenance	[production]
16:24	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1139.eqiad.wmnet with reason: Maintenance	[production]
16:24	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1129.eqiad.wmnet with reason: Maintenance	[production]
16:24	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1129.eqiad.wmnet with reason: Maintenance	[production]
16:19	<jiji@deploy1002>	helmfile [staging] DONE helmfile.d/services/ipoid: apply	[production]
16:19	<jiji@deploy1002>	helmfile [staging] START helmfile.d/services/ipoid: apply	[production]
16:13	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS buster	[production]
16:07	<jclark@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye	[production]
16:02	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2105.codfw.wmnet with reason: Maintenance	[production]
16:02	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db2105.codfw.wmnet with reason: Maintenance	[production]
15:55	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage	[production]
15:52	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1157.eqiad.wmnet with reason: Maintenance	[production]
15:52	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db1157.eqiad.wmnet with reason: Maintenance	[production]
15:51	<pt1979@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage	[production]
15:45	<SandraEbele>	Deployed refinery-source using jenkins	[production]
15:34	<jhancock@cumin2002>	END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-worker1149']	[production]
15:34	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS buster	[production]
15:34	<jhancock@cumin2002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-worker1149']	[production]
15:28	<akosiaris@deploy1002>	helmfile [codfw] DONE helmfile.d/services/recommendation-api: sync	[production]
15:28	<akosiaris@deploy1002>	helmfile [codfw] START helmfile.d/services/recommendation-api: sync	[production]
15:28	<jhancock@cumin2002>	END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['an-worker1149']	[production]
15:28	<akosiaris@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/recommendation-api: sync	[production]
15:28	<akosiaris@deploy1002>	helmfile [eqiad] START helmfile.d/services/recommendation-api: sync	[production]
15:27	<jmm@cumin2002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host snapshot1016.eqiad.wmnet with OS buster	[production]
15:21	<jhancock@cumin2002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-worker1149']	[production]
15:21	<jhancock@cumin2002>	END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['an-worker1149']	[production]
15:20	<jhancock@cumin2002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-worker1149']	[production]
15:16	<akosiaris@deploy1002>	helmfile [staging] DONE helmfile.d/services/recommendation-api: sync	[production]
15:15	<akosiaris@deploy1002>	helmfile [staging] START helmfile.d/services/recommendation-api: sync	[production]
15:14	<SandraEbele>	deploying refinery source	[production]
15:14	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2177.codfw.wmnet with reason: Maintenance	[production]
15:14	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db2177.codfw.wmnet with reason: Maintenance	[production]
15:14	<jclark@cumin1001>	START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye	[production]
15:02	<otto@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply	[production]
15:01	<otto@deploy1002>	helmfile [eqiad] START helmfile.d/services/eventgate-main: apply	[production]
15:00	<elukey>	run kafka re-assign partitions for eqiad.change-prop.transcludes.resource-change on kafka-main1001 - T338357	[production]
14:59	<otto@deploy1002>	helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply	[production]
14:58	<otto@deploy1002>	helmfile [codfw] START helmfile.d/services/eventgate-main: apply	[production]
14:58	<otto@deploy1002>	helmfile [staging] DONE helmfile.d/services/eventgate-main: apply	[production]
14:57	<otto@deploy1002>	helmfile [staging] START helmfile.d/services/eventgate-main: apply	[production]
14:57	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS buster	[production]
14:54	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: Maintenance	[production]
14:54	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: Maintenance	[production]
14:54	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2156.codfw.wmnet with reason: Maintenance	[production]
14:54	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on db2156.codfw.wmnet with reason: Maintenance	[production]
14:47	<aikochou@deploy1002>	helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .	[production]