production SAL

2023-01-25
20:00	<brett@cumin1001>	conftool action : set/pooled=yes; selector: name=cp6011.drmrs.wmnet	[production]
19:58	<brett@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6011.drmrs.wmnet with OS bullseye	[production]
19:52	<denisse@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host centrallog1002.eqiad.wmnet with OS bullseye	[production]
19:38	<denisse@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on centrallog1002.eqiad.wmnet with reason: host reimage	[production]
19:36	<brett@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage	[production]
19:33	<denisse@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on centrallog1002.eqiad.wmnet with reason: host reimage	[production]
19:33	<brett@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage	[production]
19:21	<denisse@cumin1001>	START - Cookbook sre.hosts.reimage for host centrallog1002.eqiad.wmnet with OS bullseye	[production]
19:17	<brennen@deploy1002>	Synchronized php: group1 wikis to 1.40.0-wmf.20 refs T325583 (duration: 07m 04s)	[production]
19:12	<brett@cumin1001>	START - Cookbook sre.hosts.reimage for host cp6011.drmrs.wmnet with OS bullseye	[production]
19:10	<brennen@deploy1002>	rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.20 refs T325583	[production]
19:06	<brett@cumin1001>	conftool action : set/pooled=yes; selector: name=cp6002.drmrs.wmnet	[production]
19:01	<brennen>	1.40.0-wmf.20 train (T325583): no blockers, rolling to group1.	[production]
19:00	<denisse@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host centrallog1002.eqiad.wmnet with OS bullseye	[production]
19:00	<denisse@cumin1001>	START - Cookbook sre.hosts.reimage for host centrallog1002.eqiad.wmnet with OS bullseye	[production]
18:59	<brett@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6002.drmrs.wmnet with OS bullseye	[production]
18:37	<brett@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage	[production]
18:35	<hnowlan@deploy1002>	helmfile [codfw] DONE helmfile.d/services/thumbor: apply	[production]
18:34	<brett@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage	[production]
18:33	<hnowlan@deploy1002>	helmfile [codfw] START helmfile.d/services/thumbor: apply	[production]
18:33	<hnowlan@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/thumbor: apply	[production]
18:32	<hnowlan@deploy1002>	helmfile [eqiad] START helmfile.d/services/thumbor: apply	[production]
18:14	<brett@cumin1001>	START - Cookbook sre.hosts.reimage for host cp6002.drmrs.wmnet with OS bullseye	[production]
18:11	<hnowlan@deploy1002>	helmfile [staging] DONE helmfile.d/services/thumbor: apply	[production]
18:11	<hnowlan@deploy1002>	helmfile [staging] START helmfile.d/services/thumbor: apply	[production]
18:11	<hnowlan@deploy1002>	helmfile [staging] DONE helmfile.d/services/thumbor: apply	[production]
18:10	<hnowlan@deploy1002>	helmfile [staging] START helmfile.d/services/thumbor: apply	[production]
18:05	<brett@cumin1001>	conftool action : set/pooled=yes; selector: name=cp6010.drmrs.wmnet	[production]
17:58	<brett@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6010.drmrs.wmnet with OS bullseye	[production]
17:32	<mutante>	removing racktables.wikimedia.org from DNS - that's it for this ancient service T327405	[production]
16:57	<sukhe@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=cp2031.codfw.wmnet,service=ats-be	[production]
16:57	<sukhe@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=cp2031.codfw.wmnet,service=cdn	[production]
16:51	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2031.codfw.wmnet with OS bullseye	[production]
16:50	<btullis@cumin1001>	START - Cookbook sre.kafka.reboot-workers for Kafka jumbo-eqiad cluster: Reboot kafka nodes	[production]
16:46	<brett@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage	[production]
16:43	<brett@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage	[production]
16:34	<sukhe@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet,service=ats-be	[production]
16:34	<sukhe@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet,service=cdn	[production]
16:33	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS bullseye	[production]
16:32	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2031.codfw.wmnet with reason: host reimage	[production]
16:28	<sukhe@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on cp2031.codfw.wmnet with reason: host reimage	[production]
16:24	<brett@cumin1001>	START - Cookbook sre.hosts.reimage for host cp6010.drmrs.wmnet with OS bullseye	[production]
16:14	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1002.eqiad.wmnet	[production]
16:11	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage	[production]
16:09	<sukhe@cumin2002>	START - Cookbook sre.hosts.reimage for host cp2031.codfw.wmnet with OS bullseye	[production]
16:08	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host an-conf1002.eqiad.wmnet	[production]
16:08	<sukhe@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage	[production]
16:04	<btullis@cumin1001>	START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster	[production]
16:03	<btullis@deploy1002>	helmfile [staging] DONE helmfile.d/services/datahub: sync on main	[production]
15:56	<sukhe@cumin2002>	END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2031']	[production]