production SAL

2023-11-09
19:47	<eevans@cumin1001>	START - Cookbook sre.cassandra.roll-restart for nodes matching aqs2012.codfw.wmnet: Applying JVM security upgrade - eevans@cumin1001	[production]
19:45	<urbanecm@deploy2002>	Finished scap: Backport for [[gerrit:973228|wikimaniawiki: Revert wordmark and tagline back (T350640)]] (duration: 07m 22s)	[production]
19:44	<volans@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1108.eqiad.wmnet with OS bullseye	[production]
19:43	<arnaudb@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P53241 and previous config saved to /var/cache/conftool/dbconfig/20231109-194357-arnaudb.json	[production]
19:43	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1027.eqiad.wmnet with OS bookworm	[production]
19:41	<ebernhardson@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:41	<ebernhardson@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:38	<volans@cumin1001>	START - Cookbook sre.hosts.reimage for host cp1108.eqiad.wmnet with OS bullseye	[production]
19:38	<urbanecm@deploy2002>	Started scap: Backport for [[gerrit:973228|wikimaniawiki: Revert wordmark and tagline back (T350640)]]	[production]
19:33	<urbanecm@deploy2002>	Finished scap: Backport for [[gerrit:973215|wikimaniawiki: Switch back to standard logo (T350640)]] (duration: 07m 11s)	[production]
19:33	<ebernhardson@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:33	<ebernhardson@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:32	<volans@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp1108.mgmt.eqiad.wmnet with reboot policy GRACEFUL	[production]
19:32	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1105.eqiad.wmnet with OS bullseye	[production]
19:32	<sukhe@cumin1001>	START - Cookbook sre.hosts.reimage for host cp1107.eqiad.wmnet with OS bullseye	[production]
19:32	<sukhe@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1107.eqiad.wmnet with OS bullseye	[production]
19:30	<volans@cumin1001>	START - Cookbook sre.hosts.provision for host cp1108.mgmt.eqiad.wmnet with reboot policy GRACEFUL	[production]
19:28	<arnaudb@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1231 (T348183)', diff saved to https://phabricator.wikimedia.org/P53240 and previous config saved to /var/cache/conftool/dbconfig/20231109-192850-arnaudb.json	[production]
19:28	<fnegri@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1060.eqiad.wmnet with reason: host reimage	[production]
19:28	<urbanecm@deploy2002>	urbanecm: Continuing with sync	[production]
19:28	<urbanecm@deploy2002>	urbanecm: Backport for [[gerrit:973215|wikimaniawiki: Switch back to standard logo (T350640)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
19:27	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1057.eqiad.wmnet with reason: host reimage	[production]
19:26	<urbanecm@deploy2002>	Started scap: Backport for [[gerrit:973215|wikimaniawiki: Switch back to standard logo (T350640)]]	[production]
19:26	<arnaudb@cumin1001>	dbctl commit (dc=all): 'Depooling db1231 (T348183)', diff saved to https://phabricator.wikimedia.org/P53239 and previous config saved to /var/cache/conftool/dbconfig/20231109-192621-arnaudb.json	[production]
19:26	<arnaudb@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance	[production]
19:26	<arnaudb@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance	[production]
19:25	<sukhe@cumin1001>	START - Cookbook sre.hosts.reimage for host cp1107.eqiad.wmnet with OS bullseye	[production]
19:25	<topranks>	shutting down et-1/1/5.2201 (sandbox1-a-codfw) interfaces on crX-codfw (T348159)	[production]
19:25	<sukhe@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1107.eqiad.wmnet with OS bullseye	[production]
19:24	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1059.eqiad.wmnet with reason: host reimage	[production]
19:24	<arnaudb@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance	[production]
19:24	<arnaudb@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance	[production]
19:24	<arnaudb@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1224 (T348183)', diff saved to https://phabricator.wikimedia.org/P53238 and previous config saved to /var/cache/conftool/dbconfig/20231109-192416-arnaudb.json	[production]
19:22	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1026.eqiad.wmnet with reason: host reimage	[production]
19:22	<fnegri@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage	[production]
19:20	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage	[production]
19:20	<sukhe@cumin1001>	START - Cookbook sre.hosts.reimage for host cp1107.eqiad.wmnet with OS bullseye	[production]
19:20	<sukhe@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1107.eqiad.wmnet with OS bullseye	[production]
19:19	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1027.eqiad.wmnet with reason: host reimage	[production]
19:18	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1059.eqiad.wmnet with reason: host reimage	[production]
19:18	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1060.eqiad.wmnet with reason: host reimage	[production]
19:18	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1057.eqiad.wmnet with reason: host reimage	[production]
19:17	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1026.eqiad.wmnet with reason: host reimage	[production]
19:16	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1027.eqiad.wmnet with reason: host reimage	[production]
19:15	<sukhe@cumin1001>	START - Cookbook sre.hosts.reimage for host cp1107.eqiad.wmnet with OS bullseye	[production]
19:15	<sukhe@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1107.eqiad.wmnet with OS bullseye	[production]
19:15	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1105.eqiad.wmnet with reason: host reimage	[production]
19:12	<otto@deploy2002>	helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply	[production]
19:11	<sukhe@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on cp1105.eqiad.wmnet with reason: host reimage	[production]
19:09	<arnaudb@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P53237 and previous config saved to /var/cache/conftool/dbconfig/20231109-190910-arnaudb.json	[production]