production SAL

2024-02-26
23:00	<jclark@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1040.eqiad.wmnet with reason: host reimage	[production]
22:57	<btullis@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage	[production]
22:55	<jclark@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on es1040.eqiad.wmnet with reason: host reimage	[production]
22:54	<btullis@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage	[production]
22:46	<ryankemper@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade - ryankemper@cumin2002 - T356651	[production]
22:45	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Depooling db2149 (T357189)', diff saved to https://phabricator.wikimedia.org/P57963 and previous config saved to /var/cache/conftool/dbconfig/20240226-224557-arnaudb.json	[production]
22:45	<arnaudb@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance	[production]
22:45	<arnaudb@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance	[production]
22:45	<jclark@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage	[production]
22:42	<jclark@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage	[production]
22:42	<TimStarling>	on snapshot1010 killed PHP processes left over from kill -9 of python parents T358458	[production]
22:42	<btullis@cumin1002>	START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bookworm	[production]
22:41	<btullis@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-redacteddb1001.eqiad.wmnet with OS bookworm	[production]
22:38	<jclark@cumin1002>	START - Cookbook sre.hosts.reimage for host es1040.eqiad.wmnet with OS bookworm	[production]
22:29	<ryankemper@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: cloudelastic restart	[production]
22:28	<ryankemper@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: cloudelastic restart	[production]
22:27	<jclark@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1035.eqiad.wmnet with reason: host reimage	[production]
22:25	<arnaudb@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance	[production]
22:24	<arnaudb@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance	[production]
22:24	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2109 (T357189)', diff saved to https://phabricator.wikimedia.org/P57962 and previous config saved to /var/cache/conftool/dbconfig/20240226-222435-arnaudb.json	[production]
22:24	<jclark@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on es1035.eqiad.wmnet with reason: host reimage	[production]
22:20	<jclark@cumin1002>	START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS bookworm	[production]
22:18	<ryankemper@cumin2002>	END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.UPGRADE (2 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade - ryankemper@cumin2002 - T356651	[production]
22:15	<jclark@cumin1002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es1036.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
22:14	<jclark@cumin1002>	START - Cookbook sre.hosts.provision for host es1036.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
22:09	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P57961 and previous config saved to /var/cache/conftool/dbconfig/20240226-220928-arnaudb.json	[production]
22:06	<ryankemper@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (2 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade - ryankemper@cumin2002 - T356651	[production]
22:02	<jdrewniak@deploy2002>	Synchronized portals: Wikimedia Portals Update: [[gerrit:1006579| Bumping portals to master (T128546)]] (duration: 08m 37s)	[production]
21:56	<jclark@cumin1002>	START - Cookbook sre.hosts.reimage for host es1035.eqiad.wmnet with OS bookworm	[production]
21:54	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P57960 and previous config saved to /var/cache/conftool/dbconfig/20240226-215422-arnaudb.json	[production]
21:54	<jdrewniak@deploy2002>	Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:1006579| Bumping portals to master (T128546)]] (duration: 08m 26s)	[production]
21:39	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2109 (T357189)', diff saved to https://phabricator.wikimedia.org/P57959 and previous config saved to /var/cache/conftool/dbconfig/20240226-213916-arnaudb.json	[production]
21:38	<cjming@deploy2002>	Finished scap: Backport for [[gerrit:1006312|Fix regression in WebM transcodes breaking audio (T358342)]] (duration: 11m 14s)	[production]
21:30	<cjming@deploy2002>	cjming and bvibber: Continuing with sync	[production]
21:29	<cjming@deploy2002>	cjming and bvibber: Backport for [[gerrit:1006312|Fix regression in WebM transcodes breaking audio (T358342)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
21:27	<cjming@deploy2002>	Started scap: Backport for [[gerrit:1006312|Fix regression in WebM transcodes breaking audio (T358342)]]	[production]
21:22	<dzahn@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1004.eqiad.wmnet with OS bullseye	[production]
21:16	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Depooling db2109 (T357189)', diff saved to https://phabricator.wikimedia.org/P57958 and previous config saved to /var/cache/conftool/dbconfig/20240226-211619-arnaudb.json	[production]
21:16	<arnaudb@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance	[production]
21:16	<arnaudb@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance	[production]
21:15	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2105 (T357189)', diff saved to https://phabricator.wikimedia.org/P57957 and previous config saved to /var/cache/conftool/dbconfig/20240226-211557-arnaudb.json	[production]
21:10	<dzahn@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1004.eqiad.wmnet with reason: host reimage	[production]
21:07	<dzahn@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on contint1004.eqiad.wmnet with reason: host reimage	[production]
21:02	<ebernhardson@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:02	<ebernhardson@deploy2002>	helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:00	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P57956 and previous config saved to /var/cache/conftool/dbconfig/20240226-210050-arnaudb.json	[production]
20:58	<dzahn@cumin1002>	START - Cookbook sre.hosts.reimage for host contint1004.eqiad.wmnet with OS bullseye	[production]
20:58	<dzahn@cumin1002>	END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host contint1004.eqiad.wmnet	[production]
20:57	<dzahn@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint1004.eqiad.wmnet with OS bullseye	[production]
20:52	<jclark@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1040.eqiad.wmnet with OS bookworm	[production]