production SAL

2024-05-14
06:35	<marostegui@cumin1002>	START - Cookbook sre.hosts.reimage for host db2185.codfw.wmnet with OS bookworm	[production]
06:33	<marostegui@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db2185.codfw.wmnet with OS bookworm	[production]
06:33	<marostegui@cumin1002>	START - Cookbook sre.hosts.reimage for host db2185.codfw.wmnet with OS bookworm	[production]
05:31	<kart_>	Updated cxserver to 2024-04-23-221507-production (T363263, T333969, T360303, T360310)	[production]
05:25	<kartik@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/cxserver: apply	[production]
05:24	<kartik@deploy1002>	helmfile [eqiad] START helmfile.d/services/cxserver: apply	[production]
05:22	<kartik@deploy1002>	helmfile [codfw] DONE helmfile.d/services/cxserver: apply	[production]
05:22	<kartik@deploy1002>	helmfile [codfw] START helmfile.d/services/cxserver: apply	[production]
05:19	<kartik@deploy1002>	helmfile [staging] DONE helmfile.d/services/cxserver: apply	[production]
05:19	<kartik@deploy1002>	helmfile [staging] START helmfile.d/services/cxserver: apply	[production]
05:15	<kart_>	Updated MinT to 2024-03-28-061726-production (T333969)	[production]
05:08	<kartik@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply	[production]
04:59	<kartik@deploy1002>	helmfile [eqiad] START helmfile.d/services/machinetranslation: apply	[production]
04:33	<kartik@deploy1002>	helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply	[production]
04:25	<kartik@deploy1002>	helmfile [codfw] START helmfile.d/services/machinetranslation: apply	[production]
04:18	<kartik@deploy1002>	helmfile [staging] DONE helmfile.d/services/machinetranslation: apply	[production]
04:14	<kartik@deploy1002>	helmfile [staging] START helmfile.d/services/machinetranslation: apply	[production]
04:00	<mwpresync@deploy1002>	Finished scap: testwikis wikis to 1.43.0-wmf.5 refs T361399 (duration: 57m 45s)	[production]
03:02	<mwpresync@deploy1002>	Started scap: testwikis wikis to 1.43.0-wmf.5 refs T361399	[production]
02:34	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance	[production]
02:34	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance	[production]
02:33	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1170 (T352010)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240514-023316-ladsgroup.json	[production]
02:18	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P62375 and previous config saved to /var/cache/conftool/dbconfig/20240514-021809-ladsgroup.json	[production]
02:03	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P62374 and previous config saved to /var/cache/conftool/dbconfig/20240514-020301-ladsgroup.json	[production]
01:47	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1170 (T352010)', diff saved to https://phabricator.wikimedia.org/P62373 and previous config saved to /var/cache/conftool/dbconfig/20240514-014753-ladsgroup.json	[production]
01:18	<ejegg>	fundraising civicrm upgraded from c854dd3a to c7b0dfbb	[production]
00:35	<tstarling@deploy1002>	Finished scap: Fix SecurePoll exception T209892 and CodeMirror 5 RTL T363752 (duration: 14m 56s)	[production]
00:20	<tstarling@deploy1002>	Started scap: Fix SecurePoll exception T209892 and CodeMirror 5 RTL T363752	[production]
00:19	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db2152 (T364299)', diff saved to https://phabricator.wikimedia.org/P62372 and previous config saved to /var/cache/conftool/dbconfig/20240514-001956-marostegui.json	[production]
00:19	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance	[production]
00:19	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance	[production]
2024-05-13
22:55	<bking@cumin2002>	conftool action : set/weight=10:pooled=yes; selector: name=elastic110[5|7]\.eqiad\.wmnet	[production]
22:43	<ryankemper@cumin2002>	END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975	[production]
22:30	<zabe>	zabe@mwmaint1002:~$ mwscript cleanupTitles.php itwikivoyage # T298315	[production]
22:27	<bking@cumin2002>	conftool action : set/weight=10:pooled=no; selector: name=elastic110[5|7]\.eqiad\.wmnet	[production]
21:47	<ryankemper@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975	[production]
21:46	<ryankemper@cumin2002>	END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.	[production]
21:39	<ryankemper@cumin2002>	START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.	[production]
21:39	<eileen>	civicrm upgraded from 447e1472 to c854dd3a	[production]
21:32	<ryankemper@cumin2002>	END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch	[production]
21:32	<ebernhardson@deploy1002>	Finished scap: Backport for [[gerrit:1030983|Unbreak link buttons (T364062)]] (duration: 22m 00s)	[production]
21:22	<ryankemper@cumin2002>	START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch	[production]
21:20	<ebernhardson@deploy1002>	jdlrobson and ebernhardson: Continuing with sync	[production]
21:12	<ebernhardson@deploy1002>	jdlrobson and ebernhardson: Backport for [[gerrit:1030983|Unbreak link buttons (T364062)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
21:10	<ebernhardson@deploy1002>	Started scap: Backport for [[gerrit:1030983|Unbreak link buttons (T364062)]]	[production]
20:57	<ebernhardson@deploy1002>	Finished scap: Backport for [[gerrit:1017152|IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884)]] (duration: 17m 22s)	[production]
20:45	<ebernhardson@deploy1002>	ebernhardson and tchanders: Continuing with sync	[production]
20:42	<ebernhardson@deploy1002>	ebernhardson and tchanders: Backport for [[gerrit:1017152|IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
20:40	<ebernhardson@deploy1002>	Started scap: Backport for [[gerrit:1017152|IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884)]]	[production]
20:38	<ebernhardson@deploy1002>	Finished scap: Backport for [[gerrit:1014626|Remove old CampaignEvents DB config (prod) (T348281)]] (duration: 21m 14s)	[production]