production SAL

2023-05-17
15:30	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1032.eqiad.wmnet with reason: Maintenance	[production]
15:30	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling es2032 (T335845)', diff saved to https://phabricator.wikimedia.org/P48347 and previous config saved to /var/cache/conftool/dbconfig/20230517-153010-ladsgroup.json	[production]
15:30	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance es1027 (T335845)', diff saved to https://phabricator.wikimedia.org/P48346 and previous config saved to /var/cache/conftool/dbconfig/20230517-153004-ladsgroup.json	[production]
15:30	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2032.codfw.wmnet with reason: Maintenance	[production]
15:29	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2032.codfw.wmnet with reason: Maintenance	[production]
15:29	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance es2028 (T335845)', diff saved to https://phabricator.wikimedia.org/P48345 and previous config saved to /var/cache/conftool/dbconfig/20230517-152945-ladsgroup.json	[production]
15:29	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2002.wikimedia.org	[production]
15:25	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host irc2002.wikimedia.org	[production]
15:18	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1002.wikimedia.org	[production]
15:14	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance es1027', diff saved to https://phabricator.wikimedia.org/P48344 and previous config saved to /var/cache/conftool/dbconfig/20230517-151458-ladsgroup.json	[production]
15:14	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host irc1002.wikimedia.org	[production]
15:14	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance es2028', diff saved to https://phabricator.wikimedia.org/P48343 and previous config saved to /var/cache/conftool/dbconfig/20230517-151438-ladsgroup.json	[production]
15:07	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet	[production]
15:07	<aikochou@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .	[production]
15:01	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet	[production]
14:59	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance es1027', diff saved to https://phabricator.wikimedia.org/P48342 and previous config saved to /var/cache/conftool/dbconfig/20230517-145952-ladsgroup.json	[production]
14:59	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance es2028', diff saved to https://phabricator.wikimedia.org/P48341 and previous config saved to /var/cache/conftool/dbconfig/20230517-145932-ladsgroup.json	[production]
14:48	<jmm@cumin2002>	END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on P{aqs101[6-9]*} and A:aqs	[production]
14:44	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance es1027 (T335845)', diff saved to https://phabricator.wikimedia.org/P48340 and previous config saved to /var/cache/conftool/dbconfig/20230517-144446-ladsgroup.json	[production]
14:44	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance es2028 (T335845)', diff saved to https://phabricator.wikimedia.org/P48339 and previous config saved to /var/cache/conftool/dbconfig/20230517-144425-ladsgroup.json	[production]
14:40	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling es2028 (T335845)', diff saved to https://phabricator.wikimedia.org/P48338 and previous config saved to /var/cache/conftool/dbconfig/20230517-144025-ladsgroup.json	[production]
14:40	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2028.codfw.wmnet with reason: Maintenance	[production]
14:40	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2028.codfw.wmnet with reason: Maintenance	[production]
14:39	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling es1027 (T335845)', diff saved to https://phabricator.wikimedia.org/P48337 and previous config saved to /var/cache/conftool/dbconfig/20230517-143949-ladsgroup.json	[production]
14:39	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1027.eqiad.wmnet with reason: Maintenance	[production]
14:39	<otto@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: wgEventStreams - EventBus: produce to mediawiki.page_change.v1 stream - T336817 (duration: 06m 20s)	[production]
14:39	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1027.eqiad.wmnet with reason: Maintenance	[production]
14:38	<btullis@cumin1001>	END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker	[production]
14:36	<moritzm>	installing jackson-databind security updates	[production]
14:34	<xcollazo@deploy1002>	Finished deploy [airflow-dags/platform_eng@ad1cc7c]: deploying hotfix for T336800 (duration: 00m 09s)	[production]
14:34	<xcollazo@deploy1002>	Started deploy [airflow-dags/platform_eng@ad1cc7c]: deploying hotfix for T336800	[production]
14:33	<ottomata>	EventBus: produce to mediawiki.page_change.v1 stream - T336817	[production]
14:30	<otto@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync	[production]
14:30	<otto@deploy1002>	helmfile [eqiad] START helmfile.d/services/eventgate-main: sync	[production]
14:28	<otto@deploy1002>	helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync	[production]
14:28	<otto@deploy1002>	helmfile [codfw] START helmfile.d/services/eventgate-main: sync	[production]
14:27	<otto@deploy1002>	helmfile [staging] DONE helmfile.d/services/eventgate-main: sync	[production]
14:27	<otto@deploy1002>	helmfile [staging] START helmfile.d/services/eventgate-main: sync	[production]
14:27	<ottomata>	rolling restart of eventgate-main to pick up new mediawiki.page_change.v1 stream config - T336817	[production]
14:17	<elukey>	run authdns-update for new ml-serve/ores discovery endpoints - T336726	[production]
14:15	<jmm@cumin2002>	START - Cookbook sre.aqs.roll-restart-reboot rolling reboot on P{aqs101[6-9]*} and A:aqs	[production]
14:15	<jmm@cumin2002>	END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on P{aqs101[2-5]*} and A:aqs	[production]
14:14	<otto@deploy1002>	Synchronized wmf-config/ext-EventStreamConfig.php: wgEventStreams - Declare mediawiki.page_change.v1 stream - T336817 (duration: 07m 30s)	[production]
14:10	<bking@deploy1002>	helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply	[production]
14:09	<bking@deploy1002>	helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply	[production]
14:09	<bking@deploy1002>	helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply	[production]
14:08	<bking@deploy1002>	helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply	[production]
14:07	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1101.eqiad.wmnet	[production]
13:59	<taavi@deploy1002>	Finished scap: Backport for [[gerrit:920582|Define $maintClass in maintenance script for compatibility (T317375)]] (duration: 07m 24s)	[production]
13:59	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host an-worker1101.eqiad.wmnet	[production]