2023-05-17
15:30 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1032.eqiad.wmnet with reason: Maintenance [production]
15:30 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling es2032 (T335845)', diff saved to https://phabricator.wikimedia.org/P48347 and previous config saved to /var/cache/conftool/dbconfig/20230517-153010-ladsgroup.json [production]
15:30 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance es1027 (T335845)', diff saved to https://phabricator.wikimedia.org/P48346 and previous config saved to /var/cache/conftool/dbconfig/20230517-153004-ladsgroup.json [production]
15:30 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2032.codfw.wmnet with reason: Maintenance [production]
15:29 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2032.codfw.wmnet with reason: Maintenance [production]
15:29 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance es2028 (T335845)', diff saved to https://phabricator.wikimedia.org/P48345 and previous config saved to /var/cache/conftool/dbconfig/20230517-152945-ladsgroup.json [production]
15:29 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2002.wikimedia.org [production]
15:25 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host irc2002.wikimedia.org [production]
15:18 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1002.wikimedia.org [production]
15:14 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance es1027', diff saved to https://phabricator.wikimedia.org/P48344 and previous config saved to /var/cache/conftool/dbconfig/20230517-151458-ladsgroup.json [production]
15:14 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host irc1002.wikimedia.org [production]
15:14 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance es2028', diff saved to https://phabricator.wikimedia.org/P48343 and previous config saved to /var/cache/conftool/dbconfig/20230517-151438-ladsgroup.json [production]
15:07 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet [production]
15:07 <aikochou@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . [production]
15:01 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet [production]
14:59 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance es1027', diff saved to https://phabricator.wikimedia.org/P48342 and previous config saved to /var/cache/conftool/dbconfig/20230517-145952-ladsgroup.json [production]
14:59 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance es2028', diff saved to https://phabricator.wikimedia.org/P48341 and previous config saved to /var/cache/conftool/dbconfig/20230517-145932-ladsgroup.json [production]
14:48 <jmm@cumin2002> END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on P{aqs101[6-9]*} and A:aqs [production]
14:44 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance es1027 (T335845)', diff saved to https://phabricator.wikimedia.org/P48340 and previous config saved to /var/cache/conftool/dbconfig/20230517-144446-ladsgroup.json [production]
14:44 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance es2028 (T335845)', diff saved to https://phabricator.wikimedia.org/P48339 and previous config saved to /var/cache/conftool/dbconfig/20230517-144425-ladsgroup.json [production]
14:40 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling es2028 (T335845)', diff saved to https://phabricator.wikimedia.org/P48338 and previous config saved to /var/cache/conftool/dbconfig/20230517-144025-ladsgroup.json [production]
14:40 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2028.codfw.wmnet with reason: Maintenance [production]
14:40 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2028.codfw.wmnet with reason: Maintenance [production]
14:39 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling es1027 (T335845)', diff saved to https://phabricator.wikimedia.org/P48337 and previous config saved to /var/cache/conftool/dbconfig/20230517-143949-ladsgroup.json [production]
14:39 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1027.eqiad.wmnet with reason: Maintenance [production]
14:39 <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: wgEventStreams - EventBus: produce to mediawiki.page_change.v1 stream - T336817 (duration: 06m 20s) [production]
14:39 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1027.eqiad.wmnet with reason: Maintenance [production]
14:38 <btullis@cumin1001> END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker [production]
14:36 <moritzm> installing jackson-databind security updates [production]
14:34 <xcollazo@deploy1002> Finished deploy [airflow-dags/platform_eng@ad1cc7c]: deploying hotfix for T336800 (duration: 00m 09s) [production]
14:34 <xcollazo@deploy1002> Started deploy [airflow-dags/platform_eng@ad1cc7c]: deploying hotfix for T336800 [production]
14:33 <ottomata> EventBus: produce to mediawiki.page_change.v1 stream - T336817 [production]
14:30 <otto@deploy1002> helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync [production]
14:30 <otto@deploy1002> helmfile [eqiad] START helmfile.d/services/eventgate-main: sync [production]
14:28 <otto@deploy1002> helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync [production]
14:28 <otto@deploy1002> helmfile [codfw] START helmfile.d/services/eventgate-main: sync [production]
14:27 <otto@deploy1002> helmfile [staging] DONE helmfile.d/services/eventgate-main: sync [production]
14:27 <otto@deploy1002> helmfile [staging] START helmfile.d/services/eventgate-main: sync [production]
14:27 <ottomata> rolling restart of eventgate-main to pick up new mediawiki.page_change.v1 stream config - T336817 [production]
14:17 <elukey> run authdns-update for new ml-serve/ores discovery endpoints - T336726 [production]
14:15 <jmm@cumin2002> START - Cookbook sre.aqs.roll-restart-reboot rolling reboot on P{aqs101[6-9]*} and A:aqs [production]
14:15 <jmm@cumin2002> END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on P{aqs101[2-5]*} and A:aqs [production]
14:14 <otto@deploy1002> Synchronized wmf-config/ext-EventStreamConfig.php: wgEventStreams - Declare mediawiki.page_change.v1 stream - T336817 (duration: 07m 30s) [production]
14:10 <bking@deploy1002> helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply [production]
14:09 <bking@deploy1002> helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply [production]
14:09 <bking@deploy1002> helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply [production]
14:08 <bking@deploy1002> helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply [production]
14:07 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1101.eqiad.wmnet [production]
13:59 <taavi@deploy1002> Finished scap: Backport for [[gerrit:920582|Define $maintClass in maintenance script for compatibility (T317375)]] (duration: 07m 24s) [production]
13:59 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host an-worker1101.eqiad.wmnet [production]