2022-09-01
15:35 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
15:34 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
15:34 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
15:21 <moritzm> installing usb.ids update from Bullseye 11.4 point release [production]
15:19 <moritzm> updating docker.io on ml-serve* to bugfix release from Bullseye 11.4 point release [production]
14:54 <topranks> Draining traffic from Lumen Transport CCT 442550294 (cr1-codfw to cr4-ulsfo) ahead of hot-cut to lower-latency path with carrier [production]
14:29 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1002.eqiad.wmnet [production]
14:25 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host puppetboard1002.eqiad.wmnet [production]
14:07 <moritzm> installing net-snmp security updates on Buster [production]
14:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1002.eqiad.wmnet [production]
14:01 <marostegui> test T316744 [production]
14:01 <marostegui> test T316744 [production]
14:00 <marostegui> Failover m5 from db1107 to db1183 - T316744 [production]
13:57 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netboxdb1002.eqiad.wmnet [production]
13:56 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2002.codfw.wmnet [production]
13:53 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netboxdb2002.codfw.wmnet [production]
13:52 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host netbox1002.eqiad.wmnet [production]
13:43 <moritzm> rebooting netbox1002 (running netbox.wikimedia.org) [production]
13:43 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netbox1002.eqiad.wmnet [production]
13:41 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox2002.codfw.wmnet [production]
13:37 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netbox2002.codfw.wmnet [production]
13:32 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2135,2160].codfw.wmnet,db[1107,1117,1183].eqiad.wmnet with reason: switchover m5 T316744 [production]
13:31 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db[2135,2160].codfw.wmnet,db[1107,1117,1183].eqiad.wmnet with reason: switchover m5 T316744 [production]
13:19 <jayme@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
13:19 <jayme@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
13:19 <jayme@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
13:19 <jayme@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
13:18 <jayme@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
13:18 <jayme@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
13:16 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
13:16 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
13:15 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
13:15 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
13:10 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
13:09 <oblivian@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:823677|Move 5% of traffic to php 7.4 (T271736)]] (duration: 03m 45s) [production]
13:09 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
13:09 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
13:08 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
13:00 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
13:00 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
13:00 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
12:59 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
12:56 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
12:56 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
12:29 <herron> restarted thanos-query on thanos-fe1001 [production]
12:20 <cdanis@cumin2002> dbctl commit (dc=all): 'T316482 remove replicas from x2', diff saved to https://phabricator.wikimedia.org/P33736 and previous config saved to /var/cache/conftool/dbconfig/20220901-122026-cdanis.json [production]
12:13 <klausman@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-serve-ctrl1001.eqiad.wmnet [production]
12:13 <klausman@cumin1001> START - Cookbook sre.hosts.remove-downtime for ml-serve-ctrl1001.eqiad.wmnet [production]
12:13 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
12:12 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]