2023-08-08
20:30 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@f1a6177]: whitelist new qlever endpoints take 4 (forgot git pull) T339347 [production]
20:30 <urbanecm> mwmaint1002: `foreachwikiindblist 'group2 & s3' extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --current --all --touched-after=20230615000000` (T315353) [production]
20:29 <urbanecm> mwmaint1002: `foreachwikiindblist 'group2 & s2' extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --current --all --touched-after=20230615000000` (T315353) [production]
20:24 <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:946998|Enable wgDiscussionToolsEnablePermalinksBackend on s2/s3/s5/s6 group2 (T315353)]] (duration: 10m 55s) [production]
20:17 <urbanecm@deploy1002> urbanecm and matmarex: Continuing with sync [production]
20:16 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@aa5f5b7]: whitelist new qlever endpoints take 3 T339347 (duration: 02m 54s) [production]
20:14 <urbanecm@deploy1002> urbanecm and matmarex: Backport for [[gerrit:946998|Enable wgDiscussionToolsEnablePermalinksBackend on s2/s3/s5/s6 group2 (T315353)]] synced to the testservers mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option) [production]
20:14 <ryankemper> [WDQS] Lag caught up on `wdqs1006`; repooled -> `ryankemper@wdqs1006:~$ sudo pool` [production]
20:13 <urbanecm@deploy1002> Started scap: Backport for [[gerrit:946998|Enable wgDiscussionToolsEnablePermalinksBackend on s2/s3/s5/s6 group2 (T315353)]] [production]
20:13 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@aa5f5b7]: whitelist new qlever endpoints take 3 T339347 [production]
19:28 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs[1001-1003].eqiad.wmnet with reason: T331300 [production]
19:28 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs[1001-1003].eqiad.wmnet with reason: T331300 [production]
19:23 <bking@cumin1001> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
19:06 <ryankemper> [WDQS] Depooled `wdqs1006` while it catches up on 7 hours of lag [production]
19:05 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@aa5f5b7]: whitelist new qlever endpoints take 2 (duration: 11m 34s) [production]
18:54 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4001.ulsfo.wmnet with OS bullseye [production]
18:54 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@aa5f5b7]: whitelist new qlever endpoints take 2 [production]
18:49 <bking@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=wcqs,name=eqiad [production]
18:48 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@dff41b7]: whitelist new qlever endpoints (duration: 03m 08s) [production]
18:45 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@dff41b7]: whitelist new qlever endpoints [production]
18:45 <ryankemper@deploy1002> deploy aborted: 0.3.124 (duration: 01m 50s) [production]
18:43 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@dff41b7]: 0.3.124 [production]
18:38 <bking@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
18:31 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage [production]
18:27 <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage [production]
18:12 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host durum4001.ulsfo.wmnet with OS bullseye [production]
18:12 <sukhe@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum4001.ulsfo.wmnet with OS bookworm [production]
17:56 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1001.eqiad.wmnet with OS bullseye [production]
17:55 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage [production]
17:52 <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage [production]
17:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2136 (T342617)', diff saved to https://phabricator.wikimedia.org/P50209 and previous config saved to /var/cache/conftool/dbconfig/20230808-175101-ladsgroup.json [production]
17:50 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance [production]
17:50 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance [production]
17:50 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2119 (T342617)', diff saved to https://phabricator.wikimedia.org/P50208 and previous config saved to /var/cache/conftool/dbconfig/20230808-175040-ladsgroup.json [production]
17:41 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1002.eqiad.wmnet with reason: host reimage [production]
17:38 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1001.eqiad.wmnet with reason: host reimage [production]
17:37 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1002.eqiad.wmnet with reason: host reimage [production]
17:35 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1001.eqiad.wmnet with reason: host reimage [production]
17:35 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P50207 and previous config saved to /var/cache/conftool/dbconfig/20230808-173534-ladsgroup.json [production]
17:31 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host durum4001.ulsfo.wmnet with OS bookworm [production]
17:24 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1083.eqiad.wmnet with OS bullseye [production]
17:24 <bking@cumin1001> START - Cookbook sre.hosts.reimage for host wcqs1002.eqiad.wmnet with OS bullseye [production]
17:22 <bking@cumin1001> START - Cookbook sre.hosts.reimage for host wcqs1001.eqiad.wmnet with OS bullseye [production]
17:20 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P50206 and previous config saved to /var/cache/conftool/dbconfig/20230808-172027-ladsgroup.json [production]
17:05 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2119 (T342617)', diff saved to https://phabricator.wikimedia.org/P50205 and previous config saved to /var/cache/conftool/dbconfig/20230808-170521-ladsgroup.json [production]
17:01 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1083.eqiad.wmnet with reason: host reimage [production]
16:58 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1144:3314 (T342617)', diff saved to https://phabricator.wikimedia.org/P50204 and previous config saved to /var/cache/conftool/dbconfig/20230808-165824-ladsgroup.json [production]
16:58 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance [production]
16:58 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance [production]
16:58 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1143 (T342617)', diff saved to https://phabricator.wikimedia.org/P50203 and previous config saved to /var/cache/conftool/dbconfig/20230808-165803-ladsgroup.json [production]