2023-08-08
21:04 <bking@cumin1001> conftool action : set/pooled=no; selector: name=wcqs1003.eqiad.wmnet,service=wcqs [production]
21:02 <bking@cumin1001> conftool action : set/pooled=true; selector: dnsdisc=wcqs,name=eqiad [production]
21:02 <bking@cumin1001> START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye [production]
20:58 <bking@deploy1002> Finished deploy [wdqs/wdqs@f1a6177] (wcqs): f1a6177 (duration: 00m 17s) [production]
20:58 <bking@deploy1002> Started deploy [wdqs/wdqs@f1a6177] (wcqs): f1a6177 [production]
20:57 <bking@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wcqs1002.eqiad.wmnet with OS bullseye [production]
20:52 <bking@deploy1002> Finished deploy [wdqs/wdqs@f1a6177] (wcqs): f1a6177 (duration: 00m 18s) [production]
20:52 <bking@deploy1002> Started deploy [wdqs/wdqs@f1a6177] (wcqs): f1a6177 [production]
20:43 <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:946997|Deploy to CN language wikis (T335886)]] (duration: 09m 08s) [production]
20:41 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@f1a6177]: whitelist new qlever endpoints take 4 (forgot git pull) T339347 (duration: 10m 44s) [production]
20:37 <urbanecm@deploy1002> ksarabia and urbanecm: Continuing with sync [production]
20:36 <urbanecm@deploy1002> ksarabia and urbanecm: Backport for [[gerrit:946997|Deploy to CN language wikis (T335886)]] synced to the testservers mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option) [production]
20:34 <urbanecm@deploy1002> Started scap: Backport for [[gerrit:946997|Deploy to CN language wikis (T335886)]] [production]
20:31 <urbanecm> mwmaint1002: `foreachwikiindblist 'group2 & s6' extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --current --all --touched-after=20230615000000` (T315510) [production]
20:30 <urbanecm> mwmaint1002: `foreachwikiindblist 'group2 & s5' extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --current --all --touched-after=20230615000000` (T315353) [production]
20:30 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@f1a6177]: whitelist new qlever endpoints take 4 (forgot git pull) T339347 [production]
20:30 <urbanecm> mwmaint1002: `foreachwikiindblist 'group2 & s3' extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --current --all --touched-after=20230615000000` (T315353) [production]
20:29 <urbanecm> mwmaint1002: `foreachwikiindblist 'group2 & s2' extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --current --all --touched-after=20230615000000` (T315353) [production]
20:24 <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:946998|Enable wgDiscussionToolsEnablePermalinksBackend on s2/s3/s5/s6 group2 (T315353)]] (duration: 10m 55s) [production]
20:17 <urbanecm@deploy1002> urbanecm and matmarex: Continuing with sync [production]
20:16 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@aa5f5b7]: whitelist new qlever endpoints take 3 T339347 (duration: 02m 54s) [production]
20:14 <urbanecm@deploy1002> urbanecm and matmarex: Backport for [[gerrit:946998|Enable wgDiscussionToolsEnablePermalinksBackend on s2/s3/s5/s6 group2 (T315353)]] synced to the testservers mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option) [production]
20:14 <ryankemper> [WDQS] Lag caught up on `wdqs1006`; repooled -> `ryankemper@wdqs1006:~$ sudo pool` [production]
20:13 <urbanecm@deploy1002> Started scap: Backport for [[gerrit:946998|Enable wgDiscussionToolsEnablePermalinksBackend on s2/s3/s5/s6 group2 (T315353)]] [production]
20:13 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@aa5f5b7]: whitelist new qlever endpoints take 3 T339347 [production]
19:28 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs[1001-1003].eqiad.wmnet with reason: T331300 [production]
19:28 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs[1001-1003].eqiad.wmnet with reason: T331300 [production]
19:23 <bking@cumin1001> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
19:06 <ryankemper> [WDQS] Depooled `wdqs1006` while it catches up on 7 hours of lag [production]
19:05 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@aa5f5b7]: whitelist new qlever endpoints take 2 (duration: 11m 34s) [production]
18:54 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4001.ulsfo.wmnet with OS bullseye [production]
18:54 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@aa5f5b7]: whitelist new qlever endpoints take 2 [production]
18:49 <bking@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=wcqs,name=eqiad [production]
18:48 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@dff41b7]: whitelist new qlever endpoints (duration: 03m 08s) [production]
18:45 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@dff41b7]: whitelist new qlever endpoints [production]
18:45 <ryankemper@deploy1002> deploy aborted: 0.3.124 (duration: 01m 50s) [production]
18:43 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@dff41b7]: 0.3.124 [production]
18:38 <bking@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
18:31 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage [production]
18:27 <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage [production]
18:12 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host durum4001.ulsfo.wmnet with OS bullseye [production]
18:12 <sukhe@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum4001.ulsfo.wmnet with OS bookworm [production]
17:56 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1001.eqiad.wmnet with OS bullseye [production]
17:55 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage [production]
17:52 <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage [production]
17:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2136 (T342617)', diff saved to https://phabricator.wikimedia.org/P50209 and previous config saved to /var/cache/conftool/dbconfig/20230808-175101-ladsgroup.json [production]
17:50 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance [production]
17:50 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance [production]
17:50 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2119 (T342617)', diff saved to https://phabricator.wikimedia.org/P50208 and previous config saved to /var/cache/conftool/dbconfig/20230808-175040-ladsgroup.json [production]
17:41 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1002.eqiad.wmnet with reason: host reimage [production]