2021-01-22
§
|
07:26 |
<ryankemper> |
[WDQS Deploy] WDQS deploy complete; service is healthy |
[production] |
06:59 |
<ryankemper> |
[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` |
[production] |
06:59 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` |
[production] |
06:59 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` |
[production] |
06:58 |
<ryankemper> |
[WDQS Deploy] Initial deploy complete, `query.wikidata.org` handles queries fine, proceeding to post-deploy steps |
[production] |
06:57 |
<ryankemper@deploy1001> |
Finished deploy [wdqs/wdqs@70f9d37]: 0.3.60 (duration: 10m 43s) |
[production] |
06:50 |
<ryankemper> |
[WDQS Deploy] All tests passing on canary `wdqs1003` following canary WDQS deploy, proceeding to rest of fleet |
[production] |
06:46 |
<ryankemper@deploy1001> |
Started deploy [wdqs/wdqs@70f9d37]: 0.3.60 |
[production] |
06:46 |
<ryankemper> |
[WDQS Deploy] All tests passing on canary `wdqs1003` before WDQS deploy, beginning deploy |
[production] |
06:45 |
<ryankemper> |
[wdqs] re-pooled `wdqs1013` (all caught up on lag) |
[production] |
06:16 |
<marostegui> |
Stop MySQL on db1117 db2133 db2078 T272614 |
[production] |
06:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db2143 and db2144 as x2 codfw slaves T269324', diff saved to https://phabricator.wikimedia.org/P13885 and previous config saved to /var/cache/conftool/dbconfig/20210122-060147-marostegui.json |
[production] |
06:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db2142 into x2 as codfw master T269324', diff saved to https://phabricator.wikimedia.org/P13884 and previous config saved to /var/cache/conftool/dbconfig/20210122-060007-marostegui.json |
[production] |
05:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Reduce db1118 weight', diff saved to https://phabricator.wikimedia.org/P13883 and previous config saved to /var/cache/conftool/dbconfig/20210122-054330-marostegui.json |
[production] |
01:26 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2368.codfw.wmnet |
[production] |
01:25 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2366.codfw.wmnet |
[production] |
01:25 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2374.codfw.wmnet |
[production] |
01:22 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2368.codfw.wmnet |
[production] |
01:22 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2366.codfw.wmnet |
[production] |
01:22 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2374.codfw.wmnet |
[production] |
01:19 |
<Urbanecm> |
Evening B&C window finished |
[production] |
01:18 |
<urbanecm@deploy1001> |
Synchronized php-1.36.0-wmf.27/extensions/AbuseFilter/: 7d8ab70d5b00142e8344e242dd085eb7bfa81145: Dont return the status of doBlockInternal when processing block actions (duration: 00m 59s) |
[production] |
01:16 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: 376cba1b33dd68d40490a1498c59a4d430318ab1: Enroll idwiki in the DiscussionTools a/b test (T268191) (duration: 00m 55s) |
[production] |
01:14 |
<urbanecm@deploy1001> |
Synchronized php-1.36.0-wmf.27/extensions/DiscussionTools/: 513a7861bbcf06a8ac5c29e1b9838640cbd7c628: A/B test output when a specific feature is being tested (T268191) (duration: 00m 55s) |
[production] |
01:12 |
<urbanecm@deploy1001> |
Synchronized php-1.36.0-wmf.27/extensions/WikibaseMediaInfo/: 4b0259b761681ca90b3f3039019553ddca40a5fe: Distinguish between null continue value and unknown one (T272548) (duration: 00m 59s) |
[production] |
01:06 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2376.codfw.wmnet |
[production] |
01:03 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2366.codfw.wmnet with reason: REIMAGE |
[production] |
01:02 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2376.codfw.wmnet |
[production] |
01:01 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2368.codfw.wmnet with reason: REIMAGE |
[production] |
01:00 |
<Urbanecm> |
Evening B&C still in process, waiting on Zuul |
[production] |
00:59 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2366.codfw.wmnet with reason: REIMAGE |
[production] |
00:58 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2374.codfw.wmnet with reason: REIMAGE |
[production] |
00:58 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2368.codfw.wmnet with reason: REIMAGE |
[production] |
00:56 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2374.codfw.wmnet with reason: REIMAGE |
[production] |
00:49 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE |
[production] |
00:49 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1174.eqiad.wmnet with reason: REIMAGE |
[production] |
00:49 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1161.eqiad.wmnet with reason: REIMAGE |
[production] |
00:49 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1157.eqiad.wmnet with reason: REIMAGE |
[production] |
00:49 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1167.eqiad.wmnet with reason: REIMAGE |
[production] |
00:49 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1163.eqiad.wmnet with reason: REIMAGE |
[production] |
00:49 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1158.eqiad.wmnet with reason: REIMAGE |
[production] |
00:49 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1168.eqiad.wmnet with reason: REIMAGE |
[production] |
00:48 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1165.eqiad.wmnet with reason: REIMAGE |
[production] |
00:48 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1160.eqiad.wmnet with reason: REIMAGE |
[production] |
00:48 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1170.eqiad.wmnet with reason: REIMAGE |
[production] |
00:48 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1169.eqiad.wmnet with reason: REIMAGE |
[production] |
00:48 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1162.eqiad.wmnet with reason: REIMAGE |
[production] |
00:46 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1164.eqiad.wmnet with reason: REIMAGE |
[production] |
00:46 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1166.eqiad.wmnet with reason: REIMAGE |
[production] |
00:44 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1174.eqiad.wmnet with reason: REIMAGE |
[production] |