2021-01-22
§
|
10:36 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db1093 (re)pooling @ 100%: Reboot T272255', diff saved to https://phabricator.wikimedia.org/P13897 and previous config saved to /var/cache/conftool/dbconfig/20210122-103609-kormat.json |
[production] |
10:22 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db1130 (re)pooling @ 50%: Reboot T272255', diff saved to https://phabricator.wikimedia.org/P13895 and previous config saved to /var/cache/conftool/dbconfig/20210122-102237-kormat.json |
[production] |
10:21 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db1093 (re)pooling @ 75%: Reboot T272255', diff saved to https://phabricator.wikimedia.org/P13894 and previous config saved to /var/cache/conftool/dbconfig/20210122-102105-kormat.json |
[production] |
10:18 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host archiva1002.wikimedia.org |
[production] |
10:16 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host archiva1002.wikimedia.org |
[production] |
10:07 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db1130 (re)pooling @ 25%: Reboot T272255', diff saved to https://phabricator.wikimedia.org/P13893 and previous config saved to /var/cache/conftool/dbconfig/20210122-100734-kormat.json |
[production] |
10:06 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db1093 (re)pooling @ 50%: Reboot T272255', diff saved to https://phabricator.wikimedia.org/P13892 and previous config saved to /var/cache/conftool/dbconfig/20210122-100602-kormat.json |
[production] |
10:03 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db1130 depooling: Rebooting for T272255', diff saved to https://phabricator.wikimedia.org/P13891 and previous config saved to /var/cache/conftool/dbconfig/20210122-100307-kormat.json |
[production] |
10:03 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on db1130.eqiad.wmnet with reason: Rebooting for T272255 |
[production] |
10:03 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:30:00 on db1130.eqiad.wmnet with reason: Rebooting for T272255 |
[production] |
10:02 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Temporarily add db1110 to api group T272255', diff saved to https://phabricator.wikimedia.org/P13890 and previous config saved to /var/cache/conftool/dbconfig/20210122-100233-kormat.json |
[production] |
09:52 |
<moritzm> |
uploaded cairo 1.14.0-2.1+deb8u2+wmf1 to apt.wikimedia.org |
[production] |
09:50 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db1093 (re)pooling @ 25%: Reboot T272255', diff saved to https://phabricator.wikimedia.org/P13889 and previous config saved to /var/cache/conftool/dbconfig/20210122-095058-kormat.json |
[production] |
09:44 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db1093 depooling: Rebooting for T272255', diff saved to https://phabricator.wikimedia.org/P13888 and previous config saved to /var/cache/conftool/dbconfig/20210122-094453-kormat.json |
[production] |
09:44 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on db1093.eqiad.wmnet with reason: Rebooting for T272255 |
[production] |
09:44 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:30:00 on db1093.eqiad.wmnet with reason: Rebooting for T272255 |
[production] |
09:43 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Temporarily add db1088 to api group T271106', diff saved to https://phabricator.wikimedia.org/P13887 and previous config saved to /var/cache/conftool/dbconfig/20210122-094337-kormat.json |
[production] |
08:49 |
<moritzm> |
installing PIP security updates for stretch |
[production] |
08:44 |
<moritzm> |
installing mutt updates for stretch |
[production] |
08:35 |
<XioNoX> |
Remove BGP for Zayo transit in ulsfo, eqiad, eqord |
[production] |
08:33 |
<elukey> |
update puppet compiler's facts |
[production] |
07:26 |
<ryankemper> |
[WDQS Deploy] WDQS deploy complete; service is healthy |
[production] |
06:59 |
<ryankemper> |
[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` |
[production] |
06:59 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` |
[production] |
06:59 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` |
[production] |
06:58 |
<ryankemper> |
[WDQS Deploy] Initial deploy complete, `query.wikidata.org` handles queries fine, proceeding to post-deploy steps |
[production] |
06:57 |
<ryankemper@deploy1001> |
Finished deploy [wdqs/wdqs@70f9d37]: 0.3.60 (duration: 10m 43s) |
[production] |
06:50 |
<ryankemper> |
[WDQS Deploy] All tests passing on canary `wdqs1003` following canary WDQS deploy, proceeding to rest of fleet |
[production] |
06:46 |
<ryankemper@deploy1001> |
Started deploy [wdqs/wdqs@70f9d37]: 0.3.60 |
[production] |
06:46 |
<ryankemper> |
[WDQS Deploy] All tests passing on canary `wdqs1003` before WDQS deploy, beginning deploy |
[production] |
06:45 |
<ryankemper> |
[wdqs] re-pooled `wdqs1013` (all caught up on lag) |
[production] |
06:16 |
<marostegui> |
Stop MySQL on db1117 db2133 db2078 T272614 |
[production] |
06:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db2143 and db2144 as x2 codfw slaves T269324', diff saved to https://phabricator.wikimedia.org/P13885 and previous config saved to /var/cache/conftool/dbconfig/20210122-060147-marostegui.json |
[production] |
06:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db2142 into x2 as codfw master T269324', diff saved to https://phabricator.wikimedia.org/P13884 and previous config saved to /var/cache/conftool/dbconfig/20210122-060007-marostegui.json |
[production] |
05:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Reduce db1118 weight', diff saved to https://phabricator.wikimedia.org/P13883 and previous config saved to /var/cache/conftool/dbconfig/20210122-054330-marostegui.json |
[production] |
01:26 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2368.codfw.wmnet |
[production] |
01:25 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2366.codfw.wmnet |
[production] |
01:25 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2374.codfw.wmnet |
[production] |
01:22 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2368.codfw.wmnet |
[production] |
01:22 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2366.codfw.wmnet |
[production] |
01:22 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2374.codfw.wmnet |
[production] |
01:19 |
<Urbanecm> |
Evening B&C window finished |
[production] |
01:18 |
<urbanecm@deploy1001> |
Synchronized php-1.36.0-wmf.27/extensions/AbuseFilter/: 7d8ab70d5b00142e8344e242dd085eb7bfa81145: Dont return the status of doBlockInternal when processing block actions (duration: 00m 59s) |
[production] |
01:16 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: 376cba1b33dd68d40490a1498c59a4d430318ab1: Enroll idwiki in the DiscussionTools a/b test (T268191) (duration: 00m 55s) |
[production] |
01:14 |
<urbanecm@deploy1001> |
Synchronized php-1.36.0-wmf.27/extensions/DiscussionTools/: 513a7861bbcf06a8ac5c29e1b9838640cbd7c628: A/B test output when a specific feature is being tested (T268191) (duration: 00m 55s) |
[production] |
01:12 |
<urbanecm@deploy1001> |
Synchronized php-1.36.0-wmf.27/extensions/WikibaseMediaInfo/: 4b0259b761681ca90b3f3039019553ddca40a5fe: Distinguish between null continue value and unknown one (T272548) (duration: 00m 59s) |
[production] |
01:06 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2376.codfw.wmnet |
[production] |
01:03 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2366.codfw.wmnet with reason: REIMAGE |
[production] |
01:02 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2376.codfw.wmnet |
[production] |
01:01 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2368.codfw.wmnet with reason: REIMAGE |
[production] |