2022-02-25
|
23:32 |
<dzahn@deploy1002> |
helmfile [staging] DONE helmfile.d/services/miscweb: apply |
[production] |
23:30 |
<dzahn@deploy1002> |
helmfile [staging] START helmfile.d/services/miscweb: apply |
[production] |
21:37 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21540 and previous config saved to /var/cache/conftool/dbconfig/20220225-213704-ladsgroup.json |
[production] |
21:22 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21539 and previous config saved to /var/cache/conftool/dbconfig/20220225-212159-ladsgroup.json |
[production] |
21:06 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21538 and previous config saved to /var/cache/conftool/dbconfig/20220225-210654-ladsgroup.json |
[production] |
21:02 |
<ryankemper> |
[WDQS] Restarted wdqs eqiad exporters: `ryankemper@cumin1001:~$ sudo -E cumin -b 1 'wdqs1*' 'systemctl restart prometheus-blazegraph-exporter-wdqs-blazegraph.service'` |
[production] |
21:01 |
<ryankemper> |
[WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good. Still looking into `Reduced availability for job jmx_wdqs_updater`; will try restarting blazegraph exporters in eqiad |
[production] |
20:51 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21537 and previous config saved to /var/cache/conftool/dbconfig/20220225-205149-ladsgroup.json |
[production] |
20:48 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21536 and previous config saved to /var/cache/conftool/dbconfig/20220225-204844-ladsgroup.json |
[production] |
20:48 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance |
[production] |
20:48 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance |
[production] |
20:48 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21535 and previous config saved to /var/cache/conftool/dbconfig/20220225-204836-ladsgroup.json |
[production] |
20:33 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21534 and previous config saved to /var/cache/conftool/dbconfig/20220225-203331-ladsgroup.json |
[production] |
20:31 |
<ryankemper> |
[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` |
[production] |
20:31 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` |
[production] |
20:31 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` |
[production] |
20:30 |
<ryankemper@deploy1002> |
Finished deploy [wdqs/wdqs@5d384a5]: 0.3.104 (duration: 07m 18s) |
[production] |
20:23 |
<ryankemper> |
[WDQS Deploy] Tests passing following deploy of `0.3.104` on canary `wdqs1003`; proceeding to rest of fleet |
[production] |
20:22 |
<ryankemper@deploy1002> |
Started deploy [wdqs/wdqs@5d384a5]: 0.3.104 |
[production] |
20:22 |
<ryankemper> |
[WDQS Deploy] Gearing up for deploy of wdqs `0.3.104`. Pre-deploy tests passing on canary `wdqs1003` |
[production] |
20:18 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21533 and previous config saved to /var/cache/conftool/dbconfig/20220225-201826-ladsgroup.json |
[production] |
20:03 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21532 and previous config saved to /var/cache/conftool/dbconfig/20220225-200322-ladsgroup.json |
[production] |
19:59 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21531 and previous config saved to /var/cache/conftool/dbconfig/20220225-195917-ladsgroup.json |
[production] |
19:59 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
19:59 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
19:59 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance |
[production] |
19:59 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance |
[production] |
19:58 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance |
[production] |
19:58 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance |
[production] |
19:57 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance |
[production] |
19:57 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance |
[production] |
19:56 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21530 and previous config saved to /var/cache/conftool/dbconfig/20220225-195658-ladsgroup.json |
[production] |
19:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21529 and previous config saved to /var/cache/conftool/dbconfig/20220225-194153-ladsgroup.json |
[production] |
19:26 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21528 and previous config saved to /var/cache/conftool/dbconfig/20220225-192649-ladsgroup.json |
[production] |
19:11 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21527 and previous config saved to /var/cache/conftool/dbconfig/20220225-191144-ladsgroup.json |
[production] |
19:11 |
<jgleeson> |
payments updated from 4638c0ec to 3dfac3b2 |
[production] |
19:09 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21526 and previous config saved to /var/cache/conftool/dbconfig/20220225-190939-ladsgroup.json |
[production] |
19:09 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance |
[production] |
19:09 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance |
[production] |
19:08 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance |
[production] |
19:07 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance |
[production] |
19:07 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance |
[production] |
19:07 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance |
[production] |
19:07 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21525 and previous config saved to /var/cache/conftool/dbconfig/20220225-190737-ladsgroup.json |
[production] |
18:52 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21524 and previous config saved to /var/cache/conftool/dbconfig/20220225-185233-ladsgroup.json |
[production] |
18:37 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21523 and previous config saved to /var/cache/conftool/dbconfig/20220225-183728-ladsgroup.json |
[production] |
18:22 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21522 and previous config saved to /var/cache/conftool/dbconfig/20220225-182223-ladsgroup.json |
[production] |
18:19 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21521 and previous config saved to /var/cache/conftool/dbconfig/20220225-181918-ladsgroup.json |
[production] |
18:19 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance |
[production] |