2022-02-25
23:32 <dzahn@deploy1002> helmfile [staging] DONE helmfile.d/services/miscweb: apply [production]
23:30 <dzahn@deploy1002> helmfile [staging] START helmfile.d/services/miscweb: apply [production]
21:37 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21540 and previous config saved to /var/cache/conftool/dbconfig/20220225-213704-ladsgroup.json [production]
21:22 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21539 and previous config saved to /var/cache/conftool/dbconfig/20220225-212159-ladsgroup.json [production]
21:06 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21538 and previous config saved to /var/cache/conftool/dbconfig/20220225-210654-ladsgroup.json [production]
21:02 <ryankemper> [WDQS] Restarted wdqs eqiad exporters: `ryankemper@cumin1001:~$ sudo -E cumin -b 1 'wdqs1*' 'systemctl restart prometheus-blazegraph-exporter-wdqs-blazegraph.service'` [production]
21:01 <ryankemper> [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good. Still looking into `Reduced availability for job jmx_wdqs_updater`; will try restarting blazegraph exporters in eqiad [production]
20:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21537 and previous config saved to /var/cache/conftool/dbconfig/20220225-205149-ladsgroup.json [production]
20:48 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21536 and previous config saved to /var/cache/conftool/dbconfig/20220225-204844-ladsgroup.json [production]
20:48 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [production]
20:48 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [production]
20:48 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21535 and previous config saved to /var/cache/conftool/dbconfig/20220225-204836-ladsgroup.json [production]
20:33 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21534 and previous config saved to /var/cache/conftool/dbconfig/20220225-203331-ladsgroup.json [production]
20:31 <ryankemper> [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` [production]
20:31 <ryankemper> [WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` [production]
20:31 <ryankemper> [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` [production]
20:30 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@5d384a5]: 0.3.104 (duration: 07m 18s) [production]
20:23 <ryankemper> [WDQS Deploy] Tests passing following deploy of `0.3.104` on canary `wdqs1003`; proceeding to rest of fleet [production]
20:22 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@5d384a5]: 0.3.104 [production]
20:22 <ryankemper> [WDQS Deploy] Gearing up for deploy of wdqs `0.3.104`. Pre-deploy tests passing on canary `wdqs1003` [production]
20:18 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21533 and previous config saved to /var/cache/conftool/dbconfig/20220225-201826-ladsgroup.json [production]
20:03 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21532 and previous config saved to /var/cache/conftool/dbconfig/20220225-200322-ladsgroup.json [production]
19:59 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21531 and previous config saved to /var/cache/conftool/dbconfig/20220225-195917-ladsgroup.json [production]
19:59 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
19:59 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
19:59 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance [production]
19:59 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance [production]
19:58 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance [production]
19:58 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance [production]
19:57 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
19:57 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
19:56 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21530 and previous config saved to /var/cache/conftool/dbconfig/20220225-195658-ladsgroup.json [production]
19:41 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21529 and previous config saved to /var/cache/conftool/dbconfig/20220225-194153-ladsgroup.json [production]
19:26 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21528 and previous config saved to /var/cache/conftool/dbconfig/20220225-192649-ladsgroup.json [production]
19:11 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21527 and previous config saved to /var/cache/conftool/dbconfig/20220225-191144-ladsgroup.json [production]
19:11 <jgleeson> payments updated from 4638c0ec to 3dfac3b2 [production]
19:09 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21526 and previous config saved to /var/cache/conftool/dbconfig/20220225-190939-ladsgroup.json [production]
19:09 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [production]
19:09 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [production]
19:08 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance [production]
19:07 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance [production]
19:07 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance [production]
19:07 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance [production]
19:07 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21525 and previous config saved to /var/cache/conftool/dbconfig/20220225-190737-ladsgroup.json [production]
18:52 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21524 and previous config saved to /var/cache/conftool/dbconfig/20220225-185233-ladsgroup.json [production]
18:37 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21523 and previous config saved to /var/cache/conftool/dbconfig/20220225-183728-ladsgroup.json [production]
18:22 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21522 and previous config saved to /var/cache/conftool/dbconfig/20220225-182223-ladsgroup.json [production]
18:19 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21521 and previous config saved to /var/cache/conftool/dbconfig/20220225-181918-ladsgroup.json [production]
18:19 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance [production]
18:19 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance [production]