2022-02-28
06:02 <ladsgroup@cumin1001> START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS bullseye [production]
05:57 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1178 (T302185)', diff saved to https://phabricator.wikimedia.org/P21549 and previous config saved to /var/cache/conftool/dbconfig/20220228-055626-ladsgroup.json [production]
05:56 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance [production]
05:56 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance [production]
05:55 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21548 and previous config saved to /var/cache/conftool/dbconfig/20220228-055530-ladsgroup.json [production]
05:52 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P21547 and previous config saved to /var/cache/conftool/dbconfig/20220228-055226-ladsgroup.json [production]
05:40 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21546 and previous config saved to /var/cache/conftool/dbconfig/20220228-054025-ladsgroup.json [production]
05:38 <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.23/includes/content/ContentHandler.php: Backport: [[gerrit:766136|ContentHandler: Use ParserOutputAccess for accessing ParserOutput (T302620)]] (duration: 00m 49s) [production]
05:37 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300992)', diff saved to https://phabricator.wikimedia.org/P21545 and previous config saved to /var/cache/conftool/dbconfig/20220228-053721-ladsgroup.json [production]
05:25 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21544 and previous config saved to /var/cache/conftool/dbconfig/20220228-052521-ladsgroup.json [production]
05:19 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1166 (T300992)', diff saved to https://phabricator.wikimedia.org/P21543 and previous config saved to /var/cache/conftool/dbconfig/20220228-051905-ladsgroup.json [production]
05:19 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance [production]
05:18 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance [production]
05:10 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21542 and previous config saved to /var/cache/conftool/dbconfig/20220228-051016-ladsgroup.json [production]
05:05 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS bullseye [production]
04:56 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [production]
04:56 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [production]
04:55 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [production]
04:55 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [production]
04:49 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage [production]
04:46 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage [production]
04:35 <ladsgroup@cumin1001> START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS bullseye [production]
04:30 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21541 and previous config saved to /var/cache/conftool/dbconfig/20220228-043003-ladsgroup.json [production]
04:30 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance [production]
04:29 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance [production]
2022-02-27
20:42 <XioNoX> configure OSPF between cr2-drmrs and cr2-eqdfw [production]
2022-02-25
23:32 <dzahn@deploy1002> helmfile [staging] DONE helmfile.d/services/miscweb: apply [production]
23:30 <dzahn@deploy1002> helmfile [staging] START helmfile.d/services/miscweb: apply [production]
21:37 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21540 and previous config saved to /var/cache/conftool/dbconfig/20220225-213704-ladsgroup.json [production]
21:22 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21539 and previous config saved to /var/cache/conftool/dbconfig/20220225-212159-ladsgroup.json [production]
21:06 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21538 and previous config saved to /var/cache/conftool/dbconfig/20220225-210654-ladsgroup.json [production]
21:02 <ryankemper> [WDQS] Restarted wdqs eqiad exporters: `ryankemper@cumin1001:~$ sudo -E cumin -b 1 'wdqs1*' 'systemctl restart prometheus-blazegraph-exporter-wdqs-blazegraph.service'` [production]
21:01 <ryankemper> [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good. Still looking into `Reduced availability for job jmx_wdqs_updater`; will try restarting blazegraph exporters in eqiad [production]
20:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21537 and previous config saved to /var/cache/conftool/dbconfig/20220225-205149-ladsgroup.json [production]
20:48 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21536 and previous config saved to /var/cache/conftool/dbconfig/20220225-204844-ladsgroup.json [production]
20:48 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [production]
20:48 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [production]
20:48 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21535 and previous config saved to /var/cache/conftool/dbconfig/20220225-204836-ladsgroup.json [production]
20:33 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21534 and previous config saved to /var/cache/conftool/dbconfig/20220225-203331-ladsgroup.json [production]
20:31 <ryankemper> [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` [production]
20:31 <ryankemper> [WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` [production]
20:31 <ryankemper> [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` [production]
20:30 <ryankemper@deploy1002> Finished deploy [wdqs/wdqs@5d384a5]: 0.3.104 (duration: 07m 18s) [production]
20:23 <ryankemper> [WDQS Deploy] Tests passing following deploy of `0.3.104` on canary `wdqs1003`; proceeding to rest of fleet [production]
20:22 <ryankemper@deploy1002> Started deploy [wdqs/wdqs@5d384a5]: 0.3.104 [production]
20:22 <ryankemper> [WDQS Deploy] Gearing up for deploy of wdqs `0.3.104`. Pre-deploy tests passing on canary `wdqs1003` [production]
20:18 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21533 and previous config saved to /var/cache/conftool/dbconfig/20220225-201826-ladsgroup.json [production]
20:03 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21532 and previous config saved to /var/cache/conftool/dbconfig/20220225-200322-ladsgroup.json [production]
19:59 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21531 and previous config saved to /var/cache/conftool/dbconfig/20220225-195917-ladsgroup.json [production]
19:59 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]