2022-01-26
§
|
08:18 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.addnode for new host ganeti1013.eqiad.wmnet to ganeti01.svc.eqiad.wmnet |
[production] |
08:09 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1134 (T285149)', diff saved to https://phabricator.wikimedia.org/P19243 and previous config saved to /var/cache/conftool/dbconfig/20220126-080948-marostegui.json |
[production] |
08:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1134 (T285149)', diff saved to https://phabricator.wikimedia.org/P19242 and previous config saved to /var/cache/conftool/dbconfig/20220126-080842-marostegui.json |
[production] |
08:08 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance |
[production] |
08:08 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance |
[production] |
08:08 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance |
[production] |
08:08 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance |
[production] |
08:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1163 (T285149)', diff saved to https://phabricator.wikimedia.org/P19241 and previous config saved to /var/cache/conftool/dbconfig/20220126-080831-marostegui.json |
[production] |
07:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P19240 and previous config saved to /var/cache/conftool/dbconfig/20220126-075326-marostegui.json |
[production] |
07:51 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
07:50 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2131.codfw.wmnet with OS bullseye |
[production] |
07:50 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
07:50 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
07:49 |
<marostegui@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1020.eqiad.wmnet with OS bullseye |
[production] |
07:49 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
07:45 |
<taavi@deploy1002> |
Synchronized wmf-config/interwiki.php: Config: [[gerrit:757377|Update interwiki cache]] (duration: 00m 52s) |
[production] |
07:43 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host es1020.eqiad.wmnet with OS bullseye |
[production] |
07:38 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P19239 and previous config saved to /var/cache/conftool/dbconfig/20220126-073822-marostegui.json |
[production] |
07:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1163 (T285149)', diff saved to https://phabricator.wikimedia.org/P19238 and previous config saved to /var/cache/conftool/dbconfig/20220126-072317-marostegui.json |
[production] |
07:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1163 (T285149)', diff saved to https://phabricator.wikimedia.org/P19237 and previous config saved to /var/cache/conftool/dbconfig/20220126-072211-marostegui.json |
[production] |
07:22 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance |
[production] |
07:22 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance |
[production] |
07:22 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance |
[production] |
07:22 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance |
[production] |
07:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1169 (T285149)', diff saved to https://phabricator.wikimedia.org/P19236 and previous config saved to /var/cache/conftool/dbconfig/20220126-072200-marostegui.json |
[production] |
07:17 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2115.codfw.wmnet with OS bullseye |
[production] |
07:14 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host db2131.codfw.wmnet with OS bullseye |
[production] |
07:14 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2096.codfw.wmnet with OS bullseye |
[production] |
07:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P19235 and previous config saved to /var/cache/conftool/dbconfig/20220126-070654-marostegui.json |
[production] |
06:51 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P19234 and previous config saved to /var/cache/conftool/dbconfig/20220126-065149-marostegui.json |
[production] |
06:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove recentchangeslinked from s8 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P19233 and previous config saved to /var/cache/conftool/dbconfig/20220126-064653-marostegui.json |
[production] |
06:43 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host db2115.codfw.wmnet with OS bullseye |
[production] |
06:41 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host db2096.codfw.wmnet with OS bullseye |
[production] |
06:36 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1169 (T285149)', diff saved to https://phabricator.wikimedia.org/P19232 and previous config saved to /var/cache/conftool/dbconfig/20220126-063644-marostegui.json |
[production] |
06:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es1020 T300005', diff saved to https://phabricator.wikimedia.org/P19231 and previous config saved to /var/cache/conftool/dbconfig/20220126-063149-marostegui.json |
[production] |
06:30 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1169 (T285149)', diff saved to https://phabricator.wikimedia.org/P19230 and previous config saved to /var/cache/conftool/dbconfig/20220126-063037-marostegui.json |
[production] |
06:30 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance |
[production] |
06:30 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance |
[production] |
06:30 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance |
[production] |
06:30 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance |
[production] |
06:24 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2086 (s7,s8) T299882', diff saved to https://phabricator.wikimedia.org/P19229 and previous config saved to /var/cache/conftool/dbconfig/20220126-062406-marostegui.json |
[production] |
05:02 |
<ryankemper@deploy1002> |
Finished deploy [wdqs/wdqs@dc7c5ac] (wcqs): Deploy 0.3.100 to WCQS (duration: 02m 21s) |
[production] |
04:59 |
<ryankemper@deploy1002> |
Started deploy [wdqs/wdqs@dc7c5ac] (wcqs): Deploy 0.3.100 to WCQS |
[production] |
04:56 |
<ryankemper> |
[WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good |
[production] |
03:42 |
<ryankemper> |
[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` |
[production] |
03:42 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` |
[production] |
03:42 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` |
[production] |
03:40 |
<ryankemper@deploy1002> |
Finished deploy [wdqs/wdqs@dc7c5ac]: 0.3.100 (duration: 08m 35s) |
[production] |
03:32 |
<ryankemper> |
[WDQS Deploy] Tests passing following deploy of `0.3.100` on canary `wdqs1003`; proceeding to rest of fleet |
[production] |
03:31 |
<ryankemper@deploy1002> |
Started deploy [wdqs/wdqs@dc7c5ac]: 0.3.100 |
[production] |