2022-02-01
|
05:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19711 and previous config saved to /var/cache/conftool/dbconfig/20220201-055638-root.json |
[production] |
05:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1105:3312 (T298558)', diff saved to https://phabricator.wikimedia.org/P19710 and previous config saved to /var/cache/conftool/dbconfig/20220201-055327-marostegui.json |
[production] |
05:53 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance |
[production] |
05:53 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance |
[production] |
05:08 |
<andrew@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet2004-dev.codfw.wmnet with OS bullseye |
[production] |
03:37 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye |
[production] |
03:36 |
<andrew@cumin1001> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudnet2004-dev.codfw.wmnet with OS bullseye |
[production] |
02:26 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
02:25 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
02:25 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
02:24 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
02:18 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye |
[production] |
02:09 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
02:08 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
02:08 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
02:07 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
01:48 |
<ryankemper> |
T282117 Merged https://gerrit.wikimedia.org/r/c/operations/dns/+/717606 and successfully ran `sudo -i authdns-update` on `authdns1001`. `commons-query.wikimedia.org` is online now. (sidenote: go-live date of service is 2022-02-01) |
[production] |
01:42 |
<ryankemper> |
T299222 `ryankemper@cumin1001:~$ sudo cumin 'wcqs*' 'sudo rm -fv /etc/default/wcqs-updater'` |
[production] |
01:42 |
<ryankemper> |
T299222 `ryankemper@cumin1001:~$ sudo cumin 'wdqs*' 'sudo rm -fv /etc/default/wdqs-updater'` |
[production] |
01:24 |
<ryankemper> |
T299222 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/757124; running puppet on `w*qs*` before purging old filepaths |
[production] |
00:31 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
00:30 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
00:30 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
00:28 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
00:24 |
<catrope@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:758495|Enable Local upload on ptwikinews (T300466)]] (duration: 00m 50s) |
[production] |
00:23 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
00:22 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
00:22 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
00:21 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
00:18 |
<ryankemper> |
[WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good |
[production] |
00:11 |
<catrope@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:758033|Lower The Wikipedia Library extension edit count (T288070)]] (duration: 00m 50s) |
[production] |
00:11 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
00:10 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
00:10 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
00:09 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
2022-01-31
|
23:50 |
<dduvall@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production |
[production] |
23:50 |
<dduvall@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging |
[production] |
23:50 |
<dduvall@deploy1002> |
helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production |
[production] |
23:49 |
<dduvall@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production |
[production] |
23:49 |
<dduvall@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging |
[production] |
23:49 |
<dduvall@deploy1002> |
helmfile [codfw] START helmfile.d/services/blubberoid: apply on production |
[production] |
23:44 |
<dduvall@deploy1002> |
helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging |
[production] |
23:44 |
<dduvall@deploy1002> |
helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production |
[production] |
23:44 |
<dduvall@deploy1002> |
helmfile [staging] START helmfile.d/services/blubberoid: apply on staging |
[production] |
23:31 |
<inflatador> |
[WCQS Deploy] Restarted `wcqs-updater` across all hosts: `sudo cumin -b 6 'wcqs*' 'sudo systemctl restart wcqs-updater'` |
[production] |
23:29 |
<bking@deploy1002> |
Finished deploy [wdqs/wdqs@f0287fb] (wcqs): Deploy 0.3.101 to WCQS (duration: 02m 39s) |
[production] |
23:28 |
<inflatador> |
[WCQS Deploy] Tests look good following deploy of `0.3.101` to canary `wcqs1002.eqiad.wmnet`, proceeding to rest of fleet |
[production] |
23:26 |
<bking@deploy1002> |
Started deploy [wdqs/wdqs@f0287fb] (wcqs): Deploy 0.3.101 to WCQS |
[production] |
23:17 |
<inflatador> |
[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` |
[production] |
23:16 |
<inflatador> |
[WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` |
[production] |