2022-01-21
|
09:06 |
<ayounsi@cumin1001> |
END (ERROR) - Cookbook sre.dns.netbox (exit_code=97) |
[production] |
09:06 |
<ayounsi@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
09:04 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reimage for host cp3063.esams.wmnet with OS buster |
[production] |
09:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1032 (re)pooling @ 60%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18968 and previous config saved to /var/cache/conftool/dbconfig/20220121-090113-root.json |
[production] |
09:00 |
<vgutierrez> |
depool cp3063 to be reimaged as cache::upload_envoy - T271421 |
[production] |
08:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1032 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18967 and previous config saved to /var/cache/conftool/dbconfig/20220121-084609-root.json |
[production] |
08:37 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1018.eqiad.wmnet to ganeti01.svc.eqiad.wmnet |
[production] |
08:35 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.addnode for new host ganeti1018.eqiad.wmnet to ganeti01.svc.eqiad.wmnet |
[production] |
08:31 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1018.eqiad.wmnet |
[production] |
08:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1032 (re)pooling @ 40%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18966 and previous config saved to /var/cache/conftool/dbconfig/20220121-083106-root.json |
[production] |
08:27 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host ganeti1018.eqiad.wmnet |
[production] |
08:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1032 (re)pooling @ 25%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18965 and previous config saved to /var/cache/conftool/dbconfig/20220121-081602-root.json |
[production] |
08:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1032 (re)pooling @ 20%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18964 and previous config saved to /var/cache/conftool/dbconfig/20220121-080058-root.json |
[production] |
07:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18963 and previous config saved to /var/cache/conftool/dbconfig/20220121-075801-root.json |
[production] |
07:45 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1032 (re)pooling @ 10%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18962 and previous config saved to /var/cache/conftool/dbconfig/20220121-074555-root.json |
[production] |
07:42 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18961 and previous config saved to /var/cache/conftool/dbconfig/20220121-074257-root.json |
[production] |
07:30 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1032 (re)pooling @ 5%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18960 and previous config saved to /var/cache/conftool/dbconfig/20220121-073051-root.json |
[production] |
07:30 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1032.eqiad.wmnet with OS bullseye |
[production] |
07:27 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 60%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18959 and previous config saved to /var/cache/conftool/dbconfig/20220121-072754-root.json |
[production] |
07:26 |
<elukey> |
elukey@stat1007:~$ sudo systemctl reset-failed product-analytics-movement-metrics.service |
[production] |
07:21 |
<elukey> |
elukey@build2001:~$ sudo systemctl reset-failed ifup@ens13.service |
[production] |
07:19 |
<elukey> |
systemctl reset-failed session-3.scope on an-test-client1001 (failed, transient unit) |
[production] |
07:12 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18958 and previous config saved to /var/cache/conftool/dbconfig/20220121-071250-root.json |
[production] |
07:04 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host es1032.eqiad.wmnet with OS bullseye |
[production] |
06:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es1032 for reimage T299741', diff saved to https://phabricator.wikimedia.org/P18957 and previous config saved to /var/cache/conftool/dbconfig/20220121-065854-marostegui.json |
[production] |
06:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 40%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18956 and previous config saved to /var/cache/conftool/dbconfig/20220121-065746-root.json |
[production] |
06:54 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS bullseye |
[production] |
06:42 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18955 and previous config saved to /var/cache/conftool/dbconfig/20220121-064243-root.json |
[production] |
06:27 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 20%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18954 and previous config saved to /var/cache/conftool/dbconfig/20220121-062739-root.json |
[production] |
06:24 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS bullseye |
[production] |
06:21 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote es2032 to es1 master T299741', diff saved to https://phabricator.wikimedia.org/P18953 and previous config saved to /var/cache/conftool/dbconfig/20220121-062116-marostegui.json |
[production] |
06:19 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2030.codfw.wmnet with OS bullseye |
[production] |
06:12 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18952 and previous config saved to /var/cache/conftool/dbconfig/20220121-061235-root.json |
[production] |
05:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 5%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18951 and previous config saved to /var/cache/conftool/dbconfig/20220121-055732-root.json |
[production] |
05:49 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host es2030.codfw.wmnet with OS bullseye |
[production] |
05:42 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1022 (re)pooling @ 1%: repooling after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P18950 and previous config saved to /var/cache/conftool/dbconfig/20220121-054228-root.json |
[production] |
2022-01-20
|
22:40 |
<inflatador> |
running puppet-merge for https://gerrit.wikimedia.org/r/755810 |
[production] |
22:38 |
<inflatador> |
running puppet-merge for ^^ |
[production] |
22:27 |
<urandom> |
rolling restart of Cassandra, aqs-next -- T298516 |
[production] |
21:04 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1008.eqiad.wmnet with OS buster |
[production] |
20:58 |
<jhathaway> |
rebooting mx1001 to test new kernel |
[production] |
20:40 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
20:38 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
20:38 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
20:37 |
<urandom> |
upgrading Cassandra to 3.11.11, aqs1010 -- T298516 |
[production] |
20:37 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
20:36 |
<jhuneidi@deploy1002> |
rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.18 refs T293959 |
[production] |
20:34 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.reimage for host backup1008.eqiad.wmnet with OS buster |
[production] |
20:32 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
20:31 |
<jhuneidi@deploy1002> |
Synchronized php-1.38.0-wmf.18/extensions/DiscussionTools/includes/HeadingItem.php: Backport: [[gerrit:755684|Prevent assertion failure caused by empty headings (T299583)]] (duration: 00m 50s) |
[production] |