2022-10-26
ยง
|
11:48 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P36476 and previous config saved to /var/cache/conftool/dbconfig/20221026-114840-ladsgroup.json |
[production] |
11:46 |
<sukhe> |
sudo ipmitool -I lanplus -H "cp4046.mgmt.ulsfo.wmnet" -U root -E chassis power cycle |
[production] |
11:45 |
<sukhe@cumin2002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4046.ulsfo.wmnet with OS buster |
[production] |
11:42 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2128 (T318950)', diff saved to https://phabricator.wikimedia.org/P36475 and previous config saved to /var/cache/conftool/dbconfig/20221026-114207-ladsgroup.json |
[production] |
11:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T318950)', diff saved to https://phabricator.wikimedia.org/P36474 and previous config saved to /var/cache/conftool/dbconfig/20221026-114109-ladsgroup.json |
[production] |
11:39 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2128 (T318950)', diff saved to https://phabricator.wikimedia.org/P36473 and previous config saved to /var/cache/conftool/dbconfig/20221026-113941-ladsgroup.json |
[production] |
11:39 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance |
[production] |
11:39 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance |
[production] |
11:39 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2128.codfw.wmnet with reason: Maintenance |
[production] |
11:39 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2128.codfw.wmnet with reason: Maintenance |
[production] |
11:39 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2123 (T318950)', diff saved to https://phabricator.wikimedia.org/P36472 and previous config saved to /var/cache/conftool/dbconfig/20221026-113925-ladsgroup.json |
[production] |
11:38 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1113:3315 (T318950)', diff saved to https://phabricator.wikimedia.org/P36471 and previous config saved to /var/cache/conftool/dbconfig/20221026-113856-ladsgroup.json |
[production] |
11:38 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance |
[production] |
11:38 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance |
[production] |
11:38 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1110 (T318950)', diff saved to https://phabricator.wikimedia.org/P36470 and previous config saved to /var/cache/conftool/dbconfig/20221026-113835-ladsgroup.json |
[production] |
11:33 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1182 (T321312)', diff saved to https://phabricator.wikimedia.org/P36469 and previous config saved to /var/cache/conftool/dbconfig/20221026-113333-ladsgroup.json |
[production] |
11:33 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS buster |
[production] |
11:29 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp4046 |
[production] |
11:29 |
<sukhe@cumin2002> |
START - Cookbook sre.network.configure-switch-interfaces for host cp4046 |
[production] |
11:26 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1182 (T321312)', diff saved to https://phabricator.wikimedia.org/P36468 and previous config saved to /var/cache/conftool/dbconfig/20221026-112634-ladsgroup.json |
[production] |
11:26 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance |
[production] |
11:26 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance |
[production] |
11:26 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1162 (T321312)', diff saved to https://phabricator.wikimedia.org/P36467 and previous config saved to /var/cache/conftool/dbconfig/20221026-112609-ladsgroup.json |
[production] |
11:24 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P36466 and previous config saved to /var/cache/conftool/dbconfig/20221026-112419-ladsgroup.json |
[production] |
11:24 |
<sukhe@cumin2002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4046.ulsfo.wmnet with OS buster |
[production] |
11:23 |
<jmm@cumin2002> |
END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1015.eqiad.wmnet to cluster eqiad and group B |
[production] |
11:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P36465 and previous config saved to /var/cache/conftool/dbconfig/20221026-112328-ladsgroup.json |
[production] |
11:22 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.addnode for new host ganeti1015.eqiad.wmnet to cluster eqiad and group B |
[production] |
11:22 |
<jmm@cumin2002> |
END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1023.eqiad.wmnet to cluster eqiad and group B |
[production] |
11:22 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.addnode for new host ganeti1023.eqiad.wmnet to cluster eqiad and group B |
[production] |
11:20 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1015.eqiad.wmnet |
[production] |
10:55 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P36461 and previous config saved to /var/cache/conftool/dbconfig/20221026-105556-ladsgroup.json |
[production] |
10:54 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2123 (T318950)', diff saved to https://phabricator.wikimedia.org/P36460 and previous config saved to /var/cache/conftool/dbconfig/20221026-105406-ladsgroup.json |
[production] |
10:53 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1110 (T318950)', diff saved to https://phabricator.wikimedia.org/P36459 and previous config saved to /var/cache/conftool/dbconfig/20221026-105315-ladsgroup.json |
[production] |
10:51 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2123 (T318950)', diff saved to https://phabricator.wikimedia.org/P36458 and previous config saved to /var/cache/conftool/dbconfig/20221026-105140-ladsgroup.json |
[production] |
10:51 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance |
[production] |
10:51 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance |
[production] |
10:51 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2111 (T318950)', diff saved to https://phabricator.wikimedia.org/P36457 and previous config saved to /var/cache/conftool/dbconfig/20221026-105129-ladsgroup.json |
[production] |
10:51 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1110 (T318950)', diff saved to https://phabricator.wikimedia.org/P36456 and previous config saved to /var/cache/conftool/dbconfig/20221026-105102-ladsgroup.json |
[production] |
10:51 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance |
[production] |
10:50 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance |
[production] |
10:50 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T318950)', diff saved to https://phabricator.wikimedia.org/P36455 and previous config saved to /var/cache/conftool/dbconfig/20221026-105052-ladsgroup.json |
[production] |
10:50 |
<dcausse> |
restarting blazegraph on wdqs1007 (BlazegraphFreeAllocatorsDecreasingRapidly) |
[production] |
10:47 |
<jmm@cumin2002> |
END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1023.eqiad.wmnet to cluster eqiad and group A |
[production] |
10:46 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
10:46 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.addnode for new host ganeti1023.eqiad.wmnet to cluster eqiad and group A |
[production] |
10:42 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
10:42 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
10:40 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1162 (T321312)', diff saved to https://phabricator.wikimedia.org/P36454 and previous config saved to /var/cache/conftool/dbconfig/20221026-104050-ladsgroup.json |
[production] |
10:38 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |