2022-10-26
11:48 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P36476 and previous config saved to /var/cache/conftool/dbconfig/20221026-114840-ladsgroup.json [production]
11:46 <sukhe> sudo ipmitool -I lanplus -H "cp4046.mgmt.ulsfo.wmnet" -U root -E chassis power cycle [production]
11:45 <sukhe@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4046.ulsfo.wmnet with OS buster [production]
11:42 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2128 (T318950)', diff saved to https://phabricator.wikimedia.org/P36475 and previous config saved to /var/cache/conftool/dbconfig/20221026-114207-ladsgroup.json [production]
11:41 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T318950)', diff saved to https://phabricator.wikimedia.org/P36474 and previous config saved to /var/cache/conftool/dbconfig/20221026-114109-ladsgroup.json [production]
11:39 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2128 (T318950)', diff saved to https://phabricator.wikimedia.org/P36473 and previous config saved to /var/cache/conftool/dbconfig/20221026-113941-ladsgroup.json [production]
11:39 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance [production]
11:39 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance [production]
11:39 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2128.codfw.wmnet with reason: Maintenance [production]
11:39 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db2128.codfw.wmnet with reason: Maintenance [production]
11:39 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2123 (T318950)', diff saved to https://phabricator.wikimedia.org/P36472 and previous config saved to /var/cache/conftool/dbconfig/20221026-113925-ladsgroup.json [production]
11:38 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1113:3315 (T318950)', diff saved to https://phabricator.wikimedia.org/P36471 and previous config saved to /var/cache/conftool/dbconfig/20221026-113856-ladsgroup.json [production]
11:38 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [production]
11:38 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [production]
11:38 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1110 (T318950)', diff saved to https://phabricator.wikimedia.org/P36470 and previous config saved to /var/cache/conftool/dbconfig/20221026-113835-ladsgroup.json [production]
11:33 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182 (T321312)', diff saved to https://phabricator.wikimedia.org/P36469 and previous config saved to /var/cache/conftool/dbconfig/20221026-113333-ladsgroup.json [production]
11:33 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS buster [production]
11:29 <sukhe@cumin2002> END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp4046 [production]
11:29 <sukhe@cumin2002> START - Cookbook sre.network.configure-switch-interfaces for host cp4046 [production]
11:26 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1182 (T321312)', diff saved to https://phabricator.wikimedia.org/P36468 and previous config saved to /var/cache/conftool/dbconfig/20221026-112634-ladsgroup.json [production]
11:26 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance [production]
11:26 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance [production]
11:26 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1162 (T321312)', diff saved to https://phabricator.wikimedia.org/P36467 and previous config saved to /var/cache/conftool/dbconfig/20221026-112609-ladsgroup.json [production]
11:24 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P36466 and previous config saved to /var/cache/conftool/dbconfig/20221026-112419-ladsgroup.json [production]
11:24 <sukhe@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4046.ulsfo.wmnet with OS buster [production]
11:23 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1015.eqiad.wmnet to cluster eqiad and group B [production]
11:23 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P36465 and previous config saved to /var/cache/conftool/dbconfig/20221026-112328-ladsgroup.json [production]
11:22 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti1015.eqiad.wmnet to cluster eqiad and group B [production]
11:22 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1023.eqiad.wmnet to cluster eqiad and group B [production]
11:22 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti1023.eqiad.wmnet to cluster eqiad and group B [production]
11:20 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1015.eqiad.wmnet [production]
10:55 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P36461 and previous config saved to /var/cache/conftool/dbconfig/20221026-105556-ladsgroup.json [production]
10:54 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2123 (T318950)', diff saved to https://phabricator.wikimedia.org/P36460 and previous config saved to /var/cache/conftool/dbconfig/20221026-105406-ladsgroup.json [production]
10:53 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1110 (T318950)', diff saved to https://phabricator.wikimedia.org/P36459 and previous config saved to /var/cache/conftool/dbconfig/20221026-105315-ladsgroup.json [production]
10:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2123 (T318950)', diff saved to https://phabricator.wikimedia.org/P36458 and previous config saved to /var/cache/conftool/dbconfig/20221026-105140-ladsgroup.json [production]
10:51 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance [production]
10:51 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance [production]
10:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2111 (T318950)', diff saved to https://phabricator.wikimedia.org/P36457 and previous config saved to /var/cache/conftool/dbconfig/20221026-105129-ladsgroup.json [production]
10:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1110 (T318950)', diff saved to https://phabricator.wikimedia.org/P36456 and previous config saved to /var/cache/conftool/dbconfig/20221026-105102-ladsgroup.json [production]
10:51 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance [production]
10:50 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance [production]
10:50 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T318950)', diff saved to https://phabricator.wikimedia.org/P36455 and previous config saved to /var/cache/conftool/dbconfig/20221026-105052-ladsgroup.json [production]
10:50 <dcausse> restarting blazegraph on wdqs1007 (BlazegraphFreeAllocatorsDecreasingRapidly) [production]
10:47 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1023.eqiad.wmnet to cluster eqiad and group A [production]
10:46 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
10:46 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti1023.eqiad.wmnet to cluster eqiad and group A [production]
10:42 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
10:42 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
10:40 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1162 (T321312)', diff saved to https://phabricator.wikimedia.org/P36454 and previous config saved to /var/cache/conftool/dbconfig/20221026-104050-ladsgroup.json [production]
10:38 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]