2024-01-23
|
10:23 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye |
[production] |
10:17 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55324 and previous config saved to /var/cache/conftool/dbconfig/20240123-101718-marostegui.json |
[production] |
10:13 |
<ayounsi@cumin1002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet |
[production] |
10:13 |
<ayounsi@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet |
[production] |
10:12 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.reimage for host db2171.codfw.wmnet with OS bookworm |
[production] |
10:10 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2171:3315 db2171:3316', diff saved to https://phabricator.wikimedia.org/P55323 and previous config saved to /var/cache/conftool/dbconfig/20240123-101056-marostegui.json |
[production] |
10:10 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.decommission for hosts an-master1001.eqiad.wmnet |
[production] |
10:04 |
<ayounsi@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet |
[production] |
10:04 |
<ayounsi@cumin1002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet |
[production] |
10:03 |
<ayounsi@cumin1002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet |
[production] |
10:03 |
<ayounsi@cumin1002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet |
[production] |
10:02 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye |
[production] |
10:02 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55322 and previous config saved to /var/cache/conftool/dbconfig/20240123-100212-marostegui.json |
[production] |
10:00 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55321 and previous config saved to /var/cache/conftool/dbconfig/20240123-100002-marostegui.json |
[production] |
09:59 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance |
[production] |
09:59 |
<ayounsi@cumin1002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet |
[production] |
09:59 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance |
[production] |
09:59 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance |
[production] |
09:59 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance |
[production] |
09:59 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55320 and previous config saved to /var/cache/conftool/dbconfig/20240123-095923-marostegui.json |
[production] |
09:44 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55319 and previous config saved to /var/cache/conftool/dbconfig/20240123-094417-marostegui.json |
[production] |
09:41 |
<ayounsi@cumin1002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet |
[production] |
09:33 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage |
[production] |
09:29 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage |
[production] |
09:29 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55318 and previous config saved to /var/cache/conftool/dbconfig/20240123-092910-marostegui.json |
[production] |
09:24 |
<hashar@deploy2002> |
rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.15 refs T354433 |
[production] |
09:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55317 and previous config saved to /var/cache/conftool/dbconfig/20240123-091404-marostegui.json |
[production] |
09:11 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55316 and previous config saved to /var/cache/conftool/dbconfig/20240123-091154-marostegui.json |
[production] |
09:11 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance |
[production] |
09:11 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance |
[production] |
09:11 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55315 and previous config saved to /var/cache/conftool/dbconfig/20240123-091132-marostegui.json |
[production] |
09:04 |
<ayounsi@cumin1002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet |
[production] |
09:01 |
<ayounsi@cumin1002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet |
[production] |
09:01 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55314 and previous config saved to /var/cache/conftool/dbconfig/20240123-090104-root.json |
[production] |
08:56 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55313 and previous config saved to /var/cache/conftool/dbconfig/20240123-085625-marostegui.json |
[production] |
08:55 |
<taavi> |
updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992245/ https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992359/ |
[production] |
08:51 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye |
[production] |
08:46 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55312 and previous config saved to /var/cache/conftool/dbconfig/20240123-084559-root.json |
[production] |
08:44 |
<gmodena@deploy2002> |
helmfile [staging] DONE helmfile.d/services/eventstreams: apply |
[production] |
08:44 |
<gmodena@deploy2002> |
helmfile [staging] START helmfile.d/services/eventstreams: apply |
[production] |
08:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55311 and previous config saved to /var/cache/conftool/dbconfig/20240123-084301-ladsgroup.json |
[production] |
08:42 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance |
[production] |
08:42 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance |
[production] |
08:42 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance |
[production] |
08:42 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance |
[production] |
08:42 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55310 and previous config saved to /var/cache/conftool/dbconfig/20240123-084244-ladsgroup.json |
[production] |
08:41 |
<ayounsi@cumin1002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet |
[production] |
08:41 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55309 and previous config saved to /var/cache/conftool/dbconfig/20240123-084119-marostegui.json |
[production] |
08:39 |
<ayounsi@cumin1002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet |
[production] |
08:37 |
<gmodena@deploy2002> |
helmfile [staging] START helmfile.d/services/eventstreams: apply |
[production] |
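
The entries above follow a regular four-cell layout (time, user, message, project), each cell on its own line and ending in a `|`. A minimal sketch of how such a dump could be grouped back into records and filtered for failed cookbook runs is shown below. The names `Entry`, `parse_entries`, and `cookbook_failures` are hypothetical helpers introduced for illustration and are not part of any Wikimedia tooling. The regular expressions assume the `START` / `END (PASS)` / `END (FAIL)` message shapes seen in this log.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    time: str      # "HH:MM" as written in the log
    user: str      # e.g. "ayounsi@cumin1002", angle brackets removed
    message: str   # free-text log message
    project: str   # e.g. "production", square brackets removed


# A record starts at a line that holds only an HH:MM timestamp.
TIME_RE = re.compile(r"^\d{2}:\d{2}$")

# Assumed shape of cookbook messages, based on the entries in this log.
COOKBOOK_RE = re.compile(r"^(START|END \((PASS|FAIL)\)) - Cookbook (\S+)")


def parse_entries(lines: List[str]) -> List[Entry]:
    """Group time/user/message/project cells into Entry records."""
    # Drop the trailing cell separator and surrounding whitespace.
    cells = [ln.rstrip().rstrip("|").strip() for ln in lines]
    entries = []
    i = 0
    while i + 3 < len(cells):
        if TIME_RE.match(cells[i]) and cells[i + 1].startswith("<"):
            entries.append(
                Entry(
                    time=cells[i],
                    user=cells[i + 1].strip("<>"),
                    message=cells[i + 2],
                    project=cells[i + 3].strip("[]"),
                )
            )
            i += 4
        else:
            # Skip lines that do not start a record (date header, stray cells).
            i += 1
    return entries


def cookbook_failures(entries: List[Entry]) -> List[Entry]:
    """Return only the entries that report a cookbook ending in FAIL."""
    failures = []
    for entry in entries:
        match = COOKBOOK_RE.match(entry.message)
        if match and match.group(2) == "FAIL":
            failures.append(entry)
    return failures
```

Applied to this log, `cookbook_failures(parse_entries(lines))` would pick out the `END (FAIL)` lines for `sre.hardware.upgrade-firmware`, provided `lines` holds the raw text split on newlines.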