2023-02-09
|
14:52 |
<sukhe@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mc-gp1001.eqiad.wmnet |
[production] |
14:52 |
<sukhe@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mc-gp1001.eqiad.wmnet |
[production] |
14:51 |
<sukhe@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['mc-gp1001.eqiad.wmnet'] |
[production] |
14:51 |
<sukhe@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['mc-gp1001.eqiad.wmnet'] |
[production] |
14:50 |
<jiji@cumin1001> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['mc-gp1001.eqiad.wmnet'] |
[production] |
14:49 |
<jiji@cumin1001> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['mc-gp1001.eqiad.wmnet'] |
[production] |
14:46 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2434.codfw.wmnet with OS buster |
[production] |
14:46 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2053.codfw.wmnet with reason: host reimage |
[production] |
14:46 |
<dcausse@deploy1002> |
Finished deploy [wikimedia/discovery/analytics@dc3cd56]: T329089: proper reconciliation of missed page-undelete events (duration: 20m 48s) |
[production] |
14:46 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2433.codfw.wmnet with OS buster |
[production] |
14:45 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2149 (T328255)', diff saved to https://phabricator.wikimedia.org/P44035 and previous config saved to /var/cache/conftool/dbconfig/20230209-144535-ladsgroup.json |
[production] |
14:45 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance |
[production] |
14:45 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance |
[production] |
14:45 |
<jiji@cumin1001> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mc-gp1001.eqiad.wmnet |
[production] |
14:44 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2430.codfw.wmnet with OS buster |
[production] |
14:44 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" |
[production] |
14:44 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2429.codfw.wmnet with OS buster |
[production] |
14:44 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" |
[production] |
14:44 |
<jiji@cumin1001> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mc-gp1001.eqiad.wmnet |
[production] |
14:44 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2432.codfw.wmnet with reason: host reimage |
[production] |
14:43 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc2053.codfw.wmnet with reason: host reimage |
[production] |
14:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1130 (T328817)', diff saved to https://phabricator.wikimedia.org/P44034 and previous config saved to /var/cache/conftool/dbconfig/20230209-144321-marostegui.json |
[production] |
14:43 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1130.eqiad.wmnet with reason: Maintenance |
[production] |
14:43 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db1130.eqiad.wmnet with reason: Maintenance |
[production] |
14:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T328817)', diff saved to https://phabricator.wikimedia.org/P44033 and previous config saved to /var/cache/conftool/dbconfig/20230209-144300-marostegui.json |
[production] |
14:41 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2432.codfw.wmnet with reason: host reimage |
[production] |
14:38 |
<pt1979@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" |
[production] |
14:38 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance |
[production] |
14:38 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance |
[production] |
14:38 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2109 (T328255)', diff saved to https://phabricator.wikimedia.org/P44032 and previous config saved to /var/cache/conftool/dbconfig/20230209-143828-ladsgroup.json |
[production] |
14:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P44031 and previous config saved to /var/cache/conftool/dbconfig/20230209-143704-marostegui.json |
[production] |
14:32 |
<pt1979@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" |
[production] |
14:27 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P44030 and previous config saved to /var/cache/conftool/dbconfig/20230209-142754-marostegui.json |
[production] |
14:27 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS bullseye |
[production] |
14:25 |
<dcausse@deploy1002> |
Started deploy [wikimedia/discovery/analytics@dc3cd56]: T329089: proper reconciliation of missed page-undelete events |
[production] |
14:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P44029 and previous config saved to /var/cache/conftool/dbconfig/20230209-142321-ladsgroup.json |
[production] |
14:22 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage |
[production] |
14:21 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P44028 and previous config saved to /var/cache/conftool/dbconfig/20230209-142157-marostegui.json |
[production] |
14:21 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2432.codfw.wmnet with OS buster |
[production] |
14:19 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage |
[production] |
14:17 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage |
[production] |
14:14 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2431.codfw.wmnet with OS buster |
[production] |
14:14 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage |
[production] |
14:14 |
<dcausse> |
T329089: re-playing detected inconsistencies (missing mediawiki.page-undelete events) from 2022-10-31 to 2023-02-07 to WDQS |
[production] |
14:12 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P44027 and previous config saved to /var/cache/conftool/dbconfig/20230209-141247-marostegui.json |
[production] |
14:09 |
<Lucas_WMDE> |
lucaswerkmeister-wmde@mwdebug1001:~$ mwscript namespaceDupes.php shnwikibooks --fix | tee T328634-1-unpatched.out # T328634 – finished successfully, to my surprise |
[production] |
14:08 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P44026 and previous config saved to /var/cache/conftool/dbconfig/20230209-140815-ladsgroup.json |
[production] |
14:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2158 (T329203)', diff saved to https://phabricator.wikimedia.org/P44025 and previous config saved to /var/cache/conftool/dbconfig/20230209-140650-marostegui.json |
[production] |
14:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db2158 (T329203)', diff saved to https://phabricator.wikimedia.org/P44024 and previous config saved to /var/cache/conftool/dbconfig/20230209-140124-marostegui.json |
[production] |
14:01 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance |
[production] |