2023-11-23
|
10:27 |
<arnaudb@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: provisionning db2188.codfw.wmnet - T343674 |
[production] |
10:27 |
<arnaudb@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: provisionning db2188.codfw.wmnet - T343674 |
[production] |
10:22 |
<stevemunene@cumin1001> |
START - Cookbook sre.hosts.reimage for host druid1008.eqiad.wmnet with OS bullseye |
[production] |
10:16 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-role for role: swift::proxy |
[production] |
10:09 |
<arnaudb@cumin1001> |
START - Cookbook sre.mysql.clone of db2175.codfw.wmnet onto db2189.codfw.wmnet |
[production] |
10:06 |
<arnaudb@cumin1001> |
dbctl commit (dc=all): 'Cloning db2175 in db2189 for T343674', diff saved to https://phabricator.wikimedia.org/P53737 and previous config saved to /var/cache/conftool/dbconfig/20231123-100638-arnaudb.json |
[production] |
10:05 |
<arnaudb@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: provisionning db2189.codfw.wmnet - T343674 |
[production] |
10:04 |
<arnaudb@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: provisionning db2189.codfw.wmnet - T343674 |
[production] |
10:04 |
<arnaudb@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: provisionning db2189.codfw.wmnet - T343674 |
[production] |
10:04 |
<arnaudb@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: provisionning db2189.codfw.wmnet - T343674 |
[production] |
09:59 |
<stevemunene@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host druid1008.eqiad.wmnet with OS bullseye |
[production] |
09:27 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox |
[production] |
09:26 |
<ayounsi@cumin1001> |
START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox |
[production] |
09:20 |
<arnaudb@cumin1001> |
START - Cookbook sre.mysql.clone of db2149.codfw.wmnet onto db2190.codfw.wmnet |
[production] |
09:19 |
<arnaudb@cumin1001> |
END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2149.codfw.wmnet onto db2190.codfw.wmnet |
[production] |
09:18 |
<arnaudb@cumin1001> |
START - Cookbook sre.mysql.clone of db2149.codfw.wmnet onto db2190.codfw.wmnet |
[production] |
09:18 |
<arnaudb@cumin1001> |
END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2149.codfw.wmnet onto db2190.codfw.wmnet |
[production] |
09:17 |
<arnaudb@cumin1001> |
START - Cookbook sre.mysql.clone of db2149.codfw.wmnet onto db2190.codfw.wmnet |
[production] |
09:15 |
<arnaudb@cumin1001> |
dbctl commit (dc=all): 'Cloning db2149 in db2190 for T343674', diff saved to https://phabricator.wikimedia.org/P53736 and previous config saved to /var/cache/conftool/dbconfig/20231123-091514-arnaudb.json |
[production] |
09:13 |
<arnaudb@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: provisionning db2190.codfw.wmnet - T343674 |
[production] |
09:13 |
<arnaudb@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: provisionning db2190.codfw.wmnet - T343674 |
[production] |
09:13 |
<arnaudb@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: provisionning db2190.codfw.wmnet - T343674 |
[production] |
09:13 |
<arnaudb@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: provisionning db2190.codfw.wmnet - T343674 |
[production] |
09:12 |
<godog> |
add 50G to prometheus/services in codfw |
[production] |
09:10 |
<stevemunene@cumin1001> |
START - Cookbook sre.hosts.reimage for host druid1008.eqiad.wmnet with OS bullseye |
[production] |
09:10 |
<godog> |
add 80G to prometheus/k8s in eqiad |
[production] |
08:49 |
<Emperor> |
powercycle titan1001 |
[production] |
08:45 |
<moritzm> |
powercycling titan1002 |
[production] |
08:37 |
<hashar> |
Restarting CI Jenkins for plugins removals |
[production] |
07:21 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1195.eqiad.wmnet with OS bookworm |
[production] |
07:19 |
<_joe_> |
restarted sirenbot |
[production] |
07:08 |
<hashar> |
Restarted CI Jenkins to upgrade Rebuilder plugin |
[production] |
07:08 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage |
[production] |
07:04 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage |
[production] |
06:53 |
<hashar> |
Restarting Gerrit |
[production] |
06:52 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS bookworm |
[production] |
06:50 |
<hashar> |
Restarting CI Jenkins for plugins removals |
[production] |
06:44 |
<marostegui> |
Failover m2 from db1195 to db1119 - T351638 |
[production] |
06:38 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Switch |
[production] |
06:37 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Switch |
[production] |
06:23 |
<hashar> |
Restarting CI Jenkins for plugin update # T282893 |
[production] |
02:26 |
<jhancock@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2028.codfw.wmnet with OS bullseye |
[production] |
01:26 |
<dzahn@cumin1001> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host planet2003.codfw.wmnet with OS bookworm |
[production] |
01:26 |
<dzahn@cumin1001> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host planet1003.eqiad.wmnet with OS bookworm |
[production] |
01:09 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.reimage for host restbase2028.codfw.wmnet with OS bullseye |
[production] |
01:03 |
<jhancock@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['restbase2033'] |
[production] |
01:02 |
<jhancock@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['restbase2033'] |
[production] |
01:01 |
<jhancock@cumin2002> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase2033.mgmt.codfw.wmnet with reboot policy FORCED |
[production] |
00:51 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.provision for host restbase2033.mgmt.codfw.wmnet with reboot policy FORCED |
[production] |
00:44 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on planet2003.codfw.wmnet with reason: host reimage |
[production] |