2022-01-11
§
|
14:13 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance |
[production] |
14:12 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance |
[production] |
14:12 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance |
[production] |
14:12 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance |
[production] |
14:12 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1147 (T297191)', diff saved to https://phabricator.wikimedia.org/P18569 and previous config saved to /var/cache/conftool/dbconfig/20220111-141249-marostegui.json |
[production] |
13:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18568 and previous config saved to /var/cache/conftool/dbconfig/20220111-135744-marostegui.json |
[production] |
13:50 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host dbproxy1021.eqiad.wmnet with OS bullseye |
[production] |
13:43 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. |
[production] |
13:42 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18567 and previous config saved to /var/cache/conftool/dbconfig/20220111-134239-marostegui.json |
[production] |
13:36 |
<btullis@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. |
[production] |
13:36 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. |
[production] |
13:33 |
<moritzm> |
installing 4.9.290 kernels on stretch systems (no reboots yet) |
[production] |
13:29 |
<btullis@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. |
[production] |
13:27 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1147 (T297191)', diff saved to https://phabricator.wikimedia.org/P18565 and previous config saved to /var/cache/conftool/dbconfig/20220111-132734-marostegui.json |
[production] |
13:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1147 (T297191)', diff saved to https://phabricator.wikimedia.org/P18564 and previous config saved to /var/cache/conftool/dbconfig/20220111-132627-marostegui.json |
[production] |
13:26 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance |
[production] |
13:26 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance |
[production] |
13:26 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
13:26 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
13:26 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance |
[production] |
13:26 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance |
[production] |
13:26 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance |
[production] |
13:26 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance |
[production] |
13:12 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
13:11 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM people1003.eqiad.wmnet |
[production] |
13:09 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
13:09 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
13:07 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.reboot-vm for VM people1003.eqiad.wmnet |
[production] |
13:05 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
13:04 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM planet1002.eqiad.wmnet |
[production] |
12:59 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.reboot-vm for VM planet1002.eqiad.wmnet |
[production] |
12:45 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
12:41 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
12:41 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
12:37 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
12:21 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1181 (T297191)', diff saved to https://phabricator.wikimedia.org/P18563 and previous config saved to /var/cache/conftool/dbconfig/20220111-122143-marostegui.json |
[production] |
12:17 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
12:15 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
12:15 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
12:15 |
<cparle@deploy1002> |
Synchronized wmf-config: Config: [[gerrit:752599|Enable support for references (T230315)]] (duration: 01m 00s) |
[production] |
12:14 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kubetcd2004.codfw.wmnet with reason: switch to plain disk storage |
[production] |
12:14 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kubetcd2004.codfw.wmnet with reason: switch to plain disk storage |
[production] |
12:14 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
12:10 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1104 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18562 and previous config saved to /var/cache/conftool/dbconfig/20220111-121025-root.json |
[production] |
12:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18561 and previous config saved to /var/cache/conftool/dbconfig/20220111-120638-marostegui.json |
[production] |
12:00 |
<moritzm> |
reverting kubetcd2004.codfw.wmnet back to "plain" storage |
[production] |
11:56 |
<moritzm> |
rebalance ganeti row A (all nodes reimaged to Buster) |
[production] |
11:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1104 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18560 and previous config saved to /var/cache/conftool/dbconfig/20220111-115522-root.json |
[production] |
11:51 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18559 and previous config saved to /var/cache/conftool/dbconfig/20220111-115133-marostegui.json |
[production] |
11:41 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2019.codfw.wmnet |
[production] |