2022-08-02
§
|
10:34 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
10:33 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 100%: After restart', diff saved to https://phabricator.wikimedia.org/P32147 and previous config saved to /var/cache/conftool/dbconfig/20220802-103318-root.json |
[production] |
10:18 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 75%: After restart', diff saved to https://phabricator.wikimedia.org/P32146 and previous config saved to /var/cache/conftool/dbconfig/20220802-101813-root.json |
[production] |
10:15 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Add db2175 to s2 T311494', diff saved to https://phabricator.wikimedia.org/P32145 and previous config saved to /var/cache/conftool/dbconfig/20220802-101522-marostegui.json |
[production] |
10:12 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1019.eqiad.wmnet with OS bullseye |
[production] |
10:05 |
<jynus> |
shutdown dbprov2002 backup2005 backup2008 T310070 |
[production] |
10:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 50%: After restart', diff saved to https://phabricator.wikimedia.org/P32144 and previous config saved to /var/cache/conftool/dbconfig/20220802-100308-root.json |
[production] |
10:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32143 and previous config saved to /var/cache/conftool/dbconfig/20220802-100304-root.json |
[production] |
09:54 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove db2079 from dbctl T313885', diff saved to https://phabricator.wikimedia.org/P32141 and previous config saved to /var/cache/conftool/dbconfig/20220802-095455-marostegui.json |
[production] |
09:52 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbproxy1019.eqiad.wmnet with reason: host reimage |
[production] |
09:49 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on dbproxy1019.eqiad.wmnet with reason: host reimage |
[production] |
09:49 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. |
[production] |
09:48 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: After restart', diff saved to https://phabricator.wikimedia.org/P32140 and previous config saved to /var/cache/conftool/dbconfig/20220802-094804-root.json |
[production] |
09:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32139 and previous config saved to /var/cache/conftool/dbconfig/20220802-094759-root.json |
[production] |
09:44 |
<godog> |
grow sdb3 by 100G on thanos-be2004 - T314275 |
[production] |
09:43 |
<btullis@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. |
[production] |
09:42 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. |
[production] |
09:37 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reimage for host dbproxy1019.eqiad.wmnet with OS bullseye |
[production] |
09:36 |
<btullis@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. |
[production] |
09:33 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: After restart', diff saved to https://phabricator.wikimedia.org/P32138 and previous config saved to /var/cache/conftool/dbconfig/20220802-093259-root.json |
[production] |
09:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32137 and previous config saved to /var/cache/conftool/dbconfig/20220802-093254-root.json |
[production] |
09:30 |
<btullis@puppetmaster1001> |
conftool action : set/pooled=no; selector: cluster=wikireplicas-b,name=dbproxy1019.eqiad.wmnet |
[production] |
09:30 |
<btullis@puppetmaster1001> |
conftool action : set/pooled=yes; selector: cluster=wikireplicas-b,name=dbproxy1018.eqiad.wmnet |
[production] |
09:28 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. |
[production] |
09:26 |
<btullis@puppetmaster1001> |
conftool action : set/pooled=inactive; selector: cluster=wikireplicas-a,name=dbproxy1019.eqiad.wmnet |
[production] |
09:25 |
<btullis@puppetmaster1001> |
conftool action : set/pooled=yes; selector: cluster=wikireplicas-a,name=dbproxy1018.eqiad.wmnet |
[production] |
09:22 |
<btullis@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. |
[production] |
09:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 5%: After restart', diff saved to https://phabricator.wikimedia.org/P32136 and previous config saved to /var/cache/conftool/dbconfig/20220802-091754-root.json |
[production] |
09:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32135 and previous config saved to /var/cache/conftool/dbconfig/20220802-091749-root.json |
[production] |
09:15 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db2143', diff saved to https://phabricator.wikimedia.org/P32134 and previous config saved to /var/cache/conftool/dbconfig/20220802-091518-root.json |
[production] |
09:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 2%: After restart', diff saved to https://phabricator.wikimedia.org/P32133 and previous config saved to /var/cache/conftool/dbconfig/20220802-090250-root.json |
[production] |
09:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1174 (re)pooling @ 5%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32132 and previous config saved to /var/cache/conftool/dbconfig/20220802-090245-root.json |
[production] |
08:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1181 (re)pooling @ 1%: After restart', diff saved to https://phabricator.wikimedia.org/P32131 and previous config saved to /var/cache/conftool/dbconfig/20220802-084745-root.json |
[production] |
08:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1174 (re)pooling @ 1%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32130 and previous config saved to /var/cache/conftool/dbconfig/20220802-084740-root.json |
[production] |
08:46 |
<marostegui> |
stop mysql on db2095 db2107 db2109 db2137 db2147 db2159 db2160 pc2012 for pdu maintenance on codfw b5 T310070 |
[production] |
07:49 |
<moritzm> |
upgrading drmrs ganeti clusters to 3.0.2 T312637 |
[production] |
07:33 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2005.codfw.wmnet with reason: Switch instance to plain disks, T311686 |
[production] |
07:33 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2005.codfw.wmnet with reason: Switch instance to plain disks, T311686 |
[production] |
07:22 |
<godog> |
bounce icinga on alert2001 - T314353 |
[production] |
07:18 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2005.codfw.wmnet with reason: Switch instance to DRBD, T311686 |
[production] |
07:18 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2005.codfw.wmnet with reason: Switch instance to DRBD, T311686 |
[production] |
06:58 |
<elukey> |
restart rsyslog on ml-serve2006 |
[production] |
06:56 |
<ladsgroup@deploy1002> |
Synchronized php-1.39.0-wmf.22/extensions/FlaggedRevs/maintenance/pruneRevData.php: Backport: [[gerrit:819077|pruneRevData: Make cleaning in larger batches (T296380)]] (duration: 03m 26s) |
[production] |
06:56 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
06:55 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
06:55 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
06:54 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
06:46 |
<godog> |
bounce icinga on alert1001 - T314353 |
[production] |
05:48 |
<marostegui@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts db2088.codfw.wmnet |
[production] |
05:48 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |