2021-04-13
§
|
16:42 |
<dcaro> |
Ceph balancer got the cluster to eval 0.014916, that is 88-77% usage for compute pool, and 28-19% usage for the cinder one \o/ (T274573) |
[admin] |
16:41 |
<arturo> |
create VM toolsbeta-sgeexec-1002 (T277653) |
[toolsbeta] |
16:41 |
<marxarelli> |
deleting errant wmf/1.36.0-wmf.39 branches in mediawiki/core and submodule repos |
[releng] |
16:38 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 100%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15311 and previous config saved to /var/cache/conftool/dbconfig/20210413-163851-root.json |
[production] |
16:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 90%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15310 and previous config saved to /var/cache/conftool/dbconfig/20210413-162347-root.json |
[production] |
16:17 |
<razzi> |
rebalance kafka partitions for webrequest_text partitions 19, 20 |
[analytics] |
16:08 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1029.eqiad.wmnet with reason: REIMAGE |
[production] |
16:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 80%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15309 and previous config saved to /var/cache/conftool/dbconfig/20210413-160844-root.json |
[production] |
16:06 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1028.eqiad.wmnet with reason: REIMAGE |
[production] |
16:05 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1029.eqiad.wmnet with reason: REIMAGE |
[production] |
16:04 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2016.codfw.wmnet with reason: REIMAGE |
[production] |
16:03 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1028.eqiad.wmnet with reason: REIMAGE |
[production] |
16:02 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2015.codfw.wmnet with reason: REIMAGE |
[production] |
16:02 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on parse2016.codfw.wmnet with reason: REIMAGE |
[production] |
16:00 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2014.codfw.wmnet with reason: REIMAGE |
[production] |
16:00 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on parse2015.codfw.wmnet with reason: REIMAGE |
[production] |
15:58 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on parse2014.codfw.wmnet with reason: REIMAGE |
[production] |
15:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 70%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15308 and previous config saved to /var/cache/conftool/dbconfig/20210413-155340-root.json |
[production] |
15:44 |
<arturo> |
delete VMs toolsbeta-sgeexec-0903 and toolsbeta-buster-sgeexec-01 (no longer useful) |
[toolsbeta] |
15:38 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 60%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15307 and previous config saved to /var/cache/conftool/dbconfig/20210413-153836-root.json |
[production] |
15:36 |
<arturo> |
created VM toolsbeta-sgeexec-0903 (buster) (T277653) |
[toolsbeta] |
15:31 |
<arturo> |
live-hacking puppetmaster with https://gerrit.wikimedia.org/r/c/operations/puppet/+/678043/ (T277653) |
[toolsbeta] |
15:26 |
<herron> |
migrating kafka-logging broker logstash1010 to kafka-logging1001 T279342 |
[production] |
15:24 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 50%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15306 and previous config saved to /var/cache/conftool/dbconfig/20210413-152333-root.json |
[production] |
15:21 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
15:12 |
<Trey314159> |
reindexing English wikis on elastic@eqiad, elastic@codfw, and cloudelastic complete (with some failures) (T274200) |
[production] |
15:08 |
<dcaro> |
Activating continuous upmap balancer, keeping a close eye (T274573) |
[admin] |
15:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 40%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15305 and previous config saved to /var/cache/conftool/dbconfig/20210413-150829-root.json |
[production] |
15:03 |
<dcaro> |
Executing a second pass, there are still movements that can improve the eval of 0.030075 (T274573) |
[admin] |
15:02 |
<dcaro> |
First pass finished, improved eval to 0.030075 (T274573) |
[admin] |
14:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 30%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15304 and previous config saved to /var/cache/conftool/dbconfig/20210413-145325-root.json |
[production] |
14:49 |
<dcaro> |
Running the first_pass balancing plan on ceph eqiad, current eval 0.030622 (T274573) |
[admin] |
14:43 |
<dcaro> |
enabling ceph upmap pg balancer on eqiad (T274573) |
[admin] |
14:38 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 20%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15303 and previous config saved to /var/cache/conftool/dbconfig/20210413-143821-root.json |
[production] |
14:36 |
<andrewbogott> |
upgrading codfw1dev to version Victoria, T261137 |
[admin] |
14:34 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db1184 with minimal weight on s1 for the first time T275633', diff saved to https://phabricator.wikimedia.org/P15302 and previous config saved to /var/cache/conftool/dbconfig/20210413-143419-marostegui.json |
[production] |
14:10 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1027.eqiad.wmnet with reason: REIMAGE |
[production] |
14:08 |
<moritzm> |
updated bullseye d-i image to 2021-04-12 daily build T275873 |
[production] |
14:08 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1027.eqiad.wmnet with reason: REIMAGE |
[production] |
14:08 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1026.eqiad.wmnet with reason: REIMAGE |
[production] |
14:06 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1026.eqiad.wmnet with reason: REIMAGE |
[production] |
14:04 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db1184 with minimal weight on s1 for the first time T275633', diff saved to https://phabricator.wikimedia.org/P15301 and previous config saved to /var/cache/conftool/dbconfig/20210413-140431-marostegui.json |
[production] |
14:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1184 (re)pooling @ 20%: Slowly pool db1184 for the first time in s1 T275633', diff saved to https://phabricator.wikimedia.org/P15300 and previous config saved to /var/cache/conftool/dbconfig/20210413-140353-root.json |
[production] |
14:03 |
<_joe_> |
uploading new versions of the mcrouter, php7.2-fpm and php7.3-fpm images to the registry |
[production] |
14:01 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2013.codfw.wmnet with reason: REIMAGE |
[production] |
13:59 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2012.codfw.wmnet with reason: REIMAGE |
[production] |
13:59 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on parse2013.codfw.wmnet with reason: REIMAGE |
[production] |
13:57 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2011.codfw.wmnet with reason: REIMAGE |
[production] |
13:57 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on parse2012.codfw.wmnet with reason: REIMAGE |
[production] |