2020-09-28
09:02 | <klausman@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production]
09:00 | <elukey@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) | [production]
09:00 | <klausman@cumin1001> | START - Cookbook sre.hosts.downtime | [production]
08:56 | <dcausse> | T263970: recovering lost apifeature indices (copying eqiad indices -> codfw) | [production]
08:55 | <elukey@cumin1001> | START - Cookbook sre.hosts.decommission | [production]
08:53 | <elukey@cumin1001> | END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) | [production]
08:46 | <godog> | swift codfw-prod: bump object weight for ms-be2057 - T261633 | [production]
08:43 | <elukey@cumin1001> | START - Cookbook sre.hosts.decommission | [production]
08:43 | <elukey@cumin1001> | END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) | [production]
08:43 | <elukey@cumin1001> | START - Cookbook sre.hosts.decommission | [production]
08:42 | <elukey@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) | [production]
08:37 | <elukey> | decommission the hadoop test cluster (analytics1028->41) | [production]
08:36 | <elukey@cumin1001> | START - Cookbook sre.hosts.decommission | [production]
08:36 | <elukey@cumin1001> | END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) | [production]
08:35 | <elukey@cumin1001> | START - Cookbook sre.hosts.decommission | [production]
08:34 | <elukey@cumin1001> | END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) | [production]
08:34 | <elukey@cumin1001> | START - Cookbook sre.hosts.decommission | [production]
08:32 | <ema> | text@eqiad: rolling varnish upgrade to 6.0.6-1wm1 T263557 | [production]
08:28 | <kormat@cumin1001> | dbctl commit (dc=all): 'db2125 (re)pooling @ 100%: mobo replaced T260670', diff saved to https://phabricator.wikimedia.org/P12813 and previous config saved to /var/cache/conftool/dbconfig/20200928-082825-kormat.json | [production]
08:21 | <ema> | upload@eqiad: rolling varnish upgrade to 6.0.6-1wm1 T263557 | [production]
08:21 | <kormat@cumin1001> | dbctl commit (dc=all): 'Remove db2113 from contributions/logpager/recentchanges*/watchlist T263842', diff saved to https://phabricator.wikimedia.org/P12812 and previous config saved to /var/cache/conftool/dbconfig/20200928-082114-kormat.json | [production]
08:13 | <kormat@cumin1001> | dbctl commit (dc=all): 'db2125 (re)pooling @ 75%: mobo replaced T260670', diff saved to https://phabricator.wikimedia.org/P12811 and previous config saved to /var/cache/conftool/dbconfig/20200928-081321-kormat.json | [production]
08:07 | <jayme> | restarting pybal on lvs3005 for switching to conf1005 - T196487 | [production]
08:06 | <jayme> | restarting pybal on lvs3006 for switching to conf1005 - T196487 | [production]
08:02 | <jayme> | restarting pybal on lvs3007 for switching to conf1005 - T196487 | [production]
08:02 | <elukey@cumin1001> | END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) | [production]
07:58 | <kormat@cumin1001> | dbctl commit (dc=all): 'db2125 (re)pooling @ 50%: mobo replaced T260670', diff saved to https://phabricator.wikimedia.org/P12810 and previous config saved to /var/cache/conftool/dbconfig/20200928-075817-kormat.json | [production]
07:54 | <elukey@cumin1001> | START - Cookbook sre.hadoop.stop-cluster | [production]
07:43 | <kormat@cumin1001> | dbctl commit (dc=all): 'db2125 (re)pooling @ 25%: mobo replaced T260670', diff saved to https://phabricator.wikimedia.org/P12809 and previous config saved to /var/cache/conftool/dbconfig/20200928-074313-kormat.json | [production]
07:29 | <_joe_> | restarting pybal on the LVS primaries | [production]
07:24 | <dcausse> | T263970: forcing allocation of enwiki_general_1587198756 (chi@eqiad) | [production]
07:18 | <_joe_> | restarting pybal on the backup LVS in eqiad, codfw to pick up the new wikifeeds endpoint | [production]
07:17 | <elukey@cumin1001> | END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) | [production]
07:09 | <elukey@cumin1001> | START - Cookbook sre.presto.roll-restart-workers | [production]
06:59 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Promote es2028 as es1 master in codfw T261717', diff saved to https://phabricator.wikimedia.org/P12806 and previous config saved to /var/cache/conftool/dbconfig/20200928-065938-marostegui.json | [production]
06:15 | <marostegui> | Set innodb_change_buffering = inserts; on db2089 (s5), db2106 (s4), db2108 (s2), db2085 (s1), db2085 (s8), db2087 (s7), db2087 (s6), db2109 (s3) T263443 | [production]
05:55 | <marostegui> | Stop MySQL on es2013 before decommissioning it T263740 | [production]
05:54 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Remove es2013 from dbctl T263740', diff saved to https://phabricator.wikimedia.org/P12805 and previous config saved to /var/cache/conftool/dbconfig/20200928-055410-marostegui.json | [production]
05:48 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool es2013 T263740', diff saved to https://phabricator.wikimedia.org/P12804 and previous config saved to /var/cache/conftool/dbconfig/20200928-054846-marostegui.json | [production]
05:22 | <marostegui> | Decrease labsdb1011 weight | [production]
2020-09-26
19:20 | <chrisalbon> | sudo service uwsgi-ores restart | [production]
02:17 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) | [production]
02:04 | <cdanis@cumin2001> | conftool action : set/pooled=false; selector: dnsdisc=ores,name=eqiad | [production]
02:04 | <cdanis@cumin2001> | conftool action : set/pooled=true; selector: dnsdisc=ores,name=codfw | [production]
01:56 | <cdanis> | ❌cdanis@cumin2001.codfw.wmnet ~ 🕙🍺 sudo cumin 'A:ores and A:codfw' 'systemctl restart celery-ores-worker.service uwsgi-ores.service ' | [production]
01:48 | <cdanis@cumin1001> | conftool action : set/pooled=false; selector: dnsdisc=ores,name=codfw | [production]
01:48 | <cdanis@cumin1001> | conftool action : set/pooled=true; selector: dnsdisc=ores,name=eqiad | [production]
01:17 | <cdanis> | ❌cdanis@ores2001.codfw.wmnet ~ 🕤🍺 sudo systemctl restart uwsgi-ores.service | [production]
01:11 | <cdanis> | ✔️ cdanis@ores2001.codfw.wmnet ~ 🕘🍺 sudo systemctl restart celery-ores-worker.service | [production]