2020-04-27 §
09:34 <jynus@cumin2001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
09:33 <jynus@cumin2001> START - Cookbook sre.hosts.decommission [production]
09:33 <jynus@cumin2001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
09:32 <jynus@cumin2001> START - Cookbook sre.hosts.decommission [production]
09:32 <jynus@cumin2001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
09:31 <jynus@cumin2001> START - Cookbook sre.hosts.decommission [production]
09:25 <marostegui> Stop MySQL on labsdb1012 to reclone labsdb1011 - T249188 [production]
09:11 <marostegui> Deploy schema change on s1 codfw, lag will show up - T250055 [production]
08:52 <moritzm> restarting cas on idp1001 to pick up Java 11 security update (will void active SSO sessions) [production]
08:26 <marostegui> Deploy schema change on s5 codfw, lag will show up - T250055 [production]
08:24 <kormat> Truncating and optimizing parsercache for pc1010 and pc2010 T247787 [production]
08:18 <mutante> running puppet on all cp-ats [production]
08:15 <godog> add 80G to prometheus global LV [production]
07:25 <elukey> roll restart elastic-chi on cloudelastic100[1-4] to pick up the last JVM GC settings - T231517 [production]
07:15 <marostegui> Kill updateSpecialPages.php wikidatawiki --override --only=Fewestrevisions as it is causing lag - T238199 [production]
07:14 <elukey> powercycle an-worker1089 - unreachable via ssh, mgmt serial available, soft cpu lock events registered in dmesg [production]
06:59 <elukey> force ifdown/ifup eno1 on analytics1052 - interface negotiated speed flapping [production]
06:42 <moritzm> installing Java security updates on IDP hosts, will void current SSO sessions [production]
06:30 <elukey@puppetmaster1001> conftool action : set/pooled=inactive; selector: name=mw1280.eqiad.wmnet [production]
06:22 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
06:19 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
06:00 <marostegui> Stop MySQL on labsdb1011 for reimage - T249188 [production]
05:58 <moritzm> installing git security updates on jessie [production]
05:56 <marostegui> Compress tables on db1104 - T232446 [production]
05:53 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1104 for defragmentation - T232446', diff saved to https://phabricator.wikimedia.org/P11039 and previous config saved to /var/cache/conftool/dbconfig/20200427-055320-marostegui.json [production]
05:47 <vgutierrez> rolling restart ats-tls in cp[1085,1089] and text@esams - T249335 [production]
05:33 <marostegui> Depool labsdb1011 T249188 [production]
2020-04-26 §
18:08 <elukey> powercycle puppetmaster1001 - mgmt serial console not usable, no ssh, racadm getsel doesn't show anything [production]
2020-04-25 §
10:23 <addshore> going to restart and probably depool for a short time wdqs1005 as it is in a deadlock T242453 [production]
05:52 <_joe_> depooling mw1407 again, should not be serving traffic [production]
05:27 <shdubsh> restart elasticsearch on logstash2022 [production]
2020-04-24 §
21:25 <cdanis@cumin1001> conftool action : set/pooled=true; selector: dnsdisc=wdqs,name=eqiad [production]
19:41 <Amir1> applying T114117 on labswiki (wikitech) [production]
18:58 <shdubsh> restart elasticsearch on logstash2021 [production]
18:50 <shdubsh> restart elasticsearch on logstash2020 [production]
15:12 <cdanis@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=wdqs,name=eqiad [production]
15:08 <addshore> depool and restart wdqs1006 to catch up with lag after deadlock T242453 [production]
11:13 <Amir1> apply T250071 on s10 (labswiki) [production]
2020-04-23 §
22:06 <Urbanecm> Perform timeouting rename at enwiki Wikipedia talk:Introduction --> Wikipedia talk:Introduction (historical) using moveBatch.php ([[:meta:Special:Diff/20009402|request]]) [production]
18:38 <ejegg> updated payments-wiki from 1640f5e21e to 45bf1734e0 [production]
2020-04-22 §
08:55 <Urbanecm> Move User:Wikipedia:Introduction (historical) --> Wikipedia:Introduction (historical) at enwiki using moveBatch.php, on-wiki interface was time-outing [production]
05:50 <elukey@deploy1001> Finished deploy [analytics/refinery@30facc4]: Test of new scap settings (duration: 04m 42s) [production]
05:45 <elukey@deploy1001> Started deploy [analytics/refinery@30facc4]: Test of new scap settings [production]
05:25 <elukey@deploy1001> deploy aborted: log (duration: 00m 02s) [production]
05:24 <elukey@deploy1001> Started deploy [analytics/refinery@30facc4]: log [production]
01:55 <milimetric@deploy1001> Finished deploy [analytics/refinery@30facc4]: Analytics: another follow-up on the train, jar version bump (take 2, analytics1030 keeps failing) (duration: 00m 42s) [production]
01:54 <milimetric@deploy1001> Started deploy [analytics/refinery@30facc4]: Analytics: another follow-up on the train, jar version bump (take 2, analytics1030 keeps failing) [production]
01:54 <milimetric@deploy1001> Finished deploy [analytics/refinery@30facc4]: Analytics: another follow-up on the train, jar version bump (duration: 02m 54s) [production]
01:51 <milimetric@deploy1001> Started deploy [analytics/refinery@30facc4]: Analytics: another follow-up on the train, jar version bump [production]
01:51 <milimetric@deploy1001> deploy aborted: Analytics: another follow-up on the train, jar version bump (duration: 04m 08s) [production]