2018-07-11
|
22:18 |
<bsitzmann@deploy1001> |
Started deploy [mobileapps/deploy@03fa731]: Update mobileapps to b5e152d (T195325 T189830 T177619 T196523) |
[production] |
21:53 |
<elukey> |
restart rsyslog on lithium - in:imtcp stuck in EAGAIN (Resource temporarily unavailable) due to an old socket to tegmen.wikimedia.org |
[production] |
21:47 |
<twentyafterfour> |
disabled phabricator throttling |
[production] |
21:47 |
<elukey> |
re-enable kafka mirror maker on kafka100[1-3] |
[production] |
21:41 |
<ppchelko@deploy1001> |
Finished deploy [changeprop/deploy@45c3807]: Temporary decrease concurrency (duration: 01m 17s) |
[production] |
21:40 |
<ppchelko@deploy1001> |
Started deploy [changeprop/deploy@45c3807]: Temporary decrease concurrency |
[production] |
21:16 |
<elukey> |
starting kafka on kafka100[1-3] after zk cleanup |
[production] |
21:14 |
<_joe_> |
repooling mw1280 |
[production] |
20:57 |
<krinkle@deploy1001> |
Synchronized wmf-config/mc.php: Ifa659de6453 - Revert multi-write mcrouter for most wikis - T198239 (duration: 00m 58s) |
[production] |
20:49 |
<_joe_> |
depooling mw1280 for debugging |
[production] |
20:48 |
<_joe_> |
repooled mw1223 after investigation |
[production] |
20:39 |
<_joe_> |
depooling mw1223 for debugging |
[production] |
20:20 |
<bearND> |
rolled back mobileapps deploy |
[production] |
20:20 |
<bsitzmann@deploy1001> |
Finished deploy [mobileapps/deploy@03fa731]: Update mobileapps to b5e152d (T195325 T189830 T177619 T196523) (duration: 03m 30s) |
[production] |
20:16 |
<bsitzmann@deploy1001> |
Started deploy [mobileapps/deploy@03fa731]: Update mobileapps to b5e152d (T195325 T189830 T177619 T196523) |
[production] |
20:03 |
<volans> |
rolling restarting mediawiki API in eqiad with the highest load |
[production] |
19:45 |
<gehel> |
note: updater was not and will not be using kafka |
[production] |
19:45 |
<gehel> |
restarting wdqs-updater on wdqs1010 (still not using Kafka) |
[production] |
19:31 |
<elukey> |
cleaned up *change-prop.retry.change-prop.retry* in /srv/kafka/data on kafka100[1-3] |
[production] |
19:22 |
<akosiaris> |
stop changeprop on all scb hosts |
[production] |
19:16 |
<_joe_> |
restarting a few hhvm appservers with high load |
[production] |
18:44 |
<akosiaris> |
ok, change merged, running puppet on scb hosts |
[production] |
18:34 |
<elukey> |
restarted topic nuke script for kafka main |
[production] |
18:14 |
<elukey> |
start kafka on kafka1002 |
[production] |
18:14 |
<elukey> |
stop mirror makers on kafka100[1-3] |
[production] |
18:10 |
<jynus@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Repool db1086 fully (duration: 00m 57s) |
[production] |
17:36 |
<_joe_> |
restarted cpjobqueue in codfw |
[production] |
17:31 |
<elukey> |
restart kafka on kafka1001 (oom registered) |
[production] |
17:12 |
<elukey> |
restart kafka on kafka1003 with 2G heap settings |
[production] |
17:10 |
<elukey> |
restart kafka on kafka1002 with 2G heap settings |
[production] |
17:06 |
<elukey> |
restarted kafka on kafka1001 with Xmx 2G and Xms 2G |
[production] |
17:05 |
<_joe_> |
restarting cpjobqueue on scb2001 |
[production] |
17:00 |
<oblivian@puppetmaster1001> |
conftool action : set/pooled=false; selector: dnsdisc=eventbus,name=eqiad |
[production] |
16:50 |
<elukey> |
stop topics cleaner script |
[production] |
16:40 |
<_joe_> |
masking and stopping cpjobqueue, changeprop everywhere |
[production] |
16:36 |
<elukey> |
start topic clean procedure on kafka1001 (tmux root session) |
[production] |
16:23 |
<jynus@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Repool db1086 with low load (duration: 00m 58s) |
[production] |
16:19 |
<elukey> |
restart kafka on kafka1003 |
[production] |
16:19 |
<Pchelolo> |
restart cpjobqueue |
[production] |
16:02 |
<ejegg> |
disabled failmail logstream for SmashPig |
[production] |
15:40 |
<_joe_> |
restarting eventbus on kafka-main in eqiad |
[production] |
15:26 |
<chasemp> |
install dtach on labnet1003 |
[production] |
15:18 |
<_joe_> |
restarting kafka on kafka1002 |
[production] |
15:16 |
<gehel> |
killing wdqs-updater on wdqs10(09|10) to diminish load on kafka |
[production] |
15:11 |
<elukey> |
restart kafka again on kafka100[1,2] - failed with OOM |
[production] |
15:08 |
<Pchelolo> |
stop cpjobqueue in eqiad |
[production] |
15:03 |
<elukey> |
restart kafka on kafka1003 |
[production] |
14:58 |
<papaul> |
shutting down furud to disconnect disk array shelves |
[production] |
14:57 |
<elukey> |
rolling restart of eventbus on kafka100[1-3] |
[production] |
14:53 |
<elukey> |
restart kafka on kafka1002 |
[production] |