2017-10-29
|
23:49 |
<ema> |
powercycle cp4024 |
[production] |
22:31 |
<ariel@tin> |
Finished deploy [dumps/dumps@2aa2275]: fix keep setting to work with overrides (duration: 00m 02s) |
[production] |
22:31 |
<ariel@tin> |
Started deploy [dumps/dumps@2aa2275]: fix keep setting to work with overrides |
[production] |
17:55 |
<ariel@tin> |
Finished deploy [dumps/dumps@d8978ce]: add overrides section processing to config file (duration: 00m 04s) |
[production] |
17:55 |
<ariel@tin> |
Started deploy [dumps/dumps@d8978ce]: add overrides section processing to config file |
[production] |
17:23 |
<ariel@tin> |
Finished deploy [dumps/dumps@d426cf7]: batch 7z jobs, multistream job fixup (duration: 00m 02s) |
[production] |
17:23 |
<ariel@tin> |
Started deploy [dumps/dumps@d426cf7]: batch 7z jobs, multistream job fixup |
[production] |
12:54 |
<ema> |
cp4026: restart varnish-be for mbox lag |
[production] |
2017-10-28
|
21:03 |
<bblack> |
cp1067 (current target cache): disabling the relatively-new VCL that sets do_stream=false if !CL on applayer fetches... |
[production] |
19:39 |
<hoo@tin> |
Synchronized wmf-config/CommonSettings.php: Halve the Flow -> Parsoid timeout (100s -> 50s) (T179156) (duration: 00m 51s) |
[production] |
19:39 |
<bblack> |
backend restart on cp1065 |
[production] |
18:39 |
<bblack> |
restarting varnish backend on cp1053 to move the lag/503 issues to another box and buy more time to debug |
[production] |
18:28 |
<bblack> |
cp4025 - restart backend for mailbox lag (upload@ulsfo, unrelated to text-cluster issues) |
[production] |
18:21 |
<bblack> |
cp1053 - manual VCL change, backends appservers+api_appservers, reduce connect/firstbyte/betweenbytes timeouts from 5/180/60 to 3/20/10 |
[production] |
16:51 |
<elukey> |
restart varnish backend on cp1055 - mailbox lag + T179156 |
[production] |
12:14 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=mw1313.eqiad.wmnet |
[production] |
12:10 |
<elukey> |
manually killed (SIGTERM) hhvm on mw1313 - high load, hhvm-dump-debug not responsive |
[production] |
12:01 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=mw1313.eqiad.wmnet |
[production] |
11:53 |
<elukey> |
restart hhvm on mw1285 - hhvm-dump-debug in /tmp/hhvm.17700.bt |
[production] |
11:24 |
<hoo@tin> |
Synchronized wmf-config/Wikibase-labs.php: Consistency sync (duration: 00m 50s) |
[production] |
10:52 |
<volans> |
restarted pdfrender on scb1001, was stuck since 2d with AssertionError: display is not set! |
[production] |
2017-10-27
|
20:54 |
<MaxSem> |
running migratePreferences.php on group2 wikis |
[production] |
19:09 |
<hoo> |
Ran scap pull on mwdebug1001 |
[production] |
18:20 |
<awight@tin> |
Finished deploy [ores/deploy@185170f]: Test pip-9 scap trick on ores1002 (non-production) (duration: 02m 17s) |
[production] |
18:18 |
<awight@tin> |
Started deploy [ores/deploy@185170f]: Test pip-9 scap trick on ores1002 (non-production) |
[production] |
17:54 |
<hoo> |
Taking mwdebug1001 to do tests regarding T179156 |
[production] |
16:38 |
<gehel> |
re-enabling wdqs-updater |
[production] |
16:16 |
<bblack> |
cp1054 varnish backend restarted (was 503s / bad-conns target of ongoing issues) |
[production] |
16:16 |
<gehel> |
wdqs updater is now stopped for real |
[production] |
16:10 |
<XioNoX> |
deactivating BGP sessions to Zayo in eqiad (flapping) |
[production] |
15:58 |
<gehel> |
disabling wdqs updater on all nodes |
[production] |
15:50 |
<hoo@tin> |
Synchronized wmf-config/Wikibase-production.php: Disable constraints check with SPARQL for now (T179156) (duration: 00m 50s) |
[production] |
15:48 |
<marostegui> |
Compress InnoDB on db2038 (s6) - T178359 |
[production] |
15:46 |
<bblack> |
restart varnish-backend on cp4022 (upload@ulsfo) - mailbox |
[production] |
14:49 |
<bblack> |
turn on cp4024 port on asw-ulsfo |
[production] |
13:52 |
<bblack> |
reboot cp4021 to clean up oom messes |
[production] |
13:49 |
<bblack> |
restarting nginx on cp4021, without NUMA memory constraints |
[production] |
12:10 |
<marostegui> |
Optimize commonswiki.templatelinks on dbstore1001 - T162789 |
[production] |
12:03 |
<elukey> |
execute systemctl reset-failed kafka-mirror-main-eqiad_to_jumbo-eqiad.service on kafka-jumbo hosts (old unit not deployed anymore) |
[production] |
11:41 |
<mutante> |
gerrit back - maintenance over |
[production] |
11:39 |
<mutante> |
gerrit restart to apply gerrit:386793 is imminent |
[production] |
11:36 |
<ema> |
cp4023: varnish-backend-restart for lag |
[production] |
11:06 |
<mobrovac@tin> |
Finished deploy [citoid/deploy@ff63420]: Update dependencies (duration: 03m 18s) |
[production] |
11:04 |
<marostegui> |
Optimize cebwiki.templatelinks on db1103 - T174509 |
[production] |
11:03 |
<mobrovac@tin> |
Started deploy [citoid/deploy@ff63420]: Update dependencies |
[production] |
08:56 |
<marostegui> |
Optimize table commonswiki.recentchanges on dbstore1001 - T162789 |
[production] |
08:11 |
<marostegui> |
Drop redundant indexes from pagelinks and templatelinks on s3 wikis only for dbstore2001 and dbstore2002 - T174509 |
[production] |