2017-10-28
|
18:21 |
<bblack> |
cp1053 - manual VCL change, backends appservers+api_appservers, reduce connect/firstbyte/betweenbytes timeouts from 5/180/60 to 3/20/10 |
[production] |
16:51 |
<elukey> |
restart varnish backend on cp1055 - mailbox lag + T179156 |
[production] |
12:14 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=mw1313.eqiad.wmnet |
[production] |
12:10 |
<elukey> |
manually killed (SIGTERM) hhvm on mw1313 - high load, hhvm-dump-debug not responsive |
[production] |
12:01 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=mw1313.eqiad.wmnet |
[production] |
11:53 |
<elukey> |
restart hhvm on mw1285 - hhvm-dump-debug in /tmp/hhvm.17700.bt |
[production] |
11:24 |
<hoo@tin> |
Synchronized wmf-config/Wikibase-labs.php: Consistency sync (duration: 00m 50s) |
[production] |
10:52 |
<volans> |
restarted pdfrender on scb1001, had been stuck for 2 days with AssertionError: display is not set! |
[production] |
2017-10-27
|
20:54 |
<MaxSem> |
running migratePreferences.php on group2 wikis |
[production] |
19:09 |
<hoo> |
Ran scap pull on mwdebug1001 |
[production] |
18:20 |
<awight@tin> |
Finished deploy [ores/deploy@185170f]: Test pip-9 scap trick on ores1002 (non-production) (duration: 02m 17s) |
[production] |
18:18 |
<awight@tin> |
Started deploy [ores/deploy@185170f]: Test pip-9 scap trick on ores1002 (non-production) |
[production] |
17:54 |
<hoo> |
Taking mwdebug1001 to do tests regarding T179156 |
[production] |
16:38 |
<gehel> |
re-enabling wdqs-updater |
[production] |
16:16 |
<bblack> |
cp1054 varnish backend restarted (was 503s / bad-conns target of ongoing issues) |
[production] |
16:16 |
<gehel> |
wdqs updater is now stopped for real |
[production] |
16:10 |
<XioNoX> |
deactivating BGP sessions to Zayo in eqiad (flapping) |
[production] |
15:58 |
<gehel> |
disabling wdqs updater on all nodes |
[production] |
15:50 |
<hoo@tin> |
Synchronized wmf-config/Wikibase-production.php: Disable constraints check with SPARQL for now (T179156) (duration: 00m 50s) |
[production] |
15:48 |
<marostegui> |
Compress InnoDB on db2038 (s6) - T178359 |
[production] |
15:46 |
<bblack> |
restart varnish-backend on cp4022 (upload@ulsfo) - mailbox |
[production] |
14:49 |
<bblack> |
turn on cp4024 port on asw-ulsfo |
[production] |
13:52 |
<bblack> |
reboot cp4021 to clean up oom messes |
[production] |
13:49 |
<bblack> |
restarting nginx on cp4021, without NUMA memory constraints |
[production] |
12:10 |
<marostegui> |
Optimize commonswiki.templatelinks on dbstore1001 - T162789 |
[production] |
12:03 |
<elukey> |
execute systemctl reset-failed kafka-mirror-main-eqiad_to_jumbo-eqiad.service on kafka-jumbo hosts (old unit not deployed anymore) |
[production] |
11:41 |
<mutante> |
gerrit back - maintenance over |
[production] |
11:39 |
<mutante> |
gerrit restart to apply gerrit:386793 is imminent |
[production] |
11:36 |
<ema> |
cp4023: varnish-backend-restart for lag |
[production] |
11:06 |
<mobrovac@tin> |
Finished deploy [citoid/deploy@ff63420]: Update dependencies (duration: 03m 18s) |
[production] |
11:04 |
<marostegui> |
Optimize cebwiki.templatelinks on db1103 - T174509 |
[production] |
11:03 |
<mobrovac@tin> |
Started deploy [citoid/deploy@ff63420]: Update dependencies |
[production] |
08:56 |
<marostegui> |
Optimize table commonswiki.recentchanges on dbstore1001 - T162789 |
[production] |
08:11 |
<marostegui> |
Drop redundant indexes from pagelinks and templatelinks on s3 wikis only for dbstore2001 and dbstore2002 - T174509 |
[production] |
05:59 |
<marostegui> |
Stop MySQL on db2084 to copy s5 to db2086 - T178359 |
[production] |
05:55 |
<marostegui> |
dbstore1001: convert itwiki.pagelinks to TokuDB - T162789 |
[production] |
05:53 |
<mutante> |
uploaded parsoid_0.8.0 to releases.wikimedia.org (for Subbu - T179134) |
[production] |
05:46 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Repool db1060 - T174509 (duration: 00m 50s) |
[production] |
03:02 |
<bblack> |
cp1067, cp4026 - backend restarts, mailbox lag |
[production] |
2017-10-26
|
22:47 |
<demon@tin> |
rebuilt wikiversions.php and synchronized wikiversions files: wikidata to wmf.4 |
[production] |
22:39 |
<bblack> |
restarting varnish-backend on cp1053 (mailbox lag from ongoing issues elsewhere?) |
[production] |
21:40 |
<bblack> |
raising backend max_connections for api.svc.eqiad.wmnet + appservers.svc.eqiad.wmnet from 1K to 10K on cp1053.eqiad.wmnet (current funnel for the bulk of the 503s) |
[production] |
21:32 |
<hoo@tin> |
Synchronized wmf-config/InitialiseSettings.php: Temporarily disable remex html (T178632) (duration: 00m 50s) |
[production] |
21:30 |
<hoo@tin> |
Synchronized wmf-config/InitialiseSettings.php: Temporarily disable remex html (T178632) (duration: 00m 50s) |
[production] |
21:00 |
<hoo> |
Fully revert all changes related to T178180 |
[production] |
20:58 |
<hoo@tin> |
Synchronized wmf-config/Wikibase.php: Revert "Add property for RDF mapping of external identifiers for Wikidata" (T178180) (duration: 00m 50s) |
[production] |
20:02 |
<ladsgroup@tin> |
Synchronized wmf-config/InitialiseSettings.php: UBN! disable ores for wikidata (T179107) (duration: 00m 50s) |
[production] |
20:00 |
<ladsgroup@tin> |
Synchronized wmf-config/InitialiseSettings.php: UBN! disable ores for wikidata (T179107) (duration: 00m 50s) |
[production] |
19:41 |
<awight@tin> |
Finished deploy [ores/deploy@0adae70]: Increase extractor wikidata API timeout to 15s, T179107 (duration: 07m 25s) |
[production] |
19:36 |
<aaron@tin> |
Started restart [jobrunner/jobrunner@a20d043]: (no justification provided) |
[production] |