2014-09-15
|
22:56 |
<Krinkle> |
Running sample job on integration-slave1008 and warming up npmjs.org cache |
[production] |
22:49 |
<Krinkle> |
Running sample job on integration-slave1007 and warming up npmjs.org cache |
[production] |
22:48 |
<Krinkle> |
Pooling the newly set up Trusty-based Jenkins slaves (integration-slave1006, integration-slave1007 and integration-slave1008) |
[production] |
22:42 |
<bblack> |
dropping static routes for 2620:0:861:ed1a::[d,f,10,11] -> lvs1005 from cr[12]-eqiad (only 11 is of any consequence, misc-web-lb, and they're advertised by bgp and this is preventing failover to lvs1002) |
[production] |
21:28 |
<cscott> |
updated OCG to version 188a3c221d927bd0601ef5e1b0c0f4a9d1cdbd31 |
[production] |
20:46 |
<subbu> |
deployed Parsoid version b845bff9 |
[production] |
18:49 |
<ejegg> |
Synchronized php-1.24wmf20/extensions/CentralNotice/: Update CentralNotice to remove jquery.json dependency (duration: 00m 23s) |
[production] |
18:46 |
<hoo> |
Sync to tmh100[12] failed, according to awight |
[production] |
18:44 |
<ejegg> |
Synchronized php-1.24wmf21/extensions/CentralNotice/: Update CentralNotice to remove jquery.json dependency (duration: 00m 09s) |
[production] |
18:43 |
<manybubbles> |
performance tests show cirrus should handle jawiki with no problem but if load spirals out of control and I'm not around then revert https://gerrit.wikimedia.org/r/#/c/160465/ |
[production] |
18:40 |
<hoo> |
Local part of the global rename of Gnumarcoo => .avgas fatally timed out on itwiki. This needs to be fixed by hand. |
[production] |
18:40 |
<manybubbles> |
Setting Cirrus to jawiki's primary search backend went well but Japan is mostly asleep. If Elasticsearch load takes a turn for the worse in four or five hours then we'll know how it went. |
[production] |
17:14 |
<bd808> |
Restarted elasticsearch on logstash1003; 2014-09-14T09:33:57Z java.lang.OutOfMemoryError |
[production] |
17:09 |
<_joe_> |
killing salt-call on all mediawiki hosts |
[production] |
17:06 |
<bd808> |
Restarted elasticsearch on logstash1001; 2014-09-15T06:12:09Z java.lang.OutOfMemoryError |
[production] |
17:04 |
<bblack> |
using salt to kill salt-minion everywhere... |
[production] |
17:02 |
<bd808> |
Restarted logstash on logstash1001. I hoped this would fix the dashboards, but it looks like the backing elasticsearch cluster is too sad for them to work at the moment. |
[production] |
16:55 |
<bd808> |
Restarted hung elasticsearch service on logstash1002 |
[production] |
16:15 |
<manybubbles> |
jawiki now has cirrus as primary. we're back to where we were before the great cascading failure of two months ago |
[production] |
16:13 |
<manybubbles> |
Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 06s) |
[production] |
15:29 |
<marktraceur> |
Synchronized php-1.24wmf21/extensions/MultimediaViewer/: [SWAT] Several backports for metrics and bugfixes in Media Viewer (duration: 00m 07s) |
[production] |
15:27 |
<marktraceur> |
Synchronized php-1.24wmf20/extensions/MultimediaViewer/: [SWAT] Several backports for metrics and bugfixes in Media Viewer (duration: 00m 07s) |
[production] |
15:18 |
<marktraceur> |
Synchronized php-1.24wmf21/extensions/GeoCrumbs/GeoCrumbs.class.php: [SWAT] Handle return value NULL of GeoCrumbs::getParserCache (duration: 00m 07s) |
[production] |
15:17 |
<marktraceur> |
Synchronized php-1.24wmf20/extensions/GeoCrumbs/GeoCrumbs.class.php: [SWAT] Handle return value NULL of GeoCrumbs::getParserCache (duration: 00m 07s) |
[production] |
15:06 |
<marktraceur> |
Synchronized wmf-config/: [SWAT] Remove 'renameuser' right from bureaucrats on CentralAuth wikis (duration: 00m 09s) |
[production] |
14:54 |
<aude> |
Synchronized wmf-config/Wikibase.php: Bump wikibase memcached key for test.wikidata, test, test2 (duration: 00m 16s) |
[production] |
14:54 |
<hashar> |
Updated Jenkins Job Builder fork: e5c0c61..2d74b16 |
[production] |
14:50 |
<aude> |
Finished scap: Put test.wikidata back on mw1.24-wmf19 extension branch (duration: 37m 27s) |
[production] |
14:43 |
<manybubbles> |
restarting the enwiki cirrus reindex process - it crashed over the weekend. why you crash and leave error message "1". "1" is not a useful error message. |
[production] |
14:13 |
<aude> |
Started scap: Put test.wikidata back on mw1.24-wmf19 extension branch |
[production] |
13:03 |
<_joe_> |
fenari is swapping hard, restarting apache, which was eating up all the RAM |
[production] |
09:20 |
<hashar> |
Synchronized wmf-config/InitialiseSettings.php: *.scienceimage.csiro.au to the wgCopyUploadsDomains {{gerrit|159999}} {{bug|70771}} (duration: 00m 06s) |
[production] |
09:16 |
<hashar> |
Jenkins: apt-get upgrade on prod slaves (updates php5 / libc / jdk 7) |
[production] |
03:09 |
<springle> |
Synchronized wmf-config/db-eqiad.php: depool db1036 (duration: 00m 09s) |
[production] |
02:03 |
<LocalisationUpdate> |
failed: mwversionsinuse returned empty list |
[production] |
01:47 |
<hoo> |
Synchronized wmf-config/liquidthreads.php: Remove global $path (duration: 00m 07s) |
[production] |
01:47 |
<hoo> |
Synchronized wmf-config/flaggedrevs.php: Remove global $path (duration: 00m 10s) |
[production] |
2014-09-14
|
20:37 |
<ori_> |
enabling puppet on mw1053 |
[production] |
20:11 |
<springle> |
Synchronized wmf-config/db-eqiad.php: depool db1062, locked up (duration: 00m 09s) |
[production] |
13:24 |
<_joe_> |
stopped puppet and the JR on mw1053 |
[production] |
12:42 |
<hoo> |
Ran sync-common on mw1053 to stop "Unrecognized job type 'ChangeNotification'." exceptions |
[production] |
11:14 |
<springle> |
Synchronized wmf-config/db-eqiad.php: repool es1005 (duration: 00m 07s) |
[production] |
10:37 |
<springle> |
restart es1005 |
[production] |
09:56 |
<springle> |
Synchronized wmf-config/db-eqiad.php: repool es1007, depool es1005 (duration: 00m 10s) |
[production] |
02:01 |
<LocalisationUpdate> |
failed: mwversionsinuse returned empty list |
[production] |
00:45 |
<ori_> |
fenari appears to still have twemproxy (in addition to nutcracker); decom'ing. |
[production] |
00:29 |
<ori_> |
restarting apache2 on fenari |
[production] |