2013-11-11
|
08:50 |
<pp-pdf1> |
- update mwlib.rl to 0.14.4. |
[production] |
06:43 |
<ori-l> |
scap: "sudo: no tty present and no askpass program specified" for snapshot1 & snapshot4 |
[production] |
06:39 |
<apergos> |
probably gratuitous powercycle of sq80, it seems fine now in any case |
[production] |
06:33 |
<ori> |
Finished syncing Wikimedia installation... : |
[production] |
06:32 |
<apergos> |
sq48 repeat of these errors and hung again, so rt #6274 opened |
[production] |
06:28 |
<apergos> |
powercycled hung sq48, took two tries to come up, "NMI received for unknown reason 31 on CPU 0" and "mptbase: ioc0: ERROR - Failed to come READY after reset" |
[production] |
06:28 |
<ori> |
Started syncing Wikimedia installation... : |
[production] |
02:40 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Mon Nov 11 02:40:33 UTC 2013 |
[production] |
02:29 |
<ori-l> |
Apache logs filled with "SearchPhaseExecutionException[Failed to execute phase [dfs], all shards failed" |
[production] |
02:15 |
<ori-l> |
Continuing inspection of logs on fluorine. memcached-serious.log is flooded with 'Memcached error for key [...]' errors, problem started in May or June judging by log sizes. |
[production] |
02:14 |
<LocalisationUpdate> |
completed (1.23wmf3) at Mon Nov 11 02:14:09 UTC 2013 |
[production] |
02:08 |
<LocalisationUpdate> |
completed (1.23wmf2) at Mon Nov 11 02:07:59 UTC 2013 |
[production] |
02:04 |
<ori-l> |
Earlier issue identified by Ryan and Leslie as intermittent packet loss between eqiad and esams, due to capacity issue with provider. |
[production] |
2013-11-10
|
23:42 |
<ori-l> |
CPU overload in text caches esams |
[production] |
23:42 |
<ori-l> |
Per Ryan: packet loss from esams to eqiad on xe-4-2-2.cr1-eqiad.wikimedia.org |
[production] |
23:26 |
<ori-l> |
redis.log: flooded with 'Used automatic re-authentication for Lua script [...]' (68,955 such messages) |
[production] |
23:24 |
<ori-l> |
fatal log: Fatal error: Allowed memory size of 201326592 bytes exhausted (tried to allocate 72 bytes) at /usr/local/apache/common-local/php-1.23wmf2/extensions/WikibaseDataModel/DataModel/Entity/Entity.php on line 130 (47 such fatals) |
[production] |
23:21 |
<ori-l> |
poolcounter.log: Pool counter is full (multiple wikis) |
[production] |
23:17 |
<ori-l> |
exception.log: Exception from line 110 of /usr/local/apache/common-local/php-1.23wmf2/includes/WikiPage.php: Invalid or virtual namespace -1 given. (2 such errors) |
[production] |
23:14 |
<ori-l> |
exception.log: Exception from line 114 of /usr/local/apache/common-local/php-1.23wmf2/includes/upload/UploadStash.php: UploadStash::getFile No user is logged in, files must belong to users (8 such errors) |
[production] |
23:14 |
<ori-l> |
exception.log: Exception from line 61 of /usr/local/apache/common-local/php-1.23wmf2/includes/media/ImageHandler.php: No width specified to ImageHandler::makeParamString (194 such errors) |
[production] |
23:13 |
<ori-l> |
Investigating possible site issue and logging everything that I come across. |
[production] |
02:52 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Sun Nov 10 02:52:22 UTC 2013 |
[production] |
02:20 |
<LocalisationUpdate> |
completed (1.23wmf3) at Sun Nov 10 02:20:46 UTC 2013 |
[production] |
02:11 |
<LocalisationUpdate> |
completed (1.23wmf2) at Sun Nov 10 02:11:33 UTC 2013 |
[production] |
2013-11-09
|
13:46 |
<springle> |
starting pt-kill daemon on S1 slaves for SpecialAllpages::showToplevel queries > 30s |
[production] |
10:08 |
<paravoid> |
upgraded ganglia-web to 3.5.10 |
[production] |
05:37 |
<aaron> |
synchronized php-1.23wmf3/includes/filebackend/SwiftFileBackend.php '519883bd67c960998e8cb2eb0cda52f3764192f7' |
[production] |
05:35 |
<aaron> |
synchronized php-1.23wmf2/includes/filebackend/SwiftFileBackend.php '519883bd67c960998e8cb2eb0cda52f3764192f7' |
[production] |
02:11 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Sat Nov 9 02:10:54 UTC 2013 |
[production] |
02:05 |
<LocalisationUpdate> |
completed (1.23wmf3) at Sat Nov 9 02:04:55 UTC 2013 |
[production] |
02:03 |
<LocalisationUpdate> |
completed (1.23wmf2) at Sat Nov 9 02:03:45 UTC 2013 |
[production] |
01:23 |
<gwicke> |
synchronized php-1.23wmf3/extensions/Parsoid/php |
[production] |
01:20 |
<aaron> |
synchronized php-1.23wmf2/maintenance 'c674d4c90c40d18187d893ed7a7ea94f43a1a624' |
[production] |
01:19 |
<aaron> |
synchronized php-1.23wmf3/maintenance 'c674d4c90c40d18187d893ed7a7ea94f43a1a624' |
[production] |
01:16 |
<gwicke> |
synchronized php-1.23wmf2/extensions/Parsoid/php |
[production] |
01:11 |
<aaron> |
synchronized php-1.23wmf3/includes/filebackend '2afdc066f52b54faf63f9c980f7ef6a7841dd094' |
[production] |
01:11 |
<aaron> |
synchronized php-1.23wmf2/includes/filebackend '2afdc066f52b54faf63f9c980f7ef6a7841dd094' |
[production] |
00:46 |
<aaron> |
synchronized php-1.23wmf3/maintenance |
[production] |
00:32 |
<aaron> |
synchronized php-1.23wmf2/maintenance/cleanupUploadStash.php '01da29d3ce23cea02df3fc975d6d51bfbc50a222' |
[production] |
2013-11-08
|
21:38 |
<ori> |
updated /a/common to {{Gerrit|I30f4e5975}}: logmsg-git-hook: fix commit determination logic & run on new MW branch creation |
[production] |
21:22 |
<marc> |
synchronized wmf-config/InitialiseSettings.php 'Bug: 56203 - Various changes to wikidatawiki's user rights configuration (for real this time)' |
[production] |
21:00 |
<marc> |
synchronized wmf-config/InitialiseSettings.php 'Bug: 56203 - Various changes to wikidatawiki's user rights configuration' |
[production] |
19:23 |
<^d> |
jobqueue: dropped Cirrus-related jobs from wikidatawiki's queue |
[production] |
19:21 |
<demon> |
synchronized wmf-config/InitialiseSettings.php 'No more Cirrus for wikidatawiki' |
[production] |
19:19 |
<mwalker> |
no longer use cmd=_xclick for donors in China |
[production] |
19:02 |
<cmjohnson> |
B side on ps1-c7-eqiad will be unplugged to be tested |
[production] |
18:50 |
<mutante> |
db1050 back up after skipping mount of failed /a |
[production] |
18:45 |
<faidon> |
synchronized wmf-config/db-eqiad.php 'depool db1050' |
[production] |
18:45 |
<faidon> |
updated /a/common to {{Gerrit|Ic66d0e783}}: Depool db1050, down, disk failed |
[production] |