2015-02-07
|
15:36 |
<apergos> |
started nginx on dataset1001, it was not running for some reason |
[production] |
09:40 |
<bblack> |
depooled cp1070 in pybal |
[production] |
09:33 |
<bblack> |
rebooting cp1070 (dead network, dead console) |
[production] |
05:10 |
<subbu> |
deployed parsoid hotfix 8ca7ef40 (cherry-pick of 447a0565) |
[production] |
04:48 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Sat Feb 7 04:47:30 UTC 2015 (duration 47m 29s) |
[production] |
03:13 |
<gwicke> |
restarting parsoid cluster |
[production] |
02:34 |
<LocalisationUpdate> |
completed (1.25wmf16) at 2015-02-07 02:33:09+00:00 |
[production] |
02:33 |
<l10nupdate> |
Synchronized php-1.25wmf16/cache/l10n: (no message) (duration: 00m 01s) |
[production] |
02:20 |
<LocalisationUpdate> |
completed (1.25wmf15) at 2015-02-07 02:19:09+00:00 |
[production] |
02:19 |
<l10nupdate> |
Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 02s) |
[production] |
02:11 |
<qchris> |
Ran kafka leader re-election as analytics1021 dropped out of its partition leader role. |
[production] |
01:48 |
<bblack> |
leaving cp1064 (jessie upload eqiad) pooled front+back. it's experimental but looks stable. if upload-related 503 spikes and I'm not around, feel free to depool it. |
[production] |
00:18 |
<qchris> |
Manually bumped heap for the Hadoop namenodes and revived them after both ran out of heap and did not come back. |
[production] |
2015-02-06
|
22:53 |
<marktraceur> |
Synchronized wmf-config/: [friday] beta config change for tgr (duration: 00m 09s) |
[production] |
22:53 |
<subbu> |
restarted parsoid service to kill several stuck processes on multiple nodes |
[production] |
20:05 |
<robh> |
ms1004 coming offline, shouldn't page (but disregard if it does) |
[production] |
19:19 |
<subbu> |
deployed parsoid hotfix a9dbd4fc (cherry-pick of 76d6658c) |
[production] |
16:54 |
<reedy> |
Synchronized wmf-config/InitialiseSettings.php: Adding cdm16062.contentdm.oclc.org to wgCopyUploadsDomains (duration: 00m 05s) |
[production] |
16:40 |
<godog> |
cancel downtime on graphite1001, enable downtime on tungsten pending full decommission |
[production] |
16:05 |
<godog> |
bounce ocg on ocg1001 and stop additional ocg instance running |
[production] |
15:50 |
<bblack> |
depool -> repool cp1064 varnish-frontend, reduced cache size to 16G, re-enabled compact_memory |
[production] |
15:50 |
<godog> |
restart ocg on ocg1003 to pick up statsd dns changes |
[production] |
15:48 |
<godog> |
restart ocg on ocg1002 to pick up statsd dns changes |
[production] |
14:33 |
<bblack> |
starting up a fresh round of SSL testing on eqiad upload pooling (cp1064) |
[production] |
14:12 |
<godog> |
bounce diamond on lvs2004/lvs2005 |
[production] |
13:55 |
<cmjohnson1> |
upgrading boron to trusty |
[production] |
10:51 |
<godog> |
reimage ms-be2014 |
[production] |
07:50 |
<_joe_> |
restarting the parsoid cluster, one node at a time, some processes are stuck. |
[production] |
02:04 |
<LocalisationUpdate> |
failed: git pull of extensions failed |
[production] |
01:01 |
<ori> |
restarting xenon on fluorine |
[production] |
00:14 |
<krenair> |
Synchronized php-1.25wmf15/includes/CategoryViewer.php: https://gerrit.wikimedia.org/r/#/c/188944/1 (duration: 00m 06s) |
[production] |
00:13 |
<krenair> |
Synchronized php-1.25wmf16/includes/CategoryViewer.php: https://gerrit.wikimedia.org/r/#/c/188945/1 (duration: 00m 06s) |
[production] |
2015-02-05
|
23:53 |
<bd808> |
Updated Wikimania Scholarships to 0852585 (re-enable language selection) + local hack in trebuchet repo to remove incomplete translations |
[production] |
22:53 |
<reedy> |
Synchronized wmf-config/InitialiseSettings.php: Enable Parsoid on wikitech (duration: 00m 05s) |
[production] |
21:08 |
<reedy> |
Purged l10n cache for 1.25wmf13 |
[production] |
21:03 |
<reedy> |
Synchronized php-1.25wmf16/extensions/CheckUser/: (no message) (duration: 00m 07s) |
[production] |
21:00 |
<reedy> |
Synchronized php-1.25wmf16: (no message) (duration: 00m 52s) |
[production] |
20:59 |
<bblack> |
cp1064 upload backend re-enabled in cache.pp; if upload-related 503s ensue later today and I'm not around, feel free to re-disable it |
[production] |
20:57 |
<mutante> |
radon - reinstalling, scheduled downtime |
[production] |