2015-09-24
|
18:21 |
<twentyafterfour@tin> |
rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedia wikis to 1.26wmf24 |
[production] |
18:20 |
<legoktm> |
moved oauthadmin group from User:Yuvipanda@metawiki to User:YuviPanda@metawiki |
[production] |
18:19 |
<godog> |
restart restbase on restbase1005 |
[production] |
18:18 |
<godog> |
restart restbase on restbase1004 |
[production] |
18:16 |
<godog> |
restart restbase on restbase1003 |
[production] |
18:06 |
<paravoid> |
depooling cp1046, stability issues |
[production] |
18:00 |
<demon@tin> |
Synchronized multiversion/MWRealm.php: (no message) (duration: 00m 17s) |
[production] |
17:59 |
<ori> |
Merged Apache config change Ia095457fb. It will refresh the Apache service as it rolls out, causing elevated 503s for the next 20 minutes. |
[production] |
17:53 |
<godog> |
rolling restart restbase in eqiad |
[production] |
17:35 |
<chasemp> |
powercycling cp1046 at mgmt as I can't ssh in and it seems like it should be up |
[production] |
17:26 |
<godog> |
bounce restbase on restbase1002, apply new datacenter config |
[production] |
17:10 |
<_joe_> |
cleaning up /tmp on mw1152 |
[production] |
17:09 |
<cmjohnson1> |
powering down for the last time es1001 - es1010 |
[production] |
16:17 |
<thcipriani@tin> |
Synchronized php-1.26wmf23/extensions/Wikidata: SWAT: Do not filter affected pages by namespace [[gerrit:240727]] (duration: 00m 26s) |
[production] |
16:01 |
<robh> |
nothing on puppet swat window, easiest swat ever. |
[production] |
15:46 |
<thcipriani@tin> |
Synchronized php-1.26wmf24/extensions/Wikidata: SWAT: Do not filter affected pages by namespace [[gerrit:240711]] (duration: 00m 26s) |
[production] |
15:23 |
<thcipriani@tin> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable suggestions in ca, en, es, fr, it, ja, tr, ru, zh [[gerrit:240638]] (duration: 00m 17s) |
[production] |
14:37 |
<paravoid> |
repooling codfw |
[production] |
12:54 |
<bblack> |
restarting varnish daemons on second half of maps, parsoid, misc clusters (package upgrade, shm_reclen change) |
[production] |
12:50 |
<bblack> |
restarting varnishd instances on text, mobile, upload clusters for package upgrade (slow salt, no parallelism, ~5m spacing - FE cache loss, BE cache stays, should take ~9h) |
[production] |
12:05 |
<moritzm> |
installed rpcbind security updates on eeden, baham, radon, maerlant, rhenium |
[production] |
11:56 |
<bblack> |
restarting varnish daemons on half of maps, parsoid, misc clusters (package upgrade, shm_reclen change) |
[production] |
11:36 |
<bblack> |
reinstall lvs300[12] to jessie - T96375 |
[production] |
11:21 |
<akosiaris> |
killed tail -f varnishncsa.log on cp1065 and ran apt-get clean to reclaim some disk space |
[production] |
11:14 |
<bblack> |
stopping pybal on lvs300[12]; lvs300[34] taking over |
[production] |
11:07 |
<bblack> |
upgrading varnishes to 3.0.6plus-wm8 (non-restarting, just pkg update on-disk) |
[production] |
09:40 |
<jynus> |
performing latest (software) steps to decom es1001-es1010 (puppet disabling, etc.) |
[production] |
08:39 |
<jynus> |
restarted HHVM @ mw1056, 1104, 1122 |
[production] |
05:33 |
<yuvipanda> |
deleted logstash indexes for 08/27 and 28 too |
[production] |
05:31 |
<yuvipanda> |
deleted indexes for 08/14, 15, 25, 26 on logstash |
[production] |
03:59 |
<yuvipanda> |
restarting elasticsearch in logstash1001-3 |
[production] |
03:53 |
<yuvipanda> |
restarted es on logstash1004-6 |
[production] |
03:02 |
<yuvipanda> |
jstack dumped logstash output onto /home/yuvipanda/stack on logstash1001 since strace seems useless |
[production] |
02:51 |
<yuvipanda> |
restarted logstash on logstash1002 |
[production] |
02:41 |
<yuvipanda> |
gmond at 100% again, killing it and stopping puppet again |
[production] |
02:40 |
<yuvipanda> |
re-enabling and running puppet on hafnium to see what it's bringing up |
[production] |
02:38 |
<l10nupdate@tin> |
Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 30s) |
[production] |
02:23 |
<yuvipanda> |
kill gmond on hafnium and disable puppet to prevent it from bringing gmond back up. gmond was taking 100% CPU |
[production] |
02:16 |
<Krinkle> |
Kibana/Logstash outage. Zero events received after 2015-09-23T23:59:59.999Z. |
[production] |
02:14 |
<Krinkle> |
Partial EventLogging outage (client-side events via hafnium abruptly stopped 2015-09-23 11:36 UTC - 15 hours ago) |
[production] |
01:53 |
<mutante> |
started logstash on logstash1002 again |
[production] |
01:35 |
<mutante> |
bast1001: unmounting /srv/home_pmtpa (backup on bacula) |
[production] |
01:34 |
<mutante> |
removing subversion packages from bast1001 |
[production] |
01:15 |
<ori@tin> |
Synchronized php-1.26wmf24/includes: Ifa0d4cfe8e3: Backport I1ff61153d and I8e4c3d5a5 (duration: 00m 23s) |
[production] |
00:19 |
<jynus> |
restarted replication on db1051 |
[production] |
00:17 |
<ori> |
restarting tcpircbot on neon |
[production] |
00:16 |
<mutante> |
started logstash on logstash1002 |
[production] |
00:08 |
<bblack> |
varnish package on carbon for jessie updated to 3.0.6plus-wm8 |
[production] |