2014-03-12
16:10 <ottomata> initiating controlled shutdown of analytics1021 kafka broker to do some load testing and also fix runtime java version [production]
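In the Kafka 0.8 line, a controlled shutdown could be requested with the bundled ShutdownBroker admin tool before stopping the process, so partition leadership migrates off the broker first; the ZooKeeper address, chroot, and broker id below are illustrative assumptions, not values from this log.

  # ask the controller to move partition leadership off the broker first
  kafka-run-class.sh kafka.admin.ShutdownBroker \
      --zookeeper zk1001:2181/kafka --broker 21 \
      --num.retries 3 --retry.interval.ms 1000
  service kafka stop   # then stop the broker process itself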
15:56 <cmjohnson1> ms-be1005 going down to fix mgmt [production]
15:06 <hoo> syncing to mw120[1-3] failed [production]
14:58 <hoo> synchronized php-1.23wmf17/extensions/CentralAuth/ 'Fix global account deletion' [production]
14:38 <springle> synchronized wmf-config/db-eqiad.php 's2 db1063 full steam' [production]
14:24 <paravoid> deploying new swift ring @ eqiad, setting weight from 100 to 2000 on all disks [production]
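A weight change like this would normally be applied per device with swift-ring-builder, then rebalanced and the resulting ring files pushed to the cluster; the builder file name and device id below are illustrative.

  # repeat set_weight for each device id in the ring, then rebalance
  swift-ring-builder object.builder set_weight d0 2000
  swift-ring-builder object.builder rebalance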
13:01 <cmjohnson1> shutting down and relocating mw1201, mw1202, mw1203 to d5-eqiad [production]
10:41 <maxsem> synchronized wmf-config/mobile.php 'https://gerrit.wikimedia.org/r/118227' [production]
10:17 <springle> started s4 dump for toolserver on db72 /a [production]
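A dump like this would most plausibly be a mysqldump run on db72 itself, streamed to the spare /a partition; the schema name, options, and output path are assumptions for illustration (s4 hosts commonswiki).

  mysqldump --single-transaction --quick commonswiki \
      | gzip > /a/toolserver/commonswiki-$(date +%F).sql.gz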
09:52 <springle> synchronized wmf-config/db-eqiad.php 's2 pool db1063, warm up' [production]
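The quoted comment is the literal sync message; pooling a slave "to warm up" means giving the freshly cloned host a small read weight in the s2 load array of wmf-config/db-eqiad.php before raising it to full weight (the later "s2 db1063 full steam" entry logged at 14:38). A sketch of the deploy half, with the config edit assumed done beforehand:

  # after adding db1063 at a low weight to the s2 section load array
  sync-file wmf-config/db-eqiad.php 's2 pool db1063, warm up'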
09:45 <hashar> restarting Jenkins again [production]
09:16 <hashar> kill -9 of Jenkins since it is unresponsive [production]
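When Jenkins stops responding even to shutdown requests, the blunt fallback is to find the wedged JVM and kill it; the pgrep pattern assumes the daemon runs from jenkins.war.

  pgrep -f jenkins.war               # locate the unresponsive JVM
  kill -9 "$(pgrep -f jenkins.war)"
  service jenkins start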
09:13 <hashar> restarting Jenkins [production]
09:12 <hashar> Jenkins broken again! Good morning. [production]
06:18 <springle> xtrabackup clone db1018 to db1063 [production]
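A clone like db1018 → db1063 is typically an innobackupex hot backup streamed over the network and prepared on arrival; the port, data paths, and netcat transport below are assumptions for illustration.

  # on the source (db1018): stream a hot backup to the new slave
  innobackupex --stream=xbstream /a/sqldata | nc db1063 9210
  # on the destination (db1063): receive, unpack, then apply the redo log
  nc -l 9210 | xbstream -x -C /a/sqldata
  innobackupex --apply-log /a/sqldata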
05:38 <springle> synchronized wmf-config/db-eqiad.php 's1 pool db1061, warm up' [production]
04:52 <springle> synchronized wmf-config/db-eqiad.php 's1 db1051 warm up' [production]
02:43 <LocalisationUpdate> ResourceLoader cache refresh completed at Wed Mar 12 02:42:59 UTC 2014 (duration 42m 58s) [production]
02:32 <springle> synchronized wmf-config/db-eqiad.php 's1 drop load during xtrabackup clone db1051 to db1061' [production]
02:18 <LocalisationUpdate> completed (1.23wmf17) at 2014-03-12 02:18:12+00:00 [production]
02:10 <LocalisationUpdate> completed (1.23wmf16) at 2014-03-12 02:10:25+00:00 [production]
2014-03-11
19:36 <mutante> re-deleting salt keys for pmtpa appservers [production]
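Deleting minion keys for decommissioned appservers is a salt-key operation on the salt master; the glob below is an assumed pattern for the pmtpa fleet.

  salt-key -L | grep pmtpa        # review which minions still hold keys
  salt-key -d 'mw*.pmtpa.wmnet'   # delete matching keys (asks to confirm)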
19:03 <mutante> shut down mw86-mw125 (sdtpa row A, A5) [production]
18:52 <mutante> shut down mw28-mw57 [production]
18:35 <bd808> purged l10n cache for 1.23wmf15 [production]
18:33 <bd808> purged l10n cache for 1.23wmf14 [production]
18:08 <bd808> rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.23wmf17 [production]
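That message is the standard output of sync-wikiversions, which rebuilds wikiversions.cdb from the edited wikiversions list and pushes both files to the cluster. A sketch, assuming the wikiversions file had already been edited to move the group1 wikis to the new branch:

  sync-wikiversions 'group1 to 1.23wmf17'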
18:03 <bd808> updated /a/common to {{Gerrit|I06d07cc3e}}: beta: disable memcached across datacenters [production]
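Updating /a/common amounts to fast-forwarding the config checkout on the deploy host to the merged Gerrit change; a minimal sketch, assuming the change had already merged to the tracked branch:

  cd /a/common
  git pull --ff-only       # pick up the merged change (I06d07cc3e)
  git log --oneline -1     # confirm HEAD is where we expect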
17:31 <mutante> shut down srv284-srv301 (sdtpa row B, B5) [production]
16:08 <cmjohnson1> attempting to fix ps1-b5 and ps1-b6 [production]
14:03 <reedy> synchronized wmf-config/ [production]
14:02 <reedy> synchronized database lists files: [production]
13:55 <reedy> synchronized wmf-config/CommonSettings.php 'I208d51b5db031d35518453e2b9de096f7f53f7a0' [production]
10:37 <MaxSem> Manually disabled old broken job queue cronjobs on hume [production]
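Disabling the crons by hand presumably meant commenting them out in the owning user's crontab on hume; the user name and job pattern below are assumptions.

  sudo crontab -u apache -l | grep -i runjobs   # find the stale entries
  sudo crontab -u apache -e                     # comment out the broken lines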
02:57 <LocalisationUpdate> ResourceLoader cache refresh completed at Tue Mar 11 02:57:11 UTC 2014 (duration 57m 10s) [production]
02:22 <LocalisationUpdate> completed (1.23wmf17) at 2014-03-11 02:22:39+00:00 [production]
02:12 <LocalisationUpdate> completed (1.23wmf16) at 2014-03-11 02:12:35+00:00 [production]
2014-03-10
23:30 <^d> kicking gerrit to pick up bugfix. [production]
23:07 <mutante> shutting down srv258-srv270 [production]
22:18 <bd808> Two instances of logstash were running on logstash1001; killed both and started the service again [production]
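A duplicated daemon like this is easiest to confirm and clear from the process table before restarting; the service name is assumed to match the package default.

  pgrep -fl logstash        # confirm two JVMs are running
  pkill -f logstash         # stop both
  service logstash start    # bring back a single clean instance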
21:55 <bd808> Restarted logstash on logstash1001; new events flowing in again now [production]
21:49 <K4-713> synchronized payments to 5d20972. [production]
21:47 <bd808> ganglia monitoring for elasticsearch on logstash cluster seems broken. Caused by the 1.0.x upgrade not having happened there yet? [production]
21:46 <bd808> Restarted elasticsearch on logstash1003; its JVM was heap thrashing at 98% heap used. [production]
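Heap pressure like this shows up in the node stats API before a node becomes fully unresponsive; the host and grep pattern below are illustrative for the pre-1.0 elasticsearch then running on the logstash cluster.

  curl -s 'http://localhost:9200/_nodes/stats?pretty' | grep heap_used
  service elasticsearch restart   # last resort once the heap stays pinned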
21:44 <RobH> arsenic reclaim per rt6522, ignore alerts [production]
21:40 <sbernardin> ms-be5 swapping failed disk [production]
21:34 <K4-713> synchronized payments cluster to 01f7af8 [production]
21:27 <bd808> No new data in logstash since 14:56Z. Bryan will investigate. [production]
20:26 <gwicke> Coren fixed up the Parsoid deploy by running "salt-run deploy.restart 'parsoid/deploy' '10%'" from the salt master as a work-around for [[bugzilla:61882]] [production]
20:18 <gwicke> deployed Parsoid 681f7b8d2 using deploy 77d17489; service restart incomplete due to [[bugzilla:61882]] [production]