2016-07-04
13:43 <akosiaris> restart smokeping on netmon1001, temporarily disabled msw1-codfw [production]
13:38 <gehel> resuming writes on Cirrus / elasticsearch; this did not speed up cluster recovery [production]
13:18 <godog> bounce redis on rcs1001 [production]
13:16 <gehel> restarting elastic1021 for kernel upgrade (T138811) [production]
13:07 <elukey> Bootstrapping Cassandra again on aqs100[456] (rack awareness + 2.2.6 - testing environment) [production]
13:02 <gehel> pausing writes on Cirrus / elasticsearch for faster cluster restart [production]
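
The three Cirrus / elasticsearch entries above (pause writes, restart elastic1021, resume writes) follow the usual rolling-restart pattern: with writes paused, a restarted node can recover its shards from local data instead of copying them across the cluster. The conventional Elasticsearch-side companion step is toggling shard allocation around the restart; the sketch below shows only that generic technique, with an illustrative host name, and is not claimed to be the exact procedure used here.

    # Hedged sketch of the generic Elasticsearch rolling-restart steps (host name illustrative).
    # Disable shard allocation so the cluster does not start rebalancing when the node leaves:
    curl -XPUT 'http://elastic1021:9200/_cluster/settings' -d '
    {"transient": {"cluster.routing.allocation.enable": "none"}}'

    # ... restart the node (e.g. for the kernel upgrade), then re-enable allocation:
    curl -XPUT 'http://elastic1021:9200/_cluster/settings' -d '
    {"transient": {"cluster.routing.allocation.enable": "all"}}'

    # Watch recovery until the cluster reports green again:
    curl 'http://elastic1021:9200/_cat/health?v'
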
12:43 <hashar> Nodepool back up with 10 instances (instead of 20) to accommodate labs capacity T139285 [production]
12:39 <godog> nodetool-b stop -- COMPACTION on restbase1014 [production]
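
For context, nodetool-b is presumably the per-instance wrapper used on multi-instance Cassandra hosts such as restbase1014 (instance "b"); that detail is inferred from the naming, not stated in the log. The underlying Cassandra commands are standard:

    # Generic Cassandra equivalents (plain nodetool, single-instance host assumed):
    nodetool stop -- COMPACTION      # halt currently running compactions
    nodetool compactionstats         # confirm no compactions remain active
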
12:29 <moritzm> rolling reboot of rcs* cluster for kernel security update [production]
12:10 <moritzm> rolling reboot of ocg* cluster for kernel security update [production]
11:40 <jynus@tin> Synchronized wmf-config/db-eqiad.php: Failover db1053 to db1072 (duration: 00m 40s) [production]
10:56 <moritzm> rolling reboot of swift frontends in eqiad for kernel security update [production]
10:30 <yuvipanda> stop nodepool on labnodepool1001 and disable puppet to keep it down, to allow stabilizing labs first [production]
10:28 <yuvipanda> restart rabbitmq-server on labcontrol1001 [production]
10:14 <moritzm> installing chromium security update on osmium [production]
10:07 <moritzm> installing xerces-c security updates on Ubuntu systems (jessie already fixed) [production]
10:01 <_joe_> stopping jobchron and jobrunner on mw1001-10 before decommission [production]
09:50 <godog> reimage ms-be300[234] with jessie [production]
09:44 <hashar> Labs infra can't delete instances anymore (impacts CI as well) T139285 [production]
09:41 <moritzm> installing p7zip security updates [production]
09:38 <hashar> CI is out of Nodepool instances; the pool has drained because instances can no longer be deleted via the OpenStack API [production]
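
A hedged sketch of how this kind of stuck-deletion symptom is typically confirmed from the command line, assuming the standard OpenStack nova CLI of the time; the instance name is illustrative:

    # Ask Nova to delete an instance, then watch its status; when deletions are broken,
    # the instance stays listed (often in a "deleting" or ERROR state) instead of disappearing.
    nova delete ci-jessie-wikimedia-12345       # illustrative instance name
    nova list | grep ci-jessie-wikimedia-12345  # still listed => deletion is stuck
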
09:25 <elukey> Added new jobrunners into service - mw130[256].eqiad.wmnet (https://etherpad.wikimedia.org/p/jessie-install) [production]
08:16 <moritzm> rolling reboot of swift backends in eqiad for kernel security update [production]
07:49 <jynus@tin> Synchronized wmf-config/db-eqiad.php: Failover db1034 to db1062 (duration: 00m 30s) [production]
02:26 <l10nupdate@tin> ResourceLoader cache refresh completed at Mon Jul 4 02:26:54 UTC 2016 (duration 5m 42s) [production]
02:21 <mwdeploy@tin> scap sync-l10n completed (1.28.0-wmf.8) (duration: 09m 14s) [production]
2016-07-01
22:23 <krinkle@tin> Synchronized php-1.28.0-wmf.8/extensions/WikimediaEvents/extension.json: T128115 (duration: 00m 37s) [production]
22:22 <krinkle@tin> Synchronized php-1.28.0-wmf.8/extensions/WikimediaEvents/modules/: T128115 (duration: 00m 30s) [production]
21:04 <ori@tin> Synchronized wmf-config/CommonSettings.php: I7a95c0f4: Bump $wgResourceLoaderMaxQueryLength to 5,000 (duration: 00m 32s) [production]
20:08 <ori@tin> Synchronized wmf-config/CommonSettings.php: I6eb0ae67: Bump $wgResourceLoaderMaxQueryLength to 4,000 (duration: 00m 26s) [production]
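
The two bump entries above are the standard MediaWiki config deploy flow from the deployment host (tin): edit the file in the staging copy, then sync it to the cluster, which is what produces the "Synchronized wmf-config/..." log lines. A minimal sketch, assuming scap's sync-file subcommand and the staging path of that era:

    # Hedged sketch of the deploy flow behind the "Synchronized wmf-config/..." entries.
    cd /srv/mediawiki-staging
    # Edit wmf-config/CommonSettings.php so that it sets:
    #   $wgResourceLoaderMaxQueryLength = 5000;
    scap sync-file wmf-config/CommonSettings.php 'I7a95c0f4: Bump $wgResourceLoaderMaxQueryLength to 5,000'
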
19:17 <ori> restarted coal on graphite1001; stopped receiving messages from EL 0mq publisher [production]
19:16 <ori> restarted navtiming on hafnium; stopped receiving messages from EL 0mq publisher [production]
18:34 <mutante> mw1259 - powercycling [production]
18:32 <krinkle@tin> Synchronized docroot/default/: (no message) (duration: 00m 31s) [production]
18:31 <krinkle@tin> Synchronized errorpages/: (no message) (duration: 01m 06s) [production]
17:47 <ebernhardson> restart elasticsearch on elastic1017 to attempt to clear up a continuous backlog of relocating shards [production]
15:53 <godog> temporarily run 3x statsdlb instances on graphite1001 to minimise drops - T101141 [production]
14:57 <dcausse> upgraded and restarted elastic on nobelium@eqiad [production]
14:21 <godog> enable another statsdlb instance temporarily on graphite1001 to investigate drops [production]
14:15 <moritzm> rearmed keyholder on mira after reboot [production]
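
Re-arming keyholder is needed after a deployment host reboots because the shared SSH agent loses its loaded keys. A minimal sketch, assuming the keyholder wrapper and its arm/status subcommands used on WMF deployment hosts:

    # After the reboot, load the deploy keys back into the shared ssh-agent:
    sudo keyholder arm       # prompts for the key passphrase(s)
    sudo keyholder status    # verify the agent now holds the expected identities
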