2019-04-15
13:42 <godog> reboot ms-be1013 [production]
13:09 <moritzm> installing wget security updates on trusty hosts [production]
12:59 <moritzm> restarting archiva on archiva1001 for OpenJDK security update [production]
12:50 <moritzm> restarting Apache on matomo1001 to pick up OpenSSL update [production]
12:14 <moritzm> rolling restart of HHVM/Apache on deployment servers to pick up OpenSSL update [production]
11:59 <fsero> pointing boron docker builds to the new registry temporarily (docker builds on boron might fail) [production]
11:35 <Amir1> EU swat is done [production]
11:26 <moritzm> rolling restart of HHVM/Apache on labweb* to pick up OpenSSL update [production]
09:58 <moritzm> installing openssl1.0 security updates [production]
09:18 <gehel> unbanning elastic1029 from cluster [production]
08:58 <moritzm> updating mediawiki servers in eqiad to version 1.8.1 of the PHP extension for wikidiff [production]
08:29 <onimisionipe> increase wal_keep_segments on codfw maps master [production]
08:19 <moritzm> updating mediawiki servers in codfw to version 1.8.1 of the PHP extension for wikidiff [production]
07:50 <Amir1> ladsgroup@mwmaint1002:~$ mwscript maintenance/initSiteStats.php --wiki=hywwiki --active (T220936) [production]
05:31 <marostegui> Upgrade db1100 [production]
05:07 <marostegui> powercycle mw1280 (crashed) [production]
2019-04-14
06:10 <ebernhardson> unban elastic1027 from eqiad-psi [production]
05:36 <ebernhardson> unbanning elastic1027 after about half the shards left and load dropped [production]
05:31 <ebernhardson> ban elastic1027 from elasticsearch-psi in eqiad [production]
04:59 <ebernhardson> restart elasticsearch_6@production-search-psi-eqiad on elastic1027 due to 100% cpu for last 30+ minutes [production]
2019-04-13
18:46 <godog> 3h downtime for cloudvirt1015 [production]
15:58 <ebernhardson> restart elasticsearch on elastic1027 [production]
15:34 <shdubsh> restart recommendation_api on scb1001 [production]
15:33 <shdubsh> restart recommendation_api on scb2001 [production]
10:46 <onimisionipe> depooling maps2001 for postgres init [production]
08:05 <gehel> repooling wdqs1008 - data transfer completed - T220830 [production]
00:32 <krinkle@deploy1001> Synchronized php-1.33.0-wmf.25/includes/: Idc19cc29764a / T220854 - hot fix (duration: 05m 37s) [production]
2019-04-12
21:16 <Krinkle> scap was unable to sync to 1 apache (connect to host cloudweb2001-dev.wikimedia.org port 22: Connection timed out) [production]
21:10 <krinkle@deploy1001> Synchronized php-1.33.0-wmf.25/extensions/ImageMap/includes/ImageMap.php: I0ee84f059da / T217087 (duration: 05m 12s) [production]
19:27 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) [production]
19:27 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission [production]
19:24 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) [production]
19:24 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission [production]
18:59 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
18:59 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission [production]
17:17 <onimisionipe> depooling maps2002 for postgres init [production]
17:16 <onimisionipe> repooling maps2001 - postgres init is complete [production]
16:14 <elukey> install ifstat on all the mc1* hosts for network bandwidth investigation [production]
15:56 <gehel> starting data transfer from wdqs1008 to wdqs1009 - T220830 [production]
15:32 <thcipriani> gerrit back [production]
15:29 <thcipriani> gerrit restart incoming [production]
14:29 <onimisionipe> depool maps2001 for postgres initialization [production]
13:24 <akosiaris> re-enable puppet across the fleet. Patch merged, recovery storm coming [production]
13:18 <akosiaris> disable puppet across the fleet to avoid incoming puppet alert storm [production]
12:57 <marostegui> Purge old rows and optimize tables on spare host pc1010 T210725 [production]
12:53 <urandom> decommissioning cassandra-c, restbase2008 -- T208087 [production]
12:49 <gehel> rolling restart of cassandra on maps* for jvm upgrade [production]
12:22 <arturo> T220095 disable icinga checks for labtestcontrol2003 [production]
12:16 <gilles@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T220807 Reduce cawiki survey sampling rate (duration: 05m 11s) [production]
11:56 <moritzm> upgrading app server canaries to version 1.8.1 of the PHP wikidiff extension (HHVM already deployed) T203069 [production]