2018-10-24
19:38 <twentyafterfour> The train is now blocked by database lock contention of unknown origin [production]
19:31 <twentyafterfour> the errors were all coming from wmf.26 but the error rate skyrocketed after deploying 1.33.0-wmf.1 to group1 so there is some query in the new branch which is holding a lock. T207881 [production]
19:19 <twentyafterfour@deploy1001> rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.1 refs T206655 [production]
18:16 <XioNoX> enable BGP sessions to transit/peering on cr2-eqord - T204170 [production]
17:20 <gehel> repooling all elasticsearch servers in eqiad [production]
17:12 <cmjohnson1> rebooting cloudvirt1019 [production]
17:04 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s) [production]
17:03 <jforrester@deploy1001> scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) [production]
16:36 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s) [production]
16:31 <addshore@deploy1001> Synchronized wmf-config/Wikibase.php: [[gerrit:469444]] Wikibase.php, don't load wikidata repo settings on other repos (take 2) (duration: 00m 54s) [production]
16:04 <XioNoX> power-off cr1-eqord - T204170 [production]
16:00 <twentyafterfour> 15:59:06 Synchronized php-1.33.0-wmf.1/extensions/EventBus/: revert "Set event datetime with microsecond resolution." on 1.33.0-wmf.1 refs T207817 (duration: 00m 56s) [production]
15:59 <XioNoX> disable BGP sessions to transit/peering on cr1-eqord - T204170 [production]
15:54 <twentyafterfour> deploying https://gerrit.wikimedia.org/r/469451 [production]
14:23 <herron> scheduled icinga downtime and disabling puppet on logstash hosts. deploying role::kafka::logging to logstash elasticsearch data hosts [production]
13:35 <XioNoX> pre-configure switch ports for labvirt1007/8/9/12:eth1 in cloud-virt-instance-trunk range on asw2-b-eqiad [production]
13:17 <ema> begin cache hosts rolling reboots for kernel/microcode updates T203011 [production]
12:24 <ema> cp-ats: upgrade trafficserver to 8.0.0-1wm1 T204232 [production]
12:12 <ema> cp1072: upgrade trafficserver to 8.0.0-1wm1 T204232 [production]
11:22 <ema> cp1071: upgrade trafficserver to 8.0.0-1wm1 T204232 [production]
10:56 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Restore db1092 and db1104 original weight (duration: 00m 52s) [production]
10:35 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 and starting to restore db1104 original weight (duration: 00m 54s) [production]
10:28 <marostegui> Compare revision table on dewiki cebwiki shwiki srwiki mgwiktionary enwikivoyage on db1100 and db2075 - T184805 [production]
09:54 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 (duration: 00m 54s) [production]
09:39 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 (duration: 00m 54s) [production]
09:20 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Slowly repool db1092 and db1087 (duration: 01m 05s) [production]
08:55 <marostegui> Stop MySQL for upgrade and reboot on db1087 [production]
08:47 <marostegui> Update MySQL on db1092 for upgrade and reboot [production]
08:03 <godog> fix aggregation to 'sum' for MediaWiki.RevisionSlider - T205416 [production]
07:33 <gehel> powercycling wdqs1010 - T207817 [production]
07:19 <_joe_> powercycling wdqs1009 [production]
07:04 <elukey> powercycle wdqs1008 [production]
06:59 <elukey> powercycle wdqs1007 [production]
06:55 <elukey> powercycle wdqs1006 (depool first) [production]
06:46 <elukey> powercycle wdqs1005 [production]
06:42 <SMalyshev> repooled wdqs1003 [production]
06:35 <_joe_> powercycling wdqs[2001-2002,2004-2006].codfw.wmnet, one at a time [production]
06:33 <elukey> powercycle wdqs1004 [production]
05:24 <kartik@deploy1001> Finished deploy [cxserver/deploy@80dc518]: Update cxserver to 9ad60d9 (T207445) (duration: 04m 06s) [production]
05:20 <kartik@deploy1001> Started deploy [cxserver/deploy@80dc518]: Update cxserver to 9ad60d9 (T207445) [production]
02:34 <mutante> powercycled wdqs1009 - by request [production]
02:24 <onimisionipe@deploy1001> Finished deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue (duration: 00m 03s) [production]
02:24 <onimisionipe@deploy1001> Started deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue [production]
02:12 <onimisionipe@deploy1001> Finished deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue (duration: 00m 23s) [production]
02:12 <onimisionipe@deploy1001> Started deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue [production]
01:56 <tstarling@deploy1001> Synchronized php-1.33.0-wmf.1/includes/page/WikiPage.php: T207530 (duration: 00m 53s) [production]
01:46 <tstarling@deploy1001> Synchronized php-1.32.0-wmf.26/includes/page/WikiPage.php: fix deletion performance regression T207530 (duration: 00m 55s) [production]
01:37 <bawolff> deployed T207750 [production]
01:24 <mutante> wdqs2005 - powercycled, wasn't reachable via SSH and also couldn't login on mgmt, mgmt full of Java exceptions from wdqs-updater [production]
00:28 <twentyafterfour@deploy1001> rebuilt and synchronized wikiversions files: group0 wikis to 1.33.0-wmf.1 refs T206655 [production]