2018-10-24
21:26 <banyek> pausing replication on dbstore2002 (T204930) [production]
19:38 <twentyafterfour> The train is now blocked by database lock contention of unknown origin [production]
19:31 <twentyafterfour> The errors were all coming from wmf.26, but the error rate skyrocketed after deploying 1.33.0-wmf.1 to group1, so some query in the new branch must be holding a lock. T207881 [production]
19:19 <twentyafterfour@deploy1001> rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.1 refs T206655 [production]
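The wikiversions step above moves group1 wikis onto the new MediaWiki branch by regenerating and syncing the wikiversions map that the app servers consult per request. As an illustration only (the file layout and the wiki names below are assumptions, not the actual production contents), the compiled map amounts to a dbname-to-branch lookup:

    <?php
    // Hypothetical excerpt of a compiled wikiversions map; dbnames and
    // branch assignments are placeholders for illustration.
    return [
        'mediawikiwiki' => 'php-1.33.0-wmf.1',  // a group1 wiki, now on the new branch
        'enwiki'        => 'php-1.32.0-wmf.26', // a group2 wiki, still on the old branch
    ];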
18:16 <XioNoX> enable BGP sessions to transit/peering on cr2-eqord - T204170 [production]
17:20 <gehel> repooling all elasticsearch servers in eqiad [production]
17:12 <cmjohnson1> rebooting cloudvirt1019 [production]
17:04 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s) [production]
17:03 <jforrester@deploy1001> scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) [production]
16:36 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s) [production]
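The two InitialiseSettings-labs.php syncs above (16:36 and, after the canary check failure at 17:03, 17:04) re-disable the WikibaseMediaInfo (WBMI) feature flag for Beta Commons only. A minimal sketch of the kind of per-wiki override involved, assuming a flag named wmgUseWikibaseMediaInfo; the setting name and wiki key are illustrative, not taken from the actual change:

    // InitialiseSettings-labs.php overrides apply to the Beta Cluster only.
    // Setting name and wiki key below are assumptions for illustration.
    'wmgUseWikibaseMediaInfo' => [
        'default' => false,
        'commonswiki' => false, // Beta Commons: keep WBMI off for now (T180981)
    ],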
16:31 <addshore@deploy1001> Synchronized wmf-config/Wikibase.php: [[gerrit:469444]] Wikibase.php, don't load wikidata repo settings on other repos (take 2) (duration: 00m 54s) [production]
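The Wikibase.php change above keeps Wikidata-repo-specific settings from being applied on wikis hosting other repos. A hedged sketch of the general pattern, assuming the guard is a simple check on the wiki's database name; the wiki names and the settings key below are illustrative, not taken from gerrit:469444:

    // Illustrative guard only: apply repo settings solely on the Wikidata
    // repo wikis. The $wgDBname values and the settings key are assumptions.
    if ( in_array( $wgDBname, [ 'wikidatawiki', 'testwikidatawiki' ], true ) ) {
        $wgWBRepoSettings['exampleRepoOnlySetting'] = true; // hypothetical key
    }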
16:04 <XioNoX> power-off cr1-eqord - T204170 [production]
16:00 <twentyafterfour> 15:59:06 Synchronized php-1.33.0-wmf.1/extensions/EventBus/: revert "Set event datetime with microsecond resolution." on 1.33.0-wmf.1 refs T207817 (duration: 00m 56s) [production]
15:59 <XioNoX> disable BGP sessions to transit/peering on cr1-eqord - T204170 [production]
15:54 <twentyafterfour> deploying https://gerrit.wikimedia.org/r/469451 [production]
14:23 <herron> scheduled icinga downtime and disabled puppet on logstash hosts; deploying role::kafka::logging to the logstash elasticsearch data hosts [production]
13:35 <XioNoX> pre-configure switch ports for labvirt1007/8/9/12:eth1 in cloud-virt-instance-trunk range on asw2-b-eqiad [production]
13:17 <ema> begin cache hosts rolling reboots for kernel/microcode updates T203011 [production]
12:24 <ema> cp-ats: upgrade trafficserver to 8.0.0-1wm1 T204232 [production]
12:12 <ema> cp1072: upgrade trafficserver to 8.0.0-1wm1 T204232 [production]
11:22 <ema> cp1071: upgrade trafficserver to 8.0.0-1wm1 T204232 [production]
10:56 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Restore db1092 and db1104 original weight (duration: 00m 52s) [production]
10:35 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 and starting to restore db1104 original weight (duration: 00m 54s) [production]
10:28 <marostegui> Compare revision table on dewiki, cebwiki, shwiki, srwiki, mgwiktionary and enwikivoyage on db1100 and db2075 - T184805 [production]
09:54 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 (duration: 00m 54s) [production]
09:39 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 (duration: 00m 54s) [production]
09:20 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Slowly repool db1092 and db1087 (duration: 01m 05s) [production]
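The db-eqiad.php syncs above (09:20 through 10:56) gradually repool db1092 and db1087 and restore db1104's weight after maintenance by stepping their read-load weights back up across several deploys. A minimal sketch of what such a weight hunk looks like, assuming the usual per-section load array; the section name, hosts and numbers are placeholders, not the real production values:

    // Hypothetical excerpt: per-section read-load weights in db-eqiad.php.
    // Section name, host list and weights are illustrative assumptions only.
    'sectionLoads' => [
        's1' => [
            'db1092' => 50,  // raised step by step, e.g. 1 -> 50 -> original weight
            'db1087' => 50,
        ],
    ],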
08:55 <marostegui> Stop MySQL for upgrade and reboot on db1087 [production]
08:47 <marostegui> Upgrade MySQL on db1092 and reboot [production]
08:03 <godog> fix aggregation to 'sum' for MediaWiki.RevisionSlider - T205416 [production]
07:33 <gehel> powercycling wdqs1010 - T207817 [production]
07:19 <_joe_> powercycling wdqs1009 [production]
07:04 <elukey> powercycle wdqs1008 [production]
06:59 <elukey> powercycle wdqs1007 [production]
06:55 <elukey> powercycle wdqs1006 (depool first) [production]
06:46 <elukey> powercycle wdqs1005 [production]
06:42 <SMalyshev> repooled wdqs1003 [production]
06:35 <_joe_> powercycling wdqs[2001-2002,2004-2006].codfw.wmnet, one at a time [production]
06:33 <elukey> powercycle wdqs1004 [production]
05:24 <kartik@deploy1001> Finished deploy [cxserver/deploy@80dc518]: Update cxserver to 9ad60d9 (T207445) (duration: 04m 06s) [production]
05:20 <kartik@deploy1001> Started deploy [cxserver/deploy@80dc518]: Update cxserver to 9ad60d9 (T207445) [production]
02:34 <mutante> powercycled wdqs1009 - by request [production]
02:24 <onimisionipe@deploy1001> Finished deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue (duration: 00m 03s) [production]
02:24 <onimisionipe@deploy1001> Started deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue [production]
02:12 <onimisionipe@deploy1001> Finished deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue (duration: 00m 23s) [production]
02:12 <onimisionipe@deploy1001> Started deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue [production]
01:56 <tstarling@deploy1001> Synchronized php-1.33.0-wmf.1/includes/page/WikiPage.php: T207530 (duration: 00m 53s) [production]
01:46 <tstarling@deploy1001> Synchronized php-1.32.0-wmf.26/includes/page/WikiPage.php: fix deletion performance regression T207530 (duration: 00m 55s) [production]
01:37 <bawolff> deployed T207750 [production]
01:24 <mutante> wdqs2005 - powercycled; wasn't reachable via SSH and couldn't log in on mgmt either, mgmt console full of java exceptions from wdqs-updater [production]