| 2017-05-03
      
      § | 
    
  | 21:06 | <demon@naos> | Synchronized README: No-op, forcing co-master sync (duration: 02m 28s) | [production] | 
            
  | 20:35 | <mutante> | mw1167 - same as mw1166 (jobrunners) - there was a hhvm[12547]: Fatal error: unknown exception followed by mysql slow query, SELECT MASTER_TID_WAIT... | systemctl restart hhvm recovers it | [production] | 
            
  | 20:30 | <mutante> | mw1166 - restart hhvm service (Fatal error: request has exceeded memory limit) | [production] | 
            
  | 20:13 | <urandom> | T160759: restoring default tombstone thresholds, restbase10{3,4,6} | [production] | 
            
  | 19:57 | <mutante> | mw1287 - also restarting hhvm (with systemctl restart) | [production] | 
            
  | 19:56 | <mutante> | mw1287 - restarted crashed apache (proxy_fcgi:error) | [production] | 
            
  | 19:48 | <demon@naos> | Finished scap: Cleaning up some unused branches, no-op (duration: 15m 13s) | [production] | 
            
  | 19:33 | <demon@naos> | Started scap: Cleaning up some unused branches, no-op | [production] | 
            
  | 19:32 | <demon@naos> | Pruned MediaWiki: 1.29.0-wmf.18 (duration: 00m 19s) | [production] | 
            
  | 19:30 | <demon@naos> | Pruned MediaWiki: 1.29.0-wmf.20 [keeping static files] (duration: 00m 44s) | [production] | 
            
  | 19:27 | <ppchelko@naos> | Finished deploy [restbase/deploy@76d909f]: Blacklist a title to fix cassandra OOMs T160759 attempt #2 - checks timeout (duration: 01m 39s) | [production] | 
            
  | 19:26 | <ppchelko@naos> | Started deploy [restbase/deploy@76d909f]: Blacklist a title to fix cassandra OOMs T160759 attempt #2 - checks timeout | [production] | 
            
  | 19:25 | <ppchelko@naos> | Finished deploy [restbase/deploy@76d909f]: Blacklist a title to fix cassandra OOMs T160759 (duration: 07m 39s) | [production] | 
            
  | 19:18 | <ppchelko@naos> | Started deploy [restbase/deploy@76d909f]: Blacklist a title to fix cassandra OOMs T160759 | [production] | 
            
  | 18:48 | <papaul> | db2084 - signing puppet certs, salt-key, initial run | [production] | 
            
  | 18:48 | <urandom> | T160759: reducing tombstone threshold to 1000, restbase1014 | [production] | 
            
  | 18:46 | <urandom> | T160759: reducing tombstone threshold to 1000, restbase1016 | [production] | 
            
  | 18:39 | <urandom> | T160759: reducing tombstone threshold to 1000, restbase1013 | [production] | 
            
  | 18:35 | <urandom> | restarting restbase1016-c | [production] | 
            
  | 18:34 | <urandom> | restarting restbase1013-b | [production] | 
            
  | 18:00 | <bblack> | restart cp2005 backend (lag) | [production] | 
            
  | 17:33 | <moritzm> | uploaded openjdk-8 u131 to apt.wikimedia.org | [production] | 
            
  | 17:14 | <jynus@naos> | Synchronized wmf-config/InitialiseSettings.php: Disable cognate- it is causing an outage on x1 (duration: 01m 06s) | [production] | 
            
  | 16:30 | <jynus@naos> | Synchronized wmf-config/db-eqiad.php: Fine-tune per-server load to reduce db connection errors (duration: 01m 27s) | [production] | 
            
  | 16:17 | <mutante> | install2002 / db2084 - reverting live hack, re-enabling puppet. db2084 doesn't even talk to DHCP, all other new db servers are fine, just this one out of 22 is not. seems to be an actually broken NIC, cable was switched, switch config was checked too | [production] | 
            
  | 16:08 | <mutante> | install2002 - temp stop puppet to debug dhcp issue of db2084 | [production] | 
            
  | 15:13 | <catrope@naos> | Synchronized php-1.29.0-wmf.21/includes/logging/LogPager.php: Replace FORCE INDEX(ls_field_val) with IGNORE INDEX(ls_log_id) (https://gerrit.wikimedia.org/r/#/c/351653/ for T17441) (duration: 01m 14s) | [production] | 
            
  | 15:09 | <RoanKattouw> | Live-hacked (cherry-picked) https://gerrit.wikimedia.org/r/#/c/351653/ onto naos and synced to mwdebug1002 for testing | [production] | 
            
  | 14:54 | <gehel> | restart of elasticsearch on relforge | [production] | 
            
  | 14:43 | <END> | (PASS) - Rolling restart of parsoid in codfw and eqiad - t09_restart_parsoid (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:27 | <START> | - Rolling restart of parsoid in codfw and eqiad - t09_restart_parsoid (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:26 | <END> | (PASS) - Update Tendril tree to start from the core DB masters in eqiad - t09_tendril (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:25 | <START> | - Update Tendril tree to start from the core DB masters in eqiad - t09_tendril (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:25 | <godog> | start swiftrepl on ms-fe1005 | [production] | 
            
  | 14:24 | <END> | (PASS) - Start MediaWiki jobrunners, videoscalers and maintenance in eqiad - t09_start_maintenance (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:22 | <START> | - Start MediaWiki jobrunners, videoscalers and maintenance in eqiad - t09_start_maintenance (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:21 | <END> | (PASS) - Restore the TTL of all the MediaWiki read-write discovery records and cleanup confd stale files - t09_restore_ttl (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:21 | <START> | - Restore the TTL of all the MediaWiki read-write discovery records and cleanup confd stale files - t09_restore_ttl (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:20 | <END> | (PASS) - Set MediaWiki in read-write mode in eqiad (db-eqiad config already merged and git pulled) - t08_stop_mediawiki_readonly (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:20 | <MediaWiki> | read-only period ends at: 2017-05-03 14:20:28.286697 (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:20 | <root@naos> | Synchronized wmf-config/db-eqiad.php: Set MediaWiki in read-write mode in datacenter eqiad (duration: 00m 32s) | [production] | 
            
  | 14:19 | <START> | - Set MediaWiki in read-write mode in eqiad (db-eqiad config already merged and git pulled) - t08_stop_mediawiki_readonly (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:19 | <END> | (PASS) - Set core DB masters in read-write mode in eqiad, ensure masters in codfw are read-only - t07_coredb_masters_readwrite (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:19 | <START> | - Set core DB masters in read-write mode in eqiad, ensure masters in codfw are read-only - t07_coredb_masters_readwrite (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:19 | <END> | (PASS) - Switch the Redis masters from codfw to eqiad and invert the replication - t06_redis (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:19 | <START> | - Switch the Redis masters from codfw to eqiad and invert the replication - t06_redis (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:18 | <END> | (PASS) - Switch traffic flow to the appservers from codfw to eqiad - t05_switch_traffic (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:17 | <START> | - Switch traffic flow to the appservers from codfw to eqiad - t05_switch_traffic (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:16 | <END> | (FAIL) - Switch MediaWiki master datacenter and read-write discovery records from codfw to eqiad - t05_switch_datacenter (switchdc/oblivian@neodymium) | [production] | 
            
  | 14:16 | <root@naos> | Synchronized wmf-config/CommonSettings.php: Switch MediaWiki active datacenter to eqiad (duration: 00m 31s) | [production] |