| 2018-01-10 |
    
  | 09:15 | <marostegui> | Deploy schema change on db1051 - T174569 | [production] | 
            
  | 09:12 | <moritzm> | rebooting radium (tor relay) for kernel security update | [production] | 
            
  | 08:42 | <marostegui> | Stop replication in sync on db1089 and db1067 - T162807 | [production] | 
            
  | 08:41 | <marostegui@tin> | Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 01m 05s) | [production] | 
            
  | 08:38 | <marostegui> | Deploy schema change on s5 dbstore1001 - T174569 | [production] | 
            
  | 08:33 | <moritzm> | rebooting mw1299-mw1306 (job runners) for kernel security update (along with update to HHVM 3.18.6) | [production] | 
            
  | 08:28 | <hashar> | contint1001: upgraded Zuul 2.5.0-8-gcbc7f62-wmf4jessie1 .. 2.5.0-8-gcbc7f62-wmf6 | T158243 | [production] | 
            
  | 08:13 | <marostegui> | Deploy schema change on s5 dbstore1002 - T174569 | [production] | 
            
  | 07:44 | <moritzm> | rebooting mw1262-mw1275 for kernel security update (along with update to HHVM 3.18.6) | [production] | 
            
  | 07:37 | <marostegui> | Drop external_user from wikidatawiki - T184247 | [production] | 
            
  | 06:17 | <marostegui> | Deploy schema change on s5 codfw master (db2052) with replication (this will generate lag on codfw) - T174569 | [production] | 
            
  | 02:24 | <l10nupdate@tin> | scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 02s) | [production] | 
            
  | 01:39 | <mutante> | mw1226 - high load - hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1739PST.log ; restart-hhvm | [production] | 
            
  | 00:43 | <mutante> | rebooting gerrit server for kernel upgrade | [production] | 
            
  | 00:18 | <mutante> | rebooting phabricator server for kernel upgrade | [production] | 
            
  
    | 2018-01-09 |
    
  | 22:52 | <godog> | ms-be1033 truncate unrotated and big server.log | [production] | 
            
  | 22:22 | <aaron@tin> | Synchronized php-1.31.0-wmf.16/includes/Setup.php: 68b4bbfbc12c626 (duration: 01m 15s) | [production] | 
            
  | 22:20 | <mutante> | netmon2001 - arming keyholder for rancid | [production] | 
            
  | 21:10 | <mepps> | updated SmashPig from 45aa62650c to 778e8f87b4 | [production] | 
            
  | 20:57 | <twentyafterfour@tin> | Finished scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2) (duration: 36m 34s) | [production] | 
            
  | 20:21 | <twentyafterfour@tin> | Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2) | [production] | 
            
  | 20:14 | <twentyafterfour@tin> | scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="test2wiki" --outdir="/tmp/scap_l10n_3984299293" --threads=10 --lang en  --quiet' returned non-zero exit status 1 (duration: 02m 44s) | [production] | 
            
  | 20:13 | <mutante> | netmon2001 - rebooting | [production] | 
            
  | 20:12 | <twentyafterfour@tin> | Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 | [production] | 
            
  | 20:04 | <mutante> | gerrit2001 - rebooting | [production] | 
            
  | 20:00 | <mutante> | phab2001 - reboot for upgrade | [production] | 
            
  | 19:20 | <mepps> | rolled back SmashPig from 0c45b1a684 to 45aa62650c | [production] | 
            
  | 19:07 | <mepps> | updated SmashPig from 45aa62650c to 0c45b1a684 | [production] | 
            
  | 18:42 | <mutante> | ms-fe3002,ms-fe3001 - powering down, removing from puppet and icinga,  ms-be* removing from puppet/icinga (T169518) | [production] | 
            
  | 18:38 | <mutante> | ms-fe3001 - shutting down for decom, removed from puppet | [production] | 
            
  | 18:38 | <mutante> | mw1227 still not showing recovery, using restart-hhvm | [production] | 
            
  | 18:29 | <mutante> | mw1227 killed it one more time and also restarted apache.. now load going down | [production] | 
            
  | 18:26 | <mutante> | mw1227 hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1024PST.log ; then killed hhvm and restarted it with systemctl | [production] | 
            
  | 17:56 | <twentyafterfour> | MediaWiki Train: Branching 1.31.0-wmf.16 | [production] | 
            
  | 17:41 | <moritzm> | rebooting image scalers in codfw for kernel security update (along with HHVM update) | [production] | 
            
  | 17:30 | <volans> | re-enabled Icinga event handlers on RAID checks for lvs3001 | [production] | 
            
  | 17:17 | <ema> | failover traffic back to lvs3001, raid rebuilt | [production] | 
            
  | 17:15 | <godog> | depool restbase cassandra 2 nodes - T184100 | [production] | 
            
  | 16:35 | <cmjohnson1> | disabling puppet for decom on mw1180-1200 | [production] | 
            
  | 16:28 | <volans> | disabled Icinga event handlers on RAID checks for lvs3001, WIP on the host | [production] | 
            
  | 16:18 | <gehel> | starting cluster reboot for elasticsearch / cirrus codfw | [production] | 
            
  | 16:09 | <bd808> | data-services: added s8.{analytics,web}.db.svc.eqiad.wmflabs and aliases (T181643, T184179) | [production] | 
            
  | 16:09 | <elukey> | re-started mysql on dbstore1002 (and slave replication) after hw maintenance | [production] | 
            
  | 15:44 | <godog> | roll-restart swift frontends in codfw and eqiad | [production] | 
            
  | 15:40 | <akosiaris@tin> | Finished deploy [servermon/servermon@10e165e]: Testing scap check (duration: 00m 02s) | [production] | 
            
  | 15:40 | <akosiaris@tin> | Started deploy [servermon/servermon@10e165e]: Testing scap check | [production] | 
            
  | 15:31 | <gehel> | reboot maps-test* for kernel upgrade | [production] | 
            
  | 15:30 | <elukey> | stop mysql on dbstore1002 as prep step for shutdown (stop all slaves, mysql stop) | [production] | 
            
  | 15:23 | <herron> | puppet master reboots complete.  re-enabling puppet agents | [production] | 
            
  | 15:18 | <ema> | lvs3001 disk swap: failover traffic to lvs3003 T166965 | [production] |