| 2015-11-30 |
  | 21:55 | <mutante> | re-wrote l10nupdate cron; restarted cron service on tin | [production] | 
            
  | 20:05 | <apergos> | re-enabled puppet on neodymium, minion testing concluded for now | [production] | 
            
  | 19:47 | <gwicke> | running `nodetool decommission` on restbase1009 in preparation for the conversion to the multi-instance setup, per https://phabricator.wikimedia.org/T95253# | [production] | 
            
  | 19:31 | <demon@tin> | Synchronized wmf-config/InitialiseSettings.php: rm deprecated/unused rate limit log config (duration: 00m 28s) | [production] | 
            
  | 17:27 | <demon@tin> | Synchronized php-1.27.0-wmf.7/extensions/WikimediaMaintenance/: need maint script errywhere (duration: 00m 28s) | [production] | 
            
  | 16:51 | <thcipriani@tin> | Synchronized php-1.27.0-wmf.7/extensions/ContentTranslation/modules/draft/ext.cx.draft.js: SWAT: Add some extra information to save failure logging [[gerrit:255956]] (duration: 00m 28s) | [production] | 
            
  | 16:38 | <thcipriani@tin> | Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable QuickSurveys reader segmentation survey [[gerrit:255448]] (duration: 00m 28s) | [production] | 
            
  | 16:30 | <paravoid> | mw1002 service hhvm restart | [production] | 
            
  | 16:17 | <paravoid> | rolling back to kernel 3.19 on lvs2001/2/3 | [production] | 
            
  | 15:29 | <paravoid> | stopping pybal on lvs2001/2/3 | [production] | 
            
  | 15:21 | <paravoid> | switching lvs2004/5/6 traffic back to lvs2001/2/3 | [production] | 
            
  | 15:13 | <paravoid> | switching lvs2001/2/3 traffic to lvs2004/5/6 and upgrading kernels | [production] | 
            
  | 15:12 | <_joe_> | restarting HHVM on mw1147 too, same reason as mw1114 | [production] | 
            
  | 15:10 | <_joe_> | restarting hhvm on mw1114, stuck in __pthread_cond_wait () [folly::EventBase::runInEventBaseThreadAndWait ()], apparently blocked in writing to stdout | [production] | 
            
  | 15:02 | <paravoid> | switching traffic from lvs4002 to lvs4004; upgrading lvs4002's kernel | [production] | 
            
  | 15:02 | <paravoid> | switching traffic back to lvs4001 | [production] | 
            
  | 14:57 | <paravoid> | switching traffic from lvs4001 to lvs4003; upgrading lvs4001's kernel | [production] | 
            
  | 14:45 | <paravoid> | switching traffic from lvs3001 to lvs3003; upgrading lvs3001's kernel | [production] | 
            
  | 14:38 | <paravoid> | switching traffic back to lvs3002 | [production] | 
            
  | 14:31 | <paravoid> | switching traffic from lvs3002 to lvs3004; upgrading lvs3002's kernel | [production] | 
            
  | 14:07 | <bblack> | upgrading varnishkafka package on all caches | [production] | 
            
  | 13:52 | <bblack> | updating varnishkafka on cp1065 | [production] | 
            
  | 11:03 | <godog> | upgrade python-statsd to 3.0.1 in eqiad | [production] | 
            
  | 10:59 | <godog> | upgrade python-statsd to 3.0.1 in codfw | [production] | 
            
  | 10:15 | <godog> | reenable puppet on graphite1001 | [production] | 
            
  | 10:10 | <paravoid> | re-enabling OSPF over cr2-eqiad:xe-5/2/2 <-> cr1-ulsfo:xe-0/0/3.538 | [production] | 
            
  | 10:09 | <paravoid> | re-enabling cr2-eqiad:xe-5/2/0 and xe-5/2/1 | [production] | 
            
  | 10:01 | <jynus> | performing schema change on db1046 (analytics master) | [production] | 
            
  | 09:32 | <jynus> | removing old snapshots from db1046 | [production] | 
            
  | 06:38 | <ori> | Restarted statsv on hafnium | [production] | 
            
  | 02:00 | <l10nupdate@tin> | LocalisationUpdate failed: git pull of core failed | [production] | 
            
  | 01:56 | <gwicke> | started `nodetool cleanup` on restbase1002 to get rid of unnecessary data from earlier 1001 decommission attempt | [production] | 
            
  | 01:05 | <bd808@tin> | sync-l10n completed (1.27.0-wmf.7) (duration: 01m 19s) | [production] | 
            
  | 01:04 | <bd808> | testing l10n cache rebuild as l10nupdate user (take 2) | [production] | 
            
  | 00:57 | <Krenair> | test | [production] | 
            
  | 00:49 | <bd808@tin> | sync-l10nupdate completed (1.27.0-wmf.7) (duration: 04m 37s) | [production] | 
            
  | 00:45 | <bd808> | testing l10n cache rebuild as l10nupdate user | [production] | 
            
  | 00:01 | <bd808> | Tried to update scap to 1879fd4 (Add sync-l10n command for l10nupdate); trebuchet reported 0/483 minions completing fetch and 3/483 minions completing checkout | [production] | 
            
  
| 2015-11-29 |
  | 21:25 | <jynus> | importing user.user_touched (s7) from dbstore1002 to sanitarium. s7 lag on labs replicas will be higher for some minutes. | [production] | 
            
  | 20:51 | <jynus> | importing user.user_touched (s6) from dbstore1002 to sanitarium. s6 lag on labs replicas will be higher for some minutes. | [production] | 
            
  | 20:28 | <jynus> | importing user.user_touched (s5) from dbstore1002 to sanitarium. s5 lag on labs replicas will be higher for some minutes. | [production] | 
            
  | 19:51 | <jynus> | importing user.user_touched (s4) from dbstore1002 to sanitarium. s4 lab will be affected for some minutes. | [production] | 
            
  | 04:50 | <gwicke> | restarted cassandra on restbase1009 to avoid it running out of disk space; had large compaction (~2TB) at 80% and only 64G disk space left | [production] | 
            
  | 03:01 | <YuviPanda> | run chown -R l10nupdate: /var/lib/l10nupdate/mediawiki for Reedy on tin | [production] | 
            
  | 02:28 | <Reedy> | l10nupdate failed because some git objects owned by 997:l10nupdate | [production] | 
            
  | 02:00 | <l10nupdate@tin> | LocalisationUpdate failed: git pull of core failed | [production] |