| 2016-02-08 |
    
  | 13:12 | <bblack> | start up more rolling cache reboots for kernels (cpNNNN) | [production] | 
            
  | 13:09 | <elukey> | updated hhvm on mw2016.codfw.wmnet, mw2161.codfw.wmnet, mw2199.codfw.wmnet, mw1259.eqiad.wmnet, mw1260.eqiad.wmnet | [production] | 
            
  | 13:05 | <_joe_> | roll back installation of pybal, issues with udp and ipv6 | [production] | 
            
  | 12:56 | <elukey> | updated hhvm on mw1080, mw1084, mw1241 | [production] | 
            
  | 12:32 | <elukey> | restarting hhvm on mw1052, mw1075, mw1080, mw1081, mw1094, mw1095 to roll out the new version | [production] | 
            
  | 12:32 | <_joe_> | uploaded a new pybal package; installing on codfw and ulsfo backups | [production] | 
            
  | 12:05 | <_joe_> | restarted cron on tin, to catch up with the uid change for the l10nupdate user | [production] | 
            
  | 11:53 | <bblack> | rebooting cp1074, cp3047 (for kernels, also to compare bios/drac settings...) | [production] | 
            
  | 11:26 | <jynus> | stopping mysql at db2012 | [production] | 
            
  | 11:25 | <jynus> | starting mysql at db2012 | [production] | 
            
  | 11:05 | <moritzm> | rebooting db2012 for kernel update | [production] | 
            
  | 11:00 | <moritzm> | rebooting terbium for kernel update | [production] | 
            
  | 10:26 | <moritzm> | rebooting es2006,es2008 for kernel update | [production] | 
            
  | 10:25 | <moritzm> | upgrading jobrunners/imagescalers in eqiad for hhvm float timeout fix | [production] | 
            
  | 10:20 | <jynus> | changing s2 replication topology in preparation for master failover | [production] | 
            
  | 09:45 | <jynus> | starting es2004 | [production] | 
            
  | 09:29 | <moritzm> | rebooting es2005,es2007,es2009,es2010 for kernel update | [production] | 
            
  | 09:15 | <elukey> | hhvm restarted on mw1044.eqiad.wmnet due to hhvm package update | [production] | 
            
  | 09:15 | <l10nupdate@tin> | ResourceLoader cache refresh completed at Mon Feb  8 09:15:11 UTC 2016 (duration 8m 10s) | [production] | 
            
  | 09:12 | <elukey> | hhvm restarted on mw1034.eqiad.wmnet due to hhvm package update | [production] | 
            
  | 09:07 | <oblivian@tin> | sync-l10n completed (1.27.0-wmf.12) (duration: 11m 55s) | [production] | 
            
  | 08:42 | <_joe_> | trying a manual run of l10nupdate since it failed last night again | [production] | 
            
  | 08:25 | <moritzm> | rebooting es2001 to es2004 for kernel update | [production] | 
            
  
    | 2016-02-05 |
    
  | 23:54 | <chasemp> | nfs shaping is really writes :) | [production] | 
            
  | 23:54 | <chasemp> | tc to shape some nfs read traffic in tools for labs (also logged there) can be cancelled with: /sbin/tc qdisc del dev eth0 root | [production] | 
            
  | 23:51 | <YuviPanda> | dropped old nfs snapshots from labstore1001 | [production] | 
            
  | 23:30 | <maxsem@mira> | Synchronized portals: (no message) (duration: 01m 18s) | [production] | 
            
  | 23:29 | <maxsem@mira> | Synchronized portals/prod/wikipedia.org/assets: (no message) (duration: 01m 19s) | [production] | 
            
  | 22:56 | <jynus> | reimaging db1018 | [production] | 
            
  | 22:48 | <jynus> | restarting slave on m2/codfw (db2011) | [production] | 
            
  | 22:41 | <krenair@mira> | Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/268818/ (duration: 01m 22s) | [production] | 
            
  | 22:10 | <bblack> | cache rolling reboots stopped for the weekend, can pick up the other half monday | [production] | 
            
  | 20:36 | <bblack> | resuming rolling cache reboots | [production] | 
            
  | 20:07 | <mutante> | cygnus - reboot VM | [production] | 
            
  | 19:28 | <bblack> | halted rolling cache reboots, we seem to be having problems with a batch of them coming back... | [production] | 
            
  | 18:23 | <demon@mira> | Synchronized wmf-config/InitialiseSettings.php: comment stuff, gerrit 267994 (duration: 01m 19s) | [production] | 
            
  | 18:15 | <jynus> | stopping mysql@db1018 and starting to clone it for reimaging | [production] | 
            
  | 18:10 | <jynus@mira> | Synchronized wmf-config/db-eqiad.php: Depool db1018 for maintenance (duration: 02m 12s) | [production] | 
            
  | 17:31 | <cmjohnson1> | troubleshooting elastic1021 | [production] | 
            
  | 17:07 | <bblack> | rolling cpNNNN reboots are 27% complete, only two hosts so far failed to reboot on their own (but came up fine after manual racadm powercycle) | [production] | 
            
  | 16:20 | <ottomata> | reenabling kafka1012 in analytics-eqiad kafka cluster | [production] | 
            
  | 16:03 | <jynus> | reimaging db2030 to test jessie installer | [production] | 
            
  | 15:53 | <oblivian@tin> | sync-l10n completed (1.27.0-wmf.12) (duration: 00m 08s) | [production] | 
            
  | 15:47 | <urandom> | performing rolling restbase restart in staging env | [production] | 
            
  | 15:38 | <_joe_> | launched l10nupdate cronjob manually, was not running since tin's reimaging | [production] | 
            
  | 15:35 | <andrewbogott> | rebooting silver for kernel update - wikitech outage will ensue | [production] | 
            
  | 15:33 | <urandom> | re-restarting restbase on restbase1002.eqiad.wmnet,restbase1005.eqiad.wmnet,restbase1006.eqiad.wmnet,restbase1009.eqiad.wmnet (prior restarts may have happened before puppet run) | [production] | 
            
  | 15:29 | <andrewbogott> | rebooting holmium for kernel update | [production] |