| 2016-10-20
      
      § | 
    
  | 15:14 | <apergos> | rolling reboot of image scalers for codfw, eqiad: mw2086-2089, mw2148-2151, mw1293-1298 | [production] | 
            
  | 15:10 | <ottomata> | restarted statsv on hafnium | [production] | 
            
  | 14:55 | <moritzm> | bounced ntp on mw2196/mw2197 (XFAC state) | [production] | 
            
  | 14:34 | <moritzm> | rebooting rutherfordium for kernel update | [production] | 
            
  | 14:27 | <filippo@puppetmaster1001> | conftool action : set/pooled=no; selector: name=prometheus1001.eqiad.wmnet | [production] | 
            
  | 14:26 | <filippo@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=prometheus1002.eqiad.wmnet | [production] | 
            
  | 14:24 | <akosiaris> | bounce ntpd on bast4001 | [production] | 
            
  | 14:20 | <moritzm> | rebooting auth* servers | [production] | 
            
  | 14:20 | <ottomata> | starting rolling restart of analytics-eqiad kafka brokers to apply kernel update | [production] | 
            
  | 14:18 | <filippo@puppetmaster1001> | conftool action : set/pooled=no; selector: name=prometheus2001.codfw.wmnet | [production] | 
            
  | 14:18 | <filippo@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=prometheus2002.codfw.wmnet | [production] | 
            
  | 14:17 | <apergos> | rolling reboot of remaining app servers in codfw: mw2221-2245, and in eqiad: mw1261-1275 | [production] | 
            
  | 14:11 | <jmm@puppetmaster1001> | conftool action : set/pooled=inactive; selector: mw2098.codfw.wmnet | [production] | 
            
  | 14:09 | <jynus@mira> | Synchronized wmf-config/db-eqiad.php: mariadb: move db1053 from s1 to s4 (duration: 02m 06s) | [production] | 
            
  | 13:38 | <moritzm> | restarting mx1001 for kernel update | [production] | 
            
  | 13:22 | <moritzm> | restarting francium for kernel update | [production] | 
            
  | 13:15 | <godog> | rolling reboot of prometheus machines for kernel update | [production] | 
            
  | 13:14 | <moritzm> | restarting ms1001 for kernel update | [production] | 
            
  | 13:10 | <elukey> | force failover from temporary Hadoop Master node (an1002) to its standby (an1001) to restore the standard configuration | [production] | 
            
  | 13:05 | <elukey> | correction: force failover for Hadoop Master node (an1001) to its standby (an1002) and rebooting an1001 for kernel upgrades | [production] | 
            
  | 12:59 | <elukey> | force failover for Hadoop Master node (an1002) to its standby (an1002) and rebooting an1001 for kernel upgrades | [production] | 
            
  | 12:59 | <moritzm> | ferm on baham (failed to start due to failing DNS resolution in early boot) | [production] | 
            
  | 12:52 | <moritzm> | restarting mx2001 for kernel update | [production] | 
            
  | 12:48 | <moritzm> | bounced ntp on mw2116 (XFAC state) | [production] | 
            
  | 12:39 | <elukey> | restarting an1003 for kernel upgrades (oozie/hive master) | [production] | 
            
  | 12:35 | <moritzm> | bounced ntp on baham (was stuck in INIT phase) | [production] | 
            
  | 12:31 | <apergos> | more app server rolling restarts for codfw: mw2163-2199 | [production] | 
            
  | 12:29 | <apergos> | more API server rolling restarts for eqiad: mw1221-1235, 1276-1290 | [production] | 
            
  | 12:27 | <apergos> | more APP server rolling restarts for eqiad: mw1209-1216, 128-1220, 1236-38, 1240-1258 | [production] | 
            
  | 12:12 | <moritzm> | restarting bast2001 for kernel update | [production] | 
            
  | 12:11 | <apergos> | retraction. those are app servers, not starting them yet | [production] | 
            
  | 12:10 | <apergos> | more api server rolling restarts for eqiad: mw1209-1216, 128-1220, 1236-38, 1240-1258 | [production] | 
            
  | 12:08 | <moritzm> | bounced ntp on mw2206 (XFAC state) | [production] | 
            
  | 12:05 | <bblack> | correction: rebooting baham / ns1.wikimedia.org for kernel | [production] | 
            
  | 12:04 | <bblack> | rebooting baham / ns2.wikimedia.org for kernel | [production] | 
            
  | 11:53 | <elukey> | rebooting an1027 (camus job launcher) for kernel upgrades | [production] | 
            
  | 11:48 | <moritzm> | bounced ntp on mw2101 and mw2147 (XFAC state) | [production] | 
            
  | 11:48 | <bblack> | depool cp1047 (cache_maps eqiad) | [production] | 
            
  | 11:23 | <apergos> | rolling restarts of more api servers in codfw: mw2200 - 2220 | [production] | 
            
  | 11:17 | <elukey> | rebooting all the Analytics Hadoop nodes for kernel upgrades | [production] | 
            
  | 11:07 | <mobrovac> | change-prop restarting in codfw after kafka kernel upgrade | [production] | 
            
  | 10:58 | <apergos> | rolling reboots for first batch of app servers in eqiad: mw1170-1188 | [production] | 
            
  | 10:50 | <elukey> | rebooting kafka200[12] for kernel upgrades (Kafka main-codfw non live cluster) | [production] | 
            
  | 10:38 | <apergos> | rolling restarts on first batch of api servers in eqiad: mw1189-1208 | [production] | 
            
  | 10:21 | <apergos> | while the first batch of codfw api servers trundle along, starting rolling reboots for appservers in codfw starting with mw2090-2098, 2100-2119 | [production] | 
            
  | 10:20 | <moritzm> | removing a few older kernels on analytics1036, was short of disk space in /boot partition | [production] | 
            
  | 10:05 | <elukey> | rebooting the Analytics Hadoop cluster for kernel upgrades | [production] | 
            
  | 09:50 | <jynus> | stop sql thread replication for db1053 and applying partitioning as a "special slave" | [production] | 
            
  | 09:32 | <godog> | rolling restart of graphite machines for kernel upgrade | [production] | 
            
  | 09:16 | <apergos> | restarts of mw2075,6,7 done, starting rolling restarts shortly of 8,9, 2120-2147 | [production] |