| 2018-01-18 |
    
  | 15:07 | <mforns> | starting refinery deployment | [analytics] | 
            
  | 12:43 | <elukey> | piwik on bohrium re-enabled | [analytics] | 
            
  | 12:40 | <elukey> | set piwik in readonly mode and stopped mysql on bohrium (prep step for reboot) | [analytics] | 
            
  | 09:38 | <elukey> | reboot thorium (analytics webserver) for security upgrade - This maintenance will cause temporary unavailability of the Analytics websites | [analytics] | 
            
  | 09:37 | <elukey> | resumed druid hourly index jobs via hue and restored pivot's configuration | [analytics] | 
            
  | 09:21 | <elukey> | reboot druid1001 for kernel upgrades | [analytics] | 
            
  | 09:00 | <elukey> | suspended hourly druid batch index jobs via Hue | [analytics] | 
            
  | 08:58 | <elukey> | temporarily set druid1002 in superset's druid cluster config (via UI) | [analytics] | 
            
  | 08:53 | <elukey> | temporarily point pivot's configuration to druid1002 (druid1001 needs to be rebooted) | [analytics] | 
            
  | 08:52 | <elukey> | disable druid1001's middlemanager as prep step for reboot | [analytics] | 
            
  | 07:11 | <elukey> | re-run webrequest-load-wf-misc-2018-1-18-3 via Hue | [analytics] | 
            
  
    | 2018-01-17 |
    
  | 17:33 | <elukey> | killed the banner impression spark job (application_1515441536446_27293) again to force it to respawn (real time indexers not present) | [analytics] | 
            
  | 17:29 | <elukey> | restarted all druid overlords on druid100[123] (weird race condition messages about who was the leader for some task) | [analytics] | 
            
  | 16:24 | <elukey> | re-run all the pageview-druid-hourly failed jobs via Hue | [analytics] | 
            
  | 14:42 | <elukey> | restart druid middlemanager on druid1003 as attempt to unblock realtime streaming | [analytics] | 
            
  | 14:21 | <elukey> | forced kill of banner impression data streaming job to get it restarted | [analytics] | 
            
  | 11:44 | <elukey> | re-run pageview-druid-hourly-wf-2018-1-17-9 and pageview-druid-hourly-wf-2018-1-17-8 (failed due to druid1002's middlemanager being in a weird state after reboot) | [analytics] | 
            
  | 11:44 | <elukey> | restart druid middlemanager on druid1002 | [analytics] | 
            
  | 10:38 | <elukey> | stopped all crons on hadoop-coordinator-1 | [analytics] | 
            
  | 10:37 | <elukey> | re-run webrequest-druid-hourly-wf-2018-1-17-8 (failed due to druid1002's reboot) | [analytics] | 
            
  | 10:22 | <elukey> | reboot druid1002 for kernel upgrades | [analytics] | 
            
  | 09:53 | <elukey> | disable druid middlemanager on druid1002 as prep step for reboot | [analytics] | 
            
  | 09:46 | <elukey> | rebooted analytics1003 | [analytics] | 
            
  | 09:46 | <elukey> | removed upstart config for brrd on eventlog1001 (failing and spamming syslog, old leftover?) | [analytics] | 
            
  | 08:53 | <elukey> | disabled camus as prep step for analytics1003 reboot | [analytics] | 
            
  
    | 2018-01-11 |
    
  | 22:35 | <ottomata> | restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/403774 | [analytics] | 
            
  | 22:04 | <ottomata> | restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403762/ | [analytics] | 
            
  | 20:57 | <ottomata> | restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403753/ | [analytics] | 
            
  | 17:37 | <joal> | Kill manual banner-streaming job to see it restarted by cron | [analytics] | 
            
  | 17:11 | <ottomata> | restart kafka on kafka-jumbo1003 | [analytics] | 
            
  | 17:08 | <ottomata> | restart kafka on kafka-jumbo1001...something is not right with my certpath change yesterday | [analytics] | 
            
  | 14:46 | <joal> | Deploy refinery onto HDFS | [analytics] | 
            
  | 14:33 | <joal> | Deploy refinery with Scap | [analytics] | 
            
  | 14:07 | <joal> | Manually restarting banner streaming job to prevent alerting | [analytics] | 
            
  | 13:23 | <joal> | Killing banner-streaming job to have it auto-restarted from cron | [analytics] | 
            
  | 11:45 | <elukey> | re-run webrequest-load-wf-text-2018-1-11-8 (failed due to reboots) | [analytics] | 
            
  | 11:39 | <joal> | rerun mediacounts-load-wf-2018-1-11-8 | [analytics] | 
            
  | 10:48 | <joal> | Restarting banner-streaming job after hadoop nodes reboot | [analytics] | 
            
  | 10:01 | <elukey> | reboot analytics1059-61 for kernel updates | [analytics] | 
            
  | 09:34 | <elukey> | reboot analytics1055->1058 for kernel updates | [analytics] | 
            
  | 09:04 | <elukey> | reboot analytics1051->1054 for kernel updates | [analytics] |