| 2016-12-22
      
    
  | 21:03 | <catrope@tin> | Finished scap: Sync Idf4618977f172 in the OAuth extension- (duration: 24m 16s) | [production] | 
            
  | 20:39 | <catrope@tin> | Started scap: Sync Idf4618977f172 in the OAuth extension- | [production] | 
            
  | 19:59 | <ebernhardson> | restarting elasticsearch (again) on relforge100[12] to test ltr plugin | [production] | 
            
  | 19:22 | <gehel> | restart wdqs-blazegraph and wdqs-updater on wdqs1001.eqiad.wmnet (suspicious load) | [production] | 
            
  | 19:18 | <ebernhardson> | restarting elasticsearch on relforge100[12] to test ltr plugin | [production] | 
            
  | 19:18 | <jynus> | stopping replication on dbstore2001(s2) and db2035 for enwiktionary.templatelinks reimport | [production] | 
            
  | 19:04 | <godog> | roll restart swift proxy on ms-fe1* to drain thumbor traffic | [production] | 
            
  | 15:39 | <jynus> | restart dbstore2001 to change buffer pool size, testing gerrit:328671 | [production] | 
            
  | 14:51 | <elukey> | restarting the yarn node manager java daemons on all the Hadoop worker nodes due to a suspected memory leak | [production] | 
            
  | 14:14 | <elukey> | the previous entry is missing: "on analytics1032" | [production] | 
            
  | 14:13 | <elukey> | manually starting the yarn nodemanager after OOM | [production] | 
            
  | 13:41 | <jynus> | stopping db1035 (depooled) replication to perform maintenance to avoid disk alerts in the next 2 weeks | [production] | 
            
  | 10:02 | <moritzm> | installing c-ares security updates on trusty systems (jessie already fixed for quite a while) | [production] | 
            
  | 10:02 | <moritzm> | installing c-ares security updates | [production] | 
            
  | 09:02 | <moritzm> | installing tomcat security updates | [production] | 
            
  | 08:45 | <moritzm> | installing libav security updates on trusty systems | [production] | 
            
  | 08:18 | <moritzm> | installing Django security updates | [production] | 
            
  | 07:26 | <elukey> | created /var/log/squid3/access.log.1.gz on aluminum to fix cronspam - T132324 | [production] | 
            
  | 02:26 | <l10nupdate@tin> | ResourceLoader cache refresh completed at Thu Dec 22 02:26:23 UTC 2016 (duration 4m 49s) | [production] | 
            
  | 02:21 | <l10nupdate@tin> | scap sync-l10n completed (1.29.0-wmf.6) (duration: 07m 54s) | [production] | 
            
  
    | 2016-12-21
      
    
  | 23:48 | <mutante> | europium - jessie reinstall done - powered down until reclaim (T153918) | [production] | 
            
  | 23:31 | <mutante> | europium - re-installing with jessie (T82239) | [production] | 
            
  | 19:15 | <mutante> | public1-b-eqiad and public1-c-eqiad are configured to use install1001 as DHCP, all others still use carbon as DHCP; all subnets now use install1001 as TFTP | [production] | 
            
  | 19:13 | <mutante> | carbon - re-enabled puppet and DHCP | [production] | 
            
  | 18:13 | <mutante> | carbon - temp stopping dhcp server | [production] | 
            
  | 15:22 | <gehel> | truncating /var/log/elasticsearch/relforge-eqiad_feature.log on relforge100[12] | [production] | 
            
  | 15:04 | <elukey> | removed mongodb* packages from stat1003 after https://gerrit.wikimedia.org/r/328519 | [production] | 
            
  | 14:54 | <moritzm> | installing ghostscript security updates on trusty hosts | [production] | 
            
  | 14:29 | <moritzm> | installing imagemagick security updates | [production] | 
            
  | 13:09 | <moritzm> | install hdf5 security updates | [production] | 
            
  | 13:03 | <mobrovac@tin> | Finished deploy [parsoid/deploy@dab1f27]: Bug fix for mwApiServer T153797 (duration: 05m 32s) | [production] | 
            
  | 12:57 | <mobrovac@tin> | Starting deploy [parsoid/deploy@dab1f27]: Bug fix for mwApiServer T153797 | [production] | 
            
  | 12:46 | <moritzm> | install openjdk-6 security update on labsdb1006 | [production] | 
            
  | 10:43 | <jynus> | dropping non-wiki databases from labsdb1001 | [production] | 
            
  | 10:01 | <moritzm> | installing libgme security updates | [production] | 
            
  | 09:58 | <jynus> | extending db1035 /srv partition | [production] | 
            
  | 08:42 | <elukey> | restarted hhvm/jobrunner (and killed ffmpeg processes) on mw116[89] | [production] | 
            
  | 08:01 | <marostegui> | Running optimize table on db1044 for the pagelinks tables as we urgently need some space back on that host - T153826 | [production] | 
            
  | 07:20 | <marostegui> | Running optimize table on db1045 for the revision tables as we urgently need some space back on that host - https://phabricator.wikimedia.org/T153739 | [production] | 
            
  | 04:36 | <Niharika> | commtech Added samwilson as project admin | [production] | 
            
  | 03:03 | <dzahn@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=mw1169.eqiad.wmnet | [production] | 
            
  | 02:51 | <mutante> | relforge1001 has huge /var/log/elasticsearch/relforge-eqiad_feature.log that wrote GBs just today but then stopped | [production] | 
            
  | 02:23 | <mutante> | mw1169 - reinstall done - sign new puppet cert, initial run... | [production] | 
            
  | 02:20 | <l10nupdate@tin> | scap sync-l10n completed (1.29.0-wmf.6) (duration: 07m 40s) | [production] | 
            
  | 02:12 | <mutante> | mw1169 - delete salt key, revoke puppet cert | [production] | 
            
  | 02:06 | <mutante> | reinstalling mw1169 (carbon DHCP, install1001 TFTP) | [production] | 
            
  | 02:02 | <mutante> | re-enabling DHCP and puppet | [production] | 
            
  | 01:49 | <mutante> | carbon - temp stop DHCP service to test install from install1001 | [production] | 
            
  | 01:47 | <dzahn@puppetmaster1001> | conftool action : set/pooled=no; selector: name=mw1169.eqiad.wmnet | [production] | 
            
  | 01:47 | <mutante> | mw1169 - schedule 2 hours downtime - boot for reinstall shortly | [production] |