| 
      
        2016-07-23
      
     | 
  
    
  | 15:38 | 
  <godog> | 
  stop swift in esams test cluster, lots of logging from there | 
  [production] | 
            
  | 15:37 | 
  <godog> | 
  lithium: sudo lvextend --size +10G -r /dev/mapper/lithium--vg-syslog | 
  [production] | 
            
  | 04:58 | 
  <ori> | 
  Gerrit is back up after service restart; was unavailable between ~04:29 and 04:57 UTC | 
  [production] | 
            
  | 04:56 | 
  <ori> | 
  Restarting Gerrit on ytterbium | 
  [production] | 
            
  | 04:48 | 
  <ori> | 
  Users report Gerrit is down; on ytterbium java is occupying two cores at 100% | 
  [production] | 
            
  | 03:48 | 
  <chasemp> | 
  gnt-instance reboot seaborgium.wikimedia.org | 
  [production] | 
            
  | 02:26 | 
  <l10nupdate@tin> | 
  ResourceLoader cache refresh completed at Sat Jul 23 02:26:49 UTC 2016 (duration 5m 41s) | 
  [production] | 
            
  | 02:21 | 
  <mwdeploy@tin> | 
  scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 24s) | 
  [production] | 
            
  | 01:02 | 
  <tgr@tin> | 
  Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthPlugin.php: T141160 (duration: 00m 29s) | 
  [production] | 
            
  | 01:01 | 
  <tgr@tin> | 
  Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthHooks.php: T141160 (duration: 00m 27s) | 
  [production] | 
            
  | 01:00 | 
  <tgr@tin> | 
  Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthPrimaryAuthenticationProvider.php: T141160 (duration: 00m 28s) | 
  [production] | 
            
  | 00:37 | 
  <tgr> | 
  doing an emergency deploy of https://gerrit.wikimedia.org/r/#/c/300679 for T141160, which causes dozens of new users per hour to be unattached on loginwiki, which probably has weird consequences | 
  [production] | 
            
  
    | 
      
        2016-07-22
      
     | 
  
    
  | 22:19 | 
  <aaron@tin> | 
  Synchronized wmf-config/InitialiseSettings.php: Enable debug logging for DBTransaction (duration: 00m 38s) | 
  [production] | 
            
  | 21:10 | 
  <ejegg> | 
  updated civicrm from 2f4805fa2d2a7c57881408be2b3a017d26d8f43e to d657255e1edebeccfc0a03bea70b78eb11375cf8 | 
  [production] | 
            
  | 20:58 | 
  <ejegg> | 
  disabled Worldpay audit parser job | 
  [production] | 
            
  | 18:59 | 
  <ejegg> | 
  rolled back payments from 79d2b67067fd7e579372b63e0d619eccfa3b9143 to 79cb53998c41f72d0fa49130ed1f66dc112b478c | 
  [production] | 
            
  | 18:54 | 
  <mutante> | 
  restart grrrit-wm | 
  [production] | 
            
  | 16:05 | 
  <Jeff_Green> | 
  running authdns-update to correct a DKIM public key on wikipedia.org | 
  [production] | 
            
  | 15:24 | 
  <anomie> | 
  Starting script to populate empty gu_auth_token [[phab:T140478]] | 
  [production] | 
            
  | 15:16 | 
  <urandom> | 
  T140825: Restarting Cassandra to apply 8MB trickle_fsync (restbase1015-a.eqiad.wmnet) | 
  [production] | 
            
  | 14:21 | 
  <gehel> | 
  rolling restart of logstash100[1-3] - T141063 | 
  [production] | 
            
  | 14:19 | 
  <urandom> | 
  T134016: Bootstrapping restbase2004-c.codfw.wmnet | 
  [production] | 
            
  | 12:42 | 
  <jynus> | 
  applying new m5 db grants | 
  [production] | 
            
  | 11:12 | 
  <jynus> | 
  reimage dbproxy1009 T140983 | 
  [production] | 
            
  | 11:04 | 
  <jynus> | 
  applying new m2 db grants | 
  [production] | 
            
  | 10:47 | 
  <jynus> | 
  reimage dbproxy1007 T140983 | 
  [production] | 
            
  | 10:36 | 
  <jynus> | 
  applying new m1 db grants | 
  [production] | 
            
  | 10:27 | 
  <hashar> | 
  Restarting Jenkins entirely (deadlocked) | 
  [production] | 
            
  | 10:23 | 
  <hashar> | 
  Jenkins has some random deadlock. Will probably reboot it | 
  [production] | 
            
  | 09:45 | 
  <jynus> | 
  reimage dbproxy1006 | 
  [production] | 
            
  | 09:36 | 
  <jynus> | 
  applying new m3 db grants | 
  [production] | 
            
  | 08:19 | 
  <jynus> | 
  reimage dbproxy1008 | 
  [production] | 
            
  | 06:43 | 
  <jynus> | 
  updating dns records: m3-slave to db1043; m2-master to dbproxy1002 | 
  [production] | 
            
  | 04:08 | 
  <jynus> | 
  backing up, shutting down and reimaging db1043 | 
  [production] | 
            
  | 03:14 | 
  <jynus> | 
  stopping db1043 db | 
  [production] | 
            
  | 03:06 | 
  <twentyafterfour> | 
  restarted apache2 and phd on iridium | 
  [production] | 
            
  | 03:04 | 
  <jynus> | 
  reverting m3-master dns back to the proxy | 
  [production] | 
            
  | 02:59 | 
  <jynus> | 
  restarted phd on iridium | 
  [production] | 
            
  | 02:35 | 
  <jynus> | 
  SET GLOBAL read_only=0; on db1048 | 
  [production] | 
            
  | 02:34 | 
  <jynus> | 
  updating m3-master dns | 
  [production] | 
            
  | 02:33 | 
  <jynus> | 
  setting db1043 as read-only (phabricator/m3) | 
  [production] | 
            
  | 02:31 | 
  <jynus> | 
  making dbstore1002.eqiad.wmnet:3306 a child of db1048.eqiad.wmnet:3306 | 
  [production] | 
            
  | 02:27 | 
  <jynus> | 
  making db2012.codfw.wmnet:3306 a child of db1048.eqiad.wmnet | 
  [production] | 
            
  | 02:25 | 
  <l10nupdate@tin> | 
  ResourceLoader cache refresh completed at Fri Jul 22 02:25:53 UTC 2016 (duration 5m 47s) | 
  [production] | 
            
  | 02:20 | 
  <mwdeploy@tin> | 
  scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 23s) | 
  [production] | 
            
  | 00:53 | 
  <bd808> | 
  Restarted elasticsearch on logstash1003; couldn't find master (even though the master thought 1003 was fine) | 
  [production] | 
            
  | 00:43 | 
  <mutante> | 
  restarted grrrit-wm | 
  [production] | 
            
  | 00:01 | 
  <maxsem@tin> | 
  Synchronized wmf-config/InitialiseSettings-labs.php: Labs-only cleanups (duration: 00m 25s) | 
  [production] |