| 
      
        2015-12-22
      
     | 
  
    
  | 15:52 | 
  <mutante> | 
  restbase1003 - disk space, restbase1008 - disk space, restbase1004 - cassandra cql refused | 
  [production] | 
            
  | 15:23 | 
  <akosiaris> | 
  upgrade cassandra on maps-test2003 | 
  [production] | 
            
  | 15:06 | 
  <jynus> | 
  restarting and reconfiguring mysql at dbstore2001 | 
  [production] | 
            
  | 15:06 | 
  <mutante> | 
  labtestcontrol2001 - puppet had not been running for a while, a bunch of changes have been applied incl. keys and passwords | 
  [production] | 
            
  | 15:04 | 
  <mutante> | 
  enabling puppet on labtestcontrol2001 | 
  [production] | 
            
  | 15:04 | 
  <akosiaris> | 
  upgraded cassandra on maps-test2004 | 
  [production] | 
            
  | 11:54 | 
  <apergos> | 
  salt packages with wmf packages precise running on ms-{bf}e* in esams; trusty running on analytics103* in eqiad; jessie running on restbase2* in codfw | 
  [production] | 
            
  | 11:43 | 
  <godog> | 
  restart cassandra bootstrap on restbase1004 | 
  [production] | 
            
  | 10:09 | 
  <jynus> | 
  online resizing /srv/postgres on labsdb1006 +100GB | 
  [production] | 
            
  | 10:06 | 
  <hashar> | 
  Restarting Jenkins | 
  [production] | 
            
  | 09:54 | 
  <apergos> | 
  precise and trusty salt packages with wmf patches deployed manually on dataset1001 and analytics1001, seem to work fine | 
  [production] | 
            
  | 08:42 | 
  <jynus> | 
  restarting and reconfiguring mysql at db2036 | 
  [production] | 
            
  | 02:30 | 
  <l10nupdate@tin> | 
  ResourceLoader cache refresh completed at Tue Dec 22 02:30:28 UTC 2015 (duration 6m 54s) | 
  [production] | 
            
  | 02:23 | 
  <mwdeploy@tin> | 
  sync-l10n completed (1.27.0-wmf.9) (duration: 09m 47s) | 
  [production] | 
            
  | 00:29 | 
  <krenair@tin> | 
  Synchronized php-1.27.0-wmf.9/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/260492/ (duration: 00m 32s) | 
  [production] | 
            
  | 00:22 | 
  <krenair@tin> | 
  Synchronized php-1.27.0-wmf.9/extensions/SyntaxHighlight_GeSHi/modules/ve-syntaxhighlight/ve.ui.MWSyntaxHighlightDialogTool.js: https://gerrit.wikimedia.org/r/#/c/260429/ (duration: 00m 30s) | 
  [production] | 
            
  
    | 
      
        2015-12-21
      
     | 
  
    
  | 20:49 | 
  <godog> | 
  restbase1004 bootstrap failed, restbase1007-a is down: java.lang.RuntimeException: A node required to move the data consistently is down (/10.64.0.230). | 
  [production] | 
            
  | 19:27 | 
  <legoktm> | 
  running checkLocalUser.php --delete=1 for real this time on terbium | 
  [production] | 
            
  | 19:22 | 
  <godog> | 
  reimage restbase1004 | 
  [production] | 
            
  | 19:14 | 
  <paravoid> | 
  powercycling mw1011 | 
  [production] | 
            
  | 19:11 | 
  <paravoid> | 
  rolling restart of hhvm on the eqiad jobrunners | 
  [production] | 
            
  | 18:47 | 
  <jynus> | 
  common-sync: Copying to mw1016.eqiad.wmnet from tin.eqiad.wmnet | 
  [production] | 
            
  | 18:35 | 
  <ori> | 
  correction: previous log message was for mw1015, not mw1017 | 
  [production] | 
            
  | 18:27 | 
  <ori> | 
  mw1017: enabled jemalloc profiling, restarted hhvm, now running hhvm-collect-heaps | 
  [production] | 
            
  | 17:48 | 
  <akosiaris> | 
  restarted hhvm on mw1012.eqiad.wmnet | 
  [production] | 
            
  | 16:57 | 
  <thcipriani> | 
  timeout on sync-file to mw1016.eqiad.wmnet | 
  [production] | 
            
  | 16:56 | 
  <thcipriani@tin> | 
  Synchronized php-1.27.0-wmf.9/extensions/Popups/Popups.hooks.php: SWAT: Use ExtensionRegistry to determine whether TextExtracts is installed [[gerrit:260346]] (duration: 02m 48s) | 
  [production] | 
            
  | 16:34 | 
  <jynus> | 
  sync-common to mw1085 | 
  [production] | 
            
  | 16:26 | 
  <jynus> | 
  powercycling mw1085.eqiad.wmnet | 
  [production] | 
            
  | 16:22 | 
  <thcipriani> | 
  mw1085.eqiad.wmnet times out on SSH connection | 
  [production] | 
            
  | 16:19 | 
  <godog> | 
  reboot restbase1007, load through the roof | 
  [production] | 
            
  | 16:18 | 
  <thcipriani@tin> | 
  Synchronized php-1.27.0-wmf.9/extensions/CentralNotice/resources/subscribing/ext.centralNotice.geoIP.js: SWAT: Update CentralNotice [[gerrit:260316]] (duration: 03m 03s) | 
  [production] | 
            
  | 16:08 | 
  <godog> | 
  depool restbase1007 | 
  [production] | 
            
  | 16:01 | 
  <apergos> | 
  jessie packages for salt with local patches deployed on restbase1001, looks fine but just in case.  | 
  [production] | 
            
  | 15:44 | 
  <godog> | 
  adding new 1TB disk to restbase1007 | 
  [production] | 
            
  | 14:22 | 
  <andrewbogott> | 
  disabling puppet on labnet1002 for dnsmasq tests | 
  [production] | 
            
  | 14:07 | 
  <MaxSem> | 
  me and yurik are nuking old maps data and reimporting planet | 
  [production] | 
            
  | 13:46 | 
  <jynus> | 
  extending online s2-master data disk by +100GB | 
  [production] | 
            
  | 13:15 | 
  <akosiaris> | 
  disabled puppet on maps-test2001 and commented out osmupdater crontab entry until we fix the sync process | 
  [production] | 
            
  | 11:02 | 
  <jynus> | 
  emergency restart of db1047's mysql | 
  [production] | 
            
  | 09:54 | 
  <jynus> | 
  reenabling semisync replication on s3 | 
  [production] | 
            
  | 09:07 | 
  <godog> | 
  stop cassandra on restbase1004, decommissioned | 
  [production] | 
            
  | 02:29 | 
  <l10nupdate@tin> | 
  ResourceLoader cache refresh completed at Mon Dec 21 02:29:51 UTC 2015 (duration 6m 47s) | 
  [production] | 
            
  | 02:23 | 
  <mwdeploy@tin> | 
  sync-l10n completed (1.27.0-wmf.9) (duration: 09m 45s) | 
  [production] | 
            
  | 02:20 | 
  <andrewbogott> | 
  disabling puppet on labnet1002 to mess with dnsmasq | 
  [production] | 
            
  | 01:44 | 
  <andrewbogott> | 
  disabled puppet on holmium and labservices1001 to control roll-out of https://gerrit.wikimedia.org/r/#/c/260037/ | 
  [production] |