| 
      
        2009-05-11
      
     | 
  
    
  | 21:40 | 
  <mark> | 
  Switched srv32, srv33 puppetmaster to sockpuppet | 
  [production] | 
            
  | 21:13 | 
  <mark> | 
  Puppetified srv121, srv122 and srv123 and installed them as app servers | 
  [production] | 
            
  | 20:40 | 
  <brion> | 
  making a note for the record that db7 (enwiki watchlist) is lagging sometimes. it's under extra load pulling a dump | 
  [production] | 
            
  | 06:46 | 
  <tstarling> | 
  synchronized php-1.5/includes/specials/SpecialUndelete.php  'deploying r50470 to fix bug 18726 (double URL escaping)' | 
  [production] | 
            
  | 04:12 | 
  <Tim> | 
  starting recompressTracked for all wikis on hume | 
  [production] | 
            
  | 03:57 | 
  <Tim> | 
  stopped old ES slaves on srv172, srv173, srv184, srv185 | 
  [production] | 
            
  | 03:53 | 
  <Tim> | 
  cleaned up relay logs on db25 | 
  [production] | 
            
  
    | 
      
        2009-05-08
      
     | 
  
    
  | 20:32 | 
  <tfinc> | 
  synchronized php-1.5/extensions/WikimediaMessages/WikimediaMessages.i18n.php  | 
  [production] | 
            
  | 18:59 | 
  <tfinc> | 
  synchronized php-1.5/extensions/WikimediaMessages/WikimediaMessages.i18n.php  | 
  [production] | 
            
  | 08:54 | 
  <Tim> | 
  amane's root partition filled up due to the cp running in a root screen, copying from NFS to an unmounted mount point /mnt/big-disk. Moving some stuff to the real mount point /mnt/scratch (in another screen) | 
  [production] | 
            
  | 08:25 | 
  <Tim> | 
  deploying r48837 and r48911 to fix bug 18171 (broken oldimage parameter) | 
  [production] | 
            
  | 06:11 | 
  <midom> | 
  synchronized php-1.5/db.php  'got to get coffee' | 
  [production] | 
            
  | 05:56 | 
  <midom> | 
  synchronized php-1.5/db.php  | 
  [production] | 
            
  | 05:55 | 
  <midom> | 
  synchronized php-1.5/db.php  | 
  [production] | 
            
  | 00:08 | 
  <brion> | 
  db12 no longer overloaded with 'too many connections'. very mysterious | 
  [production] | 
            
  | 00:04 | 
  <brion> | 
  db12 | 
  [production] | 
            
  | 00:04 | 
  <brion> | 
  db errs on en. poking... | 
  [production] | 
            
  
    | 
      
        2009-05-07
      
     | 
  
    
  | 21:36 | 
  <domas> | 
  db30 disks are shown online, array degraded after 'arcconf rescan', not sure what that means | 
  [production] | 
            
  | 21:33 | 
  <domas> | 
  db19 disk error counts: http://p.defau.lt/?pBtD7HBx1O6IboeB9VgINg (one disk just failed few times entirely, other gets lots of aborts/medium errors, might be related) | 
  [production] | 
            
  | 21:21 | 
  <domas> | 
  db30.mgmt needs reset (facilitated by physical movements of power cord) | 
  [production] | 
            
  | 21:03 | 
  <domas> | 
  db30 has _second_ disk death | 
  [production] | 
            
  | 21:00 | 
  <domas> | 
  db28 FUBAR information: http://p.defau.lt/?EZH6Bg4GwYJDJ4hG3OIJYQ | 
  [production] | 
            
  | 20:46 | 
  <domas> | 
  db28 fb0.fm1.f1.speed is flapping between 0 and 21100. needs datacenter inspection and/or vendor service.  | 
  [production] | 
            
  | 19:50 | 
  <domas> | 
  db19 has corrupted ibdata, depooling | 
  [production] | 
            
  | 19:50 | 
  <midom> | 
  synchronized php-1.5/db.php  | 
  [production] | 
            
  | 19:47 | 
  <domas> | 
  bad disk on db19 actually made I/Os time out, thus corrupting relay logs, reset slave seems to have helped.  | 
  [production] | 
            
  | 19:36 | 
  <midom> | 
  synchronized php-1.5/db.php  'db25 needs some load' | 
  [production] | 
            
  | 18:46 | 
  <domas> | 
  db19 drive failed, needs replacement (you hear, Rob?! :) | 
  [production] | 
            
  | 17:35 | 
  <domas> | 
  added retry=1 to ProxyPass for secure.wikimedia apaches backend | 
  [production] | 
            
  | 16:56 | 
  <domas> | 
  enabling mod_deflate (bottom of main.conf) on apaches | 
  [production] | 
            
  | 16:50 | 
  <domas> | 
  added new singtel subnet to trusted xff | 
  [production] | 
            
  | 16:50 | 
  <midom> | 
  synchronized php-1.5/extensions/TrustedXFF/trusted-xff.cdb  | 
  [production] | 
            
  | 07:45 | 
  <domas> | 
  reset slave on db18 | 
  [production] | 
            
  | 00:00 | 
  <brion> | 
  added an 'editor' group to wikitech so we don't have to make all users sysops to edit until we get round to culling the abuse accounts :) | 
  [production] |