| 2009-04-23
      
    
  | 05:37 | <Tim> | srv217 did not come up from a soft reboot, but power cycle worked. Before reboot, observed apache2 hanging indefinitely on nanosleep(), but couldn't reproduce a timer issue in other processes. An NFS mount was hanging on stat. | [production] | 
            
  | 05:13 | <Tim> | rebooting srv217 | [production] | 
            
  | 04:41 | <Tim> | srv217 is hanging on various operations, investigating. Trying to shut down its apache. | [production] | 
            
  | 04:35 | <tstarling> | synchronized php-1.5/db.php | [production] | 
            
  | 04:31 | <Tim> | copy done, started cluster18 mysql instance on ms3 using srv104 snapshot, repooled it | [production] | 
            
  | 02:07 | <tstarling> | synchronized php-1.5/InitialiseSettings.php | [production] | 
            
  | 01:57 | <Tim> | relaxed wgAccountCreationThrottle on frwiki, presumably the 2006 vandal emergency is over. Disabled it on idwiki for workshop event. | [production] | 
            
  | 01:45 | <Tim> | copying srv104's data from ms3 to ms2 | [production] | 
            
  | 01:11 | <Tim> | started mysql on srv104 | [production] | 
            
  
    | 2009-04-22
      
    
  | 21:44 | <tomaszf> | db9 is back up. excessive tmpfs file systems removed | [production] | 
            
  | 21:39 | <tomaszf> | taking outage on db9 to remove tmpfs file systems | [production] | 
            
  | 11:34 | <JeLuF> | initiated reboot of srv137. dmesg shows no usable information any more. | [production] | 
            
  | 11:30 | <JeLuF> | srv137 has read-only filesystem. Stopped Apache. | [production] | 
            
  | 06:03 | <andrew> | synchronized php-1.5/includes/specials/SpecialBlockip.php  'Live-merged r49730, typo causing failures in user hiding' | [production] | 
            
  | 06:02 | <Andrew> | srv137 still seems read-only, srv137: rsync: mkstemp "/apache/common/php-1.5/includes/specials/.SpecialBlockip.php.1QkrKX" failed: Read-only file system (30) | [production] | 
            
  | 03:14 | <Tim> | copying ES data from srv104 to ms3 using nc tarpipe | [production] | 
            
  | 03:10 | <tstarling> | synchronized php-1.5/db.php  'depooling srv104 ES' | [production] | 
            
  | 03:03 | <Tim> | corruption found on cluster18, the copy source server (srv106) is missing lots of rows. Switched back to srv105/104. | [production] | 
            
  | 03:02 | <tstarling> | synchronized php-1.5/db.php | [production] | 
            
  | 02:50 | <tstarling> | synchronized php-1.5/includes/Revision.php  'reverted profiling and logging hacks' | [production] | 
            
  | 02:40 | <Tim> | depooled ms2 ex-fedora instances and shut them down, it can be a backup for now | [production] | 
            
  | 02:38 | <tstarling> | synchronized php-1.5/db.php | [production] | 
            
  | 02:33 | <Tim> | deployed the new ms2/ms3 ex-fedora ES configuration | [production] | 
            
  | 02:32 | <tstarling> | synchronized php-1.5/db.php | [production] | 
            
  | 02:01 | <Tim> | set up ex-fedora mysql instances on both ms2 and ms3, controlled with /etc/init.d/mysql-ex-fedora | [production] | 
            
  | 01:04 | <Tim> | changed the main mysql instance on ms3 (rc1) to bind to a single IP address instead of * | [production] | 
            
  
    | 2009-04-21
      
    
  | 19:41 | <mark> | Added grosley.wikimedia.org to local_domains list on grosley's exim.conf, and added appropriate aliases in /etc/aliases | [production] | 
            
  | 16:35 | <Andrew> | Re-ran rebuildTemplates.php, all seems well now | [production] | 
            
  | 16:30 | <robh> | synchronized php-1.5/mc-pmtpa.php  'syncing for fred' | [production] | 
            
  | 16:30 | <root> | synchronized php-1.5/mc-pmtpa.php  'swapping out srv88 for srv159 and srv90 for srv198' | [production] | 
            
  | 16:29 | <andrew> | synchronized php-1.5/mc-pmtpa.php  'Switched srv88 for srv159, srv90 for srv198 to fix down memcache nodes' | [production] | 
            
  | 16:18 | <azafred> | restarted memcached on srv96. Now responding. | [production] | 
            
  | 16:14 | <Rob> | Fred needs to start logging in as Fred and not as root, bad Fred (see, it wasn't me this time, bwahahahahahaa) | [production] | 
            
  | 16:11 | <Andrew> | Fred fixed up some memcached nodes, but no joy with rebuildTemplates | [production] | 
            
  | 16:10 | <root> | synchronized php-1.5/mc-pmtpa.php  'swapping out down servers for active ones' | [production] | 
            
  | 16:09 | <root> | synchronized php-1.5/mc-pmtpa.php  'swapping out down servers for active ones' | [production] | 
            
  | 16:01 | <Rob> | srv137 read only, depooled in pybal for apache and rebooting. | [production] | 
            
  | 15:57 | <root> | synchronized php-1.5/mc-pmtpa.php  'swapping out down servers for active ones' | [production] | 
            
  | 14:34 | <Andrew> | rebuildTemplates.php appeared not to help, same problem as before (stopped after a few wikis). Possibly a dodgy memcache node. | [production] | 
            
  | 14:32 | <Andrew> | ran rebuildTemplates.php metawiki due to reports of <messagename> appearing in place of the central notice. | [production] | 
            
  | 05:04 | <Andrew> | Live-merged r49685, fix for unsuppression of usernames on unblock -- some usernames were left stuck suppressed if they were unblocked when the block suppressed their username | [production] | 
            
  | 05:03 | <andrew> | synchronized php-1.5/includes/specials/SpecialBlockip.php | [production] | 
            
  | 05:03 | <andrew> | synchronized php-1.5/includes/specials/SpecialIpblocklist.php | [production] | 
            
  | 01:34 | <azafred> | Made some improvements to spam handling. Bayes is in play and can learn from everybody what is spam and what is ham. Documentation to follow. | [production] |