| 2009-04-27
      
    
  | 14:54 | <Rob> | srv104 back online | [production] | 
            
  | 14:48 | <Rob> | srv102 and srv103 back up and online | [production] | 
            
  | 14:43 | <Rob> | srv102-106 reinstalling. | [production] | 
            
  | 14:29 | <Rob> | srv53 has a bad fan, shutting down until it's replaced. | [production] | 
            
  | 14:20 | <Rob> | srv102-srv109 being upgraded to Ubuntu. | [production] | 
            
  | 11:43 | <andrew> | synchronized php-1.5/InitialiseSettings.php  'Updated $wgSitename for ukwikimedia in accordance with IRC request from Michael Peel, a board member' | [production] | 
            
  | 02:20 | <Tim> | srv53 down, took it out of memcached rotation. Updating the memcached spare list. | [production] | 
            
  | 02:20 | <tstarling> | synchronized php-1.5/mc-pmtpa.php | [production] | 
            
  | 02:12 | <Tim> | fixed rc1 slaves, broken by expire_logs_days on ms3 | [production] | 
            
  | 01:59 | <Tim> | Shut down srv217 for maintenance. Similar timer interrupt issue observed as before: select() syscalls running indefinitely despite a short timeout specified. | [production] | 
            
  | 01:53 | <tstarling> | synchronized php-1.5/db.php | [production] | 
            
  | 01:52 | <Tim> | repooled ms3 rc1 instance | [production] | 
            
  | 01:49 | <Tim> | reset slave on db21, was running out of disk space due to relay logs | [production] | 
            
  | 01:42 | <Tim> | fixed nagios for srv99, still had its apache check command set to my CGI security vulnerability demonstration, permanently saved in retention.dat despite config changes | [production] | 
            
  | 01:17 | <Tim> | enabled apport on srv99, to see if I can track down the nagios flapping | [production] | 
            
  | 00:52 | <Tim> | restarted trackBlobs.php | [production] | 
            
  
    | 2009-04-25
      
    
  | 23:31 | <Tim-away> | experimentally stopping replication on db3 to check disk load | [production] | 
            
  | 22:51 | <tstarling> | synchronized php-1.5/db.php  'reduced load on db3' | [production] | 
            
  | 18:50 | <mark> | Killed long-running TrackBlobs::trackRevisions SQL query from hume, which was causing db3 to lag heavily | [production] | 
            
  | 17:22 | <mark> | Stopped Apaches on srv32/srv33 again, as syncs will fail in most cases | [production] | 
            
  | 16:36 | <mark> | Started /home-less apache on srv33 | [production] | 
            
  | 13:23 | <mark> | Started /home-less apache on srv32 | [production] | 
            
  | 11:03 | <mark> | Kicked srv99 back into submission | [production] | 
            
  | 10:56 | <mark> | Squid-blocked high-rate scraper which was overloading ES | [production] | 
            
  | 05:30 | <Tim-away> | fixed conflict markers in extensions/CentralNotice/SpecialNoticeText.php and resynced. | [production] | 
            
  | 05:30 | <tstarling> | synchronized php-1.5/extensions/CentralNotice/SpecialNoticeText.php | [production] | 
            
  
    | 2009-04-24
      
    
  | 22:23 | <rainman__> | search back up on all wikis | [production] | 
            
  | 22:17 | <root> | synchronized php-1.5/lucene.php  'Replacement for reinstalled srv58' | [production] | 
            
  | 22:15 | <brion> | synchronized php-1.5/secure.php  'fix for thumbs on private ssl access (bug 18475 etc)' | [production] | 
            
  | 21:19 | <rainman_> | srv58 dead, breaking search on all non-major wikis; transferring the service to search11/12.... | [production] | 
            
  | 19:50 | <Rob> | srv90-srv99 ganglia installed. | [production] | 
            
  | 19:50 | <Rob> | srv97 online | [production] | 
            
  | 19:47 | <Rob> | srv98 online | [production] | 
            
  | 19:46 | <Rob> | srv96 online | [production] | 
            
  | 19:45 | <Rob> | srv99 online | [production] | 
            
  | 19:42 | <Rob> | srv95 online | [production] | 
            
  | 19:40 | <Rob> | srv92, srv93, and srv94 back online | [production] | 
            
  | 19:39 | <Rob> | srv91 back online | [production] | 
            
  | 19:24 | <Rob> | srv90 online | [production] | 
            
  | 19:16 | <Rob> | srv90-srv99 reinstalled, currently looping through package installation | [production] | 
            
  | 18:34 | <mark> | Fixed ganglia by installing the appropriate config files on the (reinstalled) aggregation hosts | [production] | 
            
  | 18:28 | <Rob> | installed ganglia on all servers reinstalled to ubuntu apache thus far today. | [production] | 
            
  | 18:27 | <Rob> | srv89 back online | [production] | 
            
  | 18:17 | <Rob> | srv90-srv99 will be down over the next 30 minutes for ubuntufication. | [production] | 
            
  | 18:16 | <robh> | synchronized php-1.5/mc-pmtpa.php  'some spares were actually down' | [production] | 
            
  | 18:14 | <robh> | synchronized php-1.5/mc-pmtpa.php  'removed the 9x servers for reinstallation' | [production] | 
            
  | 18:02 | <Rob> | srv84 ubuntufied and online | [production] | 
            
  | 17:58 | <Rob> | srv83 ubuntufied and online | [production] | 
            
  | 17:54 | <Rob> | srv82 ubuntufied and online | [production] | 
            
  | 17:50 | <Rob> | srv81 reinstalled and online | [production] |