| 2009-07-11 |
    
  | 14:53 | <tstarling> | synchronized php-1.5/InitialiseSettings.php  'disabling CentralAuth' | [production] | 
            
  | 14:36 | <Tim> | restarted webserver7 on ms1 | [production] | 
            
  | 14:22 | <Tim> | some kind of overload, seems to be image related | [production] | 
            
  | 10:09 | <midom> | synchronized php-1.5/db.php  'db8 doing commons read load, full write though' | [production] | 
            
  | 09:22 | <domas> | restarted job queue with externallinks purging code, <3 | [production] | 
            
  | 09:22 | <domas> | installed nrpe on db2 :) | [production] | 
            
  | 09:22 | <midom> | synchronized php-1.5/db.php  'giving db24 just negligible load for now' | [production] | 
            
  | 08:38 | <midom> | synchronized php-1.5/includes/parser/ParserOutput.php  'livemerging r53103:53105' | [production] | 
            
  | 08:37 | <midom> | synchronized php-1.5/includes/DefaultSettings.php | [production] | 
            
  
    | 2009-07-10 |
    
  | 21:21 | <Fred> | added ganglia to db20 | [production] | 
            
  | 19:58 | <azafred> | synchronized php-1.5/CommonSettings.php  'removed border=0 from wgCopyrightIcon' | [production] | 
            
  | 18:58 | <Fred> | synched nagios config to reflect cleanup. | [production] | 
            
  | 18:52 | <Fred> | cleaned up the node_files for dsh and removed all decommissioned hosts. | [production] | 
            
  | 18:36 | <mark> | Added DNS entries for srv251-500 | [production] | 
            
  | 18:18 | <fvassard> | synchronized php-1.5/mc-pmtpa.php  'Added a couple spare memcache hosts.' | [production] | 
            
  | 18:16 | <RobH_DC> | moved test to srv66 instead. | [production] | 
            
  | 18:08 | <RobH_DC> | turning srv210 into test.wikipedia.org | [production] | 
            
  | 17:57 | <Andrew> | Reactivating UsabilityInitiative globally, too. | [production] | 
            
  | 17:55 | <Andrew> | Scapping, back-out diff is in /home/andrew/usability-diff | [production] | 
            
  | 17:43 | <Andrew> | Apply r52926, r52930, and update Resources and EditToolbar/images | [production] | 
            
  | 16:44 | <Fred> | reinstalled and configured gmond on storage1. | [production] | 
            
  | 15:08 | <Rob> | upgraded blog and techblog to wordpress 2.8.1 | [production] | 
            
  | 13:58 | <midom> | synchronized php-1.5/includes/api/ApiQueryCategoryMembers.php  'hello, fix\\!' | [production] | 
            
  | 12:40 | <Tim> | prototype.wikimedia.org is in OOM death, nagios reports down 3 hours, still responsive on shell so I will try a light touch | [production] | 
            
  | 11:08 | <tstarling> | synchronized php-1.5/mc-pmtpa.php  'more' | [production] | 
            
  | 10:58 | <Tim> | installed memcached on srv200-srv209 | [production] | 
            
  | 10:51 | <tstarling> | synchronized php-1.5/mc-pmtpa.php  'deployed the 11 available spares, will make some more' | [production] | 
            
  | 10:48 | <Tim> | mctest.php reports 17 servers down out of 78, most from the range that Rob decommissioned | [production] | 
            
  | 10:37 | <Tim> | installed memcached on srv120, srv121, srv122, srv123 | [production] | 
            
  | 10:32 | <Tim> | found rogue server srv101, missing puppet configuration and so skipping syncs. Uninstalled apache on it. | [production] | 
            
  
    | 2009-07-09 |
    
  | 23:56 | <RoanKattouw> | Rebooted prototype around 16:30, got stuck around 15:30 | [production] | 
            
  | 21:43 | <Rob> | srv35 (test.wikipedia.org) is not posting, I think it's dead, Jim. | [production] | 
            
  | 21:35 | <Rob> | decommissioned srv55 and put srv35 in its place in C4, test.wikipedia.org should be back online shortly | [production] | 
            
  | 20:04 | <Rob> | removed decommissioned servers from node groups, getting error on syncing up nagios. | [production] | 
            
  | 20:03 | <Rob> | updated dns for new apache servers | [production] | 
            
  | 19:54 | <Rob> | decommissioned all old apaches in rack pmtpa b2 | [production] | 
            
  | 16:22 | <Tim> | creating mhrwiki (bug 19515) | [production] | 
            
  | 13:27 | <domas> | db13 controller battery failed, s2 needs master switch eventually | [production] | 
            
  
    | 2009-07-07 |
    
  | 19:06 | <Fred> | adjusted www.wikipedia.org apache conf file to remove a redirect-loop to www.wikibooks.org. (bug #19460) | [production] | 
            
  | 17:34 | <Fred> | found the cause of Ganglia issues: Puppet. Seems like the configuration of the master hosts gets reverted to being deaf automagically... | [production] | 
            
  | 17:05 | <Fred> | ganglia fixed. For some reason the master cluster nodes were set to Deaf mode... (i.e. the aggregator couldn't gather data from them). | [production] | 
            
  | 15:02 | <robh> | synchronized php-1.5/InitialiseSettings.php  '19470 Rollback on pt.wikipedia' | [production] | 
            
  | 03:37 | <Fred> | fixing ganglia. Expect disruption | [production] | 
            
  | 00:27 | <tomaszf> | starting six worker threads for xml snapshots | [production] | 
            
  | 00:12 | <Fred> | srv142 and srv55 will need manual power-cycle. | [production] | 
            
  | 00:10 | <Fred> | Rolling reboot has finally completed. | [production] |