| 2009-01-08 |
    
  | 22:08 | <brion> | putting db12 back in service, caught up | [production] | 
            
  | 21:42 | <RobH> | changed the ip address for the management interfaces on sq31-sq50 | [production] | 
            
  | 21:30 | <RobH> | updated dns with the squids and srv management info for pmtpa | [production] | 
            
  | 21:16 | <brion> | taking load off db12 while it updates | [production] | 
            
  | 21:15 | <brion> | killing stuck query threads on db12 (lagged 13k seconds) | [production] | 
            
  | 20:23 | <RobH> | updated dns removing a large number of decommissioned servers from records. | [production] | 
            
  | 20:08 | <RobH> | pushed updates to dns for management ip allocations, changed management ips of search8-search12 | [production] | 
            
  | 19:43 | <RobH> | changed the management ip addresses of db5-db10 to fit into current ip scheme | [production] | 
            
  | 18:20 | <RobH> | updated dns for the management name resolution of db11-db30 | [production] | 
            
  | 18:11 | <RobH> | ms5 has lom access enabled and is ready for testing.  (Only one ethernet connection in lieu of the typical 3 on the thumper/thors) | [production] | 
            
  | 15:50 | <RobH> | srv118 reinstalled | [production] | 
            
  | 15:46 | <RobH> | srv136 is borked.  Even after reinstall, it will run for a few minutes, then lock hard.  Going to RMA it. | [production] | 
            
  | 15:38 | <RobH> | reinstalled srv136 and srv118 cuz they were pissing me off (a valid reinstallation reason if there ever was one.) | [production] | 
            
  | 15:09 | <RobH> | and srv118 back down, thing is borked. | [production] | 
            
  | 15:06 | <RobH> | srv118 back online and serving requests. | [production] | 
            
  | 15:01 | <RobH> | pushed db13 back into cluster, same with db14, from yesterday's work | [production] | 
            
  | 14:26 | <RobH> | srv101 back online and in lvs | [production] | 
            
  | 14:15 | <RobH> | reinstalled srv101, installing wikimedia-task-app packages now | [production] | 
            
  | 06:37 | <JeLuF> | rebooted db18. Mysqld was stuck but couldn't be killed. | [production] | 
            
  | 04:08 | <Tim> | migrated all locked wikis from $wgReadOnly(File) to permissions-based locking, so that stewards can edit the alternate project links, and so that various MediaWiki components don't break on page view | [production] | 
            
  | 03:57 | <river> | set up ms3/ms4 with solaris 10 update 6 | [production] | 
            
  
    | 2009-01-07 |
    
  | 22:50 | <RobH> | db13 and db14 are replicating but not in the cluster (not sure if they are caught up) | [production] | 
            
  | 22:35 | <RobH> | updated power strip information for ps1-a1-sdtpa and balanced load | [production] | 
            
  | 22:35 | <RobH> | reseated mrj cable for csw1-sdtpa_1/13 | [production] | 
            
  | 21:36 | <RobH> | started up db13 and db14 | [production] | 
            
  | 21:19 | <RobH> | updating firmware on db13-db14 | [production] | 
            
  | 21:15 | <RobH> | shutdown db13 and db14 to fix lom lockup issue. | [production] | 
            
  | 20:52 | <RobH> | depooled db13 and db14 in db.php to reboot them and fix the SP lockup issue. | [production] | 
            
  | 20:49 | <RobH> | updating firmware on db16. | [production] | 
            
  | 20:43 | <RobH> | started mysql back up on db15 | [production] | 
            
  | 20:42 | <RobH> | cold reset of db16 to resolve lom issue.  will update firmware upon boot. | [production] | 
            
  | 20:39 | <RobH> | swapped hostnames on ms3 and ms4, updated racktables and dns to reflect change | [production] | 
            
  | 20:24 | <brion> | disabled wikidiff2 since it's not installed, and this apparently is nicely broken | [production] | 
            
  | 20:21 | <RobH> | db15 now responsive to lom and ready to be re-integrated into the cluster | [production] | 
            
  | 20:12 | <RobH> | db15 cold reset fixes the LOM non-responsive issue.  Upgrading its firmware to prevent future issues. | [production] | 
            
  | 20:06 | <brion> | removed stray whitespace from wikitech config file which was breaking rss feeds | [production] | 
            
  | 19:23 | <mark> | Possibility that esams LVS was overloaded, split over 2 boxes (fuchsia & mint) | [production] | 
            
  | 19:19 | <RobH> | ms3 and ms4 are accessible via LOM and ready for setup/deployment | [production] | 
            
  | 19:05 | <RobH> | updated dns for ms3-ms5, updated dns for management for all media servers. | [production] | 
            
  | 19:05 | <RobH> | nope  ;_; | [production] | 
            
  | 19:04 | <RobH> | did domas update the bot for the new wikitech? | [production] | 
            
  | 19:03 | <brion> | touching MessagesZh.php and re-trying scap; may not have properly updated | [production] | 
            
  | 17:40 | <brion-plague> | scapping -- merged r45507 zh specialpage alias fix to live. also r45499 (revert of Cite error thingy) seems to already have been merged | [production] | 
            
  | 13:58 | <Tim> | ran updateAutoPromote.php on all flaggedRevs wikis | [production] | 
            
  | 13:41 | <Tim> | scap | [production] |