2011-04-25
23:33 <Ryan_Lane> added python-mwclient to lucid repo [production]
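For context, python-mwclient is a Python client library for the MediaWiki API. A minimal usage sketch follows; the wiki host and page title are placeholders rather than anything from this log, and older mwclient releases fetch page text with page.edit() instead of page.text().

    # Minimal sketch of python-mwclient usage; host and page title are placeholders.
    import mwclient

    site = mwclient.Site('test.wikipedia.org')  # connect to a MediaWiki API endpoint
    page = site.pages['Sandbox']                # look up a page object by title
    print(page.text())                          # print the page's current wikitext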
            
21:36 <RobH> storage2 still offline, won't boot into the OS, but is remotely accessible [production]
21:20 <RobH> trying to fix storage2 [production]
20:16 <notpeter> actually adding everyone on ops to watchmouse service... didn't know this had not already been done. [production]
20:02 <RobH> updated csw1 to remove labels and move ports 11/12, 11/14, 11/19, & 11/21 to the default vlan. These are the old connection ports for dataset2, tridge, ms1, and ms5 [production]
19:53 <RobH> the datacenter is looking awesome. [production]
19:45 <RobH> ms1 moved from temp network to permanent home, no downtime, responding fine [production]
19:42 <RobH> ms5 connection moved, no downtime, responds fine, less than 4 seconds [production]
19:40 <RobH> updated csw1-sdtpa 15/1,15/2 from vlan 105 to vlan 2, 15/3 and 15/4 from vlan 105 to 101 [production]
18:52 <RobH> snapshot4 relocated to new home, ready for os install [production]
18:42 <RobH> db19 and db20 back online (not in services as they have other issues) [production]
18:39 <RobH> db19 and db20 powering back up [production]
18:25 <RobH> virt4 experienced an accidental reboot when rebalancing power in the rack, my fault, not the hardware [production]
18:12 <RobH> rack b2 power rebalanced [production]
18:01 <RobH> db19 set to slave, depooled in db.php, no other services evident, shutting down (mysql stopped cleanly) [production]
18:00 <RobH> db20 shutdown [production]
18:00 <RobH> didn't log that I set up ports 11/38-40 for db19, db20, and snapshot4 on csw1-sdtpa. Tested out fine, and all my major configuration changes on the network should be complete [production]
17:56 <RobH> ok, db20 and db19 are going offline to be moved to a new rack location due to power distribution issues [production]
15:47 <RobH> delay, not coming down yet, need more cables [production]
15:46 <RobH> db19 is coming down as well, it is depooled anyhow [production]
15:46 <RobH> db20 is coming down, ganglia aggregation for those hosts may be delayed until it is back online. [production]
15:21 <RobH> relocating snapshot4 into rack c2, it will be offline during this process [production]
15:20 <RobH> db43-db47 network setup, sites not down, yay me [production]
15:10 <RobH> being on csw1 makes robh nervous. [production]
15:09 <RobH> labeling and setting up ports on 11/33 through 11/37 on csw1-sdtpa for db43 through db47 [production]
14:47 <RobH> fixed storage2 serial console (set it to higher rate, magically works, or it just fears me) and also confirmed its remote power control is functioning [production]
14:42 <RobH> stealing dataset1's known good scs connection to test storage2. dataset1 service will remain unaffected. [production]
  
2011-04-23
22:31 <RobH> required even. [production]
22:31 <RobH> no drives display error leds, further investigation required [production]
22:27 <RobH> ms2 is having a bad drive investigated. If we do this right, it won't go down; if we don't, it will. It is a slave ES server. [production]
22:00 <RobH> singer returned to operation, blog, techblog, survey, and secure returned to normal operation [production]
21:52 <RobH> singer is once again coming back down for drive replacement. This will take offline blog.wikimedia.org, techblog.wikimedia.org, survey.wikimedia.org, and secure.wikipedia.org. Service will be returned as soon as possible. [production]
21:19 <RobH> singer back online, for a while, will come back down for further repair shortly. [production]
21:05 <RobH> singer going down, blogs will be offline, so will secure, system will return to service as soon as possible [production]
21:00 <RobH> preparing to fix the dead drive in singer, this will take secure, blog, techblog, and survey offline during the drive replacement process [production]
19:50 <mark> Upgrading mr1-pmtpa to junos 10.4R3.4 [production]
17:49 <RobH> migrating searchidx1 & search1-search10 to new ports in same rack. Moving one at a time and ensuring link lights between moves. (already tested with search10) [production]
14:11 <RobH> db19 is back online, seems to not have any mysql setup done. [production]
14:02 <RobH> restarting db19 [production]
14:02 <RobH> arcconf checks out; all drives on db19 are indeed working, as rich found earlier [production]
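As an illustration of that kind of check, the sketch below shells out to arcconf and flags physical drives whose reported State is not Online. The controller number and the exact output format are assumptions about a typical Adaptec setup, not details recorded in the log.

    # Hypothetical helper: run arcconf and flag drives not reported as Online.
    # Controller number 1 is an assumption for the example.
    import subprocess

    def drives_not_online(controller=1):
        out = subprocess.run(
            ['arcconf', 'GETCONFIG', str(controller), 'PD'],
            capture_output=True, text=True, check=True,
        ).stdout
        # arcconf lists each physical device with a "State : ..." line.
        return [l.strip() for l in out.splitlines()
                if l.strip().startswith('State') and 'Online' not in l]

    if __name__ == '__main__':
        print(drives_not_online() or 'all drives report Online')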
            
12:47 <mark> Added (x121Address=1) condition to the LDAP query of the ldap_aliases router on mchenry's exim [production]
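To show what that extra condition means, here is a rough Python sketch of an equivalent directory lookup: only alias entries that also carry x121Address=1 match the AND filter. The LDAP server, base DN, and mail attribute are placeholders, and the real change lives in the exim router configuration rather than in Python.

    # Rough illustration of ANDing (x121Address=1) into an alias lookup.
    # Server URL, base DN, and the mail attribute are placeholders.
    import ldap  # python-ldap

    conn = ldap.initialize('ldap://ldap.example.org')
    conn.simple_bind_s()  # anonymous bind for the example

    filt = '(&(mail=someone@example.org)(x121Address=1))'
    for dn, attrs in conn.search_s('ou=aliases,dc=example,dc=org',
                                   ldap.SCOPE_SUBTREE, filt):
        print(dn, attrs)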
            
00:32 <hcatlin> Mobile: Deploying fix to an issue that kept the standard-style Main_Page from displaying on mobile [production]
            
00:25 <Ryan_Lane> restarting memcached on all of the mobile servers [production]
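A rough sketch of that kind of fleet-wide restart is below; the host names and the use of ssh with an init script are assumptions for illustration, since the log entry does not say how the restart was carried out.

    # Hypothetical example: restart memcached over ssh on a list of hosts.
    # Host names and the init-script path are placeholders, not from the log.
    import subprocess

    MOBILE_SERVERS = ['mobile1.example.org', 'mobile2.example.org']  # placeholder list

    for host in MOBILE_SERVERS:
        print('restarting memcached on', host)
        subprocess.run(['ssh', host, 'sudo /etc/init.d/memcached restart'], check=True)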