| 2009-02-23 |
  | 22:56 | <RobH_> | srv136 reinstalled and redeployed as apache | [production] | 
            
  | 22:33 | <mark> | Installed puppet (test install) and thereby automatically gmond as aggregator on srv33 | [production] | 
            
  | 21:16 | <domas> | adler has disk with media errors (ID:5, 6th disk in array): http://p.defau.lt/?3_7_6aIatj3DeNBw_jjtBg - needs cannibalized samuel, disk replacement, and ubuntu install on raid10 | [production] | 
            
  | 19:04 | <Rob> | srv136 back from repairs, reinstalling as apache server | [production] | 
            
  | 18:44 | <Rob> | srv217 not running apache, synced and restarted | [production] | 
            
  | 18:29 | <Rob> | srv33 reinstalled to ubuntu and deployed as apache server | [production] | 
            
  | 18:24 | <Rob> | srv32 reinstalled to ubuntu and deployed as apache server | [production] | 
            
  | 17:55 | <Rob> | reinstalling srv32 to ubuntu | [production] | 
            
  | 17:38 | <Rob> | resynced and restarted apache on srv32, srv33, srv34 | [production] | 
            
  | 17:32 | <Rob> | srv31 powered back up | [production] | 
            
  | 17:25 | <Rob> | found a breaker flip in the DC, affects srv31-srv34 | [production] | 
            
  | 13:40 | <domas> | oh, btw folks, kudos on perfect web2.0 engineering, now morebots complains when message is longer than 140 bytes, and we end up without our microblogging syndication | [production] | 
            
  | 13:39 | <domas> | added "su -m 'www-data' -c 'find /opt/mwlib/var/cache/ -mindepth 3 -mtime +1 -delete'" to pdf1 crontab, does anyone actually look after this service? | [production] | 
            
  | 12:57 | <Tim> | deployed r47704, now command line scripts don't access /home anymore | [production] | 
            
  | 11:37 | <Tim> | switched archive directory over to /mnt/upload5, starting another rsync. Some files will be missing until the rsync is done | [production] | 
            
  | 10:07 | <Tim> | moved all job runners from the previous ad hoc script to the new wikimedia-job-runner package | [production] | 
            
  | 06:25 | <Tim> | moved the nagios plugins for fedora from /home/nagios to /h/w/common/nagios-fedora-plugins | [production] | 
            
  | 05:21 | <Tim> | started udp2log on db20, MW UDP logs were dead | [production] | 
            
  | 05:19 | <Tim> | killed errant jobs loop scripts still running on fedora servers | [production] | 
            
  | 04:36 | <Tim> | fixed the log directory for /etc/cron.d/mw-central-notice, killed the process that was in a tight loop trying to write to a stale NFS file handle | [production] | 
            
  | 04:28 | <Tim> | finished moving ExtensionDistributor working copy | [production] | 
            
  | 04:14 | <Tim> | moving ExtensionDistributor working directory from /home to /mnt/upload5 | [production] | 
            
  | 04:00 | <Tim> | private/archive/wikipedia was in fact not migrated, but an initial rsync was done. I will do a second rsync now. | [production] | 
            
  | 03:42 | <Tim> | rsync done, uploads re-enabled, b/c symlinks set up | [production] | 
            
  | 03:37 | <Tim> | doing rsync | [production] | 
            
  | 03:31 | <Tim> | temporarily disabled file uploads on all private wikis, for migration to ms1 | [production] | 
            
  | 02:50 | <Tim> | same for commons ForeignDBViaLBRepo directory, ScanSet directory, CentralNotice directory | [production] | 
            
  | 02:44 | <Tim> | fixed CommonSettings.php location of deleted images, upload3 -> upload5, appears to have been moved already | [production] | 
            
  
| 2009-02-21 |
  | 19:49 | <mark> | Installed gmond on eiximenis | [production] | 
            
  | 19:02 | <domas> | db26 lacks 8g of ram :) | [production] | 
            
  | 19:00 | <mark> | Restarted stuck apache on srv217 | [production] | 
            
  | 17:26 | <mark> | Started apache on srv218-221 | [production] | 
            
  | 17:24 | <mark> | Restarted stuck apache on srv217 | [production] | 
            
  | 17:07 | <mark> | Squid/kernel upgrade complete | [production] | 
            
  | 16:46 | <mark> | Increased max-connections per upload squid to ms1 to 100 | [production] | 
            
  | 15:58 | <mark> | Running automated upgrade/reboot of squid and kernel on sq43-47 | [production] | 
            
  | 15:58 | <mark> | Upgraded squid and kernel on sq41-42, sq48-50, and rebooted | [production] | 
            
  | 15:44 | <mark> | Upgraded squid and kernel on sq36-40, and rebooted | [production] | 
            
  | 12:55 | <river> | fixed reverse dns entries for ms3/ms4, which had got swapped somehow | [production] | 
            
  | 11:55 | <Tim> | re-enabled ExtensionDistributor | [production] | 
            
  | 11:16 | <Tim> | removed syslog.0 and messages.0 on srv170 and srv176, they had critical disk free on / | [production] | 
            
  | 03:25 | <Tim> | started apache on the image scaling servers | [production] | 
            
  | 02:51 | <brion> | ran sync-common on srv199 while i'm at it | [production] |