| 2011-04-03 |
    
  | 05:29 | <Ryan_Lane> | rebooting ms4 with -d to get a coredump | [production] | 
            
  | 05:14 | <apergos> | re-enabling webserver on ms4 for testing | [production] | 
            
  | 04:45 | <apergos> | stopping web service on ms4 for the moment | [production] | 
            
  | 04:29 | <apergos> | shot webserver again | [production] | 
            
  | 04:26 | <apergos> | turned off hourly snaps on ms4, turned back on webserver and nfs | [production] | 
            
  | 04:09 | <apergos> | rebooted ms4, shut down webserver and nfsd temporarily for testing | [production] | 
            
  | 02:58 | <apergos> | still looking at kernel memory issues, still rebooting, ryan should be here in a few minutes to help out | [production] | 
            
  | 02:03 | <apergos> | a solaris advisor... also have zfs arc cache max set to 2g which is ridiculously low but wtf right? | [production] | 
            
  | 02:02 | <apergos> | set tcp_time_wait_interval to 10000 at suggestion of | [production] | 
            
  | 01:37 | <apergos> | lowered zfs arc max to 2g (someone should reset this later)... will take effect on next reboot | [production] | 
            
  | 00:29 | <apergos> | rebooting with the new zfs arc cache max value, which will reduce the min value as well... dunno if this will give us enough breathing room or not | [production] | 
            
  | 00:24 | <apergos> | set zfs arc cache to ridiculously low value of 4gb, since when it's healthy it's using much less than that (1gb), this will take effect on reboot | [production] | 
            
  | 00:22 | <Reedy> | Still experiencing MS4 issues, thumb service is likely to be problematic for most users | [production] | 
            
  
    | 2011-04-02 |
    
  | 23:47 | <apergos> | rebooting ms4 from serial console, out to lunch and took the renderers down too | [production] | 
            
  | 18:42 | <catrope> | synchronized php-1.17/wmf-config/CommonSettings.php  'Per NeilK, change Category:Uploaded_by_UploadWizard to Category:Uploaded_with_UploadWizard' | [production] | 
            
  | 17:59 | <mark> | Upgrading varnish to 2.1.5 | [production] | 
            
  | 17:14 | <demon> | synchronized php-1.17/includes/filerepo/LocalFile.php  'r85200' | [production] | 
            
  | 14:19 | <mark> | Implemented CARP weights for distant CARP parents on squid configurator (used to be all equal before) | [production] | 
            
  | 11:36 | <mark> | Created btrfs filesystem on ms6, striped (raid10 style) over 46 devices - very experimental | [production] | 
            
  | 09:50 | <mark> | Reinstalling ms6 with Ubuntu 10.04 | [production] | 
            
  | 09:50 | <mark> | Fixed torrus again | [production] | 
            
  | 06:02 | <mark> | !wikipedia The image thumbnail servers appear stable now | [production] | 
            
  | 04:59 | <mark> | Increased nginx worker processes from 1 to 4, set file limit to 30k | [production] | 
            
  | 04:40 | <mark> | !wikipedia Image Thumbnail server outage, it's being worked on | [production] | 
            
  | 04:34 | <mark> | Power cycling ms4 again | [production] | 
            
  | 04:06 | <mark> | Power cycled ms4 again | [production] | 
            
  | 04:02 | <mark> | Removed ms4 from pmtpa.upload config, sending all thumbs to ms5 | [production] | 
            
  | 03:47 | <mark> | Restarted rsyncs ms4->ms5 | [production] | 
            
  | 03:25 | <Ryan_Lane> | powercycling ms4 again | [production] | 
            
  | 02:59 | <Ryan_Lane> | rebooting ms4 | [production] | 
            
  | 02:46 | <Ryan_Lane> | seems ms4 is totally dead, powercycling it | [production] | 
            
  | 01:09 | <Ryan_Lane> | installing python-pyinotify on spence for an updated ircecho | [production] | 
            
  
    | 2011-04-01 |
    
  | 21:35 | <Ryan_Lane> | purging some binlogs on db9 to free up space | [production] | 
            
  | 21:35 | <RobH> | bugzilla now version 4 | [production] | 
            
  | 21:31 | <RobH> | taking down bugzilla for a quick upgrade | [production] | 
            
  | 18:48 | <Ryan_Lane> | added ctwoo, brion, py, and reedy to the engineering alias | [production] | 
            
  | 18:36 | <mark> | Deployed ms5.pmtpa.wmnet as a special 'apache' for pmtpa squid uploads... now serving a small portion of commons thumbs | [production] | 
            
  | 18:11 | <RobH> | bugzilla back online, CRproxy was affected, and repaired | [production] | 
            
  | 17:30 | <RobH> | bugzilla.wikimedia.org going offline for database backup and upgrade | [production] | 
            
  | 17:13 | <RobH> | beginning upgrade process for bugzilla, its availability will be in question during this time | [production] | 
            
  | 16:59 | <mark> | Turned off ETag in the webserver7 configuration (/opt/webserver7/https-ms4/config/obj.conf) on ms4 | [production] | 
            
  | 16:50 | <notpeter> | rm-ing old binlogs on db9 after confirming that there is no slave lag on db10 | [production] | 
            
  | 15:53 | <mark> | Puppetised nginx and htcp purger setup on ms5 | [production] | 
            
  | 11:36 | <apergos> | restarted lighty on dataset2 (but why did it die?) | [production] | 
            
  | 00:06 | <tstarling> | synchronized php-1.17/includes/specials/SpecialImport.php  'r85099' | [production] |