| 2009-04-17 |
    
  | 16:48 | <azafred> | updated spamassassin rules on lily to include the SARE rules and mirror the settings on McHenry. | [production] | 
            
  | 10:25 | <tstarling> | synchronized robots.txt | [production] | 
            
  | 08:19 | <tstarling> | synchronized php-1.5/InitialiseSettings.php | [production] | 
            
  | 07:13 | <Tim> | temporarily killed apache on overloaded ES masters | [production] | 
            
  | 07:11 | <tstarling> | synchronized php-1.5/db.php  'zeroing read load on ES masters' | [production] | 
            
  | 06:04 | <Tim> | brief site-wide outage while db20 rebooted, reason unknown. All good now. Resuming logrotate. | [production] | 
            
  | 05:55 | <Tim> | db20 h/w reboot | [production] | 
            
  | 05:48 | <Tim> | shutting down daemons on db20 for pre-emptive reboot. Serial console shows "BUG: soft lockup - CPU#4 stuck for 11s! [rsync:27854]" etc. | [production] | 
            
  | 05:10 | <Tim> | on db20: killed logrotate -f half done due to alarming kswapd CPU (linked to deadlocked rsync processes). May need a reboot. | [production] | 
            
  | 05:00 | <Tim> | fixed logrotate on db20, broken since March 10 due to broken status file, most likely due to non-ASCII filenames generated by demux.py. Patched demux.py. Removed everything.log. | [production] | 
            
  | 02:14 | <river> | set up ms6.esams, copying /export/upload from ms1 | [production] | 
            
  | 00:24 | <Tim> | blocked lots of uci.edu IPs that were collectively doing 20 req/s of expensive API queries, overloading ES | [production] | 
            
  | 00:15 | <brion> | techblog post on Phorm opt-out is linked from slashdot; load on singer seems fairly stable. | [production] | 
            
  
    | 2009-04-16 |
    
  | 23:06 | <tfinc> | synchronized php-1.5/extensions/ContributionReporting/ContributionHistory_body.php | [production] | 
            
  | 22:48 | <azafred> | bounced apache on srv217. All threads were DED - dead | [production] | 
            
  | 22:16 | <tfinc> | synchronized php-1.5/extensions/ContributionReporting/ContributionHistory_body.php | [production] | 
            
  | 22:08 | <tfinc> | synchronized php-1.5/extensions/ContributionReporting/ContributionHistory_body.php | [production] | 
            
  | 17:41 | <domas> | fantastic. I start _looking_ at stuff and it fixes itself. | [production] | 
            
  | 17:35 | <midom> | synchronized php-1.5/includes/Revision.php  'live profiling hook' | [production] | 
            
  | 17:28 | <domas> | db20 has kswapd deadlock, needs reboot soonish | [production] | 
            
  | 17:18 | <midom> | synchronized php-1.5/InitialiseSettings.php  'disabled stats' | [production] | 
            
  | 17:15 | <midom> | synchronized php-1.5/InitialiseSettings.php  'enabling udp stats' | [production] | 
            
  | 16:18 | <azafred> | bounced apache on srv217 (no pid file so previous restart did not include this one) | [production] | 
            
  | 15:57 | <brion> | network borkage between Florida and Amsterdam. Visitors through AMS proxies can't reach sites. | [production] | 
            
  | 15:55 | <azafred> | bounced apache on srv[73,86,88,93,108,114,139,141,154,181,194,204,213,99] | [production] | 
            
  | 15:52 | <Tim-away> | started mysqld on srv98,srv122,srv124,srv142,srv106,srv107: done with them for now. srv102 still going. | [production] | 
            
  | 15:30 | <mark> | Set up ms6 with SP management at ms6.ipmi.esams.wikimedia.org | [production] | 
            
  | 14:13 | <mark> | Restoring traffic to Amsterdam cluster | [production] | 
            
  | 14:06 | <mark> | Reloading csw1-esams | [production] | 
            
  | 13:55 | <mark> | Reloading csw1-esams | [production] | 
            
  | 13:53 | <JeLuF> | ms1 NFS issues again. Might be load related | [production] | 
            
  | 13:49 | <Tim> | copying fedora ES data from ms3 to ms2 | [production] | 
            
  | 13:44 | <JeLuF> | ms1 is reachable, no errors logged, NFS daemons running fine. After some minutes, NFS clients were able to access the server again. Root cause unknown. | [production] | 
            
  | 13:38 | <JeLuF> | ms1 issues. On NFS slaves: "ls: cannot access /mnt/upload5/: Input/output error" | [production] | 
            
  | 13:24 | <mark> | DNS scenario knams-down for upcoming core switch reboot | [production] | 
            
  | 08:23 | <river> | pdns on bayle crashed, bindbackend parser seems rather fragile | [production] | 
            
  | 03:01 | <andrew> | synchronized php-1.5/InitialiseSettings.php  'Deployed AbuseFilter to ptwiki' | [production] | 
            
  
    | 2009-04-15 |
    
  | 22:42 | <tomaszf> | adding ramdisk to db9 to speed up create tmp tables | [production] | 
            
  | 22:34 | <mark> | PowerDNS got confused by a commented DNS entry and broke zone wikimedia.org, fixed | [production] | 
            
  | 22:32 | <brion-codereview> | DNS broken. mark's poking it | [production] | 
            
  | 22:24 | <mark> | Temporarily removed AAAA record from mayflower in DNS | [production] | 
            
  | 22:14 | <brion-codereview> | db9 tmpfs full, breaking anything using that db | [production] | 
            
  | 22:00 | <brion-codereview> | ipv6 connectivity broken between isidore & mayflower, breaking codereview SVN updates | [production] | 
            
  | 20:59 | <brion> | civicrm queries bogging down db9 affecting otrs performance. tom's looking into it | [production] | 
            
  | 18:24 | <robh> | synchronized php-1.5/InitialiseSettings.php  'for subpages on ukwikimedia' | [production] | 
            
  | 17:32 | <robh> | synchronized php-1.5/InitialiseSettings.php  'Bug 17898 Wiktionary is a bad interwiki prefix on ukwiktionary and mlwiktionary' | [production] | 
            
  | 17:25 | <robh> | synchronized php-1.5/InitialiseSettings.php  'per bug 17773 Install Labeled Section Transclusion for dewikiversity' | [production] | 
            
  | 14:33 | <robh> | synchronized php-1.5/InitialiseSettings.php  'Bug 17718 Disable CentralNotice on private/fishbowl wikis' | [production] | 
            
  | 14:29 | <robh> | synchronized php-1.5/InitialiseSettings.php  '18434 Enable the rollback feature on Commons' | [production] | 
            
  | 14:19 | <robh> | synchronized php-1.5/InitialiseSettings.php  '18307 Add autopatrolled group to English Wikisource' | [production] |