| 2012-01-03 |
    
  | 22:29 | <RobH> | added the lo address to lvs4; now it's working and generating thumbnails | [production] | 
            
  | 22:09 | <reedy> | synchronizing Wikimedia installation... : Push [[rev:107938|r107938]] [[rev:107948|r107948]] | [production] | 
            
  | 21:45 | <RobH> | ganglia graphs will have missing data for past 30 to 40 minutes | [production] | 
            
  | 21:45 | <RobH> | spence back online, ganglia and nagios confirmed operational | [production] | 
            
  | 21:38 | <RobH> | resetting spence and dropping to serial to try to fix it | [production] | 
            
  | 21:25 | <RobH> | nagios and ganglia down due to spence reboot, system still coming back online | [production] | 
            
  | 21:21 | <RobH> | spence is unresponsive to ssh and serial console, rebooting | [production] | 
            
  | 21:14 | <LeslieCarr> | resetting DRAC 5 on spence for management connectivity | [production] | 
            
  | 21:05 | <binasher> | that fixed it. but how did that happen? | [production] | 
            
  | 21:05 | <binasher> | ran ip addr add 10.2.1.22/32 label "lo:LVS" dev lo on lvs4 | [production] | 
            
  | 19:36 | <reedy> | synchronized php-1.18/skins/common/images/  '[[rev:107930|r107930]]' | [production] | 
            
  | 17:36 | <mutante> | killing more runJobs.php / nextJobDB.php processes on a bunch of servers (/home/catrope/badjobrunners) | [production] | 
            
  | 17:26 | <RoanKattouw> | Stopping job runners on the following DECOMMISSIONED servers: srv151 srv152 srv153 srv158 srv160 srv164 srv165 srv166 srv167 srv168 srv170 srv176 srv177 srv178 srv181 srv184 srv185 | [production] | 
            
  | 15:55 | <RobH> | torrus back, took forever to recompile | [production] | 
            
  | 15:53 | <reedy> | synchronized wmf-config/InitialiseSettings.php  'Bug 33485 - Enable WikiLove in si.wikipedia' | [production] | 
            
  | 15:52 | <Reedy> | Created wikilove tables on siwiki | [production] | 
            
  | 15:46 | <RobH> | torrus deadlocked, kicking | [production] | 
            
  | 14:00 | <RoanKattouw> | Restarting job runners on srv242 and mw25, those are the last ones that are stuck | [production] | 
            
  | 13:57 | <RoanKattouw> | Restarting all job runners that are stuck | [production] | 
            
  | 13:48 | <RoanKattouw> | Restarting job runner on srv236, seems to be stuck | [production] | 
            
  | 02:02 | <LocalisationUpdate> | completed (1.18) at Tue Jan  3 02:05:21 UTC 2012 | [production] | 
            
  
    | 2012-01-02 |
    
  | 23:36 | <Reedy> | Seems to potentially be an issue with job runners, enwiki backed up to over 90k over the last week or so. Needs investigating | [production] | 
            
  | 23:18 | <tstarling> | synchronized php-1.18/includes/parser/Parser.php  '[[rev:107856|r107856]]' | [production] | 
            
  | 22:58 | <tstarling> | synchronizing Wikimedia installation... : | [production] | 
            
  | 18:08 | <nikerabbit> | synchronized wmf-config/InitialiseSettings.php  'Bug 33368: WebFonts on bpywiki' | [production] | 
            
  | 18:05 | <nikerabbit> | synchronized php-1.18/languages/messages/  'i18ndeploy [[rev:107843|r107843]]' | [production] | 
            
  | 18:04 | <nikerabbit> | synchronized php-1.18/extensions/WebFonts/WebFonts.i18n.php  'i18ndeploy [[rev:107843|r107843]]' | [production] | 
            
  | 16:58 | <mutante> | installed SiteMap extension on Bugzilla - soon bugs should be googleable | [production] | 
            
  | 16:33 | <mutante> | upgraded Bugzilla from 4.0.2 to 4.0.3 (http://www.bugzilla.org/releases/4.0.3/release-notes.html#v40_point) (RT #2194) | [production] | 
            
  | 14:47 | <mutante> | cleaned out gammu spool to stop sms bomb - sorry. daemon runs again now though. | [production] | 
            
  | 14:36 | <mutante> | fixed gammu-smsd on spence per wikitech "Nagios#Fixing_the_USB_dongle" (sending out queued SMS now ) | [production] | 
            
  | 14:30 | <mutante> | puppet ran on spence, ganglia also seems ok despite the errors I logged before. gammu-smsd can't find device again though | [production] | 
            
  | 14:03 | <mutante> | spence / gmetad - RRD_update .. illegal attempt to update using time .. last update time is .. (minimum one second step) | [production] | 
            
  | 13:57 | <mutante> | gmond complains about missing kernel modules on spence when trying to start on boot | [production] | 
            
  | 13:54 | <mutante> | spence down, no ssh, no mgmt output, powercycling it .. | [production] | 
            
  | 02:01 | <LocalisationUpdate> | completed (1.18) at Mon Jan  2 02:04:47 UTC 2012 | [production] | 
            
  | 00:08 | <tstarling> | synchronized php-1.18/includes/media/SVGMetadataExtractor.php  '[[rev:107792|r107792]]' | [production] | 
            
  
    | 2012-01-01 |
    
  | 21:28 | <Ryan_Lane> | restarted pdns-recursor on dobson | [production] | 
            
  | 21:26 | <Ryan_Lane> | restarted pdns on ns2 about an hour ago | [production] | 
            
  | 09:46 | <apergos> | restarted lucene search on srch 10, 11,  then later on 3,4,9,1 | [production] | 
            
  | 09:35 | <apergos> | removed log.1 from /a/search/logs on search6, it was 35 GB | [production] | 
            
  | 03:55 | <Tim> | fixed broken package on search7 and search11 | [production] | 
            
  | 02:01 | <LocalisationUpdate> | completed (1.18) at Sun Jan  1 02:04:30 UTC 2012 | [production] | 
            
  | 01:36 | <Tim> | adjusted FD limit in /etc/init.d/lsearchd on all search servers with sed | [production] | 
            
  | 01:34 | <Tim> | increased FD limit on search6 and restarted lsearchd | [production] | 
            
  | 00:46 | <Tim> | removed some logs on search6 to fix /a disk space exhaustion | [production] |