| 
      
        2014-07-30
      
     | 
  
    
  | 19:32 | 
  <bd808> | 
  Disabled puppet on deployment-mediawiki02 for the same reason | 
  [releng] | 
            
  | 19:31 | 
  <bd808> | 
  Disabled puppet on deployment-mediawiki01; Ori will look into hhvm config changes that were being applied | 
  [releng] | 
            
  | 16:52 | 
  <bd808> | 
  Fixed beta-scap-eqiad Jenkins job by correcting ssh problems in beta project | 
  [releng] | 
            
  | 16:43 | 
  <bd808> | 
  Fixed ssh to jobrunner01 and videoscaler01 by correcting unrelated puppet manifest problem and forcing run via salt. | 
  [releng] | 
            
  | 16:00 | 
  <bd808> | 
  Puppet runs on videoscaler01 and jobrunner01 failing for "Could not find dependency Ferm::Rule[bastion-ssh] for Ferm::Rule[deployment-bastion-scap-ssh]" | 
  [releng] | 
            
  | 16:00 | 
  <bd808> | 
  Puppet seems manually disabled on apache0[12]. | 
  [releng] | 
            
  | 15:59 | 
  <bd808> | 
  Can't ssh to apache0[12], videoscaler01 and jobrunner01. Puppet not running on any of them. libnss-ldapd unattended update has broken /etc/nslcd.conf | 
  [releng] | 
            
  | 15:23 | 
  <bd808> | 
  Removed cherry-pick for Iac547efa83cf059a1276b6e279c3ebd4c7224b2c and updated cherry-pick for I5afba2c6b0fbf90ff8495cc4a82f5c7851893b52 to latest patch set. | 
  [releng] | 
            
  | 15:05 | 
  <bd808> | 
  Two cherry-picks in puppet conflicting with merged production changes: I5afba2c6b0fbf90ff8495cc4a82f5c7851893b52 and Iac547efa83cf059a1276b6e279c3ebd4c7224b2c (ori, twentyafterfour) | 
  [releng] | 
            
  | 14:49 | 
  <bd808> | 
  Started apache2 service on deployment-mediawiki01 | 
  [releng] | 
            
  | 14:16 | 
  <hashar> | 
  rebooting hhvm  | 
  [releng] | 
            
  | 09:42 | 
  <hashar> | 
  bastion had broken puppet because deployment_server and zuul both declare the same python packages {{gerrit|150501}} | 
  [releng] | 
            
  | 09:40 | 
  <hashar> | 
  restoring on puppetmaster modules/mediawiki/templates/apache/apache2.conf.erb which got deleted somehow | 
  [releng] | 
            
  | 09:29 | 
  <hashar> | 
  Rebooting apache01/02 to see whether it fixes the ssh connection issue | 
  [releng] | 
            
  | 09:27 | 
  <hashar> | 
  manually started hhvm on mediawiki01 | 
  [releng] | 
            
  | 09:25 | 
  <hashar> | 
  rebooting deployment-mediawiki01; hhvm process went zombie | 
  [releng] | 
            
  | 09:23 | 
  <hashar> | 
  restarting hhvm on mediawiki01/02 | 
  [releng] | 
            
  | 09:05 | 
  <hashar_> | 
  Beta scap script broken since 6:30am UTC  https://integration.wikimedia.org/ci/job/beta-scap-eqiad/ | 
  [releng] | 
            
  
    | 
      
        2014-07-29
      
     | 
  
    
  | 22:56 | 
  <cscott> | 
  updated OCG to version aeb8623d6ebe41ae7c7e36c57844bd9ea8e6d595 | 
  [releng] | 
            
  | 21:02 | 
  <bd808> | 
  Converted deployment-sentry2.eqiad.wmflabs to use beta salt/puppet master | 
  [releng] | 
            
  | 19:14 | 
  <hashar> | 
  Removed all jobs from queue, restarted slave agent. Update jobs are coming back | 
  [releng] | 
            
  | 19:09 | 
  <hashar> | 
  deployment-bastion jenkins slave is stuck. Beta cluster is no longer updating code :-// | 
  [releng] | 
            
  | 15:58 | 
  <godog> | 
  restarted hhvm on deployment-mediawiki01 | 
  [releng] | 
            
  | 15:52 | 
  <godog> | 
  restarted hhvm on deployment-mediawiki02 | 
  [releng] | 
            
  | 15:50 | 
  <godog> | 
  installed libevent-dbg on deployment-mediawiki02 to capture an hhvm backtrace | 
  [releng] | 
            
  | 15:17 | 
  <bd808> | 
  _joe_ restarting hhvm on deployment-mediawiki01 | 
  [releng] | 
            
  | 15:00 | 
  <bd808> | 
  Apache stuck with 65 children on both deployment-mediawiki servers | 
  [releng] | 
            
  | 10:37 | 
  <hashar> | 
  Restarted hhvm on mediawiki{01,02} | 
  [releng] | 
            
  
    | 
      
        2014-07-28
      
     | 
  
    
  | 17:41 | 
  <bd808> | 
  Updated hhvm to latest 3.3-dev+20140728 build on deployment-mediawiki0[12] | 
  [releng] | 
            
  | 15:37 | 
  <manybubbles> | 
  rebuilding elasticsearch indexes to build a weighted all field that we'll try to use to improve performance | 
  [releng] | 
            
  | 15:32 | 
  <bd808> | 
  Restarted hhvm on deployment-mediawiki0[12]. All apache children were stuck waiting for hhvm to respond. | 
  [releng] | 
            
  | 15:20 | 
  <bd808> | 
  Restarted apache on deployment-mediawiki02. 65 children and non-responsive to requests. (same as mediawiki01) | 
  [releng] | 
            
  | 15:18 | 
  <bd808> | 
  Restarted apache on deployment-mediawiki01. 65 children and non-responsive to requests. | 
  [releng] | 
            
  | 14:23 | 
  <manybubbles> | 
  or not - looks like I can't! | 
  [releng] | 
            
  | 14:22 | 
  <manybubbles> | 
  rebuilding cirrus search indexes to pick up a sped-up all field | 
  [releng] | 
            
  | 08:30 | 
  <hashar> | 
  restarted varnish on deployment-cache-bits01. Hoping to clear bits cache | 
  [releng] |