| 
      
        2017-07-05
      
     | 
  
    
  | 14:19 | 
  <moritzm> | 
  rebooting logstash100[4-6] for kernel update | 
  [production] | 
            
  | 14:00 | 
  <moritzm> | 
  rebooting logstash100[1-3] for kernel update | 
  [production] | 
            
  | 13:59 | 
  <ema> | 
  cache_misc: upgrade to varnish 4.1.7-1wm1 and reboot for kernel update | 
  [production] | 
            
  | 13:48 | 
  <apergos> | 
  re-enabling puppet on snapshot1001, 1005 for testing | 
  [production] | 
            
  | 13:46 | 
  <moritzm> | 
  rebooting restbase1011 for kernel update | 
  [production] | 
            
  | 13:44 | 
  <zeljkof> | 
  EU SWAT finished! | 
  [production] | 
            
  | 13:43 | 
  <zfilipin@tin> | 
  Synchronized wmf-config/Wikibase-production.php: SWAT: [[gerrit:362986|Set Wikibase readFullEntityIdColumn setting to false]] (duration: 00m 42s) | 
  [production] | 
            
  | 13:35 | 
  <zfilipin@tin> | 
  Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:363043|Enable WikiLove for ckbwiki (T169563)]] (duration: 00m 43s) | 
  [production] | 
            
  | 13:24 | 
  <zfilipin@tin> | 
  Synchronized dblists/closed.dblist: SWAT: [[gerrit:361686|Reopen nlwikinews (T168764)]] (duration: 02m 50s) | 
  [production] | 
            
  | 13:21 | 
  <jmm@puppetmaster1001> | 
  conftool action : set/pooled=inactive; selector: mw1196.eqiad.wmnet | 
  [production] | 
            
  | 13:18 | 
  <apergos> | 
  power cycled dataset1001, crashed, unresponsive on mgmt console | 
  [production] | 
            
  | 13:18 | 
  <zfilipin@tin> | 
  Synchronized dblists/closed.dblist: SWAT: [[gerrit:361686|Reopen nlwikinews (T168764)]] (duration: 02m 50s) | 
  [production] | 
            
  | 13:16 | 
  <elukey> | 
  reboot conf2001 for kernel updates | 
  [production] | 
            
  | 13:09 | 
  <moritzm> | 
  rebooting restbase1010 for kernel update | 
  [production] | 
            
  | 12:49 | 
  <marostegui> | 
  Force BBU relearn on db1016 - T166344 | 
  [production] | 
            
  | 12:36 | 
  <marostegui> | 
  Move labsdb1010 main general replication thread to a named replication thread called db1095 - T153743 | 
  [production] | 
            
  | 12:33 | 
  <marostegui> | 
  Stop all replication threads on db1095 for maintenance - T153743 | 
  [production] | 
            
  | 12:32 | 
  <marostegui@tin> | 
  Synchronized wmf-config/db-eqiad.php: Repool db1085 - T153743 (duration: 02m 49s) | 
  [production] | 
            
  | 12:29 | 
  <marostegui@tin> | 
  Synchronized wmf-config/db-eqiad.php: Repool db1051 - T168661 (duration: 02m 50s) | 
  [production] | 
            
  | 12:16 | 
  <marostegui@tin> | 
  Synchronized wmf-config/db-eqiad.php: Depool db1051 - T168661 (duration: 02m 51s) | 
  [production] | 
            
  | 12:11 | 
  <apergos> | 
  puppet is currently disabled again on snapshots 1,5,6,7 and on dataset1001; we saw the same nfs issue shortly after reboot, with no dump processes going, as snapshots 5,6,7 had not remounted the filesystem | 
  [production] | 
            
  | 11:20 | 
  <moritzm> | 
  rebooting wtp2* servers for kernel update | 
  [production] | 
            
  | 11:14 | 
  <moritzm> | 
  rebooting restbase1009 for kernel update | 
  [production] | 
            
  | 10:56 | 
  <hashar> | 
  restarting Jenkins for plugin upgrades | 
  [production] | 
            
  | 10:45 | 
  <marostegui@tin> | 
  Synchronized wmf-config/db-eqiad.php: Repool db1072 - T168661 (duration: 02m 59s) | 
  [production] | 
            
  | 10:41 | 
  <marostegui> | 
  Run redact_sanitarium on s6 databases db1102 - T153743 | 
  [production] | 
            
  | 10:41 | 
  <moritzm> | 
  rebooting wtp1001 for kernel update | 
  [production] | 
            
  | 10:37 | 
  <moritzm> | 
  rebooting restbase1008 for kernel update | 
  [production] | 
            
  | 10:32 | 
  <apergos> | 
  rebooting snapshot hosts to clean up hung nfs client processes | 
  [production] | 
            
  | 10:30 | 
  <marostegui@tin> | 
  Synchronized wmf-config/db-eqiad.php: Repool db1072 - T168661 (duration: 02m 51s) | 
  [production] | 
            
  | 10:24 | 
  <apergos> | 
  rebooted dataset1001 to unstick nfsd and pick up new kernel, re-enabled puppet | 
  [production] | 
            
  | 10:14 | 
  <marostegui@tin> | 
  Synchronized wmf-config/db-eqiad.php: Repool db1066 - T168661 (duration: 02m 50s) | 
  [production] | 
            
  | 10:11 | 
  <moritzm> | 
  rebooting restbase1007 for kernel update | 
  [production] | 
            
  | 10:01 | 
  <marostegui@tin> | 
  Synchronized wmf-config/db-eqiad.php: Depool db1066 - T168661 (duration: 02m 50s) | 
  [production] | 
            
  | 09:57 | 
  <marostegui> | 
  Deploy alter table on s1 eqiad hosts - T168661 | 
  [production] | 
            
  | 09:48 | 
  <godog> | 
  move 'instances' graphite hierarchy out of the way, do not delete yet - T143405 | 
  [production] | 
            
  | 09:27 | 
  <marostegui> | 
  Stop MySQL on db1085 for maintenance - T153743 | 
  [production] | 
            
  | 09:21 | 
  <godog> | 
  upload nginx_1.11.10-1+wmf2 to jessie-wikimedia and nginx_1.11.10-1+wmf2~stretch1 to stretch-wikimedia | 
  [production] | 
            
  | 09:17 | 
  <marostegui@tin> | 
  Synchronized wmf-config/db-eqiad.php: Depool db1085 - T153743 (duration: 02m 50s) | 
  [production] | 
            
  | 08:44 | 
  <apergos> | 
  puppet disabled and processes accessing dataset1001 exported filesystem shot, on: stat1002,3, snapshot1001,5,6,7, while investigation continues | 
  [production] | 
            
  | 07:27 | 
  <moritzm> | 
  rebooting restbase-dev* for kernel update | 
  [production] | 
            
  | 07:13 | 
  <moritzm> | 
  rebooting notebook* hosts | 
  [production] | 
            
  | 05:18 | 
  <marostegui> | 
  Deploy alter table on s3 master - db1075 - T168661 | 
  [production] | 
            
  | 05:13 | 
  <marostegui> | 
  Deploy alter table on s7 master - db1062 - T168661 | 
  [production] | 
            
  | 05:08 | 
  <marostegui> | 
  Force a relearn on db1046's BBU - T166141 | 
  [production] | 
            
  | 02:27 | 
  <l10nupdate@tin> | 
  scap sync-l10n completed (1.30.0-wmf.7) (duration: 10m 23s) | 
  [production] |