| 
      
        2019-07-29
      
     | 
  
    
  | 13:09 | 
  <aborrero@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 13:01 | 
  <elukey@cumin1001> | 
  START - Cookbook sre.druid.roll-restart-workers | 
  [production] | 
            
  | 12:45 | 
  <marostegui@deploy1001> | 
  Synchronized wmf-config/db-eqiad.php: Provision db2128 into s5 api T221533 (duration: 00m 47s) | 
  [production] | 
            
  | 12:45 | 
  <marostegui> | 
  Provision db2128 into s5 codfw - T228969 | 
  [production] | 
            
  | 12:44 | 
  <marostegui@deploy1001> | 
  Synchronized wmf-config/db-codfw.php: Provision db2128 into s5 api T221533 (duration: 00m 47s) | 
  [production] | 
            
  | 12:39 | 
  <arturo> | 
  T228870 reboot cloudvirt1005.eqiad.wmnet for kernel updates | 
  [production] | 
            
  | 12:38 | 
  <aborrero@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | 
  [production] | 
            
  | 12:38 | 
  <aborrero@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 12:20 | 
  <arturo> | 
  T228870 reboot cloudvirt1004.eqiad.wmnet for kernel updates | 
  [production] | 
            
  | 12:20 | 
  <aborrero@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | 
  [production] | 
            
  | 12:20 | 
  <aborrero@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 11:58 | 
  <arturo> | 
  T228870 reboot cloudvirt1003.eqiad.wmnet for kernel updates | 
  [production] | 
            
  | 11:57 | 
  <aborrero@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | 
  [production] | 
            
  | 11:57 | 
  <aborrero@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 11:36 | 
  <arturo> | 
  icinga downtime toolschecker for 6h | 
  [production] | 
            
  | 11:31 | 
  <arturo> | 
  T228870 reboot cloudvirt1002.eqiad.wmnet for kernel updates | 
  [production] | 
            
  | 11:31 | 
  <aborrero@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | 
  [production] | 
            
  | 11:31 | 
  <aborrero@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 11:14 | 
  <arturo> | 
  T228870 reboot cloudvirt1001.eqiad.wmnet for kernel updates | 
  [production] | 
            
  | 11:14 | 
  <aborrero@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | 
  [production] | 
            
  | 11:13 | 
  <aborrero@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 11:13 | 
  <aborrero@cumin1001> | 
  END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) | 
  [production] | 
            
  | 11:13 | 
  <aborrero@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 11:11 | 
  <dcausse> | 
  EU SWAT done | 
  [production] | 
            
  | 11:10 | 
  <dcausse@deploy1001> | 
  Synchronized wmf-config/SearchSettingsForWikidata.php: [cirrus] Use correct factory declaration for EntityFullTextQueryBuilder (duration: 00m 47s) | 
  [production] | 
            
  | 10:37 | 
  <jdrewniak@deploy1001> | 
  Synchronized portals: Wikimedia Portals Update: [[gerrit:526125| Bumping portals to master (T128546)]] (duration: 00m 47s) | 
  [production] | 
            
  | 10:36 | 
  <jdrewniak@deploy1001> | 
  Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:526125| Bumping portals to master (T128546)]] (duration: 00m 47s) | 
  [production] | 
            
  | 09:49 | 
  <marostegui> | 
  Add db2128 to tendril and zarcillo - T228969 | 
  [production] | 
            
  | 09:24 | 
  <elukey@cumin1001> | 
  END (FAIL) - Cookbook sre.druid.roll-restart-workers (exit_code=99) | 
  [production] | 
            
  | 09:22 | 
  <elukey@cumin1001> | 
  START - Cookbook sre.druid.roll-restart-workers | 
  [production] | 
            
  | 09:21 | 
  <elukey@cumin1001> | 
  END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) | 
  [production] | 
            
  | 08:55 | 
  <elukey@cumin1001> | 
  START - Cookbook sre.druid.roll-restart-workers | 
  [production] | 
            
  | 08:51 | 
  <root@> | 
  helmfile [STAGING] Ran 'apply' command on namespace 'kube-system' for release 'calico-policy-controller' . | 
  [production] | 
            
  | 08:47 | 
  <elukey> | 
  set mcrouter async behavior for codfw replication to all mw app/api servers (changes will be picked up when puppet runs on the hosts) - T225642 | 
  [production] | 
            
  | 08:35 | 
  <godog> | 
  temp stop puppet on cp hosts to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/525259 | 
  [production] | 
            
  | 08:32 | 
  <elukey@cumin1001> | 
  END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97) | 
  [production] | 
            
  | 08:32 | 
  <elukey@cumin1001> | 
  START - Cookbook sre.hadoop.roll-restart-workers | 
  [production] | 
            
  | 08:16 | 
  <marostegui> | 
  Drop abuse_filter_log.afl_log_id in s7 eqiad - T226851 | 
  [production] | 
            
  | 07:49 | 
  <dcausse> | 
  elastic@eqiad force recovery of failed shards (eswiki stuck) | 
  [production] | 
            
  | 07:30 | 
  <marostegui@deploy1001> | 
  Synchronized wmf-config/db-eqiad.php: Remove db2038 from config T221533 (duration: 00m 46s) | 
  [production] | 
            
  | 07:29 | 
  <marostegui@deploy1001> | 
  Synchronized wmf-config/db-codfw.php: Remove db2038 from config T221533 (duration: 00m 50s) | 
  [production] | 
            
  | 07:18 | 
  <elukey@cumin1001> | 
  END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) | 
  [production] | 
            
  | 06:45 | 
  <akosiaris> | 
  poweroff orespoolcounter{1,2}00{1,2} for removal T227640 | 
  [production] | 
            
  | 06:37 | 
  <_joe_> | 
  restarted php7.2 on mwdebug1002, low opcache | 
  [production] | 
            
  | 06:36 | 
  <_joe_> | 
  restarted coherence report on netmon1002, it failed earlier this morning | 
  [production] | 
            
  | 06:31 | 
  <_joe_> | 
  restarting nrpe on restbase-dev1006 T224260 | 
  [production] | 
            
  | 06:30 | 
  <elukey@cumin1001> | 
  START - Cookbook sre.hadoop.roll-restart-workers | 
  [production] | 
            
  | 05:33 | 
  <marostegui@deploy1001> | 
  Synchronized wmf-config/db-eqiad.php: Depool db1104 in preparation for Tuesday 30th failover in s8 (duration: 00m 54s) | 
  [production] | 
            
  | 05:18 | 
  <marostegui> | 
  Drop abuse_filter_log.afl_log_id from s7 codfw with replication (this will cause lag in s7 codfw) - T226851 | 
  [production] | 
            
  | 05:05 | 
  <marostegui> | 
  Remove db1072 from tendril and zarcillo T228956 | 
  [production] |