| 
      
        2020-01-26
      
     | 
  
    
  | 21:42 | 
  <akosiaris> | 
  test depool maps1003 | 
  [production] | 
            
  | 21:42 | 
  <akosiaris@cumin1001> | 
  conftool action : set/pooled=no; selector: name=maps1003.* | 
  [production] | 
            
  | 21:38 | 
  <vgutierrez> | 
  powercycling cp3051 - T238305 | 
  [production] | 
            
  | 21:23 | 
  <akosiaris> | 
  restart kartotherian on maps1002 | 
  [production] | 
            
  | 21:19 | 
  <vgutierrez> | 
  restart varnish-fe and ats-tls on cp3056 | 
  [production] | 
            
  | 21:02 | 
  <bblack> | 
  ats-tls-restart on cp3064 | 
  [production] | 
            
  | 20:51 | 
  <bblack> | 
  esams text caches: reverting earlier sysctl mitigations | 
  [production] | 
            
  | 18:11 | 
  <volans> | 
  shutdown elastic2043 - T243715 | 
  [production] | 
            
  | 18:01 | 
  <volans> | 
  depooled elastic2043 - T243715 | 
  [production] | 
            
  | 18:01 | 
  <volans@cumin1001> | 
  conftool action : set/pooled=inactive; selector: name=elastic2043.codfw.wmnet | 
  [production] | 
            
  | 17:28 | 
  <elukey> | 
  restart varnishkafka-webrequest on cp3064 | 
  [production] | 
            
  | 17:25 | 
  <elukey> | 
  restart varnishkafka-webrequest on cp3056 | 
  [production] | 
            
  | 17:03 | 
  <bblack> | 
  reduce /proc/sys/net/ipv4/tcp_max_syn_backlog to 8192 on esams text caches | 
  [production] | 
            
  | 16:55 | 
  <bblack> | 
  reduce /proc/sys/net/ipv4/tcp_synack_retries to 1 on esams text caches | 
  [production] | 
            
  | 16:42 | 
  <cdanis> | 
  ✔️ cdanis@cp4030.ulsfo.wmnet ~ 🕦☕ sudo depool | 
  [production] | 
            
  | 16:38 | 
  <bblack> | 
  applying GRE MTU mitigation from T232602 to all cp1, cp3, cp5 cache nodes | 
  [production] | 
            
  | 15:43 | 
  <XioNoX> | 
  3*prepend in esams/knams | 
  [production] | 
            
  | 15:26 | 
  <elukey> | 
  repool deployed | 
  [production] | 
            
  | 15:24 | 
  <elukey> | 
  repool esams | 
  [production] | 
            
  | 15:01 | 
  <cdanis> | 
  deployed | 
  [production] | 
            
  | 15:00 | 
  <cdanis> | 
  depool esams | 
  [production] | 
            
  | 14:56 | 
  <XioNoX> | 
  enabling netflow sampling on the knams-esams links (esams side) | 
  [production] | 
            
  | 11:25 | 
  <effie> | 
  restarted tilerator and tileratorui on maps1002 | 
  [production] | 
            
  | 11:23 | 
  <effie> | 
  restarted tilerator and tileratorui on maps1001 | 
  [production] | 
            
  | 10:38 | 
  <effie> | 
  deployed | 
  [production] | 
            
  | 10:37 | 
  <effie> | 
  Pool esams back | 
  [production] | 
            
  | 01:12 | 
  <cdanis> | 
  deployed | 
  [production] | 
            
  | 01:12 | 
  <cdanis> | 
  depool esams with new geo-maps-esams-offline | 
  [production] | 
            
  
    | 
      
        2020-01-24
      
     | 
  
    
  | 22:31 | 
  <mutante> | 
  ganeti1003 - sudo gnt-instance remove etherpad1001.eqiad.wmnet (T224580) | 
  [production] | 
            
  | 22:21 | 
  <mutante> | 
  shutting down etherpad1001 - service fully migrated to etherpad1002 - running decom cookbook on ganeti VM (T224580) | 
  [production] | 
            
  | 22:20 | 
  <dzahn@cumin1001> | 
  END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) | 
  [production] | 
            
  | 22:19 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.hosts.decommission | 
  [production] | 
            
  | 21:18 | 
  <cdanis> | 
  ✔️ cdanis@cp4029.ulsfo.wmnet ~ 🕟🍵 sudo depool | 
  [production] | 
            
  | 17:54 | 
  <jforrester@deploy1001> | 
  Synchronized wmf-config/InitialiseSettings.php: Clean up CheckUser config (duration: 01m 09s) | 
  [production] | 
            
  | 15:43 | 
  <gehel> | 
  restart blazegraph + updater on wdqs1007 (seems stuck, known issue) | 
  [production] | 
            
  | 15:33 | 
  <otto@deploy1001> | 
  helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' . | 
  [production] | 
            
  | 14:28 | 
  <vgutierrez> | 
  uploaded mtail 3.0.0~rc5-1~bpo9+1wmf2 to apt.wm.o (buster) - T243591 | 
  [production] | 
            
  | 14:26 | 
  <akosiaris@deploy1001> | 
  helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . | 
  [production] | 
            
  | 14:24 | 
  <akosiaris@deploy1001> | 
  helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . | 
  [production] | 
            
  | 14:23 | 
  <akosiaris@deploy1001> | 
  helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . | 
  [production] | 
            
  | 13:16 | 
  <akosiaris@deploy1001> | 
  helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' . | 
  [production] | 
            
  | 11:09 | 
  <moritzm> | 
  purged stale grafana package from grafana1001, caused systemd unit failure | 
  [production] | 
            
  | 11:04 | 
  <effie> | 
  restart php-fpm on mw1238-mw1239 | 
  [production] | 
            
  | 09:29 | 
  <akosiaris> | 
  disable and mask etherpad-lite on etherpad1002 to avoid corruption issues. T224580 | 
  [production] | 
            
  | 08:42 | 
  <marostegui> | 
  Remove wikiadmin2 user from pc2XXX codfw hosts T243512 | 
  [production] | 
            
  | 08:17 | 
  <moritzm> | 
  installing python-apt security updates | 
  [production] | 
            
  | 07:19 | 
  <_joe_> | 
  force run puppet on all esams cache nodes, for mitigation of T243313 | 
  [production] | 
            
  | 06:37 | 
  <marostegui> | 
  Stop replication on db1107 | 
  [production] |