| 2020-09-08 |
  | 21:57 | <andrew@deploy1001> | Finished deploy [horizon/deploy@7a3221d]: refreshing to clobber local hacks (duration: 00m 13s) | [production] | 
            
  | 21:57 | <andrew@deploy1001> | Started deploy [horizon/deploy@7a3221d]: refreshing to clobber local hacks | [production] | 
            
  | 19:19 | <jhuneidi@deploy1001> | rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.8 | [production] | 
            
  | 19:12 | <jhuneidi@deploy1001> | Finished scap: testwikis wikis to 1.36.0-wmf.8 (duration: 71m 45s) | [production] | 
            
  | 18:22 | <elukey> | rm /srv/prometheus/ops/targets/mjolnir_msearch_eqiad.yaml on prometheus100[3,4] as cleanup after https://gerrit.wikimedia.org/r/621988 - T260305 | [production] | 
            
  | 18:00 | <jhuneidi@deploy1001> | Started scap: testwikis wikis to 1.36.0-wmf.8 | [production] | 
            
  | 17:58 | <ryankemper@cumin1001> | START - Cookbook sre.wdqs.data-reload | [production] | 
            
  | 17:57 | <ryankemper@cumin1001> | END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97) | [production] | 
            
  | 17:54 | <Amir1> | Deployed patch for T262240 | [production] | 
            
  | 17:53 | <ryankemper@cumin1001> | START - Cookbook sre.wdqs.data-reload | [production] | 
            
  | 17:23 | <andrewbogott> | rebooting cloudvirt1033 | [production] | 
            
  | 17:03 | <klausman> | attempted to add rock-dkms_3.3-19_all.deb to thirdparty/amd-rocm33 for use on analytics servers with GPUs | [production] | 
            
  | 16:35 | <otto@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: wgEventStreams: Set canary_events_enabled: true for eventgate test streams and eventlogging_Test - T251609 (duration: 00m 58s) | [production] | 
            
  | 16:34 | <herron> | increased elk5 logstash JVM heaps to 2g (to help decrease kafka-logging consumer lag) | [production] | 
            
  | 16:12 | <longma> | 1.36.0-wmf.8 was branched at e81e81e91473cc8259c473165863aca8ecea2784 for T257976 | [production] | 
            
  | 16:03 | <akosiaris@deploy1001> | helmfile [staging] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . | [production] | 
            
  | 16:03 | <akosiaris@deploy1001> | helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . | [production] | 
            
  | 16:02 | <akosiaris@deploy1001> | helmfile [codfw] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . | [production] | 
            
  | 15:34 | <jayme@cumin1001> | conftool action : set/pooled=yes; selector: name=kubernetes1004.* | [production] | 
            
  | 15:32 | <jayme@cumin1001> | conftool action : set/pooled=yes; selector: service=kubesvc,name=kubernetes1013.* | [production] | 
            
  | 15:30 | <elukey> | roll restart of hadoop master daemons on an-master100[1,2] after the cookbook failed | [production] | 
            
  | 15:26 | <elukey@cumin1001> | END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) | [production] | 
            
  | 15:20 | <_joe_> | restarted celery-ores-worker.service on ores1007 | [production] | 
            
  | 15:19 | <_joe_> | restarted ferm on wdqs1011 | [production] | 
            
  | 15:18 | <elukey@cumin1001> | START - Cookbook sre.hadoop.roll-restart-masters | [production] | 
            
  | 15:16 | <_joe_> | starting wdqs-updater on wdqs1005 | [production] | 
            
  | 15:15 | <bblack@cumin1001> | conftool action : set/pooled=yes; selector: name=cp1090.eqiad.wmnet | [production] | 
            
  | 15:14 | <bblack@cumin1001> | conftool action : set/pooled=yes; selector: name=cp108[789].eqiad.wmnet | [production] | 
            
  | 15:14 | <bblack> | repool cp1087-90 (eqiad row D) | [production] | 
            
  | 15:13 | <herron> | rolling restart of elk5 logstashes | [production] | 
            
  | 15:10 | <marostegui> | Start mysql on db1106 after PDU maintenance is done | [production] | 
            
  | 15:03 | <jayme@cumin1001> | conftool action : set/pooled=inactive; selector: service=kubesvc,name=kubernetes1013.* | [production] | 
            
  | 15:03 | <jayme@cumin1001> | conftool action : set/pooled=inactive; selector: name=kubernetes1004.* | [production] | 
            
  | 15:03 | <XioNoX> | request virtual-chassis vc-port set pic-slot 1 member 4 port 0 | [production] | 
            
  | 15:03 | <XioNoX> | request virtual-chassis vc-port set pic-slot 0 member 2 port 50 | [production] | 
            
  | 15:02 | <XioNoX> | request virtual-chassis vc-port set pic-slot 1 member 1 port 1 | [production] | 
            
  | 14:53 | <marostegui> | Reload dbproxy1016 to recover the alert | [production] | 
            
  | 14:45 | <jynus> | restarting bacula-dir @ backup1001 | [production] | 
            
  | 14:44 | <XioNoX> | reboot asw2-d3-eqiad | [production] | 
            
  | 14:33 | <moritzm> | bouncing ferm on hosts where ferm.service failed due to DNS resolution issues for prometheus hosts | [production] | 
            
  | 14:31 | <volans> | restarted ssh on mc1033 from console | [production] | 
            
  | 14:16 | <XioNoX> | request virtual-chassis vc-port delete pic-slot 1 member 4 port 0 | [production] | 
            
  | 14:16 | <XioNoX> | request virtual-chassis vc-port delete pic-slot 0 member 2 port 50 | [production] | 
            
  | 14:14 | <XioNoX> | request virtual-chassis vc-port delete pic-slot 1 member 1 port 1 | [production] | 
            
  | 14:13 | <akosiaris> | drain kubernetes1013, kubernetes1004. They are on row D | [production] | 
            
  | 14:13 | <bblack> | dns1002 - disable puppet + bird service (stop advertising recdns from row D) | [production] | 
            
  | 14:03 | <kormat@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 14:03 | <kormat@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 13:59 | <bblack@cumin1001> | conftool action : set/pooled=no; selector: name=cp1090.eqiad.wmnet | [production] | 
            
  | 13:59 | <bblack> | depooling cp1087-1090 | [production] |