| 2020-02-07 |
    
  | 22:20 | <jeh> | ceph: round 2 OSD failover and recovery testing on cloudcephosd1003.wikimedia.org T240718 | [production] | 
            
  | 20:47 | <mutante> | OS install on new install_server VMs worked on second attempt, issues are gone. signed puppet certs for install1003.eqiad.wmnet, install2003.codfw.wmnet, initial puppet runs (T224576) | [production] | 
            
  | 20:42 | <jeh> | ceph: OSD failover and recovery testing on cloudcephosd1003.wikimedia.org T240718 | [production] | 
            
  | 20:32 | <mutante> | ganeti: attempting to reinstall install1003 which failed last time | [production] | 
            
  | 17:38 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool es1019 after on-site maintenance T243963', diff saved to https://phabricator.wikimedia.org/P10350 and previous config saved to /var/cache/conftool/dbconfig/20200207-173850-marostegui.json | [production] | 
            
  | 17:36 | <twentyafterfour@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: sync InitializeSettings again for lols refs T233866 (duration: 01m 03s) | [production] | 
            
  | 17:32 | <twentyafterfour@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: sync https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/570929 refs T233866 (duration: 01m 02s) | [production] | 
            
  | 17:25 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool es1019 after on-site maintenance T243963', diff saved to https://phabricator.wikimedia.org/P10349 and previous config saved to /var/cache/conftool/dbconfig/20200207-172541-marostegui.json | [production] | 
            
  | 17:22 | <twentyafterfour@deploy1001> | rebuilt and synchronized wikiversions files: roll back all wikis to 1.35.0-wmf.16 refs T233866 | [production] | 
            
  | 17:19 | <marostegui> | Start MySQL on es1019 after onsite maintenance T243963 | [production] | 
            
  | 16:46 | <filippo@cumin1001> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) | [production] | 
            
  | 16:38 | <filippo@cumin1001> | START - Cookbook sre.ganeti.makevm | [production] | 
            
  | 16:13 | <XioNoX> | remove MSS clamping from eqiad/eqord/knams/esams | [production] | 
            
  | 16:05 | <andrew@deploy1001> | Finished deploy [horizon/deploy@bc777d6]: Fix for T243422 (duration: 03m 45s) | [production] | 
            
  | 16:04 | <vgutierrez> | pooling cp4030 with buster - T242093 | [production] | 
            
  | 16:03 | <bblack> | removing GRE MTU mitigations from cp[135]xxx - T232602 | [production] | 
            
  | 16:01 | <andrew@deploy1001> | Started deploy [horizon/deploy@bc777d6]: Fix for T243422 | [production] | 
            
  | 15:50 | <vgutierrez@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 15:48 | <vgutierrez@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 15:25 | <vgutierrez> | depool & reimage cp4030 as buster - T242093 | [production] | 
            
  | 15:21 | <vgutierrez> | pooling cp4031 with buster - T242093 | [production] | 
            
  | 15:20 | <vgutierrez> | pooling ncredir3001 running buster - T243391 | [production] | 
            
  | 15:18 | <marostegui> | Restart all instances on db1124 and db1125 to pick up a new replication filter - T240094 | [production] | 
            
  | 15:11 | <marostegui> | Restart all instances on db2094 and db2095 to pick up a new replication filter - T240094 | [production] | 
            
  | 14:56 | <vgutierrez@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 14:53 | <vgutierrez@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 14:43 | <hoo@deploy1001> | Synchronized wmf-config/Wikibase.php: REVERT: Wikibase Client: Fix setting name typo (T244529) (duration: 01m 40s) | [production] | 
            
  | 14:43 | <Amir1> | ladsgroup@mwmaint1002:~$ mwscript createAndPromote.php --wiki=zhwiki --force "Amir Sarabadani (WMDE)" --sysop (T244578) | [production] | 
            
  | 14:40 | <hoo@deploy1001> | Scap failed!: 9/11 canaries failed their endpoint checks(http://en.wikipedia.org) | [production] | 
            
  | 14:38 | <hoo@deploy1001> | Synchronized wmf-config/Wikibase.php: Wikibase Client: Fix setting name typo (T244529) (duration: 01m 20s) | [production] | 
            
  | 14:33 | <vgutierrez> | depool and reimage ncredir3001 as buster - T243391 | [production] | 
            
  | 14:32 | <vgutierrez> | depool & reimage cp4031 as buster - T242093 | [production] | 
            
  | 14:23 | <vgutierrez> | pooling ncredir3002 running buster - T243391 | [production] | 
            
  | 13:26 | <vgutierrez> | pooling cp4021 with buster - T242093 | [production] | 
            
  | 13:05 | <vgutierrez@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 13:03 | <vgutierrez@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 12:51 | <vgutierrez> | depool and reimage ncredir3002 as buster - T243391 | [production] | 
            
  | 12:42 | <vgutierrez> | depool & reimage cp4021 as buster - T242093 | [production] | 
            
  | 12:08 | <akosiaris@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 12:08 | <akosiaris@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 11:58 | <akosiaris@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 11:57 | <akosiaris@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 11:25 | <vgutierrez> | pooling ncredir5001 running buster - T243391 | [production] | 
            
  | 11:24 | <vgutierrez> | pooling cp4022 with buster - T242093 | [production] | 
            
  | 11:09 | <akosiaris> | undo wikifeeds experiments | [production] | 
            
  | 11:07 | <akosiaris@deploy1001> | helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . | [production] | 
            
  | 10:42 | <akosiaris@deploy1001> | helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' . | [production] | 
            
  | 10:40 | <vgutierrez@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 10:37 | <vgutierrez@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 10:36 | <akosiaris> | conduct experiments with stopping/starting uwsgi-ores on ores2001 T242705 | [production] |