| 
      
        2020-02-10
      
     | 
  
    
  | 06:47 | 
  <vgutierrez> | 
  depool cp4029 & reimage as buster - T242093 | 
  [production] | 
            
  | 06:45 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Slowly repool es1019', diff saved to https://phabricator.wikimedia.org/P10360 and previous config saved to /var/cache/conftool/dbconfig/20200210-064553-marostegui.json | 
  [production] | 
            
  | 06:44 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Slowly repool db1091 T232446', diff saved to https://phabricator.wikimedia.org/P10359 and previous config saved to /var/cache/conftool/dbconfig/20200210-064458-marostegui.json | 
  [production] | 
            
  | 06:39 | 
  <marostegui> | 
  Compress db1124:3318 - this will generate lag on s8 wiki replicas - T232446 | 
  [production] | 
            
  | 06:37 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Slowly repool db1091 T232446', diff saved to https://phabricator.wikimedia.org/P10358 and previous config saved to /var/cache/conftool/dbconfig/20200210-063716-marostegui.json | 
  [production] | 
            
  | 06:23 | 
  <marostegui> | 
  Remove partitions from db1099:3311, db1099:3318 T239453 | 
  [production] | 
            
  | 06:21 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Depool db1099:3318 T239453', diff saved to https://phabricator.wikimedia.org/P10357 and previous config saved to /var/cache/conftool/dbconfig/20200210-062112-marostegui.json | 
  [production] | 
            
  | 06:18 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Depool repool db1099:3311 T239453', diff saved to https://phabricator.wikimedia.org/P10356 and previous config saved to /var/cache/conftool/dbconfig/20200210-061822-marostegui.json | 
  [production] | 
            
  | 06:16 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Slowly repool db1091 T232446', diff saved to https://phabricator.wikimedia.org/P10355 and previous config saved to /var/cache/conftool/dbconfig/20200210-061656-marostegui.json | 
  [production] | 
            
  
    | 
      
        2020-02-07
      
     | 
  
    
  | 22:20 | 
  <jeh> | 
  ceph: round 2 OSD failover and recovery testing on cloudcephosd1003.wikimedia.org T240718 | 
  [production] | 
            
  | 20:47 | 
  <mutante> | 
  OS install on new install_server VMs worked on second attempt, issues are gone. signed puppet certs for install1003.eqiad.wmnet, install2003.codfw.wmnet, initial puppet runs (T224576) | 
  [production] | 
            
  | 20:42 | 
  <jeh> | 
  ceph: OSD failover and recovery testing on cloudcephosd1003.wikimedia.org T240718 | 
  [production] | 
            
  | 20:32 | 
  <mutante> | 
  ganeti: attempting to reinstall install1003 which failed last time | 
  [production] | 
            
  | 17:38 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Slowly repool es1019 after on-site maintenance T243963', diff saved to https://phabricator.wikimedia.org/P10350 and previous config saved to /var/cache/conftool/dbconfig/20200207-173850-marostegui.json | 
  [production] | 
            
  | 17:36 | 
  <twentyafterfour@deploy1001> | 
  Synchronized wmf-config/InitialiseSettings.php: sync InitializeSettings again for lols refs T233866 (duration: 01m 03s) | 
  [production] | 
            
  | 17:32 | 
  <twentyafterfour@deploy1001> | 
  Synchronized wmf-config/InitialiseSettings.php: sync https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/570929 refs T233866 (duration: 01m 02s) | 
  [production] | 
            
  | 17:25 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Slowly repool es1019 after on-site maintenance T243963', diff saved to https://phabricator.wikimedia.org/P10349 and previous config saved to /var/cache/conftool/dbconfig/20200207-172541-marostegui.json | 
  [production] | 
            
  | 17:22 | 
  <twentyafterfour@deploy1001> | 
  rebuilt and synchronized wikiversions files: roll back all wikis to 1.35.0-wmf.16 refs T233866 | 
  [production] | 
            
  | 17:19 | 
  <marostegui> | 
  Start MySQL on es1019 after onsite maintenance T243963 | 
  [production] | 
            
  | 16:46 | 
  <filippo@cumin1001> | 
  END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) | 
  [production] | 
            
  | 16:38 | 
  <filippo@cumin1001> | 
  START - Cookbook sre.ganeti.makevm | 
  [production] | 
            
  | 16:13 | 
  <XioNoX> | 
  remove MSS clamping from eqiad/eqord/knams/esams | 
  [production] | 
            
  | 16:05 | 
  <andrew@deploy1001> | 
  Finished deploy [horizon/deploy@bc777d6]: Fix for T243422 (duration: 03m 45s) | 
  [production] | 
            
  | 16:04 | 
  <vgutierrez> | 
  pooling cp4030 with buster - T242093 | 
  [production] | 
            
  | 16:03 | 
  <bblack> | 
  removing GRE MTU mitigations from cp[135]xxx - T232602 | 
  [production] | 
            
  | 16:01 | 
  <andrew@deploy1001> | 
  Started deploy [horizon/deploy@bc777d6]: Fix for T243422 | 
  [production] | 
            
  | 15:50 | 
  <vgutierrez@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | 
  [production] | 
            
  | 15:48 | 
  <vgutierrez@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 15:25 | 
  <vgutierrez> | 
  depool & reimage cp4030 as buster - T242093 | 
  [production] | 
            
  | 15:21 | 
  <vgutierrez> | 
  pooling cp4031 with buster - T242093 | 
  [production] | 
            
  | 15:20 | 
  <vgutierrez> | 
  pooling ncredir3001 running buster - T243391 | 
  [production] | 
            
  | 15:18 | 
  <marostegui> | 
  Restart all instances on db1124 and db1125 to pick up a new replication filter - T240094 | 
  [production] | 
            
  | 15:11 | 
  <marostegui> | 
  Restart all instances on db2094 and db2095 to pick up a new replication filter - T240094 | 
  [production] | 
            
  | 14:56 | 
  <vgutierrez@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | 
  [production] | 
            
  | 14:53 | 
  <vgutierrez@cumin1001> | 
  START - Cookbook sre.hosts.downtime | 
  [production] | 
            
  | 14:43 | 
  <hoo@deploy1001> | 
  Synchronized wmf-config/Wikibase.php: REVERT: Wikibase Client: Fix setting name typo (T244529) (duration: 01m 40s) | 
  [production] | 
            
  | 14:43 | 
  <Amir1> | 
  ladsgroup@mwmaint1002:~$ mwscript createAndPromote.php --wiki=zhwiki --force "Amir Sarabadani (WMDE)" --sysop (T244578) | 
  [production] | 
            
  | 14:40 | 
  <hoo@deploy1001> | 
  Scap failed!: 9/11 canaries failed their endpoint checks(http://en.wikipedia.org) | 
  [production] | 
            
  | 14:38 | 
  <hoo@deploy1001> | 
  Synchronized wmf-config/Wikibase.php: Wikibase Client: Fix setting name typo (T244529) (duration: 01m 20s) | 
  [production] | 
            
  | 14:33 | 
  <vgutierrez> | 
  depool and reimage ncredir3001 as buster - T243391 | 
  [production] | 
            
  | 14:32 | 
  <vgutierrez> | 
  depool & reimage cp4031 as buster - T242093 | 
  [production] | 
            
  | 14:23 | 
  <vgutierrez> | 
  pooling ncredir3002 running buster - T243391 | 
  [production] | 
            
  | 13:26 | 
  <vgutierrez> | 
  pooling cp4021 with buster - T242093 | 
  [production] | 
            
  | 13:05 | 
  <vgutierrez@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | 
  [production] |