| 2020-03-25 |
    
  | 11:20 | <dzahn@cumin1001> | conftool action : set/pooled=no; selector: name=mw123[2-5].eqiad.wmnet | [production] | 
            
  | 11:20 | <dzahn@cumin1001> | conftool action : set/pooled=no; selector: name=mw125[0-3].eqiad.wmnet | [production] | 
            
  | 11:19 | <urbanecm@deploy1001> | Synchronized wmf-config/CommonSettings.php: SWAT: 59412db: Add gwtoolset to available rights to allow granting to global groups (duration: 01m 07s) | [production] | 
            
  | 11:12 | <urbanecm@deploy1001> | Synchronized wmf-config/CommonSettings.php: SWAT: 7b8d7c5: TwoColConflict: Limited default deployment CommonSettings.php (T244863) (duration: 01m 06s) | [production] | 
            
  | 11:10 | <urbanecm@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: SWAT: 81cda0f: TwoColConflict: Limited default deployment InitialiseSettings.php (T244863; take II) (duration: 01m 06s) | [production] | 
            
  | 11:08 | <urbanecm@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: SWAT: 81cda0f: TwoColConflict: Limited default deployment InitialiseSettings.php (T244863) (duration: 01m 17s) | [production] | 
            
  | 11:08 | <jynus@cumin1001> | dbctl commit (dc=all): 'Reduce db1091 load, increase main traffic on all other s4 instances', diff saved to https://phabricator.wikimedia.org/P10762 and previous config saved to /var/cache/conftool/dbconfig/20200325-110821-jynus.json | [production] | 
            
  | 10:55 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Fully repool db1137', diff saved to https://phabricator.wikimedia.org/P10761 and previous config saved to /var/cache/conftool/dbconfig/20200325-105503-marostegui.json | [production] | 
            
  | 10:39 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1137', diff saved to https://phabricator.wikimedia.org/P10760 and previous config saved to /var/cache/conftool/dbconfig/20200325-103938-marostegui.json | [production] | 
            
  | 10:37 | <XioNoX> | change aggregate policy for 2620:0:862::/48 on cr3-knams - T236785 | [production] | 
            
  | 10:19 | <XioNoX> | change aggregate policy for v4 prefixes on cr2-eqdfw - T236785 | [production] | 
            
  | 10:04 | <oblivian@deploy1001> | helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' . | [production] | 
            
  | 10:04 | <oblivian@deploy1001> | helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'production' . | [production] | 
            
  | 09:56 | <XioNoX> | change aggregate policy for 2620:0:860::/46 on cr2-eqdfw - T236785 | [production] | 
            
  | 09:54 | <vgutierrez> | Enable inbound TLSv1.3 on upload@eqsin - T170567 | [production] | 
            
  | 09:27 | <jmm@cumin2001> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) | [production] | 
            
  | 09:23 | <vgutierrez> | upgrade ATS to 8.0.6-1wm3 on upload@eqsin - T170567 | [production] | 
            
  | 09:14 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1137', diff saved to https://phabricator.wikimedia.org/P10759 and previous config saved to /var/cache/conftool/dbconfig/20200325-091421-marostegui.json | [production] | 
            
  | 09:02 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1137', diff saved to https://phabricator.wikimedia.org/P10758 and previous config saved to /var/cache/conftool/dbconfig/20200325-090227-marostegui.json | [production] | 
            
  | 08:55 | <marostegui@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 08:53 | <marostegui@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 08:38 | <marostegui> | Reimage db1137 | [production] | 
            
  | 08:18 | <marostegui> | Reboot db1117 for full-upgrade | [production] | 
            
  | 08:15 | <oblivian@deploy1001> | helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' . | [production] | 
            
  | 08:15 | <oblivian@deploy1001> | helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'production' . | [production] | 
            
  | 08:14 | <_joe_> | upgrading all eventgate-main to envoy 1.13.1 T246868 | [production] | 
            
  | 08:12 | <marostegui> | Stop all mysql daemons on db1117 | [production] | 
            
  | 07:50 | <oblivian@deploy1001> | helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' . | [production] | 
            
  | 07:50 | <oblivian@deploy1001> | helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'production' . | [production] | 
            
  | 07:42 | <XioNoX> | reboot scs-eqsin for CPU usage | [production] | 
            
  | 07:20 | <jmm@cumin2001> | START - Cookbook sre.ganeti.makevm | [production] | 
            
  | 07:09 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1137 for upgrade', diff saved to https://phabricator.wikimedia.org/P10757 and previous config saved to /var/cache/conftool/dbconfig/20200325-070946-marostegui.json | [production] | 
            
  | 06:57 | <marostegui> | Deploy schema change on db2129 (s6 codfw master) | [production] | 
            
  | 06:15 | <marostegui> | Rename tables on db1133 (m5 master) nova_api database - T248313 | [production] | 
            
  | 06:13 | <marostegui> | Remove grants 'nova'@'208.80.154.23' on nova.* - T248313 | [production] | 
            
  
    | 2020-03-24 |
    
  | 20:53 | <cdanis> | repool eqsin | [production] | 
            
  | 20:52 | <jforrester@deploy1001> | Synchronized wmf-config/CommonSettings.php: Don't hard-set wgTmhUseBetaFeatures to true, let it vary by wiki (duration: 01m 07s) | [production] | 
            
  | 20:50 | <jforrester@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 01m 07s) | [production] | 
            
  | 20:49 | <jforrester@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: Set wgTmhUseBetaFeatures to vary by wiki (duration: 01m 06s) | [production] | 
            
  | 20:35 | <twentyafterfour@deploy1001> | rebuilt and synchronized wikiversions files: Attempt #2: group0 wikis to 1.35.0-wmf.25 refs T233873 | [production] | 
            
  | 20:32 | <twentyafterfour@deploy1001> | Synchronized wmf-config: Now touch and sync again because of settings cache race condition. refs T248409 (duration: 00m 59s) | [production] | 
            
  | 20:31 | <cdanis> | rebooting cr2-eqsin T248394 | [production] | 
            
  | 20:30 | <twentyafterfour@deploy1001> | Synchronized wmf-config: Now sync InitializeSettings* refs T248409 (duration: 00m 59s) | [production] | 
            
  | 20:28 | <twentyafterfour@deploy1001> | Synchronized wmf-config/CommonSettings.php: sync CommonSettings before InitialiseSettings refs T248409 (duration: 00m 58s) | [production] | 
            
  | 20:27 | <volans> | force rebooting analytics1044 from console, host down and unreachable (ping, ssh, console) | [production] | 
            
  | 20:26 | <cdanis> | commit flow-table-size on cr2-eqsin T248394 | [production] | 
            
  | 20:19 | <cdanis> | eqsin depooled for router maintenance at 16:15 | [production] | 
            
  | 19:29 | <twentyafterfour@deploy1001> | scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) | [production] | 
            
  | 19:29 | <twentyafterfour> | rolling back to wmf.24 due to high error rate refs T233873 | [production] | 
            
  | 19:28 | <twentyafterfour@deploy1001> | scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) | [production] |