| 2022-08-15 |
  | 21:42 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 21:42 | <cjming@deploy1002> | Synchronized php-1.39.0-wmf.23/skins/Vector/resources/skins.vector.es6: Backport: [[gerrit:823228|Sticky header AB test bucketing for 2 treatment buckets (T312573)]] (duration: 03m 05s) | [production] | 
            
  | 21:34 | <ejegg> | payments-wiki upgraded from 41709763 to f9f91f1f | [production] | 
            
  | 21:32 | <ejegg|afk> | payments-wiki rolled back to 41709763 | [production] | 
            
  | 21:29 | <ryankemper@cumin1001> | START - Cookbook sre.hosts.reimage for host elastic1083.eqiad.wmnet with OS bullseye | [production] | 
            
  | 21:22 | <ejegg> | payments-wiki upgraded from 41709763 to f9f91f1f | [production] | 
            
  | 21:07 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1080.eqiad.wmnet with OS bullseye | [production] | 
            
  | 20:56 | <mwdebug-deploy@deploy1002> | helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:55 | <mwdebug-deploy@deploy1002> | helmfile [codfw] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:55 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:55 | <cjming@deploy1002> | Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:823227|Revert "Enable sticky header edit A/B test for idwiki + viwiki"]] (duration: 03m 15s) | [production] | 
            
  | 20:54 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:50 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1080.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 20:48 | <ryankemper@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1080.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 20:35 | <ryankemper@cumin1001> | START - Cookbook sre.hosts.reimage for host elastic1080.eqiad.wmnet with OS bullseye | [production] | 
            
  | 20:33 | <cjming> | end of UTC late backport window | [production] | 
            
  | 20:31 | <cjming@deploy1002> | Synchronized php-1.39.0-wmf.23/extensions/GrowthExperiments: Backport: [[gerrit:822485|WelcomeSurvey/VariantHooks: Change hook used for redirection (T313064)]] (duration: 04m 37s) | [production] | 
            
  | 20:29 | <mwdebug-deploy@deploy1002> | helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:28 | <mwdebug-deploy@deploy1002> | helmfile [codfw] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:28 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:26 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:12 | <cjming@deploy1002> | Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:821310|Enable sticky header edit A/B test for idwiki + viwiki (T312295)]] (duration: 03m 30s) | [production] | 
            
  | 20:11 | <mwdebug-deploy@deploy1002> | helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:10 | <mwdebug-deploy@deploy1002> | helmfile [codfw] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:10 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 20:05 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 19:35 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1096:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32391 and previous config saved to /var/cache/conftool/dbconfig/20220815-193541-ladsgroup.json | [production] | 
            
  | 19:35 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 19:35 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 19:35 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1130 (T314041)', diff saved to https://phabricator.wikimedia.org/P32390 and previous config saved to /var/cache/conftool/dbconfig/20220815-193520-ladsgroup.json | [production] | 
            
  | 19:20 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1130', diff saved to https://phabricator.wikimedia.org/P32389 and previous config saved to /var/cache/conftool/dbconfig/20220815-192014-ladsgroup.json | [production] | 
            
  | 19:05 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1130', diff saved to https://phabricator.wikimedia.org/P32388 and previous config saved to /var/cache/conftool/dbconfig/20220815-190508-ladsgroup.json | [production] | 
            
  | 18:50 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1130 (T314041)', diff saved to https://phabricator.wikimedia.org/P32387 and previous config saved to /var/cache/conftool/dbconfig/20220815-185002-ladsgroup.json | [production] | 
            
  | 18:49 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1081.eqiad.wmnet with OS bullseye | [production] | 
            
  | 18:40 | <ebernhardson@deploy1002> | Finished deploy [wikimedia/discovery/analytics@230a820]: include additional debugging information in HivePartitionRangeSensor logs (duration: 02m 08s) | [production] | 
            
  | 18:38 | <ebernhardson@deploy1002> | Started deploy [wikimedia/discovery/analytics@230a820]: include additional debugging information in HivePartitionRangeSensor logs | [production] | 
            
  | 18:33 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1081.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 18:31 | <pt1979@cumin2002> | END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ms-be2067.codfw.wmnet | [production] | 
            
  | 18:29 | <ryankemper@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1081.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 18:24 | <pt1979@cumin2002> | END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2067.codfw.wmnet | [production] | 
            
  | 18:16 | <ryankemper@cumin1001> | START - Cookbook sre.hosts.reimage for host elastic1081.eqiad.wmnet with OS bullseye | [production] | 
            
  | 18:07 | <herron> | thanos compact process was hung, forced thanos-compact restart on thanos-fe2001 | [production] | 
            
  | 17:48 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1052.eqiad.wmnet with OS bullseye | [production] | 
            
  | 17:32 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1052.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 17:29 | <pt1979@cumin2002> | START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet | [production] | 
            
  | 17:28 | <pt1979@cumin2002> | START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2067.codfw.wmnet | [production] | 
            
  | 17:28 | <ryankemper@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1052.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 17:28 | <pt1979@cumin2002> | END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2067.codfw.wmnet | [production] | 
            
  | 17:28 | <pt1979@cumin2002> | START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2067.codfw.wmnet | [production] | 
            
  | 17:24 | <ebernhardson@deploy1002> | Finished deploy [wikimedia/discovery/analytics@d4137b5]: increase subgraph query SLA and remove same from drop_old_data (duration: 02m 17s) | [production] |