| 2022-05-05 | 
    
  | 13:12 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 13:10 | <mwdebug-deploy@deploy1002> | helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 13:09 | <mwdebug-deploy@deploy1002> | helmfile [codfw] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 13:09 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 13:08 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 13:08 | <klausman@cumin1001> | END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-cache1001.eqiad.wmnet | [production] | 
            
  | 13:08 | <klausman@cumin1001> | START - Cookbook sre.hosts.reboot-single for host ml-cache1001.eqiad.wmnet | [production] | 
            
  | 13:08 | <aqu@deploy1002> | Started deploy [analytics/refinery@6b9b65d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6b9b65d] | [production] | 
            
  | 13:07 | <aqu@deploy1002> | Finished deploy [analytics/refinery@6b9b65d] (thin): Regular analytics weekly train THIN [analytics/refinery@6b9b65d] (duration: 00m 08s) | [production] | 
            
  | 13:07 | <aqu@deploy1002> | Started deploy [analytics/refinery@6b9b65d] (thin): Regular analytics weekly train THIN [analytics/refinery@6b9b65d] | [production] | 
            
  | 13:06 | <tgr@deploy1002> | Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789556|GrowthExperiments: Enable Add Link backend on tier 3 wikis (T304542)]] (duration: 00m 49s) | [production] | 
            
  | 13:06 | <aqu@deploy1002> | Finished deploy [analytics/refinery@6b9b65d]: Regular analytics weekly train [analytics/refinery@6b9b65d] (duration: 29m 59s) | [production] | 
            
  | 13:03 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 13:03 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 13:03 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T307525)', diff saved to https://phabricator.wikimedia.org/P27724 and previous config saved to /var/cache/conftool/dbconfig/20220505-130313-ladsgroup.json | [production] | 
            
  | 12:59 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: After the incident', diff saved to https://phabricator.wikimedia.org/P27723 and previous config saved to /var/cache/conftool/dbconfig/20220505-125917-root.json | [production] | 
            
  | 12:58 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1166 (T307525)', diff saved to https://phabricator.wikimedia.org/P27722 and previous config saved to /var/cache/conftool/dbconfig/20220505-125806-ladsgroup.json | [production] | 
            
  | 12:53 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1033.eqiad.wmnet | [production] | 
            
  | 12:53 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance | [production] | 
            
  | 12:53 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance | [production] | 
            
  | 12:53 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 12:53 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 12:52 | <jelto@cumin1001> | START - Cookbook sre.hosts.reboot-single for host gitlab-runner1001.eqiad.wmnet | [production] | 
            
  | 12:49 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host restbase1033.eqiad.wmnet | [production] | 
            
  | 12:49 | <jelto@cumin1001> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org | [production] | 
            
  | 12:48 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1032.eqiad.wmnet | [production] | 
            
  | 12:45 | <jelto@cumin1001> | START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org | [production] | 
            
  | 12:44 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host restbase1032.eqiad.wmnet | [production] | 
            
  | 12:44 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: After the incident', diff saved to https://phabricator.wikimedia.org/P27721 and previous config saved to /var/cache/conftool/dbconfig/20220505-124413-root.json | [production] | 
            
  | 12:44 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 12:44 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 12:44 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T307525)', diff saved to https://phabricator.wikimedia.org/P27720 and previous config saved to /var/cache/conftool/dbconfig/20220505-124401-ladsgroup.json | [production] | 
            
  | 12:39 | <jmm@cumin2002> | END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host restbase1031.eqiad.wmnet | [production] | 
            
  | 12:36 | <aqu@deploy1002> | Started deploy [analytics/refinery@6b9b65d]: Regular analytics weekly train [analytics/refinery@6b9b65d] | [production] | 
            
  | 12:36 | <jelto@cumin1001> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org | [production] | 
            
  | 12:32 | <jelto@cumin1001> | START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org | [production] | 
            
  | 12:31 | <jelto@cumin1001> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org | [production] | 
            
  | 12:29 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host restbase1031.eqiad.wmnet | [production] | 
            
  | 12:29 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: After the incident', diff saved to https://phabricator.wikimedia.org/P27719 and previous config saved to /var/cache/conftool/dbconfig/20220505-122909-root.json | [production] | 
            
  | 12:28 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27718 and previous config saved to /var/cache/conftool/dbconfig/20220505-122854-ladsgroup.json | [production] | 
            
  | 12:27 | <jelto@cumin1001> | START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org | [production] | 
            
  | 12:27 | <aqu> | Regular analytics weekly train [analytics/refinery@cc4b2bd] | [production] | 
            
  | 12:27 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1030.eqiad.wmnet | [production] | 
            
  | 12:26 | <mvernon@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye | [production] | 
            
  | 12:25 | <jelto@cumin1001> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org | [production] | 
            
  | 12:20 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host restbase1030.eqiad.wmnet | [production] | 
            
  | 12:19 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1170:3317 (T307525)', diff saved to https://phabricator.wikimedia.org/P27717 and previous config saved to /var/cache/conftool/dbconfig/20220505-121935-ladsgroup.json | [production] | 
            
  | 12:19 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 12:19 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 12:19 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1158 (T307525)', diff saved to https://phabricator.wikimedia.org/P27716 and previous config saved to /var/cache/conftool/dbconfig/20220505-121928-ladsgroup.json | [production] |