| 2023-10-18 | 
    
  | 13:04 | <kartik@deploy2002> | helmfile [staging] DONE helmfile.d/services/cxserver: apply | [production] | 
            
  | 13:04 | <kartik@deploy2002> | helmfile [staging] START helmfile.d/services/cxserver: apply | [production] | 
            
  | 13:03 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P53008 and previous config saved to /var/cache/conftool/dbconfig/20231018-130343-arnaudb.json | [production] | 
            
  | 13:03 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'db2161 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P53007 and previous config saved to /var/cache/conftool/dbconfig/20231018-130325-arnaudb.json | [production] | 
            
  | 12:59 | <kartik@deploy2002> | helmfile [codfw] DONE helmfile.d/services/cxserver: apply | [production] | 
            
  | 12:59 | <kartik@deploy2002> | helmfile [codfw] START helmfile.d/services/cxserver: apply | [production] | 
            
  | 12:52 | <jbond@cumin1001> | START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS bullseye | [production] | 
            
  | 12:51 | <jbond> | upload puppet_7.23.0-1~debu11u1 (bullseye backport) | [production] | 
            
  | 12:48 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P53006 and previous config saved to /var/cache/conftool/dbconfig/20231018-124838-arnaudb.json | [production] | 
            
  | 12:48 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P53005 and previous config saved to /var/cache/conftool/dbconfig/20231018-124820-arnaudb.json | [production] | 
            
  | 12:44 | <kartik@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/cxserver: apply | [production] | 
            
  | 12:44 | <pt1979@cumin2002> | END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti-test2004.mgmt.codfw.wmnet with reboot policy FORCED | [production] | 
            
  | 12:44 | <kartik@deploy2002> | helmfile [eqiad] START helmfile.d/services/cxserver: apply | [production] | 
            
  | 12:43 | <jbond@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS bullseye | [production] | 
            
  | 12:43 | <arnaudb@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2109.codfw.wmnet with reason: db2109 downtime while repooling | [production] | 
            
  | 12:39 | <arnaudb@cumin1001> | START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2109.codfw.wmnet with reason: db2109 downtime while repooling | [production] | 
            
  | 12:38 | <kartik@deploy2002> | helmfile [staging] DONE helmfile.d/services/cxserver: apply | [production] | 
            
  | 12:37 | <kartik@deploy2002> | helmfile [staging] START helmfile.d/services/cxserver: apply | [production] | 
            
  | 12:33 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P53004 and previous config saved to /var/cache/conftool/dbconfig/20231018-123333-arnaudb.json | [production] | 
            
  | 12:33 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P53003 and previous config saved to /var/cache/conftool/dbconfig/20231018-123315-arnaudb.json | [production] | 
            
  | 12:18 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P53002 and previous config saved to /var/cache/conftool/dbconfig/20231018-121828-arnaudb.json | [production] | 
            
  | 12:18 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P53001 and previous config saved to /var/cache/conftool/dbconfig/20231018-121811-arnaudb.json | [production] | 
            
  | 12:17 | <arnaudb> | repool db2161 and db1126 | [production] | 
            
  | 11:51 | <btullis@cumin1001> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1009.eqiad.wmnet | [production] | 
            
  | 11:44 | <btullis@cumin1001> | START - Cookbook sre.hosts.reboot-single for host stat1009.eqiad.wmnet | [production] | 
            
  | 11:43 | <fnegri@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm | [production] | 
            
  | 11:34 | <jbond@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 11:31 | <jbond@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 11:29 | <hnowlan@deploy2002> | helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply | [production] | 
            
  | 11:29 | <hnowlan@deploy2002> | helmfile [codfw] START helmfile.d/services/editor-analytics: apply | [production] | 
            
  | 11:24 | <jgiannelos@deploy2002> | helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply | [production] | 
            
  | 11:23 | <jgiannelos@deploy2002> | helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply | [production] | 
            
  | 11:21 | <hnowlan@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply | [production] | 
            
  | 11:20 | <hnowlan@deploy2002> | helmfile [eqiad] START helmfile.d/services/editor-analytics: apply | [production] | 
            
  | 11:16 | <hnowlan@deploy2002> | helmfile [staging] DONE helmfile.d/services/editor-analytics: apply | [production] | 
            
  | 11:16 | <hnowlan@deploy2002> | helmfile [staging] START helmfile.d/services/editor-analytics: apply | [production] | 
            
  | 11:14 | <fnegri@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 11:12 | <fnegri@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 11:11 | <ladsgroup@deploy2002> | Finished scap: Backport for [[gerrit:966592|Set s6 and s8 to write both for pagelinks migration (T345732)]] (duration: 10m 10s) | [production] | 
            
  | 11:08 | <jbond@cumin1001> | START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS bullseye | [production] | 
            
  | 11:05 | <ladsgroup@deploy2002> | ladsgroup: Continuing with sync | [production] | 
            
  | 11:02 | <ladsgroup@deploy2002> | ladsgroup: Backport for [[gerrit:966592|Set s6 and s8 to write both for pagelinks migration (T345732)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | [production] | 
            
  | 11:01 | <fnegri@cumin1001> | START - Cookbook sre.hosts.reimage for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm | [production] | 
            
  | 11:00 | <ladsgroup@deploy2002> | Started scap: Backport for [[gerrit:966592|Set s6 and s8 to write both for pagelinks migration (T345732)]] | [production] | 
            
  | 10:40 | <volans> | re-enabled puppet on the cumin hosts. installed spicerack 8.0.1 on the cumin hosts | [production] | 
            
  | 10:37 | <volans@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1001.eqiad.wmnet with OS bullseye | [production] | 
            
  | 10:35 | <btullis@cumin1001> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1007.eqiad.wmnet | [production] | 
            
  | 10:32 | <fnegri@cumin1001> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm | [production] | 
            
  | 10:28 | <kevinbazira@deploy2002> | helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 10:19 | <fnegri@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage | [production] |