| 
      
        2025-02-04
      
     | 
  
    
  | 14:09 | 
  <Lucas_WMDE> | 
  lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1115377|kowikisource: Add Draft namespace (T385162)]] # re-log from 14:07 UTC | 
  [production] | 
            
  | 13:46 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'db1229 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73176 and previous config saved to /var/cache/conftool/dbconfig/20250204-134646-root.json | 
  [production] | 
            
  | 13:44 | 
  <aborrero@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1004.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 13:35 | 
  <andrew@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1002.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 13:31 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'db1229 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73175 and previous config saved to /var/cache/conftool/dbconfig/20250204-133141-root.json | 
  [production] | 
            
  | 13:27 | 
  <aborrero@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1004.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 13:23 | 
  <aborrero@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1004.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 13:17 | 
  <andrew@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 13:16 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'db1229 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73174 and previous config saved to /var/cache/conftool/dbconfig/20250204-131636-root.json | 
  [production] | 
            
  | 13:14 | 
  <andrew@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 13:11 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'db2220 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73173 and previous config saved to /var/cache/conftool/dbconfig/20250204-131118-root.json | 
  [production] | 
            
  | 13:09 | 
  <godog> | 
  upgrade poolcounter-prometheus-exporter to 0.1.2 - T333947 | 
  [production] | 
            
  | 13:07 | 
  <aborrero@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host cloudgw1004.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 13:04 | 
  <aborrero@cumin1002> | 
  END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudgw1004.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 12:46 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'db1229 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73168 and previous config saved to /var/cache/conftool/dbconfig/20250204-124625-root.json | 
  [production] | 
            
  | 12:43 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P73167 and previous config saved to /var/cache/conftool/dbconfig/20250204-124345-marostegui.json | 
  [production] | 
            
  | 12:41 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'db2220 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73166 and previous config saved to /var/cache/conftool/dbconfig/20250204-124107-root.json | 
  [production] | 
            
  | 12:40 | 
  <bking@deploy2002> | 
  helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 12:39 | 
  <bking@deploy2002> | 
  helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 12:38 | 
  <jynus> | 
  deploying new backup grants for ES hosts T383902 | 
  [production] | 
            
  | 12:33 | 
  <jiji@deploy2002> | 
  helmfile [codfw] DONE helmfile.d/services/shellbox: apply | 
  [production] | 
            
  | 12:32 | 
  <jiji@deploy2002> | 
  helmfile [codfw] START helmfile.d/services/shellbox: apply | 
  [production] | 
            
  | 12:28 | 
  <vgutierrez@cumin1002> | 
  END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs2011.codfw.wmnet,lvs6001.drmrs.wmnet,lvs1017.eqiad.wmnet,lvs3008.esams.wmnet,lvs7001.magru.wmnet} and A:lvs (T373027) | 
  [production] | 
            
  | 12:28 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P73165 and previous config saved to /var/cache/conftool/dbconfig/20250204-122838-marostegui.json | 
  [production] | 
            
  | 12:27 | 
  <vgutierrez@cumin1002> | 
  START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs2011.codfw.wmnet,lvs6001.drmrs.wmnet,lvs1017.eqiad.wmnet,lvs3008.esams.wmnet,lvs7001.magru.wmnet} and A:lvs (T373027) | 
  [production] | 
            
  | 12:26 | 
  <vgutierrez> | 
  upgrading pybal on high-traffic1 load balancers -  T373027 | 
  [production] | 
            
  | 12:26 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'db2220 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73164 and previous config saved to /var/cache/conftool/dbconfig/20250204-122602-root.json | 
  [production] | 
            
  | 12:25 | 
  <vgutierrez@cumin1002> | 
  END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs2012.codfw.wmnet,lvs6002.drmrs.wmnet,lvs1018.eqiad.wmnet,lvs3009.esams.wmnet,lvs7002.magru.wmnet} and A:lvs (T373027) | 
  [production] | 
            
  | 12:24 | 
  <vgutierrez@cumin1002> | 
  START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs2012.codfw.wmnet,lvs6002.drmrs.wmnet,lvs1018.eqiad.wmnet,lvs3009.esams.wmnet,lvs7002.magru.wmnet} and A:lvs (T373027) | 
  [production] | 
            
  | 12:23 | 
  <vgutierrez> | 
  upgrading pybal on high-traffic2 load balancers -  T373027 | 
  [production] | 
            
  | 12:23 | 
  <root@cumin1002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Index rebuild | 
  [production] | 
            
  | 12:21 | 
  <vgutierrez@cumin1002> | 
  END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic (T373027) | 
  [production] | 
            
  | 12:20 | 
  <root@cumin1002> | 
  END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2222.codfw.wmnet | 
  [production] | 
            
  | 12:20 | 
  <vgutierrez@cumin1002> | 
  START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic (T373027) | 
  [production] | 
            
  | 12:18 | 
  <vgutierrez> | 
  upgrading pybal on low-traffic load balancers -  T373027 | 
  [production] | 
            
  | 12:17 | 
  <vgutierrez@cumin1002> | 
  END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs2014.codfw.wmnet,lvs6003.drmrs.wmnet,lvs1020.eqiad.wmnet,lvs3010.esams.wmnet,lvs7003.magru.wmnet} and A:lvs (T373027) | 
  [production] | 
            
  | 12:15 | 
  <vgutierrez@cumin1002> | 
  START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs2014.codfw.wmnet,lvs6003.drmrs.wmnet,lvs1020.eqiad.wmnet,lvs3010.esams.wmnet,lvs7003.magru.wmnet} and A:lvs (T373027) | 
  [production] | 
            
  | 12:15 | 
  <root@cumin1002> | 
  START - Cookbook sre.mysql.upgrade for db2222.codfw.wmnet | 
  [production] | 
            
  | 12:14 | 
  <vgutierrez> | 
  upgrading pybal on secondary load balancers -  T373027 | 
  [production] | 
            
  | 12:14 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Depool db2222 for index rebuild', diff saved to https://phabricator.wikimedia.org/P73163 and previous config saved to /var/cache/conftool/dbconfig/20250204-121450-marostegui.json | 
  [production] | 
            
  | 12:14 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'es2040 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73162 and previous config saved to /var/cache/conftool/dbconfig/20250204-121400-root.json | 
  [production] | 
            
  | 12:13 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Repooling after maintenance db1242 (T384592)', diff saved to https://phabricator.wikimedia.org/P73161 and previous config saved to /var/cache/conftool/dbconfig/20250204-121331-marostegui.json | 
  [production] | 
            
  | 12:11 | 
  <vgutierrez@cumin1002> | 
  END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs500[4-5]*} and A:lvs (T373027) | 
  [production] | 
            
  | 12:10 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'db2220 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73160 and previous config saved to /var/cache/conftool/dbconfig/20250204-121056-root.json | 
  [production] | 
            
  | 12:10 | 
  <vgutierrez@cumin1002> | 
  START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs500[4-5]*} and A:lvs (T373027) | 
  [production] | 
            
  | 12:07 | 
  <elukey> | 
  manually executed docker-system-prune-dangling.service on build2001 | 
  [production] | 
            
  | 12:04 | 
  <elukey> | 
  manually dropped 2.5.1rocm6.2-1-20250202 on build2001 - T385531 | 
  [production] | 
            
  | 12:03 | 
  <vgutierrez> | 
  upgrading pybal on eqsin -  T373027 | 
  [production] | 
            
  | 11:59 | 
  <elukey@deploy2002> | 
  helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. | 
  [production] | 
            
  | 11:59 | 
  <elukey@deploy2002> | 
  helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. | 
  [production] |