| 
      
        2025-09-21
      
     | 
  
    
  | 19:58 | 
  <wmbot~jeanfred@tools-bastion-15> | 
  Reloaded SQL table configuration for 0fe6c07 (T346681) | 
  [tools.heritage] | 
            
  | 18:40 | 
  <ryankemper> | 
  T395772 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/1189979 to fix puppet failures on deploy servers | 
  [production] | 
            
  | 18:20 | 
  <ryankemper> | 
  [WDQS] Restarted `wdqs-blazegraph` on `wdqs2009` to restore service to https://query-legacy-full.wikidata.org/ | 
  [production] | 
            
  | 18:15 | 
  <ryankemper@cumin2002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on wdqs[2009,2016].codfw.wmnet,wdqs[1018-1020].eqiad.wmnet with reason: T395772 | 
  [production] | 
            
  | 13:49 | 
  <wmbot~peterbowman@tools-bastion-14> | 
  Fix NKJP servlet and legacy SGE-era links, bump MySQL/J connector | 
  [tools.pbbot] | 
            
  | 09:17 | 
  <wmbot~dcaro@acme> | 
  END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-37, tools-k8s-worker-nfs-2 | 
  [tools] | 
            
  | 09:02 | 
  <wmbot~dcaro@acme> | 
  START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-37, tools-k8s-worker-nfs-2 | 
  [tools] | 
            
  | 03:16 | 
  <dcaro> | 
  acking and silencing CPU capacity alerts to handle on Monday, they should not page | 
  [tools] | 
            
  | 01:46 | 
  <andrew@cloudcumin1001> | 
  END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a worker role in the tools cluster | 
  [tools] | 
            
  | 01:46 | 
  <andrew@cloudcumin1001> | 
  Added a new k8s worker tools-k8s-worker-113.tools.eqiad1.wikimedia.cloud to the cluster | 
  [tools] | 
            
  | 01:36 | 
  <andrewbogott> | 
  adding additional worker node in response to repeated capacity alerts | 
  [tools] | 
            
  | 01:35 | 
  <andrew@cloudcumin1001> | 
  START - Cookbook wmcs.toolforge.add_k8s_node for a worker role in the tools cluster | 
  [tools] | 
            
  | 01:01 | 
  <mwpresync@deploy1003> | 
  Finished scap build-images: Publishing wmf/next image (duration: 01m 02s) | 
  [production] | 
            
  | 01:00 | 
  <mwpresync@deploy1003> | 
  Started scap build-images: Publishing wmf/next image | 
  [production] | 
            
  
    | 
      
        2025-09-19
      
     | 
  
    
  | 21:38 | 
  <wmbot~jeanfred@tools-bastion-15> | 
  Load altered jobs.yml so that update-monuments runs on py39 | 
  [tools.heritage] | 
            
  | 21:36 | 
  <wmbot~jeanfred@tools-bastion-15> | 
  Recreate check-emailable-users job for WLM 2025 with Py39 image | 
  [tools.heritage] | 
            
  | 21:36 | 
  <wmbot~jeanfred@tools-bastion-15> | 
  Wiped out the py37 venv and recreated a py39 one | 
  [tools.heritage] | 
            
  | 18:35 | 
  <fceratto@deploy1003> | 
  helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . | 
  [production] | 
            
  | 18:07 | 
  <cmooney@cumin1003> | 
  END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "remove sretest2009 - cmooney@cumin1003" | 
  [production] | 
            
  | 18:07 | 
  <cmooney@cumin1003> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "remove sretest2009 - cmooney@cumin1003" | 
  [production] | 
            
  | 17:59 | 
  <cmooney@cumin1003> | 
  END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | 
  [production] | 
            
  | 17:57 | 
  <cmooney@cumin1003> | 
  START - Cookbook sre.dns.netbox | 
  [production] | 
            
  | 17:56 | 
  <cmooney@cumin1003> | 
  END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts sretest2009.codfw.wmnet | 
  [production] | 
            
  | 17:56 | 
  <cmooney@cumin1003> | 
  END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | 
  [production] | 
            
  | 17:56 | 
  <cmooney@cumin1003> | 
  END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1003" | 
  [production] | 
            
  | 17:56 | 
  <cmooney@cumin1003> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1003" | 
  [production] | 
            
  | 17:51 | 
  <cmooney@cumin1003> | 
  START - Cookbook sre.dns.netbox | 
  [production] | 
            
  | 17:48 | 
  <cmooney@cumin1003> | 
  START - Cookbook sre.hosts.decommission for hosts sretest2009.codfw.wmnet | 
  [production] | 
            
  | 17:36 | 
  <cmooney@cumin1003> | 
  END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "force sync to remove sretest2009 - cmooney@cumin1003" | 
  [production] | 
            
  | 17:34 | 
  <cmooney@cumin1003> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "force sync to remove sretest2009 - cmooney@cumin1003" | 
  [production] | 
            
  | 17:16 | 
  <ladsgroup@cumin1003> | 
  dbctl commit (dc=all): 'Set s1 to RW', diff saved to https://phabricator.wikimedia.org/P83443 and previous config saved to /var/cache/conftool/dbconfig/20250919-171624-ladsgroup.json | 
  [production] | 
            
  | 17:12 | 
  <jhathaway@cumin2002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie | 
  [production] | 
            
  | 17:12 | 
  <jhathaway@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie | 
  [production] | 
            
  | 17:09 | 
  <cmooney@cumin1003> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 17:04 | 
  <taavi@cumin1003> | 
  dbctl commit (dc=all): 'set s1 ro', diff saved to https://phabricator.wikimedia.org/P83441 and previous config saved to /var/cache/conftool/dbconfig/20250919-170402-taavi.json | 
  [production] | 
            
  | 17:02 | 
  <cmooney@cumin1003> | 
  START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 16:56 | 
  <jhathaway@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART | 
  [production] | 
            
  | 16:54 | 
  <cmooney@cumin1003> | 
  END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2009.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 16:52 | 
  <jhathaway@cumin2002> | 
  START - Cookbook sre.hosts.provision for host sretest2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART | 
  [production] | 
            
  | 16:37 | 
  <wmbot~lucaswerkmeister-wmde@tools-bastion-15> | 
  deployed 7eb3c14155 (update service.template) | 
  [tools.wdmm] | 
            
  | 16:36 | 
  <wmbot~lucaswerkmeister-wmde@tools-bastion-15> | 
  deployed e352d5c66f (update overrides) | 
  [tools.wdmm] | 
            
  | 16:29 | 
  <cmooney@cumin1003> | 
  START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 16:29 | 
  <cmooney@cumin1003> | 
  END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2009.codfw.wmnet with OS bookworm | 
  [production] |