  | 2025-07-07 |
  | 10:09 | <root@cumin1002> | START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe | [production] | 
            
  | 09:58 | <elukey@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 09:43 | <elukey@cumin2002> | START - Cookbook sre.hosts.reimage for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 09:42 | <elukey@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 09:25 | <elukey@cumin2002> | START - Cookbook sre.hosts.reimage for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 09:21 | <marostegui@cumin1002> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1250.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 09:18 | <elukey@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 09:13 | <marostegui> | Failover m2 from db1250 to db1228 - T397633 | [production] | 
            
  | 09:09 | <elukey@cumin2002> | START - Cookbook sre.hosts.reimage for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 09:06 | <marostegui@cumin1002> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2160,2233].codfw.wmnet,db[1217,1228,1250].eqiad.wmnet with reason: maintenance | [production] | 
            
  | 08:15 | <marostegui@cumin1002> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1237.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 08:01 | <oblivian@cumin1003> | END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Feature: logging of deny actions; add rename functionality - oblivian@cumin1003" | [production] | 
            
  | 08:01 | <oblivian@cumin1003> | END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Feature: logging of deny actions; add rename functionality - oblivian@cumin1003 | [production] | 
            
  | 08:00 | <oblivian@cumin1003> | START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Feature: logging of deny actions; add rename functionality - oblivian@cumin1003 | [production] | 
            
  | 08:00 | <oblivian@cumin1003> | START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Feature: logging of deny actions; add rename functionality - oblivian@cumin1003" | [production] | 
            
  | 08:00 | <marostegui@cumin1002> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1237.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 07:53 | <vgutierrez> | repooling cp7006 with Ia82b9354a5b9e7bd5443b4af0888325919ddb19e applied - T397917 | [production] | 
            
  | 07:53 | <marostegui@cumin1002> | dbctl commit (dc=all): 'Depool db1237 T397612', diff saved to https://phabricator.wikimedia.org/P78763 and previous config saved to /var/cache/conftool/dbconfig/20250707-075308-root.json | [production] | 
            
  | 07:52 | <marostegui@cumin1002> | dbctl commit (dc=all): 'Promote db1220 to x1 primary and set section read-write T397612', diff saved to https://phabricator.wikimedia.org/P78762 and previous config saved to /var/cache/conftool/dbconfig/20250707-075254-root.json | [production] | 
            
  | 07:51 | <marostegui@dns1006> | END - running authdns-update | [production] | 
            
  | 07:50 | <marostegui@dns1006> | START - running authdns-update | [production] | 
            
  | 07:25 | <vgutierrez> | depooling cp7006 to test Ia82b9354a5b9e7bd5443b4af0888325919ddb19e - T397917 | [production] | 
            
  | 07:25 | <marostegui> | Starting x1 eqiad failover from db1237 to db1220 - T397612 | [production] | 
            
  | 07:21 | <marostegui@cumin1002> | dbctl commit (dc=all): 'Set db1220 with weight 0 T397612', diff saved to https://phabricator.wikimedia.org/P78760 and previous config saved to /var/cache/conftool/dbconfig/20250707-072157-root.json | [production] | 
            
  | 07:13 | <marostegui@cumin1002> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 15 hosts with reason: Primary switchover x1 T397612 | [production] | 
            
  | 07:11 | <brouberol@deploy1003> | helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply | [production] | 
            
  | 07:10 | <brouberol@deploy1003> | helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply | [production] | 
            
  | 07:04 | <vgutierrez> | testing haproxy 2.8.15 in cp5017 and cp5025 - T398720 | [production] | 
            
  | 06:29 | <brouberol@deploy1003> | helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply | [production] | 
            
  | 06:29 | <brouberol@deploy1003> | helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply | [production] | 
            
  
  | 2025-07-04 |
  | 21:39 | <krinkle@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1166438|beta: Change loginwiki/metawiki/auth canonical to beta.wmcloud.org (T289318)]] (duration: 18m 12s) | [production] | 
            
  | 21:33 | <krinkle@deploy1003> | krinkle: Continuing with sync | [production] | 
            
  | 21:23 | <krinkle@deploy1003> | krinkle: Backport for [[gerrit:1166438|beta: Change loginwiki/metawiki/auth canonical to beta.wmcloud.org (T289318)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 21:21 | <krinkle@deploy1003> | Started scap sync-world: Backport for [[gerrit:1166438|beta: Change loginwiki/metawiki/auth canonical to beta.wmcloud.org (T289318)]] | [production] | 
            
  | 20:32 | <krinkle@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1165989|beta: Include allowance for wmcloud.org in wgGraphAllowedDomains (T289318)]], [[gerrit:1165999|beta: Change Beta wikidata canonical to beta.wmcloud.org (T289318)]] (duration: 94m 52s) | [production] | 
            
  | 20:26 | <krinkle@deploy1003> | krinkle: Continuing with sync | [production] | 
            
  | 18:59 | <krinkle@deploy1003> | krinkle: Backport for [[gerrit:1165989|beta: Include allowance for wmcloud.org in wgGraphAllowedDomains (T289318)]], [[gerrit:1165999|beta: Change Beta wikidata canonical to beta.wmcloud.org (T289318)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 18:57 | <krinkle@deploy1003> | Started scap sync-world: Backport for [[gerrit:1165989|beta: Include allowance for wmcloud.org in wgGraphAllowedDomains (T289318)]], [[gerrit:1165999|beta: Change Beta wikidata canonical to beta.wmcloud.org (T289318)]] | [production] | 
            
  | 15:14 | <vgutierrez> | fetch haproxy 2.8.15 on thirdparty/haproxy28 component for bullseye-wikimedia (apt.wm.o) | [production] | 
            
  | 14:46 | <elukey@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bullseye | [production] | 
            
  | 14:40 | <stevemunene@cumin1002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1179.eqiad.wmnet with OS bullseye | [production] | 
            
  | 14:36 | <elukey@cumin2002> | START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bullseye | [production] | 
            
  | 14:29 | <elukey@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 14:20 | <vgutierrez> | repooling cp7006 | [production] | 
            
  | 14:20 | <elukey@cumin2002> | START - Cookbook sre.hosts.reimage for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 14:12 | <vgutierrez> | depooling cp7006 for testing purposes | [production] | 
            
  | 14:09 | <elukey@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 14:06 | <stevemunene@cumin1002> | START - Cookbook sre.hosts.reimage for host an-worker1179.eqiad.wmnet with OS bullseye | [production] | 
            
  | 14:01 | <elukey@cumin2002> | START - Cookbook sre.hosts.reimage for host sretest2006.codfw.wmnet with OS bookworm | [production] | 
            
  | 13:15 | <elukey@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2006.codfw.wmnet with OS bookworm | [production] |