| 
      
        2025-01-28
      
      §
     | 
  
    
  | 13:23 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2026.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 13:22 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Depooling db1194 (T384592)', diff saved to https://phabricator.wikimedia.org/P72623 and previous config saved to /var/cache/conftool/dbconfig/20250128-132238-marostegui.json | 
  [production] | 
            
  | 13:22 | 
  <marostegui@cumin1002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance | 
  [production] | 
            
  | 13:22 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Repooling after maintenance db1191 (T384592)', diff saved to https://phabricator.wikimedia.org/P72622 and previous config saved to /var/cache/conftool/dbconfig/20250128-132227-marostegui.json | 
  [production] | 
            
  | 13:22 | 
  <brouberol@deploy2002> | 
  helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply | 
  [production] | 
            
  | 13:20 | 
  <brouberol@deploy2002> | 
  helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply | 
  [production] | 
            
  | 13:19 | 
  <fceratto@dns1004> | 
  END - running authdns-update | 
  [production] | 
            
  | 13:19 | 
  <root@cumin1002> | 
  START - Cookbook sre.mysql.upgrade for db1166.eqiad.wmnet | 
  [production] | 
            
  | 13:18 | 
  <root@cumin1002> | 
  END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2190 gradually with 4 steps - Repooling after rebuild index | 
  [production] | 
            
  | 13:17 | 
  <fceratto@dns1004> | 
  START - running authdns-update | 
  [production] | 
            
  | 13:15 | 
  <dbrant@deploy2002> | 
  helmfile [codfw] DONE helmfile.d/services/mobileapps: apply | 
  [production] | 
            
  | 13:15 | 
  <dbrant@deploy2002> | 
  helmfile [codfw] START helmfile.d/services/mobileapps: apply | 
  [production] | 
            
  | 13:14 | 
  <dbrant@deploy2002> | 
  helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply | 
  [production] | 
            
  | 13:13 | 
  <dbrant@deploy2002> | 
  helmfile [eqiad] START helmfile.d/services/mobileapps: apply | 
  [production] | 
            
  | 13:13 | 
  <dbrant@deploy2002> | 
  helmfile [staging] DONE helmfile.d/services/mobileapps: apply | 
  [production] | 
            
  | 13:12 | 
  <dbrant@deploy2002> | 
  helmfile [staging] START helmfile.d/services/mobileapps: apply | 
  [production] | 
            
  | 13:07 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P72619 and previous config saved to /var/cache/conftool/dbconfig/20250128-130720-marostegui.json | 
  [production] | 
            
  | 13:07 | 
  <dbrant@deploy2002> | 
  helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply | 
  [production] | 
            
  | 13:06 | 
  <dbrant@deploy2002> | 
  helmfile [codfw] START helmfile.d/services/wikifeeds: apply | 
  [production] | 
            
  | 13:06 | 
  <dbrant@deploy2002> | 
  helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply | 
  [production] | 
            
  | 13:05 | 
  <dbrant@deploy2002> | 
  helmfile [eqiad] START helmfile.d/services/wikifeeds: apply | 
  [production] | 
            
  | 13:04 | 
  <dbrant@deploy2002> | 
  helmfile [staging] DONE helmfile.d/services/wikifeeds: apply | 
  [production] | 
            
  | 13:03 | 
  <dbrant@deploy2002> | 
  helmfile [staging] START helmfile.d/services/wikifeeds: apply | 
  [production] | 
            
  | 13:03 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 13:02 | 
  <jmm@cumin2002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti2026.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 12:52 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P72617 and previous config saved to /var/cache/conftool/dbconfig/20250128-125213-marostegui.json | 
  [production] | 
            
  | 12:51 | 
  <cmooney@cumin1002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on netflow3003.esams.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd | 
  [production] | 
            
  | 12:50 | 
  <andrew@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1003.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 12:50 | 
  <andrew@cumin1002> | 
  END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002" | 
  [production] | 
            
  | 12:50 | 
  <andrew@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1004.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 12:50 | 
  <andrew@cumin1002> | 
  END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002" | 
  [production] | 
            
  | 12:45 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 12:41 | 
  <andrew@cumin1002> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002" | 
  [production] | 
            
  | 12:39 | 
  <root@cumin1002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Index rebuild | 
  [production] | 
            
  | 12:37 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Depool db1166 T382842', diff saved to https://phabricator.wikimedia.org/P72615 and previous config saved to /var/cache/conftool/dbconfig/20250128-123713-marostegui.json | 
  [production] | 
            
  | 12:37 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Repooling after maintenance db1191 (T384592)', diff saved to https://phabricator.wikimedia.org/P72614 and previous config saved to /var/cache/conftool/dbconfig/20250128-123706-marostegui.json | 
  [production] | 
            
  | 12:32 | 
  <andrew@cumin1002> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002" | 
  [production] | 
            
  | 12:32 | 
  <root@cumin1002> | 
  START - Cookbook sre.mysql.pool db2190 gradually with 4 steps - Repooling after rebuild index | 
  [production] | 
            
  | 12:31 | 
  <elukey@deploy2002> | 
  helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. | 
  [production] | 
            
  | 12:30 | 
  <elukey@deploy2002> | 
  helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. | 
  [production] | 
            
  | 12:27 | 
  <root@cumin1002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2230.codfw.wmnet with reason: Index rebuild | 
  [production] | 
            
  | 12:27 | 
  <slyngshede@dns1004> | 
  END - running authdns-update | 
  [production] | 
            
  | 12:27 | 
  <root@cumin1002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2230.codfw.wmnet with reason: Index rebuild | 
  [production] | 
            
  | 12:25 | 
  <slyngshede@dns1004> | 
  START - running authdns-update | 
  [production] | 
            
  | 12:24 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Depooling db1191 (T384592)', diff saved to https://phabricator.wikimedia.org/P72611 and previous config saved to /var/cache/conftool/dbconfig/20250128-122428-marostegui.json | 
  [production] | 
            
  | 12:24 | 
  <marostegui@cumin1002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance | 
  [production] | 
            
  | 12:24 | 
  <marostegui@cumin1002> | 
  dbctl commit (dc=all): 'Repooling after maintenance db1174 (T384592)', diff saved to https://phabricator.wikimedia.org/P72610 and previous config saved to /var/cache/conftool/dbconfig/20250128-122406-marostegui.json | 
  [production] | 
            
  | 12:23 | 
  <andrew@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1004.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 12:22 | 
  <cmooney@cumin1002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on netflow2003.codfw.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd | 
  [production] | 
            
  | 12:19 | 
  <andrew@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1004.eqiad.wmnet with reason: host reimage | 
  [production] |