| 2023-12-07 |
  | 18:05 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 18:05 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 18:04 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1165 (T343198)', diff saved to https://phabricator.wikimedia.org/P54277 and previous config saved to /var/cache/conftool/dbconfig/20231207-180427-ladsgroup.json | [production] | 
            
  | 18:04 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 18:04 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 18:04 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 18:03 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 17:58 | <bking@cumin1001> | END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs1024.eqiad.wmnet | [production] | 
            
  | 17:57 | <bking@cumin1001> | START - Cookbook sre.hosts.remove-downtime for wdqs1024.eqiad.wmnet | [production] | 
            
  | 17:40 | <hnowlan@deploy2002> | helmfile [codfw] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 17:40 | <hnowlan@deploy2002> | helmfile [codfw] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 17:39 | <hnowlan@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 17:38 | <hnowlan@deploy2002> | helmfile [eqiad] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 17:23 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cephosd2002.codfw.wmnet with OS bullseye | [production] | 
            
  | 17:09 | <herron@cumin1001> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 17:09 | <herron@cumin1001> | END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cleanup logstash/kibana records T299700 - herron@cumin1001" | [production] | 
            
  | 17:08 | <herron@cumin1001> | START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cleanup logstash/kibana records T299700 - herron@cumin1001" | [production] | 
            
  | 17:05 | <herron@cumin1001> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 16:45 | <arnaudb@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 16:44 | <arnaudb@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 16:44 | <arnaudb@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 16:43 | <arnaudb@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 16:39 | <jhancock@cumin2002> | START - Cookbook sre.hosts.reimage for host cephosd2002.codfw.wmnet with OS bullseye | [production] | 
            
  | 16:39 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cephosd2002.codfw.wmnet with OS bullseye | [production] | 
            
  | 16:38 | <brouberol@cumin1001> | START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. | [production] | 
            
  | 16:27 | <hnowlan@deploy2002> | helmfile [staging] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:27 | <hnowlan@deploy2002> | helmfile [staging] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:26 | <hnowlan@deploy2002> | helmfile [staging] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:26 | <hnowlan@deploy2002> | helmfile [staging] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:25 | <hnowlan@deploy2002> | helmfile [staging] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:24 | <hnowlan@deploy2002> | helmfile [staging] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:24 | <hnowlan@deploy2002> | helmfile [staging] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:23 | <hnowlan@deploy2002> | helmfile [staging] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:09 | <hnowlan@deploy2002> | helmfile [staging] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:09 | <hnowlan@deploy2002> | helmfile [staging] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 16:02 | <sukhe> | run dummy authdns-update on dns6001 | [production] | 
            
  | 16:00 | <milimetric@deploy2002> | Finished deploy [analytics/refinery@8b8f178] (thin): hotfix: sqoop (duration: 00m 07s) | [production] | 
            
  | 16:00 | <milimetric@deploy2002> | Started deploy [analytics/refinery@8b8f178] (thin): hotfix: sqoop | [production] | 
            
  | 15:57 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db2104 (T348183)', diff saved to https://phabricator.wikimedia.org/P54274 and previous config saved to /var/cache/conftool/dbconfig/20231207-155712-arnaudb.json | [production] | 
            
  | 15:55 | <milimetric@deploy2002> | Finished deploy [analytics/refinery@8b8f178]: hotfix: sqoop (duration: 10m 08s) | [production] | 
            
  | 15:53 | <sukhe> | running authdns-update with broken resolv.conf on dns6001 | [production] | 
            
  | 15:48 | <sukhe> | clear out dns6001 resolv.conf to check for SSH config-based authdns-update | [production] | 
            
  | 15:45 | <milimetric@deploy2002> | Started deploy [analytics/refinery@8b8f178]: hotfix: sqoop | [production] | 
            
  | 15:45 | <klausman@deploy2002> | helmfile [staging] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 15:44 | <klausman@deploy2002> | helmfile [staging] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 15:44 | <klausman@deploy2002> | helmfile [staging] DONE helmfile.d/services/api-gateway: apply | [production] | 
            
  | 15:44 | <klausman@deploy2002> | helmfile [staging] START helmfile.d/services/api-gateway: apply | [production] | 
            
  | 15:42 | <arnaudb@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54273 and previous config saved to /var/cache/conftool/dbconfig/20231207-154205-arnaudb.json | [production] | 
            
  | 15:37 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sessionstore2006.codfw.wmnet with OS bullseye | [production] | 
            
  | 15:36 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sessionstore2005.codfw.wmnet with OS bullseye | [production] |