| 2023-11-02 |
    
  | 18:22 | <sukhe@cumin2002> | START - Cookbook sre.hosts.reimage for host doh1001.wikimedia.org with OS bookworm | [production] | 
            
  | 18:21 | <topranks> | Shutting asw-b-codfw uplink to cr2-codfw down in advance of cable move (T347191) | [production] | 
            
  | 18:09 | <ebernhardson@deploy2002> | helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | [production] | 
            
  | 18:09 | <ebernhardson@deploy2002> | helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | [production] | 
            
  | 18:07 | <topranks> | Making cr1-codfw VRRP Master for row A traffic again on ssw1-a1-codfw interface (T347191) | [production] | 
            
  | 17:50 | <topranks> | Shutting asw-a-codfw uplink to cr1-codfw down in advance of cable move (T347191) | [production] | 
            
  | 17:45 | <topranks> | Moving row A outbound traffic from direct CR link to routing via Spine (T347191) | [production] | 
            
  | 17:45 | <fnegri@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1005.eqiad.wmnet with OS bookworm | [production] | 
            
  | 17:42 | <vgutierrez> | repool cp4051 and cp5030 | [production] | 
            
  | 17:40 | <ebernhardson@deploy2002> | helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | [production] | 
            
  | 17:40 | <ebernhardson@deploy2002> | helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | [production] | 
            
  | 17:23 | <vgutierrez> | depool cp5030 | [production] | 
            
  | 17:19 | <vgutierrez> | restart haproxy on cp4051 | [production] | 
            
  | 17:14 | <bd808@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/toolhub: apply | [production] | 
            
  | 17:14 | <fnegri@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 17:13 | <bd808@deploy2002> | helmfile [eqiad] START helmfile.d/services/toolhub: apply | [production] | 
            
  | 17:13 | <bd808@deploy2002> | helmfile [codfw] DONE helmfile.d/services/toolhub: apply | [production] | 
            
  | 17:12 | <bd808@deploy2002> | helmfile [codfw] START helmfile.d/services/toolhub: apply | [production] | 
            
  | 17:11 | <fnegri@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 17:11 | <bd808@deploy2002> | helmfile [staging] DONE helmfile.d/services/toolhub: apply | [production] | 
            
  | 17:10 | <bd808@deploy2002> | helmfile [staging] START helmfile.d/services/toolhub: apply | [production] | 
            
  | 17:06 | <topranks> | shutting down uplink from asw-a-codfw et-7/0/52 to cr2-codfw et-1/0/0 (T347191) | [production] | 
            
  | 17:05 | <cmooney@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Move row A/B CR uplinks to SPINE switches | [production] | 
            
  | 17:05 | <cmooney@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Move row A/B CR uplinks to SPINE switches | [production] | 
            
  | 17:02 | <bd808@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply | [production] | 
            
  | 17:01 | <bd808@deploy2002> | helmfile [eqiad] START helmfile.d/services/developer-portal: apply | [production] | 
            
  | 17:01 | <bd808@deploy2002> | helmfile [codfw] DONE helmfile.d/services/developer-portal: apply | [production] | 
            
  | 17:00 | <bd808@deploy2002> | helmfile [codfw] START helmfile.d/services/developer-portal: apply | [production] | 
            
  | 17:00 | <bd808@deploy2002> | helmfile [staging] DONE helmfile.d/services/developer-portal: apply | [production] | 
            
  | 16:59 | <bd808@deploy2002> | helmfile [staging] START helmfile.d/services/developer-portal: apply | [production] | 
            
  | 16:57 | <fnegri@cumin1001> | START - Cookbook sre.hosts.reimage for host cloudcontrol1005.eqiad.wmnet with OS bookworm | [production] | 
            
  | 16:40 | <vgutierrez> | depool cp4051 | [production] | 
            
  | 16:35 | <otto@deploy2002> | helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply | [production] | 
            
  | 16:35 | <otto@deploy2002> | helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply | [production] | 
            
  | 16:31 | <otto@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply | [production] | 
            
  | 16:30 | <otto@deploy2002> | helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply | [production] | 
            
  | 16:30 | <ottomata> | eventgate-analytics-external:  setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes.   Should reduce throttling and perhaps help with latency.  If works, will make this the default in the chart. - T347477 | [production] | 
            
  | 16:30 | <ottomata> | eventgate-analytics in codfw:  setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes.   Should reduce throttling and perhaps help with latency.  If works, will make this the default in the chart. - T347477 | [production] | 
            
  | 16:29 | <otto@deploy2002> | helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply | [production] | 
            
  | 16:29 | <otto@deploy2002> | helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply | [production] | 
            
  | 16:26 | <fabfur> | haproxy: this change https://gerrit.wikimedia.org/r/c/operations/puppet/+/971228 will be propagated soon to all cp-ulsfo hosts (T348851) | [production] | 
            
  | 16:07 | <otto@deploy2002> | helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply | [production] | 
            
  | 16:06 | <otto@deploy2002> | helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply | [production] | 
            
  | 15:57 | <otto@deploy2002> | helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply | [production] | 
            
  | 15:57 | <otto@deploy2002> | helmfile [staging] START helmfile.d/services/eventgate-analytics: apply | [production] | 
            
  | 15:51 | <ottomata> | eventgate-analytics in eqiad:  setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes.   Should reduce throttling and perhaps help with latency.  If works, will make this the default in the chart. - T347477 | [production] | 
            
  | 15:50 | <otto@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply | [production] | 
            
  | 15:50 | <otto@deploy2002> | helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply | [production] | 
            
  | 15:48 | <sukhe> | sudo cumin 'O:prometheus' 'run-puppet-agent' | [production] | 
            
  | 15:45 | <sukhe@cumin2002> | END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough | [production] |