2023-11-02
§
|
18:22 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.reimage for host doh1001.wikimedia.org with OS bookworm |
[production] |
18:21 |
<topranks> |
Shutting asw-b-codfw uplink to cr2-codfw down in advance of cable move (T347191) |
[production] |
18:09 |
<ebernhardson@deploy2002> |
helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
18:09 |
<ebernhardson@deploy2002> |
helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
18:07 |
<topranks> |
Making cr1-codfw VRRP Master for row A traffic again on ssw1-a1-codfw interface (T347191) |
[production] |
17:50 |
<topranks> |
Shutting asw-a-codfw uplink to cr1-codfw down in advance of cable move (T347191) |
[production] |
17:45 |
<topranks> |
Moving row A outbound traffic from direct CR link to routing via Spine (T347191) |
[production] |
17:45 |
<fnegri@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1005.eqiad.wmnet with OS bookworm |
[production] |
17:42 |
<vgutierrez> |
repool cp4051 and cp5030 |
[production] |
17:40 |
<ebernhardson@deploy2002> |
helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
17:40 |
<ebernhardson@deploy2002> |
helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
17:23 |
<vgutierrez> |
depool cp5030 |
[production] |
17:19 |
<vgutierrez> |
restart haproxy on cp4051 |
[production] |
17:14 |
<bd808@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/toolhub: apply |
[production] |
17:14 |
<fnegri@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage |
[production] |
17:13 |
<bd808@deploy2002> |
helmfile [eqiad] START helmfile.d/services/toolhub: apply |
[production] |
17:13 |
<bd808@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/toolhub: apply |
[production] |
17:12 |
<bd808@deploy2002> |
helmfile [codfw] START helmfile.d/services/toolhub: apply |
[production] |
17:11 |
<fnegri@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage |
[production] |
17:11 |
<bd808@deploy2002> |
helmfile [staging] DONE helmfile.d/services/toolhub: apply |
[production] |
17:10 |
<bd808@deploy2002> |
helmfile [staging] START helmfile.d/services/toolhub: apply |
[production] |
17:06 |
<topranks> |
shutting down uplink from asw-a-codfw et-7/0/52 to cr2-codfw et-1/0/0 (T347191) |
[production] |
17:05 |
<cmooney@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Move row A/B CR uplinks to SPINE switches |
[production] |
17:05 |
<cmooney@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Move row A/B CR uplinks to SPINE switches |
[production] |
17:02 |
<bd808@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply |
[production] |
17:01 |
<bd808@deploy2002> |
helmfile [eqiad] START helmfile.d/services/developer-portal: apply |
[production] |
17:01 |
<bd808@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/developer-portal: apply |
[production] |
17:00 |
<bd808@deploy2002> |
helmfile [codfw] START helmfile.d/services/developer-portal: apply |
[production] |
17:00 |
<bd808@deploy2002> |
helmfile [staging] DONE helmfile.d/services/developer-portal: apply |
[production] |
16:59 |
<bd808@deploy2002> |
helmfile [staging] START helmfile.d/services/developer-portal: apply |
[production] |
16:57 |
<fnegri@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudcontrol1005.eqiad.wmnet with OS bookworm |
[production] |
16:40 |
<vgutierrez> |
depool cp4051 |
[production] |
16:35 |
<otto@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply |
[production] |
16:35 |
<otto@deploy2002> |
helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply |
[production] |
16:31 |
<otto@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply |
[production] |
16:30 |
<otto@deploy2002> |
helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply |
[production] |
16:30 |
<ottomata> |
eventgate-analytics-external: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477 |
[production] |
16:30 |
<ottomata> |
eventgate-analytics in codfw: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477 |
[production] |
16:29 |
<otto@deploy2002> |
helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply |
[production] |
16:29 |
<otto@deploy2002> |
helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply |
[production] |
16:26 |
<fabfur> |
haproxy: this change https://gerrit.wikimedia.org/r/c/operations/puppet/+/971228 will be propagated soon to all cp-ulsfo hosts (T348851) |
[production] |
16:07 |
<otto@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply |
[production] |
16:06 |
<otto@deploy2002> |
helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply |
[production] |
15:57 |
<otto@deploy2002> |
helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply |
[production] |
15:57 |
<otto@deploy2002> |
helmfile [staging] START helmfile.d/services/eventgate-analytics: apply |
[production] |
15:51 |
<ottomata> |
eventgate-analytics in eqiad: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477 |
[production] |
15:50 |
<otto@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply |
[production] |
15:50 |
<otto@deploy2002> |
helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply |
[production] |
15:48 |
<sukhe> |
sudo cumin 'O:prometheus' 'run-puppet-agent' |
[production] |
15:45 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough |
[production] |