2023-11-02
18:52 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh1001.wikimedia.org with OS bookworm [production]
18:46 <topranks> shutting down uplink from asw-b-codfw et-2/0/51 to cr1-codfw in advance of cable move (T347191) [production]
18:44 <topranks> Making cr2-codfw VRRP Master for row B traffic over new link from ssw1-a8-codfw (T347191) [production]
18:35 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh1001.wikimedia.org with reason: host reimage [production]
18:32 <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on doh1001.wikimedia.org with reason: host reimage [production]
18:22 <dduvall@deploy2002> rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.3 refs T348356 [production]
18:22 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host doh1001.wikimedia.org with OS bookworm [production]
18:21 <topranks> Shutting asw-b-codfw uplink to cr2-codfw down in advance of cable move (T347191) [production]
18:09 <ebernhardson@deploy2002> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
18:09 <ebernhardson@deploy2002> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
18:07 <topranks> Making cr1-codfw VRRP Master for row A traffic again on ssw1-a1-codfw interface (T347191) [production]
17:50 <topranks> Shutting asw-a-codfw uplink to cr1-codfw down in advance of cable move (T347191) [production]
17:45 <topranks> Moving row A outbound traffic from direct CR link to routing via Spinie (T347191) [production]
17:45 <fnegri@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1005.eqiad.wmnet with OS bookworm [production]
17:42 <vgutierrez> repool cp4051 and cp5030 [production]
17:40 <ebernhardson@deploy2002> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
17:40 <ebernhardson@deploy2002> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
17:23 <vgutierrez> depool cp5030 [production]
17:19 <vgutierrez> restart haproxy on cp4051 [production]
17:14 <bd808@deploy2002> helmfile [eqiad] DONE helmfile.d/services/toolhub: apply [production]
17:14 <fnegri@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage [production]
17:13 <bd808@deploy2002> helmfile [eqiad] START helmfile.d/services/toolhub: apply [production]
17:13 <bd808@deploy2002> helmfile [codfw] DONE helmfile.d/services/toolhub: apply [production]
17:12 <bd808@deploy2002> helmfile [codfw] START helmfile.d/services/toolhub: apply [production]
17:11 <fnegri@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage [production]
17:11 <bd808@deploy2002> helmfile [staging] DONE helmfile.d/services/toolhub: apply [production]
17:10 <bd808@deploy2002> helmfile [staging] START helmfile.d/services/toolhub: apply [production]
17:06 <topranks> shutting down uplink from asw-a-codfw et-7/0/52 to cr2-codfw et-1/0/0 (T347191) [production]
17:05 <cmooney@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Move row A/B CR uplinks to SPINE switches [production]
17:05 <cmooney@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Move row A/B CR uplinks to SPINE switches [production]
17:02 <bd808@deploy2002> helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply [production]
17:01 <bd808@deploy2002> helmfile [eqiad] START helmfile.d/services/developer-portal: apply [production]
17:01 <bd808@deploy2002> helmfile [codfw] DONE helmfile.d/services/developer-portal: apply [production]
17:00 <bd808@deploy2002> helmfile [codfw] START helmfile.d/services/developer-portal: apply [production]
17:00 <bd808@deploy2002> helmfile [staging] DONE helmfile.d/services/developer-portal: apply [production]
16:59 <bd808@deploy2002> helmfile [staging] START helmfile.d/services/developer-portal: apply [production]
16:57 <fnegri@cumin1001> START - Cookbook sre.hosts.reimage for host cloudcontrol1005.eqiad.wmnet with OS bookworm [production]
16:40 <vgutierrez> depool cp4051 [production]
16:35 <otto@deploy2002> helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply [production]
16:35 <otto@deploy2002> helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply [production]
16:31 <otto@deploy2002> helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply [production]
16:30 <otto@deploy2002> helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply [production]
16:30 <ottomata> eventgate-analytics-external: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477 [production]
16:30 <ottomata> eventgate-analytics in codfw: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477 [production]
16:29 <otto@deploy2002> helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply [production]
16:29 <otto@deploy2002> helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply [production]
16:26 <fabfur> haproxy: this change https://gerrit.wikimedia.org/r/c/operations/puppet/+/971228 will be propagated soon to all cp-ulsfo hosts (T348851) [production]
16:07 <otto@deploy2002> helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply [production]
16:06 <otto@deploy2002> helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply [production]
15:57 <otto@deploy2002> helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply [production]