production SAL

2023-11-02
18:07	<topranks>	Making cr1-codfw VRRP Master for row A traffic again on ssw1-a1-codfw interface (T347191)	[production]
17:50	<topranks>	Shutting asw-a-codfw uplink to cr1-codfw down in advance of cable move (T347191)	[production]
17:45	<topranks>	Moving row A outbound traffic from direct CR link to routing via Spine (T347191)	[production]
17:45	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1005.eqiad.wmnet with OS bookworm	[production]
17:42	<vgutierrez>	repool cp4051 and cp5030	[production]
17:40	<ebernhardson@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
17:40	<ebernhardson@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
17:23	<vgutierrez>	depool cp5030	[production]
17:19	<vgutierrez>	restart haproxy on cp4051	[production]
17:14	<bd808@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/toolhub: apply	[production]
17:14	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage	[production]
17:13	<bd808@deploy2002>	helmfile [eqiad] START helmfile.d/services/toolhub: apply	[production]
17:13	<bd808@deploy2002>	helmfile [codfw] DONE helmfile.d/services/toolhub: apply	[production]
17:12	<bd808@deploy2002>	helmfile [codfw] START helmfile.d/services/toolhub: apply	[production]
17:11	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage	[production]
17:11	<bd808@deploy2002>	helmfile [staging] DONE helmfile.d/services/toolhub: apply	[production]
17:10	<bd808@deploy2002>	helmfile [staging] START helmfile.d/services/toolhub: apply	[production]
17:06	<topranks>	shutting down uplink from asw-a-codfw et-7/0/52 to cr2-codfw et-1/0/0 (T347191)	[production]
17:05	<cmooney@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Move row A/B CR uplinks to SPINE switches	[production]
17:05	<cmooney@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Move row A/B CR uplinks to SPINE switches	[production]
17:02	<bd808@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply	[production]
17:01	<bd808@deploy2002>	helmfile [eqiad] START helmfile.d/services/developer-portal: apply	[production]
17:01	<bd808@deploy2002>	helmfile [codfw] DONE helmfile.d/services/developer-portal: apply	[production]
17:00	<bd808@deploy2002>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
17:00	<bd808@deploy2002>	helmfile [staging] DONE helmfile.d/services/developer-portal: apply	[production]
16:59	<bd808@deploy2002>	helmfile [staging] START helmfile.d/services/developer-portal: apply	[production]
16:57	<fnegri@cumin1001>	START - Cookbook sre.hosts.reimage for host cloudcontrol1005.eqiad.wmnet with OS bookworm	[production]
16:40	<vgutierrez>	depool cp4051	[production]
16:35	<otto@deploy2002>	helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply	[production]
16:35	<otto@deploy2002>	helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply	[production]
16:31	<otto@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply	[production]
16:30	<otto@deploy2002>	helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply	[production]
16:30	<ottomata>	eventgate-analytics-external: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477	[production]
16:30	<ottomata>	eventgate-analytics in codfw: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477	[production]
16:29	<otto@deploy2002>	helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply	[production]
16:29	<otto@deploy2002>	helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply	[production]
16:26	<fabfur>	haproxy: this change https://gerrit.wikimedia.org/r/c/operations/puppet/+/971228 will be propagated soon to all cp-ulsfo hosts (T348851)	[production]
16:07	<otto@deploy2002>	helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply	[production]
16:06	<otto@deploy2002>	helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply	[production]
15:57	<otto@deploy2002>	helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply	[production]
15:57	<otto@deploy2002>	helmfile [staging] START helmfile.d/services/eventgate-analytics: apply	[production]
15:51	<ottomata>	eventgate-analytics in eqiad: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477	[production]
15:50	<otto@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply	[production]
15:50	<otto@deploy2002>	helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply	[production]
15:48	<sukhe>	sudo cumin 'O:prometheus' 'run-puppet-agent'	[production]
15:45	<sukhe@cumin2002>	END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough	[production]
15:40	<fabfur>	cp4037 repooling with changes for dedicated healthcheck backend (haproxy): https://gerrit.wikimedia.org/r/c/operations/puppet/+/966221/ (T348851)	[production]
15:34	<otto@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply	[production]
15:34	<otto@deploy2002>	helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply	[production]
15:27	<otto@deploy2002>	helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply	[production]