production SAL

2023-09-19
09:56	<stevemunene@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1147.eqiad.wmnet with reason: host reimage	[production]
09:51	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1134 (re)pooling @ 3%: Repooling after recloning db1128', diff saved to https://phabricator.wikimedia.org/P52524 and previous config saved to /var/cache/conftool/dbconfig/20230919-095127-root.json	[production]
09:48	<slyngshede@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idm2001.wikimedia.org with OS bookworm	[production]
09:42	<stevemunene@cumin1001>	START - Cookbook sre.hosts.reimage for host an-worker1147.eqiad.wmnet with OS bullseye	[production]
09:40	<btullis@cumin1001>	START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons.	[production]
09:36	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1134 (re)pooling @ 1%: Repooling after recloning db1128', diff saved to https://phabricator.wikimedia.org/P52523 and previous config saved to /var/cache/conftool/dbconfig/20230919-093622-root.json	[production]
09:12	<elukey@cumin1001>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-etcd2001.codfw.wmnet	[production]
09:08	<elukey@cumin1001>	START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-etcd2001.codfw.wmnet	[production]
09:03	<elukey@cumin1001>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-etcd2002.codfw.wmnet	[production]
08:59	<elukey@cumin1001>	START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-etcd2002.codfw.wmnet	[production]
08:47	<elukey@cumin1001>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-etcd2003.codfw.wmnet	[production]
08:44	<godog>	bounce benthos@webrequest_live to clear out old metrics	[production]
08:43	<elukey@cumin1001>	START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-etcd2003.codfw.wmnet	[production]
08:41	<godog>	remove MediaWiki..growthexperiments.taskcount.link_recommendation. from graphite - T346371	[production]
08:39	<jmm@cumin2002>	END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad	[production]
08:36	<stevemunene@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1146.eqiad.wmnet with OS bullseye	[production]
08:34	<jmm@cumin2002>	START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad	[production]
08:30	<jmm@cumin2002>	END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw	[production]
08:29	<slyngshede@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idm2001.wikimedia.org with reason: host reimage	[production]
08:26	<brouberol@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply	[production]
08:26	<brouberol@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply	[production]
08:26	<slyngshede@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on idm2001.wikimedia.org with reason: host reimage	[production]
08:26	<brouberol>	redeploying mw-page-content-change-enrich in codfw T336041	[production]
08:26	<brouberol@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply	[production]
08:25	<brouberol@deploy1002>	helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply	[production]
08:25	<jmm@cumin2002>	START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw	[production]
08:25	<brouberol>	redeploying mw-page-content-change-enrich in eqiad T336041	[production]
08:24	<brouberol@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply	[production]
08:24	<brouberol@deploy1002>	helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply	[production]
08:24	<brouberol>	redeploying eventstreams-internal in eqiad T336041	[production]
08:23	<brouberol@deploy1002>	helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply	[production]
08:23	<brouberol@deploy1002>	helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply	[production]
08:23	<brouberol>	redeploying eventstreams-internal in codfw T336041	[production]
08:22	<brouberol@deploy1002>	helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply	[production]
08:21	<brouberol@deploy1002>	helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply	[production]
08:21	<brouberol>	redeploying eventstream-analytics-external in codfw T336041	[production]
08:21	<brouberol@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply	[production]
08:20	<brouberol@deploy1002>	helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply	[production]
08:20	<brouberol>	redeploying eventstream-analytics-external in eqiad T336041	[production]
08:19	<brouberol@deploy1002>	helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply	[production]
08:18	<brouberol@deploy1002>	helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply	[production]
08:18	<brouberol>	redeploying eventstream-analytics in codfw T336041	[production]
08:18	<brouberol@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply	[production]
08:17	<brouberol@deploy1002>	helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply	[production]
08:13	<stevemunene@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1146.eqiad.wmnet with reason: host reimage	[production]
08:11	<slyngshede@cumin1001>	START - Cookbook sre.hosts.reimage for host idm2001.wikimedia.org with OS bookworm	[production]
08:10	<stevemunene@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1146.eqiad.wmnet with reason: host reimage	[production]
08:05	<brouberol@deploy1002>	helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply	[production]
08:05	<brouberol@deploy1002>	helmfile [staging] START helmfile.d/services/eventstreams-internal: apply	[production]
08:05	<moritzm>	restarting FPM on mw canaries to pick up libwebp updates	[production]