production SAL

2023-10-18
09:52	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on stat1009.eqiad.wmnet with reason: Extending downtime for stat1009	[production]
09:52	<btullis@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on stat1009.eqiad.wmnet with reason: Extending downtime for stat1009	[production]
09:48	<volans@cumin2002>	END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest1001.eqiad.wmnet	[production]
09:47	<volans@cumin2002>	START - Cookbook sre.hosts.dhcp for host sretest1001.eqiad.wmnet	[production]
09:25	<volans>	uploaded spicerack_8.0.1 to apt.wikimedia.org bullseye-wikimedia	[production]
09:23	<jayme@deploy1002>	helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply	[production]
09:23	<jynus>	aborting backup of es1022, es1025 (there was already another backup running)	[production]
09:23	<fnegri@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm	[production]
09:22	<jayme@deploy1002>	helmfile [codfw] START helmfile.d/services/wikifunctions: apply	[production]
09:21	<jynus>	starting new backup of es1022, es1025 (new clusters only)	[production]
09:20	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1006.eqiad.wmnet	[production]
09:20	<jayme@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply	[production]
09:19	<jayme@deploy1002>	helmfile [eqiad] START helmfile.d/services/wikifunctions: apply	[production]
09:17	<jayme@deploy1002>	helmfile [staging] DONE helmfile.d/services/wikifunctions: apply	[production]
09:17	<jayme@deploy1002>	helmfile [staging] START helmfile.d/services/wikifunctions: apply	[production]
09:17	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on stat1009.eqiad.wmnet with reason: Moving /home to /srv/home on stat1009 and rebooting	[production]
09:16	<btullis@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on stat1009.eqiad.wmnet with reason: Moving /home to /srv/home on stat1009 and rebooting	[production]
09:14	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host stat1007.eqiad.wmnet	[production]
09:13	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host stat1006.eqiad.wmnet	[production]
09:13	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1004.eqiad.wmnet	[production]
09:10	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage	[production]
09:06	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage	[production]
09:05	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host stat1004.eqiad.wmnet	[production]
09:02	<aqu@deploy2002>	Finished deploy [airflow-dags/analytics@c17c91c]: Fix following yesterday weekly train deploy - Second try [airflow-dags@c17c91ce] (duration: 00m 06s)	[production]
09:02	<aqu@deploy2002>	Started deploy [airflow-dags/analytics@c17c91c]: Fix following yesterday weekly train deploy - Second try [airflow-dags@c17c91ce]	[production]
09:01	<aqu@deploy2002>	deploy aborted: Fix following yesterday weekly train deploy [airflow-dags@c17c91ce] (duration: 01m 10s)	[production]
09:00	<aqu@deploy2002>	Started deploy [airflow-dags/analytics@c17c91c]: Fix following yesterday weekly train deploy [airflow-dags@c17c91ce]	[production]
08:54	<fnegri@cumin1001>	START - Cookbook sre.hosts.reimage for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm	[production]
08:51	<jayme@deploy1002>	helmfile [staging] DONE helmfile.d/services/wikifunctions: apply	[production]
08:40	<jayme@deploy1002>	helmfile [staging] START helmfile.d/services/wikifunctions: apply	[production]
08:18	<volans@cumin2002>	END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest1001.eqiad.wmnet	[production]
08:14	<volans@cumin2002>	START - Cookbook sre.hosts.dhcp for host sretest1001.eqiad.wmnet	[production]
08:08	<volans@cumin2002>	END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest1001.eqiad.wmnet	[production]
08:06	<volans@cumin2002>	START - Cookbook sre.hosts.dhcp for host sretest1001.eqiad.wmnet	[production]
08:03	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
08:03	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add A/PTR for lsw1-e8/ssw links - ayounsi@cumin1001"	[production]
08:02	<ayounsi@cumin1001>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add A/PTR for lsw1-e8/ssw links - ayounsi@cumin1001"	[production]
07:54	<kevinbazira@deploy2002>	helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .	[production]
07:47	<ayounsi@cumin1001>	START - Cookbook sre.dns.netbox	[production]
07:46	<marostegui@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2132.codfw.wmnet with OS bookworm	[production]
07:40	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2132.codfw.wmnet with reason: host reimage	[production]
07:37	<volans>	temporarily disabled puppet on the A:cumin hosts to deploy and test spicerack v8.0.0	[production]
07:37	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db2132.codfw.wmnet with reason: host reimage	[production]
07:28	<filippo@deploy2002>	helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply	[production]
07:28	<filippo@deploy2002>	helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply	[production]
07:28	<filippo@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply	[production]
07:28	<filippo@deploy2002>	helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply	[production]
07:27	<filippo@deploy2002>	helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply	[production]
07:27	<filippo@deploy2002>	helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply	[production]
07:20	<marostegui@cumin1001>	START - Cookbook sre.hosts.reimage for host db2132.codfw.wmnet with OS bookworm	[production]