production SAL

2022-02-28
15:58	<klausman@cumin2002>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ml-etcd-staging2001	[production]
15:56	<vgutierrez>	rolling upgrade to HAProxy 2.4.14 on HAProxy caching nodes - T290005	[production]
15:54	<cmooney@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
15:53	<klausman@cumin2002>	START - Cookbook sre.hosts.decommission for hosts ml-etcd-staging2001	[production]
15:53	<klausman@cumin2002>	END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts ml-etcd-staging2001	[production]
15:52	<klausman@cumin2002>	START - Cookbook sre.hosts.decommission for hosts ml-etcd-staging2001	[production]
15:50	<elukey@cumin1001>	START - Cookbook sre.hosts.reimage for host kubernetes2021.codfw.wmnet with OS bullseye	[production]
15:48	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:46	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:44	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
15:37	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
15:33	<milimetric@deploy1002>	Finished deploy [analytics/refinery@84a0770] (hadoop-test): Add a few wikis to the sqoop list (duration: 07m 16s)	[production]
15:30	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
15:26	<milimetric@deploy1002>	Started deploy [analytics/refinery@84a0770] (hadoop-test): Add a few wikis to the sqoop list	[production]
15:25	<milimetric@deploy1002>	Finished deploy [analytics/refinery@84a0770] (thin): Add a few wikis to the sqoop list (duration: 00m 08s)	[production]
15:25	<milimetric@deploy1002>	Started deploy [analytics/refinery@84a0770] (thin): Add a few wikis to the sqoop list	[production]
15:23	<milimetric@deploy1002>	Finished deploy [analytics/refinery@84a0770]: Add a few wikis to the sqoop list (duration: 21m 18s)	[production]
15:18	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host kubernetes2020.codfw.wmnet with OS bullseye	[production]
15:07	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2020.codfw.wmnet with reason: host reimage	[production]
15:06	<ntsako@deploy1002>	Finished deploy [airflow-dags/analytics@0a2ffb8]: (no justification provided) (duration: 00m 07s)	[production]
15:06	<ntsako@deploy1002>	Started deploy [airflow-dags/analytics@0a2ffb8]: (no justification provided)	[production]
15:04	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2020.codfw.wmnet with reason: host reimage	[production]
15:02	<krinkle@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: I616f56388eee9df21e (duration: 00m 49s)	[production]
15:02	<milimetric@deploy1002>	Started deploy [analytics/refinery@84a0770]: Add a few wikis to the sqoop list	[production]
14:53	<cmooney@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
14:50	<elukey@cumin1001>	START - Cookbook sre.hosts.reimage for host kubernetes2020.codfw.wmnet with OS bullseye	[production]
14:48	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2019.codfw.wmnet with OS bullseye	[production]
14:44	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
14:43	<klausman@cumin2001>	END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ml-etcd-staging2001.codfw.wmnet	[production]
14:37	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2019.codfw.wmnet with reason: host reimage	[production]
14:35	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2019.codfw.wmnet with reason: host reimage	[production]
14:33	<klausman@cumin2001>	START - Cookbook sre.ganeti.makevm for new host ml-etcd-staging2001.codfw.wmnet	[production]
14:20	<elukey@cumin1001>	START - Cookbook sre.hosts.reimage for host kubernetes2019.codfw.wmnet with OS bullseye	[production]
14:18	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2018.codfw.wmnet with OS bullseye	[production]
14:09	<kharlan@deploy1002>	helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply	[production]
14:09	<kharlan@deploy1002>	helmfile [staging] START helmfile.d/services/linkrecommendation: apply	[production]
14:07	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2018.codfw.wmnet with reason: host reimage	[production]
14:05	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2018.codfw.wmnet with reason: host reimage	[production]
14:03	<jelto>	update gitlab-ce to 14.7.4 on all GitLab hosts	[production]
14:00	<ebysans@deploy1002>	Finished deploy [airflow-dags/analytics@75e8eb7]: (no justification provided) (duration: 00m 14s)	[production]
14:00	<kharlan@deploy1002>	helmfile [staging] START helmfile.d/services/linkrecommendation: apply	[production]
14:00	<ebysans@deploy1002>	Started deploy [airflow-dags/analytics@75e8eb7]: (no justification provided)	[production]
13:51	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1111 (T302185)', diff saved to https://phabricator.wikimedia.org/P21600 and previous config saved to /var/cache/conftool/dbconfig/20220228-135158-ladsgroup.json	[production]
13:50	<elukey@cumin1001>	START - Cookbook sre.hosts.reimage for host kubernetes2018.codfw.wmnet with OS bullseye	[production]
13:36	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P21599 and previous config saved to /var/cache/conftool/dbconfig/20220228-133653-ladsgroup.json	[production]
13:21	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P21598 and previous config saved to /var/cache/conftool/dbconfig/20220228-132148-ladsgroup.json	[production]
13:14	<moritzm>	restarting apache on puppet masters to pick up expat security update	[production]
13:06	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1111 (T302185)', diff saved to https://phabricator.wikimedia.org/P21597 and previous config saved to /var/cache/conftool/dbconfig/20220228-130644-ladsgroup.json	[production]
13:01	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1111.eqiad.wmnet with OS bullseye	[production]
12:46	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1111.eqiad.wmnet with reason: host reimage	[production]