__all__ SAL

2022-10-07
11:56	<jmm@cumin2002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1001.eqiad.wmnet with OS buster	[production]
11:51	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster	[production]
11:50	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1001.eqiad.wmnet with OS buster	[production]
11:50	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster	[production]
11:50	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1001.eqiad.wmnet with OS buster	[production]
11:50	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster	[production]
11:49	<jmm@cumin2002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1001.eqiad.wmnet with OS buster	[production]
11:33	<arturo>	rabbitmq-server.service @ cloudrabbit1002 is again up and running (T320232)	[admin]
11:27	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster	[production]
11:01	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1001.eqiad.wmnet with OS bullseye	[production]
10:49	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS bullseye	[production]
10:48	<jmm@cumin2002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1001.eqiad.wmnet with OS bullseye	[production]
10:41	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS bullseye	[production]
10:24	<arturo>	stopping rabbitmq-server.service @ cloudrabbit1002 (T320232)	[admin]
10:19	<arturo>	restarting nova-conductor in all 3 cloudcontrols (T320232)	[admin]
09:45	<arturo>	restarting rabbitmq-server.service @ cloudrabbit1002 (T320232)	[admin]
09:26	<elukey>	delete calico pods in CrashLoop on dse-k8s-codfw (probably due to the incorrect docker settings)	[production]
09:26	<elukey>	delete calico pods in CrashLoop on dse (probably due to the incorrect docker settings)	[analytics]
09:03	<arturo>	restarted nova-fullstack.service on cloudcontrol1005 (T320232)	[admin-monitoring]
08:59	<klausman@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .	[production]
08:52	<klausman@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .	[production]
08:49	<arturo>	cleaning up a bunch of leaked VMs on "BUILD" status (T320232)	[admin-monitoring]
08:44	<klausman@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .	[production]
08:43	<klausman@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .	[production]
08:39	<klausman@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .	[production]
08:37	<klausman@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .	[production]
08:36	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .	[production]
08:35	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .	[production]
08:35	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .	[production]
08:33	<klausman@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
08:32	<klausman@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
08:28	<jmm@cumin2002>	END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1029.eqiad.wmnet to cluster eqiad and group A	[production]
08:26	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti1029.eqiad.wmnet to cluster eqiad and group A	[production]
08:23	<aborrero@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudnet1004.eqiad.wmnet with OS bullseye	[production]
08:23	<aborrero@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudnet1003.eqiad.wmnet with OS bullseye	[production]
08:22	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1014.eqiad.wmnet with OS bullseye	[production]
08:22	<vgutierrez>	partition ats-be cache in cp6016 - T317748	[production]
08:21	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .	[production]
08:20	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .	[production]
08:19	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .	[production]
08:19	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .	[production]
08:19	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .	[production]
08:19	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .	[production]
08:11	<klausman@deploy1002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.	[production]
08:11	<klausman@deploy1002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.	[production]
08:07	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1014.eqiad.wmnet with reason: host reimage	[production]
08:03	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1014.eqiad.wmnet with reason: host reimage	[production]
07:54	<elukey>	re-initialize docker on dse-k8s-worker1004 - wrong storage type set (devicemapper instead of overlay2)	[production]
07:54	<elukey>	re-initialize docker on dse-k8s-worker1004 - wrong storage type set (devicemapper instead of overlay2)	[analytics]
07:50	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host ganeti1014.eqiad.wmnet with OS bullseye	[production]