production SAL

2023-10-18
12:48	<arnaudb@cumin1001>	dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P53005 and previous config saved to /var/cache/conftool/dbconfig/20231018-124820-arnaudb.json	[production]
12:44	<kartik@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/cxserver: apply	[production]
12:44	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti-test2004.mgmt.codfw.wmnet with reboot policy FORCED	[production]
12:44	<kartik@deploy2002>	helmfile [eqiad] START helmfile.d/services/cxserver: apply	[production]
12:43	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS bullseye	[production]
12:43	<arnaudb@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2109.codfw.wmnet with reason: db2109 downtime while repooling	[production]
12:39	<arnaudb@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2109.codfw.wmnet with reason: db2109 downtime while repooling	[production]
12:38	<kartik@deploy2002>	helmfile [staging] DONE helmfile.d/services/cxserver: apply	[production]
12:37	<kartik@deploy2002>	helmfile [staging] START helmfile.d/services/cxserver: apply	[production]
12:33	<arnaudb@cumin1001>	dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P53004 and previous config saved to /var/cache/conftool/dbconfig/20231018-123333-arnaudb.json	[production]
12:33	<arnaudb@cumin1001>	dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P53003 and previous config saved to /var/cache/conftool/dbconfig/20231018-123315-arnaudb.json	[production]
12:18	<arnaudb@cumin1001>	dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P53002 and previous config saved to /var/cache/conftool/dbconfig/20231018-121828-arnaudb.json	[production]
12:18	<arnaudb@cumin1001>	dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P53001 and previous config saved to /var/cache/conftool/dbconfig/20231018-121811-arnaudb.json	[production]
12:17	<arnaudb>	repool db2161 and db1126	[production]
11:51	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1009.eqiad.wmnet	[production]
11:44	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host stat1009.eqiad.wmnet	[production]
11:43	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm	[production]
11:34	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage	[production]
11:31	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage	[production]
11:29	<hnowlan@deploy2002>	helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply	[production]
11:29	<hnowlan@deploy2002>	helmfile [codfw] START helmfile.d/services/editor-analytics: apply	[production]
11:24	<jgiannelos@deploy2002>	helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply	[production]
11:23	<jgiannelos@deploy2002>	helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply	[production]
11:21	<hnowlan@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply	[production]
11:20	<hnowlan@deploy2002>	helmfile [eqiad] START helmfile.d/services/editor-analytics: apply	[production]
11:16	<hnowlan@deploy2002>	helmfile [staging] DONE helmfile.d/services/editor-analytics: apply	[production]
11:16	<hnowlan@deploy2002>	helmfile [staging] START helmfile.d/services/editor-analytics: apply	[production]
11:14	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage	[production]
11:12	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage	[production]
11:11	<ladsgroup@deploy2002>	Finished scap: Backport for [[gerrit:966592|Set s6 and s8 to write both for pagelinks migration (T345732)]] (duration: 10m 10s)	[production]
11:08	<jbond@cumin1001>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS bullseye	[production]
11:05	<ladsgroup@deploy2002>	ladsgroup: Continuing with sync	[production]
11:02	<ladsgroup@deploy2002>	ladsgroup: Backport for [[gerrit:966592|Set s6 and s8 to write both for pagelinks migration (T345732)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
11:01	<fnegri@cumin1001>	START - Cookbook sre.hosts.reimage for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm	[production]
11:00	<ladsgroup@deploy2002>	Started scap: Backport for [[gerrit:966592|Set s6 and s8 to write both for pagelinks migration (T345732)]]	[production]
10:40	<volans>	re-enabled puppet on the cumin hosts. installed spicerack 8.0.1 on the cumin hosts	[production]
10:37	<volans@cumin2002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1001.eqiad.wmnet with OS bullseye	[production]
10:35	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1007.eqiad.wmnet	[production]
10:32	<fnegri@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm	[production]
10:28	<kevinbazira@deploy2002>	helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .	[production]
10:19	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage	[production]
10:16	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage	[production]
10:09	<volans@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet	[production]
10:07	<fnegri@cumin1001>	START - Cookbook sre.hosts.reimage for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm	[production]
10:03	<volans@cumin2002>	START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet	[production]
09:54	<volans@cumin2002>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS bullseye	[production]
09:52	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on stat1009.eqiad.wmnet with reason: Extending downtime for stat1009	[production]
09:52	<btullis@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on stat1009.eqiad.wmnet with reason: Extending downtime for stat1009	[production]
09:48	<volans@cumin2002>	END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest1001.eqiad.wmnet	[production]
09:47	<volans@cumin2002>	START - Cookbook sre.hosts.dhcp for host sretest1001.eqiad.wmnet	[production]