production SAL

2023-02-23
20:18	<bking@cumin1001>	END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)	[production]
20:16	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host relforge1004.eqiad.wmnet	[production]
20:10	<bking@cumin1001>	START - Cookbook sre.hosts.reboot-single for host relforge1004.eqiad.wmnet	[production]
20:09	<brennen@deploy1002>	Finished deploy [phabricator/deployment@3f2dd1b]: test deploy to aphlict2001, take 3 (duration: 03m 13s)	[production]
20:08	<bking@cumin1001>	START - Cookbook sre.wdqs.restart	[production]
20:06	<brennen@deploy1002>	Started deploy [phabricator/deployment@3f2dd1b]: test deploy to aphlict2001, take 3	[production]
20:05	<bking@cumin1001>	END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)	[production]
20:05	<bking@cumin1001>	START - Cookbook sre.wdqs.restart	[production]
19:58	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host relforge1003.eqiad.wmnet	[production]
19:55	<brennen@deploy1002>	Finished deploy [phabricator/deployment@3f2dd1b]: test deploy to aphlict2001, take 2 (duration: 01m 04s)	[production]
19:54	<bking@cumin1001>	START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply JRE updates - bking@cumin1001 - T329957	[production]
19:53	<brennen@deploy1002>	Started deploy [phabricator/deployment@3f2dd1b]: test deploy to aphlict2001, take 2	[production]
19:51	<brennen@deploy1002>	Finished deploy [phabricator/deployment@3f2dd1b]: test deploy to aphlict2001 (duration: 01m 10s)	[production]
19:51	<bking@cumin1001>	START - Cookbook sre.hosts.reboot-single for host relforge1003.eqiad.wmnet	[production]
19:50	<brennen@deploy1002>	Started deploy [phabricator/deployment@3f2dd1b]: test deploy to aphlict2001	[production]
19:50	<mutante>	aphlict2001 - manually created /etc/phabricator/config.yaml - empty file owned by root:phab-deploy to debug for T330393 T322369	[production]
19:46	<bd808@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply	[production]
19:45	<bd808@deploy1002>	helmfile [eqiad] START helmfile.d/services/developer-portal: apply	[production]
19:45	<bd808@deploy1002>	helmfile [codfw] DONE helmfile.d/services/developer-portal: apply	[production]
19:45	<bd808@deploy1002>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
19:45	<bd808@deploy1002>	helmfile [staging] DONE helmfile.d/services/developer-portal: apply	[production]
19:45	<bd808@deploy1002>	helmfile [staging] START helmfile.d/services/developer-portal: apply	[production]
19:20	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns5003.wikimedia.org with OS bullseye	[production]
18:53	<bking@cumin1001>	START - Cookbook sre.wdqs.data-transfer	[production]
18:53	<bking@cumin1001>	END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)	[production]
18:52	<bking@cumin1001>	START - Cookbook sre.wdqs.data-transfer	[production]
18:51	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns5003.wikimedia.org with reason: host reimage	[production]
18:50	<bking@cumin1001>	END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)	[production]
18:49	<sukhe>	run puppet agent on puppetdb2003	[production]
18:48	<sukhe@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on dns5003.wikimedia.org with reason: host reimage	[production]
18:47	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wdqs2022']	[production]
18:38	<pt1979@cumin2002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs2022']	[production]
18:38	<bd808@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/toolhub: apply	[production]
18:36	<bd808@deploy1002>	helmfile [eqiad] START helmfile.d/services/toolhub: apply	[production]
18:36	<bd808@deploy1002>	helmfile [codfw] DONE helmfile.d/services/toolhub: apply	[production]
18:36	<bking@cumin1001>	START - Cookbook sre.wdqs.data-transfer	[production]
18:35	<bd808@deploy1002>	helmfile [codfw] START helmfile.d/services/toolhub: apply	[production]
18:35	<bd808@deploy1002>	helmfile [staging] DONE helmfile.d/services/toolhub: apply	[production]
18:34	<bd808@deploy1002>	helmfile [staging] START helmfile.d/services/toolhub: apply	[production]
18:34	<dcaro@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1003.eqiad.wmnet with OS bullseye	[production]
18:24	<jbond@cumin2002>	END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wdqs2021']	[production]
18:15	<jbond@cumin2002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs2021']	[production]
18:14	<sukhe@cumin2002>	START - Cookbook sre.hosts.reimage for host dns5003.wikimedia.org with OS bullseye	[production]
18:14	<jbond@cumin2002>	END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet	[production]
18:08	<jbond@cumin2002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet	[production]
17:56	<fab@deploy1002>	Finished deploy [airflow-dags/research@5edcd7b]: (no justification provided) (duration: 00m 10s)	[production]
17:55	<fab@deploy1002>	Started deploy [airflow-dags/research@5edcd7b]: (no justification provided)	[production]
17:49	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4003.wikimedia.org with OS bullseye	[production]
17:46	<dcaro@cumin1001>	START - Cookbook sre.hosts.reimage for host cloudcephosd1003.eqiad.wmnet with OS bullseye	[production]
17:45	<fab@deploy1002>	Finished deploy [airflow-dags/research@5edcd7b]: (no justification provided) (duration: 00m 27s)	[production]