2023-05-09
14:37 <bking@deploy1002> helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply [production]
14:37 <bking@deploy1002> helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply [production]
2023-05-08
20:25 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet [production]
19:52 <bking@cumin1001> START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet [production]
19:51 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2004.codfw.wmnet [production]
19:45 <bking@cumin1001> START - Cookbook sre.hosts.reboot-single for host wdqs2004.codfw.wmnet [production]
19:41 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2006.codfw.wmnet [production]
19:34 <bking@cumin1001> START - Cookbook sre.hosts.reboot-single for host wdqs2006.codfw.wmnet [production]
19:20 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs2006.codfw.wmnet with reason: rebooting to help with lag [production]
19:20 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs2006.codfw.wmnet with reason: rebooting to help with lag [production]
19:20 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2006.codfw.wmnet [production]
19:20 <bking@cumin1001> START - Cookbook sre.hosts.remove-downtime for wdqs2006.codfw.wmnet [production]
19:18 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs1004.eqiad.wmnet with reason: rebooting to help with lag [production]
19:18 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs1004.eqiad.wmnet with reason: rebooting to help with lag [production]
19:12 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: rebooting to help with lag [production]
19:12 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on 19 hosts with reason: rebooting to help with lag [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1001.eqiad.wmnet [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin1001" [production]
17:31 <bking@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin1001" [production]
16:02 <bking@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=wdqs,name=codfw [production]
14:14 <bking@cumin1001> START - Cookbook sre.dns.netbox [production]
14:09 <bking@cumin1001> START - Cookbook sre.hosts.decommission for hosts an-airflow1001.eqiad.wmnet [production]
2023-05-03
23:15 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 - T335835 [production]
19:43 <bking@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 - T335835 [production]
19:37 <bking@cumin1001> END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 [production]
19:20 <inflatador> bking@cumin1001 reboot Elastic cluster for T335835 [production]
19:19 <bking@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 [production]
17:41 <inflatador> bking@cumin1001 reboot wdqs20[13-22].codfw.wmnet T335835 [production]
2023-05-02
19:40 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw [production]
19:40 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw [production]
19:18 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 05s) [production]
19:18 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:13 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 16s) [production]
19:12 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:12 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 07m 13s) [production]
19:05 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:04 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
19:04 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
18:56 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 03s) [production]
18:56 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
18:56 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 19s) [production]
18:55 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
18:50 <bking@cumin1001> conftool action : set/pooled=inactive; selector: name=wdqs2022.codfw.wmnet [production]
15:35 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
15:34 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
2023-05-01
21:08 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083]* for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083]* for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083] for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083] for row C switch upgrade - bking@cumin1001 - T334049 [production]