2023-05-08
19:18 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs1004.eqiad.wmnet with reason: rebooting to help with lag [production]
19:12 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: rebooting to help with lag [production]
19:12 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on 19 hosts with reason: rebooting to help with lag [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1001.eqiad.wmnet [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin1001" [production]
17:31 <bking@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin1001" [production]
16:02 <bking@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=wdqs,name=codfw [production]
14:14 <bking@cumin1001> START - Cookbook sre.dns.netbox [production]
14:09 <bking@cumin1001> START - Cookbook sre.hosts.decommission for hosts an-airflow1001.eqiad.wmnet [production]
2023-05-03
23:15 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 - T335835 [production]
19:43 <bking@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 - T335835 [production]
19:37 <bking@cumin1001> END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 [production]
19:20 <inflatador> bking@cumin1001 reboot Elastic cluster for T335835 [production]
19:19 <bking@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 [production]
17:41 <inflatador> bking@cumin1001 reboot wdqs20[13-22].codfw.wmnet T335835 [production]
2023-05-02
19:40 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw [production]
19:40 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw [production]
19:18 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 05s) [production]
19:18 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:13 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 16s) [production]
19:12 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:12 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 07m 13s) [production]
19:05 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:04 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
19:04 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
18:56 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 03s) [production]
18:56 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
18:56 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 19s) [production]
18:55 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
18:50 <bking@cumin1001> conftool action : set/pooled=inactive; selector: name=wdqs2022.codfw.wmnet [production]
15:35 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
15:34 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
2023-05-01
21:08 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083]* for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083]* for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083] for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083] for row C switch upgrade - bking@cumin1001 - T334049 [production]
2023-04-25
19:48 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
19:48 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
19:48 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2006.codfw.wmnet [production]
19:48 <bking@cumin1001> START - Cookbook sre.hosts.remove-downtime for wdqs2006.codfw.wmnet [production]
19:46 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2009.codfw.wmnet [production]
19:46 <bking@cumin1001> START - Cookbook sre.hosts.remove-downtime for wdqs2009.codfw.wmnet [production]
19:46 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on wdqs2006.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
19:46 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on wdqs2006.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
19:23 <inflatador> bking@cumin1001 finishing WDQS deploy...restarting `wdqs-categories` across lvs-managed hosts [production]
18:57 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: 0.3.123 (duration: 17m 29s) [production]
18:39 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: 0.3.123 [production]
14:58 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: 0.3.123 (duration: 07m 38s) [production]
14:50 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: 0.3.123 [production]