2023-05-09
14:37 <bking@deploy1002> helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply [production]
14:37 <bking@deploy1002> helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply [production]
2023-05-08
20:25 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet [production]
19:52 <bking@cumin1001> START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet [production]
19:51 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2004.codfw.wmnet [production]
19:45 <bking@cumin1001> START - Cookbook sre.hosts.reboot-single for host wdqs2004.codfw.wmnet [production]
19:41 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2006.codfw.wmnet [production]
19:34 <bking@cumin1001> START - Cookbook sre.hosts.reboot-single for host wdqs2006.codfw.wmnet [production]
19:20 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs2006.codfw.wmnet with reason: rebooting to help with lag [production]
19:20 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs2006.codfw.wmnet with reason: rebooting to help with lag [production]
19:20 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2006.codfw.wmnet [production]
19:20 <bking@cumin1001> START - Cookbook sre.hosts.remove-downtime for wdqs2006.codfw.wmnet [production]
19:18 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs1004.eqiad.wmnet with reason: rebooting to help with lag [production]
19:18 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs1004.eqiad.wmnet with reason: rebooting to help with lag [production]
19:12 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: rebooting to help with lag [production]
19:12 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on 19 hosts with reason: rebooting to help with lag [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1001.eqiad.wmnet [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:36 <bking@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin1001" [production]
17:31 <bking@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin1001" [production]
16:02 <bking@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=wdqs,name=codfw [production]
14:14 <bking@cumin1001> START - Cookbook sre.dns.netbox [production]
14:09 <bking@cumin1001> START - Cookbook sre.hosts.decommission for hosts an-airflow1001.eqiad.wmnet [production]
2023-05-03
23:15 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 - T335835 [production]
19:43 <bking@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 - T335835 [production]
19:37 <bking@cumin1001> END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 [production]
19:20 <inflatador> bking@cumin1001 reboot Elastic cluster for T335835 [production]
19:19 <bking@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot - bking@cumin1001 [production]
17:41 <inflatador> bking@cumin1001 reboot wdqs20[13-22].codfw.wmnet T335835 [production]
2023-05-02
19:40 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw [production]
19:40 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw [production]
19:18 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 05s) [production]
19:18 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:13 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 16s) [production]
19:12 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:12 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 07m 13s) [production]
19:05 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
19:04 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
19:04 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
18:56 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 03s) [production]
18:56 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
18:56 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: (no justification provided) (duration: 00m 19s) [production]
18:55 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: (no justification provided) [production]
18:50 <bking@cumin1001> conftool action : set/pooled=inactive; selector: name=wdqs2022.codfw.wmnet [production]
15:35 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
15:34 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye [production]
2023-05-01
21:08 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083]* for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083]* for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083] for row C switch upgrade - bking@cumin1001 - T334049 [production]
21:08 <bking@cumin1001> START - Cookbook sre.elasticsearch.ban Banning hosts: elastic[2045-2048,2059,2065-2066,2071,2081-2083] for row C switch upgrade - bking@cumin1001 - T334049 [production]