2026-03-23
14:58 <bking@cumin2002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-test.discovery.wmnet on all recursors [production]
14:58 <bking@cumin2002> START - Cookbook sre.dns.wipe-cache opensearch-test.discovery.wmnet on all recursors [production]
14:57 <fnegri@cumin1003> conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet [production]
14:57 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:56 <sukhe@dns1004> START - running authdns-update [production]
14:56 <sukhe@dns1004> END - running authdns-update [production]
14:56 <fnegri@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Rebooting clouddb1020 T419960 [production]
14:56 <fnegri@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1019.eqiad.wmnet [production]
14:55 <fnegri@cumin1003> START - Cookbook sre.hosts.remove-downtime for clouddb1019.eqiad.wmnet [production]
14:55 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
14:55 <fnegri@cumin1003> conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet [production]
14:55 <sukhe@dns1004> START - running authdns-update [production]
14:55 <eevans@cumin1003> START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-eqiad [production]
14:52 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:52 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
14:51 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:51 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
14:50 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:50 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
14:50 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:50 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
14:49 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:49 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
14:49 <sukhe@dns1004> END - running authdns-update [production]
14:48 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:48 <sukhe@dns1004> START - running authdns-update [production]
14:47 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
14:46 <andrew@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet [production]
14:45 <bking@cumin2002> conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad [production]
14:44 <sukhe@dns1004> END - running authdns-update [production]
14:43 <sukhe@dns1004> START - running authdns-update [production]
14:40 <sukhe@dns1004> FAIL - running authdns-update [production]
14:39 <andrew@cumin2002> START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet [production]
14:38 <sukhe@dns1004> START - running authdns-update [production]
14:37 <bking@cumin2002> conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad [production]
14:36 <bking@cumin2002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) k8s-ingress-dse-aa.discovery.wmnet on all recursors [production]
14:36 <bking@cumin2002> START - Cookbook sre.dns.wipe-cache k8s-ingress-dse-aa.discovery.wmnet on all recursors [production]
14:34 <fnegri@cumin1003> conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet [production]
14:33 <fnegri@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Rebooting clouddb1019 T419960 [production]
14:33 <sukhe@dns1004> FAIL - running authdns-update [production]
14:33 <andrew@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet [production]
14:33 <fnegri@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet [production]
14:33 <fnegri@cumin1003> START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet [production]
14:32 <fnegri@cumin1003> conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet [production]
14:32 <sukhe@dns1004> START - running authdns-update [production]
14:31 <bking@cumin2002> conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=codfw [production]
14:30 <jclark@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1172.eqiad.wmnet with OS bullseye [production]
14:30 <jclark@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" [production]
14:27 <andrew@cumin2002> START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet [production]
14:22 <fnegri@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Rebooting clouddb1018 T419960 [production]