2026-03-23
14:39 <andrew@cumin2002> START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet [production]
14:38 <sukhe@dns1004> START - running authdns-update [production]
14:37 <bking@cumin2002> conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad [production]
14:36 <bking@cumin2002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) k8s-ingress-dse-aa.discovery.wmnet on all recursors [production]
14:36 <bking@cumin2002> START - Cookbook sre.dns.wipe-cache k8s-ingress-dse-aa.discovery.wmnet on all recursors [production]
14:34 <fnegri@cumin1003> conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet [production]
14:33 <fnegri@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Rebooting clouddb1019 T419960 [production]
14:33 <sukhe@dns1004> FAIL - running authdns-update [production]
14:33 <andrew@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet [production]
14:33 <fnegri@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet [production]
14:33 <fnegri@cumin1003> START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet [production]
14:32 <fnegri@cumin1003> conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet [production]
14:32 <sukhe@dns1004> START - running authdns-update [production]
14:31 <bking@cumin2002> conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=codfw [production]
14:30 <jclark@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1172.eqiad.wmnet with OS bullseye [production]
14:30 <jclark@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" [production]
14:27 <andrew@cumin2002> START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet [production]
14:22 <fnegri@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Rebooting clouddb1018 T419960 [production]
14:22 <fnegri@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet [production]
14:22 <fnegri@cumin1003> START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet [production]
14:21 <fnegri@cumin1003> conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet [production]
14:20 <jclark@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" [production]
14:17 <andrew@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet [production]
14:14 <jayme@cumin1003> END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{wikikube-worker[2332-2356].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw) [production]
14:14 <jayme@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet [production]
14:14 <jayme@cumin1003> START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet [production]
14:13 <fceratto@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: Under repair [production]
14:11 <andrew@cumin2002> START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet [production]
14:07 <jayme@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet [production]
14:04 <kamila@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2002.wikimedia.org [production]
14:04 <jayme@cumin1003> START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet [production]
14:03 <jayme@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet [production]
14:03 <jayme@cumin1003> START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet [production]
14:00 <kamila@cumin1003> START - Cookbook sre.hosts.reboot-single for host hcaptcha2002.wikimedia.org [production]
14:00 <kamila@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2001.wikimedia.org [production]
13:59 <jclark@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage [production]
13:57 <jayme@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet [production]
13:56 <kamila@cumin1003> START - Cookbook sre.hosts.reboot-single for host hcaptcha2001.wikimedia.org [production]
13:56 <kevinbazira@deploy2002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
13:55 <kamila@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1002.wikimedia.org [production]
13:55 <jclark@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage [production]
13:52 <jayme@cumin1003> START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet [production]
13:52 <jayme@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet [production]
13:52 <jayme@cumin1003> START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet [production]
13:51 <kamila@cumin1003> START - Cookbook sre.hosts.reboot-single for host hcaptcha1002.wikimedia.org [production]
13:51 <kamila@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1001.wikimedia.org [production]
13:50 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' . [production]
13:48 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' . [production]
13:47 <kamila@cumin1003> START - Cookbook sre.hosts.reboot-single for host hcaptcha1001.wikimedia.org [production]
13:47 <jayme@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet [production]