2026-03-18
16:41 <fceratto@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2007.codfw.wmnet with reason: kernel update [production]
16:40 <jynus@cumin1003> START - Cookbook sre.hosts.reboot-single for host backup2012.codfw.wmnet [production]
16:39 <brett@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3079.esams.wmnet with OS trixie [production]
16:39 <jynus@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2008.codfw.wmnet [production]
16:38 <moritzm> installing PHP 8.2 security updates [production]
16:37 <jynus@cumin1003> START - Cookbook sre.hosts.reboot-single for host backup2009.codfw.wmnet [production]
16:36 <brett@cumin2002> cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet [production]
16:35 <brett@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie [production]
16:34 <moritzm> installing alsa-lib security updates [production]
16:33 <brett@cumin2002> cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet [production]
16:32 <jynus@cumin1003> START - Cookbook sre.hosts.reboot-single for host backup2008.codfw.wmnet [production]
16:32 <sukhe@cumin1003> cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org [production]
16:29 <moritzm> failover Ganeti master in eqiad to ganeti1046 [production]
16:29 <cdobbins@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3076.esams.wmnet with reason: host reimage [production]
16:29 <jynus@cumin1003> START - Cookbook sre.hosts.reboot-single for host backup2003.codfw.wmnet [production]
16:28 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet [production]
16:28 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet [production]
16:24 <fceratto@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2005.codfw.wmnet with reason: kernel update [production]
16:24 <cdobbins@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cp3076.esams.wmnet with reason: host reimage [production]
16:24 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet [production]
16:22 <klausman@cumin1003> START - Cookbook sre.hosts.reboot-single for host ml-serve1012.eqiad.wmnet [production]
16:20 <jynus@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1013.eqiad.wmnet [production]
16:19 <klausman@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1011.eqiad.wmnet [production]
16:18 <jmm@cumin2002> START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet [production]
16:18 <btullis@cumin1003> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm [production]
16:16 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm [production]
16:16 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet [production]
16:16 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet [production]
16:14 <sukhe@cumin1003> cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org [production]
16:14 <jynus@cumin1003> START - Cookbook sre.hosts.reboot-single for host backup1013.eqiad.wmnet [production]
16:14 <jynus@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1009.eqiad.wmnet [production]
16:13 <brett@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3079.esams.wmnet with reason: host reimage [production]
16:13 <klausman@cumin1003> START - Cookbook sre.hosts.reboot-single for host ml-serve1011.eqiad.wmnet [production]
16:12 <fceratto@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1029.eqiad.wmnet with reason: kernel update [production]
16:12 <btullis@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm [production]
16:12 <btullis@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003" [production]
16:11 <btullis@cumin1003> START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm [production]
16:11 <moritzm> powercycling ganeti1053 (stuck on reboot) [production]
16:09 <brett@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage [production]
16:09 <klausman@cumin1003> END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad [production]
16:09 <klausman@cumin1003> START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad [production]
16:08 <klausman@cumin1003> END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad [production]
16:07 <jynus@cumin1003> START - Cookbook sre.hosts.reboot-single for host backup1009.eqiad.wmnet [production]
16:07 <jynus@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1003.eqiad.wmnet [production]
16:06 <brett@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cp3079.esams.wmnet with reason: host reimage [production]
16:06 <klausman@cumin1003> START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad [production]
16:06 <brett@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage [production]
16:04 <btullis@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003" [production]
16:04 <klausman@cumin1003> END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad [production]
16:04 <klausman@cumin1003> START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad [production]