2025-05-27 §
10:33 <jmm@cumin1003> START - Cookbook sre.hosts.reimage for host ganeti7002.magru.wmnet with OS bookworm [production]
10:30 <elukey@cumin1002> START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet [production]
10:28 <klausman@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet [production]
10:28 <klausman@cumin1003> START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet [production]
10:27 <klausman@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1010.eqiad.wmnet with OS bookworm [production]
10:25 <jgiannelos@deploy1003> helmfile [staging] DONE helmfile.d/services/changeprop: apply [production]
10:25 <jgiannelos@deploy1003> helmfile [staging] START helmfile.d/services/changeprop: apply [production]
10:19 <jgiannelos@deploy1003> helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply [production]
10:18 <jgiannelos@deploy1003> helmfile [eqiad] START helmfile.d/services/mobileapps: apply [production]
10:18 <jgiannelos@deploy1003> helmfile [codfw] DONE helmfile.d/services/mobileapps: apply [production]
10:18 <jgiannelos@deploy1003> helmfile [codfw] START helmfile.d/services/mobileapps: apply [production]
10:15 <jgiannelos@deploy1003> helmfile [staging] DONE helmfile.d/services/mobileapps: apply [production]
10:15 <taavi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2008-dev.codfw.wmnet with reason: host reimage [production]
10:15 <jgiannelos@deploy1003> helmfile [staging] START helmfile.d/services/mobileapps: apply [production]
10:14 <stevemunene@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-worker1148.eqiad.wmnet with reason: Upgrade an-worker hard drives from 4TB to 8TB group 5 - rack F1 [production]
10:14 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply [production]
10:13 <stevemunene@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-worker1155.eqiad.wmnet with reason: Upgrade an-worker hard drives from 4TB to 8TB group 5 - rack F1 [production]
10:13 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply [production]
10:11 <klausman@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1010.eqiad.wmnet with reason: host reimage [production]
10:08 <taavi@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2008-dev.codfw.wmnet with reason: host reimage [production]
10:08 <klausman@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1010.eqiad.wmnet with reason: host reimage [production]
09:49 <taavi@cumin1002> START - Cookbook sre.hosts.reimage for host cloudnet2008-dev.codfw.wmnet with OS bookworm [production]
09:47 <taavi@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2007-dev.codfw.wmnet with OS bookworm [production]
09:39 <brouberol@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet [production]
09:36 <klausman@cumin1003> START - Cookbook sre.hosts.reimage for host ml-serve1010.eqiad.wmnet with OS bookworm [production]
09:34 <klausman@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet [production]
09:34 <klausman@cumin1003> START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet [production]
09:33 <brouberol@cumin2002> START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet [production]
09:29 <taavi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage [production]
09:27 <klausman@cumin1003> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet [production]
09:25 <taavi@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage [production]
09:22 <klausman@cumin1003> START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet [production]
09:05 <taavi@cumin1002> START - Cookbook sre.hosts.reimage for host cloudnet2007-dev.codfw.wmnet with OS bookworm [production]
09:00 <klausman@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1011.eqiad.wmnet with OS bookworm [production]
08:55 <moritzm> remove ganeti7002 from the magru02 cluster T394263 [production]
08:46 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9002 [production]
08:46 <ayounsi@cumin1002> START - Cookbook sre.network.peering with action 'email' for AS: 9002 [production]
08:46 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34141 [production]
08:46 <ayounsi@cumin1002> START - Cookbook sre.network.peering with action 'email' for AS: 34141 [production]
08:45 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 29208 [production]
08:45 <ayounsi@cumin1002> START - Cookbook sre.network.peering with action 'email' for AS: 29208 [production]
08:45 <klausman@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1011.eqiad.wmnet with reason: host reimage [production]
08:44 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 28598 [production]
08:44 <ayounsi@cumin1002> START - Cookbook sre.network.peering with action 'email' for AS: 28598 [production]
08:43 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 20857 [production]
08:43 <ayounsi@cumin1002> START - Cookbook sre.network.peering with action 'email' for AS: 20857 [production]
08:42 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16347 [production]
08:42 <klausman@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1011.eqiad.wmnet with reason: host reimage [production]
08:41 <ayounsi@cumin1002> START - Cookbook sre.network.peering with action 'email' for AS: 16347 [production]
08:40 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 59605 [production]