2024-06-05
11:39 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2002.codfw.wmnet [production]
11:39 <hnowlan@cumin1002> conftool action : set/weight=10:pooled=yes; selector: name=(wikikube-worker1008.eqiad.wmnet|wikikube-worker1009.eqiad.wmnet|wikikube-worker1010.eqiad.wmnet|wikikube-worker1011.eqiad.wmnet|wikikube-worker1012.eqiad.wmnet),cluster=kubernetes,service=kubesvc [production]
11:38 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netbox-dev2002.codfw.wmnet [production]
11:38 <mvernon@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1061.eqiad.wmnet [production]
11:38 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2060.codfw.wmnet [production]
11:37 <jiji@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1031.eqiad.wmnet with OS bullseye [production]
11:36 <jmm@cumin2002> START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad [production]
11:31 <hnowlan> running homer to configure bgp on 5 new k8s workers [production]
11:31 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1011.eqiad.wmnet with OS bullseye [production]
11:30 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2060.codfw.wmnet [production]
11:30 <mvernon@cumin1002> START - Cookbook sre.hosts.reboot-single for host ms-be1061.eqiad.wmnet [production]
11:27 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1009.eqiad.wmnet with OS bullseye [production]
11:21 <jiji@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1031.eqiad.wmnet with reason: host reimage [production]
11:17 <jiji@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1031.eqiad.wmnet with reason: host reimage [production]
11:12 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1011.eqiad.wmnet with reason: host reimage [production]
11:09 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1009.eqiad.wmnet with reason: host reimage [production]
11:06 <hnowlan@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1011.eqiad.wmnet with reason: host reimage [production]
11:06 <hnowlan@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1009.eqiad.wmnet with reason: host reimage [production]
11:06 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2059.codfw.wmnet [production]
11:03 <jiji@cumin1002> START - Cookbook sre.hosts.reimage for host cloudcephosd1031.eqiad.wmnet with OS bullseye [production]
11:03 <claime> restarted send_tile_invalidations.service on maps1009 [production]
11:03 <ladsgroup@cumin1002> dbctl commit (dc=all): 'db1184 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P64098 and previous config saved to /var/cache/conftool/dbconfig/20240605-110303-ladsgroup.json [production]
10:59 <jmm@cumin2002> END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw [production]
10:54 <mvernon@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1060.eqiad.wmnet [production]
10:54 <marostegui@cumin1002> dbctl commit (dc=all): 'db1227 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P64097 and previous config saved to /var/cache/conftool/dbconfig/20240605-105400-root.json [production]
10:53 <hnowlan@cumin1002> START - Cookbook sre.hosts.reimage for host wikikube-worker1011.eqiad.wmnet with OS bullseye [production]
10:53 <hnowlan@cumin1002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1011.eqiad.wmnet with OS bullseye [production]
10:53 <hnowlan@cumin1002> START - Cookbook sre.hosts.reimage for host wikikube-worker1009.eqiad.wmnet with OS bullseye [production]
10:52 <hnowlan@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1009.eqiad.wmnet with OS bullseye [production]
10:52 <jmm@cumin2002> START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw [production]
10:50 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2059.codfw.wmnet [production]
10:50 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2058.codfw.wmnet [production]
10:47 <ladsgroup@cumin1002> dbctl commit (dc=all): 'db1184 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P64096 and previous config saved to /var/cache/conftool/dbconfig/20240605-104757-ladsgroup.json [production]
10:46 <mvernon@cumin1002> START - Cookbook sre.hosts.reboot-single for host ms-be1060.eqiad.wmnet [production]
10:46 <mvernon@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1059.eqiad.wmnet [production]
10:42 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2058.codfw.wmnet [production]
10:40 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2057.codfw.wmnet [production]
10:39 <klausman@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet [production]
10:38 <marostegui@cumin1002> dbctl commit (dc=all): 'db1227 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P64094 and previous config saved to /var/cache/conftool/dbconfig/20240605-103854-root.json [production]
10:37 <klausman@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet [production]
10:37 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1012.eqiad.wmnet with OS bullseye [production]
10:35 <klausman@cumin2002> END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw [production]
10:34 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1010.eqiad.wmnet with OS bullseye [production]
10:32 <ladsgroup@cumin1002> dbctl commit (dc=all): 'db1184 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P64093 and previous config saved to /var/cache/conftool/dbconfig/20240605-103251-ladsgroup.json [production]
10:32 <mvernon@cumin1002> START - Cookbook sre.hosts.reboot-single for host ms-be1059.eqiad.wmnet [production]
10:32 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2057.codfw.wmnet [production]
10:31 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2056.codfw.wmnet [production]
10:30 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1008.eqiad.wmnet with OS bullseye [production]
10:30 <mvernon@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1058.eqiad.wmnet [production]
10:27 <jmm@cumin2002> END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox [production]