2023-03-23 §
11:57 <hnowlan@deploy2002> helmfile [eqiad] START helmfile.d/services/thumbor: apply [production]
11:54 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-druid1001.eqiad.wmnet with reason: host reimage [production]
11:52 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2004.codfw.wmnet with OS bullseye [production]
11:51 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-druid1001.eqiad.wmnet with reason: host reimage [production]
11:47 <vgutierrez> rolling rollback to HAProxy 2.6.9 in cache upload cluster - T332796 [production]
11:36 <btullis@cumin1001> START - Cookbook sre.ganeti.reimage for host an-test-druid1001.eqiad.wmnet with OS bullseye [production]
11:32 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2004.codfw.wmnet with reason: host reimage [production]
11:27 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2004.codfw.wmnet with reason: host reimage [production]
11:26 <hnowlan@deploy2002> helmfile [eqiad] DONE helmfile.d/services/thumbor: apply [production]
11:16 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host irc2002.wikimedia.org with OS bullseye [production]
11:15 <hnowlan@deploy2002> helmfile [eqiad] START helmfile.d/services/thumbor: apply [production]
11:15 <hnowlan@deploy2002> helmfile [eqiad] DONE helmfile.d/services/thumbor: apply [production]
11:08 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-main2004.codfw.wmnet with OS bullseye [production]
11:07 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2004.codfw.wmnet with reason: stop kafka and reimage [production]
11:06 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2004.codfw.wmnet with reason: stop kafka and reimage [production]
11:05 <hnowlan@deploy2002> helmfile [eqiad] START helmfile.d/services/thumbor: apply [production]
11:05 <hnowlan@deploy2002> helmfile [staging] DONE helmfile.d/services/thumbor: apply [production]
11:04 <hnowlan@deploy2002> helmfile [staging] START helmfile.d/services/thumbor: apply [production]
11:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on irc2002.wikimedia.org with reason: host reimage [production]
10:56 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on irc2002.wikimedia.org with reason: host reimage [production]
10:44 <jmm@cumin2002> START - Cookbook sre.ganeti.reimage for host irc2002.wikimedia.org with OS bullseye [production]
10:41 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host irc2002.wikimedia.org [production]
10:38 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2005.codfw.wmnet with OS bullseye [production]
10:21 <jmm@cumin2002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) irc2002.wikimedia.org on all recursors [production]
10:21 <jmm@cumin2002> START - Cookbook sre.dns.wipe-cache irc2002.wikimedia.org on all recursors [production]
10:21 <jmm@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
10:21 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM irc2002.wikimedia.org - jmm@cumin2002" [production]
10:18 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2005.codfw.wmnet with reason: host reimage [production]
10:15 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2005.codfw.wmnet with reason: host reimage [production]
10:10 <jmm@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM irc2002.wikimedia.org - jmm@cumin2002" [production]
10:08 <jmm@cumin2002> START - Cookbook sre.dns.netbox [production]
10:08 <jmm@cumin2002> START - Cookbook sre.ganeti.makevm for new host irc2002.wikimedia.org [production]
10:01 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-main2005.codfw.wmnet with OS bullseye [production]
09:57 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2005.codfw.wmnet with reason: stop kafka and reimage [production]
09:57 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2005.codfw.wmnet with reason: stop kafka and reimage [production]
09:47 <moritzm> uploaded prometheus-druid-exporter 0.8-2 for bullseye-wikimedia T332584 T332589 [production]
08:20 <elukey> clean up docker and reboot kubernetes2024 to enable overlay2 - T332803 [production]
08:11 <vgutierrez> testing HAProxy 2.6.11 in cp4044 - T332796 [production]
08:08 <vgutierrez> fetch haproxy 2.6.11 in apt.wm.o thirdparty/haproxy26 for bullseye & buster [production]
08:04 <vgutierrez> rolling rollback to HAProxy 2.6.9 in cache text cluster - T332796 [production]
07:54 <elukey> clean up docker and reboot kubernetes2023 to enable overlay2 - T332803 [production]
07:50 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubernetes2023.codfw.wmnet with reason: Restart docker with overlay [production]
07:49 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on kubernetes2023.codfw.wmnet with reason: Restart docker with overlay [production]
07:49 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubernetes2024.codfw.wmnet with reason: Restart docker with overlay [production]
07:49 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on kubernetes2024.codfw.wmnet with reason: Restart docker with overlay [production]
07:42 <elukey> clean up docker on kubernetes1024 (cordon + stop kubelet + docker + clean /var/lib/docker/*) and reboot to enable overlay2 - T332803 [production]
07:38 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubernetes1024.eqiad.wmnet with reason: Restart docker with overlay [production]
07:37 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on kubernetes1024.eqiad.wmnet with reason: Restart docker with overlay [production]
07:23 <marostegui@cumin1001> dbctl commit (dc=all): 'es2029 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45928 and previous config saved to /var/cache/conftool/dbconfig/20230323-072315-root.json [production]
07:08 <marostegui@cumin1001> dbctl commit (dc=all): 'es2029 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45927 and previous config saved to /var/cache/conftool/dbconfig/20230323-070811-root.json [production]