2023-03-07 §
14:26 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
14:25 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
14:25 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
14:24 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
14:24 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
14:21 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
14:21 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
14:21 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
14:20 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
14:20 <topranks> issuing reboot to upgrade asw2-a-eqiad virtual-chassis to Junos 21.4 [production]
14:20 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
14:19 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
14:19 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1038'] [production]
14:17 <akosiaris@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubernetes1020.eqiad.wmnet with OS bullseye [production]
14:16 <cmooney@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-eqiad with reason: eqiad row A upgrade [production]
14:16 <cmooney@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mr1-eqiad with reason: eqiad row A upgrade [production]
14:15 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1037'] [production]
14:13 <akosiaris> kubectl cordon kubernetes{1005,1007,1008,1017,1018}.eqiad.wmnet T329073 [production]
14:13 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2070.codfw.wmnet with OS bullseye [production]
14:12 <mvernon@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin1001" [production]
14:09 <cmjohnson@cumin1001> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1038'] [production]
14:09 <cmooney@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 238 hosts with reason: eqiad row A upgrade [production]
14:09 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1038'] [production]
14:09 <cmjohnson@cumin1001> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1038'] [production]
14:08 <akosiaris@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: host reimage [production]
14:08 <akosiaris@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: host reimage [production]
14:07 <cmjohnson@cumin1001> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1037'] [production]
14:07 <cmooney@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on 238 hosts with reason: eqiad row A upgrade [production]
14:05 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=restbase1031.eqiad.wmnet [production]
14:05 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=restbase102[18].eqiad.wmnet [production]
14:05 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=restbase101[69].eqiad.wmnet [production]
14:02 <mvernon@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin1001" [production]
13:59 <jbond> failover pki.discovery.wmnet to codfw T329073 [production]
13:58 <Emperor> depool thanos-fe1001 T329073 [production]
13:55 <Emperor> depool ms-fe1009 T329073 [production]
13:55 <Emperor> depool moss-fe1001 T329073 [production]
13:54 <akosiaris@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1020.eqiad.wmnet with OS bullseye [production]
13:50 <moritzm> disabling Puppet in eqiad/esams/drmrs for forthcoming Switch maintenance T329073 [production]
13:50 <topranks> staging Junos files to individual VC members eqiad row A to prep for upgrade [production]
13:15 <otto@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
13:15 <otto@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
13:14 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1019.eqiad.wmnet with OS bullseye [production]
13:04 <moritzm> drain ganeti1011 for eventual reimage to Bullseye T311687 [production]
13:00 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1018.eqiad.wmnet with OS bullseye [production]
12:57 <sukhe> removing dns1001 from authdns_servers for T329073 [production]
12:55 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: host reimage [production]
12:52 <akosiaris@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: host reimage [production]
12:44 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: host reimage [production]
12:41 <akosiaris@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: host reimage [production]
12:38 <akosiaris@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1019.eqiad.wmnet with OS bullseye [production]