2022-01-21
14:40 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
14:35 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase2026.codfw.wmnet with OS buster [production]
14:35 <aqu@deploy1002> Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 07s) [production]
14:35 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase1018.eqiad.wmnet with OS buster [production]
14:35 <aqu@deploy1002> Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) [production]
13:13 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2025.codfw.wmnet with OS buster [production]
13:09 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1017.eqiad.wmnet with OS buster [production]
13:07 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase1017.eqiad.wmnet [production]
13:05 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase2025.codfw.wmnet [production]
13:01 <aqu@deploy1002> Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s) [production]
13:01 <aqu@deploy1002> Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) [production]
12:26 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase1016.eqiad.wmnet [production]
12:26 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase2024.codfw.wmnet [production]
12:25 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase1017.eqiad.wmnet with OS buster [production]
12:25 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase2025.codfw.wmnet with OS buster [production]
12:13 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2024.codfw.wmnet with OS buster [production]
12:11 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1016.eqiad.wmnet with OS buster [production]
12:10 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti1025.eqiad.wmnet to ganeti01.svc.eqiad.wmnet [production]
12:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet [production]
11:56 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet [production]
11:38 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
11:38 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
11:34 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
11:34 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
11:31 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
11:31 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
11:18 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase1016.eqiad.wmnet with OS buster [production]
11:18 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase2024.codfw.wmnet with OS buster [production]
11:17 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase2023.codfw.wmnet [production]
11:15 <vgutierrez> pool cp3063 running envoy as TLS termination layer - T271421 [production]
11:14 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2023.codfw.wmnet with OS buster [production]
10:58 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3063.esams.wmnet with OS buster [production]
10:33 <moritzm> migrate primary/secondary instances off ganeti1013 [production]
10:14 <moritzm> switch kubetcd1006 back to plain disks [production]
10:14 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1006.eqiad.wmnet with reason: Switch back to plain disks [production]
10:14 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1006.eqiad.wmnet with reason: Switch back to plain disks [production]
10:09 <moritzm> switch kubetcd1005 back to plain disks [production]
10:08 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase2023.codfw.wmnet with OS buster [production]
10:07 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: Switch back to plain disks [production]
10:07 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: Switch back to plain disks [production]
09:51 <moritzm> switch kubetcd1004 back to plain disks [production]
09:50 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1004.eqiad.wmnet with reason: Switch back to plain disks [production]
09:50 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1004.eqiad.wmnet with reason: Switch back to plain disks [production]
09:41 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reimage for host cp3063.esams.wmnet with OS buster [production]
09:40 <vgutierrez@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3063.esams.wmnet with OS buster [production]
09:31 <marostegui@cumin1001> dbctl commit (dc=all): 'es1032 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18970 and previous config saved to /var/cache/conftool/dbconfig/20220121-093120-root.json [production]
09:19 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
09:19 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
09:16 <marostegui@cumin1001> dbctl commit (dc=all): 'es1032 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18969 and previous config saved to /var/cache/conftool/dbconfig/20220121-091617-root.json [production]
09:11 <ayounsi@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]