2022-01-21
15:29 <jhathaway@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on mx1001.wikimedia.org with reason: kernel testing [production]
15:25 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2026.codfw.wmnet with OS buster [production]
15:24 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase1019.eqiad.wmnet with OS buster [production]
15:24 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase1018.eqiad.wmnet [production]
15:22 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1018.eqiad.wmnet with OS buster [production]
15:07 <herron> removing kibana.discovery.wmnet record and switching legacy elk LVS instances to state: lvs_setup T299700 [production]
14:52 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . [production]
14:41 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
14:40 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
14:35 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase2026.codfw.wmnet with OS buster [production]
14:35 <aqu@deploy1002> Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 07s) [production]
14:35 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase1018.eqiad.wmnet with OS buster [production]
14:35 <aqu@deploy1002> Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) [production]
13:13 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2025.codfw.wmnet with OS buster [production]
13:09 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1017.eqiad.wmnet with OS buster [production]
13:07 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase1017.eqiad.wmnet [production]
13:05 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase2025.codfw.wmnet [production]
13:01 <aqu@deploy1002> Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s) [production]
13:01 <aqu@deploy1002> Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) [production]
12:26 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase1016.eqiad.wmnet [production]
12:26 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase2024.codfw.wmnet [production]
12:25 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase1017.eqiad.wmnet with OS buster [production]
12:25 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase2025.codfw.wmnet with OS buster [production]
12:13 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2024.codfw.wmnet with OS buster [production]
12:11 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1016.eqiad.wmnet with OS buster [production]
12:10 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti1025.eqiad.wmnet to ganeti01.svc.eqiad.wmnet [production]
12:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet [production]
11:56 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet [production]
11:38 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
11:38 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
11:34 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
11:34 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
11:31 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
11:31 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
11:18 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase1016.eqiad.wmnet with OS buster [production]
11:18 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase2024.codfw.wmnet with OS buster [production]
11:17 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=restbase2023.codfw.wmnet [production]
11:15 <vgutierrez> pool cp3063 running envoy as TLS termination layer - T271421 [production]
11:14 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2023.codfw.wmnet with OS buster [production]
10:58 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3063.esams.wmnet with OS buster [production]
10:33 <moritzm> migrate primary/secondary instances off ganeti1013 [production]
10:14 <moritzm> switch kubetcd1006 back to plain disks [production]
10:14 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1006.eqiad.wmnet with reason: Switch back to plain disks [production]
10:14 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1006.eqiad.wmnet with reason: Switch back to plain disks [production]
10:09 <moritzm> switch kubetcd1005 back to plain disks [production]
10:08 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host restbase2023.codfw.wmnet with OS buster [production]
10:07 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: Switch back to plain disks [production]
10:07 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: Switch back to plain disks [production]
09:51 <moritzm> switch kubetcd1004 back to plain disks [production]
09:50 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1004.eqiad.wmnet with reason: Switch back to plain disks [production]