2021-03-02
ยง
|
11:40 |
<jayme@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host kubestagemaster2001.codfw.wmnet |
[production] |
11:16 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm for new host kubemaster2002.codfw.wmnet |
[production] |
11:16 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm for new host kubemaster2001.codfw.wmnet |
[production] |
11:12 |
<hnowlan@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
11:11 |
<hnowlan@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' . |
[production] |
10:37 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1028.eqiad.wmnet |
[production] |
10:31 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host mc1028.eqiad.wmnet |
[production] |
10:29 |
<effie> |
upgrade memcached on mc2024, mc1028 |
[production] |
10:21 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1119.eqiad.wmnet |
[production] |
10:18 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1119.eqiad.wmnet |
[production] |
10:12 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1119.eqiad.wmnet with reason: REIMAGE |
[production] |
10:09 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1119.eqiad.wmnet with reason: REIMAGE |
[production] |
10:05 |
<volans@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE |
[production] |
10:03 |
<volans@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE |
[production] |
09:54 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1130-1131].eqiad.wmnet |
[production] |
09:52 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1130-1131].eqiad.wmnet |
[production] |
09:46 |
<liw@deploy1002> |
Finished scap: testwikis wikis to 1.36.0-wmf.33 (duration: 36m 20s) |
[production] |
09:43 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1124-1128].eqiad.wmnet |
[production] |
09:41 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1124-1128].eqiad.wmnet |
[production] |
09:39 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1120-1123].eqiad.wmnet |
[production] |
09:37 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1120-1123].eqiad.wmnet |
[production] |
09:36 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1119.eqiad.wmnet |
[production] |
09:33 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1119.eqiad.wmnet |
[production] |
09:12 |
<liw@deploy1002> |
Started scap: testwikis wikis to 1.36.0-wmf.33 |
[production] |
08:58 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1131.eqiad.wmnet with reason: REIMAGE |
[production] |
08:56 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1130.eqiad.wmnet with reason: REIMAGE |
[production] |
08:56 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1131.eqiad.wmnet with reason: REIMAGE |
[production] |
08:54 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1130.eqiad.wmnet with reason: REIMAGE |
[production] |
08:54 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1128.eqiad.wmnet with reason: REIMAGE |
[production] |
08:53 |
<vgutierrez> |
rolling restart of ats-tls on ulsfo |
[production] |
08:52 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1128.eqiad.wmnet with reason: REIMAGE |
[production] |
08:39 |
<kharlan@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' . |
[production] |
08:30 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1127.eqiad.wmnet with reason: REIMAGE |
[production] |
08:28 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1126.eqiad.wmnet with reason: REIMAGE |
[production] |
08:27 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1127.eqiad.wmnet with reason: REIMAGE |
[production] |
08:25 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1126.eqiad.wmnet with reason: REIMAGE |
[production] |
08:25 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1125.eqiad.wmnet with reason: REIMAGE |
[production] |
08:23 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1125.eqiad.wmnet with reason: REIMAGE |
[production] |
08:00 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1124.eqiad.wmnet with reason: REIMAGE |
[production] |
07:59 |
<liw> |
1.36.0-wmf.33 was branched at 800e1f8cea169fc9c6e72ac1dc197591a06701bd for T274937 |
[production] |
07:58 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1123.eqiad.wmnet with reason: REIMAGE |
[production] |
07:58 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1124.eqiad.wmnet with reason: REIMAGE |
[production] |
07:56 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1122.eqiad.wmnet with reason: REIMAGE |
[production] |
07:56 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1123.eqiad.wmnet with reason: REIMAGE |
[production] |
07:54 |
<godog> |
swift eqiad-prod: add weight to ms-be106[0-3] - T268435 |
[production] |
07:54 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1122.eqiad.wmnet with reason: REIMAGE |
[production] |
07:28 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1121.eqiad.wmnet with reason: REIMAGE |
[production] |
07:27 |
<ryankemper> |
Pooled `elastic106[0,4]` (Noticed I never re-pooled these hosts after resolving an incident last week) |
[production] |
07:26 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1121.eqiad.wmnet with reason: REIMAGE |
[production] |
07:26 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1120.eqiad.wmnet with reason: REIMAGE |
[production] |