2021-01-22
22:41 |
<reedy@deploy1001> |
Synchronized invalid.json: (no justification provided) (duration: 00m 58s) |
[production]
20:07 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1268.eqiad.wmnet with reason: REIMAGE |
[production] |
20:05 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1268.eqiad.wmnet with reason: REIMAGE |
[production] |
20:05 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2330.codfw.wmnet with reason: REIMAGE |
[production] |
20:05 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2332.codfw.wmnet with reason: REIMAGE |
[production] |
20:03 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2328.codfw.wmnet with reason: REIMAGE |
[production] |
20:01 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1413.eqiad.wmnet with reason: REIMAGE |
[production] |
20:01 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2334.codfw.wmnet with reason: REIMAGE |
[production] |
20:00 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1413.eqiad.wmnet with reason: REIMAGE |
[production] |
20:00 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2328.codfw.wmnet with reason: REIMAGE |
[production] |
20:00 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2330.codfw.wmnet with reason: REIMAGE |
[production] |
20:00 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2332.codfw.wmnet with reason: REIMAGE |
[production] |
19:59 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2334.codfw.wmnet with reason: REIMAGE |
[production] |
19:39 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2356.codfw.wmnet |
[production] |
19:38 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2354.codfw.wmnet |
[production] |
19:38 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2352.codfw.wmnet |
[production] |
19:36 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2350.codfw.wmnet |
[production] |
19:35 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2352.codfw.wmnet |
[production] |
19:35 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2350.codfw.wmnet |
[production] |
19:35 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2354.codfw.wmnet |
[production] |
19:34 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2356.codfw.wmnet |
[production] |
19:15 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2350.codfw.wmnet with reason: REIMAGE |
[production] |
19:13 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2352.codfw.wmnet with reason: REIMAGE |
[production] |
19:11 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2354.codfw.wmnet with reason: REIMAGE |
[production] |
19:10 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2350.codfw.wmnet with reason: REIMAGE |
[production] |
19:09 |
<mutante> |
releases1002 systemctl reset-failed |
[production] |
19:09 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2356.codfw.wmnet with reason: REIMAGE |
[production] |
19:09 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2352.codfw.wmnet with reason: REIMAGE |
[production] |
19:08 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2354.codfw.wmnet with reason: REIMAGE |
[production] |
19:07 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2356.codfw.wmnet with reason: REIMAGE |
[production] |
18:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2364.codfw.wmnet |
[production] |
18:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2362.codfw.wmnet |
[production] |
18:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2360.codfw.wmnet |
[production] |
18:46 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2358.codfw.wmnet |
[production] |
18:46 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2362.codfw.wmnet |
[production] |
18:46 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2364.codfw.wmnet |
[production] |
18:45 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2360.codfw.wmnet |
[production] |
18:45 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2358.codfw.wmnet |
[production] |
18:17 |
<mutante> |
releases2002 - rebooting to confirm works now and also new disk gets auto-mounted |
[production] |
18:03 |
<mutante> |
releases1002 - replaced ens5 with ens6 in /etc/network/interfaces and rebooted again |
[production] |
18:01 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on releases1002.eqiad.wmnet with reason: fixing networking - added disk |
[production] |
18:01 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on releases1002.eqiad.wmnet with reason: fixing networking - added disk |
[production] |
17:59 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw2360.codfw.wmnet with reason: new install on buster |
[production] |
17:59 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on mw2360.codfw.wmnet with reason: new install on buster |
[production] |
17:57 |
<mutante> |
releases1002 (releases.wm.org active backend) - rebooting - hopefully it does not run into T272555 but if it does now it's known how to fix |
[production] |
17:55 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2364.codfw.wmnet with reason: REIMAGE |
[production] |
17:54 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2360.codfw.wmnet with reason: REIMAGE |
[production] |
17:53 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2358.codfw.wmnet with reason: REIMAGE |
[production] |
17:52 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2362.codfw.wmnet with reason: REIMAGE |
[production] |