2020-04-06
19:05 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) |
[production] |
19:03 |
<elukey@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
19:00 |
<elukey@cumin1001> |
END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) |
[production] |
18:58 |
<elukey@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
18:57 |
<elukey@cumin1001> |
END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) |
[production] |
18:51 |
<elukey@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
18:42 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) |
[production] |
16:54 |
<elukey@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
15:04 |
<elukey@cumin1001> |
END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) |
[production] |
14:09 |
<elukey@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
14:07 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) |
[production] |
14:07 |
<elukey@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
13:26 |
<elukey> |
reboot stat1008 as test to verify ROCm 3.3 upgrades |
[production] |
13:22 |
<elukey> |
stat1008 upgraded to ROCm 3.3 (enables TensorFlow 2.x) |
[production] |
11:52 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) |
[production] |
11:48 |
<elukey@cumin1001> |
START - Cookbook sre.aqs.roll-restart |
[production] |
11:18 |
<elukey> |
import AMD ROCm 3.3 packages in buster-wikimedia (component thirdparty/rocm33) - T247082 |
[production] |
08:54 |
<elukey> |
bootstrap wdqs200[7,8] - T246343 |
[production] |
07:35 |
<elukey> |
restart elasticsearch_6@cloudelastic-chi-eqiad on cloudelastic1003 as attempt to fix heavy GC runs (old gen) - T231517 |
[production] |