2020-02-27
|
17:56 |
<bd808> |
Deleted instances tools-worker-10[21-40] |
[tools] |
16:14 |
<bd808> |
Decommissioning tools-worker-10[21-40] |
[tools] |
16:02 |
<bd808> |
Drained tools-worker-1021 |
[tools] |
15:51 |
<bd808> |
Drained tools-worker-1022 |
[tools] |
15:44 |
<bd808> |
Drained tools-worker-1023 (there is no tools-worker-1024) |
[tools] |
15:39 |
<bd808> |
Drained tools-worker-1025 |
[tools] |
15:39 |
<bd808> |
Drained tools-worker-1026 |
[tools] |
15:11 |
<bd808> |
Drained tools-worker-1027 |
[tools] |
15:09 |
<bd808> |
Drained tools-worker-1028 (there is no tools-worker-1029) |
[tools] |
15:07 |
<bd808> |
Drained tools-worker-1030 |
[tools] |
15:06 |
<bd808> |
Uncordoned tools-worker-10[16-20]. Was overoptimistic about repacking legacy Kubernetes cluster into 15 instances. Will keep 20 for now. |
[tools] |
15:00 |
<bd808> |
Drained tools-worker-1031 |
[tools] |
14:54 |
<bd808> |
Hard reboot tools-worker-1016. Direct virsh console unresponsive. Stuck in shutdown since 2020-01-22? |
[tools] |
14:44 |
<bd808> |
Uncordoned tools-worker-1009.tools.eqiad.wmflabs |
[tools] |
14:41 |
<bd808> |
Drained tools-worker-1032 |
[tools] |
14:37 |
<bd808> |
Drained tools-worker-1033 |
[tools] |
14:35 |
<bd808> |
Drained tools-worker-1034 |
[tools] |
14:34 |
<bd808> |
Drained tools-worker-1035 |
[tools] |
14:33 |
<bd808> |
Drained tools-worker-1036 |
[tools] |
14:33 |
<bd808> |
Drained tools-worker-10{39,38,37} yesterday but did not !log |
[tools] |
00:29 |
<bd808> |
Drained tools-worker-1009 for reboot (NFS flaky) |
[tools] |
00:11 |
<bd808> |
Uncordoned tools-worker-1009.tools.eqiad.wmflabs |
[tools] |
00:08 |
<bd808> |
Uncordoned tools-worker-1002.tools.eqiad.wmflabs |
[tools] |
00:02 |
<bd808> |
Rebooting tools-worker-1002 |
[tools] |
00:00 |
<bd808> |
Draining tools-worker-1002 to reboot for NFS problems |
[tools] |
2020-02-26
|
23:42 |
<bd808> |
Drained tools-worker-1040 |
[tools] |
23:41 |
<bd808> |
Cordoned tools-worker-10[16-40] in preparation for shrinking legacy Kubernetes cluster |
[tools] |
23:12 |
<bstorm_> |
replacing all tool limit-ranges in the 2020 cluster with a version that has a lower CPU request |
[tools] |
22:29 |
<bstorm_> |
deleted pod maintain-kubeusers-6d9c45f4bc-5bqq5 to deploy new image |
[tools] |
21:06 |
<bstorm_> |
deleting loads of stuck grid jobs |
[tools] |
20:27 |
<jeh> |
rebooting tools-worker-[1008,1015,1021] |
[tools] |
20:15 |
<bstorm_> |
rebooting tools-sgegrid-master because it actually had the permissions thing going on still |
[tools] |
18:03 |
<bstorm_> |
downtimed toolschecker for NFS maintenance |
[tools] |