2020-01-07
|
22:40 |
<bstorm_> |
rebooted tools-worker-1007 to recover it from disk full and general badness |
[tools] |
16:33 |
<arturo> |
deleted pod metrics/cadvisor-5pd46 by hand because prometheus was having issues scraping it |
[tools] |
15:46 |
<bd808> |
Rebooting tools-k8s-worker-[6-14] |
[tools] |
15:35 |
<bstorm_> |
changed kubeadm-config to use a list instead of a hash for extravols on the apiserver in the new k8s cluster T242067 |
[tools] |
14:02 |
<arturo> |
`root@tools-k8s-control-3:~# wmcs-k8s-secret-for-cert -n metrics -s metrics-server-certs -a metrics-server` (T241853) |
[tools] |
13:33 |
<arturo> |
upload docker-registry.tools.wmflabs.org/coreos/kube-state-metrics:v1.8.0 copied from quay.io/coreos/kube-state-metrics:v1.8.0 (T241853) |
[tools] |
13:31 |
<arturo> |
upload docker-registry.tools.wmflabs.org/metrics-server-amd64:v0.3.6 copied from k8s.gcr.io/metrics-server-amd64:v0.3.6 (T241853) |
[tools] |
13:23 |
<arturo> |
[new k8s] doing changes to kube-state-metrics and metrics-server trying to relocate them to the 'metrics' namespace (T241853) |
[tools] |
05:28 |
<bd808> |
Creating tools-k8s-worker-[6-14] (again) |
[tools] |
05:20 |
<bd808> |
Deleting busted tools-k8s-worker-[6-14] |
[tools] |
05:02 |
<bd808> |
Creating tools-k8s-worker-[6-14] |
[tools] |
00:26 |
<bstorm_> |
repooled tools-sgewebgrid-lighttpd-0919 |
[tools] |
00:17 |
<bstorm_> |
repooled tools-sgewebgrid-lighttpd-0918 |
[tools] |
00:15 |
<bstorm_> |
moving tools-sgewebgrid-lighttpd-0918 and -0919 to cloudvirt1004 from cloudvirt1029 to rebalance load |
[tools] |
00:02 |
<bstorm_> |
depooled tools-sgewebgrid-lighttpd-0918 and 0919 to move to cloudvirt1004 to improve spread |
[tools] |
2020-01-06
|
23:40 |
<bd808> |
Deleted tools-sgewebgrid-lighttpd-09{0[1-9],10} |
[tools] |
23:36 |
<bd808> |
Shutdown tools-sgewebgrid-lighttpd-09{0[1-9],10} |
[tools] |
23:34 |
<bd808> |
Decommissioned tools-sgewebgrid-lighttpd-09{0[1-9],10} |
[tools] |
23:13 |
<bstorm_> |
Repooled tools-sgeexec-0922 because I don't know why it was depooled |
[tools] |
23:01 |
<bd808> |
Depooled tools-sgewebgrid-lighttpd-0910.tools.eqiad.wmflabs |
[tools] |
22:58 |
<bd808> |
Depooling tools-sgewebgrid-lighttpd-090[2-9] |
[tools] |
22:57 |
<bd808> |
Disabling queues on tools-sgewebgrid-lighttpd-090[2-9] |
[tools] |
21:07 |
<bd808> |
Restarted kube2proxy on tools-proxy-05 to try and refresh admin tool's routes |
[tools] |
18:54 |
<bstorm_> |
edited /etc/fstab to remove NFS and unmounted the NFS volumes on tools-k8s-haproxy-1 T241908 |
[tools] |
18:49 |
<bstorm_> |
edited /etc/fstab to remove NFS and rebooted to clear stale mounts on tools-k8s-haproxy-2 T241908 |
[tools] |
18:47 |
<bstorm_> |
added mount_nfs=false to tools-k8s-haproxy puppet prefix T241908 |
[tools] |
18:24 |
<bd808> |
Deleted shutdown instance tools-worker-1029 (was an SSSD testing instance) |
[tools] |
16:42 |
<bstorm_> |
failed sge-shadow-master back to the main grid master |
[tools] |
16:42 |
<bstorm_> |
Removed files for old S1tty that wasn't working on sge-grid-master |
[tools] |
2020-01-04
|
18:11 |
<bd808> |
Shutdown tools-worker-1029 |
[tools] |
18:10 |
<bd808> |
kubectl delete node tools-worker-1029.tools.eqiad.wmflabs |
[tools] |
18:06 |
<bd808> |
Removed tools-worker-1029.tools.eqiad.wmflabs from k8s::worker_hosts hiera in preparation for decom |
[tools] |
16:54 |
<bstorm_> |
moving VMs tools-worker-1012/1028/1005 from cloudvirt1024 to cloudvirt1003 due to hardware errors T241884 |
[tools] |
16:47 |
<bstorm_> |
moving VM tools-flannel-etcd-02 from cloudvirt1024 to cloudvirt1003 due to hardware errors T241884 |
[tools] |
16:16 |
<bd808> |
Draining tools-worker-10{05,12,28} due to hardware errors (T241884) |
[tools] |
16:13 |
<arturo> |
moving VM tools-sgewebgrid-lighttpd-0927 from cloudvirt1024 to cloudvirt1009 due to hardware errors (T241884) |
[tools] |
16:11 |
<arturo> |
moving VM tools-sgewebgrid-lighttpd-0926 from cloudvirt1024 to cloudvirt1009 due to hardware errors (T241884) |
[tools] |
16:09 |
<arturo> |
moving VM tools-sgewebgrid-lighttpd-0925 from cloudvirt1024 to cloudvirt1009 due to hardware errors (T241884) |
[tools] |
16:08 |
<arturo> |
moving VM tools-sgewebgrid-lighttpd-0924 from cloudvirt1024 to cloudvirt1009 due to hardware errors (T241884) |
[tools] |