2018-03-06
|
16:15 |
<madhuvishy> |
Reboot tools-docker-registry-02 T189018 |
[tools] |
15:50 |
<madhuvishy> |
Rebooting tools-worker-1011 |
[tools] |
15:08 |
<chasemp> |
tools-k8s-master-01:~# kubectl uncordon tools-worker-1011.tools.eqiad.wmflabs |
[tools] |
15:03 |
<arturo> |
drain and reboot tools-worker-1011 |
[tools] |
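The drain, reboot, and uncordon entries for tools-worker-1011 can be sketched as one cycle. This is a minimal sketch, not the procedure the operators actually ran: the `cycle_worker` and `run` helpers, the `DOMAIN` default, and the ssh-based reboot are assumptions made for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of a drain/reboot/uncordon cycle for one k8s worker.
# DOMAIN and the ssh-based reboot are assumptions, not taken from the log.
DOMAIN="${DOMAIN:-tools.eqiad.wmflabs}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, anything else = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

cycle_worker() {
    node="$1.$DOMAIN"
    # Evict pods and mark the node unschedulable.
    run kubectl drain "$node" --ignore-daemonsets
    # Hypothetical reboot step; the log does not say how the reboot was issued.
    run ssh "$node" sudo reboot
    # In a real run, wait for the node to come back up before this step.
    run kubectl uncordon "$node"
}
```

Calling `cycle_worker tools-worker-1011` with `DRY_RUN=1` prints the three commands so they can be reviewed before anything is executed.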
15:03 |
<chasemp> |
rebooted tools-worker-100[1-8] |
[tools] |
14:58 |
<arturo> |
drain and reboot tools-worker-1010 |
[tools] |
14:27 |
<chasemp> |
multiple tools running on k8s workers currently report issues reading the replica.my.cnf file |
[tools] |
14:27 |
<chasemp> |
reboot tools-worker-100[12] |
[tools] |
14:23 |
<chasemp> |
downtime icinga alert for k8s workers ready |
[tools] |
13:21 |
<arturo> |
T188994 on some servers there was a race for the dpkg lock between apt-upgrade and puppet. I also forgot to set DEBIAN_FRONTEND=noninteractive, so debconf prompts appeared and stalled dpkg operations. Already fixed, but some puppet alerts were triggered |
[tools] |
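One way to avoid both problems described in the T188994 entry (the dpkg lock race with puppet and the stalled debconf prompts) is sketched below. It is an illustration under assumptions, not the fix that was applied: the `safe_upgrade` and `run` helpers are hypothetical, and pausing the agent with `puppet agent --disable` only prevents new runs, so an agent run already in progress can still hold the lock. By default the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch: upgrade packages without debconf prompts while Puppet is paused.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, anything else = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

safe_upgrade() {
    # Stop new Puppet agent runs so they cannot take the dpkg lock mid-upgrade.
    run sudo puppet agent --disable "package upgrade in progress"
    # noninteractive frontend: debconf uses defaults instead of prompting;
    # the Dpkg options keep existing config files without asking.
    run sudo DEBIAN_FRONTEND=noninteractive apt-get -y \
        -o Dpkg::Options::=--force-confdef \
        -o Dpkg::Options::=--force-confold \
        upgrade
    # Re-enable the agent once the upgrade step has finished.
    run sudo puppet agent --enable
}
```

Run `safe_upgrade` with `DRY_RUN=1` first to review the commands; setting `DRY_RUN=0` executes them.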
12:58 |
<arturo> |
T188994 upgrading packages on jessie nodes from the oldstable source |
[tools] |
11:42 |
<arturo> |
clush -w @all "sudo DEBIAN_FRONTEND=noninteractive apt-get autoclean" <-- frees space on the filesystem |
[tools] |
11:41 |
<arturo> |
aborrero@tools-clushmaster-01:~$ clush -w @all "sudo DEBIAN_FRONTEND=noninteractive apt-get autoremove -y" <-- we did this on canary servers last week and it went fine, so now running it fleet-wide |
[tools] |
11:36 |
<arturo> |
(ubuntu) removed linux-image-3.13.0-142-generic and linux-image-3.13.0-137-generic (T188911) |
[tools] |
11:33 |
<arturo> |
removing unused kernel packages on ubuntu nodes |
[tools] |
11:08 |
<arturo> |
aborrero@tools-clushmaster-01:~$ clush -w @all "sudo rm /etc/apt/preferences.d/* ; sudo puppet agent -t -v" <--- rebuild the directory; it contains stale files across the whole cluster |
[tools] |