2019-04-10
18:52 | <bstorm_> | depooled and rebooted tools-sgeexec-0929 because systemd was in a weird state | [tools]
18:46 | <bstorm_> | depooled and rebooted tools-sgewebgrid-lighttpd-0913 because high load was caused by ancient lsof processes | [tools]
14:49 | <bstorm_> | cleared E state from 5 queues | [tools]
13:06 | <arturo> | T218126 hard reboot tools-sgeexec-0906 | [tools]
12:31 | <arturo> | T218126 hard reboot tools-sgeexec-0926 | [tools]
12:27 | <arturo> | T218126 hard reboot tools-sgeexec-0925 | [tools]
12:06 | <arturo> | T218126 hard reboot tools-sgeexec-0901 | [tools]
11:55 | <arturo> | T218126 hard reboot tools-sgeexec-0924 | [tools]
11:47 | <arturo> | T218126 hard reboot tools-sgeexec-0921 | [tools]
11:23 | <arturo> | T218126 hard reboot tools-sgeexec-0940 | [tools]
11:03 | <arturo> | T218126 hard reboot tools-sgeexec-0928 | [tools]
10:49 | <arturo> | T218126 hard reboot tools-sgeexec-0923 | [tools]
10:43 | <arturo> | T218126 hard reboot tools-sgeexec-0915 | [tools]
10:27 | <arturo> | T218126 hard reboot tools-sgeexec-0935 | [tools]
10:19 | <arturo> | T218126 hard reboot tools-sgeexec-0914 | [tools]
10:01 | <arturo> | T218126 hard reboot tools-sgeexec-0907 | [tools]
09:40 | <arturo> | T218126 hard reboot tools-sgeexec-0918 | [tools]
09:27 | <arturo> | T218126 hard reboot tools-sgeexec-0932 | [tools]
09:26 | <arturo> | T218216 hard reboot tools-sgeexec-0932 | [tools]
09:04 | <arturo> | T218216 add `profile::ldap::client::labs::client_stack: sssd` to prefix puppet for sge-exec nodes | [tools]
09:03 | <arturo> | T218216 do a controlled rollover of sssd, depooling sgeexec nodes, reboot and repool | [tools]
08:39 | <arturo> | T218216 disable puppet in all tools-sgeexec-XXXX nodes for controlled sssd rollout | [tools]
00:32 | <andrewbogott> | migrating tools-worker-1022, 1023, 1025, 1026 to eqiad1-r | [tools]
2019-04-09
22:04 | <bstorm_> | added the new region on port 80 to the elasticsearch security group for stashbot | [tools]
20:43 | <andrewbogott> | moving tools-worker-1018, 1019, 1020, 1021 to eqiad1-r | [tools]
20:04 | <andrewbogott> | moving tools-k8s-etcd-03 to eqiad1-r | [tools]
19:54 | <andrewbogott> | moving tools-flannel-etcd-02 to eqiad1-r | [tools]
18:36 | <andrewbogott> | moving tools-worker-1016, tools-worker-1017 to eqiad1-r | [tools]
18:05 | <andrewbogott> | migrating tools-k8s-etcd-02 to eqiad1-r | [tools]
18:00 | <andrewbogott> | migrating tools-flannel-etcd-01 to eqiad1-r | [tools]
17:36 | <andrewbogott> | moving tools-worker-1014, tools-worker-1015 to eqiad1-r | [tools]
17:05 | <andrewbogott> | migrating tools-k8s-etcd-01 to eqiad1-r | [tools]
15:56 | <andrewbogott> | moving tools-worker-1012, tools-worker-1013 to eqiad1-r | [tools]
14:56 | <bstorm_> | cleared 4 queues on gridengine of E status (ldap again) | [tools]
14:07 | <andrewbogott> | moving tools-worker-1010, tools-worker-1011, tools-worker-1001 to eqiad1-r | [tools]
03:48 | <andrewbogott> | moving tools-worker-1008 and tools-worker-1009 to eqiad1-r | [tools]
02:07 | <bstorm_> | reloaded ferm on tools-flannel-etcd-0[1-3] to get the k8s node moves to register | [tools]
2019-04-04
21:21 | <bd808> | Uncordoned tools-worker-1013.tools.eqiad.wmflabs after reboot and forced puppet run | [tools]
20:53 | <bd808> | Rebooting tools-worker-1013 | [tools]
20:50 | <bd808> | Draining tools-worker-1013.tools.eqiad.wmflabs | [tools]
20:29 | <bd808> | Released floating IP and deleted instance tools-checker-01 via Horizon | [tools]
20:28 | <bd808> | Shutdown tools-checker-01 via Horizon | [tools]
20:17 | <bd808> | Repooled tools-webgrid-lighttpd-0906 after reboot, apt-get dist-upgrade, and forced puppet run | [tools]
20:13 | <bd808> | Hard reboot of tools-sgewebgrid-lighttpd-0906 via Horizon | [tools]
20:09 | <bd808> | Repooled tools-webgrid-lighttpd-0912 after reboot, apt-get dist-upgrade, and forced puppet run | [tools]