2023-05-15
|
17:34 |
<wm-bot2> |
rebooting all the workers of tools k8s cluster (64 nodes) (T316544) - cookbook ran by dcaro@vulcanus |
[tools] |
17:30 |
<volans@cumin2002> |
END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device ssw1-a1-codfw.mgmt.codfw.wmnet |
[production] |
17:30 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
17:30 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - volans@cumin2002" |
[production] |
17:29 |
<volans@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - volans@cumin2002" |
[production] |
17:27 |
<volans@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
17:27 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
17:27 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - volans@cumin2002" |
[production] |
17:26 |
<volans@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - volans@cumin2002" |
[production] |
17:20 |
<wm-bot2> |
rebooted k8s node tools-k8s-worker-87 (T316544) - cookbook ran by dcaro@vulcanus |
[tools] |
17:19 |
<wm-bot2> |
rebooted k8s node tools-k8s-worker-88 (T316544) - cookbook ran by dcaro@vulcanus |
[tools] |
17:17 |
<bd808> |
Rebuilding bullseye and buster docker containers to pick up openssh-client package addition (T258841) |
[tools] |
17:15 |
<volans@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
17:15 |
<volans@cumin2002> |
START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet |
[production] |
17:12 |
<wm-bot2> |
rebooting the whole tools k8s cluster (64 nodes) (T316544) - cookbook ran by dcaro@vulcanus |
[tools] |
17:06 |
<dcaro> |
rebooting tools-sgegrid-shadow (T316544) |
[tools] |
17:00 |
<dcaro> |
rebooting tools-sgegrid-master (T316544) |
[tools] |
16:55 |
<dcaro> |
rebooting tools-sgeexec-10-20 (T316544) |
[tools] |
16:53 |
<dcaro> |
rebooting tools-sgeweblight-10-18 (T316544) |
[tools] |
16:53 |
<dcaro> |
rebooting tools-sgeweblight-10-25 (T316544) |
[tools] |
16:53 |
<dcaro> |
rebooting tools-sgeweblight-10-20 (T316544) |
[tools] |
16:52 |
<dcaro> |
rebooting tools-sgeweblight-10-21 (T316544) |
[tools] |
16:52 |
<dcaro> |
rebooting tools-sgeexec-10-22 (T316544) |
[tools] |
16:51 |
<dcaro> |
rebooting tools-sgeweblight-10-28 (T316544) |
[tools] |
16:50 |
<dcaro> |
rebooting tools-sgeexec-10-17 (T316544) |
[tools] |
16:48 |
<dcaro> |
rebooting tools-sgeexec-10-21 (T316544) |
[tools] |
16:47 |
<dcaro> |
rebooting tools-sgeexec-10-19 (T316544) |
[tools] |
16:45 |
<dcaro> |
rebooting tools-sgeexec-10-8 (T316544) |
[tools] |
16:45 |
<dcaro> |
rebooting tools-sgeweblight-10-24 (T316544) |
[tools] |
16:44 |
<dcaro> |
rebooting tools-sgewebgen-10-2 (T316544) |
[tools] |
16:44 |
<dcaro> |
rebooting tools-sgeweblight-10-16 (T316544) |
[tools] |
16:43 |
<dcaro> |
rebooting tools-sgeweblight-10-30 (T316544) |
[tools] |
16:43 |
<dcaro> |
rebooting tools-sgeexec-10-18 (T316544) |
[tools] |
16:42 |
<dcaro> |
rebooting tools-sgeexec-10-16 (T316544) |
[tools] |
16:42 |
<dcaro> |
rebooting tools-sgeexec-10-14 (T316544) |
[tools] |
16:41 |
<dcaro> |
rebooting tools-sgeweblight-10-32 (T316544) |
[tools] |
16:40 |
<dcaro> |
rebooting tools-sgeweblight-10-22 (T316544) |
[tools] |
16:39 |
<dcaro> |
rebooting tools-sgeweblight-10-17 (T316544) |
[tools] |
16:32 |
<dcaro> |
rebooting tools-sgeexec-10-13.tools.eqiad1.wikimedia.cloud (T316544) |
[tools] |
16:23 |
<dcaro> |
rebooting tools-sgeweblight-10-26 (T316544) |
[tools] |
16:15 |
<bd808> |
Hard reboot of tools-sgebastion-11 via Horizon (done circa 16:11Z) |
[tools] |
16:14 |
<arturo> |
rebooted a bunch of nodes to clean up D procs and high load avg caused by the NFS outage (result of T316544) |
[tools] |
15:00 |
<aokoth@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: Setup Incomplete |
[production] |
15:00 |
<aokoth@cumin1001> |
START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: Setup Incomplete |
[production] |
14:58 |
<hauskater> |
Dropped akwiki and nawiki from CVNBot10 as closed wikis. On-wiki lists require an update. |
[cvn] |
14:28 |
<wm-bot2> |
Drained cloudvirt1034.eqiad.wmnet - cookbook ran by andrew@bullseye |
[admin] |
14:28 |
<wm-bot2> |
Set cloudvirt cloudvirt1034.eqiad.wmnet maintenance (downtime id: 96ce2ed0-3aff-4d04-be0b-e16513070617, use this to unset) - cookbook ran by andrew@bullseye |
[admin] |
14:27 |
<wm-bot2> |
Draining cloudvirt1034.eqiad.wmnet - cookbook ran by andrew@bullseye |
[admin] |
14:24 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: testing transferpy cookbook |
[production] |
14:24 |
<bking@cumin1001> |
START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: testing transferpy cookbook |
[production] |