2025-02-01
|
22:54 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173 (T384592)', diff saved to https://phabricator.wikimedia.org/P73020 and previous config saved to /var/cache/conftool/dbconfig/20250201-225456-marostegui.json |
[production] |
22:39 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P73019 and previous config saved to /var/cache/conftool/dbconfig/20250201-223949-marostegui.json |
[production] |
22:24 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P73018 and previous config saved to /var/cache/conftool/dbconfig/20250201-222442-marostegui.json |
[production] |
20:56 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2173 (T384592)', diff saved to https://phabricator.wikimedia.org/P73016 and previous config saved to /var/cache/conftool/dbconfig/20250201-205602-marostegui.json |
[production] |
20:55 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14:00:00 on db2186.codfw.wmnet with reason: Maintenance |
[production] |
20:55 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2173.codfw.wmnet with reason: Maintenance |
[production] |
20:55 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2170 (T384592)', diff saved to https://phabricator.wikimedia.org/P73015 and previous config saved to /var/cache/conftool/dbconfig/20250201-205525-marostegui.json |
[production] |
20:40 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P73014 and previous config saved to /var/cache/conftool/dbconfig/20250201-204018-marostegui.json |
[production] |
20:25 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P73013 and previous config saved to /var/cache/conftool/dbconfig/20250201-202511-marostegui.json |
[production] |
20:10 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2170 (T384592)', diff saved to https://phabricator.wikimedia.org/P73012 and previous config saved to /var/cache/conftool/dbconfig/20250201-201004-marostegui.json |
[production] |
19:05 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2170 (T384592)', diff saved to https://phabricator.wikimedia.org/P73011 and previous config saved to /var/cache/conftool/dbconfig/20250201-190526-marostegui.json |
[production] |
19:05 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2170.codfw.wmnet with reason: Maintenance |
[production] |
19:05 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2153 (T384592)', diff saved to https://phabricator.wikimedia.org/P73010 and previous config saved to /var/cache/conftool/dbconfig/20250201-190504-marostegui.json |
[production] |
18:49 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P73009 and previous config saved to /var/cache/conftool/dbconfig/20250201-184957-marostegui.json |
[production] |
18:34 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P73008 and previous config saved to /var/cache/conftool/dbconfig/20250201-183450-marostegui.json |
[production] |
18:19 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2153 (T384592)', diff saved to https://phabricator.wikimedia.org/P73007 and previous config saved to /var/cache/conftool/dbconfig/20250201-181943-marostegui.json |
[production] |
17:06 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2153 (T384592)', diff saved to https://phabricator.wikimedia.org/P73006 and previous config saved to /var/cache/conftool/dbconfig/20250201-170624-marostegui.json |
[production] |
17:06 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2153.codfw.wmnet with reason: Maintenance |
[production] |
17:06 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2146 (T384592)', diff saved to https://phabricator.wikimedia.org/P73005 and previous config saved to /var/cache/conftool/dbconfig/20250201-170602-marostegui.json |
[production] |
16:50 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P73004 and previous config saved to /var/cache/conftool/dbconfig/20250201-165055-marostegui.json |
[production] |
16:41 |
<cmooney@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cr2-magru with reason: IBGP instability from cr1 to cr2 in magru causing ping failures from alert1002 |
[production] |
16:35 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P73003 and previous config saved to /var/cache/conftool/dbconfig/20250201-163548-marostegui.json |
[production] |
16:20 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2146 (T384592)', diff saved to https://phabricator.wikimedia.org/P73002 and previous config saved to /var/cache/conftool/dbconfig/20250201-162041-marostegui.json |
[production] |
15:52 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
kubectl rollout restart deployment bridgebot # stopped bridging for no clear reason (kubectl logs show recent telegram/IRC messages), see if this helps |
[tools.bridgebot] |
15:29 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for all nodes |
[toolsbeta] |
15:17 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2146 (T384592)', diff saved to https://phabricator.wikimedia.org/P73001 and previous config saved to /var/cache/conftool/dbconfig/20250201-151709-marostegui.json |
[production] |
15:17 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2146.codfw.wmnet with reason: Maintenance |
[production] |
15:16 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2145 (T384592)', diff saved to https://phabricator.wikimedia.org/P73000 and previous config saved to /var/cache/conftool/dbconfig/20250201-151646-marostegui.json |
[production] |
15:15 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for all nodes |
[toolsbeta] |
15:15 |
<andrew@cloudcumin1001> |
END (ERROR) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=97) for all nodes |
[toolsbeta] |
15:14 |
<andrewbogott> |
hard rebooting all VMs for T385264 |
[toolsbeta] |
15:14 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for all nodes |
[toolsbeta] |
15:06 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-108 |
[tools] |
15:05 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-108 |
[tools] |
15:05 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-107 |
[tools] |
15:04 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-107 |
[tools] |
15:04 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-106 |
[tools] |
15:03 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-106 |
[tools] |
15:03 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-105 |
[tools] |
15:02 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-105 |
[tools] |
15:02 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-103 |
[tools] |
15:01 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-103 |
[tools] |
15:01 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-102 |
[tools] |
15:01 |
<andrewbogott> |
rebooting all k8s (non-nfs) worker nodes for T385264 |
[tools] |
15:01 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P72999 and previous config saved to /var/cache/conftool/dbconfig/20250201-150139-marostegui.json |
[production] |
15:00 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-102 |
[tools] |
14:57 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-9 |
[tools] |
14:56 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-9 |
[tools] |
14:56 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-74 |
[tools] |
14:55 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-74 |
[tools] |