2021-03-18
|
19:24 |
<bstorm> |
set profile::toolforge::infrastructure across the entire project with login_server set on the bastion and exec node-related prefixes |
[tools] |
16:21 |
<andrewbogott> |
enabling puppet tools-wide |
[tools] |
16:20 |
<andrewbogott> |
disabling puppet tools-wide to test https://gerrit.wikimedia.org/r/c/operations/puppet/+/672456 |
[tools] |
16:19 |
<bstorm> |
added profile::toolforge::infrastructure class to puppetmaster (T277756) |
[tools] |
04:12 |
<bstorm> |
rebooted tools-sgeexec-0935.tools.eqiad.wmflabs because it forgot how to LDAP; likely the root cause of the issues tonight |
[tools] |
03:59 |
<bstorm> |
rebooting grid master. sorry for the cron spam |
[tools] |
03:49 |
<bstorm> |
restarting sssd on tools-sgegrid-master |
[tools] |
03:37 |
<bstorm> |
deleted a massive number of stuck jobs that misfired from the cron server |
[tools] |
03:35 |
<bstorm> |
rebooting tools-sgecron-01 to try to clear up the ldap-related errors coming out of it |
[tools] |
01:46 |
<bstorm> |
killed the toolschecker cron job, which had an LDAP error, and ran it again by hand |
[tools] |