2021-03-24
§
|
12:46 |
<arturo> |
shut off the old stretch VMs `tools-docker-registry-03` and `tools-docker-registry-04` (T278303) |
[tools] |
12:38 |
<arturo> |
associate floating IP 185.15.56.67 with `tools-docker-registry-05` and refresh FQDN docker-registry.tools.wmflabs.org accordingly (T278303) |
[tools] |
12:33 |
<arturo> |
attach cinder volume `tools-docker-registry-data` to VM `tools-docker-registry-05` (T278303) |
[tools] |
12:32 |
<arturo> |
snapshot cinder volume `tools-docker-registry-data` into `tools-docker-registry-data-stretch-migration` (T278303) |
[tools] |
12:32 |
<arturo> |
bump cinder storage quota from 80G to 400G (without quota request task) |
[tools] |
12:11 |
<arturo> |
created VM `tools-docker-registry-06` as Debian Buster (T278303) |
[tools] |
12:09 |
<arturo> |
detach cinder volume `tools-docker-registry-data` (T278303) |
[tools] |
11:46 |
<arturo> |
attach cinder volume `tools-docker-registry-data` to VM `tools-docker-registry-03` to format it and pre-populate it with registry data (T278303) |
[tools] |
11:20 |
<arturo> |
created 80G cinder volume `tools-docker-registry-data` (T278303) |
[tools] |
11:10 |
<arturo> |
starting VM `tools-docker-registry-04`, which had probably been stopped since 2021-03-09 due to hypervisor draining
[tools] |
2021-03-18
§
|
19:24 |
<bstorm> |
set profile::toolforge::infrastructure across the entire project with login_server set on the bastion and exec node-related prefixes |
[tools] |
16:21 |
<andrewbogott> |
enabling puppet tools-wide |
[tools] |
16:20 |
<andrewbogott> |
disabling puppet tools-wide to test https://gerrit.wikimedia.org/r/c/operations/puppet/+/672456 |
[tools] |
16:19 |
<bstorm> |
added profile::toolforge::infrastructure class to puppetmaster (T277756) |
[tools] |
04:12 |
<bstorm> |
rebooted tools-sgeexec-0935.tools.eqiad.wmflabs because it forgot how to LDAP...likely root cause of the issues tonight |
[tools] |
03:59 |
<bstorm> |
rebooting grid master. sorry for the cron spam |
[tools] |
03:49 |
<bstorm> |
restarting sssd on tools-sgegrid-master |
[tools] |
03:37 |
<bstorm> |
deleted a massive number of stuck jobs that misfired from the cron server |
[tools] |
03:35 |
<bstorm> |
rebooting tools-sgecron-01 to try to clear up the LDAP-related errors coming out of it
[tools] |
01:46 |
<bstorm> |
killed the toolschecker cron job, which had an LDAP error, and ran it again by hand |
[tools] |