2021-03-30
|
16:15 |
<bstorm> |
added `labstore::traffic_shaping::egress: 800mbps` to tools-static prefix T278539 |
[tools] |
15:44 |
<arturo> |
shutoff tools-static-12/13 (T278539) |
[tools] |
15:41 |
<arturo> |
point horizon web proxy `tools-static.wmflabs.org` to tools-static-14 (T278539) |
[tools] |
15:37 |
<arturo> |
add `mount_nfs: true` to tools-static prefix (T278539) |
[tools] |
15:26 |
<arturo> |
create VM tools-static-14 with Debian Buster image (T278539) |
[tools] |
12:19 |
<arturo> |
introduce horizon proxy `deb-tools.wmcloud.org` (T278436) |
[tools] |
12:15 |
<arturo> |
shutdown tools-sgebastion-09 (stretch) |
[tools] |
11:05 |
<arturo> |
created VM `tools-sgebastion-10` as Debian Buster (T275865) |
[tools] |
11:04 |
<arturo> |
created server group `tools-bastion` with anti-affinity policy |
[tools] |
2021-03-25
|
19:30 |
<bstorm> |
forced deletion of all jobs stuck in a deleting state T277653 |
[tools] |
17:46 |
<arturo> |
rebooting tools-sgeexec-* nodes to account for new grid master (T277653) |
[tools] |
16:20 |
<arturo> |
rebuilding tools-sgegrid-master VM as debian buster (T277653) |
[tools] |
16:18 |
<arturo> |
icinga-downtime toolschecker for 2h |
[tools] |
16:05 |
<bstorm> |
failed over the tools grid to the shadow master T277653 |
[tools] |
13:36 |
<arturo> |
shutdown tools-sge-services-03 (T278354) |
[tools] |
13:33 |
<arturo> |
shutdown tools-sge-services-04 (T278354) |
[tools] |
13:31 |
<arturo> |
point aptly clients to `tools-services-05.tools.eqiad1.wikimedia.cloud` (hiera change) (T278354) |
[tools] |
12:58 |
<arturo> |
created VM `tools-services-05` as Debian Buster (T278354) |
[tools] |
12:51 |
<arturo> |
create cinder volume `tools-aptly-data` (T278354) |
[tools] |
2021-03-24
|
12:46 |
<arturo> |
shutoff the old stretch VMs `tools-docker-registry-03` and `tools-docker-registry-04` (T278303) |
[tools] |
12:38 |
<arturo> |
associate floating IP 185.15.56.67 with `tools-docker-registry-05` and refresh FQDN docker-registry.tools.wmflabs.org accordingly (T278303) |
[tools] |
12:33 |
<arturo> |
attach cinder volume `tools-docker-registry-data` to VM `tools-docker-registry-05` (T278303) |
[tools] |
12:32 |
<arturo> |
snapshot cinder volume `tools-docker-registry-data` into `tools-docker-registry-data-stretch-migration` (T278303) |
[tools] |
12:32 |
<arturo> |
bump cinder storage quota from 80G to 400G (without quota request task) |
[tools] |
12:11 |
<arturo> |
created VM `tools-docker-registry-06` as Debian Buster (T278303) |
[tools] |
12:09 |
<arturo> |
detach cinder volume `tools-docker-registry-data` (T278303) |
[tools] |
11:46 |
<arturo> |
attach cinder volume `tools-docker-registry-data` to VM `tools-docker-registry-03` to format it and pre-populate it with registry data (T278303) |
[tools] |
11:20 |
<arturo> |
created 80G cinder volume tools-docker-registry-data (T278303) |
[tools] |
11:10 |
<arturo> |
starting VM tools-docker-registry-04, which had probably been stopped since 2021-03-09 due to hypervisor draining |
[tools] |
2021-03-18
|
19:24 |
<bstorm> |
set profile::toolforge::infrastructure across the entire project with login_server set on the bastion and exec node-related prefixes |
[tools] |
16:21 |
<andrewbogott> |
enabling puppet tools-wide |
[tools] |
16:20 |
<andrewbogott> |
disabling puppet tools-wide to test https://gerrit.wikimedia.org/r/c/operations/puppet/+/672456 |
[tools] |
16:19 |
<bstorm> |
added profile::toolforge::infrastructure class to puppetmaster T277756 |
[tools] |
04:12 |
<bstorm> |
rebooted tools-sgeexec-0935.tools.eqiad.wmflabs because it forgot how to LDAP...likely root cause of the issues tonight |
[tools] |
03:59 |
<bstorm> |
rebooting grid master. sorry for the cron spam |
[tools] |
03:49 |
<bstorm> |
restarting sssd on tools-sgegrid-master |
[tools] |
03:37 |
<bstorm> |
deleted a massive number of stuck jobs that misfired from the cron server |
[tools] |
03:35 |
<bstorm> |
rebooting tools-sgecron-01 to try to clear up the ldap-related errors coming out of it |
[tools] |