2020-07-15 §
23:11 <bd808> Removed ssh root key for valhallasw from project hiera (T255697) [tools]
2020-07-09 §
18:53 <bd808> Updating git-review to 1.27 via clush across cluster (T257496) [tools]
2020-07-08 §
11:16 <arturo> merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/610029 -- important change to front-proxy (T234617) [tools]
11:11 <arturo> live-hacking puppetmaster with https://gerrit.wikimedia.org/r/c/operations/puppet/+/610029 (T234617) [tools]
2020-07-07 §
23:22 <bd808> Rebuilding all Docker images to pick up webservice v0.73 (T234617, T257229) [tools]
23:19 <bd808> Deploying webservice v0.73 via clush (T234617, T257229) [tools]
23:16 <bd808> Building webservice v0.73 (T234617, T257229) [tools]
15:01 <Reedy> killed python process from tools.experimental-embeddings using a lot of cpu on tools-sgebastion-07 [tools]
15:01 <Reedy> killed meno25 process running pwb.py on tools-sgebastion-07 [tools]
09:59 <arturo> point DNS tools.wmflabs.org A record to (tools-legacy-redirector) (T247236) [tools]
2020-07-06 §
11:54 <arturo> briefly point DNS tools.wmflabs.org A record to (tools-legacy-redirector) and then switch back to (tools-proxy-05). The legacy redirector does HTTP/307 (T247236) [tools]
11:50 <arturo> associate floating IP address to tools-legacy-redirector (T247236) [tools]
2020-07-01 §
11:19 <arturo> cleanup exim email queue (4 frozen messages) [tools]
11:01 <arturo> live-hacking puppetmaster with https://gerrit.wikimedia.org/r/c/operations/puppet/+/608849 (T256737) [tools]
2020-06-30 §
11:18 <arturo> set some hiera keys for mtail in puppet prefix `tools-mail` (T256737) [tools]
2020-06-29 §
22:48 <legoktm> built html-sssd/web image (T241817) [tools]
22:23 <legoktm> rebuild python{34,35,37}-sssd/web images for https://gerrit.wikimedia.org/r/608093 [tools]
12:01 <arturo> introduced spam filter in the mail server (T120210) [tools]
2020-06-25 §
21:49 <zhuyifei1999_> re-enabling puppet on tools-sgebastion-09 T256426 [tools]
21:39 <zhuyifei1999_> disabling puppet on tools-sgebastion-09 so I can play with mount settings T256426 [tools]
21:24 <bstorm> hard rebooting tools-sgebastion-09 [tools]
2020-06-24 §
12:36 <arturo> live-hacking puppetmaster with exim prometheus stuff (T175964) [tools]
11:57 <arturo> merging email ratelimiting patch https://gerrit.wikimedia.org/r/c/operations/puppet/+/607320 (T175964) [tools]
2020-06-23 §
17:55 <arturo> killed procs for users `hamishz` and `msyn` which apparently were tools that should be running in the grid / kubernetes instead [tools]
16:08 <arturo> created acme-chief cert `tools_mail` in the prefix hiera [tools]
2020-06-17 §
10:40 <arturo> created VM tools-legacy-redirector, with the corresponding puppet prefix (T247236, T234617) [tools]
2020-06-16 §
23:01 <bd808> Building new Docker images to pick up webservice 0.72 [tools]
22:58 <bd808> Deploying webservice 0.72 to bastions and grid [tools]
22:56 <bd808> Building webservice 0.72 [tools]
15:10 <arturo> merging a patch with changes to the template for keepalived (used in the elastic cluster) https://gerrit.wikimedia.org/r/c/operations/puppet/+/605898 [tools]
2020-06-15 §
21:28 <bstorm_> cleaned up killgridjobs.sh on the tools bastions T157792 [tools]
18:14 <bd808> Rebuilding all Docker images to pick up webservice 0.71 (T254640, T253412) [tools]
18:12 <bd808> Deploying webservice 0.71 to bastions and grid via clush [tools]
18:04 <bd808> Building webservice 0.71 [tools]
2020-06-12 §
13:13 <arturo> live-hacking session in the puppetmaster ended [tools]
13:10 <arturo> live-hacking puppet tree in tools-puppetmaster-02 for testing PAWS-related patch (they share haproxy puppet code) [tools]
00:16 <bstorm_> remounted NFS for tools-k8s-control-3 and tools-acme-chief-01 [tools]
2020-06-11 §
23:35 <bstorm_> rebooting tools-k8s-control-2 because it seems to be confused on NFS, interestingly enough [tools]
2020-06-04 §
13:32 <bd808> Manually restored /etc/haproxy/conf.d/elastic.cfg on tools-elastic-* [tools]
2020-06-02 §
12:23 <arturo> renewed TLS cert for k8s metrics-server (T250874) following docs: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Certificates#internal_API_access [tools]
11:00 <arturo> renewed TLS cert for prometheus to contact toolforge k8s (T250874) following docs: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Certificates#external_API_access [tools]
2020-06-01 §
23:51 <bstorm_> refreshed certs for the custom webhook controllers on the k8s cluster T250874 [tools]
00:39 <bd808> Ugh. Prior SAL message was about tools-sgeexec-0940 [tools]
00:39 <bd808> Compressed /var/log/account/pacct.0 ahead of rotation schedule to free some space on the root partition [tools]
2020-05-29 §
19:37 <bstorm_> adding docker image for paws-public docker-registry.tools.wmflabs.org/paws-public-nginx:openresty T252217 [tools]
2020-05-28 §
21:19 <bd808> Killed 7 python processes run by user 'mattho69' on login.toolforge.org [tools]
21:06 <bstorm_> upgrading tools-k8s-worker-[30-60] to kubernetes 1.16.10 T246122 [tools]
17:54 <bstorm_> upgraded tools-k8s-worker-[11..15] and starting on -21-29 now T246122 [tools]
16:01 <bstorm_> kubectl upgraded to 1.16.10 on all bastions T246122 [tools]
15:58 <arturo> upgrading tools-k8s-worker-[1..10] to 1.16.10 (T246122) [tools]