2020-06-23 §
16:08 <arturo> created acme-chief cert `tools_mail` in the prefix hiera [tools]
2020-06-17 §
10:40 <arturo> created VM tools-legacy-redirector, with the corresponding puppet prefix (T247236, T234617) [tools]
2020-06-16 §
23:01 <bd808> Building new Docker images to pick up webservice 0.72 [tools]
22:58 <bd808> Deploying webservice 0.72 to bastions and grid [tools]
22:56 <bd808> Building webservice 0.72 [tools]
15:10 <arturo> merging a patch with changes to the template for keepalived (used in the elastic cluster) https://gerrit.wikimedia.org/r/c/operations/puppet/+/605898 [tools]
2020-06-15 §
21:28 <bstorm_> cleaned up killgridjobs.sh on the tools bastions T157792 [tools]
18:14 <bd808> Rebuilding all Docker images to pick up webservice 0.71 (T254640, T253412) [tools]
18:12 <bd808> Deploying webservice 0.71 to bastions and grid via clush [tools]
18:04 <bd808> Building webservice 0.71 [tools]
2020-06-12 §
13:13 <arturo> live-hacking session in the puppetmaster ended [tools]
13:10 <arturo> live-hacking puppet tree in tools-puppetmaster-02 for testing PAWS-related patch (they share haproxy puppet code) [tools]
00:16 <bstorm_> remounted NFS for tools-k8s-control-3 and tools-acme-chief-01 [tools]
2020-06-11 §
23:35 <bstorm_> rebooting tools-k8s-control-2 because it seems to be confused on NFS, interestingly enough [tools]
2020-06-04 §
13:32 <bd808> Manually restored /etc/haproxy/conf.d/elastic.cfg on tools-elastic-* [tools]
2020-06-02 §
12:23 <arturo> renewed TLS cert for k8s metrics-server (T250874) following docs: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Certificates#internal_API_access [tools]
11:00 <arturo> renewed TLS cert for prometheus to contact toolforge k8s (T250874) following docs: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Certificates#external_API_access [tools]
2020-06-01 §
23:51 <bstorm_> refreshed certs for the custom webhook controllers on the k8s cluster T250874 [tools]
00:39 <bd808> Ugh. Prior SAL message was about tools-sgeexec-0940 [tools]
00:39 <bd808> Compressed /var/log/account/pacct.0 ahead of rotation schedule to free some space on the root partition [tools]
2020-05-29 §
19:37 <bstorm_> adding docker image for paws-public docker-registry.tools.wmflabs.org/paws-public-nginx:openresty T252217 [tools]
2020-05-28 §
21:19 <bd808> Killed 7 python processes run by user 'mattho69' on login.toolforge.org [tools]
21:06 <bstorm_> upgrading tools-k8s-worker-[30-60] to kubernetes 1.16.10 T246122 [tools]
17:54 <bstorm_> upgraded tools-k8s-worker-[11..15] and starting on tools-k8s-worker-[21..29] now T246122 [tools]
16:01 <bstorm_> kubectl upgraded to 1.16.10 on all bastions T246122 [tools]
15:58 <arturo> upgrading tools-k8s-worker-[1..10] to 1.16.10 (T246122) [tools]
15:41 <arturo> upgrading tools-k8s-control-3 to 1.16.10 (T246122) [tools]
15:17 <arturo> upgrading tools-k8s-control-2 to 1.16.10 (T246122) [tools]
15:09 <arturo> upgrading tools-k8s-control-1 to 1.16.10 (T246122) [tools]
14:49 <arturo> cleaned up the /etc/apt/sources.list.d/ directory in all tools-k8s-* VMs [tools]
11:27 <arturo> merging change to front-proxy: https://gerrit.wikimedia.org/r/c/operations/puppet/+/599139 (T253816) [tools]
2020-05-27 §
17:23 <bstorm_> deleting "tools-k8s-worker-20", "tools-k8s-worker-19", "tools-k8s-worker-18", "tools-k8s-worker-17", "tools-k8s-worker-16" [tools]
2020-05-26 §
18:45 <bstorm_> upgrading maintain-kubeusers to match what is in toolsbeta T246059 T211096 [tools]
16:20 <bstorm_> fix incorrect volume name in kubeadm-config configmap T246122 [tools]
2020-05-22 §
20:00 <bstorm_> rebooted tools-sgebastion-07 to clear up tmp file problems with 10 min warning [tools]
19:12 <bstorm_> running command to delete over 2000 tmp ca certs on tools-bastion-07 T253412 [tools]
2020-05-21 §
22:40 <bd808> Rebuilding all Docker containers for tools-webservice 0.70 (T252700) [tools]
22:36 <bd808> Updated tools-webservice to 0.70 across instances (T252700) [tools]
22:29 <bd808> Building tools-webservice 0.70 via wmcs-package-build.py [tools]
2020-05-20 §
09:59 <arturo> now running tesseract-ocr v4.1.1-2~bpo9+1 in the Toolforge grid (T247422) [tools]
09:50 <arturo> `aborrero@cloud-cumin-01:~$ sudo cumin --force -x 'O{project:tools name:tools-sge[bcew].*}' 'apt-get install tesseract-ocr -t stretch-backports -y'` (T247422) [tools]
09:35 <arturo> `aborrero@cloud-cumin-01:~$ sudo cumin --force -x 'O{project:tools name:tools-sge[bcew].*}' 'rm /etc/apt/sources.lists.d/kubeadm-k8s-component-repo.list ; rm /etc/apt/sources.list.d/repository_thirdparty-kubeadm-k8s-1-15.list ; run-puppet-agent'` (T247422) [tools]
09:23 <arturo> `aborrero@cloud-cumin-01:~$ sudo cumin --force -x 'O{project:tools name:tools-sge[bcew].*}' 'rm /etc/apt/preferences.d/* ; run-puppet-agent'` (T247422) [tools]
2020-05-19 §
17:00 <bstorm_> deleting/restarting the paws db-proxy pod because it cannot connect to the replicas...and I'm hoping that's due to depooling and such [tools]
2020-05-13 §
18:14 <bstorm_> upgrading calico to 3.14.0 with typha enabled in Toolforge K8s T250863 [tools]
18:10 <bstorm_> set "profile::toolforge::k8s::typha_enabled: true" in tools project for calico upgrade T250863 [tools]
2020-05-09 §
00:28 <bstorm_> added nfs.* to ignored_fs_types for the prometheus::node_exporter params in project hiera T252260 [tools]
2020-05-08 §
18:17 <bd808> Building all jessie-sssd derived images (T197930) [tools]
17:29 <bd808> Building new jessie-sssd base image (T197930) [tools]
2020-05-07 §
21:51 <bstorm_> rebuilding the docker images for Toolforge k8s [tools]