2021-05-21 §
16:40 <bstorm> resize tools-k8s-ingress-5 to g3.cores4.ram8.disk20 [tools]
16:04 <majavah> rollback kubernetes ingress update from front proxy [tools]
06:52 <Majavah> pool tools-k8s-ingress-6 and depool ingress-[2,3] T264221 [tools]
2021-05-20 §
17:05 <Majavah> pool tools-k8s-ingress-5 as an ingress node, depool ingress-1 T264221 [tools]
16:31 <Majavah> pool tools-k8s-worker-4 as an ingress node T264221 [tools]
15:17 <Majavah> trying to install ingress-nginx via helm again after adjusting security groups T264221 [tools]
15:15 <Majavah> move tools-k8s-ingress-[5-6] from "tools-k8s-full-connectivity" to "tools-new-k8s-full-connectivity" security group T264221 [tools]
2021-05-19 §
12:15 <Majavah> rollback ingress-nginx-gen2 [tools]
11:09 <Majavah> deploy helm-based nginx ingress controller v0.46.0 to ingress-nginx-gen2 namespace T264221 [tools]
10:44 <Majavah> create tools-k8s-ingress-[4-6] T264221 [tools]
2021-05-16 §
16:52 <Majavah> clear error state from tools-sgeexec-0905 tools-sgeexec-0907 tools-sgeexec-0936 tools-sgeexec-0941 [tools]
2021-05-14 §
19:18 <bstorm> adjusting the rate limits for bastions nfs_write upward a lot to make NFS writes faster now that the cluster is finally using 10Gb on the backend and frontend T218338 [tools]
16:55 <andrewbogott> rebooting toolserver-proxy-01 to clear up stray files [tools]
16:47 <andrewbogott> deleting log files older than 14 days on toolserver-proxy-01 [tools]
2021-05-12 §
19:45 <bstorm> cleared error state from some queues [tools]
19:05 <Majavah> remove phamhi-binding phamhi-view-binding cluster role bindings T282725 [tools]
19:04 <bstorm> deleted the maintain-kubeusers pod to get it up and running fast T282725 [tools]
19:03 <bstorm> deleted phamhi from admin configmap in maintain-kubeusers T282725 [tools]
2021-05-11 §
17:17 <Majavah> shutdown and delete tools-checker-03 T278540 [tools]
17:14 <Majavah> move floating ip 185.15.56.61 to tools-checker-04 [tools]
17:12 <Majavah> add tools-checker-04 as a grid submit host T278540 [tools]
16:58 <Majavah> add tools-checker-04 to toollabs::checker_hosts hiera key T278540 [tools]
16:49 <Majavah> creating tools-checker-04 with buster T278540 [tools]
16:32 <Majavah> carefully shutdown tools-k8s-haproxy-1 T252239 [tools]
16:29 <Majavah> carefully shutdown tools-k8s-haproxy-2 T252239 [tools]
2021-05-10 §
22:58 <bstorm> cleared error state on a grid queue [tools]
22:58 <bstorm> setting `profile::wmcs::kubeadm::docker_vol: false` on ingress nodes [tools]
15:22 <Majavah> change k8s.svc.tools.eqiad1.wikimedia.cloud. to point to the tools-k8s-haproxy-keepalived-vip address 172.16.6.113 (T252239) [tools]
15:06 <Majavah> carefully rolling out keepalived to tools-k8s-haproxy-[3-4] while making sure [1-2] do not have changes [tools]
15:03 <Majavah> clear all error states caused by overloaded exec nodes [tools]
14:57 <arturo> allow tools-k8s-haproxy-[3-4] to use the tools-k8s-haproxy-keepalived-vip address (172.16.6.113) (T252239) [tools]
12:53 <Majavah> creating tools-k8s-haproxy-[3-4] to rebuild current ones without nfs and with keepalived [tools]
2021-05-09 §
06:55 <Majavah> clear error state from tools-sgeexec-0916 [tools]
2021-05-08 §
10:57 <Majavah> import docker image k8s.gcr.io/ingress-nginx/controller:v0.46.0 to local registry as docker-registry.tools.wmflabs.org/nginx-ingress-controller:v0.46.0 T264221 [tools]
2021-05-07 §
18:07 <Majavah> generate and add k8s haproxy keepalived password (profile::toolforge::k8s::haproxy::keepalived_password) to private puppet repo [tools]
17:15 <bstorm> recreated recordset of k8s.tools.eqiad1.wikimedia.cloud as CNAME to k8s.svc.tools.eqiad1.wikimedia.cloud T282227 [tools]
17:12 <bstorm> created A record of k8s.svc.tools.eqiad1.wikimedia.cloud pointing at current cluster with TTL of 300 for quick initial failover when the new set of haproxy nodes are ready T282227 [tools]
09:44 <arturo> `sudo wmcs-openstack --os-project-id=tools port create --network lan-flat-cloudinstances2b tools-k8s-haproxy-keepalived-vip` [tools]
2021-05-06 §
14:43 <Majavah> clear error states from all currently erroring exec nodes [tools]
14:37 <Majavah> clear error state from tools-sgeexec-0913 [tools]
04:34 <Majavah> add own root key to project hiera on horizon T278390 [tools]
02:36 <andrewbogott> removing jhedden from sudo roots [tools]
2021-05-05 §
19:27 <andrewbogott> adding taavi as a sudo root to project toolforge for T278390 [tools]
2021-05-04 §
15:23 <arturo> upgrading exim4-daemon-heavy in tools-mail-03 [tools]
10:47 <arturo> rebase & resolve merge conflicts in labs/private.git [tools]
2021-05-03 §
16:23 <dcaro> started tools-sgeexec-0907, was stuck on initramfs due to an unclean fs (/dev/vda3, root), ran fsck manually fixing all the errors and booted up correctly after (T280641) [tools]
14:07 <dcaro> depooling tools-sgeexec-0908/7 to be able to restart the VMs as they got stuck during migration (T280641) [tools]
2021-04-29 §
18:23 <bstorm> removing one more etcd node via cookbook T279723 [tools]
18:12 <bstorm> removing an etcd node via cookbook T279723 [tools]
2021-04-27 §
16:40 <bstorm> deleted all the errored out grid jobs stuck in queue wait [tools]