2020-09-17 §
15:34 <andrewbogott> depooling tools-k8s-worker-70 and tools-k8s-worker-66 for flavor remapping [tools]
15:30 <andrewbogott> repooling tools-sgeexec-0909, 0908, 0907, 0906, 0904 [tools]
15:21 <andrewbogott> depooling tools-sgeexec-0909, 0908, 0907, 0906, 0904 for flavor remapping [tools]
13:55 <andrewbogott> depooled tools-sgewebgrid-lighttpd-0917 and tools-sgewebgrid-lighttpd-0920 [tools]
13:55 <andrewbogott> repooled tools-sgeexec-0937 after move to ceph [tools]
13:45 <andrewbogott> depooled tools-sgeexec-0937 for move to ceph [tools]
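The depool/repool cycles on the grid exec nodes above amount to disabling and later re-enabling the host's queue instances so gridengine stops scheduling new jobs there during the maintenance. A minimal sketch with plain gridengine commands, using tools-sgeexec-0904 as the example host (Toolforge's own wrapper tooling may differ):

    # disable every queue instance on the host so no new jobs land there
    qmod -d '*@tools-sgeexec-0904'
    # watch the jobs still running on the host drain away
    qhost -j -h tools-sgeexec-0904
    # re-enable (repool) the queue instances once the maintenance is done
    qmod -e '*@tools-sgeexec-0904'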
2020-09-16 §
23:20 <andrewbogott> repooled tools-sgeexec-0941 and tools-sgeexec-0939 for move to ceph [tools]
23:03 <andrewbogott> depooled tools-sgeexec-0941 and tools-sgeexec-0939 for move to ceph [tools]
23:02 <andrewbogott> uncordoned tools-k8s-worker-58, tools-k8s-worker-56, tools-k8s-worker-42 for migration to ceph [tools]
22:29 <andrewbogott> draining tools-k8s-worker-58, tools-k8s-worker-56, tools-k8s-worker-42 for migration to ceph [tools]
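Draining and later uncordoning a Kubernetes worker, as in the two entries above, is standard kubectl usage; roughly, for one of the nodes (flags as they existed in this kubectl era):

    # evict pods and mark the node unschedulable before migrating the VM
    kubectl drain tools-k8s-worker-58 --ignore-daemonsets --delete-local-data
    # once the node is back on the new (ceph) hypervisor, allow scheduling again
    kubectl uncordon tools-k8s-worker-58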
17:37 <andrewbogott> service gridengine-master restart on tools-sgegrid-master [tools]
2020-09-10 §
15:37 <arturo> hard-rebooting tools-proxy-05 [tools]
15:33 <arturo> rebooting tools-proxy-05 to try flushing local DNS caches [tools]
15:25 <arturo> detected missing DNS record for k8s.tools.eqiad1.wikimedia.cloud which means the k8s cluster is down [tools]
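A missing record like that can be confirmed from any client with dig; an empty answer for the API server name is enough to take the cluster down, since kubelets and kubectl can no longer resolve the control plane:

    dig +short k8s.tools.eqiad1.wikimedia.cloud
    # no output means the record is gone; an A record (or CNAME chain) is expected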
10:22 <arturo> enabling ingress dedicated worker nodes in the k8s cluster (T250172) [tools]
2020-09-09 §
11:12 <arturo> new ingress nodes added to the cluster, and tainted/labeled per the docs https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Deploying#ingress_nodes (T250172) [tools]
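Dedicating nodes to ingress means tainting them so ordinary pods are repelled and labeling them so the ingress controller's node selector matches. The key/value pairs below are placeholders for illustration only; the real ones are in the linked Deploying documentation:

    # placeholder taint and label keys, not the actual Toolforge values
    kubectl taint nodes tools-k8s-ingress-1 ingress=true:NoSchedule
    kubectl label nodes tools-k8s-ingress-1 ingressnode=true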
10:50 <arturo> created puppet prefix `tools-k8s-ingress` (T250172) [tools]
10:42 <arturo> created VMs tools-k8s-ingress-1 and tools-k8s-ingress-2 in the `tools-ingress` server group (T250172) [tools]
10:38 <arturo> created server group `tools-ingress` with soft anti affinity policy (T250172) [tools]
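A soft anti-affinity server group asks the Nova scheduler to spread the group's VMs across hypervisors when it can, without failing the boot when it cannot; roughly:

    openstack server group create --policy soft-anti-affinity tools-ingress
    # new instances are then booted with a scheduler hint pointing at the group,
    # e.g. --hint group=<server-group-uuid> on openstack server create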
2020-09-08 §
23:24 <bstorm> clearing grid queue error states blocking job runs [tools]
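Queue instances that enter the E (error) state stop accepting jobs until the flag is cleared by hand; a minimal sketch with plain gridengine commands:

    # show which queue instances are in error state and why
    qstat -f -explain E
    # clear the error flag so jobs can be scheduled there again
    qmod -c '*'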
22:53 <bd808> forcing puppet run on tools-sgebastion-07 [tools]
2020-09-02 §
18:13 <andrewbogott> moving tools-sgeexec-0920 to ceph [tools]
17:57 <andrewbogott> moving tools-sgeexec-0942 to ceph [tools]
2020-08-31 §
19:58 <andrewbogott> migrating tools-sgeexec-091[0-9] to ceph [tools]
17:19 <andrewbogott> migrating tools-sgeexec-090[4-9] to ceph [tools]
17:19 <andrewbogott> repooled tools-sgeexec-0901 [tools]
16:52 <bstorm> `apt install uwsgi` was run on tools-checker-03 in the last log T261677 [tools]
16:51 <bstorm> running `apt install uwsgi` with --allow-downgrades to fix the puppet setup there T261677 [tools]
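Downgrading a package back to the version puppet manages looks roughly like this; the version string is a placeholder:

    # pin to the exact version puppet expects; --allow-downgrades is needed
    # because apt refuses to move backwards by default
    apt-get install --allow-downgrades uwsgi=2.0.15-1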
14:26 <andrewbogott> depooling tools-sgeexec-0901, migrating to ceph [tools]
2020-08-30 §
00:57 <Krenair> also ran qconf -ds on each [tools]
00:34 <Krenair> Tidied up SGE problems (it was spamming root@ every minute for hours) following host deletions some hours ago - removed tools-sgeexec-0921 through 0931 from @general, ran qmod -rj on all jobs registered for those nodes, then qdel -f on the remainders, then qconf -de on each deleted node [tools]
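The cleanup sequence described above maps onto a handful of gridengine admin commands; a sketch, with a placeholder job id and one example host:

    # remove the retired hosts from the @general hostgroup (opens an editor)
    qconf -mhgrp @general
    # reschedule jobs still bound to those hosts, force-deleting any that remain
    qmod -rj <job_id>
    qdel -f <job_id>
    # finally drop each host from the execution-host and submit-host lists
    qconf -de tools-sgeexec-0921
    qconf -ds tools-sgeexec-0921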
2020-08-29 §
16:02 <bstorm> deleting "tools-sgeexec-0931", "tools-sgeexec-0930", "tools-sgeexec-0929", "tools-sgeexec-0928", "tools-sgeexec-0927" [tools]
16:00 <bstorm> deleting "tools-sgeexec-0926", "tools-sgeexec-0925", "tools-sgeexec-0924", "tools-sgeexec-0923", "tools-sgeexec-0922", "tools-sgeexec-0921" [tools]
2020-08-26 §
21:08 <bd808> Disabled puppet on tools-proxy-06 to test fixes for a bug in the new T251628 code [tools]
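Disabling puppet for a manual test is routine; the disable message below is illustrative, but leaving one is good practice so the next person knows why the agent is paused:

    puppet agent --disable 'testing T251628 proxy fixes'
    # ...apply and test the candidate change by hand...
    puppet agent --enable
    puppet agent --test     # confirm a clean run once re-enabled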
08:54 <arturo> merged several patches by bryan for toolforge front proxy (cleanups, etc) example: https://gerrit.wikimedia.org/r/c/operations/puppet/+/622435 [tools]
2020-08-25 §
19:38 <andrewbogott> deleting tools-sgeexec-0943.tools.eqiad.wmflabs, tools-sgeexec-0944.tools.eqiad.wmflabs, tools-sgeexec-0945.tools.eqiad.wmflabs, tools-sgeexec-0946.tools.eqiad.wmflabs, tools-sgeexec-0948.tools.eqiad.wmflabs, tools-sgeexec-0949.tools.eqiad.wmflabs, tools-sgeexec-0953.tools.eqiad.wmflabs — they are broken and we're not very curious why; will retry this exercise when everything is standardized on [tools]
15:03 <andrewbogott> removing non-ceph nodes tools-sgeexec-0921 through tools-sgeexec-0931 [tools]
15:02 <andrewbogott> added new sge-exec nodes tools-sgeexec-0943 through tools-sgeexec-0953 (for real this time) [tools]
2020-08-19 §
21:29 <andrewbogott> shutting down and removing tools-k8s-worker-20 through tools-k8s-worker-29; this load can now be handled by new nodes on ceph hosts [tools]
21:15 <andrewbogott> shutting down and removing tools-k8s-worker-1 through tools-k8s-worker-19; this load can now be handled by new nodes on ceph hosts [tools]
18:40 <andrewbogott> creating 13 new xlarge k8s worker nodes, tools-k8s-worker-67 through tools-k8s-worker-79 [tools]
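Retiring the old workers (the two "shutting down and removing" entries above) generally means draining each node as in the earlier drain sketch, deleting its node object, and then deleting the VM itself; roughly, for one node:

    # remove the node from the cluster after it has been drained
    kubectl delete node tools-k8s-worker-20
    # then delete the underlying VM (assumes OpenStack credentials for the project)
    openstack server delete tools-k8s-worker-20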
2020-08-18 §
15:24 <bd808> Rebuilding all Docker containers to pick up newest versions of installed packages [tools]
2020-07-30 §
16:28 <andrewbogott> added new xlarge ceph-hosted worker nodes: tools-k8s-worker-61, 62, 63, 64, 65, 66. T258663 [tools]
2020-07-29 §
23:24 <bd808> Pushed a copy of docker-registry.wikimedia.org/wikimedia-jessie:latest to docker-registry.tools.wmflabs.org/wikimedia-jessie:latest in preparation for the upstream image going away [tools]
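Copying an image between registries is a pull/tag/push cycle from any host that can reach both; roughly:

    docker pull docker-registry.wikimedia.org/wikimedia-jessie:latest
    docker tag docker-registry.wikimedia.org/wikimedia-jessie:latest \
        docker-registry.tools.wmflabs.org/wikimedia-jessie:latest
    docker push docker-registry.tools.wmflabs.org/wikimedia-jessie:latest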
2020-07-24 §
22:33 <bd808> Removed a few more ancient docker images: grrrit, jessie-toollabs, and nagf [tools]
21:02 <bd808> Running cleanup script to delete the non-sssd toolforge images from docker-registry.tools.wmflabs.org [tools]
20:17 <bd808> Forced garbage collection on docker-registry.tools.wmflabs.org [tools]
20:06 <bd808> Running cleanup script to delete all of the old toollabs-* images from docker-registry.tools.wmflabs.org [tools]
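Deleting images from a Docker registry only removes manifests and tags; the blob storage is reclaimed by a separate garbage-collection pass on the registry host. A sketch, with the binary name and config path depending on how the registry is packaged:

    # run on the registry host after the cleanup script has deleted the manifests
    registry garbage-collect /etc/docker/registry/config.yml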
2020-07-22 §
23:24 <bstorm> created server group 'tools-k8s-worker' to hold new worker nodes so that openstack has a low chance of scheduling them together unless it is necessary T258663 [tools]
23:22 <bstorm> running puppet and NFS 4.2 remount on tools-k8s-worker-[56-60] T257945 [tools]
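Moving a share to NFS 4.2, as in the last entry, needs more than `mount -o remount`, since the protocol version of a live NFS mount cannot be changed in place; a sketch, with the mount point as a placeholder:

    # unmount and remount so the vers=4.2 option from the puppet-managed fstab takes effect
    umount /mnt/nfs/<tools-share> && mount -a
    # confirm the share came back with the new protocol version
    nfsstat -m | grep vers=4.2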