tools SAL

2016-01-21 §
19:13	<YuviPanda>	depooled instances on labvirt1003	[tools]
19:06	<YuviPanda>	re-enabled queues on exec nodes that were on labvirt1002	[tools]
19:02	<YuviPanda>	failed over tools proxy to tools-proxy-02	[tools]
18:46	<YuviPanda>	drained and disabled queues on all nodes on labvirt1002	[tools]
18:38	<YuviPanda>	restarted all restartable jobs in instances on labvirt1001 and deleted all non-restartable ghost jobs. these were already dead	[tools]
2016-01-20 §
14:50	<chasemp>	reboot tools-webgrid-lighttpd-1209 as frozen	[tools]
2016-01-15 §
18:34	<chasemp>	tools-mail-01 is locked up I am rebooting	[tools]
2016-01-14 §
01:56	<YuviPanda>	rm service.manifest for wikiviewstats to prevent it from constantly trying to start up and fail webservice	[tools]
01:32	<YuviPanda>	stopped erwin85's tools since it was causing replag on labsdb1002	[tools]
2016-01-11 §
22:19	<valhallasw`cloud>	reset maxujobs 0->128, job_load_adjustments none->np_load_avg=0.50, load_ad... -> 0:7:30	[tools]
22:12	<YuviPanda>	restarted gridengine master again	[tools]
22:07	<valhallasw`cloud>	set job_load_adjustments from np_load_avg=0.50 to none and load_adjustment_decay_time to 0:0:0	[tools]
22:05	<valhallasw`cloud>	set maxujobs back to 0, but doesn't help	[tools]
21:57	<valhallasw`cloud>	reset to 7:30	[tools]
21:57	<valhallasw`cloud>	that cleared the measure, but jobs still not starting. Ugh!	[tools]
21:55	<valhallasw`cloud>	set load_adjustment_decay_time = 0:0:0	[tools]
21:45	<YuviPanda>	restarted gridengine master	[tools]
21:43	<valhallasw`cloud>	qstat -j <jobid> shows all queues overloaded; seems to have started just after a load test for the new maxujobs setting	[tools]
21:42	<valhallasw`cloud>	resetting to 0:7:30, as it's not having the intended effect	[tools]
21:41	<valhallasw`cloud>	currently 353 jobs in qw state	[tools]
21:40	<valhallasw`cloud>	that's load_adjustment_decay_time	[tools]
21:40	<valhallasw`cloud>	temporarily sudo qconf -msconf to 0:0:1	[tools]
19:59	<YuviPanda>	Set maxujobs (max concurrent jobs per user) on gridengine to 128	[tools]
17:51	<YuviPanda>	kill all queries running on labsdb1003	[tools]
17:20	<YuviPanda>	stopped webservice for quentinv57-tools	[tools]
2016-01-09 §
21:07	<valhallasw`cloud>	moved tools-checker/208.80.155.229 back to tools-checker-01	[tools]
21:02	<andrewbogott>	rebooting tools-checker-01 as it is unresponsive.	[tools]
13:12	<valhallasw`cloud>	tools-worker-1002 is unresponsive. Maybe that's where the other grrrit-wm is hiding? Rebooting.	[tools]
2016-01-08 §
19:46	<chasemp>	couldn't get into tools-mail-01 at all and it seemed borked so I rebooted	[tools]
17:23	<andrewbogott>	killing tools.icelab as per https://wikitech.wikimedia.org/wiki/User_talk:Torin#Running_queries_on_tools-dev_.28tools-bastion-02.29	[tools]
2015-12-30 §
04:06	<YuviPanda>	delete all webgrid jobs to start with a clean slate	[tools]
03:54	<YuviPanda>	qmod -rj all tools in the continuous queue, they are all orphaned	[tools]
03:22	<YuviPanda>	stop cron on tools-submit, wait for webservices to come back up	[tools]
02:39	<YuviPanda>	remove lbenedix and ebekebe from tools.hcclab	[tools]
00:40	<YuviPanda>	restarted master on grid-master	[tools]
00:40	<YuviPanda>	copied and cleaned out spooldb	[tools]
00:10	<YuviPanda>	reboot tools-grid-shadow	[tools]
00:08	<YuviPanda>	attempt to stop shadowd	[tools]
00:03	<YuviPanda>	attempting to start gridengine-master on tools-grid-shadow	[tools]
00:00	<YuviPanda>	kill -9'd gridengine master	[tools]
2015-12-29 §
23:31	<YuviPanda>	rebooting tools-grid-master	[tools]
23:22	<YuviPanda>	restart gridengine-master on tools-grid-master	[tools]
00:18	<YuviPanda>	shut down redis on tools-redis-01	[tools]
2015-12-28 §
22:34	<chasemp>	attempt to unmount nfs volumes on tools-redis-01 to debug but it hangs (I am on console and see root at console hang on login)	[tools]
22:31	<YuviPanda>	disable NFS on tools-redis-1001 and 1002	[tools]
21:32	<YuviPanda>	disable puppet on tools-redis-01 and -02	[tools]
21:27	<YuviPanda>	created tools-redis-1001	[tools]
2015-12-23 §
21:21	<YuviPanda>	deleted tools-worker-01 to -05, creating tools-worker-1001 to 1005	[tools]
21:19	<valhallasw`cloud>	tools-proxy-01: umount /home /data/project /data/scratch /public/dumps	[tools]
19:01	<valhallasw`cloud>	ah, connections that are kept open. A new incognito window is routed correctly.	[tools]