2016-01-11
|
22:12 |
<YuviPanda> |
restarted gridengine master again |
[tools] |
22:07 |
<valhallasw`cloud> |
set job_load_adjustments from np_load_avg=0.50 to none and load_adjustment_decay_time to 0:0:0 |
[tools] |
22:05 |
<valhallasw`cloud> |
set maxujobs back to 0, but that doesn't help |
[tools] |
21:57 |
<valhallasw`cloud> |
reset load_adjustment_decay_time to 0:7:30 |
[tools] |
21:57 |
<valhallasw`cloud> |
that cleared the measure, but jobs still not starting. Ugh! |
[tools] |
21:55 |
<valhallasw`cloud> |
set load_adjustment_decay_time = 0:0:0 |
[tools] |
21:45 |
<YuviPanda> |
restarted gridengine master |
[tools] |
21:43 |
<valhallasw`cloud> |
qstat -j <jobid> shows all queues overloaded; seems to have started just after a load test for the new maxujobs setting |
[tools] |
21:42 |
<valhallasw`cloud> |
resetting to 0:7:30, as it's not having the intended effect |
[tools] |
21:41 |
<valhallasw`cloud> |
currently 353 jobs in qw state |
[tools] |
21:40 |
<valhallasw`cloud> |
that's load_adjustment_decay_time |
[tools] |
21:40 |
<valhallasw`cloud> |
temporarily sudo qconf -msconf to 0:0:1 |
[tools] |
19:59 |
<YuviPanda> |
Set maxujobs (max concurrent jobs per user) on gridengine to 128 |
[tools] |
17:51 |
<YuviPanda> |
kill all queries running on labsdb1003 |
[tools] |
17:20 |
<YuviPanda> |
stopped webservice for quentinv57-tools |
[tools] |
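
Note on the 2016-01-11 scheduler entries: maxujobs, job_load_adjustments and load_adjustment_decay_time all live in the gridengine scheduler configuration, which `qconf -ssconf` prints as "name value" lines and `sudo qconf -msconf` opens for editing. The sketch below is one possible way to check a single value before and after such a change. It assumes an admin host with qconf on the PATH. The helper name `sched_param` is hypothetical and is not something used in the log.

```shell
# Minimal sketch, not the procedure actually used in the log entries above.
# sched_param is a hypothetical helper: it reads "name value" lines
# (the layout printed by `qconf -ssconf`) on stdin and prints the value
# of the parameter named in $1. Prints nothing if the name is absent.
sched_param() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# Usage on a live gridengine admin host (not run here):
#   qconf -ssconf | sched_param maxujobs
#   qconf -ssconf | sched_param job_load_adjustments
#   qconf -ssconf | sched_param load_adjustment_decay_time
#   qstat -u '*' -s p     # list pending jobs for all users
#   qstat -j <jobid>      # scheduling info for a single job
```

Reading the value back after each `qconf -msconf` edit gives a quick confirmation that the scheduler picked up the intended setting before waiting on the qw queue to drain.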
2015-12-23
|
21:21 |
<YuviPanda> |
deleted tools-worker-01 to -05, creating tools-worker-1001 to 1005 |
[tools] |
21:19 |
<valhallasw`cloud> |
tools-proxy-01: umount /home /data/project /data/scratch /public/dumps |
[tools] |
19:01 |
<valhallasw`cloud> |
ah, connections that are kept open. A new incognito window is routed correctly. |
[tools] |
18:59 |
<valhallasw`cloud> |
switched to -02, worked correctly, switched back. Switching back does not seem to fully work?! |
[tools] |
18:40 |
<valhallasw`cloud> |
scratch that, first going to eat dinner |
[tools] |
18:38 |
<valhallasw`cloud> |
dynamicproxy ban system deployed on tools-proxy-02 working correctly for localhost; switching over users there by moving the external IP. |
[tools] |
14:42 |
<valhallasw`cloud> |
toollabs homepage is unhappy because tools.xtools-articleinfo is using a lot of CPU on tools-webgrid-lighttpd-1409. Checking to see what's happening there. |
[tools] |
10:46 |
<YuviPanda> |
migrate tools-worker-01 to 3.19 kernel |
[tools] |