I set up a cluster today (StarCluster 0.93.2) and noticed that the 'master' node
was not being reported by "qstat -f" and was not accepting jobs from the queue.
For example, with 12 nodes x 8 CPUs each (96 slots), when 96 jobs are
submitted, only 88 run (nodes 1-11) while 8 remain waiting in the queue.
I tried restarting the cluster using the 'sge' plugin to manually ensure that
master_is_exec_host was set to 'True', but the result was the same: 88 running,
8 waiting.
From http://mailman.mit.edu/pipermail/starcluster/2012-March/001109.html