Longjobs -- Checking Job Status
There are 3 factors determining scheduling priority for a job:
Wait time will also depend on the number of execution machines available:
To check job status, use the command qstat. Without arguments, it displays status for all jobs currently in the system.
For example:
athena% qstat
Job ID Username Queue      Jobname Limit State Elapsed
------ -------- -----      ------- ----- ----- -------
240    hmprof   sun-long   STDIN   27:00 Hold  --
379    jqpublic sun-long   foo     27:00 Run   00:03
385    jqpublic sun-medium STDIN   06:00 Que   --
386    llta     sun-medium bar1    06:00 Run   00:02
387    llta     linux-long bar2    27:00 Run   13:37
Syntax for other common options:
qstat jobID
Example:
athena% qstat 385
Job ID Username Queue      Jobname Limit State Elapsed
------ -------- -----      ------- ----- ----- -------
385    jqpublic sun-medium STDIN   06:00 Run   00:03
qstat -u username
Example:
athena% qstat -u jqpublic
Job ID Username Queue      Jobname Limit State Elapsed
------ -------- -----      ------- ----- ----- -------
379    jqpublic sun-long   foo     27:00 Que   --
385    jqpublic sun-medium STDIN   06:00 Run   00:03
qstat -l
Example (in combination with the Job ID):
athena% qstat -l 385
Job ID Username Queue      Jobname Limit State Elapsed
------ -------- -----      ------- ----- ----- -------
385    jqpublic sun-medium STDIN   06:00 Run   00:03
    Job started on Thu Feb 07 at 11:08

This is also useful for obtaining the estimated start time of a queued job, and for troubleshooting why a job has not been scheduled to run. For example:
Job ID Username Queue      Jobname Limit State Elapsed
------ -------- -----      ------- ----- ----- -------
386    jqpublic sun-medium matbg   00:10 Que   --
    Estimated start time: Thu Feb 07 at 17:18
387    jqpublic sun-long   STDIN   27:00 Que   --
    Not Running: No node can provide job's requested resources
388    jqpublic linux-long STDIN   27:00 Que   --
    Nodes serving this queue are down
Possible job states:
Run    (running)
Que    (queued)
Hold   (on hold, per user, admin, or system)
Wait   (time wait, if specified via qsub -A)
Exit   (exiting)
Trans  (in transit - not yet committed, or being routed)
Susp   (suspended)

You can also obtain job status via the longjob -jobs interface. Currently this simply invokes qstat -l.
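The listings above are whitespace-separated columns, so they are straightforward to post-process in a script. The following is a minimal sketch, not part of the Longjobs tools: the names parse_qstat and jobs_in_state are hypothetical, the sample text is copied from the examples above, and it assumes that job names never contain whitespace.

```python
# Sample text in the layout shown above; in practice this would be the
# captured output of qstat.
SAMPLE = """\
Job ID Username Queue      Jobname Limit State Elapsed
------ -------- -----      ------- ----- ----- -------
240    hmprof   sun-long   STDIN   27:00 Hold  --
379    jqpublic sun-long   foo     27:00 Run   00:03
386    llta     sun-medium bar1    06:00 Run   00:02
    Job started on Thu Feb 07 at 11:08
"""

FIELDS = ("job_id", "username", "queue", "jobname", "limit", "state", "elapsed")


def parse_qstat(text):
    """Return one dict per job row, skipping header, rule, and message lines."""
    jobs = []
    for line in text.splitlines():
        parts = line.split()
        # Assumption: a job row has exactly seven columns and a numeric job ID.
        if len(parts) == len(FIELDS) and parts[0].isdigit():
            jobs.append(dict(zip(FIELDS, parts)))
    return jobs


def jobs_in_state(jobs, state):
    """Return the IDs of all jobs whose State column equals `state`."""
    return [job["job_id"] for job in jobs if job["state"] == state]


if __name__ == "__main__":
    for job in parse_qstat(SAMPLE):
        print(job["job_id"], job["state"])
```

Calling jobs_in_state(parse_qstat(text), "Que") on real output would list the jobs still waiting to run; the extra message lines printed by qstat -l are ignored because they do not match the seven-column shape.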
To view all jobs in a particular queue, use: qstat queue
For example:
athena% qstat sun-long
Job ID Username Queue      Jobname Limit State Elapsed
------ -------- -----      ------- ----- ----- -------
240    hmprof   sun-long   STDIN   27:00 Hold  --
379    jqpublic sun-long   foo     27:00 Run   22:03
390    llta     sun-long   bar2    27:00 Run   13:37
403    llta     sun-long   bar3    15:00 Que   --
To view a summary of all queues use: qstat -q
For example:
athena% qstat -q
Queue            Limit Run Que State
---------------- ----- --- --- -----
any-medium       06:00   0   0
any-long         27:00   0   0
sun-medium       06:00   1   4
sun-long         27:00   0   1
linux-medium     06:00   0   0
linux-long       27:00   1   0
29.123-res       27:00   1   3 Inaccessible
                       --- ---
                         3   8

Notes:
Limit column will always show the actual definition.)
The any- queues are generic, for jobs that may run on any available platform. Note that there is no way to indicate a preference for a certain platform with such a queue, either by the user or in the server; when more than one type of machine is available to run a job, one cannot reliably predict which machine will be chosen by the server.
Inaccessible means that you are not permitted to use that queue; its access is restricted to a particular set of users (for example, a class which has reserved a set of machines).
You can also use the command longjob -queues to display a summary of all configured queues.
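As a rough illustration of reading the queue summary programmatically, the sketch below parses text in the qstat -q layout shown above, lists the queues not marked Inaccessible, and recomputes the Run and Que totals. The function name parse_queues is hypothetical, and the sketch assumes every queue row has a Limit in HH:MM form followed by two integer counts.

```python
# Sample text in the qstat -q layout shown above.
SAMPLE = """\
Queue            Limit Run Que State
---------------- ----- --- --- -----
any-medium       06:00   0   0
any-long         27:00   0   0
sun-medium       06:00   1   4
sun-long         27:00   0   1
linux-medium     06:00   0   0
linux-long       27:00   1   0
29.123-res       27:00   1   3 Inaccessible
                       --- ---
                         3   8
"""


def parse_queues(text):
    """Return one dict per queue row; header, rule, and total lines are skipped."""
    queues = []
    for line in text.splitlines():
        parts = line.split()
        # Assumption: a queue row is name, HH:MM limit, run count, queued count,
        # and an optional state word such as "Inaccessible".
        if (len(parts) in (4, 5) and ":" in parts[1]
                and parts[2].isdigit() and parts[3].isdigit()):
            queues.append({
                "name": parts[0],
                "limit": parts[1],
                "run": int(parts[2]),
                "que": int(parts[3]),
                "state": parts[4] if len(parts) == 5 else "",
            })
    return queues


if __name__ == "__main__":
    queues = parse_queues(SAMPLE)
    usable = [q["name"] for q in queues if q["state"] != "Inaccessible"]
    print("usable queues:", ", ".join(usable))
    print("running:", sum(q["run"] for q in queues))
    print("queued:", sum(q["que"] for q in queues))
```

The recomputed sums can be compared against the totals line at the bottom of the listing as a quick sanity check on the parse.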
To view your quota, use: qusage
For example:
athena% qusage
Account  Remain Queued Running Expires
-------  ------ ------ ------- --------
29.123    54:16   0:00   27:00 01/01/02
jqpublic  30:17   0:00    0:00 02/03/04
Remain shows how much time you have left, i.e. quota minus the total time used by jobs which have already run. If you submit a job with a time limit greater than this, it will be rejected immediately. Note that it is decremented only when a job has finished.
Queued and Running show the sum of the time limits for jobs in each state.
If a new job's time limit is within Remain but you have other jobs running or queued, the new job may be held pending their completion. In particular, if the new job's limit is greater than Remain - (Queued + Running), that job will be held. If there is sufficient quota for the new job when the others have finished (or if you cancel them), it will be automatically queued; otherwise, it will remain in the hold state.
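The quota rules above can be sketched as a small decision function. This is an illustration of the stated rules only, not the server's actual code: the names to_minutes and classify_submission are hypothetical, and it assumes all times are in the hours:minutes form shown by qusage.

```python
def to_minutes(hhmm):
    """Convert an 'H:MM' or 'HH:MM' string to a number of minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def classify_submission(limit, remain, queued, running):
    """Predict what happens to a new job under the rules described above.

    Returns "rejected", "held", or "queued". A held job may later be queued
    automatically or stay on hold; that follow-up is not modeled here.
    """
    limit_m = to_minutes(limit)
    remain_m = to_minutes(remain)
    committed_m = to_minutes(queued) + to_minutes(running)

    if limit_m > remain_m:
        # More time requested than the account has left at all.
        return "rejected"
    if limit_m > remain_m - committed_m:
        # Enough quota in principle, but other jobs have a claim on it.
        return "held"
    return "queued"


if __name__ == "__main__":
    # Account 29.123 from the qusage example: Remain 54:16, Running 27:00.
    for limit in ("60:00", "30:00", "27:00"):
        print(limit, classify_submission(limit, "54:16", "0:00", "27:00"))
```

With the 29.123 figures, Remain - (Queued + Running) is 27:16, so a 27:00 job fits, a 30:00 job exceeds 27:16 but not Remain and is held, and a 60:00 job exceeds Remain and is rejected.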
For more detail (including the total quota allocated for each account),
use: qusage -l
For example:
athena% qusage -l
Account  Remain Queued Running Expires
-------  ------ ------ ------- --------
29.123    97:30   0:00    6:00 Jan 01 2002
    Allocation: 100    Platforms: any
    Created: Dec 29 2000    Modified: Feb 15 2001
jqpublic  30:17   0:00    0:00 Feb 03 2004
    Allocation: 35     Platforms: sun
    Created: Jan 06 2001    Modified: Feb 15 2001

You can also use the command longjob -quota to view your quota information.
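Since Remain is the quota minus the time used by finished jobs, the time already consumed can be worked out from the qusage -l figures. The sketch below assumes that Allocation is expressed in whole hours and Remain in hours:minutes; the function name time_used is hypothetical.

```python
def time_used(allocation_hours, remain):
    """Return the time consumed so far as 'H:MM'.

    Assumption: allocation_hours is a whole number of hours and remain
    is an 'H:MM' string, as in the qusage -l example.
    """
    hours, minutes = remain.split(":")
    remain_minutes = int(hours) * 60 + int(minutes)
    used = allocation_hours * 60 - remain_minutes
    return f"{used // 60}:{used % 60:02d}"


if __name__ == "__main__":
    print("29.123  ", time_used(100, "97:30"))
    print("jqpublic", time_used(35, "30:17"))
```

For the example accounts this gives 2:30 used on 29.123 and 4:43 used on jqpublic.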
Last modified: Mon Jul 21 11:39:41 EDT 2003