Longjobs -- Checking Job Status
There are 3 factors determining scheduling priority for a job:
Wait time will also depend on the number of execution machines available:
To check the status of jobs, use the command qstat.  Without
arguments, it displays the status of all jobs currently in the system.
For example:
athena% qstat
Job ID      Username  Queue         Jobname     Limit  State  Elapsed
------      --------  -----         -------     -----  -----  -------
240         hmprof    sun-long      STDIN       27:00  Hold   --
379         jqpublic  sun-long      foo         27:00  Run    00:03
385         jqpublic  sun-medium    STDIN       06:00  Que    --
386         llta      sun-medium    bar1        06:00  Run    00:02
387         llta      linux-long    bar2        27:00  Run    13:37
Syntax for other common options:
         qstat jobID
       
Example:
  athena% qstat 385
  Job ID      Username  Queue         Jobname     Limit  State  Elapsed
  ------      --------  -----         -------     -----  -----  -------
  385         jqpublic  sun-medium    STDIN       06:00  Run    00:03
       qstat -u username
    
Example:
  athena% qstat -u jqpublic
  Job ID      Username  Queue         Jobname     Limit  State  Elapsed
  ------      --------  -----         -------     -----  -----  -------
  379         jqpublic  sun-long      foo         27:00  Que    --
  385         jqpublic  sun-medium    STDIN       06:00  Run    00:03
       qstat -l
     
Example (in combination with the Job ID):
  athena% qstat -l 385
  Job ID      Username  Queue         Jobname     Limit  State  Elapsed
  ------      --------  -----         -------     -----  -----  -------
  385         jqpublic  sun-medium    STDIN       06:00  Run    00:03
       Job started on Thu Feb 07 at 11:08
     
This is also useful for obtaining the estimated start time of a queued
job, and for troubleshooting why a job has not been scheduled to run.
For example:
     
  Job ID      Username  Queue         Jobname     Limit  State  Elapsed
  ------      --------  -----         -------     -----  -----  -------
  386         jqpublic  sun-medium    matbg       00:10  Que    --
     Estimated start time: Thu Feb 07 at 17:18
  387         jqpublic  sun-long      STDIN       27:00  Que    --
     Not Running: No node can provide job's requested resources
  388         jqpublic  linux-long    STDIN       27:00  Que    --
     Nodes serving this queue are down
     
The State column shows one of the following values:
      Run     (running)
      Que     (queued)
      Hold    (on hold, per user, admin, or system)
      Wait    (time wait, if specified via qsub -a)
      Exit    (exiting)
      Trans   (in transit - not yet committed, or being routed)
      Susp    (suspended)
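
When many jobs are listed, it can help to tally them by state.  The following is a minimal sketch in Python, assuming the qstat output has been captured as text in the seven-column layout shown in the examples above.  The names parse_qstat and count_by_state are hypothetical helpers, not part of Longjobs, and the sketch assumes job names contain no spaces.

```python
from collections import Counter

# Sample text modeled on the qstat example output; in practice this would be
# the captured output of the qstat command.
SAMPLE = """\
Job ID      Username  Queue         Jobname     Limit  State  Elapsed
------      --------  -----         -------     -----  -----  -------
240         hmprof    sun-long      STDIN       27:00  Hold   --
379         jqpublic  sun-long      foo         27:00  Run    00:03
385         jqpublic  sun-medium    STDIN       06:00  Que    --
386         llta      sun-medium    bar1        06:00  Run    00:02
387         llta      linux-long    bar2        27:00  Run    13:37
"""

FIELDS = ("job_id", "username", "queue", "jobname", "limit", "state", "elapsed")


def parse_qstat(text):
    """Return one dict per job row (hypothetical helper).

    A line counts as a job row when it has exactly seven
    whitespace-separated fields and the first one is a numeric job ID.
    Header, separator, and qstat -l detail lines are skipped.
    """
    jobs = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == len(FIELDS) and parts[0].isdigit():
            jobs.append(dict(zip(FIELDS, parts)))
    return jobs


def count_by_state(jobs):
    """Tally jobs by the value of their State column."""
    return Counter(job["state"] for job in jobs)


jobs = parse_qstat(SAMPLE)
print(count_by_state(jobs))
```

The same parser also accepts qstat -l output, because the indented detail lines under each job do not match the seven-field, numeric-ID pattern.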
You can also obtain job status via the longjob -jobs
interface.  Currently this simply invokes qstat -l.
To view all jobs in a particular queue, use:   qstat queue
  For example:
athena% qstat sun-long
Job ID      Username  Queue         Jobname     Limit  State  Elapsed
------      --------  -----         -------     -----  -----  -------
240         hmprof    sun-long      STDIN       27:00  Hold   --
379         jqpublic  sun-long      foo         27:00  Run    22:03
390         llta      sun-long      bar2        27:00  Run    13:37
403         llta      sun-long      bar3        15:00  Que    --
To view a summary of all queues use:  qstat -q
  For example:
athena% qstat -q
Queue             Limit  Run  Que  State
----------------  -----  ---  ---  -----
any-medium        06:00    0    0
any-long          27:00    0    0
sun-medium        06:00    1    4
sun-long          27:00    0    1
linux-medium      06:00    0    0
linux-long        27:00    1    0
29.123-res        27:00    1    3  Inaccessible
                         ---  ---
                           3    8
Notes:
Limit column will always
       show the actual definition.)
  any- queues are generic, for jobs that may run on
       any available platform.  Note that there is no way to indicate
       a preference for a certain platform with such a queue, either by
       the user or in the server; when more than one type of machine is
       available to run a job, one cannot reliably predict which machine
       will be chosen by the server.
  Inaccessible means that you are not permitted to use
       that queue; its access is restricted to a particular set of
       users (for example, a class which has reserved a set of machines).
You can also use the command longjob -queues to display a
summary of all configured queues.
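
To find the queues you can actually submit to, the summary can be filtered on the State column.  The sketch below is in Python and assumes the qstat -q output has been captured as text in the layout shown in the example above.  The name parse_queue_summary is a hypothetical helper, not part of Longjobs.

```python
# Sample text copied from the qstat -q example output.
SAMPLE = """\
Queue             Limit  Run  Que  State
----------------  -----  ---  ---  -----
any-medium        06:00    0    0
any-long          27:00    0    0
sun-medium        06:00    1    4
sun-long          27:00    0    1
linux-medium      06:00    0    0
linux-long        27:00    1    0
29.123-res        27:00    1    3  Inaccessible
                         ---  ---
                           3    8
"""


def parse_queue_summary(text):
    """Return one dict per queue row (hypothetical helper).

    A queue row has a name, an HH:MM limit, two integer counts, and an
    optional state word.  Header, separator, and total lines are skipped.
    """
    queues = []
    for line in text.splitlines():
        parts = line.split()
        if (len(parts) in (4, 5) and ":" in parts[1]
                and parts[2].isdigit() and parts[3].isdigit()):
            queues.append({
                "queue": parts[0],
                "limit": parts[1],
                "run": int(parts[2]),
                "que": int(parts[3]),
                "state": parts[4] if len(parts) == 5 else "",
            })
    return queues


queues = parse_queue_summary(SAMPLE)

# Queues not marked Inaccessible are the ones this user may submit to.
accessible = [q["queue"] for q in queues if q["state"] != "Inaccessible"]
print(accessible)

# The per-queue counts should add up to the totals on the last line.
print(sum(q["run"] for q in queues), sum(q["que"] for q in queues))
```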
To check your quota, use the command qusage.  For example:
athena% qusage
Account     Remain  Queued  Running  Expires
-------     ------  ------  -------  --------
29.123       54:16    0:00    27:00  01/01/02
jqpublic     30:17    0:00     0:00  02/03/04
Remain shows how much time you have left, i.e.  quota minus
       the total time used by jobs which have already run.  If you submit a job
       with a time limit greater than this, it will be rejected
       immediately.  Note that it is decremented only when a job has
       finished. 
  Queued and Running show the sum of the
       time limits for jobs in each state. 
  If the new job's limit does not exceed Remain but you have other jobs running or
       queued, the new job may be held pending their completion.  In
       particular,  
       if the new job's limit is greater than Remain - (Queued +
       Running), that job will be held --
       if there is sufficient quota for the new job when the others
       have finished (or if you cancel them) it will be automatically
       queued; otherwise, it will remain in the hold state.
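
The quota rules above can be sketched as a small Python function.  This is an illustration only, not the server's actual logic: submission_outcome and to_minutes are hypothetical names, and the sketch assumes the qusage columns are in hours:minutes.

```python
def to_minutes(hhmm):
    """Convert an 'H:MM' or 'HH:MM' string to minutes (assumed format)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def submission_outcome(limit, remain, queued, running):
    """Classify a new job as 'rejected', 'held', or 'queued'.

    limit   -- time limit requested for the new job
    remain  -- the Remain column from qusage
    queued  -- the Queued column from qusage
    running -- the Running column from qusage
    """
    limit_m = to_minutes(limit)
    remain_m = to_minutes(remain)
    committed_m = to_minutes(queued) + to_minutes(running)

    # A limit greater than Remain is rejected immediately.
    if limit_m > remain_m:
        return "rejected"
    # A limit greater than Remain - (Queued + Running) is held.
    if limit_m > remain_m - committed_m:
        return "held"
    return "queued"


# Account 29.123 from the qusage example: 54:16 remaining, 27:00 running.
print(submission_outcome("27:00", "54:16", "0:00", "27:00"))
print(submission_outcome("60:00", "54:16", "0:00", "27:00"))
```

With the figures for account 29.123, the uncommitted quota is 54:16 - 27:00 = 27:16, so a second 27:00 job still fits, while a 60:00 job exceeds Remain outright.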
For more detail (including the total quota allocated for each account),
use:   qusage -l
  For example:   
athena% qusage -l
Account     Remain  Queued  Running  Expires
-------     ------  ------  -------  --------
29.123       97:30    0:00     6:00  Jan 01 2002
     Allocation: 100  Platforms: any  Created: Dec 29 2000  Modified: Feb 15 2001
jqpublic     30:17    0:00     0:00  Feb 03 2004
     Allocation: 35  Platforms: sun  Created: Jan 06 2001  Modified: Feb 15 2001
You can also use the command
longjob -quota to view your
quota information.
Last modified: Mon Jul 21 11:39:41 EDT 2003