Longjobs -- Running Jobs
There are two ways to submit a job:
longjob -submit interface (shows accounts and queues available to you, and prompts for basic submission parameters)
qsub command directly (more flexible than longjob but provides no context and requires more attention to command syntax)
~/my_jobs/jobname.e483 and ~/my_jobs/jobname.o483). If it is unable to save the files there, it will instead email them to you.
-rn option in qsub, a script directive, or the QSUB environment variable).
longjob -submit
In the example below, user input is shown as value; [value] indicates that the user accepted the listed default with the <Enter> key.
Before running longjob -submit:
add longjobs if you haven't already
cd to the job's working directory (this should either be in your homedir, or in a locker which your job script attaches)
    athena% longjob -submit
    You are registered in the following account(s), with available quota
    as shown:

    Account  Remain  Queued  Running  Expires
    -------  ------  ------  -------  -------
    29.123   97:23   0:00    0:00     01/01/02

    The following queues are configured. You may not submit a job to any
    queue whose status contains 'Inaccessible'.

    Queue             Limit  Run  Que  State
    ----------------  -----  ---  ---  -----
    any-medium        06:00    0    0
    any-long          27:00    0    0
    linux-medium      06:00    0    0
    linux-long        27:00    1    0
    sun-medium        06:00    1    4
    sun-long          27:00    0    1
    reserved-1        27:00    1    3
    reserved-2        27:00    0    0  Inaccessible
                             ---  ---
                               3    8

    Please enter an account: [29.123]
    Please enter a queue: sun-long

    Jobs with shorter time limits will tend to have queue priority. You
    may accept the default time limit for the queue, or specify a lower
    limit, if you know that your job will complete in time. In either
    case, your job will be terminated if it exceeds the time limit.

    Please enter a time limit in hours, or in hh:mm format: [27:00] 15

    Please enter the name of your script file,
    or <Enter> to read commands from standard input.
    Script file: [<stdin>] foo

    [Creating a renewable Kerberos ticket for use by the job]
    Password for jqpublic@ATHENA.MIT.EDU:
    [Forwarding renewable Kerberos 5 TGT...]
    478.longm.mit.edu

    Your job has been submitted. Use the 'qstat' command to display the
    status of your job, and the job queues. The current queue status:

    Job ID  Username  Queue     Jobname  Limit  State  Elapsed
    ------  --------  -----     -------  -----  -----  -------
    478     jqpublic  sun-long  foo      15:00  Run    --

    Job started on Thu Feb 01 at 11:16

Note that your job will be listed with State = Hold if you have other jobs running which might put you over quota; see the quota information on the Checking Job Status page for details.
qsub
Before running qsub:
add longjobs if you haven't already
cd to the job's working directory (this should either be in your homedir, or in a locker which your job script attaches)
qsub:
To submit a script file:
    qsub -a account -q queue script_name
example:
    athena% qsub -a 29.123 -q sun-long foo
    Password for jqpublic@ATHENA.MIT.EDU:
    [Forwarding renewable Kerberos 5 TGT...]
    483.longm.mit.edu
To enter commands on standard input (end the input with Ctrl-D):
    qsub -a account -q queue
example:
    athena% qsub -a 29.123 -q sun-long
    cd $PBS_O_WORKDIR
    rm -f ./my_data
    ./my_program > /tmp/raw_output
    ./crunch_data /tmp/raw_output > ./my_data
    ^D
    Password for jqpublic@ATHENA.MIT.EDU:
    [Forwarding renewable Kerberos 5 TGT...]
    484.longm.mit.edu
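The commands typed on standard input in the example above could instead be kept in a script file and submitted by name. The following is a minimal sketch, assuming a Bourne-style shell; the helper name write_job_script is hypothetical, and the program names (./my_program, ./crunch_data) are simply carried over from the example.

```shell
# write_job_script FILE: hypothetical helper that saves the commands from the
# standard-input example into FILE, so that FILE can be passed to qsub by name.
write_job_script() {
    # The quoted 'EOF' keeps $PBS_O_WORKDIR literal in the file; the variable
    # is only expanded later, when the job itself runs.
    cat > "$1" <<'EOF'
# Start in the directory the job was submitted from.
cd $PBS_O_WORKDIR
rm -f ./my_data
./my_program > /tmp/raw_output
./crunch_data /tmp/raw_output > ./my_data
EOF
}
```

After running write_job_script foo, the resulting file could be submitted as in the first example, with qsub -a 29.123 -q sun-long foo.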
For a list of available queues, use the qstat -q command (more details on the checking status page).
qsub options
(This is not an exhaustive list; see the qsub man page for information on additional options.)
-l resource_list
    -l walltime=hh:mm:ss
For example,
    -l walltime=15:00:00
will limit the job to 15 hours. Note that time should be specified in the format hh:mm:ss (otherwise a single number will be interpreted as seconds and xx:yy will be interpreted as minutes and seconds).
-r rerunnability
y; the standard output and error streams will be recreated, but if your job attempts to write to output files directly it may produce undesirable behaviors (e.g., adding to data from an earlier partial run, or failing to create a file that already exists). If set to n, the service will not attempt to rerun the job; instead, it will simply produce a message about the failure. n for no, y for yes. Default is y.
-z zephyr_options
n for none, or a combination of a (job aborted), b (job began), e (job ended). Default is abe.
-m mail_options
n for none, or a combination of a (job aborted), b (job began), e (job ended). Default is abe.
-E email_output
e for error, o for output, oe for both, or n for neither. Default is n.
-e error_path
submit_dir/job_name.ejob_id (for example, ~/my_jobs/foo.e483 for a script foo submitted from ~/my_jobs).
-o output_path
submit_dir/job_name.ojob_id (for example, ~/my_jobs/foo.o483 for a script foo submitted from ~/my_jobs).
-j join
n for not merged, eo for intermixed as standard error, or oe for intermixed as standard output. Default is n.
-N name
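As noted for -l walltime above, qsub expects the limit as hh:mm:ss, while the longjob -submit prompt accepts a plain number of hours or hh:mm. The sketch below converts between the two conventions. The function name to_walltime is hypothetical, and it assumes whole hours (no fractional values).

```shell
# to_walltime VALUE: hypothetical helper that turns a limit given as hours
# ("15") or hh:mm ("06:30") into the hh:mm:ss form used with -l walltime.
to_walltime() {
    case "$1" in
        *:*:*) printf '%s\n' "$1" ;;        # already hh:mm:ss, pass through
        *:*)   printf '%s:00\n' "$1" ;;     # hh:mm, append zero seconds
        *)     printf '%s:00:00\n' "$1" ;;  # whole hours only
    esac
}
```

For example, qsub -a 29.123 -q sun-long -l walltime=$(to_walltime 15) foo would request the same 15-hour limit as -l walltime=15:00:00.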
The QSUB environment variable may be used to set default options for qsub. For example:
    setenv QSUB "-a 29.123 -l walltime=10:00:00"
Options specified at the command line or in job script directives take precedence over those specified via the QSUB environment variable.
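The setenv line above is csh/tcsh syntax. For users of a Bourne-style shell (sh, bash), an equivalent setting might look like the following sketch; the option values are the same illustrative ones used in the csh example.

```shell
# Bourne-style equivalent of the csh setenv example: set QSUB, then export it
# so that programs started from this shell can see it in their environment.
QSUB="-a 29.123 -l walltime=10:00:00"
export QSUB
```

With this in place, a later qsub -q sun-long foo would be expected to pick up the account and time limit from QSUB, unless they are overridden on the command line or in a script directive.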
add longjobs if you haven't already.]
There are two ways to cancel a job:
longjob -cancel interface (shows which jobs you have queued, and prompts for the ID of the job you wish to cancel)
qdel command directly (useful when you know the ID of the job you wish to cancel, or want to cancel more than one job), e.g.:
    qdel jobid ...
where jobid is the number given to you at submit time (and which you can look up with qstat). For example:
    athena% qstat -u jqpublic
    Job ID  Username  Queue       Jobname  Limit  State  Elapsed
    ------  --------  -----       -------  -----  -----  -------
    379     jqpublic  sun-long    foo      27:00  Que    --
    385     jqpublic  sun-medium  STDIN    06:00  Run    00:03
    athena% qdel 379
    Job 379.hydrogen.mit.edu deleted.
You can cancel more than one job by specifying all of the job IDs on the command line:
    athena% qdel 382 388
    Job 382.hydrogen.mit.edu deleted.
    Job 388.hydrogen.mit.edu deleted.
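When several jobs are still waiting, the IDs to pass to qdel can be pulled out of the qstat listing rather than copied by hand. The sketch below is one way to do that, assuming the listing has exactly the column layout shown in the example above (job ID first, State sixth) and that job names contain no spaces. The function name queued_job_ids is hypothetical.

```shell
# queued_job_ids: hypothetical filter that reads a `qstat -u user` listing on
# standard input and prints the ID of every job whose State column is "Que".
queued_job_ids() {
    # Header and separator rows are skipped because their first field is
    # not purely numeric.
    awk '$1 ~ /^[0-9]+$/ && $6 == "Que" { print $1 }'
}

# Possible use (not run here): cancel all of your still-queued jobs.
#   qdel $(qstat -u jqpublic | queued_job_ids)
```

Running jobs are left out deliberately, so the filter only selects jobs that have not yet started.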
qstat list for a few seconds until the process is finished.
add longjobs if you haven't already.]
Once a job has been submitted, you may use:
qalter to modify its qsub options
qtix for renewing, forwarding, or destroying its tickets
qmove to move it from one queue to another
Last modified: Mon Jul 21 12:16:51 EDT 2003