MIT Information Systems    Athena Longjobs -- Using Job Scripts

A job script essentially consists of the sequence of commands you would type at the Athena prompt to run your job by hand -- the differences are that the programs you run must all be non-interactive, and that you may want to add a few commands to set things like the working directory or output paths. More complete information is in the Script Details section below.

Basic template

  ######################################################################## 
  # set job's working directory, if you want it to be other than ~/ 
  # (this example uses the submit directory, as explained below)
  cd $PBS_O_WORKDIR

  # remove or clean up any output from a previous run
  rm -f results

  # add or attach any lockers you will need aside from your homedir
  add foo bar

  # run commands, invoke other scripts, etc.
  tweedledee
  tweedledum > results
  ######################################################################## 

Basic notes on output, lockers, and cwd:

See also the sections below for more Script Details and tips on Testing Scripts.

MATLAB

Preparing the M-file:

Calling MATLAB in the job script:

MSI/Cerius2


Script Details

A job script may consist of:

Defaults and Caveats

shell and #!
The script contents are given as standard input to the shell (tcsh unless you have changed your login shell). Note that a #! line in the script will be ignored; if you want to use a script with a different interpreter, it is best to keep it in a separate file and invoke that file from within the job script.

lockers and path
The system attaches your home directory before starting your job, so the usual /mit/username path is available; it does not attach or add other lockers unless you have such commands in your dotfiles (see below), or include them in your job script. If you have an appropriate binary directory set up under your homedir (i.e., following the Athena locker organization conventions described in the lockers man page), it will be added to the path automatically, assuming you are using the default Athena shell; see the dotfiles section below for more details.

If your job will write output files directly (rather than using stdout), make sure that you save these files before the end of the job to an attached locker with sufficient quota, and where you have write access. The service will attempt to rerun your job in certain server-failure cases; this may cause problems with files your job writes to directly (as opposed to the standard output and error streams, which the service will simply recreate). If it is not feasible to construct your job to handle the possible existence of data from a previous run, you should make the job non-rerunnable by submitting with the -rn option (in qsub, a script directive, or the QSUB environment variable).
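
One way to make a job tolerate a rerun is to write results under a temporary name and rename the file only when the command finishes, so that a partial file left by an interrupted run is never mistaken for finished output. The following is a minimal sketch under assumptions, not part of the Longjobs service: it is written in POSIX sh rather than the default tcsh, so it would live in a separate file invoked from the job script, and the function name run_once and the ".partial" suffix are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not provided by Longjobs): run a command so that
# its output file is either complete or absent, even after an
# interrupted earlier run.
# Usage: run_once OUTPUT_FILE COMMAND [ARGS...]
run_once() {
    out=$1
    shift
    # A finished result from an earlier run already exists: keep it.
    if [ -f "$out" ]; then
        return 0
    fi
    # Discard any partial file left behind by an interrupted run.
    rm -f "$out.partial"
    # Write under a temporary name; rename only if the command succeeds.
    if "$@" > "$out.partial"; then
        mv "$out.partial" "$out"
    else
        return 1
    fi
}
```

For example, run_once results tweedledum would run tweedledum only if the file results does not already exist. If you would rather always regenerate the output, remove the old file first, as the basic template does with rm -f results.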

Note that you may use the local disk on the execution machine for intermediate processing (e.g., you may want to have your job generate large data files in /tmp, then compress them before saving the final output in a locker).
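
The local-disk approach described above might be sketched as follows. This is an illustration under assumptions rather than a recommended implementation: it is written in POSIX sh (so under the default tcsh it would be kept in a separate file and invoked from the job script), it assumes that mktemp and gzip are available on the execution machine, and the function name stage_and_save is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not provided by Longjobs): generate a file on
# local disk, compress it there, and save only the compressed copy.
# Usage: stage_and_save DEST_DIR NAME COMMAND [ARGS...]
stage_and_save() {
    dest=$1
    name=$2
    shift 2
    # Private scratch directory on the local disk (/tmp unless TMPDIR is set).
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/longjob.XXXXXX") || return 1
    # Generate the uncompressed data locally, then compress it in place.
    if "$@" > "$scratch/$name" && gzip "$scratch/$name"; then
        # Copy only the compressed result to the destination directory.
        cp "$scratch/$name.gz" "$dest/$name.gz"
        status=$?
    else
        status=1
    fi
    # Always clean up the scratch space before returning.
    rm -rf "$scratch"
    return $status
}
```

A call such as stage_and_save /mit/username my_data ./my_program would leave my_data.gz in your home directory; only the compressed file counts against your quota.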

working directory
By default, jobs start in the user's home directory. Since it is common to want the job's working directory to be the same as the cwd from which you submit the job, the system sets an environment variable PBS_O_WORKDIR to this value. For example, if you run the qsub command from the directory ~/my_jobs/29.123 and your script contains the lines:
     cd $PBS_O_WORKDIR
     ./my_program > my_data
it will look for an executable ~/my_jobs/29.123/my_program (rather than the default ~/my_program) and will create the output file ~/my_jobs/29.123/my_data (rather than the default ~/my_data).

standard output and error
By default, the system will create one file each for the job's standard output and error streams; at the end of the job it attempts to save these back to the directory from which the job was submitted (if it fails, it will instead email them to you). The filenames are constructed from the job name and ID number generated by the system, for uniqueness. For example, if the script foo is submitted from ~/my_jobs, it generates respective error and output files:
     ~/my_jobs/foo.e483
     ~/my_jobs/foo.o483
     

dotfiles
The system runs your shell as a non-login shell, which means that:

assuming you are using the Athena-default ~/.cshrc, which uses the system-wide startup files (for more information, see the Athena Dotfiles publication). If you are using a different shell (in particular bash), you may need to source its associated dotfiles in your script.

Note that the job may fail if you have included commands which attempt to set terminal characteristics in one of the sourced dotfiles. Any such command should be skipped by adding a test for the environment variable PBS_ENVIRONMENT, for example:

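     # settings that are safe to make in a job
     setenv PRINTER kiwi
     setenv LPROPT "-h -z"
     # skip terminal setup when PBS_ENVIRONMENT is set
     if ( ! $?PBS_ENVIRONMENT ) then
        # terminal-related commands go here
     endif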
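
If your dotfiles are written in sh or bash syntax, the same guard could be sketched as shown here. The function name in_pbs_job is hypothetical; the only thing taken from the text above is the test for whether PBS_ENVIRONMENT is set.

```shell
#!/bin/sh
# Hypothetical helper: succeed if PBS_ENVIRONMENT is set (to any value,
# even an empty one), which is the same test the csh example makes
# with $?PBS_ENVIRONMENT.
in_pbs_job() {
    [ -n "${PBS_ENVIRONMENT+set}" ]
}

# Settings that are safe to make in a job.
PRINTER=kiwi
export PRINTER

if ! in_pbs_job; then
    # Terminal-related commands would go here.
    :
fi
```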

job environment variables
The system sets up several environment variables for the job which may be useful in scripts, including:

For more details, see the qsub man page, DESCRIPTION section.


Script Directives

The job script may begin with directives to specify qsub options using this syntax:

    #PBS -flag option

For example:

    #PBS -a 29.123
    #PBS -l walltime=10:00:00

will specify 29.123 as the account and set a time limit of 10 hours. (For details on qsub options, see the Running Jobs page.)

Notes:


Testing Your Scripts

If you are new to the system or are running a job with a different application than usual, it's a good idea to test a short version of your script to make sure you haven't overlooked something that might keep it from running as expected. The testjob program allows you to do this without actually submitting it to the system (it simulates the longjobs environment on your local workstation so you can test your script without having to wait in a queue or use up quota).

To test your job script:

  1. Prepare a version of your script to run a short job
  2. Run the short job (either using the testjob utility, or on the actual system)

Preparing the script to run a short job

There are two ways you can do this:

Running the short job with testjob

Running the short job on the actual system

If the system is not busy, you may prefer to test your script by actually submitting a short version of it. Note that if you submit with the longjob -submit command, you will be prompted for a time limit. If you submit with qsub you can specify a short limit with the option:

  -l walltime=hh:mm:ss
  
For example, for a 10-minute limit use the following:
  athena% qsub -l walltime=00:10:00 ...
  
(The system will interpret a single value as seconds and xx:yy as minutes:seconds; to avoid confusion it's best to specify all three time fields.)
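
The interpretation rule in the note above can be illustrated with a small sketch. This is not the system's own parser: walltime_to_seconds is a hypothetical helper, written in POSIX sh with awk, that simply mirrors the rule as stated here.

```shell
#!/bin/sh
# Hypothetical helper: convert a walltime value to seconds using the
# rule described above: "ss", "mm:ss", or "hh:mm:ss".
walltime_to_seconds() {
    printf '%s\n' "$1" | awk -F: '
        NF == 1 { print $1 + 0; next }                      # seconds only
        NF == 2 { print $1 * 60 + $2; next }                # minutes:seconds
        NF == 3 { print $1 * 3600 + $2 * 60 + $3; next }    # hours:minutes:seconds
        { exit 1 }                                          # anything else is rejected
    '
}
```

Under this rule, walltime_to_seconds 600, walltime_to_seconds 10:00, and walltime_to_seconds 00:10:00 all print 600, while 10:00:00 prints 36000. A two-field value is therefore easy to misread as hours and minutes, which is why writing all three fields is safer.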


Last modified: Thu, Jan 2 2003