Proposed Athena 'longjobs' service description

Contact information:
Public discussion: longjobs@mit.edu
For notes to the longjobs team: longjobs-dev@mit.edu

Introduction
This document describes the current thinking for a system that would address the "longjobs" problem in the Athena environment. The goal is to ensure that we understand user needs and, if feasible, devise a solution that meets those needs to the fullest practical extent. In this document, we present a summary of the problem, an overview of the proposed service, the design issues and potential problems we see, and examples of how the service might be used.

We welcome and encourage any comments and other responses, whether approving, critical, or otherwise.

Problem Summary
The major problem we are addressing is the requirement that users remain present at the console of an Athena workstation for the duration of a session; in public clusters, workstations cannot be left unattended for more than a few minutes. Users have no reliable way to run long procedures, even jobs that require no user interaction, without physically remaining at the workstation console. In "traditional" UNIX environments, users can either run non-interactive procedures in the "background", or use the "at" facility to submit a script, or "job", to be run by the system. Neither of these options is supported in the current Athena environment. We thus see a need for the ability to run non-interactive, unattended jobs within the Athena environment.

Service Overview
We envision a system consisting of a pool of dedicated, centrally-managed Athena server machines on which users can execute such non-interactive jobs remotely.

Submitting and running jobs
Users could submit a job from any Athena workstation, specifying the type of Athena machine they want the job to run on. The job would be directed immediately to a central "master" server, which would dispatch jobs to "slave" execution machines. An execution slave would only run one job at a time, so a job would remain queued at the master until a suitable slave machine became available. Authentication and authorization would be based on the user's Kerberos principal.

The intent would be to provide a system that is as "transparent", and as compatible with the Athena interactive computing experience, as possible; a user should easily be able to submit a job as a script of the same shell commands that would be entered during a normal interactive login session. The execution machines would essentially be identical to normal Athena workstations, with the same facilities available, except for those facilities that are suitable only for interactive use.

Users would specify the resource requirements for the job, e.g. type of machine and maximum elapsed time, at submit time. Submitted jobs would enter a queue; scheduling would be done using a first-in, first-out scheme, probably adjusted for fairness. There would most likely be multiple queues, based on resource requirements and perhaps priority. Users could query the server to obtain job status, and cancel jobs that were no longer wanted. Users would optionally be notified, via email and/or Zephyr, when a job begins and/or ends.

User accounts would be added, and home directories attached, on the execution machine only for the duration of the job, similar to an Athena login session. The user's normal login shell would be executed, with the submitted script as input. Standard output and error files would be written to the user's directory, or emailed to the user, per user option.

Jobs would be subject to a strict limit on elapsed time; any job exceeding that limit would be killed. Since a slave machine will never run more than one job at a time, the user is assured of having no contention for CPU cycles or other resources while the job is running, so that the time needed to complete the job will be as predictable as possible. At the end of each job, the slave will perform a cleanup and check procedure, to ensure system integrity.

There will most likely be a strict limit on the number of jobs a user can have queued and/or running at one time. We may also program the job scheduler to further prevent one user from unfairly monopolizing the service.
Please see the end of this document for examples
of how such a system might be used.
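
The dispatch behavior described above (a first-in, first-out queue held at the master, at most one job per slave, and a limit on jobs per user) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Master`, `submit`, `slave_free`, and `job_done`, the per-user limit of 2, and the matching of jobs to slaves by machine type alone are hypothetical and do not describe an actual design.

```python
from collections import deque


class Job:
    def __init__(self, job_id, user, machine_type):
        self.job_id = job_id
        self.user = user
        self.machine_type = machine_type  # e.g. "sun4"


class Master:
    """Hypothetical master server: one FIFO queue, one job per slave."""

    def __init__(self, max_jobs_per_user=2):
        # The limit of 2 is an assumed value, for illustration only.
        self.max_jobs_per_user = max_jobs_per_user
        self.queue = deque()   # jobs waiting for a suitable slave
        self.running = {}      # slave name -> Job currently running there
        self.next_id = 1

    def _jobs_for(self, user):
        """Count the user's jobs that are queued or running."""
        queued = sum(1 for j in self.queue if j.user == user)
        active = sum(1 for j in self.running.values() if j.user == user)
        return queued + active

    def submit(self, user, machine_type):
        """Enqueue a job, or refuse it if the user is at the limit."""
        if self._jobs_for(user) >= self.max_jobs_per_user:
            raise RuntimeError(f"{user} has too many jobs queued or running")
        job = Job(self.next_id, user, machine_type)
        self.next_id += 1
        self.queue.append(job)
        return job

    def slave_free(self, slave, machine_type):
        """Give an idle slave the oldest queued job it can run, if any."""
        if slave in self.running:
            raise RuntimeError(f"{slave} is already running a job")
        for job in self.queue:
            if job.machine_type == machine_type:
                self.queue.remove(job)
                self.running[slave] = job
                return job
        return None  # nothing suitable; the slave stays idle

    def job_done(self, slave):
        """Record that the slave finished (or killed) its job."""
        return self.running.pop(slave)
```

A fairness adjustment of the kind mentioned above would replace the simple oldest-first scan in `slave_free`; the sketch deliberately leaves that policy out.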

Job credentials
In order for the job to run with Kerberos credentials, as is required for most authentication and authorization purposes in the Athena environment, the user would optionally be able to acquire a long-lived, renewable Kerberos 5 ticket-granting ticket, which would be forwarded as part of the job. The master and slave servers would manage these tickets, renewing them as needed. In addition, the execution server would use this ticket to acquire Kerberos 4 tickets, and AFS tokens, the latter being most critical for users to be able to access their home and other directories from within the job context. Users who did not wish to forward a long-lived ticket could optionally choose to forward an existing short-lived ticket, or to run the job without any tickets or tokens. A renewable ticket could be maintained up to the maximum life permitted by our Kerberos configuration (currently one week).

Accounting
Experience with prototype Athena batch facilities, and with other services, leads us to be concerned that a completely free and open longjobs service will be oversubscribed to the extent that those with real needs will be frozen out and frustrated. For this reason, we are investigating an accounting and billing component. Billing has two advantages: it will discourage abuse, and it will provide a rate base to fund expansion of a genuinely popular service. We will examine the following services as possible cost models: Athena cluster; Athena dialup; Athena printing; Electronic Classroom; Tether.

Charges for using the service may manifest (for example) as debits against a predetermined time allocation (e.g. for course work), against a pre-paid sum, or as charges on a monthly bill. Charges may be based on a per-hour charge, a flat rate with usage quota, rates varying by time, priority, etc., or some combination thereof. There may also be a charge for account activation. Users may have multiple accounts available to them. Users should have a way to query the system for their current accounting information.

Administration and support
The master and slave machines would be managed by Athena Server Operations. Mechanisms would be implemented for them to regulate the service as needed, e.g. limiting queue size, locking out misbehaving users, etc. The Academic Computing Support Team will provide user support for the service, and be able to control certain access to the queues, so that queues could potentially be reserved for special needs.

Design Issues and Problems
The following issues and potential problems are inherent in the approach outlined above:

Examples
The following are general examples of how a user might create and
submit a job for the service. (The longjobs command names and user
interfaces shown below are merely illustrative; they are not
actual commands.)

Creating a job script
To create a script for a job, the user would enter the appropriate shell commands into a script file named, for example, "my_script"; this file might look like:

    cd <my_working_directory>
    ./my_program > /tmp/raw_output
    ./crunch_data /tmp/raw_output > ./my_data

In this example, the script changes to the user's working directory, runs a program which writes raw output to a temp file, and then runs another program to process that output, with the resulting data written to the user directory. The last step would be necessary if, for example, the initial output was so large that it would exceed the user directory's quota.

Submitting a job
The user would then submit the script to the longjobs service, e.g.:

    athena% lj_submit --queue sun4-12hour --zephyr begin,end my_script
    Password:
    Job XXXX submitted.

[There may be a more user-friendly submit program which prompts for the submission parameters.]

In the example, the queue name, sun4-12hour, would be a pre-defined system entity, providing a way for the user to specify the resources required for the job, in this case, a sun4 machine, with a time limit of 12 hours. The --zephyr option specifies that the user should be sent a Zephyr message when the job begins and ends execution. (There would be a similar option to send mail.) The submit program would then prompt for the user's Kerberos password; this would be necessary to create a renewable ticket for the job. Additional options would exist for the user to forward an existing ticket, or to run the job without any ticket.
Finally, when the job has been enqueued successfully, the submit program
will output a unique job ID, which can be used to identify the job in
subsequent status or delete commands.
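
To make concrete how a pre-defined queue could stand for a set of resource requirements, here is a minimal sketch of a queue table and an elapsed-time check. It rests on assumptions: the names `QueueSpec`, `QUEUES`, `lookup_queue`, and `over_limit` are hypothetical, and the table lists only the queue names that appear in this document's examples.

```python
from datetime import timedelta
from typing import NamedTuple


class QueueSpec(NamedTuple):
    machine_type: str      # type of Athena machine the job runs on
    time_limit: timedelta  # maximum elapsed time before the job is killed


# Hypothetical table of pre-defined queues, keyed by queue name.
QUEUES = {
    "sun4-12hour": QueueSpec("sun4", timedelta(hours=12)),
    "sun4-2hour": QueueSpec("sun4", timedelta(hours=2)),
}


def lookup_queue(name: str) -> QueueSpec:
    """Return the resources a queue stands for, or reject unknown names."""
    try:
        return QUEUES[name]
    except KeyError:
        raise ValueError(f"no such queue: {name!r}") from None


def over_limit(queue_name: str, elapsed: timedelta) -> bool:
    """True if a job in this queue has exceeded its elapsed-time limit."""
    return elapsed > lookup_queue(queue_name).time_limit
```

With such a table, a submit program could reject an unknown queue name before the job is ever enqueued, and an execution machine could use the same entry to decide when a running job must be killed.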

Getting job status
The user will be able to list the status of the queue(s), for example:

    athena% lj_status
    Job ID  Username  Queue        Jobname    Limit  Status   Elapsed
    ------  --------  -----        -------    -----  ------   -------
    XXXX    jdoe      sun4-12hour  my_script  12:00  Running  01:37
    YYYY    jqpublic  sun4-2hour   foo         2:00  Queued   -

In this way, users will be able to monitor their jobs' progress, and get an indication of how busy the system is.

Removing a job
Finally, if the user decides to cancel the job:

    athena% lj_cancel XXXX

where XXXX denotes the job ID displayed when the job was submitted.