.nr % 1
.OH ''PBS ERS'Introduction'
.EH 'Introduction'PBS ERS''
.P1
.so ers_setup.ms
.Rv $Revision: 2.2 $
.nr Fi 0 1
.nr H1 0
.\"         Portable Batch System (PBS) Software License
.\" 
.\" Copyright (c) 1999, MRJ Technology Solutions.
.\" All rights reserved.
.\" 
.\" Acknowledgment: The Portable Batch System Software was originally developed
.\" as a joint project between the Numerical Aerospace Simulation (NAS) Systems
.\" Division of NASA Ames Research Center and the National Energy Research
.\" Supercomputer Center (NERSC) of Lawrence Livermore National Laboratory.
.\" 
.\" Redistribution of the Portable Batch System Software and use in source
.\" and binary forms, with or without modification, are permitted provided
.\" that the following conditions are met:
.\" 
.\" - Redistributions of source code must retain the above copyright and
.\"   acknowledgment notices, this list of conditions and the following
.\"   disclaimer.
.\" 
.\" - Redistributions in binary form must reproduce the above copyright and 
.\"   acknowledgment notices, this list of conditions and the following
.\"   disclaimer in the documentation and/or other materials provided with the
.\"   distribution.
.\" 
.\" - All advertising materials mentioning features or use of this software must
.\"   display the following acknowledgment:
.\" 
.\"   This product includes software developed by NASA Ames Research Center,
.\"   Lawrence Livermore National Laboratory, and MRJ Technology Solutions.
.\" 
.\"         DISCLAIMER OF WARRANTY
.\" 
.\" THIS SOFTWARE IS PROVIDED BY MRJ TECHNOLOGY SOLUTIONS ("MRJ") "AS IS" 
.\" WITHOUT WARRANTY OF ANY KIND, AND ANY EXPRESS OR IMPLIED WARRANTIES, 
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, 
.\" FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT ARE EXPRESSLY
.\" DISCLAIMED.
.\"
.\" IN NO EVENT, UNLESS REQUIRED BY APPLICABLE LAW, SHALL MRJ, NASA, NOR
.\" THE U.S. GOVERNMENT BE LIABLE FOR ANY DIRECT DAMAGES WHATSOEVER,
.\" NOR ANY INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\" 
.\" This license will be governed by the laws of the Commonwealth of Virginia,
.\" without reference to its choice of law rules.
.NH 1
.Tc \s+2\f3Introduction\fP\s-2
.LP
.OF 'Chapt \*(rV''\n(H1-%'
.EF '\n(H1-%''Chapt \*(rV'
.NH 2
.Tc \f3Purpose\fP
.LP
This document is the 
.nh
.UL Portable
.UL Batch
.UL System
.UL External
.UL Reference
.UL Specification.
.hy
It describes the overall design of the Portable Batch System,
the 
.B PBS ,
and details its external behaviors and interfaces.
User, Operator, and Administrator commands are described.
The interface library, which is used by the commands and may also be used
to extend the functionality of
.B PBS ,
is described, as is the application level data exchange protocol (network
protocol).
.QP
NOTICE
.QP
This system is currently in
development, and the interfaces and functionality described are subject to
change during the course of development.
.LP
Suggested reading of sections of the ERS is as follows:
.IP "General User"
Users who just wish to make use of the common features of PBS to submit,
monitor and control jobs are advised to read ERS chapters 1, 2.2, 2.7, 3.2
through 3.4, and 5. 
.IP Programmers
Programmers wishing to add an interface to PBS to their code should read
the sections listed for the general user and chapter 4.
.IP Operators
Batch system operators should read the sections listed under general user
above plus chapters 6 and 7.
.IP Administrators
Batch system administrators or managers are advised to read the entire ERS
with the possible exception of chapters 4 and 11.
.LP
The requirements for 
.B PBS
are given in the document
.nh
.UL Portable
.UL Batch
.UL System
.UL Requirement
.UL Specifications .
.hy
The internal design of each component of
.B PBS
is specified in the document
.nh
.UL Portable
.UL Batch
.UL System
.UL Internal
.UL Design
.UL Specification .
.hy
.NH 2
.Tc \f3Glossary\fP
.so glossary.ms
.LP
.NH 2
.Tc \f3Document Conventions\fP
.LP
The following font conventions are used throughout this document.
.LP
.I
New terms are introduced in italicized text.
.LP
.B
Names of commands, library functions, and signals are shown in bold, serifed text.
.LP
.Er "Error values"
.ft 6
are shown in a sans-serif typeface and 
.Er "inside brackets" .
.ft 1
.LP
.Ar
Option arguments and operands are shown in italics.
.LP
.At
Attribute or data item names associated with jobs, queues, or the server
are shown in a bold, sans-serif typeface.
.ps +1
.LP
.Av
Nonspecific values of attribute or data items are shown in a sans-serif typeface.
.ps +1
.LP
.Sc "Symbolic Constant"
.ft 6
values, typically found in header files, are shown in a sans-serif typeface and 
.Sc "inside braces" .
.ft 1
.LP
.Ty
Examples of formats and type-ins are in the fixed width typewriter font.
.ft 1
.LP
Unresolved issues or areas which may require modification as the
implementation progresses are called out by a note in a quoted paragraph,
indented on the left and right, headed with the phrase \*QAuthor Note:\*U.
As these issues become resolved, this document will be updated and
those sections removed.
.NH 2
.Tc "\f3Overview of The Portable Batch System\fP"
.LP
In the past, Unix systems were used in a totally interactive manner.
Background jobs were just processes with their input disconnected from the
terminal.  However, as Unix moved onto larger and larger processors,
the need to be able to schedule tasks based on available resources increased
in importance.  The advent of networked compute servers, smaller general
systems, and workstations led to the requirement of a networked batch
capability.
.LP
The purpose of the
.B PBS
system is to provide additional
controls over initiating or scheduling execution of batch jobs,
and to allow routing of those jobs between different hosts.
The batch system allows a site to define and implement policy as to what
types of resources and how much of each resource can be used by different jobs. 
The batch system also provides a mechanism with which a user can ensure
that a job will have access to the resources it requires to complete.
.LP
The batch system is made up of a number of components: the server and
clients such as user commands.
A server component manages a number of different objects, such as queues
or jobs.  Each object consists of a number of data items or attributes.
The attributes are characterized as 
.I public
attributes,
.I "read-only public"
attributes,
.I private
attributes, and
.I internal
attributes.
.LP
Public attributes have values which are supplied by or can be changed by
client requests.  The behavior of the object changes when the value of
an attribute is changed.  The values of public attributes are available
upon request to clients.
Read-only attributes are public attributes whose values are available
as status to clients, but the clients cannot change the values.
Throughout this document, public and read-only attributes will be
commonly referred to simply as \*Qattributes\*U.
.I Private
attributes are those data items which are permanent.  They are passed as
part of the object when the object changes ownership, for example when
a job moves between servers.  Private items are generally not made available
to client programs.
.I Internal
data items are not visible, are not passed with the object between servers,
and depend on the server implementation.
Public and private attributes will be described in this ERS.  Except in
special cases, internal data items will not be described.
.LP
Typical interaction between the components is based upon the
client-server model, with clients making (batch) requests to servers and
the servers performing work on behalf of the clients.
Clients do not create or modify objects directly, but depend upon the
server which manages those objects.
.LP
A batch server is a persistent process or set of processes, such as a daemon.
The batch server manages batch objects such as queues and jobs.  It provides
batch services like creating, routing, executing, modifying, or deleting jobs
for batch clients.
A batch server may at times request services of other servers.  During that
time, the server is acting in the role of a client.
.LP
User, operator, and administrator commands are batch clients.  They allow
users of the batch system to request batch services via the command line.
While the commands may appear to accomplish certain services, they
actually request and obtain the services from a batch server by means of
a batch request.
.LP
The Interface Library provided as part of
.B PBS
supplies the interface to a server for the supplied commands and
allows the development of other application clients.
.NH 2
.Tc "\f3PBS Features\fP"
.LP
This section describes some of the special features of PBS.
.NH 3
.Tc "Interactive Batch Jobs"
.LP
With a normal batch job, the input to the job is the script supplied via
the qsub command and the output and error streams are spooled to disk
files and delivered after the job completes.
PBS provides support for scheduling and running jobs that require interactive
user access to the input and output of the job during run time.  
This access is often required to run debuggers or other programs requiring
feedback as part of a job that must have scheduled access to scarce hardware.
.LP
A PBS interactive job is submitted with the qsub command.
If the
.Ty -I
option is specified on the qsub command line or in a #PBS directive in
the script file [or if 
.Ty "-W\ interactive=true"
is specified on the command line or in a directive] the job is an interactive
job.  A job may determine that it is an interactive batch job by the value
of the environment variable 
.B PBS_ENVIRONMENT .
Instead of the normal value of
.B PBS_BATCH ,
it will have a value of
.B PBS_INTERACTIVE .
[Sessions not created by PBS do not have the PBS_ENVIRONMENT variable set.]
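.LP
As a minimal sketch of how a job or login script might act on this variable,
the following shell fragment classifies the current session.
The function name pbs_session_kind and the strings it prints are hypothetical
and chosen only for illustration; only the variable PBS_ENVIRONMENT and its
two values come from the description above.
.DS
.Ty
# Hypothetical helper: report how the current session was started.
pbs_session_kind() {
    case "${PBS_ENVIRONMENT:-}" in
        PBS_INTERACTIVE) echo interactive ;;   # interactive batch job
        PBS_BATCH)       echo batch ;;         # normal batch job
        *)               echo not-pbs ;;       # session not created by PBS
    esac
}

echo "session type: $(pbs_session_kind)"
.ft 1
.DE
.LP
A script might use such a check to skip terminal setup, for example, when it
is not running as an interactive job.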
.LP
After submitting the job to the batch system, qsub remains active
waiting for the job to connect over the network.  When the job starts, data
written to standard output and error is sent to qsub which displays it on
its terminal (qsub's standard output).   The qsub command reads its standard
input and passes the data to the job as the job's standard input.  The job
is connected via a pseudo tty, so job control signals and special characters
are processed by the job, not the local qsub session.  Since the job's
standard input is from the terminal, via qsub, the script is not executed
as a normal script would be.  However, any #PBS directives in the script are
processed by qsub and sent with the job.  Therefore, job attributes and
resource requirements may be specified in the script as with a normal batch
job.
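.LP
For illustration, a script submitted as an interactive job might carry
directives such as those below.
The job name and the resource value are assumed examples, not requirements.
As described above, only the #PBS directives are acted upon; the body of
the script is not executed.
.DS
.Ty
#!/bin/sh
#PBS -I
#PBS -N debug_session
#PBS -l walltime=01:00:00
# Commands placed here are not run for an interactive job;
# the user enters them at the terminal once the job starts.
.ft 1
.DE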
.LP
While qsub is waiting for the job to start, it will recognize the interrupt
signal typically generated by entering CNTL-C on the keyboard.   Qsub will
ask if the user wishes to exit.  If the user responds with 
.Ty yes
(or any string starting with a 'y') qsub will send a request to the server
asking that the batch job be aborted.  Qsub will also periodically check on
the batch job.  If qsub finds that the job has been deleted, qsub will 
inform the user and exit.
.LP
When the job starts execution, qsub will inform the user and begin to relay
the input, output and error streams.   Control keys are passed to the job.
Thus a CNTL-Z will suspend the job, not the qsub command, and CNTL-C will
cause an interrupt signal to the job, not to qsub.
During this time, qsub will recognize three escape commands if the line begins with:
.IP ~.
Qsub will exit.  This will end the job.
.IP ~^Z
(CNTL-Z) Qsub will suspend itself; the job remains running.  Neither input
nor output is transferred.  The user may issue commands to the local shell.
.IP ~^Y
(CNTL-Y) Qsub will suspend the part of itself that sends input to the job.
This allows the user to issue commands to the local shell while still
receiving the output of the job.
.LP
Note that the last two escape commands do not work if the local shell does not
support job control, as is the case with the Bourne shell, sh.
.LP
When the job terminates, qsub will exit returning control to the local shell.
.NH 3
.Tc "Job Prologue and Epilogue Scripts"
.LP
PBS provides for the execution of two administrator-supplied scripts with
each job.   The prologue script is run immediately before a job is executed,
and the epilogue script is run immediately after.  Both scripts are run with
root privilege and may be used to place a \*Qbanner\*U on the job's output
or to establish part of the environment for the job, such as creating temporary
directories or cleaning up after the job.
.NH 3
.Tc "File Stage In and Stage Out"
.LP
PBS provides for files to be staged, or moved, before and after a job is run.
The user specifies a \*Qremote\*U location and name of the file and the 
\*Qlocal\*U name when submitting the job.  The remote name includes a host name
which is typically a remote host, but may be the local execution host.
If the file is local to the execution host, /bin/cp is used to copy the file;
if the file is remote, rcp is used.
.LP
Staging out of files occurs as part of the post job processing.  The job
is shown to be in the exiting, 'E', state during the staging out.  Once the files
are staged out to their destination, they are deleted from the execution host.
If the user wishes to retain the files on the execution host, s/he should link
the file to a second file name using the 
.I ln
(link) command.  
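.LP
The following job script is a sketch of staging a file in and out while
retaining a copy of the output on the execution host.
It assumes that the staging requests are given through the qsub -W option
in the form local_file@hostname:remote_file; the qsub(1B) manual page is
the authority on the exact syntax.
The host name fileserver, the paths, and the program name my_program are
hypothetical.
.DS
.Ty
#!/bin/sh
#PBS -W stagein=input.dat@fileserver:/archive/run1/input.dat
#PBS -W stageout=results.dat@fileserver:/archive/run1/results.dat

my_program < input.dat > results.dat

# Keep the output on the execution host: the second link remains
# after results.dat is deleted at the end of stage out.
ln results.dat results.keep
.ft 1
.DE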
.LP
Staging in files is a bit more complex.  A decision must be made about when
to begin to stage in files for a job. The files must be available before the
job executes.  The amount of time that will be required to copy the files is
unknown to PBS, that being a function of file size and network speed.
If file in-staging is not started until the job has been selected to run
when the other required resources are available, either those resources are
\*Qwasted\*U while the stage in occurs, or another job is started which takes
the resources away from the first job, and might prevent it from running.
If the files are staged in well before the job is otherwise ready to run,
the files may take up valuable disk space needed by running jobs.
.LP
PBS provides two ways that file in-staging can be initiated for a job.
If a run request is received for a job with a requirement for staging-in files,
the staging in operation is begun and when completed, the job is run.
Alternatively, a specific stage-in request may be received for a job (see pbs_stagein(3B)),
in which case the files are staged in but the job is not run.  When the job
is run, it begins execution immediately because the files are already there.
.LP
In either case, if the files cannot be staged in for any reason, the job
is placed into a wait state with a \*Qexecute at\*U time
.Sc PBS_STAGEFAIL_WAIT ,
30 minutes, in the future.   A mail message is sent to the job owner requesting
that s/he look into the problem.  The job is placed into the wait
state to prevent the scheduler from constantly retrying a job that
would likely keep failing.
.LP
Figure \n(H1\-\n+(Fi shows the (sub)state changes for a job involving
file in-staging.  The scheduler may note the substate of the job and
choose to perform pre-staging via the pbs_stagein() call.
The scheduler developer should carefully choose a stage-in approach based
on factors such as the likely source of the files, network speed, and
disk capacity.
.DS
.so stagein.pic
.sp
.ce
Figure \n(H1\-\n(Fi: Job Substate Changes During File Stage In
.DE
.NH 2
.Tc Acknowledgements
.LP
Much assistance to the PBS project was given in the early days in terms of
manpower and ideas by both 
.I "Lawrence Livermore National Laboratory"
and by the
.I "National Energy Research Supercomputer Center" .
Special thanks go to Bruce Kelly and Clark Streeter of NERSC, who directly
assisted in the development of PBS.  Additional help
was provided by Kent Crispin and Terry Heidelberg of LLNL.
.LP
The supplied code for the mom_rcp utility was taken from the
.I bsd4.4-Lite
distribution.  This code is copyrighted by 
.I "The Regents of the University of California" .
The complete copyright notice and right to modify and
redistribute the software is contained in the source code.
.LP
The Red Hat Linux port was done by the
.I "Pittsburgh Supercomputing Center"
under funding by the
.I "National Institute of Standards and Technology" ,
NIST.  Special thanks go to John Kochmar and Rob Pennington of PSC for
doing the port.
.LP
The ports of PBS to Digital Equipment Corporation Unix on the Alpha workstation
and to HP-UX
were provided by Dirk Grunwald at
.I "University of Colorado, Boulder" .
.LP
No list of acknowledgements for PBS would be complete without 
special recognition of the first two beta test sites and the brave individuals
who were willing to try PBS.   Thomas Milliman of the
.I "Space Sciences Center"
of the
.I "University of New Hampshire"
was the first beta tester.  Wendy Lin of
.I "Purdue University" 
was the second tester and holds the honor of submitting more problem reports
than anyone else outside of NASA.   Without you two, the project would not be
so successful.
.\" force the Next Chapter to start on an odd page
.bp
.if e \{
.sp 10
.DS C
[Page intentionally left blank.]
.DE
.bp
\}
