.nr % 1
.OH ''PBS ERS'General Specification'
.EH 'General Specification'PBS ERS''
.P1
.so ers_setup.ms
.Rv $Revision: 2.1 $
.nr H1 1
.nr Fi 0 1
.NH 1
.Tc \f3\s+2General Specifications\s-2\fP
.LP
.OF 'Chapt \*(rV''\n(H1-%'
.EF '\n(H1-%''Chapt \*(rV'
.\"         Portable Batch System (PBS) Software License
.\" 
.\" Copyright (c) 1999, MRJ Technology Solutions.
.\" All rights reserved.
.\" 
.\" Acknowledgment: The Portable Batch System Software was originally developed
.\" as a joint project between the Numerical Aerospace Simulation (NAS) Systems
.\" Division of NASA Ames Research Center and the National Energy Research
.\" Supercomputer Center (NERSC) of Lawrence Livermore National Laboratory.
.\" 
.\" Redistribution of the Portable Batch System Software and use in source
.\" and binary forms, with or without modification, are permitted provided
.\" that the following conditions are met:
.\" 
.\" - Redistributions of source code must retain the above copyright and
.\"   acknowledgment notices, this list of conditions and the following
.\"   disclaimer.
.\" 
.\" - Redistributions in binary form must reproduce the above copyright and 
.\"   acknowledgment notices, this list of conditions and the following
.\"   disclaimer in the documentation and/or other materials provided with the
.\"   distribution.
.\" 
.\" - All advertising materials mentioning features or use of this software must
.\"   display the following acknowledgment:
.\" 
.\"   This product includes software developed by NASA Ames Research Center,
.\"   Lawrence Livermore National Laboratory, and MRJ Technology Solutions.
.\" 
.\"         DISCLAIMER OF WARRANTY
.\" 
.\" THIS SOFTWARE IS PROVIDED BY MRJ TECHNOLOGY SOLUTIONS ("MRJ") "AS IS" 
.\" WITHOUT WARRANTY OF ANY KIND, AND ANY EXPRESS OR IMPLIED WARRANTIES, 
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, 
.\" FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT ARE EXPRESSLY
.\" DISCLAIMED.
.\"
.\" IN NO EVENT, UNLESS REQUIRED BY APPLICABLE LAW, SHALL MRJ, NASA, NOR
.\" THE U.S. GOVERNMENT BE LIABLE FOR ANY DIRECT DAMAGES WHATSOEVER,
.\" NOR ANY INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\" 
.\" This license will be governed by the laws of the Commonwealth of Virginia,
.\" without reference to its choice of law rules.
A server is a persistent process, a daemon, which manages several classes
of objects and provides batch services.
Each class of object has a set of attributes (variables) which contain
information that is specific to that object.
In the following sections, the objects and the services are described.
.NH 2
.Tc \f3Attribute Types\fP
.LP
Each attribute associated with an object has a defined data type.  The following
is a list of the general data types currently supported.
.IP Boolean
used for true/false yes/no variables.
The true state may be input as
.Ty True ,
.Ty TRUE ,
.Ty y ,
.Ty Y ,
or 
.Ty 1 .
The false state may be input as
.Ty False ,
.Ty FALSE ,
.Ty n ,
.Ty N ,
or
.Ty 0 .
Unset boolean attributes are generally treated as if set to false.
.IP Integer
used for numeric variables.  The input data is a numeric string specifying
a value which will fit into a long integer on the host system.
Some attributes will place additional restrictions on the value range.
.IP Size
used for size of disk or memory related values.  The input data is in the form
of a numeric string with an optional suffix.  The suffix consists of an optional
scale factor character 
.Ty kKmMgGtT
and an optional byte or word indicator
.Ty bBwW .
.IP Character
used for types containing a single alphanumeric character.
.IP String\ 
used for types requiring a single null terminated character string.  Additional
format requirements may be placed on the string for a specific attribute.
.IP "Array of Strings"
used where the attribute value is a series of strings.  The value is input
as a set of comma separated strings.  The value at the human interface level
(command level) often requires quoting.
.IP List
used for those attributes requiring a linked list of data structures.  The
input data is typically of the same form as the \*Qarray of strings\*U above.
.IP Resources
is a special case of list for resource limit or resource usage items.
.IP "Access Control List"
is a special case of list for access control lists.
More information on the access control lists can be found in the ERS
chapter \*QSecurity\*U, section \*QAuthorization\*U.
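.LP
As an illustration only, the Boolean and Size input forms listed above could be
decoded along the following lines.
The function names, the use of 1024 as the base of each scale factor, the
8-byte word, and the choice of bytes when no indicator is given are all
assumptions made for this sketch; none of them is defined by this chapter.
.DS
```python
# Sketch only: helper names, the 1024 base and the word size are assumptions.
TRUE_FORMS = {"True", "TRUE", "y", "Y", "1"}
FALSE_FORMS = {"False", "FALSE", "n", "N", "0"}
SCALE = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}
WORD_BYTES = 8  # assumed; the real word size depends on the host


def parse_boolean(text):
    """Map one of the accepted input forms to a Python bool."""
    if text in TRUE_FORMS:
        return True
    if text in FALSE_FORMS:
        return False
    raise ValueError("not a recognized boolean form: " + repr(text))


def parse_size(text):
    """Return a size in bytes from a numeric string with optional suffix."""
    digits = text.rstrip("kKmMgGtTbBwW")
    suffix = text[len(digits):]
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("size must start with a numeric string: " + repr(text))
    value = int(digits)
    unit = 1  # assumed default when no byte or word indicator is given
    if suffix and suffix[-1] in "bBwW":
        if suffix[-1] in "wW":
            unit = WORD_BYTES
        suffix = suffix[:-1]
    if suffix:
        # Whatever remains must be exactly one scale factor character.
        if suffix.lower() not in SCALE:
            raise ValueError("invalid size suffix: " + repr(text))
        value *= SCALE[suffix.lower()]
    return value * unit
```
.DE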
.LP
.NH 2
.Tc \f3Batch Jobs\fP
.LP
A batch job is a primary object managed by a batch server.
From the user's viewpoint, a job is a file which is submitted via the
.B qsub
command.  Typically, this file is a shell script and is interpreted by
a command shell.  In fact, the file may contain any data and the user
can request any valid program to process the file as its standard input.
In addition to the shell script, a batch job consists of many attributes which
affect the processing of the job.  These are covered in the next section.
.LP
The server maintains an internal representation of the job and a copy of
that representation as a file on disk to ensure the job information is not
lost across server instantiations.
.LP
Jobs are created by the server upon receipt and successful processing of a
.I "Queue Job"
batch request.
Jobs are maintained by the server until (1) the job executes and terminates,
(2) the job is deleted by a
.I "Delete Job"
batch request, (3) the job is moved or routed to another server, or (4) the
server determines it is impossible to process the job and the job is 
.I aborted  .
.LP
When a job is created, it is named with a
.I job_identifier 
by the server which created the job.
The job_identifier is of the form
.DS
.Ty sequence_number.server_name
.DE
where the
.Ty sequence_number
is a unique number within the creating server
and
.Ty server_name
is the name of that server.
.LP
.so ../man1/pbs_job_attributes.7B
.NH 3
.Tc Job Private Attributes
.LP
The following data items are private attributes of the job.
These items are a permanent part of the job object and are passed with
the job between servers or between the server and the execution server,
but are not passed to user clients.
.RS .25i
.Al hopcount
The hop count is maintained by the server.  It is set to zero when the job
is created and incremented each time the job changes destination, queue
or server.
The hopcount attribute is used to prevent endless routing loops and to
ensure correct ordering of updates of the job's current location for the
.B "Job Locate"
batch request. [type: integer]
.Al security
Reserved for future implementation.  [type: string]
.RE
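.LP
The two uses of the hop count described above can be sketched as follows.
This is a minimal illustration, not the server's implementation: the class,
the function names, and the limit of 10 hops are hypothetical, and a location
update is modeled as a (hopcount, location) pair.
.DS
```python
MAX_HOPS = 10  # hypothetical limit; this chapter does not state one


class Job:
    def __init__(self, job_id, destination):
        self.job_id = job_id
        self.destination = destination
        self.hopcount = 0  # set to zero when the job is created


def move_job(job, new_destination):
    """Record a change of destination and refuse to loop forever."""
    if job.hopcount >= MAX_HOPS:
        raise RuntimeError("routing loop suspected for job " + job.job_id)
    job.hopcount += 1
    job.destination = new_destination


def newer_location(update_a, update_b):
    """Pick the location update that carries the higher hop count."""
    return update_a if update_a[0] >= update_b[0] else update_b
```
.DE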
.NH 3
.Tc Job Internal Data Items
.LP
The following data items are internal to the server representation of a
job.  They are specifically described here because of their importance.
.RS .25i
.Al destination
The 
.At destination_id
supplied on the 
.B qsub 
or 
.B qmove
commands.
.Al job_substate
The secondary job state field, see \*Qstate\*U under Job Read-Only Attributes.
As it is not visible to clients, the values are not defined in this document.  
.RE
.LP
.NH 3
.Tc "Interactive Batch Jobs"
.LP
PBS supports \*Qinteractive batch jobs\*U.  An interactive batch job
is a job submitted to PBS where the standard input, output, and error
streams of the job are connected to the terminal session in which qsub is
run.  The qsub command acts as a conduit for the communication between the
job and the terminal session.  
.NH 2
.Tc \f3Queues\fP
.LP
A batch queue is an object managed by a batch server.  A batch queue
consists of a collection of zero or more batch jobs, a set of queue
attributes, private attributes, and a set of internal data items.
Jobs are said to reside in the queue or be members of the queue.
In spite of the name, jobs residing in a queue need not be ordered
first in, first out.
.LP
Access to a queue is limited to the server which owns the queue.  All
clients gain information about a queue or jobs within a queue through
batch requests to the server.
.LP
Two main types of queues are defined: routing queues and execution queues.
The type of queue is determined by which subset of queue attributes
have been assigned to it.
.LP
When a job resides in a routing queue, it is a candidate for routing to
a new destination.  Each routing queue has a list of destinations to
which jobs may be routed.  The new destination may be a different queue
within the same server or a queue under a different server.
.LP
.\" A routing queue may be one of two sub-types, push or pull.
.\" A push routing queue actively routes jobs to the associated destination.
.\" It will repeatedly try to route a job until the job has been successfully
.\" routed to a new destination.  A pull routing queue is passive.  It will
.\" only route jobs to a new destination upon request for a job from that
.\" destination.  Jobs are said to be pulled from the queue.
.\" In one sense, a pull routing queue is a holding place until some execution
.\" server needs work.
.\" .LP
Jobs are removed from a routing queue when:
.IP - 2
The job has been successfully routed to another queue.
.IP - 2
The job has been deleted.
.IP - 2
The job has been moved to another queue.
.IP - 2
The job has been aborted by the server.
.LP
When a job resides in an execution queue, it is a candidate for execution.
A job in execution is still a member of the execution queue from which
it was selected for execution.
Jobs are removed from an execution queue when:
.IP - 2
The job has executed and terminated.
.IP - 2
The job has been deleted.
.IP - 2
The job has been moved to another queue.
.IP - 2
The job has been aborted.
.so ../man1/pbs_queue_attributes.7B
.LP
.NH 2
.Tc \f3Batch Server Attributes\fP
.LP
The following attributes apply to the server.
.so ../man1/pbs_server_attributes.7B
.LP
.NH 3
.Tc Server Sub-Objects
.LP
The following are lists of objects belonging to the server.
.RS
.Al queues\ 
List of the queues managed by this server.
.Al job
List of the jobs managed by this server.
.RE
.LP
.NH 2
.Tc \f3PBS  Files\fP
.LP
The PBS subsystem maintains its files under the directory
.Sc PBS_DIR
which is usually set to
.Av /usr/spool/PBS ,
see figure \n(H1\-\n+(Fi 
.nr Fj \n(Fi
for the layout of the directories used by PBS.
The Server daemon keeps its private files, which are accessible only by the
server, in the subdirectory
.Sc PBS_DIR
.Av /server_priv ,
see figure \n(H1\-\n+(Fi.
The Job Executor, MOM, has a similar subdirectory,
.Av mom_priv .
.DS
.so dir_struct.pic
.sp
.ce
\f3Figure \n(H1\-\n+(Fj: PBS Directory Structure\f1
.DE
.rr Fj
.DS
.so svr_dir.pic
.sp
.ce
\f3Figure \n(H1\-\n(Fi: PBS Server Files\f1
.DE
.LP
.NH 2
.Tc "\f3Job Selection / Scheduling\fP"
.LP
Job selection or job scheduling is the process of picking which eligible job is
placed into execution or initiated.  To be eligible, the job
must reside in an execution queue and be in the
.B queued
state.
.LP
To maximize flexibility in implementing site policy,
.B PBS
provides a separate program as the selection process.
This process operates on the principle of evaluating the eligible jobs
according to a program written either in a yacc/lex based procedural
language or in Tcl.
This process, the
.B "PBS Scheduler" ,
pbs_sched, communicates with the PBS Server via a socket based IPC,
using the standard PBS API.
This provides the capability of having the Scheduler and the Server reside
on different hosts.
.LP
The Scheduler communicates with another process called the
.B "Machine Oriented Miniserver" ,
pbs_mom, in its role as a resource manager
to obtain information about the loading of the host system.
Loading information may contain details of memory usage, cpu load average,
and more.
This information, taken as input to the scheduling program, can be used to
control job scheduling.
The relationship between PBS, the Scheduler, and the
Machine Oriented Miniserver can be seen in figures \n(H1\-\n+(Fi
.nr Fj \n(Fi
and \n(H1\-\n+(Fi.
.DS B
.so sched1.pic
.sp
\f3Figure \n(H1\-\n(Fj: Batch Scheduling on a Single Host\f1
.DE
.LP
The concept of the Scheduler and Machine Oriented Miniserver
can be extended to include
multiple hosts.  A single Scheduler would provide scheduling for one or more
PBS servers.  The Scheduler would talk with a Machine Oriented Miniserver
on each host for which it had scheduling responsibility.
.DS B
.so sched2.pic
.sp
\f3Figure \n(H1\-\n(Fi: Batch Scheduling on Multiple Hosts\f1
.DE
.LP
The procedural or Tcl script which drives the Scheduler is a plain text file.
The administrator may update the script in the file or direct the scheduler
to use a different file.
.LP
The script may be written to consider time and queue information,
so different procedures
could apply at different times and to different queues.  The script may
also state when the conditions are such that a job should be started, 
deleted, or suspended.
.\" .LP
.\" The formal procedural language for the yacc/lex based syntax is defined
.\" in appendix A.
.\" The following is an example of a simple procedure:
.\" .sp 1
.\" .nf
.\" .Ty
.\" #
.\" # A simple proceedure to run up to 3 jobs if the load average is less than 2.0
.\" # and the job cput is less than 60 seconds
.\" #
.\" rm 15003 host_foo
.\" global variable nrun;
.\" host resource loadave;
.\" job requirement cput jcput;
.\" 
.\" foreach host {
.\"     nrun = 0;
.\"     if (loadave < 2.0) {
.\" 	foreach job {
.\" 	    if (jcput < 60) {
.\" 		run;
.\" 		nrun = nrun + 1;
.\" 	    }
.\" 	    if (nrun > 2) exit;
.\" 	}
.\"     }
.\" }
.\" .ft 1
.\" .fi
.NH 2
.Tc \f3General Identifiers\fP
.LP
The following identifiers or names are referenced throughout this document.
Unless otherwise noted, their usage will conform to the definition and syntax
described in the following sub-sections and to the general rules described in
the next paragraph.
.LP
If allowed as part of the identifier, when entering the identifier string on
the command line or in a PBS job script directive, embedded single or double
quote marks must be escaped by enclosing the string in the other type of
quote mark.  Therefore, the string may not contain both types of quote marks.
If white space is allowed in the identifier string, the string must be quoted
when it is entered on the command line or in a PBS job directive.
.NH 3
.Tc Account String
.LP
An 
.I "Account String"
is a string of characters that some server implementations may use to provide
additional accounting or charge information.   The syntax is unspecified except
that it must be a single string.   When provided on the command line to a PBS 
utility or in a directive in a PBS job script, any embedded white space must
be escaped by enclosing the string in quotes.
.NH 3
.Tc Attribute Name
.LP
An 
.I "Attribute Name"
identifies an attribute or data item that is part of the information that makes
up a job, queue, or server.  The name must consist of alphanumeric characters
plus the underscore, '_', character.  It should start with an alphanumeric
character.  The length is not limited.  The names recognized by PBS are
listed in sections 2.2, 2.3, and 2.4.
.NH 3
.Tc Destination Identifiers
.LP
A destination identifier is a string used to specify a particular destination.
The identifier may be specified in one of three forms:
.DS I
.Ty queue@server_name
.Ty queue
.Ty @server_name
.DE
where:
.RS .25i
.IP \f5queue\fP\ 
is an ASCII character string of up to 15 characters.  Valid characters are
alphanumerics, the hyphen and the underscore.
The string must begin with a letter.
.Ty Queue
is the name of a queue at the batch server specified by
.Ty server_name .
That server will interpret the queue string.
If 
.Ty queue
is omitted, a null string is assumed.
.IP \f5server_name\fP
is a string identifying a server; see server_name, section 2.7.9.
If
.Ty server_name
is omitted, the default server is assumed.
.RE
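.LP
The three forms above can be separated with a short routine such as the one
below.
It is a sketch under stated assumptions: the function name and the convention
of returning None when the server is omitted are illustrative only, and the
server part is not validated here.
.DS
```python
import re

# Queue names: a letter, then letters, digits, hyphen or underscore, 15 at most.
QUEUE_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,14}")


def parse_destination(text):
    """Split a destination identifier into (queue, server_name)."""
    queue, sep, server = text.partition("@")
    if sep and not server:
        raise ValueError("server_name missing after the at sign")
    if queue and not QUEUE_RE.fullmatch(queue):
        raise ValueError("invalid queue name: " + repr(queue))
    # An empty queue is the null string; None stands for the default server.
    return queue, (server if sep else None)
```
.DE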
.NH 3
.Tc Default Server
.LP
When a server is not specified to a client, the client will send batch 
requests to the server identified as the
.I "default server" .
A client identifies the default server by (a) the setting of the environment
variable 
.B PBS_DEFAULT
which contains a server name, or by (b) the server name in the file
specified by the
.Av $(PBS_DEFAULT_FILE)
build parameter in the local.mk file.
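.LP
A client-side lookup of the default server might look like the following
sketch.
The function name is hypothetical, the path of the default file is passed in
by the caller because its location is a build-time choice, and checking the
environment variable before the file is an assumption about precedence.
.DS
```python
import os


def default_server(default_file, environ=None):
    """Return the default server name, or None if neither source names one."""
    env = os.environ if environ is None else environ
    name = env.get("PBS_DEFAULT", "").strip()
    if name:
        return name
    try:
        with open(default_file) as handle:
            # Assumed layout: the server name is the first line of the file.
            return handle.readline().strip() or None
    except OSError:
        return None
```
.DE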
.NH 3
.Tc Host Name
.LP
A
.I "Host Name"
is a string that identifies a host or system on the network.  The syntax of 
the string must follow the rules established by the network.  For IP, a host
name is of the form
.Ty "name.domain" ,
where domain is a hierarchical, dot-separated list of subdomains.  Therefore,
a host name cannot contain a dot, \*Q.\*U, other than as
a subdomain separator.
The name must not contain the commercial at sign, \*Q@\*U, as this
is often used to separate a file from the host in a remote file name.
Also, to prevent confusion with port numbers (see section 2.7.9), a host name
cannot contain a colon, \*Q:\*U.
The maximum length of a host name supported by PBS is defined by
.Sc PBS_MAXHOSTNAME ,
currently set to 64.
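.LP
The PBS-specific restrictions on a host name can be checked as in this sketch.
The function name is illustrative, and the check covers only the rules stated
above, not the full syntax imposed by the network.
.DS
```python
PBS_MAXHOSTNAME = 64  # value quoted in the text above


def is_valid_host_name(name):
    """Check the PBS-specific restrictions on a host name."""
    if not name or len(name) > PBS_MAXHOSTNAME:
        return False
    if "@" in name or ":" in name:
        return False
    # Dots may only separate non-empty subdomain labels.
    return all(label for label in name.split("."))
```
.DE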
.NH 3
.Tc Job Identifiers
.LP
When the term 
.I "job identifier"
is used, the identifier is specified as:
.DS
.Ty sequence_number[.server_name][@server]
.DE
The 
.Ty sequence_number
is the number supplied by the server when the job was submitted.
.LP
The 
.Ty server_name
component is the name of the server which created the job.
If it is missing, the name of the default server will be assumed.
.LP
.Ty @server
specifies the current location of the job.
See the definition of default server in section 2.7.4 and section 5.1.2,
\*QDirecting Requests to Correct Server\*U.
.LP
When the term
.I "fully qualified job identifier"
is used, the identifier is specified as:
.DS
.Ty sequence_number.server_name[@server]
.DE
The
.Ty @server
suffix is not required
if the job still resides at the original server which created the job. 
The
.B qsub
command will return a fully qualified job identifier.
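.LP
A job identifier in either form can be taken apart as sketched below.
The function name and the use of None for an omitted component are
assumptions.
The split is made at the first dot because a server name may itself contain
dots.
.DS
```python
def parse_job_identifier(text):
    """Split sequence_number[.server_name][@server] into its three parts."""
    left, sep, current = text.partition("@")
    if sep and not current:
        raise ValueError("server missing after the at sign")
    sequence, dot, origin = left.partition(".")
    if not (sequence.isascii() and sequence.isdigit()):
        raise ValueError("sequence_number must be numeric: " + repr(sequence))
    if dot and not origin:
        raise ValueError("server_name missing after the dot")
    # None marks a component that the caller must fill in with its default.
    return int(sequence), (origin or None), (current if sep else None)
```
.DE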
.NH 3
.Tc Job Name
.LP
A
.I "Job Name"
is a string assigned by the user to provide a meaningful label to identify
the job.  The job name is up to and including 15 characters in length and
may contain any printable characters other than white space.  It must start
with an alphanumeric character.  If the user does not assign a name, PBS will
assign a default name as described under the -N option of the
.I qsub (1)
command.
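.LP
The job name rules above translate into a short check such as the following.
The function name is hypothetical, and restricting \*Qprintable\*U to the
ASCII character set is an assumption of this sketch.
.DS
```python
import string

MAX_JOB_NAME = 15
# Printable ASCII characters other than white space (an ASCII-only assumption).
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation)


def is_valid_job_name(name):
    """Check length, character set and first character of a job name."""
    if not 1 <= len(name) <= MAX_JOB_NAME:
        return False
    if not all(ch in ALLOWED for ch in name):
        return False
    return name[0].isalnum()
```
.DE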
.NH 3
.Tc Resource Name
.LP
A 
.I "Resource Name"
identifies a job resource requirement and may also identify a resource usage
limit.  The name must consist of alphanumeric characters plus the underscore,
\*Q_\*U, character.  It should start with an alphanumeric character.  The length
is not limited.  Certain resource names are identified and reserved by POSIX
1003.2d and by PBS.  They are listed in section 3.4.3, \*QTypes of Resources\*U.
.NH 3
.Tc Server Name
.LP
A
.I "Server Name"
is an ASCII character string of the form:
.DS
.Ty basic_server_name[:port]
.DE
The string identifies a batch server.
Basic server names are identical to host names (see section 2.7.5).
The network routine 
.I gethostbyname
will be used to translate to a network address.  The network routine
.I getservbyname
will be used to determine the port number.
.LP
An alternate port number
may be specified by appending a colon, \*Q:\*U, and the port number to
the host name.  This provides the means of specifying an alternate (test)
server on a host.
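.LP
Separating the optional port from a server name can be sketched as follows.
The function name is illustrative.
When no port is given the sketch returns None and leaves the lookup of the
port number to the caller; no network call is made here.
.DS
```python
def parse_server_name(text):
    """Split basic_server_name[:port] into (host, port or None)."""
    host, sep, port = text.partition(":")
    if not host:
        raise ValueError("basic_server_name is required")
    if not sep:
        return host, None  # the caller then looks up the port number
    if not (port.isascii() and port.isdigit()) or not 0 < int(port) < 65536:
        raise ValueError("invalid port: " + repr(port))
    return host, int(port)
```
.DE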
.NH 3
.Tc User Name
.LP
A 
.I "User Name"
is a string which identifies a user on the system under PBS.  It is also
known as the login name.
PBS will accept names up to and including 16 characters.
The name may contain any printable, non white space character excluding the
commercial at sign, \*Q@\*U.
The various systems on which PBS is executing may place additional
limitations on the user name.
.\" force next chapter to odd page
.bp
.if e \{
.sp 10
.DS C
[Page intentionally left blank.]
.DE
.bp
\}
