.nr % 1
.OH ''PBS IDS'Batch Server'
.EH 'Batch Server'PBS IDS''
.P1
.so ids_setup.ms
.Rv $Revision: 2.3 $
.nr Fi 0 1
.nr H1 4
.NH 1
.Tc "\f3\s+2The Batch Server\s-2\fP"
.LP
.OF 'Chapt \*(rV''\n(H1-%'
.EF '\n(H1-%''Chapt \*(rV'
.\"         Portable Batch System (PBS) Software License
.\" 
.\" Copyright (c) 1999, MRJ Technology Solutions.
.\" All rights reserved.
.\" 
.\" Acknowledgment: The Portable Batch System Software was originally developed
.\" as a joint project between the Numerical Aerospace Simulation (NAS) Systems
.\" Division of NASA Ames Research Center and the National Energy Research
.\" Supercomputer Center (NERSC) of Lawrence Livermore National Laboratory.
.\" 
.\" Redistribution of the Portable Batch System Software and use in source
.\" and binary forms, with or without modification, are permitted provided
.\" that the following conditions are met:
.\" 
.\" - Redistributions of source code must retain the above copyright and
.\"   acknowledgment notices, this list of conditions and the following
.\"   disclaimer.
.\" 
.\" - Redistributions in binary form must reproduce the above copyright and 
.\"   acknowledgment notices, this list of conditions and the following
.\"   disclaimer in the documentation and/or other materials provided with the
.\"   distribution.
.\" 
.\" - All advertising materials mentioning features or use of this software must
.\"   display the following acknowledgment:
.\" 
.\"   This product includes software developed by NASA Ames Research Center,
.\"   Lawrence Livermore National Laboratory, and MRJ Technology Solutions.
.\" 
.\"         DISCLAIMER OF WARRANTY
.\" 
.\" THIS SOFTWARE IS PROVIDED BY MRJ TECHNOLOGY SOLUTIONS ("MRJ") "AS IS" 
.\" WITHOUT WARRANTY OF ANY KIND, AND ANY EXPRESS OR IMPLIED WARRANTIES, 
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, 
.\" FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT ARE EXPRESSLY
.\" DISCLAIMED.
.\"
.\" IN NO EVENT, UNLESS REQUIRED BY APPLICABLE LAW, SHALL MRJ, NASA, NOR
.\" THE U.S. GOVERNMENT BE LIABLE FOR ANY DIRECT DAMAGES WHATSOEVER,
.\" NOR ANY INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\" 
.\" This license will be governed by the laws of the Commonwealth of Virginia,
.\" without reference to its choice of law rules.
.NH 2
.Tc \f3Server Overview\fP
.LP
The batch server is the heart of the batch processing system.
There is typically one server per main processing host.
Additional servers may exist for testing or special purposes.
Also, a single server may be configured to support a cluster of
processing nodes.
.LP
The batch server has the following responsibilities:
.IP \(bu
Own and manage batch jobs.
.IP \(bu
Own and manage queues.
.IP \(bu
Recover state of jobs and queues upon restart of batch server.
.IP \(bu
Perform services on behalf of clients based on batch service requests.
.IP \(bu
Perform deferred services on behalf of jobs based on external events
(changes in environment, resources, etc.) or time.
.IP \(bu
Initiate selection of jobs for execution based on a set of site defined
policy rules.
.IP \(bu
Establish resource reservations and usage limits for jobs being placed
into execution.
.IP \(bu
Place a batch job into execution and monitor its progress.
.IP \(bu
Perform post job execution processing and clean-up.
.LP
.NH 3
.Tc Server Objects and Attributes
.LP
With apologies to Lewis Carroll...
.sp
.in +1i
.nf
`The time has come,' the Walrus said,
\ \ `To talk of many things:
Of Queues - and Jobs - and Attributes -
\ \ Of cabbages - and kings -
And why the sea is boiling hot -
\ \ And whether PBS has wings.'
.fi
.in -1i
.sp
To understand the design of the PBS
server, it is necessary to understand the concepts behind server objects,
such as jobs and queues, and the object attributes.  Three classes of objects
exist within the server: jobs, queues, and the server itself.
An instantiation of an object is represented by a structure and the data
it contains.  There is a separate structure for each object.
.NH 4
.Tc \s-2Job Objects\s+2
.LP
A job is a set of data about the job and the job script.
The job data is maintained in a 
.I job
structure which is defined in
.I job.h .
This information, along with the script, is also recorded on disk to prevent
loss in case of a crash.
The job data controls how the server deals with the job, the resources
made available to the job during execution, and what happens to the
standard output and standard error files of the job when it completes
processing.
.LP
The job data can be divided into two groups: the fixed data and the attributes.
The fixed data is typically private to the server or read-only to
the client.  This data is fixed in size and maintained in a sub-structure.
.LP
The client supplied/modifiable data is in the form of 
.I attributes .
The ERS names these attributes and explains their purposes.
The attributes of a job are defined in the file
.Ix job_attr_def.c
as an array of attribute definition structures,
.Ty "attribute_def job_attr_def[]" .
It is critical to maintain the ordering between the definitions in
.Ix job_attr_def[]
and the 
.Ty "enum job_atr"
defined in job.h.  The values in this enum are used to index into the
job attribute array.
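.LP
The ordering requirement can be illustrated with a minimal sketch.
Everything named demo_ below is hypothetical and much reduced; the real
attribute_def structure has no index member, which is added here only so
that a mismatch between the array and the enum can be detected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced stand-ins for enum job_atr and attribute_def. */
enum demo_job_atr {
    DEMO_ATR_jobname,
    DEMO_ATR_owner,
    DEMO_ATR_priority,
    DEMO_ATR_LAST          /* count of attributes, not an attribute itself */
};

struct demo_attr_def {
    const char *name;      /* external attribute name */
    int         index;     /* enum value this entry belongs to (demo only) */
};

/* The entries must appear in exactly the order of the enum above. */
static const struct demo_attr_def demo_job_attr_def[] = {
    { "Job_Name",  DEMO_ATR_jobname  },
    { "Job_Owner", DEMO_ATR_owner    },
    { "Priority",  DEMO_ATR_priority },
};

/* Return 1 if the array and the enum agree in length and order, else 0. */
static int demo_order_ok(void)
{
    size_t n = sizeof demo_job_attr_def / sizeof demo_job_attr_def[0];
    size_t i;

    if (n != (size_t)DEMO_ATR_LAST)
        return 0;
    for (i = 0; i < n; i++)
        if (demo_job_attr_def[i].index != (int)i)
            return 0;
    return 1;
}

/* Look up a definition directly by enum value. */
static const char *demo_attr_name(enum demo_job_atr which)
{
    assert(which < DEMO_ATR_LAST);
    return demo_job_attr_def[which].name;
}
```

.LP
A check such as demo_order_ok() could be run once at start up; if an entry
is inserted in the array without the matching change to the enum, every
later lookup by enum value would silently return the wrong definition.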
.NH 4
.Tc \s-2Queue Objects\s+2
.LP
A queue is little more than a collection of jobs.  There are attributes
associated with a queue which control how the server deals with the queue.
As with the job, the queue structure is recorded to disk to preserve it
across crashes and shutdowns.
.LP
POSIX 1003.2d defines, and PBS supports, two basic queue types:
execution and routing.
Jobs remain in execution queues until they are run or aborted.
Jobs in routing queues are to be moved to another queue.  The destination
may be a queue in the same server or in a remote server.
.LP
The queue attributes are defined in the file
.Ix queue_attr_def.c
as an array of attribute definition
structures
.Ty "attribute_def que_attr_def[]"  .
As with jobs, the ordering between the members of the array 
.Ix que_attr_def[]
and the
.Ty "\*Qenum queueattr\*U"
defined in 
.Ix queue.h
must be maintained.  The two types of queues have slightly different 
attributes.  The que_attr_def array contains both sets.  Only the attributes
defined for the type of a queue are used with that queue.
.NH 4
.Tc \s-2The Server Object\s+2
.LP
The server itself is an object: a structure and a set of attributes which
control various aspects of the server's operation.
The server attributes are defined in the file
.Ix svr_attr_def.c
as the  attribute definition array
.Ix svr_attr_def[] .
Again it is critical to maintain the ordering between the attributes and
the 
.Ty "\*Qenum srv_atr\*U"
defined in 
.Ix server.h
\&.
.NH 4
.Tc "\s-2Just What are Attributes?\s+2"
.LP
The concept of an attribute provides much of the
flexibility and power of PBS.  In one sense, an attribute is just another
data item, an element of the parent object's structure.  What sets
attributes apart is the two tier representation of an attribute and
the encapsulation of the data and its associated functions.
.LP
An attribute is represented by its name and its value.
Exterior to the server, an attribute is seen as a pair of
character strings, one for the attribute name and one for its value.
Internally, attributes are represented in one of two ways, depending on whether
the meaning of the attribute is known to the program.
If the meaning is unknown to the program (specifically a Job Server), the
internal representation is very similar to the external form.  This will
be discussed in more detail shortly.
For those attributes whose meaning is known, i.e. there is code to do something
with the value, the attribute is represented by two structures, the
.I attribute
structure (often referred to as the value), and the
.I attribute_def 
structure.
The attribute structure contains the actual attribute value in a machine
dependent form.
There is one attribute structure for each instance (occurrence) of an attribute.
.LP
The attribute_def structure contains the attribute name, flags, and pointers
to the functions used to access and manipulate the attribute,
see the section 
.B "Attribute Manipulation Functions"
later in this chapter.
There is one and only one attribute_def structure for each named attribute of
an object.
Attributes of the same data type (integer, character string, ...)
may share the access functions.
To add an attribute with a new name and new capability, it is only necessary
to add the new
definition structure and any access function which might be unique.
.LP
The attribute definitions exist in an array of attribute_def for each type
of parent object.
Attributes with the same name but different types and meaning
.I may
exist in different types of objects such as jobs and queues.
However, this is 
.I not
recommended because of the confusion it can cause.
.LP
The attribute value is represented in a 
.I attr_value
union within the
.I attribute
structure.
This union contains all possible value data types, see attribute.h.
It is assumed that any code needing the attribute value knows what type it is;
however, that information is available in the attribute_def.
Some of the attribute value data types, (or more simply 
.I "attribute types" )
require additional storage to hold the value.  In these cases, the
additional space is allocated and freed as required.
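.LP
The split between the per-instance value and the per-name definition can be
sketched as follows.  This is an illustration only: the demo_ structures are
hypothetical reductions, not the declarations in attribute.h, and the two
attribute names are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum demo_type { DEMO_TYPE_LONG, DEMO_TYPE_STR };

/* Per-instance value: one of these exists for each occurrence. */
struct demo_attribute {
    int is_set;
    union {
        long  at_long;
        char *at_str;      /* needs separately allocated storage */
    } at_val;
};

/* Per-name definition: exactly one for each named attribute. */
struct demo_attribute_def {
    const char    *at_name;
    enum demo_type at_type;
    int  (*at_decode)(struct demo_attribute *pattr, const char *val);
    void (*at_free)(struct demo_attribute *pattr);
};

static int demo_decode_l(struct demo_attribute *pattr, const char *val)
{
    char *end;
    long  v = strtol(val, &end, 10);

    if (end == val || *end != 0)
        return -1;                     /* not a complete number */
    pattr->at_val.at_long = v;
    pattr->is_set = 1;
    return 0;
}

static int demo_decode_str(struct demo_attribute *pattr, const char *val)
{
    char *copy = malloc(strlen(val) + 1);

    if (copy == NULL)
        return -1;
    strcpy(copy, val);
    pattr->at_val.at_str = copy;
    pattr->is_set = 1;
    return 0;
}

static void demo_free_null(struct demo_attribute *pattr)
{
    pattr->is_set = 0;                 /* nothing was allocated */
}

static void demo_free_str(struct demo_attribute *pattr)
{
    if (pattr->is_set)
        free(pattr->at_val.at_str);    /* release the extra storage */
    pattr->is_set = 0;
}

/* Two attributes of different types; same-typed ones would share functions. */
static const struct demo_attribute_def demo_defs[] = {
    { "priority", DEMO_TYPE_LONG, demo_decode_l,   demo_free_null },
    { "comment",  DEMO_TYPE_STR,  demo_decode_str, demo_free_str  },
};
```

.LP
Code that holds only a pointer to a definition can decode or free a value
without knowing its type, which is the encapsulation described above.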
.LP
All possible attributes for the server and queues are known
to the server by name.
Any reference to an unknown attribute name for those objects is illegal.
This is not true for jobs, however, since jobs may be "just passing
through" to another server, or the name and value may have meaning to the
Job Scheduler.  The meaning of such attributes is unknown to this specific
Job Server.
To handle this case, a special attribute, the
.At unknown
attribute, is created for jobs.  Any unrecognized attribute for a job
is maintained under the unknown attribute as a linked
list of two strings, name and value, and a control or header structure.
The strings and the control structure, which  gives their lengths and the
total storage required, are placed in a single allocated block of storage.
This block can be easily saved to disk without any knowledge of the type.
This form is known as the 
.I svrattrl
structure (for server attribute list).
.LP
The svrattrl structure also acts to isolate the server from the actual
form used for network encoding.
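.LP
The idea of keeping the control structure and both strings in one allocated
block can be sketched as below.  The demo_attrl layout is an assumption made
for illustration; the real svrattrl structure carries more fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; the strings follow it in the same allocation. */
struct demo_attrl {
    size_t tsize;      /* total bytes in this block, header included */
    size_t nlen;       /* length of name, including its terminating NUL */
    size_t vlen;       /* length of value, including its terminating NUL */
};

/* The name starts immediately after the header. */
static char *demo_attrl_name(struct demo_attrl *p)
{
    return (char *)(p + 1);
}

/* The value starts immediately after the name. */
static char *demo_attrl_value(struct demo_attrl *p)
{
    return (char *)(p + 1) + p->nlen;
}

/* Allocate one block holding the header and both strings. */
static struct demo_attrl *demo_attrl_create(const char *name,
                                            const char *value)
{
    size_t nlen  = strlen(name) + 1;
    size_t vlen  = strlen(value) + 1;
    size_t tsize = sizeof(struct demo_attrl) + nlen + vlen;
    struct demo_attrl *p = malloc(tsize);

    if (p == NULL)
        return NULL;
    p->tsize = tsize;
    p->nlen  = nlen;
    p->vlen  = vlen;
    memcpy(demo_attrl_name(p), name, nlen);
    memcpy(demo_attrl_value(p), value, vlen);
    return p;
}
```

.LP
Because the block holds no pointers to outside storage, writing tsize bytes
starting at the header saves the whole entry, whatever the attribute means.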
.NH 4
.Tc "\s-2What are Resources\s+2"
.LP
Up to this point, there have been a few scattered references to resources.
So the time has come to describe what they are.  The answer depends on whom
you ask.  POSIX 1003.2d defined a job attribute named
.I resource_list ,
and thus it exists in the PBS Server, where it has two meanings.
The resource_list job attribute is actually a set of requirements of system
resources needed by the job to execute and a set of limits to place on the
usage by the job of those resources.  For example, a job may need two tape
drives to execute.  Thus it would have a requirement in the resource_list of
\*Qtapes=2\*U.  A limit on the cpu usage of a job can be stated by a 
resource_list entry of \*Qcput=10\*U.
.LP
MOM will interpret the resource_list as limits.  The scheduler sees the
list as a list of job requirements, a slightly different view point.  The
resource monitor reports the availability of system resources to the scheduler.
These availability reports have nothing to do with the job resource_list.
.LP
Within the Server, resources are treated as a special case of a job attribute.
They are special in that they have multiple names and values and are in fact
maintained by the server as a linked list headed by the attribute.
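.LP
A minimal sketch of a resource list kept as a linked list headed by the
attribute is given below, using the two entries from the example above.
The demo_ types and functions are hypothetical and are not the server's
own resource structures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_resource {
    char *name;
    char *value;
    struct demo_resource *next;
};

/* The attribute itself only holds the head of the list. */
struct demo_resc_attr {
    struct demo_resource *head;
};

/* Copy n bytes of s into a new NUL-terminated string. */
static char *demo_copy(const char *s, size_t n)
{
    char *c = malloc(n + 1);

    if (c != NULL) {
        memcpy(c, s, n);
        c[n] = 0;
    }
    return c;
}

/* Add one "name=value" entry; returns 0 on success, -1 on error. */
static int demo_resc_add(struct demo_resc_attr *attr, const char *entry)
{
    const char *eq = strchr(entry, '=');
    struct demo_resource *r;

    if (eq == NULL || eq == entry)
        return -1;                       /* no '=' or empty name */
    r = malloc(sizeof *r);
    if (r == NULL)
        return -1;
    r->name  = demo_copy(entry, (size_t)(eq - entry));
    r->value = demo_copy(eq + 1, strlen(eq + 1));
    if (r->name == NULL || r->value == NULL) {
        free(r->name);
        free(r->value);
        free(r);
        return -1;
    }
    r->next = attr->head;                /* link at the head */
    attr->head = r;
    return 0;
}

/* Return the value recorded for name, or NULL if it is not in the list. */
static const char *demo_resc_find(const struct demo_resc_attr *attr,
                                  const char *name)
{
    const struct demo_resource *r;

    for (r = attr->head; r != NULL; r = r->next)
        if (strcmp(r->name, name) == 0)
            return r->value;
    return NULL;
}
```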
.LP
The resources for a batch complex managed by a Server are defined within the
server.  PBS supplies sets of resource definitions in the form of an array
of resource definition structures,
.Ix svr_resc_def[]
defined in a series of files
.Ix resc_def_*.c
\&.
There is a file for each target system supported by PBS.  Additional resources
may be added to the Server by inserting the appropriate definition in the
correct file.  However, code to process the resource will likely be required
in MOM.  Be sure to read the section on resource.h.
.NH 2
.Tc \f3Packaging\fP
.LP
The PBS Server is a single program which is run with root privilege as
a daemon process.
The source code for the server consists of the files in the directory
.I src/server ,
many of the header files in
.I src/include ,
and many of the libraries found under
.I src/lib/* .
The descriptions of the server's routines are grouped by the object on which
they act or the general purpose of the function.
.NH 2
.Tc \f3Program: pbs_server\fP
.LP
.NH 3
.Tc Overview
.LP
The PBS Server is started by the
.B pbs_server (8)
command.  The pbs_server command may be entered manually by an operator or it
may be placed in a boot-time startup file (/etc/rc.local).
Once the pbs_server has been started, it will:
.RS
.IP 1. 4
Validate the server database structure (see pbsd_init).
.IP 2. 4
Abort, requeue, restart, or reconnect to executing jobs depending on the
initialization mode.
.IP 3. 4
Initialize the network (and other interprocess communication) connections.
.IP 4. 4
Begin to accept and process batch service requests and to perform deferred
services.
.RE
.LP
The pbs_server will continue to perform services until it is terminated by
the receipt of a shutdown request or a SIGTERM (or SIGSHUTDN) signal.
The actions taken by pbs_server upon shutdown depend on the type of 
shutdown, delayed or immediate, see pbs_terminate(3) and qterm(8);
but will always include updating the server's database.
.NH 3
.Tc External Interfaces
.LP
The pbs_server process has the following external interfaces:
.RS
.IP \(bu
Arguments supplied on the command line.
.IP \(bu
The server database which is described in the section ??.??.?? Server
Database.
.IP \(bu
The batch requests received over the network interface, described in the
section 11.3 Protocols.
.IP \(bu
The information available from the PBS Scheduler, described
in the section 6.1.1 Scheduler/Server Communication and 8.1 MOM's
Interpretation of PBS Protocol.
.RE
.LP
The following modules (source files) are part of the pbs_server.
.NH 3
.Tc Server Main Loop
.LP
The file
.Ix pbsd_main.c
in directory
.I src/server
contains the initial entry point for the pbs daemon,
the code to interpret the arguments passed to the daemon, the call to
initialize the internal data and state, the call to initialize the
network interfaces, and the main process loop.
.LP
The server's main loop is event driven.  The event types are the arrival of
a batch service request, the arrival of a reply to a request made to another
server or daemon, the arrival of a signal, and the expiration of some timed
event.  The server runs as a pseudo multi-threaded serial server.  Unlike
parallel servers which fork a child copy of themselves for each service request,
the PBS Server runs as a single program which processes all requests.  This
is to ensure consistency of the internal data.  There are two situations in
which the server will fork: to send mail to a job owner and to send a job to
another server.  The latter may be time consuming and rather than handle the
complexity, the server creates a child which sends the job.  The server treats
the send operation as atomic; it either succeeds or fails.
.LP
The pseudo multi-threading comes from the method by which the server handles
tasks which might result in a delay.  For example, when the server sends a request
to another server, rather than block waiting for the reply, the fact that a 
reply is expected, on which communication connection it is expected, and what
function should process the reply is saved as an
.I event .
The arrival of the reply triggers the event processing.  This is known as a
\*Qdeferred reply\*U event.  There are several other types of events; they
are described in the routine 
.I set_task() .
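.LP
The deferred reply mechanism can be sketched with a small table of pending
entries.  This is an assumption-laden illustration: the demo_ names, the
fixed-size table, and the function signatures are hypothetical and are not
those of set_task().

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_TASKS 8

/* Hypothetical record of "a reply is expected on this connection". */
struct demo_task {
    int   in_use;
    int   conn;                          /* connection the reply arrives on */
    void (*func)(int conn, void *parm);  /* what to do when it arrives */
    void *parm;
};

static struct demo_task demo_tasks[DEMO_MAX_TASKS];

/* Record the expectation and return at once; nothing blocks. */
static int demo_defer_reply(int conn, void (*func)(int, void *), void *parm)
{
    int i;

    for (i = 0; i < DEMO_MAX_TASKS; i++) {
        if (!demo_tasks[i].in_use) {
            demo_tasks[i].in_use = 1;
            demo_tasks[i].conn   = conn;
            demo_tasks[i].func   = func;
            demo_tasks[i].parm   = parm;
            return 0;
        }
    }
    return -1;                           /* table full */
}

/* Called when data arrives on conn; returns 1 if an entry was dispatched. */
static int demo_reply_arrived(int conn)
{
    int i;

    for (i = 0; i < DEMO_MAX_TASKS; i++) {
        if (demo_tasks[i].in_use && demo_tasks[i].conn == conn) {
            demo_tasks[i].in_use = 0;    /* each entry fires only once */
            demo_tasks[i].func(conn, demo_tasks[i].parm);
            return 1;
        }
    }
    return 0;
}

/* Example processing function: counts the replies it has handled. */
static void demo_count_reply(int conn, void *parm)
{
    (void)conn;
    *(int *)parm += 1;
}
```

.LP
Between the call that records the entry and the arrival of the reply, the
single server process remains free to handle other requests.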
.LP
.Fn main()
.Cs
main(\^int argc, char **argv)
.Ce
.IP Args: 4
The
.Ar argv
array may contain the following options:
.RS
.IP "[-a true|false]"
Sets the
.At scheduling
attribute.
.IP "[-d config_path]"
Path of top level, see {PBS_DIR} in figure \n(H1\-\n+(Fi.
.IP "[-P dis_port]"
Specifies the port on which the server listens for DIS encoded requests;
must be numeric.
.IP "[-t type]"
Initialization type.
.IP "[-A account_file]"
Specifies the absolute path to the accounting log file.
.IP "[-L log_file]"
Specifies the absolute path to the general log file.
.IP "[-M port]"
Specifies a port on which MOM should be contacted.
.IP "[-S port]"
Specifies a port on which the scheduler should be contacted.
.RE
.IP
See
.B pbs_server (8)
for more detail on the -t and -d options.
.IP Returns:
None.
.LP
Control flow:
.nf
    Get local host name and default ports.
    Use DIS_tcp_setup() to set tcp routines for DIS encoding.
    Process arguments, setting flags based on options.
    Set log_event_mask and open the log file
    Set up to ignore or catch signals.
    Perform initialization processing based on type of initialization.
    Initialize the network communications.
    
    Begin the main processing loop.
        Process any ready work task event in the various work lists,
        see next_task().
        If the server is in state RUNNING,
            If the recovery type was RECOV_HOT,
                If more than SVR_HOT_CYCLE seconds have passed since last time,
                    call start_hot_jobs()
                If more than SVR_HOT_LIMIT seconds have passed since server up,
                    reset recovery type to RECOV_WARM to ignore hot jobs.
            If time or event to run scheduler and attribute Scheduling is true,
                call schedule_jobs()
            For each routing queue, 
                call queue_route() to route jobs.
        Wait on arrival of batch service request. 
        If a request arrived,
            Process request.
        Else if received a signal,
            If signal was death of child,
                Perform sub-server clean up processing.
            Else
                Shutdown the server.
        Continue with main processing loop.
    
    Update all server databases.
    Respond to the shutdown request (if one).
    Close network connections.
    Log the final shutdown event.
    Close the log.
    Exit
.fi
.DS B
.so dir_struct.pic
.sp
.ce
\f3Figure \n(H1\-\n(Fi: PBS Home Directory\f1
.DE
.LP
.Fn next_task()
.Cs
static time_t next_task (void);
.Ce
.IP Returns: 4
.RS
.IP time_t
time to next event
.RE
.LP
This function scans the various work task lists and,
for any entry for which service is now required, calls
.I dispatch_task()
to invoke the
processing routine.  The lists are processed in the following order:
.IP 1.
If the
.At svr_delay_entry
global variable is set non-zero, then the external event list,
.At task_list_event ,
is scanned for events which have been changed to type 
.Sc Immed ,
see 
.I catch_child() 
for details.
.IP 2.
Any entry in the immediate list,
.At task_list_immed ,
is dispatched.
.IP 3.
If the event time of any entries in the timed list,
.At task_list_timed ,
has been reached, they are dispatched.
.LP
If there is a need to run the job scheduler and scheduling is active,
that is done by setting
.Av svr_do_schedule
to
.Sc SCH_SCHEDULE_TIME .
.LP
The value returned is the lesser of (1) the time to the next timed action
(if one) and (2) the time to the next scheduler run.
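.LP
The handling of the timed list and the computation of the return value might
be sketched as follows.  This is not the code of next_task(); the demo_timed
structure, the array in place of a linked list, and the explicit now and
next_sched parameters are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical timed work task entry. */
struct demo_timed {
    time_t when;       /* time at which the task becomes due */
    int    done;       /* set once the task has been dispatched */
};

/*
 * Dispatch every timed task that is due and return the number of seconds
 * the caller may wait: the lesser of the time to the next timed task and
 * the time to the next scheduler run.
 */
static time_t demo_next_task(struct demo_timed *tasks, int ntasks,
                             time_t now, time_t next_sched)
{
    time_t wait = next_sched > now ? next_sched - now : 0;
    int    i;

    for (i = 0; i < ntasks; i++) {
        if (tasks[i].done)
            continue;
        if (tasks[i].when <= now) {
            tasks[i].done = 1;           /* stands in for dispatch_task() */
        } else if (tasks[i].when - now < wait) {
            wait = tasks[i].when - now;  /* an earlier deadline was found */
        }
    }
    return wait;
}
```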
.Fn start_hot_jobs()
.Cs
static void start_hot_jobs()
.Ce
.LP
This routine is called in the main loop when the server recovery mode
is 
.Sc RECOV_HOT .
Its purpose is to restart jobs which were running when the server last
went down.  Each job owned by the server which (1) is in substate
.Sc JOB_SUBSTATE_QUEUED ,
and (2) has the
.Sc JOB_SVFLG_HOTSTART
flag set in 
.I ji_svrflags
is placed into execution by calling
.I svr_startjob() .
.NH 3
.Tc Server Initialization
.LP
The file
.Ix pbsd_init.c
in directory
.I src/server
contains the code to initialize the batch server.
This code is called once when pbs_server begins execution.  The actions
performed depend on the type of initialization.
.LP
.Fn pbsd_init()
.Cs
int pbsd_init(\^int type\^)
.Ce
.IP Args: 4
.RS
.IP type
The type of initialization
.RE
.IP Returns:
.RS
.IP 0
If initialization is successful.  Note, many internal tables have been
loaded and the global server state has been changed.
.IP non-zero
If initialization failed.
.RE
.LP
The sequence of events for the initialization is:
.LP
Catch the following signals: SIGHUP, SIGINT, SIGTERM, and SIGCHLD.
Set up path names to various server directories and
clear the head of server lists.  
Set the various default server attribute values, network retry time and
force logging of all event types.
Set the default log file name to the Julian day of the year.
.LP
If this initialization is not of type 
.I create ,
load the server attributes from database, see svr_recov().
.LP
Initialize server global data items, such as the name of this server, its
network address and port, and MOM's address and port number. 
.LP
Then, if not a 
.I create
initialization, recover the queue attributes from the files in the
queue directory.  For each queue database file, call
.I que_recov() .
.LP
If not a 
.I create
or
.I clean
initialization, recover the jobs from the save files in the jobs directory.
Change the server's current working directory to the jobs directory.
For each file with a name ending in the job suffix, 
.B .JB ,
recover the job information, by calling
.I job_recov() 
and process the job to either re-queue or delete, see 
.I pbsd_init_job() .
Report on number of jobs recovered.
.LP
If the queue rank number used to order jobs in the queue has gone negative,
it is reset to zero and each job has its queue rank updated starting from one.
This is to prevent overflow.  While this should take a minimum of five years,
PBS is such a great product, it is bound to run that long :-).
.LP
The job tracking records are recovered from their save file and reloaded
into a tracking array.  The array is allocated to hold the larger of
the number of records in the save file or the minimum number of records
.Sc PBS_TRACK_MINSIZE .
.LP
If the initialization type is
.I cold
or
.I create ,
set the server attribute
.At Idle
to true.
.LP
.Fn build_path()
.Cs
static char *build_path(parent, name, suffix)
.Ce
.IP Args: 4
.RS
.IP parent
the name of the parent directory, used as the prefix.
.IP name
the desired file name.
.IP suffix
the suffix to append or null.
.RE
.IP Returns: 4
.RS
.IP pointer
to the name string.
.RE
.LP
The size of the path name is calculated and that amount of space is allocated.
The parent directory name is copied into the allocated space.
If the parent does not end in a slash, '/', one will be appended.
Then the name and any suffix are appended.
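.LP
The steps above can be sketched as follows.  This is an illustration under
the stated description, not the source of build_path(); the name
demo_build_path and the const-qualified parameter types are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated path, or NULL if no memory is available. */
static char *demo_build_path(const char *parent, const char *name,
                             const char *suffix)
{
    size_t plen       = strlen(parent);
    int    need_slash = (plen == 0 || parent[plen - 1] != '/');
    size_t len        = plen + (size_t)need_slash + strlen(name) + 1;
    char  *path;

    if (suffix != NULL)
        len += strlen(suffix);

    path = malloc(len);                  /* exactly the size calculated */
    if (path == NULL)
        return NULL;

    strcpy(path, parent);
    if (need_slash)
        strcat(path, "/");
    strcat(path, name);
    if (suffix != NULL)
        strcat(path, suffix);
    return path;
}
```

.LP
For example, a parent of /tmp/pbs/jobs, a name of 12.host and a suffix of
\&.JB yield /tmp/pbs/jobs/12.host.JB; the directory shown is only an example.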
.Fn pbsd_init_job()
.Cs
static void pbsd_init_job(job *pjob, int type)
.Ce
.IP Args: 4
.RS
.IP pjob
Pointer to job structure to process.
.IP type
Initialization type.
.RE
.LP
This function is called by pbsd_init() for each job file found and recovered.
The actions taken depend upon the state (substate) of the job at the
time the server went down, and upon the initialization type.
.LP
If the initialization type is 
.I clean ,
then abort the job by calling
.I job_abt() .
.LP
Otherwise, act according to the job substate.  Unless otherwise noted, 
the routine
.I pbsd_init_reque()
is called to requeue the job.  For a job in substate:
.IP TRANSICM
If the job was created here (the client was a temporary one, not a server), then
set substate to QUEUED.  Otherwise, hold on to the job in the new job list
and wait for some server to send a commit.
.IP TRANSOUT
Requeue the job as QUEUED.
.IP TRANSOUTCM
We need to (re)send the \*QReady to Commit\*U and \*QCommit\*U messages,
however the net connection has not yet been initialized.
So, requeue the job as is and establish a work task to finish sending the job.
.IP "QUEUED, PRESTAGEIN, STAGEIN, STAGECMP, STAGEFAIL, STAGEGO, HELD, SYNCHOLD, DEPNHOLD, WAITING, RUNNING, or STARTING"
Requeue the job as is.
.IP JOB_SUBSTATE_RESOURCE
Requeue the job in state JOB_STATE_QUEUED.  It will need to look
for its resources again.
.IP JOB_SUBSTATE_SYNCRES
Clear all recorded \*Qready\*U dependencies and requeue the job.
.IP "EXITING or STAGEOUT"
Set a task entry to complete the job exit processing.
.IP "Any other"
Abort the job.
.LP
.Fn pbsd_init_reque()
.Cs
static void pbsd_init_reque(job *pjob, int change)
.Ce
.IP Args: 4
.RS
.IP pjob
Pointer to job structure.
.IP change
flag to change or keep current job state.
.RE
.LP
This function is called by pbsd_init_job() to perform the enqueue.
Messages about the requeuing are placed in the log file.
.LP
If the
.Av change
flag is set to
.Sc CHANGE_STATE
(1), 
.I svr_evaljobstate()
is called to determine to what the job state and substate should be set; 
.I svr_setjobstate()
is called to set them.  If
.Av change
is
.Sc KEEP_STATE
(0), the job state and substate are unchanged.
.LP
Then 
.I svr_enquejob()
is called to add the job to the queue.
.Fn catch_child()
.Cs
void catch_child(int sig)
.Ce
.IP Args: 4
.RS
.IP sig
The signal
.Sc SIGCHLD
which caused this signal handler to be invoked.
.RE
.LP
This function is the signal handler for SIGCHLD, death of child.
Upon receipt of a SIGCHLD, a
.I waitpid()
system call is performed to collect the pid and exit status of any 
terminated child.
The event work task list is searched for an entry with a type of
.Sc Deferred_Child
and an event id matching the child pid.
If found, the exit status is saved in the entry and the entry type
is changed to
.Sc Immed .
The global flag
.At svr_delay_entry
is updated to indicate that the main loop should search the delayed
list for entries to be moved to the immediate list.
We do this rather than process the entry immediately, to minimize the work
performed in the signal handler and to prevent relinking the list when an
interrupted function might already have been doing so.
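.LP
A sketch of this pattern is shown below.  It is not the code of
catch_child(): the demo_ names and the fixed array are hypothetical, the
waitpid() call is placed in a non-blocking loop, and errno is saved and
restored, both of which are assumptions about good practice rather than
statements about the server.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

enum demo_task_type { DEMO_DEFERRED_CHILD, DEMO_IMMED };

struct demo_child_task {
    enum demo_task_type type;
    pid_t pid;         /* event id: the child being waited for */
    int   status;      /* filled in when the child is reaped */
};

#define DEMO_NCHILD 4
static struct demo_child_task demo_children[DEMO_NCHILD];
static int demo_nchildren;
static volatile sig_atomic_t demo_delay_entry;   /* read by the main loop */

/* Mark the matching entry; the list itself is never relinked here. */
static void demo_mark_child(pid_t pid, int status)
{
    int i;

    for (i = 0; i < demo_nchildren; i++) {
        if (demo_children[i].type == DEMO_DEFERRED_CHILD &&
            demo_children[i].pid == pid) {
            demo_children[i].status = status;
            demo_children[i].type   = DEMO_IMMED;
            demo_delay_entry = 1;        /* tell the main loop to look */
            return;
        }
    }
}

/* SIGCHLD handler: reap every finished child without blocking. */
static void demo_catch_child(int sig)
{
    int   saved_errno = errno;
    pid_t pid;
    int   status;

    (void)sig;
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
        demo_mark_child(pid, status);
    errno = saved_errno;
}
```

.LP
The handler only changes fields in place and sets a flag; moving the entry
to the immediate list is left to the main loop, outside signal context.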
.Fn change_logs()
.Cs
static void change_logs()
.Ce
This is the signal handler for SIGHUP.  When a hup is received,
the handler closes the accounting file calling
.I acct_close()
and reopens it calling
.I acct_open()
with
.Ar path_acct .
This allows the file to be moved to a new name and restarted.
.Fn stop_me()
.Cs
static void stop_me();
.Ce
This is the signal handler for all signals which are to terminate
the server.
.LP
The signal number is saved for a log_event() call which is made outside
of the handler, and the server state is set to
.Sc SV_STATE_SHUTSIG .
.NH 4
.Ix attr_recov.c
.LP
The file
.I src/server/attr_recov.c
contains the functions to write an array of attributes to a file
and to restore the attributes from the file.
The attributes of an object are saved whenever they are changed and
when the server is shut down.  This allows the server to recover its
state when restarted.
.LP
When attributes are being saved, they are encoded into a list of
.I svrattrl
entries.
This list is packed into a buffer by calling
.I save_struct() .
The buffer is only written whenever it becomes full.
This saves I/O calls.
On a recovery or restart, performance is not critical.  
.Fn save_setup()
.Cs
void save_setup(int fds)
.Ce
.IP Args: 4
.RS
.IP fds
The open file descriptor to which to write.
.RE
.LP
The file descriptor is squirreled away for calls to other functions in
this file.  The pointer into the buffer and the amounts of space used and
available are initialized.
.Fn save_struct()
.Cs
int save_struct(char *pobj, size_t objsize)
.Ce
.IP Args: 4
.RS
.IP pobj
A character pointer to a object (structure) to save.  The object cannot have
data that exists (is pointed to) outside of the object itself.
.IP objsize
The size, in bytes, of the object.  This amount of data is saved.
.RE
.IP Returns: 4
.RS
.IP 0
If object is written to the file successfully.
.IP -1
If an error occurs.
.RE
.LP
As much data as will currently fit into the \*Qpack buffer\*U is copied from
the object to the buffer.
If not all of the object fits, the buffer is
written and the pointer into it and the space available/used are reset.
.Fn save_flush()
.Cs
int save_flush()
.Ce
.IP Returns: 4
.RS
.IP 0
on success
.IP -1 
on error
.RE
.LP
Any data which resides in the \*Qpack buffer\*U is written to disk.  The
saved file descriptor is reset to a value to indicate the current save
operation is complete.  
.LP
Note, save_setup() must be called before attempting to pack any additional 
data.  Also, it is up to the caller to close the file descriptor if that is
appropriate.
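.LP
The interplay of save_setup(), save_struct() and save_flush() can be
sketched as below.  This is a simplified illustration, not the source in
attr_recov.c: the demo_ names, the deliberately tiny buffer size, and the
handling of short writes are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define DEMO_PKBUFSIZE 16          /* tiny on purpose, to force a flush */

static int    demo_fd = -1;        /* -1 means no save is in progress */
static char   demo_buf[DEMO_PKBUFSIZE];
static size_t demo_used;

/* Remember the descriptor and start with an empty pack buffer. */
static void demo_save_setup(int fd)
{
    demo_fd   = fd;
    demo_used = 0;
}

/* Write out whatever is in the buffer, coping with short writes. */
static int demo_write_out(void)
{
    size_t off = 0;

    while (off < demo_used) {
        ssize_t n = write(demo_fd, demo_buf + off, demo_used - off);

        if (n <= 0)
            return -1;
        off += (size_t)n;
    }
    demo_used = 0;
    return 0;
}

/* Pack objsize bytes; the buffer is written only when it fills up. */
static int demo_save_struct(const char *pobj, size_t objsize)
{
    if (demo_fd < 0)
        return -1;                 /* demo_save_setup() was not called */
    while (objsize > 0) {
        size_t room = DEMO_PKBUFSIZE - demo_used;
        size_t n    = objsize < room ? objsize : room;

        memcpy(demo_buf + demo_used, pobj, n);
        demo_used += n;
        pobj      += n;
        objsize   -= n;
        if (objsize > 0 && demo_write_out() != 0)
            return -1;             /* buffer was full and the write failed */
    }
    return 0;
}

/* Write any remaining data and mark the save operation as complete. */
static int demo_save_flush(void)
{
    int rc;

    if (demo_fd < 0)
        return -1;
    rc = demo_write_out();
    demo_fd = -1;
    return rc;
}
```

.LP
With a 16 byte buffer, packing two 10 byte objects causes one write of 16
bytes from within the second call and one write of 4 bytes at the flush.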
.Fn save_attr()
.Cs
int save_attr( attribute_def *padef, attribute *pattr, int numattr)
.Ce
.IP Args: 4
.RS
.IP padef
Pointer to the attribute definition structure, used to obtain the
attribute name.
.IP pattr
Pointer to the first attribute in the array to be saved.
.IP numattr
The number of attributes in the array to save.
.RE
.IP Returns: 4
.RS
.IP 0
if successful
.IP -1
if error
.RE
.LP
Each attribute in the array (see below) is in turn encoded into a svrattrl
entry using the 
.I at_encode
routine for that attribute.  The
.Sc ATR_VFLAG_MODIFY
flag is cleared to indicate the data has been saved.
Then the svrattrl entry is packed into the output buffer by calling
.I save_struct() .
The entry is unlinked from the list and freed.  
.LP
After the final attribute is encoded into the buffer, a dummy svrattrl
entry with a size set to a magic number,
.Sc ENDATTRIBUTES ,
is appended using save_struct().  This entry will be recognized by
recov_attr()
as indicating the end of the attributes has been reached.
.LP
Note, attributes of type
.Sc ATR_TYPE_ACL
are ignored.  These access control list attributes are not saved in the
same manner as other attributes, see
.I save_acl() .
.Fn recov_attr()
.Cs
int recov_attr(int fd, void *parent, attribute_def *padef, attribute *pattr,
               int limit, int unknown)
.Ce
.IP Args: 4
.RS
.IP fd
The open file descriptor to read.
.IP parent
Pointer to the parent object (structure) which contains the attributes.
.IP padef
The attribute definition structures for these attributes.
.IP pattr
A pointer to the array of attributes which is being restored.
.IP limit
The number of attributes in the attribute array and attribute
definition array, passed to find_attr().
.IP unknown
If greater than zero, this is the index into the attribute definition array
to use when the attribute does not match any known attributes, the 
attribute to use for \*Qunknown\*U attributes.  This is used only for jobs.
.RE
.IP Returns: 4
.RS
.IP 0
if successful
.IP -1
if error
.RE
.LP
Each attribute, in turn, is reloaded in two reads.  The first read gets the
fixed size portion of the
.I svrattrl
structure itself; this gives the size of the encoded attribute.
The second read obtains the variable portion containing the encoded strings.
The attribute is identified using the name string and the 
.I find_attr()
function.
.LP
If the attribute name does not match any in the definition array, either 
(1) the job is a transient job (in a routing queue) and has attributes
that are not known here; or (2) the server has been rebuilt and the attributes
changed.
In case one, the attribute is saved in the attribute given by the
.I unknown
parameter.  In case two, 
.I unknown
will be zero, the event is logged, and the attribute ignored.
.LP
The attribute value is then passed to the appropriate decode function.
If the attribute definition structure contains a non-null pointer to an
action function (at_action), the action routine is called.
The pointer to the parent structure is passed to the action routine along with
the pointer to the attribute and the 
.I "action mode" .
In this case, the action mode is set to 
.Sc ATR_ACTION_RECOV .
.LP
The loop is terminated when the total size (al_tsize) specified in the 
.I svrattrl
structure is equal to the magic number 
.Sc ENDATTRIBUTES .
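.LP
The two-read loop described above can be sketched in C roughly as follows.
This is an illustration only, not the PBS source: the names sk_hdr,
sk_read_all(), sk_recover() and the marker value SK_END are hypothetical
stand-ins for the svrattrl header and ENDATTRIBUTES, and the name lookup
and decode steps are reduced to a comment.
.Cs
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#define SK_END (-711)   /* hypothetical marker, stands in for ENDATTRIBUTES */

struct sk_hdr {         /* hypothetical fixed-size record header */
    int tsize;          /* total record size, header included */
};

/* Read exactly len bytes; return 0 on success, -1 on error or short read. */
static int sk_read_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Return the number of records read before the end marker, -1 on error. */
int sk_recover(int fd)
{
    int count = 0;

    for (;;) {
        struct sk_hdr h;
        char *body;
        size_t blen;

        if (sk_read_all(fd, &h, sizeof h) != 0)   /* first read: fixed part */
            return -1;
        if (h.tsize == SK_END)                    /* terminator reached */
            return count;
        if (h.tsize < (int)sizeof h)
            return -1;
        blen = (size_t)h.tsize - sizeof h;
        body = malloc(blen + 1);
        if (body == NULL)
            return -1;
        if (sk_read_all(fd, body, blen) != 0) {   /* second read: strings */
            free(body);
            return -1;
        }
        /* a real server would look up the name and decode the value here */
        free(body);
        count++;
    }
}
.Ce
.LP
A caller of a routine shaped like this would first read the fixed job or
server structure and then pass the same open descriptor on for the
attribute records.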
.NH 4
.Ix job_recov.c
.LP
The file
.I src/server/job_recov.c
contains the functions to save and recover (restore) a job structure and
its associated sub structures and lists from a job file on disk.
.Fn job_save()
.Cs
int job_save (job *pjob, int updatetype)
.Ce
.IP Args: 4
.RS
.IP pjob
Pointer to job structure which is to be saved.
.IP updatetype
The type of save: quick, full update, or new.
.RE
.IP Returns: 4
.RS
.IP 0
If save was successful.
.IP -1
If error.
.RE
.LP
The job structure is saved to disk in a file whose name matches
the job identifier.  The save is one of two types: quick (mode = 
.Sc SAVEJOB_QUICK ),
or full (mode = 
.Sc SAVEJOB_FULL
or mode = 
.Sc SAVEJOB_NEW ).
.LP
If the 
.Ar ji_modified
flag in the job structure is set indicating that one or more attributes
have been modified, the 
.Sc JOB_ATR_mtime
attribute is updated to the current time.
Note, this flag should be set any time a non Read-Only attribute is changed
on behalf of a client.
.LP
A quick save is performed to record state and other internal data changes.
Only the basic fixed length section of the job structure is re-recorded.
A rewrite in place is performed.  This minimizes the amount of I/O for
a common type of save.
.LP
A full save is performed for a new job or whenever an attribute changes
or a dependency is registered.  This update records the basic job structure
plus all of the variable length sub-structures and lists.
The various pieces of the structure are packed (buffered) and
the number of write calls is minimized for performance.
.br
1. The basic structure is written to disk using
.I save_struct() .
.br
2. The  attributes are encoded and packed into a buffer, using
.I save_attr() .
.LP
If an error occurs on a write, the whole series is retried once from the start.
.QP
Author's Note:
.br
The whole series of save operations uses synchronous writes; the file is
opened with
.Sc O_SYNC .
For some incomprehensible reason, O_SYNC is not included in POSIX.1 at this
time.  However, it is felt that the benefit of ensuring the completion of
the write outweighs this slight incompatibility.
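.LP
As an illustration of the quick save, the sketch below rewrites only the
fixed-length leading portion of an existing file in place, using a
descriptor opened with O_SYNC.  The structure sk_fixed and the function
sk_quick_save() are hypothetical; the real job structure and job_save()
are considerably more involved.
.Cs
#include <fcntl.h>
#include <unistd.h>

struct sk_fixed {       /* hypothetical fixed-length part of a job */
    int state;
    int substate;
};

/* Overwrite the fixed part at the start of path; 0 on success, -1 on error. */
int sk_quick_save(const char *path, const struct sk_fixed *fx)
{
    int fd = open(path, O_WRONLY | O_SYNC);  /* no O_TRUNC: keep the rest */
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, fx, sizeof *fx);           /* offset 0: rewrite in place */
    if (close(fd) != 0)
        return -1;
    return n == (ssize_t)sizeof *fx ? 0 : -1;
}
.Ce
.LP
Because the file is not truncated, whatever follows the fixed portion (in
the job file, the encoded attributes) is left untouched.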
.LP
.Fn job_recov()
.Cs
job *job_recov(char *filename)
.Ce
.IP Args: 4
.RS
.IP filename
The name of the job file from which a job is to be reloaded.
.RE
.IP Returns: 4
.RS
.IP "Non-null"
job pointer to the newly created job structure on success.
.IP Null
job pointer if the recovery failed.
.RE
.LP
A new job structure is allocated in memory.
The job structure, its working attributes, and its dependencies are recovered
from disk.  This takes place in two steps:
.br
1. The basic job structure is read in.
.br
2. The attributes are restored using
.I recov_attr() .
.NH 4
.Ix svr_recov.c
.LP
The file
.I src/server/svr_recov.c
contains the functions that save and restore the server structure and the
server attributes to or from disk.
.Fn svr_recov()
.Cs
int svr_recov(char *serverdb)
.Ce
.IP Args: 4
.RS
.IP serverdb
name of the server save file.
.RE
.IP Returns: 4
.RS
.IP 0
on success
.IP -1
on error
.RE
.LP
The server save file is opened.  The server structure data is read directly
into the structure.  Then recov_attr() is called to reload the attributes.
The server database file is closed.
.LP
The server's attributes are searched for one of type
.Sc ATR_TYPE_HOSTACL .
When found,
.I recov_acl()
is called to reload the access control list.
.Fn svr_save()
.Cs
int svr_save(server *ps, int mode)
.Ce
.IP Args: 4
.RS
.IP ps
Pointer to the server structure.
.IP mode
of save, quick or full.
.RE
.IP Returns: 4
.RS
.IP 0
on success
.IP -1
on error
.RE
.LP
If the mode is set for a quick save,
.Sc SVR_SAVE_QUICK ,
the server database file is opened and the fixed portion, server.sv_qs,
of the server structure is written.  The file is closed.
.LP
Otherwise, a new server database file is opened and save_setup() is called to
initialize the save I/O buffer.  Then save_struct() and save_attr() are called
to save the server structure (serverobj) and the server attributes.
Save_flush() is called to finish the I/O and the file is closed.
The original save file is unlinked and the new file linked to its name.
The new name is unlinked.  All this work minimizes the window in which the
server database file could be lost if the system crashes.
.LP
The server's attributes are searched for one of type
.Sc ATR_TYPE_HOSTACL .
When found,
.I save_acl()
is called to save the access control list.
.Fn save_acl()
.Cs
int save_acl(attribute *pattr, attribute_def *pdef, char *path, char *name)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to acl attribute.
.IP pdef
pointer to attribute def structure for acl attribute.
.IP path
of directory in which acl file lives.
.IP name
of parent object, also the file name.
.RE
.IP Returns: 4
.RS
.IP 0
if successful.
.IP <0
if error.
.RE
.LP
Access control list attributes are saved, not with the other attributes of the
object, but in their own file.  This is done for two reasons.  First, the size
of the attribute value, the acl entries, might be large.  This would 
slow the updating of the parent object save file and increase the window where
a crash might cause loss of data.  Second and more important, the separate
file allows the administrator to directly edit the access control list.  Since
the list might be large, it is easier to input directly than through qmgr.
Note that changes made this way do not take effect until the server is shut
down and restarted, at which time it reloads the acl.
.LP
If the
.Sc ATR_VFLAG_MODIFY
flag is off in the attribute, nothing needs to be done and the function returns.
The file name is created by concatenating the path and the parent object name.
The suffix \*Q.new\*U is appended to the name and this file is created.
The attribute is encoded by calling its
.I at_encode
routine.  Note, as the attribute uses the array of strings, arst, encoding, the 
.Sc ATR_ENCODE_SAVE
flag causes each string entry to be concatenated with a new line separating
the sub strings.  
.LP
Just the value portion of the encoded entry, not the full svrattrl
structure is written to disk.  This yields an editable file.
.LP
The file is closed.  The old file name, without the .new suffix is unlinked
and then relinked to the new file.  The new file name, with the suffix,
is unlinked.  This ensures the old contents are not lost before the new
contents are safe.
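.LP
The write-new, unlink-old, link, unlink-new sequence used here (and by
svr_save()) might look like the following in C.  This is a sketch under
assumptions, not the PBS code: sk_replace_file() is a hypothetical helper,
the path buffer size is arbitrary, and error handling is minimal.
.Cs
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Replace the file at path with len bytes of data; 0 if ok, -1 on error. */
int sk_replace_file(const char *path, const char *data, size_t len)
{
    char newpath[1024];
    int fd;

    if (snprintf(newpath, sizeof newpath, "%s.new", path)
            >= (int)sizeof newpath)
        return -1;
    fd = open(newpath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        unlink(newpath);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(newpath);
        return -1;
    }
    unlink(path);                 /* the old name may not exist yet */
    if (link(newpath, path) != 0)
        return -1;                /* the data is still safe under newpath */
    unlink(newpath);
    return 0;
}
.Ce
.LP
If the link step fails, the new contents remain available under the name
with the suffix, so they are not lost.
On current POSIX systems a single rename() call would replace the
unlink and link steps; the sketch keeps the sequence described in the text.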
.Fn recov_acl()
.Cs
void recov_acl(attribute *pattr, attribute_def *pdef, char *path, char *name)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to acl attribute.
.IP pdef
pointer to attribute def structure for acl attribute.
.IP path
of directory in which acl file lives.
.IP name
of parent object, also the file name.
.RE
.LP
This function reloads the value of an access control list into the attribute.
It is only called when the server is initializing.
.LP
The file name is created from the path and parent object name.
The file is stat-ed to obtain its size.  If the stat fails or the size is
zero, the function returns.
.LP
The file is opened for read.  A buffer large enough to hold the entire file
is allocated and the file is read into it.
The file is closed.
.LP
The data is decoded into the attribute by calling the
.I at_decode()
routine for the attribute.  Then the buffer is freed.
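.LP
The stat, allocate, read and close steps can be sketched as below.  The
function sk_slurp() is hypothetical and stands in for only the file
handling part of recov_acl(); the call to the attribute's decode routine
is left to the caller.
.Cs
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return a malloc-ed, NUL-terminated copy of the file, or NULL if the
 * file is missing, empty, or unreadable.  The caller frees the buffer. */
char *sk_slurp(const char *path)
{
    struct stat sb;
    char *buf;
    int fd;
    ssize_t n;

    if (stat(path, &sb) != 0 || sb.st_size == 0)
        return NULL;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    buf = malloc((size_t)sb.st_size + 1);
    if (buf == NULL) {
        close(fd);
        return NULL;
    }
    n = read(fd, buf, (size_t)sb.st_size);
    close(fd);
    if (n != (ssize_t)sb.st_size) {
        free(buf);
        return NULL;
    }
    buf[n] = 0;
    return buf;
}
.Ce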

.NH 3
.Tc Job Functions
.LP
.NH 4
.Ix job_func.c
.LP
The file
.I src/server/job_func.c
contains general functions to deal with job structures.  Functions to
allocate and free the job structure, initialize or set the working
attributes, abort and restart jobs are included.
.Fn job_abt()
.Cs
int job_abt(job *pjob, char *text)
.Ce
.IP Args: 4
.RS 
.IP pjob
Pointer to job structure for job to be aborted.
.IP text
Message to be logged and mailed to owner.
.RE
.IP Returns: 4
.RS
.IP 0 
Job successfully aborted.
.IP -1
Error occurred.
.RE
.LP
The job state is set to 
.Sc JOB_STATE_EXITING
and the substate to 
.Sc JOB_SUBSTATE_ABORT .
A mail message is sent to the job owner.
A track job batch request is sent to the server which created the job
and any defined alternate server.
.LP
If the job state was 
.Sc JOB_STATE_RUNNING
and the server is not initializing,
a kill signal is sent to the job and the job state is updated to disk.
.LP
Else if the job was 
.Sc JOB_STATE_RUNNING
and the server is initializing
(the job was running when the server went down),
job exit processing is started to deal with output files.
.LP
Otherwise, the job is removed from the system by calling job_purge().
.Fn job_alloc()
.Cs
job *job_alloc()
.Ce
.IP Returns: 4
.RS
.IP address
of job structure
.IP NULL
if allocation of memory fails.
.RE
.LP
This function allocates the space for the job structure.  
The working array of attributes is initialized to \*Qunset\*U by calling
.I job_init_wattr() .
.Fn job_free()
.Cs
void job_free(job *pj)
.Ce
.IP Args: 4
.RS
.IP pj
Pointer to job structure to be freed.
.RE
.LP
The various sub-structures of the job structure are freed:
.br
(1) the dependency structures, depend_p and depend_child, 
.br
(2) the attribute string set, attrlist, and
.br
(3) the extra space allocated to any of the working attributes.
.LP
Finally the job structure itself is freed.
.Fn job_init_wattr()
.Cs
void job_init_wattr(job *pj)
.Ce
.IP Args: 4
.RS
.IP pj
Pointer to job structure in which to initialize the attributes.
.RE
.LP
This function is called to initialize the working attribute array in
a job structure.  For each attribute, the attribute type field is set
to match that of the corresponding member of the job attribute definition
array.  The attribute value flag 
.Sc ATR_VFLAG_SET
is cleared to indicate the attribute has not been set by a client request
(is set to a default unset value).
.Fn job_purge()
.Cs
void job_purge(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
Pointer to job structure of job to be purged from system.
.RE
.LP
The job structure is dequeued from any queue by calling svr_dequejob().
The job control file and job script file are unlinked (deleted).
If the job has output or checkpoint files in the PBS spool area,
they are unlinked.
The job structure and all associated structures are freed by calling
job_free().
.Fn find_job()
.Cs
job *find_job(char *jobid)
.Ce
.IP Args: 4
.RS
.IP jobid
The job id character string.
.RE
.IP Returns: 4
.RS
.IP Pointer
to the job structure if found, otherwise NULL
.RE
.LP
Each job in the server's list of all jobs is checked until a job
structure with the same job id is found or the end of the list is reached.
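.LP
A minimal sketch of such a linear search is shown below.  The structure
sk_job is a hypothetical, much reduced job structure with a plain
singly linked list; the server's own list macros are not used here.
.Cs
#include <stddef.h>
#include <string.h>

struct sk_job {                 /* hypothetical reduced job structure */
    const char *id;
    struct sk_job *next;
};

/* Return the job whose id matches jobid, or NULL if none is found. */
struct sk_job *sk_find_job(struct sk_job *head, const char *jobid)
{
    struct sk_job *pj;

    for (pj = head; pj != NULL; pj = pj->next)
        if (strcmp(pj->id, jobid) == 0)
            return pj;
    return NULL;
}
.Ce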
.NH 4
.Ix svr_jobfunc.c
.LP
The file 
.I src/server/svr_jobfunc.c
contains general job related server functions.
.Fn svr_enquejob()
.Cs
void svr_enquejob(job *pjob)
.Ce
.IP Args:
.RS
.IP pjob
Pointer to the job.
.RE
.LP
The job is linked into the list of all server jobs.
The counts of jobs managed by the server and managed by the server per state
are incremented.
The queue is located from the queue name in the job structure,
.I find_queuebyname() 
is called.
The job structure is linked into the list of jobs owned by the queue.
.LP
The position of the job in the server list of all jobs and the queue list
is determined by the 
.At JOB_ATR_qrank ,
queue_rank, attribute of the job.   Starting at the end of the queue, the
most likely place for the job to be placed, the list is searched backwards
for a job with rank lower than the new job.  The new job is inserted after
that job.
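.LP
The backwards search and insertion can be sketched with a simple doubly
linked list.  The types sk_qjob and sk_queue and the function sk_enqueue()
are hypothetical; the rank field plays the role of the queue_rank
attribute.
.Cs
#include <stddef.h>

struct sk_qjob {
    long rank;                       /* stands in for queue_rank */
    struct sk_qjob *prev, *next;
};

struct sk_queue {
    struct sk_qjob *head, *tail;
};

/* Insert nj after the last job whose rank is lower than its own. */
void sk_enqueue(struct sk_queue *q, struct sk_qjob *nj)
{
    struct sk_qjob *p = q->tail;

    while (p != NULL && p->rank >= nj->rank)   /* walk back from the end */
        p = p->prev;
    nj->prev = p;
    nj->next = (p != NULL) ? p->next : q->head;
    if (nj->next != NULL)
        nj->next->prev = nj;
    else
        q->tail = nj;
    if (p != NULL)
        p->next = nj;
    else
        q->head = nj;
}
.Ce
.LP
Since new jobs normally carry the highest rank, the loop usually stops at
the first job examined.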
.LP
The current count of jobs in the queue,
.I qu_numjobs ,
the number of jobs in the given state,
.I qu_njstat[state] 
and
.I sv_jobstates[state] ,
and the number of total jobs in the server
.I sv_numjobs
are incremented.
The current location attribute,
.Sc JOB_ATR_current_loc ,
is updated to the queue and server name.
.LP
If the job is changing queue types, routing to execution for example,
the queue dependent type fields in the 
.I ji_un
union are set according to the new queue type.
.LP
The job attribute
.At JOB_ATR_qtime 
is set to the current time if it was unset.  This notes the first time
into the queue.  At this time 
.I account_record()
is called with
.Sc PBS_ACCT_QUEUE
to make an accounting file entry.
Any unset resource which has a queue specific default value is set to the
default value.
.LP
If the job is being enqueued in an execution queue, several checks
are made.
If the job attribute
.At JOB_ATR_depend
is set, the function
.I depend_on_que()
is called to process any job dependency actions which might be required.
Note the use of the
.Sc ATR_ACTION_NOOP
mode; this is because depend_on_que() is the at_action routine for dependencies
and needs to limit what it does when called for enqueued jobs as opposed to
jobs actually being modified.
Additionally, the scheduling flag
.Av svr_do_schedule
is set to
.Sc SCH_SCHEDULE_NEW .
.LP
If the job is being enqueued in a route (push) queue, the ji_un union in 
the job structure is set up for
.Sc JOB_UNION_TYPE_ROUTE
type.  The ji_quetime field is set to the current time to mark the time 
in the queue and the next retry time, ji_rteretry, is cleared.
.Fn svr_dequejob()
.Cs
void svr_dequejob(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
Pointer to job structure to remove from a queue.
.RE
.LP
The job is unlinked from the queue in which it resides.  
The current count of jobs in the queue,
.I qu_numjobs ,
the number of jobs in the given state,
.I qu_njstat[state] 
and
.I sv_jobstates[state] ,
and the number of total jobs in the server
.I sv_numb_jobs
are decremented.
Clear any job resource values which are marked
as being set to the queue specific default.
.Fn svr_setjobstate()
.Cs
int svr_setjobstate(job *pjob, int newstate, int newsubstate)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure.
.IP newstate 
the new value for the job state.
.IP newsubstate 
the new value for the job substate.
.RE
.IP Returns: 4
.RS
.IP 0
if successful.
.IP "non zero"
if save of job structure failed.
.RE
.LP
Sets the job state and substate to the supplied values and updates
the job save file if needed.
.LP
If the job is in substate
.Sc JOB_SUBSTATE_TRANSICM ,
then it is a brand new job and it has never been added into the various
server and queue state counts.  Therefore these are not updated at this
time.  When the job is enqueued into a queue, very shortly, then the
counts will be incremented to include this job.
.LP
Otherwise, if the state is changed, the server and queue state counts are
updated.  The state and substate are set to the supplied values.
If the queue is an execution queue and the new state is
.Sc JOB_STATE_QUEUED ,
then 
.Ar svr_do_schedule
is set to 
.Sc SCH_SCHEDULE_NEW
to kick start the scheduler as the job is eligible to run.
For the later accounting entry, the job attribute
.At JOB_ATR_etime
is set to the current time.   This will be recorded as the \*Qeligible\*U time.
.LP
If the 
.Ar ji_modified
flag in the job is set, the job attributes have been modified, then
the complete job is saved by calling
.I job_save()
with the save mode of 
.Sc SAVEJOB_FULL .
Or, if only the state or substate changed, and if you change the state
you had better change the substate, then
.I job_save() 
is called with
.Sc SAVEJOB_QUICK .
The return value from job_save is passed back to the caller.
.Fn svr_evaljobstate()
.Cs
void svr_evaljobstate(job *pjob, int *newstate, int *newsub, int force)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure.
.IP newstate
RETURN:  pointer to where recommended job state is returned.
.IP newsub
RETURN:  pointer to where recommended job substate is returned.
.IP force
if true, force the state evaluation.
.RE
.LP
When evaluating the state,
the attributes of the job which might affect the job state are examined and
the recommended state and substate are returned.
This function should not be used to directly set the job state.  That 
should only be done via 
.I svr_setjobstate()
as it also updates the job attribute
.At JOB_ATR_state 
and updates the server and queue state counts.
.LP
.IP - 3
If 
.At force
is false and
the current job state is 
.Sc JOB_STATE_TRANSIT
or
.Sc JOB_STATE_RUNNING ,
the current state and substate are returned as the suggested state.
Code was added to svr_evaljobstate() to not change things when the job was
in JOB_STATE_TRANSIT, otherwise a job submitted with a past due execution
time screwed up by having its state changed to Queued while still being
received; the wait event timer was set to the old time and would go off 
immediately.
.IP
If
.At force
is true, the job is evaluated according to the following rules regardless
of the current state.
.IP -
If any hold is set it takes precedence over waiting and 
.Sc JOB_STATE_HELD
is returned.
.IP -
If the execute time attribute is set and that time has not
been reached,
.Sc JOB_STATE_WAITING
is set.
.IP -
If the job has a stage-in files attribute 
.At JOB_ATR_stagein ,
set, the state will be 
.Sc JOB_STATE_QUEUED .
If the files have been staged in (flag
.Sc JOB_SVFLG_StagedIn
is set in 
.Av ji_svrflags ),
the substate is
.Sc JOB_SUBSTATE_STAGECMP
(stage in complete), otherwise the substate is
.Sc JOB_SUBSTATE_PRESTAGEIN
(pre-stagein).
.IP -
Otherwise, 
.Sc JOB_STATE_QUEUED
is returned.
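.LP
The order of the rules above can be summarized in a short C sketch.  All
names here (sk_state, sk_sub, sk_jobinfo, sk_evalstate) are hypothetical
simplifications; in particular, the substates for the held, waiting and
plain queued cases are not modeled and are reported as SK_SUB_NONE.
.Cs
#include <time.h>

enum sk_state { SK_TRANSIT, SK_QUEUED, SK_HELD, SK_WAITING, SK_RUNNING };
enum sk_sub   { SK_SUB_NONE, SK_SUB_PRESTAGEIN, SK_SUB_STAGECMP };

struct sk_jobinfo {
    enum sk_state state;    /* current state */
    enum sk_sub sub;        /* current substate */
    int hold;               /* nonzero if any hold is set */
    time_t exectime;        /* execute time, 0 when unset */
    int has_stagein;        /* stage-in files attribute is set */
    int staged_in;          /* files have already been staged in */
};

/* Recommend a state and substate; the job itself is not modified. */
void sk_evalstate(const struct sk_jobinfo *j, time_t now, int force,
                  enum sk_state *newstate, enum sk_sub *newsub)
{
    *newsub = SK_SUB_NONE;
    if (!force && (j->state == SK_TRANSIT || j->state == SK_RUNNING)) {
        *newstate = j->state;           /* leave things as they are */
        *newsub = j->sub;
    } else if (j->hold) {
        *newstate = SK_HELD;            /* a hold beats waiting */
    } else if (j->exectime != 0 && j->exectime > now) {
        *newstate = SK_WAITING;
    } else if (j->has_stagein) {
        *newstate = SK_QUEUED;
        *newsub = j->staged_in ? SK_SUB_STAGECMP : SK_SUB_PRESTAGEIN;
    } else {
        *newstate = SK_QUEUED;
    }
}
.Ce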
.LP
.Fn get_variable()
.Cs
char *get_variable(job *pjob, char *variable)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job.
.IP variable
name of an environment variable passed with job.
.RE
.IP Returns: 4
A pointer to the value part of the name=value environment string if found,
null otherwise.
.LP
This function finds the environment variable name=value string passed with a
job and returns a pointer to the value.  It is most often used to find the
variable PBS_O_HOST to determine the name of the host from which the job
was submitted.
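.LP
A sketch of the lookup follows.  It assumes, for illustration only, that
the job's variables are available as an array of name=value strings; the
function sk_get_variable() and its argument list are hypothetical.
.Cs
#include <stddef.h>
#include <string.h>

/* Return a pointer to the value of name in env[0..n-1], or NULL. */
const char *sk_get_variable(const char *const *env, size_t n,
                            const char *name)
{
    size_t len = strlen(name);
    size_t i;

    for (i = 0; i < n; i++)
        if (strncmp(env[i], name, len) == 0 && env[i][len] == '=')
            return env[i] + len + 1;    /* skip past the '=' */
    return NULL;
}
.Ce
.LP
The check for the equals sign keeps a request for PBS_O_HOST from matching
a longer name such as PBS_O_HOSTNAME.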
.Fn chk_svr_resc_limit()
.Cs
static void chk_svr_resc_limit(attribute *jobatr, attribute *queatr,
attribute *svratr)
.Ce
.IP Args: 4
.RS
.IP jobatr
pointer to the job's resource list attribute.
.IP queatr
pointer to the specific queue's resource limit (max) attribute.
.IP svratr
pointer to the server's resource limit (max) attribute.
.RE
.IP Returns:
The global variables
.Av comp_resc_gt
and 
.Av comp_resc_lt
are set according to the comparisons.
.LP
For each resource limit (requirement) specified for the job that is not an
inherited default value, the limit is compared with:
.RS
.IP a. 3
The corresponding queue's limit if one is set for that resource, or
.IP b.
The server's limit if one is set for that resource.
.RE
.LP
The job's resource request (limit) is compared with the queue or
server limit.   If the request differs from the limit, the global variable
.Av comp_resc_gt
or
.Av comp_resc_lt
is incremented depending on the relationship of the request to the limit.
If neither a queue nor a server limit is set, neither of the global variables
is changed.
.Fn chk_resc_limits()
.Cs
int chk_resc_limits(attribute *pattr, pbs_queue *pque)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to job's Resource_List attribute.
.IP pque
pointer to queue in which the job resides.
.RE
.IP Returns: 4
zero if job's limits are within queue/server bounds, PBSE_EXCQRESC if not.
.LP
Each set resource limit (requirement) of the job is checked against the queue's
minimum limit specified in the attribute
.At QA_ATR_ResourceMin .
If the queue has a
maximum limit attribute,
.At QA_ATR_ResourceMax ,
the job's requirements are checked against it or if there is not a queue max
limit, the job is checked against the server's maximum limit
.At SRV_ATR_ResourceMax 
by calling 
.I chk_svr_resc_limit() .
.Fn svr_chkque()
.Cs
int svr_chkque(job *pjob, queue *pque, char *host, int move_type)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure.
.IP pque
pointer to queue structure for queue to check.
.IP host
name of host submitting job.
.IP move_type
type of move, MOVE_TYPE_* as defined in server_limits.h
.RE
.IP Returns: 4
.RS
.IP 0
if job can be enqueued.
.IP nonzero
if error, return is a PBSE_ error number.
.RE
.LP
The
.Ar move_type
argument identifies whether the job is being moved into the queue
as a result of:
.RS
.IP MOVE_TYPE_Move 
new submission or qmove by non-privileged user.
.IP MOVE_TYPE_Route
routing from a routing queue.
.IP MOVE_TYPE_MgrMv
qmove by privileged user (manager).
.IP MOVE_TYPE_Order
qorder request.
.RE
.LP
The following checks are made to see if the job can be enqueued
into the queue:
.IP 1. 4
If the queue is an execution queue, then check the following:
.RS
.IP a. 4
Can the execution uid and gid be established?
This is checked first because a return of
.Er PBSE_BADUSER
or 
.Er PBSE_BADGRP
is a fatal event to a request by a manager to move a job.
.IP b.
Does the job have an \*Qunknown\*U resource,
.Er PBSE_UNKRESC ?
Also fatal to a manager.
.IP c.
Does the job have an \*Qunknown\*U attribute,
.Er PBSE_NOATTR ?
Also fatal to a manager.
.IP d. 
If the queue's group ACL is enabled, is the execution group allowed,
.Er PBSE_PERM ?
This is not fatal if requested by a manager.
.RE
.IP 2.
The queue is enabled,
.Er PBSE_QUNOENB ,
and the queue job limit,
.At max_queuable
(QA_ATR_MaxJobs) is not exceeded,
.Er PBSE_MAXQUED .
This is not fatal if requested by a manager.  This check is skipped for a queue
order request on the basis that two jobs are being swapped so the queue
limits are not affected.
.IP 3.
If the queue is marked as accepting jobs only from a routing queue,
.At QA_ATR_FromRouteOnly
is true,
.Er PBSE_QACESS .
This is not fatal if the request is from a manager or the job is from a routing queue.
It is not checked for a queue order.
.IP 4.
If the queue has an enabled host ACL, then the submitting host must
be able to access the queue,
.Er PBSE_BADHOST .
This is not fatal if requested by a manager.
.IP 5.
If the queue has an enabled user ACL, then the job owner must be able
to access the queue,
.Er PBSE_PERM .
This is not fatal if requested by a manager.
.IP 6.
The resources of the job must be within the range specified by the
minimum and maximum resources allowed in the queue,
.Er PBSE_EXCQRESC .
This is not fatal if requested by a manager.
.LP
If any check fails, the appropriate error number is returned.  If all checks
pass, then zero is returned.
.Fn job_set_wait()
.Cs
int job_set_wait(attribute *pattr, void *pobject, int actmode)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the execute-time,
.Sc JOB_ATR_exectime ,
attribute of a job.
.IP pobject
pointer to a job structure, cast as a void * to match prototype.
.IP actmode
the attribute set mode, see attribute.h.
.RE
.IP Returns: 4
.RS
.IP 0
if ok.
.IP non-zero
if error.
.RE
.LP
This routine is called as the
.I at_action()
function whenever the 
.At execute-time
attribute of a job is set.
.LP
A search is made for an existing work task on the job's list pointing to 
.I job_wait_over() .
If one is found, the event time is updated to the value of the job wait
(execution) time.  If one is not found, and if the execution time is later
than the current time, a work task entry is created for the wait time
and set to invoke
.I job_wait_over() .
.Fn job_wait_over()
.Cs
static void job_wait_over(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
pointer to a work task entry.
.RE
.LP
This function is invoked off the server's work task list.  The entry was
set up with the event time of a job's execution wait time and member
.At wt_parm1
as a pointer to the job.  All we need to do is re-evaluate the job's state
by calling 
.I svr_evaljobstate()
and
.I svr_setjobstate() .
.Fn default_std()
.Cs
static void default_std(job *pjob, char key, char *to)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job.
.IP key
to which file, the single character 'o' for output or 'e' for error.
.IP to
pointer to buffer in which the file name is placed.
.RE
.LP
The default name for either the standard output or standard error stream
of a job is generated.  The name is of the form
.Ty job_name.[e|o]job_sequence_number ,
where 'e' is used for error or the 'o' for output.  The job_name is from the
.At JOB_ATR_jobname
attribute.  The buffer in which the name is placed must be sufficiently large
to hold the name.  
.Fn prefix_std_file()
.Cs
char *prefix_std_file(job *pjob, char key)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job.
.IP key
to which stream, output or error.
.RE
.IP Returns: 4
.RS
.IP pointer
to malloc-ed space holding the generated full path name.
.RE
.LP
This function builds the fully specified (absolute) default path name for
either the standard output or standard error of a job.  The result is of
the form:
.br
.Ty qsub_host:$PBS_O_WORKDIR/job_name.[e|o]job_sequence_number
.br
where 
.Ty qsub_host
is the name of the host on which the qsub command ran when the job was
submitted, 
.Ty $PBS_O_WORKDIR
is replaced by the value of the 
.B PBS_O_WORKDIR
environment variable associated with the job, i.e. the current working directory
of the qsub command.  The remainder of the path name, the default name,
is built by calling
.I default_std() 
described above.
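.LP
As a rough illustration of the resulting string, the sketch below builds a
path of the same form in malloc-ed space.  The function sk_std_path() is
hypothetical and takes the host, working directory, job name and sequence
number as plain arguments rather than pulling them from a job structure.
.Cs
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build host:workdir/jobname.<key><seq>; the caller frees the result. */
char *sk_std_path(const char *host, const char *workdir,
                  const char *jobname, char key, long seq)
{
    /* 32 extra bytes cover the separators, key, digits and terminator */
    size_t need = strlen(host) + strlen(workdir) + strlen(jobname) + 32;
    char *p = malloc(need);

    if (p != NULL)
        snprintf(p, need, "%s:%s/%s.%c%ld",
                 host, workdir, jobname, key, seq);
    return p;
}
.Ce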
.Fn get_jobowner()
.Cs
void get_jobowner(char *from, char *to)
.Ce
.IP Args: 4
.RS
.IP from
string from which the owner is obtained.
.IP to
buffer to which the owner name is returned.
.RE
.LP
This function returns the owner name (or any first part of a string)
stripping off the \*Q@host\*U portion (or any part following and including
a '@' character).  The destination buffer must be large enough to hold
the resulting string,  for a user name this is 
.Sc PBS_MAXUSER +1
characters.
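.LP
The stripping can be sketched as follows.  Unlike the prototype above, the
hypothetical sk_get_owner() takes the size of the destination buffer as a
third argument, an addition made here so the sketch cannot overrun it.
.Cs
#include <stddef.h>

/* Copy from into to, stopping at the first '@' or when to is full. */
void sk_get_owner(const char *from, char *to, size_t tosize)
{
    size_t i = 0;

    if (tosize == 0)
        return;
    while (from[i] != 0 && from[i] != '@' && i + 1 < tosize) {
        to[i] = from[i];
        i++;
    }
    to[i] = 0;          /* always leave a terminated string */
}
.Ce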
.Fn set_deflt_resc()
.Cs
static void set_deflt_resc(attribute *ja, attribute *default)
.Ce
.IP Args: 4
.RS
.IP ja
pointer to the job resource attribute (typically Resource_List).
.IP default
pointer to the queue/server attribute to use as a default.
.RE
.LP
For each resource listed in the default attribute, if the corresponding
resource is unset in the job resource_list, set it to the value in the
default.  Also set 
.Sc ATR_VFLAG_DEFLT
to indicate it is a default value so it will not be passed if the job is
moved to a new queue or server.
.Fn set_resc_deflt()
.Cs
void set_resc_deflt(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job.
.RE
.LP
This public routine is used to set any default Resource_List values for a job.
The function
.I set_deflt_resc()
(very close in name isn't it) is called in turn with: the queue's
resource_default, the server's resource_default, the queue's resource_max, and
the server's resource_max attribute.   
.Fn set_statechar()
.Cs
void set_statechar(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job.
.RE
.LP
The job_state attribute,
.At JOB_ATR_state ,
is set to
.Ty "T, Q, H, W, R,"
or
.Ty E 
depending on the job state in
.Ar ji_substate .
A special case \- if job state is
.Sc JOB_STATE_RUNNING
and the flag
.Sc JOB_SUBSTATE_SUSPEND
is set in
.Ar ji_svrflags ,
the state character is set to
.Ty S .  
This is found only for jobs running under Unicos,  see
.I post_signal_req() .
.Fn eval_chkpnt()
.Cs
static void eval_chkpnt(attribute *jobckp, attribute *queckp)
.Ce
.IP Args: 4
.RS
.IP jobckp
pointer to a job checkpoint attribute.
.IP queckp
pointer to a queue checkpoint attribute.
.RE
.LP
This function is called when a job is enqueued in an execution queue.
It ensures that if the job's checkpoint attribute,
.At JOB_ATR_chkpnt ,
is of the form "c=dddd", then the interval value, dddd, is not less than 
the value of the queue's checkpoint_min attribute,
.At QE_ATR_ChkptMin .
.NH 3
.Tc Request and Reply Functions
.LP
This section covers the functions related to receiving requests and to
issuing requests and replies.  Much of the design and implementation was
mandated by the use of ISODE.
.NH 4
.Ix process_request.c
.LP
The file
.I src/server/process_request.c
contains the top level routine invoked to process a batch request from a
client program as well as some supporting functions.  
.LP
.Fn process_request()
.Cs
void process_request(socket)
.Ce
.IP Args: 4
.RS
.IP socket\ 
is the socket descriptor from which the request is to be read.
.RE
.LP
This function is invoked when
.I accept_conn()
determines that input is available on a socket connected to a client.
The purpose of process_request() is to read in the request and dispatch
it to the appropriate function for processing.
.LP
The server only accepts DIS requests and calls dis_request_read() to read and
decode the request.  Any connection that comes in marked as FromClientASN
causes the server to abort.
Note that MOM only accepts DIS and so only calls dis_request_read().
.LP
If the return from the dis_request_read() routine indicates end-of-file,
the connection is closed by calling a local function,
.I close_netconn() .
If there was a new job being received over the connection, close_netconn
is directed to consider enqueuing it. 
.LP
If the return from the read (isode_request_read()) routine indicates that a
read or system error occurred, the connection is just terminated on the
assumption that a reply would not get through either.  
.LP
If the return from the read routine indicates that the request did not
decode correctly, a reject reply is sent to the client.
.LP
The host from which the request is being sent is determined by calling
.I get_connecthost() .
The client host is authorized against the server's host ACL by calling
.I acl_check() .
.LP
If the client connected to the server on a \*Qreserved\*U port, the standard
socket authorization scheme, we take it as meaning that the client is another
server with full privileges.
Otherwise, the user making the request is authenticated by calling
.I authenticate_user()
and the privileges are established by calling 
.I svr_get_privilege() .
If any authentication or authorization fails, the request is rejected with
the appropriate error code.
.LP
If the server's state is anything other than
.Sc SV_STATE_RUN ,
then certain requests will be rejected.  These usually entail the running of
new jobs or the enqueuing of new jobs.
.LP
Next, the request is dispatched, via
.I dispatch_request() ,
to the appropriate service function based upon request type.
Each service function is required to reply to the request and deallocate
the batch_request structure when processing of the request is completed.
.LP
.Fn dispatch_request()
.Cs
void dispatch_request(sock, request)
.Ce
.IP Args: 4
.RS
.IP sock
the socket over which the request arrived.
.IP request
a pointer to the batch_request structure.
.RE
.LP
The request is dispatched to the appropriate routine for processing.
Any unrecognized request is rejected.
.LP
.Fn alloc_br()
.Cs
struct batch_request *alloc_br()
.Ce
.IP Returns: 4
.RS
.IP pointer
to an allocated batch_request structure.
.RE
.LP
A batch_request structure is allocated and cleared.
The socket descriptor, 
.Av rq_conn ,
is set to -1 to indicate there is no connection.  This is filled in by the
calling routine.  The allocated request structure is linked into the list
of request structures headed in the global variable
.Av svr_requests .
The structure should be freed by calling
.I free_br() .
.LP
.Fn close_client()
.Cs
static void close_client(int socket)
.Ce
.IP Args: 4
.RS
.IP socket
the connection to close.
.RE
.LP
First, the connection is closed by calling
.I close_conn() .
The list of active request structures, headed by the global variable
.Av svr_requests ,
is searched for any with the fields
.Av rq_conn
and
.Av rq_orgconn
equal to the
.At socket
parameter.  If found, the field is set to 
.B -1 
to indicate the connection has been closed and no reply should be returned.
.Fn free_br()
.Cs
void free_br(struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure (allocated by process_request).
.RE
.LP
The batch request structure is unlinked from the list headed by
.Av svr_requests .
The structure and any allocated sub-structures, including the
reply structure, are freed.
This is a place where code will have to be added if new types of requests
are added.
.LP
There are a few routines named 
.I freebr_*()
that are local to this file.
They are called by free_br() depending on the type of request.
.Fn close_quejob()
.Cs
static void close_quejob(int socket)
.Ce
.IP Args: 4
.RS
.IP socket
the socket descriptor of a closed connection.
.RE
.LP
When invoked, this function searches the list of incoming jobs headed by
.Av sv_newjobs
in the server structure.  This list consists of jobs for which a Queue
Job request has been received, but no Commit request.
.LP
When the connection to the sending agent is lost, one of the following actions
is taken.
.IP \(bu
If a Ready to Commit has not been received for the job, the job still belongs
to the sending agent.  The local structure is discarded.
.IP \(bu
If a Ready to Commit has been received, the substate is
.Sc JOB_SUBSTATE_TRANSICM ,
and the job is marked as being created here for the first time, 
.Sc JOB_SVFLG_HERE
is set in
.Av ji_svrflags
in the job structure, then the client is a user 
.B qsub
command.  In this case all the information is at hand and the client is
transitory, so we accept ownership of the job and enqueue it.
.IP \(bu
If the substate is
.Sc JOB_SUBSTATE_TRANSICM
but
.Sc JOB_SVFLG_HERE
is not set, then the job is being transferred from another server.  That
server retains ownership until it sends a Commit.  The defined recovery
process calls for us to just wait for the Commit.  Therefore, we leave the
job as is.
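.LP
The three cases above reduce to a small decision function.  The following
is a sketch under stated assumptions: the SK_ constants, their values, and
the function on_connection_lost() are hypothetical and merely stand in for
the substate and flag definitions in the server's job header.
.Cs
```c
/* Hypothetical values; the real constants are defined by the server. */
#define SK_SUBSTATE_TRANSIN   0     /* no Ready to Commit received yet */
#define SK_SUBSTATE_TRANSICM  1     /* Ready to Commit received        */
#define SK_SVFLG_HERE         0x01  /* job was created on this server  */

enum lost_action { DISCARD_JOB, ENQUEUE_JOB, WAIT_FOR_COMMIT };

/* Choose what to do with an incoming job whose connection was lost. */
enum lost_action on_connection_lost(int substate, int svrflags)
{
    if (substate != SK_SUBSTATE_TRANSICM)
        return DISCARD_JOB;        /* job still belongs to the sender  */
    if (svrflags & SK_SVFLG_HERE)
        return ENQUEUE_JOB;        /* client was qsub; take ownership  */
    return WAIT_FOR_COMMIT;        /* sending server still owns it     */
}
```
.Ce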
.NH 4
.Ix dis_read.c
.LP
The file 
.I src/server/dis_read.c
contains the high level functions to read and decode 
.I "Data Is Strings"
or DIS encoded requests and replies.  The lower level routines that perform the
actual decode are found as
.I decode_*()
routines in libpbs.a and 
.I disr*()
routines in libdis.a.   An advantage of the DIS routines is that the data
may be decoded directly into the server's batch_request structure eliminating
several data copy operations.
.Fn dis_request_read()
.Cs
int dis_request_read(int socket, struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP socket
the socket on which a request has been received.
.IP request
pointer to an allocated batch request structure which will be filled in.
.RE
.IP Returns:
.RS
.IP 0
A request was received and decoded correctly.
.IP -1
EOF received, the client has closed the connection.
.IP positive
PBS error number. 
.RE
.LP
The function
.I DIS_tcp_reset()
is called to reset the read buffer for the
DIS I/O over TCP/IP support routines before the data is read.
This would only be required once for the server as it only uses
TCP/IP.  However MOM uses both TCP/IP and RPP intermixed, so the routines must
be reset each time.
.LP
The request is in three pieces, (1) the header which contains the requestor's
name and the request type, (2) the request body which varies with each type
of request, and (3) the request extension.
.I decode_DIS_ReqHdr()
is called to decode the header.   If it fails or if the protocol type and
version in the header are not recognized, 
.Er PBSE_DISPROTO
is returned.  If decode_DIS_ReqHdr() returns EOF, we also return it (-1).
.LP
Based on the request type contained in the header, a large switch statement
results in calling the decode_*() routine corresponding to the request type.
If an error is returned, it is logged and passed upwards.
.LP
The request extension is decoded by
.I decode_DIS_ReqExtend() .
.Fn DIS_reply_read()
.Cs
int DIS_reply_read(int socket, struct batch_reply *reply)
.Ce
.IP Args: 4
.RS
.IP socket
from which to read the reply.
.IP reply
pointer to a batch reply structure (contained within a batch_request structure).
.RE
.IP Returns:
0 on success, non-zero if error.
.LP
This function simply calls
.I DIS_tcp_reset()
to reset the DIS I/O buffer for TCP/IP and then invokes
.I decode_DIS_replySvr()
to perform the real work.  Any error returned by decode_DIS_replySvr() is
just passed on.
.NH 4
.Ix reply_send.c
.LP
The file
.I src/server/reply_send.c
contains the functions to form an error (or reject) reply and to send a
reply back to the requesting client.
.LP
.Fn set_err_reply()
.Cs
static void set_err_reply(int code, char *msg, struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP code
The error code to return to the client.
.IP msg
pointer to a character buffer in which a message is built.
.IP preq
pointer to the batch request.
.RE
.LP
This routine fills in the basic reply structure within a batch_request.
If the current reply union is other than
.Sc BATCH_REPLY_CHOICE_NULL ,
the structure is freed by calling
.I reply_free() .
.LP
If the error
.Ar code
is
.Er PBSE_SYSTEM ,
then the value of 
.Ar errno
is checked for non-zero and having an associated error message, see perror(3).
If it exists, the message is appended to the text of 
.Ar msg_system
for return to the client.
If the value of
.Ar code 
is any other PBS error, or if 
.Ar code
is less than the base number of PBS errors,
.Sc PBSE_ ,
in which case it is assumed to be a local system error number,
the routine checks whether that error has an associated message.
If there is one, that message is placed into
.Ar msg .
.LP
.Fn reply_send()
.Cs
int reply_send(struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
A pointer to the protocol independent batch request structure which also
contains the reply structure.
.RE
.IP Returns: 4
.RS
.IP 0
If ok
.IP -1
If error
.RE
.LP
The connection socket descriptor is obtained from the request structure.
If the socket descriptor,
.Ar sfds ,
has the value of 
.Sc PBS_LOCAL_CONNECTION ,
then the request being replied to was from this server.  A work task of type
.Sc Deferred_Reply_Local 
and the event equal to the address of the request structure is located
and dispatched by moving the work task entry from the event list to the
immediate list.  [Note, originally dispatch_task() was called directly to
provide immediate processing of the event task.  This resulted in a problem
of what to do when a Register Dependency request was rejected.  The desired end
result is to abort the requesting job, however that cannot be done by the
routine processing the reply if it is called directly because the higher level
routines assume the job will still be around.   By moving the work task
entry to the immediate list and having it dispatched out of the main loop,
all higher level routines have completed their work and we have generalized
the case to match that of the request having gone off host over the net.]
.LP
If the socket descriptor has a positive value,
the request came from a different server.
The reply is encoded by calling
.I dis_reply_write() .
.LP
Note, if the socket descriptor is negative, but not 
.Sc PBS_LOCAL_CONNECTION ,
then this indicates that the connection was closed on End of File back in 
.I process_request() .
In this case, no reply is sent and no error is returned.
.LP
Following either success or failure in sending the reply, the
original batch request/reply structure is freed by calling
.I free_br() .
On an error, a PBS error number is returned.
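.LP
The choice made on the socket descriptor can be summarized in a few lines.
This sketch is illustrative only: the value given to SK_LOCAL_CONNECTION is
a placeholder rather than the server's definition, the function
choose_reply_path() is hypothetical, and treating descriptor zero as a
network socket is an assumption.
.Cs
```c
#define SK_LOCAL_CONNECTION 65535   /* placeholder value, an assumption */

enum reply_path { REPLY_LOCAL, REPLY_NETWORK, REPLY_DROPPED };

/* Decide how a reply is delivered, given the request's descriptor. */
enum reply_path choose_reply_path(int sfds)
{
    if (sfds == SK_LOCAL_CONNECTION)
        return REPLY_LOCAL;     /* move work task to the immediate list */
    if (sfds >= 0)
        return REPLY_NETWORK;   /* encode the reply and write it out    */
    return REPLY_DROPPED;       /* connection closed; nothing is sent   */
}
```
.Ce
.LP
The special value is tested first so that the result does not depend on
whether that value happens to be negative or larger than any real descriptor.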
.Fn reply_ack()
.Cs
void reply_ack(struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch request.
.RE
.LP
This routine returns a success reply to a client.
The reply structure within the request structure is filled in 
with the choice set to 
.Sc BATCH_REPLY_CHOICE_NULL ,
the code to
.Er PBSE_NONE ,
and the auxcode to 0.
The request and reply are then passed to
.I reply_send() .
.LP
.Fn req_reject()
.Cs
void req_reject(int code, int aux, struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP code
The error code to return to the client.
.IP aux
The auxiliary error code.
.IP request
pointer to the batch request.
.RE
.LP
A batch reply structure within the request is filled in by calling
.I set_err_reply() .
The auxcode in the reply is set to the value of
.Ar aux .
Then
.I reply_send()
is called to complete the reply and send it.
.Fn reply_badattr()
.Cs
void reply_badattr(int code, int aux, struct svrattrl *pal,
                   struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP code
The error code to return to the client.
.IP aux
The auxiliary error code.
.IP pal
pointer to the client supplied attributes, in the form of a list of svrattrl.
.IP request
pointer to the batch request.
.RE
.LP
This routine forms an error reply for a request which is being rejected for
an invalid attribute/resource name or value.
The basic reply structure is filled in by calling
.I set_err_reply() .
It is identical to 
.I req_reject()
except that 
.Ar aux
is used as an index into the
.Ar pal
attribute list.   The name of that attribute, and resource name if one, is
appended to the error message.   The main purpose is to identify the offending
attribute/resource to the user.
.Fn reply_text()
.Cs
void reply_text(struct batch_request *request, int code, char *text)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch request structure.
.IP code
The error code to return to the client.
.IP text
The text string to send to the client.
.RE
.LP
Set the code to the supplied value, the auxcode to 0, the type to
text, and copy in as much of the text parameter as will fit.
Then call
.I reply_send() .
.Fn reply_jobid()
.Cs
int reply_jobid(struct batch_request *request, char *jobid, int which)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch request structure.
.IP jobid
the job id string.
.IP which
reply type, the choice discriminator.
.RE
.IP Returns: 4
.RS
.IP 0
No error
.IP error
value if error.
.RE
.LP
This is used to generate and send a reply containing the job id.
It is used to respond to the following requests: Queue Job, Ready to Commit,
and Commit.
.NH 4
.Ix req_getcred.c
.LP
The file
.I src/server/req_getcred.c
contains functions relating to the authentication of a client making batch requests.
.Fn req_getcred()
This function is retained until version 1.1.6 to provide compatibility with
1.1.4 and earlier clients.  In 1.1.6, only the non-credential pbs_iff method
of authentication will be supported in order to remove encryption and allow
export of PBS.
.Fn req_connect()
.Cs
void req_connect(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to a Connection Batch Request.
.RE
.LP
With the removal of encrypted credentials in 1.1.5, the credential type is
.Sc int_BATCH_credentialtype_credential__none
and this routine serves mainly to ensure the connection from 
.I pbs_connect()
to the server has been made before 
.B pbs_iff
is called to authenticate it.
.Fn req_authenuser()
.Cs
void req_authenuser(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to the Authenticated User batch request.
.RE
.LP
This routine forms the server side of the authentication method introduced
in version 1.1.5.  The program
.B pbs_iff
will send over a privileged port the port number of the client.  If this
connection is found by the server and it is not already authenticated, 
the connection
.Ar svr_conn[socket]
is marked with
.Sc PBS_NET_CONN_AUTHENTICATED
and the current time (for historical reasons), and the user and hostname
from the request are saved as the credential in
.Ar conn_credent[socket] .
.NH 3
.Tc Issuing Requests to Other Servers
.LP
When the server must issue a request to another server, the Scheduler, or
MOM, the server cannot wait on the reply; the issuance of the request and
the reception of the reply must be asynchronous events.  This is accomplished
through the use of a work task order.  For each request issued, there is a
work task order created that specifies the function to be called when the reply
is received.  The work task is of type
.Sc Deferred_Reply ,
and it is connected to the reply by having the event set to the socket number
on which the reply will be read.
.LP
Another factor which complicates the process of issuing requests is that
the request may actually be for the local server itself.  For example, a
Register Dependency Request may need to be sent to a different server or
to the local server depending on the location of the parent job.  In order
to remove the decision process about location from the request itself, this
decision is moved into three common functions:
.IP svr_connect()
will return a special value,
.Sc PBS_LOCAL_CONNECTION ,
for the connection handle if the address is local.
.IP issue_request() 
will either connect to a remote server and send it the request,
or the function will dispatch the request locally.  
The decision is based on the value of the connection handle passed to
issue_request().
.IP reply_send()
complements the issue_request() function by either transmitting
the reply to a request to a remote client-server or by directly dispatching
the reply if the request was from the local server.
.LP
Since the ASN.1 data encoding has been removed, only 
.I issue_Drequest() 
is used to issue requests now.  Requests will only use 
.I process_Dreply()
to reply with.
All channels should be marked with
.Sc ToServerDIS .

.LP
.NH 4
.Ix issue_request.c
.LP
The file
.I src/server/issue_request.c
contains the function
.I issue_request()
described in \*QIssuing Requests to Other Servers\*U.
.Fn issue_Drequest()
.Cs
int issue_Drequest(int handle, struct batch_request *request,
                  void (*func)(struct work_task *));
.Ce
.IP Args: 4
.RS
.IP handle
the connection handle for the connection (real or imaginary) to the server.
This is not the socket, but the return from 
.I svr_connect() .
.IP request
the batch request structure.
.IP func
the function to deal with the reply; it is inserted in the work task.
.RE
.IP Returns: 4
.RS
.IP 0
if request sent ok.
.IP Non-zero
if could not deliver the request.
.RE
.LP
If the value of the connection handle is the special value
.Sc PBS_LOCAL_CONNECTION ,
then the request is for the local server itself.  The special value
is saved in the request.
A work task structure is set up with the passed function, the type
.Sc Deferred_Reply_Local ,
and the event being the address of the request structure.  Then
.I dispatch_request() 
is called to pass the request to the correct local processing routine.
The socket number is set to 
.Sc PBS_LOCAL_CONNECTION
to indicate this is a request to the local server.
(When the reply is returned through 
.I reply_send() ,
the work task will be dispatched.)
.LP
If the host is a remote host, the work task is set up with the passed function,
the type
.Sc Deferred_Reply ,
and the event equal to the socket number extracted from the connection handle.
.I DIS_tcp_reset()
is called to reset the write buffer used by the DIS I/O routines.
The request is then passed to the appropriate routine to be encoded and
written on the network.  (Some of these routines reside in the API library,
libpbs.a; others are particular to the server.)  This is handled by calling
.I encode_DIS_ReqHdr() ,
some variant of
.I encode_DIS_*()
depending on the request,
.I encode_DIS_ReqExtend()
and
.I DIS_tcp_wflush()
to complete and write out the request.
.Fn issue_Arequest()
.Cs
int issue_Arequest(int handle, struct batch_request *request,
                  void (*func)(struct work_task *));
.Ce
.IP Args: 4
.RS
.IP handle
the connection handle for the connection (real or imaginary) to the server.
This is not the socket, but the return from 
.I svr_connect() .
.IP request
the batch request structure.
.IP func
the function to deal with the reply; it is inserted in the work task.
.RE
.IP Returns: 4
.RS
.IP 0
if request sent ok.
.IP Non-zero
if could not deliver the request.
.RE
.LP
If the value of the connection handle is the special value
.Sc PBS_LOCAL_CONNECTION ,
then the request is for the local server itself.  The special value
is saved in the request.
A work task structure is set up with the passed function, the type
.Sc Deferred_Reply_Local ,
and the event being the address of the request structure.  Then
.I dispatch_request() 
is called to pass the request to the correct local processing routine.
The socket number is set to 
.Sc PBS_LOCAL_CONNECTION
to indicate this is a request to the local server.
(When the reply is returned through 
.I reply_send() ,
the work task will be dispatched.)
.LP
If the host is a remote host, the work task is set up with the passed function,
the type
.Sc Deferred_Reply ,
and the event equal to the socket number extracted from the connection handle.
The request is then passed to the appropriate routine to be encoded and
written on the network.  (Some of these routines reside in the API library,
libpbs.a; others are particular to the server.)
.Fn process_reply()
.Cs
void process_reply(int sock)
.Ce
.IP Args: 4
.RS
.IP sock
The socket file descriptor from which the reply was read.
.RE
.LP
This function is called by 
.I wait_request()
when a reply to a request is ready to be read.  The call to
.I svr_connect()
typically established process_reply() as the call back function.
.LP
A work task entry on the
.Av task_list_event 
list is located with the event matching the socket.  A pointer to the original
request is in the work task field
.Av wt_parm1 .
The request address, along with the socket, is passed to
.I isode_reply_read
which will decode the reply and insert it into the request.  The work task
is then dispatched.
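.LP
Locating the work task for a socket amounts to a linear search on the event
field.  The sketch below assumes a simplified, hypothetical task structure,
task_sketch, and a plain singly linked list; the server's own work_task
structure and list handling differ.
.Cs
```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct work_task. */
struct task_sketch {
    long wt_event;             /* for a deferred reply: the socket     */
    void *wt_parm1;            /* pointer to the original request      */
    struct task_sketch *next;  /* next entry on the event list         */
};

/* Return the task waiting on sock, or NULL if there is none. */
struct task_sketch *find_task_for_socket(struct task_sketch *head, int sock)
{
    struct task_sketch *p;

    for (p = head; p != NULL; p = p->next) {
        if (p->wt_event == (long)sock)
            return p;
    }
    return NULL;
}
```
.Ce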
.Fn relay_to_mom()
.Cs
int relay_to_mom(pbs_net_t mom, struct batch_request *request, 
                 void (*function)(struct work_task *))
.Ce
.IP Args: 4
.RS
.IP mom
The network address of MOM.
.IP request
pointer to the request which is to be sent to MOM.
.IP function
to be invoked when the reply from MOM is received.
.RE
.IP Returns: 4
.RS
.IP zero 
on success
.IP non-zero
if error, see issue_request().
.RE
.LP
This is a short cut function for transferring an existing or new request to
the Machine Oriented Mini-server, MOM.  A connection is established to the
MOM specified by 
.At mom
and the request is sent by calling
.I issue_request() .
.LP
This may be used to relay a  request received from a client to MOM.
issue_request() will insert the MOM connection socket into the request in
.Av rq_conn
over-writing the socket to the client which will be needed to reply.
Thus, the original socket is saved in the request in
.Av rq_orgconn .
.B Warning:
this value must be restored to
.Av rq_conn
by whatever routine processes the reply data.
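.LP
The save and restore of the client socket might look like the following.
This is a sketch only; the structure relay_req and both helper functions
are hypothetical and show just the two fields involved.
.Cs
```c
/* Hypothetical stand-in holding only the two connection fields. */
struct relay_req {
    int rq_conn;      /* socket currently used for this request   */
    int rq_orgconn;   /* saved socket of the originating client   */
};

/* Before relaying: remember the client socket, then switch to MOM. */
void relay_save_conn(struct relay_req *r, int mom_sock)
{
    r->rq_orgconn = r->rq_conn;
    r->rq_conn = mom_sock;
}

/* In the reply handler: switch back so the client can be answered. */
void relay_restore_conn(struct relay_req *r)
{
    r->rq_conn = r->rq_orgconn;
}
```
.Ce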
.Fn reissue_to_svr()
.Cs
static void reissue_to_svr(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
work task pointer created by issue_to_svr().
.RE
.LP
This routine is called via a time delayed work task entry created by
.I issue_to_svr() .
It attempts to retry sending a request to a remote server via issue_to_svr().
If the retry time limit is exceeded or the new attempt to connect the remote
server fails with no retry possibility, the work task entry will be forwarded
to the post processing routine specified by the function which made the
request.  The
.Ar wt_aux
field of the work task is set to -1 to indicate an error.  Since all the
post processing routines expect a connection handle in 
.Ar wt_event ,
and this event is a time, wt_event is also set to -1.
.LP
If the call to issue_to_svr() was not rejected, this function just returns
and lets the dispatch_task() function free the work task entry.  Note, that
if issue_to_svr() chooses to retry, then a new work task entry is created 
by it.
.Fn issue_to_svr()
.Cs
int issue_to_svr(char *server_name, struct batch_request *preq, 
                 void (*reply_function)(struct work_task *))
.Ce
.IP Args: 4
.RS
.IP server_name
of server where request is to be sent.
.IP preq
pointer to request to send.
.IP reply_function
is the function to be invoked when the request reply is received.
.RE
.IP Returns: 4
.RS
.IP 0
on success.
.IP -1
if hard error.
.RE
.LP
This function is used to send or forward a request to a server.
The server may be remote or it may be this server itself.
It is not typically used to send requests to MOM because different error
processing is required.
.LP
The destination server name is copied into the request and the 
.Ar rq_fromsvr
flag is set to indicate it comes from a server in case the destination server
is this server itself and we use the same structure.
Likewise, permissions are set to manager read/write.
The server name is turned into an address via calls to 
.I parse_servername() 
and
.I get_hostaddr() .
If get_hostaddr() returns 
.I "busy, retry" ,
we will retry later.  Any other error is fatal.
.LP
.I svr_connect() 
is called to obtain a connection to the destination server.
If svr_connect() returns 
.Sc PBS_NET_RC_RETRY ,
we do so.
The handle, request, and post processing function,
.Ar reply_function ,
are included in a call to
.I issue_request() .
.LP
If retry is indicated, a work task entry is created by calling
.I set_task() .
This is a timed entry with a delay of
.Sc PBS_NET_RETRY_TIME
seconds.  
.Fn release_req()
.Cs
void release_req(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
setup by issue_request() and used to dispatch this function.
.RE
.LP
This routine is used as the \*Qreply processor routine\*U when there is no
interest in the content of the reply.  It frees,
.I free_br() ,
the request structure and disconnects,
.I svr_disconnect() ,
from the other server.
It must not be used when the request originated from an outside client,
or the client will not receive the answer.
.NH 4
.Ix svr_connect.c
.LP
The file
.I src/server/svr_connect.c
contains three functions. The function 
.I svr_connect()
is the server's equivalent to the API routine
.I pbs_connect() .
This function is used by the server to establish a connection to a peer
server.  The calling server assumes the role of a client to the peer server.
The function
.I svr_disconnect()
is the server's equivalent to the API routine
.I pbs_disconnect() .
.LP
These two functions bring together the requirements of both the server and its
.I net_server
system of waiting on I/O together with the 
.I connection_handle
used by the API routines such as 
.I _pbs_queuejob() .
This allows the server to asynchronously wait on the reply from the peer server
and use the _pbs_*.c routines of the API.
The connection_handle array is much larger than for the typical client.
.LP
The function
.I parse_servername()
will return the host name section of a server name and the optional service
port section.
.Fn svr_connect()
.Cs
int svr_connect(pbs_net_t hostaddr, int port, void (*function)(int socket),
enum conn_type type)
.Ce
.IP Args: 4
.RS
.IP hostaddr
is a pbs_net_t (unsigned long) containing the Internet address in
network byte order.
.IP port
is the port to which to connect, in network byte order.
.IP function
to be invoked by 
.I wait_request()
when data (a reply) is ready to be read on the connection.  The argument to
the function is the socket.  This function is typically
.I process_reply() .
.IP type
of data encoding for the connection: 
.Sc ToServerDIS .
.RE
.IP Returns: 4
.RS
.IP ">= 0"
is a connection handle for a connection to a remote server.
.IP PBS_LOCAL_CONNECTION
a special value if the destination server is this server.
.IP -1
if an error occurred.
.RE
.LP
If the host address and port number match that of this server, then
.Sc PBS_LOCAL_CONNECTION
is returned.  No physical connection is made, see issue_request().
.LP
Otherwise, the libnet.a routine
.I client_to_svr()
is called to open the connection with the specified host address and port
number.  The socket is added to the svr_conn array by calling
.I add_conn() .
The entry's type is
.Sc General
and the call-back function for data ready to read is
.Ar func .
If func is not null, meaning that a reply will be read, 
.I add_conn()
is called to make func the call back function.
For releases 1.1.9 and 1.1.10, PBS marks the connection with the 
.Ar type
passed in the call.
.LP
The connection handle array used by the API routines has an entry added
and the index into the array is the return value.
An ISODE Presentation Stream is allocated for use by the API routines.
.Fn svr_disconnect()
.Cs
void svr_disconnect(int handle)
.Ce
.IP Args: 4
.RS
.IP handle
the connection handle returned by svr_connect().
.RE
.LP
If the handle is valid,
the ISODE presentation stream is freed, the connect_handle array member
is released, and the socket is closed by calling 
.I net_close() .
Note, a handle of
.Sc PBS_LOCAL_CONNECTION
is greater than the maximum allowed handle index and a handle of -1
indicates the connection is not open.
.Fn socket_to_handle()
.Cs
int socket_to_handle(int socket)
.Ce
.IP Args: 4
.RS
.IP socket
number of the socket.
.RE
.IP Returns: 4
The number of a \*Qconnection handle\*U set up for the socket; -1 if error.
.LP
An unused entry in the connection table,
.Ar connection[] ,
is located and assigned to the socket.  ISODE streams are allocated for it.
.Fn parse_servername()
.Cs
char *parse_servername(char *name, int *service)
.Ce
.IP Args:
.RS
.IP name
the server's name in the form
.Ty hostname[:port] .
.IP service
RETURN: the port number, if specified in name, is returned.  If there is not
a :port in the name argument, *service is unchanged.
.RE
.IP Returns: 4
A pointer to the host name, up to but not including any :port.  
The host name is in static storage and will be overwritten on the next call
to parse_servername().
.LP
The hostname[:port] passed in name is parsed.
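.LP
A minimal sketch of this parsing follows.  It is not the server's code: the
function name parse_name_sketch(), the buffer size SK_NAME_MAX, and the use
of atoi() for the port are assumptions made for illustration.
.Cs
```c
#include <stdlib.h>
#include <string.h>

#define SK_NAME_MAX 255   /* assumed limit on the host name length */

/* Split hostname[:port].  The host name is returned in static storage;
 * *service is written only when a port follows the colon. */
char *parse_name_sketch(const char *name, int *service)
{
    static char buf[SK_NAME_MAX + 1];
    const char *colon = strchr(name, ':');
    size_t len = colon ? (size_t)(colon - name) : strlen(name);

    if (len > SK_NAME_MAX)
        len = SK_NAME_MAX;          /* truncate rather than overflow */
    memcpy(buf, name, len);
    buf[len] = 0;                   /* terminate the copied name     */

    if (colon != NULL && colon[1] != 0)
        *service = atoi(colon + 1);
    return buf;
}
```
.Ce
.LP
Because the result lives in static storage, a caller that needs the host
name after a second call must copy it first.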
.NH 3
.Tc Queue Functions
.LP
.NH 4
.Ix queue_func.c
.LP
The file
.I src/server/queue_func.c
contains general functions for queue structure management.
.Fn que_alloc()
.Cs
queue *que_alloc()
.Ce
.IP Returns: 4
.RS
.IP Pointer
to queue structure created
.IP Null
if unable to create queue.
.RE
.LP
This function is called to create a queue structure in memory.  The space
is allocated and cleared.
The structure is linked into the server list of queues headed in
.Av sv_queues .
The number of queues,
.Av sv_numque ,
is incremented.
Each attribute array entry is set to \*Qunset\*U.
For the attributes in the array of those common to all queues, the attribute
type flag is set.  In the union of attributes that are queue type dependent,
the type flag is not set.
.LP
The queue is marked as modified,
.Av qu_modified
is set to one, but the structure is not written to disk by this routine.
.Fn que_free()
.Cs
void que_free(queue *pq)
.Ce
.IP Args: 4
.RS
.IP pq
Pointer to the queue structure to be freed.
.RE
.LP
Any space allocated to the attributes is freed.
The server count of queues,
.Av sv_numque ,
is decremented and the queue structure is unlinked from the server list.
Then the queue structure itself is freed.
.Fn que_purge()
.Cs
int que_purge(queue *pq)
.Ce
.IP Args: 4
.RS
.IP pq
Pointer to the queue to be removed from the system.
.RE
.IP Returns: 4
.RS
.IP 0
If successful
.IP -1
If error.
.RE
.LP
An error is returned if the queue to be purged owns any jobs.
.LP
The queue save file is unlinked and the queue structure is released by calling
.I que_free() .
.Fn find_queuebyname()
.Cs
queue *find_queuebyname(char *qname)
.Ce
.IP Args: 4
.RS
.IP qname
The name of the desired queue.
.RE
.IP Returns: 4
.RS
.IP Pointer
to the queue structure if found
.IP Null
If no queue found
.RE
.LP
The linked list of the server's queues is searched for one with the given name.
Any
.Ty @server
suffix on the queue name is ignored.
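.LP
The lookup, including the handling of an @server suffix, can be sketched as
below.  The structure que_sketch and the function find_que_sketch() are
hypothetical; the real queue structure and list traversal are different.
.Cs
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the server's queue structure. */
struct que_sketch {
    const char *name;
    struct que_sketch *next;
};

/* Find a queue by name, comparing only the part before any '@'. */
struct que_sketch *find_que_sketch(struct que_sketch *head, const char *qname)
{
    const char *at = strchr(qname, '@');
    size_t len = at ? (size_t)(at - qname) : strlen(qname);
    struct que_sketch *p;

    for (p = head; p != NULL; p = p->next) {
        if (strlen(p->name) == len && strncmp(p->name, qname, len) == 0)
            return p;
    }
    return NULL;
}
```
.Ce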
.Fn get_dfltque()
.Cs
queue *get_dfltque()
.Ce
.IP Returns: 4
.RS
.IP pointer
to the default queue if defined, or NULL.
.RE
.LP
If the server attribute
.At default_queue
is set, and if there is a queue by that name, a pointer to it is returned.
Otherwise, a null pointer is returned.
.NH 4
.Ix queue_recov.c
.LP
The file
.I src/server/queue_recov.c
contains the functions to save and restore a queue structure and its
associated attributes.
.Fn que_save()
.Cs
int que_save(queue *pque)
.Ce
.IP Args: 4
.RS
.IP pque
Pointer to queue structure which is to be saved.
.RE
.IP Returns: 4
.RS
.IP 0
if success
.IP -1
if error
.RE
.LP
If the queue is marked as modified, it is saved to disk.  If not, it
isn't.
.LP
The queue file name is based on the queue name, which is obtained from the
queue structure.  This file is opened.  save_setup() is called to
initialize the save buffer.  The queue structure is written using
save_struct().  
.LP
The queue attributes are saved by calling 
.I save_attr() .
.LP
The save buffer is flushed, save_flush(), and the file is closed.
The queue is marked as not modified.
.LP
The queue's attributes are searched for any of type
.Sc ATR_TYPE_HOSTACL ,
.Sc ATR_TYPE_USERACL ,
or
.Sc ATR_TYPE_GRPACL .
When found, 
.I save_acl()
is called to save the contents of the access control list to its own file.
.Fn que_recov()
.Cs
queue *que_recov(char *filename)
.Ce
.IP Args: 4
.RS
.IP filename
The name of the queue save file.
.RE
.IP Returns: 4
.RS 
.IP "Non null"
queue pointer to the new queue structure upon success.
.IP Null
pointer on failure.
.RE
.LP
The queue structure is allocated and initialized via 
.I que_alloc() .
The file specified is opened.  The basic queue data is read into the
queue structure pointed to by pque.  The attributes are reloaded by calling
.I recov_attr() .
.LP
The queue's attributes are searched for any of type
.Sc ATR_TYPE_HOSTACL ,
.Sc ATR_TYPE_USERACL ,
or
.Sc ATR_TYPE_GRPACL .
When found, 
.I recov_acl()
is called to reload the contents of the access control list from its own file.
.LP
The queue is marked as not modified to prevent an unnecessary rewrite to disk.
.NH 3
.Tc Server Functions
.LP
This section of the IDS covers a collection of modules which contain general
bookkeeping functions for the server.  If they did not fit elsewhere, they
are probably here.
.NH 4
.Ix run_sched.c
.LP
The file
.I src/server/run_sched.c
contains functions used by the Server to contact and command the job Scheduler.
The connection to the Scheduler is a two faced connection, or maybe I should
say it turns on you.   The Server contacts the Scheduler to open the connection
and sends it a schedule command.  This makes the Server a client to the
Scheduler.  But the Scheduler needs to send requests to the Server as a
client.  Thus after sending the command the Server adds the connection to
those from which it accepts requests and the Scheduler sets up the connection
to look like it was created via a call to pbs_connect().
.LP
The schedule command sent from the Server to the Scheduler is a simple
4 byte integer, in network order.   The integer has the value of:
.Sc SCH_SCHEDULE_NEW (1),
.Sc SCH_SCHEDULE_TERM (2),
.Sc SCH_SCHEDULE_TIME (3),
.Sc SCH_SCHEDULE_RECYC (4), 
or
.Sc SCH_SCHEDULE_CMD (5).
Additional commands are planned but not currently supported.
.Fn schedule_jobs()
.Cs
int schedule_jobs()
.Ce
.IP Returns: 4
.RS
.IP -1
Error occurred, could not contact the Scheduler.
.IP \ 0
Scheduler was sent the schedule command.
.IP +1
An unresponded schedule command is already outstanding to the Scheduler, only
one at a time is allowed.
.RE
.LP
This routine is called from the main scheduler loop in
.I pbsd_main() .
If this is the first time the function has been called, the scheduler command
.Sc SCH_SCHEDULE_FIRST
will be sent to the scheduler regardless of the reason it was called.
If
.Av scheduler_sock
is minus one (otherwise it is the socket of the existing connection to the
Scheduler), 
.I contact_sched()
is called to send the command, listed above, to the Scheduler.
The command is found in the external variable
.Av svr_do_schedule .
.Fn contact_sched()
.Cs
static int contact_sched(int command)
.Ce
.IP Args: 4
.RS
.IP command
is the integer command to be sent to the scheduler.
.RE
.IP Returns: 4
.RS
.IP socket
of the connection to the scheduler or -1 if error.
.RE
.LP
The function
.I client_to_svr()
is called to open a connection to the Scheduler at address
.Av pbs_scheduler_addr
and port
.Av pbs_scheduler_port .
Then
.I add_conn()
is called to add the connection to the set to which the server will listen
for requests, and
.I net_add_close_func()
to register the local function
.I scheduler_close()
as the function to be called when the connection closes.  Next
.I put_4byte()
is called to output the command.
.Fn put_4byte()
.Cs
static int put_4byte(int socket, unsigned int command)
.Ce
.IP Args: 4
.RS
.IP socket
connection to the Scheduler.
.IP command
to be sent.
.RE
.IP Returns: 4
.RS
.IP 0
for success, or -1 if error.
.RE
.LP
This function takes the least significant four bytes of the
.Av command ,
places them in network order and writes them on the connection.
It will work for any architecture where the size of an unsigned int is at
least 4 bytes.
.LP
The corresponding routine, get_4byte(), is found in
.I src/scheduler.rules/get_4byte.c .
.LP
The return value is -1 if 4 bytes could not be written on the socket.
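.LP
The byte handling described above might be sketched as follows.  This is an
illustration only, not the PBS source: the names encode_4byte() and
decode_4byte() are hypothetical, and the sketch fills a caller supplied
buffer instead of writing on the connection.
.Cs
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store the least significant four bytes of
 * command in network (big-endian) order, most significant byte first. */
void encode_4byte(unsigned int command, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)((command >> 24) & 0xffu);
    out[1] = (unsigned char)((command >> 16) & 0xffu);
    out[2] = (unsigned char)((command >> 8) & 0xffu);
    out[3] = (unsigned char)(command & 0xffu);
}

/* Hypothetical inverse, as a receiving routine might rebuild the value. */
unsigned int decode_4byte(const unsigned char in[4])
{
    assert(in != NULL);
    return ((unsigned int)in[0] << 24) | ((unsigned int)in[1] << 16) |
           ((unsigned int)in[2] << 8) | (unsigned int)in[3];
}
```
.Ce
.LP
Shifting and masking, rather than copying the bytes of the integer, keeps the
result independent of the byte order of the host.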
.Fn scheduler_close()
.Cs
static void scheduler_close(int socket)
.Ce
.IP Args: 4
.RS
.IP socket
connection which was closed, unused.
.RE
.LP
The variable
.Av scheduler_sock
is set to -1 to indicate to 
.I schedule_jobs()
that the Scheduler connection is terminated.
.LP
If only one job was \*Qrun\*U by the scheduler during the cycle, as shown by
.Av scheduler_jobct
being set to one, then the external (see pbsd_main.c)
.Av svr_do_schedule
is set to 
.Sc SCH_SCHEDULE_RECYC
to call the scheduler again.  A scheduler script may be written to run only
one job per cycle to ensure its newly taken resources are considered by the
scheduler before selecting another job.  In that case, rather than wait a full
cycle before scheduling the next job, we check that one (and only one) job
was run by the scheduler.  If true, then we recycle the scheduler (a committee
decision). 
.NH 4
.Ix geteusernam.c
.LP
The file
.I src/server/geteusernam.c
contains functions to obtain the login name and group under which the job
should be executed and to set the corresponding uid and gid in the job structure.
.Fn geteusernam()
.Cs
static char *geteusernam(job *pjob, attribute *pattr)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job structure.
.IP pattr
pointer to the User_List attribute, either the job's own or the newly modified
one (qalter).
.RE
.IP Returns: 4
.RS
.IP pointer
to the user name.
.RE
.LP
The name is located by trying the following steps in the order listed until
a name is found.
.RS
.IP 1. 4
A 
.Av username@host
in the attribute
.At User-List
with a host name matching the local host name.
.IP 2.
A 
.Av username
in the attribute
.At User-List
with no host name specified; this is the wild card user name.
.IP 3. 4
The username from the job attribute
.At owner-name .
This name is mapped to a local name by calling
.I site_map_user() .
(Remember, the PBS supplied version of site_map_user() just returns the
name given as input.)
.RE
.LP
The 
.At User-List
attribute is of type
.Sc ATTR_TYPE_ARST ,
array of strings.  Each string in the array is of the form
\f5username\f3[\f5@host\f3]\f1.
.LP
The selected name is saved in a static buffer and stripped of any
host name.
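.LP
The selection order above can be illustrated with a small sketch.  It is not
the PBS implementation: select_user() and SK_NAME_LEN are hypothetical names,
the list is passed as a plain array of strings, host names are compared
exactly, and the owner name is assumed to have been mapped already.
.Cs
```c
#include <assert.h>
#include <string.h>

#define SK_NAME_LEN 64   /* hypothetical limit on the name length */

/* Hypothetical sketch: pick the execution user name from a list of
 * "username[@host]" strings.  An entry whose host matches local_host
 * wins; otherwise the first entry without a host (the wild card) is
 * used; otherwise the owner name is the fallback. */
const char *select_user(const char *const *list, int count,
                        const char *local_host, const char *owner)
{
    static char buf[SK_NAME_LEN + 1];
    const char *wild = NULL;
    const char *hit = NULL;
    size_t n;
    int i;

    assert(local_host != NULL && owner != NULL);
    for (i = 0; i < count && hit == NULL; i++) {
        const char *at = strchr(list[i], '@');
        if (at == NULL) {
            if (wild == NULL)
                wild = list[i];            /* candidate for step 2 */
        } else if (strcmp(at + 1, local_host) == 0) {
            hit = list[i];                 /* step 1: host matches */
        }
    }
    if (hit == NULL)
        hit = (wild != NULL) ? wild : owner;   /* step 2, then step 3 */

    /* Save the name, without any "@host" part, in the static buffer. */
    n = strcspn(hit, "@");
    if (n > SK_NAME_LEN)
        n = SK_NAME_LEN;
    memcpy(buf, hit, n);
    buf[n] = 0;
    return buf;
}
```
.Ce
.LP
Because the result lives in a static buffer, a caller of such a routine must
copy the name before calling it again.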
.Fn getegroup()
.Cs
static char *getegroup(job *pjob, attribute *pattr)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job.
.IP pattr
pointer to the group_list attribute, either from the job structure or a
newly modified one (qalter).
.RE
.IP Returns: 4
.RS
.IP pointer
to a string containing the group name, null if one is not specified.
.RE
.LP
This function returns the name of the group under which the job should
execute if one was specified.  The passed attribute,
.At JOB_ATR_grouplst ,
is searched for 
.RS
.IP 1. 4
A name with a host name matching the server host, or
.IP 2.
No host name (the wild card host).
.RE
.LP
If neither is found, a null pointer is returned.
.Fn set_jobexid()
.Cs
int set_jobexid(job *pjob, attribute *attr_array)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure.
.IP attr_array
pointer to array of job attributes, either the actual job's, or if they are
being modified, the newly modified array, see modify_job_attr().
.RE
.IP Returns: 4
.RS
.IP 0 
if successful.
.IP non-zero
error number, if error.
.RE
.LP
The execution uid and gid fields in the job, ji_euid, ji_egid, are set.
The name under which the job should be executed is obtained by calling
.I geteusernam() .
It is called with either the 
.At User_List
attribute from the passed-in attribute array or, if that is unset, the job's
actual working attribute 
.Av ji_wattr[JOB_ATR_userlst] .
.LP
The password entry for the returned name is retrieved.  If there is no entry,
.Er PBSE_BADUSER
is returned.  If
.Sc PBS_ROOT_JOBS
is defined non-zero, a UID of zero is allowed if and only if the job owner
is root@\f2this host\fP.  If
.Sc PBS_ROOT_JOBS
is defined to zero, then a UID of zero is not allowed at all and
.Er PBSE_BADUSER
is returned.
.LP
The job structure, which contains the job owner name and submitting host name,
and the local user name are passed to
.I site_check_user_map()
to see if the user is authorized to execute a job as the selected user.
The user name is placed into the job attribute
.At JOB_ATR_euser .
.LP
For Cray Unicos systems, an additional check is performed.  These systems have
a User Data Base (UDB) which contains permission bits.  Two are of interest
at this point: if either
.Sc PERMBITS_NOBATCH
or
.Sc PERMBITS_RESTRICTED
is set for the user, that user is denied access to the system for batch jobs (or at
all).  The job is aborted with
.Er PBSE_QACESS .
Also for the Cray, if the job account attribute,
.At JOB_ATR_account ,
is not set, the default account id, ACID, is obtained from the UDB entry.
.LP
The routine
.I getegroup()
is called with either the 
.At group_list
member of the passed-in attribute array if set, or the actual
.Av ji_wattr[JOB_ATR_grouplst]
of the job;
the function determines if a group was specified for the execution.
If a group name was specified and the group is not the user's primary group
and the user name is not listed as a member of the specified group,
.Er PBSE_BADGRP
is returned.
If a group was specified, and it was the user's login group, then that is
allowed.
If a group was not specified for this host, then the user's login group is
taken as the default. 
The job attribute
.At JOB_ATR_egroup
is set to the group name; or, when defaulting to the login group and
getgrnam() returns null (no such group), the numerical value (gid) is
converted into a string for JOB_ATR_egroup.
.LP
Also, if the group is the primary group from the
password file, the attribute default value flag,
.Sc ATR_VFLAG_DEFLT
is added to the attribute.  This has special meaning to MOM, see
.I start_exec() ,
and
.I setup_cpyfiles()
in server/req_jobobit.c.
.NH 4
.Ix svr_chk_owner.c
.LP
The file
.I src/server/svr_chk_owner.c
contains functions supporting authorization and authentication checking
of batch requests.
.Fn svr_chk_owner()
.Cs
int svr_chk_owner(struct batch_request *preq, job *pjob)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to the batch request structure.
.IP pjob
pointer to the job structure.
.RE
.IP Returns: 4
.RS
.IP 0
if requesting user is the job owner.
.IP non 0
if not job owner.
.RE
.LP
The user name and host name from the request are mapped to a local name by
.I site_map_user() .
The owner of the job is obtained from the job owner attribute
.At JOB_ATR_job_owner .
The host name from which the job was submitted is obtained from the job by
calling
.I get_orighost() .
It, along with the job owner's name, is mapped via site_map_user().
If the two resulting local names are equal, zero (0) is returned; else
non-zero is returned.
.Fn svr_authorize_jobreq()
.Cs
int svr_authorize_jobreq(struct batch_request *request, job *pjob);
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch request.
.IP pjob
pointer to the job structure.
.RE
.IP Returns: 4
.RS
.IP 0
if the client is authorized to act on the job.
.IP non-zero
if not authorized.
.RE
.LP
The requester or client is authorized to act on a job if the requester is
the job owner, see
.I svr_chk_owner() ,
or has been granted Operator or Manager privileges.
.Fn svr_get_privilege()
.Cs
int svr_get_privilege(char *user, char *host)
.Ce
.IP Args: 4
.RS
.IP user
the name of the user (client).
.IP host
the host from which the request is being made.
.RE
.IP Returns: 4
.RS
.IP (integer)
which is the read/write privilege granted.
.RE
.LP
The function
.I svr_get_privilege()
returns the access privilege granted to the named user.
There are three levels of privilege defined:
.IP User
has no special level of privilege.  A user has the ability to create, alter,
status and delete his/her own jobs.  A user can also status queues and the
server.
.IP Operator
has one level of special privilege.  An operator can alter, status, and
delete any user's jobs, status and alter queues, and status the server.
.IP Administrator
has the highest level of privilege.  An administrator has all the capabilities
of an operator plus the privilege to create and delete queues and alter
the server.
.LP
Any client user is automatically granted \*Quser\*U privilege.  
Administrator and operator privilege is granted on a name at host
basis.
If the user name associated with the host (or wild card) appears in
the server's
.At administrators
or 
.At operators
attribute, then that user is granted the corresponding additional privilege.
.LP
The return value from svr_get_privilege() is the bitwise \*Qor\*U of the following
values which are defined in 
.I attribute.h :
.in +.5i
.nf
user			ATR_DFLAG_USRD | ATR_DFLAG_USWR
operator		ATR_DFLAG_OPRD | ATR_DFLAG_OPWR
administrator	ATR_DFLAG_MGRD | ATR_DFLAG_MGWR
.fi
.in -.5i
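.LP
A minimal sketch of how such a privilege word might be assembled is given
here.  It assumes that the flag bits are combined with a bitwise or and that
operator and administrator bits are added on top of the user bits.  The flag
values, the function names and the \*Quser@*\*U wild card form are
hypothetical; the real flag values are those in attribute.h.
.Cs
```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag values; the real ones are defined in attribute.h. */
#define ATR_DFLAG_USRD 0x01
#define ATR_DFLAG_USWR 0x02
#define ATR_DFLAG_OPRD 0x04
#define ATR_DFLAG_OPWR 0x08
#define ATR_DFLAG_MGRD 0x10
#define ATR_DFLAG_MGWR 0x20

/* Hypothetical matcher: returns 1 if "user@host" or "user@*" is in acl. */
int sk_acl_match(const char *const *acl, int n,
                 const char *user, const char *host)
{
    size_t ulen = strlen(user);
    int i;

    for (i = 0; i < n; i++) {
        const char *e = acl[i];
        if (strncmp(e, user, ulen) != 0 || e[ulen] != '@')
            continue;                      /* different user name */
        if (strcmp(e + ulen + 1, "*") == 0 ||
            strcmp(e + ulen + 1, host) == 0)
            return 1;
    }
    return 0;
}

/* Every client gets the user bits; list membership adds more bits. */
int sk_get_privilege(const char *user, const char *host,
                     const char *const *operators, int nop,
                     const char *const *managers, int nmgr)
{
    int priv = ATR_DFLAG_USRD | ATR_DFLAG_USWR;

    assert(user != NULL && host != NULL);
    if (sk_acl_match(operators, nop, user, host))
        priv |= ATR_DFLAG_OPRD | ATR_DFLAG_OPWR;
    if (sk_acl_match(managers, nmgr, user, host))
        priv |= ATR_DFLAG_MGRD | ATR_DFLAG_MGWR;
    return priv;
}
```
.Ce
.LP
A caller can then test a single bit, for example the operator write bit, to
decide whether a request that alters a queue is permitted.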
.Fn authenticate_user()
.Cs
int authenticate_user(struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the server network independent batch_request structure.
.RE
.IP Returns: 4
.RS
.IP 0
if user is authenticated
.IP <0
if authentication fails.
.RE
.LP
In the basic provided system, the user is authenticated if the user name
and host name provided in the credential match the user name in the request
and the host name determined from the network interface.  The time stamp must
also be current: it may be no more than
.Sc CREDENTIAL_TIME_DELTA
seconds earlier than the local time and no more than
.Sc CREDENTIAL_LIFETIME
seconds later than the local system time.
.Fn chk_job_request()
.Cs
job *chk_job_request(char *jobid, struct batch_request *preq, int sock)
.Ce
.IP Args: 4
.RS
.IP jobid
the job identifier of the job to which the request applies.
.IP preq
pointer to the batch request.
.IP sock
the socket over which any reply is sent.
.RE
.IP Returns: 4
.RS
.IP pointer
to the job, null if error.
.RE
.LP
This function provides the common checks for batch service requests that
apply to existing jobs.  First the job is located; if not found
.Er PBSE_UNKJOBID
is returned to the client.
.LP
If the client is not authorized to make the request against the job, see
.I svr_authorize_jobreq() ,
.Er PBSE_PERM
is returned to the client.
.LP
Finally, if the job is in the exiting state, 
.Er PBSE_BADSTATE
is returned.
.LP
On any error, a reply is sent to the client via
.I req_reject()
and the function returns a null job pointer.  The caller should just return
up the line.
.NH 4
.Ix svr_func.c
.LP
The file
.I src/server/svr_func.c
contains various server support functions.
.Fn encode_svrstate()
.Cs
int encode_svrstate(attribute *pattr, list_head *head, char *name, char *rescn,
                    int mode)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the server state attribute.
.IP head
head of list of encoded attributes, svratrlst, to which to append the attribute.
.IP name
name of the server state attribute.
.IP rescn
resource name, null.
.IP mode
the encode mode.
.RE
.IP Returns: 4
.RS
.IP zero
if successful, non-zero if error.
.RE
.LP
This is a special \*Qat_encode\*U routine for the server state attribute.
It turns the numeric state into the corresponding textual name: Idle, Active,
Scheduling, Terminating, or Terminating Delayed.
.LP
The choice between Idle and Active is made based on the setting of the
.At scheduling
attribute.  If there is a call outstanding to the scheduler (that is, its socket
is not -1), then the state is mapped into Scheduling.
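.LP
One reading of this mapping is sketched here.  The state codes and the
function name are hypothetical, and the sketch assumes that an outstanding
scheduler call takes precedence and that a true scheduling attribute selects
Active; the text above does not settle either point.
.Cs
```c
#include <assert.h>

/* Hypothetical state codes; the real values are in the server headers. */
enum sk_state { SK_STATE_RUN, SK_STATE_TERM, SK_STATE_TERM_DELAY };

/* Map the numeric state to the textual name given to clients. */
const char *sk_state_name(int state, int scheduling, int scheduler_sock)
{
    assert(scheduler_sock >= -1);
    switch (state) {
    case SK_STATE_RUN:
        if (scheduler_sock != -1)
            return "Scheduling";       /* a call is outstanding */
        return scheduling ? "Active" : "Idle";
    case SK_STATE_TERM:
        return "Terminating";
    case SK_STATE_TERM_DELAY:
        return "Terminating Delayed";
    default:
        return "Unknown";              /* not a state named in the text */
    }
}
```
.Ce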
.Fn set_resc_assigned()
.Cs
void set_resc_assigned(job *pjob, enum batch_op op)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job which is being taken into running or exiting state.
.IP op
operator, 
.Sc Incr
or 
.Sc Decr .
.RE
.LP
When a job is being placed into run state or taken out of run state,
this routine is called to update the server attribute
.At SRV_ATR_resource_assn ,
resources used.  This attribute is the sum of certain resource requirements
of jobs in the running state.  The attribute may be useful in scheduling
scripts.  
.LP
If the job is not in state
.Sc JOB_STATE_RUNNING ,
this function just returns.  (It might be called twice if MOM is restarted
after the job terminates.)
For each resource list member which is marked in the resource definition with
.Sc ATR_DFLAG_RASSN ,
that resource limit value is added to or subtracted from the corresponding
resource member of SRV_ATR_resource_assn.
.Fn ck_chkpnt()
.Cs
int ck_chkpnt(attribute *pattr, void *pobject, int mode)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to job checkpoint attribute.
.IP pobject
not used here
.IP mode
not used here
.RE
.IP Returns: 4
a PBS error number or 0 if ok.
.LP
This is the \*Qat_action\*U routine for the job's checkpoint, 
.At JOB_ATR_chkpnt ,
attribute.  Ck_chkpnt is called whenever the checkpoint attribute value is
set or changed.  The routine makes sure the value is proper, equal to
"n", "s", "u", "c", or "c=dddd", where dddd is a number.
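.LP
The value check can be pictured with the following sketch.  The name
chkpnt_value_ok() is hypothetical, the sketch returns a simple yes or no
rather than a PBS error number, and it assumes that at least one digit must
follow \*Qc=\*U.
.Cs
```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical validator: returns 1 if val is "n", "s", "u", "c",
 * or "c=" followed by one or more decimal digits; 0 otherwise. */
int chkpnt_value_ok(const char *val)
{
    const char *p;

    assert(val != NULL);
    if (strcmp(val, "n") == 0 || strcmp(val, "s") == 0 ||
        strcmp(val, "u") == 0 || strcmp(val, "c") == 0)
        return 1;
    if (strncmp(val, "c=", 2) != 0 || val[2] == 0)
        return 0;                      /* not "c=" or no digits given */
    for (p = val + 2; *p != 0; p++)
        if (!isdigit((unsigned char)*p))
            return 0;
    return 1;
}
```
.Ce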
.NH 4
.Ix svr_mail.c
.LP
The file
.I src/server/svr_mail.c
contains the function to send mail to a job's mail list.
.Fn svr_mailowner()
.Cs
void svr_mailowner(job *pjob, char mailpoint, int force, char * text)
.Ce
.IP Args: 4
.RS
.IP pjob
Pointer to the job about which mail is to be sent.
.IP mailpoint
The single character indicating the mail point.
.IP force
flag to force sending the mail.
.IP text
The character string of the message to mail.
.RE
.LP
The 
.I mailpoint
parameter is a single character which identifies the point at which mail
is being sent:
.RS
.IP a
for abort, 
.IP b
for beginning of execution,
.IP e
for exit, and
.IP s
for file staging (in) error.
.RE
.LP
If the
.Av force 
flag is true, the mail message is to be sent.  Otherwise the job attribute 
.At JOB_ATR_mailpnts
is checked to see if the user requested mail at this point.  If not, the
function just returns.
.LP
If mail is to be sent, the function
forks without setting up a work task on the pid as there is nothing to
do when the child exits.
The parent returns to the caller.
.LP
The child process builds the sendmail command in a buffer.
It includes the -f option to specify the \*Qsender's name\*U which is obtained
from the server attribute
.At SRV_ATR_mailfrom ,
\*Qmail_from\*U.
Also included is the mail destination; if the job has a specified
.At JOB_ATR_mailuser 
attribute, that list is used instead of the job owner as the recipient
of the mail.
The command line is passed to 
.I popen()
and the mail headers and body message are written on the pipe.
The headers include a subject phrase based on the mail point.
The child process then exits.
.LP
The server will reap the child and clean up the child_task entry.
.NH 4
.Ix svr_messages.c
.LP
The file
.I src/server/svr_messages.c
has been replaced by
.I src/lib/Liblog/pbs_messages.c
because of a change to log_err() to print messages associated with PBS error
numbers.
.LP
.NH 4
.Ix svr_resccost.c
.LP
The file 
.I src/server/svr_resccost.c
contains functions associated with the
.At resources_cost
attribute and calculating the resource cost of a job.  This attribute and these
functions support the synchronous job starting functions found in 
.I req_register.c .
.LP
It was the original intent to have the resource cost be an integer recorded
in the resource_definition structure itself.  It seemed logical: one
value per definition, why not?  But \*Qthe old atomic set\*U destroys that
idea.  It is necessary to be able to have temporary attributes with their own
values, hence it came down to another linked-list of values.  Each entry
contains the cost value and a pointer to the resource definition structure
to tie the cost to that resource.
.Fn add_cost_entry()
.Cs
static struct resource_cost *add_cost_entry(attribute *pattr,
                                            resource_def *pdef);
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the 
.At resources_cost
attribute.
.IP pdef
pointer to the resource definition structure for the specific resource.
.RE
.IP Returns: 4
.RS
.IP pointer
to the newly created resource cost entry, NULL if error.
.RE
.LP
A new entry is allocated and initialized to zero.
.Fn decode_rcost()
.Cs
int decode_rcost(struct attribute *pattr, char *name, char *rescn, char *val)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the 
.At resources_cost
attribute.
.IP name 
of the attribute.
.IP rescn
the resource name.
.IP val
The cost of the resource (the value).
.RE
.IP Returns: 4
.RS
.IP zero
on success.
.IP non-zero
on error.
.RE
.LP
The resource cost entry for the specified resource is found in the list
headed in the attribute.  If not found, a new one is created by calling
.I add_cost_entry() .
The value string is converted to an integer and inserted in the structure.
.Fn encode_rcost()
.Cs
int encode_rcost(attribute *pattr, list_head *phead, char *atname,
                 char *rsname, int mode)
.Ce
.IP Args: 4
All arguments are standard for an at_encode() routine.
.IP Return: 4
Greater than zero on success, zero if attribute was unset, negative if error.
.LP
For each entry in the resource cost attribute list, a
.Av svrattrlst
entry is created by calling 
.I attrlist_create .
The al_value field is set to the resource cost value and the entry is
linked on the list headed by 
.Av phead .
.Fn set_rcost()
.Cs
int set_rcost(attribute *old, attribute *new, enum batch_op op)
.Ce
.IP Args: 4
.RS
.IP old
the attribute whose value is to be modified.
.IP new
the attribute whose value is the modifier.
.IP op
SET, INCR, or DECR operation.
.RE
.IP Return: 4
zero on success, non-zero on error.
.LP
For each entry in the 
.Av new
attribute, the corresponding value in the
.Av old 
attribute is modified according to the operation.
.Fn free_rcost()
.Cs
void free_rcost(attribute *pattr)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the resource cost attribute which is to be freed.
.RE
.LP
All entries in the list of
.Av resource_cost
structures headed in the attribute are deleted from the list and freed.
.Fn calc_job_cost()
.Cs
long calc_job_cost(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job for which the resource cost is to be calculated.
.RE
.IP Returns: 4
The resource cost of the job.
.LP
The resource cost of the job is the sum of the \*Qper system cost,\*U
.At SVR_ATR_sys_cost ,
and the products of the specified resource costs and their respective
amounts of resources.  To keep the product from becoming too large, for those
resources measured in \*Qsize\*U, the size is converted to 
.I megabytes
before multiplying by the cost, i.e. the cost is in terms of megabytes, not
bytes.
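.LP
The arithmetic can be illustrated as follows.  The structure and function
names are hypothetical simplifications of the job's resource list, and the
sketch assumes that a size is held in bytes and truncated to whole megabytes.
.Cs
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one requested resource. */
struct sk_resc {
    long cost;       /* cost per unit (per megabyte for sizes) */
    long amount;     /* requested amount; bytes when is_size is set */
    int  is_size;    /* non-zero for resources measured in "size" */
};

/* Sum the per system cost and each cost times its amount. */
long sk_job_cost(long sys_cost, const struct sk_resc *r, int n)
{
    long total = sys_cost;
    int i;

    assert(n == 0 || r != NULL);
    for (i = 0; i < n; i++) {
        long amt = r[i].amount;
        if (r[i].is_size)
            amt /= 1024L * 1024L;      /* bytes to whole megabytes */
        total += r[i].cost * amt;
    }
    return total;
}
```
.Ce
.LP
With a system cost of 5, a resource of cost 3 and amount 4, and a 10 megabyte
size resource of cost 2, the sketch gives 5 + 12 + 20 = 37.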
.NH 4
.Ix svr_task.c
.LP
The file
.I src/server/svr_task.c
contains the server functions for maintaining the list
of deferred services such as route retry, job waiting, and completion of
batch requests that depend on communication with other processes, e.g. MOM.
.LP
The tasks fit into one of three major types:
.IP Immediate 15
Tasks which the server should act upon immediately. 
Many entries are placed into this list by
.I pbsd_init()
during server recovery.
.IP Time
Tasks which are deferred to a specific time (in the future).
Jobs in the Wait state have a task entry of this type.
.IP Event
Tasks which are deferred to the occurrence of a specific (external) event.
All child processes are recorded by this type of task entry as are batch
requests which depend on the response of another process.
.LP
The deferred tasks are recorded in a work task structure.
All tasks of the same type are linked together in lists headed in the
global variables 
.Ar task_list_immed ,
.Ar task_list_time ,
or
.Ar task_list_event .
An entry is created and added to the appropriate list by
.I set_task() .
An entry is removed by calling 
.I delete_task() .
.LP
Note that set_task() returns a pointer to the work task entry.  This is often
used to add the entry to a list headed in the structure referenced by
.Ar wt_parm1 .
Wt_parm1 is often a pointer to a structure, such as a job structure.
This pointer is typically used by the function invoked by the task dispatcher.
If it is at all possible that the \*Qpointed to\*U structure could be freed
before the work task is acted on, the list of work tasks in the structure is
used to delete the work task along with the \*Qpointed to\*U structure.
The caller of 
.I set_task()
.B MUST
add the work task entry to the structure's list of work tasks.
.LP
The tasks on the immediate and timed list are processed in the main server
loop.  Events on the event list are either processed when the event is
detected or shortly thereafter by moving the event to the immediate list.
.QP
WARNING:
.br
You should never move an entry from one list to another in a signal
handler as you cannot be sure of the state of the links.  
.LP
.Fn set_task()
.Cs
struct work_task *set_task(enum work_type type, long event_id,
                           void (*func)(struct work_task *),
                           void *parm1)
.Ce
.IP Args: 4
.RS
.IP type
The type of task. 
.IP event_id
An identifier to relate this task with a specific event.
.IP func
The function to perform the task.
.IP parm1
The parameter to be saved in wt_parm1 in the work task entry.
.RE
.IP Returns: 4
.RS
.IP pointer
to the allocated work task entry.
.IP Null
if an error occurred and no entry was allocated.
.RE
.LP
A work task entry is allocated and initialized with the data passed as 
arguments to this function.  The entry is added to one of the three
lists maintained by the server depending on the event type: immediate,
timed, or external event.
If the additional parameter entries in the work task entry 
.Ar wt_parm2 
and
.Ar wt_aux
are meaningful to the invoked function,
.B "the caller of"
.I set_task()
.B "must initialize them" .
.LP
The function assigned to process the task,
.Ar func() ,
must take one argument, a pointer to the work task entry.
.Fn dispatch_task()
.Cs
void dispatch_task(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
pointer to a work task entry to dispatch.
.RE
.LP
The work task entry is unlinked from both the main server list and
the optional (job) structure list.
If specified, the function in the task entry is called and passed a pointer to
the work task itself.
When the function returns, the work task entry is freed.
.Fn delete_task()
.Cs
void delete_task(struct work_task *ptask)
.Ce
.IP Args: 4
.RS
.IP ptask
pointer to the task entry to clear.
.RE
.LP
The task entry is unlinked from its list(s) and freed.
.NH 4
.Ix list_link.c
.LP
The file
.I src/server/list_link.c
contains routines for maintenance of a doubly linked list.
The list is linked through a structure
.I list_link
in each entry.
The list is headed by a 
.I list_head 
structure (nothing more than another list_link).
Each link is contained in a
.I list_link
structure.  In addition to forward and backward pointers, the list_link
structure contains a pointer to the parent structure which contains
the link.  This allows a structure to have multiple list_link structures
and to reside in multiple lists.  In the head entry, this pointer to the
parent structure is a NULL pointer.  This allows the end and head of the
list to be recognized.  NEVER, NEVER, allow the parent structure pointer in
a list member TO BE NULL; or the parent structure pointer in the head
structure TO BE NOT NULL; or the next and prior pointers in the head to be
NULL!
.LP
The definitions of the link structures are contained in the file
.I include/list_link.h .
Also defined in the header file are the following macros:
.IP \s-2CLEAR_HEAD()\s+2
which clears a list head structure including the parent structure pointer.
.IP \s-2CLEAR_LINK()\s+2
which clears the next and prior members of a list link structure.
.IP \s-2GET_NEXT()\s+2
which returns the address of the parent structure of the next item in the list.
A NULL pointer is returned if the end of the list is reached.
.IP \s-2GET_PRIOR()\s+2
which returns the address of the parent structure of the previous item in
the list.
A NULL pointer is returned if the head of the list is reached.
.LP
.Fn insert_link()
.Cs
void insert_link(struct list_link *old, struct list_link *new,
                 void *pnewobj, int position)
.Ce
.IP Args: 4
.RS
.IP old
Pointer to a list_link entry already in the list or the head structure.
.IP new
Pointer to the list_link sub-structure in the new entry.
.IP pnewobj
Pointer to the parent structure, holding the new list_link sub-structure.
.IP position
If 0, then the new entry is added before the old, else it is added afterwards.
.RE
.LP
The new entry is added to the list either before or after the old entry
depending on the setting of position.
Note, if the old entry is the list head, inserting \*Qafter\*U makes the new
entry the first in the list; inserting \*Qbefore\*U makes the new entry the
last in the list.
.LP
The seemingly extra parameter, pnewobj, is a pointer to the parent structure of
the list_link sub-structure.  If the list_link could always be the first
member of the parent structure, this would not be needed.  However, to allow
for the structure to be in multiple lists, this extra parameter is required.
The links always point to the top of the parent structure, allowing other
members to be addressed.
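.LP
The insertion rules, and the reason for the parent pointer, can be seen in
the sketch below.  The structure layout and all names starting with sk_ are
hypothetical stand-ins for those in include/list_link.h; the sketch assumes
that an empty head points at itself.
.Cs
```c
#include <assert.h>
#include <stddef.h>

/* Simplified link; member names are hypothetical. */
struct sk_link {
    struct sk_link *next;
    struct sk_link *prior;
    void           *parent;   /* NULL only in the list head */
};

/* Example entry whose link is not its first member. */
struct sk_item {
    int            value;
    struct sk_link link;
};

/* An empty list is a head whose links point at itself. */
void sk_clear_head(struct sk_link *head)
{
    head->next = head;
    head->prior = head;
    head->parent = NULL;
}

/* position 0 inserts before old, anything else inserts after it. */
void sk_insert(struct sk_link *old, struct sk_link *new_link,
               void *pnewobj, int position)
{
    assert(pnewobj != NULL);           /* members must carry a parent */
    new_link->parent = pnewobj;
    if (position == 0) {               /* before old */
        new_link->next = old;
        new_link->prior = old->prior;
        old->prior->next = new_link;
        old->prior = new_link;
    } else {                           /* after old */
        new_link->prior = old;
        new_link->next = old->next;
        old->next->prior = new_link;
        old->next = new_link;
    }
}

/* Parent of the next entry, or NULL when the head is reached. */
void *sk_get_next(const struct sk_link *pos)
{
    return pos->next->parent;
}
```
.Ce
.LP
Inserting after the head places an entry first, inserting before the head
places it last, and walking with sk_get_next() stops at the head because its
parent pointer is NULL.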
.Fn append_link()
.Cs
void append_link(struct list_head *head, struct list_link *new,
                 void *pnewobj)
.Ce
.IP Args: 4
.RS
.IP head
Pointer to (address of) the list_head structure.
.IP new
Pointer to the list_link sub-structure in the new entry.
.IP pnewobj
Pointer to the parent structure containing the new list_link structure.
.RE
.LP
The new entry is appended to the end of the list.
.Fn delete_link()
.Cs
void delete_link(struct list_link *old)
.Ce
.IP Args: 4
.RS
.IP old
Pointer to the entry to be deleted from the list.
.RE
.LP
The entry is removed from the list.  The forward and back link pointers in
the old entry are set to point to itself.  Otherwise the old entry is not
disturbed.
.Fn swap_link()
.Cs
void swap_link(list_link *one, list_link *two)
.Ce
.IP Args: 4
.RS
.IP one
pointer to one entry in a list.
.IP two
pointer to another entry in the same list.
.RE
.LP
This routine swaps the positions in a list of two members of the list.
If the two members are adjacent, one is moved after the other.
Otherwise, each entry is unlinked and relinked after the entry ahead of
the other.
.Fn is_linked()
.Cs
int is_linked(list_head *head, list_link *entry)
.Ce
.IP Args: 4
.RS
.IP head
Pointer to head of list.
.IP entry
Pointer to list_link structure in question.
.RE
.IP Returns: 4
.RS
.IP 1
if the entry is in the list headed by head.
.IP 0
if the entry is not in the list.
.RE
.LP
This function walks the list until it encounters the entry in question
or reaches the end of the list.
.Fn list_move()
.Cs
void list_move(list_head *from, list_head *to)
.Ce
.IP Args: 4
.RS
.IP from
pointer to a list_head.
.IP to
pointer to a list_head.
.RE
.LP
The list headed by
.Ar from
is moved to be headed by
.Ar to 
instead.
The list head
.Ar from
is cleared.
The whole thing is just ensuring that the pointers in the head and tail list
elements point to the correct list_head structure.
.NH 4
.Ix accounting.c
.LP
The file
.I src/server/accounting.c
contains routines for the creation of the server accounting file.
.Fn acct_job()
.Cs
static char *acct_job(job *pjob, char *buf)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job for which the accounting record is to be written.
.IP buf
pointer to a buffer in which the record is built.  It must be big enough.
.RE
.IP Returns:
pointer to next available byte in buffer.
.LP
This private routine is used by
.I account_jobstr()
and 
.I account_jobend()
to add the following information to the accounting record being built in
buffer: user, group, account, job name, session id, job creation time,
job queued time, time when the job became eligible for execution,
the time the job started execution, and the job resource requirements.
.Fn acct_open()
.Cs
int acct_open(char *filename)
.Ce
.IP Args: 4
.RS
.IP filename
of the accounting file to be opened.
.RE
.IP Returns: 4
zero on success, -1 if error.
.LP
Calling acct_open() with a null pointer requests that the default account file,
based on the current day, be opened.
The file will be switched each day with the first record after midnight, see
.I account_record() .
.LP
Calling it with a pointer to a null string, as from a -A "", is a direction
not to open a file.  This, in effect, turns off account recording.
Calling acct_open() with a full path name turns off switching to a new file
each day.
.Fn acct_close()
.Cs
void acct_close()
.Ce
.LP
Closes the accounting file if open.
.Fn account_record()
.Cs
void account_record(char type, job *pjob, char *text)
.Ce
.IP Args: 4
.RS
.IP type
of record
.IP pjob
pointer to job
.IP text
to append to record.
.RE
.LP
This function formats and records the basic record.  The supplied text is
appended to the date time stamp, type character, and job id.
.LP
If automatic file switching is on (using default file name) and the current
day is not the same day as the day the file was opened, then the file
is closed,
.I acct_close() ,
and opened anew,
.I acct_open() .
.Fn account_jobstr()
.Cs
void account_jobstr(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job
.RE
.LP
This function builds the text part of a job start (of execution) record.
The function
.I acct_job()
is used to list the basic information about the job.  Then
.I account_record() 
is called.
.Fn account_jobend()
.Cs
void account_jobend(job *pjob, char *used)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job
.IP used
text about the resources which were used by the job.
.RE
.LP
This function builds the text part of a job end (of execution) record.
The function
.I acct_job()
is used to list the basic information about the job.  Then
the information from 
.I req_jobobit()
about resource usage is appended.  Last,
.I account_record() 
is called.
.NH 3
.Tc Node Functions
.LP
The functions in this section deal with 
.B Node
resources.  The functions include allocating, reserving, and freeing.
.NH 4
.Ix node_manager.c
.LP
The file
.I src/server/node_manager.c
contains functions that
.IP (1)
Deal with nodes as resources: allocating, reserving, and freeing.
.IP (2)
Handle Server to MOM communication used to track the state of the nodes.
.LP
.Fn write_node_state()
.Cs
void write_node_state()
.Ce
.LP
This routine writes the node state file 
.Sc NODE_STATUS 
which is
.Ty PBS_HOME/server_priv/node_status .
If the file is not already open, it is opened.  If already opened, the file
is truncated to zero length.
.LP
The file is written as the node name and the state as an integer.  
Only those nodes which are marked 
.I off-line
are recorded in this file.  If a node is allocated to a job, that is determined
by the recovered job attributes.   If a node is down, that is discovered when
the server cannot communicate with the node.
.Fn free_prop()
.Cs
static void free_prop(struct prop *proplist)
.Ce
.IP Args: 4
.RS
.IP proplist
A pointer to a linked list of properties of a node.
.RE
.LP
A property is just a string, which may be descriptive of some characteristic
of the node, assigned by the Batch Administrator.   Zero or more may be
assigned via the node description file, see 
.I setup_nodes() .
.LP
This routine frees the structures used to hold the property strings.
.Fn node_unreserve()
.Cs
void node_unreserve(resource_t handle)
.Ce
.IP Args: 4
.RS
.IP handle
A resource handle used to identify a set of reserved resources.
.RE
.LP
This function releases the reservation on a set of nodes.  The reservation
is identified by
.Ar handle .
If
.Ar handle
is the special value
.Sc RESOURCE_T_ALL ,
then all reserved resources are released.
.Fn hasprop()
.Cs
static int hasprop(struct pbsnode *node, struct prop *props)
.Ce
.IP Args: 4
.RS
.IP node
pointer to a single pbsnode structure
.IP props
A list of properties, some or all of which are marked as "needed"
.RE
.IP Returns:
One if the node has the "needed" properties, zero if not.
.LP
For each "needed" property in the 
.Ar props
list, check the property list of the specified 
.Ar node .
If all needed properties are in the node's property list, return 1; else
return 0.
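.LP
The check amounts to a nested walk over two lists, as in this sketch.  The
structure sk_prop and its member names are hypothetical simplifications, and
the node is represented only by its property list.
.Cs
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified property entry; member names are hypothetical. */
struct sk_prop {
    const char     *name;
    int             needed;   /* non-zero if the request requires it */
    struct sk_prop *next;
};

/* Returns 1 if every needed entry in props names a property present
 * in the node's own list, 0 otherwise. */
int sk_hasprop(const struct sk_prop *node_props, const struct sk_prop *props)
{
    const struct sk_prop *need;
    const struct sk_prop *have;

    for (need = props; need != NULL; need = need->next) {
        if (!need->needed)
            continue;                  /* only needed entries count */
        assert(need->name != NULL);
        for (have = node_props; have != NULL; have = have->next)
            if (strcmp(have->name, need->name) == 0)
                break;
        if (have == NULL)
            return 0;                  /* a needed property is missing */
    }
    return 1;
}
```
.Ce
.LP
An empty request list is satisfied by any node, since there is nothing
needed to look for.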
.Fn "mark() nodes"
.Cs
static void mark(struct pbsnode *node, struct prop *props)
.Ce
.IP Args: 4
.RS
.IP node
pointer to a single pbsnode structure
.IP props
a list of properties
.RE
.LP
For each property in
.Ar props ,
mark that property in the
.Ar node
property list.
.Fn "search() nodes"
.Cs
int search(struct prop *glorf, int skip, int order, int depth)
.Ce
.IP Args: 4
.RS
.IP glorf
a property list
.IP skip
a bit mask, if bits match those in the 
.I inuse
field of the node, that node is skipped (ignored)
.IP order
the position or order of the needed node in the user's specification
.IP depth
the limit of the depth of the recursive search - used to limit the search time
.RE
.IP Returns:
One if the nodes are available - the nodes are marked in the
.Av flag
field with the
.Sc thinking
flag.  If the nodes are not available, zero is returned.
.LP
This function looks for a node which contains the properties given
in the list
.Ar glorf .
The parameter
.Ar skip
is a bit mask that selects which in-use nodes are ignored during the search.
The parameter
.Ar order
is the order of this particular node in the user's specification.
First, the node list is searched for one with the
given properties.  If one is found, it is marked "thinking" and
a 1 is returned.  If not, the nodes which are marked "thinking"
are searched.
If one is found with the given properties, mark it "conflict"
and call
.I search()
recursively to find a node with the properties being used by the
conflict node.  If one is found, return 1.  If this second loop
finishes without finding a match, return 0.  The depth of recursive
calls is limited by the parameter
.Ar depth .
.Fn number()
.Cs
static int number(char **ptr, int *num)
.Ce
.IP Args: 4
.RS
.IP ptr
pointer into a node specification string; it is updated
.IP num
RETURN: the integer found is returned in the location pointed to
.RE
.IP Returns:
.RS
.IP 0
A valid integer was found in the node spec location pointed to by ptr
.IP 1
No integer found
.IP -1
An integer of value zero was found, not legal in the node spec
.RE
.LP
The next token in the node spec is checked to see if it is an integer.
The pointer to the node spec is updated to point beyond the integer.
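.LP
The return convention above can be illustrated with a short sketch.  The
name number_sketch is hypothetical, overflow of very long digit strings is
not handled, and advancing the pointer even when the value is zero is an
assumption of this example.
```c
#include <ctype.h>

/*
 * Sketch of the documented contract: 0 = integer found, 1 = no integer,
 * -1 = integer found but its value is zero.
 */
static int number_sketch(char **ptr, int *num)
{
    char *p = *ptr;
    int   value = 0;

    if (!isdigit((unsigned char)*p))
        return 1;                   /* next token is not an integer */

    while (isdigit((unsigned char)*p)) {
        value = value * 10 + (*p - '0');
        p++;
    }

    *ptr = p;                       /* point beyond the integer */
    if (value == 0)
        return -1;                  /* zero is not legal in a node spec */
    *num = value;
    return 0;
}
```
.LP
Given the spec 12:fast, the sketch stores 12 in num, leaves the pointer at
the colon, and returns 0.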
.Fn "property() of node"
.Cs
static int property( char **ptr, char **prop)
.Ce
.IP Args: 4
.RS
.IP ptr
pointer into a node specification; will be updated
.IP prop
RETURN: pointer to the located valid property name in the node spec
.RE
.IP Returns:
Zero if a valid property name was the next token, 1 if not.
.LP
To be a legal property name, the first character of its name must be
alphabetic, and the remaining characters must be alphanumeric, '-', or '.'.
The next token in the node spec is checked to see if it is a legal
property name.  The pointer to the node spec is updated to point beyond
the name.
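.LP
The naming rule above can be sketched as a small scanner.  The name
property_sketch is hypothetical, and the sketch assumes the caller uses the
updated pointer to find where the name ends, since the name is not
terminated in place.
```c
#include <ctype.h>

/*
 * Return 0 and set *prop to the start of the name when the next token is
 * a legal property name, else return 1 and leave *ptr unchanged.
 */
static int property_sketch(char **ptr, char **prop)
{
    char *p = *ptr;

    if (!isalpha((unsigned char)*p))
        return 1;                   /* first character must be alphabetic */

    *prop = p;
    p++;
    while (isalnum((unsigned char)*p) || *p == '-' || *p == '.')
        p++;

    *ptr = p;                       /* point beyond the name */
    return 0;
}
```
.LP
With the spec big-mem.v2:fast the scan stops at the colon; a token that
begins with a digit is rejected.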
.Fn proplist()
.Cs
static int proplist(char **str, struct prop **list)
.Ce
.IP Args: 4
.RS
.IP str
A pointer into a node specification, updated
.IP list
RETURN: a pointer to a new property list is returned
.RE
.IP Returns:
Zero on success, 1 otherwise.
.LP
Starting at
.Ar str ,
the next element in a supplied node spec is checked by calling
.I property()
to see if it is a property name.   If it is, a new element in a generated
property list is allocated and filled in.  If it is not a valid property name,
1 is returned.  The processing is stopped at the first invalid property name
or at the colon, ":", that ends the node spec section.
.Fn listelem()
.Cs
static int listelem(char *str, int order)
.Ce
.IP Args: 4
.RS
.IP str
pointer into a node specification, it is updated
.IP order
The order of this node spec within the total specification
.RE
.IP Returns:
.RS
.IP 1
if the node spec can be satisfied
.IP 0
if the node spec cannot be completely satisfied
.IP -1
if the node spec can never be satisfied.
.RE
.LP
This function handles a single node specification.  It
checks for a leading number,
.I number() ,
followed by a sequence of properties,
.I proplist() ,
and creates a list for each one.
.LP
The number of nodes in the total pool which have the required set of properties
is counted by calling
.I hasprop()
on each node.  If the number of nodes with the properties is less than the
requested number, -1 is returned.  If sufficient nodes are available,
+1 is returned.  
.LP
If neither of the above cases is true, an additional search is made via
.I search() ,
ignoring none of the nodes (checking allocated/down ones)
to see if the request could be satisfied if all were free.
.Fn "mod_spec() nodes"
.Cs
static char *mod_spec(char *spec, char *global)
.Ce
.IP Args: 4
.RS
.IP spec
pointer to a node specification
.IP global
pointer to a property to add
.RE
.IP Return:
a pointer to a modified node specification string
.LP
The property given by
.Ar global
is appended to each node specification section within the
.Ar spec .
For example, with a global value of
.Ty general
and a node spec of
.Ty 2:propA:propB+3:propC
a new spec of
.Ty 2:propA:propB:general+3:propC:general
is returned.
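.LP
The transformation in the example above can be sketched as follows.  The
name mod_spec_sketch and the choice to return newly allocated memory that
the caller frees are assumptions of this illustration.
```c
#include <stdlib.h>
#include <string.h>

/*
 * Append ":" + global to every "+"-separated section of spec.
 * Returns a newly allocated string the caller must free(), or NULL if
 * allocation fails.
 */
static char *mod_spec_sketch(const char *spec, const char *global)
{
    size_t sections = 1;
    const char *s;
    char *out;
    char *o;

    for (s = spec; *s; s++)
        if (*s == '+')
            sections++;

    out = malloc(strlen(spec) + sections * (strlen(global) + 1) + 1);
    if (out == NULL)
        return NULL;

    o = out;
    for (s = spec; *s; s++) {
        if (*s == '+') {            /* end of a section: add the global */
            *o++ = ':';
            strcpy(o, global);
            o += strlen(global);
        }
        *o++ = *s;
    }
    *o++ = ':';                     /* the last section has no "+" */
    strcpy(o, global);
    return out;
}
```
.LP
Called with the spec 2:propA:propB+3:propC and the global value general,
the sketch produces 2:propA:propB:general+3:propC:general.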
.Fn nodecmp()
.Cs
int nodecmp(void *aa, void *bb)
.Ce
.IP Args: 4
Both aa and bb are pointers to pbsnode structures
.IP Returns:
The comparison relationship is returned
.LP
This routine is passed as the comparison function for the general C lib
sort routine.  It orders nodes by
.IP - 
free nodes first if the global variable
.Ar exclusive
is set, or
.IP -
shared nodes first if exclusive is not set.
.LP
When assigning nodes, we want to assign matching free nodes for exclusive use
and match nodes already shared for shared use.
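.LP
A comparison function with this ordering might look like the sketch below.
The structure sk_node, the SK_INUSE bit values, and the function names are
hypothetical; only the ordering rule is taken from the text.
```c
#include <stdlib.h>

/* Hypothetical state values; the real INUSE_* values are not given here. */
#define SK_INUSE_FREE     0x00
#define SK_INUSE_JOB      0x01
#define SK_INUSE_JOBSHARE 0x02

struct sk_node {
    const char *name;
    int         inuse;
};

static int exclusive = 1;   /* stands in for the server's global flag */

/* Rank a node: 0 sorts first, 1 sorts later. */
static int sk_rank(const struct sk_node *n)
{
    if (exclusive)
        return n->inuse == SK_INUSE_FREE ? 0 : 1;
    return (n->inuse & SK_INUSE_JOBSHARE) ? 0 : 1;
}

/* Comparison function in the form expected by qsort(). */
static int nodecmp_sketch(const void *aa, const void *bb)
{
    const struct sk_node *a = aa;
    const struct sk_node *b = bb;

    return sk_rank(a) - sk_rank(b);
}
```
.LP
An array of nodes would then be ordered with
qsort(nodes, count, sizeof nodes[0], nodecmp_sketch) after setting
exclusive for the request at hand.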
.LP
.Fn node_spec()
.Cs
int node_spec(char *str, int early)
.Ce
.IP Args: 4
.RS
.IP str
pointer to node specification string
.IP early
flag to quit test early
.RE
.IP Returns:
.RS
.IP >0
number of nodes required to meet spec, if they are available
.IP 0
if cannot currently be satisfied
.IP -1
if cannot ever be satisfied
.RE
.LP
We assume unless the key word
.Ty shared
is found that the node request is for exclusive allocation, so the global
variable
.Ar exclusive 
is set by default.
If any 
.I global
properties are specified at the end of the spec, they are checked for the
key word
.Ty shared ;
if found, 
.Ar exclusive
is cleared.
.LP
.I ctnodes()
is used to determine the total number of nodes specified in the spec.  If
that is greater than the total number of nodes, we bail out with a -1.
.LP
The nodes are sorted by free or shared depending on the setting of exclusive.
The
.Ar flag
field of each node is cleared to 
.Sc okay .
If the node is free, it is counted in a count of nodes,
.Ar svr_numnodes .
.LP
The node spec is checked by calling 
.I listelem() 
which also tentatively allocates nodes matching the subspec by marking them
.Ty thinking .
.LP
If any node is marked for allocation with 
.Ty thinking ,
but is not available to the job (already in use), then 
.I search()
is used to attempt to find a replacement.   This may entail giving up a node
already marked thinking which matches the empty spec and finding a replacement
node for the one surrendered.  A complex problem.
.Fn setup_nodes()
.Cs
int setup_nodes()
.Ce
.IP Returns: 4
zero on success, -1 otherwise.
.LP
Open and read the 
.Ty (PBS_HOME)/server_priv/nodes
file.  Allocate structures for pbsnodes and props as required.
The total number of nodes in the file is maintained in 
.Ar svr_numnodes .
.LP
Each primary host name is validated by calling gethostbyname().
The IP address for the node is recorded in the node structure.
.LP
The state of each node is initialized to
.Sc INUSE_UNKNOWN
until the server is able to check with pbs_mom on that node.
.Fn set_nodes()
.Cs
int set_nodes(job *pjob, char *spec, char **rtnlist)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to a job to which nodes are to be assigned
.IP spec
node specification required by that job
.IP rtnlist
RETURN: a list of allocated nodes (if possible) will be returned here.
.RE
.IP Returns:
Zero if ok, or a PBS error number if not.
.LP
This function allocates nodes to a job.   The requirement is given in the
node specification
.Ar spec .
.LP
The nodes to allocate are chosen by calling
.I node_spec() .
If the return indicates the request cannot be satisfied currently,
.Er PBSE_RESCUNAV
(temporarily unavailable) is returned; if the return from node_spec()
indicates the request can never be satisfied,
.Er PBSE_BADATVAL
is returned.
.LP
If
.Ar exclusive
is set, the number of allocated nodes is deducted from the total number 
available,
.Ar svr_numnodes .
Each node selected by node_spec() is marked in the 
.Ar flag
field with the flag
.Sc thinking ,
and each of those nodes is marked as being allocated to the job either as
shared,
.Sc INUSE_JOBSHARE ,
or exclusively
.Sc INUSE_JOB .
A pointer to the job is linked into the node structure.  Note, a shared node
may be allocated to more than one job.
.LP
The list of nodes is ordered to match the specification given.  This was 
carried around in the
.Ar order 
field.   The list is a string of the form:
.Ty node1+node2+node3+...
.Fn node_avail()
.Cs
int node_avail(char *spec, int *avail, int *alloc, int *reserved, int *down)
.Ce
.IP Args: 4
.RS
.IP spec
pointer to a node spec
.IP avail
RETURN: pointer to an integer in which the number of available nodes that match
the spec is returned.
.IP alloc
RETURN: pointer to an integer in which the number of allocated nodes that match
the spec is returned.
.IP reserved
RETURN: pointer to an integer in which the number of reserved nodes that match
the spec is returned.
.IP down
RETURN: pointer to an integer in which the number of down nodes that match
the spec is returned.
.RE
.IP Returns:
Zero on success or PBS error number.
.LP
This is the node specific part of a batch Resource Query request, see
.I pbs_rescquery() .
The node specification may come in two flavors:
.RS
.IP simple
The request is of the form
.Ty nodes
or
.Ty nodes=
and covers all possible nodes; or the request deals with a single set of
properties,
.Ty nodes=prop[:prop...]
in which case the numbers returned concern the number of nodes with those
properties.   All four numbers are valid.
The above is determined by calling
.I hasprop()
against each known node.  If the node has the requested properties, the
count of available, allocated, ... is incremented depending on the node state.
.IP complex
The request is of the forms:
.Ty nodes=number
or with multiple nodes
.Ty nodes=prop[:prop]+prop...
In this case, only the
.Ar avail
number has meaning and it is kludged.  If greater than zero, it is the number
of nodes requested by the spec and some set of nodes is currently available
which would satisfy the spec.   If equal to zero, the spec is possible, but some
node or nodes are currently allocated/reserved/down.  If 
.Ar avail
is -1, the spec could never be satisfied.
This is determined by calling 
.I node_spec() 
with the spec and setting 
.Ar avail
to its return value.  Note, the number of available nodes,
.Ar svr_numnodes
would be reduced by node_spec() and must be reset since the nodes are not
actually assigned.
.RE
.LP
.Fn node_reserve()
.Cs
int node_reserve(char *spec, resource_t tag)
.Ce
.IP Args: 4
.RS
.IP spec
another node spec
.IP tag
A resource reservation handle
.RE
.IP Returns:
.RS
.IP >0
if the reservation was made
.IP 0
if the reservation was not made or was made in part but may be satisfied later
.IP -1
if the reservation could never be made
.RE
.LP
This is the node specific piece of the Resource Reserve batch request, see
.I req_rescreserve ().
.LP
If this is a reservation that had been attempted before (was partially
satisfied), then 
.Ar tag
will not be
.Sc RESOURCE_T_NULL
and the nodes currently reserved for that tag are freed by calling
.I node_unreserve ().
This allows us to reallocate them (or different ones as the case may be).
.LP
The routine 
.I node_spec()
is called to determine if the nodes requested are available.
If they are, the
.Sc thinking
nodes are reset to
.Sc INUSE_RESERVE .
If the reservation cannot be currently satisfied, those nodes which are
.Sc thinking
and
.Sc INUSE_FREE
are reserved as above.
.Fn free_nodes()
.Cs
void free_nodes(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to a job structure
.RE
.LP
Any node with the given job in its list of allocated jobs has that job
removed.   The node is marked free if and only if its job list becomes empty.
.Fn ping_nodes()
.Cs
void ping_nodes(struct work_task *ptask)
.Ce
.IP Args: 4
.RS
.IP ptask
pointer to a work_task structure 
.RE
.LP
This routine is called off of the server's work task list.  It is used to 
.I ping
Mom on nodes periodically to see if they are alive.
.LP
If the node is down, or in use by a job it is not pinged.  When a down
node comes up, its Mom should yell at the server.
If required, an RPP stream is setup to Mom on the node. 
.I is_compose()
starts a message to the Mom and 
.I rpp_flush() 
sends it. If there is a failure, the RPP stream is closed and the node marked
.Sc INUSE_DOWN .
Note, there is no reply to this ping message; if the standard RPP handshaking
acknowledges receipt of the message, that tells us the node is up.
.LP
A new work task is set for 300 seconds later.
.Fn set_old_nodes()
.Cs
void set_old_nodes(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure
.RE
.LP
This routine is called on the server's startup from 
.I pbsd_init ().
It looks at the nodes assigned to running jobs in the attribute
.At JOB_ATR_exec_host
and calls
.I set_one_old()
to mark that node as in use and allocated to this job.
The job attribute
.At JOB_ATR_resource
is scanned for the resource
.Ty neednodes .
If found (and set), a search is made for the global property
.Ty shared . 
If found, then the nodes allocated to the job are marked as
.Sc INUSE_JOBSHARE ,
else they are marked
.Sc INUSE_JOB .
.Fn set_one_old()
.Cs
static void set_one_old(char *name, job *pjob, int shared)
.Ce
.IP Args: 4
.RS
.IP name
of the node to mark as belonging to the job
.IP pjob
pointer to the job
.IP shared
either 
.Sc INUSE_JOB
or
.Sc INUSE_JOBSHARE
.RE
.LP
This is a helper routine for set_old_nodes().
The list of pbsnode structures is scanned for a node with this name.
Note, the node
.B name
is the last property in the prop list.
The node is marked in use with the value of
.Ar shared
and the job pointer is added to the list of jobs allocated to the node.

.NH 3
.Tc Server Batch Request Functions
.LP
The functions in the following sections perform the processing required
for batch requests received from clients, including other servers.
.LP
The first item of business in processing each job related task is to
determine if the requesting user has the authority to make the request.
This is done by calling
.I svr_authorize_jobreq() .
If the request is not a job related request, then that request will use
another mechanism.
.LP
For job related requests, unless otherwise specified, the request
must be rejected if the job is in the
.Sc JOB_STATE_EXITING
state.
.LP
The last item of business required for each batch request function is the
generation and issuance of a reply to the client.
.NH 4
.Ix req_quejob.c
.LP
The file
.I src/server/req_quejob.c
contains the functions associated with the sequence of batch requests
that request a server to create (queue) a job.  The job may be a new
job, in which case the requesting client is qsub(1)/pbs_submit(3), or it may
be an existing job, in which case the client is another server routing the
job to this server.
.Fn req_queuejob()
.Cs
void req_queuejob(batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure containing the request.
.RE
.LP
This request is to create a new job or to transfer a job from one server
to another.  The destination, a queue name, for the job is specified in 
the request.
When a job is being transferred (routed), the job identifier will be specified
in the request and the client must be another server.  A null user name in
the credential indicates the client is another server.
If the job is not from another server, it cannot have a job id specified
in the request.  If the job is from a user client and thus being created
here, the next job sequence number,
.Av sv_jobidnumber ,
is assigned.  Together with the server name, this is the job id.
.LP
Both the list of new (inbound) jobs and the list of existing jobs are searched
for a job with the same id.  If found, this is a serious error.
.LP
The destination queue is validated.  If it does not exist or is not enabled 
(receiving new jobs), an error reply is returned to the client.
.LP
It would be nice to be able to use the job id as the base file name under which
the job information is maintained.  However, since the job name contains the
server (host) name, it can be quite long; longer than the 14 characters
guaranteed by POSIX.  Hence, we make up a name which is the job name shortened
to 11 characters, 14 - 3 for the \*Q.JB\*U suffix.  Unfortunately, this name
might collide with one from a different server whose name starts the same.
The made up, or hashed name is opened.  If one already exists, the name is
changed starting with the eleventh character and working toward the first until
a unique name is created.
The 11 character basename is recorded in the job structure.
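.LP
The truncation step can be sketched as below.  The names SK_BASE_LEN and
sk_job_basename and the example job id are hypothetical, and the sketch
covers only the shortening and the suffix; the loop that alters characters
to resolve a collision is not shown.
```c
#include <string.h>

#define SK_BASE_LEN 11      /* 14 characters less the 3-character suffix */

/*
 * Copy at most SK_BASE_LEN characters of jobid into base and build the
 * control file name base + ".JB" in file.  base must hold SK_BASE_LEN + 1
 * bytes and file must hold SK_BASE_LEN + 4 bytes.
 */
static void sk_job_basename(const char *jobid, char *base, char *file)
{
    size_t n = strlen(jobid);

    if (n > SK_BASE_LEN)
        n = SK_BASE_LEN;
    memcpy(base, jobid, n);
    base[n] = 0;                /* terminate the shortened name */

    strcpy(file, base);
    strcat(file, ".JB");
}
```
.LP
For a job id such as 1234.bigserver.example.org the base becomes
1234.bigser and the file name 1234.bigser.JB, which is exactly 14
characters long.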
.LP
The routine
.I job_alloc()
is called to allocate and initialize the job structure.
Error replies are returned if the job cannot be created or already exists.
.LP
Each attribute in the request is decoded via the appropriate at_decode()
function into a local copy of the job attribute array.
If any attribute name is unknown to the server, it is maintained in the
special unknown attribute list.
If any attribute fails to decode correctly, an error is returned.  The
.I auxcode
field in the reply identifies the attribute in error.
On any error, the job is purged from the server.
.LP
When all supplied attributes, including resources, are successfully decoded,
the job attributes are updated by \*Qsetting\*U them to the decoded values.
.LP
If the job is being created by this server, the 
.At job-owner
attribute is set to the client user name, the
.At ctime
(create time) attribute is set to the current time, and the
.At hopcount
attribute is initialized to one.
.LP
Otherwise, if the job is being routed here, not created here, then if the
.At job-owner
attribute was not passed with the request, the request is rejected.
The
.At hopcount
is incremented and if too big, the request is rejected.
.LP
If the destination queue is an execution queue, the job execution uid and gid
are set by calling
.I set_jobexid() .
This has the side effect of checking any queue access control list; the
user must have access rights or the request is rejected.
For security reasons, no batch job is allowed to be submitted or run with
the uid of zero (0); it might allow a user to crack security and submit a
job which would cause root-rot.
.LP
In addition to the attributes, the following fields in the job structure
are set:  ji_state, ji_substate, ji_svrflags, ji_numattr, ji_ctime, 
ji_un_type (to 
.Sc JOB_UNION_TYPE_NEW ),
ji_jobid, ji_quen, ji_euid, ji_egid.
The job state is set to
.Sc JOB_STATE_TRANSIT
and the substate to
.Sc JOB_SUBSTATE_TRANSIN .
.LP
If any error occurs after the job structure has been allocated,
the request is rejected and the job structure is freed via 
.I job_purge() .
Job_purge must be used rather than job_free() because the control (save)
file has been created.
.LP
The job structure is linked into the server's new job list,
.At sv_newjobs .
A success reply is returned to the client.
.Fn req_jobcredential()
.Cs
void req_jobcredential(batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure containing the request.
.RE
.LP
In the standard PBS release, this routine is a stub which will reject
the request.  It is provided to allow a site or vendor to add support for
Kerberos or AFS (Andrew File System) where access tickets must be passed
with the job.
.Fn req_jobscript()
.Cs
void req_jobscript(batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure containing the request.
.RE
.LP
The job's script is passed by one or more jobscript requests.  The amount
of data in each request is limited to just under 8KB.  This allows the use
of UDP protocol if anyone ever cares to implement PBS on it.
Since the Job Script request must follow a Queue Job request, the
network connection table has already been set up with a pointer to the
job structure.  This pointer is used to locate the job for which the script
is intended.
.LP 
The size of the script file is maintained in the job structure.
If the size is zero when a jobscript request is received, we assume
that the request must be the first and create the script file.
Otherwise, we open the file with
.Sc O_APPEND .
The script file name is based upon the job control file name with a different
suffix, \*Q
.Av \.SC \*U.
.LP
The script data is written to the file, the file is closed, the file size
in the job structure is updated, and a reply is returned to the client.
.Fn req_rdytocommit()
.Cs
void req_rdytocommit(batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure containing the request.
.RE
.LP
When this request is received, we know the client has completed sending all
data for the job.  The job is manually marked in state
.Sc JOB_STATE_TRANSIT
and substate
.Sc JOB_SUBSTATE_TRANSICM .
The state is set manually to prevent the server and queue (which the job is
not yet in anyway) from being updated.
The job structure is saved in the job file by calling
.I job_save() .
It will remain in substate
.Sc JOB_SUBSTATE_TRANSICM
until the Commit Request is received.
.Fn req_commit()
.Cs
void req_commit(batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure containing the request.
.RE
.LP
When this request is received, the job should reside in the server's
new job list and be in substate
.Sc JOB_SUBSTATE_TRANSICM .
This request tells us that the client is giving up control of the
job to us.
The job state and substate are updated to reflect the
setting of certain attributes, see
.I svr_evaljobstate() .
Typically, the new state will be either
.Sc JOB_STATE_QUEUED ,
.Sc JOB_STATE_HELD ,
or
.Sc JOB_STATE_WAITING .
.LP
The 
.At JOB_ATR_qrank ,
queue_rank, attribute is set from the global variable
.Ar queue_rank .
This is used to ensure the job will be ordered in the queue in the
correct place on a restart of the server.
The job is placed into its destination queue and the various state counts
are updated by calling
.I svr_enquejob() .
The job file is \*Qquickly\*U updated by calling
.I job_save() .
It is now ready for processing depending on the queue type.
.LP
If the job was not created here and is not a new job, then the server calls
.I issue_track()
to notify the tracking server of the job's new location.
.NH 4
.Ix req_delete.c
.LP
The file
.I src/server/req_delete.c
contains the functions used in processing a delete job batch request.
.Fn remove_stagein()
.Cs
void remove_stagein(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job which has had files staged in.
.RE
.LP
When a job has had files staged in but has not yet run and the job is to
be deleted or moved, the staged files should be removed to restore the
system to the state before the job was submitted.
.LP
A delete files request is built by calling
.I cpy_stage() ,
see req_jobobit.c,
and the request is sent to MOM via
.I relay_to_mom() .
Note, only one try is made, win or lose.   The request structure is freed
after the send by 
.I release_req() .
.Fn req_deletejob()
.Cs
void req_deletejob(batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure containing the request.
.RE
.LP
If the job is in the
.Sc JOB_STATE_TRANSIT
state (outbound)
and a job routing process has been forked and recorded in a work task
entry, a pointer to the delete job batch request structure is recorded
in a new work_task entry with a processing function of
.I post_delete_route() .
The routing process is sent a 
.Sc SIGTERM 
signal.
The abort processing is continued when the routing process terminates; the
death of child processing locates the work task entries and places them
on the immediate work list.  The work task entry for the routing child will
be processed before the entry with the
.I post_delete_route() .
If the router process returned an exit status of zero,
the job was routed to another server before it could be deleted, and the
job is purged from this server.  If however, the router exit status was
non-zero, then the job is still ours to delete.  See what happens in
.I post_delete_route() .
.LP
If the job is in substate
.Sc JOB_SUBSTATE_PRERUN ,
then we need to wait for MOM to finish receiving the job so we can delete it,
otherwise there is a race condition and the runjob command may hang.
Therefore, the delete job request is rescheduled as a work task one second
later by calling
.I set_task()
with a timed event pointing to
.I post_delete_route() .
.LP
Otherwise, if the requesting client is not the job owner, then send
mail to the user to inform him of the delete.
If the request extend field,
.Av rq_extend
is a non null pointer, and the text to which it points does not start with
.Ty deldelay= ,
then the text is a message (from the Scheduler) which is appended to the
mail message.
.LP
If the job is in the state of
.Sc JOB_STATE_RUNNING
then the function
.I issue_signal()
is called to send a Signal Job request to the MOM 
responsible for the execution of the job requesting a
.Sc SIGTERM
signal be sent to the job.  The address of the client batch_request structure
is passed to issue_signal() so that it can be found and completed when MOM
responds.  Likewise, the function
.I post_delete_mom1()
is passed to issue_signal() as the reply processor.  The real work continues
within post_delete_mom1().
.LP 
If the job has a non-migratable (Cray style) checkpoint image as shown by
.Ar ji_svrflags
containing
.Sc JOB_SVFLG_CHKPT ,
then job exit processing is performed to deliver the output and remove the
job files under MOM's control.  The job is set to state
.Sc JOB_STATE_EXITING
and substate
.Sc JOB_SUBSTATE_EXITING .
The variable
.Ar ji_momhandle
is set to -1 to force
.I on_job_exit ()
to obtain a new connection to MOM and a work task entry is created to invoke
on_job_exit() immediately.
.LP
If the job has files that have been staged in already, marked by setting
.Sc JOB_SVFLG_StagedIn
in the job structure field
.Av ji_svrflags ,
then
.I remove_stagein() 
is called to ask MOM to delete the files and the job is aborted via
.I job_abt() .
.LP
If the job is in any other state,
.I job_abt()
is called to dispose of the job immediately and a reply is generated for
the request.
.Fn post_delete_route()
.Cs
static void post_delete_route(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
pointer to the work_task entry whose dispatch resulted in calling this
function.
.RE
.LP
All that need be done is to call the
.I req_deletejob()
function again.
The work_task member,
.At wt_parm1 ,
contains a pointer to the original Delete Job batch request.
The second call will either (1) find the job has been requeued by the router
when it received the signal, or (2) find the job already gone (and now
forgotten), in which case
.Er PBSE_UNKJOBID
is returned and the client can look elsewhere.
.Fn post_delete_mom1()
.Cs
static void post_delete_mom1(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
pointer to the work_task entry whose dispatch resulted in calling this
function.
.RE
.LP
Here we continue the work started in req_deletejob() for a job in the running
state.  The work task pointed to by
.At pwt
is the one especially created to send to MOM.  Its
.Av rq_extra
field is a pointer to the original client request.  If MOM did not reject
the signal request, we can acknowledge the client request.  (If we wait till
after the job is signaled a second time, coming up, the user may feel the
delay is too long.)  Note, at this point, the original request is gone.
We now build a new work task entry for the time delay, (1) either specified
in the original request in the request extension, (2) the queue attribute
.At kill_delay ,
or (3) 2 seconds (why 2, why not?).
This new work task points to the function
.I post_delete_mom2() ,
which will continue the work and points to not the batch request, but
directly to the job.  
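.LP
The order of precedence for the delay can be sketched as follows.  This
is an illustration only: the function name, the use of a negative value to
mean that the queue attribute is unset, and the assumption that the
extension carries the delay as deldelay= followed by a count of seconds are
all hypothetical.
```c
#include <stdlib.h>
#include <string.h>

#define SK_DEFAULT_DELAY 2      /* seconds, the documented fallback */

/*
 * Choose the delay before the second signal.  extend is the request
 * extension text or NULL; queue_delay is the queue's kill_delay value,
 * with a negative number meaning "unset" (an assumption of this sketch).
 */
static long sk_kill_delay(const char *extend, long queue_delay)
{
    static const char key[] = "deldelay=";

    if (extend != NULL && strncmp(extend, key, sizeof key - 1) == 0)
        return strtol(extend + sizeof key - 1, NULL, 10);   /* (1) */
    if (queue_delay >= 0)
        return queue_delay;                                 /* (2) */
    return SK_DEFAULT_DELAY;                                /* (3) */
}
```
.LP
An extension that does not begin with the key is treated as message text
and does not affect the delay.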
.Fn post_delete_mom2()
.Cs
static void post_delete_mom2(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
pointer to the work_task entry whose dispatch resulted in calling this
function.
.RE
.LP
When we get here, it is time to send the 
.Sc SIGKILL
signal to the job, if the job still exists in the running state.
We will assume that MOM will accept the signal request, so just pass
.I release_req()
as the post processing function to 
.I issue_signal() .
.LP
Once the job dies, normal job exit processing will occur.
.NH 4
.Ix req_holdjob.c
.LP
The file
.I src/server/req_holdjob.c
contains the functions to process the Hold Job and Release Job requests.
.Fn chk_hold_priv()
.Cs
int chk_hold_priv(long value, int privilege)
.Ce
.IP Args: 4
.RS
.IP value
the hold value specified.
.IP privilege
of the calling client.
.RE
.IP Returns:
0 if ok, PBS error number otherwise.
.LP
This routine checks that the client has the required privilege for setting
a hold:
.RS
.IP HOLD_u
No special privilege is required.
.IP HOLD_o
Operator,
.Sc ATR_DFLAG_OPWR ,
or manager,
.Sc ATR_DFLAG_MGWR ,
privilege is required.
.IP HOLD_s
Manager privilege is required.
.RE
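.LP
The privilege rules listed above amount to a pair of bit tests, sketched
below.  All of the SK_ constants are hypothetical stand-ins for the values
in the PBS headers, and the sketch assumes the manager-only entry is the
system hold, written SK_HOLD_s here.
```c
/* Hypothetical bit values and error number for illustration only. */
#define SK_HOLD_u          0x1
#define SK_HOLD_o          0x2
#define SK_HOLD_s          0x4
#define SK_ATR_DFLAG_OPWR  0x10
#define SK_ATR_DFLAG_MGWR  0x20
#define SK_ERR_PERM        1

/* Return 0 if privilege allows every hold bit in value, else an error. */
static int chk_hold_priv_sketch(long value, int privilege)
{
    /* The manager-only hold needs manager privilege. */
    if ((value & SK_HOLD_s) && !(privilege & SK_ATR_DFLAG_MGWR))
        return SK_ERR_PERM;

    /* The operator hold needs operator or manager privilege. */
    if ((value & SK_HOLD_o) &&
        !(privilege & (SK_ATR_DFLAG_OPWR | SK_ATR_DFLAG_MGWR)))
        return SK_ERR_PERM;

    return 0;               /* the user hold needs no special privilege */
}
```
.LP
Because the tests are independent, a request that combines several hold
types is accepted only when the client satisfies the strictest of them.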
.LP
.Fn req_holdjob()
.Cs
void req_holdjob(batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure containing the request.
.RE
.LP
The hold types specified in the request are determined by calling a
private routine 
.I get_hold()
which finds the holds to be set, decodes them, and checks the privilege 
required against the client's.
These holds are then added to the job 
.At JOB_ATR_hold
attribute.
(This should be done by calling at_set(), but I cheat and set them directly.)
.LP
If the job is in
.Sc JOB_STATE_RUNNING
state, and if checkpoint is supported by the server, and if the 
.At checkpoint
attribute is not
.Av n ,
then the following additional actions are taken:
.IP \(bu
a Hold Request is sent to the MOM which is shepherding the job in execution.
Upon receipt of the Hold Request, MOM will attempt to checkpoint the job 
and terminate its execution.
.IP \(bu
If MOM returned a reply indicating she was successful in checkpointing the
job, the job substate is set to
.Sc JOB_SUBSTATE_RERUN
to cause rerun post job processing,
and the job is retained in the execution queue.
Note, the job is left in run state until MOM aborts the job and notifies us
with the Job Obit notice.
.IP
If MOM returned a reply indicating that the checkpoint failed,
the error reply is returned to the client requesting the hold.
If the reply from MOM indicated that checkpoint was not supported on
the execution host, the job is left in execution state, with the hold
noted just as if checkpoint was not supported by the server. 
.LP
If checkpoint is not supported, or if the 
.At checkpoint
attribute is
.Av n ,
no additional processing is performed, the job is left executing.
.LP
.Fn req_releasejob()
.Cs
void req_releasejob(batch_request *request)
.Ce
.IP Args: 4
.RS
.IP request
pointer to the batch_request structure containing the request.
.RE
.LP
The hold types specified in the request are determined by calling a
private routine 
.I get_hold()
which finds the holds to be released, decodes them, and checks the privilege 
required against the client's.
Each hold type specified in the request is removed from the job
.At hold-list 
attribute by calling the
.I at_decode()
routine
and clearing the corresponding bit in
.At JOB_ATR_hold
via the
.I at_set() 
routine for the attribute.
.LP
.Fn get_hold()
.Cs
static int get_hold(list_head *head, char **pstring)
.Ce
.IP Args: 4
.RS
.IP head
of list of svrattrl structures containing the attributes from the request.
.IP pstring
RETURN: pointer to a string pointer which will be set to point to the
hold characters, n, u, o, and/or s.
.RE
.IP Returns:
0 if ok, error otherwise.
.LP
The 
.At Hold_Types
attribute in the supplied list is located; there should be one and only one
such attribute, otherwise the hold or release request was ill formed.
The character pointer,
.Ar pstring ,
is set to point to the attribute (external) value.
The attribute is decoded into a temporary attribute which is available to
the other routines in this file.
.Fn post_hold()
.Cs
static void post_hold(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
Pointer to the work task entry.
.RE
.LP
This routine is called when MOM responds to the Hold Job request passed
to her from 
.I req_holdjob ()
when checkpointing is supported by the server.
If MOM returns an error indicating that checkpoint failed (including not
supported), it is logged and the error is returned to the client that
initiated the Hold Request.
The job substate is restored to
.Sc JOB_SUBSTATE_RUNNING
since the job is still running.
.LP
If checkpointing succeeded, 
.Ar ji_svrflags 
is updated with
.Sc JOB_SVFLG_HASRUN
and either
.Sc JOB_SVFLG_CHKPT
(for Cray style non-migratable checkpoint) or
.Sc JOB_SVFLG_ChkptMig
(for the as yet unimplemented migratable checkpoint).
.I account_record()
is called with
.Sc PBS_ACCT_CHKPNT
to record the checkpoint suspension in the accounting file.
.LP
The Hold Job request is acknowledged.
.NH 4
.Ix req_jobobit.c
.LP
The file
.I src/server/req_jobobit.c
contains the function to process the Job Obituary batch request and
associated post-execution functions.
The Job Obituary request is actually a notice from MOM that the referenced
job has terminated execution.  
Generally, if the job terminated, post processing is performed to return output
and remove the job, see
.I on_job_exit() .
If the job is to be rerun, the job is requeued in its current queue, see
.I on_job_rerun() .
.LP
There are several special cases of job termination which are handled.
.IP \(bu 3
If MOM dies, either on her own or because the system crashed, MOM has lost
control of the executing jobs.
Either they died also or they became detached from MOM.  When MOM recovers,
she will attempt to kill all jobs and mark them as exited.  She will insert
a special exit status code of
.Sc JOB_EXEC_INITABT
to be returned to the server as the job exit status.  
This exit status notes the job died/was killed on recovery.
The server will rerun the job if allowed or terminate it if not.
.IP \(bu 3
If MOM aborted the job on recovery and the job had a Cray style
non-migratable checkpoint file, MOM returns a special job exit code of
.Sc JOB_EXEC_INITRST .
The job is marked in 
.Ar ji_svrflags
with
.Sc JOB_SVFLG_HASRUN
and
.Sc JOB_SVFLG_CHKPT .
The job is simply requeued.
.IP \(bu 3
If MOM aborted the job on recovery and the job had an as yet unimplemented
migratable checkpoint image, MOM returns a job exit status of
.Sc JOB_EXEC_INITRMG .
The job is marked in
.Ar ji_svrflags
with
.Sc JOB_SVFLG_HASRUN
and
.Sc JOB_SVFLG_ChkptMig
and its substate is set to 
.Sc JOB_SUBSTATE_RERUN
to cause rerun processing.
.IP \(bu 3
If MOM is unable to start a job for some reason that is permanent, for example
the user account was invalid or
the job asked for an unknown resource, then MOM will set the job exit code to
either
.Sc JOB_EXEC_FAIL1
if the error was detected before the standard output of the job was created or
.Sc JOB_EXEC_FAIL2
if the error was noted after the standard output was set up.
In both cases the server will abort the job; the difference is the message
mailed to the user.
.IP \(bu 3
If MOM is unable to start a job for some reason that is believed to be
temporary, such as a resource having been gobbled up by an interactive session,
then MOM will set the job exit code to
.Sc JOB_EXEC_RETRY .
The server will requeue the job; it is treated as a rerun except that
the job's output is not saved.
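.LP
The special cases above amount to a mapping from MOM's exit status to a
server action.
The sketch below is illustrative only: the numeric values of the
JOB_EXEC_* symbols and the obit_action names are assumptions for this
example, and the real definitions are those in the PBS headers.
.Cs
```c
#include <assert.h>

/* Assumed values for illustration; see the PBS headers for the real ones. */
enum {
    JOB_EXEC_FAIL1   = -1,
    JOB_EXEC_FAIL2   = -2,
    JOB_EXEC_RETRY   = -3,
    JOB_EXEC_INITABT = -4,
    JOB_EXEC_INITRST = -5,
    JOB_EXEC_INITRMG = -6
};

/* Hypothetical names for the server actions described in the list. */
enum obit_action { ACT_EXIT, ACT_ABORT, ACT_RERUN, ACT_REQUEUE, ACT_RETRY };

static enum obit_action classify_exit(int exitstat, int rerunnable)
{
    switch (exitstat) {
    case JOB_EXEC_INITABT:               /* killed when MOM recovered */
        return rerunnable ? ACT_RERUN : ACT_EXIT;
    case JOB_EXEC_INITRST:               /* non-migratable checkpoint exists */
        return ACT_REQUEUE;
    case JOB_EXEC_INITRMG:               /* migratable checkpoint image */
        return ACT_RERUN;
    case JOB_EXEC_FAIL1:                 /* permanent start failure */
    case JOB_EXEC_FAIL2:
        return ACT_ABORT;
    case JOB_EXEC_RETRY:                 /* temporary start failure */
        return ACT_RETRY;
    default:                             /* ordinary exit status */
        return ACT_EXIT;
    }
}
```
.Ce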
.LP
.Fn wait_for_send()
.Cs
static void wait_for_send(struct work_task *ptask)
.Ce
.IP Args: 4
.RS
.IP ptask
pointer to the work task entry that called this routine.
.RE
.LP
This routine just calls back
.I req_jobobit() .
The work task was set up there as a delay mechanism.
.Fn req_jobobit()
.Cs
void req_jobobit(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
This function validates the request, updates the list of resources used, 
records the job exit status in
.Av ji_exitstat ,
and replies to MOM.
The scheduling flag,
.Av svr_do_schedule ,
is set to
.Sc SCH_SCHEDULE_TERM .
If the job cannot be found and the server was started with
.Sc RECOV_COLD
or
.Sc RECOV_CREATE ,
then the jobs were blown away.  The server replies to MOM with
.Er PBSE_CLEANEDOUT
to instruct her to trash her files relating to that job.
Otherwise if the job cannot be found, the server returns 
.Er PBSE_UNKJOBID .
.LP
If the job is already in the 
.Sc JOB_STATE_EXITING
state, then MOM must be recovering and sending the server a second
notice.  Return 
.Er PBSE_ALRDYEXIT
to MOM which tells her to mark the job as exiting and close the connection.
The server will continue to process the job on the thread started by the
original notice.
.LP
If the job state is not
.Sc JOB_STATE_RUNNING ,
an obit should never have been issued, so this is logged and ignored.
Otherwise if the substate is
.Sc JOB_SUBSTATE_PRERUN ,
then the obit notice \*Qwon the race\*U between it and the SIGCHLD
from the child of the server that sent the job to MOM to run, see
.I svr_strtjob2()
in req_runjob.c.  We need to wait for the send side to complete so the 
run job request can be acknowledged.  So a work task with a one second delay
is created to call 
.I wait_for_send() .
It will just restart req_jobobit().
.LP
The information in the request is processed first, saving the status and
building the mail/log message before replying to MOM, otherwise the information
is lost.  The reply is then made to keep MOM from waiting any longer.
.LP
A normal exit status from a job can never be negative, since only 8 bits are
returned.
If the exit status of the job is negative, it is a special status from MOM,
and is one of the following:
.IP \s-1JOB_EXEC_INITABT\s+1
The job was aborted by MOM on her recovery.  If the job can be rerun, its
substate is set to 
.Sc JOB_SUBSTATE_RERUN 
and the
.Sc JOB_SVFLG_HASRUN
flag is set in 
.I ji_svrflags .
Rerun processing will take place.
.IP \s-1JOB_EXEC_RETRY\s+1
MOM was unable to start the job, but it should be retried.  If the job has
been rerun before and has output files, this case is treated as another rerun.
If the job has not been run before, the empty output files are not saved, but
other rerun processing is performed.  This is accomplished by setting the
substate to
.Sc JOB_SUBSTATE_RERUN1 ,
see on_job_rerun().
.IP "all other"
A special mail message is sent to the owner and normal exit processing takes
place.
.LP
If the exit value is greater than 10000, then the job ended on a signal. 
10000 was added by MOM to the signal number.  Different execution hosts
may have different size exit masks in wait.h, so the signal value is forced 
to be uniform.  This allows us to issue a different mail message to the user
on job end.
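.LP
A minimal sketch of the signal test just described follows.
The names SIGNAL_OFFSET and exit_signal() are hypothetical; only the
offset of 10000 comes from the text.
.Cs
```c
#include <assert.h>

#define SIGNAL_OFFSET 10000  /* offset the text says MOM adds to the signal */

/* Return the signal number if the job ended on a signal, 0 otherwise. */
static int exit_signal(int exitstat)
{
    return (exitstat > SIGNAL_OFFSET) ? exitstat - SIGNAL_OFFSET : 0;
}
```
.Ce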
.LP
If the job is being terminated, not rerun, then the job state is set to
.Sc JOB_STATE_EXITING
and the substate to
.Sc JOB_SUBSTATE_EXITING .
If requested, mail is sent to the
.At mail_list
by calling
.I svr_mailowner() .
The new-lines in the resource usage message are replaced with spaces for the
log entry.  The function
.I account_jobend()
is called to record the usage to the accounting file.  If 
.Sc PBSEVENT_JOB_USAGE
is set in the server's
.At log_events
attribute, then the same message is recorded in the log, otherwise a
short form is recorded.
.LP
The function
.I on_job_exit()
is called with a pointer to a work task of type WORK_Immed.
.LP
If the job is being truly rerun, not restarted from checkpoint, then the
resources used attribute,
.At JOB_ATR_resc_used ,
and the execution host attribute,
.At JOB_ATR_exec_host ,
are cleared.
Then the function 
.I on_job_rerun()
is invoked with a work task entry of WORK_Immed.
.LP
If the job has a Cray style checkpoint file, that is, if
.Sc JOB_SVFLG_CHKPT
is set, the job is requeued directly.
.LP
As a reminder, both on_job_exit() and on_job_rerun() are invoked via
.I dispatch_task()
so that the work task structure is deleted when the on_job_* function returns.
.Fn on_job_exit()
.Cs
void on_job_exit(struct work_task * ptask)
.Ce
.IP Args: 4
.RS
.IP ptask\ 
pointer to a work task entry which points to the job in exit state.
.RE
.LP
The steps required for a normally terminating job are:
.IP 1. 3
Set the job state to
.Sc JOB_STATE_EXITING 
and the substate to 
.Sc JOB_SUBSTATE_EXITING .
Deal with any job termination dependencies.
.IP 2.
Set the substate to
.Sc JOB_SUBSTATE_STAGEOUT .
Send a Copy Files request to MOM to move the standard output, standard
error, and any \*Qstaged out\*U files of the job and the files listed in the 
.I "stage out"
resource to the final destination.
.IP 3.
Set the job substate to 
.Sc JOB_SUBSTATE_STAGEDEL .
Send a Delete Files request to MOM to delete any \*Qstaged in\*U files.
.IP 4.
Set the job substate to 
.Sc JOB_SUBSTATE_EXITED .
Send a Delete Job request to MOM to remove the remaining traces of the job, including
the job control file, job script, and any checkpoint file.
.IP 5.
Send the final Track Job request to the original server (if not me) and
purge the job from the system.
.LP
This function is invoked by 
.I req_jobobit() ,
by work task when various copy and delete file requests to MOM complete, and by
.I pbsd_init()
on recovery.
Its purpose is to determine where in the exit processing the job is and
resume with the next step.
If the work task type is
.Sc WORK_Immed 
or
.Sc WORK_Timed ,
this routine is being called to perform the cycle or stage of
processing the job indicated by its substate.  Otherwise, the work task type is
.Sc WORK_Deferred_Reply ,
and this routine is being called back after MOM has replied to a request.
A
.Sc WORK_Timed
call back results when on_job_exit() is called from 
.I pbsd_init() .
In this case, as with the first time on_job_exit() is called for a job,
there is not a connection to MOM and one must be made by calling
.I mom_comm() .
If the connection cannot be established because MOM is not (yet) alive, a timed
work task is created to retry MOM after a delay.
.LP
Switch on the job substate:
.IP "JOB_SUBSTATE_EXITING or JOB_SUBSTATE_ABORT"
Process any dependencies by calling
.I depend_on_term() .
Advance the job substate to JOB_SUBSTATE_STAGEOUT.
.IP JOB_SUBSTATE_STAGEOUT
If the work task type is 
.Sc WORK_Immed ,
then this is the first call into this routine.
The first task is to determine which of the standard job files (output and
error) are to be moved.
.RS
.IP -
If the job attribute
.At JOB_ATR_join
is set to other than 
.Ty n ,
then determine which file is listed first (exists and will be moved) and
which is joined into that one (does not exist).
.IP -
For each file not joined to another file, determine if it is to be
kept by checking the job attribute
.At JOB_ATR_keep .
If kept, add to the Copy File list with the destination (remote) name set to
the default file name.
.IP -
For each file not joined and not kept, add the file to the Copy File
request with the destination name set to the name given in the
corresponding path attribute.
.RE
If the job has a stage-out resource, then append those files to the 
Copy Files request.  Then send the request to MOM.
.IP
If the work_task type is not
.Sc WORK_Immed ,
MOM has replied to the Copy File request.
If the reply indicates a failure, generate a mail message to the job owner
and set the substate to
.Sc JOB_SUBSTATE_EXITING .
.IP
Regardless of the reply from MOM, free the prior batch request, set the
substate to
.Sc JOB_SUBSTATE_STAGEDEL
and set up a work task of type 
.Sc WORK_Immed
and pointing back to on_job_exit.
On being dispatched, the appropriate next action will be performed.
.IP JOB_SUBSTATE_STAGEDEL
If the work task type is
.Sc WORK_Immed ,
this is the first time into this section.
Build and send MOM a Delete Files request for each file that was \*Qstaged 
in\*U.
.IP
If the work task type is not immediate and
the reply indicates a failure, generate a mail
message to the job owner.  Free the batch request, set the job substate to
.Sc JOB_SUBSTATE_EXITED
and continue with that action.
.IP JOB_SUBSTATE_EXITED
Send a Delete Job Request to Mom.
Send the final Track Job request to the creating server if that is not here.
Call 
.I job_purge()
to remove the job.
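.LP
The progression through the substates in the switch above can be pictured
as a small state machine.
This is a sketch under stated assumptions: the enum values are
placeholders rather than the PBS substate numbers, SUB_PURGED is a
hypothetical terminal marker for \*Qjob removed\*U, and the requests to
MOM are reduced to comments.
.Cs
```c
#include <assert.h>

/* Symbolic substates; numeric values are placeholders, not the PBS ones. */
enum substate {
    SUB_EXITING, SUB_ABORT, SUB_STAGEOUT, SUB_STAGEDEL, SUB_EXITED, SUB_PURGED
};

/* Return the substate that follows cur in normal exit processing. */
static enum substate next_exit_substate(enum substate cur)
{
    switch (cur) {
    case SUB_EXITING:
    case SUB_ABORT:    return SUB_STAGEOUT;  /* after dependency processing */
    case SUB_STAGEOUT: return SUB_STAGEDEL;  /* after the Copy Files reply */
    case SUB_STAGEDEL: return SUB_EXITED;    /* after the Delete Files reply */
    default:           return SUB_PURGED;    /* Delete Job, Track Job, purge */
    }
}

/* Count the steps left until the job is purged. */
static int steps_to_purge(enum substate cur)
{
    int n = 0;

    while (cur != SUB_PURGED) {
        cur = next_exit_substate(cur);
        n++;
    }
    return n;
}
```
.Ce
.LP
Because each call only looks at the recorded substate, the same logic
serves both a fresh obit and a job recovered in mid-exit at server startup.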
.LP
.Fn on_job_rerun()
.Cs
void on_job_rerun(struct work_task * ptask)
.Ce
.IP Args: 4
.RS
.IP ptask
pointer to a work task entry which points to the job in exit state.
.RE
.LP
This function requeues a job when it stops following a rerun request.
The substate of the job has already been set to
.Sc JOB_SUBSTATE_RERUN 
by 
.I req_rerunjob() .
At the end of processing, the job state is reset to 
.Sc JOB_STATE_QUEUED ,
and the substate to
.Sc JOB_SUBSTATE_QUEUED .
The job is left in the current queue.
The actions on the job are driven by the substate recorded in the job.
.IP JOB_SUBSTATE_RERUN
On the first entry in on_job_rerun(), the substate will already be set to
.Sc JOB_SUBSTATE_RERUN ,
and the work task pointer will be of type WORK_Immed.
.I mom_comm() 
is called to obtain a connection to MOM.
If the host on which the job was executing is the server host, no file action
is required.  The job state is set to 
.Sc JOB_STATE_EXITING
and substate to
.Sc JOB_SUBSTATE_RERUN1 .
A work task entry is created to pick up at that point in post processing.
.IP
If the execution host is not the server host, then the various files must
be recovered to the server in case the job is rerun elsewhere.
A 
.B "Rerun Job"
request is sent to MOM.  This directs her to return standard output,
standard error, and any checkpoint file to the server using a
.B "Job Files"
request.
.IP
If MOM responds with success, the job state and substate are set to 
.Sc JOB_STATE_EXITING
and
.Sc JOB_SUBSTATE_RERUN1
and the server proceeds with the next step.
.IP JOB_SUBSTATE_RERUN1
If there are files to be staged-out, the server builds a 
.B "Copy Files"
request, see
.I cpy_stage() ,
and the server sends it to MOM.  Note, MOM will delete any files she stages out.
Regardless of success or failure, the substate is updated to
.Sc JOB_SUBSTATE_RERUN2 .
.IP JOB_SUBSTATE_RERUN2
If the job had files staged-in, 
.I cpy_stage() 
is called to build a copy files request for those files and the request is
converted to a 
.B "Delete Files"
request which is sent to MOM.  If there are no staged-in files to delete or
after the request is processed (success or failure), the job substate is
updated for the next phase.
.IP JOB_SUBSTATE_RERUN3
The job is removed from MOM's custody by sending her a Delete Job request.
.LP
The socket handle,
.Av ji_momhandle
and the 
.Sc JOB_SVFLG_StagedIn
flag in
.Av ji_svrflags
are cleared.
The new job state and substate are determined by calling
.I svr_evaljobstate()
and set by
.I svr_setjobstate() .
In effect, this requeues the job.
.Fn mom_comm()
.Cs
int mom_comm(job *pjob, void (*function)(struct work_task *))
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure.
.IP function
to invoke via a work task if the connection to MOM cannot be established.
.RE
.IP Returns: 4
the connection handle or -1 if no connection was established.
.LP
If a handle has already been recorded in 
.Av ji_momhandle
of the job structure (not -1), it is returned.  Otherwise, a new connection
to MOM is established by calling
.I svr_connect()
with the host address of MOM found in
.Av ji_un.ji_exect.ji_momaddr .
If this address is zero (which might be the case if the ji_un union was
cleared by moving the job), the address of the host in the
.At JOB_ATR_exec_host
attribute is obtained prior to calling svr_connect().
If the connection is established, the handle is saved in ji_momhandle and
returned to the caller.
.LP
If the connection cannot be established, a work task structure is set up
with a time delay of 
.Sc PBS_NET_RETRY_TIME
and a call back function as passed in the parameter
.Av function .
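.LP
The handle reuse and retry behavior can be sketched as follows.
Everything here is a stand-in: struct job_sketch, mom_comm_sketch(), and
the connect_fn callback are hypothetical, the fallback to the execution
host attribute is omitted, and a flag replaces the timed work task that the
real routine sets up.
.Cs
```c
#include <assert.h>

#define NO_HANDLE (-1)

struct job_sketch {
    int momhandle;          /* stands in for ji_momhandle */
    unsigned long momaddr;  /* stands in for ji_un.ji_exect.ji_momaddr */
    int retry_scheduled;    /* set when a delayed retry would be queued */
};

typedef int (*connect_fn)(unsigned long addr);

static int mom_comm_sketch(struct job_sketch *pjob, connect_fn do_connect)
{
    if (pjob->momhandle != NO_HANDLE)
        return pjob->momhandle;          /* reuse the recorded handle */

    pjob->momhandle = do_connect(pjob->momaddr);
    if (pjob->momhandle < 0) {
        pjob->momhandle = NO_HANDLE;
        pjob->retry_scheduled = 1;       /* real code creates a timed task */
    }
    return pjob->momhandle;
}

/* Test doubles: one connect that succeeds and one that fails. */
static int connect_ok(unsigned long addr)   { (void)addr; return 7; }
static int connect_fail(unsigned long addr) { (void)addr; return -1; }
```
.Ce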
.Fn setup_from()
.Cs
static char *setup_from(job *pjob, char *suffix)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure
.IP suffix
to append to file based on output, error, or checkpoint.
.RE
.IP Return:
Pointer to allocated string containing file name.
.LP
This function returns a name for a standard file for a job.  The suffixes are
defined in job.h as:
.Sc JOB_STDOUT_SUFFIX
\- .OU for standard output,
.Sc JOB_STDERR_SUFFIX
\- .ER for standard error, and
.Sc JOB_CKPT_SUFFIX
\- .CK for checkpoint.
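.LP
A minimal sketch of building such a name is shown below.
The suffix strings are the ones listed above; the helper name
setup_from_sketch() and the idea of passing the file prefix as a plain
string are assumptions, since the real routine takes the job structure.
.Cs
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define JOB_STDOUT_SUFFIX ".OU"
#define JOB_STDERR_SUFFIX ".ER"
#define JOB_CKPT_SUFFIX   ".CK"

/* Return a malloc-ed prefix + suffix string, or NULL if out of memory. */
static char *setup_from_sketch(const char *prefix, const char *suffix)
{
    size_t len = strlen(prefix) + strlen(suffix) + 1;
    char *from = malloc(len);

    if (from != NULL) {
        strcpy(from, prefix);            /* buffer sized for both parts */
        strcat(from, suffix);
    }
    return from;                         /* caller must free() the result */
}
```
.Ce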
.Fn setup_cpyfiles()
.Cs
static struct batch_request *setup_cpyfiles( struct batch_request *preq,
job *pjob, char *from, char *to, int direction, int flag)
.Ce
.IP Args: 4
.RS
.IP preq
Pointer to copy file request structure to build.  If null, the structure will
be allocated, otherwise the existing one will be expanded.
.IP pjob
pointer to job.
.IP from
name of file local to mom.
.IP to
name of file remote to mom (destination on stage-out, source on stage-in).
.IP direction
of transfer: 
.Sc STAGE_DIR_IN
\- in to Mom, or
.Sc STAGE_DIR_OUT .
.IP flag
indicating the type of file:
.Sc STDJOBFILE
\- standard job file (output, error),
.Sc JOBCKPFILE
\- checkpoint file, or
.Sc STAGEFILE
\- user specified stage-in/out file.
.RE
.IP Return:
Pointer to the copy files batch request structure.
.LP
If
.Ar preq
is null, then a batch_request structure is allocated and initialized for
.Sc PBS_BATCH_CopyFiles
including the job id, owner, effective user and effective group names.
Note, if the effective group is the user's login group as indicated by
.Sc ATR_VFLAG_DEFL
set in the
.At JOB_ATR_egroup
attribute, the group name is set to the null string.  This tells mom to use
the gid from the password entry.
If the
.Ar preq
is not null, the existing copy files request structure is used by appending
the new file pair to the current list.
.LP
A file pair structure, 
.Av rqfpair ,
is allocated and initialized with the 
.Ar from
and
.Ar to
names and the file type flag which is an indication to Mom as to where the
local file is/should be.
.Fn is_joined()
.Cs
static int is_joined(job *pjob, enum job_atr nat)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job
.IP nat
indicates which attribute the file name concerns:
.Sc JOB_ATR_outpath
or
.Sc JOB_ATR_errpath .
.RE
.IP Returns:
.RS
.IP 1
if file is joined to another.
.IP 0
if not joined.
.RE
.LP
This routine takes the number,
.Av nat ,
of a job attribute and determines if that file (output or error) is joined 
to another in the job's
.At JOB_ATR_join
attribute.   Note in the case of \*Q-j oe\*U option, the error file is joined
to the output file.  If nat was JOB_ATR_errpath, the return would be true (1).
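.LP
The join test can be sketched on the join string alone.
This is an illustration, not the PBS routine: is_joined_sketch() is a
hypothetical helper that takes the join value and the key letter of the
file (o for output, e for error) instead of a job pointer and attribute
number, and it assumes the first letter in the join value names the file
that actually exists.
.Cs
```c
#include <assert.h>
#include <string.h>

/* Return 1 if the file with this key letter is merged into another file. */
static int is_joined_sketch(const char *join, char key)
{
    if (join == NULL || join[0] == 'n')
        return 0;                        /* no join requested */
    if (join[0] == key)
        return 0;                        /* listed first: this file exists */
    return strchr(join, key) != NULL;    /* listed later: joined to first */
}
```
.Ce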
.Fn cpy_stdfile()
.Cs
static struct batch_request *cpy_stdfile(struct batch_request *preq,
job *pjob, enum job_atr nat)
.Ce
.IP Args: 4
.RS
.IP preq
Pointer to copy file request structure to build.  If null, the structure will
be allocated, otherwise the existing one will be expanded.
.IP pjob
pointer to job.
.IP nat
identifies attribute specifying output or error path.
.RE
.IP Return:
Pointer to the copy files batch request structure.
.LP
This function determines if one of the job's standard files (output or error)
should be copied.  If so, it builds or adds to the copy files request.
.LP
If the job is interactive, there is no output to copy.  Otherwise we choose
the suffix and a default key letter based on the file.  The key letter is
used to check the keep list,
.At JOB_ATR_keep .
.I is_joined()
is used to determine if the file was joined to another and doesn't exist 
separately.  
.LP
The
.I to
file name is based on the job attribute value.  The
.I from
name is returned from
.I setup_from() .
.LP
The function 
.I setup_cpyfiles()
does the rest of the work.
.Fn cpy_stage()
.Cs
struct batch_request *cpy_stage(struct batch_request *preq, job *pjob,
enum job_atr nat, int direction)
.Ce
.IP Args: 4
.RS
.IP preq
Pointer to copy file request structure to build.  If null, the structure will
be allocated, otherwise the existing one will be expanded.
.IP pjob
pointer to job.
.IP nat
identifies attribute specifying path.
.IP direction
of transfer. 
.RE
.IP Returns:
pointer to new or expanded copy files batch request.
.LP
This is the equivalent of cpy_stdfile() for stage-in/out files.
If the attribute specified by
.Av nat
is set, then for each 
.I local_name@remote_host:remote_name
element in its value, the element is parsed into the 
.I to
path, and the
.I from
path.
.I setup_cpyfiles()
does the rest.
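.LP
Splitting one stage element can be sketched as below.
The function parse_stage_element() and its buffer-based interface are
hypothetical; the sketch only assumes the local_name@remote_host:remote_name
form given above, with the local part serving as the name local to MOM and
the remainder as the remote name.
.Cs
```c
#include <assert.h>
#include <string.h>

/*
 * Split "local_name@remote_host:remote_name" into the local part and the
 * "remote_host:remote_name" part.  Returns 0 on success, -1 if the element
 * is malformed or a part does not fit its buffer.
 */
static int parse_stage_element(const char *elem,
                               char *local, size_t local_sz,
                               char *remote, size_t remote_sz)
{
    const char *at = strchr(elem, '@');
    size_t llen;

    if (at == NULL || at == elem)
        return -1;                       /* no separator or empty local name */
    if (strchr(at + 1, ':') == NULL)
        return -1;                       /* remote side needs host:name */

    llen = (size_t)(at - elem);
    if (llen >= local_sz || strlen(at + 1) >= remote_sz)
        return -1;

    memcpy(local, elem, llen);
    local[llen] = 0;
    strcpy(remote, at + 1);
    return 0;
}
```
.Ce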
.NH 4 
.Ix req_locate.c
.LP
The file
.I src/server/req_locate.c
contains the function to process the Locate Job batch request.
.Fn req_locatejob()
.Cs
void req_locatejob (struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
The function will attempt to find information about the job in two places.
First, the server will search its list of all active jobs by calling
.I find_job() .
If that fails, the server will search the array of tracking records pointed
to by the server structure member
.I sv_track .
.LP
If found in either place, the current location is reported to the client
in the reply.  If not found, the server responds with 
.Er PBSE_UNKJOBID .
.NH 4
.Ix req_manager.c
.LP
The file
.I src/server/req_manager.c
contains the server function for processing the Manager batch request,
creating and deleting queues and setting server, queue, and node attributes.
.LP
.Fn req_manager()
.Cs
void req_manager(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
As this is not a job related batch request, the authorization is performed
differently.  The user's privilege is obtained.  If the manage command is a
.I create 
or 
.I delete ,
the privilege must be at the administrator level.
If the manage operation is a
.I set
or 
.I unset ,
the privilege generally can be at either the administrator or operator level.
The exception to this statement comes when dealing with 
.I node-attributes ,
where certain changes are only available to managers.
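.LP
The general rule can be sketched as a small predicate.
The command enum, the privilege bits, and mgr_cmd_allowed() are all
hypothetical names chosen for this example, and the node attribute
exception is deliberately not modeled.
.Cs
```c
#include <assert.h>

enum mgr_cmd { MGR_CREATE, MGR_DELETE, MGR_SET, MGR_UNSET };  /* hypothetical */
enum { PRIV_OPER = 1, PRIV_MGR = 2 };                    /* hypothetical bits */

/* Return 1 if a client with these privilege bits may issue the command. */
static int mgr_cmd_allowed(enum mgr_cmd cmd, int priv)
{
    if (cmd == MGR_CREATE || cmd == MGR_DELETE)
        return (priv & PRIV_MGR) != 0;           /* administrator only */
    return (priv & (PRIV_MGR | PRIV_OPER)) != 0; /* set and unset */
}
```
.Ce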
.LP
A function to perform the requested operation is now called.  The function
called is chosen based on the
.I command
and
.I "object type"
specified in the Manage request.  
The 
.I command
values can be
.Sc Create ,
.Sc Delete ,
.Sc Set
and 
.Sc Unset .
The object type values can be
.Sc Server
or
.Sc Queue .
It is not legal to either create or delete a server.
.LP
If any error is detected, an error reply is returned to the client.
.LP
Each command \(em object specific function generates and sends a success or
error reply to the client.
.Fn mgr_server_set()
.Cs
void mgr_server_set(int sfds, struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP sfds
the socket connection to the requesting client.
.IP req
pointer to the batch_request structure.
.RE
.LP
The specified attributes of the server are set by calling
.I mgr_set_attr()
with the address of the server attribute array, the address of the server
attribute definition array, the number of attributes in the array, the list
of attributes from the batch request, and the privilege of the requester.
.LP
.I Svr_update()
is called to save the server information to disk and
.I mgr_log_attr()
to log the changes in the log file.
An appropriate reply is generated and sent to the client.
.Fn mgr_server_unset()
.Cs
void mgr_server_unset(int sfds, struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP sfds
the socket connection to the requesting client.
.IP req
pointer to the batch_request structure.
.RE
.LP
The specified attributes of the server are unset by calling
.I mgr_unset_attr()
with the address of the server attribute array, the address of the server
attribute definition array, the number of attributes in the array, the list
of attributes from the batch request, and the privilege of the requester.
.LP
.I Svr_updatedb()
is called to save the server information to disk.
The routine
.I mgr_log_attr()
is called to log the attribute changes.
An appropriate reply is generated and sent to the client.
.Fn mgr_queue_create()
.Cs
void mgr_queue_create(int sfds, struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP sfds
the socket connection to the requesting client.
.IP req
pointer to the batch_request structure.
.RE
.LP
.I Find_queuebyname() 
is called to ensure a queue does not already exist with the specified name.
Space for the queue structure is allocated and initialized by calling
.I que_alloc() .
.LP
At this point the type of the queue is indeterminate.  It is established
by the first attribute found which is restricted to a certain queue type.
The attribute list in the request is scanned for the first attribute 
whose definition contains a parent type flag of other than 
.Sc PARENT_TYPE_QUE_ALL .
The queue takes on the queue type indicated by that attribute.
.LP
The function
.I mgr_set_attr()
is called to actually set the queue values.
If successful,
.I svr_save()
and
.I que_save()
are called to write the queue save file and update the server's save file.
A success reply is returned to the client.
.LP
If any attribute being set is incompatible with the queue_type
as determined by calling
.I check_que_attr() ,
a \*Qwarning\*U message is returned to the client. 
Any errors in the request will result in the queue structure being freed by
.I que_free() 
and an error reply returned to the client.
.Fn mgr_queue_delete()
.Cs
void mgr_queue_delete(int sfds, struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP sfds
the socket connection to the requesting client.
.IP req
pointer to the batch_request structure.
.RE
.LP
The function
.I que_purge()
is called to remove the queue.
If the queue contains any jobs, the request is rejected.
.Fn mgr_queue_set()
.Cs
void mgr_queue_set(int sfds, struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP sfds
the socket connection to the requesting client.
.IP req
pointer to the batch_request structure.
.RE
.LP
The queue is located by calling
.I find_queuebyname() .
Then 
.I mgr_set_attr()
is called to update the queue attributes.
.LP
The routine
.I check_que_attr()
is called to ensure the specified attributes are appropriate to the
queue type; if there is a problem, a \*Qwarning\*U message is sent.
An appropriate reply is returned to the client.
.Fn mgr_queue_unset()
.Cs
void mgr_queue_unset(int sfds, struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP sfds
the socket connection to the requesting client.
.IP req
pointer to the batch_request structure.
.RE
.LP
The queue is located by calling
.I find_queuebyname() .
The specified attributes of the queue are unset (cleared) by calling
.I mgr_unset_attr() .
An appropriate reply is returned to the client.
.Fn mgr_set_attr()
.Cs
int mgr_set_attr(attribute *patr, attribute_def *padef, int numattr,
                 svrattrl *reqattr, int privilege, int *bad)
.Ce
.IP Args: 4
.RS
.IP patr
pointer to the attribute array in the server or queue to be set.
.IP padef
pointer to the attribute definition array for the object's attributes.
.IP numattr
integer number of attributes in the parent object attribute array.
.IP reqattr
pointer to the list of attributes in the batch request.
.IP privilege
level of the client.
.IP bad
RETURN:  pointer to integer, if an error occurs, the integer is set to the
index of the attribute in error.
.RE
.IP Returns: 4
.RS
.IP 0
if successful.
.IP >0
error number if not successful.
.RE
.LP
The setting of the requested attributes is treated as an atomic operation:
all are set or none are.  This is accomplished by calling
.I attr_atomic_set()
which duplicates the attribute values and updates the copies with the
new values.  If any error occurs, the copies are removed by calling
.I attr_atomic_kill() .
.LP
For each and every modified attribute, the original parent object attribute
is cleared and set to the temporary (new) value.  If there is an 
.I at_action()
routine associated with the attribute, it is invoked.
.LP
When all modifications have been completed successfully, the temporary new
attributes are removed.  Note, the values are not freed because the
real attributes point to the values where malloc-ed storage is involved.
.LP
If a specified attribute is not found in the attribute definition array,
if the attribute cannot be written with the client privilege, or the attribute
is read-only,  the integer pointed to by
.At bad
is set to the number, starting with 1, of the attribute's ordinal position in
the request list.  An error value is returned.
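.LP
The all-or-none behavior and the meaning of the bad index can be shown
with a minimal sketch.
It is not the PBS implementation: attributes are reduced to plain
integers, struct change and set_attrs_atomic() are hypothetical, and a
local copy of the array plays the role of the duplicated attributes
described above.
.Cs
```c
#include <assert.h>
#include <string.h>

#define NATTR 3

struct change { int index; int value; };   /* hypothetical request entry */

/*
 * Apply all changes or none.  writable[i] is nonzero if attribute i may be
 * written.  On failure *bad gets the 1-based position of the offending
 * request entry and the real array is left untouched.
 */
static int set_attrs_atomic(int attrs[NATTR], const int writable[NATTR],
                            const struct change *req, int nreq, int *bad)
{
    int copy[NATTR];
    int i;

    memcpy(copy, attrs, sizeof copy);          /* work on duplicates */
    for (i = 0; i < nreq; i++) {
        int idx = req[i].index;

        if (idx < 0 || idx >= NATTR || !writable[idx]) {
            *bad = i + 1;
            return 1;                          /* copies are simply dropped */
        }
        copy[idx] = req[i].value;
    }
    memcpy(attrs, copy, sizeof copy);          /* commit all new values */
    *bad = 0;
    return 0;
}
```
.Ce
.LP
Working on a copy keeps the failure path trivial, since nothing in the
real array has to be rolled back.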
.Fn mgr_unset_attr()
.Cs
int mgr_unset_attr(attribute *patr, attribute_def *padef, int numattr,
                   svrattrl *plist, int privilege, int *bad)
.Ce
.IP Args: 4
.RS
.IP patr
pointer to the attribute array in the server or queue to be unset.
.IP padef
pointer to the attribute definition array for the object's attributes.
.IP numattr
the integer number of the attributes in the definition array.
.IP plist
pointer to the list of svrattrl elements in the batch request.
.IP privilege
level of the client.
.IP bad
RETURN: pointer to integer, if an error occurs, the integer is set to the
index of the attribute in error.
.RE
.IP Returns: 4
.RS
.IP 0
if successful.
.IP >0
error number if not successful.
.RE
.LP
If a named attribute is not found in the attribute definition array or
the attribute cannot be written with the client privilege,
the integer pointed to by
.At bad
is set to the number, starting with 1, of the attribute's ordinal position in
the request list.  An error value is returned.
.LP
If the attribute(s) specified in the request are not resources,
the appropriate at_free() routine is called for
each attribute of the parent object, queue or server, listed in the request.
This also results in the flag
.Sc ATR_VFLAG_SET
being cleared.
.LP
If the attribute(s) are of type resource, 
.Sc ATTR_TYPE_RESC ,
and if a specific resource (member)
is not specified, the attribute is freed as above.  If however, a specific
resource member is given, that member only is freed.
.QP
Kludge Warning
.LP
The server attribute \*Qresources_cost\*U,
.Sc SRV_ATR_resource_cost ,
is set as a resource type attribute, i.e. the type field is set to
.Sc ATTR_TYPE_RESC .
This is because they relate to the different resource names.
However, the structures in the value list are not resource structures, but are
.I resource_cost
structures.  Therefore, when unsetting a single member of this attribute, the
at_free() routine associated with
.I "the resource"
cannot be used; the value is just unlinked and freed.  Rather than set up a
new attribute type and have to check it wherever the server checks for
ATTR_TYPE_RESC, in this one place in mgr_unset_attr(), we special case
resource_cost structures by checking that the attribute parent type is
.Sc PARENT_TYPE_SERVER
as only the server has this type of resource and that the index into the
attribute definition array is 
.At SRV_ATR_resource_cost .
.Fn mgr_node_set()
.Cs
void mgr_node_set(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to the batch request structure that holds the specific node request.
.RE
.LP
If the request is to apply to all nodes at the server, the local flag 
.At allnodes
is set.  Otherwise, the server's array of pbsnode structs is searched for the
node specified in the request.  If the node is not found in the server's array, the
value
.Er PBSE_UNKNODE
is sent back to the requestor.
.LP
If the node is found in the server's pbsnode array or the request applies to all
nodes, the request is logged with the server and function 
.I mgr_set_node_attr()
is called for each node in the request in an attempt to satisfy it.  Assuming the
entire request was able to be satisfied, 
.I reply_ack
is called to send back the simple acknowledgement message and function
.I write_node_state
is called if any changed node-state information needs to be permanently recorded
by the server.  The function then returns.
.LP
If for some reason the node modification request could not be satisfied, 
.I mgr_set_node_attr
returns with a nonzero return code.  The specific return code indicates the type
of error encountered.
.LP
For the case where an error has occurred and the modifications
were intended for a specific node, an appropriate reply message is generated and
returned to the requestor, along with the error code, either by calling 
.I req_reject
or
.I reply_badattr.
In the latter case the variable 
.At bad
will contain the node-attribute (its\ list\ position) that created a problem
for the request.  The function then returns.
.LP
For the case where an error has occurred and the modifications
are intended for all nodes, a pointer to the failed node is recorded in an array
and processing advances to the next of the server's nodes.  After processing all nodes,
the array of failed nodes is scanned to construct a reply message listing those nodes
that failed to get modified.  This generated message is passed to the function
.I reply_text().
Following this, the memory allocated for the temporary recording of failures and message
building is freed, and the function returns.
.Fn mgr_node_delete()
.Cs
void mgr_node_delete(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq
.br
pointer to the batch-request structure holding the specific node request.
.RE
.LP
Top level function for deleting a node (or all nodes) in the server's
node list.  The pbsnode will be marked as deleted.  It will no longer
be assigned to any new jobs and will no longer be pinged in the server's
main loop.  Any current job tasks will continue executing on the node
until they terminate, the job aborts, or the job is killed.
.LP
.nh
A check is made to determine if the node specification in the request is
valid. If it is, the node is effectively deleted from the server's internal
node list by calling
.I effective_node_delete.
At this point the function 
.I chk_characteristic
is called to determine if the node is also marked as 
.At INUSE_OFFLINE.
If it is so marked, an indicator is set to signal the fact that the file
.I node_status,
which tracks nodes that are offline, must be updated.  Likewise, a global
indicator,
.I svr_chngNodesfile,
is set to alert the server that the nodes file needs to be regenerated from
the server's internal pbsnode list.  Finally, the function
.I reply_ack
is called to send an acknowledgement of the request, an indication of success.
.hy
.LP
If the batch request cannot be successfully completed, an appropriate
reply is sent back to the requester.  During a global modification,
a list of the nodes that could not be modified is sent back as part of
that reply.
.Fn mgr_node_create()
.Cs
void mgr_node_create(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq +2
pointer to the batch-request structure holding the specific node request.
.RE
.LP
Top level function for creating a node for the server's internal
node list.  After the pbsnode is initialized, any properties or state or
node type that has also been specified in the batch-request is set
on the pbsnode by calling the function
.I mgr_set_node_attr.
.LP
Assuming that all of this occurs successfully, a global indicator is set
which will, at server shutdown, cause the 
.I nodes
file to be regenerated based on the server's current internal pbsnodes list.
In addition, on successful pbsnode creation,
each pbsnode which has not been "effectively deleted" from the server's
list will have its
.At INUSE_NEEDS_HELLO_PING
bit set in the pbsnode's
.I inuse
field.  This causes the
.I ping_nodes
function, called periodically in the server's main loop, to send a
.At HELLO
message to the MOM on the node being pinged.  This message ultimately leads
to the server sending to the MOM on the node in question all of the IP
addresses for all the non-deleted nodes that it has in its list.  The MOM
can then update its internal set of 
.I okclients,
those nodes from whom a communication is deemed valid. Finally, function
.I reply_ack
is called to send the requester an acknowledgement that the request succeeded.
.LP
If the pbsnode creation does not meet with success, the reason for the failure
shows up in an error return code (rc) variable and that is processed to
generate an appropriate reply to the requester.  The possible error codes are
.At "PBSE_NONODES, PBSE_NODEEXIST, PBSE_SYSTEM, PBSE_INTERNAL, PBSE_NOATTR, PBSE_MUTUALEX, PBSE_BADNATVAL.
.Fn mgr_set_node_attr()
.Cs
static int mgr_set_node_attr(struct pbsnode *pnode, attribute_def *pdef,
                             int limit, svrattrl *plist, int privil,
                             int *bad, void *parent, int mode)
.Ce
.IP Args: 4
.RS
.IP pnode
.br
pointer to pbsnode structure needing modification
.IP pdef
beginning of the definitions array for node attributes
.IP limit
.br
length of the node-attribute definitions array
.IP plist
.br
pointer to the batch request's list of svrattrl structures
.IP privil
.br
requester's privilege level
.IP bad
for a "bad node-attribute" type of error, pass back the offending
attribute's position in the request list
.IP parent
may go unused in this function
.IP mode
passed to attrib's action func, not currently used by this func
.RE
.IP Returns: 4
.RS
.IP "  0" +2
if successful in doing all modifications
.IP >0
return code if a problem occurs (modifications are rolled back)
.RE
.LP
This function is called by the top level function
.I mgr_node_set.
A successful (0) return means that all requested modifications to the
data fields of the specified pbsnode have been made.
If a problem occurs in mid-process, any partially completed modifications
are abandoned, allocated memory is freed, an error return code indicating
the source of the problem is passed back to the caller, and no modification is
made to the subject pbsnode.
.LP
Processing occurs as follows:
.br
Space for a temporary array of node-attribute structures is acquired on the heap.
The number of node-attribute structures requested is the number that reside in
the definitions file, 
.I server/attr_node_def.c.
For each attribute in the array, that attribute's "action" function in the definition
is called to initialize the node-attribute.  Any error that
occurs midway through halts the process, with function
.I attr_atomic_kill
being called to facilitate the roll back of the processing that has occurred
to this point and an error code is passed back to the caller.
.LP
Once the temporary node-attribute array is set up, it is passed to function
.I attr_atomic_node_set
along with the list of requested node-attribute changes specified in the
batch request received by the server.  Attr_atomic_node_set calls
upon the "decode" and "set" functions for each node-attribute specified
in the request.  Assuming this process is successful for the entire request,
the temporary node-attribute array will have been updated appropriately and
those node-attributes of the array that received an update have been marked.
Should any problem occur midway through the process, function
.I attr_atomic_kill
is called upon to roll back the processing and a non-zero
error return code is passed back to the caller, for use in shaping a
reply.
.LP
Given success to this point, a temporary copy (tnode) of
the pbsnode is now updated using the data from the node-attributes in the temporary
array.  With success at this step, the temporary pbsnode gets copied back to
the original pbsnode, the temporary node-attribute array is freed and success (0)
is returned.  Any failure during update of the temporary node gets handled as before.
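.LP
The copy, modify, and commit sequence described above can be sketched as
follows.  This is a minimal illustration under stated assumptions, not the
server's code: the structure demo_node, the function try_update(), and the
validation rules are hypothetical stand-ins for the pbsnode and its
node-attribute processing.
.Cs
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pbsnode; not the real structure. */
struct demo_node {
    int  ntype;
    char props[32];
};

/*
 * Apply the requested changes to a temporary copy and commit only if
 * every step succeeds.  Returns 0 on success, or a nonzero code with
 * *pnode left untouched.
 */
int try_update(struct demo_node *pnode, int new_ntype, const char *new_props)
{
    struct demo_node tnode;

    assert(pnode != NULL && new_props != NULL);
    tnode = *pnode;                         /* temporary copy */

    if (new_ntype < 0)
        return 1;                           /* bad value; nothing committed */
    tnode.ntype = new_ntype;

    if (strlen(new_props) >= sizeof(tnode.props))
        return 2;                           /* too long; nothing committed */
    strcpy(tnode.props, new_props);

    *pnode = tnode;                         /* commit all changes at once */
    return 0;
}
```
.Ce
.LP
Because the original is overwritten only in the final assignment, a failure
at any earlier step needs no explicit undo of the original node.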
.Fn mgr_log_attr()
.Cs
static void mgr_log_attr(svrattrl *list, int log_class, char *object_name)
.Ce
.IP Args: 4
.RS
.IP list
of svrattrl structures containing the attributes from the request.
.IP log_class
an object class defined in log.h.
.IP object_name
the name of the parent object, queue or server.
.RE
.LP
For each attribute modified by a Manager Request, a log entry is formatted as:
.Cs
attributes set: attribute_name =|+|- value
.Ce
and written to the log file.
.Fn set_queue_type()
.Cs
int set_queue_type(attribute *pattr, void *pque, int mode)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the 
.At QA_ATR_QType
attribute.
.IP pque
pointer to the queue being created or modified.
.IP mode
(unused).
.RE
.LP
This is the
.I at_action()
routine for the queue type attribute.
The new string value of the 
.At QA_ATR_QType
attribute is checked against the allowable values:
.Av Execution
and
.Av Route .
The match is made regardless of case, and the string may be shortened to any
leading substring of the full name.
The attribute value string is replaced with a copy of the full string for
consistency in the status display.
.LP
The internal type representation, qu_type, is also set.
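.LP
The matching rule above (case-insensitive, any leading substring accepted,
full spelling substituted) might look like the sketch below.  The names
demo_qtype, is_prefix_nocase(), and match_queue_type() are hypothetical, and
rejecting an empty string is an assumption; the text does not say how the
server treats that case.
.Cs
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical type codes; the server's real constants may differ. */
enum demo_qtype { DEMO_QTYPE_UNSET, DEMO_QTYPE_EXECUTION, DEMO_QTYPE_ROUTE };

/* Return 1 if val is a non-empty, case-insensitive prefix of full. */
static int is_prefix_nocase(const char *val, const char *full)
{
    if (*val == 0)
        return 0;
    for (; *val != 0; val++, full++) {
        if (*full == 0 ||
            tolower((unsigned char)*val) != tolower((unsigned char)*full))
            return 0;
    }
    return 1;
}

/*
 * Map a user-supplied string to a queue type and report the full
 * spelling through *canon.  Returns DEMO_QTYPE_UNSET on no match.
 */
enum demo_qtype match_queue_type(const char *val, const char **canon)
{
    assert(val != NULL && canon != NULL);
    if (is_prefix_nocase(val, "Execution")) {
        *canon = "Execution";
        return DEMO_QTYPE_EXECUTION;
    }
    if (is_prefix_nocase(val, "Route")) {
        *canon = "Route";
        return DEMO_QTYPE_ROUTE;
    }
    *canon = NULL;
    return DEMO_QTYPE_UNSET;
}
```
.Ce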
.Fn check_que_enable()
.Cs
int check_que_enable(attribute *pattr, void *pque, int mode)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the queue 
.At Enabled 
attribute.
.IP pque
pointer to the queue being enabled.
.IP mode
(unused).
.RE
.IP Returns: 4
.RS
.IP 0 
if queue completely defined.
.IP Non-zero
error if the queue has not been defined sufficiently to determine its type.
.RE
.LP
This function is the 
.I at_action() 
function associated with the queue 
.At QA_ATR_Enabled
attribute.  It is called whenever the attribute is modified.
If the queue type has not yet been set, the enable is disallowed and
.Er PBSE_QUENOEN
is returned.
.Fn check_que_attr()
.Cs
static char *check_que_attr(queue *pque)
.Ce
.IP Args: 4
.RS
.IP pque
pointer to the queue being modified or created.
.RE
.IP Returns: 4
.RS
.IP Null 
if no conflict is found.
.IP pointer
to the name of the attribute if one conflicts with the queue type.
.RE
.LP
This strangeness requires some explanation.  A queue can be either of
two types: execution or route.
Some queue attributes are common to both types, others are specific to a
single type.
Rather than have two attribute definition arrays, one is defined with
all possible queue attributes included.  The tentative type of the queue is
defined by \*Qusage\*U: the first type-specific attribute specified determines
the tentative queue type.  Once that has happened, no attribute is allowed
that is specific to a different type.  Thus the existence of this routine.
.LP
For each attribute set in the queue, it is determined
if the attribute is appropriate to the queue type.  If the queue type has
not yet been fixed, it is tentatively (internal to this routine) set to 
.Sc QTYPE_Unset ;
any attribute is then allowed, but the first attribute associated with only
one type of queue forces the queue to that type (internal to this routine).
.LP
If the attribute does not fit with the real or tentative queue type, 
a pointer to the name of the attribute is returned.
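.LP
The tentative-type rule can be illustrated with the following sketch.  It is
not the server's implementation: the enumeration que_kind, the structure
demo_attr, and the function demo_check_que_attr() are hypothetical, and the
real routine works from the queue's attribute definition array rather than
from a table like this one.
.Cs
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes and attribute table for illustration only. */
enum que_kind { KIND_UNSET, KIND_EXEC, KIND_ROUTE };

struct demo_attr {
    const char    *name;
    enum que_kind  only_for;    /* KIND_UNSET means valid for both types */
    int            is_set;
};

/*
 * Return NULL if every set attribute fits the queue type, otherwise the
 * name of the first attribute that conflicts.  An unset type is fixed
 * tentatively by the first type-specific attribute encountered.
 */
const char *demo_check_que_attr(enum que_kind qtype,
                                const struct demo_attr *attrs, size_t n)
{
    enum que_kind type = qtype;         /* local, tentative copy */
    size_t i;

    assert(attrs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!attrs[i].is_set || attrs[i].only_for == KIND_UNSET)
            continue;                   /* unset or common attribute */
        if (type == KIND_UNSET)
            type = attrs[i].only_for;   /* first specific one decides */
        else if (type != attrs[i].only_for)
            return attrs[i].name;       /* conflict */
    }
    return NULL;
}
```
.Ce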
.Fn manager_oper_chk()
.Cs
int manager_oper_chk(attribute *pattr, void *pobject, int actmode)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to server managers or operators attribute.
.IP pobject
pointer to parent object, the server. 
.IP actmode
the type of action affecting the attribute.
.RE
.IP Returns: 4
.RS
.IP 0
if no error
.IP "PBS error"
number, if error.
.RE
.LP
This is the
.I at_action ()
routine for two server attributes,
.At managers ,
and
.At operators .
When the list of those with manager or operator privilege is set, altered, or
otherwise modified, this routine is invoked.
The routine validates each list entry to ensure that the entry is in the form
.Ty user@fully.qualified.hostname
or
.Ty user@*.wildcard.domain .
This is done to ensure that the list is not created with an invalid host name
or a name that might be resolved in a different domain than was intended.
.LP
The user name must be followed by an '@' sign.
If the string after the '@' does not start with a wild card character, '*',
the string is used to obtain a fully qualified host name by calling
.I get_fullhostname() .
It is an error if the returned host name does not match the specified name.
If the string does start with '*', no additional checks are possible.
On any error during a set or alter action (resulting from a batch request),
.Er PBSE_BADHOST
is returned.  On a recovery action (server initialization), any improper
lines are logged, but no error is returned.  An error might occur here
if the access files were edited by hand.
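.LP
A sketch of the entry validation described above follows.  It is an
illustration only: demo_check_entry() and the demo_resolver callback type are
hypothetical, the callback stands in for the server's host name lookup (whose
interface is not shown in this section), and demo_fake_resolve() is a toy
resolver that qualifies bare names with a fixed example domain.  Rejecting an
empty user name is an assumption.
.Cs
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical resolver type: writes the fully qualified form of host
 * into full (of size len) and returns 0 on success.
 */
typedef int (*demo_resolver)(const char *host, char *full, size_t len);

/* Toy resolver: qualifies names that contain no dot. */
int demo_fake_resolve(const char *host, char *full, size_t len)
{
    const char *suffix = strchr(host, '.') ? "" : ".example.com";

    if (strlen(host) + strlen(suffix) >= len)
        return 1;
    strcpy(full, host);
    strcat(full, suffix);
    return 0;
}

/* Return 0 if entry looks like user@full.host.name or user@*.domain. */
int demo_check_entry(const char *entry, demo_resolver resolve)
{
    char full[256];
    const char *at;

    assert(entry != NULL && resolve != NULL);
    at = strchr(entry, '@');
    if (at == NULL || at == entry || at[1] == 0)
        return 1;               /* missing user name, at sign, or host */
    if (at[1] == '*')
        return 0;               /* wildcard: no further check possible */
    if (resolve(at + 1, full, sizeof(full)) != 0)
        return 1;               /* host could not be resolved */
    return strcmp(full, at + 1) == 0 ? 0 : 1;   /* must already be full */
}
```
.Ce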
.NH 4
.Ix node_func.c
.LP
The file
.I src/server/node_func.c
contains certain functions which are used in support of batch requests pertaining to nodes.
.LP
.Fn find_nodebyname()
.Cs
struct pbsnode    *find_nodebyname (char *nodename)
.Ce
.IP Args: 4
.RS
.IP nodename 10
pointer to the name of the node being sought
.RE
.IP Returns: 4
.RS
.IP 0 10
if node name isn't found in the server's node list or the server doesn't
have a list of nodes
.IP address
pointer to the pbsnode
.RE
.LP
This function walks the server's node list 
.Av pbsnlist
and returns the address of the pbsnode structure whose name field,
.Av last,
matches the name pointed to by
.Av nodename.
Zero is returned for the value of the pointer if no match is found or the list is empty.
.Fn save_characteristic()
.Cs
void    save_characteristic(struct pbsnode *pnode)
.Ce
.IP Args: 4
.RS
.IP pnode
.br
pointer to the pbsnode structure in question
.RE
.LP
Saves the 
.I characteristics
of the pbsnode along with the address of the pbsnode. These are saved to static
variables in the file and are later examined by the function,
.I chk_characteristic,
whose job it is to report back any changes in the node's characteristics to the
caller.
.Fn chk_characteristic()
.Cs
int    chk_characteristic(struct pbsnode *pnode, int *need_todo)
.Ce
.IP Args: 4
.RS
.IP pnode
.br
pointer to the pbsnode structure in question
.IP need_todo
return various bit flags into this location
.RE
.IP Returns: 4
.RS
.IP -1
current pbsnode address doesn't match that stored by
.I save_characteristic.
.IP \ 0
check was performed successfully and flag bits in
.I need_todo
got appropriately set/cleared
.RE
.LP
This function is the companion to function
.I save_characteristic(),
which should be invoked prior to the invocation of the current function.  If the function
is successful, the integer location pointed to by
.I need_todo
will hold the result of the check.  Currently, the results of the check are
encoded in a pair of bits in this integer location and are used in determining
whether or not file
.I nodes
should be updated from the internal pbsnode
array, and whether the file tracking nodes that are marked as being offline needs an
update.
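.LP
The save-then-compare pairing of these two functions can be sketched as
below.  The structure chk_node, the flag values, and the choice of which
fields are compared are all assumptions made for illustration; only the
overall pattern (remember the node's address and state in file-scope statics,
then report differences as bit flags) comes from the description above.
.Cs
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values and node layout; the real ones may differ. */
#define DEMO_INUSE_OFFLINE    0x01
#define DEMO_WRITE_NODE_STATE 0x01  /* offline-tracking file needs update */
#define DEMO_WRITENODE_FILE   0x02  /* nodes file needs update */

struct chk_node {
    int inuse;
    int ntype;
};

static struct chk_node *saved_node;     /* set by demo_save() */
static int saved_inuse;
static int saved_ntype;

/* Remember the node's address and current characteristics. */
void demo_save(struct chk_node *pnode)
{
    assert(pnode != NULL);
    saved_node  = pnode;
    saved_inuse = pnode->inuse;
    saved_ntype = pnode->ntype;
}

/* Return -1 if pnode is not the node last saved, else 0 with flags set. */
int demo_chk(struct chk_node *pnode, int *need_todo)
{
    if (pnode != saved_node)
        return -1;
    *need_todo = 0;
    if ((pnode->inuse ^ saved_inuse) & DEMO_INUSE_OFFLINE)
        *need_todo |= DEMO_WRITE_NODE_STATE;
    if (pnode->ntype != saved_ntype)
        *need_todo |= DEMO_WRITENODE_FILE;
    return 0;
}
```
.Ce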
.Fn status_nodeattrib()
.Cs
.IP "int  status_nodeattrib (svrattrl *pal, attribute_def *padef, struct pbsnode *pnode," 21
 int limit, int priv, list_head *phead, int *bad)
.Ce
.IP Args: 4
.RS
.IP pal 6
pointer to an svrattrl structure from the batch request
.IP padef
pointer to the array of node-attribute definitions
.IP pnode
pointer to the subject pbsnode
.IP limit
number of elements in the array pointed to by padef
.IP priv
requester's privileges
.IP phead
heads a list of svrattrl structs in the reply area of the batch request structure
.IP bad
if there is a node-attribute error in processing, record its list position here
.RE
.IP Returns: 4
.RS
.IP 0 6
all requested status information was successfully obtained
.IP !=0
some kind of error occurred; if it's a node-attribute error *bad returns the position;
an appropriate PBS error code is returned to the caller to shape the reply
.RE
.LP
This function is invoked when a batch status request regarding a node(s) is received
by the server.  It adds the status of each requested (or all) node-attribute to the status
reply.
.Fn initialize_pbsnode()
.Cs
.IP "void  initialize_pbsnode (struct pbsnode *pnode, struct prop *pname, ulong *pul, int ntype)"
.Ce
.IP Args: 4
.RS
.IP pnode
.br
pointer to pbsnode being initialized
.IP pname
pointer to a prop struct carrying the node's name
.IP pul
pointer to an array of unsigned longs.  Each entry holds an IP address
for this node
.IP ntype
.br
flag indicating whether to set the node to time-shared or cluster
.RE
.LP
This function is invoked to carry out initialization on any new pbsnode
being created via the
.nh
.I qmgr
.hy
command.  The assumption is that all the input parameters are valid.
This initialization parallels that done in function
.I setup_nodes
where the server reads the file
.I nodes
as part of its startup process.
.Fn effective_node_delete()
.Cs
.IP "void  effective_node_delete (struct pbsnode *pnode)"
.Ce
.IP Args: 4
.RS
.IP pnode +2
pointer to pbsnode being effectively deleted
.RE
.LP
The pbsnode pointed to by
.I pnode
is effectively deleted from the server's internal pbsnodes list.  This
is accomplished by setting the
.At INUSE_DELETED
bit on the inuse field, removing the prop list that hangs from the pbsnode
(including the name prop), and clearing any
.At INUSE_NEEDS_HELLO_PING
bit that might be set in the pbsnode's inuse field.  Depending on the node's
.I ntype
field, the server's count of time-shared nodes or its count of cluster nodes is
decremented by one.
.Fn setup_notification()
.Cs
.IP "void  setup_notification()"
.Ce
.LP
This function is invoked to set up the mechanism for notifying the other members
of the server's node pool that a new node was added manually via qmgr.
Actual notification really occurs some time later via the server's invocation
of the ping_nodes routine from within the server's main loop.
For each node that does not have its
.At INUSE_DELETED
bit set in the inuse field, the
.At INUSE_NEEDS_HELLO_PING
bit is set.  Setting of the bit causes the server to send a
.I "Hello Ping"
message to the node during the server's later invocation of the ping_nodes function.
The node responds with a
.At HELLO
and the server then builds and sends to the node a list of all the IP addresses
of all the non-deleted nodes that it has in its list. This message is read by
the MOM on the node being pinged and the new IP address-set gets used to update the tree of
.I okclients
for the MOM on that node.
.Fn process_host_name_part()
.Cs
.IP "int   process_host_name_part (struct batch_request *preq, ulong **pul," 28
struct prop **pname, int *ntype)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to a batch request (INPUT)
.IP pul
receives location of null terminated array with node's ip addresses (OUTPUT)
.IP pname
receives location of a struct prop with name field that of the node in the batch_request (OUTPUT)
.IP ntype
.br
address of an integer location. Records into this integer whether the node is to be of type time-shared
or of type cluster (OUTPUT)
.RE
.IP Returns: 4
.RS
.IP 0 6
Success
.IP !=0
An error code
.At (PBSE_UNKNODE, PBSE_SYSTEM)
.RE
.LP
When invoked, this function does the following: it processes into a prop structure the hostname portion
of a batch request involving a node, gets that host's set of IP addresses into
an array, and places a code for the node's specified node-type (cluster/time-shared) into an integer
variable.  If the object name contained in the batch request is not null and that
name is a valid host name, a prop structure is allocated on the heap to hold the name.
The IP addresses for the node are obtained from the system and written to a null terminated
array of ints allocated on the heap.  The locations of these data structures are passed back via the
calling parameters, as is an indication of whether the request is for a node of type
time-shared or cluster.
.Fn update_nodes_file()
.Cs
int update_nodes_file()
.Ce
.LP
When called, this function will attempt to update the
.I nodes
file of the server.
It walks the server's array of pbsnodes, constructing
a line for a new nodes file for each entry that is not marked as deleted.
The lines are written to a temporary file which subsequently, after all
node processing is done, replaces the current nodes file.  If any system
errors happen along the way, the temporary file, if it exists, is closed
and removed and the original nodes file is not modified.
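.LP
The write-to-temporary-then-replace technique described above can be sketched
with standard C library calls.  The function demo_replace_file() and its
parameters are hypothetical, and the sketch assumes POSIX behaviour for
rename(), which replaces an existing destination file; the server's actual
file names and line format are not shown here.
.Cs
```c
#include <assert.h>
#include <stdio.h>

#define DEMO_NEWLINE 10     /* ASCII line feed */

/*
 * Write the n lines to tmp_path, then rename it over path.  On any
 * error the temporary file is closed and removed and path is left
 * unchanged.  Returns 0 on success, -1 on failure.
 */
int demo_replace_file(const char *path, const char *tmp_path,
                      const char *const *lines, int n)
{
    FILE *fp;
    int   i;

    assert(path != NULL && tmp_path != NULL);
    fp = fopen(tmp_path, "w");
    if (fp == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        if (fputs(lines[i], fp) == EOF ||
            fputc(DEMO_NEWLINE, fp) == EOF) {
            fclose(fp);
            remove(tmp_path);
            return -1;
        }
    }
    if (fclose(fp) != 0 || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```
.Ce
.LP
Because the existing file is touched only by the final rename, a failure
while writing leaves the previous contents in place.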
.LP
This function gets called by various primary functions in the
.I req_manager.c
file whenever a node is created/deleted or its properties/ntype modified.  Should
the function return an error for some reason, a global indicator
.I svr_chngNodesfile
is set, signaling that this function ought to be called during the server's
shutdown process.
.Fn recompute_ntype_cnts()
.Cs
void recompute_ntype_cnts()
.Ce
.LP
This function walks the server's array of pbsnodes and,
for each entry that is not marked as deleted, notes its ntype value and
increments the appropriate local counter (time-shared or cluster).
.LP
The server's global node counters,
.I svr_clnodes
and
.I svr_tsnodes,
are then replaced by the values from these local counters.
.NH 4
.Ix req_messagejob.c
.LP
The file
.I src/server/req_messagejob.c
contains the server function for processing the Message Job batch request.
.LP
.Fn req_messagejob()
.Cs
void req_messagejob(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
If the job is not in state
.Sc JOB_STATE_RUNNING
and substate 
.Sc JOB_SUBSTATE_RUNNING
the request is rejected with
.Er PBSE_BADSTATE .
.LP
The request is forwarded to the MOM responsible for the running job by calling
.I relay_to_mom() .
The action will be picked up in 
.I post_message_req()
when MOM replies.
.Fn post_message_req()
.Cs
static void post_message_req(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
pointer to the work task entry.
.RE
.LP
When MOM replies to a relayed Message Job Request, the delayed child work
task entry points to this function.  All it does is reply to the client
with an acknowledge or reject based on the code in the reply from MOM.
.NH 4
.Ix req_modify.c
.LP
The file
.I src/server/req_modify.c
contains the server function for processing the Modify Job batch request.
.LP
.Fn req_modifyjob()
.Cs
void req_modifyjob(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
It is critical that the Modify Job request be atomic: either all of the
attribute modifications are performed or none are.
Therefore the function
.I attr_atomic_set()
is used to perform the set.
.LP 
Certain checks must be made first if the job is in the
.Sc JOB_STATE_RUNNING
state.
If so, each specified attribute or resource is identified by calling
.I find_attr() .
If the attribute or resource is not marked as alterable in the run state,
that is, the flag
.Sc ATR_DFLAG_ALTRUN
is not set, then the request is rejected with
.Er PBSE_MODATRRUN .
Which resources are alterable when a job is running depends on MOM's ability
to update the limit.  Polled limits such as walltime can be updated on any
host.  On systems which use the 
.I setrlimit()
system call, those system enforced limits are not updatable since they 
can only be set by the process which they control.
.LP
The routine 
.I modify_job_attr()
is called to perform the set operation.  
If an error is detected, the attributes and resources are not updated and
the error is returned to the user.
The routine
.I set_resc_deflt() 
is called to set to the default values any Resource_List values which may
have been unset.
.LP
If the job is not currently running, 
.I svr_evaljobstate()
and
.I svr_setjobstate()
are called to review and update the job state.  Svr_setjobstate will also
save the job structure and updated attributes to disk.
.LP
If a resource limit for a running job is being changed, 
.I relay_to_mom()
is used to forward the request to MOM.  When the reply is received, 
.I post_modify_req()
is invoked.
.Fn modify_job_attr()
.Cs
int modify_job_attr(job *pjob, svrattrl *list, int permission, int *bad)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job whose attributes are to be modified.
.IP list
pointer to the first member of a list of svrattrl structures containing
the new attribute values from the modify request.
.IP permission
of the client from the request.
.IP bad
RETURN:  pointer to an integer in which the index of the first bad attribute
is returned.
.RE
.IP Returns: 4
.RS
.IP 0
if ok.
.IP non-zero 
error number if error.
.RE
.LP
The function
.I attr_atomic_set()
is called to decode and set a copy of the job attributes.
If the set is unsuccessful, the copies are freed and the error is returned.
.LP
If one or more resource limits are being changed, additional checks are made:
If the job is running, only a manager or operator is allowed to raise them.
The function
.I comp_resc() 
is used to compare the current and new values.
If the job is not running, the limits may be adjusted up or down, but must
remain within the queue minimum and maximum as established by
.At QA_ATR_ResourceMax
and
.At QA_ATR_ResourceMin .
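.LP
The two resource-limit rules above can be expressed as a small decision
function.  This is a sketch only: demo_check_limit(), its result codes, and
the use of plain long values are hypothetical simplifications, whereas the
server compares resource values with its own comparison routine.
.Cs
```c
#include <assert.h>

/* Hypothetical result codes for illustration. */
enum { DEMO_OK = 0, DEMO_PERM = 1, DEMO_EXCLIMIT = 2 };

/*
 * Decide whether a limit may change from cur to new_val.
 * running:    nonzero if the job is running
 * privileged: nonzero if the requester is a manager or operator
 * qmin, qmax: queue minimum and maximum for this resource
 */
int demo_check_limit(long cur, long new_val, int running, int privileged,
                     long qmin, long qmax)
{
    assert(qmin <= qmax);
    if (running) {
        if (new_val > cur && !privileged)
            return DEMO_PERM;       /* only manager/operator may raise */
        return DEMO_OK;
    }
    if (new_val < qmin || new_val > qmax)
        return DEMO_EXCLIMIT;       /* outside the queue's bounds */
    return DEMO_OK;
}
```
.Ce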
.LP
If there are no errors,
each modified attribute value replaces the original.
The original attribute value is freed and the new value inserted.
It is important to note that the attribute copy value is not freed, it
now belongs to the original.
It should also be noted that for those attributes whose value is
represented by a linked list, the first and last list elements must be relinked;
this is accomplished by calling
.I list_move() .
.LP
If there is an 
.I at_action()
routine associated with the attribute, it is invoked.
If there are any failures, an error reply is returned.
If either the 
.At User_List
or
.At group_list
attributes changed, then 
.I set_jobexid()
is called to determine the effective execution user and group names.  This is
done outside of any at_action routine because it involves two inter-dependent
attributes.
.LP
Finally, the job modified flag, ji_modified, is set.
.Fn post_modify_req()
.Cs
static void post_modify_req(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
pointer to work task established by 
.I relay_to_mom() .
.RE
.LP
This function is invoked to process the return from MOM of a modify job request.
The connection to MOM is closed and the original request is reset to point
back to the original client connection.  If there was an error, it is logged
and the error code returned to the client.
.NH 4
.Ix req_movejob.c
.LP
The file
.I src/server/req_movejob.c
contains the server function for processing the Move Job batch request.
.Fn req_movejob()
.Cs
void req_movejob(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
The job must be in one of the following states:
.Sc JOB_STATE_QUEUED ,
.Sc JOB_STATE_HELD ,
or
.Sc JOB_STATE_WAITING ,
otherwise the request is rejected.
.LP
If the destination is another queue on this server, the state of the destination
queue and the authorization of the user to access that queue is checked.
If the job may be moved, it is dequeued by calling
.I svr_dequejob()
and queued in the new destination by calling
.I svr_enquejob() .
A success reply is returned to the client.
If the job cannot be moved, the request is rejected.
.LP
If the destination is on a different server, the destination specified in the
request is saved in the job structure member
.I ji_destin
and
.I ji_un_type
is set to
.Sc JOB_UNION_TYPE_ROUTE .
.LP
The function
.I create_child_entry()
is called to create a child process table entry of type 
.Sc Child_ROUTER .
The batch request is linked into the list headed in the child table entry
field cp_deferred.
The job state and substate are set to
.Sc JOB_STATE_TRANSIT
and
.Sc JOB_SUBSTATE_TRNOUT .
A child process is created by calling
.I fork() ;
the cp_pid field of the child process table entry is updated by the parent.
The parent server returns to continue processing other events and requests.
.LP
The child process calls the function
.I svr_routejob()
to perform the move operation.  The child process will generate and create
the various subrequests which are part of the Queue Job batch request.
If any errors or network time outs occur, an error code is returned as
the child process exit status.
.Fn req_orderjob()
.Cs
void req_orderjob(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
This function provides the batch service for the Order Job batch request.
This request is to swap the positions of two jobs in a queue.
The requestor must have permission to operate on both jobs; be owner of both
or be privileged.
Neither job can be running, or
.Er PBSE_BADSTATE
is returned.
.LP
If the two jobs are in the same queue, the problem is fairly simple.
The list_link function,
.I swap_link ()
is called twice, first to swap the position of the two jobs in the server's
all job list and the second time to swap positions in the queue list.
The 
.At JOB_ATR_qrank ,
\*Qqueue_rank\*U, is also exchanged between the two jobs so they will
be correctly ordered if the server is restarted.  Both jobs are saved to disk
to record the queue rank change.
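.LP
Swapping two entries in a doubly linked list, together with their ranks, can
be sketched as follows.  The structure demo_link and the function demo_swap()
are hypothetical; the sketch assumes a circular list with a sentinel head and
folds the rank exchange into the same function, whereas the server calls its
own list routine once per list and exchanges the rank attribute separately.
.Cs
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly linked list with a sentinel head. */
struct demo_link {
    struct demo_link *prev;
    struct demo_link *next;
    int               rank;     /* stands in for queue_rank */
};

/* Exchange the list positions of a and b, and their ranks. */
void demo_swap(struct demo_link *a, struct demo_link *b)
{
    struct demo_link *pa, *na, *pb, *nb;
    int r;

    assert(a != NULL && b != NULL);
    if (a == b)
        return;
    if (b->next == a) {         /* make a the earlier of an adjacent pair */
        struct demo_link *t = a;
        a = b;
        b = t;
    }
    pa = a->prev;  na = a->next;
    pb = b->prev;  nb = b->next;

    pa->next = b;  b->prev = pa;
    nb->prev = a;  a->next = nb;
    if (na == b) {              /* adjacent: b now directly precedes a */
        b->next = a;  a->prev = b;
    } else {
        b->next = na;  na->prev = b;
        a->prev = pb;  pb->next = a;
    }
    r = a->rank;  a->rank = b->rank;  b->rank = r;
}
```
.Ce
.LP
Exchanging the ranks along with the positions keeps the stored order
consistent with the in-memory order, which is what allows the jobs to be
placed correctly after a restart.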
.LP
When the two jobs are in different queues, extra checks must be made to
be sure that each job is allowed into the other's queue.  The function
.I svr_chkque()
is called for both of the jobs with the opposite queue header.
If the two jobs are allowed in the opposite queue, the rank in
.At JOB_ATR_qrank ,
is swapped as above.  The parent queue name in the job structure,
.Ar ji_queue ,
is swapped and the two jobs are dequeued from their existing queue and
requeued into the other.  This ensures that the current queue attribute and
queue type are updated.  A "Q" record is also produced for the accounting log.
.NH 4
.Ix req_register.c
.LP
The file
.I src/server/req_register.c
contains the functions to deal with the Register Dependent Job batch request
as well as additional dependency related functions.
This file and the similarly named function are mis-named since the register
operation is only one of several dependency related operations.
It is just too much bother to go back and change all the references.
It is quickly determined by the reader that this set of functions is the
strangest and most difficult to understand in all of PBS.  An extra credit
\*QA\*U is hereby given to the reader that figures it all out.
.Fn req_register()
.Cs
void req_register (struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
This function provides the batch service in response to a Register Dependent
batch request.  The request may ask for one of four operations.
.LP
Three of the operations are used for the non-synchronous dependencies
and are fairly straight forward:
.IP \s-1REGISTER\s+1
This operation registers a dependency relation between parent and child.
It results from an 
.Av after*
dependency attribute on the child or a
.Av before*
dependency attribute on the parent.
Included in the request is the full type of dependency and the id of the
registering job.  When received, the server will set up a mirror image
type dependency attribute.
This will remind the server to send notification to the child
job when the parent reaches the specified state.
.IP
Note, if the server is built with 
.Sc PBS_DEPENDENCY_SECURE=1 ,
then any Register Dependent batch request must be from the owner of the job
affected.  This prohibits cross user dependencies.
If the server is built with
.Sc PBS_DEPENDENCY_SECURE=0 ,
then Register Dependent batch requests which \*Qregister\*U the \*Qafter\*U
types and corresponding \*Qrelease\*U operations described below,
are accepted even when the requesting user does not own the affected job.
This allows cross-user dependencies,  however with the check on the ready
request described below, the only type of dependency that can be established
by another user is one where that user's job runs 
.I after
another user's.  This prevents a user from delaying or expediting another
user's program execution.  
.\" .IP \s-1UNREGISTER\s+1
.\" This operation directs the server to remove a dependency.  This 
.\" typically results when a child job is modified (qalter) and existing
.\" dependencies are replaced with others or none.
.IP \s-1RELEASE\s+1
This operation is sent from the parent to the child's server.  It indicates
that the specified 
.Av after*
dependency has been satisfied and can be removed from the dependency list.
When all 
.Av after*
dependencies have been removed, the hold is removed from the child and it is
free to run.
.IP \s-1DELETE\s+1
The DELETE operation requests that a server abort a dependent job.  It is sent
to dependent jobs whose dependency cannot be satisfied.
For example, if Job-B is dependent on Job-A terminating normally
(exit status of zero) and Job-A terminates abnormally, then the server managing
Job-A will send a DELETE operation to the server managing Job-B.
.IP \s-1UNREGISTER\s+1
The UNREGISTER operation is the reverse of the register.  An existing relationship
between the job and a child is to be removed.  Either
.I unregister_sync()
or
.I unregister_dep()
is called, depending on the dependency type.
.LP
If the dependency type is synchronous, the work is a bit more involved.
There are three approaches that could be taken here:
.IP 1.
When the \*Qmaster\*U job has received a register operation from all other jobs
in the set, the server will send a release operation to each job.  This
will remove the system hold and allow the job to begin to compete for
resources.  When each job has its resources, it would notify the master;
when all have their resources at the same time, a run request would be sent
to each.
.IP 2.
As each job registers, it sends the \*Qcost\*U of its required resources.
When all jobs have registered, the job with the highest resource cost is
released from its hold.  When that job is scheduled, the lower cost jobs
are \*Qforced\*U into running as well.
.IP 3.
As each job registers, it sends the \*Qcost\*U of its required resources.
When all jobs have registered, the job with the lowest resource cost is
released from its hold.  When that job is scheduled and begins to run, it
notifies the master.  Then the job with the next lowest resource cost is
released.  This continues until all jobs are placed into execution.
.LP
Approach 1 seems too complex to work.  Without a master scheduler, it is
unlikely on loaded systems that all jobs would have resources available at
the same time.  To keep from choking the system, the jobs could not hold
onto their slot forever, but would have to time out and release their run
window.  Approach 2 is a possibility but might lead to thrashing when the
lower cost jobs are \*Qforced\*U to run.  But it might still work.
.LP
To start with, the PBS team chose to go with approach 3, even though it is
the least synchronous of the approaches.
.IP \s-1REGISTER\s+1
The register operation dependency request is sent to the server managing the
\*Qmaster\*U job.
This establishes the link from the master back to the child and 
reports the cost of resources for each job.  It is also used to update the
location of the child/master when that job moves.
.IP
Requests must be from the job owner regardless of the setting of
.Sc PBS_DEPENDENCY_SECURE .
.IP \s-1RELEASE\s+1
When a Register operation has been received from each expected dependent child, and
when a Ready operation is received from a previously released job, the master
server will send a release request to release the hold on the job with the
cheapest resource cost which has not yet been released.  This allows that job
to fight for resources (be scheduled).
.IP \s-1READY\s+1
When a child job is able to obtain its resources (has been scheduled), a
.I Ready
operation is sent to the master.  When the master's server receives a Ready
operation from a child, as described above, it will release the next cheapest
job until all have been released.
.LP
.Fn alter_unreg()
.Cs
static void alter_unreg(job *pjob, attribute *old, attribute *new)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job being altered.
.IP old
pointer to job's current (old) dependency attribute.
.IP new
pointer to job's new (as altered) dependency attribute.
.RE
.LP
For any dependency currently established for the job which is being
deleted (is not in the new [altered] attribute), an unregister,
.Sc JOB_DEPEND_OP_UNREG ,
operation is sent to the partner job.  This deletes the corresponding dependency
listed with that job.  This routine is called by
.I depend_on_que()
when it is acting as the at_action() routine for the dependency attribute.
.Fn depend_on_que()
.Cs
int depend_on_que(attribute *pattr, job *pjob, int mode)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the dependency attribute.
.IP pjob
pointer to a job structure.
.IP mode
is the at_action mode.
.RE
.IP Returns: 4
Zero if ok, non-zero otherwise.
.LP
The function is called on two events.  When a job is moved into an execution
queue, it is called from
.I svr_enquejob()
with a mode of
.Sc ATR_ACTION_NOOP .
When the dependency attribute is altered, it is called as the
.I at_action
routine for the dependency attribute with a mode of
.Sc ATR_ACTION_ALTER .
In either case, we want
the actions to only happen if the job is in an execution queue so jobs are
not held in routing queues.
.LP
For the alter case only, existing dependencies could be deleted, so
.I alter_unreg()
is called to check for that possibility.
.LP
If the job has dependencies which required placing a system hold on the job,
that is done by calling
.I set_depend_hold() .
.LP
If the job has 
.Av SYNCCT
dependency, the (master) job's resource cost is calculated by
.I calc_job_cost() 
and an entry in the syncct list is created (as if a Register operation request
had been received) by calling
.I register_sync() .
If all jobs have registered (unlikely in this case as this is the master job),
.I release_cheapest()
is called to send a release to the cheapest job.
.LP
For all other dependency types except 
.Av JOB_DEPEND_TYPE_ON 
(all Before and After types), a Register Dependency \(em Register operation is
sent to the parent job (job on which the dependency is based).
.Fn post_doq()
.Cs
static void post_doq(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
pointer to work task entry created by issue_request().
.RE
.LP
This routine is the call back routine invoked when the reply is received to a
Register dependency request sent from within depend_on_que().
If the request was rejected, then the job for which the request was sent is
aborted.
.Fn depend_on_exec()
.Cs
void depend_on_exec(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to a job structure.
.RE
.LP
This routine is called when a job with dependencies goes into execution.
If the job has
.Av BEFORESTART
dependencies, a Register Dependencies \(em Release message is sent to each job
in the set.
If the job is a member of a sync set and not the master (has dependency of
.Av SYNCWITH ),
a Register Dependencies \(em Ready message is sent to the master stating that
this job is about to run.
If the job is the master of a sync set, (has dependency of
.Av SYNCCT ),
then 
.I release_cheapest()
is called directly to release the next cheapest job.
.Fn post_doe()
.Cs
static void post_doe(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
pointer to work task entry created by issue_request().
.RE
.LP
This routine is the call back routine invoked when the reply is received to a
Register dependency request sent from within depend_on_exec().
If the request was rejected, then the job for which the request was sent is
aborted.
.Fn depend_on_term()
.Cs
void depend_on_term(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to a job structure.
.RE
.LP
This routine is called when a job with dependencies terminates execution.
If the job has
.Av BEFOREANY
dependencies, a Register Dependencies \(em Release message is sent.
If the job has
.Av BEFOREOK
dependencies and the job terminated \*Qnormally\*U, and/or
.Av BEFORENOTOK
dependencies and the job terminated abnormally, a Register Dependencies \(em
Release message is sent to each job.
Otherwise, a Register Dependencies \(em Delete message is sent to those jobs
that will never run because of the dependency on the reverse exit status.
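.LP
The release-or-delete decision described above can be sketched in C as follows.
This is only an illustration: the enumerations and the function
action_on_term() are hypothetical names, not the server's definitions, and
\*Qnormal\*U termination is assumed to mean an exit status of zero.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three "before" termination types. */
enum before_type { BEFORE_ANY, BEFORE_OK, BEFORE_NOTOK };

/* What the terminating (parent) job's server sends to the dependent job. */
enum term_action { ACT_RELEASE, ACT_DELETE };

/*
 * Decide which Register Dependency operation to send when the parent
 * job terminates with the given exit status.
 */
enum term_action action_on_term(enum before_type type, int exit_status)
{
    int normal = (exit_status == 0);

    switch (type) {
    case BEFORE_OK:
        /* child wanted a normal exit: release on success, else delete */
        return normal ? ACT_RELEASE : ACT_DELETE;
    case BEFORE_NOTOK:
        /* child wanted an abnormal exit: the reverse case */
        return normal ? ACT_DELETE : ACT_RELEASE;
    case BEFORE_ANY:
    default:
        /* any termination satisfies the dependency */
        return ACT_RELEASE;
    }
}
```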
.LP
If the job has
.Av JOB_DEPEND_TYPE_SYNCCT
a special check must be performed.  The whole purpose behind the sync
set concept is to have jobs run at the same time and communicate with each
other.  If there is no communication, there is no need to run together.
So if a job, especially the master, quits before all jobs have started running,
then there must be a problem.  Doubly so for the master, not because of any
relation internal to the jobs, but because without it there is no place to
register the Release and Ready operations.
Therefore, if the master has terminated and not all of the jobs in the sync
set have reported Ready (running), then all jobs are aborted.
.Fn release_cheapest()
.Cs
static void release_cheapest( job *pjob, struct depend *pdep)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job for which the resource cost should be calculated.
.IP pdep
pointer to the 
.Av SYNCCT
dependency.
.RE
.LP
Among the jobs in the set (list) headed by the SYNCCT dependency which have
not been Released or Readied (running), find the one with the lowest
resource cost.  If this is the first job of the set to be released, then
set the scheduler hint field to 
.Sc SYNC_SCHED_HINT_FIRST ,
otherwise, set it to
.Sc SYNC_SCHED_HINT_OTHER .
Call
.I send_depend_req()
to send a Register Dependency \(em Release operation message.  
.LP
The scheduler hint field is recorded in the receiving job's
.At Sched_hint
attribute.   As explained in the ERS, this is purely a hint to the
scheduler to decrease the priority of the first job to prevent cheating and
increase priority of the other jobs in the set to improve synchronism.
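.LP
The selection step can be sketched as follows.  This is a minimal
illustration under stated assumptions: the structure sync_member, the
enumeration sync_hint and the function pick_cheapest() are hypothetical and
do not reproduce the server's depend and depend_job structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the PBS definitions. */
enum sync_hint { HINT_FIRST, HINT_OTHER };

struct sync_member {
    long cost;      /* resource cost reported at registration */
    int  released;  /* non-zero once a Release has been sent */
    int  ready;     /* non-zero once the job has reported Ready */
};

/*
 * Return the index of the cheapest member that has been neither released
 * nor readied, or -1 if no such member remains.  *hint is set to
 * HINT_FIRST when no member of the set has been released yet, and to
 * HINT_OTHER otherwise.
 */
int pick_cheapest(const struct sync_member *set, size_t n,
                  enum sync_hint *hint)
{
    int best = -1;
    int any_released = 0;
    size_t i;

    assert(hint != NULL);
    for (i = 0; i < n; i++) {
        if (set[i].released || set[i].ready) {
            any_released = 1;
            continue;
        }
        if (best < 0 || set[i].cost < set[best].cost)
            best = (int)i;
    }
    *hint = any_released ? HINT_OTHER : HINT_FIRST;
    return best;
}
```
.LP
A caller would mark the chosen member as released and pass the hint along
with the Release operation.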
.Fn set_depend_hold()
.Cs
static void set_depend_hold(job *pjob, attribute *depend)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure
.IP depend
pointer to the dependency attribute.
.RE
.LP
This function examines the dependencies on a job and if required, sets
a system hold.  Depending on the dependency type, the job state is set to
.Sc JOB_STATE_HELD
and the substate to either:
.IP JOB_SUBSTATE_SYNCHOLD
If the job has either 
.Sc JOB_DEPEND_TYPE_SYNCWITH
or
.Sc JOB_DEPEND_TYPE_SYNCCT
dependencies that have not been released.
.IP JOB_SUBSTATE_DEPNHOLD
If the job has any 
.Sc JOB_DEPEND_TYPE_AFTER*
or
.Sc JOB_DEPEND_TYPE_ON
type dependencies.
.LP
If the job has none of the above dependencies and was in substate
.Sc JOB_SUBSTATE_SYNCHOLD
or
.Sc JOB_SUBSTATE_DEPNHOLD ,
the system hold is removed and the state is re-evaluated by calling
.I svr_evaljobstate() .
.Fn depend_clrrdy()
.Cs
void depend_clrrdy(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to a job structure.
.RE
.LP
This function clears any synchronous dependency ready flags in the
job's dependency attribute.  It is called from pbsd_init() during
recovery.  The flags are cleared because it is unlikely that the children
are still ready.  At some point in the future, the children will again
notify the parent that they are ready.
.Fn find_depend()
.Cs
static depend *find_depend(int type, attribute *pattr)
.Ce
.IP Args: 4
.RS
.IP type
of the dependency to find.
.IP pattr
pointer to the dependency attribute of the job.
.RE
.IP Returns: 4
.RS
.IP pointer
to the depend structure found of the requested type.
.RE
.LP
This function searches the dependency attribute of a job for
a depend structure of the specified type.
If one is found, a pointer to it is returned.
.Fn make_depend()
.Cs
static depend *make_depend(int type, attribute *pattr)
.Ce
.IP Args: 4
.RS
.IP type
of the dependency to be added.
.IP pattr
pointer to the dependency attribute of the job.
.RE
.IP Returns: 4
.RS
.IP pointer
to the created depend structure.
.RE
.LP
This function allocates and initializes a depend structure and links
it on the list of structures headed in the dependency attribute.
.Fn register_sync()
.Cs
static int register_sync(struct depend *depend, char *child, char *host,
                         long cost)
.Ce
.IP Args: 4
.RS
.IP depend
pointer to the
.Sc JOB_DEPEND_TYPE_SYNCCT
dependency structure.
.IP child
the job id of the child (non-master) job.
.IP host
the name of the server (host name) which manages the child job.
.IP cost
the resource cost for the child job.
.RE
.IP Returns: 4
.RS
.IP 0
if successful
.IP error
.Er PBSE_SYSTEM
if failed.
.RE
.LP
This function is called
when a Register Dependency request is received with an operation of
.Sc JOB_DEPEND_OP_REGISTER
and the dependency type is 
.Sc JOB_DEPEND_TYPE_SYNCWITH .
If the client job has already been registered with this, the \*Qmaster\*U job,
the location of the client job is updated.  Otherwise, the client (child) job
is registered by adding a
.B depend_job
structure, see
.I make_dependjob() ,
to the \*Qsyncwith\*U depend structure.
The child job's resource cost is recorded and the count of registered jobs is
incremented in
.Av dp_numreg .
If dp_numreg exceeds the number of expected jobs,
.Av dp_numexp ,
.Er PBSE_IVALREQ
is returned.
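.LP
The update-or-add behavior can be sketched as follows.  This is an
illustration only: sync_master, sync_child and sync_register() are
hypothetical names, the fixed-size array replaces the server's linked list,
and the limit is checked before adding rather than after incrementing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SYNC_MAX 16   /* hypothetical capacity of the sketch */
#define ID_LEN   64   /* hypothetical maximum id/host length */

struct sync_child {
    char id[ID_LEN];    /* job id of the child */
    char host[ID_LEN];  /* server managing the child */
    long cost;          /* resource cost reported by the child */
};

struct sync_master {
    int numexp;                        /* number of jobs expected */
    int numreg;                        /* number of jobs registered */
    struct sync_child child[SYNC_MAX];
};

/*
 * Register a child with the master.  A child that is already known only
 * has its location updated.  Returns 0 on success, -1 if the
 * registration would exceed the expected number of jobs.
 */
int sync_register(struct sync_master *m, const char *id,
                  const char *host, long cost)
{
    int i;

    assert(m != NULL && id != NULL && host != NULL);
    for (i = 0; i < m->numreg; i++) {
        if (strcmp(m->child[i].id, id) == 0) {
            snprintf(m->child[i].host, ID_LEN, "%s", host);
            return 0;
        }
    }
    if (m->numreg >= m->numexp || m->numreg >= SYNC_MAX)
        return -1;
    snprintf(m->child[m->numreg].id, ID_LEN, "%s", id);
    snprintf(m->child[m->numreg].host, ID_LEN, "%s", host);
    m->child[m->numreg].cost = cost;
    m->numreg++;
    return 0;
}
```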
.Fn register_dep()
.Cs
static int register_dep(attribute *pattr, struct batch_request *request,
                        int type, int *made)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the dependency attribute of a job.
.IP request
pointer to the Register Dependency batch request.
.IP type
of the dependency to set up.
.IP made
RETURN: pointer to integer which is set to 1 if the child dependency is
new (was made), 0 if already exists.
.RE
.IP Returns: 4
.RS
.IP 0
on success.
.IP error
number if fails.
.RE
.LP
This function is called from 
.I req_register()
when a register request is received with the operation of
.Sc JOB_DEPEND_OP_REGISTER
and a type of any of the
.Sc JOB_DEPEND_TYPE_AFTER*
or
.Sc JOB_DEPEND_TYPE_BEFORE*
forms.
The purpose is to set up or update a dependency of the opposite
form (before_X becomes after_X, after_Y becomes before_Y) to remind the server
to release the dependent job at the right time.
First, find or make a depend structure of the type needed, opposite of that
in the request.
Then add or update the location of the dependent child job.
One is returned in the argument pointed to by
.Av made
if the dependency is created, zero is returned if just updated.
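.LP
The \*Qopposite form\*U mapping can be sketched as a simple lookup.  The
enumeration dep_type and the function mirror_type() are hypothetical names
chosen for this illustration; the server uses its own
JOB_DEPEND_TYPE_ constants.

```c
#include <assert.h>

/* Hypothetical stand-ins for the dependency type codes. */
enum dep_type {
    DEP_AFTERSTART, DEP_AFTEROK, DEP_AFTERNOTOK, DEP_AFTERANY,
    DEP_BEFORESTART, DEP_BEFOREOK, DEP_BEFORENOTOK, DEP_BEFOREANY,
    DEP_ON, DEP_SYNCWITH, DEP_SYNCCT
};

/*
 * Return the mirror image of a before/after type, or -1 when the type
 * has no mirror (on, syncwith, syncct).
 */
int mirror_type(enum dep_type t)
{
    switch (t) {
    case DEP_AFTERSTART:  return DEP_BEFORESTART;
    case DEP_AFTEROK:     return DEP_BEFOREOK;
    case DEP_AFTERNOTOK:  return DEP_BEFORENOTOK;
    case DEP_AFTERANY:    return DEP_BEFOREANY;
    case DEP_BEFORESTART: return DEP_AFTERSTART;
    case DEP_BEFOREOK:    return DEP_AFTEROK;
    case DEP_BEFORENOTOK: return DEP_AFTERNOTOK;
    case DEP_BEFOREANY:   return DEP_AFTERANY;
    default:              return -1;
    }
}
```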
.Fn unregister_dep()
.Cs
static int unregister_dep(attribute *pattr, struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the job's dependency attribute.
.IP preq
pointer to the batch request (dependency register, op of unregister).
.RE
.IP Returns:
zero on success, 
.Er PBSE_IVALREQ
if the dependency to unregister is not present.
.LP
This handles unregistering (deleting) before/after dependencies.  The mirror
image type dependency (before* <-> after*) pointing to the requesting job
is located.  It is deleted by calling
.I del_depend_job() .
.Fn unregister_sync()
.Cs
static int unregister_sync(attribute *pattr, struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to the job's dependency attribute.
.IP preq
pointer to the batch request (dependency register, op of unregister).
.RE
.IP Returns:
zero on success,
.Er PBSE_IVALREQ
if the dependency to unregister is not present.
.LP
This handles unregistering (deleting) syncwith dependencies.  The master,
.Sc JOB_DEPEND_TYPE_SYNCCT
dependency is located and within it the registration pointing to the
requesting job.  It is deleted by calling
.I del_depend_job() .
The number of registered jobs is decremented.  Assuming that drops the count
below what is required to release the first job, if the master job has
been released, it is re-held.
.Fn find_dependjob()
.Cs
static struct depend_job *find_dependjob(struct depend *depend, char *jobid)
.Ce
.IP Args: 4
.RS
.IP depend
pointer to the depend structure.
.IP jobid
of the job of which the depend_job structure is desired.
.RE
.IP Returns: 4
.RS
.IP pointer
to the depend_job structure if found.
.RE
.LP
The list of depend_job structures attached to the depend structure is
searched for one with the child id matching the supplied job id.
.Fn make_dependjob()
.Cs
static struct depend_job *make_dependjob(struct depend *depend,
                                         char *jobid, char *host)
.Ce
.IP Args: 4
.RS
.IP depend
pointer to the parent depend structure.
.IP jobid
of the job to add.
.IP host
name (server name) owning the child job.
.RE
.IP Returns: 4
.RS
.IP pointer
to the created depend_job structure.
.RE
.LP
A depend_job structure is allocated, initialized and appended to the list
headed in the parent depend structure.
.Fn send_depend_req()
.Cs
static int send_depend_req(job *pjob, struct depend_job *parent, int type,
                           int op, int scheduler_hint, 
                           void postfunc(struct work_task *))
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job which is to be registered with another.
.IP parent
pointer to the depend_job structure holding the parent job's name.
.IP type
the type of dependency of the child (to register).
.IP op
the operation, Register, Ready, Delete, ...
.IP scheduler_hint
the value of the scheduler hint to pass to the other job.
.IP postfunc
the function to call when the reply to the request is received.
.RE
.IP Returns: 4
.RS
.IP zero
on success.
.IP non-zero
error value if an error occurred.
.RE
.LP
This function forms and issues a Register Dependent Job batch request.
The owner of the job, not the server, is inserted into the request as the
requester.  The job id of the parent job is taken from the job dependency
structure pointed to by
.Av parent .
The dependency type and operation code are set according to the arguments
.Av type
and
.Av op .
The destination server is obtained from the job dependency structure.
.LP
If the request is of type
.Sc JOB_DEPEND_TYPE_SYNCWITH
and the operation is
.Sc JOB_DEPEND_OP_REGISTER ,
then the child job's resource cost is calculated by
.I calc_job_cost()
and included in the request.  Otherwise it is set to 0.
.LP
The function
.I issue_to_svr()
is called to send the request on its way, with
.Av postfunc
as the call back routine.
.Fn decode_depend()
.Cs
int decode_depend(attribute *pattr, char *name, char *rescn, char *val)
.Ce
.LP
The value string passed in parameter
.Ar val
is a null string or a
comma-separated series of substrings.  Each substring is of the form:
.Cs
depend_type=argument[,argument,...][,depend_type=argument[,argument,...]...]
.Ce
where
.Ar depend_type
is one of the following:
.Cs
on	after		before		syncwith
	afterok		beforeok	syncct
	afternotok	beforenotok
	afterany	beforeany
.Ce
as described in the ERS.  The
.Ar argument
portion of the substring depends on the depend_type.  For 
.Ty on
or
.Ty syncct ,
the 
.Ar argument
is a numeric string which is a count of jobs.  Otherwise, 
.Ar argument
is a job identifier.
.LP
If the 
.Ar val
is the null string, the attribute is being \*Qunset\*U.  The attribute is
freed by calling
.I free_depend()
and marked with
.Sc ATR_VFLAG_MODIFY
(free_depend() clears
.Sc ATR_VFLAG_SET ).
.LP
Otherwise, for each 
.Ar depend_type
specified in the value string:
.IP 1.
If a 
.I depend
structure of that form does not already exist, one is created and linked into
the list headed in the attribute structure.
This structure identifies the base dependency type and the number of jobs
listed for this type.
.IP 2.
An explicit \*Qon\*U form sets its count as the number of expected
jobs in the \*Qon\*U structure.  The \*Qon\*U structure is created if
non-existent.
.IP 3. 
For each \*Qafter\*U, \*Qbefore\*U, or \*Qsync\*U form, a 
.I depend_job
structure is created containing the job identifier, the job location,
and the registered/ready flags.
.LP
The attribute flags are set with
.Sc ATR_VFLAG_MODIFY
and 
.Sc ATR_VFLAG_SET .
.LP
.Fn cpy_jobsvr()
.Cs
static int cpy_jobsvr(char *dest, char *source)
.Ce
.IP Args: 4
.RS
.IP dest
pointer to destination string.
.IP source
pointer to source string.
.RE
.LP
This little kludge is used in 
.I encode_depend()
to copy a job id of the form
.Ty seq.server[:port][@server[:port]]
to
.Ty seq.server[\e:port][@server[\e:port]] .
It escapes the colons since the colon is also used to separate job ids within
the dependency string.
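.LP
The escaping can be sketched as follows.  This is not the body of
cpy_jobsvr(); the function escape_colons() and its explicit
destination-size argument are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src to dst, inserting a backslash before every colon.
 * Returns 0 on success, -1 if dst (of dstlen bytes) is too small to
 * hold the escaped string and its terminating NUL.
 */
int escape_colons(char *dst, size_t dstlen, const char *src)
{
    size_t used = 0;

    assert(dst != NULL && src != NULL);
    if (dstlen == 0)
        return -1;
    for (; *src != '\0'; src++) {
        size_t need = (*src == ':') ? 2 : 1;

        if (used + need + 1 > dstlen)   /* +1 keeps room for the NUL */
            return -1;
        if (*src == ':')
            dst[used++] = '\\';
        dst[used++] = *src;
    }
    dst[used] = '\0';
    return 0;
}
```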
.Fn dup_depend()
.Cs
static int dup_depend(attribute *pattr, struct depend *depend)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to job's dependency attribute in which a dependency is to be
duplicated.
.IP depend
pointer to the dependency to be duplicated.
.RE
.IP Returns: 4
zero on success, non-zero if error.
.LP
This function duplicates (adds) a dependency to the attribute.
A new dependency sub-structure is allocated by calling
.I make_depend()
with the attribute and the type of the dependency from the existing one.
Various fields are copied into the new and for each child job, the
depend_job structure is reproduced.
.Fn encode_depend()
.Cs
int encode_depend(attribute *pattr, list_head *phead, char *atname,
                  char *rsname, int mode)
.Ce
.LP
The values of the dependencies are encoded into a series of strings
and placed into a buffer.  
The encoding performed is according to the following rules:
.IP 1. 4
For each depend structure in the list, a 
.I depend_type
string is placed in the buffer followed by an equal sign, \*Q=\*U, followed
by the appropriate argument string.
.IP 2. 4
If the depend_type is
.Ty syncct
or
.Ty on ,
the argument string is a numeric string expressing the dependency count.
Otherwise the string is a colon separated list of the job identifiers associated
with the depend_type.  Colons within each job identifier, used to indicate an
alternative server port, must be escaped with a leading back slash.
.LP
Then the function
.I attrlist_create()
is called to create an svrattrl entry containing the attribute name and
the encoded string is inserted into the entry.
.LP
.Fn set_depend()
.Cs
int set_depend(attribute *old, attribute *new, enum set_op op)
.Ce
.LP
The value of the depend attribute old is set according to the operation:
.IP Set
old is replaced with new.
.IP Incr
[Not currently supported.]
.IP Decr
[Not currently supported.]
.LP
If the type of dependency to be \*Qset\*U already exists in the \*Qold\*U attribute,
it is deleted via
.I del_depend() .
The dependency from the \*Qnew\*U attribute is copied via
.I dup_depend() .
.Fn comp_depend()
.Cs
int comp_depend(attribute *pattr, attribute *with)
.Ce
.LP
Not used, does nothing, always returns -1.
.Fn free_depend()
.Cs
void free_depend(attribute *pattr)
.Ce
.LP
The depend attribute lists are freed and the attribute flag
.Sc ATR_VFLAG_SET
is cleared.
.Fn build_depend()
.Cs
static int build_depend(list_head *head, char *key, char *value)
.Ce
.IP Args: 4
.RS
.IP head
of list of depend structures.
.IP key
is keyword, name of dependency type.
.IP value
is the value string, following the equal sign.
.RE
.IP Returns: 4
.RS
.IP 0
if ok,
.IP non-0
if error.
.RE
.LP
The keyword is used to determine the dependency type.
If it does not match a legal value, 
.Er PBSE_BADATVAL
is returned.
.LP
Since certain combinations of dependencies are illegal, the existing
dependencies are scanned and the types noted.  If the new dependency would
create an illegal combination, 
.Er PBSE_BADATVAL
is returned.  The illegal combinations are:
.IP \(bu
syncwith with syncct, on, or any of the after forms.
.IP \(bu
syncct with syncwith or another syncct.
.LP
If the base depend structure does not already exist for the type of dependency
being created, one of the correct type is allocated.
The value string is parsed by calling
.I parse_comma_string()
and an appropriate depend_job structure is allocated.
Note that within a job id, a colon indicating an alternative server port must
be escaped with a leading back slash in the external form.  Otherwise, it would
be taken as the colon that separates multiple job ids.   Within the 
.I depend_job
structure, the back slash is not needed and in fact gets in the way of comparing
job ids.   So the back slash is removed.
.LP
Note that a command line value of 
.Ty "depend=type"
without a colon and following value is a means of clearing that type of
dependency.   build_depend makes an "empty" depend structure for that type.
If the job is being altered, 
.I set_depend()
will replace the existing entries for that type of dependency with the new
(non-existent) ones, in effect, clearing the old entries.
.Fn clear_depend()
.Cs
static void clear_depend(struct depend *pd, int type, int exist)
.Ce
.IP Args: 4
.RS
.IP pd
pointer to depend structure to clear.
.IP type
of depend structure to set.
.IP exist
flag, if true the depend structure already exists and any associated
depend_job structures should be freed.
.RE
.LP
The depend structure is cleared.
.Fn del_depend()
.Cs
void del_depend(struct depend *pd)
.Ce
.IP Args: 4
.RS
.IP pd
pointer to depend structure to delete.
.RE
.LP
A depend structure and any associated depend_job structures are freed.
.NH 4
.Ix req_rescq.c
.LP
The file
.I src/server/req_rescq.c
deals with batch requests to query
.Sc PBS_BATCH_Rescq ,
reserve
.Sc PBS_BATCH_ReserveResc ,
and release 
.Sc PBS_BATCH_ReleaseResc
resources.
.Fn req_rescq()
.Cs 
void req_rescq(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to the
.Sc PBS_BATCH_Rescq
Query Resource batch request.
.RE
.LP
If the number of resource items, strings in the resource array, is less than
one, the request is rejected with
.Er RM_ERR_BADPARAM .
The four integer arrays (available, allocated, reserved, down) to hold the
returns are allocated with malloc() and initialized to zero.
.LP
Each string in the resource list is parsed for the resource name and value.
Depending on the resource name, the appropriate function is called.
At the present time, only 
.Ty nodes
is supported and the supporting function is
.I node_avail() .
If an unrecognized type of resource is specified, the request is rejected with
.Er RM_ERR_BADPARAM .
.Fn req_rescreserve()
.Cs
void req_rescreserve(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to the
.Sc PBS_BATCH_ReserveResc
Reserve Resource batch request.
.RE
.LP
At the present time, the only
.I "reservable resources"
handled via this request are
.Ty nodes .
.LP
The client must have manager or operator privilege to make this request.
If the number of resource items, strings in the resource array, is less than
one, the request is rejected with
.Er RM_ERR_BADPARAM .
.LP
If the supplied resource handle is not null,
.Sc RESOURCE_T_NULL ,
any existing resources allocated to that handle are released by calling
.I node_unreserve() . 
Otherwise, a new resource handle is generated to be returned.
.LP
For each resource string in the array, the corresponding resource support
function is called.   For
.Ty nodes 
the function is
.I node_reserve() .
.LP
If the reservation is only partially successful (some but not all nodes
were reserved), 
.Er PBSE_RMPART
is returned.
The resource handle is returned.
.Fn req_rescfree()
.Cs
void req_rescfree(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to the
.Sc PBS_BATCH_ReleaseResc
Release Resource batch request.
.RE
.LP
At the present time, the only
.I "reservable resources"
handled via this request are
.Ty nodes .
.LP
.I node_unreserve()
is called to free (release or unreserve) the nodes.

.NH 4
.Ix req_rerun.c
.LP
The file
.I src/server/req_rerun.c
contains the server function for processing the Rerun Job batch request.
.Fn req_rerunjob()
.Cs
void req_rerunjob(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
The job structure is located.  The job state must be
.Sc JOB_STATE_RUNNING
and the substate
.Sc JOB_SUBSTATE_RUNNING
or the request is rejected.
The job
.At rerunable
attribute must be set to
.Av y 
or the request is rejected.
.LP
The job substate is set to
.Sc JOB_SUBSTATE_RERUN .
The function
.I send_signal()
is called to request that MOM send SIGKILL to the process group.
The function
.I post_rerun()
will handle the reply from MOM about the signal request.
.LP
Later, when MOM notifies the server of job termination,
the post-execution processing routine,
.I req_jobobit() ,
will note the rerun substate of JOB_SUBSTATE_RERUN, set
.Sc JOB_SVFLG_HASRUN
in the job server flags (ji_svrflags) and requeue the job.
The flags
.Sc JOB_SVFLG_CHKPT
and 
.Sc JOB_SVFLG_ChkptMig
are cleared to prevent the job from being set up for restart when next run, see
.I send_job() .
.LP
.I account_record()
is called with
.Sc PBS_ACCT_RERUN
to note the rerun in the accounting file.
.Fn post_rerun()
.Cs
static void post_rerun(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
pointer to a work task entry.
.RE
.LP
This routine processes the reply from MOM regarding the signal job request
sent in req_rerunjob().   If MOM had no problem, the work task entry (and
therefore the request structure) is released.
.LP
If MOM rejected the request, about the only valid reason would be that she
did not know of the job id.   Why this might happen I don't know, but it
has once or twice.   Anyway, the job is directly requeued.
.NH 4
.Ix req_runjob.c
.LP
The file
.I src/server/req_runjob.c
contains the server functions for processing the Run Job batch request and
generally placing a job into execution.
.Fn req_runjob()
.Cs
void req_runjob(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
This function handles the Run Job and Async Run Job requests.
These requests require the requesting user to have operator or administrator
privilege, otherwise the request is rejected.  The client may be another
server performing a synchronous dependency job start, the Scheduler, or
the qrun command.  Servers/schedulers always have privilege.
This is checked by calling 
.I chk_job_torun() .
.LP
If the request is the Async Run request, the request is acknowledged now
to prevent any delays.  The pointer to the request, preq, is nulled to prevent
any later attempt to use it since the request structure is freed by the
acknowledgement.
.LP
The function
.I svr_startjob()
is called to initiate the job into execution.  If svr_startjob()
returns an error, the Run Job request is rejected by req_runjob().
For a normal (non async) Run request, the request is acknowledged by one of
the follow up routines,
.I svr_stagein()
or 
.I post_sendmom() .
.Fn req_stagein()
.Cs
void req_stagein(struct batch_request *request)
.Ce
.IP Args:
.RS
.IP request
pointer to the Stage In batch request.
.RE
.LP
This starts the file stage-in process.   It is normally invoked by the
scheduler.  If the job does not have files to stage in, the request is
rejected with
.Er PBSE_IVALREQ .
.LP
.I svr_stagein()
is called to send the copy file request to MOM.  That function is requested
to update the job to state
.Sc JOB_STATE_QUEUED
and substate 
.Sc JOB_SUBSTATE_STAGEIN
during the stage in operation.
.Fn svr_stagein()
.Cs
static int svr_stagein(job *pjob, struct batch_request *preq, int state,
                       int substate)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job.
.IP preq
pointer to the Run Job batch request.
.IP state
The next job state.
.IP substate
The next job substate.
.RE
.IP Returns: 4
.RS
.IP 0
if the Copy Files request was successfully sent to Mom.
.IP non-zero
error reply otherwise.
.RE
.LP
The function
.I cpy_stage()
is called to build a Copy Files batch request.  A copy of the job id is
created and a pointer to it is placed in the batch request,
.I rq_extra .  
This string is used to find the job structure in
.I post_stagein()
rather than saving the value of pjob.  It is remotely possible that the
job might be deleted before Mom replies to the request, in which case,
the pointer to the job would be invalid.
.LP
If the Copy Files request was built by cpy_stage(), then there are indeed
files to copy.
The Copy Files request is sent to Mom by calling
.I relay_to_mom()
with the host address saved in the job structure,
.Av ji_qs.ji_un.ji_exect.ji_momaddr .
The job state and substate are set to 
.Ar state
and
.Ar substate .
.LP
At this point, a reply is sent to the original batch request, rather
than waiting the possibly long time it may take Mom to copy the requested files.
This does mean that a failure of the copy will cause an asynchronous wait
to be placed on the job.
.LP
If the Copy Files request was not built by cpy_stage(), there were
no files listed in the stage-in attribute.  The routine
.I svr_strtjob2()
is called to start the job execution and its return is our return.
.Fn post_stagein()
.Cs
static void post_stagein(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
pointer to work task structure.
.RE
.LP
This function is called when the reply to a Copy Files request to Mom,
initiated in 
.I svr_stagein() ,
is received by the server.
The job for which the request was issued is located by calling
.I find_job()
with the job id saved in the copy request, see svr_stagein().
If the job is not found (was deleted, unlikely but possible),
the function just returns.
.LP
If the copy request return is zero, the next action is determined by
the current substate of the job.   If it is
.Sc JOB_SUBSTATE_STAGEGO ,
.I svr_strtjob2()
is called to send the job to Mom for execution.  Note that the batch
request pointer, the second parameter, is null.  The original 
request has already been acknowledged in svr_stagein().
If the job substate is not JOB_SUBSTATE_STAGEGO, it is
.Sc JOB_SUBSTATE_STAGEIN
and the state and substate are updated by calling
.I svr_evaljobstate()
and
.I svr_setjobstate() .
The job is most likely to be placed in state
.Sc JOB_STATE_QUEUED
and substate 
.Sc JOB_SUBSTATE_STAGECMP .
.LP
If the return from MOM is non-zero, the copy failed and the job is placed in
a waiting state,
.Sc JOB_STATE_WAITING ,
substate
.Sc JOB_SUBSTATE_STAGEFAIL .
The execution time attribute,
.At JOB_ATR_exectime ,
is set for 
.Sc PBS_STAGEFAIL_WAIT
seconds in the future.  This is done to keep the job from being rescheduled
over and over in a short amount of time.
A mail message is sent to the job owner requesting that he/she
investigate and fix the problem.
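.LP
The three outcomes described above can be sketched as a small decision
function.  This is an illustration under stated assumptions: the enums,
struct sk_job and SK_STAGEFAIL_WAIT are hypothetical stand-ins, the numeric
values are not those of the PBS headers, and the successful stage-in case
hard-codes the \*Qmost likely\*U result instead of calling
svr_evaljobstate().

```c
#include <time.h>

/* Hypothetical stand-ins; values are illustrative only. */
enum sk_state    { SK_STATE_QUEUED, SK_STATE_RUNNING, SK_STATE_WAITING };
enum sk_substate { SK_SUB_STAGEIN, SK_SUB_STAGEGO, SK_SUB_STAGECMP,
                   SK_SUB_STAGEFAIL };
enum sk_action   { SK_START_JOB, SK_REQUEUE_STAGED, SK_WAIT_AND_MAIL };

#define SK_STAGEFAIL_WAIT 1800   /* assumed placeholder for PBS_STAGEFAIL_WAIT */

struct sk_job {
    enum sk_state    state;
    enum sk_substate substate;
    time_t           exectime;
};

/* mom_code is the return code in the reply to the Copy Files request. */
enum sk_action sk_post_stagein(struct sk_job *pjob, int mom_code, time_t now)
{
    if (mom_code != 0) {
        /* Copy failed: park the job so it is not retried at once. */
        pjob->state    = SK_STATE_WAITING;
        pjob->substate = SK_SUB_STAGEFAIL;
        pjob->exectime = now + SK_STAGEFAIL_WAIT;
        return SK_WAIT_AND_MAIL;
    }
    if (pjob->substate == SK_SUB_STAGEGO)
        return SK_START_JOB;     /* caller would start the job, request NULL */

    /* Stage-in only: files are in place, the job waits to be run. */
    pjob->state    = SK_STATE_QUEUED;
    pjob->substate = SK_SUB_STAGECMP;
    return SK_REQUEUE_STAGED;
}
```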
.Fn svr_startjob()
.Cs
int svr_startjob(job *pj, struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP pj
Pointer to job structure of a job to run.
.IP request
the Run Job request to which a response must be sent, or NULL if the server is
starting jobs on initialization.
.RE
.IP Returns: 4
.RS
.IP 0
If contact with MOM was successful (see below).
.IP non-zero
if job could not be placed into execution.
.RE
.LP
This function attempts to place the job into running state.
It is called when the Run Job batch request is received; this may be
from the scheduler or the operator.
.LP
The short file name used as the base for saving the job structure and script
must be made available to Mom; she will use the same name, as we know there
will not be a conflict with other jobs.  To ship it to Mom, this name is
placed in a read-only attribute,
.At JOB_ATR_hashname .
.LP
If the job has the
.At JOB_ATR_stagein
attribute set, then
.I svr_stagein()
is called to direct Mom to copy the files.
It is passed the state and substate of
.Sc JOB_STATE_RUNNING
and
.Sc JOB_SUBSTATE_STAGEGO
to indicate that the job will be run as soon as the files are staged-in.
If svr_stagein() returns non-zero
indicating it was unable to contact Mom, the Run Job request is rejected.
If svr_stagein() is able to contact Mom, it will reply to the request (see
the commentary in svr_stagein).
.LP
If there are no files to stage in,
.I svr_strtjob2() 
is called.
.Fn svr_strtjob2()
.Cs
static int svr_strtjob2(job *pjob, struct batch_request *request)
.Ce
.IP Args: 4
.RS
.IP pjob
Pointer to job structure of a job to run.
.IP request
the Run Job request to which a response must be sent, or NULL if the server is
starting jobs on initialization.
.RE
.IP Returns: 4
.RS
.IP 0
If contact with MOM was successful (see below).
.IP non-zero
if job could not be placed into execution.
.RE
.LP
The job state and substate are set to
.Sc JOB_STATE_RUNNING
and
.Sc JOB_SUBSTATE_PRERUN .
Then
.I send_job() ,
see svr_movejob.c, is called to \*Qmove\*U the job to MOM.
This creates a child process to send the job.  When the child process
completes, the routine
.I post_sendmom()
is given control to update the job substate to 
.Sc JOB_SUBSTATE_RUNNING ,
or to requeue the job depending on success or failure of the move.
post_sendmom() will also respond as required to the Run Job batch request
if it exists.
.Fn post_sendmom()
.Cs
static void post_sendmom(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
pointer to work task entry which caused the dispatch of this function.
In the work task, wt_parm1 points to the job, and if wt_parm2 is not
NULL, it points to a Run Job batch request.
.RE
.LP
This function is equivalent to post_routejob for the case of sending a job
to MOM for execution.  When the child process exits, post_sendmom is
dispatched as a result of the work task associated with the child.
.LP
If the send was successful, the job is placed in state
.Sc JOB_STATE_RUNNING
and substate
.Sc JOB_SUBSTATE_RUNNING .
Note, for a very short job, there can be a race condition between the
completion of the child process that sent the job to mom and the Obit notice
from MOM, see req_jobobit().  It might be that the job substate has already
been set to exiting.   Also a rerun request could have changed the substate
to indicate the rerun.   Hence the state and substate are not updated if
the substate is not
.Sc JOB_SUBSTATE_PRERUN
as set in
.I svr_strtjob2() .
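.LP
The guard against this race amounts to a single test on the substate.
A minimal sketch, assuming hypothetical stand-in types (struct sk_job and
the SK_ enums, whose values are illustrative only):

```c
/* Hypothetical stand-ins; values are illustrative only. */
enum sk_state    { SK_STATE_QUEUED, SK_STATE_RUNNING, SK_STATE_EXITING };
enum sk_substate { SK_SUB_PRERUN, SK_SUB_RUNNING, SK_SUB_EXITING,
                   SK_SUB_RERUN };

struct sk_job {
    enum sk_state    state;
    enum sk_substate substate;
};

/* Returns 1 if the job was moved to running, 0 if it was left alone
   because an Obit or a rerun request changed the substate first. */
int sk_mark_running(struct sk_job *pjob)
{
    if (pjob->substate != SK_SUB_PRERUN)
        return 0;
    pjob->state    = SK_STATE_RUNNING;
    pjob->substate = SK_SUB_RUNNING;
    return 1;
}
```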
.LP
If there is an outstanding Run Job batch request, pointed to by
.At wt_parm2 , 
it is acknowledged.
The time of the start is recorded in
.Av ji_chkpttime
which is overloaded for this purpose for accounting.  If the job is being
restarted from a checkpoint file, 
.I account_record()
is called with
.Sc PBS_ACCT_RESTRT ,
otherwise 
.I account_jobstr()
is called to make the accounting entry.
The job
.At "Session Id"
attribute is updated by calling
.I stat_mom_job() .
If this job is the parent job of any dependent jobs waiting on this job to
start, the dependent jobs are notified by calling 
.I depend_on_exec() .
.LP
If the send failed, and the substate is 
.Sc JOB_SUBSTATE_ABORT 
we assume the send was interrupted because the job is being deleted, and we
do nothing except reject the batch request, if it exists.
Otherwise, the job is requeued for a later retry; and if there is a batch
request, it is rejected.
.Fn chk_job_torun()
.Cs
static job *chk_job_torun(struct batch_request *preq)
.Ce
.IP Args: 4
.RS
.IP preq
pointer to the batch request (run job or stagein).
.RE
.IP Returns:
a pointer to the job specified in the request if all is well, otherwise null.
.LP
The job is located via
.I chk_job_request() .
The request will be rejected if the job is in
.Sc JOB_STATE_TRANSIT
or
.Sc JOB_STATE_EXITING
state, or substates
.Sc JOB_SUBSTATE_STAGEGO ,
.Sc JOB_SUBSTATE_PRERUN ,
or
.Sc JOB_SUBSTATE_RUNNING .
If the request is to stage in files, it will also be rejected if the substate
is
.Sc JOB_SUBSTATE_STAGEIN .
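.LP
The state checks above can be written as one predicate.  This is a sketch
only: the enums are hypothetical stand-ins with illustrative values, and
the is_stagein_request flag is an assumed way of telling the two request
types apart.

```c
/* Hypothetical stand-ins; values are illustrative only. */
enum sk_state    { SK_STATE_TRANSIT, SK_STATE_QUEUED, SK_STATE_HELD,
                   SK_STATE_WAITING, SK_STATE_RUNNING, SK_STATE_EXITING };
enum sk_substate { SK_SUB_QUEUED, SK_SUB_STAGEIN, SK_SUB_STAGEGO,
                   SK_SUB_STAGECMP, SK_SUB_PRERUN, SK_SUB_RUNNING };

/* Returns 1 if the job state permits the request, 0 if it must be rejected. */
int sk_state_allows_run(enum sk_state state, enum sk_substate substate,
                        int is_stagein_request)
{
    if (state == SK_STATE_TRANSIT || state == SK_STATE_EXITING)
        return 0;
    if (substate == SK_SUB_STAGEGO || substate == SK_SUB_PRERUN ||
        substate == SK_SUB_RUNNING)
        return 0;
    /* A second stage-in while one is in progress is refused. */
    if (is_stagein_request && substate == SK_SUB_STAGEIN)
        return 0;
    return 1;
}
```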
.LP
The requesting client must have operator or administrator privilege (which
the Scheduler does).  The job must be in an execution queue.
.LP
A host for execution may be specified in the request.
If this is the null string, either the local host is assumed or, if the
job is to be restarted from a checkpoint, the prior execution
host is assumed as the new execution host.
If the host name is not the null string and
if the job is being restarted from a checkpoint, then the execution host
must be the same as the earlier execution host or 
.Er PBSE_BADHOST
is returned.
The name of the host on which to execute the job is saved in the job
structure in
.Av ji_destin
for svr_statjob().
This host name is also converted to a host address which is saved in
.Av ji_qs.ji_un.ji_exect.ji_momaddr .
The attribute
.At JOB_ATR_exec_host ,
execution host, is set to the selected/specified host name.
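.LP
The host selection rules can be summarized in a few lines.  The function
below is a sketch under assumptions: sk_pick_host() and its parameters are
hypothetical, and a null return stands for the PBSE_BADHOST case.  Treating
a null pointer like the null string is an added precaution, not something
the text specifies.

```c
#include <stddef.h>
#include <string.h>

/* Returns the host on which to run, or NULL if the request names a host
   that conflicts with the host holding the checkpoint. */
const char *sk_pick_host(const char *requested, int has_checkpoint,
                         const char *prior_host, const char *local_host)
{
    if (requested == NULL || requested[0] == 0)
        return has_checkpoint ? prior_host : local_host;
    if (has_checkpoint && strcmp(requested, prior_host) != 0)
        return NULL;
    return requested;
}
```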
.NH 4
.Ix req_select.c
.LP
The file
.I src/server/req_select.c
contains the server function for processing the Select Job batch request and
the Select\-Status (selstat) Job batch request.
.Fn req_selectjobs()
.Cs
void req_selectjobs(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
This function handles both the Select Job and the special Select-Status
Job request.  The latter is provided primarily to enable the job Scheduler
to obtain status about jobs that it should consider.  It is a waste of 
bandwidth to receive status about jobs in routing queues or (depending on
policy) held or waiting jobs in execution queues.
The two requests are treated differently in two ways:
the return values differ, and so does the sequence of processing.
The Select Job request returns a list of the job identifiers which
meet the selection criteria.  The Select-Status, or selstat, request
returns a set of job status replies, one for each job which
meets the selection criteria.  The Select Job request is straightforward to process:
just go through the list of jobs for those that match the selection criteria.
For Selstat, however, the same problem with running jobs exists as
for Status Job: the server's resources-used information for some running jobs
may be stale.  A status request to MOM is required to update the server's
information.
.LP
For both requests,
each attribute specified in the request is decoded into a selection list
which contains the decoded attribute value, the selection operator, and
a pointer to the attribute definition, see
.I build_selist() .
For Select Job, the flow proceeds directly to the final selection step in
.I sel_step3() .
For Selstat, two passes are required, the first in
.I sel_step2()
preselects the jobs and gets a status update from MOM for any that need it.
Then 
.I sel_step3()
re-selects the jobs for the reply.
Information about the request is passed to both sel_step2() and sel_step3()
in a 
.Ar stat_cntl
structure as used by req_stat_job().
.LP
One of the specified attributes,
.Sc ATTR_q
or \*Qdestination\*U, is not a true attribute and receives special treatment
in build_selist() if present.
If
.Sc ATTR_q
was specified, then the search for jobs will be limited to the list headed
by that queue.  Otherwise, the search is among all jobs managed by the server.
.Fn sel_step2()
.Cs
static void sel_step2(struct stat_cntl *cntl)
.Ce
.IP Args: 4
.RS
.IP cntl
pointer to a stat_cntl structure used to keep state for the search through
the list of jobs.
.RE
.LP
The search starts with the job identified in the stat_cntl structure:
.IP null
The case for the first entry, start from the top of the list.
.IP "job name"
The job on which the search broke off in the prior round (i.e. caused a stat
request to MOM).  If this job is missing, restart at the beginning.
.LP
Within the loop, the \*Qnext\*U job is obtained.  This is either
the first job if starting from the top, or the job following the one we left
off with in the prior round.
.LP
For each job in the search list for which the user is entitled to query status,
the function 
.I select_job()
is called to determine if the job meets the selection criteria, see sel_step3.
For each \*Qselected\*U job, if the job is running and the information from
MOM is stale, older than
.Sc PBS_RESTAT_JOB
seconds as recorded in
.Ar ji_momstat ,
the loop is broken to send a request to mom to update all jobs, see
.I stat_to_mom() .
The current job id is saved in the stat_cntl structure.  When MOM replies,
.I sel_step2
will be restarted with the next job.   When all jobs have been checked,
.I sel_step3()
is called to repeat the selection and build the status replies.
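.LP
The resumable search can be sketched with an array in place of the
server's linked job list.  Everything here is an assumption made for
illustration: struct sk_job, struct sk_cntl, the selected flag (standing in
for the result of select_job()) and SK_RESTAT_JOB (a placeholder value for
PBS_RESTAT_JOB).

```c
#include <string.h>
#include <time.h>

#define SK_RESTAT_JOB 30         /* assumed placeholder for PBS_RESTAT_JOB */

struct sk_job {
    const char *id;
    int         running;
    int         selected;        /* stand-in for the result of select_job() */
    time_t      momstat;         /* time of the last update from MOM */
};

struct sk_cntl {
    char last_id[64];            /* empty string: start from the top */
};

/* Returns the index of the job that needs fresh status from MOM (its id is
   saved in cntl), or -1 once every job has been checked. */
int sk_step2(struct sk_cntl *cntl, const struct sk_job *jobs, int njobs,
             time_t now)
{
    int i;
    int start = 0;

    if (cntl->last_id[0] != 0) {
        for (i = 0; i < njobs; i++) {
            if (strcmp(jobs[i].id, cntl->last_id) == 0) {
                start = i + 1;   /* resume with the following job */
                break;
            }
        }
        /* If the saved job has vanished, start stays 0: begin again. */
    }
    for (i = start; i < njobs; i++) {
        if (jobs[i].selected && jobs[i].running &&
            now - jobs[i].momstat > SK_RESTAT_JOB) {
            strncpy(cntl->last_id, jobs[i].id, sizeof cntl->last_id - 1);
            cntl->last_id[sizeof cntl->last_id - 1] = 0;
            return i;            /* break off; caller asks MOM */
        }
    }
    return -1;                   /* all checked; caller goes on to step 3 */
}
```

Each non-negative return corresponds to one round trip to MOM; the caller
invokes the function again with the same control structure when the reply
arrives.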
.Fn sel_step3()
.Cs
static void sel_step3(struct stat_cntl *cntl)
.Ce
.IP Args: 4
.RS
.IP cntl
pointer to a stat_cntl structure used to keep state for the search through
the list of jobs.
.RE
.LP
Here is where the reply to the request is actually built.  The stat_cntl
structure is passed in from either
.I req_selectjobs()
for a simple Select Jobs request or from
.I sel_step2() 
for a Select-status request.  We loop through the list of jobs in the queue
or server looking (again) for those that meet the criteria;
.I select_job()
is called for those jobs which the user is privileged to see.
If the requesting user does not have special privilege, the ability to query
jobs owned by other users is determined by the setting of the server attribute
.At query_other_jobs .
For the batch request type of:
.IP Select 10
If a job meets the criteria, its job id is entered into the select reply list.
.IP Selstat 10
If a job meets the criteria, 
.I status_job()
is called to append the status of that job to the reply.
.LP
After the search is complete, any space allocated to the selection list
is freed.
The select reply list, or job status list is included in the reply to
the client.
.Fn select_job()
.Cs
static int select_job(job *pj, struct select_list *psel)
.Ce
.IP Args: 4
.RS
.IP pj
pointer to a candidate job.
.IP psel
pointer to the selection list set up from the request.
.RE
.IP Returns: 4
.RS
.IP 0
if job does not meet criteria.
.IP 1
if job does meet criteria.
.RE
.LP
For each attribute in the list pointed to by
.At psel
which has a value that has been set, the attribute's
.I at_comp()
function is called to determine the relationship between the requested 
attribute and the job attribute.  If the relation matches that specified
in the corresponding member of the
.At operator
array, the comparison continues.
Otherwise the job is not selected, zero is returned.
If all attributes match, then the job is selected.  One is returned.
.LP
There is one attribute which must be special cased.  If 
.At JOB_ATR_userlst ,
-u, is specified, it is a list of job owners, not the
.At user-list
job attribute.  If the job owner is in the list, we accept the job.
.Fn sel_attr()
.Cs
static int sel_attr(attribute *pattr, struct select_list *select)
.Ce
.IP Args: 4
.RS
.IP pattr
pointer to an attribute.
.IP select
pointer to a select_list entry.
.RE
.IP Returns: 4
.RS
.IP 1
if attribute matches the selection criteria.
.IP 0
if not.
.RE
.LP
The attribute value and the value in the selection list are compared via
a call to the appropriate
.I at_comp()
routine.
The comparison result is matched against the selection operator.  If it
fits, 1 is returned; otherwise 0 is returned.
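.LP
Matching a comparison result against a selection operator might look like
the sketch below.  It assumes that the comparison routine yields a value
less than, equal to, or greater than zero in the manner of strcmp(); the
enum sk_op and its members are hypothetical stand-ins for the operators
carried in the request.

```c
/* Hypothetical stand-ins for the selection operators. */
enum sk_op { SK_EQ, SK_NE, SK_GE, SK_GT, SK_LE, SK_LT };

/* cmp is the result of comparing the job's attribute with the selection
   value: negative, zero or positive.  Returns 1 on a match, 0 if not. */
int sk_op_matches(int cmp, enum sk_op op)
{
    switch (op) {
    case SK_EQ: return cmp == 0;
    case SK_NE: return cmp != 0;
    case SK_GE: return cmp >= 0;
    case SK_GT: return cmp > 0;
    case SK_LE: return cmp <= 0;
    case SK_LT: return cmp < 0;
    }
    return 0;                    /* unknown operator: no match */
}
```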
.Fn build_selentry()
.Cs
static int build_selentry(svrattrl *plist, attribute_def *pdef, int perm,
                          struct select_list **rtnentry);
.Ce
.IP Args: 4
.RS
.IP plist
pointer to a member of the list of attributes in request on which to select.
.IP pdef
pointer to the attribute definition for the above attribute.
.IP perm
the user's access permissions.
.IP rtnentry
RETURN: the address of the created entry is returned here.
.RE
.IP Returns:
zero if ok, error code if not.
.LP
A single selection list entry is created for the specified attribute.
The entry contains a pointer to the attribute definition structure.  This
provides access to the comparison routine (at_comp).  It also contains
the decoded attribute value from the request and the selection operator.
.LP
If the privilege level is not sufficient to read the attribute, or the
attribute cannot be selected in the manner requested [some attributes
are restricted to an equal/not equal test], an error is returned.
.Fn free_sellist()
.Cs
static void free_sellist(struct select_list *pslist)
.Ce
.IP Args: 4
.RS
.IP pslist
pointer to a select list structure.
.RE
.LP
A select list, created by build_selist(), is freed.
.Fn build_selist()
.Cs
static int build_selist(svrattrl *list, int permission, 
                        struct select_list **select, queue **pque, int *bad)
.Ce
.IP Args: 4
.RS
.IP list
pointer to a list of svrattrl structures from the select request.
.IP permission
the client's privilege level.
.IP select
RETURN: pointer to a pointer to a select list.  The location of the
created select list is returned here.
.IP pque
RETURN: pointer to a queue pointer.  If job search is limited to a queue,
this is set.
.IP bad
RETURN: pointer to an integer which will be set to the index (starting with
1) of a bad attribute.
.RE
.IP Returns: 4
.RS
.IP 0 
if the selection list was built.
.IP error
number if an error occurred.
.IP
The parameters above marked as returns.
.RE
.LP
For each member of the svrattrl (attribute) list from the request,
a select_list structure is allocated, the attribute is decoded into
the structure, the operator is set from the request, and the structure
is linked into the select_list.  All this is done via the call to
.I build_selentry() .
.LP
If an ATTR_q (-q) pseudo-attribute was specified, a search is made for a
queue of that name and a pointer to it is returned in
.At pque .
.LP
Another special case is the attribute for -s,
.At JOB_ATR_state .
The actual attribute is a single character of type
.Sc ATTR_TYPE_CHAR ,
but the selection may be a string of multiple letters, see the -s option in
qselect(1).  Hence there is a special attribute definition structure for this
case which decodes a string and supplies a special comparison routine,
.I comp_state() 
which compares each letter of the selection string with the job's state.
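.LP
The idea behind the state comparison can be shown in a few lines.  This is
not comp_state() itself, whose interface is not given here; sk_state_selected()
is a hypothetical helper that accepts the job if its state letter appears
anywhere in the selection string.

```c
#include <string.h>

/* Returns 1 if job_state is one of the letters in wanted, else 0. */
int sk_state_selected(char job_state, const char *wanted)
{
    /* strchr() would also find the terminating null, so exclude it. */
    return job_state != 0 && strchr(wanted, job_state) != NULL;
}
```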
.NH 4
.Ix req_shutdown.c
.LP
The file
.I src/server/req_shutdown.c
contains the functions to gracefully terminate the server.
.Fn req_shutdown()
.Cs
void req_shutdown(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.IP Returns: 4
.RS
.IP 0
if success.
.IP non 0
if error.
.RE
.LP
The requesting user must have operator or administrator privilege or the
request is rejected.
The address of the shutdown request is saved in
.Ar pshutdown_request
for the function 
.I shutdown_ack() .
Then the function
.I svr_shutdown()
is called with the type of shutdown requested.
.Fn shutdown_ack()
.Cs
void shutdown_ack()
.Ce
.LP
This function is called from the server's main routine just before it exits.
The purpose is to check if the shutdown is because of a request (qterm)
and reply to it.  
.Fn svr_shutdown()
.Cs
void svr_shutdown(int type)
.Ce
.IP Args: 4
.RS
.IP type
The type of shutdown requested.
.RE
.LP
The server state is set to indicate the type of shutdown:
.nf
- 
.Sc SV_STATE_SHUTIMM
for type immediate,
.Sc SHUT_IMMEDIATE ,
or for receipt of signal SIGTERM.
.br
- 
.Sc SV_STATE_SHUTDEL 
for type delay,
.Sc SHUT_DELAY ,
.br
- 
.Sc SV_STATE_DOWN
for type quick,
.Sc SHUT_QUICK .
.fi
to restrict services.
Note, a SHUT_IMMEDIATE or a SIGTERM while the server is in state
SV_STATE_SHUTIMM  will force the server into SV_STATE_DOWN.
The type of shutdown is recorded in the event log.
If the shutdown type is quick, return now; the main loop will be broken.
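.LP
The mapping from shutdown type to server state, including the escalation
on a repeated immediate shutdown, can be sketched as below.  The enums are
hypothetical stand-ins with illustrative values; SK_SV_RUN is an assumed
name for the normal running state.

```c
/* Hypothetical stand-ins; values are illustrative only. */
enum sk_shut_type { SK_SHUT_IMMEDIATE, SK_SHUT_DELAY, SK_SHUT_QUICK };
enum sk_svr_state { SK_SV_RUN, SK_SV_SHUTIMM, SK_SV_SHUTDEL, SK_SV_DOWN };

/* Returns the server state to enter for the requested shutdown type. */
enum sk_svr_state sk_next_state(enum sk_svr_state current,
                                enum sk_shut_type type)
{
    switch (type) {
    case SK_SHUT_IMMEDIATE:      /* also the path taken for SIGTERM */
        /* A second immediate shutdown forces the server down. */
        return current == SK_SV_SHUTIMM ? SK_SV_DOWN : SK_SV_SHUTIMM;
    case SK_SHUT_DELAY:
        return SK_SV_SHUTDEL;
    case SK_SHUT_QUICK:
        return SK_SV_DOWN;
    }
    return current;              /* unknown type: leave the state alone */
}
```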
.LP
For each job managed by the server, if the job is in the
.Sc JOB_STATE_RUNNING
state, the following actions are performed:
.RS
.IP \(bu
The
.Sc JOB_SVFLG_HOTSTART
and
.Sc JOB_SVFLG_HASRUN
bits are turned on in
.I ji_svrflags .
.IP \(bu
If checkpoint/restart is supported and the job checkpoint attribute
is not "n", then an
attempt is made to checkpoint and terminate the job by calling
.I shutdown_chkpt() .
.IP \(bu
Else if the job cannot be checkpointed or the checkpoint fails, 
then attempt to rerun the job or kill it off by calling
.I rerun_or_kill() .
.RE
.LP
.Fn shutdown_chkpt()
.Cs
static int shutdown_chkpt(job *job)
.Ce
.IP Args: 4
.RS
.IP job
pointer to the job to checkpoint.
.RE
.IP Returns: 4
.RS
.IP 0
if the checkpoint request (hold request) was successfully sent to MOM.
.IP non-zero
error number if not.
.RE
.LP
A batch_request structure is allocated and set up as a Hold Job request.
This request is sent to MOM,
.I relay_to_mom() ,
for action.  The routine
.I post_chkpt()
will be invoked when MOM responds.
.Fn post_chkpt()
.Cs
static void post_chkpt(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
pointer to the work task entry.
.RE
.LP
This function is called when MOM replies to a request sent by
.I shutdown_chkpt() .
If the checkpoint/hold was successful, either the
.Sc JOB_SVFLG_CHKPT
or
.Sc JOB_SVFLG_ChkptMig
bit is set in the job server flags,
.Ar ji_qs.ji_svrflags ,
depending on the checkpoint type return information from MOM.
The checkpoint type is found in the
.Ar brp_auxcode
word of the reply to the checkpoint request.
.LP
Otherwise, we attempt to rerun the job or kill it off by calling
.I rerun_or_kill() .
.Fn rerun_or_kill()
.Cs
void rerun_or_kill(job *pjob, char *text)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job to rerun or kill off.
.IP text
message to log, the reason this function is being called.
.RE
.LP
If the job attribute
.At JOB_ATR_rerunable
is true, a
.Sc SIGKILL
signal request is sent to MOM.
The job substate is set to
.Sc JOB_SUBSTATE_RERUN
to indicate to post job execution processing that the job is not to be
discarded.
.LP
If the job cannot be rerun, and the server state is not
.Sc SV_STATE_SHUTDEL ,
.I job_abt()
is called to kill off the job.
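.LP
The choice made by this function reduces to two tests.  In the sketch
below the enum and function names are hypothetical.  The text does not say
what happens to a job that cannot be rerun while the server is in the
delayed shutdown state; the sketch assumes it is simply left alone.

```c
/* Hypothetical stand-in for the action taken. */
enum sk_action { SK_KILL_FOR_RERUN, SK_ABORT_JOB, SK_LEAVE_ALONE };

enum sk_action sk_rerun_or_kill(int rerunable, int server_is_shutdel)
{
    if (rerunable)
        return SK_KILL_FOR_RERUN;   /* SIGKILL via MOM, substate RERUN */
    if (!server_is_shutdel)
        return SK_ABORT_JOB;        /* the job_abt() case */
    return SK_LEAVE_ALONE;          /* assumed: no action described */
}
```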
.NH 4
.Ix req_signal.c
.LP
The file
.I src/server/req_signal.c
contains the server function for processing the Signal Job batch request.
.Fn req_signaljob()
.Cs
void req_signaljob(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
The job must be in state
.Sc JOB_STATE_RUNNING .
The signal value supplied in the request is a string, it may either be a
numeric string or an alphanumeric signal name.
The special names
.Ty suspend
and
.Ty resume
are reserved for the special suspend/resume functions.   Use of these names
requires manager or operator privilege.
.LP
The request is forwarded to MOM by 
.I relay_to_mom() .
Note, if the signal value is a numeric string, MOM
will convert it to the corresponding integer value.  If it is a name,
which may or may not have the \*QSIG\*U prefix, the name is converted to the
correct signal value.
If the name is not known on the execution system, the request is rejected
with error 
.Er PBSE_UNKSIG .
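.LP
The conversion MOM performs can be approximated as follows.  This is a
sketch, not MOM's code: sk_signal_value() and its four-entry table are
hypothetical, and a return of -1 stands for the PBSE_UNKSIG rejection.

```c
#include <ctype.h>
#include <limits.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* A deliberately short table; a real one would list every signal the
   execution host knows. */
static const struct {
    const char *name;
    int         value;
} sk_signals[] = {
    { "HUP", SIGHUP }, { "INT", SIGINT }, { "KILL", SIGKILL },
    { "TERM", SIGTERM }
};

/* Returns the signal number for text, or -1 if it is not recognized. */
int sk_signal_value(const char *text)
{
    size_t i;

    if (text == NULL || text[0] == 0)
        return -1;
    if (isdigit((unsigned char)text[0])) {
        char *end;
        long  v = strtol(text, &end, 10);

        if (*end != 0 || v > INT_MAX)
            return -1;           /* trailing junk or out of range */
        return (int)v;
    }
    if (strncmp(text, "SIG", 3) == 0)
        text += 3;               /* the prefix is optional */
    for (i = 0; i < sizeof sk_signals / sizeof sk_signals[0]; i++)
        if (strcmp(text, sk_signals[i].name) == 0)
            return sk_signals[i].value;
    return -1;
}
```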
.LP
When the MOM replies, the function
.I post_signal_req()
is invoked to generate the reply to the client.
.Fn issue_signal()
.Cs
int issue_signal(job *pjob, char *signal, void (*func)(struct work_task *),
                 void *extra)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job structure of job to be signaled.
.IP signal
alphabetic signal name or numerical string value of signal to send to job.
.IP func
the function to invoke when the reply to the signal request is received.
.IP extra
bit of information to insert in generated signal job batch_request structure.
.RE
.IP Returns: 4
.RS
.IP 0 
if successful.
.IP -1
if error.
.RE
.LP
This function is provided to allow the server itself to initiate a signal
to a running job.  A Signal Job batch request structure is allocated via
.I alloc_br() 
and initialized.
The void pointer
.At extra 
is inserted into the structure in
.Av rq_extra .
The request is sent to the MOM in charge of the job  by calling
.I relay_to_mom() .
On the reply from MOM, the function
.I release_req() 
is invoked which just frees the batch_request structure.
.LP
An error is returned from issue_signal() only if it cannot allocate the
batch_request structure or if relay_to_mom fails.  We have no idea what
MOM did with the signal.
.LP
When MOM replies to the Signal Job request, the function specified as  
.At func
will be invoked via the work task mechanism.  
.I "This function MUST free the batch request and close the connection" .
The easiest way is to call
.I release_req() .
.LP
The 
.At extra
parameter, and in fact the
.At func
post process function were added to issue_signal() to generalize it
(how does one spell \*Qkludge\*U) for req_delete.c.
.Fn post_signal_req()
.Cs
static void post_signal_req(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
pointer to the deferred child work task.
.RE
.LP
When MOM replies to a Signal Job request forwarded to her on behalf of
an external client, this function
will receive her reply and relay its code to the client.
.LP
If either of the special signal names,
.Ty suspend
or
.Ty resume ,
was issued and MOM acknowledged the request without error,  the flag
.Sc JOB_SVFLG_Suspend
is updated in
.Av ji_svrflags
(set for suspend, cleared for resume).
The job_state attribute value letter is changed to 
.Ty S
or
.Ty R 
by calling
.I set_statechar() .
.NH 4
.Ix req_stat.c
.LP
The file
.I src/server/req_stat.c
contains the server functions for providing status about 
.IP \(bu
A job or set of jobs in reply to a Status Job batch request.
The client may request status of a single job by supplying the job id,
or a set of jobs by supplying a destination id.
If a destination id is supplied, then the status of all jobs at
that destination, a queue, for which the user is entitled to status is returned.
.IP \(bu
A queue or all the queues owned by the server.
.IP \(bu
The server itself.
.LP
.Fn req_stat_job()
.Cs
void req_stat_job(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
If the id supplied in the request, rq_id, is not null and begins with a
numeric character, the request is for status of a single job whose id is
specified.
.LP
If the id in the request is not null and begins with an alphabetic character,
then the id specifies a queue.  An attempt is made to locate the queue of
that name.
If the queue does not exist on this server, 
.Er PBSE_UNKQUE
is returned to the user.
.LP
Else if the id in the request is null, or starts with the '@' character, then
the request is for all jobs in the server.  
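.LP
The three cases can be told apart by the first character of the id.  A
minimal sketch with hypothetical names; the SK_BAD_ID result for any other
leading character is an assumption, since the text does not cover it.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical classification of a Status Job request id. */
enum sk_scope { SK_ONE_JOB, SK_ONE_QUEUE, SK_ALL_JOBS, SK_BAD_ID };

enum sk_scope sk_classify_id(const char *id)
{
    if (id == NULL || id[0] == 0 || id[0] == '@')
        return SK_ALL_JOBS;      /* null id, or a server name only */
    if (isdigit((unsigned char)id[0]))
        return SK_ONE_JOB;       /* job ids begin with the sequence number */
    if (isalpha((unsigned char)id[0]))
        return SK_ONE_QUEUE;     /* the queue must still be looked up */
    return SK_BAD_ID;            /* assumed: not described in the text */
}
```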
.LP
A private status control structure is allocated and initialized to hold the
type of status and pointer to the job or queue as required.  This structure
is passed to
.I req_stat_job_step2() .
.Fn req_stat_job_step2()
.Cs
static void req_stat_job_step2(struct stat_cntl *control)
.Ce
.IP Args: 4
.RS
.IP control
is a pointer to privately defined status control structure.
.RE
.LP
This function,
.I stat_step_2() ,
is an effect of the complication of having MOM be responsible for running jobs.
When a user requests status of a running job, the user expects to see 
information about resource utilization by that job.  This implies that the
server must obtain reasonably current status information from MOM for each
job for which the client requested status.  Additional complication arises from
the desire to keep the Server free from waiting on any other server; that is,
no synchronous requests.  As the server works through the list of jobs
for which the client requested status, rather than ask MOM for an update and
block waiting on her reply, each time the server goes to a MOM, a work task
is established and the server returns to its main loop.  This adds two
additional routines
.I stat_to_mom()
and
.I stat_update() .
.LP
Checking the state of jobs and asking MOM for recent updates, and then
building the final reply to the client, are done in two separate passes.
This eliminates the possibility of starting to build the status
for a job and having to go to MOM, only to have the job disappear
before we hear from MOM.
.LP
The first part of 
.I req_stat_job_step2()
checks each job for which status is requested.
The type of request, single job, jobs in queue, all jobs, as well as the
last job checked is passed in the status control structure.  The \*Qlast job
checked\*U is null the first time in,  this causes stat_step_2 to start with
the first job in the queue or server list, or the single job in the request.
.LP
If any job is running and the last update from MOM was received more than
.Sc PBS_RESTAT_JOB
seconds ago, then it goes to MOM again.  PBS_RESTAT_JOB is used to keep
the server from flooding MOM with status requests for an anxious user.
The function
.I stat_to_mom()
is called to form a status request and issue
it to the appropriate MOM.
If the user asked for status of a single job, that is all we ask from MOM,
otherwise we ask MOM for status of all her jobs.  This may save additional
requests later.
At this point req_stat_job_step2 returns back to the
main server loop.  When MOM replies, the action picks up in 
.I stat_update()
which updates the job status information and re-invokes stat_step_2() passing
it a pointer to the
.Ar stat_cntl
structure used to maintain the position among the jobs.  The process continues
with the next job.  This explains the funny initialization of 
.Ar pjob
within the while() loop.   Note, if the job disappears while the server is
waiting for MOM to reply to the status, the server just starts over as
find_job() returns a null, the starting condition.
.LP
The second part of req_stat_job_step2 (which should be step 3)
is to loop back through the jobs and build
up the status reply to be returned to the user.  This is done by calling
.I status_job()  
for each job for which status is being provided.
Then, at long last, the status can be returned to the client.
Note, if status_job returns any non-zero status other than
.Er PBSE_PERM ,
that error is returned to the client.  If PBSE_PERM is returned, that
job is ignored, it is invisible to the client.
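.LP
The handling of per-job return codes in the second pass can be sketched
as below.  The codes array stands in for the values status_job() would
return for successive jobs; the function name and the SK_PBSE_ constants
are hypothetical, and their values are not those of the PBS error header.

```c
#define SK_PBSE_NONE 0           /* hypothetical values */
#define SK_PBSE_PERM 1

/* codes[i] stands in for the value status_job() returned for job i.
   Returns SK_PBSE_NONE, or the first error other than SK_PBSE_PERM;
   *nreported counts the jobs whose status entered the reply. */
int sk_collect_status(const int *codes, int njobs, int *nreported)
{
    int i;

    *nreported = 0;
    for (i = 0; i < njobs; i++) {
        if (codes[i] == SK_PBSE_PERM)
            continue;            /* job stays invisible to this client */
        if (codes[i] != SK_PBSE_NONE)
            return codes[i];     /* any other error ends the request */
        (*nreported)++;
    }
    return SK_PBSE_NONE;
}
```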
.Fn stat_to_mom()
.Cs 
int stat_to_mom(job *pjob, struct stat_cntl *control)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job.
.IP control
pointer to the status control structure.
.RE
.IP Returns: 4
.RS
.IP 0
if no errors.
.IP error
number if problem.
.RE
.LP
A Status Job batch request is created and initialized, see
.I alloc_br() .
This request has a pointer to the status control structure.
A connection is opened to MOM by calling
.I svr_connect() .
The connection is maintained until MOM replies.
The status request is sent to MOM by calling 
.I issue_request() .
.Fn stat_update()
.Cs
static void stat_update(struct work_task *task)
.Ce
.IP Args: 4
.RS
.IP task
pointer to the deferred child work task.
.RE
.LP
This function is invoked by
.I process_reply()
when the reply to a status request is received from MOM.
Per the overall paradigm (does a paradigm make four nickels?),
process_reply calls a specific processing routine identified in a work
task structure associated with the connection.  This work task points to
the original batch request structure.  In this case, the specific processing
routine is stat_update and the batch request structure also points to the
private status control structure.  
.LP
For each object status element returned,
the job structure is located by calling
.I find_job() 
with the job name from the reply.
The job attributes contained in the reply are passed to
.I modify_job_attr()
which updates the job structure.  
Note, the
.Sc ATR_DFLAG_FSET
flag is set in the permissions passed to modify_job_attr.
This allows \*QRead Only\*U attributes, such as the Session ID to be
modified.
.LP
If 
.Ar ji_momstat
is zero in the job structure, this is the first update since the job started
to run.  Hence we should save the job info to disk with a call to
.I job_save()
with
.Av SAVEJOB_FULL .
ji_momstat is set non-zero so we will not save after future updates from MOM.
.LP
If the job structure could not be found, it might have been deleted after
we issued the request to MOM.  We just ignore the situation here.
When
.I req_stat_job_step2()
discovers the missing job, it will restart the update process from the 
beginning of the queue or server's list.  Without the job, we cannot
continue to the next because the link field has been unlinked and freed.
.LP
In either case, the batch request built to send to MOM is freed and
the connection is broken. 
Typically the routine that called stat_to_mom(), likely
req_stat_job_step2(), is specified in the status control structure.  This
routine is re-invoked to continue with the next job.
Should no routine be specified, see 
.I stat_mom_job() ,
the control structure is freed even though it was not allocated here.  This
saves an extra function just to do that.
.Fn stat_mom_job()
.Cs
void stat_mom_job(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to a single job.
.RE
.LP
This routine is a special front end to 
.I stat_to_mom()
to allow functions outside of this source file to issue a status call to MOM.
The primary user is 
.I post_sendmom() . 
We need to obtain the session id of the job newly placed into execution.
.LP
A status control structure is built and passed along with the job pointer
to stat_to_mom().  In this case, the function to invoke after MOM replies
is null.
.Fn req_stat_que()
.Cs
void req_stat_que(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
The reply structure is initialized.
If the
.At id
in the request is either the null string or a null pointer, then status of
all queues at the server is being requested.
The routine
.I status_que()
is called in turn for each queue managed by the server.
.LP
Otherwise, it is a request for status of a single specified queue.  The
queue is located and
.I status_que()
is called for that queue.
If the specified queue does not exist, then
.Er PBSE_UNKQUE
is returned.
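.LP
The dispatch between \*Qall queues\*U and \*Qone queue\*U can be sketched
as below.  The structure sk_queue, the function name, and the value of
SK_PBSE_UNKQUE are assumptions made for illustration; the point where the
real code would call status_que() is marked by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_PBSE_UNKQUE 1   /* placeholder, not the real error number */

struct sk_queue {
    const char *name;
};

/* Returns the number of queues that would be statused,
 * or -SK_PBSE_UNKQUE if a named queue does not exist. */
static int sk_stat_queues(const struct sk_queue *queues, size_t nqueues,
                          const char *id)
{
    size_t i;

    assert(queues != NULL || nqueues == 0);
    if (id == NULL || id[0] == 0) {
        /* null pointer or null string: status_que() for every queue */
        return (int)nqueues;
    }
    for (i = 0; i < nqueues; i++) {
        if (strcmp(queues[i].name, id) == 0)
            return 1;          /* status_que() for this queue only */
    }
    return -SK_PBSE_UNKQUE;
}
```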
.Fn status_que()
.Cs
void status_que(queue *pque, struct batch_request *preq, list_head *preqattr)
.Ce
.IP Args: 4
.RS
.IP pque
pointer to the queue structure.
.IP pliststat
pointer to the head of the list to which a status structure is appended.
.IP preq
pointer to the batch request, used to access the requested attribute list
and client permissions.
.RE
.LP
A status structure is allocated, the object type is set to \*Qqueue,\*U
and the object name to the queue name.  The structure is linked to
.At pliststat .
.LP
The private function
.I status_attrib()
is called to encode and attach the attributes of the queue to the reply.
.Fn req_stat_svr()
.Cs
void req_stat_svr(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
A status structure is allocated, the object type is set to \*Qserver,\*U
and the object name to the server name.  The structure is linked to
.At pliststat .
.LP
The private function
.I status_attrib()
is called to encode and attach the attributes of the server to the reply.
.Fn update_state_ct()
.Cs
static void update_state_ct(attribute_def *padef, attribute *pattr, 
                            int *ct_array)
.Ce
.IP Args: 4
.RS
.IP padef
pointer to an attribute definition.
.IP pattr
pointer to an attribute value.
.IP ct_array
pointer to the array of integers which holds the count of jobs per state.
.RE
.LP
This function is used to update the \*Qjobs per state\*U attribute of the queue
and the server.  It is called whenever a status request is made of the
queue or server.  The count of the number of jobs in each state is maintained
in private data space within the queue or server structure.  These values
are converted to strings and placed in the public attribute.
.LP
The data space for the public Jobs by State attribute is a fixed character
array in the server or queue structure.  Note the special 
.I decode_null()
and
.I set_null()
routines associated with this attribute.
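.LP
One plausible way to convert the per-state counts into the public string is
sketched below.  The state names, their order, and the \*Qname:count\*U
layout are assumptions chosen for illustration; the text above only says
that the counts are converted to strings and placed in a fixed character
array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SK_NSTATES 6

/* Assumed state names and order, for illustration only. */
static const char *const sk_state_names[SK_NSTATES] = {
    "Transit", "Queued", "Held", "Waiting", "Running", "Exiting"
};

/* Write "Name:count " pairs into buf.
 * Returns 0 on success, -1 if the fixed buffer is too small. */
static int sk_format_state_counts(const int *ct_array, char *buf,
                                  size_t buflen)
{
    size_t used = 0;
    int i;

    assert(ct_array != NULL && buf != NULL && buflen > 0);
    buf[0] = 0;
    for (i = 0; i < SK_NSTATES; i++) {
        int n = snprintf(buf + used, buflen - used, "%s:%d ",
                         sk_state_names[i], ct_array[i]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Because the destination is a fixed array, the sketch checks every
snprintf() result instead of assuming the text fits.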
.NH 4
.Ix stat_job.c
.LP
The file
.I src/server/stat_job.c
contains functions to support the Status Job Request.  These are separated
to make them available for use in MOM.
.Fn status_job()
.Cs
int status_job(job *pj, batch_request *preq, svrattrl *pal,
               list_head *pliststat, int *bad)
.Ce
.IP Args: 4
.RS
.IP pj
pointer to the job structure.
.IP preq
pointer to the batch request.
.IP pal
pointer to first of a list of svrattrl structs containing attributes
to be returned.
.IP pliststat
UPDATED: pointer to the head of the list to which a status structure
is appended.
.IP bad
UPDATED: set if one of the specific attributes in 
.At pal
is invalid.
.RE
.IP Returns: 4
.RS
.IP 0
if no error.
.IP "non zero"
error number if error occurred.
.RE
.LP
The privilege to read (request status of) the job is validated.
If the client does not have operator or manager permission, then the request
is accepted only if the client is the job owner or the server allows all
jobs to be read, see server attribute
.At SRV_ATR_query_others .
If the client is denied access, 
.Er PBSE_PERM
is returned.
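.LP
The access test can be summarized by the following sketch.  The privilege
bits, the error value, and the function name are hypothetical; only the
order of the checks is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder values, not the real PBS definitions. */
#define SK_PRIV_OPER 0x1
#define SK_PRIV_MGR  0x2
#define SK_PBSE_PERM 1

/* Returns 0 if the client may read the job, SK_PBSE_PERM otherwise.
 * query_others plays the role of the server attribute of that name. */
static int sk_may_stat_job(int client_priv, const char *client,
                           const char *owner, int query_others)
{
    assert(client != NULL && owner != NULL);
    if (client_priv & (SK_PRIV_OPER | SK_PRIV_MGR))
        return 0;                       /* operators and managers */
    if (query_others)
        return 0;                       /* server lets anyone look */
    if (strcmp(client, owner) == 0)
        return 0;                       /* the job owner */
    return SK_PBSE_PERM;
}
```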
.LP
A status structure is allocated, the object type is set to job, and the
object name is set to the job identifier.
.LP
The state attribute
.Sc JOB_ATR_state
is updated from the ji_state field in the job structure.
.LP
The attributes of the job are encoded and attached to the reply
structure by
.I status_attrib() .
.Fn status_attrib()
.Cs
static void status_attrib(svrattrl *pal, attribute_def *padef,
                          attribute *pattr, int limit, int priv, 
                          list_head *phead, int *bad)
.Ce
.IP Args: 4
.RS
.IP pal
pointer to the list of requested attributes.
.IP padef
pointer to the attribute definition structure array for the object.
.IP pattr
pointer to the parent object's attributes.
.IP limit
the number of attributes in the above arrays.
.IP priv
the privilege of the client.
.IP phead
pointer to the head of the list in the reply structure to which the 
encoded attributes are linked.
.IP bad
UPDATED: set to the index of the first invalid attribute in 
.At pal .
.RE
.LP
If no specific attributes of the statused object were requested,
the list pointed to by 
.At pal
is empty (null).  In that case, each attribute of the object which is
.I readable
with the client's level of privilege is encoded into a svrattrl structure
by calling the
.I at_encode()
routine for the attribute.  Each svrattrl entry is appended to the list
headed in the status structure.
.LP
If specific attributes were specified in the batch request,
the list pointed to by
.At pal
is not empty.  In that case, only those attributes which are known to the
server and readable are returned to the client.  
For each requested attribute, the corresponding attribute entry is located
and encoded into a svrattrl as above.
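.LP
The two selection modes can be illustrated with the sketch below.  It
collects attribute names instead of encoding svrattrl entries, and it
reports the first unknown name as a 1-based index; both choices, along with
the structure sk_def and the flag values, are assumptions made to keep the
example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder read-permission bits. */
#define SK_RD_USER 0x01
#define SK_RD_MGR  0x10

struct sk_def {
    const char *name;
    int         flags;
};

/* Copy the names of readable attributes into out[] and return the count.
 * want == NULL (or nwant == 0) selects every readable attribute.
 * *bad receives the 1-based index of the first unknown name, else 0. */
static size_t sk_select_attrs(const struct sk_def *defs, size_t limit,
                              int priv, const char *const *want,
                              size_t nwant, const char **out, int *bad)
{
    size_t i, j, n = 0;

    assert(defs != NULL && out != NULL && bad != NULL);
    *bad = 0;
    if (want == NULL || nwant == 0) {
        for (i = 0; i < limit; i++)
            if (defs[i].flags & priv)
                out[n++] = defs[i].name;
        return n;
    }
    for (j = 0; j < nwant; j++) {
        for (i = 0; i < limit; i++)
            if (strcmp(defs[i].name, want[j]) == 0)
                break;
        if (i == limit) {               /* not known to the server */
            if (*bad == 0)
                *bad = (int)(j + 1);
            continue;
        }
        if (defs[i].flags & priv)
            out[n++] = defs[i].name;
    }
    return n;
}
```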
.LP
Note that MOM's version of this routine is simpler.  MOM encodes for the
status reply only those attributes listed in an array of specified attributes,
.Ar mom_rtn_list ,
contained in this file.

.NH 4
.Ix req_track.c
.LP
The file
.I src/server/req_track.c
contains the server functions for recording job tracking information
received in a Track Job batch request.
The information is recorded in a member of a
.I tracking
array.  There is a pointer,
.At sv_track ,
to the array in the server structure, as well as the current size of the array,
.At sv_tracksize ,
and a flag,
.At sv_trackmodified ,
indicating if the structure has been modified.
.Fn req_trackjob()
.Cs
void req_trackjob(struct batch_request *req)
.Ce
.IP Args: 4
.RS
.IP req
pointer to the batch_request structure.
.RE
.LP
The tracking array is searched for a matching job id.  In case it is not
found, a pointer is kept to where in the array to insert a new record.
If an entry with a matching job id is located and its hopcount is less than
that in the request, it is updated with the new information from the request.
Otherwise a new entry is allocated, set with the information from the request,
and linked into the list.
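.LP
A possible reading of the search and update logic is sketched below.  The
structure sk_track and its fields are hypothetical, and the sketch assumes
that a request whose hopcount is not greater than the recorded one is
simply ignored.  Where the real server would enlarge the array, the sketch
returns an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SK_IDLEN 64

struct sk_track {
    int  used;
    int  hopcount;
    char jobid[SK_IDLEN];
    char location[SK_IDLEN];
};

/* Returns 1 if the table was modified, 0 if the request was stale,
 * -1 if no slot was free (a real server would grow the array). */
static int sk_track_record(struct sk_track *tab, size_t size,
                           const char *jobid, int hopcount,
                           const char *location)
{
    struct sk_track *slot = NULL;   /* first free slot seen so far */
    size_t i;

    assert(tab != NULL && jobid != NULL && location != NULL);
    for (i = 0; i < size; i++) {
        if (tab[i].used && strcmp(tab[i].jobid, jobid) == 0) {
            if (tab[i].hopcount >= hopcount)
                return 0;           /* recorded entry is at least as new */
            slot = &tab[i];         /* update the matching entry */
            break;
        }
        if (!tab[i].used && slot == NULL)
            slot = &tab[i];         /* remember where to insert */
    }
    if (slot == NULL)
        return -1;
    slot->used = 1;
    slot->hopcount = hopcount;
    snprintf(slot->jobid, sizeof slot->jobid, "%s", jobid);
    snprintf(slot->location, sizeof slot->location, "%s", location);
    return 1;
}
```

A return of 1 corresponds to the case in which the server would set its
modified flag.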
.LP
The
.I sv_trackmodified 
flag is set in the server to indicate the list has been modified since the
last time it was saved.
.Fn track_save()
.Cs
static void track_save()
.Ce
.LP
This function saves job tracking entries to disk.
If the server flag
.I sv_trackmodified
is not set, there are no updated entries, so just exit.
.LP
The save file specified in
.Av path_track
is opened and the save buffer is written out.
Then the save file is closed.  The 
.I sv_trackmodified
flag is cleared.

.NH 3
.Tc Job Router Overview
.LP
The purpose of the Job Router is to find a destination queue which matches
the requirements for a job in a route queue.  Each queue given as
a destination for a route queue is tried.  If the destination is
local (in the same server that contains the route queue), the
communication with the destination queue is internal.  If not,
a process is created to deal with sending the job over the network.
.LP
Each attempt to send a job to a queue starts with a Queue Job Request which
includes information about the requirements for the job.  If the
queue can accommodate the job, it accepts the queue request.  
If not, it rejects it.  If the error return indicates the rejection
is permanent, the queue name is added to a list kept with each job
of destinations to not try again.
.NH 4
.Ix job_route.c
.LP
The major functions in file
.I src/server/job_route.c
which make up the Job Router are described below.
.Fn add_dest()
.Cs
badplace *add_dest(job *pjob)
.Ce
.IP Args: 4
.RS
.IP pjob
The job which has an entry made in its bad destination list.
.RE
.IP Returns
.RS
.IP pointer
if call is successful.
.IP NULL
If call is not successful.
.RE
.LP
.Fn is_bad_dest()
.Cs
badplace *is_bad_dest(job *pjob, char *dest)
.Ce
.IP Args: 4
.RS
.IP pjob
The job to check for the destination.
.IP dest
The destination to look for.
.RE
.IP Returns
.RS
.IP pointer
If dest is found.
.IP NULL
If dest is not found.
.RE
.LP
The list of badplace structures attached to the job is searched for one
with the specified destination.  If found a pointer to it is returned,
otherwise a null pointer is returned.
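.LP
The bad destination list can be pictured as a singly linked list hanging
off the job, as in the sketch below.  The structure layouts and the _sketch
function names are hypothetical, and unlike add_dest() the sketch takes the
destination as an explicit argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sk_badplace {
    struct sk_badplace *next;
    char                dest[64];
};

struct sk_job {
    struct sk_badplace *bad_head;   /* list of rejected destinations */
};

/* Record dest as bad for this job; returns the new entry or NULL. */
static struct sk_badplace *add_dest_sketch(struct sk_job *pjob,
                                           const char *dest)
{
    struct sk_badplace *bp;

    assert(pjob != NULL && dest != NULL);
    bp = calloc(1, sizeof *bp);     /* zero fill keeps dest terminated */
    if (bp == NULL)
        return NULL;
    strncpy(bp->dest, dest, sizeof bp->dest - 1);
    bp->next = pjob->bad_head;
    pjob->bad_head = bp;
    return bp;
}

/* Return the entry for dest, or NULL if dest is not on the list. */
static struct sk_badplace *is_bad_dest_sketch(const struct sk_job *pjob,
                                              const char *dest)
{
    struct sk_badplace *bp;

    assert(pjob != NULL && dest != NULL);
    for (bp = pjob->bad_head; bp != NULL; bp = bp->next)
        if (strcmp(bp->dest, dest) == 0)
            return bp;
    return NULL;
}
```

The entries would be released when the job itself is purged; that step is
omitted here.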
.Fn default_router()
.Cs
int default_router(job *pjob, pbs_queue *pque, long retry)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to job to route.
.IP pque
pointer to queue in which the job resides.
.IP retry
next time to retry the route.
.RE
.IP Returns:
0 if job is being routed or is still ok in the queue,
non-zero if cannot be and should be aborted.
.LP
An attempt is made to route the job to each destination listed in order in
the queue attribute
.At QR_ATR_RouteDestin .
Upon having attempted the last destination, if 
.Ar ji_retryok
in the job structure is false, no destination would accept the job; that 
is logged and 
.Er PBSE_ROUTEREJ
is returned.
If 
.Ar ji_retryok
is true, at least one destination can be retried at
.Ar retry
time, so zero is returned.
.LP
For each destination, 
.I is_bad_dest()
is called to check if the current destination is listed in the job structure
as a \*Qbad\*U destination, one which has permanently rejected the job.
If bad, the next destination is tried.  The function
.I svr_movejob() 
is invoked to attempt to move (route) the job to the current trial destination.
If it returns -1, the current destination is added to the bad list by calling
.I add_dest() .
If the move succeeded, or is underway (move to a different server), we return
zero.   If svr_movejob() returns 1, the move failed, but may be retried, so
.Ar ji_retryok
is set true and the next destination is tried.
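.LP
The loop over destinations and the handling of the svr_movejob() return
values can be sketched as follows.  The job structure, the function
pointer standing in for svr_movejob(), the demo movers, and the value of
SK_PBSE_ROUTEREJ are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_PBSE_ROUTEREJ 1   /* placeholder, not the real error number */
#define SK_MAXBAD 8

struct sk_job {
    int         retryok;          /* plays the role of ji_retryok */
    const char *bad[SK_MAXBAD];   /* destinations that rejected for good */
    size_t      nbad;
};

/* Stand-in for svr_movejob(): returns -1, 0, 1 or 2. */
typedef int (*sk_move_fn)(const char *dest);

static int sk_is_bad(const struct sk_job *pjob, const char *dest)
{
    size_t i;
    for (i = 0; i < pjob->nbad; i++)
        if (strcmp(pjob->bad[i], dest) == 0)
            return 1;
    return 0;
}

static int sk_default_router(struct sk_job *pjob, const char *const *dests,
                             size_t ndest, sk_move_fn move)
{
    size_t i;

    assert(pjob != NULL && move != NULL);
    for (i = 0; i < ndest; i++) {
        if (sk_is_bad(pjob, dests[i]))
            continue;                   /* permanently rejected earlier */
        switch (move(dests[i])) {
        case -1:                        /* permanent rejection */
            if (pjob->nbad < SK_MAXBAD)
                pjob->bad[pjob->nbad++] = dests[i];
            break;
        case 1:                         /* temporary failure */
            pjob->retryok = 1;
            break;
        default:                        /* 0: moved, 2: move under way */
            return 0;
        }
    }
    return pjob->retryok ? 0 : SK_PBSE_ROUTEREJ;
}

/* Demo movers used only to exercise the sketch. */
static int sk_reject(const char *dest) { (void)dest; return -1; }
static int sk_busy(const char *dest)   { (void)dest; return 1; }
static int sk_remote_only(const char *dest)
{
    return strchr(dest, '@') != NULL ? 2 : -1;
}
```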
.Fn job_route()
.Cs
int job_route(job *job)
.Ce
.IP Args: 4
.RS
.IP job
The job which is to be routed.
.RE
.IP Returns:
.RS
.IP 0
If call is successful.  Note, the job may still be "owned" by the
local server.
.IP non-zero
error number if call failed.
.RE
.LP
Check the job state.
If the job is in state 
.Sc JOB_STATE_TRANSIT ,
ignore it, it is already routing.
If the job is in 
.Sc JOB_STATE_HELD
and attribute
.At QR_ATR_RouteHeld
is not true or the job is in state
.Sc JOB_STATE_WAITING
and attribute 
.At QR_ATR_RouteWaiting
is not true, then we will ignore the job shortly.
If the job is in any state other than the above or 
.Sc JOB_STATE_QUEUED ,
a record is added to the log and the job is ignored.
.LP
Next we check the queue in which the job resides.  It must be started,
.At QA_ATR_Started
true, and if the queue attribute
.At QA_ATR_MaxRun
is set the number of jobs in the queue in state
.Sc JOB_STATE_TRANSIT
must be less than that specified in the attribute.
.LP
If the job has been lying around in the queue for longer than the allowable
life time,
.At QR_ATR_RouteLifeTime ,
return 
.Er PBSE_ROUTEEXPD .
The retry time is calculated to be the current time plus either the value of
the queue attribute
.At QR_ATR_RouteRetryTime
if set, or the default retry time
.Sc PBS_NET_RETRY_TIME .
If the job is to be ignored because of its state we do that now (after the
test for life time).
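.LP
The life time test and the retry time calculation amount to a few lines of
arithmetic, sketched below.  The function name, the convention that a value
of zero or less means \*Qattribute not set,\*U and the two constant values
are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Placeholder values, not the real PBS definitions. */
#define SK_NET_RETRY_TIME 30
#define SK_PBSE_ROUTEEXPD 2

/* Returns SK_PBSE_ROUTEEXPD if the job has outlived the route life time,
 * otherwise stores the next retry time in *retry and returns 0.
 * lifetime and retry_attr of zero or less are treated as "not set". */
static int sk_route_times(time_t now, time_t queued_at, long lifetime,
                          long retry_attr, time_t *retry)
{
    assert(retry != NULL);
    if (lifetime > 0 && (long)(now - queued_at) > lifetime)
        return SK_PBSE_ROUTEEXPD;
    *retry = now + (retry_attr > 0 ? retry_attr : SK_NET_RETRY_TIME);
    return 0;
}
```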
.LP
We are now in the main routing loop.
If the job has been through all the possible destinations without being
routed we check the retry flag, ji_retryok.  If it is cleared, all destinations
rejected the job for reasons which seem permanent, 
.Er PBSE_ROUTEREJ
is returned.  If any destination rejected the job for \*Qtemporary\*U reasons,
unable to contact the server, or the queue was not enabled, the route retry
time for the job, ji_un.ji_routet.ji_rteretry, is set to the retry time and
zero is returned.
.LP
Otherwise, we have more destinations to try.  The next one is selected and
.I is_bad_dest()
called to determine if it is on the \*Qbad\*U list.  If not,
.I svr_movejob()
is called to attempt to route the job.
If svr_movejob returns an indication that the destination gave a permanent
rejection, the destination is added to the bad list by 
.I add_dest() .
If the rejection is temporary, the retry flag, ji_retryok, is set and we go
on to the next candidate destination.  Otherwise, the route is in progress
or has been completed (if local) and so zero is returned.
.Fn queue_route()
.Cs
void queue_route(queue *que)
.Ce
.IP Args: 4
.RS
.IP que
pointer to a routing queue.
.RE
.LP
For each job whose route retry time, ji_un.ji_routet.ji_rteretry,
has been reached, we call
.I job_route() .
If job_route() returns 
.Er PBSE_ROUTEREJ ,
rejected by all destinations,
or
.Er PBSE_ROUTEEXPD ,
life in queue expired,
the job is aborted.
.NH 4
.Ix svr_movejob.c
.LP
The major functions in file
.I src/server/svr_movejob.c
which make up the Job Mover are described below.
.Fn svr_movejob()
.Cs
int svr_movejob(job *job, char *destination, batch_request *request)
.Ce
.IP Args: 4
.RS
.IP job
The job which is to be routed.
.IP destination
The destination queue where the job will be sent.
.IP request
The batch request from the client or NULL if this is from route.
.RE
.IP Returns:
.RS
.IP 0
If move is complete.  The job is now owned by the destination queue.
.IP -1
If call failed.  The job has not moved.
.IP 1
A \*Qtemporary\*U failure.  The call failed but may be tried again.
.IP 2
The move is deferred (in progress).  A child has been created to process it and
will return sometime in the future.
.RE
.LP
Copy the destination into the job structure.
If the destination is local to this server, call
.I local_move() ,
else call
.I net_move() .
.Fn local_move()
.Cs
int local_move(job *job, batch_request *request)
.Ce
.IP Args: 4
.RS
.IP job
The job which is to be routed.
.IP request
The batch request from the client or NULL if this is from route.
.RE
.IP Returns:
.RS
.IP 0
If route is complete.  The job is now owned by the destination queue.
.IP -1
If call failed.  The job has not moved.
.IP 1
The move failed but may be retried.
.RE
.LP
Search for the destination queue, if it does not exist return -1.
If the queue is not enabled, return 1.
The function 
.I svr_chkque()
is called to check the destination queue state and the resource requirements
of the job against the queue limits.  The type of move (route, non-privileged
user move, privileged move) determines what items are enforced in svr_chkque();
in particular, the resource requirements are checked only if the job is not
being moved at the specific request of the administrator.
If the job requirements fit the destination queue limits, unlink job from
current queue via
.I svr_dequejob() ,
reset the queue rank job attribute
.At JOB_ATR_qrank 
to a new value (job goes to the end of the queue),
and link into queue via
.I svr_enquejob() .
.Fn net_move()
.Cs
int net_move(job *job, batch_request *request)
.Ce
.IP Args: 4
.RS
.IP job
The job which is to be moved, or routed.
.IP request
The batch request from the client or NULL if this is from route.
.RE
.IP Returns:
.RS
.IP 2
If no error occurred.  The job is in the state JOB_STATE_TRANSIT.
A child has been created which will return with a status indicating
success or failure.
.IP -1
If call failed.  The job has not changed state.
.RE
.IP "Returns from child:"
.RS
.IP 0
If route is complete.  The job is now owned by the destination queue.
.IP 1
If call failed.  The job has not moved.
.IP 2
The move failed but may be retried.
.RE
.LP
This function serves double duty.  It is used to route a job (from a routing
queue, see
.I job_route()\  ),
or to move a job (a move request) to another batch server.
.LP
The server name (host name) and service port is determined by passing the
destination sub-string following a \*Q@\*U character to
.I parse_servername() .
The host address is obtained from
.I get_hostaddr() .
The job state is set to 
.Sc JOB_STATE_TRANSIT .
This information, along with the type of move and post child processing
function, is passed to
.I send_job()
to actually fork a child to send the job.
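.LP
Splitting a destination into its queue, host, and port parts might look
like the sketch below.  The structure sk_dest, the function name, and the
\*Qqueue@host:port\*U layout with a colon before the port are assumptions;
the real work is done by parse_servername() and get_hostaddr(), whose
interfaces are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct sk_dest {
    char     queue[64];
    char     host[64];    /* empty string means a local destination */
    unsigned port;
};

/* Returns 0 on success, -1 if the destination is malformed. */
static int sk_split_dest(const char *dest, unsigned default_port,
                         struct sk_dest *out)
{
    const char *at, *colon;
    size_t qlen, hlen;

    assert(dest != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    out->port = default_port;

    at = strchr(dest, '@');
    qlen = at ? (size_t)(at - dest) : strlen(dest);
    if (qlen >= sizeof out->queue)
        return -1;
    memcpy(out->queue, dest, qlen);
    if (at == NULL)
        return 0;                       /* no server part */

    at++;                               /* text following the '@' */
    colon = strchr(at, ':');
    hlen = colon ? (size_t)(colon - at) : strlen(at);
    if (hlen == 0 || hlen >= sizeof out->host)
        return -1;
    memcpy(out->host, at, hlen);
    if (colon != NULL) {
        char *end;
        unsigned long p = strtoul(colon + 1, &end, 10);
        if (end == colon + 1 || *end != 0 || p == 0 || p > 65535)
            return -1;
        out->port = (unsigned)p;
    }
    return 0;
}
```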
.LP
If the batch_request pointer is not null, the move is the direct result of
a Move Job batch request.  The move_type parameter is set to
.Sc MOVE_TYPE_Move ,
the post child processing function desired is
.I post_movejob() ,
and the data pointer to place in the work task points to the request.
Otherwise, the move results from a route operation.  The move_type parameter is
set to
.Sc MOVE_TYPE_Route ,
the post child function is 
.I post_routejob() ,
and the data pointer is set to NULL (after all, there is no request to
which to point).
.Fn send_job()
.Cs
int send_job(job *pjob, pbs_net_t address, int port, int move_type,
             void (*post_func)(struct work_task *),
             void *data_pointer)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job to be sent.
.IP address
of the destination server (host).
.IP port
number for the service (server or MOM).
.IP move_type
the type of send:  move, route, or execute (to MOM).
.IP post_func
address of a function to invoke after completion of the move/route.
.IP data_pointer
pointer to the data of interest to the post child function, saved in
the work task.
.RE
.IP Returns: 4
.RS
.IP 2
if the child was successfully created (see svr_movejob).
.IP -1 
if error, pbs_errno set to the error number.
.RE
.LP
The death-of-child signal is blocked until the work task is set and the child
is underway.
A child process is forked to do the queue job request sequence.
.LP
The parent creates a work task to be dispatched on death of the child.
The job pointer is passed to
.I set_task()
to be placed in 
.At wt_parm1 .
The data_pointer item, either NULL or a pointer to the batch request is
inserted into 
.At wt_parm2 .
The post processing routine had better expect what is in wt_parm2.
The dispatched function depends on the type of move.  It is passed in as
post_func, and is typically:
.in +0.5i
.br
.I post_routejob() 
if the move type is route.
.br
.I post_movejob()
if the move type is move.
.br
.I post_to_mom()
if the move type is execute.
.in -0.5i
.br
Now, unblock the death-of-child signals and return 2.
.LP
The created child process, the router, performs the following actions.
It sets up a signal catcher to ensure an error return.
The job attributes are encoded into a list of 
.Av svrattrl
structures.
The encoding mode is according to the destination,
.Sc ATR_ENCODE_MOM
if the job is being sent to MOM, move type is
.Sc MOVE_TYPE_Exec ;
and
.Sc ATR_ENCODE_SVR
if the job is being routed to another server, move type is
.Sc MOVE_TYPE_Route .
The svrattrl structures contain the 
.Av attrl
structures required by the API routines in libpbs.a.  The attrl sub-structures
are correctly linked by calling
.I attrl_fixlink() .
The path name of the job's script file is set up based on the file prefix
information in ji_qs.ji_fileprefix.
.LP
The following steps are tried several times:
.IP
If this is not the first time around the loop, there must have been an
error the prior time.  Disconnect from the server.  Call
.I should_retry_route()
to determine if we should retry, if not exit with a status of 1.
.IP
Connect to the destination server by calling
.I svr_connect() .
If the connection fails for a reason marked by svr_connect as
.Sc PBS_NET_RC_FATAL ,
the failure is recorded in the log and an exit status of one (1) indicates
the permanent failure.  If the failure is not permanent, continue with the
next cycle around the loop.
.IP
If the job is already in substate
.Sc JOB_SUBSTATE_TRNOUTCM ,
we are attempting to complete an interrupted job send operation.  We skip
steps up to sending the \*Qready to commit\*U.
.IP
Call the API routine
.I _pbs_queuejob()
to send the job attributes.
If the job has a checkpoint file at MOM,
.Ar JOB_SVFLG_CHKPT
is set and if the move is a send to MOM, then skip steps up to the 
commit step.
.IP
Call the API routine 
.I _pbs_jscript()
to send the script file.
.IP
If the move type is to MOM and the job has already been run once,
.Sc JOB_SVFLG_HASRUN
set, then copy over the job's standard output, error and (if exists)
migratable checkpoint file. 
.IP
Now block all signals so the final stages of the transfer cannot be
stopped by the server.
Send the ready to commit by calling the API routine
.I _pbs_rdytocmt() .
If it is rejected, unblock signals and continue the next cycle around the loop.
.IP
The receiving server now has everything and, except when sending to MOM for
execution, we purge our copy to prevent duplicate jobs.
If the move type is not type 
.B execute ,
delete the job files by calling
.I job_purge() .
Send the commit, call
.I _pbs_commit() .
.LP
Disconnect from the destination server, and indicate a successful move with
an exit of 0.
.Fn post_routejob()
.Cs
void post_routejob(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
A pointer to the work task entry.
.RE
.LP
This function is invoked from a work_task entry when the job router
process terminates.  The work_task member
.At wt_parm1
points to the job being routed and
.At wt_aux
is set to the exit status of the router.
If the router exit status shows the job was sent ok:
.br
- if files were already staged-in, call
.I remove_stagein() ,
.br
- delete the job by calling
.I job_purge() 
and return.
.LP
If the exit status indicates a permanent failure, it is either a \*Qbad\*U 
destination or the router caught a signal.
If the job substate is set to
.Sc JOB_SUBSTATE_ABORT ,
the server has received a request to delete the job, so stop the routing;
another work task will complete the delete process.  Otherwise the
destination is bad, mark it not to be tried again for this job,
.I add_dest() .
On either a permanent or temporary failure, attempt to route to the next
destination by recalling
.I job_route() .
If job_route() returns any error, abort the job.
.Fn post_movejob()
.Cs
void post_movejob(struct work_task *pwt)
.Ce
.IP Args: 4
.RS
.IP pwt
pointer to the work task entry.
.RE
.LP
This function is invoked from a work task entry when the job router
process has terminated.  The route was the result of a Move Job request.
The pointer to the batch_request structure for the move request is in
.At wt_parm2 .
The exit status of the child process which attempted the move (route) is in
.At wt_aux .
.LP
When the child process which was forked to perform a route operation in
response to a Move Job batch request terminates, this function is called by
the work task dispatch routine.
.LP
A reply is returned to the client based on the
exit status of the routing child process.  If the status is zero, there
were no errors and the job has been routed to a new server.
If files had been staged-in for the job, they are deleted by calling
.I remove_stagein() .
The job is purged and a success reply is returned to the client.
.LP
If errors occurred, the job still exists on this server.  An error reply
is returned to the client and the job is requeued by setting the state to 
the value returned by
.I svr_evaljobstate()
and calling 
.I svr_setjobstate() .
.Fn should_retry_route()
.Cs
static int should_retry_route(int error)
.Ce
.IP Args: 4
.RS
.IP error
to be examined to determine retry or not retry.
.RE
.IP Returns: 4
.RS
.IP 1
if the route should be retried.
.IP -1
if the route should not be retried.
.RE
.LP
This function looks at the error passed as a parameter and determines
if the route should be retried.

.NH 3
.Tc Header Files
.LP
.NH 4
.Ix attribute.h
.LP
The structures, symbols, and access function prototypes needed to declare and
define attributes are located in this header file.
.LP
Attributes are represented in one or both of two forms, external and
internal.  When an attribute is moving external to the server, either
to the network or to disk (for saving), it is represented in the
external form, a 
.I svrattrl
structure.  This structure holds the
attribute name as a string.  If the attribute is a resource type, the
resource name is encoded as a string, else it is null.  The value of the
attribute (or resource) is encoded into a third string.  The structure contains
a length field for all three strings and a field which gives the over all size
of the svrattrl structure and the appended strings.
.LP
Internally, attributes exist in two separate structures.  The
attribute type is defined by a definition structure,
.I attribute_def ,
which contains the name of the attribute, flags, and pointers to the functions
used to access the value.  This info is "hard coded".  There is one
"attribute definition" per (attribute name, parent object type) pair.
.LP
The attribute value is contained in another structure,
.I attribute ,
which contains the value within a union of the possible value types.
The possible types are:
.IP ATR_TYPE_LONG
the data is arithmetic or boolean and fits in a C long type internal to
the structure.
.IP ATR_TYPE_CHAR
the data is a single character and is maintained internal to the structure.
.IP ATR_TYPE_STR
the data is a null terminated string.  Storage for the data is on the heap and
a pointer to it is in the attribute structure.
.IP ATR_TYPE_ARST
the data is an array of strings.  The value in the attribute structure points
to a 
.I array_strings
structure on the heap.  This structure has an array of pointers to each string.
The strings are maintained on the heap in contiguous storage.
.IP ATR_TYPE_SIZE
the data is a size.  It is maintained as a long integer and two flag sets
which specify K,M,G,T and bytes or words.
.IP ATR_TYPE_RESC
the data is a list of resources, see resource.h.
Each resource is on the heap.
.IP ATR_TYPE_LIST
the data is a list of other structures.  Each member of the list is on the heap.
.IP ATR_TYPE_ACL
the data is an Access Control List.  It is maintained as an array of strings,
ATR_TYPE_ARST, but marked differently to aid in the saving to/recovery from
disk.
.LP
Privilege to access an attribute is defined by the bit wise "inclusive or"
of the following as set in the attribute definition:
.IP ATR_DFLAG_USRD
readable (status can be obtained) by a non-privileged user client.
.IP ATR_DFLAG_USWR
writable (can be set) by a non-privileged user client.
.IP ATR_DFLAG_OURD
Reserved.
.IP ATR_DFLAG_OUWR
Reserved.
.IP ATR_DFLAG_OPRD
readable by a client with operator privilege.
.IP ATR_DFLAG_OPWR
writable by a client with operator privilege.
.IP ATR_DFLAG_MGRD
readable by a client with manager privilege.
.IP ATR_DFLAG_MGWR
writable by a client with manager privilege.
.IP ATR_DFLAG_SvRD
readable (will be sent to) another server or the scheduler.
.IP ATR_DFLAG_SvWR
writable (can be set by) another server or the scheduler.
.IP ATR_DFLAG_MOM
Sent to MOM with the job when it is to be run.  Those and only those attributes
(and resources) so marked are sent to MOM.  Applies to Job attribute/resources
only.
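.LP
Because the permissions are single bits combined with \*Qinclusive or,\*U an
access check reduces to a bit wise \*Qand,\*U as the sketch below shows.
The SK_ names and their numeric values are placeholders and are not the
values defined in attribute.h.

```c
#include <assert.h>

/* Placeholder bit values; the real ones are in attribute.h. */
#define SK_DFLAG_USRD 0x01
#define SK_DFLAG_USWR 0x02
#define SK_DFLAG_OPRD 0x04
#define SK_DFLAG_OPWR 0x08
#define SK_DFLAG_MGRD 0x10
#define SK_DFLAG_MGWR 0x20

/* A definition readable by everyone and writable by no client. */
#define SK_READ_ONLY (SK_DFLAG_USRD | SK_DFLAG_OPRD | SK_DFLAG_MGRD)

/* Nonzero if any access bit presented by the client is granted
 * by the attribute definition. */
static int sk_allowed(int def_flags, int client_bits)
{
    return (def_flags & client_bits) != 0;
}
```

With these definitions a user may read an SK_READ_ONLY attribute but may
not write it, which mirrors the \*QRead Only\*U attributes mentioned
earlier in this chapter.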
.LP
The following bit wise flags are used by the Server, they are set in the
attribute definition structure:
.IP ATR_DFLAG_ALTRUN
the job attribute or resource can be altered while the job is running.
.IP ATR_DFLAG_NOSTAT
the attribute is returned to a client only on specific request for this
attribute.  Can be used to shorten the list seen with a \*Qqstat -f\*U.
.IP ATR_DFLAG_SELEQ
in a select operation, see qselect(1), the only legal operations are
equal (.eq.) and not-equal (.ne.).
.IP ATR_DFLAG_RASSN
the job resource entry is to be summed into the server's
.At resources_used
attribute when the job is placed into execution, and subtracted when the job
terminates.
.IP ATR_DFLAG_RMOMIG
currently not used.
.LP
The following flags are maintained by the server in the attribute (value)
structure:
.IP ATR_VFLAG_SET
the attribute/resource is set, i.e. the value has meaning.
.IP ATR_VFLAG_MODIFY
the attribute/resource has been modified either by a decode or set operation.
.IP ATR_VFLAG_DEFLT
the value is set to a system defined default value.  The value is neither saved
nor sent to another server as the default may be different.
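.LP
A cut-down picture of the attribute value structure and its flags is given
below.  The member names, the three types shown, the flag values, and the
rule that an explicit set clears the default flag are assumptions for
illustration; the authoritative layout is the one in attribute.h.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag values. */
#define SK_VFLAG_SET    0x01
#define SK_VFLAG_MODIFY 0x02
#define SK_VFLAG_DEFLT  0x04

enum sk_type { SK_TYPE_LONG, SK_TYPE_CHAR, SK_TYPE_STR };

struct sk_attribute {
    unsigned     at_flags;
    enum sk_type at_type;
    union {
        long  at_long;
        char  at_char;
        char *at_str;     /* heap storage owned by the attribute */
    } at_val;
};

/* Store a long value and mark the attribute as set and modified. */
static void sk_set_long(struct sk_attribute *pattr, long value)
{
    assert(pattr != NULL);
    pattr->at_type = SK_TYPE_LONG;
    pattr->at_val.at_long = value;
    pattr->at_flags |= SK_VFLAG_SET | SK_VFLAG_MODIFY;
    pattr->at_flags &= ~(unsigned)SK_VFLAG_DEFLT;
}

/* A value is worth saving only if it is set and is not a default. */
static int sk_should_save(const struct sk_attribute *pattr)
{
    assert(pattr != NULL);
    return (pattr->at_flags & SK_VFLAG_SET) != 0 &&
           (pattr->at_flags & SK_VFLAG_DEFLT) == 0;
}
```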
.NH 4
.Ix resource.h
.LP
This header file contains the definitions and declarations for resources.
As discussed earlier, resources are a special case of an attribute, a linked
list of attribute values headed in an attribute such as 
.At resource_list .
Resources use structures similar to those of attributes.  Certain types, type related
functions, and flags may differ between the two.
.LP
Within the resource structure, the value is contained in an attribute
substructure, this is done so the various attribute decode and encode
routines can be "reused".
.LP
Unlike "attributes" which are typically identical between servers
within an administrative domain,  resources vary between systems.
Hence, the resource instance has a pointer to the resource definition
rather than depending on a predefined index.  Three routines are declared
within the header file that are useful in finding or adding resources:
.IP find_resc_def()
returns a pointer to the resource definition structure for a given resource
name.
.IP find_resc_entry()
returns a pointer to a resource entry in a resource list which points to the
supplied resource definition.  Null is returned if no such entry exists
within the list.
.IP add_resource_entry() 
will add an unset entry to the list.
.LP
All the flags and permission bits discussed under
.I attribute.h
apply to resources.
.NH 4
.Ix batch_request.h
.LP
This file contains the giant union into which all batch requests are converted.
Where possible, the fields are fixed length so the structure can be malloc-ed
in one piece.
.NH 4
.Ix credential.h
.LP
This file contains the structures and constants used in producing a PBS 
authentication credential.
.NH 4
.Ix job.h
.LP
This header file contains the structure definition used by the Server and
MOM to hold the job information.  Note that there are two parts to the job
structure:  the interior portion, sub-structure jobfix, contains the fixed
length data for each job that is saved to disk; the remainder of the structure
contains data that can be reconstructed and need not be saved.
.LP
A note on the Job State and Substate: the State is a gross indication of the
job state which is returned to the user.  The Substate is the actual state
of the \*Qjob state engine.\*U
.NH 4
.Ix list_link.h
.LP
This file contains the structure definitions, function prototypes, and
access macros for managing a doubly linked list.  The structures defined
are:
.IP list_link
This structure contains the forward and backward pointer for each list
entry.  It is typically placed as the first sub-structure of the structure
defining the list entry.
.Cs
typedef struct list_link {
	struct list_link *ll_prior;
	struct list_link *ll_next;
	void             *ll_struct;
} list_link;
.Ce
.IP list_head
A list_head is identical to a list_link structure with member ll_struct set to
NULL.
.LP
The macros CLEAR_LINK and CLEAR_HEAD are defined in this header file.
The macros GET_NEXT and GET_PRIOR are also defined here.  They expand either
to in-line code or a function call depending on the setting of the symbol
.Sc NDEBUG .
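.LP
How such a list might be used is sketched below.  The list_link layout is
the one shown above; the helper functions are hypothetical stand-ins for
the macros, and they assume a circular list in which the head's null
ll_struct marks the end.  The real macros may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list_link {
    struct list_link *ll_prior;
    struct list_link *ll_next;
    void             *ll_struct;
} list_link;
typedef list_link list_head;

/* Assumed effect of CLEAR_HEAD and CLEAR_LINK: point the link at itself. */
static void sk_clear(list_link *l)
{
    assert(l != NULL);
    l->ll_prior = l;
    l->ll_next = l;
    l->ll_struct = NULL;
}

/* Append entry, embedded in the structure pobj, at the tail of the list. */
static void sk_append(list_head *head, list_link *entry, void *pobj)
{
    assert(head != NULL && entry != NULL);
    entry->ll_struct = pobj;
    entry->ll_prior = head->ll_prior;
    entry->ll_next = head;
    head->ll_prior->ll_next = entry;
    head->ll_prior = entry;
}

/* Assumed effect of GET_NEXT: the structure owning the following link,
 * or NULL when the following link is the head. */
static void *sk_get_next(const list_link *l)
{
    assert(l != NULL);
    return l->ll_next->ll_struct;
}

/* Example list entry with the link as its first member. */
struct sk_item {
    list_link link;
    int       value;
};
```

Walking the list then consists of calling sk_get_next() on the head and on
each entry's link until it returns NULL.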

.NH 3
.Tc Site Modifiable Files
.LP
The files and functions described in this section provide a site the ability
to customize PBS to meet special requirements.
The supplied version of the C source files may be found in the
.I src/lib/Libsite
directory and are linked via the libsite.a library. 
How to modify these files is discussed in the IDS chapter on libsite.a.
In addition, there is a set of header files, loaded into the target tree
include directory, which provide the capability to add new attributes.
.NH 4
.Ix site_allow_u.c
.LP
The file
.I src/lib/Libsite/site_allow_u.c
contains the following function:
.Fn site_allow_u()
.Cs
int site_allow_u(char *user, char *host)
.Ce
.IP Args: 4
.RS
.IP user
The name of the user making a connection to the server.
.IP host
The name of the host from which the user is making a connection.
.RE
.IP Returns: 4
.RS
.IP zero
If the user is to be allowed access, zero (0) is returned.  This is the default.
.IP non-zero
If the user is to be denied access, a non-zero error code, typically
.Er PBSE_PERM
should be returned.
.RE
.LP
The provided version always returns zero.
A site may add code to perform whatever checks it wishes.  Realize, however,
that this function is called on every new connection.  A check that takes
a long time will impact performance.
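.LP
One possible site check is sketched below: a small in-memory deny table that
is searched with string compares only, so that the per-connection cost stays
low.  The function name site_allow_u_sketch, the table, and the host names
are hypothetical.  The numeric value given to PBSE_PERM is only a stand-in
so the sketch compiles by itself; a real build should take it from the PBS
error header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in value (an assumption); use the PBS header in a real build. */
#ifndef PBSE_PERM
#define PBSE_PERM 15007
#endif

/* Hypothetical deny table: user/host pairs that are refused. */
struct deny_entry_sketch {
	const char *user;
	const char *host;
};

static const struct deny_entry_sketch deny_table_sketch[] = {
	{ "guest", "kiosk.example.com" },
	{ "guest", "lab.example.com" },
};

/* Return 0 to allow the connection, PBSE_PERM to deny it. */
int site_allow_u_sketch(char *user, char *host)
{
	size_t i;
	size_t n = sizeof(deny_table_sketch) / sizeof(deny_table_sketch[0]);

	assert(user != NULL && host != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(user, deny_table_sketch[i].user) == 0 &&
		    strcmp(host, deny_table_sketch[i].host) == 0)
			return PBSE_PERM;
	}
	return 0;	/* default: access allowed */
}
```

A check that needs a file read or a network lookup would better be cached,
since the routine runs on each new connection.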
.NH 4
.Ix site_alt_rte.c
.LP
The file
.I src/lib/Libsite/site_alt_rte.c
contains the following function:
.Fn site_alt_router()
.Cs
int site_alt_router(job *pjob, pbs_queue *pque, long retry)
.Ce
.IP Args: 4
.RS 
.IP pjob
pointer to job to be routed.
.IP pque
pointer to queue in which the job currently resides.
.IP retry
next route retry time.
.RE
.IP Returns:
zero if the job is still alive (in queue or being routed); otherwise a PBS
error code,
.Er PBSE_ROUTEREJ ,
if the job has been rejected by all destinations; the job will be killed.
.LP
As provided, this routine just calls the default router function,
.I default_router() .
A site may replace this function and \*Qactivate\*U it for a queue by setting
the queue attribute
.At QR_ATR_AltRouter
(alt_router) to true.  Please study the default router,
.I default_router()
to understand the required procedures which must be performed:
.IP Destinations
are listed in the queue attribute
.At QR_ATR_RouteDestin .
.IP svr_movejob()
should be used to perform the route.  It will return:
.RS
.IP -1 4
if the destination rejected the job for a reason which is considered
permanent; the destination should not be retried.
.IP 0
The route succeeded.  This implies the route was to a local queue, see next
return entry.
.IP 2
The route to a remote queue is under way (sending the job).  The job will have
been placed in Transiting state.  When the sending completes, either
(a) the job will have been moved and deleted locally, (b) the move failed
and the destination was added to the bad list, or (c) the move can be retried
and the job is requeued in the route queue in a state other than Transiting.
.IP 1
The route (local) failed, but can be retried later.
.RE
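.LP
The return codes listed above suggest a routing loop along the following
lines.  This is only a sketch under stated assumptions: the mover passed in
is a hypothetical stand-in for svr_movejob() with a simplified argument list,
the destination and bad-list arrays replace the real job and queue
structures, and the value given to PBSE_ROUTEREJ is a placeholder.  The stub
movers exist only so the sketch can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder value (an assumption); use the PBS header in a real build. */
#ifndef PBSE_ROUTEREJ
#define PBSE_ROUTEREJ 1
#endif

/* Hypothetical mover returning -1, 0, 1 or 2 with the meanings above. */
typedef int (*move_fn_sketch)(const char *destination);

/* Try each destination not yet marked bad; bad[i] != 0 means skip it. */
int route_sketch(const char *const dests[], int bad[], size_t ndest,
		 move_fn_sketch move)
{
	int retry_ok = 0;
	size_t i;

	assert(move != NULL);
	for (i = 0; i < ndest; i++) {
		if (bad[i])
			continue;
		switch (move(dests[i])) {
		case 0:		/* moved into a local queue */
		case 2:		/* send to a remote queue is under way */
			return 0;
		case 1:		/* failed for now, may be retried later */
			retry_ok = 1;
			break;
		default:	/* -1: permanent rejection, do not retry */
			bad[i] = 1;
			break;
		}
	}
	/* Job stays queued if any destination may be retried. */
	return retry_ok ? 0 : PBSE_ROUTEREJ;
}

/* Stub movers for trying the sketch. */
int reject_all_sketch(const char *d) { (void)d; return -1; }
int retry_later_sketch(const char *d) { (void)d; return 1; }
int accept_workq_sketch(const char *d)
{
	return strcmp(d, "workq") == 0 ? 0 : -1;
}
```

Marking a destination bad only on -1 keeps temporarily unavailable
destinations eligible for the next retry.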
.LP
.NH 4
.Ix site_check_u.c
.LP
The file 
.I src/lib/Libsite/site_check_u.c
contains the following functions:
.Fn site_acl_check()
.Cs
int site_acl_check(job *pjob, pbs_queue *pque)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job structure.
.IP pque
pointer to the candidate queue
.RE
.IP Returns: 4
.RS
.IP 0
if job is allowed into the queue
.IP non-zero
if job is not allowed into the queue
.RE
.LP
This routine determines if a job is allowed into a certain queue.
.Fn site_check_user_map()
.Cs
int site_check_user_map(job *pjob, char *luser)
.Ce
.IP Args: 4
.RS
.IP pjob
pointer to the job structure.
.IP luser
the local user name.
.RE
.IP Returns: 4
.RS
.IP 0 
if the user is allowed to execute as the login described by the password entry.
.IP -1
if the user is not allowed.
.RE
.LP
This routine determines if the job owner is privileged to execute as the
user described by a password entry.  The local user name is the login
name selected by 
.I geteusernam() 
from the
.At user-list
attribute of the job in question.
.LP
The PBS default distribution module determines privilege by:
.IP 1.
If the submitting host is the current host and the job owner name is
the same as the login name selected, the privilege is granted.
.IP 2.
If the hosts are different, privilege is granted by calling
.I ruserok(3N) .  
.QP
This is not strictly POSIX conforming as POSIX does not define ruserok().
However until 1003.22 actually has a standard for distributed security...
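.LP
The two rules above can be sketched as follows.  The names ending in _sketch
are hypothetical, and the remote check is passed in as a function pointer so
the sketch does not depend on a resolver; a real build might wrap
ruserok(rhost, 0, ruser, luser) there.  Denying the case of same host but
different names is an assumption of this sketch, since the text above does
not cover that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remote trust check: 0 if ruser at rhost may run as luser, else -1. */
typedef int (*remote_ok_fn_sketch)(const char *rhost, const char *ruser,
				   const char *luser);

/* Return 0 if owner may execute as luser, -1 if not. */
int check_user_map_sketch(const char *owner, const char *submit_host,
			  const char *this_host, const char *luser,
			  remote_ok_fn_sketch remote_ok)
{
	assert(owner != NULL && submit_host != NULL);
	assert(this_host != NULL && luser != NULL && remote_ok != NULL);

	/* Rule 1: same host, so the names must match (assumed: else deny). */
	if (strcmp(submit_host, this_host) == 0)
		return strcmp(owner, luser) == 0 ? 0 : -1;

	/* Rule 2: different hosts, so defer to the remote trust check. */
	return remote_ok(submit_host, owner, luser) == 0 ? 0 : -1;
}

/* Stub trust check for trying the sketch: one trusted host, same name. */
int trust_hostb_sketch(const char *rhost, const char *ruser,
		       const char *luser)
{
	if (strcmp(rhost, "hostb.example.com") == 0 &&
	    strcmp(ruser, luser) == 0)
		return 0;
	return -1;
}
```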
.NH 4
.Ix site_map_user.c
.LP
The file 
.I src/lib/Libsite/site_map_user.c
contains the following function:
.Fn site_map_user()
.Cs
char *site_map_user(char *user, char *host)
.Ce
.IP Args: 4
.RS
.IP user
The user name on the specified host.
.IP host
The host name.
.RE
.IP Returns: 4
.RS
.IP pointer
to the mapped user name.
.RE
.LP
This function provides a place holder for a mapping of a user name on one
system to the user name on a common reference system.
Given a user name and a host name, this routine will return a \*Qmapped\*U or
\*Qcommon\*U user name.  
It is used for mapping in two situations:
.IP \(bu
Authorization of requests \- 
A site may run with users having different login names on different
systems.  If a user submits a job from system A and wishes to status it from
system B where he has a different name, there must be some means to map the
two users to a common name.
.IP \(bu
Mapping job owner to execution user name where the submitting (qsub) host
and the execution host have different name spaces for users.
.LP
The routine as supplied assumes a common name across all systems.  Therefore
it just returns the user name as given.
.LP
This function is called from the routine
.I svr_chk_owner() 
in pbs_server which is used to determine if the requestor is the job owner,
and from
.I geteusernam()
to determine the local execution name.
.LP
A site is free to modify this routine to map as required.
No input data should be modified.   If a different name is to be returned,
it should be saved in a static character array of size
.Sc PBS_MAXUSER
as defined in pbs_ifl.h.
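.LP
A table-driven mapping that follows these rules might look like the sketch
below.  The function name site_map_user_sketch, the table, and its entries
are hypothetical.  The value of PBS_MAXUSER is a stand-in so the sketch
compiles alone, and the extra byte in the static array is an assumption made
here to leave room for the string terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in value (an assumption); a real build uses pbs_ifl.h. */
#ifndef PBS_MAXUSER
#define PBS_MAXUSER 16
#endif

/* Hypothetical mapping table: (user, host) to a common name. */
struct map_entry_sketch {
	const char *user;
	const char *host;
	const char *common;
};

static const struct map_entry_sketch map_table_sketch[] = {
	{ "jsmith", "hosta.example.com", "john" },
	{ "smithj", "hostb.example.com", "john" },
};

/* Return the common name; the input strings are never modified. */
char *site_map_user_sketch(char *user, char *host)
{
	static char mapped[PBS_MAXUSER + 1];	/* result outlives the call */
	size_t i;
	size_t n = sizeof(map_table_sketch) / sizeof(map_table_sketch[0]);

	assert(user != NULL && host != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(user, map_table_sketch[i].user) == 0 &&
		    strcmp(host, map_table_sketch[i].host) == 0) {
			strncpy(mapped, map_table_sketch[i].common,
				PBS_MAXUSER);
			mapped[PBS_MAXUSER] = 0;	/* always terminated */
			return mapped;
		}
	}
	return user;	/* no entry: name is already the common one */
}
```

Because the result lives in one static array, a caller that needs two mapped
names at once must copy the first before asking for the second.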
.NH 4
Adding Attributes to PBS
.LP
A site can add attributes to the server, to queues, or to jobs.  Three sets
of header files are provided for this purpose.   The files are copied from
empty (except for comments) template files (*.ht) in the src/include
directory to the object (target) include directory when the target tree
is set up.  They are named
.Ty site_*_attr_*.h .
The first asterisk stands for one of
.Ty svr (server),
.Ty que (queue),
or 
.Ty job ;
and the second asterisk stands for 
.Ty enum
or
.Ty def .
An additional two files, named
.Ty site_qmgr_*_print.h
are provided for including the server and queue attributes in the output
of the qmgr \*Qprint\*U sub-command.
.LP
The .ht files in the source tree should not be modified.   Any modifications
there may be lost with the next release of PBS.   The .h files placed in the
target/include directory will not be over written.
.sp 2
Files:
.br
.Ix site_svr_attr_def.h
.br
.Ix site_svr_attr_enum.h
.br
.Ix site_qmgr_svr_print.h
.LP
Together, these files provide the ability to add attributes to the server.
The attribute itself is defined in site_svr_attr_def.h and an enumerated index
is added in site_svr_attr_enum.h.  An attribute is defined by adding a
structure of the following form in site_svr_attr_def.h:
.Cs
	{	"attribute_name",
		decode_*,
		encode_*,
		set_*,
		comp_*,
		free_*,
		action_*,
		perm_flags,
		ATR_TYPE_#,
		PARENT_TYPE_SERVER
	},
.Ce
The quote marks and commas are required as shown.  The asterisk (*) and pound
sign (#) are replaced with the data type as found in 
.I attribute.h .
The common data types are:
.DS
.TS
box tab (|) ;
c c c c
.
Data Type|*|#|free_*
_
Boolean|b|LONG|free_null
Long int|l|LONG|free_null
A character|c|CHAR|free_null
String|str|STR|free_str
Array of strings|arst|ARST|free_arst
Size|size|SIZE|free_null
.TE
.DE
Within a single attribute definition, the routines and data types must agree.
.LP
The enumeration label within site_svr_attr_enum.h can be any name, but a name
of the form:
.Ty SVR_SITE_ATR_ name
is recommended to prevent name space conflicts.  For each attribute element
added (one set of entries within the braces) in site_svr_attr_def.h, there
.B "MUST BE"
one enumeration label added in site_svr_attr_enum.h.
.LP
The attribute names, as given in the site_svr_attr_def.h entries, may be added
in site_qmgr_svr_print.h.   If added, these attributes will be included in the
output of a 
.Ty "print server"
qmgr sub-command.  The format is:
.Cs
	"name_one",
	"name_two",
.Ce
.LP
For example, to add two new attributes named 
.B foo 
(a boolean) and
.B bar 
(a string), the following are added:
.LP
In site_svr_attr_def.h
.Cs
	{	"foo",
		decode_b,
		encode_b,
		set_b,
		comp_b,
		free_null,
		NULL_FUNC,
		NO_USER_SET,
		ATR_TYPE_LONG,
		PARENT_TYPE_SERVER
	},
	{	"bar",
		decode_str,
		encode_str,
		set_str,
		comp_str,
		free_str,
		NULL_FUNC,
		NO_USER_SET,
		ATR_TYPE_STR,
		PARENT_TYPE_SERVER
	},
.Ce
.LP
In site_svr_attr_enum.h:
.Cs
	SVR_SITE_ATR_foo,
	SVR_SITE_ATR_bar,
.Ce
.LP
And in site_qmgr_svr_print.h:
.Cs
	"foo",
	"bar",
.Ce
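.LP
The reason each definition needs exactly one enumeration label can be seen
in the following simplified sketch.  It assumes that the site entries are
spliced into a definition array and an enumeration that are indexed in
parallel; the two-member structure, the type codes, and every name ending in
_sketch are hypothetical stand-ins for the real ten-member definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the ten-member definition shown above. */
struct attr_def_sketch {
	const char *at_name;
	int         at_type;
};

enum { TYPE_LONG_SKETCH, TYPE_STR_SKETCH };

/* Index labels.  The two site labels sit where the contents of
 * site_svr_attr_enum.h would be pulled in. */
enum svr_atr_sketch {
	SVR_ATR_builtin_sketch,		/* hypothetical built-in entry */
	SVR_SITE_ATR_foo,
	SVR_SITE_ATR_bar,
	SVR_ATR_LAST_sketch		/* number of labels */
};

/* Definitions, in the same order as the labels above. */
const struct attr_def_sketch svr_attr_def_sketch[] = {
	{ "builtin", TYPE_LONG_SKETCH },
	{ "foo",     TYPE_LONG_SKETCH },
	{ "bar",     TYPE_STR_SKETCH  },
};

/* Non-zero when there is one label per definition. */
int attr_tables_agree_sketch(void)
{
	size_t ndef = sizeof(svr_attr_def_sketch) /
		      sizeof(svr_attr_def_sketch[0]);

	return ndef == (size_t)SVR_ATR_LAST_sketch;
}

/* Look up a definition by its enumeration label. */
const char *svr_attr_name_sketch(enum svr_atr_sketch idx)
{
	assert(attr_tables_agree_sketch());
	return svr_attr_def_sketch[idx].at_name;
}
```

If a definition were added without a label, or the two were listed in a
different order, a label would select the wrong definition or none at all.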
.sp 2
Files:
.br
.Ix site_que_attr_def.h
.br
.Ix site_que_attr_enum.h
.br
.Ix site_qmgr_que_print.h
.LP
The same information as given for the server attributes applies to defining
queue attributes.  The exception (there has to be at least one, right)
is that the parent type can be
.Ty PARENT_TYPE_QUE_ALL
for an attribute that applies to both execution and routing queues,
.Ty PARENT_TYPE_QUE_EXC
for execution queues only, or
.Ty PARENT_TYPE_QUE_RTE
for routing queues only.
.sp 2
Files:
.br
.Ix site_job_attr_def.h
.br
.Ix site_job_attr_enum.h
.LP
Again the same information holds.  The parent type is
.Ty PARENT_TYPE_JOB .
There is no qmgr header file for job attributes.

.\" force next chapter to odd page
.bp
.if e \{
\&
.sp 10
.DS C
[This page is blank.]
.DE
.bp
\}
