\" following macro cuts the date provided by RCS so only yy/mm/dd shows
.de Cd
\&\\$2
..
.sp 3
.in +0.7i
.sp 0.22i
.vs 11
.ps +22
P
.br
\h'11p'B\ \ 
.ps -4
\v'-4p'Portable Batch System\v'4p'
.br
.ps +4
\h'21p'S
.br
.ps -22
.vs 12
\l'4.9i'
.br
.in -0.8i
.sp 3
.TL
Test Plan
.AU
Albeaus Bayucan
Robert L. Henderson
Thomas M. Proett
Dave Tweten \(dg
.FS \(dg
Numerical Aerospace Simulation Systems Division,
NASA Ames Research Center, Moffett Field, CA
.FE
.AI
.B "MRJ Technology Solutions"
2672 Bayshore Parkway
Suite 810
Mountain View,  CA 94043
http://pbs.mrj.com
.sp
.so ../ers/release.ms
.br
Printed: \*(DY
.LP
.ds CH PBS Test Plan
.bp
.DS C
\s+2\f3Portable Batch System (PBS) Software License\fP\s-2
.sp
Copyright \(co 1999, MRJ Technology Solutions.
.br
All rights reserved.
.DE
.LP
Acknowledgment: The Portable Batch System Software was originally developed
as a joint project between the Numerical Aerospace Simulation (NAS) Systems
Division of NASA Ames Research Center and the National Energy Research
Supercomputer Center (NERSC) of Lawrence Livermore National Laboratory.
.LP
Redistribution of the Portable Batch System Software and use in source
and binary forms, with or without modification, are permitted provided
that the following conditions are met:
.IP -
Redistributions of source code must retain the above copyright and
acknowledgment notices, this list of conditions and the following disclaimer.
.IP -
Redistributions in binary form must reproduce the above copyright and
acknowledgment notices, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the distribution.
.IP -
All advertising materials mentioning features or use of this software must
display the following acknowledgment:
.RS
.QP
This product includes software developed by NASA Ames Research
Center, Lawrence Livermore National Laboratory, and MRJ Technology Solutions.
.RE
.sp
.LP
.ce
DISCLAIMER OF WARRANTY
.QP
THIS SOFTWARE IS PROVIDED BY MRJ TECHNOLOGY SOLUTIONS ("MRJ") "AS IS" WITHOUT
WARRANTY OF ANY KIND,  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY,  FITNESS
FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT ARE EXPRESSLY  DISCLAIMED.
.QP
IN NO EVENT, UNLESS REQUIRED BY APPLICABLE LAW, SHALL MRJ, NASA, NOR
THE U.S. GOVERNMENT  BE LIABLE FOR ANY DIRECT DAMAGES WHATSOEVER,
NOR ANY  INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.LP
This license will be governed by the laws of the Commonwealth of Virginia,
without reference to its choice of law rules.
.LP
.ds LF $Revision: 2.2 $
.ds CF -%-
.SH
Revision History
.LP
.IP "Revision 1.0" 15
March 1, 1994 \(em Initial approved version.
.so ../ers/rel_history.ms
.LP
.SH
Revision Bars
.LP
Revision bars, like those associated with this paragraph, identify
parts of the text that have changed significantly since the previous
revision.
.sp 3
.SH
Acknowledgements
.LP
Special acknowledgements are extended to the following groups for
their contributions to this project plan:
.sp
.DS
Lawrence Livermore National Laboratory - LCC
Lawrence Livermore National Laboratory - NERSC
.DE
.bp
.NH 1
INTRODUCTION
.LP
This document gives a description of the goals to be achieved through
testing every major release of the \*QPortable Batch System\*U or
.B PBS .
It also covers the procedures to be used to test the software package.
PBS is an extension to a POSIX\**
.FS
IEEE Standard 1003,
.I
Information technology \(em portable operating system interface.
.R
.FE
or Unix\**
.FS
Unix is a trademark of USL.
.FE
operating system which provides the capability to submit and control
jobs in a batch environment, independent of the interactive
environment.
The remainder of the document describes the sort of testing to be
performed at each stage of the process, leading finally to a
demonstration of operational readiness.
.LP
This is the first section.
The second section covers the goals to be achieved by testing PBS, as well
as an overview of basic concepts and terminology found in studies of software
testing.
The third, fourth and fifth sections describe the tests to be
performed as part of each of the three stages of testing.
The first stage is testing each part of PBS in isolation, performed by
the individual developer.
The second is testing the integrated release, performed by the PBS
project members during the integration process.
The final stage is operational readiness testing, conducted by INC
Branch members and culminating in an Operational Readiness Review.
.bp
.NH 1
GOALS OF TESTING
.LP
There are three reasons to test PBS.
The first is to find as many faults as possible so they can be
fixed.
The second is to demonstrate its operational readiness by failing,
after diligent testing, to find significant deficiencies.
The third is to increase confidence in the software's correctness and
reliability.
To achieve these three goals, several types of tests will be used,
through several stages of testing.

.NH 2
TYPES OF TESTS
.LP
For purposes of the PBS test program, there are five types of tests:
functionality tests, recoverability tests, reliability tests,
operational tests and random exercise tests.
.LP
A good set of functionality tests demonstrates, singly, each feature
claimed in the External Reference Specification.
In addition, it tests the most common combinations of features.
.LP
Recoverability tests exercise the software's responses to external
events.
They should include tests of normal and abnormal shutdown procedures
as well as responses to disappearance of various parts of PBS (i.e. daemon
deaths).
.LP
Reliability tests exercise the software's responses to nonsensical
conditions, including bad parameters, bad combinations of parameters and
network outages.
These tests determine whether or not the software behaves in a civilized
fashion when given out-of-limit inputs, and confirm what those
limits are, where possible.
.LP
Operational tests exercise the software's ability to perform its
designed mission.
They cover moving jobs through the entire job lifecycle and demonstrate that all appropriate events happen and are logged
correctly.
.LP
Random exercise tests attempt to simulate production conditions to
detect the sort of bugs which evade the other kinds of tests.
A random exercise test may consist of anything from a specially
constructed test script to making a system available to a group of
pilot users.

.NH 2
THE STAGES OF TESTING
.LP
The testing of PBS is to be a three-stage process consisting of unit
testing, system testing and operational readiness testing.
At all stages, the major goal is the discovery, isolation and removal
of bugs.
Each stage has a slightly different perspective and a different
standard and arbiter of success.
.LP
The first stage is unit testing.
It is conducted in conjunction with development of each piece of PBS.
The goal of unit testing is to remove as many bugs as possible when
the object under test is as simple as possible.
Functionality, recoverability and reliability testing are performed at
this stage.
Operational and random exercise tests are not appropriate at this
stage.
Unit testing will be performed by the developer or developers of each
piece of PBS.
A piece passes unit testing when its developer certifies that no major
known bugs remain.
.LP
The second stage is system testing.
It is conducted in conjunction with system integration. Individual
pieces of PBS are brought together to be tested as a whole. 
The goal of system testing is twofold: to discover any bugs missed
in unit testing and to discover bugs resulting from malignant
interactions between separately developed pieces.
Functionality, recoverability, reliability, operational and random
exercise tests are performed at this stage. An automated testing procedure
must be provided that will also serve as a regression testing tool for
changed code.
System testing is performed by all members of the PBS project team.
PBS passes system testing when all tests run and there are no known
bugs.
.LP
The third stage is operational readiness testing or acceptance testing.
It is conducted after release of PBS from the development team.
The goal of operational readiness testing is to demonstrate that PBS
is ready to serve NAS users.
This is conducted by different people using a completely different set
of tests from those used in earlier stages of testing, to reduce the
probability of systematic ``blind spots'' in the testing procedure.
These tests include functionality, recoverability, reliability,
operational and random exercise tests.
Operational readiness testing is performed by members of the NAS INC
branch.
PBS passes operational readiness testing when an Operational Readiness
Review has accepted the recommendation of the INC testers that it be
put into production.
.bp
.NH 1
UNIT TESTING
.LP
Unit testing is the thorough exercise of a major piece of PBS prior to
its integration into the whole.
Its purpose is to demonstrate basic function and to catch errors
early, before increased complication of the integrated system makes
problem isolation more difficult.
Tests must exercise each documented feature which requires no other
part of the whole for its execution.
In addition, recoverability and reliability testing must be performed.
.LP
When a major piece of PBS has passed unit testing, it becomes a
candidate for integration and system testing.
It passes unit testing when its developer certifies that no major
failures of functionality, recoverability or reliability remain.
Repair of any remaining minor failures must be made during integration
and system testing.
.LP
Because unit testing is the type of testing closest to the
implementation, some of the following specifications may require the
reader to make reference to the Internal Design Specification for
complete understanding.
.bp
.NH 2
UNIT TESTING THE USER COMMANDS
.LP
The user commands tested are qalter, qdel, qhold, qmove, qmsg,
qrerun, qrls, qselect, qsig, qstat, and qsub.
They are tested using
an automated tester program written in DejaGNU.
.LP
.NH 3
Functionality Tests
.LP
The following enumerates the classes of inputs needed to test the
functionality of PBS commands:
.NH 4
Qalter
.IP \(bu
Create input cases that test sending
.B qalter
-A, -S, -a, -c, -e, -h, -p and
-u options to a running job.
.IP \(bu
Write a test case that checks to make sure that
.B qalter
is returning the correct exit code.
.IP \(bu
Create
.B qalter
input cases that ensure:
.RS
.IP -
the job identifier operands of the form,
.I sequence_number[.server_name][@server]
are correctly sent to the server.
.IP -
valid -a
.B date_time
arguments of the form
.I  [[CC]YY]MMDDhhmm[.SS]
are sent correctly to the PBS server.
.IP - 
valid -A
.B account_string
arguments of reasonable length are sent properly to the PBS
server. 
.IP -
valid -c
.B interval
values (n, s, c, c=minutes) are passed correctly to the server.
.IP -
valid -e
.B path
arguments containing relative and absolute pathnames, including any pairings
with hostnames, are sent correctly to the server.
.IP -
valid -h
.B hold_list
arguments containing one or more of u(ser), o(ther), and s(ystem), or the
singular n(one) are sent correctly to the server. 
.IP -
valid -j
.B join
arguments (oe, eo, n) are sent properly to the server.
.IP -
valid -k
.B keep
arguments containing one or more of o (stdout) and e (stderr), or the
singular n (none) are sent correctly to the server.
.IP -
valid -l
.B resource_list
arguments of the form,
.RS
.I resource_name[=[value]][,resource_name[=[value]],...]
.RE
are sent correctly to the server.
.IP -
valid -m
.B mail_options
arguments containing one or more of (a, b, e), or the singular n(one),
are sent correctly to the server.
.IP -
valid -M
.B user_list
arguments of the form
.RS
.I user[@host][,user[@host],...]
.RE
are sent correctly to the server.
.IP -
valid -N
.B name 
arguments are sent correctly to the server.
.IP -
valid -o
.B path
arguments containing relative and absolute pathnames, including any pairings
with hostnames, are sent correctly to the server.
.IP -
valid -p
.B priority
arguments of values between -1024 and +1023 are sent correctly to the server.
.IP -
valid -r
arguments of either y or n are sent correctly to the server.
.IP -
valid -S
.B path_list
arguments of the form,
.RS
.I path[@host][,path[@host],...]
.RE
are sent correctly to the server.
.IP -
valid -u
.B user_list
arguments of the form,
.RS
.I user[@host][,user[@host],...]
.RE
are sent correctly to the server.
.IP -
valid -W
.B additional_attributes
arguments having one of the forms,
.RS
.IP
.I depend=type:argument[,type:argument[:argument...],...]
.IP
.I group_list=group[@host][,group[@host],...]
.IP
.I stagein=local_file@hostname:remote_file[,...]
.IP
.I stageout=local_file@hostname:remote_file[,...]
.RE
are sent correctly to the server.
.RE
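.LP
The job identifier form used by these cases can be checked in a test harness
before any command is run.
The following Python sketch is illustrative only: the function name
parse_job_id and the characters it accepts in the server_name and server
parts are assumptions, since this plan does not define them.
.DS L
```python
import re

# Assumed shape of sequence_number[.server_name][@server].
# The character classes for the two host parts are illustrative guesses.
JOB_ID = re.compile("^([0-9]+)(?:[.]([^.@][^@]*))?(?:@([^@]+))?$")


def parse_job_id(text):
    """Return (sequence_number, server_name, server) or None if malformed."""
    match = JOB_ID.match(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)
```
.DE
.LP
For example, parse_job_id("123.host1@srv2") returns (123, "host1", "srv2"),
while a non-numeric operand such as "abc" returns None.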
.NH 4
Qdel
.IP \(bu
Test deleting users' jobs as a PBS administrator or a PBS operator.
.IP \(bu
Write a test case that checks to make sure that
.B qdel
is returning the correct exit code.
.IP \(bu
Create
.B qdel
input cases that ensure:
.RS
.IP -
the job identifier operands of the form,
.RS
.I sequence_number[.server_name][@server]
.RE
are passed correctly to the server.
.IP -
the -W
.B delay
arguments are passed correctly to the server.
.RE
.NH 4 
Qhold
.LP
.IP \(bu
Write a test case that checks to make sure that
.B qhold
is returning the correct exit code.
.IP \(bu
Create
.B qhold
test cases that ensure: 
.RS
.IP -
the job identifier operands of the form,
.RS
.I sequence_number[.server_name][@server]
.RE
are passed correctly to the server.
.IP -
valid -h 
.B hold_list
arguments consisting of one or more of the letters (u, o, s), or the
singular "n", are correctly sent to the server. 
.RE
.NH 4 
Qmove
.LP
.IP \(bu
Write a test case that checks to make sure that
.B qmove
is returning the correct exit code.
.IP \(bu
Create
.B qmove
input cases that ensure: 
.RS
.IP -
valid
.B destination
arguments specified as
.I "queue, @server, queue@server"
are sent correctly to the server.
.IP -
the job identifier operands of the form,
.RS
.I sequence_number[.server_name][@server]
.RE
are correctly sent to the server.
.RE
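.LP
The three destination forms can be enumerated mechanically when building
input cases.
The sketch below, in Python, is one possible helper; the name
parse_destination and its error messages are hypothetical, and it checks
only the placement of the "@" separator, not whether the queue or server
exists.
.DS L
```python
def parse_destination(text):
    """Split a destination into (queue, server); either part may be None."""
    queue, sep, server = text.partition("@")
    if sep and not server:
        raise ValueError("separator must be followed by a server name")
    if not queue and not server:
        raise ValueError("empty destination")
    if "@" in server:
        raise ValueError("more than one separator in destination")
    return (queue or None, server or None)
```
.DE
.LP
A destination of "batch" yields ("batch", None), "@srv" yields
(None, "srv"), and "batch@srv" yields ("batch", "srv").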
.NH 4 
Qmsg
.LP
.IP \(bu
Write a test case that checks to make sure that
.B qmsg
is returning the correct exit code.
.IP \(bu
Create
.B qmsg
input cases that ensure:
.RS
.IP -
the job identifier operands of the form,
.RS
.I sequence_number[.server_name][@server]
.RE
are correctly sent to the server.
.IP -
the
.B  message_string
arguments of reasonable length are correctly sent to the server.
.IP -
single letter arguments -E and/or -O are correctly communicated to the server.
.RE
.NH 4 
Qrerun
.LP
.IP \(bu
Write a test case that checks to make sure that
.B qrerun
is returning the correct exit code.
.IP \(bu
Create
.B qrerun
test cases to ensure:
.RS
.IP -
the job identifier operands of the form,
.RS
.I sequence_number[.server_name][@server]
.RE
are correctly sent to the server.
.RE
.NH 4 
Qrls
.LP
.IP \(bu
Write a test case that checks to make sure that
.B qrls
is returning the correct exit code.
.IP \(bu
Create input cases that ensure:
.RS
.IP -
the job identifier operands of the form,
.RS
.I sequence_number[.server_name][@server]
.RE
are correctly sent to the server.
.IP -
valid -h
.B hold_list
arguments containing one or more of u(ser), o(ther), and s(ystem), or the
singular n(one) are sent correctly to the server. 
.RE 
.NH 4 
Qselect
.LP
.IP \(bu
Write a test case that checks to make sure that
.B qselect
is returning the correct exit code.
.IP \(bu
Create test cases that ensure:
.RS
.IP -
valid -A
.B account_string
arguments of reasonable length are correctly passed on to the server. 
.IP -
valid -a
.B [op]date_time
arguments of the form,
.RS
.I [.eq.|.ne.|.ge.|.gt.|.le.|.lt.][[CC]YY]MMDDhhmm[.SS]
.RE
are sent correctly to the server.
.IP -
valid -c
.B [op]interval
arguments of the form,
.RS
.I [.eq.|.ne.|.ge.|.gt.|.le.|.lt.][n|s|c|c=minutes|u]
.RE
are sent correctly to the server.
.IP -
valid -h
.B hold_list
arguments containing one or more of u(ser), o(ther), and s(ystem), or the
singular n(one) are sent correctly to the server. 
.IP -
valid -l
.B resource_list
arguments of the form,
.RS
.I "resource_name {.eq.|.ne.|.ge.|.gt.|.le.|.lt.} value[,resource_name...]"
.RE
are correctly sent to the server.
.IP -
valid -N
.B name
arguments are sent correctly to the server.
.IP -
valid -p
.RS
.I [.eq.|.ne.|.ge.|.gt.|.le.|.lt.]priority 
.RE
arguments are sent correctly to the server.
.IP -
valid -q
arguments specified as
.I "queue, @server, queue@server"
are sent properly to the server.
.IP -
valid -r
.B rerun
arguments of either y or n are sent correctly to the server.
.IP -
valid -s
.B states
arguments consisting of any combination of the characters "E", "Q", "R", "T",
and "W", are sent properly to the server.
.IP -
valid -u
.RS
.I user_name[@host][,user_name[@host],...]
.RE
arguments are passed on to the server correctly.
.RE
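.LP
Several of the qselect cases above prefix a value with a selection operator.
A harness can separate the two before comparing what was sent to the server.
The Python sketch below is an assumption-laden illustration: the function
name split_operator is hypothetical, and treating a missing operator as
equality is assumed here, not taken from this plan.
.DS L
```python
OPERATORS = (".eq.", ".ne.", ".ge.", ".gt.", ".le.", ".lt.")


def split_operator(argument, default=".eq."):
    """Return (operator, value) for an argument of the form [op]value."""
    for op in OPERATORS:
        if argument.startswith(op):
            return op, argument[len(op):]
    if argument.startswith("."):
        # A leading dot that is not a known operator is treated as an error.
        raise ValueError("unknown selection operator in " + repr(argument))
    return default, argument   # assumed default when no operator is given
```
.DE
.LP
With this helper, an argument such as ".ge.100" splits into the operator
".ge." and the value "100".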
.NH 4 
Qsig
.IP \(bu
Write a test case that checks to make sure that
.B qsig
is returning the correct exit code.
.IP \(bu
Create
.B qsig
test cases that ensure:
.RS
.IP -
the job identifier operands of the form,
.I sequence_number[.server_name][@server],
are correctly sent to the server.
.IP -
valid -s
.B signal
arguments specified as signal number, short signal name (e.g. KILL), or long
signal name (e.g. SIGKILL), are sent correctly to the server.
.RE
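.LP
To compare the three signal spellings, a test can reduce each to one
canonical value.
The Python sketch below uses the signal table of the machine running the
test as a stand-in; how the PBS server itself interprets signal names is
not specified here, and the function name normalize_signal is hypothetical.
.DS L
```python
import signal


def normalize_signal(spec):
    """Map a number, short name (TERM) or long name (SIGTERM) to a Signals."""
    if spec.isdigit():
        return signal.Signals(int(spec))    # ValueError if number is unknown
    name = spec if spec.startswith("SIG") else "SIG" + spec
    try:
        return signal.Signals[name]
    except KeyError:
        raise ValueError("unknown signal: " + spec) from None
```
.DE
.LP
All three spellings of a given signal should normalize to the same value,
and an unknown name or number raises ValueError.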
.NH 4 
Qstat
.IP \(bu
Write a test case that checks to make sure that
.B qstat
is returning the correct exit code.
.IP \(bu
Create
.B qstat
input cases that ensure:
.RS
.IP -
the -Q flag is correctly giving queue status.
.IP -
the -B flag is correctly giving the server status.
.IP -
the -f flag is correctly giving a full display status of job(s), destination(s),
or server(s).
.IP -
the job identifier operands of the form,
.I sequence_number[.server_name][@server]
are correctly sent to the server, when specified with or without the -f option.
.IP -
the
.B destination
arguments to the -Q option specified as
.I "queue, @server, queue@server"
are sent correctly to the server.
.IP -
the
.B server_name
arguments are correctly sent to the server, when specified with the -B option.
.RE
.NH 4
Qsub
.IP \(bu
Write a test case that checks to make sure that
.B qsub
is returning the correct exit code.
.IP \(bu
Create
.B qsub
input cases that test the interactive feature. Test to make sure the
PBS_ENVIRONMENT variable is getting set to "PBS_INTERACTIVE", that issuing a 
<control-C> before a job starts executing will actually delete the job
(after prompting user), that sending a terminate signal (kill -ABRT,
or ~.) to a running interactive job will actually kill the job, and that
sending a suspend signal (via ~susp) will actually suspend a running
interactive job.
.IP \(bu
Create input cases that test processing of directive lines in a script file.
Test different valid directive prefixes by using the -C option or by setting
the PBS_DPREFIX environment variable, and make sure blank lines and comment
lines (those beginning with #) are ignored during parsing.
.IP \(bu
Create basic input cases that ensure:
.RS
.IP -
valid -a
.B date_time
arguments of the form
.I  [[CC]YY]MMDDhhmm[.SS]
are sent correctly to the PBS server.
.IP - 
valid -A
.B account_string
arguments of reasonable length are sent properly to the PBS
server. 
.IP -
valid -c
.B interval
values (n, s, c, c=minutes) are passed correctly to the server.
.IP -
valid -C
.B directive_prefix
arguments of reasonable length are correctly seen as directive prefixes for
scanning script files.
.IP -
valid -e
.B path
arguments containing relative and absolute pathnames, including any pairings
with hostnames, are sent correctly to the server.
.IP -
the -h flag correctly signals the server to apply a user hold on the job
being submitted.
.IP -
the -I flag correctly tells the server to run the job as an "interactive"
job.
.IP -
valid -j
.B join
arguments (oe, eo, n) are sent properly to the server.
.IP -
valid -k
.B keep
arguments containing one or more of o (stdout) and e (stderr), or the
singular n (none) are sent correctly to the server.
.IP -
valid -l
.B resource_list
arguments of the form,
.RS
.I resource_name[=[value]][,resource_name[=[value]],...]
.RE
are sent correctly to the server.
.IP -
valid -m
.B mail_options
arguments containing one or more of (a, b, e), or the singular n(one),
are sent correctly to the server.
.IP -
valid -M
.B user_list
arguments of the form
.RS
.I user[@host][,user[@host],...]
.RE
are sent correctly to the server.
.IP -
valid -N
.B name 
arguments are sent correctly to the server.
.IP -
valid -o
.B path
arguments containing relative and absolute pathnames, including any pairings
with hostnames, are sent correctly to the server.
.IP -
valid -p
.B priority
arguments of values between -1024 and +1023 are sent correctly to the server.
.IP -
valid -q
.B destination
arguments specified as
.I "queue, @server, queue@server"
are sent correctly to the server.
.IP -
valid -r
arguments of either y or n are sent correctly to the server.
.IP -
valid -S
.B path_list
arguments of the form,
.RS
.I path[@host][,path[@host],...]
.RE
are sent correctly to the server.
.IP -
valid -u
.B user_list
arguments of the form,
.RS
.I user[@host][,user[@host],...]
.RE
are sent correctly to the server.
.IP -
valid -v
.B variable_list
arguments are correctly passed on to the server.
.IP -
the -V flag will signal the server to export all environment variables in the
qsub's command environment to the batch job.
.IP -
valid -W
.B additional_attributes
arguments having one of the forms,
.RS
.IP
.I depend=type:argument[,type:argument[:argument...],...]
.IP
.I group_list=group[@host][,group[@host],...]
.IP
.I stagein=local_file@hostname:remote_file[,...]
.IP
.I stageout=local_file@hostname:remote_file[,...]
.RE
are sent correctly to the server.
.IP -
the -z flag will cause qsub to not return to stdout the job identifier
assigned to the job.
.RE
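.LP
The directive-line cases above can be checked against a small reference
scanner.
The Python sketch below is not the qsub implementation: the default
prefix "#PBS", the stripping of leading white space, and the rule that
scanning stops at the first command line are all assumptions made for
illustration, and the name scan_directives is hypothetical.
.DS L
```python
def scan_directives(script_text, prefix="#PBS"):
    """Collect directive arguments from the leading lines of a job script."""
    directives = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            directives.append(stripped[len(prefix):].strip())
        elif not stripped or stripped.startswith("#"):
            continue    # blank lines and comment lines are ignored
        else:
            break       # assumed: scanning stops at the first command
    return directives
```
.DE
.LP
Passing a different prefix argument models the effect of the -C option or
the PBS_DPREFIX environment variable.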
.bp
.LP
.NH 3
Reliability Tests
.NH 4
Qalter
.IP \(bu
Test illegal values of the -a
.B date_time 
arguments such as bad month (MM), day (DD), hour (hh), minutes (mm), seconds
(ss). Also, test
.B date_time
arguments that are in the PAST, and exercise arguments with incorrect syntax.
.IP \(bu
Test behavior of qalter when given a bad -A syntax. 
.IP \(bu
Test incorrectly specified -c option such as giving an
.B interval
argument that is not in the set (n, s, c, c=minutes), and combining "n" with
other argument letters.
.IP \(bu
Test the -e argument when it has a bad or unknown hostname part.
.IP \(bu
Test incorrectly specified -h option such as giving a
.B hold_list
argument that is not one or more of (u, o, s), and combining n with another
argument letter.
.IP \(bu
Test incorrectly specified -j option such as giving a
.B join 
argument that is not in the set (oe, eo, n) or combining n with another 
argument letter.
.IP \(bu
Test incorrectly specified -k option such as giving an
argument that is not one or more of (o, e), and combining n with the other
two argument letters.
.IP \(bu
Test incorrectly specified -l option such as giving an invalid resource name,
an invalid resource value, and incomplete
.B resource_list 
specification.
.IP \(bu
Test incorrectly specified -m option such as giving an
argument that is not one or more of (a, b, e), and combining n with the other
three argument letters. 
.IP \(bu
Test badly formed
.B user_list
argument to the -M option.
.IP \(bu
Test -N job name arguments that are too long (16 chars) or too short
(0 chars), or whose first character is non-alphabetic.
.IP \(bu
Test the -o argument by giving it a bad or unknown hostname part.
.IP \(bu
Test values of -p
.B priority
that are outside the valid range [-1024,+1023]. 
.IP \(bu
Test values of -r
arguments that are not in the set (y, n).
.IP \(bu
Test incorrectly specified -S
.B path_list
argument particularly when there are 2 shell paths specified for one host, 2
shell paths specified without a host, or when the path_list specified does not
follow the correct syntax.
.IP \(bu
Test incorrectly specified -u
.B user_list
argument particularly when there are 2 user names specified for one host, 2
user names specified without a host, or when the user_list specified does not
follow the correct syntax.
.IP \(bu
Test incorrectly specified -W 
.B depend=dependency_list
argument, particularly when given a bad value for
.B synccount
and
.B on ,
or when the depend arguments are badly constructed.
.IP
Test incorrectly specified -W
.B group_list=g_list
argument such as a badly constructed g_list, a g_list containing 2 groups for
one host or 2 groups without a host.
.IP
Test incorrectly specified -W
.B "stagein=file_list, stageout=file_list"
argument such as a badly constructed file_list, or a badly
constructed combination of stagein and stageout arguments.
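.LP
The bad date_time cases listed for the -a option can be generated and
classified with a small reference parser.
The Python sketch below is illustrative: the name parse_date_time is
hypothetical, and filling a missing century or year from the current date
is an assumption, not a rule stated in this plan.
.DS L
```python
import re
from datetime import datetime

# 8, 10 or 12 digits, optionally followed by ".SS".
SHAPE = re.compile("^([0-9]{8}|[0-9]{10}|[0-9]{12})(?:[.]([0-9]{2}))?$")


def parse_date_time(text, now):
    """Parse [[CC]YY]MMDDhhmm[.SS]; raise ValueError on bad syntax or fields."""
    match = SHAPE.match(text)
    if match is None:
        raise ValueError("bad date_time syntax: " + text)
    digits, seconds = match.group(1), int(match.group(2) or 0)
    head, tail = digits[:-8], digits[-8:]
    month, day, hour, minute = (int(tail[i:i + 2]) for i in range(0, 8, 2))
    if len(head) == 4:
        year = int(head)                            # CCYY given
    elif len(head) == 2:
        year = now.year // 100 * 100 + int(head)    # assumed: current century
    else:
        year = now.year                             # assumed: current year
    # datetime() rejects month 13, day 32, hour 24, minute 60 and second 60.
    return datetime(year, month, day, hour, minute, seconds)
```
.DE
.LP
A value in the past can then be detected by comparing the result with the
time supplied as the second argument.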
.NH 4
Qdel
.IP \(bu
Create input cases that check bad operands such as non-numeric job identifiers
and incorrectly constructed jids.
.IP \(bu
Give
.B qdel
a very big sequence number as the job identifier.
.IP \(bu
Test
.B qdel
without any arguments.
.IP \(bu
Test bad
.B delay
arguments to -W.
.NH 4
Qhold
.IP \(bu
Create input cases for
.B qhold
that check bad operands such as non-numeric job identifiers,
and incorrectly constructed jids.
.IP \(bu
Test incorrectly specified -h
.B hold_list
arguments with values specified that are not in one or more of (u, o, s),
or the values contain n combined with another argument letter.
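.LP
The rule that a list holds either one or more letters from a fixed set or
the single letter n recurs for hold_list, keep and mail_options.
The Python sketch below captures that rule in one hypothetical helper,
valid_letter_list; whether a repeated letter such as "uu" is legal is not
stated in this plan, and the sketch assumes it is.
.DS L
```python
def valid_letter_list(value, letters, none_letter="n"):
    """True if value is the lone none_letter or one or more of letters."""
    if value == none_letter:
        return True
    # An empty string is rejected; repeated letters are assumed to be legal.
    return bool(value) and all(ch in letters for ch in value)
```
.DE
.LP
For a hold_list, valid_letter_list("uo", "uos") is true and
valid_letter_list("un", "uos") is false; the same helper applies to keep
with the letters "oe" and to mail_options with the letters "abe".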
.NH 4
Qmove
.IP \(bu
Create input cases for
.B qmove
that check bad operands such as non-numeric job identifiers,
incorrectly constructed jids, incompletely constructed destination identifiers,
and unknown queue names. 
.IP \(bu
Give the command a very big sequence number as the job identifier.
.IP \(bu
Test the command when the destination queue is disabled, and when the job
being moved is currently running.
.IP \(bu
Test the command without any arguments.
.NH 4
Qmsg
.IP \(bu
Create input cases for
.B qmsg
that check bad operands such as non-numeric job identifiers and
incorrectly constructed jids.
.IP \(bu
Give the command a very big sequence number as the job identifier.
.IP \(bu
Test the command without any arguments.
.IP \(bu
Test how
.B qmsg
behaves when it is given an incomplete argument list.
.NH 4
Qrerun
.IP \(bu
Create input cases for
.B qrerun
that check bad operands such as non-numeric job identifiers and
incorrectly constructed jids.
.IP \(bu
Give the command a very big sequence number as the job identifier.
.IP \(bu
Test the command without any arguments.
.IP \(bu
Execute qrerun on a non-rerunnable, queued, waiting, or held job.
.NH 4
Qrls
.IP \(bu
Create input cases for
.B qrls
that check bad operands such as non-numeric job identifiers,
and incorrectly constructed jids.
.IP \(bu
Test incorrectly specified -h
.B hold_list
arguments with values specified that are not one or more of (u, o, s),
or the values contain n combined with another argument letter.
.NH 4
Qselect
.IP \(bu
Test -A option without an option argument, or an argument that is the NULL
("") string. 
.IP \(bu
Test -N option without any option argument.
.IP \(bu
Test bad
.B date_time
values to the -a option with a bad month (MM), day (DD), hour
(hh), minute (mm), and seconds (ss). Also, give a
.B date_time
argument that is in the PAST, an argument that doesn't follow the correct
syntax, or an argument that contains an unknown selection operator (op) value.
.IP \(bu
Test -c option by giving it arguments that are not in one of
(n, s, c, c=minutes).
Also, give illegal operator values of .ge., .gt., .lt., .le. when specifying a
.I u
.B interval
argument. Check behavior of
.B qselect
when the -c option is not given any option argument or it is given an illegal
operator (op) value.
.IP \(bu
Test the -h option by giving it bad
.B hold_list
arguments that are not in one or more of (u, o, s), or n combined
with another argument letter. Also test behavior of qselect when -h is
not given any option argument.
.IP \(bu
Test the -l option by giving it bad resource names, bad resource values, an
incomplete
.B resource_list
specification, and an unknown operator (op) value. 
.IP \(bu
Test
.B qselect
when it is given an unknown argument.
.IP \(bu
Test the -p option by giving it values that are non-numeric, or specifying
arguments that contain 
an illegal operator (op) value. Also, test this option by not giving it any
option argument.
.IP \(bu
Test the -q option by giving it an unknown queue, unknown server value, or
give it an incorrectly constructed
.I destination
identifier. Also, test this option by not giving it any option argument.
.IP \(bu
Test the -r option without any option argument, or  give it a value not in
(y, n).
.NH 4
Qsig
.IP \(bu
Test
.B qsig
by giving it incorrectly constructed job identifiers and identifiers 
that are "too big". 
.IP \(bu
Test the -s option by giving it an unknown signal name or number.
.IP \(bu
Test incorrectly constructed command line of
.B qsig .
.NH 4
Qstat
.IP \(bu
Test
.B qstat
by giving it incorrectly constructed job identifiers and identifiers that are
"too big".
.IP \(bu
Test the -Q option by giving it an unknown queue or unknown server argument, or
specifying an incorrectly constructed
.I destination
argument. Also, incorrectly give a job identifier argument or a server name
argument to -Q and/or -f option(s).
.IP \(bu
Give a job identifier argument to the -B option. 
Give a
.I destination
argument to the -B and/or -f option(s).
.NH 4
Qsub
.IP \(bu
Test the -A option without any option argument.
.IP \(bu
Test the -C option without any option argument.
.IP \(bu
Test the -M option without any option argument, or give it an incomplete
.B user_list
specification.
.IP \(bu
Test the -N option by giving it a
.B name
value whose length is more than 15 chars or less than 1 char, or specifying
a string value that contains a non-alphabetic 1st character. Also check the
behavior of
.B qsub
when this option is not given any argument.
.IP \(bu
Test the -S option by giving it a
.B path_list
that contains more than one shell path for the same host or more than one shell
path without a host. Also check the behavior of
.B qsub
when this option is given
an incorrectly constructed argument or when it is not given any argument.
.IP \(bu
Test the
.B "-W depend"
option by giving it bad arguments to
.B synccount
and
.B on .
Also, test malformed
.I depend=dependency_list
lines.
.IP \(bu
Test the 
.B "-W group_list"
option by specifying option arguments containing 2 groups for one host, or 2
groups without a host. Check malformed
.I group_list=g_list
lines.
.IP \(bu
Test the
.B "-W stagein/stageout"
option by specifying a bad host or a bad file (e.g. a
directory) as the stagein or stageout files. Also, test incorrectly
constructed 
.I "stagein=file_list, stageout=file_list"
lines.
.IP \(bu
Test the -a option by giving it
.B date_time
values that have a bad month (MM), day (DD), hour (hh), minute (mm), and
seconds (ss). Test this option if its argument is incorrectly specified (does
not follow the correct syntax), or the option argument is not given at all, or
the
.B date_time
argument is some time in the PAST.
.IP \(bu
Test the -c option by giving it
values that are not in (n, s, c, c=minutes).  Also test it without any option
argument.
.IP \(bu
Test the -e option by giving a
.B path
argument that has a bad hostname portion.
.IP \(bu
Test the
.B "-W interactive=false"
and
.B "-W interactive="
arguments to
.B qsub .
They both should return some error message.
.IP \(bu
Test the -j option by giving it values that are not in the set (oe, eo, n).
.IP \(bu
Test the -k option by giving it argument values that are not one or more of
(o, e), or specifying values that contain an 'n' combined with another argument
letter. 
.IP \(bu
Test the -l option by giving it arguments such as bad resource names, bad
resource values, and an incomplete
.B resource_list
specification.
.IP \(bu
Test the -m option by giving it argument values that are not one or more of
(a, b, e), or specifying values that contain an 'n' combined with another
argument letter.
.IP \(bu
Test the -o option by giving a
.B path
argument that has a bad hostname portion.
.IP \(bu
Test the -p option by giving it
.B priority
values that are non-numeric, or not in the range [-1024, +1023].
.IP \(bu
Test the -q option by giving it
.B destination
values that are incorrectly formed, or the option argument contains an unknown
queue or server name. 
.IP \(bu
Test the -r option by giving it values that are not in (y, n).
.IP \(bu
Test script processing by giving
.B qsub
a bad script file argument, such as a non-existent file or a directory.
.IP \(bu
Test the -u option by giving
.B user_list
arguments that are incorrectly constructed, or that are correctly constructed
but contain more than one user name for one host, or more than one user name
without a host.
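.LP
Several of the negative cases above can be generated from small validity
checks that serve as test oracles. The following is a minimal sketch, not part
of PBS. It encodes only the rules stated in the bullets for -N, -p, -k, and -m;
the function names and sample values are hypothetical, and any further rules in
the ERS are not covered.
.DS L
```python
import string

NAME_MAX = 15  # upper bound on job name length, from the -N bullet above


def valid_job_name(name):
    """True if name is 1..15 characters long and starts with a letter."""
    if not 1 <= len(name) <= NAME_MAX:
        return False
    return name[0] in string.ascii_letters


def valid_priority(text):
    """True if text is an integer in the range [-1024, +1023]."""
    try:
        value = int(text)
    except ValueError:
        return False
    return -1024 <= value <= 1023


def valid_letter_set(text, letters):
    """True if text is the single letter n, or one or more of letters.

    Intended for -k (letters "oe") and -m (letters "abe"); n may not be
    combined with any other letter.
    """
    if text == "n":
        return True
    return len(text) > 0 and all(ch in letters for ch in text)


# Hypothetical negative inputs; each one should be rejected by qsub.
BAD_NAMES = ["", "a" * 16, "1job", "_job"]
BAD_PRIORITIES = ["high", "-1025", "1024", ""]
BAD_MAIL_OPTS = ["x", "an", "nb", ""]
```
.DE
.LP
Each value in the three lists is rejected by the matching check, so a test
driver could pass it to qsub and expect an error.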
.bp
.NH 2
UNIT TESTING THE OPERATOR COMMANDS
.LP
The operator utilities tested are qdisable, qenable, qrun, qstart,
qstop, and qterm. 
.LP
.NH 3
Functionality Tests
.NH 4
Qdisable
.IP \(bu
Write a test case that checks to make sure that
.B destination 
identifiers of the form
.I "queue, @server, or queue@server"
are successfully passed on to the server for processing.
.IP \(bu
Test 
.B qdisable
to make sure that it is returning the correct exit code.
.NH 4
Qenable
.IP \(bu
Write a test case that checks to make sure that
.B destination 
identifiers of the form
.I "queue, @server, or queue@server"
are successfully passed on to the server for processing.
.IP \(bu
Test 
.B qenable
to make sure that it is returning the correct exit code.
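.LP
The three destination forms accepted by qdisable and qenable can be checked
with a small parser before the test compares them with what the server
received. This is a sketch under the assumption that a destination contains at
most one @ and that an empty queue or server part means "not given"; the
function name is hypothetical.
.DS L
```python
def parse_destination(text):
    """Split a destination into (queue, server); either may be None.

    Accepts the three forms named above: queue, @server, queue@server.
    Raises ValueError for anything else.
    """
    queue, at, server = text.partition("@")
    if "@" in server:
        raise ValueError("more than one @ in destination: " + repr(text))
    if at and not server:
        raise ValueError("empty server name in destination: " + repr(text))
    if not queue and not server:
        raise ValueError("empty destination")
    return (queue or None, server or None)
```
.DE
.LP
The same helper can produce expected values for the qstart and qstop tests,
which use the same destination forms.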
.NH 4
Qrun
.IP \(bu
Create an input case that checks to make sure that job
identifiers of the form
.RS
.I  sequence_number[.server_name][@server]
.RE
are correctly passed on to the server for processing.
.IP \(bu
Also, check that the -H host argument is getting picked up by the server. 
.IP \(bu
Test 
.B qrun
to make sure that it is returning the correct exit code.
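.LP
The job identifier form given above can be split into its parts in the same
way. The sketch below assumes that the sequence number consists of decimal
digits only and that neither server part contains an @; the pattern and
function name are illustrative, not taken from the PBS sources.
.DS L
```python
import re

# sequence_number[.server_name][@server]
JOB_ID = re.compile("([0-9]+)(?:[.]([^@]+))?(?:@([^@]+))?")


def parse_job_id(text):
    """Return (sequence_number, server_name, server) for a job identifier.

    Optional parts that are absent are returned as None.
    Raises ValueError if text does not have the expected form.
    """
    match = JOB_ID.fullmatch(text)
    if match is None:
        raise ValueError("badly formed job identifier: " + repr(text))
    sequence, server_name, server = match.groups()
    return int(sequence), server_name, server
```
.DE
.LP
The sketch places no upper bound on the sequence number, so the "too big"
case in the reliability tests needs a separate check against the limit given
in the ERS.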
.NH 4
Qstart
.IP \(bu
Write a test case that checks to make sure that destination
identifiers of the form
.I "queue, @server, or queue@server"
are correctly passed on to the server for processing.
.IP \(bu
Test 
.B qstart
to make sure that it is returning the correct exit code.
.NH 4
Qstop
.IP \(bu
Write a test case that checks to make sure that destination
identifiers of the form
.I "queue, @server, or queue@server"
are correctly passed on to the server for processing.
.IP \(bu
Test 
.B qstop
to make sure that it is returning the correct exit code.
.NH 4
Qterm
.IP \(bu
Create input cases that check to make sure that arguments such as the name of
the server and the -t type of shutdown (immediate, delay) are seen by
the server.
.IP \(bu
Test 
.B qterm
to make sure that it is returning the correct exit code.
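.LP
The exit code checks listed for each operator command share one shape: run the
command, then compare its exit status with the expected value. A minimal
sketch follows. The helper name is hypothetical, and the expected value for
each command must come from the ERS; none is assumed here.
.DS L
```python
import subprocess


def exit_code_matches(argv, expected, timeout=60):
    """Run argv and report whether it exits with the expected code.

    Output is discarded; only the exit status is examined.
    """
    done = subprocess.run(argv, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL, timeout=timeout)
    return done.returncode == expected
```
.DE
.LP
A test would call it with an argument list such as the command name followed
by a destination or job identifier, together with the exit code that the ERS
specifies for that case.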
.LP
.NH 3
Reliability Tests
.NH 4
Qdisable
.IP \(bu
Test
.B qdisable 
by giving it incorrectly constructed
.B destination
arguments.
.NH 4
Qenable
.IP \(bu
Test
.B qenable
by giving it incorrectly constructed
.B destination
arguments.
.NH 4
Qrun
.IP \(bu
Test
.B qrun
by giving it a "too big" job identifier number, or a badly formed job
identifier.
.IP \(bu
Test the command when given no option argument.
.IP \(bu
Test the -H option of the command by giving it an unknown host argument, a
NULL argument, or no option argument.
.NH 4
Qstart
.IP \(bu
Test
.B qstart
by giving it malformed
.B destination
identifiers containing unknown queues and server names.
.NH 4
Qstop
.IP \(bu
Test
.B qstop
by giving it malformed
.B destination
identifiers containing unknown queues and server names.
.NH 4
Qterm
.IP \(bu
Test the -t option by giving no option argument, an unknown shutdown type
argument, or a NULL argument.
.bp
.LP
.NH 2
UNIT TESTING THE ADMINISTRATOR COMMANDS
.LP
The administrator command to be tested is
.B qmgr.
.NH 3
Functionality Tests
.NH 4
Qmgr
.IP \(bu
Create a test case to make sure that the correct server is contacted based on
the server name argument(s) to
.B qmgr.
.IP \(bu
Create input cases that ensure:
.RS
.IP -
the -a option will abort
.B qmgr
on any syntax errors or server connection rejection errors.
.IP -
the command specified with the -c option will be communicated correctly to
the server.
.IP -
the -e option will echo all commands to the standard output.
.IP -
the -n option will only do syntax checking of commands.
.IP -
the -z option will not write any errors to standard error.
.RE
.IP \(bu
Test the command to make sure that it returns the correct exit code.
.LP
.NH 3
Reliability Tests
.NH 4
Qmgr
.IP \(bu
Test such cases as an unknown or malformed server name argument.
.IP \(bu
Test the following input cases, each of which contains a syntax error, and see how
.B qmgr
behaves:
.IP
        create queue q1,
.IP
        create queue ,q1
.IP
        create queue ,
.IP
        delete queue q1,
.IP
        delete queue ,q1
.IP
        delete queue ,
.IP
        unset queue q1 attrib oper value
.IP
        u q q1, attrib
.IP
        unset queue ,q1 attrib
.IP
        unset queue , attrib
.IP
        list queue fast,
.IP
        list queue ,fast
.IP
        list queue ,
.IP
        print queue q1,
.IP
        print queue ,q1
.IP
        print queue ,
.IP
        set server
.IP
        set server ,
.IP
        unset server attrib oper value
.IP
        unset server
.IP
        unset server attrib,
.IP
        unset server ,attrib
.IP
        unset server ,
.IP
        list server attrib,
.IP
        list server ,attrib
.IP
        list server ,
.IP
        print server attrib,
.IP
        print server ,attrib
.IP
        print server ,
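.LP
Most of the cases in the list above follow one pattern: a comma misplaced
after the name, before the name, or standing alone. The sketch below generates
those variants instead of listing them by hand. It only builds the command
lines and does not run qmgr; the helper names are hypothetical.
.DS L
```python
def comma_errors(verb, obj, name):
    """Return the three misplaced-comma variants of one qmgr command."""
    prefix = verb + " " + obj + " "
    return [prefix + name + ",", prefix + "," + name, prefix + ","]


def qmgr_argv(command):
    """Build (but do not run) a qmgr invocation for one command string."""
    return ["qmgr", "-c", command]


CASES = []
for verb in ("create", "delete", "print"):
    CASES += comma_errors(verb, "queue", "q1")
CASES += comma_errors("list", "queue", "fast")
for verb in ("unset", "list", "print"):
    CASES += comma_errors(verb, "server", "attrib")
```
.DE
.LP
The remaining cases in the list, such as "set server" with no attribute, do
not fit the pattern and still need to be written out individually.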
.NH 2
UNIT TESTING THE BATCH SERVER
.LP
.NH 3
Functionality Tests
.LP
Create input cases that test the actions provided by the -a, -d, -p, -A, -L,
-M, -S, and -t options.
.LP
.NH 3
Recoverability Tests
.LP
Issue a kill signal to a running PBS server. The server should catch the TERM
signal and shut down cleanly, closing its log file. On a hot restart of the
server, all rerunnable jobs that were running at the time of shutdown are rerun,
all non-rerunnable jobs are aborted, and regular queued jobs are left
untouched. Also, information about queues and the server is preserved. On
a warm restart of the server, all rerunnable jobs that were running at the time
of shutdown are requeued, all non-rerunnable jobs are aborted, and regular
queued jobs are left untouched. Also, information about queues and the server
is preserved. On a cold restart of the server, all jobs are deleted but
information about queues and the server is preserved.
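.LP
The expected outcome for each job after a restart can be written as a small
decision function, which a test can compare with the state that the server
reports. The sketch below restates only the rules in the paragraph above; the
function name and the result strings are illustrative.
.DS L
```python
def expected_disposition(restart, was_running, rerunnable):
    """Return what should happen to one job after a server restart.

    restart is "hot", "warm", or "cold"; was_running tells whether the
    job was running at the time of shutdown.
    """
    if restart == "cold":
        return "deleted"
    if restart not in ("hot", "warm"):
        raise ValueError("unknown restart type: " + repr(restart))
    if not was_running:
        return "untouched"
    if not rerunnable:
        return "aborted"
    return "rerun" if restart == "hot" else "requeued"
```
.DE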
.NH 3
Reliability Tests
.LP
Execute
.B pbs_server
using a non-root account. Also, set the permission of 
the daemon's executable file to set-user-id to root, and then run the daemon
using a non-root account. Both schemes should fail to start the daemon.
.LP
Attempt to run 
.B pbs_server
with another server having the same
PBS_SERVER_HOME running.
.LP
Create input cases that attempt to run the server without giving any
option arguments to -a, -d, -p, -A, -L, -M, -S, and -t options. Also, run
pbs_server with an unknown or garbage argument.
.bp
.NH 2
UNIT TESTING THE TCL JOB SCHEDULER
.LP
In addition to basic TCL commands, the TCL scheduler provides
a library of TCL/PBS commands used to directly communicate with the
resource monitor and PBS server in order to schedule runs of PBS jobs.
.NH 3 
Functionality Tests
.LP
Create input cases that test the actions provided by the -d, -p, -L, -S, -b,
-i, -t, and -a options.
.LP
Write TCL scheduler scripts that test the basic PBS/TCL commands such as
openrm, addreq, getreq, flushreq, fullresp, pbsstatserv, pbsstatque, 
pbsselstat, pbsrunjob, pbsdeljob, pbsholdjob, pbsalterjob, pbsconnect, 
pbsdisconnect, pbsstatjob, downrm, configrm, and closerm. In addition, test
the following TCL/PBS commands for manipulating time entities: datetime and
strftime.
.LP
.NH 3 
Recoverability Tests
.LP
Issue a kill -9 to a running
.B pbs_sched_tcl.
No other daemon should exit, and
upon restart of the scheduler, no valid PBS job should disappear.
.LP
Issue a kill -HUP to a running
.B pbs_sched_tcl.
The daemon should simply reread its body script file and continue execution.
.NH 3 
Reliability Tests
.LP
Execute
.B pbs_sched_tcl
using a non-root account. Also, set the permission of 
the daemon's executable file to set-user-id to root, and then run the daemon
using a non-root account. Both schemes should fail to start the daemon.
.LP
Attempt to run
.B pbs_sched_tcl
but with another scheduler (having the same PBS_SERVER_HOME) running. 
.LP
Create input cases that attempt to run
.B pbs_sched_tcl
without giving any option argument to -d, -p, -L, -S, -b, -i, -t, -r, and -a.
Also, run the TCL scheduler with an incorrect -a
.B alarm
value such as 0, -1, 0.1. Also, give the daemon an unknown or garbage argument. 
.bp
.NH 2
UNIT TESTING THE BASL JOB SCHEDULER
.NH 3 
Functionality Tests
.LP
Create input cases that test the actions provided by the -d, -p, -L, and -r
options.
.LP
Write a BASL scheduler script that will exercise the scheduler language syntax
and semantics.
.LP
.NH 3 
Recoverability Tests
.LP
Issue a kill -9 to a running
.B pbs_sched_basl.
No other daemon should exit, and
upon restart of the scheduler, no valid PBS job should disappear.
.LP
.NH 3 
Reliability Tests
.LP
Execute
.B pbs_sched_basl
using a non-root account. Also, set the permission of 
the daemon's executable file to set-user-id to root, and then run the daemon
using a non-root account. Both schemes should fail to start the daemon.
.LP
Attempt to run
.B pbs_sched_basl
but with another scheduler (having the same PBS_SERVER_HOME) running. 
.LP
Create input cases that attempt to run
.B pbs_sched_basl
without giving any option argument to -d, -L, and -S options. Also, run
the scheduler with an unknown or garbage argument.
.bp
.NH 2
UNIT TESTING THE RESOURCE MONITOR
.LP
Tests for PBS Resource Monitor.
.NH 3
Functionality Tests
.LP 
.NH 4
Longevity
.LP
Use a shell script to loop over a program that opens
a connection, sends a few requests, and then exits.  Do this continuously
for several hours from several hosts.  File descriptors and memory
should not leak.
.NH 4
Client death
.LP
Create a program which will send part of a packet then exit.  Run this
once to verify that resmom will continue operation.  Then run it in a
loop for several hours to make sure no resources get used up by an
outstanding request that never completes.
.NH 4
Accuracy
.LP
Use whatever means are at hand to create a process with a given memory size.
Send a request to get the memory size of this process.  The reported size
should match.
Do the same with CPU time.
.NH 4
Command Interface
.LP
Create a test program that checks the command interface of resource monitor
to make sure that -d, -p, -L, -c, and -a options work as intended.
.NH 3
Recoverability Tests
.NH 4
Signal recovery
.LP
A hup signal is used to cause the resource monitor to re-read its
config file.  This should not cause a memory leak or any outstanding
partial command to be forgotten.
.LP
Send a partial packet then a hup signal.  Send the rest of the
packet.  You should get a good response.
.LP
Start a loop which sends a hup signal once a second and run it for
several hours.  Run "Longevity" again but send hup signals to resmom
as the script(s) run.  None of the scripts should report errors.
.LP
Issue a kill command to a running
.B pbs_mom.
The resource monitor should catch the TERM signal, close its log file, and exit
gracefully without affecting any of the other PBS daemons.
.LP
Issue a kill -9 to
.B pbs_mom.
The 
death of the resource monitor should not affect currently running PBS
daemons, and running PBS jobs should continue to execute.
.LP
.NH 3
Reliability Tests
.LP
.NH 4
Network fragmentation
.LP
It is possible that messages to and from the resource monitor
will be fragmented by the network such that a read will not
return a complete command.  The resource monitor must be able
to receive a partial command, recognize it as such and save
it until the rest of the command arrives.
.LP
Set up several connections and send messages from each that are
broken into pieces.  They should be kept track of without losing
data.
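.LP
The fragmentation test needs two pieces: a way to break a message into
arbitrary fragments, and a model of the buffering that the resource monitor is
expected to perform. The sketch below shows both. The command terminator is a
hypothetical stand-in, since the actual framing of resource monitor requests
is not described here, and the class is a model for the test, not resmom code.
.DS L
```python
DELIM = b";"  # hypothetical command terminator, for illustration only


def fragment(message, size):
    """Split message into consecutive pieces of at most size bytes."""
    if size < 1:
        raise ValueError("size must be positive")
    return [message[i:i + size] for i in range(0, len(message), size)]


class Reassembler:
    """Buffer partial input until a complete command has arrived."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        """Add data and return the list of commands it completed."""
        self._pending += data
        # Everything before the last terminator is complete; the rest waits.
        *complete, self._pending = self._pending.split(DELIM)
        return complete
```
.DE
.LP
Using one Reassembler per connection and feeding the fragments of several
connections in interleaved order models the multi-connection case described
above.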
.NH 4
Command Interface
.LP
Write input cases that attempt to run
.B pbs_mom
using incorrectly
constructed command lines. Examples are specifying no option arguments to
-d, -p, -L, -c, and -a, and giving bad values such as -1, 0, and 0.1 to the -a  
.B alarm
option.
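.LP
The incorrectly constructed command lines described above can be enumerated
mechanically. The sketch below only builds argument lists and does not start
the daemon; the option letters and alarm values are the ones named in the
paragraph, and the helper name is hypothetical.
.DS L
```python
OPTIONS_TAKING_ARGS = ["-d", "-p", "-L", "-c", "-a"]
BAD_ALARMS = ["-1", "0", "0.1"]


def bad_command_lines(program="pbs_mom"):
    """Return argument lists that the daemon is expected to reject."""
    # Each option given last on the line, with its argument missing.
    cases = [[program, opt] for opt in OPTIONS_TAKING_ARGS]
    # The -a option with each of the bad alarm values.
    cases += [[program, "-a", value] for value in BAD_ALARMS]
    return cases
```
.DE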
.NH 4
Security
.LP
Test the security features of the resource monitor by attempting to run the daemon
using a non-root account, or by setting the permission on the daemon's
executable file to set-user-id to root and running the daemon as a regular user.
Both attempts should fail to start the daemon.
.LP
Attempt to run
.B pbs_mom
but with another resource monitor having the
same PBS_SERVER_HOME already running in the system. This should fail to start
the daemon.
.NH 4
Server behavior
.LP
The resource monitor will be a long-lived process and must have
no memory leaks.  Usually, the address space provided for a process
is large enough that small memory leaks can occur without causing
problems for a normal, short-lived process.  The resource monitor
has more stringent requirements.
.NH 4
Memory limit
.LP
Use the shell
.B limit
command to set a small datasize.  First, connect
once and issue several commands to see if it will crash or fail.
If it will run but not be able to get enough memory to do anything
but report errors, try again after increasing the datasize limit.
When you have a limit that is just above a threshold where it will
function with no errors for one connection, set up several connections.
Try at least five.  It should not crash or core dump.  At worst,
some queries should fail with "system" errors, but some should work.
Try sending very big queries broken into pieces.
.bp
.NH 2
UNIT TESTING THE MACHINE ORIENTED MINISERVER
.LP
.NH 3
Functionality Tests
.LP
Create scripts that will check to make sure that the option arguments
-d, -L, -p, and -r work as intended.
.NH 3
Recoverability Tests
.LP
Issue a kill to a running
.B pbs_mom.
The daemon should catch the TERM signal and shut down gracefully.
.LP
Issue a kill -HUP to a running
.B pbs_mom.
This should cause the daemon to simply
re-read its configuration file, and it should not cause any PBS jobs to abort or
daemons to die.
.LP
Issue a kill -9 to a running
.B pbs_mom.
On startup, all jobs should be reasonably recoverable without leaving stray
processes.
.NH 3
Reliability Tests
.LP
Create input cases that would try to run
.B pbs_mom
using a non-root account,
or by setting the daemon's executable file to set-user-id to root and then
running the daemon using a non-root account. Both schemes should fail to start
the daemon.
.LP
Attempt to run
.B pbs_mom
when there's another MOM running using the same
PBS_SERVER_HOME. This should also fail to start the daemon.
.LP
Create input cases that attempt to start
.B pbs_mom
without providing any
option argument to -d, -L, and -p. Also, see what happens when mom is
given an unknown or garbage argument.
.LP
.NH 1
SYSTEM TESTING
.LP
System testing is the thorough exercise of the entirety of PBS prior
to its release by the development team.
Its purpose is to catch errors missed by unit testing and to detect
errors which are the result of malignant interactions among major
parts of PBS.
Functionality, recoverability and reliability tests, first used in unit
testing, must be repeated after integration. Any new functionality,
recoverability and reliability tests that require multiple parts of
PBS must be performed, and the entire system must receive operational and
random exercise testing.
.LP
When PBS passes system testing, it becomes a candidate for release by
the development team.
It passes system testing when all tests run and there are no known
bugs.
.LP
System testing, as well as the unit testing discussed in the previous
pages, has been automated using a GNU test package called DejaGNU. DejaGNU
provides the test driver script and a supporting library of TCL commands for
(1) running test modules written in TCL/Expect, and (2) reporting their PASS or
FAIL results. Currently, 1790 test cases have been written; they are also used
as a regression tester for finding bugs introduced by bug-fix or enhancement
code changes.
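.LP
For regression use, the PASS and FAIL results of two runs are typically
compared. A minimal sketch of tallying result lines is shown below. It assumes
result lines of the form "PASS: name" and "FAIL: name"; the test names in the
sample are invented.
.DS L
```python
from collections import Counter


def tally(lines):
    """Count result lines such as PASS: name or FAIL: name."""
    counts = Counter()
    for line in lines:
        status, sep, _name = line.partition(": ")
        # Only an upper-case word followed by a colon counts as a result.
        if sep and status.isupper():
            counts[status] += 1
    return counts


SAMPLE = """
PASS: qsub accepts a 15 character job name
FAIL: qsub rejects priority 1024
PASS: qdisable passes queue@server to the server
""".splitlines()
```
.DE
.LP
Comparing the tallies, or the sets of failing test names, before and after a
code change shows whether the change introduced a regression.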
.LP
The following is a discussion of what system test cases should be applied to
PBS. Many of them have already been written and integrated into the
automated regression tester. 
.NH 2
Functionality Tests
.NH 3
Batch Server and the PBS commands
.LP
By using
.B qmgr
and a running
.B pbs_server,
test the server's
.I "Manage Request"
function by sending the following actions: create and delete queues, set and
unset queue attributes, list and print queue information, set and unset server
attributes, and list and print server information. Also, test the permission
aspect when making
.B qmgr
requests, by sending inputs that require a specific level of user/client
privilege. For example, execute "create queue" and "delete queue" using an
account designated as the PBS operator. Execute "create queue", "delete queue",
"set queue/server", and "unset queue/server" using a non-privileged
(non-administrator, non-operator) account.
.LP
Test assigning meaningful values to different queue and server attributes
and make sure that the server correctly processes them. 
For example, boolean values should accept
t, T, f, F, and so on as specified in the ERS.
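.LP
A table of accepted and rejected spellings makes the boolean attribute test
easy to drive. In the sketch below only t, T, f, and F come from the text
above; the other spellings are assumptions and must be checked against the
ERS before use.
.DS L
```python
# Only t, T, f, F are named in this plan; the rest are assumed spellings.
TRUE_WORDS = {"t", "T", "true", "True", "TRUE", "y", "Y", "1"}
FALSE_WORDS = {"f", "F", "false", "False", "FALSE", "n", "N", "0"}


def parse_boolean(text):
    """Return the boolean that text stands for, or raise ValueError."""
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    raise ValueError("not a boolean value: " + repr(text))
```
.DE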
.LP
By using
.B "qstat -B"
or
.B qmgr
and a running
.B pbs_server,
test the server's
.I "Server Status Request"
function. Test
.B qmgr 's
.B acl_hosts
and
.B acl_host_enable
attribute options for blocking certain users or users at certain hosts from
seeing server status. Also, check to make sure that the
.B query_other_jobs
server attribute is getting enforced by PBS.
.LP
Test the 
.I "Start Up Request"
function of the server by running
.B pbs_server
and testing the validity
of different startup modes (i.e. warm, cold, hot).
.LP
Use
.B qterm
and a running
.B pbs_server
to issue a 
.I "Shut Down Request"
to the server. Check the permission restrictions of running qterm; that is, it
must be runnable only by root. Also, test the different types of shutdown
(via -t option) that
.B qterm
can instruct the server to perform.
.LP 
By using
.B qstat,
and a running
.B pbs_server,
display and validate the status of specified queue(s).
Test the
.B acl_users,
.B acl_user_enable,
.B acl_hosts,
and
.B acl_host_enable
queue attributes to see if certain users or users from certain hosts can be
blocked from getting queue status information.
.LP
By using
.B qdel,
.B qsig
and a running
.B pbs_server,
test the
.I "Abort Request"
function. See what happens if an abort request is sent by a non-owner
or non-privileged account.
.LP
By giving various inputs to
.B qsub
and
.B qstat,
the
.I "Commit Request"
function of the server can be thoroughly tested.
.LP
By giving different inputs to
.B qdel,
test the
.I "Delete Job Request"
function of the server.
.LP
By using
.B qsub,
.B qalter,
and
.B qhold,
test the
.I "Hold Job Request" 
function of the server. Verify that only an administrator can set
all types of holds, and an operator can only set other and user-type holds.
Also, see what happens when the job being held is already in a held state.
.LP
By specifying various inputs to
.B qsub,
the 
.I "Queue Job Request" 
function of the server is thoroughly tested. Verify the 
notion of default queues and default servers. Test
to make sure queued jobs are given the correct state. For example, if a valid
.B Execution_Time
attribute is set for a job, then the job should initially be in the waiting
state. Check also to make sure that the appropriate PBS_O* environment
variables are appended to a job's
.B Variable_List
attribute when it gets queued.
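.LP
The check of the initial job state can be expressed as a comparison between
the requested execution time and the time of submission. The sketch below
covers only the waiting case named above. It assumes that a job with no
future execution time starts out queued and it ignores holds; the state
letters W and Q are used as labels.
.DS L
```python
def initial_state(execution_time, now):
    """Return W (waiting) if execution_time is in the future, else Q.

    Both arguments are times in seconds; execution_time may be None
    when the job was submitted without one.
    """
    if execution_time is not None and execution_time > now:
        return "W"
    return "Q"
```
.DE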
.LP
By submitting various jobs that take input from a script file, the
.I "Job Script Request"
server function is thoroughly tested. 
.LP
By running various PBS commands with job identifiers as operands, the
.I "Locate Job Request"
server function is thoroughly exercised.
.LP
By running various input cases of
.B qmsg,
the
.I "Message Job Request"
function is tested.
.LP
Thoroughly test the
.I "Modify Job Request"
function by giving 
.B qalter
an exhaustive input list. Modify job attributes that are alterable or not
alterable.
.LP
By using
.B qmove
to move a job from one queue to another queue located on the current host or on
some remote host, the
.I "Move Job Request"
function of the server is tested.
.LP
By testing the basic functions of the
.B qrls
command,
the
.I "Release Job Request"
function of the server is exercised.
.LP
By specifying various valid inputs to
.B qrerun,
the
.I "Rerun Job Request"
function of the server is tested.
.LP
By specifying various valid inputs to
.B qrun,
the
.I "Run Job Request"
function of the server is tested.
.LP
By specifying an exhaustive list of valid inputs to
.B qselect,
the
.I "Select Jobs Request"
function of the PBS server is thoroughly tested.
.LP
By issuing various signals to running jobs via
.B qsig,
the
.I "Signal Job Request" 
function of the PBS server is thoroughly tested.
.LP
By using various combinations of inputs to
.B "qsub -W depend"
or
.B "qalter -W depend,"
and verifying the resulting actions, the PBS server's job synchronization and
job dependency features are tested.
.LP
By using various combinations of inputs to
.B "qsub -W stagein/stageout"
or
.B "qalter -W stagein/stageout,"
and verifying the resulting actions, the PBS server's
file staging capability is tested.
.LP
By setting the
.B scheduling
server attribute to
.I True,
the communication between the server and the scheduler can be verified.
.LP
The
.B "Job Initiation"
service provided by the server is exercised by verifying actions of
.B "qsub -S"
or
.B "qalter -S"
for setting the shell on which to run a job,
.B "qsub -u"
or
.B "qalter -u"
for setting the user name to run a job under, any
.B "qsub -v"
or 
.B "qsub -V"
inputs that export certain environment variables before job start, 
.B "qsub -j"
or
.B "qalter -j"
for specifying whether the standard output and standard error streams
of a job should be merged together, and
.B "qsub -m",
.B "qsub -M",
.B "qalter -m",
.B "qalter -M"
for setting conditions for sending email and the recipients of the messages
during or after a job run.
Also, the
.B qrerun
command can be used to test the server's capability to append the standard
output and standard error streams of a job to any previous streams.
.LP
By creating routing queues and testing how jobs are routed to destination 
queues using
.B qsub
or
.B qalter,
the
.B "Job Routing"
function of the PBS server is tested.
.LP
By verifying the actions of
.B "qsub -o",
.B "qsub -e",
.B "qalter -o",
.B "qalter -e",
.B "qsub -k",
.B "qalter -k",
and
.B "qalter -W stageout",
the "Job Exit" server function of delivering the necessary output files to
users is thoroughly exercised.
.LP
Test to make sure that timed events such as waiting jobs are released at their
designated
.B  Execution_Time.
.LP
.NH 3
Machine Oriented Mini-Server, Batch Server, and PBS commands
.LP
Execute a job using
.B qrun,
and check if it causes the server to call a
.B pbs_mom
at some default host or specified host, and see if that MOM
executes the job and correctly returns any standard output and error
files of the job.
.NH 3
Resource Monitor, TCL/Rule Scheduler, Batch Server
.LP
Create test scheduler scripts, either in TCL or BASL (rule), that perform some
of the basic scheduler functions such as getting information from the server
about current jobs, queues, and the server state, querying the resource monitor
to obtain system statistics such as available memory, idle time, and load average,
and using some local scheduler mechanism to make decisions as to what jobs to
run and how and where to run them.
.LP
Test the TCL scheduler feature of being able to bring down the resource monitor
using
.B downrm.
Also, test sending a different configuration file containing static resources
to the resource monitor by issuing a
.B configrm
TCL command. Check to make sure that queries for the values of static resources
return accurate information.
.LP
Test opening connections to more than one resource monitor residing on
different remote hosts.
.LP
.NH 2
Recoverability Tests
.LP
Test sending HUP, TERM, and KILL signals to a running server, resource monitor,
scheduler, or MOM and verify that the remaining PBS daemons do not
misbehave. Also, for the PBS server and MOM, experiment
with various options in starting up the daemons and verify that jobs, queues,
and server information are "reasonably" recovered and kept in a consistent
state. For example, on MOM startup, all rerunnable PBS jobs
that were running at time of MOM shutdown are requeued, all non-rerunnable jobs
are aborted, regular queued jobs are left untouched, and previous job
processes are cleaned up.
.LP
.NH 2
Reliability Tests
.LP
Test the server's behavior when a request via
.B qmgr
is made to alter the values of read-only server and queue attributes.
.LP
Test the server's response when a
.B qstat
request of server status is made from a non-authorized user or host.
.LP
Test the server's behavior when a
.B qstat
request of queue status is made from a non-authorized user, or the specified
queue is non-existent.
.LP
Test the server's behavior when a 
.B qdel
request is made to a job not owned by the server, or the request is coming from
a user/client who is not authorized to delete the designated job.
.LP
Test the server's behavior when a 
.B qhold
request is made to a job not managed by the server, or the request is coming
from a user/client who is not authorized to add any of the specified hold 
types. For example, check what happens when a non-privileged user attempts to
place an o(ther) or s(ystem) hold on a job, or when an operator attempts
to place an s(ystem) hold. Both tests should fail to place the designated hold
type on the job.
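.LP
The hold permission rules in this plan can be collected into one table that
gives the expected result for every combination of role and hold type. The
sketch below restates those rules. That a non-privileged user may place a
u(ser) hold is an assumption drawn from the examples above, and the role
names are labels chosen for illustration.
.DS L
```python
# Hold types: u(ser), o(ther), s(ystem).
ALLOWED_HOLDS = {
    "administrator": {"u", "o", "s"},
    "operator": {"u", "o"},
    "user": {"u"},  # assumption: an ordinary user may set only a user hold
}


def may_set_holds(role, hold_types):
    """True if role is allowed to set every hold type in hold_types."""
    return set(hold_types) <= ALLOWED_HOLDS[role]
```
.DE
.LP
A test driver can loop over all roles and hold types, issue the hold request
under the matching account, and compare the outcome with this function.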
.LP
Test the server's actions by sending a
.B qmsg
request to a job that is not running, or the request is made by a user/client
who is not authorized to post a message to the designated job, or the request
is to a job not owned by the server.
.LP
Test the server's actions by sending various
.B qalter
requests to a job that is not owned by the server, or the request is made by
a user/client who is not authorized to make modifications to the job's
attributes, or the request includes a resource change that would exceed the
limits of the queue or server, or the request contains a resource name that
is not recognized by the server. Also, see what happens when attempting to
modify a running job's non-alterable (because of job state) attribute. Sample
attributes that can only be modified when the job is not running are
.B Account_String,
.B Keep_Files,
.B User_List.
.LP
Test the server's behavior by sending a
.B qmove
request from a user/client who is not authorized to remove a designated job
from the original queue, or the request is from a user/client who is not
authorized to submit jobs to the new queue, or the request specifies a job that
is not 
.B queued,
.B held,
or
.B waiting,
or the request specifies a new queue that is disabled or inaccessible.
.LP
Test sending various illegal
.B qsub
requests to the server, such as submitting jobs from a user/client who is not
authorized to create jobs in the target queue, attempting to submit jobs to a
non-existent or disabled target queue, setting job resource limits
that would exceed the limits set upon a target queue, and setting an unknown
resource for a job.
.LP
Test the server's actions by sending
.B qhold
requests from a user/client who is not authorized to add/remove any of the
specified hold types, or the batch server does not manage the job being held.
.LP
Run 
.B qrerun
from a user/client who is not authorized to rerun a designated job, or run the
command on a job whose
.B Rerunable
attribute is set to
.I False
or whose state is not running.
.LP
Run 
.B qrun
from a user/client who is not authorized to run a designated job, or run the
command with the non-running job state.
.LP
Test the server's action when given a
.B qselect
request in which the destination queue (-q option) is invalid.
.LP
Test the server's behavior when a
.B qsig
command is issued to a job that is not running, the server does
not own the job, the requested signal is not supported by the host operating
system, or the user/client is not authorized to signal the designated job.
.LP
Test what happens when a job status request is made via
.B qstat
executed under a user/client who is not authorized to query for job status,
or the job being queried is not owned by the user. Also, test the server
attribute, 
.B query_other_jobs,
to make sure that it works as intended.
.LP
Specify in
.B "qsub -W"
or 
.B "qalter -W"
no
.B on
dependency for before, beforeok, beforenotok, and beforeany, or specify a non
fully-qualified job identifier, a non-existent job id, or not-owned jobs for
after, afterok, afternotok, afterany, before, beforeok, beforenotok, and
beforeany.  Also, test what happens when a slave job is submitted with an afterok
dependency that points to some parent job, and the parent job finishes
with errors. Do the same for afternotok but this time, the parent job finishes
without any errors.
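.LP
The afterok and afternotok cases above depend on whether the parent job
finished with errors. The sketch below gives the expected answer to "can the
dependent job ever run?" for the three after-type dependencies. It assumes
that "without errors" means an exit status of zero. What the server does with
a job whose dependency can never be satisfied is left to the ERS.
.DS L
```python
def dependency_satisfied(dep_type, parent_exit_status):
    """True if a finished parent job satisfies the given dependency.

    Assumes an exit status of zero means the parent finished without
    errors.
    """
    ok = parent_exit_status == 0
    if dep_type == "afterok":
        return ok
    if dep_type == "afternotok":
        return not ok
    if dep_type == "afterany":
        return True
    raise ValueError("unsupported dependency type: " + repr(dep_type))
```
.DE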
.NH 2
Operational Tests
.LP
The automated testsuite is made up of individual routines, written in TCL/Expect,
that exercise all or part of the job lifecycle: submission,
modification, execution, and eventual deletion of jobs. It also tests the
lifecycle of queues, from their initial creation through modification to their
deletion. Some of the test routines exercise features of PBS daemons that
require the daemons to be started, restarted, and killed (normally and
abnormally).  Validation of the daemons' actions is done by consulting their
log files.
.LP
.NH 2
Random Exercise Tests
.LP
Some of the test routines in the PBS-DejaGNU testsuite attempt to mimic the
behavior of production jobs. For example, the job dependency tests are done
using a set of master and slave executables that wait for each other to update
a system message queue.
.LP
.bp
.NH 1
OPERATIONAL READINESS TESTING
.LP
Operational Readiness Testing is conducted after release of PBS from
the development team.
The goal of operational readiness testing is to demonstrate that PBS
is ready to serve NAS users.
It is conducted by different people using different tests to reduce
the probability of systematic ``blind spots'' in the testing
procedure.
These tests include functionality, recoverability, reliability,
operational and random exercise tests.
Operational readiness testing is performed by members of the NAS Code INC
branch.
PBS passes operational readiness testing when an Operational Readiness
Review has accepted the recommendation of the INC testers that it be
put into production.

.NH 2
Functionality Tests
.LP
.NH 2
Recoverability Tests
.LP
.NH 2
Reliability Tests
.LP
.NH 2
Operational Tests
.LP
.NH 2
Random Exercise Tests
