
  For more complete information on NetPIPE, visit the webpage at:

http://www.scl.ameslab.gov/Projects/NetPIPE/

NetPIPE was originally developed by Quinn Snell, Armin Mikler,
John Gustafson, and Guy Helmer.

It is currently being developed and maintained by Dave Turner with
help from several graduate students (Xuehua Chen, Adam Oline, Bogdan Vasiliu).

The latest release, version 3.2, includes additional modules to test
PVM, TCGMSG, SHMEM, and MPI-2, as well as the GM, GPSHMEM, ARMCI, and LAPI
software layers they run upon.  

In the future, we would like to add VIA and InfiniBand modules.

If you have problems or comments, please email netpipe@scl.ameslab.gov

____________________________________________________________________________

NetPIPE Network Protocol Independent Performance Evaluator, Release 2.3
Copyright 1997, 1998 Iowa State University Research Foundation, Inc.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation.  You should have received a copy of the
GNU General Public License along with this program; if not, write to the
Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
____________________________________________________________________________


Building NetPIPE
----------------

NetPIPE requires an ANSI C compiler.  You are on your own for 
installing the various libraries that NetPIPE can be used to
test.

Review the provided Makefile and change any necessary settings, such
as the CFLAGS compiler flags, any extra libraries required, and the MPI
or PVM library and include file pathnames if you have these
communication libraries.

Compile NetPIPE with the desired communication interface by using:

  make mpi        (this will use the default MPI on the system)
  make pvm        (you may need to set some paths in the makefile)
  make tcgmsg     (you will need to set some paths in the makefile)
  make mpi2       (this will test 1-sided MPI_Put() functions)
  make shmem      (1-sided library for Cray and SGI systems)

  make tcp
  make gm         (for Myrinet cards, you will need to set some paths)
  make gpshmem    (SHMEM interface for other machines)
  make armci      (still under development)
  make lapi       (for the IBM SP)

Running NetPIPE
---------------

   NetPIPE writes its output to the screen by default and also to the
file np.out.  The following parameters change how NetPIPE is run; they
are listed in order of their general usefulness.

	-b: specify send and receive TCP buffer sizes e.g. "-b 32768"
            This can make a huge difference for Gigabit Ethernet cards.
            You may need to tune the OS to set a larger maximum TCP
            buffer size for optimal performance.

	-l: lower bound (start value for block size) e.g. "-l 1"

	-u: upper bound (stop value for block size) e.g. "-u 1048576"

	-o: specify output filename e.g. "-o output.txt"

        -z: for MPI, receive messages using ANYSOURCE

        -g: MPI-2: use MPI_Get() instead of MPI_Put()

        -f: MPI-2: do not use a fence call (may not work for all packages)


	-A: specify buffer alignment e.g. "-A 4096"
            Buffers are page-aligned by default

	-a: asynchronous receive (a.k.a. pre-posted receive)
		May not have any effect, depending on your implementation

	-i: specify increment step size e.g. "-i 64"
		Default is exponential increment calculated at runtime

	-O: specify buffer offset e.g. "-O 127"

	-s: stream option (default mode is "ping pong")
		If this option is used, it must be specified on both
		the sending and receiving processes
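
   The -b option above sets the TCP socket buffer sizes.  As a rough
   illustration of what that involves (a Python sketch, not NetPIPE's
   own C code), the snippet below requests send and receive buffers
   with the standard SO_SNDBUF and SO_RCVBUF socket options and reads
   back what the OS actually granted.  The function name
   set_tcp_buffers is made up for this example.

```python
import socket


def set_tcp_buffers(sock, size):
    """Request send/receive buffers of `size` bytes; return what the OS granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    # Read the values back: the OS is free to adjust the request.
    granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_send, granted_recv


if __name__ == "__main__":
    # 32768 matches the "-b 32768" example above.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        snd, rcv = set_tcp_buffers(s, 32768)
        print("requested 32768, granted send=%d recv=%d" % (snd, rcv))
```

   The sizes reported back may differ from the request: the OS can cap
   them at its configured maximum (the tuning mentioned under -b), and
   some systems report a larger value than was asked for.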

   TCP
   ---

      Compile NetPIPE using 'make tcp'
      other_host> NPtcp -r [options]
      local_host> NPtcp -t -h other_host [options]

   MPICH
   -----

      Install MPICH
      Compile NetPIPE using 'make mpi'
      use p4pg file or edit mpich/util/mach/mach.{ARCH} file
          to specify the machines to run on
      mpirun [-nolocal] -np 2 NPmpi [options]
      'setenv P4_SOCKBUFSIZE 256000' can make a huge difference for
           MPICH on Unix systems.

   LAM/MPI    (comes on the RedHat Linux distributions now)
   -------

      Install LAM
      Compile NetPIPE using 'make mpi'
      put the machine names into a lamhosts file
      'lamboot -v -b lamhosts' to start the lamd daemons
      mpirun -np 2 [-O] NPmpi [options]
      The -O parameter avoids data translation for homogeneous systems.

   MPI/Pro     (commercial version)
   -------

      Install MPI/Pro
      Compile NetPIPE using 'make mpi'
      put the machine names into /etc/machines or a local machine file
      mpirun -np 2 NPmpi [options]

   MP_Lite      (A lightweight version of MPI)
   -------

      Install MP_Lite  (http://www.scl.ameslab.gov/Projects/MP_Lite/)
      Compile NetPIPE using 'make MP_Lite'
      mprun -np 2 -h {host1} {host2} NPmplite [options]

   PVM
   ---

      Install PVM  (comes on the RedHat distributions now)
      Set the PVM paths in the makefile if necessary.
      Compile NetPIPE using 'make pvm'
      use the 'pvm' utility to start the pvmd daemons
        type 'pvm' to start it  (this will also start pvmd on the local_host)
        pvm> help           --> lists all commands
        pvm> add other_host --> will start a pvmd on a machine called 'other_host'
        pvm> quit           --> when you have all the pvmd machines started
      other_host> NPpvm -r [options]
      local_host> NPpvm -t -h other_host [options]
      Changing PVMDATA in netpipe.h and PvmRouteDirect in pvm.c can
        affect the performance greatly.

   TCGMSG      (unlikely to be tried by anyone who doesn't know TCGMSG well)
   -------

      Install TCGMSG package
      Set the TCGMSG paths in the makefile.
      Compile NetPIPE using 'make tcgmsg'
      create a NPtcgmsg.p file with hosts and paths (see hosts/NPtcgmsg.p)
      parallel NPtcgmsg
          (no options can be passed into this version)

   MPI-2
   -----

      Install the MPI package
      Compile NetPIPE using 'make mpi2'
      Follow the directions for running the MPI package from above
      The MPI_Put() function will be tested with fence calls by default.
      Use -g to test MPI_Get() instead, or -f to do MPI_Put() without
        fence calls (will not work with LAM).

   SHMEM
   -----

      Must be run on a Cray or SGI system that supports SHMEM calls.
      Compile NetPIPE using 'make shmem'
      (Xuehua, fill out the rest)

   GPSHMEM  (a General Purpose SHMEM library) (gpshmem.c in development)
   -------

      Ask Ricky or Krzysztof for help :).

   GM       (test the raw performance of GM on Myrinet cards)
   --

      Install the GM package and configure the Myrinet cards
      Compile NetPIPE using 'make gm'
      other_host> NPgm -r [options]
      local_host> NPgm -t -h other_host [options]

   LAPI   (Xuehua, please fill this out)
   ----

   ARMCI
   -----   
   
      Install the ARMCI package
      Compile NetPIPE using 'make armci'
      Follow the directions for running the MPI package from above
      If running on interfaces other than the default, create a file
        called armci_hosts, containing two lines, one for each hostname,
        then run the package.

Interpreting the Results
------------------------

NetPIPE's output file contains five columns: time to transfer the block,
bits per second, bits in block, bytes in block, and variance.  These
columns may be graphed to represent and compare the network's
performance.  For example, the "network signature" graph can be
created by graphing the time column versus the bits per second column
(see the NetPIPE report at the URL above for details on why this
graph is important and how to interpret it).  The more traditional
"throughput versus block size" graph can be created by
graphing the bytes column versus the bits per second column.
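
As a starting point for producing those graphs, the following Python
sketch reads the five columns in the order listed above and pulls out
the (x, y) pairs for each graph.  It assumes whitespace-separated
numeric columns in exactly that order; check this against your own
output file before relying on it.  The function names and the sample
numbers are invented for illustration and are not real measurements.

```python
def parse_netpipe_output(text):
    """Parse five-column lines into dictionaries, skipping anything else."""
    names = ("time", "bits_per_sec", "bits", "bytes", "variance")
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(names):
            continue  # blank line or unexpected layout
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue  # non-numeric line, e.g. a header
        rows.append(dict(zip(names, values)))
    return rows


def signature_points(rows):
    """(time, bits per second) pairs for the "network signature" graph."""
    return [(r["time"], r["bits_per_sec"]) for r in rows]


def throughput_points(rows):
    """(bytes, bits per second) pairs for throughput versus block size."""
    return [(r["bytes"], r["bits_per_sec"]) for r in rows]


if __name__ == "__main__":
    # Made-up numbers for illustration only; not real measurements.
    sample = """\
0.000100 80000.0 8 1 0.0
0.000200 640000.0 128 16 0.0
"""
    rows = parse_netpipe_output(sample)
    for x, y in throughput_points(rows):
        print("%d bytes -> %.0f bits/sec" % (x, y))
```

The returned pairs can be handed to whatever plotting tool you prefer;
to work on a real run, read the output file into a string and pass it
to parse_netpipe_output().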

Changes
-------

 - we need to put the getrusage stuff from version 2.4 back in

version 3.2 (9/21/02)
   * Added PVM, TCGMSG, MPI-2, SHMEM, GM, GPSHMEM, ARMCI, and LAPI
     modules.
   * Removed initial latency test since it is redundant
   * Changed ping-pong test so data does not start in cache by default.
     The -c option does the ping-pong between two buffers, which affects
     SMP message-passing results greatly.  For SMP systems, both ways
     should be tested.
  
version 2.3 (9/24/98)
   * Add PVM interface contributed by Clark E. Dorman <dorman@s3i.com>

   * Revamp README file with instructions for NPmpi and NPpvm, and
     clarify some instructions for NPtcp

version 2.2 (8/21/98)
   * Carefully check all return values from write(2) and read(2)
     system calls in TCP.c.  Handle short reads properly.  Make the Sync()
     function transmit and receive a useful string which can be
     checked for validity.

   * Correct the overloading of SendTime() and RecvTime() functions
     by breaking out SendRepeat() and RecvRepeat() as separate
     functions.

   * Handle systems whose accept(2) system call does not carry socket
     options over from the listening socket.  In particular, set the
     TCP_NODELAY flag and socket buffers on an accepted socket.
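
The short-read handling mentioned under version 2.2 comes down to
looping until the whole block has arrived, because a single read on a
TCP socket may return fewer bytes than were requested.  A minimal
Python sketch of that idea follows.  It is not the code in TCP.c; the
name recv_exact is invented here, and the demo uses a local socket
pair rather than a real network connection.

```python
import socket


def recv_exact(sock, nbytes):
    """Keep reading until exactly `nbytes` bytes have been received."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer bytes than asked for
        if not chunk:
            # An empty result means the peer closed the connection early.
            raise ConnectionError(
                "peer closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    a, b = socket.socketpair()
    # Send the message in two pieces so one recv() may not see all of it.
    a.sendall(b"hello ")
    a.sendall(b"world")
    print(recv_exact(b, 11))
    a.close()
    b.close()
```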
