
                             EMBOSS: dotmatcher
     _________________________________________________________________
   
                              Program dotmatcher
                                       
Function

   Displays a thresholded dotplot of two sequences
   
Description

   A dotplot is a graphical representation of the regions of similarity
   between two sequences.
   
   The two sequences are placed on the axes of a rectangular image and
   (subject to threshold conditions) wherever there is a similarity
   between the sequences a dot is placed on the image.
   
   Where the two sequences have substantial regions of similarity, many
   dots align to form diagonal lines. It is therefore possible to see at
   a glance where there are local regions of similarity as these will
   have long diagonal lines. It is also easy to see other features such
   as repeats (which form parallel diagonal lines), and insertions or
   deletions (which form breaks or discontinuities in the diagonal
   lines).
   
   dotmatcher plots a match only where a score, calculated from the
   substitution matrix, exceeds a threshold. A window of specified
   length is moved along every possible diagonal, and a score is
   calculated for the window at each position along the diagonal. The
   score is the sum, over the window, of the substitution matrix values
   for each pair of aligned residues from the two sequences. If the
   score is above the threshold, a line is plotted on the image over the
   position of the window.
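
   The windowed scoring described above can be sketched in a few lines of
   Python. This is an illustration only, not dotmatcher's source code:
   the names window_hits and toy_score are hypothetical, and toy_score
   (+5 for identical residues, -4 otherwise) merely stands in for a real
   substitution matrix such as EBLOSUM62 or EDNAFULL.

```python
def toy_score(a, b):
    """Hypothetical stand-in for a substitution matrix lookup."""
    return 5 if a == b else -4


def window_hits(seq_a, seq_b, score, window=10, threshold=50):
    """Return (i, j) window start positions whose summed score exceeds threshold.

    i indexes seq_a and j indexes seq_b; each (i, j) pair identifies a
    window lying along one diagonal of the dotplot.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Sum the pairwise scores of the aligned residues in the window.
            total = sum(score(seq_a[i + k], seq_b[j + k]) for k in range(window))
            if total > threshold:
                hits.append((i, j))
    return hits


hits = window_hits("GATTACA", "CATTACG", toy_score, window=3, threshold=14)
```

   With a window of 3 and a threshold of 14, only windows in which all
   three residues match (score 15) are reported; consecutive hits on the
   same diagonal are what appear as a diagonal line in the plot.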
   
Usage

   Here is a sample session with dotmatcher.
   
% dotmatcher sw:hba_human sw:hbb_human
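
   To run the same comparison from a script with explicit settings, the
   qualifiers listed under "Command line arguments" can be assembled
   into an argument list. The following is a minimal sketch: the helper
   name dotmatcher_args is hypothetical, it assumes dotmatcher is on the
   PATH, and "ps" is just one of the graph devices this page lists.

```python
def dotmatcher_args(seq_a, seq_b, windowsize=10, threshold=50, graph="ps"):
    """Build an argument list for dotmatcher (hypothetical helper).

    The defaults for windowsize and threshold mirror the defaults
    documented on this page.
    """
    return [
        "dotmatcher", seq_a, seq_b,
        "-windowsize", str(windowsize),
        "-threshold", str(threshold),
        "-graph", graph,
    ]


args = dotmatcher_args("sw:hba_human", "sw:hbb_human", threshold=40)
```

   The resulting list can be passed to subprocess.run(args, check=True)
   on a machine where EMBOSS is installed.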

   
Command line arguments

   Mandatory qualifiers (* if not always prompted):
  [-sequencea]         sequence   Sequence USA
  [-sequenceb]         sequence   Sequence USA
*  -data               bool       Output the match data to a file instead of
                                  plotting it
*  -graph              graph      Graph type
*  -xygraph            xygraph    Graph type
*  -outfile            outfile    Display as data

   Optional qualifiers:
   -windowsize         integer    window size over which to test threshold
   -threshold          integer    threshold
   -matrixfile         matrix     Matrix file

   Advanced qualifiers:
   -stretch            bool       Display a non-proportional graph

   General qualifiers:
   -help               bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   

   Mandatory qualifiers                      Allowed values              Default
   [-sequencea]   Sequence USA               Readable sequence           Required
   (Parameter 1)
   [-sequenceb]   Sequence USA               Readable sequence           Required
   (Parameter 2)
   -data          Output the match data to   Yes/No                      No
                  a file instead of
                  plotting it
   -graph         Graph type                 EMBOSS has a list of known  EMBOSS_GRAPHICS
                                             devices, including          value, or x11
                                             postscript, ps, hpgl,
                                             hp7470, hp7580, meta,
                                             colourps, cps, xwindows,
                                             x11, tektronics, tekt,
                                             tek4107t, tek, none, null,
                                             text, data, xterm, png
   -xygraph       Graph type                 EMBOSS has a list of known  EMBOSS_GRAPHICS
                                             devices, including          value, or x11
                                             postscript, ps, hpgl,
                                             hp7470, hp7580, meta,
                                             colourps, cps, xwindows,
                                             x11, tektronics, tekt,
                                             tek4107t, tek, none, null,
                                             text, data, xterm, png
   -outfile       Display as data            Output file                 <sequence>.dotmatcher

   Optional qualifiers                       Allowed values              Default
   -windowsize    window size over which to  Integer 3 or more           10
                  test threshold
   -threshold     threshold                  Integer 0 or more           50
   -matrixfile    Matrix file                Comparison matrix file in   EBLOSUM62 for protein
                                             EMBOSS data path            EDNAFULL for DNA

   Advanced qualifiers                       Allowed values              Default
   -stretch       Display a non-proportional Yes/No                      No
                  graph
   
Input file format

   Any 2 sequence USAs of the same type (DNA or protein).
   
Output file format

   An image is output to the requested graphics device. If -data is
   set, the match data are written to the output file instead of being
   plotted.
   
Data files

   dotmatcher uses the specified substitution matrix file to compare the
   two sequences.
   
   By default, EBLOSUM62 is used as the substitution matrix for protein
   sequences and EDNAFULL for nucleotide sequences. Other matrices can
   be specified with -matrixfile.
   
   EMBOSS data files are distributed with the application and stored in
   the standard EMBOSS data directory, which is defined by the EMBOSS
   environment variable EMBOSS_DATA.
   
   Users can provide their own data files in their own directories.
   Project-specific files can be put in the current directory or, for
   tidier directory listings, in a subdirectory called ".embossdata".
   Files for all EMBOSS runs can be put in the user's home directory or,
   again, in a subdirectory called ".embossdata".
   
   The directories are searched in the following order:
     * . (your current directory)
     * .embossdata (under your current directory)
     * ~/ (your home directory)
     * ~/.embossdata
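
   The search order above can be sketched as follows. This is an
   illustration only, not EMBOSS code: the function name find_data_file
   is hypothetical, and the sketch covers only the four user directories
   listed here, not the standard EMBOSS data directory.

```python
from pathlib import Path


def find_data_file(name, cwd=None, home=None):
    """Return the first match for name in the listed search order, or None."""
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    search_order = (
        cwd,                   # . (your current directory)
        cwd / ".embossdata",   # .embossdata (under your current directory)
        home,                  # ~/ (your home directory)
        home / ".embossdata",  # ~/.embossdata
    )
    for directory in search_order:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

   A file in the current directory therefore takes precedence over a
   file of the same name in ~/.embossdata.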
       
Notes

   None.
   
References

   None.
   
Warnings

   None.
   
Diagnostic Error Messages

   None.
   
Exit status

   0 upon successful completion.
   
Known bugs

   None.
   
See also

   Program name                          Description
   dotpath      Displays a non-overlapping wordmatch dotplot of two sequences
   dottup       Displays a wordmatch dotplot of two sequences
   polydot      Displays all-against-all dotplots of a set of sequences
   
   By comparison, dottup has no threshold and uses a wordmatch-style
   method instead. It is less sensitive than dotmatcher but
   substantially faster.
   
Author(s)

   This application was written by Ian Longden (il@sanger.ac.uk),
   Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus,
   Hinxton, Cambridge, CB10 1SA, UK.
   
History

   Completed 1st June 1999.
   Last modified 16th June 1999.

Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.
   
Comments
