
                               EMBOSS: seqnr
     _________________________________________________________________
   
                                 Program seqnr
                                       
Function

   Converts redundant database results to a non-redundant set of hits
   
Description

   This is part of Jon Ison's protein structure analysis package.
   
   This package is still being developed.
   
   Please ignore this program until further details can be documented.
   
   All further queries should go to Jon Ison.
   
Usage

   Here is a sample session with seqnr:

% seqnr
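
   The sample session above runs seqnr with no arguments. As a hedged
   illustration, the sketch below assembles a fully explicit command line
   from the qualifier names and default values listed in this document. The
   helper name build_seqnr_command is hypothetical, and the "-qualifier
   value" form is assumed to follow the usual EMBOSS convention; the sketch
   only builds the command and does not run it.

```python
import shlex

# Qualifier names and defaults as listed in this document.
DEFAULTS = {
    "path": "./",
    "extn": ".psiblasts",
    "outpath": "./",
    "outextn": ".clean",
    "datafile": "EBLOSUM62",
    "gapopen": 10.0,
    "gapextend": 0.5,
    "thresh": 95.0,
}


def build_seqnr_command(**overrides):
    """Return an argv list for seqnr (hypothetical helper, not part of EMBOSS)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown qualifiers: {sorted(unknown)}")
    options = {**DEFAULTS, **overrides}
    cmd = ["seqnr"]
    for name, value in options.items():
        cmd += [f"-{name}", str(value)]
    return cmd


# Show the command that would be run with a 90% redundancy threshold.
print(shlex.join(build_seqnr_command(thresh=90.0)))
```

   The resulting list could be passed to subprocess.run on a system where
   seqnr is installed.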

Command line arguments

   Mandatory qualifiers:
  [-path]              string     Directory of redundant database search
                                  results
  [-extn]              string     File extension of redundant database search
                                  results files
  [-outpath]           string     Directory for processed results
  [-outextn]           string     File extension for processed results files
  [-datafile]          matrixf    Residue substitution matrix
  [-gapopen]           float      The gap insertion penalty is the score taken
                                  away when a gap is created. The best value
                                  depends on the choice of comparison matrix.
                                  The default value assumes you are using the
                                  EBLOSUM62 matrix for protein sequences, and
                                  the EDNAFULL matrix for nucleotide
                                  sequences.
  [-gapextend]         float      The gap extension penalty is added to the
                                  standard gap penalty for each base or
                                  residue in the gap. This is how long gaps
                                  are penalized. Usually you will expect a few
                                  long gaps rather than many short gaps, so
                                  the gap extension penalty should be lower
                                  than the gap penalty. An exception is where
                                  one or both sequences are single reads with
                                  possible sequencing errors in which case you
                                  would expect many single base gaps. You can
                                  get this result by setting the gap open
                                  penalty to zero (or very low) and using the
                                  gap extension penalty to control gap
                                  scoring.
  [-thresh]            float      The % sequence identity redundancy threshold

   Optional qualifiers: (none)
   Advanced qualifiers: (none)
   General qualifiers:
  -help                bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
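
   The -gapopen and -gapextend descriptions above can be illustrated with a
   small calculation. This is a minimal sketch, assuming the penalty for one
   gap is the opening penalty plus the extension penalty for each residue in
   the gap, which is one reading of the text; the exact formula used by
   seqnr may differ. The function name gap_penalty is hypothetical.

```python
def gap_penalty(length, gapopen=10.0, gapextend=0.5):
    """Penalty for a single gap of `length` residues (assumed formula)."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return gapopen + gapextend * length


# With the defaults, one long gap is much cheaper than many short gaps.
one_long_gap = gap_penalty(6)          # 10.0 + 0.5 * 6 = 13.0
six_short_gaps = 6 * gap_penalty(1)    # 6 * (10.0 + 0.5) = 63.0
print(one_long_gap, six_short_gaps)

# With a very low opening penalty the difference shrinks, so many
# single-residue gaps become affordable.
low_open_long = gap_penalty(6, gapopen=1.0)        # 1.0 + 3.0 = 4.0
low_open_short = 6 * gap_penalty(1, gapopen=1.0)   # 6 * 1.5 = 9.0
print(low_open_long, low_open_short)
```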
   

   Mandatory qualifiers

   [-path] (Parameter 1)
      Description:     Directory of redundant database search results
      Allowed values:  Any string is accepted
      Default:         ./

   [-extn] (Parameter 2)
      Description:     File extension of redundant database search results
                       files
      Allowed values:  Any string is accepted
      Default:         .psiblasts

   [-outpath] (Parameter 3)
      Description:     Directory for processed results
      Allowed values:  Any string is accepted
      Default:         ./

   [-outextn] (Parameter 4)
      Description:     File extension for processed results files
      Allowed values:  Any string is accepted
      Default:         .clean

   [-datafile] (Parameter 5)
      Description:     Residue substitution matrix
      Allowed values:  Comparison matrix file in EMBOSS data path
      Default:         EBLOSUM62

   [-gapopen] (Parameter 6)
      Description:     The gap insertion penalty is the score taken away
                       when a gap is created. The best value depends on the
                       choice of comparison matrix. The default value
                       assumes you are using the EBLOSUM62 matrix for
                       protein sequences, and the EDNAFULL matrix for
                       nucleotide sequences.
      Allowed values:  Floating point number from 1.0 to 100.0
      Default:         10.0 for any sequence

   [-gapextend] (Parameter 7)
      Description:     The gap extension penalty is added to the standard
                       gap penalty for each base or residue in the gap. This
                       is how long gaps are penalized. Usually you will
                       expect a few long gaps rather than many short gaps,
                       so the gap extension penalty should be lower than the
                       gap penalty. An exception is where one or both
                       sequences are single reads with possible sequencing
                       errors, in which case you would expect many single
                       base gaps. You can get this result by setting the gap
                       open penalty to zero (or very low) and using the gap
                       extension penalty to control gap scoring.
      Allowed values:  Floating point number from 0.0 to 10.0
      Default:         0.5 for any sequence

   [-thresh] (Parameter 8)
      Description:     The % sequence identity redundancy threshold
      Allowed values:  Any numeric value
      Default:         95.0

   Optional qualifiers: (none)
   Advanced qualifiers: (none)
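
   The -thresh qualifier is a percentage sequence identity. This
   documentation does not describe how seqnr computes identity or applies
   the threshold, so the following is only a sketch under stated
   assumptions: the two sequences are already aligned, identity is counted
   over columns where neither sequence has a gap, and a pair is treated as
   redundant when its identity is greater than or equal to the threshold.
   The names percent_identity and is_redundant are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences (assumed definition)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns containing a gap in either sequence are ignored.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_redundant(aligned_a, aligned_b, thresh=95.0):
    """True if the pair meets the redundancy threshold (assumed >= rule)."""
    return percent_identity(aligned_a, aligned_b) >= thresh


# Nine of ten aligned residues match, so identity is 90%.
seq_a = "ACDEFGHIKL"
seq_b = "ACDEFGHIKV"
print(percent_identity(seq_a, seq_b))
print(is_redundant(seq_a, seq_b))               # default threshold of 95.0
print(is_redundant(seq_a, seq_b, thresh=90.0))  # looser threshold
```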
   
Input file format

Output file format

Data files

Notes

   None.
   
References

   None.
   
Warnings

   None.
   
Diagnostic Error Messages

   None.
   
Exit status

   It always exits with status 0.
   
Known bugs

   None.
   
See also

   Program name    Description
   cutgextract     Extract data from CUTG
   domainer        Build domain coordinate files
   nrscope         Converts redundant EMBL-format SCOP file to non-redundant one
   pdbtosp         Convert raw swissprot:pdb equivalence file to embl-like format
   printsextract   Extract data from PRINTS
   prosextract     Builds the PROSITE motif database for patmatmotifs to search
   rebaseextract   Extract data from REBASE
   scope           Convert raw scop classification file to embl-like format
   scopparse       Reads raw-, and writes EMBL-like, scop classification files
   tfextract       Extract data from TRANSFAC
   
Author(s)

   This application was written by Jon Ison (jison@hgmp.mrc.ac.uk)
   
History

   Written (date) - author.
   
Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.
   
Comments
