
                           EMBOSS: helixturnhelix
     _________________________________________________________________
   
                            Program helixturnhelix
                                       
Function

   Report nucleic acid binding motifs
   
Description

   helixturnhelix finds helix-turn-helix nucleic acid binding motifs in
   protein sequences using the method of Dodd and Egan.
   
   The helix-turn-helix motif was originally identified as the
   DNA-binding domain of phage repressors. One alpha-helix lies in the
   major (wide) groove of DNA; the other lies at an angle across the DNA.
   
Usage

   Here is a sample session with helixturnhelix.

% helixturnhelix
Input sequence: sw:laci_ecoli
Output file [laci_ecoli.hth]:

Command line arguments

   Mandatory qualifiers:
  [-sequence]          seqall     Sequence database USA
  [-outfile]           report     (no help text) report value

   Optional qualifiers:
   -mean               float      Mean value
   -sd                 float      Standard Deviation value
   -minsd              float      Minimum SD
   -eightyseven        bool       Use the old (1987) weight data

   Advanced qualifiers: (none)

   General qualifiers:
  -help                bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   

   Qualifier       Description                     Allowed values                  Default
   --------------  ------------------------------  ------------------------------  --------
   Mandatory qualifiers
   [-sequence]     Sequence database USA           Readable sequence(s)            Required
   (Parameter 1)
   [-outfile]      (no help text) report value     Report file
   (Parameter 2)
   Optional qualifiers
   -mean           Mean value                      Number from 1.000 to 10000.000  238.71
   -sd             Standard Deviation value        Number from 1.000 to 10000.000  293.61
   -minsd          Minimum SD                      Number from 0.000 to 100.000    2.5
   -eightyseven    Use the old (1987) weight data  Yes/No                          No
   Advanced qualifiers
   (none)
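
   The qualifiers above can also be supplied on the command line
   instead of answering the prompts. The following Python sketch only
   assembles such a command; it does not run it. The function name
   build_command and the RANGES table are illustrative, not part of
   EMBOSS. The numeric limits are copied from the table above, and
   -rformat is the report format qualifier described under "Output
   file format".

```python
import shlex

# Allowed ranges copied from the qualifier table in this document.
RANGES = {"-mean": (1.0, 10000.0), "-sd": (1.0, 10000.0), "-minsd": (0.0, 100.0)}


def build_command(sequence, outfile, mean=None, sd=None, minsd=None,
                  eightyseven=False, rformat=None):
    """Return an argument list for helixturnhelix (hypothetical helper)."""
    argv = ["helixturnhelix", "-sequence", sequence, "-outfile", outfile]
    for name, value in (("-mean", mean), ("-sd", sd), ("-minsd", minsd)):
        if value is None:
            continue  # leave the program default in place
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        argv += [name, str(value)]
    if eightyseven:
        argv.append("-eightyseven")  # boolean qualifier, given as a bare flag
    if rformat is not None:
        argv += ["-rformat", rformat]
    return argv


argv = build_command("sw:laci_ecoli", "laci_ecoli.hth", minsd=3.0)
print(shlex.join(argv))
```

   The resulting list could be passed to subprocess.run on a system
   where EMBOSS is installed.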
   
Input file format

   The input sequence can be one or more protein sequences.
   
Output file format

   The output is a standard EMBOSS report file.
   
   The results can be output in one of several styles by using the
   command-line qualifier -rformat xxx, where 'xxx' is replaced by the
   name of the required format. The available format names are: embl,
   genbank, gff, pir, swiss, trace, listfile, dbmotif, diffseq, excel,
   feattable, motif, regions, seqtable, simple, srs, table, tagseq.
   
   See:
   http://www.uk.embnet.org/Software/EMBOSS/Themes/ReportFormats.html for
   further information on report formats.
   
   By default helixturnhelix writes a 'motif' report file.
   
   Here is a sample output:
  __________________________________________________________________________

########################################
# Program: helixturnhelix
# Rundate: Mon Feb 11 13:45:02 2002
# Report_file: laci_ecoli.hth
########################################

#=======================================
#
# Sequence: LACI_ECOLI     from: 1   to: 360
# HitCount: 1
#
# Hits above +2.50 SD (972.73)
#
#=======================================

Maximum_score_at at "*"

(1) Score 2160.000 length 22 at residues 4->25
           *
 Sequence: VTLYDVAEYAGVSYQTVSRVVN
           |                    |
           4                    25
 Standard_deviations: 6.54


#---------------------------------------
#---------------------------------------
  __________________________________________________________________________
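
   The numbers in the sample report are consistent with a simple
   relationship between the raw score and the -mean, -sd and -minsd
   values: the cutoff is mean + minsd * sd, and a hit is reported in
   standard deviations as (score - mean) / sd. The Python sketch below
   checks this against the sample output using the default values. The
   function names are illustrative, and the formulas are inferred from
   the report rather than taken from the program source.

```python
def threshold_score(mean, sd, minsd):
    """Raw score that lies minsd standard deviations above the mean."""
    return mean + minsd * sd


def sd_units(score, mean, sd):
    """Express a raw score as standard deviations above the mean."""
    return (score - mean) / sd


# Default qualifier values from the command line arguments table.
MEAN, SD, MINSD = 238.71, 293.61, 2.5

print(f"threshold: {threshold_score(MEAN, SD, MINSD):.3f}")
print(f"hit at residues 4->25: {sd_units(2160.0, MEAN, SD):.2f} SD")
```

   With the defaults the cutoff comes to approximately 972.73, as in
   the "Hits above +2.50 SD" line, and the score of 2160 corresponds
   to 6.54 standard deviations.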

Data files

   The data files are stored in the standard EMBOSS data directory. The
   names are:
     * Ehth.dat matrix file
     * Ehth87.dat 1987 shorter matrix file
       
   With care these can be replaced to suit your data sets. If the files
   are placed in the following directories they will be used in
   preference to the files in the EMBOSS distribution data directory:
     * . (your current directory)
     * .embossdata
     * ~/ (your home directory)
     * ~/.embossdata
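
   The lookup described above can be sketched in Python as follows.
   This is an illustration only: it assumes the directories are tried
   in the order listed, and the names candidate_dirs, find_data_file
   and distribution_dir are hypothetical, not part of EMBOSS.

```python
from pathlib import Path


def candidate_dirs(cwd=None, home=None):
    """Directories checked before the distribution data directory."""
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    return [cwd, cwd / ".embossdata", home, home / ".embossdata"]


def find_data_file(name, distribution_dir, cwd=None, home=None):
    """Return the first user copy of a data file, else the distributed one."""
    for directory in candidate_dirs(cwd, home):
        path = directory / name
        if path.is_file():
            return path
    # No user copy found: fall back to the EMBOSS distribution data directory.
    return Path(distribution_dir) / name
```

   For example, find_data_file("Ehth.dat", "/path/to/emboss/data")
   would return a copy of Ehth.dat in the current directory if one
   exists; the fallback path shown here is a placeholder.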
       
   Here is the default file:

# Amino acid counts for 91 Helix-turn-helix (presumed) protein motifs
# from Dodd IB and Egan JB (1990) Nucl. Acids. Res. 18:5019-5026.
#
Sample: 91 aligned sequences
#
# R  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 Total Exp
# - -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- ----- ---
  A  2  1  3 14 10 12 75  6 15  9  1  1  4  3  8 15  4  4  4 11  0 10   212 995
  C  0  0  1  1  0  0  0  0  0  3  3  1  1  0  0  0  0  0  0  1  0  3    14 106
  D  0  1  0  1 14  0  0 14  1  0  5  0  1  2  0  0  0  0  1  1  0  2    43 556
  E  4  5  0 11 26  0  0 16  9  3  3  0  3 12 13  0  0  2  0  1 13  6   127 669
  F  4  0  4  0  0  4  0  1  0 10  0  0  0  0  1  0  0  1  1  1 22  0    49 358
  G  9  7  1  4  0  0  8  0  0  0 50  0  6  0  7  1  0  3  1  1  0  4   102 761
  H  4  3  1  1  2  0  0  3  2  0  5  0  3  3  0  2  0  2  4  5  0  2    42 225
  I 10  0 13  3  2 15  0  4  9  4  0 17  0  2  0  1 31  1  4  8 16  1   141 583
  K  4  4  6 11 12  1  1 14 11  0  5  2  2  7  2  1  0  5  8  4  5 15   120 516
  L 16  1 17  0  1 35  0  3 12 31  0 22  0  2  1  1 22  1  1 12 20  0   198 954
  M  7  0  2  1  1  1  0  0  5  7  1 10  0  0  2  0  2  0  0  2  0  1    42 275
  N  0  8  0  1  0  0  0  2  1  1 14  0  8  1  4  2  0  4  9  0  0 11    66 383
  P  1  6  0  1  0  0  0  0  0  0  0  0  3 13  7  0  0  0  0  0  0  3    34 403
  Q  2  1 21  9 11  0  0  9  8  0  0  2  1 17  7 12  0  3 12  5  3  9   132 437
  R  9 10 14  9  5  0  1 16 10  0  1  0  1 17  8  7  0 17 28  3  0 16   172 609
  S  2 17  0  8  4  1  6  1  2  2  3  0 37  1 25  5  0 29  3  0  1  5   152 552
  T  6 24  3 12  1  5  0  2  2  4  0  5 20  4  3 39  0  4  1  0  4  3   142 512
  V  7  3  1  1  2 16  0  0  2 12  0 29  0  5  3  3 32  0  7  8  7  0   138 724
  W  2  0  0  0  0  0  0  0  0  1  0  1  0  0  0  0  0  0  2 21  0  0    27 105
  Y  2  0  4  3  0  1  0  0  2  4  0  1  1  2  0  2  0 15  5  7  0  0    49 267
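
   When preparing a replacement data file, it can help to check that
   the table is internally consistent. In the default file each row's
   22 position counts add up to its Total column. The Python sketch
   below parses the layout shown above and checks this for a
   three-row excerpt. The function names are illustrative, the parser
   is not the one EMBOSS uses, and whether this is the same check the
   program performs before issuing its warning is an assumption.

```python
EXCERPT = """\
# Amino acid counts for 91 Helix-turn-helix (presumed) protein motifs
Sample: 91 aligned sequences
# R  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 Total Exp
  A  2  1  3 14 10 12 75  6 15  9  1  1  4  3  8 15  4  4  4 11  0 10   212 995
  C  0  0  1  1  0  0  0  0  0  3  3  1  1  0  0  0  0  0  0  1  0  3    14 106
  D  0  1  0  1 14  0  0 14  1  0  5  0  1  2  0  0  0  0  1  1  0  2    43 556
"""


def parse_counts(text):
    """Return (sample size, {residue: (position counts, total, exp)})."""
    sample, rows = None, {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("Sample:"):
            sample = int(line.split()[1])
            continue
        fields = line.split()
        numbers = [int(field) for field in fields[1:]]
        # The last two columns are Total and Exp; the rest are positions.
        rows[fields[0]] = (numbers[:-2], numbers[-2], numbers[-1])
    return sample, rows


def row_total_mismatches(rows):
    """Residues whose position counts do not add up to the Total column."""
    return [residue for residue, (counts, total, _) in rows.items()
            if sum(counts) != total]


def column_sums(rows):
    """Sum of the counts at each position over the rows supplied."""
    return [sum(column) for column in zip(*(c for c, _, _ in rows.values()))]


sample, rows = parse_counts(EXCERPT)
print(sample, row_total_mismatches(rows))
```

   Applied to a complete file, column_sums can be compared with the
   "Sample:" value, since each aligned sequence contributes one
   residue to every position.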

Notes

   None.
   
References

    1. Dodd I.B., Egan J.B. (1987) "Systematic method for the detection
       of potential lambda cro-like DNA-binding regions in proteins." J.
       Mol. Biol. 194: 557-564.
    2. Dodd I.B., Egan J.B. (1990) "Improved detection of
       helix-turn-helix DNA-binding motifs in protein sequences." Nucleic
       Acids Res. 18: 5019-5026.
       
Warnings

   The program will warn you if the data file is not mathematically
   accurate.
   
Diagnostic Error Messages

   None.
   
Exit status

   It exits with status 0 unless an error is reported.
   
Known bugs

   None.
   
See also

   Program name                       Description
   antigenic    Finds antigenic sites in proteins
   digest       Protein proteolytic enzyme or reagent cleavage digest
   fuzzpro      Protein pattern search
   fuzztran     Protein pattern search after translation
   garnier      Predicts protein secondary structure
   hmoment      Hydrophobic moment calculation
   oddcomp      Finds protein sequence regions with a biased composition
   patmatdb     Search a protein sequence with a motif
   patmatmotifs Search a PROSITE motif database with a protein sequence
   pepcoil      Predicts coiled coil regions
   pepnet       Displays proteins as a helical net
   pepwheel     Shows protein sequences as helices
   preg         Regular expression search of a protein sequence
   pscan        Scans proteins using PRINTS
   sigcleave    Reports protein signal cleavage sites
   tmap         Displays membrane spanning regions
   
Author(s)

   This application was written by Alan Bleasby (ableasby@hgmp.mrc.ac.uk)
   
   Original program "HELIXTURNHELIX" by Peter Rice (EGCG 1990)
   
History

   Completed 11th March 1999
   
Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.
   
Comments
