
                                EMBOSS: dan
     _________________________________________________________________
   
                                  Program dan
                                       
Function

   Calculates DNA or RNA/DNA melting temperature
   
Description

   dan calculates the melting temperature (Tm) and the percent G+C of a
   nucleic acid sequence, optionally plotting them. For the melting
   temperature profile, it uses free energy values calculated from
   nearest-neighbor thermodynamics (Breslauer et al., Proc. Natl. Acad.
   Sci. USA 83, 3746-3750; Baldino et al., Methods in Enzymol. 168,
   761-777).
   
Usage

   Here is a sample session with dan.

% dan
Input sequence: embl:paamir
Enter window size [20]:
Enter Shift Increment [1]:
Enter DNA concentration (nM) [50.]:
Enter salt concentration (mM) [50.]:
Output file [paamir.dan]:

   An example of producing a plot of Tm:

% dan -plot
Input sequence(s): embl:paamir
Enter window size [20]:
Enter Shift Increment [1]:
Enter DNA concentration (nM) [50.]:
Enter salt concentration (mM) [50.]:
Enter minimum temperature [55.]:
Graph type [x11]:

Command line arguments

   Mandatory qualifiers (* if not always prompted):
  [-sequence]          seqall     Sequence database USA
   -windowsize         integer    The values of melting point and other
                                  thermodynamic properties of the sequence are
                                  determined by taking a short length of
                                  sequence known as a window and determining
                                  the properties of the sequence in that
                                  window. The window is incrementally moved
                                  along the sequence with the properties being
                                  calculated at each new position.
   -shiftincrement     integer    This is the amount by which the window is
                                  moved at each increment in order to find the
                                  melting point and other properties along
                                  the sequence.
   -dnaconc            float      Enter DNA concentration (nM)
   -saltconc           float      Enter salt concentration (mM)
*  -formamide          float      This specifies the percent formamide to be
                                  used in calculations (it is ignored unless
                                  -product is used).
*  -mismatch           float      This specifies the percent mismatch to be
                                  used in calculations (it is ignored unless
                                  -product is used).
*  -prodlen            integer    This specifies the product length to be used
                                  in calculations (it is ignored unless
                                  -product is used).
*  -mintemp            float      Enter a minimum value for the temperature
                                  scale (y-axis) of the plot.
*  -graph              xygraph    Graph type
*  -outfile            report     If a plot is not being produced then data on
                                  the melting point etc. in each window along
                                  the sequence is output to the file.

   Optional qualifiers (* if not always prompted):
*  -temperature        float      If -thermo has been specified then this
                                  specifies the temperature at which to
                                  calculate the DeltaG, DeltaH and DeltaS
                                  values.

   Advanced qualifiers:
   -rna                bool       This specifies that the sequence is an RNA
                                  sequence and not a DNA sequence.
   -product            bool       This prompts for percent formamide, percent
                                  of mismatches allowed and product length.
   -thermo             bool       Output the DeltaG, DeltaH and DeltaS values
                                  of the sequence windows to the output data
                                  file.
   -plot               bool       If this is not specified then the file of
                                  output data is produced, else a plot of the
                                  melting point along the sequence is
                                  produced.

   General qualifiers:
   -help               bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   

   Mandatory qualifiers

   [-sequence] (Parameter 1)
      Description:    Sequence database USA
      Allowed values: Readable sequence(s)
      Default:        Required

   -windowsize
      Description:    The values of melting point and other thermodynamic
                      properties of the sequence are determined by taking
                      a short length of sequence known as a window and
                      determining the properties of the sequence in that
                      window. The window is incrementally moved along the
                      sequence with the properties being calculated at
                      each new position.
      Allowed values: Integer from 1 to 100
      Default:        20

   -shiftincrement
      Description:    This is the amount by which the window is moved at
                      each increment in order to find the melting point
                      and other properties along the sequence.
      Allowed values: Integer 1 or more
      Default:        1

   -dnaconc
      Description:    Enter DNA concentration (nM)
      Allowed values: Number from 1.000 to 100000.000
      Default:        50.

   -saltconc
      Description:    Enter salt concentration (mM)
      Allowed values: Number from 1.000 to 1000.000
      Default:        50.

   -formamide
      Description:    This specifies the percent formamide to be used in
                      calculations (it is ignored unless -product is
                      used).
      Allowed values: Number from 0.000 to 100.000
      Default:        0.

   -mismatch
      Description:    This specifies the percent mismatch to be used in
                      calculations (it is ignored unless -product is
                      used).
      Allowed values: Number from 0.000 to 100.000
      Default:        0.

   -prodlen
      Description:    This specifies the product length to be used in
                      calculations (it is ignored unless -product is
                      used).
      Allowed values: Any integer value
      Default:        Window size (20)

   -mintemp
      Description:    Enter a minimum value for the temperature scale
                      (y-axis) of the plot.
      Allowed values: Number from 0.000 to 150.000
      Default:        55.

   -graph
      Description:    Graph type
      Allowed values: EMBOSS has a list of known devices, including
                      postscript, ps, hpgl, hp7470, hp7580, meta,
                      colourps, cps, xwindows, x11, tektronics, tekt,
                      tek4107t, tek, none, null, text, data, xterm, png
      Default:        EMBOSS_GRAPHICS value, or x11

   -outfile
      Description:    If a plot is not being produced then data on the
                      melting point etc. in each window along the
                      sequence is output to the file.
      Allowed values: Report file

   Optional qualifiers

   -temperature
      Description:    If -thermo has been specified then this specifies
                      the temperature at which to calculate the DeltaG,
                      DeltaH and DeltaS values.
      Allowed values: Number from 0.000 to 100.000
      Default:        25.

   Advanced qualifiers

   -rna
      Description:    This specifies that the sequence is an RNA sequence
                      and not a DNA sequence.
      Allowed values: Yes/No
      Default:        No

   -product
      Description:    This prompts for percent formamide, percent of
                      mismatches allowed and product length.
      Allowed values: Yes/No
      Default:        No

   -thermo
      Description:    Output the DeltaG, DeltaH and DeltaS values of the
                      sequence windows to the output data file.
      Allowed values: Yes/No
      Default:        No

   -plot
      Description:    If this is not specified then the file of output
                      data is produced, else a plot of the melting point
                      along the sequence is produced.
      Allowed values: Yes/No
      Default:        No
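
   The allowed ranges above can be checked before dan is started from a
   script. The following Python sketch is illustrative only: the helper
   build_dan_command and the LIMITS table are hypothetical names, not
   part of EMBOSS. It uses only qualifiers documented here and assumes
   that supplying every prompted qualifier on the command line leaves
   dan with nothing to ask for.

```python
# Ranges copied from the qualifier documentation; the helper itself is hypothetical.
LIMITS = {
    "windowsize": (1, 100),
    "dnaconc": (1.0, 100000.0),
    "saltconc": (1.0, 1000.0),
}


def build_dan_command(sequence, outfile, windowsize=20, shiftincrement=1,
                      dnaconc=50.0, saltconc=50.0, rna=False):
    """Return an argument list for a dan run with no interactive prompts."""
    values = {"windowsize": windowsize, "dnaconc": dnaconc, "saltconc": saltconc}
    for name, value in values.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"-{name} must be between {low} and {high}, got {value}")
    if shiftincrement < 1:
        raise ValueError("-shiftincrement must be 1 or more")

    argv = [
        "dan",
        "-sequence", sequence,
        "-windowsize", str(windowsize),
        "-shiftincrement", str(shiftincrement),
        "-dnaconc", str(dnaconc),
        "-saltconc", str(saltconc),
        "-outfile", outfile,
    ]
    if rna:
        # Without -rna the sequence is treated as DNA.
        argv.append("-rna")
    return argv


if __name__ == "__main__":
    print(" ".join(build_dan_command("embl:paamir", "paamir.dan")))
```

   The returned list can be handed to subprocess.run on a machine where
   EMBOSS is installed; the sketch itself never starts dan.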
   
Input file format

   Any DNA or RNA sequence USA (Uniform Sequence Address).
   
Output file format

   If a plot is not being produced, dan reports the sequence of each
   oligomer window, its melting temperature under the specified
   conditions and its GC content.
   
   The output is a standard EMBOSS report file.
   
   The results can be output in one of several styles by using the
   command-line qualifier -rformat xxx, where 'xxx' is replaced by the
   name of the required format. The available format names are: embl,
   genbank, gff, pir, swiss, trace, listfile, dbmotif, diffseq, excel,
   feattable, motif, regions, seqtable, simple, srs, table, tagseq.
   
   See:
   http://www.uk.embnet.org/Software/EMBOSS/Themes/ReportFormats.html for
   further information on report formats.
   
   By default dan writes a 'seqtable' report file.
   
   This is the start and the end of the output file from the example.

########################################
# Program: dan
# Rundate: Mon Feb 11 12:07:10 2002
# Report_file: paamir.dan
########################################

#=======================================
#
# Sequence: PAAMIR     from: 1   to: 2167
# HitCount: 2148
#=======================================

  Start     End Tm     GC     DeltaG DeltaH DeltaS TmProd Sequence
      1      20 64.9   70.0   .      .      .      .      ggtaccgctggccgagcatc
      2      21 63.7   65.0   .      .      .      .      gtaccgctggccgagcatct
      3      22 63.7   65.0   .      .      .      .      taccgctggccgagcatctg
      4      23 66.9   70.0   .      .      .      .      accgctggccgagcatctgc
      5      24 66.7   70.0   .      .      .      .      ccgctggccgagcatctgct
      6      25 65.5   70.0   .      .      .      .      cgctggccgagcatctgctc
      7      26 65.5   70.0   .      .      .      .      gctggccgagcatctgctcg
      8      27 63.7   65.0   .      .      .      .      ctggccgagcatctgctcga
      9      28 62.9   60.0   .      .      .      .      tggccgagcatctgctcgat
     10      29 62.6   65.0   .      .      .      .      ggccgagcatctgctcgatc
     11      30 61.7   60.0   .      .      .      .      gccgagcatctgctcgatca
     12      31 60.2   60.0   .      .      .      .      ccgagcatctgctcgatcac
etc.

   2143    2162 65.6   70.0   .      .      .      .      ggtggccgccaaccagttcc
   2144    2163 64.4   65.0   .      .      .      .      gtggccgccaaccagttcct
   2145    2164 64.1   65.0   .      .      .      .      tggccgccaaccagttcctc
   2146    2165 65.4   70.0   .      .      .      .      ggccgccaaccagttcctcg
   2147    2166 64.2   65.0   .      .      .      .      gccgccaaccagttcctcga
   2148    2167 62.4   65.0   .      .      .      .      ccgccaaccagttcctcgag

#---------------------------------------
#---------------------------------------

   The header information contains details of the program, date and
   sequence.
   
   Subsequent lines contain columns of data for each window into the
   sequence as it is moved along, giving:
   
     * The start position of the window
     * The end position of the window
     * The melting temperature of the window
     * The percentage C+G of the window
     * The sequence of the window
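
   The window positions and the percentage C+G can be reproduced with a
   short Python sketch. This is not dan's source code: the function
   names are hypothetical and the melting temperature is deliberately
   left out, since it depends on the nearest-neighbor data files. The
   test sequence is the first 22 bases of PAAMIR, read off the sample
   output above.

```python
def windows(sequence, windowsize=20, shiftincrement=1):
    """Yield (start, end, subsequence) using 1-based inclusive positions."""
    last_offset = len(sequence) - windowsize
    for offset in range(0, last_offset + 1, shiftincrement):
        yield offset + 1, offset + windowsize, sequence[offset:offset + windowsize]


def percent_gc(subsequence):
    """Return the percentage of G and C bases in a window."""
    gc = sum(1 for base in subsequence.lower() if base in "gc")
    return 100.0 * gc / len(subsequence)


if __name__ == "__main__":
    # First 22 bases of PAAMIR, taken from the sample report.
    seq = "ggtaccgctggccgagcatctg"
    for start, end, sub in windows(seq):
        print(f"{start:7d} {end:7d} {percent_gc(sub):6.1f}   {sub}")
```

   With the default window size of 20 and shift increment of 1, a
   sequence of 2167 bases gives 2167 - 20 + 1 = 2148 windows, which
   matches the HitCount in the sample report.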
       
   If the qualifier '-product' is used to make the program prompt for
   percent formamide, percent of mismatches allowed and product length,
   then the output includes the melting temperature of the specified
   product:

########################################
# Program: dan
# Rundate: Mon Feb 11 12:11:25 2002
# Report_file: paamir.dan
########################################

#=======================================
#
# Sequence: PAAMIR     from: 1   to: 2167
# HitCount: 2148
#=======================================

  Start     End Tm     GC     DeltaG DeltaH DeltaS TmProd Sequence
      1      20 64.9   70.0   .      .      .      54.9   ggtaccgctggccgagcatc
      2      21 63.7   65.0   .      .      .      52.8   gtaccgctggccgagcatct
      3      22 63.7   65.0   .      .      .      52.8   taccgctggccgagcatctg
      4      23 66.9   70.0   .      .      .      54.9   accgctggccgagcatctgc

etc.

   If the qualifier '-thermo' is given then the DeltaG, DeltaH and DeltaS
   of the sequence in each window are also output.
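
   Report rows in the layout shown in the excerpts can be read back
   with a few lines of Python. The sketch below is an assumption-laden
   illustration, not an EMBOSS tool: parse_dan_rows is a hypothetical
   name, and the code assumes whitespace-separated columns, a header
   row beginning with "Start", and "." marking a value that was not
   calculated.

```python
def parse_dan_rows(lines):
    """Parse the data rows of a dan report into a list of dictionaries."""
    columns = None
    rows = []
    for line in lines:
        fields = line.split()
        if not fields or line.startswith("#"):
            continue  # blank lines and report header/footer lines
        if fields[0] == "Start":
            columns = fields  # column names: Start, End, Tm, GC, ...
            continue
        if columns is None or len(fields) != len(columns):
            continue  # e.g. the "etc." placeholder used in the excerpts
        row = {}
        for name, text in zip(columns, fields):
            if name in ("Start", "End"):
                row[name] = int(text)
            elif name == "Sequence":
                row[name] = text
            else:
                # "." means the value was not requested for this run.
                row[name] = None if text == "." else float(text)
        rows.append(row)
    return rows


if __name__ == "__main__":
    with open("paamir.dan") as handle:
        for row in parse_dan_rows(handle):
            print(row["Start"], row["End"], row["Tm"], row["GC"])
```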
   
Data files

   The EMBOSS data files "Edna.melt" and "Erna.melt" are used to read in
   the entropy/enthalpy/energy data for DNA and RNA respectively.
   
   EMBOSS data files are distributed with the application and stored in
   the standard EMBOSS data directory, which is defined by the EMBOSS
   environment variable EMBOSS_DATA.
   
   Users can provide their own data files in their own directories.
   Project-specific files can be put in the current directory or, for
   tidier directory listings, in a subdirectory called ".embossdata".
   Files for all EMBOSS runs can be put in the user's home directory or,
   again, in a subdirectory called ".embossdata".
   
   The directories are searched in the following order:
     * . (your current directory)
     * .embossdata (under your current directory)
     * ~/ (your home directory)
     * ~/.embossdata
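
   The search order can be mimicked in Python, for example to check
   which copy of "Edna.melt" a run is likely to pick up. This is a
   sketch only: find_data_file is a hypothetical helper, and falling
   back to the EMBOSS_DATA directory after the four locations listed
   above is an assumption drawn from the description of the standard
   data directory, not a documented rule.

```python
import os


def find_data_file(name, cwd=None, home=None, emboss_data=None):
    """Return the first existing path for a data file, or None."""
    cwd = cwd if cwd is not None else os.getcwd()
    home = home if home is not None else os.path.expanduser("~")
    if emboss_data is None:
        emboss_data = os.environ.get("EMBOSS_DATA")

    # The four user locations, in the documented order.
    candidates = [
        cwd,
        os.path.join(cwd, ".embossdata"),
        home,
        os.path.join(home, ".embossdata"),
    ]
    if emboss_data:
        # Assumption: the standard data directory is consulted last.
        candidates.append(emboss_data)

    for directory in candidates:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print(find_data_file("Edna.melt"))
```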
       
Notes

   None.
   
References

    1. Breslauer, K.J., Frank, R., Blocker, H., and Marky, L.A. (1986).
       "Predicting DNA Duplex Stability from the Base Sequence."
       Proceedings of the National Academy of Sciences USA 83, 3746-3750.
    2. Baldino, M., Jr. (1989). "High Resolution In Situ Hybridization
       Histochemistry." In Methods in Enzymology, (P.M. Conn, ed.), 168,
       761-777, Academic Press, San Diego, California, USA.
       
Warnings

   RNA sequences must be submitted to this application with the '-rna'
   qualifier on the command line, otherwise the sequence will be assumed
   to be DNA.
   
Diagnostic Error Messages

   None.
   
Exit status

   0 if successful.
   
Known bugs

   None.
   
See also

   Program name                          Description
   banana       Bending and curvature plot in B-DNA
   btwisted     Calculates the twisting in a B-DNA sequence
   chaos        Create a chaos game representation plot for a sequence
   compseq      Counts the composition of dimer/trimer/etc words in a sequence
   freak        Residue/base frequency table or plot
   isochore     Plots isochores in large DNA sequences
   wordcount    Counts words of a specified size in a DNA sequence
   
Author(s)

   This program was originally included in EGCG under the names "MELT"
   and "MELTPLOT", written by Rodrigo Lopez.
   
   This application was written by Alan Bleasby (ableasby@hgmp.mrc.ac.uk).
   
History

   Written (1999) - Alan Bleasby
   
Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.
   
Comments
