
                             EMBOSS: scopalign
     _________________________________________________________________
   
                               Program scopalign
                                       
Function

   Generate alignments for SCOP families
   
Description

   scopalign parses a SCOP classification file in EMBL-like format
   generated by the EMBOSS applications scope or nrscope, and domain
   coordinate files generated by the EMBOSS application domainer, and
   calls stamp to generate structural alignments for each SCOP family in
   turn.
   
VERY IMPORTANT NOTE

   scopalign will only run with a version of stamp that has been
   modified to accept PDB ID codes longer than 4 characters. This
   involves a trivial change to the stamp module getdomain.c (around
   line 155), where a 4 must be changed to a 7.

   Original line:
   temp=getfile(domain[0].id,dirfile,4,OUTPUT);
   Modified line:
   temp=getfile(domain[0].id,dirfile,7,OUTPUT);
   
   The modified code is kept on the HGMP file system in
   /packages/stamp/src2. WHEN RUNNING SCOPALIGN AT THE HGMP IT IS
   ESSENTIAL THAT THE COMMAND 'use stamp2' (which runs the script
   /packages/menu/USE/stamp2) IS GIVEN BEFORE SCOPALIGN IS RUN. This
   ensures that the modified version of stamp is used.
   
Usage

   Here is a sample session with scopalign:

% scopalign

Command line arguments

   Mandatory qualifiers:
  [-scopf]             infile     Name of scop file for input (embl-like
                                  format)
  [-path]              string     Location of alignment files for output

   Optional qualifiers: (none)
   Advanced qualifiers: (none)
   General qualifiers:
  -help                bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   

   Qualifier            Description                   Allowed values          Default

   Mandatory qualifiers
   [-scopf]             Name of scop file for input   Input file              Escop.dat
   (Parameter 1)        (embl-like format)
   [-path]              Location of alignment files   Any string is accepted  ./
   (Parameter 2)        for output

   Optional qualifiers
   (none)

   Advanced qualifiers
   (none)
   
Input file format

   scopalign parses a SCOP classification file in EMBL-like format
   generated by the EMBOSS applications scope or nrscope, and domain
   coordinate files generated by the EMBOSS application domainer.
   
Output file format

   The names of the output files are identical to the names of the
   families given in the SCOP classification records. If a file of that
   name already exists, a suffix ("_1", "_2", etc.) is appended as
   appropriate.
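
   The naming rule above can be illustrated with a minimal Python
   sketch. It is not taken from the scopalign source: the function name
   unique_output_name and its parameters are hypothetical, and it
   assumes the suffix is appended directly to the family name and that
   numbering starts at 1. Whether scopalign alters family names (for
   example, replacing spaces) is not stated here, so the sketch leaves
   them untouched.

```python
import os


def unique_output_name(family, directory=".", exists=os.path.exists):
    """Return `family`, or `family_N` for the first N whose name is free.

    Hypothetical helper: `exists` is injectable so the rule can be
    checked without touching the file system.
    """
    candidate = family
    suffix = 0
    # Keep incrementing the suffix until the name is not already taken.
    while exists(os.path.join(directory, candidate)):
        suffix += 1
        candidate = f"{family}_{suffix}"
    return candidate
```

   For example, if files named Globins and Globins_1 already exist in
   the output directory, the sketch returns Globins_2.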
   
   The format of the scopalign output file (Figure 1) is similar to the
   output file generated by stamp when issued with the following three
   types of command:
   
   (1) stamp -l ./stamps_file.dom -s -n 2 -slide 5 -prefix ./stamps_file
   -d ./stamps_file.set;sorttrans -f ./stamps_file.scan -s Sc 2.5 >
   ./stamps_file.sort;stamp -l ./stamps_file.sort -prefix ./stamps_file >
   ./stamps_file.log
   
   (2) poststamp -f ./stamps_file.3 -min 0.5
   
   (3) ver2hor -f ./stamps_file.3.post > ./stamps_file.out
   
   However, the SCOP classification records for the appropriate family
   are written above the alignment, no DSSP assignments are given, and
   only the 'Post similar' line is given. Also, the 7-character domain
   identifier codes taken from the SCOP classification file are used.
   
Figure 1 Example of scopalign output file

CL   All alpha proteins
XX
FO   Globin-like
XX
SF   Globin-like
XX
FA   Globins
XX
Number               10        20        30        40        50
d1vrea_              LSAAQRQVVASTWKDIAgsdngAGVGKECFTKFLSAHHDMAAV f gFS
d3sdhb_      svydaaaqLTADVKKDLRDSWKVIG sd kKGNGVALMTTLFADNQETIGYfkrlGN
d3hbia_      svydaaaqLTADVKKDLRDSWKVIG sd kKGNGVALMTTLFADNQETIGYfkrlGN
d3sdha_      svydaaaqLTADVKKDLRDSWKVIG sd kKGNGVALMTTLFADNQETIGYfkrlGN
Post_similar --------11111111111111111-00-1111111111111111111111-0-111

Number        60        70        80        90       100       110
d1vrea_      GAS   dpGVADLGAKVLAQIGVAVSHLgDEGKMVAEMKAVGVRHKgygnkhIKAEY
d3sdhb_      VSQgmandKLRGHSITLMYALQNFIDQLdNPDDLVCVVEKFAVNHI  t rkISAAE
d3hbia_      VSQgmandKLRGHSITLMYALQNFIDQLdNPDDLVCVVEKLAVNHI  t rkISAAE
d3sdha_      VSQgmandKLRGHSITLMYALQNFIDQLdNPDDLVCVVEKFAVNHI  t rkISAAE
Post_similar 111---0011111111111111111111011111111111111111--0-0011111

Number          120       130       140       150       160
d1vrea_      FEPlGASL LSAMEhriggkMNAAAKDAWAAAYADisgalisglqs
d3sdhb_      FGK INGPiKKVLA s k nFGDKYANAWAKLVAVvqa al
d3hbia_      FGK INGPiKKVLA s k nFGDKYANAWAKLVAVvqa al
d3sdha_      FGK INGPiKKVLA s k nFGDKYANAWAKLVAVvqa al
Post_similar 111-1111-11111-0-0-1111111111111111100-00-----
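
   A file laid out like Figure 1 could be read with a short Python
   sketch such as the one below. It rests on assumptions inferred from
   the figure rather than from a format specification: identifiers
   occupy the first 13 columns, spaces inside the sequence columns are
   gap positions, and each block starts with a 'Number' line. The
   function name parse_scopalign is hypothetical.

```python
NAME_WIDTH = 13  # assumption: identifier column width inferred from Figure 1
RECORD_CODES = ("CL", "FO", "SF", "FA")


def parse_scopalign(text):
    """Return (records, alignment) parsed from scopalign-style output.

    records maps a two-letter code (CL, FO, SF, FA) to its text;
    alignment maps each row name (domain identifiers and Post_similar)
    to the row joined across all blocks.
    """
    records = {}
    alignment = {}
    block = []  # (name, segment) pairs of the block being read

    def flush():
        if not block:
            return
        width = max(len(segment) for _, segment in block)
        for name, segment in block:
            # Pad rows whose trailing gap columns were trimmed.
            alignment[name] = alignment.get(name, "") + segment.ljust(width)
        block.clear()

    for line in text.splitlines():
        if line[:2] in RECORD_CODES and line[2:5] == "   ":
            records[line[:2]] = line[5:].strip()
        elif line.startswith("Number"):
            flush()  # a ruler line starts a new alignment block
        elif line.strip() and not line.startswith("XX"):
            block.append((line[:NAME_WIDTH].strip(), line[NAME_WIDTH:]))
    flush()
    return records, alignment
```

   Rows are sliced by column position rather than split on whitespace,
   because a sequence row may begin with gap columns (as d1vrea_ does in
   the first block of Figure 1).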

Data files

   None
   
Notes

   See the VERY IMPORTANT NOTE above: scopalign requires the modified
   version of stamp described there.
   
References

   None.
   
Warnings

   None.
   
Diagnostic Error Messages

   None.
   
Exit status

   It always exits with status 0.
   
Known bugs

   None.
   
See also

   Program name   Description
   contacts       Reads coordinate files and writes contact files
   dichet         Parse dictionary of heterogen groups
   interface      Reads coordinate files and writes inter-chain contact
                  files
   psiblasts      Runs PSI-BLAST given scopalign alignments
   seqsort        Removes ambiguities from a set of hits resulting from a
                  database search
   siggen         Generates a sparse protein signature
   sigscan        Scans a sparse protein signature against swissprot
   
Author(s)

   This application was written by Jon Ison (jison@hgmp.mrc.ac.uk)
   
History

   Written (May 2001) - Jon Ison
   
Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.
   
Comments
