
                              EMBOSS: mwfilter
     _________________________________________________________________
   
                               Program mwfilter
                                       
Function

   Filter noisy molwts from mass spec output
   
Description

   mwfilter is designed to remove unwanted (noisy) data from mass
   spectrometry output in proteomics. Given a list of molecular weights
   this program removes those which are:
   
     * Contaminating trypsin or keratin
     * Modified oxy-methionine or oxy-threonine
     * Peaks associated with sodium ions.
       
   The last two operations can be done as most peaks are reported in both
   modified and unmodified forms. Removal of modified peaks aids in
   database searching for protein identification.
   
Usage

   Here is a sample session with mwfilter:

% mwfilter
Filter noisy molwts from mass spec output
Input file: molwts.dat
ppm tolerance [50.0]:
Output file [molwts.mwfilter]:

Command line arguments

   Mandatory qualifiers:
  [-infile]            infile     Molecular weight file input
   -tolerance          float      ppm tolerance
  [-outfile]           outfile    Output file name

   Optional qualifiers: (none)
   Advanced qualifiers:
   -datafile           string     Data file of noisy molecular weights

   General qualifiers:
  -help                bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   

   Mandatory qualifiers Allowed values Default
   [-infile]
   (Parameter 1) Molecular weight file input Input file Required
   -tolerance ppm tolerance Any integer value 50.0
   [-outfile]
   (Parameter 2) Output file name Output file <sequence>.mwfilter
   Optional qualifiers Allowed values Default
   (none)
   Advanced qualifiers Allowed values Default
   -datafile Data file of noisy molecular weights Any string is accepted
   Emwfilter.dat
   
Input file format

   The input file is a simple list of the experimental molecular weights.
   There should be one weight per line.
   
   Comments in the data file start with a '#' character in the first
   column.
   Blank lines are ignored.
   
   An example of the input data follows:
  __________________________________________________________________________

874.364756
927.450380
1045.572
1068.397129
1121.431124
1163.584593
1305.660840
1428.596448
1479.748341
1502.549157
1554.591658
1567.686209
1576.708354
1639.868056
1748.611920
1753.745298
1880.841178
2383.99
  __________________________________________________________________________

Output file format

   The output is a list of the molecular weights with the unwanted
   (noisy) data removed.
   
   An example of the output data follows:
  __________________________________________________________________________

874.364756
927.450380
1068.397129
1121.431124
1163.584593
1305.660840
1428.596448
1479.748341
1502.549157
1554.591658
1567.686209
1576.708354
1639.868056
1748.611920
1753.745298
  __________________________________________________________________________

Data files

   The program reads the data file Emwfilter.dat for the molecular
   weights of items to be deleted from the experimental data.
   
   EMBOSS data files are distributed with the application and stored in
   the standard EMBOSS data directory, which is defined by the EMBOSS
   environment variable EMBOSS_DATA.
   
   To see the available EMBOSS data files, run:
   
% embossdata -showall

   To fetch one of the data files (for example 'Exxx.dat') into your
   current directory for you to inspect or modify, run:

% embossdata -fetch -file Exxx.dat

   Users can provide their own data files in their own directories.
   Project specific files can be put in the current directory, or for
   tidier directory listings in a subdirectory called ".embossdata".
   Files for all EMBOSS runs can be put in the user's home directory, or
   again in a subdirectory called ".embossdata".
   
   The directories are searched in the following order:
     * . (your current directory)
     * .embossdata (under your current directory)
     * ~/ (your home directory)
     * ~/.embossdata
       
Notes

   None.
   
References

   None.
   
Warnings

   None.
   
Diagnostic Error Messages

   None.
   
Exit status

   It always exits with status 0.
   
Known bugs

   None.
   
See also

   Program name Description
   backtranseq Back translate a protein sequence
   charge Protein charge plot
   checktrans Reports STOP codons and ORF statistics of a protein
   sequence
   compseq Counts the composition of dimer/trimer/etc words in a sequence
   emowse Protein identification by mass spectrometry
   freak Residue/base frequency table or plot
   iep Calculates the isoelectric point of a protein
   octanol Displays protein hydropathy
   pepinfo Plots simple amino acid properties in parallel
   pepstats Protein statistics
   pepwindow Displays protein hydropathy
   pepwindowall Displays protein hydropathy of a set of sequences
   
Author(s)

   This application was written by Alan Bleasby (ableasby@hgmp.mrc.ac.uk)
   
History

   Written (Jan 2002) - Alan Bleasby.
   
Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.
   
Comments
