EMBOSS: pepstats
DayhoffStat is the amino acid's molar percent divided by its Dayhoff statistic. The Dayhoff statistic is the amino acid's relative occurrence per 1000 aa normalised to 100 by rls@ebi.ac.uk (original work from 1993).
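The ratio can be checked against the LACI_ECOLI report shown on this page, where the DayhoffStat column equals Mole% divided by the Dayhoff statistic. The sketch below is only an illustration: the function names are hypothetical, and the four Dayhoff values are an assumption, back-calculated from that report (Mole% / DayhoffStat) rather than read from an EMBOSS data file.

```python
# Residue counts for four residues, copied from the LACI_ECOLI report.
COUNTS = {"A": 44, "C": 3, "H": 7, "M": 10}
SEQUENCE_LENGTH = 360

# Dayhoff statistics (relative occurrence normalised to 100).
# Assumption: back-calculated from the report as Mole% / DayhoffStat.
DAYHOFF = {"A": 8.6, "C": 2.9, "H": 2.0, "M": 1.7}


def mole_percent(count, length):
    """Percentage of the sequence made up of this residue."""
    return 100.0 * count / length


def dayhoff_stat(count, length, dayhoff):
    """Mole% divided by the residue's Dayhoff statistic."""
    return mole_percent(count, length) / dayhoff


for residue, count in COUNTS.items():
    percent = mole_percent(count, SEQUENCE_LENGTH)
    stat = dayhoff_stat(count, SEQUENCE_LENGTH, DAYHOFF[residue])
    print(f"{residue} {count:4d} {percent:7.3f} {stat:6.3f}")
```

A value above 1 means the residue is more frequent in this sequence than the Dayhoff reference frequency; a value below 1 means it is less frequent.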
% pepstats
Protein statistics
Input sequence: sw:laci_ecoli
Output file [laci_ecoli.pepstats]:
Mandatory qualifiers:
[-sequencea] sequence Sequence USA
-outfile outfile Output file name
Optional qualifiers: (none)
Advanced qualifiers:
-[no]termini bool Include charge at N and C terminus
-aadata string Molecular weight data for amino acids
General qualifiers:
-help bool report command line options. More
information on associated and general
qualifiers can be found with -help -verbose

| Qualifier | Description | Allowed values | Default |
|---|---|---|---|
| Mandatory qualifiers | | | |
| [-sequencea] (Parameter 1) | Sequence USA | Readable sequence | Required |
| -outfile | Output file name | Output file | <sequence>.pepstats |
| Optional qualifiers | | | |
| (none) | | | |
| Advanced qualifiers | | | |
| -[no]termini | Include charge at N and C terminus | Yes/No | Yes |
| -aadata | Molecular weight data for amino acids | Any string is accepted | Eamino.dat |
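
The qualifiers above can be assembled into a command line from a script. The sketch below only builds the argument list and does not run anything. The helper name `build_pepstats_command` is hypothetical, and the qualifier spellings are taken from this page, so they may differ in other EMBOSS releases.

```python
import shlex


def build_pepstats_command(sequence, outfile, termini=True, aadata=None):
    """Return a pepstats argument list using the qualifiers documented on this page."""
    command = ["pepstats", "-sequencea", sequence, "-outfile", outfile]
    if not termini:
        # -[no]termini defaults to Yes, so only the negated form is needed.
        command.append("-notermini")
    if aadata is not None:
        # Alternative molecular weight data file (the default is Eamino.dat).
        command += ["-aadata", aadata]
    return command


command = build_pepstats_command(
    "sw:laci_ecoli", "laci_ecoli.pepstats", termini=False
)
print(shlex.join(command))
```

On a machine with EMBOSS installed, the list could be passed to `subprocess.run(command, check=True)`. Keeping the arguments as a list avoids shell quoting problems with unusual sequence names.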
PEPSTATS of LACI_ECOLI from 1 to 360

Molecular weight = 38563.98
Residues = 360
Average Residue Weight = 107.122
Charge = 1.5
Isoelectric Point = 6.8820

| Residue | Number | Mole% | DayhoffStat |
|---|---|---|---|
| A = Ala | 44 | 12.222 | 1.421 |
| B = Asx | 0 | 0.000 | 0.000 |
| C = Cys | 3 | 0.833 | 0.287 |
| D = Asp | 17 | 4.722 | 0.859 |
| E = Glu | 15 | 4.167 | 0.694 |
| F = Phe | 4 | 1.111 | 0.309 |
| G = Gly | 22 | 6.111 | 0.728 |
| H = His | 7 | 1.944 | 0.972 |
| I = Ile | 18 | 5.000 | 1.111 |
| K = Lys | 11 | 3.056 | 0.463 |
| L = Leu | 40 | 11.111 | 1.502 |
| M = Met | 10 | 2.778 | 1.634 |
| N = Asn | 12 | 3.333 | 0.775 |
| P = Pro | 14 | 3.889 | 0.748 |
| Q = Gln | 28 | 7.778 | 1.994 |
| R = Arg | 19 | 5.278 | 1.077 |
| S = Ser | 33 | 9.167 | 1.310 |
| T = Thr | 19 | 5.278 | 0.865 |
| V = Val | 34 | 9.444 | 1.431 |
| W = Trp | 2 | 0.556 | 0.427 |
| X = Xxx | 0 | 0.000 | 0.000 |
| Y = Tyr | 8 | 2.222 | 0.654 |
| Z = Glx | 0 | 0.000 | 0.000 |

| Property | Residues | Number | Mole% |
|---|---|---|---|
| Tiny | (A+C+G+S+T) | 121 | 33.611 |
| Small | (A+B+C+D+G+N+P+S+T+V) | 198 | 55.000 |
| Aliphatic | (I+L+V) | 92 | 25.556 |
| Aromatic | (F+H+W+Y) | 21 | 5.833 |
| Non-polar | (A+C+F+G+I+L+M+P+V+W+Y) | 199 | 55.278 |
| Polar | (D+E+H+K+N+Q+R+S+T+Z) | 161 | 44.722 |
| Charged | (B+D+E+H+K+R+Z) | 69 | 19.167 |
| Basic | (H+K+R) | 37 | 10.278 |
| Acidic | (B+D+E+Z) | 32 | 8.889 |
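
The property rows of the report are sums over the per-residue counts, expressed as a percentage of the 360 residues. The sketch below recomputes them from the counts and class definitions in the report above. It is an illustration of the arithmetic, not pepstats' own code, and `class_summary` is a hypothetical name.

```python
# Per-residue counts from the LACI_ECOLI report.
COUNTS = {
    "A": 44, "B": 0, "C": 3, "D": 17, "E": 15, "F": 4, "G": 22, "H": 7,
    "I": 18, "K": 11, "L": 40, "M": 10, "N": 12, "P": 14, "Q": 28,
    "R": 19, "S": 33, "T": 19, "V": 34, "W": 2, "X": 0, "Y": 8, "Z": 0,
}

# Property classes exactly as listed in the report.
PROPERTY_CLASSES = {
    "Tiny": "ACGST",
    "Small": "ABCDGNPSTV",
    "Aliphatic": "ILV",
    "Aromatic": "FHWY",
    "Non-polar": "ACFGILMPVWY",
    "Polar": "DEHKNQRSTZ",
    "Charged": "BDEHKRZ",
    "Basic": "HKR",
    "Acidic": "BDEZ",
}


def class_summary(counts, classes):
    """Map each property class to (residue count, mole percent)."""
    total = sum(counts.values())
    summary = {}
    for name, residues in classes.items():
        number = sum(counts[residue] for residue in residues)
        summary[name] = (number, 100.0 * number / total)
    return summary


for name, (number, percent) in class_summary(COUNTS, PROPERTY_CLASSES).items():
    print(f"{name:<10} {number:>4} {percent:7.3f}")
```

The classes overlap (histidine, for example, is counted as aromatic, polar, charged and basic), so the percentages are not expected to add up to 100.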
EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.
Users can provide their own data files in their own directories. Project-specific files can be put in the current directory or, for tidier directory listings, in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory or, again, in a subdirectory called ".embossdata".
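The directories are searched in the following order: the current directory, ".embossdata" under the current directory, the user's home directory, and ".embossdata" under the home directory.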
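
The lookup described above can be sketched in Python. This is an illustration, not EMBOSS's own code: the function name `find_data_file` is hypothetical, and the search order used here (current directory, its ".embossdata", home directory, its ".embossdata", and finally the directory named by EMBOSS_DATA) is an assumption that follows the order in which the text names these locations.

```python
import os
from pathlib import Path


def find_data_file(name, cwd=None, home=None, environ=None):
    """Return the first existing copy of a data file, or None if not found."""
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    environ = os.environ if environ is None else environ

    # User-supplied locations first (assumed order, see the note above).
    candidates = [cwd, cwd / ".embossdata", home, home / ".embossdata"]

    # Fall back to the standard data directory if EMBOSS_DATA is set.
    emboss_data = environ.get("EMBOSS_DATA")
    if emboss_data:
        candidates.append(Path(emboss_data))

    for directory in candidates:
        path = directory / name
        if path.is_file():
            return path
    return None


# Eamino.dat is the default -aadata file named on this page.
print(find_data_file("Eamino.dat"))
```

Because the first match wins, a project-level copy of a file shadows a copy kept in the home directory or in the standard data directory.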
| Program name | Description |
|---|---|
| backtranseq | Back translate a protein sequence |
| charge | Protein charge plot |
| checktrans | Reports STOP codons and ORF statistics of a protein sequence |
| compseq | Counts the composition of dimer/trimer/etc words in a sequence |
| emowse | Protein identification by mass spectrometry |
| freak | Residue/base frequency table or plot |
| iep | Calculates the isoelectric point of a protein |
| mwfilter | Filter noisy molwts from mass spec output |
| octanol | Displays protein hydropathy |
| pepinfo | Plots simple amino acid properties in parallel |
| pepwindow | Displays protein hydropathy |
| pepwindowall | Displays protein hydropathy of a set of sequences |