A metric model of amino acid substitution
- PMID: 14871874
- DOI: 10.1093/bioinformatics/bth065
Erratum in
- Bioinformatics. 2004 Dec 12;20(18):3716
Abstract
Motivation: We address the question of whether there exists an effective evolutionary model of amino-acid substitution that forms a metric-distance function. There is always a trade-off between speed and sensitivity among competing computational methods of determining sequence homology. A metric model of evolution is a prerequisite for the development of an entire class of fast sequence analysis algorithms that are both scalable, with O(log n) search cost, and sensitive.
Results: We have reworked the mathematics of the point accepted mutation model (PAM) by calculating the expected time between accepted mutations in lieu of calculating log-odds probabilities. The resulting substitution matrix (mPAM) forms a metric. We validate the application of the mPAM evolutionary model for sequence homology by executing sequence queries from a controlled yeast protein homology search benchmark. We compare the accuracy of the results of mPAM and PAM similarity matrices as well as three prior metric models. The experiment shows that mPAM significantly outperforms the other three metrics and sufficiently approaches the sensitivity of PAM250 to make it applicable to the management of protein sequence databases.
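To make the claim that a substitution matrix "forms a metric" concrete, the following is a minimal sketch of how the metric axioms (zero self-distance, positivity, symmetry, triangle inequality) could be checked for any square distance matrix. The function name `is_metric` and both toy matrices are assumptions made for illustration; they are not mPAM values and are not taken from the paper.

```python
from itertools import product


def is_metric(dist, tol=1e-9):
    """Return True if the square matrix `dist` satisfies the metric axioms.

    `dist` is a list of rows; `dist[i][j]` is the distance between
    symbols i and j (hypothetical layout chosen for this sketch).
    """
    n = len(dist)
    # Identity: every symbol is at distance zero from itself.
    for i in range(n):
        if abs(dist[i][i]) > tol:
            return False
    for i, j in product(range(n), repeat=2):
        # Positivity: distinct symbols are strictly separated.
        if i != j and dist[i][j] <= tol:
            return False
        # Symmetry: d(i, j) == d(j, i).
        if abs(dist[i][j] - dist[j][i]) > tol:
            return False
    # Triangle inequality: d(i, k) <= d(i, j) + d(j, k) for all triples.
    for i, j, k in product(range(n), repeat=3):
        if dist[i][k] > dist[i][j] + dist[j][k] + tol:
            return False
    return True


# Toy 3-symbol matrices (illustrative values only, not real substitution data).
toy_metric = [
    [0, 2, 3],
    [2, 0, 4],
    [3, 4, 0],
]
toy_non_metric = [
    [0, 1, 5],
    [1, 0, 1],
    [5, 1, 0],  # d(0, 2) = 5 exceeds d(0, 1) + d(1, 2) = 2
]

print(is_metric(toy_metric))
print(is_metric(toy_non_metric))
```

A log-odds similarity matrix such as PAM250 would fail the first check as written, since its self-scores are nonzero. The triangle inequality is the property that lets a metric index discard candidates without comparing them to the query, which is what makes sublinear search possible.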