Automatic methods for predicting functionally important residues
- PMID: 12589769
- DOI: 10.1016/s0022-2836(02)01451-1
Erratum in
- J Mol Biol. 2009 Mar;387(2):521. del Sol Mesa, Antonio [corrected to del Sol, Antonio]
Abstract
Sequence analysis is often the first guide for predicting residues in a protein family that may have functional significance. A few methods have been proposed that divide protein families into subfamilies in order to find positions that may be functionally significant for the whole family while also reflecting the specificity of each subfamily ("Tree-determinant residues"). However, many questions remain open, such as how best to divide a protein family into subfamilies and how to detect accurately the sequence variation patterns characteristic of different subfamilies. Here we present a systematic study of a significant number of protein families, testing the statistical significance of the Tree-determinant residues predicted by three different methods that represent the range of available approaches. The first method takes a phylogenetic representation of a protein family as its starting point and, following the principle of Relative Entropy from Information Theory, automatically searches for the optimal division of the family into subfamilies. The second method looks for positions whose mutational behavior resembles that of the full-length proteins, by directly comparing the corresponding distance matrices. The third method automates the analysis of how sequences and amino acid positions are distributed in the corresponding multidimensional spaces, using a vector-based principal component analysis. The three methods were tested on two non-redundant lists of protein families: one composed of proteins that bind a variety of ligand groups, and the other composed of proteins with annotated functionally relevant sites. In most cases, the residues predicted by the three methods show a clear tendency to lie close to bound ligands of biological relevance and to amino acids described as participating in key aspects of protein function. These three automatic methods give biologists a wide range of options for analyzing their families of interest, in a way similar to the analysis presented here for the family of proteins related to ras-p21.
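To make the idea behind the second method concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract states only that per-position and full-length distance matrices are compared directly; the identity-based distance (0 for a match, 1 for a mismatch), the use of Pearson correlation as the comparison score, the function names `pearson` and `position_scores`, and the toy alignment are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def position_scores(alignment):
    """Score each column by how well its pairwise distances track
    the pairwise distances of the full-length sequences."""
    pairs = list(combinations(range(len(alignment)), 2))
    length = len(alignment[0])
    # Assumed whole-sequence distance: fraction of mismatched columns.
    global_dist = [
        sum(a != b for a, b in zip(alignment[i], alignment[j])) / length
        for i, j in pairs
    ]
    scores = []
    for col in range(length):
        # Assumed per-position distance: 0 if residues match, 1 otherwise.
        local_dist = [
            float(alignment[i][col] != alignment[j][col]) for i, j in pairs
        ]
        scores.append(pearson(local_dist, global_dist))
    return scores


# Hypothetical toy alignment: two subfamilies of three sequences each.
toy_alignment = [
    "MKAGLV",
    "MKAGIV",
    "MKAGLI",
    "MRSGLT",
    "MRSGIT",
    "MRSGLS",
]
scores = position_scores(toy_alignment)
for col, score in enumerate(scores, start=1):
    print(f"column {col}: r = {score:.2f}")
```

In this toy alignment, columns 2 and 3 separate the two subfamilies exactly and receive the highest scores, while the fully conserved columns 1 and 4 score zero by the convention chosen in `pearson`. A real analysis would likely use a substitution-matrix distance and a curated alignment rather than these simplifications.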