Automatic methods for predicting functionally important residues

Antonio del Sol¹, Florencio Pazos, Alfonso Valencia

PMID: 12589769
DOI: 10.1016/s0022-2836(02)01451-1

Free article

J Mol Biol. 2003 Feb 28;326(4):1289-302.

Affiliation

¹ Protein Design Group, National Center for Biotechnology, Cantoblanco, Madrid 28049, Spain.

Erratum in

J Mol Biol. 2009 Mar;387(2):521. del Sol Mesa, Antonio [corrected to del Sol, Antonio]

Abstract

Sequence analysis is often the first guide for the prediction of residues in a protein family that may have functional significance. A few methods have been proposed that use the division of protein families into subfamilies to search for positions that could have some functional significance for the whole family but that at the same time exhibit the specificity of each subfamily ("Tree-determinant residues"). However, many questions remain unsolved, such as the best division of a protein family into subfamilies or the accurate detection of sequence variation patterns characteristic of different subfamilies. Here we present a systematic study of a significant number of protein families, testing the statistical meaning of the Tree-determinant residues predicted by three different methods that represent the range of available approaches. The first method takes as a starting point a phylogenetic representation of a protein family and, following the principle of Relative Entropy from Information Theory, automatically searches for the optimal division of the family into subfamilies. The second method looks for positions whose mutational behavior is reminiscent of the mutational behavior of the full-length proteins, by directly comparing the corresponding distance matrices. The third method is an automation of the analysis of the distribution of sequences and amino acid positions in the corresponding multidimensional spaces using a vector-based principal component analysis. These three methods have been tested on two non-redundant lists of protein families: one composed of proteins that bind a variety of ligand groups, and the other composed of proteins with annotated functionally relevant sites. In most cases, the residues predicted by the three methods show a clear tendency to be close to bound ligands of biological relevance and to those amino acids described as participants in key aspects of protein function. These three automatic methods provide a wide range of possibilities for biologists to analyze their families of interest, in a way similar to the one presented here for the family of proteins related to ras-p21.
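
The idea behind the second method (comparing a per-position distance matrix with the distance matrix of the full-length proteins) can be illustrated with a minimal sketch. This is not the authors' implementation: the binary mismatch distance, the Pearson correlation, the function names `pearson` and `position_scores`, and the toy alignment are all assumptions made for illustration, and the published method may use different distance measures and statistics.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # e.g. a fully conserved column carries no signal
    return sxy / sqrt(sxx * syy)


def position_scores(alignment):
    """Score each alignment column by how well its pairwise distances
    track the pairwise distances of the full-length sequences.

    Assumption: distance between two residues is 0 if identical, 1 otherwise;
    distance between two sequences is the fraction of mismatched columns.
    """
    pairs = list(combinations(range(len(alignment)), 2))
    length = len(alignment[0])
    # Flattened upper triangle of the whole-protein distance matrix.
    full = [
        sum(a != b for a, b in zip(alignment[i], alignment[j])) / length
        for i, j in pairs
    ]
    scores = []
    for col in range(length):
        # Flattened upper triangle of the distance matrix for this column.
        pos = [float(alignment[i][col] != alignment[j][col]) for i, j in pairs]
        scores.append(pearson(pos, full))
    return scores


# Hypothetical toy alignment: two subfamilies of two sequences each.
# Columns 1 and 3 differ between subfamilies; column 4 varies within them.
toy = ["AKLGD", "AKLGE", "ARLHD", "ARLHE"]
scores = position_scores(toy)
for col, score in enumerate(scores):
    print(col, round(score, 3))
```

In this toy case the columns that separate the two subfamilies (1 and 3) receive the highest scores, while conserved columns and the column that varies independently of the subfamily split score zero. On a real family the alignment would come from a multiple sequence alignment file, and the scoring scheme would need to be chosen with care.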
