BMC Bioinformatics. 2012 Sep 14;13:235.
doi: 10.1186/1471-2105-13-235.

Disentangling evolutionary signals: conservation, specificity determining positions and coevolution. Implication for catalytic residue prediction

Elin Teppa et al. BMC Bioinformatics. 2012.

Abstract

Background: A large panel of methods exists that aim to identify residues with a critical impact on protein function based on evolutionary signals, sequence and structure information. However, it is not clear to what extent these methods overlap, or whether any of them has higher predictive potential than the others, in particular for the identification of catalytic residues (CR) in proteins. Using a large set of enzymatic protein families and measures based on different evolutionary signals, we sought to break up the different components of the information content within a multiple sequence alignment (MSA) to investigate their predictive potential and degree of overlap.

Results: Our results demonstrate that the methods included in the benchmark can in general be divided into three groups with limited mutual overlap. The first group contains real-value Evolutionary Trace (rvET) methods and conservation, the second contains mutual information (MI) methods, and the third contains methods designed explicitly for the identification of specificity determining positions (SDPs): integer-value Evolutionary Trace (ivET), SDPfox, and XDET. For the prediction of CR, using a proximity score that integrates structural information (the sum of the scores of residues located within a given distance of the residue in question), we find that only the methods from the first two groups displayed a reliable performance. Next, we investigated to what degree proximity scores for conservation, rvET and cumulative MI (cMI) provide complementary information capable of improving the performance for CR identification. We found that integrating conservation with proximity scores for rvET and cMI achieved the highest performance. The proximity conservation score contained no complementary information when integrated with proximity rvET. Moreover, the signal from rvET provided only a limited gain in predictive performance when integrated with mutual information and conservation proximity scores. Combined, these observations demonstrate that the rvET and cMI scores add complementary information to the prediction system.
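
The proximity score described in the Results can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one 3D coordinate per residue (for example a C-alpha atom), a caller-supplied distance cutoff, and a hypothetical `include_self` switch, since the abstract does not say whether the residue's own score is part of the sum.

```python
import math


def proximity_scores(scores, coords, cutoff, include_self=True):
    """Sum per-residue scores over all residues within `cutoff` of each residue.

    scores: per-residue values from any method (conservation, rvET, cMI, ...).
    coords: one (x, y, z) tuple per residue; the choice of atom is an assumption.
    cutoff: distance threshold in the same units as coords (hypothetical value).
    include_self: whether a residue's own score counts (an assumption).
    """
    if len(scores) != len(coords):
        raise ValueError("scores and coords must have the same length")

    result = []
    for i, centre in enumerate(coords):
        total = 0.0
        for j, other in enumerate(coords):
            if i == j and not include_self:
                continue
            # Euclidean distance between the two residue positions.
            if math.dist(centre, other) <= cutoff:
                total += scores[j]
        result.append(total)
    return result


# Toy example with made-up coordinates and scores.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_scores = [1.0, 2.0, 4.0]
print(proximity_scores(toy_scores, toy_coords, cutoff=5.0))
```

In the toy example the first two residues lie within the cutoff of each other and therefore share a summed score, while the third residue is isolated and keeps only its own value.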

Conclusions: This work contributes to the understanding of the different signals of evolution and also shows that it is possible to improve the detection of catalytic residues by integrating structural and higher order sequence evolutionary information with sequence conservation.


Figures

Figure 1
Different column patterns in an MSA. Schematic representation of an MSA and its phylogenetic tree (left). The conserved position is highlighted in red, coevolved positions in green and orange, and putative SDPs in yellow and blue. The column pattern is indicated at the top, and the method suited to detecting each kind of position at the bottom (C: conservation score; cMI: cumulative MI; ivET: integer value ET; rvET: real value ET; XDET and SDPfox are also indicated).
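
To make the difference between a conserved column and a pair of coevolved columns concrete, the sketch below computes plain Shannon entropy and mutual information on a toy alignment. The alignment is invented for illustration, and these textbook formulas are an assumption: the abstract does not specify which conservation score or MI corrections the benchmarked methods use.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one MSA column; 0 means fully conserved."""
    n = len(column)
    counts = Counter(column)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two MSA columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in freq_ab.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


# Toy alignment (made up): column 0 is conserved, columns 1 and 2 covary.
toy_msa = ["ADK", "ADK", "AER", "AER"]
columns = ["".join(seq[i] for seq in toy_msa) for i in range(3)]

print(column_entropy(columns[0]))                   # conserved column
print(mutual_information(columns[1], columns[2]))   # covarying pair
print(mutual_information(columns[0], columns[1]))   # conserved vs. variable
```

A fully conserved column has zero entropy and therefore zero mutual information with every other column, which is one reason conservation and MI capture different signals.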
Figure 2
Heat map representation of the Spearman rank correlation coefficient between methods. cMI: cumulative MI, ivET: integer value evolutionary trace, rvET: real value evolutionary trace, cons: conservation. Numbers following the method names (100, 62 and 50) indicate the redundancy of the sequences in the MSA (100, 62 and 50% redundancy reduced). The dendrogram indicates the distance between methods. The correlation colour key goes from white (0, no correlation) to blue (1, perfect correlation). All correlations are statistically different from zero (T-test, p-value threshold of 0.05).
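
The pairwise comparison behind such a heat map can be sketched as a Spearman rank correlation between per-residue scores from two methods. The code below is a minimal standard-library version with average ranks for ties; the score vectors are invented, and the significance test and dendrogram clustering mentioned in the caption are not reproduced.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-residue scores for three methods (values are made up).
method_scores = {
    "cons": [0.1, 0.4, 0.3, 0.9],
    "rvET": [1.0, 5.0, 3.0, 10.0],
    "cMI": [7.0, 2.0, 4.0, 1.0],
}
names = list(method_scores)
for a in names:
    print(a, [round(spearman(method_scores[a], method_scores[b]), 2) for b in names])
```

Each printed row corresponds to one row of a correlation matrix; a plotting library could render that matrix as a heat map.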
