H2rs: deducing evolutionary and functionally important residue positions by means of an entropy and similarity based analysis of multiple sequence alignments
- PMID: 24766829
- PMCID: PMC4021312
- DOI: 10.1186/1471-2105-15-118
Abstract
Background: Identifying functionally important residue positions is a key task of computational biology. Methods of correlation analysis identify pairs of residue positions whose occupancy is mutually dependent due to constraints imposed by protein structure or function. A common measure of these dependencies is the mutual information, which is based on Shannon's information theory and uses probabilities only. Consequently, such approaches do not consider the similarity of residue pairs, which may degrade an algorithm's performance. One typical algorithm is H2r, which characterizes each individual residue position k by its conn(k)-value, the number of significantly correlated pairs the position belongs to.
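The idea of scoring column pairs and counting, per position, the pairs that pass a cutoff can be illustrated with a minimal sketch. It assumes plain Shannon mutual information and a fixed, hypothetical `threshold`; it is not the scoring or significance criterion actually used by H2r, and the function names are chosen here for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import log2


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(col_k, col_l):
    """I(k;l) = H(k) + H(l) - H(k,l) for two alignment columns."""
    pairs = list(zip(col_k, col_l))
    return shannon_entropy(col_k) + shannon_entropy(col_l) - shannon_entropy(pairs)


def conn_counts(msa, threshold):
    """For each column k, count the column pairs (k, l) scoring >= threshold.

    `msa` is a list of equal-length aligned sequences. The fixed
    threshold is an assumption standing in for a real significance test.
    """
    length = len(msa[0])
    columns = [[seq[k] for seq in msa] for k in range(length)]
    conn = [0] * length
    for k, l in combinations(range(length), 2):
        if mutual_information(columns[k], columns[l]) >= threshold:
            conn[k] += 1
            conn[l] += 1
    return conn


if __name__ == "__main__":
    toy_msa = ["AAC", "AAC", "GGC", "GGC"]  # columns 0 and 1 co-vary
    print(conn_counts(toy_msa, threshold=0.5))
```

In the toy alignment, the first two columns vary together while the third is invariant, so only the first two positions receive a nonzero count. Because the score depends on symbol frequencies alone, a conservative and a radical substitution are treated identically, which is the limitation the abstract describes.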
Results: To improve the specificity of H2r, we developed a revised algorithm, named H2rs, which is based on the von Neumann entropy (vNE). Computing the corresponding mutual information requires a matrix A that assesses the similarity of residue pairs. We determined A by deducing substitution frequencies from contacting residue pairs observed in the homologs of 35 809 proteins of known structure. In analogy to H2r, the enhanced algorithm computes a normalized conn(k)-value. Within the framework of H2rs, only statistically significant vNE values were considered. To decide on significance, the algorithm calculates a p-value by performing a randomization test for each individual pair of residue positions. The analysis of a large in silico testbed demonstrated that specificity and precision were higher for H2rs than for H2r and two other methods of correlation analysis. The gain in prediction quality was further confirmed by a detailed assessment of five well-studied enzymes. The outcome of H2rs and of a method that predicts contacting residue positions (PSICOV) overlapped only marginally. H2rs can be downloaded from http://www-bioinf.uni-regensburg.de.
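A per-pair randomization test of the kind mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the H2rs implementation: Shannon mutual information stands in for the vNE-based score, the null distribution is generated by shuffling one column, and `n_shuffles`, `seed`, and the function names are hypothetical.

```python
import random
from collections import Counter
from math import log2


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(col_k, col_l):
    """I(k;l) = H(k) + H(l) - H(k,l); a stand-in for the vNE-based score."""
    pairs = list(zip(col_k, col_l))
    return shannon_entropy(col_k) + shannon_entropy(col_l) - shannon_entropy(pairs)


def randomization_p_value(col_k, col_l, n_shuffles=1000, seed=0):
    """Estimate a p-value for the observed score of one column pair.

    Column l is shuffled repeatedly, which destroys any pairing with
    column k while preserving both columns' residue frequencies. The
    p-value is the fraction of shuffles scoring at least as high as the
    observed pair, with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = mutual_information(col_k, col_l)
    shuffled = list(col_l)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point round-off.
        if mutual_information(col_k, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    col_a = list("AAAAAAGGGGGG")
    col_b = list("LLLLLLVVVVVV")  # perfectly coupled to col_a
    print(randomization_p_value(col_a, col_b))
```

A pair would then be counted toward conn(k) only if its p-value falls below a chosen significance level; the level itself is not stated in the abstract.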
Conclusions: Considering substitution frequencies for residue pairs by means of the von Neumann entropy and a p-value improved the success rate in identifying important residue positions. The integration of proven statistical concepts and normalization allows for an easier comparison of results obtained with different proteins. Comparing the outcome of the local method H2rs and of the global method PSICOV indicates that such methods supplement each other and have different scopes of application.
Similar articles
- H2r: identification of evolutionary important residues by means of an entropy based analysis of multiple sequence alignments. BMC Bioinformatics. 2008 Mar 18;9:151. doi: 10.1186/1471-2105-9-151. PMID: 18366663. Free PMC article.
- CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure. BMC Bioinformatics. 2012 Apr 5;13:55. doi: 10.1186/1471-2105-13-55. PMID: 22480135. Free PMC article.
- Experimental assessment of the importance of amino acid positions identified by an entropy-based correlation analysis of multiple-sequence alignments. Biochemistry. 2012 Jul 17;51(28):5633-41. doi: 10.1021/bi300747r. Epub 2012 Jul 6. PMID: 22737967.
- Estimating residue evolutionary conservation by introducing von Neumann entropy and a novel gap-treating approach. Amino Acids. 2008 Aug;35(2):495-501. doi: 10.1007/s00726-007-0586-0. Epub 2007 Aug 21. PMID: 17710364. Free PMC article.
- Inter-residue, inter-protein and inter-family coevolution: bridging the scales. Curr Opin Struct Biol. 2018 Jun;50:26-32. doi: 10.1016/j.sbi.2017.10.014. Epub 2017 Nov 5. PMID: 29101847. Free PMC article. Review.
Cited by
- Molecular dynamics and structure function analysis show that substrate binding and specificity are major forces in the functional diversification of Eqolisins. BMC Bioinformatics. 2018 Sep 24;19(1):338. doi: 10.1186/s12859-018-2348-2. PMID: 30249179. Free PMC article.
- Inferring joint sequence-structural determinants of protein functional specificity. Elife. 2018 Jan 16;7:e29880. doi: 10.7554/eLife.29880. PMID: 29336305. Free PMC article.
- A Single Mutation Increases the Thermostability and Activity of Aspergillus terreus Amine Transaminase. Molecules. 2019 Mar 27;24(7):1194. doi: 10.3390/molecules24071194. PMID: 30934681. Free PMC article.
- Deep Analysis of Residue Constraints (DARC): identifying determinants of protein functional specificity. Sci Rep. 2020 Feb 3;10(1):1691. doi: 10.1038/s41598-019-55118-6. PMID: 32015389. Free PMC article.
- Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations. PLoS Comput Biol. 2016 Dec 21;12(12):e1005294. doi: 10.1371/journal.pcbi.1005294. eCollection 2016 Dec. PMID: 28002465. Free PMC article.