Prediction of contact residue pairs based on co-substitution between sites in protein structures
- PMID: 23342110
- PMCID: PMC3546969
- DOI: 10.1371/journal.pone.0054252
Abstract
Residue-residue interactions that fold a protein into a unique three-dimensional structure and enable it to perform a specific function impose structural and functional constraints, to varying degrees, on each residue site. Selective constraints on residue sites are recorded in the amino acid order of homologous sequences and also in the evolutionary trace of amino acid substitutions. A challenge is to extract direct dependences between residue sites by removing phylogenetic correlations and indirect dependences through other residues within a protein or even through other molecules. The rapid growth in the number of protein families with unknown folds calls for an accurate de novo method of protein structure prediction. Recent attempts to disentangle direct from indirect dependences of amino acid types between residue positions in multiple sequence alignments have revealed that inferred residue-residue proximities can be sufficient information to predict a protein fold without the use of known three-dimensional structures. Here, we propose an alternative method of inferring coevolving site pairs from concurrent and compensatory substitutions between sites in each branch of a phylogenetic tree. Substitution probability and physico-chemical changes (volume, charge, hydrogen-bonding capability, and others) accompanying substitutions at each site in each branch of a phylogenetic tree are estimated with the likelihood of each substitution, and their direct correlations between sites are used to detect concurrent and compensatory substitutions. To extract direct dependences between sites, partial correlation coefficients of the characteristic changes along branches between sites, in which linear multiple dependences on feature vectors at other sites are removed, are calculated and used to rank coevolving site pairs. For 15 protein families taken from Pfam release 26.0, the accuracy of contact prediction based on the present coevolution score is comparable to that achieved by a maximum entropy model of protein sequences. This accuracy also indicates that compensatory substitutions are significant in protein evolution.
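The role of partial correlation in separating direct from indirect dependences can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single scalar feature per site per branch (the abstract describes feature vectors), uses synthetic data, and the names `partial_correlations`, `demo`, and the `ridge` parameter are hypothetical. Partial correlations are obtained from the inverse of the correlation matrix, a standard identity.

```python
import numpy as np


def partial_correlations(features, ridge=1e-6):
    """Return (correlation, partial correlation) matrices between sites.

    features: array of shape (n_branches, n_sites); each column holds one
    scalar characteristic change per branch for a site (an assumption made
    for this sketch only).
    ridge: small diagonal term added for numerical stability (hypothetical).
    """
    x = np.asarray(features, dtype=float)
    corr = np.corrcoef(x, rowvar=False)            # plain pairwise correlations
    precision = np.linalg.inv(corr + ridge * np.eye(corr.shape[0]))
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)            # rho_ij given all other sites
    np.fill_diagonal(pcorr, 1.0)
    return corr, pcorr


def demo(n_branches=5000, seed=0):
    """Synthetic chain 0 -> 1 -> 2: sites 0 and 2 depend only through site 1."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(size=n_branches)
    x1 = x0 + rng.normal(size=n_branches)
    x2 = x1 + rng.normal(size=n_branches)
    return partial_correlations(np.column_stack([x0, x1, x2]))


if __name__ == "__main__":
    corr, pcorr = demo()
    print("correlation(0, 2):        ", round(corr[0, 2], 3))
    print("partial correlation(0, 2):", round(pcorr[0, 2], 3))
```

In this toy chain, sites 0 and 2 are clearly correlated, yet their partial correlation is close to zero because their dependence runs entirely through site 1. Ranking pairs by partial rather than plain correlation therefore favors the directly coupled pairs (0, 1) and (1, 2).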