Detecting coevolving positions in a molecule: why and how to account for phylogeny
- PMID: 21949241
- DOI: 10.1093/bib/bbr048
Abstract
Positions in a molecule that share a common constraint do not evolve independently, and therefore leave a signature in the patterns of homologous sequences. Identifying such coevolving positions from a sequence alignment has great potential for predicting functional and structural properties of molecules through comparative analysis. This task is complicated by additional sources of correlation, which lead to false predictions. The nature of the data is a major source of spurious correlation: sequences are sampled from individuals with different degrees of relatedness and are therefore intrinsically correlated. This problem has led to several methodological developments in different fields, which can be confusing for non-expert users interested in these methods. It also explains why coevolution detection methods remain little used despite the importance of the biological questions they address. In this article, I focus on the role of shared ancestry in understanding molecular coevolution patterns. I review and classify existing coevolution detection methods according to their ability to handle shared ancestry. Using a ribosomal RNA benchmark data set, for which detailed knowledge of the structure and coevolution patterns is available, I demonstrate and explain why taking the underlying evolutionary history of sequences into account is the only way to extract the full coevolution signal in the data. I also evaluate, using rigorous statistical procedures, the best approaches to do so, and discuss several important biological aspects to consider when performing coevolution analyses.
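To make the shared-ancestry problem concrete, the sketch below computes the mutual information between two alignment columns, a common tree-unaware covariation statistic. It is an illustration only and not necessarily one of the methods evaluated in the article. The function name `column_mutual_information` and the toy columns are assumptions introduced here.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Hypothetical helper for illustration: each column is a string with one
    character per sequence. The statistic treats sequences as independent
    samples, which is the assumption that shared ancestry violates.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy data (made up): two clades of four closely related sequences each.
# One paired substitution on the branch separating the clades is enough to
# produce this pattern, yet the statistic counts it as eight observations.
clade_pattern_a = "GGGGCCCC"
clade_pattern_b = "CCCCGGGG"
mi_clades = column_mutual_information(clade_pattern_a, clade_pattern_b)

# Columns whose states are statistically independent across sequences.
mi_independent = column_mutual_information("GGCC", "GCGC")
```

In the toy example the first pair reaches the maximum score for two equally frequent states, although the pattern could arise from a single paired change. A score of this kind cannot separate repeated independent co-substitutions from one inherited event, which is the motivation for accounting for the phylogeny.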