Synonymous substitutions substantially improve evolutionary inference from highly diverged proteins
- PMID: 18570032
- DOI: 10.1080/10635150802158670
Abstract
Codon- and amino acid-substitution models are widely used for the evolutionary analysis of protein-coding DNA sequences. Codon models allow the amounts of both nonsynonymous and synonymous DNA substitutions to be estimated, and the ratio of these amounts represents the strength of selective pressure. Amino acid models estimate the amount of nonsynonymous substitutions but ignore synonymous substitutions. Although amino acid models lose all information about synonymous substitutions, they explicitly incorporate information on amino acid replacement that is empirically derived from databases. It is often presumed that when protein-coding sequences are highly divergent, synonymous substitutions may be saturated and the evolutionary analysis may be hampered by synonymous noise. However, no quantitative procedure exists to verify whether synonymous substitutions can be ignored; amino acid models have therefore been selected arbitrarily. In this study, we investigate the statistical comparison of codon- and amino acid-substitution models. For this purpose, we propose a new procedure to transform a 20-dimensional amino acid model into a 61-dimensional codon model. This transformation reveals that amino acid models belong to a subset of the codon models and enables us to use the likelihood ratio to test whether synonymous substitutions can be ignored. Our theoretical results and analyses of real data indicate that synonymous substitutions are very informative and substantially improve evolutionary inference, even when the sequences are highly divergent. Therefore, amino acid models should be adopted only after carefully investigating and discarding the possibility that synonymous substitutions can reveal important evolutionary information.
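The 61 and 20 state counts in the abstract, and the distinction between synonymous and nonsynonymous changes, can be illustrated with a minimal Python sketch. It assumes the standard nuclear genetic code (64 codons, 3 of them stop codons); the names `GENETIC_CODE`, `SENSE_CODONS`, `classify`, and `single_step_counts` are hypothetical helpers introduced here for illustration. This is not the transformation procedure proposed in the paper, only the codon-to-amino-acid mapping on which such a comparison rests.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
# "*" marks a stop codon. (Assumption: the standard nuclear code applies.)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The state space of a codon model: all codons except the stop codons.
SENSE_CODONS = [codon for codon, aa in GENETIC_CODE.items() if aa != "*"]


def classify(codon_a, codon_b):
    """Label a change between two sense codons by its effect on the protein."""
    aa_a, aa_b = GENETIC_CODE[codon_a], GENETIC_CODE[codon_b]
    if "*" in (aa_a, aa_b):
        raise ValueError("stop codons are outside the sense-codon state space")
    if codon_a == codon_b:
        return "identical"
    # Same amino acid: invisible to an amino acid model, visible to a codon model.
    return "synonymous" if aa_a == aa_b else "nonsynonymous"


def single_step_counts():
    """Count unordered sense-codon pairs that differ at exactly one position."""
    counts = {"synonymous": 0, "nonsynonymous": 0}
    for i, a in enumerate(SENSE_CODONS):
        for b in SENSE_CODONS[i + 1:]:
            if sum(x != y for x, y in zip(a, b)) == 1:
                counts[classify(a, b)] += 1
    return counts


if __name__ == "__main__":
    print(len(SENSE_CODONS), "sense codons")
    print(len(set(GENETIC_CODE.values()) - {"*"}), "amino acids")
    print(classify("TTT", "TTC"))  # both codons encode phenylalanine
    print(classify("TTT", "TTA"))  # phenylalanine versus leucine
    print(single_step_counts())
```

Collapsing the 61 sense codons onto their 20 amino acids discards every change that `classify` labels as synonymous, which is the information an amino acid model cannot use.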