Review
Brief Bioinform. 2009 Jan;10(1):97-109.
doi: 10.1093/bib/bbn049. Epub 2008 Oct 29.

Models of coding sequence evolution

Wayne Delport et al. Brief Bioinform. 2009 Jan.

Abstract

Probabilistic models of sequence evolution are in widespread use in phylogenetics and molecular sequence evolution. These models have become increasingly sophisticated and, combined with statistical model comparison techniques, have helped to shed light on how genes and proteins evolve. Models of codon evolution have been particularly useful because, in addition to providing a significant improvement in model realism for protein-coding sequences, codon models can also be designed to test hypotheses about the selective pressures that shape the evolution of the sequences. Such models typically assume a phylogeny and can be used to identify sites or lineages that have evolved adaptively. Recently, some of the key assumptions that underlie phylogenetic tests of selection have been questioned, such as the assumption that the rate of synonymous changes is constant across sites or that a single phylogenetic tree can be assumed at all sites for recombining sequences. While some of these issues have been addressed through the development of novel methods, others remain as caveats that need to be considered on a case-by-case basis. Here, we outline the theory of codon models and their application to the detection of positive selection. We review some of the more recent developments that have improved their power and utility, laying a foundation for further advances in the modeling of coding sequence evolution.

Figures

Figure 1:
The relationship between the recombination rate and the rate of false inference of positive selection in an example data set consisting of 10 taxa (using a population-scaled substitution rate of μ = 3.6, and a population-scaled recombination rate varying from ρ = 0 to 0.1 with increments of 0.002). Recombination rates are measured as ρ = 2Nr, where N is the effective population size and r is the number of recombination events per inter-codon link per lineage per generation. Hence, a recombination rate of ρ = 0.004 means that a given inter-codon link at a given lineage experiences, on average, one recombination event every 500N generations.
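
The arithmetic behind the last sentence of the caption can be checked with a short sketch. It assumes only the caption's definition ρ = 2Nr; the function name `waiting_time_in_units_of_N` and the variable `rho_grid` are illustrative and do not come from the paper. Rearranging gives r = ρ/(2N), so the mean waiting time between recombination events at one inter-codon link on one lineage is 1/r = (2/ρ)N generations.

```python
from fractions import Fraction


def waiting_time_in_units_of_N(rho):
    """Mean number of generations between recombination events, divided by N.

    From rho = 2*N*r we get r = rho / (2*N), so the mean waiting time is
    1/r = (2/rho) * N generations. This returns the coefficient 2/rho.
    """
    rho = Fraction(rho)
    if rho <= 0:
        raise ValueError("rho must be positive for a finite waiting time")
    return 2 / rho


# Recombination rates described in the caption: 0 to 0.1 in steps of 0.002.
# Exact fractions avoid floating-point drift when stepping through the grid.
rho_grid = [Fraction(2 * k, 1000) for k in range(51)]

# The caption's worked case: rho = 0.004.
example = waiting_time_in_units_of_N(Fraction("0.004"))
print(f"rho = 0.004 -> one event every {example}N generations")
```

For ρ = 0.004 the coefficient is 2/0.004 = 500, which matches the caption's "one recombination event every 500N generations". The grid holds 51 values, and its first entry (ρ = 0) corresponds to no recombination, so the waiting time is undefined there.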
