PLoS One. 2007 Oct 24;2(10):e1065.
doi: 10.1371/journal.pone.0001065.

Ancestral inference and the study of codon bias evolution: implications for molecular evolutionary analyses of the Drosophila melanogaster subgroup

Hiroshi Akashi et al. PLoS One.

Abstract

Reliable inference of ancestral sequences can be critical to identifying both patterns and causes of molecular evolution. Robustness of ancestral inference is often assumed among closely related species, but tests of this assumption have been limited. Here, we examine the performance of inference methods for data simulated under scenarios of codon bias evolution within the Drosophila melanogaster subgroup. Genome sequence data for multiple, closely related species within this subgroup make it an important system for studying molecular evolutionary genetics. The effects of asymmetric and lineage-specific substitution rates (i.e., varying levels of codon usage bias and departures from equilibrium) on the reliability of ancestral codon usage inference were investigated. Maximum parsimony inference, which has been widely employed in analyses of Drosophila codon bias evolution, was compared to an approach that attempts to account for uncertainty in ancestral inference by weighting ancestral reconstructions by their posterior probabilities. The latter approach employs maximum likelihood estimation of rate and base composition parameters. For equilibrium and most non-equilibrium scenarios that were investigated, the probabilistic method appears to generate reliable ancestral codon bias inferences for molecular evolutionary studies within the D. melanogaster subgroup. These reconstructions are more reliable than parsimony inference, especially when codon usage is strongly skewed. However, inference biases are considerable for both methods under particular departures from stationarity (i.e., when adaptive evolution is prevalent). Reliability of inference can be sensitive to branch lengths, asymmetry in substitution rates, and the locations and nature of lineage-specific processes within a gene tree. Inference reliability, even among closely related species, can be strongly affected by (potentially unknown) patterns of molecular evolution in lineages ancestral to those of interest.
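
The contrast drawn in the abstract, between committing to one ancestral reconstruction and weighting reconstructions by their posterior probabilities, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Reconstruction` class, both function names, and the example probabilities are hypothetical, and picking the highest-posterior reconstruction is only a stand-in for any single-reconstruction approach (it is not maximum parsimony itself). The sketch also reads "up" as unpreferred-to-preferred and "pu" as preferred-to-unpreferred changes, which the abstract does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Reconstruction:
    """One candidate ancestral reconstruction for a lineage (hypothetical type)."""
    posterior: float  # posterior probability of this reconstruction
    n_up: int         # inferred unpreferred-to-preferred changes
    n_pu: int         # inferred preferred-to-unpreferred changes


def best_reconstruction_counts(recs):
    """Counts taken from the single most probable reconstruction only."""
    best = max(recs, key=lambda r: r.posterior)
    return float(best.n_up), float(best.n_pu)


def posterior_weighted_counts(recs):
    """Expected counts, weighting every reconstruction by its posterior."""
    total = sum(r.posterior for r in recs)
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    n_up = sum(r.posterior * r.n_up for r in recs) / total
    n_pu = sum(r.posterior * r.n_pu for r in recs) / total
    return n_up, n_pu


# Made-up posteriors for one codon site, for illustration only.
recs = [
    Reconstruction(posterior=0.50, n_up=0, n_pu=1),
    Reconstruction(posterior=0.25, n_up=1, n_pu=0),
    Reconstruction(posterior=0.25, n_up=0, n_pu=0),
]

print(best_reconstruction_counts(recs))  # (0.0, 1.0)
print(posterior_weighted_counts(recs))   # (0.25, 0.5)
```

In this toy case the single-reconstruction count attributes a full pu change to the site, whereas the weighted count spreads the evidence across alternatives in proportion to their support.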

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Silent divergence under departures from equilibrium major codon usage.
The x-axis shows MCU values prior to a change in N_e. A: expected instantaneous per-locus rates of pu and up fixations when N_e decreases to 1/3 of its original value across genes. B: expected d_up,pu = (#up − #pu)/(#up + #pu) for the 1/3N_e scenario (decreasing codon bias). C and D: expected instantaneous silent rates and d_up,pu after a doubling of N_e (increasing codon bias). Legends and x-axis scales apply to graphs in the same column. See text for details of the model. The curves assume that variation in selection coefficients underlies MCU variation among genes (u/v = 1 across genes).
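
The statistic in panel B, (#up − #pu)/(#up + #pu), can be computed directly from counts of the two classes of change. The following is a minimal Python sketch; the function name `d_up_pu` and the example counts are illustrative assumptions, not values from the paper.

```python
def d_up_pu(n_up: float, n_pu: float) -> float:
    """Skew between up and pu changes: (#up - #pu) / (#up + #pu).

    Ranges from -1 (only pu changes) to +1 (only up changes);
    0 means the two classes are balanced.
    """
    total = n_up + n_pu
    if total == 0:
        raise ValueError("undefined when there are no up or pu changes")
    return (n_up - n_pu) / total


print(d_up_pu(30, 30))  # 0.0  (balanced counts)
print(d_up_pu(10, 30))  # -0.5 (excess of pu changes)
```

Counts may be fractional, for example when they are expected values averaged over replicates or weighted by posterior probabilities.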
Figure 2. Synonymous distance trees for six Drosophila melanogaster subgroup species.
m, s, t, y, e, and o refer to D. melanogaster, D. simulans, D. teissieri, D. yakuba, D. erecta, and D. orena, respectively. The assumed tree topology ((m, s), ((t, y), (e, o))) is based on . Silent distances were calculated using CODEML and averaged across 22 genes. (ML) unrooted tree showing maximum likelihood distances under a codon-based substitution model. Equilibrium frequencies of each codon were calculated from the nucleotide frequencies at the three codon positions (F3x4). (NG) unrooted neighbor-joining tree based on Nei-Gojobori pairwise distances. Numbers shown on each branch are per-site synonymous distances ×1000. (1×) topology employed in 1× simulations. Mutation rates and numbers of generations per branch were set to give expected per-site silent divergence of 0.05 for the m, s, t, y, e, o, ty, and eo lineages and 0.075 and 0.025 for the ms and tyeo lineages, respectively, for equilibrium MCU = 0.7. Abbreviations for ancestral nodes are shown below and to the right of the nodes.
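
The branch lengths given for the simulation topology can be written down as a small rooted tree to see what they imply for pairs of species. The Python sketch below is illustrative only: the node name `root`, the `PARENT` table layout, and the function names are assumptions, and distances are simply summed along the path between two tips, which ignores multiple hits at a site. The values apply to the equilibrium MCU = 0.7 case described in the caption.

```python
# Child -> (parent, expected per-site silent divergence on that branch).
# "root" is a hypothetical label for the node joining the ms and tyeo lineages.
PARENT = {
    "m": ("ms", 0.05), "s": ("ms", 0.05),
    "t": ("ty", 0.05), "y": ("ty", 0.05),
    "e": ("eo", 0.05), "o": ("eo", 0.05),
    "ty": ("tyeo", 0.05), "eo": ("tyeo", 0.05),
    "ms": ("root", 0.075), "tyeo": ("root", 0.025),
}


def distances_to_ancestors(node):
    """Map each ancestor of `node` (and the node itself) to its path length."""
    dist = 0.0
    out = {node: 0.0}
    while node in PARENT:
        node, length = PARENT[node]
        dist += length
        out[node] = dist
    return out


def expected_silent_distance(a, b):
    """Sum of branch lengths on the path between a and b (simple additivity)."""
    da, db = distances_to_ancestors(a), distances_to_ancestors(b)
    # The most recent common ancestor gives the shortest combined path.
    return min(da[n] + db[n] for n in da if n in db)


for pair in [("m", "s"), ("t", "e"), ("m", "y")]:
    print(pair, f"{expected_silent_distance(*pair):.3f}")
# ('m', 's') 0.100
# ('t', 'e') 0.200
# ('m', 'y') 0.250
```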
Figure 3. Inference of pu and up substitutions on the m lineage under equilibrium codon bias evolution.
The numbers of hits and the ratios of inferred to actual hits reflect averages across 300 replicates of the 1× equilibrium scenario. X-axis scales apply to graphs in the same column. The legend in A applies to B and E and the legend in C applies to D. Reliability of d_up,pu inference was greater for ML than for MP for MCU ≥ 0.6 (see text for criteria).
Figure 4. Ancestral codon configurations in simulations of codon bias evolution.
Trees representing extant codon configurations consistent with single silent changes in the m lineage, ECC_uppppp (A) and ECC_puuuuu (E), are shown. The three most common ancestral codon configurations underlying these extant codon configurations are shown in B, C, and D for ECC_uppppp and in F, G, and H for ECC_puuuuu. Trees C and G reflect child/ancestor reverse changes and trees D and H show child/sib-ancestor parallel changes. The relative frequencies of ancestral codon configurations underlying ECC_uppppp and ECC_puuuuu in the codon bias simulations are shown as bubble plots beneath the trees. The sizes of the bubbles reflect the relative numbers of ancestral codon configurations in each class for three different MCU values. For the non-equilibrium scenarios (1/3N_e and 2N_e), the proportion of each ancestral codon configuration among the extant configurations relative to the proportion under equilibrium codon bias evolution is given. The data are from Table 1.
Figure 5. Reliability of ancestral codon bias inference under equilibrium evolution.
A, B, and C show ratios of inferred to actual values and the d_up,pu for the 0.5× tree (averages among 500 replicates) and D, E, and F show values for the 2× tree (averages among 200 replicates). Actual and inferred numbers of hits are not shown. The legend in A applies to B, D, and E and the legend in C applies to F. X-axis scales are identical for graphs in the same column.
Figure 6. Reliability of ML inference for evolution under the GCpref codon and HKY85 models.
d_up,pu values are shown for 4-fold synonymous codons for ML inference under the equilibrium GCpref codon model and for data simulated under the HKY85 substitution matrix. The latter matrix was set to give identical expected numbers of substitutions and equilibrium GC content for the two scenarios. The legend applies to both graphs and the y-axis scales are identical in the two graphs. Note that d_up,pu inference biases are larger for 4-fold redundant codons than for 2-fold redundant codons under the GCpref codon model. Data are averaged across 300 replicates.
Figure 7. Reliability of ancestral codon bias inference under non-equilibrium evolution: variable selection intensity.
The legend in A applies to B, D, and E. The legend in C also applies to F. X-axis scales are identical for graphs in the same column. Note that the MCU values reflect values at the m node and differ from ancestral values. For the 1/3N_e scenario, reliability of d_up,pu inference was greater for ML than for MP for 0.6 ≤ MCU ≤ 0.8. For 2N_e, ML was more reliable for MCU ≥ 0.8 (see text for criteria). Data are averaged across 300 replicates.
Figure 8. Reliability of ancestral codon bias inference under non-equilibrium evolution: variable mutation.
The legend in A applies to B, D, and E. The legend in C applies to F. X-axis scales are identical for graphs in the same column. Note that MCU values are given for the m node and have shifted from ancestral values. For the 2u/v scenario, reliability of d_up,pu inference was greater for ML than for MP for MCU ≥ 0.8, but MP was more reliable for MCU ≤ 0.6. For 0.5u/v, d_up,pu was more reliably inferred by ML than by MP for MCU ≥ 0.7, but MP was more reliable for MCU = 0.5. Data are averaged across 300 replicates.
Figure 9. Reliability of ML codon bias inference under lineage-specific non-stationarity.
Differences between inferred and actual d_up,pu values are plotted as a function of MCU for each lineage (averages across 300 simulations are plotted). The legend applies to all graphs. X-axis scales apply to all graphs in the same column. Y-axis scales are identical among all graphs. The lineage-specific scenarios are as follows: stationary MCU (st) in s and e, decreasing MCU (1/3N_e) in m, y, o, and eo, and increasing MCU (2N_e) in t and ty. Scenarios were varied in the ancestral ms and tyeo lineages. For the ms lineage: black (st), red (1/3N_e), blue (2N_e). For the tyeo lineage: thick (st), thin (1/3N_e), dotted (2N_e).
