Comparative Study
Genome Res. 2007 Dec;17(12):1837-49.
doi: 10.1101/gr.6249707. Epub 2007 Nov 7.

Evolutionary rate analyses of orthologs and paralogs from 12 Drosophila genomes

Andreas Heger et al. Genome Res. 2007 Dec.

Abstract

The newly sequenced genome sequences of 11 Drosophila species provide the first opportunity to investigate variations in evolutionary rates across a clade of closely related species. Protein-coding genes were predicted using established Drosophila melanogaster genes as templates, with recovery rates ranging from 81%-97% depending on species divergence and on genome assembly quality. Orthology and paralogy assignments were shown to be self-consistent among the different Drosophila species and to be consistent with regions of conserved gene order (synteny blocks). Next, we investigated the rates of diversification among these species' gene repertoires with respect to amino acid substitutions and to gene duplications. Constraints on amino acid sequences appear to have been most pronounced on D. ananassae and least pronounced on D. simulans and D. erecta terminal lineages. Codons predicted to have been subject to positive selection were found to be significantly over-represented among genes with roles in immune response and RNA metabolism, with the latter category including each subunit of the Dicer-2/r2d2 heterodimer. The vast majority of gene duplications (96.5%) and synteny rearrangements were found to occur, as expected, within single Müller elements. We show that the rate of ancient gene duplications was relatively uniform. However, gene duplications in terminal lineages are strongly skewed toward very recent events, consistent with either a rapid-birth and rapid-death model or the presence of large proportions of copy number variable genes in these Drosophila populations. Duplications were significantly more frequent among trypsin-like proteases and DM8 putative lipid-binding domain proteins.

Figures

Figure 1.
Gene prediction results. (A) Two-dimensional histogram of percentage identity and alignment coverage of D. melanogaster transcripts to their best matching predictions in D. pseudoobscura. Transcripts predicted with conserved gene structure and >80% coverage were retained for further analysis; the remainder were removed. (B) Numbers of predicted genes in all fly genomes. Genes with conserved or partially conserved gene structure are shown in blue shades; pseudogenes are shown in gray shades indicating conservation of gene structure: conserved (light), partially conserved (medium), single exon (dark), and retrotransposed (white). Species names have been abbreviated.
Figure 2.
Orthology assignment. (A) Numbers of D. melanogaster genes with orthologs in other Drosophila species. These ortholog counts increase with increasing statistical coverage of genome sequence and decrease with increasing species divergence. The numbers of sequences in 1:1 orthology assignments are shown in black, while the numbers of 1:many orthologs are shown in gray. (B) The inferred phylogeny of Drosophila species based on median dS values among orthologs. The tree was computed using the FITCH program of the PHYLIP package (Felsenstein 1989). Branch lengths are given in dS. Branch support values, computed as percentage of gene phylogenies that are consistent with the species phylogeny, are shown as red pie slices. (C) Gene-based synteny plot between D. melanogaster (X-axis) and D. yakuba (Y-axis). Genes are sorted by physical locations on the chromosomes. The box marks an artefactual duplication between chromosome 3L and chromosome 3L_random in D. yakuba that explains the excess of 1:2 orthologs in this assembly. (D) Gene-based synteny plot between the more divergent species pair D. melanogaster (X-axis) and D. virilis (Y-axis). Species names have been abbreviated.
Figure 3.
Branch-specific terminal dN/dS, dN, and dS values in solid, hatched, and open bars, respectively. The error bars indicate the standard deviation from 20 replicates. Species names have been abbreviated.
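As a minimal sketch of how bar heights and error bars of this kind could be derived, the following computes the mean and standard deviation of dN, dS, and dN/dS over replicate estimates for one branch. The function name `summarize_branch`, the input layout, and the use of the sample standard deviation are assumptions for illustration; the caption does not specify how the replicates were generated or summarized.

```python
from statistics import mean, stdev


def summarize_branch(dn_replicates, ds_replicates):
    """Summarize replicate dN and dS estimates for a single branch.

    Returns a dict mapping "dN", "dS" and "dN/dS" to (mean, sample std dev).
    Hypothetical helper: the input format is an assumption, not the paper's.
    """
    if len(dn_replicates) != len(ds_replicates):
        raise ValueError("dN and dS need the same number of replicates")
    if len(dn_replicates) < 2:
        raise ValueError("at least two replicates are needed for a std dev")
    if any(ds <= 0 for ds in ds_replicates):
        raise ValueError("dS must be positive to form the dN/dS ratio")

    # The ratio is formed per replicate, then summarized like dN and dS.
    ratios = [dn / ds for dn, ds in zip(dn_replicates, ds_replicates)]
    return {
        "dN": (mean(dn_replicates), stdev(dn_replicates)),
        "dS": (mean(ds_replicates), stdev(ds_replicates)),
        "dN/dS": (mean(ratios), stdev(ratios)),
    }
```

Forming the ratio per replicate, rather than dividing the mean dN by the mean dS, keeps the error bar on dN/dS tied to the same replicates as the other two bars; either choice would be defensible.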
Figure 4.
Approximately 20%–30% of all gene duplications have been very recent, whereas duplications inferred to have been more ancient occurred less frequently and more uniformly. Duplication events have been dated by the synonymous substitution rate dS and normalized by the overall height of each gene tree (open squares) or by the corresponding branch length (solid squares) of the species tree after reconciliation with the gene tree. The latter have been aggregated over all internal or terminal lineage branches, respectively. A similar picture emerges when considering each branch separately (Supplemental Figs. S9 and S10). Shown are duplications from the D. melanogaster subgroup only, rooted with D. pseudoobscura and/or D. persimilis sequences (A), or from all 12 species (B). Among 13,132 clusters, 5851 had the full species complement and there were 1853 clusters with 1305 internal and 3794 lineage-specific duplications.
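The normalization described in this caption can be sketched as a simple ratio: the dS at which a duplication occurred, divided by a reference length (the gene tree height or the reconciled species tree branch length). The helper names and the `cutoff` value below are hypothetical and chosen only for illustration; the caption does not state the threshold used to call a duplication "very recent".

```python
def normalized_age(ds_at_duplication, reference_length):
    """Relative age of a duplication: dS at the event / reference length.

    reference_length is either the gene tree height or the species tree
    branch length, both in units of dS (assumed input convention).
    """
    if reference_length <= 0:
        raise ValueError("reference length must be positive")
    return ds_at_duplication / reference_length


def recent_fraction(relative_ages, cutoff=0.1):
    """Fraction of duplications whose relative age is below `cutoff`.

    `cutoff` is an illustrative assumption, not a value from the paper.
    """
    if not relative_ages:
        raise ValueError("no duplications supplied")
    recent = sum(1 for age in relative_ages if age < cutoff)
    return recent / len(relative_ages)


# Usage with made-up numbers: four duplications on a branch of length 1.0 dS.
ages = [normalized_age(ds, 1.0) for ds in (0.02, 0.5, 0.9, 0.05)]
share_recent = recent_fraction(ages)
```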
Figure 5.
Most gene duplications are closely linked and are recent (dS < 0.04). Shown here are lineage-specific duplications in D. yakuba for the five large chromosomal arms, but similar results are seen for other species (Supplemental Figs. S12 and S13). Each duplication is represented by two dots connected by an arc. These are colored by their divergence (dS value, see scale). Pseudogenes are shown in gray. Genes are placed on the chromosomal arms according to their physical location. Most duplications are local such that only a single dot is visible. Overlapping or very close duplications are stacked on top of each other. Multiple duplications within the same gene family are stacked on top of each other in the outer rings whose increased radius reflects the family size. Each member of a multigene family is connected to all other members resulting in a connected path of arcs within a family. Translocations involving three families of likely transposable elements have been omitted to simplify the image.
Figure 6.
Different classes of rapidly evolving genes. Duplicated genes are often involved in adaptive functions such as responses to external stimuli, whereas they are under-represented in transcription factors and regulatory genes. Shown are over-/under-represented GOSlim categories of D. melanogaster genes present in clusters containing gene duplications (n = 1126) (A), without detectable orthologs in species further diverged than D. yakuba and D. erecta (n = 795) (B), and with sites predicted to have been subject to positive selection (n = 121) (C). The size of the box represents the P value of the over-/under-representation while the fold over-/under-representation is indicated by the color of the box (see scale at bottom). False-positive predictions arising from the application of multiple tests were controlled using a false discovery rate of 0.05.
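The caption states that multiple testing was controlled at a false discovery rate of 0.05 but does not name the procedure. As one common way to do this, the sketch below applies the Benjamini-Hochberg step-up rule to a list of P values; the choice of Benjamini-Hochberg and the function name are assumptions, not the authors' stated method.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the test is called significant.

    Benjamini-Hochberg step-up: sort P values, find the largest rank k with
    p_(k) <= (k / m) * fdr, and reject the k smallest P values.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])

    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_passing_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            rejected[idx] = True
    return rejected
```

Applied to enrichment tests like those summarized in the figure, each GOSlim category would contribute one P value, and only categories flagged `True` would be reported.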

