Comparative Study
Mol Biol Evol. 2018 Apr 1;35(4):925-941.
doi: 10.1093/molbev/msy005.

Variable Rates of Simple Satellite Gains across the Drosophila Phylogeny

Kevin H-C Wei et al. Mol Biol Evol.

Abstract

Simple satellites are tandemly repeating short DNA motifs that can span megabases in eukaryotic genomes. Because they can cause genomic instability through nonallelic homologous exchange, they are primarily found in the repressive heterochromatin near centromeres and telomeres where recombination is minimal, and on the Y chromosome, where they accumulate as the chromosome degenerates. Interestingly, the types and abundances of simple satellites often vary dramatically between closely related species, suggesting that they turn over rapidly. However, limited sampling has prevented detailed understanding of their evolutionary dynamics. Here, we characterize simple satellites from whole-genome sequences generated from males and females of nine Drosophila species, spanning 40 Ma of evolution. We show that PCR-free library preparation and postsequencing GC-correction better capture satellite quantities than conventional methods. We find that over half of the 207 simple satellites identified are species-specific, consistent with previous descriptions of their rapid evolution. Based on a maximum parsimony framework, we determined that most interspecific differences are due to lineage-specific gains. Simple satellites gained within a species are typically a single mutation away from abundant existing satellites, suggesting that they likely emerge from existing satellites, especially in the genomes of satellite-rich species. Interestingly, unlike most of the other lineages which experience various degrees of gains, the lineage leading up to the satellite-poor D. pseudoobscura and D. persimilis appears to be recalcitrant to gains, providing a counterpoint to the notion that simple satellites are universally rapidly evolving.

Figures

Fig. 1.
Satellite DNA characterization from standard and PCR-free WGS libraries. (A) Principal component analysis of PCR-free, 8-cycle PCR, and 12-cycle PCR libraries. PC1 accounts for over 90% of the variance between samples. (B) The %AT composition of each kmer is plotted against the log2 fold-difference between PCR-free and 8-cycle PCR libraries. The points are fitted with a quadratic function; the R² is labeled on the top left. (C) For kmers that are >1.5-fold lower in 8-cycle PCR libraries, the log2 fold-differences between PCR-free and 8-cycle PCR libraries (dark gray) and between PCR-free and 12-cycle PCR libraries (light gray) are plotted. (D) The coefficient of variation across triplicates of each condition. * indicates significance at P < 0.0001. (E) The distributions of pairwise correlations of satellite quantities between PCR conditions and replicates are plotted before and after GC-correction.
Fig. 2.
The landscape of satellite DNAs across nine Drosophila species. (A) The heatmap depicts satellite DNA quantities in log10 scale for females and males of the nine species, with the phylogeny of the species drawn on the left. Due to space constraints, only satellites >10 kb in at least one sample are plotted. For the complete set of 207 satellites, see supplementary figure 3, Supplementary Material online. The order of kmers was determined by hierarchical clustering of their quantities in the different samples. (B) Fluorescent in situ hybridization of AAGAG and AATACAATTG in D. melanogaster and D. simulans mitotic chromosomes from third-instar larval neuroblast cells. (C) Total simple satellite abundances are plotted against estimated heterochromatin content (modified from Bosco et al. 2007); the regression line is plotted in red. (D) Simple satellites of low (light gray) and high (dark gray) abundance are binned by the number of species in which they are found. (E) Pairwise Spearman’s correlation of satellite abundance between species is plotted as a heatmap. Negative correlations are driven largely by species-specific satellites.
Fig. 3.
Y-enriched satellites across species. (A) For each species, the satellite quantities in females are plotted against those in males. Satellites with significant enrichment in males, and therefore at least partially Y-linked, are labeled in blue with the counts displayed on the bottom right of each plot. A subset of Y-linked satellites are absent in females, and are therefore Y-specific; they fall within the gray boxes and their counts are tallied in the top left corner. The presence/absence cutoffs of the samples are demarcated by the dotted lines. (B) Distribution of satellites across species is plotted for all satellites (same as fig. 2D), the Y-enriched satellites, and the Y-specific satellites. All pairwise comparisons of the three distributions are significantly different (Kolmogorov–Smirnov test, P values < 1e-5). (C) Read counts of TEs in females versus males are plotted for D. pseudoobscura and D. persimilis (see supplementary fig. 6, Supplementary Material online, for comparisons between all species).
Fig. 4.
Satellite gains and losses along the Drosophila phylogeny. (A) Unambiguous simple-satellite gains and losses are labeled above and below each branch, respectively. The branch lengths are not drawn to scale. (B) Branches on which parallel gains are found are connected with gray lines. The width of the lines is proportional to the number of parallel gains, which are labeled above or below the lines. The branch lengths are not drawn to scale, and the placement of satellites on each branch does not reflect their actual age. Note that four satellites were gained in parallel but are not plotted here because the identity of one of the branches is ambiguous (see supplementary table S3, Supplementary Material online).
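The abstract states only that gains and losses were inferred in a maximum parsimony framework; the exact procedure is not given here. As a rough illustration, the sketch below applies Fitch's small-parsimony algorithm to the presence (1) or absence (0) of a single satellite. The four-species toy tree, the function names, and the choice to resolve a tie at the root as "absent" are all assumptions made for this example, not details taken from the paper.

```python
def fitch_sets(node, presence, sets):
    """Bottom-up pass: record the candidate state set for every node."""
    if isinstance(node, str):                  # a leaf is a species name
        sets[node] = {presence[node]}
    else:
        left, right = node                     # an internal node is a 2-tuple
        a = fitch_sets(left, presence, sets)
        b = fitch_sets(right, presence, sets)
        sets[node] = (a & b) or (a | b)        # intersection, else union
    return sets[node]


def gains_and_losses(tree, presence):
    """Top-down pass: pick one most-parsimonious labeling and list changes.

    Returns (gains, losses); each entry is the node at the lower end of the
    branch on which the change is placed.
    """
    sets = {}
    fitch_sets(tree, presence, sets)
    gains, losses = [], []

    def assign(node, parent_state):
        options = sets[node]
        if parent_state in options:
            state = parent_state               # no change on this branch
        else:
            # At the root (parent_state is None) a tie resolves to 0 (absent),
            # which is an assumption; elsewhere `options` is a single state.
            state = min(options)
            if parent_state == 0:
                gains.append(node)
            elif parent_state == 1:
                losses.append(node)
        if not isinstance(node, str):
            for child in node:
                assign(child, state)

    assign(tree, None)
    return gains, losses


# Toy tree: ((D. melanogaster, D. simulans), (D. pseudoobscura, D. persimilis))
toy_tree = (("mel", "sim"), ("pse", "per"))
```

For example, `gains_and_losses(toy_tree, {"mel": 1, "sim": 0, "pse": 0, "per": 0})` places a single gain on the branch leading to `mel`, matching the intuition that a satellite found in only one species is a lineage-specific gain. A satellite whose placement depends on the tie-breaking rule would not count as unambiguous in the sense of panel (A).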
Fig. 5.
Sequence similarity of lineage-specific satellites. (A) Bootstrapped distributions of the proportion of satellites within one mutation from another are plotted for each species (blue), and for random sets sampled from all species (gray). The vertical and horizontal lines demarcate the median and 95% intervals of the swarms, respectively. All comparisons between the species and random distributions are significantly different (Wilcoxon rank-sum test, P values < 1e-10). (B) Simple satellites in Drosophila melanogaster are plotted as nodes where the size and color indicate their abundances; to prevent clutter, only those that are >10 kb in abundance are labeled. Simple satellites one mutation away from each other are connected by edges. See supplementary figure 9, Supplementary Material online, for networks in all species. (C) The log10 abundance of each simple satellite is plotted against the number of edges it has in a given species; the regression line for this correlation is in red with the P value and R² shown on the top right.
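The network in panel (B) can be approximated with a short sketch. It assumes that "one mutation" means a single base substitution between kmers of equal length, and that a tandem repeat is the same satellite under any rotation or reverse complement. Both are simplifying assumptions for illustration; the paper may also count insertions and deletions, and the function names here are hypothetical.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def equivalent_forms(kmer):
    """All rotations of a kmer and of its reverse complement."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    forms = set()
    for seq in (kmer, reverse_complement):
        for i in range(len(seq)):
            forms.add(seq[i:] + seq[:i])
    return forms


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def one_substitution_apart(a, b):
    """True if some equivalent form of b differs from a at exactly one base."""
    if len(a) != len(b):
        return False                       # indels are not modeled (assumption)
    return any(hamming(a, form) == 1 for form in equivalent_forms(b))


def build_network(kmers):
    """Map each kmer to the set of kmers one substitution away from it."""
    edges = {k: set() for k in kmers}
    for a, b in combinations(kmers, 2):
        if one_substitution_apart(a, b):
            edges[a].add(b)
            edges[b].add(a)
    return edges
```

Calling `build_network(["AAGAG", "AAGAC", "AATAT"])` links AAGAG and AAGAC and leaves AATAT unconnected. The number of edges per node, `len(edges[k])`, is the quantity plotted against abundance in panel (C).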

References

    1. Abad JP, Carmena M, Baars S, Saunders RD, Glover DM, Ludeña P, Sentis C, Tyler-Smith C, Villasante A. 1992. Dodeca satellite: a conserved G+C-rich satellite from the centromeric heterochromatin of Drosophila melanogaster. Proc Natl Acad Sci U S A. 89(10):4663–4667.
    2. Agudo M, Losada A, Abad JP, Pimpinelli S, Ripoll P, Villasante A. 1999. Centromeres from telomeres? The centromeric region of the Y chromosome of Drosophila melanogaster contains a tandem array of telomeric HeT-A- and TART-related sequences. Nucleic Acids Res. 27(16):3318–3324.
    3. Aird D, Ross MG, Chen W-S, Danielsson M, Fennell T, Russ C, Jaffe DB, Nusbaum C, Gnirke A. 2011. Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries. Genome Biol. 12(2):R18.
    4. Andolfatto P, Wong KM, Bachtrog D. 2011. Effective population size and the efficacy of selection on the X chromosomes of two closely related Drosophila species. Genome Biol Evol. 3:114–128. doi:10.1093/gbe/evq086
    5. Ashburner M. 1990. Puffs, genes, and hormones revisited. Cell 61(1):1–3. doi:10.1016/0092-8674(90)90205-S
