Recombination Suppression Drives Expansion of the Drosophila Dot Chromosome

Timothy J Stanek¹, Wilson Leung², Christopher D Shaffer²; Genomics Education Partnership; Ishtar Olaveja³, Annabelle Laughlin^{2,3}, Jaquelyn Hester^{1,3}, Darwin Garrido³, Emily K Oh^{2,3}, Maria Volski^{1,3}, Nistha Panda^{2,3}, Mia Mo^{2,3}, Ethan Cordes², Martin Dalling^{2,3}, Kacie Kershaw^{3,4}, Malcolm Arnott^{3,5}, Stephen Daly^{2,3}, Silvia Garcia Valenzuela³, Paige Thompson^{1,3}, Kayla L Hastert³, Destiny Sabb³, Kathryn Karpinski^{3,4}, Meher Naaz Arora², Nicholas Rius³, Larissa LoBello^{2,3}, Sebastian Jaramillo³, Omkar Sonavane^{1,3}, Alice Herrmann^{2,3}, Laura K Reed⁶, Sarah C R Elgin², Cindy Arrigo³, Christopher E Ellison¹

PMID: 41442496
PMCID: PMC12728734
DOI: 10.1093/molbev/msaf304

Mol Biol Evol. 2025 Nov 28;42(12):msaf304.

Abstract

Genome size varies widely, even among closely related species, yet much less is known about chromosome size variation. Here we use the fourth chromosome of Drosophila, also known as the "Muller F element" or "dot chromosome", as a model to investigate chromosome-specific size expansion. The F element of most Drosophila species is small (∼1.3 Mb) and almost entirely heterochromatic, yet harbors approximately 80 protein-coding genes. Here, we study D. kikkawai, D. takahashii, D. ananassae, and D. bipectinata, whose F elements are 2- to 15-fold larger in size compared to D. melanogaster. Through manual gene curation and comparative genomic analysis, we find that their F elements have expanded primarily via accumulation of transposable elements (TEs) in introns and intergenic regions. Natural selection appears less efficient on these expanded F elements: they have smaller effective population sizes and their genes exhibit reduced usage of optimal codons, compared to D. melanogaster. We propose that F element size variation is driven by differences in F element recombination rates. The ultra-long (∼20 Mb) F elements of D. ananassae and D. bipectinata display high rates of rearrangement and sequence evolution and exhibit independent TE-driven expansions. Our results suggest that F elements of most Drosophila species likely recombine enough to prevent size expansion, while F element recombination in D. ananassae and D. bipectinata is either absent or rare enough to allow TEs and other deleterious mutations to accumulate via Muller's ratchet; thus, these chromosomes evolve more like a Y chromosome than a typical Drosophila F element.

Keywords: Drosophila; genome size; heterochromatin.

Figures

**Fig. 1.**
*Drosophila* species with expanded F elements. a) Karyotypes of the four expanded F element species. Muller elements A to F correspond to *D. melanogaster* chromosomes X, 2L, 2R, 3L, 3R, and 4, respectively. The X chromosomes (Muller A) and F elements of *D. ananassae* and *D. bipectinata* are both metacentric. Both telocentric (dark red arm only) and metacentric (both dark and light red arms) F elements have been reported in *D. kikkawai* (Baimai and Chumchong 1980). Previous work suggests that the strain used here has the acrocentric F element (Leung et al. 2023). Panel adapted from Leung et al. (2023). b) Phylogenetic relationships among the species studied here, along with the F element scaffold length from their respective genome assemblies (Leung et al. 2023). Divergence times (in millions of years) are shown for each node of the tree (Suvorov et al. 2022).

**Fig. 2.**
F element expansion occurs within both introns and intergenic regions. Genes on the Muller F and the reference region of Muller D for each species were annotated by members of the GEP. a) Total coding span of Muller D and Muller F genes from each species. b) Sum of intron lengths per gene. c) Sum of coding exon lengths per gene. d) Total exon count per gene. e) Intergenic lengths. f) Total intron count per gene. For each violin, the dot demarcates the median, the box represents the interquartile range, and the whiskers represent 1.5 × the interquartile range. For each panel, an asterisk after the plot header identifies statistically significant interactions between the independent variables chromosome and species (two-way non-parametric analysis of variance, see Methods). For post-hoc analyses (ART-C procedure with Holm correction, see Methods), a pound symbol (#) identifies statistically significant (adjusted P < 0.05) comparisons of Muller F to Muller D within a species, while the dollar symbol ($) identifies statistically significant (adjusted P < 0.05) comparisons of the same Muller element between each expanded F species and *D. melanogaster*. All tests and their associated statistical values can be found in Table S1. Values for gene coding span length, CDS length, intron length, CDS count, intron count, and intergenic length are available in Table S2.

**Fig. 3.**
Repetitive elements drive F element expansions. a) Repetitive and non-repetitive sequence composition of Muller F, by species. Each species with an asterisk has a significantly larger proportion of repetitive DNA on its F element compared to *D. melanogaster* (Fisher's exact test adjusted P < 0.05). Diagonal patterning identifies repeat classes in each species that comprise a significantly larger proportion of the F element compared to *D. melanogaster* (Fisher's exact test adjusted P < 0.05). Note: Manual inspection of the most abundant elements from the “Unknown” category in *D. takahashii* and *D. kikkawai* suggests that they are primarily DNA transposons. b) Log2-fold change in size (bp) relative to the respective *D. melanogaster* category. Abbreviations: CDS, coding sequence; DNA, DNA transposon; LINE, long interspersed nuclear element retrotransposon; LTR, long terminal repeat retrotransposon; RC/Helitron, rolling circle/Helitron transposon; Unknown, unknown/unclassified repeat, as reported by *Earl Grey* (Baril et al. 2024). The *simple repeat* category is composed of noncoding simple repeats (excluding transposon sequences). The *noncoding* category is composed of unmasked noncoding sequence (see Methods). The TE category is composed of DNA, LINE, LTR, and RC/Helitron transposons (excluding unknown/unclassified). All tests and their associated statistical values can be found in Table S1.
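
As a rough illustration of the log2 fold change plotted in panel (b), the sketch below computes it for a single pair of sizes. The function name and the two sizes are illustrative assumptions: the sizes are rounded from the abstract (∼1.3 Mb for a typical F element, ∼20 Mb for an ultra-long one) and are not per-category values from the figure.

```python
import math


def log2_fold_change(size_bp: float, reference_bp: float) -> float:
    """Return log2(size / reference); positive values indicate expansion."""
    if size_bp <= 0 or reference_bp <= 0:
        raise ValueError("sizes must be positive")
    return math.log2(size_bp / reference_bp)


# Illustrative sizes only, rounded from the abstract (not taken from Fig. 3).
reference_f_element_bp = 1_300_000   # assumed typical F element, ~1.3 Mb
expanded_f_element_bp = 20_000_000   # assumed ultra-long F element, ~20 Mb

lfc = log2_fold_change(expanded_f_element_bp, reference_f_element_bp)
print(f"log2 fold change: {lfc:.2f}")
```

A value near 4 on this scale corresponds to roughly a 15-fold size difference, which matches the upper end of the range given in the abstract.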

**Fig. 4.**
F element genes show less biased codon usage. Normalized CAI were determined for all genes on the F element and D element reference region for each species (see Methods). a) Normalized CAI for Muller F genes and Muller D reference genes. For each violin, the dot demarcates the median, the box represents the interquartile range, and the whiskers represent 1.5 × the interquartile range. b) Normalized CAI values versus distance from centromere. Orange points represent wanderer genes as annotated in Table 2. Correlation coefficients and P-values were calculated using Spearman's rank correlation coefficient, excluding wanderer genes. c) Scatterplots comparing normalized CAI values between F element orthologs in the expanded F species against *D. melanogaster*. Correlation coefficients and P-values were calculated using Spearman's rank correlation coefficient. d) Scatterplot comparing normalized CAI values between F element orthologs of *D. ananassae* against *D. bipectinata*. Note that a third gene, *CG33941*, shows strong codon usage bias in both *D. ananassae* and *D. bipectinata* but not in the other species. Wanderer genes were excluded from panels (c) and (d). For 4A, the asterisk after the plot header identifies statistically significant interactions between the independent variables chromosome and species (two-way non-parametric analysis of variance). For post-hoc analyses (ART-C procedure with Holm correction), a pound symbol (#) identifies statistically significant (adjusted P < 0.05) comparisons of Muller F to Muller D within a species, while the dollar symbol ($) identifies statistically significant (adjusted P < 0.05) comparisons of the same Muller element between each expanded F species and *D. melanogaster*. All tests and their associated statistical values can be found in Table S1.
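
Panels (b) to (d) report Spearman's rank correlation. The following is a minimal stdlib-only sketch of that statistic (Pearson correlation computed on average ranks); it is not the authors' analysis code, it does not compute a P-value, and the CAI values in the example are made up for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical normalized CAI values for five orthologs in two species.
cai_species_a = [0.42, 0.51, 0.38, 0.60, 0.47]
cai_species_b = [0.40, 0.55, 0.41, 0.58, 0.44]
rho = spearman_rho(cai_species_a, cai_species_b)
print(f"Spearman rho = {rho:.3f}")
```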

**Fig. 5.**
F element synteny comparison. a) Multi-species visualization of F element synteny. Each link (light blue) connects a pair of orthologous genes between species. Divergence times between species pairs are shown on the left. Note: the F elements of each species are not drawn to scale; their lengths are shown on the right-hand side of the plot. b) The minimum number of chromosomal rearrangements between the species pairs shown in (a), as calculated by *GRIMM* (Tesler 2002). c) Synteny blocks were identified from whole genome alignments between *D. melanogaster* and *D. takahashii*, and between *D. bipectinata* and *D. ananassae*. Mean synteny block size for each Muller element is shown for pairwise comparisons between *ananassae*/*bipectinata* versus *melanogaster*/*takahashii*. d) Synteny coverage (i.e., aligned fraction of chromosome) is shown for the same comparison as in (c).

**Fig. 6.**
F element sequence conservation. a) Whole genome alignments were constructed using genome assemblies from the 14 species shown in the phylogenetic tree. Red text indicates the *ananassae* species group, the members of which all have a highly expanded F element. Bold text indicates the focal species used in the conservation analyses shown in (b). Node labels indicate divergence times (in millions of years, where available) from Suvorov et al. (2022). Note that branch lengths are not drawn to scale. b) For each bold focal species shown in (a), pairwise comparisons were made to each of the other species in the tree. For each comparison, the fraction of aligned bases between the two species was calculated for 20 kb windows across either the Muller D reference region (d) or the entire F element (f). Note: *D. pseudoananassae* is abbreviated *D.pse.ananassae*. c) Pairwise alignment rates (*D. melanogaster* versus *D. takahashii* and *D. ananassae* versus *D. bipectinata*) for coding and noncoding regions of the Muller D reference region and the Muller F element.
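
The windowed statistic in panel (b), the fraction of aligned bases per 20 kb window, can be sketched as follows. This is a simplified illustration that assumes aligned blocks are already available as half-open `(start, end)` intervals on the focal species; the function name and the example coordinates are hypothetical, and the paper's actual pipeline is described in its Methods.

```python
def windowed_aligned_fraction(aligned_blocks, region_length, window_size=20_000):
    """Fraction of bases covered by aligned blocks in each fixed-size window.

    aligned_blocks: iterable of half-open, 0-based (start, end) intervals.
    Overlapping blocks are merged first so no base is counted twice.
    The final window may be shorter than window_size.
    """
    if region_length <= 0 or window_size <= 0:
        raise ValueError("region_length and window_size must be positive")

    # Clip blocks to the region, then merge overlapping or adjacent blocks.
    clipped = sorted((max(0, s), min(region_length, e)) for s, e in aligned_blocks)
    merged = []
    for s, e in clipped:
        if s >= e:
            continue  # empty after clipping
        if merged and s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])

    n_windows = -(-region_length // window_size)  # ceiling division
    covered = [0] * n_windows
    for s, e in merged:
        for w in range(s // window_size, (e - 1) // window_size + 1):
            w_start = w * window_size
            w_end = min(w_start + window_size, region_length)
            covered[w] += min(e, w_end) - max(s, w_start)

    fractions = []
    for w in range(n_windows):
        w_start = w * window_size
        w_len = min(w_start + window_size, region_length) - w_start
        fractions.append(covered[w] / w_len)
    return fractions


# Hypothetical aligned blocks on a 50 kb region.
blocks = [(0, 10_000), (5_000, 15_000), (30_000, 50_000)]
fractions = windowed_aligned_fraction(blocks, region_length=50_000)
print(fractions)
```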

**Fig. 7.**
Distinct repeat content of *ananassae* group F elements. a) Reciprocal repeat masking results for eleven *ananassae* group species plus *D. melanogaster*, *D. takahashii*, and *D. kikkawai* (see Methods). The F elements of each *ananassae* group species complex (i.e., *ananassae*, *bipectinata*, and *ercepeae*) carry a subset of repeats that are not found in the other complexes. The heatmap cells are colored based on the percentage of the F element that is masked. Differences in color along the diagonal are due to differences in F element repeat density among species. b) Visualization of repeat content in the introns of the *zfh2* gene across members of the *ananassae* species group. Black rectangles represent *zfh2* coding exons. Colored rectangles represent matches to individual TE families identified in either *D. ananassae* (left panel) or *D. bipectinata* (right panel) TEs. TE family names and coordinates within *zfh2* for each species are provided in Table S4.

**Fig. 8.**
Effective population size and linkage disequilibrium. a) Estimates of effective population size (Ne) for both the autosomes and F elements. b) The mean physical distance at which linkage disequilibrium (LD) decays to ¾ of its maximum value. c) The F element versus autosomal (F/A) ratios of LD decay.
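
One way to read the LD decay distance in panel (b) is as the physical distance at which binned mean r² first falls to ¾ of its maximum. The sketch below implements that reading with linear interpolation between distance bins. The interpolation, the function name, and the example values are assumptions for illustration only; the caption does not state how the authors computed this distance.

```python
def ld_decay_distance(distances, r2_values, fraction=0.75):
    """Distance at which r2 first drops to `fraction` of its maximum.

    distances must be sorted ascending and paired with mean r2 per bin.
    Uses linear interpolation between the two flanking bins and returns
    None if r2 never falls to the threshold after its peak.
    """
    if len(distances) != len(r2_values) or not distances:
        raise ValueError("distances and r2_values must be non-empty and equal length")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    max_r2 = max(r2_values)
    if max_r2 <= 0:
        raise ValueError("maximum r2 must be positive")

    threshold = fraction * max_r2
    peak = r2_values.index(max_r2)
    for i in range(peak + 1, len(r2_values)):
        if r2_values[i] <= threshold:
            d0, d1 = distances[i - 1], distances[i]
            y0, y1 = r2_values[i - 1], r2_values[i]
            # y0 is above the threshold and y1 is at or below it, so y0 > y1.
            return d0 + (y0 - threshold) * (d1 - d0) / (y0 - y1)
    return None


# Hypothetical binned LD values (distance in bp, mean r2 per bin).
example_distances = [100, 200, 400, 800]
example_r2 = [0.8, 0.7, 0.5, 0.3]
decay_bp = ld_decay_distance(example_distances, example_r2)
print(f"LD decays to 3/4 of its maximum at about {decay_bp:.0f} bp")
```

Dividing an F element decay distance by the corresponding autosomal one would give an F/A ratio of the kind shown in panel (c).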

References

1. Arguello JR et al. Recombination yet inefficient selection along the Drosophila melanogaster subgroup's fourth chromosome. Mol Biol Evol. 2010:27:848–861. doi: 10.1093/molbev/msp291.
2. Armstrong J et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature. 2020:587:246–251. doi: 10.1038/s41586-020-2871-y.
3. Badet T, Tralamazza SM, Feurtey A, Croll D. Recent reactivation of a pathogenicity-associated transposable element is associated with major chromosomal rearrangements in a fungal wheat pathogen. Nucleic Acids Res. 2024:52:1226–1242. doi: 10.1093/nar/gkad1214.
4. Baimai V, Chumchong C. Karyotype variation and geographic distribution of the three sibling species of the Drosophila kikkawai complex. Genetica. 1980:54:113–120. doi: 10.1007/BF00055979.
5. Balachandran P et al. Transposable element-mediated rearrangements are prevalent in human genomes. Nat Commun. 2022:13:7115. doi: 10.1038/s41467-022-34810-8.
