eLife. 2023 Feb 10;12:e85249.
doi: 10.7554/eLife.85249.

Expansion and loss of sperm nuclear basic protein genes in Drosophila correspond with genetic conflicts between sex chromosomes


Ching-Ho Chang et al. eLife. 2023.

Abstract

Many animal species employ sperm nuclear basic proteins (SNBPs) or protamines to package sperm genomes tightly. SNBPs vary across animal lineages and evolve rapidly in mammals. We used a phylogenomic approach to investigate SNBP diversification in Drosophila species. We found that most SNBP genes in Drosophila melanogaster evolve under positive selection except for genes essential for male fertility. Unexpectedly, evolutionarily young SNBP genes are more likely to be critical for fertility than ancient, conserved SNBP genes. For example, CG30056 is dispensable for male fertility despite being one of three SNBP genes universally retained in Drosophila species. We found 19 independent SNBP gene amplification events that occurred preferentially on sex chromosomes. Conversely, the montium group of Drosophila species lost otherwise-conserved SNBP genes, coincident with an X-Y chromosomal fusion. Furthermore, SNBP genes that became linked to sex chromosomes via chromosomal fusions were more likely to degenerate or relocate back to autosomes. We hypothesize that autosomal SNBP genes suppress meiotic drive, whereas sex-chromosomal SNBP expansions lead to meiotic drive. X-Y fusions in the montium group render autosomal SNBPs dispensable by making X-versus-Y meiotic drive obsolete or costly. Thus, genetic conflicts between sex chromosomes may drive rapid SNBP evolution during spermatogenesis in Drosophila species.

Keywords: D. melanogaster; evolutionary biology; genetics; genomics; meiotic drive; positive selection; protamines; pseudogenes.

Plain language summary

In sperm, DNA is packaged more tightly than in other cells thanks to small proteins called ‘sperm nuclear basic proteins’ (SNBPs), also called protamines in mammals. SNBPs are important for sperm to develop properly and correctly perform their role during fertilization. Although the evolution of SNBPs has been studied in mammals, these proteins have not been as thoroughly examined in invertebrates. Chang et al. took advantage of the availability of high-quality sequences for the genomes of 78 species of Drosophila flies to investigate the evolution of the genes that code for SNBPs in these flies. The results showed that, just like in mammals, in Drosophila the protein sequences of SNBPs evolve rapidly. However, unlike mammals, Chang et al. also found that Drosophila species frequently gained and lost genes coding for SNBPs. Interestingly, the ‘older’ genes (genes that appeared earlier in evolution) that code for SNBPs are not essential for reproduction in the fruit fly Drosophila melanogaster. This is an unexpected finding because older genes usually have essential roles for survival and reproduction, which require them to be passed on to the next generation and remain in the genome. In contrast, younger SNBP genes that had appeared more recently and were not shared between different species of Drosophila were often essential for fertility. These results, combined with other observations about where SNBP genes are located in the genome, led Chang et al. to hypothesize that SNBPs present in sex chromosomes act as ‘meiotic drivers’ while those on other chromosomes (known as autosomes) suppress meiotic drive. In other words, SNBP genes present in the sex chromosomes may be responsible for killing sister sperm cells that do not carry those genes, while SNBP genes that are not located on sex chromosomes may suppress this activity. 
This is of particular interest because it indicates that SNBPs are involved in genetic conflicts between the two sex chromosomes: sperm that carry SNBPs on the X chromosome may kill sperm with a Y chromosome, and vice versa. The results of Chang et al. shed light on the mysterious evolution of SNBPs in Drosophila flies. Although previous hypotheses regarding the rapid evolution of SNBPs have focused on their role in genome packaging, this new analysis suggests that much of the evolutionary change is likely driven by genetic conflicts between sex chromosomes.

Conflict of interest statement

CC, IM, HM: No competing interests declared.

Figures

Figure 1. Origins and evolution of Drosophila sperm nuclear basic protein (SNBP) genes.
(A) Phylogenomic analysis of 13–15 SNBP genes from D. melanogaster organized into three groups (dotted lines): required for male fertility, not required for male fertility, or untested in previous analyses. We identified homologs of these genes in 14 other Drosophila species and an outgroup species, S. lebanonensis, whose phylogenetic relationships and divergence times are indicated on the left (Kumar et al., 2017). Genes retained in autosomal syntenic locations are indicated by black squares, whereas paralogs located in non-syntenic autosomal locations, or X-chromosomes, or Y-chromosomes are indicated in gray, blue and red squares, respectively. Numbers within the squares show the copy number, if >1, of different genes, e.g., D. melanogaster has two paralogs each of both Prot and tHMG genes. An empty square with a line across it indicates that only a pseudogene can be found in the shared syntenic location, whereas an ‘X’ indicates that no ortholog is found, even though one is expected based on the phylogenomic inference of SNBP age. Based on this analysis, we infer that eight SNBP genes are at least 50 million years old, but only three genes are strictly retained in all 16 species (CG30056, CG31010, and Prot). Indeed, none of the SNBP genes required for male fertility in D. melanogaster are strictly conserved in other Drosophila species, either arising more recently (Mst77F, Prtl99C) or having been lost in at least one species after birth (ddbt). We also marked the montium group species, D. kikkawai, in red, because it has unusually lost six SNBP genes. (B, C) We compared dN/dS (B) or dN (C) values for all orthologous SNBP genes (red dots) in D. melanogaster compared to a histogram of the same values for the genome-wide distribution (gray bars) obtained from an analysis using six species by the 12 Drosophila genomes project (Clark et al., 2007). 
Our analyses reveal that most SNBP genes are at or beyond the 95th or 99th percentile for dN/dS or dN values (blue dashed lines). The values of CG34269 are calculated using only five species because it is lost in one of the surveyed species, D. ananassae; therefore, we do not show its dN, as it is not comparable to other genes.
Figure 1—figure supplement 1. Expression patterns of sperm nuclear basic protein (SNBP) genes in D. melanogaster spermatogenesis.
Using single-cell expression data from Witt et al., 2021, we estimated SNBP gene expression in each cell type using the NormalizeData function of Seurat (Hao et al., 2021), with a scale factor of 10000. The cell type is assigned by the expression of stage-specific genes in the previous study (Witt et al., 2021).
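The normalization step named in this caption can be sketched as follows. This is a minimal illustration that assumes Seurat's default "LogNormalize" method was used (the caption gives only the function name and the scale factor): each cell's counts are divided by that cell's total, multiplied by the scale factor, and transformed with log1p. The function name `log_normalize` and the counts are hypothetical, not data from the study.

```python
import math

def log_normalize(counts, scale_factor=10000):
    """Log-normalize one cell's raw counts: log1p(count / total * scale_factor).

    Mirrors the arithmetic of Seurat's default LogNormalize method;
    this is an illustrative re-implementation, not Seurat itself.
    """
    total = sum(counts.values())
    if total == 0:
        # A cell with no counts has no meaningful expression values.
        return {gene: 0.0 for gene in counts}
    return {
        gene: math.log1p(count / total * scale_factor)
        for gene, count in counts.items()
    }

# Hypothetical raw counts for a single cell (made-up numbers).
cell = {"Mst77F": 30, "ProtA": 50, "other": 920}
normalized = log_normalize(cell)
```

Because each cell is scaled by its own total, values become comparable between cells that were sequenced to different depths.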
Figure 1—figure supplement 2. Number and location of high mobility group (HMG) boxes in sperm nuclear basic protein (SNBP) proteins.
We plotted the location of HMG boxes in 15 SNBP proteins encoded by the D. melanogaster genome. Among them, 11 have only one HMG box, whereas 4 of them have two HMG boxes. The location of HMG boxes varies between SNBP proteins. A scale bar for protein size is at the bottom of the figure.
Figure 1—figure supplement 3. Expression patterns of sperm nuclear basic protein (SNBP) genes in Drosophila and Scaptodrosophila species.
We estimated the expression of SNBP orthologs (A) and paralogs (B) using publicly available transcriptome datasets. We used colors to represent expression levels in each sample. Our analyses reveal that almost all SNBP genes are expressed only in testes. The raw values are shown in Supplementary file 2.
Figure 1—figure supplement 4. Sperm nuclear basic protein (SNBP) expression level in testes is correlated across Drosophila species.
We estimated the expression level of each SNBP gene in testes across seven Drosophila species (D. melanogaster, D. simulans, D. yakuba, D. ananassae, D. pseudoobscura, D. virilis, and Scaptodrosophila lebanonensis) and compared the relative expression level of orthologs to each other. The numbers below the diagonal are Spearman's rho coefficients. Our data suggest a moderate to high correlation between Sophophora species. The raw values are shown in Supplementary file 2.
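The rank correlation reported in this caption can be illustrated with a small sketch. Spearman's rho is the Pearson correlation of the ranks, with tied values sharing their average rank. The helper names and the expression values below are hypothetical and do not come from Supplementary file 2.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical testis expression of five orthologs in two species.
species_a = [120.0, 15.5, 300.2, 48.0, 7.1]
species_b = [95.0, 30.3, 410.0, 20.1, 12.5]
rho = spearman_rho(species_a, species_b)
```

Working on ranks rather than raw values makes the comparison insensitive to differences in overall expression scale between datasets.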
Figure 2. The strictly retained, highly conserved sperm nuclear basic protein (SNBP) gene, CG30056, is dispensable for male fertility in D. melanogaster.
(A) The SNBP gene, CG30056, is encoded co-directionally in an intron of the essential frazzled gene. Using guide RNAs designed to match sites flanking CG30056, and a healing construct encoding eye-specific DsRed, we created a knockout allele replacing CG30056 with DsRed. The knockout was verified using PCR and primers flanking the CG30056 locus (right). Note that balancer lines encode a wildtype copy of CG30056. (B) We performed fertility assays comparing CG30056 homozygous knockout flies with heterozygous controls, either KO/Balancer or KO/wt (gray ovals). Each dot represents a single replicate, and the average and 95% confidence interval based on standard errors are shown in the figures. Fertility assays were performed either for a few days or to sperm exhaustion (gray ovals). We also assayed fertility of knockout strains for the fertility-essential Mst77F gene, and the fertility-nonessential Tpl94D gene. We also documented the sex ratios of the resulting progeny in (C). Consistent with previous findings, we found that Mst77F knockout males are essentially sterile and Tpl94D knockout males were indistinguishable from their heterozygous controls. We found either no or weak evidence of fertility impairments in two different crosses with homozygous CG30056 knockout males compared to KO/Balancer controls. However, we found no evidence of CG30056 requirement for male fertility in more stringent ‘sperm exhaustion’ fertility experiments compared to KO/wildtype controls (gray ovals). (C) We observed no significant evidence of sex-ratio distortion that would suggest an X-versus-Y meiotic drive in progeny resulting from either CG30056, Mst77F, or Tpl94D knockout males. Although there is suggestive evidence of sex-ratio distortion in progeny of one of the Mst77F genotypes, this is inconsistent between the two crosses and most likely due to stochastic effects of having very few resulting progeny. The raw data of (B) and (C) are shown in Supplementary file 8.
Figure 3. Recurrent amplifications of Drosophila sperm nuclear basic protein (SNBP) genes are biased for sex-chromosomal linkage.
(A) Using reciprocal BLAST (see 'Materials and methods'), we searched for homologs of each D. melanogaster SNBP gene in 78 distinct Drosophila species and two outgroup species (shown with dotted lines). We depict our findings using the circular phylogram representation for SNBP gene CG31010. The innermost circle is a circular phylogeny of the species (Kim et al., 2021). The next ring indicates autosomal copies, with colors to indicate copy number (scale bar, top left; note that scales are different for each gene). Thus, CG31010 is present in one autosomal copy in all but one Drosophila species (gray bar). The third circle indicates sex-chromosomal copies. Red and blue frames in the middle ring indicate X- or Y-linkage if that can be reliably assigned. Dotted frames indicate copies that might not be real orthologs based on phylogeny, whereas solid frames indicate five or more copies. For example, CG31010 is present in five copies on the X-chromosome of D. obscura. The outermost circle shows copies with ambiguous chromosomal location: there are no such copies for CG31010. (B) Using the same representation scheme, we indicate gene retention and amplification for seven other SNBP genes for which we find robust evidence of amplification, from a copy number of five (CG14835) to nearly 50 (tHMG). We also marked the montium group species that lost many SNBP genes with yellow lines. We note that assemblies of Lordiphosa species have lower quality, and the data need to be interpreted carefully. (C) SNBP gene amplifications (five or more copies) are heavily biased for sex chromosomal linkage. Given the relative size of sex chromosomes and autosomes, this pattern is highly non-random (test of proportions, p=2.3e-5).
Figure 3—figure supplement 1. Six sperm nuclear basic protein (SNBP) genes did not undergo significant gene amplification events in Drosophila species.
We searched for homologs of each D. melanogaster SNBP gene in 78 distinct Drosophila species using reciprocal BLAST. We represent our findings using the same circular representation as in Figure 3: the innermost ring indicates autosomal genes, the middle ring indicates sex-linked genes, and the outer ring shows genes with an ambiguous location. In contrast to the significant gene amplification of eight SNBP genes shown in Figure 3, the five SNBP genes represented here only underwent a relatively modest copy number change of two- or threefold.
Figure 3—figure supplement 2. Concerted evolution of sperm nuclear basic protein (SNBP) gene amplifications.
Phylogenetic analyses of the eight SNBP genes that underwent gene amplifications reveal that most of these amplifications are evolutionarily young. The phylogeny also suggests concerted evolution among the amplified copies of CG14835 in the D. arawakana clade and Prtl99C in the D. suzukii clade, similar to the amplified tHMG-hetX copies on D. mauritiana and D. simulans (Figure 4). The phylogenies from amplified copies of tHMG and ProtA/B in Lordiphosa species are not shown here because of their low-quality sequences.
Figure 4. Tracing the duplication and amplification of tHMG genes in D. simulans and close relatives.
(A) Using a combination of genome assemblies and phylogenetic analyses, we traced the evolutionary origins and steps that led to the massive amplification of tHMG genes on the D. simulans X chromosome. The first step in this process was the duplication of the ancestral tHMG gene (flanked by CCT1 and Octb1R) on the 3R chromosomal arm to a new location on 3R (tHMG-3R#2 now flanked by CG31468 and Gba1a) and to a location on the X chromosome euchromatin, where tHMG-euX is flanked by CG12691 and CG15572. We infer that this CG12691-tHMG-euX locus then duplicated to another locus in X-heterochromatin, between Atbp and the flamenco locus, and further amplified. These resulting copies experienced different fates in D. simulans and its sibling species. For example, in D. sechellia, tHMG-3R#2, tHMG-euX, and tHMG-hetX were all lost but a degenerated copy of tHMG-3R#2 and flanking genes can be found on its Y chromosome. In contrast, in D. mauritiana, tHMG-3R#2 pseudogenized on 3R, tHMG-euX was retained while tHMG-hetX underwent an amplification to a copy number of 15 tandemly arrayed genes in the X heterochromatin. Finally, in D. simulans, tHMG-3R#2 was completely lost, tHMG-euX was pseudogenized, and tHMG-hetX amplified to a copy number of 15 on the X heterochromatin. We note that the amplification unit sizes are different between D. simulans and D. mauritiana, suggesting that these were independent amplifications. Moreover, we detected different copy numbers (all more than 30) of tHMG-hetX across three sequenced strains of D. simulans we surveyed. This difference is likely due to both incomplete assemblies of this region and strain-specific differences. In addition to this X chromosomal expansion, we also found a few degenerated copies of tHMG on the 3R heterochromatic region and the Y chromosome. (B) The alignment shows the divergence between different tHMG copies in the D. simulans clade and D. melanogaster. 
Surprisingly, we found that X-linked tHMG duplicates diverged more from parental genes on autosomes, indicating that they experienced different evolutionary forces than the parental copies. Among 243 aligned nucleotide sites, we found 19 non-synonymous changes and only 3 synonymous changes shared in all X-linked copies after they diverged from the parental copy. Similarly, four non-synonymous changes and no synonymous change occurred on the parental copy in the ancestral species of the simulans clade. Most non-synonymous changes are in the DNA-binding HMG box. As a result, parental copies and new X-linked copies in D. simulans and D. mauritiana only share ~70% protein identity, which is very low given the <3 MY divergence. Our branch test using PAML further shows that both branches have significantly higher protein evolution rates (ω = 1.6, LRT, p=0.007; Supplementary file 11). However, we did not find evidence of positive selection using a branch-site test (LRT, p=0.23; Supplementary file 11). (C) Phylogenetic analyses of the various tHMG genes confirm the chronology of events outlined in (A) and find strong evidence of concerted evolution among the amplified tHMG-hetX copies on D. mauritiana and D. simulans, in which copies from the X-linked heterochromatic region are highly homogeneous within species, but diverged between species. For comparison, we showed the species tree on the left, and the phylogeny of three D. simulans clade species is not resolved due to lineage sorting and gene flow. To simplify the analysis, we only used sequences that are annotated in NCBI databases.
Figure 5. Evolutionary retention, degeneration, or translocation of sperm nuclear basic protein (SNBP) genes following chromosomal fusions.
SNBP genes are ancestrally encoded on autosomes. Following chromosome fusion over Drosophila evolution, we found eight cases in which three SNBP genes (CG14835, CG34269, and ddbt) became linked to sex chromosomes. In 1/8 cases, SNBP genes translocated back to an autosome. In 2/8 cases, the sex chromosome-linked SNBP genes degenerated despite being otherwise widely conserved in non-montium Drosophila species. In 5/8 cases, SNBP genes were retained on neo-sex chromosomes. Among these, we observed one amplification event; ddbt amplified to six copies in D. repletoides. In contrast to sex chromosomal linkage, SNBP genes that remained linked to autosomes despite chromosomal fusions were strictly retained in 16/16 cases. These retention patterns differ significantly between sex chromosomes and autosomes (Fisher’s exact test, p=0.03).
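The retention comparison in this caption can be reproduced with a short sketch of a two-sided Fisher's exact test. It assumes the counts were arranged as a 2x2 table of retained versus not retained (5 and 3 for sex-chromosome-linked cases, counting the one translocation and two degenerations as not retained; 16 and 0 for autosomal cases). That layout is an inference from the caption rather than something it states, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Assumed table layout: rows = sex-linked, autosomal;
# columns = retained, not retained.
p_value = fisher_exact_two_sided(5, 3, 16, 0)
```

Under this assumed layout the test evaluates to 56/2024, about 0.028, which is consistent with the reported p=0.03.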
Figure 6. A dramatic loss of sperm nuclear basic protein (SNBP) genes coincided with a fusion of X and Y chromosomes in the montium group species.
(A) Using a phylogeny of species from the montium group, we traced the retention or loss of SNBP genes that are otherwise primarily conserved across other Drosophila species. Genes retained in autosomal syntenic locations are indicated in black squares, whereas pseudogenes are indicated by an empty square with a diagonal line. We traced a total of 11 independent pseudogenization events. Three of these pseudogenization events occurred early such that all species from this group have lost CG14835, Mst33A, and tHMG. Three other SNBP genes were lost later (in some cases on multiple occasions) and are, therefore, missing only in a subset of species. For example, we infer that CG34269 was lost on at least five independent occasions (and also in outgroup species D. ananassae). We correlated this dramatic loss of otherwise-conserved SNBP genes with the X-chromosome linkage of genes that are ancestrally Y-linked in other Drosophila species, shown on the right. For example, of 12 Y-chromosomal genes in most related species, including D. melanogaster and D. ananassae, most are now X-linked in montium group species (e.g., 11/12 in D. triauraria, 9/11 in D. jambulina, and 7/10 in D. bocqueti and D. kikkawai). We note these species still harbor a Y chromosome; however, this Y-chromosome lacks most ancestrally Y-linked genes. (B) We traced the chromosomal arrangement and linkage of ancestrally Y-linked genes in D. triauraria using a new genome assembly (NCBI accession: GCA_014170315.2) and genetic crosses in (C). We were able to show that the D. triauraria X chromosome represents a fusion of the X chromosome (e.g., from D. melanogaster) and chromosomal segments containing 11 protein-coding genes that are typically found on the Y chromosome (e.g., from D. melanogaster). Genetic crosses confirmed the X-linkage of 9 of these previously Y-linked genes. The lack of allelic differences in D. triauraria prevented us from confirming this for the other two genes: CCY and WDY.
(C) An example of the genetic cross used to verify X-linkage. Using genetic crosses between different D. triauraria strains with allelic variation in ancestral Y-linked genes, we evaluated whether male flies inherit these genes maternally, paternally, or from both parents. We observed only maternal inheritance, confirming the X-chromosomal linkage of these genes.
Figure 6—figure supplement 1. Phylogenetic analyses help distinguish between two models of relocation of ancestrally Y-linked genes.
Two hypotheses have been proposed for the relocation of ancestrally Y-linked genes in the montium species group. The first hypothesis, proposed by Dupim et al., 2018, posits that the Y-chromosomal genes duplicated onto another chromosome, following which either the Y-linked or non-Y-linked genes were retained. We favor an alternate hypothesis in which all Y-linked genes fused to the X-chromosome, following which some Y-linked genes relocated back to the Y chromosome in some but not all montium group species. We find strong evidence for the second hypothesis regarding the PRY and Ppr-Y genes, which are both located on the same contig in D. triauraria and D. kikkawai even though they are X-linked in D. triauraria and Y-linked in D. kikkawai. Phylogenetic analyses of PRY suggest that it relocated back to the Y chromosome from the X chromosome in the D. kikkawai lineage. Similarly, the WDY and kl-2 genes are also co-located on the D. triauraria X chromosome and D. kikkawai Y chromosome. However, in this case, the phylogeny is ambiguous enough to prevent us from distinguishing between the two hypotheses for the WDY and kl-2 genes in D. jambulina, D. bocqueti, and D. kikkawai.
Figure 7. Genetic conflict between sex chromosomes may explain the rapid turnover of sperm nuclear basic protein (SNBP) genes in Drosophila species.
SNBP genes are ancestrally encoded on autosomes where we hypothesize that some of them act to suppress meiotic drive between sex chromosomes (e.g., ProtA/B). However, in some cases, paralogs of these SNBP genes duplicate onto sex chromosomes where they undergo dramatic amplification. We propose that this amplification creates an opportunity for them to act as meiotic drive elements themselves (e.g., Dox), imbuing sex chromosomes that inherit them with transmission advantages. A fusion of the sex chromosomes (e.g., D. montium species group) leads to a loss of meiotic competition between sex chromosomes, which will subsequently lead to the loss or degeneration of the suppressing SNBP genes on autosomes since their drive suppression functions are rendered superfluous.
