[Preprint]. 2024 Jun 12:2024.06.11.598575.
doi: 10.1101/2024.06.11.598575.

Genetic variation in recalcitrant repetitive regions of the Drosophila melanogaster genome

Harsh G Shukla et al. bioRxiv. 2024.

Abstract

Many essential functions of organisms are encoded in highly repetitive genomic regions, including histones involved in DNA packaging, centromeres that are core components of chromosome segregation, ribosomal RNA comprising the protein translation machinery, telomeres that ensure chromosome integrity, piRNA clusters encoding host defenses against selfish elements, and virtually the entire Y chromosome. These regions, formed by highly similar tandem arrays, pose significant challenges for experimental and informatic study, impeding sequence-level descriptions essential for understanding genetic variation. Here, we report the assembly and variation analysis of such repetitive regions in Drosophila melanogaster, offering significant improvements to the existing community reference assembly. Our work successfully recovers previously elusive segments, including complete reconstructions of the histone locus and the pericentric heterochromatin of the X chromosome, spanning the Stellate locus to the distal flank of the rDNA cluster. To infer structural changes in these regions where alignments are often not practicable, we introduce landmark anchors based on unique variants that are putatively orthologous. These regions display considerable structural variation between different D. melanogaster strains, exhibiting differences in copy number and organization of homologous repeat units between haplotypes. In the histone cluster, although we observe minimal genetic exchange indicative of crossing over, the variation patterns suggest mechanisms such as unequal sister chromatid exchange. We also examine the prevalence and scale of concerted evolution in the histone and Stellate clusters and discuss the mechanisms underlying these observed patterns.

Figures

Fig. 1:
Comparison of the assemblies presented here with Release 6. A. D-Genies dotplots of the Release 6 reference genome assembly scaffolds for Drosophila melanogaster (x-axis) versus our contig-level (top) and scaffold-level (bottom) assemblies of iso-1, A4 and A3. B. Repeat content comparison of the iso-1 Release 6 assembly versus our iso-1 HiFi assembly for each Muller element (i.e., autosomal chromosome arms and the X chromosome). C. Comparison of total sequence assigned to euchromatic and heterochromatic compartments in iso-1 Release 6 versus iso-1 HiFi. D. Cumulative length plot for assembly contigs of the strains used in this study and iso-1 Release 6. The x-axis is on a log10 scale.
Fig. 2:
Characterization of newly assembled X-linked heterochromatic sequence. A. D-Genies dotplots of our HiFi assemblies (y-axis) versus the Release 6 assembly of iso-1 (x-axis) covering the proximal end of X chromosome scaffold for both assemblies. B. Repeat catalog of the newly assembled proximal regions on the X chromosome for different strains. C. Map of sequence from the newly assembled proximal end of the contigs assembled here. Top: Schematic map of X chromosome structure. Bottom: structure of the region in the HiFi assemblies of the X chromosome for the haplotypes in A4, iso-1, and A3. The haplotypes are painted with colors representing different repeat categories (following panel B).
Fig. 3:
Maps of the newly assembled histone locus in Drosophila melanogaster. A. Overview of the structure of the histone locus. Top: Schematic illustration of the location of the histone locus relative to the rest of chromosome 2L. Bottom: Map of the locations of elements in the histone cluster, including the 5 individual histone genes and various transposable elements. “iso-1 Ref” refers to the Release 6 community assembly and “iso-1 HiFi” indicates the assembly presented here. White rectangles for iso-1 Ref and A3 HiFi indicate gaps resized to facilitate comparison with the other two assemblies. B. Distribution of the locations of the major histone unit types (i.e., 5kb and 4.8kb). C. Expanded map of the proximal end of the array. Order of strains follows panel A.
Fig. 4:
A. Distribution of landmark anchors used in this study. Anchors enclosed in red boxes indicate units harboring TE insertions. Other anchors are identified by phylogenetic grouping (see Methods). Anchors sharing the same color are putatively homologous. Lines between strains highlight the positions of these putative homologs in their respective haplotypes. B. Distribution of unit types along the histone cluster in 3 haplotypes, including two additional subtypes of the 5kb unit: 5053 and 5063. C. Graphical representation of physical distance between histone units on the chromosome (x-axis) and molecular distance (y-axis). Molecular distances represent all unique pairwise comparisons either between or within strains. Bottom right: Schematic diagram of how the between-strain and within-strain molecular distance comparisons were conducted.
Fig. 5:
Recombination in and around the histone locus. A. Integrative Genomics Viewer plot of the regions flanking the histone locus. Values between double-headed arrows under the x-axis indicate the r² measure of linkage disequilibrium between the variants spanned by the arrows. Coordinates are with respect to the histone locus, with negative values indicating the distal flank and positive values indicating the proximal flank. H1 and H2 each represent haplotypes present in all but 2 strains. ZI170 and ZI220 possess their own private combinations of variants from H1 and H2 haplotypes and are the only representatives of their respective haplotypes. B. Matrix of r² between four 50kb segments (S1-S4) around the histone locus. Each segment is fixed in physical size. Since only variants are plotted, the plotted size scales with the number of variants in the segment. Each segment is separated by a size determined by the length of the histone cluster in iso-1 (~600kb). Colored boxes correspond to expansions in panels C-E. Bottom: Schematic representation of chromosome 2L and the location of the histone locus with respect to the segments and to the annotated boundaries of the heterochromatin. C. Expanded view of pairwise LD for 50kb regions flanking a control locus spanning 22.19Mb-22.79Mb (i.e., S3 and S4). D. Expanded view of pairwise LD for 50kb regions flanking the histone locus spanning 21.55Mb-22.14Mb (i.e., S2 and S3). E. Expanded view of pairwise LD for 50kb regions flanking a control locus spanning 20.9Mb-21.5Mb (i.e., S1 and S2).
Fig. 6:
Maps of euchromatic and heterochromatic Stellate clusters. A. Schematic illustration of the location of the Stellate clusters relative to the rest of the X chromosome. B. Overview of the structure of the euchromatic Stellate locus. The numbers on the top represent full-length Stellate sequences in the tandem array. C. Overview of the structure of the heterochromatic Stellate locus. Type 1 and Type 2 loci are marked by labels on the top in iso-1. The numbers on the bottom represent full-length Stellate sequences in the particular locus. The skin-colored parallelograms represent syntenic sequences (drawn by eye). Purple half rectangles represent the 200 kb tandem segmental duplicates harboring some of the Type 1 loci.
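
The r² measure of linkage disequilibrium reported in Fig. 5 can be illustrated with a minimal sketch. It uses the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)), where D = pAB − pA·pB. The function name `ld_r2`, the 0/1 allele coding, and the assumption that each strain contributes a single phased haplotype are illustrative choices made here, not details taken from the preprint.

```python
from typing import Sequence


def ld_r2(site_a: Sequence[int], site_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic sites.

    Each sequence holds one allele (0 = reference, 1 = alternate) per
    haplotype, in the same haplotype order for both sites. Treating each
    strain as one haplotype is an assumption of this sketch.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")

    n = len(site_a)
    p_a = sum(site_a) / n                                   # frequency of allele 1 at site A
    p_b = sum(site_b) / n                                   # frequency of allele 1 at site B
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n   # frequency of the 1-1 haplotype

    d = p_ab - p_a * p_b                                    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical alleles for four haplotypes at three variant sites.
    site_1 = [0, 0, 1, 1]
    site_2 = [0, 0, 1, 1]   # always co-occurs with site_1
    site_3 = [0, 1, 0, 1]   # independent of site_1
    print(ld_r2(site_1, site_2))
    print(ld_r2(site_1, site_3))
```

Values near 1 mean the two variants almost always occur together on the same haplotype, while values near 0 indicate that their association has been broken down, for example by recombination.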

