Cell. 2022 May 26;185(11):1986-2005.e26.
doi: 10.1016/j.cell.2022.04.017. Epub 2022 May 6.

Recurrent inversion polymorphisms in humans associate with genetic instability and genomic disorders

David Porubsky et al. Cell. 2022.

Abstract

Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1 retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or retrotransposons. Since flanking repeats promote non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 10⁻⁴ per locus per generation. Recurrent inversions exhibit a sex-chromosomal bias and co-localize with genomic disorder critical regions. We propose that inversion recurrence results in an elevated number of heterozygous carriers and structural SD diversity, which increases mutability in the population and predisposes specific haplotypes to disease-causing CNVs.

Keywords: L1 mobile element; genomic disorder; genomic instability; genomic structural variation; human genetic variation; inversion; pathogenic CNV; recurrent mutation; retrotransposon.

Conflict of interest statement

Declaration of interests: E.E.E. is a scientific advisory board (SAB) member of Variant Bio. C.L. is an SAB member of Nabsys. The following authors have previously disclosed a patent application (no. EP19169090) relevant to Strand-seq: A.D.S., J.O.K., T.M., and D.P.; the other authors declare no competing interests.

Figures

Figure 1. Inversion discovery in a diversity panel.
A) Breakdown of inversion (Inv) classes (see Fig. 2 for L1-internal events). InvDup, inverted duplication; miso, (likely) misoriented; complex/lowconf, lower-confidence call. B) Affected bp per variant class and population. Del, deletion; Ins, insertion. C) Balanced inversion landscape (n = 292). D) Inversion discovery (n = 399 sites) by technology with affected bp (pie chart). PAV, phased assembly variant caller. E) Pericentromeric inversion on chromosome 2. Strand-seq read counts in 50 kbp bins (step size: 10 kbp) are represented as bars above (teal; Crick reads) and below (orange; Watson) the midline. SDs and morbid CNVs are annotated. Arrowhead plot reports inversions (H1, haplotype 1; H2, haplotype 2) in NA19650 and nonhuman primates. FISH probe positions shown (bottom). CEN, centromere. F) FISH confirms inversion (red) compared to control (white).
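
The binning described for panel E (50 kbp bins advanced in 10 kbp steps, with Crick and Watson reads counted separately) can be illustrated with a minimal sketch. The function name, the `(position, strand)` input format, and the `"C"`/`"W"` strand labels are assumptions made for illustration; this is not the authors' pipeline.

```python
BIN_SIZE = 50_000  # 50 kbp bin width, as stated in the caption
STEP = 10_000      # 10 kbp step size, as stated in the caption


def sliding_strand_counts(reads, region_start, region_end,
                          bin_size=BIN_SIZE, step=STEP):
    """Count Crick ("C") and Watson ("W") reads in overlapping bins.

    `reads` is a hypothetical iterable of (position, strand) tuples.
    Returns a list of (bin_start, bin_end, crick_count, watson_count).
    """
    reads = list(reads)  # allow generators; the reads are scanned once per bin
    bins = []
    start = region_start
    # Only emit bins that fit entirely inside the region.
    while start + bin_size <= region_end:
        end = start + bin_size
        crick = sum(1 for pos, strand in reads
                    if start <= pos < end and strand == "C")
        watson = sum(1 for pos, strand in reads
                     if start <= pos < end and strand == "W")
        bins.append((start, end, crick, watson))
        start += step
    return bins
```

In a plot like the one described, the Crick count of each bin would be drawn above the midline and the Watson count below it, so a switch in the dominant strand across consecutive bins marks a candidate inversion.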
Figure 2. Inversion formation mechanisms.
A) Representation of inversions and their flanks for events <10 kbp (all sequence-resolved events are in Figure S1C). B) Size distribution for event types from (A). Unresolved: not assembled. C) Functional annotation of events. D) Depiction of twin-priming. (1) Cleavage of the first DNA strand by the L1-encoded endonuclease; (2) annealing of the L1 RNA poly(A) and initiation of reverse transcription (RT) at the free 3′OH; (3) after second strand cleavage, the derived single-stranded overhang at the 5′ TSD anneals internally to the L1 transcript, generating Junction 1 (Jct1); (4) the inverted and non-inverted cDNA products are annealed, generating Junction 2 (Jct2); both junctions are repaired by MMEJ; (5) retrotransposition finalizes with second strand synthesis and ligation. E) Size distribution for L1-associated events. IQR, interquartile range. F) Top, inversion and truncation breakpoint (BKP) density, using kernel density estimation (KDE). Bottom, likelihood of each L1 integration outcome while L1 RT progresses towards the 5′ end of L1 mRNA sequence. G) Left, fraction of full-length, 5′ deleted and inverted L1 inserts exhibiting microhomology, nucleotide insertions, and blunt joints between the 3′ end of the TSD and the 5′ end of the integrated L1. Right, size distribution (bp) for microhomologies and insertions. H) Inversion junction conformations with duplicated (Dup) and deleted (Del) pieces of L1 sequence and blunt joins.
Figure 3. Recurrence of balanced inversions in the human genome.
A) Rate of balanced inversions discovered with each added genome differs from SV insertions and deletions (orange lines, right axis). Dotted lines fit logarithmic model growth. Singleton: 1 allele; polymorphic: AF < 50%; major: AF ≥ 50% (but less than 100%), putative misorient: AF = 100%. B) Inversion recurrence detection: (i) tiSNPs based, (ii) Haplotype based approach. Venn diagram depicts overlap by approach for 127 tested inversions. C–E) Evidence for single (C, 17q21) and recurrent (D, 8p23.1 [distal part chr8:8225000-8301024]; E, 11p11) loci. Left: dendrograms (centroid hierarchical clustering method) show relationships among inverted and direct-oriented haplotypes. Ancestral (blue) vs. derived (orange) SNPs, informative tiSNPs (black) and SNPs with ≥75% mappability (purple) are shown. Middle: haplotype-based principal component (PC) analysis. Right: inferred cladograms of the loci of interest. Blue dots, putative inversion events.
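
The allele-frequency classes defined for panel A can be written as a small classifier. This is a sketch under stated assumptions: the function name and the allele-count inputs are hypothetical, and giving "singleton" precedence over "polymorphic" when exactly one inverted allele is seen is an interpretation of the caption, not something the source spells out.

```python
def classify_inversion(inverted_alleles, total_alleles):
    """Assign an inversion to one of the frequency classes from the caption.

    Both arguments are hypothetical allele counts for a single site.
    """
    if not 0 < inverted_alleles <= total_alleles:
        raise ValueError("need 0 < inverted_alleles <= total_alleles")
    if inverted_alleles == total_alleles:
        return "putative misorient"   # AF = 100%
    if inverted_alleles == 1:
        return "singleton"            # exactly one allele (assumed precedence)
    if 2 * inverted_alleles < total_alleles:
        return "polymorphic"          # AF < 50%
    return "major"                    # 50% <= AF < 100%
```

Comparing `2 * inverted_alleles` with `total_alleles` keeps the 50% boundary in integer arithmetic, so a site at exactly half the alleles is classed as "major", matching AF ≥ 50%.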
Figure 4. Recurrence on chromosome Y.
A) Annotated chromosome Y (top) and sites of inversion (enumerated 1–15) projected onto haplotypes. Phylogeny (left) with estimated divergence times (kya, 1000 years ago). B) Sex chromosome enrichment of recurrent inversions (cons. single, consensus single-event).
Figure 5. Association of toggling inversions with morbid CNVs.
A) Left: Overlap of balanced inversions with a redundant list (n = 155) of morbid CNVs. Cons., consensus. Right: permuted overlaps, p-values (bottom). B) Left: Dot plots of representative assembled haplotypes at 3q29. SD pairs are highlighted in orange (direct) and green (inverse). Tandem duplications of at least one inversion-mediating SD (2nd row) are observed in 43/68 (63%) haplotypes. Right: Direct duplications (SD #2), increasing risk of morbid CNV formation, are common in direct and absent in inverted haplotypes (p-values, Fisher’s exact test). C) Structural haplotypes at 15q13.3, where INV-β and INV-β′ configurations potentially promote recurrent inversions or morbid CNVs. Additional haplotypes (IV, V) containing deletions putatively protect against inversions and morbid CNVs (see also Data S2). D, E) Inversions at 7q11.23 and 2q13.
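
Panel B reports p-values from Fisher's exact test on whether direct duplications occur more often in direct than in inverted haplotypes. A minimal standard-library sketch of a two-sided test on a 2×2 table follows. The function name is an assumption and the counts in the usage note are hypothetical, not values from the paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be direct vs. inverted haplotypes and columns
    duplication present vs. absent (a hypothetical layout).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)
```

For example, `fisher_exact_two_sided(10, 5, 0, 8)` would test a hypothetical case in which 10 of 15 direct haplotypes and 0 of 8 inverted haplotypes carry the duplication.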
Figure 6. Complex inverted haplotypes and inversions at sites of morbid CNVs.
A) The 1p36.13 region differs between the T2T-CHM13 and GRCh38 references. B) Optical mapping reveals four haplotype classes (I-IV), with 12 (H1-H12) seen at least twice at 1p36.13. Colored arrows represent genomic segments, and black arrows deletions. Black rectangle outlines variants relative to T2T-CHM13. C) Inversions at 16p13.11 and 17p11.2. D) An inversion overlapping the PWAS type II region (recurrent CNV breakpoints denoted as BP1, 2 and 3). FISH probe positions shown (bottom). E) Scatterplot depicting shared rare SNPs within the 1KG data for the locus in (D). AC, allele count. F) FISH validation of the locus in panel D. CEN, centromere.
