Genome Biol. 2017 Mar 6;18(1):36.
doi: 10.1186/s13059-017-1158-6.

Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome

Ryan L Collins et al. Genome Biol. 2017.

Abstract

Background: Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies.

Results: We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV.

Conclusions: These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.

Keywords: Autism; Chromoanagenesis; Chromothripsis; Complex chromosomal rearrangement; Copy-number variation; Germline mutation; Inversion; Neurodevelopmental disorders; Structural variation; Whole-genome sequencing.

Figures

Fig. 1
The diverse landscape of SV in participants with ASD and other developmental disorders. We sequenced the genomes of 689 participants with ASD and other developmental disorders. (a) Physical coverage and (b) median insert size of liWGS libraries. (c) Count and distributions of large SV detected by liWGS (Additional file 1). (d) Distribution of SVs per participant by SV class. (e) Density plots of SV sizes by class. Characteristic Alu and L1 peaks are absent due to the resolution of liWGS (> ~5 kb) being larger than most mobile element insertions. (f) Cumulative distributions of SV frequencies by class. Singletons (single observation among all 686 samples) are marked with an arrow. Rare SVs are defined as those with variant frequency (VF) < 1%
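The frequency categories used in this caption can be illustrated with a short sketch. It assumes, since the caption does not spell this out, that variant frequency is the number of participants carrying a site divided by the 686 analyzed samples. The function name and the returned labels are hypothetical and are not taken from the authors' pipeline.

```python
COHORT_SIZE = 686          # samples analyzed, as stated in the caption
RARE_VF_THRESHOLD = 0.01   # rare SVs are those with VF < 1%


def classify_site(carriers: int, cohort_size: int = COHORT_SIZE) -> str:
    """Label an SV site as 'singleton', 'rare', or 'common'.

    Assumption: VF = carriers / cohort_size. A singleton is also rare
    by this definition; it is reported separately here only because
    the caption marks singletons on the plot.
    """
    if not 1 <= carriers <= cohort_size:
        raise ValueError("carriers must be between 1 and cohort_size")
    if carriers == 1:
        return "singleton"
    vf = carriers / cohort_size
    return "rare" if vf < RARE_VF_THRESHOLD else "common"


if __name__ == "__main__":
    for n in (1, 6, 7):
        print(n, classify_site(n))
```

Under this assumption, a site seen in up to six of the 686 samples falls below the 1% threshold, while a site seen in seven or more does not.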
Fig. 2
Classifying 16 recurrent subclasses of large, complex SVs in the human genome. At liWGS resolution, we identified 16 recurrent classes of cxSV, defined here as non-canonical rearrangements involving two or more distinct SV signatures or at least three linked breakpoints. We validated 97.4% (150/154) of all cxSV sites assessed by at least one assay. Each participant harbored a median of 14 cxSVs at liWGS resolution (range: 6–23 cxSVs per participant). We identified 289 distinct cxSVs across 686 participants, totaling 9666 cxSV observations. Each row represents a subclass of cxSV, with columns representing the subclass abbreviation, number of distinct variants discovered, validation rate, total number of observed variants across all participants, the percentage of participants that were found to harbor at least one such variant in their genome, the median size of all variants in that subclass, each subcomponent SV signature that comprises the class, a linear schematic of each class of cxSV, and a simulated example of the copy-number profile as would be observed by chromosomal microarray or WGS
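The counts quoted in this caption can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the caption; the variable names are illustrative. The per-participant value it computes is a mean, which is a different statistic from the median of 14 given in the caption.

```python
# Figures taken directly from the caption.
validated_sites = 150
assessed_sites = 154
total_observations = 9666
participants = 686

# Fraction of assessed cxSV sites confirmed by at least one assay.
validation_rate = validated_sites / assessed_sites

# Mean number of cxSV observations per participant.
mean_per_participant = total_observations / participants

if __name__ == "__main__":
    print(f"validation rate: {validation_rate:.1%}")
    print(f"mean cxSVs per participant: {mean_per_participant:.2f}")
```

The validation rate rounds to the 97.4% reported in the caption, and the mean of roughly 14 per participant agrees with the figure of 14 large cxSV per genome on average given in the abstract.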
Fig. 3
liWGS and lrWGS resolved a de novo gene-disrupting cxSV that was cryptic to standard siWGS. We performed lrWGS from 10X Genomics (Pleasanton, CA, USA) as a method of orthogonal validation for three large complex SVs detected by liWGS, two of which failed to fully validate by traditional methods. One notable example is shown here; the other two are provided in Additional file 2: Figures S4 and S5. (a) A de novo complex reciprocal translocation with three breakpoints between chromosomes 2 (pink) and 6 (green) was discovered by liWGS in a participant with ASD and predicted to result in LoF of PARK2 and CAMKMT. However, two of three breakpoints (breakpoints #1 and #3; orange) were not detectable by siWGS. (b) lrWGS heatmaps from Loupe software [113] analysis of lrWGS data showed clear evidence for each of the three SV breakpoints. (c) lrWGS resolved and phased all three breakpoints, including both breakpoints that failed molecular validation due to low-complexity repetitive sequence (blue), which were resolved by spanning the low-complexity sequence with 28 liWGS reads and 30 lrWGS molecules at breakpoint #1 and 12 liWGS reads and 41 lrWGS molecules at breakpoint #3
Fig. 4
Rare SVs are enriched for hallmarks of deleterious biological outcomes. Comparing all rare (VF < 1%) and common (VF > 1%) SVs discovered in this cohort revealed differences in their respective functional annotations (Additional file 2: Table S2). (a) Rare SVs were larger on average than common SVs [1]. (b) Rare SVs were more likely than common SVs to disrupt genes, particularly when the disruption was predicted to result in LoF. Rare SVs were also more likely than common SVs to result in disruption of promoters [112, 114], enhancers [112, 114], and TAD boundaries [110]. (c) Genes predicted to harbor at least one LoF mutation due to a rare SV were enriched in many subcategories when compared to common SV, including genes predicted to be constrained against truncating mutations in healthy individuals (Constrained) [65, 66], genes predicted to be intolerant of functional variation in healthy individuals (Intolerant) [67], genes with significant burdens of exonic deletions in NDD cases versus healthy controls (NDD ExDels) [38], genes associated with an autosomal dominant disorder (Autosomal Dom.) [68, 69], and genes with at least one pathogenic variant reported in ClinVar (Disease Assoc.) [70] (Additional file 2: Table S3)
Fig. 5
Extreme chromoanagenesis manifests by multiple mutational mechanisms in three participants with developmental anomalies. We applied WGS to resolve microscopically visible cxSVs in three unrelated participants with developmental abnormalities. (a, b) Circos representations of two cases of extreme and largely balanced chromothripsis, involving > 40 breakpoints, > 40 Mb, and > 12 genes across four chromosomes [9, 115]. Points plotted around the inner ring represent estimated copy number alterations; deletions are highlighted in red. Links represent non-reference junctions on derivative chromosomes. (c) Circos representation of a somatic mosaic chromoanasynthesis event of chromosome 19 [115]. Duplications are shaded in blue and interspersed duplications are designated by shaded ribbons leading from the duplicated sequence to their insertion site. (d) CMA and WGS analysis of the mosaic chromoanasynthesis from panel c (participant TL009) revealed all nine CNVs involved in the rearrangement to have arisen on the maternal homologue and that 6/8 duplications were apparently mosaic (2.57 ± 0.02 copies, 95% CI; median coverage shown in yellow; yellow shading indicates 95% CI). Surprisingly, 2/8 duplications (outlined in teal) exhibited significantly greater copy numbers than the other six (p = 9.18 × 10⁻⁸), were linked by an underlying interstitial inversion and appeared to represent approximately three copies, suggesting this rearrangement might have originated as a de novo dupINVdup cxSV in the maternal germline (Additional file 2: Figure S7)

References

    1. Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, et al. An integrated map of structural variation in 2,504 human genomes. Nature. 2015;526:75–81. doi: 10.1038/nature15394.
    2. Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nat Rev Genet. 2006;7:85–97. doi: 10.1038/nrg1767.
    3. Alkan C, Coe BP, Eichler EE. Genome structural variation discovery and genotyping. Nat Rev Genet. 2011;12:363–376. doi: 10.1038/nrg2958.
    4. Brand H, Collins RL, Hanscom C, Rosenfeld JA, Pillalamarri V, Stone MR, et al. Paired-duplication signatures mark cryptic inversions and other complex structural variation. Am J Hum Genet. 2015;97:170–176. doi: 10.1016/j.ajhg.2015.05.012.
    5. Brand H, Pillalamarri V, Collins RL, Eggert S, O’Dushlaine C, Braaten EB, et al. Cryptic and complex chromosomal aberrations in early-onset neuropsychiatric disorders. Am J Hum Genet. 2014;95:454–461. doi: 10.1016/j.ajhg.2014.09.005.
