Cell. 2019 Mar 7;176(6):1310-1324.e10.
doi: 10.1016/j.cell.2019.01.045. Epub 2019 Feb 28.

Megabase Length Hypermutation Accompanies Human Structural Variation at 17p11.2

Christine R Beck et al. Cell.

Abstract

DNA rearrangements resulting in human genome structural variants (SVs) are caused by diverse mutational mechanisms. We used long- and short-read sequencing technologies to investigate end products of de novo chromosome 17p11.2 rearrangements and query the molecular mechanisms underlying both recurrent and non-recurrent events. Evidence for an increased rate of clustered single-nucleotide variant (SNV) mutation in cis with non-recurrent rearrangements was found. Indel and SNV formation are associated with both copy-number gains and losses of 17p11.2, occur up to ∼1 Mb away from the breakpoint junctions, and favor C > G transversion substitutions; results suggest that single-stranded DNA is formed during the genesis of the SV and provide compelling support for a microhomology-mediated break-induced replication (MMBIR) mechanism for SV formation. Our data show an additional mutational burden of MMBIR consisting of hypermutation confined to the locus and manifesting as SNVs and indels predominantly within genes.
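The substitution preference reported above (C > G transversions) is usually tallied by collapsing each SNV onto its pyrimidine reference base, so that G > C on one strand counts as C > G. The following is a minimal sketch of that convention, not the authors' pipeline; the function names and the example variant list are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Collapse a substitution onto its pyrimidine reference base (G>C becomes C>G)."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    if ref in "AG":  # purine reference: report the complementary strand instead
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(snvs):
    """Return the relative contribution of each substitution class."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    return {cls: counts[cls] / total for cls in sorted(counts)}


# Hypothetical (ref, alt) pairs for illustration only; not data from the study.
example_snvs = [("C", "G"), ("G", "C"), ("C", "T"), ("A", "G")]
example_spectrum = spectrum(example_snvs)
```

With this convention every SNV falls into one of six classes (C>A, C>G, C>T, T>A, T>C, T>G), which makes spectra comparable regardless of the strand on which a variant was called.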

Keywords: CNVs; DNA repair; complex rearrangements; genomic characterization; genomic disorders; long-read sequencing; phasing.

Conflict of interest statement

Declaration of Interests

Baylor College of Medicine (BCM) and Miraca Holdings have formed a joint venture with shared ownership and governance of Baylor Genetics (BG), which performs clinical microarray analysis and clinical exome sequencing. C.A.S. is an employee of BCM and derives support through a professional services agreement with BG. J.R.L. serves on the Scientific Advisory Board of BG. J.R.L. has stock ownership in 23andMe, is a paid consultant for Regeneron Pharmaceuticals, has stock options in Lasergen, and is a co-inventor on multiple United States and European patents related to molecular diagnostics for inherited neuropathies, eye diseases, and bacterial genomic fingerprinting.

Figures

Figure 1. Non-recurrent rearrangements display increased SNV and indel mutations
A) Array comparative genomic hybridization (aCGH) data for 19 de novo recurrent rearrangements (10 SMS deletions in green and 9 PTLS duplications in red) are depicted. The bounds of the capture region are shown with dashed black lines. The three SMS repeats are denoted with arrows indicating their orientation on chromosome 17p11.2 (purple translucent vertical lines depict these regions in panel A only). The established dosage-sensitive gene underlying SMS and PTLS, RAI1, is denoted with a black vertical line (another gene in the region, PMP22, is also indicated in panel A). B) 26 de novo non-recurrent rearrangements (14 PTLS and 12 SMS) are depicted. Deleted regions are shown in green, duplicated regions in red, and triplicated regions in blue. C) Local, de novo single-nucleotide variant (SNV) and indel mutations are shown in the context of the 4/19 recurrent structural variants (SVs) they occurred with. D) Local, de novo SNV and indel mutations are shown in the context of the 13/26 non-recurrent SV events that harbored concurrent SNV mutational events. All mutations in SMS cases occur within the non-deleted regions; all mutations in PTLS cases occur within duplicated regions except one mutation in BAB2811. See also Figure S1 and Table S3.
Figure 2. Sequencing and analysis strategy
A) Schematic of regional capture and sequencing of trios. Regional capture and Illumina sequencing were performed on each individual in the study. From these data, regional de novo SNVs and indel mutations were ascertained. Additionally, 7 Mb capture and PacBio sequencing of the probands were performed to identify SV junctions. SNVs, indels, and SV junctions were experimentally verified to be de novo with PCR and Sanger sequencing, and SNV and indel mutations were visualized in the PacBio data from the same individuals. B) Exome sequencing of the trios was conducted to identify the genome-wide de novo mutational burden in each trio in the study; all mutations were confirmed with PCR and Sanger sequencing. C) A flow diagram shows the regional genomic sequencing strategy employed for the 26 individuals studied with non-recurrent mutations. See also Table S1.
Figure 3. Regional B-allele frequency and genotype information allows SV phasing
The phasing data for selected individuals carrying duplications are shown; red dots represent maternal and blue dots paternal informative SNPs (black dots are non-informative). The x-axis represents the coordinates (hg19 genomic position) along the 17p11.2 capture region, and the y-axis is the B-allele frequency. A) BAB2811 carries a duplication on the maternal haplotype; we phased 11 SNVs in cis with this SV, including one outside of the duplicated region. B) BAB3810 carries a duplication on the maternal haplotype; one SNV was in cis with this SV junction. C) BAB8123 carries a duplication on the paternal haplotype; two SNVs were in cis with this SV. D) BAB2986 carries a duplication on the maternal haplotype; two SNVs were in cis with this SV. See Table S3 for SNV phasing data for these probands, indicating that the SNVs occurred de novo in cis with the SV.
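The idea behind B-allele-frequency phasing of a duplication can be sketched as follows. At a heterozygous site inside a duplicated segment, the allele carried on the duplicated haplotype is present in two of three copies, so its read fraction shifts from about 1/2 toward 2/3 (or the B-allele frequency drops toward 1/3 if the other allele is duplicated). This is a simplified illustration under assumed inputs; the function names, the tolerance, and the read counts are hypothetical and do not reproduce the study's analysis.

```python
def b_allele_frequency(a_reads, b_reads):
    """Fraction of reads supporting the B allele at one site."""
    total = a_reads + b_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return b_reads / total


def duplicated_parent(baf, b_allele_parent, tolerance=0.1):
    """Infer which parental haplotype is duplicated at an informative heterozygous SNP.

    b_allele_parent is 'maternal' or 'paternal': the parent who transmitted the
    B allele. Returns None when the BAF is not near 1/3 or 2/3 (assumed tolerance).
    """
    if b_allele_parent not in ("maternal", "paternal"):
        raise ValueError("b_allele_parent must be 'maternal' or 'paternal'")
    other = "paternal" if b_allele_parent == "maternal" else "maternal"
    if abs(baf - 2 / 3) <= tolerance:
        return b_allele_parent  # B allele present in two of three copies
    if abs(baf - 1 / 3) <= tolerance:
        return other  # A allele present in two of three copies
    return None


# Hypothetical read counts: 20 reads for A, 40 reads for the maternally inherited B.
example_baf = b_allele_frequency(20, 40)
example_call = duplicated_parent(example_baf, "maternal")
```

In practice a call would be supported by many informative SNPs across the duplicated interval rather than a single site.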
Figure 4. SNVs and indels accompany SV formation
A) BAB2543 carries two duplications in an inverted orientation separated by a copy-neutral segment (DUP-NML-DUP/INV). Breakpoint junction (jct) 1 maps to inverted SMSREP LCRs and evaded sequencing attempts. Breakpoint jct 2 was mediated by inverted Alu repeats and forms an Alu-Alu chimera; the junction sequence is characterized by 29 bp of microhomology. Eight de novo mutations have also been characterized within 17p11.2; the Sanger sequencing electropherogram confirming each SNV is shown along with the location, genomic context and type. B) BAB1931 carries three deletions interspersed with copy-neutral segments (DEL-NML-DEL-NML-DEL). SV breakpoints display one, two and zero bp of microhomology at the junctions, and jct3 was previously uncharacterized. The four de novo SNVs and indels present in the proband and the Sanger sequencing electropherogram confirmation are depicted below. The SNV at 20409881 was not independently confirmed using a PCR/Sanger sequencing strategy due to its presence within an LCR; however, it was observed in both Illumina and PacBio sequencing data and shown to be de novo in the trio Illumina sequencing data. C) Plot shows the relative contribution of each SNV transition and transversion observed de novo in the non-recurrent individuals. The overall abundance of C>G mutations can be readily observed. D) Enrichment of de novo SNVs in proximity to SV breakpoints was observed in the genomes of 9 out of 13 subjects with non-recurrent SVs. This enrichment was not observed for de novo SNVs (N=4) detected in the subjects carrying recurrent SVs. The normalized statistic (Z-value) for each simulation and the observation (red dot) is displayed with the box plots. E) Mutational clustering was examined in individuals with more than one de novo SNV. SNV mutations show statistically significant clustering in 5 out of 9 NR rearrangements. The normalized statistic (Z-value) for each simulation and the observation (red dot) is plotted. The box plots were colored according to the number of de novo mutations detected in each subject. (*) P ≤ 0.05; (**) P ≤ 0.01; (***) P ≤ 0.001. See also STAR Methods, Figure S2 and Data S1.
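A simulation-based enrichment test of the kind summarized in panel D can be sketched as below. The actual procedure is described in the STAR Methods, which are not reproduced here, so this is only an illustration under explicit assumptions: SNVs are placed uniformly at random across the capture region, the statistic is the mean distance to the nearest breakpoint, and all names, coordinates, and parameters are hypothetical.

```python
import random
import statistics


def mean_distance_to_breakpoints(positions, breakpoints):
    """Mean distance from each SNV to its nearest breakpoint."""
    return statistics.mean(min(abs(p - b) for b in breakpoints) for p in positions)


def proximity_z_score(snv_positions, breakpoints, region_start, region_end,
                      n_sim=1000, seed=0):
    """Compare observed SNV-to-breakpoint distance with uniform random placement.

    Returns (z, p): z is the observed statistic normalized by the simulated
    mean and standard deviation; p is the one-sided empirical probability of a
    simulated mean distance at least as small as the observed one.
    """
    rng = random.Random(seed)
    observed = mean_distance_to_breakpoints(snv_positions, breakpoints)
    simulated = []
    for _ in range(n_sim):
        # Assumption: every position in the region is equally likely.
        shuffled = [rng.randint(region_start, region_end) for _ in snv_positions]
        simulated.append(mean_distance_to_breakpoints(shuffled, breakpoints))
    z = (observed - statistics.mean(simulated)) / statistics.stdev(simulated)
    p = (1 + sum(s <= observed for s in simulated)) / (n_sim + 1)
    return z, p


# Hypothetical coordinates: three SNVs within 1 kb of a breakpoint in a 1 Mb region.
example_z, example_p = proximity_z_score(
    snv_positions=[499_000, 500_500, 501_000],
    breakpoints=[500_000],
    region_start=0,
    region_end=1_000_000,
)
```

A strongly negative Z-value indicates that the observed SNVs lie closer to the breakpoints than random placement would predict. A realistic analysis would also need to restrict simulated positions to callable sequence and account for copy number, which this sketch ignores.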
