Am J Hum Genet. 2015 Feb 5;96(2):208-20.
doi: 10.1016/j.ajhg.2014.12.017. Epub 2015 Jan 29.

Next-generation sequencing of duplication CNVs reveals that most are tandem and some create fusion genes at breakpoints

Scott Newman et al. Am J Hum Genet. 2015.

Abstract

Interpreting the genomic and phenotypic consequences of copy-number variation (CNV) is essential to understanding the etiology of genetic disorders. Whereas deletion CNVs lead obviously to haploinsufficiency, duplications might cause disease through triplosensitivity, gene disruption, or gene fusion at breakpoints. The mutational spectrum of duplications has been studied at certain loci, and in some cases these copy-number gains are complex chromosome rearrangements involving triplications and/or inversions. However, the organization of clinically relevant duplications throughout the genome has yet to be investigated on a large scale. Here we fine-mapped 184 germline duplications (14.7 kb–25.3 Mb; median 532 kb) ascertained from individuals referred for diagnostic cytogenetics testing. We performed next-generation sequencing (NGS) and whole-genome sequencing (WGS) to sequence 130 breakpoints from 112 subjects with 119 CNVs and found that most (83%) were tandem duplications in direct orientation. The remainder were triplications embedded within duplications (8.4%), adjacent duplications (4.2%), insertional translocations (2.5%), or other complex rearrangements (1.7%). Moreover, we predicted six in-frame fusion genes at sequenced duplication breakpoints; four gene fusions were formed by tandem duplications, one by two interconnected duplications, and one by duplication inserted at another locus. These unique fusion genes could be related to clinical phenotypes and warrant further study. Although most duplications are positioned head-to-tail adjacent to the original locus, those that are inverted, triplicated, or inserted can disrupt or fuse genes in a manner that might not be predicted by conventional copy-number assays. Therefore, interpreting the genetic consequences of duplication CNVs requires breakpoint-level analysis.

Figures

Figure 1
Genetic Outcomes for Duplication CNVs. Duplication of region B can be in direct or inverted orientation or can be inserted at another locus. Genes (arrows) and duplication breakpoints (dashed lines) are shown. Whole-gene duplication can lead to triplosensitivity, whereas intragenic duplications can disrupt the reading frame and cause loss of function. Direct intergenic duplications can generate a nonfunctional gene at the breakpoint junction while maintaining intact genes at the edges of the duplication. Intergenic duplications with breakpoints in two different genes can create a gene fusion if the genes are in the same orientation and the reading frame is maintained. Inverted intergenic duplications can create a fusion gene at the junction and will mutate one gene (gray) without retaining an intact copy at the locus. Loss of one gene copy through inverted duplication can lead to haploinsufficiency. Insertional translocations can disrupt or fuse genes at the site of insertion (gray).
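The fusion rule stated in this caption (same orientation plus a maintained reading frame) can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' method: the `Partner` type, its fields, and the phase convention (coding bases modulo 3) are hypothetical names introduced here, and the strand logic assumes that a direct junction keeps both reference strands while an inverted junction flips one partner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partner:
    """One gene at a junction (hypothetical structure for illustration)."""
    strand: str  # "+" or "-" on the reference genome
    phase: int   # coding bases modulo 3, as described below


def predicts_in_frame_fusion(five_prime: Partner, three_prime: Partner,
                             inverted: bool = False) -> bool:
    """Return True if the junction could encode an in-frame fusion.

    five_prime.phase: coding bases retained from the 5' partner, modulo 3.
    three_prime.phase: coding bases that precede the retained part of the
    3' partner in its own transcript, modulo 3.
    Assumption: a direct junction leaves both partners on their reference
    strands; an inverted junction flips exactly one of them.
    """
    same_strand = five_prime.strand == three_prime.strand
    # Direct junction: partners must already share a strand.
    # Inverted junction: partners must start on opposite strands.
    co_oriented = same_strand != inverted
    return co_oriented and five_prime.phase == three_prime.phase
```

A real prediction would also need transcript annotation to confirm which exons are retained; the sketch only captures the two conditions named in the caption.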
Figure 2
Duplication Breakpoint Sequencing. (A) High-resolution array CGH of genomic DNA from subject EGL464 fine maps the 568-kb duplication. Log2 ratio of subject versus control signal intensity is shown on the y axis. (B) SureSelect target enrichment of the 20-kb region surrounding breakpoints (dashed lines) followed by next-generation sequencing and alignment of paired-end reads (gray) reveals sequences from the normal chromosome 1 (chr1: 46,084,756–46,085,053). (C) Discordant reads (green) that map to this region have mate pairs that map to the positive strand at chr1: 46,652,055–46,652,356, consistent with a direct, tandem duplication. (D) Split reads that span the duplication junction misalign (colored vertical lines) to the reference genome at the site of the breakpoint (arrow; chr1: 46,084,825).
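The reasoning in panel (C) can be illustrated with a small classifier for paired-end read orientation. This is a hedged sketch of a common heuristic, not the pipeline used in the study: the `ReadPair` type, the `classify_pair` function, and the `max_insert` threshold are hypothetical. It assumes a standard paired-end library in which concordant pairs face inward (left read on "+", right read on "-"), so an outward-facing pair suggests a direct tandem duplication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Mapped positions and strands of the two mates (hypothetical type)."""
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str


def classify_pair(pair: ReadPair, max_insert: int = 1000) -> str:
    """Label a read pair by mate orientation and distance.

    max_insert is an assumed upper bound on the normal fragment size.
    """
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    # Order the mates by genomic position so orientation reads left to right.
    left, right = sorted([(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)])
    orientation = left[1] + right[1]
    if orientation in ("++", "--"):
        return "inversion-like"
    if orientation == "-+":
        # Mates face outward: the signature of a direct tandem duplication.
        return "tandem-duplication-like"
    # Inward-facing mates: concordant unless they are too far apart.
    return "concordant" if right[0] - left[0] <= max_insert else "deletion-like"
```

For example, a pair with one mate on the negative strand near chr1:46,084,800 and the other on the positive strand near chr1:46,652,100 (illustrative positions chosen within the intervals given in the caption) is labeled `tandem-duplication-like`.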
Figure 3
Breakpoint Junctions Reveal Signatures of DNA Repair. (A) Examples of junctions with Alu-Alu homology (purple), microhomology (blue), blunt ends, and insertions (bold) are shown. Duplication breakpoint junctions are shown as the middle sequence, aligned to the reference genome at the two sides of the direct duplications. Underlined sequence shows the origin of the templated insertion in EGL527. (B) Frequency of Alu-Alu homology (H), microhomology (1 to >8 bp), blunt ends (0), and insertions (1 to >8 bp) at sequenced junctions. Colors are the same as in (A). (C) Breakpoints from 1,000 simulated duplications have a different distribution of microhomology and blunt ends compared to observed junctions in (B) (p = 5.117 × 10⁻¹²).
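The junction categories in panels (A) and (B) can be made concrete with a short sketch. It assumes three equal-length, column-aligned strings, as in the stacked alignment of panel (A): the junction read, the reference on the side that supplies the start of the read, and the reference on the side that supplies its end. The function name, arguments, and example sequences are hypothetical, and Alu-Alu homology is not handled.

```python
def _match_len(a: str, b: str) -> int:
    """Number of leading positions at which a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def classify_junction(junction: str, left_ref: str, right_ref: str):
    """Classify a junction as microhomology, blunt, or insertion.

    All three strings are assumed to be the same length and aligned
    column by column. Returns (label, length in bp).
    """
    if not (len(junction) == len(left_ref) == len(right_ref)):
        raise ValueError("sequences must be aligned to the same length")
    n = len(junction)
    prefix = _match_len(junction, left_ref)               # matches left side
    suffix = _match_len(junction[::-1], right_ref[::-1])  # matches right side
    overlap = prefix + suffix - n
    if overlap > 0:
        return ("microhomology", overlap)  # bases shared by both sides
    if overlap == 0:
        return ("blunt", 0)
    return ("insertion", -overlap)         # bases matching neither side
```

Because short matches can arise by chance, a sketch like this can overstate microhomology by a base or two, which is one reason a comparison against simulated breakpoints, as in panel (C), is informative.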
Figure 4
DUP-NML-DUP and DUP-TRP-DUP Organization. High-resolution array CGH reveals duplications and/or triplications in EGL515 (A), EGL559 (B), EGL688 (C), and EGL407 (D). Log2 ratio of subject versus control signal intensity is shown on the y axis. Normal copy number, duplicated, and triplicated segments are labeled A–E for DUP-NML-DUP (A and B) and DUP-TRP-DUP (C and D) rearrangements. Gray arches connect sequenced junctions relative to the reference genome (above) and the rearrangement (below). Duplicated and triplicated segments can be inverted (Inv) or in direct orientation.
Figure 5
In-Frame Fusion Genes Predicted at Duplication Junctions. (A, D, and G) Genes that cross breakpoints are shown relative to the reference genome (above) and the duplication (below). The genomic coordinates of breakpoints have been confirmed by sequencing (black) or high-resolution array CGH (gray). (A) EGL480’s direct duplication of chromosome 2p22.1. (B) The direct duplication fuses SOS1 to MAP4K3. (C, F, and I) Domains of the fusion proteins in EGL480 (C), EGL701 (F), and EGL605 (I). We predicted fusion protein motifs by entering fusion cDNA sequence from Ensembl 75 into ScanProsite. (D) EGL701’s duplication of the X chromosome is inverted and inserted into chromosome 9. (E) COL4A6 is fused to USP20 at the insertion site. (G) Array CGH (above) and breakpoint sequencing (below) of EGL605’s DUP-NML-DUP. There are two possible structures for this rearrangement, and both predict a KCNH5-FUT8 fusion. (H) KCNH5 fuses to FUT8 at the inverted junction of the two duplications.
