Genome Res. 2012 Nov;22(11):2250-61.
doi: 10.1101/gr.136572.111. Epub 2012 Jun 28.

nFuse: discovery of complex genomic rearrangements in cancer using high-throughput sequencing

Andrew McPherson et al. Genome Res. 2012 Nov.

Abstract

Complex genomic rearrangements (CGRs) are emerging as a new feature of cancer genomes. CGRs are characterized by multiple genomic breakpoints and thus have the potential to simultaneously affect multiple genes, fusing some genes and interrupting other genes. Analysis of high-throughput whole-genome shotgun sequencing (WGSS) is beginning to facilitate the discovery and characterization of CGRs, but further development of computational methods is required. We have developed an algorithmic method for identifying CGRs in WGSS data based on shortest alternating paths in breakpoint graphs. Aiming for a method with the highest possible sensitivity, we use breakpoint graphs built from all WGSS data, including sequences with ambiguous genomic origin. Since the majority of cell function is encoded by the transcriptome, we target our search to find CGRs that underlie fusion transcripts predicted from matched high-throughput cDNA sequencing (RNA-seq). We have applied our method, nFuse, to the discovery of CGRs in publicly available data from the well-studied breast cancer cell line HCC1954 and primary prostate tumor sample 963. We first establish the sensitivity and specificity of the nFuse breakpoint prediction and scoring method using breakpoints previously discovered in HCC1954. We then validate five out of six CGRs in HCC1954 and two out of two CGRs in 963. We show examples of gene fusions that would be difficult to discover using methods that do not account for the existence of CGRs, including one important event that was missed in a previous study of the HCC1954 genome. Finally, we illustrate how CGRs may be used to infer the gene expression history of a tumor.
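
The shortest-alternating-path idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the nFuse implementation: the vertex names, edge weights, and the function `shortest_alternating_path` are hypothetical, and the sketch assumes only that a breakpoint graph has two edge kinds (breakpoint edges and adjacency edges) and that a valid path must alternate between them. It runs Dijkstra's algorithm over (vertex, kind of last edge) states so that two edges of the same kind are never taken consecutively.

```python
import heapq
import itertools
from collections import defaultdict

BREAKPOINT, ADJACENCY = "breakpoint", "adjacency"


def shortest_alternating_path(edges, source, target):
    """Return (cost, vertices) for the cheapest path from source to target
    whose consecutive edges alternate in kind, or None if none exists.

    edges: iterable of (u, v, kind, weight) for an undirected graph.
    Assumes source != target and non-negative weights.
    """
    graph = defaultdict(list)
    for u, v, kind, weight in edges:
        graph[u].append((v, kind, weight))
        graph[v].append((u, kind, weight))

    tie = itertools.count()          # tie-breaker so heap never compares states
    start = (source, None)           # no edge has been used yet
    best = {start: 0.0}
    parent = {start: None}
    heap = [(0.0, next(tie), start)]

    while heap:
        cost, _, state = heapq.heappop(heap)
        if cost > best[state]:
            continue                 # stale heap entry
        vertex, last_kind = state
        if vertex == target:
            path = []
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return cost, path[::-1]
        for nxt, kind, weight in graph[vertex]:
            if kind == last_kind:
                continue             # would break the alternation
            new_state = (nxt, kind)
            new_cost = cost + weight
            if new_cost < best.get(new_state, float("inf")):
                best[new_state] = new_cost
                parent[new_state] = state
                heapq.heappush(heap, (new_cost, next(tie), new_state))
    return None


# Hypothetical toy graph: two breakpoints joined by an adjacency edge on
# chromosome B, plus a cheap breakpoint edge (B1, C1) that cannot be used
# directly after the (A1, B1) breakpoint edge.
toy_edges = [
    ("A1", "B1", BREAKPOINT, 1.0),
    ("B1", "B2", ADJACENCY, 0.5),
    ("B2", "C1", BREAKPOINT, 2.0),
    ("B1", "C1", BREAKPOINT, 0.1),
]
result = shortest_alternating_path(toy_edges, "A1", "C1")
print(result)
```

In this toy graph the cheaper route A1, B1, C1 is rejected because it uses two breakpoint edges in a row, so the search returns the alternating route through the adjacency edge. How nFuse actually weights edges and handles ambiguously aligned reads is described in the full paper, not here.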

Figures

Figure 1.
Breakpoint graphs and polyfusions. (A) A breakpoint as an unexpected adjacency. (B) The breakpoint graph for a single breakpoint showing a breakpoint edge. (C) Two breakpoints on chromosomes A, B, and C. (D) The breakpoint graph for the two breakpoints showing two breakpoint edges and an adjacency edge. (E) Three breakpoints on chromosomes A, B, and C. (F) The breakpoint graph for the three breakpoints showing a (B2, B5) adjacency edge that encodes the optional nature of breakpoint (B3, C1). (G) Breakpoints for an X-Y gene fusion with a complex breakpoint. (H) The breakpoint graph for the complex breakpoint showing an alternating path between X and Y.
Figure 2.
Closed chains of breakage and rejoining (CCBRs). (A) In an idealized version of a CCBR, no chromosomal material is lost or gained. (B) Actual CCBRs may involve small loss or gain of chromosomal material. For instance, the A2A3 and C2C3 sections of chromosomes A and C appear to have been lost, and the B2B3 section of chromosome B appears to have been duplicated. (C) The breakpoint graph for the CCBR in B showing (A1, A4) and (C1, C4) loss edges and a (B2, B3) gain edge.
Figure 3.
Performance of nFuse breakpoint prediction on breakpoints previously discovered in HCC1954. (A) Shown is the overlap between sets of breakpoints discovered by Bignell et al. (2007), Stephens et al. (2009), Galante et al. (2011), and nFuse. Previously discovered breakpoints are rediscovered by nFuse with a recall of 0.858. (B) Beanplot comparing nFuse breakpoint scores for a random selection of 3000 nFuse breakpoint predictions, and the 296 ‘true positive’ nFuse breakpoint predictions. Score is calculated as −log probability. The nFuse breakpoint scoring ranks true-positive breakpoints significantly higher (closer to zero) than random breakpoints, many of which are expected to be false positives.
Figure 4.
Complex breakpoints and polyfusions in HCC1954. (A–D) Complex breakpoints produce truncated GSDMC and PVT1 transcripts and ENDOD1-WASH2P and ZDHHC11-RNF130 fusion transcripts. Validated by LR-PCR. (E) A PHF20L1-FAM49B-SAMD12 polyfusion produces an in-frame PHF20L1-SAMD12 fusion transcript. Validated by LR-PCR. (F–G) Complex breakpoints corroborated by multiple fusion transcripts.
Figure 5.
CGRs discovered in primary tumor sample 963. (A) A single CCBR produces four fusion genes: MYC-ARHGEF17, ARHGEF17-SHANK2, SHANK2-HMGN2P46, and HMGN2P46-MYC. Only the ARHGEF17-SHANK2 and HMGN2P46-MYC fusion genes produce fusion transcripts. (B) Example of a CGR that is both a CCBR and a polyfusion involving three loci. The aberrant 1-7-11 chromosome produces three fusion transcripts: WDTC1-CD151, WDTC1-EFCAB4A, and WDTC1-PRKRIP1.

References

    1. Asmann YW, Hossain A, Necela BM, Middha S, Kalari KR, Sun Z, Chai HS, Williamson DW, Radisky D, Schroth GP, et al. 2011. A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines. Nucleic Acids Res 39: e100. doi: 10.1093/nar/gkr362
    2. Bashir A, Volik S, Collins C, Bafna V, Raphael BJ. 2008. Evaluation of paired-end sequencing strategies for detection of genome rearrangements in cancer. PLoS Comput Biol 4: e1000051. doi: 10.1371/journal.pcbi.1000051
    3. Berger MF, Levin JZ, Vijayendran K, Sivachenko A, Adiconis X, Maguire J, Johnson LA, Robinson J, Verhaak RG, Sougnez C, et al. 2010. Integrative analysis of the melanoma transcriptome. Genome Res 20: 413–427.
    4. Berger MF, Lawrence MS, Demichelis F, Drier Y, Cibulskis K, Sivachenko AY, Sboner A, Esgueva R, Pflueger D, Sougnez C, et al. 2011. The genomic complexity of primary human prostate cancer. Nature 470: 214–220.
    5. Bignell GR, Santarius T, Pole JC, Butler AP, Perry J, Pleasance E, Greenman C, Menzies A, Taylor S, Edkins S, et al. 2007. Architectures of somatic genomic rearrangement in human cancer amplicons at sequence-level resolution. Genome Res 17: 1296–1303.