PLoS One. 2012;7(6):e39987.
doi: 10.1371/journal.pone.0039987. Epub 2012 Jun 27.

FusionFinder: a software tool to identify expressed gene fusion candidates from RNA-Seq data

Richard W Francis et al. PLoS One. 2012.

Abstract

The hallmarks of many haematological malignancies and solid tumours are chromosomal translocations, which may lead to gene fusions. Recently, next-generation sequencing techniques at the transcriptome level (RNA-Seq) have been used to verify known and discover novel transcribed gene fusions. We present FusionFinder, a Perl-based software tool designed to automate the discovery of candidate gene fusion partners from single-end (SE) or paired-end (PE) RNA-Seq read data. FusionFinder was applied to data from a previously published analysis of the K562 chronic myeloid leukaemia (CML) cell line. Using FusionFinder we successfully replicated the findings of this study and detected additional previously unreported fusion genes in their dataset, which were confirmed experimentally. These included two isoforms of a fusion involving the genes BRK1 and VHL, whose co-deletion has previously been associated with the prevalence and severity of renal-cell carcinoma. FusionFinder is made freely available for non-commercial use and can be downloaded from the project website (http://bioinformatics.childhealthresearch.org.au/software/fusionfinder/).

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. FusionFinder rationale.
A) RNA-Seq produces millions of short reads, some of which will span the exon boundaries of hypothetical fusion transcripts between Gene 1 and Gene 2. Two different fusion isoforms involving different exons are shown, left and right, along with a single read that spans each breakpoint. Reads are split into smaller pseudo PE reads which can be aligned independently to a reference transcriptome. B) Alignment of pseudo PE reads against the reference transcriptome. One of each pair aligns to an exon on Gene 1 and the other aligns to an exon on Gene 2. Repeating this process for all other RNA-Seq reads creates “alignment blocks” from overlapping groups of aligned 5′ and 3′ pseudo PE reads and their genomic coordinates. Multiple alignment blocks on either gene (as for Gene 1 in the example) provide evidence for the existence of different isoforms of the fusion.
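The read-splitting and block-building steps described in this caption can be illustrated with a minimal Python sketch. It is not FusionFinder's code (the tool itself is written in Perl), and the alignment step is left out entirely. The function names, the 25-base end length, and the coordinates in the usage lines are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple


def split_into_pseudo_pairs(read: str, end_length: int) -> Tuple[str, str]:
    """Cut the 5' and 3' ends off one read to form a pseudo paired-end pair.

    end_length is a hypothetical parameter, not a documented FusionFinder setting.
    """
    if end_length <= 0 or 2 * end_length > len(read):
        raise ValueError("end_length must be positive and at most half the read length")
    return read[:end_length], read[-end_length:]


def is_fusion_candidate(gene_5p: Optional[str], gene_3p: Optional[str]) -> bool:
    """A pair supports a fusion when both ends align, but to different genes.

    The gene labels stand in for the output of an aligner, which is not modelled here.
    """
    return gene_5p is not None and gene_3p is not None and gene_5p != gene_3p


def merge_into_blocks(intervals: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping (start, end) alignments on one gene into alignment blocks."""
    blocks: List[Tuple[int, int]] = []
    for start, end in sorted(intervals):
        if blocks and start <= blocks[-1][1]:
            # Overlaps the previous block, so extend it.
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))
        else:
            blocks.append((start, end))
    return blocks


# Usage with made-up values: a 76-base read and three aligned 5' ends on Gene 1.
read = "ACGT" * 19
five_prime, three_prime = split_into_pseudo_pairs(read, 25)
gene1_blocks = merge_into_blocks([(100, 125), (110, 135), (300, 325)])
```

In this toy input the three alignments collapse into two separate blocks on Gene 1, which corresponds to the situation the caption describes as evidence for more than one fusion isoform.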
Figure 2. Identification of the transcript breakpoint in each PRIM1:NACA isoform.
Alignments of the full-length 76mer reads providing evidence for the two isoforms of PRIM1:NACA (i.e. as originally identified by Levin et al., top, and the novel isoform identified by FusionFinder, bottom) against the last 30 bases of the implicated PRIM1 (G1) exon and the first 30 bases of the NACA (G2) exon. The transcript breakpoint can be clearly seen where the PRIM1 exon ends and the NACA exon begins. Also displayed is an in-frame translation of the G1 exon from wild type PRIM1, running into the fused NACA exon. Both isoforms retain an open reading frame despite different exon usage.
Figure 3. RT-PCR validation of the fusion candidates.
Primers were designed around the individual fusion breakpoints and cDNA was synthesised using gene-specific primers. Products were successfully amplified for the following fusion isoforms: BCR:ABL (380 bp, lane 1), PRIM1:NACA isoform 1 (400 bp, lane 2), PRIM1:NACA isoform 2 (340 bp, lane 3), C3orf10:VHL isoform 2 (340 bp, lane 6), ACCS:EXT2 isoform 3 (230 bp, lane 9) and SLC29A1:HSP90AB1 (340 bp, lane 10). No product could be amplified from CEP170:RAD51L1 (lane 4), C3orf10:VHL isoform 1 (lane 5), ACCS:EXT2 isoform 1 or ACCS:EXT2 isoform 2 (lanes 7 and 8). The corresponding negative controls for each reaction are in the lanes proceeding each reaction. All detected fusion products were validated by Sanger sequencing.
Figure 4. Comparison of sensitivity and PPV for FusionFinder, FusionMap and Tophat-Fusion.
To compare the sensitivity and positive predictive value (PPV) of FusionFinder, FusionMap and Tophat-Fusion in detecting fusion genes, each tool was used to analyse a randomly generated dataset simulating normal genes and 55 fusion genes. Sensitivity and PPV were calculated for subgroups of the results based on the number of reads evidencing the fusion genes predicted by each tool. FusionFinder shows consistently higher sensitivity than both FusionMap and Tophat-Fusion, a generally higher PPV than FusionMap, and a PPV similar to that of Tophat-Fusion.
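For readers unfamiliar with the two metrics, the following Python sketch shows the standard definitions: sensitivity is TP / (TP + FN) and PPV is TP / (TP + FP). The counts in the usage lines are hypothetical and are not results from the paper; only the total of 55 simulated fusion genes comes from the caption.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of the real (simulated) fusions that the tool reported."""
    total_real = true_positives + false_negatives
    if total_real == 0:
        raise ValueError("no real fusions to detect")
    return true_positives / total_real


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of the reported fusions that are real."""
    total_reported = true_positives + false_positives
    if total_reported == 0:
        raise ValueError("the tool reported no fusions")
    return true_positives / total_reported


# Hypothetical counts for one tool: 44 of the 55 simulated fusions found,
# 11 missed, and 4 reported fusions that were not in the simulation.
example_sensitivity = sensitivity(true_positives=44, false_negatives=11)
example_ppv = positive_predictive_value(true_positives=44, false_positives=4)
```

Repeating the calculation for each read-support subgroup would give one sensitivity and one PPV value per subgroup and per tool, which is the kind of comparison the caption describes.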
