Genome Biol. 2022 Jan 6;23(1):10.
doi: 10.1186/s13059-021-02588-5.

JAFFAL: detecting fusion genes with long-read transcriptome sequencing

Nadia M Davidson et al. Genome Biol. 2022.

Abstract

In cancer, fusions are important diagnostic markers and targets for therapy. Long-read transcriptome sequencing allows the discovery of fusions together with their full-length isoform structure. However, because long reads have higher sequencing error rates, fusion-finding algorithms designed for short reads do not work on them. Here we present JAFFAL, a method to identify fusions from long-read transcriptome sequencing. We validate JAFFAL using simulations, cell lines, and patient data from Nanopore and PacBio. We apply JAFFAL to single-cell data and find fusions spanning three genes, demonstrating that transcripts arising from complex rearrangements can be detected. JAFFAL is available at https://github.com/Oshlack/JAFFA/wiki.

Keywords: Fusions; Long reads; Nanopore; PacBio; RNA sequencing; Translocations.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
JAFFAL pipeline steps for fusion detection. Reads are aligned to the reference transcriptome; reads that split across different genes are identified as candidate fusion reads and then aligned to the reference genome for confirmation. Reads are clustered by breakpoint position, and the breakpoints are then ranked and reported (see text for details).
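The candidate-selection and ranking steps in this caption can be illustrated with a minimal Python sketch. It is not JAFFAL's implementation: the input tuple layout, the function names `candidate_fusion_reads` and `rank_fusions`, and the toy reads are all assumptions made for illustration. The genome-alignment confirmation step is omitted, and reads are grouped by gene chain rather than by exact breakpoint position.

```python
from collections import Counter, defaultdict


def candidate_fusion_reads(alignments):
    """Return {read_id: (gene, gene, ...)} for reads split across genes.

    `alignments` is a hypothetical iterable of
    (read_id, gene, read_start, read_end) tuples, one per aligned segment.
    """
    by_read = defaultdict(list)
    for read_id, gene, start, end in alignments:
        by_read[read_id].append((start, end, gene))

    candidates = {}
    for read_id, segments in by_read.items():
        segments.sort()  # order segments along the read
        genes = []
        for _, _, gene in segments:
            # collapse consecutive segments that hit the same gene
            if not genes or genes[-1] != gene:
                genes.append(gene)
        if len(genes) >= 2:  # the read spans more than one gene
            candidates[read_id] = tuple(genes)
    return candidates


def rank_fusions(candidates):
    """Rank gene chains by number of supporting reads, highest first."""
    return Counter(candidates.values()).most_common()


# Toy data (invented for illustration only)
toy_alignments = [
    ("r1", "BCR", 0, 500), ("r1", "ABL1", 500, 1200),
    ("r2", "BCR", 0, 450), ("r2", "ABL1", 450, 900),
    ("r3", "GAPDH", 0, 1000),
    ("r4", "BMPR2", 0, 300), ("r4", "TYW5", 300, 600),
    ("r4", "ALS2CR11", 600, 900),
]
toy_candidates = candidate_fusion_reads(toy_alignments)
toy_ranking = rank_fusions(toy_candidates)
```

Ranking here uses read support alone; a real pipeline would also need to filter alignment artifacts, which is what the genome-confirmation step in the caption is for.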
Fig. 2
Fusion-finding sensitivity on simulated ONT data with background. (A) The fraction of simulated fusions detected (y-axis) by JAFFAL across a range of fusion coverage levels (x-axis). Read identity levels are shown in different colors (red-purple). (B) The fraction of simulated fusions detected (y-axis) by JAFFAL and LongGF for sequence identity levels of 75–95%.
Fig. 3
Comparison of JAFFAL and LongGF on cancer cell line sequencing. Shown are ROC-style curves that rank previously validated fusions against other reported fusions for (A) MCF-7, HCT-116, A549, and K562 cell lines sequenced with ONT and (B) MCF-7, HCT-116, and SK-BR-3 cell lines sequenced with PacBio. (C) For MCF-7 only, high-confidence fusions from JAFFAL (crosses) are compared against three short-read Illumina replicates (squares) across three sequencing depths (colors). (D) The overlap between fusions called by JAFFAL (high and low confidence) and LongGF (> 1 read support) on MCF-7.
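One way to build a ROC-style curve of the kind described in this caption is to walk down a tool's ranked fusion list and count validated versus other calls cumulatively. The Python sketch below assumes that construction; the function name `roc_style_curve` and the placeholder fusion labels are hypothetical, and "other" calls are simply unvalidated, not necessarily false.

```python
def roc_style_curve(ranked_calls, validated):
    """Return cumulative (other_calls, validated_calls) points.

    `ranked_calls` is a list of fusion names, best-ranked first;
    `validated` is the set of previously validated fusions.
    """
    n_validated = 0
    n_other = 0
    points = [(0, 0)]
    for call in ranked_calls:
        if call in validated:
            n_validated += 1
        else:
            n_other += 1
        points.append((n_other, n_validated))
    return points


# Placeholder labels, not real results
toy_ranked = ["A-B", "C-D", "E-F", "G-H"]
toy_validated = {"A-B", "E-F"}
toy_curve = roc_style_curve(toy_ranked, toy_validated)
```

A curve that rises steeply before moving right indicates that validated fusions are concentrated near the top of the ranking.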
Fig. 4
Detection of fusions in single-cell ONT sequencing of five cell lines. (A) t-SNE plot generated from short-read gene expression. Color indicates the cell line in which a detected fusion is known to occur according to CCLE. Gray indicates a cell with no detected CCLE fusion. (B) For each of the 15 fusions detected by JAFFAL, the number of cells identified in each of the five clusters is shown. Fusion labels are colored according to the CCLE cell line they were previously identified in. Black indicates a novel fusion. (C) JAFFAL identified BMPR2-TYW5 and TYW5-ALS2CR11 in the H838 cell line as belonging to the same transcript and forming the three-gene fusion BMPR2-TYW5-ALS2CR11 identified in 15 reads (two different isoforms). Expressed exons in the fusion transcript are shown in blue, red, and green, with color indicating the gene of origin. Red bars show the position of translocations seen in short-read whole-genome sequencing of H838 in CCLE. The breakpoint within ALS2CR11 falls within its third final exon, and this exon appears to be spliced out. The six isoforms we identified for BMPR2-TYW5-ALS2CR11 and the number of long reads supporting each are also shown. The locations of the PCR forward and reverse primers that validated the translocation between BMPR2 and ALS2CR11 are shown in black (bottom).
