BMC Bioinformatics. 2013;14 Suppl 7(Suppl 7):S2.
doi: 10.1186/1471-2105-14-S7-S2. Epub 2013 Apr 22.

State of art fusion-finder algorithms are suitable to detect transcription-induced chimeras in normal tissues?

Matteo Carrara et al. BMC Bioinformatics. 2013.

Abstract

Background: RNA-seq has the potential to discover genes created by chromosomal rearrangements. Fusion genes, also known as "chimeras", are formed by the breakage and re-joining of two different chromosomes. Chimeras have been implicated in the development of cancer. A few earlier publications also reported fusion events in normal tissue, but with very limited overlap among their results. More recently, two fusion genes in normal tissues were detected using both RNA-seq and protein data. Because of these heterogeneous results in identifying chimeras in normal tissue, we decided to evaluate the efficacy of state-of-the-art fusion finders in detecting chimeras in RNA-seq data from normal tissues.

Results: We compared the performance of six fusion-finder tools: FusionHunter, FusionMap, FusionFinder, MapSplice, deFuse and TopHat-fusion. To evaluate sensitivity, we used a synthetic dataset of fusion products, called the positive dataset; in these experiments FusionMap, FusionFinder, MapSplice and TopHat-fusion were able to detect more than 78% of the fusion genes. All tools were error-prone, with high variability among them, and identified some fusion genes not present in the synthetic dataset. To better investigate the false discovery rate of chimera detection, we used synthetic datasets free of fusion products, called negative datasets. The negative datasets have different read lengths and quality scores, which makes it possible to detect the dependency of the tools on both features. FusionMap, FusionFinder, MapSplice, deFuse and TopHat-fusion were error-prone. Only the FusionHunter results were free of false positives. FusionMap gave the best compromise between specificity in the negative dataset and sensitivity in the positive dataset.
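
The comparison described above can be illustrated with a minimal sketch, assuming each tool's output has already been reduced to a set of gene pairs. The function name `evaluate_calls`, the tool labels and the gene names are hypothetical placeholders and do not come from the study; the sketch only shows how sensitivity on a positive dataset and the false-positive count on a negative dataset could be derived from such sets.

```python
import math


def evaluate_calls(expected, called):
    """Compare one tool's called fusions with the fusions spiked into a dataset.

    Both arguments are iterables of (gene_5prime, gene_3prime) tuples.
    For a negative dataset, `expected` is empty and every call is a false positive.
    """
    expected = set(expected)
    called = set(called)
    true_pos = expected & called
    false_pos = called - expected
    # Sensitivity is undefined when no fusions are expected (negative dataset).
    sensitivity = len(true_pos) / len(expected) if expected else math.nan
    return {
        "true_positives": len(true_pos),
        "false_positives": len(false_pos),
        "sensitivity": sensitivity,
    }


# Hypothetical toy data, not taken from the paper.
expected = {
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_E", "GENE_F"),
    ("GENE_G", "GENE_H"),
}
calls = {
    "tool_x": {
        ("GENE_A", "GENE_B"),
        ("GENE_C", "GENE_D"),
        ("GENE_E", "GENE_F"),
        ("GENE_X", "GENE_Y"),  # not in the synthetic set: a false positive
    },
    "tool_y": {("GENE_A", "GENE_B")},
}

results = {tool: evaluate_calls(expected, c) for tool, c in calls.items()}

# Negative dataset: nothing is expected, so any call counts against the tool.
negative = evaluate_calls(set(), {("GENE_X", "GENE_Y")})
```

In this toy example a tool can look strong on sensitivity while still reporting fusions that were never spiked in, which is why the positive and negative datasets are scored separately.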

Conclusions: We have observed a dependency of the tools on read length, quality score and on the number of reads supporting each chimera. Thus, it is important to carefully select the software on the basis of the structure of the RNA-seq data under analysis. Furthermore, the sensitivity of chimera detection tools does not seem to be sufficient to provide results consistent with those obtained in normal tissues on the basis of fusion events extracted from published data.

Figures

Figure 1
Events involved in chimera formation. Chimeras that are not due to a pathology-associated genomic rearrangement may originate from two separate events: intergenic splicing and transgenic splicing. An intergenic splicing event combines exons from two adjacent genes on the same chromosome, while a transgenic splicing event combines exons from two genes located on different chromosomes.
Figure 2
Chimera detection in the positive dataset. The expected number of reads (open circle) associated with each chimera in the positive dataset is shown together with the reads detected by the six different fusion finders. THF: TopHat-fusion, FM: FusionMap, FH: FusionHunter, MS: MapSplice, DF: deFuse, FF: FusionFinder.
Figure 3
Distribution of the quality scores associated with lib100_1 and lib100_2. The same reads generated with BEERS software were associated with two different sets of quality scores. Upper panel: quality scores associated with lib100_1. Lower panel: quality scores associated with lib100_2. The lines at the bottom of the figure indicate the subset of quality scores used for generating the 2 × 50 and 2 × 75 nts fastq files.
Figure 4
Venn diagrams of genes detected as part of false chimeras in the negative datasets. FM) FusionMap shows a direct dependency of false chimeras on read length and a limited dependency of false chimera detection on the quality scores associated with the reads. FF) FusionFinder shows an inverse dependency of false chimeras on read length and a strong dependency of false chimera detection on the quality scores associated with the reads. THF) TopHat-fusion detects the highest number of false chimeras. Its dependency on read length is quite limited. DF) deFuse shows a direct dependency of false chimeras on read length. MS) MapSplice shows a significant dependency of false chimera detection on the quality scores associated with the reads. FusionHunter is not shown, since it is the only tool that does not detect false chimeras in the negative datasets.
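
The regions of a Venn diagram such as those in Figure 4 can be derived from plain set operations. The following is a rough sketch under stated assumptions: three negative datasets that differ only in read length, with hypothetical gene identifiers (`G1` to `G6`) and a hypothetical helper `venn3_regions`. It is not the procedure used by the authors.

```python
def venn3_regions(a, b, c):
    """Split three gene sets into the seven disjoint regions of a 3-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "only_a": a - b - c,
        "only_b": b - a - c,
        "only_c": c - a - b,
        "a_b": (a & b) - c,
        "a_c": (a & c) - b,
        "b_c": (b & c) - a,
        "a_b_c": a & b & c,
    }


# Hypothetical genes flagged as part of false chimeras by one tool,
# one set per negative dataset (read lengths are illustrative labels).
genes_50nt = {"G1", "G2", "G3"}
genes_75nt = {"G2", "G3", "G4"}
genes_100nt = {"G3", "G4", "G5", "G6"}

regions = venn3_regions(genes_50nt, genes_75nt, genes_100nt)
region_sizes = {name: len(genes) for name, genes in regions.items()}
```

Region sizes that grow from the shortest to the longest read length would correspond to what the caption calls a direct dependency on read length; shrinking sizes would correspond to an inverse dependency.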
