Nat Methods. 2017 Feb;14(2):135-139.
doi: 10.1038/nmeth.4106. Epub 2016 Dec 12.

Simulation-based comprehensive benchmarking of RNA-seq aligners

Giacomo Baruzzo et al. Nat Methods. 2017 Feb.

Abstract

Alignment is the first step in most RNA-seq analysis pipelines, and the accuracy of downstream analyses depends heavily on it. Unlike most steps in the pipeline, alignment is particularly amenable to benchmarking with simulated data. We performed a comprehensive benchmarking of 14 common splice-aware aligners for base, read, and exon junction-level accuracy and compared default with optimized parameters. We found that performance varied by genome complexity, and accuracy and popularity were poorly correlated. The most widely cited tool underperforms for most metrics, particularly when using default settings.

Conflict of interest statement

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

Figures

Figure 1
Base-level precision and recall for human and malaria data sets.
Figure 2
Junction-level precision and recall for human and malaria data sets.
Figure 3
The effect of tuning parameters on the human-T3-data base-level statistics. For each tool, the figure shows the alignment statistics for the ‘default’ (d) and the ‘tuned’ (t) alignments.
Figure 4
Runtime performance on human and malaria data. Bars show average runtime in minutes from three replicates. Error bars, s.d. Note that Novoalign has no multithreading in its free-license versions. To obtain comparable results, we divided the Novoalign runtime by the number of threads used (16). However, the real scalability could be different from the ideal one used here, resulting in a longer execution time.
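The runtime normalization described in the Figure 4 caption can be sketched as follows. This is only an illustration under stated assumptions: the function name `summarize_runtime` and all runtime values are hypothetical and not taken from the paper, and the sample standard deviation is assumed because the caption does not say which s.d. estimator was used.

```python
from statistics import mean, stdev

THREADS = 16  # thread count stated in the Figure 4 caption


def summarize_runtime(replicate_minutes, single_threaded=False, threads=THREADS):
    """Return (mean, s.d.) of replicate runtimes in minutes.

    For a tool that ran on a single thread, each runtime is first divided
    by the thread count, which assumes ideal (linear) scaling. Real
    scaling is usually worse, so the adjusted value is optimistic.
    """
    if single_threaded:
        replicate_minutes = [m / threads for m in replicate_minutes]
    # Sample standard deviation (n - 1 denominator); an assumption here.
    return mean(replicate_minutes), stdev(replicate_minutes)


# Hypothetical runtimes in minutes for three replicates (not from the paper).
multi_threaded_tool = summarize_runtime([30.0, 32.0, 34.0])
single_threaded_tool = summarize_runtime([480.0, 512.0, 544.0], single_threaded=True)

print(multi_threaded_tool)
print(single_threaded_tool)
```

Because the division assumes perfect scaling across all 16 threads, the adjusted figure is best read as a lower bound on the runtime the single-threaded tool would need if it were multithreaded.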
