MapSplice: accurate mapping of RNA-seq reads for splice junction discovery
- PMID: 20802226
- PMCID: PMC2952873
- DOI: 10.1093/nar/gkq622
Abstract
The accurate mapping of reads that span splice junctions is a critical component of all analytic techniques that work with RNA-seq data. We introduce a second-generation splice detection algorithm, MapSplice, whose focus is high sensitivity and specificity in the detection of splices as well as CPU and memory efficiency. MapSplice can be applied to both short (<75 bp) and long (≥75 bp) reads. MapSplice does not depend on splice site features or intron length; consequently, it can detect novel canonical as well as non-canonical splices. MapSplice leverages the quality and diversity of read alignments of a given splice to increase accuracy. We demonstrate that MapSplice achieves higher sensitivity and specificity than TopHat and SpliceMap on a set of simulated RNA-seq data. Experimental studies also support the accuracy of the algorithm. Splice junctions derived from eight breast cancer RNA-seq datasets recapitulated the extensiveness of alternative splicing on a global level as well as the differences between molecular subtypes of breast cancer. These combined results indicate that MapSplice is a highly accurate algorithm for the alignment of RNA-seq reads to splice junctions. Software download URL: http://www.netlab.uky.edu/p/bioinfo/MapSplice.
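To make the canonical versus non-canonical distinction concrete, the following minimal sketch labels a candidate junction by the two dinucleotides at the ends of its intron. It is not MapSplice code: the function name `classify_junction`, the 0-based half-open coordinates, the forward-strand-only handling, and the "semi-canonical" label for GC-AG and AT-AC are assumptions made for illustration.

```python
# Illustrative sketch only; not taken from MapSplice.
# Assumption: the intron occupies genome[intron_start:intron_end]
# (0-based, half-open) and is read on the forward strand.
CANONICAL = {("GT", "AG")}
SEMI_CANONICAL = {("GC", "AG"), ("AT", "AC")}  # labeling is an assumption


def classify_junction(genome: str, intron_start: int, intron_end: int) -> str:
    """Label a junction by the intron's first and last two bases."""
    if intron_end - intron_start < 4:
        raise ValueError("intron too short to hold donor and acceptor dinucleotides")
    donor = genome[intron_start:intron_start + 2].upper()
    acceptor = genome[intron_end - 2:intron_end].upper()
    motif = (donor, acceptor)
    if motif in CANONICAL:
        return "canonical"
    if motif in SEMI_CANONICAL:
        return "semi-canonical"
    return "non-canonical"
```

An aligner that filters candidates on such motifs would never report the third class; the abstract's point is that MapSplice does not rely on this feature, so all three classes remain detectable.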
Similar articles
- A probabilistic framework for aligning paired-end RNA-seq data. Bioinformatics. 2010 Aug 15;26(16):1950-7. doi: 10.1093/bioinformatics/btq336. Epub 2010 Jun 23. Bioinformatics. 2010. PMID: 20576625 Free PMC article.
- PASSion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-Seq data. Bioinformatics. 2012 Feb 15;28(4):479-86. doi: 10.1093/bioinformatics/btr712. Epub 2012 Jan 4. Bioinformatics. 2012. PMID: 22219203 Free PMC article.
- Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Res. 2010 Aug;38(14):4570-8. doi: 10.1093/nar/gkq211. Epub 2010 Apr 5. Nucleic Acids Res. 2010. PMID: 20371516 Free PMC article.
- Multiplexed primer extension sequencing: A targeted RNA-seq method that enables high-precision quantitation of mRNA splicing isoforms and rare pre-mRNA splicing intermediates. Methods. 2020 Apr 1;176:34-45. doi: 10.1016/j.ymeth.2019.05.013. Epub 2019 May 21. Methods. 2020. PMID: 31121301 Free PMC article. Review.
- Mapping RNA-seq reads to transcriptomes efficiently based on learning to hash method. Comput Biol Med. 2020 Jan;116:103539. doi: 10.1016/j.compbiomed.2019.103539. Epub 2019 Nov 13. Comput Biol Med. 2020. PMID: 31765913 Review.
Cited by
- Disease-Associated Circular RNAs: From Biology to Computational Identification. Biomed Res Int. 2020 Aug 17;2020:6798590. doi: 10.1155/2020/6798590. eCollection 2020. Biomed Res Int. 2020. PMID: 32908906 Free PMC article. Review.
- Trnp1 organizes diverse nuclear membrane-less compartments in neural stem cells. EMBO J. 2020 Aug 17;39(16):e103373. doi: 10.15252/embj.2019103373. Epub 2020 Jul 6. EMBO J. 2020. PMID: 32627867 Free PMC article.
- Analyzing Somatic Genome Rearrangements in Human Cancers by Using Whole-Exome Sequencing. Am J Hum Genet. 2016 May 5;98(5):843-856. doi: 10.1016/j.ajhg.2016.03.017. Am J Hum Genet. 2016. PMID: 27153396 Free PMC article.
- MNK Inhibition Disrupts Mesenchymal Glioma Stem Cells and Prolongs Survival in a Mouse Model of Glioblastoma. Mol Cancer Res. 2016 Oct;14(10):984-993. doi: 10.1158/1541-7786.MCR-16-0172. Epub 2016 Jun 30. Mol Cancer Res. 2016. PMID: 27364770 Free PMC article.
- A Protocol for the Detection of Fusion Transcripts Using RNA-Sequencing Data. Methods Mol Biol. 2024;2812:243-258. doi: 10.1007/978-1-0716-3886-6_14. Methods Mol Biol. 2024. PMID: 39068367