Errors in RNA-Seq quantification affect genes of relevance to human disease
- PMID: 26335491
- PMCID: PMC4558956
- DOI: 10.1186/s13059-015-0734-x
Abstract
Background: RNA-Seq has emerged as the standard for measuring gene expression and is an important technique often used in studies of human disease. Gene expression quantification involves comparison of the sequenced reads to a known genomic or transcriptomic reference. The accuracy of that quantification relies on there being enough unique information in the reads to enable bioinformatics tools to accurately assign the reads to the correct gene.
Results: We apply 12 common methods to estimate gene expression from RNA-Seq data and show that there are hundreds of genes whose expression is underestimated by one or more of those methods. Many of these genes have been implicated in human disease, and we describe their roles. We go on to propose a two-stage analysis of RNA-Seq data in which multi-mapped or ambiguous reads can instead be uniquely assigned to groups of genes. We apply this method to a recently published mouse cancer study, and demonstrate that we can extract relevant biological signal from data that would otherwise have been discarded.
Conclusions: For hundreds of genes in the human genome, RNA-Seq is unable to measure expression accurately. These genes are enriched for gene families, and many of them have been implicated in human disease. We show that it is possible to use data that may otherwise have been discarded to measure group-level expression, and that such data contains biologically relevant information.
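The two-stage idea described in the Results can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `two_stage_counts`, the input layout (a dictionary from read ID to the genes the read aligns to), and the toy gene names are all assumptions made for illustration. Stage one counts reads that align to exactly one gene; stage two assigns each remaining multi-mapped read to the group of genes it aligns to, so that it is counted once at group level rather than discarded.

```python
from collections import Counter


def two_stage_counts(read_hits):
    """Count reads per gene (stage 1) and per gene group (stage 2).

    read_hits: dict mapping a read ID to an iterable of gene IDs the
    read aligns to (hypothetical input layout, assumed for this sketch).
    """
    gene_counts = Counter()   # reads that align to exactly one gene
    group_counts = Counter()  # multi-mapped reads, keyed by the set of genes hit

    for read_id, genes in read_hits.items():
        gene_set = frozenset(genes)
        if not gene_set:
            # Unaligned read: nothing to count.
            continue
        if len(gene_set) == 1:
            # Stage 1: the read is unique to a single gene.
            gene_counts[next(iter(gene_set))] += 1
        else:
            # Stage 2: the read is ambiguous between genes, but it is
            # unique to this group of genes, so count it for the group.
            group_counts[gene_set] += 1

    return gene_counts, group_counts


# Toy data with made-up gene names.
read_hits = {
    "r1": ["GENE_A"],
    "r2": ["GENE_A"],
    "r3": ["GENE_B", "GENE_C"],
    "r4": ["GENE_C", "GENE_B"],
    "r5": ["GENE_D"],
    "r6": [],
}
gene_counts, group_counts = two_stage_counts(read_hits)
```

In this toy input, reads `r3` and `r4` cannot be assigned to either `GENE_B` or `GENE_C` alone, yet both contribute to the count for the group containing those two genes. A real pipeline would derive the read-to-gene hits from alignments and an annotation, which this sketch leaves out.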
Similar articles
- RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 2011 Aug 4;12:323. doi: 10.1186/1471-2105-12-323. PMID: 21816040 Free PMC article.
- GIIRA--RNA-Seq driven gene finding incorporating ambiguous reads. Bioinformatics. 2014 Mar 1;30(5):606-13. doi: 10.1093/bioinformatics/btt577. Epub 2013 Oct 11. PMID: 24123675
- Assessing the impact of human genome annotation choice on RNA-seq expression estimates. BMC Bioinformatics. 2013;14 Suppl 11(Suppl 11):S8. doi: 10.1186/1471-2105-14-S11-S8. Epub 2013 Nov 4. PMID: 24564364 Free PMC article.
- Differential Expression Analysis of RNA-seq Reads: Overview, Taxonomy, and Tools. IEEE/ACM Trans Comput Biol Bioinform. 2020 Mar-Apr;17(2):566-586. doi: 10.1109/TCBB.2018.2873010. Epub 2018 Oct 1. PMID: 30281477 Review.
- Characterizing and annotating the genome using RNA-seq data. Sci China Life Sci. 2017 Feb;60(2):116-125. doi: 10.1007/s11427-015-0349-4. Epub 2016 Jun 13. PMID: 27294835 Review.
Cited by
- Comprehensive landscape of subtype-specific coding and non-coding RNA transcripts in breast cancer. Oncotarget. 2016 Oct 18;7(42):68851-68863. doi: 10.18632/oncotarget.11998. PMID: 27634900 Free PMC article.
- ReadXplorer 2-detailed read mapping analysis and visualization from one single source. Bioinformatics. 2016 Dec 15;32(24):3702-3708. doi: 10.1093/bioinformatics/btw541. Epub 2016 Aug 18. PMID: 27540267 Free PMC article.
- Neonatal ketone body elevation regulates postnatal heart development by promoting cardiomyocyte mitochondrial maturation and metabolic reprogramming. Cell Discov. 2022 Oct 11;8(1):106. doi: 10.1038/s41421-022-00447-6. PMID: 36220812 Free PMC article.
- Considerations and practical implications of performing a phenotypic CRISPR/Cas survival screen. PLoS One. 2022 Feb 17;17(2):e0263262. doi: 10.1371/journal.pone.0263262. eCollection 2022. PMID: 35176052 Free PMC article.
- Altered zinc balance in the Atp7b-/- mouse reveals a mechanism of copper toxicity in Wilson disease. Metallomics. 2018 Nov 14;10(11):1595-1606. doi: 10.1039/c8mt00199e. PMID: 30277246 Free PMC article.
Grants and funding
- BBS/E/D/20211550/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- BB/J004243/1/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- BBS/E/D/20211551/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- BB/J004235/1/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- G0900740/MRC_/Medical Research Council/United Kingdom