Differential expression in RNA-seq: a matter of depth
- PMID: 21903743
- PMCID: PMC3227109
- DOI: 10.1101/gr.124321.111
Abstract
Next-generation sequencing (NGS) technologies are revolutionizing genome research, and in particular, their application to transcriptomics (RNA-seq) is increasingly being used for gene expression profiling as a replacement for microarrays. However, the properties of RNA-seq data have not yet been fully established, and additional research is needed to understand how these data respond to differential expression analysis. In this work, we set out to gain insights into the characteristics of RNA-seq data analysis by studying an important parameter of this technology: the sequencing depth. We have analyzed how sequencing depth affects the detection of transcripts and their identification as differentially expressed, looking at aspects such as transcript biotype, length, expression level, and fold-change. We have evaluated different algorithms available for the analysis of RNA-seq and proposed a novel approach, NOISeq, that differs from existing methods in that it is data-adaptive and nonparametric. Our results reveal that most existing methodologies suffer from a strong dependency on sequencing depth for their differential expression calls and that this results in a considerable number of false positives that increases as the number of reads grows. In contrast, our proposed method models the noise distribution from the actual data, can therefore better adapt to the size of the data set, and is more effective in controlling the rate of false discoveries. This work discusses the true potential of RNA-seq for studying regulation at low expression ranges, the noise within RNA-seq data, and the issue of replication.
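The idea of modeling the noise distribution from the actual data can be illustrated with a small sketch. This is not the NOISeq algorithm itself, which the abstract does not specify; it is a simplified, hypothetical illustration of a data-adaptive, nonparametric score. It assumes the counts are already normalized for sequencing depth, that each condition has at least two replicates, and that a pseudocount of 0.5 is acceptable. The function names `log2_fold_changes` and `noise_scores` are invented for this example.

```python
import math
from bisect import bisect_left


def log2_fold_changes(a, b, pseudo=0.5):
    """Per-gene log2 ratio of two count vectors; the pseudocount avoids log(0)."""
    return [math.log2((x + pseudo) / (y + pseudo)) for x, y in zip(a, b)]


def noise_scores(cond1_reps, cond2_reps):
    """Score each gene by how far its between-condition change exceeds replicate noise.

    cond1_reps, cond2_reps: lists of replicates, each a list of per-gene counts
    (assumed to be depth-normalized already). Returns one score in [0, 1] per gene.
    """
    # Empirical noise: |log2 fold-change| between replicates of the SAME condition.
    noise = []
    for reps in (cond1_reps, cond2_reps):
        for i in range(len(reps)):
            for j in range(i + 1, len(reps)):
                noise.extend(abs(m) for m in log2_fold_changes(reps[i], reps[j]))
    noise.sort()

    # Signal: |log2 fold-change| between the per-gene means of the two conditions.
    mean1 = [sum(g) / len(g) for g in zip(*cond1_reps)]
    mean2 = [sum(g) / len(g) for g in zip(*cond2_reps)]
    signal = [abs(m) for m in log2_fold_changes(mean1, mean2)]

    # Score: fraction of noise values strictly smaller than the observed signal.
    return [bisect_left(noise, s) / len(noise) for s in signal]


if __name__ == "__main__":
    # Toy data: three genes, two replicates per condition; only gene 1 changes.
    cond1 = [[100, 50, 10], [110, 45, 12]]
    cond2 = [[105, 200, 11], [95, 210, 9]]
    print(noise_scores(cond1, cond2))
```

Because the reference distribution is built from the replicates themselves, the score depends on the variability observed in the data set rather than on a fixed parametric model of the counts. A gene is flagged only when its change between conditions is larger than most of the changes seen between replicates of the same condition.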