Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression
- PMID: 31870423
- PMCID: PMC6927181
- DOI: 10.1186/s13059-019-1874-1
Abstract
Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from "regularized negative binomial regression," where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
Keywords: Normalization; Single-cell RNA-seq.
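To make the central idea of the abstract concrete, the following is a minimal sketch of Pearson residuals under a negative binomial model in which the expected count of each gene scales with cellular sequencing depth. It is not the sctransform implementation: the per-gene regularized regression is replaced here by a closed-form expected count, and a single shared dispersion `theta` stands in for the parameters that the authors obtain by pooling information across genes. The function name `nb_pearson_residuals`, the default `theta=100.0`, the clipping bound, and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np


def nb_pearson_residuals(counts, theta=100.0):
    """Simplified NB Pearson residuals for a genes x cells UMI count matrix.

    Assumptions (not taken from the paper): the expected count is
    gene_total * cell_depth / grand_total, and one dispersion value
    `theta` is shared by all genes.
    """
    counts = np.asarray(counts, dtype=float)
    depth = counts.sum(axis=0, keepdims=True)        # total UMIs per cell
    gene_total = counts.sum(axis=1, keepdims=True)   # total UMIs per gene
    mu = gene_total * depth / counts.sum()           # expected count per entry
    var = mu + mu ** 2 / theta                       # NB variance
    # Entries with zero variance (all-zero genes or cells) get residual 0.
    resid = np.divide(counts - mu, np.sqrt(var),
                      out=np.zeros_like(mu), where=var > 0)
    # Clip extreme residuals; sqrt(number of cells) is an illustrative bound.
    bound = np.sqrt(counts.shape[1])
    return np.clip(resid, -bound, bound)


# Simulated example: 50 genes, 200 cells with varying sequencing depth.
rng = np.random.default_rng(0)
depth_factor = rng.uniform(0.5, 2.0, size=200)
gene_mean = rng.gamma(shape=2.0, scale=1.0, size=50)
counts = rng.poisson(np.outer(gene_mean, depth_factor))
residuals = nb_pearson_residuals(counts)
```

The residual matrix has the same shape as the input and can be passed to downstream steps such as variable gene selection or dimensional reduction in place of log-transformed counts. For real analyses, the sctransform package referenced in the abstract is the appropriate tool.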
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data. Genome Biol. 2021 Sep 6;22(1):258. doi: 10.1186/s13059-021-02451-7. PMID: 34488842. Free PMC article.
- Comparison and evaluation of statistical error models for scRNA-seq. Genome Biol. 2022 Jan 18;23(1):27. doi: 10.1186/s13059-021-02584-9. PMID: 35042561. Free PMC article.
- Asc-Seurat: analytical single-cell Seurat-based web application. BMC Bioinformatics. 2021 Nov 18;22(1):556. doi: 10.1186/s12859-021-04472-2. PMID: 34794383. Free PMC article.
- Machine learning and statistical methods for clustering single-cell RNA-sequencing data. Brief Bioinform. 2020 Jul 15;21(4):1209-1223. doi: 10.1093/bib/bbz063. PMID: 31243426. Review.
- Single-cell RNA sequencing in breast cancer: Understanding tumor heterogeneity and paving roads to individualized therapy. Cancer Commun (Lond). 2020 Aug;40(8):329-344. doi: 10.1002/cac2.12078. Epub 2020 Jul 12. PMID: 32654419. Free PMC article. Review.
Cited by
- Age, sex, and cell type-resolved hypothalamic gene expression across the pubertal transition in mice. Biol Sex Differ. 2024 Oct 24;15(1):83. doi: 10.1186/s13293-024-00661-9. PMID: 39449090. Free PMC article.
- Distinct identities of leaf phloem cells revealed by single cell transcriptomics. Plant Cell. 2021 May 5;33(3):511-530. doi: 10.1093/plcell/koaa060. PMID: 33955487. Free PMC article.
- Single-cell RNA-seq analyses show that long non-coding RNAs are conspicuously expressed in Schistosoma mansoni gamete and tegument progenitor cell populations. Front Genet. 2022 Sep 20;13:924877. doi: 10.3389/fgene.2022.924877. eCollection 2022. PMID: 36204320. Free PMC article.
- Single-Cell RNA Sequencing and Quantitative Proteomics Analysis Elucidate Marker Genes and Molecular Mechanisms in Hypoplastic Left Heart Patients With Heart Failure. Front Cell Dev Biol. 2021 Feb 25;9:617853. doi: 10.3389/fcell.2021.617853. eCollection 2021. PMID: 33718359. Free PMC article.
- Neuromodulatory co-expression in cardiac vagal motor neurons of the dorsal motor nucleus of the vagus. iScience. 2024 Jul 19;27(8):110549. doi: 10.1016/j.isci.2024.110549. eCollection 2024 Aug 16. PMID: 39171288. Free PMC article.