Genome Biol. 2019 Mar 22;20(1):63. doi: 10.1186/s13059-019-1662-y.

EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data

Aaron T L Lun¹, Samantha Riesenfeld², Tallulah Andrews³, The Phuong Dao⁴, Tomas Gomes³; participants in the 1st Human Cell Atlas Jamboree; John C Marioni⁵ ⁶ ⁷

Affiliations

¹ Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, UK. aaron.lun@cruk.cam.ac.uk.
² Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
³ Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
⁴ Program for Computational and Systems Biology, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, USA.
⁵ Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, UK. marioni@ebi.ac.uk.
⁶ Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK. marioni@ebi.ac.uk.
⁷ EMBL European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK. marioni@ebi.ac.uk.

PMID: 30902100
PMCID: PMC6431044
DOI: 10.1186/s13059-019-1662-y

Aaron T L Lun et al. Genome Biol. 2019.

Abstract

Droplet-based single-cell RNA sequencing protocols have dramatically increased the throughput of single-cell transcriptomics studies. A key computational challenge when processing these data is to distinguish libraries for real cells from empty droplets. Here, we describe a new statistical method for calling cells from droplet-based data, based on detecting significant deviations from the expression profile of the ambient solution. Using simulations, we demonstrate that EmptyDrops has greater power than existing approaches while controlling the false discovery rate among detected cells. Our method also retains distinct cell types that would have been discarded by existing methods in several real data sets.

Keywords: Cell detection; Droplet-based protocols; Empty droplets; Single-cell transcriptomics.
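
The idea in the abstract, testing each barcode for a significant deviation from the ambient expression profile, can be illustrated with a minimal Python sketch. It is not the published EmptyDrops implementation: it assumes a plain multinomial model, pools barcodes with a total count at or below a hypothetical cutoff `max_total` to estimate the ambient profile, adds a pseudo-count so that no gene has zero probability, computes Monte Carlo p-values, and applies a Benjamini-Hochberg correction. All function names, the toy counts, and the 5% threshold mentioned for the toy data are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def multinomial_loglik(counts, props):
    """Log-probability of a count vector under a multinomial with proportions `props`."""
    total = sum(counts)
    ll = math.lgamma(total + 1)
    for y, p in zip(counts, props):
        ll += y * math.log(p) - math.lgamma(y + 1)
    return ll


def estimate_ambient(count_matrix, max_total, pseudo=1.0):
    """Pool low-count barcodes (assumed empty) into ambient gene proportions."""
    n_genes = len(next(iter(count_matrix.values())))
    pooled = [pseudo] * n_genes  # pseudo-count is a simplification, not the paper's method
    for counts in count_matrix.values():
        if sum(counts) <= max_total:
            for g, y in enumerate(counts):
                pooled[g] += y
    scale = sum(pooled)
    return [x / scale for x in pooled]


def monte_carlo_pvalue(counts, props, n_iter, rng):
    """Fraction of ambient draws whose likelihood is as low as the observed one."""
    observed = multinomial_loglik(counts, props)
    total = sum(counts)
    genes = range(len(props))
    as_extreme = 0
    for _ in range(n_iter):
        draw = Counter(rng.choices(genes, weights=props, k=total))
        simulated = [draw.get(g, 0) for g in genes]
        if multinomial_loglik(simulated, props) <= observed:
            as_extreme += 1
    return (as_extreme + 1) / (n_iter + 1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def call_cells(count_matrix, max_total, n_iter=200, seed=0):
    """Adjusted p-value for every barcode with a total count above `max_total`."""
    rng = random.Random(seed)
    props = estimate_ambient(count_matrix, max_total)
    tested = [b for b, c in count_matrix.items() if sum(c) > max_total]
    pvals = [monte_carlo_pvalue(count_matrix[b], props, n_iter, rng) for b in tested]
    return dict(zip(tested, benjamini_hochberg(pvals)))


# Toy data with three genes: 20 low-count barcodes define the ambient profile.
toy = {f"ambient_{i}": [8, 1, 1] for i in range(20)}
toy["empty_like"] = [40, 5, 5]  # large total, but an ambient-like profile
toy["cell"] = [2, 3, 45]        # profile very different from the ambient one
adjusted = call_cells(toy, max_total=10)
```

In this toy run only `cell` falls below a 5% threshold. With so few Monte Carlo iterations the smallest attainable p-value is 1/(n_iter + 1), so a strict threshold such as the 0.1% FDR quoted in the figure captions would need far more iterations than this sketch uses.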

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

**Fig. 1**
Cell-calling results from different algorithms in simulations based on the PBMC dataset. Simulation scenarios are labelled as G₁/G₂, where G₁ and G₂ are the numbers of barcodes in the groups of large and small cells, respectively. (a) The recall for each method, defined as the proportion of cells from each group that were detected. EmptyDrops was used with an FDR threshold of 0.1%. (b) The observed FDR in the set of libraries detected by EmptyDrops at a range of nominal FDR thresholds (dotted lines), defined as the proportion of detected droplets that are empty. In both plots, each point represents the result of one simulation iteration, the bar represents the mean value across 10 iterations, and the error bars represent the standard error of the mean.

**Fig. 2**
Application of EmptyDrops and other cell detection methods to one sample of the placenta dataset. (a) A barcode rank plot showing the fitted spline used for knee point detection in EmptyDrops. The detected knee and inflection points are also shown. (b) The negative log-likelihood for each barcode in the multinomial model of EmptyDrops, plotted against the total count. Barcodes detected as putative cell-containing droplets at an FDR of 0.1% are labelled in red. Only barcodes with t_b > T are shown. (c) An UpSet plot [20] of the barcodes detected by each combination of methods (vertical bars). Horizontal bars represent the number of barcodes detected by each method. (d) Histogram outlines of the log-total count for barcodes detected by each method.

**Fig. 3**
t-SNE plots for the placenta dataset (a, b) or the 900 neuron dataset (c, d), constructed from barcodes that were detected with EmptyDrops and/or CellRanger. Each point represents a barcode and is colored based on (a, c) whether it was detected as a cell with each method; (b) the expression of monocyte marker genes *KCNA5*, *CFP*, *STX11*, and *S100A12*; or (d) the expression of interneuron marker genes *Gad1*, *Gad2*, and *Slc6a1*. Expression of the relevant marker set in each barcode was quantified as the sum of the normalized log-expression values across all marker genes. Arrows mark the putative monocyte and interneuron populations in each dataset.
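
The recall and observed FDR definitions given in the Fig. 1 caption, along with the mean and standard error summarised across simulation iterations, can be written out as a short Python sketch. The barcode names, group sizes, and per-iteration values below are invented for illustration and are not taken from the paper's simulations.

```python
import math
import statistics


def recall(detected, true_cells):
    """Proportion of the true cells in one group that were detected."""
    return len(detected & true_cells) / len(true_cells)


def observed_fdr(detected, empty):
    """Proportion of detected barcodes that are actually empty droplets."""
    if not detected:
        return 0.0
    return len(detected & empty) / len(detected)


def mean_and_sem(values):
    """Mean and standard error of the mean across simulation iterations."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


# One hypothetical simulation iteration.
large_cells = {"L1", "L2", "L3", "L4"}
small_cells = {"S1", "S2", "S3", "S4"}
empty = {"E1", "E2"}
detected = {"L1", "L2", "L3", "L4", "S1", "S2", "S3", "E1"}

recall_large = recall(detected, large_cells)  # 1.0
recall_small = recall(detected, small_cells)  # 0.75
fdr = observed_fdr(detected, empty)           # 0.125

# Hypothetical recall values from three iterations.
mean_recall, sem_recall = mean_and_sem([0.7, 0.8, 0.9])
```

Computing recall separately for each group, as the caption describes, makes it visible when a method recovers large cells well but misses small ones.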

References

1. Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, Trombetta JJ, Weitz DA, Sanes JR, Shalek AK, Regev A, McCarroll SA. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell. 2015;161(5):1202–14. doi: 10.1016/j.cell.2015.05.002.
2. Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L, Weitz DA, Kirschner MW. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell. 2015;161(5):1187–201. doi: 10.1016/j.cell.2015.04.044.
3. Zheng GX, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Wheeler TD, McDermott GP, Zhu J, Gregory MT, Shuga J, Montesclaros L, Underwood JG, Masquelier DA, Nishimura SY, Schnall-Levin M, Wyatt PW, Hindson CM, Bharadwaj R, Wong A, Ness KD, Beppu LW, Deeg HJ, McFarland C, Loeb KR, Valente WJ, Ericson NG, Stevens EA, Radich JP, Mikkelsen TS, Hindson BJ, Bielas JH. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:14049. doi: 10.1038/ncomms14049.
4. Islam S, Zeisel A, Joost S, La Manno G, Zajac P, Kasper M, Lonnerberg P, Linnarsson S. Quantitative single-cell RNA-seq with unique molecular identifiers. Nat Methods. 2014;11(2):163–6. doi: 10.1038/nmeth.2772.
5. Picelli S, Bjorklund AK, Faridani OR, Sagasser S, Winberg G, Sandberg R. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods. 2013;10(11):1096–8. doi: 10.1038/nmeth.2639.
