2019 Dec 19;20(1):290.
doi: 10.1186/s13059-019-1852-7.

Genotype-free demultiplexing of pooled single-cell RNA-seq

Jun Xu et al. Genome Biol.

Abstract

A variety of methods have been developed to demultiplex pooled samples in a single-cell RNA sequencing (scRNA-seq) experiment, but they require either hashtag barcodes or sample genotypes prior to pooling. We introduce scSplit, which uses genetic differences inferred from scRNA-seq data alone to demultiplex pooled samples. scSplit also enables mapping clusters to original samples. Using simulated, merged, and pooled multi-individual datasets, we show that scSplit predictions are highly concordant with demuxlet predictions and highly consistent with the known truth in a cell-hashing dataset. scSplit is ideally suited to samples without external genotype information and is available at: https://github.com/jon-xu/scSplit.

Keywords: Allele fraction; Demultiplexing; Doublets; Expectation-maximization; Genotype-free; Hidden Markov Model; Machine learning; Unsupervised; scRNA-seq; scSplit.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Results on simulated, merged, and hash-tagged scRNA-seq datasets confirm that scSplit is a useful tool for demultiplexing pooled single cells. a Confusion matrix showing scSplit demultiplexing results on simulated 2-, 3-, 4- and 8-mix; b True positive rate (TPR) and false discovery rate (FDR) for singlets and doublets predicted by scSplit and demuxlet compared to the known truth before merging; c TPR and FDR for singlets and doublets predicted by scSplit and demuxlet compared to cell hashing tags
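The TPR and FDR values reported in these panels can be illustrated with a small sketch. It assumes the standard definitions, TPR = TP / (TP + FN) and FDR = FP / (TP + FP), applied to one class at a time; the function name `tpr_fdr` and the toy labels are hypothetical and are not taken from scSplit or demuxlet.

```python
import math

def tpr_fdr(predicted, truth, positive):
    """Return (TPR, FDR) for one class label, e.g. "S" for singlets.

    Assumes the standard definitions; the paper's exact counting
    conventions may differ.
    """
    pairs = list(zip(predicted, truth))
    tp = sum(p == positive and t == positive for p, t in pairs)
    fp = sum(p == positive and t != positive for p, t in pairs)
    fn = sum(p != positive and t == positive for p, t in pairs)
    tpr = tp / (tp + fn) if (tp + fn) else math.nan
    fdr = fp / (tp + fp) if (tp + fp) else math.nan
    return tpr, fdr

# Toy labels (hypothetical): "S" = singlet, "D" = doublet.
predicted = ["S", "S", "D", "S", "D", "S"]
truth     = ["S", "S", "S", "D", "D", "S"]
singlet_tpr, singlet_fdr = tpr_fdr(predicted, truth, "S")
print(singlet_tpr, singlet_fdr)  # 0.75 0.25
```

Here three of the four true singlets are recovered (TPR 0.75), and one of the four predicted singlets is actually a doublet (FDR 0.25).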
Fig. 2
Results of scSplit on pooled PBMC scRNA-seq data and on a set of pooled fibroblast samples. a Singlet TPR and FDR compared to demuxlet predictions on pooled PBMC scRNA-seq. b Violin plot of singlet TPR and FDR for five 7- or 8-mixed samples, scSplit vs demuxlet
Fig. 3
The batch effect found when comparing individual sequencing runs was obvious compared to that in pooled scRNA-seq data. a UMAP for three individually sequenced samples. b UMAP for three individually sequenced and normalized samples. c UMAP for pooled sequencing of the same three individual samples, with samples marked based on demultiplexing results using scSplit. d UMAP for pooled sequencing of the same three individual samples, normalized by total sample reads
Fig. 4
The overall pipeline of the scSplit tool. a SNVs identified based on reads from all cells, which may have similar or different genotypes. b Alternative and reference allele count matrices built from each read in the pooled-sequenced BAM at the identified informative SNVs. c Initial allele fraction model constructed from the initial cell seeds and their allele counts. d Expectation-maximization process to find the optimal allele fraction model, based on which the cells are assigned to clusters. e Presence/absence matrix of alternative alleles generated from the cell assignments. f Minimum set of distinguishing variants used to map clusters to samples
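Steps b to d of the pipeline can be sketched in a few lines of NumPy. This is a simplified illustration, not scSplit's implementation: it assumes a plain binomial mixture over per-cluster alternative allele fractions, a random soft initialization instead of the initial cell seeds described above, a fixed pseudocount, and no doublet handling. The function name `em_allele_fraction` and the simulated data are hypothetical.

```python
import numpy as np

def em_allele_fraction(alt, ref, n_clusters, n_iter=50, pseudo=1.0, seed=0):
    """Cluster cells by allele fraction with a basic EM loop.

    alt, ref: (n_snvs, n_cells) count matrices of alternative and
    reference alleles. Returns (allele fractions, hard cell labels).
    """
    rng = np.random.default_rng(seed)
    n_cells = alt.shape[1]
    total = alt + ref
    # Assumption: random soft assignments stand in for the initial cell seeds.
    resp = rng.dirichlet(np.ones(n_clusters), size=n_cells)  # (n_cells, k)
    for _ in range(n_iter):
        # M-step: per-cluster alternative allele fraction at each SNV,
        # smoothed by a pseudocount so it stays strictly inside (0, 1).
        p = (alt @ resp + pseudo) / (total @ resp + 2 * pseudo)  # (n_snvs, k)
        # E-step: binomial log-likelihood of each cell under each cluster.
        log_lik = alt.T @ np.log(p) + ref.T @ np.log1p(-p)  # (n_cells, k)
        log_lik -= log_lik.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_lik)
        resp /= resp.sum(axis=1, keepdims=True)
    return p, resp.argmax(axis=1)

# Toy data (simulated, hypothetical): two samples, sparse read depth.
rng = np.random.default_rng(1)
n_snvs, n_cells = 200, 60
truth = np.repeat([0, 1], n_cells // 2)
geno = rng.choice([0.01, 0.5, 0.99], size=(n_snvs, 2))  # allele fraction per sample
depth = rng.poisson(2.0, size=(n_snvs, n_cells))
alt = rng.binomial(depth, geno[:, truth])
ref = depth - alt

p, labels = em_allele_fraction(alt, ref, n_clusters=2)
# Cluster labels are arbitrary, so score agreement up to label swapping.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"agreement with simulated truth: {agreement:.2f}")
```

Because cluster indices carry no sample identity, the result still has to be mapped back to samples, which is the role of the distinguishing variants in steps e and f.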
