BMC Bioinformatics. 2024 Aug 6;25(1):257.
doi: 10.1186/s12859-024-05880-w.

scMaui: a widely applicable deep learning framework for single-cell multiomics integration in the presence of batch effects and missing data

Yunhee Jeong et al. BMC Bioinformatics.

Abstract

The recent advances in high-throughput single-cell sequencing have created an urgent demand for computational models which can address the high complexity of single-cell multiomics data. Meticulous single-cell multiomics integration models are required to avoid biases towards a specific modality and overcome sparsity. Batch effects obfuscating biological signals must also be taken into account. Here, we introduce a new single-cell multiomics integration model, Single-cell Multiomics Autoencoder Integration (scMaui), based on variational product-of-experts autoencoders and adversarial learning. scMaui calculates a joint representation of multiple marginal distributions using a product-of-experts approach, which is especially effective when values are missing in some modalities. Furthermore, it overcomes limitations of previous VAE-based integration methods with regard to batch effect correction and the restricted range of applicable assays. It handles multiple batch effects independently, accepting both discrete and continuous values, and provides a variety of reconstruction loss functions to cover all possible assays and preprocessing pipelines. We demonstrate that scMaui achieves superior performance in many tasks compared to other methods. Further downstream analyses also demonstrate its potential in identifying relations between assays and discovering hidden subpopulations.
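
The product-of-experts step mentioned above can be illustrated with a minimal sketch. It assumes each modality's encoder outputs a diagonal Gaussian (a mean and a variance per latent dimension) and that a standard-normal prior is included as one extra expert. Both are common choices for product-of-experts VAEs, but they are assumptions here and not a description of the scMaui code. The function name `product_of_experts` and its signature are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# One expert = (mean, variance) per latent dimension for a single cell.
Expert = Tuple[Sequence[float], Sequence[float]]


def product_of_experts(
    experts: Sequence[Optional[Expert]], n_latent: int
) -> Tuple[List[float], List[float]]:
    """Combine per-modality diagonal Gaussians into one joint Gaussian.

    A modality that was not measured for this cell is passed as None and
    is simply left out of the product. A standard-normal prior expert
    (assumption) is always included, so a cell with no observed modality
    falls back to mean 0 and variance 1.
    """
    precision = [1.0] * n_latent      # prior expert: precision 1
    weighted_mean = [0.0] * n_latent  # prior expert: mean 0
    for expert in experts:
        if expert is None:            # missing modality contributes nothing
            continue
        mean, var = expert
        for k in range(n_latent):
            precision[k] += 1.0 / var[k]
            weighted_mean[k] += mean[k] / var[k]
    joint_var = [1.0 / p for p in precision]
    joint_mean = [w * v for w, v in zip(weighted_mean, joint_var)]
    return joint_mean, joint_var


# Two modalities observed, then the same cell with the second one missing.
both = product_of_experts([([2.0], [1.0]), ([4.0], [1.0])], n_latent=1)
one_missing = product_of_experts([([2.0], [1.0]), None], n_latent=1)
```

Because precisions add, a missing modality only widens the joint variance instead of requiring imputation before encoding. This is the property that makes the approach convenient for incomplete multiomics data.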

Keywords: Autoencoders; Deep learning; Multi-omics; Single cell.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Illustration of scMaui model overview and the training process. Each single-cell multiomics assay is given to an encoder, and batch effect factors are handled independently by covariates and adversary networks. Latent factors created by scMaui can be used in downstream analyses to find cellular heterogeneity (e.g. population and subpopulation clustering), and the assays reconstructed by the decoders can be used for imputation.
Fig. 2
Benchmarking results of single-cell multiomics integration methods. A Cell population classification AUC-ROC curves and mean AUC. B Classification AUC value for each population and each method. C UMAP representation of scMaui latent factor coloured by clustering result, ground-truth population, and subpopulation labels. D Batch effect silhouette score in each subpopulation. E Subpopulation silhouette score in each population. F Protein expression (antibody-derived tags, ADT) modality imputation task dataset overview (left) and correlation results between predicted and ground-truth values. In all boxplots, the middle bar marks the median and the two ends of the box mark the first and third quartiles.
Fig. 3
Subpopulation examination and cell-trajectory analyses using scMaui. A Correlation between 50 scMaui latent factors and T cell marker genes. B Distribution of latent values over CD4+ and CD8+ T cells. Only latent factors highly correlated with their marker genes were chosen. C UMAP plot of PCA derived from ATAC-seq assay (left) and scMaui latent factors (right) coloured by ATAC-seq pseudo-time order. D PAGA graph applied to scMaui latent factor of HSC, MK/E progenitor, and blast subpopulations. E UMAP plot of scMaui latent factor coloured by dendritic subpopulations (left) and Louvain clusters (right). F Dendritic subpopulation marker gene expression analysis in the detected Louvain clusters.
Fig. 4
Mouse embryo single-cell gene expression and methylation multiomics data analysis results. A UMAP plot of scMaui latent factor extracted from the entire dataset. The plots are coloured by embryo stage and by population. B Latent values in different stages of embryo cells. Latent factors 24 and 42 are presented on the left and in the middle. The right boxplot shows the latent factor 24 values by embryo stage only in epiblast cells. C Diagram of embryonic cell developmental lineage. D Latent values normalised between 0 and 1 and ordered by population. E Median correlation between each latent factor and methylation level of each region group. The groups were decided based on clustering methylation levels. F Gene expression of Rex2 and Susd2 over all single-cell samples.
Fig. 5
Summary of the evaluations conducted in our study. The darker the colour, the better the method performed in each task. Performance scores are normalised by dividing by the standard deviation in each task.
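
The per-task normalisation described for Fig. 5 can be sketched as follows. This is a hedged illustration only: the caption does not say whether the population or the sample standard deviation was used, so the population form (`statistics.pstdev`) is an assumption. The task and method names in the example are hypothetical placeholders, not results from the study.

```python
from statistics import pstdev
from typing import Dict


def normalise_by_task(
    scores: Dict[str, Dict[str, float]]
) -> Dict[str, Dict[str, float]]:
    """Divide every method's score by the standard deviation of its task.

    scores maps task name -> {method name -> raw score}. Tasks where all
    methods score the same (standard deviation 0) are mapped to 0.0 to
    avoid a division by zero; that fallback is a choice of this sketch.
    """
    normalised = {}
    for task, by_method in scores.items():
        sd = pstdev(list(by_method.values()))  # assumption: population SD
        normalised[task] = {
            method: (score / sd if sd > 0 else 0.0)
            for method, score in by_method.items()
        }
    return normalised


# Hypothetical placeholder scores, for illustration only.
example = normalise_by_task({"task_1": {"method_a": 1.0, "method_b": 3.0}})
```

Dividing by the per-task standard deviation puts tasks with very different score ranges on a comparable scale, which is what allows a single colour scale across the summary.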
