Automated single-cell omics end-to-end framework with data-driven batch inference
- PMID: 39366377
- PMCID: PMC11491117
- DOI: 10.1016/j.cels.2024.09.003
Abstract
To facilitate single-cell multi-omics analysis and improve reproducibility, we present single-cell pipeline for end-to-end data integration (SPEEDI), a fully automated end-to-end framework for batch inference, data integration, and cell-type labeling. SPEEDI introduces data-driven batch inference and transforms the often heterogeneous data matrices obtained from different samples into a uniformly annotated and integrated dataset. Without requiring user input, it automatically selects parameters and executes pre-processing, sample integration, and cell-type mapping. It can also perform downstream analyses of differential signals between treatment conditions and gene functional modules. SPEEDI's data-driven batch-inference method works with widely used integration and cell-typing tools. By developing data-driven batch inference, providing full end-to-end automation, and eliminating parameter selection, SPEEDI improves reproducibility and lowers the barrier to obtaining biological insight from these valuable single-cell datasets. The SPEEDI interactive web application can be accessed at https://speedi.princeton.edu/. A record of this paper's transparent peer review process is included in the supplemental information.
Keywords: batch identification; cell-type mapping; information theory; integration; scATAC-seq; scRNA-seq; single-cell genomics.
Copyright © 2024 Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of interests: S.C.S. is a consultant, equity owner, and interim chief scientific officer at GNOMX Corp. Patents were filed related to this work. O.G.T. is on the advisory board of Cell Systems.
Update of
- Automated single-cell omics end-to-end framework with data-driven batch inference. bioRxiv [Preprint]. 2024 Jun 20:2023.11.01.564815. doi: 10.1101/2023.11.01.564815. Update in: Cell Syst. 2024 Oct 16;15(10):982-990.e5. doi: 10.1016/j.cels.2024.09.003. PMID: 37961197. Free PMC article. Preprint.