scGPT: toward building a foundation model for single-cell multi-omics using generative AI
- PMID: 38409223
- DOI: 10.1038/s41592-024-02201-0
scGPT: toward building a foundation model for single-cell multi-omics using generative AI
Abstract
Generative pretrained models have achieved remarkable success in various domains such as language and computer vision. Specifically, the combination of large-scale diverse datasets and pretrained transformers has emerged as a promising approach for developing foundation models. Drawing parallels between language and cellular biology (in which texts comprise words; similarly, cells are defined by genes), our study probes the applicability of foundation models to advance cellular biology and genetic research. Using burgeoning single-cell sequencing data, we have constructed a foundation model for single-cell biology, scGPT, based on a generative pretrained transformer across a repository of over 33 million cells. Our findings illustrate that scGPT effectively distills critical biological insights concerning genes and cells. Through further adaptation of transfer learning, scGPT can be optimized to achieve superior performance across diverse downstream applications. This includes tasks such as cell type annotation, multi-batch integration, multi-omic integration, perturbation response prediction and gene network inference.
© 2024. The Author(s), under exclusive licence to Springer Nature America, Inc.
References
-
- Silverman, A. D., Karim, A. S. & Jewett, M. C. Cell-free gene expression: an expanded repertoire of applications. Nat. Rev. Genet. 21, 151–170 (2020). - PubMed
-
- Preissl, S., Gaulton, K. J. & Ren, B. Characterizing cis-regulatory elements using single-cell epigenomics. Nat. Rev. Genet. 24, 21–43 (2022).
MeSH terms
Grants and funding
- RGPIN-2020-06189/Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada (NSERC Canadian Network for Research and Innovation in Machining Technology)
- DGECR-2020-00294/Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada (NSERC Canadian Network for Research and Innovation in Machining Technology)
- Doctoral Fellowship/Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada (NSERC Canadian Network for Research and Innovation in Machining Technology)
- Peter Munk Cardiac Centre AI Fund/University Health Network (UHN)
LinkOut - more resources
Full Text Sources
Other Literature Sources
