Zero-shot evaluation reveals limitations of single-cell foundation models
- PMID: 40251685
- PMCID: PMC12007350
- DOI: 10.1186/s13059-025-03574-x
Abstract
Foundation models such as scGPT and Geneformer have not been rigorously evaluated in a setting where they are used without any further training (i.e., zero-shot). Understanding how models perform in zero-shot settings is critical for applications where fine-tuning is not possible, such as discovery settings where labels are unknown. Our evaluation of the zero-shot performance of Geneformer and scGPT suggests that, in some cases, these models may face reliability challenges and could be outperformed by simpler methods. Our findings underscore the importance of zero-shot evaluations in the development and deployment of foundation models in single-cell research.
Keywords: Foundation models; Machine learning; Single-cell.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethical approval and consent to participate: Not applicable. Competing interests: A.X.L., L.C., and A.P.A. are employees of and hold equity in Microsoft.