Bioinformatics. 2025 Aug 2;41(8):btaf402.
doi: 10.1093/bioinformatics/btaf402.

SCassist: an AI based workflow assistant for single-cell analysis

Vijayaraj Nagarajan et al. Bioinformatics. 2025.

Abstract

Summary: Single-cell RNA sequencing (scRNA-seq) data analysis often involves complex, iterative workflows that require significant expertise and time. To help researchers navigate this complexity, we have developed SCassist, an R package that uses large language models (LLMs) to guide and enhance scRNA-seq analysis. SCassist integrates LLMs into key workflow steps to analyze user data and recommend parameters for filtering, normalization, and clustering. It also provides LLM-guided interpretations of variable features and principal components, along with cell type annotations and enrichment analysis. SCassist supports popular LLMs such as Google's Gemini, OpenAI's GPT, and Meta's Llama3, making scRNA-seq analysis accessible to researchers at all levels.

Availability and implementation: The SCassist package, along with detailed tutorials, is available on GitHub at https://github.com/NIH-NEI/SCassist.

Figures

Figure 1.
The general architecture of the SCassist algorithm. SCassist, an LLM-powered assistant, streamlines single-cell analysis within the standard Seurat workflow. The top portion of the figure depicts the typical Seurat steps (quality control, normalization, dimensionality reduction, clustering, and annotation), while the interconnected pink boxes represent SCassist components, which provide data-driven insights and parameter recommendations for each step. SCassist can be used at any stage of the standard single-cell workflow, starting from the quality control stage, where the only user input is the Seurat object containing the raw count matrix. For the given Seurat object, SCassist generates metrics such as summary statistics, quantile data, and variance explained. These metrics are then used to build augmented prompts for large language models (LLMs). The LLM responses recommend parameters for filtering, normalization, and dimensionality reduction; identify significant features and offer insights (from variable genes, principal components, and differentially expressed genes); and annotate clusters with detailed reasoning.
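
The metrics-to-prompt step described in the caption can be illustrated with a minimal sketch. This is not SCassist's implementation (SCassist is an R package, and its internal functions are not shown here); it is a Python illustration that assumes toy per-cell quality-control values and uses hypothetical helper names (`summarize_metric`, `build_prompt`). The metric labels and the prompt wording are likewise assumptions, and no LLM is actually called.

```python
import statistics


def summarize_metric(values):
    """Return summary statistics and the 5%/95% quantiles for one QC metric.

    Hypothetical helper for illustration; not part of SCassist.
    """
    ordered = sorted(values)
    # With n=20, statistics.quantiles returns 19 cut points (5%, 10%, ..., 95%).
    cuts = statistics.quantiles(ordered, n=20, method="inclusive")
    return {
        "n_cells": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "mean": round(statistics.mean(ordered), 2),
        "max": ordered[-1],
        "q05": round(cuts[0], 2),
        "q95": round(cuts[-1], 2),
    }


def build_prompt(metric_summaries):
    """Embed the computed metrics in a text prompt (an 'augmented prompt')."""
    lines = [
        "You are assisting with single-cell RNA-seq quality control.",
        "Summary statistics per metric:",
    ]
    for name, stats in metric_summaries.items():
        formatted = ", ".join(f"{key}={value}" for key, value in stats.items())
        lines.append(f"- {name}: {formatted}")
    lines.append(
        "Recommend lower and upper filtering thresholds for each metric "
        "and explain your reasoning."
    )
    return "\n".join(lines)


# Toy per-cell values (assumed data, not taken from the paper).
n_features = [200, 850, 1200, 1500, 1800, 2100, 2500, 3000, 4200, 7500]
percent_mt = [0.5, 1.2, 2.0, 2.8, 3.1, 4.0, 4.7, 6.5, 12.0, 35.0]

feature_summary = summarize_metric(n_features)
mt_summary = summarize_metric(percent_mt)
prompt = build_prompt({"nFeature_RNA": feature_summary, "percent_mt": mt_summary})
print(prompt)
```

In a real workflow the resulting string would be sent to the chosen LLM, and the response would be reviewed by the analyst before any thresholds are applied.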
