Review
Comput Struct Biotechnol J. 2025 Apr 2;27:1559-1569.
doi: 10.1016/j.csbj.2025.03.051. eCollection 2025.

Mapping Cell Identity from scRNA-seq: A primer on computational methods

Daniele Traversa et al. Comput Struct Biotechnol J.

Abstract

Single-cell (sc) technologies mark a conceptual and methodological breakthrough in the way we study cells, the basic units of life. Thanks to these technological developments, large-scale initiatives are now under way to map all the cell types in the human body, with the ambitious aim of gaining cell-level resolution of physiological development and disease. Owing to its broad applicability and ease of interpretation, scRNA-seq is probably the most common sc-based application. This assay uses high-throughput RNA sequencing to capture gene expression profiles at the single-cell level. Subsequently, under the assumption that differences in transcriptional programs correspond to distinct cellular identities, ad hoc computational methods are used to infer cell types from gene expression patterns. A wide array of computational methods has been developed for this task. However, depending on the underlying algorithmic approach and associated computational requirements, each method may have a specific range of application, with implications that are not always clear to the end user. Here we provide a concise overview of state-of-the-art computational methods for cell identity annotation in scRNA-seq, tailored to new users and non-computational scientists. To this end, we classify existing tools into five main categories and discuss their key strengths, limitations, and range of application.

Keywords: Cell identity; Cell type annotation; RNAseq; ScRNAseq; Transcriptomics.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Schematic representation of the five main conceptual frameworks used to implement computational methods for classifying cell type identities from scRNA-seq data. A) Marker-based (MB): the count matrix of unlabelled cells (green table) is annotated through ad hoc scoring systems that employ cell-type-specific gene lists (indicated by different colours). B) Classical machine learning (C-ML): machine learning algorithms are trained on labelled (orange) count matrix data (blue). C) Semi-supervised learning (SSL): labelled data (blue = count matrix, orange = annotation labels) and unlabelled data (green) are processed in the same analytical workflow to transfer labels. D) Deep learning (DL): a recent breakthrough in ML, in which neural networks are applied to learn from labelled data (blue = counts, orange = labels). E) Hybrid methods: any of the approaches in A to D are combined in a single workflow.
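
As a rough illustration of the marker-based (MB) framework in panel A, the sketch below scores each unlabelled cell against cell-type-specific gene lists and assigns the best-scoring label. It is a minimal toy example, not the scoring scheme of any particular published tool: the marker lists, the mean-expression score, the `min_score` threshold, and the names `MARKERS` and `annotate_cell` are all assumptions introduced for illustration, and the expression values are made up.

```python
from statistics import mean

# Hypothetical marker lists (illustrative only; real tools use curated databases).
MARKERS = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["CD79A", "MS4A1"],
}


def annotate_cell(expression, markers=MARKERS, min_score=0.0):
    """Label one cell from a {gene: normalized expression} dict.

    The score for each cell type is the mean expression of its marker
    genes (an assumed, deliberately simple scoring rule). Genes missing
    from the cell are treated as not expressed (0.0).
    """
    scores = {
        cell_type: mean(expression.get(gene, 0.0) for gene in genes)
        for cell_type, genes in markers.items()
    }
    best = max(scores, key=scores.get)
    # Cells with no marker signal above the threshold stay unlabelled.
    label = best if scores[best] > min_score else "unassigned"
    return label, scores


# Made-up toy cells: rows of a count matrix expressed as dicts.
cells = {
    "cell_1": {"CD3D": 2.0, "CD3E": 1.0, "CD79A": 0.0},
    "cell_2": {"CD79A": 3.0, "MS4A1": 1.0},
    "cell_3": {"GAPDH": 5.0},
}

labels = {cell_id: annotate_cell(expr)[0] for cell_id, expr in cells.items()}
for cell_id, label in labels.items():
    print(cell_id, label)
```

The explicit "unassigned" outcome reflects a practical point of marker-based annotation: a cell that expresses none of the listed markers should not be forced into one of the known categories.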
