IET Syst Biol. 2025 Jan-Dec;19(1):e12107.

doi: 10.1049/syb2.12107. Epub 2025 Apr 22.

scRSSL: Residual semi-supervised learning with deep generative models to automatically identify cell types

Yanru Gao¹, Hongyu Duan², Fanhao Meng¹, Conghui Zhang¹, Xiyue Li¹, Feng Li¹

Affiliations

¹ School of Computer Science, Qufu Normal University, Rizhao, China.
² Department of Statistics and Financial Mathematics, School of Mathematics, South China University of Technology, Guangzhou, China.

PMID: 40261690
PMCID: PMC12033026
DOI: 10.1049/syb2.12107

Yanru Gao et al. IET Syst Biol. 2025 Jan-Dec.

Abstract

Single-cell RNA sequencing (scRNA-seq) allows researchers to study cellular heterogeneity at the resolution of individual cells. In single-cell transcriptomics analysis, identifying the cell type of each cell is a key task. Current single-cell datasets are often high-dimensional, large, highly sparse and imbalanced across cell types, which challenges traditional cell type identification methods. To address these challenges, the authors propose scRSSL, a deep residual generative model based on semi-supervised learning. scRSSL introduces residual networks into semi-supervised generative models. Its semi-supervised learning is used to address sample imbalance, and during training a residual neural network performs cell type inference so that local features of single-cell data can be extracted. Because the approach is semi-supervised, it can automatically and accurately predict the cell types of individual cells in a dataset even when only a small number of cell labels are available. In the authors' experiments, scRSSL performed better than the other methods compared.

Keywords: bioinformatics; deep generative model; deep learning; semi‐supervised learning; single cell.

Conflict of interest statement

The authors declare no potential conflicts of interest.

Figures

**FIGURE 1**
Model framework summary. First, part of the labelled data is pre‐processed before entering the neural network model. The first hidden layer then compresses the data to produce an initial latent representation, designated z1. After separating the labelled cells from the unlabelled cells in z1, the cell type of the unlabelled cells is treated as a latent variable, and the variable y is introduced for the labelled cells. After the cell type is inferred by the residual neural network architecture, the unlabelled cells are combined with the labelled data to form the second latent representation, z2. Finally, the decoder neural network maps z2 to a negative binomial distribution over the original dataset.
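The two building blocks named in this caption, a residual network that infers the cell type and a negative binomial output distribution, can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the layer sizes, the single-block classifier, the random weights, the helper names, and the mean/dispersion parameterisation of the negative binomial.

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, weights, bias):
    # weights holds one row per output unit
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

def residual_block(v, weights, bias):
    """Return v + ReLU(W v + b). The skip connection needs a square W."""
    h = relu(linear(v, weights, bias))
    return [x + y for x, y in zip(v, h)]

def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def nb_log_prob(count, mean, dispersion):
    """Log-probability of a count under a negative binomial distribution.

    Assumed parameterisation: mean > 0 and inverse dispersion r > 0, with
    P(x) = Gamma(x + r) / (Gamma(r) x!) * (r / (r + mean))^r * (mean / (r + mean))^x.
    """
    r = dispersion
    return (math.lgamma(count + r) - math.lgamma(r) - math.lgamma(count + 1)
            + r * math.log(r / (r + mean))
            + count * math.log(mean / (r + mean)))

# Hypothetical 3-dimensional z1 and a 2-class classifier head (illustrative sizes).
rng = random.Random(0)
z1 = [0.5, -1.0, 2.0]
W_res = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(2)]
hidden = residual_block(z1, W_res, [0.0, 0.0, 0.0])
class_probs = softmax(linear(hidden, W_out, [0.0, 0.0]))

# Log-likelihood of one observed gene count under hypothetical decoder outputs.
log_lik = nb_log_prob(count=4, mean=2.0, dispersion=3.0)
```

With all-zero weights the residual block returns its input unchanged, which is the property that lets the skip connection pass features through when the learned transformation contributes little.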

**FIGURE 2**
Inter‐dataset experiment results. Box plot of the experimental results of scRSSL compared with seven other baseline methods, using the F1 score as the evaluation metric. The first row of the graph uses the Xin dataset as the reference dataset, with the remaining four datasets as the query datasets for prediction.
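For readers unfamiliar with the metric, the sketch below computes per-class F1 and its unweighted (macro) average from true and predicted labels. The caption does not say which averaging the authors used, so macro averaging is an assumption here, and the cell type labels in the example are made up.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Made-up labels for illustration only.
y_true = ["alpha", "alpha", "beta", "beta", "delta"]
y_pred = ["alpha", "beta", "beta", "beta", "delta"]
per_class = f1_per_class(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

Macro averaging weights every cell type equally, so a rare type that is predicted poorly lowers the score as much as a common one, which matters for imbalanced datasets.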

**FIGURE 3**
Intra‐dataset experiment result figure. Boxplot of the experimental results of scRSSL against seven other baseline methods on four datasets: Zeisel, Baron, Klein, and Romanov.

**FIGURE 4**
Confusion matrices of the experimental results. Each confusion matrix is obtained from one model's predictions on the hECA dataset, where 0–7 denote the eight cell types: Adipocyte, Cardiomyocyte cell, Endothelial cell, Fibroblast, Lymphoid cell, Myeloid cell, Pericyte, and Smooth muscle cell.
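A confusion matrix of this kind can be built as in the sketch below, where rows are true classes and columns are predicted classes, indexed 0–7 in the order the caption lists the cell types. The row/column orientation and the example labels are assumptions for illustration and are not the paper's data.

```python
CELL_TYPES = [
    "Adipocyte", "Cardiomyocyte cell", "Endothelial cell", "Fibroblast",
    "Lymphoid cell", "Myeloid cell", "Pericyte", "Smooth muscle cell",
]

def confusion_matrix(y_true, y_pred, n_classes):
    """matrix[t][p] counts cells of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def recall_per_class(matrix):
    """Diagonal count divided by row total; None for classes with no cells."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else None)
    return recalls

# Made-up integer labels for illustration only.
y_true = [0, 0, 3, 3, 7, 7]
y_pred = [0, 3, 3, 3, 7, 6]
cm = confusion_matrix(y_true, y_pred, len(CELL_TYPES))
recalls = recall_per_class(cm)
```

Off-diagonal entries show which pairs of types a model confuses; in this made-up example one Smooth muscle cell (7) is predicted as Pericyte (6).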

**FIGURE 5**
Visualisation of the prediction performance comparison of different cell types on four models: scRSSL, RF, KNN, and AdaBoost. In the ROC plot, the closer the curve is to the upper left corner, the better the performance of the model.

**FIGURE 6**
Visualisation of the prediction performance comparison of the four models scRSSL, RF, KNN, and AdaBoost. In the ROC plot, the closer the curve is to the top‐left corner, the better the performance of the model.
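To make the reading of the ROC plots concrete, the sketch below traces an ROC curve for one class against the rest and computes the area under it with the trapezoidal rule. Treating each cell type as a one-vs-rest binary problem is an assumption, since the captions do not describe how the curves were produced, and the scores are made up.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, lowering the threshold from high to low.

    labels are 1 for the class of interest and 0 otherwise; both classes
    must be present. Tied scores are handled as a single threshold step.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            if labels[order[j]] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up scores for one class versus the rest.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

A curve that hugs the upper left corner gives an area close to 1, while a curve along the diagonal gives 0.5, which is what random scoring would produce.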

**FIGURE 7**
Ablation experiment. F1 scores of two models, scRSSL and a scRSSL variant without the residual network module, on four datasets: Zeisel, Baron, Klein, and Romanov.
