Brief Bioinform. 2024 Mar 27;25(3):bbae203.
doi: 10.1093/bib/bbae203.

scEWE: high-order element-wise weighted ensemble clustering for heterogeneity analysis of single-cell RNA-sequencing data

Yixiang Huang et al. Brief Bioinform.

Abstract

With the emergence of large amounts of single-cell RNA sequencing (scRNA-seq) data, computational methods have become critical for revealing biological mechanisms. Clustering is a representative approach for deciphering the cellular heterogeneity embedded in scRNA-seq data. However, owing to the diversity of datasets, no existing single-cell clustering method performs best on all datasets. Weighted ensemble methods have been proposed to integrate multiple clustering results and improve heterogeneity analysis. These methods usually assign weights according to the overall reliability of each base clustering result, ignoring the fact that the same base clustering can perform differently on different cells. In this paper, we propose scEWE, a self-representative ensemble learning framework based on a high-order element-wise weighting strategy. By assigning different base clustering weights to individual cells, we construct and optimize the consensus matrix at a finer granularity. In addition, we extract high-order information between cells, which enhances the representation of cell-to-cell similarity. Experiments show that scEWE significantly outperforms state-of-the-art methods, which demonstrates the effectiveness of the method and supports its potential application to complex single-cell data analysis problems.
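The two ideas named in the abstract, per-cell weights for each base clustering and high-order similarity between cells, can be illustrated with a minimal sketch. This is not the scEWE algorithm: the abstract does not give the objective or the weight-learning procedure, so the function names, the `cell_weights` input, the geometric-mean pair weight, and the use of cosine similarity between consensus rows as a stand-in for "high-order" information are all assumptions made for illustration.

```python
import numpy as np


def coassociation(labels):
    """Binary co-association matrix: entry (i, j) is 1 when cells i and j share a cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)


def elementwise_weighted_consensus(base_labels, cell_weights):
    """Combine m base clusterings of n cells using per-cell weights.

    base_labels:  (m, n) integer array, one row per base clustering.
    cell_weights: (m, n) non-negative array (hypothetical input); entry (k, i)
                  says how much base clustering k is trusted for cell i.
    """
    base_labels = np.asarray(base_labels)
    w = np.asarray(cell_weights, dtype=float)
    # Normalize so that, for every cell, the weights over base clusterings sum to 1.
    w = w / w.sum(axis=0, keepdims=True)
    n = base_labels.shape[1]
    consensus = np.zeros((n, n))
    for labels, wk in zip(base_labels, w):
        # Assumed pair weight: geometric mean of the two cells' weights (keeps symmetry).
        pair_w = np.sqrt(np.outer(wk, wk))
        consensus += pair_w * coassociation(labels)
    return consensus


def second_order_similarity(s):
    """Cosine similarity between rows of s: two cells are close when they
    relate to all other cells in a similar way (an assumed high-order proxy)."""
    norms = np.linalg.norm(s, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # guard against all-zero rows
    unit = s / norms
    return unit @ unit.T


# Toy example: two base clusterings of four cells, uniform weights.
base_labels = np.array([[0, 0, 1, 1],
                        [0, 1, 1, 1]])
cell_weights = np.ones_like(base_labels, dtype=float)
consensus = elementwise_weighted_consensus(base_labels, cell_weights)
high_order = second_order_similarity(consensus)
```

With uniform weights this reduces to the ordinary averaged co-association matrix; unequal columns in `cell_weights` let one base clustering dominate for some cells and another for others, which is the element-wise idea the abstract describes.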

Keywords: Element-wise; Ensemble Clustering; High-order Similarity; scRNA-seq data.

Figures

Figure 1. Flowchart for scEWE.
Figure 2. t-SNE visualization of embedding capability for the Usoskin dataset. Subfigures correspond to SHARP, SIMLR, CIDR, SC3, Seurat and scEWE. Colors denote the clusters given by the true labels.
Figure 3. t-SNE visualization of embedding capability for the Goolam dataset. Subfigures correspond to SHARP, SIMLR, CIDR, SC3, Seurat and scEWE. Colors denote the clusters given by the true labels.
Figure 4. t-SNE visualization of embedding capability for the Deng dataset. Subfigures correspond to SHARP, SIMLR, CIDR, SC3, Seurat and scEWE. Colors denote the clusters given by the true labels.
Figure 5. t-SNE visualization of embedding capability for the Treutlein dataset. Subfigures correspond to SHARP, SIMLR, CIDR, SC3, Seurat and scEWE. Colors denote the clusters given by the true labels.
Figure 6. Optimal cluster number determination in different datasets.
Figure 7. Data distributions in the considered datasets. The upper panels show t-SNE plots of the original datasets; the lower panels show the distribution of the nonzero ratio of attributes in each dataset.
Figure 8. Performance evaluation of scEWE with varied formula image (formula image and formula image).
Figure 9. Performance evaluation of scEWE with varied formula image (formula image).
Figure 10. Performance evaluation of scEWE with varied formula image.
Figure 11. Top biological process item in [38] by Metascape.

References

    1. Petegrosso R, Li Z, Kuang R. Machine learning and statistical methods for clustering single-cell RNA-sequencing data. Brief Bioinform 2020;21(4):1209–23.
    2. Hao J, Sohn LL, Haiyan H, Luonan C. Single cell clustering based on cell-pair differentiability correlation and variance analysis. Bioinformatics (Oxford, England) 34(21):3684–94.
    3. Tian T, Zhang J, Lin X, et al. Model-based deep embedding for constrained clustering analysis of single cell RNA-seq data. Nat Commun 2021;12(1):1873.
    4. Jiang H, Yi M, Zhang S. A kernel non-negative matrix factorization framework for single cell clustering. Appl Math Model 2021;90(1):875–88.
    5. Žurauskienė J, Yau C. pcaReduce: hierarchical clustering of single cell transcriptional profiles. BMC Bioinformatics 2016;17:1–11.
