Brief Bioinform. 2023 Nov 22;25(1):bbad424.
doi: 10.1093/bib/bbad424.

Identifying phenotype-associated subpopulations through LP_SGL

Juntao Li et al. Brief Bioinform.

Abstract

Single-cell RNA sequencing (scRNA-seq) enables the resolution of cellular heterogeneity in diseases and facilitates the identification of novel cell types and subtypes. However, the grouping effects caused by cell–cell interactions are often overlooked in the development of tools for identifying subpopulations. We propose LP_SGL, which incorporates cell group structure to identify phenotype-associated subpopulations by integrating scRNA-seq, bulk expression and bulk phenotype data. Cell groups from scRNA-seq data were obtained by the Leiden algorithm, which facilitates the identification of subpopulations and improves model robustness. LP_SGL identified a higher percentage of cancer cells, T cells and tumor-associated cells than Scissor and scAB on lung adenocarcinoma diagnosis, melanoma drug response and liver cancer survival datasets, respectively. Biological analysis on three original datasets and four independent external validation sets demonstrated that the signaling genes of this cell subset can predict cancer diagnosis, immunotherapy response and survival.
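To make the idea of "incorporating cell group structure" concrete, here is a minimal sketch of a sparse-group-lasso style penalty over per-cell coefficients, where each cell carries a group label such as a Leiden cluster ID. This is an illustration under assumptions, not the authors' implementation: the abstract does not give the LP_SGL objective, so the reading of "SGL" as sparse group lasso, the function name, the `lam` and `alpha` parameters and the square-root group-size weighting are all hypothetical choices taken from the standard formulation.

```python
import math
from collections import defaultdict


def sparse_group_lasso_penalty(coefs, groups, lam=1.0, alpha=0.5):
    """Standard sparse group lasso penalty (assumed form, not taken from the paper).

    penalty = lam * (alpha * sum_i |b_i|
                     + (1 - alpha) * sum_g sqrt(n_g) * ||b_g||_2)

    coefs  : one coefficient per cell
    groups : one group label per cell (e.g. a Leiden cluster ID)
    """
    if len(coefs) != len(groups):
        raise ValueError("coefs and groups must have the same length")

    # Collect the coefficients belonging to each cell group.
    by_group = defaultdict(list)
    for c, g in zip(coefs, groups):
        by_group[g].append(c)

    # Cell-level sparsity term (L1 norm).
    l1_term = sum(abs(c) for c in coefs)

    # Group-level term: L2 norm of each group, weighted by sqrt(group size).
    group_term = sum(
        math.sqrt(len(v)) * math.sqrt(sum(c * c for c in v))
        for v in by_group.values()
    )
    return lam * (alpha * l1_term + (1 - alpha) * group_term)


if __name__ == "__main__":
    # Four cells in two groups; only the first group has nonzero coefficients.
    print(sparse_group_lasso_penalty([3.0, 4.0, 0.0, 0.0], [0, 0, 1, 1]))
```

With `alpha=1` the penalty reduces to a plain lasso over cells; with `alpha=0` it becomes a group lasso that keeps or drops whole cell groups together.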

Keywords: biological analysis; cell subpopulation; cell–cell interaction; data integration.


Figures

Figure 1
The workflow of LP_SGL.
Figure 2
Experimental results on the LUAD dataset. (A) UMAP visualization of 24 cell groups obtained using the Leiden algorithm. (B and C) Bar chart of the distribution of LP_SGL+ cells and LP_SGL- cells with respect to cell groups and cell types, respectively. (D) Line chart of the proportions of cancer cells contained in the LUAD phenotype cells identified by LP_SGL, Scissor and scAB. (E) Volcano map of DEGs between LP_SGL+ cells and LP_SGL- cells. (F and G) Box plot of GSVA scores for cancer and normal samples on TCGA-LUAD and GSE40419 datasets, respectively. (H) K-M survival curves of high- and low-risk group samples divided by the median prognostic score in the TCGA-LUAD dataset.
Figure 3
Experimental results on the melanoma dataset. (A) UMAP visualization of 17 cell groups obtained using the Leiden algorithm. (B) Bar chart of the distribution of LP_SGL+ cells with respect to cell types. (C) Line chart of the proportions of cancer cells contained in the response phenotype cells identified by LP_SGL, Scissor and scAB. (D) Volcano map of DEGs between LP_SGL+ cells and other cells. (E–G) Box plot of GSVA scores for response and non-response in PRJEB23709, GSE91061 and GSE181815 datasets, respectively. (H) GSEA plots of upregulated and downregulated biological processes (BP) of the overall DEGs. (I) GSEA plots of upregulated BP of the upregulated DEGs.
Figure 4
Experimental results on the liver cancer datasets. (A) UMAP visualization of 16 cell groups obtained using the Leiden algorithm. (B) Bar chart of the distribution of LP_SGL+ cells with respect to cell types. (C) Volcano map of DEGs between LP_SGL+ cells and LP_SGL- cells. (D) Bar chart of the average C-index over 10 repeated experiments. (E) The K-M survival curves of high- and low-risk group samples in the TCGA-LIHC dataset. (F and G) The survival and recurrence K-M curves of the high- and low-risk groups in the GSE14520 dataset, respectively. (H) Gene set enrichment analysis plots of upregulated Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway of DEGs.
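The C-index in panel D of Figure 4 measures how well a risk score orders patients by survival time. The captions do not say which variant was used, so the sketch below assumes Harrell's concordance index for right-censored data; the function and argument names are hypothetical and the code is illustrative only.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index (assumed variant) for right-censored survival data.

    times       : observed follow-up time per sample
    events      : 1/True if the event was observed, 0/False if censored
    risk_scores : higher score means higher predicted risk (shorter survival)
    """
    if not (len(times) == len(events) == len(risk_scores)):
        raise ValueError("all inputs must have the same length")

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored sample cannot be the earlier member of a pair
        for j in range(n):
            # Pair (i, j) is comparable when i had the event strictly before time j.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event has the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    print(concordance_index([5, 10, 15, 20], [1, 0, 1, 1], [0.9, 0.2, 0.5, 0.7]))
```

A value of 1.0 means every comparable pair is ordered correctly and 0.5 corresponds to random ordering.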
