J Thorac Dis. 2023 Mar 31;15(3):1406-1425.
doi: 10.21037/jtd-23-238.

Integrating single-cell and bulk RNA sequencing to develop a cancer-associated fibroblast-related signature for immune infiltration prediction and prognosis in lung adenocarcinoma

Xiulin Huang et al. J Thorac Dis.

Abstract

Background: A growing number of studies highlight the impact of cancer-associated fibroblasts (CAFs) on the initiation, metastasis, invasion, and immune evasion of lung cancer. However, it remains unclear how to tailor treatment regimens to the transcriptomic characteristics of CAFs in the tumor microenvironment of patients with lung cancer.

Methods: We analyzed single-cell RNA-sequencing data from the Gene Expression Omnibus (GEO) database to identify the expression profiles of CAF marker genes, and used these genes to construct a prognostic signature for lung adenocarcinoma (LUAD) in The Cancer Genome Atlas (TCGA) database. The signature was validated in 3 independent GEO cohorts. Univariate and multivariate Cox regression analyses were used to confirm the clinical significance of the signature. Next, several gene enrichment analysis methods were used to explore the biological pathways related to the signature. Six algorithms were used to assess the relative proportion of infiltrating immune cells, and the relationship between the signature and the immunotherapy response of LUAD was explored with the tumor immune dysfunction and exclusion (TIDE) algorithm.
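The abstract does not give the signature formula or the cutoff used to define risk groups. As a minimal sketch of how a gene-based prognostic signature of this kind is commonly applied, the example below assumes a linear risk score (the sum of coefficient times expression over the signature genes) and a median cutoff for the high- and low-risk groups. The gene names, coefficients, expression values, and function names are all hypothetical and are not taken from the study.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Return a linear risk score per patient.

    Assumption: score = sum(coefficient * expression) over the signature genes.
    """
    return {
        patient: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for patient, values in expression.items()
    }


def split_by_median(scores):
    """Label patients 'high' if their score exceeds the cohort median, else 'low'.

    Assumption: a median cutoff; the abstract does not state the cutoff used.
    """
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
expression = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 2.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 1.0},
    "P4": {"GENE_A": 6.0, "GENE_B": 0.0},
}

scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
print(scores)
print(groups)
```

With a fixed coefficient set, the same function can be applied unchanged to an external cohort, which is the usual reason for keeping the score a simple weighted sum.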

Results: The CAF-related signature showed good accuracy and predictive capacity. In all clinical subgroups, high-risk patients had a poorer prognosis. The univariate and multivariate analyses confirmed that the signature was an independent prognostic marker. Moreover, the signature was closely associated with biological pathways related to the cell cycle, DNA replication, carcinogenesis, and immune response. The 6 algorithms used to assess the relative proportion of infiltrating immune cells indicated that high risk scores were associated with lower immune cell infiltration in the tumor microenvironment. Importantly, the risk score was negatively correlated with both the TIDE score and the exclusion score.
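The abstract does not say which correlation statistic was used for the relationship between the risk score and the TIDE and exclusion scores. The sketch below assumes a Spearman rank correlation, a common choice for scores of this kind, and implements it with the standard library only. The input values are hypothetical and chosen so that the relationship is perfectly monotone and decreasing.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical risk and TIDE scores, for illustration only.
risk = [0.1, 0.4, 0.9, 1.3]
tide = [0.8, 0.5, 0.2, -0.1]
rho = spearman(risk, tide)
print(rho)
```

A value of rho near -1 indicates that patients with higher risk scores tend to have lower TIDE scores; it says nothing about the size of the difference between groups.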

Conclusions: Our study constructed a prognostic signature, based on CAF marker genes, that is useful for predicting prognosis and estimating immune infiltration in lung adenocarcinoma. This tool could enhance therapy efficacy and allow individualized treatments.

Keywords: Single-cell RNA-sequencing; cancer-associated fibroblasts (CAFs); immune infiltration; lung adenocarcinoma (LUAD); prognostic signature.

Conflict of interest statement

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-23-238/coif). The authors have no conflicts of interest to declare.

Figures

Figure 1
Clinical significance of the CAF proportion in patients with LUAD. (A-E) Kaplan-Meier curves of overall survival analysis for the high- and low-CAF proportion groups in the GSE30219, GSE31210, GSE72094, TCGA, and GSE13213 cohorts. (F) Kaplan-Meier disease-free survival curves for the high- and low-CAF proportion groups in GSE31210. (G) Comparison of immune cell infiltration and immune-related pathways between the high- and low-CAF proportion groups. ns, P≥0.05; *, P<0.05; **, P<0.01; ***, P<0.001. CAF, cancer-associated fibroblast; LUAD, lung adenocarcinoma; TCGA, The Cancer Genome Atlas; aDCs, activated dendritic cells; APC, antigen-presenting cells; CCR, chemokine receptors; HLA, human leukocyte antigen; iDCs, immature dendritic cells; MHC, major histocompatibility complex; NK cells, natural killer cells; pDCs, plasmacytoid dendritic cells; TIL, tumor infiltrating lymphocyte; IFN, interferon; GAF, Global Assessment of Functioning.
Figure 2
Identification of CAF marker genes in LUAD with single-cell RNA-sequencing. (A) tSNE plot colored by cell population; (B) cell clusters identified with marker genes for each cell type; (C) heatmap showing the top 10 significantly expressed marker genes in 6 cell clusters. CAF, cancer-associated fibroblast; LUAD, lung adenocarcinoma; tSNE, t-distributed stochastic neighbor embedding.
Figure 3
Construction of the prognostic model based on CAF marker genes in TCGA database. (A) Volcano plot exhibiting the DEGs between lung cancer and normal tissues. (B) Selection of the λ in the LASSO model through 10-fold cross-validation. (C) Distribution of risk scores and patients’ survival times. (D) Kaplan-Meier overall survival curves in the high- and low-risk groups. (E) ROC curves for predicting mortality risk at 1, 3, and 5 years. (F) The tSNE analysis. (G) The PCA analysis. CAF, cancer-associated fibroblast; TCGA, The Cancer Genome Atlas; DEGs, differentially expressed genes; LASSO, least absolute shrinkage and selection operator; ROC, receiver operating characteristic; tSNE, t-distributed stochastic neighbor embedding; PCA, principal component analysis.
Figure 4
Prognostic signature validation in the 3 independent GEO cohorts. (A-C) Risk scores and survival status in the different cohorts. (D-F) Kaplan-Meier analysis of the overall survival in patients with LUAD in the 3 data sets. (G-I) Validation cohort ROC curves for overall survival at 1, 3, and 5 years. GEO, the Gene Expression Omnibus; AUC, area under curve; LUAD, lung adenocarcinoma; ROC, receiver operating characteristic.
Figure 5
Clinical relevance of the CAF-related signature in TCGA LUAD cohort. (A) Heatmap of the expression differences of 11 genes in the different risk groups of TCGA cohort annotated by clinical characteristics. (B) Comparison of risk scores in patients with lung cancer of TCGA cohort at different clinical stages (T stage, N stage, and M stage). (C,D) Forest plots of the univariate and multivariate Cox regression analysis for the risk score and other clinical characteristics in the training set. *, P<0.05; ***, P<0.001. CAF, cancer-associated fibroblast; TCGA, The Cancer Genome Atlas; LUAD, lung adenocarcinoma.
Figure 6
Analysis of the differentially associated signaling pathways between the high-risk and low-risk groups in TCGA cohort. (A) Mountain map showing the score variations in 5 oncogenic pathways between the 2 groups (Wilcoxon test). *, P<0.05; ***, P<0.001. (B) GO enrichment analysis of the DEGs between the high- and low-risk groups, grouped by functional themes. (C) The different statuses of the signaling pathways between the different groups according to GSVA enrichment analysis. (D-G) The status of biological pathways in the high-risk group according to GSEA. TCGA, The Cancer Genome Atlas; GO, Gene Ontology; DEGs, differentially expressed genes; GSVA, gene set variation analysis; ssGSEA, single-sample gene set enrichment analysis; ES, enrichment score.
Figure 7
Differences in immune cell infiltration characteristics and immune-related gene expression in the different risk groups. (A) Heatmap showing the immune infiltration status for the different risk groups. (B) Representative images of pathological HE staining of patients with the highest and lowest risk scores in TCGA database (TCGA pathology slide). (C) Heatmap showing the mRNA expression levels of chemokines, interleukins, interferons, and other cytokines in the high- and low-risk groups. (D) Exclusion and TIDE scores for the different risk groups. (E) Box plot showing the TME-related scores in the different groups. The upper and lower ends of the box correspond to the quartile ranges of values, lines to medians, and dots to outliers. ns, P≥0.05; ***, P<0.001; ****, P<0.0001. TCGA, The Cancer Genome Atlas; TIDE, tumor immune dysfunction and exclusion; TME, tumor microenvironment; EMT, epithelial-mesenchymal transition; Pan-F-TBRs, pan-fibroblast TGF-β response signature; HE, hematoxylin and eosin.
