Front Genet. 2023 Apr 4:14:1106724.
doi: 10.3389/fgene.2023.1106724. eCollection 2023.

Integrating multiple machine learning algorithms for prognostic prediction of gastric cancer based on immune-related lncRNAs

Guoqi Li et al. Front Genet. 2023.

Abstract

Background: Long non-coding RNAs (lncRNAs) play an important role in the immune regulation of gastric cancer (GC). However, the clinical application value of immune-related lncRNAs has not been fully developed. It is of great significance to overcome the challenges of prognostic prediction and classification of gastric cancer patients based on the current study.
Methods: In this study, the R package ImmLnc was used to obtain immune-related lncRNAs of The Cancer Genome Atlas Stomach Adenocarcinoma (TCGA-STAD) project, and univariate Cox regression analysis was performed to find prognostic immune-related lncRNAs. A total of 117 combinations based on 10 algorithms were integrated to determine the immune-related lncRNA prognostic model (ILPM). According to the ILPM, the least absolute shrinkage and selection operator (LASSO) regression was employed to find the major lncRNAs and develop the risk model. ssGSEA, CIBERSORT algorithm, the R package maftools, pRRophetic, and clusterProfiler were employed for measuring the proportion of immune cells among risk groups, genomic mutation difference, drug sensitivity analysis, and pathway enrichment score.
Results: A total of 321 immune-related lncRNAs were found, and there were 26 prognostic immune-related lncRNAs. According to the ILPM, 18 of 26 lncRNAs were selected and the risk score (RS) developed by the 18-lncRNA signature had good strength in the TCGA training set and Gene Expression Omnibus (GEO) validation datasets. Patients were divided into high- and low-risk groups according to the median RS, and the low-risk group had a better prognosis, tumor immune microenvironment, and tumor signature enrichment score and a higher metabolism, frequency of genomic mutations, proportion of immune cell infiltration, and antitumor drug resistance. Furthermore, 86 differentially expressed genes (DEGs) between high- and low-risk groups were mainly enriched in immune-related pathways.
Conclusion: The ILPM developed based on 26 prognostic immune-related lncRNAs can help in predicting the prognosis of patients suffering from gastric cancer. Precision medicine can be effectively carried out by dividing patients into high- and low-risk groups according to the RS.
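The median split described in the Results can be illustrated with a minimal Python sketch. It assumes the RS is the usual linear predictor of a LASSO Cox model (the sum of coefficient times expression over the selected lncRNAs), which the abstract does not state explicitly. The lncRNA names, coefficients, expression values, and function names below are hypothetical toy data, not values from the study.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Return a risk score per patient as sum(coefficient * expression).

    Assumption: the RS is a linear combination of lncRNA expression values,
    as is common for LASSO Cox signatures; the abstract does not give the formula.
    """
    return {
        patient: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for patient, values in expression.items()
    }


def split_by_median(scores):
    """Label patients 'high' if their score is above the median, else 'low'."""
    cutoff = median(scores.values())
    groups = {p: ("high" if s > cutoff else "low") for p, s in scores.items()}
    return cutoff, groups


# Hypothetical coefficients for two made-up lncRNAs.
coefficients = {"lncA": 0.5, "lncB": -0.25}

# Hypothetical expression values for four made-up patients.
expression = {
    "P1": {"lncA": 2.0, "lncB": 4.0},
    "P2": {"lncA": 4.0, "lncB": 0.0},
    "P3": {"lncA": 0.0, "lncB": 4.0},
    "P4": {"lncA": 8.0, "lncB": 4.0},
}

scores = risk_scores(expression, coefficients)
cutoff, groups = split_by_median(scores)
print(scores)
print(cutoff, groups)
```

Patients whose score equals the cutoff are placed in the low-risk group here; that tie rule is a choice made for this sketch only.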

Keywords: ILPM; gastric cancer; immunity; machine learning; prognosis.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Workflow of this study.
FIGURE 2
Calculation of the C-index of 117 integrated machine learning algorithms. (A) ImmLnc identified a total of 321 lncRNAs significantly associated with immune‐related pathways. (B) Univariate Cox regression analysis of OS obtained 26 prognostic immune-related lncRNAs in the TCGA-STAD dataset (n = 296). Data are presented as a hazard ratio (HR) ± 95% confidence interval [CI]. (C) C-index of 117 kinds of prediction models was calculated across two validation datasets.
FIGURE 3
Lasso result diagram of TCGA training set and prognostic efficacy of the model across all datasets. (A) Multivariable Cox regression analysis (direction = forward) of OS in the TCGA-STAD dataset (n = 296), with no culling of variables. Data are presented as a hazard ratio (HR) ± 95% confidence interval [CI]. (B) Changing track of the Lasso regression independent variable; the abscissa represents the logarithm of the independent variable lambda, and the ordinate represents the coefficient of the independent variable. (C) Confidence interval under each lambda of Lasso. (D–I) Kaplan–Meier curves of OS according to the ILPM in TCGA-STAD (n = 296, log-rank test: p < 0.001) (D), GSE57303 (n = 68, log-rank test: p = 0.029) (F), and GSE62254 (n = 298, log-rank test: p < 0.001) (H), and the corresponding ROC curves for predicting OS at 1, 3, and 5 years in TCGA-STAD (E), GSE57303 (G), and GSE62254 (I).
FIGURE 4
Assessment of the ILPM. (A) C-index of the ILPM in all datasets. (B–D) Performance of the ILPM compared with other clinical variables in predicting prognosis in TCGA-STAD (n = 296) (B), GSE57303 (n = 68) (C), and GSE62254 (n = 298) (D). (E) AUC value of the ILPM and five published signatures in TCGA-STAD. (F–H) Multivariate Cox regression analysis of RS and other clinical variables in TCGA-STAD (F), GSE57303 (G), and GSE62254 (H). Data are presented as the hazard ratio (HR) ± 95% confidence interval [CI].
FIGURE 5
C-index and AUC assessment of RS and clinical variables. (A–C) C-index analysis of RS and clinical variables in TCGA-STAD (n = 296) (A), GSE57303 (n = 68) (B), and GSE62254 (n = 298) (C). (D) Time-dependent ROC analysis for predicting OS in all datasets.
FIGURE 6
Molecular and genomic features between high- and low-risk groups in TCGA-STAD. (A–C) Box plot of metabolism signatures (A), TME signatures (B), and tumor signatures (C) in high- and low-risk groups. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001. (D–E) SNV waterfall of top 20 (mutation frequency) genes in the high-risk group (n = 147) (D) and low-risk group (n = 147) (E).
FIGURE 7
Proportion of infiltration of 28 immune cells was evaluated based on ssGSEA. (A) Box plot of the proportion of immune infiltrating cells in high- and low-risk groups. Green represents the high-risk group, and red represents the low-risk group. (B) RS and 28 immune cell correlation heat map. The cross marks represent a non-significant correlation, where blue is positive and red is negative. (C) 18 lncRNAs and 28 immune cell correlation heat map, where pink represents positive and green represents negative. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.
FIGURE 8
Variations in drug sensitivity between model groups. (A–R) IC50 box plots of the 18 drugs whose sensitivity differed significantly between the high- and low-risk groups, in which yellow represents the high-risk group and blue represents the low-risk group.
FIGURE 9
GO and KEGG enrichment analyses of DEGs. (A) Heatmaps of DEGs based on RS. (B) Volcano plot of DEGs based on RS. (C–E) Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of DEGs, including biological process (BP) (C), molecular function (MF) (D), and KEGG pathway (E). Count: number of genes related to the enriched GO or KEGG pathway. The color of the bar denotes the p-value. (F) 18 lncRNA and 86 DEG correlation heat map, where red represents positive and green represents negative. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.
FIGURE 10
Variations in GO, KEGG, and Hallmark pathway enrichment scores between model groups. (A–D) Top four most significantly enriched GO pathways and (E–H) top four most significantly enriched KEGG pathways based on GSEA. (I) Box plot of GSVA of 50 Hallmark pathways. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001.
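The C-index reported in Figures 2, 4, and 5 measures how often a model ranks patients in the same order as their observed survival. The sketch below is a minimal pure-Python version of Harrell's concordance index under simplifying assumptions: pairs with tied survival times are skipped, and the survival times, event flags, and risk scores are hypothetical toy values. It is not the authors' implementation, which the page does not describe.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored survival data (simplified sketch).

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before patient j's follow-up time.
    The pair is concordant when the patient with the earlier event also
    has the higher risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: follow-up time, event flag (1 = death, 0 = censored),
# and a model risk score for each of four patients.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [3.0, 0.5, 1.0, 0.0]

c_index = concordance_index(times, events, scores)
print(c_index)
```

In this toy cohort, five pairs are comparable and four of them are ranked correctly, giving a C-index of 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.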
