Identification of Key Genes Associated with Overall Survival in Glioblastoma Multiforme Using TCGA RNA-Seq Expression Data
- PMID: 40725410
- PMCID: PMC12294400
- DOI: 10.3390/genes16070755
Abstract
Background/Objectives: Glioblastoma multiforme (GBM) is an aggressive and heterogeneous brain tumor with a poor prognosis, which underscores the need for reliable molecular biomarkers to improve patient stratification and treatment planning. This study aimed to identify key genes associated with overall survival in GBM by applying and comparing machine learning (ML) and deep learning (DL) approaches to RNA-Seq gene expression data.

Methods: RNA-Seq expression and clinical data for primary GBM tumors were obtained from The Cancer Genome Atlas (TCGA). Univariate Cox proportional hazards regression was used to identify survival-associated genes. For survival prediction, ML-based feature selection techniques (random forest [RF], gradient boosting [GB], support vector machine with recursive feature elimination [SVM-RFE], RF with recursive feature elimination [RF-RFE], and principal component analysis [PCA]) were used to construct multivariate Cox models. Separately, DeepSurv, a DL-based survival model, was trained on the significant genes from the univariate analysis. Gradient-based importance scoring was applied to determine key genes from the DeepSurv model.

Results: Univariate analysis yielded 694 survival-associated genes. The best ML-based Cox model (RF-RFE with 90% training data) achieved a concordance index (c-index) of 0.725. DeepSurv performed better, with a c-index of 0.822. The top 10 genes identified from the DeepSurv analysis included CMTR1, GMPR, and PPY. Kaplan-Meier survival curves confirmed their prognostic significance, and network analysis highlighted their roles in processes such as purine metabolism, RNA processing, and neuroendocrine signaling.

Conclusions: This study demonstrates the effectiveness of combining ML and DL models to identify prognostic gene expression biomarkers in GBM, with DeepSurv providing higher predictive accuracy. The findings offer insight into GBM biology and highlight candidate biomarkers for further validation and therapeutic development.
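The models in this abstract are compared by c-index. As a rough illustration of what that number measures, the following is a minimal sketch of Harrell's concordance index in plain Python. The abstract does not state which c-index variant or software the authors used, so the function name `concordance_index`, the handling of ties, and the toy survival data are all assumptions made for illustration only.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's c-index (illustrative sketch, not the authors' code).

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair (i, j) is comparable only when the patient with the
        # shorter time actually had an observed event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # higher risk, shorter survival
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (hypothetical, not from TCGA): four patients, one censored.
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [1, 1, 0, 1]
perfect_risk = [0.9, 0.7, 0.4, 0.5]  # ranks every comparable pair correctly
mixed_risk = [0.9, 0.3, 0.4, 0.5]    # misranks the second patient

perfect_c = concordance_index(toy_times, toy_events, perfect_risk)
mixed_c = concordance_index(toy_times, toy_events, mixed_risk)
print(perfect_c, mixed_c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.725 and 0.822 should be read.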
Keywords: Cox regression; RNA-Seq; biomarkers; deep learning; gene network analysis; glioblastoma multiforme; machine learning; survival analysis.
Conflict of interest statement
The authors declare no conflicts of interest.
