Front Oncol. 2021 Mar 4:11:588811.
doi: 10.3389/fonc.2021.588811. eCollection 2021.

Identification of Five Glycolysis-Related Gene Signature and Risk Score Model for Colorectal Cancer


Jun Zhu et al. Front Oncol.

Abstract

Metabolic changes, especially in glucose metabolism, are well established during the occurrence and development of tumors and are regarded as pan-cancer biological markers. The well-known 'Warburg effect' describes the preference of cancer cells for aerobic glycolysis even when ambient oxygen is sufficient. Accumulating evidence suggests that aerobic glycolysis plays a pivotal role in colorectal cancer (CRC) development. However, few studies have examined the relationship between glycolytic gene clusters and the prognosis of CRC patients. Here, we aimed to build a glycolysis-associated gene signature as a biomarker for colorectal cancer. The mRNA sequencing data and corresponding clinical data were downloaded from the TCGA and GEO databases. Gene set enrichment analysis (GSEA) indicated that four gene clusters were significantly enriched, revealing a close relationship between CRC and glycolysis. By comparing gene expression between cancer and adjacent samples, 236 genes were identified. Univariate, multivariate, and LASSO Cox regression analyses screened out five prognosis-related genes (ENO3, GPC1, P4HA1, SPAG4, and STC2). Kaplan-Meier curves and receiver operating characteristic (ROC) curves (AUC = 0.766) showed that the risk model could serve as an effective prognostic indicator (P < 0.001). Multivariate Cox analysis also revealed that this risk model is independent of age and TNM stage. We further validated this risk model in external cohorts (GSE38832 and GSE39582), showing that these five glycolytic genes could serve as reliable predictors of CRC patients' outcomes. Lastly, based on the five genes and the risk score, we constructed a nomogram model, assessed by C-index (0.7905) and calibration plot. In conclusion, we highlighted the clinical significance of glycolysis in CRC and constructed a glycolysis-related prognostic model, providing a promising target for glycolysis regulation in CRC.
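
As a rough illustration of how a gene-signature risk score of this kind is commonly computed, the sketch below forms a linear combination of expression values weighted by Cox regression coefficients and splits patients at the median score. The coefficient values, the median cutoff, and the names `COEFFICIENTS`, `risk_score`, and `stratify` are assumptions made for illustration; the abstract does not report the fitted coefficients or the cutoff the authors used.

```python
from statistics import median

# Hypothetical coefficients: placeholders only, NOT values from the paper.
COEFFICIENTS = {
    "ENO3": 0.30,
    "GPC1": 0.20,
    "P4HA1": 0.25,
    "SPAG4": 0.15,
    "STC2": 0.10,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Weighted sum of expression values over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(patients, coefficients=COEFFICIENTS):
    """Score every patient and label them 'high' or 'low' risk.

    The median split is an assumed convention, not a detail taken
    from the abstract.
    """
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, cutoff, groups


if __name__ == "__main__":
    # Toy expression profiles (made-up numbers for demonstration).
    toy_patients = {
        "p1": {"ENO3": 1.0, "GPC1": 1.0, "P4HA1": 1.0, "SPAG4": 1.0, "STC2": 1.0},
        "p2": {"ENO3": 2.0, "GPC1": 2.0, "P4HA1": 2.0, "SPAG4": 2.0, "STC2": 2.0},
    }
    scores, cutoff, groups = stratify(toy_patients)
    for pid in toy_patients:
        print(pid, round(scores[pid], 3), groups[pid])
```

The resulting high- and low-risk groups are what a Kaplan-Meier comparison or ROC analysis would then be run on.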

Keywords: ENO3; GPC1; P4HA1; SPAG4; STC2; colorectal cancer; glycolytic gene; prognosis analysis.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Workflow for identifying the glycolysis-related prognostic risk model in CRC patients. TCGA, The Cancer Genome Atlas; COAD, colon adenocarcinoma; DEG, differentially expressed gene; GSEA, gene set enrichment analysis.
Figure 2
GSEA of cancer and normal samples. (A–C) GSEA reveals that glycolysis pathways were enriched in CRC tissues and (D) in adjacent normal tissues. (A) GO glycolytic process; (B) Hallmark glycolysis; (C) Reactome glycolysis; (D) Reactome regulation of glycolysis.
Figure 3
Differential gene expression (DGE) and survival analysis of five glycolysis-related genes. (A–E) The five glycolysis-related genes were upregulated in tumor tissue compared with adjacent normal tissue. (F–J) Kaplan–Meier curves revealed that patients with low gene expression had better outcomes than those with high expression. (A, F) ENO3, (B, G) GPC1, (C, H) P4HA1, (D, I) SPAG4, (E, J) STC2. *** represents P < 0.001.
Figure 4
Confirmation of prognostic genes by LASSO analysis. (A) Distribution of LASSO coefficients for five genes. (B) Partial likelihood deviance of the LASSO coefficient distribution. The two vertical lines mark lambda.min and lambda.1se.
Figure 5
Identification of prognostic risk gene signature associated with glycolysis. (A) Survival analysis to verify the difference between the high- and low-risk groups. (B, C) Time-dependent ROC to evaluate the predictive efficacy of the risk model. (D) Distribution of risk scores of each CRC patient. (E) Correlation between survival time and survival status of each patient. (F) The expression pattern of five glycolysis-related genes.
Figure 6
Validation of clinical independence of risk score model. (A) Univariable analysis for each clinical feature (age, gender, TNM stage) and risk score model. (B) Multivariable analysis for risk score model and clinical characteristics (age, gender, TNM stage). The green and red boxes represent the hazard ratio, and the blue bars represent the 95% CI. CI, confidence interval; T, T stage; N, N stage; M, M stage; riskScore, risk score model.
Figure 7
Determination of CRC patients suitable for the model. (A–L) Survival analysis for high- and low-risk groups in different patient subgroups. (A, B) Subgroups divided by age. (C, D) Subgroups divided by gender. (E, F) T1–T2 patients as one group and T3–T4 as another. (G, H) No lymph node metastasis (N0) as one group and lymph node-positive (N1–2) as another. (I, J) Subgroups divided by the status of distant metastasis. (K, L) Stages I–II as one group and III–IV as another.
Figure 8
Validation of the risk model in the GEO dataset and mutational profiling of five genes. (A) GSE38832 and (B) GSE39582 dataset. (C) A visual summary of gene alterations in CRC. (D) The total alteration of the five key genes. (E) The network of the five glycolytic genes and genes with highly correlated expression. GEO, Gene Expression Omnibus database.
Figure 9
Construction and validation of a nomogram. (A) Nomogram with gene expression and risk model for predicting 1-, 3-, 5-year death risk. (B, C) Calibration curves of the nomogram to verify the agreement of predicted and actual 3-, 5-year outcomes.
Figure 10
Immunohistochemistry (HPA database) and mRNA expression (RT-PCR) of the five genes. (A) ENO3, (B) GPC1, (C) SPAG4, (D) P4HA1, (E) STC2. (F–J) Relative mRNA expression of the five genes in cancerous and paracancerous tissue. * represents P < 0.05 and ** represents P < 0.01.
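
The abstract reports a C-index for the nomogram. For readers unfamiliar with the metric, the sketch below shows one common form of Harrell's concordance index on right-censored data: among comparable patient pairs, the fraction in which the patient with the shorter observed survival also has the higher predicted risk. The function name `concordance_index` and the simplified handling of tied survival times (such pairs are skipped) are assumptions for illustration, not the authors' implementation.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (simplified).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores (higher means worse expected outcome)

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. Pairs with tied times
    are skipped, which is a simplification.
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("times, events and risks must have equal length")

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data (made up): shorter survival paired with higher risk.
    print(concordance_index([1, 2, 3, 4], [1, 1, 0, 1], [4.0, 3.0, 2.0, 1.0]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risks against survival times.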
