Int J Mol Med. 2019 Sep;44(3):787-796.
doi: 10.3892/ijmm.2019.4243. Epub 2019 Jun 13.

Construction of prognostic risk prediction model of oral squamous cell carcinoma based on co-methylated genes

Qiang Zhu et al. Int J Mol Med. 2019 Sep.

Abstract

This study aimed to identify DNA methylation markers in oral squamous cell carcinoma (OSCC) and to construct a prognostic prediction model for OSCC. Methylation data of patients with OSCC were downloaded from The Cancer Genome Atlas (TCGA) and used as the training dataset. The methylation profiles of GSE37745 for OSCC samples were downloaded from Gene Expression Omnibus and used as a validation dataset. Differentially methylated genes (DMGs) were screened from the TCGA training dataset, followed by co-methylation analysis using weighted correlation network analysis (WGCNA). Subsequently, the methylation and gene expression levels of the DMGs in key modules were extracted for correlation analysis. Prognosis-related methylated genes were screened using univariate Cox regression analysis. Finally, the risk prediction model was constructed and validated with GSE52793. A total of 948 DMGs with CpGs were identified. Co-methylation analysis yielded 2 modules (brown and turquoise) involving 380 DMGs. Correlation analysis revealed that the methylation levels of 132 genes were negatively correlated with their gene expression levels. By combining these results with the clinical survival prognosis of the samples, 5 optimized prognostic genes [centromere protein V (CENPV), Tubby bipartite transcription factor (TUB), synaptotagmin like 2 (SYTL2), occludin (OCLN) and CAS1 domain containing 1 (CASD1)] were selected for constructing a risk prediction model. In both the training dataset and GSE52793, low-risk samples had a better survival prognosis. On the whole, this study indicates that the risk prediction model based on CENPV, SYTL2, OCLN, CASD1 and TUB may have the potential to predict the survival prognosis of patients with OSCC.
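As a rough illustration of how a five-gene risk prediction model of this kind is commonly scored, the sketch below assumes the usual linear form: each sample's risk score is the sum of a Cox coefficient times the gene's methylation level, and samples are split into high- and low-risk groups at the cohort median. The abstract does not state the formula, the fitted coefficients or the cutoff, so the coefficient values, the median cutoff and the function names here are all hypothetical.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the abstract does not
# report the fitted Cox coefficients for these genes.
COEFFICIENTS = {
    "CENPV": 0.5,
    "TUB": 0.4,
    "SYTL2": -0.6,
    "OCLN": -0.3,
    "CASD1": -0.2,
}


def risk_score(methylation, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * methylation level over the genes."""
    return sum(coef * methylation[gene] for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label a sample 'high' if its score exceeds the cohort median, else 'low'.

    The median cutoff is an assumption, not something the abstract specifies.
    """
    cutoff = median(scores.values())
    return {
        sample: ("high" if score > cutoff else "low")
        for sample, score in scores.items()
    }


if __name__ == "__main__":
    # Made-up methylation levels for three made-up samples.
    cohort = {
        "sample_1": {"CENPV": 0.2, "TUB": 0.3, "SYTL2": 0.8, "OCLN": 0.7, "CASD1": 0.6},
        "sample_2": {"CENPV": 0.8, "TUB": 0.7, "SYTL2": 0.2, "OCLN": 0.3, "CASD1": 0.4},
        "sample_3": {"CENPV": 0.5, "TUB": 0.5, "SYTL2": 0.5, "OCLN": 0.5, "CASD1": 0.5},
    }
    scores = {name: risk_score(levels) for name, levels in cohort.items()}
    print(split_by_median(scores))
```

The two groups produced this way are what a survival comparison such as a Kaplan-Meier analysis would then be run on.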

Figures

Figure 1
The significant differentially methylated genes. (A) Volcano plot of significant differentially methylated genes. The red dots represent differentially methylated genes; the black dots represent non-differentially methylated genes; the green horizontal dotted line represents the false discovery rate (FDR) <0.05; the two green vertical dotted lines represent the |log2 fold change (FC)|>0.2. A total of 948 DMGs with CpGs were screened out. (B) Log2 Kernel density curve based on differentially methylated genes. The proportion of hypomethylated genes in the good prognostic group was 64.56% (612/948), and 35.44% (336/948) were significantly hypermethylated. (C) The hierarchical clustering heatmaps of significant differentially methylated genes. The red and green bars represent the samples in good and bad prognostic groups, respectively. The samples were clearly divided into 2 groups based on the screened differentially methylated genes.
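The two thresholds drawn on the volcano plot (FDR < 0.05 and |log2 FC| > 0.2) can be applied as a simple filter. The sketch below is a minimal illustration under two assumptions: that the FDR comes from a Benjamini-Hochberg adjustment of per-gene P-values (the caption does not name the adjustment method), and that per-gene log2 fold changes are already computed. The function names and the toy data are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values are non-decreasing in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_dmgs(genes, log2_fc, p_values, fdr_cutoff=0.05, fc_cutoff=0.2):
    """Keep genes with FDR < fdr_cutoff and |log2 FC| > fc_cutoff."""
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, fc, q in zip(genes, log2_fc, fdr)
        if q < fdr_cutoff and abs(fc) > fc_cutoff
    ]


if __name__ == "__main__":
    # Toy input: four hypothetical genes with made-up statistics.
    genes = ["geneA", "geneB", "geneC", "geneD"]
    log2_fc = [0.5, -0.6, 0.1, 1.0]
    p_values = [0.01, 0.04, 0.03, 0.5]
    print(select_dmgs(genes, log2_fc, p_values))
```

A gene passes only when both conditions hold, which corresponds to the red dots above the horizontal line and outside the two vertical lines in panel A.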
Figure 2
Screening of modules related to gene CpG methylation by weighted correlation network analysis (WGCNA). (A) The modules related to gene CpG methylation. A total of 15 modules were identified and are represented by 15 different colors (black, blue, brown, cyan, green, green-yellow, grey, magenta, pink, purple, red, salmon, tan, turquoise and yellow). (B) The number of differentially methylated genes in the brown, blue, black, yellow, turquoise, tan, red, purple, pink, magenta, green-yellow, green and cyan modules. (C) Fold enrichment column graph of the brown, blue, black, yellow, turquoise, tan, red, purple, pink, magenta, green-yellow, green and cyan modules. The green horizontal dotted line represents a fold enrichment ratio of 1.
Figure 3
The significantly enriched biological processes for the differentially methylated genes in the brown and turquoise modules. The horizontal axis represents the number of genes involved in the biological process; the vertical axis represents the name of the biological process; the height of the column represents the number of genes involved in the biological process; and the color of the column represents the P-value. These 380 differentially methylated genes were significantly related to 24 biological processes.
Figure 4
The overall correlation analysis between methylation levels and expression levels of the 380 differentially methylated genes in brown and turquoise modules. The red line is the trend line of point distribution; Cor. represents the Pearson's correlation coefficient between the methylation levels and expression levels; 'P' represents the significance of the correlation. There was a significant negative correlation between the gene methylation and expression level.
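For readers unfamiliar with the statistic, the Pearson correlation coefficient reported as Cor. can be computed as in the sketch below. This is a generic textbook implementation, not the authors' pipeline; the function name and the toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up values: expression falls as methylation rises, so r is negative.
    methylation = [0.1, 0.3, 0.5, 0.7, 0.9]
    expression = [9.0, 7.5, 6.8, 4.1, 2.0]
    print(pearson(methylation, expression))
```

A coefficient below zero, as in the toy data, corresponds to the negative relationship between methylation and expression described in the caption.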
Figure 5
Identification of optimized prognostic genes through the Cox-Proportional Hazards (Cox-PH) model. (A) The lambda parameter curve selected by cross-validation likelihood. The horizontal and vertical axes respectively represent different values of lambda and cross-validation likelihood. (B) The prognosis coefficients of the 5 optimized prognostic genes (CENPV, TUB, SYTL2, OCLN and CASD1) selected through the Cox-PH model. CENPV, centromere protein V; TUB, Tubby bipartite transcription factor; SYTL2, synaptotagmin like 2; OCLN, occludin; CASD1, CAS1 domain containing 1.
Figure 6
The Kaplan-Meier curves for patients with different methylation or expression levels of CASD1, OCLN, SYTL2, TUB and CENPV. According to the median of the methylated signal value, the samples were divided into the hypomethylation and hypermethylation groups (upper panels). The KM curves also revealed that the samples with high expression levels had a better overall survival prognosis (lower panels). CENPV, centromere protein V; TUB, Tubby bipartite transcription factor; SYTL2, synaptotagmin like 2; OCLN, occludin; CASD1, CAS1 domain containing 1.
Figure 7
The Kaplan-Meier curves for patients based on the risk score prediction model. (A) Kaplan-Meier curves of patients in TCGA training dataset revealed that the low-risk samples had a better survival prognosis. (B) Kaplan-Meier curves of patients in the validation dataset GSE52793 were consistent with those of the TCGA training dataset.
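The Kaplan-Meier curves in Figures 6 and 7 rest on the product-limit estimator, which the sketch below implements in plain Python for a single group. It assumes each patient has a follow-up time and an event flag (1 for death, 0 for censored) and that patients have already been assigned to a group; the function name and the toy data are hypothetical, and in practice a survival analysis library would normally be used.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate for one group.

    times  : follow-up time of each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Made-up follow-up data (months) for a hypothetical low-risk group.
    times = [5, 12, 20, 31]
    events = [1, 0, 1, 1]
    for t, s in kaplan_meier(times, events):
        print(t, s)
```

Running the estimator separately on the high-risk and low-risk groups gives the two step curves that are compared in each panel.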
