Front Oncol. 2024 Jul 24:14:1351393.
doi: 10.3389/fonc.2024.1351393. eCollection 2024.

Based on machine learning, CDC20 has been identified as a biomarker for postoperative recurrence and progression in stage I & II lung adenocarcinoma patients

Rui Miao et al. Front Oncol. 2024.

Abstract

Objective: To use machine learning to identify genes associated with recurrence, invasion, and tumor stemness, and thereby uncover new therapeutic targets.

Methods: We first obtained a gene set related to recurrence and invasion from the Gene Expression Omnibus (GEO) database. We then applied Weighted Gene Co-expression Network Analysis (WGCNA) to identify core gene modules and performed functional enrichment analysis on them. Next, we used the random forest and random survival forest algorithms to rank the genes within the key modules, which yielded three crucial genes. One of these, CDC20, was then selected for Kaplan-Meier prognostic analysis and potential drug screening. Finally, to examine the role of CDC20 in lung adenocarcinoma (LUAD), we conducted a variety of in vitro and in vivo experiments, including wound healing assays, colony formation assays, Transwell migration assays, flow cytometric cell cycle analysis, western blotting, and a mouse tumor model experiment.
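The abstract does not state how the outputs of the two algorithms were combined, although the Figure 2 caption mentions an overlapping Venn diagram. One plausible reading is that genes ranked highly by both methods were retained. The sketch below illustrates that idea only; the gene names, importance scores, cutoff `k`, and function names are hypothetical and are not taken from the study.

```python
def top_k(importance, k):
    """Return the k gene names with the largest importance scores."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return set(ranked[:k])


def overlap_hub_genes(rf_importance, rsf_importance, k):
    """Genes that fall in the top k of BOTH importance rankings."""
    return top_k(rf_importance, k) & top_k(rsf_importance, k)


# Toy importance scores (hypothetical placeholders, not study data).
rf_scores = {"GENE_A": 0.31, "GENE_B": 0.27, "GENE_C": 0.05,
             "GENE_D": 0.22, "GENE_E": 0.15}
rsf_scores = {"GENE_A": 0.12, "GENE_B": 0.02, "GENE_C": 0.04,
              "GENE_D": 0.20, "GENE_E": 0.18}

hubs = overlap_hub_genes(rf_scores, rsf_scores, k=3)
print(sorted(hubs))  # ['GENE_A', 'GENE_D']
```

In practice the importance scores would come from fitted random forest and random survival forest models, and the cutoff would need its own justification.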

Results: We collected a total of 279 samples from two datasets, GSE166722 and GSE31210, and identified 91 differentially expressed genes associated with recurrence, invasion, and stemness in lung adenocarcinoma. Functional enrichment analysis revealed that these key gene clusters were primarily involved in microtubule binding, spindle, chromosomal region, organelle fission, and nuclear division. Using machine learning, we then identified and validated three hub genes (CDC45, CDC20, TPX2); of these, CDC20 showed the highest correlation with tumor stemness and had received limited previous research. CDC20 was also closely associated with clinicopathological features and with poor overall survival (OS), progression-free interval (PFI), and progression-free survival (PFS) in lung adenocarcinoma patients. Finally, our functional experiments demonstrated that knocking down CDC20 inhibited cancer cell migration, invasion, proliferation, cell cycle progression, and tumor growth, possibly through the MAPK signaling pathway.
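Survival outcomes such as OS and PFS are typically summarized with the Kaplan-Meier product-limit estimator, which the figure captions also refer to. A minimal sketch of that estimator follows, assuming right-censored data; the follow-up times and event flags are toy values, not patient data, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up durations
    events : 1 if the event was observed, 0 if the subject was censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events occurring exactly at time t
        leaving = 0    # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy cohort: five subjects, two of them censored.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])  # [(2, 0.8), (3, 0.6), (5, 0.3)]
```

Comparing high and low expression groups would mean running the estimator once per group and then applying a test such as the log-rank test, which is not shown here.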

Conclusion: CDC20 has emerged as a novel biomarker for monitoring treatment response, recurrence, and disease progression in patients with lung adenocarcinoma. Further research on CDC20 as a potential therapeutic target is warranted and could yield valuable insights for developing new treatments and improving patient outcomes.

Keywords: CDC20; invasion; lung adenocarcinoma; machine learning; recurrence.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Identification of WGCNA module genes. (A, B) In the GSE166722 and GSE31210 datasets, a clustering dendrogram was formed using weighted correlation coefficients to group genes with similar expression patterns into co-expression modules, with each module represented by a different color. (C, D) In the GSE166722 and GSE31210 datasets, correlation heatmaps illustrating the relationship between module eigengenes (MEs) and invasion/recurrence, as well as scatter plots showing the correlation between module membership (MM) and gene significance (GS). (E) Venn diagram of invasion and recurrence. (F) KEGG analysis and GO enrichment analysis of the module genes.
Figure 2
Machine learning screening of key genes for recurrence and invasion. (A–C) Results of the random survival forest and random forest algorithms, with a Venn diagram of their overlap. (D) OS Kaplan-Meier survival curves for the high and low expression groups of the three genes in the LUAD patient cohort GSE31210. Time-dependent ROC curves show the area under the curve (AUC) values for patient OS at 1 year (blue), 3 years (red), and 5 years (purple). (E) ROC curves illustrating the diagnostic efficacy of CDC45, TPX2, and CDC20 in differentiating between invasive and non-invasive LUAD patients within the GSE166722 cohort. (F) Correlation of CDC20, TPX2, and CDC45 with the stemness index mRNAsi.
Figure 3
CDC20 is significantly associated with an aggressive clinical phenotype in LUAD. (A–D) OS curves of patients with high and low expression of CDC20 in the TCGA, GSE30219, GSE50081, and GSE42127 cohorts. (E, F) Kaplan-Meier curves of PFI in patients with high and low expression of CDC20 in the TCGA cohort and of PFS in patients with high and low expression of CDC20 in the GSE30219 cohort. (G, H) PFS curves of patients with high and low expression of CDC20 in the GSE50081 cohort, and box plots of the differential expression of CDC20 between normal tissue and tumors in the TCGA and GSE31210 cohorts. (I) Clinical characteristics of patients with high and low expression of CDC20 in the TCGA cohort. (J) Multivariable Cox regression analysis performed to further assess CDC20 as a key cancer gene in LUAD using the TCGA database.
Figure 4
Molecular characteristics and identification of related gene clusters. (A, B) The stemness gene sets and associated oncogenic pathways positively correlated with CDC20 in the GSE31210 and GSE166722 datasets are highlighted in red boxes. (C) Volcano plot displaying the distribution of differentially expressed genes in the GSE30219, GSE31210, GSE42127, GSE50081, GSE166722, and TCGA datasets. (D, E) Shared gene sets and PPI network analysis across the six datasets. (F) Waterfall plot illustrating the highest mutation rates in the top 33 genes based on stemness.
Figure 5
Potential drug screening. (A) IC50 values of 8 chemotherapy drugs. (B) CAMP screening for potential inhibitors of CDC20. (C) Potential drug screening for CDC20-related gene clusters.
Figure 6
CDC20 promotes migration, invasion and proliferation of lung adenocarcinoma cells in vitro. (A, B) CDC20 protein expression levels and mRNA levels were verified in three lung cancer cell lines and normal epithelial cells. (C) Images of wound healing in the NC and CDC20 knockdown groups of H1299 and H1975 lung cancer cells. (D) Images from the Transwell assay for the NC and CDC20 knockdown groups of H1299 and H1975 lung cancer cells. (E) Colony formation images for the NC and CDC20 knockdown groups of H1299 and H1975 lung cancer cells. (***p < 0.001; **p < 0.01; *p < 0.05).
Figure 7
CDC20 affects the cell cycle and may be associated with the MAPK signaling pathway. (A, B) Flow cytometry of H1299 and H1975 cells in the NC group and the CDC20 knockdown group. (C, D) Western blotting of MAPK signaling pathway expression in H1299 and H1975 cells transfected with siRNA. (***p < 0.001; **p < 0.01; *p < 0.05).
Figure 8
Knockdown of CDC20 inhibits tumor growth in mouse xenograft models. (A, B) Images of mice and tumors in the NC group and shCDC20 group. (C) Histograms of mean tumor weight and tumor volume in the NC group and shCDC20 group, measured after injection. (D) Immunofluorescence of tumor CDC20 in the NC group and shCDC20 group, and immunofluorescence of lung tissue in normal mice. (***p < 0.001).
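
The AUC values reported for the ROC curves in Figure 2 can be read as the probability that a randomly chosen positive case (for example, an invasive tumor) has a higher marker value than a randomly chosen negative case. The sketch below computes AUC through that pairwise definition, with ties counted as one half. The expression values and labels are toy numbers, not data from GSE166722, and the time-dependent AUC of panel D needs censoring-aware methods that are not covered here.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores : marker values, higher meaning more likely positive
    labels : 1 for positive cases, 0 for negative cases
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy marker values (hypothetical) for three positive and three negative cases.
expression = [1.2, 3.4, 3.0, 4.1, 2.9, 0.8]
invasive = [0, 1, 0, 1, 1, 0]
auc = roc_auc(expression, invasive)
print(round(auc, 3))  # 0.889
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; a rank-based formulation would scale better on large cohorts.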
