Training, Validating, and Testing Machine Learning Prediction Models for Endometrial Cancer Recurrence
- PMID: 40324114
- PMCID: PMC12054588
- DOI: 10.1200/PO-24-00859
Abstract
Purpose: Endometrial cancer (EC) is the most common gynecologic cancer in the United States, with rising incidence and mortality. Despite optimal treatment, 15%-20% of all patients will experience recurrence. To better select patients for adjuvant therapy, it is important to accurately identify those at risk for recurrence. Our objective was to train, validate, and test models of EC recurrence using lasso regression and other machine learning (ML) and deep learning (DL) analytics in a large, comprehensive data set.
Methods: Data from patients with EC were downloaded from the Oncology Research Information Exchange Network database and stratified into three groups: low risk (International Federation of Gynecology and Obstetrics [FIGO] grade 1 and 2, stage I; N = 329), high risk (FIGO grade 3 or stage II, III, or IV; N = 324), and nonendometrioid histology (N = 239). Clinical, pathologic, genomic, and genetic data were used for the analysis. Genomic data included microRNA, long noncoding RNA, isoform, and pseudogene expression. Genetic variation included single-nucleotide variation (SNV) and copy-number variation (CNV). In the discovery phase, we used univariate analyses of variance to select variables informative for recurrence (P < .05). We then trained, validated, and tested multivariate models on the selected variables using lasso regression, MATLAB (ML), and TensorFlow (DL).
Results: Clinical recurrence models for the low-risk, high-risk, and high-risk nonendometrioid histology groups had AUCs of 56%, 70%, and 65%, respectively. For training, we selected models with AUC >80%: five for the low-risk group, 20 for the high-risk group, and 20 for the nonendometrioid group. The two best low-risk models included clinical data and CNVs. For the high-risk group, three of the five best-performing models included pseudogene expression. For the nonendometrioid group, pseudogene expression and SNV were overrepresented in the best models.
Conclusion: Prediction models of EC recurrence built with ML and DL analytics had better performance than models with clinical and pathologic data alone. Prospective validation is required to determine clinical utility.
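The AUC values quoted in the Results are areas under the receiver operating characteristic curve. As a minimal sketch of what that number measures, and not the authors' MATLAB or TensorFlow code, the example below computes AUC with the rank (Mann-Whitney) formulation: the fraction of recurrence/non-recurrence pairs in which the recurrence case receives the higher predicted risk. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = recurrence, 0 = no recurrence).

    AUC is the probability that a randomly chosen positive case has a
    higher score than a randomly chosen negative case; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: predicted recurrence risk for six patients.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the clinical models (56%-70%) and the selected ML and DL models (>80%) are compared.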