Development and validation of survival prediction tools in early and late onset colorectal cancer patients
- PMID: 40229310
- PMCID: PMC11997042
- DOI: 10.1038/s41598-025-95385-0
Abstract
This study aimed to develop online calculators, powered by machine learning models, that predict survival probabilities for patients with early- and late-onset colorectal cancer (EOCRC and LOCRC) over a 1- to 8-year period. We extracted data on 117,965 CRC patients from the publicly available SEER database, spanning 2010 to 2021, and divided them into training and internal testing datasets. Data from 200 CRC patients treated at Chongqing Hospital of Jiangsu Province Hospital served as the external testing dataset. We conducted univariate and multivariate regression analyses on the training dataset to identify key survival factors and used these factors to develop predictive machine learning models. The models were evaluated on the internal and external testing datasets using AUC, accuracy, precision, recall, and F1 score. Web-based calculators were then built to predict survival curves for EOCRC and LOCRC patients under different treatment strategies. In the multivariate Cox regression analysis, 16 and 18 variables were independently significant survival factors for EOCRC and LOCRC, respectively. In the EOCRC group, the machine learning models achieved AUC values of 0.880 and 0.804 in the internal and external testing cohorts, respectively. In the LOCRC group, the corresponding AUC values were 0.857 and 0.823. The online calculators are accessible at https://eocrc-surv.streamlit.app/ and https://locrc-surv.streamlit.app/ . They estimate survival probabilities for EOCRC and LOCRC patients under various treatment strategies and display the corresponding post-treatment survival curves over the 1- to 8-year period. In conclusion, this study developed machine learning-based online calculators that predict 1- to 8-year survival probabilities for EOCRC and LOCRC patients under various treatment strategies.
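The five evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not say which library, decision threshold, or label encoding the authors used, so everything below is an assumption made for illustration: the function names `roc_auc` and `binary_metrics`, the 0.5 threshold, the encoding of the outcome as 1 for the predicted class, and the toy data, which are invented and not taken from the study.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binary_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 at a fixed (assumed) threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy, invented data: six hypothetical patients, not from the study.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1]

if __name__ == "__main__":
    print("AUC:", round(roc_auc(toy_labels, toy_scores), 3))
    print(binary_metrics(toy_labels, toy_scores))
```

AUC is threshold-free, whereas accuracy, precision, recall, and F1 depend on the chosen cutoff, which is one reason the five metrics are usually reported together.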
Keywords: Colorectal cancer; Machine learning; Online calculators; Survival.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Competing interests: The authors declare no competing interests. Ethics statement: For data from the SEER database, ethical review and approval were not required because the SEER database is publicly available and de-identified. For data from Chongqing Hospital of Jiangsu Province Hospital (The People’s Hospital of Qijiang District), ethical approval was obtained from the hospital's Ethical Review Committee (approval number 20240005) before the study began. The same committee waived the requirement for informed consent for this retrospective study because of its observational design and the anonymity of patient identities.
