Development and Validation of a Colorectal Cancer Prediction Model: A Nationwide Cohort-Based Study
- PMID: 38662163
- PMCID: PMC11258054
- DOI: 10.1007/s10620-024-08427-4
Erratum in
Correction to: Development and Validation of a Colorectal Cancer Prediction Model: A Nationwide Cohort-Based Study. Dig Dis Sci. 2025 Mar;70(3):1250. doi: 10.1007/s10620-025-08902-6. PMID: 39939456.
Abstract
Background: Early diagnosis of colorectal cancer (CRC) is critical to increasing survival rates. Computerized risk prediction models hold great promise for identifying individuals at high risk for CRC. To be used effectively in a population-wide screening setting, such models should be developed and validated on cohorts that are similar to the target population.
Aim: Establish a risk prediction model for CRC diagnosis based on electronic health records (EHR) from subjects eligible for CRC screening.
Methods: A retrospective cohort study utilizing the EHR data of Clalit Health Services (CHS). The study includes CHS members aged 50-74 who were eligible for CRC screening from January 2013 to January 2019. The model was trained to predict receiving a CRC diagnosis within 2 years of the index date. Approximately 20,000 EHR demographic and clinical features were considered.
Results: The study includes 2935 subjects with a CRC diagnosis and 1,133,457 subjects without a CRC diagnosis. CRC incidence among subjects with risk scores in the top 1% was higher than baseline (2.3% vs 0.3%; lift 8.38; P value < 0.001). Cumulative event probabilities increased with higher model scores. Among subjects with a positive FOBT, model-based risk stratification identified subjects with more than twice the CRC risk indicated by a positive FOBT alone.
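The lift reported in the Results is the ratio of CRC incidence among the highest-scoring subjects to the incidence in the whole cohort. A minimal sketch of that calculation follows, assuming one risk score and one binary outcome per subject. The function name `lift_at_top` and the toy data are hypothetical and are not taken from the study or its code.

```python
from typing import Sequence, Tuple


def lift_at_top(
    scores: Sequence[float],
    labels: Sequence[int],
    fraction: float = 0.01,
) -> Tuple[float, float, float]:
    """Return (top_incidence, baseline_incidence, lift).

    `scores` are model risk scores, `labels` are 1 for subjects with the
    outcome and 0 otherwise. Hypothetical helper for illustration only.
    """
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and equal in length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Number of subjects in the top slice (at least one subject).
    n_top = max(1, round(len(scores) * fraction))

    # Rank subjects from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_labels = [label for _, label in ranked[:n_top]]

    top_incidence = sum(top_labels) / n_top
    baseline_incidence = sum(labels) / len(labels)
    if baseline_incidence == 0:
        raise ValueError("lift is undefined when the cohort has no events")

    return top_incidence, baseline_incidence, top_incidence / baseline_incidence


if __name__ == "__main__":
    # Toy cohort of 1000 subjects with 4 events, 2 of them in the top 1%.
    toy_scores = [float(i) for i in range(1000)]
    toy_labels = [0] * 1000
    for idx in (999, 998, 500, 10):
        toy_labels[idx] = 1
    top, base, lift = lift_at_top(toy_scores, toy_labels, fraction=0.01)
    print(f"top incidence={top:.3f}, baseline={base:.4f}, lift={lift:.1f}")
```

The reported incidences are rounded percentages, so dividing 2.3% by 0.3% will not reproduce the reported lift of 8.38 exactly; the published figure was presumably computed from unrounded values.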
Conclusions: We developed an individualized risk prediction model for CRC that can be utilized as a complementary decision support tool for healthcare providers to precisely identify subjects at high risk for CRC and refer them for confirmatory testing.
Keywords: Colorectal cancer; Colorectal cancer screening; Machine learning.
© 2024. The Author(s).
Conflict of interest statement
The authors declare no competing interests.