Machine Learning for Risk Prediction of Oesophago-Gastric Cancer in Primary Care: Comparison with Existing Risk-Assessment Tools
- PMID: 36291807
- PMCID: PMC9600097
- DOI: 10.3390/cancers14205023
Abstract
Oesophago-gastric cancer is difficult to diagnose in the early stages given its typical non-specific initial manifestation. We hypothesise that machine learning can improve upon the diagnostic performance of current primary care risk-assessment tools by using advanced analytical techniques to exploit the wealth of evidence available in the electronic health record. We used a primary care electronic health record dataset derived from the UK General Practice Research Database (7471 cases; 32,877 controls) and developed five probabilistic machine learning classifiers: Support Vector Machine, Random Forest, Logistic Regression, Naïve Bayes, and Extreme Gradient Boosted Decision Trees. Features included basic demographics, symptoms, and lab test results. The Logistic Regression, Support Vector Machine, and Extreme Gradient Boosted Decision Tree models achieved the highest performance in terms of accuracy and AUROC (0.89 accuracy, 0.87 AUROC), outperforming a current UK oesophago-gastric cancer risk-assessment tool (ogRAT). Machine learning also identified more cancer patients than the ogRAT: 11.0% more with little to no effect on false positives, or up to 25.0% more with a slight increase in false positives (for Logistic Regression, results threshold-dependent). Feature contribution estimates and individual prediction explanations indicated clinical relevance. We conclude that machine learning could improve primary care cancer risk-assessment tools, potentially helping clinicians to identify additional cancer cases earlier. This could, in turn, improve survival outcomes.
Keywords: cancer diagnosis; early detection; electronic health record; machine learning; oesophago-gastric cancer; primary care; risk-assessment.
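The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic (generated with scikit-learn's `make_classification`, with a case fraction chosen to roughly mirror 7471 cases to 32,877 controls), the hyperparameters are library defaults, and scikit-learn's `GradientBoostingClassifier` stands in for Extreme Gradient Boosted Decision Trees. The helper names `build_models`, `evaluate_models`, and `sensitivity_and_fpr` are hypothetical. The sketch fits five probabilistic classifiers, reports accuracy and AUROC on a held-out split, and shows how sensitivity and false-positive rate depend on the decision threshold.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_models(seed=0):
    """Five probabilistic classifiers (assumed defaults, not the paper's settings)."""
    return {
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)),
        "Support Vector Machine": make_pipeline(
            StandardScaler(), SVC(probability=True, random_state=seed)),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Naive Bayes": GaussianNB(),
        # Stand-in for Extreme Gradient Boosted Decision Trees.
        "Gradient Boosted Trees": GradientBoostingClassifier(random_state=seed),
    }


def evaluate_models(X, y, seed=0):
    """Return {model name: (accuracy at threshold 0.5, AUROC)} on a held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    results = {}
    for name, model in build_models(seed).items():
        model.fit(X_tr, y_tr)
        prob = model.predict_proba(X_te)[:, 1]  # predicted probability of "case"
        pred = (prob >= 0.5).astype(int)
        results[name] = (accuracy_score(y_te, pred), roc_auc_score(y_te, prob))
    return results


def sensitivity_and_fpr(y_true, scores, threshold):
    """Sensitivity and false-positive rate when flagging scores >= threshold."""
    y_true = np.asarray(y_true).astype(bool)
    flagged = np.asarray(scores) >= threshold
    sensitivity = (flagged & y_true).sum() / y_true.sum()
    fpr = (flagged & ~y_true).sum() / (~y_true).sum()
    return float(sensitivity), float(fpr)


# Synthetic stand-in for the health record features; about 18.5% cases.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=6,
                           weights=[0.815, 0.185], random_state=0)
results = evaluate_models(X, y)
for name, (acc, auc) in results.items():
    print(f"{name:24s} accuracy={acc:.2f} AUROC={auc:.2f}")
```

Calling `sensitivity_and_fpr` on one model's predicted probabilities at several thresholds gives the kind of trade-off the abstract reports as threshold-dependent: a lower threshold flags more true cases, usually at the cost of more false positives.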
Conflict of interest statement
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. The views expressed in this article are those of the author(s) and not necessarily those of the NHS, the NIHR, UKRI EPSRC, TPP, Cancer Research UK, Macmillan Cancer Support, or the Department of Health and Social Care.