Comparison of Machine Learning Models for Classification of Breast Cancer Risk Based on Clinical Data
- PMID: 40176498
- PMCID: PMC11965882
- DOI: 10.1002/cnr2.70175
Abstract
Background: Breast cancer (BC) is a major global health concern with rising incidence and mortality rates in many developing countries. Effective BC risk assessment models are crucial for prevention and early detection. While the Gail model, a traditional logistic regression-based model, has been broadly used, its predictive performance may be limited by its linear assumptions. With the rapid advancement of artificial intelligence (AI) in medical sciences, various complex machine learning algorithms have been developed for risk prediction, including for BC.
Aims: This study aims to compare AI-based models with the traditional Gail model in assessing BC risk using a population dataset, and to evaluate how well these models predict BC risk.
Methods and results: This study involved 942 newly diagnosed BC patients and 975 healthy controls at the Cancer Institute in IKH Hospital Complex, Tehran. Ten classification algorithms were applied to the dataset. The accuracy, sensitivity, and precision of the machine learning algorithms, together with their feature importance, were assessed and compared with previous studies. The study found that AI algorithms alone did not significantly improve predictive performance compared with the Gail model. However, the importance of variables varied significantly among the AI algorithms. Understanding feature importance and interactions is crucial in AI modeling to enhance accuracy and identify critical risk factors.
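As an illustration of the evaluation measures named above, the following minimal sketch computes accuracy, sensitivity, and precision from binary predictions. It is not the authors' code: the function name `classification_metrics` and the toy labels are hypothetical, and the labels are invented for demonstration rather than taken from the study data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and precision for binary labels.

    Assumed encoding (not from the source): 1 = BC case, 0 = healthy control.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # Share of all subjects classified correctly
        "accuracy": (tp + tn) / total if total else nan,
        # Share of true cases the model detects (recall)
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of predicted cases that are true cases
        "precision": tp / (tp + fp) if (tp + fp) else nan,
    }


# Toy labels for demonstration only; these are not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Applying the same function to each classifier's predictions on a held-out set would give directly comparable figures across models, which is the kind of side-by-side comparison the abstract describes.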
Conclusion: In BC risk prediction, incorporating specific risk factors, such as genetic and image-related variables, may be necessary to further enhance model accuracy. Furthermore, future research should address the modeling issues that arise in models with a restricted number of features.
Keywords: artificial intelligence; breast cancer; conventional models; machine learning; risk assessment.
© 2025 The Author(s). Cancer Reports published by Wiley Periodicals LLC.
Conflict of interest statement
The authors declare no conflicts of interest.
