Comparative Study
Cancer Rep (Hoboken). 2025 Apr;8(4):e70175.
doi: 10.1002/cnr2.70175.

Comparison of Machine Learning Models for Classification of Breast Cancer Risk Based on Clinical Data

Haniyeh Rafiepoor et al. Cancer Rep (Hoboken). 2025 Apr.

Abstract

Background: Breast cancer (BC) is a major global health concern with rising incidence and mortality rates in many developing countries. Effective BC risk assessment models are crucial for prevention and early detection. While the Gail model, a traditional logistic regression-based model, has been broadly used, its predictive performance may be limited by its linear assumptions. With the rapid advancement of artificial intelligence (AI) in medical sciences, various complex machine learning algorithms have been developed for risk prediction, including for BC.

Aims: This study aims to compare the quality of AI-based models with that of the traditional Gail model in assessing BC risk using a population dataset. It also evaluates how well these models predict BC risk.

Methods and results: This study involved 942 newly diagnosed BC patients and 975 healthy controls at the Cancer Institute in IKH Hospital Complex, Tehran. Ten classification algorithms were applied to the dataset. The accuracy, sensitivity, precision, and feature importance of the machine learning algorithms were assessed and compared with those reported in previous studies. AI algorithms alone did not significantly improve predictive performance compared with the Gail model. However, the importance of variables varied significantly among the AI algorithms. Understanding feature importance and interactions is crucial in AI modeling in order to enhance accuracy and identify critical risk factors.
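As a rough illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, sensitivity, and precision from binary predictions. The function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are assumptions made for this example only; they are not taken from the study, and the numbers do not reflect its data or results.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    precision: float


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity (recall), and precision.

    Labels are assumed to be 0/1, with 1 meaning "case" and 0 meaning "control".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases predicted as cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls predicted as controls
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls predicted as cases
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases predicted as controls

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return BinaryMetrics(accuracy, sensitivity, precision)


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a comparison like the one described, the same function would be applied to each classifier's predictions on a held-out set, so that all models are scored on identical cases and controls.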

Conclusion: Incorporating specific risk factors, such as genetic and image-related variables, may be necessary to further enhance the accuracy of BC risk prediction models. Future research should also address the modeling issues that arise in models with a restricted number of features.

Keywords: artificial intelligence; breast cancer; conventional models; machine learning; risk assessment.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

FIGURE 1
ROC curves for all algorithms on the validation set. Gradient boosting achieved the highest validation performance (AUC = 0.65).
FIGURE 2
Prediction partition analysis of breast cancer risk prediction. Red: Cases, blue: Controls.
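
The AUC quoted in the Figure 1 caption can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes it that way by direct pairwise comparison. The function name `roc_auc` and the toy scores are assumptions for illustration only and are unrelated to the study's data.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    Labels are assumed to be 0/1 (1 = case); scores are predicted risks.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores, not data from the study.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.4]

auc = roc_auc(y_true, scores)
print(round(auc, 3))
```

On this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.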

Publication types