The prediction of distant metastasis risk for male breast cancer patients based on an interpretable machine learning model
- PMID: 37085843
- PMCID: PMC10120176
- DOI: 10.1186/s12911-023-02166-8
Abstract
Objectives: This study aimed to compare the ability of different machine learning (ML) models and a nomogram to predict distant metastasis in male breast cancer (MBC) patients, and to interpret the optimal ML model using the SHapley Additive exPlanations (SHAP) framework.
Methods: Four ML models were developed using data from MBC patients in the SEER database between 2010 and 2015 and MBC patients from our hospital between 2010 and 2020. The area under the receiver operating characteristic curve (AUC) and the Brier score were used to assess the performance of the different models. The DeLong test was applied to compare the AUCs of the models. Univariable and multivariable analyses were conducted using logistic regression.
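As an illustration of the two performance measures named in the Methods, the following is a minimal sketch of how an AUC and a Brier score can be computed from predicted risks. It is not the authors' code: the function names `roc_auc` and `brier_score`, the toy labels, and the toy risks are all assumptions made for this example.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a random negative case.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation of the AUC).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    if len(y_true) != len(y_prob):
        raise ValueError("labels and probabilities must have the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical labels (1 = M1, 0 = M0) and made-up predicted risks, for illustration only.
labels = [0, 0, 1, 0, 1, 1, 0, 0]
risks = [0.05, 0.10, 0.80, 0.30, 0.25, 0.60, 0.02, 0.40]

auc = roc_auc(labels, risks)
brier = brier_score(labels, risks)
print(f"AUC = {auc:.3f}, Brier score = {brier:.3f}")
```

The AUC measures discrimination (how well the model ranks M1 above M0 cases), while the Brier score also penalizes poorly calibrated probabilities, which is one reason to report both.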
Results: A total of 2351 patients were analyzed; 168 (7.1%) had distant metastasis (M1), 117 (5.0%) had bone metastasis, and 71 (3.0%) had lung metastasis. The median age at diagnosis was 68.0 years. Most patients did not receive radiotherapy (1723, 73.3%) or chemotherapy (1447, 61.5%). The XGB model was the best ML model for predicting M1 in MBC patients. It showed the largest AUC value in the tenfold cross-validation (AUC: 0.884; SD: 0.02), training (AUC: 0.907; 95% CI: 0.899-0.917), testing (AUC: 0.827; 95% CI: 0.802-0.857) and external validation (AUC: 0.754; 95% CI: 0.739-0.771) sets. It also performed well in predicting bone metastasis (AUC: 0.880, 95% CI: 0.856-0.903 in the training set; AUC: 0.823, 95% CI: 0.790-0.848 in the test set; AUC: 0.747, 95% CI: 0.727-0.764 in the external validation set) and lung metastasis (AUC: 0.906, 95% CI: 0.877-0.928 in the training set; AUC: 0.859, 95% CI: 0.816-0.891 in the test set; AUC: 0.756, 95% CI: 0.732-0.777 in the external validation set). The AUC value of the XGB model was larger than that of the nomogram in the training (0.907 vs 0.802) and external validation (0.754 vs 0.706) sets.
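The percentages in the Results can be checked against the reported counts with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the dictionary keys are labels chosen for this example, and the overlap estimate assumes that every bone or lung metastasis case is also counted among the 168 M1 cases, which the abstract does not state explicitly.

```python
# Counts as reported in the abstract.
total_patients = 2351
counts = {
    "distant metastasis (M1)": 168,
    "bone metastasis": 117,
    "lung metastasis": 71,
    "no radiotherapy": 1723,
    "no chemotherapy": 1447,
}

# Percentage of the full cohort, rounded to one decimal place as in the abstract.
percentages = {name: round(100 * n / total_patients, 1) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]} / {total_patients} = {pct}%")

# Assumption: bone and lung cases are subsets of the M1 cases. Under that
# assumption, at least this many patients had both bone and lung metastasis.
min_both_sites = max(
    0,
    counts["bone metastasis"] + counts["lung metastasis"] - counts["distant metastasis (M1)"],
)
print(f"Minimum patients with both bone and lung metastasis: {min_both_sites}")
```

Each recomputed percentage agrees with the value given in the abstract.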
Conclusions: The XGB model is a better predictor of distant metastasis among MBC patients than the other ML models and the nomogram; furthermore, it performs well in predicting bone and lung metastasis. Combined with SHAP values, it could help doctors intuitively understand the impact of each variable on the outcome.
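To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values for one patient by enumerating every subset of features. It is only an illustration of the underlying attribution principle, not the SHAP library's tree algorithm and not the authors' model: the function `shapley_values`, the toy risk function `toy_risk`, its coefficients, and the feature names are all hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict


def shapley_values(
    predict: Callable[[Dict[str, float]], float],
    x: Dict[str, float],
    baseline: Dict[str, float],
) -> Dict[str, float]:
    """Exact Shapley values of each feature of x relative to a baseline input."""
    names = list(x)
    n = len(names)

    def value(subset: set) -> float:
        # Features in the subset take the patient's value; the rest stay at baseline.
        point = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return predict(point)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk(p: Dict[str, float]) -> float:
    """Hypothetical risk score with one interaction term (not the paper's XGB model)."""
    return (
        0.02
        + 0.10 * p["t_stage_high"]
        + 0.05 * p["n_stage_high"]
        + 0.03 * p["age_over_65"]
        + 0.08 * p["t_stage_high"] * p["n_stage_high"]
    )


patient = {"t_stage_high": 1.0, "n_stage_high": 1.0, "age_over_65": 1.0}
reference = {"t_stage_high": 0.0, "n_stage_high": 0.0, "age_over_65": 0.0}

phi = shapley_values(toy_risk, patient, reference)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's predicted risk and the baseline prediction, and the interaction term is split evenly between the two interacting features. This additive breakdown is what lets a clinician read off how much each variable pushed an individual prediction up or down.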
Keywords: Distant metastasis; Machine learning; Male breast cancer; Nomogram; SEER; XGBoost.
© 2023. The Author(s).
Conflict of interest statement
The authors declare no competing interests.