Sci Rep. 2024 Sep 27;14(1):22055.
doi: 10.1038/s41598-024-72385-0.

Developing machine learning models for personalized treatment strategies in early breast cancer patients undergoing neoadjuvant systemic therapy based on SEER database

Jiahui Ren et al. Sci Rep. 2024.

Abstract

This study aimed to compare the long-term outcomes of breast-conserving surgery plus radiotherapy (BCS + RT) and mastectomy in early breast cancer (EBC) patients who received neoadjuvant systemic therapy (NST), and to construct and validate a machine learning model that could assist clinicians in formulating personalized treatment strategies for this patient population. We analyzed data from the Surveillance, Epidemiology, and End Results (SEER) database on EBC patients who underwent BCS + RT or mastectomy after NST (2010-2018). Using propensity score matching (PSM) to minimize potential biases, we compared breast cancer-specific survival (BCSS) and overall survival (OS) between the two surgical groups. Additionally, we trained and validated six machine learning survival models and developed a cloud-based recommendation system for surgical treatment based on the optimal model. Among the 13,958 patients, 9028 (64.7%) underwent BCS + RT and 4930 (35.3%) underwent mastectomy. After PSM, there were 3715 patients in each group. Compared with mastectomy, BCS + RT was associated with significantly better BCSS (p < 0.001) and OS (p < 0.001). Prognostic variables associated with BCSS were used to develop the machine learning models. In the training and validation cohorts, the random survival forest (RSF) model showed the best predictive performance (0.847 and 0.795, respectively), outperforming the other machine learning models, namely Rpart (0.725 and 0.707), Xgboost (0.762 and 0.727), Glmboost (0.748 and 0.788), Survctree (0.764 and 0.766), and Survsvm (0.777 and 0.790), as well as the classical Cox model (0.749 and 0.782). Lastly, a web-based prediction tool was built to facilitate clinical application (https://jhren.shinyapps.io/shinyapp1). After adjustment for other confounders, BCS + RT was associated with better outcomes than mastectomy in patients with EBC after NST. Moreover, the RSF model is a reliable tool for predicting long-term outcomes in these patients, providing valuable guidance for the choice of operative method and for postoperative follow-up.

Keywords: Breast-conserving surgery; Early breast cancer; Long-term outcomes; Machine learning; Mastectomy; Neoadjuvant systemic therapy.
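
The abstract reports a pair of performance values for each model (training and validation) without naming the metric. For survival models such values are commonly a concordance index (C-index), so the following is a minimal Python sketch of Harrell's C-index under that assumption. The function name `concordance_index`, the toy data, and the simplified handling of tied times are illustrative choices, not details taken from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable patient pairs in which the
    patient with the shorter observed survival has the higher risk score.

    times       -- observed follow-up time per patient
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Simplification (assumption): pairs with tied times are skipped.
        # A pair is comparable only if the shorter time is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients, two of them censored.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.5, 0.3, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))  # 0.875
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which figures such as 0.847 versus 0.749 would be read if the reported metric is indeed a C-index.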

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Flowchart of model development.
Fig. 2
Kaplan–Meier survival analysis for EBC treated with NST followed by BCS + RT or mastectomy. (A) PSM-adjusted BCSS based on the type of surgery. (B) PSM-adjusted OS based on the type of surgery. EBC, early breast cancer; NST, neoadjuvant systemic therapy; BCS + RT, breast-conserving surgery plus radiotherapy; BCSS, breast cancer-specific survival; OS, overall survival; PSM, propensity score matching.
Fig. 3
ROC curves and calibration plots for the RSF model. ROC curves at 3, 5, and 10 years for the RSF model in the training cohort (A) and the validation cohort (B). Calibration plots at 3, 5, and 10 years in the training (C) and validation (D) cohorts. ROC, receiver operating characteristic; RSF, random survival forest.
Fig. 4
DCA curves of the RSF model. (A) The 3-year, 5-year, and 10-year DCA curves of the RSF model in the training cohort. (B) The 3-year, 5-year, and 10-year DCA curves of the RSF model in the validation cohort. DCA, decision curve analysis; RSF, random survival forest.
Fig. 5
Variable importance and error rate curve of the RSF model. RSF, random survival forest.
Fig. 6
The RSF-based prediction tool provides an input field for the current clinicopathologic characteristics of one patient and an output field showing the patient's risk score and survival curves. RSF, random survival forest.
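
The survival curves compared in Fig. 2 are Kaplan–Meier estimates. As a rough illustration of how such a curve is computed, here is a minimal pure-Python sketch of the product-limit estimator. The function name `kaplan_meier` and the toy data are hypothetical; the study's own analysis pipeline is not described in the abstract and is not reproduced here.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) estimate of the survival function.

    times  -- observed follow-up time per patient
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # events plus censorings leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort: seven patients, three of them censored.
toy_times = [2, 3, 3, 5, 8, 8, 9]
toy_events = [1, 1, 0, 1, 0, 1, 0]
print([(t, round(s, 3)) for t, s in kaplan_meier(toy_times, toy_events)])
# [(2, 0.857), (3, 0.714), (5, 0.536), (8, 0.357)]
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimate steps down only at observed event times.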
