Am J Transl Res. 2025 Apr 15;17(4):2614-2628.
doi: 10.62347/CZYA6232. eCollection 2025.

Regression analysis and validation of risk factors for upper limb dysfunction following modified radical mastectomy for breast cancer patients

Yonggang Li et al. Am J Transl Res.

Abstract

Objective: To develop and validate a predictive tool using machine learning models for identifying risk factors for upper limb dysfunction following modified radical mastectomy (MRM) in breast cancer patients.

Methods: A total of 768 breast cancer patients who underwent MRM between January 2022 and December 2023 were included in this study. The dataset was divided into a training set (506 cases) and a validation set (262 cases). The collected data encompassed demographic characteristics, clinicopathological features, medical history, and postoperative rehabilitation plans. Predictive analyses were conducted using five machine learning models: support vector machine (SVM), extreme gradient boosting (XGBOOST), Gaussian naïve Bayes (GNB), adaptive boosting (ADABOOST), and random forest. Models were evaluated using ten-fold cross-validation, with performance assessed by receiver operating characteristic (ROC) curves, area under the curve (AUC) values, specificity, sensitivity, accuracy, and F1-score. DeLong's test was used to compare AUC values and identify the optimal predictive model.
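The evaluation metrics named in the Methods can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' code: the function names, the 0.5 decision threshold, the contiguous fold layout, and the toy labels and scores are all assumptions made for illustration. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), and DeLong's test is not implemented.

```python
from typing import Dict, List, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and F1 for binary labels (1 = dysfunction)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / len(pairs),
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n_samples: int, k: int = 10) -> List[List[int]]:
    """Split indices 0..n_samples-1 into k contiguous folds whose sizes differ by at most 1."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Toy data for illustration only; these are NOT values from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(classification_metrics(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
print([len(fold) for fold in kfold_indices(506, k=10)])  # fold sizes for a 506-case training set
```

In practice the rows would be shuffled (and usually stratified by outcome) before folds are assigned; the contiguous split above only shows how 506 cases divide into ten near-equal folds.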

Results: Baseline characteristics did not differ significantly between the training and validation sets (P>0.05). In the training set, variables including age, BMI, cancer type, axillary lymph node dissection, ipsilateral radiotherapy, postoperative rehabilitation plans, and monthly per capita household income differed significantly between patients with and without upper limb dysfunction (P<0.05). Correlations among these variables were low (R values close to 0), indicating minimal multicollinearity. The XGBOOST and random forest models achieved high AUC values (0.817-0.884) across both the training and validation sets. These models also exhibited superior specificity and sensitivity, indicating strong predictive performance and robustness in identifying patients at risk of postoperative upper limb dysfunction.
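The multicollinearity check described in the Results (pairwise correlations close to 0) can be sketched as follows. This is an illustrative example, not the authors' procedure: the abstract does not state which correlation coefficient was used, so Pearson's r is assumed, and the 0.7 flagging threshold, the function names, and the toy columns are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length numeric columns."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("columns must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def flag_collinear_pairs(
    columns: Dict[str, Sequence[float]], threshold: float = 0.7
) -> List[Tuple[str, str, float]]:
    """Return variable pairs whose |r| meets the (assumed) threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson_r(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Toy columns for illustration only; these are NOT values from the study.
toy = {
    "age": [40, 50, 60, 70],
    "bmi": [22, 25, 22, 25],
    "age_months": [480, 600, 720, 840],  # deliberately redundant with age
}
print(flag_collinear_pairs(toy))  # only the age / age_months pair is flagged
```

A result with no flagged pairs would correspond to the situation the Results describe, in which the candidate predictors carry largely independent information.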

Conclusion: The XGBOOST and random forest models exhibited excellent predictive accuracy, offering valuable tools for the early identification and personalized management of high-risk patients. These models provide critical data support for postoperative rehabilitation planning and contribute to improving the quality of life for breast cancer patients.

Keywords: Modified radical mastectomy; XGBOOST; machine learning models; risk prediction; upper limb dysfunction.

Conflict of interest statement

None.

Figures

Figure 1. Correlation analysis of significant variables between the dysfunction and non-dysfunction groups. Note: BMI, Body Mass Index.
Figure 2. ROC curves of the 5 machine learning models in training and validation sets. A. Training set ROC curves. B. Validation set ROC curves. Note: SVM, Support Vector Machine; XGBOOST, Extreme Gradient Boosting; GNB, Gaussian Naive Bayes; ADABOOST, Adaptive Boosting.
Figure 3. Calibration curves for XGBOOST model in training and validation sets. A. Calibration curve for training set. B. Calibration curve for validation set. Note: XGBOOST, Extreme Gradient Boosting.
Figure 4. Calibration curves for Random Forest model in training and validation sets. A. Calibration curve for training set. B. Calibration curve for validation set.
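
A calibration curve of the kind shown in Figures 3 and 4 compares predicted risk with the observed event rate. The sketch below shows one common way to compute the points of such a curve. It assumes equal-width probability bins; the abstract does not say how the authors binned or smoothed their curves, and the function name, bin count, and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def calibration_bins(
    y_true: Sequence[int], probs: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted probability, observed event rate, count) per non-empty bin."""
    totals = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of probs, sum of events, count]
    for label, p in zip(y_true, probs):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        totals[idx][0] += p
        totals[idx][1] += label
        totals[idx][2] += 1
    return [(s / c, e / c, c) for s, e, c in totals if c]


# Toy predictions for illustration only; these are NOT values from the study.
toy_probs = [0.10, 0.15, 0.50, 0.55, 0.90, 0.95]
toy_labels = [0, 0, 1, 0, 1, 1]
for mean_pred, observed, count in calibration_bins(toy_labels, toy_probs):
    print(f"predicted={mean_pred:.3f} observed={observed:.3f} n={count}")
```

A well-calibrated model yields points near the diagonal, where the mean predicted probability in each bin matches the observed rate.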
