Front Oncol. 2022 Jul 25:12:944569.
doi: 10.3389/fonc.2022.944569. eCollection 2022.

A machine learning model based on ultrasound image features to assess the risk of sentinel lymph node metastasis in breast cancer patients: Applications of scikit-learn and SHAP

Gaosen Zhang et al. Front Oncol.

Abstract

Background: This study aimed to determine an optimal machine learning (ML) model for evaluating the preoperative diagnostic value of ultrasound signs of breast cancer lesions for sentinel lymph node (SLN) status.

Method: This study retrospectively analyzed the ultrasound images and postoperative pathological findings of lesions in 952 breast cancer patients. First, univariate analysis was used to examine the relationship between the ultrasonographic morphological features of breast cancer lesions and SLN metastasis. Then, based on the ultrasound signs of breast cancer lesions, we screened ten ML models: support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), linear discriminant analysis (LDA), logistic regression (LR), naive Bayes (NB), k-nearest neighbors (KNN), multilayer perceptron (MLP), long short-term memory (LSTM), and convolutional neural network (CNN). The diagnostic performance of each model was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), Kappa value, accuracy, F1-score, sensitivity, and specificity. We then constructed a clinical prediction model based on the ML algorithm with the best diagnostic performance. Finally, we used SHapley Additive exPlanations (SHAP) to visualize and analyze the diagnostic process of the ML model.
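As a rough illustration of the threshold-based evaluation metrics named above, the following sketch computes accuracy, sensitivity, specificity, F1-score, and Cohen's Kappa from a 2x2 confusion matrix. It uses only the Python standard library. The class name, function name, and counts are hypothetical and chosen for illustration; they are not data or code from the study, which reports using scikit-learn.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # predicted metastasis, pathology confirmed metastasis
    fp: int  # predicted metastasis, pathology found none
    fn: int  # predicted no metastasis, pathology confirmed metastasis
    tn: int  # predicted no metastasis, pathology found none


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return common binary-classification metrics for one confusion matrix."""
    n = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / n
    sensitivity = c.tp / (c.tp + c.fn)  # true positive rate (recall)
    specificity = c.tn / (c.tn + c.fp)  # true negative rate
    precision = c.tp / (c.tp + c.fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    p_yes = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_no = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only; not taken from the study.
example = ConfusionCounts(tp=40, fp=10, fn=10, tn=40)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa is useful alongside accuracy because it discounts the agreement a classifier would reach by chance given the class proportions.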

Results: Of 952 patients with breast cancer, 394 (41.4%) had SLN metastasis, and 558 (58.6%) had no metastasis. Univariate analysis found that the shape, orientation, margin, posterior features, calcifications, architectural distortion, duct changes, and suspicious lymph nodes of breast cancer lesions on ultrasound were associated with SLN metastasis. Among the 10 ML algorithms, XGBoost had the best overall diagnostic performance for SLN metastasis, with an Average-AUC of 0.952, Average-Kappa of 0.763, and Average-Accuracy of 0.891. In the validation cohort, the XGBoost model had an AUC of 0.916, an accuracy of 0.846, a sensitivity of 0.870, a specificity of 0.862, and an F1-score of 0.826. The diagnostic performance of the XGBoost model was significantly higher than that of experienced radiologists in some cases (P<0.001). SHAP visualization of the model showed that suspicious lymph nodes detected on ultrasound, microcalcifications in the primary tumor, spiculated margins of the primary tumor, and architectural distortion around the lesion contributed most to the predictions of the XGBoost model.
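To make the AUC values above more concrete: the AUC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison using only the standard library. The function name, labels, and scores are made up for illustration and are not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (1 = SLN metastasis).
    scores: iterable of predicted probabilities or risk scores.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0  # positive case ranked above negative case
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, so it is meant for understanding rather than for large cohorts; library implementations typically use a sort-based computation instead.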

Conclusions: The XGBoost model based on the ultrasound signs of the primary breast tumor and its surrounding tissues and lymph nodes shows high diagnostic performance for predicting SLN metastasis. Visual explanation with SHAP makes the model an effective tool for guiding preoperative clinical decisions.

Keywords: SHAP; XGBoost; breast cancer; sentinel lymph node metastasis; ultrasound signs.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
The flowchart of the study.
Figure 2
The 10-fold cross-validation of machine learning algorithms.
Figure 3
ROC curves of the validation cohort.
Figure 4
Validation cohort DET curves of 10 machine learning models.
Figure 5
Learning curve of the XGBoost model.
Figure 6
The bar chart of the SHAP summary plot shows the effect of each ultrasound sign on the XGBoost model. “Suspicious lymph node” was the factor that contributed the most to the prediction result; margin, architectural distortion, and calcifications also contributed substantially.
Figure 7
The scatter plot of the SHAP summary plot uses color to show the relationship between each feature value and the predicted probability, including positive and negative effects on the prediction. The three signs “suspicious lymph node,” “architectural distortion,” and “calcifications” are very clearly separated, and “margin” is relatively clearly separated. The higher the value (red), the greater the likelihood of SLN metastasis.
Figure 8
Sankey plot shows the distribution of ultrasound signs of breast cancer lesions in the primary cohort.
Figure 9
The SHAP force plot shows, in red and blue, the positive or negative impact of each feature value on the diagnosis made by the XGBoost model.
Figure 10
Data from a 46-year-old female patient. (A) Ultrasound of the right breast showed a hypoechoic lesion that was not parallel to the skin, irregular in shape, with spiculated margins and disordered echoes in the surrounding structures; (B) Ultrasound of the right axilla showed echoes of suspicious lymph nodes. Pathological findings: invasive ductal carcinoma with metastases in the sentinel lymph nodes; (C) The waterfall chart shows how the XGBoost model arrived at its prediction of SLN metastasis in this case. For this patient, the predicted outcome was 77.2% (baseline: 44.5%), and the high-risk factors for SLN metastasis included suspicious lymph nodes, spiculated lesion margins, and architectural distortion.
Figure 11
Data from a 62-year-old female patient. (A) Ultrasound of the right breast showed a mixed echogenic lesion that was not parallel to the skin, irregular in shape, lobulated at the margin, and echogenic posteriorly; (B) No suspicious lymph node echo was detected in the right axilla. Pathological findings: invasive ductal carcinoma with no metastases in the sentinel lymph nodes; (C) The waterfall chart shows how the XGBoost model arrived at its prediction of SLN metastasis in this case. For this patient, the predicted outcome was 19.2% (baseline: 44.5%), and the favorable factors mainly included a lobulated lesion margin, no suspicious lymph nodes, no obvious architectural distortion around the lesion, and no calcification in the lesion.
Figure 12
Receiver operating characteristic (ROC) curves of the XGBoost model and radiologists. The areas under the curve (AUCs) of the two methods (0.916 vs. 0.758) differed significantly according to the DeLong test (P<0.001).
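The waterfall charts in Figures 10 and 11 start from a baseline value and end at the patient's predicted value, with each ultrasound sign adding or subtracting a contribution. That additive property can be sketched with an exact Shapley computation by enumeration over a tiny toy model. Everything below is an assumption made for illustration: the three feature names, the `toy_model` scoring function, its coefficients, and the background cases are hypothetical, and this is neither the study's XGBoost model nor the algorithm the SHAP library uses for tree models.

```python
from itertools import combinations
from math import factorial

# Hypothetical binary ultrasound signs (1 = present, 0 = absent).
FEATURES = ["suspicious_lymph_node", "spiculated_margin", "architectural_distortion"]


def toy_model(x):
    """Hypothetical risk score; NOT the study's XGBoost model."""
    return 0.10 + 0.40 * x[0] + 0.20 * x[1] + 0.15 * x[2] + 0.10 * x[0] * x[1]


def coalition_value(model, x, background, subset):
    """Mean prediction when features in `subset` take the patient's values
    and all other features take values from each background case."""
    total = 0.0
    for b in background:
        mixed = [x[i] if i in subset else b[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition of features."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Hypothetical background cases and one hypothetical patient.
background = [[0, 0, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]]
patient = [1, 1, 1]

baseline = coalition_value(toy_model, patient, background, set())
phi = shapley_values(toy_model, patient, background)
prediction = toy_model(patient)

print(f"baseline: {baseline:.4f}")
for name, value in zip(FEATURES, phi):
    print(f"{name}: {value:+.4f}")
print(f"baseline + contributions: {baseline + sum(phi):.4f}")
print(f"model prediction:         {prediction:.4f}")
```

The last two printed values agree, which is the additivity that lets a waterfall chart walk from the baseline to an individual prediction. Enumeration is exponential in the number of features, so it is only practical for small illustrations like this one.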