Construction and validation of machine learning models for predicting lymph node metastasis in cutaneous malignant melanoma: a large population-based study
- PMID: 40104720
- PMCID: PMC11912072
- DOI: 10.21037/tcr-24-1672
Abstract
Background: Lymph node status is essential for determining the prognosis of cutaneous malignant melanoma (CMM). This study aimed to develop a machine learning (ML) model for predicting lymph node metastasis (LNM) in CMM.
Methods: We collected data on 6,196 patients, including known clinicopathologic variables, from the Surveillance, Epidemiology, and End Results (SEER) database. Six ML algorithms were used to build models predicting the presence of LNM in CMM: logistic regression (LR), support vector machine (SVM), Complement Naive Bayes (CNB), Extreme Gradient Boosting (XGBoost), random forest (RF), and k-nearest neighbors (kNN). The adaptive synthetic (ADASYN) sampling method was used to address class imbalance. Model performance was assessed by average precision (AP), sensitivity, specificity, accuracy, F1 score, precision-recall curves, calibration plots, and decision curve analysis (DCA). SHapley Additive exPlanation (SHAP) analysis was used to generate visualized explanations for individual patients.
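The evaluation metrics named in the Methods can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 threshold are assumptions chosen for illustration, and the AP routine assumes no tied scores.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def threshold_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, precision and F1 at a fixed threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    acc = (tp + tn) / len(y_true)
    f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
    return {"sensitivity": sens, "specificity": spec,
            "accuracy": acc, "precision": prec, "f1": f1}


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each positive case.

    Assumes no tied scores; libraries typically group ties.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` cases
    return total / n_pos if n_pos else 0.0


def net_benefit(y_true: Sequence[int], y_pred: Sequence[int], threshold: float) -> float:
    """Decision-curve net benefit: TP/n - FP/n * pt / (1 - pt)."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    n = len(y_true)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy data (hypothetical): 1 = LNM present, 0 = LNM absent.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = threshold_metrics(y_true, y_pred)
ap = average_precision(y_true, scores)
nb = net_benefit(y_true, y_pred, threshold=0.5)
```

AP summarizes the whole precision-recall curve, whereas sensitivity, specificity, accuracy, and F1 each depend on the chosen probability threshold. Net benefit additionally weights false positives by the odds of the threshold.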
Results: Among the 6,196 CMM cases, 19.9% (n=1,234) presented with LNM. The XGBoost model showed the best predictive performance among the six algorithms (AP of 0.805). In the XGBoost model, age and Breslow thickness were the two most important factors related to LNM.
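The class imbalance behind these figures can be checked with a short calculation. The sketch below uses only the counts reported above; the variable names are illustrative. It shows why accuracy alone is a weak yardstick here, and why an AP of 0.805 is read against a no-skill baseline roughly equal to the LNM prevalence.

```python
# Counts reported in the Results.
n_total = 6196  # CMM cases
n_lnm = 1234    # cases with lymph node metastasis

# Share of positive cases; a random ranking has an expected AP of roughly this value.
prevalence = n_lnm / n_total

# Accuracy of a trivial classifier that labels every patient as LNM-negative.
majority_accuracy = (n_total - n_lnm) / n_total

print(f"prevalence: {prevalence:.1%}")
print(f"always-negative accuracy: {majority_accuracy:.1%}")
```

A model that never predicts LNM would be correct for about four in five patients while detecting no metastases, which is the motivation for reporting AP and for resampling methods such as ADASYN.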
Conclusions: The XGBoost model predicted LNM in CMM with high average precision. This model may help surgeons evaluate surgical approaches and determine the extent of surgery, and may guide subsequent adjuvant therapy, thereby improving patient prognosis.
Keywords: Cutaneous malignant melanoma (CMM); Surveillance, Epidemiology, and End Results (SEER); lymph node metastasis (LNM); machine learning (ML); SHapley Additive exPlanation (SHAP).
Copyright © 2025 AME Publishing Company. All rights reserved.
Conflict of interest statement
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-1672/coif). The authors have no conflicts of interest to declare.