Explainable machine learning for predicting distant metastases in renal cell carcinoma patients: a population-based retrospective study
- PMID: 40800127
- PMCID: PMC12339556
- DOI: 10.3389/fmed.2025.1624198
Abstract
Background: Distant metastasis is a key factor contributing to poor prognosis in renal cell carcinoma (RCC). Early prediction of metastasis is crucial for developing personalized treatment plans and improving patient outcomes. This study aimed to establish and validate a clinical prediction model for distant metastasis in RCC patients.
Methods: Ten machine learning algorithms were used to develop a predictive model for distant metastasis in RCC. Data from 51,566 RCC patients in the Surveillance, Epidemiology, and End Results (SEER) database (2010-2018) were used for model development, and 726 RCC patients from the First Hospital of Shanxi Medical University were used for external validation. Hyperparameters were optimized using grid search with tenfold cross-validation. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), decision curve analysis, calibration curves, precision, and accuracy. Shapley additive explanations (SHAP) were used for model interpretation. The best-performing model was then used to build a web-based calculator that predicts metastasis risk in RCC patients.
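The discrimination and clinical-utility metrics named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names, the four toy labels, and the scores are hypothetical, and the formulas are the textbook definitions (rank-based AUC, average precision as a step-wise summary of the precision-recall curve, and net benefit as used in decision curve analysis).

```python
# Illustrative only: toy data and function names are assumptions, not from the study.

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count concordant (positive, negative) pairs; a tie counts as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision measured at each true positive, ranked by score.

    Tied scores get no special handling in this sketch.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` cases
    if hits == 0:
        raise ValueError("need at least one positive case")
    return total / hits


def net_benefit(y_true, scores, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical labels (1 = distant metastasis) and predicted risks.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
nb = net_benefit(y_true, scores, threshold=0.5)
print(f"AUC={auc:.3f}  AP={ap:.3f}  net benefit at 0.5={nb:.3f}")
```

Plotting net benefit across a range of thresholds gives a decision curve. The pairwise AUC loop is quadratic in the number of cases, so a cohort of SEER size would call for a rank-based or library implementation instead.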
Results: The study included 51,566 RCC patients, of whom 3,667 had distant metastases. Logistic regression identified tumor size, grade, T-stage, N-stage, radiotherapy, chemotherapy, and surgery as independent risk factors. The Extreme Gradient Boosting (XGB) model performed best in the training set (AUC: 0.957, accuracy: 0.898) and was validated externally (AUC: 0.742, accuracy: 0.904). A web-based calculator was developed using the XGB model.
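As context for reading accuracy next to AUC and AUPRC, the reported counts give the event rate in the SEER cohort directly. The short calculation below is a sketch using only the two numbers stated in the Results; the variable names are hypothetical, and the "always negative" rule is a reference point for illustration, not a model from the study.

```python
# Counts taken from the Results; variable names are illustrative assumptions.
n_patients = 51_566
n_metastatic = 3_667

# Share of the SEER cohort with distant metastasis.
prevalence = n_metastatic / n_patients

# Accuracy of a rule that predicts "no metastasis" for every patient.
always_negative_accuracy = 1 - prevalence

print(f"prevalence: {prevalence:.1%}")
print(f"always-negative accuracy: {always_negative_accuracy:.1%}")
```

With an event rate of about 7.1%, a rule that never predicts metastasis would already be about 92.9% accurate in this cohort, which is why ranking-based metrics are informative here. The same prevalence is also the AUPRC expected from a classifier that ranks patients at random. The abstract does not give the number of metastatic cases in the external cohort, so the equivalent figures for that cohort cannot be derived from it.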
Conclusion: This study designed and validated an XGB model using clinicopathologic data to predict the risk of distant metastasis in RCC patients, potentially aiding clinical decision-making.
Keywords: distant metastasis; external validation; machine learning; predictive modeling; renal cell carcinoma; web-based calculator.
Copyright © 2025 Hou, Wang, Lv, Zhou, Guo, Li, Jia, Du and Shuang.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.