J Cancer. 2025 Mar 3;16(6):2041-2061.
doi: 10.7150/jca.110141. eCollection 2025.

Development and validation of an explainable machine learning model to predict Delphian lymph node metastasis in papillary thyroid cancer: a large cohort study

Jie Cui et al. J Cancer. 2025.

Abstract

Background: The incidence of papillary thyroid cancer (PTC) has risen substantially, and PTC tends to exhibit early-stage lymph node metastasis (LNM), which increases the risk of postoperative recurrence and decreases survival. No machine learning (ML) model is currently available to predict Delphian LNM (DLNM) in PTC. This study comprehensively assesses the significance of standard clinical indicators for DLNM prediction and constructs a dependable, widely applicable ensemble ML framework to support surgical planning and therapeutic decision-making.

Methods: This study included 1993 consecutive PTC patients who underwent curative surgery from 2020 to 2023. Based on the time to surgery, the cohort was divided into a training cohort (n=1395) and a validation cohort (n=598). The Boruta algorithm was applied to select feature variables, followed by the development of an ML framework combining 12 ML techniques across 113 permutations to create a unified prediction model (DLNM index). ROC analysis, calibration curves, bootstrapping, 10-fold cross-validation, restricted cubic spline (RCS) regression, multivariable logistic regression, and subgroup analysis were used to evaluate the predictive accuracy and discriminative ability of the DLNM index. Model interpretation and feature impact visualisation were performed with the Shapley Additive Explanations (SHAP) method.

Results: The 14 features selected by the Boruta algorithm were integrated into 12 ML approaches, yielding 113 permutations, from which the best-performing algorithm was identified to establish a consensus ML-derived diagnostic model (DLNM index). The DLNM index exhibited excellent diagnostic value and discriminative ability, with a mean AUC of 0.763 across the two cohorts, and served as an independent risk factor (P < 0.001). It showed better predictive performance and yielded a larger net benefit than the published model (P < 0.05). Bootstrapping, 10-fold cross-validation, and subgroup analysis showed that the DLNM index was generally robust and generalisable. SHAP explained the importance ranking of features (tumour size, right 4 region LN, FT4, TG, and T3) and visualised global and individual risk prediction. RCS regression suggested nonlinear relationships of the DLNM index, TG, tumour size, and FT3 with DLNM risk.

Conclusion: An optimised explainable model (DLNM index) comprising 12 clinical features and based on multiple ML algorithms was constructed and validated. It provides an economical, readily available, and precise diagnostic instrument for DLNM in PTC, with potential implications for clinical practice. The SHAP explanation and RCS regression quantify and visualise tumour size and FT4 as the most important variables that increase DLNM risk.

Keywords: delphian lymph node metastasis; machine learning approaches; model interpretability; papillary thyroid cancer; prediction model.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interest exists.

Figures

Figure 1
Important characteristic variables identified by the Boruta algorithm. The horizontal axis shows the names of variables, and the vertical axis shows the Z-score of variables. The boxplot shows the Z-score of variables during the model calculation process. (A, B) Entire cohort. (C, D) Training cohort. (E, F) Validation cohort.
Figure 2
Establishment and validation of a consensus diagnostic model for DLNM via an integrative procedure based on 12 machine learning (ML) algorithms. (A) A total of 113 ML algorithm combinations were used to build prediction models under the leave-one-out cross-validation (LOOCV) framework, and the area under the curve (AUC) of each model was calculated in all datasets (see Figure S2). (B, C) Lasso coefficient profiles of the 12 predictors. A vertical line is drawn at the optimal value chosen by the 1-s.e. criterion, which results in 12 non-zero coefficients (B ultrasound tumor size, left CLN metastasis, left 3 region LN metastasis, right CLN metastasis, right 3 region LN metastasis, right 4 region LN metastasis, CT CLN metastasis, TG, TGAB, T3, T4, and FT4). (D) Lasso with 10-fold cross-validation was used to identify candidate features. The Y-axis shows the mean-square error and the X-axis shows log(λ); the dotted vertical lines mark the minimum and 1 standard error values of λ. The features selected at the minimum standard error value of λ were used in the final DLNM model.
Figure 3
Evaluation of the diagnostic value and fitting ability of the DLNM index. (A-C) Receiver operating characteristic (ROC) curves with AUC values evaluating the predictive efficacy of the DLNM index in the entire cohort (A), training cohort (B), and validation cohort (C). (D-F) Calibration curves for the DLNM index in the entire cohort (D), training cohort (E), and validation cohort (F). The X-axis is the predicted probability of DLNM and the Y-axis is the observed probability of DLNM.
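The two quantities plotted in this figure can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function names `roc_auc` and `calibration_bins`, the equal-width binning, and the toy inputs are all assumptions. The AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and the calibration helper pairs the mean predicted probability with the observed event rate in each bin.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_bins(labels, probs, n_bins=5):
    """Return (mean predicted, observed rate, count) per non-empty bin.

    Bins are equal-width on [0, 1]; this binning choice is an assumption.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    result = []
    for b in bins:
        if b:
            mean_pred = sum(p for _, p in b) / len(b)
            observed = sum(y for y, _ in b) / len(b)
            result.append((mean_pred, observed, len(b)))
    return result
```

On a well-calibrated model the mean predicted probability and the observed rate in each bin lie close to the diagonal of the calibration plot.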
Figure 4
Evaluation of the clinical usefulness of the DLNM index and its nonlinear relationship with DLNM risk. (A-C) Decision curve analysis was applied to evaluate the clinical usefulness of the DLNM index in the entire cohort (A), training cohort (B), and validation cohort (C). The Y-axis represents the net benefit. The black line represents the hypothesis that no patients receive treatment. The X-axis represents the threshold probability, which is the probability at which the expected benefit of treatment equals the expected benefit of avoiding treatment. (D-F) Potential nonlinear relationship between DLNM index levels and DLNM risk, assessed by restricted cubic spline regression with 3 knots in the entire cohort (D), training cohort (E), and validation cohort (F). The brown line and shaded area represent the estimated odds ratio (OR) and its 95% confidence interval (CI).
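The net benefit on the Y-axis of a decision curve is conventionally computed as TP/N - (FP/N) × pt/(1 - pt), where pt is the threshold probability. The sketch below assumes that standard definition; the abstract does not state which implementation the authors used, and the function names and toy data are hypothetical.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference curve: every patient is treated regardless of the model."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# The "treat none" reference line is 0 at every threshold.
labels = [1, 1, 0, 0]
probs = [0.9, 0.6, 0.7, 0.2]
print(net_benefit(labels, probs, 0.5))     # 0.25
print(net_benefit_treat_all(labels, 0.5))  # 0.0
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both reference strategies.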
Figure 5
Comparison of the optimized DLNM index with other models. (A) Multiple ROC analysis was performed to compare the diagnostic performance of the DLNM index against the Li model and the Zhou model. (B) Predictive performance was compared using a comprehensive array of metrics: accuracy, prevalence, recall, F1-score, Matthews correlation coefficient (MCC), precision, specificity, false negative rate (FNR), and false positive rate (FPR). (C) Decision curve analysis was applied to evaluate the clinical usefulness of the DLNM index against the Li model and the Zhou model. The Y-axis represents the net benefit. The black line represents the hypothesis that no patients receive treatment. The X-axis represents the threshold probability, which is the probability at which the expected benefit of treatment equals the expected benefit of avoiding treatment.
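All of the metrics listed for panel (B) follow from the four cells of a confusion matrix. The sketch below uses their standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def _ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    n = tp + fp + tn + fn
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, n),
        "prevalence": _ratio(tp + fn, n),
        "recall": _ratio(tp, tp + fn),          # sensitivity
        "specificity": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
        "fnr": _ratio(fn, tp + fn),             # equals 1 - recall
        "fpr": _ratio(fp, fp + tn),             # equals 1 - specificity
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the MCC uses all four cells, so it stays informative when the positive and negative classes are imbalanced.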
Figure 6
Global and local model explanation by the SHAP method. (A) Summary plot showing the 12 features ranked by mean absolute SHAP value. (B) Each variable name is shown on the left-hand side, with the variable making the greatest contribution listed at the top. To the right of the variables are colored lines made up of individual points, each corresponding to an observation in the population. A higher value of the variable is shown in yellow, and a lower value is shown in purple. A point farther to the right (i.e., a higher SHAP value) indicates that the variable contributes to a prediction of the positive target, here DLNM. (C) An example of risk factor analysis for one patient with PTC, showing how the features push the individual prediction towards the “DLNM” class. (D-I) One-way SHAP dependence plots of the 6 important predictors (continuous variables). (D) B ultrasound tumor size. (E) FT4. (F) TG. (G) T3. (H) T4. (I) TGAB. Each dependence plot shows how a single feature affects the output of the prediction model, and each dot represents a single patient. The x-axis shows the value of the predictor and the y-axis shows its SHAP value. For example, in (D), patients with larger tumor size (increasing along the x-axis) had higher SHAP values (increasing along the y-axis), indicating a higher likelihood of DLNM.
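To make the SHAP values in these plots concrete, the sketch below computes exact Shapley values by brute force for a tiny model. It is a teaching sketch under stated assumptions, not the method used in the study: absent features are replaced by a fixed baseline value, every feature subset is enumerated (feasible only for a handful of features), and the three-feature linear model is hypothetical. Practical SHAP implementations rely on faster, model-specific or sampling-based estimators.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition take their baseline value (an assumption).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


# Hypothetical linear risk score over three standardised features.
def toy_model(z):
    return 2.0 * z[0] + 3.0 * z[1] - 1.0 * z[2]


phis = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print(phis)  # approximately [2.0, 6.0, -3.0]
```

The values sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that lets a summary plot or a single-patient plot decompose a prediction feature by feature.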
Figure 7
Potential nonlinear relationships between the levels of continuous predictors and DLNM risk, assessed by restricted cubic spline regression with 3 knots. (A) B ultrasound tumor size. (B) TSH. (C) TG. (D) TGAB. (E) FT3. (F) FT4. (G) T3. (H) T4. The brown line and shaded area represent the estimated odds ratio (OR) and its 95% confidence interval (CI). TSH, thyroid stimulating hormone; TG, thyroglobulin; TGAB, anti-thyroglobulin antibodies; T3, triiodothyronine; T4, thyroxine; FT3, free triiodothyronine; FT4, free thyroxine.
