BMJ Open. 2025 Apr 3;15(4):e092463.
doi: 10.1136/bmjopen-2024-092463.

Development and internal validation of an interpretable risk prediction model for diabetic peripheral neuropathy in type 2 diabetes: a single-centre retrospective cohort study in China

Lianhua Liu et al. BMJ Open. 2025.

Abstract

Objective: Diabetic peripheral neuropathy (DPN) is a common and serious complication of diabetes, which can lead to foot deformity, ulceration, and even amputation. Early identification is crucial, as more than half of DPN patients are asymptomatic in the early stage. This study aimed to develop and validate multiple risk prediction models for DPN in patients with type 2 diabetes mellitus (T2DM) and to apply the Shapley Additive Explanation (SHAP) method to interpret the best-performing model and identify key risk factors for DPN.

Design: A single-centre retrospective cohort study.

Setting: The study was conducted at a tertiary teaching hospital in Hainan.

Participants and methods: Data were retrospectively collected from the electronic medical records of patients with diabetes admitted between 1 January 2021 and 28 March 2023. After data preprocessing, 73 variables were retained for baseline analysis. Feature selection was performed using univariate analysis combined with recursive feature elimination (RFE). The dataset was split into training and test sets in an 8:2 ratio, with the training set balanced via the Synthetic Minority Over-sampling Technique. Six machine learning algorithms were applied to develop prediction models for DPN. Hyperparameters were optimised using grid search with 10-fold cross-validation. Model performance was assessed using various metrics on the test set, and the SHAP method was used to interpret the best-performing model.
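The split-then-balance order described above can be illustrated with a minimal sketch. The abstract does not name the software used, so this is a pure-Python stand-in rather than the authors' implementation: `stratified_split` and `smote_like` are hypothetical helpers, the toy data are invented, and the interpolation step is a simplified version of SMOTE. The point it shows is that synthetic minority rows are generated from the training set only, so the test set keeps its original class ratio.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def smote_like(minority_rows, n_new, k=3, seed=0):
    """Simplified SMOTE: interpolate between a minority row and one of
    its k nearest minority neighbours (squared Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Invented toy data: 88 positive and 12 negative rows with two features.
rng = random.Random(42)
labels = [1] * 88 + [0] * 12
rows = [[rng.gauss(y, 1.0), rng.gauss(-y, 1.0)] for y in labels]

# 8:2 split first, then oversample the minority class in the training set only.
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
X_train = [rows[i] for i in train_idx]
y_train = [labels[i] for i in train_idx]

minority = [x for x, y in zip(X_train, y_train) if y == 0]
n_new = y_train.count(1) - y_train.count(0)
X_bal = X_train + smote_like(minority, n_new)
y_bal = y_train + [0] * n_new

print(len(train_idx), len(test_idx), y_bal.count(0), y_bal.count(1))
```

Balancing before the split would let synthetic rows derived from test patients leak into training, which is why the oversampling is applied after the split.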

Results: The study included 3343 T2DM inpatients, with a median age of 60 years (IQR 53-69), and 88.6% (2962/3343) had DPN. The RFE method identified 12 key factors for model construction. Among the six models, XGBoost showed the best predictive performance, achieving an area under the curve of 0.960, accuracy of 0.927, precision of 0.969, recall of 0.948, F1-score of 0.958 and a G-mean of 0.850 on the test set. The SHAP analysis highlighted C reactive protein, total bile acids, gamma-glutamyl transpeptidase, age and lipoprotein(a) as the top five predictors of DPN.
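The reported metrics can be cross-checked with a short calculation. The sketch below assumes the usual definitions, which the abstract does not spell out: F1 as the harmonic mean of precision and recall, and G-mean as the square root of sensitivity (recall) times specificity. Under that assumption, the reported F1 follows from the reported precision and recall, and the G-mean implies a specificity of roughly 0.76. That specificity is a back-calculation, not a figure stated in the abstract.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    return math.sqrt(sensitivity * specificity)


# Values reported for XGBoost on the test set.
precision, recall, reported_g_mean = 0.969, 0.948, 0.850

f1 = f1_score(precision, recall)

# Solve G-mean = sqrt(recall * specificity) for specificity.
implied_specificity = reported_g_mean ** 2 / recall

print(round(f1, 3))                   # 0.958, matching the reported F1-score
print(round(implied_specificity, 3))  # 0.762
```

With 88.6% of the cohort having DPN, accuracy, precision and recall are dominated by the majority class, so the G-mean is the metric here that reflects how well the minority (non-DPN) class is recognised.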

Conclusions: The machine learning approach successfully established a DPN risk prediction model with excellent performance. The use of the interpretable SHAP method could enhance the model's clinical applicability.

Keywords: Diabetes Mellitus, Type 2; Diabetic neuropathy; Machine Learning; Retrospective Studies; Risk Factors.

Conflict of interest statement

Competing interests: None declared.

Figures

Figure 1. Study workflow. AUC, area under the curve; DPN, diabetic peripheral neuropathy; FNR, false negative rate; FPR, false positive rate; RFE, recursive feature elimination; SHAP, Shapley Additive Explanation; SMOTE, Synthetic Minority Over-sampling Technique.
Figure 2. Change in AUC with the number of features (left) and ROC curves for the different models on the test set (right). AdaBoost, adaptive boosting; AUC, area under the curve; DT, decision tree; LR, logistic regression; RF, random forest; ROC, receiver operating characteristic; SVM, support vector machine; XGBoost, extreme gradient boosting.
Figure 3. Confusion matrices of the six machine learning models on the test set. AdaBoost, adaptive boosting; DT, decision tree; LR, logistic regression; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.
Figure 4. SHAP summary plot for the 12 clinical features contributing to the XGBoost model. BMI, body mass index; CRP, C reactive protein; GGT, gamma-glutamyl transpeptidase; LDH, lactate dehydrogenase; NLR, neutrophil-to-lymphocyte ratio; SHAP, Shapley Additive Explanation; TBA, total bile acid; XGBoost, extreme gradient boosting.
Figure 5. SHAP dependence plots of the XGBoost model: dependence on age (left) and the interaction between age and LDH (right). LDH, lactate dehydrogenase; SHAP, Shapley Additive Explanation; XGBoost, extreme gradient boosting.
Figure 6. SHAP force plots for a DPN patient (above) and a non-DPN patient (below). CRP, C reactive protein; DPN, diabetic peripheral neuropathy; GGT, gamma-glutamyl transpeptidase; LDH, lactate dehydrogenase; SHAP, Shapley Additive Explanation; TBA, total bile acid.
