Lipids Health Dis. 2025 Jul 17;24(1):241.
doi: 10.1186/s12944-025-02664-w.

Explainable machine learning-driven models for predicting Parkinson's disease and its prognosis: obesity patterns associations and models development using NHANES 1999-2018 data

Jiaxin Fan et al. Lipids Health Dis.

Abstract

Background: Parkinson's disease (PD) is a prevalent neurodegenerative condition, and the effect of obesity on PD remains controversial. We aimed to investigate the associations of obesity patterns with PD and all-cause mortality, and to develop machine learning (ML)-driven predictive and prognostic models for PD.

Methods: A total of 51,394 adults from the National Health and Nutrition Examination Survey (NHANES) 1999-2018 were classified into four obesity patterns by body mass index (BMI) and waist circumference (WC). Associations of obesity patterns with PD risk and all-cause mortality were evaluated with multivariable logistic and Cox proportional hazards regression, respectively, across three adjusted models. Subgroup, sensitivity, and restricted cubic spline (RCS) analyses examined stability, robustness, and nonlinearity. An integrative ML-driven architecture identified key features used to develop predictive and prognostic nomograms, which were validated by the area under the receiver operating characteristic curve (AUCROC) and calibration curves. Survival differences were analyzed using Kaplan-Meier curves. Shapley additive explanations (SHAP) were used to aid model interpretation.
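
The four-way grouping described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code. The abstract does not state the cutoffs, so the BMI threshold of 30 kg/m² and the sex-specific WC thresholds of 102 cm (men) and 88 cm (women) are assumptions taken from commonly used definitions. The function name `obesity_pattern` and the label "non-obese" are hypothetical; the other three labels follow the figure captions.

```python
# Assumed cutoffs (NOT given in the abstract); replace with the study's definitions.
BMI_CUTOFF = 30.0                            # kg/m^2, assumed general-obesity threshold
WC_CUTOFF = {"male": 102.0, "female": 88.0}  # cm, assumed abdominal-obesity thresholds


def obesity_pattern(bmi: float, wc: float, sex: str) -> str:
    """Classify one participant into one of four obesity patterns.

    general obesity   : BMI at or above the cutoff only
    abdominal obesity : WC at or above the sex-specific cutoff only
    compound obesity  : both criteria met
    non-obese         : neither criterion met (hypothetical label)
    """
    general = bmi >= BMI_CUTOFF
    abdominal = wc >= WC_CUTOFF[sex]
    if general and abdominal:
        return "compound obesity"
    if general:
        return "general obesity"
    if abdominal:
        return "abdominal obesity"
    return "non-obese"


# Usage: a man with BMI 32 kg/m^2 and WC 110 cm meets both assumed criteria.
print(obesity_pattern(32.0, 110.0, "male"))
```

Combining both measures this way separates participants who are obese by only one criterion from those who meet both, which is the group the Results single out.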

Results: Compound obesity significantly increased PD risk (Model 1: OR = 1.83, P < 0.001; Model 2: OR = 1.70, P = 0.002; Model 3: OR = 1.71, P = 0.006) yet correlated with reduced all-cause mortality in PD patients (Model 1: HR = 0.43, P = 0.003; Model 2: HR = 0.75, P = 0.428; Model 3: HR = 0.41, P = 0.033). Subgroup analysis revealed that only HbA1c modified the association of compound obesity with PD (P for interaction = 0.031). Sensitivity analyses confirmed robustness (pooled OR = 1.83, P < 0.001; pooled HR = 0.43, P = 0.003). RCS analyses revealed BMI-dependent PD risk escalation (P for nonlinearity = 0.008, BMI < 45.0 kg/m²), an inverted U-shaped WC-PD link (P for nonlinearity < 0.001), and an inverse dose-response BMI-mortality relationship (P for nonlinearity = 0.003), along with a multiphasic WC-mortality association (P for threshold = 0.555 at 95 cm and 0.091 at 118 cm). LASSO + RF identified eight features, and the resulting nomograms achieved moderate performance for PD prediction (SMOTE set: AUCROC = 0.75, Brier = 0.20) and prognosis (train set: AUCROC = 0.72, Brier = 0.22), with similar results in the test set (prediction: AUCROC = 0.70, Brier = 0.01; prognosis: AUCROC = 0.87, Brier = 0.18). No 24-month survival differences were observed across the four obesity patterns (train set: log-rank P = 0.73; test set: log-rank P = 0.32).
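
For readers less familiar with the two validation metrics reported above, the sketch below shows how an AUCROC and a Brier score are computed from binary outcomes and predicted probabilities. It is a plain-Python illustration on made-up toy values, not the study's pipeline or data; the function names `auc_roc` and `brier_score` are hypothetical.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc_roc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs in which the positive case
    receives the higher predicted probability; ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUCROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (not from the study).
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(y, p), brier_score(y, p))
```

In this toy example, three of the four positive-negative pairs are ranked correctly, so the AUCROC is 0.75. A model that always predicts 0.5 has a Brier score of 0.25, which gives a rough reference point for the values reported above. When the outcome is rare, predictions close to zero also yield a small Brier score, so the score is best read alongside the AUCROC.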

Conclusions: This study provides preliminary evidence that compound obesity significantly increases PD risk yet is paradoxically associated with reduced all-cause mortality in PD patients. The validated predictive and prognostic nomograms for PD achieve relatively robust performance. Nonetheless, extensive longitudinal studies are required to validate these exploratory findings more comprehensively.

Keywords: Association; Multiple machine learning; Obesity pattern; Parkinson's disease; Prediction model; Shapley additive explanations.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: The NCHS Ethics Review Board evaluated and granted approval for the NHANES survey ( https://www.cdc.gov/nchs/nhanes/irba98.htm ), and informed consent was obtained from all participants at the time of NHANES enrollment. Because the data are open access and deidentified, this study was exempt from ethical approval by the Ethics Committee of Shaanxi Provincial People’s Hospital. Consent for publication: All authors have read the final manuscript, validated the accuracy of the data, and approved the manuscript for publication. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Participant filtering workflow and conceptual architecture of this study
Fig. 2
Correlation heatmaps of Kendall (A), Pearson (B), Spearman (C), and MIC (D) correlation coefficient algorithms. MIC, Maximum Information Coefficient; PIR, poverty-to-income ratio; TC, total cholesterol; HDL, high-density lipoprotein; HbA1c, glycated hemoglobin; BUN, blood urea nitrogen; ALT, alanine transaminase; AST, aspartate aminotransferase; CVD, cardiovascular disease; DM, diabetes mellitus
Fig. 3
Subgroup analysis forest plots for association estimates between compound obesity and PD (A) and between compound obesity and all-cause mortality (B). HbA1c, glycated hemoglobin; DM, diabetes mellitus; OR, odds ratio; HR, hazard ratio; 95%CI, 95% confidence interval
Fig. 4
Multiple imputations forest plots for association estimates: between general obesity and PD (A), abdominal obesity and PD (B), compound obesity and PD (C), general obesity and all-cause mortality (D), abdominal obesity and all-cause mortality (E), and compound obesity and all-cause mortality (F). OR, odds ratio; HR, hazard ratio; 95%CI, 95% confidence interval
Fig. 5
Restricted cubic spline plots with four knots of BMI-PD (A), WC-PD (B), BMI-all-cause mortality (C), and WC-all-cause mortality (D) associations. The red line represents the odds ratios, while the blue lines denote the 95% confidence intervals. BMI, body mass index; WC, waist circumference
Fig. 6
The generation and predictive value of key features for PD. C-index of 76 algorithm-combination models (A). Nomogram of PD predictive model (B). ROC (C) and calibration (D) curves for PD predictive model in SMOTE set. ROC (E) and calibration (F) curves for PD predictive model in test set. HDL, high-density lipoprotein; BUN, blood urea nitrogen; AST, aspartate aminotransferase; PD, Parkinson's disease; ROC, receiver operating characteristic; AUC, area under the ROC curve; LASSO, least absolute shrinkage and selection operator; RF, random forest; SMOTE, Synthetic Minority Oversampling Technique
Fig. 7
The prognostic value of key features for PD. Nomogram of PD prognostic model (A). Time-point ROC (B) and calibration (C) curves for PD prognostic model in train set. Time-point ROC (D) and calibration (E) curves for PD prognostic model in test set. HDL, high-density lipoprotein; BUN, blood urea nitrogen; AST, aspartate aminotransferase; PD, Parkinson's disease; ROC, receiver operating characteristic; AUC, area under the ROC curve
Fig. 8
Kaplan–Meier curves for PD across different obesity patterns. Kaplan–Meier curve (A) and risk table (B) in train set. Kaplan–Meier curve (C) and risk table (D) in test set. KM, Kaplan–Meier
Fig. 9
SHAP analysis for LASSO + RF model. SHAP summary plot (A) and interaction plot (B) of PD predictive model. SHAP force plots of participants without PD (C) and with PD (D). SHAP summary plot (E) and interaction plot (F) of PD prognostic model. SHAP force plots of PD participants with survival (G) and with death (H). HDL, high-density lipoprotein; BUN, blood urea nitrogen; AST, aspartate aminotransferase; PD, Parkinson's disease; LASSO, least absolute shrinkage and selection operator; RF, random forest; SHAP, Shapley additive explanations
Fig. 10
Feature importance analyses for PD predictive (A) and prognostic (B) models. HDL, high-density lipoprotein; BUN, blood urea nitrogen; AST, aspartate aminotransferase; PD, Parkinson's disease; SHAP, Shapley additive explanations
