Refine XGBoost with SHAP explainability for non-invasive early detection of diabetic kidney disease: Estimated cardiac output as a potential indicator
- PMID: 41138643
- DOI: 10.1016/j.cmpb.2025.109122
Abstract
Introduction: Moderately increased (micro) albuminuria serves as a critical early indicator of Diabetic Kidney Disease (DKD). However, traditional screening methods that rely on laboratory-based analyses face significant challenges in enabling timely and continuous monitoring. This study addresses these limitations by introducing a non-invasive approach for albuminuria risk detection, allowing real-time estimation of mild albuminuria increases using vital signs and body measurements.
Methods: We developed a non-invasive model for albuminuria risk detection using vital signs and body measurements. Data were drawn from the NHANES cohort (USA) and a Bangladeshi cohort of people with diabetes (PwD). Feature selection identified four non-laboratory predictors - estimated cardiac output (eCO), body mass index, waist circumference, and diabetes duration - as the most informative inputs. The proposed models were benchmarked against baseline machine learning approaches and existing methods developed over the past decade, with model interpretability assessed via SHapley Additive exPlanations (SHAP) contributions.
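As an illustration of how the four predictors named in the Methods could be assembled from bedside measurements, the following Python sketch builds one feature vector. The abstract does not state how eCO is computed; the sketch assumes an uncalibrated Liljestrand-Zander style estimate (heart rate times pulse pressure divided by the sum of systolic and diastolic pressure), which may differ from the authors' formula. The `Visit` class, its field names, and the calibration constant `k` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Visit:
    """One set of non-laboratory measurements (hypothetical field names)."""
    systolic_mmhg: float
    diastolic_mmhg: float
    heart_rate_bpm: float
    weight_kg: float
    height_m: float
    waist_cm: float
    diabetes_years: float


def estimated_cardiac_output(v: Visit, k: float = 1.0) -> float:
    """Liljestrand-Zander style estimate: k * HR * PP / (SBP + DBP).

    ASSUMPTION: the abstract does not give the eCO formula. With k = 1.0
    the result is an uncalibrated index, not litres per minute.
    """
    pulse_pressure = v.systolic_mmhg - v.diastolic_mmhg
    pressure_sum = v.systolic_mmhg + v.diastolic_mmhg
    return k * v.heart_rate_bpm * pulse_pressure / pressure_sum


def body_mass_index(v: Visit) -> float:
    """Standard BMI: weight in kilograms divided by height in metres squared."""
    return v.weight_kg / v.height_m ** 2


def feature_vector(v: Visit) -> List[float]:
    """Order follows the abstract: eCO, BMI, waist circumference, diabetes duration."""
    return [
        estimated_cardiac_output(v),
        body_mass_index(v),
        v.waist_cm,
        v.diabetes_years,
    ]
```

A vector built this way could be passed to any tabular classifier. The abstract reports XGBoost as the best-performing choice but gives no hyperparameters, so none are assumed here.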
Results: Our best model, an XGBoost classifier, achieved an AUC of 0.75 [0.67-0.84], an accuracy of 0.70, and a macro F1 score of 0.68, outperforming other non-invasive risk scores (0.58) and machine learning baselines. Validation against an external reference risk score confirmed superior precision-recall balance for both positive (microalbuminuria) and negative classes.
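For readers unfamiliar with the three metrics reported in the Results, the sketch below shows how AUC, accuracy, and macro F1 can be computed for a binary classifier using only the Python standard library. It is a minimal illustration with hypothetical function names, not the authors' evaluation code. It does not reproduce the reported confidence interval, whose estimation method the abstract does not state.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 score treating `label` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores for classes 0 and 1."""
    return (f1_for_class(y_true, y_pred, 0) + f1_for_class(y_true, y_pred, 1)) / 2
```

Macro F1 weights the positive (microalbuminuria) and negative classes equally, so it does not reward a model that simply favors the more common class.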
Conclusion: This study demonstrates that a fine-tuned, non-invasive XGBoost model using simple clinical measures can support albuminuria monitoring and early DKD detection without laboratory tests. While the selected predictors may not represent the definitive or optimal feature set, their strong performance highlights the potential of leveraging easily obtainable, clinically relevant measures. In particular, the contribution of eCO underscores a promising direction for exploring heart-kidney-metabolism interactions in DKD risk assessment. Together, these findings highlight a scalable, non-invasive tool for resource-limited settings, an interpretable framework for clinical trust, and a pathway to refining feature sets for both accuracy and biological insight.
Keywords: Albuminuria; DKD; Diabetic kidney disease; Estimated cardiac output (eCO); Machine learning; Non-invasive medical screening.
Copyright © 2025 The Authors. Published by Elsevier B.V. All rights reserved.
Conflict of interest statement
Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
