Comparison Between Statistical Model and Machine Learning Methods for Predicting the Risk of Renal Function Decline Using Routine Clinical Data in Health Screening

Xia Cao¹ ² ³, Yanhui Lin¹ ² ³, Binfang Yang¹ ² ³, Ying Li¹ ² ³, Jiansong Zhou⁴

Affiliations

¹ Health Management Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, People's Republic of China.
² Health Management Research Center, Central South University, Changsha, Hunan, People's Republic of China.
³ Hunan Chronic Disease Health Management Medical Research Center, Central South University, Changsha, Hunan, People's Republic of China.
⁴ National Clinical Research Center for Mental Disorders, and Department of Psychiatry, The Second Xiangya Hospital, Central South University, Changsha, Hunan, People's Republic of China.

PMID: 35502445
PMCID: PMC9056070
DOI: 10.2147/RMHP.S346856

Xia Cao et al. Risk Manag Healthc Policy. 2022 Apr 26;15:817-826. doi: 10.2147/RMHP.S346856. eCollection 2022.

Abstract

Purpose: Using machine learning methods to make predictions on unseen data offers an opportunity to improve accuracy by exploring complex interactions between risk factors. We therefore evaluated the performance of machine learning (ML) algorithms and compared them with logistic regression for predicting the risk of renal function decline (RFD) using routine clinical data.

Patients and methods: This retrospective cohort study included data from 2166 subjects, aged 35-74 years, provided by an adult health screening follow-up program between 2010 and 2020. Seven ML models were considered (random forest, gradient boosting, multilayer perceptron, support vector machine, K-nearest neighbors, adaptive boosting, and decision tree) and compared with standard logistic regression. There were 24 independent variables, and the baseline estimated glomerular filtration rate (eGFR) was used as the predictive variable.

Results: A total of 2166 participants (mean age 49.2 ± 11.2 years; 63.3% male) were enrolled and randomly divided into a training set (n=1732) and a test set (n=434). The area under the receiver operating characteristic curve (AUROC) for detecting RFD was above 0.85 for the different models during the training phase. The gradient boosting algorithm exhibited the best average prediction accuracy (AUROC: 0.914) among all algorithms validated in this study. Based on AUROC, the ML algorithms improved RFD prediction performance compared with the logistic regression model (AUROC: 0.882), except for the K-nearest neighbors and decision tree algorithms (AUROC: 0.854 and 0.824, respectively). However, the improvements over logistic regression were small (less than 4%) and nonsignificant.
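The AUROC figures above can be read as the probability that a randomly chosen participant with RFD receives a higher predicted risk than a randomly chosen participant without it. The following is a minimal sketch of that interpretation, not the authors' code: the `auroc` helper and the toy labels and scores are hypothetical, and only the sample sizes and AUROC values are taken from the abstract. The abstract does not say whether "less than 4%" is an absolute or a relative difference, so the sketch computes both for gradient boosting against logistic regression.

```python
from itertools import product


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n_pos * n_neg), which is fine for small samples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT from the study: 1 = RFD, 0 = no RFD.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)  # 3 of 4 pairs are ranked correctly

# AUROC values as reported in the abstract.
reported = {
    "gradient boosting": 0.914,
    "logistic regression": 0.882,
    "K-nearest neighbors": 0.854,
    "decision tree": 0.824,
}
baseline = reported["logistic regression"]
abs_gain = reported["gradient boosting"] - baseline  # absolute AUROC difference
rel_gain = abs_gain / baseline                       # relative to the baseline

# Share of participants assigned to the training set (n=1732 of 2166).
train_fraction = 1732 / (1732 + 434)

print(f"toy AUROC: {toy_auc:.2f}")
print(f"absolute gain: {abs_gain:.3f}, relative gain: {rel_gain:.1%}")
print(f"training fraction: {train_fraction:.1%}")
```

Under either reading the best ML model's gain over logistic regression stays below 4%, and the reported sample sizes correspond to roughly an 80/20 split.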

Conclusion: Our results indicate that the proposed health screening dataset-based RFD prediction model using ML algorithms is readily applicable and produces validated results. However, logistic regression performs as well as the ML models in predicting the risk of RFD with simple clinical predictors.

Keywords: algorithm; chronic kidney disease; deep learning; health examination.

Conflict of interest statement

The authors report no conflicts of interest in this work.

Figures

**Figure 1**
Study process and architecture of the RFD prediction model.

**Figure 2**
Receiver operating characteristics curve of the prediction performance for each algorithm in the training set (A) and testing set (B), respectively.
