2022 Apr 26:15:817-826.
doi: 10.2147/RMHP.S346856. eCollection 2022.

Comparison Between Statistical Model and Machine Learning Methods for Predicting the Risk of Renal Function Decline Using Routine Clinical Data in Health Screening

Xia Cao et al. Risk Manag Healthc Policy.

Abstract

Purpose: Using machine learning methods to make predictions on unseen data offers an opportunity to improve accuracy by exploring complex interactions between risk factors. We therefore evaluated the performance of machine learning (ML) algorithms and compared them with logistic regression for predicting the risk of renal function decline (RFD) using routine clinical data.

Patients and methods: This retrospective cohort study included data from 2166 subjects aged 35-74 years, provided by an adult health screening follow-up program between 2010 and 2020. Seven ML models were considered (random forest, gradient boosting, multilayer perceptron, support vector machine, K-nearest neighbors, adaptive boosting, and decision tree) and compared with standard logistic regression. There were 24 independent variables, and the baseline estimated glomerular filtration rate (eGFR) was used as the predictive variable.

Results: A total of 2166 participants (mean age 49.2±11.2 years, 63.3% male) were enrolled and randomly divided into a training set (n=1732) and a test set (n=434). The areas under the receiver operating characteristic curve (AUROC) for detecting RFD with the different models were above 0.85 during the training phase. The gradient boosting algorithm exhibited the best average prediction accuracy (AUROC: 0.914) among all algorithms validated in this study. Based on AUROC, the ML algorithms improved RFD prediction performance compared with the logistic regression model (AUROC: 0.882), except for the K-nearest neighbors and decision tree algorithms (AUROC: 0.854 and 0.824, respectively). However, the improvements over logistic regression were small (less than 4%) and nonsignificant.
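As an illustration of the evaluation described in the Results, the following is a minimal sketch and not the authors' code. It assumes a simple shuffled hold-out split and computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The helper names `split_indices` and `auroc`, the seed, and the toy labels and scores are hypothetical. The only study values used are the sample sizes and the AUROC figures quoted in the abstract.

```python
import random


def split_indices(n_total, n_test, seed=0):
    """Shuffle indices 0..n_total-1 and hold out n_test of them for testing."""
    rng = random.Random(seed)  # seed is an arbitrary choice for reproducibility
    indices = list(range(n_total))
    rng.shuffle(indices)
    return indices[n_test:], indices[:n_test]


def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Sample sizes from the abstract: 2166 participants, 434 held out for testing.
train_idx, test_idx = split_indices(2166, 434)

# Toy labels and scores (hypothetical, not study data) to exercise auroc().
toy_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# AUROC values as reported in the abstract.
reported = {
    "gradient boosting": 0.914,
    "logistic regression": 0.882,
    "K-nearest neighbors": 0.854,
    "decision tree": 0.824,
}
baseline = reported["logistic regression"]
absolute_gain = reported["gradient boosting"] - baseline  # difference in AUROC
relative_gain = absolute_gain / baseline                  # gain relative to baseline

print(len(train_idx), len(test_idx))
print(f"absolute gain: {absolute_gain:.3f}, relative gain: {relative_gain:.1%}")
```

The abstract does not say whether "less than 4%" refers to an absolute or a relative difference. For the best-performing model, both readings give a value below 4%.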

Conclusion: Our results indicate that the proposed health screening dataset-based RFD prediction model using ML algorithms is readily applicable and produces validated results. However, logistic regression performs as well as the ML models in predicting the risk of RFD with simple clinical predictors.

Keywords: algorithm; chronic kidney disease; deep learning; health examination.

Conflict of interest statement

The authors report no conflicts of interest in this work.

Figures

Figure 1. Study process and architecture of the RFD prediction model.
Figure 2. Receiver operating characteristic curves of the prediction performance of each algorithm in the training set (A) and testing set (B).
