IEEE J Transl Eng Health Med. 2021 Apr 15:9:4900511.
doi: 10.1109/JTEHM.2021.3073629. eCollection 2021.

Clinically Applicable Machine Learning Approaches to Identify Attributes of Chronic Kidney Disease (CKD) for Use in Low-Cost Diagnostic Screening

Md Rashed-Al-Mahfuz et al. IEEE J Transl Eng Health Med.

Abstract

Objective: Chronic kidney disease (CKD) is a major public health concern worldwide. The high cost of late-stage diagnosis and insufficient testing facilities can contribute to high morbidity and mortality rates in CKD patients, particularly in less developed countries. Early diagnosis, aided by analysis of vital parameters with affordable computer-aided diagnosis, could therefore not only reduce diagnosis costs but also improve patient management and outcomes.

Methods: In this study, we developed machine learning models using selected key pathological categories to identify clinical test attributes that will aid in accurate early diagnosis of CKD. Such an approach will save time and costs for diagnostic screening. We also evaluated the performance of several classifiers with k-fold cross-validation on optimized datasets derived using these selected clinical test attributes.

Results: Our results suggest that the optimized datasets containing the important attributes perform well in the diagnosis of CKD with our proposed machine learning models. Furthermore, we evaluated clinical test attributes based on urine and blood tests, along with clinical parameters that have low costs of acquisition. The predictive models built on the optimized and pathologically categorized attribute sets yielded high CKD diagnosis accuracy, with the random forest (RF) classifier performing best.

Conclusions: Our machine learning approach has yielded effective predictive analytics for CKD screening, which can be developed into a resource that facilitates improved CKD screening for enhanced and timely treatment plans.
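The k-fold cross-validation mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the nearest-centroid classifier is a simple stand-in for the RF, GB, XGB, LR, and SVM models, the function names are hypothetical, and the two-attribute samples are made-up toy values rather than data from the study.

```python
import random
from statistics import mean

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_centroids(rows, labels):
    """Stand-in classifier (assumption): per-class mean of each attribute."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Return the class whose centroid has the smallest squared distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
    )

def cross_validate(rows, labels, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and repeat k times."""
    scores = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        model = fit_centroids([rows[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, rows[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores

# Toy data (assumption): two well-separated groups of (attribute_1, attribute_2).
rng = random.Random(42)
rows = [[14 + rng.uniform(-1, 1), 1.0 + rng.uniform(-0.5, 0.5)] for _ in range(20)]
rows += [[9 + rng.uniform(-1, 1), 4.0 + rng.uniform(-0.5, 0.5)] for _ in range(20)]
labels = [0] * 20 + [1] * 20  # 0 = not CKD, 1 = CKD

scores = cross_validate(rows, labels, k=5)
print("per-fold accuracy:", scores)
print("mean accuracy:", mean(scores))
```

Every sample is held out exactly once, so the mean of the per-fold scores uses all of the data for testing without ever testing a model on a sample it was trained on.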

Keywords: Attribute selection; chronic kidney disease (CKD); computer-aided diagnosis; explainable AI; machine learning (ML).

Figures

FIGURE 1.
Schematic diagram of the overall workflow.
FIGURE 2.
Sample frequency distribution of clinical test attributes. (a) Sample frequency distribution (box plot) of numeric attributes, (b) sample frequency distribution (bar plot) of two-class nominal attributes, (c) sample frequency distribution (bar plot) of six-class nominal attributes, and (d) sample frequency distribution (bar plot) of five-class ‘sg’ nominal attributes.
FIGURE 3.
Performance of the classifiers using all the attributes. (a) Various model evaluation metrics for CKD classification. (b) Calculated classification accuracy for RF, GB, XGB, LR, and SVM classifiers.
FIGURE 4.
SHAP plots. (a) SHAP plot of the top 20 attributes. The SHAP values were calculated for the random forest (RF) model trained on a single-fold training dataset; each dot corresponds to an instance from the training data. Values are color coded: dark colors represent lower attribute values and light colors represent higher attribute values. (b) Bar plot of the normalized mean absolute SHAP value across all folds for the RF, GB, and XGB models.
FIGURE 5.
Attribute dependence plots for the interaction of hemoglobin with other attributes. The x-axis represents the hemoglobin level and the y-axis represents the SHAP value of hemoglobin in the RF model. In the color bars, copper represents higher attribute values and dark colors represent lower values. (a-l) Interaction effects with specific gravity, serum creatinine, albumin, packed cell volume, red blood cell count, hypertension, blood glucose random, diabetes mellitus, age, sodium, blood urea, and blood pressure, respectively.
FIGURE 6.
CKD detection accuracy plots for the six versions of the dataset (‘DB-I’, ‘DB-II’, ‘DB-III’, ‘DB-IV’, ‘DB-V’, and ‘DB-VI’) with various machine learning models. The performance metrics for each dataset are color coded in the plot.
FIGURE 7.
Bar chart of the CKD classification performance of the RF model for dataset evaluation. There are six datasets comprising various test attributes, and performance ranges from 0 to 100%. The model trained with DB I outperformed the models trained with the other datasets in specificity, precision, F-score, AUC, and accuracy; however, the model trained with DB V showed marginally better sensitivity than the model trained with DB I.
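The metrics named in the Figure 3 and Figure 7 captions (sensitivity, specificity, precision, F-score, accuracy) can be computed from confusion-matrix counts. The sketch below uses the standard textbook definitions; how the authors computed them is not stated here, the function name `screening_metrics` is hypothetical, and the labels are toy values rather than study results. AUC is omitted because it requires ranked scores rather than hard predictions.

```python
def screening_metrics(y_true, y_pred, positive=1):
    """Compute binary classification metrics from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Report 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall on the CKD class
    specificity = ratio(tn, tn + fp)  # recall on the non-CKD class
    precision = ratio(tp, tp + fp)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = ratio(tp + tn, len(pairs))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "accuracy": accuracy,
    }

# Toy labels (assumption): 1 = CKD, 0 = not CKD.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = screening_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For screening, sensitivity and specificity are worth reading separately: a missed CKD case (false negative) and an unnecessary follow-up (false positive) carry different costs, which a single accuracy figure hides.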
