IEEE J Transl Eng Health Med. 2021 Apr 15;9:4900511.

doi: 10.1109/JTEHM.2021.3073629. eCollection 2021.

Clinically Applicable Machine Learning Approaches to Identify Attributes of Chronic Kidney Disease (CKD) for Use in Low-Cost Diagnostic Screening

Md Rashed-Al-Mahfuz¹, Abedul Haque², Akm Azad³, Salem A Alyami⁴, Julian M W Quinn⁵, Mohammad Ali Moni⁶

Affiliations

¹ Department of Computer Science and Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh.
² Department of Hematopathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
³ iThree Institute, University of Technology Sydney, NSW 2007, Australia.
⁴ Department of Mathematics and Statistics, Imam Muhammad Ibn Saud Islamic University, Riyadh 13318, Saudi Arabia.
⁵ Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia.
⁶ WHO Collaborating Centre of eHealth, School of Public Health and Community Medicine, University of New South Wales, Sydney, NSW 2052, Australia.

PMID: 33948393
PMCID: PMC8075287
DOI: 10.1109/JTEHM.2021.3073629

Md Rashed-Al-Mahfuz et al. IEEE J Transl Eng Health Med. 2021.

Abstract

Objective: Chronic kidney disease (CKD) is a major public health concern worldwide. The high cost of late-stage diagnosis and insufficient testing facilities can contribute to high morbidity and mortality rates in CKD patients, particularly in less developed countries. Early diagnosis supported by affordable computer-aided analysis of vital parameters could therefore not only reduce diagnostic costs but also improve patient management and outcomes.

Methods: In this study, we developed machine learning models using selected key pathological categories to identify clinical test attributes that aid accurate early diagnosis of CKD. Such an approach can save time and cost in diagnostic screening. We also evaluated the performance of several classifiers with k-fold cross-validation on optimized datasets derived from these selected clinical test attributes.

Results: Our results suggest that the optimized datasets containing the important attributes perform well in the diagnosis of CKD with our proposed machine learning models. Furthermore, we evaluated clinical test attributes based on urine and blood tests, along with clinical parameters that have low acquisition costs. The predictive models built on the optimized and pathologically categorized attribute sets yielded high CKD diagnosis accuracy, with the random forest (RF) classifier performing best.

Conclusions: Our machine learning approach yielded effective predictive analytics for CKD screening, which can be developed into a resource that supports improved screening and more timely treatment planning.

Keywords: Attribute selection; chronic kidney disease (CKD); computer-aided diagnosis; explainable AI; machine learning (ML).
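The k-fold cross-validation mentioned in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the two features (loosely labeled hemoglobin and serum creatinine) and their distributions are invented for illustration, and a toy nearest-centroid classifier stands in for the RF, GB, XGB, LR, and SVM models evaluated in the paper. The function names are hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices once and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Toy stand-in classifier: store the per-class mean of each feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared Euclidean)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )

def cross_validate(X, y, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, return per-fold accuracy."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit_centroids([X[i] for i in train_idx],
                              [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores

# Synthetic data only (assumed values, not from the study):
# label 1 = "CKD-like", label 0 = "non-CKD-like".
rng = random.Random(42)
X = [[rng.gauss(10.5, 1.0), rng.gauss(3.0, 0.8)] for _ in range(50)]
X += [[rng.gauss(14.5, 1.0), rng.gauss(0.9, 0.2)] for _ in range(50)]
y = [1] * 50 + [0] * 50

fold_scores = cross_validate(X, y, k=5)
mean_accuracy = sum(fold_scores) / len(fold_scores)
print(fold_scores, mean_accuracy)
```

Each sample is used for testing exactly once, so the mean of the per-fold scores estimates performance on unseen data without setting aside a separate test set.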

Figures

**FIGURE 1.**
Schematic diagram of the overall workflow.

**FIGURE 2.**
Sample frequency distribution of clinical test attributes. (a) Box plot of the numeric attributes, (b) bar plot of the two-class nominal attributes, (c) bar plot of the six-class nominal attributes, and (d) bar plot of the five-class nominal attribute 'sg'.

**FIGURE 3.**
Performance of the classifiers using all the attributes. (a) Various model evaluation metrics for CKD classification. (b) Calculated classification accuracy for RF, GB, XGB, LR, and SVM classifiers.

**FIGURE 4.**
SHAP plots. (a) SHAP plot of the top 20 attributes. The SHAP values were calculated for the random forest (RF) model trained on a single-fold training dataset; each dot corresponds to an instance from the training data. Each value is color coded: dark colors represent lower attribute values and light colors represent higher attribute values. (b) Bar plot of the normalized mean absolute SHAP value across all folds for the RF, GB, and XGB models.

**FIGURE 5.**
Attribute dependence plots for the interaction of hemoglobin with other attributes. The X-axis represents the hemoglobin level and the Y-axis represents the SHAP value of hemoglobin in the RF model. In the color bars, copper represents higher attribute values and dark colors represent lower attribute values. (a-l) Interaction effects with specific gravity, serum creatinine, albumin, packed cell volume, red blood cell count, hypertension, blood glucose random, diabetes mellitus, age, sodium, blood urea, and blood pressure, respectively.

**FIGURE 6.**
CKD detection accuracy plots for the six versions of the dataset, namely 'DB-I', 'DB-II', 'DB-III', 'DB-IV', 'DB-V', and 'DB-VI', with various machine learning models. The performance metrics of the individual datasets are color coded in the plot.

**FIGURE 7.**
A bar chart of the CKD classification accuracy of the RF model for dataset evaluation. There are six datasets containing various test attributes, and the performance scale ranges from 0 to 100%. The model trained with DB I outperformed the models trained with the other datasets in terms of specificity, precision, F-score, AUC, and accuracy; the model trained with DB V showed only marginally better sensitivity than that trained with DB I.
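The threshold-based metrics named in the Figure 7 caption (sensitivity, specificity, precision, F-score, and accuracy) all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not results from the paper, and the function name is an assumption. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: CKD cases predicted as CKD / missed.
    tn/fp: non-CKD cases predicted as non-CKD / falsely flagged.
    """
    sensitivity = tp / (tp + fn)   # recall: share of true CKD cases detected
    specificity = tn / (tn + fp)   # share of non-CKD cases correctly cleared
    precision = tp / (tp + fp)     # share of CKD predictions that are correct
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "accuracy": accuracy,
    }

# Hypothetical counts for one held-out fold (illustration only).
m = screening_metrics(tp=45, fp=2, tn=28, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

In a screening setting, sensitivity is often weighted most heavily because a missed CKD case is costlier than a false alarm, which is why a small sensitivity difference between datasets can matter even when the other metrics favor a different one.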
