J Am Med Inform Assoc. 2023 May 19;30(6):1114-1124.
doi: 10.1093/jamia/ocad051.

Machine-learning enhancement of urine dipstick tests for chronic kidney disease detection

Eun Chan Jang et al. J Am Med Inform Assoc.

Abstract

Objective: Screening for chronic kidney disease (CKD) requires an estimated glomerular filtration rate (eGFR, mL/min/1.73 m²) from a blood sample and a proteinuria level from a urinalysis. We developed machine-learning models to detect CKD without blood collection, predicting an eGFR less than 60 (eGFR60 model) or 45 (eGFR45 model) using a urine dipstick test.
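
The two prediction targets named above can be illustrated with a minimal Python sketch. It assumes a measured eGFR value is available for labeling training data; the names `CkdLabels` and `label_from_egfr` are hypothetical and do not come from the paper.

```python
from typing import NamedTuple


class CkdLabels(NamedTuple):
    """Binary targets for the two models (hypothetical container)."""
    egfr_below_60: bool  # target of the eGFR60 model
    egfr_below_45: bool  # target of the eGFR45 model


def label_from_egfr(egfr: float) -> CkdLabels:
    """Turn a measured eGFR (mL/min/1.73 m^2) into the two binary targets."""
    if egfr < 0:
        raise ValueError("eGFR cannot be negative")
    # Both cutoffs are strict ("less than"), as stated in the abstract.
    return CkdLabels(egfr_below_60=egfr < 60.0, egfr_below_45=egfr < 45.0)


if __name__ == "__main__":
    for value in (75.0, 52.0, 38.0):
        print(value, label_from_egfr(value))
```

Because the cutoffs are nested, every case that is positive for the eGFR45 target is also positive for the eGFR60 target.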

Materials and methods: Electronic health record data (n = 220 018) from university hospitals were used to construct XGBoost models. The candidate model variables were age, sex, and 10 measurements from the urine dipstick test. The models were validated using health checkup center data (n = 74 380) and nationwide public data (KNHANES data, n = 62 945) for the general population in Korea.

Results: The final models comprised 7 features: age, sex, and 5 urine dipstick measurements (protein, blood, glucose, pH, and specific gravity). The internal and external areas under the curve (AUCs) of the eGFR60 model were 0.90 or higher, and the eGFR45 model achieved higher AUCs. For the eGFR60 model on KNHANES data, among individuals younger than 65 years with proteinuria, the sensitivity was 0.93 and the specificity 0.86 without diabetes, and 0.80 and 0.85, respectively, with diabetes. Nonproteinuric CKD could be detected in nondiabetic individuals younger than 65 years with a sensitivity of 0.88 and a specificity of 0.71.
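
For readers unfamiliar with how such sensitivity and specificity figures are derived, the following Python sketch computes both at a given score threshold and selects the threshold that maximizes the Youden index (J = sensitivity + specificity - 1), the criterion the Figure 2 caption names for KNHANES data. The optional per-sample weights stand in for survey weights. This is an illustrative sketch, not the authors' code, and the function names are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int],
    scores: Sequence[float],
    threshold: float,
    weights: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Weighted sensitivity and specificity when score >= threshold means positive."""
    if weights is None:
        weights = [1.0] * len(y_true)
    tp = fn = tn = fp = 0.0
    for y, s, w in zip(y_true, scores, weights):
        predicted_positive = s >= threshold
        if y:
            if predicted_positive:
                tp += w
            else:
                fn += w
        else:
            if predicted_positive:
                fp += w
            else:
                tn += w
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(
    y_true: Sequence[int],
    scores: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) maximizing the Youden index."""
    best = None
    for t in sorted(set(scores)):  # every observed score is a candidate cutoff
        sens, spec = sensitivity_specificity(y_true, scores, t, weights)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Toy labels and scores, invented for illustration only.
    labels = [0, 0, 0, 1, 1]
    model_scores = [0.1, 0.2, 0.6, 0.7, 0.9]
    print(youden_threshold(labels, model_scores))
```

Ties in the Youden index are resolved here in favor of the lowest threshold, which is a design choice of this sketch.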

Discussion and conclusions: The model performance differed across subgroups by age, proteinuria, and diabetes. The CKD progression risk can be assessed with the eGFR models using the levels of eGFR decrease and proteinuria. The machine-learning-enhanced urine dipstick test can become a point-of-care test to promote public health by screening for CKD and ranking its risk of progression.
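
One way to picture risk ranking from an eGFR level and a proteinuria level is the KDIGO risk grid, which crosses eGFR categories with albuminuria categories (A1 to A3). The Python sketch below is an assumption-laden illustration: it takes the two model outputs as binary flags plus an albuminuria category supplied by the caller, and returns the lowest KDIGO risk consistent with them. It is not the scheme of the paper's Figure 6, and how a dipstick protein grade maps to A1 to A3 is deliberately left outside the sketch.

```python
KDIGO_RISK = ["low", "moderate", "high", "very high"]
ALBUMINURIA_STEP = {"A1": 0, "A2": 1, "A3": 2}


def minimum_kdigo_risk(egfr_below_60: bool, egfr_below_45: bool, albuminuria: str) -> str:
    """Lowest KDIGO risk category consistent with the two eGFR flags.

    A positive eGFR45 flag is treated as eGFR category G3b or worse; because
    G4 and G5 are 'very high' for every albuminuria category, the result is
    a lower bound in that case.
    """
    if albuminuria not in ALBUMINURIA_STEP:
        raise ValueError("albuminuria must be 'A1', 'A2', or 'A3'")
    if egfr_below_45:
        gfr_step = 2  # G3b (30-44) at best
    elif egfr_below_60:
        gfr_step = 1  # G3a (45-59)
    else:
        gfr_step = 0  # G1 or G2 (60 or higher)
    # On the KDIGO grid, risk rises one level per step in either direction
    # and saturates at 'very high'.
    return KDIGO_RISK[min(gfr_step + ALBUMINURIA_STEP[albuminuria], 3)]


if __name__ == "__main__":
    print(minimum_kdigo_risk(True, False, "A2"))
```

The eGFR45 flag takes precedence over the eGFR60 flag here, so inconsistent model outputs (below 45 but not below 60) are resolved toward the higher risk.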

Keywords: XGBoost; chronic kidney disease; estimated glomerular filtration rate; machine-learning model; urinalysis.

Conflict of interest statement

None declared.

Figures

Figure 1.
Study flowchart for predicting a decrease in estimated glomerular filtration rate (eGFR) from urine dipstick test. Bold numbers in the boxes are case numbers of data. XGBoost: eXtreme Gradient Boosting; TPOT: Tree-based Pipeline Optimization Tool; ML: machine learning; GPU: graphics processing unit; eGFR60 model: model to predict eGFR < 60 mL/min/1.73 m²; eGFR45 model: model to predict eGFR < 45 mL/min/1.73 m²; AUC: area under the curve; SHAP: SHapley Additive exPlanations.
Figure 2.
Receiver operating characteristic (ROC) curves for the detection of estimated glomerular filtration rate (eGFR) < 60 mL/min/1.73 m² (A, eGFR60 model) and eGFR < 45 mL/min/1.73 m² (B, eGFR45 model). The data used for internal validation, external validation 1, and external validation 2 were obtained from university hospitals (CHA and SHDC), a health checkup center (SCHPC), and the general population (KNHANES), respectively. For external validation 2, a weighted ROC curve was drawn to represent the entire population. The threshold, sensitivity, and specificity were calculated for KNHANES data using the Youden index.
Figure 3.
Summary plots of SHAP values (probability of a decline in estimated glomerular filtration rate, eGFR) for eGFR < 60 mL/min/1.73 m² (A, eGFR60 model) and eGFR < 45 mL/min/1.73 m² (B, eGFR45 model) prediction. KNHANES data were used as reference and foreground datasets. The distribution of each feature's impact on the model output is plotted using SHAP values. Model features are sorted along the y-axis of the summary plot by the sum of SHAP value magnitudes over all samples.
Figure 4.
Changes in weighted receiver operating characteristic (ROC) curves for detecting a decline in estimated glomerular filtration rate (eGFR, mL/min/1.73 m²) according to model, age (years), hypertension, and diabetes. The area under the curve (AUC) is shown with its 95% confidence interval in parentheses. eGFR60 model: eGFR < 60 detection model; eGFR45 model: eGFR < 45 detection model.
Figure 5.
Changes in weighted receiver operating characteristic (ROC) curves for detecting a decline in estimated glomerular filtration rate (eGFR, mL/min/1.73 m²) in individuals younger than 65 years according to model, urine protein, and diabetes. The area under the curve (AUC) is shown with its 95% confidence interval in parentheses. eGFR60 model: eGFR < 60 detection model; eGFR45 model: eGFR < 45 detection model.
Figure 6.
Proposed use of machine-learning-enhanced urinalysis for kidney disease detection. CKD1, CKD2, and CKD3 are moderate, high, and very high risks of chronic kidney disease (CKD), respectively, according to Kidney Disease: Improving Global Outcomes guidelines. eGFR60 model: estimated glomerular filtration rate (eGFR) < 60 mL/min/1.73 m² detection model; eGFR45 model: eGFR < 45 mL/min/1.73 m² detection model.
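
The feature ordering described in the Figure 3 caption (features sorted by the sum of SHAP value magnitudes over all samples) can be sketched in a few lines of Python. The sketch assumes the SHAP values have already been computed elsewhere and are available as one row per sample; the function name and the toy numbers are hypothetical and are not results from the paper.

```python
from typing import List, Sequence, Tuple


def rank_features_by_shap(
    feature_names: Sequence[str],
    shap_values: Sequence[Sequence[float]],
) -> List[Tuple[str, float]]:
    """Sort features by the sum of absolute SHAP values over all samples."""
    totals = [0.0] * len(feature_names)
    for row in shap_values:
        if len(row) != len(feature_names):
            raise ValueError("each row needs one SHAP value per feature")
        for j, value in enumerate(row):
            totals[j] += abs(value)  # magnitude only; the sign is ignored
    # Largest total first, matching a top-to-bottom summary plot ordering.
    return sorted(zip(feature_names, totals), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy SHAP values for two samples, invented for illustration only.
    names = ["age", "protein", "pH"]
    values = [[0.5, -0.25, 0.0], [-0.5, 0.5, 0.125]]
    for name, total in rank_features_by_shap(names, values):
        print(name, total)
```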

