Review
Clin J Am Soc Nephrol. 2017 Jun 7;12(6):1010-1017.
doi: 10.2215/CJN.06210616. Epub 2016 Sep 22.

Statistical Methods for Cohort Studies of CKD: Prediction Modeling

Jason Roy et al. Clin J Am Soc Nephrol.

Abstract

Prediction models are often developed in and applied to CKD populations. These models can be used to inform patients and clinicians about the potential risks of disease development or progression. With increasing availability of large datasets from CKD cohorts, there is opportunity to develop better prediction models that will lead to more informed treatment decisions. It is important that prediction modeling be done using appropriate statistical methods to achieve the highest accuracy, while avoiding overfitting and poor calibration. In this paper, we review prediction modeling methods in general from model building to assessing model performance as well as the application to new patient populations. Throughout, the methods are illustrated using data from the Chronic Renal Insufficiency Cohort Study.

Keywords: C-statistic; Calibration; Cohort Studies; Disease Progression; Humans; ROC curve; Renal Insufficiency, Chronic; Risk; Sensitivity; Specificity.

Figures

Figure 1.
Plot of risk score against CKD progression within 2 years in the Chronic Renal Insufficiency Cohort (CRIC) Study data. The plots are for three different models, with increasing numbers of established predictors included from left to right. The risk score is log odds (logit) of the predicted probability. The outcome, progressive CKD (1= yes, 0= no), is jittered in the plots to make the points easier to see. ACE/ARB, angiotensin-converting enzyme/angiotensin II receptor blocker; BMI, body mass index; CVD, cardiovascular disease; diab., diabetes; educ., education; SBP, systolic BP.
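Because the risk score is defined as the log odds of the predicted probability, it can help to convert between the two scales. The following is a minimal sketch; the function names `logit` and `inv_logit` are illustrative and not taken from the paper.

```python
import math


def logit(p: float) -> float:
    """Log odds of a probability p, valid for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(score: float) -> float:
    """Map a log-odds risk score back to a predicted probability."""
    return 1.0 / (1.0 + math.exp(-score))


# The two risk score cutoffs used in Figure 2, expressed as probabilities.
p_at_minus4 = inv_logit(-4.0)  # roughly 0.018
p_at_zero = inv_logit(0.0)     # exactly 0.5
```

On this scale, a risk score of −4 corresponds to a predicted probability of about 1.8%, and a score of zero corresponds to 50%.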
Figure 2.
Plots of progressive CKD within 2 years by risk score from the model with all established predictors. The plot in the left panel is based on a risk score cutoff of −4, and the plot in the right panel uses a cutoff of zero. In each panel, the vertical line separates low-risk patients (to its left) from high-risk patients (to its right), and the horizontal line separates the groups with and without progressive CKD (1= yes, 0= no). By counting the number of subjects in each of the quadrants, we obtain a standard 2×2 table. For example, using the cutoff of −4, there were 12 patients who were classified as low risk and ended up with CKD progression within 2 years. The sensitivity of this classification approach can be estimated by taking the number of true positives (294) and dividing by the total number of patients with CKD progression (294+12=306). Thus, sensitivity is 96%. Similar calculations can be used to find the specificity of 50%. Given the same prediction model, a different classification cutpoint could be chosen. In the plot in the right panel, in which a risk score of zero is chosen as the cutpoint, sensitivity decreases to 43%, whereas specificity increases to 98%.
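The 2×2 table and the sensitivity arithmetic described in this caption can be sketched as follows. Only the counts 294 and 12 come from the caption; the toy scores and outcomes are made up for illustration, and treating a score at or above the cutoff as high risk is an assumption.

```python
from typing import Sequence, Tuple


def two_by_two(scores: Sequence[float], outcomes: Sequence[int],
               cutoff: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when scores >= cutoff are called high risk."""
    tp = fp = fn = tn = 0
    for score, outcome in zip(scores, outcomes):
        high_risk = score >= cutoff  # assumption: ties count as high risk
        if high_risk and outcome == 1:
            tp += 1
        elif high_risk and outcome == 0:
            fp += 1
        elif outcome == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp: int, fn: int) -> float:
    """Share of patients with the event who were classified as high risk."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of patients without the event who were classified as low risk."""
    return tn / (tn + fp)


# Counts stated in the caption for the cutoff of -4: 294 / (294 + 12).
sens_caption = sensitivity(294, 12)

# Hypothetical toy data showing how raising the cutoff trades sensitivity
# for specificity.
toy_scores = [-5.2, -4.5, -3.0, -1.0, 0.5, 1.2]
toy_outcomes = [0, 0, 0, 1, 0, 1]
tp_a, fp_a, fn_a, tn_a = two_by_two(toy_scores, toy_outcomes, cutoff=-4.0)
tp_b, fp_b, fn_b, tn_b = two_by_two(toy_scores, toy_outcomes, cutoff=0.0)
```

With the toy data, moving the cutoff from −4 to zero lowers sensitivity from 1.0 to 0.5 and raises specificity from 0.5 to 0.75, the same direction of change the caption reports for the CRIC data.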
Figure 3.
Receiver operating characteristic (ROC) curves for three different prediction models. A is the ROC curve when urine neutrophil gelatinase–associated lipocalin (NGAL) is the only predictor variable included in the model. The c statistic is 0.8. B and C each include two ROC curves. In B, the blue curve (area under the curve [AUC] =0.69) is from the model that includes only demographic variables. The red curve (AUC=0.82) additionally includes urine NGAL. In C, the blue curve is for the model that includes all established predictors. The red curve includes all established predictors plus urine NGAL. The c statistics in C are both 0.9. These plots show that urine NGAL is a relatively good predictor variable alone and provides incremental improvement over the demographics-only model but does not help to improve prediction beyond established predictors.
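For a binary outcome, the c statistic equals the area under the ROC curve and can be read as the proportion of event/nonevent pairs in which the patient with the event has the higher risk score. The sketch below computes it by direct pair counting, which is simple but quadratic in sample size; it is not necessarily how the authors computed their values, and the toy data are hypothetical.

```python
from typing import Sequence


def c_statistic(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Proportion of concordant event/nonevent pairs; ties count as half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    nonevents = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one nonevent")
    concordant = 0.0
    for event_score in events:
        for nonevent_score in nonevents:
            if event_score > nonevent_score:
                concordant += 1.0
            elif event_score == nonevent_score:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


# Hypothetical toy data: 2 events and 4 nonevents give 8 pairs,
# of which 7 are concordant.
toy_scores = [-5.2, -4.5, -3.0, -1.0, 0.5, 1.2]
toy_outcomes = [0, 0, 0, 1, 0, 1]
toy_c = c_statistic(toy_scores, toy_outcomes)
```

A value of 0.5 corresponds to a model that ranks patients no better than chance, and 1.0 to perfect separation of events from nonevents.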
Figure 4.
Calibration plot for the model that includes established predictors and urine neutrophil gelatinase–associated lipocalin. The straight line is where a perfectly calibrated model would fall. The curve is the estimated relationship between the predicted and observed values. The shaded region is ±2 SEMs.
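The figure uses a smoothed curve of observed against predicted values. A simpler, commonly used alternative is to group patients by predicted probability and compare the mean prediction with the observed event rate in each group. The sketch below shows that grouped version, not the smoothing used in the figure; the function name, the number of groups, and the toy data are all assumptions.

```python
from typing import List, Sequence, Tuple


def calibration_table(predicted: Sequence[float], observed: Sequence[int],
                      n_bins: int = 10) -> List[Tuple[float, float, int]]:
    """Return (mean predicted, observed rate, group size) per risk group.

    Patients are sorted by predicted probability and split into n_bins
    groups of nearly equal size.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        group = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            continue  # more bins than patients
        mean_pred = sum(p for p, _ in group) / len(group)
        obs_rate = sum(y for _, y in group) / len(group)
        rows.append((mean_pred, obs_rate, len(group)))
    return rows


# Hypothetical toy data split into two risk groups.
toy_pred = [0.1, 0.1, 0.2, 0.2, 0.8, 0.8, 0.9, 0.9]
toy_obs = [0, 0, 0, 1, 1, 1, 1, 0]
toy_rows = calibration_table(toy_pred, toy_obs, n_bins=2)
```

For a well-calibrated model, the mean predicted probability and the observed rate in each row are close, which corresponds to points lying near the straight line in the plot.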
Figure 5.
Classification of patients in the event and nonevent groups in models with and without urine neutrophil gelatinase–associated lipocalin (NGAL). The counts in red are from patients who were reclassified in the wrong direction. The left panel shows the reclassification that occurs when urine NGAL is added to a model that includes only demographics. For events, 3+8+37=48 were reclassified in the right direction, whereas 16+16+5=37 were reclassified in the wrong direction. The net reclassification improvement (NRI) for events was (48/306)−(37/306)=3.6%. The NRI for nonevents was (1153/2727)−(243/2727)=33%. Thus, there was a large improvement in risk prediction for nonevents when going from the demographics-only model to the demographics and NGAL model. In the right panel, urine NGAL is added to a model that includes all established predictors. Overall, there was little reclassification for either events or nonevents. The NRI for events is (6/306)−(8/306)=−0.65%. The NRI for nonevents is (94/2727)−(79/2727)=0.55%.
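The NRI arithmetic in this caption can be reproduced with a short sketch. All counts are taken from the caption; the helper name `nri_component` is illustrative.

```python
def nri_component(moved_right: int, moved_wrong: int, total: int) -> float:
    """Net share of a group reclassified in the right direction."""
    return moved_right / total - moved_wrong / total


# Left panel: urine NGAL added to the demographics-only model.
events_demo = nri_component(3 + 8 + 37, 16 + 16 + 5, 306)  # about 0.036
nonevents_demo = nri_component(1153, 243, 2727)            # about 0.33

# Right panel: urine NGAL added to the model with all established predictors.
events_full = nri_component(6, 8, 306)                     # about -0.0065
nonevents_full = nri_component(94, 79, 2727)               # about 0.0055
```

A negative value, as for events in the right panel, means that more patients moved in the wrong direction than in the right one.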
