Validation and updating of predictive logistic regression models: a study on sample size and shrinkage
- PMID: 15287085
- DOI: 10.1002/sim.1844
Abstract
A logistic regression model may be used to predict outcomes for individual patients at a centre other than the one where the model was developed. When empirical data are available from this centre, the validity of predictions can be assessed by comparing observed outcomes with predicted probabilities. Subsequently, the model may be updated to improve predictions for future patients. As an example, we analysed 30-day mortality after acute myocardial infarction in a large data set (GUSTO-I, n = 40,830). We validated and updated a previously published model from another study (TIMI-II, n = 3,339) in validation samples ranging from small (200 patients, 14 deaths) to large (10,000 patients, 700 deaths). Updated models were tested on independent patients. Updating methods included re-calibration (re-estimation of the intercept or slope of the linear predictor) and more structural model revisions (re-estimation of some or all regression coefficients, model extension with more predictors). We applied heuristic shrinkage approaches in the model revision methods, such that regression coefficients were shrunken towards their re-calibrated values. Parsimonious updating methods were found preferable to more extensive model revisions, which should only be attempted with relatively large validation samples in combination with shrinkage.
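The two re-calibration methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are simulated, the function names (`update_intercept`, `recalibrate`) are hypothetical, and the fitting routine is a plain Newton-Raphson solver written for illustration. The input `lp` stands for the linear predictor of the previously published model evaluated in the new centre, and `y` for the observed binary outcomes there.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def update_intercept(lp, y, iters=50, tol=1e-10):
    """Re-estimate only the intercept: logit(p) = a + lp (slope fixed at 1)."""
    a = 0.0
    for _ in range(iters):
        p = [sigmoid(a + v) for v in lp]
        grad = sum(yi - pi for yi, pi in zip(y, p))   # score for a
        hess = sum(pi * (1.0 - pi) for pi in p)       # observed information
        step = grad / hess
        a += step
        if abs(step) < tol:
            break
    return a

def recalibrate(lp, y, iters=50, tol=1e-10):
    """Re-estimate intercept and slope: logit(p) = a + b * lp."""
    a, b = 0.0, 1.0
    for _ in range(iters):
        p = [sigmoid(a + b * v) for v in lp]
        w = [pi * (1.0 - pi) for pi in p]
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * v for yi, pi, v in zip(y, p, lp))
        h00 = sum(w)
        h01 = sum(wi * v for wi, v in zip(w, lp))
        h11 = sum(wi * v * v for wi, v in zip(w, lp))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g explicitly.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if max(abs(da), abs(db)) < tol:
            break
    return a, b

# Simulated validation sample (assumed values, not GUSTO-I or TIMI-II data):
# the original model is miscalibrated, with true logit = 0.5 + 0.8 * lp.
random.seed(0)
lp = [random.gauss(-3.0, 1.0) for _ in range(5000)]
y = [1 if random.random() < sigmoid(0.5 + 0.8 * v) else 0 for v in lp]

a_only = update_intercept(lp, y)
a_new, b_new = recalibrate(lp, y)
print("intercept update:", round(a_only, 3))
print("intercept and slope:", round(a_new, 3), round(b_new, 3))
```

After the intercept update, the mean predicted probability matches the observed event rate in the validation sample. A re-estimated slope below 1 indicates that the original predictions were too extreme for the new setting. Both methods estimate at most two parameters, which is why they remain usable in small validation samples.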