Poor performance of clinical prediction models: the harm of commonly applied methods
- PMID: 29174118
- DOI: 10.1016/j.jclinepi.2017.11.013
Abstract
Objective: To evaluate limitations of common statistical modeling approaches in deriving clinical prediction models and explore alternative strategies.
Study design and setting: A previously published model predicted the likelihood of having a mutation in germline DNA mismatch repair genes at the time of diagnosis of colorectal cancer. This model was based on a cohort where 38 mutations were found among 870 participants, with validation in an independent cohort with 35 mutations. The modeling strategy included stepwise selection of predictors from a pool of over 37 candidate predictors and dichotomization of continuous predictors. We simulated this strategy in small subsets of a large contemporary cohort (2,051 mutations among 19,866 participants) and made comparisons to other modeling approaches. All models were evaluated according to bias and discriminative ability (concordance index, c) in independent data.
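As an illustration of the discrimination measure named above, the sketch below computes a concordance index for a binary outcome such as mutation carrier versus non-carrier. It is a minimal sketch and not the authors' code: the function name `concordance_index` and the toy risks and outcomes are assumptions made for illustration. For a binary outcome, c is the proportion of (event, non-event) pairs in which the participant with the event received the higher predicted risk, with ties counted as one half.

```python
def concordance_index(risks, outcomes):
    """Concordance index (c) for a binary outcome.

    risks    : predicted probabilities, one per participant
    outcomes : 1 if the event (e.g. a mutation) is present, else 0
    """
    event_risks = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevent_risks = [r for r, y in zip(risks, outcomes) if y == 0]
    if not event_risks or not nonevent_risks:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in event_risks:
        for n in nonevent_risks:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(event_risks) * len(nonevent_risks))


# Hypothetical toy data: four participants, two with the event.
toy_risks = [0.9, 0.8, 0.3, 0.1]
toy_outcomes = [1, 0, 1, 0]
toy_c = concordance_index(toy_risks, toy_outcomes)
print(toy_c)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. With few events the number of (event, non-event) pairs is small, so an estimate of c from a small validation sample can vary considerably between samples.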
Results: We found over 50% bias for five of six originally selected predictors, unstable model specification, and poor performance at validation (median c = 0.74). A small validation sample hampered stable assessment of performance. Model prespecification based on external knowledge and using continuous predictors led to better performance (c = 0.836 and c = 0.852 with 38 and 2,051 events, respectively).
Conclusion: Prediction models perform poorly if based on small numbers of events and developed with common but suboptimal statistical approaches. Alternative modeling strategies to best exploit available predictive information need wider implementation, with collaborative research to increase sample sizes.
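To make "small numbers of events" concrete, the arithmetic below works out events per variable (EPV) from the counts reported in the abstract. It is a rough sketch under stated assumptions: the helper `events_per_variable` is a hypothetical name, and because the abstract gives the candidate pool only as "over 37", using 37 yields an upper bound on the true EPV.

```python
def events_per_variable(n_events, n_candidate_predictors):
    """Events per candidate predictor considered during model development."""
    if n_candidate_predictors <= 0:
        raise ValueError("number of candidate predictors must be positive")
    return n_events / n_candidate_predictors


# Counts taken from the abstract; 37 is a lower bound on the pool size,
# so both results are upper bounds on the actual EPV.
epv_original = events_per_variable(38, 37)        # original development cohort
epv_contemporary = events_per_variable(2051, 37)  # large contemporary cohort

print(f"EPV, 38 events:    {epv_original:.2f}")
print(f"EPV, 2,051 events: {epv_contemporary:.2f}")
```

Under these assumptions the original cohort offered roughly one event per candidate predictor, while the contemporary cohort offered more than fifty.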
Keywords: Events per variable; Prediction model; Regression analysis; Sample size; Simulation; Validation.
Copyright © 2017 Elsevier Inc. All rights reserved.