Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review
- PMID: 27189013
- PMCID: PMC5201180
- DOI: 10.1093/jamia/ocw042
Abstract
Objective: Electronic health records (EHRs) are an increasingly common data source for clinical risk prediction, presenting both unique analytic opportunities and challenges. We sought to evaluate the current state of EHR-based risk prediction modeling through a systematic review of clinical prediction studies using EHR data.
Methods: We searched PubMed for articles that reported on the use of an EHR to develop a risk prediction model from 2009 to 2014. Articles were extracted by two reviewers, and we abstracted information on study design, use of EHR data, model building, and performance from each publication and supplementary documentation.
Results: We identified 107 articles from 15 different countries. Studies were generally very large (median sample size = 26 100) and utilized a diverse array of predictors. Most used validation techniques (n = 94 of 107) and reported model coefficients for reproducibility (n = 83). However, studies did not fully leverage the breadth of EHR data: they rarely used longitudinal information (n = 37) and employed relatively few predictor variables (median = 27 variables). Fewer than half of the studies were multicenter (n = 50), and only 26 performed validation across sites. Many studies did not fully address biases of EHR data such as missing data or loss to follow-up. Average c-statistics for different outcomes were: mortality (0.84), clinical prediction (0.83), hospitalization (0.71), and service utilization (0.71).
Conclusions: EHR data present both opportunities and challenges for clinical risk prediction. There is room for improvement in designing such studies.
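Illustrative note on the c-statistic: the abstract reports model performance as c-statistics but does not define the measure. As a minimal sketch, assuming a binary outcome, the c-statistic can be read as the proportion of (event, non-event) patient pairs in which the patient with the event received the higher predicted risk, with ties counted as half. The function name `c_statistic` and the toy data below are hypothetical and are not taken from the reviewed studies.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Concordance (c) statistic for a binary outcome.

    risks:    predicted risks, one per patient
    outcomes: 1 if the patient had the event, 0 otherwise
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    # Compare every patient with the event against every patient without it.
    for case_risk, control_risk in product(cases, controls):
        if case_risk > control_risk:
            concordant += 1.0   # model ranked the pair correctly
        elif case_risk == control_risk:
            concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: six patients, three with the event.
toy_risks = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_outcomes = [1, 1, 1, 0, 0, 0]
toy_c = c_statistic(toy_risks, toy_outcomes)  # 8 of 9 pairs are concordant
```

On this scale, 0.5 corresponds to ranking no better than chance and 1.0 to perfect discrimination.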
Keywords: Electronic Medical Record; Review; Risk Assessment.
© The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.