Leveraging derived data elements in data analytic models for understanding and predicting hospital readmissions
- PMID: 23304278
- PMCID: PMC3540449
Abstract
Hospital readmissions depend on numerous factors. Automated risk calculation using electronic health record (EHR) data could allow care to be targeted to prevent them. However, EHRs are usually incomplete with respect to data relevant to readmission prediction, and the lack of standard data representations in EHRs restricts the generalizability of predictive models. We propose developing such models by first generating derived variables that characterize clinical phenotype. This reduces the number of variables, reduces noise, introduces clinical knowledge into model building, and abstracts away the underlying data representation, thus facilitating the use of standard data mining algorithms. We combined this pre-processing step with a random forest algorithm to compute the risk of readmission within 30 days for patients in ten disease categories. Results were promising for encounters to which our algorithm assigned very high or very low risk. Assigning patients to either of these two risk groups could be of value to patient care teams aiming to prevent readmissions.
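The two ideas in the abstract, deriving representation-independent variables from raw EHR rows and then acting only on very high or very low predicted risk, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Observation` record, the `CONCEPT_MAP` codes, the creatinine cutoff, and the `low`/`high` thresholds are all hypothetical, and the random forest that would produce the probability `p` is omitted.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Observation:
    """One raw EHR row; the field names are hypothetical."""
    patient_id: str
    code: str      # site-specific lab or diagnosis code
    value: float
    when: date


# Hypothetical mapping from local codes to shared clinical concepts.
# This is the step that abstracts away the underlying data representation.
CONCEPT_MAP = {
    "LAB_CREAT_01": "creatinine",
    "CREA": "creatinine",
    "ICD9_428.0": "heart_failure_dx",
}


def derive_features(rows: List[Observation], discharge: date) -> Dict[str, float]:
    """Collapse raw rows into a few derived variables for one encounter."""
    # Creatinine results from the 30 days up to and including discharge.
    creat = [
        r.value
        for r in rows
        if CONCEPT_MAP.get(r.code) == "creatinine"
        and 0 <= (discharge - r.when).days <= 30
    ]
    has_hf = any(CONCEPT_MAP.get(r.code) == "heart_failure_dx" for r in rows)
    return {
        # The 1.5 cutoff is illustrative only, not clinical guidance.
        "high_creatinine": float(bool(creat) and max(creat) > 1.5),
        "heart_failure": float(has_hf),
        "n_creatinine_tests": float(len(creat)),
    }


def risk_group(p: float, low: float = 0.1, high: float = 0.6) -> str:
    """Bucket a predicted readmission probability; thresholds are assumptions."""
    if p <= low:
        return "very_low"
    if p >= high:
        return "very_high"
    return "indeterminate"
```

In this sketch a model trained on the output of `derive_features` would supply `p`, and only encounters labeled `very_low` or `very_high` would be flagged for the care team, mirroring the abstract's observation that results were most promising at the extremes.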