Multicenter Study
J Gen Intern Med. 2018 Jun;33(6):921-928.
doi: 10.1007/s11606-018-4316-y. Epub 2018 Jan 30.

Development and Validation of Machine Learning Models for Prediction of 1-Year Mortality Utilizing Electronic Medical Record Data Available at the End of Hospitalization in Multicondition Patients: a Proof-of-Concept Study

Nishant Sahni et al. J Gen Intern Med. 2018 Jun.

Abstract

Background: Predicting death in a cohort of clinically diverse, multicondition hospitalized patients is difficult. Prognostic models that use electronic medical record (EMR) data to determine 1-year death risk can improve end-of-life planning and risk adjustment for research.

Objective: Determine if the final set of demographic, vital sign, and laboratory data from a hospitalization can be used to accurately quantify 1-year mortality risk.

Design: A retrospective study using electronic medical record data linked with the state death registry.

Participants: A total of 59,848 hospitalized patients within a six-hospital network over a 4-year period.

Main measures: The last set of vital signs, complete blood count, basic and complete metabolic panel, demographic information, and ICD codes. The outcome of interest was death within 1 year.

Key results: Model performance was measured on the validation data set. Random forest (RF) models outperformed logistic regression (LR) models in discriminative ability. An RF model that used the final set of demographic, vital sign, and laboratory data from the final 48 h of hospitalization had an AUC of 0.86 (0.85-0.87) for predicting death within a year. Age, blood urea nitrogen, platelet count, hemoglobin, and creatinine were the most important variables in the RF model. Models that used comorbidity variables alone had the lowest AUC. In groups of patients with a high probability of death, RF models underestimated the probability by less than 10%.
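
The AUC used here as the measure of discriminative ability is the probability that a randomly chosen patient who died within the year received a higher predicted risk than a randomly chosen survivor. A minimal sketch of that computation follows, using the rank-sum identity on invented toy data. The function name `auc` and the toy values are illustrative assumptions, not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 outcomes (1 = event, e.g. death within 1 year).
    scores: predicted risks; higher means more likely to be a 1.
    A tie between a positive and a negative score counts as half a win.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    # Walk through groups of tied scores and give each group its average
    # 1-based rank, accumulating the rank sum of the positive cases.
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


# Toy data (hypothetical): 3 deaths, 3 survivors.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
toy_auc = auc(toy_labels, toy_scores)
print(round(toy_auc, 3))  # 6 of 9 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of the two outcome groups.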

Conclusion: The last set of EMR data from a hospitalization can be used to accurately estimate the risk of 1-year mortality within a cohort of multicondition hospitalized patients.

Keywords: data mining; hospital outcomes; machine learning; predictive models.

Conflict of interest statement

The authors report no conflicts of interest in this work.

Figures

Figure 1
Line chart of model AUCs for predicting 1-year mortality. The AUC of each model on the validation data set is plotted on the vertical axis (y-axis), and the model type is indicated on the horizontal axis (x-axis). The variables incorporated into each model are listed in the color-coded legend on the right-hand side of the figure. The vertical error bars show the 95% confidence intervals around each AUC estimate. Each point represents the AUC of one model.
Figure 2
Feature importance in the random forest models. The features are ranked by importance as measured by the mean decrease in the Gini Index.
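
For intuition on the Gini-based ranking: each split in a decision tree reduces the Gini impurity of the outcome labels, and a feature's importance aggregates those reductions over the splits that use it across the forest. The sketch below computes the reduction for a single split only. The helper names, the age values, and the thresholds are hypothetical and are not the study's data or code.

```python
def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting into values <= threshold and > threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the share of samples in each child.
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_children


# Hypothetical toy data: age in years and 1-year death indicator.
ages = [45, 52, 61, 70, 78, 85]
died = [0, 0, 0, 1, 1, 1]

clean_split = gini_decrease(ages, died, threshold=65)  # separates outcomes fully
weak_split = gini_decrease(ages, died, threshold=50)   # isolates one survivor
print(clean_split, weak_split)
```

A split that separates the outcomes completely removes all of the parent node's impurity, while a weakly informative split removes only a small part of it.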
Figure 3
Calibration plots of the Demographic/Physiologic/Lab RF model. The observed rate of death at 1 year within each one of the ten probability bins is plotted on the y-axis. The predicted probability from the RF model is indicated on the x-axis. The dotted diagonal line represents points along a perfectly calibrated model. Each point on the graph represents one of the ten bins of probability. The number beside each point represents the total number of hospitalizations that fall within the particular bin. The bars delineate the 95% confidence intervals around the observed probability.
Figure 4
Calibration plot of the Demographic/Physiologic/Lab + Metastasis + Tumor RF model. The observed rate of death at 1 year within each one of the ten probability bins is plotted on the y-axis. The predicted probability from the RF model is indicated on the x-axis. The dotted diagonal line represents points along a perfectly calibrated model. Each point on the graph represents one of the ten bins of probability. The number beside each point represents the total number of hospitalizations that fall within the particular bin. The bars delineate the 95% confidence intervals around the observed probability.
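
The binning behind calibration plots such as Figures 3 and 4 can be sketched as follows. This sketch assumes ten equal-width probability bins and a normal-approximation 95% interval around the observed rate. The captions do not state how the bins or intervals were actually constructed, and the toy inputs are invented.

```python
import math


def calibration_bins(probs, outcomes, n_bins=10):
    """Group predictions into equal-width probability bins.

    Returns one dict per non-empty bin with the count, the mean predicted
    probability, the observed event rate, and an approximate 95% interval.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    rows = []
    for idx, members in enumerate(bins):
        if not members:
            continue  # skip empty bins rather than divide by zero
        n = len(members)
        mean_pred = sum(p for p, _ in members) / n
        observed = sum(y for _, y in members) / n
        half_width = 1.96 * math.sqrt(observed * (1.0 - observed) / n)
        rows.append({
            "bin": idx,
            "n": n,
            "mean_predicted": mean_pred,
            "observed": observed,
            "ci_low": max(0.0, observed - half_width),
            "ci_high": min(1.0, observed + half_width),
        })
    return rows


# Hypothetical predicted risks and 1-year outcomes (1 = died).
toy_probs = [0.05, 0.08, 0.12, 0.15, 0.85, 0.88, 0.92, 0.95]
toy_outcomes = [0, 0, 0, 1, 1, 1, 1, 0]

rows = calibration_bins(toy_probs, toy_outcomes)
for row in rows:
    print(row["bin"], row["n"], round(row["mean_predicted"], 3), row["observed"])
```

In each bin, an observed rate above the mean predicted probability means the model underestimates risk there, and a point on the diagonal means the two agree.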
