2018 Nov 14;13(11):e0206862.
doi: 10.1371/journal.pone.0206862. eCollection 2018.

Optimal intensive care outcome prediction over time using machine learning

Christopher Meiring et al. PLoS One.

Abstract

Background: Prognostication is an essential tool for risk adjustment and decision making in the intensive care unit (ICU). Research into prognostication in the ICU has so far been limited to data from admission or the first 24 hours. Most ICU admissions last longer than this, decisions are made throughout an admission, and some admissions are explicitly intended as time-limited prognostic trials. Despite this, temporal changes in prognostic ability during ICU admission have received little attention to date. Current predictive models, in the form of prognostic clinical tools, are typically derived from linear models and do not explicitly handle incremental information from trends. Machine learning (ML) allows predictive models to be developed that use non-linear predictors and complex interactions between variables, thus allowing incorporation of trends in measured variables over time; this has made it possible to investigate prognosis throughout an admission.

Methods and findings: This study uses ML to assess the predictability of ICU mortality as a function of time. Logistic regression against physiological data alone outperformed APACHE-II and demonstrated several important interactions, including between lactate and noradrenaline dose, between lactate and mean arterial pressure (MAP), and between age and MAP, consistent with the current sepsis definitions. ML models consistently outperformed logistic regression, with Deep Learning giving the best results. Predictive power was maximal on the second day and was further improved by incorporating trend data. Using a limited range of physiological and demographic variables, the best machine learning model on the first day showed an area under the receiver operating characteristic curve (AUC) of 0.883 (σ = 0.008), compared to 0.846 (σ = 0.010) for a logistic regression from the same predictors and 0.836 (σ = 0.007) for a logistic regression based on the APACHE-II score. Adding information gathered on the second day of admission improved the maximum AUC to 0.895 (σ = 0.008). Beyond the second day, predictive ability declined.

Conclusion: These findings have implications for decision making in intensive care and provide a justification for time-limited trials of ICU therapy; the assessment of prognosis over more than one day may be a valuable strategy, as new information on the second day helps to differentiate outcomes. New ML models based on trend data beyond the first day could greatly improve upon current risk stratification tools.
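
The AUC values quoted in the abstract can be read as the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor. A minimal sketch of that rank-based calculation in Python follows; the function name `auc_from_scores` and the toy risks and outcomes are illustrative assumptions, not code or data from the study.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    scores: predicted risk of death, higher means higher risk.
    labels: 1 for deceased at discharge, 0 for alive.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case of each outcome")
    # Count pairs where the deceased patient is ranked above the survivor;
    # a tie counts as half a concordant pair.
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Hypothetical predicted risks for six admissions (not study data).
risks = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1]
outcomes = [1, 1, 1, 0, 0, 0]
auc = auc_from_scores(risks, outcomes)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported differences between APACHE-II, logistic regression, and the ML models should be judged.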

Conflict of interest statement

The authors have declared that no competing interests exist. No funding bodies had any role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Figures

Fig 1. Admission duration waterfall plot.
The total number of admissions in the patient database is shown, followed by the number removed due to a missing value for the classifier, leaving the number of admissions included in the analysis at day 1. On each day, bars show the number of patients discharged alive (light blue) or deceased (dark blue). The numbers above the bars represent the total number of patients remaining on intensive care at the start of the day and the percentage that this represents of all admissions included in the analysis, for days 1, 5, 10, 20 and 30. Where a total is shown, the grey bar represents those remaining on ICU at the end of the day.
Fig 2. AUC of APACHE-II when applied to patients remaining in ICU on days 1-5.
Area under the receiver operating characteristic curve (AUC) from logistic regression models built applying admission (24 hour) APACHE-II for only the patients remaining on each subsequent day, predicting vital status at discharge on twenty cross-folded validation sets. Points represent the mean AUC for each fold across nine imputations. Bars represent the mean of twenty folds +/- 2 standard deviations, calculated from the combined variance of folding and imputation. The predictive performance of admission APACHE-II declines when applied to predict outcome on subsequent days.
Fig 3. AUC of logistic regression and machine learning models for each day.
Area under the receiver operating characteristic curve (AUC) for predictions of vital status at discharge in twenty cross-folded validation sets for models built on each day. On the first day, the ‘APACHE’ model is a single predictor logistic regression model built from the APACHE-II score. The first and all subsequent days show AUCs of logistic regression (‘glm’), random forest (‘parRF’), a boosted decision tree algorithm (‘adaboost’), a single layer model averaged neural network (‘avNNet’), a support vector machines algorithm with radial basis function kernel and class weights (‘svmRadialWeights’), and a six hidden-layer deep learning neural network (‘DeepNN’). ‘Simple’ models use only measurements from one day as predictors. ‘Cumulative’ (cumul.) models use measurements from the day and preceding days as predictors. Points represent the mean AUC for each fold across nine imputations. Bars represent the mean of twenty folds +/- 2 standard deviations, calculated from the combined variance of folding and imputation.
Fig 4. Distributions of length of admission for patients correctly and incorrectly classified as alive or deceased by the Deep Learning classifiers.
Each plot represents the four smoothed distributions of correct/incorrect prediction split by actual outcome (alive/deceased) for each Deep Learning classifier on each day. Distributions of admission duration for correct predictions are shown in green, while those for incorrect predictions are shown in red. Thick lines represent arithmetic mean, with dashed lines indicating mean +/- standard deviation.
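
The error bars in Figs 2 and 3 are described as the mean of twenty folds +/- 2 standard deviations, with the variance combined across folding and imputation. The captions do not give the exact pooling formula, so the sketch below assumes one plausible reading: the law of total variance, which adds the variance between per-fold means to the average within-fold variance across imputations. The function name `pooled_interval` and the small AUC grid are hypothetical.

```python
from statistics import mean, pvariance


def pooled_interval(auc_by_fold):
    """Return (centre, lower, upper) for a grid of AUC values.

    auc_by_fold: one list per fold, each holding the AUC obtained on
    every imputed dataset for that fold.
    Assumption: total variance = variance of the per-fold means (folding)
    + mean of the within-fold variances (imputation).
    """
    fold_means = [mean(fold) for fold in auc_by_fold]  # the plotted points
    between_folds = pvariance(fold_means)
    within_folds = mean([pvariance(fold) for fold in auc_by_fold])
    sd = (between_folds + within_folds) ** 0.5
    centre = mean(fold_means)
    return centre, centre - 2 * sd, centre + 2 * sd


# Hypothetical grid: three folds by two imputations (not study data).
grid = [[0.88, 0.90], [0.86, 0.88], [0.89, 0.91]]
centre, lower, upper = pooled_interval(grid)
print(round(centre, 3), round(lower, 3), round(upper, 3))
```

With equal numbers of imputations per fold, this pooled variance equals the population variance of all fold-by-imputation AUCs taken together, so the bar reflects both sources of spread rather than fold-to-fold variation alone.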

