Intensive Care Med Exp. 2022 Sep 19;10(1):38.
doi: 10.1186/s40635-022-00465-4.

Development and validation of an early warning model for hospitalized COVID-19 patients: a multi-center retrospective cohort study



Jim M Smit et al.

Abstract

Background: Timely identification of deteriorating COVID-19 patients is needed to guide changes in clinical management and admission to intensive care units (ICUs). There is significant concern that widely used early warning scores (EWSs) underestimate illness severity in COVID-19 patients; we therefore developed an early warning model specifically for COVID-19 patients.

Methods: We retrospectively collected electronic medical record data to extract predictors and used these to fit a random forest model. To simulate the situation in which the model would have been developed after the first and implemented during the second COVID-19 'wave' in the Netherlands, we performed a temporal validation by splitting all included patients into groups admitted before and after August 1, 2020. Furthermore, we propose a method for dynamic model updating to retain model performance over time. We evaluated model discrimination and calibration, performed a decision curve analysis, and quantified the importance of predictors using SHapley Additive exPlanations values.
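The temporal validation described above can be sketched as follows. The column names, the two toy predictors, and all data below are illustrative assumptions, not the study's actual variables or code:

```python
# Sketch of a temporal development/validation split and random-forest fit,
# mirroring the paper's design on SYNTHETIC data. Column names and the
# two toy predictors are assumptions for illustration only.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "admission_date": pd.Timestamp("2020-02-01")
    + pd.to_timedelta(rng.integers(0, 450, n), unit="D"),
    "resp_rate": rng.normal(18, 4, n),
    "spo2": rng.normal(95, 3, n),
})
# Synthetic deterioration outcome loosely tied to the two vitals
logit = 0.3 * (df["resp_rate"] - 18) - 0.4 * (df["spo2"] - 95)
df["deterioration"] = rng.random(n) < 1 / (1 + np.exp(-(logit - 2)))

# Temporal split: develop on the first wave, validate on the second
cutoff = pd.Timestamp("2020-08-01")
train = df[df["admission_date"] < cutoff]
test = df[df["admission_date"] >= cutoff]

features = ["resp_rate", "spo2"]
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(train[features], train["deterioration"])
auc = roc_auc_score(test["deterioration"], rf.predict_proba(test[features])[:, 1])
```

Splitting strictly by admission date, rather than at random, is what makes this a simulation of prospective deployment: the model only ever sees data that would have been available at the time.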

Results: We included 3514 COVID-19 patient admissions from six Dutch hospitals between February 2020 and May 2021, and included a total of 18 predictors for model fitting. The model showed higher discriminative performance in terms of partial area under the receiver operating characteristic curve (0.82 [0.80-0.84]) compared to the National early warning score (0.72 [0.69-0.74]) and the Modified early warning score (0.67 [0.65-0.69]), a greater net benefit over a range of clinically relevant model thresholds, and relatively good calibration (intercept = 0.03 [-0.09 to 0.14], slope = 0.79 [0.73-0.86]).
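As a hedged sketch (not the authors' code), the partial AUC over the false-positive-rate range [0, 0.33] reported above can be computed by truncating the ROC curve at that rate and integrating:

```python
# Sketch: partial AUC over the FPR range [0, 0.33] by trapezoidal
# integration of a truncated ROC curve. Scores are synthetic; nothing
# here reproduces the study's reported numbers.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(1)
y = rng.random(1000) < 0.2                       # synthetic outcomes
scores = rng.normal(0.0, 1.0, 1000) + y * 1.0    # positives score higher on average

fpr, tpr, _ = roc_curve(y, scores)

def partial_auc(fpr, tpr, max_fpr=0.33):
    """Area under the ROC curve restricted to fpr <= max_fpr."""
    stop = np.searchsorted(fpr, max_fpr, side="right")
    x = np.concatenate([fpr[:stop], [max_fpr]])
    t = np.concatenate([tpr[:stop], [np.interp(max_fpr, fpr, tpr)]])
    # Trapezoidal rule over the truncated curve
    return float(np.sum((x[1:] - x[:-1]) * (t[1:] + t[:-1]) / 2.0))

pauc = partial_auc(fpr, tpr)
```

Restricting to low false-positive rates focuses the comparison on the operating region that matters for an alarm system, where excessive false alerts are unacceptable. A chance-level classifier would score 0.33² / 2 ≈ 0.054 on this measure.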

Conclusions: This study shows the potential benefit of moving from early warning models for the general inpatient population to models for specific patient groups. Further (independent) validation of the model is needed.

Keywords: Artificial intelligence; COVID-19; Dynamic model updating; Early warning score; Intensive care; Machine learning; Medical prediction model.


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Study design. a Schematic representation of the dynamic model updating procedure. For example, to predict deterioration for patients admitted to hospital A in October 2020, the model is fitted using patient data collected up to that date in the remaining hospitals, and a calibrator is fitted using patient data collected up to that date in hospital A. These two combined result in calibrated predictions. This process is repeated each month, for each hospital, from August 2020 until May 2021. b Flowchart of patient inclusion. ICU = intensive care unit, ED = emergency department, EoLC = end-of-life care, LOS = length-of-stay
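The per-hospital recalibration step in panel a can be sketched as below. The random forest plus a Platt-style logistic calibrator, and all data, are illustrative assumptions rather than the authors' actual pipeline:

```python
# Sketch of Fig. 1a's idea on SYNTHETIC data: fit a random forest on the
# "remaining hospitals", fit a logistic (Platt-style) calibrator on the
# target hospital's own past data, and combine the two for new patients.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)

def make_hospital(n, shift):
    """Synthetic hospital: 3 predictors, hospital-specific baseline risk."""
    X = rng.normal(0, 1, (n, 3))
    logit = X @ np.array([1.0, -0.8, 0.5]) + shift
    y = rng.random(n) < 1 / (1 + np.exp(-logit))
    return X, y

X_other, y_other = make_hospital(2000, shift=0.0)   # remaining hospitals
X_a, y_a = make_hospital(500, shift=-1.0)           # hospital A, data so far
X_new, _ = make_hospital(10, shift=-1.0)            # new hospital A patients

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_other, y_other)

# The calibrator maps the model's raw scores onto hospital A's outcome rates
calibrator = LogisticRegression()
calibrator.fit(rf.predict_proba(X_a)[:, [1]], y_a)

calibrated = calibrator.predict_proba(rf.predict_proba(X_new)[:, [1]])[:, 1]
```

In the paper this refit-and-recalibrate step is repeated each month for each hospital; the sketch shows a single iteration.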
Fig. 2
Model discrimination and decision curve analysis. a Overall ROC curves for the RF and LR models and the NEWS. We placed two landmarks at NEWS values of 5 and 7, i.e., the recommended trigger thresholds for an urgent and an emergency response. We calculated both the pAUC between a false positive rate of 0 and 0.33 (grey area) and the complete AUC. Shaded areas around each point in the ROC curves represent the 95% bootstrap percentile CIs [25] (with 1000 bootstrap replications stratified for positive and negative samples). b Hospital-specific pAUCs. The error bars represent the 95% bootstrap percentile CIs [25] (with 1000 bootstrap replications stratified for positive and negative samples). P-values, calculated as described in Additional file 1: appendix F.4, are shown for the difference in pAUC between the RF models and NEWS (upper bar), between the RF and LR models (middle bar), and between the LR models and NEWS (lower bar). c Overall decision curve analysis results. The standardized net benefit is plotted over a range of clinically relevant probability thresholds with corresponding odds. The 'Intervention for all' line indicates the net benefit if an (urgent or emergency) response were always triggered
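The net benefit plotted in panel c has a simple closed form, NB = TP/n - FP/n * p_t/(1 - p_t), where p_t is the probability threshold; standardizing divides by the outcome prevalence. A minimal sketch on synthetic predictions (an illustrative assumption, not the study's data):

```python
# Sketch of a decision-curve calculation: net benefit of acting on
# predictions at one probability threshold, versus "intervention for all".
# All data are synthetic and illustrative.
import numpy as np

rng = np.random.default_rng(2)
n = 5000
p_true = rng.beta(1, 6, n)                             # synthetic deterioration risks
y = rng.random(n) < p_true                             # outcomes drawn at those risks
pred = np.clip(p_true + rng.normal(0, 0.05, n), 0, 1)  # noisy model predictions

def net_benefit(y, pred, pt):
    """Net benefit at probability threshold pt."""
    act = pred >= pt
    tp = np.sum(act & y) / len(y)    # true-positive rate in the cohort
    fp = np.sum(act & ~y) / len(y)   # false-positive rate in the cohort
    return tp - fp * pt / (1 - pt)

prevalence = y.mean()
nb_model = net_benefit(y, pred, 0.2)
nb_all = net_benefit(y, np.ones(n), 0.2)   # the "intervention for all" line
snb_model = nb_model / prevalence          # standardized net benefit
```

The threshold odds p_t/(1 - p_t) weight each false positive by how much harm an unnecessary response is judged to cause relative to the benefit of catching a true deterioration.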
Fig. 3
Overall model calibration of the static and dynamic RF models (a) and LR models (b). Top left: smoothed flexible calibration curves. Top right: zoom-in of the calibration curve in the 0-0.2 probability range (grey area). Shaded areas around the curves represent the 95% CIs. Bottom: histogram of the predictions (log scale). Shaded areas around each point in the calibration curves (before smoothing) represent the 95% bootstrap percentile CIs [25] (with 1000 bootstrap replications stratified for positive and negative samples). The smooth curves, including CIs, were estimated by locally weighted scatterplot smoothing (see https://github.com/jimmsmit/COVID-19_EWS for the implementation)
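The calibration intercept and slope reported in the Results can be estimated by a logistic recalibration fit (outcomes regressed on the logit of the predicted risks). The following is a minimal sketch on synthetic, well-calibrated predictions, not the authors' implementation:

```python
# Sketch: calibration intercept and slope via logistic regression of
# outcomes on the logit of predicted risks. Predictions here are synthetic
# and perfectly calibrated by construction, so the slope should land near 1.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 3000
p = np.clip(rng.beta(2, 8, n), 1e-4, 1 - 1e-4)   # predicted risks
y = rng.random(n) < p                            # outcomes drawn at those risks

logit = np.log(p / (1 - p)).reshape(-1, 1)
# Very large C makes the fit effectively unpenalized
fit = LogisticRegression(C=1e12, max_iter=1000).fit(logit, y)
slope = fit.coef_[0, 0]          # ideal value: 1.0
intercept = fit.intercept_[0]    # ideal value: 0.0
```

A slope below 1, such as the reported 0.79, indicates predictions that are somewhat too extreme (overfitting); an intercept near 0 indicates the average predicted risk matches the observed event rate.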
Fig. 4
Distribution of SHapley Additive exPlanations (SHAP) values of the included predictors (ordered by mean SHAP magnitude) for the random forest model. For each predictor, each dot represents the impact of that predictor on a single prediction. The color of each dot corresponds to the value of the specific predictor. Thus, pink dots with positive SHAP values indicate that high values of the predictor are associated with a high risk of clinical deterioration. Conversely, blue dots with positive SHAP values indicate that low values of the predictor are associated with a high risk of clinical deterioration

References

    1. Subbe CP, Kruger M, Rutherford P, Gemmel L. Validation of a modified Early Warning Score in medical admissions. QJM: An Int J Med. 2001;94:521-526. doi: 10.1093/qjmed/94.10.521. - DOI - PubMed
    2. Smith GB, Prytherch DR, Meredith P, et al. The ability of the National Early Warning Score (NEWS) to discriminate patients at risk of early cardiac arrest, unanticipated intensive care unit admission, and death. Resuscitation. 2013;84:465-470. doi: 10.1016/j.resuscitation.2012.12.016. - DOI - PubMed
    3. Royal College of Physicians. National Early Warning Score (NEWS): standardising the assessment of acute-illness severity in the NHS. Report of a working party. London: RCP; 2012.
    4. Royal College of Physicians. National Early Warning Score (NEWS) 2: standardising the assessment of acute-illness severity in the NHS. Updated report of a working party. London: RCP; 2017.
    5. Zhang K, Zhang X, Ding W, et al. National early warning score does not accurately predict mortality for patients with infection outside the intensive care unit: a systematic review and meta-analysis. Front Med. 2021;8:1-10. doi: 10.3389/fmed.2021.704358. - DOI - PMC - PubMed
