Clinical Trial
PLoS One. 2021 Jan 19;16(1):e0245157.
doi: 10.1371/journal.pone.0245157. eCollection 2021.

A comparison of machine learning models versus clinical evaluation for mortality prediction in patients with sepsis

William P T M van Doorn et al. PLoS One.

Abstract

Introduction: Patients with sepsis who present to an emergency department (ED) have highly variable underlying disease severity, and can be categorized from low to high risk. Development of a risk stratification tool for these patients is important for appropriate triage and early treatment. The aim of this study was to develop machine learning models predicting 31-day mortality in patients presenting to the ED with sepsis and to compare these to internal medicine physicians and clinical risk scores.

Methods: A single-center, retrospective cohort study was conducted amongst 1,344 emergency department patients fulfilling sepsis criteria. Laboratory and clinical data available within the first two hours of presentation were randomly partitioned into a development dataset (n = 1,244) and a validation dataset (n = 100). Machine learning models were trained and evaluated on the development dataset and compared to internal medicine physicians and risk scores in the independent validation dataset. The primary outcome was 31-day mortality.
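The random partition described in the Methods can be illustrated with a minimal sketch. It assumes patients are identified by integer IDs and uses a fixed seed for reproducibility; the function name `split_cohort` and the seed are illustrative assumptions, not the authors' implementation.

```python
import random


def split_cohort(patient_ids, n_validation, seed=0):
    """Randomly partition patient IDs into development and validation subsets.

    Hypothetical helper: the seed and the ID scheme are assumptions.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    validation = ids[:n_validation]
    development = ids[n_validation:]
    return development, validation


# 1,344 patients in total, 100 held out for the comparison with physicians
development, validation = split_cohort(range(1344), n_validation=100)
print(len(development), len(validation))
```

Holding out the validation subset before any model training keeps the later comparison with physicians and risk scores independent of model development.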

Results: A total of 1,344 patients were included, of whom 174 (13.0%) died. Machine learning models trained with laboratory data or a combination of laboratory + clinical data achieved an area under the ROC curve of 0.82 (95% CI: 0.80-0.84) and 0.84 (95% CI: 0.81-0.87) for predicting 31-day mortality, respectively. In the validation set, models outperformed internal medicine physicians and clinical risk scores in sensitivity (92% vs. 72% vs. 78%; p<0.001, all comparisons) while retaining comparable specificity (78% vs. 74% vs. 72%; p>0.02). The model had higher diagnostic accuracy, with an area under the ROC curve of 0.85 (95% CI: 0.78-0.92), compared to abbMEDS (0.63, 0.54-0.73), mREMS (0.63, 0.54-0.72) and internal medicine physicians (0.74, 0.65-0.82).
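For readers unfamiliar with the area under the ROC curve, the sketch below computes it from its rank interpretation: the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The labels and scores are toy values chosen for illustration, not study data, and `roc_auc` is a hypothetical helper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = died within 31 days, 0 = survived (illustrative only)
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.63 to 0.85 should be read.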

Conclusion: Machine learning models outperformed internal medicine physicians and clinical risk scores in predicting 31-day mortality. These models are a promising tool to aid in risk stratification of patients presenting to the ED with sepsis.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Overview of study design and model development.
(A) We included 1,344 patients with a diagnosis of sepsis who presented to the ED. Patients were randomly partitioned into a development subset (n = 1,244), used to train and evaluate performance of machine learning models, and a validation subset (n = 100), used to compare models with internal medicine physicians and clinical risk scores. Cross-validation was used to obtain a robust estimate of model performance in the development subset. (B) The machine learning model with the highest cross-validation performance was compared with internal medicine physicians and clinical risk scores in predicting 31-day mortality.
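The cross-validation step in panel (A) can be sketched as follows. The number of folds is not stated in this abstract, so `k=5` is an assumption, as are the seed and the helper name `k_fold_indices`.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    k=5 and the seed are assumptions made for this illustration.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each development patient is used for testing exactly once
splits = list(k_fold_indices(1244, k=5))
for train, test in splits:
    print(len(train), len(test))
```

Averaging a metric such as the AUC over the folds gives the kind of mean estimate with a confidence interval that the development-set results report.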
Fig 2. XGBoost model performance for predicting all-cause mortality at 31 days in the development dataset.
Models trained with laboratory data achieved a mean AUC of 0.82 (95% CI: 0.80–0.84) for predicting 31-day mortality. Predictive performance increased when models were trained with laboratory + clinical data to a mean AUC of 0.84 (95% CI: 0.81–0.87), but this was not statistically different (p = 0.25).
Fig 3. Analysis of parameter importance in the XGBoost models.
Models with laboratory data (left) and with laboratory + clinical data (right) were analyzed using SHAP values. Individual parameters are ranked by importance in descending order based on the sum of the SHAP values over all the samples. Negative or low SHAP values contribute towards a negative model outcome (survival), whereas high SHAP values contribute towards a positive model outcome (death).
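The ranking described in this caption can be sketched on a toy matrix of SHAP values (rows are samples, columns are parameters). Two assumptions are made here: the values and parameter names are invented placeholders, and importance is taken as the sum of absolute SHAP values, a common convention that keeps positive and negative contributions from cancelling; the caption itself only says "sum of the SHAP values".

```python
def rank_features(shap_matrix, feature_names):
    """Rank features by summed absolute SHAP value, largest first.

    Using absolute values is an assumption, not confirmed by the source.
    """
    totals = [0.0] * len(feature_names)
    for row in shap_matrix:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    return sorted(zip(feature_names, totals), key=lambda pair: pair[1], reverse=True)


# Toy SHAP values for three hypothetical parameters across three samples
toy_shap = [
    [0.5, -0.25, 0.0],
    [-1.0, 0.25, 0.125],
    [0.25, 0.5, -0.125],
]
names = ["feature_a", "feature_b", "feature_c"]
ranking = rank_features(toy_shap, names)
for name, total in ranking:
    print(name, total)
```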
Fig 4. Comparison of XGBoost model with internal medicine physicians and clinical risk scores.
The XGBoost model achieved a sensitivity (A) of 0.92 (95% CI: 0.87–0.95) and specificity (B) of 0.78 (95% CI: 0.70–0.86) for predicting mortality. Its sensitivity was significantly better than the mean prediction of internal medicine physicians (0.72, 0.62–0.81; p<0.001) as well as abbMEDS (0.54, 0.44–0.64; p<0.0001), mREMS (0.62, 0.52–0.72; p<0.001) and SOFA (0.77, 95% CI: 0.69–0.85; p = 0.003). In terms of specificity, internal medicine physicians (0.74, 0.64–0.82; p = 0.509), abbMEDS (0.72, 0.64–0.81; p = 0.327) and SOFA (0.74, 95% CI: 0.65–0.82; p = 0.447) achieved performance similar to the XGBoost model, in contrast to mREMS (0.64, 0.55–0.74; p = 0.02), which was significantly worse than the machine learning predictions. * = p<0.05; ** = p<0.001; *** = p<0.0001; NS = not significant.
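Sensitivity and specificity, as compared in this figure, follow directly from the confusion-matrix counts. The sketch below uses toy labels and predictions (not study data) and a hypothetical helper name.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: 1 = died within 31 days, 0 = survived (illustrative only)
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(labels, predictions)
print(sens, spec)  # 0.75 0.6
```

In a triage setting a missed death (false negative) is usually costlier than a false alarm, which is why the gain in sensitivity at comparable specificity is the headline result.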
