Evaluating sepsis watch generalizability through multisite external validation of a sepsis machine learning model

Bruno Valan^#¹, Anusha Prakash^#¹, William Ratliff¹, Michael Gao¹, Srikanth Muthya², Ajit Thomas², Jennifer L Eaton³, Matt Gardner¹, Marshall Nichols¹, Mike Revoir¹, Dustin Tart⁴, Cara O'Brien⁴,⁵, Manesh Patel⁵, Suresh Balu¹, Mark Sendak⁶

Affiliations

¹ Duke Institute for Health Innovation, Durham, NC, USA.
² Cohere Med Inc, 110 Corcoran St, 5th Floor, Durham, NC, USA.
³ Summa Health Research & Innovation, Akron, OH, USA.
⁴ Duke University Hospital, Durham, NC, USA.
⁵ Department of Medicine, Duke University School of Medicine, Durham, NC, USA.
⁶ Duke Institute for Health Innovation, Durham, NC, USA. mark.sendak@duke.edu.

^# Contributed equally.

PMID: 40500319
PMCID: PMC12159134
DOI: 10.1038/s41746-025-01664-5

Bruno Valan et al. NPJ Digit Med. 2025 Jun 11;8(1):350. doi: 10.1038/s41746-025-01664-5.

Abstract

Sepsis accounts for a substantial portion of global deaths and healthcare costs. The objective of this reproducibility study was to validate Duke Health's Sepsis Watch ML model in a new community healthcare setting and to assess its performance and clinical utility for early sepsis detection at Summa Health's emergency departments. The study analyzed the model's ability to predict sepsis from a combination of static and dynamic patient data, using 205,005 encounters between 2020 and 2021 from 101,584 unique patients. Female patients accounted for 54.7% (n = 112,223) of encounters, and the average age was 50 (IQR [38,71]). Across the four sites, the AUROC ranged from 0.906 to 0.960 and the AUPRC ranged from 0.177 to 0.252. The Sepsis Watch model's strong performance was reproduced in a community health system setting with little variation across sites, supporting its portability across different geographical and demographic contexts.

Conflict of interest statement

Competing interests: W.R. reported writing software licensed to Fullsteam Health. M. Gao reported writing software licensed to Clinetic, Cohere Med, and Fullsteam Health. M. Gao reported owning equity in Clinetic. M. Gardner reported writing software licensed to Fullsteam Health. M.N. reported writing software licensed to Clinetic, Cohere Med, and Fullsteam Health. M.N. reported owning equity in Clinetic. M.R. reported writing software licensed to Clinetic, Cohere Med, and Fullsteam Health. M.R. reported owning equity in Clinetic. M.P. reported receiving grants from HeartFlow, Bayer, Janssen, and Novartis outside the submitted work. S.B. reported writing software licensed to Clinetic, Cohere Med, and Fullsteam Health. S.B. reported owning equity in Clinetic. M.S. reported writing software licensed to Clinetic, Cohere Med, KelaHealth, and Fullsteam Health. M.S. reported owning equity in Clinetic. M.S. and S.B. reported receiving grants from the Gordon and Betty Moore Foundation and the Patrick J. McGovern Foundation. No other disclosures were reported.

Figures

**Fig. 1. Comparison of Sepsis Watch model performance across four hospital sites using ROC curves.**
Depicts ROC (receiver operating characteristic) curves of the Sepsis Watch model for each of the hospital sites across Summa Health: (a) ACH Emergency Department, (b) SHB Emergency Department, (c) ACH Green Emergency Department, and (d) SHB Wadsworth Emergency Department. The model was evaluated at 10,000 thresholds to generate the ROC curve for each of the four hospital sites.

**Fig. 2. Sepsis Watch model's prediction performance: AUPR analysis across four hospital sites.**
Depicts precision-recall curves, summarized by AUPR (area under the precision-recall curve), for each of the four sites at Summa Health: (a) ACH Emergency Department, (b) SHB Emergency Department, (c) ACH Green Emergency Department, and (d) SHB Wadsworth Emergency Department. The model was evaluated at 10,000 thresholds to generate the precision-recall curve for each of the four hospital sites.
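The threshold sweep described in the captions for Figs. 1 and 2 can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names (`sweep_thresholds`, `trapezoid_area`), the evenly spaced threshold grid on [0, 1], the trapezoidal integration, and the toy scores and labels are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def sweep_thresholds(
    scores: Sequence[float],
    labels: Sequence[int],
    n_thresholds: int = 10_000,
) -> Tuple[List[float], List[float], List[float]]:
    """Return (fpr, tpr, precision), one entry per threshold.

    Thresholds are evenly spaced from 1.0 down to 0.0 (an assumption),
    so fpr and tpr are non-decreasing along the returned lists.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")

    fpr, tpr, precision = [], [], []
    for i in range(n_thresholds):
        t = 1.0 - i / (n_thresholds - 1)
        # An encounter is flagged when its score meets or exceeds the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tpr.append(tp / pos)   # recall / sensitivity
        fpr.append(fp / neg)   # 1 - specificity
        # Convention (assumed): precision is 1.0 when nothing is flagged.
        precision.append(tp / (tp + fp) if (tp + fp) else 1.0)
    return fpr, tpr, precision


def trapezoid_area(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Area under the piecewise-linear curve through (xs[i], ys[i])."""
    return sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2.0
        for i in range(1, len(xs))
    )


# Hypothetical toy data: model risk scores and sepsis labels (1 = sepsis).
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]

fpr, tpr, precision = sweep_thresholds(toy_scores, toy_labels)
auroc = trapezoid_area(fpr, tpr)        # x = FPR, y = TPR
auprc = trapezoid_area(tpr, precision)  # x = recall, y = precision
print(f"AUROC={auroc:.3f}  AUPRC={auprc:.3f}")
```

Trapezoidal integration of the precision-recall curve can be slightly optimistic compared with step-wise average precision, so values from this sketch are not necessarily comparable with the figures reported in the paper. Because precision depends on prevalence, a low AUPRC can accompany a high AUROC when the outcome is rare.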

**Fig. 3. Model facts label updated to show generalizability results for external validation for the Sepsis Watch model.**
The Model Facts Label was updated with external validation results to highlight the reproducibility of the Sepsis Watch model in populations beyond its original training data, making the findings accessible to a general audience.
