Early Warning Scores With and Without Artificial Intelligence
- PMID: 39405061
- PMCID: PMC11544488
- DOI: 10.1001/jamanetworkopen.2024.38986
Erratum in
- Error in Author Name. JAMA Netw Open. 2024 Nov 4;7(11):e2448969. doi: 10.1001/jamanetworkopen.2024.48969. PMID: 39509136.
Abstract
Importance: Early warning decision support tools to identify clinical deterioration in the hospital are widely used, but there is little information on their comparative performance.
Objective: To compare 3 proprietary artificial intelligence (AI) early warning scores and 3 publicly available simple aggregated weighted scores.
Design, setting, and participants: This retrospective cohort study was performed at 7 hospitals in the Yale New Haven Health System. All consecutive adult medical-surgical ward hospital encounters between March 9, 2019, and November 9, 2023, were included.
Exposures: Simultaneous Epic Deterioration Index (EDI), Rothman Index (RI), eCARTv5 (eCART), Modified Early Warning Score (MEWS), National Early Warning Score (NEWS), and NEWS2 scores.
Main outcomes and measures: Clinical deterioration, defined as a transfer from ward to intensive care unit or death within 24 hours of an observation.
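The outcome definition above can be read as a per-observation labeling rule. The following is a minimal sketch under stated assumptions, not the authors' code: the function name `label_observation`, its parameters, and the choice to treat the 24-hour boundary as inclusive are all hypothetical, since the abstract does not specify them.

```python
from datetime import datetime, timedelta

def label_observation(obs_time, event_time, window_hours=24):
    """Return True if a deterioration event (ICU transfer or death)
    falls within `window_hours` after the observation.

    Assumption: the window is inclusive at both ends; the abstract
    does not say how boundary cases were handled.
    """
    if event_time is None:
        # Encounter with no deterioration event: every observation is negative.
        return False
    delta = event_time - obs_time
    return timedelta(0) <= delta <= timedelta(hours=window_hours)

# Hypothetical observation and event times, for illustration only.
obs = datetime(2023, 1, 1, 8, 0)
event_soon = datetime(2023, 1, 2, 6, 0)   # 22 hours after the observation
event_late = datetime(2023, 1, 2, 9, 0)   # 25 hours after the observation
```

Under this rule, one encounter can contribute both positive and negative observations, depending on how close each observation is to the event.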
Results: Of the 362 926 patient encounters (median patient age, 64 [IQR, 47-77] years; 200 642 [55.3%] female), 16 693 (4.6%) experienced a clinical deterioration event. eCART had the highest area under the receiver operating characteristic curve at 0.895 (95% CI, 0.891-0.900), followed by NEWS2 at 0.831 (95% CI, 0.826-0.836), NEWS at 0.829 (95% CI, 0.824-0.835), RI at 0.828 (95% CI, 0.823-0.834), EDI at 0.808 (95% CI, 0.802-0.812), and MEWS at 0.757 (95% CI, 0.750-0.764). After matching scores at the moderate-risk sensitivity level for a NEWS score of 5, overall positive predictive values (PPVs) ranged from a low of 6.3% (95% CI, 6.1%-6.4%) for an EDI score of 41 to a high of 17.3% (95% CI, 16.9%-17.8%) for an eCART score of 94. Matching scores at the high-risk specificity of a NEWS score of 7 yielded overall PPVs ranging from a low of 14.5% (95% CI, 14.0%-15.2%) for an EDI score of 54 to a high of 23.3% (95% CI, 22.7%-24.2%) for an eCART score of 97. The moderate-risk thresholds provided a median of at least 20 hours of lead time for all the scores. Median lead time at the high-risk threshold was 11 (IQR, 0-69) hours for eCART, 8 (IQR, 0-63) hours for NEWS, 6 (IQR, 0-62) hours for NEWS2, 5 (IQR, 0-56) hours for MEWS, 1 (IQR, 0-39) hour for EDI, and 0 (IQR, 0-42) hours for RI.
Conclusions and relevance: In this cohort study of inpatient encounters, eCART outperformed the other AI and non-AI scores, identifying more deteriorating patients with fewer false alarms and sufficient time to intervene. NEWS, a non-AI, publicly available early warning score, significantly outperformed EDI. Given the wide variation in accuracy, additional transparency and oversight of early warning tools may be warranted.
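The threshold matching described in the Results can be illustrated as follows. This is a sketch, assuming that higher scores mean higher risk (a score that runs the other way would need to be negated first) and that matching means picking the highest threshold whose sensitivity is at least that of the reference score; the abstract does not give the authors' exact procedure. All function names and the toy data are hypothetical.

```python
def sensitivity(scores, labels, threshold):
    """Fraction of true events flagged at or above the threshold."""
    positives = sum(1 for y in labels if y)
    true_pos = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    return true_pos / positives if positives else 0.0

def ppv(scores, labels, threshold):
    """Fraction of flagged observations that are true events."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return sum(flagged) / len(flagged) if flagged else 0.0

def match_threshold_by_sensitivity(scores, labels, target):
    """Highest threshold whose sensitivity is at least `target`."""
    for t in sorted(set(scores), reverse=True):
        if sensitivity(scores, labels, t) >= target:
            return t
    return min(scores)

# Toy data, invented for illustration; not taken from the study.
labels = [0, 0, 1, 1, 0, 0, 1, 0]
ref_scores = [1, 3, 5, 7, 2, 6, 8, 4]           # reference score (NEWS-like scale)
other_scores = [10, 20, 40, 90, 30, 50, 95, 60]  # comparison score (0-100 scale)

target = sensitivity(ref_scores, labels, 5)      # sensitivity at reference threshold 5
matched = match_threshold_by_sensitivity(other_scores, labels, target)
print(matched, ppv(ref_scores, labels, 5), ppv(other_scores, labels, matched))
```

Holding sensitivity fixed in this way lets PPV be compared across scores that use different numeric scales, which is how the abstract reports an EDI score of 41 and an eCART score of 94 against a NEWS score of 5. The high-risk comparison in the abstract matches on specificity instead, which this sketch does not cover.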
Comment in
- Toward the Rigorous Evaluation of Early Warning Scores. JAMA Netw Open. 2024 Oct 1;7(10):e2438966. doi: 10.1001/jamanetworkopen.2024.38966. PMID: 39405065.