NPJ Digit Med. 2021 Sep 9;4(1):134.
doi: 10.1038/s41746-021-00504-6.

Artificial intelligence sepsis prediction algorithm learns to say "I don't know"

Supreeth P Shashikumar et al. NPJ Digit Med.

Abstract

Sepsis is a leading cause of morbidity and mortality worldwide. Early identification of sepsis is important as it allows timely administration of potentially life-saving resuscitation and antimicrobial therapy. We present COMPOSER (COnformal Multidimensional Prediction Of SEpsis Risk), a deep learning model for the early prediction of sepsis, specifically designed to reduce false alarms by detecting unfamiliar patients/situations arising from erroneous data, missingness, distributional shift and data drifts. COMPOSER flags these unfamiliar cases as indeterminate rather than making spurious predictions. Six patient cohorts (515,720 patients) curated from two healthcare systems in the United States across intensive care units (ICU) and emergency departments (ED) were used to train the model and to validate it externally and temporally. In a sequential prediction setting, COMPOSER achieved a consistently high area under the curve (AUC) (ICU: 0.925-0.953; ED: 0.938-0.945). Out of over 6 million prediction windows, roughly 20% and 8% were identified as indeterminate among non-septic and septic patients, respectively. COMPOSER provided early warning within a clinically actionable timeframe (ICU: 12.2 [3.2, 22.8] and ED: 2.1 [0.8, 4.5] hours prior to first antibiotics order) across all six cohorts, thus allowing for identification and prioritization of patients at high risk for sepsis.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Schematic diagram of COMPOSER.
Two possible deployment schemes are shown in panel (a). During the evaluation phase of COMPOSER (panel b), the input test data is first fed into the weighted input layer, and is then passed through the encoder (first module), after which the method of conformal prediction is utilized to determine the ‘conditions for use’ (second module). This is achieved by comparing the conformity of the new representation (H(2)) to the representations in the conformal set, which are carefully selected during the training phase (see Methods section). If conformity can be achieved at a given confidence level (ε), H(2) is then forwarded to the sepsis predictor to obtain a risk score (third module). In comparison, panel (c) shows a deployment scheme without the use of conformal prediction.
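The gating step described in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nearest-neighbour nonconformity measure, the leave-one-out calibration, the reading of ε as a p-value threshold, and every function name here (`nonconformity`, `leave_one_out_scores`, `gated_predict`) are assumptions made for illustration only. The caption states only that a new representation is compared against a conformal set and forwarded to the predictor if conformity is achieved.

```python
import numpy as np

def nonconformity(h, reference):
    """Assumed measure: Euclidean distance from h to its nearest reference point."""
    return float(np.min(np.linalg.norm(reference - h, axis=1)))

def leave_one_out_scores(conformal_set):
    """Score each conformal-set member against all the other members."""
    scores = []
    for i in range(len(conformal_set)):
        others = np.delete(conformal_set, i, axis=0)
        scores.append(nonconformity(conformal_set[i], others))
    return np.array(scores)

def conformal_p_value(h, conformal_set, calibration_scores):
    """Share of calibration scores at least as extreme as the new score."""
    score = nonconformity(h, conformal_set)
    at_least_as_extreme = np.sum(calibration_scores >= score)
    return (at_least_as_extreme + 1) / (len(calibration_scores) + 1)

def gated_predict(h, conformal_set, calibration_scores, predictor, epsilon=0.05):
    """Return a risk score, or None (indeterminate) if h does not conform."""
    p = conformal_p_value(h, conformal_set, calibration_scores)
    if p <= epsilon:
        return None  # unfamiliar input: withhold the prediction
    return predictor(h)

# Synthetic stand-ins for encoder outputs and the sepsis predictor (hypothetical).
rng = np.random.default_rng(0)
conformal_set = rng.normal(size=(200, 4))
calibration_scores = leave_one_out_scores(conformal_set)
toy_predictor = lambda h: 0.5  # placeholder risk score

familiar = conformal_set[0].copy()   # identical to a conformal-set member
unfamiliar = np.full(4, 50.0)        # far from every conformal-set member

familiar_result = gated_predict(familiar, conformal_set, calibration_scores, toy_predictor)
unfamiliar_result = gated_predict(unfamiliar, conformal_set, calibration_scores, toy_predictor)
```

In this sketch the familiar input receives the placeholder score, while the distant input is returned as `None`, which plays the role of the indeterminate label.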
Fig. 2. Summary of COMPOSER performance.
Comparison of the COMPOSER model against GB-Vital^a and a feedforward neural network (FFNN^b). The line plots in panels a to f show the relative improvement in positive predictive value (PPV), negative predictive value (NPV), diagnostic odds ratio (DOR), specificity (SPC), area under the curve (AUC) and number of false alarms per patient hour (FAPH)^+, respectively. The median and interquartiles for all six cohorts (three ICUs and three EDs) are summarized via superimposed box plots. In comparison, ESPM^c (not shown here) achieved an AUC of 0.889 (PPV = 31.2%, NPV = 97.8%, DOR = 23.2, SPC = 84.3, FAPH = 0.132) and 0.876 (PPV = 35.9%, NPV = 96.8%, DOR = 17.4, SPC = 94.2%, FAPH = 0.05) across Hospital-A temporal ICU and ED, respectively. The starting points of the y-axes in panels a and b were determined by the chance level of a classifier at the lowest prevalence rate.
^a GB-Vital corresponds to a Gradient Boosted Tree (XGBoost) built using six vital sign measurements: systolic blood pressure, diastolic blood pressure, heart rate, respiratory rate, oxygen saturation and temperature.
^b FFNN corresponds to a 2-layer feedforward neural network that uses the same number of input features as COMPOSER.
^c ESPM corresponds to Epic's commercially available Best Practice Advisory (BPA) alert. We only had access to the risk scores produced by this system at Hospital-A during the temporal validation time-frame.
^+ False alarms per patient hour (FAPH) can be used to calculate the expected number of false alarms per unit of time in a typical care unit (e.g., a FAPH of 0.025 translates to roughly 1 alarm every 2 h in a 20-bed care unit).
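The FAPH arithmetic in the caption (0.025 false alarms per patient hour in a 20-bed unit giving roughly one alarm every 2 h) can be reproduced with a short Python sketch. The function names are hypothetical, and the calculation assumes every bed is occupied for the whole period, which the caption does not state.

```python
def false_alarms_per_hour(faph, occupied_beds):
    """Expected false alarms per hour across the whole unit."""
    return faph * occupied_beds

def hours_between_false_alarms(faph, occupied_beds):
    """Expected time between false alarms in the unit, in hours."""
    return 1.0 / false_alarms_per_hour(faph, occupied_beds)

# Example from the caption: FAPH of 0.025 in a 20-bed care unit.
unit_rate = false_alarms_per_hour(0.025, 20)
gap_hours = hours_between_false_alarms(0.025, 20)
print(f"{unit_rate:.2f} false alarms per hour, one every {gap_hours:.1f} h")
```

The same helper can be applied to the other FAPH values quoted in the caption to compare alarm burden at a fixed unit size.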
