PLoS One. 2017 Apr 6;12(4):e0174708. doi: 10.1371/journal.pone.0174708. eCollection 2017.

Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning


Steven Horng et al. PLoS One. 2017.

Abstract

Objective: To demonstrate the incremental benefit of using free text data in addition to vital sign and demographic data to identify patients with suspected infection in the emergency department.

Methods: This was a retrospective, observational cohort study performed at a tertiary academic teaching hospital. All consecutive ED patient visits between 12/17/08 and 2/17/13 were included. No patients were excluded. The primary outcome measure was infection diagnosed in the emergency department, defined as a patient having an infection-related ED ICD-9-CM discharge diagnosis. Patients were randomly allocated to train (64%), validate (20%), and test (16%) data sets. After preprocessing the free text using bigram and negation detection, we built four models to predict infection, incrementally adding vital signs, chief complaint, and free text nursing assessment. We used two different methods to represent free text: a bag-of-words model and a topic model. We then used a support vector machine to build the prediction model. We calculated the area under the receiver operating characteristic curve to compare the discriminatory power of each model.
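The bag-of-words arm of this approach can be sketched in a few lines. This is an illustrative stand-in, not the study's code: the paper used a customized NegEx, Mallet, and SVMperf, which are replaced here with scikit-learn equivalents, and the notes, labels, and NEG_ tags are toy assumptions.

```python
# Sketch of a bag-of-words + SVM infection classifier, with scikit-learn
# stand-ins for the tools named in the paper (NegEx, Mallet, SVMperf).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# Toy triage notes; the NEG_ prefix mimics NegEx-style negation tagging.
notes = [
    "fever cough productive sputum",
    "NEG_fever ankle sprain after fall",
    "dysuria fever chills",
    "NEG_chest_pain NEG_fever wrist pain",
]
labels = [1, 0, 1, 0]  # 1 = infection-related ED discharge diagnosis

# Unigrams and bigrams, mirroring the paper's bigram detection step.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(notes)

clf = LinearSVC().fit(X, labels)
pred = clf.predict(vectorizer.transform(["fever chills"]))
```

On real triage text the vocabulary would run to tens of thousands of unigrams and bigrams, which is why a linear SVM on sparse counts is a natural fit.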

Results: A total of 230,936 patient visits were included in the study. Approximately 14% of patients had the primary outcome of diagnosed infection. The area under the ROC curve (AUC) for the vitals model, which used only vital signs and demographic data, was 0.67 for the training data set, 0.67 for the validation data set, and 0.67 (95% CI 0.65-0.69) for the test data set. The AUC for the chief complaint model, which also included demographic and vital sign data, was 0.84 for the training data set, 0.83 for the validation data set, and 0.83 (95% CI 0.81-0.84) for the test data set. The best performing methods made use of all of the free text. In particular, the AUC for the bag-of-words model was 0.89 for the training data set, 0.86 for the validation data set, and 0.86 (95% CI 0.85-0.87) for the test data set. The AUC for the topic model was 0.86 for the training data set, 0.86 for the validation data set, and 0.85 (95% CI 0.84-0.86) for the test data set.
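An AUC with a 95% confidence interval, as reported above, can be computed as follows. The labels and scores here are simulated, and the bootstrap CI is an assumption for illustration; the paper does not state its CI method in the abstract.

```python
# Point-estimate AUC plus a bootstrap 95% CI on simulated data.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)          # simulated outcomes
y_score = 0.5 * y_true + rng.normal(size=1000)  # weakly informative scores

auc = roc_auc_score(y_true, y_score)

# Resample patients with replacement and recompute the AUC each time.
boot_aucs = []
for _ in range(200):
    idx = rng.integers(0, len(y_true), size=len(y_true))
    boot_aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
lo, hi = np.percentile(boot_aucs, [2.5, 97.5])
```

With 230,936 visits the resulting intervals are narrow, consistent with the two-decimal CIs quoted in the results.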

Conclusion: Compared to previous work that used only structured data such as vital signs and demographic information, utilizing free text drastically improves the discriminatory ability of identifying infection (increase in AUC from 0.67 to 0.86).


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Pipeline for natural language processing and prediction.
Our algorithm first takes as input a triage note and processes it by applying tokenization followed by bigram and negation detection, the latter using a customized version of the NegEx tool [14]. The processed text is then transformed into a set of features. The Bag-of-Words features count how many times each word in our vocabulary appears in the processed note, and the Topic model features (derived using the Mallet [17] tool) measure how much certain topics are represented in the note. A Support Vector Machine (SVM) is then trained on these sets of features to determine whether the patient presents with an infection, using the SVMperf software [15].
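The negation step can be approximated in a few lines. This is a minimal sketch, not the customized NegEx used in the study: the trigger list and the fixed three-token scope window are simplified assumptions.

```python
# Minimal NegEx-style negation tagger: prefix tokens that fall within
# a short window after a negation trigger with NEG_.
import re

NEG_TRIGGERS = {"no", "not", "denies", "without"}

def tag_negation(text, scope=3):
    """Tokenize `text` and mark up to `scope` tokens after each trigger."""
    tokens = re.findall(r"\w+", text.lower())
    out, remaining = [], 0
    for tok in tokens:
        if tok in NEG_TRIGGERS:
            remaining = scope
            out.append(tok)
        elif remaining > 0:
            out.append("NEG_" + tok)
            remaining -= 1
        else:
            out.append(tok)
    return out

tagged = tag_negation("denies fever or chills, reports cough")
```

Tagging negated terms this way lets the downstream bag-of-words model treat "fever" and "NEG_fever" as distinct features, which matters for triage notes that are dense with pertinent negatives.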
Fig 2. Receiver operating characteristic curve.
Vitals—Age, Gender, Severity, Temperature, Heart Rate, Respiratory Rate, Oxygen Saturation, Systolic Blood Pressure, Diastolic Blood Pressure, Pain Scale. Chief Complaint—Chief Complaint + Vitals. Bag of Words—Vitals + Chief Complaint + Triage Assessment. Topics—Vitals + Chief Complaint + Triage Assessment
Fig 3. Calibration plots.
We assess the models’ calibration by plotting for each predicted probability range, in increments of 0.1, the fraction of patients with this predicted probability of infection that truly had an infection. Perfect calibration would correspond to the straight line from (0,0) to (1,1). We additionally show bar plots of the number of predictions made by each method within each probability interval. The Vitals model, which has the least data to go on, makes very few predictions of infection with probability greater than 0.5, leading to very large confidence intervals toward the upper right of the plot. The Bag of Words and Topics models are better calibrated, and are particularly accurate for the highest risk patients. Vitals—Age, Gender, Severity, Temperature, Heart Rate, Respiratory Rate, Oxygen Saturation, Systolic Blood Pressure, Diastolic Blood Pressure, Pain Scale. CC—Chief Complaint + Vitals. BoW (Bag of Words)—Vitals + Chief Complaint + Triage Assessment. Topics—Vitals + Chief Complaint + Triage Assessment
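The binning procedure described in this legend corresponds to a standard reliability curve. A sketch with simulated predictions (not the study's data), using scikit-learn's `calibration_curve` as a stand-in:

```python
# Calibration check: bin predicted probabilities in 0.1 increments and
# compare each bin's mean prediction to its observed event rate.
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(1)
p_pred = rng.uniform(0, 1, size=5000)  # simulated predicted probabilities
# Draw outcomes so the true probability equals the prediction
# (i.e., a perfectly calibrated model, up to sampling noise).
y_true = (rng.uniform(0, 1, size=5000) < p_pred).astype(int)

frac_pos, mean_pred = calibration_curve(y_true, p_pred, n_bins=10)
```

For a well-calibrated model the points (mean_pred, frac_pos) hug the diagonal; sparsely populated bins, like the high-probability bins of the Vitals model in Fig 3, produce noisy points with wide confidence intervals.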

References

    1. Garg AX (2005) Effects of computerized clinical decision support systems on practitioner performance and patient outcomes: a systematic review. JAMA 293: 1223–1238. doi: 10.1001/jama.293.10.1223
    2. Chaudhry B, Wang J, Wu S, Maglione M, Mojica W, et al. (2006) Systematic review: impact of health information technology on quality, efficiency, and costs of medical care. Ann Intern Med 144: 742–752.
    3. Shea S, DuMouchel W, Bahamonde L (1996) A meta-analysis of 16 randomized controlled trials to evaluate computer-based clinical reminder systems for preventive care in the ambulatory setting. J Am Med Inform Assoc 3: 399–409.
    4. Institute of Medicine (U.S.), Committee on the Future of Emergency Care in the United States Health System (2007) Hospital-based emergency care: at the breaking point. Washington, D.C.: National Academies Press. xxiii, 397 p.
    5. Handler JA, Feied CF, Coonan K, Vozenilek J, Gillam M, et al. (2004) Computerized physician order entry and online decision support. Acad Emerg Med 11: 1135–1141. doi: 10.1197/j.aem.2004.08.007
