Observational Study

J Am Med Inform Assoc. 2019 Mar 1;26(3):254-261.

doi: 10.1093/jamia/ocy166.

Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: development and internal validation

Majid Afshar¹ ² ³, Andrew Phillips⁴, Niranjan Karnik⁵, Jeanne Mueller⁶, Daniel To¹, Richard Gonzalez⁶, Ron Price², Richard Cooper⁴, Cara Joyce² ⁴, Dmitriy Dligach² ³

Affiliations

¹ Health Sciences Division, Burn and Shock Trauma Research Institute, Stritch School of Medicine, Loyola University, Maywood, Illinois, USA.
² Health Sciences Division, Center for Health Outcomes and Informatics Research, Loyola University, Maywood, Illinois, USA.
³ Department of Public Health Sciences, Stritch School of Medicine, Loyola University, Maywood, Illinois, USA.
⁴ Department of Computer Science, Loyola University, Chicago, Illinois, USA.
⁵ Department of Psychiatry, Rush University Medical Center, Chicago, Illinois, USA.
⁶ Department of Surgery, Loyola University Medical Center, Maywood, Illinois, USA.

PMID: 30602031
PMCID: PMC6657384
DOI: 10.1093/jamia/ocy166

Majid Afshar et al. J Am Med Inform Assoc. 2019.

Abstract

Objective: Alcohol misuse is present in over a quarter of trauma patients. Information in the clinical notes of the electronic health record of trauma patients may be used for phenotyping tasks with natural language processing (NLP) and supervised machine learning. The objective of this study is to train and validate an NLP classifier for identifying patients with alcohol misuse.

Materials and methods: This observational cohort comprised 1422 adult patients admitted to a trauma center between April 2013 and November 2016. Linguistic processing of clinical notes was performed using the clinical Text Analysis and Knowledge Extraction System (cTAKES). The primary analysis was the binary classification of alcohol misuse. The Alcohol Use Disorders Identification Test served as the reference standard.

Results: The data corpus comprised 91 045 electronic health record notes and 16 091 features. In the final machine learning classifier, 16 features were selected from the first 24 hours of notes for identifying alcohol misuse. In the validation cohort, the classifier had an area under the receiver-operating characteristic curve of 0.78 (95% confidence interval [CI], 0.72 to 0.85). Sensitivity and specificity were 56.0% (95% CI, 44.1% to 68.0%) and 88.9% (95% CI, 84.4% to 92.8%), respectively. The Hosmer-Lemeshow goodness-of-fit test indicated that the classifier fit the data well (P = .17). A simpler rule-based keyword approach had lower sensitivity than the NLP classifier (18.2% vs 56.0%).
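The discrimination metrics reported above can be illustrated with a minimal Python sketch. The labels, scores, and the 0.5 decision threshold below are invented toy values, not study data, and the function names are hypothetical; the abstract does not state which software or threshold the authors used.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (assumed values, NOT from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.3, 0.3, 0.1, 0.1, 0.05]
threshold = 0.5  # assumed cut-off for illustration only
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based implementation would be preferable for larger data.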

Conclusions: The NLP classifier has adequate predictive validity for identifying alcohol misuse in trauma centers. External validation is needed before its application to augment screening.

Figures

**Figure 1.**
CONSORT diagram. Alcohol misuse was rated as Alcohol Use Disorders Identification Test (AUDIT) score ≥5 for women and ≥8 for men. EHR: electronic health record.
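The reference-standard rule in this caption can be written as a short Python function. This is a sketch only: the function name and the `"female"`/`"male"` coding are assumptions made for illustration, and only the two cut-offs (≥5 for women, ≥8 for men) come from the caption.

```python
# Sex-specific AUDIT cut-offs as stated in the Figure 1 caption.
AUDIT_THRESHOLDS = {"female": 5, "male": 8}


def is_alcohol_misuse(audit_score: int, sex: str) -> bool:
    """Label a patient as alcohol misuse from an AUDIT score.

    `sex` must be "female" or "male" (hypothetical coding for this sketch).
    """
    if sex not in AUDIT_THRESHOLDS:
        raise ValueError(f"unsupported sex value: {sex!r}")
    return audit_score >= AUDIT_THRESHOLDS[sex]


# A score of 6 is above the cut-off for women but below it for men.
print(is_alcohol_misuse(6, "female"), is_alcohol_misuse(6, "male"))
```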

**Figure 2.**
Learning curve demonstrating peak effect on area under the curve (AUC) in sample size up to 1137 patients used for the development cohort.

**Figure 3.**
(A) Discrimination for alcohol misuse with receiver-operating characteristic (ROC) area under the curve. (B) Calibration plot across 5 strata of predicted probabilities with n = 57 in each stratum.
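A calibration check of the kind described for panel B can be sketched in Python as follows. Five strata of 57 patients give 285 patients, which matches 1422 minus the 1137 development patients. The function name and the toy inputs are assumptions for illustration, and the authors' exact stratification procedure is not given here.

```python
def calibration_by_strata(probs, labels, n_strata=5):
    """Sort patients by predicted probability, split them into equal-sized
    strata, and return (n, mean predicted probability, observed event rate)
    for each stratum. Any remainder goes into the last stratum."""
    pairs = sorted(zip(probs, labels))
    size = len(pairs) // n_strata
    rows = []
    for i in range(n_strata):
        start = i * size
        end = (i + 1) * size if i < n_strata - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((len(chunk), mean_pred, observed))
    return rows


# Toy inputs (assumed values, NOT study data).
probs = [0.05, 0.1, 0.2, 0.25, 0.4, 0.45, 0.6, 0.65, 0.8, 0.9]
labels = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1]

rows = calibration_by_strata(probs, labels)
for n, mean_pred, observed in rows:
    print(f"n={n} mean predicted={mean_pred:.2f} observed={observed:.2f}")
```

In a well-calibrated classifier the mean predicted probability and the observed rate stay close to each other in every stratum.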

**Figure 4.**
Comparison of Alcohol Use Disorders Identification Test (AUDIT) scores across 5 strata of predicted probabilities; n = 57 for each stratum. Boxplots and summary statistics for alcohol misuse are shown by AUDIT score and risk stratum. The shaded boxes span the lower to upper quartiles of the observations, the horizontal line through each box is the median, and the diamond is the mean. Data falling outside the lower to upper quartile range are plotted as outliers.
