J Am Med Inform Assoc. 2020 Jan 1;27(1):31-38.
doi: 10.1093/jamia/ocz100.

Ensemble method-based extraction of medication and related information from clinical texts

Youngjun Kim et al. J Am Med Inform Assoc.

Abstract

Objective: Accurate and complete information about medications and their related attributes is crucial for effective clinical decision support and precise health care. Recognizing and reducing adverse drug events is also central to effective patient care. The goal of this research is to develop a natural language processing (NLP) system that automatically extracts medication and adverse drug event information from electronic health records. This effort was part of the 2018 n2c2 shared task on adverse drug events and medication extraction.

Materials and methods: The new NLP system implements stacked generalization based on a search-based structured prediction algorithm for concept extraction. We trained 4 sequential classifiers using a variety of structured learning algorithms. To enhance accuracy, we created a stacked ensemble consisting of these concept extraction models trained on the shared task training data. We implemented a support vector machine model to identify relations between concepts.
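
As a rough illustration of stacked generalization, the sketch below trains a toy meta-learner on the token-level outputs of several base taggers. It is not the authors' system: the abstract describes a meta-learner built on a search-based structured prediction algorithm, whereas this stand-in is a simple lookup table with a majority-vote fallback. The function names, labels, and example predictions are all hypothetical.

```python
from collections import Counter, defaultdict

def train_meta(base_preds, gold):
    """Learn a mapping from base-model label tuples to the most frequent gold label.

    base_preds: one tuple per token, holding each base tagger's predicted label.
    gold: the gold label for each token (ideally from held-out data).
    """
    table = defaultdict(Counter)
    for preds, label in zip(base_preds, gold):
        table[tuple(preds)][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in table.items()}

def predict_meta(meta, preds):
    """Use the learned mapping; fall back to a majority vote for unseen tuples."""
    key = tuple(preds)
    if key in meta:
        return meta[key]
    return Counter(key).most_common(1)[0][0]

# Hypothetical held-out predictions from 4 base taggers (assumed BIO-style labels).
held_out_preds = [
    ("O", "O", "O", "B-ADE"),
    ("O", "O", "O", "B-ADE"),
    ("B-Drug", "B-Drug", "B-Drug", "B-Drug"),
    ("O", "O", "O", "O"),
]
held_out_gold = ["B-ADE", "B-ADE", "B-Drug", "O"]

meta = train_meta(held_out_preds, held_out_gold)

# The meta-learner has learned to trust the fourth tagger on this pattern,
# even though a plain majority vote would output "O".
stacked_label = predict_meta(meta, ("O", "O", "O", "B-ADE"))

# An unseen combination falls back to the majority vote.
fallback_label = predict_meta(meta, ("B-Drug", "B-Drug", "O", "B-Drug"))
```

The point of the design is that the combiner is trained rather than fixed, so it can learn when a minority of base models should override the rest.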

Results: Experiments with the official test set showed that our stacked ensemble achieved an F1 score of 92.66%. The relation extraction model with given concepts reached a 93.59% F1 score. Our end-to-end system yielded overall micro-averaged recall, precision, and F1 score of 92.52%, 81.88% and 86.88%, respectively. Our NLP system for adverse drug events and medication extraction ranked within the top 5 of teams participating in the challenge.
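
The reported end-to-end F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. A minimal sketch follows; the function name is ours and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported for the end-to-end system.
end_to_end_f1 = f1_score(precision=0.8188, recall=0.9252)
print(f"{end_to_end_f1:.2%}")  # 86.88%
```

The result agrees with the 86.88% stated in the abstract.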

Conclusion: This study demonstrated that a stacked ensemble with a search-based structured prediction algorithm achieved good performance by effectively integrating the output of individual classifiers and could provide a valid solution for other clinical concept extraction tasks.

Keywords: drug-related side effects and adverse reactions [MeSH C25.100]; medical informatics [L01.313.500]; natural language processing (NLP) [L01.224.050.375.580]; neural networks [L01.224.050.375.605].


Figures

Figure 1. Sample text with (A) concept annotations and (B) examples of each concept category. ADE: adverse drug event.
Figure 2. Distribution of concept and relation category counts. ADE: adverse drug event.
Figure 3. System architecture of the end-to-end system. CRF: conditional random fields; EHR: electronic health record; RNN: recurrent neural network; SEARN: search-based structured prediction; SVM: support vector machine.
Figure 4. F1 score of each external resource (lenient matching). i2b2: Informatics for Integrating Biology and the Bedside; CADEC: CSIRO Adverse Drug Events Corpus.
