J Biomed Semantics. 2011;2 Suppl 3(Suppl 3):S2.
doi: 10.1186/2041-1480-2-S3-S2. Epub 2011 Jul 14.

A cascade of classifiers for extracting medication information from discharge summaries

Scott Russell Halgrim et al. J Biomed Semantics. 2011.

Abstract

Background: Extracting medication information from clinical records has many potential applications, and recently published research, systems, and competitions reflect an interest therein. Much of the early extraction work involved rules and lexicons, but more recently machine learning has been applied to the task.

Methods: We present a hybrid system consisting of two parts. The first part, field detection, uses a cascade of statistical classifiers to identify medication-related named entities. The second part uses simple heuristics to link those entities into medication events.
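
The two-part design described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the regular expression, the window of four tokens, and the function names (`detect_names`, `detect_fields`, `link_events`, `extract`) are all hypothetical, and the rule-like detectors merely stand in for the trained statistical classifiers the abstract refers to. The sketch only shows the general shape of a cascade (a later stage consuming an earlier stage's output) followed by a simple linking heuristic.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-ins for trained classifiers (assumptions, not from the paper).
DRUG_LEXICON = {"aspirin", "lisinopril", "metformin"}
DOSE_RE = re.compile(r"^\d+(\.\d+)?(mg|mcg|g|ml)$")
FREQ_TERMS = {"daily", "bid", "tid", "qhs"}


@dataclass
class Entity:
    label: str   # "name", "dosage", or "frequency"
    text: str    # surface form of the token
    index: int   # token position in the input


def detect_names(tokens):
    """Stage 1: mark tokens that look like medication names."""
    return [Entity("name", t, i) for i, t in enumerate(tokens)
            if t.lower() in DRUG_LEXICON]


def detect_fields(tokens, names, window=4):
    """Stage 2: look for other fields, using stage 1 output as context.

    Only tokens that follow a detected name within `window` tokens
    are considered (an assumed heuristic for this sketch).
    """
    name_positions = [n.index for n in names]
    found = []
    for i, t in enumerate(tokens):
        if not any(0 < i - p <= window for p in name_positions):
            continue
        low = t.lower()
        if DOSE_RE.match(low):
            found.append(Entity("dosage", t, i))
        elif low in FREQ_TERMS:
            found.append(Entity("frequency", t, i))
    return found


def link_events(names, fields):
    """Linking heuristic: attach each field to the nearest preceding name."""
    events = {n.index: {"name": n.text} for n in names}
    for f in fields:
        preceding = [n for n in names if n.index < f.index]
        if preceding:
            anchor = max(preceding, key=lambda n: n.index)
            # Keep the first value seen for each field of an event.
            events[anchor.index].setdefault(f.label, f.text)
    return [events[k] for k in sorted(events)]


def extract(text):
    """Run the cascade, then link entities into medication events."""
    tokens = text.split()
    names = detect_names(tokens)
    fields = detect_fields(tokens, names)
    return link_events(names, fields)
```

For example, `extract("Aspirin 81mg daily and metformin 500mg bid")` yields two events, one per medication name, each carrying its own dosage and frequency. In a real system each stage would be a trained model, and the linking rules would need to handle fields that precede the name or span several tokens.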

Results: The system achieved performance that is comparable to other approaches to the same task. This performance is further improved by adding features that reference external medication name lists.

Conclusions: This study demonstrates that our hybrid approach outperforms purely statistical or rule-based systems. The study also shows that a cascade of classifiers works better than a single classifier in extracting medication information. The system is available as is upon request from the first author.

Figures

Figure 1. System performance on the development set with different training set sizes. Key: + represents horizontal F-scores with features in F1-F4b; ○ represents horizontal F-scores with features in F1-F4a.

Figure 2. Cascade vs. find_all for field detection on the development set. Key: + represents horizontal F-scores with the three-module cascade; ○ represents horizontal F-scores with find_all.

