Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2012:2012:577-86.
Epub 2012 Nov 3.

A study of transportability of an existing smoking status detection module across institutions

Affiliations

A study of transportability of an existing smoking status detection module across institutions

Mei Liu et al. AMIA Annu Symp Proc. 2012.

Abstract

Electronic Medical Records (EMRs) are valuable resources for clinical observational studies. Smoking status of a patient is one of the key factors for many diseases, but it is often embedded in narrative text. Natural language processing (NLP) systems have been developed for this specific task, such as the smoking status detection module in the clinical Text Analysis and Knowledge Extraction System (cTAKES). This study examined transportability of the smoking module in cTAKES on the Vanderbilt University Hospital's EMR data. Our evaluation demonstrated that modest effort of change is necessary to achieve desirable performance. We modified the system by filtering notes, annotating new data for training the machine learning classifier, and adding rules to the rule-based classifiers. Our results showed that the customized module achieved significantly higher F-measures at all levels of classification (i.e., sentence, document, patient) compared to the direct application of the cTAKES module to the Vanderbilt data.

PubMed Disclaimer

Figures

Figure 1.
Figure 1.
The process of creating annotated data sets for training and testing at different levels.
Figure 2.
Figure 2.
The architecture of cTAKES smoking status detection module for sentence-level classification.
Figure 3.
Figure 3.
Patient-level classification rules

Similar articles

Cited by

References

    1. Friedman C, Alderson PO, Austin JH, Cimino JJ, Johnson SB. A general natural-language text processor for clinical radiology. J Am Med Inform Assoc. 1994 Mar-Apr;1(2):161–174. - PMC - PubMed
    1. Aronson AR, Lang FM. An overview of MetaMap: historical perspective and recent advances. J Am Med Inform Assoc. 2010 May-Jun;17(3):229–236. - PMC - PubMed
    1. Denny JC, Smithers JD, Miller RA, Spickard A., 3rd “Understanding” medical school curriculum content using KnowledgeMap. J Am Med Inform Assoc. 2003 Jul-Aug;10(4):351–362. - PMC - PubMed
    1. Savova GK, Kipper-Schuler K, Buntrock JD, Chute CG. UIMA-based clinical information extraction system. LREC 2008: Towards enhanced interoperability for large HLT systems: UIMA for NLP. 2008
    1. Xu H, Stenner SP, Doan S, Johnson KB, Waitman LR, Denny JC. MedEx: a medication information extraction system for clinical narratives. J Am Med Inform Assoc. 2010 Jan-Feb;17(1):19–24. - PMC - PubMed

Publication types

LinkOut - more resources