A study of transportability of an existing smoking status detection module across institutions

Mei Liu¹, Anushi Shah, Min Jiang, Neeraja B Peterson, Qi Dai, Melinda C Aldrich, Qingxia Chen, Erica A Bowton, Hongfang Liu, Joshua C Denny, Hua Xu

PMID: 23304330
PMCID: PMC3540509

Mei Liu et al. AMIA Annu Symp Proc. 2012;2012:577-86.

Epub 2012 Nov 3.

Affiliation

¹ Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN, USA.

Abstract

Electronic medical records (EMRs) are valuable resources for clinical observational studies. A patient's smoking status is a key factor in many diseases, but it is often embedded in narrative text. Natural language processing (NLP) systems have been developed for this task, such as the smoking status detection module in the clinical Text Analysis and Knowledge Extraction System (cTAKES). This study examined the transportability of the cTAKES smoking module to the Vanderbilt University Hospital's EMR data. Our evaluation demonstrated that a modest amount of customization is necessary to achieve desirable performance. We modified the system by filtering notes, annotating new data to train the machine learning classifier, and adding rules to the rule-based classifiers. The customized module achieved significantly higher F-measures at all levels of classification (i.e., sentence, document, and patient) than the direct application of the cTAKES module to the Vanderbilt data.
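The abstract reports F-measures at the sentence, document, and patient levels. As a rough illustration of how such a score can be computed from gold-standard and predicted labels, the sketch below calculates per-label precision, recall, and F1 for two parallel label lists. The function name `f_measure_per_label` and the label strings are hypothetical, chosen for illustration only; they are not taken from the study or from cTAKES, and the study's exact scoring procedure is not described here.

```python
from collections import Counter


def f_measure_per_label(gold, predicted):
    """Return {label: (precision, recall, f1)} for two parallel label lists.

    Hypothetical helper for illustration; not the study's evaluation code.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for label g
        else:
            fp[p] += 1          # label p was predicted but is wrong
            fn[g] += 1          # label g was missed

    scores = {}
    for label in set(gold) | set(predicted):
        pred_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / pred_total if pred_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Illustrative labels only (assumed, not from the paper).
gold = ["CURRENT", "PAST", "NON", "NON"]
pred = ["CURRENT", "NON", "NON", "NON"]
scores = f_measure_per_label(gold, pred)
```

The same function can be applied unchanged at each level of classification, since it only needs one gold label and one predicted label per unit (sentence, document, or patient).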

Figures

**Figure 1.**
The process of creating annotated data sets for training and testing at different levels.

**Figure 2.**
The architecture of cTAKES smoking status detection module for sentence-level classification.

**Figure 3.**
Patient-level classification rules.
