AMIA Annu Symp Proc. 2008 Nov 6:2008:450-4.

Use of semantic features to classify patient smoking status

Patrick J McCormick¹, Noémie Elhadad, Peter D Stetson

PMID: 18998969
PMCID: PMC2655942

Patrick J McCormick et al. AMIA Annu Symp Proc. 2008.

Affiliation

¹ College of Physicians & Surgeons, Columbia University, New York, NY, USA.

Abstract

The recent i2b2 NLP Challenge smoking classification task offers a rare chance to compare different natural language processing techniques on actual clinical data. We compare the performance of a classifier which relies on semantic features generated by an unmodified version of MedLEE, a clinical NLP engine, to one using lexical features. We also compare the performance of supervised classifiers to rule-based symbolic classifiers. Our baseline supervised classifier with lexical features yields a microaveraged F-measure of 0.81. Our rule-based classifier using MedLEE semantic features is superior, with an F-measure of 0.83. Our supervised classifier trained with semantic MedLEE features is competitive with the top-performing smoking classifier in the i2b2 NLP Challenge, with microaveraged precision of 0.90, recall of 0.89, and F-measure of 0.89.
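The scores above are microaveraged, meaning true positives, false positives, and false negatives are pooled across all smoking-status classes before precision, recall, and F-measure are computed. The following is a minimal sketch of that standard definition, not the evaluation code used in the paper or the challenge. The function name `micro_prf` and the toy label lists are hypothetical and chosen only for illustration.

```python
from collections import Counter


def micro_prf(gold, predicted):
    """Return micro-averaged (precision, recall, F-measure).

    Hypothetical helper: counts are pooled over all classes first,
    then the three scores are computed from the pooled totals.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[g] += 1  # the gold class gains a false negative

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    precision = total_tp / (total_tp + total_fp) if total_tp + total_fp else 0.0
    recall = total_tp / (total_tp + total_fn) if total_tp + total_fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy labels, invented for illustration; not data from the paper.
gold = ["non-smoker", "unknown", "current smoker", "past smoker", "unknown"]
pred = ["non-smoker", "unknown", "past smoker", "past smoker", "unknown"]

precision, recall, f_measure = micro_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

Because the counts are pooled, frequent classes dominate a microaveraged score. The challenge's own scoring procedure may differ in detail from this sketch.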

Figures

**Figure 1**
Semantic feature to represent the sentence “She quit smoking tobacco in 1985.”

**Figure 2**
Partial XQuery expression using semantic features to determine if instance is “non-smoker”.


