Identifying patient smoking status from medical discharge records
- PMID: 17947624
- PMCID: PMC2274873
- DOI: 10.1197/jamia.M2408
Abstract
The authors organized a Natural Language Processing (NLP) challenge on automatically determining the smoking status of patients from information found in their discharge records. This challenge was issued as part of the i2b2 (Informatics for Integrating Biology to the Bedside) project, to survey, facilitate, and examine studies in medical language understanding for clinical narratives. This article describes the smoking challenge, details the data and the annotation process, explains the evaluation metrics, discusses the characteristics of the systems developed for the challenge, presents an analysis of the results of received system runs, draws conclusions about the state of the art, and identifies directions for future research. A total of 11 teams participated in the smoking challenge. Each team submitted up to three system runs, providing a total of 23 submissions. The submitted system runs were evaluated with microaveraged and macroaveraged precision, recall, and F-measure. The systems submitted to the smoking challenge represented a variety of machine learning and rule-based algorithms. Despite the differences in their approaches to smoking status identification, many of these systems provided good results. There were 12 system runs with microaveraged F-measures above 0.84. Analysis of the results highlighted the fact that discharge summaries express smoking status using a limited number of textual features (e.g., "smok", "tobac", "cigar", Social History). Many of the effective smoking status identifiers benefit from these features.
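To make the difference between the two averaging schemes concrete, the following is a minimal sketch of microaveraged and macroaveraged precision, recall, and F-measure for a single-label classification task. It is not the challenge's evaluation script. The function names (prf, micro_macro) and the label strings used in any call are hypothetical, and the choice to compute the macroaveraged F-measure as the mean of the per-class F-measures is an assumption; the abstract does not say which convention was used.

```python
from collections import Counter


def prf(tp, fp, fn):
    """Precision, recall, and balanced F-measure from raw counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro_macro(gold, pred):
    """Return ((micro P, R, F), (macro P, R, F)) for parallel label lists.

    Assumption: each record carries exactly one gold label and one
    predicted label, and macro F is the mean of per-class F-measures.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative

    # Micro: pool the counts over all classes, then compute the metrics once.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))

    # Macro: compute the metrics per class, then take the unweighted mean.
    per_class = [prf(tp[c], fp[c], fn[c]) for c in labels]
    macro = tuple(sum(m[i] for m in per_class) / len(labels) for i in range(3))
    return micro, macro
```

Microaveraging weights every record equally, so frequent classes dominate the score; macroaveraging weights every class equally, so performance on rare classes has a larger effect.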
Similar articles
- Evaluating the state-of-the-art in automatic de-identification. J Am Med Inform Assoc. 2007 Sep-Oct;14(5):550-63. doi: 10.1197/jamia.M2444. Epub 2007 Jun 28. PMID: 17600094. Free PMC article.
- Recognizing obesity and comorbidities in sparse data. J Am Med Inform Assoc. 2009 Jul-Aug;16(4):561-70. doi: 10.1197/jamia.M3115. Epub 2009 Apr 23. PMID: 19390096. Free PMC article.
- Use of semantic features to classify patient smoking status. AMIA Annu Symp Proc. 2008 Nov 6;2008:450-4. PMID: 18998969. Free PMC article.
- Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review. J Am Med Inform Assoc. 2019 Apr 1;26(4):364-379. doi: 10.1093/jamia/ocy173. PMID: 30726935. Free PMC article.
- Clinical Text Data in Machine Learning: Systematic Review. JMIR Med Inform. 2020 Mar 31;8(3):e17984. doi: 10.2196/17984. PMID: 32229465. Free PMC article. Review.
Cited by
- In response to: Method of electronic health record documentation and quality of primary care. J Am Med Inform Assoc. 2012 Nov-Dec;19(6):1120-1. doi: 10.1136/amiajnl-2012-001149. Epub 2012 Jul 28. PMID: 22842547. Free PMC article. No abstract available.
- A study of transportability of an existing smoking status detection module across institutions. AMIA Annu Symp Proc. 2012;2012:577-86. Epub 2012 Nov 3. PMID: 23304330. Free PMC article.
- Employing computational linguistics techniques to identify limited patient health literacy: Findings from the ECLIPPSE study. Health Serv Res. 2021 Feb;56(1):132-144. doi: 10.1111/1475-6773.13560. Epub 2020 Sep 23. PMID: 32966630. Free PMC article.
- Using electronic patient records to discover disease correlations and stratify patient cohorts. PLoS Comput Biol. 2011 Aug;7(8):e1002141. doi: 10.1371/journal.pcbi.1002141. Epub 2011 Aug 25. PMID: 21901084. Free PMC article.
- Open Globe Injury Patient Identification in Warfare Clinical Notes. AMIA Annu Symp Proc. 2018 Apr 16;2017:403-410. eCollection 2017. PMID: 29854104. Free PMC article.