A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction
- PMID: 23268488
- PMCID: PMC3756265
- DOI: 10.1136/amiajnl-2012-001487
Abstract
Objective: The goal of this work was to evaluate two machine learning methods, binary classification and sequence labeling, for detecting medication-attribute linkages in two clinical corpora.
Data and methods: We double-annotated 3000 clinical trial announcements (CTA) and 1655 clinical notes (CN) for medication named entities and their attributes. To identify the linkages between the entities and their corresponding attributes, we proposed a binary support vector machine (SVM) classification method with parsimonious feature sets and a conditional random fields (CRF)-based multi-layered sequence labeling (MLSL) model. We evaluated the system's performance against the human-generated gold standard.
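To make the binary classification framing concrete, the sketch below shows one way linkage detection can be cast as a pair-classification problem: every medication is paired with every nearby attribute, and each pair would then be labeled linked or not linked by a classifier. This is an illustration only, not the authors' implementation; the entity labels, the token-window cutoff `max_token_gap`, and the feature names are all assumptions introduced here.

```python
from itertools import product


def candidate_pairs(entities, max_token_gap=10):
    """Build (medication, attribute, features) candidates for a binary classifier.

    `entities` is a list of (label, start_token, end_token) tuples with an
    exclusive end index. Labels, window size, and feature names are
    hypothetical and chosen only for illustration.
    """
    meds = [e for e in entities if e[0] == "MEDICATION"]
    attrs = [e for e in entities if e[0] != "MEDICATION"]
    pairs = []
    for med, attr in product(meds, attrs):
        attr_follows_med = attr[1] >= med[2]
        # Number of tokens separating the two spans (0 when adjacent).
        gap = attr[1] - med[2] if attr_follows_med else med[1] - attr[2]
        if gap <= max_token_gap:
            features = {
                "attr_type": attr[0],
                "token_gap": gap,
                "attr_follows_med": attr_follows_med,
            }
            pairs.append((med, attr, features))
    return pairs


# Tokens: aspirin 81 mg daily and metoprolol 25 mg
entities = [
    ("MEDICATION", 0, 1),  # aspirin
    ("DOSAGE", 1, 3),      # 81 mg
    ("FREQUENCY", 3, 4),   # daily
    ("MEDICATION", 5, 6),  # metoprolol
    ("DOSAGE", 6, 8),      # 25 mg
]
pairs = candidate_pairs(entities)
```

Each feature dictionary could then be vectorized and passed to any binary classifier, with the gold-standard annotation supplying the linked or not-linked label for each pair.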
Results: The experiments showed that the two machine learning approaches performed statistically significantly better than the baseline rule-based approach. The binary SVM classification achieved 0.94 F-measure with individual tokens as features. The SVM model trained on a parsimonious feature set achieved 0.81 F-measure for CN and 0.87 for CTA. The CRF MLSL method achieved 0.80 F-measure on both corpora.
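For readers unfamiliar with the metric, the following sketch shows how an F-measure over linkages can be computed from a gold-standard set and a predicted set. It assumes the balanced F1 (harmonic mean of precision and recall) and exact matching of (medication, attribute) pairs; the abstract does not state which F-measure variant or matching criterion was used, and the function name and example links are hypothetical.

```python
def precision_recall_f1(gold_links, predicted_links):
    """Return (precision, recall, F1) for two collections of linkages.

    Each linkage is a hashable (medication, attribute) pair; a prediction
    counts as correct only if it exactly matches a gold pair (an assumption).
    """
    gold, pred = set(gold_links), set(predicted_links)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = [("aspirin", "81 mg"), ("aspirin", "daily"), ("metoprolol", "25 mg")]
predicted = [("aspirin", "81 mg"), ("metoprolol", "25 mg"), ("metoprolol", "daily")]
precision, recall, f1 = precision_recall_f1(gold, predicted)
```

In this toy case two of the three predicted links are correct and two of the three gold links are recovered, so precision, recall, and F1 all equal 2/3.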
Discussion and conclusions: We compared the novel MLSL method with a binary classification method and a rule-based method. The MLSL method performed statistically significantly better than the rule-based method. However, the SVM-based binary classification method was statistically significantly better than the MLSL method for both the CTA and CN corpora. Using parsimonious feature sets, both the SVM-based binary classification and the CRF-based MLSL methods achieved high performance in detecting medication name and attribute linkages in CTA and CN.
Keywords: attribute linkages; clinical notes; clinical trial announcements; multi-layered sequence labeling; natural language processing.