J Med Internet Res. 2015 Aug 31;17(8):e212. doi: 10.2196/jmir.4612.

Automatically Detecting Failures in Natural Language Processing Tools for Online Community Text


Albert Park et al. J Med Internet Res.

Abstract

Background: The prevalence and value of patient-generated health text are increasing, but processing such text remains problematic. Although existing biomedical natural language processing (NLP) tools are appealing, most were developed to process clinician- or researcher-generated text, such as clinical notes or journal articles. Beyond this mismatch in text type, further challenges of using existing NLP tools include constantly changing technologies, source vocabularies, and text characteristics. These continuously evolving challenges call for low-cost, systematic assessment. However, the standard evaluation method in NLP, manual annotation, requires tremendous effort and time.

Objective: The primary objective of this study is to explore an alternative approach: using low-cost, automated methods to detect failures (eg, incorrect boundaries, missed terms, mismapped concepts) when processing patient-generated text with existing biomedical NLP tools. We first characterize common failures that NLP tools make in processing online community text. We then demonstrate the feasibility of our automated approach in detecting these common failures using one of the most popular biomedical NLP tools, MetaMap.

Methods: Using 9657 posts from an online cancer community, we explored our automated failure detection approach in two steps: (1) to characterize the failure types, we first manually reviewed MetaMap's commonly occurring failures, grouped the inaccurate mappings into failure types, and then identified causes of the failures through iterative rounds of manual review using open coding, and (2) to automatically detect these failure types, we then explored combinations of existing NLP techniques and dictionary-based matching for each failure cause. Finally, we manually evaluated the automatically detected failures.
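The detection idea described in the Methods can be illustrated with a small sketch. This is not the authors' implementation: the heuristics, the slang dictionary, and all example mappings are invented for illustration. A mapping is flagged as a word ambiguity failure when a community term maps to a concept other than its dictionary sense, and as a boundary failure when the mapped concept covers only part of the source phrase.

```python
# Toy sketch of dictionary-based failure flagging (illustrative only;
# the heuristics and data below are assumptions, not the paper's method).

def flag_failures(mappings, slang):
    """mappings: (source_phrase, mapped_concept) pairs from an NLP tool.
    slang: community term -> intended concept (lowercase strings)."""
    flagged = []
    for phrase, concept in mappings:
        p, c = phrase.lower(), concept.lower()
        if p in slang and slang[p] != c:
            # community term resolved to the wrong sense
            flagged.append((phrase, concept, "word ambiguity"))
        elif set(c.split()) < set(p.split()):
            # mapped concept covers only part of the source phrase
            flagged.append((phrase, concept, "boundary failure"))
    return flagged

mappings = [
    ("chemo brain", "Brain"),   # partial mapping: boundary failure
    ("port", "Harbor"),         # slang mapped to the wrong sense
    ("fatigue", "Fatigue"),     # correct mapping, not flagged
]
print(flag_failures(mappings, {"port": "implanted port"}))
```

A real pipeline would combine several such checks (tokenization, dictionary matching, context rules), one per failure cause, consistent with the 12 causes identified in the study.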

Results: From our manual review, we characterized three failure types: (1) boundary failures, (2) missed term failures, and (3) word ambiguity failures. Within these three types, we identified 12 causes of inaccurate concept mappings. Our automated methods flagged almost half of MetaMap's 383,572 mappings as problematic. Word ambiguity failures were the most common, comprising 82.22% of failures; boundary failures were second, at 15.90%; and missed term failures were the least common, at 1.88%. The automated failure detection achieved precision, recall, accuracy, and F1 score of 83.00%, 92.57%, 88.17%, and 87.52%, respectively.

Conclusions: We illustrate the challenges of processing patient-generated text from online health communities, characterize the failures NLP tools make on such text, and demonstrate the feasibility of our low-cost approach to detecting those failures automatically. Our approach shows the potential for scalable, effective solutions to automatically assess constantly evolving NLP tools and source vocabularies for processing patient-generated text.

Keywords: UMLS; automatic data processing; information extraction; natural language processing; quantitative evaluation.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Example failures that resulted from the application of MetaMap to process patient-generated text in an online health community (blue terms represent patient-generated text; black terms represent MetaMap’s interpretation; and red terms represent failure type).

