Int J Med Inform. 2019 May;125:37-46.
doi: 10.1016/j.ijmedinf.2019.02.008. Epub 2019 Feb 20.

A systematic review of natural language processing and text mining of symptoms from electronic patient-authored text data

Caitlin Dreisbach et al. Int J Med Inform. 2019 May.

Abstract

Objective: In this systematic review, we aim to synthesize the literature on the use of natural language processing (NLP) and text mining as they apply to symptom extraction and processing in electronic patient-authored text (ePAT).

Materials and methods: A comprehensive literature search of 1964 articles from PubMed and EMBASE was narrowed to 21 eligible articles. Data related to purpose, text source, number of users and/or posts, evaluation metrics, and quality indicators were recorded.

Results: Pain (n = 18) and fatigue and sleep disturbance (n = 18) were the most frequently evaluated symptom clinical content categories. Studies accessed ePAT from sources such as Twitter and online community forums or patient portals focused on diseases, including diabetes, cancer, and depression. Fifteen studies used NLP as a primary methodology. Studies reported evaluation metrics including the precision, recall, and F-measure for symptom-specific research questions.
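The evaluation metrics named above can be illustrated with a minimal sketch. It assumes a symptom extractor's output and a human-annotated gold standard are both available as sets of symptom labels; the function name `precision_recall_f1` and the example labels are hypothetical and are not taken from any reviewed study.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall, and F-measure (F1) for extracted symptom labels.

    predicted: labels produced by an extraction system (hypothetical input)
    gold: labels assigned by human annotators (hypothetical input)
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)

    # Precision: share of extracted labels that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold-standard labels that were found.
    recall = true_positives / len(gold) if gold else 0.0
    # F-measure: harmonic mean of precision and recall.
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Illustrative labels only, not data from the review.
gold = {"pain", "fatigue", "nausea", "insomnia"}
predicted = {"pain", "fatigue", "headache"}
p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Here two of the three extracted labels are correct and two of the four gold labels are found, so precision exceeds recall and the F-measure falls between them.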

Discussion: NLP and text mining have been used to extract and analyze patient-authored symptom data in a wide variety of online communities. Though there are computational challenges with accessing ePAT, the depth of information provided directly from patients offers new horizons for precision medicine, characterization of sub-clinical symptoms, and the creation of personal health libraries as outlined by the National Library of Medicine.
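As a rough illustration of what extracting symptoms from patient-authored text can involve, the following sketch matches a post against a small hand-built lexicon. The lexicon, the category names, and the function `extract_symptoms` are assumptions made for this example; the reviewed studies used a range of methods, and this is not a reconstruction of any of them.

```python
import re

# Hypothetical lay-language lexicon mapping symptom categories to phrases.
SYMPTOM_LEXICON = {
    "pain": ["pain", "ache", "aches", "hurts", "sore"],
    "fatigue": ["tired", "exhausted", "fatigue", "worn out"],
    "sleep disturbance": ["insomnia", "can't sleep", "cant sleep"],
}


def extract_symptoms(post):
    """Return the set of symptom categories whose phrases appear in a post."""
    # Normalize case and curly apostrophes, which are common in online text.
    text = post.lower().replace("\u2019", "'")
    found = set()
    for category, phrases in SYMPTOM_LEXICON.items():
        for phrase in phrases:
            # Word boundaries keep "ache" from matching inside "headache".
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                found.add(category)
                break
    return found


post = "So tired today and my back hurts. Can't sleep either."
symptoms = extract_symptoms(post)
print(sorted(symptoms))
```

A fixed lexicon like this misses misspellings, slang, and negation ("no pain today"), which is one reason such text is computationally challenging to process.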

Conclusion: Future research should consider the needs of patients expressed through ePAT and its relevance to symptom science. Understanding the role that ePAT plays in health communication and real-time assessment of symptoms, through the use of NLP and text mining, is critical to a patient-centered health system.

Keywords: Electronic patient-authored text; Natural language processing; Review; Signs and symptoms.

Conflict of interest statement

We have no conflicts of interest to disclose.

Figures

Figure 1. PRISMA flow diagram of included articles.
Figure 2. Number of studies grouped by symptom category.
