Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review
- PMID: 30726935
- PMCID: PMC6657282
- DOI: 10.1093/jamia/ocy173
Abstract
Objective: Natural language processing (NLP) of symptoms from electronic health records (EHRs) could contribute to the advancement of symptom science. We aim to synthesize the literature on the use of NLP to process or analyze symptom information documented in EHR free-text narratives.
Materials and methods: Our search of PubMed and EMBASE returned 1964 records, which were narrowed to 27 eligible articles. Data related to the purpose, free-text corpus, patients, symptoms, NLP methodology, evaluation metrics, and quality indicators were extracted for each study.
Results: Symptom-related information was presented as a primary outcome in 14 studies. EHR narratives represented various inpatient and outpatient clinical specialties, with general, cardiology, and mental health occurring most frequently. Studies encompassed a wide variety of symptoms, including shortness of breath, pain, nausea, dizziness, disturbed sleep, constipation, and depressed mood. NLP approaches included previously developed NLP tools, classification methods, and manually curated rule-based processing. Only one-third (n = 9) of studies reported patient demographic characteristics.
Discussion: NLP is used to extract information from EHR free-text narratives written by a variety of healthcare providers on an expansive range of symptoms across diverse clinical specialties. The current focus of this field is on the development of methods to extract symptom information and the use of symptom information for disease classification tasks rather than the examination of symptoms themselves.
Conclusion: Future NLP studies should concentrate on the investigation of symptoms and symptom documentation in EHR free-text narratives. Efforts should be undertaken to examine patient characteristics and make symptom-related NLP algorithms or pipelines and vocabularies openly available.
Keywords: electronic health records; natural language processing; review; signs and symptoms.
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.
