J Am Med Inform Assoc. 2019 Apr 1;26(4):364-379.
doi: 10.1093/jamia/ocy173.

Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review

Theresa A Koleck et al. J Am Med Inform Assoc.

Abstract

Objective: Natural language processing (NLP) of symptoms from electronic health records (EHRs) could contribute to the advancement of symptom science. We aim to synthesize the literature on the use of NLP to process or analyze symptom information documented in EHR free-text narratives.

Materials and methods: Our search of 1964 records from PubMed and EMBASE was narrowed to 27 eligible articles. Data related to the purpose, free-text corpus, patients, symptoms, NLP methodology, evaluation metrics, and quality indicators were extracted for each study.

Results: Symptom-related information was presented as a primary outcome in 14 studies. EHR narratives represented various inpatient and outpatient clinical specialties, with general, cardiology, and mental health occurring most frequently. Studies encompassed a wide variety of symptoms, including shortness of breath, pain, nausea, dizziness, disturbed sleep, constipation, and depressed mood. NLP approaches included previously developed NLP tools, classification methods, and manually curated rule-based processing. Only one-third (n = 9) of studies reported patient demographic characteristics.

Discussion: NLP is used to extract information from EHR free-text narratives written by a variety of healthcare providers on an expansive range of symptoms across diverse clinical specialties. The current focus of this field is on the development of methods to extract symptom information and the use of symptom information for disease classification tasks rather than the examination of symptoms themselves.

Conclusion: Future NLP studies should concentrate on the investigation of symptoms and symptom documentation in EHR free-text narratives. Efforts should be undertaken to examine patient characteristics and make symptom-related NLP algorithms or pipelines and vocabularies openly available.

Keywords: electronic health records; natural language processing; review; signs and symptoms.

Figures

Figure 1. Flow diagram of included articles. NLP: natural language processing.

Figure 2. Chord diagram of symptoms by clinical category included in systematic review articles. Relationships between symptoms (color sectors and tracks) and articles (black sectors) included in the systematic review are displayed. Individual symptoms are arranged via color by clinical category. Symptom sector size is proportional to the number of unique articles that include a given symptom. Article sector size is proportional to the number of unique symptoms included in a given study. Sample sizes in the legend correspond to the number of unique articles overall and in each clinical category. Shortness of breath includes dyspnea and orthopnea. Pain includes pain, ache, or discomfort not specified as occurring in the chest or abdomen. The figure was generated using R statistical software, version 3.3.1 (R Foundation for Statistical Computing, Vienna, Austria).