Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review
- PMID: 38042599
- PMCID: PMC10693655
- DOI: 10.1016/j.artmed.2023.102701
Abstract
Objective: Natural language processing (NLP) combined with machine learning (ML) techniques is increasingly used to process unstructured/free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems/toolkits for analyzing PROs in clinical narratives of EHRs and discusses future directions for applying this modality in clinical care.
Methods: We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs.
Results: Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP.
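As a rough illustration of what a rule-based approach with lexical and contextual features can look like, the following minimal sketch matches terms from a small lexicon (lexical feature) and checks a short preceding window for negation cues (contextual feature). The lexicon, the cue list, the window size, and the function name `extract_pros` are all hypothetical and chosen for illustration; none of them is taken from the systems covered by the review.

```python
import re

# Hypothetical lexicon mapping surface terms to PRO domains (illustrative only).
PRO_LEXICON = {
    "pain": "pain",
    "fatigue": "fatigue",
    "tired": "fatigue",
    "nausea": "nausea",
    "anxious": "anxiety",
}

# Hypothetical negation cues, checked in a short window before each matched term.
NEGATION_CUES = {"no", "denies", "denied", "without", "not"}


def extract_pros(note, window=3):
    """Return (domain, negated) pairs for lexicon terms found in a free-text note."""
    tokens = re.findall(r"[a-z]+", note.lower())
    findings = []
    for i, tok in enumerate(tokens):
        domain = PRO_LEXICON.get(tok)
        if domain is None:
            continue
        # Contextual rule: the term is negated if a cue occurs in the preceding window.
        preceding = tokens[max(0, i - window):i]
        negated = any(t in NEGATION_CUES for t in preceding)
        findings.append((domain, negated))
    return findings


note = "Patient reports severe pain and feels tired. Denies nausea."
print(extract_pros(note))
# [('pain', False), ('fatigue', False), ('nausea', True)]
```

This sketch ignores sentence boundaries, so a negation cue can leak across sentences within the window. A practical system would need sentence segmentation, a validated lexicon, and evaluation against annotated notes.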
Conclusions: This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Although rule-based annotation currently dominates NLP/ML analysis of unstructured PROs, deploying novel neural ML-based methods is warranted.
Keywords: Electronic health records; Machine learning; Natural language processing; Patient-reported outcomes; Unstructured clinical narrative.
Copyright © 2023 Elsevier B.V. All rights reserved.
Conflict of interest statement
Declaration of competing interest: All co-authors declare no conflict of interest.
Similar articles
- Using natural language processing to analyze unstructured patient-reported outcomes data derived from electronic health records for cancer populations: a systematic review. Expert Rev Pharmacoecon Outcomes Res. 2024 Apr;24(4):467-475. doi: 10.1080/14737167.2024.2322664. Epub 2024 Mar 5. PMID: 38383308 Free PMC article.
- Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review. J Am Med Inform Assoc. 2019 Apr 1;26(4):364-379. doi: 10.1093/jamia/ocy173. PMID: 30726935 Free PMC article.
- Extracting social determinants of health from electronic health records using natural language processing: a systematic review. J Am Med Inform Assoc. 2021 Nov 25;28(12):2716-2727. doi: 10.1093/jamia/ocab170. PMID: 34613399 Free PMC article.
- Leveraging Natural Language Processing and Machine Learning Methods for Adverse Drug Event Detection in Electronic Health/Medical Records: A Scoping Review. Drug Saf. 2025 Apr;48(4):321-337. doi: 10.1007/s40264-024-01505-6. Epub 2025 Jan 9. PMID: 39786481 Free PMC article.
- Natural language processing systems for extracting information from electronic health records about activities of daily living. A systematic review. JAMIA Open. 2024 May 24;7(2):ooae044. doi: 10.1093/jamiaopen/ooae044. eCollection 2024 Jul. PMID: 38798774 Free PMC article. Review.
Cited by
- Predicting ICU Readmission from Electronic Health Records via BERTopic with Long Short Term Memory Network Approach. J Clin Med. 2024 Sep 18;13(18):5503. doi: 10.3390/jcm13185503. PMID: 39336990 Free PMC article.
- The Frontiers of Smart Healthcare Systems. Healthcare (Basel). 2024 Nov 21;12(23):2330. doi: 10.3390/healthcare12232330. PMID: 39684952 Free PMC article. Review.
- The recent history and near future of digital health in the field of behavioral medicine: an update on progress from 2019 to 2024. J Behav Med. 2025 Feb;48(1):120-136. doi: 10.1007/s10865-024-00526-x. Epub 2024 Oct 28. PMID: 39467924 Free PMC article. Review.
- Advancements in Herpes Zoster Diagnosis, Treatment, and Management: Systematic Review of Artificial Intelligence Applications. J Med Internet Res. 2025 Jun 30;27:e71970. doi: 10.2196/71970. PMID: 40587773 Free PMC article. Review.
- Leveraging large language models to mimic domain expert labeling in unstructured text-based electronic healthcare records in non-english languages. BMC Med Inform Decis Mak. 2025 Mar 31;25(1):154. doi: 10.1186/s12911-025-02871-6. PMID: 40165165 Free PMC article.