Drug Saf. 2025 Apr;48(4):321-337.
doi: 10.1007/s40264-024-01505-6. Epub 2025 Jan 9.

Leveraging Natural Language Processing and Machine Learning Methods for Adverse Drug Event Detection in Electronic Health/Medical Records: A Scoping Review

Su Golder et al. Drug Saf. 2025 Apr.

Abstract

Background: Natural language processing (NLP) and machine learning (ML) techniques may help harness unstructured free-text electronic health record (EHR) data to detect adverse drug events (ADEs) and thus improve pharmacovigilance. However, their real-world effectiveness remains unclear.

Objective: To summarise the evidence on the effectiveness of NLP/ML in detecting ADEs from unstructured EHR data, and ultimately in improving pharmacovigilance, in comparison to other data sources.

Methods: A scoping review was conducted by searching six databases in July 2023. Studies leveraging NLP/ML to identify ADEs from EHRs were included. Titles/abstracts and full-text articles were each screened by two independent researchers. Data extraction was conducted by one researcher and checked by another. A narrative synthesis summarises the research techniques, ADEs analysed, model performance and pharmacovigilance impacts.

Results: Seven studies met the inclusion criteria, covering a wide range of ADEs and medications. The studies used rule-based NLP, statistical models, and deep learning approaches. NLP/ML techniques applied to unstructured data improved the detection of under-reported adverse events and safety signals. However, the techniques and evaluation methods varied substantially across studies, and limitations exist in integrating the findings into practice.
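
To make the rule-based category concrete, the following is a minimal illustrative sketch, not a method taken from any of the included studies. It assumes hypothetical toy lexicons (`DRUGS`, `EVENTS`), a crude negation check, and sentence-level co-occurrence as the only rule; the helper `precision_recall` shows how such output might be scored against a hand-labelled gold standard.

```python
import re

# Hypothetical toy lexicons for illustration only; real systems rely on
# curated drug and adverse-event vocabularies.
DRUGS = {"warfarin", "amoxicillin"}
EVENTS = {"rash", "bleeding", "nausea"}
NEGATIONS = ("no ", "denies ", "without ")


def find_ade_candidates(note):
    """Return (drug, event) pairs that co-occur in a sentence.

    An event is skipped when it is directly preceded by a negation cue.
    """
    pairs = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", note.lower()):
        drugs = [d for d in DRUGS if re.search(rf"\b{d}\b", sentence)]
        events = [
            e for e in EVENTS
            if re.search(rf"\b{e}\b", sentence)
            and not any(neg + e in sentence for neg in NEGATIONS)
        ]
        pairs.extend((d, e) for d in sorted(drugs) for e in sorted(events))
    return pairs


def precision_recall(predicted, gold):
    """Score predicted pairs against a gold-standard set of pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


note = ("Patient started warfarin. Developed bleeding after warfarin dose. "
        "No rash on amoxicillin.")
candidates = find_ade_candidates(note)
```

Co-occurrence alone does not establish that the drug caused the event, so pairs flagged this way are candidates for review rather than confirmed ADEs.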

Conclusions: NLP and ML show promise for extracting valuable pharmacovigilance insights from unstructured EHR data. These approaches identified specific adverse events and uncovered previously unknown safety signals that would not have been apparent through structured data alone. Nevertheless, challenges such as the absence of standardised methodologies and validation criteria obstruct the widespread adoption of NLP/ML for pharmacovigilance using unstructured EHR data.

Conflict of interest statement

Declarations. Funding: U.S. National Library of Medicine, R01LM011176, Graciela Gonzalez-Hernandez. Conflicts of Interest: None declared. Ethics Approval: Not applicable. Secondary analysis of publicly available literature. Consent to Participate: Not applicable. Consent for Publication: Not applicable. Availability of Data and Material: The data supporting the findings of this scoping review are derived from published studies and are publicly accessible. A comprehensive list of the studies included in the review, along with the excluded sources, can be found in the text and supplementary materials of this manuscript. For any additional inquiries regarding the data, please contact the corresponding author. Code Availability: Not applicable. Authors' Contributions: Study conception and design (SG). Development of search strategies (SG). Running of search strategies (MB). Screening (MB and SG/KO). Data extraction (MB/DX). Data synthesis (DX, KO and YW). The first draft of the manuscript was written by SG and authors (GG, DX, KO, YW) commented on previous versions of the manuscript. All authors read and approved the final manuscript (GG, DX, KO, YW, MB, SG).

Figures

Fig. 1. Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) flow diagram for included studies
Fig. 2. Overview of system performance for included studies
