BMC Gastroenterol. 2025 Feb 6;25(1):58.
doi: 10.1186/s12876-025-03608-5.

A foundation systematic review of natural language processing applied to gastroenterology & hepatology

Matthew Stammers et al. BMC Gastroenterol.

Abstract

Objective: This review assesses the progress of NLP in gastroenterology to date, grades the robustness of the methodology, exposes the field to a new generation of authors, and highlights opportunities for future research.

Design: Seven scholarly databases (ACM Digital Library, arXiv, Embase, IEEE Xplore, PubMed, Scopus and Google Scholar) were searched for studies published between 2015 and 2023 that met the inclusion criteria. Studies lacking a description of appropriate validation or NLP methods were excluded, as were studies unavailable in English, those focused on non-gastrointestinal diseases and those that were duplicates. Two independent reviewers extracted study information, clinical/algorithm details, and relevant outcome data. Methodological quality and bias risks were appraised using a checklist of quality indicators for NLP studies.

Results: Fifty-three studies were identified utilising NLP in endoscopy, inflammatory bowel disease, gastrointestinal bleeding, liver and pancreatic disease. Colonoscopy was the focus of 21 (38.9%) studies; 13 (24.1%) focused on liver disease, 7 (13.0%) on inflammatory bowel disease, 4 (7.4%) on gastroscopy, 4 (7.4%) on pancreatic disease and 2 (3.7%) on endoscopic sedation/ERCP and gastrointestinal bleeding. Only 30 (56.6%) of the studies reported patient demographics, and only 13 (24.5%) had a low risk of validation bias. Thirty-five (66%) studies mentioned generalisability, but only 5 (9.4%) mentioned explainability or shared code/models.
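The reported percentages can be recomputed from the counts. The sketch below is a minimal arithmetic check, not part of the review itself: the dictionary keys are shorthand labels chosen here, and the alternative denominator of 54 is an assumption inferred from the figures rather than something the abstract states. Under that assumption the topic percentages (38.9%, 24.1%, 13.0%, 7.4%) are reproduced, while the quality-indicator percentages match the stated total of 53.

```python
# Recompute the percentages quoted in the Results from the raw counts.
TOTAL_STUDIES = 53          # stated in the Results
ASSUMED_ALTERNATIVE = 54    # assumption: inferred from the topic percentages only


def pct(count: int, denominator: int) -> float:
    """Return count/denominator as a percentage rounded to one decimal place."""
    return round(100 * count / denominator, 1)


# Quality indicators (counts taken from the Results).
quality_counts = {
    "demographics": 30,
    "low_validation_bias": 13,
    "generalisability": 35,
    "explainability_or_shared_code": 5,
}

# Clinical topics (counts taken from the Results).
topic_counts = {
    "colonoscopy": 21,
    "liver_disease": 13,
    "inflammatory_bowel_disease": 7,
    "gastroscopy": 4,
    "pancreatic_disease": 4,
}

quality_pct = {k: pct(v, TOTAL_STUDIES) for k, v in quality_counts.items()}
topic_pct_53 = {k: pct(v, TOTAL_STUDIES) for k, v in topic_counts.items()}
topic_pct_54 = {k: pct(v, ASSUMED_ALTERNATIVE) for k, v in topic_counts.items()}

if __name__ == "__main__":
    print("quality indicators, n=53:", quality_pct)
    print("topics, n=53:", topic_pct_53)
    print("topics, n=54 (assumed):", topic_pct_54)
```

Comparing `topic_pct_53` with `topic_pct_54` shows which denominator matches the published topic shares.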

Conclusion: NLP can unlock substantial clinical information from free-text notes stored in EPRs and is already being used, particularly to interpret colonoscopy and radiology reports. However, the models we have thus far lack transparency, leading to duplication, bias, and doubts about generalisability. Therefore, greater clinical engagement, collaboration, and open sharing of appropriate datasets and code are needed.

Keywords: Colonoscopy; Gastroscopy; Hepatocellular carcinoma; Inflammatory bowel disease; Natural language processing; Pancreatic disease.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: RN has received an educational grant from Pentax Medical. MS and MG have attended the fully funded Dr Falk symposium on AI in Gastroenterology. The other authors declare that they have no competing interests.

Figures

Fig. 1
Applied Example of Natural Language Processing in Gastroenterology. Figure 1 provides a visual applied example of clinical natural language processing (NLP) in gastroenterology, flowing from semi-structured free text to transformed data, then to structured output, and finally to some examples of current gastroenterology (GI) NLP applications.
Fig. 2
PRISMA Flow Diagram For Review. Figure 2 provides the full PRISMA flow diagram for the study from abstract identification and screening through to full paper screening and extraction
Fig. 3
Distribution of Available NLP Studies across Gastroenterology and Hepatology. Figure 3 visually examines the distribution of available NLP studies across varied clinical, data science and task domains
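
The flow described for Fig. 1 (semi-structured free text, then transformed data, then structured output) can be illustrated with a minimal rule-based sketch. Everything in it is hypothetical: the sample report text, the field names, and the regular expression are invented for illustration and are not taken from the review or from any study it covers. Real clinical NLP systems need far more robust handling of negation, abbreviations, and reporting styles.

```python
import re

# Hypothetical colonoscopy report text, invented for illustration only.
REPORT = (
    "Colonoscopy to caecum. Two polyps found: 5 mm sessile polyp in sigmoid colon, "
    "12 mm pedunculated polyp in ascending colon."
)

# Hypothetical pattern: "<size> mm <morphology> polyp in <site> colon".
POLYP_PATTERN = re.compile(
    r"(\d+)\s*mm\s+(sessile|pedunculated)\s+polyp\s+in\s+(?:the\s+)?([a-z ]+?colon)"
)


def extract_polyps(text: str) -> list[dict]:
    """Turn free text into a list of structured polyp records."""
    normalised = text.lower()  # transformation step: normalise case
    return [
        {"size_mm": int(size), "morphology": morphology, "site": site.strip()}
        for size, morphology, site in POLYP_PATTERN.findall(normalised)
    ]


if __name__ == "__main__":
    for record in extract_polyps(REPORT):
        print(record)
```

The structured records could then feed a downstream application such as counting polyps per procedure, which is the kind of reporting task the abstract associates with colonoscopy NLP.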

