BMC Gastroenterol. 2025 Feb 6;25(1):58.

doi: 10.1186/s12876-025-03608-5.

A foundation systematic review of natural language processing applied to gastroenterology & hepatology

Matthew Stammers¹,²,³,⁴, Balasubramanian Ramgopal⁵, Abigail Owusu Nimako⁵, Anand Vyas⁵, Reza Nouraei⁶,⁷,⁸, Cheryl Metcalf⁷,⁹, James Batchelor⁶,⁷, Jonathan Shepherd¹⁰, Markus Gwiggner⁵,⁷

Affiliations

¹ University Hospital Southampton, Tremona Road, Southampton, SO16 6YD, UK. m.stammers@soton.ac.uk.
² Southampton Emerging Therapies and Technologies (SETT) Centre, Southampton, SO16 6YD, UK. m.stammers@soton.ac.uk.
³ Clinical Informatics Research Unit (CIRU), Coxford Road, Southampton, SO16 5AF, UK. m.stammers@soton.ac.uk.
⁴ University of Southampton, Southampton, SO17 1BJ, UK. m.stammers@soton.ac.uk.
⁵ University Hospital Southampton, Tremona Road, Southampton, SO16 6YD, UK.
⁶ Clinical Informatics Research Unit (CIRU), Coxford Road, Southampton, SO16 5AF, UK.
⁷ University of Southampton, Southampton, SO17 1BJ, UK.
⁸ Queen's Medical Centre, ENT Department, Nottingham, NG7 2UH, UK.
⁹ School of Healthcare Enterprise and Innovation, University of Southampton, University of Southampton Science Park, Enterprise Road, Chilworth, Southampton, SO16 7NS, UK.
¹⁰ Southampton Health Technologies Assessment Centre (SHTAC), Enterprise Road, Alpha House, Southampton, SO16 7NS, England.

PMID: 39915703
PMCID: PMC11800601
DOI: 10.1186/s12876-025-03608-5

Matthew Stammers et al. BMC Gastroenterol. 2025.

Abstract

Objective: This review assesses the progress of natural language processing (NLP) in gastroenterology to date, grades the robustness of the methodology, exposes the field to a new generation of authors, and highlights opportunities for future research.

Design: Seven scholarly databases (ACM Digital Library, arXiv, Embase, IEEE Xplore, PubMed, Scopus and Google Scholar) were searched for studies published between 2015 and 2023 that met the inclusion criteria. Studies lacking a description of appropriate validation or NLP methods were excluded, as were studies unavailable in English, those focused on non-gastrointestinal diseases and those that were duplicates. Two independent reviewers extracted study information, clinical/algorithm details, and relevant outcome data. Methodological quality and bias risks were appraised using a checklist of quality indicators for NLP studies.

Results: Fifty-three studies were identified utilising NLP in endoscopy, inflammatory bowel disease, gastrointestinal bleeding, liver and pancreatic disease. Colonoscopy was the focus of 21 (38.9%) studies; 13 (24.1%) focused on liver disease, 7 (13.0%) on inflammatory bowel disease, 4 (7.4%) on gastroscopy, 4 (7.4%) on pancreatic disease and 2 (3.7%) on endoscopic sedation/ERCP and gastrointestinal bleeding. Only 30 (56.6%) of the studies reported patient demographics, and only 13 (24.5%) had a low risk of validation bias. Thirty-five (66%) studies mentioned generalisability, but only 5 (9.4%) mentioned explainability or shared code/models.

Conclusion: NLP can unlock substantial clinical information from free-text notes stored in electronic patient records (EPRs) and is already being used, particularly to interpret colonoscopy and radiology reports. However, the models developed thus far lack transparency, leading to duplication, bias, and doubts about generalisability. Therefore, greater clinical engagement, collaboration, and open sharing of appropriate datasets and code are needed.

Keywords: Colonoscopy; Gastroscopy; Hepatocellular carcinoma; Inflammatory bowel disease; Natural language processing; Pancreatic disease.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: RN has received an educational grant from Pentax Medical. MS and MG have attended the fully funded Dr Falk symposium on AI in Gastroenterology. The other authors declare they have no competing interests.

Figures

**Fig. 1**
Applied Example of Natural Language Processing in Gastroenterology. Figure 1 provides a visual applied example of clinical natural language processing (NLP) in gastroenterology, flowing from semi-structured free text to transformed data, then on to structured output, and finally some examples of present gastroenterology (GI) NLP applications

**Fig. 2**
PRISMA Flow Diagram For Review. Figure 2 provides the full PRISMA flow diagram for the study from abstract identification and screening through to full paper screening and extraction

**Fig. 3**
Distribution of Available NLP Studies across Gastroenterology and Hepatology. Figure 3 visually examines the distribution of available NLP studies across varied clinical, data science and task domains
