A scoping review of large language model based approaches for information extraction from radiology reports

Daniel Reichenpfader¹,², Henning Müller³,⁴, Kerstin Denecke⁵

Affiliations

¹ Institute for Patient-Centered Digital Health, Bern University of Applied Sciences, Biel/Bienne, Switzerland. daniel.reichenpfader@bfh.ch.
² Faculty of Medicine, University of Geneva, Geneva, Switzerland. daniel.reichenpfader@bfh.ch.
³ Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland.
⁴ Informatics Institute, HES-SO Valais-Wallis, Sierre, Switzerland.
⁵ Institute for Patient-Centered Digital Health, Bern University of Applied Sciences, Biel/Bienne, Switzerland.

PMID: 39182008
PMCID: PMC11344824
DOI: 10.1038/s41746-024-01219-0

Daniel Reichenpfader et al. NPJ Digit Med. 2024 Aug 24;7(1):222. doi: 10.1038/s41746-024-01219-0.

Abstract

Radiological imaging is a globally prevalent diagnostic method, yet the free text contained in radiology reports is not frequently used for secondary purposes. Natural Language Processing can provide structured data retrieved from these reports. This paper provides a summary of the current state of research on Large Language Model (LLM) based approaches for information extraction (IE) from radiology reports. We conduct a scoping review that follows the PRISMA-ScR guideline. Queries of five databases were conducted on August 1st 2023. Among the 34 studies that met inclusion criteria, only pre-transformer and encoder-based models are described. External validation shows a general performance decrease, although LLMs might improve generalizability of IE approaches. Reports related to CT and MRI examinations, as well as thoracic reports, prevail. Most common challenges reported are missing validation on external data and augmentation of the described methods. Different reporting granularities affect the comparability and transparency of approaches.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1. PRISMA flowchart describing the source of evidence retrieval and selection process.**
Querying of five databases resulted in a total of 1237 sources of evidence eligible for screening. This number was reduced to 374 after deduplication and removal based on publication year. Eventually, 34 studies were included in this review after completion of the screening process.

**Fig. 2. Distribution of reported NLP tasks.**
The circles contain the absolute number of studies per task. NER Named entity recognition, RE Relation extraction, ML-CL Binary multi-label classification, MC-CL Multi-class classification, QA Question answering.

**Fig. 3. Distribution of modalities.**
The diagram shows absolute numbers of mentioned modalities. Several studies use reports obtained from multiple modalities. Other modalities include positron emission tomography-computed tomography (PET-CT) (n = 1) and ultrasound (n = 2). Three studies did not explicitly mention associated modalities. Abbreviations: CT Computed tomography, MRI Magnetic resonance imaging.

**Fig. 4. Distribution of anatomical regions.**
The diagram shows absolute numbers of mentioned anatomical regions. Several studies use reports corresponding to multiple anatomical regions. Other anatomical regions include the heart, abdomen, pelvis, “all body regions”, nose, thyroid (n = 1 each) and breast (n = 2). Four studies did not explicitly mention associated anatomical regions.
