Automated MRI protocoling in neuroradiology in the era of large language models
- PMID: 40643871
- PMCID: PMC12454495
- DOI: 10.1007/s11547-025-02040-9
Abstract
Purpose: This study investigates the automation of MRI protocoling, a routine task in radiology, using large language models (LLMs), comparing an open-source model (Llama 3.1 405B) and a proprietary model (GPT-4o) with and without retrieval-augmented generation (RAG), a method for incorporating domain-specific knowledge.
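To make the RAG setup concrete, the sketch below shows one way retrieval-augmented prompting for protocol assignment could look: rank institution-specific guideline snippets by overlap with the clinical question and prepend the best matches to the model prompt. The guideline texts, the token-overlap retriever, and the prompt wording are illustrative assumptions, not the authors' pipeline.

```python
# Minimal sketch of retrieval-augmented protocol assignment (illustrative only).
# Guideline snippets, scoring, and prompt wording are assumptions, not the study's pipeline.

from collections import Counter

# Hypothetical institution-specific protocol guideline snippets
GUIDELINES = [
    "Suspected stroke: DWI, FLAIR, SWI, TOF angiography; no contrast medium.",
    "Suspected brain tumor or metastasis: T1, T2, FLAIR, DWI, T1 post-contrast; contrast medium required.",
    "Multiple sclerosis follow-up: FLAIR, T2, T1 post-contrast; contrast medium required.",
]

def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words tokenization of a guideline or clinical question."""
    return Counter(text.lower().replace(",", " ").replace(";", " ").split())

def retrieve(question: str, guidelines: list[str], k: int = 2) -> list[str]:
    """Rank guideline snippets by simple token overlap with the clinical question."""
    q_tokens = tokenize(question)
    scored = sorted(
        guidelines,
        key=lambda g: sum((tokenize(g) & q_tokens).values()),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str) -> str:
    """Prepend retrieved guideline context to the protocoling instruction."""
    context = "\n".join(retrieve(question, GUIDELINES))
    return (
        "You are assigning an MRI protocol for neuroradiology.\n"
        f"Institutional guidelines:\n{context}\n"
        f"Clinical question: {question}\n"
        "Return the MRI sequences and state whether contrast medium is needed."
    )

if __name__ == "__main__":
    print(build_prompt("Follow-up of known multiple sclerosis, new sensory deficits"))
```

In the study, retrieval drew on institution-specific protocol assignment guidelines; a production setup would more likely use embedding-based retrieval than raw token overlap, but the prompt-assembly step would look similar.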
Materials and methods: This retrospective study included MRI studies conducted between January and December 2023, along with institution-specific protocol assignment guidelines. Clinical questions were extracted, and a neuroradiologist established the gold standard protocol. The LLMs were tasked with assigning MRI protocols and deciding on contrast medium administration, with and without RAG. The results were compared to protocols selected by four radiologists. Token-based symmetric accuracy, the Wilcoxon signed-rank test, and the McNemar test were used for evaluation.
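A minimal sketch of the paired statistical comparison described above, assuming per-case accuracy scores and binary contrast-media decisions are already available: the Wilcoxon signed-rank test compares paired accuracy scores (e.g., with vs. without RAG), and the McNemar test compares paired binary outcomes. The Dice-style overlap used here for "token-based symmetric accuracy" and all numeric values are assumptions for illustration; the authors' exact metric definition may differ.

```python
# Hedged sketch of the evaluation: Wilcoxon signed-rank for paired per-case
# accuracy scores, McNemar for paired binary contrast-media decisions.
# All data below are hypothetical; the token metric is an illustrative assumption.

import numpy as np
from scipy.stats import wilcoxon
from statsmodels.stats.contingency_tables import mcnemar

def token_accuracy(predicted: str, gold: str) -> float:
    """Illustrative symmetric token overlap (Dice coefficient) between sequence lists."""
    p, g = set(predicted.lower().split()), set(gold.lower().split())
    return 2 * len(p & g) / (len(p) + len(g)) if (p or g) else 1.0

# Hypothetical per-case accuracy scores for one model without and with RAG
acc_no_rag = np.array([0.4, 0.5, 0.3, 0.6, 0.4, 0.5])
acc_rag    = np.array([0.8, 0.7, 0.6, 0.9, 0.7, 0.8])
print(wilcoxon(acc_no_rag, acc_rag))  # paired, non-parametric comparison

# Hypothetical paired correct/incorrect contrast-media decisions
# 2x2 table: rows = model correct/incorrect, columns = radiologist correct/incorrect
table = np.array([[85, 7],
                  [6, 2]])
print(mcnemar(table, exact=True).pvalue)  # tests disagreement between paired raters
```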
Results: Data from 100 neuroradiology reports (mean age 54.2 ± 18.41 years; 50% women) were included. RAG integration significantly improved accuracy in sequence and contrast media prediction for Llama 3.1 (sequences: 38% vs. 70%, P < .001; contrast media: 77% vs. 94%, P < .001) and GPT-4o (sequences: 43% vs. 81%, P < .001; contrast media: 79% vs. 92%, P = .006). GPT-4o outperformed Llama 3.1 in MRI sequence prediction (81% vs. 70%, P < .001), with accuracy comparable to that of the radiologists (81% ± 0.21, P = .43). Both models matched the radiologists in predicting contrast media administration (Llama 3.1 with RAG: 94% vs. 91% ± 0.2, P = .37; GPT-4o with RAG: 92% vs. 91% ± 0.24, P = .48).
Conclusion: Large language models show great potential as decision-support tools for MRI protocoling, with performance similar to that of radiologists. RAG enhances the ability of LLMs to provide accurate, institution-specific protocol recommendations.
Keywords: Artificial intelligence; Automation; Clinical; Decision-support systems; Large language models; Magnetic resonance imaging; Natural language processing; Neuroradiology.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Conflict of interest: The authors declare no conflict of interest. Ethical approval: The study was conducted in accordance with the latest version of the Declaration of Helsinki and was approved by the ethics committee of the university (No. EA4/062/20). The need for informed consent was waived due to the retrospective nature of the research.