Radiol Med. 2025 Sep;130(9):1472-1482.
doi: 10.1007/s11547-025-02040-9. Epub 2025 Jul 11.

Automated MRI protocoling in neuroradiology in the era of large language models

Lara Noelle Reiner et al. Radiol Med. 2025 Sep.

Abstract

Purpose: This study investigates the automation of MRI protocoling, a routine task in radiology, using large language models (LLMs). It compares an open-source model (Llama 3.1 405B) and a proprietary model (GPT-4o), each with and without retrieval-augmented generation (RAG), a method for incorporating domain-specific knowledge.

Material and methods: This retrospective study included MRI studies conducted between January and December 2023, along with institution-specific protocol assignment guidelines. Clinical questions were extracted, and a neuroradiologist established the gold standard protocol. LLMs were tasked with assigning MRI protocols and contrast medium administration with and without RAG. The results were compared to protocols selected by four radiologists. Token-based symmetric accuracy, the Wilcoxon signed-rank test, and the McNemar test were used for evaluation.

Results: Data from 100 neuroradiology reports (mean age 54.2 ± 18.41 years; 50% women) were included. RAG integration significantly improved accuracy in sequence and contrast media prediction for Llama 3.1 (Sequences: 38% vs. 70%, P < .001; Contrast Media: 77% vs. 94%, P < .001) and GPT-4o (Sequences: 43% vs. 81%, P < .001; Contrast Media: 79% vs. 92%, P = .006). GPT-4o outperformed Llama 3.1 in MRI sequence prediction (81% vs. 70%, P < .001), with accuracy comparable to that of the radiologists (81% ± 0.21, P = .43). Both models matched the radiologists in predicting contrast media administration (Llama 3.1 RAG: 94% vs. 91% ± 0.2, P = .37; GPT-4o RAG: 92% vs. 91% ± 0.24, P = .48).

Conclusion: Large language models show great potential as decision-support tools for MRI protocoling, with performance similar to radiologists. RAG enhances the ability of LLMs to provide accurate, institution-specific protocol recommendations.

Keywords: Artificial intelligence; Automation; Clinical; Decision-support systems; Large language models; Magnetic resonance imaging; Natural language processing; Neuroradiology.
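
The abstract names the McNemar test among its evaluation methods but does not say which variant was used or which comparisons it was applied to. As an illustration only, the following is a minimal sketch of the exact (binomial) two-sided McNemar test for two methods scored as correct or incorrect on the same cases. The function name `mcnemar_exact` and the ten paired outcomes are hypothetical and are not data from the study.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar test on paired correct/incorrect outcomes.

    Returns (b, c, p_value), where b counts cases only method A got right
    and c counts cases only method B got right.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both methods must be scored on the same cases")
    # Only discordant pairs carry information about a difference.
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    n = b + c
    if n == 0:
        return b, c, 1.0
    # Under the null hypothesis, each discordant pair favors A or B with
    # probability 0.5, so min(b, c) follows a Binomial(n, 0.5) tail.
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical paired outcomes (1 = correct, 0 = incorrect), for illustration.
with_rag = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1]
without_rag = [1, 0, 0, 0, 1, 0, 0, 1, 0, 1]

b, c, p_value = mcnemar_exact(with_rag, without_rag)
print(f"discordant pairs: b={b}, c={c}, p={p_value:.4f}")
```

The test ignores cases where both methods agree, which is why it suits paired comparisons on the same set of reports better than a test that treats the two accuracy figures as independent samples.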

Conflict of interest statement

Declarations. Conflict of interest: The authors declare no conflicts of interest. Ethical approval: The study was conducted in accordance with the latest version of the Declaration of Helsinki. The study was approved by the ethics committee of the university (No. EA4/062/20). The need for informed consent was waived due to the retrospective nature of the research.

Figures

Fig. 1
Report Selection Process. The inclusion criterion for this study was a fully conducted MRI examination with a neuroradiological question. Exclusion criteria were early termination of the examination, the use of specific study protocols or the requirement of one specific sequence, and patient age under 17 years, as all neuroradiological examinations for this age group follow pediatric MRI protocols.
Fig. 2
Study Design. The clinical question and MRI device data were extracted from MRI reports and provided to an experienced neuroradiologist (J.S., 13 years of experience) for manual protocol selection to establish the ground truth. This information was also used as input for the four tested pipelines. Additionally, four radiologists with varying experience levels performed manual protocol selection for comparison. Statistical analysis was conducted to compare the protocol selections of the large language model and the radiologists against the ground truth
Fig. 3
Combination of Large Language Model and Retrieval-Augmented Generation. In the non-retrieval-augmented generation (RAG) approach, the clinical question is embedded directly in the prompt, enabling the large language model to predict contrast medium administration and suitable MRI sequences based on our institution’s standard sequences. RAG extends this process by using the clinical question to query the vector store, which is constructed from segmented protocols based on our institution-specific guidelines. Through similarity search, the four most relevant protocols are retrieved and incorporated into the prompt
Fig. 4
Accuracies of Protocol Prediction. Error bars indicate the respective 95% confidence interval (CI). LLM = Large Language Model
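
The retrieval step described in the Fig. 3 caption (query a vector store with the clinical question, take the four most similar protocol segments, and place them in the prompt) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the bag-of-words `embed` function stands in for whatever embedding model was actually used, the names `retrieve` and `build_prompt` are hypothetical, and the protocol segments are invented placeholders rather than the institution's guidelines or clinical advice.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, segments, k=4):
    """Return the k segments most similar to the clinical question."""
    q = embed(question)
    ranked = sorted(segments, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(question, segments, k=4):
    """Place the retrieved segments ahead of the question in the prompt."""
    context = "\n".join(f"- {s}" for s in retrieve(question, segments, k))
    return (
        "Relevant protocol guidelines:\n" + context + "\n\n"
        "Clinical question: " + question + "\n"
        "Suggest MRI sequences and whether contrast medium is needed."
    )


# Invented placeholder segments, for illustration only.
SEGMENTS = [
    "stroke protocol: diffusion, flair, angiography",
    "tumor protocol: t1 with contrast, t2, flair",
    "multiple sclerosis protocol: flair, t2, t1 with contrast",
    "epilepsy protocol: coronal t2, flair, t1",
    "pituitary protocol: sagittal t1, coronal t1 with contrast",
    "dementia protocol: t1, flair, susceptibility",
]

question = "suspected stroke, request diffusion imaging"
prompt = build_prompt(question, SEGMENTS)
print(prompt)
```

In the non-RAG variant described in the same caption, the prompt would contain only the clinical question and the list of standard sequences, with no retrieved segments.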
