Automated MRI protocoling in neuroradiology in the era of large language models
- PMID: 40643871
- PMCID: PMC12454495
- DOI: 10.1007/s11547-025-02040-9
Abstract
Purpose: This study investigates the automation of MRI protocoling, a routine task in radiology, using large language models (LLMs), comparing an open-source model (Llama 3.1 405B) and a proprietary model (GPT-4o) with and without retrieval-augmented generation (RAG), a method for incorporating domain-specific knowledge.
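To make the RAG setup concrete, the sketch below shows one way retrieval-augmented prompting for protocol assignment could look: rank institution-specific guideline snippets by overlap with the clinical question and prepend the best matches to the model prompt. The guideline texts, the token-overlap retriever, and the prompt wording are illustrative assumptions, not the authors' pipeline.

```python
# Minimal sketch of retrieval-augmented protocol assignment (illustrative only).
# Guideline snippets, scoring, and prompt wording are assumptions, not the study's pipeline.

from collections import Counter

# Hypothetical institution-specific protocol guideline snippets
GUIDELINES = [
    "Suspected stroke: DWI, FLAIR, SWI, TOF angiography; no contrast medium.",
    "Suspected brain tumor or metastasis: T1, T2, FLAIR, DWI, T1 post-contrast; contrast medium required.",
    "Multiple sclerosis follow-up: FLAIR, T2, T1 post-contrast; contrast medium required.",
]

def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words tokenization of a guideline or clinical question."""
    return Counter(text.lower().replace(",", " ").replace(";", " ").split())

def retrieve(question: str, guidelines: list[str], k: int = 2) -> list[str]:
    """Rank guideline snippets by simple token overlap with the clinical question."""
    q_tokens = tokenize(question)
    scored = sorted(
        guidelines,
        key=lambda g: sum((tokenize(g) & q_tokens).values()),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str) -> str:
    """Prepend retrieved guideline context to the protocoling instruction."""
    context = "\n".join(retrieve(question, GUIDELINES))
    return (
        "You are assigning an MRI protocol for neuroradiology.\n"
        f"Institutional guidelines:\n{context}\n"
        f"Clinical question: {question}\n"
        "Return the MRI sequences and state whether contrast medium is needed."
    )

if __name__ == "__main__":
    print(build_prompt("Follow-up of known multiple sclerosis, new sensory deficits"))
```

In the study, retrieval drew on institution-specific protocol assignment guidelines; a production setup would more likely use embedding-based retrieval than raw token overlap, but the prompt-assembly step would look similar.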
Materials and methods: This retrospective study included MRI studies conducted between January and December 2023, along with institution-specific protocol assignment guidelines. Clinical questions were extracted, and a neuroradiologist established the gold standard protocol. The LLMs were tasked with assigning MRI protocols and deciding on contrast medium administration, with and without RAG. The results were compared to protocols selected by four radiologists. Token-based symmetric accuracy, the Wilcoxon signed-rank test, and the McNemar test were used for evaluation.
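A minimal sketch of the paired statistical comparison described above, assuming per-case accuracy scores and binary contrast-media decisions are already available: the Wilcoxon signed-rank test compares paired accuracy scores (e.g., with vs. without RAG), and the McNemar test compares paired binary outcomes. The Dice-style overlap used here for "token-based symmetric accuracy" and all numeric values are assumptions for illustration; the authors' exact metric definition may differ.

```python
# Hedged sketch of the evaluation: Wilcoxon signed-rank for paired per-case
# accuracy scores, McNemar for paired binary contrast-media decisions.
# All data below are hypothetical; the token metric is an illustrative assumption.

import numpy as np
from scipy.stats import wilcoxon
from statsmodels.stats.contingency_tables import mcnemar

def token_accuracy(predicted: str, gold: str) -> float:
    """Illustrative symmetric token overlap (Dice coefficient) between sequence lists."""
    p, g = set(predicted.lower().split()), set(gold.lower().split())
    return 2 * len(p & g) / (len(p) + len(g)) if (p or g) else 1.0

# Hypothetical per-case accuracy scores for one model without and with RAG
acc_no_rag = np.array([0.4, 0.5, 0.3, 0.6, 0.4, 0.5])
acc_rag    = np.array([0.8, 0.7, 0.6, 0.9, 0.7, 0.8])
print(wilcoxon(acc_no_rag, acc_rag))  # paired, non-parametric comparison

# Hypothetical paired correct/incorrect contrast-media decisions
# 2x2 table: rows = model correct/incorrect, columns = radiologist correct/incorrect
table = np.array([[85, 7],
                  [6, 2]])
print(mcnemar(table, exact=True).pvalue)  # tests disagreement between paired raters
```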
Results: Data from 100 neuroradiology reports (mean age 54.2 ± 18.41 years; 50% women) were included. RAG integration significantly improved accuracy in sequence and contrast media prediction for Llama 3.1 (sequences: 38% vs. 70%, P < .001; contrast media: 77% vs. 94%, P < .001) and GPT-4o (sequences: 43% vs. 81%, P < .001; contrast media: 79% vs. 92%, P = .006). GPT-4o outperformed Llama 3.1 in MRI sequence prediction (81% vs. 70%, P < .001), with accuracy comparable to that of the radiologists (81% ± 0.21, P = .43). Both models matched the radiologists in predicting contrast media administration (Llama 3.1 with RAG: 94% vs. 91% ± 0.2, P = .37; GPT-4o with RAG: 92% vs. 91% ± 0.24, P = .48).
Conclusion: Large language models show great potential as decision-support tools for MRI protocoling, with performance similar to that of radiologists. RAG enhances the ability of LLMs to provide accurate, institution-specific protocol recommendations.
Keywords: Artificial intelligence; Automation; Clinical; Decision-support systems; Large language models; Magnetic resonance imaging; Natural language processing; Neuroradiology.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Conflict of interest: The authors declare no conflict of interest. Ethical approval: The study was conducted in accordance with the latest version of the Declaration of Helsinki and was approved by the ethics committee of the university (No. EA4/062/20). The need for informed consent was waived due to the retrospective nature of the research.