Radiol Artif Intell. 2024 Jul;6(4):e230364.
doi: 10.1148/ryai.230364.

Performance of an Open-Source Large Language Model in Extracting Information from Free-Text Radiology Reports

Bastien Le Guellec et al. Radiol Artif Intell. 2024 Jul.

Abstract

Purpose: To assess the performance of a local open-source large language model (LLM) in various information extraction tasks from real-life emergency brain MRI reports.

Materials and Methods: All consecutive emergency brain MRI reports written in 2022 from a French quaternary center were retrospectively reviewed. Two radiologists identified MRI scans that were performed in the emergency department for headaches. Four radiologists scored the reports' conclusions as either normal or abnormal. Abnormalities were labeled as either headache-causing or incidental. Vicuna (LMSYS Org), an open-source LLM, performed the same tasks. Vicuna's performance metrics were evaluated using the radiologists' consensus as the reference standard.

Results: Among the 2398 reports during the study period, radiologists identified 595 that included headaches in the indication (median age of patients, 35 years [IQR, 26-51 years]; 68% [403 of 595] women). A positive finding was reported in 227 of 595 (38%) cases, 136 of which could explain the headache. The LLM had:
- a sensitivity of 98.0% (95% CI: 96.5, 99.0) and specificity of 99.3% (95% CI: 98.8, 99.7) for detecting the presence of headache in the clinical context;
- a sensitivity of 99.4% (95% CI: 98.3, 99.9) and specificity of 98.6% (95% CI: 92.2, 100.0) for the use of contrast medium injection;
- a sensitivity of 96.0% (95% CI: 92.5, 98.2) and specificity of 98.9% (95% CI: 97.2, 99.7) for study categorization as either normal or abnormal;
- a sensitivity of 88.2% (95% CI: 81.6, 93.1) and specificity of 73% (95% CI: 62, 81) for causal inference between MRI findings and headache.

Conclusion: An open-source LLM was able to extract information from free-text radiology reports with excellent accuracy without requiring further training.

Supplemental material is available for this article.
Published under a CC BY 4.0 license. See also the commentary by Akinci D'Antonoli and Bluethgen in this issue.

Keywords: Brain; Generative Pretrained Transformers (GPT); Information Extraction; Large Language Model (LLM); MRI; Open Source; Report.
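
The abstract reports each task's performance as a sensitivity and a specificity with a 95% CI. The sketch below shows one way such figures can be computed from confusion-matrix counts. It uses the Wilson score interval, which is an assumption made here for illustration: the abstract does not state which interval method the authors used. The function names and the toy counts in the usage note are hypothetical and are not taken from the study; only the 227-of-595 proportion comes from the abstract.

```python
import math
from typing import NamedTuple, Tuple


class Rate(NamedTuple):
    estimate: float  # observed proportion
    lower: float     # lower bound of the interval
    upper: float     # upper bound of the interval


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> Rate:
    """Proportion with a Wilson score interval (z of about 1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return Rate(p, max(0.0, centre - half), min(1.0, centre + half))


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[Rate, Rate]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return wilson_interval(tp, tp + fn), wilson_interval(tn, tn + fp)


if __name__ == "__main__":
    # Proportion of abnormal examinations stated in the abstract (227 of 595).
    abnormal = wilson_interval(227, 595)
    print(f"abnormal: {abnormal.estimate:.1%} "
          f"({abnormal.lower:.1%}, {abnormal.upper:.1%})")

    # Hypothetical counts, for illustration only (not study data).
    sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
    print(f"sensitivity: {sens.estimate:.1%}, specificity: {spec.estimate:.1%}")
```

An exact (Clopper-Pearson) interval is a common alternative when counts are small or the proportion is close to 0 or 1; the two methods can give slightly different bounds.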

Conflict of interest statement

Disclosures of conflicts of interest: B.L.G. No relevant relationships. A.L. No relevant relationships. C.G. No relevant relationships. L.S. No relevant relationships. C.B. No relevant relationships. L.H.B. No relevant relationships. P.A. Qalis think tank; stock/stock options in Genoscreen. J.P.P. No relevant relationships. G.K. Grants from French Society of Neuroradiology and French Society of Radiology. A.H. No relevant relationships.

Figures

Graphical abstract

Figure 1: Example texts from dialogues with Vicuna 13-B (LMSYS Org).

Figure 2: Diagram shows workflow for automated information extraction from pseudonymized radiology reports with Vicuna 13-B (LMSYS Org), an open-source large language model. Four tasks are defined as follows: 1, reports’ triaging based on the presence of headache; 2, extraction of contrast medium injection from the protocol; 3, classification of the examination as either normal or abnormal based on the conclusion; and 4, causal inference between the main finding and headache.

Figure 3: Graph shows diagnostic yield of brain MRI scans in patients in the emergency department with headache.

Figure 4: Graph shows performance of Vicuna 13-B (LMSYS Org) for tasks 1, 2, 3, and 4. Error bars are 95% CIs.
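
The four-task workflow described in the Figure 2 caption can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation. The prompts, the `Report` field names, the yes/no answer parsing, and the choice to skip later tasks when an earlier one is negative are all hypothetical. The `ask` parameter stands for any callable that sends a prompt to a locally hosted model and returns its text reply; no specific model API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Report:
    # Hypothetical report sections; real reports may be structured differently.
    indication: str
    protocol: str
    conclusion: str


@dataclass
class Extraction:
    headache: bool                            # task 1
    contrast: Optional[bool] = None           # task 2
    abnormal: Optional[bool] = None           # task 3
    explains_headache: Optional[bool] = None  # task 4


def ask_yes_no(ask: Callable[[str], str], question: str, text: str) -> bool:
    """Send one closed question and read the reply as yes or no."""
    prompt = f"{question} Answer yes or no.\n\nText: {text}"
    return ask(prompt).strip().lower().startswith("yes")


def extract(report: Report, ask: Callable[[str], str]) -> Extraction:
    # Task 1: triage on the presence of headache in the clinical context.
    result = Extraction(headache=ask_yes_no(
        ask, "Does the clinical indication mention headache?", report.indication))
    if not result.headache:
        return result  # assumption: non-headache reports are not processed further

    # Task 2: contrast medium injection, read from the protocol.
    result.contrast = ask_yes_no(
        ask, "Was contrast medium injected?", report.protocol)

    # Task 3: normal versus abnormal, read from the conclusion.
    result.abnormal = ask_yes_no(
        ask, "Does the conclusion describe an abnormal finding?", report.conclusion)

    # Task 4: causal inference, asked only when a finding exists (assumption).
    if result.abnormal:
        result.explains_headache = ask_yes_no(
            ask, "Could the main finding explain the headache?", report.conclusion)
    return result
```

Passing the model call in as a plain function keeps the sketch independent of any serving framework, and lets the pipeline be exercised with a stub, for example `extract(report, lambda prompt: "yes")`, before a real model is connected.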
