Curr Probl Diagn Radiol. 2025 Mar-Apr;54(2):151-158. doi: 10.1067/j.cpradiol.2024.08.003. Epub 2024 Aug 15.

Enhancing radiology training with GPT-4: Pilot analysis of automated feedback in trainee preliminary reports

Wasif Bala et al. Curr Probl Diagn Radiol. 2025 Mar-Apr.

Abstract

Rationale and objectives: Radiology residents often receive limited feedback on preliminary reports issued during independent call. This study aimed to determine if Large Language Models (LLMs) can supplement traditional feedback by identifying missed diagnoses in radiology residents' preliminary reports.

Materials & methods: A randomly selected subset of 500 paired preliminary and final reports (250 training/250 validation) issued between 12/17/2022 and 5/22/2023 was extracted and de-identified from our institutional database. The prompts and report text were input into the GPT-4 language model via the GPT-4 API (gpt-4-0314 model version). Iterative prompt tuning was performed on a subset of the training/validation sets to direct the model to identify important findings in the final report that were absent from the preliminary report. For testing, a subset of 10 reports with confirmed diagnostic errors was randomly selected. Fourteen residents with on-call experience assessed the LLM-generated discrepancies and completed a survey about their experience using a 5-point Likert scale.
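
To make the described pipeline concrete, the following is a minimal, illustrative sketch of how a de-identified preliminary/final report pair might be submitted to the gpt-4-0314 model through the OpenAI Chat Completions API to list findings present only in the final report. This is not the authors' actual prompt or code; the prompt wording, function name, and parameters are assumptions.

```python
# Minimal sketch (assumed prompt and parameters, not the study's pipeline):
# ask GPT-4 to list important findings that appear in the final report but
# are missing from the preliminary report.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def find_missed_diagnoses(preliminary: str, final: str) -> str:
    """Return LLM-generated discrepancies between a preliminary and final report."""
    response = client.chat.completions.create(
        model="gpt-4-0314",  # model version reported in the study; may no longer be served
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an attending radiologist. List the important findings "
                    "that appear in the FINAL report but are absent from the "
                    "PRELIMINARY report. If there are none, reply 'No discrepancies.'"
                ),
            },
            {
                "role": "user",
                "content": f"PRELIMINARY REPORT:\n{preliminary}\n\nFINAL REPORT:\n{final}",
            },
        ],
        temperature=0,  # deterministic output for reproducible feedback
    )
    return response.choices[0].message.content
```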

Results: The model identified 24 unique missed diagnoses across the 10 test reports with i% model prediction accuracy as rated by the 14 residents. Five additional diagnoses were identified by users, resulting in a model sensitivity of 79.2%. Post-evaluation surveys showed a mean satisfaction rating of 3.50 and a perceived accuracy rating of 3.64 out of 5 for the LLM-generated feedback. Most respondents (71.4%) favored a combination of LLM-generated and traditional feedback.

Conclusion: In this pilot study, LLM-generated feedback on radiology residents' preliminary reports identified missed diagnoses with notable accuracy and was positively received, highlighting the potential role of LLMs in supplementing conventional feedback methods.

Keywords: Artificial intelligence/machine learning; Educational systems; Large language models.


