Review
Nature. 2025 Mar;639(8056):888-896.
doi: 10.1038/s41586-025-08675-y. Epub 2025 Mar 26.

Multimodal generative AI for medical image interpretation

Vishwanatha M Rao et al. Nature. 2025 Mar.

Abstract

Accurately interpreting medical images and generating insightful narrative reports is indispensable for patient care but places heavy burdens on clinical experts. Advances in artificial intelligence (AI), especially in an area that we refer to as multimodal generative medical image interpretation (GenMI), create opportunities to automate parts of this complex process. In this Perspective, we synthesize progress and challenges in developing AI systems for generation of medical reports from images. We focus extensively on radiology as a domain with enormous reporting needs and research efforts. In addition to analysing the strengths and applications of new models for medical report generation, we advocate for a novel paradigm to deploy GenMI in a manner that empowers clinicians and their patients. Initial research suggests that GenMI could one day match human expert performance in generating reports across disciplines, such as radiology, pathology and dermatology. However, formidable obstacles remain in validating model accuracy, ensuring transparency and eliciting nuanced impressions. If carefully implemented, GenMI could meaningfully assist clinicians in improving quality of care, enhancing medical education, reducing workloads, expanding specialty access and providing real-time expertise. Overall, we highlight opportunities alongside key challenges for developing multimodal generative AI that complements human experts for reliable medical report writing.

Conflict of interest statement

Competing interests: P.R. is the co-founder of a2z Radiology AI. The other authors declare no competing interests.
