What Does DALL-E 2 Know About Radiology?
- PMID: 36927634
- PMCID: PMC10131692
- DOI: 10.2196/43110
Abstract
Generative models, such as DALL-E 2 (OpenAI), could represent promising future tools for image generation, augmentation, and manipulation for artificial intelligence research in radiology, provided that these models have sufficient medical domain knowledge. Herein, we show that DALL-E 2 has learned relevant representations of x-ray images, with promising capabilities in terms of zero-shot text-to-image generation of new images, the continuation of an image beyond its original boundaries, and the removal of elements; however, its ability to generate images with pathological abnormalities (eg, tumors, fractures, and inflammation) or to generate computed tomography, magnetic resonance imaging, or ultrasound images is still limited. The use of generative models for augmenting and generating radiological data thus seems feasible, although these models will first require further fine-tuning and adaptation to their respective domains.
Keywords: DALL-E; artificial intelligence; creating images from text; diagnostic imaging; generative model; image creation; image generation; machine learning; medical imaging; radiology; text-to-image; transformer language model; x-ray.
©Lisa C Adams, Felix Busch, Daniel Truhn, Marcus R Makowski, Hugo J W L Aerts, Keno K Bressem. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 16.03.2023.
Conflict of interest statement
Conflicts of Interest: None declared.