Multimodal Artificial Intelligence in Medicine

Conor S Judge¹,², Finn Krewer¹, Martin J O'Donnell¹, Lisa Kiely¹, Donal Sexton³, Graham W Taylor⁴,⁵, Joshua August Skorburg⁴, Bryan Tripp⁶

Affiliations

¹ HRB-Clinical Research Facility, University of Galway, Galway, Ireland.
² Insight Data Analytics, University of Galway, Galway, Ireland.
³ Department of Medicine, Trinity College Dublin, Dublin, Ireland.
⁴ University of Guelph, Guelph, Ontario, Canada.
⁵ Vector Institute, Toronto, Ontario, Canada.
⁶ Department of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada.

PMID: 39167446
PMCID: PMC12282626
DOI: 10.34067/KID.0000000000000556

Review

Conor S Judge et al. Kidney360. 2024 Nov 1;5(11):1771-1779. doi: 10.34067/KID.0000000000000556. Epub 2024 Aug 21.

Abstract

Traditional medical artificial intelligence models that are approved for clinical use restrict themselves to single-modal data (e.g., images only), limiting their applicability in the complex, multimodal environment of medical diagnosis and treatment. Multimodal transformer models in health care can effectively process and interpret diverse data forms, such as text, images, and structured data. They have demonstrated impressive performance on standard benchmarks, like United States Medical Licensing Examination question banks, and continue to improve with scale. However, the adoption of these advanced artificial intelligence models is not without challenges. While multimodal deep learning models like transformers offer promising advancements in health care, their integration requires careful consideration of the accompanying ethical and environmental challenges.

Conflict of interest statement

Disclosure forms, as provided by each author, are available with the online version of the article at http://links.lww.com/KN9/A646.

Figures

**Figure 1**
**Patient scenarios without AI, with single-modal AI, and with multimodal AI.** AI, artificial intelligence.

**Figure 2**
**A future multimodal transformer-based AKI alert system.** This figure depicts a multimodal transformer-based system designed to predict AKI risk by integrating text, structured data (*e.g.*, laboratory tests), images (*e.g.*, chest x-ray), and time series data. These are converted into numerical vectors through specific embedding layers. Positional encoding adds order information to these vectors, which are processed by the attention mechanism. The heatmap shows how the model focuses on relevant data parts, using keys, queries, and values to compute attention scores. Transformed vectors pass through a feed-forward neural network for pattern learning. The architecture includes multiple identical decoder blocks for iterative refinement. At the top, the classification task predicts AKI risk at 24, 48, and 168 hours, triggering alerts if risk thresholds are exceeded. Cr, creatinine; Gl, glucose; Na, sodium; x H, attention heads; x K, decoder blocks.
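
The attention step and the alert rule described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' system: the token names, the two-dimensional embeddings, the risk values, and the thresholds are hypothetical and have no clinical meaning, and the helper `aki_alerts` is invented for illustration. The sketch omits learned query/key/value projections, positional encoding, multiple heads, and the feed-forward layers, and shows only scaled dot-product attention followed by per-horizon thresholding.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Weighted average of the value vectors.
        outputs.append([sum(w * v[i] for w, v in zip(row, values))
                        for i in range(len(values[0]))])
    return outputs, weights


def aki_alerts(risks, thresholds):
    """Return the horizons (hours) whose predicted risk exceeds its threshold."""
    return [h for h in sorted(risks) if risks[h] > thresholds[h]]


# Hypothetical embeddings: one token per modality (assumption, not real data).
tokens = {
    "clinical_note": [1.0, 0.0],
    "creatinine": [0.0, 1.0],
    "chest_xray": [1.0, 1.0],
}
x = list(tokens.values())

# Self-attention with identity projections (Q = K = V = x), for brevity.
outputs, weights = attention(x, x, x)

# Made-up model outputs and alert thresholds for the three horizons in Figure 2.
risks = {24: 0.12, 48: 0.35, 168: 0.61}
thresholds = {24: 0.30, 48: 0.30, 168: 0.50}
alerts = aki_alerts(risks, thresholds)
```

Each row of `weights` corresponds to one row of the heatmap in the figure: it sums to 1 and records how strongly one token attends to every other token. In a trained model the queries, keys, and values would come from learned projections of the embeddings rather than from the embeddings themselves.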
