Editorial
Science. 2023 Sep 15;381(6663):adk6139.
doi: 10.1126/science.adk6139. Epub 2023 Sep 15.

As artificial intelligence goes multimodal, medical applications multiply

Eric J Topol. Science.

Abstract

Machines don't have eyes, but you wouldn't know that if you followed the progression of deep learning models for accurate interpretation of medical images, such as X-rays, computed tomography (CT) and magnetic resonance imaging (MRI) scans, pathology slides, and retinal photos. Over the past several years, there has been a torrent of studies that have consistently demonstrated how powerful "machine eyes" can be, not only compared with medical experts but also for detecting features in medical images that are not readily discernible by humans. For example, a retinal scan is rich with information that people can't see, but machines can, providing a gateway to multiple aspects of human physiology, including blood pressure; glucose control; risk of Parkinson's, Alzheimer's, kidney, and hepatobiliary diseases; and the likelihood of heart attacks and strokes. As a cardiologist, I would not have envisioned that machine interpretation of an electrocardiogram would provide information about an individual's age, sex, and anemia; risk of developing diabetes or arrhythmias; heart function and valve disease; and kidney or thyroid conditions. Likewise, applying deep learning to a pathology slide of tumor tissue can also provide insight into the site of origin, driver mutations, structural genomic variants, and prognosis. Although these machine vision capabilities for medical image interpretation may seem impressive, they foreshadow what is potentially far more expansive terrain for artificial intelligence (AI) to transform medicine. The big shift ahead is the ability to transcend narrow, unimodal tasks, confined to images, and broaden machine capabilities to include text and speech, encompassing all input modes, setting the foundation for multimodal AI.
