Review
Nat Rev Cancer. 2022 Feb;22(2):114-126.
doi: 10.1038/s41568-021-00408-3. Epub 2021 Oct 18.

Harnessing multimodal data integration to advance precision oncology

Kevin M Boehm et al. Nat Rev Cancer. 2022 Feb.

Abstract

Advances in quantitative biomarker development have accelerated new forms of data-driven insights for patients with cancer. However, most approaches are limited to a single mode of data, leaving integrated approaches across modalities relatively underdeveloped. Multimodal integration of advanced molecular diagnostics, radiological and histological imaging, and codified clinical data presents opportunities to advance precision oncology beyond genomics and standard molecular techniques. However, most medical datasets are still too sparse to be useful for the training of modern machine learning techniques, and significant challenges remain before this is remedied. Combined efforts of data engineering, computational methods for analysis of heterogeneous data and instantiation of synergistic data models in biomedical research are required for success. In this Perspective, we offer our opinions on synthesizing complementary modalities of data with emerging multimodal artificial intelligence methods. Advancing along this direction will result in a reimagined class of multimodal biomarkers to propel the field of precision oncology in the coming decade.

Figures

Figure 1. Example data modalities for integration include radiology, histopathology, and genomic information.
Image feature extraction involves choosing between deep learning and engineered features. CT, computed tomography; H&E, hematoxylin & eosin.
Figure 2. Multimodal models integrate features across modalities.
Sub-models extract unimodal features from each data modality. Next, a multimodal integration step generates intermodal features; a Tensor Fusion Network (TFN) is indicated here. A final sub-model infers patient outcomes. GPU, graphics processing unit.
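The integration step in Figure 2 can be illustrated with a minimal sketch of tensor fusion, assuming the commonly described formulation in which each unimodal embedding is extended with a constant 1 and the embeddings are combined by an outer product. The function name `tensor_fusion` and the feature values are hypothetical and are not taken from the article.

```python
from itertools import product
from math import prod


def tensor_fusion(*embeddings):
    """Return the flattened outer product of the unimodal embeddings.

    A constant 1 is prepended to each embedding so that the result keeps
    the unimodal and bimodal terms alongside the full cross-modal products.
    """
    extended = [[1.0, *embedding] for embedding in embeddings]
    return [prod(combo) for combo in product(*extended)]


# Hypothetical unimodal feature vectors (values are illustrative only).
radiology = [0.5, 2.0]
histology = [3.0]
genomics = [4.0, 1.0]

fused = tensor_fusion(radiology, histology, genomics)
print(len(fused))  # (2 + 1) * (1 + 1) * (2 + 1) entries
```

The fused vector grows multiplicatively with the size of each embedding, which is one reason such a step is usually applied to compact unimodal representations rather than to raw features.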
Figure 3. Design choices for multimodal models with genomic, radiological, and histopathological data.
Solid arrows indicate stages with learnable parameters (linear or otherwise), dashed arrows indicate stages with no learnable parameters, and dashed and dotted arrows indicate the option for learnable parameters, depending on model architecture. (a) In early fusion, features from disparate modalities are simply concatenated at the outset. (b) In late fusion, each set of unimodal features is separately and fully processed to generate a unimodal score before amalgamation by a classifier or simple arithmetic. (c) In intermediate fusion, unimodal features are initially processed separately prior to a fusion step, which may or may not have learnable parameters, and subsequent analysis of the fused representation. All schemata shown are for deep learning (DL) on engineered features: for convolutional neural networks (CNNs) directly on images, unimodal features and unimodal representations are synonymous. For linear machine learning (ML) on engineered features, no representations are learned between features and stratification. The magnetic resonance imaging (MRI) images were obtained from The Cancer Imaging Archive (TCIA). Histology images were obtained from the Stanford Tissue Microarray database.
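The three fusion strategies in Figure 3 can be contrasted in a small sketch. It assumes fixed, hand-picked weights in place of trained parameters, single-number unimodal representations, and averaging as the "simple arithmetic" for late fusion; all function names and values are hypothetical.

```python
def linear(weights, bias, x):
    """Weighted sum of the inputs plus a bias term."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def early_fusion(features, weights, bias):
    # (a) Concatenate all unimodal features first, then score them jointly.
    concatenated = [value for modality in features for value in modality]
    return linear(weights, bias, concatenated)


def late_fusion(features, unimodal_models):
    # (b) Score each modality separately, then average the unimodal scores.
    scores = [linear(w, b, x) for (w, b), x in zip(unimodal_models, features)]
    return sum(scores) / len(scores)


def intermediate_fusion(features, encoders, head_weights, head_bias):
    # (c) Encode each modality into a representation (linear map + ReLU),
    # fuse the representations by concatenation, then score the fused vector.
    representations = [
        max(0.0, linear(w, b, x)) for (w, b), x in zip(encoders, features)
    ]
    return linear(head_weights, head_bias, representations)


# Hypothetical features for radiology, histology, and genomics.
features = [[1.0, 2.0], [3.0], [0.0, 1.0]]

early = early_fusion(features, [0.5, 0.5, 1.0, 2.0, -1.0], 0.0)
late = late_fusion(
    features, [([1.0, 1.0], 0.0), ([2.0], 0.0), ([1.0, 3.0], 0.0)]
)
intermediate = intermediate_fusion(
    features,
    [([1.0, -1.0], 0.0), ([2.0], 0.0), ([1.0, 3.0], 0.0)],
    [1.0, 0.5, 1.0],
    0.0,
)
print(early, late, intermediate)
```

In a real model the weights would be learned, and the encoders in the intermediate case would typically output vectors rather than single numbers.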
Figure 4. Active learning reduces the burden of annotation.
In active (‘human in the loop’) learning, a pathologist first annotates small training areas representing tissue areas (for example, tumor, stroma and lymphocytes). Next, a machine learning classifier is trained from these expert annotations. Finally, the resulting labeled sample can be examined for misclassified regions, and the pathologist adds targeted additional training areas. This process is repeated until the classification is accurate and can be applied to multiple samples.
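The loop described in Figure 4 can be sketched as follows. This is a toy illustration under stated assumptions: a one-dimensional feature per tissue region, a nearest-centroid classifier in place of a real model, and an `oracle` function standing in for the pathologist. All names and values are hypothetical.

```python
def train_centroids(labelled):
    """Fit a nearest-centroid classifier: one mean feature value per label."""
    sums, counts = {}, {}
    for x, label in labelled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def active_learning(pool, oracle, seed, budget_per_round=2, max_rounds=10):
    # Step 1: the expert annotates a small set of initial training areas.
    labelled = [(x, oracle(x)) for x in seed]
    for round_number in range(1, max_rounds + 1):
        # Step 2: train a classifier on the current annotations.
        centroids = train_centroids(labelled)
        known = {x for x, _ in labelled}
        # Step 3: the expert reviews the predictions and flags errors.
        errors = [
            x for x in pool
            if x not in known and predict(centroids, x) != oracle(x)
        ]
        if not errors:
            return centroids, labelled, round_number
        # Step 4: add targeted annotations for some misclassified regions.
        labelled += [(x, oracle(x)) for x in errors[:budget_per_round]]
    return train_centroids(labelled), labelled, max_rounds


def oracle(x):
    # Stand-in for the pathologist: an assumed ground-truth rule.
    return "tumor" if x >= 5.0 else "stroma"


pool = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0, 10.0]
centroids, labelled, rounds = active_learning(pool, oracle, seed=[0.0, 7.0])
print(rounds, len(labelled))
```

Only the regions the classifier gets wrong are sent back for annotation, so the number of labelled regions stays well below the size of the pool.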
Figure 5. Recommender systems could learn from retrospective data to assist in clinical decision-making.
Logged healthcare data comprise multimodal patient contexts x, interventions y based on the standard of care (π0), and feedback δ based on the outcome of the intervention. Learning from such data is challenging because the data lack a two-arm design and are biased by the changing standard of care. Counterfactual recommender systems learn theoretically guaranteed unbiased policies from these data. Then, the validated policy π can be applied prospectively to support physicians’ management decisions.
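One standard way to evaluate a new policy π on data logged under π0 is inverse propensity scoring (IPS), which reweights each logged outcome δ by π(y|x) / π0(y|x). The caption does not name a specific estimator, so the sketch below is an assumption, and the contexts, interventions, rewards, and propensities are invented for illustration.

```python
def ips_value(logs, policy):
    """Estimate the average reward a policy would obtain on logged data.

    logs: list of (context, action, reward, logging_propensity) tuples,
    where logging_propensity is pi_0(action | context) and must be > 0.
    policy: function returning pi(action | context) as a probability.
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        total += reward * policy(context, action) / propensity
    return total / len(logs)


# Hypothetical logged records: (x, y, delta, pi_0(y | x)).
logs = [
    ("high_risk", "A", 1.0, 0.8),
    ("high_risk", "B", 0.0, 0.2),
    ("low_risk", "A", 0.0, 0.5),
    ("low_risk", "B", 1.0, 0.5),
]


def risk_adapted(context, action):
    # Deterministic candidate policy: A for high risk, B for low risk.
    preferred = "A" if context == "high_risk" else "B"
    return 1.0 if action == preferred else 0.0


def always_b(context, action):
    # Baseline candidate policy: always choose intervention B.
    return 1.0 if action == "B" else 0.0


print(ips_value(logs, risk_adapted), ips_value(logs, always_b))
```

The estimate is unbiased only when the logging propensities are known or well estimated and every intervention the new policy might choose had a nonzero chance of being logged.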
Figure 6. Class activation maps highlight the image areas most important for the model to make a decision.
Magnetic resonance imaging (MRI) images of the breast (A) before neoadjuvant chemotherapy with region of interest circled by a radiologist, (B) after neoadjuvant chemotherapy with region of interest circled by a radiologist, and (C) before neoadjuvant chemotherapy with class activation mapping by a neural network trained to predict response to therapy. Warmer colors indicate higher saliency. The images were obtained from The Cancer Imaging Archive (TCIA).
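A class activation map can be computed, in its basic form, as a weighted sum of the final convolutional feature maps, using the classifier weights of the class of interest. The caption does not say which variant was used for Figure 6, so the sketch below assumes this basic formulation; the feature maps and weights are tiny invented examples.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps, evaluated at every spatial location."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    return [
        [
            sum(w * fmap[r][c] for w, fmap in zip(class_weights, feature_maps))
            for c in range(width)
        ]
        for r in range(height)
    ]


def normalize(cam):
    """Rescale the map to [0, 1] so it can be rendered as a heat map."""
    lowest = min(min(row) for row in cam)
    highest = max(max(row) for row in cam)
    span = (highest - lowest) or 1.0  # avoid division by zero on flat maps
    return [[(value - lowest) / span for value in row] for row in cam]


# Two hypothetical 2x2 feature maps and the class weights attached to them.
feature_maps = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
class_weights = [1.0, -1.0]

cam = class_activation_map(feature_maps, class_weights)
saliency = normalize(cam)
print(cam)
```

In practice the low-resolution map is upsampled to the input image size and overlaid with a color scale, which gives the warm-colored regions described in the caption.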

