Review

Multimodal deep learning for biomedical data fusion: a review

Sören Richard Stahlschmidt et al. Brief Bioinform. 2022 Mar 10;23(2):bbab569. doi: 10.1093/bib/bbab569.

Abstract

Biomedical data are becoming increasingly multimodal and thereby capture the underlying complex relationships among biological processes. Deep learning (DL)-based data fusion strategies are a popular approach for modeling these nonlinear relationships. Therefore, we review the current state-of-the-art of such methods and propose a detailed taxonomy that facilitates more informed choices of fusion strategies for biomedical applications, as well as research on novel methods. By doing so, we find that deep fusion strategies often outperform unimodal and shallow approaches. Additionally, the proposed subcategories of fusion strategies show different advantages and drawbacks. The review of current methods has shown that, especially for intermediate fusion strategies, joint representation learning is the preferred approach as it effectively models the complex interactions of different levels of biological organization. Finally, we note that gradual fusion, based on prior biological knowledge or on search strategies, is a promising future research path. Similarly, utilizing transfer learning might overcome sample size limitations of multimodal data sets. As these data sets become increasingly available, multimodal DL approaches present the opportunity to train holistic models that can learn the complex regulatory dynamics behind health and disease.

Keywords: data integration; deep neural networks; fusion strategies; multi-omics; multimodal machine learning; representation learning.

Figures

Figure 1
Development of technologies and multimodal deep learning (DL). ‘Omics’ and ‘multi-omics’ data have become increasingly prominent in the scientific literature. To fully utilize the growing number of multimodal data sets, data fusion methods based on DL are evolving into an important approach in the biomedical field. This unprecedented generation of data has been made possible by high-throughput technologies such as microarrays and next-generation sequencing [7]. The development of bulk RNA-seq was followed by several related sequencing technologies, such as single-cell RNA-seq and ATAC-seq [8]. Currently, spatial transcriptomics [9] and single-cell multi-omics [10] are increasingly used.
Figure 2
DL-based fusion strategies. Layers marked in blue are shared between modalities and learn joint representations. (a) Early fusion strategies take as input a concatenated vector. No marginal representations are learned. (b) Intermediate fusion strategies first learn marginal representations and fuse these later inside the network. This can occur in one layer or gradually. (c) Late fusion strategies combine decisions by sub-models for each modality. Figure adapted from [2].
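The three strategies in Figure 2 can be sketched as minimal forward passes. This is an illustrative toy example, not code from the review: the two modalities, their dimensionalities, the single linear layers, and the averaging of decisions in late fusion are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy modalities with different dimensionalities (hypothetical sizes),
# e.g. a transcriptomic vector and a clinical-feature vector.
x_a = rng.normal(size=5)
x_b = rng.normal(size=3)

def relu(z):
    return np.maximum(z, 0.0)

# (a) Early fusion: concatenate raw inputs, then one shared network.
#     No marginal representations are learned.
W_early = rng.normal(size=(4, 8))
h_early = relu(W_early @ np.concatenate([x_a, x_b]))

# (b) Intermediate fusion: learn a marginal representation per modality,
#     then fuse them inside the network in a shared (joint) layer.
W_a, W_b = rng.normal(size=(4, 5)), rng.normal(size=(4, 3))
h_a, h_b = relu(W_a @ x_a), relu(W_b @ x_b)
W_joint = rng.normal(size=(4, 8))
h_joint = relu(W_joint @ np.concatenate([h_a, h_b]))

# (c) Late fusion: each modality's sub-model makes its own decision;
#     the decisions are combined, here by simple averaging.
w_dec_a, w_dec_b = rng.normal(size=4), rng.normal(size=4)
p_a = 1.0 / (1.0 + np.exp(-(w_dec_a @ h_a)))
p_b = 1.0 / (1.0 + np.exp(-(w_dec_b @ h_b)))
p_late = (p_a + p_b) / 2.0
```

In practice each `W` would be a trained multi-layer network, and late fusion often uses a learned combiner rather than a plain average; the sketch only fixes where in the pipeline the modalities meet.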
Figure 3
Early fusion strategies. (a) Unimodal vector stacking alternatives. dim(M) is the combined dimensionality of the set of modalities M. m is the number of modalities and t the number of steps. (b) Architecture of a regular AE for early fusion with fusion layer marked in blue. (c) Visualization of the assumptions underlying variational AEs.
Figure 4
Intermediate fusion strategies. (a) Joint intermediate fusion with shared layer in blue. Subsequent to marginal representations, joint representations are learned (top). In marginal intermediate fusion, marginal representations are directly input to the decision function (bottom). (b) Marginal AE where marginal representations are concatenated and input into a decision function. (c) Joint AE in which a joint representation is learned in the shared layer marked in blue.
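The joint autoencoder of Figure 4c can likewise be sketched as a forward pass: marginal encoders feed a shared bottleneck, and modality-specific decoders reconstruct each input from the joint representation. All layer sizes and the squared-error objective here are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy modalities (hypothetical dimensionalities).
x_a, x_b = rng.normal(size=6), rng.normal(size=4)

def relu(z):
    return np.maximum(z, 0.0)

# Marginal encoders: one per modality.
E_a, E_b = rng.normal(size=(3, 6)), rng.normal(size=(3, 4))
h_a, h_b = relu(E_a @ x_a), relu(E_b @ x_b)

# Shared layer: the joint representation of both modalities.
S = rng.normal(size=(2, 6))
z = relu(S @ np.concatenate([h_a, h_b]))

# Modality-specific decoders reconstruct each input from z.
D_a, D_b = rng.normal(size=(6, 2)), rng.normal(size=(4, 2))
x_a_hat, x_b_hat = D_a @ z, D_b @ z

# Training would minimize the summed reconstruction error over both
# modalities, forcing z to capture cross-modal structure.
loss = np.sum((x_a - x_a_hat) ** 2) + np.sum((x_b - x_b_hat) ** 2)
```

The marginal AE of Figure 4b differs only in that `h_a` and `h_b` are concatenated and passed straight to a decision function, with no shared layer `z` learned between them.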

References

    1. Maayan A. Complex systems biology. J R Soc Interface 2017;14(134):20170391.
    2. Ramachandram D, Taylor GW. Deep multimodal learning: a survey on recent advances and trends. IEEE Signal Process Mag 2017;34(6):96–108.
    3. Hall DL, Llinas J. An introduction to multisensor data fusion. Proc IEEE 1997;85(1):6–23.
    4. Durrant-Whyte HF. Sensor models and multisensor integration. Int J Robot Res 1988;7:97–113.
    5. Castanedo F. A review of data fusion techniques. Sci World J 2013;2013:704504.
