Robust multi-modal fusion architecture for medical data with knowledge distillation
- PMID: 39709743
- DOI: 10.1016/j.cmpb.2024.108568
Abstract
Background: The fusion of multi-modal data has been shown to significantly enhance the performance of deep learning models, particularly on medical data. However, missing modalities are common in medical data due to patient specificity, which poses a substantial challenge to the application of these models.
Objective: This study aimed to develop a novel and efficient multi-modal fusion framework for medical datasets that maintains consistent performance, even in the absence of one or more modalities.
Methods: In this paper, we fused three modalities: chest X-ray radiographs, history of present illness text, and tabular data such as demographics and laboratory tests. A multi-modal fusion module based on pooled bottleneck (PB) attention was proposed in conjunction with knowledge distillation (KD) for enhancing model inference in the case of missing modalities. In addition, we introduced a gradient modulation (GM) method to deal with the unbalanced optimization in multi-modal model training. Finally, we designed comparison and ablation experiments to evaluate the fusion effect, the model robustness to missing modalities, and the contribution of each component (PB, KD, and GM). The evaluation experiments were performed on the MIMIC-IV datasets with the task of predicting in-hospital mortality risk. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC).
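The abstract does not give the form of the KD objective, so the following is only a minimal sketch of a standard distillation loss for a binary outcome such as in-hospital mortality. It assumes a teacher that sees all modalities and a student that sees an incomplete subset. The names `distillation_loss`, `temperature`, and `alpha` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def _clip(p: float, eps: float = 1e-12) -> float:
    """Keep probabilities away from 0 and 1 so the logarithms stay finite."""
    return min(max(p, eps), 1.0 - eps)


def binary_cross_entropy(label: int, prob: float) -> float:
    """Cross-entropy between a 0/1 label and a predicted probability."""
    prob = _clip(prob)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def bernoulli_kl(p: float, q: float) -> float:
    """KL(p || q) between two Bernoulli distributions."""
    p, q = _clip(p), _clip(q)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def distillation_loss(
    student_logit: float,
    teacher_logit: float,
    label: int,
    temperature: float = 2.0,  # assumed value, not from the paper
    alpha: float = 0.5,        # assumed weighting, not from the paper
) -> float:
    """Weighted sum of a hard-label loss and a soft teacher-matching loss.

    teacher_logit: output of a model given all modalities (assumption).
    student_logit: output of a model given an incomplete set of modalities.
    """
    hard = binary_cross_entropy(label, sigmoid(student_logit))
    soft = bernoulli_kl(
        sigmoid(teacher_logit / temperature),
        sigmoid(student_logit / temperature),
    )
    # The temperature**2 factor is the usual rescaling of the soft term.
    return alpha * hard + (1.0 - alpha) * temperature**2 * soft


# A student that agrees with the teacher pays only the hard-label term.
agree = distillation_loss(student_logit=2.0, teacher_logit=2.0, label=1)
# A student that contradicts the teacher is penalized by both terms.
disagree = distillation_loss(student_logit=-2.0, teacher_logit=2.0, label=1)
```

In this sketch the soft term vanishes when student and teacher logits coincide, so the student is pulled toward the full-modality prediction only where the two differ.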
Results: The proposed multi-modal fusion framework achieved an AUROC of 0.886 and AUPRC of 0.459, significantly surpassing the performance of baseline models. Even when one or two modalities were missing, our model consistently outperformed the reference models. Ablation of each of the three components resulted in varying degrees of performance degradation, highlighting their distinct contributions to the model's overall effectiveness.
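For readers unfamiliar with the two reported metrics, the sketch below computes them on a toy example in plain Python. It assumes the pairwise-ranking definition of AUROC and uses average precision as the estimator of AUPRC. The abstract does not state which AUPRC estimator the authors used, and the labels and scores here are invented for illustration.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is O(P * N) and is meant
    for illustration, not for large cohorts.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each positive example.

    Tied scores are ordered arbitrarily by the sort, which is acceptable
    for a sketch but differs from threshold-based implementations.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_positives += 1
            total += true_positives / rank
    return total / n_pos


# Toy data (hypothetical): 1 = died in hospital, 0 = survived.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)              # 0.75
toy_auprc = average_precision(toy_labels, toy_scores)  # about 0.833
```

A random classifier has an expected AUPRC roughly equal to the positive-class prevalence, so AUPRC values should be read against the outcome rate of the cohort rather than against 0.5.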
Conclusions: This innovative multi-modal fusion architecture has demonstrated robustness to missing modalities, and has shown excellent performance in fusing three medical modalities for patient outcome prediction. This study provides a novel idea for addressing the challenge of missing modalities and has the potential to be scaled to additional modalities.
Keywords: Clinical prediction; Deep learning; Knowledge distillation; Missing modalities; Multi-modal fusion; Transformer.
Copyright © 2024. Published by Elsevier B.V.
Conflict of interest statement
Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.