A transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics
- PMID: 37308585
- DOI: 10.1038/s41551-023-01045-x
Abstract
During the diagnostic process, clinicians leverage multimodal information, such as the chief complaint, medical images and laboratory test results. Deep-learning models for aiding diagnosis have yet to meet this requirement of leveraging multimodal information. Here we report a transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner. Rather than learning modality-specific features, the model leverages embedding layers to convert images and unstructured and structured text into visual tokens and text tokens, and uses bidirectional blocks with intramodal and intermodal attention to learn holistic representations of radiographs, the unstructured chief complaint and clinical history, and structured clinical information such as laboratory test results and patient demographic information. The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary disease (by 12% and 9%, respectively) and in the prediction of adverse clinical outcomes in patients with COVID-19 (by 29% and 7%, respectively). Unified multimodal transformer-based models may help streamline the triaging of patients and facilitate the clinical decision-making process.
© 2023. The Author(s), under exclusive licence to Springer Nature Limited.
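The unified processing described in the abstract can be illustrated with a minimal sketch: an image and text are embedded into one token sequence, then passed through attention restricted to the same modality (intramodal) and attention across all tokens (intermodal). This is not the authors' implementation. The function names, the toy dimensions, the random weights, the sequential intramodal-then-intermodal ordering, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math
import random


def patchify(image, patch):
    """Split a 2-D list (H x W) into flattened non-overlapping patches."""
    h, w = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(patch) for j in range(patch)]
        for r in range(0, h, patch)
        for c in range(0, w, patch)
    ]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights (hypothetical, untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(vec, weights):
    """Project a vector with a weight matrix given as a list of output rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def embed_inputs(image, text_ids, patch, dim, vocab_size, rng):
    """Convert an image and text token ids into one sequence of tokens."""
    patch_proj = random_matrix(dim, patch * patch, rng)  # visual embedding layer
    text_table = random_matrix(vocab_size, dim, rng)     # text embedding table
    visual = [linear(p, patch_proj) for p in patchify(image, patch)]
    textual = [text_table[t] for t in text_ids]
    modalities = ["image"] * len(visual) + ["text"] * len(textual)
    return visual + textual, modalities


def attention(tokens, modalities, intramodal):
    """Single-head dot-product self-attention without learned projections.

    intramodal=True: each token attends only to tokens of its own modality.
    intramodal=False: each token attends to every token (intermodal).
    """
    dim = len(tokens[0])
    out = []
    for i, query in enumerate(tokens):
        keys = [
            j for j in range(len(tokens))
            if not intramodal or modalities[j] == modalities[i]
        ]
        scores = [
            sum(a * b for a, b in zip(query, tokens[j])) / math.sqrt(dim)
            for j in keys
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * tokens[j][d] for w, j in zip(weights, keys)) for d in range(dim)]
        )
    return out


rng = random.Random(0)
image = [[rng.random() for _ in range(4)] for _ in range(4)]  # toy 4x4 "radiograph"
text_ids = [3, 1, 4]  # toy ids standing in for chief complaint or lab values

tokens, modalities = embed_inputs(image, text_ids, patch=2, dim=8, vocab_size=10, rng=rng)
hidden = attention(tokens, modalities, intramodal=True)   # within each modality
hidden = attention(hidden, modalities, intramodal=False)  # across modalities
# Mean-pool the tokens into one holistic representation vector.
representation = [sum(col) / len(hidden) for col in zip(*hidden)]
```

In this toy run the 4x4 image yields four visual tokens and the text yields three text tokens, so the attention layers operate on a single seven-token sequence rather than on separate modality-specific feature extractors. A real model would use trained weights, multiple heads and stacked layers, and would feed the pooled representation to a diagnostic classification head.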
Similar articles
- Attention-based multimodal fusion with contrast for robust clinical prediction in the face of missing modalities. J Biomed Inform. 2023 Sep;145:104466. doi: 10.1016/j.jbi.2023.104466. Epub 2023 Aug 5. PMID: 37549722
- MESTrans: Multi-scale embedding spatial transformer for medical image segmentation. Comput Methods Programs Biomed. 2023 May;233:107493. doi: 10.1016/j.cmpb.2023.107493. Epub 2023 Mar 17. PMID: 36965298
- Accurately Differentiating Between Patients With COVID-19, Patients With Other Viral Infections, and Healthy Individuals: Multimodal Late Fusion Learning Approach. J Med Internet Res. 2021 Jan 6;23(1):e25535. doi: 10.2196/25535. PMID: 33404516 Free PMC article.
- Clinical assistant decision-making model of tuberculosis based on electronic health records. BioData Min. 2023 Mar 16;16(1):11. doi: 10.1186/s13040-023-00328-y. PMID: 36927471 Free PMC article.
- Predicting Semantic Similarity Between Clinical Sentence Pairs Using Transformer Models: Evaluation and Representational Analysis. JMIR Med Inform. 2021 May 26;9(5):e23099. doi: 10.2196/23099. PMID: 34037527 Free PMC article.
Cited by
- A deep learning model for personalized intra-arterial therapy planning in unresectable hepatocellular carcinoma: a multicenter retrospective study. EClinicalMedicine. 2024 Sep 5;75:102808. doi: 10.1016/j.eclinm.2024.102808. eCollection 2024 Sep. PMID: 39296944 Free PMC article.
- AI-Driven Diagnostic Assistance in Medical Inquiry: Reinforcement Learning Algorithm Development and Validation. J Med Internet Res. 2024 Aug 23;26:e54616. doi: 10.2196/54616. PMID: 39178403 Free PMC article.
- ModuCLIP: multi-scale CLIP framework for predicting foundation pit deformation in multi-modal robotic systems. Front Neurorobot. 2025 Apr 1;19:1544694. doi: 10.3389/fnbot.2025.1544694. eCollection 2025. PMID: 40236467 Free PMC article.
- China Protocol for early screening, precise diagnosis, and individualized treatment of lung cancer. Signal Transduct Target Ther. 2025 May 27;10(1):175. doi: 10.1038/s41392-025-02256-1. PMID: 40425545 Free PMC article. Review.
- A non-invasive prediction model for coronary artery stenosis severity based on multimodal data. Front Physiol. 2025 Jun 2;16:1592593. doi: 10.3389/fphys.2025.1592593. eCollection 2025. PMID: 40529992 Free PMC article.