Front Med (Lausanne). 2025 Jul 24:12:1600736.
doi: 10.3389/fmed.2025.1600736. eCollection 2025.

Med-DGTN: Dynamic Graph Transformer with Adaptive Wavelet Fusion for multi-label medical image classification

Guanyu Zhang et al. Front Med (Lausanne).

Abstract

Introduction: Multi-label classification of medical imaging data aims to enable simultaneous identification and diagnosis of multiple diseases, delivering comprehensive clinical decision support for complex conditions. Current methodologies demonstrate limitations in capturing disease co-occurrence patterns and preserving subtle pathological signatures. To address these challenges, we propose Med-DGTN, a dynamically integrated framework designed to advance multi-label classification performance in clinical imaging analytics.

Methods: The proposed Med-DGTN (Dynamic Graph Transformer Network with Adaptive Wavelet Fusion) introduces three key innovations: (1) A cross-modal alignment mechanism integrating convolutional visual patterns with graph-based semantic dependencies through conditionally reweighted adjacency matrices; (2) Wavelet-transform-enhanced dense blocks (WTDense) employing multi-frequency decomposition to amplify low-frequency pathological biomarkers; (3) An adaptive fusion architecture optimizing multi-scale feature hierarchies across spatial and spectral domains.
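To make "multi-frequency decomposition" concrete, here is a minimal sketch of a one-level 2D Haar wavelet transform in plain Python. The choice of the Haar wavelet, the function name `haar_dwt2`, and the sub-band names are assumptions made for illustration; the abstract does not specify which wavelet or implementation the WTDense blocks use.

```python
def haar_dwt2(image):
    """One-level 2D Haar transform of an even-sized 2D list of numbers.

    Returns four sub-bands, each half the height and width of the input:
    approx (low frequency), d_cols (differences across columns),
    d_rows (differences across rows), d_diag (diagonal differences).
    Illustrative sketch only, not the paper's WTConv operator.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    approx, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, g_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 patch [[p, q], [s, t]] is mapped to four coefficients.
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2)  # local average (low-pass)
            c_row.append((p - q + s - t) / 2)  # left minus right
            r_row.append((p + q - s - t) / 2)  # top minus bottom
            g_row.append((p - q - s + t) / 2)  # diagonal contrast
        approx.append(a_row)
        d_cols.append(c_row)
        d_rows.append(r_row)
        d_diag.append(g_row)
    return approx, d_cols, d_rows, d_diag


if __name__ == "__main__":
    print(haar_dwt2([[1, 2], [3, 4]]))
    # ([[5.0]], [[-1.0]], [[-2.0]], [[0.0]])
```

With the 1/2 scaling used here the transform preserves energy: the input above has squared sum 30, and so do the four coefficients (25 + 1 + 4 + 0). A constant patch puts all of its energy into the approximation band, which is the low-frequency component the abstract says the model amplifies.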

Results: Validated on two public medical imaging benchmarks, Med-DGTN performs strongly across modalities: (1) on the retinal imaging dataset (MuReD2022), it achieves a mean average precision (mAP) of 70.65%, surpassing previous state-of-the-art methods by 2.68 percentage points; (2) on the chest X-ray dataset (ChestXray14), it achieves an average area under the curve (AUC) of 0.841 and outperforms prior state-of-the-art methods in 5 of 14 disease categories.
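For readers unfamiliar with the mAP metric reported above, the following sketch shows one standard way to compute it for multi-label predictions: average precision (AP) is computed per label from the ranked scores, then averaged over labels. The function names and the toy data are hypothetical, and the paper's exact evaluation protocol (for example, tie handling or interpolation) is not stated in the abstract.

```python
def average_precision(scores, labels):
    """AP for one label: mean of precision@rank over the ranks of positives."""
    # Rank samples by descending score (stable for ties).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    # A label with no positive samples contributes 0.0 in this sketch.
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_matrix, label_matrix):
    """mAP: unweighted mean of per-label AP; rows are samples, columns labels."""
    n_labels = len(score_matrix[0])
    aps = []
    for k in range(n_labels):
        col_scores = [row[k] for row in score_matrix]
        col_labels = [row[k] for row in label_matrix]
        aps.append(average_precision(col_scores, col_labels))
    return sum(aps) / n_labels


if __name__ == "__main__":
    # Toy example: 3 images, 2 disease labels (made-up numbers).
    scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6]]
    labels = [[1, 0], [0, 1], [1, 1]]
    print(mean_average_precision(scores, labels))
```

In the toy example the first label has AP (1 + 2/3) / 2 = 5/6 and the second has AP 1, so the mAP is 11/12, about 0.917.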

Discussion: This investigation establishes that joint modeling of dynamic disease correlations and wavelet-optimized feature representation significantly enhances multi-label diagnostic capabilities. Med-DGTN's architecture demonstrates clinical translatability by revealing disease interaction patterns through interpretable graph structures, potentially informing precision diagnostics in multi-morbidity scenarios.

Keywords: Dynamic Graph Transformer; deep learning; medical image analysis; multi-label classification; wavelet transform.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Multi-label dependency diagram of a fundus image. This diagram illustrates the co-occurrence relationships among various eye diseases. The labels “choroidal neovascularization” and “age-related macular degeneration” are connected by an edge, indicating a high probability of concurrent occurrence.
Figure 2. Overall framework of our Med-DGTN model for multi-label medical image classification. The WTDense block is composed of six non-linear combination layers, incorporating batch normalization, ReLU activation, and WTConv. The outputs from these layers are concatenated across channels. WTDenseNet uses cascaded WTDense block modules as its core structure to extract multi-level image features. These features are then combined with classifiers generated through graph convolution to produce the final predictions.
Figure 3. Schematic diagram of correlation matrix generation in the DAME module. The module processes the co-occurrence matrix M: conditional probability modeling quantifies disease relationships, and noise filtering and reweighting then reduce spurious correlations. Finally, a GAT layer is applied to better capture structural information related to disease-label nodes, producing a refined graph structure P with enriched node representations.
Figure 4. Schematic diagram of the WTDense block module. This module employs a dense connection pattern. Within each WTDense block, a WTConv operation follows BN and ReLU. The feature maps generated by each layer are concatenated along the channel dimension and serve as the input to the subsequent layers.
Figure 5. Heatmap of the adjacency matrix for MuReD2022. This heatmap illustrates the correlation strengths between the fundus disease labels within the dataset. The darker the color, the stronger the correlation between the corresponding labels.
Figure 6. Heatmap of the adjacency matrix for ChestXray14. This heatmap illustrates the inter-label correlation patterns among the disease labels within this dataset.
Figure 7. Loss and mAP curves. The figure shows the progression of the model’s loss and mAP across training epochs on the training and validation sets. The blue solid line is the training loss, the red dashed line the training mAP, the green solid line the validation loss, and the purple dashed line the validation mAP.
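
The correlation-matrix steps named in the Figure 3 caption (conditional probability modeling, noise filtering, reweighting) can be sketched as follows. This is one common recipe for building a label graph from co-occurrence counts, not necessarily the authors' formulation: the threshold `tau`, the neighbor weight `p`, the function names, and the toy labels are all assumptions introduced here, and the subsequent GAT layer is omitted.

```python
def cooccurrence_counts(label_rows):
    """M[i][j] = number of samples where labels i and j are both present.

    The diagonal M[i][i] is therefore the number of samples with label i.
    """
    n = len(label_rows[0])
    M = [[0] * n for _ in range(n)]
    for row in label_rows:
        active = [k for k, v in enumerate(row) if v == 1]
        for i in active:
            for j in active:
                M[i][j] += 1
    return M


def reweighted_adjacency(M, tau=0.4, p=0.25):
    """Build a row-normalized adjacency from co-occurrence counts.

    1. Conditional probability: P(j | i) = M[i][j] / M[i][i].
    2. Noise filtering: keep edge i -> j only if P(j | i) >= tau (assumed).
    3. Reweighting: kept neighbors share weight p; the node keeps 1 - p
       (assumed scheme). A node with no kept neighbors keeps weight 1.
    """
    n = len(M)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        count_i = M[i][i]
        kept = [j for j in range(n)
                if j != i and count_i > 0 and M[i][j] / count_i >= tau]
        for j in kept:
            A[i][j] = p / len(kept)
        A[i][i] = 1.0 - p if kept else 1.0
    return A


if __name__ == "__main__":
    # Toy data: 4 images, 3 labels (made-up values).
    labels = [[1, 1, 0], [1, 1, 0], [1, 0, 0], [0, 0, 1]]
    M = cooccurrence_counts(labels)
    for row in reweighted_adjacency(M):
        print(row)
```

Because P(j | i) generally differs from P(i | j), the filtered graph can be asymmetric: in the toy data, raising `tau` to 0.8 drops the edge from label 0 to label 1 (conditional probability 2/3) while keeping the edge from label 1 to label 0 (conditional probability 1).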
