[Preprint]. 2024 Oct 22:2023.08.24.23294574.
doi: 10.1101/2023.08.24.23294574.

A Cross-Modal Mutual Knowledge Distillation Framework for Alzheimer's Disease Diagnosis: Addressing Incomplete Modalities


Min Gu Kwak et al. medRxiv.


Abstract

Early detection of Alzheimer's Disease (AD) is crucial for timely interventions and optimizing treatment outcomes. Despite the promise of integrating multimodal neuroimages such as MRI and PET, handling datasets with incomplete modalities remains under-researched. This phenomenon, however, is common in real-world scenarios as not every patient has all modalities due to practical constraints such as cost, access, and safety concerns. We propose a deep learning framework employing cross-modal Mutual Knowledge Distillation (MKD) to model different sub-cohorts of patients based on their available modalities. In MKD, the multimodal model (e.g., MRI and PET) serves as a teacher, while the single-modality model (e.g., MRI only) is the student. Our MKD framework features three components: a Modality-Disentangling Teacher (MDT) model designed through information disentanglement, a student model that learns from classification errors and MDT's knowledge, and the teacher model enhanced via distilling the student's single-modal feature extraction capabilities. Moreover, we show the effectiveness of the proposed method through theoretical analysis and validate its performance with simulation studies. In addition, our method is demonstrated through a case study with Alzheimer's Disease Neuroimaging Initiative (ADNI) datasets, underscoring the potential of artificial intelligence in addressing incomplete multimodal neuroimaging datasets and advancing early AD detection.

Note to practitioners: This paper was motivated by the challenge of early AD diagnosis, particularly when clinicians encounter varied availability of patient imaging data, such as MRI and PET scans, often constrained by cost or accessibility. We propose an incomplete multimodal learning framework that produces tailored models for patients with only MRI and for patients with both MRI and PET. Through bi-directional knowledge transfer, this approach improves the accuracy and effectiveness of early AD diagnosis, especially when imaging resources are limited. We introduce a teacher model that prioritizes extracting the information common to the different modalities, which significantly enhances the student model's learning process. The paper includes a theoretical analysis, a simulation study, and a real-world case study that illustrate the method's potential in early AD detection. However, practitioners should be mindful of the complexities involved in model tuning. Future work will focus on improving model interpretability and expanding the framework's application: developing methods to identify the brain regions that drive predictions, which would enhance clinical trust, and extending the framework to a broader range of imaging modalities, demographic information, and clinical data. These advancements aim to provide a more comprehensive view of patient health and improve diagnostic accuracy across various neurodegenerative diseases.
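The bi-directional transfer described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a common formulation in which the student minimizes cross-entropy plus a weighted KL divergence to temperature-softened teacher outputs, and in which the teacher's reverse step adds a feature-matching (mean squared error) penalty toward the student's single-modal features. The function names, the temperature, and the way `alpha_s` and `alpha_t` enter the losses are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mse(a, b):
    """Mean squared error between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def student_forward_loss(student_logits, teacher_logits, label,
                         alpha_s=0.5, temperature=2.0):
    # Assumed form: classification error + alpha_s * distillation from the teacher.
    ce = cross_entropy(softmax(student_logits), label)
    kd = kl_divergence(softmax(teacher_logits, temperature),
                       softmax(student_logits, temperature))
    return ce + alpha_s * kd

def teacher_reverse_loss(teacher_logits, label, teacher_mri_features,
                         student_features, alpha_t=0.5):
    # Assumed form: classification error + alpha_t * feature matching
    # toward the student's single-modal (e.g., MRI) features.
    ce = cross_entropy(softmax(teacher_logits), label)
    return ce + alpha_t * mse(teacher_mri_features, student_features)
```

When the student's logits already match the teacher's, the distillation term vanishes and the student loss reduces to plain cross-entropy; the weights `alpha_s` and `alpha_t` control how strongly each model is pulled toward the other.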

Keywords: Alzheimer’s disease; incomplete multimodal datasets; knowledge distillation; mild cognitive impairment; representation disentanglement.


Figures

Fig. 1:
A conceptual depiction of the proposed MKD framework. The teacher, trained with more modalities, achieves higher accuracy, while the student, trained with fewer modalities but more samples, has better single-modal representation extraction capability. Both models enhance each other by mutually exchanging their respective strengths.
Fig. 2:
Categorization of KD methods: (a) Complex-to-simple standard KD within the same modality, (b) Single-to-single cross-modal KD, and (c) Multi-to-single cross-modal KD.
Fig. 3:
Teacher and student model architectures and training process in the MKD framework. MDT/MDT+: upper architecture without/with dashed lines; MDT-Student: bottom architecture. First, the teacher is designed to disentangle modality-common and modality-specific information and uses the modality-common information for classification (MDT). Second, the student learns from the teacher through forward KD (MDT-Student). Last, the teacher is updated by learning the student’s feature extraction capability through reverse KD and by adding modality-specific information (MDT+).
Fig. 4:
UMAP visualization of the modality-common ($z_c^a$, $z_c^b$) and modality-specific ($z_s^a$, $z_s^b$) representations in the simulation test set.
Fig. 5:
Hyperparameter tuning results for MKD. (a) $\alpha_{\mathrm{sim}}$ and $\alpha_{\mathrm{diff}}$ for MDT-Student, (b) $\alpha_{\mathrm{recon}}$ for MDT-Student, (c) $\alpha_s$ for MDT-Student, and (d) $\alpha_t$ for MDT+. The highest AUROC value for each experiment is highlighted in dark blue.
Fig. 6:
Comparisons of validation CE loss for MDT models with (blue) and without (red) decoder. Shaded regions represent standard deviations above and below the respective mean values. MDT without decoder overfits, with the CE loss increasing after 30 epochs.
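
The selection rule in the Fig. 5 caption (keep the hyperparameter setting with the highest AUROC) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' tuning code: AUROC is computed with the standard pairwise-ranking definition, and the helper names, candidate weight values, labels, and scores are all made up for the example.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_setting(results):
    """Pick the hyperparameter value whose validation AUROC is highest.

    `results` maps a candidate value to (labels, scores) on a validation set.
    """
    aurocs = {value: auroc(y, s) for value, (y, s) in results.items()}
    best = max(aurocs, key=aurocs.get)
    return best, aurocs

# Toy validation outputs for two hypothetical candidate weights.
toy_labels = [0, 0, 1, 1]
toy_results = {
    0.1: (toy_labels, [0.10, 0.40, 0.35, 0.80]),
    1.0: (toy_labels, [0.10, 0.20, 0.70, 0.90]),
}
best_value, all_aurocs = best_setting(toy_results)
```

On this toy data the candidate 1.0 separates the two classes perfectly and is selected. The pairwise definition is quadratic in the number of samples, which is acceptable for small validation sets; a sort-based computation would be preferable at scale.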

