Front Dement. 2024:3:1332928.
doi: 10.3389/frdem.2024.1332928. Epub 2024 Feb 12.

Alzheimer's disease detection using data fusion with a deep supervised encoder

Minh Trinh et al. Front Dement. 2024.

Abstract

Alzheimer's disease (AD) is affecting a growing number of individuals. As a result, there is a pressing need for accurate and early diagnosis methods. This study aims to achieve this goal by developing an optimal data analysis strategy to enhance computational diagnosis. Although various modalities of AD diagnostic data are collected, past research on computational methods of AD diagnosis has mainly focused on using single-modal inputs. We hypothesize that integrating, or "fusing," various data modalities as inputs to prediction models could enhance diagnostic accuracy by offering a more comprehensive view of an individual's health profile. However, a potential challenge arises as this fusion of multiple modalities may result in significantly higher-dimensional data. We hypothesize that employing suitable dimensionality reduction methods across heterogeneous modalities would not only help diagnosis models extract latent information but also enhance accuracy. Therefore, it is imperative to identify optimal strategies for both data fusion and dimensionality reduction. In this paper, we have conducted a comprehensive comparison of over 80 statistical machine learning methods, considering various classifiers, dimensionality reduction techniques, and data fusion strategies to assess our hypotheses. Specifically, we have explored three primary strategies: (1) Simple data fusion, which involves straightforward concatenation (fusion) of datasets before inputting them into a classifier; (2) Early data fusion, in which datasets are concatenated first, and then a dimensionality reduction technique is applied before feeding the resulting data into a classifier; and (3) Intermediate data fusion, in which dimensionality reduction methods are applied individually to each dataset before the reduced datasets are concatenated and fed into a classifier.
For dimensionality reduction, we have explored several commonly used techniques such as principal component analysis (PCA), autoencoder (AE), and LASSO. Additionally, we have implemented a new dimensionality-reduction method called the supervised encoder (SE), which involves slight modifications to standard deep neural networks. Our results show that SE substantially improves prediction accuracy compared to PCA, AE, and LASSO, especially in combination with intermediate fusion for multiclass diagnosis prediction.
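
The three fusion strategies described above differ only in where dimensionality reduction sits relative to concatenation. The following is a minimal sketch of that difference, not the authors' pipeline: it assumes NumPy, uses PCA (via SVD) as the only reduction method, and runs on random stand-in data. The function names, modality sizes, and component counts are hypothetical choices for illustration, and no classifier is fitted.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # (n_samples, n_components)


def simple_fusion(modalities):
    """(1) Concatenate the raw feature blocks column-wise."""
    return np.hstack(modalities)


def early_fusion(modalities, n_components):
    """(2) Concatenate first, then reduce the fused matrix."""
    return pca_reduce(np.hstack(modalities), n_components)


def intermediate_fusion(modalities, n_components_each):
    """(3) Reduce each modality separately, then concatenate."""
    return np.hstack([pca_reduce(X, n_components_each) for X in modalities])


# Random stand-ins for three modalities measured on the same 50 subjects
# (hypothetical sizes; not the study's data).
rng = np.random.default_rng(0)
modalities = [
    rng.normal(size=(50, 40)),
    rng.normal(size=(50, 12)),
    rng.normal(size=(50, 6)),
]

X_simple = simple_fusion(modalities)           # 40 + 12 + 6 = 58 columns
X_early = early_fusion(modalities, 5)          # 5 columns
X_inter = intermediate_fusion(modalities, 3)   # 3 per modality = 9 columns
print(X_simple.shape, X_early.shape, X_inter.shape)
```

Each returned matrix would then be passed to a classifier. In a real evaluation the reduction step should be fitted on training data only and then applied to held-out data, which this sketch omits for brevity.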

Keywords: Alzheimer’s biomarkers; Alzheimer’s disease; data integration; diagnosis prediction; dimensionality reduction; multimodal fusion; multiview data integration.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Overview of dimensionality reduction methods. (A) Principal component analysis (PCA), an unsupervised feature extraction method. (B) Autoencoder (AE) architecture for latent feature extraction. Conceptually, the AE can be interpreted as a nonlinear PCA. (C) LASSO, which applies a penalty term to identify the most useful features (with nonzero coefficients) in a supervised prediction problem. (D) Supervised encoder (SE) based on a deep neural network, with the latent features extracted from the bottleneck prior to the output layer. The SE extends the benefits of a supervised approach, like the LASSO, by finding more complex and nonlinear latent features.
Figure 2
Pipelines. Pipelines (a–c) are models with different types of single-modal input. Pipelines (d–f) are models utilizing different forms of multimodal data fusion.
Figure 3
Comparing different fusion methods. We compare the top ranked fusion pipelines for binary (left) and multiclass (right) classification across DR methods. Intermediate fusion with SE consistently ranked as the top model in both classification tasks. All binary classification tasks used logistic regression as the classifier, and the multiclass models are labeled with the classifier used (RF, random forest; NN, neural network).
Figure 4
Comparison across all models evaluated. We report the top ten models for binary (left) and multiclass (right) classification across fusion and DR methods. Ensemble model results are indicated by hashed bars, and the remaining results are from single-model pipelines. Intermediate fusion with SE consistently ranked as the top model. Models using single-modal CSF data did not rank among the top 10 models for either classification task. All binary classification tasks used logistic regression as the classifier, and the multiclass models are labeled with the classifier used (RF, random forest; NN, neural network).
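
The supervised encoder in Figure 1 (D) is described as a neural network trained on the diagnosis labels, with latent features read from the bottleneck just before the output layer. The sketch below illustrates that idea only; it is not the authors' architecture. It assumes NumPy, a single tanh bottleneck layer rather than a deep network, plain full-batch gradient descent, and synthetic data. All names and hyperparameters are hypothetical.

```python
import numpy as np


def train_supervised_encoder(X, y, n_latent=2, n_classes=2,
                             lr=0.5, epochs=500, seed=0):
    """Fit input -> tanh bottleneck -> softmax output by gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_latent))
    b1 = np.zeros(n_latent)
    W2 = rng.normal(scale=0.1, size=(n_latent, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                      # one-hot labels

    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                  # bottleneck activations
        logits = H @ W2 + b2
        logits -= logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)         # softmax probabilities

        # Backpropagate the mean cross-entropy loss.
        d_logits = (P - Y) / n
        dW2 = H.T @ d_logits
        db2 = d_logits.sum(axis=0)
        dZ = (d_logits @ W2.T) * (1.0 - H ** 2)   # tanh derivative
        dW1 = X.T @ dZ
        db1 = dZ.sum(axis=0)

        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, b1, W2, b2


def encode(X, W1, b1):
    """Return the bottleneck (latent) features for the rows of X."""
    return np.tanh(X @ W1 + b1)


# Synthetic data: 200 samples, 10 features, label driven by the first two.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

W1, b1, W2, b2 = train_supervised_encoder(X, y)
latent = encode(X, W1, b1)                        # supervised latent features
train_acc = float(((latent @ W2 + b2).argmax(axis=1) == y).mean())
print(latent.shape)
```

Unlike PCA or an autoencoder, the projection here is shaped by the labels, which is the property the caption attributes to the SE. The `latent` matrix would replace the raw features as input to a downstream classifier.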
