EBioMedicine. 2023 Nov:97:104820.
doi: 10.1016/j.ebiom.2023.104820. Epub 2023 Oct 7.

Circular-SWAT for deep learning based diagnostic classification of Alzheimer's disease: application to metabolome data

Taeho Jo et al. EBioMedicine. 2023 Nov.

Abstract

Background: Deep learning has shown potential in various scientific domains but faces challenges when applied to complex, high-dimensional multi-omics data. Alzheimer's Disease (AD) is a neurodegenerative disorder that lacks targeted therapeutic options. This study introduces the Circular-Sliding Window Association Test (c-SWAT) to improve the classification accuracy in predicting AD using serum-based metabolomics data, specifically lipidomics.

Methods: The c-SWAT methodology builds upon the existing Sliding Window Association Test (SWAT) and utilizes a three-step approach: feature correlation analysis, feature selection, and classification. Data from 997 participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) served as the basis for model training and validation. Feature correlations were analyzed using Weighted Gene Co-expression Network Analysis (WGCNA), and Convolutional Neural Networks (CNN) were employed for feature selection. Random Forest was used for the final classification.
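As an illustration only, the sketch below approximates the first step of this workflow (feature correlation analysis and grouping) in Python. WGCNA itself is an R package, so correlated-feature modules are stood in for here by hierarchical clustering of a correlation-based distance; the data matrix, labels, and module count are placeholder assumptions, not the ADNI lipidomics data used in the study.

```python
# Illustrative sketch of step 1 (feature correlation analysis / grouping).
# WGCNA is an R package; hierarchical clustering of a correlation-based
# distance is used here as a rough stand-in. Data and module count are
# placeholders, not the ADNI lipidomics data used in the study.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))      # placeholder metabolite matrix (samples x features)
y = rng.integers(0, 2, size=200)    # placeholder AD (1) / CN (0) labels

corr = np.corrcoef(X, rowvar=False)                  # pairwise feature correlations
dist = squareform(1.0 - np.abs(corr), checks=False)  # condensed distance matrix
modules = fcluster(linkage(dist, method="average"), t=6, criterion="maxclust")
print(np.bincount(modules)[1:])                      # number of features per module
```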

Findings: The application of c-SWAT resulted in a classification accuracy of up to 80.8% and an AUC of 0.808 for distinguishing AD from cognitively normal older adults. This marks a 9.4 percentage-point improvement in accuracy and a 0.169 increase in AUC compared to methods without c-SWAT. These results were statistically significant, with a p-value of 1.04 × 10⁻⁴. The approach also identified key lipids associated with AD, such as Cer(d16:1/22:0) and PI(37:6).

Interpretation: Our results indicate that c-SWAT is effective in improving classification accuracy and in identifying potential lipid biomarkers for AD. These identified lipids offer new avenues for understanding AD and warrant further investigation.

Funding: Details of the funding for this study are provided in the acknowledgements section.

Keywords: Alzheimer's disease; Deep learning; Lipidomics; Machine learning; Metabolomics.

Conflict of interest statement

Declaration of interests Dr. Saykin receives support from multiple NIH grants (P30 AG010133, P30 AG072976, R01 AG019771, R01 AG057739, U19 AG024904, R01 LM013463, R01 AG068193, T32 AG071444, U01 AG068057, U01 AG072177, and U19 AG074879). He has also received support from Avid Radiopharmaceuticals, a subsidiary of Eli Lilly (in kind contribution of PET tracer precursor); Siemens Medical Solutions USA, Inc. (Dementia Advisory Board); NIH NHLBI (MESA Observational Study Monitoring Board); Eisai (Scientific Advisory Board); NIH/NIA: External Advisory Committees, Multiple NIH-funded centers/programs; and Springer-Nature Publishing (Editorial Office Support as Editor-in-Chief, Brain Imaging and Behavior). Dr. Kaddurah-Daouk receives support from multiple NIH grants (3U01AG061359, 1RF1AG059093, 1RF1AG058942, 5U19AG063744, 3U19AG063744-04S1, 1R01AG069901, 3U01AG061359-05S1). She is an inventor on a series of patents related to metabolomics signatures in neuropsychiatric diseases. Dr. Kaddurah-Daouk holds equity and stock in Metabolon, Inc., and PsyProtix, which were not involved in this study. Matthias Arnold is coinventor (through Duke University/Helmholtz Zentrum München) on patents on applications of metabolomics in diseases of the central nervous system. Matthias Arnold also holds equity in Chymia LLC and IP in PsyProtix and Atai that is unrelated to this work. The other authors declare no conflict of interest.

Figures

Fig. 1
The figure illustrates the original Sliding Window Association Test (SWAT) in Genome-Wide Association Studies (GWAS). SWAT begins by partitioning the entire genome into smaller, nonoverlapping fragments. For every fragment, SWAT employs a sliding window technique in conjunction with a Convolutional Neural Network (CNN) to compute a phenotype influence score (PIS) for each Single Nucleotide Polymorphism (SNP). This computation considers ‘w’, the number of SNPs in a fragment, and ‘Sk’, the position of each SNP. By distinguishing SNPs with significant PIS values, SWAT efficiently identifies phenotype-associated genetic variants.
Fig. 2
Overall structure of c-SWAT. The phenotype influence score (PIS) for the feature groups was calculated as shown in (a): sliding windows of varying sizes cover all feature groups except one to perform the classification prediction, thereby determining the importance of the excluded group. WGCNA was used to define the feature groups, as shown in (b). Based on these results and the lipid classes, a PIS for each metabolite was calculated and used to classify AD.
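As a rough illustration of the leave-one-group-out idea in panel (a), the sketch below scores each feature module by the change in cross-validated accuracy when that module is excluded. A Random Forest stands in for the CNN classifier used in the paper, and the data, labels, and module assignments are placeholders.

```python
# Minimal sketch of a leave-one-group-out influence score: train on all feature
# groups except one and compare the resulting accuracy with the all-feature
# baseline. A Random Forest stands in for the CNN used in the paper; data,
# labels, and module assignments are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))         # placeholder metabolite matrix
y = rng.integers(0, 2, size=200)       # placeholder AD/CN labels
modules = rng.integers(1, 7, size=60)  # placeholder module labels (1-6)

def accuracy(columns):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    return cross_val_score(clf, X[:, columns], y, cv=5).mean()

baseline = accuracy(np.ones(60, dtype=bool))
# Influence of a module = drop in accuracy when that module is left out.
influence = {m: baseline - accuracy(modules != m) for m in np.unique(modules)}
print(sorted(influence.items(), key=lambda kv: kv[1], reverse=True))
```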
Fig. 3
An overview of how a deep learning approach was implemented in steps 1 and 3. Our model utilizes three main hidden layers, with the number of nodes in these layers optimized from 32 down to 8 using a grid search approach. The classification between AD and CN was performed with top-ranked features from each group using the CNN algorithm, and the performance was assessed by a 5-fold cross-validation.
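The sketch below illustrates, under placeholder assumptions, the kind of network and grid search described here: a small 1-D convolutional model followed by three hidden layers whose first width is chosen over a {32, 16, 8} grid with 5-fold cross-validation. The exact architecture, training epochs, and data are illustrative and not taken from the paper.

```python
# Hedged sketch of the classifier described in Fig. 3: a small 1-D CNN with
# three hidden dense layers, the first hidden width chosen by a grid search
# over {32, 16, 8} and evaluated with 5-fold CV. All details are assumptions.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 22, 1)).astype("float32")  # e.g. 22 top-ranked features
y = rng.integers(0, 2, size=200).astype("float32")   # placeholder AD/CN labels

def build_model(width):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=X.shape[1:]),
        tf.keras.layers.Conv1D(16, kernel_size=3, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(width, activation="relu"),
        tf.keras.layers.Dense(max(width // 2, 8), activation="relu"),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

def cv_accuracy(width, folds=5):
    accs = []
    for train, test in StratifiedKFold(folds, shuffle=True, random_state=0).split(X, y):
        model = build_model(width)
        model.fit(X[train], y[train], epochs=30, batch_size=16, verbose=0)
        accs.append(model.evaluate(X[test], y[test], verbose=0)[1])
    return float(np.mean(accs))

best_width = max([32, 16, 8], key=cv_accuracy)  # simple grid search
print("selected hidden-layer width:", best_width)
```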
Fig. 4
Visualization of AD/CN classification results. (a) Bar graph in which the y-axis represents the average accuracy of a 10-fold cross-validation. With c-SWAT, the Random Forest model classified AD from CN with a highest accuracy of 0.807 using 22 features, compared with an accuracy of 0.714 when the same number of features was selected at random without PIS. (b) The y-axis shows AD/CN classification accuracy for each feature subset, with and without c-SWAT, for subsets ranging from the top 1 to 781 features. The outer circle represents the number of metabolite features used. Blue dots indicate classification accuracy when the PIS results from c-SWAT are incorporated, while red dots represent cases without c-SWAT.
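As a schematic of the comparison in panel (a), the snippet below contrasts a Random Forest trained on the top-k features of some ranking (standing in for the PIS ranking) against the same model trained on k randomly chosen features, using 10-fold cross-validation; the data and the ranking itself are placeholders.

```python
# Sketch of the Fig. 4 comparison: Random Forest accuracy on the top-k features
# of a ranking (a stand-in for the PIS ranking) versus k randomly chosen
# features, each assessed with 10-fold cross-validation. Data are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 100))        # placeholder metabolite matrix
y = rng.integers(0, 2, size=300)       # placeholder AD/CN labels
ranking = rng.permutation(X.shape[1])  # placeholder feature ranking

def rf_accuracy(columns):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    return cross_val_score(clf, X[:, columns], y, cv=10).mean()

k = 22
print("top-k accuracy:   ", rf_accuracy(ranking[:k]))
print("random-k accuracy:", rf_accuracy(rng.choice(X.shape[1], size=k, replace=False)))
```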
Fig. 5
Performance comparison between top features, randomly selected features, and least associated features in AD classification. (a) The Receiver Operating Characteristic (ROC) curve illustrates the classification capability using the top features selected by c-SWAT, randomly selected features, and the least associated features determined by c-SWAT. The highest average Area Under the Curve (AUC) from a 10-fold cross-validation for the top features reached 0.808 with 22 features. When randomly selecting the same number of 22 features, the AUC was 0.639. In comparison, the AUC for the same number of least associated features was significantly lower at 0.478. (b) The Precision-Recall (PR) curve showcases the predictive performance of these feature sets, including the top features selected by c-SWAT, randomly selected features, and the least associated features determined by c-SWAT. The top features consistently exhibited higher precision and recall values relative to the randomly selected and least associated features, underscoring their enhanced predictive proficiency in AD classification.
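For completeness, the short sketch below shows one standard way to obtain the ROC and precision-recall summaries plotted in this figure, using cross-validated probability estimates from a Random Forest; the data and the 22-feature subset are placeholders rather than the study's lipidomics features.

```python
# Sketch of ROC / precision-recall evaluation with cross-validated probability
# estimates from a Random Forest; data and feature subset are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (average_precision_score, precision_recall_curve,
                             roc_auc_score, roc_curve)
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 22))    # e.g. a 22-feature subset
y = rng.integers(0, 2, size=300)  # placeholder AD/CN labels

clf = RandomForestClassifier(n_estimators=200, random_state=0)
proba = cross_val_predict(clf, X, y, cv=10, method="predict_proba")[:, 1]

fpr, tpr, _ = roc_curve(y, proba)                # points for the ROC curve
prec, rec, _ = precision_recall_curve(y, proba)  # points for the PR curve
print("ROC AUC:", roc_auc_score(y, proba))
print("PR AUC (average precision):", average_precision_score(y, proba))
```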
Fig. 6
Classification accuracy across Alzheimer's Disease stages. (a) Bar chart illustrating the area under the curve (AUC) values for the classification among different Alzheimer's disease stages. (b) ROC curves representing classification performance for various disease stage comparisons, with each curve displaying the relationship between the true positive rate and false positive rate.
