Transl Psychiatry. 2023 Jun 29;13(1):232.
doi: 10.1038/s41398-023-02531-1.

Classification and deep-learning-based prediction of Alzheimer disease subtypes by using genomic data

Daichi Shigemizu et al. Transl Psychiatry. 2023.

Abstract

Late-onset Alzheimer's disease (LOAD) is the most common multifactorial neurodegenerative disease among elderly people. LOAD is heterogeneous, and the symptoms vary among patients. Genome-wide association studies (GWAS) have identified genetic risk factors for LOAD but not for LOAD subtypes. Here, we examined the genetic architecture of LOAD based on Japanese GWAS data from 1947 patients and 2192 cognitively normal controls in a discovery cohort and 847 patients and 2298 controls in an independent validation cohort. Two distinct groups of LOAD patients were identified. One was characterized by major risk genes for developing LOAD (APOC1 and APOC1P1) and immune-related genes (RELB and CBLC). The other was characterized by genes associated with kidney disorders (AXDND1, FBP1, and MIR2278). Subsequent analysis of albumin and hemoglobin values from routine blood test results suggested that impaired kidney function could lead to LOAD pathogenesis. We developed a prediction model for LOAD subtypes using a deep neural network, which achieved an accuracy of 0.694 (2870/4137) in the discovery cohort and 0.687 (2162/3145) in the validation cohort. These findings provide new insights into the pathogenic mechanisms of LOAD.
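The reported accuracies follow directly from the counts given in parentheses. As a quick arithmetic check, here is a minimal Python sketch; the function name `accuracy` is illustrative and does not come from the paper.

```python
def accuracy(n_correct: int, n_total: int) -> float:
    """Fraction of correctly classified samples."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_correct / n_total


# Counts as reported in the abstract.
discovery = accuracy(2870, 4137)
validation = accuracy(2162, 3145)

print(f"discovery:  {discovery:.3f}")   # discovery:  0.694
print(f"validation: {validation:.3f}")  # validation: 0.687
```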

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Disconnectivity graph.
The energy landscape was visualized by using 3D (a) and 2D (b) disconnectivity graphs, in which all samples were classified into two groups. LOAD cases and cognitively normal (CN) subjects are represented by red and green spheres, respectively, and sphere size corresponds to the difference between LOAD and CN frequencies at each node. b Gray circles represent nodes with the same frequency in LOAD and CN subjects.
Fig. 2. Validation of expression of the genes closest to the association signals using qRT-PCR.
a The expression of six genes in blood (red) and brain tissues (yellow) was checked in the Human Protein Atlas (HPA) database. The x-axis represents transcript expression values, given as normalized transcripts per million (nTPM). b RNA-seq data from 126 subjects and 151 subjects were available from the NCGG Biobank database for RELB and FBP1, respectively. The effects of the association signals on the expression of their nearby genes were examined by using linear regression adjusted for age and sex. The association signal rs146190016 significantly increased RELB expression (p = 0.019), but the signal rs550833079 did not significantly change FBP1 expression (p = 0.37). c We used quantitative RT-PCR (qRT-PCR) data to validate the eQTL results for these genes (RELB and FBP1). The results were consistent with the RNA-seq results for both genes. Data in (b) and (c) are represented as box and whisker plots, depicting minimum, lower quartile (Q1), median (Q2), upper quartile (Q3), and maximum values.
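The covariate-adjusted regression in panel (b) can be illustrated with ordinary least squares. This is only a sketch under stated assumptions: the caption does not name the software, the genotype coding, or the sex coding, so the additive 0/1/2 allele count, the 0/1 sex coding, the function name `fit_adjusted_effect`, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_adjusted_effect(expression, genotype, age, sex):
    """Ordinary least squares for expression ~ genotype + age + sex.

    Returns coefficients in the order (intercept, genotype, age, sex).
    """
    expression = np.asarray(expression, dtype=float)
    design = np.column_stack([
        np.ones_like(expression),            # intercept
        np.asarray(genotype, dtype=float),   # assumed 0/1/2 allele count
        np.asarray(age, dtype=float),        # covariate
        np.asarray(sex, dtype=float),        # covariate, assumed 0/1 coding
    ])
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    return coef


# Synthetic, noise-free data purely for illustration (not from the study).
rng = np.random.default_rng(0)
n = 50
genotype = rng.integers(0, 3, size=n)
age = rng.uniform(60, 90, size=n)
sex = rng.integers(0, 2, size=n)
expression = 2.0 + 0.5 * genotype + 0.01 * age - 0.3 * sex

coef = fit_adjusted_effect(expression, genotype, age, sex)
print(np.round(coef, 3))
```

The second coefficient is the genotype effect after adjusting for age and sex. A real analysis would also report its standard error and p-value, which this sketch omits.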
Fig. 3. Assessment of kidney function using routine blood test results.
We examined five markers of kidney function measured in routine blood tests (creatinine, cystatin C, eGFR, albumin [Alb], and hemoglobin [Hb]). The differences between LOAD and CN subjects were tested with the Wilcoxon rank-sum test. Data are represented as box and whisker plots, depicting minimum, lower quartile (Q1), median (Q2), upper quartile (Q3), and maximum values. *FDR < 0.05, **FDR < 0.001.
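The significance stars are defined on FDR-adjusted p-values. The caption does not say which FDR procedure was used; the sketch below assumes the Benjamini-Hochberg procedure and applies the caption's two thresholds. The function names and the input p-values are hypothetical placeholders, not results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def stars(fdr):
    """Map an FDR value to the caption's annotation."""
    if fdr < 0.001:
        return "**"
    if fdr < 0.05:
        return "*"
    return ""


# Placeholder raw p-values for five unnamed markers (invented for illustration).
raw = [0.0001, 0.2, 0.03, 0.00001, 0.6]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p={p:<8g} FDR={q:.5f} {stars(q)}")
```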
Fig. 4. A LOAD subtype prediction model.
a We applied a deep neural network with six hidden layers of 512 neurons each, with RReLU activation, 50% dropout, and a batch size of 32, to predict PC scores from variant data of the discovery cohort. b The networks were trained for 100 to 1500 epochs, in steps of 100 epochs, using the discovery cohort. The best model achieved an accuracy of 0.694 at 900 epochs in the discovery cohort and 0.687 in the independent validation cohort.
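The architecture described in panel (a) could be sketched in PyTorch as follows. This is not the authors' implementation: the caption does not state the framework, the input dimension, the number of predicted PC scores, or where dropout sits relative to the activation, so `n_inputs`, `n_outputs`, the layer ordering, and the function name `build_subtype_network` are assumptions made for illustration.

```python
import torch
from torch import nn


def build_subtype_network(n_inputs: int, n_outputs: int,
                          hidden_width: int = 512, n_hidden: int = 6,
                          dropout: float = 0.5) -> nn.Sequential:
    """Fully connected network: six hidden layers of 512 units by default."""
    layers = []
    width_in = n_inputs
    for _ in range(n_hidden):
        layers += [
            nn.Linear(width_in, hidden_width),
            nn.RReLU(),           # randomized leaky ReLU
            nn.Dropout(dropout),  # 50% dropout; placement is an assumption
        ]
        width_in = hidden_width
    layers.append(nn.Linear(width_in, n_outputs))  # linear output for PC scores
    return nn.Sequential(*layers)


# Hypothetical sizes: 100 variant features in, 2 PC scores out.
model = build_subtype_network(n_inputs=100, n_outputs=2)
model.eval()  # disables dropout and fixes the RReLU negative slope
with torch.no_grad():
    batch = torch.randn(32, 100)  # one batch of 32 samples, as in the caption
    scores = model(batch)
print(scores.shape)  # torch.Size([32, 2])
```

Training over 100 to 1500 epochs and selecting the best checkpoint, as in panel (b), would wrap this model in a standard optimization loop; the loss function and optimizer are not given in the caption and are left out here.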
