Genome-wide association neural networks identify genes linked to family history of Alzheimer's disease

Upamanyu Ghose¹ ², William Sproviero¹, Laura Winchester¹, Najaf Amin³, Taiyu Zhu¹, Danielle Newby¹ ⁴, Brittany S Ulm² ³, Angeliki Papathanasiou¹, Liu Shi¹ ⁵, Qiang Liu¹ ⁶, Marco Fernandes¹ ⁷, Cassandra Adams² ⁸, Ashwag Albukhari² ⁹, Majid Almansouri² ¹⁰, Hani Choudhry² ⁹, Cornelia van Duijn² ³, Alejo Nevado-Holgado¹ ²

Affiliations

¹ Department of Psychiatry, University of Oxford, Oxford, United Kingdom.
² King Abdulaziz University and the University of Oxford Centre for Artificial Intelligence in Precision Medicine (KO-CAIPM), Jeddah, Saudi Arabia.
³ Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom.
⁴ Centre for Statistics in Medicine, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, United Kingdom.
⁵ Department of Translational Medicine, Nxera Pharma UK Limited, Cambridge, United Kingdom.
⁶ School of Engineering Mathematics and Technology University of Bristol, Ada Lovelace Building, Bristol, United Kingdom.
⁷ School of Medicine, University of St Andrews, St Andrews, United Kingdom.
⁸ Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
⁹ Biochemistry Department, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia.
¹⁰ Clinical Biochemistry Department, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia.

PMID: 39775791
PMCID: PMC11707606
DOI: 10.1093/bib/bbae704

Brief Bioinform. 2024 Nov 22;26(1):bbae704.

Abstract

Augmenting traditional genome-wide association studies (GWAS) with advanced machine learning algorithms can allow the detection of novel signals in available cohorts. We introduce "genome-wide association neural networks (GWANN)", a novel approach that uses neural networks (NNs) to perform a gene-level association study with family history of Alzheimer's disease (AD). In UK Biobank, we defined cases (n = 42 110) as those with AD or family history of AD and sampled an equal number of controls. The data were split 80:20 into training and testing samples; GWANN was trained on the former, and associated genes were identified from its performance on the latter. Our method identified 18 genes associated with family history of AD. Of these, APOE, BIN1, SORL1, ADAM10, APH1B, and SPI1 have been identified by previous AD GWAS. Among the 12 new genes, PCDH9, NRG3, ROR1, LINGO2, SMYD3, and LRRC7 have been associated with neurofibrillary tangles or phosphorylated tau in previous studies. Furthermore, there is evidence for differential transcriptomic or proteomic expression between AD and healthy brains for 10 of the 12 new genes. A series of post hoc analyses resulted in a significantly enriched protein-protein interaction network (P-value < 1 × 10⁻¹⁶), and enrichment of relevant disease and biological pathways such as focal adhesion (P-value = 1 × 10⁻⁴), extracellular matrix organization (P-value = 1 × 10⁻⁴), Hippo signaling (P-value = 7 × 10⁻⁴), Alzheimer's disease (P-value = 3 × 10⁻⁴), and impaired cognition (P-value = 4 × 10⁻³). Applying NNs for GWAS illustrates their potential to complement existing algorithms and methods and to enable the discovery of new associations without the need to expand existing cohorts.
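The control sampling and 80:20 split described in the abstract can be illustrated with a short sketch. This is not the authors' pipeline: the function name `balanced_split`, the sample identifiers, the random seed, and the choice to split cases and controls separately (a stratified split) are all assumptions made for illustration.

```python
import random

def balanced_split(case_ids, control_pool, train_frac=0.8, seed=0):
    """Sample one control per case, then split each group into train/test.

    Assumption: the split is stratified by label; the abstract only
    states an 80:20 ratio of training to testing samples.
    """
    rng = random.Random(seed)
    # Draw as many controls as there are cases, without replacement.
    controls = rng.sample(control_pool, len(case_ids))
    train, test = [], []
    for label, ids in ((1, list(case_ids)), (0, controls)):
        rng.shuffle(ids)
        cut = int(round(train_frac * len(ids)))
        train += [(sample_id, label) for sample_id in ids[:cut]]
        test += [(sample_id, label) for sample_id in ids[cut:]]
    return train, test

# Toy identifiers (hypothetical); the study itself had 42 110 cases.
cases = [f"case_{i}" for i in range(100)]
pool = [f"ctrl_{i}" for i in range(1000)]
train, test = balanced_split(cases, pool)
print(len(train), len(test))
```

Splitting each label group separately keeps the 1:1 case-to-control ratio in both partitions, which makes test-set performance easier to compare across genes.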

Keywords: Alzheimer’s disease; GWAS; UK Biobank; artificial intelligence; machine learning; neural networks.

Figures

**Figure 1**
NN architecture used in the GWANN method. The top-left branch generates a 1D encoding from the SNP input, while the bottom-left branch does so for the covariate input. The right trunk merges the encodings of both branches to output whether the input belongs to cases or controls.
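The two-branch layout in the Figure 1 caption can be sketched as a forward pass in plain Python. Everything beyond "two branches produce encodings that a trunk merges into a case/control output" is an assumption: the dense layers, ReLU activations, layer sizes, random weights, and toy inputs below are illustrative and are not taken from the GWANN implementation.

```python
import math
import random

def dense_relu(x, weights, biases):
    """Fully connected layer followed by ReLU (assumed layer type)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(snps, covariates, params):
    """Encode each input in its own branch, merge, and output P(case)."""
    snp_encoding = dense_relu(snps, *params["snp_branch"])
    cov_encoding = dense_relu(covariates, *params["cov_branch"])
    merged = snp_encoding + cov_encoding  # concatenate the 1D encodings
    (w_row,), (bias,) = params["trunk"]
    logit = sum(w * v for w, v in zip(w_row, merged)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid

rng = random.Random(0)
N_SNPS, N_COVS, SNP_ENC, COV_ENC = 50, 3, 8, 4  # hypothetical sizes
params = {
    "snp_branch": init_layer(SNP_ENC, N_SNPS, rng),
    "cov_branch": init_layer(COV_ENC, N_COVS, rng),
    "trunk": init_layer(1, SNP_ENC + COV_ENC, rng),
}
snps = [rng.choice([0, 1, 2]) for _ in range(N_SNPS)]  # toy genotype dosages
covariates = [0.3, 1.0, -0.5]                          # toy covariate values
p_case = forward(snps, covariates, params)
print(f"P(case) = {p_case:.3f}")
```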

**Figure 2**
Manhattan plot after running GWANN on family history of AD. Significant hits were identified at an empirically determined threshold of P-value < 1 × 10⁻²⁵ (top dotted line). After calculating the LD between significant genes, the gene with the best NLL within an LD block was identified as the hit gene. The P-values lower than 6.95 × 10⁻¹⁵⁹ have been cropped to a value of 6.95 × 10⁻¹⁶⁰. The bottom dotted line marks the Bonferroni-corrected threshold for the number of gene windows that were tested, P-value = 7.06 × 10⁻⁷.
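The Bonferroni threshold in this caption follows from dividing a family-wise error rate by the number of tests. The sketch below assumes a family-wise error rate of 0.05, which the caption does not state, and back-calculates the approximate number of gene windows that the reported threshold would imply under that assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

REPORTED_THRESHOLD = 7.06e-7   # from the Figure 2 caption
EMPIRICAL_THRESHOLD = 1e-25    # from the Figure 2 caption
ALPHA = 0.05                   # assumption: not stated in the caption

# Number of gene windows implied by the reported threshold, given ALPHA.
implied_windows = round(ALPHA / REPORTED_THRESHOLD)
print(f"implied gene windows: about {implied_windows}")
print(f"recomputed threshold: {bonferroni_threshold(ALPHA, implied_windows):.3g}")
```

Under this assumption the implied count is on the order of 70 000 gene windows, and the empirical threshold used to call hits is many orders of magnitude stricter than the Bonferroni line.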

**Figure 3**
Overlap of GWANN hits with previous studies. (a) Heatmap showing the count of previous GWAS where the GWANN hits were identified to be associated with the phenotypes on the x-axis. (b) Heatmap showing the presence of significant evidence for the terms on the x-axis for the GWANN hits. (c) Heatmap showing the overlap between GWANN hits (GWANN), a GWAS run using PLINK 2.0 on the same data as GWANN (TradGWAS), and the largest European AD GWAS (EADB GWAS) [12]. (d) Similar heatmap to (c) but instead of using the GWANN and TradGWAS hits, the top 100 genes from both methods were considered for the overlap with the EADB GWAS hit genes. The sample size of each method is mentioned in the x-axis of the heatmaps, and the diagonals show the number of genes of each method considered while calculating the overlap.

**Figure 4**
*Post hoc* enrichment analysis after GWANN analysis. (a–d) Gene set enrichment analysis for (a) Reactome, (b) Wiki, (c) KEGG, and (d) GO using GWANN summary metrics. (e, f) Genes were ranked according to the metric 1 − NLL_NN, where NLL_NN was the negative log likelihood of the neural network for a given gene. (e) Disease and trait enrichment using the top 100 genes. (f) Enriched PPI network (P-value < 1 × 10⁻¹⁶) for the top 100 genes. The colors within the nodes highlight the trait categories enriched by the protein encoded by the gene.
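The gene ranking in panels (e) and (f) can be sketched as follows. The caption gives only the metric, one minus the network's negative log likelihood (NLL), so the per-sample averaging, the gene names, and the toy predictions below are assumptions for illustration.

```python
import math

def mean_nll(y_true, p_pred, eps=1e-12):
    """Mean binary negative log likelihood (assumed to be averaged per sample)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def rank_genes(nll_by_gene):
    """Order genes by 1 - NLL, highest first (i.e. lowest NLL first)."""
    return sorted(nll_by_gene, key=lambda g: 1.0 - nll_by_gene[g], reverse=True)

# Toy test-set labels and predicted case probabilities for two
# hypothetical gene windows.
labels = [1, 1, 0, 0]
predictions = {
    "GENE_A": [0.9, 0.8, 0.2, 0.1],  # informative predictions
    "GENE_B": [0.5, 0.5, 0.5, 0.5],  # uninformative predictions
}
nll_by_gene = {g: mean_nll(labels, p) for g, p in predictions.items()}
ranking = rank_genes(nll_by_gene)
print(ranking)
```

A network that always predicts 0.5 has an NLL of ln 2 (about 0.693), so on balanced data a gene scoring clearly above 1 − ln 2 on this metric is doing better than chance.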

References

1. Gauthier S, Rosa-Neto P, Morais JA, et al. World Alzheimer Report 2021: journey through the diagnosis of dementia. 2021.
2. Querfurth HW, LaFerla FM. Alzheimer's disease. N Engl J Med 2010;362:329–44. doi: 10.1056/NEJMra0909142.
3. Buniello A, MacArthur JAL, Cerezo M, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res 2019;47:D1005–12. doi: 10.1093/nar/gky1120.
4. Manolio TA, Collins FS, Cox NJ, et al. Finding the missing heritability of complex diseases. Nature 2009;461:747–53. doi: 10.1038/nature08494.
5. Visscher PM, Brown MA, McCarthy MI, et al. Five years of GWAS discovery. Am J Hum Genet 2012;90:7–24. doi: 10.1016/j.ajhg.2011.11.029.

Grants and funding

ARUK-PhD2022-031/Alzheimer's Research UK
