Data mining of high density genomic variant data for prediction of Alzheimer's disease risk
- PMID: 22273362
- PMCID: PMC3355044
- DOI: 10.1186/1471-2350-13-7
Abstract
Background: Discovering genetic associations is an important step in understanding human illness and deriving disease pathways. Identifying multiple interacting genetic mutations associated with disease remains a challenge in studying the etiology of complex diseases. Genome-wide association studies (GWAS) have recently confirmed that new single nucleotide polymorphisms (SNPs) at genes implicated in immune response, cholesterol/lipid metabolism, and cell membrane processes are associated with late-onset Alzheimer's disease (LOAD), yet a portion of AD heritability remains unexplained. We used data mining methods to search for other genetic variants that may influence LOAD risk.
Methods: We devised two approaches to select SNPs associated with LOAD in a publicly available GWAS data set consisting of three cohorts. In both approaches, single-locus analysis (logistic regression) was used to filter the data at a p-value threshold less conservative than the Bonferroni threshold; the resulting subset of SNPs was then used in multi-locus analysis (random forest (RF)). In the second approach, we also took prior biological knowledge into account and performed sample stratification and linkage disequilibrium (LD) analysis, in addition to logistic regression analysis, to preselect loci for the RF classifier construction step.
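The single-locus filtering step can be illustrated with a minimal sketch. It is not the authors' code: the data are simulated, the relaxed threshold of 0.05 is a hypothetical choice (the abstract does not state the value used), and the helper names `fit_single_locus` and `filter_snps` are invented for illustration. Each SNP is tested on its own with a one-predictor logistic regression and a Wald test, and the SNPs passing a relaxed threshold are compared with those passing the Bonferroni threshold.

```python
import math
import random

def fit_single_locus(genotypes, status, iters=25):
    """Fit logit(P(case)) = b0 + b1 * genotype by Newton-Raphson.
    Returns (b1, two-sided Wald p-value for b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0  # gradient and information matrix
        for x, y in zip(genotypes, status):
            eta = max(-30.0, min(30.0, b0 + b1 * x))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-9 * (h00 * h11 + 1e-12):
            return 0.0, 1.0  # uninformative SNP (e.g. monomorphic)
        # Newton step: beta += H^-1 * gradient (2x2 inverse written out)
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    se = math.sqrt(h00 / det)  # standard error of b1
    z = b1 / se
    return b1, math.erfc(abs(z) / math.sqrt(2.0))

def filter_snps(snp_columns, status, alpha):
    """Return indices of SNPs whose single-locus p-value is below alpha."""
    return [j for j, col in enumerate(snp_columns)
            if fit_single_locus(col, status)[1] < alpha]

# Simulated data (assumption): 600 samples, 50 SNPs coded 0/1/2,
# with SNP 0 made truly associated with case status.
rng = random.Random(0)
n_samples, n_snps = 600, 50
snps = [[int(rng.random() < 0.3) + int(rng.random() < 0.3)
         for _ in range(n_samples)] for _ in range(n_snps)]
status = [int(rng.random() < 1.0 / (1.0 + math.exp(1.0 - 1.5 * x)))
          for x in snps[0]]

bonferroni_alpha = 0.05 / n_snps  # conservative multiple-testing threshold
relaxed_alpha = 0.05              # hypothetical, less conservative threshold
bonferroni_hits = filter_snps(snps, status, bonferroni_alpha)
relaxed_hits = filter_snps(snps, status, relaxed_alpha)
print(len(bonferroni_hits), "SNPs pass Bonferroni;",
      len(relaxed_hits), "pass the relaxed filter")
```

The relaxed filter always retains at least the SNPs that pass Bonferroni. The additional SNPs it admits are the ones a multi-locus method such as RF can then evaluate jointly.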
Results: The first approach yielded 199 SNPs mostly associated with genes in calcium signaling, cell adhesion, endocytosis, immune response, and synaptic function. Together with APOE and GAB2 SNPs, these SNPs formed a predictive subset for LOAD status with an average error of 9.8% using 10-fold cross validation (CV) in RF modeling. The second approach identified nineteen variants in LD with ST5, TRPC1, ATG10, ANO3, NDUFA12, and NISCH, genes linked directly or indirectly with neurobiology. These variants were part of a model that also included APOE and GAB2 SNPs and predicted LOAD risk with a 10-fold CV average error of 17.5% in the classification modeling.
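The 10-fold CV average error quoted above is the mean of the misclassification rates over ten held-out folds. The sketch below shows that calculation only; it is not the authors' pipeline. The data are simulated, and a nearest-centroid classifier (`train_nearest_centroid`, a hypothetical stand-in) takes the place of the random forest so the example stays dependency-free.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle sample indices and deal them into k near-equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_average_error(X, y, train_fn, k=10, seed=0):
    """Average held-out misclassification rate over k folds (needs n >= k)."""
    folds = k_fold_indices(len(y), k, random.Random(seed))
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        predict = train_fn([X[i] for i in train], [y[i] for i in train])
        wrong = sum(predict(X[i]) != y[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return sum(errors) / k

def train_nearest_centroid(X, y):
    """Stand-in classifier (assumption): predict the class whose mean
    genotype vector is closest to the sample. The paper used RF instead."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    def predict(x):
        return min(centroids, key=lambda lab: sum(
            (a - b) ** 2 for a, b in zip(x, centroids[lab])))
    return predict

# Simulated data (assumption): 200 samples, 5 SNPs coded 0/1/2, with
# allele frequency 0.8 in cases and 0.2 in controls at every SNP.
rng = random.Random(1)
y = [i % 2 for i in range(200)]
X = [[int(rng.random() < (0.8 if lab else 0.2)) +
      int(rng.random() < (0.8 if lab else 0.2)) for _ in range(5)]
     for lab in y]

avg_error = cv_average_error(X, y, train_nearest_centroid, k=10)
print(f"10-fold CV average error: {avg_error:.1%}")
```

Because every sample is held out exactly once, the average reflects performance on data the classifier did not see during training, which is what makes the reported 9.8% and 17.5% figures comparable across the two approaches.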
Conclusions: With the two proposed approaches, we identified a large subset of SNPs in genes mostly clustered around specific pathways/functions and a smaller set of SNPs, within or in proximity to five genes not previously reported, that may be relevant for the prediction/understanding of AD.