Integrative machine learning approach to risk prediction for dementia and Alzheimer's disease
- PMID: 40864401
- DOI: 10.1007/s11357-025-01828-x
Abstract
Dementia, particularly Alzheimer's disease (AD), presents a growing global health challenge characterized by cognitive decline, behavioral changes, and loss of independence. With increasing life expectancy, early diagnosis and improved clinical strategies are urgently needed. This study developed and evaluated machine learning (ML) models to predict AD risk using UK Biobank data, integrating health, genetic, and lifestyle factors. The cohort included 2878 AD cases and 72,366 controls. Among several algorithms, CatBoost performed best (ROC-AUC = 0.773), especially in females. Inputs included ICD-10 codes from the 5 years before diagnosis, ApoE-ε4 genotype, and a large collection of modifiable risk factors. Despite having fewer cases, the risk prediction models for vascular dementia (VaD) outperformed the AD-specific models. ApoE-ε4 was the most predictive genetic marker, while other common variants had limited utility. Key non-genetic predictors included comorbidities (e.g., diabetes, hypertension), education, physical activity, and diet. These findings highlight the value of integrating diverse data sources for dementia risk prediction and emphasize the role of sex-specific modeling and modifiable factors in early, personalized intervention strategies.
Keywords: APOE; AUC; Feature selection; GWAS; PWAS; SHAP values; UK Biobank.
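As a reading aid for the ROC-AUC figure reported in the abstract, the sketch below computes ROC-AUC from scratch as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control (ties count as one half). It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0 (control) or 1 (case)
    scores: iterable of predicted risk scores, same length as labels
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    ranks = [0.0] * n

    # Assign 1-based ranks, giving tied scores their average rank.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")

    # U statistic for the cases, normalised by the number of case/control pairs.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy data (made up, not from the study): two controls, two cases.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

Under this reading, an ROC-AUC of 0.773 means that a randomly chosen case is ranked above a randomly chosen control about 77% of the time; a value of 0.5 corresponds to chance-level ranking.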
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethics: The study was approved by the University Committee for the Use of Human Subjects in Research Approval number 12072022 (July 2025). This study uses the UK-Biobank (UKB) application ID 26664 (Linial lab). Competing interests: The authors declare no competing interests.