2021 Oct 21;12(11):1661.
doi: 10.3390/genes12111661.

Pairwise Correlation Analysis of the Alzheimer's Disease Neuroimaging Initiative (ADNI) Dataset Reveals Significant Feature Correlation

Erik D Huckvale et al. Genes (Basel).

Abstract

The Alzheimer's Disease Neuroimaging Initiative (ADNI) contains extensive patient measurements (e.g., magnetic resonance imaging [MRI], biometrics, RNA expression) from Alzheimer's disease (AD) cases and controls that have recently been used by machine learning algorithms to evaluate AD onset and progression. While using a variety of biomarkers is essential to AD research, highly correlated input features can significantly decrease machine learning model generalizability and performance. Additionally, redundant features unnecessarily increase the computational time and resources necessary to train predictive models. Therefore, we used 49,288 biomarkers and 793,600 extracted MRI features to assess feature correlation within the ADNI dataset and to determine the extent to which this issue might impact large-scale analyses using these data. We found that 93.457% of biomarkers, 92.549% of the gene expression values, and 100% of MRI features were strongly correlated with at least one other feature in ADNI based on our Bonferroni-corrected α (p-value ≤ 1.40754 × 10^-13). We provide a comprehensive mapping of all ADNI biomarkers to highly correlated features within the dataset. Additionally, we show that significant correlation within the ADNI dataset should be resolved before performing bulk data analyses, and we provide recommendations to address these issues. We anticipate that these recommendations and resources will help guide researchers utilizing the ADNI dataset to increase model performance and reduce the cost and complexity of their analyses.
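
The Bonferroni-corrected threshold quoted above can be sanity-checked with a short calculation. The sketch below is illustrative only and rests on two assumptions the abstract does not state: a family-wise α of 0.05, and one test per unique pair among the 49,288 + 793,600 features. The function name `bonferroni_alpha` is hypothetical. Under those assumptions, the result agrees with the reported value to the precision given.

```python
from math import comb


def bonferroni_alpha(n_features: int, family_alpha: float = 0.05) -> float:
    """Per-test significance threshold for all unique feature pairs.

    Assumption (not stated in the abstract): one test per unordered pair
    of features, and a family-wise alpha of 0.05.
    """
    n_tests = comb(n_features, 2)  # number of unique pairs, n * (n - 1) / 2
    return family_alpha / n_tests


# Feature counts taken from the abstract; summing them is our assumption.
n_features = 49_288 + 793_600
n_tests = comb(n_features, 2)
threshold = bonferroni_alpha(n_features)

print(f"features: {n_features:,}")
print(f"pairwise tests: {n_tests:,}")
print(f"per-test alpha: {threshold:.5e}")  # 1.40754e-13 under these assumptions
```

With roughly 3.6 × 10^11 pairwise tests, an uncorrected α of 0.05 would be expected to flag billions of pairs by chance alone, which is why a per-test threshold this strict is needed.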

Keywords: ADNI; Alzheimer’s disease; feature reduction; machine learning; pairwise feature correlation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Creation of the MRI domain from the MRI slice sequences using the trained convolutional autoencoders. A separate autoencoder was trained for each MRI slice, and the latent space was concatenated for each person to create a row specific to that individual.

