PeerJ Comput Sci. 2021 May 25;7:e560.
doi: 10.7717/peerj-cs.560. eCollection 2021.

Deep learning prediction of mild cognitive impairment conversion to Alzheimer's disease at 3 years after diagnosis using longitudinal and whole-brain 3D MRI

Ethan Ocasio et al. PeerJ Comput Sci.

Abstract

Background: While there is no cure for Alzheimer's disease (AD), early diagnosis and accurate prognosis of AD may enable or encourage lifestyle changes, neurocognitive enrichment, and interventions to slow the rate of cognitive decline. The goal of our study was to develop and evaluate a novel deep learning algorithm to predict mild cognitive impairment (MCI) to AD conversion at three years after diagnosis using longitudinal and whole-brain 3D MRI.

Methods: This retrospective study consisted of 320 normal cognition (NC), 554 MCI, and 237 AD patients. Longitudinal data included T1-weighted 3D MRI obtained at initial presentation with a diagnosis of MCI and at 12-month follow-up. Whole-brain 3D MRI volumes were used without a priori segmentation of regional structural volumes or cortical thicknesses. MRIs of the AD and NC cohorts were used to train a deep learning classification model to obtain weights, which were then applied via transfer learning to predict MCI patient conversion to AD at three years post-diagnosis. Two transfer learning methods (zero-shot and fine-tuning) were evaluated. Three convolutional neural network (CNN) architectures (sequential, residual bottleneck, and wide residual) were compared. Data were split 75%/25% for training and testing, respectively, with 4-fold cross-validation. Prediction performance was evaluated using balanced accuracy. Heatmaps were generated.
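Balanced accuracy, the evaluation metric named above, is the mean of per-class recalls (for binary labels, the average of sensitivity and specificity), which keeps a majority-class-only predictor from scoring well on imbalanced cohorts. A minimal sketch in plain Python (toy labels, not the study's data):

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall: robust to class imbalance."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

# Imbalanced toy set: 8 stable (0) vs 2 progressing (1); one progressor missed.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]
print(balanced_accuracy(y_true, y_pred))  # 0.75 = (8/8 + 1/2) / 2
```

Plain accuracy on the same toy example would be 0.9, illustrating why the balanced form is the better summary for unequal sMCI/pMCI group sizes.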

Results: The sequential convolutional approach yielded slightly better performance than the residual-based architecture, the zero-shot transfer learning approach yielded better performance than fine-tuning, and a CNN using longitudinal data performed better than a CNN using a single-timepoint MRI in predicting MCI conversion to AD. The best CNN model for predicting MCI conversion to AD at three years after diagnosis yielded a balanced accuracy of 0.793. Heatmaps of the prediction model highlighted the regions most relevant to the network, including the lateral ventricles, periventricular white matter, and cortical gray matter.

Conclusions: This is the first convolutional neural network model using longitudinal and whole-brain 3D MRIs without extracting regional brain volumes or cortical thicknesses to predict future MCI to AD conversion at 3 years after diagnosis. This approach could lead to early prediction of patients who are likely to progress to AD and thus may lead to better management of the disease.

Keywords: Artificial intelligence; Convolutional neural networks; Dementia; Machine learning; Magnetic resonance imaging; Neuroimaging.

Conflict of interest statement

The authors declare there are no competing interests.

Figures

Figure 1
Figure 1. Overview of experimental design.
The network was first trained on AD and NC MRI data (a classification task) to obtain weights for transfer learning (blue). After training, the weights were transferred to the prediction task (green) to predict whether patients would remain stable or progress within three years. Two transfer learning methods were studied. With zero-shot, no further training was performed after the transfer, so the MCI images were analyzed for prediction by the network with the same weights copied over from the classification task. With fine-tuning, after weights were copied over from the classification task for initialization, additional training was performed on the MCI image data.
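The two transfer strategies differ only in whether training continues after the weight copy. A schematic sketch in plain Python, with a stand-in `train` step whose dynamics are purely illustrative (none of these names come from the paper's code):

```python
import copy

def train(model, data, epochs=1):
    # Stand-in for gradient-based training: nudge each weight toward the
    # data mean (illustrative dynamics only, not real backpropagation).
    mean = sum(data) / len(data)
    for _ in range(epochs):
        model["w"] = [w + 0.5 * (mean - w) for w in model["w"]]
    return model

# 1. Classification task (AD vs NC analogue) produces the source weights.
source = train({"w": [0.0, 0.0]}, data=[1.0, 3.0], epochs=10)

# 2a. Zero-shot: copy the weights and predict with no further training.
zero_shot = copy.deepcopy(source)

# 2b. Fine-tuning: copy the weights, then keep training on the new task's data.
fine_tuned = train(copy.deepcopy(source), data=[4.0, 6.0], epochs=10)

print(zero_shot["w"] == source["w"])   # True: weights unchanged after transfer
print(fine_tuned["w"] == source["w"])  # False: weights adapted to new data
```

The contrast in the last two lines is the whole distinction the figure draws: zero-shot reuses the classification weights verbatim, while fine-tuning treats them only as an initialization.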
Figure 2
Figure 2. Single and dual time point CNN architecture.
(A) Single-timepoint CNN. For classification, input consisted of a single-timepoint whole-subject 3D MRI of patients diagnosed at baseline as either AD or NC, and output was a binary classification of AD vs NC. For prediction, input was a single-timepoint whole-subject 3D MRI of patients diagnosed with MCI, and output was a binary prediction of whether the patient progressed (pMCI) or remained stable (sMCI) 3 years later. (B) Dual-timepoint CNN. Input included 3D MRI images obtained at both baseline and 12 months, with the patient population and output categories identical to those used for the single-timepoint classification and prediction tasks. Both networks began with a series of convolutional blocks, followed by flattening into one or more fully connected layers, ending in a final binary classification or prediction.
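A common way to present two timepoints to a single CNN is to stack the scans along the channel axis, so the first convolution sees both volumes jointly. A minimal numpy sketch of the input assembly (the 32-voxel cube is an illustrative shape, not the study's actual matrix size):

```python
import numpy as np

# Illustrative whole-brain volumes (depth, height, width); real MRI is larger.
baseline = np.random.rand(32, 32, 32).astype(np.float32)
month_12 = np.random.rand(32, 32, 32).astype(np.float32)

# Single-timepoint input: one channel.
single_input = baseline[np.newaxis, ...]             # shape (1, 32, 32, 32)

# Dual-timepoint input: the two scans become two channels of one tensor.
dual_input = np.stack([baseline, month_12], axis=0)  # shape (2, 32, 32, 32)

print(single_input.shape, dual_input.shape)
```

With channel stacking, the same convolutional backbone handles both variants; only the first layer's input-channel count changes between the single- and dual-timepoint networks.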
Figure 3
Figure 3. Sequential, residual with bottleneck, and wide residual CNN blocks.
The convolutional portion of the network was organized as a series of blocks, each with an increasing number K of activation maps (width) and a corresponding decrease in resolution obtained by either pooling or stride during convolution. The figures detail the individual layers that compose a single block. (A) Sequential convolutional block. Each block was composed of a single 3 × 3 × 3 convolution, followed by batch normalization, ReLU activation, and max pooling to reduce the resolution. (B) Residual bottleneck with preactivation convolutional block. Convolutions were preceded by batch normalization and ReLU activation. Two bottleneck 3 × 3 × 3 convolutions had a width of K/4, followed by a final 1 × 1 × 1 convolution of width K. In parallel, the skip residual used a 1 × 1 × 1 convolution to match the width and resolution. In this architecture the first residual block was preceded by an initial batch normalization followed by a single 5 × 5 × 5 convolution, plus one final batch normalization and ReLU activation after the last block (not shown). (C) Wide Residual Network convolutional block. In this architecture the batch normalization and activations occurred after the convolutional layers. Each block had two 3 × 3 × 3 convolutional layers with 3D spatial dropout in between, plus a 1 × 1 × 1 skip residual convolution to match width and resolution.
Figure 4
Figure 4. Three head architectures.
(A) 3D global maximum pooling fully connected block. The global pooling inherently flattened the nodes into a fully connected layer of N nodes, directly followed by the final binary classifier layer. (B) Long fully connected block. After flattening into a layer of N nodes, two sets of fully connected (sizes 2,048 and 1,024), batch normalization, and leaky ReLU activation layers followed, separated by a single dropout layer, before the final binary classifier. (C) Medium fully connected block. Initial 3D max pooling was followed by flattening into a fully connected layer of size N, followed by an additional fully connected layer of size 128 with ReLU activation.
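Head (A) works because global max pooling collapses each activation map to its single maximum, so a (K, d, h, w) feature tensor flattens directly to K nodes with no learned parameters. A numpy sketch (the dimensions are illustrative):

```python
import numpy as np

feature_maps = np.random.rand(128, 4, 4, 4)  # (K, depth, height, width)

# 3D global max pooling: one value per activation map.
pooled = feature_maps.max(axis=(1, 2, 3))    # shape (128,)

# This vector feeds the final binary classifier directly, so N == K here,
# whereas heads (B) and (C) flatten all K*d*h*w values before their dense layers.
print(pooled.shape)
```

This is why the global-pooling head is the smallest of the three: its N depends only on the final width K, not on the remaining spatial resolution.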
Figure 5
Figure 5. Training curves during classification.
Loss and accuracy curves during training for both training and validation sets. For the sequential network with a single timepoint: (A) loss, (B) accuracy. For the wide residual network with dual timepoints: (C) loss, (D) accuracy. Solid lines are smoothed with a factor of 0.8; faint lines show the unsmoothed values for each epoch.
Figure 6
Figure 6. Training curve during fine tuning for prediction.
(A) Loss per epoch and (B) accuracy per epoch during transfer learning fine-tuning (sequential, dual channel). Weights were initialized from training on AD vs. NC and frozen at the convolutional layers, then additional training was performed on the sMCI vs. pMCI data. There was an initial reduction in loss that stabilized after 10 epochs, with no increase in accuracy.
Figure 7
Figure 7. (A-J) Heatmap visualization for 10 patients.
3D Grad-CAM heatmaps from the wide residual dual-channel network used to predict conversion of MCI to AD. Heatmaps were superimposed on the anatomical MRI of each of 10 patients. Areas in bright yellow to orange (low to high) correspond to voxels with high gradients from the 3D Grad-CAM algorithm, computed at a convolutional layer with approximately 20-voxel resolution.
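Grad-CAM weights each activation map by the spatial mean of its gradient, sums the weighted maps, and keeps only positive evidence with a ReLU; the resulting low-resolution map is then upsampled onto the anatomy. A minimal 3D numpy sketch using random stand-in tensors (not the network's real activations):

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.random((64, 5, 5, 5))         # (K maps, d, h, w) at a conv layer
gradients = rng.standard_normal((64, 5, 5, 5))  # d(class score) / d(activations)

# Channel weights: global average of each map's gradient.
weights = gradients.mean(axis=(1, 2, 3))        # shape (64,)

# Weighted sum of activation maps, then ReLU to keep positive influence only.
cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)  # (5, 5, 5)

# Normalize to [0, 1] before upsampling and overlaying on the anatomical MRI.
cam /= cam.max() if cam.max() > 0 else 1.0
print(cam.shape)
```

The coarse resolution of the overlay in the figure follows directly from this construction: the heatmap lives at the chosen convolutional layer's spatial grid, not at the original MRI voxel grid.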
