[Preprint]. 2024 Mar 26:2024.02.08.24302531.
doi: 10.1101/2024.02.08.24302531.

AI-based differential diagnosis of dementia etiologies on multimodal data

Chonghua Xue et al. medRxiv.

Update in

  • AI-based differential diagnosis of dementia etiologies on multimodal data.
    Xue C, Kowshik SS, Lteif D, Puducheri S, Jasodanand VH, Zhou OT, Walia AS, Guney OB, Zhang JD, Poésy S, Kaliaev A, Andreu-Arasa VC, Dwyer BC, Farris CW, Hao H, Kedar S, Mian AZ, Murman DL, O'Shea SA, Paul AB, Rohatgi S, Saint-Hilaire MH, Sartor EA, Setty BN, Small JE, Swaminathan A, Taraschenko O, Yuan J, Zhou Y, Zhu S, Karjadi C, Alvin Ang TF, Bargal SA, Plummer BA, Poston KL, Ahangaran M, Au R, Kolachalama VB. Nat Med. 2024 Oct;30(10):2977-2989. doi: 10.1038/s41591-024-03118-z. Epub 2024 Jul 4. PMID: 38965435 Free PMC article.

Abstract

Differential diagnosis of dementia remains a challenge in neurology due to symptom overlap across etiologies, yet it is crucial for formulating early, personalized management strategies. Here, we present an AI model that harnesses a broad array of data, including demographics, individual and family medical history, medication use, neuropsychological assessments, functional evaluations, and multimodal neuroimaging, to identify the etiologies contributing to dementia in individuals. The study, drawing on 51,269 participants across 9 independent, geographically diverse datasets, facilitated the identification of 10 distinct dementia etiologies. It aligns diagnoses with similar management strategies, ensuring robust predictions even with incomplete data. Our model achieved a micro-averaged area under the receiver operating characteristic curve (AUROC) of 0.94 in classifying individuals with normal cognition, mild cognitive impairment and dementia. Also, the micro-averaged AUROC was 0.96 in differentiating the dementia etiologies. Our model demonstrated proficiency in addressing mixed dementia cases, with a mean AUROC of 0.78 for two co-occurring pathologies. In a randomly selected subset of 100 cases, the AUROC of neurologist assessments augmented by our AI model exceeded neurologist-only evaluations by 26.25%. Furthermore, our model predictions aligned with biomarker evidence and its associations with different proteinopathies were substantiated through postmortem findings. Our framework has the potential to be integrated as a screening tool for dementia in various clinical settings and drug trials, with promising implications for person-level management.

Conflict of interest statement

Ethics declarations: V.B.K. is on the scientific advisory board for Altoida Inc., and serves as a consultant to AstraZeneca. S.K. serves as a consultant to AstraZeneca. C.W.F. is a consultant to Boston Imaging Core Lab. K.L.P. is a member of the scientific advisory boards for Curasen, Biohaven, and Neuron23, receiving consulting fees and stock options, and for Amprion, receiving stock options. R.A. is a scientific advisor to Signant Health and Novo Nordisk. She also serves as a consultant to Davos Alzheimer’s Collaborative. The remaining authors declare no competing interests.

Figures

Figure 1: Data, model architecture and modeling strategy.
(a) Our model for differential dementia diagnosis was developed using diverse data modalities, including individual-level demographics, health history, neurological testing, physical/neurological exams, and multi-sequence MRI scans. These data sources whenever available were aggregated from nine independent cohorts: 4RTNI, ADNI, AIBL, FHS, LBDSU, NACC, NIFD, OASIS, and PPMI (Tables 1 & S1). For model training, we merged data from NACC, AIBL, PPMI, NIFD, LBDSU, OASIS and 4RTNI. We employed a subset of the NACC dataset for internal testing. For external validation, we utilized the ADNI and FHS cohorts. (b) A transformer served as the scaffold for the model. Each feature was processed into a fixed-length vector using a modality-specific embedding strategy and fed into the transformer as input. A linear layer was used to connect the transformer with the output prediction layer. (c) A subset of the NACC dataset was randomly chosen to conduct a comparative analysis between neurologists’ performance augmented with the AI model and their performance without AI assistance. Similarly, we carried out comparative evaluations with practicing neuroradiologists, who were provided with a randomly selected sample of confirmed dementia cases from the NACC testing cohort, to assess the impact of AI augmentation on their diagnostic performance. For both these evaluations, the model and clinicians had access to the same set of multimodal data. Finally, we assessed the model’s predictions by comparing them with biomarker profiles and pathology grades available from the NACC, ADNI, and FHS cohorts.
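As a rough illustration of the embedding step in panel (b), the sketch below maps each non-imaging feature to a fixed-length vector and records which features are present. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature names, the width `EMBED_DIM`, the random weights, the zero-vector-plus-mask handling of missing values, and both embedder classes are hypothetical, and the imaging branch and the transformer itself are omitted.

```python
import random
from typing import Dict, List, Optional, Tuple, Union

EMBED_DIM = 8  # hypothetical embedding width, chosen only for illustration


class NumericEmbedder:
    """Maps a scalar to a fixed-length vector with a linear projection (assumed strategy)."""

    def __init__(self, dim: int, rng: random.Random) -> None:
        self.weight = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        self.bias = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def __call__(self, value: float) -> List[float]:
        return [w * value + b for w, b in zip(self.weight, self.bias)]


class CategoricalEmbedder:
    """Maps a category index to a row of a lookup table (assumed strategy)."""

    def __init__(self, n_categories: int, dim: int, rng: random.Random) -> None:
        self.table = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                      for _ in range(n_categories)]

    def __call__(self, index: int) -> List[float]:
        return list(self.table[index])


Embedder = Union[NumericEmbedder, CategoricalEmbedder]
Value = Optional[Union[int, float]]


def embed_record(record: Dict[str, Value],
                 embedders: Dict[str, Embedder],
                 dim: int = EMBED_DIM) -> Tuple[List[List[float]], List[bool]]:
    """Return one fixed-length token per feature plus a presence mask.

    Missing features get a zero vector and mask=False so that a downstream
    encoder could ignore them (an assumption about how gaps are handled).
    """
    tokens: List[List[float]] = []
    mask: List[bool] = []
    for name, embedder in embedders.items():
        value = record.get(name)
        if value is None:
            tokens.append([0.0] * dim)
            mask.append(False)
        else:
            tokens.append(embedder(value))
            mask.append(True)
    return tokens, mask


rng = random.Random(0)
embedders: Dict[str, Embedder] = {
    "age": NumericEmbedder(EMBED_DIM, rng),          # illustrative feature names
    "sex": CategoricalEmbedder(2, EMBED_DIM, rng),
    "mmse": NumericEmbedder(EMBED_DIM, rng),
}
tokens, mask = embed_record({"age": 72.0, "sex": 1, "mmse": None}, embedders)
```

The token list and mask are the kind of input a transformer encoder could consume, with masked positions excluded from attention.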
Figure 2: Model performance on individuals along the cognitive spectrum.
(a,b) Receiver operating characteristic (ROC) and precision-recall (PR) curves, with their respective micro-average, macro-average, and weighted-average calculations based on the labels for NC, MCI, and DE. These averaging techniques consolidated the model’s performance across the spectrum of cognitive states. Cases from the NACC testing, ADNI and FHS were used. (c) Chord diagram indicating varied levels of model performance in the presence of missing data. The inner concentric circles represent various scenarios in which particular test information was either omitted (masked) or included (unmasked). The three outer concentric rings depict the model’s performance as measured by the area under the receiver operating characteristic curve (AUROC) for the NC, MCI and DE labels. (d, e, f) Raincloud plots with violin and box diagrams are shown to denote the distribution of clinical dementia rating scores (x-axis) versus model-predicted probability of dementia (y-axis), on the NACC, ADNI and FHS cohorts, respectively. (g) Raincloud plots are used to demonstrate the model’s ability to distinguish between MCI cases in the NACC cohort where AD was a factor for cognitive impairment and those attributed to non-AD etiologies. For plots (d-g), significance levels are denoted as ‘ns’ (not significant) for p ≥ 0.05; * for p < 0.05; ** for p < 0.01; *** for p < 0.001; and **** for p < 0.0001 based on Kruskal-Wallis H-test for independent samples followed by post-hoc Dunn’s testing with Bonferroni correction.
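The three averaging schemes named in panels (a,b) can be sketched as follows. This is a standard-library illustration using the usual definitions (micro: pool all label-score pairs; macro: unweighted mean of per-label AUROCs; weighted: mean weighted by the number of positives per label), which are assumed to match the authors' usage. The toy labels and scores are invented for the example and are not study data.

```python
from typing import Dict, List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def averaged_auroc(y_true: Dict[str, List[int]],
                   y_score: Dict[str, List[float]]) -> Dict[str, object]:
    """Per-label AUROC plus micro, macro and support-weighted averages."""
    names = list(y_true)
    per_label = {k: auroc(y_true[k], y_score[k]) for k in names}
    support = {k: sum(y_true[k]) for k in names}  # number of positives per label

    macro = sum(per_label.values()) / len(names)
    weighted = sum(per_label[k] * support[k] for k in names) / sum(support.values())

    # Micro-average: treat every (case, label) pair as one binary decision.
    pooled_y = [y for k in names for y in y_true[k]]
    pooled_s = [s for k in names for s in y_score[k]]
    micro = auroc(pooled_y, pooled_s)

    return {"per_label": per_label, "micro": micro, "macro": macro, "weighted": weighted}


# Toy data for six hypothetical cases (invented values).
y_true = {"NC": [1, 1, 0, 0, 0, 0], "MCI": [0, 0, 1, 1, 0, 0], "DE": [0, 0, 0, 0, 1, 1]}
y_score = {
    "NC": [0.8, 0.6, 0.3, 0.1, 0.1, 0.0],
    "MCI": [0.1, 0.3, 0.5, 0.2, 0.3, 0.2],
    "DE": [0.1, 0.1, 0.2, 0.7, 0.6, 0.8],
}
result = averaged_auroc(y_true, y_score)
```

Because the micro-average pools all decisions, it is dominated by frequent labels, whereas the macro-average gives rare labels equal weight.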
Figure 3: Model assessment on single and co-occurring dementias.
(a, b) Receiver operating characteristic (ROC) and precision-recall (PR) curves are provided, utilizing micro-average, macro-average, and weighted-average methods across all the dementia diagnostic labels. These averages were computed to synthesize the performance metrics across all dementia etiologies. (c) Heatmaps are used to depict the model’s performance on co-occurring dementias. We considered all combinations where two or more etiologies co-occurred from the NACC testing cohort, provided there were at least 25 positive samples. This ensured that the maximum variance of the AUROC calculation over all possible continuous distributions was upper bounded by 0.01. The first row shows the AUROC values and the second row shows the AUPR values. The table also displays the sample sizes for each case, with 1 representing a positive case and 0 indicating a negative sample.
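The 25-sample threshold is consistent with a classical distribution-free bound on the variance of the empirical AUROC (the normalized Mann-Whitney statistic) for continuous score distributions. Assuming this is the bound the authors relied on, and that negatives are at least as numerous as positives, write $A$ for the true AUROC and $n_+$, $n_-$ for the numbers of positive and negative samples (symbols introduced here for illustration):

```latex
\operatorname{Var}(\hat{A}) \;\le\; \frac{A(1-A)}{\min(n_+, n_-)} \;\le\; \frac{1}{4\,\min(n_+, n_-)},
\qquad
\min(n_+, n_-) \ge 25 \;\Longrightarrow\; \operatorname{Var}(\hat{A}) \le \frac{1}{100} = 0.01 .
```

The second inequality uses $A(1-A) \le 1/4$, with equality at $A = 1/2$, so the worst case is a classifier at chance level.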
Figure 4: Biomarker-level validation.
Raincloud plots representing model probabilities for dementia etiologies across their respective biomarker negative (blue) and positive groups (pink). (a-c) Model predicted probabilities for Alzheimer’s disease (P(AD)) were analyzed in relation to amyloid β (Aβ), tau, and fluorodeoxyglucose (FDG) PET biomarkers. Differences between Aβ negative and positive groups regarding P(AD) were evaluated using a one-sided Mann-Whitney U test for the NACC cohort and a one-sided t-test for ADNI. Similar analyses for tau and FDG PET biomarkers were conducted using one-sided Mann-Whitney U tests, with **** denoting p < 0.0001. (d-e) For frontotemporal lobar degeneration (P(FTD)), probabilities were assessed across MRI and FDG PET biomarker groups in the NACC cohort, using a one-sided Mann-Whitney U test, marked by **** for p < 0.0001. (f) Lewy body dementia (P(LBD)) probabilities were analyzed between DaTscan negative and positive groups using a one-sided Mann-Whitney U test, with **** indicating p < 0.0001.
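For readers unfamiliar with the test used in this figure, the following is a minimal standard-library sketch of a one-sided Mann-Whitney U test of whether predicted probabilities tend to be higher in the biomarker-positive group. It uses the normal approximation without tie or continuity correction, which is an assumption of this sketch rather than a statement about the authors' settings; the probabilities are invented, and the function name is hypothetical. In practice a statistics library would normally be used, for example `scipy.stats.mannwhitneyu` with `alternative="greater"`.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u_greater(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """U statistic for x and a one-sided p-value for H1: x tends to exceed y.

    Uses the large-sample normal approximation (no tie or continuity
    correction), so it is only a rough guide for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # U counts pairs where the x value is larger; ties contribute one half.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return u, p_value


# Invented model probabilities for biomarker-positive and biomarker-negative groups.
positive = [0.91, 0.84, 0.77, 0.69, 0.88, 0.73, 0.95, 0.81]
negative = [0.12, 0.35, 0.22, 0.41, 0.18, 0.30, 0.27, 0.09]
u_stat, p = mann_whitney_u_greater(positive, negative)
```

Dividing U by the product of the two group sizes gives the AUROC for separating the groups, which links this test to the discrimination metrics reported elsewhere in the paper.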
Figure 5: Neuropathological validation.
Array of violin plots with integrated box plots, delineating the probability distributions as predicted by the model for different neuropathological grades. The analysis encompasses data from three distinct cohorts: the Framingham Heart Study (FHS), the National Alzheimer’s Coordinating Center (NACC), and the Alzheimer’s Disease Neuroimaging Initiative (ADNI), each denoted by unique markers (triangles, circles, and diamonds, respectively). Statistical significance is encoded using asterisks, determined by Dunn-Bonferroni post-hoc test: one asterisk (*) for p < 0.05; two asterisks (**) for p < 0.01, three asterisks (***) for p < 0.001, and four asterisks (****) for p < 0.0001, reflecting increasing levels of statistical significance. Table S13 presents more details on the statistics.
Figure 6: AI-augmented clinician assessments.
Comparison between the performance of the assessments provided by practicing clinicians versus model-assisted clinicians is shown. (a-b) For the analysis, neurologists were given 100 randomly selected cases encompassing individual-level demographics, health history, neurological tests, physical as well as neurological examinations, and multi-sequence MRI scans. The neurologists were then tasked with assigning confidence scores for NC, MCI, DE, and the 10 dementia etiologies: AD, LBD, VD, PRD, FTD, NPH, SEF, PSY, TBI, and ODE (see Glossary 1). The boxplots show (a) AUROC and (b) AUPR for individual neurologist and model-assisted neurologist performance (defined as the mean between model and neurologist confidence scores). Pairwise statistical comparisons were conducted using the Wilcoxon signed-rank test and significance levels are denoted as: ns (not significant) for p ≥ 0.05; * for p < 0.05; ** for p < 0.01; *** for p < 0.001; and **** for p < 0.0001. The percent increase in mean performance for each etiology is also presented above each statistical annotation. (c-d) Similarly, in a separate analysis, radiologists were given 70 randomly selected cases with a confirmed dementia diagnosis encompassing individual-level demographics and multi-sequence MRI scans. The radiologists were tasked with assigning confidence scores for the 10 dementia etiologies, and the boxplots show (c) AUROC and (d) AUPR for individual radiologist and model-assisted radiologist performance for the 10 etiologies. Statistical annotations and percent increase in mean performance with respect to each etiology are shown in a similar fashion.
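The definition of model-assisted performance in this caption (the mean of model and clinician confidence scores) and the percent increase shown above each annotation can be sketched as below for a single hypothetical clinician and a single label. All numbers are invented for illustration, and the helper names are not taken from the paper.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def assisted_scores(clinician: Sequence[float], model: Sequence[float]) -> List[float]:
    """Model-assisted confidence: case-wise mean of clinician and model scores."""
    if len(clinician) != len(model):
        raise ValueError("score lists must cover the same cases")
    return [(c + m) / 2.0 for c, m in zip(clinician, model)]


def percent_increase(baseline: float, augmented: float) -> float:
    """Relative change of the augmented metric over the baseline, in percent."""
    return 100.0 * (augmented - baseline) / baseline


# Invented confidence scores for six cases of one label (1 = label present).
labels = [1, 1, 1, 0, 0, 0]
clinician = [0.9, 0.4, 0.3, 0.5, 0.2, 0.1]
model = [0.8, 0.7, 0.6, 0.3, 0.4, 0.2]

auc_clinician = auroc(labels, clinician)
auc_assisted = auroc(labels, assisted_scores(clinician, model))
gain = percent_increase(auc_clinician, auc_assisted)
```

In the figure the percent increase is computed on performance averaged across clinicians, so a per-clinician value like `gain` would be one contribution to that summary.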
