Mov Disord. 2024 Dec;39(12):2130-2143.
doi: 10.1002/mds.30002. Epub 2024 Sep 5.

Houston, We Have AI Problem! Quality Issues with Neuroimaging-Based Artificial Intelligence in Parkinson's Disease: A Systematic Review

Verena Dzialas et al. Mov Disord. 2024 Dec.

Abstract

In recent years, many neuroimaging studies have applied artificial intelligence (AI) to address existing challenges in Parkinson's disease (PD) diagnosis, prognosis, and intervention. The aim of this systematic review was to provide an overview of neuroimaging-based AI studies and to assess their methodological quality. A PubMed search yielded 810 studies, of which 244 that investigated the utility of neuroimaging-based AI for PD diagnosis, prognosis, or intervention were included. We systematically categorized studies by outcomes and rated them with respect to five minimal quality criteria (MQC) pertaining to data splitting, data leakage, model complexity, performance reporting, and indication of biological plausibility. We found that the majority of studies aimed to distinguish PD patients from healthy controls (54%) or atypical parkinsonian syndromes (25%), whereas prognostic or interventional studies were sparse. Only 20% of evaluated studies passed all five MQC, with data leakage, non-minimal model complexity, and reporting of biological plausibility as the primary factors for quality loss. Data leakage was associated with a significant inflation of accuracies. Very few studies employed external test sets (8%), where accuracy was significantly lower, and 19% of studies did not account for data imbalance. Adherence to MQC was low across all observed years and journal impact factors. This review shows that AI has been applied to a wide variety of research questions pertaining to PD; however, the number of studies failing to pass the MQC is alarming. Therefore, we provide recommendations to enhance the interpretability, generalizability, and clinical utility of future AI applications using neuroimaging in PD. © 2024 The Author(s). Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.
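As a rough illustration of two of the issues named in the abstract (data leakage and data imbalance), the following Python sketch splits data by subject so that no subject contributes scans to both the training and test sets, and computes balanced accuracy alongside plain accuracy. The abstract does not specify which forms of leakage were assessed, so subject-level overlap is assumed here as one common example. The function names, subject IDs, and labels are hypothetical and are not taken from the review.

```python
import random


def split_by_subject(subject_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) such that no subject spans both sets."""
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; less sensitive to class imbalance."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy data: five subjects with two scans each.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4", "s5", "s5"]
train_idx, test_idx = split_by_subject(subject_ids)
train_subjects = {subject_ids[i] for i in train_idx}
test_subjects = {subject_ids[i] for i in test_idx}

# Hypothetical imbalanced labels and a classifier that always predicts "PD".
y_true = ["PD"] * 8 + ["HC"] * 2
y_pred = ["PD"] * 10
plain = accuracy(y_true, y_pred)
balanced = balanced_accuracy(y_true, y_pred)
```

In this toy case the always-"PD" classifier looks reasonable on plain accuracy but is at chance level on balanced accuracy, which is the kind of gap that unreported class imbalance can hide.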

Keywords: Parkinson's disease; artificial intelligence; neuroimaging; quality control.

Figures

FIG. 1
Visualization of the PubMed search term and the respective exclusion criteria for the current systematic review. Top: the search term incorporated three main domains with the respective possible expressions: (1) the clinical condition of interest (Parkinson's disease [PD]), (2) the techniques of interest (artificial intelligence [AI]), and (3) the neuroimaging modality (eg, MRI [magnetic resonance imaging]). Bottom: exclusion criteria and number of studies failing the criteria. [Color figure can be viewed at wileyonlinelibrary.com]
FIG. 2
Number of studies categorized under diagnosis, prognosis, and intervention, including subcategories. Some studies had multiple aims and were thus assigned to several subcategories. APS, atypical parkinsonian syndromes; HC, healthy control; PD, Parkinson's disease. [Color figure can be viewed at wileyonlinelibrary.com]
FIG. 3
Accuracy of PD (Parkinson's disease) versus HC (healthy control) classification studies by (A) model type and (B) modality. (A) Accuracy of studies, sorted by model type (color) and sample size (size). Model types are sorted in alphabetical order. Studies of the same model type were also summarized as a weighted mean and weighted standard deviation, with sample size serving as weights (indicated by vertical lines). *Ensemble models performed significantly better (surviving Bonferroni correction, P < 0.005) than the remaining model types. (B) Accuracy of studies, sorted by modality type (color) and sample size (size, same as in A). Modalities are sorted in alphabetical order. Studies of the same modality were also summarized as a weighted mean and weighted standard deviation, with sample size serving as weights. *Significant differences (surviving Bonferroni correction, 10 comparisons ‐ P < 0.005) were detected between accuracies obtained using [123I]‐FP‐CIT SPECT and any other modality using weighted t tests. Notes for (A) and (B): sample sizes ranged from 21 to 2077 and were log transformed for plotting. Legend shows true (nontransformed) sample sizes. Only studies that reported accuracy were considered. If studies reported ranges, this figure shows the mean accuracy of these ranges. Only the four most frequent modalities/models are shown explicitly, whereas the remaining ones were grouped as “others.” Studies of all model and modality types (including those summarized as “other”) are shown in Supplementary Materials Figures 1 and 2. CNN, convolutional neural network; SVM, support vector machine.
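The summary statistics described in this caption can be sketched in a few lines of Python. The caption does not state which weighted standard deviation estimator was used, so the version below (weights normalized by their sum, no small-sample correction) is an assumption, and the accuracies and sample sizes are hypothetical values, not data from the review. The Bonferroni threshold follows the caption: a significance level of 0.05 divided by 10 comparisons gives 0.005.

```python
import math


def weighted_mean_sd(values, weights):
    """Weighted mean and weighted standard deviation.

    Assumes the simple estimator with weights normalized by their sum;
    the caption does not specify the exact estimator used.
    """
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(var)


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical accuracies and sample sizes for three studies of one model type.
accuracies = [0.90, 0.80, 0.70]
sample_sizes = [100, 50, 50]

mean, sd = weighted_mean_sd(accuracies, sample_sizes)
threshold = bonferroni_threshold(0.05, 10)
```

Weighting by sample size means that the larger hypothetical study pulls the summary toward its own accuracy, so the weighted mean here sits above the unweighted mean of 0.80.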
FIG. 4
Minimal quality criteria (MQC) analysis. (A) Percentage of studies passing individual or all MQC. (B) Violin plot of the accumulated number of MQC passed by each individual study. (C) Number of total studies published (gray shaded area) and percentage of studies passing all MQC (red line) over the years. (D) Percentage of studies passing all MQC by impact factor. (E) Differences in accuracy between studies passing or failing individual or all implementation‐level MQC. [Color figure can be viewed at wileyonlinelibrary.com]
