Explainable artificial intelligence to diagnose early Parkinson's disease via voice analysis

Sci Rep. 2025 Apr 5;15(1):11687. doi: 10.1038/s41598-025-96575-6.

Matthew Shen¹ ², Pouria Mortezaagha³ ⁴, Arya Rahgozar³ ⁴

Affiliations

¹ Ottawa Hospital Research Institute, The Ottawa Hospital, Ottawa, Canada. mtshen97@gmail.com.
² University of Ottawa School of Engineering Design and Teaching Innovation, University of Ottawa, Ottawa, Canada. mtshen97@gmail.com.
³ Ottawa Hospital Research Institute, The Ottawa Hospital, Ottawa, Canada.
⁴ University of Ottawa School of Engineering Design and Teaching Innovation, University of Ottawa, Ottawa, Canada.

PMID: 40188263
PMCID: PMC11972358
DOI: 10.1038/s41598-025-96575-6

Matthew Shen et al. Sci Rep. 2025.

Abstract

Parkinson's disease (PD) is a neurodegenerative disorder affecting motor control, leading to symptoms such as tremors and stiffness. Early diagnosis is essential for effective treatment, but traditional methods are often time-consuming and expensive. This study leverages Artificial Intelligence (AI) and Machine Learning (ML) techniques, using voice analysis to detect early signs of PD. We applied a hybrid model combining Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Multiple Kernel Learning (MKL), and Multilayer Perceptron (MLP) to a dataset of 81 voice recordings. Acoustic features such as Mel-Frequency Cepstral Coefficients (MFCCs), jitter, and shimmer were analyzed. The model achieved 91.11% accuracy, 92.50% recall, 89.84% precision, 91.13% F1 score, and an area-under-the-curve (AUC) of 0.9125. SHapley Additive exPlanations (SHAP) provided data explainability, identifying key features driving the PD diagnosis, thus enhancing AI interpretability and trustability. Furthermore, a probability-based scoring system was developed to enable PD patients and clinicians to track disease progression. This AI-driven approach offers a non-invasive, cost-effective, and rapid tool for early PD detection, facilitating personalized treatment through vocal biomarkers.
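The accuracy, recall, precision, and F1 score quoted in the abstract are standard confusion-matrix metrics. The following is a minimal sketch of how they relate to one another. The counts in the usage lines are hypothetical and are not taken from the study, and the helper names `f1_from` and `classification_metrics` are illustrative only.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, tn, fn):
    """Compute binary-classification metrics from confusion-matrix counts.

    tp/fn: PD recordings classified as PD / as healthy control.
    tn/fp: healthy-control recordings classified as healthy / as PD.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "precision": precision,
        "f1": f1_from(precision, recall),
    }


# Hypothetical counts for illustration only (NOT the study's confusion matrix).
example = classification_metrics(tp=37, fp=4, tn=36, fn=4)

# Harmonic mean of the precision and recall reported in the abstract.
f1_check = f1_from(0.8984, 0.9250)
```

Applied to the reported precision (89.84%) and recall (92.50%), the harmonic mean is about 91.15%, close to the reported F1 of 91.13%. A small gap of this kind is what one would expect if each metric is averaged over cross-validation folds, as the Fig. 2 caption suggests, although the abstract does not spell out the averaging procedure.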

Keywords: Deep learning; Explainable AI; Parkinson’s disease; Vocal biomarkers.

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

**Fig. 1**
(a) Color-coded accuracy line graph of tested models across 5 cross-validation folds. (b) Color-coded cross-entropy loss line graph of tested models across 5 cross-validation folds.

**Fig. 2**
Color-coded bar graph of average performance metrics of tested models across 5 cross-validation folds.

**Fig. 3**
Color-coded line graph of ROC curves of tested models. The dashed black line represents the ROC of a random chance diagnosis.

**Fig. 4**
The SHAP feature importance plot displays the impact of various acoustic features on the model’s output. The x-axis represents SHAP values, where negative values decrease the likelihood of PD, and positive values increase it. The y-axis lists the most influential features on the model’s prediction. Each dot represents an individual data point. The position of the dot indicates the contribution of that feature to the prediction. The color gradient reflects feature values, where red signifies high values, blue indicates low values, and purple represents intermediate values.
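The SHAP values plotted on the x-axis are estimates of Shapley values: each feature's average marginal contribution to a prediction, relative to a baseline input. As a minimal sketch of that idea, the code below computes exact Shapley values by brute force over all feature orderings. The linear `toy_score` model, its weights, and the three feature slots are hypothetical stand-ins and not the study's model. The SHAP library itself uses faster approximations, and the abstract does not state which explainer was used.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features are "switched on" one at a time, from their baseline value to
    their actual value, and each feature's marginal change in the prediction
    is averaged over every possible ordering. Cost grows factorially with the
    number of features, so this is only practical for tiny examples.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [total / len(orderings) for total in phi]


def toy_score(features):
    """Hypothetical linear score over (jitter, shimmer, HNR); weights invented."""
    jitter, shimmer, hnr = features
    return 0.5 * jitter + 0.3 * shimmer - 0.2 * hnr


sample = [2.0, 1.0, 3.0]    # hypothetical feature values for one recording
baseline = [1.0, 1.0, 1.0]  # hypothetical reference (e.g. dataset average)
phi = shapley_values(toy_score, sample, baseline)
```

For this linear toy model the values come out to approximately 0.5, 0.0, and -0.4, and they sum to the difference between the sample's score and the baseline score. That additivity is the property that lets a plot such as Fig. 4 read positive values as pushing the prediction toward PD and negative values as pushing it away.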

**Fig. 5**
The box and whisker plots show the minimum, first quartile, median, third quartile, maximum, and outlier points for mean pitch, local jitter, local shimmer, and mean HNR.
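Local jitter and local shimmer, two of the measures plotted above, are cycle-to-cycle perturbation ratios. The sketch below follows the commonly used definition (as in Praat): the mean absolute difference between consecutive glottal periods (jitter) or consecutive peak amplitudes (shimmer), divided by the mean period or amplitude. It assumes the periods and amplitudes have already been extracted from the recording. The helper name `local_perturbation` and the sample values are hypothetical, and the abstract does not state which tool the authors used to compute these features.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean.

    Pass glottal cycle durations to get local jitter, or per-cycle peak
    amplitudes to get local shimmer. The result is a ratio; multiply by 100
    for a percentage.
    """
    if len(values) < 2:
        raise ValueError("need at least two consecutive cycles")
    mean_value = sum(values) / len(values)
    if mean_value == 0:
        raise ValueError("mean of the values must be non-zero")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / mean_value


# Hypothetical measurements for illustration only.
periods_s = [0.010, 0.011, 0.010, 0.011]  # glottal cycle durations (seconds)
amplitudes = [0.80, 0.76, 0.82, 0.78]     # per-cycle peak amplitudes

jitter_local = local_perturbation(periods_s)
shimmer_local = local_perturbation(amplitudes)
```

A perfectly periodic voice would give zero for both ratios; larger values indicate less stable cycle timing or loudness.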

**Fig. 6**
(a) The spectrogram of an HC subject exhibits clear, stable harmonic structures with consistent frequency bands. This visualization demonstrates stronger and more evenly spaced harmonics than (b), indicating better vocal stability. The bright regions on the decibel scale signify higher signal intensity, distinguishing HC voices from PD-affected speech. (b) The spectrogram of a PD patient’s voice shows irregular frequency patterns and decreased signal stability. The disrupted harmonic structures and weaker intensity bands reflect vocal instability, which is characteristic of PD-related dysphonia. The color scale represents amplitude in decibels, with lower-intensity regions indicating reduced vocal control.
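A spectrogram of this kind is a short-time Fourier transform with power expressed in decibels. The sketch below shows one way to compute it using only the standard library, so that the steps (framing, windowing, DFT, conversion to dB) are visible. The frame length, hop size, Hann window, and the synthetic 1 kHz test tone are all assumptions chosen for illustration; the study's actual spectrogram settings are not given in the abstract.

```python
import cmath
import math


def spectrogram_db(signal, frame_len=64, hop=32, floor=1e-10):
    """Return a list of frames, each a list of power values in dB per bin.

    Each frame is multiplied by a Hann window and transformed with a direct
    DFT (fine for short examples; real pipelines would use an FFT). Only the
    non-negative frequency bins 0..frame_len//2 are kept.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            power = abs(coeff) ** 2
            # Clamp to a small floor so silent bins do not produce log(0).
            spectrum.append(10 * math.log10(max(power, floor)))
        frames.append(spectrum)
    return frames


# Synthetic, perfectly stable tone (hypothetical stand-in for a sustained vowel).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
frames = spectrogram_db(tone)

# Frequency (Hz) of the strongest bin in each frame.
peak_hz = [max(range(len(f)), key=f.__getitem__) * sample_rate / 64 for f in frames]
```

For a stable tone the strongest bin stays at the same frequency in every frame, which corresponds to the flat, evenly spaced bands described for panel (a); a voice with unstable pitch would show that peak drifting from frame to frame.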

**Fig. 7**
The champion model’s sequential pipeline is shown with arrows indicating a series of steps that utilize the unique advantages of each neural network, eventually forming a diagnosis.

Cited by

Harnessing artificial intelligence for brain disease: advances in diagnosis, drug discovery, and closed-loop therapeutics.
Fang SJ, Yin ZD, Cai Q, Li LF, Zheng PF, Chen LZ. Front Neurol. 2025 Jul 28;16:1615523. doi: 10.3389/fneur.2025.1615523. eCollection 2025. PMID: 40791911. Free PMC article. Review.

References

1. Little, M. A. et al. Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. IEEE Trans. Biomed. Eng. 56, 1015–1022. 10.1109/TBME.2008.2005954 (2009).
2. Tsanas, A. et al. Accurate telemonitoring of Parkinson’s disease progression by noninvasive speech tests. IEEE Trans. Biomed. Eng. 57, 884–893. 10.1109/TBME.2009.2036000 (2010).
3. Alhanai, T., Au, R. & Glass, J. Detecting depression with audio/text sequence modeling of interviews. Interspeech 1716–1720. 10.21437/Interspeech.2018-2522 (2018).
4. Alissa, M. et al. Parkinson’s disease diagnosis using convolutional neural networks and figure-copying tasks. Neural Comput. Appl. 34, 1433–1453. 10.1007/s00521-021-06469-7 (2022).
5. Iqbal, S. et al. On the analyses of medical images using traditional machine learning techniques and convolutional neural networks. Arch. Comput. Methods Eng. 30, 3173–3233. 10.1007/s11831-023-09899-9 (2023).
