NPJ Digit Med. 2024 Feb 17;7(1):37.
doi: 10.1038/s41746-024-01027-6.

Automatic speech-based assessment to discriminate Parkinson's disease from essential tremor with a cross-language approach

Cristian David Rios-Urrego et al. NPJ Digit Med.

Abstract

Parkinson's disease (PD) and essential tremor (ET) are prevalent movement disorders that mainly affect elderly people and present diagnostic challenges due to shared clinical features. While the two disorders exhibit distinct speech patterns (hypokinetic dysarthria in PD and hyperkinetic dysarthria in ET), the efficacy of speech assessment for differentiating them remains unexplored. Developing technology for automatic discrimination could enable early diagnosis and continuous monitoring. However, the lack of data for investigating speech behavior in these patients has inhibited the development of a framework for diagnostic support. In addition, phonetic variability across languages poses practical challenges in establishing a universal speech assessment system. Therefore, it is necessary to develop models that are robust to the phonetic variability present in different languages worldwide. We propose a method based on Gaussian mixture models to assess domain adaptation from models trained in German and Spanish to classify PD and ET patients in Czech. We modeled three different speech dimensions (articulation, phonation, and prosody) and evaluated the models' performance in both bi-class and tri-class classification scenarios (the latter with the addition of healthy controls). Our results show that a fusion of the three speech dimensions achieved optimal results in binary classification, with accuracies of up to 81.4% and 86.2% for the monologue and /pa-ta-ka/ tasks, respectively. In tri-class scenarios, incorporating healthy speech signals resulted in accuracies of 63.3% and 71.6% for the monologue and /pa-ta-ka/ tasks, respectively. Our findings suggest that automated speech analysis combined with machine learning is robust and accurate, and can be adapted to different languages to distinguish between PD and ET patients.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Histograms and the corresponding probability density distributions of the scores obtained in the best classification scenarios between PD and ET patients in Czech.
a For the monologue task, the adaptation was performed from the universal background model (UBM) trained with Verbmobil (German). b For the /pa-ta-ka/ task, the adaptation was performed from the UBM trained with the German controls. Both scenarios were obtained with the fusion of the three speech dimensions.
Fig. 2. Confusion matrices of the best results obtained in the classification of ET patients (ET) vs. PD patients (PD) vs. healthy speech (HC).
a For the monologue task, the adaptation was performed from the UBM trained with Verbmobil (German) and using the fusion of the three speech dimensions. b For the /pa-ta-ka/ task, the adaptation was performed from the UBM trained with the German and Spanish controls and using the prosody dimension.
Fig. 3. Visualization of the group distributions after applying linear discriminant analysis (LDA) with two components.
a Results based on the monologue task. b Results based on the /pa-ta-ka/ task.
Fig. 4. General methodology.
a Databases considered. b Feature extraction. c UBM training. d Speaker adaptation. e Generation of supervectors. f Training and evaluation. GMM supervectors were created with information extracted from features of articulation (Art.), phonation (Phon.), and prosody (Pros.). Fus: early fusion of all supervectors. PCA: principal component analysis computed from the early-fusion supervector. MAP: maximum a posteriori.
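
Steps d and e of Fig. 4 (speaker adaptation and supervector generation) can be illustrated with a minimal sketch. It assumes a diagonal-covariance UBM and the common mean-only MAP adaptation with a relevance factor; the abstract does not state the number of Gaussians, the covariance type, or the relevance factor the authors used, so the toy UBM parameters, the value relevance=16.0, and the function names map_adapt_means and supervector are hypothetical and chosen only for illustration.

```python
import numpy as np


def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal-covariance component.

    X: (T, D) feature frames; means, variances: (K, D). Returns (T, K).
    """
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)   # (T, K)
    return -0.5 * (log_norm[None, :] + mahal)


def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM to one speaker's frames.

    relevance is an assumed hyperparameter, not a value from the paper.
    """
    # Posterior probability of each component for each frame.
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)

    n_k = post.sum(axis=0)                                # soft counts (K,)
    e_k = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]  # data means (K, D)

    # Components that saw more data move further from the UBM mean.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * e_k + (1.0 - alpha) * means


def supervector(adapted_means):
    """Stack the K adapted mean vectors into one (K * D,) vector."""
    return adapted_means.reshape(-1)


# Toy UBM with K=2 components in D=3 dimensions (hypothetical values).
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
ubm_vars = np.ones((2, 3))

# Synthetic "speaker" frames centred near the first component.
rng = np.random.default_rng(0)
frames = rng.normal(loc=1.0, scale=1.0, size=(200, 3))

adapted = map_adapt_means(frames, ubm_weights, ubm_means, ubm_vars)
sv = supervector(adapted)
```

In this sketch, one supervector would be computed per speaker and per speech dimension. Early fusion, as named in the caption, would then amount to concatenating the articulation, phonation, and prosody supervectors before classification.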
