PLoS One. 2017 Dec 14;12(12):e0189583.
doi: 10.1371/journal.pone.0189583. eCollection 2017.

Towards the identification of Idiopathic Parkinson's Disease from the speech. New articulatory kinetic biomarkers

J I Godino-Llorente et al. PLoS One.

Abstract

Although a large number of acoustic indicators have already been proposed in the literature to evaluate the hypokinetic dysarthria of people with Parkinson's Disease, the goal of this work is to identify and interpret new, reliable, and complementary articulatory biomarkers that could be applied to predict or evaluate Parkinson's Disease from a diadochokinetic test, contributing to a further multidimensional analysis of the speech of parkinsonian patients. The new biomarkers are based on the kinetic behaviour of the envelope trace, which is directly linked with the articulatory dysfunctions that the disease introduces from its early stages. The interest of these new articulatory indicators lies in their ease of identification and interpretation, and in their potential to be translated into computer-based automatic methods to screen for the disease from speech. Throughout this paper, the accuracy provided by these acoustic kinetic biomarkers is compared with that obtained with a baseline system based on speaker identification techniques. Results show accuracies of around 85%, in line with those obtained with complex state-of-the-art speaker recognition techniques but with an easier physical interpretation, which opens the possibility of transfer to a clinical setting.
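The idea of tracking the kinetic behaviour of the envelope can be illustrated with a minimal sketch. It is not the authors' implementation: the envelope estimator (rectification followed by a moving average), the rectangular smoothing kernels, the 20 ms envelope kernel, and the names `moving_average` and `envelope_kinetics` are all assumptions made for illustration. Only the 50 ms and 40 ms kernel lengths for velocity and acceleration and the amplitude normalization are taken from the figure captions.

```python
import numpy as np


def moving_average(x, length):
    """Smooth x with a normalized rectangular kernel of `length` samples."""
    length = max(1, int(length))
    kernel = np.ones(length) / length
    return np.convolve(x, kernel, mode="same")


def envelope_kinetics(signal, fs, env_ms=20.0, vel_ms=50.0, acc_ms=40.0):
    """Return (envelope, velocity, acceleration) of a speech frame.

    Assumed recipe (not from the paper): rectify, smooth to get the
    envelope, then differentiate and smooth twice.
    """
    signal = np.asarray(signal, dtype=float)
    peak = np.max(np.abs(signal))
    if peak > 0:
        signal = signal / peak  # normalize the amplitude to [-1, 1]
    envelope = moving_average(np.abs(signal), fs * env_ms / 1000.0)
    # First derivative of the envelope, in amplitude units per second.
    velocity = moving_average(np.gradient(envelope) * fs, fs * vel_ms / 1000.0)
    # Second derivative, obtained from the smoothed velocity.
    acceleration = moving_average(np.gradient(velocity) * fs, fs * acc_ms / 1000.0)
    return envelope, velocity, acceleration


if __name__ == "__main__":
    fs = 8000  # hypothetical sampling rate
    t = np.arange(int(1.37 * fs)) / fs  # one 1.37 s analysis frame
    # Synthetic stand-in for a DDK utterance: a tone with a slow amplitude modulation.
    frame = (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t)) * np.sin(2 * np.pi * 200 * t)
    env, vel, acc = envelope_kinetics(frame, fs)
    print(env.shape, vel.shape, acc.shape)
```

Smoothing after each differentiation keeps the derivative estimates from being dominated by high-frequency noise, which is presumably why the kernel length matters for the recognition rate.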

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Categorization of the disturbances associated with the hypokinetic dysarthria of PD patients.
Fig 2. Histogram of the UPDRS-III labels of the corpus of speakers.
Fig 3. Speech waveform and spectrogram of a 35-year-old normophonic speaker uttering the /pa/-/ta/-/ka/ test.
Fig 4. Detail of the speech corresponding to the /ka/ syllable of a 35-year-old normophonic speaker.
The syllable starts with a stop gap (silence), followed by a burst that precedes the periodic sound of the vowel. The structure depicted is typical of the plosive consonant-vowel combinations used in the DDK test.
Fig 5. Speech trace and spectrograms of voiceless bilabial/alveolar/velar (left/center/right) stops uttered by five PD patients with different degrees of the disease according to the H&Y and UPDRS scales.
Fig 6. Analogy of the movements of the articulators in PD patients.
Speakers do not move the articulators to their full extent, with the required acceleration, or for the required time.
Fig 7. Recognition rate vs. kernel length used to calculate the velocity of the envelope.
The optimum is considered to be in the interval [–65] ms.
Fig 8. Recognition rate vs. kernel lengths used to calculate the velocity and acceleration of the envelope.
A 50 ms long kernel for the velocity corresponds to a 40 ms kernel for the acceleration.
Fig 9. Speech trace with its envelope and an estimate of the velocity and acceleration of the envelope for a young normophonic 35-year-old person (a), a control speaker (b), a parkinsonian patient with H&Y = 2 (c), and a parkinsonian patient with H&Y = 3 (d), all of them calculated using 50 ms and 40 ms long smoothing kernels for the velocity and acceleration respectively.
The speech traces correspond to one single utterance of the /pa/-/ta/-/ka/ test. The amplitudes are normalized in the range [–1, 1] for each 1.37 s long frame of analysis. Note that the time scales are different for each plot due to a different speech rate.
Fig 10. 3D attractors of the envelope speed for a young normophonic 35-year-old person (a), a control speaker (b), a parkinsonian patient with H&Y = 2 (c), and a parkinsonian patient with H&Y = 3 (d), all of them calculated using a 50 ms long smoothing kernel for the speed and a time delay of 70 samples.
Fig 11. Accuracy vs. window size for a GMM-UBM system trained with 128 Gaussians for two different parameterization approaches.
The best results are obtained with 10 ms windows.
Fig 12. DET curve using GMM-UBM and iVectors approaches for MFCC and RASTA-PLP parameterization approaches.
Fig 13. Normalized histograms of the UPDRS-III labels corresponding to the speakers wrongly categorized.
a) using GMM-UBM and RASTA-PLP; b) using GMM-UBM and MFCC; c) using iVectors and RASTA-PLP; d) using iVectors and MFCC.
Fig 14. DET plot of the best baseline system and of the proposed method.
Fig 15. Boxplots corresponding to the complexity measures extracted from the acceleration (top row) and velocity (bottom row) sequences.
Fig 16. a) Normalized histogram of the UPDRS-III labels corresponding to the speakers wrongly categorized with the proposed method, b) UPDRS-III level vs. score given by the proposed method.
Fig 17. Example of the estimation of the time lag (a) and embedding dimension (b) for a 1.37 s long frame corresponding to the velocity of variation of the envelope of a normophonic speaker during the /pa/-/ta/-/ka/ test.
In this example, the first minimum of the auto mutual information can be found at 70. Regarding the embedding dimension, the plot of the E1 value used for Cao’s method shows a kink at 6. The histograms in (c) and (d) correspond to the time delays and embedding dimensions respectively obtained for all the frames extracted from the database.
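
The attractor reconstruction and time-lag estimation mentioned in the captions of Fig 10 and Fig 17 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the histogram-based mutual information estimator, the bin count, and the function names `delay_embed`, `auto_mutual_information`, and `first_local_minimum` are all hypothetical choices. Cao's method for the embedding dimension is not implemented here.

```python
import numpy as np


def delay_embed(x, dim, tau):
    """Time-delay embedding: row k is (x[k], x[k + tau], ..., x[k + (dim - 1) * tau])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])


def auto_mutual_information(x, max_lag, bins=16):
    """Histogram estimate (in nats) of the mutual information between x(t) and x(t + lag)."""
    x = np.asarray(x, dtype=float)
    ami = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        a, b = x[: len(x) - lag], x[lag:]
        joint, _, _ = np.histogram2d(a, b, bins=bins)
        p_ab = joint / joint.sum()
        p_a = p_ab.sum(axis=1, keepdims=True)  # marginal of x(t)
        p_b = p_ab.sum(axis=0, keepdims=True)  # marginal of x(t + lag)
        nz = p_ab > 0  # skip empty cells to avoid log(0)
        ami[lag] = np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz]))
    return ami


def first_local_minimum(values):
    """Index of the first interior local minimum, or None if there is none."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k
    return None


if __name__ == "__main__":
    # Synthetic stand-in for an envelope-velocity frame (not real data).
    t = np.arange(4000)
    x = np.sin(2 * np.pi * t / 280.0)
    tau = first_local_minimum(auto_mutual_information(x, max_lag=200))
    if tau is not None:
        attractor = delay_embed(x, dim=3, tau=tau)  # 3D attractor, as in Fig 10
        print(tau, attractor.shape)
```

With a delay of 70 samples and three dimensions, as in Fig 10, each point of the attractor would be the triple of velocity values at samples k, k + 70, and k + 140.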
