IEEE EMBS Int Conf Biomed Health Inform. 2019:2019:1-4.
doi: 10.1109/bhi.2019.8834506. Epub 2019 Sep 12.

DeepDDK: A Deep Learning based Oral-Diadochokinesis Analysis Software

Yang Yang Wang et al. IEEE EMBS Int Conf Biomed Health Inform. 2019.

Abstract

Oromotor dysfunction caused by neurological disorders can result in significant speech and swallowing impairments. Current diagnostic methods to assess oromotor function are subjective and rely on perceptual judgments by clinicians. In particular, the widely used oral-diadochokinesis (oral-DDK) test, which requires rapid, alternating repetitions of speech-based syllables, is conducted and interpreted differently among clinicians. It is therefore prone to inaccuracy, which results in poor test reliability and poor clinical applicability. In this paper, we present deep learning-based software that extracts quantitative data from the oral-DDK signal, thereby transforming the test into an objective diagnostic and treatment-monitoring tool. The proposed software consists of two main modules: a fully automated syllable detection module and an interactive visualization and editing module that allows inspection and correction of the automatically detected syllable units. The DeepDDK software was evaluated on speech files corresponding to 9 different DDK syllables (e.g., "Pa", "Ta", "Ka"). The experimental results show that both syllable detection and localization are robust across different types of DDK speech tasks.
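
As an illustration of what "quantitative data" from an oral-DDK recording could look like, the following minimal Python sketch summarizes a list of detected syllable timestamps. The abstract does not say which measures DeepDDK reports, so the function name `ddk_measures` and the chosen measures (syllable count, repetition rate, mean and standard deviation of the inter-syllable interval) are assumptions made for this example only.

```python
from statistics import mean, pstdev


def ddk_measures(timestamps):
    """Summarize syllable timestamps (in seconds) into simple measures.

    Hypothetical helper: the measures below are illustrative assumptions,
    not the set of outputs documented for DeepDDK.
    """
    times = sorted(timestamps)
    if len(times) < 2:
        raise ValueError("need at least two syllable timestamps")

    # Time between consecutive syllable onsets.
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    duration = times[-1] - times[0]
    if duration <= 0:
        raise ValueError("timestamps must span a positive duration")

    return {
        "count": len(times),
        # Repetitions per second, measured between first and last syllable.
        "rate_hz": (len(times) - 1) / duration,
        "mean_interval_s": mean(intervals),
        # Population standard deviation of the intervals (timing regularity).
        "interval_sd_s": pstdev(intervals),
    }


if __name__ == "__main__":
    # Five evenly spaced syllables, one every 0.25 s.
    print(ddk_measures([0.0, 0.25, 0.5, 0.75, 1.0]))
```

In this sketch a perfectly regular repetition yields an interval standard deviation of zero; larger values would indicate less regular timing.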

Keywords: Diadochokinesis analysis; deep learning; event detection; event localization; speech signal analysis.

Figures

Fig. 1: Audio waveform samples for different types of oral-DDK tasks.
Fig. 2: CNN-1 and CNN-2 architectures used for DDK syllable detection and localization.
Fig. 3: Sample signals and associated training data. Colored dots mark ground-truth timestamps, shaded regions mark positive training samples for CNN-1.
Fig. 4: Intermediate outputs from the different stages of DeepDDK for a sample “Pa” file. Top panel: original audio signal (blue) with ground-truth timestamps (red). Second panel: output of CNN-1. Third panel: output of CNN-2 where local maxima indicate syllable timestamp.
Fig. 5: Cumulative distribution of event count error for pre-linguistic segmentation [14], Smekal et al. [15] [16], MFCC with Linear SVM and our DeepDDK software. Horizontal axis indicates count error (difference between the number of predicted events vs. ground truth events). Vertical axis shows the ratio of the test files. Absolute event count differences of 1, 2, 3, 4, 5 in the graph correspond to percent count errors of 1.35%, 2.70%, 4.05%, 5.40%, 6.75%, respectively (average number of events per file is 74).
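
The captions for Fig. 4 and Fig. 5 describe two steps that can be sketched in a few lines of Python: reading syllable timestamps off the local maxima of a score curve, and turning per-file count errors into a cumulative distribution. This is a simplified illustration, not the DeepDDK implementation. The function names, the `threshold` and `min_gap` parameters, and the toy score values are all assumptions; the source only states that local maxima of the CNN-2 output mark syllables and that the average file contains 74 events.

```python
def pick_peaks(scores, threshold=0.5, min_gap=1):
    """Return indices of local maxima in `scores` that reach `threshold`.

    `threshold` and `min_gap` (minimum spacing in samples between accepted
    peaks) are hypothetical parameters, not values taken from the paper.
    """
    peaks = []
    for i in range(1, len(scores) - 1):
        # Strictly above the left neighbor, so a plateau is counted once.
        is_local_max = scores[i - 1] < scores[i] >= scores[i + 1]
        if not is_local_max or scores[i] < threshold:
            continue
        if peaks and i - peaks[-1] < min_gap:
            continue
        peaks.append(i)
    return peaks


def percent_count_error(abs_error, events_per_file):
    """Express an absolute event-count difference as a percentage."""
    return 100.0 * abs_error / events_per_file


def cumulative_distribution(abs_errors, max_error):
    """Fraction of files with absolute count error <= k, for k = 0..max_error."""
    n = len(abs_errors)
    return [sum(e <= k for e in abs_errors) / n for k in range(max_error + 1)]


if __name__ == "__main__":
    # Toy score curve standing in for a localization output.
    scores = [0.0, 0.2, 0.9, 0.3, 0.1, 0.6, 0.8, 0.4, 0.1, 0.3, 0.2]
    predicted = pick_peaks(scores, threshold=0.5)
    print(predicted)  # [2, 6]

    # Absolute count error per file = |predicted count - ground-truth count|.
    print(cumulative_distribution([0, 1, 1, 3], max_error=3))

    # Percent errors for differences of 1..5 with exactly 74 events per file.
    print([round(percent_count_error(k, 74), 2) for k in range(1, 6)])
```

With exactly 74 events per file, differences of 4 and 5 come out as 5.41% and 6.76% rather than the 5.40% and 6.75% quoted in the caption. The small gap is probably a rounding effect, for example if the reported average of 74 is itself rounded.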
