DeepDDK: A Deep Learning based Oral-Diadochokinesis Analysis Software

Yang Yang Wang¹, Ke Gao¹, Yunxin Zhao¹, Mili Kuruvilla-Dugdale², Teresa E Lever³, Filiz Bunyak¹

Affiliations

¹ Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri 65211.
² Department of Speech, Language and Hearing Sciences, University of Missouri, Columbia, Missouri 65211.
³ Department of Otolaryngology - Head and Neck Surgery, University of Missouri, Columbia, Missouri 65211.

PMID: 32864624
PMCID: PMC7451101
DOI: 10.1109/bhi.2019.8834506

Yang Yang Wang et al. IEEE EMBS Int Conf Biomed Health Inform. 2019:2019:1-4. doi: 10.1109/bhi.2019.8834506. Epub 2019 Sep 12.

Abstract

Oromotor dysfunction caused by neurological disorders can result in significant speech and swallowing impairments. Current diagnostic methods for assessing oromotor function are subjective and rely on clinicians' perceptual judgments. In particular, the widely used oral-diadochokinesis (oral-DDK) test, which requires rapid, alternating repetitions of speech-based syllables, is conducted and interpreted differently among clinicians. It is therefore prone to inaccuracy, which results in poor test reliability and poor clinical applicability. In this paper, we present deep learning-based software that extracts quantitative data from the oral-DDK signal, thereby transforming the test into an objective diagnostic and treatment-monitoring tool. The proposed software consists of two main modules: a fully automated syllable detection module and an interactive visualization and editing module that allows inspection and correction of the automatically detected syllable units. The DeepDDK software was evaluated on speech files corresponding to 9 different DDK syllables (e.g., "Pa", "Ta", "Ka"). The experimental results show that both syllable detection and localization are robust across different types of DDK speech tasks.

Keywords: Diadochokinesis analysis; deep learning; event detection; event localization; speech signal analysis.

Figures

**Fig. 1:**
Audio waveform samples for different types of oral-DDK tasks.

**Fig. 2:**
CNN-1 and CNN-2 architectures used for DDK syllable detection and localization.

**Fig. 3:**
Sample signals and associated training data. Colored dots mark ground-truth timestamps; shaded regions mark positive training samples for CNN-1.

**Fig. 4:**
Intermediate outputs from the different stages of DeepDDK for a sample “Pa” file. Top panel: original audio signal (blue) with ground-truth timestamps (red). Second panel: output of CNN-1. Third panel: output of CNN-2, where local maxima indicate syllable timestamps.
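
The local-maximum rule described for the third panel can be illustrated with a minimal sketch. It assumes the CNN-2 output is available as a plain sequence of per-frame scores; the function names, the `min_height` threshold, and the `hop_seconds` frame spacing are hypothetical and are not taken from the paper, which may use a different peak-picking procedure.

```python
from typing import List, Sequence


def find_local_maxima(scores: Sequence[float], min_height: float = 0.5) -> List[int]:
    """Return frame indices that are strict local maxima above min_height.

    min_height is an assumed threshold (not from the paper). Flat plateaus
    are ignored because both comparisons are strict.
    """
    peaks = []
    for i in range(1, len(scores) - 1):
        is_peak = scores[i] > scores[i - 1] and scores[i] > scores[i + 1]
        if is_peak and scores[i] >= min_height:
            peaks.append(i)
    return peaks


def frames_to_seconds(indices: Sequence[int], hop_seconds: float) -> List[float]:
    """Convert frame indices to timestamps; hop_seconds is an assumed frame spacing."""
    return [i * hop_seconds for i in indices]


# Toy score curve standing in for a CNN-2 output (made-up values).
scores = [0.1, 0.2, 0.9, 0.3, 0.1, 0.4, 0.8, 0.7, 0.2, 0.3, 0.2]
peak_frames = find_local_maxima(scores)
peak_times = frames_to_seconds(peak_frames, hop_seconds=0.01)
print(peak_frames)  # [2, 6]
```

The small bump at index 9 is a local maximum but falls below the assumed threshold, so it is not reported as a syllable.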

**Fig. 5:**
Cumulative distribution of event count error for pre-linguistic segmentation [14], Smekal et al. [15], [16], MFCC with linear SVM, and our DeepDDK software. The horizontal axis indicates the count error (the difference between the number of predicted events and the number of ground-truth events). The vertical axis shows the fraction of test files. Absolute event count differences of 1, 2, 3, 4, 5 in the graph correspond to percent count errors of 1.35%, 2.70%, 4.05%, 5.40%, 6.75%, respectively (the average number of events per file is 74).
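
The quantities in this caption can be reproduced with a short sketch. It assumes each test file yields one predicted and one ground-truth event count; the function names and the toy counts are illustrative and are not the authors' evaluation code. Only the average of 74 events per file comes from the caption. The caption's percentages match multiples of 1.35% (1/74 is about 1.35%), so the exact ratios computed here can differ from them in the second decimal place.

```python
from typing import Dict, List, Sequence


def count_errors(predicted: Sequence[int], truth: Sequence[int]) -> List[int]:
    """Absolute difference between predicted and ground-truth event counts per file."""
    return [abs(p - t) for p, t in zip(predicted, truth)]


def cumulative_distribution(errors: Sequence[int], max_error: int) -> Dict[int, float]:
    """Fraction of files whose count error is at most k, for k = 0..max_error."""
    if not errors:
        raise ValueError("errors must not be empty")
    n = len(errors)
    return {k: sum(e <= k for e in errors) / n for k in range(max_error + 1)}


def percent_count_error(abs_error: int, mean_events: float = 74.0) -> float:
    """Express an absolute count error as a percentage of the mean events per file."""
    return 100.0 * abs_error / mean_events


# Made-up counts for five hypothetical test files.
predicted = [74, 73, 76, 70, 74]
truth = [74, 74, 74, 74, 75]

errors = count_errors(predicted, truth)        # [0, 1, 2, 4, 1]
cdf = cumulative_distribution(errors, max_error=5)
for k, fraction in cdf.items():
    print(f"error <= {k} ({percent_count_error(k):.2f}%): {fraction:.2f} of files")
```

Reading the curve this way, a method whose distribution rises faster toward 1.0 has more files with small count errors.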

References

1. Horne Malcolm, Power Laura, and Szmulewicz David. “Quantitative Assessment of Syllabic Timing Deficits in Ataxic Dysarthria.” EMBC, pp. 425–428. IEEE, 2018.
2. Rusz Jan, Benova Barbora, Ruzickova Hana, Novotny Michal, Tykalova Tereza, Hlavnicka Jan, Uher Tomas et al. “Characteristics of motor speech phenotypes in multiple sclerosis.” Multiple Sclerosis and Related Disorders 19 (2018): 62–69.
3. Poellabauer Christian, Yadav Nikhil, Daudet Louis, Schneider Sandra L., Busso Carlos, and Flynn Patrick J. “Challenges in concussion detection using vocal acoustic biomarkers.” IEEE Access 3 (2015): 1143–1160.
4. Godino-Llorente JI, Shattuck-Hufnagel S, Choi JY, Moro-Velázquez L, and Gómez-García JA. “Towards the identification of Idiopathic Parkinson's Disease from the speech. New articulatory kinetic biomarkers.” PLoS ONE 12, no. 12 (2017): e0189583.
5. Duranovic Mirela, and Sehic Sabina. “The speed of articulatory movements involved in speech production in children with dyslexia.” Journal of Learning Disabilities 46, no. 3 (2013): 278–286.

Grants and funding

R15 DC016383/DC/NIDCD NIH HHS/United States
