Sci Rep. 2024 Aug 31;14(1):20270.
doi: 10.1038/s41598-024-70774-z.

Deep learning approach for dysphagia detection by syllable-based speech analysis with daily conversations

Seokhyeon Heo et al. Sci Rep. 2024.

Abstract

Dysphagia, a disorder affecting the ability to swallow, is highly prevalent among older adults and can lead to serious health complications, so early detection is important. This study evaluated a newly developed deep learning model that analyzes syllable-segmented speech data to detect dysphagia, an approach not addressed in prior studies. Audio recordings of daily conversations were collected from 16 patients with dysphagia and 24 controls; the presence of dysphagia was determined by videofluoroscopic swallowing study. The recordings were segmented into syllables using a speech-to-text model and analyzed with a convolutional neural network that performed binary classification between dysphagia patients and controls. The model was assessed in two ways. First, in the syllable-segmented analysis, it achieved an accuracy of 0.794, a sensitivity of 0.901, a specificity of 0.687, a positive predictive value of 0.742, and a negative predictive value of 0.874. Second, at the individual level, it achieved an overall accuracy of 0.900 and an area under the curve of 0.953. These results highlight the potential of deep learning models as an early, non-invasive, and simple method for detecting dysphagia in everyday environments.
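The syllable-level figures above follow the standard confusion-matrix definitions. The sketch below shows how each metric is derived. The counts are hypothetical (1,000 syllables per class, chosen only so that the rates match the reported values); the abstract does not give the study's actual syllable counts, and the names `ConfusionCounts` and `diagnostic_metrics` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier; 'positive' means dysphagia."""
    tp: int  # dysphagia syllables classified as dysphagia
    fn: int  # dysphagia syllables classified as control
    tn: int  # control syllables classified as control
    fp: int  # control syllables classified as dysphagia


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the five metrics quoted in the abstract."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts, NOT taken from the paper.
counts = ConfusionCounts(tp=901, fn=99, tn=687, fp=313)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With equal numbers of positive and negative examples, accuracy is the mean of sensitivity and specificity, which is consistent with the reported 0.794 = (0.901 + 0.687) / 2.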

Keywords: Artificial intelligence; Conversations; Deep learning; Dysphagia; Speech-to-text model; Syllable-based speech analysis.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. The overview of the proposed classification process.
Fig. 2. Construction of the dataset: (a) dataset of syllable-based segmentation; (b) dataset for training and testing.
Fig. 3. Method of segmenting conversation into syllable units.
Fig. 4. Architecture of ResNet-34.
Fig. 5. Evaluation methods: (a) syllable-segmented analysis; (b) individual-level evaluation; (c) individual-level classification process.
Fig. 6. Result of syllable-segmented classification for dysphagia.
Fig. 7. Result of individual-level classification for dysphagia and ROC curve.
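
The individual-level evaluation in Fig. 5 turns many per-syllable predictions into one decision per speaker. This page does not state the aggregation rule, so the sketch below assumes one plausible option: average the per-syllable dysphagia probabilities and compare the mean to a threshold. The function name, the 0.5 threshold, and the example probabilities are all hypothetical and are not the authors' method.

```python
from statistics import mean


def classify_individual(syllable_probs: list[float],
                        threshold: float = 0.5) -> tuple[float, bool]:
    """Aggregate per-syllable dysphagia probabilities for one speaker.

    Returns (mean probability, predicted-dysphagia flag).
    Mean-probability aggregation is an assumption, not the paper's rule.
    """
    if not syllable_probs:
        raise ValueError("at least one syllable probability is required")
    score = mean(syllable_probs)
    return score, score >= threshold


# Hypothetical per-syllable outputs for two speakers.
speakers = {
    "speaker_a": [0.9, 0.8, 0.4, 0.7],
    "speaker_b": [0.2, 0.6, 0.1, 0.3],
}

results = {name: classify_individual(probs) for name, probs in speakers.items()}
for name, (score, is_dysphagia) in results.items():
    print(f"{name}: score={score:.2f}, dysphagia={is_dysphagia}")
```

The per-speaker score can also be swept over thresholds to trace an ROC curve like the one in Fig. 7; majority voting over thresholded syllable labels would be an alternative aggregation rule.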

