Comparative Study

Artif Intell Med. 2016 Jan:66:15-28.

doi: 10.1016/j.artmed.2015.10.002. Epub 2015 Oct 30.

A generalized procedure for analyzing sustained and dynamic vocal fold vibrations from laryngeal high-speed videos using phonovibrograms

Jakob Unger¹, Maria Schuster², Dietmar J Hecker³, Bernhard Schick³, Jörg Lohscheller⁴

Affiliations

¹ Department of Computer Science, Trier University of Applied Sciences, Schneidershof, 54293 Trier, Germany. Electronic address: jakob.unger@lfb.rwth-aachen.de.
² Department of Otorhinolaryngology and Head and Neck Surgery, University of Munich, Campus Grosshadern, Marchioninistr. 13, 81366 München, Germany.
³ Department of Otorhinolaryngology, Saarland University Hospital, Kirrbergerstr., 66424 Homburg/Saar, Germany.
⁴ Department of Computer Science, Trier University of Applied Sciences, Schneidershof, 54293 Trier, Germany.

PMID: 26597002
DOI: 10.1016/j.artmed.2015.10.002

Jakob Unger et al. Artif Intell Med. 2016 Jan.

Abstract

Objective: This work presents a computer-based approach for analyzing the two-dimensional vocal fold dynamics captured in endoscopic high-speed videos. It extends and generalizes a previously proposed wavelet-based procedure. While most approaches target sustained phonation only, the proposed method allows for a clinically adequate analysis of both dynamic and sustained phonation paradigms.

Materials and methods: The analysis procedure is based on a spatio-temporal visualization technique, the phonovibrogram, that facilitates the documentation of the visible laryngeal dynamics. From the phonovibrogram, a low-dimensional set of features is computed using a principal component analysis strategy. These features quantify the type of vibration pattern, irregularity, lateral symmetry and synchronicity as a function of time. Two different test bench data sets are used to validate the approach: (I) 150 healthy and pathologic subjects examined during sustained phonation; and (II) 20 healthy and pathologic subjects who were examined twice, once during sustained phonation and once during a glissando from a low to a higher fundamental frequency. To assess the discriminative power of the extracted features, a Support Vector Machine is trained to distinguish between physiologic and pathologic vibrations. The results for sustained phonation sequences are compared to the previous approach. Finally, the classification performance of the stationary analysis procedure is compared to the transient analysis of the glissando maneuver.
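The abstract does not specify how the principal component analysis is applied to the phonovibrogram, so the following is only a minimal sketch of the general idea: reducing per-cycle vibration profiles to a few scores that can be tracked over time. The synthetic data, the function name `pca_features`, the array layout (one row per oscillation cycle) and the choice of three components are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def pca_features(cycles: np.ndarray, n_components: int = 3):
    """Project per-cycle profiles onto their leading principal components.

    cycles: array of shape (n_cycles, n_points), one row per oscillation
            cycle (hypothetical layout, not the authors' data format).
    Returns (scores, components, explained_variance_ratio).
    """
    centered = cycles - cycles.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T  # one low-dimensional row per cycle
    variance_ratio = singular_values**2 / np.sum(singular_values**2)
    return scores, components, variance_ratio[:n_components]


# Synthetic stand-in for phonovibrogram rows: a half-sine opening profile
# whose amplitude drifts slowly across cycles, plus a little noise.
rng = np.random.default_rng(0)
n_cycles, n_points = 40, 64
profile = np.sin(np.pi * np.linspace(0.0, 1.0, n_points))
amplitude = 1.0 + 0.3 * np.linspace(0.0, 1.0, n_cycles)
cycles = amplitude[:, None] * profile[None, :]
cycles += 0.01 * rng.standard_normal((n_cycles, n_points))

scores, components, variance_ratio = pca_features(cycles, n_components=3)
print(scores.shape, variance_ratio)
```

In this toy setting the first score follows the amplitude drift from cycle to cycle, which is the kind of time-dependent feature a classifier such as a Support Vector Machine could then consume.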

Results: For the first test bench, the proposed procedure outperformed the previous approach (proposed feature set: accuracy 91.3%, sensitivity 80%, specificity 97%; previous approach: accuracy 89.3%, sensitivity 76%, specificity 96%). Comparing the classification performance on the second test bench further corroborates that analyzing transient paradigms provides clear additional diagnostic value (glissando maneuver: accuracy 90%, sensitivity 100%, specificity 80%; sustained phonation: accuracy 75%, sensitivity 80%, specificity 70%).
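The reported percentages can be cross-checked with a short calculation. The abstract gives only the total group sizes (150 and 20 subjects), not the split between healthy and pathologic subjects. The confusion counts below therefore assume a split of 50 pathologic and 100 healthy subjects for test bench I and 10 of each for test bench II. That split is an inference that happens to reproduce the reported figures, not a number stated in the source. Treating "pathologic" as the positive class is likewise an assumption.

```python
def metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # correctly detected pathologic cases
        "specificity": tn / (tn + fp),  # correctly detected healthy cases
    }


# Test bench I, assumed split: 50 pathologic, 100 healthy (inferred).
proposed = metrics(tp=40, fn=10, tn=97, fp=3)
previous = metrics(tp=38, fn=12, tn=96, fp=4)

# Test bench II, assumed split: 10 pathologic, 10 healthy (inferred).
glissando = metrics(tp=10, fn=0, tn=8, fp=2)
sustained = metrics(tp=8, fn=2, tn=7, fp=3)

for name, m in [("proposed", proposed), ("previous", previous),
                ("glissando", glissando), ("sustained", sustained)]:
    print(name, {k: round(100 * v, 1) for k, v in m.items()})
```

Under these assumed counts, the accuracy gap between the two approaches on test bench I corresponds to three subjects out of 150, and the gap between glissando and sustained phonation on test bench II to three subjects out of 20.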

Conclusions: Incorporating parameters that describe the temporal evolution of vocal fold vibration clearly improves the automatic identification of pathologic vibration patterns. Furthermore, incorporating a dynamic phonation paradigm provides additional valuable information about the underlying laryngeal dynamics that cannot be derived from sustained conditions. The proposed generalized approach provides a better overall classification performance than the previous approach, and hence constitutes a new advantageous tool for an improved clinical diagnosis of voice disorders.

Keywords: Dynamic phonation; High-speed laryngoscopy; Multiscale product; Voice disorder; Wavelet ridge; Wavelet-based analysis.
