Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Comparative Study
. 2025 Jul;39(4):1131.e31-1131.e43.
doi: 10.1016/j.jvoice.2023.02.014. Epub 2023 Mar 16.

Perceptual and Computational Estimates of Vocal Breathiness and Roughness in Sustained Phonation and Connected Speech

Affiliations
Comparative Study

Perceptual and Computational Estimates of Vocal Breathiness and Roughness in Sustained Phonation and Connected Speech

Supraja Anand. J Voice. 2025 Jul.

Abstract

Objective: Clinical assessment of voice quality (VQ) often uses a combination of sustained phonations and more prolonged and more complex vocalizations. The purpose of this study was to compare the perceived vocal breathiness and vocal roughness of sustained phonations and connected speech over a wide range of dysphonia severity and to evaluate their relationship with acoustic measures and bioinspired models of breathiness and roughness.

Methods: VQ dimension-specific single-variable matching task (SVMT) was used to index the perceived breathiness or roughness of five male and five female talkers on the basis of a sustained /a/ phonation and the 5th CAPE-V sentence. Acoustic measures of cepstral peak, autocorrelation peak and psychoacoustic measures of pitch strength, and temporal envelope standard deviation (EnvSD) was used to predict perceived breathiness and roughness judgments obtained from 10 listeners, respectively.

Results: High intra- and inter-listener reliability was observed for sustained phonations and connected speech. Perceived breathiness and roughness of sustained vowels and sentences obtained using SVMT were highly correlated for most dysphonic voices. The pitch strength model of breathiness was able to capture larger amount of perceptual variance compared to cepstral peak in both vowels and sentences. Autocorrelation peak was strongly correlated to perceived roughness in sentences while EnvSD was strongly correlated to perceived roughness in vowels.

Conclusions: Results provide evidence that perception of VQ via SVMT can be successfully extended to connected speech. Computational models of VQ can be easily adapted to connected speech. Such automated models of VQ perception are valuable due to their computational efficiency and their ability to accurately capture the non-linearities of the human auditory system.

Keywords: Breathiness—Roughness—Sustained phonation—Connected speech—Perception—Modeling.

PubMed Disclaimer

Publication types

MeSH terms

LinkOut - more resources