Bioengineering (Basel). 2024 Dec 11;11(12):1253.
doi: 10.3390/bioengineering11121253.

Sensitivity of Acoustic Voice Quality Measures in Simulated Reverberation Conditions

Ahmed M Yousef et al. Bioengineering (Basel).

Abstract

Room reverberation can affect oral/aural communication and is especially critical in computer analysis of voice. High levels of reverberation can distort voice recordings, reducing the accuracy of voice production quality measurements and vocal health evaluations. This study quantifies the impact of added simulated reverberation on otherwise clean voice recordings, as reflected in voice metrics commonly used for voice quality evaluation. From a larger database of voice recordings collected in a low-noise, low-reverberation environment, the study used samples of a sustained [a:] vowel produced with two different speaker intents (comfortable and clear) by five vocally healthy, college-age, female native English speakers. Using the reverb effect in Audacity, eight reverberation conditions spanning a range of reverberation times (T20 between 0.004 and 1.82 s) were simulated and convolved with the original recordings. All voice samples, both original and reverberation-affected, were analyzed using the freely available PRAAT software (version 6.0.13) to calculate five common voice parameters: jitter, shimmer, harmonic-to-noise ratio (HNR), alpha ratio, and smoothed cepstral peak prominence (CPPs). Statistical analyses assessed how sensitive each voice metric was to the range of simulated room reverberation conditions and how much it varied across them. Results showed that jitter, HNR, and alpha ratio were stable at simulated reverberation times below a T20 of 1 s, with HNR and jitter more stable in the clear vocal style. Shimmer was highly sensitive even at a T20 of 0.53 s, which would reflect a common room, while CPPs remained stable across all simulated reverberation conditions.
Understanding the sensitivity and stability of these voice metrics to a range of room acoustics effects allows for targeted use of certain metrics even in less controlled environments, enabling selective application of stable measures like CPPs and cautious interpretation of shimmer, ensuring more reliable and accurate voice assessments.
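As a rough illustration of what convolving a clean recording with a simulated room response involves, the following Python sketch builds a synthetic impulse response from exponentially decaying noise and applies it to a signal. This is not the Audacity reverb effect used in the study; it is a simple stand-in technique, and the function names, the 60 dB decay parameter, and the sine-wave test signal are all assumptions made for illustration. For an ideally exponential decay, a T20 estimate would coincide with the 60 dB decay time used here.

```python
import numpy as np


def synthetic_impulse_response(decay_time_s, sample_rate, seed=0):
    """Hypothetical helper: noise whose amplitude envelope falls by 60 dB
    over decay_time_s seconds (a crude model of a diffuse room response)."""
    rng = np.random.default_rng(seed)
    n = max(1, int(round(decay_time_s * sample_rate)))
    t = np.arange(n) / sample_rate
    # A 60 dB drop in level is a factor of 10**-3 in amplitude.
    envelope = 10.0 ** (-3.0 * t / decay_time_s)
    ir = rng.standard_normal(n) * envelope
    ir[0] = 1.0  # direct sound arrives first, at full level
    return ir


def add_reverb(clean, impulse_response):
    """Convolve a clean signal with an impulse response and peak-normalize."""
    wet = np.convolve(clean, impulse_response)
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet


if __name__ == "__main__":
    sample_rate = 8000
    t = np.arange(sample_rate) / sample_rate
    # A 220 Hz sine is only a placeholder for a sustained vowel recording.
    clean = np.sin(2.0 * np.pi * 220.0 * t)
    ir = synthetic_impulse_response(0.5, sample_rate)
    wet = add_reverb(clean, ir)
    print(len(clean), len(ir), len(wet))
```

The reverberant output is longer than the input by the impulse response length minus one sample, which is the reverberant tail that smears energy across successive glottal cycles.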

Keywords: reverberation; sensitivity; simulated room acoustics; speech acoustics; voice metrics.

Conflict of interest statement

The authors confirm that there are no conflicts of interest regarding the work introduced in the present paper.

Figures

Figure 1
Examples of the computed reverberation time (T20) for four of the simulated room conditions across different octave band frequencies. Minimal Reverb represents a simulated anechoic chamber, while Low, Medium, and High Reverb correspond to simulated rooms with increasing levels of reverberation intensity.
Figure 2
The mean and standard deviation of the absolute percent change in jitter as a function of simulated reverberation time T20 for comfortable (left) and clear (right) sustained vowel [a:] production. The red dashed line indicates the linear regression fit.
Figure 3
The mean and standard deviation of the absolute percent change in shimmer as a function of simulated reverberation time T20 for comfortable (left) and clear (right) sustained vowel [a:] production. The red dashed line indicates the linear regression fit.
Figure 4
The mean and standard deviation of the absolute percent change in harmonic-to-noise ratio (HNR) as a function of simulated reverberation time T20 for comfortable (left) and clear (right) sustained vowel [a:] production. The red dashed line indicates the linear regression fit.
Figure 5
The mean and standard deviation of the absolute percent change in alpha ratio as a function of simulated reverberation time T20 for comfortable (left) and clear (right) sustained vowel [a:] production. The red dashed line indicates the linear regression fit.
Figure 6
The mean and standard deviation of the absolute percent change in smoothed cepstral peak prominence (CPPs) as a function of simulated reverberation time T20 for comfortable (left) and clear (right) sustained vowel [a:] production. The red dashed line indicates the linear regression fit.
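
The quantity plotted in Figures 2 to 6, the absolute percent change in a metric relative to the clean recording with a linear regression fit against T20, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the numbers in the usage section are placeholders rather than study data, and an ordinary least-squares line from numpy.polyfit is assumed for the fit.

```python
import numpy as np


def absolute_percent_change(baseline, reverberant):
    """Absolute percent change of a metric relative to its clean-recording value."""
    baseline = np.asarray(baseline, dtype=float)
    reverberant = np.asarray(reverberant, dtype=float)
    return 100.0 * np.abs(reverberant - baseline) / np.abs(baseline)


def linear_fit(t20_s, change_pct):
    """Least-squares line through (T20, percent change); returns (slope, intercept)."""
    slope, intercept = np.polyfit(t20_s, change_pct, deg=1)
    return float(slope), float(intercept)


if __name__ == "__main__":
    # Placeholder values only, NOT measurements from the study.
    t20_s = np.array([0.0, 0.5, 1.0, 1.5])
    clean_value = 2.0
    reverberant_values = np.array([2.0, 2.2, 2.4, 2.6])

    change = absolute_percent_change(clean_value, reverberant_values)
    slope, intercept = linear_fit(t20_s, change)
    print(change, slope, intercept)
```

Taking the absolute value means the measure reflects how far a metric moves from its clean value, not whether reverberation raises or lowers it.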
