J Voice. 2012 Jul;26(4):416-24.
doi: 10.1016/j.jvoice.2011.05.001. Epub 2011 Sep 22.

Vowel- and text-based cepstral analysis of chronic hoarseness

Cornelia Moers et al. J Voice. 2012 Jul.

Abstract

Objectives/hypothesis: Automatic voice evaluation is usually performed on stable sections of sustained vowels, which often cannot capture hoarseness properly. Unlike established perturbation-based measures, cepstral peak prominence (CPP) and smoothed CPP (CPPS) do not require exact determination of the fundamental frequency cycles. They can also be applied to text recordings. In this study, they were compared with perceptual evaluation of voice quality and the German roughness-breathiness-hoarseness (RBH) scheme.
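
For orientation, the idea behind CPP can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm of the "cpps" software or of Praat: the function name, the 60 to 330 Hz pitch search range, the Hann window, and the 1 ms start of the regression line are all choices made for this example. The measure is taken here as the height of the cepstral peak, in dB, above a straight line fitted to the cepstrum.

```python
import numpy as np


def cepstral_peak_prominence(frame, sample_rate, f0_min=60.0, f0_max=330.0):
    """Return (cpp_db, peak_quefrency_s) for one frame of audio samples.

    Illustrative only: the search range and baseline fit are assumptions.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    windowed = frame * np.hanning(n)

    # Log-magnitude spectrum in dB (the small floor avoids log of zero).
    log_spectrum_db = 20.0 * np.log10(np.abs(np.fft.fft(windowed)) + 1e-12)

    # Real cepstrum of the dB spectrum, itself expressed in dB.
    cepstrum = np.fft.ifft(log_spectrum_db).real
    cepstrum_db = 20.0 * np.log10(np.abs(cepstrum) + 1e-12)
    quefrency = np.arange(n) / sample_rate  # seconds

    # Look for the peak only where a plausible pitch period could lie.
    search = (quefrency >= 1.0 / f0_max) & (quefrency <= 1.0 / f0_min)
    peak_index = np.flatnonzero(search)[np.argmax(cepstrum_db[search])]

    # Baseline: straight line fitted from 1 ms to the end of the search range.
    fit = (quefrency >= 0.001) & (quefrency <= 1.0 / f0_min)
    slope, intercept = np.polyfit(quefrency[fit], cepstrum_db[fit], 1)
    baseline_db = slope * quefrency[peak_index] + intercept

    return cepstrum_db[peak_index] - baseline_db, quefrency[peak_index]
```

No individual glottal cycle is located at any point, which is why a measure of this kind remains defined for irregular voices and for running speech. A strongly periodic frame should give a clearly higher value than a noisy one. Averaging over frames and smoothing the cepstrum, as CPPS does, are left out of this sketch.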

Study design: Retrospective data analysis.

Methods: Seventy-three hoarse patients (48.3±16.8 years) uttered the vowel /e/ and read the German version of the text "The North Wind and the Sun". The text recordings were evaluated perceptually by five speech therapists and physicians according to the RBH scale. The criterion "overall quality" was measured on a 4-point scale and a visual analog scale. For the human-machine correlation, the automatic measures of the Praat program (vowels only) and the "cpps" software were compared with the experts' ratings. The experiments were repeated for speakers with jitter ≤5% or shimmer ≤5% (n=47).

Results: For the entire group (n=73), the best human-machine results for most of the rating criteria were obtained for text-based CPP and CPPS (up to |ρ|=0.73). For the 47 selected speakers, the correlation was markedly worse for all measures but still best for text-based CPP and CPPS (|ρ|≤0.50).
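
The human-machine agreement reported above is a correlation coefficient written as ρ. Assuming this denotes Spearman's rank correlation (the abstract does not name the coefficient), the computation can be sketched as follows. The function names and the five value pairs are hypothetical and serve only to make the example runnable; they are not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up example values: one CPP value (dB) and one expert rating per speaker,
# on a 4-point scale where a higher score means a worse voice.
cpp_db = [12.1, 9.4, 7.8, 5.0, 3.2]
hoarseness = [0, 1, 1, 2, 3]
print(abs(spearman_rho(cpp_db, hoarseness)))
```

The sign of the coefficient depends on the direction of the rating scale, which is one plausible reason for reporting absolute values |ρ|.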

Conclusions: Cepstrum analysis should be performed on a text recording. In that case, it outperforms all perturbation-based measures and can provide meaningful objective support for perceptual analysis.
