Vocal tract shaping of emotional speech
- PMID: 32523241
- PMCID: PMC7286544
- DOI: 10.1016/j.csl.2020.101100
Abstract
Emotional speech production has previously been studied using fleshpoint tracking data in speaker-specific experimental setups. The present study introduces a real-time magnetic resonance imaging database of emotional speech production from 10 speakers and presents an articulatory analysis of emotional expression in speech based on this database. Midsagittal vocal tract parameters (midsagittal distances and vocal tract length) were parameterized on a two-dimensional grid-line system using image segmentation software. The principal feature analysis technique was applied to the grid-line system to identify the major movement locations. Results reveal both speaker-dependent and speaker-independent variation patterns. For example, sad speech, a low-arousal emotion, tends to show a smaller opening for low vowels than the high-arousal emotions, and this tendency is more consistent in the front cavity than in other regions of the vocal tract. Happiness shows a significantly shorter vocal tract length than anger and sadness in most speakers. Further details of speaker-dependent and speaker-independent articulatory variation in emotional expression and their implications are described.
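To illustrate the kind of procedure referred to as principal feature analysis (selecting a subset of grid lines whose distance trajectories represent the major movement locations), the following is a minimal sketch in Python. It assumes the midsagittal distances have already been extracted into a matrix X of shape (frames, grid lines); the function name, the number of retained components, and the number of selected features are illustrative assumptions, not values taken from the paper.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def principal_feature_analysis(X, n_components=5, n_features=8, random_state=0):
    """Select n_features representative columns of X via principal feature analysis.

    Each feature (grid-line distance) is projected into the space spanned by the
    leading principal components; the resulting loading vectors are clustered,
    and the feature closest to each cluster centroid is kept as a representative
    "major movement location".
    """
    # Standardize columns so grid lines with larger absolute distances
    # do not dominate the covariance structure.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)

    # PCA over frames; components_ has shape (n_components, n_gridlines),
    # so transposing gives one loading vector per grid line.
    pca = PCA(n_components=n_components).fit(Xs)
    loadings = pca.components_.T  # (n_gridlines, n_components)

    # Cluster the loading vectors; keep the grid line nearest each centroid.
    km = KMeans(n_clusters=n_features, n_init=10, random_state=random_state).fit(loadings)
    selected = []
    for centroid in km.cluster_centers_:
        idx = int(np.argmin(np.linalg.norm(loadings - centroid, axis=1)))
        selected.append(idx)
    return sorted(set(selected))

if __name__ == "__main__":
    # Synthetic stand-in for grid-line distance data: 500 frames, 30 grid lines.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 30))
    print(principal_feature_analysis(X))  # indices of selected grid lines

In practice the selected indices would then be interpreted against the vocal tract regions (e.g., lips, tongue tip, tongue dorsum, pharynx) that the corresponding grid lines cross.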
Keywords: MR image segmentation; USC-EMO-MRI corpus; emotional speech production; vocal tract shaping.
Conflict of interest statement
Declaration of interests The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.