The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning
- PMID: 35408076
- PMCID: PMC9003467
- DOI: 10.3390/s22072461
Abstract
Machine learning (ML) algorithms within a human-computer framework are the leading force in speech emotion recognition (SER). However, few studies explore cross-corpus aspects of SER; this work explores the feasibility and characteristics of cross-linguistic, cross-gender SER. Three ML classifiers (SVM, Naïve Bayes and MLP) are applied to acoustic features obtained through a procedure based on Kononenko's discretization and correlation-based feature selection. The system covers five emotions (disgust, fear, happiness, anger and sadness) and uses the Emofilm database, which consists of short clips from English-language movies and their Italian and Spanish dubbed versions, for a total of 1115 annotated utterances. MLP proves the most effective classifier, with accuracies above 90% for single-language approaches, while the cross-language classifier still yields accuracies above 80%. Cross-gender tasks prove more difficult than those involving two languages, suggesting greater differences between emotions expressed by male versus female subjects than between different languages. Four feature domains, namely RASTA, F0, MFCC and spectral energy, are algorithmically assessed as the most effective, refining existing literature and approaches based on standard feature sets. To our knowledge, this is one of the first studies encompassing cross-gender and cross-linguistic assessments of SER.
Keywords: English; SER; SVM; artificial intelligence; cross-gender; cross-linguistic; emotion recognition; machine learning; speech.
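The abstract does not state how the cross-linguistic and cross-gender splits were built. One common way to set up such an evaluation is to hold out one group (a language or a gender) at a time, train on the rest, and test on the held-out group. The sketch below illustrates only that splitting step, under assumptions: the `Utterance` record, its field names, the `leave_one_group_out` helper and the toy entries are all hypothetical and are not taken from Emofilm or from the authors' pipeline.

```python
from collections import namedtuple

# Hypothetical record type; the field names are illustrative assumptions,
# not the actual Emofilm annotation schema.
Utterance = namedtuple("Utterance", ["clip_id", "language", "gender", "emotion"])


def leave_one_group_out(utterances, key):
    """Yield (held_out_value, train, test), holding out one group per fold.

    `key` names the grouping field, e.g. "language" or "gender".
    """
    groups = sorted({getattr(u, key) for u in utterances})
    for held_out in groups:
        train = [u for u in utterances if getattr(u, key) != held_out]
        test = [u for u in utterances if getattr(u, key) == held_out]
        yield held_out, train, test


# Toy placeholder records, not real corpus data.
toy = [
    Utterance("a", "en", "f", "anger"),
    Utterance("b", "en", "m", "fear"),
    Utterance("c", "it", "f", "sadness"),
    Utterance("d", "it", "m", "disgust"),
    Utterance("e", "es", "f", "happiness"),
    Utterance("f", "es", "m", "anger"),
]

# Cross-linguistic folds: train on two languages, test on the third.
language_folds = list(leave_one_group_out(toy, "language"))

# Cross-gender folds: train on one gender, test on the other.
gender_folds = list(leave_one_group_out(toy, "gender"))
```

Each fold's `train` and `test` lists could then be passed to any classifier, such as the SVM, Naïve Bayes or MLP models named in the abstract. Keeping the held-out group entirely out of training is what makes the resulting accuracy a cross-group figure rather than a within-group one.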
Conflict of interest statement
The authors declare no conflict of interest.
Similar articles
- Deep-Net: A Lightweight CNN-Based Speech Emotion Recognition System Using Deep Frequency Features. Sensors (Basel). 2020 Sep 12;20(18):5212. doi: 10.3390/s20185212. PMID: 32932723.
- A Comparison of Machine Learning Algorithms and Feature Sets for Automatic Vocal Emotion Recognition in Speech. Sensors (Basel). 2022 Oct 6;22(19):7561. doi: 10.3390/s22197561. PMID: 36236658.
- An Urdu speech corpus for emotion recognition. PeerJ Comput Sci. 2022 May 9;8:e954. doi: 10.7717/peerj-cs.954. eCollection 2022. PMID: 35634125.
- Deep Cross-Corpus Speech Emotion Recognition: Recent Advances and Perspectives. Front Neurorobot. 2021 Nov 29;15:784514. doi: 10.3389/fnbot.2021.784514. eCollection 2021. PMID: 34912204. Review.
- Random Deep Belief Networks for Recognizing Emotions from Speech Signals. Comput Intell Neurosci. 2017;2017:1945630. doi: 10.1155/2017/1945630. Epub 2017 Mar 5. PMID: 28356908. Review.
Cited by
- Facial expression recognition (FER) survey: a vision, architectural elements, and future directions. PeerJ Comput Sci. 2024 Jun 3;10:e2024. doi: 10.7717/peerj-cs.2024. eCollection 2024. PMID: 38855254.
- Speech emotion classification using attention based network and regularized feature selection. Sci Rep. 2023 Jul 25;13(1):11990. doi: 10.1038/s41598-023-38868-2. PMID: 37491423.
- Artificial Intelligence-Based Voice Assessment of Patients with Parkinson's Disease Off and On Treatment: Machine vs. Deep-Learning Comparison. Sensors (Basel). 2023 Feb 18;23(4):2293. doi: 10.3390/s23042293. PMID: 36850893.
- High-Level CNN and Machine Learning Methods for Speaker Recognition. Sensors (Basel). 2023 Mar 25;23(7):3461. doi: 10.3390/s23073461. PMID: 37050521.
- Acoustic Analysis of Speech for Screening for Suicide Risk: Machine Learning Classifiers for Between- and Within-Person Evaluation of Suicidality. J Med Internet Res. 2023 Mar 23;25:e45456. doi: 10.2196/45456. PMID: 36951913.