Deep Cross-Corpus Speech Emotion Recognition: Recent Advances and Perspectives

Shiqing Zhang¹, Ruixin Liu¹,², Xin Tao¹, Xiaoming Zhao¹

Affiliations

¹ Institute of Intelligence Information Processing, Taizhou University, Zhejiang, China.
² School of Sugon Big Data Science, Zhejiang University of Science and Technology, Zhejiang, China.

PMID: 34912204
PMCID: PMC8666588
DOI: 10.3389/fnbot.2021.784514

Review

Shiqing Zhang et al. Front Neurorobot. 2021 Nov 29:15:784514. doi: 10.3389/fnbot.2021.784514. eCollection 2021.

Abstract

Automatic speech emotion recognition (SER) is a challenging component of human-computer interaction (HCI). The existing literature mainly evaluates SER performance by training and testing on a single corpus in a single language. In many practical applications, however, the training and testing corpora differ substantially. Because speech emotion corpora and languages are so diverse, most previous SER methods do not perform well in real-world cross-corpus or cross-language scenarios. Motivated by the powerful feature learning ability of recently emerged deep learning techniques, researchers have increasingly adopted advanced deep learning models for cross-corpus SER. This paper provides an up-to-date and comprehensive survey of cross-corpus SER, with emphasis on deep learning techniques based on supervised, unsupervised, and semi-supervised learning. It also highlights the challenges and opportunities in cross-corpus SER tasks and points out future trends.

Keywords: cross-corpus; deep learning; feature learning; speech emotion recognition; survey.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
