Review. 2017;2017:1945630. doi: 10.1155/2017/1945630. Epub 2017 Mar 5.

Random Deep Belief Networks for Recognizing Emotions from Speech Signals

Guihua Wen et al. Comput Intell Neurosci. 2017.

Abstract

Human emotions can now be recognized from speech signals using machine learning methods; however, these methods suffer from low recognition accuracy in real applications because they lack rich representation ability. Deep belief networks (DBN) can automatically discover multiple levels of representation in speech signals. To make full use of this advantage, this paper presents an ensemble method, random deep belief networks (RDBN), for speech emotion recognition. It first extracts low-level features from the input speech signal and then uses them to construct many random subspaces. Each random subspace is fed to a DBN to yield higher-level features, which serve as the input to a classifier that outputs an emotion label. All output emotion labels are then fused by majority voting to decide the final emotion label for the input speech signal. Experimental results on benchmark speech emotion databases show that RDBN achieves better accuracy than the compared methods for speech emotion recognition.
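The pipeline described in the abstract (random feature subspaces, one base learner per subspace, majority voting over the predicted labels) can be illustrated with a minimal sketch. It is not the authors' implementation: the DBN and classifier stage is replaced by a simple nearest-centroid learner so that the example runs with the Python standard library alone, and all names (`random_subspaces`, `NearestCentroid`, `RandomSubspaceEnsemble`, the toy feature vectors and labels) are hypothetical.

```python
import random
from collections import Counter


def random_subspaces(n_features, subspace_size, n_subspaces, rng):
    """Draw random subsets of feature indices (no repeats within a subset)."""
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_subspaces)]


def project(x, indices):
    """Keep only the features of x selected by one subspace."""
    return [x[i] for i in indices]


class NearestCentroid:
    """Stand-in base learner (assumption): the paper uses a DBN plus a classifier."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(sorted(self.centroids), key=sq_dist)


def majority_vote(labels):
    """Return the most frequent label; ties go to the smallest label."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)


class RandomSubspaceEnsemble:
    def __init__(self, subspace_size, n_members, seed=0):
        self.subspace_size = subspace_size
        self.n_members = n_members
        self.seed = seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        self.subspaces = random_subspaces(
            len(X[0]), self.subspace_size, self.n_members, rng)
        # One base learner per subspace, trained on the projected features.
        self.members = [
            NearestCentroid().fit([project(x, idx) for x in X], y)
            for idx in self.subspaces
        ]
        return self

    def predict_one(self, x):
        votes = [m.predict_one(project(x, idx))
                 for m, idx in zip(self.members, self.subspaces)]
        return majority_vote(votes)


# Toy low-level feature vectors (made up for illustration only).
X_train = [[0.0, 0.1, 0.0, 0.1], [0.1, 0.0, 0.1, 0.0],
           [1.0, 0.9, 1.0, 0.9], [0.9, 1.0, 0.9, 1.0]]
y_train = ["calm", "calm", "angry", "angry"]

ensemble = RandomSubspaceEnsemble(subspace_size=2, n_members=5).fit(X_train, y_train)
print(ensemble.predict_one([0.2, 0.1, 0.0, 0.1]))
```

In this sketch the ensemble size (`n_members`) and the number of features per subspace (`subspace_size`) are the two parameters that Figures 4 to 7 vary; the values used here are arbitrary.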

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Structure of deep belief network.
Figure 2. Structure of the standard RBM.
Figure 3. Framework of RDBN for speech emotion recognition, illustrating the method to create the base classifiers for the ensemble through random subspace, DBN, and SVM, where the majority voting is applied to perform the fusion.
Figure 4. Accuracies (WA) vary with the number of features for each ensemble size on EMODB, aiming to find the optimal ensemble size and the number of features for RDBN on this database.
Figure 5. Accuracies (WA) vary with the number of features for each ensemble size on CASIA, aiming to find the optimal ensemble size and the number of features for RDBN on this database.
Figure 6. Accuracies (WA) vary with the number of features for each ensemble size on SAVEE, aiming to find the optimal ensemble size and the number of features for RDBN on this database.
Figure 7. Accuracies (WA) vary with the number of features for each ensemble size on FAU database, aiming to find the optimal ensemble size and the number of features for RDBN on this database.
Algorithm 1. RDBN.
