Sci Rep. 2025 Dec 3;16(1):105.
doi: 10.1038/s41598-025-29073-4.

Empowering emotional intelligence through deep learning techniques


B V Gokulnath et al. Sci Rep.

Abstract

We propose that employing an ensemble of deep learning models can enhance the recognition of, and adaptive response to, human emotions, outperforming the use of a single model. Our study introduces a multimodal emotional intelligence system that blends CNNs for facial emotion detection, BERT for text mood analysis, RNNs for tracking emotions over time, and GANs for creating emotion-specific content. We built these models with TensorFlow, Keras, and PyTorch, and trained them on Kaggle datasets, including FER-2013 for facial expressions and labeled text data for sentiment tasks. Our experiments show strong results: CNNs reach about 80% accuracy in recognizing facial emotions, BERT achieves about 92% accuracy in text sentiment, RNNs reach around 89% for sequential emotion tracking, and GANs produce personalized, age-related content that is judged contextually appropriate in over 90% of test cases. These findings support the idea that a combined model architecture can yield more accurate and adaptable emotional responses than simpler approaches. The framework could be useful in areas such as healthcare, customer service, education, and digital well-being, helping to create AI systems that are more empathetic and user-focused.
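The paper does not reproduce its ensemble code here, but the core idea of combining per-modality predictions can be illustrated with a minimal late-fusion sketch. This is an assumption-laden toy, not the authors' implementation: each model (e.g. the CNN on a face image, BERT on a text snippet) is assumed to emit a probability distribution over a shared emotion label set, and the distributions are averaged with weights proportional to each modality's reported validation accuracy.

```python
import numpy as np

# Illustrative label set; the actual label inventory in the paper may differ.
EMOTIONS = ["angry", "fearful", "happy", "neutral", "sad"]

def fuse_predictions(modality_probs, weights=None):
    """Late fusion: weighted average of per-modality probability vectors.

    modality_probs: list of probability vectors, one per modality.
    weights: optional per-modality weights (e.g. validation accuracies).
    """
    probs = np.asarray(modality_probs, dtype=float)
    if weights is None:
        weights = np.ones(len(probs))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize weights to sum to 1
    fused = (w[:, None] * probs).sum(axis=0)
    return fused / fused.sum()            # renormalize the fused distribution

# Example: hypothetical CNN (face) and BERT (text) outputs over EMOTIONS,
# weighted by the ~80% and ~92% accuracies reported in the abstract.
cnn_out  = [0.10, 0.05, 0.60, 0.20, 0.05]
bert_out = [0.05, 0.05, 0.70, 0.15, 0.05]
fused = fuse_predictions([cnn_out, bert_out], weights=[0.80, 0.92])
print(EMOTIONS[int(np.argmax(fused))])  # → happy
```

A real system would replace the hard-coded vectors with live model outputs and could learn the fusion weights instead of fixing them.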

Keywords: Bidirectional Encoder Representations from Transformers (BERT); Convolutional Neural Networks (CNN); Deep learning; Emotional intelligence; Facial emotion recognition; Generative Adversarial Networks (GANs); Human–computer interaction; Multimodal emotion recognition; Recurrent Neural Networks (RNN); Sentiment analysis.


Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests. Consent for publication: All individuals featured in the submitted images provided written consent for their use in this publication. The authors confirm that all methods were carried out in accordance with relevant guidelines and regulations, and that all experimental protocols were approved by VIT-AP University. Informed consent was obtained from all subjects for participation in the original data collection. Additionally, informed consent for publication of the data and images was obtained from all subjects and/or their legal guardians.

Figures

Fig. 1
Flowchart of the emotional intelligence system that combines CNN, BERT, RNN, and GAN models to create personalized emotional responses using different types of data.
Fig. 2
Comparison of performance metrics across all models
Fig. 3
Model accuracy and model loss
Fig. 4
Testing accuracy, giving a clear picture of how well the model did on the test data.
Fig. 5
Classification report and confusion matrix, giving a detailed look at the model's precision, recall, and classification quality. According to the confusion matrix, the CNN model classifies "happy" and "neutral" emotions most reliably, with fewer misclassifications than "fearful" or "disgusted." Misclassifications typically occur between visually similar emotions, such as "fearful" and "surprised."
Fig. 6
ROC curves, showing how each model balances correctly detecting positives against raising false alarms at different thresholds, with an average AUC of 0.91 for the CNN, 0.95 for BERT, and 0.93 for the RNN. This performance is indirectly supported by the GAN model, which generates realistic, emotionally consistent augmentation outputs.
Fig. 7
Compares the accuracy of all models, clearly showing how they perform relative to each other.
Fig. 8
Emotion-based output for elderly (Happy Emotion)
Fig. 9
Emotion-based output for child (Happy Emotion)
Fig. 10
Emotion-based output for adult (Fear Emotion)
Fig. 11
Accuracy table
