MB-MSTFNet: A Multi-Band Spatio-Temporal Attention Network for EEG Sensor-Based Emotion Recognition
- PMID: 40807983
- PMCID: PMC12349021
- DOI: 10.3390/s25154819
Abstract
Emotion analysis based on electroencephalogram (EEG) sensors is pivotal for human-machine interaction, yet it faces key challenges in spatio-temporal feature fusion and in integrating information across frequency bands and brain regions from multi-channel sensor-derived signals. This paper proposes MB-MSTFNet, a novel framework for EEG emotion recognition. The model constructs a 3D tensor that encodes band-space-time correlations of the sensor data, explicitly modeling frequency-domain dynamics and the spatial distribution of EEG sensors across brain regions. A multi-scale CNN-Inception module extracts hierarchical spatial features via diverse convolutional kernels and pooling operations, capturing both localized sensor activations and global brain-network interactions. Bidirectional GRUs (BiGRUs) model temporal dependencies in the sensor time series and are adept at capturing long-range dynamic patterns. Multi-head self-attention highlights critical time windows and brain regions by assigning adaptive weights to relevant sensor channels, suppressing noise from non-contributory electrodes. Experiments on the DEAP dataset, which contains multi-channel EEG sensor recordings, show that MB-MSTFNet achieves 96.80 ± 0.92% valence accuracy and 98.02 ± 0.76% arousal accuracy for binary classification, and 92.85 ± 1.45% accuracy for four-class classification. Ablation studies confirm that feature fusion, bidirectional temporal modeling, and multi-scale mechanisms significantly enhance performance by improving feature complementarity. This sensor-driven framework advances affective computing by integrating the spatio-temporal dynamics and multi-band interactions of EEG sensor signals, enabling efficient real-time emotion recognition.
Keywords: Inception module; bidirectional gated recurrent unit (BiGRU); convolutional neural network (CNN); electroencephalograph (EEG); emotion signal recognition; multi-head attention (MHA).
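The abstract describes a pipeline of multi-scale Inception-style spatial convolutions over a band-space EEG representation, a BiGRU over time, and multi-head self-attention. The minimal PyTorch sketch below illustrates how those stages can compose; all layer sizes, the 9x9 electrode grid, the number of bands, and the pooling over attention outputs are illustrative assumptions, not the authors' reported configuration.

```python
import torch
import torch.nn as nn

class MBMSTFNetSketch(nn.Module):
    """Hypothetical sketch of the MB-MSTFNet stages named in the abstract.
    Layer widths and the 9x9 grid are assumptions for illustration only."""
    def __init__(self, bands=4, grid=9, hidden=64, heads=4, n_classes=2):
        super().__init__()
        # Multi-scale Inception-style spatial branches over the (bands, grid, grid) map
        self.b1 = nn.Conv2d(bands, 16, kernel_size=1)
        self.b3 = nn.Conv2d(bands, 16, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(bands, 16, kernel_size=5, padding=2)
        self.pool = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                  nn.Conv2d(bands, 16, kernel_size=1))
        feat = 64 * grid * grid  # 4 branches x 16 channels, spatial size preserved
        # Bidirectional GRU over the time axis captures long-range dynamics
        self.bigru = nn.GRU(feat, hidden, batch_first=True, bidirectional=True)
        # Multi-head self-attention weights the informative time windows
        self.attn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):            # x: (batch, time, bands, grid, grid)
        B, T = x.shape[:2]
        s = x.flatten(0, 1)          # merge batch and time for the spatial convs
        s = torch.cat([self.b1(s), self.b3(s), self.b5(s), self.pool(s)], dim=1)
        s = s.flatten(1).view(B, T, -1)   # back to a per-window feature sequence
        h, _ = self.bigru(s)              # (B, T, 2*hidden)
        a, _ = self.attn(h, h, h)         # adaptive weights over time windows
        return self.head(a.mean(dim=1))   # pool over time, then classify

model = MBMSTFNetSketch()
logits = model(torch.randn(2, 6, 4, 9, 9))
print(logits.shape)  # torch.Size([2, 2])
```

For the binary valence/arousal tasks `n_classes=2` suffices; the four-class setting would simply use `n_classes=4` with the same backbone.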
Conflict of interest statement
The authors declare that they have no conflict of interest.
Similar articles
- Multiscale Spatial-Temporal Feature Fusion Neural Network for Motor Imagery Brain-Computer Interfaces. IEEE J Biomed Health Inform. 2025 Jan;29(1):198-209. doi: 10.1109/JBHI.2024.3472097. Epub 2025 Jan 7. PMID: 39352826
- EEG-ERnet: Emotion Recognition based on Rhythmic EEG Convolutional Neural Network Model. J Integr Neurosci. 2025 Aug 28;24(8):41547. doi: 10.31083/JIN41547. PMID: 40919632
- Multi-channel EEG-based neurological disorder classification using Cross-Dependency Spatiotemporal Interactive Network. Comput Methods Programs Biomed. 2025 Nov;271:108982. doi: 10.1016/j.cmpb.2025.108982. Epub 2025 Jul 30. PMID: 40752459
- Emotion recognition in EEG Signals: Deep and machine learning approaches, challenges, and future directions. Comput Biol Med. 2025 Sep;196(Pt A):110713. doi: 10.1016/j.compbiomed.2025.110713. Epub 2025 Jul 11. PMID: 40644885. Review.
- EEG-based affective brain-computer interfaces: recent advancements and future challenges. J Neural Eng. 2025 Jun 27;22(3). doi: 10.1088/1741-2552/ade290. PMID: 40490007. Review.