Brain-inspired multisensory integration neural network for cross-modal recognition through spatiotemporal dynamics and deep learning
- PMID: 39712112
- PMCID: PMC11655826
- DOI: 10.1007/s11571-023-09932-4
Abstract
The integration and interaction of cross-modal senses in brain neural networks can facilitate high-level cognitive functionalities. In this work, we proposed a bioinspired multisensory integration neural network (MINN) that integrates visual and audio senses for recognizing multimodal information across different sensory modalities. This deep learning-based model incorporates a cascading framework of parallel convolutional neural networks (CNNs) for extracting intrinsic features from visual and audio inputs, and a recurrent neural network (RNN) for multimodal information integration and interaction. The network was trained using synthetic training data generated for digit recognition tasks. It was revealed that the spatial and temporal features extracted from the visual and audio inputs by the CNNs were encoded in subspaces orthogonal to each other. During the integration epoch, the network state evolved along quasi-rotation-symmetric trajectories, and a structural manifold with stable attractors was formed in the RNN, supporting accurate cross-modal recognition. We further evaluated the robustness of the MINN algorithm with noisy inputs and asynchronous digit inputs. Experimental results demonstrated the superior performance of MINN in flexibly integrating and accurately recognizing multisensory information with distinct sensory properties. The present results provide insights into the computational principles governing multisensory integration and a comprehensive neural network model for brain-inspired intelligence.
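The architecture described above, parallel CNN branches feeding a shared RNN, can be sketched as a toy forward pass. This is a minimal illustration, not the authors' implementation: the layer sizes, single-kernel "CNN" branches, and the 10-class digit readout are all assumptions chosen for brevity, and feeding the two modality features to the RNN one step at a time stands in for the integration epoch.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(x, k):
    """'Valid' 2D cross-correlation, stride 1 (stand-in for a CNN conv layer)."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def cnn_branch(x, kernel, w_proj):
    """One parallel CNN branch: conv -> ReLU -> flatten -> feature projection."""
    h = np.maximum(conv2d_valid(x, kernel), 0.0)
    return np.tanh(w_proj @ h.ravel())

def rnn_integrate(feature_seq, w_rec, w_in, h0):
    """Vanilla RNN that integrates per-modality feature vectors over time."""
    h = h0
    for f in feature_seq:
        h = np.tanh(w_rec @ h + w_in @ f)
    return h

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical sizes: 8x8 inputs, 16-d modality features, 32-d RNN state.
visual = rng.standard_normal((8, 8))  # stand-in for a digit image
audio = rng.standard_normal((8, 8))   # stand-in for a spoken-digit spectrogram

k_vis, k_aud = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
w_vis, w_aud = 0.1 * rng.standard_normal((2, 16, 36))  # one projection per branch
w_rec = 0.1 * rng.standard_normal((32, 32))
w_in = 0.1 * rng.standard_normal((32, 16))
w_out = 0.1 * rng.standard_normal((10, 32))

f_vis = cnn_branch(visual, k_vis, w_vis)
f_aud = cnn_branch(audio, k_aud, w_aud)

# Present the two modality features to the RNN sequentially; changing the
# order or inserting gaps would mimic asynchronous digit inputs.
state = rnn_integrate([f_vis, f_aud], w_rec, w_in, np.zeros(32))
probs = softmax(w_out @ state)  # class probabilities over digits 0-9
```

Inspecting `state` across many stimulus pairs is where analyses like the orthogonal-subspace and attractor-manifold observations would apply; here the weights are random, so `probs` is only a shape demonstration.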
Keywords: Cross-modal recognition; Deep learning; Multisensory integration; Neural networks.
© The Author(s), under exclusive licence to Springer Nature B.V. 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.