Cogn Neurodyn. 2024 Dec;18(6):3615-3628.
doi: 10.1007/s11571-023-09932-4. Epub 2023 Feb 2.

Brain-inspired multisensory integration neural network for cross-modal recognition through spatiotemporal dynamics and deep learning

Haitao Yu et al. Cogn Neurodyn. 2024 Dec.

Abstract

The integration and interaction of cross-modal senses in brain neural networks can facilitate high-level cognitive functions. In this work, we propose a bioinspired multisensory integration neural network (MINN) that integrates visual and audio senses to recognize multimodal information across different sensory modalities. This deep learning-based model comprises a cascaded framework of parallel convolutional neural networks (CNNs), which extract intrinsic features from the visual and audio inputs, and a recurrent neural network (RNN), which handles multimodal information integration and interaction. The network was trained on synthetic data generated for digit recognition tasks. Analysis revealed that the spatial and temporal features extracted by the CNNs from the visual and audio inputs were encoded in mutually orthogonal subspaces. During the integration epoch, the network state evolved along quasi-rotation-symmetric trajectories, and a structural manifold with stable attractors formed in the RNN, supporting accurate cross-modal recognition. We further evaluated the robustness of MINN with noisy inputs and asynchronous digit inputs. Experimental results demonstrated the superior performance of MINN for flexible integration and accurate recognition of multisensory information with distinct sensory properties. The present results provide insights into the computational principles governing multisensory integration and offer a comprehensive neural network model for brain-inspired intelligence.
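The cascaded layout described in the abstract (two parallel convolutional feature extractors feeding one recurrent integrator) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `ToyMINN`, the use of 1-D frames for both modalities, the layer sizes, the ten output classes, and the random untrained weights are all assumptions chosen only to show how data would flow through such a network.

```python
import numpy as np


def conv_features(frame, kernels):
    """Convolve a 1-D frame with each kernel, apply ReLU, then global max pooling."""
    feats = []
    for k in kernels:
        response = np.convolve(frame, k, mode="valid")
        feats.append(np.maximum(response, 0.0).max())
    return np.array(feats)


class ToyMINN:
    """Hypothetical toy model: parallel conv branches plus a vanilla tanh RNN."""

    def __init__(self, rng, n_kernels=4, kernel_size=3, hidden=8, n_classes=10):
        # Separate filter banks per modality (assumed sizes, random weights).
        self.vis_kernels = rng.normal(size=(n_kernels, kernel_size))
        self.aud_kernels = rng.normal(size=(n_kernels, kernel_size))
        # Recurrent integrator that receives the concatenated branch features.
        self.W_in = rng.normal(scale=0.5, size=(hidden, 2 * n_kernels))
        self.W_rec = rng.normal(scale=0.5 / np.sqrt(hidden), size=(hidden, hidden))
        self.W_out = rng.normal(scale=0.5, size=(n_classes, hidden))

    def forward(self, visual_seq, audio_seq):
        h = np.zeros(self.W_rec.shape[0])
        for v, a in zip(visual_seq, audio_seq):
            # Each modality is processed by its own convolutional branch.
            u = np.concatenate([
                conv_features(v, self.vis_kernels),
                conv_features(a, self.aud_kernels),
            ])
            # The recurrent state integrates both modalities over time.
            h = np.tanh(self.W_in @ u + self.W_rec @ h)
        logits = self.W_out @ h
        return logits, h


rng = np.random.default_rng(0)
model = ToyMINN(rng)
visual_seq = rng.normal(size=(5, 16))  # 5 time steps of a 16-sample "visual" frame
audio_seq = rng.normal(size=(5, 16))   # 5 time steps of a 16-sample "audio" frame
logits, h = model.forward(visual_seq, audio_seq)
predicted = int(np.argmax(logits))
```

Because the weights are untrained, `predicted` carries no meaning here; the sketch only shows that each modality keeps its own feature extractor and that fusion happens in the recurrent state rather than at the input.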

Keywords: Cross-modal recognition; Deep learning; Multisensory integration; Neural networks.
