Front Hum Neurosci. 2023 Mar 2;17:1058617.
doi: 10.3389/fnhum.2023.1058617. eCollection 2023.

Shape detection beyond the visual field using a visual-to-auditory sensory augmentation device


Shira Shvadron et al. Front Hum Neurosci. 2023.

Abstract

Current advancements in both technology and science allow us to manipulate our sensory modalities in new and unexpected ways. In the present study, we explore the potential of expanding what we perceive through our natural senses by utilizing a visual-to-auditory sensory substitution device (SSD), the EyeMusic, an algorithm that converts images to sound. The EyeMusic was initially developed to allow blind individuals to create a spatial representation of information arriving from a video feed at a slow sampling rate. Here, we aimed to use the EyeMusic to convey information from the areas outside the visual field of sighted individuals. In this initial proof-of-concept study, we tested the ability of sighted subjects to combine visual information with surrounding auditory sonification representing visual content beyond their field of view. Participants were tasked with recognizing the stimuli and placing them correctly, using sound to represent the areas outside the standard human visual field. They were asked to report each shape's identity as well as its spatial orientation (front/right/back/left), so that successful performance required combining visual input (the 90° frontal field) with auditory input (the remaining 270°); content in both vision and audition was presented in a sweeping clockwise motion around the participant. We found that participants performed well above chance after a brief 1-h online training session and one on-site training session averaging 20 min. In some cases, they could even draw a 2D representation of the perceived image. Participants could also generalize, recognizing new shapes they were not explicitly trained on. Our findings provide an initial proof of concept that sensory augmentation devices and techniques can be used in combination with natural sensory information to expand the natural fields of sensory perception.
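
To make the image-to-sound conversion principle concrete, below is a minimal sketch of a column-sweep sonification in the spirit of the EyeMusic: the image is scanned left to right, and the vertical position of each lit pixel maps to pitch. This is an illustrative stand-in, not the device's actual algorithm (the real EyeMusic, for instance, also encodes color as musical-instrument timbre); the function name `sonify`, the sample rate, and the frequency range are all assumptions.

```python
# Minimal sketch of an EyeMusic-style image-to-sound sweep.
# Assumptions (not the authors' exact implementation): the image is
# scanned column by column from left to right, each lit pixel's row
# maps to the pitch of a sine tone, and column position maps to time.
import numpy as np

SAMPLE_RATE = 44100      # audio sample rate in Hz (assumed)
COLUMN_DURATION = 0.05   # seconds of audio per image column (assumed)

def sonify(image: np.ndarray, f_low: float = 220.0, f_high: float = 1760.0) -> np.ndarray:
    """Convert a binary image (rows x cols, top row = highest pitch)
    into a mono audio sweep; returns float samples in [-1, 1]."""
    n_rows, n_cols = image.shape
    # Pitch for each row: top row -> f_high, bottom row -> f_low.
    freqs = np.geomspace(f_high, f_low, n_rows)
    t = np.arange(int(SAMPLE_RATE * COLUMN_DURATION)) / SAMPLE_RATE
    chunks = []
    for col in range(n_cols):
        active = np.nonzero(image[:, col])[0]  # lit pixels in this column
        if active.size == 0:
            chunks.append(np.zeros_like(t))    # silent column
            continue
        tones = np.sin(2 * np.pi * freqs[active, None] * t)
        chunks.append(tones.sum(axis=0) / active.size)  # mix and normalize
    return np.concatenate(chunks)
```

A slow sweep of this kind matches the slow sampling rate described above; the listener accumulates the shape over the duration of the scan rather than hearing it at once.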

Keywords: auditory spatial perception; multisensory perception; multisensory spatial perception; sensory substitution; sensory substitution device (SSD); spatial perception; visual-auditory; visual-spatial perception.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
(A) Simulation of the Vision360 experiment in the multisensory room at the lab. The participants sat in the middle of the room, 2 m away from each wall, with ears at a height of 1.4 m. An ambisonic system operated 21 speakers on every wall, with the corner speaker column shared between adjacent walls. F | R | B | L denote the egocentric division of space into front, right, back, and left (relative to the participant’s body and position). Participants were required to focus on the front side of the room, where they perceived a visual stimulus, followed by an auditory stimulus that appeared only on their right, back, and left sides, in that order. Both trained EyeMusic shapes and untrained shapes were presented as stimuli, visually and auditorily. (B) Study outline. Participants went through several phases to complete the experiment: 60 min of basic training on the EyeMusic SSD at home. Participants who passed the online training test at home were then invited to the experiment performed at the lab, in the MultiSensory Ambisonics room. Upon arrival at the lab, they took a 5-min pre-test on the EyeMusic SSD material before moving on to phase 1 of the experiment, a 6-min test of stimuli presented sequentially. They then moved on to the advanced training phase, in which stimuli were presented through the ambisonics system for 20 min. Phase 2 (25 min) was a test consisting of trained and untrained stimuli presented spatially in the room. In phase 3, participants were asked to draw the stimuli they had perceived, and finally they completed a phenomenological questionnaire.
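
The F | R | B | L division in panel (A) can be expressed as a simple azimuth-to-quadrant mapping. The sketch below is a hypothetical illustration; the 90°-wide, front-centered sector boundaries and the clockwise, 0°-straight-ahead convention are assumptions, not the paper's exact definition.

```python
# Hypothetical azimuth-to-quadrant mapping for the egocentric
# F | R | B | L division of space (Figure 1A). The 90-degree-wide,
# front-centered sector boundaries are an illustrative assumption.
def quadrant(azimuth_deg: float) -> str:
    """Map a clockwise azimuth (0 deg = straight ahead) to a quadrant label."""
    a = azimuth_deg % 360.0
    if a >= 315.0 or a < 45.0:
        return "front"
    if a < 135.0:
        return "right"
    if a < 225.0:
        return "back"
    return "left"

assert quadrant(0) == "front" and quadrant(90) == "right"
assert quadrant(180) == "back" and quadrant(270) == "left"
```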
FIGURE 2
Various types of stimuli used during the experiment. (A) Images used during the online training. Participants see the visual image and simultaneously hear the monophonic audio rendition that represents it. (B) Stimulus in sequence: participants perceive a visual shape in their frontal 90° and afterward hear the rest of the stimulus as audio over headphones. (C) Stimuli with three shapes in sequence, presented 360° around the participant. The front is visually projected, while the sides and back are rendered as spatial audio. (D) Unified audio-visual stimuli, presented partly visually and partly auditorily.
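
As a companion to panel (D), the sketch below splits a 360°-wide stimulus image into the visually projected frontal 90° and the remaining 270° to be sonified. The linear column-to-degree mapping and the convention that the panorama starts at the left edge of the frontal field are assumptions for illustration.

```python
# Illustrative split of a 360-degree stimulus panorama into the
# visually projected frontal 90 degrees and the sonified remaining
# 270 degrees (cf. Figure 2D). Assumes image columns map linearly to
# azimuth, starting at the left edge of the frontal field.
import numpy as np

def split_panorama(panorama: np.ndarray, visual_span_deg: float = 90.0):
    """Split a (rows x cols) panorama covering 0-360 degrees into a
    visual part (first visual_span_deg degrees) and an auditory part."""
    n_cols = panorama.shape[1]
    cut = int(round(n_cols * visual_span_deg / 360.0))
    return panorama[:, :cut], panorama[:, cut:]
```

The auditory remainder could then be fed to a sonifier such as the `sonify` sketch above, so that a single shape can begin in vision and continue in audition.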
FIGURE 3
Two examples of the drawing-task stimuli with drawings from 8 participants. (A) Example of the expanded single-shape stimulus type. The group score for the number of shapes was 85 ± 24.6%, the score for the accuracy of the shapes themselves was 83 ± 24.4%, the group score for the proper positioning of the shapes was 86.7 ± 29.7%, and the group score for the unified visual-auditory shape was 70 ± 45.5%. (B) Example of a stimulus from category 3 (combining expanded and trained shapes in tandem). The group score for the number of shapes was 77.7 ± 10.3%, the score for the accuracy of the shapes themselves was 43.7 ± 17%, the group score for the proper positioning of the shapes was 66 ± 14.6%, and the group score for the unified visual-auditory shape was 65.3 ± 36.6%.
FIGURE 4
Experiment results. (A) Participants in the EyeMusic online training achieved a success rate of 89.3 ± 5.5% in the final test, significantly above chance [mean correct response ± SD; bars indicate the standard error; dashed line indicates chance level]. (B) Results of the experimental-phase tasks, divided into three categories: recognition of stimuli in a sequence (54.6 ± 16.8%), spatial perception of trained shapes (78.8 ± 12.2%), and recognition of untrained shapes perceived spatially (generalization; 82.4 ± 14.3%). There was no significant difference between the trained and generalization conditions (W(14) = 22, p = 0.349). However, there was a significant difference between the trained stimuli presented sequentially and the trained stimuli presented spatially (W(14) = 117, p_corr < 0.01), as well as between the stimuli in a sequence and the generalization stimuli presented spatially (W(14) = 120, p_corr < 0.001). ***Significantly above chance. *Significant difference between two conditions. NS, not significant.
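
For reference, comparisons of this kind can be reproduced with SciPy's Wilcoxon signed-rank test. The sketch below uses random placeholder scores, not the study's data; the chance level and the Bonferroni-style correction behind p_corr are assumptions.

```python
# Sketch of the Wilcoxon tests reported in Figure 4, using SciPy.
# All numbers here are placeholders, NOT the study's data.
import numpy as np
from scipy.stats import wilcoxon

CHANCE = 0.25  # assumed chance level for illustration
rng = np.random.default_rng(0)
scores_spatial = rng.uniform(0.55, 0.95, size=15)   # trained shapes, spatial
scores_sequence = rng.uniform(0.30, 0.80, size=15)  # stimuli in a sequence

# One-tailed one-sample Wilcoxon signed-rank test against chance:
w_chance, p_chance = wilcoxon(scores_spatial - CHANCE, alternative="greater")

# Paired comparison between the two conditions:
w_pair, p_pair = wilcoxon(scores_sequence, scores_spatial)
p_corr = min(p_pair * 2, 1.0)  # assumed Bonferroni-style correction
```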
FIGURE 5
(A) Group result for shapes that began in the visual field and continued auditorily. We performed a one-tailed one-sample Wilcoxon test against chance. The correct response rate was 76.6 ± 15.1%, significantly higher than the chance level (W(14) = 120, p < 0.001) (bar indicates the standard error; dashed line indicates chance level). (B) An example of a full visual-auditory stimulus as processed in the Vision360 application. The “vision” section of the stimulus is perceived by the participants in the front, and the “audition” section is perceived auditorily (sweeping from left to right in relation to the participants’ location). Underneath the stimulus, an example of a drawing by participant 9, taken from the drawing phase of the experiment, is shown. Front | right | back | left indicate the standardized egocentric division of space for the participants. The x-axis (0°–360°) represents the horizontal coverage of the stimulus in space, and the y-axis (30°–28°) represents its vertical coverage. ***Significantly above chance.

