Seeing and Perceiving. 2011;24(6):579-94.
doi: 10.1163/187847611X603738.

Spatial shifts of audio-visual interactions by perceptual learning are specific to the trained orientation and eye

Melissa A Batson et al. Seeing Perceiving. 2011.

Abstract

A large proportion of the human cortex is devoted to visual processing. Contrary to the traditional belief that multimodal integration takes place in multimodal processing areas separate from visual cortex, several studies have found that sounds may directly alter processing in visual brain areas. Furthermore, recent findings show that perceptual learning can change the perceptual mechanisms that relate auditory and visual senses. However, there is still a debate about the systems involved in cross-modal learning. Here, we investigated the specificity of audio-visual perceptual learning. Audio-visual cueing effects were tested on a Gabor orientation task and an object discrimination task in the presence of lateralised sound cues before and after eight days of cross-modal task-irrelevant perceptual learning. During training, the sound cues were paired with visual stimuli that were misaligned at a proximal (trained) visual field location relative to the sound. Training was performed with one eye patched and with only one Gabor orientation. Consistent with previous findings, we found that cross-modal perceptual training shifted the audio-visual cueing effect towards the trained retinotopic location. However, this shift in audio-visual tuning was observed only for the trained stimulus (Gabors), at the trained orientation, and in the trained eye. This specificity suggests that the multimodal interactions resulting from cross-modal (audio-visual) task-irrelevant perceptual learning involve so-called unisensory visual processing areas in humans. Our findings provide further support for recent anatomical and physiological findings that suggest relatively early interactions in cross-modal processing.

Figures

Figure 1
Experimental arrangement. (a) Stimuli: oriented Gabor patches or objects (faces/houses, adapted from Tong et al., 1998); auditory stimuli were white noise bursts. See text for full stimulus details. All stimuli were presented at 16 d.o.v. from fixation. (b) Experimental apparatus and arrangement: subjects were seated facing the computer monitor. Speaker icons indicate the location of auditory cues. Visual stimuli appeared at proximal aperture locations (proximal trained (Pt) or proximal untrained (Pu)). One test session was conducted before and one after eight training sessions. In test sessions, subjects were tested on both orientation and object discrimination separately for each eye. During training sessions, only one eye was exposed and only one of the test orientations was presented. (c) Stimulus timing. Test: after a variable pre-trial period, an auditory cue was presented (left or right) for 100 ms. The visual stimulus (Gabor or object) then appeared at a proximal location (Pt or Pu; see (b)) for 200 ms on either the valid or invalid side, with a stimulus onset asynchrony (SOA) of 150 or 1000 ms. Training: subjects performed a shape detection task for eight training sessions. Each trial started with the presentation of a sound+Gabor pair (sound for 100 ms, Gabor for 200 ms). Gabors appeared at a proximal aperture location (Pt or Pu) on the left or right side. After 150 ms, a circle or square (one being the target shape) encompassed the Gabor for 100 ms.
Figure 2
Pre-training cross-modal validity effects. (a) Sounds appeared on the same side as the visual stimulus on valid trials and on the opposite side on invalid trials. Response time (RT) validity effects (VEs) were calculated by subtracting response times for valid trials from those for invalid trials. A positive VE means that responses were faster on valid trials than on invalid trials (see short SOA). A negative VE means that responses were slower on valid trials than on invalid trials (see long SOA). The decrease in valid versus invalid measures seen at long SOAs is called inhibition of return (IOR). The data shown here are for informational purposes and do not relate directly to this study; these data represent the natural VE at a visual location aligned with the sound cue, collected for a previous experiment. (b) Cross-modal response time VEs were not significant for either SOA or task (orientation (left) or object (right) discrimination) at any location prior to training. Note that no eye, orientation or location had been trained prior to this test. Therefore, these graphs represent data pooled across eyes, orientations and locations. Error bars represent the 95% confidence interval; n = 11.
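
For concreteness, the VE computation described above, and the change in VE plotted in Figure 3, can be sketched in a few lines of Python. This is a minimal illustration with invented numbers; the function name, arrays and values are ours and do not come from the study.

    import numpy as np

    def validity_effect(rt_valid, rt_invalid):
        # VE = mean RT on invalid trials minus mean RT on valid trials.
        # Positive VE: valid trials faster (facilitation, seen at the short SOA).
        # Negative VE: valid trials slower (inhibition of return, long SOA).
        return np.mean(rt_invalid) - np.mean(rt_valid)

    # Hypothetical RTs in milliseconds for one location/condition
    ve_pre  = validity_effect([430, 445, 425], [440, 450, 435])
    ve_post = validity_effect([410, 420, 415], [460, 470, 455])

    # Figure 3 plots the change in VE from pre- to post-training test sessions:
    delta_ve = ve_post - ve_pre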
Figure 3
Changes in cross-modal validity effects (post- minus pre-training) for orientation discrimination. Changes in VE are displayed as the difference in response time VEs from pre- to post-training test sessions. Spatially specific realignment of the cross-modal facilitation effect was seen for the short SOA as a significant increase in response time VE at the trained location (Pt) only for the orientation and eye exposed during training sessions (p < 0.005). An opposite effect was observed in the untrained eye for the same (trained) orientation and location (opposition of effects seen in the trained versus the untrained eye: p = 0.01). Significant increases in cross-modal inhibition were seen for the long SOA at the trained location, specific to the trained orientation and eye (p < 0.05). An opposite effect was observed in the untrained eye for the same (trained) orientation and location (trained versus untrained eye: p = 0.01). Note that the trained eye was exposed, while the untrained eye was patched, during training sessions. Error bars represent the 95% confidence interval; α = 0.05; n = 11.
