Review
Eur J Neurosci. 2010 May;31(10):1713-20.
doi: 10.1111/j.1460-9568.2010.07206.x.

Semantic confusion regarding the development of multisensory integration: a practical solution


Barry E Stein et al. Eur J Neurosci. 2010 May.

Abstract

There is now a good deal of data from neurophysiological studies in animals and behavioral studies in human infants regarding the development of multisensory processing capabilities. Although the conclusions drawn from these different datasets sometimes appear to conflict, many of the differences are due to the use of different terms to mean the same thing and, more problematic, the use of similar terms to mean different things. Semantic issues are pervasive in the field and complicate communication among groups using different methods to study similar issues. Achieving clarity of communication among different investigative groups is essential for each to make full use of the findings of others, and an important step in this direction is to identify areas of semantic confusion. In this way investigators can be encouraged to use terms whose meaning and underlying assumptions are unambiguous because they are commonly accepted. Although this issue is of obvious importance to the large and very rapidly growing number of researchers working on multisensory processes, it is perhaps even more important to the non-cognoscenti. Those who wish to benefit from the scholarship in this field but are unfamiliar with the issues identified here are most likely to be confused by semantic inconsistencies. The current discussion attempts to document some of the more problematic of these, begin a discussion about the nature of the confusion and suggest some possible solutions.

Figures

Fig. 1
Multisensory integration at the level of the single superior colliculus (SC) neuron and its manifestation in SC-mediated orientation behavior. (A) Activity from this visual–auditory SC neuron was recorded during repeated and interleaved trials consisting of visual (LED), auditory (broadband noise burst) and visual–auditory stimuli. The responses from these trials were then resorted and presented in the three raster displays at the left. Each dot in the raster display represents one impulse, and each row represents the responses to one stimulus presentation. Trials in each display are ordered from bottom to top. The middle line graph illustrates the mean cumulative impulse count for each response and shows that the cross-modal stimulus also shortened the response latency by 7 ms; the computation was clearly superadditive in its initial phase, although this amplification was less obvious later in the response. (B) In this example neuron, overall unisensory response efficacy was sufficiently low to yield a superadditive computation in the response magnitude averaged over the entire response duration. Both panels are based on data published in Stanford et al. (2005). (C) Cats were tested on an orientation/approach task in a 90-cm-diameter perimetry apparatus containing LED–speaker complexes separated by 15°; each complex consisted of three LEDs and two speakers. Trials consisted of randomly interleaved modality-specific stimuli (a single visual or auditory stimulus) and cross-modal stimuli (a visual–auditory stimulus pair) at each location between ±45°, as well as ‘catch’ trials in which no stimulus was presented (the correct response was to remain still). At every spatial location, multisensory integration produced substantial performance enhancements (94–168%; mean, 137%) that exceeded performance in response to the best modality-specific component stimulus. Error bars indicate the SEM of response accuracy computed over multiple experimental days. Asterisks indicate comparisons that were significantly different (χ2 test, P < 0.05). In addition, errors (No-Go and Wrong Localization) were significantly decreased as a result of multisensory integration (not shown). Modified from Gingras et al. (2009).
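For readers less familiar with these conventions, the sketch below illustrates how enhancement percentages of this kind are conventionally computed: enhancement is expressed relative to the most effective modality-specific component response, and a response is termed superadditive when it exceeds the sum of the two unisensory responses. The function names and numerical values are illustrative only, not data from the studies cited above.

```python
# Minimal sketch of the conventional multisensory enhancement index:
# enhancement is measured relative to the best modality-specific
# (unisensory) response. All names and values here are hypothetical.

def multisensory_enhancement(cm: float, vis: float, aud: float) -> float:
    """Percent enhancement of the cross-modal (cm) response over the
    most effective modality-specific component response."""
    best_unisensory = max(vis, aud)
    return 100.0 * (cm - best_unisensory) / best_unisensory

def is_superadditive(cm: float, vis: float, aud: float) -> bool:
    """A response is superadditive when the cross-modal response
    exceeds the sum of the two unisensory responses."""
    return cm > (vis + aud)

# Illustrative mean impulse counts per trial (hypothetical values):
vis, aud, cm = 3.0, 2.0, 7.5
print(f"enhancement = {multisensory_enhancement(cm, vis, aud):.0f}%")  # 150%
print(f"superadditive: {is_superadditive(cm, vis, aud)}")              # True
```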
Fig. 2
The development of multisensory neurons, and of multisensory neurons capable of multisensory integration, takes place over a protracted postnatal period. (A) The increase in cat superior colliculus (SC) multisensory neurons as a function of postnatal age is plotted on the left. The inset shows this as an increasing proportion of the sensory-responsive neurons in the multisensory layers of the structure. (B) The plot on the right shows the late appearance and gradual increase in the proportion of neurons capable of integrating their different sensory inputs. (C and D) Two exemplar multisensory neurons are shown. In both cases the stimuli were presented within the neuron’s excitatory receptive fields. The neuron on the left (C) was recorded in a 20-day-old animal and was incapable of multisensory integration. It typified the immature state, in which the response (number of impulses) to the cross-modal stimulus is no greater than the response to the most effective modality-specific component stimulus. In contrast, by 30 days of age some neurons were capable of multisensory integration. The neuron on the right (D) shows that the multisensory response consisted of significantly more impulses than the response to the visual stimulus. Modified from Wallace & Stein (1997).
Fig. 3
The paired-preference cross-modal matching procedure. In the original version of the procedure, infants are seated in front of two side-by-side visual stimuli. These can be faces or objects, either static or moving. A sound that corresponds to one of the visual stimuli is presented concurrently. In the example depicted here, the woman is seen producing two different utterances, and the auditory track, presented concurrently through centrally placed speakers, corresponds to one of the visible utterances. The infant’s looking is monitored via a camera placed in the center. Typically, multiple such trials, each lasting between 30 s and 1 min, are presented, and the dependent measure is the amount of time the infant spends looking at each visual stimulus. Matching is inferred when looking at the corresponding visual stimulus is greater than looking at the non-matching one.
Fig. 4
The familiarization/matching procedure used to study the narrowing of audio-visual speech perception in infancy. Here, during the first two baseline trials, infants saw side-by-side faces of the same person repeatedly uttering a silent /ba/ syllable on one side and a silent /va/ syllable on the other side for a total of 42 s (with the side of the syllables switched after 21 s). During the remaining trials, two auditory familiarization trials were interspersed with two silent test trials to determine whether hearing one of the syllables would shift visual preferences toward the matching visual syllable on the subsequent test trial. Half the infants heard the /ba/ syllable and the other half heard the /va/ syllable during the two familiarization trials. During each of the two silent test trials (each following a familiarization trial), infants again viewed the two visual syllables presented side-by-side, counterbalanced for side across the two test trials. Percentage looking times directed to the matching visual syllable were computed separately for the baseline and the test trials; shown in the figure are the differences between these two scores, with a positive value meaning that infants increased their looking at the matching visual syllable following familiarization with the auditory syllable (open circles represent each infant’s difference score and black circles with error bars represent the mean difference score and the SEM for each group). Here it can be seen that Spanish-learning, monolingual 6-month-old infants exhibited significantly greater looking at the matching visible syllable, despite the fact that this phonemic distinction does not exist in Spanish, but that 11-month-old Spanish-learning infants no longer did. This finding indicates that the ability to make cross-modal matches of non-native phonemes is initially present in infancy and subsequently declines when experience with the relevant phonetic distinction is absent. In contrast, English-learning infants, who have experience with this phonetic distinction, made cross-modal matches at both ages. From Pons et al. (2009).
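The difference-score measure described above reduces to simple arithmetic; the brief sketch below makes it concrete. All looking times are hypothetical values chosen for illustration, not data from Pons et al. (2009).

```python
# Sketch of the difference-score measure: percentage of looking time
# directed to the matching visible syllable, computed separately for
# baseline and test trials. Values below are hypothetical.

def percent_looking_to_match(match_s: float, nonmatch_s: float) -> float:
    """Percentage of total looking time spent on the matching syllable."""
    return 100.0 * match_s / (match_s + nonmatch_s)

baseline = percent_looking_to_match(10.5, 10.5)  # e.g., 50% at baseline
test = percent_looking_to_match(14.0, 7.0)       # e.g., ~66.7% after familiarization

# A positive score means looking at the matching syllable increased
# after familiarization with the auditory syllable.
difference_score = test - baseline
print(f"difference score = {difference_score:+.1f} percentage points")
```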
Fig. 5
Looking behavior of infant vervets during dynamic audio-visual presentations of macaque calls. This study assessed the possibility of multisensory perceptual narrowing in a nonhuman infant species. On each trial, the vervets first saw side-by-side videos of the same macaque monkey repeatedly producing a ‘coo’ call on one side of a screen and a ‘grunt’ call on the other side. The duration of their looks to each video was recorded. During the initial part of each trial, the vervets saw the two calls in silence for 4 s; during the subsequent 16 s of the trial, they saw them while also hearing one of the two audible calls. The bar graph on the left of the figure shows the data for two age groups. It depicts the percentage of time that the vervets looked at the matching call in the presence of the audible calls, out of the total amount of time they looked at the matching call in the presence and absence of the audible calls. Here, the vervets looked significantly (*P < 0.05) less at the matching call. This preference for the ‘wrong’ call was nonetheless evidence of cross-modal matching, because subsequent experiments suggested that it was due to the fear-inducing nature of the naturalistic macaque calls. The right side of the figure shows a vervet monkey looking at the videos. From Zangenehpour et al. (2009).
Fig. 6
The use of terms pertaining to multisensory processes over the last decade. The charts depict the number of articles containing each term as an indexed word (appearing in the title, abstract or keywords) in PubMed journals, by year of publication.
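Counts of this kind can be approximated with NCBI’s public E-utilities. The sketch below queries the documented esearch endpoint for a term restricted to titles and abstracts (the [tiab] field tag) within a given publication year; because the figure’s counts also include keywords, the numbers will not match exactly, and the terms and year range shown are illustrative only.

```python
# Sketch: counting PubMed articles that contain a term, by publication
# year, via NCBI's public E-utilities (esearch). [tiab] restricts
# matching to titles and abstracts; terms and years are illustrative.
import time
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_count(term: str, year: int) -> int:
    """Number of PubMed records matching `term` in title/abstract
    with a publication date within `year`."""
    params = {
        "db": "pubmed",
        "term": f"{term}[tiab]",
        "datetype": "pdat",
        "mindate": str(year),
        "maxdate": str(year),
        "retmode": "json",
    }
    resp = requests.get(ESEARCH, params=params, timeout=30)
    resp.raise_for_status()
    return int(resp.json()["esearchresult"]["count"])

for term in ("multisensory integration", "cross-modal"):
    counts = {}
    for year in range(2000, 2010):
        counts[year] = pubmed_count(term, year)
        time.sleep(0.4)  # stay under NCBI's unauthenticated rate limit
    print(term, counts)
```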

