PLoS One. 2009;4(4):e5256.
doi: 10.1371/journal.pone.0005256. Epub 2009 Apr 22.

Auditory-visual object recognition time suggests specific processing for animal sounds

Clara Suied et al. PLoS One. 2009.

Abstract

Background: Recognizing an object requires binding together several cues, which may be distributed across different sensory modalities, and ignoring competing information originating from other objects. In addition, knowledge of the semantic category of an object is fundamental to determine how we should react to it. Here we investigate the role of semantic categories in the processing of auditory-visual objects.

Methodology/findings: We used an auditory-visual object-recognition task (go/no-go paradigm). We compared recognition times for two categories: a biologically relevant one (animals) and a non-biologically relevant one (means of transport). Participants were asked to react as fast as possible to target objects, presented in the visual and/or the auditory modality, and to withhold their response for distractor objects. A first main finding was that, when participants were presented with unimodal or bimodal congruent stimuli (an image and a sound from the same object), similar reaction times were observed for all object categories. Thus, there was no advantage in the speed of recognition for biologically relevant compared to non-biologically relevant objects. A second finding was that, in the presence of a biologically relevant auditory distractor, the processing of a target object was slowed down, whether or not it was itself biologically relevant. It seems impossible to effectively ignore an animal sound, even when it is irrelevant to the task.

Conclusions/significance: These results suggest a specific and mandatory processing of animal sounds, possibly due to phylogenetic memory and consistent with the idea that hearing is particularly efficient as an alerting sense. They also highlight the importance of taking into account the auditory modality when investigating the way object concepts of biologically relevant categories are stored and retrieved.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Setup and visual stimuli used in the experiments.
Panel A: the setup used in the experiment is composed of a large screen and loudspeakers. Stimuli are projected on the visual background representing a door. The asterisk indicates the location for the visual stimulus; the arrow indicates the loudspeaker used for the auditory stimuli. Panel B: screenshots of the visual stimuli (the sheep, the cow, the frog, the plane and the train).
Figure 2. Same processing time for animals and means of transport targets.
RTs for the unimodal (A+ and V+) and bimodal congruent (A+V+) conditions are presented: means of transport targets (Exp. I, light grey) are compared with animal targets (Exp. II, dark grey). RTs were first transformed to a log scale and then averaged across all participants. The log-scale means were converted back to ms for display purposes. The error bars represent one standard error of the mean. RTs to the A+, V+, and A+V+ conditions were similar for animals and means of transport: there was no difference in recognition time between these two categories.
Figure 3. Interference effect of the animal sounds distractors.
RTs for the unimodal (A+ and V+), bimodal congruent (A+V+) and bimodal incongruent (A−V+ and A+V−) conditions are presented. RTs were first transformed to a log scale and then averaged across all participants. The log-scale means were converted back to ms for display purposes. The error bars represent one standard error of the mean. Firstly, RTs to the A−V+ condition (incongruent with auditory distractors) were significantly longer than to the V+ condition only when the auditory distractors were animal sounds (panels A and C). When the auditory distractor was not an animal but a means of transport, there was no additional processing cost in the presence of the auditory distractor (panel B). Secondly, RTs to the A+V− condition (incongruent with visual distractors) were always similar to RTs to the A+ condition: there was no interference effect for visual distractors, whatever the category of the target or the category of the distractor (see panels A, B, and C). Finally, RTs to the bimodal A+V+ condition were clearly shorter than to both unimodal A+ and V+ conditions in all three experiments (panels A, B, and C).
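
The captions for Figures 2 and 3 describe averaging RTs on a log scale and converting the result back to ms, which amounts to taking a geometric mean. The following is a minimal sketch of that arithmetic, not the authors' analysis code. The function names, the input layout (one RT per participant for a given condition), the back-transformation of the error bars, and the sample values are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def log_mean_rt(rts_ms):
    """Average RTs on a log scale, then convert back to ms (geometric mean).

    rts_ms: one reaction time in ms per participant (assumed layout).
    """
    logs = [math.log(rt) for rt in rts_ms]
    return math.exp(mean(logs))


def log_sem_bounds(rts_ms):
    """Return (lower, upper) bounds in ms for mean +/- one SEM on the log scale.

    Converting the bounds back to ms is an assumption; the captions only
    state that the error bars represent one standard error of the mean.
    """
    logs = [math.log(rt) for rt in rts_ms]
    center = mean(logs)
    sem = stdev(logs) / math.sqrt(len(logs))
    return math.exp(center - sem), math.exp(center + sem)


# Hypothetical RTs (ms) for three participants; not data from the study.
example_rts = [400.0, 500.0, 625.0]
print(log_mean_rt(example_rts))
print(log_sem_bounds(example_rts))
```

For these made-up values the log-scale mean comes out at approximately 500 ms, slightly below the arithmetic mean of about 508 ms. A long RT pulls the log-scale average up less than it pulls the arithmetic average, which is a common reason for using this transformation with right-skewed RT distributions.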
