Hum Brain Mapp. 2015 Jan;36(1):324-39.
doi: 10.1002/hbm.22631. Epub 2014 Sep 13.

How the human brain exchanges information across sensory modalities to recognize other people


Helen Blank et al. Hum Brain Mapp. 2015 Jan.

Abstract

Recognizing the identity of other individuals across different sensory modalities is critical for successful social interaction. In the human brain, face- and voice-sensitive areas are separate but structurally connected. What kind of information is exchanged between these specialized areas during cross-modal recognition of other individuals is currently unclear. For faces, specific areas are sensitive to identity and to physical properties. It is an open question whether voices activate representations of face identity or physical facial properties in these areas. To address this question, we used functional magnetic resonance imaging in humans and a voice-face priming design. In this design, familiar voices were followed by morphed faces that matched or mismatched with respect to identity or physical properties. The results showed that responses in face-sensitive regions were modulated when face identity or physical properties did not match the preceding voice. The strength of this mismatch signal depended on the level of certainty the participant had about the voice identity. This suggests that both identity and physical property information was provided by the voice to face areas. The activity and connectivity profiles differed between face-sensitive areas: (i) the occipital face area seemed to receive information about both physical properties and identity, (ii) the fusiform face area seemed to receive identity information, and (iii) the anterior temporal lobe seemed to receive predominantly identity information from the voice. We interpret these results within a predictive coding scheme in which both identity and physical property information is used across sensory modalities to recognize individuals.

Keywords: cross-modal priming; face recognition; multisensory; person identity; voice recognition.


Figures

Figure 1
Experimental design and face stimuli. A. Participants were first trained in identifying the voices and faces of three male speakers (Training) and participated in a psychophysical pilot experiment (Pilot) (see Methods). Then, participants performed a cross‐modal priming experiment and a face‐area localizer during fMRI. In the cross‐modal priming experiment, voices of three speakers (indicated by amplitude waveforms) were followed by images of their faces after a 75 ms delay. The voices and faces could match (light gray font) or mismatch (dark gray font). Because the faces were morphed continua between two face identities (see panel B), voice and face could match or mismatch with regard to identity or physical properties. Participants performed a face‐identity recognition task on every trial (face task); on some trials they additionally performed a voice‐identity recognition task ("voice task"; excluded from fMRI analysis; see Methods). This was indicated by a colored fixation cross that followed the voice presentation. The faces were morphed between the three speakers (see Methods, panel B). B. Morphed face stimuli of the three speakers: Each combination of two possible face‐identity pairs was morphed, resulting in three morphed continua (Bernd–Anton, Anton–Carsten, and Carsten–Bernd). In the cross‐modal priming experiment, the morph levels 0, 33, 67, and 100% were used. These levels differed parametrically in physical properties (light blue), but perception of identity differed in a categorical manner (dark blue: morph levels 0 and 33% were perceived as matching the voice in identity, whereas morph levels 67 and 100% were perceived as the other person, mismatching in identity) (see Methods).
Figure 2
Behavioral results. A. Behavioral responses of individual participants confirm that morphed faces were perceived in a categorical manner: Faces with 0 and 33% morph were recognized as one identity, whereas faces with 67 and 100% morph were recognized as the other identity. "Mismatch identity" conditions were defined as trials in which the subsequent face contained only 0% or 33% of the preceding voice‐identity. Vice versa, "match identity" conditions were defined as trials in which the subsequent face contained 67% or 100% of the preceding voice‐identity. B. Reaction times were faster in trials in which face and voice matched in identity (light gray bars) compared to when they mismatched (dark gray bars). Error bars indicate standard error of the mean. Stars indicate the significant main effect of the factor "mismatch vs. match of identity" (F(1,15) = 57.472, P < 0.001).
Figure 3
Effects of voice primes in face‐sensitive regions. A. ROIs displayed on a rendered Colin brain image: OFA (cyan) and FFA (light blue) defined by activity in the localizer contrast, and aTL (dark blue) defined by published coordinates (see Methods). B–D. Results for the correlation with the categorical "identity contrast" and the parametric "physical‐distance contrast" (i.e., identity contrast: (0% + 33%) > (67% + 100%) and physical‐distance contrast: 0% > 33% > 67% > 100%, both correlated with the reaction time to voices). B. In right OFA, there were significant correlations with both the "identity contrast" and the "physical‐distance contrast," and the effects did not differ significantly from each other. C. In right FFA (light blue), there was a significant correlation with the "identity contrast" but not with the "physical‐distance contrast," although the effects did not differ significantly from each other. D. In right aTL (dark blue), there was a significant correlation with the "identity contrast" but not with the "physical‐distance contrast." Specifically, the correlation with the "identity contrast" was significantly stronger than the one with the "physical‐distance contrast."
Figure 4
Effects of voice primes on functional connectivity. Functional connectivity between FFA (light blue) and voice‐sensitive STS (red), and between FFA and OFA (cyan), was enhanced for both the "identity contrast" and the "physical‐distance contrast" (i.e., identity contrast: (0% + 33%) > (67% + 100%) and physical‐distance contrast: 0% > 33% > 67% > 100%, both correlated with the reaction time to voices).
