Front Neurosci. 2018 Feb 2;12:21.
doi: 10.3389/fnins.2018.00021. eCollection 2018.

Generic HRTFs May be Good Enough in Virtual Reality. Improving Source Localization through Cross-Modal Plasticity

Christopher C Berger et al. Front Neurosci.

Abstract

Auditory spatial localization in humans is performed using a combination of interaural time differences, interaural level differences, and spectral cues provided by the geometry of the ear. To render spatialized sounds within a virtual reality (VR) headset, either individualized or generic Head Related Transfer Functions (HRTFs) are usually employed. The former require arduous calibrations, but enable accurate auditory source localization, which may lead to a heightened sense of presence within VR. The latter obviate the need for individualized calibrations, but result in less accurate auditory source localization. Previous research on auditory source localization in the real world suggests that our representation of acoustic space is highly plastic. In light of these findings, we investigated whether auditory source localization could be improved for users of generic HRTFs via cross-modal learning. The results show that pairing a dynamic auditory stimulus with a spatio-temporally aligned visual counterpart enabled users of generic HRTFs to improve subsequent auditory source localization. Exposure to the auditory stimulus alone, or to asynchronous audiovisual stimuli, did not improve auditory source localization. These findings have important implications for human perception as well as for the development of VR systems, as they indicate that generic HRTFs may be enough to enable good auditory source localization in VR.
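To make the rendering step concrete, the sketch below illustrates one common way HRTF-based spatialization is implemented: a mono source is convolved with the left/right head-related impulse response (HRIR) pair measured for the desired direction. This is a minimal illustration, not the authors' implementation; the HRIRs here are random placeholders, and a real renderer would load a measured (individualized or generic) set.

    import numpy as np

    def spatialize(mono, hrir_left, hrir_right):
        """Convolve a mono signal with an HRIR pair to get a binaural stereo signal."""
        left = np.convolve(mono, hrir_left)
        right = np.convolve(mono, hrir_right)
        return np.stack([left, right], axis=-1)  # shape: (n_samples, 2)

    # Hypothetical example: a 440 Hz tone and toy 128-tap "HRIRs".
    fs = 44100
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 440 * t)
    rng = np.random.default_rng(0)
    stereo = spatialize(tone, rng.standard_normal(128), rng.standard_normal(128))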

Keywords: HRTF (head related transfer function); auditory perception; auditory training; cross-modal perception; cross-modal plasticity; spatial audio; virtual reality.

Figures

Figure 1
Experimental setup. (A) The participants were equipped with the VR headset and could identify and report the source of sounds originating from five different locations (±26.6°, ±11.3°, 0°) along a white bar that was located 10 m in front of the participant and spanned 73.74° along the azimuth. (B) First-person perspective within the VR environment during the auditory localization task. (The person in the picture is an author of the paper and gave consent to publish an identifiable image of him.)
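The reported angles are consistent with five targets at lateral offsets of 0, ±2, and ±5 m on a bar 10 m away; these offsets are inferred from the angles, not stated in the caption. The short check below reproduces the caption's numbers.

    import math

    distance = 10.0                         # m, bar in front of the participant
    offsets = [-5.0, -2.0, 0.0, 2.0, 5.0]   # m, inferred lateral target positions
    angles = [math.degrees(math.atan2(x, distance)) for x in offsets]
    print([round(a, 1) for a in angles])    # [-26.6, -11.3, 0.0, 11.3, 26.6]

    # A 15 m wide bar (half-width 7.5 m) reproduces the reported 73.74° span.
    print(round(2 * math.degrees(math.atan2(7.5, distance)), 2))  # 73.74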
Figure 2
Results from all experiments. (A) Box plots of the auditory remapping for all experiments. A significant improvement in the participants' auditory localization error was observed following the 60 s Audiovisual (AV) exposure. No such improvement was observed following the Auditory Only exposure. In the experiment on the effect of impact sounds, improved auditory source localization was observed following the synchronous audiovisual exposure phase with the additional impact-related auditory cues (AV + Impact Sync). No significant remapping was observed following exposure to asynchronous but spatially aligned audiovisual stimuli (AV + Impact Async), or when the training was done with one sound and the localization test used a different sound (V + Impact Sync). (B) Mean pre- and post-adaptation localization errors for all participants, with each participant's data represented by a pair of dots connected by a line. Asterisks indicate a significant difference between pre-exposure and post-exposure phases (*p < 0.05), and "n.s." indicates no significant difference between pre- and post-exposure phases (p > 0.05).
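The pre/post comparison in panel B amounts to a paired test on each participant's mean localization error. Below is a minimal sketch assuming a paired-samples t-test (the study's actual statistic is not stated here); the error values are made-up placeholders, not data from the paper.

    import numpy as np
    from scipy import stats

    pre_error = np.array([12.1, 10.4, 14.3, 9.8, 11.7])   # deg, hypothetical
    post_error = np.array([9.9, 8.7, 12.1, 9.5, 10.2])    # deg, hypothetical
    t_stat, p_value = stats.ttest_rel(pre_error, post_error)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")          # significant if p < 0.05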

