PLoS One. 2019 Mar 29;14(3):e0214603.
doi: 10.1371/journal.pone.0214603. eCollection 2019.

Sound source localization with varying amount of visual information in virtual reality

Axel Ahrens et al. PLoS One. 2019.

Abstract

To achieve accurate spatial auditory perception, subjects typically require personal head-related transfer functions (HRTFs) and the freedom to move their heads. Loudspeaker-based virtual sound environments allow for realism without individualized measurements. To study audio-visual perception in realistic environments, the combination of spatially tracked head-mounted displays (HMDs), also known as virtual reality glasses, and virtual sound environments may be valuable. However, HMDs were recently shown to affect the subjects' HRTFs and thus might influence sound localization performance. Furthermore, limitations in the reproduction of visual information on the HMD might influence audio-visual perception. Here, a sound localization experiment was conducted both with and without an HMD and with varying amounts of visual information provided to the subjects. In addition, interaural time and level difference errors (ITDs and ILDs) as well as spectral perturbations induced by the HMD were analyzed and compared to the perceptual localization data. The results showed a reduction in localization accuracy when the subjects were wearing an HMD and when they were blindfolded. The HMD-induced error in azimuth localization was found to be larger in the left than in the right hemisphere. When visual information about the limited set of source locations was provided, the localization error induced by the HMD was found to be negligible. Presenting visual information about hand location and room dimensions yielded better sound localization performance than the condition with no visual information. The addition of possible source locations further improved the localization accuracy. Adding pointing feedback in the form of a virtual laser pointer also improved the accuracy of elevation perception but not of azimuth perception.
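
The interaural cues named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis: it assumes the ITD is taken as the lag of the cross-correlation peak between the two ear impulse responses and the ILD as a broadband energy ratio, whereas the paper reports ILDs in auditory bands. The function names, the sampling rate, and the synthetic impulse responses are all hypothetical.

```python
import numpy as np


def itd_seconds(h_left, h_right, fs):
    """ITD as the lag of the cross-correlation peak (assumed estimator).

    A positive value means the left-ear response arrives later than the
    right-ear response.
    """
    h_left = np.asarray(h_left, dtype=float)
    h_right = np.asarray(h_right, dtype=float)
    xcorr = np.correlate(h_left, h_right, mode="full")
    # Lags that correspond to the entries of the "full" correlation output.
    lags = np.arange(-(len(h_right) - 1), len(h_left))
    return lags[np.argmax(xcorr)] / fs


def ild_db(h_left, h_right):
    """Broadband ILD in dB (assumed simplification of a band-wise ILD).

    A positive value means more energy at the left ear.
    """
    e_left = np.sum(np.square(np.asarray(h_left, dtype=float)))
    e_right = np.sum(np.square(np.asarray(h_right, dtype=float)))
    return 10.0 * np.log10(e_left / e_right)


def cue_error(cue_with_hmd, cue_without_hmd):
    """Signed cue error: value measured with the HMD minus value without it."""
    return cue_with_hmd - cue_without_hmd


# Synthetic example (hypothetical data): the left-ear response is a copy of
# the right-ear response, delayed by 4 samples and attenuated to half amplitude.
fs = 48000
h_right = np.zeros(64)
h_right[10] = 1.0
h_left = np.zeros(64)
h_left[14] = 0.5

print(itd_seconds(h_left, h_right, fs))
print(ild_db(h_left, h_right))
```

Comparing such cues between measurements with and without the HMD, for example with `cue_error`, gives a signed error per source direction.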

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Photograph (left) and screenshot (right) of the acoustic reproduction system in the real and in the virtual environment (RE and VE).
The loudspeakers are numbered in azimuth and color coded in elevation.
Fig 2. Pointing bias in azimuth (squares) and elevation (circles) for each subject.
The bias was calculated as the mean error over all source locations in the two visual localization conditions. Negative angles indicate biases to the left and downwards for azimuth and elevation, respectively.
Fig 3. Spectral difference (SD) measured at the left ear of the B&K HATS with and without the HTC Vive.
The angles in the legend represent the elevation angles considered in the current study. The SD was calculated in auditory bands and averaged over three frequency regions at low-, mid- and high frequencies as shown in the legend.
Fig 4. Signed errors of interaural differences in level and time (ILD and ITD) with respect to azimuth angles on the horizontal plane (0° elevation).
Positive errors indicate larger ILDs and ITDs with than without the HMD. The ILDs were calculated in auditory bands and averaged over three frequency regions at low-, mid- and high frequencies as shown in the legend. The ITDs were calculated from the delays between the broadband binaural impulse responses (see Methods for details).
Fig 5. Response plot of the visual localization experiment for the real and virtual environment.
The black squares represent the source locations. The small markers show the responses of the subjects and the large markers the mean response over subjects and repetitions. Negative angles represent sources to the left and downwards for azimuth and elevation, respectively.
Fig 6. Mean absolute (circles) and signed (boxplots) azimuth error for visual localization in the virtual (light blue) and the real (dark blue) environment.
The error is shown over the thirteen azimuth angles in the horizontal plane (0° elevation). The boxplots indicate the median (line) and the first and third quartile. The whiskers extend to 1.5 times the interquartile range.
Fig 7. Mean absolute (circles) and signed (boxplots) azimuth error for acoustic localization of blindfolded subjects with (light grey) and without (dark grey) the head-mounted display (HMD).
The error is shown over the thirteen azimuth angles in the horizontal plane (0° elevation). The boxplots indicate the median (line) and the first and third quartile. The whiskers extend to 1.5 times the interquartile range.
Fig 8. Mean absolute (circles) and signed (boxplots) azimuth error for acoustic localization with varying visual information in the virtual environment and the real environment. In all conditions, except in the real environment, subjects wore the head-mounted display (HMD).
The conditions depicted with shades of blue color include visual information of possible source locations. The error is shown over the thirteen azimuth angles in the horizontal plane (0° elevation). The boxplots indicate the median (line) and the first and third quartile. The whiskers extend to 1.5 times the interquartile range.
Fig 9. Absolute elevation error in degrees for acoustic localization for blindfolded subjects with (light grey) and without (dark grey) the head-mounted display (HMD).
The error is shown over the three elevation angles and includes the sources from all azimuth locations. The boxplots indicate the median (line) and the first and third quartile. The whiskers extend to 1.5 times the interquartile range.
Fig 10. Absolute elevation error for acoustic localization with varying visual information in the virtual environment and the real environment.
In all conditions, except in the real environment, subjects wore the head-mounted display (HMD). The conditions depicted with shades of blue color include visual information of possible source locations. The error is shown over the three elevation angles and includes the sources from all azimuth locations. The boxplots indicate the median (line) and the first and third quartile. The whiskers extend to 1.5 times the interquartile range.
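
The error measures named in the captions above (pointing bias as a mean signed error, mean absolute error, and boxplot statistics) can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: the function names and the example angles are hypothetical, quartiles use NumPy's default linear interpolation, and the whiskers are assumed to end at the most extreme data point within 1.5 times the interquartile range of the quartiles, which is one common reading of the caption.

```python
import numpy as np


def signed_error(response_deg, target_deg):
    """Response minus target, in degrees (negative: left or downward)."""
    return np.asarray(response_deg, dtype=float) - np.asarray(target_deg, dtype=float)


def pointing_bias(response_deg, target_deg):
    """Mean signed error over all source locations."""
    return float(np.mean(signed_error(response_deg, target_deg)))


def mean_absolute_error(response_deg, target_deg):
    """Mean of the unsigned errors over all source locations."""
    return float(np.mean(np.abs(signed_error(response_deg, target_deg))))


def boxplot_summary(errors_deg):
    """Median, quartiles, and whisker ends (assumed 1.5 x IQR convention)."""
    errors = np.asarray(errors_deg, dtype=float)
    q1, median, q3 = np.percentile(errors, [25, 50, 75])
    iqr = q3 - q1
    # Keep only the points that lie within the 1.5 x IQR fences.
    inside = errors[(errors >= q1 - 1.5 * iqr) & (errors <= q3 + 1.5 * iqr)]
    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "whisker_low": float(inside.min()),
        "whisker_high": float(inside.max()),
    }


# Hypothetical azimuth data in degrees, for illustration only.
targets = [-30.0, 0.0, 30.0]
responses = [-34.0, -2.0, 27.0]

print(pointing_bias(responses, targets))
print(mean_absolute_error(responses, targets))
print(boxplot_summary([-4.0, -3.0, -2.0, 0.0, 1.0, 20.0]))
```

In the last call, the value 20.0 lies outside the upper fence, so it would be drawn as an outlier rather than extend the whisker.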
