Sci Rep. 2019 Dec 4;9(1):18284. doi: 10.1038/s41598-019-54811-w.

Short-term effects of sound localization training in virtual reality

Mark A Steadman et al. Sci Rep. 2019.

Abstract

Head-related transfer functions (HRTFs) capture the direction-dependent way that sound interacts with the head and torso. In virtual audio systems, which aim to emulate these effects, non-individualized, generic HRTFs are typically used, leading to an inaccurate perception of virtual sound location. Training has the potential to exploit the brain's ability to adapt to these unfamiliar cues. In this study, three virtual sound localization training paradigms were evaluated: one provided simple visual positional confirmation of the sound source location, a second introduced game design elements ("gamification") and a third additionally utilized head tracking to provide listeners with experience of relative sound source motion ("active listening"). The results demonstrate a significant effect of training after a small number of short (12-minute) training sessions, which is retained across multiple days. Gamification alone had no significant effect on the efficacy of the training, but active listening resulted in significantly greater improvements in localization accuracy. In general, improvements in virtual sound localization following training generalized to a second set of non-individualized HRTFs, although some HRTF-specific changes were observed in polar angle judgements for the active listening group. The implications of this for the putative mechanisms of the adaptation process are discussed.
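For readers unfamiliar with binaural rendering, the core operation a virtual audio system performs is a convolution of a mono source signal with the pair of head-related impulse responses (HRIRs, the time-domain counterpart of HRTFs) measured for the target direction. The following is a minimal sketch of that technique, not the authors' implementation: the function name spatialize is hypothetical, and the HRIRs are assumed to have already been looked up from a generic, non-individualized HRTF set such as those used in this study.

    import numpy as np
    from scipy.signal import fftconvolve

    def spatialize(mono, hrir_left, hrir_right):
        # Convolve the mono source with the left- and right-ear HRIRs
        # for the target direction. The interaural time and level
        # differences and spectral cues encoded in the HRIRs are the
        # cues listeners use to judge virtual source location.
        left = fftconvolve(mono, hrir_left)
        right = fftconvolve(mono, hrir_right)
        return np.stack([left, right])  # (2, N) binaural signal

    # Example: render 1 s of noise with placeholder 256-tap HRIRs.
    fs = 44100
    mono = np.random.randn(fs)
    hrir_l = np.zeros(256); hrir_l[0] = 1.0  # placeholder impulse
    hrir_r = np.zeros(256); hrir_r[4] = 0.8  # crude delay/attenuation
    binaural = spatialize(mono, hrir_l, hrir_r)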

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
(Top row) Distribution of localization errors pooled across all participants within each group before training (orange) and after completing a total of nine 12-minute training sessions across three days (or following a matched testing schedule without training for the control group). (Bottom row) Polar histograms of average localization error grouped by target azimuth into eight sectors, both before (orange) and after (blue) training. The dashed lines constitute the scale bars of the histograms and correspond to a mean localization error of 90°.
Figure 2
Distributions of localization errors, divided by participant group, during the initial (before training, orange) and final (after training, blue) testing blocks. Shown are the spherical angle errors (a), lateral angle errors (b), polar angle errors (PAE; c) and the rates of front-back confusions (d). For all angle errors, the median value was calculated for each participant in each testing block. Significance indicators show the results of separate paired t-tests between the initial and final measures for each participant group.
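As a point of reference for panel (a), the spherical angle error is conventionally the great-circle angle between the target and response directions. Here is a minimal sketch of that computation, assuming directions are represented as 3D vectors; the function name and representation are illustrative, not taken from the authors' code.

    import numpy as np

    def spherical_angle_error(target, response):
        # Great-circle angle (in degrees) between two direction vectors.
        t = target / np.linalg.norm(target)
        r = response / np.linalg.norm(response)
        cos_angle = np.clip(np.dot(t, r), -1.0, 1.0)  # guard rounding
        return np.degrees(np.arccos(cos_angle))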
Figure 3
Changes in response biases following training. The top row shows changes in signed lateral (a), elevation (b) and front-back biases (c) both before (orange) and after all nine 12-minute training sessions (blue). A signed lateral error below zero indicates a tendency to perceive targets more medially, and a value above zero indicates a tendency to perceive targets more laterally. A positive elevation error indicates that targets were perceived above the target position, and negative values indicate that they were perceived below it. Positive signed front-back (F-B) confusions indicate a tendency to localize targets in the rear hemisphere towards the front, and a negative value indicates a tendency to localize targets in the front hemisphere to the rear. Indicators of significance are from separate paired t-tests. (Lower panels) Distribution of response locations for four example participants before training (top row), following a single 12-minute training block (second row) and following all nine training blocks (bottom row). Each dot indicates the orientation of a response, with 0° azimuth corresponding to straight up on the axes. Elevation is indicated by the distance from the centre, with the origin corresponding to directly upwards and the inner dark ring indicating 0° elevation. The magnitude of the spherical angle error for each response is indicated by the colour of the dot (see colour bar).
Figure 4
(a) Change in spherical angle error as a function of the number of completed 12-minute training blocks. The median spherical angle error was calculated for each participant for each testing block. Shown here are the average values across participants for each group. Error bars indicate the standard deviation. The solid lines represent fitted exponential functions for each group (colours are matched to the symbols). (b) Change in spherical angle error between testing blocks, from the final testing block on day 1 to the initial testing block on day 2 (orange) and from the final testing block on day 2 to the initial testing block on day 3 (blue).
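The fitted curves in (a) are described only as exponential functions. One common parameterization for such learning curves, offered here as an illustration rather than the authors' exact model, is y(x) = a·exp(-x/τ) + c, which can be fitted as in the sketch below; the data arrays are placeholders, not values from the study.

    import numpy as np
    from scipy.optimize import curve_fit

    def learning_curve(x, a, tau, c):
        # Exponentially decaying error with asymptote c.
        return a * np.exp(-x / tau) + c

    blocks = np.arange(10)                  # 0..9 completed blocks
    errors = np.array([30, 26, 23, 21, 20,  # placeholder group means
                       19, 18.5, 18, 17.8, 17.5])
    (a, tau, c), _ = curve_fit(learning_curve, blocks, errors,
                               p0=(15.0, 3.0, 15.0))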
Figure 5
Overall change in spherical angle error (a), lateral error (b), polar angle error (PAE; c) and the rate of front-back confusions (d) from the initial to the final testing block for sounds spatialized using the same HRTFs used throughout the training blocks (trained HRTF, orange) and for sounds spatialized using a second set of non-individualized HRTFs (non-trained HRTF, blue).
Figure 6
Overall changes in spherical, lateral and polar angle errors (PAE), and front-back confusion rates (F-B), for sounds spatialized using the same HRTFs used throughout the training blocks (trained HRTF, orange) and for sounds spatialized using a second set of non-individualized HRTFs (non-trained HRTF, blue). Data are pooled across all participants undergoing gamified, non-gamified and active-gamified training. Significance indicators show results from separate paired t-tests between the data for the trained and non-trained HRTFs.
Figure 7
Changes in spherical (a), lateral (b) and polar (c) errors and the rate of front-back confusions (d) relative to baseline for each participant. For spherical, lateral and polar errors, the median error within each testing block was calculated for each participant. Symbols mark the mean of these errors across participants within each group. Error bars indicate standard deviations.
Figure 8
(a) Schematic representation of the complex stimulus used during training and testing. (b) Schematic of the experimental design indicating the block structure of each session. Testing blocks are in orange and training blocks in blue. Each session was carried out on a different day. (c) Diagram of the centroids of the target orientation "regions" used in testing blocks. Target sounds deviated from these centroids by up to 20°. (d,e) Screenshots of the participant's view in the virtual reality application. The marked features correspond to: (a) timer, (b) cardinal direction indicator, (c) current score, (d) player health indicator, (e) animated "charge" indicator visual effect, (f) consecutive hit counter. (d) shows the HUD used in the non-gamified version and (e) shows the HUD for the gamified and active-listening versions.
