J Neurosci. 2004 Oct 20;24(42):9291-302.
doi: 10.1523/JNEUROSCI.2671-04.2004.

Dynamic sound localization during rapid eye-head gaze shifts


Joyce Vliegen et al. J Neurosci. 2004.

Erratum in

  • J Neurosci. 2004 Nov 3;24(44):following 10034

Abstract

Human sound localization relies on implicit head-centered acoustic cues. However, to create a stable and accurate representation of sounds despite intervening head movements, the acoustic input should be continuously combined with feedback signals about changes in head orientation. Alternatively, the auditory target coordinates could be updated in advance by using either the preprogrammed gaze-motor command or the sensory target coordinates to which the intervening gaze shift is made ("predictive remapping"). So far, previous experiments have been unable to dissociate these alternatives. Here, we study whether the auditory system compensates for ongoing saccadic eye and head movements in two dimensions that occur during target presentation. In this case, the system has to deal with dynamic changes of the acoustic cues as well as with rapid changes in relative eye and head orientation that cannot be preprogrammed by the audiomotor system. We performed visual-auditory double-step experiments in two dimensions in which a brief sound burst was presented while subjects made a saccadic eye-head gaze shift toward a previously flashed visual target. Our results show that localization responses under these dynamic conditions remain accurate. Multiple linear regression analysis revealed that the intervening eye and head movements are fully accounted for. Moreover, elevation response components were more accurate for longer-duration sounds (50 msec) than for extremely brief sounds (3 msec), for all localization conditions. Taken together, these results cannot be explained by a predictive remapping scheme. Rather, we conclude that the human auditory system adequately processes dynamically varying acoustic cues that result from self-initiated rapid head movements to construct a stable representation of the target in world coordinates. This signal is subsequently used to program accurate eye and head localization responses.


Figures

Figure 1.
Three models for how the audiomotor system could behave in the double-step paradigm. A, Static double-step trial in which the sound (N) is presented before the first gaze shift (ΔG1). The noncompensation model (I) predicts that the auditory target (N) is kept in a fixed craniocentric reference frame (TH). Thus, after making the first gaze shift to V (the visual target), the second movement is directed to the location at N′. In the dynamic feedback model (II), the eye-head motor response to the sound fully accounts for the actual intervening gaze shift, ΔG1. The response is given by ΔG2 = TH - ΔG1 and is directed to N. In the visual-predictive remapping model (III), the system uses the predicted first gaze shift, specified by the required movement, FV. The second saccade is preprogrammed as ΔG2 = VN = TH - FV. Any localization error of the first movement will not be accounted for. Thus, the response is directed to P rather than to N. B, Predictions of the same three models for the dynamic double step, in which the sound is presented during the first gaze shift. This yields a different head-centered target location, TH. Model III uses the preprogrammed full first gaze shift to update the head-centered target location, instead of the partial gaze displacement after sound presentation (as in model II), thereby directing the response to P rather than to N. Note that, in contrast to model II, models I and III now predict different responses than in the static paradigm and that the predictions of models II and III are now better dissociated.
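The vector arithmetic behind these three predictions is simple enough to spell out. The Python sketch below is our own illustration (the variable names and example numbers are not from the paper): it computes the predicted second gaze shift under each model for a static double step.

```python
import numpy as np

# Hypothetical 2-D (azimuth, elevation) vectors, in degrees, for a static
# double step: T_H is the sound relative to the head at presentation, dG1 is
# the actual first eye-head gaze shift, FV is the required movement from the
# fixation point F to the visual target V.
T_H = np.array([30.0, 10.0])   # head-centered auditory target
dG1 = np.array([18.0, -2.0])   # actual first gaze shift (misses V slightly)
FV  = np.array([20.0,  0.0])   # planned/required first gaze shift

# Model I (no compensation): the second movement ignores the intervening
# gaze shift and is aimed at the original head-centered location.
dG2_no_comp = T_H

# Model II (dynamic feedback): the actual intervening gaze shift is
# subtracted, so the response is aimed at the true location N.
dG2_feedback = T_H - dG1

# Model III (visual-predictive remapping): the *planned* first movement FV
# is subtracted, so any localization error of the first movement persists.
dG2_predictive = T_H - FV

print(dG2_no_comp, dG2_feedback, dG2_predictive)
```

With these made-up numbers, the feedback and predictive predictions differ by exactly the error of the first gaze shift (ΔG1 - FV). In the dynamic paradigm, the gaze displacement remaining after sound onset cannot be fully preprogrammed, which is what pulls models II and III apart.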
Figure 2.
Double-step paradigms. A, Temporal order of the different targets in the static and dynamic double-step trials. M1 and M2, First and second eye-head movement; FIX, fixation; VIS, visual; AUD, auditory; RESP, response. B, Spatial layout of the target configurations. F, Initial fixation positions; V1, visual target in the first double-step series in which M1 is a purely horizontal movement; V2, visual target in the second double-step series in which M1 is an oblique gaze shift; A, potential auditory target locations in the first target configuration; dashed circle, area within which the auditory targets were selected for the second target configuration.
Figure 5.
Properties of ongoing head and eye movements during presentation of the auditory target. A, Two-dimensional head (left) and eye (right) movement traces during stimulus presentation in the dynamic condition (early- and late-triggered trials pooled). B, Head and eye mean and peak velocity during the 50 msec stimulus presentation for both the static (dark gray histogram; only mean velocity shown) and the dynamic (black and light gray histograms for mean and peak velocities, respectively) condition. Note large trial-to-trial variability in eye and head movement kinematics for the dynamic double-step trials. Data are from subject JO. deg, Degrees.
Figure 7.
End points of second gaze saccades in azimuth and elevation plotted relative to the acoustic target position. The latter (T) is aligned with (0, 0) degrees; gaze responses are expressed as undershoots or overshoots with respect to the target location. Histograms show the respective response distributions for the static (black; filled black circles) and triggered dynamic (gray; gray triangles) double-step responses. The dashed lines indicate means of the static double steps; solid lines indicate means of the dynamic double steps. Note similarities in the distributions. Open dots correspond to gaze end points toward single-step targets, and dotted lines indicate their means. Data are from subject JO. deg, Degrees.
Figure 3.
Head (thick lines) and gaze (thin lines) double-step responses as a function of time for azimuth and elevation components. F, V, and N indicate the time of presentation of the fixation target, the visual target, and the auditory target, respectively. A, Trial from the nontriggered static condition in which the second auditory target is presented before initiation of the primary head and gaze movement. B, Trial from the early-triggered dynamic double-step condition. Here, the auditory target is presented early in the saccade. C, Trial in the late-triggered dynamic condition. Here, the auditory target falls halfway through the first head saccade. Data are from subject JV.
Figure 4.
Head (thick lines) and gaze (thin lines) double-step response traces in space. F, V, and N indicate the positions of the fixation target, the visual target, and the auditory target, respectively. A, Two representative trials from the nontriggered condition. B, Two trials from the early-triggered condition. The target presentation epoch is indicated by a change in line thickness. C, Two trials from the late-triggered condition. If the second saccade were based purely on the initial head-motor error, the responses would be directed toward the dashed square (N′). For the dynamic conditions, the initial target re-head position is defined as the target position relative to the head at sound onset. Note that the responses are directed toward the veridical location of the sound. Data are from subject JV. Responses in the top row are the same as in Figure 3.
Figure 6.
Eye-in-head positions at the onset of the second auditory-evoked gaze shift (E0 in Eq. 1). The eye is typically eccentric in the head, so that gaze-in-space and head-in-space are not aligned at the start of the second gaze shift. Points within the square correspond to eye positions with azimuth and elevation components < 10°. deg, Degrees.
Figure 9.
Partial correlation coefficients for the regression on the second head saccade (ΔH2) (A) and the second gaze saccade (ΔG2) (B), which are described as a function of the gaze (GM) and head (HM) motor errors (Eqs. 3a and 3b). The different gray-coded bars represent the two different conditions (dynamic/static) and response components (azimuth/elevation). Data are pooled across all subjects and recording sessions. Note that eye and head are mainly driven by motor commands expressed in their own reference frames.
Figure 8.
A, Regression coefficients of Equation 1 for second gaze saccades (ΔG2), averaged across subjects and recording sessions. B, Regression coefficients of Equation 2 for second head saccades (ΔH2). Different double-step conditions (dynamic/static) and response directions (horizontal/vertical) are represented by the different gray-coded bars. The dotted lines at the values of +1.0 and -1.0 correspond to ideal compensation for the intervening movements.
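Equations 1 and 2 themselves are not reproduced in these legends, so the following Python sketch is an illustration only: it assumes a linear model in which the second gaze shift depends on the head-centered target location, the intervening gaze shift, and the initial eye-in-head position, and it recovers the coefficients by ordinary least squares on simulated data.

```python
import numpy as np

# Illustrative only: the regressors below (head-centered target T_H,
# intervening gaze shift dG1, initial eye-in-head position E0) and the
# equation form are assumptions, not the paper's exact Eqs. 1 and 2.
rng = np.random.default_rng(0)
n = 200
T_H = rng.uniform(-40, 40, n)   # head-centered target component (deg)
dG1 = rng.uniform(-30, 30, n)   # intervening first gaze shift (deg)
E0  = rng.uniform(-15, 15, n)   # eye-in-head at second-saccade onset (deg)

# Simulated responses with full compensation: gain +1 on the target,
# -1 on the intervening gaze shift, some eye-position dependence, plus noise.
dG2 = 1.0 * T_H - 1.0 * dG1 - 0.3 * E0 + rng.normal(0, 2, n)

# Ordinary least-squares fit of dG2 = a*T_H + b*dG1 + c*E0 + d
X = np.column_stack([T_H, dG1, E0, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, dG2, rcond=None)
print("a (target) =", coef[0], " b (gaze shift) =", coef[1],
      " c (eye pos) =", coef[2], " bias =", coef[3])
```

Fitted coefficients near +1.0 for the target term and -1.0 for the intervening-movement term would correspond to ideal compensation, i.e., the dotted lines in the figure.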
Figure 10.
Predicted auditory-evoked gaze shifts (ΔG2; ordinate) for the four models described in Results, plotted against measured responses (abscissa). Data are pooled across subjects and recording sessions. A, Static double-step condition, for horizontal (top row) and vertical (bottom row) response components. B, Dynamic double-step condition for both response components. If the model predicted ΔG2 perfectly, data points would fall on the unity line and R2 would be 1. R2 values are given in the bottom right corner of all panels. The predictions of the dynamic feedback model are superior to the other models.
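The R2 value quoted in each panel quantifies how well a model's predictions track the measured second gaze shifts. The exact definition used in the analysis is not given in the legend; the sketch below shows one common choice, the coefficient of determination of the predictions against the data (the helper function and numbers are ours).

```python
import numpy as np

def r_squared(measured, predicted):
    """Coefficient of determination of model predictions against data."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((measured - predicted) ** 2)
    ss_tot = np.sum((measured - measured.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical example: a model that fully accounts for the intervening
# gaze shift tracks the measured responses closely (R^2 near 1).
measured  = np.array([12.0, -5.0, 20.0, 3.0, -14.0])
predicted = np.array([11.0, -6.0, 21.5, 2.0, -13.0])
print(r_squared(measured, predicted))
```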
Figure 11.
Comparison of gaze-elevation responses for single-step, static, and dynamic double-step conditions for very brief (3 msec) and longer (50 msec) sound durations. Data are pooled for all four subjects (JO, JV, MW, and RK) and ranked to create cumulative distributions of absolute elevation response errors for each of the three conditions and two stimulus durations (see legend for explanation of symbols and line styles). Note the larger errors in the 3 msec data when compared with the 50 msec data for all three conditions (p < 0.025; KS test). Thus, responses to the 50 msec stimuli are more accurate than those to the 3 msec stimuli. The result for the triggered double-step responses is highlighted (p = 0.002). The horizontal dashed line at 50% is the median of the response distributions. For both stimulus durations, responses to the triggered and nontriggered double steps are slightly less accurate than those to the single-step targets (p < 0.05).
Figure 12.
Conceptual scheme underlying accurate dynamic gaze control toward acoustic targets. Two stages are discerned. First, head-centered target location at target onset, TH(0), is added to head orientation in space at sound onset, HS(0), to create a stable representation of the sound in world coordinates, TS. This target representation is kept in memory until a new target is selected. In the second stage, adequate motor commands for the eyes and head are generated by remapping the sound into head-centered (TH(t)) and eye-centered (TE(t)) target locations, respectively. This latter process requires signals about instantaneous eye position in the head, EH(t), and about head orientation in space, HS(t).
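A minimal Python sketch of this two-stage scheme is given below, with our own function and variable names and with all spatial transformations reduced to simple vector addition (the real acoustic-cue processing stage is of course far richer).

```python
import numpy as np

# Stage 1: build a world-centered (spatial) target representation at sound
# onset and keep it in memory until a new target is selected.
def encode_target(T_H0, H_S0):
    """T_S = T_H(0) + H_S(0): head-centered target plus head-in-space."""
    return np.asarray(T_H0, float) + np.asarray(H_S0, float)

# Stage 2: at movement time t, remap the stored spatial target into the
# head- and eye-centered coordinates that drive the head and gaze commands.
def motor_errors(T_S, H_St, E_Ht):
    """Return (head motor error T_H(t), eye/gaze motor error T_E(t))."""
    T_Ht = np.asarray(T_S, float) - np.asarray(H_St, float)   # re. head
    T_Et = T_Ht - np.asarray(E_Ht, float)                     # re. eye
    return T_Ht, T_Et

# Hypothetical numbers (deg): sound heard mid-gaze-shift, head keeps moving.
T_S = encode_target(T_H0=[25.0, 8.0], H_S0=[10.0, 0.0])       # -> [35, 8]
head_err, eye_err = motor_errors(T_S, H_St=[22.0, 3.0], E_Ht=[6.0, -2.0])
print(T_S, head_err, eye_err)
```

The essential property is that the stored world-centered target T_S stays fixed while the eyes and head move; only the stage-2 remapping consults the instantaneous feedback signals H_S(t) and E_H(t).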
