Look together: analyzing gaze coordination with epistemic network analysis

Sean Andrist et al. Front Psychol. 2015 Jul 21;6:1016. doi: 10.3389/fpsyg.2015.01016. eCollection 2015.
Abstract

When conversing and collaborating in everyday situations, people naturally and interactively align their behaviors with each other across various communication channels, including speech, gesture, posture, and gaze. Having access to a partner's referential gaze behavior has been shown to be particularly important in achieving collaborative outcomes, but the process by which people's gaze behaviors unfold over the course of an interaction and become tightly coordinated is not well understood. In this paper, we present work to develop a deeper and more nuanced understanding of coordinated referential gaze in collaborating dyads. We recruited 13 dyads to participate in a collaborative sandwich-making task and used dual mobile eye tracking to synchronously record each participant's gaze behavior. We used a relatively new analysis technique, epistemic network analysis, to jointly model the gaze behaviors of both conversational participants. In this analysis, network nodes represent gaze targets for each participant, and edge strengths convey the likelihood of simultaneous gaze to the connected target nodes during a given time-slice. We divided collaborative task sequences into discrete phases to examine how the networks of shared gaze evolved over longer time windows. We conducted three separate analyses of the data to reveal (1) properties and patterns of how gaze coordination unfolds throughout an interaction sequence, (2) optimal time lags of gaze alignment within a dyad at different phases of the interaction, and (3) differences in gaze coordination patterns for interaction sequences that lead to breakdowns and repairs. In addition to contributing to the growing body of knowledge on the coordination of gaze behaviors in joint activities, this work has implications for the design of future technologies that engage in situated interactions with human users.
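In the ENA model described above, edge strengths come from how often the two partners fixate a given pair of targets within the same time-slice. The sketch below illustrates one plausible way to accumulate such co-occurrence counts from two synchronized gaze streams; the target labels, slice length, and proportion-based normalization are illustrative assumptions, not the authors' exact coding scheme.

```python
from collections import Counter

# Hypothetical synchronized gaze streams, one target label per time-slice.
# Target names and slice duration are assumptions for illustration only.
instructor_gaze = ["ingredient", "ingredient", "partner", "bread", "bread", "ingredient"]
worker_gaze     = ["partner",    "ingredient", "partner", "bread", "ingredient", "ingredient"]

def gaze_cooccurrence(gaze_a, gaze_b):
    """Count simultaneous gaze to each (instructor target, worker target) pair."""
    edges = Counter()
    for a, b in zip(gaze_a, gaze_b):
        edges[(f"I_{a}", f"W_{b}")] += 1
    return edges

edges = gaze_cooccurrence(instructor_gaze, worker_gaze)
total = sum(edges.values())
for (node_a, node_b), count in sorted(edges.items()):
    # Edge strength expressed as the proportion of time-slices with this pairing.
    print(f"{node_a} -- {node_b}: {count / total:.2f}")
```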

Keywords: conversational repair; epistemic network analysis; gaze tracking; referential gaze; social signals.


Figures

Figure 1
Cross-recurrence plots adapted from work by Richardson and Dale (2005). Horizontal and vertical axes specify the gaze of a speaker and a listener. Diagonal slices (lower-left to upper-right) correspond to an alignment of the participants' gaze with a particular time lag between them. A point is plotted on the diagonal whenever the gaze is recurrent. These plots visually compare a “good” listener (well aligned with the speaker's gaze) to a “bad” listener (not as well aligned). They also show the poor alignment of random gaze with a speaker's gaze.
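A cross-recurrence plot of this kind can be built directly from two categorical gaze streams: mark cell (i, j) whenever the speaker's target at time i equals the listener's target at time j, so that each diagonal collects the recurrent points at one fixed lag. The following sketch uses made-up target sequences and an assumed sign convention (positive offsets meaning the listener trails the speaker); it is an illustration of the general technique, not the figure's actual data.

```python
import numpy as np

# Made-up gaze target sequences sampled at a constant rate (labels are illustrative).
speaker  = np.array(["A", "A", "B", "B", "C", "A", "C", "B"])
listener = np.array(["A", "A", "A", "B", "B", "C", "A", "C"])

# Cross-recurrence matrix: cell (i, j) is 1 when the speaker's target at time i
# matches the listener's target at time j.
recurrence = (speaker[:, None] == listener[None, :]).astype(int)

# Each diagonal corresponds to a fixed time lag between the two streams; its mean is
# the proportion of recurrent (aligned) samples at that lag. With this layout, a
# positive offset compares speaker time i with listener time i + offset.
for offset in range(-3, 4):
    diag = np.diagonal(recurrence, offset=offset)
    print(f"offset {offset:+d}: recurrence rate {diag.mean():.2f}")
```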
Figure 2
(A) The setup of the data collection experiment in the sandwich-making task. (B) A view from one participant's eye-tracking glasses, showing their scan path throughout a reference-action sequence. (C) A timeline view of the gaze fixations to ingredients, the partner, and the bread shown in the scan path in (B).
Figure 3
Center: Each circular point represents the centroid of a network for one dyad in a particular phase, collapsed across all reference-action sequences produced by that dyad. The centroid of the mean network for each phase is also plotted as a solid square surrounded by a larger square denoting the confidence interval. A cyclical relationship through the ENA space can be observed. Boxes in the periphery: The mean network for each of the five phases is fully plotted. A representative timeline of an example gaze sequence from the raw gaze data is shown beneath the mean networks to illustrate each phase. A view of the worker's and instructor's scan paths in that phase (same data as in the timeline) is also shown.
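ENA positions each dyad-phase network in a low-dimensional space by treating its edge weights as a vector and projecting all such vectors together; centroids are then just the mean positions per phase. The sketch below approximates that idea with a plain SVD projection of edge-weight vectors. ENA's actual procedure involves additional normalization and rotation steps, so this is a stand-in for intuition rather than the published method, and all data here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic edge-weight vectors: rows are dyad-phase networks, columns are edges
# (13 dyads x 5 phases = 65 networks over, say, 10 possible edges).
n_dyads, n_phases, n_edges = 13, 5, 10
weights = rng.random((n_dyads * n_phases, n_edges))
phase_of_row = np.tile(np.arange(n_phases), n_dyads)

# Center the vectors and project onto the first two singular directions
# (a rough stand-in for ENA's dimensional reduction of network vectors).
centered = weights - weights.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
points = centered @ vt[:2].T  # one 2-D point per dyad-phase network

# Centroid of each phase's networks in the projected space.
for phase in range(n_phases):
    centroid = points[phase_of_row == phase].mean(axis=0)
    print(f"phase {phase}: centroid at ({centroid[0]:+.2f}, {centroid[1]:+.2f})")
```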
Figure 4
Percentage of gaze alignment between the instructor and worker at each of the five phases, plotted at offset lags from −2 to 2 s. Positive lags indicate that the instructor leads; negative lags indicate that the worker leads.
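The alignment-by-lag curves summarized in this caption amount to shifting one partner's gaze stream relative to the other and computing the percentage of matching samples at each shift. A minimal sketch under assumed conditions (a 30 Hz sampling rate and synthetic data with a built-in half-second instructor lead; both are assumptions, not the study's actual parameters) might look like this:

```python
import numpy as np

SAMPLE_HZ = 30  # assumed sampling rate; the study's eye trackers may differ

def alignment_at_lag(instructor, worker, lag_samples):
    """Percentage of samples where both partners fixate the same target after
    shifting the worker's stream by lag_samples (positive = instructor leads)."""
    if lag_samples > 0:
        pairs = list(zip(instructor[:-lag_samples], worker[lag_samples:]))
    elif lag_samples < 0:
        pairs = list(zip(instructor[-lag_samples:], worker[:lag_samples]))
    else:
        pairs = list(zip(instructor, worker))
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

# Synthetic data: the worker repeats the instructor's gaze 0.5 s later.
rng = np.random.default_rng(1)
targets = np.array(["ingredient1", "ingredient2", "partner", "bread"])
instructor = rng.choice(targets, size=600)
worker = np.roll(instructor, 15)  # 15 samples = 0.5 s at the assumed 30 Hz

for lag_s in np.arange(-2.0, 2.5, 0.5):
    lag = int(round(lag_s * SAMPLE_HZ))
    print(f"lag {lag_s:+.1f} s: {alignment_at_lag(instructor, worker, lag):5.1f}% aligned")
```

With this synthetic data the curve peaks at a +0.5 s lag, mirroring how an instructor-lead peak would appear in the figure.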
Figure 5
Centroids and mean networks from the ENA that used gaze data from each phase that was shifted by the optimal lag for that phase. The data is modeled from the perspective of the instructor. Four nodes represent the possible gaze targets for the instructor as before, but there are only two nodes for the worker, signifying whether the worker is looking at the same target or a different target. W_Different and W_Same are largely vertically separated. Networks that are low on the y-axis have strong connections to W_Same, while networks high on the axis have strong connections to W_Different. Thus, the y-axis can be interpreted as signifying “alignment,” and we can observe a rise and fall of alignment in the phases as their corresponding networks fall and rise respectively in the ENA space.
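The caption above describes recoding the worker's gaze, after applying the phase's optimal lag, into just two nodes relative to the instructor's current target. A minimal sketch of that recoding (with a hypothetical lag expressed in samples and illustrative target labels) could be:

```python
def recode_worker_gaze(instructor, worker, optimal_lag):
    """Label each instructor sample W_Same or W_Different depending on whether the
    worker, shifted by the phase's optimal lag, fixates the same target."""
    labels = []
    for t, target in enumerate(instructor):
        shifted = t + optimal_lag
        if 0 <= shifted < len(worker):
            labels.append("W_Same" if worker[shifted] == target else "W_Different")
    return labels

# Example with a hypothetical one-sample lag: the worker trails by exactly one sample.
instructor = ["bread", "ingredient", "partner", "ingredient"]
worker     = ["partner", "bread", "ingredient", "partner"]
print(recode_worker_gaze(instructor, worker, optimal_lag=1))
# prints ['W_Same', 'W_Same', 'W_Same']
```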
Figure 6
Right: Each circular point represents the centroid of a network for one dyad in a particular phase with or without a repair occurring in the reference-action sequence. The centroid of the mean network for each phase is also plotted as a solid square surrounded by a larger square denoting the confidence interval. Left: The difference in mean networks between repair and no-repair for each of the first three phases (pre-reference, reference, and post-reference).
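The difference networks described in this caption are obtained by subtracting, edge by edge, the mean network of no-repair sequences from the mean network of repair sequences within a phase. A small sketch of that subtraction over hypothetical edge-weight dictionaries (edge names and weights are invented for illustration):

```python
def mean_network(networks):
    """Average edge weights across a list of {edge: weight} dictionaries."""
    edges = {e for net in networks for e in net}
    return {e: sum(net.get(e, 0.0) for net in networks) / len(networks) for e in edges}

def difference_network(repair_nets, no_repair_nets):
    """Edge-wise difference between the mean repair and mean no-repair networks."""
    mean_r, mean_n = mean_network(repair_nets), mean_network(no_repair_nets)
    edges = set(mean_r) | set(mean_n)
    return {e: mean_r.get(e, 0.0) - mean_n.get(e, 0.0) for e in edges}

# Hypothetical edge weights, keyed by (instructor target, worker target) pairs.
repair = [{("I_partner", "W_ingredient"): 0.4, ("I_bread", "W_bread"): 0.2}]
no_repair = [{("I_partner", "W_ingredient"): 0.1, ("I_bread", "W_bread"): 0.5}]
print(difference_network(repair, no_repair))
```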

References

    1. Allopenna P. D., Magnuson J. S., Tanenhaus M. K. (1998). Tracking the time course of spoken word recognition using eye movements: evidence for continuous mapping models. J. Mem. Lang. 38, 419–439. doi: 10.1006/jmla.1997.2558
    2. Altmann G. T., Kamide Y. (2004). Now you see it, now you don't: mediating the mapping between language and the visual world, in The Interface of Language, Vision, and Action: Eye Movements and the Visual World, eds Henderson J. M., Ferreira F. (New York, NY: Psychology Press), 347–386.
    3. Argyle M., Cook M. (1976). Gaze and Mutual Gaze. Cambridge, UK: Cambridge University Press.
    4. Baldwin D. A. (1995). Understanding the link between joint attention and language, in Joint Attention: Its Origins and Role in Development, eds Moore C., Dunham P. J. (Hillsdale, NJ: Erlbaum), 131–158.
    5. Bard E. G., Hill R., Arai M. (2009). Referring and gaze alignment: accessibility is alive and well in situated dialogue, in Proceedings of CogSci 2009 (Amsterdam: Cognitive Science Society), 1246–1251.
