Front Neurorobot. 2012 May 3;6:3. doi: 10.3389/fnbot.2012.00003. eCollection 2012.

I Reach Faster When I See You Look: Gaze Effects in Human-Human and Human-Robot Face-to-Face Cooperation

Jean-David Boucher et al.

Abstract

Human-human interaction in natural environments relies on a variety of perceptual cues. Humanoid robots are becoming increasingly refined in their sensorimotor capabilities, and thus should now be able to manipulate and exploit these social cues in cooperation with their human partners. Previous studies have demonstrated that people follow human and robot gaze, and that it can help them to cope with spatially ambiguous language. Our goal is to extend these findings into the domain of action, to determine how human and robot gaze can influence the speed and accuracy of human action. We report results from a human-human cooperation experiment demonstrating that an agent's vision of her/his partner's gaze can significantly improve that agent's performance in a cooperative task. We then implement a heuristic capability to generate such gaze cues in a humanoid robot that engages in the same cooperative interaction. The subsequent human-robot experiments demonstrate that a human agent can indeed exploit the predictive gaze of their robot partner in a cooperative task. This allows us to render the humanoid robot more human-like in its ability to communicate with humans. The long-term objectives of the work are thus to identify social cooperation cues and to validate their pertinence through implementation in a cooperative robot. The current research provides the robot with the capability to produce appropriate speech and gaze cues in the context of human-robot cooperation tasks. Gaze is manipulated in three conditions: full gaze (coordinated eye and head), eyes hidden with sunglasses, and head fixed. We demonstrate the pertinence of these cues through statistical measures of human action times in the cooperative task: gaze significantly facilitates cooperation, as measured by human response times.

Keywords: cooperation; gaze; human–human interaction; human–robot interaction.


Figures

Figure 1
Cooperation paradigm for human–human and human–robot interaction. (A) Schematic representation. Cubes labeled with consonants are on the playing surface; the consonants are visible only to the informer. The manipulator hears a consonant over headphones and announces it to the informer. The informer performs a gaze search for the consonant cube, fixates it, and announces the vowel-color location to the manipulator, who grasps the cube and puts it in front of himself. (B) Human–human setup. Eye movements of the referent subject are recorded by an eye tracker (see inset). (C) View from the informer, taken from the eye tracker. The circled red cross indicates the current eye position. (D) Human–robot setup. The iCub plays the role of the informer.
Figure 2
Common timeline of the experiments. The task is composed of four phases. Instruction phase: from the confidential instruction provided by the program (via headphones to the manipulator) to the beginning of the label verbalization (a). Search phase: from the beginning of the label verbalization by the manipulator (a), through the informer hearing (b) and then searching for (c) the cube, to the beginning of the location verbalization by the informer (e). Location phase: from the location verbalization by the informer (e) to the manipulator's hand contact on the cube (f). Move phase: the rest of the hand movement, from cube contact back to the initial position. RT is the reaction time, from the end of Says location (e) to movement onset (f). Grasp time (GT) is from movement onset to contact with the cube. Movement time is the sum of RT and GT. Note that if the manipulator lifts the hand (f) before the end of Says location (e), the RT will be negative, i.e., anticipatory.
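
As a purely illustrative sketch (not the authors' analysis code), the timing measures defined above can be computed from three annotated event timestamps; the function name and example values below are hypothetical.

    # Illustrative only: RT, GT, and MT (in seconds) from annotated event
    # timestamps, following the definitions in the Figure 2 caption.
    # A negative RT marks an anticipatory hand lift.
    def timing_measures(says_location_end, movement_onset, cube_contact):
        rt = movement_onset - says_location_end   # reaction time (< 0 if anticipatory)
        gt = cube_contact - movement_onset        # grasp time
        mt = rt + gt                              # movement time = RT + GT
        return rt, gt, mt

    # Example: the hand is lifted 120 ms before the location utterance ends.
    rt, gt, mt = timing_measures(3.40, 3.28, 4.05)
    print(f"RT = {rt:.2f} s, GT = {gt:.2f} s, MT = {mt:.2f} s")
    # -> RT = -0.12 s, GT = 0.77 s, MT = 0.65 s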
Figure 3
Gaze effects. Distributions of durations of the four phases of the interaction with gaze blocked by sunglasses (glasses-on) or not (glasses-off). During the Location phase, the manipulator's response time is significantly impaired when the informer is wearing glasses.
Figure 4
Distributions of the referent subject's gaze toward ROIs according to phase (rows) and condition (columns). During search, the manipulator looks at the informer's eyes, while the informer looks at the target cubes to identify the named cube.
Figure 5
Qualitative comparison between human (left) and iCub (right) saccadic responses: positions (first column) and velocities (second column) of the head and eye movements (both horizontal and vertical) are given when the human and the robot are asked to look at the target. In both cases, the first row represents the gaze response resulting from the coordinated rotation of the head and the eyes, which are illustrated in the second and third rows, respectively.
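
The head–eye coordination shown in the figure can be illustrated with a minimal numerical sketch, assuming the gaze shift is the sum of a slower head rotation and a fast eye-in-head rotation; the trajectories, time constants, target eccentricity, and sampling rate below are invented for illustration and are not taken from the recorded data.

    # Illustrative only: gaze angle as the coordinated sum of head rotation and
    # eye-in-head rotation, with velocities obtained by finite differences.
    import numpy as np

    dt = 0.01                               # assumed 100 Hz sampling
    t = np.arange(0.0, 0.5, dt)
    G = 20.0                                # assumed target eccentricity (deg)
    gaze = G * (1 - np.exp(-t / 0.03))      # fast overall gaze shift toward the target
    head = G * (1 - np.exp(-t / 0.15))      # slower head contribution
    eye = gaze - head                       # eye-in-head: fast saccade, then counter-rotation

    gaze_vel = np.gradient(gaze, dt)        # gaze velocity (deg/s)
    print(f"final gaze ≈ {gaze[-1]:.1f} deg, peak gaze velocity ≈ {gaze_vel.max():.0f} deg/s")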
Figure 6
The HRI setup. On the left side, the robot plays the role of the informer, whereas the human is the manipulator. Depending on the condition, the robot changes its behavior (see Figures 7A–C). In order to observe potential anticipation due to the eye and/or head movement, we measure the reaction time (RT) and the movement time (MT). The RT is the duration between the end of the speech location signal (Figure 2, part (e)) and the moment the human lifts her/his hand from the initial position (in the picture, under the elbow). The MT is the duration between the end of the speech location signal and the manipulator's first contact with the cube (details in Figure 7D). See Figure 2 for the timeline.
Figure 7
Human–Robot interaction conditions (A–C) and the contact detection glove (D). In the head fixed condition (A), the robot stays in a neutral position, moving neither its eyes nor its head. In the full gaze condition (B), the robot searches for the correct cube with coordinated gaze (eye and head movement). The sunglasses condition (C) is the same as (B), except that the robot wears sunglasses; in this condition, the human cannot see the robot's eyes. (D) illustrates the contact detection glove. Contact is detected when the electrical circuits are closed. There are two kinds of contact detection: (i) when the glove touches/releases a cube and (ii) when the glove palm contacts/leaves the initial position.
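
A minimal sketch of how the glove's two contact circuits in panel (D) could be turned into timestamped touch/release events; the read_circuits() sampling function and polling rate are hypothetical, since the caption only states that contact closes an electrical circuit.

    # Illustrative only: timestamping the glove's two contact events
    # (cube touch/release, palm on/off the initial position).
    import time

    def monitor_glove(read_circuits, poll_s=0.005):
        """read_circuits() -> (cube_closed: bool, palm_closed: bool); hypothetical sampler."""
        prev_cube, prev_palm = read_circuits()
        while True:
            cube, palm = read_circuits()
            now = time.monotonic()
            if cube != prev_cube:
                print(f"{now:.3f} s  cube {'contact' if cube else 'release'}")
            if palm != prev_palm:
                print(f"{now:.3f} s  palm {'back on initial position' if palm else 'lifted'}")
            prev_cube, prev_palm = cube, palm
            time.sleep(poll_s)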
Figure 8
Reaction time (RT) and movement time (MT). RT is positive for Head Fixed, and negative (anticipatory) for Full Gaze and Sun Glasses, indicating that the manipulator can read the iCub's gaze. This anticipation is reflected in the movement time, since MT = RT + GT.
Figure 9
Reaction time (RT) over the rounds. In the initial rounds, Head Fixed > Sun Glasses > Full Gaze. Later, this becomes Head Fixed > Sun Glasses = Full Gaze, indicating that subjects learn to “read” the gaze from the robot's head position.
