MobiSys. 2014 Jun;2014:82-94.
doi: 10.1145/2594368.2594388.

iShadow: Design of a Wearable, Real-Time Mobile Gaze Tracker



Addison Mayberry et al. MobiSys. 2014 Jun.

Abstract

Continuous, real-time tracking of eye gaze is valuable in a variety of scenarios, including hands-free interaction with the physical world, detection of unsafe behaviors, leveraging visual context for advertising, and life logging. While eye tracking is commonly used in clinical trials and user studies, it has not bridged the gap to everyday consumer use. The challenge is that a real-time eye tracker is a power-hungry, computation-intensive device: it requires continuous sensing of the eye by an imager running at many tens of frames per second, and continuous processing of the image stream by sophisticated gaze estimation algorithms. Our key contribution is the design of an eye tracker that dramatically reduces the sensing and computation needed for eye tracking, thereby achieving orders-of-magnitude reductions in power consumption and form factor. The key idea is that eye images are extremely redundant, so gaze can be estimated from a small subset of carefully chosen pixels per frame. We instantiate this idea in a prototype hardware platform equipped with a low-power image sensor that provides random access to pixel values, a low-power ARM Cortex-M3 microcontroller, and a Bluetooth radio to communicate with a mobile phone. The sparse pixel-based gaze estimation algorithm is a multi-layer neural network learned using a state-of-the-art sparsity-inducing regularization function that minimizes gaze prediction error while simultaneously minimizing the number of pixels used. Our results show that we can operate at roughly 70 mW of power while continuously estimating eye gaze at 30 Hz with errors of roughly 3 degrees.
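The pixel-subset idea in the abstract can be sketched with a toy model: train a small regression network with a group-lasso penalty on the first-layer weights, where each input pixel's outgoing weight vector forms one group, so the regularizer drives entire pixels to zero and the surviving pixels are the ones worth sampling. This is an illustrative sketch, not the authors' implementation; all sizes, hyperparameters, and the synthetic data below are our assumptions.

```python
import numpy as np

def train_sparse_gaze_net(X, Y, n_hidden=8, lam=0.05, lr=0.05,
                          epochs=800, seed=0):
    """One-hidden-layer regression net trained by proximal gradient
    descent with a group-lasso penalty on the input weights: each
    pixel's outgoing weight vector is one group, so whole pixels are
    shrunk toward zero (toy stand-in for the paper's regularizer)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, Y.shape[1])); b2 = np.zeros(Y.shape[1])
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden activations
        E = (H @ W2 + b2) - Y               # prediction residual
        dH = (E @ W2.T) * (1.0 - H ** 2)    # backprop through tanh
        W2 -= lr * (H.T @ E) / n;  b2 -= lr * E.mean(axis=0)
        W1 -= lr * (X.T @ dH) / n; b1 -= lr * dH.mean(axis=0)
        # proximal step: shrink each pixel's weight group toward zero
        norms = np.linalg.norm(W1, axis=1, keepdims=True)
        W1 *= np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, 1e-12))
    # pixels whose weight group survived the shrinkage
    active = np.flatnonzero(np.linalg.norm(W1, axis=1) > 1e-2)
    return W1, W2, active

# Toy demo: "gaze" depends on only the first 5 of 50 pixels,
# so the regularizer should discard most of the rest.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 50))
Y = X[:, :5] @ rng.normal(size=(5, 2))
W1, W2, active = train_sparse_gaze_net(X, Y)
print(f"pixels kept: {len(active)} of 50")
```

Sweeping `lam` here plays the role of the λ tradeoff in the paper's evaluation: larger values prune more pixels at some cost in prediction accuracy.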

Keywords: eye tracking; lifelog; neural network.


Figures

Figure 1
iShadow overview: A user wears the eyeglass and collects a few minutes of calibration data by looking at dots on a computer screen. The calibration data is downloaded from local storage on the eyeglass, and the neural network model is learnt offline. An appropriate model can then be uploaded to the eyeglass for real-time gaze tracking.
Figure 2
Illustration of the neural network gaze prediction model.
Figure 3
Stages of the fixed-pattern-noise (FPN) correction process. The raw image is collected with FPN present (a), from which the static FPN mask (b) is subtracted. The resulting image (c) is mostly free of FPN.
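The correction in Figure 3 amounts to subtracting a static per-pixel offset from each raw frame. A minimal sketch, assuming 8-bit grayscale frames; estimating the mask from averaged dark frames is our illustrative assumption, not a detail stated in the caption:

```python
import numpy as np

def estimate_fpn_mask(dark_frames):
    """Estimate the static FPN mask as the per-pixel mean of several
    dark (e.g. lens-covered) frames."""
    return np.mean(dark_frames, axis=0)

def correct_fpn(raw, fpn_mask):
    """Subtract the static FPN mask from a raw frame, clipping the
    result back to the valid 8-bit pixel range."""
    corrected = raw.astype(np.int16) - np.rint(fpn_mask).astype(np.int16)
    return np.clip(corrected, 0, 255).astype(np.uint8)

# Toy demo: a flat scene corrupted by a fixed per-pixel offset.
rng = np.random.default_rng(0)
fpn = rng.integers(0, 20, size=(4, 4)).astype(np.uint8)
scene = np.full((4, 4), 100, dtype=np.uint8)
raw = scene + fpn                       # max value 119, no overflow
mask = estimate_fpn_mask(np.stack([fpn, fpn]))
clean = correct_fpn(raw, mask)          # recovers the flat scene
```

The subtraction is done in a signed intermediate type so that pixels darker than the mask clip to zero rather than wrapping around.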
Figure 4
Figures show an architecture diagram and different views of the third-generation iShadow prototype. The prototype has two cameras, one at the center front and one facing the eye, with the electronics mounted on the control board on the side. Batteries, while not shown, are mounted behind the ear.
Figure 5
Gaze prediction accuracy as a function of system training time.
Figure 6
Plots (a) and (b) show the effect of the regularization parameter on gaze prediction accuracy and model size, respectively. Plot (c) shows the net result: the number of pixels acquired can be reduced dramatically (up to 10×) with minor effect on gaze prediction accuracy.
Figure 7
Comparison of eye images from multiple users, annotated with each user's average prediction error. Note that the position of the eye in the image and the iris/sclera contrast have a prominent effect on prediction accuracy.
Figure 8
The circle illustrates 3° of error in the outward imager plane, centered on the white dot. Error is lower when the eye is fully within the imager's field of view, and higher when it is not.
Figure 9
This figure shows the weights learned by each hidden unit in the neural network model for subsets of approximately 10% of pixel locations.
Figure 10
Smooth tradeoff between gaze tracking rate and prediction error as λ is varied. The MCU is always active in this experiment.
Figure 11
Breakdown of energy and time across subsystems during the capture-and-predict process: (a) energy consumed by the MCU for pixel acquisition, by the gaze prediction computation, and by the cameras; (b) time spent by the MCU in acquisition and prediction.
