Accelerating eye movement research via accurate and affordable smartphone eye tracking

Nachiappan Valliappan¹, Na Dai¹, Ethan Steinberg¹ ², Junfeng He¹, Kantwon Rogers¹ ³, Venky Ramachandran¹, Pingmei Xu¹, Mina Shojaeizadeh¹, Li Guo¹ ⁴, Kai Kohlhoff¹, Vidhya Navalpakkam⁵

Affiliations

¹ Google Research, Mountain View, CA, USA.
² Stanford University, Stanford, CA, USA.
³ Georgia Institute of Technology, Atlanta, GA, USA.
⁴ Johns Hopkins University, Baltimore, MD, USA.
⁵ Google Research, Mountain View, CA, USA. vidhyan@google.com.

PMID: 32917902
PMCID: PMC7486382
DOI: 10.1038/s41467-020-18360-5

Nachiappan Valliappan et al. Nat Commun. 2020 Sep 11;11(1):4553. doi: 10.1038/s41467-020-18360-5.

Abstract

Eye tracking has been widely used for decades in vision research, language and usability. However, most prior research has focused on large desktop displays using specialized eye trackers that are expensive and cannot scale. Little is known about eye movement behavior on phones, despite their pervasiveness and the large amount of time spent on them. We leverage machine learning to demonstrate accurate smartphone-based eye tracking without any additional hardware. We show that the accuracy of our method is comparable to state-of-the-art mobile eye trackers that are 100x more expensive. Using data from over 100 opted-in users, we replicate key findings from previous eye movement research on oculomotor tasks and saliency analyses during natural image viewing. In addition, we demonstrate the utility of smartphone-based gaze for detecting reading comprehension difficulty. Our results show the potential for scaling eye movement research by orders of magnitude to thousands of participants (with explicit consent), enabling advances in vision research, accessibility and healthcare.

Conflict of interest statement

This study was funded by Google LLC and/or a subsidiary thereof (‘Google’). N.V., N.D., J.H., V.R., P.X., M.S., K.K., and V.N. are employees of Google. E.S., K.R., and L.G. were interns at Google.

Figures

**Fig. 1. Accuracy of our smartphone eye tracker.**
a Gaze estimation accuracy (mean ± s.e.m., n = 26 participants) improves with the number of calibration frames used for personalization. b Error across different screen locations. The radius of the circle indicates average model error at that screen location.

**Fig. 2. Comparison between accuracy of Tobii glasses vs. our model.**
Study setup shows the four experimental conditions: Participant (an author for visualization purposes) views stimuli on the phone (mounted on a device stand) while wearing Tobii glasses (a) and without (b). c, d Similar to the above, but participant holds the phone in the hand. e, f Accuracy of specialized eye tracker (Tobii glasses) vs. our smartphone eye tracker (mean ± s.e.m., n = 13 participants) for the device stand and hand-held settings. Statistical comparison shows no significant difference in accuracy across both settings (device stand: t(12) = −2.12, p = 0.06; hand-held: t(12) = −1.53, p = 0.15; two-tailed paired t-test).

**Fig. 3. Smartphone gaze for standard oculomotor tasks.**
a Prosaccade task. Each trial began with a central fixation for 800 ms, after which the target appeared at a random location and remained for 1000 ms. Participants were asked to saccade to the target as soon as it appeared. b Saccade latency distribution for the prosaccade task. c Smooth pursuit task. Participants were asked to look at the green dot as it moved along a circle. d Sample scanpath from a single user shown in black (ground truth in green). e Population-level heatmap from all users and trials.

**Fig. 4. Smartphone gaze during visual search.**
a, b, e Effect of target’s color contrast on visual search performance. a Gaze scanpath when the target has low contrast (i.e., similar to the distractors). b Scanpath when the target has high contrast (different from the distractors). e Number of fixations to find the target as a function of target’s color contrast (plot shows mean ± s.e.m., n = 44–65 trials/contrast-level). c, d, f Similar plots for orientation contrast (difference in orientation between target and distractors in degrees, Δθ; n = 42–63 trials/contrast-level). g Effect of set size. Number of fixations to find the target as the number of items in the display varied between 5, 10, and 15; and the target’s orientation contrast varied from low (Δθ = 7°) to medium-high (Δθ = 15°) to very high (Δθ = 75°). Plot shows mean ± s.e.m. in number of fixations (n = 42–63 trials for each combination of set size and Δθ).

**Fig. 5. Gaze on natural images depends on the task being performed.**
The columns refer to: a Original image; b fixation heatmap during free viewing; c example scanpath from a single participant for free viewing; d fixation heatmap during visual search for a target object (specified in the title of each image); e example scanpath from a single participant for the visual search task.

**Fig. 6. Gaze entropy and center bias during free viewing on phones.**
a Histogram of gaze entropy across all images for the free viewing task along with examples of low vs. high entropy images. b Averaging the fixations across all users and images reveals a center bias.

**Fig. 7. Comparison between mobile and desktop gaze for natural image viewing.**
The left hand side shows the most similar mobile vs. desktop heatmaps, while the right hand side shows the least similar heatmaps. Columns refer to: a and d original image; b and e mobile gaze heatmap with a blur width of 24 px; c and f desktop gaze heatmap with a blur width of 24 px (corresponding to 1° desktop viewing angle). See Supplementary Fig. 9 and Supplementary Table 1 for similar results with a larger blur width of 67 px (corresponding to 1° mobile viewing angle).

**Fig. 8. Different gaze patterns for factual vs. interpretive tasks.**
a Sample passage shown to the participant (actual text replaced with dummy for copyright reasons). Green bounding box highlights the relevant excerpt for the factual task (box shown for visualization purposes only, participants did not see this). b Population-level gaze heatmap for the factual task, for the passage shown in (a). c Heatmap for the interpretive task for the passage shown in (a). d–f Similar to (a–c) except that the factual task appeared after the interpretive task. In both examples, gaze was more dispersed across the passage for interpretive than factual tasks.

**Fig. 9. Effect of reading comprehension difficulty on gaze for factual tasks.**
a Barplot shows % fixation duration on the relevant portion of the passage (normalized by height) when participants answered the factual question correctly vs. not. Error bars denote the mean ± s.e.m. (n = 53, 13 tasks for correct vs. wrong responses). b Example of fixation heatmap for easy factual task; c difficult factual task. d–f Scatterplots showing different metrics as a function of task difficulty. d Time to answer the question in seconds (includes time spent reading the question and the passage); e number of fixations on the passage; f percentage time on relevant region, computed as the % total fixation duration on the relevant portion of the passage (normalized by height). Statistical correlation reported is the Spearman’s rank correlation coefficient (n = 10 tasks); two-tailed one sample t-test. The confidence band represents the bootstrapped 68% confidence interval.
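The Fig. 9 caption reports Spearman's rank correlation over n = 10 tasks together with a bootstrapped 68% interval. As a rough illustration of those two ideas, the sketch below computes a rank correlation and a percentile bootstrap interval using only the Python standard library. It is not the authors' analysis code: the function names, the resampling settings, and the example numbers are all assumptions made for illustration, and the sketch bootstraps the coefficient itself, which is not necessarily the quantity the figure's confidence band surrounds.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN if either input has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_interval(x, y, level=0.68, n_resamples=2000, seed=0):
    """Percentile bootstrap interval for Spearman's rho (illustrative)."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample pairs with replacement
        rho = spearman([x[i] for i in idx], [y[i] for i in idx])
        if rho == rho:  # drop degenerate resamples that yield NaN
            stats.append(rho)
    stats.sort()
    lo = stats[int((1 - level) / 2 * (len(stats) - 1))]
    hi = stats[int((1 + level) / 2 * (len(stats) - 1))]
    return lo, hi


# Made-up numbers for 10 hypothetical tasks (NOT data from the paper).
difficulty = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pct_time_on_relevant = [62, 58, 60, 51, 47, 49, 40, 38, 41, 30]

rho = spearman(difficulty, pct_time_on_relevant)
low, high = bootstrap_interval(difficulty, pct_time_on_relevant)
print(f"rho = {rho:.2f}, 68% bootstrap interval = ({low:.2f}, {high:.2f})")
```

With only 10 points, the interval depends noticeably on the seed and the number of resamples, so it is best read as a rough indication of uncertainty rather than a precise bound.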
