OpenEDS2020 Challenge on Gaze Tracking for VR: Dataset and Results

Cristina Palmero¹,², Abhishek Sharma³, Karsten Behrendt⁴, Kapil Krishnakumar⁴, Oleg V Komogortsev³,⁵, Sachin S Talathi³

Affiliations

¹ Department of Mathematics and Informatics, Universitat de Barcelona, 08007 Barcelona, Spain.
² Computer Vision Center, Campus UAB, 08193 Bellaterra, Spain.
³ Eye Tracking Department, Facebook Reality Labs Research, Redmond, WA 98052, USA.
⁴ Facebook Reality Labs, Menlo Park, CA 94025, USA.
⁵ Department of Computer Science, Texas State University, San Marcos, TX 78666, USA.

PMID: 34300511
PMCID: PMC8309797
DOI: 10.3390/s21144769

Sensors (Basel). 2021 Jul 13;21(14):4769.

Abstract

This paper summarizes the OpenEDS2020 Challenge dataset, the proposed baselines, and the results obtained by the top three winners of each competition: (1) the Gaze Prediction Challenge, whose goal was to predict the gaze vector 1 to 5 frames into the future from a sequence of previous eye images, and (2) the Sparse Temporal Semantic Segmentation Challenge, whose goal was to use temporal information to propagate semantic eye labels to contiguous eye image frames. Both competitions were based on the OpenEDS2020 dataset, a novel dataset of eye-image sequences captured at a frame rate of 100 Hz under controlled illumination, using a virtual-reality head-mounted display with two synchronized eye-facing cameras. The dataset, which we make publicly available to the research community, consists of 87 subjects performing several gaze-elicited tasks and is divided into two subsets, one for each competition task. The proposed baselines, based on deep learning approaches, obtained an average angular error of 5.37 degrees for gaze prediction and a mean intersection over union (mIoU) score of 84.1% for semantic segmentation. The winning solutions outperformed the baselines, reaching an angular error as low as 3.17 degrees for the former task and an mIoU of up to 95.2% for the latter.

Keywords: gaze estimation; gaze prediction; semantic segmentation; video oculography; virtual reality.
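
The two metrics quoted in the abstract, angular error and mIoU, can be illustrated with a minimal sketch. This is not the challenge's official evaluation code: it assumes gaze is given as 3D direction vectors and segmentation masks as integer class maps, and the function names and the `num_classes` parameter are hypothetical. The exact averaging used by the organizers is not stated here, so classes absent from both masks are simply skipped.

```python
import numpy as np

def angular_error_deg(pred, target):
    """Angle in degrees between predicted and ground-truth gaze vectors.

    Assumes 3D direction vectors in the last axis (an assumption, not
    taken from the paper); inputs need not be unit length.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Normalize so the dot product equals the cosine of the angle.
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    target = target / np.linalg.norm(target, axis=-1, keepdims=True)
    # Clip to guard against rounding slightly outside [-1, 1].
    cos = np.clip(np.sum(pred * target, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def mean_iou(pred_mask, true_mask, num_classes):
    """Mean intersection over union across classes for integer label maps.

    `num_classes` is a hypothetical parameter; classes absent from both
    masks are skipped rather than counted as 0 or 1.
    """
    pred_mask = np.asarray(pred_mask)
    true_mask = np.asarray(true_mask)
    ious = []
    for c in range(num_classes):
        p = pred_mask == c
        t = true_mask == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Orthogonal gaze directions are 90 degrees apart.
err = angular_error_deg([0.0, 0.0, 1.0], [0.0, 1.0, 0.0])

# Toy 2x2 masks with two classes.
miou = mean_iou([[0, 1], [1, 1]], [[0, 1], [0, 1]], num_classes=2)
```

Under these assumptions, a lower angular error and a higher mIoU both indicate better performance, which matches how the baseline and winning results are compared in the abstract.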

Conflict of interest statement

The funders have been involved in the design of the study, dataset collection, analyses and interpretation of the data, writing of the manuscript, and the decision to publish the results.

Figures

**Figure 1**
Examples of images without glasses (**top row**) and with glasses (**bottom row**), representing the variability of the dataset in terms of accessories, ethnicity, age and gender.

**Figure 2**
Example of saccadic (**top row**) and smooth pursuit (**bottom row**) eye movements during 100 ms.

**Figure 3**
2D gaze angle distributions for train (**left**), validation (**center**) and test (**right**) splits of the Gaze Prediction data subset.

**Figure 4**
The ratio of labels to the total number of samples for each of the 594 sequences, shown in decreasing order.

**Figure 5**
For each pair of images, examples of human annotations (**left**) and baseline model performance (**right**) for eye semantic segmentation (best viewed in color).
