ZuCo, a simultaneous EEG and eye-tracking resource for natural sentence reading
- PMID: 30531985
- PMCID: PMC6289117
- DOI: 10.1038/sdata.2018.291
Abstract
We present the Zurich Cognitive Language Processing Corpus (ZuCo), a dataset combining electroencephalography (EEG) and eye-tracking recordings from subjects reading natural sentences. ZuCo includes high-density EEG and eye-tracking data from 12 healthy adult native English speakers, each of whom read natural English text for 4-6 hours. The recordings span two normal reading tasks and one task-specific reading task, yielding EEG and eye-tracking data for 21,629 words in 1,107 sentences and 154,173 fixations. We believe that this dataset is a valuable resource for natural language processing (NLP). The EEG and eye-tracking signals lend themselves to training improved machine-learning models for various tasks, in particular information extraction tasks such as entity and relation extraction, as well as sentiment analysis. Moreover, the dataset is useful for advancing research into human reading and language understanding at the level of brain activity and eye movements.
Conflict of interest statement
The authors declare no competing interests.
References
Data Citations
- Hollenstein N., et al. 2018. Open Science Framework. https://doi.org/10.17605/OSF.IO/Q3ZWS