PLoS One. 2015 Sep 25;10(9):e0138198. doi: 10.1371/journal.pone.0138198. eCollection 2015.

Predicting the Valence of a Scene from Observers' Eye Movements

Hamed R-Tavakoli et al. PLoS One. 2015.

Abstract

Multimedia analysis benefits from understanding the emotional content of a scene in a variety of tasks, such as video genre classification and content-based image retrieval. Recently, there has been increasing interest in applying human bio-signals, particularly eye movements, to recognize the emotional gist of a scene, such as its valence. To determine the emotional category of images from eye movements, existing methods typically learn a classifier over several features extracted from the eye movements. Although eye movements have been shown to be potentially useful for recognizing scene valence, the contribution of each feature is not well studied. To address this issue, we study the contribution of features extracted from eye movements to the classification of images into pleasant, neutral, and unpleasant categories. We assess ten features and their fusion. The features are the histogram of saccade orientation, histogram of saccade slope, histogram of saccade length, histogram of saccade duration, histogram of saccade velocity, histogram of fixation duration, fixation histogram, top-ten salient coordinates, and saliency map. We take a machine learning approach to analyzing feature performance, learning a support vector machine and exploiting various feature fusion schemes. The experiments reveal that 'saliency map', 'fixation histogram', 'histogram of fixation duration', and 'histogram of saccade slope' are the most contributing features. The selected features signify the influence of fixation information and the angular behavior of eye movements in recognizing the valence of images.
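As a rough illustration of the pipeline the abstract describes, the sketch below builds a few of the histogram features from a fixation sequence and trains an SVM. The feature subset, bin counts, histogram ranges, and SVM parameters are placeholders rather than the paper's actual settings, and the data are synthetic.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def trial_features(fix_x, fix_y, fix_dur, n_bins=8):
    """Histogram features for one observer viewing one image; saccades are
    approximated here as the vectors between consecutive fixations."""
    dx, dy = np.diff(fix_x), np.diff(fix_y)
    length = np.hypot(dx, dy)                        # saccade length (px)
    orient = np.arctan2(dy, dx)                      # saccade orientation (rad)
    slope = np.arctan(dy / (dx + 1e-9))              # saccade slope, as a bounded angle
    hists = [np.histogram(v, bins=n_bins, range=r, density=True)[0]
             for v, r in [(orient, (-np.pi, np.pi)),
                          (slope, (-np.pi / 2, np.pi / 2)),
                          (length, (0, 1000)),
                          (fix_dur, (0, 1.0))]]
    return np.concatenate(hists)

# Hypothetical dataset: X holds one feature vector per (image, observer) trial,
# y the image's valence class (0 = unpleasant, 1 = neutral, 2 = pleasant).
rng = np.random.default_rng(0)
X = np.vstack([trial_features(rng.uniform(0, 1000, 20),
                              rng.uniform(0, 800, 20),
                              rng.uniform(0.1, 0.6, 20))
               for _ in range(90)])
y = rng.integers(0, 3, size=90)

clf = SVC(kernel='rbf', C=1.0, gamma='scale')        # SVM, as in the paper
print(cross_val_score(clf, X, y, cv=5).mean())       # chance is ~1/3 on random data
```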


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Analysis of gender and emotional distribution of images.
As visualized, female and male observers rate the images differently. Each image is plotted in terms of the mean and standard deviation of its valence ratings, separately for each gender.
Fig 2. Valence categories and genders.
To expose the differences between male and female observers, fuzzy c-means clustering is run to categorize the images, based on the mean and standard deviation of their valence ratings, into three classes: unpleasant, neutral, and pleasant. Comparing the gender-specific results, the genders' disagreement on valence is evident.
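For context, fuzzy c-means assigns each image a soft membership in each of the three valence clusters rather than a hard label. Below is a minimal NumPy sketch of this clustering step; the two-feature representation (mean and standard deviation of valence ratings) follows the caption, while the rating scale, the fuzzifier m = 2, and the data are assumptions.

```python
import numpy as np

def fuzzy_cmeans(X, c=3, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means. X: (n_samples, n_features) array.
    Returns cluster centers and the soft membership matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                # random initial memberships
    for _ in range(n_iter):
        W = U ** m                                   # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
        U = 1.0 / d ** (2.0 / (m - 1.0))             # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# Hypothetical input: each image described by the mean and standard deviation
# of its valence ratings (a 1-9 SAM-style scale is assumed here).
rng = np.random.default_rng(1)
images = np.column_stack([rng.uniform(1, 9, 60), rng.uniform(0.5, 2.5, 60)])
centers, U = fuzzy_cmeans(images, c=3)
labels = U.argmax(axis=1)    # hard labels: unpleasant / neutral / pleasant
```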
Fig 3. Distribution of images across the valence range in each of the unpleasant, neutral, and pleasant classes.
Fig 4. Mean value features versus valence.
From left to right: mean fixation duration (m = 0.012, p = 1.32e-36), mean saccade duration (m = -0.008, p = 1.11e-26), mean saccade length (m = -0.027, p = 1.77e-25), mean saccade slope (m = -0.002, p = 1.49e-35), mean saccade velocity (m = 0.004, p = 0.14), mean saccade orientation (m = -0.017, p = 5.19e-37).
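The m and p values above are consistent with a per-feature linear fit of the mean feature value against valence. A fit of this kind can be reproduced with scipy.stats.linregress, as sketched below on synthetic data (the intercept, slope, and noise level are made up).

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(2)
valence = rng.uniform(1, 9, 500)                       # per-trial valence rating
mean_fix_dur = 0.25 + 0.012 * valence + rng.normal(0, 0.05, 500)

fit = linregress(valence, mean_fix_dur)
print(f"m = {fit.slope:.3f}, p = {fit.pvalue:.2e}")    # slope of the fit and its p-value
```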
Fig 5. Baseline mean performance of the mean value features, in terms of bookmaker informedness, for the classification of images into the three classes of unpleasant, neutral, and pleasant.
The mean classification accuracy across folds and repetitions (%) and its associated standard error are also added to each bar as a second measure.
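'Bookmaker' here refers to Powers' bookmaker informedness, which scores a chance-level classifier at 0. For a binary problem it equals recall + specificity - 1; the sketch below uses one common multiclass generalization, the unweighted mean of the one-vs-rest informedness values, which may differ from the paper's exact (e.g., prevalence-weighted) variant.

```python
import numpy as np

def bookmaker_informedness(cm):
    """Mean one-vs-rest informedness (recall + specificity - 1) from a
    confusion matrix cm[true, predicted]. Chance level scores 0."""
    cm = np.asarray(cm, dtype=float)
    scores = []
    for k in range(len(cm)):
        tp = cm[k, k]
        fn = cm[k].sum() - tp                # class-k trials predicted otherwise
        fp = cm[:, k].sum() - tp             # other trials predicted as class k
        tn = cm.sum() - tp - fn - fp
        recall = tp / (tp + fn)
        specificity = tn / (tn + fp)
        scores.append(recall + specificity - 1)
    return np.mean(scores)

# 3x3 example: rows = true class, cols = predicted class
print(bookmaker_informedness([[30, 5, 5], [4, 28, 8], [6, 7, 27]]))
```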
Fig 6. Baseline performance of the individual histogram-based features, in terms of bookmaker informedness, for the classification of images into the three classes of unpleasant, neutral, and pleasant.
The mean classification accuracy across folds and repetitions (%) and its associated standard error are added to each bar as a second measure.
Fig 7. Performance of conventional and evolutionary-based decomposition methods.
The mean classification accuracy across folds and repetitions (%) and its associated standard error are added to each bar as a second measure.

