Decoding What People See from Where They Look: Predicting Visual Stimuli from Scanpaths

  • Moran Cerf
  • Jonathan Harel
  • Alex Huth
  • Wolfgang Einhäuser
  • Christof Koch
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5395)


Saliency algorithms are applied to predict the overt attentional shifts, corresponding to eye movements, that observers make while viewing an image. In this study, we investigated whether saliency maps could be used to predict which image observers were viewing, given only scanpath data. The experiment comprised 441 trials, each consisting of 2 images and scanpath data (pooled over 9 subjects) belonging to one unknown image in the set. The correct image was selected in 304 trials (69%), a fraction significantly above chance, but much lower than the correctness rate achieved using scanpaths from individual subjects, which was 82.4%. These results lead us to propose a new metric for quantifying the importance of saliency map features, based on discriminability between images, as well as a new method for comparing existing saliency map efficacy metrics. The approach has potential applications to other kinds of prediction, e.g., of image content categories or even subject class.
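
The decoding task described above can be illustrated with a minimal sketch. The abstract does not specify how a scanpath is matched to a saliency map, so the score used here (mean z-scored saliency at the fixated pixels, often called normalized scanpath saliency) is an assumption, as are the function names `nss_score`, `decode_image`, and `binomial_tail` and the toy 3x3 maps. The last function checks the reported 304 of 441 correct trials against a chance level of 50%, which is what a binary trial implies.

```python
from math import comb, sqrt


def nss_score(saliency, fixations):
    """Mean z-scored saliency at fixated (row, col) pixels (assumed score)."""
    if not fixations:
        raise ValueError("at least one fixation is required")
    values = [v for row in saliency for v in row]
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return 0.0  # a flat map says nothing about where people look
    return sum((saliency[r][c] - mean) / std for r, c in fixations) / len(fixations)


def decode_image(saliency_maps, fixations):
    """Index of the candidate map that best explains the scanpath."""
    scores = [nss_score(m, fixations) for m in saliency_maps]
    return max(range(len(scores)), key=scores.__getitem__)


def binomial_tail(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5), computed exactly."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


# Toy binary trial (hypothetical data): map_a is salient at the top left,
# map_b at the bottom right; the fixations cluster at the top left.
map_a = [[0.9, 0.1, 0.0], [0.1, 0.0, 0.0], [0.0, 0.0, 0.0]]
map_b = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.1], [0.0, 0.1, 0.9]]
fixations = [(0, 0), (0, 1), (1, 0)]

decoded = decode_image([map_a, map_b], fixations)
fraction_correct = 304 / 441          # figures reported in the abstract
p_value = binomial_tail(304, 441)     # one-sided test against chance (0.5)
```

In this toy trial the fixations fall on the salient corner of `map_a`, so `decoded` is 0. The exact binomial tail is used rather than a normal approximation because Python integers make it cheap at this sample size.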


Inferior Temporal Cortex; Correct Decode; Binary Trial; Saliency Algorithm; Attention Prediction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Moran Cerf (1)
  • Jonathan Harel (1)
  • Alex Huth (1)
  • Wolfgang Einhäuser (2)
  • Christof Koch (1)
  1. California Institute of Technology, Pasadena, USA
  2. Philipps-University, Marburg, Germany
