Identifying Objects in Images from Analyzing the Users’ Gaze Movements for Provided Tags

  • Tina Walber
  • Ansgar Scherp
  • Steffen Staab
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7131)


Assuming that eye tracking will soon be a common input device in notebooks and mobile devices such as iPads, it is possible to implicitly gain information about images and image regions from the users’ gaze movements. In this paper, we investigate the principal idea of finding specific objects shown in images by looking at the users’ gaze path information only. We have analyzed 547 gaze paths from 20 subjects viewing different image-tag pairs, with the task of deciding whether the presented tag actually appears in the image. By analyzing the gaze paths, we are able to correctly identify 67% of the image regions and significantly outperform two baselines. In addition, we have investigated whether different regions of the same image can be distinguished based on the gaze information. Here, we are able to correctly identify two different regions in the same image with an accuracy of 38%.
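The core idea of matching a tag to an image region from gaze data alone can be illustrated with a minimal sketch. This is a hypothetical simplification, not the authors’ actual measure: it assumes precomputed fixation points and candidate region bounding boxes, and simply assigns the tag to the region that attracts the most fixations.

```python
# Hypothetical sketch, not the paper's actual method: assign a tag to the
# candidate image region that receives the most gaze fixations.

def fixations_in_region(fixations, region):
    """Count fixation points (x, y) that fall inside a bounding box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = region
    return sum(1 for (x, y) in fixations if x0 <= x <= x1 and y0 <= y <= y1)

def identify_region(fixations, regions):
    """Return the name of the candidate region containing the most fixations."""
    return max(regions, key=lambda name: fixations_in_region(fixations, regions[name]))

# Example: gaze concentrated on the upper-left part of the image,
# where a hypothetical "dog" region is located.
fixations = [(30, 40), (35, 45), (200, 180), (32, 42)]
regions = {"dog": (0, 0, 100, 100), "ball": (150, 150, 250, 220)}
print(identify_region(fixations, regions))  # -> dog
```

A real implementation would additionally need a fixation filter over the raw gaze samples and a tie-breaking or weighting scheme (e.g., by fixation duration), which this sketch omits.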


Keywords: region identification, region labeling, gaze analysis, tagging




Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Tina Walber (1)
  • Ansgar Scherp (1, 2)
  • Steffen Staab (1)

  1. Institute for Web Science and Technology, University of Koblenz-Landau, Germany
  2. Institute for Information Systems Research, University of Koblenz-Landau, Germany
