Can You See It? Two Novel Eye-Tracking-Based Measures for Assigning Tags to Image Regions
Eye tracking information can be used to assign given tags to image regions in order to describe the depicted scene in more detail. We introduce and compare two novel eye-tracking-based measures for conducting such assignments: The segmentation measure uses automatically computed image segments and selects the one segment the user fixates for the longest time. The heat map measure is based on traditional gaze heat maps and sums up the users’ fixation durations per pixel. Both measures are applied to gaze data obtained for a set of social media images whose objects have been manually labeled as ground truth. The segmentation measure points to the correct region in the image with a maximum average precision of 65%. The best coverage of the segments is also obtained for the segmentation measure, with an F-measure of 35%. Overall, both newly introduced gaze-based measures deliver better results than baseline measures that select a segment based on the golden ratio of photography or the center position in the image. The eye-tracking-based segmentation measure significantly outperforms the baselines for precision and F-measure.
Keywords: Fixation measures, automatic segmentation, heat maps
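The two measures described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data layout (fixations as `(x, y, duration)` tuples, a segment map as a 2D list of segment ids), the function names, the `radius` parameter, and the reading of "fixates for the longest time" as the largest summed duration are all assumptions made for illustration. The pixel-level F-measure uses the standard definition, which may differ from the evaluation used in the paper.

```python
from collections import defaultdict

# Assumed layout (hypothetical): a fixation is (x, y, duration_ms), and
# segments[y][x] holds the id of the segment that pixel (x, y) belongs to.

def segmentation_measure(fixations, segments):
    """Return the id of the segment with the largest summed fixation duration."""
    height, width = len(segments), len(segments[0])
    totals = defaultdict(float)
    for x, y, duration in fixations:
        if 0 <= x < width and 0 <= y < height:  # ignore off-image fixations
            totals[segments[y][x]] += duration
    return max(totals, key=totals.get) if totals else None

def heat_map_measure(fixations, width, height, radius=1):
    """Sum fixation durations per pixel within `radius` of each fixation.

    The circular footprint and its radius are assumptions; the abstract only
    states that durations are summed per pixel.
    """
    heat = [[0.0] * width for _ in range(height)]
    for x, y, duration in fixations:
        for py in range(max(0, y - radius), min(height, y + radius + 1)):
            for px in range(max(0, x - radius), min(width, x + radius + 1)):
                if (px - x) ** 2 + (py - y) ** 2 <= radius ** 2:
                    heat[py][px] += duration
    return heat

def f_measure(predicted, truth):
    """Standard F-measure between two sets of (x, y) pixels."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)
```

For a 4×2 image split into a left and a right segment, `segmentation_measure` would return the id of whichever half accumulates more fixation time, while `heat_map_measure` returns a per-pixel grid from which a region could be derived, for example by thresholding.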