Visual attention models are typically based on the concept of saliency, a conspicuity measure computed over features such as color, intensity, or orientation. Much current research aims at modeling top-down influences, which strongly shape human attentional behavior; typically, these take the form of targets to be searched for or general characteristics (gist) of a scene. In humans, objects that afford actions, for example graspable objects, have been shown to strongly attract attention. Here, we integrate an artificial attention framework with a measure of affordances estimated from a sparse 3D scene representation. This work contributes further evidence that human attention is biased toward objects of high affordance, measured here for the first time in an objective way. It also demonstrates that artificial attention systems benefit from affordance estimation when predicting human attention. For technical systems, affordances provide mid-level influences that are neither too specific nor too general and can guide attention toward potential action targets with respect to a system’s physical capabilities. Finally, the change-detection task we employ for model comparison constitutes a new method for evaluating artificial systems against early human vision in natural scene perception.
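As a minimal sketch of the bottom-up saliency idea referred to above (a center-surround contrast map on the intensity channel, in the spirit of difference-of-Gaussians conspicuity maps; this is an illustration, not the region-based model used in the paper, and the sigma values are invented for the example):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def intensity_saliency(image, center_sigma=2.0, surround_sigma=8.0):
    """Center-surround intensity conspicuity via difference of Gaussians.

    `image` is an (H, W, 3) RGB array with values in [0, 1]; the two
    sigmas (fine "center" scale, coarse "surround" scale) are
    illustrative choices.
    """
    intensity = image.mean(axis=2)                         # gray-level channel
    center = gaussian_filter(intensity, center_sigma)      # fine scale
    surround = gaussian_filter(intensity, surround_sigma)  # coarse scale
    conspicuity = np.abs(center - surround)                # local contrast
    # Normalize to [0, 1] so that feature channels could later be combined.
    rng = conspicuity.max() - conspicuity.min()
    return (conspicuity - conspicuity.min()) / rng if rng > 0 else conspicuity

# A bright square on a dark background: contrast is highest near its borders.
img = np.zeros((64, 64, 3))
img[24:40, 24:40, :] = 1.0
sal = intensity_saliency(img)
```

A full saliency model would compute analogous maps for color opponency and orientation and fuse them into a single master map; this sketch shows only the intensity channel.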
The affordance model twice suggested shelf edges, which could not be removed without also removing all their contents or leaving the contained objects suspiciously floating; the saliency model suggested one such element by pointing at empty space on a table between objects.
Note that some additional center bias is introduced by an artifact: the vertical strip on the far left in Fig. 6b shows no changes. This is because the left image of the stereo pair was used, so no stereo correspondences (and thus no object representations and no estimated grasping possibilities) exist at the far left of the image.
All t tests and ANOVAs reported in this paper assume an alpha level of 0.05 and were performed on the arcsine-transformed relative frequencies. Whenever differences are not significant, we additionally report the mean and standard deviation of the differences and the 95% confidence interval around the mean difference.
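The arcsine transform of relative frequencies mentioned above is the standard variance-stabilizing transform for proportions, arcsin(√p). A minimal sketch of the analysis pipeline follows; the detection rates are invented illustrative numbers, not data from the study:

```python
import numpy as np
from scipy import stats

def arcsine_transform(p):
    """Variance-stabilizing transform for proportions: arcsin(sqrt(p))."""
    return np.arcsin(np.sqrt(np.asarray(p, dtype=float)))

# Hypothetical per-observer change-detection rates for two conditions
# (illustrative numbers only).
affordance = np.array([0.80, 0.75, 0.90, 0.70, 0.85, 0.78, 0.82, 0.88])
saliency   = np.array([0.65, 0.60, 0.72, 0.58, 0.70, 0.66, 0.64, 0.71])

# Paired t test on the transformed proportions, alpha = 0.05.
t, p = stats.ttest_rel(arcsine_transform(affordance),
                       arcsine_transform(saliency))
significant = p < 0.05
```

The transform pulls proportions near 0 and 1 away from the boundaries so that their variances become roughly equal, which better satisfies the homogeneity assumptions of t tests and ANOVAs.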
This work was partly supported by the EU Cognitive Systems project Xperience (FP7-ICT-270273).
Tünnermann, J., Krüger, N., Mertsching, B. et al. Affordance Estimation Enhances Artificial Visual Attention: Evidence from a Change-Blindness Study. Cogn Comput 7, 526–538 (2015). https://doi.org/10.1007/s12559-015-9329-9