
A Behavioral Analysis of Computational Models of Visual Attention

Published in: International Journal of Computer Vision

Abstract

Robots often incorporate computational models of visual attention to streamline processing. Although the number of visual attention systems employed on robots has increased dramatically in recent years, the evaluation of these systems has remained primarily qualitative and subjective. We introduce quantitative methods for evaluating computational models of visual attention by direct comparison with gaze trajectories acquired from humans. In particular, we argue for metrics that are based not on distances within the image plane but that instead operate at the level of the underlying features. We present a framework, based on dimensionality reduction over the features of human gaze trajectories, that can be used both to optimize a particular computational model of visual attention and to evaluate its performance in terms of similarity to human behavior. We use this framework to evaluate the Itti et al. (1998) model of visual attention, a computational model that serves as the basis for many robotic visual attention systems.
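The abstract describes comparing model and human gaze trajectories in a shared, reduced feature space rather than by image-plane distance. As an illustration only — the function names, the choice of PCA as the dimensionality reduction, and the time-aligned distance score below are our assumptions, not details taken from the paper — a comparison of this general flavor might be sketched as:

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto their top-k principal components (via SVD)."""
    Xc = X - X.mean(axis=0)              # center the pooled feature vectors
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                 # coordinates in the reduced feature space

def trajectory_dissimilarity(human_feats, model_feats, k=2):
    """Score two gaze trajectories, each an (n_fixations, n_features) array of
    image features sampled at the attended locations. Both trajectories are
    projected into a shared low-dimensional feature space and compared by the
    mean distance between time-aligned fixations (lower = more human-like)."""
    pooled = np.vstack([human_feats, model_feats])
    Z = pca_project(pooled, k)
    h, m = Z[:len(human_feats)], Z[len(human_feats):]
    n = min(len(h), len(m))
    return float(np.mean(np.linalg.norm(h[:n] - m[:n], axis=1)))
```

Because the score is computed over features rather than pixel coordinates, two trajectories that fixate different image locations with similar underlying features (e.g. similar color, intensity, or orientation content) would still score as similar.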


References

  • Balkenius, C., Eriksson, A.P., and Astrom, K. 2004. Learning in visual attention. In Proceedings of LAVS ’04. St Catharine’s College, Cambridge, UK.

  • Beauchemin, S.S. and Barron, J.L. 1995. The computation of optical flow. ACM Computing Surveys, 27(3):433–466.

  • Breazeal, C. and Scassellati, B. 1999. A context-dependent attention system for a social robot. In T. Dean (Ed.), Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence. Morgan Kaufmann Publishers: San Francisco, CA, pp. 1146–1153.

  • Burgard, W., Cremers, A.B., Fox, D., Hähnel, D., Lakemeyer, G., Schulz, D., Steiner, W., and Thrun, S. 1998. The interactive museum tour-guide robot. In Proceedings of the Fifteenth National Conference on Artificial Intelligence/Tenth Conference on Innovative Applications of Artificial Intelligence, pp. 11–18.

  • Burt, P.J. and Adelson, E.H. 1983. The Laplacian pyramid as a compact image code. IEEE Transactions on Communications, COM-31(4):532–540.

  • Carmi, R. and Itti, L. 2006. Causal saliency effects during natural vision. In Proc. ACM Eye Tracking Research and Applications, pp. 1–9.

  • Draper, B.A. and Lionelle, A. 2005. Evaluation of selective attention under similarity transformations. Computer Vision and Image Understanding, 100:152–171.

  • Duda, R.O. and Hart, P.E. 1973. Pattern Classification and Scene Analysis. John Wiley: New York.

  • Fong, T., Nourbakhsh, I., and Dautenhahn, K. 2003. A survey of socially interactive robots. Robotics and Autonomous Systems, 42:143–166.

  • Fujita, M. 2001. AIBO: Toward the era of digital creatures. The International Journal of Robotics Research, 20:781–794.

  • Gockley, R., Bruce, A., Forlizzi, J., Michalowski, M., Mundell, A., Rosenthal, S., Sellner, B., Simmons, R., Snipes, K., Schultz, A., and Wang, J. 2005. Designing robots for long-term social interaction. In 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2199–2204.

  • Gottlieb, J., Kusunoki, M., and Goldberg, M.E. 1998. The representation of visual salience in monkey posterior parietal cortex. Nature, 391:481–484.

  • Heeger, D.J. 1988. Optical flow using spatiotemporal filters. International Journal of Computer Vision, 1:279–302.

  • iLab Neuromorphic Vision C++ Toolkit (iNVT). 2006. Retrieved June 5, 2006, from http://ilab.usc.edu/toolkit/home.shtml

  • Imai, M., Kanda, T., Ono, T., Ishiguro, H., and Mase, K. 2002. Robot mediated round table: Analysis of the effect of robot’s gaze. In Proc. 11th IEEE Int. Workshop on Robot and Human Interactive Communication (ROMAN 2002), pp. 411–416.

  • Itti, L., Koch, C., and Niebur, E. 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11):1254–1259.

  • Itti, L., Dhavale, N., and Pighin, F. 2003. Realistic avatar eye and head animation using a neurobiological model of visual attention. In Proc. SPIE 48th Annual International Symposium on Optical Science and Technology.

  • Itti, L., Rees, G., and Tsotsos, J.K. (Eds.) 2005a. Neurobiology of Attention. Elsevier Academic Press.

  • Itti, L. 2005b. Quantifying the contribution of low-level saliency to human eye movements in dynamic scenes. Visual Cognition, 12(6):1093–1123.

  • Itti, L. and Baldi, P. 2006a. Bayesian surprise attracts human attention. Advances in Neural Information Processing Systems, 19:1–8.

  • Itti, L. 2006b. Quantitative modeling of perceptual salience at human eye position. Visual Cognition, (in press).

  • Jain, R., Kasturi, R., and Schunck, B.G. 1995. Machine Vision. McGraw-Hill.

  • Klin, A., Jones, W., Schultz, R., Volkmar, F., and Cohen, D. 2002. Visual fixation patterns during viewing of naturalistic social situations as predictors of social competence in individuals with autism. Archives of General Psychiatry, 59:809–816.

  • Koch, C. 1984. A theoretical analysis of the electrical properties of an X-cell in the cat’s LGN: Does the spine-triad circuit subserve selective visual attention? Artif. Intell. Memo 787, MIT Artificial Intelligence Laboratory.

  • Koch, C. and Ullman, S. 1985. Shifts in selective visual attention: Towards the underlying neural circuitry. Human Neurobiology, 4:219–227.

  • Kustov, A.A. and Robinson, D.L. 1996. Shared neural control of attentional shifts and eye movements. Nature, 384:74–77.

  • Lee, D.K., Itti, L., Koch, C., and Braun, J. 1999. Attention activates winner-take-all competition among visual filters. Nature Neuroscience, 2(4):375–381.

  • Li, Z. 2002. A saliency map in primary visual cortex. Trends in Cognitive Sciences, 6:9–16.

  • Mazer, J.A. and Gallant, J.L. 2003. Goal-related activity in V4 during free viewing visual search: Evidence for a ventral stream visual salience map. Neuron, 40:1241–1250.

  • Nagai, Y., Asada, M., and Hosoda, K. 2002. Developmental learning model for joint attention. In Proceedings of the 15th International Conference on Intelligent Robots and Systems (IROS 2002), Lausanne, Switzerland, pp. 932–937.

  • Niebur, E., Itti, L., and Koch, C. 1995. Modeling the “where” visual pathway. In T.J. Sejnowski (Ed.), Proceedings of the 2nd Joint Symposium on Neural Computation, Caltech-UCSD. Institute for Neural Computation: La Jolla, vol. 5, pp. 26–35.

  • Niebur, E. and Koch, C. 1996. Control of selective visual attention: Modeling the “where” pathway. In D.S. Touretzky, M.C. Mozer, and M.E. Hasselmo (Eds.), Advances in Neural Information Processing Systems, vol. 8. MIT Press: Cambridge, MA, pp. 802–808.

  • Ouerhani, N., von Wartburg, R., Hugli, H., and Muri, R. 2004. Empirical validation of the saliency-based model of visual attention. Electronic Letters on Computer Vision and Image Analysis, 3:13–24.

  • Parkhurst, D., Law, K., and Niebur, E. 2002. Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42(1):107–123.

  • Petersen, S.E., Robinson, D.L., and Morris, J.D. 1987. Contributions of the pulvinar to visual spatial attention. Neuropsychologia, 25:97–105.

  • Robinson, D.L. and Petersen, S.E. 1992. The pulvinar and visual salience. Trends in Neurosciences, 15(4):127–132.

  • Salvucci, D.D. and Goldberg, J.H. 2000. Identifying fixations and saccades in eye-tracking protocols. In Proceedings of the Symposium on Eye Tracking Research & Applications, pp. 71–78.

  • Scassellati, B. 1999. Imitation and mechanisms of joint attention: A developmental structure for building social skills on a humanoid robot. Lecture Notes in Computer Science, 1562:176.

  • Tatler, B.W., Baddeley, R.J., and Gilchrist, I.D. 2005. Visual correlates of fixation selection: Effects of scale and time. Vision Research, 45(5):643–659.

  • Tessier-Lavigne, M. 1991. Phototransduction and information processing in the retina. In E. Kandel, J. Schwartz, and T. Jessel (Eds.), Principles of Neural Science. Elsevier Science Publishers B.V., pp. 401–439.

  • Torralba, A. 2003. Modeling global scene factors in attention. Journal of the Optical Society of America A, 20:1407–1418.

  • Treue, S. 2003. Visual attention: The where, what, how and why of saliency. Current Opinion in Neurobiology, 13(4):428–432.

  • Tsotsos, J.K. 1988. A ‘complexity level’ analysis of immediate vision. International Journal of Computer Vision, 1(4):303–320.

  • Tsotsos, J.K., Culhane, S.M., Wai, W.Y.K., Lai, Y., Davis, N., and Nuflo, F. 1995. Modeling visual attention via selective tuning. Artificial Intelligence, 78(1):507–545.

  • Tsotsos, J.K., Liu, Y., Martinez-Trujillo, J., Pomplun, M., Simine, E., and Zhou, K. 2005. Attending to motion. Computer Vision and Image Understanding, 100(1–2):3–40.

  • Turano, K.A., Geruschat, D.R., and Baker, F.H. 2003. Oculomotor strategies for the direction of gaze tested with a real-world activity. Vision Research, 43(3):333–346.

  • Wolfe, J.M. 1994. Guided Search 2.0: A revised model of visual search. Psychonomic Bulletin & Review, 1(2):202–238.

  • Wolfe, J.M. and Gancarz, G. 1996. Guided Search 3.0: A model of visual search catches up with Jay Enoch 40 years later. In V. Lakshminarayanan (Ed.), Basic and Clinical Applications of Vision Science. Kluwer Academic: Dordrecht, Netherlands.

  • Yee, C. and Walther, D. 2002. Motion detection for bottom-up visual attention. Technical report, SURF/CNS, California Institute of Technology.


Cite this article

Shic, F., Scassellati, B. A Behavioral Analysis of Computational Models of Visual Attention. Int J Comput Vision 73, 159–177 (2007). https://doi.org/10.1007/s11263-006-9784-6
