
Where to Look Next? Combining Static and Dynamic Proto-objects in a TVA-based Model of Visual Attention

Published in: Cognitive Computation

Abstract

To decide “Where to look next?” is a central function of the attention system of humans, animals, and robots. Control of attention depends on three factors: low-level static and dynamic visual features of the environment (bottom-up), medium-level visual features of proto-objects, and the task (top-down). We present a novel integrated computational model that includes all these factors in a coherent architecture based on findings and constraints from the primate visual system. The model combines spatially inhomogeneous processing of static features, spatio-temporal motion features, and task-dependent priority control in the form of the first computational implementation of saliency computation as specified by the “Theory of Visual Attention” (TVA, [7]). Importantly, static and dynamic processing streams are fused at the level of visual proto-objects, that is, ellipsoidal visual units that carry the additional medium-level features of position, size, shape, and orientation of the principal axis. Proto-objects serve as input to the TVA process, which combines top-down and bottom-up information to compute attentional priorities, so that relatively complex search tasks can be implemented. To this end, separately computed static and dynamic proto-objects are filtered and subsequently merged into one combined map of proto-objects. For each proto-object, attentional priorities in the form of attentional weights are computed according to TVA. The target of the next saccade is the center of gravity of the proto-object with the highest weight according to the task. We illustrate the approach by applying it to several real-world image sequences and show that it is robust to parameter variations.
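The selection principle described above can be made concrete with a short sketch. The following Python fragment is a minimal illustration, not the authors' implementation: it applies the TVA weight equation w_x = Σ_j η(x, j) · π_j [7] to a merged set of proto-objects and returns the center of gravity of the proto-object with the highest weight as the next saccade target. All class and function names, feature labels, η values, and pertinence settings are hypothetical.

```python
# Minimal sketch (not the authors' implementation) of TVA-based selection
# over merged proto-objects: w_x = sum_j eta(x, j) * pi_j [7].
# Feature labels, eta values, and pertinence settings are hypothetical.
from dataclasses import dataclass, field

@dataclass
class ProtoObject:
    """Ellipsoidal visual unit with medium-level features."""
    centroid: tuple       # (x, y) center of gravity in image coordinates
    size: float           # area of the ellipse
    orientation: float    # principal-axis angle in radians
    eta: dict = field(default_factory=dict)  # eta(x, j): sensory evidence
                                             # that object x has feature j

def attentional_weight(obj: ProtoObject, pertinence: dict) -> float:
    """TVA attentional weight: w_x = sum over features j of eta(x, j) * pi_j."""
    return sum(obj.eta.get(j, 0.0) * pi for j, pi in pertinence.items())

def next_saccade_target(objects: list, pertinence: dict) -> tuple:
    """Center of gravity of the proto-object with the highest weight."""
    best = max(objects, key=lambda o: attentional_weight(o, pertinence))
    return best.centroid

# Example task set: prioritize moving red objects.
pertinence = {"red": 0.9, "moving": 1.0, "vertical": 0.1}
objects = [
    ProtoObject((120, 80), size=300.0, orientation=0.2,
                eta={"red": 0.8, "moving": 0.1}),  # static red object, w = 0.82
    ProtoObject((40, 200), size=150.0, orientation=1.3,
                eta={"red": 0.2, "moving": 0.9}),  # moving object, w = 1.08
]
print(next_saccade_target(objects, pertinence))    # -> (40, 200)
```

In the full model, the η(x, j) values would derive from the static and dynamic feature computations of the two processing streams, while the pertinence values π_j encode the current task set.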




Notes

  1. http://www.uni-bielefeld.de/psychologie/ae/Ae01/IIP/.

  2. http://groups.inf.ed.ac.uk/vision/CAVIAR/CAVIARDATA1/.

  3. testing, camera 3, ftp://ftp.pets.rdg.ac.uk/pub/VS-PETS/.

  4. http://www.cs.cmu.edu/~saada/Projects/CrowdSegmentation/.

References

  1. Adelson EH, Bergen JR. Spatiotemporal energy models for the perception of motion. J Opt Soc Am A. 1985;2(2):284–99.

  2. Ali S, Shah M. A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis. In: IEEE conference on computer vision and pattern recognition (CVPR ’07); 2007. p. 1–6.

  3. Aziz M, Mertsching B. Fast and robust generation of feature maps for region-based visual attention. IEEE Trans Image Process. 2008;17(5):633–44.

  4. Belardinelli A, Pirri F, Carbone A. Motion saliency maps from spatiotemporal filtering. In: Attention in cognitive systems. Springer; 2009. p. 112–23.

  5. Breazeal C, Scassellati B. A context-dependent attention system for a social robot. In: IJCAI ’99. San Francisco: Morgan Kaufmann Publishers Inc.; 1999. p. 1146–53.

  6. Bruce NDB, Tsotsos JK. Saliency, attention, and visual search: an information theoretic approach. J Vis. 2009;9(3):1–24.

  7. Bundesen C. A theory of visual attention. Psychol Rev. 1990;97(4):523–47.

  8. Bundesen C, Habekost T. Principles of visual attention: linking mind and brain. Oxford: Oxford University Press; 2008.

  9. Bundesen C, Habekost T, Kyllingsbaek S. A neural theory of visual attention: bridging cognition and neurophysiology. Psychol Rev. 2005;112(2):291–328.

  10. Carbone E, Schneider WX. Gaze is special: the control of stimulus-driven saccades is not subject to central, but visual attention limitations. Atten Percept Psychophys. (in press).

  11. Clark A. Feature-placing and proto-objects. Philos Psychol. 2004;17(4):443+.

  12. Comaniciu D, Meer P. Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell. 2002;24(5):603–19.

  13. Daugman JG. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. J Opt Soc Am A. 1985;2(7):1160–9.

  14. De Monasterio FM, Gouras P. Functional properties of ganglion cells of the rhesus monkey retina. J Physiol. 1975;251(1):167–95.

  15. DeAngelis GC, Ohzawa I, Freeman RD. Spatiotemporal organization of simple-cell receptive fields in the cat’s striate cortex. I. General characteristics and postnatal development. J Neurophysiol. 1993;69(4):1091–117.

  16. Deubel H, Schneider WX. Saccade target selection and object recognition: evidence for a common attentional mechanism. Vis Res. 1996;36(12):1827–37.

  17. Domijan D, Šetić M. A feedback model of figure-ground assignment. J Vis. 2008;8(7):1–27.

  18. Dosil R, Fdez-Vidal XR, Pardo XM. Motion representation using composite energy features. Pattern Recognit. 2008;41(3):1110–23.

  19. Driscoll JA, Peters RA II, Cave KR. A visual attention network for a humanoid robot. In: Proceedings of the IEEE/RSJ international conference on intelligent robots and systems; 1998. p. 12–6.

  20. Findlay JM. Global visual processing for saccadic eye movements. Vis Res. 1982;22(8):1033–45.

  21. Forssén PE. Low and medium level vision using channel representations. Ph.D. thesis, Linköping University, Linköping, Sweden; 2004. Dissertation No. 858, ISBN 91-7373-876-X.

  22. Frey HP, König P, Einhäuser W. The role of first- and second-order stimulus features for human overt attention. Percept Psychophys. 2007;69(2):153–61.

  23. Frintrop S, Klodt M, Rome E. A real-time visual attention system using integral images. In: Proceedings of the 5th international conference on computer vision systems; 2007.

  24. Frintrop S, Rome E, Christensen HI. Computational visual attention systems and their cognitive foundations: a survey. ACM Trans Appl Percept. 2010;7(1):1–39.

  25. Geisler WS, Albrecht DG. Visual cortex neurons in monkeys and cats: detection, discrimination, and identification. Vis Neurosci. 1997;14:897–919.

  26. Goodale MA, Milner AD. Separate visual pathways for perception and action. Trends Neurosci. 1992;15(1):20–5.

  27. Goodale MA, Westwood DA. An evolving view of duplex vision: separate but interacting cortical pathways for perception and action. Curr Opin Neurobiol. 2004;14(2):203–11.

  28. van Hateren JH, Ruderman DL. Independent component analysis of natural image sequences yields spatio-temporal filters similar to simple cells in primary visual cortex. Proc Biol Sci. 1998;265(1412):2315–20.

  29. Heeger DJ. Optical flow using spatiotemporal filters. Int J Comput Vis. 1988;1(4):279–302.

  30. Itti L, Baldi P. Bayesian surprise attracts human attention. Vis Res. 2009;49(10):1295–306.

  31. Itti L, Koch C. Feature combination strategies for saliency-based visual attention systems. J Electron Imag. 2001;10(1):161–9.

  32. Itti L, Koch C, Niebur E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell. 1998;20(11):1254–9.

  33. Kehrer L, Meinecke C. A space-variant filter model of texture segregation: parameter adjustment guided by psychophysical data. Biol Cybern. 2003;88(3):183–200.

  34. Koch C, Ullman S. Shifts in selective visual attention: towards the underlying neural circuitry. Hum Neurobiol. 1985;4(4):219–27.

  35. Land M, Tatler B. Looking and acting: vision and eye movements in natural behaviour. Oxford: Oxford University Press; 2009.

  36. Le Meur O, Le Callet P, Barba D. Predicting visual fixations on video based on low-level visual features. Vis Res. 2007;47(19):2483–98.

  37. Mahadevan V, Vasconcelos N. Spatiotemporal saliency in dynamic scenes. IEEE Trans Pattern Anal Mach Intell. 2009;32:171–7.

  38. Marat S, Ho Phuoc T, Granjon L, Guyader N, Pellerin D, Guérin-Dugué A. Modelling spatio-temporal saliency to predict gaze direction for short videos. Int J Comput Vis. 2009;82(3):231–43.

  39. Morén J, Ude A, Koene A, Cheng G. Biologically based top-down attention modulation for humanoid interactions. Int J Humanoid Robot. 2008;5(1):3–24.

  40. Morrone MC, Burr DC. Feature detection in human vision: a phase-dependent energy model. Proc R Soc Lond B Biol Sci. 1988;235(1280):221–45.

  41. Nagai Y. From bottom-up visual attention to robot action learning. In: Proceedings of the 8th IEEE international conference on development and learning. IEEE Press; 2009.

  42. Nagai Y, Hosoda K, Morita A, Asada M. A constructive model for the development of joint attention. Conn Sci. 2003;15(4):211–29.

  43. Navalpakkam V, Itti L. An integrated model of top-down and bottom-up attention for optimal object detection. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), New York, NY; 2006. p. 2049–56.

  44. Navalpakkam V, Itti L. A goal oriented attention guidance model. In: Biologically motivated computer vision. Springer; 2010. p. 81–118.

  45. Nothdurft H. The role of features in preattentive vision: comparison of orientation, motion and color cues. Vis Res. 1993;33(14):1937–58.

  46. Ölveczky BP, Baccus SA, Meister M. Segregation of object and background motion in the retina. Nature. 2003;423:401–8.

  47. Orabona F, Metta G, Sandini G. A proto-object based visual attention model. In: Attention in cognitive systems. Theories and systems from an interdisciplinary viewpoint. 2008. p. 198–215.

  48. Palmer SE. Vision science. Cambridge: MIT Press; 1999.

  49. Park S, Shin J, Lee M. Biologically inspired saliency map model for bottom-up visual attention. In: Biologically motivated computer vision. Springer; 2010. p. 113–45.

  50. Riesenhuber M, Poggio T. Hierarchical models of object recognition in cortex. Nat Neurosci. 1999;2(11):1019–25.

  51. Rosenholtz R. A simple saliency model predicts a number of motion popout phenomena. Vis Res. 1999;39(19):3157–63.

  52. Ruesch J, Lopes M, Bernardino A, Hornstein J, Santos-Victor J, Pfeifer R. Multimodal saliency-based bottom-up attention: a framework for the humanoid robot iCub. In: International conference on robotics and automation, Pasadena, CA, USA; 2008. p. 962–7.

  53. Schaefer G, Stich M. UCID: an uncompressed colour image database. In: Storage and retrieval methods and applications for multimedia 2004. Proceedings of SPIE, vol. 5307; 2004. p. 472–80.

  54. Schneider WX. VAM: A neuro-cognitive model for visual attention control of segmentation, object recognition, and space-based motor action. Vis Cogn. 1995;2(2–3):331–76.

  55. Scholl BJ. Objects and attention: the state of the art. Cognition. 2001;80(1–2):1–46.

  56. Steil JJ, Heidemann G, Jockusch J, Rae R, Jungclaus N, Ritter H. Guiding attention for grasping tasks by gestural instruction: the GRAVIS-robot architecture. In: Proceedings of IROS 2001. IEEE; 2001. p. 1570–7.

  57. Sun Y, Fisher R, Wang F, Gomes HM. A computer vision model for visual-object-based attention and eye movements. Comput Vis Image Underst. 2008;112(2):126–42.

  58. Tatler B. Current understanding of eye guidance. Vis Cogn. 2009;17(6–7):777–89.

  59. Torralba A, Oliva A, Castelhano MS, Henderson JM. Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychol Rev. 2006;113(4):766–86.

  60. Treisman A. The binding problem. Curr Opin Neurobiol. 1996;6(2):171–8.

  61. Treisman AM, Gelade G. A feature-integration theory of attention. Cogn Psychol. 1980;12(1):97–136.

  62. Tsotsos JK, Culhane SM, Wai WYK, Lai Y, Davis N, Nuflo F. Modeling visual attention via selective tuning. Artif Intell. 1995;78(1–2):507–45.

  63. Van Essen D, Anderson C. Information processing strategies and pathways in the primate visual system. In: Zornetzer S, Davis J, Lau C, McKenna T, editors. An introduction to neural and electronic networks. New York: Academic Press; 1995. p. 45–76.

  64. Walther D, Itti L, Riesenhuber M, Poggio T, Koch C. Attentional selection for object recognition—a gentle way. In: Biologically motivated computer vision. Springer; 2002. p. 251–67.

  65. Walther D, Koch C. Modeling attention to salient proto-objects. Neural Netw. 2006;19(9):1395–407.

  66. Watson AB. Detection and recognition of simple spatial forms. Technical report, NASA Ames Research Center; 1983.

  67. Watson AB, Ahumada AJ Jr. Model of human visual-motion sensing. J Opt Soc Am A. 1985;2(2):322–41.

  68. Wildes RP, Bergen JR. Qualitative spatiotemporal analysis using an oriented energy representation. In: ECCV ’00: Proceedings of the 6th European conference on computer vision, Part II; 2000. p. 768–84.

  69. Wischnewski M, Steil JJ, Kehrer L, Schneider WX. Integrating inhomogeneous processing and proto-object formation in a computational model of visual attention. In: Human centered robot systems. 2009. p. 93–102.

  70. Wolfe JM, Horowitz TS. What attributes guide the deployment of visual attention and how do they do it? Nat Rev Neurosci. 2004;5(6):495–501.

Acknowledgments

This research was supported by grants from the Cluster of Excellence Cognitive Interaction Technology (CITEC).

Author information

Corresponding author

Correspondence to Marco Wischnewski.


Cite this article

Wischnewski, M., Belardinelli, A., Schneider, W.X. et al. Where to Look Next? Combining Static and Dynamic Proto-objects in a TVA-based Model of Visual Attention. Cogn Comput 2, 326–343 (2010). https://doi.org/10.1007/s12559-010-9080-1

