Cognitive Computation, Volume 4, Issue 2, pp 141–156

A Time-Dependent Saliency Model Combining Center and Depth Biases for 2D and 3D Viewing Conditions

  • J. Gautier
  • O. Le Meur


This paper examines the role of binocular disparity in the deployment of visual attention. To address this point, we compared eye tracking data recorded while observers viewed natural images in 2D and 3D conditions. The influence of disparity on saliency, center and depth biases is first studied. Results show that visual exploration is affected by the introduction of binocular disparity. In particular, participants tend to look first at closer areas in the 3D condition and then direct their gaze to more widespread locations. Besides this behavioral analysis, we assess the extent to which state-of-the-art models of bottom-up visual attention predict where observers looked in both viewing conditions. To improve their ability to predict salient regions, low-level features as well as higher-level foreground/background cues are examined. Results indicate that, following the initial centering response, the foreground feature plays an active role in the early and also the middle instants of attention deployment. Importantly, this influence is more pronounced in stereoscopic conditions. It supports the notion of a quasi-instantaneous bottom-up saliency modulated by higher figure/ground processing. Beyond depth information itself, the foreground cue might constitute an early process of “selection for action”. Finally, we propose a time-dependent computational model to predict saliency on still pictures. The proposed approach combines low-level visual features, center and depth biases. It outperforms state-of-the-art models of bottom-up attention.
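The combination described in the abstract can be pictured as a weighted mixture of three maps whose weights change over viewing time. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names, the isotropic Gaussian used for the center bias, its width `sigma_frac`, and the use of a foreground map as the depth cue are all hypothetical choices, and the weights must be supplied by the caller (the abstract gives no values).

```python
import numpy as np


def center_bias(shape, sigma_frac=0.25):
    """Isotropic Gaussian centered on the image, normalized to sum to 1.

    sigma_frac is a hypothetical width parameter (fraction of the
    smaller image dimension); it is not taken from the paper.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sigma = sigma_frac * min(h, w)
    g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()


def normalize(m):
    """Shift a map to be non-negative and scale it to sum to 1."""
    m = np.asarray(m, dtype=float)
    m = m - m.min()
    total = m.sum()
    if total == 0:
        # A constant map carries no preference: fall back to uniform.
        return np.full(m.shape, 1.0 / m.size)
    return m / total


def combine(low_level, foreground, weights):
    """Mix low-level saliency, center bias and a foreground (depth) map.

    weights = (w_low, w_center, w_foreground) for one time interval,
    e.g. one fixation rank. Assumed to be estimated elsewhere.
    """
    low_level = np.asarray(low_level, dtype=float)
    foreground = np.asarray(foreground, dtype=float)
    if low_level.shape != foreground.shape:
        raise ValueError("maps must have the same shape")
    w = np.asarray(weights, dtype=float)
    if w.shape != (3,) or np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be three non-negative values, not all zero")
    w = w / w.sum()
    maps = (normalize(low_level), center_bias(low_level.shape), normalize(foreground))
    return sum(wi * mi for wi, mi in zip(w, maps))
```

Calling `combine` once per time interval with a different weight triple gives one predicted map per interval. For instance, a center-heavy triple for the first fixation followed by a triple that favors the foreground map would mimic the qualitative pattern reported above, although those values would have to be fitted to eye tracking data.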


Keywords: Eye movements; Saliency model; Binocular disparity; Stereoscopy



The authors would like to thank Lina Jansen, Selim Önat, and Peter König for providing the eye tracking database and for their helpful information and comments on this study. We would also like to thank Tien Ho-Phuoc for support with the bootstrap estimate.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. University of Rennes 1, Rennes, France
