
Improving Visual Saliency by Adding ‘Face Feature Map’ and ‘Center Bias’

Cognitive Computation

Abstract

Faces play an important role in guiding visual attention, and thus the inclusion of face detection into a classical visual attention model can improve eye movement predictions. In this study, we proposed a visual saliency model to predict eye movements during free viewing of videos. The model is inspired by the biology of the visual system and decomposes each frame of a video database into three saliency maps, each dedicated to a particular visual feature. (a) A ‘static’ saliency map emphasizes regions that differ from their context in terms of luminance, orientation and spatial frequency. (b) A ‘dynamic’ saliency map emphasizes moving regions, with values proportional to motion amplitude. (c) A ‘face’ saliency map emphasizes areas where a face is detected, with a value proportional to the confidence of the detection. In parallel, a behavioral experiment was carried out to record participants’ eye movements while they viewed the videos. These eye movements were compared with the model’s saliency maps to quantify their predictive power. We also examined the influence of the center bias on the saliency maps and incorporated it into the model accordingly. Finally, we proposed an efficient method for fusing all of these saliency maps. The resulting fused master saliency map is a good predictor of participants’ eye positions.
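To make the pipeline concrete, the following is a minimal sketch of how three per-feature saliency maps could be combined with a center bias. It assumes the maps are already computed as NumPy arrays of equal shape; the Gaussian bias, the min–max normalization, and the weights (including the larger illustrative weight on the face map) are assumptions for illustration, not the fusion method actually proposed in the paper.

```python
import numpy as np

def center_bias(height, width, sigma=0.25):
    """2-D Gaussian centered on the frame; sigma is a fraction of frame size.
    The shape of the bias is an illustrative assumption."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    d2 = ((ys - cy) / (sigma * height)) ** 2 + ((xs - cx) / (sigma * width)) ** 2
    return np.exp(-d2 / 2.0)

def normalize(saliency_map):
    """Rescale a map to [0, 1] so the three features are comparable."""
    shifted = saliency_map - saliency_map.min()
    peak = shifted.max()
    return shifted / peak if peak > 0 else shifted

def fuse(static_map, dynamic_map, face_map, weights=(1.0, 1.0, 2.0)):
    """Weighted sum of the three feature maps, modulated by the center bias.
    The weights are hypothetical placeholders."""
    height, width = static_map.shape
    master = (weights[0] * normalize(static_map)
              + weights[1] * normalize(dynamic_map)
              + weights[2] * normalize(face_map))
    return normalize(master * center_bias(height, width))
```

In practice, the relative weights of the three maps and the strength of the center bias would have to be fitted to the recorded eye movement data rather than set by hand as above.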




Notes

  1. For clarity, only statistics using the NSS criterion are presented, since NSS and TC generally lead to the same conclusion. We took the sample mean over the 70 frames of each video snippet and then applied the significance tests (a sketch of the NSS computation is given below).
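NSS (normalized scanpath saliency) is conventionally computed by z-scoring the saliency map to zero mean and unit standard deviation, then averaging its values at the recorded eye positions, so that chance level is 0. A minimal sketch, with illustrative variable names:

```python
import numpy as np

def nss(saliency_map, fixations):
    """Normalized Scanpath Saliency: z-score the saliency map, then average
    it at the observers' fixation coordinates given as (row, col) pairs."""
    z = (saliency_map - saliency_map.mean()) / saliency_map.std()
    return float(np.mean([z[row, col] for row, col in fixations]))
```

Per-frame values like these would then be averaged over the 70 frames of a snippet before applying the significance tests, as described in the note above.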


Author information

Corresponding author: Anis Rahman.

Additional information

Sophie Marat and Anis Rahman contributed equally as first authors to this work. This research was supported by the Rhône-Alpes region (France) under CIBLE project No. 2136.


About this article

Cite this article

Marat, S., Rahman, A., Pellerin, D. et al. Improving Visual Saliency by Adding ‘Face Feature Map’ and ‘Center Bias’. Cogn Comput 5, 63–75 (2013). https://doi.org/10.1007/s12559-012-9146-3
