
Predicting the eye fixation locations in the gray scale images in the visual scenes with different semantic contents

  • Research Article
  • Published in Cognitive Neurodynamics

Abstract

In recent years, there has been considerable interest in visual attention models (saliency maps of visual attention). These models can be used to predict eye fixation locations and therefore have many applications in fields that aim to improve the performance of machine vision systems. Most of these models need improvement because they rely on bottom-up computation that ignores top-down image semantic content and often fails to match actual eye fixation locations. In this study, we recorded the eye movements (i.e., fixations) of fourteen individuals who viewed images of natural (e.g., landscape, animal) and man-made (e.g., building, vehicle) scenes. We extracted the fixation locations of the eye movements in the two image categories. After extracting the fixation areas (a patch around each fixation location), we evaluated the characteristics of these areas relative to non-fixation areas. The features extracted from each patch were orientation and spatial frequency. After the feature-extraction phase, different statistical classifiers were trained on these features to predict eye fixation locations. This study connects eye-tracking results to the automatic prediction of salient regions of images. The results show that eye fixation locations can be predicted using the image patches around subjects' fixation points.
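The pipeline the abstract describes — extract a patch around each fixation, estimate its orientation and spatial frequency, and train a classifier on those features — can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes square gray-scale patches, estimates both features from the patch's 2-D power spectrum, and uses a toy nearest-class-mean classifier as a stand-in for the statistical classifiers trained in the study.

```python
import numpy as np


def patch_features(patch):
    """Dominant orientation and mean radial spatial frequency of a
    square gray-scale patch, estimated from its 2-D power spectrum."""
    p = patch - patch.mean()
    power = np.abs(np.fft.fftshift(np.fft.fft2(p))) ** 2
    n = patch.shape[0]
    c = n // 2
    ys, xs = np.mgrid[0:n, 0:n]
    fy, fx = ys - c, xs - c
    power[c, c] = 0.0                         # ignore the DC component
    radius = np.hypot(fx, fy)                 # radial frequency of each bin
    mean_freq = (radius * power).sum() / (power.sum() + 1e-12)
    # Orientation: angle of the strongest spectral peak, folded to [0, pi)
    iy, ix = np.unravel_index(np.argmax(power), power.shape)
    orientation = np.arctan2(iy - c, ix - c) % np.pi
    return np.array([orientation, mean_freq])


def train_nearest_mean(X, y):
    """Fit one feature-mean per class (a toy statistical classifier)."""
    return {cls: X[y == cls].mean(axis=0) for cls in np.unique(y)}


def predict(means, X):
    """Assign each feature vector to the class with the nearest mean."""
    classes = sorted(means)
    dists = np.stack([np.linalg.norm(X - means[c], axis=1) for c in classes])
    return np.array(classes)[dists.argmin(axis=0)]
```

With synthetic data the separation is easy to see: textured, high-frequency patches (standing in for fixation areas) land at a higher mean spatial frequency than smooth ramps (standing in for non-fixation areas), so the classifier separates the two groups from the two features alone.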



Acknowledgments

This research was supported by the School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Niavaran, Tehran, Iran. We wish to thank Morteza Saraf, Moein Esghaei, Mehdi Behrozi, Kourosh Maboudi and everyone who participated in our experiment.

Author information

Corresponding author

Correspondence to Mohammad Reza Daliri.

About this article

Cite this article

Zanganeh Momtaz, H., Daliri, M.R. Predicting the eye fixation locations in the gray scale images in the visual scenes with different semantic contents. Cogn Neurodyn 10, 31–47 (2016). https://doi.org/10.1007/s11571-015-9357-x
