Worldly Eyes on Video: Learnt vs. Reactive Deployment of Attention to Dynamic Stimuli

  • Vittorio Cuculo
  • Alessandro D’Amelio
  • Giuliano Grossi
  • Raffaella Lanzarotti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11751)

Abstract

Computational visual attention is a hot topic in computer vision. However, most efforts are devoted to modelling saliency, whilst the actual eye-guidance problem, which brings into play the sequence of gaze shifts characterising overt attention, is overlooked. Further, in those cases where the generation of gaze behaviour is considered, the stimuli of interest are by and large static (still images) rather than dynamic (videos). Under such circumstances, the work described in this note has a twofold aim: (i) addressing the problem of estimating and generating visual scan paths, that is, the sequences of gaze shifts over videos; (ii) investigating the effectiveness, for scan path generation, of features dynamically learned from the attention dynamics of human observers, as opposed to bottom-up derived features. To this end, a probabilistic model is proposed. Using a publicly available dataset, our approach is compared against a model of scan path simulation that does not rely on a learning step.
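
The keywords point to the two main ingredients of such a probabilistic model: a bag-of-visual-words representation of the stimulus and a hidden Markov model (HMM) capturing gaze dynamics. As a rough, hypothetical illustration of how a pipeline of this kind could be assembled (not the authors' actual model: the descriptors, state count, and placeholder data below are all assumptions), consider the following Python sketch using scikit-learn and hmmlearn:

```python
import numpy as np
from sklearn.cluster import KMeans
from hmmlearn import hmm

rng = np.random.default_rng(0)

# Stand-ins for real data (assumptions): `descriptors` would be local
# appearance descriptors sampled around recorded fixations on video frames;
# `scanpaths` would be recorded (x, y) gaze traces, one array per session.
descriptors = rng.normal(size=(5000, 64))
scanpaths = [np.cumsum(rng.normal(size=(200, 2)), axis=0) for _ in range(10)]

# Bag of visual words: quantise the descriptors into a 50-word codebook.
codebook = KMeans(n_clusters=50, n_init=10, random_state=0).fit(descriptors)
words = codebook.predict(descriptors)  # visual-word index for each patch

# HMM over gaze dynamics: hidden states stand in for attentional regimes
# (e.g. fixating vs. relocating); emissions are 2-D gaze positions.
X = np.concatenate(scanpaths)
lengths = [len(s) for s in scanpaths]
model = hmm.GaussianHMM(n_components=3, covariance_type="full",
                        n_iter=100, random_state=0)
model.fit(X, lengths)

# Generate a synthetic scan path by sampling from the learnt dynamics.
simulated_gaze, states = model.sample(200)
```

In the paper's setting, the learnt visual words would presumably condition the HMM emissions, tying gaze dynamics to stimulus content; the sketch keeps the two stages separate for brevity.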

Keywords

Visual attention · Scan path · HMM · Bag of visual words · Video gaze prediction


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. PHuSe Lab - Dipartimento di Informatica, University of Milan, Milan, Italy