Modeling the Temporality of Saliency

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9005)

Included in the following conference series: Computer Vision -- ACCV 2014 (ACCV 2014)

Abstract

Until recently, dynamic cues have typically been treated as a simple extension of static saliency, usually in the form of optic flow between two frames. The evolution of stimuli over periods longer than two frames has been largely ignored in saliency research. We argue that considering the temporal evolution of trajectories, even over a relatively short period, can significantly extend the kinds of meaningful regions that can be extracted from videos without resorting to higher-level processes. Our work is a systematic and principled investigation of the temporal aspect of saliency in a dynamic setting. Departing from the majority of works, in which the dynamic cue is treated as an extension of static saliency, we place central importance on temporality. We formulate both intra- and inter-trajectory saliency, measuring relationships within and between trajectories, respectively. Our inter-trajectory formulation is also the first among computational saliency works to look beyond the immediate neighborhood in space and time, exploiting the perceptual organization rule of common fate (temporal synchrony) to make a group of trajectories stand out from the rest. At the technical level, our superpixel trajectory representation captures the detailed dynamics of superpixels as they evolve over time, allowing us to measure changes such as sudden movement or onset more accurately than other representations. Experimental results show that our method achieves state-of-the-art performance both quantitatively and qualitatively.
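
The abstract names the two measures but not their exact form. As a rough illustration of the intuition only, here is a minimal Python sketch: the intra-trajectory score flags a superpixel trajectory whose current velocity deviates from its own recent history (sudden movement or onset), and the inter-trajectory score applies common fate by rating a trajectory as salient when its speed profile is decorrelated from the bulk of the other trajectories. All names here, the sliding-window baseline, and the correlation-based synchrony measure are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def intra_trajectory_saliency(vel, window=5):
    """Per-frame saliency of one trajectory.

    vel: (T, 2) array of per-frame superpixel velocities.
    Scores frame t by how far vel[t] deviates from the mean
    velocity over the preceding `window` frames.
    """
    T = len(vel)
    sal = np.zeros(T)
    for t in range(1, T):
        baseline = vel[max(0, t - window):t].mean(axis=0)  # recent motion
        sal[t] = np.linalg.norm(vel[t] - baseline)         # deviation from it
    return sal

def inter_trajectory_saliency(vels):
    """Common-fate score for N trajectories.

    vels: (N, T, 2) array. A trajectory scores high when its speed
    profile is synchronized with few of the others, i.e. it (and any
    small group moving with it) stands out from the crowd.
    """
    speed = np.linalg.norm(vels, axis=2)               # (N, T) speed profiles
    speed -= speed.mean(axis=1, keepdims=True)         # center each profile
    norms = np.linalg.norm(speed, axis=1) + 1e-8       # guard constant speeds
    corr = (speed @ speed.T) / np.outer(norms, norms)  # pairwise synchrony
    n = len(corr)
    mean_sync = (corr.sum(axis=1) - corr.diagonal()) / (n - 1)
    return 1.0 - mean_sync                             # low synchrony = salient

# Toy check: 20 trajectories share one motion pattern, the last 3 another.
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 60)
bg = np.stack([np.stack([np.sin(t), np.zeros_like(t)], axis=1)] * 20)
fg = np.stack([np.stack([np.sin(3 * t), np.zeros_like(t)], axis=1)] * 3)
vels = np.concatenate([bg, fg]) + 0.05 * rng.standard_normal((23, 60, 2))
print(inter_trajectory_saliency(vels).round(2))        # last 3 score highest
```

In this toy setup the three out-of-phase trajectories receive the highest inter-trajectory scores because their speed profiles agree with each other but not with the background group, which is the common-fate (temporal synchrony) intuition the abstract appeals to.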

Acknowledgement

This work was partially supported by the Singapore PSF grant 1321202075 and the NUS AcRF grant R-263-000-A21-112.

Author information

Correspondence to Loong-Fah Cheong.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Luo, Y., Cheong, L.F., Cabibihan, J.J. (2015). Modeling the Temporality of Saliency. In: Cremers, D., Reid, I., Saito, H., Yang, M.H. (eds.) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9005. Springer, Cham. https://doi.org/10.1007/978-3-319-16811-1_14

  • DOI: https://doi.org/10.1007/978-3-319-16811-1_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16810-4

  • Online ISBN: 978-3-319-16811-1

  • eBook Packages: Computer Science, Computer Science (R0)
