Abstract
Dynamic cues have traditionally been treated as a simple extension of static saliency, typically in the form of optic flow between two frames; the evolution of stimuli over periods longer than two frames has been largely ignored in saliency research. We argue that considering the temporal evolution of trajectories, even over a relatively short period, can significantly extend the range of meaningful regions that can be extracted from videos, without resorting to higher-level processes. Our work is a systematic and principled investigation of the temporal aspect of saliency in a dynamic setting. Departing from the majority of works, in which the dynamic cue merely extends static saliency, we place central importance on temporality. We formulate both intra- and inter-trajectory saliency to measure relationships within and between trajectories, respectively. Our inter-trajectory formulation is also the first among computational saliency works to look beyond the immediate neighborhood in space and time, exploiting the perceptual organization rule of common fate (temporal synchrony) to make a group of trajectories stand out from the rest. At the technical level, our superpixel trajectory representation captures the detailed dynamics of superpixels as they progress through time, allowing us to measure changes such as sudden movement or onset better than other representations can. Experimental results show that our method achieves state-of-the-art performance both quantitatively and qualitatively.
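To make the two formulations concrete, the following is a minimal sketch, not the paper's actual equations: the trajectory representation, the function names, and the median-motion proxy for common fate are all illustrative assumptions. Each trajectory is taken as a NumPy array of (x, y) positions over T frames; intra-trajectory saliency flags sudden motion changes (onset, abrupt turns) within one trajectory, while inter-trajectory saliency scores how strongly a trajectory's velocity profile departs from the dominant motion, so that a synchronously moving group stands out from the rest.

```python
import numpy as np

def velocities(traj):
    # Frame-to-frame displacement of one trajectory: (T, 2) -> (T-1, 2).
    return np.diff(traj, axis=0)

def intra_trajectory_saliency(traj):
    # Sudden movement or onset within a single trajectory, scored as the
    # peak acceleration magnitude along it.
    accel = np.diff(velocities(traj), axis=0)        # (T-2, 2)
    return np.linalg.norm(accel, axis=1).max()

def inter_trajectory_saliency(trajs):
    # Common-fate proxy (our assumption, not the paper's formulation):
    # trajectories whose velocity profiles deviate from the dominant
    # (median) motion stand out together from the rest.
    V = np.stack([velocities(t) for t in trajs])     # (N, T-1, 2), equal T assumed
    dominant = np.median(V, axis=0)                  # (T-1, 2)
    dev = np.linalg.norm(V - dominant, axis=2)       # (N, T-1)
    return dev.mean(axis=1)                          # one score per trajectory

# Toy example: 7 near-static background trajectories and 2 moving in synchrony.
rng = np.random.default_rng(0)
T = 30
still = [rng.normal(scale=0.1, size=(T, 2)) for _ in range(7)]
moving = [np.cumsum(np.ones((T, 2)), axis=0) + rng.normal(scale=0.1, size=(T, 2))
          for _ in range(2)]
print(inter_trajectory_saliency(still + moving).round(2))
# The last two scores are markedly higher: the synchronized pair pops out.
```

Note that this sketch only looks at motion similarity over a shared window; the paper's formulation additionally works on superpixel trajectories and looks beyond the immediate spatio-temporal neighborhood, as described above.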
Acknowledgement
This work was partially supported by the Singapore PSF grant 1321202075 and the NUS AcRF grant R-263-000-A21-112.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Luo, Y., Cheong, L.F., Cabibihan, J.J. (2015). Modeling the Temporality of Saliency. In: Cremers, D., Reid, I., Saito, H., Yang, M.H. (eds) Computer Vision -- ACCV 2014. Lecture Notes in Computer Science, vol. 9005. Springer, Cham. https://doi.org/10.1007/978-3-319-16811-1_14
Print ISBN: 978-3-319-16810-4
Online ISBN: 978-3-319-16811-1