Learning-Based Visual Saliency Computation

Li, Jia; Gao, Wen

doi:10.1007/978-3-319-05642-5_5

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8408)

Abstract

In this chapter, we describe how to model the top-down factors of the human visual system. Such factors are usually learned from user data, such as eye fixations or labeled salient objects, by means of machine learning algorithms. We therefore present how to model various top-down factors with supervised or unsupervised learning algorithms, and we aim to show how machine learning can help improve the performance of saliency models.

Author information

Authors and Affiliations

Peking University, Haidian District, 100871, Beijing, China
Jia Li & Wen Gao

Cite this chapter

Li, J., Gao, W. (2014). Learning-Based Visual Saliency Computation. In: Visual Saliency Computation. Lecture Notes in Computer Science, vol 8408. Springer, Cham. https://doi.org/10.1007/978-3-319-05642-5_5

DOI: https://doi.org/10.1007/978-3-319-05642-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05641-8
Online ISBN: 978-3-319-05642-5
eBook Packages: Computer Science (R0)
