Saliency in Crowd

  • Ming Jiang
  • Juan Xu
  • Qi Zhao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8695)

Abstract

Theories and models of saliency that predict where people look focus on regular-density scenes. A crowded scene is characterized by the co-occurrence of a relatively large number of regions/objects, each of which would have stood out in a regular scene, so what drives attention in a crowd can differ significantly from the conclusions drawn in the regular setting. This work presents a first focused study on saliency in crowd. To facilitate it, a new dataset of 500 images is constructed with eye-tracking data from 16 viewers and annotation data on faces (the dataset will be publicly available with the paper). Statistical analyses point to key observations on the features and mechanisms of saliency in scenes with different crowd levels and provide insights as to whether conventional saliency models hold in crowded scenes. Finally, a new model for saliency prediction that takes crowding information into account is proposed, with multiple kernel learning (MKL) as its core computational module for integrating various features at both low and high levels. Extensive experiments demonstrate the superior performance of the proposed model compared with the state of the art in saliency computation.
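
To make the role of MKL concrete, below is a minimal illustrative sketch (not the authors' implementation) of kernel-level feature integration for saliency prediction, written in Python with scikit-learn. The feature matrices, kernel weights, and bandwidths are hypothetical placeholders; the paper's MKL formulation learns the kernel weights jointly with the classifier, whereas here they are fixed for brevity.

# Minimal sketch of MKL-style feature integration for saliency prediction.
# X_low / X_face are HYPOTHETICAL per-patch feature matrices; y marks fixated
# patches. A fixed weighted sum of RBF kernels stands in for learned MKL weights.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

def combined_kernel(feats_a, feats_b, weights, gammas):
    # Weighted sum of one RBF kernel per feature channel (low-level, face-level).
    K = np.zeros((feats_a[0].shape[0], feats_b[0].shape[0]))
    for A, B, w, g in zip(feats_a, feats_b, weights, gammas):
        K += w * rbf_kernel(A, B, gamma=g)
    return K

rng = np.random.default_rng(0)                       # stand-in training data
X_low, X_face = rng.normal(size=(200, 12)), rng.normal(size=(200, 4))
y = rng.integers(0, 2, size=200)                     # 1 = fixated patch

weights, gammas = [0.6, 0.4], [0.1, 0.5]             # e.g. set by cross-validation
train_feats = [X_low, X_face]
K_train = combined_kernel(train_feats, train_feats, weights, gammas)
clf = SVC(kernel="precomputed").fit(K_train, y)

# Saliency score of new patches = signed distance to the decision boundary.
X_low_t, X_face_t = rng.normal(size=(5, 12)), rng.normal(size=(5, 4))
K_test = combined_kernel([X_low_t, X_face_t], train_feats, weights, gammas)
print(clf.decision_function(K_test))

Per-patch decision values like these could then be arranged spatially and smoothed into a saliency map; the split into low-level and face-level channels here only mirrors the feature levels named in the abstract.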

Keywords

visual attention · saliency · crowd · multiple kernel learning

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Ming Jiang (1)
  • Juan Xu (1)
  • Qi Zhao (1)

  1. Department of Electrical and Computer Engineering, National University of Singapore, Singapore
