
Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics

  • Matthias Kümmerer
  • Thomas S. A. Wallis
  • Matthias Bethge
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11220)

Abstract

Dozens of new models for fixation prediction are published every year and compared on open benchmarks such as MIT300 and LSUN. However, progress in the field can be difficult to judge because models are compared using a variety of inconsistent metrics. Here we show that no single saliency map can perform well under all metrics. Instead, we propose a principled approach to solve the benchmarking problem by separating the notions of saliency models, maps and metrics. Inspired by Bayesian decision theory, we define a saliency model to be a probabilistic model of fixation density prediction and a saliency map to be a metric-specific prediction derived from the model density which maximizes the expected performance on that metric given the model density. We derive these optimal saliency maps for the most commonly used saliency metrics (AUC, sAUC, NSS, CC, SIM, KL-Div) and show that they can be computed analytically or approximated with high precision. We show that this leads to consistent rankings across all metrics and avoids the penalties of using one saliency map for all metrics. Our method allows researchers to have their model compete on many different metrics against the state of the art in each metric: “good” models will perform well in all metrics.
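To make the model/map separation concrete, the sketch below derives metric-specific saliency maps from a single predicted fixation density, following the abstract's reasoning: AUC depends only on the rank order of saliency values, sAUC on the ratio of the density to the center-bias density, NSS on the density itself, and CC on a smoothed density matching the blurred empirical fixation map. This is a minimal illustration, not the authors' implementation (see their pysaliency package, ref. 29); the array shapes, the `cc_map` blur width, and the toy inputs are assumptions, and the SIM and KL-Div maps, which the paper approximates numerically, are omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def auc_map(density):
    # AUC depends only on the ranking of saliency values, so the predicted
    # fixation density itself (or any monotone transform of it) is optimal.
    return density


def sauc_map(density, center_bias):
    # sAUC draws nonfixations from a center-bias distribution; ranking
    # pixels by density / center_bias maximizes expected shuffled AUC.
    return density / center_bias


def nss_map(density):
    # NSS is invariant to affine transformations of the map; the density
    # itself maximizes the expected normalized scanpath saliency.
    return density


def cc_map(density, sigma):
    # CC compares the map to a Gaussian-blurred empirical fixation map, so
    # the density blurred with the same kernel is optimal. The kernel width
    # sigma (pixels) is a placeholder and must match the benchmark's.
    return gaussian_filter(density, sigma)


# Toy usage: a near-uniform density with one salient region.
h, w = 60, 80
density = np.full((h, w), 1.0 / (h * w))
density[20:30, 30:45] *= 5.0
density /= density.sum()

# Hypothetical Gaussian center bias for the sAUC map.
ys, xs = np.mgrid[0:h, 0:w]
center_bias = np.exp(-((ys - h / 2) ** 2 + (xs - w / 2) ** 2) / (2 * 15.0**2))
center_bias /= center_bias.sum()

maps = {
    "AUC": auc_map(density),
    "sAUC": sauc_map(density, center_bias),
    "NSS": nss_map(density),
    "CC": cc_map(density, sigma=5.0),
}
```

Note how one probabilistic model yields four different maps: the sAUC map suppresses the image center relative to the AUC/NSS maps, and the CC map is a blurred version of them, which is exactly why a single fixed map cannot win under every metric.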

Keywords

Saliency · Benchmarking · Metrics · Fixations · Bayesian decision theory · Model comparison

Acknowledgements

This study is part of Matthias Kümmerer’s thesis work at the International Max Planck Research School for Intelligent Systems (IMPRS-IS). The research has been funded by the German Science Foundation (DFG; Collaborative Research Centre 1233) and the German Excellence Initiative (EXC307).

Supplementary material

Supplementary material 1 (PDF, 494 KB): 474218_1_En_47_MOESM1_ESM.pdf

References

  1. Adeli, H., Vitu, F., Zelinsky, G.J.: A model of the superior colliculus predicts fixation locations during scene viewing and visual search. J. Neurosci. 37(6), 1453–1467 (2016). https://doi.org/10.1523/jneurosci.0825-16.2016
  2. Barthelmé, S., Trukenbrod, H., Engbert, R., Wichmann, F.: Modeling fixation locations using spatial point processes. J. Vis. 13(12), 1 (2013). https://doi.org/10.1167/13.12.1
  3. Borji, A., Sihite, D.N., Itti, L.: Objects do not predict fixations better than early saliency: a re-analysis of Einhäuser et al.’s data. J. Vis. 13(10), 18 (2013). https://doi.org/10.1167/13.10.18
  4. Borji, A., Itti, L.: State-of-the-art in visual attention modeling. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 185–207 (2013). https://doi.org/10.1109/tpami.2012.89
  5. Borji, A., Sihite, D.N., Itti, L.: Quantitative analysis of human-model agreement in visual saliency modeling: a comparative study. IEEE Trans. Image Process. 22(1), 55–69 (2013). https://doi.org/10.1109/tip.2012.2210727
  6. Bruce, N.D.B., Tsotsos, J.K.: Saliency, attention, and visual search: an information theoretic approach. J. Vis. 9(3), 5 (2009). https://doi.org/10.1167/9.3.5
  7. Bruce, N.D.B., Catton, C., Janjic, S.: A deeper look at saliency: feature contrast, semantics, and beyond. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2016). https://doi.org/10.1109/cvpr.2016.62
  8. Bruce, N.D.B., Wloka, C., Frosst, N., Rahman, S., Tsotsos, J.K.: On computational modeling of visual saliency: examining what’s right, and what’s left. Vis. Res. 116, 95–112 (2015). https://doi.org/10.1016/j.visres.2015.01.010
  9. Bylinskii, Z., Judd, T., Durand, F., Oliva, A., Torralba, A.: MIT saliency benchmark. http://saliency.mit.edu/
  10. Bylinskii, Z., Judd, T., Oliva, A., Torralba, A., Durand, F.: What do different evaluation metrics tell us about saliency models? arXiv preprint arXiv:1604.03605 (2016)
  11. Bylinskii, Z., Recasens, A., Borji, A., Oliva, A., Torralba, A., Durand, F.: Where should saliency models look next? In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 809–824. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_49
  12. Cerf, M., Harel, J., Huth, A., Einhäuser, W., Koch, C.: Decoding what people see from where they look: predicting visual stimuli from scanpaths. In: Paletta, L., Tsotsos, J.K. (eds.) WAPCV 2008. LNCS (LNAI), vol. 5395, pp. 15–26. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00582-4_2
  13. Cornia, M., Baraldi, L., Serra, G., Cucchiara, R.: Predicting human eye fixations via an LSTM-based saliency attentive model. arXiv preprint arXiv:1611.09571 (2016)
  14. Einhäuser, W., Spain, M., Perona, P.: Objects predict fixations better than early saliency. J. Vis. 8(14), 18 (2008). https://doi.org/10.1167/8.14.18
  15. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: Advances in Neural Information Processing Systems, pp. 545–552 (2006)
  16. Huang, X., Shen, C., Boix, X., Zhao, Q.: SALICON: reducing the semantic gap in saliency prediction by adapting deep neural networks. In: 2015 IEEE International Conference on Computer Vision (ICCV). IEEE (2015). https://doi.org/10.1109/iccv.2015.38
  17. Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20(11), 1254–1259 (1998). https://doi.org/10.1109/34.730558
  18. Itti, L.: Quantifying the contribution of low-level saliency to human eye movements in dynamic scenes. Vis. Cogn. 12(6), 1093–1123 (2005). https://doi.org/10.1080/13506280444000661
  19. Itti, L., Borji, A.: Computational models: bottom-up and top-down aspects. In: The Oxford Handbook of Attention. Oxford University Press, Oxford (2014)
  20. Jetley, S., Murray, N., Vig, E.: End-to-end saliency mapping via probability distribution prediction. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2016). https://doi.org/10.1109/cvpr.2016.620
  21. Jiang, M., Huang, S., Duan, J., Zhao, Q.: SALICON: saliency in context. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2015). https://doi.org/10.1109/cvpr.2015.7298710
  22. Jost, T., Ouerhani, N., Wartburg, R.V., Müri, R., Hügli, H.: Assessing the contribution of color in visual attention. Comput. Vis. Image Underst. 100(1–2), 107–123 (2005). https://doi.org/10.1016/j.cviu.2004.10.009
  23. Judd, T., Durand, F., Torralba, A.: A benchmark of computational models of saliency to predict human fixations. MIT CSAIL Technical Report (2012). hdl:1721.1/68590
  24. Judd, T., Ehinger, K., Durand, F., Torralba, A.: Learning to predict where humans look. In: 2009 IEEE 12th International Conference on Computer Vision. IEEE (2009). https://doi.org/10.1109/iccv.2009.5459462
  25. Kienzle, W., Franz, M.O., Schölkopf, B., Wichmann, F.A.: Center-surround patterns emerge as optimal predictors for human saccade targets. J. Vis. 9(5), 7 (2009). https://doi.org/10.1167/9.5.7
  26. Koch, C., Ullman, S.: Shifts in selective visual attention: towards the underlying neural circuitry. Hum. Neurobiol. 4, 219–227 (1985). https://cseweb.ucsd.edu/classes/fa09/cse258a/papers/koch-ullman-1985.pdf
  27. Koehler, K., Guo, F., Zhang, S., Eckstein, M.P.: What do saliency models predict? J. Vis. 14(3), 14 (2014). https://doi.org/10.1167/14.3.14
  28. Kruthiventi, S.S.S., Ayush, K., Babu, R.V.: DeepFix: a fully convolutional neural network for predicting human eye fixations. IEEE Trans. Image Process. 26(9), 4446–4456 (2017). https://doi.org/10.1109/tip.2017.2710620
  29. Kümmerer, M.: pysaliency. https://github.com/matthias-k/pysaliency
  30. Kümmerer, M., Theis, L., Bethge, M.: Deep Gaze I: boosting saliency prediction with feature maps trained on ImageNet. In: International Conference on Learning Representations (ICLR) Workshop Track (2015). arXiv:1411.1045
  31. Kümmerer, M., Wallis, T.S.A., Gatys, L.A., Bethge, M.: Understanding low- and high-level contributions to fixation prediction. In: The IEEE International Conference on Computer Vision (ICCV). IEEE (2017)
  32. Kümmerer, M., Wallis, T.S.A., Bethge, M.: Information-theoretic model comparison unifies saliency metrics. Proc. Natl. Acad. Sci. USA 112(52), 16054–16059 (2015). https://doi.org/10.1073/pnas.1510393112
  33. Le Meur, O., Baccino, T.: Methods for comparing scanpaths and saliency maps: strengths and weaknesses. Behav. Res. Methods 45(1), 251–266 (2013). https://doi.org/10.3758/s13428-012-0226-9
  34. Li, Z.: A saliency map in primary visual cortex. Trends Cogn. Sci. 6(1), 9–16 (2002). https://doi.org/10.1016/s1364-6613(00)01817-9
  35. Nuthmann, A., Einhäuser, W., Schütz, I.: How well can saliency models predict fixation selection in scenes beyond central bias? A new approach to model evaluation using generalized linear mixed models. Front. Hum. Neurosci. 11, 491 (2017). https://doi.org/10.3389/fnhum.2017.00491
  36. Pan, J., et al.: SalGAN: visual saliency prediction with generative adversarial networks. arXiv preprint arXiv:1701.01081 (2017)
  37. Peters, R.J., Iyer, A., Itti, L., Koch, C.: Components of bottom-up gaze allocation in natural images. Vis. Res. 45(18), 2397–2416 (2005). https://doi.org/10.1016/j.visres.2005.03.019
  38. Riche, N.: Metrics for saliency model validation. In: From Human Attention to Computational Attention, pp. 209–225. Springer, New York (2016). https://doi.org/10.1007/978-1-4939-3435-5_12
  39. Riche, N.: Saliency model evaluation. In: From Human Attention to Computational Attention, pp. 245–267. Springer, New York (2016). https://doi.org/10.1007/978-1-4939-3435-5_14
  40. Riche, N., Duvinage, M., Mancas, M., Gosselin, B., Dutoit, T.: Saliency and human fixations: state-of-the-art and study of comparison metrics. In: 2013 IEEE International Conference on Computer Vision. IEEE (2013). https://doi.org/10.1109/iccv.2013.147
  41. Rothkopf, C.A., Ballard, D.H., Hayhoe, M.M.: Task and context determine where you look. J. Vis. 7(14), 16 (2007). https://doi.org/10.1167/7.14.16
  42. Schütt, H.H., Rothkegel, L.O.M., Trukenbrod, H.A., Reich, S., Wichmann, F.A., Engbert, R.: Likelihood-based parameter estimation and comparison of dynamical cognitive models. Psychol. Rev. 124(4), 505–524 (2017). https://doi.org/10.1037/rev0000068
  43. Tatler, B.W., Hayhoe, M.M., Land, M.F., Ballard, D.H.: Eye guidance in natural vision: reinterpreting salience. J. Vis. 11(5), 5 (2011). https://doi.org/10.1167/11.5.5
  44. Tatler, B.W.: The central fixation bias in scene viewing: selecting an optimal viewing position independently of motor biases and image feature distributions. J. Vis. 7(14), 4 (2007). https://doi.org/10.1167/7.14.4
  45. Tatler, B.W., Baddeley, R.J., Gilchrist, I.D.: Visual correlates of fixation selection: effects of scale and time. Vis. Res. 45(5), 643–659 (2005). https://doi.org/10.1016/j.visres.2004.09.017
  46. Tatler, B.W., Vincent, B.T.: Systematic tendencies in scene viewing. J. Eye Mov. Res. 2(2), 1–18 (2008). http://csi.ufs.ac.za/resres/files/tatler_2008_jemr.pdf
  47. Thomas, C.: OpenSalicon: an open source implementation of the SALICON saliency model. arXiv preprint arXiv:1606.00110 (2016)
  48. Treisman, A.M., Gelade, G.: A feature-integration theory of attention. Cogn. Psychol. 12(1), 97–136 (1980). https://doi.org/10.1016/0010-0285(80)90005-5
  49. Vig, E., Dorr, M., Cox, D.: Large-scale optimization of hierarchical features for saliency prediction in natural images. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2014). https://doi.org/10.1109/cvpr.2014.358
  50. Vincent, B.T., Baddeley, R., Correani, A., Troscianko, T., Leonards, U.: Do we look at lights? Using mixture modelling to distinguish between low- and high-level factors in natural image viewing. Vis. Cogn. 17(6–7), 856–879 (2009). https://doi.org/10.1080/13506280902916691
  51. Wilming, N., Betz, T., Kietzmann, T.C., König, P.: Measures and limits of models of fixation selection. PLoS ONE 6(9), e24038 (2011). https://doi.org/10.1371/journal.pone.0024038
  52. Xiao, J., Xu, P., Zhang, Y., Ehinger, K., Finkelstein, A., Kulkarni, S.: What can we learn from eye tracking data on 20,000 images? J. Vis. 15(12), 790 (2015). https://doi.org/10.1167/15.12.790
  53. Yu, F., et al.: Large-scale scene understanding challenge. http://lsun.cs.princeton.edu/2017/
  54. Yu, F., et al.: SALICON saliency prediction challenge. http://salicon.net/challenge-2017/
  55. Zhang, J., Sclaroff, S.: Saliency detection: a Boolean map approach. In: 2013 IEEE International Conference on Computer Vision. IEEE (2013). https://doi.org/10.1109/iccv.2013.26
  56. Zhang, L., Tong, M.H., Marks, T.K., Shan, H., Cottrell, G.W.: SUN: a Bayesian framework for saliency using natural statistics. J. Vis. 8(7), 32 (2008). https://doi.org/10.1167/8.7.32

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
  2. Wilhelm-Schickard Institute for Computer Science (Informatik), University of Tübingen, Tübingen, Germany
