On metrics for measuring scanpath similarity


Saliency and visual attention have been studied in a computational context for decades, mostly with the aim of predicting spatial topographical saliency maps or simulated heatmaps. In humans, however, spatial selection by an attentive mechanism is inherently a sequential sampling process. There have been recent efforts to analyze and model scanpaths, yet there is as yet no universal agreement on which metrics should be used to measure scanpath similarity or the quality of a scanpath predicted by a computational model. Many similarity measures have been suggested in different contexts, and little is known about their behavior or properties. This paper presents, in one place, a review of these metrics, an axiomatic analysis of gaze metrics for scanpaths, and a careful analysis of the discriminative power of different metrics, in order to provide a roadmap for future analysis. This is accompanied by experiments based on classic modeling strategies for simulating sequential selection from traditional representations of saliency, and on deep neural networks that produce sequences by construction. The experiments provide strong support for the necessity of sequential analysis of attention, and for certain metrics, including a family of metrics introduced in this paper motivated by the notion of scanpath plausibility.
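To make the distinction between sequential and purely spatial comparison concrete, the following is a minimal sketch (not the paper's own implementation) of two commonly used scanpath measures: dynamic time warping, which respects fixation order, and the Hausdorff distance, which ignores it. A scanpath is assumed here to be a list of (x, y) fixation coordinates.

```python
import math

def dtw_scanpath_distance(p, q):
    """Order-sensitive dynamic-time-warping distance between two
    scanpaths, each given as a list of (x, y) fixation coordinates."""
    n, m = len(p), len(q)
    INF = float("inf")
    # cost[i][j] = DTW distance between the first i fixations of p
    # and the first j fixations of q
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(p[i - 1], q[j - 1])     # Euclidean step cost
            cost[i][j] = d + min(cost[i - 1][j],      # stall in q
                                 cost[i][j - 1],      # stall in p
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def hausdorff_distance(p, q):
    """Order-insensitive spatial distance: the largest nearest-neighbour
    distance from any fixation in one scanpath to the other."""
    def directed(a, b):
        return max(min(math.dist(x, y) for y in b) for x in a)
    return max(directed(p, q), directed(q, p))
```

Reversing a scanpath leaves the Hausdorff distance at zero while the DTW distance becomes nonzero, which illustrates why order-insensitive spatial metrics alone cannot assess the sequential structure that this paper argues is essential.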






Author information



Corresponding author

Correspondence to Ramin Fahimi.




Fig. 13

Standard deviation error bars and distribution of data for the spatial offset experiment in Fig. 7

Fig. 14

Standard deviation error bars and distribution of data for the ordinal offset experiment in Fig. 8

Fig. 15

Standard deviation error bars and distribution of data for the reverse ordinal offset experiment in Fig. 9


About this article


Cite this article

Fahimi, R., Bruce, N.D.B. On metrics for measuring scanpath similarity. Behav Res (2020). https://doi.org/10.3758/s13428-020-01441-0



  • Saliency
  • Visual attention
  • Eye movement
  • Scanpath