Saliency and visual attention have been studied computationally for decades, mostly with the aim of predicting spatial topographical saliency maps or simulated heatmaps. In humans, however, spatial selection by an attentive mechanism is inherently a sequential sampling process. Recent efforts have analyzed and modeled scanpaths, but there is as yet no universal agreement on which metrics should be used to measure scanpath similarity or the quality of a scanpath predicted by a computational model. Many similarity measures have been suggested in different contexts, and little is known about their behavior or properties. This paper brings together a review of these metrics, an axiomatic analysis of gaze metrics for scanpaths, and a careful analysis of the discriminative power of different metrics, in order to provide a roadmap for future analysis. This is accompanied by experiments based on classic modeling strategies that simulate sequential selection from traditional saliency representations, and on deep neural networks that produce sequences by construction. The experiments provide strong support for the necessity of sequential analysis of attention and for certain metrics, including a family of metrics introduced in this paper that is motivated by the notion of scanpath plausibility.
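To illustrate why a purely spatial comparison can miss sequential structure, the following minimal sketch compares two scanpaths with dynamic time warping, one of several sequence measures found in the literature. It is not the family of metrics introduced in the paper; the function name `dtw_distance`, the representation of a scanpath as a list of (x, y) fixations, and the example coordinates are all assumptions made for illustration.

```python
import math


def dtw_distance(path_a, path_b):
    """Dynamic time warping cost between two scanpaths.

    Each scanpath is assumed to be a list of (x, y) fixation coordinates.
    The local cost is the Euclidean distance between paired fixations.
    """
    n, m = len(path_a), len(path_b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning the first i
    # fixations of path_a with the first j fixations of path_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(path_a[i - 1], path_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # path_a advances alone
                cost[i][j - 1],      # path_b advances alone
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Hypothetical example: the same three fixations visited in opposite order.
forward = [(0, 0), (3, 4), (6, 8)]
reverse = list(reversed(forward))

# The set of fixated locations is identical for both scanpaths...
same_locations = set(forward) == set(reverse)
# ...but the order-sensitive distance separates them.
order_sensitive_gap = dtw_distance(forward, reverse)
```

In this toy case the two scanpaths would yield the same fixation map, yet the sequence measure reports a nonzero distance, which is the kind of distinction the abstract argues a scanpath metric should capture.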
Cite this article
Fahimi, R., Bruce, N.D.B. On metrics for measuring scanpath similarity. Behav Res (2020). https://doi.org/10.3758/s13428-020-01441-0
- Visual attention
- Eye movement