
Evaluation of similarity measures for video retrieval


Similarity measures are crucial in information retrieval, and a wide variety of distance/similarity measures have been proposed in the literature. In video retrieval, videos are represented as multi-dimensional feature vectors. Once these feature vectors are extracted from video shots, retrieval is performed primarily by measuring the similarity between the respective videos’ feature vectors. Consequently, retrieval quality can be greatly improved by careful selection of the distance measure. This paper presents an extensive analysis of the most commonly used similarity measures for video retrieval. The results are supported by a multifaceted analysis covering multiple challenging video datasets, retrieval curves, and confusion matrices. The major contribution of this paper is an investigation of the effectiveness of the common similarity measures from a video retrieval perspective. This should give researchers in the field the knowledge required to select the most suitable distance measure for their video retrieval work.
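As a rough illustration of why the choice of measure matters, the sketch below ranks a few shots against a query under three common distance measures. The feature vectors, the shot names, and the choice of Euclidean, Manhattan, and cosine distance are assumptions made for this example; the abstract does not specify which measures or features the paper evaluates.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(query, database, distance):
    """Return shot names ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: distance(query, database[name]))


# Hypothetical 4-dimensional feature vectors (illustrative values only).
database = {
    "shot_a": [0.9, 0.1, 0.0, 0.0],
    "shot_b": [0.3, 0.3, 0.2, 0.2],
    "shot_c": [0.05, 0.0, 0.0, 0.0],
}
query = [1.0, 0.0, 0.0, 0.0]

for distance in (euclidean, manhattan, cosine_distance):
    print(distance.__name__, rank(query, database, distance))
```

On this toy data the three measures return three different orderings: cosine distance ignores vector magnitude and so places shot_c first, while the two magnitude-sensitive measures place shot_a first and disagree on the remaining two shots.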




  1. The terms similarity measure and distance metric are used interchangeably in this paper.

  2. Average ± standard deviation.



Author information



Corresponding author

Correspondence to Saddam Bekhet.


Cite this article

Bekhet, S., Ahmed, A. Evaluation of similarity measures for video retrieval. Multimed Tools Appl 79, 6265–6278 (2020).



  • Distance metrics
  • Similarity measures
  • Video retrieval
  • Video matching