The Visual Computer, Volume 28, Issue 11, pp 1063–1084

Using Normalized Compression Distance for image similarity measurement: an experimental study

  • Pere-Pau Vázquez
  • Jordi Marco
Original Article

Abstract

Similarity metrics are widely used in computer graphics. In this paper, we concentrate on a recent metric based on algorithmic complexity, the Normalized Compression Distance (NCD), a universal distance used to compare strings. This measure has also been used in computer graphics for image registration and viewpoint selection. However, no previous study examines how the measure should be used, in particular which compressor and image format are the most suitable. This paper presents a practical study of NCD applied to color images. The questions we try to answer are: Is NCD a suitable metric for image comparison? How robust is it to rotation, translation, and scaling? Which image formats and compression algorithms are the most adequate? The results of our study show that NCD can be used to address some of the selected image comparison problems, but care must be taken in choosing the compressor and image format.
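
For readers unfamiliar with the measure, the following is a minimal sketch of how NCD is commonly computed, using the standard definition NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed size in bytes. The choice of zlib as the compressor and the helper names `compressed_size` and `ncd` are assumptions made for illustration; they are not taken from the paper, whose point is precisely that the compressor and image format must be chosen with care.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of `data` after zlib compression at the highest level.

    zlib is an illustrative stand-in for the compressor C; it is not
    necessarily one of the compressors evaluated in the study.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings.

    Values near 0 suggest the inputs share most of their information;
    values near 1 suggest they share very little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation of both inputs
    return (cxy - min(cx, cy)) / max(cx, cy)
```

To compare two images under this sketch, one would read each file as raw bytes (for example with `open(path, "rb").read()`) and pass both to `ncd`. The result then depends on the on-disk image format as well as on the compressor, and zlib's 32 KB window limits how much of the first input it can exploit when compressing the concatenation.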

Keywords

Image similarity, Normalized Compression Distance, Kolmogorov complexity

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Departament de Llenguatges i Sistemes Informàtics (LSI), Universitat Politècnica de Catalunya, Barcelona, Spain
