Image Classification Via LZ78 Based String Kernel: A Comparative Study

  • Ming Li
  • Yanong Zhu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)


Normalized Information Distance (NID) [1] is a general-purpose similarity metric based on the concept of Kolmogorov complexity. We have developed this notion into a valid kernel distance, called the LZ78-based string kernel [2], and have shown that it can be used effectively for a variety of 1D sequence classification tasks [3]. In this paper, we further demonstrate its applicability to 2D images. We report experiments with our technique on two real datasets: (i) a collection of real-life photographs and (ii) a collection of medical diagnostic images from Magnetic Resonance (MR) data. The classification results are compared with those of the original similarity metric (i.e., NID) and several conventional classification algorithms. In all cases, the proposed kernel approach performs as well as or better than the other candidate methods, with lower computational overhead.
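
As a rough illustration of the two ingredients named in the abstract, the sketch below computes the Normalized Compression Distance (NCD), the usual compressor-based approximation of NID, and performs a plain LZ78 phrase parsing of a byte string. It is not the kernel of [2], whose feature map is not reproduced here. The function names are hypothetical, zlib is only a stand-in compressor, and a 2D image is assumed to have been serialized to bytes beforehand (for example row by row).

```python
import zlib
from typing import List


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for Kolmogorov complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def lz78_phrases(data: bytes) -> List[bytes]:
    """Split data into LZ78 phrases: each new phrase is the shortest prefix
    of the remaining input that has not been seen as a phrase before."""
    seen = set()
    phrases = []
    current = b""
    for i in range(len(data)):
        current += data[i:i + 1]
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = b""
    if current:
        # Trailing input that matches an existing phrase is kept as a final phrase.
        phrases.append(current)
    return phrases


if __name__ == "__main__":
    a = b"abcabcabc" * 200          # highly repetitive sequence
    b = bytes(range(256)) * 8       # a different, less repetitive sequence
    print("NCD(a, a):", ncd(a, a))  # expected to be small
    print("NCD(a, b):", ncd(a, b))  # expected to be larger
    print(lz78_phrases(b"aababc"))
```

With a real compressor the NCD of a sequence with itself is small but generally not exactly zero, which is one reason a pairwise distance of this kind does not automatically yield a valid kernel.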


Compression Algorithm; Kolmogorov Complexity; Extracapsular Extension; String Kernel; Normalize Compression Distance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Li, M., Chen, X., Ma, B., Vitanyi, P.: The similarity metric. In: Proceedings of the 14th ACM-SIAM Symposium on Discrete Algorithms, pp. 863–872 (2003)
  2. Li, M., Sleep, R.M.: A LZ78-based string kernel. In: Li, X., Wang, S., Dong, Z.Y. (eds.) ADMA 2005. LNCS (LNAI), vol. 3584, pp. 678–689. Springer, Heidelberg (2005)
  3. Li, M., Sleep, R.M.: A robust approach to sequence classification. In: Proceedings of the 17th IEEE Conference on Tools with Artificial Intelligence, Hong Kong, China (2005)
  4. Li, M., Vitanyi, P.: An Introduction to Kolmogorov Complexity and Its Applications. Springer, Heidelberg (1997)
  5. Teahan, W.J., Harper, D.J.: Using compression-based language models for text categorization. In: Workshop on Language Modeling and Information Retrieval, Carnegie Mellon University, pp. 83–88 (2001)
  6. Benedetto, D., Caglioti, E., Loreto, V.: Language trees and zipping. Physical Review Letters 88 (2000)
  7. Cilibrasi, R., Vitanyi, P.: Clustering by compression. IEEE Transactions on Information Theory 51, 1523–1545 (2005)
  8. Lan, Y., Harvey, R.: Image classification using compression distance. In: Proceedings of the 2nd International Conference on Vision, Video and Graphics, Edinburgh (2005)
  9. Platt, J.: Sequential minimal optimization: A fast algorithm for training support vector machines. Microsoft Research Technical Report MSR-TR-98-14 (1998), Available at,
  10. Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Transactions on Information Theory 24, 530–536 (1978)
  11. Cleary, J., Witten, I.: Data compression using adaptive coding and partial string matching. IEEE Transactions on Communications COM-32, 396–402 (1984)
  12. Zhu, Y., Williams, S., Fisher, M., Zwiggelaar, R.: The use of grey-level profiles for detection of extracapsular extension of prostate cancer from MRI. In: Proceedings of Medical Image Understanding and Analysis, pp. 215–218 (2005)
  13. Bangham, A.J., Harvey, R., Ling, P., Aldridge, R.: Morphological scale-space preserving transforms in many dimensions. Journal of Electronic Imaging 5, 283–299 (1996)
  14. Keogh, E., Lonardi, S., Ratanamahatana, C.A.: Toward parameter free data mining. In: Proceedings of the 10th ACM SIGKDD, Seattle, Washington, USA, pp. 206–215 (2004)
  15. Burrows, M., Wheeler, D.J.: A block-sorting lossless data compression algorithm. SRC Research Report 124 (1994)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ming Li (1)
  • Yanong Zhu (1)
  1. School of Computing Sciences, University of East Anglia, Norwich, UK
