Prototype Based Classification Using Information Theoretic Learning

  • Th. Villmann
  • B. Hammer
  • F.-M. Schleif
  • T. Geweniger
  • T. Fischer
  • M. Cottrell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4233)


In this article we extend the recently published unsupervised information-theoretic vector quantization approach, which matches data and prototype densities via the Cauchy–Schwarz divergence, to supervised learning and classification. First, we generalize the unsupervised method from the Euclidean metric used in the original algorithm to more general metrics. We then extend the model to a supervised learning scheme, obtaining a fuzzy classification algorithm that admits fuzzy labels for both data and prototypes. Finally, we transfer the idea of relevance learning for metric adaptation, known from learning vector quantization, to the new approach.
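The density-matching idea can be made concrete as follows. For data points X and prototypes W, both smoothed by Gaussian Parzen estimators with bandwidth σ, the Cauchy–Schwarz divergence D_CS(p, q) = −log(⟨p, q⟩ / √(⟨p, p⟩⟨q, q⟩)) has a closed form, because the integral of a product of two Gaussians is again a Gaussian evaluated at the difference of their centers. The following is a minimal NumPy sketch of that computation, not the authors' code; the function names and the isotropic-Gaussian assumption are illustrative only.

```python
import numpy as np

def gauss_overlap(X, Y, sigma):
    """Closed-form inner product of two Gaussian Parzen estimates.

    Uses int G(x; x_i, s^2 I) G(x; y_j, s^2 I) dx = G(x_i - y_j; 0, 2 s^2 I),
    averaged over all sample/prototype pairs.
    """
    d = X.shape[1]
    diff = X[:, None, :] - Y[None, :, :]          # all pairwise differences
    sq = np.sum(diff ** 2, axis=2)                # squared Euclidean distances
    norm = (4.0 * np.pi * sigma ** 2) ** (-d / 2.0)
    return norm * np.mean(np.exp(-sq / (4.0 * sigma ** 2)))

def cs_divergence(X, W, sigma=1.0):
    """Cauchy-Schwarz divergence between Parzen estimates of data X
    and prototypes W: -log( <p,q> / sqrt(<p,p> <q,q>) ).  Zero iff the
    two density estimates coincide, positive otherwise."""
    pq = gauss_overlap(X, W, sigma)
    pp = gauss_overlap(X, X, sigma)
    qq = gauss_overlap(W, W, sigma)
    return -np.log(pq / np.sqrt(pp * qq))
```

Minimizing this quantity with respect to the prototype positions W (e.g., by gradient descent) drives the prototype density toward the data density; the supervised extension in the article additionally couples fuzzy label assignments of data and prototypes.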


Keywords: Mutual Information, Vector Quantization, Learning Vector Quantization, Information Theoretic Approach, Relevance Learning





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Th. Villmann (1)
  • B. Hammer (2)
  • F.-M. Schleif (3, 4)
  • T. Geweniger (3, 5)
  • T. Fischer (3)
  • M. Cottrell (6)

  1. Medical Department, University Leipzig, Germany
  2. Inst. of Computer Science, Clausthal University of Technology, Germany
  3. Inst. of Computer Science, University Leipzig, Germany
  4. BRUKER DALTONIK Leipzig, Germany
  5. Dep. of Computer Science, University of Applied Science Mittweida, Germany
  6. University Paris I Sorbonne-Panthéon, SAMOS, France
