Kernel MDL to Determine the Number of Clusters

  • Ivan O Kyrgyzov
  • Olexiy O Kyrgyzov
  • Henri Maître
  • Marine Campedel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4571)

Abstract

In this paper we propose a new criterion, based on Minimum Description Length (MDL), to estimate the optimal number of clusters. This criterion, called Kernel MDL (KMDL), is particularly well suited to the kernel K-means clustering algorithm. Its formulation is based on the definition of MDL derived for the Gaussian Mixture Model (GMM). We demonstrate the efficiency of our approach on both synthetic data and real data such as SPOT5 satellite images.
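
The abstract does not give the KMDL formula, so the sketch below is not the authors' criterion. It only illustrates the general idea the abstract builds on: scoring each candidate number of clusters K with a generic two-part MDL for a Gaussian mixture, MDL(K) = -log L + (p/2) log n, and keeping the K with the smallest score. Everything in it is an assumption made for illustration: the 1-D synthetic data, plain (non-kernel) K-means with hard assignments, the parameter count p = 3K - 1, and the hypothetical helper names `kmeans_1d`, `mdl_score` and `select_k`.

```python
import math
import random


def kmeans_1d(xs, k, iters=50):
    """Plain Lloyd's algorithm on 1-D data; returns the k groups of points."""
    s = sorted(xs)
    n = len(s)
    # Deterministic initial centres at evenly spaced quantiles (an assumption).
    centres = [s[int((j + 0.5) * n / k)] for j in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: (x - centres[c]) ** 2)
            groups[nearest].append(x)
        new = [sum(g) / len(g) if g else centres[j] for j, g in enumerate(groups)]
        if new == centres:  # assignments are stable, stop early
            break
        centres = new
    return groups


def mdl_score(groups, var_floor=1e-6):
    """Generic two-part MDL: -log L + (p / 2) * log n (not the paper's KMDL)."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    nll = 0.0
    for g in groups:
        if not g:
            continue
        weight = len(g) / n
        mu = sum(g) / len(g)
        # Floor the variance so tiny clusters cannot drive the score to -inf.
        var = max(sum((x - mu) ** 2 for x in g) / len(g), var_floor)
        for x in g:
            log_pdf = -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
            nll -= math.log(weight) + log_pdf
    # 1-D mixture: (k - 1) weights + k means + k variances free parameters.
    n_params = 3 * k - 1
    return nll + 0.5 * n_params * math.log(n)


def select_k(xs, k_max=6):
    """Score K = 1..k_max and return the K with the smallest description length."""
    scores = {k: mdl_score(kmeans_1d(xs, k)) for k in range(1, k_max + 1)}
    return min(scores, key=scores.get), scores


# Synthetic data: three well-separated 1-D Gaussian clusters (illustrative only).
random.seed(0)
data = [random.gauss(centre, 1.0) for centre in (0.0, 10.0, 20.0) for _ in range(100)]
best_k, scores = select_k(data)
print("selected K:", best_k)
```

The penalty term grows with K while the likelihood term shrinks, so the minimum marks the point where extra clusters stop paying for their own description cost. A kernel variant would replace the Euclidean distances and variances with quantities computed in feature space, which this sketch does not attempt.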

Keywords

Synthetic Data; Gaussian Mixture Model; Gaussian Kernel; Minimum Description Length; Model Free Parameter
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Ivan O Kyrgyzov (1)
  • Olexiy O Kyrgyzov (2)
  • Henri Maître (1)
  • Marine Campedel (1)
  1. Competence Centre for Information Extraction and Image Understanding for Earth Observation, GET/Télécom Paris - LTCI, UMR 5141, CNRS, 46, rue Barrault, 75013 Paris, France
  2. Department of Computer Science and Electrical Engineering, OGI School of Science and Engineering, Oregon Health and Science University, 20000 NW Walker Road, Beaverton, OR 97006, USA