Journal of Global Optimization, Volume 37, Issue 1, pp 137–157

Application of the cross-entropy method to clustering and vector quantization

  • Dirk P. Kroese
  • Reuven Y. Rubinstein
  • Thomas Taimre
Original Paper

Abstract

We apply the cross-entropy (CE) method to problems in clustering and vector quantization. The CE algorithm for clustering involves the following iterative steps: (a) generate random clusters according to a specified parametric probability distribution, (b) update the parameters of this distribution according to the Kullback–Leibler cross-entropy. Through various numerical experiments, we demonstrate the high accuracy of the CE algorithm and show that it can generate near-optimal clusters for fairly large data sets. We compare the CE method with well-known clustering and vector quantization methods such as K-means, fuzzy K-means and linear vector quantization, and apply each method to benchmark and image analysis data.
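The two iterative steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the sampling distribution or the tuning parameters, so everything concrete here is an assumption made for illustration: candidate cluster centroids are drawn from independent Gaussians, the loss is the sum of squared distances to the nearest centroid, and the names `ce_clustering`, `clustering_loss`, `elite_frac`, and `alpha` are hypothetical. For a Gaussian family, minimizing the Kullback–Leibler cross-entropy against the best-performing samples reduces to refitting the mean and standard deviation to those samples, which is what step (b) does below.

```python
import numpy as np

def clustering_loss(data, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def ce_clustering(data, k, n_samples=100, elite_frac=0.1,
                  alpha=0.7, n_iter=50, seed=0):
    """Illustrative CE search over k centroids (assumed Gaussian sampling)."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    # Assumed initialization: every centroid starts at the data mean,
    # with a spread equal to the per-dimension data standard deviation.
    mu = np.tile(data.mean(axis=0), (k, 1))
    sigma = np.tile(data.std(axis=0), (k, 1))
    n_elite = max(2, int(elite_frac * n_samples))
    best, best_loss = None, np.inf
    for _ in range(n_iter):
        # (a) Generate random candidate centroid sets from the
        #     current parametric distribution.
        samples = rng.normal(mu, sigma, size=(n_samples, k, dim))
        losses = np.array([clustering_loss(data, c) for c in samples])
        order = np.argsort(losses)
        if losses[order[0]] < best_loss:
            best_loss = losses[order[0]]
            best = samples[order[0]].copy()
        # (b) Update the parameters from the elite (lowest-loss) samples;
        #     alpha is an assumed smoothing weight between old and new values.
        elite = samples[order[:n_elite]]
        mu = alpha * elite.mean(axis=0) + (1 - alpha) * mu
        sigma = alpha * elite.std(axis=0) + (1 - alpha) * sigma
    return best, best_loss
```

Calling `ce_clustering(data, k=2)` on an array of shape `(n_points, n_features)` returns the best centroid set found and its loss. Because the search is stochastic, results depend on the seed and on the assumed parameter values.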

Keywords

Cross-entropy method; Clustering; Vector quantization; Simulation; Global optimization

Copyright information

© Springer Science+Business Media B.V. 2006

Authors and Affiliations

  • Dirk P. Kroese (1)
  • Reuven Y. Rubinstein (2)
  • Thomas Taimre (1)
  1. Department of Mathematics, The University of Queensland, Brisbane, Australia
  2. Faculty of Industrial Engineering and Management, Technion, Haifa, Israel
