Skip to main content

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 227))

  • 850 Accesses

Abstract

One of the basic endeavors in Pattern Recognition and particularly in Data Mining is the process of determining which unlabeled objects in a set do share interesting properties. This implies a singular process of classification usually denoted as "clustering", where the objects are grouped into k subsets (clusters) in accordance with an appropriate measure of likelihood. Clustering can be considered the most important unsupervised learning problem. The more traditional clustering methods are based on the minimization of a similarity criteria based on a metric or distance. This fact imposes important constraints on the geometry of the clusters found. Since each element in a cluster lies within a radial distance relative to a given center, the shape of the covering or hull of a cluster is hyper-spherical (convex) which sometimes does not encompass adequately the elements that belong to it. For this reason we propose to solve the clustering problem through the optimization of Shannon’s Entropy. The optimization of this criterion represents a hard combinatorial problem which disallows the use of traditional optimization techniques, and thus, the use of a very efficient optimization technique is necessary. We consider that Genetic Algorithms are a good alternative. We show that our method allows to obtain successfull results for problems where the clusters have complex spatial arrangements. Such method obtains clusters with non-convex hulls that adequately encompass its elements. We statistically show that our method displays the best performance that can be achieved under the assumption of normal distribution of the elements of the clusters. We also show that this is a good alternative when this assumption is not met.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 169.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 219.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Casella, G., Robert, C.P.: Monte carlo statistical methods (1999)

    Google Scholar 

  2. De Sa, J.M.: Pattern recognition: concepts, methods, and applications. Springer (2001)

    Google Scholar 

  3. Digalakis, J., Margaritis, K.: An experimental study of benchmarking functions for genetic algorithms (2002)

    Google Scholar 

  4. Duda, R., Hart, P., Stork, D.: Pattern classification, Section 10, P. 6. John Wiley, New York (2001)

    Google Scholar 

  5. Eshelman, L.: The chc adaptive search algorithm. how to have safe search when engaging in nontraditional genetic recombination (1991)

    Google Scholar 

  6. Gallager, R.G.: Information theory and reliable communication (1968)

    Google Scholar 

  7. Halkidi, M., Batistakis, Y., Vazirgiannis, M.: On clustering validation techniques. Journal of Intelligent Information Systems 17(2), 107–145 (2001)

    Article  MATH  Google Scholar 

  8. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. (1998)

    Google Scholar 

  9. Hsu, C.W., Chang, C.C., Lin, C.J., et al.: A practical guide to support vector classification (2003)

    Google Scholar 

  10. Johnson, J.L.: Probability and statistics for computer science. Wiley Online Library (2003)

    Google Scholar 

  11. Kim, J.H., Myung, H.: Evolutionary programming techniques for constrained optimization problems (1997)

    Google Scholar 

  12. Kuri-Morales, A.: A statistical genetic algorithm (1999)

    Google Scholar 

  13. Kuri-Morales, A., Aldana-Bobadilla, E.: A comprehensive comparative study of structurally different genetic algorithms (sent for publication, 2013)

    Google Scholar 

  14. Kuri-Morales, A., Aldana-Bobadilla, E.: The Search for Irregularly Shaped Clusters in Data Mining. Kimito, Funatsu and Kiyoshi, Hasegawa (2011)

    Google Scholar 

  15. MacQueen, J., et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, California, USA, vol. 1, p. 14 (1967)

    Google Scholar 

  16. Mitchell, M.: An Introduction to Genetic Algorithms. MIT Press (1996)

    Google Scholar 

  17. Mitchell, M., Holland, J., Forrest, S.: When Will a Genetic Algorithm Outperform Hill Climbing? In: Advances of Neural Information Processing Systems 6, pp. 51–58. Morgan Kaufmann (1994)

    Google Scholar 

  18. Molga, M., Smutnicki, C.: Test functions for optimization needs, pp. 41–42 (2005), http://www.zsd.ict.pwr.wroc.pl/files/docs/functions.pdf (retrieved March 11, 2012)

  19. Pohlheim, H.: Geatbx: Genetic and evolutionary algorithm toolbox for use with matlab documentation (2012)

    Google Scholar 

  20. Rezaee, J., Hashemi, A., Nilsaz, N., Dezfouli, H.: Analysis of the strategies in heuristic techniques for solving constrained optimisation problems (2012)

    Google Scholar 

  21. Rudolph, G.: Convergence Analysis of Canonical Genetic Algorithms. IEEE Transactions on Neural Networks 5(1), 96–101 (1994)

    Article  Google Scholar 

  22. Rumelhart, D.E., Hintont, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)

    Article  Google Scholar 

  23. Sánchez-Ferrero, G., Arribas, J.: A statistical-genetic algorithm to select the most significant features in mammograms (2007)

    Google Scholar 

  24. Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review 5(1), 3–55 (2001)

    Article  Google Scholar 

  25. Steliga, K., Szynal, D.: On markov-type inequalities (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Aldana-Bobadilla, E., Kuri-Morales, A. (2013). Unsupervised Classifier Based on Heuristic Optimization and Maximum Entropy Principle. In: Emmerich, M., et al. EVOLVE - A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation IV. Advances in Intelligent Systems and Computing, vol 227. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-01128-8_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-01128-8_2

  • Publisher Name: Springer, Heidelberg

  • Print ISBN: 978-3-319-01127-1

  • Online ISBN: 978-3-319-01128-8

  • eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics