Unsupervised Classifier Based on Heuristic Optimization and Maximum Entropy Principle

Aldana-Bobadilla, Edwin; Kuri-Morales, Angel

doi:10.1007/978-3-319-01128-8_2


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 227)


Abstract

One of the basic endeavors in Pattern Recognition, and particularly in Data Mining, is the process of determining which unlabeled objects in a set share interesting properties. This implies a singular process of classification usually denoted as "clustering", where the objects are grouped into k subsets (clusters) in accordance with an appropriate measure of likelihood. Clustering can be considered the most important unsupervised learning problem. The more traditional clustering methods are based on the minimization of a similarity criterion defined by a metric or distance. This fact imposes important constraints on the geometry of the clusters found. Since each element in a cluster lies within a radial distance relative to a given center, the shape of the covering or hull of a cluster is hyper-spherical (convex), which sometimes does not adequately encompass the elements that belong to it. For this reason we propose to solve the clustering problem through the optimization of Shannon's Entropy. The optimization of this criterion represents a hard combinatorial problem which rules out the use of traditional optimization techniques, and thus a very efficient optimization technique is necessary. We consider that Genetic Algorithms are a good alternative. We show that our method obtains successful results for problems where the clusters have complex spatial arrangements. The method obtains clusters with non-convex hulls that adequately encompass their elements. We statistically show that our method displays the best performance that can be achieved under the assumption of normal distribution of the elements of the clusters. We also show that it is a good alternative when this assumption is not met.
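
To make the idea in the abstract concrete, the following is a minimal sketch of searching over cluster labelings with a small elitist genetic algorithm whose fitness is built from Shannon's entropy. It is not the chapter's method: the abstract does not state the exact objective, constraints, encoding, or genetic operators, so the function names (`shannon_entropy`, `total_cluster_entropy`, `evolve_labels`), the sum-of-cluster-entropies fitness, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def total_cluster_entropy(symbols, labels, k):
    """Sum of per-cluster entropies; empty clusters contribute 0.

    ASSUMPTION: a toy stand-in objective, not the one used in the chapter.
    """
    total = 0.0
    for j in range(k):
        members = [s for s, lab in zip(symbols, labels) if lab == j]
        if members:
            total += shannon_entropy(members)
    return total


def evolve_labels(fitness, n, k, pop_size=30, generations=50, p_mut=0.05, seed=0):
    """Maximize `fitness` over label vectors of length n with values in 0..k-1.

    ASSUMPTION: a plain elitist GA (keep the better half, one-point
    crossover, per-gene mutation); the chapter may use a different variant.
    """
    rng = random.Random(seed)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]           # survivors of this generation
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)        # pick two distinct parents
            cut = rng.randrange(1, n)          # one-point crossover position
            child = a[:cut] + b[cut:]
            # Mutate each gene independently with probability p_mut.
            child = [rng.randrange(k) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)


# Hypothetical data: objects already discretized into four symbols.
symbols = [0, 0, 1, 1, 2, 2, 3, 3]
best = evolve_labels(
    lambda labels: total_cluster_entropy(symbols, labels, 2),
    n=len(symbols),
    k=2,
)
print(best, total_cluster_entropy(symbols, best, 2))
```

Because the candidate solution is a label vector rather than a set of centers, nothing in this search forces clusters to be hyper-spherical, which is the motivation the abstract gives for replacing a distance criterion with an entropy-based one.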



Author information

Authors and Affiliations

Universidad Nacional Autónoma de México, Mexico City, Mexico
Edwin Aldana-Bobadilla & Angel Kuri-Morales
Instituto Tecnológico Autónomo de México, Mexico City, Mexico
Edwin Aldana-Bobadilla & Angel Kuri-Morales


Editor information

Editors and Affiliations

Leiden Institute of Advanced, Leiden University, Niels Bohrweg 1, Leiden, 2333CA, Netherlands
Michael Emmerich
Leiden Institute of Advanced, Leiden University, Niels Bohrweg 1, Leiden, 2333CA, Netherlands
Andre Deutz
Depto. de Computacion, CINVESTAV-IPN, Av. IPN No. 2508, Col. San Pedro Zacatenco, Mexico, 07360, Mexico
Oliver Schuetze
Leiden Institute of Advanced, Leiden University, Niels Bohrweg 1, Leiden, 2333CA, Netherlands
Thomas Bäck
Luxembourg Centre for, University of Luxembourg, Avenue des Hauts Fourneaux 7, Belval, 4362, Luxembourg
Emilia Tantar
Computer Science and, University of Luxembourg, rue Richard Coudenhove-Kalergi 6, Luxembourg, 1359, Luxembourg
Alexandru-Adrian Tantar
Bordeaux Mathematical Institute, Université Bordeaux I, cours de la Libération 351, Talence cedex, 33405, France
Pierre Del Moral
UFR Sciences et Modélisation, Université Bordeaux Segalen, 3ter place de la Victoire, Bordeaux, 33076, France
Pierrick Legrand
Faculty of Sci., Tech. & Communication, Computer Science and, University of Luxembourg, rue Richard Coudenhove-Kalergi 6, Luxembourg, 1359, Luxembourg
Pascal Bouvry
Depto. de Computación, CINVESTAV-IPN, Av. IPN No. 2508, Col. San Pedro Zacatenco, Mexico, 07360, Mexico
Carlos A. Coello


About this paper

Cite this paper

Aldana-Bobadilla, E., Kuri-Morales, A. (2013). Unsupervised Classifier Based on Heuristic Optimization and Maximum Entropy Principle. In: Emmerich, M., et al. EVOLVE - A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation IV. Advances in Intelligent Systems and Computing, vol 227. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-01128-8_2


DOI: https://doi.org/10.1007/978-3-319-01128-8_2
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-01127-1
Online ISBN: 978-3-319-01128-8
eBook Packages: Engineering, Engineering (R0)
