Automatic Clustering Using a Genetic Algorithm with New Solution Encoding and Operators
Abstract
Genetic algorithms (GAs) are randomized search and optimization techniques that have proven robust and effective on large-scale problems. In this work, we propose a new GA approach to the automatic clustering problem: ACGA, the Automatic Clustering Genetic Algorithm. It is capable of finding the optimal number of clusters in a dataset and correctly assigning each data point to a cluster without any prior knowledge about the data. It adopts an encoding scheme that had not previously been tested with GAs, together with newly developed genetic operators. Any cluster validity function can serve as the fitness function. Experimental validation shows that this approach outperforms the classical clustering methods K-means and fuzzy c-means (FCM). The method gives good results and requires only a small number of iterations to converge.
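To make the role of the fitness function concrete, the following minimal sketch evaluates one candidate partition with the Calinski-Harabasz index, the validity index named in the keywords. It is not the ACGA implementation: the abstract does not describe the encoding or the genetic operators, so the flat label vector used here and the function name `calinski_harabasz` are assumptions made only for illustration.

```python
import numpy as np


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index of a hard partition (higher is better).

    Assumption: a candidate solution is given as a flat label vector,
    one cluster id per data point. This is an illustrative choice,
    not the encoding proposed in the paper.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    n = points.shape[0]
    clusters = np.unique(labels)
    k = clusters.size
    if k < 2 or k >= n:
        # The index is undefined for a single cluster or when every
        # point is its own cluster; treat such candidates as worst.
        return float("-inf")

    overall_mean = points.mean(axis=0)
    between = 0.0  # between-cluster dispersion
    within = 0.0   # within-cluster dispersion
    for c in clusters:
        members = points[labels == c]
        centroid = members.mean(axis=0)
        between += members.shape[0] * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((members - centroid) ** 2)

    if within == 0.0:
        # All points coincide with their centroids: perfectly compact.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Two well-separated pairs of points, scored under two candidate labelings.
data = [[0, 0], [0, 1], [10, 10], [10, 11]]
good_score = calinski_harabasz(data, [0, 0, 1, 1])  # groups nearby points
bad_score = calinski_harabasz(data, [0, 1, 0, 1])   # mixes the two groups
```

A GA that maximizes this value would prefer the first labeling over the second. Because the number of distinct labels is free to vary between candidates, the same score can compare partitions with different cluster counts.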
Keywords
Genetic Algorithms, Clustering, Calinski-Harabasz index, K-means, Fuzzy C-Means