Data Clustering Based on an Efficient Hybrid of K-Harmonic Means, PSO and GA
Clustering is one of the most commonly used techniques in data mining. K-means is one of the most popular clustering techniques because of its simplicity and efficiency. However, it is sensitive to initialization and is easily trapped in local optima. K-harmonic means (KHM) clustering solves the initialization problem with a built-in boosting function, but it still tends to run into local optima. Particle Swarm Optimization (PSO) is a stochastic global optimization technique that is well suited to this problem. The PSOKHM hybrid not only helps KHM clustering escape from local optima but also overcomes the slow convergence of PSO. In this paper, a hybrid data clustering algorithm based on PSO and a genetic algorithm (GA), GSOKHM, is proposed. We also investigate a variant, called LSOKHM, that uses a local-optimum method in addition to the global-optimum method in PSO. Experimental results on five real datasets indicate that LSOKHM is superior to the GSOKHM algorithm.
Keywords: Data clustering, PSO, KHM, Genetic algorithm
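As a rough illustration of the objective that KHM-based hybrids typically minimize, the sketch below computes the K-harmonic means cost: for each data point, the harmonic mean of its p-th power distances to all cluster centers, summed over the data. This is a minimal sketch and not the authors' implementation. The function name `khm_objective`, the default `p=3.5`, and the `eps` clamp are assumptions made for illustration only.

```python
import math


def khm_objective(points, centers, p=3.5, eps=1e-12):
    """K-harmonic means cost: sum over points of k / sum_j (1 / d(x, c_j)^p).

    `p` and `eps` are illustrative defaults, not values taken from the paper.
    """
    k = len(centers)
    total = 0.0
    for x in points:
        # Accumulate reciprocal p-th power distances from x to every center.
        inv_sum = 0.0
        for c in centers:
            d = max(math.dist(x, c), eps)  # clamp to avoid division by zero
            inv_sum += 1.0 / d ** p
        total += k / inv_sum
    return total


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    good = [(0.0, 0.5), (10.0, 0.5)]  # one center per natural group
    bad = [(5.0, 0.0), (5.0, 1.0)]    # both centers between the groups
    print(khm_objective(data, good), khm_objective(data, bad))
```

In a PSO-based hybrid, a function of this kind could serve as the fitness of a particle that encodes a full set of cluster centers, with lower values indicating better clusterings.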