New heuristic for harmonic means clustering
It is well known that some local search heuristics for \(K\)-clustering problems, such as the \(k\)-means heuristic for minimum sum-of-squares clustering, occasionally stop at a solution with fewer clusters than the desired number \(K\). Such solutions are called degenerate. In this paper, we show that degeneracy also occurs in the \(K\)-harmonic means (KHM) method, which was proposed as an alternative to the \(K\)-means heuristic that is less sensitive to the initial solution. In addition, we identify two types of degenerate solutions and provide examples of both. Based on these findings, we give a simple method for removing degeneracy during the execution of the KHM heuristic; it can be used as part of any other heuristic for the KHM clustering problem. We use the KHM heuristic within a recent variant of a variable neighborhood search (VNS) based heuristic. Extensive computational analysis, performed on test instances commonly used in the literature, shows that significant improvements are obtained when our simple degeneracy-correcting method is used within both KHM and VNS. Moreover, the VNS-based heuristic suggested here may be considered a new state-of-the-art heuristic for solving the KHM clustering problem.
Keywords: Clustering, \(K\)-harmonic means heuristic, Variable neighborhood search, Degeneracy
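To make the notion of degeneracy in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It evaluates the standard KHM objective (the sum over data points of the harmonic mean of the \(p\)-th powers of the distances to the \(K\) centers) and flags a solution as degenerate in the \(K\)-means sense described above, that is, when fewer than \(K\) centers are the nearest center of at least one point. The function names, the default `p=2.0`, and the small `eps` guard are assumptions made for illustration. The abstract does not spell out the two types of degeneracy found in the paper, so this sketch does not attempt to reproduce them.

```python
import numpy as np


def pairwise_distances(X, C):
    """Euclidean distances between points X (n, d) and centers C (K, d)."""
    return np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)


def khm_objective(X, C, p=2.0, eps=1e-12):
    """KHM objective: sum_i K / sum_k (1 / d_ik^p).

    p and eps are illustrative assumptions; eps only guards against
    division by zero when a center coincides with a data point.
    """
    d = np.maximum(pairwise_distances(X, C), eps)
    K = C.shape[0]
    return float(np.sum(K / np.sum(d ** (-p), axis=1)))


def count_nonempty_clusters(X, C):
    """Number of centers that are the nearest center of at least one point."""
    nearest = np.argmin(pairwise_distances(X, C), axis=1)
    return int(np.unique(nearest).size)


def is_degenerate(X, C):
    """True if fewer than K clusters are non-empty under nearest-center assignment."""
    return count_nonempty_clusters(X, C) < C.shape[0]


# Hypothetical toy data: two well-separated groups of two points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
C_ok = np.array([[0.0, 0.5], [10.0, 0.5]])                   # both centers attract points
C_bad = np.array([[0.0, 0.5], [10.0, 0.5], [100.0, 100.0]])  # third center attracts none

print(is_degenerate(X, C_ok), is_degenerate(X, C_bad))
print(khm_objective(X, C_ok))
```

A check of this kind could be run after each iteration of a local search to decide whether a correction step is needed; how the correction itself is carried out is described in the paper and is not reproduced here.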
The research of E. Carrizosa is partially supported by Grants MTM2009-14039 (Ministerio de Educación y Ciencia, Spain) and FQM329 (Junta de Andalucía, Spain). Part of this research was done while N. Mladenović was visiting the Instituto de Matemáticas de la Universidad de Sevilla (Grant SAB2009-0144, Ministerio de Educación y Ciencia, Spain).