Abstract
It is well known that some local search heuristics for \(K\)-clustering problems, such as the \(K\)-means heuristic for minimum sum-of-squares clustering, occasionally stop at a solution with fewer clusters than the desired number \(K\). Such solutions are called degenerate. In this paper, we show that degeneracy also occurs in the \(K\)-harmonic means (KHM) method, which was proposed as an alternative to the \(K\)-means heuristic that is less sensitive to the initial solution. In addition, we identify two types of degenerate solutions and provide examples of both. Based on these findings, we give a simple method to remove degeneracy during the execution of the KHM heuristic; it can be used as part of any other heuristic for the KHM clustering problem. We use the KHM heuristic within a recent variant of a variable neighborhood search (VNS) based heuristic. Extensive computational analysis, performed on test instances commonly used in the literature, shows that significant improvements are obtained when our degeneracy-correcting method is used within both KHM and VNS. Moreover, the VNS-based heuristic suggested here may be considered a new state-of-the-art heuristic for solving the KHM clustering problem.
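To make the notion of a degenerate solution concrete, the following is a minimal Python sketch of how one might check a set of cluster centers for degeneracy. The abstract does not define the two types, so the two conditions tested here (a center that attracts no entities under nearest-center assignment, and two centers that coincide) are assumptions chosen for illustration, and the function name `find_degeneracy` and the tolerance `tol` are hypothetical. It is not the correction method of the paper.

```python
import numpy as np

def find_degeneracy(X, C, tol=1e-8):
    """Report two illustrative (assumed) signs of degeneracy.

    X : (n, d) array of entities, C : (K, d) array of cluster centers.
    Returns (empty, coincident):
      empty      - indices of centers that are nearest to no entity
      coincident - index pairs (a, b), a < b, of centers closer than tol
    """
    X = np.asarray(X, dtype=float)
    C = np.asarray(C, dtype=float)
    K = C.shape[0]

    # Distance from every entity to every center, shape (n, K).
    dist = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)

    # Nearest-center assignment; ties go to the lowest center index.
    nearest = dist.argmin(axis=1)
    counts = np.bincount(nearest, minlength=K)
    empty = [k for k in range(K) if counts[k] == 0]

    # Pairs of centers that (numerically) coincide.
    coincident = [
        (a, b)
        for a in range(K)
        for b in range(a + 1, K)
        if np.linalg.norm(C[a] - C[b]) <= tol
    ]
    return empty, coincident
```

With two well-separated groups of entities and a third center placed far from all of them, the check reports that third center as empty; with two identical centers it reports the pair as coincident (and, because ties go to the lower index, the second of the pair also shows up as empty).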
Acknowledgments
The research of E. Carrizosa is partially supported by Grants MTM2009-14039 (Ministerio de Educación y Ciencia, Spain) and FQM329 (Junta de Andalucía, Spain). Part of this research was done while N. Mladenović was visiting the Instituto de Matemáticas de la Universidad de Sevilla (Grant SAB2009-0144, Ministerio de Educación y Ciencia, Spain).
About this article
Cite this article
Carrizosa, E., Alguwaizani, A., Hansen, P. et al. New heuristic for harmonic means clustering. J Glob Optim 63, 427–443 (2015). https://doi.org/10.1007/s10898-014-0175-1
DOI: https://doi.org/10.1007/s10898-014-0175-1