Using More Initial Centers for the Seeding-Based Semi-Supervised K-Harmonic Means Clustering
In the initialization of traditional semi-supervised k-means, the mean of the labeled data belonging to one class is taken as one initial center, so the number of initial centers equals the number of clusters. However, this initialization method, which relies on a small amount of labeled data (also called seeds), is not appropriate for semi-supervised k-harmonic means clustering, which is insensitive to the initial centers. In this paper, a novel semi-supervised k-harmonic means clustering method is proposed. The seeds of each class are divided into several groups, and the mean of the data in each group is taken as one initial center. Therefore, the number of initial centers in the new method is greater than the number of clusters. To investigate the effectiveness of the approach, several experiments are conducted on three datasets. Experimental results show that the presented method can improve clustering performance compared with other traditional semi-supervised clustering algorithms.
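The seeding step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how seeds are divided into groups, so the random round-robin split, the function name `seed_group_centers`, and the parameter `groups_per_class` are all assumptions made for illustration. The sketch only produces the initial centers; the k-harmonic means iterations that would follow are not shown.

```python
import random
from collections import defaultdict


def seed_group_centers(seeds, labels, groups_per_class, rng=None):
    """Return (center, class_label) pairs, one center per seed group.

    seeds: list of feature vectors (lists of floats)
    labels: class label of each seed
    groups_per_class: hypothetical parameter, number of groups per class
    """
    rng = rng or random.Random(0)

    # Collect the seeds of each class.
    by_class = defaultdict(list)
    for x, y in zip(seeds, labels):
        by_class[y].append(x)

    centers = []
    for y in sorted(by_class):
        points = by_class[y][:]
        rng.shuffle(points)  # assumption: groups are formed at random
        g = min(groups_per_class, len(points))  # avoid empty groups
        for i in range(g):
            group = points[i::g]  # round-robin split into g groups
            dim = len(group[0])
            mean = [sum(p[d] for p in group) / len(group) for d in range(dim)]
            centers.append((mean, y))
    return centers


# Two classes with four seeds each, two groups per class.
seeds = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0],
         [10.0, 10.0], [12.0, 10.0], [10.0, 12.0], [12.0, 12.0]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
centers = seed_group_centers(seeds, labels, groups_per_class=2)
print(len(centers))  # 4 initial centers for 2 clusters
```

With `groups_per_class=1` the sketch reduces to the traditional seeding, where each class contributes exactly one initial center.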
Keywords: Semi-supervised clustering; K-harmonic means clustering; K-means clustering; Seeds
This research is supported by the Open Foundation of the Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, China (No.2011-01). This research is also supported by the Scientific Research Foundation of Nanjing University of Posts and Telecommunications (No.NY210078).