Using More Initial Centers for the Seeding-Based Semi-Supervised K-Harmonic Means Clustering

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 236)

Abstract

In the initialization of traditional semi-supervised k-means, the mean of the labeled data belonging to one class is taken as one initial center, so the number of initial centers equals the number of clusters. However, this initialization method, which uses a small amount of labeled data (also called seeds), is not appropriate for semi-supervised k-harmonic means clustering, which is insensitive to the initial centers. In this paper, a novel semi-supervised k-harmonic means clustering method is proposed. The seeds of each class are divided into several groups, and the mean of the data in each group is taken as one initial center. Therefore, the number of initial centers in the new method is larger than the number of clusters. To investigate the effectiveness of the approach, several experiments are conducted on three datasets. Experimental results show that the presented method can improve clustering performance compared to other traditional semi-supervised clustering algorithms.
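The seeding step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the seeds of a class are divided into groups, so the round-robin split below is an assumption made only for illustration, and the names `group_seed_centers` and `groups_per_class` are hypothetical. The sketch covers only the initialization, not the k-harmonic means iterations.

```python
from collections import defaultdict


def group_seed_centers(seeds, labels, groups_per_class=2):
    """Build initial centers from labeled seeds, several per class.

    Assumption: the seeds of one class are split round-robin into
    `groups_per_class` groups. The abstract does not specify how the
    groups are formed, so this split is illustrative only.
    """
    by_class = defaultdict(list)
    for point, label in zip(seeds, labels):
        by_class[label].append(point)

    centers, center_labels = [], []
    for label in sorted(by_class):
        points = by_class[label]
        # Never create more groups than there are seeds in the class.
        n_groups = min(groups_per_class, len(points))
        for g in range(n_groups):
            group = points[g::n_groups]  # round-robin slice, never empty
            dim = len(group[0])
            # The mean of the group becomes one initial center.
            mean = [sum(p[d] for p in group) / len(group) for d in range(dim)]
            centers.append(mean)
            center_labels.append(label)
    return centers, center_labels


# Toy data: four seeds of class 0 and two seeds of class 1.
seeds = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0),
         (10.0, 10.0), (12.0, 10.0)]
labels = [0, 0, 0, 0, 1, 1]
centers, center_labels = group_seed_centers(seeds, labels, groups_per_class=2)
```

With two classes and two groups per class, the sketch yields four initial centers for two clusters, matching the abstract's point that the number of initial centers exceeds the number of clusters. Keeping a class label for each center is also an assumption here; it is one simple way to map several centers back to the same cluster afterwards.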

Keywords

Semi-supervised clustering, K-harmonic means clustering, K-means clustering, Seeds

Notes

Acknowledgments

This research is supported by the Open Foundation of the Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, China (No.2011-01). This research is also supported by the Scientific Research Foundation of Nanjing University of Posts and Telecommunications (No.NY210078).


Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. The Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai, China
  2. School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, China
