Semi-Supervised Kernel Clustering with Sample-to-Cluster Weights

  • Stefan Faußer
  • Friedhelm Schwenker
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7081)

Abstract

Collecting unlabelled data is often effortless, while labelling it can be difficult: either the amount of data is too large, or samples cannot be assigned a specific class label with certainty. In semi-supervised clustering, the aim is to place the cluster centres close to both their label-matching samples and unlabelled samples. Kernel-based clustering methods are known to improve clustering results by clustering in feature space. In this paper we propose a semi-supervised kernel-based clustering algorithm that convergently minimizes an error function with sample-to-cluster weights. These sample-to-cluster weights are set depending on the class label, i.e. matching, not matching, or unlabelled. The algorithm can use many kernel-based clustering methods, although we suggest Kernel Fuzzy C-Means, Relational Neural Gas and Kernel K-Means. We empirically evaluate the performance of this algorithm on two real-life data sets, namely Steel Plates Faults and MiniBooNE.
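
The idea of label-dependent sample-to-cluster weights can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose error function and update rules are not given in the abstract. It is a plain batch Kernel K-Means variant in which each cluster carries a class label and each sample pulls on a cluster centre with a weight chosen by whether its label matches, does not match, or is missing. The weight values, the Gaussian kernel, and all function names are assumptions made for illustration.

```python
import math

# Hypothetical weight values; the abstract does not give the actual settings.
W_MATCH, W_MISMATCH, W_UNLABELLED = 1.0, 0.1, 0.5


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel (an assumed choice; other kernels could be used)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def sample_to_cluster_weight(sample_label, cluster_label):
    """Weight of one sample with respect to one cluster (None = unlabelled)."""
    if sample_label is None:
        return W_UNLABELLED
    return W_MATCH if sample_label == cluster_label else W_MISMATCH


def feature_space_distance(i, members, w, K):
    """Squared distance from phi(x_i) to the weighted mean of `members`."""
    total = sum(w[j] for j in members)
    cross = sum(w[j] * K[i][j] for j in members) / total
    within = sum(w[j] * w[l] * K[j][l] for j in members for l in members) / total ** 2
    return K[i][i] - 2.0 * cross + within


def weighted_kernel_kmeans(X, labels, cluster_labels, assign, gamma=1.0, n_iter=20):
    """Batch kernel k-means with label-dependent sample-to-cluster weights."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    assign = list(assign)
    for _ in range(n_iter):
        # Collect the members of each cluster and their weights for that cluster.
        clusters = []
        for k, c in enumerate(cluster_labels):
            members = [j for j in range(n) if assign[j] == k]
            w = {j: sample_to_cluster_weight(labels[j], c) for j in members}
            clusters.append((members, w))
        # Reassign every sample to the nearest weighted centre in feature space.
        new_assign = []
        for i in range(n):
            best_k, best_d = assign[i], float("inf")
            for k, (members, w) in enumerate(clusters):
                if not members:
                    continue  # an empty cluster has no centre
                d = feature_space_distance(i, members, w, K)
                if d < best_d:
                    best_k, best_d = k, d
            new_assign.append(best_k)
        if new_assign == assign:
            break  # assignments are stable
        assign = new_assign
    return assign


# Toy data: two well-separated groups, one labelled sample in each.
X = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
labels = [0, None, None, 1, None, None]
result = weighted_kernel_kmeans(X, labels, cluster_labels=[0, 1], assign=[0, 1, 0, 1, 0, 1])
```

The centres are never formed explicitly: distances are computed from kernel values alone, which is what lets the clustering run in feature space. In this sketch a labelled sample that sits in a cluster of the wrong class contributes only weakly to that cluster's centre, while matching and unlabelled samples contribute more.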


Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Stefan Faußer (1)
  • Friedhelm Schwenker (1)
  1. Institute of Neural Information Processing, University of Ulm, Ulm, Germany
