Skip to main content
Log in

Composite kernels for semi-supervised clustering

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

A critical problem related to kernel-based methods is how to select optimal kernels. A kernel function must conform to the learning target in order to obtain meaningful results. While solutions to the problem of estimating optimal kernel functions and corresponding parameters have been proposed in a supervised setting, it remains a challenge when no labeled data are available, and all we have is a set of pairwise must-link and cannot-link constraints. In this paper, we address the problem of optimizing the kernel function using pairwise constraints for semi-supervised clustering. We propose a new optimization criterion for automatically estimating the optimal parameters of composite Gaussian kernels, directly from the data and given constraints. We combine our proposal with a semi-supervised kernel-based algorithm to demonstrate experimentally the effectiveness of our approach. The results show that our method is very effective for kernel-based semi-supervised clustering.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Amini MR, Gallinari P (2005) Semi-supervised learning with an imperfect supervisor. Knowl Inf Syst 8: 385–413

    Article  Google Scholar 

  2. Bach F, Lanckriet GRG, Jordan MI (2004) Multiple kernel learning, conic duality, and the SMO algorithm. In: International conference on machine learning, pp 41–48

  3. Bar-Hillel A, Hertz T, Shental N, Weinshall D (2003) Learning distance functions using equivalence relations. In: International conference on machine learning, pp 11–18

  4. Basu S, Bilenko M, Mooney RJ (2004) A probabilistic framework for semi-supervised clustering. In: International conference on knowledge discovery and data mining, pp 59–68

  5. Blake CL, Merz CJ (1998) UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html

  6. Blaszczyk J, Karbowski A, Malinowski K (2007) Object library of algorithms for dynamic optimization problems: benchmaring SQP and nonlinear interior point methods. Int J Appl Math Comput Sci 17(4): 515–537

    Article  MathSciNet  Google Scholar 

  7. Chapelle O, Vapnik V (2002) Choosing mutiple parameters for support vector machines. Machine Learn 46(1): 131–159

    Article  MATH  Google Scholar 

  8. Chen Y, Rege M, Dong M, Hua J (2008) Non-negative matrix factorization for semi-supervised data clustering. Knowl Inf Syst 17: 355–379

    Article  Google Scholar 

  9. Cohn D, Caruana R, McCallum A (2003) Semi-supervised clustering with user feedback. TR2003-1892, Cornell University

  10. Cristianini N, Shawe-Taylor J, Elisseeff A (2001) On kernel-target alignment, neural information processing systems 14. MIT Press, Cambridge, pp 367–373

    Google Scholar 

  11. Davis J, Kulis B, Jain P, Sra S, Dhillon IS (2007) Information-theoretic metric learning. In: International conference on machine learning, pp 209–216

  12. Han SP (2007) A globally convergent method for nonlinear programming. J Optim Theory Appl 22(3): 297–309

    Article  Google Scholar 

  13. Hastie T, Tibshirani R, Friedman JH (2001) The elements of statistical learning. Springer, Berlin

    MATH  Google Scholar 

  14. Huang J, Yuen PC, Chen WS, Lai JH (2004) Kernel subspace LDA with optimized kernel parameters on face recognition. In: The sixth IEEE international conference on automatic face and gesture recognition, pp 327–332

  15. Kondor R, Jebara T (2007) Gaussian and Wishart hyperkernels. In: Advances in neural information processing systems vol 19. MIT Press, Cambridge, pp 729–736

  16. Kulis B, Basu S, Dhillon I, Moony R (2005) Semi-supervised graph clustering: a kernel approach. In: International conference on machine learning, pp 457–464

  17. Lanckriet GRG, Cristianini N, Bartlett P, El Ghaoui L, Jordan MI (2004) Learning the kernel matrix with semidefinite programming. J Machine Learn Res 5: 27–72

    MathSciNet  Google Scholar 

  18. Lanckriet GRG, De Bie T, Cristianini N, Jordan MI, Noble WS (2004) A statistical framework for genomic data fusion. Bioinformatics 20(16): 2626–2635

    Article  Google Scholar 

  19. Li T, Ogihara M (2004) Semisupervised learning from different information sources. Knowl Inf Syst 7: 289–309

    Article  MATH  Google Scholar 

  20. Ong CS, Smola AJ, Williamson RC (2003) Hyperkernels. In: Advances in neural information processing systems vol 15. MIT Press, Cambridge, pp 478–485

  21. Powell MJD (1978) A fast algorithm for nonlinearly constrained optimization calculations. In: Watson GA (ed) Numerical analysis. Lecture notes in mathematics, Springer, Berlin 630:144–157

  22. Powell MJD (1978) The convergence of variable metric methods for nonlinearly constrained optimization calculations. In: Mangasarian OL, Meyer RR, Robinson SM (eds) Nonlinear programming 3. Academic Press, New York

    Google Scholar 

  23. Shawe-Tayor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambrige University, Cambrige

    Google Scholar 

  24. Strehl A, Ghosh J (2002) Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J Machine Learn Res 3(3): 583–617

    Article  MathSciNet  Google Scholar 

  25. Theodoridis S, Koutroubas K (1999) Pattern recognition. Academic Press, New York

    Google Scholar 

  26. Vapnik V (1995) The nature of statistical learning theory. Wiley, New York

    MATH  Google Scholar 

  27. Wagstaff K, Cardie C, Rogers S, Schroedl S (2001) Constrained K-Means clustering with background knowledge. In: International conference on machine learning, pp 577–584

  28. Wang W, Xu Z, Lu W, Zhang X (2002) Determination of the spread parameter in the Gaussian kernel for classification and regression. Neurocomputing 55(3–4): 643–663

    Google Scholar 

  29. Xing EP, Ng AY, Jordan MI, Russell S (2003) Distance metric learning, with application to clustering with side-information. In: Advances in neural information processing systems vol 15. MIT Press, Cambridge, pp 505–512

  30. Yan B, Domeniconi C (2006) An adaptive kernel method for semi-supervised clustering. In: European conference on machine learning, pp 521–532

  31. Ye J, Chen K, Wu T, Li J, Zhao Z, Patel R, Bae M, Janardan R, Liu H, Alexander G, Reiman E (2008) Heterogeneous data fusion for Alzheimer’s disease study. In: The 4th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1025–1033

  32. Ye J, Ji S, Chen J (2008) Multi-class discriminant kernel learning via convex programming. J Machine Learn Res 9: 719–758

    MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Carlotta Domeniconi.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Domeniconi, C., Peng, J. & Yan, B. Composite kernels for semi-supervised clustering. Knowl Inf Syst 28, 99–116 (2011). https://doi.org/10.1007/s10115-010-0318-8

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-010-0318-8

Keywords

Navigation