Joint European Conference on Machine Learning and Knowledge Discovery in Databases

ECML PKDD 2015: Machine Learning and Knowledge Discovery in Databases pp 219-234

A Kernel-Learning Approach to Semi-supervised Clustering with Relative Distance Comparisons

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9284)

Abstract

We consider the problem of clustering a given dataset into k clusters subject to an additional set of constraints on relative distance comparisons between the data items. The additional constraints are meant to reflect side-information that is not directly expressed in the feature vectors. Relative comparisons can express structures at a finer level of detail than the must-link (ML) and cannot-link (CL) constraints that are commonly used for semi-supervised clustering. Relative comparisons are particularly useful in settings where giving an ML or a CL constraint is difficult because the granularity of the true clustering is unknown.

Our main contribution is an efficient algorithm for learning a kernel matrix using the log determinant divergence (a variant of the Bregman divergence) subject to a set of relative distance constraints. Given the learned kernel matrix, a clustering can be obtained by any suitable algorithm, such as kernel k-means. We show empirically that kernels found by our algorithm yield clusterings of higher quality than those of existing approaches, which either use ML/CL constraints or implement the supervision from relative comparisons by different means.
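The ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only evaluates the standard log determinant divergence between two positive definite kernel matrices, checks one relative comparison in the kernel-induced distance, and runs a plain kernel k-means on a given kernel matrix. The triplet form "a is closer to b than to c", the function names, and the `init` parameter are assumptions made for illustration; the abstract does not specify the exact form of the constraints.

```python
import numpy as np

def logdet_divergence(K, K0):
    """Log-det divergence tr(K K0^-1) - log det(K K0^-1) - n for positive definite K, K0."""
    n = K.shape[0]
    # K0^-1 K has the same trace and determinant as K K0^-1.
    M = np.linalg.solve(K0, K)
    _, logdet = np.linalg.slogdet(M)
    return np.trace(M) - logdet - n

def kernel_sq_dist(K, i, j):
    """Squared distance between items i and j in the feature space induced by K."""
    return K[i, i] + K[j, j] - 2.0 * K[i, j]

def satisfies_triplet(K, a, b, c):
    """Assumed constraint form: item a is closer to item b than to item c."""
    return kernel_sq_dist(K, a, b) < kernel_sq_dist(K, a, c)

def kernel_kmeans(K, k, init=None, n_iter=50, seed=0):
    """Plain kernel k-means (Lloyd-style) on a precomputed kernel matrix K."""
    n = K.shape[0]
    if init is None:
        labels = np.random.default_rng(seed).integers(0, k, size=n)
    else:
        labels = np.asarray(init).copy()
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)  # empty clusters stay at infinite distance
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue
            # ||phi(x_i) - mu_c||^2 = K_ii - (2/m) sum_j K_ij + (1/m^2) sum_jl K_jl
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m**2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy usage with a linear kernel standing in for a learned kernel matrix.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
K = X @ X.T
labels = kernel_kmeans(K, k=2, init=[0, 1, 0, 1, 0, 1])
```

In a kernel-learning setting, a divergence of this kind would serve as the objective that keeps the learned kernel close to an initial kernel, while the triplet checks play the role of the constraints; the clustering step is independent of how the kernel was obtained.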

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Helsinki Institute for Information Technology, and Department of Computer Science, Aalto University, Espoo, Finland
  2. Finnish Institute of Occupational Health, Helsinki, Finland
