Constrained clustering and multiple kernel learning without pairwise constraint relaxation

  • Regular Article
Advances in Data Analysis and Classification

Abstract

Clustering under pairwise constraints is an important knowledge discovery tool that enables the learning of appropriate kernels or distance metrics to improve clustering performance. These pairwise constraints, which come in the form of must-link and cannot-link pairs, arise naturally in many applications and are intuitive for users to provide. However, the common practice of relaxing discrete constraints to a continuous domain to ease optimization when learning kernels or metrics can harm generalization, because information that encodes only linkage is converted into information about distances. We introduce a new constrained clustering algorithm that jointly clusters data and learns a kernel in accordance with the available pairwise constraints. To generalize well, our method is designed to maximize constraint satisfaction without relaxing pairwise constraints to a continuous domain where they inform distances. We show that the proposed method outperforms existing approaches on a large number of diverse publicly available datasets, and we discuss how our method can scale to large datasets.
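To make the notion of constraint satisfaction concrete, the following is a minimal sketch of how must-link and cannot-link pairs can be scored against a hard cluster assignment without converting them into distances. It is an illustration only, not the authors' algorithm; the function name `constraint_satisfaction` and the toy labels and pairs are assumptions introduced here.

```python
import numpy as np


def constraint_satisfaction(labels, must_link, cannot_link):
    """Return the fraction of pairwise constraints satisfied by `labels`.

    Hypothetical helper for illustration; not taken from the paper's code.
    A must-link pair (i, j) is satisfied when both points share a cluster,
    a cannot-link pair (i, j) when they fall in different clusters.
    """
    labels = np.asarray(labels)
    satisfied = 0
    for i, j in must_link:
        satisfied += int(labels[i] == labels[j])
    for i, j in cannot_link:
        satisfied += int(labels[i] != labels[j])
    total = len(must_link) + len(cannot_link)
    # With no constraints at all, every assignment is trivially consistent.
    return satisfied / total if total else 1.0


# Toy data (assumed): five points, three must-link and two cannot-link pairs.
must_link = [(0, 1), (2, 3), (1, 2)]
cannot_link = [(0, 4), (3, 4)]

# Two candidate partitions of the same five points.
score_a = constraint_satisfaction([0, 0, 1, 1, 1], must_link, cannot_link)
score_b = constraint_satisfaction([0, 0, 0, 0, 1], must_link, cannot_link)
print(score_a, score_b)
```

The score depends only on whether two points share a cluster, so it stays discrete: a candidate partition (or a kernel that induces it) can be compared with another by this count alone, with no statement about how far apart the linked points should be.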

Notes

  1. Code available at github.com/autonlab/constrained-clustering.

  2. Data: https://github.com/EpistasisLab/penn-ml-benchmarks.

  3. We evaluated a range of alternatives to establish hard baselines. A final LCVQE partitioning provided the best performance compared to other options such as k-means or COP-Kmeans.

Acknowledgements

This work was partially supported by a Space Technology Research Institutes grant (80NSSC19K1052) from NASA’s Space Technology Research Grants Program and by Defense Advanced Research Projects Agency’s award FA8750-17-2-0130.

Author information

Corresponding author

Correspondence to Benedikt Boecking.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Cite this article

Boecking, B., Jeanselme, V. & Dubrawski, A. Constrained clustering and multiple kernel learning without pairwise constraint relaxation. Adv Data Anal Classif (2022). https://doi.org/10.1007/s11634-022-00507-5

  • DOI: https://doi.org/10.1007/s11634-022-00507-5
