Abstract
Clustering under pairwise constraints is an important knowledge discovery tool that enables the learning of appropriate kernels or distance metrics to improve clustering performance. These pairwise constraints, which come in the form of must-link and cannot-link pairs, arise naturally in many applications and are intuitive for users to provide. However, the common practice of relaxing discrete constraints to a continuous domain to ease optimization when learning kernels or metrics can harm generalization, because information that encodes only linkage is then treated as information about distances. We introduce a new constrained clustering algorithm that jointly clusters data and learns a kernel in accordance with the available pairwise constraints. To generalize well, our method is designed to maximize constraint satisfaction without relaxing pairwise constraints to a continuous domain where they inform distances. We show that the proposed method outperforms existing approaches on a large number of diverse publicly available datasets, and we discuss how our method can scale to large datasets.
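To make the notion of constraint satisfaction concrete, the following minimal sketch counts how many must-link and cannot-link pairs a given cluster assignment respects, treating each constraint as a discrete yes/no condition rather than a distance. It is an illustration only and is not taken from the authors' code; the function name `constraint_satisfaction`, its arguments, and the toy data are assumptions introduced here.

```python
from typing import Sequence, Tuple

Pair = Tuple[int, int]  # indices of two data points


def constraint_satisfaction(
    labels: Sequence[int],
    must_link: Sequence[Pair],
    cannot_link: Sequence[Pair],
) -> float:
    """Fraction of pairwise constraints satisfied by a cluster assignment.

    Hypothetical helper for illustration: a must-link pair counts as
    satisfied when both points share a cluster, a cannot-link pair when
    they do not. No distances are involved.
    """
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    # With no constraints there is nothing to violate.
    return 1.0 if total == 0 else satisfied / total


# Toy example (made-up data): five points assigned to three clusters.
labels = [0, 0, 1, 1, 2]
must_link = [(0, 1), (2, 3), (0, 4)]   # (0, 4) is violated
cannot_link = [(1, 2), (3, 4)]         # both satisfied
score = constraint_satisfaction(labels, must_link, cannot_link)
print(f"{score:.2f}")
```

A score of this kind depends only on which points end up together, so it can be used to compare candidate partitions or kernels without converting the constraints into target distances.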
Notes
Code available at github.com/autonlab/constrained-clustering.
We evaluated a range of alternatives to establish hard baselines. A final LCVQE partitioning provided the best performance compared to other options such as k-means or COP-Kmeans.
Acknowledgements
This work was partially supported by a Space Technology Research Institutes grant (80NSSC19K1052) from NASA’s Space Technology Research Grants Program and by Defense Advanced Research Projects Agency’s award FA8750-17-2-0130.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Boecking, B., Jeanselme, V. & Dubrawski, A. Constrained clustering and multiple kernel learning without pairwise constraint relaxation. Adv Data Anal Classif (2022). https://doi.org/10.1007/s11634-022-00507-5
DOI: https://doi.org/10.1007/s11634-022-00507-5