Abstract
Because it can exploit training datasets containing both labeled and unlabeled patterns, semi-supervised learning (SSL) has received attention from the community throughout the last decade. Several SSL approaches to data clustering have been proposed and investigated as well. Unlike typical SSL setups, in semi-supervised clustering (SSC) the partial supervision is generally not given as class labels attached to a subset of the training sample. Instead, SSC algorithms generally rely on additional constraints that bring weak, a priori side-knowledge into the clustering process. Significant instances are COP-COBWEB, COP k-means, HMRF k-means, seeded k-means, constrained k-means, and active fuzzy constrained clustering. This chapter surveys the major SSC philosophies, setups, and techniques, giving the reader an insight into these notions by categorizing and reviewing the major state-of-the-art approaches to SSC.
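To make the notion of constraint-based side-knowledge concrete, the following is a minimal sketch of a single assignment pass in the spirit of COP k-means, where supervision arrives as pairwise must-link and cannot-link constraints rather than class labels. It is an illustration only, not the chapter's code: the function names `violates` and `constrained_assign`, the toy data, and the fixed cluster centers are all assumptions introduced here.

```python
import numpy as np

def violates(i, cluster, labels, must_link, cannot_link):
    """Return True if placing point i in `cluster` breaks a pairwise constraint.

    labels[j] == -1 means point j has not been assigned yet.
    """
    for a, b in must_link:
        if i in (a, b):
            other = b if a == i else a
            # A must-link partner already placed elsewhere forbids this cluster.
            if labels[other] not in (-1, cluster):
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if a == i else a
            # A cannot-link partner already in this cluster forbids it.
            if labels[other] == cluster:
                return True
    return False

def constrained_assign(X, centers, must_link, cannot_link):
    """Assign each point to its nearest center that violates no constraint."""
    labels = np.full(len(X), -1)
    for i, x in enumerate(X):
        # Try clusters from nearest to farthest.
        order = np.argsort(np.linalg.norm(centers - x, axis=1))
        for c in order:
            if not violates(i, c, labels, must_link, cannot_link):
                labels[i] = c
                break
        else:
            raise ValueError(f"no feasible cluster for point {i}")
    return labels

# Toy data (hypothetical): two tight groups on a line.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.1, 0.0]])
centers = np.array([[0.0, 0.0], [1.0, 0.0]])

unconstrained = constrained_assign(X, centers, [], [])
constrained = constrained_assign(X, centers, must_link=[(2, 3)], cannot_link=[(0, 1)])
# unconstrained -> [0, 0, 1, 1]; constrained -> [0, 1, 1, 1]
```

In this toy run the cannot-link pair (0, 1) pushes point 1 away from its nearest center, which shows how a handful of pairwise constraints can reshape a partition without any class label being supplied. A full algorithm would alternate this pass with center updates until convergence.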
Notes
1. According to the statistical notion of sufficient statistics.
2. The authors introduce their algorithm as a partitional method, since it yields a flat partition of the data, corresponding to the top level of the resulting COBWEB hierarchy. Nevertheless, it is our conviction that COP-COBWEB is actually an HSSC approach, because the process is carried out hierarchically. The eventual selection of a partition from the dendrogram does not change this; selecting such a partition is common practice in the hierarchical framework.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Bongini, M., Schwenker, F., Trentin, E. (2015). On Semi-Supervised Clustering. In: Celebi, M. (eds) Partitional Clustering Algorithms. Springer, Cham. https://doi.org/10.1007/978-3-319-09259-1_9
DOI: https://doi.org/10.1007/978-3-319-09259-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09258-4
Online ISBN: 978-3-319-09259-1
eBook Packages: Engineering, Engineering (R0)