Abstract
Clustering algorithms that incorporate prior knowledge have been widely studied, and many good results have been reported in recent years. However, most existing algorithms implicitly assume that the prior information is complete, typically specified as labeled objects from every category. These methods degrade and behave unstably when the labeled classes are incomplete. In this paper, we propose a new type of prior knowledge based on partially labeled data. We then develop two novel semi-supervised clustering algorithms to address this new challenge. An empirical study on benchmark data shows that, with limited labeled examples, the proposed algorithms produce better results than existing baselines.
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Wang, C., Chen, W., Yin, P., Wang, J. (2007). Semi-supervised Clustering Using Incomplete Prior Knowledge. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds) Computational Science – ICCS 2007. ICCS 2007. Lecture Notes in Computer Science, vol 4487. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72584-8_25
DOI: https://doi.org/10.1007/978-3-540-72584-8_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72583-1
Online ISBN: 978-3-540-72584-8
eBook Packages: Computer Science, Computer Science (R0)