A Novel Initialization Method for Semi-supervised Clustering

Dang, Yanzhong; Xuan, Zhaoguo; Rong, Lili; Liu, Ming

doi:10.1007/978-3-642-15280-1_30

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6291)

Abstract

Semi-supervised clustering has attracted growing attention in recent years. For most semi-supervised clustering algorithms, a good initialization method produces high-quality seeds that help improve clustering accuracy. In practice, labeled samples are scarce and unlabeled ones are plentiful, yet most existing initialization methods set the unlabeled data aside, even though it may contain information useful for the clustering task. In this paper, we propose a novel initialization method that converts some of the unlabeled samples into labeled ones: the neighbors of the labeled samples are identified first, and then the known labels are propagated to the unlabeled samples. Experimental results show that the proposed initialization method can improve the performance of semi-supervised clustering.
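The two steps named in the abstract (find neighbors of labeled samples, then propagate labels) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how neighbors are defined, so the Euclidean `radius` test, the agreement rule, the single propagation pass, and the function names `expand_seeds` and `seed_centroids` are all assumptions made for illustration.

```python
import math


def expand_seeds(points, labels, radius=1.0):
    """Give an unlabeled point a label when all labeled points within
    `radius` of it agree on that label (assumed neighbor rule).

    points : list of coordinate tuples
    labels : list of the same length; None marks an unlabeled point
    """
    expanded = list(labels)
    # Only the originally labeled samples act as label sources (one pass).
    labeled = [(p, l) for p, l in zip(points, labels) if l is not None]
    for i, (p, l) in enumerate(zip(points, labels)):
        if l is not None:
            continue
        near = [ql for q, ql in labeled if math.dist(p, q) <= radius]
        # Propagate only when at least one neighbor exists and all agree.
        if near and len(set(near)) == 1:
            expanded[i] = near[0]
    return expanded


def seed_centroids(points, labels):
    """Average the labeled points of each class to get initial centroids."""
    groups = {}
    for p, l in zip(points, labels):
        if l is not None:
            groups.setdefault(l, []).append(p)
    return {
        l: tuple(sum(coord) / len(ps) for coord in zip(*ps))
        for l, ps in groups.items()
    }


# Toy data: two labeled samples, three unlabeled ones.
points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.5, 5.0), (2.5, 2.5)]
labels = ["a", None, "b", None, None]

expanded = expand_seeds(points, labels, radius=1.0)
centroids = seed_centroids(points, expanded)
```

In this toy run the two unlabeled points close to a labeled sample inherit its label, while the point midway between the groups stays unlabeled. The resulting centroids could then serve as seeds for a seeded k-means style algorithm.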

Author information

Authors and Affiliations

Institute of Systems Engineering, Dalian University of Technology, Dalian, 116024, China
Yanzhong Dang, Zhaoguo Xuan, Lili Rong & Ming Liu

Editor information

Editors and Affiliations

School of Computing and Mathematics, University of Ulster, Newtownabbey, Co. Antrim, BT37 0QB, UK
Yaxin Bi
Innovation and Technology Research Laboratory, University of Technology, 2007, Sydney, NSW, Australia
Mary-Anne Williams

About this paper

Cite this paper

Dang, Y., Xuan, Z., Rong, L., Liu, M. (2010). A Novel Initialization Method for Semi-supervised Clustering. In: Bi, Y., Williams, MA. (eds) Knowledge Science, Engineering and Management. KSEM 2010. Lecture Notes in Computer Science, vol 6291. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15280-1_30

DOI: https://doi.org/10.1007/978-3-642-15280-1_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15279-5
Online ISBN: 978-3-642-15280-1
eBook Packages: Computer Science, Computer Science (R0)