A Semi-supervised Clustering Algorithm Based on Must-Link Set

  • Haichao Huang
  • Yong Cheng
  • Ruilian Zhao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5139)

Abstract

Clustering analysis is traditionally considered an unsupervised learning process. In most cases, however, people have some prior or background knowledge before they perform the clustering. How to use this knowledge to improve cluster quality and make clustering more efficient has become a hot research topic in recent years. Must-Link and Cannot-Link constraints between instances are a common form of prior knowledge in many real applications. This paper presents the concept of the Must-Link Set and designs a new semi-supervised clustering algorithm, MLC-KMeans, which uses the Must-Link Set as an assistant centroid. A preliminary experiment on several UCI datasets confirms the effectiveness and efficiency of the algorithm.

Keywords

  • Semi-supervised Learning
  • Data Clustering
  • Constraint
  • MLC-KMeans Algorithm

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Haichao Huang (1)
  • Yong Cheng (1)
  • Ruilian Zhao (1)
  1. Computer Department, College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
