Graph-Based Clustering with Constraints

  • Rajul Anand
  • Chandan K. Reddy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6635)

Abstract

A common way to add background knowledge to clustering algorithms is to add constraints. Although some algorithms incorporate constraints into the clustering process, little attention has been given to graph-based clustering with constraints. In this paper, we propose a constrained graph-based clustering method and argue that adding constraints to the distance function before graph partitioning leads to better results. We also introduce a novel approach to adding constraints based on distance limit criteria. We examine how this distance limit approach performs in comparison with earlier approaches that use a fixed distance measure for constraints. The proposed approach and its variants are evaluated on UCI datasets and compared with other constrained clustering algorithms that embed constraints in a similar fashion.
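The general idea of folding constraints into the distance function before any graph is built can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function names `constrained_distances` and `knn_graph` are hypothetical, must-link pairs are pulled to the smallest nonzero distance observed in the data, cannot-link pairs are pushed to the largest, and a symmetric k-nearest-neighbour graph stands in for whatever graph the partitioning step would consume. Taking the bounds from the data is one possible alternative to a fixed constraint distance and is not claimed to match the paper's distance limit criteria.

```python
import numpy as np

def constrained_distances(X, must_link, cannot_link):
    """Euclidean distance matrix with pairwise constraints embedded.

    Assumption (not from the paper): must-link pairs get the smallest
    nonzero observed distance, cannot-link pairs get the largest.
    """
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    d_min = D[D > 0].min()  # data-derived lower bound
    d_max = D.max()         # data-derived upper bound
    for i, j in must_link:
        D[i, j] = D[j, i] = d_min
    for i, j in cannot_link:
        D[i, j] = D[j, i] = d_max
    return D

def knn_graph(D, k):
    """Symmetric boolean adjacency of the k-nearest-neighbour graph."""
    n = D.shape[0]
    A = np.zeros((n, n), dtype=bool)
    for i in range(n):
        order = np.argsort(D[i], kind="stable")
        neighbours = [j for j in order if j != i][:k]
        A[i, neighbours] = True
    return A | A.T  # keep an edge if either endpoint selected it

if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
    D = constrained_distances(X, must_link=[(0, 2)], cannot_link=[(0, 1)])
    print(knn_graph(D, k=1).astype(int))
```

Because the constraints are applied to the distances first, they influence which edges exist in the graph at all; a graph partitioner run afterwards then works on a neighbourhood structure that already reflects the background knowledge.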

Keywords

Clustering, constrained clustering, graph-based clustering

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Rajul Anand (1)
  • Chandan K. Reddy (1)
  1. Department of Computer Science, Wayne State University, Detroit, USA
