On the Effects of Constraints in Semi-supervised Hierarchical Clustering

  • Hans A. Kestler
  • Johann M. Kraus
  • Günther Palm
  • Friedhelm Schwenker
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4087)

Abstract

We explore the use of constraints in divisive hierarchical clustering and discuss how including constraints affects the hierarchical clustering process. Furthermore, we introduce an implementation of a semi-supervised divisive hierarchical clustering algorithm and show how constraints influence the resulting hierarchy. Our main interest lies in building stable dendrograms when clustering different subsets of the data.
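
The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of how a single divisive step might respect pairwise constraints; it is not the authors' implementation. It assumes must-link constraints are given as index pairs and honors them by merging linked items into blocks that are always assigned to the same side of the split. The names `must_link_blocks`, `centroid`, and `constrained_bisect` are hypothetical, and the seeding rule (the two most distant block centroids) is an assumption made for the example.

```python
import math


def must_link_blocks(n, must_link):
    """Group indices 0..n-1 into blocks connected by must-link pairs (union-find)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for a, b in must_link:
        parent[find(a)] = find(b)
    blocks = {}
    for i in range(n):
        blocks.setdefault(find(i), []).append(i)
    return list(blocks.values())


def centroid(points, block):
    """Mean of the points whose indices are listed in block."""
    dim = len(points[0])
    return tuple(sum(points[i][d] for i in block) / len(block) for d in range(dim))


def constrained_bisect(points, must_link=()):
    """Split the items into two clusters without separating any must-link pair."""
    blocks = must_link_blocks(len(points), must_link)
    if len(blocks) < 2:
        # Everything is tied together by constraints: no split is possible.
        return sorted(i for b in blocks for i in b), []
    cents = [centroid(points, b) for b in blocks]
    # Assumed seeding rule: the two block centroids that are farthest apart.
    pairs = ((i, j) for i in range(len(cents)) for j in range(i + 1, len(cents)))
    s, t = max(pairs, key=lambda p: math.dist(cents[p[0]], cents[p[1]]))
    left, right = [], []
    for block, c in zip(blocks, cents):
        # A whole block moves together, so must-link pairs are never separated.
        nearer_left = math.dist(c, cents[s]) <= math.dist(c, cents[t])
        (left if nearer_left else right).extend(block)
    return sorted(left), sorted(right)


points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,)]
unconstrained = constrained_bisect(points)
constrained = constrained_bisect(points, must_link=[(2, 3)])
```

In this toy data set the must-link pair (2, 3) pulls item 2 across the split that the unconstrained step would otherwise make. Applying such a step recursively to each side would yield a dendrogram; cannot-link constraints are left out of this sketch.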

Keywords

Cluster Algorithm, Hierarchical Cluster, Data Item, Rand Index, Pairwise Constraint
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hans A. Kestler (1, 2)
  • Johann M. Kraus (1, 2)
  • Günther Palm (1)
  • Friedhelm Schwenker (1)

  1. Department of Neural Information Processing, University of Ulm, Ulm, Germany
  2. Department of Internal Medicine I, University Hospital Ulm, Ulm, Germany