Comparison of Three Objective Functions for Conceptual Clustering

  • Céline Robardet
  • Fabien Feschet
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2168)


Unsupervised clustering algorithms aim to summarize a dataset so that similar objects are grouped together and dissimilar ones are separated. In data analysis, it is often useful to have tools for interpreting the resulting clusters. For symbolic attributes, some existing criteria rely on estimating the frequency of attribute-value pairs. Our approach is to integrate the construction of the interpretation into the clustering process itself. To this end, we propose an algorithm that produces two partitions, one of the set of objects and one of the set of attribute-value pairs, such that the two partitions are as strongly associated as possible. In this article, we study several functions for evaluating the strength of this association.
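
As an illustration of what "association between two partitions" can mean, the following minimal sketch scores a pair of partitions on a binary object by attribute-value table. It assumes the Goodman-Kruskal tau coefficient as the association measure; the abstract does not name the functions the paper compares, so this choice, along with the helper names `cooccurrence_table` and `goodman_kruskal_tau` and the toy data, is an assumption made for illustration only.

```python
from typing import List, Sequence


def cooccurrence_table(
    data: Sequence[Sequence[int]],
    row_labels: Sequence[int],
    col_labels: Sequence[int],
    n_row_clusters: int,
    n_col_clusters: int,
) -> List[List[int]]:
    """Count how often an object cluster co-occurs with an attribute-value cluster.

    data[i][j] is 1 when object i has attribute-value pair j, else 0.
    """
    table = [[0] * n_col_clusters for _ in range(n_row_clusters)]
    for i, row in enumerate(data):
        for j, present in enumerate(row):
            if present:
                table[row_labels[i]][col_labels[j]] += 1
    return table


def goodman_kruskal_tau(table: Sequence[Sequence[int]]) -> float:
    """Goodman-Kruskal tau: proportional reduction in error when predicting
    the column cluster once the row cluster is known."""
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("empty co-occurrence table")
    col_sums = [sum(col) for col in zip(*table)]
    # Sum of squared column marginals (predictability without the row cluster).
    base = sum((c / total) ** 2 for c in col_sums)
    # Same quantity, conditioned on the row cluster.
    cond = 0.0
    for row in table:
        row_sum = sum(row)
        if row_sum > 0:
            cond += sum((n / total) ** 2 for n in row) / (row_sum / total)
    if base >= 1.0:
        # Only one non-empty column cluster: nothing to predict.
        return 0.0
    return (cond - base) / (1.0 - base)


# Toy data (hypothetical): four objects, two symbolic attributes A and B,
# columns ordered as A=a1, B=b1, A=a2, B=b2.
data = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
col_labels = [0, 0, 1, 1]

aligned = cooccurrence_table(data, [0, 0, 1, 1], col_labels, 2, 2)
shuffled = cooccurrence_table(data, [0, 1, 0, 1], col_labels, 2, 2)
tau_aligned = goodman_kruskal_tau(aligned)
tau_shuffled = goodman_kruskal_tau(shuffled)
```

In this toy case the object partition that matches the block structure of the data gets a higher score than the one that cuts across it, which is the kind of behaviour an objective function for the two-partition search would need. Other association measures could be substituted for `goodman_kruskal_tau` without changing the table construction.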


Keywords: unsupervised clustering, conceptual clustering, association measures



Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Céline Robardet (1)
  • Fabien Feschet (1)
  1. Laboratoire d’Analyse des Systèmes de Santé, Université Lyon 1, Villeurbanne cedex, France
