Efficient construction of comprehensible hierarchical clusterings

  • Luis Talavera
  • Javier Béjar
Communications Session 4. Clustering and Discretization
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1510)


Abstract

Clustering is an important data mining task that helps to find useful patterns for summarizing data. In the KDD context, data mining is often used for description rather than for prediction. However, few clustering systems, in either the statistics or the Machine Learning fields, ease the interpretation task for the user. In this paper we present Isaac, a hierarchical clustering system that combines traditional clustering ideas with a feature selection mechanism and heuristics in order to provide comprehensible results. At the same time, it can efficiently handle large datasets by means of a preprocessing step. Results suggest that these aims are achieved and encourage further research.
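The abstract describes Isaac's recipe (hierarchical clustering combined with feature selection, plus a preprocessing step for scale) only at a high level. As a rough illustration of how those two ingredients can fit together, the following minimal, self-contained Python sketch applies a crude feature filter before building a cluster hierarchy. It is not the Isaac algorithm: the variance-based filter, the single-linkage merging rule, and all names and parameters are assumptions chosen for brevity.

from itertools import combinations


def select_features(data, k):
    """Keep the k features with the highest variance (a crude relevance filter)."""
    n_features = len(data[0])
    variances = []
    for j in range(n_features):
        column = [row[j] for row in data]
        mean = sum(column) / len(column)
        variances.append(sum((x - mean) ** 2 for x in column) / len(column))
    keep = sorted(range(n_features), key=lambda j: variances[j], reverse=True)[:k]
    return [[row[j] for j in keep] for row in data]


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerate(data):
    """Single-linkage agglomerative clustering; returns the merge history."""
    clusters = {i: [i] for i in range(len(data))}
    history = []
    while len(clusters) > 1:
        # Pick the two closest clusters under single linkage.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(
                euclidean(data[i], data[j])
                for i in clusters[pair[0]]
                for j in clusters[pair[1]]
            ),
        )
        clusters[a] = clusters[a] + clusters.pop(b)
        history.append(tuple(sorted(clusters[a])))
    return history


if __name__ == "__main__":
    points = [[0.0, 0.1, 5.0], [0.2, 0.0, 5.1], [3.0, 0.1, 0.0], [3.1, 0.0, 0.2]]
    reduced = select_features(points, k=2)  # drop the least informative feature
    print(agglomerate(reduced))             # merge order of the resulting hierarchy

In the actual system, the feature selection and heuristics are tied to the comprehensibility of the resulting hierarchy rather than to a simple variance criterion; the sketch only shows where such a filter sits in the pipeline.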



Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Luis Talavera (1)
  • Javier Béjar (1)
  1. Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Barcelona, Catalonia, Spain
