Generalized Conditional Entropy and a Metric Splitting Criterion for Decision Trees

  • Dan A. Simovici
  • Szymon Jaroszewicz
Conference paper

DOI: 10.1007/11731139_7

Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)
Cite this paper as:
Simovici D.A., Jaroszewicz S. (2006) Generalized Conditional Entropy and a Metric Splitting Criterion for Decision Trees. In: Ng WK., Kitsuregawa M., Li J., Chang K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg

Abstract

We examine a new approach to building decision trees by introducing a geometric splitting criterion based on the properties of a family of metrics on the space of partitions of a finite set. This criterion can be adapted to the characteristics of the data sets and the needs of the users, and it yields decision trees that are smaller and have fewer leaves than trees built with standard methods, with comparable or better accuracy.

Keywords

decision tree, generalized conditional entropy, metric, metric betweenness

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Dan A. Simovici (1)
  • Szymon Jaroszewicz (2)
  1. Dept. of Computer Science, University of Massachusetts at Boston, Boston
  2. Faculty of Computer and Information Systems, Technical University of Szczecin, Poland
