Abstract
Pre-processing for data mining is generally regarded as a necessary step that removes irrelevant and meaningless aspects of data before data mining algorithms are applied. From this viewpoint, we have studied pre-processing for constructing a decision tree, and have previously proposed the notion of Information Theoretical Abstraction and implemented it in a system called ITA. Given a relational database and a family of possible abstractions for its attribute values, called an abstraction hierarchy, ITA selects the best abstraction among the possible ones, so that the class distributions needed to perform the classification task are preserved, and generalizes the database according to that abstraction. In our previous experiment, a single application of abstraction to the whole database was effective in reducing the size of the resulting decision tree without degrading classification accuracy. However, because classification systems such as C4.5 select attributes repeatedly, one after another, ITA does not in general guarantee that class distributions are preserved over a sequence of attribute selections. For this reason, in this paper we propose a new version of ITA, called iterative ITA, which tries to preserve the class distributions as far as possible at each attribute-selection step.
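The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact criterion ITA uses, so the sketch assumes a stand-in: each candidate abstraction is scored by how much information about the class is lost when attribute values are replaced by their abstract values, and the candidate with the smallest loss is chosen. The function names (`entropy`, `information_gain`, `select_abstraction`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy obtained by grouping rows by attribute value."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_abstraction(values, labels, abstractions):
    """Return (name, loss) of the candidate that loses the least class information.

    `abstractions` maps a candidate name to a dict from concrete attribute
    values to abstract values. The loss criterion is an assumption made for
    this sketch, not the measure defined in the paper.
    """
    base = information_gain(values, labels)

    def loss(name):
        mapping = abstractions[name]
        abstracted = [mapping[v] for v in values]
        return base - information_gain(abstracted, labels)

    best = min(abstractions, key=loss)
    return best, loss(best)


# Toy data (hypothetical): one attribute and a binary class.
values = ["apple", "pear", "beef", "pork"]
labels = ["yes", "yes", "no", "no"]
abstractions = {
    "by_kind": {"apple": "fruit", "pear": "fruit", "beef": "meat", "pork": "meat"},
    "arbitrary": {"apple": "g1", "beef": "g1", "pear": "g2", "pork": "g2"},
}
best, best_loss = select_abstraction(values, labels, abstractions)
```

Under the same assumption, an iterative variant would call `select_abstraction` again on the subset of rows that reaches each node of the tree, rather than once for the whole database.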
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kudoh, Y., Haraguchi, M. (2000). An Appropriate Abstraction for Constructing a Compact Decision Tree. In: Arikawa, S., Morishita, S. (eds) Discovery Science. DS 2000. Lecture Notes in Computer Science, vol 1967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44418-1_33
DOI: https://doi.org/10.1007/3-540-44418-1_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41352-3
Online ISBN: 978-3-540-44418-3
eBook Packages: Springer Book Archive