An Appropriate Abstraction for Constructing a Compact Decision Tree

Kudoh, Yoshimitsu; Haraguchi, Makoto

doi:10.1007/3-540-44418-1_33

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1967)

Included in the following conference series:

International Conference on Discovery Science

Abstract

Pre-processing is generally regarded as a necessary step in data mining: it removes irrelevant and meaningless aspects of the data before a mining algorithm is applied. From this viewpoint, we have studied pre-processing for decision tree induction, proposed the notion of Information Theoretical Abstraction, and implemented it in a system called ITA. Given a relational database and a family of possible abstractions for its attribute values, called an abstraction hierarchy, ITA selects the abstraction that best preserves the class distributions needed for the classification task, and generalizes the database according to that abstraction. In our previous experiment, a single application of abstraction to the whole database was effective in reducing the size of the induced decision tree without degrading classification accuracy. However, because classification systems such as C4.5 select attributes repeatedly, one step after another, ITA does not in general guarantee that class distributions are preserved along a sequence of attribute selections. In this paper we therefore propose a new version of ITA, called iterative ITA, which tries to preserve the class distributions as far as possible at each attribute-selection step.
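
The selection step described in the abstract can be illustrated with a small sketch. It is not the authors' ITA implementation: the scoring rule (increase in the conditional entropy of the class given the attribute), the function names `conditional_entropy`, `generalize`, and `select_abstraction`, and the toy `job` data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(rows, attr, target):
    """Empirical H(target | attr) in bits."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[attr]][row[target]] += 1
    total = len(rows)
    h = 0.0
    for counts in groups.values():
        n = sum(counts.values())
        for c in counts.values():
            p = c / n
            h -= (n / total) * p * math.log2(p)
    return h


def generalize(rows, attr, mapping):
    """Replace each value of attr by its abstract value; unmapped values are kept."""
    return [{**row, attr: mapping.get(row[attr], row[attr])} for row in rows]


def select_abstraction(rows, attr, target, candidates):
    """Return (name, loss) for the candidate that loses the least class information.

    Assumed criterion: loss = H(target | abstracted attr) - H(target | attr).
    """
    base = conditional_entropy(rows, attr, target)

    def loss(name):
        abstracted = generalize(rows, attr, candidates[name])
        return conditional_entropy(abstracted, attr, target) - base

    best = min(candidates, key=loss)
    return best, loss(best)


# Hypothetical toy data: the class depends on the sector, not on the rank.
rows = (
    [{"job": "nurse", "cls": "yes"}] * 2
    + [{"job": "doctor", "cls": "yes"}]
    + [{"job": "teacher", "cls": "no"}] * 2
    + [{"job": "professor", "cls": "no"}]
)
candidates = {
    "by_sector": {"nurse": "medical", "doctor": "medical",
                  "teacher": "education", "professor": "education"},
    "by_rank": {"nurse": "junior", "teacher": "junior",
                "doctor": "senior", "professor": "senior"},
}
best, best_loss = select_abstraction(rows, "job", "cls", candidates)
print(best, best_loss)
```

On this toy data the `by_sector` grouping keeps every class distribution intact, while `by_rank` mixes the two classes, so the sketch prefers `by_sector`. One possible reading of the iterative variant would be to rerun `select_abstraction` on the subset of rows that reaches each node of the tree, rather than once on the whole database.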

References

[1] Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P. and Uthurusamy, R. (eds.): Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press, 1996.
[2] Han, J. and Fu, Y.: Attribute-Oriented Induction in Data Mining. In: [1], pp. 399–421, 1996.
[3] Kudoh, Y. and Haraguchi, M.: An Appropriate Abstraction for an Attribute-Oriented Induction. Proceedings of the Second International Conference on Discovery Science, LNAI 1721, pp. 43–55, 1999.
[4] Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.

Author information

Authors and Affiliations

Division of Electronics and Information Engineering, Hokkaido University, N 13 W 8, 060-8628, Sapporo, JAPAN
Yoshimitsu Kudoh & Makoto Haraguchi

Editor information

Editors and Affiliations

Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, 812-8581, Fukuoka, Japan
Setsuo Arikawa
Faculty of Science Department of Information Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan
Shinichi Morishita

About this paper

Cite this paper

Kudoh, Y., Haraguchi, M. (2000). An Appropriate Abstraction for Constructing a Compact Decision Tree. In: Arikawa, S., Morishita, S. (eds) Discovery Science. DS 2000. Lecture Notes in Computer Science, vol 1967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44418-1_33

DOI: https://doi.org/10.1007/3-540-44418-1_33
Published: 19 October 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41352-3
Online ISBN: 978-3-540-44418-3
eBook Packages: Springer Book Archive
