Abstract
Decision Tree Graph-Based Induction (DT-GBI) is a machine learning technique that constructs a classifier (a decision tree) for graph-structured data, which are usually not expressed explicitly as attribute-value pairs. At each node of the decision tree, substructures (patterns) are extracted by the stepwise pair expansion (pairwise chunking) of GBI and used as attributes for testing. DT-GBI is efficient because GBI extracts patterns by greedy search, and the resulting decision tree is easy to understand. However, experiments on a DNA dataset from the UCI repository revealed that the predictive accuracy of the classifier constructed by DT-GBI was not as high as that of other approaches. This paper improves the predictive accuracy of DT-GBI and reports a performance evaluation of the improved method on the DNA dataset. The predictive accuracy of a decision tree depends on which attributes (patterns) are used and on how the tree is constructed. To extract sufficiently discriminative patterns, the search capability is enhanced by incorporating a beam search into pairwise chunking within the greedy search framework. Pessimistic pruning is incorporated to avoid overfitting the training data. Experiments on the DNA dataset examined the effects of the beam width, the number of chunking steps at each node of the decision tree, and pruning. The results indicate that DT-GBI, which uses no prior domain knowledge, can construct a decision tree comparable to other classifiers constructed using domain knowledge.
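To make the idea of pairwise chunking with a beam more concrete, the following is a minimal sketch and not the authors' implementation. It assumes an undirected graph stored as a dictionary of node labels plus a set of edges, and it ranks pairs by raw frequency as a stand-in for whatever discriminative criterion DT-GBI actually applies, which the abstract does not specify. The function names (`pair_key`, `count_pairs`, `top_pairs`, `chunk`) are hypothetical.

```python
from collections import Counter


def pair_key(label_u, label_v):
    # Undirected pair: order the two labels so (a, b) and (b, a) match.
    return tuple(sorted((label_u, label_v)))


def count_pairs(nodes, edges):
    """Count how often each pair of adjacent node labels occurs."""
    return Counter(pair_key(nodes[u], nodes[v]) for u, v in edges)


def top_pairs(nodes, edges, beam_width):
    """Return the beam_width most frequent pairs.

    Assumption: frequency is the ranking criterion; ties are broken
    by label order so the result is deterministic.
    """
    counts = count_pairs(nodes, edges)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pair for pair, _ in ranked[:beam_width]]


def chunk(nodes, edges, pair):
    """Merge non-overlapping occurrences of `pair` into single nodes."""
    merged = {}  # old node id -> id of the chunk node that absorbed it
    new_nodes = dict(nodes)
    for u, v in sorted(edges):
        if u in merged or v in merged:
            continue  # each node may belong to at most one chunk per step
        if pair_key(nodes[u], nodes[v]) == pair:
            new_nodes[u] = "(" + pair[0] + "-" + pair[1] + ")"
            del new_nodes[v]
            merged[u] = u
            merged[v] = u
    # Redirect edges to the chunk nodes and drop edges inside a chunk.
    new_edges = set()
    for u, v in edges:
        a, b = merged.get(u, u), merged.get(v, v)
        if a != b:
            new_edges.add((min(a, b), max(a, b)))
    return new_nodes, new_edges


# A small chain graph a-b-a-b-c used only for illustration.
nodes = {1: "a", 2: "b", 3: "a", 4: "b", 5: "c"}
edges = {(1, 2), (2, 3), (3, 4), (4, 5)}

candidates = top_pairs(nodes, edges, beam_width=2)
chunked_nodes, chunked_edges = chunk(nodes, edges, candidates[0])
```

With a beam width of 1 this reduces to plain greedy chunking; a wider beam keeps several candidate pairs alive, each of which would be chunked in its own copy of the graph before the next step.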
© 2003 Springer-Verlag Berlin Heidelberg
Geamsakul, W., Matsuda, T., Yoshida, T., Motoda, H., Washio, T. (2003). Performance Evaluation of Decision Tree Graph-Based Induction. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_12
DOI: https://doi.org/10.1007/978-3-540-39644-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20293-6
Online ISBN: 978-3-540-39644-4