Performance Evaluation of Decision Tree Graph-Based Induction
Decision Tree Graph-Based Induction (DT-GBI) is a machine learning technique that constructs a classifier (decision tree) for graph-structured data, which are usually not explicitly expressed with attribute-value pairs. At each node of the decision tree, substructures (patterns) are extracted by the stepwise pair expansion (pairwise chunking) of GBI and used as attributes for testing. DT-GBI is efficient because GBI extracts patterns by greedy search, and the resulting decision tree is easy to interpret. However, experiments on a DNA dataset from the UCI repository revealed that the predictive accuracy of the classifier constructed by DT-GBI was not high enough compared with other approaches. This paper improves the predictive accuracy of DT-GBI and reports a performance evaluation of the improved method on the DNA dataset. The predictive accuracy of a decision tree depends on which attributes (patterns) are used and how the tree is constructed. To extract sufficiently discriminative patterns, the search capability is enhanced by incorporating a beam search into the pairwise chunking within the greedy search framework. Pessimistic pruning is incorporated to avoid overfitting to the training data. Experiments on the DNA dataset were conducted to examine the effects of the beam width, the number of chunking steps at each node of the decision tree, and the pruning. The results indicate that DT-GBI, which does not use any prior domain knowledge, can construct a decision tree comparable to other classifiers that were constructed using domain knowledge.
Keywords: Decision Tree, Domain Knowledge, Predictive Accuracy, Information Gain, Beam Width
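To make the abstract's core idea concrete, the following is a minimal, hypothetical sketch in Python of one beam-search step over candidate pair patterns, scored by the information gain of the presence/absence split they induce on labeled graphs. The graph encoding (a graph as a set of labeled edges), `chunk_step`, and `info_gain` are illustrative assumptions, not the authors' implementation; actual pairwise chunking in GBI rewrites each selected pair into a new node and repeats, which is omitted here.

```python
from collections import Counter
from math import log2

def info_gain(split):
    """Information gain of splitting labeled graphs on pattern presence.
    split: list of (has_pattern: bool, class_label) tuples."""
    def entropy(labels):
        total = len(labels)
        return (-sum((c / total) * log2(c / total)
                     for c in Counter(labels).values())) if labels else 0.0
    labels = [c for _, c in split]
    yes = [c for has, c in split if has]
    no = [c for has, c in split if not has]
    n = len(labels)
    return entropy(labels) - (len(yes) / n) * entropy(yes) \
                           - (len(no) / n) * entropy(no)

def chunk_step(graphs, classes, beam_width):
    """One beam-search chunking step (hypothetical simplification):
    enumerate node pairs connected by an edge, score each pair pattern
    by the information gain of its presence/absence split over the
    training graphs, and keep the top `beam_width` patterns."""
    candidates = {tuple(sorted(edge)) for g in graphs for edge in g}
    scored = []
    for pat in candidates:
        split = [(pat in {tuple(sorted(e)) for e in g}, c)
                 for g, c in zip(graphs, classes)]
        scored.append((info_gain(split), pat))
    scored.sort(reverse=True)
    return scored[:beam_width]

# Toy data: graphs as sets of labeled edges, with binary class labels.
graphs = [{("A", "B"), ("B", "C")}, {("A", "B")},
          {("C", "D")}, {("B", "C"), ("C", "D")}]
classes = [1, 1, 0, 0]
beam = chunk_step(graphs, classes, beam_width=2)
```

In the toy data, the pairs ("A", "B") and ("C", "D") each split the four graphs perfectly by class (gain 1.0), so a beam of width 2 retains both; in DT-GBI the pattern with the highest gain would become the test at the current tree node, while the beam keeps alternatives alive for further chunking.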