Reducing complexity of decision trees with two variable tests
This paper examines ways to reduce the complexity of induced decision trees, particularly with respect to disjunctive concepts.
A number of heuristics that allow two categorical (nominal) variables to be combined at each node are described. Different combinations of these heuristics are then applied to five data sets. Analysis of these cases shows that several combinations perform at least as well as the conventional partitioning technique on nearly all the data sets. The only data set on which they do not perform well is the one whose attributes have high arity. Future directions are then discussed, including the possibility of testing more than two attributes at each node.
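The paper does not give its heuristics in code form, but the core idea of a two-variable test can be sketched as follows: treat the Cartesian product of two nominal attributes as a single joint attribute and score it by information gain. This is a minimal illustration, not the paper's method; the exhaustive pair search and the entropy criterion are assumptions for the example.

```python
from collections import Counter
from itertools import combinations
import math

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def pair_split_gain(rows, labels, i, j):
    """Information gain of partitioning on the joint values of attributes i and j.

    Each distinct (value_i, value_j) combination becomes one branch,
    so the test's arity is the product of the two attributes' arities --
    which is why high-arity attributes are problematic for such tests.
    """
    parts = {}
    for row, y in zip(rows, labels):
        parts.setdefault((row[i], row[j]), []).append(y)
    n = len(labels)
    remainder = sum(len(ys) / n * entropy(ys) for ys in parts.values())
    return entropy(labels) - remainder

def best_pair(rows, labels):
    """Exhaustively pick the attribute pair with the highest joint gain."""
    n_attrs = len(rows[0])
    return max(combinations(range(n_attrs), 2),
               key=lambda ij: pair_split_gain(rows, labels, *ij))
```

A joint test like this captures disjunctive concepts (e.g. XOR of two attributes) in a single node, where single-attribute splits would need a deeper subtree. Exhaustive pairing is quadratic in the number of attributes, which is what the paper's heuristics are meant to avoid.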