Variable Randomness in Decision Tree Ensembles
In this paper, we propose Max-diverse.α, which provides a mechanism to control the degree of randomness in decision tree ensembles. This control allows an ensemble to balance the two conflicting functions of a random tree ensemble, i.e., the ability to model non-axis-parallel boundaries and the ability to eliminate irrelevant features. We find that this control is more sensitive than the one provided by Random Forests. Using progressive training errors, we are able to estimate an appropriate degree of randomness for any given data set prior to any predictive task. Experimental results show that Max-diverse.α is significantly better than Random Forests and Max-diverse Ensemble, and is comparable to the state-of-the-art C5 boosting.
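The core idea of controlling the degree of randomness can be illustrated with a minimal sketch: at each tree node, a deterministic (best-gain) split is chosen with probability α and a uniformly random split otherwise, so that α = 0 yields a completely random tree and α = 1 a greedy one. The function and variable names below are hypothetical, not from the paper:

```python
import random

def choose_split(features, gain, alpha, rng=random.random):
    """Pick a split attribute for one tree node (illustrative sketch).

    With probability `alpha`, take the deterministic best-gain choice;
    otherwise pick a feature uniformly at random. `gain` is a
    hypothetical mapping from each feature to its split gain.
    """
    if rng() < alpha:
        # deterministic branch: select the highest-gain feature
        return max(features, key=lambda f: gain[f])
    # random branch: any feature, regardless of gain
    return random.choice(features)

# With alpha = 1.0 the best-gain feature is always chosen.
gains = {"f1": 0.1, "f2": 0.9, "f3": 0.3}
best = choose_split(list(gains), gains, alpha=1.0)
```

Sweeping α between 0 and 1 then traces out ensembles between a completely random ensemble and a greedy one, which is the spectrum the paper's training-error-based estimation selects from.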
Keywords: Feature Selection · Random Forest · Variable Randomness · Decision Boundary · Training Error