Structural diversity for decision tree ensemble learning
Decision trees are off-the-shelf predictive models, and they have been used successfully as base learners in ensemble learning. To construct a strong classifier ensemble, the individual classifiers should be both accurate and diverse. However, despite many attempts, measuring diversity remains an open problem. We conjecture that a deficiency of previous diversity measures lies in the fact that they consider only behavioral diversity, i.e., how the classifiers behave when making predictions, neglecting the fact that classifiers may differ even when they make the same predictions. Based on this recognition, in this paper we advocate considering structural diversity in addition to behavioral diversity, and propose the TMD (tree matching diversity) measure for decision trees. To investigate the usefulness of TMD, we empirically evaluate the performance of selective ensemble approaches on decision forests incorporating different diversity measures. Our results validate that stronger ensembles can be constructed by considering structural and behavioral diversity together. This may open a new direction for designing better diversity measures and ensemble methods.
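The paper's TMD definition is not reproduced in this abstract, but the core idea of comparing two trees by structure rather than by predictions can be sketched as follows. This is a toy recursive mismatch count over tuple-encoded trees; the encoding, function names, and normalization are illustrative assumptions, not the authors' TMD measure.

```python
# Toy structural-diversity sketch (NOT the paper's TMD).
# A tree is either ("leaf", label) or (feature, left_subtree, right_subtree).

def tree_size(t):
    """Total number of nodes in the tree."""
    if t[0] == "leaf":
        return 1
    return 1 + tree_size(t[1]) + tree_size(t[2])

def mismatch(a, b):
    """Recursively count nodes where the two trees disagree."""
    if a[0] == "leaf" and b[0] == "leaf":
        return 0 if a[1] == b[1] else 1
    if a[0] == "leaf" or b[0] == "leaf":
        # One tree ends here; treat the larger remaining subtree as unmatched.
        return max(tree_size(a), tree_size(b))
    cost = 0 if a[0] == b[0] else 1  # compare split features at this node
    return cost + mismatch(a[1], b[1]) + mismatch(a[2], b[2])

def structural_diversity(a, b):
    """Normalize the mismatch count into [0, 1]; 0 = identical structure."""
    return mismatch(a, b) / max(tree_size(a), tree_size(b))

t1 = ("f0", ("leaf", 0), ("leaf", 1))
t2 = ("f0", ("leaf", 0), ("leaf", 1))
t3 = ("f1", ("leaf", 1), ("leaf", 0))
print(structural_diversity(t1, t2))  # identical trees -> 0.0
print(structural_diversity(t1, t3))  # all three nodes differ -> 1.0
```

Note that `structural_diversity(t1, t3)` is nonzero even though both trees might make identical predictions on some dataset, which is exactly the distinction between structural and behavioral diversity that the paper draws.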
Keywords: ensemble learning, structural diversity, decision tree
The authors would like to thank the anonymous reviewers for their helpful comments and suggestions. This research was supported by the National Natural Science Foundation of China (Grant No. 61333014).