Abstract
Ensembles of randomized trees, such as Random Forests, are among the most popular tools in machine learning and data mining. These algorithms introduce randomness into the induction of several decision trees and then use a voting scheme to predict the labels of unseen instances. In this paper, we study randomized tree ensembles from the point of view of the basis functions they induce. We point out a connection with kernel target alignment, a measure of kernel quality, which suggests that randomization is a way to obtain high alignment and, consequently, possibly low generalization error. The connection also suggests post-processing ensembles with sophisticated linear separators such as Support Vector Machines (SVMs). Interestingly, post-processing experimentally yields better performance than classical majority voting. We conclude by comparing these results with an approximate infinite ensemble classifier very similar to the one introduced by Lin and Li. This methodology also shows strong learning ability, comparable to that of ensemble post-processing.
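As a concrete illustration of the pipeline the abstract describes, the following is a minimal sketch: each leaf of each randomized tree is treated as a binary basis function, the ensemble induces a kernel that counts shared leaves, kernel target alignment measures how well that kernel matches the labels, and a linear SVM on the leaf indicators replaces majority voting. It uses scikit-learn; the synthetic dataset, RandomForestClassifier, LinearSVC, and all hyperparameters are illustrative assumptions, not the authors' original implementation.

```python
# Minimal sketch (not the authors' implementation): randomized trees as
# basis functions, the ensemble kernel they induce, its kernel-target
# alignment, and SVM post-processing in place of majority voting.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# 1. Induce an ensemble of randomized trees.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# 2. Basis functions: one binary indicator per leaf, across all trees.
leaf_ids = forest.apply(X)                      # shape (n_samples, n_trees)
Phi = OneHotEncoder().fit_transform(leaf_ids)   # sparse leaf-indicator matrix

# 3. The induced kernel counts shared leaves; its alignment with the
#    target kernel y y^T is scale-invariant, so no per-tree normalization.
K = (Phi @ Phi.T).toarray()
y_pm = np.where(y == 1, 1.0, -1.0)
Y = np.outer(y_pm, y_pm)
alignment = (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))
print(f"kernel-target alignment: {alignment:.3f}")

# 4. Post-process the ensemble: fit a linear SVM on the leaf indicators
#    instead of taking a majority vote over the trees.
svm = LinearSVC(C=1.0).fit(Phi, y)
print(f"post-processed training accuracy: {svm.score(Phi, y):.3f}")
```

On held-out data one would map the test set through the same fitted forest and encoder before calling the SVM; the paper's empirical claim is that this linear post-processing tends to outperform plain majority voting.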
References
Biau, G., Devroye, L., Lugosi, G.: Consistency of random forests and other averaging classifiers. Journal of Machine Learning Research 9, 2015–2033 (2008)
Breiman, L.: Bagging predictors. Machine Learning 24, 123–140 (1996)
Breiman, L.: Bias, variance and arcing classifiers (1996), http://www.sasenterpriseminer.com/documents/arcing.pdf
Breiman, L.: Some infinity theory for predictor ensembles (2000), http://www.stat.berkeley.edu
Breiman, L.: Random forests. Machine Learning 45, 5–32 (2001)
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth and Brooks (1984)
Buntine, W., Niblett, T.: A further comparison of splitting rules for decision tree induction. Machine Learning 8, 75–85 (1992)
Chang, C.-C., Lin, C.-J.: LIBSVM: A library for support vector machines (2001). Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Cristianini, N., Kandola, J., Elisseeff, A., Shawe-Taylor, J.: On kernel-target alignment. In: Holmes, D., Jain, L. (eds.) Innovations in Machine Learning: Theory and Application, pp. 205–255 (2006)
Cutler, A., Zhao, G.: PERT - Perfect random tree ensembles. Computing Science and Statistics 33 (2001)
Demiriz, A., Bennett, K., Shawe-Taylor, J.: Linear programming boosting via column generation. Machine Learning 46, 225–254 (2002)
Dietterich, T.: An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning 40, 139–157 (2000)
Freund, Y., Schapire, R.: Experiments with a new boosting algorithm. In: ICML, pp. 148–156 (1996)
Freund, Y., Schapire, R.: A short introduction to boosting. Journal of Japanese Society for Artificial Intelligence 14, 771–780 (1999)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. The Annals of Statistics 28, 337–407 (2000)
Friedman, J., Popescu, B.: Predictive learning via rule ensembles. The Annals of Applied Statistics 2, 916–954 (2008)
Geurts, P., Ernst, D., Wehenkel, L.: Extremely randomized trees. Machine Learning 63, 3–42 (2006)
Guyon, I., Gunn, S., Ben-Hur, A., Dror, G.: Result analysis of the NIPS 2003 feature selection challenge. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems, vol. 17, pp. 545–552. MIT Press, Cambridge (2005)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, Heidelberg (2009)
Hsu, C.-W., Chang, C.-C., Lin, C.-J.: A practical guide to support vector classification (2003), http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
Kleinberg, E.: On the algorithmic implementation of stochastic discrimination. IEEE Trans. Pattern Anal. Mach. Intell. 22, 473–490 (2000)
Kuncheva, L., Whitaker, C.: Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning 51, 181–207 (2003)
Lin, H.-T., Li, L.: Support vector machinery for infinite ensemble learning. Journal of Machine Learning Research 9, 285–312 (2008)
Liu, F.-T., Ting, K.-M., Yu, Y., Zhou, Z.-H.: Spectrum of Variable-Random Trees. Journal of Artificial Intelligence Research 32, 355–384 (2008)
Quinlan, J.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
Rätsch, G., Onoda, T., Müller, K.-R.: Soft margins for AdaBoost. Machine Learning 42, 287–320 (2001)
Rosset, S., Zhu, J., Hastie, T.: Boosting as a regularized path to a maximum margin classifier. Journal of Machine Learning Research 5, 941–973 (2004)
Schölkopf, B., Smola, A.: Learning with Kernels. MIT Press, Cambridge (2001)
Utgoff, P., Clouse, J.: A Kolmogorov-Smirnov metric for decision tree induction. Technical report, University of Massachusetts, Amherst (1996)
Vapnik, V.: The Nature of Statistical Learning Theory. Springer, Heidelberg (1995)
Wolpert, D.: Stacked Generalization. Neural Networks 5, 241–259 (1992)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Pisetta, V., Jouve, P.-E., Zighed, D.A. (2010). Learning with Ensembles of Randomized Trees: New Insights. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol. 6323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15939-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15938-1
Online ISBN: 978-3-642-15939-8