Abstract
Supervised ensemble methods construct a set of base learners (experts) and combine their weighted outputs to predict new data. Numerous empirical studies confirm that ensemble methods often outperform any single base learner (Freund and Schapire, 1996; Bauer and Kohavi, 1999; Dietterich, 2000b). The improvement is intuitively clear when the base algorithm is unstable, that is, when small changes in the training data lead to large changes in the resulting base learner (as with decision trees, neural networks, etc.). Recently, a series of theoretical developments (Bousquet and Elisseeff, 2000; Poggio et al., 2002; Mukherjee et al., 2003; Poggio et al., 2004) also confirmed the fundamental role of stability for generalization (the ability to perform well on unseen data) of any learning engine. For a multivariate learning algorithm, model selection and feature selection are closely related problems (the latter is a special case of the former). It is therefore sensible that model-based feature selection methods (wrappers, embedded methods) would benefit from the regularization effect provided by ensemble aggregation. This is especially true for the fast, greedy, and unstable learners often used for feature evaluation.
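As a rough illustration of these points (a minimal sketch, not the chapter's method), the code below bags an unstable greedy learner, an unpruned decision tree, on bootstrap resamples and aggregates both the class votes and the per-feature split importances. The use of scikit-learn, the synthetic dataset, and the choice of 100 trees are illustrative assumptions only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem (illustrative only).
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A single greedy tree: fast to fit, but unstable (high variance).
single = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("single tree test accuracy:", single.score(X_te, y_te))

# Bagging: fit the same base learner on bootstrap resamples and aggregate
# both the class votes and the per-feature split importances.
rng = np.random.default_rng(0)
n_trees = 100
votes = np.zeros((n_trees, len(y_te)))
importances = np.zeros(X.shape[1])
for b in range(n_trees):
    idx = rng.integers(0, len(y_tr), size=len(y_tr))       # bootstrap sample
    tree = DecisionTreeClassifier(random_state=b).fit(X_tr[idx], y_tr[idx])
    votes[b] = tree.predict(X_te)
    importances += tree.feature_importances_

ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)     # majority vote
importances /= n_trees                                      # averaged importance
print("bagged ensemble test accuracy:", (ensemble_pred == y_te).mean())
print("top 5 features by averaged importance:", np.argsort(importances)[::-1][:5])
```

Averaging over many bootstrap replicates smooths out the variance of the individual trees, which is the regularization effect that the abstract attributes to ensemble aggregation, and the averaged importances give a more stable ranking for feature evaluation than any single tree.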
References
Y. Amit and D. Geman. Shape quantization and recognition with randomized trees. Neural Computation, 9(7):1545–1588, 1997.
E. Bauer and R. Kohavi. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning, 36:525–536, 1999.
A. Borisov, V. Eruhimov, and E. Tuv. Dynamic soft feature selection for tree-based ensembles. In Feature Extraction, Foundations and Applications. Springer, 2005.
B. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Fifth Annual Workshop on Computational Learning Theory, pages 144–152, Pittsburgh, 1992.
O. Bousquet and A. Elisseeff. Algorithmic stability and generalization performance. In Advances in Neural Information Processing Systems 13, pages 196–202, 2000. URL citeseer.nj.nec.com/bousquet01algorithmic.html.
L. Breiman. Bagging predictors. Machine Learning, 24:123–140, 1996.
L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
L. Breiman. Manual On Setting Up, Using, And Understanding Random Forests V3.1, 2002.
L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone. Classification and Regression Trees. Wadsworth and Brooks, Monterey, CA, 1984.
L. Breiman. Arcing the edge. Technical Report 486, Statistics Department, University of California at Berkeley, 1997.
T.G. Dietterich. Ensemble methods in machine learning. In Multiple Classifier Systems. First International Workshop, volume 1857. Springer-Verlag, 2000a.
T.G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2):139–157, 2000b. available at ftp://ftp.cs.orst.edu/pub/tgd/papers/tr-randomized-c4.ps.gz.
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. The Annals of Statistics, 32(2):407–451, 2004.
R. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(II):179–188, 1936.
Y. Freund and R.E. Schapire. Experiments with a new boosting algorithm. In Machine Learning: Proceedings of Thirteenth International Conference, pages 148–156, 1996.
J. Friedman. Greedy function approximation: a gradient boosting machine, 1999a. IMS 1999 Reitz Lecture, February 24, 1999, Dept. of Statistics, Stanford University.
J. Friedman. Stochastic gradient boosting. Technical report, Dept. of Statistics, Stanford University, 1999b.
J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: A statistical view of boosting. The Annals of Statistics, 28:832–844, 2000.
A. Gelman, J. Carlin, H. Stern, and D. Rubin. Bayesian Data Analysis. Chapman and Hall, 1995.
W.R. Gilks, S. Richardson, and D.J. Spiegelhalter. Markov Chain Monte Carlo in Practice. Chapman and Hall, 1996.
P. Green. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82(4):711–732, 1995.
L.K. Hansen and P. Salamon. Neural network ensembles. IEEE Trans. Pattern Analysis and Machine Intelligence, 12(10):993–1001, 1990.
T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2001.
D. J. C. MacKay. Bayesian non-linear modelling for the prediction competition. ASHRAE Transactions: Symposia, OR-94-17-1, 1994.
S. Mukherjee, P. Niyogi, T. Poggio, and R. Rifkin. Statistical learning: Stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization. AI Memo 2002-024, MIT, 2003.
R. Neal. Bayesian Learning for Neural Networks. Springer-Verlag, 1996.
A.Y. Ng and M.I. Jordan. Convergence rates of the voting Gibbs classifier, with application to Bayesian feature selection. In ICML 2001, pages 377–384, 2001.
B. Parmanto, P.W. Munro, and H.R. Doyle. Improving committee diagnosis with resampling techniques. In D.S. Touretzky, M.C. Mozer, and M.E. Hasselmo, editors, Advances in Neural Information Processing Systems, volume 8, pages 882–888. The MIT Press, 1996.
T. Poggio, R. Rifkin, S. Mukherjee, and P. Niyogi. General conditions for predictivity in learning theory. Nature, 428:419–422, 2004.
T. Poggio, R. Rifkin, S. Mukherjee, and A. Rakhlin. Bagging regularizes. AI Memo 2002-003, MIT, 2002.
M. Stephens. Bayesian analysis of mixtures with an unknown number of components: an alternative to reversible jump methods. The Annals of Statistics, 28(1):40–74, 2000.
R. Tibshirani. Regression shrinkage and selection via the lasso. J. Royal Statist. Soc. B, 58:267–288, 1996.
G. Valentini and T. Dietterich. Low bias bagged support vector machines. In ICML 2003, pages 752–759, 2003.
G. Valentini and F. Masulli. Ensembles of learning machines. In M. Marinaro and R. Tagliaferri, editors, Neural Nets WIRN Vietri-02, Lecture Notes in Computer Sciences. Springer-Verlag, Heidelberg, 2002.
A. Vehtari and J. Lampinen. Bayesian input variable selection using posterior probabilities and expected utilities. Technical Report Report B31, Laboratory of Computational Engineering, Helsinki University of Technology, 2002.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this chapter
Tuv, E. (2006). Ensemble Learning. In: Guyon, I., Nikravesh, M., Gunn, S., Zadeh, L.A. (eds) Feature Extraction. Studies in Fuzziness and Soft Computing, vol 207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-35488-8_8
DOI: https://doi.org/10.1007/978-3-540-35488-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35487-1
Online ISBN: 978-3-540-35488-8