Machine Learning, Ensemble Methods in

Reference work entry in Computational Complexity

Article Outline

  • Glossary
  • Definition of the Subject
  • Introduction
  • Learning Ensembles
  • Frequently Used Ensemble Methods
  • Future Directions
  • Bibliography

Glossary

Attribute (also feature or variable):

An attribute defines a property of an object (or example). Its type (e.g., nominal or numeric) determines the domain of values the attribute can take. For example, apples can have attributes such as weight (with numeric values) and color (with nominal values such as red or green).

Example (also instance or case):

An example is a single object from a problem domain of interest. In machine learning, examples are typically described by a set of attribute values and are used for learning a descriptive and/or predictive model.
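
To make these two notions concrete, here is a minimal Python sketch (not part of the original entry; the Apple class, attribute names, and values are purely illustrative) of examples described by attribute values:

```python
# A minimal sketch: examples from an apple domain, each described by a
# numeric attribute (weight) and a nominal attribute (color).
from dataclasses import dataclass

@dataclass
class Apple:
    weight: float  # numeric attribute, e.g. weight in grams (illustrative)
    color: str     # nominal attribute, e.g. "red" or "green"

# Two examples (instances) from the problem domain of interest.
examples = [Apple(weight=150.0, color="red"),
            Apple(weight=130.0, color="green")]
```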

Model (also classifier):

In machine learning, a model is a computer program that attempts to simulate a particular system, or part of one, with the aim of gaining insight into its operation or observing its behavior. Strictly speaking, a classifier is a type of model that maps unlabeled examples to a set of (discrete) classes. However, in machine learning the term classifier is often used as a synonym for model.
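
As an illustration (a hypothetical toy rule, not taken from the entry), a classifier can be viewed as a function mapping an example's attribute values to a discrete class:

```python
# A toy classifier: a mapping from an example's attribute values to a
# discrete class. The decision rule and class labels are illustrative only.
def ripeness_classifier(example: dict) -> str:
    return "ripe" if example["color"] == "red" else "unripe"

print(ripeness_classifier({"weight": 150.0, "color": "red"}))    # -> ripe
print(ripeness_classifier({"weight": 130.0, "color": "green"}))  # -> unripe
```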

Learning (also training) set:

A learning set is a set of examples that are used for learning a model or a classifier. Examples are typically described in terms of attribute values and have a corresponding output value or class.

Testing set:

A testing set is a set of examples that, unlike those in the learning set, have not been used in the process of learning the model; they are therefore also called unseen examples. They are used for evaluating the learned model.
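
The two roles can be sketched as follows (an illustrative fragment only, assuming examples are (attribute-values, class) pairs and the model is a callable such as the toy classifier above):

```python
# Split examples into a learning set and a testing set of unseen examples,
# then evaluate a learned model on the testing set only.
import random

def split(examples, test_fraction=0.3, seed=0):
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]   # (learning set, testing set)

def accuracy(model, testing_set):
    correct = sum(1 for attributes, cls in testing_set if model(attributes) == cls)
    return correct / len(testing_set)
```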

Ensemble:

An ensemble in machine learning is a set of predictive models whose predictions are combined into a single prediction. Ensembles are typically learned in order to achieve better predictive performance than any of the constituent models would achieve on its own.
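
A minimal sketch of the idea, assuming the base classifiers are plain callables (as above) and that their class predictions are combined by plurality voting, one common combination scheme:

```python
# Combine the class predictions of several base classifiers by plurality vote.
from collections import Counter

def ensemble_predict(classifiers, attributes):
    votes = [clf(attributes) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Usage: three (trivial) base classifiers voting on one example.
base = [lambda x: "red", lambda x: "green", lambda x: "red"]
print(ensemble_predict(base, {"weight": 150.0}))  # -> red
```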

Copyright information

© 2012 Springer-Verlag

About this entry

Cite this entry

Džeroski, S., Panov, P., Ženko, B. (2012). Machine Learning, Ensemble Methods in. In: Meyers, R. (eds) Computational Complexity. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1800-9_114
