Pattern Analysis and Applications, Volume 19, Issue 4, pp 1093–1128

An extensive empirical comparison of ensemble learning methods for binary classification

  • Anil Narassiguin
  • Mohamed Bibimoune
  • Haytham Elghazel
  • Alex Aussem
Short Paper


Abstract

We present an extensive empirical comparison of nineteen prototypical supervised ensemble learning algorithms, including Boosting, Bagging, Random Forests, Rotation Forests, Arc-X4, Class-Switching and their variants, as well as more recent techniques such as Random Patches. These algorithms were compared against each other in terms of threshold, ranking/ordering and probability metrics over nineteen UCI benchmark data sets with binary labels. We also examine the influence of two base learners, CART and Extremely Randomized Trees, on the bias–variance decomposition, and the effect of calibrating the models via Isotonic Regression on each performance metric. The selected data sets have already been used in various empirical studies and cover different application domains. The source code and the detailed results of our study are publicly available.
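As a rough illustration of the evaluation protocol the abstract describes, the following sketch compares a few of the studied ensemble families using scikit-learn (cited in the references), scoring each with a ranking metric (ROC AUC) and a probability metric (Brier score) after Isotonic Regression calibration. The synthetic dataset, ensemble sizes, and train/test split are placeholders, not the study's actual settings.

```python
# Hedged sketch: compare calibrated ensembles on a binary task.
# Dataset and hyperparameters are illustrative, not the paper's setup.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              ExtraTreesClassifier, RandomForestClassifier)
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

ensembles = {
    "Bagging": BaggingClassifier(n_estimators=100, random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=0),
}

for name, clf in ensembles.items():
    # Isotonic Regression calibration via internal cross-validation,
    # as examined in the study for the probability metrics.
    calibrated = CalibratedClassifierCV(clf, method="isotonic", cv=3)
    calibrated.fit(X_tr, y_tr)
    proba = calibrated.predict_proba(X_te)[:, 1]
    print(f"{name}: AUC={roc_auc_score(y_te, proba):.3f} "
          f"Brier={brier_score_loss(y_te, proba):.3f}")
```

The full study additionally covers threshold metrics (e.g. accuracy) and many more ensemble variants; the loop above only shows the general shape of the comparison.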


Keywords: Ensemble learning · Empirical analysis · Binary classification
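The abstract also mentions studying the bias–variance decomposition of the two base learners. Below is a hedged sketch of a Domingos-style zero-one-loss decomposition (in the spirit of refs 37–38) estimated by bootstrap resampling, with CART and an Extremely Randomized Tree as base learners; the function name and settings are illustrative, the test labels stand in for the unknown Bayes-optimal predictions, and the paper's exact estimation procedure may differ.

```python
# Hedged sketch: bootstrap estimate of a zero-one-loss bias-variance
# decomposition. "bias" is the error of the main (majority-vote)
# prediction; "variance" is the mean disagreement with it.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier

def bias_variance_zero_one(model_factory, X_tr, y_tr, X_te, y_te,
                           n_rounds=50, seed=0):
    rng = np.random.RandomState(seed)
    preds = np.empty((n_rounds, len(y_te)), dtype=int)
    for r in range(n_rounds):
        # Bootstrap resample of the training set.
        idx = rng.randint(0, len(y_tr), len(y_tr))
        model = model_factory().fit(X_tr[idx], y_tr[idx])
        preds[r] = model.predict(X_te)
    # Main prediction: majority vote over the resampled models.
    main = (preds.mean(axis=0) >= 0.5).astype(int)
    bias = float(np.mean(main != y_te))
    variance = float(np.mean(preds != main[None, :]))
    return preds, bias, variance

X, y = make_classification(n_samples=800, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

for name, factory in [("CART", DecisionTreeClassifier),
                      ("ExtraTree", ExtraTreeClassifier)]:
    preds, b, v = bias_variance_zero_one(factory, X_tr, y_tr, X_te, y_te)
    print(f"{name}: bias~{b:.3f} variance~{v:.3f}")
```

Fully randomized single trees typically trade higher variance for different bias than CART, which is the kind of effect the study's decomposition analysis examines.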


References

  1. Zhou Z-H (2012) Ensemble methods: foundations and algorithms. Chapman & Hall/CRC, Boca Raton
  2. Bauer E, Kohavi R (1999) An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Mach Learn 36(1–2):105–139
  3. Caruana R, Niculescu-Mizil A (2006) An empirical comparison of supervised learning algorithms. In: Proceedings of the ICML, pp 161–168
  4. Chen N, Ribeiro B, Chen A (2015) Comparative study of classifier ensembles for cost-sensitive credit risk assessment. Intell Data Anal 19(1):127–144
  5. Zhang C, Zhang J (2008) RotBoost: a technique for combining rotation forest and AdaBoost. Pattern Recognit Lett 29(10):1524–1536
  6. Rodríguez JJ, Kuncheva LI, Alonso CJ (2006) Rotation forest: a new classifier ensemble method. IEEE Trans Pattern Anal Mach Intell 28(10):1619–1630
  7. Louppe G, Geurts P (2012) Ensembles on random patches. In: Proceedings of the ECML/PKDD, pp 346–361
  8. Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63(1):3–42
  9. Niculescu-Mizil A, Caruana R (2005) Predicting good probabilities with supervised learning. In: Proceedings of the ICML, pp 625–632
  10. Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, Belmont
  11. Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832–844
  12. Hernández-Lobato D, Martínez-Muñoz G, Suárez A (2013) How large should ensembles of classifiers be? Pattern Recognit 46(5):1323–1336
  13. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
  14. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
  15. Freund Y, Schapire R (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139
  16. Shivaswamy PK, Jebara T (2011) Variance penalizing AdaBoost. In: Proceedings of the NIPS, pp 1908–1916
  17. Breiman L (1996) Bias, variance, and arcing classifiers. Technical report, Statistics Department, University of California at Berkeley, Berkeley
  18. Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting. Ann Stat 28(2):337–407
  19. Breiman L (2000) Randomizing outputs to increase prediction accuracy. Mach Learn 40(3):229–242
  20. Martínez-Muñoz G, Suárez A (2005) Switching class labels to generate classification ensembles. Pattern Recognit 38(10):1483–1494
  21. Pedregosa F et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
  22. Kong EB, Dietterich TG (1995) Error-correcting output coding corrects bias and variance. In: Proceedings of the ICML, pp 313–321
  23. Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832–844
  24. Caruana R, Niculescu-Mizil A (2004) Data mining in metric space: an empirical analysis of supervised learning performance criteria. In: Proceedings of the KDD, pp 69–78
  25. Zadrozny B, Elkan C (2001) Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In: Proceedings of the ICML, pp 609–616
  26. Zhao Z, Morstatter F, Sharma S, Alelyani S, Anand A (2008) Advancing feature selection research—ASU feature selection repository. Technical report, Arizona State University
  27. Blake CL, Merz CJ (1998) UCI repository of machine learning databases. University of California, Irvine, Dept. of Information and Computer Sciences, Irvine
  28. Ben-Dor A, Bruhn L, Friedman N, Nachman I, Schummer M, Yakhini Z (2000) Tissue classification with gene expression profiles. J Comput Biol 7:559–584
  29. Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531–537
  30. Schummer M, Ng WV, Bumgarner RE (1999) Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery of genes overexpressed in ovarian carcinomas. Gene 238(2):375–385
  31. Liu K, Huang D (2008) Cancer classification using rotation forest. Comput Biol Med 38(5):601–610
  32. Slonim DK, Tamayo P, Mesirov JP, Golub TR, Lander ES (2000) Class prediction and discovery using gene expression data. In: Proceedings of the fourth annual international conference on computational molecular biology, pp 263–272
  33. Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
  34. Kuncheva LI, Rodríguez JJ (2007) An experimental study on rotation forest ensembles. In: Proceedings of the 7th international workshop on multiple classifier systems (MCS), pp 459–468
  35. Margineantu DD, Dietterich TG (1997) Pruning adaptive boosting. In: Proceedings of the ICML, pp 211–218
  36. Geman S, Bienenstock E, Doursat R (1992) Neural networks and the bias/variance dilemma. Neural Comput 4(1):1–58
  37. Kohavi R, Wolpert D (1996) Bias plus variance decomposition for zero-one loss functions. In: Proceedings of the ICML, pp 275–283
  38. Domingos P (2000) A unified bias–variance decomposition and its applications. In: Proceedings of the ICML, pp 231–238
  39. James G (2003) Variance and bias for general loss functions. Mach Learn 51(2):115–135
  40. Webb GI (2000) MultiBoosting: a technique for combining boosting and wagging. Mach Learn 40(2):159–196
  41. Valentini G, Dietterich TG (2004) Bias–variance analysis of support vector machines for the development of SVM-based ensemble methods. J Mach Learn Res 5:725–775
  42. Bouckaert RR (2008) Practical bias variance decomposition. In: Proceedings of the Australasian conference on artificial intelligence, pp 247–257

Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  • Anil Narassiguin (1, 2)
  • Mohamed Bibimoune (1)
  • Haytham Elghazel (1)
  • Alex Aussem (1)

  1. LIRIS UMR CNRS 5205, Université Lyon 1, Lyon, France
  2. EASYTRUST, La Garenne-Colombes, France