Bagging Ensemble Selection for Regression

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7691)


Bagging ensemble selection (BES) is a relatively new ensemble learning strategy that can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classification problems have shown that, using random trees as base classifiers, BES-OOB (the most successful variant of BES) is competitive with, and in many cases superior to, other ensemble learning strategies such as the original ES algorithm, stacking with linear regression, random forests and boosting. Motivated by these promising classification results, this paper examines the predictive performance of BES-OOB on regression problems. Our results show that BES-OOB outperforms Stochastic Gradient Boosting and Bagging when regression trees are used as the base learners. They also suggest that the advantage of a diverse model library becomes clear when the model library is relatively large. Finally, we present encouraging evidence that the non-negative least squares algorithm is a viable approach for pruning an ensemble of ensembles.
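The BES-OOB recipe described above can be sketched as follows. This is a minimal illustration under assumptions of our own, not the paper's implementation: the base learners here are simple regression stumps (the paper uses regression trees), each is trained on a bootstrap sample while its out-of-bag (OOB) predictions are recorded, and models are then greedily selected (with replacement) to minimise RMSE on the OOB predictions.

```python
# Illustrative sketch of BES-OOB for regression (assumptions: stump base
# learners, synthetic data, greedy forward selection with replacement).
import numpy as np

rng = np.random.default_rng(0)

def fit_stump(x, y):
    """Fit a one-split regression stump: (threshold, left_mean, right_mean)."""
    best = None
    for t in np.quantile(x, np.linspace(0.1, 0.9, 9)):
        left, right = y[x <= t], y[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        err = np.mean((y - pred) ** 2)
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return np.where(x <= t, lm, rm)

# Synthetic regression data.
n = 200
x = rng.uniform(-3, 3, n)
y = np.sin(x) + rng.normal(0, 0.2, n)

# Bag a library of stumps; keep each model's out-of-bag predictions.
n_models = 30
oob_pred = np.full((n_models, n), np.nan)
stumps = []
for m in range(n_models):
    idx = rng.integers(0, n, n)                 # bootstrap sample
    oob = np.setdiff1d(np.arange(n), idx)       # out-of-bag indices
    s = fit_stump(x[idx], y[idx])
    stumps.append(s)
    oob_pred[m, oob] = predict_stump(s, x[oob])

def oob_rmse(chosen):
    """RMSE of the chosen models' averaged OOB predictions."""
    p = np.nanmean(oob_pred[chosen], axis=0)
    mask = ~np.isnan(p)
    return np.sqrt(np.mean((y[mask] - p[mask]) ** 2))

# Greedy forward ensemble selection (with replacement) on OOB error.
chosen = []
for _ in range(15):
    scores = [oob_rmse(chosen + [m]) for m in range(n_models)]
    chosen.append(int(np.argmin(scores)))

print("selected models:", sorted(set(chosen)))
print("OOB RMSE of selected ensemble: %.3f" % oob_rmse(chosen))
```

In the full BES strategy this entire selection procedure is itself bagged, and the paper's NNLS-based pruning would replace the uniform average over selected models with non-negative least-squares weights fitted to the OOB predictions.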


Predictive Performance, Base Learner, Ensemble Size, Ensemble Learning, Model Library
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




References

  1. Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning 36(1-2), 105–139 (1999)
  2. Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996)
  3. Bryll, R., Gutierrez-Osuna, R., Quek, F.: Attribute bagging: Improving accuracy of classifier ensembles by using random feature subsets. Pattern Recognition 36(6), 1291–1302 (2003)
  4. Buhlmann, P., van de Geer, S.: Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer (2011)
  5. Caruana, R., Munson, A., Niculescu-Mizil, A.: Getting the most out of ensemble selection. In: Proceedings of the Sixth International Conference on Data Mining, ICDM 2006 (2006)
  6. Caruana, R., Niculescu-Mizil, A., Crew, G., Ksikes, A.: Ensemble selection from libraries of models. In: Proceedings of the Twenty-First International Conference on Machine Learning, ICML 2004 (2004)
  7. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 7, 1–30 (2006)
  8. Friedman, J.H.: Stochastic gradient boosting. Computational Statistics and Data Analysis 38, 367–378 (1999)
  9. Friedman, J.H.: Greedy function approximation: A gradient boosting machine. Annals of Statistics 29, 1189–1232 (2000)
  10. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA data mining software: An update. SIGKDD Explorations 11(1) (2009)
  11. Hernandez-Lobato, D., Martinez-Munoz, G., Suarez, A.: Pruning in ordered regression bagging ensembles. In: International Joint Conference on Neural Networks, IJCNN 2006, pp. 1266–1273 (2006)
  12. Lawson, C., Hanson, R.: Solving Least-Squares Problems. Prentice-Hall (1974)
  13. Quinlan, J.R.: Learning with Continuous Classes. In: Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, pp. 343–348 (1992)
  14. Rokach, L.: Ensemble-based classifiers. Artificial Intelligence Review 33, 1–39 (2010)
  15. Sun, Q., Pfahringer, B.: Bagging ensemble selection. In: Proceedings of the 24th Australasian Conference on Artificial Intelligence, pp. 251–260. Springer, Perth (2011)
  16. Webb, G.I.: MultiBoosting: A technique for combining boosting and wagging. Machine Learning 40(2) (2000)
  17. Wolpert, D.H.: Stacked generalization. Neural Networks 5, 241–259 (1992)
  18. Yu, Y., Zhou, Z.H., Ting, K.M.: Cocktail ensemble for regression. In: Seventh IEEE International Conference on Data Mining, ICDM 2007, pp. 721–726 (2007)
  19. Zhou, Z.-H.: Ensemble Methods: Foundations and Algorithms. Chapman & Hall/CRC, Boca Raton, FL (2012)
  20. Zhou, Z.H., Wu, J., Tang, W.: Ensembling neural networks: Many could be better than all. Artificial Intelligence 137, 239–263 (2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Department of Computer Science, The University of Waikato, Hamilton, New Zealand
