
Random Aggregated and Bagged Ensembles of SVMs: An Empirical Bias–Variance Analysis

  • Conference paper
Multiple Classifier Systems (MCS 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3077)

Abstract

Bagging can be interpreted as an approximation of random aggregating, an ideal ensemble method in which base learners are trained on data sets drawn at random from the underlying (unknown) probability distribution. When large training sets are available, an approximate realization of random aggregating can be obtained through subsampled bagging. In this paper we perform an experimental bias–variance analysis of bagged and random aggregated ensembles of Support Vector Machines, in order to quantitatively evaluate their theoretical variance-reduction properties. Experimental results with small samples show that random aggregating, implemented through subsampled bagging, reduces the variance component of the error by about 90%, while bagging, as expected, achieves a smaller reduction. Bias–variance analysis also explains why ensemble methods based on subsampling techniques can be successfully applied to large data mining problems.
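
As a rough hint of what such an experiment looks like in code, the sketch below trains many SVMs on small random subsamples of a large data pool (subsampled bagging as an approximation of random aggregating) and computes a crude bias/variance estimate for 0/1 loss. This is an illustration under stated assumptions, not the procedure or software used in the paper: scikit-learn, the synthetic data set, and all kernel and subsample parameters are arbitrary choices made only for the example.

# Minimal sketch only: NOT the paper's implementation. Illustrates subsampled
# bagging of SVMs and a rough 0/1-loss bias/variance estimate.
# Assumptions: scikit-learn, synthetic data, arbitrary C, gamma, sample sizes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# A large labelled pool stands in for the unknown data distribution;
# each base SVM sees only a small subsample of it, as in subsampled bagging.
X, y = make_classification(n_samples=22000, n_features=20, random_state=0)
X_pool, y_pool, X_test, y_test = X[:20000], y[:20000], X[20000:], y[20000:]

def subsampled_svm_ensemble(n_estimators=30, subsample_size=200):
    """Train SVMs on small random subsamples (drawn without replacement) and
    return the individual predictions plus the majority-vote prediction."""
    preds = []
    for _ in range(n_estimators):
        idx = rng.choice(len(X_pool), size=subsample_size, replace=False)
        svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_pool[idx], y_pool[idx])
        preds.append(svm.predict(X_test))
    votes = np.stack(preds)                        # shape: (n_estimators, n_test)
    majority = (votes.mean(axis=0) >= 0.5).astype(int)
    return votes, majority

votes, majority = subsampled_svm_ensemble()

# Crude 0/1-loss decomposition: take the majority vote as the "main prediction";
# variance is the average disagreement of a single SVM with it, and (ignoring
# label noise) bias is approximated by the main prediction's own error rate.
variance = np.mean(votes != majority)
bias = np.mean(majority != y_test)
print(f"estimated bias ~ {bias:.3f}, estimated variance ~ {variance:.3f}")
print(f"ensemble test error: {np.mean(majority != y_test):.3f}")

In this toy setting the variance term is simply the average rate at which a single subsampled SVM disagrees with the ensemble's majority vote, which is the quantity that aggregation is expected to shrink; the roughly 90% variance reduction reported in the paper refers to its own experimental protocol, not to this sketch.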





Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Valentini, G. (2004). Random Aggregated and Bagged Ensembles of SVMs: An Empirical Bias–Variance Analysis. In: Roli, F., Kittler, J., Windeatt, T. (eds) Multiple Classifier Systems. MCS 2004. Lecture Notes in Computer Science, vol 3077. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25966-4_26

  • DOI: https://doi.org/10.1007/978-3-540-25966-4_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22144-9

  • Online ISBN: 978-3-540-25966-4

  • eBook Packages: Springer Book Archive
