Practical Bias Variance Decomposition

  • Remco R. Bouckaert
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5360)

Abstract

Bias variance decomposition for classifiers is a useful tool in understanding classifier behavior. Unfortunately, the literature does not provide consistent guidelines on how to apply a bias variance decomposition. This paper examines the various parameters and variants of empirical bias variance decompositions through an extensive simulation study. Based on this study, we recommend using ten-fold cross-validation as the sampling method, taking 100 samples within each fold, with a test set size of at least 2000. Only if the learning algorithm is stable may fewer samples, a smaller test set size, or a lower number of folds be justified.
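The abstract describes the recommended estimation setup but not the decomposition itself. The following is a minimal Python sketch of one way such an empirical estimate can be organised, assuming the Kohavi-Wolpert zero-one loss decomposition with noise-free labels, scikit-learn, and resampling of the training part of each cross-validation fold; the function names (estimate_bias_variance, kohavi_wolpert) and the bootstrap-style resampling are illustrative assumptions, not the paper's exact procedure.

    """Sketch: empirical bias-variance estimate via cross-validation sampling.

    Assumes the Kohavi-Wolpert zero-one loss decomposition with noise-free
    class labels; all names and defaults are illustrative."""
    import numpy as np
    from sklearn.base import clone
    from sklearn.model_selection import KFold
    from sklearn.tree import DecisionTreeClassifier


    def kohavi_wolpert(pred_dist, true_onehot):
        """Per-instance squared bias and variance for zero-one loss."""
        bias2 = 0.5 * np.sum((true_onehot - pred_dist) ** 2, axis=1)
        variance = 0.5 * (1.0 - np.sum(pred_dist ** 2, axis=1))
        return bias2, variance


    def estimate_bias_variance(model, X, y, n_folds=10, n_samples=100, seed=0):
        """Resample the training part of each fold n_samples times, refit the
        model, and accumulate the predicted-class distribution on the fold's
        test part; return mean squared bias and mean variance."""
        rng = np.random.default_rng(seed)
        classes = np.unique(y)
        bias2_all, var_all = [], []
        for train_idx, test_idx in KFold(n_folds, shuffle=True,
                                         random_state=seed).split(X):
            counts = np.zeros((len(test_idx), len(classes)))
            for _ in range(n_samples):
                # Bootstrap-style resample of the training fold (one of
                # several possible sampling variants).
                sample = rng.choice(train_idx, size=len(train_idx), replace=True)
                clf = clone(model).fit(X[sample], y[sample])
                preds = clf.predict(X[test_idx])
                counts[np.arange(len(test_idx)),
                       np.searchsorted(classes, preds)] += 1
            pred_dist = counts / n_samples
            true_onehot = (y[test_idx][:, None] == classes[None, :]).astype(float)
            bias2, variance = kohavi_wolpert(pred_dist, true_onehot)
            bias2_all.append(bias2)
            var_all.append(variance)
        return np.concatenate(bias2_all).mean(), np.concatenate(var_all).mean()


    if __name__ == "__main__":
        from sklearn.datasets import make_classification
        # A sizeable test pool, in the spirit of the recommended test set size.
        X, y = make_classification(n_samples=5000, n_features=20, random_state=1)
        b2, v = estimate_bias_variance(DecisionTreeClassifier(), X, y)
        print(f"bias^2 ~ {b2:.3f}, variance ~ {v:.3f}")

With this scheme, expected zero-one loss decomposes into the squared bias and variance terms (plus an intrinsic noise term, taken as zero here); stable learners would be expected to show a smaller variance component, which is the case in which the abstract says fewer samples or folds may suffice.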

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Remco R. Bouckaert
    1. Computer Science Department, University of Waikato, New Zealand