Machine Learning, Volume 40, Issue 2, pp 159–196

MultiBoosting: A Technique for Combining Boosting and Wagging

  • Geoffrey I. Webb

Abstract

MultiBoosting is an extension to the highly successful AdaBoost technique for forming decision committees. MultiBoosting can be viewed as combining AdaBoost with wagging. It is able to harness both AdaBoost's high bias and variance reduction and wagging's superior variance reduction. Using C4.5 as the base learning algorithm, MultiBoosting is demonstrated to produce decision committees with lower error than either AdaBoost or wagging significantly more often than the reverse over a large representative cross-section of UCI data sets. It offers the further advantage over AdaBoost of suiting parallel execution.

Keywords: boosting, bagging, wagging, aggregation, decision committee, decision tree
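
The following is a minimal sketch of the idea the abstract describes: AdaBoost-style instance reweighting within sub-committees, combined with wagging-style random re-initialisation of the instance weights at sub-committee boundaries (which is also what makes the sub-committees independently trainable in parallel). It is illustrative only: scikit-learn decision trees stand in for C4.5, exponential draws stand in for wagging's weight perturbation, and the function names, sub-committee schedule, and handling of degenerate error values are simplifying assumptions rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def multiboost_sketch(X, y, n_members=10, n_subcommittees=3, seed=None):
    """Train a committee; returns (trees, votes), where votes are log(1/beta) weights."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    n = len(y)
    # indices at which a new sub-committee starts (weights are re-randomised there)
    boundaries = {int(b) for b in np.linspace(0, n_members, n_subcommittees + 1)[:-1]}
    w = np.full(n, 1.0 / n)                      # start from uniform instance weights
    trees, votes = [], []
    for t in range(n_members):
        if t in boundaries and t > 0:            # wagging-style restart of the weights
            w = rng.exponential(1.0, size=n)
            w /= w.sum()
        tree = DecisionTreeClassifier(max_depth=3).fit(X, y, sample_weight=w)
        pred = tree.predict(X)
        err = w[pred != y].sum()                 # weighted training error
        if err == 0.0 or err >= 0.5:             # degenerate cases: simplified handling
            if err == 0.0:
                trees.append(tree)
                votes.append(np.log(1e10))       # effectively "infinite" vote, capped
            w = rng.exponential(1.0, size=n)     # restart the weights and continue
            w /= w.sum()
            continue
        beta = err / (1.0 - err)                 # AdaBoost-style update: down-weight correct instances
        w[pred == y] *= beta
        w /= w.sum()
        trees.append(tree)
        votes.append(np.log(1.0 / beta))         # this member's vote weight
    return trees, votes

def committee_predict(trees, votes, X):
    """Weighted majority vote over the committee members."""
    classes = trees[0].classes_                  # sorted class labels
    scores = np.zeros((len(X), len(classes)))
    for tree, v in zip(trees, votes):
        idx = np.searchsorted(classes, tree.predict(X))
        scores[np.arange(len(X)), idx] += v
    return classes[np.argmax(scores, axis=1)]
```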

Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Geoffrey I. Webb
    1. School of Computing and Mathematics, Deakin University, Geelong, Australia
