Machine Learning

Volume 40, Issue 3, pp 229–242

Randomizing Outputs to Increase Prediction Accuracy

  • Leo Breiman

Abstract

Bagging and boosting reduce error by changing both the inputs and outputs to form perturbed training sets, growing predictors on these perturbed training sets and combining them. An interesting question is whether it is possible to get comparable performance by perturbing the outputs alone. Two methods of randomizing outputs are experimented with. One is called output smearing and the other output flipping. Both are shown to consistently do better than bagging.
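
The two output-randomization schemes are straightforward to sketch. The code below is a minimal illustration rather than the paper's implementation: it assumes output smearing adds zero-mean Gaussian noise, scaled by the target's standard deviation, to regression outputs, and that output flipping relabels a fixed fraction of integer class labels uniformly at random; the function names, parameters, and use of scikit-learn decision trees are choices made for this sketch only.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor


def smeared_ensemble(X, y, n_trees=25, noise_scale=1.0, seed=None):
    """Output smearing (regression sketch): perturb the targets with Gaussian
    noise for each replicate, grow a tree, and average the predictions."""
    rng = np.random.default_rng(seed)
    sigma = noise_scale * np.std(y)  # noise scale tied to target spread (sketch assumption)
    trees = [
        DecisionTreeRegressor().fit(X, y + rng.normal(0.0, sigma, size=len(y)))
        for _ in range(n_trees)
    ]
    return lambda X_new: np.mean([t.predict(X_new) for t in trees], axis=0)


def flipped_ensemble(X, y, n_trees=25, flip_rate=0.1, seed=None):
    """Output flipping (classification sketch): randomly relabel a fraction of
    the class labels (assumed non-negative integers) for each replicate, grow
    a tree, and predict by majority vote over the ensemble."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    trees = []
    for _ in range(n_trees):
        y_flip = np.array(y, copy=True)
        mask = rng.random(len(y)) < flip_rate          # which labels to flip
        y_flip[mask] = rng.choice(classes, size=mask.sum())
        trees.append(DecisionTreeClassifier().fit(X, y_flip))

    def majority_vote(X_new):
        votes = np.stack([t.predict(X_new) for t in trees]).astype(int)
        return np.array([np.bincount(col).argmax() for col in votes.T])

    return majority_vote
```

As in bagging, every base predictor here is grown on the full set of inputs; only the outputs differ between replicates, which is exactly the question the abstract poses: whether perturbing outputs alone can match perturbing the training set itself.
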

Keywords: ensemble, randomization, output variability


Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Leo Breiman, Statistics Department, University of California, Berkeley, USA
