Machine Learning

Volume 45, Issue 1, pp 5–32

Random Forests

  • Leo Breiman
Article

Abstract

Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, ***, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

Keywords: classification, regression, ensemble
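The sketch below illustrates, for orientation only, the procedure the abstract describes: each tree is grown on an independent bootstrap sample, each node split considers only a small random subset of the features, and the forest classifies by plurality vote; the out-of-bag error computed alongside corresponds to the internal error estimates mentioned above. This is not the paper's reference implementation: it leans on scikit-learn's DecisionTreeClassifier (its max_features option) for the per-node random feature selection, and the names forest_ri, n_trees, and n_split_features are illustrative, not taken from the paper.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def forest_ri(X, y, n_trees=100, n_split_features="sqrt", random_state=0):
    """Grow a forest of trees, each on a bootstrap sample with random
    feature selection at every node; also return the out-of-bag error."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(random_state)
    n_samples = X.shape[0]
    classes = np.unique(y)
    oob_votes = np.zeros((n_samples, classes.size))
    trees = []
    for _ in range(n_trees):
        # Independent, identically distributed randomness per tree:
        # a bootstrap sample plus random feature subsets at each node.
        boot = rng.integers(0, n_samples, n_samples)
        oob = np.setdiff1d(np.arange(n_samples), boot)
        tree = DecisionTreeClassifier(
            max_features=n_split_features,  # features tried at each split
            random_state=int(rng.integers(1 << 31)),
        ).fit(X[boot], y[boot])
        trees.append(tree)
        if oob.size:  # accumulate out-of-bag votes for the internal estimate
            pred = tree.predict(X[oob])
            for c_idx, c in enumerate(classes):
                oob_votes[oob, c_idx] += (pred == c)
    voted = oob_votes.sum(axis=1) > 0  # samples left out of at least one bootstrap
    oob_error = np.mean(classes[oob_votes[voted].argmax(axis=1)] != y[voted])
    return trees, classes, oob_error

def forest_predict(trees, classes, X):
    """Classify new data by a plurality vote over all trees in the forest."""
    X = np.asarray(X)
    votes = np.zeros((X.shape[0], classes.size))
    for tree in trees:
        pred = tree.predict(X)
        for c_idx, c in enumerate(classes):
            votes[:, c_idx] += (pred == c)
    return classes[votes.argmax(axis=1)]

Restricting each split to a random feature subset lowers the correlation between trees while largely preserving the strength of the individual trees, which is the trade-off the abstract's error analysis turns on; the out-of-bag estimate supplies that error without a held-out test set.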

References

  1. Amit, Y. & Geman, D. (1997). Shape quantization and recognition with randomized trees. Neural Computation, 9, 1545–1588.Google Scholar
  2. Amit, Y., Blanchard, G., & Wilder, K. (1999). Multiple randomized classifiers: MRCL Technical Report, Department of Statistics, University of Chicago.Google Scholar
  3. Bauer, E. & Kohavi, R. (1999). An empirical comparison of voting classification algorithms. Machine Learning, 36(1/2), 105–139.Google Scholar
  4. Breiman, L. (1996a). Bagging predictors. Machine Learning 26(2), 123–140.Google Scholar
  5. Breiman, L. (1996b). Out-of-bag estimation, ftp.stat.berkeley.edu/pub/users/breiman/OOBestimation.psGoogle Scholar
  6. Breiman, L. (1998a). Arcing classifiers (discussion paper). Annals of Statistics, 26, 801–824.Google Scholar
  7. Breiman. L. (1998b). Randomizing outputs to increase prediction accuracy. Technical Report 518, May 1, 1998, Statistics Department, UCB (in press, Machine Learning).Google Scholar
  8. Breiman, L. 1999. Using adaptive bagging to debias regressions. Technical Report 547, Statistics Dept. UCB.Google Scholar
  9. Breiman, L. 2000. Some infinity theory for predictor ensembles. Technical Report 579, Statistics Dept. UCB.Google Scholar
  10. Dietterich, T. (1998). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting and randomization, Machine Learning, 1–22.Google Scholar
  11. Freund, Y. & Schapire, R. (1996). Experiments with a new boosting algorithm, Machine Learning: Proceedings of the Thirteenth International Conference, 148–156.Google Scholar
  12. Grove, A. & Schuurmans, D. (1998). Boosting in the limit: Maximizing the margin of learned ensembles. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98).Google Scholar
  13. Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(8), 832–844.Google Scholar
  14. Kleinberg, E. (2000). On the algorithmic implementation of stochastic discrimination. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(5), 473–490.Google Scholar
  15. Schapire, R., Freund, Y., Bartlett, P., & Lee,W. (1998). Boosting the margin:Anewexplanation for the effectiveness of voting methods. Annals of Statistics, 26(5), 1651–1686.Google Scholar
  16. Tibshirani, R. (1996). Bias, variance, and prediction error for classification rules. Technical Report, Statistics Department, University of Toronto.Google Scholar
  17. Wolpert, D. H. & Macready, W. G. (1997). An efficient method to estimate Bagging's generalization error (in press, Machine Learning).Google Scholar

Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Leo Breiman
  1. Statistics Department, University of California, Berkeley
