
Random Forests

  • Adele Cutler
  • D. Richard Cutler
  • John R. Stevens
Chapter

Abstract

Random Forests were introduced by Leo Breiman [6] who was inspired by earlier work by Amit and Geman [2]. Although not obvious from the description in [6], Random Forests are an extension of Breiman’s bagging idea [5] and were developed as a competitor to boosting. Random Forests can be used for either a categorical response variable, referred to in [6] as “classification,” or a continuous response, referred to as “regression.” Similarly, the predictor variables can be either categorical or continuous.
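The two ingredients the abstract names — Breiman's bagging [5] and Amit and Geman's random feature selection [2] — combine as follows: each tree is grown on a bootstrap sample of the data, each split considers only a random subset of the predictors, and the forest predicts by majority vote (classification) or averaging (regression). The sketch below illustrates this with one-split trees ("stumps") in plain Python; it is an illustration of the idea, not the chapter's own code, and all function names are hypothetical. A real analysis would use the randomForest package of Liaw and Wiener [15].

```python
# Minimal Random Forest sketch: bagging + random feature subsets + majority vote.
# Hypothetical illustration only; trees are depth-1 stumps for brevity.
import random
from collections import Counter

def fit_stump(X, y, n_feats, rng):
    """Fit a one-split tree using a random subset of the features."""
    feats = rng.sample(range(len(X[0])), n_feats)  # random feature subset
    best = None
    for f in feats:
        for t in sorted({row[f] for row in X}):    # candidate thresholds
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            lmaj = Counter(left).most_common(1)[0][0]
            rmaj = Counter(right).most_common(1)[0][0]
            err = sum(yi != lmaj for yi in left) + sum(yi != rmaj for yi in right)
            if best is None or err < best[0]:
                best = (err, f, t, lmaj, rmaj)
    if best is None:  # degenerate bootstrap sample: always predict majority class
        maj = Counter(y).most_common(1)[0][0]
        return (feats[0], float("inf"), maj, maj)
    return best[1:]   # (feature, threshold, left_label, right_label)

def fit_forest(X, y, n_trees=25, n_feats=1, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample of the data (bagging)."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        forest.append(fit_stump([X[i] for i in idx],
                                [y[i] for i in idx], n_feats, rng))
    return forest

def predict(forest, row):
    """Classification: aggregate the trees' votes by majority."""
    votes = [l if row[f] <= t else r for f, t, l, r in forest]
    return Counter(votes).most_common(1)[0][0]
```

Because each tree sees a different bootstrap sample and a different feature subset, the trees are decorrelated, and the vote averages away much of their individual variance; for a continuous response the same scheme averages the trees' numeric predictions instead of voting.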

Keywords

Random Forest, Regression Tree, Terminal Node, Variable Importance, Generalization Error

References

  1. Amaratunga, D., Cabrera, J., Lee, Y.-S.: Enriched random forests. Bioinformatics 24(18) pp. 2010–2014 (2008).
  2. Amit, Y., Geman, D.: Shape quantization and recognition with randomized trees. Neural Computation 9(7) pp. 1545–1588 (1997).
  3. Biau, G., Devroye, L., Lugosi, G.: Consistency of Random Forests and Other Averaging Classifiers. Journal of Machine Learning Research 9 pp. 2039–2057 (2008).
  4. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth, New York (1984).
  5. Breiman, L.: Bagging Predictors. Machine Learning 24(2) pp. 123–140 (1996).
  6. Breiman, L.: Random Forests. Machine Learning 45(1) pp. 5–32 (2001).
  7. Chen, X., Liu, C.-T., Zhang, M., Zhang, H.: A forest-based approach to identifying gene and gene–gene interactions. Proc Natl Acad Sci USA 104(49) pp. 19199–19203 (2007).
  8. Dettling, M.: BagBoosting for Tumor Classification with Gene Expression Data. Bioinformatics 20(18) pp. 3583–3593 (2004).
  9. Diaz-Uriarte, R., Alvarez de Andres, S.: Gene Selection and Classification of Microarray Data Using Random Forest. BMC Bioinformatics 7(1) 3 (2006).
  10. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Springer Series in Statistics, Springer, New York (2009).
  11. Goldstein, B., Hubbard, A., Cutler, A., Barcellos, L.: An application of Random Forests to a genome-wide association dataset: Methodological considerations & new findings. BMC Genetics 11(1) 49 (2010).
  12. Hothorn, T., Bühlmann, P., Dudoit, S., Molinaro, A., Van Der Laan, M.: Survival Ensembles. Biostatistics 7(3) pp. 355–373 (2006).
  13. Ishwaran, H., Kogalur, U.B., Blackstone, E.H., Lauer, M.S.: Random survival forests. Annals of Applied Statistics 2(3) pp. 841–860 (2008).
  14. Izenman, A.: Modern Multivariate Statistical Techniques. Springer Texts in Statistics, Springer, New York (2008).
  15. Liaw, A., Wiener, M.: Classification and Regression by randomForest. R News 2(3) pp. 18–22 (2002).
  16. Lin, Y., Jeon, Y.: Random Forests and Adaptive Nearest Neighbors. Journal of the American Statistical Association 101(474) pp. 578–590 (2006).
  17. Mease, D., Wyner, A.: Evidence Contrary to the Statistical View of Boosting. Journal of Machine Learning Research 9 pp. 131–156 (2008).
  18. Meinshausen, N.: Quantile Regression Forests. Journal of Machine Learning Research 7 pp. 983–999 (2006).
  19. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2011). http://www.R-project.org.
  20. Schroff, F., Criminisi, A., Zisserman, A.: Object Class Segmentation using Random Forests. Proceedings of the British Machine Vision Conference 2008, British Machine Vision Association, 1 (2008).
  21. Segal, M., Xiao, Y.: Multivariate Random Forests. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1(1) pp. 80–87 (2011).
  22. Singh, D., Febbo, P.G., Ross, K., Jackson, D.G., Manola, J., Ladd, C., Tamayo, P., Renshaw, A.A., D'Amico, A.V., Richie, J.P., Lander, E.S., Loda, M., Kantoff, P.W., Golub, T.R., Sellers, W.R.: Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1(2) pp. 203–209 (2002).
  23. Stamey, T., Kabalin, J., McNeal, J., Johnstone, I., Freiha, F., Redwine, E., Yang, N.: Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate. II. Radical prostatectomy treated patients. Journal of Urology 141 pp. 1076–1083 (1989).
  24. Statnikov, A., Wang, L., Aliferis, C.: A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification. BMC Bioinformatics 9(1) 319 (2008).
  25. Wang, M., Chen, X., Zhang, H.: Maximal conditional chi-square importance in random forests. Bioinformatics 26(6) pp. 831–837 (2010).
  26. Zhang, H., Singer, B.H.: Recursive Partitioning and Applications, Second Edition. Springer Series in Statistics, Springer, New York (2010).

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Adele Cutler (1)
  • D. Richard Cutler (1)
  • John R. Stevens (1)
  1. Department of Mathematics and Statistics, Utah State University, Logan, USA
