Bell, D., Wang, H.: A formalism for relevance and its application in feature subset selection. Mach. Learn. 41(2), 175–195 (2000)
Berk, R.A.: An introduction to ensemble methods for data analysis. Sociol. Methods Res. 34(3), 263–295 (2006)
Breiman, L.: The heuristic of instability in model selection. Ann. Stat. 24, 2350–2383 (1996)
Breiman, L.: Random Forests. Mach. Learn. 45, 5–32 (2001a)
Breiman, L.: Statistical modeling: the two cultures. Stat. Sci. 16, 199–231 (2001b)
Breiman, L.: Manual on setting up, using, and understanding Random Forests v3.1. Technical report (2002). http://oz.berkeley.edu/users/breiman
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Chapman & Hall, London (1984)
Breiman, L., Cutler, A., Liaw, A., Wiener, M.: Breiman and Cutler’s Random Forests for classification and regression. R package version 4.5-18 (2006). http://cran.r-project.org/doc/packages/randomForest.pdf
Bühlmann, P., Yu, B.: Analyzing bagging. Ann. Stat. 30(4), 927–961 (2002)
Dobra, A., Gehrke, J.: Bias correction in classification tree construction. In: Brodley, C.E., Danyluk, A.P. (eds.) Proceedings of the Seventeenth International Conference on Machine Learning, Williams College, Williamstown, MA, USA, pp. 90–97 (2001)
Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001)
Friedman, J.H.: Tutorial: getting started with MART in R. Technical report, Stanford University (2002). http://www-stat.stanford.edu/~jhf/r-mart/tutorial/tutorial.pdf
Hothorn, T., Hornik, K., Zeileis, A.: Unbiased recursive partitioning: a conditional inference framework. J. Comput. Graph. Stat. 15(3), 651–674 (2006)
Kim, H., Loh, W.: Classification trees with unbiased multiway splits. J. Am. Stat. Assoc. 96, 589–604 (2001)
Kononenko, I.: On biases in estimating multi-valued attributes. In: Mellish, C. (ed.) Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, Montréal, Canada, pp. 1034–1040 (1995)
Liaw, A., Wiener, M.: Classification and regression by Random Forest. R News 2(3), 18–22 (2002)
Loh, W.-Y., Shih, Y.-S.: Split selection methods for classification trees. Stat. Sinica 7, 815–840 (1997)
Murthy, S.K.: Automatic construction of decision trees from data: a multi-disciplinary survey. Data Min. Knowl. Discov. 2(4), 345–389 (1998)
Nierenberg, D.W., Stukel, T.A., Baron, J.A., Dain, B.J., Greenberg, E.R.: Determinants of plasma levels of beta-carotene and retinol. Am. J. Epidemiol. 130, 511–521 (1989)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco (1988)
R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2008). ISBN 3-900051-07-0. http://www.R-project.org
Ridgeway, G.: Generalized boosted models: a guide to the gbm package. http://i-pensieri.com/gregr/papers/gbm-vignette.pdf (2007)
Sandri, M., Zuccolotto, P.: A bias correction algorithm for the Gini variable importance measure in classification trees. J. Comput. Graph. Stat. 17(3), 1–18 (2008)
Schonlau, M.: Boosted regression (boosting): a tutorial and a stata plugin. Stata J. 5(3), 330–354 (2005)
Shih, Y.-S.: Families of splitting criteria for classification trees. Stat. Comput. 9, 309–315 (1999)
Strobl, C.: Statistical sources of variable selection bias in classification trees based on the Gini index. Technical report, SFB 386 (2005). http://epub.ub.uni-muenchen.de/archive/00001789/01/paper_420.pdf
Strobl, C., Boulesteix, A.-L., Augustin, T.: Unbiased split selection for classification trees based on the Gini index. Comput. Stat. Data Anal. (2007a). doi:10.1016/j.csda.2006.12.030
Strobl, C., Boulesteix, A.-L., Zeileis, A., Hothorn, T.: Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinf. 8, 25 (2007b). doi:10.1186/1471-2105-8-25
Strobl, C., Boulesteix, A.-L., Kneib, T., Augustin, T., Zeileis, A.: Conditional variable importance for random forests. BMC Bioinf. 9, 307 (2008). doi:10.1186/1471-2105-9-307
van der Laan, M.J.: Statistical inference for variable importance. Int. J. Biostat. 2(1), 1–30 (2005)
White, A.P., Liu, W.Z.: Bias in information-based measures in decision tree induction. Mach. Learn. 15, 321–329 (1994)
Wu, Y., Boos, D.D., Stefanski, L.A.: Controlling variable selection by the addition of pseudovariables. J. Am. Stat. Assoc. 102(477), 235–243 (2007)