Comments on: A random forest guided tour
We discuss future challenges in developing statistical theory for random forests. In particular, we suggest that an analysis of bias and extrapolation is vital to understanding the statistical properties of variable importance measures. We further point to the incorporation of random forests within larger statistical models as an important tool for high-dimensional statistical inference.
Keywords: Random forests, Machine learning, Extrapolation, Variable importance
Mathematics Subject Classification: 62G09
This work was supported by NSF grants DMS-103252 and DEB-1353039.