Neural Random Forests

Abstract

Given an ensemble of randomized regression trees, it is possible to restructure them as a collection of multilayered neural networks with particular connection weights. Following this principle, we reformulate the random forest method of Breiman (2001) into a neural network setting, and in turn propose two new hybrid procedures that we call neural random forests. Both predictors exploit prior knowledge of regression trees for their architecture, have fewer parameters to tune than standard networks, and fewer restrictions on the geometry of the decision boundaries than trees. Consistency results are proved, and substantial numerical evidence is provided on both synthetic and real data sets to assess the excellent performance of our methods in a large variety of prediction problems.
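To make the tree-to-network correspondence concrete, here is a minimal sketch in Python (using NumPy and scikit-learn; this is not the authors' code). Each split of a fitted regression tree becomes a hard-threshold unit in the first hidden layer, each leaf a unit in the second hidden layer, and the output layer stores the leaf means. The helper names tree_to_network and network_predict are illustrative; the trainable variants described in the abstract replace the hard thresholds with smooth activations so that the connection weights can subsequently be refined by gradient descent.

```python
# Sketch of encoding a fitted regression tree as a two-hidden-layer network
# with hard-threshold units, so that the network reproduces the tree exactly.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def tree_to_network(tree):
    """Return (W1, b1, W2, b2, w3) such that the two-hidden-layer network
    with sign/step activations reproduces the tree's predictions exactly."""
    t = tree.tree_
    internal = [k for k in range(t.node_count) if t.children_left[k] != -1]
    leaves = [k for k in range(t.node_count) if t.children_left[k] == -1]
    d = tree.n_features_in_

    # Layer 1: unit i outputs +1 when x[feature] > threshold (go right), -1 otherwise.
    W1 = np.zeros((len(internal), d))
    b1 = np.zeros(len(internal))
    for i, k in enumerate(internal):
        W1[i, t.feature[k]] = 1.0
        b1[i] = -t.threshold[k]

    # Layer 2: each leaf unit sums the +/-1 split indicators along its
    # root-to-leaf path; the bias is chosen so that only the leaf actually
    # containing x is activated.
    parent = {}  # child node id -> (index of parent in `internal`, +1 right / -1 left)
    for i, k in enumerate(internal):
        parent[t.children_left[k]] = (i, -1.0)
        parent[t.children_right[k]] = (i, +1.0)
    W2 = np.zeros((len(leaves), len(internal)))
    b2 = np.zeros(len(leaves))
    for j, leaf in enumerate(leaves):
        node, depth = leaf, 0
        while node in parent:
            i, direction = parent[node]
            W2[j, i] = direction
            node = internal[i]
            depth += 1
        b2[j] = -(depth - 1)  # correct leaf scores +1, every other leaf <= -1

    w3 = t.value[leaves, 0, 0]  # output weights: the mean response in each leaf
    return W1, b1, W2, b2, w3

def network_predict(X, params):
    W1, b1, W2, b2, w3 = params
    h1 = np.sign(X @ W1.T + b1)
    h1[h1 == 0] = -1.0            # ties go left, matching the "x <= t" split rule
    h2 = (h1 @ W2.T + b2) > 0     # one-hot leaf membership
    return h2.astype(float) @ w3

# Sanity check: the network agrees with the tree it was built from.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2
tree = DecisionTreeRegressor(max_depth=4).fit(X, y)
assert np.allclose(network_predict(X, tree_to_network(tree)), tree.predict(X))
```

Applying the same encoding to every tree of a forest yields the collection of networks mentioned above; relaxing the sign activations to scaled hyperbolic tangents then makes the split positions and leaf weights learnable.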

References

  • Abadi, M., Agarwal, A., Barham, P., Brevdo, E., et al. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. http://tensorflow.org/.

  • Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning 2, 1–127.

  • Biau, G. and Scornet, E. (2016). A random forest guided tour (with comments and a rejoinder by the authors). TEST 25, 197–227.

  • Boulesteix, A.-L., Janitza, S., Kruppa, J. and König, I.R. (2012). Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2, 493–507.

  • Breiman, L. (2001). Random forests. Machine Learning 45, 5–32.

  • Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984). Classification and Regression Trees. Chapman & Hall/CRC, Boca Raton.

  • Brent, R.P. (1991). Fast training algorithms for multi-layer neural nets. IEEE Transactions on Neural Networks 2, 346–354.

  • Chipman, H.A., George, E.I. and McCulloch, R.E. (2010). BART: Bayesian additive regression trees. The Annals of Applied Statistics 4, 266–298.

  • Cortez, P. and Morais, A. (2007). A data mining approach to predict forest fires using meteorological data. In: Neves, J., Santos, M.F. and Machado, J. (eds.), New Trends in Artificial Intelligence, Proceedings of the 13th Portuguese Conference on Artificial Intelligence, pp. 512–523.

  • Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer, New York.

  • Fernández-Delgado, M., Cernadas, E., Barro, S. and Amorim, D. (2014). Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research 15, 3133–3181.

  • Geurts, P. and Wehenkel, L. (2005). Closed-form dual perturb and combine for tree-based models. In: Proceedings of the 22nd International Conference on Machine Learning, ACM, New York, pp. 233–240.

  • Györfi, L., Kohler, M., Krzyżak, A. and Walk, H. (2002). A Distribution-Free Theory of Nonparametric Regression. Springer, New York.

  • Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning, 2nd edition. Springer, New York.

  • Ioannou, Y., Robertson, D., Zikic, D., Kontschieder, P., Shotton, J., Brown, M. and Criminisi, A. (2016). Decision forests, convolutional networks and the models in-between. arXiv:1603.01250.

  • Ishwaran, H., Kogalur, U.B., Chen, X. and Minn, A.J. (2011). Random survival forests for high-dimensional data. Statistical Analysis and Data Mining 4, 115–132.

  • Jordan, M.I. and Jacobs, R.A. (1994). Hierarchical mixtures of experts and the EM algorithm. Neural Computation 6, 181–214.

  • Kingma, D.P. and Ba, J. (2015). Adam: A method for stochastic optimization. In: International Conference on Learning Representations.

  • Kontschieder, P., Fiterau, M., Criminisi, A. and Rota Bulò, S. (2015). Deep neural decision forests. In: International Conference on Computer Vision.

  • Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml.

  • Lugosi, G. and Zeger, K. (1995). Nonparametric estimation via empirical risk minimization. IEEE Transactions on Information Theory 41, 677–687.

  • Meinshausen, N. (2006). Quantile regression forests. Journal of Machine Learning Research 7, 983–999.

  • Menze, B.H., Kelm, B.M., Splitthoff, D.N., Koethe, U. and Hamprecht, F.A. (2011). On oblique random forests. In: Gunopulos, D., Hofmann, T., Malerba, D. and Vazirgiannis, M. (eds.), Machine Learning and Knowledge Discovery in Databases, Springer, Berlin, pp. 453–469.

  • Olaru, C. and Wehenkel, L. (2003). A complete fuzzy decision tree technique. Fuzzy Sets and Systems 138, 221–254.

  • Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830.

  • Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.

  • Redmond, M. and Baveja, A. (2002). A data-driven software tool for enabling cooperative information sharing among police departments. European Journal of Operational Research 141, 660–678.

  • Richmond, D.L., Kainmueller, D., Yang, M.Y., Myers, E.W. and Rother, C. (2015). Relating cascaded random forests to deep convolutional neural networks for semantic segmentation. arXiv:1507.07583.

  • Rokach, L. and Maimon, O. (2008). Data Mining with Decision Trees: Theory and Applications. World Scientific, Singapore.

  • Scornet, E., Biau, G. and Vert, J.-P. (2015). Consistency of random forests. The Annals of Statistics 43, 1716–1741.

  • Sethi, I.K. (1990). Entropy nets: From decision trees to neural networks. Proceedings of the IEEE 78, 1605–1613.

  • Sethi, I.K. (1991). Decision tree performance enhancement using an artificial neural network interpretation. In: Sethi, I.K. and Jain, A.K. (eds.), Artificial Neural Networks and Statistical Pattern Recognition: Old and New Connections, Elsevier, Amsterdam, pp. 71–88.

  • Welbl, J. (2014). Casting random forests as artificial neural networks (and profiting from it). In: Jiang, X., Hornegger, J. and Koch, R. (eds.), Pattern Recognition, Springer, Berlin, pp. 765–771.

  • Yeh, I.-C. (1998). Modeling of strength of high-performance concrete using artificial neural networks. Cement and Concrete Research 28, 1797–1808.

  • Yildiz, O.T. and Alpaydin, E. (2013). Regularizing soft decision trees. In: Information Sciences and Systems 2013, Springer, Cham, pp. 15–21.

Acknowledgments

We sincerely thank the Associate Editor and the Referee for valuable comments and insightful suggestions, which led to a substantial improvement of the paper.

Corresponding author

Correspondence to Gérard Biau.

Cite this article

Biau, G., Scornet, E. & Welbl, J. Neural Random Forests. Sankhya A 81, 347–386 (2019). https://doi.org/10.1007/s13171-018-0133-y
