
Bootstrap Aggregating and Random Forest

Part of the Advanced Studies in Theoretical and Applied Econometrics book series (ASTA, volume 52)

Abstract

Bootstrap Aggregating (Bagging) is an ensemble technique for improving the robustness of forecasts. Random Forest is a successful method that combines Bagging with Decision Trees. In this chapter, we explore Bagging, Random Forest, and their variants, covering both theory and practice, and we discuss applications of these methods in economic forecasting and inference.
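The core idea of Bagging can be sketched in a few lines: fit an unstable base learner (here, a one-split regression stump standing in for a decision tree) on many bootstrap resamples of the data and average the predictions. This is a minimal illustrative NumPy sketch, not the chapter's code; the data-generating process and the stump learner are assumptions chosen only to make the averaging effect visible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a noisy step function, which a single stump fits unstably
n = 200
x = rng.uniform(0, 1, n)
y = (x > 0.5).astype(float) + rng.normal(0, 0.3, n)

def fit_stump(xs, ys):
    """Fit a one-split regression stump: choose the threshold that
    minimizes squared error, then predict the mean on each side."""
    best = None
    for t in np.quantile(xs, np.linspace(0.05, 0.95, 19)):
        left, right = ys[xs <= t], ys[xs > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, m_left, m_right = best
    return lambda z: np.where(z <= t, m_left, m_right)

# Bagging: fit the stump on B bootstrap resamples and average the predictions
B = 100
stumps = []
for _ in range(B):
    idx = rng.integers(0, n, n)  # bootstrap sample: draw n indices with replacement
    stumps.append(fit_stump(x[idx], y[idx]))

grid = np.linspace(0, 1, 101)
bagged = np.mean([s(grid) for s in stumps], axis=0)  # bagged predictor
single = fit_stump(x, y)(grid)                       # single-stump predictor
```

Because each resample places the split threshold slightly differently, the bagged prediction is a smooth average of step functions rather than a single hard step, which is the smoothing effect of Bagging analyzed by Bühlmann and Yu (2002). Random Forest extends this recipe by also randomizing the predictors considered at each tree split.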



Author information


Correspondence to Tae-Hwy Lee.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Lee, TH., Ullah, A., Wang, R. (2020). Bootstrap Aggregating and Random Forest. In: Fuleky, P. (eds) Macroeconomic Forecasting in the Era of Big Data. Advanced Studies in Theoretical and Applied Econometrics, vol 52. Springer, Cham. https://doi.org/10.1007/978-3-030-31150-6_13
