Abstract
We present an introduction to “stacked generalization” (Wolpert in Neural Networks 5(2):241–259, 1992 [21]). The increased availability of “Big Data” in economics and the emergence of non-traditional machine learning tools present new opportunities for applied economics, but they also impose additional challenges. The full range of supervised learning algorithms offers a rich variety of methods, each of which may be better suited to a specific problem. Selecting the best algorithm and tuning its parameters can be time-consuming, given the potential lack of guidance from experience in machine learning and from the relevant economic literature. “Stacking” is a useful tool for addressing this problem: it is an ensemble method that combines multiple supervised machine learners in order to achieve more accurate predictions than any of the individual learners could produce on its own. Besides providing an introduction to the stacking methodology, we also present a short survey of some of the estimators, or “base learners”, that can be used with stacking: lasso, ridge, elastic net, support vector machines, gradient boosting, and random forests. Our empirical example of stacking regression follows the study by Fatehkia et al. (PLOS ONE 14(2):1–16, 2019 [6]): predicting crime rates in localities using demographic and socioeconomic data combined with data from Facebook on user interests.
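The stacking approach summarized above can be sketched with scikit-learn's StackingRegressor (see note 3 and reference [14]). The synthetic data, choice of base learners, and hyperparameter values below are illustrative only, not the paper's specification:

```python
# Minimal sketch of stacking regression: several base learners are
# combined by a final estimator fitted on their cross-validated
# predictions. Data and hyperparameters are illustrative.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=20, noise=10.0,
                       random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three of the base learners surveyed in the paper.
base_learners = [
    ("lasso", Lasso(alpha=0.1)),
    ("ridge", Ridge(alpha=1.0)),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
]

# The final estimator (here a linear model) weights the base learners'
# out-of-fold predictions; cv=5 controls the internal cross-fitting.
stack = StackingRegressor(estimators=base_learners,
                          final_estimator=LinearRegression(), cv=5)
stack.fit(X_train, y_train)
print(stack.score(X_test, y_test))  # out-of-sample R-squared
```

The same workflow is wrapped for Stata users by the authors' pystacked program [1].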
Invited paper for the 15th International Conference of the Thailand Econometric Society, Chiang Mai University, Thailand, 5–7 January 2022. All errors are our own.
Notes
- 1.
- 2. Validation refers to an out-of-sample performance check of the estimator, obtained by applying it to labelled data that are not part of the training set.
- 3. See Pedregosa et al. [14] for background on the scikit-learn project, and https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.StackingRegressor.html for details about the StackingRegressor estimator.
- 4. A neural net base learner is also available, but we omit it, in part because it can require substantial tuning; we did not do such tuning for the other base learners.
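The out-of-sample check described in note 2 can be illustrated with k-fold cross-validation; the data and the choice of a ridge base learner below are assumptions for the sketch, not the paper's setup:

```python
# Out-of-sample validation sketch: each fold is held out in turn,
# the model is fitted on the remaining folds, and performance is
# measured on the held-out labelled data.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=10, noise=5.0,
                       random_state=1)
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5)  # 5-fold R-squared
print(scores.mean())
```

Cross-validation of this kind underlies both model selection (see Arlot and Celisse [2] for a survey) and the cross-fitting step inside stacking itself.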
References
Ahrens, A., Hansen, C. B., & Schaffer, M. E. (2021). pystacked: Stata program for stacking regression. https://statalasso.github.io/docs/pystacked/
Arlot, S., & Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics Surveys, 4, 40–79.
Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory (pp. 144–152).
Breiman, L. (1996). Stacked regressions. Machine Learning, 24(1), 49–64.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Fatehkia, M., O’Brien, D., & Weber, I. (2019). Correlated impulses: Using Facebook interests to improve predictions of crime rates in urban areas. PLOS ONE, 14(2), 1–16. https://doi.org/10.1371/journal.pone.0211350
Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4), 367–378.
Friedman, J. H. (2017). The elements of statistical learning: Data mining, inference, and prediction. Springer Open.
Graczyk, M., Lasota, T., Trawiński, B., & Trawiński, K. (2010). Comparison of bagging, boosting and stacking ensembles applied to real estate appraisal. In Asian Conference on Intelligent Information and Database Systems (pp. 340–350). Springer.
Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer Series in Statistics. Springer. ISBN: 9780387848846. https://books.google.co.uk/books?id=eBSgoAEACAAJ
Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Applications to nonorthogonal problems. Technometrics, 12(1), 69–82. https://doi.org/10.1080/00401706.1970.10488635
Larson, S. C. (1931). The shrinkage of the coefficient of multiple correlation. Journal of Educational Psychology, 22(1), 45.
Lei, J. (2020). Cross-validation with confidence. Journal of the American Statistical Association, 115(532), 1978–1997.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
Picard, R. R., & Cook, R. D. (1984). Cross-validation of regression models. Journal of the American Statistical Association, 79(387), 575–583.
Shao, J. (1993). Linear model selection by cross-validation. Journal of the American Statistical Association, 88(422), 486–494.
Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society: Series B (Methodological), 36(2), 111–133.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288. ISSN: 00359246. https://doi.org/10.2307/2346178
Tikhonov, A. N. (1963). On the solution of ill-posed problems and the method of regularization. Doklady Akademii Nauk, 151(3), 501–504.
Ting, K. M., & Witten, I. H. (1999). Issues in stacked generalization. Journal of Artificial Intelligence Research, 10, 271–289.
Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–259.
Yang, Y. (2007). Consistency of cross validation for comparing regression procedures. The Annals of Statistics, 35(6), 2450–2473.
Zhang, P. (1993). Model selection via multifold cross validation. The Annals of Statistics, 21(1), 299–313.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 67(2), 301–320. ISSN: 13697412. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Ahrens, A., Ersoy, E., Iakovlev, V., Li, H., Schaffer, M.E. (2022). An Introduction to Stacking Regression for Economists. In: Sriboonchitta, S., Kreinovich, V., Yamaka, W. (eds) Credible Asset Allocation, Optimal Transport Methods, and Related Topics. TES 2022. Studies in Systems, Decision and Control, vol 429. Springer, Cham. https://doi.org/10.1007/978-3-030-97273-8_2
Print ISBN: 978-3-030-97272-1
Online ISBN: 978-3-030-97273-8