Abstract
Forecasting the business cycle can help policymakers implement economic policies more effectively. This paper selects 62 macroeconomic and financial indicators and divides them into two data sets to forecast China's business cycle. Using a simple rolling window method, data from the past 36 months are used to predict China's business cycle for the next month. Five machine learning models are fitted to the training set and tuned: XGBoost, SVM, Logistic Regression, Decision Tree, and Random Forest. Evaluation metrics derived from the confusion matrix show that all five algorithms can reliably anticipate China's business cycle, with Logistic Regression outperforming the others. The paper also compares the model predictions with the actual values and discusses the differences between them.
References
Antunes, A., Bonfim, D., Monteiro, N., et al. (2018). Forecasting banking crises with dynamic panel probit models. International Journal of Forecasting, 34, 249–275.
Barro, R. J. (1995). Inflation and economic growth. Bank of England Quarterly Bulletin, 35, 166–176.
Bellotti, A., Brigo, D., Gambetti, P., et al. (2021). Forecasting recovery rates on non-performing loans with machine learning. International Journal of Forecasting, 37, 428–444.
Berger, T., Everaert, G., & Pozzi, L. (2021). Testing for international business cycles: A multilevel factor model with stochastic factor selection. Journal of Economic Dynamics and Control, 128, 1–16.
Blanchard, O., & Simon, J. (2001). The long and large decline in US output volatility. Brookings Papers on Economic Activity, 1, 135–174.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32.
Burns, A. F., & Mitchell, W. C. (1946). Measuring business cycles. National Bureau of Economic Research.
Chauvet, M., & Potter, S. (2005). Forecasting recessions using the yield curve. Journal of Forecasting, 24(2), 77–103.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794.
Davig, T., & Hall, A. S. (2019). Recession forecasting using Bayesian classification. International Journal of Forecasting, 35, 848–867.
Del Negro, M., & Schorfheide, F. (2013). DSGE model-based forecasting. Handbook of Economic Forecasting, 2, 57–140.
Döpke, J., Fritsche, U., & Pierdzioch, C. (2017). Predicting recessions with boosted regression trees. International Journal of Forecasting, 33, 745–759.
Dueker, M. (1997). Strengthening the case for the yield curve as a predictor of US recessions. Federal Reserve Bank of St Louis Economic Review, 79, 41–51.
Estrella, A. (1998). A new measure of fit for equations with dichotomous dependent variables. Journal of Business & Economic Statistics, 16, 198–205.
Estrella, A., & Mishkin, F. S. (1998). Predicting US recessions: Financial variables as leading indicators. The Review of Economics and Statistics, 80(1), 45–61.
Faccini, R., et al. (2019). A new predictor of US real economic activity: The S&P 500 option implied risk aversion. Management Science, 65(10), 4451–4949.
Ge, X., et al. (2022). The driving forces of China’s business cycles: Evidence from an estimated DSGE model with housing and banking. China Economic Review, 72, 101753.
Gogas, P., Papadimitriou, T., & Chrysanthidou, E. (2015). Yield curve point triplets in recession forecasting. International Finance, 18, 207–226.
Hasse, J. B., & Lajaunie, Q. (2020). Does the yield curve signal recessions? New evidence from an international panel data analysis. The Quarterly Review of Economics and Finance, 84, 9–22.
He, Q., et al. (2017). Housing prices and business cycle in China: A DSGE Analysis. International Review of Economics & Finance, 52, 246–256.
Hsu, M., & Zhao, M. (2009). China’s business cycles between 1954–2004: Productivity and fiscal policy changes. MPRA Paper 21283.
Huang, J., Tsai, Y., Wu, P., et al. (2020). Predictive modeling of blood pressure during hemodialysis: A comparison of linear model, random forest, support vector regression, XGBoost, LASSO regression and ensemble method. Computer Methods and Programs in Biomedicine, 195, 1–6.
Huang, Y., & Yen, M. (2019). A new perspective of performance comparison among machine learning algorithms for financial distress prediction. Applied Soft Computing, 83, 1–14.
Jiang, H., Deng, W., Zhou, J., et al. (2021). Machine learning algorithms to predict the 1-year unfavorable prognosis for advanced schistosomiasis. International Journal for Parasitology, 21, 1–7.
Jurado, K., Ludvigson, S. C., & Ng, S. (2015). Measuring uncertainty. American Economic Review, 105(3), 1177–1216.
Kauppi, H., & Saikkonen, P. (2008). Predicting US recessions with dynamic binary response models. The Review of Economics and Statistics, 90(4), 777–791.
Kiani, K. M. (2011). Fluctuations in economic and activity and stabilization policies in the CIS. Computational Economics, 37(2), 5669.
King, R. G., & Watson, M. W. (1996). Money, prices, interest rates and the business cycle. The Review of Economics and Statistics, 78(1), 35–53.
Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 79956.
Murphy, K. P. (2022). Probabilistic machine learning: An introduction. MIT Press.
Ng, E. C. Y. (2012). Forecasting US recessions with various risk factors and dynamic probit models. Journal of Macroeconomics, 34, 114–125.
Nyberg, H. (2010). Dynamic probit models and financial variables in recession forecasting. Journal of Forecasting, 29, 215–230.
Petropoulos, F., et al. (2022). Forecasting: Theory and practice. International Journal of Forecasting, 38(3), 705–871.
Phillips, T., & Abdulla, W. (2021). Developing a new ensemble approach with multi-class SVMs for Manuka honey quality classification. Applied Soft Computing, 111, 1–12.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81–106.
Sahani, N., & Ghosh, T. (2021). GIS-based spatial prediction of recreational trail susceptibility in protected area of Sikkim Himalaya using logistic regression, decision tree and random forest model. Ecological Informatics, 64, 1–17.
Smets, F., & Wouters, R. (2007). Shocks and frictions in US business cycles: A Bayesian DSGE approach. American Economic Review, 97(3), 586–606.
Vapnik, V., Golowich, S., & Smola, A. (1996). Support vector method for function approximation, regression estimation and signal processing. Advances in Neural Information Processing Systems, 9, 469652.
Vrontos, S. D., Galakis, J., & Vrontos, I. D. (2021). Modeling and predicting U.S. recessions using machine learning techniques. International Journal of Forecasting, 37, 647–671.
Zeng, Z., & Li, M. (2021). Bayesian median autoregression for robust time series forecasting. International Journal of Forecasting, 37, 1000–1010.
Funding
The authors gratefully acknowledge financial support from the National Social Science Fund of China (No. 19CJL028).
Ethics declarations
Conflict of Interest
The authors declare that they have no conflicts of interest.
Appendices
Appendix A: Statistical Description of the Data
An overview of data 1 and data 2 is given in Tables 7 and 8. The trend diagrams of the dependent and explanatory variables of data 1, including Y, RPI, the demand deposit rate, and CPI, are shown in Fig. 5.
Appendix B: Prediction Results of Small Data Set
To obtain a more accurate ranking of factor importance for the machine learning models, we employ an interpretable machine learning method, SHAP. SHAP (SHapley Additive exPlanations) ranks the influence of features by the marginal contribution of each feature to the model's prediction, using the contribution allocation method of cooperative game theory; it is a post-hoc method that can substantially improve the interpretability of machine learning models. The SHAP framework used in this paper was proposed by Lundberg and Lee (2017) and can be used to explain models including logistic regression, random forest, and XGBoost. Because logistic regression has the highest accuracy for data 1, we rank the logistic regression features. Figure 6 depicts the top ten ranked features. They include the month-on-month increase (%) in money supply (M1), the cumulative growth (%) in investment in new fixed assets in real estate development, the increase (%) in the Corporate Goods Transaction Price Index (CGPI), and the increase (%) in the Producer Price Index (PPI), each taken over the previous n months of data. We therefore use these four variables to forecast the business cycle; Table 9 and Fig. 7 report the accuracy of this forecast. After removing redundant variables, the four variables alone still achieve high prediction accuracy, with the Logistic Regression model reaching the highest accuracy of 80% and SVM the second highest at 77%.
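The ranking step described above can be illustrated with a minimal sketch. It is not the authors' code: it uses synthetic data rather than the paper's indicators, fits a logistic regression by plain gradient descent, and relies on the closed-form SHAP values for a linear model under the assumption of independent features, where the contribution of feature j in log-odds space is w_j (x_j - mean(x_j)). The helper names `fit_logistic` and `linear_shap`, the sample size, and the coefficients are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=3000):
    """Fit a logistic regression by batch gradient descent (illustrative only)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                             # gradient of the log-loss w.r.t. the logit
        w -= lr * (X.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def linear_shap(X, w):
    """SHAP values in log-odds space for a linear model.

    Assumes independent features, so phi_ij = w_j * (x_ij - mean_j).
    """
    return (X - X.mean(axis=0)) * w

# Synthetic stand-in for the indicator matrix: 600 months, 10 features,
# of which only four (hypothetical indices 0, 3, 5, 8) drive the label.
rng = np.random.default_rng(0)
n, d = 600, 10
X = rng.normal(size=(n, d))
true_w = np.zeros(d)
true_w[[0, 3, 5, 8]] = [2.0, -1.5, 1.0, 0.8]
prob = 1.0 / (1.0 + np.exp(-(X @ true_w)))
y = (rng.random(n) < prob).astype(float)

w, b = fit_logistic(X, y)
phi = linear_shap(X, w)

# Global importance: mean absolute SHAP value per feature, largest first.
importance = np.abs(phi).mean(axis=0)
ranking = np.argsort(importance)[::-1]
top4 = ranking[:4]

# Refit on the selected features only, mirroring the reduced-variable forecast.
w_small, b_small = fit_logistic(X[:, top4], y)
pred = (X[:, top4] @ w_small + b_small > 0).astype(float)
accuracy = (pred == y).mean()  # in-sample accuracy on the synthetic data
```

In this sketch the per-row SHAP values sum to the difference between a row's logit and the average logit, which is the additivity property that makes the ranking interpretable. For tree models such as random forest or XGBoost, the closed form above does not apply and a dedicated explainer would be needed.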
About this article
Cite this article
Tang, P., Zhang, Y. China's business cycle forecasting: a machine learning approach. Comput Econ (2024). https://doi.org/10.1007/s10614-024-10549-w