## Abstract

Forecasting multiple macroeconomic variables subject to accounting identities, known as macroframework forecasting, is useful for presenting an internally consistent economic narrative and is widely used in policy institutions. Macroframework forecasting, however, is challenging. Forecasters often have information about only a subset of variables (the known variables), and in the absence of a systematic way to forecast the remaining (unknown) variables, the task is resource-intensive and involves ad hoc adjustments. We propose a novel two-step method to forecast unknown variables conditional on known variables, which reflects historical correlations and satisfies accounting identities. The method offers (1) the flexibility to incorporate available information in known variables and (2) the convenience of automating the forecasting of unknown variables. Applying our method to forecast GDP subcomponents in an advanced economy and an emerging market economy, we show that it improves upon alternative forecasting techniques.

## Notes

Spiliotis et al. (2021) also use machine learning techniques, although their method requires specifying bottom-level variables and predicting only those variables before aggregating them to construct the remaining higher-level variables.

The formulation assumes that the first step forecast is unbiased. Taieb and Koo (2019) propose a method without the assumption, although it is more computationally complex in a high-dimensional environment.

Panagiotelis et al. (2021) discuss the interpretation of reconciliation methodologies as projections.

The formulation also assumes that there are no inequality constraints. Wickramasuriya et al. (2020) propose an optimal non-negative forecast reconciliation, although it may incur some bias.

Panagiotelis et al. (2023) extend reconciliation from point forecasting to probabilistic forecasting, which can be applied to the probabilistic forecasting of the unknown variables, although this paper abstracts from it.

## References

Ando, Sakai, and Futoshi Narita. 2022. An Alternative Proof of Minimum Trace Reconciliation. *IMF Working Paper 2022/136*.

Athanasopoulos, George, Puwasala Gamakumara, Anastasios Panagiotelis, Rob Hyndman, and Mohamed Affan. 2020. Hierarchical Forecasting. *Macroeconomic Forecasting in the Era of Big Data*.

Banbura, Marta, Domenico Giannone and Michele Lenza. 2015. Conditional Forecasts and Scenario Analysis with Vector Autoregressions for Large Cross-Sections. *International Journal of Forecasting* 31: 739.

Brayton, Flint, Thomas Laubach, and David Reifschneider. 2014. The FRB/US Model: A Tool for Macroeconomic Policy Analysis. *FEDS Notes. Board of Governors of the Federal Reserve System*.

Bikker, Reinier, Jacco Daalmans, and Nino Mushkudiani. 2013. Benchmarking Large Accounting Frameworks: A Generalized Multivariate Model. *Economic Systems Research* 25: 390.

Byron, Raymond. 1978. The Estimation of Large Social Account Matrices. *Journal of the Royal Statistical Society Series A: General* 141: 359.

Capistran, Carlos, Christian Constandse, and Manuel Ramos-Francia. 2010. Multi-Horizon Inflation Forecasts Using Disaggregated Data. *Economic Modelling* 27: 666.

Chen, Yilun, Ami Wiesel, Yonina Eldar, and Alfred Hero. 2010. Shrinkage Algorithms for MMSE Covariance Estimation. *IEEE Transactions on Signal Processing* 58: 5016.

Chen, Baoline, Tommaso Di Fonzo, Thomas Howells, and Marco Marini. 2018. The Statistical Reconciliation of Time Series of Accounts Between Two Benchmark Revisions. *Statistica Neerlandica* 72: 533.

Chen, Tianqi, and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. In *Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining*.

Coulombe, Philippe Goulet, Maxime Leroux, Dalibor Stevanovic and Stephane Surprenant. 2022. How is Machine Learning Useful for Macroeconomic Forecasting? *Journal of Applied Econometrics* 37: 920.

Di Fonzo, Tommaso and Marco Marini. 2011. Simultaneous and Two-Step Reconciliation of Systems of Time Series: Methodological and Practical Issues. *Journal of the Royal Statistical Society Series C: Applied Statistics* 60: 143.

ECB. 2016. A Guide to the Eurosystem/ECB Staff Macroeconomic Projection Exercises. *European Central Bank*.

Hyndman, Rob and George Athanasopoulos. 2022. *Forecasting: Principles and Practice*, 3rd edn. OTexts.

International Monetary Fund. 1996. Financial Programming and Policy: The Case of Sri Lanka. IMF Institute.

International Monetary Fund. 2000. Financial Programming and Policy: The Case of Turkey. International Monetary Fund.

International Monetary Fund. 2021. France: 2021 Article IV Consultation-Press Release; Staff Report and Statement by the Executive Director for France. *IMF Country Report*.

Lucas, Robert. 1976. Econometric Policy Evaluation: A Critique. *Carnegie-Rochester Conference Series on Public Policy*.

OECD. 2022. OECD Economic Outlook Database Inventory 112, Volume 2022/2.

Panagiotelis, Anastasios, George Athanasopoulos, Puwasala Gamakumara and Rob Hyndman. 2021. Forecast Reconciliation: A Geometric View with New Insights on Bias Correction. *International Journal of Forecasting* 37: 343.

Panagiotelis, Anastasios, Puwasala Gamakumara, George Athanasopoulos and Rob Hyndman. 2023. Probabilistic Forecast Reconciliation: Properties, Evaluation and Score Optimization. *European Journal of Operational Research* 306: 693.

Schafer, Juliane and Korbinian Strimmer. 2005. A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics. *Statistical Applications in Genetics and Molecular Biology*. https://doi.org/10.2202/1544-6115.1175.

Spiliotis, Evangelos, Mahdi Abolghasemi, Rob Hyndman, Fotios Petropoulos and Vassilios Assimakopoulos. 2021. Hierarchical Forecast Reconciliation with Machine Learning. *Applied Soft Computing* 112: 107756.

Stone, Richard. 1976. The Development of Economic Data Systems. In *Social Accounting for Development Planning with Special Reference to Sri Lanka*. Cambridge University Press.

Taieb, Souhaib Ben, and Bonsoo Koo. 2019. Regularized Regression for Hierarchical Forecasting Without Unbiasedness Conditions. In *25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining*.

Uematsu, Yoshimasa, and Shinya Tanaka. 2018. High-Dimensional Macroeconomic Forecasting and Variable Selection Via Penalized Regression. *The Econometrics Journal*. https://onlinelibrary.wiley.com/doi/10.1111/ectj.12117

Wickramasuriya, Shanika, George Athanasopoulos and Rob Hyndman. 2019. Optimal Forecast Reconciliation for Hierarchical and Grouped Time Series Through Trace Minimization. *Journal of the American Statistical Association* 114: 804.

Wickramasuriya, Shanika, Berwin Turlach and Rob Hyndman. 2020. Optimal Non-negative Forecast Reconciliation. *Statistics and Computing* 30: 1167.

Zhang, Bohan, Yanfei Kang, Anastasios Panagiotelis and Feng Li. 2023. Optimal Reconciliation with Immutable Forecasts. *European Journal of Operational Research* 308: 650.

## Additional information

We thank the editor Andrei Levchenko, two anonymous referees, Christopher Sims, Jean-Jacques Forneron, Pau Rabanal, Petia Topalova, Prachi Mishra, and participants of the seminars at the 24th Federal Forecasters Conference, the 43rd International Symposium on Forecasting, IMF, Osaka Metropolitan University, and Waseda University. On behalf of all authors, the corresponding author states that there is no conflict of interest. The views expressed are those of the authors and do not necessarily represent the views of the IMF, its Executive Board, or IMF management.

## Appendices

### Appendix 1. Elastic Net Cross-Validation

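This appendix describes the algorithm of the elastic net cross-validation in four steps. First, since the elastic net is not scale-invariant, the data are standardized using the historical mean \({\mu }_{i}\) and standard deviation \({\sigma }_{i}\) of each variable:

\[{\overline{r} }_{it}=\frac{{r}_{it}-{\mu }_{i}}{{\sigma }_{i}}.\]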

This step applies to all the variables that are not constant over time. For constant variables, the constant values themselves become the forecast.

Second, for each unknown variable \(i\in u\), the parameters minimize the squared error subject to \({L}_{1}\) and \({L}_{2}\) penalties

where \(\left(i,T\right)\) indicates that the parameter differs for each unknown variable \(i\in u\) and is estimated using data up to \(T\), and \(\mathrm{TSCV}\) means that the penalty parameters \(\left({\lambda }_{1},{\lambda }_{2}\right)\) are chosen by time-series cross-validation (TSCV).
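As a rough illustration of this step, the sketch below fits an elastic net by coordinate descent and picks \(\left({\lambda }_{1},{\lambda }_{2}\right)\) on a grid by expanding-window, one-step-ahead cross-validation. The scaling of the objective, the grid, the fold design, and all function names are assumptions made for the example; they are not taken from the paper.

```python
from itertools import product

def soft_threshold(rho, lam):
    """Soft-thresholding operator; it produces exact zeros under the L1 penalty."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def fit_elastic_net(X, y, lam1, lam2, n_iter=200):
    """Coordinate descent for (1/2n)*SSE + lam1*|b|_1 + (lam2/2)*|b|_2^2 (assumed scaling)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = z = 0.0
            for i in range(n):
                # Residual that leaves out regressor j's own contribution.
                partial = y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            denom = z / n + lam2
            beta[j] = soft_threshold(rho / n, lam1) / denom if denom > 0 else 0.0
    return beta

def tscv_error(X, y, lam1, lam2, min_train):
    """Mean squared one-step-ahead error over expanding training windows."""
    errors = []
    for t in range(min_train, len(y)):
        beta = fit_elastic_net(X[:t], y[:t], lam1, lam2)
        pred = sum(b * x for b, x in zip(beta, X[t]))
        errors.append((y[t] - pred) ** 2)
    return sum(errors) / len(errors)

def select_penalties(X, y, grid1, grid2, min_train=4):
    """Pick the (lam1, lam2) pair on the grid with the smallest TSCV error."""
    return min(product(grid1, grid2),
               key=lambda lam: tscv_error(X, y, lam[0], lam[1], min_train))

# Toy data (hypothetical): the unknown variable depends only on the first regressor.
X = [[1, 1], [-1, 1], [2, -1], [-2, -1], [1, 2],
     [-1, -2], [0, 1], [3, 0], [-3, 1], [1, -2]]
y = [2.0 * row[0] for row in X]
lam1, lam2 = select_penalties(X, y, grid1=[0.0, 0.5, 100.0], grid2=[0.0, 0.5])
beta = fit_elastic_net(X, y, lam1, lam2)
```

Each validation fold uses only observations that precede the one being predicted, so the time ordering of the data is respected. A sufficiently large \({\lambda }_{1}\) sets every coefficient to zero.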

Third, the estimated coefficients are used to construct the first step forecast for the standardized variables

where the forecasts of the known variables \({\left\{{\overline{r} }_{1t}^{k},\dots ,{\overline{r} }_{kt}^{k}\right\}}_{t=T+1}^{T+h}\) are standardized by the mean \({\mu }_{i}\) and standard deviation \({\sigma }_{i}\) derived from the historical data. Note that the coefficients are estimated using historical data up to \(T\) and are not updated with forecasted data. This avoids instability in the selected parameters and ensures that the model is smooth in the known variables. When the known variables include lags of the unknown variables, the forecast is generated recursively from \(t=T+1\) to \(t=T+h\). The implicit assumption behind this approach is that the correlation between \({r}_{t}^{u}\) and \({r}_{t}^{k}\) is stable over time. An alternative is to apply local projection, but we prefer our method because local projection (1) could suffer from a small sample when the time dimension \(T\) is small and the forecast horizon \(h\) is large, and (2) when combined with the elastic net, may choose a different set of variables for each \(t=T+1,\dots ,T+h\) and result in volatile forecast paths.
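The recursion for lagged regressors can be sketched as follows, assuming for illustration that the regressors are the contemporaneous known variables plus one own lag of the unknown variable, all in standardized units. The function and argument names are hypothetical.

```python
def recursive_forecast(beta_known, beta_lag, known_path, last_value):
    """Forecast one unknown variable from t = T+1 to T+h with fixed coefficients.

    beta_known : coefficients on the contemporaneous known variables
    beta_lag   : coefficient on the unknown variable's own first lag
    known_path : list of known-variable vectors, one per forecast period
    last_value : last observed value of the unknown variable (period T)
    """
    forecasts = []
    previous = last_value
    for x_t in known_path:
        y_t = sum(b * x for b, x in zip(beta_known, x_t)) + beta_lag * previous
        forecasts.append(y_t)
        previous = y_t  # the forecast becomes next period's lag
    return forecasts

# Coefficients stay fixed at their period-T estimates over the whole horizon.
path = recursive_forecast(beta_known=[0.5], beta_lag=0.5,
                          known_path=[[1.0], [1.0], [1.0]], last_value=0.0)
```

Because the same coefficients are used at every horizon, the set of selected regressors cannot change from one forecast period to the next.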

Finally, the forecast \({\widehat{r}}_{it}^{u}\) is obtained by transforming the standardized forecast \({\overline{r} }_{it}^{u}\) back to the original scale:

\[{\widehat{r}}_{it}^{u}={\mu }_{i}+{\sigma }_{i}{\overline{r} }_{it}^{u}.\]

When the elastic net chooses \({\widehat{\beta }}_{k}^{\mathrm{EN}}\left(i,T\right)=0\) for all \(k\), the forecast is the historical mean \({\widehat{r}}_{it}^{u}={\mu }_{i}\). The forecast of the known variables remains the same as the exogenously given values \({\widehat{r}}_{t}^{k}={r}_{t}^{k}\).
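The standardization in the first step and the back-transformation in the last step can be sketched together as below. The use of the population standard deviation and all function names are assumptions for the example.

```python
from statistics import mean, pstdev

def standardize(history):
    """Return (standardized series, mu, sigma); a constant series is left unscaled."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return None, mu, sigma  # constant variable: its value is the forecast
    return [(x - mu) / sigma for x in history], mu, sigma

def to_original_scale(r_bar, mu, sigma):
    """Undo the standardization for a single forecast value."""
    return mu + sigma * r_bar

def first_step_forecast(beta, known_std, mu, sigma):
    """Linear forecast in standardized units, mapped back to the original scale."""
    r_bar = sum(b * x for b, x in zip(beta, known_std))
    return to_original_scale(r_bar, mu, sigma)

history = [1.0, 2.0, 3.0, 4.0]
z, mu, sigma = standardize(history)

# With all coefficients at zero, the forecast collapses to the historical mean.
mean_forecast = first_step_forecast([0.0, 0.0], [1.3, -0.7], mu, sigma)
```

The last line reproduces the special case described above: when every coefficient is zero, the standardized forecast is zero and the forecast on the original scale equals \({\mu }_{i}\).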

### Appendix 2. Proof of Theorem 1

Set the Lagrangian

Taking the derivative with respect to \({\widetilde{r}}^{u}\) leads to

Multiplying both sides by \({U}_{t}{\prime}\) and using the constraint give

Since \({U}_{t}\) and \(\widehat{W}\) are full rank, \({U}_{t}{\prime}\widehat{W}{U}_{t}\) is invertible. Thus, the Lagrange multiplier is

Substituting the Lagrange multiplier back to the first-order condition gives
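The steps of the proof can be checked numerically. The sketch below assumes that the problem minimizes \({\left({\widetilde{r}}^{u}-{\widehat{r}}^{u}\right)}{\prime}{\widehat{W}}^{-1}\left({\widetilde{r}}^{u}-{\widehat{r}}^{u}\right)\) subject to a linear constraint written as \({U}_{t}{\prime}{\widetilde{r}}^{u}=c\), where `c` collects the terms involving the known variables. The symbol `c`, the constraint layout, and all function names are assumptions, since the statement of Theorem 1 is not reproduced in this appendix.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("U' W U is singular; U and W must be full rank")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def reconcile(r_hat, W, U, c):
    """Return r_hat - W U (U' W U)^{-1} (U' r_hat - c).

    r_hat: first step forecasts (length n); W: n x n weight matrix;
    U: n x m constraint matrix; c: right-hand side (length m).
    """
    Ut = transpose(U)
    WU = matmul(W, U)
    gap = [[sum(u * r for u, r in zip(row, r_hat)) - ci] for row, ci in zip(Ut, c)]
    lam = solve(matmul(Ut, WU), gap)   # Lagrange multiplier
    adjustment = matmul(WU, lam)       # W U lambda
    return [r - a[0] for r, a in zip(r_hat, adjustment)]

# Hypothetical example: three contributions that must add up to 2.0.
r_hat = [1.0, 0.5, 0.2]
W = [[1.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 1.0]]
U = [[1.0], [1.0], [1.0]]
r_tilde = reconcile(r_hat, W, U, c=[2.0])
```

In this example the discrepancy of 0.3 is spread in proportion to the diagonal of `W`, so the variable with the largest weight absorbs most of the adjustment.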

### Appendix 3. Nonlinear Constraints and Multiple Solutions

This appendix shows that, when a variable satisfies both linear and log-linear constraints, there can be multiple solutions. Consider the following problem.

The first constraint implies that \(x\ne 0\) and \(y\ne 0\). By substituting out \(y\) and \(z\), the problem can be reduced to

The first-order condition is sufficient since the second-order derivative is always positive.

### Appendix 4. Country Example: France with Nonlinear Constraints

This appendix features nonlinear constraints using data for France. The vintages are the same as in Sect. 3, but the set of variables is chosen to highlight the nonlinearity associated with nominal GDP. Table 6 provides a summary of the variables.

The accounting identities are

Thus, the nominal GDP satisfies both linear and log-linear constraints. Let the transformation be

The unknown and known variables are \({r}^{u}=\left[{\gamma }^{Y},{\gamma }^{D},{r}^{Y},{r}^{D},{r}^{F}\right]\) and \({r}^{k}=\left[{\gamma }^{R},\mathrm{inflation},bca\_con\right]\) where \(\mathrm{inflation}\) is the growth rate of consumer prices PCPI and \(bca\_con\) is the contribution of the current account to nominal GDP growth. Since the GDP deflator and foreign balance are close to the inflation based on consumer prices and the current account, we do not include lags in the first step forecast. The second step nonlinear reconciliation problem is

The nonlinear optimization is solved with the trust-region algorithm available in the SciPy package.

Figure 2 shows that the second step forecast improves the WEO forecast by around 20 percent. The mean RMSE for the WEO forecast \({r}^{WEO}\) is 1.01, the first step forecast \(\widehat{r}\) is 0.96, the second step forecast \(\widetilde{r}\) is 0.81, and the second step conditional on true known variables \({\widetilde{r}}^{*}\) is 0.73.

As in Sect. 3, Tables 7 and 8 suggest that the second step forecast \(\widetilde{r}\) can improve on the WEO forecast on average, but the improvement can be heterogeneous across variables and time. In Table 7, the Diebold-Mariano tests are insignificant for each individual unknown variable but significant at the 10 percent level when all unknown variables are concatenated into a single vector. Table 8 shows that the second step forecast error in absolute value can be larger than the WEO forecast error in some years, although the former performs better on average across time.
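For readers unfamiliar with the test, the following is a minimal sketch of a Diebold-Mariano comparison, applied per variable and to the concatenated errors. It assumes squared-error loss, no autocorrelation correction, and a normal approximation for the p-value; this appendix does not state which variant underlies Table 7, and the error values below are made up for illustration.

```python
import math

def diebold_mariano(errors_a, errors_b):
    """Return (statistic, two-sided p-value) for equal squared-error accuracy."""
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / (n - 1)
    if var_d == 0:
        raise ValueError("loss differentials have zero variance")
    stat = d_bar / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2))  # two-sided, standard normal
    return stat, p_value

# Hypothetical forecast errors for two variables (benchmark vs. candidate).
bench = {"var1": [1.0, -1.2, 0.9, -1.1], "var2": [0.8, -0.3, 1.0, -0.9]}
cand = {"var1": [0.5, -0.4, 0.6, -0.5], "var2": [0.6, -0.5, 0.4, -0.7]}

per_variable = {k: diebold_mariano(bench[k], cand[k]) for k in bench}

# Concatenating the errors of all variables into one vector enlarges the sample.
pooled = diebold_mariano(bench["var1"] + bench["var2"],
                         cand["var1"] + cand["var2"])
```

A positive statistic indicates that the benchmark has the larger average squared error. Pooling treats the stacked loss differentials as one sample, which is one reason a pooled test can reject when the per-variable tests do not.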

### Appendix 5. Country Example: Seychelles

This appendix shows another country example using Seychelles, which is a tourism-dependent small open economy. The Seychelles example uses the same WEO vintages as France but focuses more on external variables. Table 9 provides a summary of the variables. Importantly, we include exports of services since inbound tourism is expected to be informative about the economy.

As in (11), we transform all variables into contributions to GDP growth. The accounting identities are the following four equations, and the transformation preserves the linearity as in (15).

Figure 3 shows that the second step forecast improves the WEO forecast by around one third on average. The mean RMSE for the WEO forecast \({r}^{WEO}\) is 7.24, the first step forecast \(\widehat{r}\) is 5.06, the second step forecast \(\widetilde{r}\) is 4.66, and the second step forecast conditional on true known variables \({\widetilde{r}}^{*}\) is 3.50.
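The size of the improvement follows directly from the reported mean RMSEs. The short check below recomputes it for Seychelles and, for comparison, for France using the values reported in Appendix 4; the helper name is hypothetical.

```python
def rmse_improvement(rmse_benchmark, rmse_model):
    """Relative reduction in RMSE of the model against the benchmark."""
    return 1.0 - rmse_model / rmse_benchmark

# Mean RMSEs of the WEO forecast and the second step forecast, as reported.
seychelles = rmse_improvement(7.24, 4.66)
france = rmse_improvement(1.01, 0.81)
```

The ratios are close to 0.36 for Seychelles and 0.20 for France, in line with "around one third" and "around 20 percent".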

As in Sect. 3, Tables 10 and 11 suggest that the second step forecast \(\widetilde{r}\) can improve on the WEO forecast on average, but the improvement can be heterogeneous across variables and time. For example, the second step forecast of nominal GDP is slightly worse than the WEO forecast. In Table 10, the Diebold-Mariano tests are insignificant for each individual unknown variable but significant when all unknown variables are concatenated into a single vector. Table 11 shows that the second step forecast error in absolute value can be larger than the WEO forecast error in some years, although the former performs better on average across time.
