A Theory-Based Lasso for Time-Series Data
Abstract
We present two new lasso estimators, the HAC-lasso and AC-lasso, that are suitable for time-series applications. The estimators are variations of the theory-based or ‘rigorous’ lasso of Bickel et al. (2009), Belloni et al. (2011), Belloni and Chernozhukov (2013) and Belloni et al. (2016), recently extended to the case of dependent data by Chernozhukov et al. (2019), in which the lasso penalty level is derived on theoretical grounds. The rigorous lasso has appealing theoretical properties and is computationally attractive compared with conventional cross-validation. The AC-lasso version of the rigorous lasso accommodates dependence of arbitrary form in the disturbance term, so long as the dependence is known to die out after q periods; the HAC-lasso also allows for heteroskedasticity of arbitrary form. The HAC- and AC-lasso are particularly well suited to applications such as nowcasting, where the time series may be short and the dimensionality of the predictors is high. We present Monte Carlo comparisons of the performance of the HAC-lasso against penalty selection by cross-validation. Finally, we use the HAC-lasso to estimate a nowcasting model of US GDP growth based on Google Trends data and compare its performance to the Bayesian methods employed by Kohns and Bhattacharjee (2019).
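As a rough illustration of what a theoretically derived penalty can look like, the sketch below computes a plug-in penalty level of the form 2c·sqrt(n)·Φ⁻¹(1 − γ/(2p)), as used in the rigorous-lasso literature, together with a per-predictor penalty loading built from autocovariances of the scores x_t·e_t up to lag q. The abstract gives no formulas, so everything here is an assumption for illustration: the function names, the defaults for c and γ, the optional Bartlett weighting, and the flooring at zero. It is not the authors' implementation.

```python
import math
from statistics import NormalDist


def rigorous_penalty(n, p, c=1.1, gamma=None):
    """Plug-in penalty level 2c * sqrt(n) * Phi^{-1}(1 - gamma / (2p)).

    n is the sample size and p the number of predictors. The defaults for
    c and gamma are assumptions, not values taken from the abstract.
    """
    if gamma is None:
        gamma = 0.1 / math.log(n)
    return 2.0 * c * math.sqrt(n) * NormalDist().inv_cdf(1.0 - gamma / (2.0 * p))


def hac_loading(x, resid, q, bartlett=False):
    """Hypothetical penalty loading for one predictor.

    Uses the scores x_t * e_t and adds their autocovariances up to lag q.
    With q = 0 this reduces to the heteroskedasticity-robust loading
    sqrt(mean(x_t^2 * e_t^2)).
    """
    n = len(x)
    scores = [xt * et for xt, et in zip(x, resid)]
    variance = sum(s * s for s in scores) / n
    for lag in range(1, q + 1):
        weight = 1.0 - lag / (q + 1.0) if bartlett else 1.0
        autocov = sum(scores[t] * scores[t - lag] for t in range(lag, n)) / n
        variance += 2.0 * weight * autocov
    # Unweighted (truncated) sums are not guaranteed positive, so floor at zero.
    return math.sqrt(max(variance, 0.0))
```

In a full estimator the residuals would themselves come from an initial fit and the loadings would be iterated; that step is omitted here.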
Keywords
Lasso, Machine learning, Time-series, Dependence
References
- Ahrens, A., Hansen, C. B., & Schaffer, M. E. (2020). lassopack: Model selection and prediction with regularized regression in Stata. The Stata Journal, 20, 176–235.
- Akaike, H. (1969). Fitting autoregressive models for prediction. Annals of the Institute of Statistical Mathematics, 21, 243–247.
- Akaike, H. (1971). Autoregressive model fitting for control.
- Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
- Arlot, S., & Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics Surveys, 4, 40–79.
- Askitas, N., & Zimmermann, K. F. (2009). Google econometrics and unemployment forecasting. Applied Economics Quarterly.
- Athey, S. (2017). The impact of machine learning on economics.
- Banbura, M., Giannone, D., & Reichlin, L. (2008). Large Bayesian VARs. ECB Working Paper Series, 966.
- Belloni, A., Chen, D., Chernozhukov, V., & Hansen, C. (2012). Sparse models and methods for optimal instruments with an application to Eminent domain. Econometrica, 80, 2369–2429.
- Belloni, A., & Chernozhukov, V. (2011). High dimensional sparse econometric models: An introduction. In P. Alquier, E. Gautier, & G. Stoltz (Eds.), Inverse problems and high-dimensional estimation SE-3 (pp. 121–156). Lecture Notes in Statistics. Berlin, Heidelberg: Springer.
- Belloni, A., & Chernozhukov, V. (2013). Least squares after model selection in high-dimensional sparse models. Bernoulli, 19, 521–547.
- Belloni, A., Chernozhukov, V., & Hansen, C. (2011). Inference for high-dimensional sparse econometric models.
- Belloni, A., Chernozhukov, V., & Hansen, C. (2014). Inference on treatment effects after selection among high-dimensional controls. Review of Economic Studies, 81, 608–650.
- Belloni, A., Chernozhukov, V., Hansen, C., & Kozbur, D. (2016). Inference in high dimensional panel models with an application to gun control. Journal of Business & Economic Statistics, 34, 590–605.
- Bergmeir, C., Hyndman, R. J., & Koo, B. (2018). A note on the validity of cross-validation for evaluating autoregressive time series prediction. Computational Statistics & Data Analysis, 120, 70–83.
- Bickel, P. J., Ritov, Y., & Tsybakov, A. B. (2009). Simultaneous analysis of Lasso and Dantzig selector. The Annals of Statistics, 37, 1705–1732.
- Bock, J. (2018). Quantifying macroeconomic expectations in stock markets using Google trends. SSRN Electronic Journal.
- Bühlmann, P. (2013). Statistical significance in high-dimensional linear models. Bernoulli, 19, 1212–1242.
- Buono, D., Kapetanios, G., Marcellino, M., Mazzi, G., & Papailias, F. (2018). Big data econometrics: Now casting and early estimates. Bocconi Working Paper Series.
- Burman, P., Chow, E., & Nolan, D. (1994). A cross-validatory method for dependent data. Biometrika, 81, 351–358.
- Carriero, A., Clark, T. E., & Marcellino, M. (2015). Realtime nowcasting with a Bayesian mixed frequency model with stochastic volatility. Journal of the Royal Statistical Society. Series A: Statistics in Society.
- Chernozhukov, V., Hansen, C., & Spindler, M. (2015). Post-selection and post-regularization inference in linear models with many controls and instruments. American Economic Review, 105, 486–490.
- Chernozhukov, V., Härdle, W., Huang, C., & Wang, W. (2019). LASSO-driven inference in time and space. arXiv:1806.05081v3.
- Chetverikov, D., Liao, Z., & Chernozhukov, V. (forthcoming). On cross-validated lasso in high dimensions. Annals of Statistics.
- Choi, H., & Varian, H. (2012). Predicting the present with google trends. Economic Record, 88, 2–9.
- Croushore, D. (2006). Forecasting with real-time macroeconomic data. In Handbook of economic forecasting. Elsevier.
- Doan, T., Litterman, R. B., & Sims, C. A. (1983). Forecasting and conditional projection using realistic prior distributions. NBER Working Paper Series.
- Ettredge, M., Gerdes, J., & Karuga, G. (2005). Using web-based search data to predict macroeconomic statistics.
- Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its Oracle properties. Journal of the American Statistical Association, 96, 1348–1360.
- Ferrara, L., & Simoni, A. (2019). When are Google data useful to nowcast GDP? An approach via pre-selection and shrinkage. Banque de France Working Paper.
- Foroni, C., & Marcellino, M. (2015). A comparison of mixed frequency approaches for nowcasting Euro area macroeconomic aggregates. International Journal of Forecasting, 30, 554–568.
- Frank, I. E., & Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics, 35, 109–135.
- Ghysels, E., Santa-Clara, P., & Valkanov, R. (2004). The MIDAS touch: Mixed data sampling regression models. Discussion Paper, University of California and University of North Carolina.
- Giannone, D., Lenza, M., & Primiceri, G. E. (2018). Economic predictions with big data: The illusion of sparsity. SSRN Electronic Journal.
- Giannone, P., Monti, F., & Reichlin, L. (2016). Exploiting monthly data flow in structural forecasting. Journal of Monetary Economics, 88, 201–216.
- Hannan, E. J., & Quinn, B. G. (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society: Series B (Methodological), 41, 190–195.
- Hastie, T., Tibshirani, R., & Wainwright, M. J. (2015). Statistical learning with sparsity: The Lasso and generalizations, monographs on statistics and applied probability. Boca Raton: CRC Press, Taylor & Francis.
- Hayashi, F. (2000). Econometrics. Princeton: Princeton University Press.
- Hsu, N. J., Hung, H. L., & Chang, Y. M. (2008). Subset selection for vector autoregressive processes using Lasso. Computational Statistics and Data Analysis, 52, 3645–3657.
- Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and practice, 2nd ed.
- Jing, B.-Y., Shao, Q.-M., & Wang, Q. (2003). Self-normalized Cramér-type large deviations for independent random variables. The Annals of Probability, 31, 2167–2215.
- Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S. (2018). Human decisions and machine predictions. The Quarterly Journal of Economics, 133, 237–293.
- Kohns, D., & Bhattacharjee, A. (2019). Interpreting big data in the macro economy: A Bayesian mixed frequency estimator. In CEERP Working Paper Series. Heriot-Watt University.
- Koop, G., & Onorante, L. (2016). Macroeconomic nowcasting using Google probabilities.
- Li, X. (2016). Nowcasting with big data: Is Google useful in the presence of other information?
- Litterman, R. B. (1986). Forecasting with Bayesian vector autoregressions—five years of experience. Journal of Business & Economic Statistics, 4, 25–38.
- Lockhart, R., Taylor, J., Tibshirani, R. J., & Tibshirani, R. (2014). A significance test for the Lasso. Annals of Statistics, 42, 413–468.
- Lütkepohl, H. (2005). New introduction to multiple time series analysis.
- Medeiros, M. C., & Mendes, E. F. (2015). L1-regularization of high-dimensional time-series models with non-Gaussian and heteroskedastic errors. Journal of Econometrics, 191, 255–271.
- Meinshausen, N., Meier, L., & Bühlmann, P. (2009). p-values for high-dimensional regression. Journal of the American Statistical Association, 104, 1671–1681.
- Mol, C. D., Vito, E. D., & Rosasco, L. (2009). Elastic-net regularization in learning theory. Journal of Complexity, 25, 201–230.
- Mullainathan, S., & Spiess, J. (2017). Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31, 87–106.
- Nardi, Y., & Rinaldo, A. (2011). Autoregressive process modeling via the Lasso procedure. Journal of Multivariate Analysis, 102, 528–549.
- Romer, C. D., & Romer, D. H. (2000). Federal reserve information and the behavior of interest rates. American Economic Review, 90, 429–457.
- Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.
- Scott, S., & Varian, H. (2014). Predicting the present with Bayesian structural time series. International Journal of Mathematical Modelling and Numerical Optimisation, 5.
- Sims, C. A. (2002). The role of models and probabilities in the monetary policy process. In Brookings Papers on Economic Activity.
- Smith, P. (2016). Google’s MIDAS touch: Predicting UK unemployment with internet search data. Journal of Forecasting, 35, 263–284.
- Stock, J. H., Wright, J. H., & Yogo, M. (2002). A survey of weak instruments and weak identification in generalized method of moments.
- Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58, 267–288.
- Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 67, 91–108.
- Varian, H. R. (2014). Big data: New tricks for econometrics. The Journal of Economic Perspectives, 28, 3–27.
- Wasserman, L., & Roeder, K. (2009). High-dimensional variable selection. Annals of Statistics, 37, 2178–2201.
- Weilenmann, B., Seidl, I., & Schulz, T. (2017). The socio-economic determinants of urban sprawl between 1980 and 2010 in Switzerland. Landscape and Urban Planning, 157, 468–482.
- Yang, Y. (2005). Can the strengths of AIC and BIC be shared? A conflict between model indentification and regression estimation. Biometrika, 92, 937–950.
- Yuan, M., & Lin, Y. (2006). Model selection and estimation in additive regression models. Journal of the Royal Statistical Society. Series B (Methodological), 68, 49–67.
- Zhang, Y., Li, R., & Tsai, C.-L. (2010). Regularization parameter selections via generalized information criterion. Journal of the American Statistical Association, 105, 312–323.
- Zou, H. (2006). The adaptive Lasso and its Oracle properties. Journal of the American Statistical Association, 101, 1418–1429.