A Theory-Based Lasso for Time-Series Data

Part of the Studies in Computational Intelligence book series (SCI, volume 898)


We present two new lasso estimators, the HAC-lasso and AC-lasso, that are suitable for time-series applications. Both are variants of the theory-based or ‘rigorous’ lasso of Bickel et al. (2009), Belloni et al. (2011), Belloni and Chernozhukov (2013) and Belloni et al. (2016), recently extended to the case of dependent data by Chernozhukov et al. (2019), in which the lasso penalty level is derived on theoretical grounds rather than tuned empirically. The rigorous lasso has appealing theoretical properties and is computationally very attractive compared with conventional cross-validation. The AC-lasso accommodates dependence in the disturbance term of arbitrary form, so long as the dependence is known to die out after q periods; the HAC-lasso additionally allows for heteroskedasticity of arbitrary form. The HAC- and AC-lasso are particularly well suited to applications such as nowcasting, where the time series may be short and the dimensionality of the predictors is high. We present Monte Carlo comparisons of the performance of the HAC-lasso against penalty selection by cross-validation. Finally, we use the HAC-lasso to estimate a nowcasting model of US GDP growth based on Google Trends data and compare its performance to the Bayesian methods employed by Kohns and Bhattacharjee (2019).
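To fix ideas, the theory-based penalty described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it computes the rigorous-lasso penalty level λ = 2c√n Φ⁻¹(1 − γ/(2p)) and HAC-type penalty loadings built from a truncated-kernel variance estimate of the scores, assuming (as in the abstract) that dependence dies out after q periods. The function name, the default constants c = 1.1 and γ = 0.1, and the non-negativity guard are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm


def hac_lasso_penalty(X, resid, q=2, c=1.1, gamma=0.1):
    """Sketch of the rigorous-lasso penalty level and HAC loadings.

    Penalty level: lambda = 2 * c * sqrt(n) * Phi^{-1}(1 - gamma / (2p)).
    Loadings: psi_j^2 is a truncated-kernel (lag-q) long-run variance
    estimate of the scores x_{tj} * e_t, so both autocorrelation up to
    lag q and heteroskedasticity of arbitrary form are accommodated.
    """
    n, p = X.shape
    lam = 2.0 * c * np.sqrt(n) * norm.ppf(1.0 - gamma / (2.0 * p))

    s = X * resid[:, None]                 # scores x_{tj} * e_t, shape (n, p)
    psi2 = np.mean(s ** 2, axis=0)         # contemporaneous variance term
    for h in range(1, q + 1):              # autocovariance terms up to lag q
        psi2 += 2.0 * np.mean(s[h:] * s[:-h], axis=0)

    # The truncated kernel does not guarantee positivity; guard against
    # small negative estimates in this illustrative version.
    psi2 = np.maximum(psi2, 1e-12)
    return lam, np.sqrt(psi2)
```

In practice the loadings would be iterated from an initial residual estimate, and the penalized problem solved with the per-regressor weights λ·ψ_j; setting q = 0 recovers heteroskedasticity-robust loadings in the spirit of the AC/HAC distinction drawn in the abstract.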


Keywords: Lasso, Machine learning, Time-series, Dependence


  1. Ahrens, A., Hansen, C. B., & Schaffer, M. E. (2020). lassopack: Model selection and prediction with regularized regression in Stata. The Stata Journal, 20, 176–235.
  2. Akaike, H. (1969). Fitting autoregressive models for prediction. Annals of the Institute of Statistical Mathematics, 21, 243–247.
  3. Akaike, H. (1971). Autoregressive model fitting for control.
  4. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
  5. Arlot, S., & Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics Surveys, 4, 40–79.
  6. Askitas, N., & Zimmermann, K. F. (2009). Google econometrics and unemployment forecasting. Applied Economics Quarterly.
  7. Athey, S. (2017). The impact of machine learning on economics.
  8. Banbura, M., Giannone, D., & Reichlin, L. (2008). Large Bayesian VARs. ECB Working Paper Series, 966.
  9. Belloni, A., Chen, D., Chernozhukov, V., & Hansen, C. (2012). Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica, 80, 2369–2429.
  10. Belloni, A., & Chernozhukov, V. (2011). High dimensional sparse econometric models: An introduction. In P. Alquier, E. Gautier, & G. Stoltz (Eds.), Inverse problems and high-dimensional estimation SE-3 (pp. 121–156). Lecture Notes in Statistics. Berlin, Heidelberg: Springer.
  11. Belloni, A., & Chernozhukov, V. (2013). Least squares after model selection in high-dimensional sparse models. Bernoulli, 19, 521–547.
  12. Belloni, A., Chernozhukov, V., & Hansen, C. (2011). Inference for high-dimensional sparse econometric models.
  13. Belloni, A., Chernozhukov, V., & Hansen, C. (2014). Inference on treatment effects after selection among high-dimensional controls. Review of Economic Studies, 81, 608–650.
  14. Belloni, A., Chernozhukov, V., Hansen, C., & Kozbur, D. (2016). Inference in high dimensional panel models with an application to gun control. Journal of Business & Economic Statistics, 34, 590–605.
  15. Bergmeir, C., Hyndman, R. J., & Koo, B. (2018). A note on the validity of cross-validation for evaluating autoregressive time series prediction. Computational Statistics & Data Analysis, 120, 70–83.
  16. Bickel, P. J., Ritov, Y., & Tsybakov, A. B. (2009). Simultaneous analysis of Lasso and Dantzig selector. The Annals of Statistics, 37, 1705–1732.
  17. Bock, J. (2018). Quantifying macroeconomic expectations in stock markets using Google trends. SSRN Electronic Journal.
  18. Buhlmann, P. (2013). Statistical significance in high-dimensional linear models. Bernoulli, 19, 1212–1242.
  19. Buono, D., Kapetanios, G., Marcellino, M., Mazzi, G., & Papailias, F. (2018). Big data econometrics: Now casting and early estimates. Bocconi Working Paper Series.
  20. Burman, P., Chow, E., & Nolan, D. (1994). A cross-validatory method for dependent data. Biometrika, 81, 351–358.
  21. Carriero, A., Clark, T. E., & Marcellino, M. (2015). Realtime nowcasting with a Bayesian mixed frequency model with stochastic volatility. Journal of the Royal Statistical Society. Series A: Statistics in Society.
  22. Chernozhukov, V., Hansen, C., & Spindler, M. (2015). Post-selection and post-regularization inference in linear models with many controls and instruments. American Economic Review, 105, 486–490.
  23. Chernozhukov, V., Härdle, W., Huang, C., & Wang, W. (2019). LASSO-driven inference in time and space. arXiv:1806.05081v3.
  24. Chetverikov, D., Liao, Z., & Chernozhukov, V. (forthcoming). On cross-validated lasso in high dimensions. Annals of Statistics.
  25. Choi, H., & Varian, H. (2012). Predicting the present with Google Trends. Economic Record, 88, 2–9.
  26. Croushore, D. (2006). Forecasting with real-time macroeconomic data. In Handbook of economic forecasting. Elsevier.
  27. Doan, T., Litterman, R. B., & Sims, C. A. (1983). Forecasting and conditional projection using realistic prior distributions. NBER Working Paper Series.
  28. Ettredge, M., Gerdes, J., & Karuga, G. (2005). Using web-based search data to predict macroeconomic statistics.
  29. Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348–1360.
  30. Ferrara, L., & Simoni, A. (2019). When are Google data useful to nowcast GDP? An approach via pre-selection and shrinkage. Banque de France Working Paper.
  31. Foroni, C., & Marcellino, M. (2015). A comparison of mixed frequency approaches for nowcasting Euro area macroeconomic aggregates. International Journal of Forecasting, 30, 554–568.
  32. Frank, I. E., & Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics, 35, 109–135.
  33. Ghysels, E., Santa-Clara, P., & Valkanov, R. (2004). The MIDAS touch: Mixed data sampling regression models. Discussion Paper, University of California and University of North Carolina.
  34. Giannone, D., Lenza, M., & Primiceri, G. E. (2018). Economic predictions with big data: The illusion of sparsity. SSRN Electronic Journal.
  35. Giannone, D., Monti, F., & Reichlin, L. (2016). Exploiting monthly data flow in structural forecasting. Journal of Monetary Economics, 88, 201–216.
  36. Hannan, E. J., & Quinn, B. G. (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society: Series B (Methodological), 41, 190–195.
  37. Hastie, T., Tibshirani, R., & Wainwright, M. J. (2015). Statistical learning with sparsity: The Lasso and generalizations, monographs on statistics and applied probability. Boca Raton: CRC Press, Taylor & Francis.
  38. Hayashi, F. (2000). Econometrics. Princeton: Princeton University Press.
  39. Hsu, N. J., Hung, H. L., & Chang, Y. M. (2008). Subset selection for vector autoregressive processes using Lasso. Computational Statistics and Data Analysis, 52, 3645–3657.
  40. Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and practice, 2nd ed.
  41. Jing, B.-Y., Shao, Q.-M., & Wang, Q. (2003). Self-normalized Cramér-type large deviations for independent random variables. The Annals of Probability, 31, 2167–2215.
  42. Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S. (2018). Human decisions and machine predictions. The Quarterly Journal of Economics, 133, 237–293.
  43. Kohns, D., & Bhattacharjee, A. (2019). Interpreting big data in the macro economy: A Bayesian mixed frequency estimator. In CEERP Working Paper Series. Heriot-Watt University.
  44. Koop, G., & Onorante, L. (2016). Macroeconomic nowcasting using Google probabilities.
  45. Li, X. (2016). Nowcasting with big data: Is Google useful in the presence of other information?
  46. Litterman, R. B. (1986). Forecasting with Bayesian vector autoregressions—five years of experience. Journal of Business & Economic Statistics, 4, 25–38.
  47. Lockhart, R., Taylor, J., Tibshirani, R. J., & Tibshirani, R. (2014). A significance test for the Lasso. Annals of Statistics, 42, 413–468.
  48. Lütkepohl, H. (2005). New introduction to multiple time series analysis.
  49. Medeiros, M. C., & Mendes, E. F. (2015). L1-regularization of high-dimensional time-series models with non-Gaussian and heteroskedastic errors. Journal of Econometrics, 191, 255–271.
  50. Meinshausen, N., Meier, L., & Bühlmann, P. (2009). p-values for high-dimensional regression. Journal of the American Statistical Association, 104, 1671–1681.
  51. De Mol, C., De Vito, E., & Rosasco, L. (2009). Elastic-net regularization in learning theory. Journal of Complexity, 25, 201–230.
  52. Mullainathan, S., & Spiess, J. (2017). Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31, 87–106.
  53. Nardi, Y., & Rinaldo, A. (2011). Autoregressive process modeling via the Lasso procedure. Journal of Multivariate Analysis, 102, 528–549.
  54. Romer, C. D., & Romer, D. H. (2000). Federal reserve information and the behavior of interest rates. American Economic Review, 90, 429–457.
  55. Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.
  56. Scott, S., & Varian, H. (2014). Predicting the present with Bayesian structural time series. International Journal of Mathematical Modelling and Numerical Optimisation, 5.
  57. Sims, C. A. (2002). The role of models and probabilities in the monetary policy process. In Brookings Papers on Economic Activity.
  58. Smith, P. (2016). Google’s MIDAS touch: Predicting UK unemployment with internet search data. Journal of Forecasting, 35, 263–284.
  59. Stock, J. H., Wright, J. H., & Yogo, M. (2002). A survey of weak instruments and weak identification in generalized method of moments.
  60. Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58, 267–288.
  61. Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 67, 91–108.
  62. Varian, H. R. (2014). Big data: New tricks for econometrics. The Journal of Economic Perspectives, 28, 3–27.
  63. Wasserman, L., & Roeder, K. (2009). High-dimensional variable selection. Annals of Statistics, 37, 2178–2201.
  64. Weilenmann, B., Seidl, I., & Schulz, T. (2017). The socio-economic determinants of urban sprawl between 1980 and 2010 in Switzerland. Landscape and Urban Planning, 157, 468–482.
  65. Yang, Y. (2005). Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation. Biometrika, 92, 937–950.
  66. Yuan, M., & Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society. Series B (Methodological), 68, 49–67.
  67. Zhang, Y., Li, R., & Tsai, C.-L. (2010). Regularization parameter selections via generalized information criterion. Journal of the American Statistical Association, 105, 312–323.
  68. Zou, H. (2006). The adaptive Lasso and its oracle properties. Journal of the American Statistical Association, 101, 1418–1429.

Copyright information

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021

Authors and Affiliations

  1. ETH Zürich, Zürich, Switzerland
  2. Heriot-Watt University, Edinburgh, UK
