Model Selection in Online Learning for Times Series Forecasting

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 840)


This paper addresses the problem of selecting model parameters in time series forecasting by means of aggregation. It proposes a new algorithm based on the paradigm of prediction with expert advice, in which online and offline autoregressive models are treated as experts. The goal of the aggregation-based algorithm is to perform no worse than the best expert in hindsight. The theoretical analysis shows that the algorithm has a performance guarantee that holds for any data sequence. Moreover, the empirical evaluation shows that the algorithm outperforms popular model selection criteria, such as the Akaike and Bayesian information criteria, on time series with cyclic behaviour.
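The idea of treating autoregressive models as experts and aggregating their forecasts can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: it swaps in a plain exponentially weighted average forecaster for the aggregating algorithm, and the experts are hypothetical online AR(p) models of different orders trained by gradient descent. The names `make_ar_expert` and `aggregate_forecasts`, the learning rate, and the candidate orders are all assumptions made for illustration. The sketch further assumes that the series is scaled to [-1, 1] and that expert forecasts are clipped to the same range; under those assumptions the square loss is exp-concave for eta <= 1/8, so the cumulative loss of the aggregate exceeds that of the best expert by at most ln(K)/eta for K experts.

```python
import math


def make_ar_expert(order, lr=0.05):
    """Hypothetical online AR(order) expert; lr is an assumed step size."""
    coef = [0.0] * order

    def lags(history):
        # Most recent value first; shorter than `order` early in the series.
        return history[-order:][::-1]

    def predict(history):
        raw = sum(c * x for c, x in zip(coef, lags(history)))
        return max(-1.0, min(1.0, raw))  # clip forecasts to [-1, 1]

    def update(history, y):
        # One gradient-descent step on the squared one-step-ahead error.
        err = predict(history) - y
        for i, x in enumerate(lags(history)):
            coef[i] -= lr * 2.0 * err * x

    return predict, update


def aggregate_forecasts(series, orders=(1, 2, 4), eta=0.125):
    """Exponentially weighted average of AR experts under square loss.

    Assumes every value in `series` lies in [-1, 1].
    Returns (cumulative loss of the aggregate, cumulative loss per expert).
    """
    experts = [make_ar_expert(p) for p in orders]
    log_w = [0.0] * len(experts)   # log-weights, uniform prior
    losses = [0.0] * len(experts)
    agg_loss = 0.0
    history = []
    for y in series:
        preds = [predict(history) for predict, _ in experts]
        top = max(log_w)           # subtract the max for numerical stability
        w = [math.exp(lw - top) for lw in log_w]
        forecast = sum(wi * p for wi, p in zip(w, preds)) / sum(w)
        agg_loss += (forecast - y) ** 2
        for k, (p, (_, update)) in enumerate(zip(preds, experts)):
            loss = (p - y) ** 2
            losses[k] += loss
            log_w[k] -= eta * loss  # penalise each expert by its own loss
            update(history, y)
        history.append(y)
    return agg_loss, losses


if __name__ == "__main__":
    # Synthetic cyclic series with period 12, already within [-1, 1].
    demo = [math.sin(2 * math.pi * t / 12) for t in range(200)]
    total, per_expert = aggregate_forecasts(demo)
    print("aggregate loss:", total)
    print("expert losses:", per_expert)
```

The weights are updated after every observation, so no model order has to be fixed in advance; a criterion such as AIC or BIC would instead pick a single order from a batch fit.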


Keywords: Model selection, Online learning, Aggregation algorithm, Time series



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Machine Intelligence Group, Bournemouth University, Poole, UK
