Batch and incremental dynamic factor machine learning for multivariate and multi-step-ahead forecasting

  • Jacopo De Stefani
  • Yann-Aël Le Borgne
  • Olivier Caelen
  • Dalila Hattab
  • Gianluca Bontempi
Regular Paper


Most multivariate forecasting methods in the literature are restricted to vector time series of low dimension, to linear methods and to short horizons. The big data revolution is instead shifting the focus to problems (e.g., those arising from IoT technology) characterized by very large dimension, nonlinearity and long forecasting horizons. This paper discusses and compares a set of state-of-the-art methods that could be promising in tackling such challenges. It also proposes DFML, a machine-learning version of the dynamic factor model, a successful forecasting methodology well known in econometrics. The DFML strategy is based on an out-of-sample selection of the nonlinear forecaster, of the number of latent components and of the multi-step-ahead strategy. We discuss both a batch and an incremental version of DFML, and we show that it can consistently outperform state-of-the-art methods on a number of synthetic and real forecasting tasks.
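To make the dynamic factor idea behind DFML concrete, here is a minimal sketch of the batch pipeline: project the high-dimensional series onto a few latent components, forecast each component with a nonlinear learner, and map the factor forecasts back to the original space. The PCA-based factor extraction and a k-nearest-neighbours forecaster are illustrative assumptions, as are the component count, lag order and horizon; they are not the selections made by the paper's out-of-sample procedure.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
T, n, k, lag, H = 300, 20, 3, 4, 5  # length, dimension, factors, lags, horizon

# Synthetic multivariate series driven by a few latent random walks
Z = np.cumsum(rng.normal(size=(T, k)), axis=0)
Y = Z @ rng.normal(size=(k, n)) + 0.1 * rng.normal(size=(T, n))

# 1. Dimension reduction: project the n series onto k latent components
pca = PCA(n_components=k).fit(Y)
F = pca.transform(Y)  # (T, k) factor series

# 2. Fit one nonlinear forecaster per factor on lagged factor embeddings
#    Row t of X holds [F[t-1], F[t-2], ..., F[t-lag]] flattened
X = np.hstack([F[lag - l - 1:T - l - 1] for l in range(lag)])
models = [KNeighborsRegressor(n_neighbors=5).fit(X, F[lag:, j])
          for j in range(k)]

# 3. Recursive multi-step-ahead forecast in factor space
hist = F.copy()
for h in range(H):
    x = np.hstack([hist[-l - 1] for l in range(lag)]).reshape(1, -1)
    nxt = np.array([m.predict(x)[0] for m in models])
    hist = np.vstack([hist, nxt])

# 4. Map the factor forecasts back to the original n-dimensional space
Y_hat = pca.inverse_transform(hist[-H:])  # (H, n) forecast matrix
print(Y_hat.shape)
```

An incremental variant along the lines discussed in the paper could replace the batch `PCA` step with a streaming estimator (e.g., scikit-learn's `IncrementalPCA`) updated as new observations arrive; the forecasting and reconstruction steps stay unchanged.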


Multivariate forecasting · Multi-step-ahead forecasting · Dynamic factor models · Nonlinear forecasting



GB and YLB acknowledge the support of the INNOVIRIS SecurIT project BruFence: Scalable machine learning for automating defense system. JD acknowledges the support of the ULB-Worldline Collaboration Agreement. Computational resources have been provided by the Consortium des Équipements de Calcul Intensif (CÉCI), funded by the Fonds de la Recherche Scientifique de Belgique (F.R.S.-FNRS) under Grant No. 2.5020.11.

Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Machine Learning Group, Computer Science Department, Faculty of Sciences, Université Libre de Bruxelles (ULB), Brussels, Belgium
  2. Worldline SA/NV R&D, Bruxelles, Belgium
  3. Equens Worldline R&D, Lille, Seclin, France