Time Series Forecasting with Statistical, Machine Learning, and Deep Learning Methods: Past, Present, and Future

Chapter in Forecasting with Artificial Intelligence

Abstract

Time series forecasting covers a wide range of methods extending from exponential smoothing and ARIMA models to sophisticated machine learning ones, such as neural networks and regression-tree-based techniques. More recently, deep learning methods have also shown considerable improvements in many forecasting applications. This chapter provides an overview of the key advances that have occurred per class of method in the last decades, presents their advantages and drawbacks, describes the conditions they are expected to perform better under, and discusses some approaches that can be exploited to improve their accuracy. Finally, some directions for future research are proposed to further improve their accuracy and applicability.


Notes

  1. Other terms commonly used to describe statistical methods are “traditional”, “conventional”, “structured”, “model driven” or simply “time series” methods.

  2. Other terms commonly used to describe ML methods are “computational intelligence”, “unstructured”, or “data-driven” methods.

  3. Although it is true that some ML models make certain assumptions about data distributions, they rarely prescribe the structural components of the series.

  4. Vector ARIMA (VARIMA) and its variants, providing a multivariate generalization of the univariate ARIMA model, are probably the most notable counterexamples. Yet, their modeling approach still differs from global models in the sense that they are not tasked to forecast each series separately, but accommodate assumptions on cross-series and contemporaneous relationships. Also, when the number of series becomes large, VARIMA models can be challenging to estimate and may include many insignificant parameters, thus becoming impractical (Riise & Tjøstheim, 1984). Similarly, although one can argue that hierarchical forecasting methods (Hyndman et al., 2011) allow the exchange of information between multiple series, they fundamentally differ from global models since such exchanges are possible only for hierarchically organized data and not for series that may originate from different data sources or represent a variety of measures.

  5. Although some primitive global models were proposed before 2018 (e.g. a NN model trained on pools of series ranked 3rd in the NN3 competition; Crone et al., 2011), the M4 competition (Makridakis et al., 2020) was probably the first major study to demonstrate their superiority.

  6. The term “look-back window” is typically used to define the number of lags considered by the model for making forecasts.
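As a concrete illustration of the look-back window, the sketch below (illustrative function name, not from the chapter) turns a univariate series into supervised training pairs, each consisting of the lagged observations and the next value to be forecast:

```python
import numpy as np

def make_windows(series, look_back):
    """Turn a univariate series into (lags, next-value) training pairs.

    Each row of X holds `look_back` consecutive past observations; the
    corresponding entry of y is the value to be forecast one step ahead.
    """
    X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
    y = np.array(series[look_back:])
    return X, y

# A series of 10 points with a look-back window of 3 yields 7 training pairs.
X, y = make_windows(list(range(10)), look_back=3)
```

A longer look-back window gives the model access to more lags but leaves fewer training pairs per series.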

  7. As an example, in each iteration, a RT will split the samples of the train set using a rule applied to a certain regressor variable so that the forecast error is minimized. In this manner, given a set of predictors that includes both relevant and irrelevant regressor variables, only the former will be used. In the same forecasting task, given a feedforward network, the weights of the nodes that correspond to the irrelevant predictors will gradually be set to zero to minimize their impact on the forecast. Although in practice these claims may not hold exactly, it is evident that, structurally speaking, ML methods do allow an automated feature selection process.

  8. Larger data sets can be created either by including longer series, each contributing multiple observations to the train set, or by including several short series, together contributing numerous observations to the train set.
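The pooling idea behind such larger (cross-learning) train sets can be sketched as follows, assuming simple one-step-ahead targets (hypothetical helper, for illustration only):

```python
import numpy as np

def pool_series(series_list, look_back):
    # Build one train set from many series: every series contributes all of
    # its (look_back lags, target) pairs, so even short series add observations.
    X, y = [], []
    for s in series_list:
        for i in range(len(s) - look_back):
            X.append(s[i:i + look_back])
            y.append(s[i + look_back])
    return np.array(X), np.array(y)

# Two short series of lengths 6 and 8 with a look-back of 4 jointly
# contribute (6 - 4) + (8 - 4) = 6 training pairs.
X, y = pool_series([list(range(6)), list(range(8))], look_back=4)
```

A global model fitted on the pooled pairs learns patterns shared across series, which is what distinguishes it from the local, per-series approach.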

  9. Some researchers have proposed focusing on past forecast errors (Shrestha & Solomatine, 2006) or point forecasts (Pinson & Kariniotakis, 2010) similar to those of the period being forecast, with the objective of accounting for the effect that critical explanatory variables and seasonality have on forecast uncertainty while reducing computational cost.
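A much-simplified nearest-neighbour sketch of this idea — building an empirical interval from the errors of the k past forecasts most similar to the current point forecast — might look as follows (all names and the similarity criterion are illustrative; the cited papers use richer conditioning):

```python
import numpy as np

def conditional_interval(point_fc, past_fcs, past_errors, k=3, q=(5, 95)):
    # Pick the k past forecasts most similar to the current point forecast
    # and form an empirical interval from their observed errors.
    idx = np.argsort(np.abs(np.asarray(past_fcs) - point_fc))[:k]
    q_lo, q_hi = np.percentile(np.asarray(past_errors)[idx], q)
    return point_fc + q_lo, point_fc + q_hi

# Past forecasts near 10.5 had small errors; the distant ones had large
# errors, so conditioning on similarity yields a tighter interval.
lo, hi = conditional_interval(10.5, [10, 11, 50, 52, 12], [-1, 1, -5, 5, 0])
```

Because only similar past cases are used, the interval reflects the uncertainty observed under comparable conditions rather than over the whole history.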

  10. This includes NNs that consider different weight initializations, sizes of look-back windows, and loss functions.

  11. The rectified linear unit (ReLU) and leaky ReLU are commonly used as activation functions in DL models.
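The two activation functions are simple to state directly (standard definitions, sketched here with NumPy):

```python
import numpy as np

def relu(x):
    # ReLU: passes positive inputs unchanged and zeroes out negatives.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: keeps a small slope `alpha` for negative inputs, which
    # helps avoid "dead" units whose gradient is permanently zero.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 3.0])
y1 = relu(x)        # negatives become 0
y2 = leaky_relu(x)  # negatives are scaled by alpha
```

The non-saturating positive branch is what makes both functions attractive for deep networks, as gradients do not vanish for large activations.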

  12. The root mean squared propagation (RMSProp) and Adam optimizers can be used to dynamically adapt the step size (learning rate) for each input variable based on the most recently observed gradients of that variable during the search.
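For illustration, a bare-bones version of the Adam update (standard formulas; the function name and toy objective are ours), applied to minimize f(x) = x² via its gradient 2x:

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Running estimates of the gradient mean (m) and uncentered variance (v)
    # give each parameter its own effective step size.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)  # bias correction (moments start at zero)
    v_hat = v / (1 - b2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimize f(x) = x^2: the gradient is 2x, so the updates drive x toward 0.
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 501):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.1)
```

Dividing by the root of the variance estimate is what normalizes the step per parameter; RMSProp does the same without the first-moment term and bias correction.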

  13. According to this approach, the inputs of the layers are normalized by re-centering and re-scaling so that the distribution of each layer’s inputs does not vary during training.
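The re-centering and re-scaling step can be sketched as follows (per-feature statistics over the mini-batch; gamma and beta would be learnable in a real layer):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Re-center and re-scale each feature over the mini-batch axis, then
    # apply the scale (gamma) and shift (beta).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Two features on very different scales end up with comparable distributions.
batch = np.array([[1.0, 100.0], [3.0, 300.0]])
out = batch_norm(batch)
```

After normalization each feature has approximately zero mean and unit variance within the batch, which is the property that stabilizes training.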

  14. According to this approach, the layers of the NN are successively added to the model and refitted, allowing each newly added layer to learn from the outputs of the existing ones.

  15. In order to track the performance of forecasting models under different conditions, extensive simulations (e.g. rolling origin evaluations; Tashman, 2000) are typically required.
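A minimal rolling-origin evaluation loop looks as follows (illustrative names; the naive benchmark simply repeats the last observation):

```python
import numpy as np

def rolling_origin_mae(series, forecast_fn, min_train=5):
    """Rolling-origin evaluation: repeatedly fit on an expanding train set
    and score the one-step-ahead forecast at each successive origin."""
    errors = []
    for origin in range(min_train, len(series)):
        train, actual = series[:origin], series[origin]
        errors.append(abs(forecast_fn(train) - actual))
    return np.mean(errors)

# The naive forecast (last observation) evaluated over rolling origins,
# a standard benchmark for one-step-ahead accuracy.
naive = lambda train: train[-1]
mae = rolling_origin_mae([1, 2, 2, 3, 5, 4, 6, 7], naive, min_train=5)
```

Because every origin yields an out-of-sample error, the average is far more robust than a single fixed-origin test split, at the cost of repeated refitting.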

References

  • Agarap, A. F. (2018). Deep learning using rectified linear units (ReLU).

  • Alexandrov, A., Benidis, K., Bohlke-Schneider, M., Flunkert, V., Gasthaus, J., Januschowski, T., Maddix, D. C., Rangapuram, S., Salinas, D., Schulz, J., Stella, L., Türkmen, A. C., & Wang, Y. (2019). GluonTS: Probabilistic time series models in Python.

  • Alolayan, O. S., Raymond, S. J., Montgomery, J. B., & Williams, J. R. (2022). Towards better shale gas production forecasting using transfer learning. Upstream Oil and Gas Technology, 9, 100072.

  • Assimakopoulos, V., & Nikolopoulos, K. (2000). The theta model: A decomposition approach to forecasting. International Journal of Forecasting, 16(4), 521–530.

  • Athanasopoulos, G., Hyndman, R. J., Kourentzes, N., & Petropoulos, F. (2017). Forecasting with temporal hierarchies. European Journal of Operational Research, 262(1), 60–74.

  • Bai, S., Kolter, J. Z., & Koltun, V. (2018). An empirical evaluation of generic convolutional and recurrent networks for sequence modeling.

  • Bandara, K., Hewamalage, H., Liu, Y. H., Kang, Y., & Bergmeir, C. (2021). Improving the accuracy of global forecasting models using time series data augmentation. Pattern Recognition, 120, 108148.

  • Barker, J. (2020). Machine learning in M4: What makes a good unstructured model? International Journal of Forecasting, 36(1), 150–155.

  • Bates, J. M., & Granger, C. W. J. (1969). The combination of forecasts. Journal of the Operational Research Society, 20(4), 451–468.

  • Beaumont, A. N. (2014). Data transforms with exponential smoothing methods of forecasting. International Journal of Forecasting, 30(4), 918–927.

  • Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2006). Greedy layer-wise training of deep networks. In B. Schölkopf, J. Platt, & T. Hoffman (Eds.), Advances in neural information processing systems (Vol. 19). MIT Press.

  • Bergmeir, C., Hyndman, R. J., & Benítez, J. M. (2016). Bagging exponential smoothing methods using STL decomposition and Box-Cox transformation. International Journal of Forecasting, 32(2), 303–312.

  • Bojer, C. S. (2022). Understanding machine learning-based forecasting methods: A decomposition framework and research opportunities. International Journal of Forecasting, 38(4), 1555–1561.

  • Bojer, C. S., & Meldgaard, J. P. (2021). Kaggle forecasting competitions: An overlooked learning opportunity. International Journal of Forecasting, 37(2), 587–603.

  • Borovykh, A., Bohte, S., & Oosterlee, C. W. (2017). Conditional time series forecasting with convolutional neural networks.

  • Chatigny, P., Wang, S., Patenaude, J. M., & Oreshkin, B. N. (2021). Neural forecasting at scale.

  • Claeskens, G., Magnus, J. R., Vasnev, A. L., & Wang, W. (2016). The forecast combination puzzle: A simple theoretical explanation. International Journal of Forecasting, 32(3), 754–762.

  • Crone, S. F., Hibon, M., & Nikolopoulos, K. (2011). Advances in forecasting with neural networks? Empirical evidence from the NN3 competition on time series prediction. International Journal of Forecasting, 27(3), 635–660.

  • De Gooijer, J. G., & Hyndman, R. J. (2006). 25 years of time series forecasting. International Journal of Forecasting, 22(3), 443–473.

  • De Gooijer, J. G., & Kumar, K. (1992). Some recent developments in non-linear time series modelling, testing, and forecasting. International Journal of Forecasting, 8(2), 135–156.

  • Gilliland, M. (2020). The value added by machine learning approaches in forecasting. International Journal of Forecasting, 36(1), 161–166.

  • Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Y. W. Teh & M. Titterington (Eds.), Proceedings of the thirteenth international conference on artificial intelligence and statistics. Proceedings of machine learning research (Vol. 9, pp. 249–256).

  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. https://doi.org/10.1109/CVPR.2016.90

  • Hewamalage, H., Bergmeir, C., & Bandara, K. (2021). Recurrent neural networks for time series forecasting: Current status and future directions. International Journal of Forecasting, 37(1), 388–427.

  • Hyndman, R., Athanasopoulos, G., Bergmeir, C., Caceres, G., Chhay, L., O’Hara-Wild, M., Petropoulos, F., Razbash, S., Wang, E., & Yasmeen, F. (2020). forecast: Forecasting functions for time series and linear models. R package version 8.12.

  • Hyndman, R. J., Koehler, A. B., Snyder, R. D., & Grose, S. (2002). A state space framework for automatic forecasting using exponential smoothing methods. International Journal of Forecasting, 18(3), 439–454.

  • Hyndman, R. J., Ahmed, R. A., Athanasopoulos, G., & Shang, H. L. (2011). Optimal combination forecasts for hierarchical time series. Computational Statistics & Data Analysis, 55(9), 2579–2589.

  • Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift.

  • Januschowski, T., Gasthaus, J., Wang, Y., Salinas, D., Flunkert, V., Bohlke-Schneider, M., & Callot, L. (2020). Criteria for classifying forecasting methods. International Journal of Forecasting, 36(1), 167–177.

  • Januschowski, T., Wang, Y., Torkkola, K., Erkkilä, T., Hasson, H., & Gasthaus, J. (2022). Forecasting with trees. International Journal of Forecasting, 38(4), 1473–1481.

  • Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1), 35–45.

  • Kang, Y., Spiliotis, E., Petropoulos, F., Athiniotis, N., Li, F., & Assimakopoulos, V. (2021). Déjà vu: A data-centric forecasting approach through time series cross-similarity. Journal of Business Research, 132, 719–731.

  • Kang, Y., Cao, W., Petropoulos, F., & Li, F. (2022). Forecast with forecasts: Diversity matters. European Journal of Operational Research, 301(1), 180–190.

  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization.

  • Kolassa, S. (2016). Evaluating predictive count data distributions in retail sales forecasting. International Journal of Forecasting, 32(3), 788–803.

  • Lai, G., Chang, W. C., Yang, Y., & Liu, H. (2017). Modeling long- and short-term temporal patterns with deep neural networks.

  • Lainder, A. D., & Wolfinger, R. D. (2022). Forecasting with gradient boosted trees: Augmentation, tuning, and cross-validation strategies: Winning solution to the M5 Uncertainty competition. International Journal of Forecasting, 38(4), 1426–1433.

  • Li, X., Petropoulos, F., & Kang, Y. (2021). Improving forecasting by subsampling seasonal time series.

  • Ma, S., Fildes, R., & Huang, T. (2016). Demand forecasting with high dimensional data: The case of SKU retail sales forecasting with intra- and inter-category promotional information. European Journal of Operational Research, 249(1), 245–257.

  • Makridakis, S., Hibon, M., Lusk, E., & Belhadjali, M. (1987). Confidence intervals: An empirical investigation of the series in the M-competition. International Journal of Forecasting, 3(3), 489–508.

  • Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2018). Statistical and Machine Learning forecasting methods: Concerns and ways forward. PLOS ONE, 13(3), 1–26.

  • Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2020). The M4 Competition: 100,000 time series and 61 forecasting methods. International Journal of Forecasting, 36(1), 54–74.

  • Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2022a). M5 accuracy competition: Results, findings, and conclusions. International Journal of Forecasting, 38(4), 1346–1364.

  • Makridakis, S., Spiliotis, E., Assimakopoulos, V., Semenoglou, A. A., Mulder, G., & Nikolopoulos, K. (2022b). Statistical, machine learning and deep learning forecasting methods: Comparisons and ways forward. Journal of the Operational Research Society, 1–20.

  • Miller, D. M., & Williams, D. (2003). Shrinkage estimators of time series seasonal factors and their effect on forecasting accuracy. International Journal of Forecasting, 19(4), 669–684.

  • Montero-Manso, P., & Hyndman, R. J. (2021). Principles and algorithms for forecasting groups of time series: Locality and globality. International Journal of Forecasting, 37(4), 1632–1653.

  • Montero-Manso, P., Athanasopoulos, G., Hyndman, R. J., & Talagala, T. S. (2020). FFORMA: Feature-based forecast model averaging. International Journal of Forecasting, 36(1), 86–92.

  • Oreshkin, B. N., Carpov, D., Chapados, N., & Bengio, Y. (2019). N-BEATS: Neural basis expansion analysis for interpretable time series forecasting.

  • Petropoulos, F., & Spiliotis, E. (2021). The wisdom of the data: Getting the most out of univariate time series forecasting. Forecasting, 3(3), 478–497.

  • Petropoulos, F., & Svetunkov, I. (2020). A simple combination of univariate models. International Journal of Forecasting, 36(1), 110–115.

  • Petropoulos, F., Makridakis, S., Assimakopoulos, V., & Nikolopoulos, K. (2014). ‘Horses for Courses’ in demand forecasting. European Journal of Operational Research, 237(1), 152–163.

  • Petropoulos, F., Hyndman, R. J., & Bergmeir, C. (2018). Exploring the sources of uncertainty: Why does bagging for time series forecasting work? European Journal of Operational Research, 268(2), 545–554.

  • Petropoulos, F., Grushka-Cockayne, Y., Siemsen, E., & Spiliotis, E. (2021). Wielding Occam’s razor: Fast and frugal retail forecasting.

  • Petropoulos, F., Apiletti, D., Assimakopoulos, V., Babai, M. Z., Barrow, D. K., Ben Taieb, S., Bergmeir, C., Bessa, R. J., Bijak, J., Boylan, J. E., Browell, J., Carnevale, C., Castle, J. L., Cirillo, P., Clements, M. P., Cordeiro, C., Cyrino Oliveira, F. L., De Baets, S., Dokumentov, A., ... Ziel, F. (2022a). Forecasting: theory and practice. International Journal of Forecasting, 38(3), 705–871.

  • Petropoulos, F., Spiliotis, E., & Panagiotelis, A. (2022b). Model combinations through revised base rates. International Journal of Forecasting.

  • Pinson, P., & Kariniotakis, G. (2010). Conditional prediction intervals of wind power generation. IEEE Transactions on Power Systems, 25(4), 1845–1856.

  • Proietti, T., & Lütkepohl, H. (2013). Does the Box-Cox transformation help in forecasting macroeconomic time series? International Journal of Forecasting, 29(1), 88–99.

  • Rajapaksha, D., Bergmeir, C., & Hyndman, R. J. (2022). LoMEF: A framework to produce local explanations for global model time series forecasts. International Journal of Forecasting.

  • Riise, T., & Tjøstheim, D. (1984). Theory and practice of multivariate ARMA forecasting. Journal of Forecasting, 3(3), 309–317.

  • Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. (2020). DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 36(3), 1181–1191.

  • Semenoglou, A. A., Spiliotis, E., Makridakis, S., & Assimakopoulos, V. (2021). Investigating the accuracy of cross-learning time series forecasting methods. International Journal of Forecasting, 37(3), 1072–1084.

  • Semenoglou, A. A., Spiliotis, E., & Assimakopoulos, V. (2023a). Data augmentation for univariate time series forecasting with neural networks. Pattern Recognition, 134, 109132.

  • Semenoglou, A. A., Spiliotis, E., & Assimakopoulos, V. (2023b). Image-based time series forecasting: A deep convolutional neural network approach. Neural Networks, 157, 39–53.

  • Shih, S. Y., Sun, F. K., & Lee, H.-Y. (2019). Temporal pattern attention for multivariate time series forecasting. Machine Learning, 108(8), 1421–1441.

  • Shrestha, D. L., & Solomatine, D. P. (2006). Machine learning approaches for estimation of prediction interval for the model output. Neural Networks, 19(2), 225–235.

  • Smyl, S. (2020). A hybrid method of exponential smoothing and recurrent neural networks for time series forecasting. International Journal of Forecasting, 36(1), 75–85.

  • Spiliotis, E., Assimakopoulos, V., & Nikolopoulos, K. (2019). Forecasting with a hybrid method utilizing data smoothing, a variation of the Theta method and shrinkage of seasonal factors. International Journal of Production Economics, 209, 92–102.

  • Spiliotis, E., Assimakopoulos, V., & Makridakis, S. (2020). Generalizing the theta method for automatic forecasting. European Journal of Operational Research, 284(2), 550–558.

  • Spiliotis, E., Makridakis, S., Kaltsounis, A., & Assimakopoulos, V. (2021). Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data. International Journal of Production Economics, 240, 108237.

  • Spiliotis, E., Makridakis, S., Semenoglou, A. A., & Assimakopoulos, V. (2022). Comparison of statistical and machine learning methods for daily SKU demand forecasting. Operational Research, 22(3), 3037–3061.

  • Spithourakis, G. P., Petropoulos, F., Babai, M. Z., Nikolopoulos, K., & Assimakopoulos, V. (2011). Improving the performance of popular supply chain forecasting techniques. Supply Chain Forum: An International Journal, 12(4), 16–25.

  • Svetunkov, I., Kourentzes, N., & Ord, J. K. (2022). Complex exponential smoothing. Naval Research Logistics.

  • Tang, Y., Yang, K., Zhang, S., & Zhang, Z. (2022). Photovoltaic power forecasting: A hybrid deep learning model incorporating transfer learning strategy. Renewable and Sustainable Energy Reviews, 162, 112473.

  • Tashman, L. J. (2000). Out-of-sample tests of forecasting accuracy: An analysis and review. International Journal of Forecasting, 16(4), 437–450.

  • Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., & Fergus, R. (2013). Regularization of neural networks using DropConnect. In S. Dasgupta & D. McAllester (Eds.), Proceedings of the 30th international conference on machine learning. Proceedings of machine learning research (Vol. 28, pp. 1058–1066).

  • Wellens, A. P., Udenio, M., & Boute, R. N. (2022). Transfer learning for hierarchical forecasting: Reducing computational efforts of M5 winning methods. International Journal of Forecasting, 38(4), 1482–1491.

  • Zhang, G., Eddy Patuwo, B., & Hu, Y. (1998). Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14(1), 35–62.


Author information

Correspondence to Evangelos Spiliotis.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Spiliotis, E. (2023). Time Series Forecasting with Statistical, Machine Learning, and Deep Learning Methods: Past, Present, and Future. In: Hamoudia, M., Makridakis, S., Spiliotis, E. (eds) Forecasting with Artificial Intelligence. Palgrave Advances in the Economics of Innovation and Technology. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-35879-1_3

  • Print ISBN: 978-3-031-35878-4

  • Online ISBN: 978-3-031-35879-1