Time Series Forecasting with Statistical, Machine Learning, and Deep Learning Methods: Past, Present, and Future

Chapter in Forecasting with Artificial Intelligence

Abstract

Time series forecasting covers a wide range of methods extending from exponential smoothing and ARIMA models to sophisticated machine learning ones, such as neural networks and regression-tree-based techniques. More recently, deep learning methods have also shown considerable improvements in many forecasting applications. This chapter provides an overview of the key advances that have occurred per class of method in the last decades, presents their advantages and drawbacks, describes the conditions they are expected to perform better under, and discusses some approaches that can be exploited to improve their accuracy. Finally, some directions for future research are proposed to further improve their accuracy and applicability.


Notes

  1. Other terms commonly used to describe statistical methods are “traditional”, “conventional”, “structured”, “model driven” or simply “time series” methods.

  2. Other terms commonly used to describe ML methods are “computational intelligence”, “unstructured”, or “data-driven” methods.

  3. Although it is true that some ML models make certain assumptions about data distributions, they rarely prescribe the structural components of the series.

  4. Vector ARIMA (VARIMA) and its variants, providing a multivariate generalization of the univariate ARIMA model, are probably the most notable counterexamples. Yet, their modeling approach still differs from global models in the sense that they are not tasked to forecast each series separately, but accommodate assumptions on cross-series and contemporaneous relationships. Also, when the number of series becomes large, VARIMA models can be challenging to estimate and may include many insignificant parameters, thus becoming impractical (Riise & Tjøstheim, 1984). Similarly, although one can argue that hierarchical forecasting methods (Hyndman et al., 2011) allow the exchange of information between multiple series, they fundamentally differ from global models since such exchanges are possible only for hierarchically organized data and not for series that may originate from different data sources or represent a variety of measures.

  5. Although some primitive global models were proposed before 2018 (e.g. a NN model trained on pools of series ranked 3rd in the NN3 competition; Crone et al., 2011), the M4 competition (Makridakis et al., 2020) was probably the first major study to demonstrate their superiority.

  6. The term “look-back window” is typically used to define the number of lags considered by the model for making forecasts.
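As a concrete illustration of the look-back window, the sketch below (illustrative function name, not from the chapter) turns a univariate series into supervised training pairs, each consisting of the lagged observations and the next value to be forecast:

```python
import numpy as np

def make_windows(series, look_back):
    """Turn a univariate series into (lags, next-value) training pairs.

    Each row of X holds `look_back` consecutive past observations; the
    corresponding entry of y is the value to be forecast one step ahead.
    """
    X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
    y = np.array(series[look_back:])
    return X, y

# A series of 10 points with a look-back window of 3 yields 7 training pairs.
X, y = make_windows(list(range(10)), look_back=3)
```

A longer look-back window gives the model access to more lags but leaves fewer training pairs per series.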

  7. As an example, in each iteration, a RT will split the samples of the train set using a rule applied to a certain regressor variable so that the forecast error is minimized. In this manner, given a set of predictors that includes both relevant and irrelevant regressor variables, only the former will be used. In the same forecasting task, given a feedforward network, the weights of the nodes that correspond to the irrelevant predictors will gradually be set to zero to minimize their impact on the forecast. Although in practice these claims may not hold exactly, it is evident that, structurally speaking, ML methods do allow an automated feature selection process.

  8. Larger data sets can be created either by including longer series, each contributing multiple observations to the train set, or by including several short series, together contributing numerous observations to the train set.
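The pooling idea behind such larger (cross-learning) train sets can be sketched as follows, assuming simple one-step-ahead targets (hypothetical helper, for illustration only):

```python
import numpy as np

def pool_series(series_list, look_back):
    # Build one train set from many series: every series contributes all of
    # its (look_back lags, target) pairs, so even short series add observations.
    X, y = [], []
    for s in series_list:
        for i in range(len(s) - look_back):
            X.append(s[i:i + look_back])
            y.append(s[i + look_back])
    return np.array(X), np.array(y)

# Two short series of lengths 6 and 8 with a look-back of 4 jointly
# contribute (6 - 4) + (8 - 4) = 6 training pairs.
X, y = pool_series([list(range(6)), list(range(8))], look_back=4)
```

A global model fitted on the pooled pairs learns patterns shared across series, which is what distinguishes it from the local, per-series approach.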

  9. Some researchers have proposed focusing on past forecast errors (Shrestha & Solomatine, 2006) or point forecasts (Pinson & Kariniotakis, 2010) similar to those of the period being forecast, with the objective of accounting for the effect that critical explanatory variables and seasonality have on forecast uncertainty while reducing computational cost.
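A much-simplified nearest-neighbour sketch of this idea — building an empirical interval from the errors of the k past forecasts most similar to the current point forecast — might look as follows (all names and the similarity criterion are illustrative; the cited papers use richer conditioning):

```python
import numpy as np

def conditional_interval(point_fc, past_fcs, past_errors, k=3, q=(5, 95)):
    # Pick the k past forecasts most similar to the current point forecast
    # and form an empirical interval from their observed errors.
    idx = np.argsort(np.abs(np.asarray(past_fcs) - point_fc))[:k]
    q_lo, q_hi = np.percentile(np.asarray(past_errors)[idx], q)
    return point_fc + q_lo, point_fc + q_hi

# Past forecasts near 10.5 had small errors; the distant ones had large
# errors, so conditioning on similarity yields a tighter interval.
lo, hi = conditional_interval(10.5, [10, 11, 50, 52, 12], [-1, 1, -5, 5, 0])
```

Because only similar past cases are used, the interval reflects the uncertainty observed under comparable conditions rather than over the whole history.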

  10. This includes NNs that consider different weight initializations, sizes of look-back windows, and loss functions.

  11. The rectified linear unit (ReLU) and leaky ReLU are commonly used as activation functions in DL models.
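The two activation functions are simple to state directly (standard definitions, sketched here with NumPy):

```python
import numpy as np

def relu(x):
    # ReLU: passes positive inputs unchanged and zeroes out negatives.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: keeps a small slope `alpha` for negative inputs, which
    # helps avoid "dead" units whose gradient is permanently zero.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 3.0])
y1 = relu(x)        # negatives become 0
y2 = leaky_relu(x)  # negatives are scaled by alpha
```

The non-saturating positive branch is what makes both functions attractive for deep networks, as gradients do not vanish for large activations.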

  12. The root mean squared propagation (RMSProp) and Adam optimizers can be used to dynamically adapt the step size (learning rate) for each input variable based on the most recently observed gradients of that variable during the search.
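For illustration, a bare-bones version of the Adam update (standard formulas; the function name and toy objective are ours), applied to minimize f(x) = x² via its gradient 2x:

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Running estimates of the gradient mean (m) and uncentered variance (v)
    # give each parameter its own effective step size.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)  # bias correction (moments start at zero)
    v_hat = v / (1 - b2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimize f(x) = x^2: the gradient is 2x, so the updates drive x toward 0.
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 501):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.1)
```

Dividing by the root of the variance estimate is what normalizes the step per parameter; RMSProp does the same without the first-moment term and bias correction.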

  13. According to this approach, the inputs of the layers are normalized by re-centering and re-scaling so that the distribution of each layer’s inputs does not vary during training.
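The re-centering and re-scaling step can be sketched as follows (per-feature statistics over the mini-batch; gamma and beta would be learnable in a real layer):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Re-center and re-scale each feature over the mini-batch axis, then
    # apply the scale (gamma) and shift (beta).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Two features on very different scales end up with comparable distributions.
batch = np.array([[1.0, 100.0], [3.0, 300.0]])
out = batch_norm(batch)
```

After normalization each feature has approximately zero mean and unit variance within the batch, which is the property that stabilizes training.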

  14. According to this approach, the layers of the NN are successively added to the model and refitted, allowing each newly added layer to learn from the outputs of the existing ones.

  15. In order to track the performance of forecasting models under different conditions, extensive simulations (e.g. rolling origin evaluations; Tashman, 2000) are typically required.
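A minimal rolling-origin evaluation loop looks as follows (illustrative names; the naive benchmark simply repeats the last observation):

```python
import numpy as np

def rolling_origin_mae(series, forecast_fn, min_train=5):
    """Rolling-origin evaluation: repeatedly fit on an expanding train set
    and score the one-step-ahead forecast at each successive origin."""
    errors = []
    for origin in range(min_train, len(series)):
        train, actual = series[:origin], series[origin]
        errors.append(abs(forecast_fn(train) - actual))
    return np.mean(errors)

# The naive forecast (last observation) evaluated over rolling origins,
# a standard benchmark for one-step-ahead accuracy.
naive = lambda train: train[-1]
mae = rolling_origin_mae([1, 2, 2, 3, 5, 4, 6, 7], naive, min_train=5)
```

Because every origin yields an out-of-sample error, the average is far more robust than a single fixed-origin test split, at the cost of repeated refitting.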

References

  • Agarap, A. F. (2018). Deep learning using rectified linear units (ReLU).

  • Alexandrov, A., Benidis, K., Bohlke-Schneider, M., Flunkert, V., Gasthaus, J., Januschowski, T., Maddix, D. C., Rangapuram, S., Salinas, D., Schulz, J., Stella, L., Türkmen, A. C., & Wang, Y. (2019). GluonTS: Probabilistic time series models in Python.

  • Alolayan, O. S., Raymond, S. J., Montgomery, J. B., & Williams, J. R. (2022). Towards better shale gas production forecasting using transfer learning. Upstream Oil and Gas Technology, 9, 100072.

  • Assimakopoulos, V., & Nikolopoulos, K. (2000). The theta model: A decomposition approach to forecasting. International Journal of Forecasting, 16(4), 521–530.

  • Athanasopoulos, G., Hyndman, R. J., Kourentzes, N., & Petropoulos, F. (2017). Forecasting with temporal hierarchies. European Journal of Operational Research, 262(1), 60–74.

  • Bai, S., Kolter, J. Z., & Koltun, V. (2018). An empirical evaluation of generic convolutional and recurrent networks for sequence modeling.

  • Bandara, K., Hewamalage, H., Liu, Y. H., Kang, Y., & Bergmeir, C. (2021). Improving the accuracy of global forecasting models using time series data augmentation. Pattern Recognition, 120, 108148.

  • Barker, J. (2020). Machine learning in M4: What makes a good unstructured model? International Journal of Forecasting, 36(1), 150–155.

  • Bates, J. M., & Granger, C. W. J. (1969). The combination of forecasts. Journal of the Operational Research Society, 20(4), 451–468.

  • Beaumont, A. N. (2014). Data transforms with exponential smoothing methods of forecasting. International Journal of Forecasting, 30(4), 918–927.

  • Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2006). Greedy layer-wise training of deep networks. In B. Schölkopf, J. Platt, & T. Hoffman (Eds.), Advances in neural information processing systems (Vol. 19). MIT Press.

  • Bergmeir, C., Hyndman, R. J., & Benítez, J. M. (2016). Bagging exponential smoothing methods using STL decomposition and Box-Cox transformation. International Journal of Forecasting, 32(2), 303–312.

  • Bojer, C. S. (2022). Understanding machine learning-based forecasting methods: A decomposition framework and research opportunities. International Journal of Forecasting, 38(4), 1555–1561.

  • Bojer, C. S., & Meldgaard, J. P. (2021). Kaggle forecasting competitions: An overlooked learning opportunity. International Journal of Forecasting, 37(2), 587–603.

  • Borovykh, A., Bohte, S., & Oosterlee, C. W. (2017). Conditional time series forecasting with convolutional neural networks.

  • Chatigny, P., Wang, S., Patenaude, J. M., & Oreshkin, B. N. (2021). Neural forecasting at scale.

  • Claeskens, G., Magnus, J. R., Vasnev, A. L., & Wang, W. (2016). The forecast combination puzzle: A simple theoretical explanation. International Journal of Forecasting, 32(3), 754–762.

  • Crone, S. F., Hibon, M., & Nikolopoulos, K. (2011). Advances in forecasting with neural networks? Empirical evidence from the NN3 competition on time series prediction. International Journal of Forecasting, 27(3), 635–660.

  • De Gooijer, J. G., & Hyndman, R. J. (2006). 25 years of time series forecasting. International Journal of Forecasting, 22(3), 443–473.

  • De Gooijer, J. G., & Kumar, K. (1992). Some recent developments in non-linear time series modelling, testing, and forecasting. International Journal of Forecasting, 8(2), 135–156.

  • Gilliland, M. (2020). The value added by machine learning approaches in forecasting. International Journal of Forecasting, 36(1), 161–166.

  • Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Y. W. Teh & M. Titterington (Eds.), Proceedings of the thirteenth international conference on artificial intelligence and statistics. Proceedings of machine learning research (Vol. 9, pp. 249–256).

  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. https://doi.org/10.1109/CVPR.2016.90

  • Hewamalage, H., Bergmeir, C., & Bandara, K. (2021). Recurrent neural networks for time series forecasting: Current status and future directions. International Journal of Forecasting, 37(1), 388–427.

  • Hyndman, R., Athanasopoulos, G., Bergmeir, C., Caceres, G., Chhay, L., O’Hara-Wild, M., Petropoulos, F., Razbash, S., Wang, E., & Yasmeen, F. (2020). forecast: Forecasting functions for time series and linear models. R package version 8.12.

  • Hyndman, R. J., Koehler, A. B., Snyder, R. D., & Grose, S. (2002). A state space framework for automatic forecasting using exponential smoothing methods. International Journal of Forecasting, 18(3), 439–454.

  • Hyndman, R. J., Ahmed, R. A., Athanasopoulos, G., & Shang, H. L. (2011). Optimal combination forecasts for hierarchical time series. Computational Statistics & Data Analysis, 55(9), 2579–2589.

  • Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift.

  • Januschowski, T., Gasthaus, J., Wang, Y., Salinas, D., Flunkert, V., Bohlke-Schneider, M., & Callot, L. (2020). Criteria for classifying forecasting methods. International Journal of Forecasting, 36(1), 167–177.

  • Januschowski, T., Wang, Y., Torkkola, K., Erkkilä, T., Hasson, H., & Gasthaus, J. (2022). Forecasting with trees. International Journal of Forecasting, 38(4), 1473–1481.

  • Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1), 35–45.

  • Kang, Y., Spiliotis, E., Petropoulos, F., Athiniotis, N., Li, F., & Assimakopoulos, V. (2021). Déjà vu: A data-centric forecasting approach through time series cross-similarity. Journal of Business Research, 132, 719–731.

  • Kang, Y., Cao, W., Petropoulos, F., & Li, F. (2022). Forecast with forecasts: Diversity matters. European Journal of Operational Research, 301(1), 180–190.

  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization.

  • Kolassa, S. (2016). Evaluating predictive count data distributions in retail sales forecasting. International Journal of Forecasting, 32(3), 788–803.

  • Lai, G., Chang, W. C., Yang, Y., & Liu, H. (2017). Modeling long- and short-term temporal patterns with deep neural networks.

  • Lainder, A. D., & Wolfinger, R. D. (2022). Forecasting with gradient boosted trees: Augmentation, tuning, and cross-validation strategies: Winning solution to the M5 Uncertainty competition. International Journal of Forecasting, 38(4), 1426–1433.

  • Li, X., Petropoulos, F., & Kang, Y. (2021). Improving forecasting by subsampling seasonal time series.

  • Ma, S., Fildes, R., & Huang, T. (2016). Demand forecasting with high dimensional data: The case of SKU retail sales forecasting with intra- and inter-category promotional information. European Journal of Operational Research, 249(1), 245–257.

  • Makridakis, S., Hibon, M., Lusk, E., & Belhadjali, M. (1987). Confidence intervals: An empirical investigation of the series in the M-competition. International Journal of Forecasting, 3(3), 489–508.

  • Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2018). Statistical and Machine Learning forecasting methods: Concerns and ways forward. PLOS ONE, 13(3), 1–26.

  • Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2020). The M4 Competition: 100,000 time series and 61 forecasting methods. International Journal of Forecasting, 36(1), 54–74.

  • Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2022a). M5 accuracy competition: Results, findings, and conclusions. International Journal of Forecasting, 38(4), 1346–1364.

  • Makridakis, S., Spiliotis, E., Assimakopoulos, V., Semenoglou, A. A., Mulder, G., & Nikolopoulos, K. (2022b). Statistical, machine learning and deep learning forecasting methods: Comparisons and ways forward. Journal of the Operational Research Society, 1–20.

  • Miller, D. M., & Williams, D. (2003). Shrinkage estimators of time series seasonal factors and their effect on forecasting accuracy. International Journal of Forecasting, 19(4), 669–684.

  • Montero-Manso, P., & Hyndman, R. J. (2021). Principles and algorithms for forecasting groups of time series: Locality and globality. International Journal of Forecasting, 37(4), 1632–1653.

  • Montero-Manso, P., Athanasopoulos, G., Hyndman, R. J., & Talagala, T. S. (2020). FFORMA: Feature-based forecast model averaging. International Journal of Forecasting, 36(1), 86–92.

  • Oreshkin, B. N., Carpov, D., Chapados, N., & Bengio, Y. (2019). N-BEATS: Neural basis expansion analysis for interpretable time series forecasting.

  • Petropoulos, F., & Spiliotis, E. (2021). The wisdom of the data: Getting the most out of univariate time series forecasting. Forecasting, 3(3), 478–497.

  • Petropoulos, F., & Svetunkov, I. (2020). A simple combination of univariate models. International Journal of Forecasting, 36(1), 110–115.

  • Petropoulos, F., Makridakis, S., Assimakopoulos, V., & Nikolopoulos, K. (2014). ‘Horses for Courses’ in demand forecasting. European Journal of Operational Research, 237(1), 152–163.

  • Petropoulos, F., Hyndman, R. J., & Bergmeir, C. (2018). Exploring the sources of uncertainty: Why does bagging for time series forecasting work? European Journal of Operational Research, 268(2), 545–554.

  • Petropoulos, F., Grushka-Cockayne, Y., Siemsen, E., & Spiliotis, E. (2021). Wielding Occam’s razor: Fast and frugal retail forecasting.

  • Petropoulos, F., Apiletti, D., Assimakopoulos, V., Babai, M. Z., Barrow, D. K., Ben Taieb, S., Bergmeir, C., Bessa, R. J., Bijak, J., Boylan, J. E., Browell, J., Carnevale, C., Castle, J. L., Cirillo, P., Clements, M. P., Cordeiro, C., Cyrino Oliveira, F. L., De Baets, S., Dokumentov, A., ... Ziel, F. (2022a). Forecasting: theory and practice. International Journal of Forecasting, 38(3), 705–871.

  • Petropoulos, F., Spiliotis, E., & Panagiotelis, A. (2022b). Model combinations through revised base rates. International Journal of Forecasting.

  • Pinson, P., & Kariniotakis, G. (2010). Conditional prediction intervals of wind power generation. IEEE Transactions on Power Systems, 25(4), 1845–1856.

  • Proietti, T., & Lütkepohl, H. (2013). Does the Box-Cox transformation help in forecasting macroeconomic time series? International Journal of Forecasting, 29(1), 88–99.

  • Rajapaksha, D., Bergmeir, C., & Hyndman, R. J. (2022). LoMEF: A framework to produce local explanations for global model time series forecasts. International Journal of Forecasting.

  • Riise, T., & Tjøstheim, D. (1984). Theory and practice of multivariate ARMA forecasting. Journal of Forecasting, 3(3), 309–317.

  • Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. (2020). DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 36(3), 1181–1191.

  • Semenoglou, A. A., Spiliotis, E., Makridakis, S., & Assimakopoulos, V. (2021). Investigating the accuracy of cross-learning time series forecasting methods. International Journal of Forecasting, 37(3), 1072–1084.

  • Semenoglou, A. A., Spiliotis, E., & Assimakopoulos, V. (2023a). Data augmentation for univariate time series forecasting with neural networks. Pattern Recognition, 134, 109132.

  • Semenoglou, A. A., Spiliotis, E., & Assimakopoulos, V. (2023b). Image-based time series forecasting: A deep convolutional neural network approach. Neural Networks, 157, 39–53.

  • Shih, S. Y., Sun, F. K., & Lee, H.-Y. (2019). Temporal pattern attention for multivariate time series forecasting. Machine Learning, 108(8), 1421–1441.

  • Shrestha, D. L., & Solomatine, D. P. (2006). Machine learning approaches for estimation of prediction interval for the model output. Neural Networks, 19(2), 225–235.

  • Smyl, S. (2020). A hybrid method of exponential smoothing and recurrent neural networks for time series forecasting. International Journal of Forecasting, 36(1), 75–85.

  • Spiliotis, E., Assimakopoulos, V., & Nikolopoulos, K. (2019). Forecasting with a hybrid method utilizing data smoothing, a variation of the Theta method and shrinkage of seasonal factors. International Journal of Production Economics, 209, 92–102.

  • Spiliotis, E., Assimakopoulos, V., & Makridakis, S. (2020). Generalizing the theta method for automatic forecasting. European Journal of Operational Research, 284(2), 550–558.

  • Spiliotis, E., Makridakis, S., Kaltsounis, A., & Assimakopoulos, V. (2021). Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data. International Journal of Production Economics, 240, 108237.

  • Spiliotis, E., Makridakis, S., Semenoglou, A. A., & Assimakopoulos, V. (2022). Comparison of statistical and machine learning methods for daily SKU demand forecasting. Operational Research, 22(3), 3037–3061.

  • Spithourakis, G. P., Petropoulos, F., Babai, M. Z., Nikolopoulos, K., & Assimakopoulos, V. (2011). Improving the performance of popular supply chain forecasting techniques. Supply Chain Forum: An International Journal, 12(4), 16–25.

  • Svetunkov, I., Kourentzes, N., & Ord, J. K. (2022). Complex exponential smoothing. Naval Research Logistics.

  • Tang, Y., Yang, K., Zhang, S., & Zhang, Z. (2022). Photovoltaic power forecasting: A hybrid deep learning model incorporating transfer learning strategy. Renewable and Sustainable Energy Reviews, 162, 112473.

  • Tashman, L. J. (2000). Out-of-sample tests of forecasting accuracy: An analysis and review. International Journal of Forecasting, 16(4), 437–450.

  • Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., & Fergus, R. (2013). Regularization of neural networks using DropConnect. In S. Dasgupta & D. McAllester (Eds.), Proceedings of the 30th international conference on machine learning. Proceedings of machine learning research (Vol. 28, pp. 1058–1066).

  • Wellens, A. P., Udenio, M., & Boute, R. N. (2022). Transfer learning for hierarchical forecasting: Reducing computational efforts of M5 winning methods. International Journal of Forecasting, 38(4), 1482–1491.

  • Zhang, G., Eddy Patuwo, B., & Hu, Y. (1998). Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14(1), 35–62.


Author information

Correspondence to Evangelos Spiliotis.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Spiliotis, E. (2023). Time Series Forecasting with Statistical, Machine Learning, and Deep Learning Methods: Past, Present, and Future. In: Hamoudia, M., Makridakis, S., Spiliotis, E. (eds) Forecasting with Artificial Intelligence. Palgrave Advances in the Economics of Innovation and Technology. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-35879-1_3

  • Print ISBN: 978-3-031-35878-4

  • Online ISBN: 978-3-031-35879-1