Stacked LSTM Snapshot Ensembles for Time Series Forecasting

Conference paper
Part of the Contributions to Statistics book series (CONTRIB.STAT.)


Ensembles of machine learning models have been shown to improve prediction performance in various domains. The additional computational cost of this improvement is usually high, since multiple models must be trained. Snapshot ensembles (Huang et al. in Snapshot ensembles: train 1 get M for free, (2017) [16]) were recently introduced as a comparatively cheap way of ensemble learning for artificial neural networks (ANNs). We extend snapshot ensembles to time series forecasting; the extension comprises two essential steps. First, we show that determining reasonable selections of sequence lengths can be used to efficiently escape local minima. Second, combining the forecasts of snapshot LSTMs with a stacking approach greatly boosts performance compared to the mean of the forecasts used in the original snapshot ensemble approach. We demonstrate the effectiveness of the algorithm on five real-world datasets and show that the forecasting performance of our approach is superior to conservative ensemble architectures as well as to a single, highly optimized LSTM.
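The difference between averaging snapshot forecasts and stacking them can be illustrated with a minimal sketch. It is not the authors' implementation: the meta-learner is assumed here to be a ridge-regularised linear model, the function names (`mean_combine`, `fit_stacker`, `stack_combine`) are hypothetical, and the three "snapshot" forecasts are synthetic series with deliberately different biases and noise levels rather than outputs of trained LSTMs.

```python
import numpy as np


def mean_combine(forecasts):
    """Combine snapshot forecasts (one column per snapshot) by simple averaging."""
    return forecasts.mean(axis=1)


def fit_stacker(forecasts, targets, l2=1e-3):
    """Fit a linear meta-learner (assumed here: ridge regression with intercept)."""
    X = np.column_stack([np.ones(len(forecasts)), forecasts])
    penalty = l2 * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalised
    return np.linalg.solve(X.T @ X + penalty, X.T @ targets)


def stack_combine(forecasts, weights):
    """Apply the fitted meta-learner weights to new snapshot forecasts."""
    X = np.column_stack([np.ones(len(forecasts)), forecasts])
    return X @ weights


def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))


# Synthetic stand-in for a real series: a seasonal pattern plus a linear trend.
rng = np.random.default_rng(0)
t = np.arange(400)
y = np.sin(2 * np.pi * t / 24) + 0.01 * t

# Three hypothetical snapshot forecasts, each biased and noisy in a different way.
snapshots = np.column_stack([
    0.8 * y + 0.5 + rng.normal(0.0, 0.1, t.size),
    1.3 * y - 0.3 + rng.normal(0.0, 0.1, t.size),
    1.0 * y + 1.0 + rng.normal(0.0, 0.3, t.size),
])

# Fit the meta-learner on the first 300 steps, evaluate both combiners on the rest.
split = 300
weights = fit_stacker(snapshots[:split], y[:split])
mean_rmse = rmse(mean_combine(snapshots[split:]), y[split:])
stack_rmse = rmse(stack_combine(snapshots[split:], weights), y[split:])

print(f"mean of snapshots RMSE: {mean_rmse:.3f}")
print(f"stacked snapshots RMSE: {stack_rmse:.3f}")
```

In this toy setting the stacker can correct the systematic biases of the individual forecasts, which a plain mean cannot. The numbers it prints say nothing about the results on the five real-world datasets reported in the paper.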


Keywords: Time series, LSTM, ARIMA, Ensembles, Stacking, Meta-learning


  1. Adhikari, R.: A neural network based linear ensemble framework for time series forecasting. Neurocomputing 157, 231–242 (2015)
  2. Adhikari, R., Agrawal, R.K.: A linear hybrid methodology for improving accuracy of time series forecasting. Neural Comput. Appl. 25(2), 269–281 (2014)
  3. Aladag, C.H., Egrioglu, E., Kadilar, C.: Forecasting nonlinear time series with a hybrid methodology. Appl. Math. Lett. 22(9), 1467–1470 (2009)
  4. Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)
  5. Borovykh, A., Bohte, S., Oosterlee, C.W.: Conditional time series forecasting with convolutional neural networks. J. Comput. Financ. (2018)
  6. Cerqueira, V., et al.: Arbitrated ensemble for time series forecasting. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, Cham (2017)
  7. Elfeky, M.G., Aref, W.G., Elmagarmid, A.K.: Periodicity detection in time series databases. IEEE Trans. Knowl. Data Eng. 17(7), 875–887 (2005)
  8. Gers, F.A., Eck, D., Schmidhuber, J.: Applying LSTM to time series predictable through time-window approaches. In: Neural Nets WIRN Vietri-01, pp. 193–200. Springer (2002)
  9. Gothwal, H., Kedawat, S., Kumar, R.: Cardiac arrhythmias detection in an ECG beat signal using fast Fourier transform and artificial neural network. J. Biomed. Sci. Eng. 4(04), 289 (2011)
  10. Hamilton, J.D.: Time Series Analysis, vol. 2. Princeton University Press, Princeton (1994)
  11. He, Z., Gao, S., Xiao, L., Liu, D., He, H., Barber, D.: Wider and deeper, cheaper and faster: tensorized LSTMs for sequence learning. In: Advances in Neural Information Processing Systems, pp. 1–11 (2017)
  12. Hipel, K.W., McLeod, A.I.: Time Series Modelling of Water Resources and Environmental Systems, vol. 45. Elsevier (1994)
  13. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
  14. Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal approximators. Neural Netw. 2(5), 359–366 (1989)
  15. Hu, B., Lu, Z., Li, H., Chen, Q.: Convolutional neural network architectures for matching natural language sentences. In: Advances in Neural Information Processing Systems, pp. 2042–2050 (2014)
  16. Huang, G., Li, Y., Pleiss, G., Li, Z., Hopcroft, J., Weinberger, K.: Snapshot ensembles: train 1 get M for free. In: Proceedings of the International Conference on Learning Representations (ICLR 2017) (2017)
  17. Krstanovic, S., Paulheim, H.: Ensembles of recurrent neural networks for robust time series forecasting. In: International Conference on Innovative Techniques and Applications of Artificial Intelligence, pp. 34–46. Springer (2017)
  18. Lichman, M.: UCI Machine Learning Repository (2013).
  19. Längkvist, M., Karlsson, L., Loutfi, A.: A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognit. Lett. 42, 11–24 (2014)
  20. Malhotra, P., Vig, L., Shroff, G., Agarwal, P.: Long short term memory networks for anomaly detection in time series. In: Proceedings Presses Universitaires de Louvain, vol. 89 (2015)
  21. Oliveira, M., Torgo, L.: Ensembles for time series forecasting. In: JMLR: Workshop and Conference Proceedings, vol. 39, pp. 360–370 (2014)
  22. Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neural networks. In: International Conference on Machine Learning, pp. 1310–1318 (2013)
  23. Pratt, H., et al.: FCNN: Fourier Convolutional Neural Networks. Machine Learning and Knowledge Discovery in Databases, Springer, Cham (2017)
  24. Sharma, D., Issac, B., Raghava, G.P.S., Ramaswamy, R.: Spectral Repeat Finder (SRF): identification of repetitive sequences using Fourier transformation. Bioinformatics 20(9), 1405–1412 (2004)
  25. Shumway, R.H., Stoffer, D.S.: Time Series Analysis and Its Applications: With R Examples. Springer, New York (2010)
  26. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)
  27. Van Den Oord, A., et al.: WaveNet: a generative model for raw audio. In: SSW (2016)
  28. Wang, L., Zou, H., Su, J., Li, L., Chaudhry, S.: An ARIMA-ANN hybrid model for time series forecasting. Syst. Res. Behav. Sci. 30(3), 244–259 (2013)
  29. Welch, P.: The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Trans. Audio Electroacoust. 15(2), 70–73 (1967)
  30. Wen, R., Torkkola, K., Narayanaswamy, B.: A Multi-Horizon Quantile Recurrent Forecaster. NIPS 2017 Time Series Workshop (2017)
  31. Zhang, L., Suganthan, P.N.: Benchmarking ensemble classifiers with novel co-trained Kernel Ridge regression and random vector functional link ensembles [Research Frontier]. IEEE Comput. Intell. Mag. 12(4), 61–72 (2017)
  32. Zhang, P.G.: Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50, 159–175 (2003)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Research Group Data and Web Science, University of Mannheim, Mannheim, Germany
