Applying LSTM to Time Series Predictable through Time-Window Approaches

  • Felix A. Gers
  • Douglas Eck
  • Jürgen Schmidhuber
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2130)


Long Short-Term Memory (LSTM) can solve many time series tasks that are unsolvable by feed-forward networks using fixed-size time windows. Here we find that LSTM’s superiority does not carry over to certain simpler time series prediction tasks solvable by time-window approaches: the Mackey-Glass series and the Santa Fe FIR laser emission series (Set A). This suggests using LSTM only when simpler traditional approaches fail.
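As a rough illustration of what a time-window approach means here, the sketch below generates a Mackey-Glass series and predicts a future value from a fixed-size window of past values. It is not the setup used in the paper: the parameter values (tau=17, beta=0.2, gamma=0.1, exponent 10) are the commonly used benchmark settings, while the Euler integration step, the window length of 8, the horizon of 6 steps, and the plain linear least-squares model (standing in for a feed-forward network) are all assumptions chosen for brevity. The helper names `mackey_glass` and `make_windows` are hypothetical.

```python
import numpy as np


def mackey_glass(n_samples, tau=17.0, beta=0.2, gamma=0.1, n=10, dt=0.1, x0=1.2):
    """Integrate dx/dt = beta*x(t-tau)/(1+x(t-tau)^n) - gamma*x(t) with Euler steps.

    Returns n_samples values sampled at unit time intervals.
    The constant initial history x0 and step size dt are assumptions.
    """
    steps_per_unit = int(round(1.0 / dt))
    delay = int(round(tau / dt))
    total = n_samples * steps_per_unit
    x = np.empty(total + delay)
    x[:delay] = x0  # constant history before t = 0
    for t in range(delay, total + delay):
        x_tau = x[t - delay]
        dx = beta * x_tau / (1.0 + x_tau ** n) - gamma * x[t - 1]
        x[t] = x[t - 1] + dt * dx
    return x[delay::steps_per_unit][:n_samples]


def make_windows(series, window, horizon):
    """Build (inputs, targets): each input is `window` consecutive values,
    each target is the value `horizon` steps after the end of that window."""
    count = len(series) - window - horizon + 1
    X = np.stack([series[i:i + window] for i in range(count)])
    y = series[window + horizon - 1: window + horizon - 1 + count]
    return X, y


# Generate the series and drop an initial transient (length is an assumption).
series = mackey_glass(1200)[200:]
X, y = make_windows(series, window=8, horizon=6)

# Fit a linear model with a bias term on the first part, evaluate on the rest.
split = 800
A_train = np.hstack([X[:split], np.ones((split, 1))])
A_test = np.hstack([X[split:], np.ones((len(X) - split, 1))])
w, *_ = np.linalg.lstsq(A_train, y[:split], rcond=None)

train_mse = float(np.mean((A_train @ w - y[:split]) ** 2))
test_mse = float(np.mean((A_test @ w - y[split:]) ** 2))
print(f"train MSE: {train_mse:.6f}, test MSE: {test_mse:.6f}")
```

The point of the sketch is only that the predictor sees a fixed number of recent values and nothing else; a recurrent model such as LSTM would instead receive one value per step and have to carry the relevant history in its internal state.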


Recurrent Neural Network · Time Series Prediction · Chaotic Time Series · Local Linear Model · Time Series Predictable
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Felix A. Gers (1)
  • Douglas Eck (1)
  • Jürgen Schmidhuber (1)
  1. IDSIA, Manno, Switzerland
