Overview of Long Short-Term Memory Neural Networks

  • Kamilya Smagulova
  • Alex Pappachen James
Part of the Modeling and Optimization in Science and Technologies book series (MOST, volume 14)


Long short-term memory (LSTM) was designed to avoid the vanishing and exploding gradient problems in recurrent neural networks. Over the last twenty years, various modifications of the original LSTM cell have been proposed. This chapter gives an overview of basic LSTM cell structures and demonstrates forward and backward propagation within the most widely used configuration, the traditional LSTM cell. It also describes LSTM neural network configurations.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Nazarbayev University, Astana, Kazakhstan
