Abstract
Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) are expanded architectures designed to overcome the training shortcomings of the simple RNN (sRNN). They rely on three separate gating signals, each itself a replica of a simple RNN, joined together through the "memory-cell" structure. This architecture has been the effective and successful workhorse of RNNs in numerous applications since its publication in 1997. The chapter introduces the standard architecture, which embeds four replicas of the simple RNN and thus has four times its parameters. The chapter then presents five variants that retain the gating structure while progressively reducing the parameters in the gating signals; these variants belong to the family of Slim LSTMs. The chapter ends with example case studies exhibiting the comparative performance of the powerful standard LSTM RNN and the five Slim LSTMs.
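To make the gating structure concrete, the sketch below implements one time step of the standard LSTM and, for contrast, one reduced-gate step in the spirit of the Slim LSTM variants. This is a minimal NumPy illustration under stated assumptions, not the chapter's reference implementation: the function names, the parameter dictionary p, and the particular slim form shown (gates driven by the previous hidden state and a bias, with the gate input weights removed) are chosen for illustration; the precise definitions of the five variants appear in the cited reports of Akandeh and Salem (2017a, b, c) and Salem (2018).

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    # Standard LSTM: each gate (i, f, o) and the cell candidate g is a
    # full simple-RNN block with its own input weights W*, recurrent
    # weights U*, and bias b* -- four sRNN replicas, hence 4x parameters.
    i = sigmoid(p["Wi"] @ x + p["Ui"] @ h_prev + p["bi"])  # input gate
    f = sigmoid(p["Wf"] @ x + p["Uf"] @ h_prev + p["bf"])  # forget gate
    o = sigmoid(p["Wo"] @ x + p["Uo"] @ h_prev + p["bo"])  # output gate
    g = np.tanh(p["Wc"] @ x + p["Uc"] @ h_prev + p["bc"])  # cell candidate
    c = f * c_prev + i * g   # memory-cell update
    h = o * np.tanh(c)       # hidden state
    return h, c

def slim_lstm_step(x, h_prev, c_prev, p):
    # One assumed slim form, for illustration: the three gates drop their
    # input-weight matrices and keep only the recurrent term and a bias,
    # shrinking each gate from (n_in + n_h + 1)*n_h parameters to
    # (n_h + 1)*n_h, while the cell body keeps the full sRNN form.
    i = sigmoid(p["Ui"] @ h_prev + p["bi"])
    f = sigmoid(p["Uf"] @ h_prev + p["bf"])
    o = sigmoid(p["Uo"] @ h_prev + p["bo"])
    g = np.tanh(p["Wc"] @ x + p["Uc"] @ h_prev + p["bc"])
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

# Example usage with hypothetical sizes: n_in inputs, n_h hidden units.
rng = np.random.default_rng(0)
n_in, n_h = 8, 16
p = {k: 0.1 * rng.standard_normal((n_h, n_in)) for k in ("Wi", "Wf", "Wo", "Wc")}
p.update({k: 0.1 * rng.standard_normal((n_h, n_h)) for k in ("Ui", "Uf", "Uo", "Uc")})
p.update({k: np.zeros(n_h) for k in ("bi", "bf", "bo", "bc")})
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_h), np.zeros(n_h), p)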
References
Ahmad, M., & Salem, F. M. (1992). Dynamic learning using exponential energy functions. In Proceedings of the IEEE International Joint Conference on Neural Networks (pp. II-121–II-126).
Akandeh, A., & Salem, F. M. (2017a). Simplified long short-term memory recurrent neural networks: Part I. arXiv:1707.04619.
Akandeh, A., & Salem, F. M. (2017b). Simplified long short-term memory recurrent neural networks: Part II. arXiv:1707.04623.
Akandeh, A., & Salem, F. M. (2017c). Simplified long short-term memory recurrent neural networks: Part III. arXiv:1707.04626.
Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157–166.
Chollet, F. Keras. https://keras.io
Chollet, F. Keras codes. https://github.com/keras-team/keras
Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.
Gers, F. A., Schraudolph, N. N., & Schmidhuber, J. (2002). Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research, 3, 115–143.
Graves, A. (2012). Supervised sequence labelling with recurrent neural networks. In Studies in Computational Intelligence. Springer.
Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R., & Schmidhuber, J. (2017). LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 2222–2232.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Zhou, G.-B., Wu, J., Zhang, C.-L., & Zhou, Z.-H. (2016). Minimal gated unit for recurrent neural networks. arXiv preprint arXiv:1603.09420.
Johnson, M., Schuster, M., Le, Q. V., Krikun, M., Wu, Y., Chen, Z., Thorat, N., Viégas, F. B., Wattenberg, M., Corrado, G., Hughes, M., & Dean, J. (2016). Google’s multilingual neural machine translation system: Enabling zero-shot translation. http://arxiv.org/abs/1611.04558.
Le, Q. V., Jaitly, N., & Hinton, G. E. (2015). A simple way to initialize recurrent networks of rectified linear units. arXiv preprint arXiv:1504.00941.
Lu, Y., & Salem, F. M. (2017). Simplified gating in long short-term memory (LSTM) recurrent neural networks. In 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS) (p. 1601).
Salem, F. M. (2016a). A basic recurrent neural network model. arXiv preprint arXiv:1612.09022.
Salem, F. M. (2016b). Reduced parameterization in gated recurrent neural networks. Technical Report 11-2016, Michigan State University.
Salem, F. M. (2018). Slim LSTMs. https://arxiv.org/abs/1812.11391.
Zaremba, W. (2015). An empirical exploration of recurrent network architectures. In Proceedings of the 32nd International Conference on Machine Learning (ICML).
Zhou, G.-B., Wu, J., Zhang, C.-L., & Zhou, Z.-H. (2016). Minimal gated unit for recurrent neural networks. International Journal of Automation and Computing, 13(3), 226–234.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Salem, F.M. (2022). Gated RNN: The Long Short-Term Memory (LSTM) RNN. In: Recurrent Neural Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-89929-5_4
DOI: https://doi.org/10.1007/978-3-030-89929-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89928-8
Online ISBN: 978-3-030-89929-5
eBook Packages: Engineering (R0)