A Comparison of LSTM and GRU Networks for Learning Symbolic Sequences

  • Conference paper
  • Intelligent Computing (SAI 2023)

Abstract

We explore the architecture of recurrent neural networks (RNNs) by studying the complexity of string sequences that they are able to memorize. Symbolic sequences of different complexity are generated to simulate RNN training and to study how parameter configurations affect the networks' capability of learning and inference. We compare Long Short-Term Memory (LSTM) networks and gated recurrent units (GRUs). We find that an increase in RNN depth does not necessarily result in better memorization capability when the training time is constrained. Our results also indicate that the learning rate and the number of units per layer are among the most important hyper-parameters to tune. Generally, GRUs outperform LSTM networks on low-complexity sequences, while LSTMs perform better on high-complexity sequences.
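
The study compares stacked LSTM and GRU models of varying depth, width, and learning rate on symbolic (string) sequences. As a minimal sketch only, assuming a Keras/TensorFlow setup with illustrative hyper-parameter values and a toy repeating sequence (none of which are taken from the paper), such a comparison could be wired up as follows:

```python
# Minimal sketch (not the authors' code): compare stacked LSTM vs. GRU models
# on a character-level next-symbol prediction task.
# All hyper-parameter values and the toy sequence are illustrative assumptions.
import numpy as np
from tensorflow.keras import layers, models, optimizers

def build_rnn(cell="LSTM", depth=2, units=64, n_symbols=4, window=20,
              learning_rate=1e-3):
    Cell = layers.LSTM if cell == "LSTM" else layers.GRU
    model = models.Sequential()
    model.add(layers.Input(shape=(window, n_symbols)))
    for i in range(depth):
        # All recurrent layers except the last must return full sequences.
        model.add(Cell(units, return_sequences=(i < depth - 1)))
    model.add(layers.Dense(n_symbols, activation="softmax"))
    model.compile(optimizer=optimizers.Adam(learning_rate),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

# Toy low-complexity symbolic sequence: a repeating 4-letter pattern.
alphabet = "abcd"
seq = "abcd" * 200
onehot = {s: np.eye(len(alphabet))[i] for i, s in enumerate(alphabet)}
window = 20
X = np.array([[onehot[s] for s in seq[i:i + window]]
              for i in range(len(seq) - window)])
y = np.array([onehot[seq[i + window]] for i in range(len(seq) - window)])

for cell in ("LSTM", "GRU"):
    model = build_rnn(cell=cell, window=window)
    hist = model.fit(X, y, epochs=5, batch_size=32, verbose=0)
    print(cell, "training accuracy:", round(hist.history["accuracy"][-1], 3))
```

Swapping `layers.LSTM` for `layers.GRU` is the only architectural change here, which keeps the comparison between cell types controlled while depth, units, and learning rate are varied.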

R. C.’s work has been funded by the UK’s Alan Turing Institute. S. G. has been supported by a Fellowship of the Alan Turing Institute, EPSRC grant EP/N510129/1.

Notes

  1. https://github.com/robcah/RNNExploration4SymbolicTS.

  2. textdistance 4.2.0, https://pypi.org/project/textdistance (see the sketch below).
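
Note 2 refers to the textdistance package. As a hedged illustration only (the example strings and the choice of string metrics are assumptions, not the paper's evaluation protocol), a symbolic sequence generated by a trained RNN could be scored against its ground-truth continuation like this:

```python
# Illustrative use of the textdistance package (pip install textdistance)
# to score an RNN-generated symbolic sequence against the true continuation.
import textdistance

target    = "abcdabcdabcdabcd"   # ground-truth continuation (made-up example)
predicted = "abcdabcbabcdabcd"   # hypothetical RNN output with one wrong symbol

# Normalized similarities in [0, 1]: 1.0 means a perfect reconstruction.
dl_sim = textdistance.damerau_levenshtein.normalized_similarity(target, predicted)
jw_sim = textdistance.jaro_winkler.normalized_similarity(target, predicted)

print(f"Damerau-Levenshtein similarity: {dl_sim:.3f}")
print(f"Jaro-Winkler similarity:        {jw_sim:.3f}")
```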

Author information

Correspondence to Roberto Cahuantzi.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cahuantzi, R., Chen, X., Güttel, S. (2023). A Comparison of LSTM and GRU Networks for Learning Symbolic Sequences. In: Arai, K. (eds) Intelligent Computing. SAI 2023. Lecture Notes in Networks and Systems, vol 739. Springer, Cham. https://doi.org/10.1007/978-3-031-37963-5_53
