Abstract
We study the ability of recurrent neural networks to model and recognize simple regular languages. We train the networks under different levels of input noise and regularization, and analyze their accuracy and interpretability on a complete validation set. Our results show that a small amount of noise improves generalization, while regularization improves interpretability. Under appropriate levels of both, the networks achieve high accuracy, and their hidden units display activation patterns that can be related to discrete states in a deterministic finite automaton.
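To make the setup concrete, the following is a minimal sketch (not the authors' implementation) of the kind of experiment the abstract describes: a small recurrent network is trained on a simple regular language with Gaussian input noise and L1 activity regularization on the hidden layer, after which the hidden activations are inspected. The choice of language (parity of the number of b's, a classic two-state DFA task), the noise level (0.1), the L1 coefficient (1e-4), and the network size (8 units, length-10 strings) are all illustrative assumptions, not values from the paper.

```python
# Hedged sketch of the training setup: RNN on a simple regular language
# with input noise and activity regularization. All hyperparameters are
# illustrative assumptions, not the paper's configuration.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, regularizers

LENGTH, HIDDEN = 10, 8

def make_dataset(n_samples=4000, seed=0):
    """Random strings over {a, b}; label 1 iff the number of b's is even."""
    rng = np.random.default_rng(seed)
    symbols = rng.integers(0, 2, size=(n_samples, LENGTH))  # 0 = 'a', 1 = 'b'
    labels = (symbols.sum(axis=1) % 2 == 0).astype(np.float32)
    return np.eye(2, dtype=np.float32)[symbols], labels     # one-hot inputs

X, y = make_dataset()

inputs = layers.Input(shape=(LENGTH, 2))
noisy = layers.GaussianNoise(0.1)(inputs)          # noise active only during training
states = layers.SimpleRNN(
    HIDDEN,
    activation="tanh",
    activity_regularizer=regularizers.l1(1e-4),    # pushes activations toward saturation
    return_sequences=True,
)(noisy)
final = layers.Lambda(lambda t: t[:, -1, :])(states)
output = layers.Dense(1, activation="sigmoid")(final)

model = tf.keras.Model(inputs, output)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=30, batch_size=32, verbose=0)

# Tap the hidden layer: if the units settle into a few near-binary
# activation patterns, each pattern can be read as a candidate DFA state.
state_model = tf.keras.Model(inputs, states)
hidden = state_model.predict(X[:5], verbose=0)     # shape (5, LENGTH, HIDDEN)
print(np.round(hidden, 1))
```

With the L1 penalty pushing the tanh units toward saturated values, the per-timestep hidden vectors tend to cluster around a small number of near-binary patterns, which is the kind of structure that can be matched against the states of a deterministic finite automaton.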
Acknowledgments
This work was funded by grant S2017/BMD-3688 from Comunidad de Madrid, and by Spanish project MINECO/FEDER TIN2017-84452-R (http://www.mineco.gob.es/).
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Oliva, C., Lago-Fernández, L.F. (2019). Interpretability of Recurrent Neural Networks Trained on Regular Languages. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2019. Lecture Notes in Computer Science(), vol 11507. Springer, Cham. https://doi.org/10.1007/978-3-030-20518-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20517-1
Online ISBN: 978-3-030-20518-8