Abstract
Recurrent neural networks are known to have difficulty remembering information over long time lags. To overcome this problem, we propose an extended recurrent neural network architecture that can deal with long time lags between relevant input signals. A register of latches at the input layer of the network bypasses irrelevant input information and propagates relevant inputs. The latches are implemented with differentiable multiplexers, so that derivatives can be propagated through the network. The relevance of input vectors is learned concurrently with the weights of the network using a gradient-based algorithm.
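To make the latch mechanism concrete, the following is a minimal sketch of a differentiable input latch realized as a soft multiplexer. It does not reproduce the paper's exact formulation: the convex-combination gate form, the parameter names (`gate_w`, `gate_b`), and the per-component gating are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def latch_step(x_t, latch_prev, gate_w, gate_b):
    """One step of a differentiable input latch (soft multiplexer).

    g near 1 latches the new input x_t (signal judged relevant);
    g near 0 holds the previous latch content (input bypassed).
    Because g is a smooth sigmoid, gradients flow to gate_w, gate_b.
    """
    g = sigmoid(gate_w @ x_t + gate_b)       # learned relevance gate
    return g * x_t + (1.0 - g) * latch_prev  # convex mix = soft mux

# Toy usage: a 3-dimensional input stream with per-component gates.
rng = np.random.default_rng(0)
gate_w = 0.1 * rng.normal(size=(3, 3))  # illustrative parameters
gate_b = np.zeros(3)
latch = np.zeros(3)
for x_t in rng.normal(size=(5, 3)):     # five time steps
    latch = latch_step(x_t, latch, gate_w, gate_b)
print("latched input fed to the recurrent layer:", latch)
```

In this convex-combination form the latch output is a differentiable function of both the gate parameters and the input history, which is what allows the relevance of inputs to be trained jointly with the network weights by gradient descent.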
Cite this article
Šter, B. Selective Recurrent Neural Network. Neural Process Lett 38, 1–15 (2013). https://doi.org/10.1007/s11063-012-9259-4