Dynamical Recurrent Networks for Sequential Data Processing

  • Stefan C. Kremer
  • John F. Kolen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1778)


All symbol-processing tasks can be viewed as instances of symbol-to-symbol transduction (SST). SST generalizes many familiar symbolic problem classes, including language identification and sequence generation. One method of performing SST is via dynamical recurrent networks employed as symbol-to-symbol transducers. We construct these transducers by adding symbol-to-vector preprocessing and vector-to-symbol postprocessing to the vector-to-vector mapping provided by neural networks. This chapter surveys the capabilities and limitations of these mechanisms from both top-down (task-dependent) and bottom-up (implementation-dependent) perspectives.
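The three-stage construction described in the abstract can be illustrated with a minimal sketch: a one-hot encoder (symbol-to-vector), an Elman-style recurrent core (vector-to-vector), and an argmax decoder (vector-to-symbol). The alphabet, weight initialization, and class below are illustrative assumptions, not the authors' implementation; the network is untrained, so the sketch shows only the transducer structure.

```python
import numpy as np

# Hypothetical three-symbol alphabet for illustration.
ALPHABET = ["a", "b", "c"]

def encode(symbol):
    # Symbol-to-vector preprocessing: map a symbol to a one-hot vector.
    v = np.zeros(len(ALPHABET))
    v[ALPHABET.index(symbol)] = 1.0
    return v

def decode(vector):
    # Vector-to-symbol postprocessing: pick the symbol with the largest output.
    return ALPHABET[int(np.argmax(vector))]

class SimpleRecurrentTransducer:
    """Elman-style recurrent network used as a symbol-to-symbol transducer."""

    def __init__(self, n_symbols, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(scale=0.5, size=(n_hidden, n_symbols))
        self.W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
        self.W_out = rng.normal(scale=0.5, size=(n_symbols, n_hidden))
        self.h = np.zeros(n_hidden)

    def step(self, x):
        # Vector-to-vector mapping with state feedback (the recurrent core).
        self.h = np.tanh(self.W_in @ x + self.W_rec @ self.h)
        return self.W_out @ self.h

    def transduce(self, symbols):
        # Reset the state, then emit one output symbol per input symbol.
        self.h = np.zeros_like(self.h)
        return [decode(self.step(encode(s))) for s in symbols]

net = SimpleRecurrentTransducer(len(ALPHABET), n_hidden=4)
out = net.transduce(list("abcab"))  # one output symbol per input symbol
```

Because the recurrent core carries state across steps, the output at each position can depend on the entire input prefix, which is what distinguishes this transducer from a memoryless symbol mapping.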


Keywords: Turing machine, recurrent neural network, hidden unit, feedforward network, recurrent network





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  1. Stefan C. Kremer, Guelph Natural Computation Group, Dept. of Computing and Information Science, University of Guelph, Guelph, Canada
  2. John F. Kolen, Dept. of Computer Science & Institute for Human and Machine Cognition, University of West Florida, Pensacola, USA
