Training simple recurrent networks through gradient descent algorithms

  • M. A. Castaño
  • F. Casacuberta
  • A. Bonet
Plasticity Phenomena (Maturing, Learning and Memory)
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1240)


In the literature, Simple Recurrent Networks have been successfully trained with both Exact and Truncated Gradient Descent algorithms. This paper empirically compares these two learning methods by training an Elman architecture with first-order connections on a simple Language Understanding task.
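The abstract contrasts training with the exact gradient against the truncated gradient. The sketch below, which is illustrative and not taken from the paper (task, layer sizes, and learning rate are arbitrary choices), shows a first-order Elman network trained with the truncated gradient: the context units `h_prev` are treated as fixed inputs, so the error is backpropagated through one time step only, whereas the exact gradient (RTRL or full BPTT) would also differentiate through the recurrence.

```python
import numpy as np

# Minimal Elman simple recurrent network (first-order connections), trained
# with the *truncated* gradient: the previous hidden state is treated as a
# constant input, so no error flows back through the recurrent connection.
# All sizes and hyperparameters are illustrative assumptions.

rng = np.random.default_rng(0)

n_in, n_hid, n_out = 4, 8, 4
Wx = rng.normal(0, 0.3, (n_hid, n_in))   # input -> hidden
Wh = rng.normal(0, 0.3, (n_hid, n_hid))  # context -> hidden (first-order)
bh = np.zeros(n_hid)
Wo = rng.normal(0, 0.3, (n_out, n_hid))  # hidden -> output
bo = np.zeros(n_out)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy task: predict the next symbol of the repeating sequence 0,1,2,3,0,...
seq = np.tile(np.arange(4), 25)
X = np.eye(n_in)[seq[:-1]]   # one-hot inputs
T = seq[1:]                  # next-symbol targets

lr = 0.1
for epoch in range(50):
    h_prev = np.zeros(n_hid)
    total = 0.0
    for x, t in zip(X, T):
        h = np.tanh(Wx @ x + Wh @ h_prev + bh)
        y = softmax(Wo @ h + bo)
        total += -np.log(y[t] + 1e-12)        # cross-entropy loss
        # Truncated gradient: backprop one step; h_prev held constant.
        dy = y.copy(); dy[t] -= 1.0           # dL/dz for softmax + NLL
        dh = (Wo.T @ dy) * (1.0 - h * h)      # through tanh only
        Wo -= lr * np.outer(dy, h); bo -= lr * dy
        Wx -= lr * np.outer(dh, x)
        Wh -= lr * np.outer(dh, h_prev)
        bh -= lr * dh
        h_prev = h                            # context for the next step
    if epoch == 0:
        first = total
print(f"loss: {first:.2f} -> {total:.2f}")
```

Even with the truncation, this deterministic cyclic task is learnable, since one step of context suffices; the paper's comparison concerns how the two gradient variants behave on a harder understanding task.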


Keywords: Language Understanding · Simple Recurrent Neural Networks · Training Algorithms · Gradient Descent




  1. [Alquézar,95]
    R. Alquézar, A. Sanfeliú. An Algebraic Framework to Represent Finite-State Machines in Single-Layer Recurrent Neural Networks. Neural Computation, vol. 7, no. 5, pp. 931–949, 1995.
  2. [Castaño,93]
    M.A. Castaño, E. Vidal, F. Casacuberta. Learning Direct Acoustic-to-Semantic Mapping through Simple Recurrent Networks. Procs. EUROSPEECH-93, vol. 2, pp. 1017–1020, 1993.
  3. [Castaño,95]
    M.A. Castaño, E. Vidal, F. Casacuberta. Preliminary Experiments for Automatic Speech Understanding through Simple Recurrent Networks. Procs. EUROSPEECH-95, vol. 3, pp. 1673–1676, 1995.
  4. [Cleeremans,89]
    A. Cleeremans, D. Servan-Schreiber, J.L. McClelland. Finite State Automata and Simple Recurrent Networks. Neural Computation, vol. 1, pp. 372–381, 1989.
  5. [Elman,90]
    J.L. Elman. Finding Structure in Time. Cognitive Science, vol. 14, no. 2, pp. 179–211, 1990.
  6. [Giles,92]
    C.L. Giles, C.B. Miller, D. Chen, H.H. Chen, G.Z. Sun, Y.C. Lee. Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks. Neural Computation, vol. 4, pp. 393–405, 1992.
  7. [Giles,95]
    C.L. Giles, D. Chen, G.Z. Sun, H.H. Chen, Y.C. Lee, M.W. Goudreau. Constructive Learning of Recurrent Neural Networks: Limitations of Recurrent Cascade Correlation and a Simple Solution. IEEE Trans. on Neural Networks, vol. 6, pp. 829–836, 1995.
  8. [Jordan,88]
    M.I. Jordan. Serial Order: A Parallel Distributed Processing Approach. Technical Report no. 8604, Institute of Cognitive Science, University of California, San Diego, 1988.
  9. [Rumelhart,86]
    D.E. Rumelhart, G. Hinton, R. Williams. Learning Sequential Structure in Simple Recurrent Networks. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, D.E. Rumelhart, J.L. McClelland and the PDP Research Group (Eds.). MIT Press, Cambridge, 1986.
  10. [Sopena,94]
    J.M. Sopena, R. Alquézar. Improvement of Learning in Recurrent Networks by Substituting the Sigmoid Activation Function. Procs. of the International Conference on Artificial Neural Networks, Springer-Verlag, vol. 1, pp. 417–420, 1994.
  11. [Williams,89]
    R.J. Williams, D. Zipser. Experimental Analysis of the Real-Time Recurrent Learning Algorithm. Connection Science, vol. 1, no. 1, pp. 87–111, 1989.

Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • M. A. Castaño (1)
  • F. Casacuberta (2)
  • A. Bonet (2)
  1. Dpto. de Informática, Universitat Jaume I de Castellón, Spain
  2. Dpto. Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Spain
