Training simple recurrent networks through gradient descent algorithms
Simple Recurrent Networks have been trained successfully in the literature with both exact and truncated gradient descent algorithms. This paper empirically compares the two learning methods by training an Elman architecture with first-order connections on a simple Language Understanding task.
Keywords: Language Understanding, Simple Recurrent Neural Networks, Training Algorithms, Gradient Descent
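As an illustration of the distinction the abstract draws (not taken from the paper itself), the following minimal sketch contrasts an exact and a truncated gradient on a hypothetical one-unit Elman network. The scalar weights `w_x`, `w_h`, `w_o`, the tanh hidden unit, the linear output, and the squared-error loss are all assumptions made for this example. The truncated variant treats the context (the previous hidden state) as a constant input, while the exact variant carries the sensitivity of the hidden state forward through time.

```python
import math

def forward(w_x, w_h, w_o, xs):
    """Run a one-unit Elman network; return hidden states (h_0 = 0 first) and outputs."""
    hs, ys = [0.0], []
    for x in xs:
        h = math.tanh(w_x * x + w_h * hs[-1])  # context = previous hidden state
        hs.append(h)
        ys.append(w_o * h)                      # linear output unit (assumed)
    return hs, ys

def loss(w_x, w_h, w_o, xs, targets):
    """Summed squared error over the whole sequence."""
    _, ys = forward(w_x, w_h, w_o, xs)
    return 0.5 * sum((y - d) ** 2 for y, d in zip(ys, targets))

def grad_w_h(w_x, w_h, w_o, xs, targets, exact):
    """Gradient of the loss with respect to the recurrent weight w_h.

    exact=True  : propagate the sensitivity p_t = dh_t/dw_h through time.
    exact=False : truncate it, treating h_{t-1} as a constant input.
    """
    hs, ys = forward(w_x, w_h, w_o, xs)
    p = 0.0  # sensitivity of the hidden state to w_h; zero before the first step
    g = 0.0
    for t, (y, d) in enumerate(zip(ys, targets), start=1):
        carried = w_h * p if exact else 0.0
        p = (1.0 - hs[t] ** 2) * (hs[t - 1] + carried)
        g += (y - d) * w_o * p
    return g

# Hypothetical weights and data, chosen only for illustration.
W = (0.5, 0.8, 1.2)
XS = [1.0, -0.5, 0.3, 0.9]
TARGETS = [0.2, -0.1, 0.4, 0.0]

g_exact = grad_w_h(*W, XS, TARGETS, exact=True)
g_trunc = grad_w_h(*W, XS, TARGETS, exact=False)

# Central finite difference as an independent check on the exact gradient.
eps = 1e-6
g_numeric = (loss(W[0], W[1] + eps, W[2], XS, TARGETS)
             - loss(W[0], W[1] - eps, W[2], XS, TARGETS)) / (2 * eps)
```

In this sketch the exact gradient agrees with the finite-difference estimate, whereas the truncated gradient is only an approximation once the sequence is long enough for earlier hidden states to influence later ones through `w_h`.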