Experimental Analysis of Performance of Temporal Supervised Learning Algorithm, Applied to a Long and Complex Sequence
In the present paper, we evaluate the performance of the temporal supervised learning algorithm (TSLA), developed by R. J. Williams and D. Zipser, and propose computational methods that accelerate and stabilise the learning process of a recurrent neural network trained with TSLA when it is applied to long and complex sequences.
TSLA represents consecutive events in the network architecture itself, which enables the network to deal with long and complex time-varying phenomena without increasing the number of units. However, TSLA becomes extremely unstable when applied to such phenomena, and it tends to take a long time to converge. It is therefore essential to evaluate the performance of TSLA and to develop computational methods that remove the instability and considerably reduce the learning time. We attempt to remove the instability by using a variable learning rate, i.e. a learning rate that changes according to the progress of learning. Minkowski-r power metrics are used to shorten the learning time.
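The paper does not spell out the exact variable-learning-rate rule in this abstract, but a common heuristic of the kind described, growing the rate while the error keeps falling and shrinking it when the error rises, can be sketched as follows (all parameter names and values here are illustrative assumptions, not the authors' settings):

```python
def adapt_learning_rate(rate, prev_error, curr_error,
                        grow=1.05, shrink=0.5,
                        min_rate=1e-6, max_rate=1.0):
    """Return an updated learning rate based on the progress of learning.

    Illustrative heuristic only: the factors `grow` and `shrink` and the
    rate bounds are assumed values, not taken from the paper.
    """
    if curr_error < prev_error:
        # Learning is progressing: cautiously increase the rate.
        rate = min(rate * grow, max_rate)
    else:
        # Error went up (a symptom of instability): back off sharply.
        rate = max(rate * shrink, min_rate)
    return rate
```

A training loop would call this once per epoch, passing the error from the previous and current epochs, so the rate tracks the progress of learning automatically.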
The experiments confirmed that the instability was removed by the variable learning rate together with some other minor adjustments, and that the network with the Minkowski-r power metric learned a relatively long sequence (English sentences) more than three times faster than the network with the ordinary error function.
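The Minkowski-r power metric, following Hanson and Burr, generalises the ordinary sum-of-squares error to E = (1/r) Σ_k |y_k − d_k|^r, so that r = 2 recovers the usual quadratic error and other values of r reshape the gradients fed back through the network. A minimal sketch of the error and its gradient (the function names are ours, and this is a plain-Python illustration, not the paper's implementation):

```python
def minkowski_r_error(outputs, targets, r=2.0):
    """Minkowski-r error: (1/r) * sum(|y_k - d_k|^r); r = 2 is the quadratic error."""
    return sum(abs(y - d) ** r for y, d in zip(outputs, targets)) / r

def minkowski_r_gradient(outputs, targets, r=2.0):
    """Gradient dE/dy_k = |y_k - d_k|^(r-1) * sign(y_k - d_k)."""
    grads = []
    for y, d in zip(outputs, targets):
        diff = y - d
        sign = 1.0 if diff > 0 else (-1.0 if diff < 0 else 0.0)
        grads.append(abs(diff) ** (r - 1) * sign)
    return grads
```

With r = 2 the gradient reduces to the familiar (y − d) term of ordinary back-propagation, which is why the metric can be dropped into an existing learning rule by changing only the output-error computation.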
Keywords: Learning Rate, Recurrent Neural Network, Output Unit, Complex Sequence, Learning Time
- S. E. Fahlman, "Faster-learning variations on back-propagation: an empirical study," in Proceedings of the 1988 Connectionist Models Summer School, Carnegie Mellon University, pp. 38–51, 1988.
- S. J. Hanson and D. J. Burr, "Minkowski-r back-propagation: learning in connectionist models with non-Euclidian signals," in Neural Information Processing Systems, D. Z. Anderson, Ed. New York: American Institute of Physics, pp. 348–357, 1989.
- D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing, Vol. 1, D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Eds. Cambridge, Massachusetts: The MIT Press, pp. 318–362, 1986.