Error Entropy Minimization for LSTM Training
In this paper we present a new training algorithm for the Long Short-Term Memory (LSTM) recurrent neural network. This algorithm uses entropy instead of the usual mean squared error as the cost function for the weight update. More precisely, we use the Error Entropy Minimization (EEM) approach, where the entropy of the error is minimized after each symbol is presented to the network. Our experiments show that this approach makes the LSTM converge more frequently than the traditional learning algorithm. This in turn eases the burden of parameter tuning, since learning is achieved for a wider range of parameter values. The use of EEM also reduces, in some cases, the number of epochs needed for convergence.
Keywords: Output Layer · Learning Rate · Time Lapse · Recurrent Neural Network · Output Gate
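The abstract does not spell out how the error entropy is computed. As a rough illustration, the sketch below estimates Rényi's quadratic entropy of a batch of error samples with a Gaussian Parzen-window estimator, which is the usual formulation in the EEM literature; the function name, the kernel width sigma, and the sample errors are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def renyi_quadratic_entropy(errors, sigma=1.0):
    """Estimate Renyi's quadratic entropy H2 of the error samples
    using a Gaussian Parzen-window density estimate (illustrative
    sketch of the EEM cost, not the paper's exact implementation).

    H2(e) = -log V(e), where the 'information potential'
    V(e) = (1/N^2) * sum_i sum_j G_{sigma*sqrt(2)}(e_i - e_j).
    Minimizing H2 is equivalent to maximizing V.
    """
    e = np.asarray(errors, dtype=float).ravel()
    n = e.size
    diffs = e[:, None] - e[None, :]        # pairwise error differences
    s2 = 2.0 * sigma ** 2                  # variance of the convolved Gaussian kernel
    kernel = np.exp(-diffs ** 2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
    v = kernel.sum() / n ** 2              # information potential V(e)
    return -np.log(v)

# Example: entropy of the output errors after one symbol presentation
errors = np.array([0.10, -0.05, 0.02, 0.08])
print(renyi_quadratic_entropy(errors, sigma=0.5))
```

In an EEM-style weight update, the gradient of this entropy (equivalently, the ascent direction of the information potential V) would replace the gradient of the squared error in the network's usual backpropagation step.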