An EM Based Training Algorithm for Recurrent Neural Networks
Recurrent neural networks serve as black-box models for nonlinear dynamical system identification and time series prediction. Training a recurrent network typically minimizes the quadratic difference between the network output and an observed time series. This implicitly assumes that the dynamics of the underlying system are deterministic, which is not a realistic assumption in many cases. In contrast, state-space models allow for noise in both the internal state transitions and the mapping from internal states to observations. Here, we consider recurrent networks as nonlinear state-space models and suggest a training algorithm based on Expectation-Maximization. A nonlinear transfer function for the hidden neurons leads to an intractable inference problem. We investigate the use of a Particle Smoother to approximate the E-step and simultaneously estimate the expectations required in the M-step. The method is demonstrated on a synthetic data set and on a time series prediction task arising in radiation therapy, where the goal is to predict the motion of a lung tumor during respiration.
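To make the state-space view concrete, the following is a minimal sketch (not the authors' implementation) of a recurrent network treated as a nonlinear state-space model, with a bootstrap particle filter approximating the intractable inference over hidden states. All dimensions, parameters (`W`, `C`, noise levels `q`, `r`), and the number of particles are illustrative assumptions; a full Particle Smoother and the EM M-step updates are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model sizes and parameters (illustrative assumptions only).
n_x, n_y, n_p = 3, 1, 500                    # state dim, obs dim, particles
W = rng.normal(scale=0.5, size=(n_x, n_x))   # recurrent (transition) weights
C = rng.normal(scale=0.5, size=(n_y, n_x))   # observation matrix
q, r = 0.1, 0.1                              # process / observation noise std

def simulate(T):
    """Generate a synthetic trajectory from the noisy state-space RNN."""
    x = np.zeros(n_x)
    ys = []
    for _ in range(T):
        x = np.tanh(W @ x) + q * rng.normal(size=n_x)   # noisy state transition
        ys.append(C @ x + r * rng.normal(size=n_y))     # noisy observation
    return np.array(ys)

def particle_filter(ys):
    """Bootstrap particle filter: approximates E[x_t | y_1..y_t]."""
    particles = np.zeros((n_p, n_x))
    means = []
    for y in ys:
        # Propagate particles through the nonlinear (tanh) transition.
        particles = np.tanh(particles @ W.T) + q * rng.normal(size=(n_p, n_x))
        # Weight particles by the Gaussian observation likelihood.
        resid = y - particles @ C.T
        logw = -0.5 * np.sum(resid**2, axis=1) / r**2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(w @ particles)
        # Multinomial resampling to avoid weight degeneracy.
        idx = rng.choice(n_p, size=n_p, p=w)
        particles = particles[idx]
    return np.array(means)

ys = simulate(50)
xhat = particle_filter(ys)
print(xhat.shape)   # (50, 3): filtered state mean at each time step
```

In an EM iteration, the (smoothed) particle approximation would supply the expectations of the hidden states needed to update `W` and `C` in the M-step.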
Keywords: Recurrent neural networks · Dynamical system identification · EM · Particle Smoother