An EM Based Training Algorithm for Recurrent Neural Networks

  • Jan Unkelbach
  • Sun Yi
  • Jürgen Schmidhuber
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5768)


Recurrent neural networks serve as black-box models for nonlinear dynamical systems identification and time series prediction. Training of recurrent networks typically minimizes the quadratic difference of the network output and an observed time series. This implicitely assumes that the dynamics of the underlying system is deterministic, which is not a realistic assumption in many cases. In contrast, state-space models allow for noise in both the internal state transitions and the mapping from internal states to observations. Here, we consider recurrent networks as nonlinear state space models and suggest a training algorithm based on Expectation-Maximization. A nonlinear transfer function for the hidden neurons leads to an intractable inference problem. We investigate the use of a Particle Smoother to approximate the E-step and simultaneously estimate the expectations required in the M-step. The method is demonstrated for a sythetic data set and a time series prediction task arising in radiation therapy where it is the goal to predict the motion of a lung tumor during respiration.


Recurrent neural networks Dynamical System identification EM Particle Smoother 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Mandic, D.P., Chambers, J.A.: Recurrent neural networks for prediction. John Wiley & Sons, Inc., Chichester (2001)CrossRefGoogle Scholar
  2. 2.
    Bishop, C.M.: Pattern recognition and Machine learning. Springer, Heidelberg (2006)zbMATHGoogle Scholar
  3. 3.
    McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions. John Wiley & Sons, Inc., Chichester (1997)zbMATHGoogle Scholar
  4. 4.
    Klaas, M., Briers, M., Doucet, A., Maskell, S.: Fast particle smoothing: If i had a million particles. In: International Conference on Machine Learning (ICML), pp. 25–29 (2006)Google Scholar
  5. 5.
    Williams, R.J.: Training recurrent networks using the extended kalman filter. In: Proceedings International Joint Conference on Neural Networks, pp. 241–246 (1992),
  6. 6.
    Ghahramani, Z., Roweis, S.T.: Learning nonlinear dynamical systems using an EM algorithm. In: Advances in Neural Information Processing Systems 11, pp. 599–605. MIT Press, Cambridge (1999)Google Scholar
  7. 7.
    Williams, R.J., Zipser, D.: Gradient-based learning algorithms for recurrent networks and their computational complexity. In: Back-propagation: Theory, Architectures and Applications. Erlbaum, Hillsdale (1994)Google Scholar
  8. 8.
    Schmidhuber, J., Wierstra, D., Gagliolo, M., Gomez, F.: Training recurrent networks by evolino. Neural Computation 19(3), 757–779 (2007)CrossRefzbMATHGoogle Scholar
  9. 9.
    Jaeger, H.: The ”echo state” approach to analysing and training recurrent neural networks. Technical Report GMD Report 148, German National Research Center for Information Technology (2001)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Jan Unkelbach
    • 1
  • Sun Yi
    • 1
  • Jürgen Schmidhuber
    • 1
  1. 1.IDSIA, Galleria 2MannoSwitzerland

Personalised recommendations