A Regularized Minimum Cross-Entropy Algorithm on Mixtures of Experts for Time Series Prediction

  • Zhiwu Lu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3972)


The well-known mixture of experts (ME) model is usually trained with the expectation-maximization (EM) algorithm for maximum-likelihood learning. However, the number of experts must be fixed in advance, and it is rarely known beforehand. Derived from regularization theory, a regularized minimum cross-entropy (RMCE) algorithm is proposed to train the ME model; it performs model selection automatically during parameter learning. When time series are modeled by ME, climate prediction experiments demonstrate that the RMCE algorithm outperforms the EM algorithm. We also compare the RMCE algorithm with other regression methods, such as the back-propagation (BP) algorithm and the normalized radial basis function (NRBF) network, and find that it still shows promising results.
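The automatic model selection described in the abstract can be illustrated with a small sketch. The code below is not the paper's exact RMCE update; it fits a mixture of linear experts by EM and adds a regularization step that annihilates experts whose effective sample count falls below a threshold `gamma`, so redundant experts are pruned away during training. All function names, the data, and the pruning rule are this sketch's assumptions.

```python
import numpy as np

def fit_me_regularized(X, y, K=4, gamma=5.0, iters=200, seed=0):
    """Mixture of K linear experts trained by EM with a regularized
    mixing-weight update that can prune redundant experts (a sketch in
    the spirit of RMCE-style model selection, not the paper's update)."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    w = rng.normal(scale=0.5, size=K)        # expert slopes
    b = rng.normal(scale=0.5, size=K)        # expert intercepts
    alpha = np.full(K, 1.0 / K)              # mixing weights
    sigma2 = float(np.var(y)) + 1e-6         # shared noise variance
    loglik = -np.inf
    for _ in range(iters):
        # E-step: posterior responsibility of each expert for each point
        mu = X[:, None] * w + b              # (N, K) expert predictions
        log_p = (-0.5 * (y[:, None] - mu) ** 2 / sigma2
                 - 0.5 * np.log(2 * np.pi * sigma2) + np.log(alpha))
        m = log_p.max(axis=1, keepdims=True)
        p = np.exp(log_p - m)                # stabilized exponentials
        denom = p.sum(axis=1, keepdims=True)
        r = p / denom                        # responsibilities (N, K)
        loglik = float((m.squeeze() + np.log(denom.squeeze())).sum())
        # M-step: weighted least squares for each expert's line
        nk = r.sum(axis=0)                   # effective sample counts
        for j in range(K):
            rw = r[:, j]
            sw = rw.sum() + 1e-12
            xm = (rw * X).sum() / sw
            ym = (rw * y).sum() / sw
            cov = (rw * (X - xm) * (y - ym)).sum()
            var = (rw * (X - xm) ** 2).sum() + 1e-12
            w[j] = cov / var
            b[j] = ym - w[j] * xm
        mu = X[:, None] * w + b
        sigma2 = float((r * (y[:, None] - mu) ** 2).sum() / N) + 1e-6
        # Regularized update: experts whose effective count drops below
        # gamma get zero weight and are removed from the mixture.
        alpha = np.maximum(nk - gamma, 0.0)
        keep = alpha > 0
        if keep.sum() < K:
            w, b, alpha = w[keep], b[keep], alpha[keep]
            K = int(keep.sum())
        alpha = alpha / alpha.sum()
    return w, b, alpha, sigma2, loglik
```

Starting with more experts than needed (here `K=4`) and letting the penalized update drive superfluous mixing weights to zero is what removes the need to fix the number of experts in advance; plain EM, by contrast, keeps all `K` experts alive.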


Keywords: Expectation Maximization Algorithm · Normalized Mean Square Error · Time Series Prediction · Expert Network



References

  1. Lu, Z., Cheng, Q., Ma, J.: A Gradient BYY Harmony Learning Algorithm on Mixture of Experts for Curve Detection. In: Gallagher, M., Hogan, J.P., Maire, F. (eds.) IDEAL 2005. LNCS, vol. 3578, pp. 250–257. Springer, Heidelberg (2005)
  2. Weigend, A.S., Mangeas, M., Srivastava, A.N.: Nonlinear Gated Experts for Time Series: Discovering Regimes and Avoiding Overfitting. Int. J. Neural Systems 6, 373–399 (1995)
  3. Jordan, M.I., Xu, L.: Convergence Results for the EM Approach to Mixtures-of-Experts Architectures. Neural Networks 8, 1409–1431 (1995)
  4. Pal, N.R.: On Minimum Cross-Entropy Thresholding. Pattern Recognition 29, 575–580 (1996)
  5. Vapnik, V.N.: An Overview of Statistical Learning Theory. IEEE Trans. Neural Networks 10, 988–999 (1999)
  6. Lee, T.L.: Back-Propagation Neural Network for Long-Term Tidal Predictions. Ocean Engineering 31, 225–238 (2004)
  7. Moody, J., Darken, C.J.: Fast Learning in Networks of Locally-Tuned Processing Units. Neural Comput. 1, 281–294 (1989)
  8. Xu, L.: BYY Harmony Learning, Structural RPCL, and Topological Self-Organizing on Mixture Models. Neural Networks 15, 1231–1237 (2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Zhiwu Lu¹

  1. Institute of Computer Science & Technology, Peking University, Beijing, China
