Bounds for the Average Generalization Error of the Mixture of Experts Neural Network

  • Luís A. Alexandre
  • Aurélio Campilho
  • Mohamed Kamel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3138)

Abstract

In this paper we derive an upper bound for the average-case generalization error of the mixture of experts modular neural network, based on an average-case generalization error bound for an isolated neural network. In doing so, we also generalize a previous bound for this architecture that was restricted to special problems.

We also present an empirically obtained correction factor for the original average generalization error, which yields more accurate error bounds for the six data sets used in the experiments. These experiments illustrate the validity of the derived error bound for the mixture of experts modular neural network and show how it can be used in practice.
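To make the architecture concrete, the following is a minimal sketch of the standard mixture-of-experts combination rule: a softmax gating network produces mixing coefficients that weight the experts' outputs. The linear experts, dimensions, and random parameters are illustrative assumptions only; they are not the networks, data sets, or the generalization error bound studied in the paper.

    import numpy as np

    # Illustrative mixture-of-experts forward pass: a softmax gating network
    # produces mixing coefficients that weight the experts' outputs.
    # Linear experts and random parameters are assumptions made for brevity.

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def moe_output(x, expert_weights, gate_weights):
        """Gated combination of expert outputs for a single input x."""
        expert_outputs = np.array([W @ x for W in expert_weights])  # (n_experts, out_dim)
        gate = softmax(gate_weights @ x)                            # (n_experts,)
        return gate @ expert_outputs                                # gate-weighted sum

    # Toy usage: two linear experts on a 3-dimensional input.
    rng = np.random.default_rng(0)
    x = rng.normal(size=3)
    experts = [rng.normal(size=(1, 3)) for _ in range(2)]
    gates = rng.normal(size=(2, 3))
    print(moe_output(x, experts, gates))

In the modular network considered in the paper, the experts and the gating network are themselves neural networks trained jointly; the sketch only shows the combination rule of the architecture to which the derived bound applies.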

Keywords

modular neural networks · mixture of experts · generalization error bounds

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Luís A. Alexandre (1)
  • Aurélio Campilho (2)
  • Mohamed Kamel (3)
  1. Networks and Multimedia Group, IT, Covilhã, Portugal
  2. INEB – Instituto de Engenharia Biomédica, Portugal
  3. Dept. Systems Design Engineering, Univ. Waterloo, Canada
