
Automatic Model Selection in a Hybrid Perceptron/Radial Network

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2096)

Abstract

We introduce an algorithm for incrementally constructing a hybrid network of radial and perceptron hidden units. The algorithm determines whether a radial or a perceptron unit is required in a given region of input space. Given an error target, the algorithm also determines the number of hidden units. This results in a final architecture which is often much smaller than an RBF network or an MLP. A benchmark on four classification problems and three regression problems is given. The most striking performance improvement is achieved on the vowel data set [4].
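To make the construction loop described above concrete, here is a minimal sketch in Python of one way such an incremental hybrid network can be grown. It is an illustration under stated assumptions, not the authors' algorithm: at each step it fits a least-squares output layer, proposes one candidate Gaussian (radial) unit and one candidate tanh (perceptron) unit, keeps whichever correlates better with the current residual (a selection heuristic in the spirit of cascade-correlation [7]), and stops once the error target is met. The unit parameterizations, the candidate-generation rules, and the correlation criterion are all assumptions made for this example.

```python
import numpy as np

def gaussian_unit(X, center, width):
    # Radial unit: response decays with squared distance from the center.
    return np.exp(-np.sum((X - center) ** 2, axis=1) / (2.0 * width ** 2))

def perceptron_unit(X, w, b):
    # Projection (ridge) unit: response depends on one direction w.
    return np.tanh(X @ w + b)

def fit_output_weights(H, y):
    # Linear least-squares output layer on the hidden activations.
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)
    return coef

def build_hybrid(X, y, error_target=0.05, max_units=20, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    units = []                    # chosen hidden units, as callables
    H = np.ones((n, 1))           # activation matrix, starts with a bias column
    for _ in range(max_units):
        coef = fit_output_weights(H, y)
        residual = y - H @ coef
        if np.mean(residual ** 2) <= error_target:
            break                 # error target reached: stop adding units
        # Candidate radial unit, centered on the worst-fit training point;
        # the fixed width is an assumption for this sketch.
        c = X[np.argmax(np.abs(residual))]
        rbf = lambda Z, c=c: gaussian_unit(Z, c, width=1.0)
        # Candidate perceptron unit with a random ridge direction.
        w, b = rng.normal(size=d), rng.normal()
        mlp = lambda Z, w=w, b=b: perceptron_unit(Z, w, b)
        # Keep the candidate whose output correlates better with the residual
        # (a cascade-correlation-style criterion, not the paper's test).
        scores = [abs(np.corrcoef(u(X), residual)[0, 1]) for u in (rbf, mlp)]
        best = (rbf, mlp)[int(np.argmax(scores))]
        units.append(best)
        H = np.column_stack([H, best(X)])
    return units, fit_output_weights(H, y)

# Toy usage: a target mixing a local bump (radial-friendly) with a
# global linear trend (perceptron-friendly).
X = np.random.default_rng(1).uniform(-2.0, 2.0, size=(200, 2))
y = np.exp(-np.sum(X ** 2, axis=1)) + 0.5 * X[:, 0]
units, coef = build_hybrid(X, y)
print(f"selected {len(units)} hidden units")
```

The point of the sketch is the per-region choice: a target with both local and global structure tends to recruit units of both types, which is the kind of input-space dependence the abstract refers to.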

References

  1. L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. The Wadsworth Statistics/Probability Series, Belmont, CA, 1984.

  2. J. Buckheit and D. L. Donoho. Improved linear discrimination using time-frequency dictionaries. Technical Report, Stanford University, 1995.

  3. S. Cohen and N. Intrator. A hybrid projection based and radial basis function architecture. In J. Kittler and F. Roli, editors, Proc. Int. Workshop on Multiple Classifier Systems (LNCS 1857), pages 147–156, Sardinia, June 2000. Springer.

  4. D. H. Deterding. Speaker Normalisation for Automatic Speech Recognition. PhD thesis, University of Cambridge, 1989.

  5. D. L. Donoho and I. M. Johnstone. Projection-based approximation and a duality with kernel methods. Annals of Statistics, 17:58–106, 1989.

  6. H. Drucker, R. Schapire, and P. Simard. Improving performance in neural networks using a boosting algorithm. In S. J. Hanson, J. D. Cowan, and C. L. Giles, editors, Advances in Neural Information Processing Systems, volume 5, pages 42–49. Morgan Kaufmann, 1993.

  7. S. E. Fahlman and C. Lebiere. The cascade-correlation learning architecture. Technical Report CMU-CS-90-100, Carnegie Mellon University, 1990.

  8. G. W. Flake. Square unit augmented, radially extended, multilayer perceptrons. In G. B. Orr and K. Müller, editors, Neural Networks: Tricks of the Trade, pages 145–163. Springer, 1998.

  9. J. H. Friedman. Multivariate adaptive regression splines. The Annals of Statistics, 19:1–141, 1991.

  10. J. H. Friedman and W. Stuetzle. Projection pursuit regression. Journal of the American Statistical Association, 76:817–823, 1981.

  11. T. Hastie and R. Tibshirani. Generalized additive models. Statistical Science, 1:297–318, 1986.

  12. T. Hastie and R. Tibshirani. Generalized Additive Models. Chapman and Hall, London, 1990.

  13. T. Hastie, R. Tibshirani, and A. Buja. Flexible discriminant analysis by optimal scoring. Journal of the American Statistical Association, 89:1255–1270, 1994.

  14. R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton. Adaptive mixtures of local experts. Neural Computation, 3(1):79–87, 1991.

  15. M. I. Jordan and R. A. Jacobs. Hierarchies of adaptive experts. In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems, volume 4, pages 985–992. Morgan Kaufmann, San Mateo, CA, 1992.

  16. R. E. Kass and A. E. Raftery. Bayes factors. Journal of the American Statistical Association, 90:773–795, 1995.

  17. Y. C. Lee, G. Doolen, H. H. Chen, G. Z. Sun, T. Maxwell, H. Y. Lee, and C. L. Giles. Machine learning using higher order correlation networks. Physica D, 22:276–306, 1986.

  18. D. J. C. MacKay. Bayesian interpolation. Neural Computation, 4(3):415–447, 1992.

  19. J. Moody. Prediction risk and architecture selection for neural networks. In V. Cherkassky, J. H. Friedman, and H. Wechsler, editors, From Statistics to Neural Networks: Theory and Pattern Recognition Applications. Springer, NATO ASI Series F, 1994.

  20. S. J. Nowlan. Soft competitive adaptation: Neural network learning algorithms based on fitting statistical mixtures. PhD thesis, Carnegie Mellon University, 1991.

  21. M. J. Orr, J. Hallman, K. Takezawa, A. Murray, S. Ninomiya, M. Oide, and T. Leonard. Combining regression trees and radial basis functions. Division of Informatics, Edinburgh University, 1999. Submitted to IJNS.

  22. R. P. Gorman and T. J. Sejnowski. Analysis of hidden units in a layered network trained to classify sonar targets. Neural Networks, 1:75–89, 1988.

  23. A. J. Robinson. Dynamic Error Propagation Networks. PhD thesis, University of Cambridge, 1989.

  24. D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning internal representations by error propagation. In D. E. Rumelhart and J. L. McClelland, editors, Parallel Distributed Processing, volume 1, pages 318–362. MIT Press, Cambridge, MA, 1986.

  25. C. J. Stone. The dimensionality reduction principle for generalized additive models. The Annals of Statistics, 14:590–606, 1986.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cohen, S., Intrator, N. (2001). Automatic Model Selection in a Hybrid Perceptron/Radial Network. In: Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2001. Lecture Notes in Computer Science, vol 2096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48219-9_44

  • DOI: https://doi.org/10.1007/3-540-48219-9_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42284-6

  • Online ISBN: 978-3-540-48219-2

  • eBook Packages: Springer Book Archive
