
Generalization bounds for function approximation from scattered noisy data

Published in: Advances in Computational Mathematics

Abstract

We consider the problem of approximating functions from scattered data using linear superpositions of non-linearly parameterized functions. We show how the total error (generalization error) can be decomposed into two parts: an approximation part, due to the finite number of parameters of the approximation scheme used, and an estimation part, due to the finite number of data available. We bound each of these two parts under certain assumptions and prove a general bound for a class of approximation schemes that includes radial basis functions and multilayer perceptrons.
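The decomposition described in the abstract can be probed numerically. The following sketch (an illustration added here, not taken from the paper) uses ordinary polynomial least squares as a stand-in for the non-linearly parameterized schemes the paper treats: a fit on a dense noiseless grid proxies the "infinite data" limit, so its error reflects only the finite number of parameters (the approximation part), while a fit on a small noisy sample adds the estimation part. All function names, the target function, and the parameter values are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # hypothetical target function to be recovered from scattered data
    return np.sin(2 * np.pi * x)

def fit_poly(x, y, degree):
    # least-squares polynomial fit: a simple stand-in for a generic
    # n-parameter approximation scheme (RBF network, perceptron, ...)
    return np.polynomial.polynomial.Polynomial.fit(x, y, degree)

degree = 5       # n: number of free parameters of the scheme
n_samples = 30   # l: number of scattered noisy data points
noise = 0.2      # standard deviation of the observation noise

grid = np.linspace(0.0, 1.0, 2000)  # dense grid for evaluating errors

# "infinite data" fit: dense noiseless samples -> approximation error,
# the error caused only by the finite number of parameters
approx_fit = fit_poly(grid, target(grid), degree)
approx_err = np.mean((approx_fit(grid) - target(grid)) ** 2)

# finite noisy sample fit -> total (generalization) error
xs = rng.uniform(0.0, 1.0, n_samples)
ys = target(xs) + noise * rng.normal(size=n_samples)
sample_fit = fit_poly(xs, ys, degree)
total_err = np.mean((sample_fit(grid) - target(grid)) ** 2)

# the remainder is (an empirical proxy for) the estimation part,
# the extra error caused by having only finitely many noisy data
estimation_err = total_err - approx_err

print(f"approximation error ~ {approx_err:.5f}")
print(f"total error         ~ {total_err:.5f}")
print(f"estimation part     ~ {estimation_err:.5f}")
```

Increasing `degree` shrinks the approximation part while (for fixed `n_samples`) inflating the estimation part, which is the trade-off the paper's bounds quantify.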




Cite this article

Niyogi, P., Girosi, F. Generalization bounds for function approximation from scattered noisy data. Advances in Computational Mathematics 10, 51–80 (1999). https://doi.org/10.1023/A:1018966213079
