
On universal estimators in learning theory

Proceedings of the Steklov Institute of Mathematics

Abstract

This paper addresses the problem of constructing and analyzing estimators for the regression problem in supervised learning. Recently, there has been great interest in studying universal estimators. The term “universal” means that, on the one hand, the estimator does not depend on the a priori assumption that the regression function f_ρ belongs to some class F from a collection 𝓕 of classes and, on the other hand, the estimation error for f_ρ is close to the optimal error for the class F. This paper illustrates how the general technique of constructing universal estimators, developed in the author’s previous paper, can be applied in concrete situations. The setting of the problem studied here was motivated by a recent paper by Smale and Zhou. Our starting point is a kernel K(x, u) defined on X × Ω. Using this kernel, we build an estimator that is universal for classes defined in terms of nonlinear approximation with respect to the system {K(·, u)}_{u∈Ω}. To obtain an easily implementable estimator, we apply the relaxed greedy algorithm.
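The last two sentences describe an algorithmic construction: greedily combine dictionary elements K(·, u) to fit the sample. The sketch below is a minimal numerical illustration of a relaxed greedy iteration over such a kernel dictionary, not the estimator analyzed in the paper; the Gaussian kernel, the bound M, the step size 1/m, and the finite candidate grid for u are all illustrative assumptions.

```python
import numpy as np

# Illustrative kernel choice: a Gaussian kernel K(x, u); the paper only
# assumes a general kernel K(x, u) defined on X x Omega.
def K(x, u, width=1.0):
    return np.exp(-((x - u) ** 2) / (2.0 * width ** 2))

def relaxed_greedy_estimator(x, y, candidates, n_iter=50, M=2.0):
    """Relaxed greedy iteration (sketch) over the dictionary {K(., u)}.

    At step m the current estimate f_{m-1} is mixed with a scaled
    dictionary element:  f_m = (1 - 1/m) f_{m-1} + (1/m) * s * K(., u),
    where the scale s = +/-M and the parameter u are chosen to minimize
    the empirical squared loss on the sample (x_i, y_i).
    """
    f_vals = np.zeros_like(y, dtype=float)   # f_0 = 0 evaluated at the data
    terms = []                               # chosen (coefficient, u) pairs
    for m in range(1, n_iter + 1):
        best = None
        for u in candidates:
            g_vals = K(x, u)
            for s in (+M, -M):
                trial = (1.0 - 1.0 / m) * f_vals + (1.0 / m) * s * g_vals
                err = np.mean((y - trial) ** 2)
                if best is None or err < best[0]:
                    best = (err, s, u, trial)
        _, s, u, f_vals = best
        # earlier coefficients shrink by the relaxation factor (1 - 1/m)
        terms = [(c * (1.0 - 1.0 / m), uu) for c, uu in terms]
        terms.append((s / m, u))
    return terms  # estimator: f(x) = sum_j c_j * K(x, u_j)

# Usage on synthetic 1-D data
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = np.sin(x) + 0.1 * rng.standard_normal(200)
model = relaxed_greedy_estimator(x, y, candidates=np.linspace(-3, 3, 61))
```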


References

  1. A. R. Barron, “Universal Approximation Bounds for Superposition of n Sigmoidal Functions,” IEEE Trans. Inf. Theory 39(3), 930–945 (1993).

  2. P. Binev, A. Cohen, W. Dahmen, R. DeVore, and V. Temlyakov, “Universal Algorithms for Learning Theory. Part I: Piecewise Constant Functions,” J. Mach. Learn. Res. 6, 1297–1321 (2005).

  3. B. Carl, “Entropy Numbers, s-Numbers, and Eigenvalue Problems,” J. Funct. Anal. 41, 290–306 (1981).

  4. A. Barron, A. Cohen, W. Dahmen, and R. DeVore, “Approximation and Learning by Greedy Algorithms,” Manuscript (2005), http://www.ann.jussieu.fr/:_cohen/greedy.pdf.gz

  5. F. Cucker and S. Smale, “On the Mathematical Foundations of Learning,” Bull. Am. Math. Soc. 39, 1–49 (2002).

  6. R. DeVore, G. Kerkyacharian, D. Picard, and V. Temlyakov, “On Mathematical Methods of Learning,” Ind. Math. Inst. Res. Rep. No. 10 (Univ. South Carolina, Columbia, 2004).

  7. R. DeVore, G. Kerkyacharian, D. Picard, and V. Temlyakov, “Mathematical Methods for Supervised Learning,” Ind. Math. Inst. Res. Rep. No. 22 (Univ. South Carolina, Columbia, 2004); “Approximation Methods for Supervised Learning,” Found. Comput. Math. 6, 3–58 (2006).

  8. L. Györfi, M. Kohler, A. Krzyzak, and H. Walk, A Distribution-Free Theory of Nonparametric Regression (Springer, Berlin, 2002).

  9. R. A. DeVore and V. N. Temlyakov, “Some Remarks on Greedy Algorithms,” Adv. Comput. Math. 5, 173–187 (1996).

  10. P. J. Huber, “Projection Pursuit,” Ann. Stat. 13, 435–475 (1985).

  11. L. Jones, “On a Conjecture of Huber Concerning the Convergence of Projection Pursuit Regression,” Ann. Stat. 15, 880–882 (1987).

  12. G. Kerkyacharian and D. Picard, “Thresholding in Learning Theory,” math.ST/0510271.

  13. S. V. Konyagin and V. N. Temlyakov, “The Entropy in the Learning Theory. Error Estimates,” Ind. Math. Inst. Res. Rep. No. 09 (Univ. South Carolina, Columbia, 2004); Constr. Approx. 25, 1–27 (2007).

  14. W. S. Lee, P. L. Bartlett, and R. C. Williamson, “Efficient Agnostic Learning of Neural Networks with Bounded Fan-in,” IEEE Trans. Inf. Theory 42(6), 2118–2132 (1996).

  15. W. S. Lee, P. L. Bartlett, and R. C. Williamson, “The Importance of Convexity in Learning with Squared Loss,” IEEE Trans. Inf. Theory 44(5), 1974–1980 (1998).

  16. E. Schmidt, “Zur Theorie der linearen und nichtlinearen Integralgleichungen. I,” Math. Ann. 63, 433–476 (1906–1907).

  17. S. Smale and D.-X. Zhou, “Learning Theory Estimates via Integral Operators and Their Approximations,” Manuscript (2005), http://www.tti-c.org/smale_papers/sampIII5412.pdf

  18. V. N. Temlyakov, “Optimal Estimators in Learning Theory,” Ind. Math. Inst. Res. Rep. No. 23 (Univ. South Carolina, Columbia, 2004); in Approximation and Probability (Inst. Math. Pol. Acad. Sci., Warsaw, 2006), Banach Center Publ. 72, pp. 341–366.

  19. V. N. Temlyakov, “Approximation in Learning Theory,” Ind. Math. Inst. Res. Rep. No. 05 (Univ. South Carolina, Columbia, 2005).

  20. V. N. Temlyakov, “Nonlinear Methods of Approximation,” Found. Comput. Math. 3, 33–107 (2003).

  21. V. N. Temlyakov, “Greedy Algorithms in Banach Spaces,” Adv. Comput. Math. 14, 277–292 (2001).

Additional information

Published in Russian in Trudy Matematicheskogo Instituta imeni V.A. Steklova, 2006, Vol. 255, pp. 256–272.

Cite this article

Temlyakov, V.N. On universal estimators in learning theory. Proc. Steklov Inst. Math. 255, 244–259 (2006). https://doi.org/10.1134/S0081543806040201
