Constructive Approximation, Volume 42, Issue 2, pp 231–264

Entropy and Sampling Numbers of Classes of Ridge Functions

  • Sebastian Mayer
  • Tino Ullrich
  • Jan Vybíral

Abstract

We study the properties of ridge functions \(f(x)=g(a\cdot x)\) in high dimensions \(d\) from the viewpoint of approximation theory. The function classes considered consist of ridge functions such that the profile \(g\) is a member of a univariate Lipschitz class with smoothness \(\alpha >0\) (including infinite smoothness) and the ridge direction \(a\) has \(p\)-norm \(\Vert a\Vert _p\le 1\). First, we investigate entropy numbers in order to quantify the compactness of these ridge function classes in \(L_{\infty }\). We show that they are essentially as compact as the class of univariate Lipschitz functions. Second, we examine sampling numbers and consider two extreme cases. In the case \(p=2\), sampling ridge functions on the Euclidean unit ball suffers from the curse of dimensionality. Moreover, it is as difficult as sampling general multivariate Lipschitz functions, which is in sharp contrast to the result on entropy numbers. When we additionally assume that all feasible profiles have a first derivative uniformly bounded away from zero at the origin, the complexity of sampling ridge functions reduces drastically to the complexity of sampling univariate Lipschitz functions. In between, the sampling problem’s degree of difficulty varies, depending on the values of \(\alpha \) and \(p\). Surprisingly, we see almost the entire hierarchy of tractability levels as introduced in the recent monographs by Novak and Woźniakowski.
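The defining property of a ridge function, that \(f(x)=g(a\cdot x)\) depends on \(x\) only through the scalar \(a\cdot x\), can be illustrated with a minimal sketch. The helper names `p_norm` and `make_ridge`, the profile `math.tanh`, and the particular vectors are assumptions chosen for illustration; none of them come from the paper.

```python
import math

def p_norm(a, p):
    """p-norm of a vector a, for finite p > 0."""
    return sum(abs(t) ** p for t in a) ** (1.0 / p)

def make_ridge(g, a):
    """Return f with f(x) = g(a . x) for a profile g and ridge direction a."""
    def f(x):
        return g(sum(ai * xi for ai, xi in zip(a, x)))
    return f

# Hypothetical example in dimension d = 5 with a direction of Euclidean norm 1.
a = [0.6, 0.8, 0.0, 0.0, 0.0]
f = make_ridge(math.tanh, a)  # tanh is an arbitrary smooth profile

x = [0.1, 0.2, 0.3, 0.4, 0.5]
v = [0.8, -0.6, 1.0, -2.0, 3.0]  # chosen so that a . v = 0
x_shifted = [xi + vi for xi, vi in zip(x, v)]

# Moving x along any direction orthogonal to a leaves the value unchanged
# (up to floating-point rounding).
print(p_norm(a, 2), f(x), f(x_shifted))
```

A function of this form varies along a single direction only, which is consistent with the abstract's statement that these classes are, in terms of entropy numbers, essentially as compact as the univariate Lipschitz class. The sketch does not illustrate the sampling results.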

Keywords

Ridge functions, Sampling numbers, Entropy numbers, Rate of convergence, Information-based complexity, Curse of dimensionality

Mathematics Subject Classification

41A10, 41A25, 41A50, 41A63, 46E35, 65D05, 65D15

Notes

Acknowledgments

The authors would like to thank Aicke Hinrichs, Erich Novak, and Mario Ullrich for pointing out relations to the paper [19], as well as Sjoerd Dirksen, Thomas Kühn, and Winfried Sickel for useful comments and discussions. The last author acknowledges support from the DFG Research Center Matheon “Mathematics for key technologies” in Berlin and from the ERC CZ grant LL1203 of the Czech Ministry of Education.

References

  1. Bühlmann, P., van de Geer, S.: Statistics for High-Dimensional Data. Springer, Heidelberg (2011)
  2. Buhmann, M.D., Pinkus, A.: Identifying linear combinations of ridge functions. Adv. Appl. Math. 22, 103–118 (1999)
  3. Candès, E.J.: Harmonic analysis of neural networks. Appl. Comput. Harmon. Anal. 6, 197–218 (1999)
  4. Candès, E.J., Donoho, D.L.: Ridgelets: a key to higher-dimensional intermittency? Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 357, 2495–2509 (1999)
  5. Carl, B., Stephani, I.: Entropy, Compactness and the Approximation of Operators. Cambridge Tracts in Mathematics, vol. 98. Cambridge University Press, Cambridge (1990)
  6. Cohen, A., Daubechies, I., DeVore, R.A., Kerkyacharian, G., Picard, D.: Capturing ridge functions in high dimensions from point queries. Constr. Approx. 35, 225–243 (2012)
  7. Creutzig, J., Dereich, S., Müller-Gronbach, T., Ritter, K.: Infinite-dimensional quadrature and approximation of distributions. Found. Comput. Math. 9, 391–429 (2009)
  8. Cucker, F., Zhou, D.-X.: Learning Theory: An Approximation Theory Viewpoint. Cambridge Monographs on Applied and Computational Mathematics, vol. 24. Cambridge University Press, Cambridge (2007)
  9. DeVore, R.A., Lorentz, G.G.: Constructive Approximation. Springer, Berlin (1993)
  10. Edmunds, D.E., Triebel, H.: Function Spaces, Entropy Numbers, Differential Operators. Cambridge Tracts in Mathematics, vol. 120. Cambridge University Press, Cambridge (1996)
  11. Flad, H.J., Hackbusch, W., Khoromskij, B.N., Schneider, R.: Concepts of data-sparse tensor-product approximation in many-particle modeling. In: Olshevsky, V., Tyrtyshnikov, E. (eds.) Matrix Methods: Theory, Algorithms and Applications. World Scientific, Singapore (2010)
  12. Fornasier, M., Schnass, K., Vybíral, J.: Learning functions of few arbitrary linear parameters in high dimensions. Found. Comput. Math. 12, 229–262 (2012)
  13. Foucart, S., Pajor, A., Rauhut, H., Ullrich, T.: The Gelfand widths of \(\ell_p\)-balls for \(0 < p\le 1\). J. Complex. 26, 629–640 (2010)
  14. Friedman, J.H., Stuetzle, W.: Projection pursuit regression. J. Am. Stat. Assoc. 76, 817–823 (1981)
  15. Golubev, G.K.: Asymptotically minimax estimation of a regression function in an additive model. Problemy Peredachi Informatsii 28, 101–112 (1992)
  16. Graham, R., Sloane, N.: Lower bounds for constant weight codes. IEEE Trans. Inform. Theory 26, 37–43 (1980)
  17. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, New York (2001)
  18. Hinrichs, A., Mayer, S.: Entropy numbers of spheres in Banach and quasi-Banach spaces. University of Bonn, preprint
  19. Hinrichs, A., Novak, E., Ullrich, M., Woźniakowski, H.: The curse of dimensionality for numerical integration of smooth functions II. J. Complex. 30, 117–143 (2014)
  20. Hristache, M., Juditsky, A., Spokoiny, V.: Direct estimation of the index coefficient in a single-index model. Ann. Stat. 29, 595–623 (2001)
  21. Kühn, T.: A lower estimate for entropy numbers. J. Approx. Theory 110, 120–124 (2001)
  22. Logan, B.P., Shepp, L.A.: Optimal reconstruction of a function from its projections. Duke Math. J. 42, 645–659 (1975)
  23. Lorentz, G., von Golitschek, M., Makovoz, Y.: Constructive Approximation: Advanced Problems. Grundlehren der Mathematischen Wissenschaften, vol. 304. Springer, Berlin (1996)
  24. Maiorov, V.: Geometric properties of the ridge manifold. Adv. Comput. Math. 32, 239–253 (2010)
  25. Novak, E., Triebel, H.: Function spaces in Lipschitz domains and optimal rates of convergence for sampling. Constr. Approx. 23, 325–350 (2006)
  26. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems, Volume I: Linear Information. EMS Tracts in Mathematics, vol. 6. Eur. Math. Soc. Publ. House, Zürich (2008)
  27. Novak, E., Woźniakowski, H.: Approximation of infinitely differentiable multivariate functions is intractable. J. Complex. 25, 398–404 (2009)
  28. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems, Volume II: Standard Information for Functionals. EMS Tracts in Mathematics, vol. 12. Eur. Math. Soc. Publ. House, Zürich (2010)
  29. Paskov, S., Traub, J.: Faster valuation of financial derivatives. J. Portf. Manag. 22, 113–120 (1995)
  30. Pinkus, A.: Approximating by ridge functions. In: Le Méhauté, A., Rabut, C., Schumaker, L.L. (eds.) Surface Fitting and Multiresolution Methods, pp. 279–292. Vanderbilt University Press, Nashville (1997)
  31. Pinkus, A.: Approximation theory of the MLP model in neural networks. Acta Numerica 8, 143–195 (1999)
  32. Raskutti, G., Wainwright, M.J., Yu, B.: Minimax-optimal rates for sparse additive models over kernel classes via convex programming. J. Mach. Learn. Res. 13, 389–427 (2012)
  33. Schütt, C.: Entropy numbers of diagonal operators between symmetric Banach spaces. J. Approx. Theory 40, 121–128 (1984)
  34. Schwab, C., Gittelson, C.J.: Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs. Acta Numerica 20, 291–467 (2011)
  35. Traub, J., Wasilkowski, G., Woźniakowski, H.: Information-Based Complexity. Academic Press, New York (1988)
  36. Triebel, H.: Fractals and Spectra. Birkhäuser, Basel (1997)
  37. Tyagi, H., Cevher, V.: Active learning of multi-index function models. In: Bartlett, P., Pereira, F.C.N., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 25, pp. 1475–1483. Curran Associates, Red Hook (2012)
  38. Vybíral, J.: Weak and quasi-polynomial tractability of approximation of infinitely differentiable functions. J. Complex. 30, 48–55 (2014)

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Hausdorff-Center for Mathematics, Bonn, Germany
  2. Department of Mathematical Analysis, Charles University, Prague 8, Czech Republic
