Constructive Approximation, Volume 42, Issue 2, pp 231–264

Entropy and Sampling Numbers of Classes of Ridge Functions

Abstract

We study the properties of ridge functions \(f(x)=g(a\cdot x)\) in high dimensions \(d\) from the viewpoint of approximation theory. The function classes considered consist of ridge functions such that the profile \(g\) is a member of a univariate Lipschitz class with smoothness \(\alpha >0\) (including infinite smoothness) and the ridge direction \(a\) has \(p\)-norm \(\Vert a\Vert _p\le 1\). First, we investigate entropy numbers in order to quantify the compactness of these ridge function classes in \(L_{\infty }\). We show that they are essentially as compact as the class of univariate Lipschitz functions. Second, we examine sampling numbers and consider two extreme cases. In the case \(p=2\), sampling ridge functions on the Euclidean unit ball suffers from the curse of dimensionality. Moreover, it is as difficult as sampling general multivariate Lipschitz functions, which is in sharp contrast to the result on entropy numbers. When we additionally assume that all feasible profiles have a first derivative uniformly bounded away from zero at the origin, the complexity of sampling ridge functions reduces drastically to the complexity of sampling univariate Lipschitz functions. In between, the sampling problem’s degree of difficulty varies, depending on the values of \(\alpha \) and \(p\). Surprisingly, we see almost the entire hierarchy of tractability levels as introduced in the recent monographs by Novak and Woźniakowski.
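To make the definition \(f(x)=g(a\cdot x)\) concrete, the following minimal sketch builds a ridge function in dimension \(d=4\) and checks that it is constant along directions orthogonal to \(a\). The helper names `p_norm` and `make_ridge`, the profile `math.tanh`, and the specific vectors are illustrative assumptions and are not taken from the paper.

```python
import math


def p_norm(a, p):
    """Return the l_p (quasi-)norm of the vector a for 0 < p < infinity."""
    return sum(abs(t) ** p for t in a) ** (1.0 / p)


def make_ridge(g, a):
    """Build f(x) = g(a . x) from a univariate profile g and a direction a."""
    def f(x):
        return g(sum(ai * xi for ai, xi in zip(a, x)))
    return f


# Hypothetical example data: a direction with Euclidean norm 1 in d = 4.
a = [0.5, 0.5, 0.5, 0.5]

# Illustrative profile: tanh is Lipschitz and has derivative 1 at the origin.
f = make_ridge(math.tanh, a)

x = [0.3, -0.1, 0.2, 0.4]
v = [1.0, -1.0, 0.0, 0.0]  # orthogonal to a, since a . v = 0
x_shifted = [xi + 0.25 * vi for xi, vi in zip(x, v)]

# Moving x along v leaves a . x, and hence f(x), unchanged.
print(p_norm(a, 2), f(x), f(x_shifted))
```

The function varies only along the single direction \(a\), which is why one might hope that sampling it is no harder than sampling the univariate profile \(g\). The abstract's results show that this hope holds only under additional assumptions.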

Keywords

Ridge functions · Sampling numbers · Entropy numbers · Rate of convergence · Information-based complexity · Curse of dimensionality

Mathematics Subject Classification

41A10 41A25 41A50 41A63 46E35 65D05 65D15 

Notes

Acknowledgments

The authors would like to thank Aicke Hinrichs, Erich Novak, and Mario Ullrich for pointing out relations to the paper [19], as well as Sjoerd Dirksen, Thomas Kühn, and Winfried Sickel for useful comments and discussions. The last author acknowledges support from the DFG Research Center Matheon “Mathematics for key technologies” in Berlin and from the ERC CZ grant LL1203 of the Czech Ministry of Education.

References

  1. Bühlmann, P., van de Geer, S.: Statistics for High-Dimensional Data. Springer, Heidelberg (2011)
  2. Buhmann, M.D., Pinkus, A.: Identifying linear combinations of ridge functions. Adv. Appl. Math. 22, 103–118 (1999)
  3. Candès, E.J.: Harmonic analysis of neural networks. Appl. Comput. Harmon. Anal. 6, 197–218 (1999)
  4. Candès, E.J., Donoho, D.L.: Ridgelets: a key to higher-dimensional intermittency? Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 357, 2495–2509 (1999)
  5. Carl, B., Stefani, I.: Entropy, Compactness and the Approximation of Operators. Cambridge Tracts in Mathematics, vol. 98. Cambridge University Press, Cambridge (1990)
  6. Cohen, A., Daubechies, I., DeVore, R.A., Kerkyacharian, G., Picard, D.: Capturing ridge functions in high dimensions from point queries. Constr. Approx. 35, 225–243 (2012)
  7. Creutzig, J., Dereich, S., Müller-Gronbach, T., Ritter, K.: Infinite-dimensional quadrature and approximation of distributions. Found. Comput. Math. 9, 391–429 (2009)
  8. Cucker, F., Zhou, D.-X.: Learning theory: an approximation theory viewpoint. Cambridge Monographs on Applied and Computational Mathematics, vol. 24. Cambridge University Press, Cambridge (2007)
  9. DeVore, R.A., Lorentz, G.G.: Constructive Approximation. Springer, Berlin (1993)
  10. Edmunds, D.E., Triebel, H.: Function Spaces, Entropy Numbers, Differential Operators. Cambridge Tracts in Mathematics, vol. 120. Cambridge University Press, Cambridge (1996)
  11. Flad, H.J., Hackbusch, W., Khoromskij, B.N., Schneider, R.: Concepts of data-sparse tensor-product approximation in many-particle modeling. In: Olshevsky, V., Tyrtyshnikov, E. (eds.) Matrix Methods: Theory, Algorithms and Applications. World Scientific, Singapore (2010)
  12. Fornasier, M., Schnass, K., Vybíral, J.: Learning functions of few arbitrary linear parameters in high dimensions. Found. Comput. Math. 12, 229–262 (2012)
  13. Foucart, S., Pajor, A., Rauhut, H., Ullrich, T.: The Gelfand widths of \(\ell_p\)-balls for \(0 < p\le 1\). J. Complex. 26, 629–640 (2010)
  14. Friedman, J.H., Stuetzle, W.: Projection pursuit regression. J. Am. Stat. Assoc. 76, 817–823 (1981)
  15. Golubev, G.K.: Asymptotically minimax estimation of a regression function in an additive model. Problemy Peredachi Informatsii 28, 101–112 (1992)
  16. Graham, R., Sloane, N.: Lower bounds for constant weight codes. IEEE Trans. Inform. Theory 26, 37–43 (1980)
  17. Hastie, T., Tibshirani, R., Friedman, J.: The elements of statistical learning. Springer, New York (2001)
  18. Hinrichs, A., Mayer, S.: Entropy numbers of spheres in Banach and quasi-Banach spaces. University of Bonn, preprint
  19. Hinrichs, A., Novak, E., Ullrich, M., Woźniakowski, H.: The curse of dimensionality for numerical integration of smooth functions II. J. Complex. 30, 117–143 (2014)
  20. Hristache, M., Juditsky, A., Spokoiny, V.: Direct estimation of the index coefficient in a single-index model. Ann. Stat. 29, 595–623 (2001)
  21. Kühn, T.: A lower estimate for entropy numbers. J. Approx. Theory 110, 120–124 (2001)
  22. Logan, B.P., Shepp, L.A.: Optimal reconstruction of a function from its projections. Duke Math. J. 42, 645–659 (1975)
  23. Lorentz, G., von Golitschek, M., Makovoz, Y.: Constructive Approximation: Advanced Problems. Grundlehren der Mathematischen Wissenschaften, vol. 304. Springer, Berlin (1996)
  24. Maiorov, V.: Geometric properties of the ridge manifold. Adv. Comput. Math. 32, 239–253 (2010)
  25. Novak, E., Triebel, H.: Function spaces in Lipschitz domains and optimal rates of convergence for sampling. Constr. Approx. 23, 325–350 (2006)
  26. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems, Volume I: Linear Information. EMS Tracts in Mathematics, vol. 6. Eur. Math. Soc. Publ. House, Zürich (2008)
  27. Novak, E., Woźniakowski, H.: Approximation of infinitely differentiable multivariate functions is intractable. J. Complex. 25, 398–404 (2009)
  28. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems, Volume II: Standard Information for Functionals. EMS Tracts in Mathematics, vol. 12. Eur. Math. Soc. Publ. House, Zürich (2010)
  29. Paskov, S., Traub, J.: Faster evaluation of financial derivatives. J. Portf. Manag. 22, 113–120 (1995)
  30. Pinkus, A.: Approximating by ridge functions. In: Le Méhauté, A., Rabut, C., Schumaker, L.L. (eds.) Surface Fitting and Multiresolution Methods, pp. 279–292. Vanderbilt University Press, Nashville (1997)
  31. Pinkus, A.: Approximation theory of the MLP model in neural networks. Acta Numerica 8, 143–195 (1999)
  32. Raskutti, G., Wainwright, M.J., Yu, B.: Minimax-optimal rates for sparse additive models over kernel classes via convex programming. J. Mach. Learn. Res. 13, 389–427 (2012)
  33. Schütt, C.: Entropy numbers of diagonal operators between symmetric Banach spaces. J. Approx. Theory 40, 121–128 (1984)
  34. Schwab, C., Gittelson, C.J.: Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs. Acta Numerica 20, 291–467 (2011)
  35. Traub, J., Wasilkowski, G., Woźniakowski, H.: Information-Based Complexity. Academic Press, New York (1988)
  36. Triebel, H.: Fractals and Spectra. Birkhäuser, Basel (1997)
  37. Tyagi, H., Cevher, V.: Active learning of multi-index function models. In: Bartlett, P., Pereira, F.C.N., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 25, pp. 1475–1483. Curran Associates, Red Hook (2012)
  38. Vybíral, J.: Weak and quasi-polynomial tractability of approximation of infinitely differentiable functions. J. Complex. 30, 48–55 (2014)

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Hausdorff-Center for Mathematics, Bonn, Germany
  2. Department of Mathematical Analysis, Charles University, Prague 8, Czech Republic
