
Characterization of the Variation Spaces Corresponding to Shallow Neural Networks

Constructive Approximation

Abstract

We study the variation space corresponding to a dictionary of functions in \(L^2(\Omega )\) for a bounded domain \(\Omega \subset {\mathbb {R}}^d\). Specifically, we compare the variation space, which is defined in terms of a convex hull, with related notions based on integral representations. This allows us to show that three important notions in the approximation theory of shallow neural networks, namely the Barron space, the spectral Barron space, and the Radon BV space, are in fact variation spaces with respect to certain natural dictionaries.
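
For readers' convenience, the following sketch records how such a variation space is commonly defined as the gauge of the closed symmetric convex hull of the dictionary; the notation is generic and the paper's exact conventions (normalization of the dictionary, choice of closure) may differ. Writing \(\overline{{\text {conv}}}(\pm {\mathbb {D}})\) for the closure in \(L^2(\Omega )\) of the convex hull of \({\mathbb {D}}\cup (-{\mathbb {D}})\), one sets

\[
\|f\|_{{\mathcal {K}}_1({\mathbb {D}})} \;=\; \inf \bigl \{\, t>0 \;:\; f \in t\, \overline{{\text {conv}}}(\pm {\mathbb {D}}) \,\bigr \}, \qquad {\mathcal {K}}_1({\mathbb {D}}) \;=\; \bigl \{\, f \in L^2(\Omega ) \;:\; \|f\|_{{\mathcal {K}}_1({\mathbb {D}})} < \infty \,\bigr \}.
\]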


Acknowledgements

We would like to thank Professors Russel Caflisch, Ronald DeVore, Weinan E, Albert Cohen, Stephan Wojtowytsch, and Jason Klusowski for helpful discussions. We would also like to thank the anonymous reviewers for their helpful comments. This work was supported by the Verne M. Willaman Chair Fund at the Pennsylvania State University and by the National Science Foundation (Grant Nos. DMS-1819157 and DMS-2111387).

Author information

Corresponding author

Correspondence to Jonathan W. Siegel.

Additional information

Communicated by Zuowei Shen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Siegel, J.W., Xu, J. Characterization of the Variation Spaces Corresponding to Shallow Neural Networks. Constr Approx 57, 1109–1132 (2023). https://doi.org/10.1007/s00365-023-09626-4
