Abstract
Limitations of the capability of shallow networks to efficiently compute real-valued functions on finite domains are investigated. Efficiency is studied in terms of network sparsity and its approximate measures. It is shown that when a dictionary of computational units is not sufficiently large, computation of almost any uniformly randomly chosen function represents either a well-conditioned task performed by a large network or an ill-conditioned task performed by a network of moderate size. The probabilistic results are complemented by a concrete example of a class of functions that cannot be computed efficiently by shallow perceptron networks. The class is constructed using pseudo-noise sequences, which share many features of random sequences but can be generated using special polynomials. Connections to the No Free Lunch Theorem and to the central paradox of coding theory are discussed.
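To illustrate the kind of construction mentioned in the abstract (a minimal sketch, not the paper's exact class of functions): pseudo-noise sequences of maximal period can be generated by a linear recurrence over GF(2) whose characteristic polynomial is primitive. The degree-4 polynomial, the seed, and the helper name pn_sequence below are illustrative assumptions made only for this example.

# Minimal sketch: a pseudo-noise (maximal-length) binary sequence produced by
# the linear recurrence a_n = a_{n-1} XOR a_{n-4} over GF(2). Its characteristic
# polynomial x^4 + x^3 + 1 is primitive, so any nonzero 4-bit seed yields the
# maximal period 2^4 - 1 = 15. The degree, seed, and function name are
# illustrative assumptions, not the construction used in the paper.

def pn_sequence(seed, length):
    """Return `length` bits of the pseudo-noise sequence started from `seed`."""
    a = list(seed)
    while len(a) < length:
        a.append(a[-1] ^ a[-4])  # linear feedback from the last and fourth-to-last bits
    return a[:length]

if __name__ == "__main__":
    bits = pn_sequence(seed=[1, 0, 0, 0], length=30)
    print(bits)                          # repeats with period 15
    signs = [2 * b - 1 for b in bits]    # the same sequence mapped to +/-1 values
    print(signs)

For a primitive polynomial of degree k, the period of such a sequence is 2^k - 1, which is why these deterministically generated sequences exhibit many statistical features of random sequences.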
Acknowledgements
This work was partially supported by the Czech Science Foundation Grants GA15-18108S and GA18-23827S and by institutional support of the Institute of Computer Science RVO 67985807.
Ethics declarations
Conflict of interest
The author declares that she has no conflict of interest.
Cite this article
Kůrková, V. Limitations of shallow networks representing finite mappings. Neural Comput & Applic 31, 1783–1792 (2019). https://doi.org/10.1007/s00521-018-3680-1