Deep learning (DL) has achieved remarkable empirical success in several important application domains, including image classification, autonomous navigation, and game playing. It is now being considered as an alternative to more traditional methods in numerical analysis. However, a mathematical explanation for the success of deep learning is still lacking. Since applications of DL involve the numerical recovery of an underlying function (typically high dimensional) from data observations, a major part of such an explanation must lie in the ability of neural networks to approximate these functions better than more traditional approximation methods based on polynomials, Fourier expansions, splines, wavelets, etc. In deep learning, the approximation is provided by the outputs of neural networks with a prescribed architecture and activation. While there are currently numerous papers dealing with neural network approximation, several fundamental questions remain unresolved. These include the role of the architecture and the activation in determining approximation efficiency, how to numerically implement neural network approximation, the stability of numerical methods, and quantifying the optimal approximation rates (error versus the number of parameters) achievable with this form of approximation.

This special issue of CA looks at neural network approximation from several vantage points. One focal point is to understand the specific structural properties of functions or model classes that guarantee they can be well approximated by neural networks. This includes the role of classical function spaces, model classes characterized by an intrinsic compositional structure, classes of solutions to operator equations, and invariance and symmetry properties. A particular constructive theme is to establish the universal ability of neural networks to emulate many classical approximation methods, as is exploited, for example, in the context of model reduction when approximating holomorphic mappings. A second important focal point concerns structural properties of the networks themselves, such as closedness, the role of depth versus width, sparse versus full connectivity, and the identifiability of a deep network.

While these topics indicate the considerable breadth of perspectives and conceptual ingredients discussed in this special issue, they also hint at the fact that these threads are deeply interrelated. The Editors hope that this collection of articles will help guide the development of neural network approximation and shed light on when and how this form of approximation should be implemented, including its advantages and disadvantages.

Wolfgang Dahmen, Ronald A. DeVore and Philipp Grohs.