1 Introduction

Variational Inequalities (VIs for short) in infinite-dimensional spaces arise in variational formulations of numerous models in the sciences. We refer only to [7, 17, 26] and the references there for models of contact problems in continuum mechanics, [20] and the references there for applications from optimal stopping in finance (mainly option pricing with “American-style” early exercise features), and [4] and the references there for resource allocation and game theoretic models. Two broad classes of approaches toward the numerical solution of VIs can be identified: deterministic approaches, which are based on discretization of the VI in function space, and probabilistic approaches, which exploit stochastic numerical simulation and an interpretation of the solution of the VI as conditional expectations of optimally stopped sample paths. The latter approach has been used to design ML algorithms for the approximation of the solution of one instance of the VI in [3].

Deep neural network structures arise naturally in abstract variational inequality problems (VIs) posed on the product of (possibly infinite-dimensional) Hilbert spaces, as reviewed, e.g., in [5]. Therein, the activation functions correspond to proximity operators of certain potentials that define the constraints of the VI. Weak convergence of this recurrent NN structure in the limit of infinite depth to feasible solutions of the VI is shown under suitable assumptions. An independent, but related, development in recent years has been the advent of DNN-based numerical approximations which are based on encoding known, iterative solvers for discretized partial differential equations, and certain fixed point iterations for nonlinear operator equations. We mention only [9], which developed DNNs that emulate the ISTA iteration of [6], or the more recently proposed generalization of “deep unrolling/unfolding” methodology [22]. Closer to PDE numerics, [11] recently proposed MGNet, a neural network emulation of multilevel, iterative solvers for linear, elliptic PDEs.

The general idea behind these approaches is to emulate by a DNN a contractive map, say \(\Phi \), which is assumed to satisfy the conditions of Banach’s Fixed Point Theorem (BFPT), and whose unique fixed point is the solution of the operator equation of interest. Let us denote the approximate map realized by emulating \(\Phi \) with a DNN by \(\tilde{\Phi }\). The universality theorem for DNNs in various function classes implies (see, e.g., [16, 25] and the references there) that for any \(\varepsilon >0\) a DNN surrogate \(\tilde{\Phi }\) to the contraction map exists, which is \(\varepsilon \)-close to \(\Phi \), uniformly on the domain of attraction of \(\Phi \).

Iteration of the DNN \(\tilde{\Phi }\) being realized by composition, any finite number K of steps of the fixed point iteration can be realized by K-fold composition of the DNN surrogate \(\tilde{\Phi }\). Iterating \(\tilde{\Phi }\), instead of \(\Phi \), induces an error of order \(\mathcal {O}(\varepsilon /(1-L))\), uniformly in the number of iterations K, where \(L\in (0,1)\) denotes the contraction constant of \(\Phi \). Due to the contraction property of \(\Phi \), K may be chosen as \(O(|\log (\varepsilon )|)\) in order to output an approximate fixed point with accuracy \(\varepsilon \) upon termination. The K-fold composition of the surrogate DNN \(\tilde{\Phi }\) is, in turn, itself a DNN of depth \(O(\mathrm{depth}(\tilde{\Phi }) | \log (\varepsilon ) | )\). This reasoning is valid also in metric spaces, since the notions of continuity and contractivity of the map \(\Phi \) do not rely on availability of a norm. Hence, a (sufficiently large) DNN \(\tilde{\Phi }\) exists which may be used likewise for the iterative solution of VIs in metric spaces. Furthermore, the resulting fixed-point-iteration nets obtained in this manner naturally exhibit a recurrent structure, in the case (considered here) that the surrogate \(\tilde{\Phi }\) is fixed throughout the K-fold composition (more refined constructions with stage-dependent approximations \(\{ \tilde{\Phi }^{(k)} \}_{k=1}^K\) of increasing emulation accuracy could be considered, but shall not be addressed here).
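This error-propagation argument can be illustrated in a few lines of code. The following sketch (Python/NumPy; the contraction, the surrogate and all constants are our own illustrative choices, not objects from the paper) iterates an \(\varepsilon \)-accurate surrogate of a contraction with constant L and exhibits the geometric decay down to an error plateau of order \(\varepsilon /(1-L)\):

```python
import numpy as np

# Illustrative contraction on R with Lipschitz constant L = 0.5 and
# fixed point x* = 2 (it solves x = 0.5*x + 1); eps-accurate surrogate.
L, eps = 0.5, 1e-6
phi = lambda x: L * x + 1.0                      # exact contraction Phi
phi_tilde = lambda x: phi(x) + eps * np.sin(x)   # surrogate with |Phi - Phi~| <= eps

x_star = 1.0 / (1.0 - L)                         # exact fixed point of Phi
x = 0.0                                          # arbitrary starting value
for k in range(1, 41):
    x = phi_tilde(x)                             # k-fold composition of the surrogate
    if k % 10 == 0:
        print(f"k = {k:2d}   error = {abs(x - x_star):.3e}")

# The error decays like L**k until it saturates at a level of order eps/(1 - L),
# so K = O(|log(eps)|) iterations suffice for accuracy of order eps.
print("error floor eps/(1 - L) =", eps / (1 - L))
```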

In summary, with the geometric error reduction in FPIs which is implied by the contraction condition, finite truncation at a prescribed emulation precision \(\varepsilon >0\) will imply \(O(|\log (\varepsilon )|)\) iterations, and exact solution representation (of the fixed point of \(\tilde{\Phi }\)) in the infinite depth limit. In DNN calculus, finitely terminated FPIs can be realized via finite concatenation of the DNN approximation \(\tilde{\Phi }\) of the contraction map \(\Phi \). The corresponding DNNs exhibit depth \(\mathcal {O}(|\log (\varepsilon )|)\), and naturally a recurrent structure due to the repetition of the network \(\tilde{\Phi }\) in their construction. Thereby, recurrent DNNs can be built which encode numerical solution maps of fixed point iterations. This idea has appeared in various incarnations in recent work; we refer to, e.g., MGNet for the realization of multigrid iterative solvers of discretized elliptic PDEs [11]. The presently proposed ProxNet architectures are, in fact, DNN emulations of corresponding fixed point iterations of (discretized) variational inequalities.

Recent work has promoted so-called Deep Operator Nets which emulate data-to-solution operators for classes of PDEs. We mention only [19] and the references there. The purpose of the present paper is to analyze expression rates of deep neural networks (DNNs) for emulating data-to-solution operators for VIs. In line with recent work (e.g., [19, 21] and the references there), we take the perspective of infinite-dimensional VIs, which are set on closed cones in separable Hilbert spaces. The task at hand is then the analysis of rates of expression of the approximate data-to-solution map, which relates the input data (i.e., operator, cone, etc.) to the unique solution of the VI.

1.1 Layout

The structure of this paper is as follows. In Sect. 2, we recapitulate basic notions and definitions of proximal neural networks in infinite-dimensional, separable Hilbert spaces. A particular role is taken by so-called proximal activations, and a calculus of ProxNets, which we shall use throughout the rest of the paper to build solution operators of VIs. Section 3 addresses the conceptual use of ProxNets in the constructive solution of VIs. We build in particular ProxNet emulators of convergent fixed point iterations to construct solutions of VIs. Section 3.2 introduces quantitative bounds for perturbations of ProxNets. Section 4 emphasizes that ProxNets may be regarded as (approximate) solution operators to unilateral obstacle problems in infinite-dimensional Hilbert spaces. Section 5 presents DNN emulations of iterative solvers of matrix LCPs which arise from discretization of unilateral problems for PDEs. Section 6 presents several numerical experiments, which illustrate the foregoing developments. More precisely, we consider the numerical solution of free boundary value problems arising in the valuation of American-style options, and in parametric obstacle problems. Section 7 provides a brief summary of the main results and indicates possible directions for further research.

1.2 Notation

We use standard notation. By \(\mathcal {L}(\mathcal {H},\mathcal {K})\), we denote the Banach space of bounded, linear operators from the Banach space \(\mathcal {H}\) into \(\mathcal {K}\) (surjectivity will not be required). Unless explicitly stated otherwise, all Hilbert and Banach spaces are infinite-dimensional. By bold symbols, we denote matrices resp. linear maps between finite-dimensional spaces. We use the notational conventions \(\sum _{i=1}^0 \cdot = 0\) and \(\Pi _{i=1}^0\cdot =1\) for the empty sum and empty product, respectively. Vectors in finite-dimensional, Euclidean space are always understood as column vectors, with \(\top \) denoting transposition of matrices and vectors.

2 Proximal neural networks (ProxNets)

We consider the following model for an artificial neural network: For finite \(m\in \mathbb {N}\), let \(\mathcal {H}\) and \((\mathcal {H}_i)_{0\le i \le m}\) be real, separable Hilbert spaces. For every \(i\in \{1,\ldots ,m\}\), let \(W_i\in \mathcal {L}(\mathcal {H}_{i-1}, \mathcal {H}_{i})\) be a bounded linear operator, let \(b_i\in \mathcal {H}_i\), let \(R_i:\mathcal {H}_i\rightarrow \mathcal {H}_i\) be a nonlinear, continuous operator, and define

$$\begin{aligned} T_i:\mathcal {H}_{i-1}\rightarrow \mathcal {H}_i, \quad x\mapsto R_i(W_ix +b_i). \end{aligned}$$
(1)

Moreover, let \(W_0\in \mathcal {L}(\mathcal {H}_0,\mathcal {H})\), \(W_{m+1}\in \mathcal {L}(\mathcal {H}_m,\mathcal {H})\), \(b_{m+1}\in \mathcal {H}\) and consider the neural network (NN) model

$$\begin{aligned} \Psi :\mathcal {H}_0\rightarrow \mathcal {H},\quad x\mapsto W_0x+W_{m+1}(T_m\circ \cdots \circ T_1) (x)+b_{m+1}. \end{aligned}$$
(2)

The operator \(W_0\in \mathcal {L}(\mathcal {H}_0,\mathcal {H})\) allows us to include skip connections in the model, similar to deep residual neural networks as proposed in [12, 13]. This article focuses in particular on NNs with identical input and output spaces as in [5, Model 1.1], which arise as a special case of model (2) with \(\mathcal {H}_0=\mathcal {H}_m=\mathcal {H}\) and are of the form

$$\begin{aligned} \Phi :\mathcal {H}\rightarrow \mathcal {H},\quad x\mapsto (1-\lambda )x+\lambda (T_m\circ \cdots \circ T_1) (x), \end{aligned}$$
(3)

for a relaxation parameter \(\lambda >0\) to be adjusted for each application. The relation \(\mathcal {H}_0=\mathcal {H}_m=\mathcal {H}\) allows us to investigate fixed points of \(\Phi :\mathcal {H}\rightarrow \mathcal {H}\), which are in turn solutions to variational inequalities. The nonlinear operators \(R_i\) act as activation operators of the NNs and are subsequently given by suitable proximity operators on \(\mathcal {H}_i\). We refer to \(\Psi \) and \(\Phi \) as proximal neural networks or ProxNets for short and derive sufficient conditions on the operators \(T_i\), resp. \(W_i\) and \(R_i\), so that \(\Phi \) defines a contraction on \(\mathcal {H}\). Hence, the unique fixed point \(x^*=\Phi (x^*)\in \mathcal {H}\) solves a variational inequality, which is in turn uniquely determined by the network parameters \(W_i, b_i\) and \(R_i\) for \(i\in \{1,\ldots ,m\}\). Conversely, any well-posed variational inequality on \(\mathcal {H}\) may be recast as a fixed-point problem for a suitable contractive ProxNet \(\Phi :\mathcal {H}\rightarrow \mathcal {H}\).

As an example, consider an elliptic variational inequality on \(\mathcal {H}\), with solution \(u\in \mathcal {K}\subset \mathcal {H}\), where \(\mathcal {K}\) is a closed, convex set. The set of contractive mappings on \(\mathcal {H}\) is open; therefore, we may construct a one-layer ProxNet \(\Phi :\mathcal {H}\rightarrow \mathcal {H}\), such that u is the unique fixed-point of \(\Phi \). Therein, \(W_1\in \mathcal {L}(\mathcal {H})\) stems from the bilinear form of the variational inequality, \(\lambda >0\) is a relaxation parameter chosen to ensure a Lipschitz constant below one, and \(R_1\) is the \(\mathcal {H}\)-orthogonal projection onto \(\mathcal {K}\), see Sect. 4.1 for a detailed construction.

This enables us to approximate solutions to variational inequality problems by fixed-point iterations of ProxNets and to derive convergence rates. Due to the contraction property of \(\Phi \), the fixed-point iteration \(x_n=\Phi (x_{n-1})\), \(n\in \mathbb {N}\), converges to \(x^*=\Phi (x^*)\) for any \(x_0\in \mathcal {H}\) at a linear rate. Moreover, as the set of contractions on \(\mathcal {H}\) is open, the iteration is stable under small perturbations of the network parameters. As we show in Sect. 5.3 below, the latter property allows us to solve entire classes of variational inequality problems using only one ProxNet with fixed parameters.
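For orientation, the following sketch (Python/NumPy; the random weights, the use of ReLU as proximal activation and all constants are our own illustrative choices) implements the model (3) in \(\mathcal {H}=\mathbb {R}^d\) and runs the fixed-point iteration just described:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                         # H = R^d

def make_layer(scale):
    """One layer T_i(x) = R_i(W_i x + b_i) with R_i = ReLU (a proximity operator)."""
    W = rng.standard_normal((d, d))
    W *= scale / np.linalg.norm(W, 2)         # enforce ||W_i||_2 = scale
    b = rng.standard_normal(d)
    return W, b

layers = [make_layer(0.7), make_layer(0.9)]   # L_Phi = 0.7 * 0.9 < 1
lam = 1.0                                     # relaxation parameter lambda

def prox_relu(x):
    return np.maximum(x, 0.0)                 # prox of the indicator of [0, inf)^d

def Phi(x):
    """ProxNet of the form (3): x -> (1 - lam)*x + lam*(T_m o ... o T_1)(x)."""
    y = x
    for W, b in layers:
        y = prox_relu(W @ y + b)
    return (1.0 - lam) * x + lam * y

# Fixed-point iteration x_n = Phi(x_{n-1}) converges at a linear rate.
x = np.zeros(d)
for n in range(80):
    x = Phi(x)
print("residual ||Phi(x) - x|| =", np.linalg.norm(Phi(x) - x))
```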

2.1 Proximal activations

Definition 2.1

Let \(i\in \{0,\ldots ,m\}\) be a fixed index, \(\psi _i:\mathcal {H}_i\rightarrow \mathbb {R}\cup \{\infty \}\) and \(\mathrm {dom}(\psi _i):=\{x\in \mathcal {H}_i|\psi _i(x)<\infty \}\). We denote by \(\Gamma _0(\mathcal {H}_i)\) the set of all proper, convex, lower semi-continuous functions on \(\mathcal {H}_i\), that is

$$\begin{aligned}&\Gamma _0(\mathcal {H}_i):= \\&\quad \left\{ \psi _i:\mathcal {H}_i\rightarrow \mathbb {R}\cup \{\infty \}\Big |\; \psi _i \text { convex},\; \liminf _{y\rightarrow x} \psi _i(y) \ge \psi _i(x)\text { for all } x\in \mathcal {H}_i \text { and } \mathrm {dom}(\psi _i)\ne \emptyset \right\} . \end{aligned}$$

For any \(\psi _i\in \Gamma _0(\mathcal {H}_i)\), the subdifferential of \(\psi _i\) at \(x\in \mathcal {H}_i\) is

$$\begin{aligned} \partial \psi _i(x):=\{v\in \mathcal {H}_i|\, (y-x,v)_{\mathcal {H}_i}+\psi _i(x)\le \psi _i(y) \text { for all } y\in \mathcal {H}_i \}\subset \mathcal {H}_i,\quad x\in \mathcal {H}_i, \end{aligned}$$

and the proximity operator of \(\psi _i\) is

$$\begin{aligned} \mathrm {prox}_{\psi _i}:\mathcal {H}_i\rightarrow \mathcal {H}_i,\quad x\mapsto \mathop {\hbox {argmin}}\limits _{y\in \mathcal {H}_i} \psi _i(y) +\frac{\Vert x-y\Vert ^2_{\mathcal {H}_i}}{2}. \end{aligned}$$
(4)

It is well-known that \(\mathrm {prox}_{\psi _i}\) is a firmly nonexpansive operator, i.e., \(2\mathrm {prox}_{\psi _i}-\mathrm{id}\) is nonexpansive, see, e.g., [2, Proposition 12.28]. As outlined in [5, Section 2], there is a natural relation between proximity operators and activation functions in neural networks: Virtually any commonly used activation function, such as the rectified linear unit, tanh, softmax, etc., may be expressed as a proximity operator on \(\mathcal {H}_i=\mathbb {R}^d\), \(d\in \mathbb {N}\), for an appropriate \(\psi _i\in \Gamma _0(\mathcal {H}_i)\) (see [5, Section 2] for examples). We consider a set of particular proximity operators given by

$$\begin{aligned} \mathcal {A}(\mathcal {H}_i):=\{R_i=\mathrm {prox}_{\psi _i}|\, \psi _i\in \Gamma _0(\mathcal {H}_i) \text { such that } \psi _i \text { is minimal at}\, 0\in \mathcal {H}_i\}, \end{aligned}$$
(5)

cf. [5, Definition 2.20]. Apart from being continuous and nonexpansive, any \(R_i\in \mathcal {A}(\mathcal {H}_i)\) satisfies \(R_i(0)=0\) [5, Proposition 2.21]. Therefore, in the case \(\mathcal {H}_i=\mathbb {R}\), the elements in \(\mathcal {A}(\mathbb {R})\) are also referred to as stable activation functions, cf. [10, Lemma 5.1]. With this in mind, we formally define proximal neural networks, or ProxNets.
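Before giving the formal definition, we illustrate this correspondence with a small sketch (Python/NumPy; the function names are ours, and the soft-thresholding example is a standard proximity operator, not specific to this paper): the ReLU is the proximity operator of \(\iota _{[0,\infty )}\), and soft-thresholding is the proximity operator of \(\psi (y)=t|y|\); both potentials are minimal at 0, so both activations belong to \(\mathcal {A}(\mathbb {R})\).

```python
import numpy as np

def prox_indicator_nonneg(x):
    """Proximity operator of iota_{[0, inf)}: the projection onto [0, inf), i.e. ReLU."""
    return np.maximum(x, 0.0)

def prox_abs(x, t=1.0):
    """Proximity operator of psi(y) = t*|y|: soft-thresholding (shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

x = np.linspace(-2.0, 2.0, 5)
print(prox_indicator_nonneg(x))   # [0. 0. 0. 1. 2.]
print(prox_abs(x, t=0.5))         # [-1.5 -0.5  0.   0.5  1.5]
```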

Definition 2.2

Let \(\Psi :\mathcal {H}_0\rightarrow \mathcal {H}\) be the m-layer neural network model in (2). If \(R_i\in \mathcal {A}(\mathcal {H}_i)\) holds for any \(i\in \{1,\ldots ,m\}\), \(\Psi \) is called a proximal neural network or ProxNet.

2.2 ProxNet calculus

Before investigating the relation of \(\Phi \) in (3) to variational inequality models, we record several useful definitions and results for NN calculus in the more general model \(\Psi \) from Eq. (2).

Definition 2.3

Let \(j\in \{1,2\}\), \(m_j\in \mathbb {N}\), let \(\mathcal {H}^{(j)}, \mathcal {H}_0^{(j)},\ldots , \mathcal {H}_m^{(j)} \) be separable Hilbert spaces such that \(\mathcal {H}^{(2)}=\mathcal {H}_0^{(1)}\), and let \(\Psi _j\) be \(m_j\)-layer ProxNets as in (2) given by

$$\begin{aligned} \Psi _j:\mathcal {H}_{0}^{(j)}\rightarrow \mathcal {H}^{(j)},\quad x\mapsto W_{m_j+1}^{(j)} \left( T_{m_j}^{(j)}\circ \cdots \circ T_1^{(j)}\right) (x)+b_{m_j+1}^{(j)}. \end{aligned}$$

The concatenation of \(\Psi _1\) and \(\Psi _2\) is defined by the map

$$\begin{aligned} \Psi _1\bullet \Psi _2:\mathcal {H}_{0}^{(2)}\rightarrow \mathcal {H}^{(1)}, \quad x\mapsto (\Psi _1\circ \Psi _2)(x). \end{aligned}$$
(6)

Remark 2.4

Due to \(W_0^{(j)}\equiv 0\), there are no skip connections after the last proximal activation in \(\Psi _j\); hence, \(\Psi _1\bullet \Psi _2\) is in fact a ProxNet as in (2) with \(m_1+m_2\) layers and no skip connection.

Definition 2.5

Let \(m\in \mathbb {N}\), \(j\in \{1,2\}\), let \(\mathcal {H}^{(j)}, \mathcal {H}_0^{(j)},\ldots ,\mathcal {H}_m^{(j)}\) be separable Hilbert spaces such that \(\mathcal {H}_0^{(1)}=\mathcal {H}_0^{(2)}\), and let \(\Psi _j\) be m-layer ProxNets as in (2) given by

$$\begin{aligned} \Psi _j:\mathcal {H}_{0}^{(j)}\rightarrow \mathcal {H}^{(j)},\quad x\mapsto W_0^{(j)}x+W_{m+1}^{(j)}\left( T_m^{(j)}\circ \cdots \circ T_1^{(j)}\right) (x)+b_{m+1}^{(j)}. \end{aligned}$$

The parallelization of \(\Psi _1\) and \(\Psi _2\) is given for \(\mathcal {H}_0:=\mathcal {H}_0^{(1)}=\mathcal {H}_{0}^{(2)}\) by

$$\begin{aligned} P(\Psi _1,\Psi _2):\mathcal {H}_0\rightarrow \mathcal {H}^{(1)}\oplus \mathcal {H}^{(2)},\quad x\mapsto (\Psi _1(x), \Psi _2(x)). \end{aligned}$$

Proposition 2.6

The parallelization \(P(\Psi _1,\Psi _2)\) of two ProxNets \(\Psi _1\) and \(\Psi _2\) as in Definition 2.5 is a ProxNet.

Proof

We set \(\mathcal {H}^{(j)}_{m+1}:=\mathcal {H}^{(j)}\) for \(j\in \{1,2\}\), fix \(i\in \{1,\ldots ,m\}\) and observe that \(\mathcal {H}_{i}^{(1)}\oplus \mathcal {H}_{i}^{(2)}\) equipped with the scalar product \((\cdot ,\cdot )_{\mathcal {H}_{i}^{(1)}\oplus \mathcal {H}_{i}^{(2)}}:=(\cdot ,\cdot )_{\mathcal {H}_{i}^{(1)}} +(\cdot ,\cdot )_{\mathcal {H}_{i}^{(2)}}\) is again a separable Hilbert space. We define

$$\begin{aligned}&W_0:\mathcal {H}_0\rightarrow \mathcal {H}^{(1)}\oplus \mathcal {H}^{(2)},\quad x\mapsto (W_0^{(1)}x, W_0^{(2)}x), \qquad W_1:\mathcal {H}_0\rightarrow \mathcal {H}_{1}^{(1)}\oplus \mathcal {H}_{1}^{(2)},\quad x\mapsto (W_1^{(1)}x, W_1^{(2)}x), \\&W_i:\mathcal {H}_{i-1}^{(1)}\oplus \mathcal {H}_{i-1}^{(2)}\rightarrow \mathcal {H}_{i}^{(1)}\oplus \mathcal {H}_{i}^{(2)},\quad (x,y)\mapsto (W_i^{(1)}x, W_i^{(2)}y),\quad i\in \{2,\ldots ,m+1\}, \\&b_i:=(b_i^{(1)},b_i^{(2)})\in \mathcal {H}_{i}^{(1)}\oplus \mathcal {H}_{i}^{(2)},\quad i\in \{1,\ldots ,m+1\}, \qquad R_i:(x,y)\mapsto (R_i^{(1)}(x), R_i^{(2)}(y)),\quad i\in \{1,\ldots ,m\}. \end{aligned}$$
Note that all \(W_i\) are bounded, linear operators. Moreover, if \(R_i^{(j)}=\mathrm {prox}_{\psi _i^{(j)}}\in \mathcal {A}(\mathcal {H}_i^{(j)})\) holds for \(\psi _i^{(j)}\in \Gamma _0(\mathcal {H}_i^{(j)})\) and \(j\in \{1,2\}\), then \(R_i=\mathrm {prox}_{\psi _i}\), where \(\psi _i\in \Gamma _0(\mathcal {H}_{i}^{(1)}\oplus \mathcal {H}_{i}^{(2)})\) is defined by \(\psi _i(x,y):=\psi _i^{(1)}(x)+\psi _i^{(2)}(y)\). Hence, \(R_i\in \mathcal {A}(\mathcal {H}_{i}^{(1)}\oplus \mathcal {H}_{i}^{(2)})\) and it holds that

$$\begin{aligned} P(\Psi _1,\Psi _2):\mathcal {H}_0\rightarrow \mathcal {H}^{(1)}\oplus \mathcal {H}^{(2)},\quad x\mapsto W_0x+W_{m+1}(T_m\circ \cdots \circ T_1)(x)+b_{m+1}, \end{aligned}$$

with \(T_i:=R_i(W_i\cdot +b_i)\) for \(i\in \{1,\ldots ,m\}\), which shows the claim. \(\square \)
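In finite dimensions, the concatenation and parallelization operations of Definitions 2.3 and 2.5 admit a direct realization in code. The following sketch (Python/NumPy; the helper names are ours) builds the parameters of \(P(\Psi _1,\Psi _2)\) exactly as in the proof above: the shared input layer is stacked, all subsequent weights act block-diagonally, and biases are concatenated.

```python
import numpy as np

def concatenate(psi1, psi2):
    """Psi_1 • Psi_2 of Definition 2.3: feed the output of Psi_2 into Psi_1."""
    return lambda x: psi1(psi2(x))

def parallelize_params(layers1, layers2):
    """Layer parameters of P(Psi_1, Psi_2) as in the proof of Proposition 2.6.

    layers1, layers2: lists of (W, b) pairs of equal depth; both nets share the
    input space, so the first weights are stacked vertically, all later weights
    act block-diagonally, and the biases are concatenated.
    """
    params = []
    for i, ((W1, b1), (W2, b2)) in enumerate(zip(layers1, layers2)):
        if i == 0:
            W = np.vstack([W1, W2])
        else:
            W = np.block([[W1, np.zeros((W1.shape[0], W2.shape[1]))],
                          [np.zeros((W2.shape[0], W1.shape[1])), W2]])
        params.append((W, np.concatenate([b1, b2])))
    return params
```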

3 ProxNets and variational inequalities

3.1 Contractive ProxNets

We formulate sufficient conditions on the neural network model in (3) so that \(\Phi :\mathcal {H}\rightarrow \mathcal {H}\) is a contraction. The associated fixed-point iteration converges to the unique solution of a variational inequality, which is characterized in the following.

Assumption 3.1

Let \(\Phi \) be a ProxNet as in (3) with \(m\in \mathbb {N}\) layers such that \(W_i\in \mathcal {L}(\mathcal {H}_{i-1}, \mathcal {H}_{i})\), \(b_i\in \mathcal {H}_i\), and \(R_i\in \mathcal {A}(\mathcal {H}_i)\) for all \(i\in \{1,\ldots ,m\}\). It holds that \(\lambda \in (0,2)\) and the operators \(W_i\) satisfy

$$\begin{aligned} L_{\Phi }:=\prod _{i=1}^{m}\Vert W_i\Vert _{\mathcal {L}(\mathcal {H}_{i-1},\mathcal {H}_i)}<\min (1,2/\lambda -1). \end{aligned}$$

Theorem 3.2

Let \(\Phi \) be as in (3), let \(x^0\in \mathcal {H}\) and define the iteration \(x^{k+1}:=\Phi (x^{k})\), \(k\in \mathbb {N}_0\). Under Assumption 3.1, the sequence \((x^k,k\in \mathbb {N}_0)\) converges for any \(x^0\in \mathcal {H}\) to the unique fixed-point \(x^*\in \mathcal {H}\). For any finite number \(k\in \mathbb {N}\), the error is bounded by

$$\begin{aligned} \Vert x^*-x^k\Vert _\mathcal {H}\le \frac{\Vert \Phi (x^0)-x^0\Vert }{1-L_{\Phi ,\lambda }}L_{\Phi ,\lambda }^k, \quad L_{\Phi ,\lambda }:=|1-\lambda |+\lambda L_{\Phi }\in [0,1). \end{aligned}$$
(7)

It holds that

$$\begin{aligned} (x_1^*,\ldots ,x_m^*):= (T_1x^*, (T_2\circ T_1)x^*, \ldots , (T_{m-1}\circ \cdots \circ T_1)x^*, x^*) \in \mathcal {H}_1\times \cdots \times \mathcal {H}_m \end{aligned}$$

is the unique solution to the variational inequality problem: find \(x_1\in \mathcal {H}_1,\ldots ,x_m\in \mathcal {H}_m\) with \(x_0:=x_m\), such that

$$\begin{aligned} W_ix_{i-1}+b_i-x_i\in \partial \psi _i (x_i), \quad i\in \{1,\ldots ,m\}. \end{aligned}$$
(8)

Moreover, \(x^*\) is bounded by

$$\begin{aligned}&\Vert x^*\Vert _{\mathcal {H}} \le C^*\sum _{i=1}^{m} \left( \prod _{j=i+1}^{m} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})}\right) \Vert b_{i}\Vert _{\mathcal {H}_i}, \\&C^*:={\left\{ \begin{array}{ll} \frac{1}{1-L_\Phi } &{}<\infty ,\quad \lambda \in (0,1], \\ \frac{\lambda }{2-\lambda (1+L_\Phi )}&{}<\infty ,\quad \lambda \in (1,2). \\ \end{array}\right. } \end{aligned}$$

Proof

By the non-expansiveness of \(R_i:\mathcal {H}_i\rightarrow \mathcal {H}_i\) for \(i\in \{1,\ldots ,m\}\), it follows for any \(x,y\in \mathcal {H}\)

$$\begin{aligned}&\Vert \Phi (x)-\Phi (y)\Vert _\mathcal {H}\le |1-\lambda |\Vert x-y\Vert _\mathcal {H}+ \lambda \Vert (T_m\circ \cdots \circ T_1)x - (T_m\circ \cdots \circ T_1)y\Vert _{\mathcal {H}_m} \\&\quad \le |1-\lambda |\Vert x-y\Vert _\mathcal {H}\\&\quad \quad + \lambda \Vert (W_m\circ (T_{m-1}\circ \cdots \circ T_1))x - (W_m\circ (T_{m-1}\circ \cdots \circ T_1))y\Vert _{\mathcal {H}_m} \\&\quad \le |1-\lambda |\Vert x-y\Vert _\mathcal {H}\\&\qquad + \lambda \Vert W_m\Vert _{\mathcal {L}(\mathcal {H}_{m-1}, \mathcal {H}_{m})} \Vert (T_{m-1}\circ \cdots \circ T_1)x - (T_{m-1}\circ \cdots \circ T_1)y\Vert _{\mathcal {H}_{m-1}} \\&\quad \le |1-\lambda |\Vert x-y\Vert _\mathcal {H}+ \lambda \left( \prod _{i=1}^m\Vert W_i\Vert _{\mathcal {L}(\mathcal {H}_{i-1}, \mathcal {H}_{i})} \right) \Vert x-y\Vert _{\mathcal {H}_0}\\&\quad = \underbrace{(|1-\lambda |+\lambda L_{\Phi })}_{:=L_{\Phi ,\lambda }}\Vert x-y\Vert _{\mathcal {H}}. \end{aligned}$$

As \(\lambda \in (0,2)\) and \(L_{\Phi }< \min (1,2/\lambda -1)\) by Assumption 3.1, it follows that \(L_{\Phi ,\lambda }<1\), hence, \(\Phi :\mathcal {H}\rightarrow \mathcal {H}\) is a contraction. Existence and uniqueness of \(x^*\in \mathcal {H}\) and the first part of the claim then follow by Banach’s fixed-point theorem for any initial value \(x^0\in \mathcal {H}\).

By [2, Proposition 16.44], it holds for any \(i\in \{1,\ldots ,m\}\), \(x_i,y_i\in \mathcal {H}_i\) and \(\psi _i\in \Gamma _0(\mathcal {H}_i)\) that

$$\begin{aligned} x_i=\mathrm {prox}_{\psi _i}(y_i) \quad \Leftrightarrow \quad y_i-x_i\in \partial \psi _i (x_i). \end{aligned}$$

Now, let \(x^*_0:=x^*\) and \(x^*_i:=(T_i\circ \cdots \circ T_1)(x^*)\) for \(i\in \{1,\ldots ,m\}\). This yields \(\Phi (x^*_0)=(1-\lambda )x^*+\lambda x^*_m=x^*\) and hence, \(x^*_m=x^*\). Recalling that \(R_i=\mathrm {prox}_{\psi _i}\) with \(\psi _i\in \Gamma _0(\mathcal {H}_i)\) for all \(i\in \{1,\ldots ,m\}\), it hence follows that

$$\begin{aligned} W_ix^*_{i-1}+b_i-x^*_i\in \partial \psi _i (x^*_i), \end{aligned}$$

cf. [5, Proposition 4.3]. Finally, to bound \(x^*\), we use that

$$\begin{aligned} \Vert x^*\Vert _\mathcal {H}\le \Vert \Phi (x^*)-\Phi (0)\Vert _\mathcal {H}+ \Vert \Phi (0)\Vert _\mathcal {H}\le L_{\Phi ,\lambda }\Vert x^*\Vert _\mathcal {H}+ \lambda \Vert (T_m\circ \cdots \circ T_1)(0)\Vert _{\mathcal {H}_m}. \end{aligned}$$

As \(R_i\in \mathcal {A}(\mathcal {H}_i)\), it holds \(R_i(0)=0\) and therefore, \(\Vert R_i(x)\Vert _{\mathcal {H}_i}\le \Vert x\Vert _{\mathcal {H}_i}\) for all \(x\in \mathcal {H}_i\), which in turn shows

$$\begin{aligned}&\Vert (T_m\circ \cdots \circ T_1)(0)\Vert _{\mathcal {H}_m} \le \Vert W_m\Vert _{\mathcal {L}(\mathcal {H}_{m-1},\mathcal {H}_m)}\Vert (T_{m-1}\circ \cdots \circ T_1)(0)\Vert _{\mathcal {H}_{m-1}} +\Vert b_m\Vert _{\mathcal {H}_m} \\&\quad \le \Vert W_m\Vert _{\mathcal {L}(\mathcal {H}_{m-1},\mathcal {H}_m)}\\&\qquad \cdot \left( \Vert W_{m-1}\Vert _{\mathcal {L}(\mathcal {H}_{m-2},\mathcal {H}_{m-1})} \Vert (T_{m-2}\circ \cdots \circ T_1)(0)\Vert _{\mathcal {H}_{m-2}} +\Vert b_{m-1}\Vert _{\mathcal {H}_{m-1}} \right) +\Vert b_m\Vert _{\mathcal {H}_m} \\&\quad \le \sum _{i=1}^{m} \left( \prod _{j=i+1}^{m} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})}\right) \Vert b_{i}\Vert _{\mathcal {H}_i}. \end{aligned}$$

The claim follows with \(L_\Phi <\min (1,2/\lambda -1)\), since

$$\begin{aligned} 1-L_{\Phi ,\lambda }= {\left\{ \begin{array}{ll} \lambda (1-L_\Phi )&{}>0,\quad \lambda \in (0,1], \\ 2-\lambda (1+L_\Phi )&{}>0,\quad \lambda \in (1,2). \\ \end{array}\right. } \end{aligned}$$

\(\square \)
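The contraction constant \(L_{\Phi ,\lambda }=|1-\lambda |+\lambda L_{\Phi }\) and the a priori bound (7) are easy to check numerically. The following sketch (Python/NumPy; random illustrative weights, ReLU activations and \(\lambda =1.5\) are our own choices) constructs a two-layer ProxNet satisfying Assumption 3.1 and compares the iteration error at \(k=10\) with the right-hand side of (7):

```python
import numpy as np

rng = np.random.default_rng(1)
d, lam = 6, 1.5                               # relaxation parameter lambda in (0, 2)

# Two layers with ||W_1||*||W_2|| = 0.5*0.4 = L_Phi < min(1, 2/lam - 1) = 1/3.
norms = [0.5, 0.4]
Ws, bs = [], []
for s in norms:
    W = rng.standard_normal((d, d))
    Ws.append(W * s / np.linalg.norm(W, 2))   # rescale so that ||W||_2 = s
    bs.append(rng.standard_normal(d))
L_Phi = float(np.prod(norms))
L_Phi_lam = abs(1 - lam) + lam * L_Phi        # contraction constant in (7)

relu = lambda x: np.maximum(x, 0.0)           # proximal activation

def Phi(x):
    y = x
    for W, b in zip(Ws, bs):
        y = relu(W @ y + b)
    return (1 - lam) * x + lam * y

# Approximate the fixed point by many iterations, then test the bound (7) at k = 10.
x_star = np.zeros(d)
for _ in range(500):
    x_star = Phi(x_star)

x0 = np.zeros(d)
x, k = x0.copy(), 10
for _ in range(k):
    x = Phi(x)

lhs = np.linalg.norm(x_star - x)
rhs = np.linalg.norm(Phi(x0) - x0) / (1 - L_Phi_lam) * L_Phi_lam ** k
print(f"||x* - x^k|| = {lhs:.3e}  <=  a priori bound = {rhs:.3e}")
```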

3.2 Perturbation estimates for ProxNets

In this subsection, we introduce a perturbed version of the ProxNet \(\Phi \) in (3). Besides changing the network parameters \(W_i, b_i\) and \(R_i\), we also augment the input space \(\mathcal {H}\) and allow an architecture that approximates each nonlinear operator \(T_i\) itself by a multilayer network. These changes allow us to consider a ProxNet as an approximate data-to-solution operator for infinite-dimensional variational inequalities and to control perturbations of the network parameters. For instance, we show in Example 3.4 that augmented ProxNets mimic the solution operator to Problem (8), which maps the bias vectors \(b_1,\ldots ,b_m\) to the solution \(x_1,\ldots ,x_m\).

Let \(\widetilde{\mathcal {H}}_0,\ldots ,\widetilde{\mathcal {H}}_{m-1}\) be arbitrary separable Hilbert spaces and let \(\widetilde{\mathcal {H}}:=\widetilde{\mathcal {H}}_0\). Then, for \(i\in \{0,\ldots ,m-1\}\) the direct sum \(\mathcal {H}_i\oplus \widetilde{\mathcal {H}}_i\) equipped with the inner product \((\cdot ,\cdot )_{\mathcal {H}_i}+(\cdot ,\cdot )_{\widetilde{\mathcal {H}}_i}\) is again a separable Hilbert space. For notational convenience, we set \(\widetilde{\mathcal {H}}_m:=\{0\in \mathcal {H}_m\}\) and use the identification \(\mathcal {H}_m\oplus \widetilde{\mathcal {H}}_m = \mathcal {H}_m = \mathcal {H}\). We consider the ProxNet

$$\begin{aligned} \widetilde{\Phi }:\mathcal {H}\oplus \widetilde{\mathcal {H}}\rightarrow \mathcal {H}, \quad (x,\widetilde{x})\mapsto (1-\lambda )x +\lambda (\widetilde{T}_m\circ \cdots \circ \widetilde{T}_1)(x,\widetilde{x}), \end{aligned}$$
(9)

where we allow the operators \(\widetilde{T}_i\) to be themselves multi-layer ProxNets: For any \(i\in \{1,\ldots ,m\}\), let \(m_i\in \mathbb {N}\) and let \(\mathcal {H}_{0}^{(i)}:= \mathcal {H}_{i-1}\oplus \widetilde{\mathcal {H}}_{i-1}\), \(\mathcal {H}_1^{(i)},\ldots , \mathcal {H}_{m_i-1}^{(i)}\),\( \mathcal {H}_{m_i}^{(i)}:=\mathcal {H}_i\oplus \widetilde{\mathcal {H}}_i\) be separable Hilbert spaces. For \(j_i\in \{1,\ldots ,m_i\}\), consider the operators \( \widetilde{T}_{j_i}^{(i)}(\cdot ) = R_{j_i}^{(i)}(W_{j_i}^{(i)}\cdot + b_{j_i}^{(i)})\) given by

$$\begin{aligned} R_{j_i}^{(i)}\in \mathcal {A}(\mathcal {H}_{j_i}^{(i)}),\quad W_{j_i}^{(i)}\in \mathcal {L}(\mathcal {H}_{j_i-1}^{(i)}, \mathcal {H}_{j_i}^{(i)}),\quad b_{j_i}^{(i)}\in \mathcal {H}_{j_i}^{(i)}. \end{aligned}$$

We then define \(\widetilde{T}_i\) as

$$\begin{aligned} \widetilde{T}_i&: \mathcal {H}_{i-1}\oplus \widetilde{\mathcal {H}}_{i-1}\rightarrow \mathcal {H}_{i}\oplus \widetilde{\mathcal {H}}_{i}, \quad (x_{i-1},\widetilde{x}_{i-1})\mapsto (\widetilde{T}_{m_i}^{(i)}\circ \cdots \circ \widetilde{T}_{1}^{(i)})(x_{i-1},\widetilde{x}_{i-1}), \end{aligned}$$

which in turn determines \(\widetilde{\Phi }\) in (9). By construction, \(\widetilde{\Phi }\) is a ProxNet of the form (2) with \(\sum _{i=1}^m m_i\ge m\) layers. As compared to \(\Phi \), we have augmented the input and intermediate spaces by \(\widetilde{\mathcal {H}}_i\). The composite structure of the maps \(\widetilde{T}_i\) allows us to choose input vectors \(\widetilde{x}_{i-1}\in \widetilde{\mathcal {H}}_{i-1}\) such that the first component of \(\widetilde{T}_i(x_{i-1},\widetilde{x}_{i-1})\) approximates \(T_i(x_{i-1})\) uniformly on a subset of \(\mathcal {H}_{i-1}\). As we show in Sect. 5.3 below, this enables us to solve large classes of variational inequalities with only one fixed ProxNet \(\widetilde{\Phi }\), which in turn approximates a data-to-solution operator, instead of employing a different fixed map \(\Phi :\mathcal {H}\rightarrow \mathcal {H}\) for every problem.

To formulate reasonable assumptions on \(\widetilde{\Phi }\), we denote for any \(i\in \{1,\ldots ,m-1\}\) by

$$\begin{aligned} P_{\mathcal {H}_i}&:\mathcal {H}_i\oplus \widetilde{\mathcal {H}}_i\rightarrow \mathcal {H}_i, \quad (x_i,\widetilde{x}_i)\mapsto x_i, \\ P_{\widetilde{\mathcal {H}}_i}&:\mathcal {H}_i\oplus \widetilde{\mathcal {H}}_i\rightarrow \widetilde{\mathcal {H}}_i, \quad (x_i,\widetilde{x}_i)\mapsto \widetilde{x}_i \end{aligned}$$

the projections onto the first and second components of an element in \(\mathcal {H}_i\oplus \widetilde{\mathcal {H}}_i\), respectively. Moreover, we define the closed ball \(B_r^{(i)}:=\{x_i\in \mathcal {H}_i|\, \Vert x_i\Vert _{\mathcal {H}_{i}}\le r\}\subset \mathcal {H}_{i}\) with radius \(r>0\).

Assumption 3.3

Let \(\Phi \) and \(\widetilde{\Phi }\) be proximal neural networks defined as in Eqs. (3) and (9), respectively. There are constants \(\widetilde{L}\in (0,1)\), \(\delta \ge 0\) and \(\Theta _1\ge \Theta _0\ge \Theta _2>0\) such that

1. \(\Phi \) satisfies Assumption 3.1 with \(\lambda \in (0,1]\) and \(L_{\Phi }\le \widetilde{L}\in (0,1)\).

2. It holds that

    $$\begin{aligned}&\left( \max _{i\in \{0,1,\ldots ,m\}}\prod _{j=1}^{i} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})} \right) \Theta _0 + \sum _{i=1}^{m} \left( \prod _{j=i+1}^{m} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})}\right) (\Vert b_{i}\Vert _{\mathcal {H}_i}+\delta ) \le \Theta _1, \\&\sum _{i=1}^{m} \left( \prod _{j=i+1}^{m} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})}\right) \Vert b_{i}\Vert _{\mathcal {H}_i} \le (1-\widetilde{L})\Theta _2, \end{aligned}$$

    as well as

    $$\begin{aligned} \Theta _2 + \frac{\delta }{(1-\widetilde{L})} \sum _{i=1}^m \left( \prod _{j=i+1}^{m} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})}\right) \le \Theta _0. \end{aligned}$$
3. There is a vector \(\widetilde{x}_0\in \widetilde{\mathcal {H}}_0\), such that for \(i\in \{1,\ldots ,m\}\), any \(x_{i-1}\in B_{\Theta _1}^{(i-1)}\subset \mathcal {H}_{i-1}\) and \(\widetilde{x}_i:=P_{\widetilde{\mathcal {H}}_i}\widetilde{T}_i(x_{i-1}, \widetilde{x}_{i-1})\) it holds

    $$\begin{aligned} \Vert T_i(x_{i-1}) - P_{\mathcal {H}_i}\widetilde{T}_i(x_{i-1},\widetilde{x}_{i-1}) \Vert _{\mathcal {H}_i} \le \delta . \end{aligned}$$

Before we derive error bounds, we provide an example to motivate the construction of \(\widetilde{\Phi }\) and Assumption 3.3.

Example 3.4

(Bias-to-solution operator) Let \(\Phi \) be as in Assumption 3.1 with \(m=2\) layers and network parameters \(R_i, W_i, b_i\) for \(i\in \{1,2\}\). We construct a ProxNet \(\widetilde{\Phi }\) that takes the bias vectors \(b_1,b_2\) of \(\Phi \) as inputs to represent \(\Phi \) for any choice of \(b_i\in \mathcal {H}_i\) and therefore, may be concatenated to map any choice of \(b_1,b_2\) to the respective solution \((x_1,x_2)\) of (8). In other words, we approximate the bias-to-solution operator

$$\begin{aligned} {O_{\mathrm{bias}}}:\mathcal {H}_1\oplus \mathcal {H}_2\mapsto \mathcal {H}_1\oplus \mathcal {H}_2,\; (b_1,b_2)\mapsto (x_1,x_2). \end{aligned}$$

To this end, we set \(\widetilde{\mathcal {H}}_0=\mathcal {H}_1\oplus \mathcal {H}_2\), \(\widetilde{\mathcal {H}}_1=\mathcal {H}_2\), \(m_1=m_2=1\), \(b_{1}^{(i)}=0\in \mathcal {H}_i\oplus \widetilde{\mathcal {H}}_i\) and

$$\begin{aligned} W_1^{(1)}&: \mathcal {H}\oplus \mathcal {H}_1\oplus \mathcal {H}_2 \rightarrow \mathcal {H}_1\oplus \mathcal {H}_2, \quad&(x,x_1,x_2) \mapsto (W_1x + x_1, x_2) \\ W_1^{(2)}&: \mathcal {H}_1\oplus \mathcal {H}_2 \rightarrow \mathcal {H}_2, \quad&(x_1,x_2) \mapsto W_2x_1 + x_2,\\ R_1^{(1)}&: \mathcal {H}_1\oplus \mathcal {H}_2\rightarrow \mathcal {H}_1\oplus \mathcal {H}_2, \quad&(x_1,x_2) \mapsto (R_1(x_1), x_2), \\ R_1^{(2)}&: \mathcal {H}_2\rightarrow \mathcal {H}_2, \quad&x_2 \mapsto R_2(x_2). \\ \end{aligned}$$

Note that \(R_1^{(1)}=\mathrm {prox}_{\psi _1^{(1)}}\) with \(\psi _1^{(1)}(x_1,x_2):=\psi _1(x_1)\) for any \((x_1,x_2)\in \mathcal {H}_1\oplus \mathcal {H}_2\), where \(\psi _1\) determines \(R_1=\mathrm {prox}_{\psi _1}\). Hence, \(R_1^{(1)}\in \mathcal {A}(\mathcal {H}_1\oplus \widetilde{\mathcal {H}}_1)\), and it follows with \(\widetilde{x}_0:=(b_1,b_2)\in \mathcal {H}_1\oplus \mathcal {H}_2\) for any \(x\in \mathcal {H}\) and \(x_1\in \mathcal {H}_1\) that

$$\begin{aligned}&T_1(x) = R_1(W_1x+b_1) = P_{\mathcal {H}_1} (R_1(W_1x+b_1), b_2) = P_{\mathcal {H}_1} R_1^{(1)}(W_1^{(1)}(x,\widetilde{x}_0)) = P_{\mathcal {H}_1} \widetilde{T}_1(x, \widetilde{x}_0), \\&T_2(x_1) = R_2(W_2x_1+b_2) = R_1^{(2)}(W_1^{(2)}(x_1,b_2)) = P_{\mathcal {H}_2} R_1^{(2)}\big (W_1^{(2)}\big (x_1,P_{\widetilde{\mathcal {H}}_1}\widetilde{T}_1(x,\widetilde{x}_0)\big )\big ). \end{aligned}$$

Therefore, the last part of Assumption 3.3 holds with \(\delta =0\) for arbitrarily large \(\Theta _1>0\) and hence, the constants \(\Theta _0,\Theta _1,\Theta _2\) do not play any role in this example. The generalization to \(m>2\) layers follows by a similar construction of \(\widetilde{\Phi }\).

Now, let \((x_1, x_2)\) be the solution to (8) for any choice \((b_1, b_2)\in \mathcal {H}_1\oplus \mathcal {H}_2\). It follows from Theorem 3.2 that the operator

$$\begin{aligned} {\widetilde{O}_{\mathrm{bias}}:\mathcal {H}_1\oplus \mathcal {H}_2} \rightarrow \mathcal {H}, \quad (b_1,b_2)\mapsto \underbrace{\widetilde{\Phi }(\cdot , b_1, b_2) \bullet \cdots \bullet \widetilde{\Phi }(\cdot , b_1, b_2)}_{k \text { times}} (x^0) \end{aligned}$$

satisfies \(x_2 \approx {\widetilde{O}_{\mathrm{bias}}(b_1, b_2)}\) and \(x_1\approx T_1( {\widetilde{O}_{\mathrm{bias}}(b_1, b_2))}\) for any fixed \(x^0\in \mathcal {H}\) and any tuple \((b_1, b_2)\in \mathcal {H}_1\oplus \mathcal {H}_2\), for a sufficiently large number k of concatenations of \(\widetilde{\Phi }(\cdot , b_1, b_2)\).
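A finite-dimensional sketch of this construction (Python/NumPy; the weights, the choice \(R_i=\) ReLU, and the names phi_tilde and O_bias are our own illustrative assumptions) shows how a single set of weights serves every bias pair \((b_1,b_2)\):

```python
import numpy as np

rng = np.random.default_rng(2)
d, lam = 5, 1.0
relu = lambda x: np.maximum(x, 0.0)           # R_1 = R_2 = ReLU (proximal activations)

def scaled(s):                                # random weight with ||W||_2 = s
    W = rng.standard_normal((d, d))
    return W * s / np.linalg.norm(W, 2)

W1, W2 = scaled(0.7), scaled(0.8)             # ||W_1||*||W_2|| < 1 (Assumption 3.1)

def phi_tilde(x, b1, b2):
    """Augmented ProxNet of Example 3.4: the biases are network *inputs*,
    so one fixed set of weights serves every instance of problem (8)."""
    x1 = relu(W1 @ x + b1)                    # first component of T~_1(x, (b1, b2))
    x2 = relu(W2 @ x1 + b2)                   # = T_2(x_1), with b2 carried along
    return (1 - lam) * x + lam * x2

def O_bias(b1, b2, k=80, x0=None):
    """k-fold concatenation of phi_tilde(., b1, b2): approximates (x_1, x_2)."""
    x = np.zeros(d) if x0 is None else x0
    for _ in range(k):
        x = phi_tilde(x, b1, b2)
    return relu(W1 @ x + b1), x               # (x_1, x_2) up to iteration error

b1, b2 = rng.standard_normal(d), rng.standard_normal(d)
x1, x2 = O_bias(b1, b2)
print("residual ||T_2(x_1) - x_2|| =", np.linalg.norm(relu(W2 @ x1 + b2) - x2))
```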

The augmented ProxNet \(\widetilde{\Phi }\) may also be utilized to consider parametric families of obstacle problems, as shown in Example 4.4 below. Therein, the parametrization is with respect to the proximity operators \(R_i\) instead of the bias vectors \(b_i\), and we construct an approximate obstacle-to-solution operator in the fashion of Example 3.4. In the finite-dimensional case (where the linear operators \(W_i\) correspond to matrices), the input of \(\widetilde{\Phi }\) may even be augmented by a suitable space of operators, see Sect. 5.3 below for a detailed discussion. We conclude this section with a perturbation estimate that allows us to approximate the fixed-point of \(\Phi \) by the augmented NN \(\widetilde{\Phi }\).

Theorem 3.5

Let \(\Phi \) and \(\widetilde{\Phi }\) be proximal neural networks as in Eqs. (3) and (9) that satisfy Assumption 3.3, and denote by \(x^*\in \mathcal {H}\) the unique fixed-point of \(\Phi \) from Theorem 3.2. Let \(x^0\in B_{\Theta _2}^{(0)}\) be arbitrary, let \(\widetilde{x}_0\) be as in Assumption 3.3 and define the sequence \(\widetilde{x}^{k+1}:=\widetilde{\Phi }(\widetilde{x}^k, \widetilde{x}_0)\) for \(k\in \mathbb {N}_0\), where \(\widetilde{x}^0:=x^0\). Then, there is a constant \(C>0\) which is independent of \(\delta >0\) and \(\widetilde{x}_0\), such that for any \(k\in \mathbb {N}\), it holds

$$\begin{aligned} \Vert x^*-\widetilde{x}^k\Vert _\mathcal {H}\le C\left( \widetilde{L}_\lambda ^k + \delta \right) , \end{aligned}$$

where \(\widetilde{L}_\lambda :=(1-\lambda )+\lambda \widetilde{L}<1\).

Proof

Let \(x\in B_{\Theta _0}^{(0)}\) and let \(\widetilde{x}_0\in \widetilde{\mathcal {H}}_0\) be as in Assumption 3.3. We define \(v_0:=x\), \(v_{i}:=P_{\mathcal {H}_{i}}(\widetilde{T}_{i}\circ \cdots \circ \widetilde{T}_1)(x, \widetilde{x}_0)\in \mathcal {H}_i\) for \(i\in \{1,\ldots ,m-1\}\), and \(v_m:=(\widetilde{T}_{m}\circ \cdots \circ \widetilde{T}_1)(x, \widetilde{x}_0)\in \mathcal {H}\). With \(\widetilde{x}_i:=P_{\widetilde{\mathcal {H}}_i}\widetilde{T}_i(v_{i-1}, \widetilde{x}_{i-1})\) and the convention that \(P_{\mathcal {H}_m}=\text {id}\), we obtain the recursion formula

$$\begin{aligned} v_i=P_{\mathcal {H}_{i}}\widetilde{T}_i(v_{i-1}, \widetilde{x}_{i-1}), \quad i\in \{1,\ldots ,m\}. \end{aligned}$$
(10)

We now show by induction that \(\Vert v_i\Vert _{\mathcal {H}_i}\le \Theta _1\) for \(i\in \{0,\ldots ,m\}\). By Assumption 3.3 it holds

$$\begin{aligned} \Vert v_0\Vert _{\mathcal {H}_0}&=\Vert x\Vert _\mathcal {H}\\&\le \Theta _0 \\&= \left( \prod _{j=1}^{0} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})} \right) \Theta _0 + \sum _{j=1}^{0} \left( \prod _{\ell =j+1}^{0} \Vert W_{\ell }\Vert _{\mathcal {L}(\mathcal {H}_{\ell -1},\mathcal {H}_{\ell })}\right) (\Vert b_{j}\Vert _{\mathcal {H}_j}+\delta ) \\&\le \Theta _1. \end{aligned}$$

Now, let

$$\begin{aligned} \Vert v_i\Vert _{\mathcal {H}_i} \le \left( \prod _{j=1}^{i} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})} \right) \Theta _0 + \sum _{j=1}^{i} \left( \prod _{\ell =j+1}^{i} \Vert W_{\ell }\Vert _{\mathcal {L}(\mathcal {H}_{\ell -1},\mathcal {H}_{\ell })}\right) (\Vert b_{j}\Vert _{\mathcal {H}_j} +\delta ) \end{aligned}$$

hold for a fixed \(i\in \{0,\ldots ,m-1\}\). Assumption 3.3 yields with Eq. (10)

$$\begin{aligned} \Vert T_{i+1}(v_{i})-v_{i+1}\Vert _{\mathcal {H}_{i+1}} = \Vert T_{i+1}(v_i) -P_{\mathcal {H}_{i+1}}\widetilde{T}_{i+1}(v_i, \widetilde{x}_i)\Vert _{\mathcal {H}_{i+1}} \le \delta . \end{aligned}$$

Using \(\Vert R_{i+1}(x)\Vert _{\mathcal {H}_{i+1}}\le \Vert x\Vert _{\mathcal {H}_{i+1}}\) for \(x\in \mathcal {H}_{i+1}\) then yields together with the triangle inequality and the induction hypothesis

$$\begin{aligned} \Vert v_{i+1}\Vert _{\mathcal {H}_{i+1}}&\le \delta + \Vert T_{i+1}(v_{i})\Vert _{\mathcal {H}_{i+1}}\\&\le \delta + \Vert W_{i+1}\Vert _{\mathcal {L}(\mathcal {H}_{i},\mathcal {H}_{i+1})} \Vert v_{i}\Vert _{\mathcal {H}_i} + \Vert b_{i+1}\Vert _{\mathcal {H}_{i+1}} \\&\le \left( \prod _{j=1}^{i+1}\Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})} \right) \Theta _0 + \sum _{j=1}^{i+1} \left( \prod _{\ell =j+1}^{i+1} \Vert W_{\ell }\Vert _{\mathcal {L}(\mathcal {H}_{\ell -1},\mathcal {H}_{\ell })}\right) (\Vert b_{j}\Vert _{\mathcal {H}_j}+\delta ) \\&\le \Theta _1, \end{aligned}$$

and hence, \(v_i\in B_{\Theta _1}^{(i)}\) for all \(i\in \{0,\ldots ,m\}\). With Assumption 3.3 and Eq. (10), we further obtain for each \(x\in B_{\Theta _0}^{(0)}\)

$$\begin{aligned}&\frac{1}{\lambda }\Vert \Phi (x)-\widetilde{\Phi }(x, \widetilde{x}_0)\Vert _{\mathcal {H}_m}\\&\quad =\Vert (T_{m}\circ \cdots \circ T_1)(x)- v_m\Vert _\mathcal {H}\\&\quad \le \Vert (T_{m}\circ \cdots \circ T_1)(x)- T_{m}(v_{m-1}) \Vert _\mathcal {H}+ \Vert T_{m}(v_{m-1}) -\widetilde{T}_m(v_{m-1},\widetilde{x}_{{m-1}})\Vert _\mathcal {H}\\&\quad \le \Vert W_m\Vert _{\mathcal {L}(\mathcal {H}_{m-1},\mathcal {H}_m)} \Vert (T_{m-1}\circ \cdots \circ T_1)(x)- v_{m-1}\Vert _{\mathcal {H}_{m-1}} + \delta , \end{aligned}$$

and by iterating this estimate over \(i\in \{1,\ldots ,m\}\)

$$\begin{aligned} \begin{aligned} \Vert \Phi (x)-\widetilde{\Phi }(x, \widetilde{x}_0)\Vert _{\mathcal {H}_m}&\le \lambda \delta \sum _{i=1}^m \left( \prod _{j=i+1}^{m} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})}\right) =:\lambda \delta C_{\Phi }. \end{aligned} \end{aligned}$$
(11)

Now, let \(x^*\in \mathcal {H}\) be the unique fixed-point of \(\Phi \) as in Theorem 3.2, let \(x^k=\Phi (x^{k-1})\) and \(\widetilde{x}^k=\widetilde{\Phi }(\widetilde{x}^{k-1},\widetilde{x}_0)\) for any \(k\in \mathbb {N}\) and a given initial value \(x^0=\widetilde{x}^0\in \mathcal {H}\) with \(\Vert x^0\Vert _\mathcal {H}\le \Theta _2\). We obtain as in the proof of Theorem 3.2

$$\begin{aligned} \begin{aligned} \Vert x^1\Vert _\mathcal {H}&\le \Vert \Phi (x^0)-\Phi (0)\Vert _\mathcal {H}+ \Vert \Phi (0)\Vert _\mathcal {H}\\&\le L_{\Phi ,\lambda }\Vert x^{0}\Vert _\mathcal {H}+\lambda \sum _{i=1}^{m} \left( \prod _{j=i+1}^{m} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})}\right) \Vert b_{i}\Vert _{\mathcal {H}_i} \\&\le (1-\lambda )\Theta _2 +\lambda \left( \widetilde{L}\Theta _2+\sum _{i=1}^{m} \left( \prod _{j=i+1}^{m} \Vert W_{j}\Vert _{\mathcal {L}(\mathcal {H}_{j-1},\mathcal {H}_{j})}\right) \Vert b_{i}\Vert _{\mathcal {H}_i}\right) \\&\le \Theta _2, \end{aligned} \end{aligned}$$
(12)

where we have used that \(L_{\Phi ,\lambda }=(1-\lambda )+\lambda L_{\Phi }\le (1-\lambda )+\lambda \widetilde{L}\) and Assumption 3.3. Hence, we have \(\Vert x^k\Vert _\mathcal {H}\le \Theta _2\) inductively for all \(k\in \mathbb {N}\). In the next step, we show that \(\Vert \widetilde{x}^k\Vert _\mathcal {H}\le \Theta _0\) by induction over k. First, we obtain with \(\Vert x^0\Vert \le \Theta _2\le \Theta _0\), (11) and (12) that

$$\begin{aligned} \Vert \widetilde{x}^1\Vert _\mathcal {H}= \Vert \widetilde{\Phi }(x^0,\widetilde{x}_0)\Vert _\mathcal {H}\le \Vert \widetilde{\Phi }(x^0,\widetilde{x}_0)-\Phi (x^0)\Vert _\mathcal {H}+ \Vert \Phi (x^0)\Vert _\mathcal {H}\le \lambda \delta C_{\Phi }+\Theta _2. \end{aligned}$$

Thus, \(\Vert \widetilde{x}^1\Vert _\mathcal {H}\le \Theta _0\) follows with Assumption 3.3 on the relation of \(\Theta _0\) and \(\Theta _2\) as \(\lambda (1-\widetilde{L})<1\). Using the induction hypothesis \(\Vert \widetilde{x}^k-x^k\Vert _\mathcal {H}\le \lambda \delta C_{\Phi }\sum _{j=0}^{k-1}\widetilde{L}_{\lambda }^j\) for a fixed \(k\in \mathbb {N}\), \(\Vert x^k\Vert _\mathcal {H}\le \Theta _2\), and \(L_{\Phi ,\lambda }\le \widetilde{L}_\lambda :=(1-\lambda )+\lambda \widetilde{L}<1\) yields similarly

$$\begin{aligned} \Vert \widetilde{x}^{k+1}\Vert _\mathcal {H}&\le \Vert \widetilde{\Phi }(\widetilde{x}^{k},\widetilde{x}_0)-\Phi (\widetilde{x}^{k})\Vert _\mathcal {H}+ \Vert \Phi (\widetilde{x}^{k})-\Phi (x^{k})\Vert _\mathcal {H}+ \Vert \Phi (x^{k})\Vert _\mathcal {H}\\&\le \lambda \delta C_{\Phi }+L_{\Phi ,\lambda }\Vert \widetilde{x}^{k}-x^{k}\Vert _\mathcal {H}+\Theta _2 \\&\le \lambda \delta C_{\Phi }\sum _{j=0}^k\widetilde{L}_\lambda ^j+\Theta _2, \end{aligned}$$

and hence, \(\Vert \widetilde{x}^k\Vert _\mathcal {H}\le \lambda \delta C_{\Phi }/(\lambda (1-\widetilde{L})) + \Theta _2\le \Theta _0\) holds by induction for all \(k\in \mathbb {N}\). We apply the bounds from Theorem 3.2 and (11) and conclude the proof by deriving

$$\begin{aligned} \Vert x^*-\widetilde{x}^k\Vert&\le \Vert x^*- x^k\Vert + \Vert \Phi (x^{k-1})-\Phi (\widetilde{x}^{k-1})\Vert + \Vert \Phi (\widetilde{x}^{k-1}) -\widetilde{\Phi }(\widetilde{x}^{k-1},\widetilde{x}_0)\Vert \\&\le \frac{\Vert x^1-x^0\Vert }{1-L_{\Phi ,\lambda }}L_{\Phi ,\lambda }^k + L_{\Phi ,\lambda }\Vert x^{k-1}-\widetilde{x}^{k-1}\Vert _\mathcal {H}+ \lambda \delta C_{\Phi }\\&\le \frac{\Vert \Phi (x^0)-x^0\Vert }{1-\widetilde{L}_\lambda }\widetilde{L}_\lambda ^k + \lambda \delta C_{\Phi } \sum _{j=0}^{k-1} \widetilde{L}_\lambda ^j\\&\le \frac{\max (2\Theta _0, \lambda C_{\Phi })}{1-\widetilde{L}_\lambda } \left( \widetilde{L}_\lambda ^k + \delta \right) . \end{aligned}$$

\(\square \)

4 Variational inequalities in Hilbert spaces

In the previous sections, we have considered a ProxNet model and derived the associated variational inequalities. Now, we use the variational inequality as a starting point and derive suitable ProxNets for its (numerical) solution. Let \((\mathcal {H},(\cdot ,\cdot )_\mathcal {H})\) be a separable Hilbert space with topological dual space denoted by \(\mathcal {H}'\), and let \(_{\mathcal {H}'}\langle {\cdot },{\cdot }\rangle _{\mathcal {H}}\) be the associated dual pairing. Let \(a:\mathcal {H}\times \mathcal {H}\rightarrow \mathbb {R}\) be a bilinear form, let \(f:\mathcal {H}\rightarrow \mathbb {R}\) be a functional, and let \(\mathcal {K}\subset \mathcal {H}\) be a subset of \(\mathcal {H}\). We consider the variational inequality problem

$$\begin{aligned} {\text {find }} u\in \mathcal {K}: \quad a(u,v-u) \ge f(v-u),\quad \forall v\in \mathcal {K}. \end{aligned}$$
(13)

Assumption 4.1

The bilinear form \(a:\mathcal {H}\times \mathcal {H}\rightarrow \mathbb {R}\) is bounded and coercive on \(\mathcal {H}\), i.e., there exist constants \(C_-,C_+>0\) such that for any \(v,w\in \mathcal {H}\) it holds

$$\begin{aligned} a(v,w)\le C_+\Vert v\Vert _\mathcal {H}\Vert w\Vert _\mathcal {H}\quad \text {and}\quad a(v,v)\ge C_-\Vert v\Vert ^2_\mathcal {H}. \end{aligned}$$

Moreover, \(f\in \mathcal {H}'\) and \(\mathcal {K}\subset \mathcal {H}\) is nonempty, closed and convex.

Problem (13) arises in various applications in the natural sciences, engineering and finance. It is well-known that there exists a unique solution \(u\in \mathcal {K}\) under Assumption 4.1, see, e.g., [14, Theorem A.3.3] for a proof. We also mention that well-posedness of Problem (13) is ensured under weaker conditions than Assumption 4.1; in particular, the coercivity requirement may be relaxed as shown in [8]. For this article, however, we focus on the bounded and coercive case in order to obtain numerical convergence rates for ProxNet approximations.

4.1 Fixed-point approximation by ProxNets

Theorem 4.2

Let Assumption 4.1 hold, and define \(\mathcal {H}_1:=\mathcal {H}_0:=\mathcal {H}\). Then, there exists a one-layer ProxNet \(\Phi \) as in Eq. (3) such that \(u\in \mathcal {K}\) is the unique fixed-point of \(\Phi \). Furthermore, for a given \(u^0\in \mathcal {H}\) define the iteration \(u^k:=\Phi (u^{k-1})\), \(k\in \mathbb {N}\).

Then, there are constants \(L_{\Phi ,\lambda }\in (0,1)\) and \(C=C(u^0)>0\) such that

$$\begin{aligned} \Vert u-u^k\Vert \le CL_{\Phi ,\lambda }^k, \quad k\in \mathbb {N}. \end{aligned}$$
(14)

Proof

We recall the fixed-point argument, e.g., in [14, Theorem A.3.3], for proving existence and uniqueness of u, since it is the basis for the ensuing ProxNet construction: Assumption 4.1 ensures that \(a(v,\cdot ), f\in \mathcal {H}'\) for any \(v\in \mathcal {H}\). The Riesz representation theorem yields the existence of \(A\in \mathcal {L}(\mathcal {H})\) and \(F\in \mathcal {H}\) such that for all \(v,w \in \mathcal {H}\)

$$\begin{aligned} (Av,w)_\mathcal {H}= a(v,w) \quad \text {and}\quad (F,v)_\mathcal {H}=f(v). \end{aligned}$$

Since \(\mathcal {K}\) is closed convex, the \(\mathcal {H}\)-orthogonal projection \(P_\mathcal {K}:\mathcal {H}\rightarrow \mathcal {K}\) onto \(\mathcal {K}\) is well-defined and for any \(\omega >0\) there holds

$$\begin{aligned} u \text { solves } (13) \qquad \Longleftrightarrow \qquad u = P_\mathcal {K}(\omega (F-Au)+u). \end{aligned}$$

Hence, u is a fixed-point of the mapping

$$\begin{aligned} T_\omega :\mathcal {H}\rightarrow \mathcal {H}, \quad v\mapsto P_\mathcal {K}(\omega (F-Av)+v). \end{aligned}$$

By Assumption 4.1, it is now possible to choose \(\omega >0\) sufficiently small, so that \(T_\omega \) is a contraction on \(\mathcal {H}\), which proves existence and uniqueness of u. The optimal relaxation parameter in terms of the bounds \(C_-,C_+\) is \(\omega ^*=C_-/C_+^2\), leading to a Lipschitz constant \(\mathrm {Lip}(T_{\omega ^*})\le \Vert I-\omega ^*A\Vert _{\mathcal {L}(\mathcal {H})}\) with \(\Vert I-\omega ^*A\Vert _{\mathcal {L}(\mathcal {H})}^2\le 1-C_-^2/C_+^2<1\), see, e.g., [14, Theorem A.3.3].

To transfer this constructive proof of existence and uniqueness of solutions to the ProxNet setting, we denote by \(\iota _\mathcal {K}\) the indicator function of \(\mathcal {K}\) given by

$$\begin{aligned} \iota _\mathcal {K}:\mathcal {H}\rightarrow (-\infty ,\infty ],\quad v\mapsto {\left\{ \begin{array}{ll} 0,\quad &{}\text {if }v\in \mathcal {K},\\ \infty ,\quad &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Since \(\mathcal {K}\) is closed convex, it holds that \(\iota _\mathcal {K}\in \Gamma _0(\mathcal {H})\) and \(\mathrm {prox}_{\iota _\mathcal {K}}=P_\mathcal {K}\) (cf. [2, Examples 1.25 and 12.25]). Now, let \(m=1\), \(\mathcal {H}_1=\mathcal {H}\), \(W_1:=I-\omega A\in \mathcal {L}(\mathcal {H})\), \(b_1:=\omega F\in \mathcal {H}\), and \(R_1:=\mathrm {prox}_{\iota _\mathcal {K}}\), where \(\omega >0\) is such that \(I-\omega A\) is a contraction.

The ProxNet emulation \(\Phi \) of the contraction map reads: for a parameter \(\lambda \in (0,1]\),

$$\begin{aligned} \Phi :\mathcal {H}\rightarrow \mathcal {H}, \quad v\mapsto (1-\lambda )v+\lambda \underbrace{R_1(W_1v+b_1)}_{:=T_1(v)}. \end{aligned}$$

Since \(\Vert W_1\Vert _{\mathcal {L}(\mathcal {H})}<1\), Assumption 3.1 is satisfied for every \(\lambda \in (0,1]\). Theorem 3.2 yields that the iteration \(u^k:=\Phi (u^{k-1})\) converges for any \(u^0\in \mathcal {H}\) to a unique fixed-point \(u^*\in \mathcal {H}\) with error bounded by (14) and \(L_{\Phi ,\lambda }:=(1-\lambda )+\lambda \Vert W_1\Vert _{\mathcal {L}(\mathcal {H})}\in (0,1)\). Since \(\Phi (v)=(1-\lambda )v+\lambda T_1(v)\), it follows that \(u^*\) is in turn the unique fixed-point of \(T_1\), hence \(u=u^*\), which proves the claim. \(\square \)
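The construction in this proof is directly implementable once \(\mathcal {H}\) is finite-dimensional. The following sketch (Python/NumPy; the matrix, load, obstacle and the choice \(\omega =2/(C_-+C_+)\) are our own illustrative assumptions, the latter being admissible for symmetric A since it also yields \(\Vert I-\omega A\Vert <1\); the proof's choice \(\omega ^*=C_-/C_+^2\) works as well but contracts more slowly here) realizes \(\Phi (v)=P_\mathcal {K}(W_1v+b_1)\) for an obstacle set \(\mathcal {K}=\{v\ge g\}\) and iterates it to the fixed point:

```python
import numpy as np

# Minimal finite-dimensional illustration of Theorem 4.2 (assumed toy data):
# a(v, w) = v^T A w with A symmetric positive definite, f(v) = F^T v, and
# K = {v in R^d : v >= g componentwise}, hence P_K(v) = max(v, g).
d = 10
A = 2.0 * np.eye(d) - np.eye(d, k=1) - np.eye(d, k=-1)    # SPD matrix
F = np.where(np.arange(d) < d // 2, 1.0, -3.0)            # load with sign change
g = np.zeros(d)                                           # obstacle

eigs = np.linalg.eigvalsh(A)
C_minus, C_plus = eigs.min(), eigs.max()
omega = 2.0 / (C_minus + C_plus)     # for symmetric A this yields ||I - omega*A||_2 < 1

W1 = np.eye(d) - omega * A           # W_1 = I - omega*A
b1 = omega * F                       # b_1 = omega*F
P_K = lambda v: np.maximum(v, g)     # R_1 = prox_{iota_K} = P_K

def Phi(v, lam=1.0):                 # one-layer ProxNet (3)
    return (1.0 - lam) * v + lam * P_K(W1 @ v + b1)

u = np.zeros(d)
for _ in range(1000):                # fixed-point iteration of Theorem 4.2
    u = Phi(u)

# The fixed point satisfies the complementarity conditions of the discrete VI:
# u >= g, A u - F >= 0 and (u - g)^T (A u - F) = 0, i.e. min(Au - F, u - g) = 0.
print("max |min(Au - F, u - g)| =", np.abs(np.minimum(A @ u - F, u - g)).max())
```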

Remark 4.3

In the fashion of Example 3.4, we may construct an augmented ProxNet \(\widetilde{\Phi }:\mathcal {H}\oplus \mathcal {H}\rightarrow \mathcal {H}\) such that \(\widetilde{\Phi }(v,F)=\Phi (v)\) for any \(v\in \mathcal {H}\), where \(F\in \mathcal {H}\) is the Riesz representer of \(f\in \mathcal {H}'\) in Problem (13). The only difference is that F has to be multiplied by \(\omega \) in the first linear transform to obtain \(b_1=\omega F\) instead of F as bias vector. The parameters of \(\widetilde{\Phi }\) in this construction are independent of F; hence, Theorem 3.5 yields that for any \(f\in \mathcal {H}'\) (resp. \(F\in \mathcal {H}\)) and \(x^0\in \mathcal {H}\) it holds

$$\begin{aligned} \Vert u-\widetilde{u}^k\Vert \le CL_{\Phi ,\lambda }^k, \quad k\in \mathbb {N}, \end{aligned}$$

where \(\widetilde{u}^k:=\widetilde{\Phi }(\widetilde{u}^{k-1},F)\) and \(\widetilde{u}^0:=x^0\). \(\square \)

The previous remark shows that one fixed ProxNet is sufficient to solve Problem (13) for any \(f\in \mathcal {H}'\). A similar result is achieved if the set \(\mathcal {K}\subset \mathcal {H}\) associated with Problem (13) is parameterized by a suitable family of functions:

Example 4.4

(Obstacle-to-solution operator) Let \(\mathcal {H}\) be a Hilbert space of real-valued functions over a domain \(\mathcal {D}\subset \mathbb {R}^d\) such that \(C(\mathcal {D})\cap \mathcal {H}\) is a dense subset, e.g., \(\mathcal {H}=L^2(\mathcal {D})\) or \(\mathcal {H}=H^1(\mathcal {D})\), and let \(\mathcal {K}:=\{v\in \mathcal {H}|\, v\ge g \text { almost everywhere}\}\) for a sufficiently smooth function \(g:\mathcal {D}\rightarrow \mathbb {R}\). With this choice of \(\mathcal {K}\), (13) is an obstacle problem and \(P_\mathcal {K}(v)=\max (v,g)\) holds for any \(v\in \mathcal {H}\cap C(\mathcal {D})\). We construct a ProxNet approximation to the obstacle-to-solution operator \({O_{obs}}:\mathcal {H}\rightarrow \mathcal {H},\; g\mapsto u\) corresponding to Problem (13) with \(\mathcal {K}=\{v\in \mathcal {H}|\, v\ge g \text { almost everywhere}\}\).

Assume that \(\Phi (v)=P_\mathcal {K}(W_1v+b_1)\), where \(W_1\in \mathcal {L}(\mathcal {H})\) and \(b_1\in \mathcal {H}\) are as in Theorem 4.2, and let \(\mathcal {K}_0:=\{v\in \mathcal {H}|\, v\ge 0 \text { almost everywhere}\}\). To obtain a ProxNet that uses the obstacle \(g\in \mathcal {H}\) as input, we define

$$\begin{aligned} \widetilde{\Phi }:\mathcal {H}\oplus \mathcal {H}\rightarrow \mathcal {H}, \quad (v,g)\mapsto \widetilde{T}_1(v,g) = (\widetilde{T}_{2}^{(1)}\circ \widetilde{T}_{1}^{(1)})(v,g) \end{aligned}$$

via \(\widetilde{T}_{j_1}^{(1)}(v,g):= R_{j_1}^{(1)}(W_{j_1}^{(1)}(v,g)+b_{j_1}^{(1)})\) which are, for \(j_1\in \{1,2\}\), defined by

$$\begin{aligned}&W_{1}^{(1)}:\mathcal {H}\oplus \mathcal {H}\rightarrow \mathcal {H}\oplus \mathcal {H},\; (v_1,v_2)\mapsto (W_1v_1 - v_2, v_2), \\&b_{1}^{(1)}:= (b_1, 0)\in \mathcal {H}\oplus \mathcal {H}, \quad R_{1}^{(1)}:= \mathrm {prox}_{\psi _{1}^{(1)}}, \quad \psi _{1}^{(1)}(v,g):=\iota _{\mathcal {K}_0}(v),\\&W_{2}^{(1)}:\mathcal {H}\oplus \mathcal {H}\rightarrow \mathcal {H},\; (v_1,v_2) \mapsto v_1+v_2, \quad b_{2}^{(1)}:= 0\in \mathcal {H}, \quad R_{2}^{(1)}:=\mathrm {id}\in \mathcal {A}(\mathcal {H}). \end{aligned}$$

Note that this yields \(W_{1}^{(1)}\in \mathcal {L}(\mathcal {H}\oplus \mathcal {H})\), \(W_{2}^{(1)}\in \mathcal {L}(\mathcal {H})\), and \(R_{1}^{(1)}(v_1,v_2) = (P_{\mathcal {K}_0}v_1,v_2)\) for all \(v_1,v_2\in \mathcal {H}\). It now follows for any given \(v,g\in \mathcal {H}\) and \(\mathcal {K}:=\{v\in \mathcal {H}|\, v\ge g \text { almost everywhere}\}\)

$$\begin{aligned} \Phi (v)&=P_\mathcal {K}(W_1v+b_1) \\&=P_{\mathcal {K}_0}(W_1v+b_1-g)+g \\&= R_{2}^{(1)} (W_{2}^{(1)} (P_{\mathcal {K}_0}(W_1v+b_1-g),g) + b_{2}^{(1)}) \\&= \widetilde{T}_{2}^{(1)}\left( (P_{\mathcal {K}_0}(W_1v+b_1-g),g) \right) \\&=\widetilde{T}_{2}^{(1)}(R_{1}^{(1)}(W_{1}^{(1)}(v,g)+b_{1}^{(1)})) \\&=\widetilde{\Phi }(v,g). \end{aligned}$$

As in Example 3.4, we concatenate \(\widetilde{\Phi }\) to obtain for a fixed choice of \(x^0\in \mathcal {H}\) the operator

$$\begin{aligned} {\widetilde{O}_{obs}:\mathcal {H}} \rightarrow \mathcal {H}, \quad g\mapsto \left[ \widetilde{\Phi }(\cdot , g) \bullet \cdots \bullet \widetilde{\Phi }(\cdot , g) \right] (x^0). \end{aligned}$$

Convergence of \({\widetilde{O}_{obs}(g)}\) to u, for any \(g\in \mathcal {H}\) and arbitrary a-priori fixed \(x^0\in \mathcal {H}\), at a contraction rate that is uniform with respect to \(g\in \mathcal {H}\), is again guaranteed as the number of concatenations tends to infinity. Therefore, as in Example 3.4, there exists one ProxNet \(\widetilde{\Phi }\) that approximately solves a family of obstacle problems with obstacle ‘parameter’ \(g\in \mathcal {H}\). \(\square \)
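The key identity used above, \(P_\mathcal {K}(v)=P_{\mathcal {K}_0}(v-g)+g\), is elementary to verify numerically. A finite-dimensional sketch (Python/NumPy; names and data are our own) compares the network with hard-wired obstacle to its augmented counterpart:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4
W1 = 0.5 * np.eye(d)                              # any bounded linear map works here
b1, g, v = (rng.standard_normal(d) for _ in range(3))

def Phi(v):                                       # obstacle g hard-wired via P_K
    return np.maximum(W1 @ v + b1, g)             # P_K(x) = max(x, g)

def Phi_tilde(v, g):                              # augmented net: obstacle as input
    return np.maximum(W1 @ v + b1 - g, 0.0) + g   # P_{K_0}(x - g) + g

assert np.allclose(Phi(v), Phi_tilde(v, g))
print("P_K(W1 v + b1) = P_{K_0}(W1 v + b1 - g) + g verified")
```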

A combination of the ProxNets from Remark 4.3 and Example 4.4 enables us to consider both f and \(\mathcal {K}\) in (13) as input variables of a suitable NN \(\widetilde{\Phi }:\mathcal {H}\oplus \mathcal {H}\oplus \mathcal {H}\rightarrow \mathcal {H}\). This allows, in particular, the construction of an approximation of the data-to-solution operator to Problem (13) that maps \((F,g)\in \mathcal {H}\oplus \mathcal {H}\) to u.

5 Example: linear matrix complementarity problems

Common examples for Problem (13) arise in financial and engineering applications, where the bilinear form \(a:\mathcal {H}\times \mathcal {H}\rightarrow \mathbb {R}\) stems from a second-order elliptic or parabolic differential operator. In this case, \(\mathcal {H}\subset H^s(\mathcal {D})\), where \(H^s(\mathcal {D})\) is the Sobolev space of smoothness \(s>0\) on the spatial domain \(\mathcal {D}\subset \mathbb {R}^n\), \(n\in \mathbb {N}\). Coercivity and boundedness of a as in Assumption 4.1 often arise naturally in this setting. To obtain a computationally tractable problem, it is necessary to discretize (13), for instance by a Galerkin approximation with respect to a finite-dimensional subspace \(\mathcal {H}_d\subset \mathcal {H}\). To illustrate this, we assume that \(\mathcal {H}_d\subset \mathcal {H}\) is a suitable finite-dimensional subspace with \(\dim (\mathcal {H}_d)=d\in \mathbb {N}\) and basis \(\{v_1,\ldots ,v_d\}\), and consider an obstacle problem with \(\mathcal {K}=\{v\in \mathcal {H}|\,v\ge g \text { almost everywhere}\}\) for a smooth function \(g\in \mathcal {H}\).

Following Example 4.4, we introduce the set \(\mathcal {K}_0:=\{v\in \mathcal {H}|\, v\ge 0 \text { almost everywhere}\}\) and note that Problem (13) is equivalent to finding \(u=u_0+g\in \mathcal {K}\)

$$\begin{aligned} \text {with } u_0\in \mathcal {K}_0 \text { such that:}\quad a(u_0,v-u_0) \ge f(v-u_0)-a(g,v-u_0),\quad \forall v\in \mathcal {K}_0. \end{aligned}$$
(15)

5.1 Discretization and matrix LCP

Any element \(v\in \mathcal {H}_d\) may be expanded as \(v =\sum _{i=1}^dw_iv_i\) for a coefficient vector \(w\in \mathbb {R}^d\). To preserve non-negativity of the discrete approximation to (15), we assume that \(v\in \mathcal {K}_0\) if and only if the basis coordinates are nonnegative, i.e., if \(w\in \mathbb {R}^d_{\ge 0}\). This property holds, for instance, in finite element approaches. We write the discrete solution as \(u_d=\sum _{i=1}^dx_iv_i\). Then, \(u_d\in \mathcal {K}_0\) if and only if \(x\in \mathbb {R}^d_{\ge 0}\). Consequently, the discrete version of (15) is to

$$\begin{aligned} \text {find } x\in \mathbb {R}^d_{\ge 0}: \quad (y-x)^\top {\mathbf {A}}x \ge (y-x)^\top c,\quad \forall y\in \mathbb {R}^d_{\ge 0}, \end{aligned}$$
(16)

where the matrix \(\mathbf {A}\in \mathbb {R}^{d\times d}\) and the vector \(c\in \mathbb {R}^d\) are given by

$$\begin{aligned} \mathbf {A}_{ij}:=a(v_j, v_i)\quad \text {and}\quad c_i:= _{\mathcal {H}'}\langle {f}, {v_i}\rangle _{\mathcal {H}}-a(g,v_i),\quad i,j\in \{1,\ldots ,d\}. \end{aligned}$$
(17)

Problem (16) is equivalent to the linear complementarity problem (LCP) to find \(x\in \mathbb {R}^d\) such that for \(\mathbf {A}\in \mathbb {R}^{d\times d}\) and \(c\in \mathbb {R}^d\) as in (17) it holds

$$\begin{aligned} \begin{aligned} {\mathbf {A}}x \ge c, \quad x \ge 0, \quad x^\top ({\mathbf {A}}x - c)&= 0, \end{aligned} \end{aligned}$$
(18)

see, e.g., [14, Lemma 5.1.3]. If \(a:\mathcal {H}\times \mathcal {H}\rightarrow \mathbb {R}\) is bounded and coercive as in Assumption 4.1, it readily follows that

$$\begin{aligned} C_- \Vert x\Vert _2^2 \le x^\top {\mathbf {A}}x \le C_+ \Vert x\Vert _2^2, \quad x\in \mathbb {R}^d, \end{aligned}$$
(19)

where the constants \(C_+\ge C_->0\) stem from Assumption 4.1 and \(\left\| \cdot \right\| _2\) is the Euclidean norm on \(\mathbb {R}^d\). This implies in particular that the LCP (18) has a unique solution \(x\in \mathbb {R}^d\), see [23, Theorem 4.2]. Equivalently, we may regard Problem (16), resp. (18), as variational inequality on the finite-dimensional Hilbert space \(\mathbb {R}^d\) equipped with the Euclidean scalar product \((\cdot ,\cdot )_2\). Well-posedness then follows directly from Assumption 4.1 with the identification \(\mathcal {H}=\mathbb {R}^d\) and the discrete bilinear form \(a:\mathbb {R}^d\times \mathbb {R}^d\rightarrow \mathbb {R},\,(x,y)\mapsto x^\top {\mathbf {A}}y\).

5.2 Solution of matrix LCPs by ProxNets

The purpose of this section is to show that several well-known iterative algorithms to solve (finite-dimensional) LCPs may be recovered as particular cases of ProxNets in the setting of Sect. 2. To this end, we fix \(d\in \mathbb {N}\) and use the notation \(\mathcal {H}:=\mathbb {R}^d\) for convenience. We denote by \(\{e_1,\ldots ,e_d\}\subset \mathbb {R}^d\) the canonical basis of \(\mathcal {H}\). To approximately solve LCPs by ProxNets, and to introduce a numerical LCP solution map, we first define the scalar and vector-valued Rectified Linear Unit (ReLU) activation functions.

Definition 5.1

The scalar ReLU activation function \(\varrho \) is defined as \(\varrho :\mathbb {R}\rightarrow \mathbb {R}, x\mapsto \max (x,0)\). The component-wise ReLU activation in \(\mathbb {R}^d\) is given by

$$\begin{aligned} \varrho ^{(d)}:\mathbb {R}^d\rightarrow \mathbb {R}^d, \quad x\mapsto \sum _{i=1}^d \varrho ((x,e_i)_\mathcal {H})e_i. \end{aligned}$$
(20)

Remark 5.2

The scalar ReLU activation function \(\varrho \) satisfies \(\varrho =\mathrm {prox}_{\iota _{[0,\infty )}}\) with \(\iota _{[0,\infty )}\in \Gamma _0(\mathbb {R})\) (see [5, Example 2.6]). This in turn yields \(\varrho ^{(d)}\in \mathcal {A}(\mathbb {R}^d)\) for any \(d\in \mathbb {N}\) by [5, Proposition 2.24].

Example 5.3

(PJORNet) Consider the LCP (18) with matrix \({\mathbf {A}}\) and triangular decomposition

$$\begin{aligned} {\mathbf {A}}= {\mathbf {D}} + {\mathbf {L}} + {\mathbf {U}}, \end{aligned}$$
(21)

where \({\mathbf {D}}\in \mathbb {R}^{d\times d}\) contains the diagonal entries of \({\mathbf {A}}\), and \({\mathbf {L}}, {\mathbf {U}}\in \mathbb {R}^{d\times d}\) are the (strict) lower and upper triangular parts of \({\mathbf {A}}\), respectively. The projected Jacobi (PJOR) overrelaxation method to solve LCP (18) is given as:

[Algorithm 1: Projected Jacobi overrelaxation (PJOR) method for the LCP (18)]

The \(\max \)-function in Algorithm 1 acts component-wise on each entry of a vector in \(\mathbb {R}^d\). Hence, one iteration of the PJOR may be expressed as a ProxNet in Model (3) with \(m=1\), \(\lambda =1\) and \(\varrho ^{(d)}\) from Eq. (20) as

$$\begin{aligned} \Phi _{PJOR}:\mathbb {R}^d\rightarrow \mathbb {R}^d, \quad x\mapsto T_1(x):= \varrho ^{(d)}(\underbrace{({\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}})}_{=:{\mathbf {W}}_1} x + \underbrace{\omega {\mathbf {D}}^{-1}c}_{=:b_1}). \end{aligned}$$

If \({\mathbf {A}}\) satisfies (19) for constants \(C_+\ge C_->0\), it holds that

$$\begin{aligned} \Vert {\mathbf {W}}_1\Vert ^2_{\mathcal {L}(\mathcal {H})}&=\Vert {\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}}\Vert ^2_2 \\&= \sup _{x\in \mathbb {R}^d, \Vert x\Vert _2=1} x^\top x -\omega x^\top {\mathbf {D}}^{-1} ({\mathbf {A}}^\top +{\mathbf {A}})x + \omega ^2 ({\mathbf {D}}^{-1}{\mathbf {A}}x)^\top {\mathbf {D}}^{-1}{\mathbf {A}}x\\&\le 1 - 2\omega \min _{i\in \{1,\ldots ,d\}} \frac{1}{{\mathbf {A}}_{ii}} C_- +\omega ^2 \max _{i\in \{1,\ldots ,d\}} \frac{1}{{\mathbf {A}}_{ii}^2} \Vert {\mathbf {A}}\Vert _2^2\\&\le 1 - 2\omega \frac{C_-}{C_+} +\omega ^2 \frac{\Vert {\mathbf {A}}\Vert _2^2}{C_-^2} =:\Lambda (\omega ). \end{aligned}$$

The choice \(\omega ^*:=C_-^3/(C_+\Vert {\mathbf {A}}\Vert _2^2)\) minimizes \(\Lambda \) such that \(\Lambda (\omega ^*)<1\). Moreover, \(\Lambda (0)=1\), \(\Lambda \) is strictly decreasing on \([0,\omega ^*]\), and increasing for \(\omega >\omega ^*\). Hence, there exists \(\overline{\omega }>0\) such that for any \(\omega \in (0,\overline{\omega })\) the mapping \(\Phi _{PJOR}:\mathbb {R}^d\rightarrow \mathbb {R}^d\) is a contraction. An application of Theorem 3.2 then shows that Algorithm 1 converges linearly for suitable \(\omega >0\) and any initial guess \(x^0\). In the special case that \({\mathbf {A}}\) is strictly diagonally dominant, choosing \(\omega =1\) is sufficient to ensure convergence, i.e., no relaxation before the activation is necessary.
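To make the iteration concrete, the following is a minimal NumPy sketch of the PJOR map \(x\mapsto \varrho ^{(d)}({\mathbf {W}}_1x+b_1)\) and of its fixed-point iteration; the function names and the small test matrix are illustrative only and are not part of the implementation used later in Sect. 6.

```python
import numpy as np

def pjor_step(A, c, x, omega):
    """One PJOR iteration: x -> max((I - omega*D^{-1}A) x + omega*D^{-1}c, 0)."""
    d_inv = 1.0 / np.diag(A)           # D^{-1} stored as a vector
    return np.maximum(x - omega * d_inv * (A @ x - c), 0.0)

def pjor_solve(A, c, omega, tol=1e-10, max_iter=10_000):
    """Iterate the PJOR map until two consecutive iterates are tol-close."""
    x = np.zeros_like(c)
    for _ in range(max_iter):
        x_new = pjor_step(A, c, x, omega)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Small strictly diagonally dominant example, so omega = 1 suffices.
A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
c = np.array([1.0, -2.0, 3.0])
x = pjor_solve(A, c, omega=1.0)
# At the fixed point: x >= 0, A x - c >= 0 and x^T (A x - c) = 0 up to the tolerance.
```

The fixed point of `pjor_step` solves the LCP (18), and the contraction property discussed above guarantees linear convergence for suitable \(\omega \).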

Example 5.4

(PSORNet) Another popular algorithm to numerically solve LCPs is the projected successive overrelaxation (PSOR) method in Algorithm 2.

[Algorithm 2: Projected successive overrelaxation (PSOR) method for the LCP (18)]

To represent the PSOR-iteration by a ProxNet as in (3), we use the scalar ReLU activation \(\varrho \) from Definition 5.1 and define for \(i\in \{1,\ldots ,d\}\)

$$\begin{aligned} R_i:\mathbb {R}^d\rightarrow \mathbb {R}^d, \quad x\mapsto \varrho ((x,e_i)_\mathcal {H})e_i +\sum _{j=1,\,j\ne i}^d x_je_j. \end{aligned}$$
(22)

In contrast to \(\varrho ^{(d)}\) in Eq. (20), the activation operator \(R_i\) takes the maximum only with respect to the ith entry of the input vector. Nevertheless, \(R_i\in \mathcal {A}(\mathbb {R}^d)\) holds again by [5, Proposition 2.24]. Now, define \(b_i\in \mathbb {R}^d\) and \({\mathbf {W}}_i\in \mathbb {R}^{d\times d}\) by

$$\begin{aligned} b_i = (0,\ldots ,0, \underbrace{\omega \frac{c_i}{{\mathbf {A}}_{ii}}} _{i\text {th entry}}, 0,\ldots ,0),\quad ({\mathbf {W}}_i)_{lj} = {\left\{ \begin{array}{ll} 1-\omega &{}\quad l=j=i,\\ 1&{}\quad l=j\in \{1,\ldots ,d\}{\setminus }\{i\},\\ -\omega \frac{{\mathbf {A}}_{ij}}{{\mathbf {A}}_{ii}}, &{}\quad l=i,j\in \{1,\ldots ,d\}{\setminus }\{i\},\\ 0,&{}\quad \text {elsewhere,} \end{array}\right. } \end{aligned}$$

and let \(T_i(x):=R_i({\mathbf {W}}_ix+b_i)\) for \(x\in \mathbb {R}^d\). Given the kth iterate \(x^k\) and \(x^{k+1}_1,\ldots ,x^{k+1}_{i-1}\) from the inner loop of Algorithm 2, it follows for \(z^{k,i-1}:=(x^{k+1}_1,\ldots ,x^{k+1}_{i-1},x^k_{i},\ldots , x^k_d)^\top \) that

$$\begin{aligned} x_i^{k+1} = z^{k,i}_i,\quad z^{k,i} = T_i(z^{k,i-1}),\quad i\in \{1,\ldots ,d\},\,k\in \mathbb {N}. \end{aligned}$$
(23)

As \(z^{k-1,d}=z^{k,0}=x^k\) for \(k\in \mathbb {N}\), this shows \(x^{k+1}=\Phi _{PSOR} (x^k)\) for

$$\begin{aligned} \Phi _{PSOR}:\mathbb {R}^d\rightarrow \mathbb {R}^d,\quad x \mapsto (T_d\circ \cdots \circ T_1)(x). \end{aligned}$$
(24)

Provided (19) holds, we derive similarly to Example 5.3

$$\begin{aligned} \Vert {\mathbf {W}}_i\Vert _2^2&= \sup _{x\in \mathbb {R}^d, \Vert x\Vert _2=1} x^\top x -2\frac{\omega }{{\mathbf {A}}_{ii}}x^\top {\mathbf {A}}_{[i]}x_i + \frac{\omega ^2}{{\mathbf {A}}_{ii}^2}(x^\top {\mathbf {A}}_{[i]})^2\\&\le 1 - 2\omega \frac{1}{{\mathbf {A}}_{ii}}C_- + \frac{\omega ^2}{{\mathbf {A}}_{ii}^2}\Vert {\mathbf {A}}\Vert _2^2, \end{aligned}$$

where \({\mathbf {A}}_{[i]}\) denotes the ith row of \({\mathbf {A}}\). Hence, the choice \(\omega =\omega ^*:=C_-^3/(C_+\Vert {\mathbf {A}}\Vert _2^2)\) is sufficient to ensure that \(\Phi _{PSOR}\) is a contraction, and convergence to the unique fixed point follows as in Theorem 3.2.
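Analogously, one PSOR sweep (Algorithm 2), i.e., the composition \(T_d\circ \cdots \circ T_1\), can be sketched as follows, updating the entries of x in place with the already-updated components; again, the code is an illustrative sketch and not the authors' implementation.

```python
import numpy as np

def psor_sweep(A, c, x, omega):
    """One PSOR sweep: entry-wise projected Gauss-Seidel update with relaxation omega."""
    x = x.copy()
    d = len(c)
    for i in range(d):
        # the residual uses already-updated entries x[0:i] and old entries x[i+1:d]
        r = c[i] - A[i, :] @ x
        x[i] = max(x[i] + omega * r / A[i, i], 0.0)
    return x
```

Iterating `psor_sweep` corresponds to composing \(\Phi _{PSOR}\) with itself, exactly as iterating `pjor_step` above corresponds to composing \(\Phi _{PJOR}\).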

Remark 5.5

Both the PJORNet and the PSORNet from Examples 5.3 and 5.4 may be augmented as in Example 3.4 to take \(c\in \mathbb {R}^d\) as an additional input vector, and therefore to solve the LCP (18) for varying c. That is, concatenation of the PJORNet/PSORNet again yields an approximation to the solution operator \({O_{RHS}}:\mathbb {R}^d\rightarrow \mathbb {R}^d,\; c\mapsto x\) associated with the LCP (18) for fixed \({\mathbf {A}}\). This is of particular interest, for instance, in the valuation of American options, where a collection of LCPs with varying model parameters has to be solved, see [14, Chapter 5] and the numerical examples in Sect. 6. Recall that \(c_i := _{\mathcal {H}'}\langle {f},{v_i}\rangle _{\mathcal {H}}-a(g,v_i)\) if the matrix LCP stems from a discretized obstacle problem as introduced at the beginning of this section. Hence, by varying c it is possible to modify the right hand side f, as well as the obstacle g, of the underlying variational inequality (cf. Example 4.4 and Sect. 6.3). \(\square \)

5.3 Solution of parametric matrix LCPs by ProxNets

In this section, we construct ProxNets that take arbitrary LCPs \(({\mathbf {A}}, c)\) in finite-dimensional Euclidean space as input and output approximations of the solution x to (18) with any prescribed accuracy. Consequently, these ProxNets realize approximate data-to-solution operators

$$\begin{aligned} {O}: \{{\mathbf {A}}\in \mathbb {R}^{d^2}| \; \text {there are } C_-,C_+>0 \text { s.t. } \,{\mathbf {A}} \text { satisfies~(19)}\} \times \mathbb {R}^d \rightarrow \mathbb {R}^d, \quad ({\mathbf {A}}, c)\mapsto x. \end{aligned}$$
(25)

The idea is to construct an NN that realizes Algorithm 1 and achieves a prescribed error threshold \(\varepsilon >0\) uniformly for LCP data \(({\mathbf {A}},c)\) from a set \(\mathfrak A_\Theta \); in particular, the weights of the NN must not depend on \({\mathbf {A}}\), in contrast to the previous section. To this end, we use that the multiplication of real numbers may be emulated by ReLU-NNs with controlled error and growth bounds on the layers and size of the ReLU NN. This was first shown in [27], and subsequently extended to the multiplication of an arbitrary number \(n\in \mathbb {N}\) of real numbers in [24].

Proposition 5.6

[24, Proposition 2.6] For any \(\delta _0\in (0,1)\), \(n\in \mathbb {N}\) and \(\Theta \ge 1\), there exists a ProxNet \(\widetilde{\prod }_{\delta _0,\Theta }^n:\mathbb {R}^n\rightarrow \mathbb {R}\) of the form (2) such that

$$\begin{aligned} \sup _{x\in [-\Theta ,\Theta ]^n}\left| \prod _{i=1}^n x_i - \widetilde{\prod }_{\delta _0,\Theta }^n(x)\right| \le \delta _0, \qquad \max _{j\in \{1,\ldots ,n\}}\,\sup _{x\in [-\Theta ,\Theta ]^n}\left| \partial _{x_j}\left( \prod _{i=1}^n x_i - \widetilde{\prod }_{\delta _0,\Theta }^n(x)\right) \right| \le \delta _0, \end{aligned}$$
(26)

where \(\partial _{x_j}\) denotes the weak derivative with respect to \(x_j\). The neural network \(\widetilde{\prod }_{\delta _0,\Theta }^n\) uses only ReLUs as in Definition 5.1 as proximal activations. There exists a constant C, independent of \(\delta _0\in (0,1)\), \(n\in \mathbb {N}\) and \(\Theta \ge 1\), such that the number of layers \(m_{n, \delta _0, \Theta }\in \mathbb {N}\) of \(\widetilde{\prod }_{\delta _0,\Theta }^n\) is bounded by

$$\begin{aligned} m_{n, \delta _0, \Theta }\le C\left( 1+\log (n)\log \left( \frac{n\Theta ^n}{\delta _{ 0}}\right) \right) . \end{aligned}$$
(27)

Remark 5.7

For our purposes, it is sufficient to consider the cases \(n\in \{2,3\}\); therefore, we assume without loss of generality that there is a constant C, independent of \(\delta _0\in (0,1)\) and \(\Theta \ge 1\), such that for \(n\in \{2,3\}\) it holds

$$\begin{aligned} m_{n, \delta _0, \Theta }\le C\left( 1+\log \left( \frac{\Theta }{\delta _0}\right) \right) . \end{aligned}$$

Moreover, we may assume without loss of generality that \(m_{2, \delta _0, \Theta }=m_{3, \delta _0, \Theta }\), as it is always possible to add ReLU-layers that emulate the identity function to the shallower network (see [24, Section 2] for details).

With this at hand, we are ready to prove a main result of this section.

Theorem 5.8

Let \(\Theta \ge 2\) be a fixed constant and \(d\ge 2\), and define the set

$$\begin{aligned} \mathfrak A_\Theta :=\left\{ ({\mathbf {A}}, c)\in \mathbb {R}^{d\times d}\times \mathbb {R}^d\Big |\; \begin{array}{l} {\mathbf {A}} \text { satisfies~(19) with } \Theta \ge C_+\ge C_-\ge \Theta ^{-1}>0, \\ \text {and } \Vert c\Vert _\infty \le \Theta \end{array} \right\} . \end{aligned}$$
(28)

For the triangular decomposition \({\mathbf {A}}= {\mathbf {D}} + {\mathbf {L}} +{\mathbf {U}}\) as in (21), define \(z_{{\mathbf {A}}}:=\mathrm {vec}({\mathbf {D}}^{-1} + {\mathbf {L}} +{\mathbf {U}})\in \mathbb {R}^{d^2}\), where \(\mathrm {vec}:\mathbb {R}^{d\times d}\rightarrow \mathbb {R}^{d^2}\) is the row-wise vectorization of a \(\mathbb {R}^{d\times d}\)-matrix. Let \(x^*\) be the unique solution to the LCP \(({\mathbf {A}}, c)\), and let \(\widetilde{x}^0\in \mathbb {R}^d\) be arbitrary such that \(\Vert \widetilde{x}^0\Vert _2\le \Theta \).

For any \(\varepsilon >0\), there exists a ProxNet

$$\begin{aligned} \widetilde{\Phi }:\mathbb {R}^d\oplus \mathbb {R}^{d^2}\oplus \mathbb {R}^d\rightarrow \mathbb {R}^d \end{aligned}$$
(29)

as in (9) and a \(k_\varepsilon \in \mathbb {N}\) such that

$$\begin{aligned} \Vert x^*-\widetilde{x}^{k_\varepsilon }\Vert _2\le \varepsilon \end{aligned}$$

holds for the sequence \(\widetilde{x}^k:=\widetilde{\Phi }(\widetilde{x}^{k-1},z_{{\mathbf {A}}}, c)\) generated by \(\widetilde{\Phi }\) and any tuple \(({\mathbf {A}}, c)\ \in \mathfrak A_\Theta \). Moreover, \(k_{\varepsilon }\le C_1(1+{|\log (\varepsilon )|})\), where \(C_1>0\) only depends on \(\Theta \) and \(\widetilde{\Phi }\) has \(m\le C_2(1+{|\log (\varepsilon )|}+\log (d))\) layers, where \(C_2>0\) is independent of \(\Theta \).

Proof

Our strategy is to approximate \(\Phi _{PJOR}\) from Example 5.3 for given \(({\mathbf {A}},c)\, {\in \mathfrak A_\Theta }\) by \(\widetilde{\Phi }(\cdot ,z_{{\mathbf {A}}},c)\). We achieve this by constructing \(\widetilde{\Phi }\) based on the approximate multiplication NNs from Proposition 5.6 and show that \(\Phi _{PJOR}\) and \(\widetilde{\Phi }\) satisfy Assumption 3.3 to apply the error estimate from Theorem 3.5.

We start by defining the map \(\widetilde{\Phi }:\mathbb {R}^d\oplus \mathbb {R}^{d^2}\oplus \mathbb {R}^d\rightarrow \mathbb {R}^d\) via

$$\begin{aligned}&\widetilde{\Phi }(x, z_{{\mathbf {A}}},c)_i = \\&\quad \max \left( (1-\omega )x_i - \omega \sum _{j=1, j\ne i} \widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}_{ii}},{\mathbf {A}}_{ij}\right) +\omega \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}_{ii}},c_i\right) , 0 \right) , \end{aligned}$$

for \(i\in \{1,\ldots ,d\}\), \(0<\omega :=\Theta ^{-6}\le \frac{C_-^3}{C_+\Vert {\mathbf {A}}\Vert _2^2} = \omega ^*\) and \(\delta _0\in (0,d^{-3/2}]\).

We show in the following that \(\widetilde{\Phi }\) is indeed a ProxNet. To bring the input into the correct order for multiplication, we define for \(i\in \{1,\ldots ,d\}\) the binary matrix \({\mathbf {W}}^{(i)}\in \mathbb {R}^{(2d+1)\times (d^2+2d)}\) by

$$\begin{aligned} {\mathbf {W}}^{(i)}_{lj} := {\left\{ \begin{array}{ll} 1 &{}\quad l=j\in \{1,\ldots ,d\},\\ 1 &{}\quad l\in \{d+1,\ldots ,2d\},\, j=d+d(i-1)+(l-d),\\ 1 &{}\quad l=2d+1,\, j=d+d^2+i,\\ 0 &{}\quad \text {elsewhere}. \end{array}\right. } \end{aligned}$$

Hence, we obtain

$$\begin{aligned} {\mathbf {W}}^{(i)} \begin{pmatrix} x \\ z_{{\mathbf {A}}} \\ c \end{pmatrix} = \left( x^\top , \left( {\mathbf {A}}_{ij} \right) _{j< i} , \frac{1}{{\mathbf {A}}_{ii}}, \left( {\mathbf {A}}_{ij} \right) _{j>i}, c_i \right) ^\top . \end{aligned}$$

Now, let \(e_1,\ldots ,e_{{2d+1}}\subset \mathbb {R}^{2d+1}\) be the canonical basis of \(\mathbb {R}^{2d+1}\) and define \({\mathbf {E}^{(i)}_i}:=e_i^\top \in \mathbb {R}^{1\times (2d+1)}\), \({\mathbf {E}^{(i)}_j}:=[e_j\; e_{d+i} \; e_{d+j}]^\top \in \mathbb {R}^{3\times (2d+1)}\) for \(j\in \{1,\ldots ,d\}{\setminus }\{i\}\) and \({\mathbf {E}^{(i)}_{d+1}}:=[e_{d+i}\; e_{2d+{1}}]^\top \in \mathbb {R}^{2\times (2d+1)}\). By Remark 5.7, we may assume that \(\widetilde{\prod }_{\delta _0,\Theta }^3\) and \(\widetilde{\prod }_{\delta _0,\Theta }^2\) have an identical number of layers, denoted by \(m_{\delta _0, \Theta }\in \mathbb {N}\). Moreover, it is straightforward to construct a ProxNet \(\mathrm {Id}_{m_{\delta _0, \Theta }}:\mathbb {R}\rightarrow \mathbb {R}\) with \(m_{\delta _0,\Theta }\) layers that corresponds to the identity map, i.e., \(\mathrm {Id}_{m_{\delta _0, \Theta }}(x)=x\) for all \(x\in \mathbb {R}\). We use the concatenation from Definition 2.3 to define

$$\begin{aligned} \widetilde{\Phi }_{i}^{(i)}&:=\mathrm {Id}_{m_{\delta _0,\Theta }}\bullet ({\mathbf {E}^{(i)}_i {\mathbf {W}}^{(i)}}):\mathbb {R}^{d^2+2d}\rightarrow \mathbb {R}\\ \widetilde{\Phi }_{j}^{(i)}&:=\widetilde{\prod }_{\delta _0,\Theta }^3\bullet ({\mathbf {E}^{(i)}_j {\mathbf {W}}^{(i)}}):\mathbb {R}^{d^2+2d}\rightarrow \mathbb {R},\quad j\in \{1,\ldots ,d\}{\setminus }\{i\}, \\ \widetilde{\Phi }_{d+1}^{(i)}&:=\widetilde{\prod }_{\delta _0,\Theta }^2\bullet ({\mathbf {E}^{(i)}_{d+1}{\mathbf {W}}^{(i)}}):\mathbb {R}^{d^2+2d}\rightarrow \mathbb {R}. \end{aligned}$$

Note that this yields

$$\begin{aligned} \widetilde{\Phi }_{i}^{(i)}(x, z_{{\mathbf {A}}},c)= x_i, \; \widetilde{\Phi }_{j}^{(i)}(x, z_{{\mathbf {A}}},c)= \widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}_{ii}},{\mathbf {A}}_{ij}\right) , \; \widetilde{\Phi }_{d+1}^{(i)}(x, z_{{\mathbf {A}}},c)= \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}_{ii}},c_i\right) . \end{aligned}$$

Furthermore, we set \({m_1}:=m_{\delta _0, \Theta }+1\) and define \(\widetilde{T}_{{m_1}}^{(+,i)}:\mathbb {R}^{d+1}\rightarrow \mathbb {R},\, x\mapsto \varrho ({{\mathbf {W}}^{(+,i)}}x)\), where \(\varrho :\mathbb {R}\rightarrow \mathbb {R}\) is the (scalar) ReLU activation and \({{\mathbf {W}}^{(+,i)}}\in \mathbb {R}^{1\times (d+1)}\) is given by

$$\begin{aligned} {{\mathbf {W}}^{(+,i)}_j} := {\left\{ \begin{array}{ll} 1 - \omega &{}\quad j=i,\\ - \omega &{}\quad j\in \{1,\ldots ,d\}{\setminus }\{i\},\\ \omega &{}\quad j=d+1. \end{array}\right. } \end{aligned}$$

As \(\widetilde{\Phi }_1^{(i)},\ldots , \widetilde{\Phi }_{d+1}^{(i)}\) have the same input dimension, the same number \(m_{\delta _0, \Theta }\) of layers, and no skip connections, we may parallelize as in Definition 2.5 to ensure

$$\begin{aligned} \widetilde{\Phi }(x, z_{{\mathbf {A}}},c)_i&= \max \left( (1-\omega )x_i - \omega \sum _{j=1, j\ne i} \widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}_{ii}},{\mathbf {A}}_{ij}\right) +\omega \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}_{ii}},c_i\right) , 0 \right) \\&= \left( \widetilde{T}_{{m_1}}^{{(+,i)}} \bullet P\left( \widetilde{\Phi }_1^{(i)},\ldots , \widetilde{\Phi }_{d+1}^{(i)}\right) \right) (x, z_{{\mathbf {A}}},c). \end{aligned}$$

It holds that \(\widetilde{\Phi }_i:=\widetilde{T}_{{m_1}}^{{(+,i)}} \bullet P\left( \widetilde{\Phi }_1^{(i)},\ldots , \widetilde{\Phi }_{d+1}^{(i)}\right) \) is a ProxNet as in Eq. (9) with \(\widetilde{\Phi }_i:\mathbb {R}^{d^2+2d}\rightarrow \mathbb {R}\) and \({m_1}=m_{\delta _0,\Theta }+1\) layers for any \(i\in \{1,\ldots ,d\}\). We parallelize once more and obtain that \(\widetilde{\Phi } := P(\widetilde{\Phi }_1,\ldots ,\widetilde{\Phi }_d)\) is a ProxNet with \(m_{\delta _0,\Theta }+1\) layers that may be written as \(\widetilde{\Phi } = \widetilde{T}_{1}^{(1)}\circ \cdots \circ \widetilde{T}_{{m_1}}^{(1)}\) for suitable one-layer networks \(\widetilde{T}_{j}^{(1)}:\mathbb {R}^{d_{j-1}}\rightarrow \mathbb {R}^{d_j}\) and dimensions \(d_j\in \mathbb {N}\) for \(j\in \{0,1,\ldots ,{m_1}\}\) such that \(d_0 = d^2+2d\) and \(d_{{m_1}}=d\).

We now fix \(({\mathbf {A}},c){\in \mathfrak A_\Theta }\) and let \(\Phi _{PJOR}:=R({\mathbf {W}}_1\cdot +\, b_1)\) be as in Example 5.3 with \(\omega =\Theta ^{-6}\), \({\mathbf {W}}_1={\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}}\) and \(b_1:=\omega {\mathbf {D}}^{-1}c\). This shows that \(\Phi _{PJOR}\) has Lipschitz constant \(L_{\Phi }=\Vert {\mathbf {W}}_1\Vert _2\le \sqrt{1-2\Theta ^{-4}+\Theta ^{-8}}=1-\Theta ^{-4}< 1\) and \(\Vert b_1\Vert _2\le \omega \Theta ^2\le \Theta ^{-4}\).

Note that \(|c_i|,1/{\mathbf {A}}_{ii},|{\mathbf {A}}_{ij}|\le \Theta \) for any \(i,j\in \{1,\ldots ,d\}\). Therefore, Proposition 5.6 yields for \(\widetilde{x}_0:=(z_{{\mathbf {A}}},c)\) and any \(x\in \mathbb {R}^d\) with \(\Vert x\Vert _\infty \le \Theta \) that

$$\begin{aligned}&\Vert \Phi _{PJOR}(x)-\widetilde{\Phi }(x, {z_{{\mathbf {A}}},c})\Vert _2^2 \\&\quad =\Vert T_1(x)-\widetilde{T}_1(x, {z_{{\mathbf {A}}},c})\Vert _2^2 \\&\quad \le \omega ^2 \sum _{i=1}^d \left( \left| \frac{c_i}{{\mathbf {A}}_{ii}} -\widetilde{\prod }_{\delta _0,\Theta }^2\left( c_{i},\frac{1}{{\mathbf {A}}_{ii}}\right) \right| + \sum _{j=1, j\ne i}^d \left| \frac{{\mathbf {A}}_{ij}}{{\mathbf {A}}_{ii}}x_j - \widetilde{\prod }_{\delta _0,\Theta }^3\left( {\mathbf {A}}_{ij},\frac{1}{{\mathbf {A}}_{ii}},x_j\right) \right| \right) ^2 \\&\quad \le \omega ^2 d^3\delta _0^2. \end{aligned}$$

Hence, since \(\delta _0\in (0,d^{-3/2}]\) and \(\omega =\Theta ^{-6}\), \(\Phi _{{PJOR}}\) and \(\widetilde{\Phi }\) satisfy Assumption 3.3 with

$$\begin{aligned} \widetilde{L}&:=1-\Theta ^{-4}\in (0,1), \qquad \delta :=\omega d^{3/2} \delta _0\ge 0, \qquad \Theta _1:=\Theta \ge 2, \\ \Theta _0&:=\Theta _1-\Vert b_1\Vert _2-\delta \ge \Theta -\Theta ^{-4}-\omega d^{3/2}\delta _0\ge \frac{123}{64},\\ \Theta _2&:=\Theta _0-\delta /(1-\widetilde{L})\ge \Theta _0 - \frac{\Theta ^{-6}}{\Theta ^{-4}} \ge \frac{123}{64} - \frac{1}{4} >0. \end{aligned}$$

Theorem 3.5 then yields that there exists a constant \(C>0\) such that for all \(k\) and \(\delta \) it holds

$$\begin{aligned} \Vert x^*-\widetilde{x}^k\Vert _\mathcal {H}\le C\left( \widetilde{L}^k + \delta \right) . \end{aligned}$$

Here, \(C\le \max (2\Theta _0,1)/(1-\widetilde{L})\le 2\Theta ^5\) is independent of k. Given \(\varepsilon >0\), we choose

$$\begin{aligned} k_\varepsilon := \left\lceil {\frac{\log (\varepsilon )-\log (2C)}{\log (\widetilde{L})}}\right\rceil , \quad \delta _0:=\frac{\min \left( 1, \frac{\varepsilon }{2C\omega }\right) }{d^{3/2}} \ge \frac{\min \left( 1, \frac{\varepsilon \Theta }{4} \right) }{d^{3/2}} \end{aligned}$$

to ensure \(\Vert x^*-\widetilde{x}^{k_\varepsilon }\Vert \le \varepsilon \). Hence, \(k_\varepsilon \le C_1(1+{|\log (\varepsilon )|})\), where \(C_1=C_1(\Theta )>0\) is independent of d. Moreover, Inequality (27) in Proposition 5.6 and the choice \(\delta _0{\le d^{-3/2}}\) shows that \(m_{\delta _0,\Theta }\le C_2(1+{|\log (\varepsilon )|}+\log (d))\), where \(C_2>0\) is independent of \(\Theta \). The claim follows since \(\widetilde{\Phi }\) has \({m_1}=m_{\delta _0, \Theta }+1\) layers by construction. \(\square \)

For fixed \(\Theta \) and \(\varepsilon \), the ProxNets \(\widetilde{\Phi }\) emulate one step of the PJOR algorithm for any LCP \(({\mathbf {A}}, c)\in \mathfrak A_\Theta \) and a given initial guess \(\widetilde{x}^0\). This in turn allows us to approximate the data-to-solution operator O from (25) to arbitrary accuracy by concatenation of suitable ProxNets. The precise statement is given in the main result of this section:

Theorem 5.9

Let \(\Theta \ge 2\) be fixed, let \(\mathfrak A_\Theta \) be given as in (28), and let the data-to-solution operator O be given as in (25). Then, for any \(\varepsilon >0\), there is a ProxNet \(\widetilde{O}_\varepsilon :\mathfrak A_\Theta \rightarrow \mathbb {R}^d\) such that for any LCP \(({\mathbf {A}}, c)\in \mathfrak A_\Theta \) there holds

$$\begin{aligned} \Vert O({\mathbf {A}},c)-\widetilde{O}_\varepsilon ({\mathbf {A}}, c)\Vert _2\le \varepsilon . \end{aligned}$$

Furthermore, let \(\left\| \cdot \right\| _F\) denote the Frobenius norm on \(\mathbb {R}^{d\times d}\). There is a constant \(\widetilde{C}>0\), depending only on \(\Theta \) and d, such that for any \(\varepsilon >0\) and any two \(({\mathbf {A}}^{(1)}, c^{(1)}), ({\mathbf {A}}^{(2)}, c^{(2)})\in \mathfrak A_\Theta \) there holds

$$\begin{aligned} \Vert \widetilde{O}_\varepsilon ({\mathbf {A}}^{(1)} , c^{(1)} ) - \widetilde{O}_\varepsilon ({\mathbf {A}}^{(2)} , c^{(2)} )\Vert _2 \le \widetilde{C}\left( \Vert {\mathbf {A}}^{(1)} - {\mathbf {A}}^{(2)}\Vert _F + \Vert c^{(1)} - c^{(2)}\Vert _2 \right) . \end{aligned}$$
(30)

We give an explicit construction of the approximate data-to-solution operator \(\widetilde{O}_\varepsilon \) in the proof of Theorem 5.9 at the end of this section. To show the Lipschitz continuity of \(\widetilde{O}_\varepsilon \) with respect to the parametric LCPs in \(\mathfrak A_\Theta \), we derive an operator version of the so-called Strang Lemma:

Lemma 5.10

Let \(\Theta \ge 2\), \(d\ge 2\), and let \(({\mathbf {A}}^{(1)}, c^{(1)}), ({\mathbf {A}}^{(2)}, c^{(2)})\in \mathfrak A_\Theta \). For \(l\in \{1,2\}\), let \({{\mathbf {A}}^{(l)}}={\mathbf {D}}^{(l)}+{\mathbf {L}}^{(l)}+ {\mathbf {U}}^{(l)}\) be the decomposition of \({{\mathbf {A}}^{(l)}}\) as in (21) and define \(z_{{\mathbf {A}}^{(l)}}:=\mathrm {vec}(({\mathbf {D}}^{(l)})^{-1} + {\mathbf {L}}^{(l)} +{\mathbf {U}}^{(l)})\in \mathbb {R}^{d^2}\). For target emulation accuracy \(\varepsilon >0\), let \(\widetilde{\Phi }\) be the ProxNet as in (29), let \(\widetilde{x}^0\in \mathbb {R}^d\) be such that \(\Vert \widetilde{x}^0\Vert _2\le \Theta \) and define the sequences

$$\begin{aligned} \widetilde{x}^{(l),k}:=\widetilde{\Phi }(\widetilde{x}^{(l),k-1},z_{{\mathbf {A}}^{(l)}}, c^{(l)}),\, k\in \mathbb {N}, \quad \widetilde{x}^{(l),0}:=\widetilde{x}^0, \quad l\in \{1,2\}. \end{aligned}$$
(31)

Then, there is a constant \({\widetilde{C}}>0\), depending only on \(\Theta \) and d, such that for any \(k\in \mathbb {N}_0\) and arbitrary, fixed \(\varepsilon >0\) it holds that

$$\begin{aligned} \Vert \widetilde{x}^{({1}),k} - \widetilde{x}^{({2}),k}\Vert _2 \le \widetilde{C}\left( \Vert {{\mathbf {A}}}^{(1)} - {{\mathbf {A}}}^{(2)}\Vert _F + \Vert c^{(1)} - c^{(2)}\Vert _2 \right) . \end{aligned}$$
(32)

Proof

By construction of \(\widetilde{\Phi }\) in Theorem 5.8, we have for \(x\in \mathbb {R}^d\), \(l\in \{1,2\}\), and \(i\in \{1,\ldots ,d\}\) that

$$\begin{aligned}&\widetilde{\Phi }(x, z_{{\mathbf {A}}^{(l)}},c^{(l)})_i\\&\quad =\max \left( (1-\omega )x_i - \omega \sum _{j=1, j\ne i}^d \widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}^{(l)}_{ii}},{\mathbf {A}}^{(l)}_{ij}\right) +\omega \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}^{(l)}_{ii}},c^{(l)}_i\right) , 0 \right) . \end{aligned}$$

Therefore, we estimate by the triangle inequality

$$\begin{aligned}&|\widetilde{\Phi }(x, z_{{\mathbf {A}}^{(1)}},c^{(1)})_i - \widetilde{\Phi }(x, z_{{\mathbf {A}}^{(2)}},c^{(2)})_i| \\&\quad \le \omega \sum _{j=1, j\ne i}^d \left| \widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}^{(1)}_{ii}},{\mathbf {A}}^{(1)}_{ij}\right) -\widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}^{(2)}_{ii}},{\mathbf {A}}^{(2)}_{ij}\right) \right| \\&\qquad + \omega \left| \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}^{(1)}_{ii}},c^{(1)}_i\right) - \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}^{(2)}_{ii}},c^{(2)}_i\right) \right| \\&\quad \le \omega \sum _{j=1, j\ne i}^d \left| \widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}^{(1)}_{ii}},{\mathbf {A}}^{(1)}_{ij}\right) -\widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}^{(1)}_{ii}},{\mathbf {A}}^{(2)}_{ij}\right) \right| \\&\qquad + \omega \sum _{j=1, j\ne i}^d \left| \widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}^{(1)}_{ii}},{\mathbf {A}}^{(2)}_{ij}\right) -\widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}^{(2)}_{ii}},{\mathbf {A}}^{(2)}_{ij}\right) \right| \\&\qquad + \omega \left| \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}^{(1)}_{ii}},c^{(1)}_i\right) - \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}^{(1)}_{ii}},c^{(2)}_i\right) \right| \\&\qquad + \omega \left| \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}^{(1)}_{ii}},c^{(2)}_i\right) - \widetilde{\prod }_{\delta _0,\Theta }^2 \left( \frac{1}{{\mathbf {A}}^{(2)}_{ii}},c^{(2)}_i\right) \right| . \end{aligned}$$

Since \(({\mathbf {A}}^{(l)}, c^{(l)})\in \mathfrak A_\Theta \) for \(l\in \{1,2\}\), it holds for any \(i,j\in \{1,\ldots ,d\}\) that \(1/{\mathbf {A}}^{(l)}_{ii}\), \({\mathbf {A}}^{(l)}_{ij}\), \(c^{(l)}_i \in [-\Theta ,\Theta ]\). Hence, for any x with \(\Vert x\Vert _\infty \le \Theta \) we obtain by \(\Theta \ge 2\) and the second estimate in (26)

$$\begin{aligned}&|\widetilde{\Phi }(x, z_{{\mathbf {A}}^{(1)}},c^{(1)})_i - \widetilde{\Phi }(x, z_{{\mathbf {A}}^{(2)}},c^{(2)})_i| \\&\quad \le \omega \sum _{j=1, j\ne i}^d \left( \delta _0 + \left| \frac{x_j}{{\mathbf {A}}_{ii}^{(1)}}\right| \right) \left| {\mathbf {A}}^{(1)}_{ij}-{\mathbf {A}}^{(2)}_{ij}\right| \\&\qquad + \omega \left( \delta _0 + |x_j{\mathbf {A}}_{ij}^{(2)}|\right) \left| \frac{1}{{\mathbf {A}}^{(1)}_{ii}}-\frac{1}{{\mathbf {A}}^{(2)}_{ii}}\right| \\&\qquad + \omega \left( \delta _0 + \frac{1}{{\mathbf {A}}_{ii}^{(1)}}\right) \left| c^{(1)}_i-c^{(2)}_i\right| \\&\qquad + \omega \left( \delta _0 + |c^{(2)}_i|\right) \left| \frac{1}{{\mathbf {A}}^{(1)}_{ii}}-\frac{1}{{\mathbf {A}}^{(2)}_{ii}}\right| \\&\quad \le \omega 2(\delta _0\Theta ^2 + \Theta ^4) \ \left( \sum _{j=1}^d \left| {\mathbf {A}}^{(1)}_{ij}-{\mathbf {A}}^{(2)}_{ij}\right| + \left| c^{(1)}_i-c^{(2)}_i\right| \right) \\&\quad \le \omega 2(\delta _0\Theta ^2 + \Theta ^4) \ \left( d^{1/2}\left( \sum _{j=1}^d \left| {\mathbf {A}}^{(1)}_{ij}-{\mathbf {A}}^{(2)}_{ij}\right| ^2\right) ^{1/2} + \left| c^{(1)}_i-c^{(2)}_i\right| \right) . \end{aligned}$$

We have used the mean-value theorem to obtain the bound

$$\begin{aligned} \left| \frac{1}{{\mathbf {A}}^{(1)}_{ii}}-\frac{1}{{\mathbf {A}}^{(2)}_{ii}}\right| \le \Theta ^2 \left| {\mathbf {A}}^{(1)}_{ii}-{\mathbf {A}}^{(2)}_{ii}\right| \end{aligned}$$

in the second-to-last inequality and the Cauchy–Schwarz inequality in the last step. We recall from the proof of Theorem 5.8 that \(\omega =\Theta ^{-6}\) and \(\delta _0\le d^{-3/2}\); hence, there is a constant \(C=C(\Theta ,d)>0\), depending only on the indicated parameters, such that for any \(x\in \mathbb {R}^d\) with \(\Vert x\Vert _\infty \le \Theta \) it holds

$$\begin{aligned} \Vert \widetilde{\Phi }(x, z_{{\mathbf {A}}^{(1)}},c^{(1)}) - \widetilde{\Phi }(x, z_{{\mathbf {A}}^{(2)}},c^{(2)})\Vert _2 \le C\left( \Vert {\mathbf {A}}^{(1)} - {\mathbf {A}}^{(2)}\Vert _F + \Vert c^{(1)} - c^{(2)}\Vert _2 \right) . \end{aligned}$$
(33)

Moreover, for any \(x, y\in \mathbb {R}^d\) such that \(\Vert x\Vert _\infty , \Vert y\Vert _\infty \le \Theta \), it holds by the mean-value theorem and the second estimate in (26)

$$\begin{aligned} \begin{aligned}&|\widetilde{\Phi }(x, z_{{\mathbf {A}}^{(1)}},c^{(1)})_i - \widetilde{\Phi }(y, z_{{\mathbf {A}}^{(1)}},c^{(1)})_i| \\&\quad \le \left| \widetilde{\Phi }(x, z_{{\mathbf {A}}^{(1)}},c^{(1)})_i - \widetilde{\Phi }(y, z_{{\mathbf {A}}^{(1)}},c^{(1)})_i -(({\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}})(x-y))_i\right| \\&\qquad +\left| (({\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}})(x-y))_i\right| \\&\quad = \omega \left| \sum _{j=1, j\ne i}^d \widetilde{\prod }_{\delta _0,\Theta }^3 \left( x_j,\frac{1}{{\mathbf {A}}^{(1)}_{ii}},{\mathbf {A}}^{(1)}_{ij}\right) -\widetilde{\prod }_{\delta _0,\Theta }^3 \left( y_j,\frac{1}{{\mathbf {A}}^{(1)}_{ii}},{\mathbf {A}}^{(1)}_{ij}\right) - \frac{{\mathbf {A}}^{(1)}_{ij}}{{\mathbf {A}}^{(1)}_{ii}}(x_j-y_j) \right| \\&\qquad +\left| (({\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}})(x-y))_i\right| \\&\quad \le \omega \delta _0 \sum _{j=1, j\ne i}^d |x_j-y_j| + \left| (({\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}})(x-y))_i\right| . \end{aligned} \end{aligned}$$

Hence, Young’s inequality yields for any \({\epsilon }>0\) that

$$\begin{aligned} \begin{aligned}&\Vert \widetilde{\Phi }(x, z_{{\mathbf {A}}^{(1)}},c^{(1)}) - \widetilde{\Phi }(y, z_{{\mathbf {A}}^{(1)}},c^{(1)})\Vert _2^2 \\&\quad \le \sum _{i=1}^d \left( 1+\frac{1}{{\epsilon }}\right) \omega ^2 \delta _0^2 \left( \sum _{j=1, j\ne i}^d |x_j-y_j|\right) ^2 + (1+{\epsilon })\Vert ({\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}})(x-y)\Vert _2^2 \\&\quad \le \left( \left( 1+\frac{1}{{\epsilon }}\right) \omega ^2 \delta _0^2d(d-1) +(1+{\epsilon })\Vert {\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}}\Vert _2^2\right) \Vert x-y\Vert _2^2, \end{aligned} \end{aligned}$$
(34)

where we have used the Cauchy–Schwarz inequality in the last step. From the proof of Theorem 5.8, we have as before that \(\omega =\Theta ^{-6}\), \(\delta _0\le d^{-3/2}\), and furthermore \(\Vert {\mathbf {I}}_d-\omega {\mathbf {D}}^{-1}{\mathbf {A}}\Vert _2\le 1-\Theta ^{-4}\). Setting \({\epsilon }:=\Theta ^{-4}\) therefore shows that \(\widetilde{\Phi }(\cdot , z_{{\mathbf {A}}^{(1)}},c^{(1)}):\mathbb {R}^d\rightarrow \mathbb {R}^d\) is a contraction on \((\mathbb {R}^d,\Vert \cdot \Vert _2)\) with Lipschitz constant \(\widetilde{L}_1>0\) bounded by

$$\begin{aligned} \widetilde{L}_1 \le \left( \left( \Theta ^{-12}+{\Theta ^{-8}}\right) d^{-1} + (1 - \Theta ^{-8}) \right) ^{1/2} { \le \sqrt{1+\Theta ^{-12}-\frac{\Theta ^{-8}}{2}} \le \sqrt{1-\frac{7}{16}\Theta ^{-8}} }\in (0,1). \end{aligned}$$
(35)

Note that we have used \(d\ge 2\) and \(\Theta \ge 2\) in the last two steps to obtain (35). Now, let \((\widetilde{x}^{(l),k})\) for \(l\in \{1,2\}\) and \(k\in \mathbb {N}_0\) denote the iterates as defined in (31) and recall from the proof of Theorem 3.5 that \(\Vert \widetilde{x}^{(l),k}\Vert _\infty \le \Vert \widetilde{x}^{(l),k}\Vert _2\le \Theta \). Therefore, we may apply the estimates in (33) and (34) to obtain

$$\begin{aligned} \Vert \widetilde{x}^{(1),k} - \widetilde{x}^{(2),k}\Vert _2&\le \Vert \widetilde{x}^{(1),k} - \widetilde{\Phi }(\widetilde{x}^{(2),k-1}, z_{{\mathbf {A}}^{(1)}},c^{(1)})\Vert _2 + \Vert \widetilde{\Phi }(\widetilde{x}^{(2),k-1}, z_{{\mathbf {A}}^{(1)}},c^{(1)}) - \widetilde{x}^{(2),k}\Vert _2 \\&\le \widetilde{L}_1 \Vert \widetilde{x}^{(1),k-1} - \widetilde{x}^{(2),k-1}\Vert _2 + C\left( \Vert {\mathbf {A}}^{(1)} - {\mathbf {A}}^{(2)}\Vert _F + \Vert c^{(1)} - c^{(2)}\Vert _2 \right) \\&\le C\left( \Vert {\mathbf {A}}^{(1)} - {\mathbf {A}}^{(2)}\Vert _F + \Vert c^{(1)} - c^{(2)}\Vert _2 \right) \sum _{j=0}^{k-1} \widetilde{L}_1^j \\&\le \frac{C}{1-\widetilde{L}_1}\left( \Vert {\mathbf {A}}^{(1)} - {\mathbf {A}}^{(2)}\Vert _F + \Vert c^{(1)} - c^{(2)}\Vert _2 \right) . \end{aligned}$$

The claim follows for \(\widetilde{C}:=\frac{C}{1-\widetilde{L}_1}<\infty \), since \(C=C(\Theta , d)\) and \(\widetilde{L}_1\) is bounded away from 1 independently of \(\varepsilon \) and k by (35). \(\square \)

Proof of Theorem 5.9

For fixed \(\Theta \) and \(\varepsilon \), let the ProxNet \(\widetilde{\Phi }:\mathbb {R}^d\oplus \mathbb {R}^{d^2}\oplus \mathbb {R}^d\rightarrow \mathbb {R}^d\) and \(k_\varepsilon \in \mathbb {N}\) be given as in Theorem 5.8. We define the operator \(\widetilde{O}_\varepsilon \) by concatenation of \(\widetilde{\Phi }\) via

$$\begin{aligned} \widetilde{O}_\varepsilon ({\mathbf {A}}, c) := \left[ \underbrace{ \widetilde{\Phi }(\cdot ,z_{{\mathbf {A}}}, c) \bullet \cdots \bullet \widetilde{\Phi }(\cdot ,z_{{\mathbf {A}}}, c)} _{k_\varepsilon \text { -fold concatenation}} \right] (0), \quad ({\mathbf {A}}, c)\in \mathfrak A_\Theta . \end{aligned}$$

Note that the initial value \(\widetilde{x}^0:=0\in \mathbb {R}^d\) satisfies \(\Vert \widetilde{x}^0\Vert _\infty \le \Theta \) for arbitrary \(\Theta >0\). Thus, applying Theorem 5.8 with \(\widetilde{x}^0=0\) yields for any LCP \(({\mathbf {A}}, c)\in \mathfrak A_\Theta \) with solution \(x^*\in \mathbb {R}^d\) that

$$\begin{aligned} \Vert O({\mathbf {A}},c)-\widetilde{O}_\varepsilon ({\mathbf {A}}, c)\Vert _2 = \Vert x^*-\widetilde{x}^{k_\varepsilon }\Vert _2 \le \varepsilon . \end{aligned}$$

To show the second part of the claim, we set \(\widetilde{x}^{(1),0}=\widetilde{x}^{(2),0}:=0\) and observe that \(\widetilde{x}^{(1),k_\varepsilon }, \widetilde{x}^{(2),k_\varepsilon }\) in Lemma 5.10 are given by \(\widetilde{x}^{(l),k_\varepsilon }=\widetilde{O}_\varepsilon ({\mathbf {A}}^{(l)}, c^{(l)})\) for \(l\in \{1,2\}\). Hence, the estimate (30) follows immediately for any \(\varepsilon >0\) and \(({\mathbf {A}}^{(1)}, c^{(1)}), ({\mathbf {A}}^{(2)}, c^{(2)})\in \mathfrak A_\Theta \) from (32), by setting \(k=k_\varepsilon \). \(\square \)
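To illustrate the construction of \(\widetilde{O}_\varepsilon \), the following NumPy sketch performs the \(k_\varepsilon \)-fold concatenation with the parameter choices \(\omega =\Theta ^{-6}\) and \(k_\varepsilon \) from the proof of Theorem 5.8; for readability, the emulated multiplication networks \(\widetilde{\prod }_{\delta _0,\Theta }^n\) are replaced by exact multiplications, which corresponds to the limit \(\delta _0\rightarrow 0\) of the construction. Note that \(\omega \) and \(k_\varepsilon \) depend only on \(\Theta \) and \(\varepsilon \), not on the particular LCP \(({\mathbf {A}},c)\in \mathfrak A_\Theta \); this uniformity is the point of the construction.

```python
import numpy as np

def approx_solution_operator(A, c, theta, eps):
    """Sketch of O_eps: iterate the (exact-product) PJOR step with omega = theta**-6
    for k_eps = O(|log eps|) steps, starting from x = 0."""
    omega = theta ** (-6)
    L = 1.0 - theta ** (-4)               # contraction bound used in Theorem 5.8
    C = 2.0 * theta ** 5                  # constant from the error estimate in the proof
    k_eps = int(np.ceil((np.log(eps) - np.log(2.0 * C)) / np.log(L)))
    d_inv = 1.0 / np.diag(A)
    x = np.zeros_like(c)
    for _ in range(max(k_eps, 1)):
        x = np.maximum(x - omega * d_inv * (A @ x - c), 0.0)
    return x
```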

6 Numerical experiments

6.1 Valuation of American options: Black–Scholes model

To illustrate an application for ProxNets, we consider the valuation of an American option in the Black–Scholes model. The associated payoff function of the American option is denoted by \(g:\mathbb {R}_{\ge 0}\rightarrow \mathbb {R}_{\ge 0}\), and we assume a time horizon \(\mathbb {T}=[0,T]\) for \(T>0\). At any time \(t\in \mathbb {T}\) and for any spot price \(x\ge 0\) of the underlying stock, the value of the option is denoted by \(V(t,x)\), which defines a mapping \(V:\mathbb {T}\times \mathbb {R}_{\ge 0}\rightarrow \mathbb {R}_{\ge 0}\). Changing to time-to-maturity and log-price yields the map \(v:\mathbb {T}\times \mathbb {R}\rightarrow \mathbb {R}_{\ge 0},\;(t,x)\mapsto V(T-t,e^x)\), which is the solution to the free boundary value problem

$$\begin{aligned} \begin{aligned} \partial _t v - \frac{\sigma ^2}{2}\partial _{xx} v - \left( r-\frac{\sigma ^2}{2}\right) \partial _x v +rv&\ge 0&\quad \text {in } (0,T]\times \mathbb {R}, \\ v(t,x)&\ge g(e^x)&\quad \text {in } (0,T]\times \mathbb {R}, \\ \left( \partial _t v - \frac{\sigma ^2}{2}\partial _{xx} v - \left( r-\frac{\sigma ^2}{2}\right) \partial _x v +rv\right) (g-v)&=0&\quad \text {in } (0,T]\times \mathbb {R}, \\ v(0,e^x)&= g(e^x)&\quad \text {in } \mathbb {R}, \end{aligned} \end{aligned}$$
(36)

see, e.g., [14, Chapter 5.1]. The parameters \(\sigma >0\) and \(r\in \mathbb {R}\) are the volatility of the underlying stock and the interest rate, respectively. We assume that \(g\in H^1(\mathbb {R}_{\ge 0})\) and construct in the following a ProxNet-approximation to the payoff-to-solution operator at time \(t\in \mathbb {T}\) given by

$$\begin{aligned} {O_{\text {payoff},t}}: H^1(\mathbb {R}_{\ge 0})\rightarrow H^1(\mathbb {R}),\quad g\mapsto v(t,\cdot ). \end{aligned}$$
(37)

As V and v, and therefore \({O_{\text {payoff},t}}\), are in general not known in closed-form, a common approach to approximate v for a given payoff function g is to restrict Problem (36) to a bounded domain \(\mathcal {D}\subset \mathbb {R}\) and to discretize \(\mathcal {D}\) by linear finite elements based on d equidistant nodal points. The payoff function is interpolated with respect to the nodal basis, and we collect the respective interpolation coefficients of g in the vector \(\underline{g}\in \mathbb {R}^d\). The time domain [0, T] is split into \(M\in \mathbb {N}\) equidistant time steps of size \(\Delta t=T/M\), and the temporal derivative is approximated by a backward Euler approach. This space-time discretization of the free boundary problem (36) leads to a sequence of discrete variational inequalities: Given \(\underline{g}\in \mathbb {R}^d\) and \(u_0:=0\in \mathbb {R}^d\), find \(u_{m+1}\in \mathbb {R}^d\) for \(m\in \{0,\ldots ,M-1\}\) such that it holds

$$\begin{aligned} {\mathbf {A}}u_{m+1} \ge F_m ,\quad u_{m+1} \ge 0,\quad ({\mathbf {A}}u_{m+1}-F_m)^\top u_{m+1} = 0. \end{aligned}$$
(38)

The LCP (38) is defined by the matrices \({\mathbf {A}}:= \mathbf {M} + \Delta t{\mathbf {A}}^{BS}\in \mathbb {R}^{d\times d}\), \({\mathbf {A}}^{BS} := \frac{\sigma ^2}{2}\mathbf {S} + (\frac{\sigma ^2}{2}-r)\mathbf {B} + r\mathbf {M} \in \mathbb {R}^{d\times d}\) and right hand side \(F_m:=-\Delta t ({\mathbf {A}}^{BS})^\top \underline{g} +\mathbf {M} u_m\in \mathbb {R}^{d}\). The matrices \(\mathbf {S}, \mathbf {B}, \mathbf {M}\in \mathbb {R}^{d\times d}\) represent the finite element stiffness, advection and mass matrices; hence, \({\mathbf {A}}\) is tri-diagonal and asymmetric if \(\frac{\sigma ^2}{2}\ne r\). The true value of the options at time \(\Delta t\, m\) is approximated at the nodal points via \(v(\Delta tm, \cdot )\approx u_m+\underline{g}\). This yields the discrete payoff-to-solution operator at time \(\Delta tm\) defined by

$$\begin{aligned} { \overline{O}_{\text {payoff},\Delta tm}}:\mathbb {R}^d\rightarrow \mathbb {R}^d,\quad \underline{g} \mapsto u_m+\underline{g}, \qquad m\in \{1,\ldots , M\}. \end{aligned}$$
(39)

Problem (38) may be solved for all m using a shallow ProxNet

$$\begin{aligned} \Phi :\mathbb {R}^d\oplus \mathbb {R}^d\oplus \mathbb {R}^d \rightarrow \mathbb {R}^d, \quad x\mapsto R(W_1x+b_1), \end{aligned}$$

with ReLU-activation \(R=\varrho ^{(d)}:\mathbb {R}^d\rightarrow \mathbb {R}^d\). The architecture of \(\Phi \) allows us to take \(\underline{g}\) and \(u_m\) as additional inputs in each step; hence, we train only one shallow ProxNet that may be used for any payoff function g and every time horizon \(\mathbb {T}\). Therefore, we learn the payoff-to-solution operator \({O_{\text {payoff},t}}\) associated with Problem (36) by concatenating \(\Phi \). The parameters \(W_1\in \mathbb {R}^{d\times 3d}\) and \(b_1\in \mathbb {R}^d\) are learned in the training process and shall emulate one step of the PJOR Algorithm 1, as well as the linear transformation \((\underline{g}, u_m)\mapsto F_m\) to obtain the right hand side in (38). In total, \(3d^2+d\) parameters have to be learned in each example.
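As an illustration of this architecture, a minimal PyTorch sketch of the shallow ProxNet \(\Phi (x,\underline{g},u_m)=\varrho ^{(d)}(W_1[x;\underline{g};u_m]+b_1)\) could read as follows; the class name and code organization are ours and only indicate one possible realization.

```python
import torch
import torch.nn as nn

class ShallowProxNet(nn.Module):
    """One-layer ProxNet x -> ReLU(W_1 [x; g; u] + b_1) with W_1 of shape (d, 3d)."""
    def __init__(self, d: int):
        super().__init__()
        self.linear = nn.Linear(3 * d, d)   # learns W_1 and b_1, i.e. 3d^2 + d parameters
        self.act = nn.ReLU()                # proximity operator of the indicator of [0, infty)^d

    def forward(self, x, g, u):
        z = torch.cat((x, g, u), dim=-1)    # stack the three d-dimensional inputs
        return self.act(self.linear(z))
```

Iterating the forward pass with frozen inputs \(\underline{g}\) and \(u_m\) then realizes the concatenations used below.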

For our experiments, we use the Python-based machine learning package PyTorch.Footnote 2 All experiments are run on a notebook with 8 CPUs, each with 1.80 GHz, and 16 GB memory. To train \(\Phi \), we sample \(N_s\in \mathbb {N}\) input data points \(x^{(i)}:=(x_0^{(i)},\underline{g}^{(i)}, u^{(i)})\in \mathbb {R}^{3d}\), \(i\in \{1,\ldots ,N_s\}\), from a 3d-dimensional standard-normal distribution. The output-training data samples \(y^{(i)}\) consist of one iteration of Algorithm 1 with \(\omega =1\), initial value \(x^0:=x_0^{(i)}\), with \({\mathbf {A}}\) as in  (38) and right hand side given by \(c:=-\Delta t ({\mathbf {A}}^{BS})^\top \underline{g}^{(i)} +\mathbf {M} u^{(i)}\in \mathbb {R}^{d}\). We draw a total of \(N_s=2\cdot 10^4\) input–output samples, use half of the data for training, and the other half for validation. In the training process, we use mini-batches of size \(N_{\mathrm{batch}}=100\) and the Adam Optimizer [18] with initial learning rate \(10^{-3}\), which is reduced by 50% every 20 epochs. As error criterion, we use the mean-squared error (MSE) loss function, which is for each batch of inputs \(((x^{(i_j)},\underline{g}^{(i_j)}, u^{(i_j)}), j=1,\ldots ,N_{\mathrm{batch}})\) and outputs \((y^{(i_j)}, j=1,\ldots ,N_{\mathrm{batch}})\) given by

$$\begin{aligned}&Loss \left( (x^{(i_1)},\underline{g}^{(i_1)}, u^{(i_1)}), \cdots , (x^{(i_{N_{\mathrm{batch}}})},\underline{g}^{(i_{N_{\mathrm{batch}}})}, u^{(i_{N_{\mathrm{batch}}})}) \right) \\&:= \frac{1}{N_{\mathrm{batch}}}\sum _{j=1}^{N_{\mathrm{batch}}} \Vert \Phi (x^{(i_j)}, \underline{g}^{(i_j)}, u^{(i_j)})-y^{(i_j)}\Vert _2^2. \end{aligned}$$

We stop the training process if the loss function falls below the tolerance \(10^{-12}\) or after a maximum of 300 epochs. The number of spatial nodal points d that determines the size of the matrix LCPs is varied throughout our experiments in \(d\in \{200,400,\ldots ,1000\}\). We choose the Black–Scholes parameters \(\sigma =0.1\), \(r=0.01\) and \(T=1\). Spatial and temporal refinement are balanced by using \(M=d\) time steps of size \(\Delta t=T/M=1/d\). The decay of the loss-curves is depicted in Fig. 1, where the reduction in the learning rate every 20 epochs explains the characteristic “steps” in the decay. This stabilizes the training procedure, and we reached a loss of \(\mathcal {O}(10^{-12})\) for each d before the 250th epoch.
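A condensed sketch of this training loop (Adam with initial learning rate \(10^{-3}\), halving of the learning rate every 20 epochs, the batch loss from above, and early stopping at the tolerance \(10^{-12}\)) might read as follows; the tensors `x_tr, g_tr, u_tr, y_tr` are assumed to contain the training samples described above, each of shape \((N_s/2, d)\).

```python
import torch

# model = ShallowProxNet(d); x_tr, g_tr, u_tr, y_tr are training tensors of shape (N, d)
def train(model, x_tr, g_tr, u_tr, y_tr, epochs=300, batch=100, tol=1e-12):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=20, gamma=0.5)  # halve lr every 20 epochs
    n = x_tr.shape[0]
    for _ in range(epochs):
        perm = torch.randperm(n)
        for k in range(0, n, batch):
            idx = perm[k:k + batch]
            opt.zero_grad()
            # batch loss: mean over the batch of the squared Euclidean error
            loss = ((model(x_tr[idx], g_tr[idx], u_tr[idx]) - y_tr[idx]) ** 2).sum(dim=-1).mean()
            loss.backward()
            opt.step()
        sched.step()
        if loss.item() < tol:   # early stopping once the loss falls below the tolerance
            break
    return model
```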

Fig. 1

Decay of the loss function for \(d=600\) (left) and \(d=1000\) (right). In all of our experiments, the training loss falls below the threshold of \(10^{-12}\) before the 250th epoch, and training is stopped early

Once training is terminated, we compress the resulting weight matrix of the trained single-layer ProxNet by setting all entries with absolute value lower than \(10^{-{7}}\) to zero. This speeds up evaluation of the trained network, while the resulting error is negligible. As the matrix \(W_1\) in the trained ProxNet is close to the “true” tri-diagonal matrix \({\mathbf {A}}\) from (38), this eliminates most of the ProxNet’s \(\mathcal {O}(d^2)\) parameters, and only \(\mathcal {O}(d)\) non-trivial entries remain.
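The compression step itself amounts to a simple thresholding of the trained weight matrix, e.g., as in the following sketch based on the `ShallowProxNet` class above (for an actual speed-up one would additionally store the compressed weights in a sparse format):

```python
import torch

def compress_(model, threshold=1e-7):
    """Set all weight entries with absolute value below the threshold to zero (in place)."""
    with torch.no_grad():
        w = model.linear.weight
        w[w.abs() < threshold] = 0.0
```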

The relative validation error is estimated based on the \(N_{\mathrm{val}}:=10^4\) validation samples via

$$\begin{aligned} \hbox {err}_{\mathrm{val}}^2 := \frac{\sum _{j=1}^{N_{\mathrm{val}}} \Vert \Phi (x^{(i_j)}, \underline{g}^{(i_j)}, u^{(i_j)})-y^{(i_j)}\Vert _2^2}{\sum _{j=1}^{N_{\mathrm{val}}} \Vert y^{(i_j)}\Vert _2^2}. \end{aligned}$$
(40)

The validation errors and training times for each dimension are found in Table 1 and confirm the successful training of the ProxNet. Naturally, training time increases in d, while the validation error is small, of order \(\mathcal {O}(10^{-6})\), for all d.

Table 1 Training times and validation errors for the ProxNets in the Black–Scholes model in several dimensions, as estimated in (40) based on \(N_{\mathrm{val}}=10^4\) samples

To test the trained neural networks on Problem (38) for the valuation of an American option, we consider a basket of 20 put options with payoff function \(g_i(x):=\max (K_i-x,0)\) and strikes \(K_i= 10+90\frac{i}{20}\) for \(i\in \{1,\ldots ,20\}\). Hence, we use the same ProxNet for 20 different payoff vectors \(\underline{g}_i\). Note that we did not train our networks on payoff functions, but on random samples, and thus, we could in principle consider an arbitrary basket containing different types of payoffs. The restriction to put options is for the sake of brevity only. We denote by \(u_{m,i}\) for \(m\in \{0,\ldots ,M\}\) the sequence of solutions to (38) with payoff vector \(\underline{g}_i\) and \(u_{0,i}=0\in \mathbb {R}^d\) for each i.

Concatenating \(\Phi \) k times yields an approximation to the discrete operator \({ \overline{O}_{\text {payoff},\Delta tm}}\) in (39) for any \(m\in \{1,\ldots ,M\}\) via

$$\begin{aligned} {\widetilde{O}_{\text {payoff},\Delta tm}}:\mathbb {R}^d\oplus \mathbb {R}^d\oplus \mathbb {R}^d \rightarrow \mathbb {R}^d, \quad (x, \widetilde{u}_m, \underline{g}) \mapsto \left[ \underbrace{ \Phi (\cdot ,\underline{g}, \widetilde{u}_{m})\bullet \cdots \bullet \Phi (\cdot ,\underline{g}, \widetilde{u}_{m})}_{k\text {-fold concatenation}} \right] (x). \end{aligned}$$

An approximating sequence of \((u_{m,i}, m\in \{0,\ldots ,M\})\) is then in turn generated by

$$\begin{aligned} \widetilde{u}_{m+1,i} := {\widetilde{O}_{\text {payoff},\Delta tm}}(\widetilde{u}_{m,i}, \widetilde{u}_{m,i}, \underline{g}_i),\quad \widetilde{u}_{0,i}:=u_{0,i}=0\in \mathbb {R}^d. \end{aligned}$$

That is, \(\widetilde{u}_{m+1,i}\) is given by iterating \(\Phi \) k times with initial input \(x^0=\widetilde{u}_{m,i}\in \mathbb {R}^d\) and fixed inputs \(\underline{g}_i\) and \(\widetilde{u}_{m,i}\). We stop for each m after k iterations if two subsequent iterates \(x^k\) and \(x^{k-1}\) satisfy \(\Vert x^k-x^{k-1}\Vert _2<10^{-3}\).
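In code, the time stepping with the concatenated ProxNet and the stopping criterion \(\Vert x^k-x^{k-1}\Vert _2<10^{-3}\) may be sketched as follows; `phi` denotes a trained (and compressed) shallow ProxNet as above and `g` a batch of payoff vectors, both of which are assumptions of this sketch.

```python
import torch

def solve_time_stepping(phi, g, M, tol=1e-3, max_iter=1000):
    """Approximate u_1, ..., u_M by iterating phi(., g, u_m) to a fixed point in each time step."""
    u = torch.zeros_like(g)                    # u_0 = 0 (excess to payoff)
    for m in range(M):
        x_old = u
        for _ in range(max_iter):
            x_new = phi(x_old, g, u)           # one emulated PJOR step with frozen (g, u_m)
            # stop once the largest per-sample increment in the batch is below tol
            if torch.linalg.norm(x_new - x_old, dim=-1).max() < tol:
                break
            x_old = x_new
        u = x_new                              # u_{m+1}
    return u                                   # option values are recovered as u + g
```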

The reference solution \(u_{M,i}\) is calculated by a Python implementation that uses the Primal-Dual Active Set (PDAS) algorithm from [15] to solve LCP (38) with tolerance \(\varepsilon =10^{-6}\) in every time step. Compared to a fixed-point iteration, the PDAS method converges (locally) superlinearly according to [15, Theorem 3.1], but has to be called separately for each payoff function \(g_i\). In contrast, the ProxNet \(\Phi \) may be iterated for the entire batch of 20 payoffs at once in PyTorch. We measure the relative error

$$\begin{aligned} \hbox {err}_{i,\mathrm{rel}} := \Vert \widetilde{u}_{M,i}-u_{M,i}\Vert _2/\Vert u_{M,i}\Vert _2 \end{aligned}$$

for each payoff vector \(\underline{g}_i\) at the end point \(T=\Delta tM=1\) and report the sample mean error

$$\begin{aligned} \hbox {err}_{\mathrm{rel}}:=\frac{1}{20}\sum _{i=1}^{20}\hbox {err}_{i,\mathrm{rel}}. \end{aligned}$$
(41)

Sample mean errors and computational times are depicted for \(d\in \{200,400,\ldots ,1000\}\) in Table 2, where we also report the number of iterations k for each d to achieve the desired tolerance of \(10^{-3}\). The results clearly show that ProxNets significantly accelerate the valuation of American option baskets compared to the standard, PDAS-based implementation. This holds true for any spatial resolution, i.e., the number of grid points d, while the relative error remains small, of magnitude \(\mathcal {O}(10^{-3})\) or \(\mathcal {O}(10^{-4})\). For \(d\ge 600\), we actually find that the combined time for training and evaluation of ProxNets is below the runtime of the reference solution. We further observe that computational times scale similarly in d for both the ProxNet and the reference solution. Hence, in our experiments, ProxNets are computationally advantageous even for a fine resolution of \(d=1000\) nodal points.

Table 2 Relative errors and computational times of a ProxNet solver for a basket of American put options in the Black–Scholes model

6.2 Valuation of American options: jump-diffusion model

We generalize the setting of the previous subsection from the Black–Scholes market to an exponential Lévy model. That is, the log-price of the stock evolves as a Lévy process, with jumps distributed with respect to the Lévy measure \(\nu :\mathcal {B}(\mathbb {R})\rightarrow [0,\infty )\). The option value v (in log-price and time-to-maturity) is now the solution of a partial integro-differential inequality given by

(42)

Introducing jumps in the model hence adds a non-local integral term to Eq. (36). The drift is set to \(\gamma :=-\frac{\sigma ^{2}}{2}-\int _{\mathbb {R}}(e^z-1-z)\nu (\mathrm{d}z)\in \mathbb {R}\) in order to eliminate arbitrage in the market. We discretize Problem (42) by an equidistant grid in space and time as in the previous subsection; for details, e.g., on the integration with respect to \(\nu \), we refer to [14, Chapter 10]. The space-time approximation yields again a sequence of LCPs of the form

$$\begin{aligned} {\mathbf {A}}^L u_{m+1} \ge F_m ,\quad u_{m+1} \ge 0,\quad ({\mathbf {A}}^L u_{m+1}-F_m)^\top u_{m+1} = 0, \end{aligned}$$
(43)

where \({\mathbf {A}}^L := \mathbf {M} + \Delta t{\mathbf {A}}^{Levy}\in \mathbb {R}^{d\times d}\) with \({\mathbf {A}}^{Levy} := \frac{\sigma ^2}{2}\mathbf {S} + {\mathbf {A}}^J\), and the matrix \({\mathbf {A}}^J\) stems from the integration of \(\nu \). A crucial difference to (38) is that \({\mathbf {A}}^L\) is no longer tri-diagonal, but a dense matrix, due to the non-local integral term caused by the jumps. The drift \(\gamma \) and interest rate r are transformed into the right hand side, such that \(F_m:=-\Delta t ({\mathbf {A}}^{Levy})^\top \underline{g}_m +\mathbf {M} u_m\in \mathbb {R}^{d}\), where \(\underline{g}_m\) is the nodal interpolation of the transformed payoff \(g_m(x):=e^{r\Delta t m}g(x-(\gamma +r)\Delta t m)\). The inverse transformation gives an approximation to the solution v of (42) at the nodal points via \(v(T, \cdot - (\gamma +r)T)\approx e^{-rT}u_M\). We refer to [14, Chapter 10.6] for further details on the discretization of American options in Lévy models.

The jumps are distributed according to the Lévy measure

$$\begin{aligned} \nu (dz) = \lambda p\beta _+e^{-\beta _+z}\mathbb {1}_{\{z> 0\}}(z) + \lambda (1-p)\beta _-e^{\beta _-z}\mathbb {1}_{\{z< 0\}}(z), \quad z\in \mathbb {R}. \end{aligned}$$
(44)

That is, the jumps follow an asymmetric, double-sided exponential distribution with jump intensity \(\lambda =\nu (\mathbb {R})\in (0,\infty )\). We choose \(p=0.7\), \(\beta _+=25\), \(\beta _-=20\) to characterize the tails of \(\nu \) and set jump intensity to \(\lambda =1\). We further use \(\sigma =0.1\) and \(r=0.01\) as in the Black–Scholes example.

We use the same training procedure and parameters as in the previous subsection to train the shallow ProxNets. The only difference is that we compress the weight matrix with tolerance \(10^{-8}\) instead of \(10^{-7}\) (recall that \({\mathbf {A}}^L\) is dense). This yields slightly better relative errors in this example, while it does not affect the time to evaluate the ProxNets. Training times and validation errors are depicted in Table 3 and again indicate successful training. The decay of the training loss is for each d very similar to Fig. 1, and training is again stopped in each case before the 300th epoch.

Table 3 Training times and validation errors for the ProxNets in the jump-diffusion model, as estimated in (40) based on \(N_{\mathrm{val}}=10^4\) samples

After training, we again concatenate the shallow nets to approximate the operator \({O_{\text {payoff},t}}\) in (37), which maps the payoff function g to the corresponding option value \(v(t,\cdot )\) at any (discrete) point in time. We repeat the test from Sect. 6.1 in the jump-diffusion model with the identical basket of put options to test the trained ProxNets. The reference solution is again computed by a PDAS-based implementation. The results for American options in the jump-diffusion model are depicted in Table 4. Again, we see that the trained ProxNets approximate the solution v to (42) for any g to an error of magnitude \(\mathcal {O}(10^{-3})\) or less. While keeping the relative error small, ProxNets again significantly reduce computational time and are therefore a valid alternative in more involved financial market models. We finally observe that the number of iterations to tolerance in the jump-diffusion model is stable at 6–7 for all d, whereas this number increases with d in the Black–Scholes market (compare the third row in Tables 2 and 4). The explanation for this effect is that the excess-to-payoff vector \(u_M\) has a smaller norm in the jump-diffusion case, but the iterations terminate at the (absolute) threshold \(10^{-3}\) in both the Black–Scholes and the jump-diffusion model. Therefore, we require fewer iterations in the latter scenario, although the option prices v and relative errors are of comparable magnitude in both examples.

Table 4 Relative errors and computational times of a ProxNet solver for a basket of American put options in the jump-diffusion model

6.3 Parametric obstacle problem

To show an application for ProxNets beyond finance, we consider an elliptic obstacle problem in the two-dimensional domain \(\mathcal {D}:=(-1,1)^2\). We define \(\mathcal {H}:=H_0^1(\mathcal {D})\) and aim to find the solution \(u\in \mathcal {H}\) to the partial differential inequality

$$\begin{aligned} - \bigtriangleup u \ge f \quad \text {in } \mathcal {D}, \qquad u \ge g \quad \text {in } \mathcal {D}, \qquad u = 0 \quad \text {on } \partial \mathcal {D}. \end{aligned}$$
(45)

Therein, \(f\in \mathcal {H}'\) is a given source term and \(g\in \mathcal {H}\) is an obstacle function, for which we assume \(g\in C(\overline{\mathcal {D}})\cap \mathcal {H}\) for simplicity in the following. We introduce the convex set \(\mathcal {K}:=\{v\in \mathcal {H}|\, v\ge g \text { almost everywhere}\}\) and the bilinear form

$$\begin{aligned} a:\mathcal {H}\times \mathcal {H}\rightarrow \mathbb {R},\quad (v,w)\mapsto \int _\mathcal {D}\nabla v\cdot \nabla w\, \mathrm{d}x, \end{aligned}$$

and note that a, f and \(\mathcal {K}\) satisfy Assumption 4.1. The variational inequality problem associated with (45) is then to

$$\begin{aligned} \text {find } u\in \mathcal {K}\text { such that:}\quad a(u,v-u) \ge f(v-u),\quad \forall v\in \mathcal {K}. \end{aligned}$$
(46)

As for (15) at the beginning of Sect. 5, we introduce \(\mathcal {K}_0:=\{v\in \mathcal {H}|\, v\ge 0 \text { almost everywhere}\}\), and Problem (46) is equivalent to finding \(u=u_0+g\in \mathcal {K}\)

$$\begin{aligned} \text {with } u_0\in \mathcal {K}_0 \text { such that:}\quad a(u_0,v-u_0) \ge f(v-u_0)-a(g,v-u_0),\quad \forall v\in \mathcal {K}_0. \end{aligned}$$
(47)

As for the previous examples in this section, we use ProxNets to emulate the obstacle-to-solution operator

$$\begin{aligned} O_{\text {obs}}: \mathcal {H}\rightarrow \mathcal {H},\quad g\mapsto u. \end{aligned}$$
(48)

We discretize \(\overline{\mathcal {D}}=[-1,1]^2\) for \(d_0\in \mathbb {N}\) by a \((d_0+2)^2\)-dimensional nodal basis of linear finite elements, based on \((d_0+2)\) equidistant points in every dimension. Due to the homogeneous Dirichlet boundary conditions in (45), we only have to determine the discrete approximation of u within \(\mathcal {D}\) and may restrict ourselves to a finite element basis \(\{v_1,\ldots ,v_d\}\), for \(d:=d_0^2\), with respect to the interior nodal points. Following the procedure outlined in Sect. 5.1, we denote by \(\underline{g}\in \mathbb {R}^d\) again the nodal interpolation coefficients of g (recall that we have assumed \(g\in C(\overline{\mathcal {D}})\)) and by \({\mathbf {A}}\in \mathbb {R}^{d\times d}\) the finite element stiffness matrix with entries \({\mathbf {A}}_{ij}:=a(v_j, v_i)\) for \(i,j\in \{1,\ldots ,d\}\). This leads to the matrix LCP to find \(\underline{u}\in \mathbb {R}^d\) such that

$$\begin{aligned} \begin{aligned} {\mathbf {A}}\underline{u} \ge c, \quad \underline{u}\ge 0, \quad \underline{u}^\top ({\mathbf {A}}\underline{u} - c)&= 0, \end{aligned} \end{aligned}$$
(49)

where \(c\in \mathbb {R}^d\) is given by \(c_i:=f(v_i)-({\mathbf {A}}^\top \underline{g})_i\) for \(i\in \{1,\ldots ,d\}\) (see the assembly sketch following (50)). Given a fixed spatial discretization based on d nodes, we again approximate the discrete obstacle-to-solution operator

$$\begin{aligned} \overline{O}_{\text {obs}}: \mathbb {R}^d\rightarrow \mathbb {R}^d ,\quad \underline{g}\mapsto \underline{u} \end{aligned}$$
(50)

by concatenating shallow ProxNets \(\Phi :\mathbb {R}^d\oplus \mathbb {R}^d\rightarrow \mathbb {R}^d\).
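The assembly of \({\mathbf {A}}\) and c in (49) for the uniform mesh described above may be sketched as follows. This is an illustrative sketch only, not the implementation used for the reported results: it exploits that the linear finite element stiffness matrix of the Laplacian on a uniform (Friedrichs–Keller) triangulation of the square coincides with the classical 5-point stencil, and the grid size d0, the constant source term \(f\equiv 1\) and the placeholder obstacle g are assumptions made solely for this example. The names A, f_load, X and Y are reused in the sketches below.

```python
# Minimal assembly sketch for the LCP data (A, c) in (49); illustrative only.
import numpy as np
import scipy.sparse as sp

d0 = 40                                    # interior nodes per dimension, d = d0**2
h = 2.0 / (d0 + 1)                         # mesh width on (-1, 1)

# 5-point-stencil stiffness matrix of -Laplace for linear FE on a uniform mesh
T = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(d0, d0))
S = sp.diags([-1.0, -1.0], [-1, 1], shape=(d0, d0))
Id = sp.identity(d0)
A = (sp.kron(Id, T) + sp.kron(S, Id)).tocsr()   # A_ij = a(v_j, v_i), size d x d

xg = np.linspace(-1 + h, 1 - h, d0)             # interior grid points per dimension
X, Y = np.meshgrid(xg, xg, indexing="ij")

f_load = h**2 * np.ones(d0**2)             # f(v_i) = h^2 for the constant source f = 1 (placeholder)

def g(x1, x2):                             # placeholder obstacle in C(D-bar), for illustration only
    return np.maximum(0.25 - x1**2 - x2**2, 0.0)

g_nodes = g(X, Y).ravel()                  # nodal interpolation coefficients of g
c = f_load - A.T @ g_nodes                 # c_i = f(v_i) - (A^T g)_i
```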

The training process of the ProxNets in the obstacle problem is the same as in Sects. 6.1 and 6.2 and is therefore not outlined further here. The only difference is that the training inputs are now drawn from a 2d-dimensional standard normal distribution. The output samples again correspond to one PJOR iteration with \({\mathbf {A}}\) and c as in (49) and \(\omega =1\), where the initial value and \(\underline{g}\) are both replaced by the 2d-dimensional random input vector. After training, we again compress the weight matrices by setting all entries with absolute value below \(10^{-7}\) to zero. We test the ProxNets for LCPs of dimension \(d\in \{10^2,20^2, 30^2, 40^2\}\) and report training times and validation errors in Table 5. As before, training is successful and stopped early for each d, since the loss function falls below \(10^{-12}\) before the 300th epoch.
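To make the training target explicit, one PJOR step for (49) with \(\omega =1\), applied to the pair \((x,\underline{g})\), is itself a shallow ReLU network: with \({\mathbf {D}}:=\mathrm {diag}({\mathbf {A}})\) and load vector \(\underline{f}\in \mathbb {R}^d\), \(\underline{f}_i:=f(v_i)\),

$$\begin{aligned} x \mapsto \max \bigl (0,\, x - {\mathbf {D}}^{-1}({\mathbf {A}}x - c)\bigr ) = \mathrm {ReLU}\bigl (({\mathbf {I}}-{\mathbf {D}}^{-1}{\mathbf {A}})\,x - {\mathbf {D}}^{-1}{\mathbf {A}}^\top \underline{g} + {\mathbf {D}}^{-1}\underline{f}\bigr ), \end{aligned}$$

cf. the PJOR-Net of Example 5.3. The following sketch fits a shallow net to exactly this map. It continues the assembly sketch above (reusing A and f_load); the optimizer, learning rate, batch size and epoch count are illustrative placeholders and not the settings used for the reported results.

```python
# Training sketch: fit Phi(x, g) = ReLU(W [x; g] + b) to one PJOR step with omega = 1;
# inputs are 2d-dimensional standard normal samples. Hyperparameters are illustrative.
import torch

d = A.shape[0]
A_t = torch.tensor(A.toarray(), dtype=torch.float64)
f_t = torch.tensor(f_load, dtype=torch.float64)
D_inv = 1.0 / torch.diag(A_t)

def pjor_step(x, g):
    # one projected Jacobi iteration: max(0, x - D^{-1}(A x - c(g))) with c(g) = f - A^T g
    c = f_t - g @ A_t
    return torch.clamp(x - D_inv * (x @ A_t.T - c), min=0.0)

phi = torch.nn.Sequential(torch.nn.Linear(2 * d, d), torch.nn.ReLU()).double()
opt = torch.optim.Adam(phi.parameters(), lr=1e-3)

for epoch in range(300):
    z = torch.randn(1024, 2 * d, dtype=torch.float64)      # samples of (x, g)
    loss = torch.mean((phi(z) - pjor_step(z[:, :d], z[:, d:])) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if loss.item() < 1e-12:                                 # early stopping as described in the text
        break
```

As the loss tends to zero, the learned weight matrix approaches \([{\mathbf {I}}-{\mathbf {D}}^{-1}{\mathbf {A}}\;\; -{\mathbf {D}}^{-1}{\mathbf {A}}^\top ]\) and the bias approaches \({\mathbf {D}}^{-1}\underline{f}\); since \({\mathbf {A}}\) is sparse, most entries of these target weights vanish, which is consistent with the weight compression described above.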

Table 5 Training times and validation errors for the ProxNets in the obstacle problem, as estimated in (40) based on \(N_{\mathrm{val}}=10^4\) samples

Once \(\Phi :\mathbb {R}^d\oplus \mathbb {R}^d\rightarrow \mathbb {R}^d\) is trained for given d, we use the initial value \(x=0\in \mathbb {R}^d\) and concatenate \(\Phi \) k times to obtain, for any \(\underline{g}\), the approximate discrete obstacle-to-solution operator

$$\begin{aligned} \widetilde{O}_{\text {obs}}:\mathbb {R}^d\rightarrow \mathbb {R}^d, \quad \underline{g} \mapsto \left[ \underbrace{ \Phi (\cdot ,\underline{g})\bullet \cdots \bullet \Phi (\cdot ,\underline{g})}_{k\text {-fold concatenation}} \right] (0). \end{aligned}$$

This yields \(\underline{u} = \overline{O}_{\text {obs}}(\underline{g})\approx \underline{u}^k:=\widetilde{O}_{\text {obs}}(\underline{g})\). We test the trained ProxNets on the parametric family of obstacles \((g_r, r>0)\subset \mathcal {H}\), given by

$$\begin{aligned} g_r(x):=\min \left( \max \left( e^{-r\Vert x\Vert _2^2} - \frac{1}{2}, 0\right) , \frac{1}{4}\right) , \quad x\in \mathcal {D}. \end{aligned}$$
(51)

For given \(r>0\), let \(\underline{g}_r\in \mathbb {R}^d\) denote the nodal interpolation of \(g_r\), and let \(\underline{u}_r\) be the discrete solution to the corresponding obstacle problem. We approximate the solutions \(\underline{u}_r\) to (49) for a basket of 100 obstacles \(g_r\) with \(r\in \mathfrak R:=\{1+\frac{4i}{99}|\; i\in \{0,\ldots ,99\}\}\). To this end, we iterate the ProxNet \(\Phi \) again on the entire batch of obstacles and denote by \(\underline{u}_r^k\) the kth iterate for any \(r\in \mathfrak R\). We stop the concatenation of \(\Phi \) after k iterations if \(\max _{r\in \mathfrak R} \Vert \underline{u}_r^k-\underline{u}_r^{k-1}\Vert _2<10^{-4}\), and report the value of k for each d. The lower absolute tolerance is necessary in the obstacle problem, since the solutions \(u_r\) have a smaller absolute magnitude than in the previous examples. The reference solution is again calculated by solving (49) with the PDAS algorithm, which has to be called separately for each obstacle in \((g_r, r\in \mathfrak R)\). A sample obstacle \(g_r\), together with the associated discrete solution \(\underline{u}_r\) and its ProxNet approximation \(\underline{u}_r^k\), is depicted in Fig. 2.
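In code, the batched version of \(\widetilde{O}_{\text {obs}}\) with the stopping rule just described may be sketched as follows; this continues the earlier illustrative sketches (X, Y and phi are the placeholders introduced there), and the iteration cap is arbitrary.

```python
# Usage sketch for the obstacle family (51), batched over all 100 parameters r.
import numpy as np
import torch

radii = 1.0 + 4.0 * np.arange(100) / 99.0                # the set R from the text
G = np.stack([np.minimum(np.maximum(np.exp(-r * (X**2 + Y**2)) - 0.5, 0.0), 0.25).ravel()
              for r in radii])                           # nodal values of g_r, shape (100, d)

def O_obs_tilde(G_t, tol=1e-4, max_iter=100_000):
    # k-fold application of phi(., g) to the zero vector, batched over obstacles;
    # stops once max_r ||u_r^k - u_r^{k-1}||_2 < tol
    U = torch.zeros_like(G_t)
    for k in range(1, max_iter + 1):
        U_new = phi(torch.cat([U, G_t], dim=-1))
        if torch.norm(U_new - U, dim=-1).max() < tol:
            return U_new, k
        U = U_new
    return U, max_iter

with torch.no_grad():
    U_k, k = O_obs_tilde(torch.tensor(G, dtype=torch.float64))
```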

Fig. 2

From left to right: Obstacle \(g_r\) as in (51) with scale parameter \(r=1.7677\), the corresponding discrete solution \(\underline{u}_r\) with refinement parameter \(h:=\frac{2}{41}\) in each spatial dimension (corresponding to \(d=40^2\) interior nodal points in \(\mathcal {D}\)), and its ProxNet approximation \(\underline{u}_r^k\) based on \(k=698\) iterations

The relative error of the ProxNet approximation, the number of iterations and the computational times are reported in Table 6. ProxNets approximate the discrete solutions well, with relative errors of magnitude \(\mathcal {O}(10^{-4})\) for all d. However, compared to the examples in Sects. 6.1 and 6.2, significantly more iterations are necessary to reach the absolute tolerance of \(10^{-4}\). This is due to the larger contraction constants in the obstacle problem, which are very close to one for all d. The lower absolute tolerance of \(10^{-4}\) adds further iterations, but is not the main reason for the larger values of k in the obstacle problem. Nevertheless, ProxNets still outperform the reference solver in terms of computational time, with a relative error of at most 0.1% for large d.

Table 6 Relative errors and computational times of a ProxNet solver for a family of parametric obstacle problems

7 Conclusions

We proposed deep neural networks which realize approximate input-to-solution operators for unilateral inequality problems in separable Hilbert spaces. Their construction was based on emulating approximate solution algorithms in the continuous (infinite-dimensional) setting via proximal and contractive maps. As particular cases, several classes of finite-dimensional projected iterative methods (PSOR, PJOR) were shown to be representable by the proposed ProxNet DNN architecture. The general construction principle behind ProxNet introduced in the present paper can be employed to realize further DNN architectures, also in more general settings. We refer to [1] for multilevel and multigrid methods to solve (discretized) variational inequality problems. The algorithms in this reference may also be realized as concatenations of ProxNets, similarly to the PJOR-Net and PSOR-Net from Examples 5.3 and 5.4. The analysis and representation of multigrid methods as ProxNets will be considered in a forthcoming work.