Proofs of the Results from Sect. 3
Proof of Proposition 3.7
In this subsection, we will prove Proposition 3.7. As a preparation, we first prove the following lemma, which establishes a special instance in which the number of nonzero weights of a concatenation of two NNs \(\Phi ^1\) and \(\Phi ^2\) can be estimated by \(\max \left\{ M(\Phi ^1), M(\Phi ^2)\right\} \).
Lemma A.1
Let \(\Phi \) be a NN with m-dimensional output and d-dimensional input. If \({\mathbf {a}} \in {\mathbb {R}}^{1 \times m}\), then, for all \(\ell = 1, \dots , L(\Phi )\),
In particular, it holds that
Moreover, if \({\mathbf {D}} \in {\mathbb {R}}^{d \times n}\) such that, for every \(k \le d\) there is at most one \(l_k\le n\) such that \({\mathbf {D}}_{k,l_k} \ne 0\), then, for all \(\ell = 1, \dots , L(\Phi )\),
In particular, it holds that
.
Proof
Let \(\Phi = \big ( ({\mathbf {A}}_1,{\mathbf {b}}_1), \dots , ({\mathbf {A}}_{L},{\mathbf {b}}_{L}) \big )\), and \({\mathbf {a}}, {\mathbf {D}}\) as in the statement of the lemma. Then, the result follows if
$$\begin{aligned} \Vert {\mathbf {a}} {\mathbf {A}}_L\Vert _0 + \Vert {\mathbf {a}}{\mathbf {b}}_L\Vert _{0} \le \Vert {\mathbf {A}}_L\Vert _0 + \Vert {\mathbf {b}}_L\Vert _{0} \end{aligned}$$
(A.1)
and
$$\begin{aligned} \Vert {\mathbf {A}}_1 {\mathbf {D}}\Vert _0 \le \Vert {\mathbf {A}}_1\Vert _0. \end{aligned}$$
It is clear that \(\Vert {\mathbf {a}} {\mathbf {A}}_L\Vert _0\) is at most the number of nonzero columns of \({\mathbf {A}}_L\), which is certainly bounded by \(\Vert {\mathbf {A}}_L\Vert _0\). The same argument shows that \(\Vert {\mathbf {a}}{\mathbf {b}}_L\Vert _{0} \le \Vert {\mathbf {b}}_L\Vert _{0}\). This yields (A.1).
We have that for two vectors \({\mathbf {p}},{\mathbf {q}} \in {\mathbb {R}}^{k}\), \(k \in {\mathbb {N}}\) and for all \(\mu , \nu \in {\mathbb {R}}\)
$$\begin{aligned} \Vert \mu {\mathbf {p}} + \nu {\mathbf {q}} \Vert _{0} \le I(\mu ) \Vert {\mathbf {p}}\Vert _{0} + I(\nu )\Vert {\mathbf {q}} \Vert _{0}, \end{aligned}$$
where \(I(\gamma ) = 0\) if \(\gamma = 0\) and \(I(\gamma ) = 1\) otherwise. Also,
$$\begin{aligned} \Vert {\mathbf {A}}_1 {\mathbf {D}}\Vert _0 = \left\| {\mathbf {D}}^T {\mathbf {A}}_1^T \right\| _0 = \sum _{l = 1}^n \left\| \left( {\mathbf {D}}^T {\mathbf {A}}_1^T\right) _{l, -}\right\| _0, \end{aligned}$$
where, for a matrix \({\mathbf {G}}\), \({\mathbf {G}}_{l, -}\) denotes the l-th row of \({\mathbf {G}}\). Moreover, we have that for all \(l \le n\)
$$\begin{aligned} \left( {\mathbf {D}}^T {\mathbf {A}}_1^T\right) _{l, -} = \sum _{k = 1}^d \left( {\mathbf {D}}^T\right) _{l, k} \left( {\mathbf {A}}_1^T\right) _{k, -} = \sum _{k = 1}^d {\mathbf {D}}_{k, l} \left( {\mathbf {A}}_1^T\right) _{k, -}. \end{aligned}$$
As a consequence, we obtain
$$\begin{aligned} \Vert {\mathbf {A}}_1 {\mathbf {D}}\Vert _0&\le \sum _{l = 1}^n \left\| \sum _{k = 1}^d {\mathbf {D}}_{k, l} \left( {\mathbf {A}}_1^T\right) _{k, -}\right\| _0 \le \sum _{l = 1}^n \sum _{k = 1}^d I\left( {\mathbf {D}}_{k, l} \right) \left\| \left( {\mathbf {A}}_1^T\right) _{k, -}\right\| _0\\&= \sum _{k = 1}^d I\left( {\mathbf {D}}_{k, l_k} \right) \left\| \left( {\mathbf {A}}_1^T\right) _{k, -}\right\| _0 \le \Vert {\mathbf {A}}_1\Vert _0. \end{aligned}$$
where, for the last equality, we used that for every \(k\) there is at most one index \(l_k\) with \({\mathbf {D}}_{k,l_k}\ne 0\). \(\square \)
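The two sparsity estimates used above can also be checked empirically. The following short NumPy sketch (not part of the proof; the dimensions, sparsity levels, and random seed are arbitrary choices) verifies \(\Vert {\mathbf {a}} {\mathbf {A}}_L\Vert _0 + \Vert {\mathbf {a}}{\mathbf {b}}_L\Vert _{0} \le \Vert {\mathbf {A}}_L\Vert _0 + \Vert {\mathbf {b}}_L\Vert _{0}\) and \(\Vert {\mathbf {A}}_1 {\mathbf {D}}\Vert _0 \le \Vert {\mathbf {A}}_1\Vert _0\) on random instances in which \({\mathbf {D}}\) has at most one nonzero entry per row.

```python
import numpy as np

rng = np.random.default_rng(0)

def nnz(x):
    # number of nonzero entries, i.e. the "0-norm" used in the lemma
    return int(np.count_nonzero(x))

for _ in range(1000):
    m, q, d, n = rng.integers(1, 6, size=4)
    # sparse last-layer weights (A_L, b_L) and a row vector a
    A_L = rng.standard_normal((m, q)) * (rng.random((m, q)) < 0.4)
    b_L = rng.standard_normal(m) * (rng.random(m) < 0.4)
    a = rng.standard_normal((1, m)) * (rng.random((1, m)) < 0.6)
    assert nnz(a @ A_L) + nnz(a @ b_L) <= nnz(A_L) + nnz(b_L)

    # D with at most one nonzero entry per row k
    A_1 = rng.standard_normal((q, d)) * (rng.random((q, d)) < 0.4)
    D = np.zeros((d, n))
    for k in range(d):
        if rng.random() < 0.7:            # some rows stay entirely zero
            D[k, rng.integers(0, n)] = rng.standard_normal()
    assert nnz(A_1 @ D) <= nnz(A_1)

print("both sparsity inequalities held on all random instances")
```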
Now we are ready to prove Proposition 3.7.
Proof of Proposition 3.7
Without loss of generality, assume that \(Z\ge 1\). By [21, Lemma 6.2], there exists a NN \(\times ^Z_{\epsilon }\) with input dimension 2, output dimension 1 such that for \(\Phi _{\epsilon } :=\times ^Z_{\epsilon }\)
$$\begin{aligned} L\left( \Phi _{\epsilon }\right)&\le 0.5 \log _2\left( \frac{n\sqrt{dl}}{\epsilon }\right) +\log _2(Z)+6, \end{aligned}$$
(A.2)
$$\begin{aligned} M\left( \Phi _{\epsilon }\right)&\le 90 \cdot \left( \log _2\left( \frac{n\sqrt{dl}}{\epsilon }\right) +2\log _2(Z) +6\right) , \end{aligned}$$
(A.3)
$$\begin{aligned} M_1\left( \Phi _{\epsilon }\right)&\le 16, \text { as well as } M_{L \left( \Phi _{\epsilon }\right) }\left( \Phi _{\epsilon }\right) \le 3, \end{aligned}$$
(A.4)
$$\begin{aligned} \sup _{|a|,|b| \le Z}\left| ab - \mathrm {R}^{{\mathbb {R}}^2}_{\varrho }\left( \Phi _{\epsilon }\right) (a,b) \right|&\le \frac{\epsilon }{n\sqrt{dl}}. \end{aligned}$$
(A.5)
Since \(\Vert {\mathbf {A}}\Vert _2,\Vert {\mathbf {B}}\Vert _2\le Z\), we know that for every \(i=1,\ldots ,d,~ k=1,\ldots ,n,~j=1,\ldots ,l\) we have that \(|{\mathbf {A}}_{i,k}|,|{\mathbf {B}}_{k,j}|\le Z\). We define, for \(i\in \{1, \dots , d\}, k \in \{1, \dots , n\}, j \in \{1, \dots , l\}\), the matrix \({\mathbf {D}}_{i,k,j}\) such that, for all \({\mathbf {A}} \in {\mathbb {R}}^{d\times n}, {\mathbf {B}} \in {\mathbb {R}}^{n\times l}\)
$$\begin{aligned} {\mathbf {D}}_{i,k,j}(\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}})) = ({\mathbf {A}}_{i,k}, {\mathbf {B}}_{k,j}). \end{aligned}$$
Moreover, let
We have, for all \(i\in \{1, \dots , d\}, k \in \{1, \dots , n\}, j \in \{1, \dots , l\}\), that \(L\left( \Phi ^Z_{i,k,j;\epsilon }\right) = L\left( \times ^Z_{\epsilon }\right) \) and by Lemma A.1 that \(\Phi ^Z_{i,k,j;\epsilon }\) satisfies (A.2), (A.3), (A.4) with \(\Phi _{\epsilon } :=\Phi ^Z_{i,k,j;\epsilon }\). Moreover, we have by (A.5)
$$\begin{aligned} \sup _{(\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\in K^{Z}_{d,n,l}} \left| {\mathbf {A}}_{i,k}{\mathbf {B}}_{k,j}-\mathrm {R}^{K^Z_{d,n,l}}_\varrho \left( \Phi ^Z_{i,k,j;\epsilon } \right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}})) \right| \le \frac{\epsilon }{n\sqrt{dl}}. \end{aligned}$$
(A.6)
As a next step, with \({\mathbf {1}}_{{\mathbb {R}}^n} \in {\mathbb {R}}^n\) denoting the vector with all entries equal to 1, we set
which by Lemma 3.6 is a NN with \(n\cdot (d+l)\)-dimensional input and 1-dimensional output such that (A.2) holds with \(\Phi _{\epsilon } :=\Phi ^Z_{i,j;\epsilon }\). Moreover, by Lemmas A.1 and 3.6 and by (A.3) we have that
$$\begin{aligned} M\left( \Phi ^Z_{i,j;\epsilon }\right)&\le M\left( \mathrm {P}\left( \Phi ^Z_{i,1,j;\epsilon },\ldots ,\Phi ^Z_{i,n,j;\epsilon }\right) \right) \nonumber \\&\le 90 n \cdot \left( \log _2\left( \frac{n\sqrt{dl}}{\epsilon }\right) +2\log _2(Z)+6\right) . \end{aligned}$$
(A.7)
Additionally, by Lemmas 3.6 and A.1 and (A.4), we obtain
$$\begin{aligned} M_1\left( \Phi ^Z_{i,j;\epsilon }\right)&\le M_1\left( \mathrm {P}\left( \Phi ^Z_{i,1,j;\epsilon },\ldots ,\Phi ^Z_{i,n,j;\epsilon }\right) \right) \le 16 n. \end{aligned}$$
and
$$\begin{aligned} M_{L\left( \Phi ^Z_{i,j;\epsilon }\right) }\left( \Phi ^Z_{i,j;\epsilon }\right)&= M_{L\left( \Phi ^Z_{i,j;\epsilon }\right) }\left( \mathrm {P}\left( \Phi ^Z_{i,1,j;\epsilon },\ldots ,\Phi ^Z_{i,n,j;\epsilon }\right) \right) \le 2 n. \end{aligned}$$
(A.8)
By construction it follows that
$$\begin{aligned} \mathrm {R}^{K^Z_{d,n,l}}_{\varrho }\left( \Phi ^Z_{i,j;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}})) = \sum _{k=1}^n \mathrm {R}^{K^Z_{d,n,l}}_{\varrho }\left( \Phi ^Z_{i,k,j;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}})) \end{aligned}$$
and hence we have, by (A.6),
$$\begin{aligned}&\sup _{(\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\in K^Z_{d,n,l}} \left| \sum _{k=1}^n {\mathbf {A}}_{i,k} {\mathbf {B}}_{k,j}- \mathrm {R}^ {K^Z_{d,n,l}}_{\varrho }\left( \Phi ^Z_{i,j;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}})) \right| \le \frac{\epsilon }{\sqrt{dl}}. \end{aligned}$$
As a final step, we define
. Then, by Lemma 3.6, we have that (A.2) is satisfied for \(\Phi _{\epsilon } :=\Phi ^{Z,d,n,l}_{\mathrm {mult};{\tilde{\epsilon }}}\). This yields (i) of the asserted statement. Moreover, invoking Lemma 3.6, Lemma A.1 and (A.7) yields that
$$\begin{aligned} M\left( \Phi ^{Z,d,n,l}_{\mathrm {mult};{\tilde{\epsilon }}}\right)&\le 90 dln \cdot \left( \log _2\left( \frac{n\sqrt{dl}}{\epsilon }\right) +2\log _2(Z)+6\right) , \end{aligned}$$
which yields (ii) of the result. Moreover, by Lemma 3.6 and (A.8) it follows that
$$\begin{aligned} M_1\left( \Phi ^{Z,d,n,l}_{\mathrm {mult};{\tilde{\epsilon }}}\right) \le 16 d l n \text { and } M_{L\left( \Phi ^{Z,d,n,l}_{\mathrm {mult};{\tilde{\epsilon }}}\right) } \left( \Phi ^{Z,d,n,l}_{\mathrm {mult};{\tilde{\epsilon }}}\right) \le 2 d l n, \end{aligned}$$
completing the proof of (iii). By construction and using the fact that for any \({\mathbf {N}}\in {\mathbb {R}}^{d\times l}\) there holds
$$\begin{aligned} \Vert {\mathbf {N}}\Vert _2\le \sqrt{dl} \max _{i,j}|{\mathbf {N}}_{i,j}|, \end{aligned}$$
we obtain that
$$\begin{aligned}&\sup _{(\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\in K^Z_{d,n,l}} \left\| {\mathbf {A}} {\mathbf {B}}- \mathbf {matr} \left( \mathrm {R}^{K^Z_{d,n,l}}_{\varrho }\left( \Phi ^{Z,d,n,l}_{\mathrm {mult};{\tilde{\epsilon }}}\right) (\mathbf {vec}({\mathbf {A}}), \mathbf {vec}({\mathbf {B}}))\right) \right\| _2 \nonumber \\&\le \sqrt{dl} \sup _{(\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\in K^Z_{d,n,l}} \max _{i=1,\ldots ,d,~ j=1,\ldots ,l} \nonumber \\&\quad \left| \sum _{k=1}^n {\mathbf {A}}_{i,k} {\mathbf {B}}_{k,j} - \mathrm {R}^{K^Z_{d,n,l}}_{\varrho }\left( \Phi ^Z_{i,j;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}})) \right| \le \epsilon . \end{aligned}$$
(A.9)
Equation (A.9) establishes (iv) of the asserted result. Finally, we have for any \((\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\in K^Z_{d,n,l}\) that
$$\begin{aligned}&\left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^Z_{d,n,l}}\left( \Phi ^{Z,d,n,l}_{\mathrm {mult};{\tilde{\epsilon }}}\right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\right) \right\| _2 \\&\le \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^Z_{d,n,l}}\left( \Phi ^{Z,d,n,l}_{\mathrm {mult};{\tilde{\epsilon }}}\right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\right) -{\mathbf {A}} {\mathbf {B}}\right\| _2 +\Vert {\mathbf {A}} {\mathbf {B}}\Vert _2 \\&\le \epsilon +\Vert {\mathbf {A}}\Vert _2 \Vert {\mathbf {B}}\Vert _2\le \epsilon +Z^2\le 1+Z^2. \end{aligned}$$
This demonstrates that (v) holds and thereby finishes the proof. \(\square \)
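The accuracy bookkeeping in this proof is purely arithmetic: each scalar product \({\mathbf {A}}_{i,k}{\mathbf {B}}_{k,j}\) is reproduced up to \(\epsilon /(n\sqrt{dl})\), summing over \(k\) costs a factor \(n\), and the entrywise-to-spectral estimate \(\Vert {\mathbf {N}}\Vert _2\le \sqrt{dl}\max _{i,j}|{\mathbf {N}}_{i,j}|\) costs a further factor \(\sqrt{dl}\). The following NumPy sketch illustrates this chain of estimates; the NN is replaced by an artificial per-product perturbation of size \(\epsilon /(n\sqrt{dl})\), and the dimensions and the value of \(Z\) are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, l, Z, eps = 4, 5, 3, 2.0, 1e-3

A = rng.uniform(-1, 1, (d, n)); A *= Z / np.linalg.norm(A, 2)
B = rng.uniform(-1, 1, (n, l)); B *= Z / np.linalg.norm(B, 2)

# stand-in for the NN: every scalar product A[i,k]*B[k,j] is off by
# at most eps / (n * sqrt(d*l)), as in (A.6)
per_product_err = eps / (n * np.sqrt(d * l))
noise = rng.uniform(-per_product_err, per_product_err, (d, n, l))
approx_products = A[:, :, None] * B[None, :, :] + noise

# summing over k, cf. the estimate following (A.6): entrywise error <= eps/sqrt(d*l)
approx_AB = approx_products.sum(axis=1)
entry_err = np.max(np.abs(A @ B - approx_AB))
assert entry_err <= eps / np.sqrt(d * l) + 1e-12

# entrywise-to-spectral bound ||N||_2 <= sqrt(d*l) * max |N_ij|, cf. (A.9)
spec_err = np.linalg.norm(A @ B - approx_AB, 2)
assert spec_err <= np.sqrt(d * l) * entry_err + 1e-12
assert spec_err <= eps + 1e-12
print(f"entrywise error {entry_err:.2e}, spectral error {spec_err:.2e} <= eps = {eps:.0e}")
```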
Proof of Theorem 3.8
The objective of this subsection is to prove Theorem 3.8. Toward this goal, we construct NNs which emulate the map \({\mathbf {A}}\mapsto {\mathbf {A}}^k\) for \(k\in {\mathbb {N}}\) and square matrices \({\mathbf {A}}\). This construction relies heavily on Proposition 3.7. First of all, as a direct consequence of Proposition 3.7, we can estimate the sizes of the NNs emulating the multiplication of two square matrices. Indeed, there exists a universal constant \(C_1>0\) such that for all \(d \in {\mathbb {N}}\), \(Z>0\), \(\epsilon \in (0,1)\)
(i) \(L\left( \Phi _{\mathrm {mult}; \epsilon }^{Z, d, d, d}\right) \le C_1\cdot \left( \log _2\left( 1/\epsilon \right) +\log _2\left( d\right) +\log _2\left( \max \left\{ 1,Z\right\} \right) \right) \),
(ii) \(M\left( \Phi _{\mathrm {mult}; \epsilon }^{Z, d, d, d}\right) \le C_1\cdot \left( \log _2\left( 1/\epsilon \right) +\log _2\left( d\right) +\log _2\left( \max \left\{ 1,Z\right\} \right) \right) d^3\),
(iii) \(M_1\left( \Phi _{\mathrm {mult}; \epsilon }^{Z, d, d, d}\right) \le C_1 d^3, \qquad \text {as well as} \qquad M_{L\left( \Phi _{\mathrm {mult}; \epsilon }^{Z, d, d, d}\right) }\left( \Phi _{\mathrm {mult}; \epsilon }^{Z, d, d, d}\right) \le C_1 d^3\),
(iv) \(\sup _{(\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\in K^Z_{d,d,d}}\left\| {\mathbf {A}} {\mathbf {B}}- \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^Z_{d,d,d}}\left( \Phi _{\mathrm {mult}; \epsilon }^{Z, d, d, d}\right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\right) \right\| _2 \le \epsilon \),
(v) for every \((\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\in K^Z_{d,d,d}\) we have
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^Z_{d,d,d}}\left( \Phi _{\mathrm {mult}; \epsilon }^{Z, d, d, d}\right) (\mathbf {vec}({\mathbf {A}}),\mathbf {vec}({\mathbf {B}}))\right) \right\| _2\le & {} \epsilon +\Vert {\mathbf {A}}\Vert _2 \Vert {\mathbf {B}}\Vert _2\\\le & {} \epsilon +Z^2 \le 1+Z^2. \end{aligned}$$
One consequence of the ability to emulate the multiplication of matrices is that we can also emulate the squaring of matrices. We make this precise in the following definition.
Definition A.2
For \(d \in {\mathbb {N}}\), \(Z>0\), and \(\epsilon \in (0,1)\) we define the NN
which has \(d^2 \)-dimensional input and \(d^2 \)-dimensional output. By Lemma 3.6 we have that there exists a constant \(C_{\mathrm {sq}}>C_1\) such that for all \(d \in {\mathbb {N}}\), \(Z>0\), \(\epsilon \in (0,1)\)
(i) \(L\left( \Phi ^{Z, d}_{2;\epsilon }\right) \le C_{\mathrm {sq}}\cdot \left( \log _2(1/\epsilon )+\log _2(d)+\log _2\left( \max \left\{ 1,Z\right\} \right) \right) ,\)
(ii) \(M\left( \Phi ^{Z, d}_{2;\epsilon }\right) \le C_{\mathrm {sq}} d^3 \cdot \left( \log _2(1/\epsilon )+\log _2(d)+\log _2\left( \max \left\{ 1,Z\right\} \right) \right) , \)
(iii) \(M_1\left( \Phi ^{Z, d}_{2;\epsilon }\right) \le C_{\mathrm {sq}} d^3,\qquad \text {as well as} \qquad M_{L\left( \Phi ^{Z, d}_{2;\epsilon }\right) }\left( \Phi ^{Z, d}_{2;\epsilon }\right) \le C_{\mathrm {sq}} d^3, \)
(iv) \(\sup _{\mathbf {vec}({\mathbf {A}})\in K^Z_{d}}\left\| {\mathbf {A}}^2- \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^Z_{d}}\left( \Phi ^{Z, d}_{2;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon , \)
(v) for all \(\mathbf {vec}({\mathbf {A}})\in K^Z_{d}\) we have
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^Z_{d}}\left( \Phi ^{Z, d}_{2;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2\le \epsilon + \Vert {\mathbf {A}}\Vert ^2 \le \epsilon +Z^2\le 1+Z^2. \end{aligned}$$
Our next goal is to approximate the map \({\mathbf {A}}\mapsto {\mathbf {A}}^k\) for an arbitrary \(k\in {\mathbb {N}}_0.\) We start with the case that k is a power of 2 and for the moment we only consider the set of all matrices the norm of which is bounded by 1/2.
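The induction in the next proposition tracks how the error of an approximate squaring operation accumulates under composition. The following sketch mimics this numerically: the squaring NN \(\Phi ^{1,d}_{2;\epsilon /4}\) is replaced by exact squaring plus an artificial perturbation of spectral norm at most \(\epsilon /4\) (the accuracy used in the induction step), and the dimension, tolerance, and number of squarings are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
d, eps, J = 5, 0.2, 6                     # eps < 1/4, as required in Proposition A.3

A = rng.standard_normal((d, d))
A *= 0.5 / np.linalg.norm(A, 2)           # ||A||_2 <= 1/2

def approx_square(X):
    # stand-in for the squaring network Phi^{1,d}_{2; eps/4}:
    # exact square plus a perturbation of spectral norm <= eps/4
    E = rng.standard_normal((d, d))
    E *= (eps / 4) * rng.random() / np.linalg.norm(E, 2)
    return X @ X + E

X = approx_square(A)                      # approximates A^2
for j in range(1, J):
    X = approx_square(X)                  # approximates A^(2^(j+1))
    err = np.linalg.norm(X - np.linalg.matrix_power(A, 2 ** (j + 1)), 2)
    assert err <= eps, (j, err)           # the bound propagated by the induction
    print(f"j = {j + 1}: ||X - A^(2^{j + 1})||_2 = {err:.2e} <= eps = {eps}")
```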
Proposition A.3
Let \(d\in {\mathbb {N}},~j\in {\mathbb {N}},~\) as well as \(\epsilon \in \left( 0, 1/4\right) \). Then there exists a NN \(\Phi ^{1/2, d}_{2^j;\epsilon }\) with \(d^2\)-dimensional input and \(d^2 \)-dimensional output with the following properties:
(i) \(L\left( \Phi ^{1/2, d}_{2^j;\epsilon }\right) \le C_{\mathrm {sq}} j\cdot \left( \log _2(1/\epsilon )+\log _2(d) \right) +2C_{\mathrm {sq}}\cdot (j-1)\),
(ii) \(M\left( \Phi ^{1/2, d}_{2^j;\epsilon }\right) \le C_{\mathrm {sq}} j d^3\cdot \left( \log _2(1/\epsilon )+\log _2(d) \right) +4C_{\mathrm {sq}}\cdot (j-1)d^3\),
(iii) \(M_1\left( \Phi ^{1/2, d}_{2^j;\epsilon }\right) \le C_{\mathrm {sq}} d^3,\qquad \text {as well as} \qquad M_{L\left( \Phi ^{1/2, d}_{2^j;\epsilon }\right) }\left( \Phi ^{1/2, d}_{2^j;\epsilon }\right) \le C_{\mathrm {sq}} d^3\),
(iv) \(\sup _{\mathbf {vec}({\mathbf {A}})\in K^{1/2}_d}\left\| {\mathbf {A}}^{2^j}- \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^{1/2}_d}\left( \Phi ^{1/2, d}_{2^j;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon \),
(v) for every \(\mathbf {vec}({\mathbf {A}})\in K^{1/2}_d\) we have
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^{1/2}_d}\left( \Phi ^{1/2, d}_{2^j;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon +\left\| {\mathbf {A}}^{2^j}\right\| _2\le & {} \epsilon +\left\| {\mathbf {A}}\right\| ^{2^j}_2 \\\le & {} \frac{1}{4}+\left( \frac{1}{2}\right) ^{2^j}\le \frac{1}{2}. \end{aligned}$$
Proof
We show the statement by induction over \(j\in {\mathbb {N}}\). For \(j=1\), the statement follows by choosing \(\Phi ^{1/2, d}_{2;\epsilon }\) as in Definition A.2. Assume now, as induction hypothesis, that the claim holds for an arbitrary, but fixed \(j\in {\mathbb {N}}\), i.e., there exists a NN \(\Phi ^{1/2, d}_{2^j;\epsilon }\) such that
$$\begin{aligned}&\left\| \mathbf {matr}\!\left( \!\mathrm {R}_{\varrho }^{K_d^{1/2}}\!\left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) \! (\mathbf {vec}({\mathbf {A}}))\!\right) \! -{\mathbf {A}}^{2^{j}} \right\| _2 \le \epsilon , \ \ \left\| \mathbf {matr}\left( \!\mathrm {R}_{\varrho }^{K_d^{1/2}}\!\left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\!\right) \! \right\| _2\nonumber \\&\quad \le \epsilon + \left( \frac{1}{2}\right) ^{2^j} \end{aligned}$$
(A.10)
and \(\Phi ^{1/2, d}_{2^j;\epsilon }\) satisfies (i),(ii),(iii). Now we define
$$\begin{aligned} \Phi ^{1/2, d}_{2^{j+1};\epsilon } :=\Phi ^{1, d}_{2;\frac{\epsilon }{4}}\odot \Phi ^{1/2, d}_{2^{j};\epsilon }. \end{aligned}$$
By the triangle inequality, we obtain for any \(\mathbf {vec}({\mathbf {A}})\in K_d^{1/2}\)
$$\begin{aligned}&\left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}} \left( \Phi ^{1/2,d}_{2^{j+1};\epsilon } \right) (\mathbf {vec}({\mathbf {A}})) \right) -{\mathbf {A}}^{2^{j+1}}\right\| _2 \nonumber \\&\le \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^{j+1};\epsilon }\right) (\mathbf {vec}({\mathbf {A}})) \right) -{\mathbf {A}}^{2^{j}} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2\nonumber \\&\qquad + \left\| {\mathbf {A}}^{2^{j}} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) - \left( {\mathbf {A}}^{2^{j}}\right) ^2 \right\| _2. \end{aligned}$$
(A.11)
By construction of \(\Phi ^{1/2, d}_{2^{j+1};\epsilon }\), we know that
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}} \left( \Phi ^{1/2,d}_{2^{j+1};\epsilon } \right) (\mathbf {vec}({\mathbf {A}})) \right) - \left( \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2, d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right) ^2\right\| _2 \le \frac{\epsilon }{4}. \end{aligned}$$
Therefore, using the triangle inequality and the fact that \(\Vert \cdot \Vert _2\) is a submultiplicative operator norm, we derive that
$$\begin{aligned}&\left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^{j+1};\epsilon }\right) (\mathbf {vec}({\mathbf {A}})) \right) -{\mathbf {A}}^{2^{j}} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2\nonumber \\&\le \frac{\epsilon }{4} + \left\| \left( \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}} \left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right) ^2 - {\mathbf {A}}^{2^{j}} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}} \left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2\nonumber \\&\le \frac{\epsilon }{4} + \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}} \left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) - {\mathbf {A}}^{2^{j}}\right\| _2 \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^{j}; \epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2\nonumber \\&\le \frac{\epsilon }{4} + \epsilon \cdot \left( \epsilon + \left( \frac{1}{2}\right) ^{2^j}\right) \le \frac{3}{4} \epsilon , \end{aligned}$$
(A.12)
where the penultimate estimate follows by the induction hypothesis (A.10) and \(\epsilon <1/4\). Moreover, since \(\Vert \cdot \Vert _2\) is a submultiplicative operator norm, we obtain
$$\begin{aligned}&\left\| {\mathbf {A}}^{2^{j}} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}} \left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) - \left( {\mathbf {A}}^{2^{j}}\right) ^2 \right\| _2\nonumber \\&\quad \le \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}} \left( \Phi ^{1/2,d}_{2^{j};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) - {\mathbf {A}}^{2^{j}} \right\| _2 \left\| {\mathbf {A}}^{2^{j}}\right\| _2 \nonumber \\&\quad \le \frac{\epsilon }{4}, \end{aligned}$$
(A.13)
where we used \(\left\| {\mathbf {A}}^{2^{j}}\right\| _2\le 1/4\) and the induction hypothesis (A.10). Applying (A.13) and (A.12) to (A.11) yields
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^{j+1};\epsilon }\right) (\mathbf {vec}({\mathbf {A}})) \right) -{\mathbf {A}}^{2^{j+1}}\right\| _2 \le \epsilon . \end{aligned}$$
(A.14)
A direct consequence of (A.14) is that
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2, d}_{2^{j+1};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon + \left\| {\mathbf {A}}^{2^{j+1}}\right\| _2 \le \epsilon + \Vert {\mathbf {A}}\Vert ^{2^{j+1}}_2. \end{aligned}$$
(A.15)
The estimates (A.14) and (A.15) complete the proof of the assertions (iv) and (v) of the proposition statement. Now we estimate the size of \(\Phi ^{1/2, d}_{2^{j+1};\epsilon }\). By the induction hypothesis and Lemma 3.6(a)(i), we obtain
$$\begin{aligned} L\left( \Phi ^{1/2, d}_{2^{j+1};\epsilon }\right)&= L\left( \Phi ^{1, d}_{2;\frac{\epsilon }{4}}\right) +L\left( \Phi ^{1/2, d}_{2^{j};\epsilon }\right) \\&\le C_{\mathrm {sq}} \cdot \left( \log _2(1/\epsilon ) +\log _2(d) + \log _2(4) + j \log _2(1/\epsilon )\right. \\&\left. + 2\cdot (j-1) +j \log _2(d) \right) \\&= C_{\mathrm {sq}} \cdot \left( (j+1)\log _2(1/\epsilon ) +(j+1)\log _2(d)+2j\right) , \end{aligned}$$
which implies (i). Moreover, by the induction hypothesis and Lemma 3.6(a)(ii), we conclude that
$$\begin{aligned} M\left( \Phi ^{1/2,d}_{2^{j+1};\epsilon }\right)&\le M\left( \Phi ^{1,d}_{2;\frac{\epsilon }{4}}\right) + M\left( \Phi ^{1/2, d}_{2^{j};\epsilon }\right) + M_1\left( \Phi ^{1,d}_{2;\frac{\epsilon }{4}}\right) + M_{L\left( \Phi ^{1/2, d}_{2^{j};\epsilon }\right) }\left( \Phi ^{1/2,d}_{2^j;\epsilon }\right) \\&\le C_{\mathrm {sq}} d^3\cdot \left( \log _2(1/\epsilon ) +\log _2(d)+\log _2(4) + j \log _2(1/\epsilon ) +j \log _2(d) \right. \\&\quad \left. + 4\cdot (j-1) \right) + 2C_{\mathrm {sq}}d^3 \\&= C_{\mathrm {sq}} d^3\cdot \left( (j+1)\log _2(1/\epsilon ) +(j+1)\log _2(d)+ 4j \right) , \end{aligned}$$
implying (ii). Finally, it follows from Lemma 3.6(a)(iii) in combination with the induction hypothesis as well as Lemma 3.6(a)(iv) that
$$\begin{aligned} M_1\left( \Phi ^{1/2, d}_{2^{j+1};\epsilon }\right) = M_1\left( \Phi ^{1/2, d}_{2^{j};\epsilon }\right) \le C_{\mathrm {sq}} d^3, \end{aligned}$$
as well as
$$\begin{aligned} M_{L\left( \Phi ^{1/2, d}_{2^{j+1};\epsilon }\right) }\left( \Phi ^{1/2, d}_{2^{j+1};\epsilon }\right) = M_{L\left( \Phi ^{1,d}_{2;\frac{\epsilon }{4}}\right) }\left( \Phi ^{1,d}_{2;\frac{\epsilon }{4}}\right) \le C_{\mathrm {sq}} d^3, \end{aligned}$$
which finishes the proof. \(\square \)
We proceed by demonstrating how to build a NN that emulates the map \({\mathbf {A}}\mapsto {\mathbf {A}}^k\) for an arbitrary \(k \in {\mathbb {N}}_0\). Again, for the moment we only consider the set of all matrices the norms of which are bounded by 1/2. For the case of the set of all matrices the norms of which are bounded by an arbitrary \(Z>0\), we refer to Corollary A.5.
Proposition A.4
Let \(d\in {\mathbb {N}}\), \(k\in {\mathbb {N}}_0\), and \(\epsilon \in \left( 0,1/4\right) \). Then, there exists a NN \(\Phi ^{1/2, d}_{k;\epsilon }\) with \(d^2\)-dimensional input and \(d^2\)-dimensional output satisfying the following properties:
(i)
$$\begin{aligned} L\left( \Phi ^{1/2, d}_{k;\epsilon }\right)&\le \left\lfloor \log _2\left( \max \{k,2\}\right) \right\rfloor L\left( \Phi ^{1,d}_{\mathrm {mult};\frac{\epsilon }{4}} \right) + L\left( \Phi ^{1/2, d}_{2^{\left\lfloor \log _2(\max \{k,2\})\right\rfloor };\epsilon }\right) \\&\le 2C_{\mathrm {sq}} \left\lfloor \log _2\left( \max \{k,2\}\right) \right\rfloor \cdot \left( \log _2(1/\epsilon )+\log _2(d)+2 \right) , \end{aligned}$$
(ii) \(M\left( \Phi ^{1/2, d}_{k;\epsilon }\right) \le \frac{3}{2}C_{\mathrm {sq}} d^3\cdot \left\lfloor \log _2\left( \max \{k,2\}\right) \right\rfloor \cdot \left( \left\lfloor \log _2\left( \max \{k,2\}\right) \right\rfloor +1\right) \cdot \left( \log _2(1/\epsilon )+\log _2(d)+4 \right) \),
(iii) \(M_1\left( \Phi ^{1/2, d}_{k;\epsilon } \right) \le C_{\mathrm {sq}}\cdot \left( \left\lfloor \log _2\left( \max \{k,2\}\right) \right\rfloor +1\right) d^3, \ \text {as well as} \ M_{L\left( \Phi ^{1/2, d}_{k;\epsilon }\right) }\left( \Phi ^{1/2, d}_{k;\epsilon }\right) \le C_{\mathrm {sq}} d^3, \)
(iv) \(\sup _{\mathbf {vec}({\mathbf {A}})\in K^{1/2}_d}\left\| {\mathbf {A}}^k- \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^{1/2}_d}\left( \Phi ^{1/2, d}_{k;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon \),
(v) for any \(\mathbf {vec}({\mathbf {A}})\in K^{1/2}_d\) we have
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^{1/2}_d}\left( \Phi ^{1/2, d}_{k;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon +\Vert {\mathbf {A}}^k\Vert _2 \le \frac{1}{4}+\Vert {\mathbf {A}}\Vert _2^k \le \frac{1}{4}+\left( \frac{1}{2}\right) ^k. \end{aligned}$$
Proof
We prove the result by induction over \(k\in {\mathbb {N}}_{0}\). The cases \(k=0\) and \(k=1\) hold trivially by defining the NNs
$$\begin{aligned} \Phi ^{1/2, d}_{0;\epsilon }:=\left( \left( {\mathbf {0}}_{{\mathbb {R}}^{d^2}\times {\mathbb {R}}^{d^2}},\mathbf {vec}(\mathbf {Id}_{{\mathbb {R}}^{d}})\right) \right) ,\qquad \Phi ^{1/2, d}_{1;\epsilon }:=\left( \left( \mathbf {Id}_{{\mathbb {R}}^{d^2}},{\mathbf {0}}_{{\mathbb {R}}^{d^2}}\right) \right) . \end{aligned}$$
For the induction step, let \(k\in {\mathbb {N}}\) with \(k\ge 2\) and assume that the result holds for all \(k'\in {\mathbb {N}}_0\) with \(k'<k\). If \(k\) is a power of two, then the result holds by Proposition A.3; thus, we can assume without loss of generality that \(k\) is not a power of two. We define \(j :=\lfloor \log _2(k)\rfloor \) such that, for \(t :=k - 2^{j}\), we have that \(0< t < 2^{j}\). This implies that \({\mathbf {A}}^k= {\mathbf {A}}^{2^j} {\mathbf {A}}^{t}\). Hence, by Proposition A.3 and by the induction hypothesis, respectively, there exist a NN \(\Phi ^{1/2,d}_{2^j;\epsilon }\) satisfying (i)–(v) of Proposition A.3 and a NN \(\Phi ^{1/2,d}_{t;\epsilon }\) satisfying (i)–(v) of the statement of this proposition. We now define the NN
By construction and Lemma 3.6(a)(iv), we first observe that
$$\begin{aligned} M_{L\left( \Phi ^{1/2,d}_{k;\epsilon } \right) }\left( \Phi ^{1/2,d}_{k;\epsilon } \right) = M_{L\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}} \right) }\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}} \right) \le C_{\mathrm {sq}} d^3. \end{aligned}$$
Moreover, we obtain by the induction hypothesis as well as Lemma 3.6(a)(iii) in combination with Lemma 3.6(b)(iv) that
$$\begin{aligned} M_1\left( \Phi ^{1/2,d}_{k;\epsilon } \right)&= M_1\left( \mathrm {P}\left( \Phi ^{1/2,d}_{2^j;\epsilon },\Phi ^{1/2,d}_{t;\epsilon }\right) \right) = M_1\left( \Phi ^{1/2,d}_{2^j;\epsilon } \right) + M_{1}\left( \Phi ^{1/2,d}_{t;\epsilon } \right) \\&\le C_{\mathrm {sq}} d^3+ (j+1) C_{\mathrm {sq}} d^3 = (j+2) C_{\mathrm {sq}} d^3. \end{aligned}$$
This shows (iii). To show (iv), we perform an estimate similar to the one following (A.11). By the triangle inequality,
$$\begin{aligned}&\left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{k;\epsilon }\right) (\mathbf {vec}({\mathbf {A}})) \right) -{\mathbf {A}}^{k}\right\| _2 \nonumber \\&\le \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{k;\epsilon }\right) (\mathbf {vec}({\mathbf {A}})) \right) - {\mathbf {A}}^{2^j} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{t;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \nonumber \\&\qquad + \left\| {\mathbf {A}}^{2^j} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{t;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) - {\mathbf {A}}^{2^{j}} {\mathbf {A}}^{t} \right\| _2. \end{aligned}$$
(A.16)
By the construction of \(\Phi ^{1/2, d}_{k;\epsilon }\) and Proposition 3.7, we conclude that
$$\begin{aligned}&\bigg \Vert \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{k;\epsilon }\right) (\mathbf {vec}({\mathbf {A}})) \right) \\&\quad - \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^j;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{t;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \bigg \Vert _2 \\&\qquad \le \frac{\epsilon }{4}. \end{aligned}$$
Hence, using (A.16), we can estimate
$$\begin{aligned}&\left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{k;\epsilon } (\mathbf {vec}({\mathbf {A}}))\right) \right) -{\mathbf {A}}^{k}\right\| _2 \nonumber \\&\le \frac{\epsilon }{4} + \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^j;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{t;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right. \\&\qquad - \left. {\mathbf {A}}^{2^j} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{t;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2\\&\quad + \left\| {\mathbf {A}}^{2^{j}} \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{t;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) - {\mathbf {A}}^{k} \right\| _2\\&\le \frac{\epsilon }{4} + \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{t;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{2^{j};\epsilon } \right) (\mathbf {vec}({\mathbf {A}}))\right) - {\mathbf {A}}^{2^j} \right\| _2 \\&\quad + \left\| {\mathbf {A}}^{2^j}\right\| _2 \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2,d}_{t;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) -{\mathbf {A}}^{t}\right\| _2 =:\frac{\epsilon }{4} + \mathrm {I}+\mathrm {II}. \end{aligned}$$
We now consider two cases: If \(t=1,\) then we know by the construction of \(\Phi ^{1/2, d}_{1;\epsilon }\) that \(\mathrm {II}=0\). Thus,
$$\begin{aligned} \frac{\epsilon }{4}+\mathrm {I}+\mathrm {II} = \frac{\epsilon }{4} + \mathrm {I}\le \frac{\epsilon }{4}+\Vert {\mathbf {A}}\Vert _2 \epsilon \le \frac{3\epsilon }{4}\le \epsilon . \end{aligned}$$
If \(t\ge 2\), then
$$\begin{aligned} \frac{\epsilon }{4}+\mathrm {I}+\mathrm {II}&\le \frac{\epsilon }{4}+ \left( \epsilon + \Vert {\mathbf {A}}\Vert ^{t}+\Vert {\mathbf {A}}\Vert ^{2^j} \right) \epsilon \le \frac{\epsilon }{4} + \left( \frac{1}{4}+ \left( \frac{1}{2}\right) ^{t}+ \left( \frac{1}{2} \right) ^{2^j} \right) \epsilon \\&\le \frac{\epsilon }{4}+\frac{3\epsilon }{4} = \epsilon , \end{aligned}$$
where we have used that \(\left( \frac{1}{2}\right) ^{t}\le \frac{1}{4}\) for \(t \ge 2\). This shows (iv). In addition, by an application of the triangle inequality, we have that
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K_d^{1/2}}\left( \Phi ^{1/2, d}_{k;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon + \left\| {\mathbf {A}}^{k} \right\| _2 \le \epsilon + \Vert {\mathbf {A}}\Vert _2^{k} \le \frac{1}{4}+\left( \frac{1}{2}\right) ^k. \end{aligned}$$
This shows (v). Now we analyze the size of \(\Phi ^{1/2,d}_{k;\epsilon }\). We have by Lemma 3.6(a)(i) in combination with Lemma 3.6(b)(i) and by the induction hypothesis that
$$\begin{aligned} L\left( \Phi ^{1/2,d}_{k;\epsilon }\right)&\le L\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}} \right) + \max \left\{ L\left( \Phi ^{1/2,d}_{2^j;\epsilon }\right) , L\left( \Phi ^{1/2,d}_{t;\epsilon } \right) \right\} \\&\le L\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}} \right) + \max \left\{ L\left( \Phi ^{1/2,d}_{2^j;\epsilon }\right) , (j-1) L\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}}\right) +L\left( \Phi ^{1/2,d}_{2^{j-1};\epsilon } \right) \right\} \\&\le L\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}} \right) \\&\quad + \max \left\{ (j-1) L\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}}\right) +L\left( \Phi ^{1/2,d}_{2^j;\epsilon }\right) , (j-1) L\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}}\right) \right. \\&\qquad \left. +L\left( \Phi ^{1/2,d}_{2^{j-1};\epsilon } \right) \right\} \\&\le j L\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}} \right) + L\left( \Phi ^{1/2,d}_{2^{j};\epsilon } \right) \\&\le C_{\mathrm {sq}} j\cdot \left( \log _2(1/\epsilon )+\log _2(d)+ 2\right) +C_{\mathrm {sq}} j\cdot \left( \log _2(1/\epsilon )+\log _2(d) \right) \\&\quad +2C_{\mathrm {sq}}\cdot (j-1) \\&\le 2 C_{\mathrm {sq}} j\cdot \left( \log _2(1/\epsilon )+\log _2(d) +2 \right) , \end{aligned}$$
which implies (i). Finally, we address the number of nonzero weights of the resulting NN. We first observe that, by Lemma 3.6(a)(ii),
$$\begin{aligned} M\left( \Phi ^{1/2,d}_{k;\epsilon } \right)&\le \left( M\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}} \right) +M_1\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}}\right) \right) + M\left( \mathrm {P}\left( \Phi ^{1/2,d}_{2^j;\epsilon }, \Phi ^{1/2,d}_{t;\epsilon }\right) \right) \\&\qquad + M_{L\left( \mathrm {P}\left( \Phi ^{1/2,d}_{2^j;\epsilon }, \Phi ^{1/2,d}_{t;\epsilon }\right) \right) }\left( \mathrm {P}\left( \Phi ^{1/2,d}_{2^j;\epsilon }, \Phi ^{1/2,d}_{t;\epsilon }\right) \right) \\&=:\mathrm {I'}+\mathrm {II'}(a)+\mathrm {II'}(b). \end{aligned}$$
Then, by the properties of the NN \(\Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}}\), we obtain
$$\begin{aligned} \mathrm {I'}=M\left( \Phi ^{1, d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}}\right) + M_1\left( \Phi ^{1,d,d,d}_{\mathrm {mult};\frac{\epsilon }{4}}\right)&\le C_{\mathrm {sq}} d^3\cdot \left( \log _2(1/\epsilon )+\log _2(d)+2 \right) + C_{\mathrm {sq}} d^3 \\&= C_{\mathrm {sq}} d^3\cdot \left( \log _2(1/\epsilon )+\log _2(d)+3 \right) . \end{aligned}$$
Next, we estimate
$$\begin{aligned} \mathrm {II'}(a)+\mathrm {II'}(b)&=M\left( \mathrm {P}\left( \Phi ^{1/2, d}_{2^j;\epsilon }, \Phi ^{1/2, d}_{t;\epsilon }\right) \right) \\&+ M_{L\left( \mathrm {P}\left( \Phi ^{1/2, d}_{2^j;\epsilon }, \Phi ^{1/2, d}_{t;\epsilon }\right) \right) }\left( \mathrm {P}\left( \Phi ^{1/2, d}_{2^j;\epsilon }, \Phi ^{1/2, d}_{t;\epsilon }\right) \right) . \end{aligned}$$
Without loss of generality, we assume that \(L:=L\left( \Phi ^{1/2, d}_{t;\epsilon }\right) - L\left( \Phi ^{1/2, d}_{2^j;\epsilon }\right) > 0;\) the other cases follow similarly. We have that \(L\le 2C_{\mathrm {sq}} j\cdot \left( \log _2(1/\epsilon )+\log _2(d)+2\right) \) and, by the definition of the parallelization of two NNs with different numbers of layers, that
$$\begin{aligned} \mathrm {II'}(a)&= M\left( \mathrm {P}\left( \Phi ^{1/2,d}_{2^j;\epsilon }, \Phi ^{1/2,d}_{t;\epsilon } \right) \right) \\&= M\left( \mathrm {P}\left( \Phi ^{\mathbf {Id}}_{d^2,L} \odot \Phi ^{1/2,d}_{2^j;\epsilon }, \Phi ^{1/2,d}_{t;\epsilon } \right) \right) \\&=M\left( \Phi ^{\mathbf {Id}}_{d^2,L} \odot \Phi ^{1/2,d}_{2^j;\epsilon }\right) + M\left( \Phi ^{1/2,d}_{t;\epsilon } \right) \\&\le M\left( \Phi ^{\mathbf {Id}}_{d^2,L}\right) +M_1\left( \Phi ^{\mathbf {Id}}_{d^2,L}\right) + M_{L\left( \Phi ^{1/2,d}_{2^j;\epsilon } \right) }\left( \Phi ^{1/2,d}_{2^j;\epsilon } \right) \\&+ M\left( \Phi ^{1/2,d}_{2^j;\epsilon } \right) +M\left( \Phi ^{1/2,d}_{t;\epsilon } \right) \\&\le 2d^2(L+1) + C_{\mathrm {sq}} d^3 + M\left( \Phi ^{1/2,d}_{t;\epsilon }\right) +M\left( \Phi ^{1/2,d}_{2^j;\epsilon } \right) , \end{aligned}$$
where we have used the definition of the parallelization for the first two equalities, Lemma 3.6(b)(iii) for the third equality, Lemma 3.6(a)(ii) for the fourth inequality as well as the properties of \(\Phi ^{\mathbf {Id}}_{d^2,L}\) in combination with Proposition A.3(iii) for the last inequality. Moreover, by the definition of the parallelization of two NNs with different numbers of layers, we conclude that
Combining the estimates on \(\mathrm {I'}\), \(\mathrm {II'}(a)\), and \(\mathrm {II'}(b)\), we obtain by using the induction hypothesis that
$$\begin{aligned}&M\left( \Phi ^{1/2,d}_{k;\epsilon } \right) \le C_{\mathrm {sq}} d^3\cdot \left( \log _2(1/\epsilon ) + \log _2(d)+3 \right) + 2 d^2 \cdot (L+1)+d^2+C_{\mathrm {sq}} d^3 \\&\quad + M\left( \Phi ^{1/2,d}_{t;\epsilon }\right) + M\left( \Phi ^{1/2,d}_{2^j;\epsilon }\right) \\&\quad \le C_{\mathrm {sq}} d^3\cdot \left( \log _2(1/\epsilon )+\log _2(d)+4 \right) + 2 d^2 \cdot (L+2) + M\left( \Phi ^{1/2,d}_{t;\epsilon }\right) + M\left( \Phi ^{1/2,d}_{2^j;\epsilon }\right) \\&\quad \le C_{\mathrm {sq}} \cdot (j+1) d^3\cdot \left( \log _2(1/\epsilon )+\log _2(d)+4 \right) + 2 d^2 \cdot (L+2)+M\left( \Phi ^{1/2,d}_{t;\epsilon }\right) \\&\quad \le C_{\mathrm {sq}}\cdot (j+1) d^3\cdot \left( \log _2(1/\epsilon )+\log _2(d)+4 \right) +2C_{\mathrm {sq}} j d^2\cdot \left( \log _2(1/\epsilon )+\log _2(d)+2 \right) \\&\qquad +4d^2+M\left( \Phi ^{1/2,d}_{t;\epsilon }\right) \\&\quad \quad \le 3C_{\mathrm {sq}}\cdot (j+1) d^3\cdot \left( \log _2(1/\epsilon )+\log _2(d)+4\right) + M\left( \Phi ^{1/2,d}_{t;\epsilon }\right) \\&\le 3C_{\mathrm {sq}} d^3\cdot \left( j+1+ \frac{j\cdot (j+1)}{2}\right) \cdot \left( \log _2(1/\epsilon ) +\log _2(d)+4)\right) \\&\quad = \frac{3}{2} C_{\mathrm {sq}}\cdot (j+1)\cdot (j+2) d^3\cdot \left( \log _2(1/\epsilon )+\log _2(d)+4 \right) . \end{aligned}$$
\(\square \)
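The decomposition \(k = 2^{j} + t\) with \(j = \lfloor \log _2(k)\rfloor \) and \(0\le t< 2^{j}\), applied recursively to \(t\), is the classical square-and-multiply scheme. The following sketch shows this recursion with exact matrix arithmetic in place of the NN emulations and checks it against np.linalg.matrix_power; in the construction above, every exact product is replaced by a multiplication network of accuracy \(\epsilon /4\).

```python
import numpy as np

def power_by_squaring(A, k):
    # mirrors the recursion of Propositions A.3 and A.4:
    # A^k = A^(2^j) * A^t with j = floor(log2 k), t = k - 2^j
    d = A.shape[0]
    if k == 0:
        return np.eye(d)
    if k == 1:
        return A
    j = int(np.floor(np.log2(k)))
    P = A
    for _ in range(j):                    # repeated squaring: A^(2^j)
        P = P @ P
    t = k - 2 ** j
    return P @ power_by_squaring(A, t)

rng = np.random.default_rng(3)
d = 4
A = rng.standard_normal((d, d))
A *= 0.5 / np.linalg.norm(A, 2)           # ||A||_2 <= 1/2, as in Proposition A.4
for k in range(0, 20):
    assert np.allclose(power_by_squaring(A, k), np.linalg.matrix_power(A, k))
print("square-and-multiply agrees with np.linalg.matrix_power for k = 0,...,19")
```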
Proposition A.4 only provides a construction of a NN the ReLU-realization of which emulates a power of a matrix \({\mathbf {A}}\), under the assumption that \(\Vert {\mathbf {A}}\Vert _2\le 1/2\). We remove this restriction in the following corollary by presenting a construction of a NN \(\Phi ^{Z, d}_{k;\epsilon }\) the ReLU-realization of which approximates the map \({\mathbf {A}}\mapsto {\mathbf {A}}^k,\) on the set of all matrices \({\mathbf {A}}\) the norms of which are bounded by an arbitrary \(Z>0\).
Corollary A.5
There exists a universal constant \(C_{\mathrm {pow}}>C_{\mathrm {sq}}\) such that for all \(Z>0\), \(d\in {\mathbb {N}}\), \(k\in {\mathbb {N}}_0\), and \(\epsilon \in \left( 0,1/4\right) \), there exists a NN \(\Phi ^{Z,d}_{k;\epsilon }\) with the following properties:
(i) \(L\left( \Phi ^{Z, d}_{k;\epsilon }\right) \le C_{\mathrm {pow}} \log _2\left( \max \{k,2\}\right) \cdot \left( \log _2(1/\epsilon )+\log _2(d)+k \log _2\left( \max \left\{ 1,Z\right\} \right) \right) \),
(ii) \(M\left( \Phi ^{Z,d}_{k;\epsilon }\right) \le C_{\mathrm {pow}}\log _2^2\left( \max \{k,2\}\right) d^3 \cdot \left( \log _2(1/\epsilon )+\log _2(d)+k \log _2\left( \max \left\{ 1,Z\right\} \right) \right) \),
(iii) \(M_1\left( \Phi ^{Z,d}_{k;\epsilon } \right) \le C_{\mathrm {pow}} \log _2\left( \max \{k,2\}\right) d^3, \qquad \text {as well as} \qquad M_{L\left( \Phi ^{Z,d}_{k;\epsilon }\right) }\left( \Phi ^{Z,d}_{k;\epsilon }\right) \le C_{\mathrm {pow}} d^3\),
(iv) \(\sup _{\mathbf {vec}({\mathbf {A}})\in K^{Z}_d}\left\| {\mathbf {A}}^k- \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^{Z}_d}\left( \Phi ^{Z,d}_{k;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon \),
(v) for any \(\mathbf {vec}({\mathbf {A}})\in K^{Z}_d\) we have
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{K^{Z}_d}\left( \Phi ^{Z,d}_{k;\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \le \epsilon + \Vert {\mathbf {A}}^k\Vert _2 \le \epsilon +\Vert {\mathbf {A}}\Vert _2^k. \end{aligned}$$
Proof
Let \(\left( ({\mathbf {A}}_1,{\mathbf {b}}_1),\ldots ,({\mathbf {A}}_L,{\mathbf {b}}_L) \right) :=\Phi ^{1/2,d}_{k;\frac{\epsilon }{2\max \left\{ 1,Z^k\right\} }}\) according to Proposition A.4. Then, the NN
$$\begin{aligned} \Phi ^{Z,d}_{k;\epsilon } :=\left( \left( \frac{1}{2Z}{\mathbf {A}}_1 ,{\mathbf {b}}_1\right) , ({\mathbf {A}}_2,{\mathbf {b}}_2),\ldots ,({\mathbf {A}}_{L-1},{\mathbf {b}}_{L-1}), \left( 2Z^k {\mathbf {A}}_L,2Z^k {\mathbf {b}}_L\right) \right) \end{aligned}$$
fulfills all of the desired properties. \(\square \)
We have seen how to construct a NN that takes (the vectorization of) a matrix as input and approximately computes a power of this matrix. With this tool at hand, we are now ready to prove Theorem 3.8.
Proof of Theorem 3.8
By the properties of the partial sums of the Neumann series, for \(m\in {\mathbb {N}}\) and every \(\mathbf {vec}({\mathbf {A}})\in K_d^{1-\delta },\) we have that
$$\begin{aligned} \left\| \left( \mathbf {Id}_{{\mathbb {R}}^{d}}-{\mathbf {A}}\right) ^{-1}- \sum _{k=0}^{m}{\mathbf {A}}^k\right\| _2&=\left\| \left( \mathbf {Id}_{{\mathbb {R}}^{d}}-{\mathbf {A}}\right) ^{-1} {\mathbf {A}}^{m+1}\right\| _2 \le \left\| \left( \mathbf {Id}_{{\mathbb {R}}^{d}}-{\mathbf {A}}\right) ^{-1}\right\| _2 \Vert {\mathbf {A}}\Vert _2^{m+1} \\&\le \frac{1}{1-(1-\delta )}\cdot (1-\delta )^{m+1} = \frac{(1-\delta )^{m+1}}{\delta }. \end{aligned}$$
Hence, for
$$\begin{aligned} m(\epsilon ,\delta ) = \left\lceil \log _{1-\delta }(2) \log _2\left( \frac{\epsilon \delta }{2}\right) \right\rceil= & {} \left\lceil \frac{\log _2(\epsilon )+\log _2(\delta )-1}{\log _{2}(1-\delta )}\right\rceil \\&\ge \frac{\log _2(\epsilon )+\log _2(\delta )-1}{\log _{2}(1-\delta )} \end{aligned}$$
we obtain
$$\begin{aligned} \left\| \left( \mathbf {Id}_{{\mathbb {R}}^{d}}-{\mathbf {A}}\right) ^{-1}- \sum _{k=0}^{m(\epsilon ,\delta )}{\mathbf {A}}^k\right\| _2 \le \frac{\epsilon }{2}. \end{aligned}$$
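The truncation index \(m(\epsilon ,\delta )\) is simply the smallest integer pushing the Neumann remainder \((1-\delta )^{m+1}/\delta \) below \(\epsilon /2\). A short numerical sketch (with arbitrary values of \(d\), \(\delta \), and \(\epsilon \), and a random matrix rescaled so that \(\Vert {\mathbf {A}}\Vert _2 = 1-\delta \)) checks both the formula for \(m(\epsilon ,\delta )\) and the resulting truncation error.

```python
import numpy as np

rng = np.random.default_rng(4)
d, delta, eps = 6, 0.3, 1e-4

A = rng.standard_normal((d, d))
A *= (1 - delta) / np.linalg.norm(A, 2)   # ||A||_2 <= 1 - delta

# truncation index from the proof of Theorem 3.8
m = int(np.ceil(np.log2(eps * delta / 2) / np.log2(1 - delta)))

# remainder bound (1-delta)^(m+1)/delta <= eps/2 ...
assert (1 - delta) ** (m + 1) / delta <= eps / 2

# ... and the actual truncated Neumann series is at least as accurate
partial = sum(np.linalg.matrix_power(A, k) for k in range(m + 1))
err = np.linalg.norm(np.linalg.inv(np.eye(d) - A) - partial, 2)
print(f"m(eps, delta) = {m}, truncation error = {err:.2e} <= eps/2 = {eps / 2:.1e}")
assert err <= eps / 2
```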
Let now
where \(\left( \mathbf {Id}_{{\mathbb {R}}^{d^2}}|\cdots |\mathbf {Id}_{{\mathbb {R}}^{d^2}}\right) \in {\mathbb {R}}^{ d^2\times m(\epsilon ,\delta )\cdot d^2}\). Then we set
$$\begin{aligned} \Phi ^{1-\delta ,d}_{\mathrm {inv};\epsilon } :=\left( ({\mathbf {A}}_1,{\mathbf {b}}_1),\ldots ,\left( {\mathbf {A}}_L,{\mathbf {b}}_L+\mathbf {vec}\left( \mathbf {Id}_{{\mathbb {R}}^{d}}\right) \right) \right) . \end{aligned}$$
We have for any \(\mathbf {vec}({\mathbf {A}})\in K^{1-\delta }_{d}\)
$$\begin{aligned}&\left\| \left( \mathbf {Id}_{{\mathbb {R}}^{d}}-{\mathbf {A}} \right) ^{-1} - \mathbf {matr}\left( \mathrm {R}_\varrho ^{K^{1-\delta }_{d}}\left( \Phi ^{1-\delta ,d}_{\mathrm {inv};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \\&\le \left\| \left( \mathbf {Id}_{{\mathbb {R}}^{d}}-{\mathbf {A}}\right) ^{-1}- \sum _{k=0}^{m(\epsilon ,\delta )}{\mathbf {A}}^k\right\| _2 + \left\| \sum _{k=0}^{m(\epsilon ,\delta )}{\mathbf {A}}^k - \mathbf {matr}\left( \mathrm {R}_\varrho ^{K^{1-\delta }_{d}}\left( \Phi ^{1-\delta ,d}_{\mathrm {inv};\epsilon }\right) (\mathbf {vec}({\mathbf {A}}))\right) \right\| _2 \\&\le \frac{\epsilon }{2} + \sum _{k=2}^{m(\epsilon ,\delta )} \left\| {\mathbf {A}}^k- \mathbf {matr}\left( \mathrm {R}_\varrho ^{K^{1-\delta }_{d}}\left( \Phi ^{1,d}_{k;\frac{\epsilon }{2\left( m(\epsilon ,\delta )-1\right) }}\right) (\mathbf {vec}({\mathbf {A}})) \right) \right\| _2 \\&\le \frac{\epsilon }{2} + \left( m(\epsilon ,\delta )-1\right) \frac{\epsilon }{2(m(\epsilon ,\delta )-1)} = \epsilon , \end{aligned}$$
where we have used that
$$\begin{aligned} \left\| {\mathbf {A}}-\mathbf {matr}\left( \mathrm {R}_\varrho ^{K^{1-\delta }_{d}}\left( \Phi ^{1, d}_{1;\frac{\epsilon }{2\left( m(\epsilon ,\delta )-1\right) }}\right) (\mathbf {vec}({\mathbf {A}})) \right) \right\| _2 = 0. \end{aligned}$$
This completes the proof of (iii). Moreover, (iv) is a direct consequence of (iii). Now we analyze the size of the resulting NN. First of all, we have by Lemma 3.6(b)(i) and Corollary A.5 that
$$\begin{aligned} L\left( \Phi ^{1-\delta , d}_{\mathrm {inv};\epsilon } \right)&= \max _{k=1,\ldots ,m(\epsilon ,\delta )} L\left( \Phi ^{1, d}_{k;\frac{\epsilon }{2\left( m(\epsilon ,\delta )-1\right) }}\right) \\&\le C_{\mathrm {pow}} \log _2\left( m(\epsilon ,\delta )-1 \right) \cdot \left( \log _2\left( 1/\epsilon \right) +1+\log _2\left( m(\epsilon ,\delta )-1 \right) +\log _2(d)\right) \\&\le C_{\mathrm {pow}} \log _2\left( \frac{\log _2\left( 0.5 \epsilon \delta \right) }{\log _{2}(1-\delta )} \right) \cdot \left( \log _2\left( 1/ \epsilon \right) +1+\log _2\left( \frac{\log _2\left( 0.5 \epsilon \delta \right) }{\log _{2}(1-\delta )} \right) +\log _2(d)\right) , \end{aligned}$$
which implies (i). Moreover, by Lemma 3.6(b)(ii), Corollary A.5 and the monotonicity of the logarithm, we obtain
$$\begin{aligned}&M\left( \Phi ^{1-\delta , d}_{\mathrm {inv};\epsilon }\right) \le 3\cdot \left( \sum _{k=1}^{m(\epsilon ,\delta )} M\left( \Phi ^{1,d}_{k;\frac{\epsilon }{2\left( m(\epsilon ,\delta )-1\right) }} \right) \right) \\&\quad +4C_{\mathrm {pow}} m(\epsilon ,\delta ) d^2 \log _2\left( m(\epsilon ,\delta )\right) \cdot \left( \log _2\left( 1/ \epsilon \right) +1+\log _2\left( m(\epsilon ,\delta ) \right) +\log _2(d)\right) \\&\quad \le 3 C_{\mathrm {pow}}\cdot \left( \sum _{k=1}^{m(\epsilon ,\delta )}\log _2^2(\max \{k,2\})\right) d^3 \cdot \left( \log _2(1/\epsilon )+1+\log _2\left( m(\epsilon ,\delta )\right) + \log _2(d) \right) \\&\quad +5 m(\epsilon ,\delta ) d^2 C_{\mathrm {pow}} \log _2\left( m(\epsilon ,\delta )\right) \cdot \left( \log _2\left( 1/ \epsilon \right) +1+\log _2\left( m(\epsilon ,\delta ) \right) +\log _2(d)\right) =:\mathrm {I}. \end{aligned}$$
Since \(\sum _{k=1}^{m(\epsilon ,\delta )}\log _2^2(\max \{k,2\})\le m(\epsilon ,\delta )\log _2^2(m(\epsilon ,\delta )),\) we obtain for some constant \(C_{\mathrm {inv}}>C_{\mathrm {pow}}\) that
$$\begin{aligned} \mathrm {I} \le C_{\mathrm {inv}} m(\epsilon ,\delta ) \log _2^2(m(\epsilon ,\delta ))d^3\cdot \left( \log _2(1/\epsilon ) + \log _2\left( m(\epsilon ,\delta )\right) + \log _2(d) \right) . \end{aligned}$$
This completes the proof. \(\square \)
Proof of Theorem 4.3
We start by establishing a bound on \( \left\| \mathbf {Id}_{{\mathbb {R}}^{d({{\tilde{\epsilon }}})}}-\alpha {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right\| _2.\)
Proposition B.1
For any \(\alpha \in \left( 0, {1}/{C_{\mathrm {cont}}}\right) \) and \(\delta :=\alpha C_{\mathrm {coer}}\in (0, 1)\) there holds
$$\begin{aligned} \left\| \mathbf {Id}_{{\mathbb {R}}^{d({{\tilde{\epsilon }}})}}-\alpha {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right\| _2 \le 1-\delta <1, \quad \text { for all } y\in {\mathcal {Y}},~{{\tilde{\epsilon }}}>0. \end{aligned}$$
Proof
Since \( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\) is symmetric, there holds that
$$\begin{aligned} \left\| \mathbf {Id}_{{\mathbb {R}}^{d({{\tilde{\epsilon }}})}}-\alpha {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right\| _2&= \max _{\mu \in \sigma \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) } \left| 1-\alpha \mu \right| \le \max _{\mu \in [C_{\mathrm {coer}},C_{\mathrm {cont}}]} \left| 1-\alpha \mu \right| \\&=1-\alpha C_{\mathrm {coer}} = 1-\delta <1, \end{aligned}$$
for all \(y\in {\mathcal {Y}},~{{\tilde{\epsilon }}}>0.\) \(\square \)
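For a symmetric matrix \({\mathbf {B}}\) with spectrum contained in \([C_{\mathrm {coer}},C_{\mathrm {cont}}]\) and \(\alpha < 1/C_{\mathrm {cont}}\), the bound of Proposition B.1 is attained at the smallest eigenvalue. The following NumPy check illustrates this; the values of \(d\), \(C_{\mathrm {coer}}\), \(C_{\mathrm {cont}}\), and \(\alpha \), as well as the randomly drawn spectrum, are arbitrary stand-ins for the quantities appearing above.

```python
import numpy as np

rng = np.random.default_rng(5)
d, C_coer, C_cont = 8, 0.5, 4.0
alpha = 0.9 / C_cont                      # alpha in (0, 1/C_cont)
delta = alpha * C_coer

# symmetric matrix with spectrum inside [C_coer, C_cont]
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigs = rng.uniform(C_coer, C_cont, d)
B = Q @ np.diag(eigs) @ Q.T

lhs = np.linalg.norm(np.eye(d) - alpha * B, 2)
print(f"||Id - alpha*B||_2 = {lhs:.4f}, 1 - delta = {1 - delta:.4f}")
assert lhs <= 1 - delta + 1e-12
```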
Due to Assumption 4.1, we have at our disposal NN approximations of the parameter-dependent stiffness matrices with respect to a RB. Based on these, we can next state a construction of a NN the ReLU-realization of which approximates the map \(y\mapsto \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1}\). As a first step, we observe the following remark.
Remark B.2
It is not hard to see that if \( \left( ({\mathbf {A}}^1_{{{\tilde{\epsilon }}},\epsilon },{\mathbf {b}}^1_{{{\tilde{\epsilon }}},\epsilon }),\ldots ,({\mathbf {A}}^L_{{{\tilde{\epsilon }}},\epsilon },{\mathbf {b}}^L_{{{\tilde{\epsilon }}},\epsilon })\right) :=\Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon }\) is the NN of Assumption 4.1, then for
$$\begin{aligned} \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon }:=\left( ({\mathbf {A}}^1_{{{\tilde{\epsilon }}},\epsilon },{\mathbf {b}}^1_{{{\tilde{\epsilon }}},\epsilon }),\ldots ,\left( - {\mathbf {A}}^L_{{{\tilde{\epsilon }}},\epsilon },- {\mathbf {b}}^L_{{{\tilde{\epsilon }}},\epsilon }+\mathbf {vec}\left( \mathbf {Id}_{{\mathbb {R}}^{d({{\tilde{\epsilon }}})}}\right) \right) \right) \end{aligned}$$
we have that
$$\begin{aligned} \sup _{y\in {\mathcal {Y}}} \left\| \mathbf {Id}_{{\mathbb {R}}^{d({{\tilde{\epsilon }}})}} - \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}} -\mathbf {matr}\left( \mathrm {R}^{\mathcal {Y}}_{\varrho } \left( \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon } \right) (y) \right) \right\| _2 \le \epsilon , \end{aligned}$$
as well as \(M\left( \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon }\right) \le B_{M}({{\tilde{\epsilon }}}, \epsilon ) + d({{\tilde{\epsilon }}})^2\) and \(L\left( \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon }\right) = B_{L}\left( {{\tilde{\epsilon }}}, \epsilon \right) \).
Now we present the construction of the NN emulating \(y\mapsto \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1}\).
Proposition B.3
Let \({{\tilde{\epsilon }}}\ge {\hat{\epsilon }},\epsilon \in \left( 0, \alpha /4 \cdot \min \{1,C_{\mathrm {coer}}\} \right) \) and \(\epsilon ':=3/8 \cdot \epsilon \alpha C_{\mathrm {coer}}^2 <\epsilon \). Assume that Assumption 4.1 holds. We define
which has p-dimensional input and \(d({{\tilde{\epsilon }}})^2\)- dimensional output.
Then, there exists a constant \(C_B=C_B(C_{\mathrm {coer}},C_{\mathrm {cont}} ) > 0\) such that
(i) \( L\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon } \right) \le C_{B} \log _2(\log _2(1/\epsilon ))\big (\log _2(1/\epsilon ) + \log _2(\log _2(1/\epsilon ))+\log _2(d({{\tilde{\epsilon }}}))\big ) + B_L({{\tilde{\epsilon }}}, \epsilon '), \)
(ii) \( M\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon } \right) \le C_{B} \log _2(1/\epsilon ) \log _2^2(\log _2(1/\epsilon )) d({{\tilde{\epsilon }}})^3\cdot \big (\log _2(1/\epsilon )+\log _2(\log _2(1/\epsilon ))+\log _2(d({{\tilde{\epsilon }}})) \big ) + 2 B_M\left( {{\tilde{\epsilon }}},\epsilon '\right) , \)
(iii) \( \sup _{y\in {\mathcal {Y}}} \left\| \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} -\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv}; {{\tilde{\epsilon }}},\epsilon } \right) (y)\right) \right\| _2 \le \epsilon \),
(iv) \( \sup _{y\in {\mathcal {Y}}} \left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\cdot \left( \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} -\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv}; {{\tilde{\epsilon }}},\epsilon } \right) (y)\right) \right) \right\| _2 \le \epsilon \),
(v) \(\sup _{y\in {\mathcal {Y}}} \left\| \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon } \right) (y)\right) \right\| _2 \le \epsilon + \frac{1}{C_{\mathrm {coer}}},\)
(vi) \( \sup _{y\in {\mathcal {Y}}} \left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\mathbf {matr} \left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv}; {{\tilde{\epsilon }}},\epsilon } \right) (y)\right) \right\| _2 \le \epsilon + \frac{1}{C_{\mathrm {coer}}}. \)
Proof
First of all, for all \(y\in {\mathcal {Y}}\) the matrix \(\mathbf {matr}\left( \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \) is invertible. This can be deduced from the fact that
$$\begin{aligned} \left\| \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}}- \mathbf {matr}\left( \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y)\right) \right\| _2 \le \epsilon '<\epsilon \le \frac{\alpha \min \{1,C_{\mathrm {coer}}\}}{4} \le \frac{\alpha C_{\mathrm {coer}}}{4}. \end{aligned}$$
(B.1)
Indeed, we estimate
$$\begin{aligned}&\min _{{\mathbf {z}} \in {\mathbb {R}}^{d({{\tilde{\epsilon }}})}\setminus \{0\}} \frac{\left| \mathbf {matr}\left( \mathrm {R}^{{\mathcal {Y}}}_{\varrho } \left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y)\right) {\mathbf {z}} \right| }{|{\mathbf {z}}|} \\ \text {[Reverse triangle inequality]} \quad&\quad \ge \min _{{\mathbf {z}} \in {\mathbb {R}}^{d({{\tilde{\epsilon }}})}\setminus \{0\}} \frac{\left| \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}} {\mathbf {z}}\right| }{|{\mathbf {z}}|} \\&\quad - \max _{{\mathbf {z}} \in {\mathbb {R}}^{d({{\tilde{\epsilon }}})}\setminus \{0\}} \frac{\left| \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}} {\mathbf {z}} - \mathbf {matr}\left( \mathrm {R}^{{\mathcal {Y}}}_{\varrho } \left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y)\right) {\mathbf {z}}\right| }{|{\mathbf {z}}|}\\ \text {[Definition of }\Vert .\Vert _2] \quad&\quad \ge \left( \max _{{\mathbf {z}} \in {\mathbb {R}}^{d({{\tilde{\epsilon }}})}\setminus \{0\}} \frac{|{\mathbf {z}}|}{\left| \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}} {\mathbf {z}}\right| }\right) ^{-1} \\&\quad - \left\| \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}} - \mathbf {matr}\left( \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '} \right) (y)\right) \right\| _2\\ \text {[Set }\tilde{{\mathbf {z}}} :=(\alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}}){\mathbf {z}}] \quad&\quad \ge \left( \max _{\tilde{{\mathbf {z}}} \in {\mathbb {R}}^{d({{\tilde{\epsilon }}})} \setminus \{0\}} \frac{|(\alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}})^{-1} \tilde{{\mathbf {z}}}|}{\left| \tilde{{\mathbf {z}}}\right| }\right) ^{-1}\\&\quad - \left\| \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}} - \mathbf {matr} \left( \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}}, \epsilon '}\right) (y)\right) \right\| _2\\ \text {[Definition of }\Vert \cdot \Vert _2] \quad&\quad \ge \left\| \left( \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}}\right) ^{-1}\right\| _2^{-1} - \left\| \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}} - \mathbf {matr}\left( \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y)\right) \right\| _2\\ \text {[By Equations }(B.1) \hbox { and } (2.9)]\quad&\quad \ge \alpha C_{\mathrm {coer}} - \frac{\alpha C_{\mathrm {coer}}}{4} \ge \frac{3}{4} \alpha C_{\mathrm {coer}}. \end{aligned}$$
Thus, it follows that
$$\begin{aligned} \left\| \left( \mathbf {matr}\left( \mathrm {R}^{{\mathcal {Y}}}_{\varrho } \left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y)\right) \right) ^{-1} \right\| _2 \le \frac{4 }{3}\frac{1}{C_{\mathrm {coer}}\alpha }. \end{aligned}$$
(B.2)
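The estimate just derived is a standard perturbation bound: if the smallest singular value of \(\alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}}\) is at least \(\alpha C_{\mathrm {coer}}\) and the approximation error is at most \(\alpha C_{\mathrm {coer}}/4\), then the approximating matrix is invertible with inverse of spectral norm at most \(4/(3\alpha C_{\mathrm {coer}})\). The following sketch checks this numerically, with hypothetical values for \(\alpha \), \(C_{\mathrm {coer}}\), \(C_{\mathrm {cont}}\) and a random perturbation standing in for the NN approximation error in (B.1).

```python
import numpy as np

rng = np.random.default_rng(6)
d, C_coer, C_cont = 8, 0.5, 4.0
alpha = 0.9 / C_cont

# SPD matrix with spectrum in [C_coer, C_cont], as guaranteed for B^rb
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
B = Q @ np.diag(rng.uniform(C_coer, C_cont, d)) @ Q.T

# perturbation of spectral norm <= alpha*C_coer/4, standing in for the NN error (B.1)
E = rng.standard_normal((d, d))
E *= (alpha * C_coer / 4) * rng.random() / np.linalg.norm(E, 2)
M = alpha * B + E                         # plays the role of matr(R(Phi^B)(y))

inv_norm = np.linalg.norm(np.linalg.inv(M), 2)
print(f"||M^-1||_2 = {inv_norm:.4f}, bound 4/(3*alpha*C_coer) = {4 / (3 * alpha * C_coer):.4f}")
assert inv_norm <= 4 / (3 * alpha * C_coer) + 1e-10
```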
Then,
$$\begin{aligned}&\left\| \frac{1}{\alpha }\left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right) ^{-1} -\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{1-\delta /2,d({{\tilde{\epsilon }}})}_{\mathrm {inv};\frac{\epsilon }{2\alpha }} \odot \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right\| _2 \\&\quad \le \left\| \frac{1}{\alpha }\left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right) ^{-1} -\left( \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right) ^{-1}\right\| _2 \\&\quad \quad + \left\| \left( \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right) ^{-1} -\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{1-\delta /2,d({{\tilde{\epsilon }}})}_{\mathrm {inv};\frac{\epsilon }{2\alpha }} \odot \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right\| _2 \\&\qquad =:\mathrm {I}+\mathrm {II}. \end{aligned}$$
Due to the fact that for two invertible matrices \({\mathbf {M}},{\mathbf {N}},\)
$$\begin{aligned} \left\| {\mathbf {M}}^{-1}-{\mathbf {N}}^{-1} \right\| _2 = \left\| {\mathbf {M}}^{-1}({\mathbf {N}}-{\mathbf {M}}){\mathbf {N}}^{-1}\right\| _2 \le \Vert {\mathbf {M}}-{\mathbf {N}}\Vert _2 \Vert {\mathbf {M}}^{-1}\Vert _2\Vert {\mathbf {N}}^{-1}\Vert _2, \end{aligned}$$
we obtain
$$\begin{aligned} \mathrm {I}&\le \left\| \alpha {\mathbf {B}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}}- \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right\| _2 \left\| \left( \alpha {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} \right\| _2 \left\| \left( \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right) ^{-1}\right\| _2 \\&\le \frac{3 }{8} \epsilon \alpha C_{\mathrm {coer}}^2 \frac{1}{\alpha C_{\mathrm {coer}}} \frac{4}{3}\frac{1}{C_{\mathrm {coer}}\alpha } = \frac{\epsilon }{2\alpha }, \end{aligned}$$
where we have used Assumption 4.1, Eq. (2.9), and Eq. (B.2). Now we turn our attention to estimating II. First, observe that, for every \(y\in {\mathcal {Y}}\), by the triangle inequality and Remark B.2,
$$\begin{aligned}&\left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{{\mathcal {Y}}}\left( \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y)\right) \right\| _2 \le \left\| \mathbf {matr}\left( \mathrm {R}_{\varrho }^{{\mathcal {Y}}}\left( \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y)\right) -\left( \mathbf {Id}_{{\mathbb {R}}^{d({{\tilde{\epsilon }}})}}-\alpha {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) \right\| _2 \\&\quad + \left\| \mathbf {Id}_{{\mathbb {R}}^{d({{\tilde{\epsilon }}})}}-\alpha {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right\| _2 \\&\quad \le \epsilon '+1-\delta \le 1-\delta +\frac{\alpha C_{\mathrm {coer}}}{4} \le 1-\delta + \frac{\alpha C_{\mathrm {cont}}}{4} \le 1-\delta +\frac{\delta }{2}=1-\frac{\delta }{2}. \end{aligned}$$
Moreover, we have that \(\epsilon /(2\alpha ) \le \alpha /(8\alpha )< 1/4.\) Hence, by Theorem 3.8, we obtain that \( \mathrm {II} \le {\epsilon }/(2\alpha ).\) Putting everything together yields
$$\begin{aligned} \sup _{y\in {\mathcal {Y}}} \left\| \frac{1}{\alpha }\left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right) ^{-1} -\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{1-\delta /2,d({{\tilde{\epsilon }}})}_{\mathrm {inv};\frac{\epsilon }{2\alpha }} \odot \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right\| _2 \le \mathrm {I}+\mathrm {II} \le \frac{\epsilon }{\alpha }. \end{aligned}$$
Hence, by construction, we conclude that
$$\begin{aligned} \sup _{y\in {\mathcal {Y}}} \left\| \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right) ^{-1}- \mathbf {matr}\left( \mathrm {R}^{\mathcal {Y}}_{\varrho }\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon }\right) (y)\right) \right\| _2 \le \epsilon . \end{aligned}$$
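The last step is a sketch under the natural reading of the construction, namely that \(\Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon }\) is obtained from \(\Phi ^{1-\delta /2,d({{\tilde{\epsilon }}})}_{\mathrm {inv};\frac{\epsilon }{2\alpha }} \odot \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '}\) by rescaling the affine map of the output layer by the factor \(\alpha \). Its realization is then \(\alpha \) times the realization of the composed network, and homogeneity of the spectral norm rescales the error bound of the preceding display by exactly \(\alpha \):
$$\begin{aligned} \sup _{y\in {\mathcal {Y}}} \left\| \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right) ^{-1}- \alpha \, \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{1-\delta /2,d({{\tilde{\epsilon }}})}_{\mathrm {inv};\frac{\epsilon }{2\alpha }} \odot \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right\| _2 = \alpha \sup _{y\in {\mathcal {Y}}} \left\| \frac{1}{\alpha }\left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} \right) ^{-1}- \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{1-\delta /2,d({{\tilde{\epsilon }}})}_{\mathrm {inv};\frac{\epsilon }{2\alpha }} \odot \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '}\right) (y) \right) \right\| _2 \le \alpha \cdot \frac{\epsilon }{\alpha } = \epsilon . \end{aligned}$$
In particular, such a rescaling changes neither the depth nor the number of nonzero weights of the network, which is used in the size estimates below.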
This implies (iii) of the assertion. Now, by Equation (2.7) we obtain
$$\begin{aligned} \sup _{y\in {\mathcal {Y}}}\left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} -{\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon } \right) (y)\right) \right\| _2 \le \left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\right\| _2 \epsilon =\epsilon , \end{aligned}$$
completing the proof of (iv). Next, for all \(y\in {\mathcal {Y}}\), we estimate
$$\begin{aligned}&\left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon } \right) (y)\right) \right\| _2 \\&\le \left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\cdot \left( \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} -\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon } \right) (y)\right) \right) \right\| _2+ \left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} \right\| _2 \\&\le \epsilon + \frac{1}{C_{\mathrm {coer}}}. \end{aligned}$$
This yields (vi). A minor modification of the calculation above yields (v). Finally, we show (i) and (ii). First of all, it is clear that \(L\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon }\right) = L\left( \Phi ^{1-\delta /2,d({{\tilde{\epsilon }}})}_{\mathrm {inv};\frac{\epsilon }{2\alpha }} \odot \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '} \right) \) and \(M\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon }\right) = M\left( \Phi ^{1-\delta /2,d({{\tilde{\epsilon }}})}_{\mathrm {inv};\frac{\epsilon }{2\alpha }} \odot \Phi ^{{\mathbf {B}},\mathbf {Id}}_{{{\tilde{\epsilon }}},\epsilon '} \right) .\) Moreover, by Lemma 3.6(a)(i) in combination with Theorem 3.8(i), we have
$$\begin{aligned}&L\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon } \right) \le C_{\mathrm {inv}} \log _2\left( m\left( \epsilon /(2\alpha ),\delta /2\right) \right) \cdot \left( \log _2\left( 2\alpha /\epsilon \right) +\log _2\left( m\left( \epsilon /(2\alpha ),\delta /2\right) \right) \right. \\&\quad \left. +\log _2(d({{\tilde{\epsilon }}}))\right) + B_L({{\tilde{\epsilon }}},\epsilon ') \end{aligned}$$
and, by Lemma 3.6(a)(ii) in combination with Theorem 3.8(ii), we obtain
$$\begin{aligned}&M\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon } \right) \\&\quad \le 2C_{\mathrm {inv}} m(\epsilon /(2\alpha ),\delta /2)\log _2^2\left( m(\epsilon /(2\alpha ),\delta /2) \right) d({{\tilde{\epsilon }}})^3\cdot \left( \log _2\left( 2\alpha /\epsilon \right) \right. \\&\quad \left. + \log _2\left( m(\epsilon /(2\alpha ),\delta /2)\right) +\log _2(d({{\tilde{\epsilon }}})) \right) \\&\quad +2d({{\tilde{\epsilon }}})^2 + 2B_M({{\tilde{\epsilon }}}, \epsilon '). \end{aligned}$$
In addition, by the definition of \(m(\epsilon ,\delta )\) in the statement of Theorem 3.8, there exists a constant \({\tilde{C}}>0\) such that \(m\left( \epsilon /(2\alpha ),\delta /2\right) \le {\tilde{C}}\log _2(1/\epsilon ).\) Hence, the claim follows for a suitably chosen constant \(C_B = C_B(C_{\mathrm {coer}},C_{\mathrm {cont}}) > 0\). \(\square \)
Proof of Theorem 4.3
We start by proving (i), deducing the estimate for \(\Phi ^{{\mathbf {u}},\mathrm {h}}_{{{\tilde{\epsilon }}},\epsilon }\); the estimate for \(\Phi ^{{\mathbf {u}},\mathrm {rb}}_{{{\tilde{\epsilon }}},\epsilon }\) follows in a similar but simpler way. For \(y\in {\mathcal {Y}}\), we have that
$$\begin{aligned}&\left| \tilde{{\mathbf {u}}}^{\mathrm {h}}_{y,{{\tilde{\epsilon }}}} - \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {u}},\mathrm {h}}_{{{\tilde{\epsilon }}},\epsilon }\right) (y) \right| _{{\mathbf {G}}} \\&=\left| {\mathbf {G}}^{1/2}\cdot \left( {\mathbf {V}}_{{\tilde{\epsilon }}}\left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} {\mathbf {f}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} - \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {u}},\mathrm {h}}_{{{\tilde{\epsilon }}},\epsilon }\right) (y) \right) \right| \\&\le \left| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\cdot \left( \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} {\mathbf {f}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}} -\left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) (y)\right) \right| \\&\quad +\left| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\cdot \left( \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) (y) - \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '} \right) (y) \right) \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) (y)\right) \right| \\&\quad + \left| {\mathbf {G}}^{1/2}\cdot \left( {\mathbf {V}}_{{\tilde{\epsilon }}}\mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '} \right) (y) \right) \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) (y) - \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {u}},\mathrm {h}}_{{{\tilde{\epsilon }}},\epsilon }\right) (y) \right) \right| \\&\quad =:\mathrm {I} + \mathrm {II} + \mathrm {III}. \end{aligned}$$
We now estimate \(\mathrm {I},\mathrm {II},\mathrm {III}\) separately. By Equations (2.7) and (2.9), Assumption 4.2, and the definition of \(\epsilon ''\), there holds for \(y\in {\mathcal {Y}}\) that
$$\begin{aligned} \mathrm {I}&\le \left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\right\| _2 \left\| \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} \right\| _2 \left| {\mathbf {f}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}- \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) (y) \right| \le \frac{1}{C_{\mathrm {coer}}} \frac{\epsilon C_{\mathrm {coer}}}{3} = \frac{\epsilon }{3}. \end{aligned}$$
We proceed with estimating II. It is not hard to see from Assumption 4.2 that
$$\begin{aligned} \sup _{y\in {\mathcal {Y}}} \left| \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon }\right) (y)\right| \le \epsilon +C_{\mathrm {rhs}}. \end{aligned}$$
(B.3)
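For completeness, we sketch the argument behind (B.3), under the reading (which we assume here) that Assumption 4.2 provides both the uniform bound \(\sup _{y\in {\mathcal {Y}}} \big |{\mathbf {f}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}}\big | \le C_{\mathrm {rhs}}\) and the uniform approximation \(\sup _{y\in {\mathcal {Y}}} \big |{\mathbf {f}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}} - \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\big ( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon }\big ) (y)\big | \le \epsilon \). Then, for every \(y \in {\mathcal {Y}}\), the triangle inequality yields
$$\begin{aligned} \left| \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon }\right) (y)\right| \le \left| \mathrm {R}^{{\mathcal {Y}}}_{\varrho }\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon }\right) (y) - {\mathbf {f}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}}\right| + \left| {\mathbf {f}}^{\mathrm {rb}}_{y,{{\tilde{\epsilon }}}}\right| \le \epsilon + C_{\mathrm {rhs}}. \end{aligned}$$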
By definition, \(\epsilon ' = \epsilon /\max \{6,C_{\mathrm {rhs}}\} \le \epsilon .\) Hence, by Assumption 4.1 and (B.3) in combination with Proposition B.3(i), we obtain
$$\begin{aligned} \mathrm {II}&\le \left\| {\mathbf {G}}^{1/2}{\mathbf {V}}_{{\tilde{\epsilon }}}\cdot \left( \left( {\mathbf {B}}_{y,{{\tilde{\epsilon }}}}^{\mathrm {rb}}\right) ^{-1} - \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '} \right) (y) \right) \right) \right\| _2 \left| \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) (y) \right| \\&\le \epsilon '\cdot \left( C_{\mathrm {rhs}}+\frac{\epsilon \cdot C_{\mathrm {coer}}}{3} \right) \\&\le \frac{\epsilon }{\max \{6,C_{\mathrm {rhs}}\}} C_{\mathrm {rhs}} + \frac{\epsilon C_{\mathrm {coer}}}{\max \{6,C_{\mathrm {rhs}}\}} \frac{\epsilon }{3} \le \frac{2\epsilon }{6}=\frac{\epsilon }{3}, \end{aligned}$$
where we have used that \(C_{\mathrm {coer}} \epsilon<C_{\mathrm {coer}} {\alpha }/{4}<1.\) Finally, we estimate III. By construction, we have that
$$\begin{aligned} \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {u}},\mathrm {h}}_{{{\tilde{\epsilon }}},\epsilon }\right) (y) = {\mathbf {V}}_{{\tilde{\epsilon }}}\mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{\kappa ,d({{\tilde{\epsilon }}}),d({{\tilde{\epsilon }}}),1}_{\mathrm {mult};\frac{\epsilon }{3}} \odot \mathrm {P}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '},\Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) \right) (y,y). \end{aligned}$$
Moreover, we have by Proposition B.3(v) that
$$\begin{aligned} \left\| \mathbf {matr}\left( \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '} \right) (y) \right) \right\| _2 \le \epsilon + \frac{1}{C_{\mathrm {coer}}} \le 1+\frac{1}{C_{\mathrm {coer}}}\le \kappa \end{aligned}$$
and by (B.3) that
$$\begin{aligned} \left| \mathrm {R}_\varrho ^{\mathcal {Y}}\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) (y) \right| \le \epsilon '' + C_{\mathrm {rhs}} \le \epsilon C_{\mathrm {coer}}+ C_{\mathrm {rhs}} \le 1+ C_{\mathrm {rhs}} \le \kappa . \end{aligned}$$
Hence, by the choice of \(\kappa \) and Proposition 3.7, we conclude that \(\mathrm {III} \le {\epsilon }/{3}\). Combining the estimates on \(\mathrm {I}, \mathrm {II}\), and \(\mathrm {III}\) yields (i), and, using (i), we obtain (v). Now we estimate the size of the NNs. We start by proving (ii). First of all, by the definition of \(\Phi ^{{\mathbf {u}}, \mathrm {rb}}_{{{\tilde{\epsilon }}},\epsilon }\) and \(\Phi ^{{\mathbf {u}}, \mathrm {h}}_{{{\tilde{\epsilon }}},\epsilon }\), as well as Lemma 3.6(a)(i) in combination with Proposition 3.7, we have that
$$\begin{aligned}&L\left( \Phi ^{{\mathbf {u}}, \mathrm {rb}}_{{{\tilde{\epsilon }}},\epsilon }\right) < L\left( \Phi ^{{\mathbf {u}},\mathrm {h}}_{{{\tilde{\epsilon }}},\epsilon }\right) \le 1 + L\left( \Phi ^{\kappa ,d({{\tilde{\epsilon }}}),d({{\tilde{\epsilon }}}),1}_{\mathrm {mult};\frac{\epsilon }{3}}\right) + L\left( \mathrm {P}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '},\Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) \right) \nonumber \\&\quad \le 1 +C_{\mathrm {mult}}\cdot \left( \log _2(3/\epsilon )+3/2 \log _2(d({{\tilde{\epsilon }}}))+\log _2(\kappa )\right) \nonumber \\&\qquad +\max \left\{ L\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '} \right) , F_L\left( {{\tilde{\epsilon }}},\epsilon ''\right) \right\} \nonumber \\&\quad \le C^{{\mathbf {u}}}_L \max \left\{ \log _2(\log _2(1/\epsilon ))\left( \log _2(1/\epsilon )+ \log _2(\log _2(1/\epsilon ))+\log _2(d({{\tilde{\epsilon }}}))\right) \right. \nonumber \\&\quad \left. + B_L({{\tilde{\epsilon }}}, \epsilon '''), F_L\left( {{\tilde{\epsilon }}},\epsilon ''\right) \right\} \end{aligned}$$
(B.4)
where we applied Proposition B.3(i) and chose a suitable constant
$$\begin{aligned} C^{{\mathbf {u}}}_L = C^{{\mathbf {u}}}_L(\kappa , \epsilon ', C_B) = C^{{\mathbf {u}}}_L(C_{\mathrm {rhs}}, C_{\mathrm {coer}},C_{\mathrm {cont}}) >0. \end{aligned}$$
We now note that if we establish (iii), then (iv) follows immediately by Lemma 3.6(a)(ii). Thus, we proceed with proving (iii). First of all, by Lemma 3.6(a)(ii) in combination with Proposition 3.7 we have
$$\begin{aligned} M\left( \Phi ^{{\mathbf {u}}, \mathrm {rb}}_{{{\tilde{\epsilon }}},\epsilon }\right)&\le 2M\left( \Phi ^{\kappa ,d({{\tilde{\epsilon }}}),d({{\tilde{\epsilon }}}),1}_{\mathrm {mult};\frac{\epsilon }{3}}\right) + 2M\left( \mathrm {P}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '},\Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) \right) \\&\le 2C_{\mathrm {mult}}d({{\tilde{\epsilon }}})^2 \cdot \left( \log _2(3/\epsilon )+3/2 \log _2(d({{\tilde{\epsilon }}}))+\log _2(\kappa )\right) \\&\quad + 2M\left( \mathrm {P}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '},\Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) \right) . \end{aligned}$$
Next, by Lemma 3.6(b)(ii) in combination with Proposition B.3, as well as Assumptions 4.1 and 4.2, we have that
$$\begin{aligned}&M\left( \mathrm {P}\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '},\Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) \right) \nonumber \\&\le M\left( \Phi ^{{\mathbf {B}}}_{\mathrm {inv};{{\tilde{\epsilon }}},\epsilon '}\right) + M\left( \Phi ^{{\mathbf {f}}}_{{{\tilde{\epsilon }}},\epsilon ''} \right) \nonumber \\&\quad + 8d({{\tilde{\epsilon }}})^2 \max \left\{ C^{{\mathbf {u}}}_L \log _2(\log _2(1/\epsilon '))\left( \log _2(1/\epsilon ') + \log _2(\log _2(1/\epsilon '))+\log _2(d({{\tilde{\epsilon }}}))\right) \right. \nonumber \\&\quad \left. + B_L({{\tilde{\epsilon }}}, \epsilon '''), F_L\left( {{\tilde{\epsilon }}},\epsilon ''\right) \right\} \nonumber \\&\le C_{B} \log _2(1/\epsilon ') \log _2^2(\log _2(1/\epsilon ')) d({{\tilde{\epsilon }}})^3\cdot \left( \log _2(1/\epsilon ')+\log _2(\log _2(1/\epsilon '))+\log _2(d({{\tilde{\epsilon }}})) \right) \nonumber \\&\quad + 8d({{\tilde{\epsilon }}})^2 \max \left\{ C^{{\mathbf {u}}}_L \log _2(\log _2(1/\epsilon '))\left( \log _2(1/\epsilon ') + \log _2(\log _2(1/\epsilon '))+\log _2(d({{\tilde{\epsilon }}}))\right) \right. \nonumber \\&\quad \left. + B_L({{\tilde{\epsilon }}}, \epsilon '''), F_L\left( {{\tilde{\epsilon }}},\epsilon ''\right) \right\} \nonumber \\&\quad + 2B_M\left( {{\tilde{\epsilon }}},\epsilon '''\right) + F_M\left( {{\tilde{\epsilon }}},\epsilon '' \right) \nonumber \\&\le C^{{\mathbf {u}}}_M d({{\tilde{\epsilon }}})^2 \cdot \bigg (d({{\tilde{\epsilon }}})\log _2(1/\epsilon ) \log _2^2(\log _2(1/\epsilon ))\big (\log _2(1/\epsilon )+\log _2(\log _2(1/\epsilon )) + \log _2(d({{\tilde{\epsilon }}}))\big ) \nonumber \\&\quad + B_L({{\tilde{\epsilon }}},\epsilon ''') + F_L\left( {{\tilde{\epsilon }}},\epsilon ''\right) \bigg ) + 2B_M({{\tilde{\epsilon }}},\epsilon ''') + F_M\left( {{\tilde{\epsilon }}},\epsilon ''\right) , \end{aligned}$$
(B.5)
for a suitably chosen constant \(C^{{\mathbf {u}}}_M = C^{{\mathbf {u}}}_M(\epsilon ', C_B, C^{{\mathbf {u}}}_L) = C^{{\mathbf {u}}}_M(C_{\mathrm {rhs}}, C_{\mathrm {coer}},C_{\mathrm {cont}})> 0\). This shows the claim.