1 Introduction

Many problems in science and engineering, including optimal control problems governed by partial differential equations (PDEs), are subject to uncertainty. If not taken into account, the inherent uncertainty of such problems has the potential to render worthless any solutions obtained using state-of-the-art methods for deterministic problems. The careful analysis of the uncertainty in PDE-constrained optimization is hence indispensable and a growing field of research (see, e.g., [5,6,7, 18, 29, 30, 36, 37, 45, 47, 48]).

In this paper we consider the heat equation with an uncertain thermal diffusion coefficient, modelled by a series in which a countably infinite number of independent random variables enter affinely. By controlling the source term of the heat equation, we aim to steer its solution towards a desired target state. To study the effect of this randomness on the objective function, we consider two risk measures: the expected value and the entropic risk measure, both of which involve integrals with respect to the countably many random variables. These integrals are approximated by first truncating the series representing the input random field to finitely many terms, which reduces them to integrals over finitely many random variables, and then applying quasi-Monte Carlo (QMC) methods.

QMC approximations are particularly well suited for optimization since their nonnegative (equal) cubature weights preserve convexity. Moreover, for sufficiently smooth integrands it is possible to construct QMC rules with error bounds independent of the number of stochastic variables that attain faster convergence rates than Monte Carlo methods. For these reasons QMC methods have been very successful in applications to PDEs with random coefficients (see, e.g., [2, 11, 16, 19, 20, 24,25,26, 33,34,35, 39, 42, 43]) and especially in PDE-constrained optimization under uncertainty, see [22, 23]. In [32] the authors derive regularity results for the saddle point operator, which fall within the same framework as the QMC approximation of affine parametric operator equations considered in [43].
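To illustrate the equal-weight structure that makes QMC rules convexity-preserving, the following Python sketch approximates a smooth product integrand on the unit cube by a randomly shifted rank-1 lattice rule. The generating vector here is an arbitrary illustrative choice, not one obtained from the component-by-component constructions referenced above.

```python
import numpy as np

rng = np.random.default_rng(0)

def lattice_points(n, z, shift):
    """Randomly shifted rank-1 lattice points {frac(k*z/n + shift)} in [0,1]^s."""
    k = np.arange(n)[:, None]
    return np.mod(k * z[None, :] / n + shift[None, :], 1.0)

def qmc_mean(f, n, z, n_shifts=8):
    """Average f over several shifted lattice rules; all cubature weights equal 1/n."""
    s = len(z)
    ests = [f(lattice_points(n, z, rng.random(s))).mean() for _ in range(n_shifts)]
    return np.mean(ests)

# Smooth test integrand on [0,1]^4 that factorizes, so its exact integral is 1.
f = lambda x: np.prod(1.0 + (x - 0.5) / (1 + np.arange(x.shape[1]))**2, axis=1)

z = np.array([1, 433, 177, 311])   # illustrative generating vector (not optimized)
for n in [97, 389, 1543]:
    print(n, abs(qmc_mean(f, n, z) - 1.0))
```

Because every cubature weight is the nonnegative constant \(1/n\), replacing an expectation by such a rule yields a convex combination of convex integrand evaluations, which is the convexity-preservation property used later for the optimization problem.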

This paper builds upon our previous work [22]. The novelty lies in the use and analysis of parabolic PDE constraints in conjunction with the nonlinear entropic risk measure, which motivated the development of an error analysis that is applicable in separable Banach spaces and thus discretization invariant. Since it is based solely on regularity assumptions, our novel error analysis covers a very general class of problems. Specifically, we extend QMC error bounds in the literature (see, e.g., [12, 33]) to separable Banach spaces. A crucial part of our new work is the regularity analysis of the entropic risk measure, which is used to prove our main theoretical results about error estimates and convergence rates for the dimension truncation and QMC errors. We then apply these new bounds to assess the total error in the optimal control problem under uncertainty.

The structure of this paper is as follows. The parametric weak formulation of the PDE problem is given in Sect. 2. The corresponding optimization problem is discussed in Sect. 3, with linear risk measures considered in Sect. 3.1, the entropic risk measures in Sect. 3.2, and optimality conditions in Sect. 3.3. While the regularity of the adjoint PDE problem is the topic of Sect. 4, the regularity analysis for the entropic risk measure is addressed in Sect. 5. Section 6 contains the main error analysis of this paper. Section 6.1 covers the truncation error and Sect. 6.2 analyzes the QMC integration error. Our approach differs from most studies of QMC in the literature insofar as we develop the QMC and dimension truncation error theory for the full PDE solutions (with respect to an appropriately chosen function space norm) instead of considering the composition of the PDE solution with a linear functional. In Sect. 7 we confirm our theoretical findings with supporting numerical experiments. Section 8 is a concluding summary of this paper.

2 Problem formulation

Let \(D\subset {\mathbb {R}}^d\), \(d\in \{1,2,3\}\), denote a bounded physical domain with Lipschitz boundary, let \(I:= [0,T]\) denote the time interval with finite time horizon \(0<T<\infty \), and let \(U:=[-\frac{1}{2},\frac{1}{2}]^{\mathbb {N}}\) denote a space of parameters. The components of the sequence \({\varvec{y}}\in U\) are realizations of independent and identically distributed uniform random variables in \([-\tfrac{1}{2},\tfrac{1}{2}]\), and the corresponding probability measure is

$$\begin{aligned} \mu (\textrm{d}{\varvec{y}})=\bigotimes _{j\ge 1}\textrm{d}y_j=\textrm{d}{\varvec{y}}. \end{aligned}$$

Let

$$\begin{aligned} a^{{\varvec{y}}}({\varvec{x}},t) := a_0({\varvec{x}},t) + \sum _{j\ge 1} y_j\,\psi _j({\varvec{x}},t), \qquad {\varvec{x}}\in D,\quad {\varvec{y}}\in U,\quad t\in I, \end{aligned}$$
(2.1)

be an uncertain (thermal) diffusion coefficient, where we assume (i) for a.e. \(t\in I\) we have \(a_0(\cdot ,t) \in L^\infty (D)\), \(\psi _j(\cdot ,t)\in L^\infty (D)\) for all \(j\ge 1\), and that we have \((\sup _{t\in I}\Vert \psi _j(\cdot ,t)\Vert _{L^\infty (D)})_{j\ge 1}\in \ell ^1\); (ii) \(t \mapsto a^{\varvec{y}}({\varvec{x}},t)\) is measurable on I; (iii) uniform ellipticity: there exist positive constants \(a_{\min }\) and \(a_{\max }\) such that \(0<a_{\min }\le a^{\varvec{y}}({\varvec{x}},t)\le a_{\max }<\infty \) for all \({\varvec{x}}\in D\), \({\varvec{y}}\in U\) and a.e. \(t\in I\). Time-varying diffusion coefficients occur, e.g., in finance or cancer tomography. However, the present setting clearly also includes time-constant diffusion coefficients, i.e., \(a^{\varvec{y}}({\varvec{x}},t) = a^{\varvec{y}}({\varvec{x}})\, \forall t \in I\).
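As a concrete instance of (2.1) in the time-constant case, the following Python sketch evaluates a truncated field with the purely illustrative choices \(a_0 \equiv 1\) and \(\psi_j(x) = j^{-2}\sin(j\pi x)\) on \(D = (0,1)\) (both assumptions are ours, made only for this example), for which (i)-(iii) hold, and checks uniform ellipticity for a random parameter realization.

```python
import numpy as np

def a_trunc(x, y, a0=1.0, decay=2.0):
    """Truncated affine field a^y(x) = a0 + sum_{j<=s} y_j * j^{-decay} * sin(j*pi*x),
    an illustrative choice of psi_j satisfying the summability condition (i)."""
    s = len(y)
    j = np.arange(1, s + 1)[None, :]                     # shape (1, s)
    psi = j**(-decay) * np.sin(j * np.pi * x[:, None])   # shape (len(x), s)
    return a0 + psi @ y

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 201)
y = rng.uniform(-0.5, 0.5, size=100)   # one realization of the first 100 parameters

a = a_trunc(x, y)
# Uniform ellipticity holds for every y in U:
# a^y >= a0 - 0.5 * sum_j j^{-2} > 1 - 0.5 * pi**2 / 6, which is about 0.18 > 0.
print(a.min(), a.max())
```

The \(\ell^1\) condition in (i) is exactly what makes the worst-case lower bound over all \({\varvec{y}}\in U\) finite and positive in this sketch.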

We consider the heat equation over the time interval \(I=[0,T]\), given by the partial differential equation (PDE)

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial }{\partial t} u^{{\varvec{y}}}({\varvec{x}},t)-\nabla \cdot \big (a^{{\varvec{y}}}({\varvec{x}},t)\nabla u^{{\varvec{y}}}({\varvec{x}},t)\big ) = z({\varvec{x}},t), &{}{\varvec{x}}\in D,\quad \;\, t\in I,\\ u^{{\varvec{y}}}({\varvec{x}},t) = 0, &{}{\varvec{x}}\in \partial D, \quad t\in I,\\ u^{{\varvec{y}}}({\varvec{x}},0) = u_0({\varvec{x}}), &{}{\varvec{x}}\in D, \end{array}\right. } \end{aligned}$$
(2.2)

for all \({\varvec{y}}\in U\). Here \(z({\varvec{x}},t)\) is the control and \(u_0 \in L^2(D)\) denotes the initial heat distribution. We denote the input functions collectively by \(f:= (z, u_0)\). We have imposed homogeneous Dirichlet boundary conditions.

Given a target state \({\widehat{u}}({\varvec{x}},t)\), we will study the problem of minimizing the following objective function:

$$\begin{aligned} {\widetilde{J}}(u,z)&:= {{\mathcal {R}}} \Big (\frac{\alpha _1}{2}\Vert u^{\varvec{y}}- {\widehat{u}}\Vert ^2_{L^2(V;I)} + \frac{\alpha _2}{2} \Vert u^{\varvec{y}}(\cdot ,T) - {\widehat{u}}(\cdot ,T)\Vert _{L^2(D)}^2 \Big )\nonumber \\&\qquad + \frac{\alpha _3}{2}\Vert z\Vert _{L^2(V';I)}^2\,, \end{aligned}$$
(2.3)

subject to the PDE (2.2) and constraints on the control to be defined later in the manuscript. By \({{\mathcal {R}}}\) we denote a risk measure, which is a functional that maps a set of random variables into the extended real numbers. Specifically, \({{\mathcal {R}}}\) will later be either the expected value or the entropic risk measure, both involving high-dimensional integrals with respect to \({\varvec{y}}\). We will first introduce a function space setting to describe the problem properly, including the definition of the \(L^2(V;I)\) and \(L^2(V';I)\) norms.

2.1 Function space setting

We define \(V:= H_0^1(D)\) and its (topological) dual space \(V':= H^{-1}(D)\), and identify \(L^2(D)\) with its own dual. Let \(\langle \cdot ,\cdot \rangle _{V',V}\) denote the duality pairing between \(V'\) and V. The norm and inner product in V are defined as usual by

$$\begin{aligned} \Vert v\Vert _V:= \Vert \nabla v\Vert _{L^2(D)}, \quad \langle v_1,v_2\rangle _V:= \langle \nabla v_1,\nabla v_2\rangle _{L^2(D)}. \end{aligned}$$

We shall make use of the Riesz operator \(R_V\!: V\rightarrow V'\) defined by

$$\begin{aligned} \langle R_V v_1, v_2 \rangle _{V',V} = \langle v_1,v_2\rangle _V \quad \forall \,v_1,v_2\in V, \end{aligned}$$
(2.4)

as well as its inverse \(R_V^{-1}\!: V'\rightarrow V\) satisfying \(R_V^{-1} w = v \Leftrightarrow w = R_V v\) for \(v\in V,\,w\in V'\). It follows from (2.4) that

$$\begin{aligned} \langle w, v \rangle _{V',V} = \langle R_V^{-1}w,v\rangle _V \quad \forall \,v\in V,w\in V'. \end{aligned}$$
(2.5)

In turn we define the inner product in \(V'\) by

$$\begin{aligned} \langle w_1, w_2\rangle _{V'}:= \langle R_V^{-1} w_1, R_V^{-1} w_2\rangle _V. \end{aligned}$$

The norm induced by this inner product is equal to the usual dual norm.

We use analogous notations for inner products and duality pairings between function spaces on the space-time cylinder \(D\times I\). The space \(L^2(V;I)\) consists of all measurable functions \(v: I \rightarrow V\) with finite norm

$$\begin{aligned} \Vert v\Vert _{L^2(V;I)}:= \Big ( \int _I \Vert v(\cdot ,t)\Vert _{V}^2\, \textrm{d}t \Big )^{1/2}. \end{aligned}$$

Note that \((L^2(V;I))'=L^2(V';I)\), with the duality pairing given by

$$\begin{aligned} \langle w,v\rangle _{L^2(V';I),L^2(V;I)} = \int _I \langle w(\cdot ,t),v(\cdot ,t)\rangle _{V',V}\,\textrm{d}t. \end{aligned}$$

We extend the Riesz operator \(R_V\) to \(R_V: L^2(V;I) \rightarrow L^2(V';I)\) so that

$$\begin{aligned} \langle v_1,v_2\rangle _{L^2(V;I)}&= \int _I \langle v_1(\cdot ,t),v_2(\cdot ,t)\rangle _V\,\textrm{d}t = \int _I \big \langle R_V v_1(\cdot ,t),v_2(\cdot ,t)\big \rangle _{V',V}\,\textrm{d}t \\&= \big \langle R_V v_1,v_2\big \rangle _{L^2(V';I),L^2(V;I)} \quad \forall \, v_1,v_2\in L^2(V;I), \end{aligned}$$

and we extend the inverse \(R_V^{-1}: L^2(V';I) \rightarrow L^2(V;I)\) analogously.

We define the space of solutions \(u^{\varvec{y}}\) for \({\varvec{y}}\in U\) by

$$\begin{aligned} {{\mathcal {X}}} := \Big \{v\in L^2(V;I) : \tfrac{\partial }{\partial t}v \in L^2(V';I)\Big \}, \end{aligned}$$

which is the space of all functions v in \(L^2(V;I)\) with (distributional) derivative \(\tfrac{\partial }{\partial t}v\) in \(L^2(V';I)\), and which is equipped with the (graph) norm

$$\begin{aligned} \Vert v\Vert _{{{\mathcal {X}}}}&:=\, \Big (\int _I \Big (\Vert v(\cdot ,t)\Vert _{V}^2 + \Vert \tfrac{\partial }{\partial t}v(\cdot ,t)\Vert _{V'}^2 \Big )\, \textrm{d}t \Big )^{1/2} \\&=\, \Big (\Vert v\Vert _{L^2(V;I)}^2 + \Vert \tfrac{\partial }{\partial t}v\Vert _{L^2(V';I)}^2 \Big )^{1/2}. \end{aligned}$$

Finally, because there are two inputs in Eq. (2.2), namely \(z \in L^2(V';I)\) and \(u_0 \in L^2(D)\), it is convenient to define the product space \({{\mathcal {Y}}}:= L^2(V;I) \times L^2(D)\), and its dual space by \({{\mathcal {Y}}}':= L^2(V';I) \times L^2(D)\), with the norms

$$\begin{aligned} \Vert v\Vert _{{{\mathcal {Y}}}}&:= \Big (\int _I\Vert v_1(\cdot ,t)\Vert _{V}^2\,\textrm{d}t + \Vert v_2\Vert _{L^2(D)}^2\Big )^{1/2}\!\!,\\ \Vert w\Vert _{{{\mathcal {Y}}}'}&:= \Big (\int _I\Vert w_1(\cdot ,t)\Vert _{V'}^2\,\textrm{d}t + \Vert w_2\Vert _{L^2(D)}^2\Big )^{1/2}\!\!. \end{aligned}$$

In particular, we embed \({{\mathcal {X}}}\) into \({{\mathcal {Y}}}\) as follows: every \(v \in {{\mathcal {X}}}\) is identified with the element \(v = (v({\varvec{x}},t),v({\varvec{x}},0))\) of \({{\mathcal {Y}}}\). This gives \({{\mathcal {X}}} \subseteq {{\mathcal {Y}}}\). We further know from [13, Theorem 5.9.3] that \({{\mathcal {X}}} \hookrightarrow {{\mathcal {C}}}(L^2(D);I)\) and \(\max _{t \in I} \Vert v(\cdot ,t)\Vert _{L^2(D)} \le C_1(\Vert v\Vert _{L^2(V;I)} + \Vert \tfrac{\partial }{\partial t} v\Vert _{L^2(V';I)}) \le \sqrt{2}\, C_1 \Vert v\Vert _{{\mathcal {X}}}\) for \(v \in {{\mathcal {X}}}\), where \(C_1\) depends on T only. Hence we obtain for all \(v \in {{\mathcal {X}}}\) that

$$\begin{aligned} \Vert v\Vert _{{\mathcal {Y}}}^2&= \Vert v\Vert _{L^2(V;I) \times L^2(D)}^2 = \Vert v\Vert _{L^2(V;I)}^2 + \Vert v(\cdot ,0)\Vert _{L^2(D)}^2\\&\le \Vert v\Vert _{L^2(V;I)}^2 + \Big (\max _{t \in I}\Vert v(\cdot ,t)\Vert _{L^2(D)}\Big )^2 \le \Vert v\Vert _{{\mathcal {X}}}^2 + 2\,C_1^2\Vert v\Vert _{{\mathcal {X}}}^2 = (1 + 2\,C_1^2)\Vert v\Vert ^2_{{\mathcal {X}}}, \end{aligned}$$

and thus we get that \({{\mathcal {X}}}\) is continuously embedded into \({{\mathcal {Y}}}\), i.e., \({{\mathcal {X}}} \hookrightarrow {{\mathcal {Y}}}\).

2.2 Variational formulation

Based on these spaces and using integration by parts with respect to \({\varvec{x}}\), we can write (2.2) as a variational problem as follows. Given the input functions \(f =(z,u_0)\in {{\mathcal {Y}}}'\) and \({\varvec{y}}\in U\), find a function \(u^{{\varvec{y}}} \in {{\mathcal {X}}}\) such that

$$\begin{aligned} b({\varvec{y}};u^{{\varvec{y}}},v) \,=\, \langle f,v \rangle _{{{\mathcal {Y}}}',{{\mathcal {Y}}}} \quad \forall \; v= (v_1,v_2) \in {{\mathcal {Y}}}\,, \end{aligned}$$
(2.6)

where for all \(w\in {{\mathcal {X}}}\), \(v = (v_1,v_2) \in {{\mathcal {Y}}}\) and \({\varvec{y}}\in U\),

$$\begin{aligned} b({\varvec{y}};w,v)&:= \langle B^{\varvec{y}}w, v\rangle _{{{\mathcal {Y}}}',{{\mathcal {Y}}}} \nonumber \\&:= \underbrace{\int _I \big \langle \tfrac{\partial }{\partial t} w,v_1\big \rangle _{V',V}\, \textrm{d}t + \int _I \int _{D} \big (a^{{\varvec{y}}} \nabla w \cdot \nabla v_1\big )\,\textrm{d}{\varvec{x}}\,\textrm{d}t }_{=:\,\langle B_1^{\varvec{y}}w,v_1 \rangle _{L^2(V';I),L^2(V;I)}}\nonumber \\&\quad + \underbrace{\int _D w(\cdot ,0)\,v_2\, \textrm{d}{\varvec{x}}}_{=:\, \langle B_2^{\varvec{y}}w, v_2 \rangle _{L^2(D)}}\,, \end{aligned}$$
(2.7)
$$\begin{aligned} \langle f,v \rangle _{{{\mathcal {Y}}}',{{\mathcal {Y}}}}:= \int _I \langle z, v_1\rangle _{V',V}\, \textrm{d}t + \int _D u_0\, v_2\,\textrm{d}{\varvec{x}}, \nonumber \end{aligned}$$

with operators \(B^{\varvec{y}}: {{\mathcal {X}}} \rightarrow {{\mathcal {Y}}}'\), \(B_1^{\varvec{y}}: {{\mathcal {X}}} \rightarrow L^2(V';I)\), \(B_2^{\varvec{y}}:{{\mathcal {X}}} \rightarrow L^2(D)\), and \(B^{\varvec{y}}w = (B_1^{\varvec{y}}w, B_2^{\varvec{y}}w)\). For better readability we have omitted the parameter dependence \(v = (v_1({\varvec{x}},t),v_2({\varvec{x}}))\), \(f = (z({\varvec{x}},t), u_0({\varvec{x}}))\), \(w = w({\varvec{x}},t)\) and \(a^{{\varvec{y}}} = a^{{\varvec{y}}}({\varvec{x}},t)\). Note that a solution of (2.6) automatically satisfies \(u^{\varvec{y}}(\cdot ,0) = u_0\), as can be seen by setting \(v_1 = 0\) and allowing arbitrary \(v_2\).

The parametric family of parabolic evolution operators \(\{B^{\varvec{y}},\, {\varvec{y}}\in U\}\) associated with this bilinear form is a family of isomorphisms from \({{\mathcal {X}}}\) to \({{\mathcal {Y}}}'\), see, e.g., [10]. In [44] a shorter proof based on the characterization of the bounded invertibility of linear operators between Hilbert spaces is presented, together with precise bounds on the norms of the operator and its inverse: there exist constants \(0< \beta _1 \le \beta _2 < \infty \) such that

$$\begin{aligned} \sup _{{\varvec{y}}\in U} \Vert (B^{\varvec{y}})^{-1}\Vert _{{{\mathcal {Y}}}' \rightarrow {{\mathcal {X}}}} \le \frac{1}{\beta _1} \quad \text {and} \quad \sup _{{\varvec{y}}\in U} \Vert B^{\varvec{y}}\Vert _{{{\mathcal {X}}} \rightarrow {{\mathcal {Y}}}'} \le \beta _2\,, \end{aligned}$$
(2.8)

where \(\beta _1 \ge \tfrac{\min \{a_{\min } a_{\max }^{-2},a_{\min }\}}{\sqrt{2\max \{a_{\min }^{-2},1\}+\varrho ^2}}\) and \(\beta _2 \le \sqrt{2\max \{1,a_{\max }^2\}+\varrho ^2}\) with \(\varrho := \sup \limits _{w \in {{\mathcal {X}}}} \tfrac{\Vert w(\cdot ,0)\Vert _{L^2(D)}}{\Vert w\Vert _{{{\mathcal {X}}}}}\), and hence for all \({\varvec{y}}\in U\) we have the a priori estimate

$$\begin{aligned} \Vert u^{{\varvec{y}}}\Vert _{{{\mathcal {X}}}} \le \frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} = \frac{1}{\beta _1} \Vert (z,u_0)\Vert _{{{\mathcal {Y}}}'} = \frac{1}{\beta _1} \left( \Vert z\Vert ^2_{L^2(V';I)} + \Vert u_0\Vert ^2_{L^2(D)} \right) ^{1/2}\,. \end{aligned}$$
(2.9)

With \(\mathbb {N}_0:= \{0,1,2,\ldots \}\), let \({\varvec{\nu }}\in {\mathbb {N}}_0^\infty \) denote a multi-index, and define \({\textrm{supp}}({\varvec{\nu }}):=\{j\ge 1: \nu _j\ne 0\}\) and \(|{\varvec{\nu }}|:= \sum _{j\ge 1} \nu _j\). In the sequel, we shall consider the set \({\mathscr {F}}:=\{{\varvec{\nu }}\in \mathbb N_0^\infty :|\textrm{supp}({\varvec{\nu }})|<\infty \}\) of multi-indices with finite support. We use the notation \(\partial ^{\varvec{\nu }}_{\varvec{y}}:= \prod _{j\ge 1} (\partial /\partial y_j)^{\nu _j}\) to denote the mixed partial derivatives with respect to \({\varvec{y}}\). For any sequence of real numbers \({\varvec{b}}= (b_j)_{j\ge 1}\), we define \({\varvec{b}}^{\varvec{\nu }}:= \prod _{j\ge 1} b_j^{\nu _j}\).

The following regularity result for the state \(u^{\varvec{y}}\) was proved in [32].

Lemma 2.1

Let \(f=(z,u_0)\in {{\mathcal {Y}}}'\). For all \({\varvec{\nu }}\in {\mathscr {F}}\) and all \({\varvec{y}}\in U\), we have

$$\begin{aligned} \Vert \partial ^{{\varvec{\nu }}}_{{\varvec{y}}}u^{{\varvec{y}}}\Vert _{{{\mathcal {X}}}} \le \frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} \, |{\varvec{\nu }}|!\,{\varvec{b}}^{{\varvec{\nu }}}, \end{aligned}$$
(2.10)

where \(\beta _1\) is as described in (2.8) and the sequence \({\varvec{b}}= (b_j)_{j\ge 1}\) is defined by

$$\begin{aligned} b_j := \frac{1}{\beta _1}\,\sup _{t\in I}\Vert \psi _j(\cdot ,t)\Vert _{L^{\infty }(D)}. \end{aligned}$$
(2.11)

For our later derivation of the optimality conditions for the optimal control problem, it is helpful to write the variational form of the PDE (2.6) as an operator equation using (2.7):

$$\begin{aligned} B^{\varvec{y}}u^{\varvec{y}}\,=\, (B_1^{\varvec{y}}u^{\varvec{y}}, B_2^{\varvec{y}}u^{\varvec{y}}) \,=\, (z, u_0) \qquad \text {in }{{\mathcal {Y}}}'\,, \end{aligned}$$
(2.12)

with \(B_1^{\varvec{y}}: {{\mathcal {X}}} \rightarrow L^2(V';I)\) and \(B_2^{\varvec{y}}: {{\mathcal {X}}} \rightarrow L^2(D)\) given by

$$\begin{aligned} B_1^{\varvec{y}}= \Lambda _1 B^{\varvec{y}}\quad \text {and} \quad B_2^{\varvec{y}}= \Lambda _2 B^{\varvec{y}}, \end{aligned}$$

where \(\Lambda _1: {{\mathcal {Y}}}' \rightarrow L^2(V';I)\) and \(\Lambda _2: {{\mathcal {Y}}}' \rightarrow L^2(D)\) are the restriction operators defined, for any \(v = (v_1,v_2) \in {{\mathcal {Y}}}'\), by

$$\begin{aligned} \Lambda _1 (v_1,v_2) := v_1 \quad \text { and } \quad \Lambda _2 (v_1,v_2) := v_2\,. \end{aligned}$$

For the definition of a meaningful inverse of the operators \(B_1^{\varvec{y}}\) and \(B_2^{\varvec{y}}\), we first define the trivial extension operators \(\Xi _1:L^2(V';I) \rightarrow {{\mathcal {Y}}}'\) and \(\Xi _2: L^2(D) \rightarrow {{\mathcal {Y}}}'\), for any \(v_1 \in L^2(V';I)\) and \(v_2\in L^2(D)\), by

$$\begin{aligned} \Xi _1 v_1 := (v_1,0) \quad \text { and } \quad \Xi _2 v_2 := (0,v_2)\,. \end{aligned}$$

We observe that \(P_1:= \Xi _1\Lambda _1\) is the orthogonal projection onto the \(L^2(V';I)\)-component of \({{\mathcal {Y}}}'\) and, analogously, \(P_2:= \Xi _2\Lambda _2\) is the orthogonal projection onto the \(L^2(D)\)-component of \({{\mathcal {Y}}}'\). This is verified as follows. For all \(v,u\in {{\mathcal {Y}}}'\) it is true that

$$\begin{aligned} \langle ({{\mathcal {I}}}_{{{\mathcal {Y}}}'}-P_1) v, P_1 u \rangle _{{{\mathcal {Y}}}'} = 0 \quad \text {and} \quad \langle ({{\mathcal {I}}}_{{{\mathcal {Y}}}'}-P_2) v, P_2 u \rangle _{{{\mathcal {Y}}}'} = 0\,, \end{aligned}$$

where \({{\mathcal {I}}}_{{{\mathcal {Y}}}'}\) denotes the identity operator on \({{\mathcal {Y}}}'\). We clearly have \({{\mathcal {I}}}_{{{\mathcal {Y}}}'} = P_1 + P_2\). Therefore we can write any element v in \({{\mathcal {Y}}}'\) as \(v = P_1 v + P_2 v\) in \({{\mathcal {Y}}}'\), and by linearity of \((B^{\varvec{y}})^{-1}\) we get

$$\begin{aligned} (B^{\varvec{y}})^{-1} v = (B^{\varvec{y}})^{-1} (P_1 v + P_2 v) = (B^{\varvec{y}})^{-1} P_1 v + (B^{\varvec{y}})^{-1} P_2 v. \end{aligned}$$

Meaningful inverses of the operators \(B^{\varvec{y}}_1: {{\mathcal {X}}} \rightarrow L^2(V';I)\) and \(B^{\varvec{y}}_2: {{\mathcal {X}}} \rightarrow L^2(D)\) are then given by \((B_1^{\varvec{y}})^\dagger : L^2(V';I) \rightarrow {{\mathcal {X}}}\) and \((B_2^{\varvec{y}})^\dagger : L^2(D) \rightarrow {{\mathcal {X}}}\), defined as

$$\begin{aligned} (B_1^{\varvec{y}})^\dagger := (B^{\varvec{y}})^{-1} \Xi _1 \quad \text {and} \quad (B_2^{\varvec{y}})^\dagger := (B^{\varvec{y}})^{-1} \Xi _2\,. \end{aligned}$$
(2.13)

We call the operator \((B_1^{\varvec{y}})^\dagger \) the pseudoinverse of \(B_1^{\varvec{y}}\) and the operator \((B_2^{\varvec{y}})^\dagger \) the pseudoinverse of \(B_2^{\varvec{y}}\). Clearly, both pseudoinverses are bounded linear operators.
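The algebra behind these pseudoinverses can be previewed in a finite-dimensional analogue, where \(B^{\varvec{y}}\), \(\Lambda_i\) and \(\Xi_i\) become block matrices. The Python sketch below is purely illustrative (a random invertible matrix stands in for \(B^{\varvec{y}}\); it is not a discretization used in the paper) and verifies the identities stated in Lemma 2.2 numerically.

```python
import numpy as np

rng = np.random.default_rng(2)
m1, m2 = 3, 2                      # stand-ins for dim L^2(V';I) and dim L^2(D)
n = m1 + m2                        # stand-in for dim X; B must be invertible
B = rng.standard_normal((n, n))    # discrete stand-in for B^y : X -> Y'
Binv = np.linalg.inv(B)

# Restriction operators Lambda_i and trivial extensions Xi_i as block matrices.
L1 = np.hstack([np.eye(m1), np.zeros((m1, m2))])   # Lambda_1 : Y' -> first block
L2 = np.hstack([np.zeros((m2, m1)), np.eye(m2)])   # Lambda_2 : Y' -> second block
X1, X2 = L1.T, L2.T                                # Xi_1, Xi_2

B1, B2 = L1 @ B, L2 @ B                            # B_1 = Lambda_1 B, B_2 = Lambda_2 B
B1d, B2d = Binv @ X1, Binv @ X2                    # pseudoinverses as in (2.13)

# The three identities of Lemma 2.2:
print(np.allclose(B1 @ B1d, np.eye(m1)))
print(np.allclose(B2 @ B2d, np.eye(m2)))
print(np.allclose(B1d @ B1 + B2d @ B2, np.eye(n)))
```

The third identity holds because \(\Xi_1\Lambda_1 + \Xi_2\Lambda_2\) is the identity on the product space, mirroring \(P_1 + P_2 = {{\mathcal {I}}}_{{{\mathcal {Y}}}'}\) above.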

Lemma 2.2

The pseudoinverse operators \((B_1^{\varvec{y}})^\dagger \) and \((B_2^{\varvec{y}})^\dagger \) defined by (2.13) satisfy

$$\begin{aligned} {{\mathcal {I}}}_{L^2(V';I)}&= B_1^{\varvec{y}}(B_1^{\varvec{y}})^\dagger \,, \quad {{\mathcal {I}}}_{L^2(D)} = B_2^{\varvec{y}}(B_2^{\varvec{y}})^\dagger \,, \quad \text {and} \nonumber \\ {{\mathcal {I}}}_{{\mathcal {X}}}&= (B_1^{\varvec{y}})^\dagger B_1^{\varvec{y}}+ (B_2^{\varvec{y}})^\dagger B_2^{\varvec{y}}, \end{aligned}$$
(2.14)

which are the identity operators on \(L^2(V';I)\), \(L^2(D)\), and \({{\mathcal {X}}}\), respectively.

Proof

From the definition of various operators, we have

$$\begin{aligned} B_1^{\varvec{y}}(B_1^{\varvec{y}})^\dagger&= \Lambda _1 B^{\varvec{y}}(B^{\varvec{y}})^{-1} \Xi _1 = \Lambda _1 {{\mathcal {I}}}_{{{\mathcal {Y}}}'} \Xi _1 = \Lambda _1 \Xi _1 = {{\mathcal {I}}}_{L^2(V';I)}\,, \\ B_2^{\varvec{y}}(B_2^{\varvec{y}})^\dagger&= \Lambda _2 B^{\varvec{y}}(B^{\varvec{y}})^{-1} \Xi _2 = \Lambda _2 {{\mathcal {I}}}_{{{\mathcal {Y}}}'} \Xi _2 = \Lambda _2 \Xi _2 = {{\mathcal {I}}}_{L^2(D)}\,, \\ (B_1^{\varvec{y}})^\dagger B_1^{\varvec{y}}+ (B_2^{\varvec{y}})^{\dagger } B_2^{\varvec{y}}&= (B^{\varvec{y}})^{-1} \Xi _1 \Lambda _1 B^{\varvec{y}}+ (B^{\varvec{y}})^{-1} \Xi _2 \Lambda _2 B^{\varvec{y}}\\&= (B^{\varvec{y}})^{-1} (P_1 + P_2) B^{\varvec{y}}= (B^{\varvec{y}})^{-1} {{\mathcal {I}}}_{{{\mathcal {Y}}}'} B^{\varvec{y}}= {{\mathcal {I}}}_{{{\mathcal {X}}}}\,, \end{aligned}$$

as required. \(\square \)

Lemma 2.3

For \({\varvec{y}}\in U\) and given \((z,u_0)\in {{\mathcal {Y}}}'\), the solution \(u^{\varvec{y}}\) of the operator equation (2.12) can be written as

$$\begin{aligned} u^{\varvec{y}}\,=\, (B^{\varvec{y}})^{-1}(z,u_0) \,=\, (B_1^{\varvec{y}})^{\dagger }z + (B_2^{\varvec{y}})^{\dagger } u_0 \quad \text {in } {{\mathcal {X}}}\,. \end{aligned}$$
(2.15)

Proof

From (2.14) we have \(u^{\varvec{y}}= (B_1^{\varvec{y}})^{\dagger } B_1^{\varvec{y}}u^{\varvec{y}}+ (B_2^{\varvec{y}})^{\dagger } B_2^{\varvec{y}}u^{\varvec{y}}= (B_1^{\varvec{y}})^{\dagger }z + (B_2^{\varvec{y}})^{\dagger } u_0\), as required. \(\square \)

2.3 Dual problem

In the following we will need the dual operators \((B^{\varvec{y}})'\), \((B^{\varvec{y}}_1)'\) and \((B^{\varvec{y}}_2)'\) of \(B^{\varvec{y}}\), \(B^{\varvec{y}}_1\) and \(B^{\varvec{y}}_2\), respectively, which are formally defined by

$$\begin{aligned} \langle w, (B^{\varvec{y}})'v\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}&\,:=\, \langle B^{\varvec{y}}w, v\rangle _{{{\mathcal {Y}}}',{{\mathcal {Y}}}}\\ \langle w, (B_1^{\varvec{y}})'v_1\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}&\,:=\, \langle B_1^{\varvec{y}}w, v_1\rangle _{L^2(V';I),L^2(V;I)}\\ \langle w, (B_2^{\varvec{y}})'v_2\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}&\,:=\, \langle B_2^{\varvec{y}}w, v_2\rangle _{L^2(D)} \end{aligned}$$

for all \(w\in {{\mathcal {X}}}\), \(v = (v_1,v_2) \in {{\mathcal {Y}}}\) and \({\varvec{y}}\in U\), with \((B^{\varvec{y}})'v = (B_1^{\varvec{y}})'v_1 + (B_2^{\varvec{y}})'v_2\).

The dual problem to (2.6) (or equivalently (2.12)) is as follows. Given the input function \(f_{\textrm{dual}} \in {{\mathcal {X}}}'\) and \({\varvec{y}}\in U\), find a function \(q^{\varvec{y}}= (q_1^{\varvec{y}},q_2^{\varvec{y}})\in {{\mathcal {Y}}}\) such that

$$\begin{aligned} \langle w, (B^{\varvec{y}})'q^{\varvec{y}}\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} = \langle w, f_{\textrm{dual}} \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}\quad \forall w\in {{\mathcal {X}}}, \end{aligned}$$
(2.16)

or in operator form \( (B^{\varvec{y}})'q^{\varvec{y}}=f_{\textrm{dual}}, \) which has the unique solution \( q^{\varvec{y}}= \big ((B^{\varvec{y}})'\big )^{-1} f_{\textrm{dual}}\,. \)

Existence and uniqueness of the solution of the dual problem follow directly from the bounded invertibility of \(B^{\varvec{y}}\). We know that its inverse, \((B^{\varvec{y}})^{-1}\), is a bounded linear operator and thus the dual of \((B^{\varvec{y}})^{-1}\) is (uniquely) defined (see, e.g., [49, Theorem 1 and Definition 1, Chapter VII]). The operator \((B^{\varvec{y}})^{-1}\) and its dual operator \(((B^{\varvec{y}})^{-1})' =((B^{\varvec{y}})')^{-1}\) have equal operator norms (see, e.g., [49, Theorem 2, Chapter VII]); hence the operator norms of the dual operator \((B^{\varvec{y}})'\) and of its inverse are bounded by the constants \(\beta _2\) and \(\tfrac{1}{\beta _1}\) in (2.8), respectively.
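In the matrix analogue (again purely illustrative, with a random invertible matrix standing in for \(B^{\varvec{y}}\)), the equality of the operator norms and the identity \(((B^{\varvec{y}})^{-1})' = ((B^{\varvec{y}})')^{-1}\) reduce to familiar facts about transposes:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((6, 6))    # stand-in for B^y (invertible almost surely)
Binv = np.linalg.inv(B)

# ||B|| = ||B'|| in the spectral norm, and transposition commutes with inversion.
print(np.isclose(np.linalg.norm(B, 2), np.linalg.norm(B.T, 2)))
print(np.allclose(Binv.T, np.linalg.inv(B.T)))
```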

Applying integration by parts with respect to the time variable in (2.7), the left-hand side of the dual problem (2.16) can be written as

$$\begin{aligned}&\langle w, (B^{\varvec{y}})'q^{\varvec{y}}\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} = \langle B^{\varvec{y}}w, q^{\varvec{y}}\rangle _{{{\mathcal {Y}}}',{{\mathcal {Y}}}} \nonumber \\&\quad = \bigg (\int _I \langle w,-\tfrac{\partial }{\partial t}q_1^{\varvec{y}}\rangle _{V,V'}\, \textrm{d}t + \int _I \int _D (a^{\varvec{y}}\nabla w \cdot \nabla q_1^{\varvec{y}})\,\textrm{d}{\varvec{x}}\,\textrm{d}t \nonumber \\&\qquad + \int _D w(\cdot ,T)\,q_1^{\varvec{y}}(\cdot ,T)\,\textrm{d}{\varvec{x}}- \int _D w(\cdot ,0)\,q_1^{\varvec{y}}(\cdot ,0)\,\textrm{d}{\varvec{x}}\bigg ) + \int _D w(\cdot ,0)\, q_2^{\varvec{y}}\,\textrm{d}{\varvec{x}}\nonumber \\&\quad = \big \langle w,(B_1^{\varvec{y}})'q_1^{\varvec{y}}\big \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} + \big \langle w,(B_2^{\varvec{y}})'q_2^{\varvec{y}}\big \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}\,. \end{aligned}$$
(2.17)

We may express the solution \(q^{\varvec{y}}= (q_1^{\varvec{y}},q_2^{\varvec{y}})\in {{\mathcal {Y}}}\) of the dual problem (2.16) in terms of the dual operators of the pseudoinverse operators \((B_1^{\varvec{y}})^\dagger \) and \((B_2^{\varvec{y}})^\dagger \). This is true because we get an analogous result to Lemma 2.2 in the dual spaces.

Lemma 2.4

The dual operators \(((B_1^{\varvec{y}})^\dagger )'\) and \(((B_2^{\varvec{y}})^\dagger )'\) of the pseudoinverse operators defined in (2.13) satisfy

$$\begin{aligned} {{\mathcal {I}}}_{L^2(V;I)}&= ((B_1^{\varvec{y}})^\dagger )'(B_1^{\varvec{y}})'\,, \quad {{\mathcal {I}}}_{L^2(D)} = ((B_2^{\varvec{y}})^\dagger )'(B_2^{\varvec{y}})' \,, \quad \text {and} \nonumber \\ {{\mathcal {I}}}_{{{\mathcal {X}}}'}&= (B_1^{\varvec{y}})'((B_1^{\varvec{y}})^\dagger )' + (B_2^{\varvec{y}})' ((B_2^{\varvec{y}})^\dagger )'\,, \end{aligned}$$
(2.18)

which are the identity operators on \(L^2(V;I)\), \(L^2(D)\) and \({{\mathcal {X}}}'\), respectively.

Proof

For all \(v_1 \in L^2(V';I)\), \(w_1 \in L^2(V;I)\), \(v_2,w_2\in L^2(D)\), it follows from (2.14) that

$$\begin{aligned} \langle v_1,w_1 \rangle _{L^2(V';I),L^2(V;I)}&= \big \langle B_1^{\varvec{y}}(B_1^{\varvec{y}})^\dagger v_1, w_1 \big \rangle _{L^2(V';I),L^2(V;I)}\\&= \big \langle v_1,((B_1^{\varvec{y}})^\dagger )' (B_1^{\varvec{y}})' w_1 \big \rangle _{L^2(V';I),L^2(V;I)}\,, \text{ and } \\ \langle v_2,w_2 \rangle _{L^2(D)}&= \big \langle B_2^{\varvec{y}}(B_2^{\varvec{y}})^\dagger v_2,w_2 \big \rangle _{L^2(D)} \!=\! \big \langle v_2, ((B_2^{\varvec{y}})^\dagger )' (B_2^{\varvec{y}})' w_2 \big \rangle _{L^2(D)}. \end{aligned}$$

Similarly, for all \(v\in {{\mathcal {X}}}\) and \(w\in {{\mathcal {X}}}'\) we have

$$\begin{aligned} \langle v,w \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}&= \big \langle \big ((B_1^{\varvec{y}})^\dagger B_1^{\varvec{y}}+ (B_2^{\varvec{y}})^\dagger B_2^{\varvec{y}}\big ) v, w \big \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}\\&= \big \langle (B_1^{\varvec{y}})^{\dagger } B_1^{\varvec{y}}v, w \big \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} + \big \langle (B_2^{\varvec{y}})^{\dagger } B_2^{\varvec{y}}v, w \big \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} \\&= \big \langle v, (B_1^{\varvec{y}})'((B_1^{\varvec{y}})^\dagger )' w \big \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} + \big \langle v, (B_2^{\varvec{y}})'((B_2^{\varvec{y}})^\dagger )' w \big \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} \\&= \langle v, \big ((B_1^{\varvec{y}})'((B_1^{\varvec{y}})^\dagger )' + (B_2^{\varvec{y}})'((B_2^{\varvec{y}})^\dagger )'\big ) w \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}\,. \end{aligned}$$

This completes the proof. \(\square \)

Lemma 2.5

Given the input function \(f_{\textrm{dual}} \in {{\mathcal {X}}}'\) and \({\varvec{y}}\in U\), the (unique) solution of the dual problem (2.16) is given by

$$\begin{aligned} q^{\varvec{y}}= (q^{\varvec{y}}_1, q^{\varvec{y}}_2) = \big ( ((B_1^{\varvec{y}})^\dagger )' f_{\textrm{dual}}, ((B_2^{\varvec{y}})^\dagger )' f_{\textrm{dual}}\big ) \quad \hbox { in}\ {{\mathcal {Y}}}. \end{aligned}$$
(2.19)

Proof

Existence and uniqueness follow from the bounded invertibility of \((B^{\varvec{y}})^\prime \), see Sect. 2.2. Thus, we only need to verify that (2.19) solves the dual problem (2.16). It follows from (2.18) that

$$\begin{aligned} f_{\textrm{dual}}&= \big ((B_1^{\varvec{y}})'((B_1^{\varvec{y}})^\dagger )' + (B_2^{\varvec{y}})' ((B_2^{\varvec{y}})^\dagger )'\big ) f_{\textrm{dual}} \\&= (B_1^{\varvec{y}})'((B_1^{\varvec{y}})^\dagger )' f_{\textrm{dual}} + (B_2^{\varvec{y}})' ((B_2^{\varvec{y}})^\dagger )' f_{\textrm{dual}}\\&= (B_1^{\varvec{y}})' q_1^{\varvec{y}}+ (B_2^{\varvec{y}})' q_2^{\varvec{y}}= (B^{\varvec{y}})' q^{\varvec{y}}\,, \end{aligned}$$

as required. \(\square \)

We will see in the next section that, with the correct choice of the right-hand side \(f_{\textrm{dual}}\), the gradient of the objective function (2.3) can be computed using the solution \(q^{\varvec{y}}\) of the dual problem.

3 Parabolic optimal control problems under uncertainty with control constraints

The presence of uncertainty in the optimization problem requires the introduction of a risk measure \({\mathcal {R}}\) that maps the random variable objective function (see (3.3) below) to the extended real numbers. Let \((\Omega , {\mathcal {A}}, {\mathbb {P}})\) be a complete probability space. A functional \({{\mathcal {R}}}: L^p(\Omega , {\mathcal {A}}, {\mathbb {P}}) \rightarrow {\mathbb {R}} \cup \{\infty \}\), for \(p \in [1,\infty )\), is said to be a coherent risk measure [1] if for \(X,{\tilde{X}} \in L^p(\Omega , {\mathcal {A}}, {\mathbb {P}})\) we have

(1) Convexity: \({\mathcal {R}}(\lambda X + (1-\lambda ) {\tilde{X}}) \le \lambda {\mathcal {R}}(X) + (1-\lambda ) {\mathcal {R}}({\tilde{X}})\) for all \(\lambda \in [0,1]\).

(2) Translation equivariance: \({\mathcal {R}}(X+c) = {\mathcal {R}}(X) + c\) for all \(c \in {\mathbb {R}}\).

(3) Monotonicity: If \(X \le {\tilde{X}}\) \({\mathbb {P}}\)-a.e., then \({\mathcal {R}}(X) \le {\mathcal {R}}({\tilde{X}})\).

(4) Positive homogeneity: \({\mathcal {R}}(tX) = t {\mathcal {R}}(X)\) for all \(t\ge 0\).

Coherent risk measures are popular as numerous desirable properties can be derived from the above conditions (see, e.g., [29] and the references therein). However, it can be shown (see [30, Theorem 1]) that the only coherent risk measures that are Fréchet differentiable are linear ones. The expected value has all of these properties, but is risk-neutral. In order to also address risk-averse problems we focus on the (nonlinear) entropic risk measures, which are risk-averse, Fréchet differentiable, and satisfy conditions (1)–(3) above but not (4), i.e., they are not positively homogeneous (and thus not coherent). Risk measures satisfying (2) and (3) are called monetary risk measures, and a monetary risk measure that also satisfies (1) is called a convex risk measure (see [14]).
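These properties are easy to probe numerically. The Python sketch below uses the standard sample-based form of the entropic risk measure, \({\mathcal {R}}(X) = \theta^{-1}\log {\mathbb E}[e^{\theta X}]\) (the precise definition used in this paper appears in Sect. 3.2), and checks translation equivariance, risk aversion relative to the expected value, and the failure of positive homogeneity.

```python
import numpy as np

rng = np.random.default_rng(4)

def entropic_risk(x, theta):
    """Sample-based entropic risk R(X) = (1/theta) * log E[exp(theta * X)],
    evaluated with a log-sum-exp shift for numerical stability."""
    m = theta * x
    return (np.max(m) + np.log(np.mean(np.exp(m - np.max(m))))) / theta

X = rng.normal(0.0, 1.0, 200_000)
theta = 2.0
r = entropic_risk(X, theta)

# (2) Translation equivariance holds exactly for the sample estimator:
print(np.isclose(entropic_risk(X + 3.0, theta), r + 3.0))
# Risk aversion: R(X) >= E[X]; in exact arithmetic R of N(0,1) equals theta/2.
print(r, X.mean())
# (4) fails: R(tX) != t*R(X) for t != 1, so the measure is not coherent.
print(np.isclose(entropic_risk(2 * X, theta), 2 * r))
```

The exponential weighting inflates the contribution of unfavourable outcomes, which is precisely the risk-averse behaviour exploited in the optimal control problem.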

In this section we will first discuss the required conditions on the risk measure \({\mathcal {R}}\) under which the optimal control problem has a unique solution. We will then present two classes of risk measures that satisfy these conditions, namely the linear risk measures that include the expected value, and the entropic risk measures. Finally we derive necessary and sufficient optimality conditions for the optimal control problem with these two risk measures. We assume that the target state \({\widehat{u}}\) belongs to \({{\mathcal {X}}}\) and that the constants \(\alpha _{1}, \alpha _{2}\) are nonnegative with \(\alpha _1+\alpha _2 >0\) and \(\alpha _{3}>0\). Then we consider the following problem: minimize \({\widetilde{J}}(u,z)\) defined in (2.3) subject to the parabolic PDE (2.2) and constraints on the control

$$\begin{aligned} z \in {\mathcal {Z}} \end{aligned}$$
(3.1)

with \({\mathcal {Z}}\) being nonempty, bounded, closed and convex.

We want to analyze the problem in its reduced form, i.e., expressing the state \(u^{\varvec{y}}= (B^{\varvec{y}})^{-1} (z,u_0)\) in (2.3) in terms of the control z. This reformulation is possible because of the bounded invertibility of the operator \(B^{\varvec{y}}\) for every \({\varvec{y}}\in U\), see Sect. 2.2 and the references therein. We therefore introduce the alternative notation \(u(z) = (u^{\varvec{y}}(z))({\varvec{x}},t) = u^{\varvec{y}}({\varvec{x}},t)\). (Of course \(u^{\varvec{y}}\) also depends on \(u_0\), but we regard \(u_0\) as fixed and suppress this dependence.) The reduced problem is then to minimize

$$\begin{aligned} J(z) := {\widetilde{J}}\big (u(z),z\big )&= {{\mathcal {R}}} \Big ( \frac{\alpha _1}{2}\big \Vert u^{\varvec{y}}(z)-{\widehat{u}}\big \Vert _{L^2(V;I)}^2 + \frac{\alpha _2}{2}\big \Vert E_T\big (u^{\varvec{y}}(z)-{\widehat{u}}\big )\big \Vert _{L^2(D)}^2 \Big ) \nonumber \\&\quad +\frac{\alpha _3}{2}\Vert z\Vert _{L^2(V';I)}^2 , \end{aligned}$$
(3.2)

where \(E_T\!:{{\mathcal {X}}}\rightarrow L^2(D)\) is the bounded linear operator defined by \(v\mapsto v(\cdot ,T)\) for some fixed terminal time \(T>0\).

Defining

$$\begin{aligned} \Phi ^{\varvec{y}}(z) := \frac{\alpha _1}{2} \big \Vert (B^{\varvec{y}})^{-1}(z,u_0)-{\widehat{u}}\big \Vert ^2_{L^2(V;I)} + \frac{\alpha _2}{2} \big \Vert E_T\big ((B^{\varvec{y}})^{-1}(z,u_0) - {\widehat{u}}\big )\big \Vert ^2_{L^2(D)}, \end{aligned}$$
(3.3)

we can equivalently write the reduced problem as

$$\begin{aligned} \min _{z \in {\mathcal {Z}}} \Big ( {{\mathcal {R}}}(\Phi ^{\varvec{y}}(z)) + \frac{\alpha _3}{2}\Vert z\Vert _{L^2(V';I)}^2 \Big )\,. \end{aligned}$$
(3.4)

With the uniformly boundedly invertible forward operator \(B^{\varvec{y}}\), our setting fits into the abstract framework of [29] where the authors derive existence and optimality conditions for PDE-constrained optimization under uncertainty. In particular, the forward operator \(B^{\varvec{y}}\), the regularization term \(\tfrac{\alpha _3}{2}\Vert z\Vert _{L^2(V';I)}^2\) and the random variable tracking-type objective function \(\Phi ^{\varvec{y}}\) satisfy the assumptions of [29, Proposition 3.12]. In order to present the result about the existence and uniqueness of the solution of (3.4), which is based on [29, Proposition 3.12], we recall some definitions from convex analysis (see, e.g., [29] and the references therein): A functional \({\mathcal {R}}: L^p(\Omega ,{\mathcal {A}},{\mathbb {P}}) \rightarrow {\mathbb {R}} \cup \{\infty \}\) is called proper if \({\mathcal {R}}(X) > -\infty \) for all \(X \in L^p(\Omega ,{\mathcal {A}},{\mathbb {P}})\) and \({\textrm{dom}}({\mathcal {R}}):= \{X \in L^p(\Omega ,{\mathcal {A}},{\mathbb {P}}): {\mathcal {R}}(X) < \infty \} \ne \emptyset \); it is called lower semicontinuous or closed if its epigraph \(\textrm{epi}({\mathcal {R}}):= \{(X,\alpha ) \in L^p(\Omega ,{\mathcal {A}},{\mathbb {P}}) \times {\mathbb {R}}: {\mathcal {R}}(X) \le \alpha \}\) is closed in the product topology \(L^p(\Omega ,{\mathcal {A}},{\mathbb {P}}) \times {\mathbb {R}}\).

Lemma 3.1

Let \(\alpha _1,\alpha _2 \ge 0\) and \(\alpha _3>0\) with \(\alpha _1+\alpha _2 >0\), and let \({\mathcal {R}}\) be proper, closed, convex and monotonic. Then there exists a unique solution of (3.4).

Proof

The existence of the solution follows directly from [29, Proposition 3.12]. We thus only prove the strong convexity of the objective function, which implies strict convexity and hence uniqueness of the solution. Clearly \(\frac{\alpha _3}{2}\Vert z\Vert ^2_{L^2(V';I)}\) is strongly convex. Since the sum of a convex and a strongly convex function is strongly convex, it remains to show the convexity of \({\mathcal {R}}(\Phi ^{\varvec{y}}(z))\). By the linearity and the bounded invertibility of the linear forward operator \(B^{\varvec{y}}\), the tracking-type objective functional \(\Phi ^{\varvec{y}}(z)\) is quadratic in z and hence convex, i.e., we have for \(z,{\tilde{z}} \in L^2(V';I)\) and \(\lambda \in [0,1]\) that \(\Phi ^{\varvec{y}}(\lambda z + (1-\lambda ) {\tilde{z}}) \le \lambda \Phi ^{\varvec{y}}(z) + (1-\lambda )\Phi ^{\varvec{y}}({\tilde{z}})\). Then, by the monotonicity and the convexity of the risk measure \({\mathcal {R}}\) we get \( {\mathcal {R}}(\Phi ^{\varvec{y}}(\lambda z + (1-\lambda ) {\tilde{z}})) \le {\mathcal {R}}(\lambda \Phi ^{\varvec{y}}(z) + (1-\lambda )\Phi ^{\varvec{y}}({\tilde{z}})) \le \lambda {\mathcal {R}}( \Phi ^{\varvec{y}}(z)) + (1-\lambda ){\mathcal {R}}(\Phi ^{\varvec{y}}({\tilde{z}}))\,, \) as required. \(\square \)

3.1 Linear risk measures, including the expected value

First we derive a formula for the Fréchet derivative of (3.2) when \({{\mathcal {R}}}\) is linear, which includes the special case \({{\mathcal {R}}}(\cdot ) = \int _U (\cdot )\,\textrm{d}{\varvec{y}}\).

Lemma 3.2

Let \({{\mathcal {R}}}\) be linear. Then the Fréchet derivative of (3.2) as an element of \((L^2(V';I))'=L^2(V;I)\) is given by

$$\begin{aligned} J'(z)&\,=\, {{\mathcal {R}}}\Big ( \big ((B_1^{\varvec{y}})^{\dagger }\big )' \big (\alpha _1 R_V + \alpha _2 E_T'E_T\big ) \big (u^{\varvec{y}}(z)-{\widehat{u}}\big ) \Big ) +\alpha _3 R_V^{-1} z \end{aligned}$$
(3.5)

for \(z\in L^2(V';I)\).

Proof

For \(z,\delta \in L^2(V';I)\), we can write

$$\begin{aligned} J(z+\delta )&= {{\mathcal {R}}}\Big (\frac{\alpha _1}{2} \big \Vert u^{\varvec{y}}(z+\delta )-u^{\varvec{y}}(z)+u^{\varvec{y}}(z)-{\widehat{u}}\big \Vert _{L^2(V;I)}^2 \\&\quad +\frac{\alpha _2}{2} \big \Vert E_T\big ( u^{\varvec{y}}(z+\delta )-u^{\varvec{y}}(z)+u^{\varvec{y}}(z)-{\widehat{u}}\big )\big \Vert _{L^2(D)}^2 \Big ) +\frac{\alpha _3}{2} \Vert z+\delta \Vert _{L^2(V';I)}^2 \\&= {{\mathcal {R}}}\Big (\frac{\alpha _1}{2} \big \Vert (B_1^{\varvec{y}})^{\dagger }\delta + \big (u^{\varvec{y}}(z)-{\widehat{u}}\big )\big \Vert _{L^2(V;I)}^2 \\&\quad +\frac{\alpha _2}{2} \big \Vert E_T(B_1^{\varvec{y}})^{\dagger }\delta + E_T \big (u^{\varvec{y}}(z)-{\widehat{u}}\big )\big \Vert _{L^2(D)}^2 \Big ) +\frac{\alpha _3}{2} \Vert z+\delta \Vert _{L^2(V';I)}^2, \end{aligned}$$

where we used (2.15) to write \(u^{\varvec{y}}(z+\delta )-u^{\varvec{y}}(z) = [(B_1^{\varvec{y}})^{\dagger }(z+\delta ) + (B_2^{\varvec{y}})^{\dagger }u_0] - [(B_1^{\varvec{y}})^{\dagger }(z) + (B_2^{\varvec{y}})^{\dagger }u_0] = (B_1^{\varvec{y}})^{\dagger }\delta \). Expanding the squared norms using \(\Vert v+w\Vert ^2 = \langle v+w,v+w\rangle = \Vert v\Vert ^2 + 2\langle v,w\rangle + \Vert w\Vert ^2\), we obtain

$$\begin{aligned} J(z+\delta ) \,=\, J(z)+ (\partial _z J(z))\,\delta + \textrm{o}(\Vert \delta \Vert _{L^2(V';I)}), \end{aligned}$$

with the Fréchet derivative \(\partial _z J(z): L^2(V';I)\rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} (\partial _z J(z))\,\delta&:= {{\mathcal {R}}}\Big (\alpha _1 \overbrace{\big \langle (B_1^{\varvec{y}})^{\dagger }\delta , u^{\varvec{y}}(z)-{\widehat{u}} \big \rangle _{L^2(V;I)}}^{=:\,\textrm{Term}_1} \\&\quad \qquad +\alpha _2 \underbrace{\big \langle E_T(B_1^{\varvec{y}})^{\dagger }\delta , E_T\big (u^{\varvec{y}}(z)-{\widehat{u}}\big ) \big \rangle _{L^2(D)}}_{=:\,\textrm{Term}_2} \Big ) +\alpha _3 \underbrace{\langle z,\delta \rangle _{L^2(V';I)}}_{=:\,\textrm{Term}_3}. \end{aligned}$$

It remains to simplify the three terms. Using the extended Riesz operator \(R_V\!: L^2(V;I) \rightarrow L^2(V';I)\), we have

$$\begin{aligned} \textrm{Term}_1&= \big \langle u^{\varvec{y}}(z)-{\widehat{u}} , (B_1^{\varvec{y}})^{\dagger }\delta \big \rangle _{L^2(V;I)} \\&= \big \langle R_V\big (u^{\varvec{y}}(z)-{\widehat{u}}\big ),(B_1^{\varvec{y}})^{\dagger }\delta \big \rangle _{L^2(V';I),L^2(V;I)} \\&= \big \langle R_V\big (u^{\varvec{y}}(z)-{\widehat{u}}\big ),(B_1^{\varvec{y}})^{\dagger }\delta \big \rangle _{{{\mathcal {X}}}',{{\mathcal {X}}}} \\&= \big \langle \big ((B_1^{\varvec{y}})^{\dagger }\big )'R_V\big (u^{\varvec{y}}(z)-{\widehat{u}}\big ),\delta \big \rangle _{L^2(V;I),L^2(V';I)}, \end{aligned}$$

where the third equality follows since \((B_1^{\varvec{y}})^{\dagger }\delta \in {{\mathcal {X}}} \hookrightarrow L^2(V;I)\), and the fourth equality follows from the definition of the dual operator \(((B_1^{\varvec{y}})^{\dagger })': {{\mathcal {X}}}' \rightarrow L^2(V;I)\), noting that \((L^2(V';I))' = L^2(V;I)\).

Next, using the definition of the dual operator \((E_T)':L^2(D) \rightarrow {{\mathcal {X}}}'\), we can write

$$\begin{aligned} \textrm{Term}_2&= \big \langle E_T\big (u^{\varvec{y}}(z)-{\widehat{u}}\big ), E_T(B_1^{\varvec{y}})^{\dagger }\delta \big \rangle _{L^2(D)} \\&= \big \langle E_T'E_T\big (u^{\varvec{y}}(z)-{\widehat{u}}\big ), (B_1^{\varvec{y}})^{\dagger }\delta \big \rangle _{{{\mathcal {X}}}',{{\mathcal {X}}}} \\&= \big \langle \big ((B_1^{\varvec{y}})^{\dagger }\big )' E_T' E_T\big (u^{\varvec{y}}(z)-{\widehat{u}}\big ),\delta \big \rangle _{L^2(V;I),L^2(V';I)}. \end{aligned}$$

Finally, using the definition of the \(L^2(V';I)\) inner product and the extended inverse Riesz operator \(R_V^{-1}\!:L^2(V';I)\rightarrow L^2(V;I)\), we obtain

$$\begin{aligned} \textrm{Term}_3 \,=\, \langle z, \delta \rangle _{L^2(V';I)} \,=\, \langle R_V^{-1} z, R_V^{-1} \delta \rangle _{L^2(V;I)} \,=\, \big \langle R_V^{-1} z,\delta \big \rangle _{L^2(V;I),L^2(V';I)}. \end{aligned}$$

Writing \((\partial _z J(z))\, \delta = \langle J'(z),\delta \rangle _{L^2(V;I),L^2(V';I)}\) and collecting the terms above leads to the expression for \(J'(z)\) in (3.5).\(\square \)

We call \(J'(z)\) the gradient of J(z) and show next that \(J'(z)\) can be computed using the solution of the dual problem (2.16) with

$$\begin{aligned} f_{\textrm{dual}} := (\alpha _1 R_V + \alpha _2 E_T'E_T)(u^{\varvec{y}}- {\widehat{u}}) \in {{\mathcal {X}}}'\,. \end{aligned}$$
(3.6)

We show this first for the special case when \({{\mathcal {R}}}\) is linear.

Lemma 3.3

Let \(\alpha _1,\alpha _2 \ge 0\) and \(\alpha _3 > 0\), with \(\alpha _1 + \alpha _2 > 0\). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\), let \(u^{\varvec{y}}\in {{\mathcal {X}}}\) be the solution of (2.2) and then let \(q^{\varvec{y}}\in {{\mathcal {Y}}}\) be the solution of (2.16) with \(f_{\textrm{dual}}\) given by (3.6). Then for \({{\mathcal {R}}}\) linear, the gradient of (3.2) is given as an element of \(L^2(V;I)\) by

$$\begin{aligned} J'(z) = {{\mathcal {R}}}(q_1^{\varvec{y}}) + \alpha _3 R_V^{-1}z \end{aligned}$$
(3.7)

for \(z \in L^2(V';I)\).

Proof

This follows immediately from (3.6), Lemma 3.2 and Lemma 2.5. \(\square \)

Proposition 3.4

Under the conditions of Lemma 3.3, with \(f_{\textrm{dual}}\) given by (3.6), the dual solution \(q^{\varvec{y}}= (q_1^{\varvec{y}},q_2^{\varvec{y}}) \in {{\mathcal {Y}}}\) satisfies

$$\begin{aligned} q_2^{\varvec{y}}= q_1^{\varvec{y}}(\cdot ,0). \end{aligned}$$

Consequently, the left-hand side of (2.16) reduces to

$$\begin{aligned} \int _I \big \langle w,-\tfrac{\partial }{\partial t}q_1^{\varvec{y}}\big \rangle _{V,V'}\, \textrm{d}t + \int _I \int _D \big (a^{\varvec{y}}\nabla w \cdot \nabla q_1^{\varvec{y}}\big )\,\textrm{d}{\varvec{x}}\,\textrm{d}t + \int _D w(\cdot ,T)\,q_1^{\varvec{y}}(\cdot ,T)\,\textrm{d}{\varvec{x}}\,, \end{aligned}$$
(3.8)

and hence \(q_1^{\varvec{y}}\) is the solution to

$$\begin{aligned} {\left\{ \begin{array}{ll} -\frac{\partial }{\partial t} q_1^{{\varvec{y}}}({\varvec{x}},t)-\nabla \cdot \big (a^{{\varvec{y}}}({\varvec{x}},t)\nabla q_1^{{\varvec{y}}}({\varvec{x}},t)\big ) = \alpha _1 R_V\big (u^{\varvec{y}}({\varvec{x}},t)-{\widehat{u}}({\varvec{x}},t)\big ) \\ q_1^{{\varvec{y}}}({\varvec{x}},t) = 0 \\ q_1^{{\varvec{y}}}({\varvec{x}},T) = \alpha _2 \big (u^{\varvec{y}}({\varvec{x}},T)-{\widehat{u}}({\varvec{x}},T)\big ), \end{array}\right. } \end{aligned}$$
(3.9)

where the first equation holds for \({\varvec{x}}\in D\), \(t\in I\), the second equation holds for \({\varvec{x}}\in \partial D\), \(t\in I\), and the last equation holds for \({\varvec{x}}\in D\).

Proof

Since (2.16) holds for arbitrary \(w\in {{\mathcal {X}}}\), it holds in particular for the special case

$$\begin{aligned} w = w_n({\varvec{x}},t) := {\left\{ \begin{array}{ll} \big (1-\tfrac{nt}{T}\big )\,v({\varvec{x}}) &{} \text {for } t\in \left[ 0,\tfrac{T}{n} \right] \,,\\ 0 &{} \text {for } t\in \left( \tfrac{T}{n},T \right] \,, \end{array}\right. } \end{aligned}$$

with arbitrary \(v \in V\). For \(f_{\textrm{dual}}\) given by (3.6), the right-hand side of (2.16) becomes

$$\begin{aligned}&\langle w_n, f_{\textrm{dual}} \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} \\&\quad = \big \langle w_n, \alpha _1 R_V (u^{\varvec{y}}-{\widehat{u}}) \big \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} + \big \langle w_n(\cdot ,T), \alpha _2\big (u^{\varvec{y}}(\cdot ,T)-{\widehat{u}}(\cdot ,T)\big ) \big \rangle _{L^2(D)} \nonumber \\&\quad = \int _0^{\frac{T}{n}} \!\int _D \big (1 - \tfrac{nt}{T}\big )\,v\, \alpha _1 R_V(u^{\varvec{y}}-{\widehat{u}}) \,\textrm{d}{\varvec{x}}\, \textrm{d}t \,\,\, \rightarrow 0 \,\,\,\text{ as }\,\,\, n\rightarrow \infty \,. \end{aligned}$$

From (2.17) the left-hand side of (2.16) is now

$$\begin{aligned}&\langle w_n, (B^{\varvec{y}})' q^{\varvec{y}}\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} \\&\quad = \int _0^{\frac{T}{n}} \! \big (1 - \tfrac{nt}{T}\big ) \big \langle v,-\tfrac{\partial }{\partial t}q_1^{\varvec{y}}\big \rangle _{V,V'}\, \textrm{d}t + \int _0^{\frac{T}{n}} \! \int _D \big (1 - \tfrac{nt}{T}\big ) \big (a^{\varvec{y}}\nabla v \cdot \nabla q_1^{\varvec{y}}\big )\,\textrm{d}{\varvec{x}}\,\textrm{d}t \\&\qquad - \int _D v\,q_1^{\varvec{y}}(\cdot ,0)\,\textrm{d}{\varvec{x}}+ \int _D v\,q_2^{\varvec{y}}\,\textrm{d}{\varvec{x}}\\&\quad \rightarrow \int _D v\,\big (q_2^{\varvec{y}}- q_1^{\varvec{y}}(\cdot ,0)\big )\,\textrm{d}{\varvec{x}}\quad \text{ as }\quad n\rightarrow \infty \,. \end{aligned}$$

Equating the two sides, letting \(n\rightarrow \infty \), and noting that \(v\in V\) is arbitrary, we conclude that necessarily \(q_2^{\varvec{y}}= q_1^{\varvec{y}}(\cdot ,0)\).

Hence, the left-hand side of (2.16) reduces to (3.8). By analogy with the weak form of (2.2), using the transformation \(t \mapsto T-t\), we conclude that \(q_1^{\varvec{y}}\) is the solution to (3.9). \(\square \)

3.2 The entropic risk measure

The expected value is risk-neutral. Next, we consider risk-averse risk measures such as the entropic risk measure

$$\begin{aligned} {{\mathcal {R}}}_{\textrm{e}}(Y({\varvec{y}})) := \frac{1}{\theta } \ln \left( \int _U \exp \big (\theta \,Y({\varvec{y}})\big ) \,\mathrm d{\varvec{y}}\right) \,, \end{aligned}$$

for an essentially bounded random variable \(Y({\varvec{y}})\) and some \(\theta \in (0,\infty )\). Using \({{\mathcal {R}}} = {{\mathcal {R}}}_{\textrm{e}}\) in (3.2), the optimal control problem becomes \(\min _{z \in {\mathcal {Z}}} J(z)\), with

$$\begin{aligned} J(z) = \frac{1}{\theta } \ln \left( \int _U \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big ) \,\mathrm d{\varvec{y}}\right) + \frac{\alpha _3}{2} \Vert z\Vert _{L^2(V';I)}^2\,, \end{aligned}$$
(3.10)

for some \(\theta \in (0,\infty )\) and \(\Phi ^{\varvec{y}}\) defined in (3.3).
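When the integral over U in (3.10) is replaced by an n-point QMC (or Monte Carlo) rule, the exponential can overflow for large \(\theta \,\Phi ^{\varvec{y}}(z)\); the standard log-sum-exp shift avoids this. Below is a minimal sketch, assuming the values phi[i] \(= \Phi ^{{\varvec{y}}_i}(z)\) at the cubature points and the regularization term reg \(= \frac{\alpha _3}{2}\Vert z\Vert ^2\) have already been computed (both names are illustrative placeholders):

```python
import math

def entropic_objective(phi, theta, reg):
    # J(z) ~ (1/theta) * ln( (1/n) * sum_i exp(theta * phi_i) ) + reg,
    # evaluated stably via the shift m = max_i theta * phi_i.
    n = len(phi)
    m = max(theta * p for p in phi)
    lse = m + math.log(sum(math.exp(theta * p - m) for p in phi) / n)
    return lse / theta + reg

# A naive evaluation of exp(theta * 1000) would overflow; the shifted
# version returns a finite value close to the largest phi value.
print(entropic_objective([1000.0, 0.0], theta=1.0, reg=0.0))
```

Note that when all phi values coincide, the risk term reduces to that common value, consistent with translation equivariance of \({\mathcal {R}}_{\textrm{e}}\).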

In the following we want to compute the Fréchet derivative of J(z) with respect to \(z \in L^2(V';I)\). To this end, we verify that \(\Phi ^{\varvec{y}}(z)\) is uniformly bounded in \({\varvec{y}}\in U\) for any \(z\in L^2(V';I)\), i.e., \(\Phi ^{\varvec{y}}(z) \le C < \infty \) with a constant \(C>0\) independent of \({\varvec{y}}\in U\).

Lemma 3.5

Let \(f = (z,u_0)\in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\), and let \(\alpha _1,\alpha _2 \ge 0\) with \(\alpha _1+\alpha _2 >0\). Then for all \({\varvec{y}}\in U\), the function \(\Phi ^{\varvec{y}}\) defined by (3.3) satisfies

$$\begin{aligned} 0\le \Phi ^{\varvec{y}}\le \frac{\alpha _1+\alpha _2\,\Vert E_T\Vert _{{{\mathcal {X}}}\rightarrow L^2(D)}^2}{2}\, \Big (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1}+\Vert {\widehat{u}}\Vert _{{{\mathcal {X}}}}\Big )^2 < \infty . \end{aligned}$$
(3.11)

Thus for all \(\theta >0\) we have

$$\begin{aligned}&1 \le \exp \big (\theta \,\Phi ^{\varvec{y}}\big ) \le e^\sigma <\infty , \quad \text{ with }\quad \end{aligned}$$
(3.12)
$$\begin{aligned}&\sigma := \frac{\alpha _1+\alpha _2\,\Vert E_T\Vert _{{{\mathcal {X}}}\rightarrow L^2(D)}^2}{2}\, \Big (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1}+\Vert {\widehat{u}}\Vert _{{{\mathcal {X}}}}\Big )^2 \theta . \end{aligned}$$
(3.13)

Proof

We have from (3.3) that

$$\begin{aligned} \Phi ^{\varvec{y}}(z)&\le \frac{\alpha _1}{2} \big \Vert (B^{\varvec{y}})^{-1}f- {\widehat{u}}\big \Vert ^2_{{{\mathcal {X}}}} + \frac{\alpha _2}{2} \Vert E_T\Vert _{{{\mathcal {X}}}\rightarrow L^2(D)}^2 \big \Vert (B^{\varvec{y}})^{-1}f-{\widehat{u}}\big \Vert ^2_{{{\mathcal {X}}}} \nonumber \\&\le \frac{\alpha _1+\alpha _2\,\Vert E_T\Vert _{{{\mathcal {X}}}\rightarrow L^2(D)}^2}{2}\, \big (\big \Vert (B^{\varvec{y}})^{-1}f\big \Vert _{{\mathcal {X}}} + \Vert {\widehat{u}}\Vert _{{\mathcal {X}}}\big )^2\,, \end{aligned}$$

which yields (3.11) after applying (2.9). \(\square \)

Using the preceding lemma, we compute the gradient of (3.10).

Lemma 3.6

Let \(\alpha _1,\alpha _2 \ge 0\) and \(\alpha _3 > 0\), with \(\alpha _1 + \alpha _2 > 0\), and let \(0<\theta <\infty \). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\), let \(u^{\varvec{y}}\in {{\mathcal {X}}}\) be the solution of (2.2) and then let \(q^{\varvec{y}}= (q_1^{\varvec{y}},q_2^{\varvec{y}}) \in {{\mathcal {Y}}}\) be the solution of (2.16) with \(f_{\textrm{dual}}\) given by (3.6). Then the gradient of (3.10) is given as an element of \(L^2(V;I)\) for \(z \in L^2(V';I)\) by

$$\begin{aligned} J'(z) = \frac{1}{\int _U \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big ) \,\mathrm d{\varvec{y}}} \int _U \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big )\, q_1^{\varvec{y}}\,\mathrm d{\varvec{y}}+ \alpha _3 R_V^{-1}z \end{aligned}$$
(3.14)

where \(\Phi ^{\varvec{y}}(z)\) is defined in (3.3).

Proof

The application of the chain rule gives

$$\begin{aligned} \partial _z {\mathcal {R}}_{\textrm{e}}(\Phi ^{\varvec{y}}(z)) = \frac{1}{\theta \int _U \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big )\, \mathrm d{\varvec{y}}} \partial _z \Big (\int _U \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big )\,\mathrm d{\varvec{y}}\Big )\,. \end{aligned}$$

Lemma 3.5 implies that \(1 \le \int _U \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big )\,\textrm{d}{\varvec{y}}<\infty \). Moreover, the integral is a bounded linear operator, hence its Fréchet derivative is the operator itself and differentiation and integration may be interchanged: \(\partial _z \left( \int _U \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big )\,\mathrm d{\varvec{y}}\right) = \int _U \left( \partial _z \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big )\right) \mathrm d{\varvec{y}}\). By the chain rule it follows for each \({\varvec{y}}\in U\) that \( \partial _z \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big ) = \theta \, \exp \big (\theta \,\Phi ^{\varvec{y}}(z)\big )\, \partial _z \Phi ^{\varvec{y}}(z)\,. \) Recalling from the previous subsection that \(\partial _z (\frac{\alpha _3}{2}\Vert z\Vert _{L^2(V';I)}^2) = \alpha _3 R_V^{-1} z\) and \( \partial _z \Phi ^{\varvec{y}}(z) = \big ((B_1^{\varvec{y}})^{\dagger }\big )'(\alpha _1 R_V +\alpha _2 E_T'E_T)(u^{\varvec{y}}(z)-{\widehat{u}}) = q_1^{\varvec{y}}\,, \) and collecting terms gives (3.14). \(\square \)
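In a discretized setting, (3.14) says the risk part of the gradient is a convex combination of the adjoint states: the factors \(\exp (\theta \,\Phi ^{\varvec{y}}(z))\), normalized by their integral, act as softmax weights over the cubature points, and as \(\theta \rightarrow 0\) the weights become uniform, recovering the expected-value gradient (3.7). A sketch under simplifying assumptions (finite-dimensional lists standing in for \(q_1^{\varvec{y}}\) and \(R_V^{-1}z\); all names are hypothetical):

```python
import math

def entropic_gradient(phi, q1, z, alpha3, theta):
    """Discrete analogue of (3.14): softmax-weighted average of the
    adjoint states q1[i], plus the regularization gradient alpha3 * z."""
    m = max(theta * p for p in phi)            # shift for numerical stability
    e = [math.exp(theta * p - m) for p in phi]
    s = sum(e)
    w = [ei / s for ei in e]                   # normalized weights, sum to 1
    return [sum(w[i] * q1[i][k] for i in range(len(phi))) + alpha3 * z[k]
            for k in range(len(z))]

phi = [0.0, 1.0]
q1 = [[1.0, 0.0], [0.0, 1.0]]
g = entropic_gradient(phi, q1, [0.0, 0.0], alpha3=0.0, theta=1e-12)
# For theta ~ 0 the weights are essentially uniform, so g ~ [0.5, 0.5].
```

For large \(\theta \) the weights concentrate on the points where \(\Phi ^{\varvec{y}}(z)\) is largest, reflecting the risk-averse character of the entropic risk measure.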

3.3 Optimality conditions

In the case when the feasible set of controls \({\mathcal {Z}}\) is a nonempty and convex set, we know (see, e.g., [46, Lemma 2.21]) that the optimal control \(z^*\) satisfies the variational inequality

$$\begin{aligned} \langle J'(z^*), z - z^* \rangle _{L^2(V;I),L^2(V';I)} \ge 0 \quad \forall z \in {\mathcal {Z}}\,. \end{aligned}$$
(3.15)

For convex objective functionals J(z), like the ones considered in this work, the variational inequality is a necessary and sufficient condition for optimality. The complete optimality conditions are then given by the following result.

Theorem 3.7

Let \({\mathcal {R}}\) be the expected value or the entropic risk measure. A control \(z^* \in L^2(V';I)\) is the unique minimizer of (2.3) subject to (2.2) and (3.1) if and only if it satisfies the optimality system:

$$\begin{aligned} {\left\{ \begin{array}{ll} \langle B^{\varvec{y}}u^{\varvec{y}}, (v_1,v_2) \rangle _{{{\mathcal {Y}}}',{{\mathcal {Y}}}} = \langle z^*, v_1 \rangle _{L^2(V';I),L^2(V;I)} + \langle u_0, v_2 \rangle _{L^2(D)} \,\,\,\forall \,v\in {{\mathcal {Y}}}, \\ \langle w, (B^{\varvec{y}})' q^{\varvec{y}}\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} = \langle w, \alpha _1 R_V(u^{\varvec{y}}- {\widehat{u}}) \rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} \\ \qquad \qquad \qquad \qquad \qquad + \langle w(T), \alpha _2(u^{\varvec{y}}(T)-{\widehat{u}}(T)) \rangle _{L^2(D)} \,\,\, \forall \, w \in {{\mathcal {X}}}, \\ z^* \in {\mathcal {Z}}\,, \\ \langle J'(z^*), z - z^* \rangle _{L^2(V;I),L^2(V';I)} \ge 0 \quad \forall z \in {\mathcal {Z}}\,, \end{array}\right. } \end{aligned}$$

which holds for all \({\varvec{y}}\in U\), and \(J'(z)\) is given by (3.7) for the expected value, or (3.14) for the entropic risk measure.

Observe that the optimality system in Theorem 3.7 contains the variational formulations of the state PDE (2.6) and the dual PDE (2.16) in the first and second equation, respectively.

It is convenient to reformulate the variational inequality (3.15) in terms of an orthogonal projection onto \(\mathcal Z\). The orthogonal projection onto a nonempty, closed and convex subset \({\mathcal {Z}} \subset H\) of a Hilbert space H, denoted by \(P_{{\mathcal {Z}}}: H \rightarrow {\mathcal {Z}}\), is defined as

$$\begin{aligned} P_{{\mathcal {Z}}}(h) \in {\mathcal {Z}}\,, \quad \Vert P_{{\mathcal {Z}}}(h) - h\Vert _H = \min _{v \in {\mathcal {Z}}} \Vert v - h\Vert _H\,, \quad \forall h \in H\,. \end{aligned}$$

Then, see, e.g., [27, Lemma 1.11], for all \(h \in H\) and \(\gamma >0\), the condition \(z \in {\mathcal {Z}}\), \(\langle h, v - z\rangle _{H} \ge 0\ \forall v \in {\mathcal {Z}}\), is equivalent to \(z - P_{{\mathcal {Z}}}(z - \gamma h) = 0\). Using the definition of the Riesz operator and \(H = L^2(V';I)\), we conclude that (3.15) is equivalent to

$$\begin{aligned} z^* - P_{{\mathcal {Z}}}(z^* - \gamma R_V J'(z^*)) = 0\,. \end{aligned}$$

This equivalence can then be used to develop projected descent methods to solve the optimal control problem, see, e.g., [27, Chapter 2.2.2].

Remark 3.8

If \({\mathcal {Z}}\) is the closed ball with radius \(r > 0\) in a Hilbert space H, then the orthogonal projection \(P_{{\mathcal {Z}}}\) is given by

$$\begin{aligned} P_{{\mathcal {Z}}}(h) = \min {\Big (1,\frac{r}{\Vert h\Vert _H}\Big )}\,h \qquad \text{ for } \text{ all } h \in H. \end{aligned}$$
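Combining the fixed-point characterization above with the ball projection of Remark 3.8 yields the basic projected gradient iteration. The following toy sketch works with a hypothetical quadratic objective on \({\mathbb {R}}^2\), \({\mathcal {Z}}\) the Euclidean unit ball, and \(R_V\) identified with the identity; it is a finite-dimensional illustration, not the paper's infinite-dimensional algorithm:

```python
import math

def project_ball(h, r):
    # Orthogonal projection onto the closed ball of radius r (Remark 3.8):
    # P_Z(h) = min(1, r / ||h||) * h.
    nrm = math.sqrt(sum(x * x for x in h))
    scale = min(1.0, r / nrm) if nrm > 0.0 else 1.0
    return [scale * x for x in h]

def projected_gradient(grad, z, r, gamma=0.1, tol=1e-10, max_iter=10000):
    # Iterate z <- P_Z(z - gamma * grad(z)); at a fixed point the
    # variational inequality (3.15) is satisfied.
    for _ in range(max_iter):
        z_new = project_ball([zi - gamma * gi for zi, gi in zip(z, grad(z))], r)
        if sum((a - b) ** 2 for a, b in zip(z_new, z)) < tol ** 2:
            return z_new
        z = z_new
    return z

# Toy objective J(z) = 0.5 * ||z - c||^2 whose unconstrained minimizer c lies
# outside the ball; the constrained minimizer is c scaled back to the boundary.
c = [3.0, 4.0]                                  # ||c|| = 5
z_star = projected_gradient(lambda z: [zi - ci for zi, ci in zip(z, c)],
                            [0.0, 0.0], r=1.0)
# z_star ~ c / ||c|| = [0.6, 0.8]
```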

4 Parametric regularity of the adjoint state

In this section we derive an a priori bound for the adjoint state and its partial derivatives with respect to the parametric variables. Existing results, e.g., [32, Theorem 4], do not directly apply to our case, since the right-hand side of the affine-linear parametric operator equation depends on the parametric variable; specifically,

$$\begin{aligned} (B^{\varvec{y}})' q^{\varvec{y}}= (\alpha _1 R_V + \alpha _2 E_T'E_T)(u^{\varvec{y}}- {\widehat{u}}). \end{aligned}$$

Lemma 4.1

Let \(\alpha _1,\alpha _2 \ge 0\) and \(\alpha _3 > 0\), with \(\alpha _1 + \alpha _2 > 0\). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\), let \(u^{\varvec{y}}\in {{\mathcal {X}}}\) be the solution of (2.2) and then let \(q^{\varvec{y}}\in {{\mathcal {Y}}}\) be the solution of (2.16) with \(f_\textrm{dual}\) given by (3.6). Then we have

$$\begin{aligned} \Vert q^{\varvec{y}}\Vert _{{{\mathcal {Y}}}} \le \frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1} \bigg (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} + \Vert {\widehat{u}}\Vert _{{\mathcal {X}}}\bigg ), \end{aligned}$$

where \(\beta _1\) is described in (2.8).

Proof

By the bounded invertibility of \(B^{\varvec{y}}\) and its dual operator, we have

$$\begin{aligned} \Vert q^{\varvec{y}}\Vert _{{\mathcal {Y}}}&\le \Vert ((B^{\varvec{y}})')^{-1}\Vert _{{{\mathcal {X}}}'\rightarrow {{\mathcal {Y}}}}\, \Vert (\alpha _1 R_V + \alpha _2 E_T'E_T)(u^{\varvec{y}}- {\widehat{u}})\Vert _{{{\mathcal {X}}}'} \end{aligned}$$

with \(\Vert ((B^{\varvec{y}})')^{-1}\Vert _{{{\mathcal {X}}}'\rightarrow {{\mathcal {Y}}}} \le 1/\beta _1\),

$$\begin{aligned} \Vert R_V(u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}'}&\le \Vert R_V(u^{\varvec{y}}-{\widehat{u}})\Vert _{L^2(V';I)} = \Vert u^{\varvec{y}}-{\widehat{u}}\Vert _{L^2(V;I)} \le \Vert u^{\varvec{y}}-{\widehat{u}}\Vert _{{{\mathcal {X}}}}, \\ \Vert E_T'E_T(u^{\varvec{y}}- {\widehat{u}})\Vert _{{{\mathcal {X}}}'}&\le \Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}\, \Vert u^{\varvec{y}}- {\widehat{u}}\Vert _{{{\mathcal {X}}}}, \\ \Vert u^{\varvec{y}}- {\widehat{u}}\Vert _{{{\mathcal {X}}}}&\le \Vert u^{\varvec{y}}\Vert _{{\mathcal {X}}} + \Vert {\widehat{u}}\Vert _{{\mathcal {X}}} \le \frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} + \Vert {\widehat{u}}\Vert _{{\mathcal {X}}}, \end{aligned}$$

where we used (2.9). Combining the estimates gives the desired result. \(\square \)

Theorem 4.2

Let \(\alpha _1,\alpha _2 \ge 0\) and \(\alpha _3 > 0\), with \(\alpha _1 + \alpha _2 > 0\). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\), let \(u^{\varvec{y}}\in {{\mathcal {X}}}\) be the solution of (2.2) and then let \(q^{\varvec{y}}\in {{\mathcal {Y}}}\) be the solution of (2.16) with \(f_{\textrm{dual}}\) given by (3.6). Then for every \({\varvec{\nu }}\in {\mathscr {F}}\) we have

$$\begin{aligned} \Vert \partial ^{\varvec{\nu }}_{\varvec{y}}q^{\varvec{y}}\Vert _{{{\mathcal {Y}}}} \le \frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1} \Big (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} + \Vert {\widehat{u}}\Vert _{{\mathcal {X}}}\Big )\, (|{\varvec{\nu }}| + 1)!\,{\varvec{b}}^{\varvec{\nu }}, \end{aligned}$$

where \(\beta _1\) is described in (2.8) and the sequence \({\varvec{b}}= (b_j)_{j\ge 1}\) is defined in (2.11).

Proof

For \({\varvec{\nu }}= {\varvec{0}}\) the assertion follows from the previous lemma. For \({\varvec{\nu }}\ne {\varvec{0}}\) we take derivatives \(\partial _{\varvec{y}}^{\varvec{\nu }}((B^{\varvec{y}})' q^{\varvec{y}}) = \partial _{\varvec{y}}^{\varvec{\nu }}((\alpha _1 R_V + \alpha _2 E_T'E_T)(u^{\varvec{y}}- {\widehat{u}}))\) and use the Leibniz product rule to get

$$\begin{aligned} \sum _{{\varvec{m}}\le {\varvec{\nu }}} \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \big (\partial _{\varvec{y}}^{{\varvec{m}}} (B^{\varvec{y}})'\big ) \big (\partial _{\varvec{y}}^{{\varvec{\nu }}-{\varvec{m}}} q^{\varvec{y}}\big ) = (\alpha _1 R_V + \alpha _2 E_T'E_T)\big (\partial _{\varvec{y}}^{\varvec{\nu }}(u^{\varvec{y}}- {\widehat{u}})\big )\,. \end{aligned}$$

Separating out the \({\varvec{m}}= {\varvec{0}}\) term, we obtain

$$\begin{aligned}&(B^{\varvec{y}})' (\partial _{\varvec{y}}^{\varvec{\nu }}q^{\varvec{y}}) \\&\quad = - \sum _{{\varvec{0}}\ne {\varvec{m}}\le {\varvec{\nu }}} \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \big (\partial _{\varvec{y}}^{{\varvec{m}}} (B^{\varvec{y}})'\big ) \big (\partial _{\varvec{y}}^{{\varvec{\nu }}-{\varvec{m}}} q^{\varvec{y}}\big ) + (\alpha _1 R_V + \alpha _2 E_T'E_T)\big (\partial _{\varvec{y}}^{\varvec{\nu }}(u^{\varvec{y}}- {\widehat{u}})\big ). \end{aligned}$$

By the bounded invertibility of \((B^{\varvec{y}})'\), we have \(\Vert ((B^{\varvec{y}})')^{-1}\Vert _{{{\mathcal {X}}}'\rightarrow {{\mathcal {Y}}}} \le \tfrac{1}{\beta _1}\) and

$$\begin{aligned} \Vert \partial _{\varvec{y}}^{\varvec{\nu }}q^{\varvec{y}}\Vert _{{{\mathcal {Y}}}}&\le \sum _{{\varvec{0}}\ne {\varvec{m}}\le {\varvec{\nu }}} \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \Vert ((B^{\varvec{y}})')^{-1} \partial _{\varvec{y}}^{{\varvec{m}}} (B^{\varvec{y}})' \Vert _{{{\mathcal {Y}}}\rightarrow {{\mathcal {Y}}}}\, \Vert \partial _{\varvec{y}}^{{\varvec{\nu }}-{\varvec{m}}} q^{\varvec{y}}\Vert _{{{\mathcal {Y}}}}\\&\qquad + \Vert ((B^{\varvec{y}})')^{-1}\Vert _{{{\mathcal {X}}}' \rightarrow {{\mathcal {Y}}}}\, \Vert (\alpha _1 R_V + \alpha _2 E_T'E_T)(\partial _{\varvec{y}}^{\varvec{\nu }}(u^{\varvec{y}}- {\widehat{u}}))\Vert _{{{\mathcal {X}}}'} \\&\le \sum _{{\varvec{0}}\ne {\varvec{m}}\le {\varvec{\nu }}} \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \frac{1}{\beta _1} \Vert \partial _{\varvec{y}}^{{\varvec{m}}} (B^{\varvec{y}})' \Vert _{{{\mathcal {Y}}}\rightarrow {{\mathcal {X}}}'}\, \Vert \partial _{\varvec{y}}^{{\varvec{\nu }}-{\varvec{m}}} q^{\varvec{y}}\Vert _{{{\mathcal {Y}}}} \\&\qquad + \frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1} \Vert \partial _{\varvec{y}}^{\varvec{\nu }}(u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}}. \end{aligned}$$

Recall that

$$\begin{aligned}&\langle v, (B^{\varvec{y}})' w\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} \\&\quad = \int _I \langle v, -\tfrac{\partial }{\partial t} w\rangle _{V,V'}\,\mathrm dt + \int _I \int _D a^{\varvec{y}}\, \nabla v \cdot \nabla w\, \mathrm dx\, \mathrm dt + \int _D E_Tw\, E_Tv\, \textrm{d}x. \end{aligned}$$

For \({\varvec{m}}\ne {\varvec{0}}\), we conclude with (2.1) that \(\langle v,\partial ^{{\varvec{m}}} (B^{\varvec{y}})' w\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'} = \int _I \int _D \psi _j\, \nabla v \cdot \nabla w \,\textrm{d}x \,\textrm{d}t\) if \({\varvec{m}}=\varvec{e}_j\), and otherwise it is zero. Hence for \({\varvec{m}}={\varvec{e}}_j\) we obtain for all \(v\in {{\mathcal {Y}}}\) that

$$\begin{aligned} \Vert \partial ^{{\varvec{m}}}(B^{\varvec{y}})' v\Vert _{{{\mathcal {X}}}'}&= \sup _{w\in {{\mathcal {X}}}} \frac{|\langle v,\partial ^{{\varvec{m}}} (B^{\varvec{y}})' w\rangle _{{{\mathcal {X}}},{{\mathcal {X}}}'}|}{\Vert w\Vert _{{\mathcal {X}}}} = \sup _{w\in {{\mathcal {X}}}} \frac{|\int _I \int _D \psi _j\, \nabla v \cdot \nabla w \,\mathrm dx\, \mathrm dt |}{\Vert w\Vert _{{\mathcal {X}}}}\\&\le b_j\,\sup _{w\in {{\mathcal {X}}}} \frac{\Vert v\Vert _{L^2(V;I)}\, \Vert w \Vert _{L^2(V;I)}}{\Vert w\Vert _{{\mathcal {X}}}} \le b_j \Vert v\Vert _{{{\mathcal {Y}}}}\,. \end{aligned}$$

Hence

$$\begin{aligned} \Vert \partial _{\varvec{y}}^{\varvec{\nu }}q^{\varvec{y}}\Vert _{{\mathcal {Y}}} \le \!\!\!\sum _{j \in \textrm{supp}({\varvec{\nu }})}\!\!\! \nu _j\, b_j\, \Vert \partial _{\varvec{y}}^{{\varvec{\nu }}-\varvec{e}_j}q^{\varvec{y}}\Vert _{{\mathcal {Y}}} + \frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1} \Vert \partial _{\varvec{y}}^{\varvec{\nu }}(u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}}. \end{aligned}$$

By Lemma 4.1 this recursion is true for \({\varvec{\nu }}= {\varvec{0}}\) and we may apply [33, Lemma 9.1] to get

$$\begin{aligned} \Vert \partial _{\varvec{y}}^{\varvec{\nu }}q^{\varvec{y}}\Vert _{{\mathcal {Y}}} \le \sum _{{\varvec{m}}\le {\varvec{\nu }}} \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \,|{\varvec{m}}|!\,{\varvec{b}}^{\varvec{m}}\Big (\frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1} \Vert \partial _{\varvec{y}}^{{\varvec{\nu }}-{\varvec{m}}} (u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}}\Big )\,. \end{aligned}$$

From (2.9) and (2.10) we have

$$\begin{aligned} \Vert \partial ^{{\varvec{\nu }}}_{\varvec{y}}(u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}} \le {\left\{ \begin{array}{ll} \frac{1}{\beta _1}\Vert f\Vert _{{{\mathcal {Y}}}'} + \Vert {\widehat{u}}\Vert _{{{\mathcal {X}}}} &{} \text {if}~{\varvec{\nu }}={\varvec{0}}, \\ \frac{1}{\beta _1}\Vert f\Vert _{{{\mathcal {Y}}}'}\,|{\varvec{\nu }}|!\,{\varvec{b}}^{{\varvec{\nu }}} &{}\text {if}~{\varvec{\nu }}\ne {\varvec{0}}. \end{array}\right. } \end{aligned}$$

We finally arrive at

$$\begin{aligned} \Vert \partial _{\varvec{y}}^{\varvec{\nu }}q^{\varvec{y}}\Vert _{{{\mathcal {Y}}}}&\le \sum _{{\mathop {\scriptstyle {{\varvec{m}}\ne {\varvec{\nu }}}}\limits ^{\scriptstyle {{\varvec{m}}\le {\varvec{\nu }}}}}} \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \,|{\varvec{m}}|!\,{\varvec{b}}^{\varvec{m}}\frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1}\, \frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} \,|{\varvec{\nu }}-{\varvec{m}}|!\,{\varvec{b}}^{{\varvec{\nu }}-{\varvec{m}}} \\&\qquad + |{\varvec{\nu }}|!\,{\varvec{b}}^{\varvec{\nu }}\,\frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1} \Big (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} + \Vert {\widehat{u}}\Vert _{{{\mathcal {X}}}}\Big ) \\&= (|{\varvec{\nu }}| +1)!\, {\varvec{b}}^{\varvec{\nu }}\frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1}\, \frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1}\\&\qquad + |{\varvec{\nu }}|!\, {\varvec{b}}^{\varvec{\nu }}\frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1} \Vert {\widehat{u}}\Vert _{{{\mathcal {X}}}} \\&\le (|{\varvec{\nu }}| +1)!\, {\varvec{b}}^{\varvec{\nu }}\frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1}\, \Big (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} + \Vert {\widehat{u}}\Vert _{{{\mathcal {X}}}}\Big ), \end{aligned}$$

where the equality follows from [33, Formula (9.4)]. \(\square \)
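The application of [33, Lemma 9.1] in the preceding proof admits a quick numerical sanity check in the univariate case, where the recursion with equality \(a_n = n\,b\,a_{n-1} + c_n\) has the explicit solution \(a_n = \sum_{m=0}^{n} \binom{n}{m}\, m!\, b^m\, c_{n-m}\). A minimal sketch in exact rational arithmetic (standard library only; the values of \(b\) and \(c_n\) are arbitrary test data):

```python
from fractions import Fraction
from math import comb, factorial

b = Fraction(1, 3)
c_seq = [Fraction(k + 1, k + 2) for k in range(10)]  # arbitrary c_0, c_1, ...

# Recursion with equality: a_n = n*b*a_{n-1} + c_n (univariate analogue,
# i.e. nu = n*e_1 with a single coefficient b).
a = [c_seq[0]]
for n in range(1, 10):
    a.append(n * b * a[n - 1] + c_seq[n])

# Explicit solution predicted by [33, Lemma 9.1]:
# a_n = sum_{m=0}^{n} C(n, m) * m! * b^m * c_{n-m}.
for n in range(10):
    pred = sum(comb(n, m) * factorial(m) * b ** m * c_seq[n - m]
               for m in range(n + 1))
    assert a[n] == pred
print("explicit solution matches the recursion")
```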

5 Regularity analysis for the entropic risk measure

Our goal is to use QMC to approximate the high-dimensional integrals appearing in the numerator and denominator of the gradient (3.14). To this end, we develop regularity bounds for the integrands.

Lemma 5.1

Let \(\theta >0\), \(\alpha _1,\alpha _2 \ge 0\), with \(\alpha _1 + \alpha _2 > 0\). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\), let \(u^{\varvec{y}}\in {{\mathcal {X}}}\) be the solution of (2.2) and let \(\Phi ^{\varvec{y}}\) be as in (3.3). Then for all \({\varvec{\nu }}\in {\mathscr {F}}\) we have

$$\begin{aligned} |\partial ^{{\varvec{\nu }}}_{\varvec{y}}\Phi ^{\varvec{y}}| \le \frac{\alpha _1+\alpha _2\,\Vert E_T\Vert _{{{\mathcal {X}}}\rightarrow L^2(D)}^2}{2}\, \bigg (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1}+\Vert {\widehat{u}}\Vert _{{{\mathcal {X}}}}\bigg )^2\, (|{\varvec{\nu }}|+1)!\,{\varvec{b}}^{{\varvec{\nu }}}, \end{aligned}$$

where the sequence \({\varvec{b}}= (b_j)_{j\ge 1}\) is defined by (2.11).

Proof

The case \({\varvec{\nu }}={\varvec{0}}\) is precisely (3.11). Consider now \({\varvec{\nu }}\ne {\varvec{0}}\). We estimate the partial derivatives of \(\Phi ^{\varvec{y}}\) by differentiating under the integral sign and using the Leibniz product rule in conjunction with the Cauchy–Schwarz inequality to obtain

$$\begin{aligned} |\partial ^{{\varvec{\nu }}}_{\varvec{y}}\Phi ^{\varvec{y}}|\le \frac{\alpha _1+\alpha _2\,\Vert E_T\Vert _{{{\mathcal {X}}}\rightarrow L^2(D)}^2}{2} \sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \Vert \partial ^{{\varvec{m}}}(u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}}\, \Vert \partial ^{{\varvec{\nu }}-{\varvec{m}}}(u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}}. \end{aligned}$$

Separating out the \({\varvec{m}}={\varvec{0}}\) and \({\varvec{m}}={\varvec{\nu }}\) terms and utilizing (2.10), we obtain

$$\begin{aligned}&\sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \, \Vert \partial ^{{\varvec{m}}}(u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}}\, \Vert \partial ^{{\varvec{\nu }}-{\varvec{m}}}(u^{\varvec{y}}-{\widehat{u}})\Vert _{{{\mathcal {X}}}}\\&\quad = 2\,\Vert u^{\varvec{y}}-{\widehat{u}}\Vert _{{{\mathcal {X}}}}\, \Vert \partial ^{{\varvec{\nu }}}u^{\varvec{y}}\Vert _{{{\mathcal {X}}}} +\sum _{\begin{array}{c} {\varvec{m}}\le {\varvec{\nu }}\\ {\varvec{0}}\ne {\varvec{m}}\ne {\varvec{\nu }} \end{array}} \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \, \Vert \partial ^{{\varvec{m}}}u^{\varvec{y}}\Vert _{{{\mathcal {X}}}}\, \Vert \partial ^{{\varvec{\nu }}-{\varvec{m}}}u^{\varvec{y}}\Vert _{{{\mathcal {X}}}}\\&\quad \le 2\,\bigg (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1}+\Vert {\widehat{u}}\Vert _{{{\mathcal {X}}}}\bigg ) \frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} |{\varvec{\nu }}|!\,{\varvec{b}}^{{\varvec{\nu }}} + \bigg (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1}\bigg )^2 {\varvec{b}}^{{\varvec{\nu }}} \!\! \sum _{\begin{array}{c} {\varvec{m}}\le {\varvec{\nu }}\\ {\varvec{0}}\ne {\varvec{m}}\ne {\varvec{\nu }} \end{array}}\!\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \,|{\varvec{m}}|!\,|{\varvec{\nu }}-{\varvec{m}}|!, \end{aligned}$$

where the sum over \({\varvec{m}}\) can be rewritten as

$$\begin{aligned} \sum _{\ell =1}^{|{\varvec{\nu }}|-1}\ell !\,(|{\varvec{\nu }}|-\ell )!\sum _{{\varvec{m}}\le {\varvec{\nu }},\,|{\varvec{m}}|=\ell } \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) = \sum _{\ell =1}^{|{\varvec{\nu }}|-1}\ell !\,(|{\varvec{\nu }}|-\ell )!\,\left( {\begin{array}{c}|{\varvec{\nu }}|\\ \ell \end{array}}\right) = |{\varvec{\nu }}|!\,(|{\varvec{\nu }}|-1), \end{aligned}$$

where we used the identity

$$\begin{aligned} \sum _{{\varvec{m}}\le {\varvec{\nu }},\,|{\varvec{m}}|=\ell } \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) = \left( {\begin{array}{c}|{\varvec{\nu }}|\\ \ell \end{array}}\right) = \frac{|{\varvec{\nu }}|!}{(|{\varvec{\nu }}|-\ell )!\,\ell !}, \end{aligned}$$
(5.1)

which is a simple consequence of the Vandermonde convolution [38, Eq. (5.1)]. Combining the estimates yields the required result. \(\square \)
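As a sanity check outside the proof, both identity (5.1) and the evaluation of the sum over \(\ell\) above can be verified in exact integer arithmetic; the following Python sketch uses only the standard library, with an arbitrary test multi-index:

```python
from itertools import product
from math import comb, factorial

def sub_multi_indices(nu):
    """All multi-indices m with m <= nu componentwise."""
    return product(*(range(n + 1) for n in nu))

def multi_binom(nu, m):
    """Componentwise product of binomials, i.e. (nu choose m)."""
    p = 1
    for n, k in zip(nu, m):
        p *= comb(n, k)
    return p

nu = (2, 1, 3)  # arbitrary test multi-index
v = sum(nu)

# Identity (5.1): sum of (nu choose m) over |m| = ell equals (|nu| choose ell).
for ell in range(v + 1):
    lhs = sum(multi_binom(nu, m) for m in sub_multi_indices(nu) if sum(m) == ell)
    assert lhs == comb(v, ell)

# The sum over 0 != m != nu above equals |nu|! * (|nu| - 1).
s = sum(factorial(ell) * factorial(v - ell) * comb(v, ell) for ell in range(1, v))
assert s == factorial(v) * (v - 1)
print("identities verified for nu =", nu)
```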

For future reference, we state a recursive form of Faà di Bruno’s formula [41] for the exponential function.

Theorem 5.2

Let \(G: U\rightarrow \mathbb {R}\). For all \({\varvec{y}}\in U\) and \({\varvec{\nu }}\in {\mathscr {F}}\setminus \{{\varvec{0}}\}\), we have

$$\begin{aligned} \partial ^{{\varvec{\nu }}}_{\varvec{y}}\exp (G({\varvec{y}})) = \exp (G({\varvec{y}})) \sum _{\lambda =1}^{|{\varvec{\nu }}|} \alpha _{{\varvec{\nu }},\lambda }({\varvec{y}}), \end{aligned}$$

where the sequence \((\alpha _{{\varvec{\nu }},\lambda }({\varvec{y}}))_{{\varvec{\nu }}\in \mathscr {F},\lambda \in {\mathbb {N}}_0}\) is defined recursively by \(\alpha _{{\varvec{\nu }}, 0}({\varvec{y}})=\delta _{{\varvec{\nu }},{\varvec{0}}}\), \(\alpha _{{\varvec{\nu }},\lambda }({\varvec{y}})=0\) for \(\lambda >|{\varvec{\nu }}|\), and otherwise

$$\begin{aligned} \alpha _{{\varvec{\nu }}+{\varvec{e}}_j,\lambda }({\varvec{y}}) = \sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \, (\partial ^{{\varvec{\nu }}-{\varvec{m}}+{\varvec{e}}_j}G)({\varvec{y}})\, \alpha _{{\varvec{m}},\lambda -1}({\varvec{y}}), \qquad j\ge 1. \end{aligned}$$

Proof

This is a special case of [41, Formulas (3.1) and (3.5)] in which f is the exponential function and \(m=1\) so that \(\lambda \) is an integer. \(\square \)
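The recursion in Theorem 5.2 can be illustrated in the univariate case: for a polynomial \(G\), the derivatives of \(\exp(G)\) at \(0\) produced by the recursion must agree with those read off from the power series of \(\exp(G)\). A minimal sketch in exact rational arithmetic (standard library only; the coefficients of \(G\) are arbitrary test values):

```python
from fractions import Fraction
from math import comb, factorial

# Polynomial G(y) = g1*y + g2*y^2 + g3*y^3, so G(0) = 0 and
# G^{(k)}(0) = k! * g_k for k <= 3, zero otherwise.
g = {1: Fraction(1, 2), 2: Fraction(-1, 3), 3: Fraction(2, 5)}
def dG(k):
    return factorial(k) * g.get(k, Fraction(0))

# Univariate instance of the recursion of Theorem 5.2 at y = 0:
# alpha_{n+1, lam} = sum_{m<=n} C(n, m) * G^{(n-m+1)}(0) * alpha_{m, lam-1}.
N = 8
alpha = {(0, 0): Fraction(1)}
for n in range(N):
    for lam in range(1, n + 2):
        alpha[(n + 1, lam)] = sum(
            comb(n, m) * dG(n - m + 1) * alpha.get((m, lam - 1), Fraction(0))
            for m in range(n + 1)
        )

# Reference: Taylor coefficients of exp(G(y)) via E' = G'E, which gives
# n*e_n = sum_{k=1}^{n} k*g_k*e_{n-k}; the n-th derivative at 0 is n! * e_n.
e = [Fraction(1)]
for n in range(1, N + 1):
    e.append(sum(k * g.get(k, Fraction(0)) * e[n - k] for k in range(1, n + 1)) / n)

for n in range(1, N + 1):
    faa = sum(alpha.get((n, lam), Fraction(0)) for lam in range(1, n + 1))
    assert faa == factorial(n) * e[n]  # exp(G(0)) = 1, so the sum is the derivative
print("Faà di Bruno recursion verified up to order", N)
```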

Lemma 5.3

Let the sequence \((\mathbb {A}_{{\varvec{\nu }},\lambda })_{{\varvec{\nu }}\in {\mathscr {F}},\,\lambda \in \mathbb {N}_0}\) satisfy \(\mathbb {A}_{{\varvec{\nu }},0}=\delta _{{\varvec{\nu }},{\varvec{0}}}\), \(\mathbb {A}_{{\varvec{\nu }},\lambda }=0\) for \(\lambda >|{\varvec{\nu }}|\), and otherwise satisfy the recursion

$$\begin{aligned} \mathbb {A}_{{\varvec{\nu }}+{\varvec{e}}_j,\lambda } \le \sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \, c\, {\varvec{\rho }}^{{\varvec{\nu }}-{\varvec{m}}+{\varvec{e}}_j}\,(|{\varvec{\nu }}|-|{\varvec{m}}|+2)!\, \mathbb {A}_{{\varvec{m}},\lambda -1}, \qquad j\ge 1, \end{aligned}$$
(5.2)

for some \(c>0\) and a nonnegative sequence \({\varvec{\rho }}\). Then for all \({\varvec{\nu }}\ne {\varvec{0}}\) and \(1\le \lambda \le |{\varvec{\nu }}|\) we have

$$\begin{aligned} \mathbb {A}_{{\varvec{\nu }},\lambda } \,\le \, c^\lambda \, {\varvec{\rho }}^{\varvec{\nu }}\sum _{k=1}^\lambda \frac{(-1)^{\lambda +k}\,(|{\varvec{\nu }}|+2k-1)!}{(2k-1)!\,(\lambda -k)!\,k!}. \end{aligned}$$
(5.3)

The result is sharp in the sense that both inequalities can be replaced by equalities.

Proof

We prove (5.3) for all \({\varvec{\nu }}\ne {\varvec{0}}\) and \(1\le \lambda \le |{\varvec{\nu }}|\) by induction on \(|{\varvec{\nu }}|\). The base case \(\mathbb {A}_{{\varvec{e}}_j,1}\) is easy to verify. Let \({\varvec{\nu }}\ne {\varvec{0}}\) and suppose that (5.3) holds for all multi-indices \({\varvec{m}}\) of order \(\le |{\varvec{\nu }}|\) and all \(1\le \lambda \le |{\varvec{m}}|\). The case \(\mathbb {A}_{{\varvec{\nu }}+{\varvec{e}}_j,1}\) is also straightforward to verify. We consider therefore \(2\le \lambda \le |{\varvec{\nu }}|+1\). Using (5.2) and the induction hypothesis, we have

$$\begin{aligned}&\mathbb {A}_{{\varvec{\nu }}+{\varvec{e}}_j,\lambda } \nonumber \\&\quad \le \sum _{{\varvec{0}}\ne {\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) c\, {\varvec{\rho }}^{{\varvec{\nu }}-{\varvec{m}}+{\varvec{e}}_j}(|{\varvec{\nu }}|-|{\varvec{m}}|+2)!\, \nonumber \\&\qquad \times \bigg (c^{\lambda -1}\, {\varvec{\rho }}^{\varvec{m}}\sum _{k=1}^{\lambda -1} \frac{(-1)^{\lambda -1+k}\,(|{\varvec{m}}|+2k-1)!}{(2k-1)!\,(\lambda -1-k)!\,k!}\bigg ) \nonumber \\&\quad = c^\lambda \, {\varvec{\rho }}^{{\varvec{\nu }}+{\varvec{e}}_j} \sum _{\ell =1}^{|{\varvec{\nu }}|} \sum _{{\mathop {\scriptstyle {|{\varvec{m}}|=\ell }}\limits ^{\scriptstyle {{\varvec{m}}\le {\varvec{\nu }}}}}} \left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \, \sum _{k=1}^{\lambda -1} \frac{(-1)^{\lambda -1+k}\, (|{\varvec{\nu }}|-\ell +2)!\,(\ell +2k-1)!}{(2k-1)!\,(\lambda -1-k)!\,k!} \nonumber \\&\quad = c^\lambda \, {\varvec{\rho }}^{{\varvec{\nu }}+{\varvec{e}}_j}\, \frac{2\,|{\varvec{\nu }}|!\,(-1)^{\lambda -1}}{(\lambda -1)!} \underbrace{\sum _{k=1}^{\lambda -1} (-1)^k\,\left( {\begin{array}{c}\lambda -1\\ k\end{array}}\right) \sum _{\ell =1}^{|{\varvec{\nu }}|} \left( {\begin{array}{c}|{\varvec{\nu }}|-\ell +2\\ |{\varvec{\nu }}|-\ell \end{array}}\right) \left( {\begin{array}{c}\ell +2k-1\\ \ell \end{array}}\right) }_{=:\,T}, \end{aligned}$$
(5.4)

where we used (5.1) and then regrouped the factors as binomial coefficients. Next we take the binomial identity [38, Eq. (5.6)]

$$\begin{aligned} \sum _{\ell =0}^{|{\varvec{\nu }}|} \left( {\begin{array}{c}|{\varvec{\nu }}|-\ell +2\\ |{\varvec{\nu }}|-\ell \end{array}}\right) \left( {\begin{array}{c}\ell +2k-1\\ \ell \end{array}}\right) = \left( {\begin{array}{c}|{\varvec{\nu }}|+2k+2\\ |{\varvec{\nu }}|\end{array}}\right) , \end{aligned}$$

separate out the \(\ell =0\) term, and use \(\sum _{k=1}^{\lambda -1} (-1)^k \left( {\begin{array}{c}\lambda -1\\ k\end{array}}\right) = \sum _{k=0}^{\lambda -1} (-1)^k \left( {\begin{array}{c}\lambda -1\\ k\end{array}}\right) -1 = -1\), to rewrite T as

$$\begin{aligned} T&= \sum _{k=1}^{\lambda -1} (-1)^k\,\left( {\begin{array}{c}\lambda -1\\ k\end{array}}\right) \bigg [ \left( {\begin{array}{c}|{\varvec{\nu }}|+2k+2\\ |{\varvec{\nu }}|\end{array}}\right) - \left( {\begin{array}{c}|{\varvec{\nu }}|+2\\ |{\varvec{\nu }}|\end{array}}\right) \bigg ] \\&= \sum _{k=1}^{\lambda -1} (-1)^k\,\left( {\begin{array}{c}\lambda -1\\ k\end{array}}\right) \left( {\begin{array}{c}|{\varvec{\nu }}|+2k+2\\ |{\varvec{\nu }}|\end{array}}\right) + \left( {\begin{array}{c}|{\varvec{\nu }}|+2\\ |{\varvec{\nu }}|\end{array}}\right) \\&= \sum _{k=0}^{\lambda -1} (-1)^k\,\left( {\begin{array}{c}\lambda -1\\ k\end{array}}\right) \left( {\begin{array}{c}|{\varvec{\nu }}|+2k+2\\ |{\varvec{\nu }}|\end{array}}\right) = \sum _{k=1}^{\lambda } (-1)^{k-1}\,\left( {\begin{array}{c}\lambda -1\\ k-1\end{array}}\right) \left( {\begin{array}{c}|{\varvec{\nu }}|+2k\\ |{\varvec{\nu }}|\end{array}}\right) . \end{aligned}$$

Substituting this back into (5.4) and simplifying the factors, we obtain

$$\begin{aligned} \mathbb {A}_{{\varvec{\nu }}+{\varvec{e}}_j,\lambda } \le c^\lambda \, {\varvec{\rho }}^{{\varvec{\nu }}+{\varvec{e}}_j} \sum _{k=1}^{\lambda } \frac{(-1)^{\lambda +k}\,(|{\varvec{\nu }}|+2k)!}{(2k-1)!\,(\lambda -k)!\,k!}, \end{aligned}$$

as required. \(\square \)
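The sharpness claim can also be checked numerically: running the univariate instance of recursion (5.2) with equality reproduces the closed form (5.3) exactly. A sketch in exact rational arithmetic (the values of \(c\) and \(\rho\) are arbitrary test data):

```python
from fractions import Fraction
from math import comb, factorial

c, rho = Fraction(3, 7), Fraction(2, 5)  # arbitrary test values

# Univariate (single-variable) instance of (5.2) taken with equality:
# A_{n+1, lam} = sum_{m=0}^{n} C(n, m) * c * rho^{n-m+1} * (n-m+2)! * A_{m, lam-1}.
N = 7
A = {(0, 0): Fraction(1)}
for n in range(N):
    for lam in range(1, n + 2):
        A[(n + 1, lam)] = sum(
            comb(n, m) * c * rho ** (n - m + 1) * factorial(n - m + 2)
            * A.get((m, lam - 1), Fraction(0))
            for m in range(n + 1)
        )

# Closed form (5.3).
def closed(n, lam):
    return c ** lam * rho ** n * sum(
        Fraction((-1) ** (lam + k) * factorial(n + 2 * k - 1),
                 factorial(2 * k - 1) * factorial(lam - k) * factorial(k))
        for k in range(1, lam + 1)
    )

for n in range(1, N + 1):
    for lam in range(1, n + 1):
        assert A[(n, lam)] == closed(n, lam)
print("closed form (5.3) matches the recursion with equality")
```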

Theorem 5.4

Let \(\theta >0\), \(\alpha _1,\alpha _2 \ge 0\), with \(\alpha _1 + \alpha _2 > 0\). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\), let \(u^{\varvec{y}}\in {{\mathcal {X}}}\) be the solution of (2.2) and let \(\Phi ^{\varvec{y}}\) be as in (3.3). Then for all \({\varvec{\nu }}\in {\mathscr {F}}\) we have

$$\begin{aligned} |\partial ^{{\varvec{\nu }}}_{\varvec{y}}\exp (\theta \,\Phi ^{\varvec{y}})| \le e^{\max (\sigma ,\,\sigma e^2+2\sigma -1)}\, |{\varvec{\nu }}|!\,(e{\varvec{b}})^{{\varvec{\nu }}}, \end{aligned}$$

where the sequence \({\varvec{b}}= (b_j)_{j\ge 1}\) is defined by (2.11) and \(\sigma \) is defined by (3.13).

Proof

For \({\varvec{\nu }}={\varvec{0}}\) we have from (3.12) that \(|\exp (\theta \,\Phi ^{\varvec{y}})| \le e^\sigma \), which satisfies the required bound. For \({\varvec{\nu }}\ne {\varvec{0}}\), from Faà di Bruno’s formula (Theorem 5.2) we have

$$\begin{aligned} |\partial ^{{\varvec{\nu }}}_{\varvec{y}}\exp (\theta \Phi ^{\varvec{y}})| \le \exp (\theta \,\Phi ^{\varvec{y}}) \sum _{\lambda =1}^{|{\varvec{\nu }}|} |\alpha _{{\varvec{\nu }},\lambda }({\varvec{y}})|, \end{aligned}$$
(5.5)

with \(\alpha _{{\varvec{\nu }}, 0}({\varvec{y}})=\delta _{{\varvec{\nu }},{\varvec{0}}}\), \(\alpha _{{\varvec{\nu }},\lambda }({\varvec{y}})=0\) for \(\lambda >|{\varvec{\nu }}|\), and

$$\begin{aligned} |\alpha _{{\varvec{\nu }}+{\varvec{e}}_j,\lambda }({\varvec{y}})|&\le \sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \, \theta \,|\partial ^{{\varvec{m}}+{\varvec{e}}_j}_{\varvec{y}}\Phi ^{\varvec{y}}|\, |\alpha _{{\varvec{\nu }}-{\varvec{m}},\lambda -1}({\varvec{y}})| \\&\le \sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \, \sigma \,(|{\varvec{m}}|+2)!\,{\varvec{b}}^{{\varvec{m}}+{\varvec{e}}_j}\, |\alpha _{{\varvec{\nu }}-{\varvec{m}},\lambda -1}({\varvec{y}})|, \end{aligned}$$

where we used Lemma 5.1. Applying Lemma 5.3 we conclude that

$$\begin{aligned} |\alpha _{{\varvec{\nu }},\lambda }({\varvec{y}})| \le \sigma ^\lambda \,{\varvec{b}}^{\varvec{\nu }}\sum _{k=1}^\lambda \frac{(-1)^{\lambda +k}\,(|{\varvec{\nu }}|+2k-1)!}{(2k-1)!\,(\lambda -k)!\,k!}. \end{aligned}$$
(5.6)

We have

$$\begin{aligned}&\sum _{\lambda =1}^{|{\varvec{\nu }}|} \sigma ^\lambda \sum _{k=1}^\lambda \frac{(-1)^{\lambda +k}\,(|{\varvec{\nu }}|+2k-1)!}{(2k-1)!\,(\lambda -k)!\,k!} = \sum _{k=1}^{|{\varvec{\nu }}|} \frac{(|{\varvec{\nu }}|+2k-1)!}{(2k-1)!\,k!} \sum _{\lambda =k}^{|{\varvec{\nu }}|} \frac{(-1)^{\lambda +k}\,\sigma ^\lambda }{(\lambda -k)!} \nonumber \\&\quad = |{\varvec{\nu }}|!\sum _{k=1}^{|{\varvec{\nu }}|} \frac{\sigma ^k}{k!} \left( {\begin{array}{c}|{\varvec{\nu }}|+2k-1\\ 2k-1\end{array}}\right) \sum _{\ell =0}^{|{\varvec{\nu }}|-k} \frac{(-\sigma )^\ell }{\ell !} \le |{\varvec{\nu }}|! \sum _{k=1}^{|{\varvec{\nu }}|} \frac{\sigma ^k}{k!} e^{|{\varvec{\nu }}|+2k-1} e^\sigma \nonumber \\&\quad \le |{\varvec{\nu }}|!\, e^{|{\varvec{\nu }}|+\sigma e^2+\sigma -1}, \end{aligned}$$
(5.7)

where we used \(\left( {\begin{array}{c}n\\ m\end{array}}\right) \le n^m/m! \le e^n\). Combining (5.5), (5.6), (5.7) and (3.11) gives

$$\begin{aligned} |\partial ^{{\varvec{\nu }}}_{\varvec{y}}\exp (\theta \Phi ^{\varvec{y}})| \le \exp (\sigma )\,{\varvec{b}}^{\varvec{\nu }}\,|{\varvec{\nu }}|!\, e^{|{\varvec{\nu }}|+\sigma e^2+\sigma -1} = e^{\sigma e^2+2\sigma -1}\, |{\varvec{\nu }}|!\,(e{\varvec{b}})^{\varvec{\nu }}, \end{aligned}$$

as required. \(\square \)
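The interchange of summation giving the first equality in (5.7) can be verified exactly for small orders; a standard-library Python sketch (the value of \(\sigma\) is an arbitrary rational test value, and \(v\) plays the role of \(|{\varvec{\nu }}|\)):

```python
from fractions import Fraction
from math import comb, factorial

sigma = Fraction(3, 4)  # arbitrary rational test value

for v in range(1, 9):
    # Left-hand side: sum over lambda of sigma^lambda times the inner k-sum.
    lhs = sum(sigma ** lam * sum(
        Fraction((-1) ** (lam + k) * factorial(v + 2 * k - 1),
                 factorial(2 * k - 1) * factorial(lam - k) * factorial(k))
        for k in range(1, lam + 1)) for lam in range(1, v + 1))
    # Right-hand side: summation order exchanged, as in the first equality of (5.7).
    rhs = factorial(v) * sum(
        sigma ** k / factorial(k) * comb(v + 2 * k - 1, 2 * k - 1)
        * sum((-sigma) ** l / factorial(l) for l in range(v - k + 1))
        for k in range(1, v + 1))
    assert lhs == rhs
print("interchange of summation in (5.7) verified")
```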

Remark 5.5

In the proof of Theorem 5.4, a different manipulation of (5.7) yields the bound \(2\sigma \, e^{|{\varvec{\nu }}|+\sigma e^2+\sigma +1}(|{\varvec{\nu }}|-1)!\) for \({\varvec{\nu }}\ne {\varvec{0}}\), which for large \(|{\varvec{\nu }}|\) leads to a tighter upper bound at the expense of a bigger constant,

$$\begin{aligned} |\partial ^{{\varvec{\nu }}}_{\varvec{y}}\exp (\theta \,\Phi ^{\varvec{y}})| \le 2\sigma \,\textrm{e}^{\sigma e^2+2\sigma +1}\,(|{\varvec{\nu }}|-1)!\,(e{\varvec{b}})^{{\varvec{\nu }}}. \end{aligned}$$

This then leads to a more complicated bound for Theorem 5.6 below. We have chosen to present the current form of Theorem 5.4 to simplify our subsequent analysis.

Interestingly, the sum in (5.6) can also be rewritten as a sum with only positive terms: denoting \(v = |{\varvec{\nu }}|\),

$$\begin{aligned} \sum _{k=1}^\lambda \frac{(-1)^{\lambda +k}(v+2k-1)!}{(2k-1)!(\lambda -k)!k!}&= \frac{v!}{\lambda !}\sum _{k=0}^\lambda \left( {\begin{array}{c}\lambda \\ k\end{array}}\right) \left( {\begin{array}{c}v-1\\ v-\lambda -k\end{array}}\right) 2^{\lambda -k}\\&=\frac{v!}{\lambda !}\,2^\lambda \left( {\begin{array}{c}v-1\\ v-\lambda \end{array}}\right) \sum _{k=0}^{\lambda } \frac{\left( {\begin{array}{c}\lambda \\ k\end{array}}\right) \left( {\begin{array}{c}v-\lambda \\ k\end{array}}\right) }{\left( {\begin{array}{c}\lambda +k-1\\ k\end{array}}\right) }2^{-k}, \end{aligned}$$

which coincides with the sequence in [3, Proposition 7] and with the sequence A181289 in the OEIS (written in a slightly different form). However, we were unable to find a closed form expression for the sum; neither [21] nor [38] sheds any light. The hope is to obtain an alternative bound for (5.7) that does not involve the factor \(e^{|{\varvec{\nu }}|}\). This is open for future research.
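The equality of the alternating and positive-term forms can be confirmed in exact arithmetic (note that the prefactor \(v!/\lambda!\) carries through to the final expression); a short Python check using only the standard library:

```python
from fractions import Fraction
from math import comb, factorial

def C(n, k):
    """Binomial coefficient, zero outside the range 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def alt_sum(v, lam):
    """The alternating sum from (5.6), without the c^lambda rho^nu factor."""
    return sum(Fraction((-1) ** (lam + k) * factorial(v + 2 * k - 1),
                        factorial(2 * k - 1) * factorial(lam - k) * factorial(k))
               for k in range(1, lam + 1))

def pos_sum1(v, lam):
    """First positive rewriting: (v!/lam!) sum_k C(lam,k) C(v-1,v-lam-k) 2^(lam-k)."""
    return Fraction(factorial(v), factorial(lam)) * sum(
        C(lam, k) * C(v - 1, v - lam - k) * 2 ** (lam - k) for k in range(lam + 1))

def pos_sum2(v, lam):
    """Second rewriting; the prefactor v!/lam! carries through here as well."""
    return (Fraction(factorial(v), factorial(lam)) * 2 ** lam * C(v - 1, v - lam)
            * sum(Fraction(C(lam, k) * C(v - lam, k), C(lam + k - 1, k)) / 2 ** k
                  for k in range(lam + 1)))

for v in range(1, 9):
    for lam in range(1, v + 1):
        assert alt_sum(v, lam) == pos_sum1(v, lam) == pos_sum2(v, lam)
print("all three expressions agree for v <= 8")
```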

As an alternative to the bootstrapping method presented here, holomorphy arguments can be used to derive similar regularity bounds, see, e.g., [8].

Theorem 5.6

Let \(\theta >0\), \(\alpha _1,\alpha _2 \ge 0\), with \(\alpha _1 + \alpha _2 > 0\). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\), let \(u^{\varvec{y}}\in {{\mathcal {X}}}\) be the solution of (2.2) and \(\Phi ^{\varvec{y}}\) be as in (3.3), and then let \(q^{\varvec{y}}= (q_1^{\varvec{y}},q_2^{\varvec{y}}) \in {{\mathcal {Y}}}\) be the solution of (2.16) with \(f_{\textrm{dual}}\) given by (3.6). Then for all \({\varvec{\nu }}\in {\mathscr {F}}\) we have

$$\begin{aligned} \big \Vert \partial ^{\varvec{\nu }}_{\varvec{y}}\big (\!\exp (\theta \,\Phi ^{\varvec{y}})\,q_1^{\varvec{y}}\big )\big \Vert _{L^2(V;I)} \le \big \Vert \partial ^{\varvec{\nu }}_{\varvec{y}}\big (\!\exp (\theta \,\Phi ^{\varvec{y}})\,q^{\varvec{y}}\big )\big \Vert _{{{\mathcal {Y}}}} \le \frac{\mu }{2}\, (|{\varvec{\nu }}|+2)!\, (e{\varvec{b}})^{\varvec{\nu }}, \end{aligned}$$

where the sequence \({\varvec{b}}= (b_j)_{j\ge 1}\) is defined by (2.11), \(\sigma \) is defined by (3.13) and

$$\begin{aligned} \mu := e^{\max (\sigma ,\,\sigma e^2+2\sigma -1)}\, \Big (\frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1}\Big ) \Big (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} + \Vert {\widehat{u}}\Vert _{{\mathcal {X}}}\Big ). \end{aligned}$$

Proof

Using the Leibniz product rule together with Theorems 5.4 and 4.2, we obtain

$$\begin{aligned}&\big \Vert \partial ^{\varvec{\nu }}_{\varvec{y}}\big (\!\exp (\theta \,\Phi ^{\varvec{y}})\,q^{\varvec{y}}\big )\big \Vert _{{{\mathcal {Y}}}} \le \sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \big |\partial ^{\varvec{m}}_{\varvec{y}}\exp (\theta \Phi ^{\varvec{y}})\big |\, \big \Vert \partial ^{{\varvec{\nu }}-{\varvec{m}}}_{\varvec{y}}q^{\varvec{y}}\big \Vert _{{{\mathcal {Y}}}} \\&\quad \le \sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) \, e^{\max (\sigma ,\sigma \,e^2+2\sigma -1)}\,|{\varvec{m}}|!\,(e{\varvec{b}})^{\varvec{m}}\\&\qquad \times \Big (\frac{\alpha _1 + \alpha _2\,\Vert E_T\Vert ^{2}_{{{\mathcal {X}}}\rightarrow L^2(D)}}{\beta _1}\Big ) \Big (\frac{\Vert f\Vert _{{{\mathcal {Y}}}'}}{\beta _1} + \Vert {\widehat{u}}\Vert _{{\mathcal {X}}}\Big )\, {\varvec{b}}^{{\varvec{\nu }}-{\varvec{m}}}\,(|{\varvec{\nu }}| -|{\varvec{m}}| + 1)! \\&\quad \le \mu \, (e{\varvec{b}})^{\varvec{\nu }}\sum _{{\varvec{m}}\le {\varvec{\nu }}}\left( {\begin{array}{c}{\varvec{\nu }}\\ {\varvec{m}}\end{array}}\right) |{\varvec{m}}|!\,(|{\varvec{\nu }}| -|{\varvec{m}}| + 1)! \;=\, \mu \, (e{\varvec{b}})^{\varvec{\nu }}\frac{(|{\varvec{\nu }}|+2)!}{2}, \end{aligned}$$

with the last equality due to [33, Formula (9.5)]. \(\square \)

6 Error analysis

Let \(z^*\) denote the solution of (3.4) and let \(z_{s,n}^*\) be the minimizer of

$$\begin{aligned} J_{s,n}(z):= {\mathcal {R}}_{s,n}(\Phi ^{\varvec{y}}_{s}(z)) + \frac{\alpha _3}{2} \Vert z\Vert ^2_{L^2(V';I)}, \end{aligned}$$

where \(\Phi _s^{{\varvec{y}}}(z) = \Phi ^{(y_1,y_2,\ldots ,y_s,0,0,\ldots )}(z)\) is the truncated version of \(\Phi ^{\varvec{y}}(z)\) defined in (3.3), and \({\mathcal {R}}_{s,n}\) is an approximation of the risk measure \({\mathcal {R}}\), for which the integrals over the parameter domain \(U= [-\frac{1}{2},\frac{1}{2}]^{{\mathbb {N}}}\) are replaced by s-dimensional integrals over \(U_s= [-\frac{1}{2},\frac{1}{2}]^{s}\) and then approximated by an n-point randomly-shifted QMC rule:

$$\begin{aligned} {{\mathcal {R}}}_{s,n}(\Phi ^{\varvec{y}}_{s}(z)) = {\left\{ \begin{array}{ll} \displaystyle \frac{1}{n}\sum _{i=1}^n \Phi ^{{\varvec{y}}^{(i)}}_{s}(z) &{} \text{ for } \text{ expected } \text{ value }, \\ \displaystyle \frac{1}{\theta } \ln \Big ( \frac{1}{n}\sum _{i=1}^n \exp \big (\theta \,\Phi ^{{\varvec{y}}^{(i)}}_{s}(z)\big ) \Big ) &{} \text{ for } \text{ entropic } \text{ risk } \text{ measure }, \end{array}\right. } \end{aligned}$$

for \(\theta \in (0,\infty )\), for carefully chosen QMC points \({\varvec{y}}^{(i)}\), \(i=1,\ldots ,n\), involving a uniformly sampled random shift \({\varvec{\Delta }}\in [0,1]^s\), see Sect. 6.2.

We have seen in the proof of Lemma 3.1 that the risk measures considered in this manuscript are convex and the objective function J, see (3.2), is thus strongly convex. It is important to note that the n-point QMC rule preserves the convexity of the risk measure, so \(J_{s,n}\) is a strongly convex function, because it is a sum of a convex and a strongly convex function. Therefore we have the optimality conditions \(\langle J_{s,n}'(z_{s,n}^*), z - z^*_{s,n}\rangle _{L^2(V;I),L^2(V';I)} \ge 0\) for all \(z \in {\mathcal {Z}}\) and thus in particular \(\langle J_{s,n}'(z_{s,n}^*), z^* - z^*_{s,n}\rangle _{L^2(V;I),L^2(V';I)} \ge 0\). Similarly, we have \(\langle J'(z^*), z - z^* \rangle _{L^2(V;I),L^2(V';I)} \ge 0\), and in particular \(\langle -J'(z^*), z^* - z^*_{s,n} \rangle _{L^2(V;I),L^2(V';I)} \ge 0\). Adding these inequalities gives

$$\begin{aligned} \langle J'_{s,n}(z^*_{s,n}) - J'(z^*), z^* - z^*_{s,n} \rangle _{L^2(V;I),L^2(V';I)} \ge 0\,. \end{aligned}$$

Hence

$$\begin{aligned}&\alpha _3 \Vert z^* - z^*_{s,n} \Vert ^2_{L^2(V';I)}\\&\quad \le \alpha _3 \Vert z^* - z^*_{s,n} \Vert ^2_{L^2(V';I)} + \langle J'_{s,n}(z^*_{s,n}) - J'(z^*), z^* - z^*_{s,n} \rangle _{L^2(V;I),L^2(V';I)} \\&\quad = \langle J'_{s,n}(z^*_{s,n}) - \alpha _3 R_V^{-1}z^*_{s,n} - J'(z^*) + \alpha _3 R_V^{-1}z^*, z^* - z^*_{s,n} \rangle _{L^2(V;I),L^2(V';I)}\\&\quad = \langle J'_{s,n}(z^*_{s,n}) - \alpha _3 R_V^{-1}z^*_{s,n} - J'_{s,n}(z^*) + \alpha _3R_V^{-1}z^* , z^* - z^*_{s,n} \rangle _{L^2(V;I),L^2(V';I)} \\&\qquad + \langle J'_{s,n}(z^*) - \alpha _3R_V^{-1}z^* - J'(z^*) + \alpha _3 R_V^{-1} z^*, z^* - z^*_{s,n} \rangle _{L^2(V;I),L^2(V';I)} \\&\quad \le \langle J'_{s,n}(z^*) - \alpha _3 R_V^{-1}z^* - J'(z^*) + \alpha _3 R_V^{-1}z^*, z^* - z^*_{s,n} \rangle _{L^2(V;I),L^2(V';I)} \\&\quad \le \Vert J'_{s,n}(z^*) - \alpha _3R_V^{-1}z^* - J'(z^*) + \alpha _3 R_V^{-1}z^*\Vert _{L^2(V;I)} \Vert z^* - z^*_{s,n} \Vert _{L^2(V';I)} \,, \end{aligned}$$

where in the fourth step we used the \(\alpha _3\)-strong convexity of \(J_{s,n}\) (equivalently, the strong monotonicity of \(J'_{s,n}\)), i.e.,

$$\begin{aligned} \langle J'_{s,n}(z^*_{s,n}) - J'_{s,n}(z^*) - \alpha _3 R_V^{-1}(z_{s,n}^* - z^*), z^* - z_{s,n}^*\rangle _{L^2(V;I),L^2(V';I)} \le 0\,. \end{aligned}$$

Thus we have with (3.4)

$$\begin{aligned} \Vert z^* - z^*_{s,n}\Vert _{L^2(V';I)}&\le \frac{1}{\alpha _3} \Vert J'(z^*) - J'_{s,n}(z^*)\Vert _{L^2(V;I)}. \end{aligned}$$

We will next expand this upper bound in order to split it into the different error contributions: dimension truncation error and QMC error. The different error contributions are then analyzed separately in the following subsections for both risk measures.

In the case of the expected value, it follows from (3.7) that

$$\begin{aligned}&{\mathbb {E}}_{{\varvec{\Delta }}} \Vert z^* - z^*_{s,n}\Vert ^2_{L^2(V';I)} \le \frac{1}{\alpha _3^2} {\mathbb {E}}_{{\varvec{\Delta }}} \Big \Vert \int _U q_{1}^{\varvec{y}}\,\mathrm d{\varvec{y}}- \frac{1}{n} \sum _{i=1}^n q_{1,s}^{{\varvec{y}}^{(i)}} \Big \Vert ^2_{L^2(V;I)} \nonumber \\&\quad \le \frac{{2}}{\alpha _3^2} \Big \Vert \int _U (q_{1}^{\varvec{y}}- q_{1,s}^{\varvec{y}}) \,\mathrm d{\varvec{y}}\Big \Vert _{L^2(V;I)}^2 + \frac{{2}}{\alpha _3^2} {\mathbb {E}}_{{\varvec{\Delta }}}\Big \Vert \int _{U_s} q_{1,s}^{\varvec{y}}\,\mathrm d{\varvec{y}}- \frac{1}{n} \sum _{i=1}^n q_{1,s}^{{\varvec{y}}^{(i)}} \Big \Vert _{L^2(V;I)}^2, \end{aligned}$$
(6.1)

where \(q^{\varvec{y}}_{1,s}:= q^{(y_1,y_2,\ldots ,y_s,0,0,\ldots )}_{1}\) denotes the truncated version of \(q_1^{\varvec{y}}\), and \(\mathbb {E}_{\varvec{\Delta }}\) denotes the expected value with respect to the random shift \({\varvec{\Delta }}\in [0,1]^s\).

In the case of the entropic risk measure, we recall that \(J'(z)\) is given by (3.14). Let

$$\begin{aligned} T&:= \int _U \exp {\big (\theta \, \Phi ^{{\varvec{y}}}(z^*)\big )}\,\mathrm d{\varvec{y}}\,, \qquad \quad \, T_{s,n} := \frac{1}{n} \sum _{i=1}^n \exp {\big (\theta \, \Phi ^{{\varvec{y}}^{(i)}}_{s}(z^*_{s,n})\big )}\,,\\ S&:= \int _U \exp {\big (\theta \, \Phi ^{{\varvec{y}}}(z^*)\big )}\,q_1^{\varvec{y}}(z^*)\,\mathrm d{\varvec{y}}\,,\,\, S_{s,n} := \frac{1}{n} \sum _{i=1}^n \exp {\big (\theta \, \Phi ^{{\varvec{y}}^{(i)}}_{s}(z^*_{s,n})\big )}\,q_{1,s}^{{\varvec{y}}^{(i)}}(z^*_{s,n}), \end{aligned}$$

then we have

$$\begin{aligned} \alpha _3\big \Vert z^* - z_{s,n}^*\big \Vert _{L^2(V';I)}&\le \Big \Vert \frac{S}{T} - \frac{S_{s,n}}{T_{s,n}} \Big \Vert _{L^2(V;I)} = \frac{\big \Vert S\,T_{s,n} - S_{s,n}\,T\big \Vert _{L^2(V;I)}}{T\,T_{s,n}} \\&= \frac{\big \Vert S\,T_{s,n} - S\,T + S\,T - S_{s,n}\,T\big \Vert _{L^2(V;I)}}{T\,T_{s,n}} \\&\le \frac{\big \Vert S\big \Vert _{L^2(V;I)}\,\big |T-T_{s,n}\big |}{T\,T_{s,n}} + \frac{\big \Vert S - S_{s,n}\big \Vert _{L^2(V;I)}}{T_{s,n}} \\&\le \mu \,\big |T-T_{s,n}\big | + \big \Vert S - S_{s,n}\big \Vert _{L^2(V;I)}, \end{aligned}$$

where we used \(T\ge 1\) and \(T_{s,n}\ge 1\) together with \(\Vert S\Vert _{L^2(V;I)}\le \mu \). The latter follows since, using the abbreviation \(g^{\varvec{y}}({\varvec{x}},t):= \exp (\theta \,\Phi ^{{\varvec{y}}}(z))\,q_1^{{\varvec{y}}}({\varvec{x}},t)\), we get

$$\begin{aligned} \Vert S\Vert _{L^2(V;I)}^2&\,=\, \int _I \Big \Vert \int _U g^{\varvec{y}}(\cdot ,t) \,\textrm{d}{\varvec{y}}\Big \Vert _V^2\,\textrm{d}t \,=\, \int _I \int _D \Big | \nabla \Big (\int _U g^{\varvec{y}}({\varvec{x}},t) \,\textrm{d}{\varvec{y}}\Big ) \Big |^2\,\textrm{d}{\varvec{x}}\,\textrm{d}t \\&\,\le \, \int _U \int _I \int _D \big | \nabla g^{\varvec{y}}({\varvec{x}},t) \big |^2\,\textrm{d}{\varvec{x}}\,\textrm{d}t\,\textrm{d}{\varvec{y}}\,=\, \int _U \big \Vert g^{\varvec{y}}\big \Vert ^2_{L^2(V;I)}\,\textrm{d}{\varvec{y}}\,\le \, \mu ^2\,, \end{aligned}$$

where we used Theorem 5.6 with \({\varvec{\nu }}={\varvec{0}}\).

We can write

$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}\Big \Vert \frac{S}{T} - \frac{S_{s,n}}{T_{s,n}} \Big \Vert _{L^2(V;I)}^2&\,\le \, 2\mu ^2\,\mathbb {E}_{\varvec{\Delta }}\big |T-T_{s,n}\big |^2 + 2\mathbb {E}_{\varvec{\Delta }}\big \Vert S - S_{s,n}\big \Vert _{L^2(V;I)}^2. \end{aligned}$$
(6.2)

For the first term on the right-hand side of (6.2) we obtain

$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}\big |T-T_{s,n}\big |^2&\,\le \, {2}\big |T-T_s\big |^2 + {2}\mathbb {E}_{\varvec{\Delta }}\big |T_s-T_{s,n}\big |^2, \end{aligned}$$
(6.3)

and for the second term we have

$$\begin{aligned}&\mathbb {E}_{\varvec{\Delta }}\Vert S-S_{s,n}\Vert _{L^2(V;I)}^2 \,\le \, {2} \Vert S-S_s\Vert _{L^2(V;I)}^2 + {2} \mathbb {E}_{\varvec{\Delta }}\Vert S_s-S_{s,n}\Vert _{L^2(V;I)}^2. \end{aligned}$$
(6.4)

Remark 6.1

Since we have \(\Vert v_1\Vert _{L^2(V;I)} \le \Vert v\Vert _{{{\mathcal {Y}}}}\) for all \(v=(v_1,v_2)\in {{\mathcal {Y}}}\) by definition, and thus in particular \(\Vert \int _U (q_1^{\varvec{y}}- q_{1,s}^{\varvec{y}})\,\mathrm d{\varvec{y}}\Vert _{L^2(V;I)} \le \Vert \int _U (q^{\varvec{y}}- q_s^{\varvec{y}}) \,\mathrm d{\varvec{y}}\Vert _{{{\mathcal {Y}}}}\), we can replace \(q_1^{\varvec{y}},q_{1,s}^{\varvec{y}}\in L^2(V;I)\) in (6.1) and (6.4) by \(q^{\varvec{y}},q_s^{\varvec{y}}\in {{\mathcal {Y}}}\). In order to obtain error bounds and convergence rates for (6.1) and (6.4), it is then sufficient to derive the results in the \({{\mathcal {Y}}}\)-norm, which is slightly stronger than the \(L^2(V;I)\)-norm.

6.1 Truncation error

In this section we derive bounds and convergence rates for the errors that occur by truncating the dimension, i.e., for the first terms in (6.1), (6.3) and (6.4).

We prove a new and very general theorem for the truncation error based on knowledge of regularity. The idea of the proof is based on a Taylor series expansion and is similar to the approach in [19, Theorem 4.1]. The use of Taylor series for dimension truncation error analysis has also been considered, for instance, in [4, 17].

Theorem 6.2

Let Z be a separable Banach space and let \(g({\varvec{y}}): U \rightarrow Z\) be analytically dependent on the sequence of parameters \({\varvec{y}}\in U = [-\frac{1}{2},\frac{1}{2}]^{{\mathbb {N}}}\). Suppose there exist constants \(C_0>0\), \(r_1\ge 0\), \(r_2>0\), and a sequence \({\varvec{\rho }}= (\rho _j)_{j\ge 1} \in \ell ^p({\mathbb {N}})\) for \(0<p<1\), with \(\rho _1 \ge \rho _2 \ge \cdots \), such that for all \({\varvec{y}}\in U\) and \({\varvec{\nu }}\in {\mathscr {F}}\) we have

$$\begin{aligned} \Vert \partial _{{\varvec{y}}}^{{\varvec{\nu }}} g({\varvec{y}})\Vert _Z \le C_0\, (|{\varvec{\nu }}|+r_1)!\, (r_2{\varvec{\rho }})^{{\varvec{\nu }}}. \end{aligned}$$

Then, denoting \(({\varvec{y}}_{\le s};\varvec{0}) = (y_1,y_2,\ldots ,y_s,0,0,\ldots )\), we have for all \(s \in {\mathbb {N}}\)

$$\begin{aligned} \Big \Vert \int _U \big (g({\varvec{y}}) - g({\varvec{y}}_{\le s};\varvec{0})\big )\,\mathrm d{\varvec{y}}\Big \Vert _Z \le C_0\,C\,s^{-2/p+1}\,, \end{aligned}$$

for \(C>0\) independent of s.

Proof

Let \({\varvec{y}}\in U\) and \(G\in Z'\) with \(\Vert G\Vert _{Z'}\le 1\) and define

$$\begin{aligned} F({\varvec{y}}):=\langle G,g({\varvec{y}}) \rangle _{Z',Z}. \end{aligned}$$

Evidently, \(\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}})=\langle G,\partial _{{\varvec{y}}}^{{\varvec{\nu }}}g({\varvec{y}})\rangle _{Z',Z}\) for all \({\varvec{\nu }}\in \mathscr {F}\). Moreover,

$$\begin{aligned} \sup _{{\varvec{y}}\in U}|\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}})|\le C_0 (|{\varvec{\nu }}|+r_1)! (r_2{\varvec{\rho }})^{{\varvec{\nu }}}\quad \text {for all}~{\varvec{\nu }}\in {\mathscr {F}}. \end{aligned}$$

For arbitrary \(k\ge 1\) we consider the Taylor expansion of F about \(({\varvec{y}}_{\le s};{\textbf{0}}) = (y_1,y_2,\ldots ,y_s,0,0,\ldots )\):

$$\begin{aligned} F({\varvec{y}})&=F({\varvec{y}}_{\le s};{\textbf{0}})+\sum _{\ell =1}^k \sum _{\begin{array}{c} |{\varvec{\nu }}|=\ell \\ \nu _j=0~\forall j\le s \end{array}}\frac{{\varvec{y}}^{{\varvec{\nu }}}}{{\varvec{\nu }}!}\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}}_{\le s};{\textbf{0}})\\&\quad +\sum _{\begin{array}{c} |{\varvec{\nu }}|=k+1\\ \nu _j=0~\forall j\le s \end{array}}\frac{k+1}{{\varvec{\nu }}!}\,{\varvec{y}}^{{\varvec{\nu }}}\int _0^1(1-t)^k\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}}_{\le s};t{\varvec{y}}_{>s})\,\textrm{d}t. \end{aligned}$$

Rearranging this equation and integrating over \({\varvec{y}}\in U\) yields

$$\begin{aligned}&\int _U (F({\varvec{y}})-F({\varvec{y}}_{\le s};{\textbf{0}}))\,\textrm{d}{\varvec{y}}=\sum _{\ell =1}^k \sum _{\begin{array}{c} |{\varvec{\nu }}|=\ell \\ \nu _j=0~\forall j\le s \end{array}}\frac{1}{{\varvec{\nu }}!} \int _U {\varvec{y}}^{{\varvec{\nu }}}\,\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}}_{\le s};{\textbf{0}})\,\textrm{d}{\varvec{y}}\nonumber \\&\qquad + \sum _{\begin{array}{c} |{\varvec{\nu }}|=k+1\\ \nu _j=0~\forall j\le s \end{array}}\frac{k+1}{{\varvec{\nu }}!} \int _U\int _0^1(1-t)^k\,{\varvec{y}}^{{\varvec{\nu }}}\,\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}}_{\le s};t\,{\varvec{y}}_{>s})\,\textrm{d}t\,\textrm{d}{\varvec{y}}. \end{aligned}$$
(6.5)

If there is any component \(\nu _j=1\) with \(j>s\), then the summand in the first term vanishes, since (for all \({\varvec{\nu }}\in {\mathscr {F}}\) with \(\nu _j = 0\) \(\forall j\le s\))

$$\begin{aligned} \int _U {\varvec{y}}^{{\varvec{\nu }}}\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}}_{\le s};{\textbf{0}})\,\textrm{d}{\varvec{y}}=\int _{U_{\le s}} \partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}}_{\le s};{\textbf{0}})\, \underset{=0 \text { if at least one } \nu _j = 1}{\underbrace{\Big (\prod _{j>s}\int _{-1/2}^{1/2}y_j^{\nu _j}\,\textrm{d}y_j\Big )}}\,\textrm{d}{\varvec{y}}_{\le s}=0, \end{aligned}$$

where we used Fubini’s theorem. Taking the absolute value on both sides in (6.5) and using \(|y_j|\le \frac{1}{2}\), we obtain

$$\begin{aligned}&\Big |\int _U (F({\varvec{y}})-F({\varvec{y}}_{\le s};{\textbf{0}}))\,\textrm{d}{\varvec{y}}\Big |\nonumber \\&\quad \le \sum _{\ell =2}^k \!\!\!\sum _{\begin{array}{c} |{\varvec{\nu }}|=\ell \\ \nu _j=0~\forall j\le s\\ \nu _j\ne 1~\forall j>s \end{array}}\!\!\frac{1}{2^{{\varvec{\nu }}}{\varvec{\nu }}!}\sup _{{\varvec{y}}\in U}|\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}})| +\!\!\!\!\sum _{\begin{array}{c} |{\varvec{\nu }}|=k+1\\ \nu _j=0~\forall j\le s \end{array}}\!\!\frac{k+1}{2^{{\varvec{\nu }}}{\varvec{\nu }}!} \int _0^1 (1-t)^k \sup _{{\varvec{y}}\in U}|\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}})|\, \mathrm dt\nonumber \\&\quad = \sum _{\ell =2}^k \!\!\!\sum _{\begin{array}{c} |{\varvec{\nu }}|=\ell \\ \nu _j=0~\forall j\le s\\ \nu _j\ne 1~\forall j>s \end{array}}\frac{1}{2^{{\varvec{\nu }}}{\varvec{\nu }}!}\sup _{{\varvec{y}}\in U}|\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}})| + \!\!\!\! \sum _{\begin{array}{c} |{\varvec{\nu }}|=k+1\\ \nu _j=0~\forall j\le s \end{array}}\frac{1}{2^{{\varvec{\nu }}}{\varvec{\nu }}!} \sup _{{\varvec{y}}\in U}|\partial _{{\varvec{y}}}^{{\varvec{\nu }}}F({\varvec{y}})|\nonumber \\&\quad \le \sum _{\ell =2}^k \!\!\! \sum _{\begin{array}{c} |{\varvec{\nu }}|=\ell \\ \nu _j=0~\forall j\le s\\ \nu _j\ne 1~\forall j>s \end{array}}\frac{1}{2^{{\varvec{\nu }}}{\varvec{\nu }}!} C_0\,(|{\varvec{\nu }}|+r_1)!\, (r_2{\varvec{\rho }})^{{\varvec{\nu }}} + \!\!\!\!\sum _{\begin{array}{c} |{\varvec{\nu }}|=k+1\\ \nu _j=0~\forall j\le s \end{array}}\frac{1}{2^{{\varvec{\nu }}}{\varvec{\nu }}!} C_0\, (|{\varvec{\nu }}|+r_1)!\, (r_2{\varvec{\rho }})^{{\varvec{\nu }}}\nonumber \\&\quad \le C_0\,(k+r_1)!\,\frac{{r_2}^k}{2^2\,2!} \sum _{\ell =2}^k \!\!\! \sum _{\begin{array}{c} |{\varvec{\nu }}|=\ell \\ \nu _j=0~\forall j\le s\\ \nu _j\ne 1~\forall j>s \end{array}}\!\!\!\! {\varvec{\rho }}^{{\varvec{\nu }}} +C_0\,(k+1+r_1)!\,\left( \frac{{r_2}}{2}\right) ^{k+1} \!\!\!\!\!\!\! 
\sum _{\begin{array}{c} |{\varvec{\nu }}|=k+1\\ \nu _j=0~\forall j\le s \end{array}} \!\!\!\!\!\frac{{\varvec{\rho }}^{{\varvec{\nu }}}}{{\varvec{\nu }}!}.\nonumber \\ \end{aligned}$$
(6.6)

Furthermore, we have

$$\begin{aligned} \Big \Vert \int _U (g({\varvec{y}}) - g({\varvec{y}}_{\le s};\varvec{0}))\,\textrm{d}{\varvec{y}}\Big \Vert _{Z}&= \sup _{\begin{array}{c} G\in Z'\\ \Vert G\Vert _{Z'}\le 1 \end{array}}\Big |\Big \langle G,\int _U (g({\varvec{y}}) - g({\varvec{y}}_{\le s};\varvec{0}))\,\textrm{d}{\varvec{y}}\Big \rangle _{Z',Z}\Big |\\&=\sup _{\begin{array}{c} G\in Z'\\ \Vert G\Vert _{Z'}\le 1 \end{array}}\Big |\int _U\langle G,g({\varvec{y}}) - g({\varvec{y}}_{\le s};\varvec{0})\rangle _{Z',Z}\,\textrm{d}{\varvec{y}}\Big | \\&=\sup _{\begin{array}{c} G\in Z'\\ \Vert G\Vert _{Z'}\le 1 \end{array}}\Big |\int _U (F({\varvec{y}})-F({\varvec{y}}_{\le s};{\textbf{0}}))\,\textrm{d}{\varvec{y}}\Big |\,, \end{aligned}$$

which is also bounded by the last expression in (6.6).

For s sufficiently large, we obtain in complete analogy to [15] that the first term in (6.6) satisfies

$$\begin{aligned} \sum _{\ell =2}^k \sum _{\begin{array}{c} |{\varvec{\nu }}|=\ell \\ \nu _j=0~\forall j\le s\\ \nu _j\ne 1~\forall j>s \end{array}}{\varvec{\rho }}^{{\varvec{\nu }}}&=\sum _{\begin{array}{c} 2\le |{\varvec{\nu }}|\le k\\ \nu _j=0~\forall j\le s\\ \nu _j\ne 1~\forall j>s \end{array}}{\varvec{\rho }}^{{\varvec{\nu }}} \le \sum _{\begin{array}{c} 0 \ne |{\varvec{\nu }}|_\infty \le k\\ \nu _j=0~\forall j\le s\\ \nu _j\ne 1~\forall j>s \end{array}}{\varvec{\rho }}^{{\varvec{\nu }}} =-1+\prod _{j>s}\Big (1+\sum _{\ell =2}^k \rho _j^\ell \Big )\\&=-1+\prod _{j>s}\Big (1+\frac{1-\rho _j^{k-1}}{1-\rho _j}\,\rho _j^2\Big )={\mathcal {O}}(s^{-2/p+1}), \end{aligned}$$

since \({\varvec{\rho }}\in \ell ^p\) with \(0<p<1\) and \(\rho _1\ge \rho _2\ge \cdots \) by assumption.

On the other hand, we can use the multinomial theorem to bound the second term in (6.6):

$$\begin{aligned} \sum _{\begin{array}{c} |{\varvec{\nu }}|=k+1\\ \nu _j=0~\forall j\le s \end{array}} \frac{{\varvec{\rho }}^{{\varvec{\nu }}} }{{\varvec{\nu }}!} \le \sum _{\begin{array}{c} |{\varvec{\nu }}|=k+1\\ \nu _j=0~\forall j\le s \end{array}} \left( {\begin{array}{c}k+1\\ {\varvec{\nu }}\end{array}}\right) {\varvec{\rho }}^{{\varvec{\nu }}} =\bigg (\sum _{j>s}\rho _j\bigg )^{k+1} ={\mathcal {O}}(s^{(k+1)(-1/p+1)}), \end{aligned}$$

where we used \(\sum _{j>s}\rho _j\le s^{-1/p+1} (\sum _{j=1}^\infty \rho _j^p )^{1/p}\). (The last inequality follows directly from [9, Lemma 5.5], which is often attributed to Stechkin. For an elementary proof we refer to [31, Lemma 3.3].)

Taking now \(k=\lceil \frac{1}{1-p}\rceil \) yields that (6.6) is of order \({{\mathcal {O}}}(s^{-2/p+1})\). Note that \(k\ge 2\) for \(0<p<1\). The result can be extended to all s by a trivial adjustment of the constants. \(\square \)
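To make the rate in Theorem 6.2 concrete, the following Python sketch (an illustration of ours, not part of the analysis) uses the toy product integrand \(g({\varvec{y}}) = \exp(\sum_j \rho_j y_j)\), for which the full and truncated integrals factorize and the truncation error is available in closed form; it also verifies the Stechkin-type tail bound \(\sum_{j>s}\rho_j \le s^{-1/p+1}(\sum_j \rho_j^p)^{1/p}\) used in the proof.

```python
import numpy as np

# Toy illustration of Theorem 6.2 (ours, not from the paper): for the product
# integrand g(y) = exp(sum_j rho_j y_j) on U = [-1/2,1/2]^N, each factor
# integrates to m_j = sinh(rho_j/2)/(rho_j/2), so the dimension truncation
# error has the closed form  prod_{j<=s} m_j * (prod_{j>s} m_j - 1).
J = 100_000                          # surrogate for "infinitely many" terms
j = np.arange(1, J + 1)
rho = j ** -2.0                      # rho is in l^p for every p > 1/2, so the
                                     # theorem predicts a rate close to s^{-3}
x = rho / 2.0
# log m_j via the series sinh(x)/x = 1 + x^2/6 + x^4/120 + ... (x <= 1/2 here)
log_m = np.log1p(x**2 / 6.0 + x**4 / 120.0)

def trunc_error(s):
    head = np.exp(log_m[:s].sum())   # prod_{j<=s} m_j
    tail = np.expm1(log_m[s:].sum()) # prod_{j>s} m_j - 1
    return head * tail

# Observed rate between s = 8 and s = 64: should be close to 3.
slope = np.log(trunc_error(8) / trunc_error(64)) / np.log(64 / 8)

# Stechkin-type tail bound from the proof, with p = 0.6 (rho is in l^0.6).
p, s = 0.6, 10
lhs = rho[s:].sum()
rhs = s ** (-1.0 / p + 1.0) * ((rho ** p).sum()) ** (1.0 / p)
```

The observed slope is slightly below 3 due to finite-size effects, consistent with the asymptotic rate \(s^{-2/p+1}\) as \(p \downarrow 1/2\).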

Remark 6.3

The analyticity assumption on the integrand can be relaxed: the Taylor series representation used in the proof remains valid under the weaker assumption that only the \({\varvec{\nu }}\)-th partial derivatives with \(|{\varvec{\nu }}|\le k+1\), for \(k=\lceil \frac{1}{1-p}\rceil \) and \(0<p<1\), exist.

We now apply this general result to the first terms in (6.1), (6.3) and (6.4).

Theorem 6.4

Let \(\theta >0\), \(\alpha _1,\alpha _2 \ge 0\), with \(\alpha _1 + \alpha _2 > 0\). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\), let \(u^{\varvec{y}}\in {{\mathcal {X}}}\) be the solution of (2.2) and \(\Phi ^{\varvec{y}}\) be as in (3.3), and then let \(q^{\varvec{y}}\in {{\mathcal {Y}}}\) be the solution of (2.16) with \(f_\textrm{dual}\) given by (3.6). Suppose the sequence \({\varvec{b}}= (b_j)_{j\ge 1}\) defined by (2.11) satisfies

$$\begin{aligned}&\sum _{j\ge 1} b_j^p< \infty \quad \text{ for } \text{ some } 0< p < 1, \quad \text{ and } \end{aligned}$$
(6.7)
$$\begin{aligned}&b_1\ge b_2 \ge \cdots . \end{aligned}$$
(6.8)

Then for every \(s \in {\mathbb {N}}\), the truncated solutions \(u_s^{\varvec{y}}\), \(q_s^{\varvec{y}}\) and \(\Phi _s^{\varvec{y}}\) satisfy

$$\begin{aligned} \Big \Vert \int _U (u^{\varvec{y}}- u_s^{\varvec{y}}) \,\textrm{d}{\varvec{y}}\Big \Vert _{{{{\mathcal {X}}}}}&\le C\, s^{-2/p+1}, \\ \Big \Vert \int _U (q^{\varvec{y}}- q_s^{\varvec{y}}) \,\textrm{d}{\varvec{y}}\Big \Vert _{{{\mathcal {Y}}}}&\le C\, s^{-2/p+1}, \\ \Vert S - S_s\Vert _{{{\mathcal {Y}}}} = \Big \Vert \int _U \big (\exp {\big (\theta \, \Phi ^{{\varvec{y}}}\big )}\,q^{\varvec{y}}- \exp {\big (\theta \, \Phi ^{\varvec{y}}_s\big )}\,q_s^{\varvec{y}}\big ) \,\textrm{d}{\varvec{y}}\Big \Vert _{{{\mathcal {Y}}}}&\le C\, s^{-2/p+1}, \\ |T - T_s| = \Big | \int _U \big (\exp {\big (\theta \, \Phi ^{{\varvec{y}}}\big )} - \exp {\big (\theta \, \Phi ^{\varvec{y}}_s\big )}\big ) \,\textrm{d}{\varvec{y}}\Big |&\le C\, s^{-2/p+1}. \end{aligned}$$

In each case we have a generic constant \(C>0\) independent of s, but depending on z, \(u_0\), \({\widehat{u}}\) and other constants as appropriate.

Proof

The result is a corollary of Theorem 6.2, obtained by applying the regularity bounds in Lemma 2.1 and Theorems 4.2, 5.4 and 5.6. \(\square \)

6.2 Quasi-Monte Carlo error

We are interested in computing s-dimensional Bochner integrals of the form

$$\begin{aligned} I_s(g):=\int _{U_s}g({\varvec{y}})\,\textrm{d}{\varvec{y}}, \end{aligned}$$

where \(g({\varvec{y}})\) is an element of a separable Banach space Z for each \({\varvec{y}}\in U_s\). As our estimator of \(I_s(g)\), we use a cubature rule of the form

$$\begin{aligned} Q_{s,n}(g):= \sum _{i=1}^n \alpha _i \,g({\varvec{y}}^{(i)}), \end{aligned}$$

with weights \(\alpha _i \in {\mathbb {R}}\) and cubature points \({\varvec{y}}^{(i)} \in U_s\). In particular, we are interested in QMC rules (see, e.g., [12, 33]), which are cubature rules characterized by equal weights \(\alpha _i = 1/n\) and carefully chosen points \({\varvec{y}}^{(i)}\) for \(i=1,\ldots ,n\).

We shall see that for sufficiently smooth integrands, randomly shifted lattice rules lead to convergence rates that are independent of the dimension and faster than those of Monte Carlo methods. Randomly shifted lattice rules are QMC rules with cubature points given by

$$\begin{aligned} {\varvec{y}}^{(i)}_{\varvec{\Delta }} := \textrm{frac}\Big (\frac{i\varvec{z}}{n}+\varvec{\Delta }\Big ) - \Big (\frac{1}{2},\ldots ,\frac{1}{2}\Big ), \end{aligned}$$
(6.9)

where \(\varvec{z}\in {\mathbb {N}}^s\) is known as the generating vector, \(\varvec{\Delta } \in [0,1]^s\) is the random shift and \(\textrm{frac}(\cdot )\) denotes the componentwise fractional part. To obtain an unbiased estimator, in practice we take the mean over R independently and uniformly drawn random shifts, i.e., we estimate \(I_s(g)\) using

$$\begin{aligned} {\overline{Q}}(g):=\frac{1}{R}\sum _{r=1}^R Q^{(r)}(g),\quad \text {with}\quad Q^{(r)}(g):=\frac{1}{n}\sum _{i=1}^ng({\varvec{y}}^{(i)}_{\varvec{\Delta }^{(r)}}). \end{aligned}$$
(6.10)
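For illustration, the following sketch (ours, not from the paper) implements the shifted lattice points (6.9) and the shift-averaged estimator (6.10) for a smooth toy integrand whose integral is known in closed form. The generating vector here is hand-picked for a small example, not CBC-constructed.

```python
import numpy as np

rng = np.random.default_rng(0)

def shifted_lattice_points(z, n, shift):
    """Randomly shifted lattice points (6.9) on [-1/2, 1/2]^s."""
    i = np.arange(1, n + 1)[:, None]
    return np.mod(i * z[None, :] / n + shift[None, :], 1.0) - 0.5

# Toy integrand with known integral over U_s = [-1/2,1/2]^s:
#   f(y) = exp(sum_j y_j),  I_s(f) = (2*sinh(1/2))**s
s, n, R = 4, 128, 16
z = np.array([1, 55, 33, 47])     # hand-picked generating vector (illustrative
                                  # only; in practice z comes from a CBC search)
exact = (2.0 * np.sinh(0.5)) ** s

# Mean over R random shifts, as in (6.10).
Q = np.empty(R)
for r in range(R):
    pts = shifted_lattice_points(z, n, rng.random(s))
    Q[r] = np.mean(np.exp(pts.sum(axis=1)))
Qbar = Q.mean()
```

Since each shifted rule is an equal-weight average over points whose marginal distribution is uniform, \({\overline{Q}}\) is an unbiased estimator of the integral.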

In this section we derive bounds and convergence rates for the errors that occur by applying a QMC method for the approximation of the integrals in the second terms in (6.1), (6.3) and (6.4). We first prove a new general result which holds for any cubature rule in a separable Banach space setting.

Theorem 6.5

Let \(U_s = [-\frac{1}{2},\frac{1}{2}]^s\) and let \({{\mathcal {W}}}_s\) be a Banach space of functions \(F: U_s\rightarrow \mathbb {R}\), which is continuously embedded in the space of continuous functions. Consider an n-point cubature rule with weights \(\alpha _i\in \mathbb {R}\) and points \({\varvec{y}}^{(i)}\in U_s\), given by

$$\begin{aligned} I_s(F) :=\, \int _{U_s} F({\varvec{y}})\,\textrm{d}{\varvec{y}}\,\approx \, \sum _{i=1}^n \alpha _i\, F({\varvec{y}}^{(i)}) \,=:\, Q_{s,n}(F), \end{aligned}$$

and define the worst case error of \(Q_{s,n}\) in \({{\mathcal {W}}}_s\) by

$$\begin{aligned} e^{\textrm{wor}}(Q_{s,n};{{\mathcal {W}}}_s) :=\, \sup _{F\in {{\mathcal {W}}}_s,\, \Vert F\Vert _{{{\mathcal {W}}}_s}\le 1} |I_s(F) - Q_{s,n}(F)|. \end{aligned}$$

Let Z be a separable Banach space and let \(Z'\) denote its dual space. Let \(g: U_s \rightarrow Z\) be continuous. Then

$$\begin{aligned} \Big \Vert \int _{U_s} g({\varvec{y}}) \,\textrm{d}{\varvec{y}}- \sum _{i=1}^n \alpha _i\, g({\varvec{y}}^{(i)}) \Big \Vert _Z \,\le \, e^{\textrm{wor}}(Q_{s,n};{{\mathcal {W}}}_s)\, \sup _{{\mathop {\scriptstyle {\Vert G\Vert _{Z'}\le 1}}\limits ^{\scriptstyle {G\in Z'}}}} \Vert G(g)\Vert _{{{\mathcal {W}}}_s}. \end{aligned}$$
(6.11)

Proof

From the separability of Z and the continuity of \(g({\varvec{y}})\) we get strong measurability of \(g({\varvec{y}})\). Moreover, from the compactness of \(U_s\) and the continuity of \({\varvec{y}}\mapsto g({\varvec{y}})\) we conclude that \(\sup _{{\varvec{y}}\in U_s}\Vert g({\varvec{y}})\Vert _Z < \infty \) and hence \(\int _{U_s} \Vert g({\varvec{y}})\Vert _Z\, \textrm{d}{\varvec{y}}< \infty \), which in turn implies \(\Vert \int _{U_s} g({\varvec{y}})\, \textrm{d}{\varvec{y}}\Vert _Z <\infty \). Thus \(g({\varvec{y}})\) is Bochner integrable.

Furthermore, for every normed space Z, its dual space \(Z'\) is a Banach space equipped with the norm \(\Vert G\Vert _{Z'}:= \sup _{g \in Z,\, \Vert g\Vert _{Z} \le 1} |\langle G,g\rangle _{Z',Z}|\). Then it holds for every \(g\in Z\) that \(\Vert g\Vert _Z = \sup _{G \in Z',\, \Vert G\Vert _{Z'} \le 1} |\langle G, g\rangle _{Z',Z}|\). This follows from the Hahn–Banach Theorem, see, e.g., [40, Theorem 4.3].

Thus we have

$$\begin{aligned}&\Big \Vert \int _{U_s} g({\varvec{y}})\, \textrm{d}{\varvec{y}}- \sum _{i=1}^n \alpha _i\, g({\varvec{y}}^{(i)}) \Big \Vert _Z \nonumber \\&\qquad = \sup _{{\mathop {\Vert G\Vert _{Z'}\le 1}\limits ^{G\in Z'}}} \Big | \Big \langle G, \int _{U_s} g({\varvec{y}})\, \textrm{d}{\varvec{y}}- \sum _{i=1}^n \alpha _i\, g({\varvec{y}}^{(i)}) \Big \rangle _{Z',Z} \Big | \nonumber \\&\qquad = \sup _{{\mathop {\Vert G\Vert _{Z'}\le 1}\limits ^{G\in Z'}}} \Big | \int _{U_s} \langle G,g({\varvec{y}})\rangle _{Z',Z}\, \textrm{d}{\varvec{y}}- \sum _{i=1}^n \alpha _i\, \langle G,g({\varvec{y}}^{(i)}) \rangle _{Z',Z} \Big | \,, \end{aligned}$$
(6.12)

where we used the linearity of G and the fact that for Bochner integrals we can swap the integral with the linear functional, see, e.g., [49, Corollary V.2].

From the definition of the worst case error of \(Q_{s,n}\) in \({{\mathcal {W}}}_s\) it follows that for any \(F\in {{\mathcal {W}}}_s\) we have

$$\begin{aligned} |I_s(F) - Q_{s,n}(F)| \,\le \, e^{\textrm{wor}}(Q_{s,n};{{\mathcal {W}}}_s)\,\Vert F\Vert _{{{\mathcal {W}}}_s}. \end{aligned}$$

Applying this to the special case \(F({\varvec{y}}) = G(g({\varvec{y}})) = \langle G, g({\varvec{y}})\rangle _{Z',Z}\) in (6.12) yields (6.11). \(\square \)

Theorem 6.6

Let the assumptions of Theorem 6.5 hold. In addition, suppose there exist constants \(C_0>0\), \(r_1\ge 0\), \(r_2>0\) and a positive sequence \({\varvec{\rho }}= (\rho _j)_{j\ge 1}\) such that for all \(\mathrm {\mathfrak {u}}\subseteq \{1:s\}\) and for all \({\varvec{y}}\in U_s\) we have

$$\begin{aligned} \Big \Vert \frac{\partial ^{|\mathrm {\mathfrak {u}}|}}{\partial {\varvec{y}}_{\mathrm {\mathfrak {u}}}} g({\varvec{y}})\Big \Vert _Z \le C_0\, (|\mathrm {\mathfrak {u}}|+r_1)!\, \prod _{j \in \mathrm {\mathfrak {u}}} (r_2\,\rho _j). \end{aligned}$$
(6.13)

Then, a randomly shifted lattice rule can be constructed using a component-by-component (CBC) algorithm such that

$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}\Big \Vert \int _{U_s} g({\varvec{y}}) \,\textrm{d}{\varvec{y}}- \frac{1}{n} \sum _{i=1}^n g({{\varvec{y}}^{(i)}}) \Big \Vert _{Z}^2 \,\le \, C_{s,{\varvec{\gamma }},\lambda }\, [\phi _{\textrm{tot}}(n)]^{-1/\lambda } \,\,\,\text{ for } \text{ all }\,\,\, \lambda \in (\tfrac{1}{2},1], \end{aligned}$$

where \(\phi _{\textrm{tot}}(n)\) is the Euler totient function, with \(1/\phi _{\textrm{tot}}(n)\le 2/n\) when n is a prime power, and

$$\begin{aligned} C_{s,{\varvec{\gamma }},\lambda } := C_0^2 \left( \sum _{\emptyset \ne \mathrm {\mathfrak {u}}\subseteq \{1:s\}} \!\!\!\gamma _\mathrm {\mathfrak {u}}^\lambda \bigg (\frac{2\zeta (2\lambda )}{(2\pi ^2)^\lambda } \bigg )^{|\mathrm {\mathfrak {u}}|}\right) ^{\frac{1}{\lambda }} \left( \sum _{\mathrm {\mathfrak {u}}\subseteq \{1:s\}} \!\!\!\frac{[(|\mathrm {\mathfrak {u}}|+r_1)!]^2 \prod _{j\in \mathrm {\mathfrak {u}}} (r_2\,\rho _j)^2}{\gamma _\mathrm {\mathfrak {u}}} \right) . \end{aligned}$$
(6.14)

Proof

We consider randomly shifted lattice rules in the unanchored weighted Sobolev space \({{\mathcal {W}}}_{s,{\varvec{\gamma }}}\) with norm

$$\begin{aligned} \Vert F\Vert _{{{\mathcal {W}}}_{s,{\varvec{\gamma }}}}^2 :=&\!\sum _{\mathrm {\mathfrak {u}}\subseteq \{1:s\}} \!\frac{1}{\gamma _\mathrm {\mathfrak {u}}} \int _{[-\frac{1}{2},\frac{1}{2}]^{|\mathrm {\mathfrak {u}}|}} \Big | \int _{[-\frac{1}{2},\frac{1}{2}]^{s-|\mathrm {\mathfrak {u}}|}}\! \frac{\partial ^{|\mathrm {\mathfrak {u}}|}}{\partial {\varvec{y}}_\mathrm {\mathfrak {u}}} F({\varvec{y}}_\mathrm {\mathfrak {u}};{\varvec{y}}_{\{1:s\}\setminus \mathrm {\mathfrak {u}}})\, \textrm{d}{\varvec{y}}_{\{1:s\}\setminus \mathrm {\mathfrak {u}}} \Big |^2 \textrm{d}{\varvec{y}}_\mathrm {\mathfrak {u}}\\ \le&\!\sum _{\mathrm {\mathfrak {u}}\subseteq \{1:s\}} \frac{1}{\gamma _\mathrm {\mathfrak {u}}} \int _{U_s} \Big | \frac{\partial ^{|\mathrm {\mathfrak {u}}|}}{\partial {\varvec{y}}_\mathrm {\mathfrak {u}}} F({\varvec{y}})\, \Big |^2 \textrm{d}{\varvec{y}}. \end{aligned}$$

It is known that the CBC construction yields a lattice generating vector satisfying

$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}[e^{\textrm{wor}}(Q_{s,n};{{\mathcal {W}}})]^2 \,\le \, \Big ( \frac{1}{\phi _{\textrm{tot}}(n)} \sum _{\emptyset \ne \mathrm {\mathfrak {u}}\subseteq \{1:s\}}\!\! \gamma _{\mathrm {\mathfrak {u}}}^\lambda \Big (\frac{2\zeta (2\lambda )}{(2\pi ^2)^\lambda } \Big )^{|\mathrm {\mathfrak {u}}|}\Big )^{\frac{1}{\lambda }} \,\,\text{ for } \text{ all }\,\, \lambda \in (\tfrac{1}{2},1]. \end{aligned}$$

We have from (6.11) that

$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}\Big \Vert \int _{U_s} g({\varvec{y}}) \,\textrm{d}{\varvec{y}}- \frac{1}{n} \sum _{i=1}^n g({\varvec{y}}^{(i)}) \Big \Vert _{Z}^2 \,\le \, \mathbb {E}_{\varvec{\Delta }}[e^{\textrm{wor}}(Q_{s,n};{{\mathcal {W}}})]^2 \!\sup _{{\mathop {\Vert G\Vert _{Z'}\le 1}\limits ^{G\in Z'}}}\! \Vert G(g)\Vert _{{{\mathcal {W}}}_{s,{\varvec{\gamma }}}}^2. \end{aligned}$$

Using the definition of the \({{\mathcal {W}}}_{s,{\varvec{\gamma }}}\)-norm, we have

$$\begin{aligned}&\Vert G(g)\Vert _{{{\mathcal {W}}}_{s,{\varvec{\gamma }}}}^2 \le \sum _{\mathrm {\mathfrak {u}}\subseteq \{1:s\}} \frac{1}{\gamma _{\mathrm {\mathfrak {u}}}} \int _{U_s} \Big | \frac{\partial ^{|\mathrm {\mathfrak {u}}|}}{\partial {\varvec{y}}_{\mathrm {\mathfrak {u}}}} G(g({\varvec{y}}))\Big |^2 \, \textrm{d}{\varvec{y}}\\&\quad = \sum _{\mathrm {\mathfrak {u}}\subseteq \{1:s\}} \frac{1}{\gamma _{\mathrm {\mathfrak {u}}}} \int _{U_s} \Big | G\Big ( \frac{\partial ^{|\mathrm {\mathfrak {u}}|}}{\partial {\varvec{y}}_{\mathrm {\mathfrak {u}}}} g({\varvec{y}})\Big )\Big |^2 \, \textrm{d}{\varvec{y}}\le \!\sum _{\mathrm {\mathfrak {u}}\subseteq \{1:s\}} \!\frac{1}{\gamma _{\mathrm {\mathfrak {u}}}} \int _{U_s} \Vert G\Vert _{Z'}^2\, \Big \Vert \frac{\partial ^{|\mathrm {\mathfrak {u}}|}}{\partial {\varvec{y}}_{\mathrm {\mathfrak {u}}}} g({\varvec{y}})\Big \Vert _Z^2\, \textrm{d}{\varvec{y}}\,. \end{aligned}$$

We can now use the assumption (6.13) and combine all of the estimates to arrive at the required bound. \(\square \)
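To illustrate the CBC construction used in the proof, the following toy implementation (ours) treats the simpler case of product weights \(\gamma_{\mathfrak{u}} = \prod_{j\in\mathfrak{u}}\gamma_j\), for which the shift-averaged squared worst case error has a well-known closed form involving the Bernoulli polynomial \(B_2(x)=x^2-x+\frac{1}{6}\). It is an \(O(sn^2)\) sketch for clarity; the experiments below rely on the fast (FFT-based) CBC algorithm.

```python
from math import gcd

import numpy as np

def B2(x):
    """Bernoulli polynomial B_2(x) = x^2 - x + 1/6 on [0, 1)."""
    return x * x - x + 1.0 / 6.0

def cbc(n, s, gamma):
    """Toy component-by-component construction for product weights:
    minimizes, one component at a time, the shift-averaged squared
    worst case error
        e^2(z) = -1 + (1/n) * sum_k prod_j (1 + gamma[j] * B2({k z_j / n})).
    O(s*n^2) for clarity; the fast CBC algorithm is far more efficient."""
    k = np.arange(n)
    prod = np.ones(n)      # running product over the components chosen so far
    z, best = [], None
    for j in range(s):
        best = None
        for zj in range(1, n):
            if gcd(zj, n) != 1:
                continue   # keep z_j coprime to n
            cand = prod * (1.0 + gamma[j] * B2((k * zj % n) / n))
            e2 = cand.mean() - 1.0
            if best is None or e2 < best[0]:
                best = (e2, zj, cand)
        z.append(best[1])
        prod = best[2]
    return np.array(z), best[0]

gamma = [0.9 ** (j + 1) for j in range(5)]   # illustrative product weights
_, e2_coarse = cbc(32, 5, gamma)
_, e2_fine = cbc(256, 5, gamma)
```

The squared worst case error is strictly positive for finite n and decreases as n grows, which is what the CBC error bound above guarantees.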

Remark 6.7

Theorem 6.5 holds for arbitrary cubature rules, so results analogous to Theorem 6.6 can be stated for other cubature rules. In particular, the regularity bounds obtained in Sects. 4 and 5 can also be used for the worst case error analysis of higher-order QMC quadrature as well as sparse grid integration.

We now apply Theorem 6.6 to the second terms in (6.1), (6.3) and (6.4).

Theorem 6.8

Let \(\theta >0\), \(\alpha _1,\alpha _2 \ge 0\), with \(\alpha _1 + \alpha _2 > 0\). Let \(f = (z,u_0) \in {{\mathcal {Y}}}'\) and \({\widehat{u}} \in {{\mathcal {X}}}\). For every \({\varvec{y}}\in U\) and \(s\in \mathbb {N}\), let \(u^{\varvec{y}}_s \in {{\mathcal {X}}}\) be the truncated solution of (2.2) and \(\Phi ^{\varvec{y}}_s\) be as in (3.3), and then let \(q^{\varvec{y}}_s \in {{\mathcal {Y}}}\) be the truncated solution of (2.16) with \(f_\textrm{dual}\) given by (3.6). Then a randomly shifted lattice rule can be constructed using a CBC algorithm such that for all \(\lambda \in (\tfrac{1}{2},1]\) we have

$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}\Big \Vert \int _{U_s} u_{s}^{\varvec{y}}\,\textrm{d}{\varvec{y}}&- \frac{1}{n} \sum _{i=1}^n u_{s}^{{\varvec{y}}^{(i)}} \Big \Vert _{{{{\mathcal {X}}}}}^2 \le C_{s,{\varvec{\gamma }},\lambda }\, [\phi _\textrm{tot}(n)]^{-1/\lambda }, \end{aligned}$$
(6.15)
$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}\Big \Vert \int _{U_s} q_{s}^{\varvec{y}}\,\textrm{d}{\varvec{y}}&- \frac{1}{n} \sum _{i=1}^n q_{s}^{{\varvec{y}}^{(i)}} \Big \Vert _{{{\mathcal {Y}}}}^2 \le C_{s,{\varvec{\gamma }},\lambda }\, [\phi _\textrm{tot}(n)]^{-1/\lambda }, \end{aligned}$$
(6.16)
$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}\Vert S_s - S_{s,n}\Vert _{L^2(V;I)}^2&\le \mathbb {E}_{\varvec{\Delta }}\Big \Vert \int _{U_s} \exp \Big (\theta \,\Phi _{s}^{{\varvec{y}}}\Big )\,q_s^{\varvec{y}}\,\textrm{d}{\varvec{y}}- \frac{1}{n} \sum _{i=1}^n \exp \Big (\theta \,\Phi _{s}^{{\varvec{y}}^{(i)}}\Big )\,q_s^{{\varvec{y}}^{(i)}} \Big \Vert ^2_{{\mathcal {Y}}}\nonumber \\&\le C_{s,{\varvec{\gamma }},\lambda }\, [\phi _\textrm{tot}(n)]^{-1/\lambda }, \end{aligned}$$
(6.17)
$$\begin{aligned} \mathbb {E}_{\varvec{\Delta }}|T_s - T_{s,n}|^2&\le \mathbb {E}_{\varvec{\Delta }}\Big | \int _{U_s} \exp \Big (\theta \,\Phi _{s}^{{\varvec{y}}}\Big ) \,\textrm{d}{\varvec{y}}- \frac{1}{n} \sum _{i=1}^n \exp \Big (\theta \,\Phi _{s}^{{\varvec{y}}^{(i)}}\Big ) \Big |^2\nonumber \\&\le C_{s,{\varvec{\gamma }},\lambda }\, [\phi _{\textrm{tot}}(n)]^{-1/\lambda }, \end{aligned}$$
(6.18)

where \(\phi _{\textrm{tot}}(n)\) is the Euler totient function, with \(1/\phi _{\textrm{tot}}(n)\le 2/n\) when n is a prime power. Here \(C_{s,{\varvec{\gamma }},\lambda }\) is given by (6.14), with \(r_1 = 2\), \(r_2 = e\), \(\rho _j = b_j\) defined in (2.11), and \(C_0>0\) is independent of s, n, \(\lambda \) and weights \({\varvec{\gamma }}\) but depends on z, \(u_0\), \({\widehat{u}}\) and other constants.

With the choices

$$\begin{aligned} \lambda&= {\left\{ \begin{array}{ll} \frac{1}{2 - 2\delta } \hbox { for all}\ \delta \in (0,1) &{} \text {if } p\in \left( 0,\frac{2}{3}\right] \,, \\ \frac{p}{2-p} &{} \text {if } p\in \left( \frac{2}{3},1\right) \,, \end{array}\right. } \\ \gamma _\mathrm {\mathfrak {u}}&= \gamma _\mathrm {\mathfrak {u}}^* := \bigg ( (|\mathrm {\mathfrak {u}}|+r_1)!\, \prod _{j \in \mathrm {\mathfrak {u}}} \frac{r_2\, \rho _j}{\sqrt{2\zeta (2\lambda )/(2\pi ^2)^\lambda }} \bigg )^{2/(1+\lambda )}\,, \end{aligned}$$

we have that \(C_{s,{\varvec{\gamma }}^*,\lambda }\) is bounded independently of s. (However, \(C_{s,{\varvec{\gamma }}^*,\frac{1}{2-2\delta }} \rightarrow \infty \) as \(\delta \rightarrow 0\) and \(C_{s,{\varvec{\gamma }}^*,\frac{p}{2-p}} \rightarrow \infty \) as \(p \rightarrow (2/3)^+\).) Consequently, under the assumption (6.7), the above four mean-square errors are of order

$$\begin{aligned} \kappa (n) := {\left\{ \begin{array}{ll} {[}\phi _{\textrm{tot}}(n)]^{-(2-2\delta )} \hbox { for all}\ \delta \in (0,1) &{} \text {if } p\in \left( 0,\frac{2}{3}\right] \,, \\ {[}\phi _{\textrm{tot}}(n)]^{-(2/p-1)} &{} \text {if } p\in \left( \frac{2}{3},1\right) \,. \end{array}\right. } \end{aligned}$$
(6.19)

Proof

The mean-square error bounds are a corollary of Theorem 6.6 by applying the regularity bounds in Lemma 2.1, Theorems 4.2, 5.4 and 5.6. For simplicity we set \(C_0\), \(r_1\) and \(r_2\) to be the largest values arising from the four results.

We know from [35, Lemma 6.2] that for any \(\lambda \), \(C_{s,{\varvec{\gamma }},\lambda }\) defined in (6.14) is minimized by \(\gamma _\mathrm {\mathfrak {u}}= \gamma _\mathrm {\mathfrak {u}}^*\). By inserting \({\varvec{\gamma }}^*\) into \(C_{s,{\varvec{\gamma }},\lambda }\) we can then derive the condition \(p< \frac{2\lambda }{1+\lambda } <1\) for which \(C_{s,{\varvec{\gamma }}^*,\lambda }\) is bounded independently of s. This condition on \(\lambda \), together with \(\lambda \in (\frac{1}{2},1]\) and \(p \in (0,1)\) yields the result. \(\square \)
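The piecewise choice of \(\lambda\) and the resulting exponent in (6.19) can be encoded in a small helper (a sketch of ours; the function name and interface are not from the paper):

```python
def kappa_exponent(p, delta=1e-2):
    """Exponent r in kappa(n) = [phi_tot(n)]^{-r} from (6.19), following the
    choice of lambda in Theorem 6.8.  (Helper of ours, not from the paper.)
    delta in (0,1) is the user-chosen loss in the regime p <= 2/3."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    if p <= 2.0 / 3.0:
        return 2.0 - 2.0 * delta      # from lambda = 1/(2 - 2*delta)
    return 2.0 / p - 1.0              # from lambda = p/(2 - p)
```

Note that the two regimes match continuously: as \(p \downarrow 2/3\) the second branch gives \(2/p-1 \rightarrow 2\), which is the limit of the first branch as \(\delta \rightarrow 0\).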

6.3 Combined error bound

Combining the results in this section gives the final theorem.

Theorem 6.9

Let \(\alpha _1,\alpha _2 \ge 0\) and \(\alpha _3>0\) with \(\alpha _1+\alpha _2 >0\), and let the risk measure \({{\mathcal {R}}}\) be either the expected value or the entropic risk measure with \(\theta >0\). Denote by \(z^*\) the unique solution of (3.4) and by \(z^*_{s,n}\) the unique solution of the truncated problem in which the risk measure is approximated by a randomly shifted lattice rule. Then, if (6.7) and (6.8) hold, we have

$$\begin{aligned} \sqrt{\mathbb {E}_{\varvec{\Delta }}\Vert z^* - z^*_{s,n}\Vert _{L^2(V';I)}^2} \le C \Big (s^{-2/p+1} + \sqrt{\kappa (n)}\Big )\,, \end{aligned}$$

where \(\kappa (n)\) is given in (6.19), and the constant \(C>0\) is independent of s and n but depends on z, \(u_0\), \({\widehat{u}}\) and other constants.

Proof

This follows from (6.1)–(6.4), Remark 6.1, Theorems 6.4 and 6.8. \(\square \)

7 Numerical experiments

We consider the coupled PDE system

$$\begin{aligned}&{\left\{ \begin{array}{ll} \partial _t u^{\varvec{y}}({\varvec{x}},t)-\nabla \cdot (a^{\varvec{y}}({\varvec{x}})\nabla u^{\varvec{y}}({\varvec{x}},t))=z({\varvec{x}},t)\\ u^{\varvec{y}}({\varvec{x}},t)=0\\ u^{\varvec{y}}({\varvec{x}},0)=u_0({\varvec{x}}) \end{array}\right. } \end{aligned}$$
(7.1)

and

$$\begin{aligned}&{\left\{ \begin{array}{ll} -\partial _t q^{\varvec{y}}({\varvec{x}},t)-\nabla \cdot (a^{\varvec{y}}({\varvec{x}})\nabla q^{\varvec{y}}({\varvec{x}},t))=\alpha _1R_V(u^{\varvec{y}}({\varvec{x}},t)-{\widehat{u}}({\varvec{x}},t))\\ q^{\varvec{y}}({\varvec{x}},t)=0\\ q^{\varvec{y}}({\varvec{x}},T)=\alpha _2(u^{\varvec{y}}({\varvec{x}},T)-{\widehat{u}}({\varvec{x}},T)), \end{array}\right. } \end{aligned}$$
(7.2)

where the first equations in (7.1) and (7.2) hold for \({\varvec{x}}\in D\), \(t\in I\), \({\varvec{y}}\in U\), the second equations hold for \({\varvec{x}}\in \partial D\), \(t\in I\), \({\varvec{y}}\in U\), and the last equations hold for \({\varvec{x}}\in D\) and \({\varvec{y}}\in U\). We fix the physical domain \(D=(0,1)^2\) and the terminal time \(T=1\). The uncertain diffusion coefficient, defined as in (2.1), is independent of t, and parameterized in all experiments with mean field \(a_0({\varvec{x}})\equiv 1\) and the fluctuations

$$\begin{aligned} \psi _j({\varvec{x}})=\frac{1}{2} j^{-\vartheta }\sin (\pi jx_1)\sin (\pi j x_2)\quad \text {for }\vartheta >1~\text {and}~j\in {\mathbb {N}}. \end{aligned}$$

We use the implicit Euler finite difference scheme with step size \(\Delta t=\frac{T}{500}=2\cdot 10^{-3}\) to discretize the PDE system with respect to the temporal variable. The spatial part of the PDE system is discretized using a first-order finite element method with mesh size \(h=2^{-5}\), and the Riesz operator in the loading term corresponding to (7.2) is evaluated using (2.4). In all experiments, the lattice rules are generated using the fast CBC algorithm with weights chosen as in Theorem 6.8, where we used the parameter value \(\beta _1=1\) in (2.11).
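A much-simplified 1D analogue of this time stepping (our sketch, not the paper's 2D finite element code) illustrates the implicit Euler scheme on the heat equation \(u_t = u_{xx}\) with homogeneous Dirichlet conditions, for which the exact solution is known.

```python
import numpy as np

# Implicit Euler for u_t = u_xx on (0,1), u(0)=u(1)=0, u(x,0) = sin(pi*x),
# whose exact solution is u(x,t) = exp(-pi^2 t) * sin(pi*x).
M, dt, T = 64, 1e-3, 0.1
h = 1.0 / M
x = np.linspace(0.0, 1.0, M + 1)
u = np.sin(np.pi * x)

# (I - dt*A) u^{k+1} = u^k with A the second-difference Laplacian on the
# interior nodes (dense matrices for clarity; real codes use sparse solvers).
A = (np.diag(-2.0 * np.ones(M - 1))
     + np.diag(np.ones(M - 2), 1)
     + np.diag(np.ones(M - 2), -1)) / h**2
B = np.eye(M - 1) - dt * A

for _ in range(round(T / dt)):
    u[1:-1] = np.linalg.solve(B, u[1:-1])

exact = np.exp(-np.pi**2 * T) * np.sin(np.pi * x)
err = np.abs(u - exact).max()
```

Implicit Euler is unconditionally stable, so the step size is dictated by accuracy (first order in \(\Delta t\)) rather than by a CFL-type restriction.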

In the numerical experiments presented in Sects. 7.1–7.3, we choose

$$\begin{aligned} {\widehat{u}}({\varvec{x}},t)&:= \chi _{\Vert {\varvec{x}}-(c_1(t),c_2(t))\Vert _{\infty }\le \frac{1}{10}}({\varvec{x}})\,{\widehat{u}}_1({\varvec{x}},t)\\&\qquad + \chi _{\Vert {\varvec{x}}+(c_1(t),c_2(t))-(1,1)\Vert _{\infty }\le \frac{1}{10}}({\varvec{x}})\,{\widehat{u}}_2({\varvec{x}},t), \end{aligned}$$

where

$$\begin{aligned} {\widehat{u}}_1({\varvec{x}},t)&:=10240\bigg (x_1-c_1(t)-\frac{1}{10}\bigg )\bigg (x_2-c_2(t)-\frac{1}{10}\bigg )\\&\quad \times \bigg (x_1-c_1(t)+\frac{1}{10}\bigg )\bigg (x_2-c_2(t)+\frac{1}{10}\bigg ),\\ {\widehat{u}}_2({\varvec{x}},t)&:=10240\bigg (x_1+c_1(t)-\frac{11}{10}\bigg )\bigg (x_2+c_2(t)-\frac{11}{10}\bigg )\\&\quad \times \bigg (x_1+c_1(t)-\frac{9}{10}\bigg )\bigg (x_2+c_2(t)-\frac{9}{10}\bigg ),\\ c_1(t)&:=\frac{1}{2} + \frac{1}{4} (1-t^{10})\cos (4\pi t^2)\quad \text {and}\quad c_2(t):=\frac{1}{2} + \frac{1}{4} (1-t^{10})\sin (4\pi t^2). \end{aligned}$$

As the parameters appearing in the objective functional (2.3) and adjoint equation (7.2), we use \(\alpha _1=10^{-3}\), \(\alpha _2=10^{-2}\), and \(\alpha _3=10^{-7}\). Moreover, we always use

$$\begin{aligned} u_0({\varvec{x}})=\sin (2\pi x_1)\sin (2\pi x_2). \end{aligned}$$

In Sects. 7.1 and 7.2, we fix the source term

$$\begin{aligned} z({\varvec{x}},t)=10x_1(1-x_1)x_2(1-x_2) \end{aligned}$$

to assess the dimension truncation and QMC errors.

All computations were carried out on the computational cluster Katana supported by Research Technology Services at UNSW Sydney [28].

7.1 Dimension truncation error

The dimension truncation errors in Theorem 6.4 are estimated by approximating the quantities

$$\begin{aligned} \bigg \Vert \int _{U_{s'}}(u_{s'}^{\varvec{y}}-u_s^{\varvec{y}})\,\textrm{d}{\varvec{y}}\bigg \Vert _{L^2(V;I)}\quad \text {and}\quad \bigg \Vert \int _{U_{s'}}(q_{s'}^{\varvec{y}}-q_s^{\varvec{y}})\,\textrm{d}{\varvec{y}}\bigg \Vert _{L^2(V;I)} \end{aligned}$$

as well as

$$\begin{aligned} \Vert S_{s'}-S_s\Vert _{L^2(V;I)}\quad \text {and}\quad |T_{s'}-T_s| \end{aligned}$$

for \(s'\gg s\), using a tailored lattice cubature rule generated by the fast CBC algorithm with \(n=2^{15}\) nodes and a single fixed random shift to compute the high-dimensional parametric integrals. The obtained results are displayed in Figs. 1 and 2 for the fluctuations \((\psi _j)_{j\ge 1}\) corresponding to decay rates \(\vartheta \in \{1.3,2.6\}\) and dimensions \(s\in \{2^k\mid k\in \{1,\ldots ,9\}\}\). We use \(\theta =10\) in the computations corresponding to \(S_s\) and \(T_s\). As the reference solution, we use the solutions corresponding to dimension \(s'=2048=2^{11}\).

Fig. 1

The approximate dimension truncation errors corresponding to the state and adjoint PDEs

Fig. 2 The approximate dimension truncation errors corresponding to \(\Vert S_{s'}-S_s\Vert _{L^2(V;I)}\) and \(|T_{s'}-T_s|\)

The theoretical dimension truncation rate is readily observed in the case \(\vartheta =1.3\). In the case \(\vartheta =2.6\), the dimension truncation convergence rates degenerate for large values of s. This may be explained by the fact that the QMC cubature with \(n=2^{15}\) nodes has an error around \(10^{-8}\) (see Fig. 3 in Sect. 7.2), but the finite element discretization error may also be a contributing factor. For smaller values of s, the higher-order convergence is nonetheless apparent in the case \(\vartheta =2.6\).

7.2 QMC error

We investigate the QMC error rate by computing the root-mean-square approximations

$$\begin{aligned}&\sqrt{\frac{1}{R(R-1)}\sum _{r=1}^R \Vert ({\overline{Q}}-Q^{(r)})(u_s^{\cdot })\Vert _{L^2(V;I)}^2},\\&\sqrt{\frac{1}{R(R-1)}\sum _{r=1}^R \Vert ({\overline{Q}}-Q^{(r)})(q_s^{\cdot })\Vert _{L^2(V;I)}^2},\\&\sqrt{\frac{1}{R(R-1)}\sum _{r=1}^R \Vert ({\overline{Q}}-Q^{(r)})(\exp (\Phi _s^{\cdot })\,q_s^{\cdot })\Vert _{L^2(V;I)}^2},\\&\sqrt{\frac{1}{R(R-1)}\sum _{r=1}^R |({\overline{Q}}-Q^{(r)})(\exp (\Phi _s^{\cdot }))|^2}, \end{aligned}$$

corresponding to (6.15)–(6.18), where \({\overline{Q}}\) and \(Q^{(r)}\) are as in (6.10) for a randomly shifted lattice rule with cubature nodes (6.9) and random shifts \(\varvec{\Delta }\) drawn from \(U([0,1]^s)\). The generating vectors are constructed using the fast CBC algorithm with \(n=2^m\), \(m\in \{4,\ldots ,15\}\), lattice points; we use \(R=16\) random shifts and dimension \(s=100\). We carry out the experiments using two different decay rates \(\vartheta \in \{1.3,2.6\}\) for the input random field. The results are displayed in Fig. 3. The root-mean-square error converges at close to the linear rate in all experiments, which is consistent with the theory.

Fig. 3 Left: The approximate root-mean-square error for the QMC approximation of the integrals \(\int _{U_s}u_s^{\varvec{y}}\,\textrm{d}{\varvec{y}}\) and \(\int _{U_s}q_s^{\varvec{y}}\,\textrm{d}{\varvec{y}}\). Right: The approximate root-mean-square error for the QMC approximation of the quantities \(S_s\) and \(T_s\). All computations were carried out using \(R=16\) random shifts, \(n=2^m\), \(m\in \{4,\ldots ,15\}\), and dimension \(s=100\)
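The root-mean-square estimator over random shifts can be sketched as follows for a scalar toy integrand (the integrand and the randomly chosen generating vector are illustrative assumptions; a CBC-constructed vector would be used in practice):

```python
import numpy as np

rng = np.random.default_rng(2)

def f(y):
    """Smooth toy integrand on [0,1]^s standing in for exp(Phi_s)."""
    s = y.shape[1]
    return np.exp((y - 0.5) @ (np.arange(1, s + 1) ** (-2.0)))

def rms_error(f, z, n, R, s, rng):
    """Root-mean-square error estimate over R i.i.d. random shifts:
    sqrt( 1/(R(R-1)) * sum_r (Qbar - Q^(r))^2 )."""
    k = np.arange(n)[:, None]
    Q = np.empty(R)
    for r in range(R):
        shift = rng.random(s)
        y = (k * z[None, :] / n + shift[None, :]) % 1.0  # shifted lattice nodes
        Q[r] = f(y).mean()                               # Q^(r)
    Qbar = Q.mean()
    return np.sqrt(np.sum((Qbar - Q) ** 2) / (R * (R - 1)))

s, R = 100, 16
rms = {}
for m in (4, 8, 12):
    n = 2**m
    z = 2 * rng.integers(0, n // 2, size=s) + 1  # odd stand-in generating vector
    rms[m] = rms_error(f, z, n, R, s, rng)
    print(m, rms[m])  # the error estimate decreases as n grows
```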

7.3 Optimal control problem

We consider the problem of finding an optimal control \(z\in {\mathcal {Z}}\) that minimizes (2.3) subject to the PDE constraint (2.2). We carry out constrained optimization over \({\mathcal {Z}}=\{z\in L^2(V';I):\Vert z\Vert _{L^2(V';I)}\le 2\}\) and compare our results with the reconstruction obtained by unconstrained optimization over \({\mathcal {Z}}=L^2(V';I)\). To this end, we define the projection operator

$$\begin{aligned} {\mathcal {P}}(w):=\min \bigg \{1,\frac{2}{\Vert w\Vert _{L^2(V;I)}}\bigg \}w\quad \text {for}~w\in L^2(V;I) \end{aligned}$$

which is used in the constrained setting, while in the unconstrained setting we use \({\mathcal {P}}:={\mathcal {I}}_{L^2(V;I)}\). The operator \({\mathcal {P}}\) acts on \(L^2(V;I)\) and hence it is different from the operator \(P_{{\mathcal {Z}}}\) introduced in Sect. 3.3, which projects onto \({\mathcal {Z}}\).
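After discretization, this projection is a simple rescaling onto the ball of radius 2. A minimal sketch on a uniform space-time grid, where the discrete \(L^2\) norm uses uniform quadrature weights (an assumption; any consistent quadrature would do):

```python
import numpy as np

def l2_norm(w, dx, dt):
    """Discrete L^2 norm over space-time with uniform quadrature weights."""
    return np.sqrt(np.sum(w**2) * dx * dt)

def project(w, dx, dt, radius=2.0):
    """P(w) = min{1, radius/||w||} w: rescale w onto the L^2 ball of the given
    radius, leaving it unchanged if it is already feasible."""
    nrm = l2_norm(w, dx, dt)
    if nrm <= radius:
        return w
    return (radius / nrm) * w

# Example: a constant function with norm 5 is rescaled to norm exactly 2.
dx = dt = 1.0 / 64
w = 5.0 * np.ones((64, 64))
print(l2_norm(project(w, dx, dt), dx, dt))  # 2.0
```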

Algorithm 1 Projected gradient descent

Algorithm 2 Projected Armijo rule

Fig. 4 The inverse Riesz transform \(R_V^{-1}z^*\) of the reconstructed optimal control \(z^*\) using the entropic risk measure for several values of t in the constrained setting

Fig. 5 Left: The inverse Riesz transform of the control at time \(t=1\) in the constrained setting after 25 iterations of the projected gradient descent algorithm using the entropic risk measure. Right: The difference between the reconstruction obtained in the constrained setting and the corresponding solution in the unconstrained setting

Fig. 6 The value of the objective functional for each gradient descent iteration. The results corresponding to the constrained setting and the unconstrained setting are plotted in blue and red, respectively (color figure online)

To be able to handle elements of \({\mathcal {Z}}\) numerically, we apply the projected gradient method (see, e.g., [27]) as described in Algorithm 1 together with the projected Armijo rule stated in Algorithm 2. Note that evaluating \(J(R_Vw)\) and \(J'(R_Vw)\) in Algorithms 1 and 2 requires solving the state PDE with the source term \(R_Vw\). In particular, the Riesz operator appears in the loading term after finite element discretization and can thus be evaluated using (2.4). We use the initial guess \(w_0=0\). The parameters of the gradient descent method were chosen to be \(\eta _0=100\), \(\gamma =10^{-4}\), and \(\beta =0.1\).
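For a self-contained illustration, the sketch below runs the projected gradient method with the projected Armijo rule, using the stated parameters \(\eta _0=100\), \(\gamma =10^{-4}\), \(\beta =0.1\), on a toy quadratic objective with the same ball constraint; the finite-dimensional objective is an assumption standing in for evaluating \(J(R_Vw)\) via the state PDE.

```python
import numpy as np

def project(w, radius=2.0):
    """Projection onto the Euclidean ball {||w|| <= radius} (stand-in for P)."""
    nrm = np.linalg.norm(w)
    return w if nrm <= radius else (radius / nrm) * w

# Toy objective J(w) = 0.5 ||w - w_star||^2 with minimizer outside the ball,
# so the constraint is active at the solution.
w_star = np.array([3.0, 4.0])
J = lambda w: 0.5 * np.dot(w - w_star, w - w_star)
dJ = lambda w: w - w_star

def armijo_step(w, eta0=100.0, gamma=1e-4, beta=0.1, max_tries=50):
    """Projected Armijo rule: shrink eta until the sufficient decrease condition
    J(P(w - eta dJ(w))) <= J(w) - (gamma/eta) ||P(w - eta dJ(w)) - w||^2 holds."""
    eta = eta0
    for _ in range(max_tries):
        w_new = project(w - eta * dJ(w))
        if J(w_new) <= J(w) - (gamma / eta) * np.dot(w_new - w, w_new - w):
            return w_new
        eta *= beta
    return w_new  # fall back to the last trial point

w = np.zeros(2)              # initial guess w_0 = 0
for _ in range(25):          # projected gradient descent iterations
    w = armijo_step(w)
print(w)  # approaches (1.2, 1.6), the projection of w_star onto the ball
```

Here the unconstrained minimizer \(w^*=(3,4)\) has norm 5, so the iterates settle at its radial projection onto the ball of radius 2, illustrating how an active norm constraint changes the reconstruction.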

We consider the entropic risk measure with \(\theta =10\) and set \(\vartheta = 1.3\). The reconstructed optimal control obtained using the bounded set of feasible controls \({\mathcal {Z}}\) is displayed in Fig. 4. The reconstructed optimal control at the terminal time \(T=1\) and its pointwise difference to the control obtained without imposing control constraints are displayed in Fig. 5. Finally, the evolution of the objective functional as the number of gradient descent iterations increases is plotted in Fig. 6 for the constrained and unconstrained optimization problems.

8 Conclusion

We developed a specially designed QMC method for an optimal control problem subject to a parabolic PDE with an uncertain diffusion coefficient. To account for the uncertainty, we considered as risk measures the expected value and the more conservative (nonlinear) entropic risk measure. For the high-dimensional integrals originating from these risk measures, we derived error bounds and convergence rates with respect to both the dimension truncation and the QMC approximation. In particular, after dimension truncation, the QMC error bounds do not depend on the number of uncertain variables, while attaining faster convergence rates than Monte Carlo methods. In addition, we extended existing QMC error bounds to separable Banach spaces, which makes the presented error analysis discretization invariant.