1 Introduction

Our objective is to develop efficient first-order algorithms for the solution of PDE-constrained optimization problems of the type

$$\begin{aligned} \min _{x,u} F(x) + Q(u) + G(Kx) \quad \text {subject to}\quad B(u, w; x)=Lw \quad \text {for all}\quad w, \end{aligned}$$

where K is a linear operator and the functions F, G, and Q are convex but the first two possibly nonsmooth. The functionals B and L model a partial differential equation in weak form, parametrised by x; for example, \(B(u, w; x)=\langle \nabla u,x\nabla w\rangle \).

Semismooth Newton methods [28, 30] are conventionally used for such problems when a suitable reformulation exists [19,20,21, 34, 35]. Reformulations may not always be available, however, or may not yield effective algorithms. The solution of large linear systems may also pose scaling challenges. Therefore, first-order methods for PDE-constrained optimization have been proposed [6,7,8, 27] based on the primal-dual proximal splitting (PDPS) of [5]. The original version applies to convex problems of the form

$$\begin{aligned} \min _x F(x) + G(Kx). \end{aligned}$$
(1)

The primal-dual expansion permits efficient treatment of \(G \circ K\) for nonsmooth G. In [6,7,8, 27] K may be nonlinear, such as the solution operator of a nonlinear PDE.

However, first-order methods generally require a very large number of iterations to exhibit convergence. If the iterations are cheap, they can nevertheless achieve good performance; if the iterations are expensive, such as when a PDE needs to be solved on each step, their performance can be poor. Therefore, especially in inverse problems research, Gauss–Newton-type approaches are common for (1) with nonlinear K; see, e.g., [10, 22, 39]. They are simple: first linearise K, then apply a convex optimization method or, in the simplest cases, a linear system solver; then repeat. Even when a first-order method is used for the subproblem, Gauss–Newton methods can be significantly faster than full first-order methods [22], when they converge at all [36]. This stems from the single practical difference between the PDPS for nonlinear K and Gauss–Newton applied to (1) with the PDPS for the inner problems: the former re-linearises and factors K on each PDPS iteration, the latter only on each outer Gauss–Newton iteration.

In this work, we avoid forming and factorizing the PDE solution operators altogether by running an iterative solver for the constantly adapting PDE simultaneously with the optimization method. This may be compared to the approach to bilevel optimization in [32]. We concentrate on the simple Jacobi and Gauss–Seidel splitting methods for the PDE, while the optimization method is based on the PDPS, as we describe in Sect. 2. We prove convergence in Sect. 3 using the testing approach introduced in [37] and further elucidated in [9]. We explain how standard splittings and PDEs fit into the framework in Sect. 4, and finish with numerical experiments in Sect. 5.

Pseudo-time-stepping one-shot methods have been introduced in [33] and further studied, among others, in [2, 13,14,15,16,17, 24, 31]. A “one-shot” approach, as opposed to an “all-at-once” approach, solves the PDE constraints on each step, instead of considering them part of a unified system of optimality conditions. The aforementioned works solve these constraints inexactly through “pseudo-”time-stepping. This corresponds to the trivial split \(A_x=(A_x-{\text {Id}})+{\text {Id}}\) where \(A_x\) is such that \(\langle A_xu,w\rangle =B(u, w; x)\). We will, instead, apply Jacobi, Gauss–Seidel or even (quasi-)conjugate gradient splitting on \(A_x\). In [2, 13] Jacobi and Gauss–Seidel updates are used for the control variable, but not for the PDEs. The authors of [17] come closest to introducing non-trivial splitting of the PDEs via Hessian approximation. However, they and the other aforementioned works generally restrict themselves to smooth problems and employ gradient descent, Newton-type methods, or sequential quadratic programming (SQP) for the control variable x. Our focus is on nonsmooth problems involving, in particular, total variation regularization \(G(Kx)=\Vert \nabla x\Vert _1\).

1.1 Notation and basic results

Let X be a normed space. We write \(\langle \,\varvec{\cdot }\,|\,\varvec{\cdot }\,\rangle \) for the dual product and, in a Hilbert space, \(\langle \,\varvec{\cdot }\,,\,\varvec{\cdot }\,\rangle \) for the inner product. The order of the arguments in the dual product is not important when the action is obvious from context. For X a Hilbert space, we denote by \({\text {In}}_X: X \hookrightarrow X^*\) the canonical injection, \(\langle {\text {In}}_X x|{\tilde{x}}\rangle =\langle x,{\tilde{x}}\rangle \) for all \(x, {\tilde{x}} \in X\).

We write \(\mathbb {L}(X; Y)\) for the space of bounded linear operators between X and Y. We write \({\text {Id}}_X = {\text {Id}}\in \mathbb {L}(X; X)\) for the identity operator on X. If \(M \in \mathbb {L}(X; X^*)\) is non-negative and self-adjoint, i.e., \(\langle Mx|y\rangle =\langle x|My\rangle \) and \(\langle x|Mx\rangle \ge 0\) for all \(x, y \in X\), we define \(\Vert x\Vert _M {:}{=}\sqrt{\langle x|Mx\rangle }\). Then the three-point identity holds:

$$\begin{aligned} \langle M(x-y)|x-z\rangle = \frac{1}{2}\Vert x-y\Vert ^2_M - \frac{1}{2}\Vert y-z\Vert ^2_M + \frac{1}{2}\Vert x-z\Vert ^2_M \qquad \text {for all } x,y,z\in X. \end{aligned}$$
(2)
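
In finite dimensions the identity (2) is straightforward to verify numerically. The following sketch (a hypothetical random instance; NumPy) checks it for a non-negative self-adjoint M:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
M = A.T @ A                          # non-negative and self-adjoint
x, y, z = rng.standard_normal((3, n))

def sq_norm_M(v):                    # ||v||_M^2 = <v, M v>
    return v @ M @ v

lhs = (M @ (x - y)) @ (x - z)
rhs = 0.5 * sq_norm_M(x - y) - 0.5 * sq_norm_M(y - z) + 0.5 * sq_norm_M(x - z)
print(np.isclose(lhs, rhs))          # the three-point identity (2)
```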

We extensively use the vector Young’s inequality

$$\begin{aligned} \langle x|y\rangle \le \frac{1}{2a}\Vert x\Vert _X^2 + \frac{a}{2}\Vert y\Vert _{X^*}^2 \quad (x \in X,\,y \in X^*,\, a>0). \end{aligned}$$
(3)

These expressions hold in Hilbert spaces also with the inner product in place of the dual product. We write \(M^{\star }\) for the inner product adjoint of M, and \(M^*\) for the dual product adjoint.

We write \({{\,\textrm{dom}\,}}F\) for the effective domain, and \(F^*\) for the Fenchel conjugate of \(F: X \rightarrow {\overline{\mathbb {R}}}{:}{=}[-\infty , \infty ]\). We write \(F'(x) \in X^*\) for the Fréchet derivative at x when it exists, and, if X is a Hilbert space, \(\nabla F(x) \in X\) for its Riesz representation. For convex F on a Hilbert space X, we write \(\partial F(x) \subset X\) for the subdifferential at \(x \in X\) (or, more precisely, the corresponding set of Riesz representations, but aside from a single proof in Appendix A, we will not be needing subderivatives as elements of \(X^*\)). We then define the proximal map

$$\begin{aligned} {{\,\textrm{prox}\,}}_{F}(x) {:}{=}({\text {Id}}+ \partial F)^{-1}(x) = \mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{{\tilde{x}}\in X}\left\{ F({\tilde{x}}) + \frac{1}{2}\Vert {\tilde{x}} - x\Vert _{X}^2\right\} ,\quad x \in X. \end{aligned}$$
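
For example, for \(F = \tau \Vert \,\varvec{\cdot }\,\Vert _1\) on \(X=\mathbb {R}^n\) the proximal map is componentwise soft thresholding. A minimal sketch (hypothetical values; NumPy):

```python
import numpy as np

def prox_l1(x, tau):
    """Proximal map of tau*||.||_1: componentwise soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([1.5, -0.3, 0.8])
p = prox_l1(x, tau=0.5)
# x - p lies in tau times the subdifferential of ||.||_1 at p,
# which is the optimality condition of the arg-min definition above
```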

We denote the \(\{0,\infty \}\)-valued indicator function of a set A by \(\delta _A\).

We occasionally apply operations on \(x \in X\) to all elements of sets \(A \subset X\), writing \(\langle x+A|z\rangle {:}{=}\{\langle x+a|z\rangle \mid a \in A \}\). For \(B \subset \mathbb {R}\), we write \(B \ge c\) if \(b \ge c\) for all \(b \in B\).

On a Lipschitz domain \(\Omega \subset \mathbb {R}^n\), we write \({{\,\textrm{trace}\,}}_{\partial \Omega } \in \mathbb {L}(H^1(\Omega ); L^2(\partial \Omega ))\) for the trace operator on the boundary \(\partial \Omega \).

We list various symbols introduced and used throughout the manuscript in a table in Appendix B.

2 Problem and proposed algorithm

We start by introducing in detail the type of problem we are trying to solve. We then rewrite in Sect. 2.1 its optimality conditions in a form suitable for developing our proposed method in Sect. 2.3. Before this we recall the structure and derivation of the basic PDPS in Sect. 2.2.

2.1 Problem description

Our objective is to solve

$$\begin{aligned} \min _x J(x) {:}{=}F(x) + Q(S(x)) + G(K x), \end{aligned}$$
(4)

where \(F: X \rightarrow {\overline{\mathbb {R}}}\), \(G: Y \rightarrow {\overline{\mathbb {R}}}\), and \(Q: U \rightarrow \mathbb {R}\) are convex, proper, and lower semicontinuous on Hilbert spaces X, U, and Y with Q Fréchet differentiable. We assume \(K \in \mathbb {L}(X; Y)\) while \(S: X \ni x \mapsto u \in U\) is a solution operator of the weak PDE

$$\begin{aligned} B(u, w; x) = Lw \quad \text {for all}\quad w \in W. \end{aligned}$$
(5)

Here \(L \in U^*\) and \(B: U \times W \times X \rightarrow \mathbb {R}\) is continuous, and affine-linear-affine in its three arguments. The space W is Hilbert, possibly distinct from U to model nonhomogeneous boundary conditions. For this initial development, we tacitly assume that a unique solution S(x) and gradient \(\nabla S(x)\) exist for all \(x \in {{\,\textrm{dom}\,}}F\); later in the manuscript we neither directly impose this restriction nor use S.

Example 2.1

(A linear PDE) On a Lipschitz domain \(\Omega \subset \mathbb {R}^n\), consider the PDE

$$\begin{aligned} \left\{ \begin{array}{ll} \nabla \cdot \nabla u = x, &{} \text {on } \Omega ,\\ u = g, &{} \text {on } \partial \Omega . \end{array} \right. \end{aligned}$$

For the weak form (5) we can take the spaces \(U = H^1(\Omega )\), \(W = H_0^1(\Omega ) \times H^{1/2}(\partial \Omega )\), and \(X = L^2(\Omega )\). Writing \(w = (w_\Omega , w_\partial )\), we then set

$$\begin{aligned} B(u, w; x) = \langle \nabla u,\nabla w_\Omega \rangle _{L^2(\Omega )} - \langle x,w_\Omega \rangle _{L^2(\Omega )} + \langle {{\,\textrm{trace}\,}}_{\partial \Omega } u,w_\partial \rangle _{L^2(\partial \Omega )} \end{aligned}$$

and \( Lw {:}{=}\langle g,w_\partial \rangle _{L^2(\partial \Omega )}. \)

Example 2.2

(A nonlinear PDE) On a Lipschitz domain \(\Omega \subset \mathbb {R}^n\), consider the PDE

$$\begin{aligned} \left\{ \begin{array}{ll} \nabla \cdot (x\nabla u) = 0, &{} \text {on } \Omega ,\\ u = g, &{} \text {on } \partial \Omega . \end{array} \right. \end{aligned}$$

For the weak form (5) we can take the spaces \(U \subset H^1(\Omega )\), \(W \subset H_0^1(\Omega ) \times H^{1/2}(\partial \Omega )\), and \(X \subset L^2(\Omega )\), such that at least one of these subspaces ensures that the corresponding x, \(\nabla u\), or \(\nabla w\) lies in the relevant \(L^\infty \) space. In practice, this requires one of the subspaces to be finite-dimensional, or X to be \(H^k(\Omega )\) for \(k > n/2\), such that the boundedness of \(\Omega \) and Sobolev’s inequalities provide the \(L^\infty \) bound. The latter is an option in infinite-dimensional theory, but in finite-dimensional realisations, it is desirable to use a standard 2-norm in X, as proximal operators and gradient steps with respect to \(H^k\)-norms (for \(k > 0\)) are computationally expensive. Writing \(w = (w_\Omega , w_\partial )\), we then set

$$\begin{aligned} B(u, w; x) = \langle x\nabla u,\nabla w_\Omega \rangle _{L^2(\Omega )} + \langle {{\,\textrm{trace}\,}}_{\partial \Omega } u,w_\partial \rangle _{L^2(\partial \Omega )} \quad \text {and}\quad Lw {:}{=}\langle g,w_\partial \rangle _{L^2(\partial \Omega )}. \end{aligned}$$

To ensure the coercivity of \(B(\,\varvec{\cdot }\,, \,\varvec{\cdot }\,; x)\), and hence the existence of unique solutions to (5), we will further need to restrict x through \({{\,\textrm{dom}\,}}F\).

We require the sum and chain rules for convex subdifferentials to hold on \(F + G \circ K\). This is the case when

$$\begin{aligned} \text {there exists an } x\in {{\,\textrm{dom}\,}}(G\circ K) \cap {{\,\textrm{dom}\,}}F \text { with } Kx\in {{\,\textrm{int}\,}}({{\,\textrm{dom}\,}}G). \end{aligned}$$
(6)

We refer to [9] for basic results and concepts of infinite-dimensional convex analysis. Then by the Fréchet differentiability of Q and the compatibility of limiting (Mordukhovich) subdifferentials (denoted \(\partial _{M}\)) with Fréchet derivatives and convex subdifferentials [9, 29],

$$\begin{aligned} \partial _{M} J(x) = \partial F(x) + \nabla S(x)^{\star } \nabla Q(S(x)) + K^{\star } \partial G(Kx). \end{aligned}$$

Therefore, the Fermat principle for limiting subdifferentials and simple rearrangements (see [6, 36] or [9, Chapter 15]) establish for (4) in terms of \((\bar{u}, \bar{w}, \bar{x}, \bar{y}) \in U \times W \times X \times Y\) the necessary first-order optimality condition

$$\begin{aligned} \left\{ \begin{aligned} {\bar{u}}&= S({\bar{x}}), \\ - \nabla S(\bar{x})^{\star } \nabla Q({\bar{u}}) - K^{\star } \bar{y}&\in \partial F(\bar{x}), \\ K \bar{x}&\in \partial G^*(\bar{y}). \end{aligned}\right. \end{aligned}$$
(7)

We recall that \(G^*: Y \rightarrow {\overline{\mathbb {R}}}\) is the Fenchel conjugate of G.

The term \(\nabla S(\bar{x})^{\star }\nabla Q({\bar{u}})\) involves the solution \(\bar{u}\) to the original PDE and the solution \(\bar{w}\) to an adjoint PDE. We derive it from a primal-dual reformulation of (4). To do this, we first observe that since B is affine in x, it can be decomposed as

$$\begin{aligned} B(u, w; x) = B_x(u, w; x) + B_{{\textrm{const}}}(u, w), \end{aligned}$$
(8)

where \(B_x: U \times W \times X \rightarrow \mathbb {R}\) is affine-linear-linear, and \(B_{{\textrm{const}}}: U \times W \rightarrow \mathbb {R}\) is affine-linear. Indeed \(B_{{\textrm{const}}}(u, w) = B(u, w; 0)\) and \(B_x(u, w; x) = B(u, w; x) - B(u, w; 0)\). We then introduce the Riesz representation \({\bar{\nabla }}_x B(u, w)\) of \(B_x(u, w; \,\varvec{\cdot }\,) \in X^*\). Thus

$$\begin{aligned} \langle {\bar{\nabla }}_x B(u, w),x\rangle _X = B_x(u, w; x) \quad \text {for all } u \in U,\, w \in W,\, x \in X. \end{aligned}$$
(9)

We have \(\nabla _x B(u, w; x) \equiv {\bar{\nabla }}_x B(u, w) \in X\) for all \(x \in X\).
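
In a discretization, \({\bar{\nabla }}_x B(u, w)\) is typically cheap to form explicitly. As a sketch, assume a finite-difference instance of Example 2.2 without the boundary term, \(B_x(u, w; x) = \langle {\text {diag}}(x)Du,Dw\rangle \) for a difference matrix D; then (9) gives \({\bar{\nabla }}_x B(u, w) = (Du)\odot (Dw)\), the componentwise product:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
D = np.diff(np.eye(n), axis=0)       # (n-1) x n forward differences
u, w = rng.standard_normal((2, n))
x = rng.standard_normal(n - 1)

B_x = (x * (D @ u)) @ (D @ w)        # B_x(u, w; x) = <diag(x) Du, Dw>
g = (D @ u) * (D @ w)                # candidate Riesz representation
print(np.isclose(B_x, x @ g))        # <g, x>_X = B_x(u, w; x), as in (9)
```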

Clearly, also, \(B_x\) is an abbreviation for \((u, w; x) \rightarrow D_x B(u, w, 0)(x)\), where, just here, we write \(D_x\) for the Fréchet derivative with respect to x. Likewise we write \(B_u\) to abbreviate \((u, w; x) \rightarrow D_u B(0, w, x)(u)\), and \(B_{xu}\) to abbreviate \((u, w; x) \rightarrow D_uB_x(0, w, x)(u)\). If \(B\) is linear in u, then \(B_u=B\); and if \(B\) is linear in both u and x, then \(B_{xu}= B\).

We may now write (4) asFootnote 1

$$\begin{aligned} \min _{x,u}\max _{w}~ F(x) + Q(u) + B(u, w; x)-Lw + G(Kx) \end{aligned}$$
(10)

or

$$\begin{aligned} \min _{x,u}\max _{w,y}~ F(x) + Q(u) + B(u, w; x)-Lw + \langle K x,y\rangle _Y - G^*(y). \end{aligned}$$
(11)

In terms of \(({\bar{u}}, \bar{w}, {\bar{x}}, {\bar{y}}) \in U \times W \times X \times Y\), subject to a qualification condition, this problem has the necessary first-order optimality conditions

$$\begin{aligned} \left\{ \begin{aligned} B(\bar{u}, {\tilde{w}}; \bar{x})&= L{\tilde{w}}{} & {} \text {for all}\quad {\tilde{w}} \in W, \\ B_u({\tilde{u}}, \bar{w}; \bar{x})&= -Q'({\bar{u}}){\tilde{u}}{} & {} \text {for all}\quad {\tilde{u}} \in U,\\ -{\bar{\nabla }}_x B({\bar{u}}, \bar{w}) - K^{\star } \bar{y}&\in \partial F(\bar{x}), \\ K\bar{x}&\in \partial G^*(\bar{y}). \end{aligned}\right. \end{aligned}$$
(12)

This is our principal form of optimality conditions for (4).

It is easy to see that (12) are necessary for \((\bar{u}, \bar{w}, \bar{x}, \bar{y})\) to be a saddle point of (11). The next theorem shows, subject to qualification conditions, that (12) are also necessary for a solution to (11) (which may not be a saddle point in the non-convex-concave setting). Note that \(w \in W\) is inconsequential in (11). If one choice forms a part of a solution of the problem, so does any other (or else the problem has no solution at all). However, \(\bar{w}\) solving (12) is more precisely determined.

Theorem 2.3

Suppose \(({\bar{u}}, w, {\bar{x}}, {\bar{y}}) \in U \times W \times X \times Y\) solve (11). If, moreover, \({{\,\textrm{int}\,}}{{\,\textrm{dom}\,}}[F + G \circ K] \ne \emptyset \), and, for some \(c>0\),

$$\begin{aligned} \sup _{\Vert (h_x, h_u)\Vert =1} B_x({\bar{u}}, w; h_x) + B_u(h_u, w; {\bar{x}}) \ge c \Vert w\Vert \quad \text {for all}\quad w \in W \quad \text {and} \nonumber \\\end{aligned}$$
(13a)
$$\begin{aligned} B_u({\tilde{u}}, w; {\bar{x}}) = 0 \ \text {for all}\ {\tilde{u}} \implies B_x({\bar{u}}, w; x) = 0 \ \text {for all}\ x \in {{\,\textrm{dom}\,}}(F + G \circ K),\nonumber \\ \end{aligned}$$
(13b)

then (12) holds for some \(\bar{w} \in W\).

After an affine shift and restriction of x to a subspace, the condition \({{\,\textrm{int}\,}}{{\,\textrm{dom}\,}}[F + G \circ K] \ne \emptyset \) can always be relaxed to the corresponding relative interior being non-empty. Since the proof of Theorem 2.3 is long and depends on techniques not needed in our main line of work, we relegate it to Appendix A.

Example 2.4

If \(W=U\), taking \(h_u=w/\Vert w\Vert \) and \(h_x=0\), we see that the qualification conditions (13) hold when \(B_u(\,\varvec{\cdot }\,,\,\varvec{\cdot }\,; {\bar{x}})\) is coercive. Similarly, also when \(W \ne U\), if the weak coercivity conditions of the Babuška–Lax–Milgram theorem hold for \((w, h_u) \mapsto B_u(h_u, w; {\bar{x}})\), then so do (13).

The second line of (12) is the adjoint PDE, needed for \(\nabla S(\bar{x})^{\star } \nabla Q({\bar{u}})\) in (7):

Corollary 2.5

Suppose (13) hold for \(\bar{x} = x \in X\), some \(w \in W\), and \(\bar{u} = u\) a unique solution to (5). Then the solution operator S of (5) satisfies for all \(z \in U\) that

$$\begin{aligned} \nabla S(x)^{\star } z = {\bar{\nabla }}_x B(u, w) \ \text {where} \ u = S(x) \ \text {and}\ \left\{ \begin{array}{l} w \text { solves the weak adjoint PDE:}\\ B_u({\tilde{u}}, w; x) = - \langle z,{\tilde{u}}\rangle \text { for all } {\tilde{u}} \in U. \end{array}\right. \end{aligned}$$

Proof

Take \(F \equiv 0\), \(K={\text {Id}}\), \(G \equiv \delta _{\{x\}}\), and \(Q = \langle z,\,\varvec{\cdot }\,\rangle _U \). Then any solution \(({\bar{u}}, w, {\bar{x}}, y)\) to (11) has \(\bar{x} = x\). Since \(G^*({\tilde{y}})=\langle x,{\tilde{y}}\rangle \), any choice of y and w solves (11). Therefore, Theorem 2.3 applied to the problem we just constructed shows that

$$\begin{aligned} B_u({\tilde{u}}, w; x) = -\langle z,{\tilde{u}}\rangle _U \ \text {for all}\ {\tilde{u}} \in U \quad \text {and}\quad -{\bar{\nabla }}_x B(u, w) - y = 0. \end{aligned}$$

On the other hand, (7) reduces to some y satisfying \( - \nabla S(x)^{\star } z - y = 0. \) Comparing these two expressions, we obtain the claim. \(\square \)
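
Corollary 2.5 is the familiar adjoint-state recipe: one state solve and one adjoint solve yield \(\nabla S(x)^{\star }z\) without forming \(\nabla S(x)\). The following sketch (an assumed discrete instance with \(B(u, w; x) = \langle D^\top {\text {diag}}(x)Du + \varepsilon u,w\rangle \), where the \(\varepsilon \)-term only ensures invertibility) checks the formula against finite differences:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
D = np.diff(np.eye(n), axis=0)       # (n-1) x n forward differences
eps = 0.5                            # zeroth-order term for invertibility
b, z = rng.standard_normal((2, n))
x = 1.0 + rng.random(n - 1)          # coefficients bounded away from zero

def A(x):                            # <A(x)u, w> = B(u, w; x); symmetric
    return D.T @ np.diag(x) @ D + eps * np.eye(n)

u = np.linalg.solve(A(x), b)         # state equation (5)
w = np.linalg.solve(A(x), -z)        # adjoint equation: B_u(., w; x) = -<z, .>
grad = (D @ u) * (D @ w)             # bar-nabla_x B(u, w) = S'(x)* z

# finite-difference check of x -> <z, S(x)>
h = 1e-6
fd = np.array([(z @ np.linalg.solve(A(x + h * np.eye(n - 1)[i]), b) - z @ u) / h
               for i in range(n - 1)])
print(np.allclose(grad, fd, atol=1e-3))
```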

2.2 Primal-dual proximal splitting: a recap

The primal-dual proximal splitting (PDPS) for (1) is based on the optimality conditions

$$\begin{aligned} \left\{ \begin{aligned} - K^{\star } \bar{y}&\in \partial F(\bar{x}), \\ K\bar{x}&\in \partial G^*(\bar{y}). \end{aligned}\right. \end{aligned}$$
(14)

These are just the last two lines of (12) without \({\bar{\nabla }}_x B\). As derived in [9, 18, 37], the basic (unaccelerated) PDPS solves (14) by iteratively solving for each \(k \in \mathbb {N}\) the system

$$\begin{aligned} \left\{ \begin{aligned} 0&\in \tau \partial F({x}^{k+1}) + \tau K^{\star }y^k + {x}^{k+1} - x^k \\ 0&\in \sigma \partial G^*({y}^{k+1}) - \sigma K[{x}^{k+1} + \omega ({x}^{k+1}-x^k)] + {y}^{k+1} - y^k, \end{aligned}\right. \end{aligned}$$
(15)

where the primal and dual step length parameters \(\tau , \sigma >0\) satisfy \(\tau \sigma \Vert K\Vert ^2 < 1\), and the over-relaxation parameter \(\omega =1\). We can write (15) in explicit form as

$$\begin{aligned} \left\{ \begin{aligned} {x}^{k+1}&{:}{=}{{\,\textrm{prox}\,}}_{\tau F}\bigl (x^k - \tau K^{\star }y^k\bigr ),\\ {y}^{k+1}&{:}{=}{{\,\textrm{prox}\,}}_{\sigma G^*}\bigl (y^k + \sigma K[{x}^{k+1} + \omega ({x}^{k+1}-x^k)]\bigr ). \end{aligned}\right. \end{aligned}$$
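
As a concrete illustration, the following minimal NumPy sketch applies these two updates to an assumed toy instance: one-dimensional total variation denoising with \(F(x)=\frac{1}{2}\Vert x-b\Vert ^2\), \(G=\lambda \Vert \,\varvec{\cdot }\,\Vert _1\), and K a forward-difference matrix, so that \({{\,\textrm{prox}\,}}_{\sigma G^*}\) is the projection onto \([-\lambda ,\lambda ]^{n-1}\) and \({{\,\textrm{prox}\,}}_{\tau F}\) has the closed form \(v \mapsto (v+\tau b)/(1+\tau )\):

```python
import numpy as np

def pdps(b, lam, iters=3000):
    """Unaccelerated PDPS (omega = 1) for min_x 0.5*||x - b||^2 + lam*||Kx||_1."""
    n = b.size
    K = np.diff(np.eye(n), axis=0)                 # forward differences
    norm2 = np.linalg.norm(K, 2) ** 2
    tau = sigma = 0.9 / np.sqrt(norm2)             # tau*sigma*||K||^2 < 1
    x, y = np.zeros(n), np.zeros(n - 1)
    for _ in range(iters):
        x_next = (x - tau * (K.T @ y) + tau * b) / (1 + tau)        # prox_{tau F}
        y = np.clip(y + sigma * (K @ (2 * x_next - x)), -lam, lam)  # prox_{sigma G*}
        x = x_next
    return x

b = np.concatenate([np.zeros(10), np.ones(10)])    # noiseless step signal
x = pdps(b, lam=0.1)   # plateaus move toward each other by lam/10 = 0.01
```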

2.3 Algorithm derivation

The derivation of the PDPS and the optimality conditions (12) suggest solving the latter by iteratively solving

$$\begin{aligned} \left\{ \begin{aligned} B({u}^{k+1}, \,\varvec{\cdot }\,; x^k)&= L,\\ B_u(\,\varvec{\cdot }\,, {w}^{k+1}; x^k)&= -Q'({u}^{k+1}),\\ 0&\in \tau _k\partial F({x}^{k+1}) + \tau _k {\bar{\nabla }}_x B({u}^{k+1}, {w}^{k+1}) + \tau _k K^{\star }y^k + {x}^{k+1} - x^k \\ 0&\in \sigma _{k+1}\partial G^*({y}^{k+1}) - \sigma _{k+1} K[{x}^{k+1} + \omega _k({x}^{k+1}-x^k)] + {y}^{k+1} - y^k. \end{aligned}\right. \end{aligned}$$
(16)

We have made the step length and over-relaxation parameters iteration-dependent for acceleration purposes. The indexing \(\tau _k\) and \(\sigma _{k+1}\) is off-by-one to maintain the symmetric update rules from [5].

The method in (16) still requires exact solution of the PDEs. For some splitting operators \(\Gamma _k, \Upsilon _k: U \times W \times X \rightarrow \mathbb {R}\), we therefore transform the first two lines into

$$\begin{aligned} B({u}^{k+1}, \,\varvec{\cdot }\,; x^k) - \Gamma _k({u}^{k+1} - u^k, \,\varvec{\cdot }\,; x^k)&= L\quad \text {and} \end{aligned}$$
(17a)
$$\begin{aligned} B_u(\,\varvec{\cdot }\,, {w}^{k+1}; x^k) - \Upsilon _k(\,\varvec{\cdot }\,, {w}^{k+1} - w^k; x^k)&= - Q'({u}^{k+1}). \end{aligned}$$
(17b)

Example 2.6

(Splitting) Let \(B(u, w; x)= \langle A_x u,w\rangle \) for symmetric \(A_x \in \mathbb {R}^{n \times n}\) on \(U=W=\mathbb {R}^n\). Take \(\Gamma _k(u, w; x)=\langle [A_x - N_x] u,w\rangle \) and \(\Upsilon _k=\Gamma _k\) for easily invertible \(N_x \in \mathbb {R}^{n \times n}\). With \(L=\langle b,\,\varvec{\cdot }\,\rangle \), \(b \in \mathbb {R}^n\) and \(M_x {:}{=}A_x-N_x\), (17) now reads

$$\begin{aligned} N_{x^k} {u}^{k+1} = b - M_{x^k} u^k \quad \text {and}\quad N_{x^k} {w}^{k+1} = - \nabla Q({u}^{k+1}) - M_{x^k} w^k. \end{aligned}$$
(18)

For Jacobi splitting we take \(N_{x^k}\) as the diagonal part of \(A_{x^k}\), and for Gauss–Seidel splitting as the lower triangle including the diagonal. We study these choices further in Sect. 4.2.
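
A sketch of the state update in (18) for an assumed small symmetric positive definite system (hypothetical data): iterating the split solve with fixed x reproduces the classical Jacobi or Gauss–Seidel method and converges to \(A_x^{-1}b\).

```python
import numpy as np

def split_step(A, N, b, u):
    """One inexact solve of the form (18): N u_next = b - M u with M = A - N."""
    return np.linalg.solve(N, b - (A - N) @ u)

A = np.array([[2., -1., 0.],        # 1D-Laplacian-like SPD matrix A_x
              [-1., 2., -1.],
              [0., -1., 2.]])
b = np.array([1., 0., 1.])

N_jacobi = np.diag(np.diag(A))      # Jacobi: diagonal part of A_x
N_gs = np.tril(A)                   # Gauss-Seidel: lower triangle incl. diagonal

u = np.zeros(3)
for _ in range(100):
    u = split_step(A, N_gs, b, u)
print(np.allclose(A @ u, b))        # iterates converge to the exact solution
```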

Let us introduce the general notation \(v=(u, w, x, y)\) as well as the step length operators \(T_k \in \mathbb {L}(U^* \times W^* \times X \times Y; U^* \times W^* \times X \times Y)\),

$$\begin{aligned} T_k {:}{=}{{\,\textrm{diag}\,}}\begin{pmatrix} {\text {Id}}_{U^*}&{\text {Id}}_{W^*}&\tau _k {\text {Id}}_X&\sigma _{k+1} {\text {Id}}_Y\end{pmatrix}, \end{aligned}$$
(19)

the set-valued operators \( H_k: U \times W \times X \times Y \rightrightarrows U^* \times W^* \times X \times Y\),

$$\begin{aligned} H_k(v) {:}{=}\begin{pmatrix} B(u, \,\varvec{\cdot }\,; x^k) - \Gamma _k(u - u^k, \,\varvec{\cdot }\,; x^k) - L\\ B_u(\,\varvec{\cdot }\,, w; x^k) - \Upsilon _k(\,\varvec{\cdot }\,, w - w^k; x^k) + Q'(u)\\ \partial F(x) + {\bar{\nabla }}_x B(u,w) + K^{\star }y \\ \partial G^*(y) - Kx \end{pmatrix}, \end{aligned}$$
(20)

and the preconditioning operators \(M_k \in \mathbb {L}(U \times W \times X \times Y; U^* \times W^* \times X \times Y)\),

$$\begin{aligned} M_k {:}{=}{{\,\textrm{diag}\,}}\begin{pmatrix} 0 &{} 0 &{} \begin{pmatrix} {\text {Id}}_X &{} -\tau _k K^{\star } \\ -\omega _k\sigma _{k+1} K &{} {\text {Id}}_Y \end{pmatrix} \end{pmatrix}. \end{aligned}$$
(21)

The implicit form of our proposed algorithm for the solution of (4) is then

$$\begin{aligned} 0 \in T_k H_k({v}^{k+1}) + M_k({v}^{k+1}-v^k). \end{aligned}$$
(22)

Writing out (22) in terms of explicit proximal maps, we obtain Algorithm 2.1.

Algorithm 2.1 (PDPAP): Primal dual splitting with parallel adaptive PDE solves. [Listing omitted.]

3 Convergence

We now treat the convergence of Algorithm 2.1. Following [9, 37] we “test” its implicit form (22) by applying on both sides the linear functional \(\langle Z_k\,\varvec{\cdot }\,|v^{k+1} - \bar{v}\rangle \). Here \(Z_k\) is a convergence-rate-encoding “testing operator” (Sect. 3.2). A simple argument involving the three-point identity (2) and a growth estimate for \(H_k\) then yields in Sect. 3.3 a Fejér-type monotonicity estimate in terms of iteration-dependent norms. This establishes in Sect. 3.4 global convergence subject to a growth condition. We start with assumptions.

3.1 The main assumptions

We start with our main structural assumption. Further central conditions related to the PDE constraint will follow in Assumption 3.3, and through its verification for specific linear system solvers in Sect. 4.2.

Assumption 3.1

(Structure) On Hilbert spaces X, Y, U, and W, we are given convex, proper, and lower semicontinuous \(F: X \rightarrow {\overline{\mathbb {R}}}\), \(G^*: Y \rightarrow {\overline{\mathbb {R}}}\), and \(Q: U \rightarrow \mathbb {R}\) with Q Fréchet differentiable, as well as \(K \in \mathbb {L}(X; Y)\), \(L \in U^*\), and \(B: U \times W \times X \rightarrow \mathbb {R}\) affine-linear-affine. We assume:

  1. (i)

    F and \(G^*\) are (strongly) convex with factors \( \gamma _F, \gamma _{G^*} \ge 0 \). With K they satisfy the condition (6) for the subdifferential sum and chain rules to be exact.

  2. (ii)

    For all \(x \in {{\,\textrm{dom}\,}}F\), there exist solutions \((u, w) \in U \times W\) to the PDE \(B(u, \,\varvec{\cdot }\,; x) = L\) and the adjoint PDE \(B_u(\,\varvec{\cdot }\,, w; x) = -Q'(u)\).

We then fix a solution \(\bar{v}=(\bar{u}, \bar{w}, \bar{x}, \bar{y}) \in U \times W \times X \times Y\) to (12) and assume that:

  1. (iii)

    For some \(\mathscr {S}(\bar{u}), \mathscr {S}(\bar{w}) \ge 0\), for all \((u, w) \in U \times W\) and \(x \in {{\,\textrm{dom}\,}}F\), we have

    $$\begin{aligned} B_{xu}(u, \bar{w}; x - \bar{x})&\le \sqrt{\mathscr {S}({\bar{w}})}\Vert u\Vert _U\Vert x - {\bar{x}}\Vert _X \quad \text {and}\\ B_x(\bar{u}, w; x - {\bar{x}})&\le \sqrt{\mathscr {S}({\bar{u}})}\Vert w\Vert _W\Vert x - {\bar{x}}\Vert _X. \end{aligned}$$
  2. (iv)

    For some \(C_x \ge 0\), for all \((u, w) \in U \times W\) and \(x \in {{\,\textrm{dom}\,}}F\) we have the bound

    $$\begin{aligned} B_{xu}(u, w; x-\bar{x}) \le C_x \Vert u\Vert _U\Vert w\Vert _W. \end{aligned}$$

Remark 3.2

Part (i) is easy to check. In general, (iv) requires \({{\,\textrm{dom}\,}}F\) to be bounded with respect to an \(\infty \)-norm with \(B_x(u, w, x) \le C \Vert u\Vert _U\Vert w\Vert _W\Vert x\Vert _\infty \) for some \(C>0\). Then \(C_x= \sup _{x \in {{\,\textrm{dom}\,}}F} C \Vert x\Vert _\infty \). If \(B_x\) is independent of u, i.e., for linear PDEs, both \(C_x=0\) and \(\mathscr {S}({\bar{w}})=0\), while \(\mathscr {S}({\bar{u}})\) is a constant independent of \(\bar{u}\). We study (ii)–(iv) further in Sect. 4.1.

The next assumption encodes our conditions on the PDE splittings.

Assumption 3.3

(Splitting) Let Assumption 3.1 hold. For each \(k \in \mathbb {N}\) for which this assumption is to hold, we are given splitting operators \(\Gamma _k, \Upsilon _k: U \times W \times X \rightarrow \mathbb {R}\) and \(v^k=(u^k, w^k, x^k, y^k) \in U \times W \times X \times Y\) such that:

  1. (i)

    \(\Gamma _k\) is linear in the second argument, \(\Upsilon _k\) in the first.

  2. (ii)

    There exist solutions \(u^{k+1}\) and \(w^{k+1}\) to the split equations (17).

  3. (iii)

    For some \(\gamma _B > 0\) and \(C_Q, \pi _u, \pi _w \ge 0\), we have

    $$\begin{aligned} \Vert u^k-{\bar{u}}\Vert _U^2&\ge \gamma _B \Vert {u}^{k+1}-{\bar{u}}\Vert _U^2 - \pi _u \Vert x^k - \bar{x}\Vert _X^2 \quad \text {and}\\ \Vert w^k-\bar{w}\Vert _W^2&\ge \gamma _B \Vert {w}^{k+1}-\bar{w}\Vert _W^2 - C_Q \Vert {u}^{k+1}-{\bar{u}}\Vert _U^2 - \pi _w \Vert x^k - \bar{x}\Vert _X^2. \end{aligned}$$

We verify the assumption for standard splittings in Sect. 4.2. The verification will introduce the assumption that \(Q'\) be Lipschitz. The Lipschitz factor then appears in \(C_Q\), justifying the Q-subscript notation. Generally \(\pi _u\) and \(\pi _w\) model the x-sensitivity of \(B\) and \(B_u\). For linear PDEs, such as Example 2.1, \(B_u\) does not depend on x. In that case most iterative solvers for the adjoint PDE would also be independent of x and have \(\pi _w=0\). The factor \(\gamma _B\) relates to the contractivity of the iterative solver.

The next, final, assumption introduces testing parameters that encode convergence rates and restrict the step length parameters in the standard primal-dual component of our method. It does not differ from the treatment of the PDPS in [9, 37]. Depending on whether both, one, or none of \({\tilde{\gamma }}_F>0\) and \({\tilde{\gamma }}_{G^*}>0\) hold, the parameters can be chosen to yield varying modes and rates of convergence.

Assumption 3.4

(Primal-dual parameters) Let Assumption 3.1 hold. For all \(k \in \mathbb {N}\), the testing parameters \(\varphi _k, \psi _k > 0\), step length parameters \(\tau _k, \sigma _k > 0\), and the over-relaxation parameter \(\omega _k \in (0, 1]\) satisfy for some \({\tilde{\gamma }}_F \in [0, \gamma _F]\) and \({\tilde{\gamma }}_{G^*} \in [0, \gamma _{G^*}]\), and \(\kappa \in (0, 1)\) that

$$\begin{aligned} \varphi _{k+1}&= \varphi _k(1+2{\tilde{\gamma }}_{F}\tau _k),&\psi _{k+1}&= \psi _k(1+2{\tilde{\gamma }}_{G^*}\sigma _k),\\ \eta _k&{:}{=}\varphi _k \tau _k = \psi _k\sigma _k,&\omega _k&= \eta ^{-1}_{k+1}\eta _{k}, \quad \text {and}&\kappa&\ge \frac{\tau _k\sigma _k}{1+2{\tilde{\gamma }}_{G^*}\sigma _k}\Vert K\Vert ^2. \end{aligned}$$
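
For instance, with \({\tilde{\gamma }}_F>0\) and \({\tilde{\gamma }}_{G^*}=0\), the classical acceleration rule of [5], \(\omega _k=1/\sqrt{1+2{\tilde{\gamma }}_F\tau _k}\), \(\tau _{k+1}=\tau _k\omega _k\), \(\sigma _{k+1}=\sigma _k/\omega _k\), satisfies these relations with \(\varphi _k\) growing at the rate \(O(k^2)\). A sketch with hypothetical parameter values, checking the relations numerically:

```python
import numpy as np

gamma_F, K_norm, kappa = 0.5, 2.0, 0.95
tau = [0.4]
sigma = [kappa / (tau[0] * K_norm**2)]   # tau_0 * sigma_0 * ||K||^2 = kappa
phi = [1.0]
psi = [phi[0] * tau[0] / sigma[0]]       # eta_0 = phi_0 tau_0 = psi_0 sigma_0
omega = []
for k in range(50):
    w = 1.0 / np.sqrt(1.0 + 2.0 * gamma_F * tau[k])
    omega.append(w)
    tau.append(tau[k] * w)
    sigma.append(sigma[k] / w)           # keeps tau_k * sigma_k constant
    phi.append(phi[k] * (1.0 + 2.0 * gamma_F * tau[k]))
    psi.append(psi[k])                   # tilde_gamma_{G*} = 0

eta = [p * t for p, t in zip(phi, tau)]
ok_eta = all(np.isclose(eta[k], psi[k] * sigma[k]) for k in range(51))
ok_omega = all(np.isclose(omega[k], eta[k] / eta[k + 1]) for k in range(50))
ok_kappa = all(tau[k] * sigma[k] * K_norm**2 <= kappa + 1e-12 for k in range(51))
print(ok_eta and ok_omega and ok_kappa)
```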

3.2 The testing operator

To complement the primal-dual testing parameters in Assumption 3.4, we introduce testing parameters \(\lambda _k,\theta _k>0\) corresponding to the PDE updates in our method; the first two lines of (22). We combine all of them into the testing operator \(Z_k \in \mathbb {L}(U^* \times W^* \times X \times Y; U^* \times W^* \times X^{*} \times Y^{*})\) defined by

$$\begin{aligned} Z_k {:}{=}{{\,\textrm{diag}\,}}\begin{pmatrix}\lambda _k {\text {Id}}&\theta _k {\text {Id}}&\varphi _k {\text {In}}_X&\psi _{k+1} {\text {In}}_Y\end{pmatrix}. \end{aligned}$$
(23)

Recalling \(M_k\) and \(Z_k\) from (21) and (23), thanks to Assumption 3.4, we have

$$\begin{aligned} Z_kM_k = {{\,\textrm{diag}\,}}\begin{pmatrix} 0 &{} 0 &{} \begin{pmatrix} \varphi _k{\text {In}}_X &{} -\eta _k {\text {In}}_X K^{\star } \\ -\eta _k {\text {In}}_Y K &{} \psi _{k+1}{\text {In}}_Y \end{pmatrix} \end{pmatrix}. \end{aligned}$$
(24)

Therefore,

$$\begin{aligned} Z_k(M_k+\Xi _k)=Z_{k+1}M_{k+1}+D_{k+1} \end{aligned}$$
(25)

for skew-symmetric

$$\begin{aligned} D_{k+1} {:}{=}{{\,\textrm{diag}\,}}\begin{pmatrix} 0 &{} 0 &{} \begin{pmatrix} 0 &{} (\eta _{k+1} + \eta _k) {\text {In}}_X K^{\star } \\ -(\eta _{k+1} + \eta _k) {\text {In}}_Y K &{} 0 \end{pmatrix} \end{pmatrix} \end{aligned}$$

and \(\Xi _k \in \mathbb {L}(U \times W \times X \times Y; U^* \times W^* \times X^{*} \times Y^{*})\) satisfying

$$\begin{aligned} Z_k\Xi _k = {{\,\textrm{diag}\,}}\begin{pmatrix} 0 &{} 0 &{} \begin{pmatrix} 2\eta _k{\tilde{\gamma }}_{F}{\text {In}}_X &{} 2\eta _k {\text {In}}_X K^{\star } \\ -2\eta _{k+1} {\text {In}}_Y K &{} 2\eta _{k+1}{\tilde{\gamma }}_{G^*}{\text {In}}_Y \end{pmatrix} \end{pmatrix}. \end{aligned}$$
(26)

Assumption 3.4 ensures that \(Z_kM_k \) is positive semi-definite. The proof is exactly as for the PDPS, see, e.g., [9], but we include it for completeness.

Lemma 3.5

Let \(k \in \mathbb {N}\) and suppose Assumption 3.4 holds. Then

$$\begin{aligned} Z_kM_k \ge {\text {diag}}\left( 0, 0, \varphi _k (1-\kappa ){\text {In}}_X, \psi _{k+1}\varepsilon {\text {In}}_Y \right) \ge 0 \end{aligned}$$

for

$$\begin{aligned} \varepsilon {:}{=}1 - \frac{\tau _k\sigma _k}{\kappa (1+2{\tilde{\gamma }}_{G^*}\sigma _k)} \Vert K\Vert ^2 > 0. \end{aligned}$$

Proof

By Young’s inequality, for any \(v=(u,w,x,y)\),

$$\begin{aligned} \langle Z_k M_k v|v\rangle&= \varphi _k\Vert x\Vert _X^2 + \psi _{k+1}\Vert y\Vert _{Y}^2 - 2\eta _k\left\langle x, K^{\star } y\right\rangle _X\\&\ge \varphi _k(1-\kappa )\Vert x\Vert _X^2 + \psi _{k+1}\Vert y\Vert _Y^2- \kappa ^{-1}\varphi _k\tau _k^2\Vert K^{\star }y\Vert _X^2. \end{aligned}$$

Since \(\varphi _k\tau _k^2=\eta _k\tau _k=\psi _k\sigma _k\tau _k=\psi _{k+1}\sigma _k\tau _k/(1+2{\tilde{\gamma }}_{G^*}\sigma _k)\), the claim follows. \(\square \)
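
The bound of Lemma 3.5 can be sanity-checked numerically on an assumed small instance (random K, hypothetical parameters satisfying Assumption 3.4): the symmetric matrix representing the nonzero block of \(Z_kM_k\) dominates the stated diagonal.

```python
import numpy as np

rng = np.random.default_rng(3)
K = rng.standard_normal((3, 4))          # Y-dimension 3, X-dimension 4
K2 = np.linalg.norm(K, 2) ** 2
kappa, gamma_Gstar = 0.9, 0.0
tau = 0.3
sigma = 0.8 * kappa / (tau * K2)         # tau*sigma*||K||^2 = 0.8*kappa < kappa
phi = 1.0
eta = phi * tau
psi = eta / sigma                        # eta = phi*tau = psi*sigma
eps = 1.0 - tau * sigma * K2 / (kappa * (1.0 + 2.0 * gamma_Gstar * sigma))

ZM = np.block([[phi * np.eye(4), -eta * K.T],
               [-eta * K, psi * np.eye(3)]])
bound = np.diag(np.r_[phi * (1.0 - kappa) * np.ones(4), psi * eps * np.ones(3)])
print(np.linalg.eigvalsh(ZM - bound).min() >= -1e-8)
```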

3.3 Growth estimates and monotonicity

We start by deriving a three-point monotonicity estimate for \(H_k\). This demands the somewhat strict bounds (27).

Lemma 3.6

Let \(k \in \mathbb {N}\). Suppose Assumptions 3.4, 3.1 and 3.3 hold and

$$\begin{aligned} \gamma _F&\ge {\tilde{\gamma }}_F + \varepsilon _u + \varepsilon _w + \frac{\lambda _{k+1}\pi _u + \theta _{k+1}\pi _w}{\eta _k}, \end{aligned}$$
(27a)
$$\begin{aligned} \gamma _{G^*}&\ge {\tilde{\gamma }}_{G^*}, \end{aligned}$$
(27b)
$$\begin{aligned} \gamma _B&\ge \frac{\lambda _{k+1}}{\lambda _k} + \frac{\theta _k}{\lambda _k}C_Q + \frac{\eta _k\mathscr {S}(\bar{w})}{4\varepsilon _w\lambda _k} + \frac{C_x \mu \eta _k}{2\lambda _k}, \quad \text {and} \end{aligned}$$
(27c)
$$\begin{aligned} \gamma _B&\ge \frac{\theta _{k+1}}{\theta _k} + \frac{\eta _k\mathscr {S}(\bar{u})}{4\varepsilon _u\theta _k} + \frac{C_x\eta _k}{2 \mu \theta _k} \end{aligned}$$
(27d)

for some \(\varepsilon _u,\varepsilon _w,\mu > 0\). Then \(H_k\) defined in (20) satisfies

$$\begin{aligned}{} & {} \langle Z_k T_k H_k({v}^{k+1})|{v}^{k+1}-\bar{v}\rangle \ge \frac{1}{2}\Vert {v}^{k+1}-\bar{v}\Vert _{Z_k\Xi _k}^2\nonumber \\{} & {} \quad + (\lambda _{k+1}\pi _u + \theta _{k+1}\pi _w) \Vert {x}^{k+1}-\bar{x}\Vert _X^2 - (\lambda _k\pi _u + \theta _k\pi _w) \Vert x^k-\bar{x}\Vert _X^2\nonumber \\{} & {} \quad + \lambda _{k+1} \Vert {u}^{k+1}-\bar{u}\Vert _U^2 - \lambda _k \Vert u^k-{\bar{u}}\Vert _U^2 + \theta _{k+1} \Vert {w}^{k+1}-\bar{w}\Vert _W^2 - \theta _k \Vert w^k-\bar{w}\Vert _W^2.\nonumber \\ \end{aligned}$$
(28)

Proof

For brevity we denote \(v=(u, w, x, y) {:}{=}{v}^{k+1}\). Recall that \(\bar{v}=(\bar{u}, \bar{w}, \bar{x}, \bar{y})\) satisfies the optimality conditions (12) by Assumption 3.1. Since Algorithm 2.1 guarantees that the first two lines of \(H_k\) are zero through the choice of \(M_k\) in (21), introducing \(q_F {:}{=}-{\bar{\nabla }}_x B({\bar{u}}, \bar{w}) - K^{\star } \bar{y} \in \partial F(\bar{x})\) we expand

$$\begin{aligned} \langle Z_k T_k H_k(v)|v-\bar{v}\rangle&= \eta _k\langle \partial F(x) + {\bar{\nabla }}_x B(u,w) + K^{\star } y,x-\bar{x}\rangle _X + \eta _{k+1}\langle \partial G^*(y) - K x,y - \bar{y}\rangle _{Y}\\&= \eta _k\langle \partial F(x) - q_F,x-\bar{x}\rangle _X + \eta _k\langle {\bar{\nabla }}_x B(u,w) - {\bar{\nabla }}_x B({\bar{u}}, \bar{w}),x-\bar{x}\rangle _X\\&\quad +\eta _{k+1}\langle \partial G^*(y) - K\bar{x},y- \bar{y}\rangle _Y + (\eta _k-\eta _{k+1})\langle K(x-\bar{x}),y - \bar{y}\rangle _Y. \end{aligned}$$

Using (26) we also have

$$\begin{aligned} \frac{1}{2}\Vert v-\bar{v}\Vert _{Z_k\Xi _k}^2&= \eta _k{\tilde{\gamma }}_{F}\Vert x-\bar{x}\Vert _X^2 + (\eta _k-\eta _{k+1})\left\langle K(x-\bar{x}),y - \bar{y}\right\rangle _{Y}\\&\quad + \eta _{k+1}{\tilde{\gamma }}_{G^*}\Vert y-\bar{y}\Vert _{Y}^2. \end{aligned}$$

We now use the (strong) monotonicity of F and \(G^*\) with constants \( \gamma _F\) and \(\gamma _{G^*}\) contained in Assumption 3.1 (i), as well as the splitting inequality of Assumption 3.3 (iii). Thus

$$\begin{aligned} \begin{aligned} \langle Z_kT_k H_k(v)|v-\bar{v}\rangle&\ge \frac{1}{2}\Vert v-\bar{v}\Vert _{Z_k\Xi _k}^2 + \eta _k(\gamma _F-{\tilde{\gamma }}_F)\Vert x-\bar{x}\Vert _X^2\\&\quad - (\lambda _k\pi _u + \theta _k\pi _w)\Vert x^k - \bar{x}\Vert _X^2\\&\quad + \eta _{k+1}(\gamma _{G^*}-{\tilde{\gamma }}_{G^*})\Vert y-\bar{y}\Vert ^2_{Y} \\&\quad + \eta _k\langle {\bar{\nabla }}_x B(u, w)-{\bar{\nabla }}_x B({\bar{u}}, \bar{w}),x-\bar{x}\rangle _X\\&\quad + (\lambda _k \gamma _B-\theta _k C_Q)\Vert u-\bar{u}\Vert _U^2 - \lambda _k \Vert u^k-{\bar{u}}\Vert _U^2\\&\quad + \theta _k \gamma _B\Vert w-\bar{w}\Vert _W^2 - \theta _k \Vert w^k-\bar{w}\Vert _W^2. \end{aligned} \end{aligned}$$
(29)

The Riesz equivalence (9), the affine-linear-linear structure of \(B_x\), Assumption 3.1 (iii) and (iv), and Young’s inequality give

$$\begin{aligned} \begin{aligned}&\eta _k\langle {\bar{\nabla }}_x B(u, w)-{\bar{\nabla }}_x B({\bar{u}}, \bar{w}),x-\bar{x}\rangle _X = \eta _kB_x(u, w, x-{\bar{x}}) - \eta _kB_x({\bar{u}}, \bar{w}, x-{\bar{x}})\\&\quad = \eta _kB_x(u, w, x-{\bar{x}}) + \eta _kB_x({\bar{u}}, w - \bar{w}, x-{\bar{x}}) - \eta _kB_x({\bar{u}}, w, x-\bar{x})\\&\quad = \eta _k B_{xu}(u - \bar{u}, w - \bar{w}; x - \bar{x})\\&\qquad + \eta _k B_x(\bar{u}, w - \bar{w}; x - \bar{x}) + \eta _k B_{xu}(u - \bar{u}, \bar{w}; x - \bar{x})\\&\quad \ge - \eta _k\left( \frac{\mathscr {S}({\bar{u}})}{4\varepsilon _u}+\frac{C_x \mu }{2}\right) \Vert w-\bar{w}\Vert _W^2 - \eta _k\left( \frac{\mathscr {S}(\bar{w})}{4\varepsilon _w}+\frac{C_x}{2 \mu }\right) \Vert u - \bar{u}\Vert _U^2\\&\quad - \eta _k(\varepsilon _u+\varepsilon _w)\Vert x - \bar{x}\Vert _X^2 \end{aligned} \end{aligned}$$
(30)

Combining (29) and (30), we obtain

$$\begin{aligned} \langle Z_k T_k H_k(v)|v-\bar{v}\rangle&\ge \frac{1}{2}\Vert v-\bar{v}\Vert _{Z_k\Xi _k}^2 + \eta _{k+1}(\gamma _{G^*}-{\tilde{\gamma }}_{G^*})\Vert y-\bar{y}\Vert ^2_{Y}\\&\quad + \eta _k(\gamma _F-{\tilde{\gamma }}_F-\varepsilon _u-\varepsilon _w)\Vert x-\bar{x}\Vert _X^2 - (\lambda _k\pi _u + \theta _k\pi _w) \Vert x^k-\bar{x}\Vert _X^2\\&\quad - \lambda _k \Vert u^k-{\bar{u}}\Vert _U^2 + \lambda _k\biggl ( \gamma _B - \frac{\theta _k}{\lambda _k} C_Q - \frac{\eta _k\mathscr {S}({\bar{w}})}{4\varepsilon _w\lambda _k} - \frac{C_x \mu \eta _k}{2\lambda _k} \biggr )\Vert u-\bar{u}\Vert _U^2\\&\quad - \theta _k \Vert w^k-\bar{w}\Vert _W^2 + \theta _k\biggl ( \gamma _B - \frac{\eta _k\mathscr {S}(\bar{u})}{4\varepsilon _u\theta _k} - \frac{C_x\eta _k}{2\mu \theta _k} \biggr )\Vert w-\bar{w}\Vert _W^2. \end{aligned}$$

The claim now follows by applying (27). \(\square \)

We now simplify and interpret (27).

Lemma 3.7

Suppose \(\gamma _F> {\tilde{\gamma }}_F > 0\) as well as \(\gamma _{G^*} \ge {\tilde{\gamma }}_{G^*} \ge 0\), and that there exist \(\omega , t > 0 \) with \(\omega \eta _{k+1} \le \eta _k\) for all \(k \in \mathbb {N}\), such that

$$\begin{aligned} \gamma _B \ge \omega ^{-1}+ tC_Q + \frac{2(1 + t^{-1})}{\omega (\gamma _F-{\tilde{\gamma }}_F)^2} \left( \mathscr {S}({\bar{u}})\pi _w + t\mathscr {S}(\bar{w})\pi _u + \frac{1}{2}\sqrt{t\pi _u\pi _w}C_x(\gamma _F-{\tilde{\gamma }}_F) \right) . \end{aligned}$$
(31)

Then there exist \(\varepsilon _u,\varepsilon _w,\mu >0\) and, for all \(k \in \mathbb {N}\), \(\lambda _k,\theta _k>0\) such that (27) holds. Moreover

$$\begin{aligned} \lambda _k\pi _u + \theta _k\pi _w = \eta _k\omega \frac{\gamma _F-{\tilde{\gamma }}_F}{2}. \end{aligned}$$
(32)

Proof

We take

$$\begin{aligned} \lambda _k {:}{=}t^{-1} r \pi ^{-1}_u \eta _k \quad \text {and}\quad \theta _k {:}{=}r \pi ^{-1}_w \eta _k \quad \text {for}\quad r {:}{=}\frac{(\gamma _F-{\tilde{\gamma }}_F)\omega }{2(t^{-1} + 1)} \quad \text {and}\quad c_k {:}{=}\frac{\eta _{k+1}}{\eta _k}. \end{aligned}$$
(33)

These expressions readily give (32). We then take \(\mu {:}{=}(t\pi _u/\pi _w)^{-1/2}\),

$$\begin{aligned} \varepsilon _u {:}{=}\frac{\mathscr {S}({\bar{u}})}{\mathscr {S}({\bar{u}}) + t\mathscr {S}(\bar{w})} \frac{\gamma _F - {\tilde{\gamma }}_F}{2}, \quad \text {and}\quad \varepsilon _w {:}{=}\frac{t\mathscr {S}(\bar{w})}{\mathscr {S}({\bar{u}}) + t\mathscr {S}(\bar{w})} \frac{\gamma _F - {\tilde{\gamma }}_F}{2}. \end{aligned}$$

Since both

$$\begin{aligned} \frac{\lambda _{k+1}\pi _u + \theta _{k+1}\pi _w}{\eta _k} =c_kr(t^{-1} + 1) =c_k\omega \frac{\gamma _F-{\tilde{\gamma }}_F}{2} \le \frac{\gamma _F-{\tilde{\gamma }}_F}{2} \end{aligned}$$

and \( \varepsilon _u + \varepsilon _w = (\gamma _F-{\tilde{\gamma }}_F)/2, \) (27a) is readily verified, while (27b) we have assumed. Inserting \(\lambda _k,\theta _k,\eta _k\), and \(\mu \), we also rewrite (27c) and (27d) as

$$\begin{aligned} \gamma _B \ge c_k + t C_Q + \frac{t\mathscr {S}(\bar{w})\pi _u}{4\varepsilon _w r} + \frac{\sqrt{t\pi _u\pi _w}C_x}{2 r} \quad \text {and} \quad \gamma _B \ge c_k + \frac{\mathscr {S}(\bar{u})\pi _w}{4\varepsilon _u r} + \frac{\sqrt{t\pi _u\pi _w}C_x}{2 r}. \end{aligned}$$

After also inserting \(\varepsilon _u, \varepsilon _w\), and r, and using \(\omega c_k \le 1\), these are readily verified by (31). \(\square \)
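Lemma 3.7 can also be sanity-checked numerically. The sketch below uses illustrative constants (taking \(\pi _u = \pi _w\) for simplicity), sets \(\gamma _B\) to the right-hand side of (31), forms the choices (33), and verifies the inequalities (27) together with (32):

```python
import numpy as np

# illustrative problem constants (not from any concrete PDE)
gF, gFt = 2.0, 0.5                # gamma_F > gamma~_F > 0
Su, Sw = 1.5, 0.8                 # S(u_bar), S(w_bar)
Cx, CQ = 0.7, 1.2
pi_u = pi_w = 0.4                 # equal weights, for simplicity
omega, t = 0.9, 0.6
eta = [1.0, 1.0 / omega]          # omega * eta_{k+1} <= eta_k

d = gF - gFt
gB = 1/omega + t*CQ + 2*(1 + 1/t)/(omega*d**2) * (
    Su*pi_w + t*Sw*pi_u + 0.5*np.sqrt(t*pi_u*pi_w)*Cx*d)    # RHS of (31)

# the choices from (33) and the proof of Lemma 3.7
r = d*omega / (2*(1/t + 1))
lam = [r*e/(t*pi_u) for e in eta]                            # lambda_k
th = [r*e/pi_w for e in eta]                                 # theta_k
mu = (t*pi_u/pi_w)**(-0.5)
eu = Su/(Su + t*Sw) * d/2
ew = t*Sw/(Su + t*Sw) * d/2

tol = 1e-12
ok27a = gF + tol >= gFt + eu + ew + (lam[1]*pi_u + th[1]*pi_w)/eta[0]
ok27c = gB + tol >= lam[1]/lam[0] + th[0]/lam[0]*CQ \
    + eta[0]*Sw/(4*ew*lam[0]) + Cx*mu*eta[0]/(2*lam[0])
ok27d = gB + tol >= th[1]/th[0] + eta[0]*Su/(4*eu*th[0]) + Cx*eta[0]/(2*mu*th[0])
ok32 = abs(lam[0]*pi_u + th[0]*pi_w - eta[0]*omega*d/2) < tol
print(ok27a and ok27c and ok27d and ok32)  # True
```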

Remark 3.8

Since \(\eta _{k+1} \ge \eta _k\) for convergent algorithms, i.e., \(\omega ^{-1}\ge 1\), letting \(\omega =1\) and \({\tilde{\gamma }}_F=0\) in (31), we obtain at the solution \((\bar{u}, \bar{w}, \bar{x}, \bar{y})\) a fundamental “second order growth” and splitting condition (via \(C_Q\), \(\pi _u\), and \(\pi _w\)) that cannot be avoided by step length parameter choices.

Our convergence proof is based on the next Fejér-type monotonicity estimate with respect to the iteration-dependent norms \(\Vert \,\varvec{\cdot }\,\Vert _{Z_k{\tilde{M}}_k}\). Here \({\tilde{M}}_k \in \mathbb {L}(U \times W \times X \times Y; U^* \times W^* \times X^{*} \times Y^{*})\) modifies \(M_k\) defined in (21) as

$$\begin{aligned} {\tilde{M}}_k {:}{=}M_k + {{\,\textrm{diag}\,}}\begin{pmatrix} {\text {In}}_U&{\text {In}}_W&\varphi ^{-1}_k (\lambda _k\pi _u + \theta _k\pi _w) {\text {In}}_X&0 \end{pmatrix}. \end{aligned}$$
(34)

By (24) and Assumption 3.4, this satisfies

$$\begin{aligned} Z_k{\tilde{M}}_k = {{\,\textrm{diag}\,}}\begin{pmatrix} \lambda _k {\text {In}}_U &{} \theta _k {\text {In}}_W &{} \begin{pmatrix} (\varphi _k+\lambda _k\pi _u + \theta _k\pi _w) {\text {In}}_X &{} -\eta _k {\text {In}}_X K^{\star } \\ -\eta _k {\text {In}}_Y K &{} \psi _{k+1}{\text {In}}_Y \end{pmatrix} \end{pmatrix}. \end{aligned}$$
(35)

Lemma 3.9

Suppose Assumptions 3.1 and 3.4 hold, as do Assumption 3.3 and (27) for \(k=0,\ldots ,N\). Given \(v^0\), let \(v^1,\ldots ,v^{N}\) be produced by Algorithm 2.1. Then

$$\begin{aligned} \frac{1}{2}\Vert {v}^{k+1}-\bar{v}\Vert _{Z_{k+1}{\tilde{M}}_{k+1}}^2 + \frac{1}{2}\Vert {v}^{k+1} - v^k\Vert _{Z_kM_k}^2 \le \frac{1}{2}\Vert v^k - \bar{v}\Vert _{Z_k{\tilde{M}}_k}^2 \quad (k=0,\ldots ,N-1) \end{aligned}$$
(36)

where all the terms are non-negative.

Proof

Lemma 3.6 gives the estimate

$$\begin{aligned}{} & {} \langle Z_k T_k H_k({v}^{k+1})|{v}^{k+1}-\bar{v}\rangle \ge \frac{1}{2}\Vert {v}^{k+1}-\bar{v}\Vert _{Z_k\Xi _k}^2\nonumber \\{} & {} \qquad + (\lambda _{k+1}\pi _u + \theta _{k+1}\pi _w) \Vert {x}^{k+1}-\bar{x}\Vert _X^2 - (\lambda _k\pi _u + \theta _k\pi _w) \Vert x^k-\bar{x}\Vert _X^2\nonumber \\{} & {} \qquad + \lambda _{k+1} \Vert {u}^{k+1}-\bar{u}\Vert _U^2 - \lambda _k \Vert u^k-{\bar{u}}\Vert _U^2 + \theta _{k+1} \Vert {w}^{k+1}-\bar{w}\Vert _W^2 - \theta _k \Vert w^k-\bar{w}\Vert _W^2\nonumber \\{} & {} \quad = \frac{1}{2}\Vert {v}^{k+1}-\bar{v}\Vert _{Z_{k+1}({\tilde{M}}_{k+1}-M_{k+1})+Z_k\Xi _k}^2 - \frac{1}{2}\Vert v^k-\bar{v}\Vert _{Z_k({\tilde{M}}_k-M_k)}^2. \end{aligned}$$
(37)

By the implicit form (22) of Algorithm 2.1, \(-Z_k M_k({v}^{k+1}-v^k) \in Z_k T_k H_k({v}^{k+1})\). Thus (37) combined with the three-point identity (2) for the operator \(M=Z_kM_k\) yields

$$\begin{aligned} \frac{1}{2}\Vert v^k-\bar{v}\Vert _{Z_k{\tilde{M}}_k}^2&\ge \frac{1}{2}\Vert {v}^{k+1}-\bar{v}\Vert _{Z_{k+1}({\tilde{M}}_{k+1}-M_{k+1})+Z_k(M_k+\Xi _k)}^2 +\frac{1}{2}\Vert {v}^{k+1} - v^k\Vert _{Z_kM_k}^2 \end{aligned}$$

Therefore (36) follows by applying (25), i.e., \(Z_k(M_k+\Xi _k)=Z_{k+1}M_{k+1}+D_k\), where the skew-symmetric term \(D_k\) does not contribute to the norms. Finally, we have \(Z_k{\tilde{M}}_k \ge Z_kM_k \ge 0\) by Lemma 3.5, proving the non-negativity of all the terms. \(\square \)

3.4 Main results

We can now state our main convergence theorems. In terms of assumptions, the only fundamental difference between the accelerated O(1/N) and the linear convergence result is that the latter requires \(G^*\) to be strongly convex while the former does not. Both require sufficient second order growth in terms of the respective technical conditions (38b) or (41b). The step length parameters differ.

Theorem 3.10

(Accelerated convergence) Suppose Assumptions 3.1 and 3.3 hold with \(\gamma _F>0\). Put \({\tilde{\gamma }}_{G^*}=0\) and pick \(\tau _0,\sigma _0,\kappa ,t>0\) and \(0< {\tilde{\gamma }}_F < \gamma _F\) satisfying

$$\begin{aligned} 1&> \kappa \ge \tau _0\sigma _0\Vert K\Vert ^2 \quad \text {and}\quad \end{aligned}$$
(38a)
$$\begin{aligned} \gamma _B&\ge \omega ^{-1}_0 + tC_Q + \frac{2(1 + t^{-1})}{\omega _0(\gamma _F-{\tilde{\gamma }}_F)^2} \left( \mathscr {S}({\bar{u}})\pi _w + t\mathscr {S}(\bar{w})\pi _u + \frac{1}{2}\sqrt{t\pi _u\pi _w}C_x(\gamma _F-{\tilde{\gamma }}_F) \right) , \end{aligned}$$
(38b)

where \(\omega _0\) is defined as part of the update rules

$$\begin{aligned} \tau _{k+1}&{:}{=}\tau _k\omega _k, \quad \sigma _{k+1} {:}{=}\sigma _k/\omega _k, \quad \text {and}\quad \omega _k {:}{=}1/\sqrt{1+2{\tilde{\gamma }}_F\tau _k} \quad (k \in \mathbb {N}). \end{aligned}$$
(38c)

Let \(\{{v}^{k+1}\}_{k \in \mathbb {N}}\) be generated by Algorithm 2.1 for any \(v^0 \in U \times W \times X \times Y\). Then \(x^k \rightarrow \bar{x}\) in X; \(u^k\rightarrow {\bar{u}}\) in U; and \(w^k\rightarrow \bar{w}\) in W, all strongly at the rate O(1/N).

Proof

We use Lemma 3.9, whose assumptions we now verify. Assumptions 3.1 and 3.3 we have assumed. As shown in [9, 37], Assumption 3.4 holds with \(\psi _k \equiv \sigma ^{-1}_0\tau _0\), \(\varphi _0=1\), and \(\varphi _{k+1} {:}{=}\varphi _k/\omega _k^2\). Moreover, \(\{\varphi _k\}_{k \in \mathbb {N}}\) grows at the rate \(\Omega (k^2)\). Hence

$$\begin{aligned} \eta _{k+1} = \omega ^{-1}_k\eta _k = \sqrt{1+2{\tilde{\gamma }}_F\tau _k}\eta _k \le \omega ^{-1}_0\eta _k \quad \text {for}\quad \omega ^{-1}_0 = \sqrt{1+2{\tilde{\gamma }}_F\tau _0}. \end{aligned}$$

Thus (38) verifies (31) so that Lemma 3.7 verifies (27). Thus we may apply Lemma 3.9. By summing its result over \(k=0,\ldots ,N-1\), we get

$$\begin{aligned} \frac{1}{2}\Vert v^N-\bar{v}\Vert _{Z_{N}{\tilde{M}}_{N}}^2 \le \frac{1}{2}\Vert v^0 - \bar{v}\Vert _{Z_0{\tilde{M}}_0}^2. \end{aligned}$$
(39)

By (24), (35), and Lemma 3.5 we have

$$\begin{aligned} Z_k{\tilde{M}}_k \ge Z_k M_k \ge {{\,\textrm{diag}\,}}\begin{pmatrix} \lambda _k {\text {In}}_U&\theta _k {\text {In}}_W&\varphi _k (1-\kappa ){\text {In}}_X&\psi _{k+1}\varepsilon {\text {In}}_Y \end{pmatrix} \ge 0 \end{aligned}$$
(40)

where \(\varepsilon {:}{=}1-\tau _k\sigma _k\kappa ^{-1}\Vert K\Vert ^2= 1-\tau _0\sigma _0\kappa ^{-1}\Vert K\Vert ^2>0\) by assumption. By Lemma 3.7, \(\{\lambda _k\}_{k \in \mathbb {N}}\) and \(\{\theta _k\}_{k \in \mathbb {N}}\) grow at the same \(\Omega (k^2)\) rate as \(\{\varphi _k\}_{k \in \mathbb {N}}\). Therefore (39) and (40) establish \(\Vert x^k- {\bar{x}}\Vert _X^2 \rightarrow 0\) as well as \(\Vert u^k-{\bar{u}}\Vert _U^2\) and \(\Vert w^k- \bar{w}\Vert _W^2 \rightarrow 0\), all at the rate \(O(1/N^2)\). The claim follows by removing the squares. \(\square \)
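The \(\Omega (k^2)\) growth of \(\{\varphi _k\}_{k \in \mathbb {N}}\) under the update rules (38c) can be checked numerically: the rules preserve the invariant \(\tau _k^2\varphi _k = \tau _0^2\varphi _0\), so that \(\varphi _{k+1} = \varphi _k + 2{\tilde{\gamma }}_F\tau _0\sqrt{\varphi _k}\) for \(\varphi _0=1\), and \(\sqrt{\varphi _k}\) grows linearly. A sketch with illustrative parameter values:

```python
import math

gamma_Ft, tau0 = 0.5, 1.0          # illustrative gamma~_F and tau_0
tau, phi = tau0, 1.0               # tau_0 and phi_0
taus, phis = [tau], [phi]
for _ in range(200):
    omega = 1.0 / math.sqrt(1.0 + 2.0 * gamma_Ft * tau)   # rule (38c)
    tau, phi = tau * omega, phi / omega**2
    taus.append(tau)
    phis.append(phi)

# invariant tau_k^2 phi_k = tau_0^2 phi_0, preserved by every update
assert all(abs(t**2 * p - tau0**2) < 1e-9 for t, p in zip(taus, phis))
print(phis[200] / 200**2)  # roughly (gamma~_F * tau_0)^2 = 0.25
```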

Theorem 3.11

(Linear convergence) Suppose Assumptions 3.1 and 3.3 hold with both \(\gamma _F>0\) and \(\gamma _{G^*}>0\). Pick \(\tau ,\kappa ,t>0\), \(0 < {\tilde{\gamma }}_{F} \le \gamma _{F}\), \(0 < {\tilde{\gamma }}_{G^*} \le \gamma _{G^*}\) satisfying

$$\begin{aligned} 1&> \kappa \ge \tau ^2{\tilde{\gamma }}_{G^*}^{-1}{\tilde{\gamma }}_F\Vert K\Vert ^2 \quad \text {and}\quad \end{aligned}$$
(41a)
$$\begin{aligned} \gamma _B&\ge \omega ^{-1}+ tC_Q + \frac{2(1 + t^{-1})}{\omega (\gamma _F-{\tilde{\gamma }}_F)^2} \left( \mathscr {S}({\bar{u}})\pi _w + t\mathscr {S}(\bar{w})\pi _u + \frac{1}{2}\sqrt{t\pi _u\pi _w}C_x(\gamma _F-{\tilde{\gamma }}_F) \right) \end{aligned}$$
(41b)

for

$$\begin{aligned} \sigma&{:}{=}{\tilde{\gamma }}_{G^*}^{-1}{\tilde{\gamma }}_F\tau \quad \text {and}\quad \omega {:}{=}1/(1+2{\tilde{\gamma }}_F\tau )=1/(1+{\tilde{\gamma }}_{G^*}\sigma ). \end{aligned}$$

Take \(\tau _k \equiv \tau \), \(\sigma _k \equiv \sigma \), and \(\omega _k \equiv \omega \). Let \(\{{v}^{k+1}\}_{k \in \mathbb {N}}\) be generated by Algorithm 2.1 for any \(v^0 \in U \times W \times X \times Y\). Then \(x^k \rightarrow \bar{x}\) in X; \(u^k\rightarrow {\bar{u}}\) in U; and \(w^k\rightarrow \bar{w}\) in W, all strongly at a linear rate.

Proof

As shown in [9, 37], Assumption 3.4 is satisfied for \(\varphi _0=1\), \(\psi _0=\sigma ^{-1}\tau \), \(\varphi _{k+1} {:}{=}\varphi _k/\omega _k\), and \(\psi _{k+1} {:}{=}\psi _k/\omega _k\). Moreover, both \(\{\varphi _k\}_{k \in \mathbb {N}}\) and \(\{\psi _k\}_{k \in \mathbb {N}}\) grow exponentially and \( \eta _{k+1} \le \omega ^{-1}\eta _k. \) Thus (41) verifies (31) with \(c_k \equiv \omega ^{-1}\), so that Lemma 3.7 verifies (27). The rest follows as in the proof of Theorem 3.10. \(\square \)

Theorems 3.10 and 3.11 show global convergence, but may require a very constricted \({{\,\textrm{dom}\,}}F\) through the constant \(C_x\) in Assumption 3.1 (iv). In the appendix of the arXiv version of this manuscript we relax the constant by localizing the convergence.

Remark 3.12

(Linear and sufficiently linear PDEs) For linear PDEs, i.e., when \(B_x\) does not depend on u, we have \(C_x=0\) and \(\mathscr {S}(\bar{w})=0\), as observed in Remark 3.2. Moreover, for typical solvers for the adjoint PDE, we would have \(\pi _w=0\), as \(B_u\) does not then depend on x. In that case, by taking \(t \searrow 0\), (38b) (and likewise (41b)) reduces to \(\gamma _B > \omega ^{-1}_0\). Practically this means that the convergence rate factor \(\omega ^{-1}_0\) has to be bounded by the inverse contractivity factor \(\gamma _B\) of the linear system solver. If \(\gamma _B>1\), as we should have, this condition can be satisfied by suitable choices of \({\tilde{\gamma }}_F \in (0, \gamma _F]\) and \({\tilde{\gamma }}_{G^*}\). By extension, the conditions (38b) and (41b) are then satisfiable for small t when the PDE is “sufficiently linear”.

Remark 3.13

(Weak convergence) It is possible to prove weak convergence when \(\omega \equiv 1\) and \(\tau \equiv \tau _0\), \(\sigma \equiv \sigma _0\) satisfy (38). The proof is based on an extension of Opial’s lemma to the quantitative Fejér monotonicity (36). We have not included the proof since it is technical and does not permit reducing the assumptions from those of Theorems 3.10 and 3.11. We refer to [6] for the corresponding proof for the NL-PDPS.

4 Splittings and partial differential equations

We now prove Assumption 3.1 and derive explicit expressions for the operator \({\bar{\nabla }}_x B\) from (9). We do this in Sect. 4.1 for some sample PDEs. Then in Sect. 4.2 we study the satisfaction of Assumption 3.3 for Gauss–Seidel and Jacobi splitting, as well as a simple infinite-dimensional example without splitting. We briefly discuss a quasi-conjugate gradient splitting to illustrate the generality of our approach. We conclude with a discussion of the convergence theory and discretisation in Sect. 4.3.

4.1 Partial differential equations and Riesz representations

Let \({{\,\textrm{Sym}\,}}^d \subset \mathbb {R}^{d \times d}\) stand for the symmetric matrices. Recall that in Example 2.2, to ensure the continuity of B, we needed in practice that at least one of the spaces U, W, or X be finite-dimensional. The same will be the case here. Accordingly, with \(\Omega \subset \mathbb {R}^d\) a Lipschitz domain, we take

$$\begin{aligned} x = (A,c) \in X {:}{=}X_1 \times X_2 \ \text {for subspaces}\ X_1 \subset L^2(\Omega ; {{\,\textrm{Sym}\,}}^d) \quad \text {and}\quad X_2 \subset L^2(\Omega ), \end{aligned}$$
(42a)

as well as \(U \subset H^1(\Omega )\) and \(W \subset H_0^1(\Omega ) \times H^{1/2}(\partial \Omega )\) such that

$$\begin{aligned} B(u, w; x)&{:}{=}B_x(u, w; x) + B_{{\textrm{const}}}(u, w) \quad \text {for}\quad u \in U,\, w \in W,\, x \in X \end{aligned}$$
(42b)

is continuous, where, writing \(w=(w_\Omega , w_\partial )\),

$$\begin{aligned} B_x(u, w; x)&{:}{=}\langle \nabla u,A\nabla w_\Omega \rangle _{L^2(\Omega )} + \langle cu,w_\Omega \rangle _{L^2(\Omega )} \quad \text {and} \end{aligned}$$
(42c)
$$\begin{aligned} B_{{\textrm{const}}}(u, w)&{:}{=}\langle {{\,\textrm{trace}\,}}_{\partial \Omega } u,w_\partial \rangle _{L^2(\partial \Omega )}. \end{aligned}$$
(42d)

Thus \(B_{{\textrm{const}}}\) models the nonhomogeneous Dirichlet boundary condition \(u=g\) on \(\partial \Omega \) for some \(g \in H^{-\frac{1}{2}}(\partial \Omega ) \). Correspondingly we take for some \(L_0 \in H^{-1}(\Omega )\) the right-hand-side

$$\begin{aligned} Lw&{:}{=}L_0w_\Omega + \langle g,w_\partial \rangle _{L^2(\partial \Omega )}. \end{aligned}$$
(42e)

The next lemma verifies the PDE components of Assumption 3.1. Afterwards we look at particular choices of \(X_1\) and \(X_2\). We could also take \(W=H^1(\Omega )\), \(w=w_\Omega \), \(L=L_0\), and \(B_{{\textrm{const}}}=0\) to model Neumann boundary conditions, and the result would still hold. In the range spaces of \(L^p(\Omega ; \mathbb {R}^d)\), \(W^{1,p}(\Omega )\), and \(L^p(\Omega ; \mathbb {R}^{d \times d})\), we use the Euclidean norm in \(\mathbb {R}^d\) and the spectral norm \(\Vert \,\varvec{\cdot }\,\Vert _2\) in \(\mathbb {R}^{d \times d}\).

Lemma 4.1

Assume (42) and that \({{\,\textrm{dom}\,}}F \subset L^\infty (\Omega ; \mathbb {R}^{d \times d}) \times L^\infty (\Omega )\). Then:

(ii\('\)):

Assumption 3.1 (ii) holds if there exists \( \lambda \in (0,1) \) such that

$$\begin{aligned} A(\xi ) \ge \lambda \,{\text {Id}}\ \ \text {and}\ \ |c(\xi )| \ge \lambda \quad \text {for all}\quad \xi \in \Omega \ \ \text {and}\ \ (A,c) \in (X_1\times X_2)\cap {{\,\textrm{dom}\,}}F. \end{aligned}$$

Suppose then that (12) is solved by \(\bar{v}=(\bar{u},\bar{w},\bar{x},\bar{y})\) with \({\bar{x}}=(\bar{A}, \bar{c}) \in {{\,\textrm{dom}\,}}F \subset (X_1 \times X_2) \), \({\bar{u}}\in H^1(\Omega ) \), \(\bar{w}= (\bar{w}_\Omega , \bar{w}_\partial ) \in H_0^1(\Omega ) \times H^{1/2}(\partial \Omega )\). If \( \Vert {\bar{u}}\Vert _{W^{1,\infty }(\Omega )}, \Vert \bar{w}\Vert _{W^{1,\infty }(\Omega )} < \infty \), and \({\bar{y}}\in Y\) for a Hilbert space Y, then also:

(iii\('\)):

Assumption 3.1 (iii) holds with \( \mathscr {S}({\bar{u}}) = \Vert {\bar{u}}\Vert _{W^{1,\infty }(\Omega )}^2\) and \( \mathscr {S}(\bar{w}) = \Vert \bar{w}_\Omega \Vert _{W^{1,\infty }(\Omega )}^2\).

(iv\('\)):

Assumption 3.1 (iv) holds with

$$\begin{aligned} C_x = \sup _{(A, c) \in {{\,\textrm{dom}\,}}F}~ \Vert A-\bar{A}\Vert _{L^\infty (\Omega ; \mathbb {R}^{d \times d})} + \Vert c-\bar{c}\Vert _{L^\infty (\Omega )}. \end{aligned}$$

Remark 4.2

On bounded \( \Omega \) the condition \( \Vert {\bar{u}}\Vert _{W^{1,\infty }(\Omega )} < \infty \) is stronger than \( {\bar{u}}\in H^1(\Omega ) \). We include both to emphasise that the latter defines the Hilbert space structure and topology that we generally work with, while the former is a technical restriction that arises from our proofs. Under appropriate smoothness conditions on \({\bar{x}}\), the boundary of \(\Omega \), and the boundary data, standard elliptic theory shows that \({\bar{u}}\in H^1(\Omega )\) is a classical solution, hence Lipschitz and in \(W^{1,\infty }(\Omega )\) on the whole domain; see, e.g., [11].

Proof

For (ii\('\)), we identify \( g \in H^{-1/2}(\partial \Omega ) \) with \( {\hat{g}} \in H^{1/2}(\partial \Omega ) \) by the Riesz mapping and fix \( \hat{u} \in H^1(\Omega ) \) with \( {{\,\textrm{trace}\,}}_{\partial \Omega }\hat{u} = {\hat{g}} \). This is possible by the definition of \( H^{1/2}(\partial \Omega ) \). By the Lax–Milgram lemma there is then a unique solution \(v \in H_0^1(\Omega )\) to

$$\begin{aligned} \langle \nabla v,A\nabla w_\Omega \rangle _{L^2(\Omega )} + \langle cv,w_\Omega \rangle _{L^2(\Omega )} = L_0w_\Omega - B_x({\hat{u}},w_\Omega ;x) \quad \text {for all}\quad w_\Omega \in H_0^1(\Omega ). \end{aligned}$$

Now \( u = v + \hat{u} \) satisfies \( B(u,w;x) = Lw \) and is independent of the choice of \( \hat{u} \). Analogously we prove the existence of a solution to the adjoint equation.

To prove (iv\('\)), pick arbitrary \(u \in H^1(\Omega )\), \(w=(w_\Omega , w_\partial ) \in H_0^1(\Omega ) \times H^{1/2}(\partial \Omega )\), and \(x = (A,c) \in (X_1 \times X_2) \cap {{\,\textrm{dom}\,}}F\). Hölder’s inequality and the symmetry of \(A(\xi )\) give

$$\begin{aligned} \begin{aligned} \langle \nabla u,A\nabla w_\Omega \rangle _{L^2(\Omega )}&\le \Vert \nabla w_\Omega \Vert _{L^2(\Omega ; \mathbb {R}^d)} \biggl (\int _\Omega \Vert A(\xi )\nabla u(\xi )\Vert _2^2 \,d\xi \biggr )^{1/2}\\&\le \Vert \nabla w_\Omega \Vert _{L^2(\Omega ; \mathbb {R}^d)}\Vert A\Vert _{L^\infty (\Omega ; \mathbb {R}^{d \times d})}\Vert \nabla u\Vert _{L^2(\Omega )}. \end{aligned} \end{aligned}$$

Therefore, as claimed

$$\begin{aligned} B_x(u,w;x-{\bar{x}})&\le \Vert A-\bar{A}\Vert _{L^\infty (\Omega ; \mathbb {R}^{d \times d})}\Vert \nabla u\Vert _{L^2(\Omega ; \mathbb {R}^d)}\Vert \nabla w_\Omega \Vert _{L^2(\Omega ; \mathbb {R}^d)}\\&\quad +\Vert c-\bar{c}\Vert _{L^\infty (\Omega )}\Vert u\Vert _{L^2(\Omega )}\Vert w_\Omega \Vert _{L^2(\Omega )}\\&\le \bigl (\Vert A-\bar{A}\Vert _{L^\infty (\Omega ; \mathbb {R}^{d \times d})} + \Vert c-\bar{c}\Vert _{L^\infty (\Omega )}\bigr )\Vert u\Vert _{H^1(\Omega )}\Vert w_\Omega \Vert _{H^1(\Omega )}\\&\le C_x\Vert u\Vert _{H^1(\Omega )}\Vert w_\Omega \Vert _{H^1(\Omega )}. \end{aligned}$$

For (iii\('\)), using Hölder’s inequality twice and the symmetry of \(A(\xi )\), we estimate

$$\begin{aligned} \begin{aligned} \langle \nabla u,A\nabla w_\Omega \rangle _{L^2(\Omega )}&\le \Vert \nabla w_\Omega \Vert _{L^\infty (\Omega ; \mathbb {R}^d)} \int _\Omega \Vert A(\xi )\nabla u(\xi )\Vert _2 \,d\xi \\&\le \Vert \nabla w_\Omega \Vert _{L^\infty (\Omega ; \mathbb {R}^d)}\Vert A\Vert _{L^2(\Omega ; \mathbb {R}^{d \times d})}\Vert \nabla u\Vert _{L^2(\Omega )}. \end{aligned} \end{aligned}$$

Hence

$$\begin{aligned} B_x(u,\bar{w};x)&\le \Vert A\Vert _{L^2(\Omega ; \mathbb {R}^{d\times d})}\Vert \nabla u\Vert _{L^2(\Omega ; \mathbb {R}^d)}\Vert \nabla \bar{w}_\Omega \Vert _{L^\infty (\Omega ; \mathbb {R}^d)}\\&\quad + \Vert c\Vert _{L^2(\Omega )}\Vert u\Vert _{L^2(\Omega )}\Vert \bar{w}_\Omega \Vert _{L^\infty (\Omega )}\\&\le \bigl ( \Vert \nabla \bar{w}_\Omega \Vert _{L^\infty (\Omega ; \mathbb {R}^d)} + \Vert \bar{w}_\Omega \Vert _{L^\infty (\Omega )} \bigr ) \Vert u\Vert _{H^1(\Omega )}\bigl ( \Vert A\Vert _{L^2(\Omega ; \mathbb {R}^{d\times d})} + \Vert c\Vert _{L^2} \bigr )\\&= \Vert \bar{w}_\Omega \Vert _{W^{1,\infty }(\Omega )}\Vert u\Vert _{H^1(\Omega )}\Vert x\Vert _X. \end{aligned}$$

Thus we may take as claimed \(\mathscr {S}(\bar{w}) = \Vert \bar{w}_\Omega \Vert _{W^{1,\infty }}^2\), and analogously \(\mathscr {S}({\bar{u}}) = \Vert {\bar{u}}\Vert _{W^{1,\infty }}^2\).

\(\square \)

To describe \({\bar{\nabla }}_x B\) we denote the double dot product and the outer product by

$$\begin{aligned} A :{\tilde{A}} = \sum _{ij}A_{ij}{\tilde{A}}_{ij}, \quad \text {and}\quad v \otimes w = vw^T \quad \text {for}\quad A,{\tilde{A}} \in \mathbb {R}^{d\times d} \quad \text {and}\quad v,w \in \mathbb {R}^d. \end{aligned}$$

Observe the identity \( v^TAw = A :(v \otimes w)\).
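This identity is what makes the Riesz representations below easy to read off; it can be confirmed directly:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
v, w = rng.standard_normal(3), rng.standard_normal(3)

lhs = v @ A @ w                    # v^T A w
rhs = np.sum(A * np.outer(v, w))   # A : (v ⊗ w), i.e. sum_ij A_ij v_i w_j
print(np.isclose(lhs, rhs))  # True
```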

Example 4.3

(General case) In the fully general case, formally and without regard for the solvability of the PDE (5), we equip \( X_1 = L^2(\Omega ; \mathbb {R}^{d\times d}) \) with the inner product \( \langle A_1,A_2\rangle _{X_1}:= \int _\Omega A_1(\xi ) :A_2(\xi ) \,d\xi \) and \(X_2 = L^2(\Omega ; \mathbb {R})\) with its standard inner product. Then for all \(u \in U\), \(w \in W\), and \((d, h) \in X_1 \times X_2\), we have

$$\begin{aligned} B_x(u,w; (d,h)) = \langle \nabla u,d\nabla w\rangle _{L^2(\Omega )} + \langle hu,w\rangle _{L^2(\Omega )} = \langle \nabla u \otimes \nabla w,d\rangle _{X_1} + \langle uw,h\rangle _{X_2}. \end{aligned}$$

Therefore the Riesz representation \({\bar{\nabla }}_x B\) has pointwise in \(\Omega \) the expression

$$\begin{aligned} {\bar{\nabla }}_x B(u,w) = \begin{pmatrix} \nabla u \otimes \nabla w \\ uw \end{pmatrix}. \end{aligned}$$

The constant \(C_x\) is as provided by Lemma 4.1.

Example 4.4

(Scalar function diffusion coefficient) Consider now \(X_1 {:}{=}\{ \xi \mapsto a(\xi ){\text {Id}}\mid a \in L^2(\Omega ) \}\). Since the spectral norm satisfies \(\Vert a(\xi ){\text {Id}}\Vert _2 = |a(\xi )|\), \(X_1\) is isometrically isomorphic to \(L^2(\Omega )\), and we may identify the two. We also observe that \(\langle \nabla u,A\nabla w\rangle _{L^2(\Omega )} = \langle a,\nabla u\cdot \nabla w\rangle _{X_1} \). Hence, pointwise in \(\Omega \),

$$\begin{aligned} {\bar{\nabla }}_x B(u,w) = \begin{pmatrix} \nabla u \cdot \nabla w \\ uw \end{pmatrix}. \end{aligned}$$

According to Lemma 4.1, the constant

$$\begin{aligned} C_x = \sup _{(a, c) \in {{\,\textrm{dom}\,}}F}~\Vert a-\bar{a}\Vert _{L^\infty (\Omega )} + \Vert c-\bar{c}\Vert _{L^\infty (\Omega )}. \end{aligned}$$

Example 4.5

(Spatially uniform coefficients) Let \(X_1 {:}{=}\{ \xi \mapsto {\tilde{A}} \mid {\tilde{A}} \in {{\,\textrm{Sym}\,}}^d \} \subset L^2(\Omega ; {{\,\textrm{Sym}\,}}^d)\) and \(X_2 {:}{=}\{ \xi \mapsto {\tilde{c}} \mid {\tilde{c}} \in \mathbb {R}\} \subset L^2(\Omega )\) consist of constant functions \(A: \xi \mapsto {\tilde{A}}\) and \(c: \xi \mapsto {\tilde{c}}\) on the bounded domain \(\Omega \). Then \(\Vert x\Vert _{X_1 \times X_2} = |\Omega |^{1/2}(\Vert {\tilde{A}}\Vert _2 + |{\tilde{c}}|)\) for all \(x=(A, c) \in X_1 \times X_2\). We may thus identify \(X_1\) and \(X_2\) with \(\mathbb {R}^{d \times d}\) and \(\mathbb {R}\) if we weigh the norms by \(|\Omega |^{1/2}\). We have

$$\begin{aligned} \langle \nabla u,A\nabla w\rangle _{L^2(\Omega )} = \int _{\Omega } {\tilde{A}} :\nabla u \otimes \nabla w\,d\xi = {\tilde{A}} :\int _\Omega \nabla u \otimes \nabla w\,d\xi . \end{aligned}$$

Thus

$$\begin{aligned} {\bar{\nabla }}_x B(u,w) = \begin{pmatrix} \int _\Omega \nabla u \otimes \nabla w\,d{\xi }\\ \int _\Omega uw\,d{\xi } \end{pmatrix}. \end{aligned}$$

According to Lemma 4.1, the constant

$$\begin{aligned} C_x =\sup _{(A, c) \in {{\,\textrm{dom}\,}}F}~ \Vert {\tilde{A}} - \tilde{\bar{A}}\Vert _2 + |{\tilde{c}}-\tilde{\bar{c}}|. \end{aligned}$$
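The representation in Example 4.5 simply pulls the constant coefficient out of the integral. A finite-difference quadrature sketch (with illustrative smooth test functions u and w on \(\Omega =(0,1)^2\)) confirms \(\langle \nabla u,A\nabla w\rangle _{L^2(\Omega )} = {\tilde{A}} :\int _\Omega \nabla u \otimes \nabla w\,d\xi \):

```python
import numpy as np

# midpoint grid on Omega = (0,1)^2 and smooth illustrative test functions
n = 200
h = 1.0 / n
pts = (np.arange(n) + 0.5) * h
xi1, xi2 = np.meshgrid(pts, pts, indexing="ij")
u = np.sin(np.pi * xi1) * xi2**2
w = np.cos(xi1) * np.sin(np.pi * xi2)

gu = np.gradient(u, h)                      # finite-difference gradient of u
gw = np.gradient(w, h)                      # finite-difference gradient of w
A_t = np.array([[2.0, 0.3], [0.3, 1.0]])    # spatially uniform A~ in Sym^2

# midpoint-rule quadrature of <grad u, A~ grad w> over Omega
lhs = np.sum(gu[0] * (A_t[0, 0] * gw[0] + A_t[0, 1] * gw[1])
             + gu[1] * (A_t[1, 0] * gw[0] + A_t[1, 1] * gw[1])) * h**2

# A~ : (quadrature of the matrix integral of grad u ⊗ grad w)
G = np.array([[np.sum(gu[i] * gw[j]) * h**2 for j in range(2)]
              for i in range(2)])
rhs = np.sum(A_t * G)
print(abs(lhs - rhs) < 1e-9)  # True
```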

4.2 Splittings

We now discuss linear system splittings and Assumption 3.3. Throughout this subsection we assume that

$$\begin{aligned} B(u, w; x)=\langle A_x u + f_x|w\rangle \quad \text {and}\quad Lw=\langle b|w\rangle \end{aligned}$$
(43)

with \(A_x \in \mathbb {L}(U; W^*)\) invertible for \(x \in X\), and \(f_x, b \in W^*\). Then for fixed \(x \in X\) the weak PDE (5) and the adjoint \(B_u(\,\varvec{\cdot }\,, w, x) = - Q'(u)\) reduce to the linear equations

$$\begin{aligned} A_x u = b - f_x \quad \text {and}\quad A_x^{*} w = - Q'(u), \end{aligned}$$

where \(A_x^* \in \mathbb {L}(W; U^*)\) is the dual product adjoint of \(A_x\) restricted to \(W \hookrightarrow W^{**}\).

The basic splittings

The next result helps to prove Assumption 3.3 subject to a control on the rate of dependence of \(A_x\) on x. In its setting, where \(A_x=N_x+M_x\) with \(N_x\) “easily” invertible, Lines 3 and 4 of Algorithm 2.1 are given by (18).
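As a concrete finite-dimensional instance of the setting (43)–(45), the sketch below applies a Jacobi splitting \(N_x = {{\,\textrm{diag}\,}}A_x\), \(M_x = A_x - N_x\) to a strictly diagonally dominant matrix, estimates the contraction factor \(\alpha \) of (45), and runs the iteration \(u^{k+1} = N_x^{-1}(b - f_x - M_x u^k)\) as in (18). All data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
A = rng.standard_normal((n, n))
A = A + np.diag(2 * np.abs(A).sum(axis=1))   # enforce strict diagonal dominance
b = rng.standard_normal(n)                    # stands in for b - f_x

# Jacobi splitting A = N + M with N "easily" invertible
N = np.diag(np.diag(A))
M = A - N
Ninv = np.diag(1.0 / np.diag(A))
alpha = np.linalg.norm(Ninv @ M, 2)           # contraction factor in (45)
assert alpha < 1

u = np.zeros(n)
for _ in range(300):                          # u^{k+1} = N^{-1}(b - M u^k)
    u = Ninv @ (b - M @ u)

print(np.allclose(A @ u, b))  # True
```

A Gauss–Seidel splitting would instead take \(N_x\) as the lower triangle of \(A_x\), with the same convergence test.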

Theorem 4.6

In the setting (43), suppose Assumption 3.1 holds and

$$\begin{aligned} \Vert A_x - A_{{\tilde{x}}}\Vert _{\mathbb {L}(U; W^*)} \le L_A \Vert x-{\tilde{x}}\Vert _{X} \quad \text {and}\quad \Vert f_x - f_{{\tilde{x}}}\Vert _{W^*} \le L_f \Vert x-{\tilde{x}}\Vert _X \quad (x, {\tilde{x}} \in {{\,\textrm{dom}\,}}F) \end{aligned}$$
(44)

for some \(L_A, L_f \ge 0\). Split \(A_x=N_x+M_x\) with \(N_x\) invertible, and assume there exist \(\alpha \in [0, 1)\) and \(\gamma _N>0\) such that

$$\begin{aligned} \Vert N^{-1}_x M_x\Vert _{\mathbb {L}(U; U)},\Vert N_x^{-1,*}M_x^*\Vert _{\mathbb {L}(W; W)} \le \alpha \quad \text {and}\quad \gamma _N \Vert N^{-1}_x\Vert _{\mathbb {L}(W^*; U)} \le 1 \end{aligned}$$
(45)

for all \(x \in {{\,\textrm{dom}\,}}F\). Also suppose \(\nabla Q\) is \(L_Q\)-Lipschitz. For any \(\gamma _B \in (1, 1/\alpha ^2)\), \(\lambda \in (0, 1)\), and \(\beta >0\), set

$$\begin{aligned} \pi _w&= \left( 1+\beta +\frac{\alpha ^2\gamma _B}{\lambda (1-\alpha ^2\gamma _B)}\right) \frac{\gamma _BL_A^2\Vert \bar{w}\Vert ^2_{W}}{\gamma _N^2},\\ C_Q&= \left( \frac{1+\beta }{\beta } + \frac{\alpha ^2\gamma _B}{(1-\lambda )(1-\alpha ^2\gamma _B)}\right) \frac{\gamma _B L_Q^2}{\gamma _N^2}, \quad \text {and}\\ \pi _u&= \left( 1+\beta +\frac{\alpha ^2\gamma _B}{\lambda (1-\alpha ^2\gamma _B)}\right) \frac{\gamma _BL_A^2\Vert {\bar{u}}\Vert ^2_{U}}{\gamma _N^2} + \left( \frac{1+\beta }{\beta } + \frac{\alpha ^2\gamma _B}{(1-\lambda )(1-\alpha ^2\gamma _B)}\right) \frac{\gamma _B L_f^2}{\gamma _N^2}. \end{aligned}$$

Let \(\Gamma _k(u, w, x)=\langle M_x u|w\rangle \) and \(\Upsilon _k(u, w, x)=\langle u|M_x^{*} w\rangle \). Then Assumption 3.3 holds for all \(k \in \mathbb {N}\) with \(\{{v}^{k+1}\}_{k=0}^\infty \) generated by Algorithm 2.1 for any \(v^0 \in U \times W \times X \times Y\).

Proof

Assumption 3.3 (i) holds by construction, and (ii) by the assumed invertibility of \(N_x\) for \(x \in {{\,\textrm{dom}\,}}F\). We only consider the second inequality of (iii) for \(\Upsilon \), the proof of the first inequality for \(\Gamma \) being analogous with \(-Q'(u)\) replaced by \(b-f_x\). We thus need to prove

$$\begin{aligned} \Vert w^k-\bar{w}\Vert ^2_{W} \ge \gamma _B \Vert {w}^{k+1}-\bar{w}\Vert ^2_{W} - C_Q\Vert {u}^{k+1} - \bar{u}\Vert ^2_{U} - \pi _w \Vert x^k - {\bar{x}}\Vert ^2_{X}. \end{aligned}$$
(46)

Using (18) with \(A_{{\bar{x}}}^{*} \bar{w} = -Q'({\bar{u}})\) and \(A_{x^k}^{*} \bar{w} = N_{x^k}^{*}\bar{w}+ M_{x^k}^{*}\bar{w}\), we expand

$$\begin{aligned} \begin{aligned} {w}^{k+1} - \bar{w}&= N_{x^k}^{-1,*}(- Q'({u}^{k+1}) - M_{x^k}^{*} w^k) - \bar{w}\\&= N_{x^k}^{-1,*}[ Q'(\bar{u}) - Q'({u}^{k+1})] + N_{x^k}^{-1,*}(A_{{\bar{x}}}^{*}-A_{x^k}^{*})\bar{w} - N_{x^k}^{-1,*}M_{x^k}^* (w^k- \bar{w}). \end{aligned} \end{aligned}$$

Expanding \(\Vert {w}^{k+1}-\bar{w}\Vert _{W}^2\) and applying the triangle inequality and Young’s inequality thrice yields

$$\begin{aligned} \begin{aligned} \Vert {w}^{k+1}-\bar{w}\Vert ^2_{W}&\le \left( 1+\frac{\alpha ^2\gamma _B}{\lambda (1-\alpha ^2\gamma _B)}+\beta \right) \Vert N_{x^k}^{-1,*}(A_{{\bar{x}}}^{*}-A_{x^k}^{*})\bar{w}\Vert ^2_{W}\\&\quad + \frac{1}{\alpha ^2\gamma _B}\Vert N_{x^k}^{-1,*}M_{x^k}^{*} (w^k- \bar{w})\Vert ^2_{W}\\&\quad + \left( \frac{1 + \beta }{\beta } + \frac{\alpha ^2\gamma _B}{(1-\lambda )(1-\alpha ^2\gamma _B)}\right) \Vert N_{x^k}^{-1,*}[Q'({u}^{k+1}) - Q'({\bar{u}})]\Vert ^2_{W}. \end{aligned} \end{aligned}$$

Note that the first part of (44) and the second part of (45) hold also for the adjoints \(A_x^*\) and \(N_x^*\) in the corresponding spaces. Therefore, we establish \(\Vert N_{x^k}^{-1,*}(A_{{\bar{x}}}^{*}-A_{x^k}^{*})\bar{w}\Vert ^2_{W} \le \gamma _N^{-2}L_A^2\Vert \bar{w}\Vert ^2_{W}\Vert {\bar{x}}-x^k\Vert ^2_{X}\), \(\Vert N_{x^k}^{-1,*}[ Q'({u}^{k+1}) - Q'({\bar{u}})]\Vert ^2_{W} \le \gamma _N^{-2} L_Q^2 \Vert {u}^{k+1} - \bar{u}\Vert ^2_{U}\), and \(\Vert N_{x^k}^{-1,*}M_{x^k}^{*} (w^k- \bar{w})\Vert ^2_{W} \le \alpha ^2 \Vert w^k- \bar{w}\Vert ^2_{W}\). Taking \(\pi _w\) and \(C_Q\) as stated, we therefore obtain (46). \(\square \)

For our first, infinite-dimensional example of the satisfaction of the conditions of Theorem 4.6, and hence of Assumption 3.3, note that we have in general

$$\begin{aligned} \Vert N^{-1}_x\Vert _{\mathbb {L}(W^*; U)} = \sup _{w^*} \frac{\Vert N^{-1}_x w^*\Vert _{U}}{\Vert w^*\Vert _{W^*}} = \sup _{u} \frac{\Vert u\Vert _{U}}{\Vert N_x u\Vert _{W^*}} = \sup _{u} \inf _{w} \frac{\Vert u\Vert _{U}\Vert w\Vert _{W}}{\langle N_x u|w\rangle } \end{aligned}$$

and

$$\begin{aligned} \Vert A_x - A_{{\tilde{x}}}\Vert _{\mathbb {L}(U; W^*)} = \sup _u \frac{\Vert [A_x-A_{{\tilde{x}}}]u\Vert _{W^*}}{\Vert u\Vert _U} = \sup _{u,w} \frac{\langle [A_x-A_{{\tilde{x}}}]u|w\rangle }{\Vert u\Vert _U\Vert w\Vert _W}. \end{aligned}$$

Example 4.7

(No splitting of a weighted Laplacian in \(H^1\)) Let \(U=W=H_0^1(\Omega )\), \(X=\mathbb {R}\), and \(N_x=A_x=x\nabla ^*\nabla \in \mathbb {L}(H_0^1(\Omega ); H^{-1}(\Omega ))\) be the Laplacian weighted by \(x \in (0, \infty )\). Then

$$\begin{aligned} \Vert N^{-1}_x\Vert _{\mathbb {L}(W^*; U)} = \sup _{u} \inf _{w} \frac{\Vert u\Vert _{H^1(\Omega )}\Vert w\Vert _{H^1(\Omega )}}{x\langle \nabla u,\nabla w\rangle _{L^2(\Omega )}} \le \sup _{u} \frac{\Vert u\Vert _{H^1(\Omega )}^2}{x\Vert \nabla u\Vert _{L^2(\Omega )}^2}. \end{aligned}$$

Therefore, assuming \(\inf {{\,\textrm{dom}\,}}F > 0\), we can in (45) take \(\gamma _N=\inf _{x \in {{\,\textrm{dom}\,}}F} x\lambda \) for \(\lambda \) the infimum of the spectrum of the Laplacian as a bounded self-adjoint operator in \(H_0^1(\Omega )\); see, e.g., [25, Theorem 9.2-1]. Clearly also \(\alpha =0\) due to \(M_x=0\). For (44), we get

$$\begin{aligned} \Vert A_x - A_{{\tilde{x}}}\Vert _{\mathbb {L}(U; W^*)} = \sup _{u,w} |x-{\tilde{x}}|\frac{\langle \nabla u,\nabla w\rangle _{L^2(\Omega )}}{\Vert u\Vert _{H^1(\Omega )}\Vert w\Vert _{H^1(\Omega )}} = \sup _{u} |x-{\tilde{x}}|\frac{\Vert \nabla u\Vert _{L^2(\Omega )}^2}{\Vert u\Vert _{H^1(\Omega )}^2}. \end{aligned}$$

Thus we can take \(L_A\) as the supremum of the spectrum of the Laplacian as a bounded self-adjoint operator in \(H_0^1(\Omega )\).

In the following examples, we take \(U=W=\mathbb {R}^n\) with the standard Euclidean norm. Then (45) can be rewritten as the spectral radius bound and positivity condition

$$\begin{aligned} \rho (N^{-1}_x M_x), \rho ({N}^{-1,*}_x M_x^*) \le \alpha \quad \text {and}\quad N_x^*N_x \ge \gamma _N^2. \end{aligned}$$

The first example also works in general spaces, as seen in a special case in Example 4.7, but \(\gamma _N\) and \(L_A\) depend on the norms chosen.

Example 4.8

(No splitting) If \(N_x=A_x \in \mathbb {R}^{n \times n}\), (45) holds with \(\alpha =0\) and \(\gamma _N\) the minimal eigenvalue of \(A_x\), assumed symmetric positive definite. Theorem 4.6 now shows that Assumption 3.3 holds, where for any \(\gamma _B > 1\) and \(\beta >0\), we can take \( \pi _w = (1+\beta )\gamma _B \gamma _N^{-2}L_A^2\Vert \bar{w}\Vert ^2, \) \( C_Q = (1+\beta ^{-1})\gamma _B \gamma _N^{-2} L_Q^2, \) and \(\pi _u = \gamma _B \gamma _N^{-2}[(1+\beta )L_A^2\Vert {\bar{u}}\Vert ^2 + (1+\beta ^{-1})L_f^2]\).

Example 4.9

(Jacobi splitting) If \(N_x\) is the diagonal of \(A_x \in \mathbb {R}^{n \times n}\), we obtain Jacobi splitting. The first part of (45) reduces to strict diagonal dominance; see [12, §10.1]. The second part holds, with \(N_x\) invertible, whenever the diagonal of \(A_x\) has only positive entries; then \(\gamma _N\) is the minimum of the diagonal values. Theorem 4.6 now shows that Assumption 3.3 holds.
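As a concrete illustration (not part of the analysis above), the two parts of (45), in their Euclidean spectral-radius form, can be checked numerically for a small, hypothetical strictly diagonally dominant matrix standing in for \(A_x\):

```python
import numpy as np

# Hypothetical strictly diagonally dominant matrix standing in for A_x.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])

N = np.diag(np.diag(A))   # Jacobi splitting: N_x is the diagonal of A_x
M = A - N                 # M_x is the off-diagonal remainder

# First part of (45): spectral radius of N^{-1} M strictly below one.
alpha = max(abs(np.linalg.eigvals(np.linalg.solve(N, M))))

# Second part of (45): gamma_N is the minimal diagonal entry of A_x.
gamma_N = np.diag(A).min()

print(alpha < 1.0, gamma_N)   # strict diagonal dominance gives alpha < 1
```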

Example 4.10

(Gauss–Seidel splitting) If \(N_x\) is the lower triangle and diagonal of \(A_x \in \mathbb {R}^{n \times n}\), we obtain Gauss–Seidel splitting. The first part of (45) holds for some \(\alpha \in [0, 1)\) when \(A_x\) is symmetric and positive definite; compare [12, proof of Theorem 10.1.2]. The second part holds for some \(\gamma _N\) when \(N_x\) is invertible. Theorem 4.6 now shows that Assumption 3.3 holds.

Example 4.11

(Successive over-relaxation) Based on any one of Examples 4.8 to 4.10, take \({\tilde{N}}_x = (1+r)N_x\) and \({\tilde{M}}_x = M_x - rN_x\) for some \(r>0\). Then, for small enough \(\gamma _B>1\), all of \(\pi _w\), \(C_Q\), and \(\pi _u\) tend to zero as \(r \rightarrow \infty \).

Indeed, \({\tilde{N}}_x^{-1}{\tilde{M}}_x z = {\tilde{\lambda }} z\) if and only if \(M_x z = ((1+r){\tilde{\lambda }} +r) N_x z \), which gives the eigenvalues \({\tilde{\lambda }}\) of \({\tilde{N}}_x^{-1}{\tilde{M}}_x\) as \({\tilde{\lambda }}=(\lambda -r)/(1+r)\) for \(\lambda \) an eigenvalue of \(N^{-1}_x M_x\). So, for large r, we can in (45) take \(\alpha =(r+\rho )/(1+r)\) and \(\gamma _{{\tilde{N}}} = \gamma _N (1+r)\), where \(\rho {:}{=}\rho (N_x^{-1}M_x) < 1\). Now, for every large enough \(r>0\), for \(\gamma _B=(1+\alpha ^{-2})/2 > 1\), we have

$$\begin{aligned} \begin{aligned} \frac{\alpha ^2}{\gamma _{{\tilde{N}}_x}^2(1-\alpha ^2\gamma _B)}&= \frac{2\alpha ^2}{\gamma _{{\tilde{N}}_x}^2(1-\alpha ^2)} = \frac{2(1+r)^2\alpha ^2}{(1+r)^2\gamma _{N_x}^2((1+r)^2-(1+r)^2\alpha ^2)}\\&= \frac{2(r+\rho )^2}{(1+r)^2\gamma _{N_x}^2((1+r)^2-(r+\rho )^2)} = \frac{2(r+\rho )^2}{(1+r)^2\gamma _{N_x}^2(1 -\rho ^2 + 2(1-\rho )r)}. \end{aligned} \end{aligned}$$

Since \(0 \le \rho <1\), the right hand side tends to zero as \(r \rightarrow \infty \). Since also \(\gamma _{{\tilde{N}}} \rightarrow \infty \) and \(\gamma _B>1\), Theorem 4.6 now shows that Assumption 3.3 holds with \(\pi _w, C_Q, \pi _u \rightarrow 0\) as \(r \rightarrow \infty \).
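The eigenvalue mapping \({\tilde{\lambda }}=(\lambda -r)/(1+r)\) underlying this example can be verified numerically; the sketch below uses a hypothetical symmetric positive definite matrix with its diagonal as \(N_x\), as in Example 4.9:

```python
import numpy as np

# Hypothetical symmetric positive definite A_x split as N_x + M_x,
# with N_x the (positive) diagonal as in Example 4.9.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5.0 * np.eye(5)
N = np.diag(np.diag(A))
M = A - N

r = 10.0                 # over-relaxation parameter
Nt = (1.0 + r) * N       # \tilde N_x = (1+r) N_x
Mt = M - r * N           # \tilde M_x = M_x - r N_x

lam = np.linalg.eigvals(np.linalg.solve(N, M)).real
lam_t = np.linalg.eigvals(np.linalg.solve(Nt, Mt)).real

# Eigenvalues transform as (lambda - r)/(1 + r), pushing the spectral
# radius of the over-relaxed splitting towards (r + rho)/(1 + r).
print(np.allclose(np.sort(lam_t), np.sort((lam - r) / (1.0 + r))))
```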

Quasi-conjugate gradients

With \(f_x=0\) for simplicity, motivated by the conjugate gradient method for solving \(A_x u = b\) (see, e.g., [12]), we propose to perform, on Line 3 of Algorithm 2.1 and analogously on Line 4, the quasi-conjugate gradient update

$$\begin{aligned} \Biggl \{\begin{aligned} r^k&{:}{=}b - A_{x^k} u^k,&z^{k+1}&{:}{=}-\langle p^k,A_{x^k} r^k\rangle /\Vert p^k\Vert _{A_{x^k}}^2,\\ p^{k+1}&{:}{=}r^k + z^{k+1} p^k,&t^{k+1}&{:}{=}\langle p^{k+1},r^k\rangle /\Vert p^{k+1}\Vert _{A_{x^k}}^2,&u^{k+1}&{:}{=}u^k + t^{k+1}p^{k+1}. \end{aligned} \Biggr . \end{aligned}$$
(47)

For standard conjugate gradients, \(A_{x^k} \equiv A\) permits a recursive residual update optimization that we are unable to perform. We have \(\langle A_{x^k}{p}^{k+1},p^k\rangle =0\) for all k, although no “A-conjugacy” relationship necessarily exists between \({p}^{k+1}\) and \(p^j\) for \(j < k\).
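For illustration, a direct transcription of one update (47) might look as follows; the matrix and starting data are hypothetical, and only the local conjugacy \(\langle A_{x^k}{p}^{k+1},p^k\rangle =0\) noted above is checked:

```python
import numpy as np

def quasi_cg_step(A, b, u, p):
    """One quasi-conjugate-gradient update (47), with A standing in
    for A_{x^k}; returns (u^{k+1}, p^{k+1})."""
    r = b - A @ u                              # r^k
    z = -(p @ (A @ r)) / (p @ (A @ p))         # z^{k+1}
    p_new = r + z * p                          # p^{k+1}
    t = (p_new @ r) / (p_new @ (A @ p_new))    # t^{k+1}
    return u + t * p_new, p_new                # u^{k+1}, p^{k+1}

# Hypothetical symmetric positive definite system and starting data.
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4.0 * np.eye(4)
b = rng.standard_normal(4)

u = np.zeros(4)
p = rng.standard_normal(4)   # direction carried over from a "previous" step
for _ in range(3):
    u_new, p_new = quasi_cg_step(A, b, u, p)
    # local conjugacy of consecutive directions:
    assert abs(p_new @ (A @ p)) < 1e-8
    u, p = u_new, p_new
```

Since \(t^{k+1}\) is an exact line search for the quadratic \(\tfrac{1}{2}\langle A u,u\rangle -\langle b,u\rangle \) along \(p^{k+1}\), each step is non-increasing in that quadratic even though full A-conjugacy is lost.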

The next lemma molds the updates (47) into our overall framework.

Lemma 4.12

The update (47) corresponds to Line 3 of Algorithm 2.1 with

$$\begin{aligned} \Gamma _k(u, \,\varvec{\cdot }\,, x) = \left[ {\text {Id}}- \Vert p^{k+1}\Vert _{A_x}^{-2} A_x \left( p^{k+1} \otimes p^{k+1}\right) \right] (A_x u^k-b) \quad (u \in U). \end{aligned}$$
(48)

where \({p}^{k+1} = r^k_x + {z}^{k+1}_x p^k\) with \({z}^{k+1}_x = -\langle p^k,A_xr^k_x\rangle /\Vert p^k\Vert _{A_x}^2\) and \(r^k_x {:}{=}A_x u^k - b\).

Proof

Indeed, expanding \(t^{k+1}\), the u-update of (47) may be rewritten as

$$\begin{aligned} u^{k+1} - u^k = \Vert p^{k+1}\Vert _{A_{x^k}}^{-2}(p^{k+1} \otimes p^{k+1}) r^k. \end{aligned}$$

Applying the invertible matrix \(A_{x^k}\) and expanding \(r^k\), this is

$$\begin{aligned} A_{x^k}(u^{k+1}-u^k) = - \Vert p^{k+1}\Vert _{A_{x^k}}^{-2} A_{x^k}(p^{k+1} \otimes p^{k+1})(A_{x^k} u^k-b), \end{aligned}$$

and, adding \(A_{x^k}u^k-b\) on both sides, further

$$\begin{aligned} A_{x^k} u^{k+1}-b = [{\text {Id}}- \Vert p^{k+1}\Vert _{A_{x^k}}^{-2} A_{x^k} (p^{k+1} \otimes p^{k+1})](A_{x^k} u^k-b). \end{aligned}$$

Since \(B({u}^{k+1}, \,\varvec{\cdot }\,; x^k) = \langle A_{x^k} u^{k+1},\,\varvec{\cdot }\,\rangle \), and \(L(\,\varvec{\cdot }\,)=\langle b,\,\varvec{\cdot }\,\rangle \), the claim follows. \(\square \)

Unless \(A_x\) is independent of x, a simple approach as in Theorem 4.6 can only verify Assumption 3.3 with \(\gamma _B<1\). We hence leave the verification of convergence of Algorithm 2.1 with quasi-conjugate gradient updates to future research.

4.3 Discussion

Before we embark on numerical experiments, it is time to make a few unifying observations about the disparate results above, with regard to the main conditions (38b) and (41b) of the convergence Theorems 3.10 and 3.11, and their connection to the fundamentally discrete viewpoint of Examples 4.9 and 4.10. As we have already noted in Remark 3.12,

  (i) The main conditions (38b) and (41b) are easily satisfied for linear PDEs, i.e., when \(B_x\) does not depend on u. In Sect. 4.2, this corresponds to \(A_x=A\) (while \(f_x\) may still depend on x). The only condition given in Remark 3.12 was that \(\pi _w=0\), which is satisfied in Examples 4.8 to 4.10 due to \(L_A=0\).

For linear PDEs, \(\mathscr {S}(\bar{w})=0\). Together with \(\pi _w=0\), this causes also \(\mathscr {S}({\bar{u}})\) and \(\pi _u\) to disappear from the convergence conditions. All of these quantities might depend on the discretisation.

As we have seen in Sect. 4.1, \(\mathscr {S}({\bar{u}})\) and \(\mathscr {S}(\bar{w})\) require the use of \(\infty \)-norm bounds on the solutions, even when the underlying space is \(H^k\). Such bounds may not always hold in infinite dimensions (however, see Remark 4.2), although they do always hold in finite-dimensional subspaces. In our numerical experiments, we have, however, not observed any grid dependency of \(\mathscr {S}({\bar{u}})\) and \(\mathscr {S}(\bar{w})\) (calculated a posteriori, after a very large number of iterations).

On a more negative note, with \(U=W=\mathbb {R}^{n(h, d)}\) equipped with the standard Euclidean norm, consider \(A_x=-x\Delta _h\) for a scalar x with \(\Delta _h\) a finite differences discretisation of the Laplacian on a d-dimensional square grid of cell width h and n(h, d) nodes. Then, for both Jacobi and Gauss–Seidel splitting, as well as the trivial splitting (gradient descent) \(N_x \propto {\text {Id}}\), the spectral radius \(\rho (N_x^{-1}M_x) \rightarrow 1\) as \(h \rightarrow 0\); see, e.g., [26, Chapter 4.2.1]. By simple numerical experiments, \(L_A^2/\gamma _N^2\) nevertheless stays roughly constant, so the result is that \(\pi _u, \pi _w \rightarrow \infty \) as \(h \rightarrow 0\). For “no splitting”, i.e., \(N_x=A_x\), instead \(L_A^2/\gamma _N^2 \rightarrow \infty \) as \(h \rightarrow 0\) due to the worsening condition number of \(\Delta _h\). This latter negative result is, however, dependent on taking \(U=W=\mathbb {R}^{n(h, d)}\) with the standard Euclidean norm: in Example 4.7 we showed that “no splitting” is applicable to the same problem in \(H^1\). It is, therefore, an interesting question for future research, whether a change of norms would remove the grid dependency of Jacobi and Gauss–Seidel. Our guess is that it would not.
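The grid dependency of the Jacobi spectral radius can be observed numerically already in one dimension; the sketch below builds the standard finite-difference Laplacian (a stand-in for \(-\Delta _h\) with \(x=1\)) and shows \(\rho (N^{-1}_x M_x)\) approaching one as h decreases:

```python
import numpy as np

def jacobi_radius(n):
    """Spectral radius of N^{-1} M for the Jacobi splitting of the 1D
    finite-difference Laplacian on n interior nodes, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    A = (2.0 * np.eye(n)
         - np.eye(n, k=1)
         - np.eye(n, k=-1)) / h**2
    N = np.diag(np.diag(A))
    M = A - N
    return max(abs(np.linalg.eigvals(np.linalg.solve(N, M))))

# The radius (known to equal cos(pi*h) in this case) tends to one as
# the grid is refined:
for n in (10, 50, 200):
    print(n, jacobi_radius(n))
```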

The above indicates that, for nonlinear PDEs, whether our methods even converge can depend on the level of discretisation. Nevertheless, the successive over-relaxation of Example 4.11 comes to help; it shows that

  (ii) By letting the over-relaxation parameter \(r \rightarrow \infty \), we get \(\pi _u, \pi _w \rightarrow 0\), and therefore may be able to obtain convergence (with a comparable iteration count) for any magnitude of \(\mathscr {S}({\bar{u}})\) and \(\mathscr {S}(\bar{w})\).

Even with over-relaxation, to satisfy (38b) and (41b), it remains necessary to have a very small \(C_x\). However,

  (iii) In Sects. 3 and 4.1, we have bounded \(C_x\) through \({{\,\textrm{dom}\,}}F\), obtaining global convergence when (38b) and (41b) hold. With a more refined analysis, it is possible to make \(C_x\) arbitrarily small by sufficiently good initialisation, i.e., by being content with mere local convergence.

We include a sketch of this analysis in an appendix of the arXiv version of this manuscript.

Finally, although convergence rates (\(O(1/N^2)\) or linear) are unaffected by the discretisation level, constant factors of convergence depend on \(Z_k{\tilde{M}}_k\) through the bound (39). This operator, written out in (35), depends on the constants \(\pi _u\) and \(\pi _w\). They inversely scale the magnitude of the testing parameters \(\lambda _k\) and \(\theta _k\) as chosen in (33). By (32), the term \(\varphi _k+\lambda _k\pi _u + \theta _k\pi _w\) in (35) is, however, independent of \(\pi _u\) and \(\pi _w\). Smaller \(\pi _u\) and \(\pi _w\) are, hence, better for the convergence of u and w (by weighing down the x and y initialisation errors on the right hand side of (39)), and higher \(\pi _u\) and \(\pi _w\) are better for the convergence of x and y (by weighing down u and w initialisation errors). Even for linear PDEs, therefore

  (iv) Convergence speed may depend on the level of discretisation through the x-sensitivity factors \(\pi _u\) and \(\pi _w\) of the splitting method for the PDE.

This is to be expected: the linear system solvers that Sect. 4.2 is based on are fundamentally discrete, and their convergence depends on the eigenvalues of \(N^{-1}_x M_x\) and \(N_x\). In “standard” optimisation methods, the dimension-dependent linear system solver is taken as a black box, and its computational cost is hidden from the estimates for the optimisation method. The estimates for our method, by contrast, include the solver.

5 Numerical results

We now illustrate the numerical performance of Algorithm 2.1. We first describe our experimental setup, and then discuss the results.

5.1 Experimental setup

The PDEs in our numerical experiments take one of the forms of Sect. 4.1 on the domain \(\Omega = [0,1]\times [0,1]\) with nonhomogeneous Dirichlet boundary conditions. We discretize the domain as a regular grid and the PDEs by backward differences. We use both a coarse and a fine grid.

The function G and the PDE vary by experiment, but in each one we take the regularization term for the control parameter x and the data fitting term as

$$\begin{aligned} F(x) {:}{=}\frac{\alpha }{2}\Vert x\Vert _{L^2(\Omega ; \mathbb {R}^{d \times d}) \times L^2(\Omega )}^2 + \delta _{[\lambda ,\lambda ^{-1}]}(x) \quad \text {and}\quad Q(u) {:}{=}\widehat{\beta } \sum _{i=1}^m\Vert u_i - z_i\Vert _{L^2(\Omega )}^2\nonumber \\ \end{aligned}$$
(49)

for some \(\alpha , \beta , \lambda > 0 \) as well as \(\widehat{\beta } {:}{=}\beta /(2\Vert {\bar{z}}\Vert _{L^2(\Omega )}^2)\) where \(\bar{z} = \frac{1}{m}\sum _{i=1}^m z_i\) is the average of the measurement data \(z_i\). The norms here are in function spaces, but in the numerical experiments the variables are, of course, taken to be in a finite-dimensional (finite element) subspace.

The variables \(u_i\) correspond to multiple copies of the same PDE with different boundary conditions \(u_i = f_i\) on \(\partial \Omega \), (\(i = 1,\dots ,m\)), for the same control x. Parametrizing \(\partial \Omega \) by \(\rho : (0,1) \rightarrow \partial \Omega \), we take as boundary data

$$\begin{aligned} f_{2j-1}(\rho (t)) = \cos (2\pi jt) \quad \text {and} \quad f_{2j}(\rho (t)) = \sin (2\pi jt), \quad (j=1,\ldots ,m/2).\nonumber \\ \end{aligned}$$
(50)

To produce the synthetic measurement \(z_i\), we solve for \({\hat{u}}_i\) the PDE corresponding to the experiment with the ground truth control parameter \({\hat{x}} = (\hat{A},\hat{c}) \) and boundary data \(f_i\). To this we add Gaussian noise of standard deviation \(0.01\Vert \hat{u}_i\Vert _{L^2(\Omega )}\) to get \(z_i\).
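As a sketch of this data generation step (the helper name is hypothetical, the Euclidean norm stands in for the discrete \(L^2(\Omega )\) norm, and grid weighting is omitted):

```python
import numpy as np

def make_measurement(u_hat, level=0.01, rng=None):
    """Corrupt a discretized PDE solution u_hat with Gaussian noise of
    standard deviation `level` times the norm of u_hat (hypothetical
    helper; Euclidean norm in place of the L^2 norm)."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = level * np.linalg.norm(u_hat)
    return u_hat + sigma * rng.standard_normal(u_hat.shape)

u_hat = np.sin(np.linspace(0.0, np.pi, 64))   # hypothetical solution samples
z = make_measurement(u_hat, rng=np.random.default_rng(42))
```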

We next describe the PDEs for each of our experiments.

Experiment 1

(Scalar coefficient) In our first numerical experiment, we aim to determine the scalar coefficient \( c \in \mathbb {R}\) for the PDEs

$$\begin{aligned} \left\{ \begin{aligned} -\Delta u_i + cu_i&= 0{} & {} \text {in}\, \Omega , \\ u_i&= f_i{} & {} \text {on}\, \partial \Omega , \end{aligned}\right. \end{aligned}$$
(51)

where \(i=1,\ldots ,m\). For this problem we choose \( G(Kx) = 0 \). Thus the objective is

$$\begin{aligned} \min _{u,c} J(x) {:}{=}\frac{\alpha }{2}\Vert c{\textbf {1}}\Vert _{L^2(\Omega )}^2 + \delta _{[\lambda , \lambda ^{-1}]}(c) + \widehat{\beta }\sum _{i=1}^m\Vert u_i - z_i\Vert _{L^2(\Omega )}^2 \quad \text {subject to (51)}.\nonumber \\ \end{aligned}$$
(52)

Our parameter choices can be found in Table 1.

With \(u = (u_1,\ldots , u_m) \in U^m \subset {} H^1(\Omega )^m\) and \(w=(w_{1,\Omega }, \ldots , w_{m,\Omega },w_{1,\partial },\ldots ,w_{m,\partial }) \in W^m \subset H_0^1(\Omega )^m \times H^{1/2}(\partial \Omega )^m\), for the weak formulation of (51) we take

$$\begin{aligned} B(u, w; c) = \sum _{i=1}^m\left( \langle \nabla u_i,\nabla w_{i,\Omega }\rangle _{L^2(\Omega )} + c \langle u_i,w_{i,\Omega }\rangle _{L^2(\Omega )} + \langle {{\,\textrm{trace}\,}}_{\partial \Omega } u_i,w_{i,\partial }\rangle _{L^2(\partial \Omega )} \right) \end{aligned}$$

and

$$\begin{aligned} Lw = \sum _{i=1}^m\langle f_i,w_{i,\partial }\rangle _{L^2(\partial \Omega )}. \end{aligned}$$
(53)

Then \( {\bar{\nabla }}_x B(u,w) = \sum _{i=1}^m \langle u_i,w_{i,\Omega }\rangle _{L^2(\Omega )} \) following Example 4.5.

For data generation we take \({\hat{c}} = 1.0 \). Since we are dealing with an ill-posed inverse problem, an optimal control parameter \(\bar{c}\) for (52) does not in general equal \({\hat{c}}\). Therefore, to compare algorithm progress, we take as surrogate for the unknown \(\bar{c}\) the iterate \( \tilde{c}_A {:}{=}c^{50,000} \) on the coarse grid and \({\tilde{c}}_B {:}{=}c^{500,000} \) on the fine grid, each computed using Algorithm 2.1 without splitting.

The next theorem verifies the basic structural conditions of the convergence Theorems 3.10 and 3.11. The splitting conditions contained in Assumption 3.3 are ensured through Example 4.9 (Jacobi), 4.10 (Gauss–Seidel), or 4.8 (no splitting).

Theorem 5.1

Let \( X = \mathbb {R}\); U a finite-dimensional subspace of \(H^1(\Omega )\); and W a finite-dimensional subspace of \( H_0^1(\Omega ) \times H^{1/2}(\partial \Omega ) \). Let F and Q be given by (49) along with the PDE (51) and the boundary conditions \( f_i \) defined as in (50). Take \(G=0\). Then Assumption 3.1 holds.

Proof

The chosen F, Q, and G satisfy Assumption 3.1(i). The boundary conditions \( f_i \in H^{1/2}(\partial \Omega )\), along with the constraint \(x \in [\lambda ,\lambda ^{-1}]\), ensure the condition of Lemma 4.1(ii\('\)). In the discretized setting, (iii\('\)) and (iv\('\)) also hold. In conclusion, Lemma 4.1 verifies Assumption 3.1. \(\square \)

Remark 5.2

It remains to verify (38) or (41), depending on the convergence theorem used. The condition (38a) is readily verified by appropriate choice of the primal and dual step length parameters \(\tau _0,\sigma _0>0\). We also take \({\tilde{\gamma }}_F=0\) (slightly violating the assumptions), so that \(\omega _k \equiv 1\), and \(\tau _k \equiv \tau _0\) and \(\sigma _k \equiv \sigma _0\). The condition (38b) (and likewise (41b) for linear convergence) is very difficult to verify a priori for nonlinear PDEs, as it depends on the knowledge of a solution to the optimisation problem through \(\mathscr {S}({\bar{u}})\) and \(\mathscr {S}(\bar{w})\). This is akin to the difficulty of verifying (a priori) a positive Hessian at a solution for standard nonconvex optimisation methods. Hence we do not attempt to verify (38b).

Experiment 2

(Diffusion + scalar coefficient) In this experiment we aim to determine the coefficient function \( a: \Omega \rightarrow \mathbb {R}\) and scalar \( c \in \mathbb {R}\) for the group of PDEs

$$\begin{aligned} \left\{ \begin{aligned} -\nabla \cdot (a\nabla u_i) + cu_i&= 0{} & {} \text {in}\, \Omega , \\ u_i&= f_i{} & {} \text {on}\, \partial \Omega , \end{aligned}\right. \end{aligned}$$
(54)

where \(i=1,\ldots ,m\). The optimization problem then is

$$\begin{aligned} \min _{x = (a,c)} J(x) = \delta _{[\lambda , \lambda ^{-1}]}(x) + \widehat{\beta }\sum _{i=1}^m\Vert u_i - z_i\Vert _{L^2(\Omega )}^2 + \gamma \Vert \nabla a\Vert _1 \quad \text {subject to (54)}.\nonumber \\ \end{aligned}$$
(55)

Note that, although we take the total variation of a, which is natural in the space of functions of bounded variation, we consider a to lie in (as per Example 2.2, a finite-dimensional subspace of) \(L^2(\Omega )\). Thus the total variation term has value \(+\infty \) in \(L^2(\Omega ) {\setminus } {\text {BV}}(\Omega )\). Nevertheless, the term is weakly lower semicontinuous even in \(L^2\) due to Poincaré’s inequalities (for example, [1, Theorem 3.44]), so the problem is well-defined. Subdifferentiation in \(L^2(\Omega )\) is a slightly more delicate issue, but not a problem for optimality conditions of problems of the type (55), as discussed in [38, Remark 4.7]. Moreover, as said, in practice we work in a finite-dimensional subspace that corresponds to the backward differences discretisation of the gradient in the total variation term. The convergence of discretisations is discussed in [4].

For the weak formulation of (54) with \(w=(w_{1,\Omega }, \ldots , w_{m,\Omega },w_{1,\partial },\ldots ,w_{m,\partial }) \in W^m \subset H_0^1(\Omega )^m \times H^{1/2}(\partial \Omega )^m\), \(u = (u_1,\ldots , u_m) \in U^m\subset {}H^1(\Omega )^m\), and \(x=(a,c) \in X \subset L^2(\Omega ) \times \mathbb {R}\), we take L as in (53) and

$$\begin{aligned} B(u, w; x){} & {} = \sum _{i=1}^m\left( \langle \nabla u_i,a\nabla w_{i,\Omega }\rangle _{L^2(\Omega )} + c \langle u_i,w_{i,\Omega }\rangle _{L^2(\Omega )}\right. \\{} & {} \quad \left. + \langle {{\,\textrm{trace}\,}}_{\partial \Omega } u_i,w_{i,\partial }\rangle _{L^2(\partial \Omega )} \right) . \end{aligned}$$

Then \( {\bar{\nabla }}_x B(u,w) = ({\bar{\nabla }}_x B^1(w,u),{\bar{\nabla }}_x B^2(w,u))\) takes on a mixed form with \( {\bar{\nabla }}_x B^1(w,u) = \sum _{i=1}^m \nabla u_i\cdot \nabla w_{i,\Omega } \) from Example 4.4 and \({\bar{\nabla }}_x B^2(w,u) = \sum _{i=1}^m \langle u_i,w_{i,\Omega }\rangle _{L^2(\Omega )} \) from Example 4.5.

For data generation we take \( {\hat{c}} = 1.0 \) and \( {\hat{a}} \) as the phantom depicted in Fig. 3. Similarly to Experiment 1, we compare the progress towards \( {\tilde{a}} {:}{=}a^{1,000,000} \) and \( {\tilde{c}} {:}{=}c^{1,000,000} \), computed using Algorithm 2.1 with full matrix inversion.

As above for Experiment 1, the next theorem verifies the basic structural conditions of the convergence Theorems 3.10 and 3.11. The proof is analogous to that of Theorem 5.1. Likewise, the splitting Assumption 3.3 is verified as before through Example 4.9 (Jacobi), 4.10 (Gauss–Seidel), or 4.8 (no splitting), while Remark 5.2 applies for the remaining step length and growth conditions.

Theorem 5.3

Let X be a finite-dimensional subspace of \(L^2(\Omega ) \times \mathbb {R}\), U a finite-dimensional subspace of \(H^1(\Omega )\) and W a finite-dimensional subspace of \( H_0^1(\Omega ) \times H^{1/2}(\partial \Omega ) \). Let F and Q be given by (49) along with the PDE (54) with the boundary conditions \( f_i \) defined as in (50) and G be \( \Vert \,\varvec{\cdot }\,\Vert _1\). Then Assumption 3.1 holds.

Fig. 1

Performance of various splittings in the coarse grid Experiment 1

Fig. 2

Performance of various splittings in the fine grid Experiment 1

Fig. 3

Data generation phantom and splitting performance in the coarse grid Experiment 2

Fig. 4

Performance of various splittings in the fine grid Experiment 2

5.2 Algorithm parametrisation

We apply Algorithm 2.1 with no splitting (full inversion), and with Jacobi and Gauss–Seidel splitting, and quasi-conjugate gradients, as discussed in Sect. 4.2. We fix \( \sigma = 1.0 \), \( \omega = 1.0 \), \( \lambda = 0.1 \), \( \varepsilon = 0.01 \), and \( \beta = 10^2 \) for all experiments. Other parameters, including the grid size, \( \alpha \), \( \gamma _i \), \( \tau \) and m vary according to experiment with values listed in Table 1.

For the initial iterate \( (x^0,u^0,w^0,y^0) \) we make an experiment-specific choice of the control parameter \(x^0\). Then we determine \( u^0 \) by solving the PDE, and \( w^0 \) by solving the adjoint PDE. We set \( y^0 = Kx^0 \). For Experiment 1 we take the initial \( c^0 = 4.0 \) and run the algorithm for 20,000 iterations on the coarse grid and 125,000 on the fine. For Experiment 2 we take the initial \( a^0 \equiv 1.0\) a constant function, and \( c^0 = 2.0 \). The algorithm is run for 200,000 iterations on the coarse grid, and 500,000 on the fine.

Table 1 Parameter choices for all examples

We implemented the algorithm in Julia. The implementation is available on Zenodo [23]. The experiments were run on a ThinkPad laptop with Intel Core i5-8265U CPU at 1.60GHz \(\times 4\) and 15.3 GiB memory.

5.3 Results

The results for Experiment 1 are in Figs. 1 (coarse grid) and 2 (fine grid). There we illustrate the evolution of the coefficient \( c^k \) together with the relative errors of the coefficient and of the functional value.

The results for Experiment 2 are in Figs. 3 (coarse grid) and 4 (fine grid). They show the evolution of the relative error of the coefficient and of the functional value. We also illustrate in Fig. 3 the data generation phantom for Experiment 2 on the coarse grid. The phantom on the fine grid has the same shapes and intensities.

The performance plots have time on the x-axis rather than the number of iterations, as the main difference between the splittings is expected to be in the computational effort for linear system solution, i.e., Lines 3 and 4 of Algorithm 2.1. For fairness, we limited the number of threads used by Julia/OpenBLAS to one.

In all experiments the splittings outperform full matrix inversion: the best splittings require roughly half of the computational effort for an iterate of the same quality. No particular splitting completely dominates another; however, Jacobi appears to be more prone to overstepping and oscillatory behaviour. On the other hand, quasi-CG currently has no convergence theory, and we have observed situations where it does not exhibit convergence while Jacobi and Gauss–Seidel splittings do. Therefore, Gauss–Seidel is our recommended option.