Abstract
In this article a special class of nonlinear optimal control problems involving a bilinear term in the boundary condition is studied. Problems of this kind arise for instance in the identification of an unknown space-dependent Robin coefficient from a given measurement of the state, or when the Robin coefficient can be controlled in order to reach a desired state. Necessary and sufficient optimality conditions are derived and several discretization approaches for the numerical solution of the optimal control problem are investigated. We consider both a full discretization and the postprocessing approach, in which an improved control is computed by a pointwise evaluation of the first-order optimality condition. For both approaches finite element error estimates are shown and the validity of these results is confirmed by numerical experiments.
1 Introduction
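This paper is concerned with bilinear boundary control problems of the form
$$\begin{aligned} J(y,u) := \frac{1}{2}\Vert y-y_d\Vert _{L^2(\varOmega )}^2 + \frac{\alpha }{2}\Vert u\Vert _{L^2(\varGamma )}^2 \rightarrow \min ! \end{aligned}$$
subject to
$$\begin{aligned} -\varDelta y + y&= f \quad \text{in}\ \varOmega , \\ \partial _n y + u\,y&= g \quad \text{on}\ \varGamma , \\ u_a \le u&\le u_b \quad \text{a.e. on}\ \varGamma , \end{aligned}$$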
where \(\varOmega \subset {{\mathbb {R}}}^n\), \(n\in \{2,3\}\), is a bounded domain, \(\alpha >0\) is the regularization parameter, \(y_d\in L^2(\varOmega )\) is a desired state and \(0 \le u_a < u_b\) are the control bounds.
As an application of bilinear boundary control problems we mention the identification of an unknown Robin coefficient from a given measurement \(y_d\) of the state quantity. This is for instance of interest in the modeling of stem cell division processes [16, 17], where u is the unknown parameter describing the chemical reactions between proteins from the cell interior and the cell cortex. For further applications, u can be interpreted as a heat-exchange coefficient in thermodynamics or as a quantity for corrosion damage in electrostatics. There are many publications dealing with the identification of the Robin coefficient, see for instance [12, 23, 31, 34]. Only a few papers use an optimal control approach similar to the one considered in the present article. We mention [22, 25], where the parabolic version of our model problem is considered. The authors prove convergence of a finite element approximation but no convergence rate is established. A similar problem is discussed in [21], dealing with the recovery of the Robin parameter in a variational inequality.
The aim of the present paper is to derive necessary and sufficient optimality conditions for the optimal control problem and to investigate several numerical approximations regarding convergence towards a local solution. This complements a previous contribution of Kröner and Vexler [27] where the distributed control case, meaning that the bilinear term \(u\,y\) appears in the differential equation, is discussed. The main results in their article are error estimates for the approximate controls in the \(L^2(\varOmega )\)-norm for several finite element approximations. To be more precise, the convergence rate 1 is shown for piecewise constant and 3/2 for piecewise linear approximations for the control. Moreover, advanced discretization concepts like the postprocessing approach [32] and the variational discretization [24] are investigated which allow an improvement up to a convergence rate of 2. It is the purpose of the present article to extend the results to the case of bilinear boundary control.
The numerical analysis of boundary control problems is usually more difficult than for distributed control problems as the adjoint control-to-state operator maps onto some Sobolev/Lebesgue space defined on the boundary. As a consequence, error estimates for the traces of finite element solutions have to be proved, more precisely, in the \(L^2(\varGamma )\)-norm. Here, we consider two different discretization approaches. The first one is a full discretization using piecewise linear finite elements for the states and piecewise constant functions on the boundary for the control approximation. Under the assumption that the domain has a Lipschitz boundary we show that the discrete optimal control converges with the optimal rate 1. To show this result we exploit the local coercivity of the objective, best-approximation properties of the control space and suboptimal error estimates for the state and adjoint equation. In order to obtain a more accurate solution we also investigate the postprocessing approach where an improved control is computed by a pointwise application of the first-order optimality condition to the discrete state variables. For this approach we have to assume more regularity for the exact solution and thus, we restrict our considerations to two-dimensional domains with sufficiently smooth boundary. Under this assumption we show the optimal convergence rate of \(2-\varepsilon\) with arbitrary \(\varepsilon >0\) which is the rate one would also expect in the case of linear quadratic boundary control problems and smooth solutions [3, 4, 33] (even with \(h^{-\varepsilon }\) replaced by \(|\ln h|\), where h is the maximal element diameter of the finite element mesh). The proof relies on the non-expansivity of the projection onto the feasible set as well as sharp error estimates for the state and adjoint state in \(L^2(\varGamma )\). 
To obtain estimates in these norms superconvergence properties of the midpoint interpolant, finite element error estimates for the Ritz projection in \(L^2(\varGamma )\) and a supercloseness result between the midpoint interpolant of the exact and the discrete solution are exploited. To show the \(L^2(\varGamma )\)-norm error estimate we will, as we consider smooth solutions, derive a maximum norm estimate. To the best of the author’s knowledge these results are not available in the literature for problems with Robin boundary conditions. Based on the ideas from [18] we formulate the missing proof.
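The rates announced above (1 for the full discretization, \(2-\varepsilon\) or \(h^2|\ln h|\) for the postprocessing approach) are typically checked in experiments through the experimental order of convergence. The following is a minimal sketch of that computation; the function name and the error data are purely synthetic illustrations and are not results from this article.

```python
import math


def experimental_orders(mesh_sizes, errors):
    """Experimental order of convergence between consecutive meshes.

    If e_k behaves like C * h_k**r, the quotient
    log(e_k / e_{k+1}) / log(h_k / h_{k+1}) approaches r.
    """
    orders = []
    for k in range(len(errors) - 1):
        orders.append(math.log(errors[k] / errors[k + 1])
                      / math.log(mesh_sizes[k] / mesh_sizes[k + 1]))
    return orders


# Synthetic data (assumption): errors behaving like h and like h**2 * |ln h|.
hs = [2.0 ** (-k) for k in range(3, 9)]
full_discretization = [0.7 * h for h in hs]
postprocessing = [1.3 * h ** 2 * abs(math.log(h)) for h in hs]

print(experimental_orders(hs, full_discretization))
print(experimental_orders(hs, postprocessing))
```

For the second data set the computed orders stay below 2 and increase only slowly under refinement, which is how a logarithmic factor usually shows up in such tables.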
We moreover note that the setting discussed here does not fit into the well-known framework of the semilinear optimal control problems discussed e.g. in [5, 9, 11, 29], as these contributions deal with nonlinearities depending solely on the state variable. However, many techniques can be reused for the problem considered here. The only publication where more general nonlinearities depending both on the state and the control variable are considered is, to the best of the author’s knowledge, [35]. Therein optimality conditions are discussed but there is no theory on the numerical analysis of approximation methods for this problem class available yet. However, we think that the consideration of bilinear control problems may serve as a starting point for the investigation of a more general class of nonlinear optimal control problems.
The article is structured as follows. In Sect. 2 we discuss the solubility of the state equation and regularity results for its solution. In Sect. 3 we analyze the optimal control problem. In particular, necessary and sufficient optimality conditions are investigated. Section 4 is devoted to the finite element discretization of the state equation, where we show finite element error estimates required for the numerical analysis of the optimal control problem later. The discretization of the optimal control problem is considered in Sect. 5. In particular, we discuss convergence rates for the numerical solution obtained by a full discretization of the optimal control problem as well as for an improved control obtained by a postprocessing step. The latter result requires some auxiliary results that we discuss in the appendix. To be more precise, a maximum norm error estimate for the finite element solution of an elliptic equation with Robin boundary conditions is needed. A proof is given in Appendix 1. Moreover, a proof of local error estimates for the midpoint interpolant and the \(L^2(\varGamma )\) projection onto piecewise constant functions on the boundary is needed. To the best of the author’s knowledge these results are not available in the literature in case of domains with curved boundaries. Thus, we discuss these auxiliary results in Appendix 2. Finally, we will compare the theoretical results with numerical experiments in Sect. 6.
2 Analysis of the state equation
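We consider the boundary value problem
$$\begin{aligned} -\varDelta y + y = f \quad \text{in}\ \varOmega ,\qquad \partial _n y + u\,y = g \quad \text{on}\ \varGamma , \end{aligned}$$
on a bounded Lipschitz domain \(\varOmega \subset {{\mathbb {R}}}^n\), \(n\in \{2,3\}\), with data \(f\in L^2(\varOmega )\) and \(g\in L^2(\varGamma )\). The corresponding weak formulation reads
$$\begin{aligned} \text{Find}\ y\in H^1(\varOmega ):\quad a_u(y,v) = (f,v)_{L^2(\varOmega )} + (g,v)_{L^2(\varGamma )}\quad \forall v\in H^1(\varOmega ), \end{aligned}$$ (1)
with
$$\begin{aligned} a_u(y,v) := \int _\varOmega (\nabla y\cdot \nabla v + y\,v) + \int _\varGamma u\,y\,v. \end{aligned}$$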
First, we show an existence and uniqueness result for (1). To this end, we introduce a decomposition of the control into positive and negative parts \(u^+,u^-\in L_+^2(\varGamma ):=\{v\in L^2(\varGamma ):v\ge 0 \text{ a.e. on }\varGamma \}\) such that \(u=u^+ - u^-\). The following result then relies on the Lax–Milgram lemma. However, an assumption on the coefficient u is required.
Lemma 1
Assume that \(u\in L^2(\varGamma )\) satisfies
$$\begin{aligned} \Vert u^-\Vert _{L^2(\varGamma )} < c_*^{-2} \end{aligned}$$ (2)
with the constant \(c_*\) which is due to the estimate \(\Vert v\Vert _{L^4(\varGamma )} \le c_*\Vert v\Vert _{H^1(\varOmega )}\). Then, the solution y of (1) belongs to \(H^1(\varOmega )\) and satisfies the a priori estimate
with \(\gamma _u := 1-c_*^2\,\Vert u^-\Vert _{L^2(\varGamma )}>0\).
Proof
The boundedness of \(a_u\) follows directly from the Cauchy–Schwarz inequality and the continuity of the trace operator \(\tau :H^1(\varOmega )\rightarrow L^4(\varGamma )\). This implies
To show the coercivity we take into account the decomposition \(u=u^+ - u^-\) to get
Here, the assumption (2) will ensure the coercivity. An application of the Lax–Milgram Lemma leads to the desired result. \(\square\)
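The coercivity step can be made explicit. The following lines are a sketch, assuming that the bilinear form in (1) is \(a_u(y,v) = \int _\varOmega (\nabla y\cdot \nabla v + y\,v) + \int _\varGamma u\,y\,v\) (the form suggested by the Robin boundary condition; this is our reading of the setting) and using \(u^+\ge 0\), the Cauchy–Schwarz inequality on \(\varGamma\) and the trace estimate with constant \(c_*\):

$$\begin{aligned} a_u(v,v)&= \Vert v\Vert _{H^1(\varOmega )}^2 + \int _\varGamma u^+\,v^2 - \int _\varGamma u^-\,v^2 \\&\ge \Vert v\Vert _{H^1(\varOmega )}^2 - \Vert u^-\Vert _{L^2(\varGamma )}\,\Vert v\Vert _{L^4(\varGamma )}^2 \\&\ge \left( 1 - c_*^2\,\Vert u^-\Vert _{L^2(\varGamma )}\right) \Vert v\Vert _{H^1(\varOmega )}^2 = \gamma _u\,\Vert v\Vert _{H^1(\varOmega )}^2. \end{aligned}$$

Under this reading, the requirement \(\gamma _u>0\) is exactly the smallness condition on \(u^-\) imposed in (2).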
Note that \(\{v\in L^2(\varGamma ):\Vert v^-\Vert _{L^2(\varGamma )} < c_*^{-2}\}\) is an open subset of \(L^2(\varGamma )\). This is the key idea which allows us to avoid the two-norm discrepancy for the optimal control problem as we will see that the reduced objective functional is differentiable with respect to the \(L^2(\varGamma )\)-topology. In the following we will hide the dependency of the estimates on \(\Vert u^-\Vert _{L^2(\varGamma )}\) and thus \(\gamma _u\) in the generic constant as we impose positive control bounds in the considered optimal control problem.
Later, we will frequently make use of the following Lipschitz estimate.
Lemma 2
If \(u_1,u_2\in L^2(\varGamma )\) satisfy the assumption (2), the corresponding states \(y_1,y_2\in H^1(\varOmega )\) solving
fulfill the estimate
Proof
Subtracting the variational formulations for \(y_1\) and \(y_2\) from each other leads to
The result follows from Lemma 1 and the continuity of the product mapping from \(L^2(\varGamma )\times H^{1/2}(\varGamma )\) to \(H^{-1/2}(\varGamma )\), see [20, Theorem 1.4.4.2]. \(\square\)
In the following lemma we collect some regularity results for the solution of (1).
Lemma 3
Let \(\varOmega \subset {{\mathbb {R}}}^n\), \(n\in \{2,3\}\), be a bounded Lipschitz domain. By \(y\in H^1(\varOmega )\) we denote the solution of (1). The following a priori estimates are valid, under the assumption that the input data possess the regularity demanded by the right-hand side:
- (a):
If \(r>2n/(1+n)\) and \(p > 2\) for \(n=2\) and \(p\ge 4\) for \(n=3\), then
$$\Vert y\Vert _{H^{3/2}(\varOmega )} + \Vert y\Vert _{H^1(\varGamma )}\le c\left( 1+\Vert u\Vert _{L^p(\varGamma )}\right) \left( \Vert f\Vert _{L^{r}(\varOmega )} + \Vert g\Vert _{L^2(\varGamma )} \right) .$$
- (b):
If \(r> n/2\), \(s > n-1\), and \(p\ge 2\) for \(n=2\) and \(p > 8/3\) for \(n=3\), then
$$\Vert y\Vert _{C({\overline{\varOmega }})}\le c \left( 1+\Vert u\Vert _{L^{p}(\varGamma )}\right) ^2\left( \Vert f\Vert _{L^{r}(\varOmega )} + \Vert g\Vert _{L^{s}(\varGamma )}\right) .$$
- (c):
Furthermore, if \(\varOmega\) is a convex polygonal/polyhedral domain, or possesses a boundary which is of class \(C^{1,1}\), there holds
$$\begin{aligned} \Vert y\Vert _{H^2(\varOmega )}&\le c\left( 1+\Vert u\Vert _{H^{1/2}(\varGamma )}\right) ^2\,\left( \Vert f\Vert _{L^2(\varOmega )} + \Vert g\Vert _{H^{1/2}(\varGamma )}\right) . \end{aligned}$$
Proof
(a)
In [15, Theorem 1.12] it is shown that the problem
possesses a solution in \(H^{3/2}(\varOmega )\) provided that \(F\in H^{s-2}(\varOmega )\) for some \(s\in (3/2,2]\) and \(G\in L^2(\varGamma )\), as well as \(\int _\varOmega F + \int _\varGamma G = 0\). The solubility condition is satisfied in our situation with \(F=f-y\) and \(G=g-u\,y\) and becomes clear when testing (1) with \(v\equiv 1\). The regularity required for F follows from the embedding \(f\in L^r(\varOmega )\hookrightarrow H^{-1/2+\varepsilon }(\varOmega )\) for sufficiently small \(\varepsilon >0\). Moreover, the Hölder inequality and the continuity of the trace operator \(\tau :H^1(\varOmega )\rightarrow L^q(\varGamma )\) for \(q<\infty\) (\(n=2\)) or \(q\le 4\) (\(n=3\)) imply \(\Vert u\,y\Vert _{L^2(\varGamma )} \le c\,\Vert u\Vert _{L^p(\varGamma )}\,\Vert y\Vert _{H^1(\varOmega )}\), from which we conclude \(G\in L^2(\varGamma )\). From [15, Theorem 1.12] and Lemma 1 we then obtain
It remains to show the \(H^1(\varGamma )\)-norm estimate. We split the solution into the parts \(y_f\) and \(y_g\) solving
Using [19, Theorem 5.4] we directly deduce
and Lemma 1 leads to the desired estimate for \(y_g\). For the function \(y_f\), we get the desired estimate by an application of a trace theorem and the a priori estimate (3) which can in case of \(g\equiv 0\) be improved to
provided that \(\varepsilon >0\) is sufficiently small. The validity of the second step can be confirmed by means of [15, Theorem 1.12] and [14, Theorem 23.3]. The decomposition \(y=y_f+y_g\) and the estimates shown above imply the desired estimate in the \(H^1(\varGamma )\)-norm.
(b) We prove the result for the case \(n=3\). The two-dimensional case follows from the same arguments. From [8, Theorem 3.1] it is known that the solution of (1) belongs to \(C({\overline{\varOmega }})\) if \(f\in L^r(\varOmega )\), \(r>n/2\), and \(g-uy\in L^s(\varGamma )\), \(s>n-1\). The latter assumption can be concluded from the Hölder inequality, a Sobolev embedding and a trace theorem, which implies
for \(1/p+1/8=1/(2+\varepsilon )\). A simple computation shows that \(p>8/3\) and \(s=2+\varepsilon\) with \(\varepsilon >0\) sufficiently small guarantee the validity of the previous steps. It remains to show \(y\in H^{5/4+\varepsilon }(\varOmega )\). This can be deduced from [14, Theorem 23.3] where the a priori estimate
is stated. The regularity demanded by the right-hand side of (4) is confirmed with the embeddings \(f\in L^r(\varOmega )\hookrightarrow H^{3/4-\varepsilon }(\varOmega )^*\) and \(g\in L^s(\varGamma )\hookrightarrow H^{-1/4+\varepsilon }(\varGamma )\). Moreover, there holds \(\Vert u\,y\Vert _{H^{-1/4+\varepsilon }(\varGamma )} \le c\,\Vert u\Vert _{L^p(\varGamma )}\,\Vert y\Vert _{L^4(\varGamma )}\), see [20, Theorem 1.4.4.2]. Collecting up the arguments above leads to
and the assertion follows after insertion of the a priori estimate from Lemma 1.
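The "simple computation" behind the threshold \(p>8/3\) in part (b) can be traced with exact rational arithmetic. The sketch below only solves the relation \(1/p+1/8=1/(2+\varepsilon )\) quoted in the proof for p; the function name is ours and purely illustrative.

```python
from fractions import Fraction


def control_exponent(eps):
    """Solve 1/p + 1/8 = 1/(2 + eps) for p, i.e. p = 8(2 + eps)/(6 - eps)."""
    eps = Fraction(eps)
    return 1 / (1 / (2 + eps) - Fraction(1, 8))


for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    p = control_exponent(eps)
    print(f"eps = {eps}: p = {p} ~ {float(p):.5f}")
```

The exponent tends to 8/3 from above as \(\varepsilon \rightarrow 0\), so every fixed \(p>8/3\) is covered by a sufficiently small \(\varepsilon >0\).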
(c) With an embedding we deduce from the assumption that \(u\in L^4(\varGamma )\). Hence, (4) is applicable which implies \(y\in H^{3/4}(\varGamma )\) and thus, \(u\,y\in H^{1/2}(\varGamma )\), see [20, Theorem 1.4.4.2]. The \(H^2(\varOmega )\)-regularity of y then follows from a shift theorem applied to the equation with boundary conditions \(\partial _n y = g - u y\in H^{1/2}(\varGamma )\) on \(\varGamma\), see [20, Theorem 2.4.2.7] (for domains with smooth boundary) or [20, Theorem 4.4.3.8] (for convex polygonal domains). \(\square\)
3 The optimal control problem
Due to the well-posedness of the state equation we may introduce the control-to-state operator \(S:U_{ad}\rightarrow H^1(\varOmega )\) defined by \(S(u):=y\), with y solving (1). This allows us to reformulate the optimal control problem introduced in Sect. 1 and we arrive at
subject to \(u\in U_{ad}:=\{v\in L^2(\varGamma ):u_a\le v\le u_b\ {\text {a.e. on}}\ \varGamma \}\). Here, \(\alpha >0\) is the regularization parameter, \(y_d\in L^2(\varOmega )\) the desired state and \(0< u_a < u_b\) the control bounds. Our aim is to derive necessary and sufficient optimality conditions as well as regularity results for local solutions. Note that the operator S is non-affine and consequently, j is non-convex. The existence of at least one local solution can be concluded from standard arguments [39], taking into account that for a minimizing sequence \(\{u_n\}\subset L^q(\varGamma )\), \(q\in (2,\infty )\), the corresponding states converge strongly in \(L^p(\varGamma )\) for each \(p<4\), which is due to the compact embedding \(H^1(\varOmega )\hookrightarrow L^p(\varGamma )\).
3.1 Optimality conditions
To derive optimality conditions, differentiability properties of the (implicitly defined) operator S are of interest.
Lemma 4
The operator \(S:U_{ad}\rightarrow H^1(\varOmega )\) is infinitely many times Fréchet differentiable with respect to the \(L^2(\varGamma )\)-topology. The first derivative \(\delta y := S'(u)\delta u\) is the weak solution of the tangent equation
Proof
The result follows from an application of the implicit function theorem to the operator \(e:H^1(\varOmega )\times U \rightarrow H^1(\varOmega )^*\) with \(U:=\{v\in L^2(\varGamma ):v \text{ fulfills (2)}\}\) defined by
whose roots are solutions of (1). We choose \(\delta y\in H^1(\varOmega )\), \(\delta u\in U\) such that \(u+\delta u\in U\) (note that U is an open subset of \(L^2(\varGamma )\)). First, we confirm that the linear operator \(e'(y,u):H^1(\varOmega )\times U\rightarrow H^1(\varOmega )^*\) defined by
is the Fréchet-derivative of e. This is a consequence of
and the fact that the remainder term satisfies
where we applied the generalized Hölder inequality and \(H^1(\varOmega )\hookrightarrow L^4(\varGamma )\). The second Fréchet derivative
is given by
and the mapping \((y,u)\mapsto e''(y,u)\) is continuous. The derivatives of order \(n\ge 3\) vanish. Hence, \(e:H^1(\varOmega )\times U\rightarrow H^1(\varOmega )^*\) is of class \(C^\infty\).
Finally, due to Lemma 1 we conclude that the linear mapping
is bijective. The implicit function theorem implies the assertion and the derivative \(\delta y:=S'(u)\delta u\) is given by \(e'(y,u)(\delta y,\delta u) = 0\). This corresponds to the weak formulation of (6). \(\square\)
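The characterization of \(S'(u)\delta u\) can be checked numerically in a one-dimensional analogue. The sketch below is an illustration under our own assumptions: the domain is the interval (0, 1), so the "boundary" consists of two points and the control is a pair of numbers; the state equation is taken as \(-y''+y=f\) with Robin condition \(\partial _n y+u\,y=g\), and the tangent equation is assumed to be the same problem with zero load and boundary datum \(-\delta u\,y\), as obtained by formally differentiating the state equation. All function names and data are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] do not enter the result."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def robin_solve(u, load, boundary_data, n=64):
    """P1 finite elements for -y'' + y = load on (0,1), dy/dn + u*y = data.

    u and boundary_data are pairs (value at x=0, value at x=1); load is constant.
    """
    h = 1.0 / n
    off = -1.0 / h + h / 6.0                    # stiffness + mass, off-diagonal
    diag = [2.0 / h + 2.0 * h / 3.0] * (n + 1)  # stiffness + mass, interior rows
    diag[0] = diag[n] = 1.0 / h + h / 3.0       # boundary rows
    diag[0] += u[0]                             # bilinear boundary term (u*y, v)
    diag[n] += u[1]
    rhs = [load * h] * (n + 1)
    rhs[0] = rhs[n] = load * h / 2.0
    rhs[0] += boundary_data[0]
    rhs[n] += boundary_data[1]
    return solve_tridiagonal([off] * (n + 1), diag, [off] * (n + 1), rhs)


def state(u):
    """Discrete control-to-state map for fixed (hypothetical) data f = 1, g = 0.5."""
    return robin_solve(u, load=1.0, boundary_data=(0.5, 0.5))


def tangent(u, du):
    """Discrete tangent solution: same operator, zero load, boundary datum -du*y."""
    y = state(u)
    return robin_solve(u, load=0.0, boundary_data=(-du[0] * y[0], -du[1] * y[-1]))


u, du, t = (1.0, 2.0), (0.3, -0.7), 1e-6
y0 = state(u)
y1 = state((u[0] + t * du[0], u[1] + t * du[1]))
dy = tangent(u, du)
fd_error = max(abs((b - a) / t - d) for a, b, d in zip(y0, y1, dy))
print(f"max deviation between difference quotient and tangent solution: {fd_error:.2e}")
```

The difference quotient and the tangent solution agree up to the size of the step t, which is the behaviour one expects from a Fréchet derivative.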
From the chain rule and Lemma 4 we directly conclude the following differentiability result:
Lemma 5
The functional \(j:U_{ad}\rightarrow {{\mathbb {R}}}\) is infinitely many times Fréchet differentiable with respect to the \(L^2(\varGamma )\) -topology and the first derivative is given by
The derivative of j can be simplified exploiting a precise representation of the adjoint \(S'(u)^*:H^1(\varOmega )^*\rightarrow L^2(\varGamma )\) of the linearized control-to-state operator \(S'(u)\). In order to compute this, we introduce the adjoint state \(p\in H^1(\varOmega )\) as the weak solution of the adjoint equation
Testing the variational problems for (10) and (6) with \(\delta y:=S'(u)\delta u\) and p, respectively, leads to the relation
which implies
In the following we denote the control-to-adjoint mapping \(Z:L^2(\varGamma )\rightarrow H^1(\varOmega )\) defined by \(u\mapsto Z(u):=p\) via (10) with \(y=S(u)\). Finally, we are able to formulate the necessary optimality condition
and with (11) we get the equivalent representation
Taking into account the definitions of S and Z we can write this variational inequality in the form
The latter inequality is equivalent to the projection formula
with \(\varPi _{ad}\) the \(L^2(\varGamma )\)-projection onto \(U_{ad}\).
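Since \(U_{ad}\) is defined by pointwise bounds, the \(L^2(\varGamma )\)-projection \(\varPi _{ad}\) acts pointwise as a clamp to \([u_a,u_b]\). The following sketch evaluates such a projection formula at boundary nodes, assuming that the argument of the projection is \(\alpha ^{-1}\,y\,p\) on \(\varGamma\) (the sign depends on the convention used for the adjoint equation and is an assumption here); the nodal values are hypothetical.

```python
def project(value, lower, upper):
    """Pointwise projection of a real number onto [lower, upper]."""
    return min(max(value, lower), upper)


def projected_control(state_trace, adjoint_trace, alpha, u_a, u_b):
    """Evaluate Pi_ad(y * p / alpha) node by node on the boundary."""
    return [project(y * p / alpha, u_a, u_b)
            for y, p in zip(state_trace, adjoint_trace)]


# Hypothetical boundary values of state and adjoint state (illustration only).
y_gamma = [1.0, 0.8, 0.5, 0.2]
p_gamma = [0.5, 0.1, -0.2, 0.9]
u = projected_control(y_gamma, p_gamma, alpha=0.1, u_a=0.5, u_b=2.0)
print(u)

# Non-expansivity: |Pi(a) - Pi(b)| <= |a - b| for all real a, b.
samples = [-3.0, -0.2, 0.5, 0.9, 1.7, 2.0, 4.5]
assert all(abs(project(a, 0.5, 2.0) - project(b, 0.5, 2.0)) <= abs(a - b)
           for a in samples for b in samples)
```

The same pointwise evaluation, applied to discrete state variables, is what the postprocessing approach uses to obtain an improved control; the non-expansivity checked at the end is the property the corresponding error analysis relies on.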
As the problem (5) is not convex, we have to investigate second-order sufficient conditions. To obtain the Hessian of j we apply the product rule and get
The function \(\tau p = Z'(u)\tau u\in H^1(\varOmega )\) is the weak solution of the “dual for Hessian”-equation
where \(\tau y = S'(u)\tau u\). As in the proof of Lemma 4 this follows from the implicit function theorem. Note that also further representations of the Hessian are possible. For instance, a direct application of the product rule to (9) yields
with \(y=S(u)\), \(\delta y = S'(u)\delta u\), \(\tau y = S'(u)\tau u\) and \(\delta \tau y = S''(u)(\delta u,\tau u)\). The latter relation means that \(\delta \tau y\in H^1(\varOmega )\) is the weak solution of
Moreover, due to the definition of p and \(\delta \tau y\) there holds the relation \((y-y_d,\delta \tau y)_{L^2(\varOmega )} = -(p,\delta u\tau y + \tau u\,\delta y)_{L^2(\varGamma )}\) and as a consequence, we can further simplify the representation of the Hessian and obtain
Next, we derive some stability and Lipschitz properties of S, Z, \(S'\) and \(Z'\). As the following results require different assumptions on f, \(y_d\) and g we simply assume the most restrictive ones, that is,
Moreover, we will hide the dependency on these quantities in the generic constant to simplify the notation.
Lemma 6
Let \(u\in L^2(\varGamma )\) satisfy the assumption (2). The control-to-state operator S satisfies the following inequalities:
with \(p_1 > 2\) and \(p_2\ge 2\) for \(n=2\), and \(p_1\ge 4\) and \(p_2 > 8/3\) for \(n=3\). The estimates remain valid when replacing the operator S by the control-to-adjoint operator Z.
Proof
The inequalities for S are a direct consequence of Lemmata 1 and 3. The inequalities for Z can be derived with similar arguments, but the right-hand side of the adjoint equation involves the corresponding state S(u). However, in all cases the norms of \(S(u)-y_d\) can be bounded by \(c\,(1+\Vert S(u)\Vert _{H^1(\varOmega )})\le c\). \(\square\)
Lemma 7
Let \(u,\delta u\in L^2(\varGamma )\) be given and assume that u satisfies (2). Then, the following stability estimates hold true:
with \(p>2\) for \(n=2\) and \(p\ge 4\) for \(n=3\). The estimates remain valid when replacing \(S'\) by \(Z'\).
Proof
In the following we write \(y:=S(u)\) and \(\delta y = S'(u)\delta u\). The stability in \(H^1(\varOmega )\) follows directly from Lemma 1 and the estimate
which follows from the same arguments used already in (8). The boundedness of \(y:=S(u)\) in \(H^1(\varOmega )\) can be found in the previous lemma. The estimate in the \(H^{3/2}(\varOmega )\)-norm follows analogously with Lemma 3a) and
and the stability in \(L^\infty (\varOmega )\) proved in Lemma 6.
The estimates for \(Z'\) are deduced with similar techniques. With the a priori estimate from Lemma 3a) and the embedding \(H^1(\varOmega )\hookrightarrow L^r(\varOmega )\) which holds for \(r <\infty\) (\(n=2\)) or \(r\le 6\) (\(n=3\)) we get
with \(p=Z(u)\). The stability of Z in \(L^\infty (\varOmega )\) is discussed in the previous lemma. \(\square\)
Lemma 8
Let \(u,v\in L^2(\varGamma )\) satisfy assumption (2). Then, the following Lipschitz estimates hold:
The estimates are also valid when replacing S by Z and \(S'\) by \(Z'\).
Proof
The estimates for S and \(S'\) follow directly from Lemma 2 and the stability estimates for S and \(S'\) in \(H^1(\varOmega )\) proved in the Lemmata 6 and 7. The Lipschitz estimate for Z is proved in a similar way. In this case one has to apply the Lipschitz estimate shown for S to the term \(\Vert S(u)-S(v)\Vert _{H^1(\varOmega )}\) appearing due to the differences in the right-hand sides. With the same idea we show the Lipschitz estimate for \(Z'\). Using again Lemma 2 we get
It remains to bound the three terms on the right-hand side. To this end, we apply Lemma 7 to the first term, the Lipschitz estimate for \(S'(\cdot )\delta u\) to the second term, and the multiplication rule (16) with \(y=Z(u) - Z(v)\) as well as the Lipschitz estimate for Z to the third term. \(\square\)
As the optimal control problem is non-convex we have to deal with local solutions. For some local solution \({\bar{u}}\in U_{ad}\) we require the following second-order sufficient condition:
Assumption 1
(SSC) The objective functional is locally convex near the local solution \({\bar{u}}\), i. e., a constant \(\delta > 0\) exists such that
With standard arguments and the estimate we will prove below in Corollary 1 one can show that each function \({\bar{u}}\in U_{ad}\) fulfilling the first-order necessary condition (12) and the second-order sufficient condition (17) is indeed a local solution and satisfies the quadratic growth condition
with certain constants \(\gamma ,\tau > 0\). Note that there are weaker assumptions which are sufficient for local minima, for instance one could formulate (17) for all directions v from a critical cone. However, with this assumption the convergence proof for the postprocessing approach presented in Sect. 5.3 requires some more careful investigations, in particular the construction of a modified interpolant onto \(U_{ad}\). One possible solution for this issue can be found in [29].
Later, we will require the following Lipschitz estimate for the Hessian of j.
Lemma 9
Let \(u,v \in L^2(\varGamma )\) fulfilling (2) be given. Then, the Lipschitz estimate
is valid for all \(\delta u\in L^2(\varGamma )\).
Proof
To shorten the notation we write \(y_u=S(u)\), \(p_u = Z(u)\), \(\delta y_u = S'(u)\delta u\) and \(\delta p_u = Z'(u)\delta u\). From the representation (14) we obtain
We estimate the right-hand side using the Cauchy–Schwarz inequality, the embedding \(H^1(\varOmega )\hookrightarrow L^4(\varGamma )\) and the Lipschitz estimates from Lemma 8 as well as the a priori estimates from Lemmata 6 and 7. This implies
With similar arguments we deduce
and conclude the assertion. \(\square\)
Corollary 1
Let \({\bar{u}}\in U_{ad}\) be a local solution of (5) satisfying Assumption 1. Then, some \(\varepsilon >0\) exists such that the inequality
is valid for all \(\delta u\in L^2(\varGamma )\) and \(u\in L^2(\varGamma )\) with \(\Vert u-{\bar{u}}\Vert _{L^2(\varGamma )}\le \varepsilon\).
Proof
The assertion follows immediately from the previous lemma. For further details we refer to [27, Lemma 2.23]. \(\square\)
In the next Lemma we will collect some basic regularity results for the solution of (5).
Lemma 10
Let \(\varOmega \subset {{\mathbb {R}}}^n\), \(n\in \{2,3\}\), be a Lipschitz domain. Each local solution \({\bar{u}}\in U_{ad}\) of (5) and the corresponding states \({\bar{y}}=S({\bar{u}})\), \({\bar{p}}=Z({\bar{u}})\) satisfy
Proof
All regularity results, except \({\bar{u}}\in H^1(\varGamma )\), follow directly from Lemma 3. To show \({\bar{u}}\in H^1(\varGamma )\) we apply the product rule
and confirm \({\bar{y}}\,{\bar{p}}\in H^1(\varGamma )\). The desired result then follows after an application of the Stampacchia-Lemma, [26, p. 50], to the projection formula (13). The fact that the Stampacchia-Lemma is also valid on the boundary \(\varGamma\) is discussed in [28, Lemma 2.8] and [30, Lemma 3.3]. \(\square\)
Under additional assumptions on the geometry of \(\varOmega\) we can show even higher regularity. This is needed for the postprocessing approach studied in Sect. 5.3 where we will show almost quadratic convergence of the control approximations.
Lemma 11
Let \(\varOmega \subset {{\mathbb {R}}}^2\) be a bounded domain with a \(C^{1,1}\)-boundary \(\varGamma\). Then, there holds
for all \({\tilde{\varGamma }}\subset \subset {\mathcal {A}}\) or \({\tilde{\varGamma }}\subset \subset {\mathcal {I}}\), where \({\mathcal {A}}:=\{x\in \varGamma :u(x)\in \{u_a,u_b\}\}\) and \({\mathcal {I}}:=\varGamma \setminus {\mathcal {A}}\) denote the active and inactive set, respectively.
Proof
With the regularity results obtained already in Lemma 10, in particular \({\bar{u}}\in H^{1/2}(\varGamma )\), and Lemma 3c) we conclude \({\bar{y}}, {\bar{p}}\in H^2(\varOmega )\hookrightarrow W^{1,q}(\varGamma )\) for all \(q\in (1,\infty )\) and a further application of the multiplication rule yields \({\bar{y}}\,{\bar{p}}\in W^{1,q}(\varGamma )\). From (13) we conclude the property \({\bar{u}}\in W^{1,q}(\varGamma )\). Furthermore, we confirm that \({\bar{u}}\,{\bar{y}}, {\bar{u}}\,{\bar{p}}\in W^{1-1/q,q}(\varGamma )\) and a standard shift theorem for the Neumann problem, compare also the technique used in the proof of Lemma 3a), results in \({\bar{y}},{\bar{p}}\in W^{2,q}(\varOmega )\). Repeating the arguments above, i.e., using the multiplication rule and the projection formula, we obtain \({\bar{u}}\in W^{2-1/q,q}({\tilde{\varGamma }})\hookrightarrow H^{2-1/q}({\tilde{\varGamma }})\). \(\square\)
We chose the assumptions of the previous lemma in such a way that the regularity is only restricted due to the projection formula. Of course, when the control bounds are never active we could further improve the regularity results.
4 Finite element approximation of the state equation
This section is devoted to the finite element approximation of the variational problem (1). While the results from the previous sections are valid for arbitrary Lipschitz domains (unless otherwise explicitly assumed), we have to assume more smoothness of the boundary \(\varGamma\) in order to establish our discretization results:
- (A1):
The domain \(\varOmega \subset {{\mathbb {R}}}^n\), \(n\in \{2,3\}\), possesses a Lipschitz continuous boundary \(\varGamma\) which is piecewise \(C^1\).
This definition includes arbitrary (possibly non-convex) polygonal or polyhedral domains. Indeed, the regularity of solutions is in this case also restricted by corner and edge singularities. However, for the first convergence result we require only \(H^{3/2}(\varOmega )\cap H^1(\varGamma )\)-regularity of the solution. Later, we want to investigate improved discretization techniques for which more regularity is needed. Then, we will use a stronger assumption on the domain.
First, we introduce shape-regular triangulations \(\{{\mathcal {T}}_h\}_{h>0}\) of \(\varOmega\) consisting of triangles (\(n=2\)) or tetrahedra (\(n=3\)). The elements T may have curved edges/faces such that the property
is valid for an arbitrary domain \(\varOmega\). Moreover, we assume that the triangulations are feasible in the sense of Ciarlet [13].
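The mesh parameter \(h>0\) is the maximal element diameter
$$\begin{aligned} h := \max _{T\in {\mathcal {T}}_h} h_T,\qquad h_T := {\text {diam}}(T). \end{aligned}$$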
The family of meshes \(\{{\mathcal {T}}_h\}_{h>0}\) is assumed to be quasi-uniform, this means some \(\kappa > 0\) independent of h exists such that each element \(T\in {\mathcal {T}}_h\) contains a ball with radius \(\rho _T\) satisfying the estimate \(\frac{\rho _T}{h} \ge \kappa\). Each triangulation \({\mathcal {T}}_h\) of \(\varOmega\) induces also a triangulation \({\mathcal {E}}_h\) of the boundary \(\varGamma\)
By \(F_T:{\hat{T}}\rightarrow T\) we denote the transformations from the reference triangle or tetrahedron \({\hat{T}}\) to the world element \(T\in {\mathcal {T}}_h\). The transformations \(F_T\) may be non-affine for elements with curved faces. Here, we consider transformations of the form
with some affine function \({\tilde{F}}_T({\hat{x}}) = {\tilde{B}}_T{\hat{x}} + {\tilde{b}}_T\), \({\tilde{B}}_T\in {{\mathbb {R}}}^{n\times n}\), \({\tilde{b}}_T\in {{\mathbb {R}}}^n\), chosen in such a way that if T is a curved boundary element, \({\tilde{T}}={\tilde{F}}_T({\hat{T}})\) is an n-simplex whose vertices coincide with the vertices of T. The assumed shape-regularity implies \(\Vert {\tilde{B}}_T\Vert \le c\,h_T\) and \(\Vert {\tilde{B}}_T^{-1}\Vert \le c\,h_T^{-1}\), see [13, Theorem 15.2].
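The quasi-uniformity requirement \(\rho _T/h\ge \kappa\) stated above can be checked directly on a given mesh. The following is a minimal sketch for straight-sided triangles in the plane (curved boundary elements are not handled); it uses the inradius as \(\rho _T\), and the function names and the two-triangle mesh are illustrative assumptions.

```python
import math


def inradius_and_diameter(a, b, c):
    """Inradius and longest edge of the triangle with vertices a, b, c."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))
    return 2.0 * area / (la + lb + lc), max(la, lb, lc)


def quasi_uniformity_constant(triangles):
    """Largest kappa with rho_T / h >= kappa for all triangles of the mesh."""
    h = max(inradius_and_diameter(*t)[1] for t in triangles)
    return min(inradius_and_diameter(*t)[0] for t in triangles) / h


# Hypothetical mesh: the unit square split along its diagonal.
mesh = [((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)),
        ((0.0, 0.0), (1.0, 1.0), (0.0, 1.0))]
kappa = quasi_uniformity_constant(mesh)
print(f"kappa = {kappa:.4f}")
```

For a family of meshes one would evaluate this quantity on every refinement level and verify that it stays bounded away from zero.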
To guarantee the validity of interpolation error estimates we assume:
- (A2):
The triangulations \({\mathcal {T}}_h\) are regular of order 2 in the sense of [6], that is, for all sufficiently small \(h>0\) there holds
$$\begin{aligned} \sup _{{\hat{x}}\in {\hat{T}}} \Vert D \varPhi _T({\hat{x}})\cdot {\tilde{B}}_T^{-1}\Vert \le c < 1,\qquad \sup _{{\hat{x}}\in {\hat{T}}}\Vert D^2 \varPhi _T({\hat{x}})\Vert \le c h^2, \end{aligned}$$ (18)
for all \(T\in {\mathcal {T}}_h\).
There are multiple strategies to construct the mappings \(F_T\) satisfying these assumptions and we refer the reader for instance to [6, 37, 41]. Therein, it is assumed that \(\varGamma\) is piecewise \(C^3\), only in the second reference \(C^4\) is required.
The trial and test space is defined by
Next, we introduce an interpolation operator onto \(V_h\). We partly use the quasi-interpolant proposed by Bernardi [6], but use a modification for boundary nodes as in [36], see also [1, 2]. To each interior node \(x_i\), \(i=1,\ldots ,N^{{\text {in}}}\), of \({\mathcal {T}}_h\), we associate an element \(\sigma _i:=T\in {\mathcal {T}}_h\) with \(x_i\in T\). For the boundary nodes \(x_i\), \(i=N^{{\text {in}}}+1,\ldots ,N\), we define instead \(\sigma _i:= E\in {\mathcal {E}}_h\) with \(x_i\in E\). Instead of using nodal values as for the Lagrange interpolant, we use the nodal values of some regularized function computed by an \(L^2\)-projection over \(\sigma _i\). The interpolation operator \(\varPi _h:W^{1,1}(\varOmega )\rightarrow V_h\) is defined as follows. For each node \(i=1,\ldots ,N\) we define a local \(L^2\)-projection \({\hat{\pi }}_i\) onto \({\mathcal {P}}_1(\sigma _i)\) by
with \(F_i\) the transformation from the reference element \({\hat{T}}\) (\(i=1,\ldots ,N^{{\text {in}}}\)) or \({\hat{E}}\) (\(i=N^{{\text {in}}}+1,\ldots ,N\)) onto \(\sigma _i\). The interpolation operator is defined by
where \(\{\varphi _i\}_{i=1,\ldots ,N}\) is the nodal basis of \(V_h\). Note that due to the modification for boundary nodes, this operator is only applicable to \(W^{1,1}(\varOmega )\)-functions. The desired interpolation properties remain valid. In particular, for each \(T\in {\mathcal {T}}_h\), there holds
where \(S_T\) is the patch of elements adjacent to T, see [6, Theorem 4.1], [36, Theorem 4.1]. Due to the special choice of the patches \(\sigma _i\) for the boundary nodes we get similar interpolation error estimates on the boundary elements \(E\in {\mathcal {E}}_h\), that is,
with the patch \(S_E\) of those boundary elements \(E'\in {\mathcal {E}}_h\) that touch E. The proof follows from the same arguments as in [36, Theorem 4.1].
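The idea of replacing nodal values by values of a local \(L^2\)-projection can be illustrated in a one-dimensional analogue, where \(\sigma _i\) is an interval \((a,b)\). The following is only a minimal sketch: the function names `simpson01` and `local_projection_value` and the use of composite Simpson quadrature are assumptions made for illustration and are not taken from [6] or [36].

```python
def simpson01(g, n=32):
    """Composite Simpson rule for the integral of g over [0, 1] (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    return total * h / 3.0


def local_projection_value(v, a, b, x_node):
    """Value at x_node of the L2(a, b)-projection of v onto linear polynomials.

    Hypothetical 1D stand-in for the local projection over sigma_i.
    """
    to_phys = lambda t: a + t * (b - a)  # affine map from the reference interval
    # Moments of v against the monomial basis {1, t} on the reference interval;
    # the Jacobian (b - a) cancels on both sides of the projection equations.
    m0 = simpson01(lambda t: v(to_phys(t)))
    m1 = simpson01(lambda t: t * v(to_phys(t)))
    # Solve the 2x2 system with Gram matrix [[1, 1/2], [1/2, 1/3]] in closed form.
    c0 = 4.0 * m0 - 6.0 * m1
    c1 = 12.0 * m1 - 6.0 * m0
    t_node = (x_node - a) / (b - a)
    return c0 + c1 * t_node
```

In this sketch linear functions are reproduced exactly at the node, whereas for a general function the regularized nodal value differs from the point value, which is why no continuity of the interpolated function is needed.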
The finite element solutions of (1) are characterized by the variational formulations
As in the continuous case one can show that (21) possesses a unique solution for each \(h>0\).
With the usual arguments we can derive an error estimate for the approximation error in the energy-norm.
Lemma 12
Assume that (A1) and (A2) are satisfied and that the solution y of (1) belongs to \(H^{s}(\varOmega )\) with some \(s\in [1,2]\). Then, there holds the error estimate
Proof
The proof follows from Céa's lemma and the interpolation error estimates (19). \(\square\)
Of particular interest are error estimates on the boundary. This is required in order to derive error estimates for boundary control problems. To this end, we prove first a suboptimal result which is valid for arbitrary Lipschitz domains \(\varOmega\).
Lemma 13
Let the assumptions (A1) and (A2) be satisfied. It is assumed that the solution y of (1) belongs to \(H^{3/2}(\varOmega )\). Moreover, the parameter u fulfills (2) and belongs to \(L^p(\varGamma )\) with \(p>2\) for \(n=2\) and \(p\ge 4\) for \(n=3\). Then, the error estimate
holds for all \(h>0\).
Proof
We introduce the dual problem
and obtain with the typical arguments of the Aubin-Nitsche trick
The last step is an application of Lemma 12 and the interpolation error estimate (19). The regularity required for the dual solution w can be deduced from Lemma 3 with \(f\equiv 0\) and \(g=y-y_h\). Taking into account the a priori estimate
we conclude the assertion. \(\square\)
If the solution is more regular, we can also show a higher convergence rate. In this case we will use the Hölder inequality and a trace theorem to obtain \(\Vert y-y_h\Vert _{L^2(\varGamma )} \le c\,\Vert y-y_h\Vert _{L^\infty (\varOmega )}\), and insert the following result.
Theorem 2
Consider a planar domain \(\varOmega \subset {{\mathbb {R}}}^2\). Let \(u\in H^{1/2}(\varGamma )\) with \(u\ge 0\) a. e., and assume that (A1) and (A2) are satisfied. Assume that the solution y of (1) belongs to \(W^{2,q}(\varOmega )\) with \(q\in [2,\infty )\). Then, the error estimate
is valid.
The proof requires rather technical arguments and is postponed to the appendix.
5 The discrete optimal control problem
In the following we investigate the discretized optimal control problem:
subject to
The reduced objective functional is denoted by \(j_h(u_h):=J_h(S_h(u_h),u_h)\). We use piecewise linear finite elements to approximate the state y, i. e., the space \(V_h\) is defined as in the previous section. The controls are sought in the space of piecewise constant functions,
where \({\mathcal {E}}_h\) is the triangulation of the boundary induced by \({\mathcal {T}}_h\).
As in the continuous case we can derive a first-order necessary optimality condition which reads
The discrete control-to-state operator \(S_h:L^2(\varGamma )\rightarrow V_h\) and the discrete control-to-adjoint operator \(Z_h:L^2(\varGamma )\rightarrow V_h\) are defined by \(y_h= S_h(u)\) and \(p_h = Z_h(u)\) with
Analogous to the continuous case we compute the first and second derivatives of \(j_h\) and obtain
and
where \(\tau y_h = S_h'(u)\tau u\in V_h\) and \(\tau p_h = Z_h'(u)\tau u\in V_h\) are the solutions of
with \(y_h=S_h(u)\) and \(p_h=Z_h(u)\). These are the discretized versions of the equations (6) and (15). The first-order optimality condition reads in the short form
5.1 Properties of the discrete control-to-state/adjoint operator
In Sect. 3 we have derived several stability and Lipschitz properties for the operators S, Z, \(S'\) and \(Z'\). Here, we will derive the discrete analogues that are needed in the following. Throughout this section we assume that (A1) and (A2) are fulfilled.
Lemma 14
There hold the following properties:
for \(p_1, p_2 >2\) for \(n=2\) and \(p_1\ge 4\), \(p_2> 4\) for \(n=3\). These estimates remain valid when replacing \(S_h\) by \(Z_h\).
Proof
We start with the estimate in the \(H^1(\varGamma )\)-norm. With the triangle inequality and an inverse estimate we obtain
The first two terms are bounded by the last one due to (20) and it remains to apply the stability estimate from Lemma 6. For the third term we apply the error estimate from Lemma 13. This implies the first estimate.
We prove the maximum norm estimate only for the case \(n=3\). In the following, we write \(y_h := S_h(u)\). We introduce the function \({\tilde{y}}\in H^1(\varOmega )\) solving the problem
Obviously, \(y_h\) is the Neumann Ritz-projection of \({\tilde{y}}\), i. e.,
Let \(x^*\in {\bar{T}}^*\) with \(T^*\in \mathcal T_h\) be the point where \(|y_h|\) attains its maximum. With an inverse inequality and the Hölder inequality we get
where \(\delta ^h\) is a regularized delta function defined by \(\delta ^h(x) = |T^*|^{-1}{{\,\mathrm{sgn}\,}}({\tilde{y}}(x)-y_h(x))\) if \(x\in T^*\) and \(\delta ^h(x)=0\) otherwise. The second term on the right-hand side can be treated with the arguments used already in the proof of Lemma 3b), namely
with \(r>3/2\) and \(s=2+\varepsilon\) with \(\varepsilon >0\) sufficiently small such that the following arguments remain valid. We estimate the last term with the Hölder inequality for \(p_2=4\,(2+\varepsilon )/(2-\varepsilon )\) and \(p'=4\) (note that \(1/{p_2}+1/p'=1/s\)) and the embedding \(H^1(\varOmega )\hookrightarrow L^4(\varGamma )\). This yields
It remains to exploit stability of \(S_h\) in the \(H^1(\varOmega )\)-norm to conclude
The estimate for the first term on the right-hand side of (28) is based on the ideas from [40, Section 3.6]. First, we introduce a regularized Green’s function \(g^h\in H^1(\varOmega )\) solving the variational problem \(a^{\text{ N }}(z,g^h) = (\delta ^h,z)_{L^2(\varOmega )}\) for all \(z\in H^1(\varOmega )\). The Neumann Ritz-projection of \(g^h\) is denoted by \(g_h^h\). Using the Galerkin orthogonality we obtain
where the last step follows from the stability of the Ritz projection and the interpolation error estimate (19). To bound the \(H^1(\varOmega )\)-norm of \(g^h\) we apply the ellipticity of \(a^{\text{ N }}\), the definition of \(g^h\), the Hölder inequality and an embedding to arrive at
The last step follows from the property \(\Vert \delta ^h\Vert _{L^{6/5}(\varOmega )}\le c\,|T^*|^{-1/6}\le c\,h^{-1/2}\) that can be confirmed with a simple computation. Insertion into (30) and taking into account (28) and (29) yields the desired stability estimate.
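For completeness, the computation behind this property can be sketched as follows. The sketch assumes \(|T^*|\ge c\,h^3\), which holds for \(n=3\) on shape-regular and quasi-uniform meshes; the quasi-uniformity is an assumption of this sketch. Since \(|\delta ^h|\le |T^*|^{-1}\) on \(T^*\) and \(\delta ^h=0\) elsewhere, we get

$$\begin{aligned} \Vert \delta ^h\Vert _{L^{6/5}(\varOmega )}^{6/5} = \int _{T^*} |\delta ^h(x)|^{6/5}\,\mathrm {d}x \le |T^*|\cdot |T^*|^{-6/5} = |T^*|^{-1/5}, \end{aligned}$$

and taking the power \(5/6\) yields \(\Vert \delta ^h\Vert _{L^{6/5}(\varOmega )}\le |T^*|^{-1/6}\le c\,h^{-1/2}\).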
The estimates for \(Z_h\) follow in a similar way. One just has to replace f by \(S_h(u)-y_d\) and the result follows from the estimates proved already for \(S_h(u)\). \(\square\)
Lemma 15
Assume that \(u,v\in L^2(\varGamma )\) satisfy the assumption (2). Then, the Lipschitz estimate
holds.
Proof
The proof follows with the same arguments as in the continuous case, see Lemmata 2 and 8. \(\square\)
Next, we discuss some error estimates for the approximation of the control-to-state and control-to-adjoint operator. While error estimates for \(S_h\) and \(Z_h\) are a direct consequence of Lemma 12, the results for the linearized operators \(S_h'\) and \(Z_h'\) require some more effort as for instance \(S'(u)\delta u - S_h'(u)\delta u\) does not fulfill the Galerkin orthogonality.
Lemma 16
For each \(u\in U_{ad}\) and \(\delta u\in L^2(\varGamma )\) the error estimates
are valid for \(p>2\) for \(n=2\) and \(p\ge 4\) for \(n=3\). The results are also valid when replacing S and \(S_h\) by Z and \(Z_h\), as well as \(S'\) and \(S_h'\) by \(Z'\) and \(Z_h'\), respectively.
Proof
The first estimate is just a combination of the Lemmata 6 and 12. To show the estimate for the linearized operators we introduce again the abbreviations \(y:=S(u)\), \(y_h:=S_h(u)\), \(\delta y := S'(u)\delta u\) and \(\delta y_h:= S_h'(u)\delta u\). Moreover, define the auxiliary function \(\delta {\tilde{y}}_h\in V_h\) as the solution of
This function fulfills the Galerkin orthogonality, i. e., \(a_u(\delta y-\delta {\tilde{y}}_h,v_h) = 0\) for all \(v_h\in V_h\). Hence, we obtain with Lemma 12 and the Lipschitz-property from Lemma 2 (note that this Lemma is also valid for the discrete solutions)
For the first term we simply insert the second estimate from Lemma 7. The second term on the right-hand side is further estimated by means of [20, Theorem 1.4.4.2] and a trace theorem which yield
and the assertion follows after an application of the estimate shown already for \(S(u)-S_h(u)\). The estimates for Z and \(Z'\) follow with similar arguments. \(\square\)
5.2 Convergence of the fully discrete solutions
Throughout this subsection we assume that the properties (A1) and (A2) are fulfilled. These assumptions are needed to guarantee the required regularity of the solution and the validity of interpolation error estimates.
As the solutions of both the continuous and discrete optimal control problem (5) and (23), respectively, are not unique we have to construct a sequence of discrete local solutions converging towards a continuous one. The first question which arises is whether such a sequence exists. To this end, we introduce a localized problem
where \({\bar{u}}\in U_{ad}\) is a fixed local solution of (5) fulfilling Assumption 1 and \(B_\varepsilon (\bar{u})\) is the \(L^2(\varGamma )\)-ball with radius \(\varepsilon\) around \({\bar{u}}\). The parameter \(\varepsilon >0\) is arbitrary but sufficiently small. First, we show that this problem possesses a unique local solution which would immediately follow if we could show that the coercivity discussed in Corollary 1 is transferred to the discrete case. The following arguments are similar to the investigations in [11], in particular Theorem 4.4 and 4.5 therein.
Lemma 17
Let \({\bar{u}}\in U_{ad}\) be a local solution of (5). Assume that \(\varepsilon >0\) and \(h>0\) are sufficiently small. Then, the inequality
is valid for all u satisfying \(\Vert u-{\bar{u}}\Vert _{L^2(\varGamma )} \le \varepsilon\).
Proof
With the explicit representations of \(j''\) and \(j_h''\) from (14) and (26), respectively, and Corollary 1, we obtain
with \(y=S(u)\), \(p = Z(u)\), \(\delta y= S'(u)\delta u\) and \(\delta p=Z'(u)\delta u\), and the discrete analogues \(y_h=S_h(u)\), \(p_h = Z_h(u)\), \(\delta y_h= S_h'(u)\delta u\) and \(\delta p_h=Z_h'(u)\delta u\). It remains to bound the two norms in parentheses appropriately. Therefore, we apply the triangle inequality, the stability properties for \(S'\), \(S_h\), \(Z'\) and \(Z_h\) from Lemmata 6, 7 and 14 as well as the error estimates from Lemma 16. Note that the control bounds provide the regularity for u that is required for these estimates. As a consequence we obtain
With similar arguments we can show
The previous two estimates together with (32) imply
Choosing h sufficiently small such that \(c\, h^{1/2} \le \frac{\delta }{4}\) leads to the assertion. \(\square\)
Theorem 3
Let \({\bar{u}}\in U_{ad}\) be a local solution of (5) satisfying Assumption 1. Assume that \(\varepsilon >0\) and \(h_0>0\) are sufficiently small. Then, the auxiliary problem (31) possesses a unique solution for each \(h\le h_0\) denoted by \({\bar{u}}_h^\varepsilon\), and there holds
Proof
The existence of at least one solution of (31) follows immediately from the compactness and non-emptiness of \(U_h^{ad}\cap B_\varepsilon ({\bar{u}})\). Note that the \(L^2(\varGamma )\)-projection \(Q_h {\bar{u}}\) of \(\bar{u}\) onto \(U_h\), defined in (81) in Appendix 2, belongs to \(U_h^{ad}\cap B_{\varepsilon }({\bar{u}})\) provided that \(h>0\) is sufficiently small. This means that the feasible set is not empty. Due to Lemma 17 this solution is unique.
Moreover, the family \(\{{\bar{u}}_h^\varepsilon \}_{h\le h_0}\) is bounded and hence, a weakly convergent sequence \(\{{\bar{u}}_{h_k}^\varepsilon \}_{k\in {{\mathbb {N}}}}\) with \(h_k \searrow 0\) exists. The weak limit is denoted by \({\tilde{u}}\in L^2(\varGamma )\) and from the convexity and closedness of the feasible set we deduce \({\tilde{u}}\in U_{ad}\cap B_\varepsilon ({\bar{u}})\). Without loss of generality it is assumed that \({\bar{u}}_h^\varepsilon \rightharpoonup {\tilde{u}}\) in \(L^2(\varGamma )\) as \(h\searrow 0\).
Next, we show that \({\tilde{u}}\) is a local minimum of the continuous problem. We begin with the convergence of the corresponding states, which follows with the arguments from [10]. The triangle inequality gives
For the first term on the right-hand side we exploit convergence of the finite element method proved in Lemma 16 which yields
With similar arguments as in the proof of Lemma 2 we moreover deduce
The integral term on the right-hand side is non-negative due to the lower control bounds \({\bar{u}}_h^\varepsilon \ge u_a\ge 0\). We can bound the first term on the right-hand side with the Cauchy–Schwarz inequality and the multiplication rule from [20, Theorem 1.4.4.2] which provides
for arbitrary \(s\in (0,1/2)\). Note that there holds \(\Vert {\tilde{u}} - {\bar{u}}_h^\varepsilon \Vert _{H^{-s}(\varGamma )}\rightarrow 0\) for \(h\searrow 0\) due to the compact embedding \(L^2(\varGamma )\hookrightarrow H^{-s}(\varGamma )\), \(s>0\). It remains to bound the second factor on the right-hand side by an application of Lemma 14 and to divide the whole estimate by the third factor. After insertion of this estimate into (33) we obtain the strong convergence of the states, that is,
Next, we show that \({\tilde{u}}\) is a local solution of the continuous problem (5). To this end we exploit (34) and the lower semi-continuity of the norm map to arrive at
The second to last step follows from the optimality of \({\bar{u}}_h^\varepsilon\) for (31) and the admissibility of the \(L^2(\varGamma )\)-projection \(Q_h {\bar{u}}\) for sufficiently small \(h>0\). The last step follows from the strong convergence of the \(L^2(\varGamma )\)-projection \(Q_h\) in \(L^2(\varGamma )\). Note that this implies \(\lim _{h\searrow 0} \Vert S_h(Q_h {\bar{u}}) - S({\bar{u}})\Vert _{L^2(\varOmega )} = 0\). Due to Assumption 1 the solution \({\bar{u}}\) is unique within \(B_\varepsilon ({\bar{u}})\) when \(\varepsilon > 0\) is sufficiently small. This implies \({\tilde{u}} = {\bar{u}}\). Note that all “\(\le\)” signs in (35) then turn to “\(=\)” signs.
To conclude the strong convergence of the sequence \(\{{\bar{u}}_h^\varepsilon \}_{h>0}\) we show additionally the convergence of the norms. This follows from (35) and the strong convergence of the states from which we infer
\(\square\)
The previous theorem guarantees that every local solution \({\bar{u}}\in U_{ad}\) satisfying the second-order sufficient condition in Assumption 1 can be approximated by a sequence of local solutions of the discretized problems (31). Due to \({\bar{u}}_h^\varepsilon \in B_\varepsilon ({\bar{u}})\) and \({\bar{u}}_h^\varepsilon \rightarrow {\bar{u}}\) for \(h\searrow 0\) (i. e., the constraint \({\bar{u}}_h^\varepsilon \in B_\varepsilon ({\bar{u}})\) is never active), the functions \({\bar{u}}_h^\varepsilon\) are local solutions of the discrete problems (23) provided that \(h>0\) is small enough. Hence, we neglect the superscript \(\varepsilon\) in the following and denote by \({\bar{u}}_h\) the sequence of discrete local solutions converging to the local solution \({\bar{u}}\).
Next, we show linear convergence of the sequence \({\bar{u}}_h\).
Theorem 4
Let \({\bar{u}}\in U_{ad}\) be a local solution of (5) which fulfills Assumption 1, and let \(\{{\bar{u}}_h\}_{h>0}\) be local solutions of (23) with \({\bar{u}}_h\rightarrow {\bar{u}}\) for \(h\searrow 0\). Then, the error estimate
holds.
Proof
Let \(\xi = {\bar{u}}+t({\bar{u}}_h-{\bar{u}})\) with \(t\in (0,1)\). From Corollary 1 we obtain for sufficiently small h the estimate
where the last step follows from the mean value theorem for some \(t\in (0,1)\). Next, we confirm with the first-order optimality conditions that
with the \(L^2(\varGamma )\) projection \(Q_h\) onto \(U_h\). Note that the property \(Q_h{\bar{u}}\in U_{ad}\) is trivially satisfied. Insertion into the inequality above leads to
An estimate for the second part follows from orthogonality of the \(L^2(\varGamma )\)-projection, that is,
Furthermore, we exploit the Leibniz rule and the stability properties for S and Z from Lemma 6 to obtain
Next, we discuss the first term on the right-hand side of (36). Insertion of the definition of \(j_h'\) and \(j'\) and the stability of \(Q_h\) yield
In the last step we inserted the finite element error estimates from Lemma 13. Exploiting also the stability estimates from Lemmata 6 and 14 we obtain
Together with (36), (37) and (38) we arrive at the assertion. \(\square\)
5.3 Postprocessing approach
In this section we consider the so-called postprocessing approach introduced in [33]. The basic idea is to compute an “improved” control \({\tilde{u}}_h\) by a pointwise evaluation of the projection formula, i. e.,
where \({\bar{y}}_h\) and \({\bar{p}}_h\) are the discrete state and adjoint state, respectively, obtained by the full discretization approach discussed in Sect. 5.2. As we require higher regularity of the exact solution in order to observe a higher convergence rate than for the full discretization approach, we replace (A1) by the stronger assumption
- (A1’):
The domain \(\varOmega\) is planar and its boundary is globally \(C^3\).
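The pointwise evaluation underlying the postprocessing step can be sketched as follows, using the form \(\varPi _{ad}(\frac{1}{\alpha }{\bar{y}}_h\,{\bar{p}}_h)\) that also appears in the numerical experiments. This is only an illustration: the function names and the representation of \({\bar{y}}_h\) and \({\bar{p}}_h\) by lists of boundary sample values are assumptions and not part of the method as stated.

```python
def project_admissible(value, u_a, u_b):
    """Pointwise projection of a real number onto the interval [u_a, u_b]."""
    return min(max(value, u_a), u_b)


def postprocessed_control(y_vals, p_vals, alpha, u_a, u_b):
    """Evaluate Pi_ad(y * p / alpha) at each boundary sample point.

    y_vals, p_vals: hypothetical samples of the discrete state and adjoint
    state at the same boundary points; alpha > 0 is the regularization
    parameter; u_b may be float('inf') if there is no upper bound.
    """
    return [project_admissible(y * p / alpha, u_a, u_b)
            for y, p in zip(y_vals, p_vals)]
```

Since the projection is evaluated at every point rather than once per boundary element, the resulting control is in general not piecewise constant. The projection is also non-expansive, which is the property exploited in the proof of Theorem 5.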
The most technical part of convergence proofs for this approach is the proof of \(L^2\)-norm estimates for the state variables. This is usually done by considering the following three terms separately:
In [33] \(R_h:C(\varGamma )\rightarrow U_h\) is chosen as the midpoint interpolant. We will construct and investigate such an operator in Appendix 1. Note that a definition of a midpoint interpolant on curved elements is not straightforward. The first term on the right-hand side of (40) is a finite element error in the \(L^2(\varGamma )\)-norm. We collect the required estimates in the following lemma.
Lemma 18
For all \(q<\infty\) there hold the estimates
Proof
The first estimate follows from the Hölder inequality and the maximum norm estimate derived in Theorem 2. The second estimate requires an intermediate step. We denote by \(p^h({\bar{u}})\in V_h\) the solution of the equation
As \(p^h({\bar{u}})\) is the Ritz-projection of \({\bar{p}}\) we can apply Theorem 2 again and obtain
To show an estimate for the error between \(p^h({\bar{u}})\) and \(Z_h({\bar{u}})\) we test the equations defining both functions by \(v_h = p^h({\bar{u}}) - Z_h(\bar{u})\), compare the proof of Lemma 2. Together with the non-negativity of \({\bar{u}}\) we obtain
The last step follows from the estimate \(\Vert S({\bar{u}}) - S_h({\bar{u}})\Vert _{L^2(\varOmega )} \le c\,h^2\,\Vert S({\bar{u}})\Vert _{H^2(\varOmega )}\) which is a consequence of the Aubin-Nitsche trick. With the triangle inequality we conclude the desired estimate for the discrete control-to-adjoint operator. \(\square\)
To obtain an optimal error estimate for the second term we need an additional assumption which is used in all contributions studying the postprocessing approach. To this end, define the subsets \({\mathcal {K}}_2:=\cup \{{\bar{E}}:E\in {\mathcal {E}}_h,\ E\subset {\mathcal {A}},\ \text{ or }\ E\subset {\mathcal {I}}\}\) and \({\mathcal {K}}_1:=\varGamma \setminus {\mathcal {K}}_2\). In the following we will assume that \({\mathcal {K}}_1\) satisfies
The idea of this assumption is that the control can only switch between active and inactive set on \({\mathcal {K}}_1\). Only due to these switching points the regularity of the control is reduced, see also Lemma 11. One can in general expect that this happens at finitely many points and thus, the assumption (41) is not very restrictive.
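To make the splitting into \({\mathcal {K}}_1\) and \({\mathcal {K}}_2\) concrete, the following sketch classifies the elements of a one-dimensional boundary mesh, parametrized by arc length, by sampling a given control at interior points of each element. The sampling heuristic, the tolerance and all names are assumptions made for illustration; an element is assigned to \({\mathcal {K}}_1\) whenever both active and inactive sample points are detected.

```python
def split_boundary_elements(nodes, u, u_a, u_b, samples=8, tol=1e-12):
    """Classify boundary elements by sampling the control u at interior points.

    nodes: increasing arc-length coordinates of the boundary mesh nodes.
    Returns (k1, k2, measure_k1): indices of elements on which u appears to
    switch between active and inactive, indices of the remaining elements,
    and the total length of the elements in k1.
    """
    k1, k2, measure_k1 = [], [], 0.0
    for e in range(len(nodes) - 1):
        a, b = nodes[e], nodes[e + 1]
        flags = []
        for j in range(samples):
            s = a + (j + 0.5) / samples * (b - a)  # interior sample point
            val = u(s)
            # A point counts as active if u sits on one of the bounds there.
            flags.append(abs(val - u_a) <= tol or abs(val - u_b) <= tol)
        if all(flags) or not any(flags):
            k2.append(e)
        else:
            k1.append(e)
            measure_k1 += b - a
    return k1, k2, measure_k1
```

For a control with finitely many switching points, the number of elements in \({\mathcal {K}}_1\) stays bounded under refinement in this sketch, so that the measure of \({\mathcal {K}}_1\) decreases proportionally to h.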
As an intermediate result required to prove estimates for \(Z_h({\bar{u}})-Z_h(R_h{\bar{u}})\) in \(L^2(\varGamma )\), we need an estimate for \(S_h({\bar{u}}) - S_h(R_h{\bar{u}})\) in \(L^2(\varOmega )\).
Lemma 19
For all \(q<\infty\) there holds the estimate
Proof
To shorten the notation we write \(e_h:= S_h({\bar{u}}) - S_h(R_h {\bar{u}})\). Moreover, we introduce the function \(w\in H^1(\varOmega )\) solving the equation
This implies
Next, we discuss both terms on the right-hand side separately. The first one is treated with the Cauchy–Schwarz inequality and the interpolation error estimate (19). These arguments lead to
The \(H^1(\varOmega )\)-norm of \(e_h\) is further estimated by the Lipschitz property from Lemma 15 and the interpolation error estimate for the midpoint interpolant from Lemma 25. This yields
for all \(q\ge 2\).
Insertion into the estimate above taking into account the stability estimates from Lemma 14 yields
Next, we consider the second term on the right-hand side of (43). After a reformulation by means of the definition of \(S_h\) we get
We can further estimate this term with the interpolation error estimate from Lemma 27
The last step follows from the embedding \(H^1(\varGamma )\hookrightarrow L^\infty (\varGamma )\) and the multiplication rule \(\Vert u\,v\Vert _{H^1(\varGamma )} \le c\,\Vert u\Vert _{H^1(\varGamma )}\,\Vert v\Vert _{H^1(\varGamma )}\), see [20, Theorem 1.4.4.2]. Both properties are only fulfilled in case of \(n=2\).
Let us discuss the terms on the right-hand side separately. For elements \(E\subset {\mathcal {K}}_1\) we can exploit the assumption (41) which provides the estimate \(\sum _{E\subset {\mathcal {K}}_1} |E| \le c\,h\) and the second interpolation error estimate from Lemma 25 to arrive at
On elements \(E\subset {\mathcal {K}}_2\) the control has higher regularity, namely \({\bar{u}}\in H^{2-1/q}(E)\). To this end, we show by interpolation arguments in Banach spaces, see e. g. [7, Section 14.3], that the two estimates from Lemma 25 (the second one with \(r=1\) and \(q=2\)) also imply
As a consequence, we deduce
The remaining terms on the right-hand side of (46) can be treated with stability estimates for \(S_h\) (see Lemma 14) and \(R_h\), the estimate \(\Vert \varPi _h w\Vert _{H^1(\varGamma )}\le c\,\Vert w\Vert _{H^1(\varGamma )}\) stated in (20) and the a priori estimate \(\Vert w\Vert _{H^1(\varGamma )} \le c\,\Vert e_h\Vert _{L^2(\varOmega )}\) from Lemma 3a). Insertion of the previous estimates into (46) yields
Note that we hide the lower-order norms of \({\bar{u}}\) in the generic constant as these quantities may be estimated by means of the control bounds \(u_a\) and \(u_b\). Insertion of (44), (45) and (48) into (43) and dividing by \(\Vert e_h\Vert _{L^2(\varOmega )}\) implies the assertion. \(\square\)
Lemma 20
Under the assumption (41) the estimates
are valid for arbitrary \(q\in [2,\infty )\).
Proof
We will only prove the second estimate as the first one follows from the same technique and is even easier as the right-hand sides of the equations defining \(S_h({\bar{u}})\) and \(S_h(R_h{\bar{u}})\) coincide. This is not the case for the control-to-adjoint operator.
To shorten the notation we write \(e_h:=Z_h({\bar{u}}) - Z_h(R_h {\bar{u}})\). As in the previous lemma we rewrite the error by a duality argument using a dual problem similar to (42) with solution \(w\in H^1(\varOmega )\), more precisely,
This yields
We rewrite the second expression in (49) and get analogous to (45)
Note that the first term would not appear when deriving estimates for \(S_h\) instead of \(Z_h\) as the equations defining \(S_h({\bar{u}})\) and \(S_h(R_h {\bar{u}})\) have the same right-hand side.
The first term can be treated with the Cauchy–Schwarz inequality, Lemma 19 and the estimate \(\Vert \varPi _h w\Vert _{L^2(\varOmega )} \le c\,\Vert w\Vert _{H^1(\varOmega )}\le c\,\Vert e_h\Vert _{L^2(\varGamma )}\) which can be deduced from (19) and Lemma 1 with \(g=e_h\). These ideas lead to
For the second term on the right-hand side of (50) we apply the same steps as for (48) with the only modification that the a priori estimate \(\Vert w\Vert _{H^1(\varGamma )} \le c\,\Vert e_h\Vert _{L^2(\varGamma )}\) from Lemma 3a) has to be employed. From this we infer
In the last step we used the boundedness of \(Z_h(R_h {\bar{u}})\), see Lemma 14. Insertion of (51) and (52) into (50) leads to
It remains to discuss the first term on the right-hand side of (49). We obtain with the boundedness of \(a_{{\bar{u}}}\), the interpolation error estimate (19) and Lemma 3a)
An estimate for the expression \(\Vert e_h\Vert _{H^1(\varOmega )}\) follows from the equality
which can be deduced by subtracting the equations for \(Z_h({\bar{u}})\) and \(Z_h(R_h {\bar{u}})\) from each other. Rearranging the terms yields
The second term on the right-hand side can be bounded by zero as \({\bar{u}}\ge 0\). An estimate for the last term is proved in Lemma 19. For the first term we apply the estimate (52) with \(\varPi _h w\) replaced by \(e_h\). All together, we obtain
Moreover, with an inverse inequality and a trace theorem we get
Consequently, we deduce from (55)
Insertion into (54) leads to
Together with (53) and (49) we conclude the desired estimate for \(Z_h\). \(\square\)
Lemma 21
Under the assumption (41) there holds the estimate
with
Proof
We observe that each function \(\xi :=t\, R_h {\bar{u}} + (1-t)\, {\bar{u}}_h\) for \(t\in [0,1]\) satisfies
for arbitrary \(\varepsilon >0\) provided that h is sufficiently small. This follows from the convergence of the midpoint interpolant, see Lemma 25, and convergence of \({\bar{u}}_h\) towards \({\bar{u}}\), see Theorem 3. Hence, with the coercivity of \(j_h''\) proved in Lemma 17 and the mean value theorem we conclude
For the latter term we exploit the discrete optimality condition and the fact that the continuous optimality condition holds even pointwise. This implies the inequality
Insertion into the estimate above implies
The right-hand side can be decomposed into two parts. With appropriate intermediate functions we obtain for the latter one
Moreover, we apply the triangle inequality and the estimates from Lemmata 18 and 20 to deduce
Analogously, one can derive an estimate for the term \(\Vert {\bar{p}} - Z_h(R_h{\bar{u}})\Vert _{L^2(\varGamma )}\). Moreover, we apply Lemmata 6 and 14 to bound the norms of \(p=Z({\bar{u}})\) and \(S_h(R_h {\bar{u}})\), respectively. All together we obtain the estimate
Next we discuss that part of (56) which involves the term \(R_h({\bar{y}}\,{\bar{p}})-{\bar{y}}\,{\bar{p}}\) in the first argument. Here, we again use the interpolation error estimate (47) exploiting regularity in fractional-order Sobolev spaces and obtain
With [20, Theorem 1.4.4.2] and a trace theorem we conclude
Insertion of the estimates (57) and (58) into (56), and dividing the resulting estimate by \(\Vert R_h {\bar{u}} - {\bar{u}}_h\Vert _{L^2(\varGamma )}\), leads to the desired result. \(\square\)
Now we are in the position to state the main result of this section.
Theorem 5
Let \(({\bar{y}},{\bar{u}},{\bar{p}})\) be a local solution of (12) satisfying the assumption (41). Moreover, let \(\{{\bar{u}}_h\}_{h>0}\) be a sequence of local solutions of (27) such that for sufficiently small \(\varepsilon ,h_0>0\) the property
holds. Then, the error estimate
is satisfied with \(c=c(\Vert {\bar{u}}\Vert _{W^{1,q}(\varGamma )},\Vert {\bar{u}}\Vert _{H^{2-1/q}({\mathcal {K}}_2)},\Vert {\bar{y}}\Vert _{W^{2,q}(\varOmega )}, \Vert {\bar{p}}\Vert _{W^{2,q}(\varOmega )})\).
Proof
With the projection formulas (13) and (39), respectively, the non-expansivity of the operator \(\varPi _{ad}\) and the triangle inequality we obtain
The assertion follows after insertion of (40) together with the estimates obtained in Lemmata 18, 20 and 21, as well as the stability estimates of Z and \(S_h\) from Lemmata 3 and 14, respectively. \(\square\)
6 Numerical experiments
It is the purpose of this last section to confirm the theoretical results by numerical experiments. To this end, we reformulate the discrete optimality condition (27) and use the equivalent projection formula
Here, \(R_h^{{\text {Simp}}}:C(\varGamma )\rightarrow U_h\) is a projection operator based on the Simpson rule, that is,
where \(x_{E_1}\) and \(x_{E_2}\) are the endpoints of the boundary edge \(E\in {\mathcal {E}}_h\) and \(x_E\) its midpoint. The numerical solution of (59) is computed by a semismooth Newton method.
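A Simpson-based elementwise value can be sketched as follows. The weights \(1/6\), \(4/6\), \(1/6\) at the two endpoints and the midpoint of a straight edge are the standard Simpson weights; that \(R_h^{{\text {Simp}}}\) has exactly this form is an assumption of this sketch, and the function name is hypothetical.

```python
def simpson_edge_value(v, x1, x2):
    """Simpson-weighted value of v on the straight edge from x1 to x2.

    x1, x2: endpoint coordinates as tuples; v: callable taking such a tuple.
    Assumed form: (v(x1) + 4 * v(midpoint) + v(x2)) / 6.
    """
    xm = tuple(0.5 * (s + t) for s, t in zip(x1, x2))
    return (v(x1) + 4.0 * v(xm) + v(x2)) / 6.0
```

On a straight edge this value coincides with the integral mean of v whenever v restricted to the edge is a polynomial of degree at most three, because the Simpson rule is exact for such functions.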
The input data of the considered benchmark problem is chosen as follows. The computational domain is the unit square \(\varOmega :=(0,1)^2\). We define the exact Robin parameter \({\tilde{u}}\) by
and use the desired state \(y_d = S_h({\tilde{u}})\) and the right-hand side \(f\equiv 0\). Moreover, the regularization parameter \(\alpha = 10^{-2}\) and the control bounds \(u_a=0\), \(u_b=\infty\) are used.
We compute the numerical solution of our benchmark problem on a sequence of meshes starting with \({\mathcal {T}}_{h_0}\), \(h_0=\sqrt{2}\), consisting of two right-angled triangles only. The remaining grids \({\mathcal {T}}_{h_i}\), \(i=1,2,\ldots ,\) are obtained by a double bisection through the longest edge of each element applied to the previous mesh. This guarantees \(h_i = \frac{1}{2} h_{i-1}\). In order to compute the discretization error we use the solution on the mesh \({\mathcal {T}}_{h_{11}}\) as an approximation of the exact solution, that is,
Analogously, we compute the error for the approximation obtained by the postprocessing strategy. However, in this case the exact solution is approximated by \({\bar{u}}\approx \varPi _{ad}(\frac{1}{\alpha }{\bar{y}}_{h_{11}}\,{\bar{p}}_{h_{11}})\). The error norms \(\Vert \varPi _{ad}(\frac{1}{\alpha }{\bar{y}}_{h_{11}}\,{\bar{p}}_{h_{11}}) - \varPi _{ad}(\frac{1}{\alpha }{\bar{y}}_{h_{i}}\,{\bar{p}}_{h_{i}})\Vert _{L^2(\varGamma )}\), \(i=0,\ldots ,11\), are computed element-wise by the Simpson quadrature formula with the modification that elements E are split at those points where \({\bar{y}}_{h_i}\,{\bar{p}}_{h_i}\) or \({\bar{y}}_{h_{11}}\,{\bar{p}}_{h_{11}}\) changes sign.
The optimal control and corresponding state of our benchmark problem are illustrated in Fig. 1 and the measured discretization errors as well as the experimentally computed convergence rates are summarized in Table 1. As we have proven in Theorem 4, the numerical solutions obtained by a full discretization using a piecewise constant control approximation converge with the optimal convergence rate 1. Moreover, it is confirmed that the solution obtained with a postprocessing step, see Theorem 5, converges with order \(2-\varepsilon\), \(\varepsilon >0\). Note that we actually proved the results for the case that the boundary is smooth, which is not the case in our example. However, the corner singularities contained in the solution are for a \(90^\circ\)-corner comparatively mild so that the regularity results from Lemma 11 remain valid.
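Since the mesh size is halved from one level to the next, an experimental convergence rate can be obtained from two consecutive errors \(e_{i-1}\) and \(e_i\) as \(\log _2(e_{i-1}/e_i)\). The following sketch shows this computation; the function name is hypothetical, and whether the rates in Table 1 were computed in exactly this way is an assumption.

```python
import math


def experimental_orders(errors):
    """Rates log2(e_{i-1} / e_i) for errors on meshes with h_i = h_{i-1} / 2."""
    return [math.log2(prev / cur) for prev, cur in zip(errors, errors[1:])]
```

For instance, synthetic errors that are halved on each level (not taken from Table 1) give rates equal to 1, and errors that are quartered give rates equal to 2.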
References
Apel, Th: Anisotropic Finite Elements: Local Estimates and Applications. Teubner, Stuttgart (1999)
Apel, Th, Lombardi, A.L., Winkler, M.: Anisotropic mesh refinement in polyhedral domains: error estimates with data in \(L^2(\varOmega )\). ESAIM Math. Model. Numer. Anal. 48(4), 1117–1145 (2014). https://doi.org/10.1051/m2an/2013134
Apel, Th, Pfefferer, J., Rösch, A.: Finite element error estimates on the boundary with application to optimal control. Math. Comp. 84, 33–70 (2015). https://doi.org/10.1090/S0025-5718-2014-02862-7
Apel, Th, Pfefferer, J., Winkler, M.: Error estimates for the postprocessing approach applied to Neumann boundary control problems in polyhedral domains. IMA J. Numer. Anal. 38(4), 1984–2025 (2018). https://doi.org/10.1093/imanum/drx059
Arada, N., Casas, E., Tröltzsch, F.: Error estimates for the numerical approximation of a semilinear elliptic control problem. Comput. Optim. Appl. 23(2), 201–229 (2002). https://doi.org/10.1023/A:1020576801966
Bernardi, C.: Optimal finite-element interpolation on curved domains. SIAM J. Numer. Anal. 26(5), 1212–1240 (1989). https://doi.org/10.1137/0726068
Brenner, S.C., Scott, L.R.: The Mathematical Theory of Finite Element Methods, 3rd ed. Texts in Applied Mathematics. Springer, New York (2008)
Casas, E.: Boundary control of semilinear elliptic equations with pointwise state constraints. SIAM J. Control Optim. 31(4), 993–1006 (1993). https://doi.org/10.1137/0331044
Casas, E., Mateos, M.: Error estimates for the numerical approximation of Neumann control problems. Comput. Optim. Appl. 39(3), 265–295 (2008). https://doi.org/10.1007/s10589-007-9056-6
Casas, E., Mateos, M.: Uniform convergence of the FEM. Applications to state constrained control problems. Comput. Appl. Math. 21(1), 67–100 (2002)
Casas, E., Mateos, M., Tröltzsch, F.: Error estimates for the numerical approximation of boundary semilinear elliptic control problems. Comput. Optim. Appl. 31(2), 193–219 (2005). https://doi.org/10.1007/s10589-005-2180-2
Chaabane, S., Jaoua, M.: Identification of Robin coefficients by the means of boundary measurements. Inverse Probl. 15(6), 1425–1438 (1999)
Ciarlet, P.G.: Basic error estimates for elliptic problems. In: Ciarlet, P.G., Lions, J.L. (eds.) Finite Element Methods, vol. 2. Handbook of Numerical Analysis, pp. 17–352. Elsevier, North-Holland (1991)
Dauge, M.: Elliptic Boundary Value Problems on Corner Domains. Springer, Berlin (1988). https://doi.org/10.1007/BFb0086682
Dhamo, V.: Optimal Boundary Control of Quasilinear Elliptic Partial Differential Equations: Theory and Numerical Analysis. PhD thesis. TU Berlin (2012)
Egger, H., et al.: Analysis and numerical solution of coupled volume-surface reaction–diffusion systems with application to cell biology. Appl. Math. Comput. 336, 351–367 (2018). https://doi.org/10.1016/j.amc.2018.04.031. ISSN: 0096-3003
Fellner, K., Rosenberger, S., Tang, B.Q.: Quasi-steady-state approximation and numerical simulation for a volume-surface reaction–diffusion system. Commun. Math. Sci. 14(6), 1553–1580 (2016). https://doi.org/10.4310/cms.2016.v14.n6.a5
Frehse, J., Rannacher, R.: Eine \(\text{ L }^{1}\)-Fehlerabschätzung für diskrete Grundlösungen in der Methode der finiten Elemente. Bonn. Math. Schr. 89, 92–114 (1976)
Gesztesy, F., Mitrea, M.: A description of all self-adjoint extensions of the Laplacian and Krein-type resolvent formulas on non-smooth domains. J. Anal. Math. 113, 53–172 (2011). https://doi.org/10.1007/s11854-011-0002-2
Grisvard, P.: Elliptic Problems in Nonsmooth Domains. Pitman, Boston (1985)
Gwinner, J.: On two-coefficient identification in elliptic variational inequalities. Optimization 67(7), 1017–1030 (2018). https://doi.org/10.1080/02331934.2018.1446955
Hào, D.N., Thanh, P.X., Lesnic, D.: Determination of the heat transfer coefficients in transient heat conduction. IOP Inverse Probl. (2013). https://doi.org/10.1088/0266-5611/29/9/095020
Hetmaniok, E., et al.: Identification of the heat transfer coefficient in the two-dimensional model of binary alloy solidification. Heat Mass Transf. 53(5), 1657–1666 (2017). https://doi.org/10.1007/s00231-016-1923-1
Hinze, M.: A variational discretization concept in control constrained optimization: the linear-quadratic case. Comput. Optim. Appl. 30(1), 45–61 (2005). https://doi.org/10.1007/s10589-005-4559-5
Jin, B., Lu, X.: Numerical identification of a Robin coefficient in parabolic problems. Math. Comp. 81(279), 1369–1398 (2012). https://doi.org/10.1090/S0025-5718-2012-02559-2
Kinderlehrer, D., Stampacchia, G.: An Introduction to Variational Inequalities and Their Applications, Vol. 88. Pure and Applied Mathematics. Academic Press, New York (1980)
Kröner, A., Vexler, B.: A priori error estimates for elliptic optimal control problems with a bilinear state equation. J. Comput. Appl. Math. 230(2), 781–802 (2009). https://doi.org/10.1016/j.cam.2009.01.023
Krumbiegel, K., Meyer, C., Rösch, A.: A priori error analysis for linear quadratic elliptic Neumann boundary control problems with control and state constraints. SIAM J. Control Optim. 48(8), 5108–5142 (2010). https://doi.org/10.1137/090746148. ISSN: 0363-0129
Krumbiegel, K., Pfefferer, J.: Superconvergence for Neumann boundary control problems governed by semilinear elliptic equations. Comput. Optim. Appl. 61(2), 373–408 (2015). https://doi.org/10.1007/s10589-014-9718-0
Kunisch, K., Vexler, B.: Constrained Dirichlet boundary control in \(\text{ L }^{2}\) for a class of evolution equations. SIAM J. Control Optim. 46(5), 1726–1753 (2007). https://doi.org/10.1137/060670110. ISSN: 0363-0129
Liu, J., Nakamura, G.: Recovering the boundary corrosion from electrical potential distribution using partial boundary data. Inverse Probl. Imaging 11(3), 521–538 (2017). https://doi.org/10.3934/ipi.2017024
Mateos, M., Rösch, A.: On saturation effects in the Neumann boundary control of elliptic optimal control problems. Comput. Optim. Appl. 49(2), 359–378 (2011). https://doi.org/10.1007/s10589-009-9299-5
Meyer, C., Rösch, A.: Superconvergence properties of optimal control problems. SIAM J. Control Optim. 43(3), 970–985 (2004). https://doi.org/10.1137/S0363012903431608
Mohebbi, F., Sellier, M.: Identification of space- and temperature-dependent heat transfer coefficient. Int. J. Therm. Sci. 128, 28–37 (2018). https://doi.org/10.1016/j.ijthermalsci.2018.02.007
Rösch, A., Tröltzsch, F.: An optimal control problem arising from the identification of nonlinear heat transfer laws. Pol. Acad. Sci. Comm. Autom. Control Robot. Arch. Control Sci. 1(3–4), 183–195 (1992)
Scott, L.R., Zhang, S.: Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math. Comp. 54(190), 483–493 (1990). https://doi.org/10.2307/2008497
Scott, R.: Finite Element Techniques for Curved Boundaries. PhD thesis. MIT (1973)
Scott, R.: Optimal \(L^{\infty }\) estimates for the finite element method on irregular meshes. Math. Comp. 30, 681–697 (1976). https://doi.org/10.2307/2005390
Tröltzsch, F.: Optimal Control of Partial Differential Equations: Theory, Methods, and Applications. Graduate Studies in Mathematics. American Mathematical Society (2010). ISBN: 978-0-82-184904-0
Winkler, G.: Control Constrained Optimal Control Problems in Non-convex Three Dimensional Polyhedral Domains. PhD Thesis. TU Chemnitz (2008)
Zlamal, M.: Curved elements in the finite element method. I. SIAM J. Numer. Anal. 10, 229–240 (1973). https://doi.org/10.1137/0710022
Acknowledgements
Open Access funding provided by Projekt DEAL.
Appendices
Appendix 1: Proof of Theorem 2
The proof of the maximum norm estimate presented in Theorem 2 follows basically from the arguments of [18, 38]. For the convenience of the reader we want to repeat the proof as the result of Theorem 2 is, for our specific situation, not directly available in the literature. The novelty of the present proof is that it includes curved elements as well as Robin boundary conditions. In the aforementioned articles, a representation of the error term based on a regularized Dirac function is used. This function forms the right-hand side of a dual problem whose solution is an approximation of Green’s function. The main difficulty is to bound this solution in appropriate norms.
To this end, we denote by \(T^*\in {\mathcal {T}}_h\) the element where \(|y-y_h|\) attains its maximum. The regularized Dirac function is defined by \(\delta ^h(x) := |T^*|^{-1}{{\,\mathrm{sgn}\,}}(y(x)-y_h(x))\) if \(x\in T^*\), and \(\delta ^h(x):=0\) if \(x\not \in T^*\). The corresponding Green’s function denoted by \(g^h\) solves the problem
The Dirac function satisfies the properties
We start our considerations with some a priori estimates for the solution \(g^h\).
Lemma 22
The following a priori estimates hold:
Proof
(ii) To show the estimate in the \(H^2(\varOmega )\)-norm we apply the a priori estimate from Lemma 3c) and \(\Vert \delta ^h\Vert _{L^2(\varOmega )} \le c h^{-1}\).
(i) The weak form of (60) and the property (61) imply
where the discrete Sobolev inequality was applied in the last step. The function \(g_h^h\in V_h\) is the Ritz-projection of \(g^h\) and satisfies the usual stability estimate
Next, we derive a suboptimal error estimate for the finite-element error in the \(L^\infty (\varOmega )\)-norm. Using an inverse inequality, estimates for the interpolant \(\varPi _h\) from (19), the Aubin-Nitsche trick and the a priori estimate shown already in the \(H^2(\varOmega )\)-norm we deduce
Note that we hide the dependency on u, or more precisely on \(\Vert u\Vert _{H^{1/2}(\varGamma )}\) and lower-order norms, in the generic constant to simplify the notation. Insertion of (63) and (64) into (62) yields with Young’s inequality
The desired estimate follows from a kick-back argument.
(iii) The \(L^\infty (\varOmega )\)-estimate follows directly from (62), (63) and (64) using the inequality (i). \(\square\)
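The bound \(\Vert \delta ^h\Vert _{L^2(\varOmega )} \le c h^{-1}\) used in step (ii) can be checked directly from the definition of \(\delta ^h\). The following computation is a sketch under the assumptions \(n=2\) and \(|T^*|\ge c\,h^2\) (quasi-uniform mesh), which are the assumptions suggested by the stated bound. It is not meant to reproduce the exact form of the properties (61).

```latex
\[
  \Vert \delta^h\Vert_{L^1(\varOmega)}
    = |T^*|^{-1}\int_{T^*} |\mathrm{sgn}(y-y_h)|\,\mathrm{d}x \le 1,
  \qquad
  \Vert \delta^h\Vert_{L^2(\varOmega)}^2
    = |T^*|^{-2}\int_{T^*} |\mathrm{sgn}(y-y_h)|^2\,\mathrm{d}x
    \le |T^*|^{-1} \le c\,h^{-2},
\]
\[
  (\delta^h, y-y_h)_{L^2(\varOmega)}
    = |T^*|^{-1}\int_{T^*} |y-y_h|\,\mathrm{d}x .
\]
```

The first inequality becomes an equality if \(y\ne y_h\) almost everywhere in \(T^*\).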
Next, we show an a priori estimate for \(g^h\) in a weighted norm. This is the key idea which allows us to bound second derivatives by a logarithmic factor only. The weight function we will use is defined by
with \(x^*:={{\,\mathrm{arg\,max}\,}}_{x\in T^*} |y-y_h|(x)\). This function satisfies
which follows from a simple computation. With this weight function at hand we can prove the following regularity result:
Lemma 23
Assume that \(u\in H^{1/2}(\varGamma )\). There holds the estimate
Proof
We introduce the functions \(\xi _i:=|x_i-x_i^*|\), \(i=1,2\), which allow us to write
With the reverse product rule we obtain
Moreover, we easily confirm that \(\xi _i\,g^h\) is the solution of the problem
where \(n_i\) is the ith component of the outer unit normal vector on \(\varGamma\). Lemma 3c) using the property \(\Vert \xi _i\,\delta ^h\Vert _{L^2(\varOmega )} \le c\), which follows from a simple computation, leads to
Insertion into (67) and using Lemma 22(i) leads to
An estimate for the second term on the right-hand side of (66) is derived in Lemma 22(ii). \(\square\)
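The properties of the weight function that enter this proof and the proof of Theorem 2 can be verified by hand. The following sketch assumes the form \(\sigma (x) := (|x-x^*|^2+\kappa ^2h^2)^{1/2}\) with a fixed constant \(\kappa >0\), a common choice in weighted-norm techniques that is consistent with the decomposition into \(\xi _1\) and \(\xi _2\) used above. Both the constant \(\kappa\) and the precise form of \(\sigma\) are assumptions made here for illustration.

```latex
\[
  \sigma^2 = \xi_1^2 + \xi_2^2 + \kappa^2 h^2, \qquad
  \nabla \sigma^2 = 2\,(x-x^*), \qquad
  |\nabla \sigma^2| = 2\,|x-x^*| \le 2\,\sigma, \qquad
  \nabla^2 \sigma^2 = 2\,I,
\]
\[
  \nabla \sigma = \frac{x-x^*}{\sigma}, \qquad
  |\nabla \sigma| \le 1, \qquad
  \sigma \ge \kappa\,h \ \Longrightarrow\ h\,\sigma^{-1} \le \kappa^{-1}.
\]
```

Since \(\varOmega\) is bounded, these identities also give \(|\sigma | + |\nabla \sigma | \le c\) with a constant independent of h for \(h\le 1\).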
Next, we derive some error estimates for the approximation \(g_h^h\) in several norms.
Lemma 24
Assume that \(u\in H^{1/2}(\varGamma )\). Then, there hold the error estimates
Proof
The first estimate follows directly from the \(H^1(\varOmega )\)-error estimate stated in Lemma 12 and the Aubin-Nitsche trick. Moreover, the a priori estimate for the \(H^2(\varOmega )\)-norm of \(g^h\) from Lemma 22 has to be exploited.
In the second estimate one observes that the discrete function \(g_h^h\) would vanish except on curved elements (note that \(g_h^h\) is affine on the reference element only, but not on T). With the transformation result [6, Lemma 2.3] we obtain
where \(g^h = {\hat{g}}^h\circ F_T^{-1}\), \(g_h^h={\hat{g}}_h^h\circ F_T^{-1}\). Taking into account \(\inf _{x\in T}\sigma (x) \sim \sup _{x\in T} \sigma (x)\), which holds due to the assumed shape-regularity, and \(|\sigma (x)| \le c\) for all \(x\in \varOmega\), we obtain
The first term has been discussed in the previous lemma, and the last term has already been considered in the present lemma. \(\square\)
Now we are in the position to prove Theorem 2.
Proof
With an inverse inequality and the Hölder inequality, the definition of \(\delta ^h\) and a maximum norm estimate for the interpolant \(\varPi _h\), see, e.g., [6, Theorem 4.1], we obtain
where \(g_h^h\in V_h\) denotes the Ritz projection of \(g^h\).
For the latter part on the right-hand side of (69) we get with the Galerkin orthogonality, the Hölder inequality, a trace theorem for the boundary integral term as well as \(\Vert u\Vert _{L^\infty (\varGamma )} \le c\)
An estimate for the interpolation error is deduced in [6]. The \(L^1(\varOmega )\)-norms can be replaced by weighted \(L^2(\varOmega )\)-norms involving the weighting function \(\sigma\). Taking into account the properties (65) we obtain
In the following we will show that the expressions on the right-hand side of (71) are bounded by \(c\,h\,|\ln h|^{1/2}\). Therefore, we apply the reverse product rule and get
From this we conclude
Here, we exploited that \((u\,\sigma ^2\,(g^h-g_h^h),g^h-g_h^h)_{L^2(\varGamma )}\ge 0\) due to \(u\ge u_a\ge 0\). Next, we introduce the abbreviation \(z:=\sigma ^2\,(g^h-g_h^h)\). The Galerkin orthogonality of \(g^h-g_h^h\), Young’s inequality and the trace theorem taking into account \(|\sigma | + |\nabla \sigma |\le c\) yield
and thus,
Next, we derive local interpolation error estimates. In the following we use the notation \({\underline{\sigma }}_T:=\inf _{x\in T} \sigma (x)\) and \({\overline{\sigma }}_T :=\sup _{x\in T} \sigma (x)\). Due to the assumed shape-regularity there holds \({\underline{\sigma }}_T\sim {\overline{\sigma }}_T\) for all \(T\in {\mathcal {T}}_h\), and hence,
where \(S_T\) is the patch of all elements adjacent to T (note that \(\varPi _h\) is a quasi-interpolant). The Leibniz rule and the properties \(|\nabla \sigma ^2|\le 2\sigma\) and \(|\nabla ^2 \sigma ^2|\le c\) imply
Next, we combine the two estimates above and take into account the properties \(h\,{\underline{\sigma }}_T^{-1} \le c\) and \({\overline{\sigma }}_T\sim {\overline{\sigma }}_{S_T}\) which follows from the assumed quasi-uniformity. Summation over all \(T\in {\mathcal {T}}_h\) and an application of Lemma 24 yields
It remains to discuss the third term on the right-hand side of (73). With interpolation error estimates for \(\varPi _h\) on the boundary, compare also (20), and \(u\in L^\infty (\varGamma )\) we obtain
where we exploited the product rule and the property \(\nabla \sigma ^2\le 2\sigma \mathbf {1}\) in the last step. With a trace theorem and Lemma 24 we conclude
and with a multiplicative trace theorem, Young’s inequality, the product rule and the estimates from Lemma 24 we obtain
The estimate (75) then simplifies to
Insertion of (74) and (76) into (73) leads to the estimate
It remains to show an estimate for the second term on the right-hand side of (72). Due to \(|\nabla \sigma ^2| \le 2\sigma \mathbf {1}\), Young’s inequality and the \(L^2(\varOmega )\)-error estimate from Lemma 24 we get
Insertion of (77) and (78) into (72) yields
and with a kick-back argument we conclude \(\varTheta ^2\le c\,h^2\,|\ln h|\). Finally, we collect the previous estimates. To this end, we insert (79) into (71), the resulting estimate into (70) and this into (69). \(\square\)
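The kick-back argument used in this appendix can be illustrated schematically. The following is a generic sketch and not the precise chain of estimates above: for numbers \(X, A, B\ge 0\), Young’s inequality in the form \(B\,X\le \frac{1}{2} B^2 + \frac{1}{2} X^2\) allows us to absorb the term containing X into the left-hand side.

```latex
\[
  X^2 \le A + B\,X
  \quad\Longrightarrow\quad
  X^2 \le A + \tfrac12 B^2 + \tfrac12 X^2
  \quad\Longrightarrow\quad
  X^2 \le 2A + B^2 .
\]
```

For instance, if one had \(A\le c\,h^2\,|\ln h|\) and \(B\le c\,h\,|\ln h|^{1/2}\) (hypothetical values chosen to match the order obtained above), the result would be \(X^2\le c\,h^2\,|\ln h|\).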
Appendix 2: Local estimates for the midpoint interpolant and the \(L^2(\varGamma )\)-projection
To the best of the author’s knowledge there are no error estimates for the midpoint interpolant defined on a curved boundary available in the literature. Thus, we prove the following lemmas, which are needed in the proof of Lemma 20.
Consider a single boundary element \(E\subset {\bar{T}}\) with corresponding element \(T\in {\mathcal {T}}_h\). A parametrization of the boundary element is given by \(E:=\{\gamma _E(\xi ) := F_T(\xi ,0),\ \xi \in (0,1)\}\) when assuming that the edge of \({\hat{T}}\) with endpoints (0, 0), (1, 0) is mapped onto E. In the following we denote the length of a boundary element \(E\in {\mathcal {E}}_h\) by \(L_E = \int _0^1 |{{\dot{\gamma }}}_E(\xi )|\,\mathrm {d}\xi\).
Lemma 25
For each function \(u:\varGamma \rightarrow {{\mathbb {R}}}\) there exists some piecewise constant function \(R_h u\in U_h\) satisfying the local estimates
for all \(E\in {\mathcal {E}}_h\), provided that u possesses the regularity demanded by the right-hand side.
Proof
Let us first construct a suitable interpolation operator. To obtain the desired second-order accuracy we have to guarantee that the property \(\int _E p = \int _E R_h p\) holds for all functions \(p(\gamma _E(\xi )) = {\hat{p}}(\xi )\) with some first-order polynomial \({\hat{p}}(\xi ):=a +b\,\xi\). The transformation to \({\hat{E}}:=(0,1)\times \{0\}\) yields
The latter step holds true when choosing
To this end, we define our operator by \(R_h u|_E:= (u\circ \gamma _E)(\xi _E)\). Obviously, the definition of \(R_h\) depends on the transformations \(F_T\).
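The exactness requirement \(\int _E p = \int _E R_h p\) for \({\hat{p}}(\xi ) = a+b\,\xi\) determines the evaluation point: comparing \(\int _0^1 (a+b\,\xi )\,|{\dot{\gamma }}_E(\xi )|\,\mathrm {d}\xi\) with \((a+b\,\xi _E)\,L_E\) gives \(\xi _E = L_E^{-1}\int _0^1 \xi \,|{\dot{\gamma }}_E(\xi )|\,\mathrm {d}\xi\). This formula is derived here from the stated requirement only. The following sketch evaluates it numerically for a hypothetical curved element \(\gamma (\xi )=(\xi ,\xi ^2)\) and for a straight edge. The parametrizations, the quadrature order and all function names are assumptions for illustration.

```python
import numpy as np

# Gauss-Legendre rule mapped from [-1, 1] to the reference interval [0, 1]
_nodes, _weights = np.polynomial.legendre.leggauss(20)
XI = 0.5 * (_nodes + 1.0)
W = 0.5 * _weights

def speed(dgamma, xi):
    """|gamma_E'(xi)| for an array xi; dgamma returns an array of shape (len(xi), 2)."""
    return np.linalg.norm(dgamma(xi), axis=1)

def element_length(dgamma):
    """L_E = int_0^1 |gamma_E'(xi)| dxi, approximated by quadrature."""
    return float(np.sum(W * speed(dgamma, XI)))

def interpolation_point(dgamma):
    """xi_E = (1 / L_E) * int_0^1 xi |gamma_E'(xi)| dxi."""
    return float(np.sum(W * XI * speed(dgamma, XI)) / element_length(dgamma))

def dgamma_curved(xi):
    """Derivative of the hypothetical curved element gamma(xi) = (xi, xi^2)."""
    xi = np.asarray(xi, dtype=float)
    return np.stack([np.ones_like(xi), 2.0 * xi], axis=1)

def dgamma_straight(xi):
    """Derivative of a straight edge gamma(xi) = (3 xi, 4 xi) of length 5."""
    xi = np.asarray(xi, dtype=float)
    return np.stack([3.0 * np.ones_like(xi), 4.0 * np.ones_like(xi)], axis=1)

xi_E_straight = interpolation_point(dgamma_straight)   # reduces to the midpoint
xi_E_curved = interpolation_point(dgamma_curved)       # shifted towards faster speed

# Exactness check for a first-order polynomial p_hat(xi) = a + b*xi
a, b = 1.5, -0.7
L_E = element_length(dgamma_curved)
integral_p = float(np.sum(W * (a + b * XI) * speed(dgamma_curved, XI)))
integral_Rh_p = (a + b * xi_E_curved) * L_E
```

On a straight edge the speed \(|{\dot{\gamma }}_E|\) is constant and the evaluation point reduces to the midpoint \(\xi _E = 1/2\). On a curved element it moves towards the part of the element that is traversed faster by the parametrization.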
To show the interpolation error estimates we apply the property \(\int _E(p-R_h p)=0\) for arbitrary p satisfying \({\hat{p}}=p\circ \gamma _E\in {\mathcal {P}}_1\), the stability of the interpolant \({\hat{R}}_h {\hat{u}} = {\hat{u}}(\xi _E)\), the properties (18) of the transformation \(F_T\) and the Bramble–Hilbert Lemma. This yields
For the transformation back to the world element E we apply the chain rule
and the properties (18) to arrive at
Finally, the norm of \({\dot{\gamma }}_E\) can be bounded by means of
Note that the last step is valid for the spectral norm only.
An application of Lemma 2.2 from [6] which provides \(\sup _{{\hat{x}}\in {\hat{T}}}\Vert DF_T({\hat{x}})^{-1}\Vert \le c h^{-1}\) leads to the first estimate.
The second estimate follows with similar arguments. For an arbitrary constant \({\hat{p}}\) we then obtain
\(\square\)
A further operator that is needed in Sect. 3 is the \(L^2(\varGamma )\)-projection onto \(U_h\). In case of curved boundaries, this operator reads
for each \(E\in {\mathcal {E}}_h\). Again, this definition depends on the parametrizations \(\gamma _E\) of the boundary elements \(E\in {\mathcal {E}}_h\). By a simple computation one can show that this definition implies the orthogonality property
With similar arguments as in the previous lemma we obtain the following local estimate which is standard in case of a boundary consisting of straight edges.
Lemma 26
Assume that \(u\in H^1(\varGamma )\). Then the estimate
is fulfilled for all \(E\in {\mathcal {E}}_h\).
Proof
We introduce a further projection onto \(U_h\), namely \([{\tilde{Q}}_h u]|_E:=\int _0^1 u(\gamma _E(\xi ))\,\mathrm {d}\xi\). Using (81), the transformation to the reference element as in the previous lemma and the Poincaré inequality we obtain
where the last step is a consequence of the chain rule \(\partial _\xi u(\gamma _E(\xi )) = \nabla u(\gamma _E(\xi ))\cdot {\dot{\gamma }}_E(\xi )\) and \(|{\dot{\gamma }}_E(\xi )|\sim h\) for all \(\xi \in (0,1)\), see also (80). \(\square\)
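The difference between the two projections can be made concrete. The \(L^2(\varGamma )\)-projection onto piecewise constants is the mean value with respect to arc length, that is \(L_E^{-1}\int _0^1 u(\gamma _E(\xi ))\,|{\dot{\gamma }}_E(\xi )|\,\mathrm {d}\xi\) on each element, whereas \({\tilde{Q}}_h\) averages with respect to the reference parameter. The following sketch compares both on a hypothetical curved element \(\gamma (\xi )=(\xi ,\xi ^2)\) with a hypothetical smooth function u. The quadrature order and all names are assumptions for illustration.

```python
import numpy as np

# Gauss-Legendre rule mapped from [-1, 1] to the reference interval [0, 1]
nodes, weights = np.polynomial.legendre.leggauss(20)
XI, W = 0.5 * (nodes + 1.0), 0.5 * weights

def gamma(xi):
    """Hypothetical curved boundary element: arc of the parabola (xi, xi^2)."""
    return np.stack([xi, xi ** 2], axis=1)

def speed(xi):
    """|gamma'(xi)| for gamma(xi) = (xi, xi^2)."""
    return np.sqrt(1.0 + 4.0 * xi ** 2)

def u(x):
    """Hypothetical smooth function evaluated at boundary points x of shape (m, 2)."""
    return np.sin(x[:, 0]) + x[:, 1]

L_E = np.sum(W * speed(XI))                    # length of the element
u_vals = u(gamma(XI))

Q_u = np.sum(W * u_vals * speed(XI)) / L_E     # mean value w.r.t. arc length
Qt_u = np.sum(W * u_vals)                      # mean value w.r.t. the parameter xi

# Integrals of the projection errors over E (tested against the constant 1)
residual_Q = np.sum(W * (u_vals - Q_u) * speed(XI))
residual_Qt = np.sum(W * (u_vals - Qt_u) * speed(XI))
```

The residual of the arc-length mean vanishes up to rounding, which is the orthogonality property on a single element. The parameter mean does not have this property on a curved element, while both values coincide whenever \(|{\dot{\gamma }}_E|\) is constant.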
We conclude this section with an estimate for an expression which is needed in Lemma 20.
Lemma 27
Assume that the functions u and v belong to \(H^1(\varGamma )\). Then the inequality
is valid.
Proof
First, we split the term under consideration using the \(L^2(\varGamma )\)-projection onto \(U_h\) and obtain
The first term on the right-hand side can be treated with the local estimate from Lemma 26 which yields
For the second term we exploit the definition of \(Q_h\) and \(R_h\) on the reference element. For each \(E\in {\mathcal {E}}_h\) we then obtain
where we used \(\int _0^1|{\dot{\gamma }}_E(\xi ')|\,\mathrm {d}\xi ' = L_E\) in the second step. Consequently, we obtain
and conclude the assertion. \(\square\)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Winkler, M. Error estimates for the finite element approximation of bilinear boundary control problems. Comput Optim Appl 76, 155–199 (2020). https://doi.org/10.1007/s10589-020-00171-5
DOI: https://doi.org/10.1007/s10589-020-00171-5