Numerical approximation of control problems of non-monotone and non-coercive semilinear elliptic equations

We analyze the numerical approximation of a control problem governed by a non-monotone and non-coercive semilinear elliptic equation. The lack of monotonicity and coercivity is due to the presence of a convection term. First, we study the finite element approximation of the partial differential equation. While we can prove existence of a solution for the discrete equation when the discretization parameter is small enough, the uniqueness is an open problem for us if the nonlinearity is not globally Lipschitz. Nevertheless, we prove the existence and uniqueness of a sequence of solutions bounded in L ∞ (Ω) and converging to the solution of the continuous problem. Error estimates for these solutions are obtained. Next, we discretize the control problem. Existence of discrete optimal controls is proved, as well as their convergence to solutions of the continuous problem. The analysis of error estimates is quite involved due to the possible non-uniqueness of the discrete state for a given control. To overcome this difficulty we define an appropriate discrete control-to-state mapping in a neighbourhood of a strict solution of the continuous control problem. This allows us to introduce a reduced functional and obtain first order optimality conditions as well as error estimates. Some numerical experiments are included to illustrate the theoretical results.


Introduction
In this paper, we consider the numerical approximation of the optimal control problem (P) min where Ω ⊂ R n , n = 2 or n = 3, is a convex domain with boundary Γ , and y u is the solution of the following state equation

(1.1)
A is an elliptic operator, b : Ω −→ R n is a given function, and f : Ω × R −→ R is monotone non-decreasing in the second variable. Moreover, y d ∈ L 2 (Ω) is a given function, ν > 0, and with −∞ ≤ α < β ≤ +∞. This problem is studied in [12], where existence and uniqueness results for the equation, as well as existence of optimal controls and optimality conditions, are obtained. For the convenience of the reader, these results are summarized in Sect. 2. In this work, we discretize the problem and obtain approximation results. The reader is also referred to [8,15] for a similar control problem associated with a non-monotone quasilinear elliptic equation. The main difference with respect to the above equation is that the operators considered in [8,15] are coercive, while our equation is neither monotone nor coercive. In Sect. 3 we study the approximation of the state equation by finite elements. The reader is referred to [32] for the linear case or [8,16,22] for the case of non-monotone but coercive quasilinear equations. In the quasilinear case, the discrete equation has at least one solution for every h, which follows easily from an application of Brouwer's fixed point theorem and the coercivity of the operator. However, no uniqueness result is available. In the linear case, we can only prove existence of a solution if h is small enough, but we have a unique solution for each of these values of h. In the semilinear discrete case corresponding to (1.1), the existence of a discrete solution requires, as in the linear case, h to be small. But, as in the quasilinear case, the uniqueness of discrete solutions is an open issue. We prove existence and uniqueness of a discrete solution if the equation is linear or if the non-monotone term is bounded. In the general case we can prove the existence and uniqueness of a bounded sequence of solutions as h tends to 0, but we cannot rule out the possible existence of a sequence of solutions that diverges in the L ∞ (Ω)-norm.
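The display defining (P) did not survive extraction. Given the data just listed (target y d ∈ L 2 (Ω), Tikhonov parameter ν > 0, and bounds −∞ ≤ α < β ≤ +∞), the problem presumably takes the standard tracking form sketched below; this is a reconstruction, not the paper's verbatim statement:

```latex
\mathrm{(P)}\qquad
\min_{u \in U_{ad}} J(u) := \frac{1}{2}\int_\Omega \bigl(y_u(x) - y_d(x)\bigr)^2\,\mathrm{d}x
  + \frac{\nu}{2}\int_\Omega u(x)^2\,\mathrm{d}x,
\qquad
U_{ad} = \{u \in L^2(\Omega) : \alpha \le u(x) \le \beta \ \text{for a.a.}\ x \in \Omega\}.
```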
Error estimates are provided for the bounded approximations of the solutions of the state equation.
In Sect. 4 we discretize the control problem, using either piecewise constant or continuous piecewise linear approximations of the control. We prove the existence of a number h 0 > 0 such that the discrete optimal control problem has at least one solution (ȳ h , ū h ) for every discretization parameter h < h 0 . We also prove the boundedness of these solutions in H 1 0 (Ω) × L 2 (Ω). Moreover, every limit in the weak H 1 0 (Ω) × L 2 (Ω) topology, as h → 0, of a sequence of discrete solutions is a solution of the continuous optimal control problem. In addition, the convergence is not only weak but strong in the H 1 0 (Ω) × L 2 (Ω) topology. Next, we define a discrete control-to-state mapping in a neighborhood of a strict solution of the continuous problem, as well as an associated reduced functional, and we state first order optimality conditions. In Sect. 5 we obtain error estimates, and in the last section we include a numerical experiment.
To finish this introduction let us mention some papers concerning error estimates for the numerical approximation of non-linear elliptic control problems. Early references for the numerical analysis of linear quadratic control problems are the papers [23] and [24]. The first reference we are aware of dealing with the numerical approximation of optimal control problems governed by a semilinear elliptic equation is [3]; state constraints were included in the analysis in [29]. Different aspects of Neumann boundary optimal control problems have been treated in [9,13] or [26]. The case of Dirichlet boundary control was first treated in [14]. In all these references, the equations were coercive and monotone. Optimal control problems governed by quasilinear elliptic equations have been studied in [8,16,17,20]. In these works, the equations were coercive but not monotone. It is also worth mentioning the works [2,30]. In the first one, the authors investigate under which conditions discrete local minima are indeed global. In the second one the authors study how to numerically verify second order optimality conditions, which are very important for the study of local minima for non-convex optimal control problems.
NOTATION: Throughout the paper we will consider the following operators

(1.2)
As usual, we denote by C(Ω̄) the space of continuous functions on Ω̄, the closure of Ω. C 0,δ (Ω̄) is the space of Hölder functions on Ω̄ if 0 < δ < 1 and of Lipschitz functions if δ = 1. For p ∈ [1, +∞], s ≥ 0, we denote by L p (Ω) and W s, p (Ω) respectively the Lebesgue and Sobolev spaces. We refer to [1] for definitions and further properties of these spaces. In the Sobolev space H 1 0 (Ω) we take the norm According to the Poincaré inequality, there exists a constant C Ω such that From this inequality and Sobolev's embedding theorem, we also know that there exists a constant K Ω such that , which is a Banach space when endowed with the norm

Assumption 1
We assume that Ω is a convex domain in R n with n = 2 or 3. We also suppose that Ω is polygonal if n = 2 or polyhedral if n = 3. Γ denotes its boundary, which is Lipschitz. The following conditions are satisfied by the coefficients of the operator A: The following properties of the operators A and A * were proved in Theorem 2.2, Corollary 2.4, Theorem 2.5 and Corollary 2.6 of [12]. The proofs of these results make use of […]. Regarding Eq. (1.1), we make the following assumption on the non-linear function f .
Assumption 2 Function f : Ω × R −→ R is a Carathéodory function, monotone non-decreasing with respect to the second variable, and satisfying The following result concerning existence, uniqueness and regularity of a solution of (1.1) follows from Theorems 2.6 and 2.8 of [12].
holds for some constant C A, f depending on A and f .
Additional regularity assumptions on f are necessary to consider the differentiability of the functional J .

Assumption 3
We suppose that f : Ω × R −→ R is a Carathéodory function of class C 2 with respect to the second variable satisfying: For every M > 0 and ε > 0 there exists δ > 0, depending on M and ε, such that It is easy to check that Assumption 3 implies Assumption 2. Typical examples of functions satisfying these assumptions are The functional J is of class C 2 . Moreover, (2.11) Since (P) is not a convex problem, we distinguish between local and global solutions. We say that ū is a local solution of (P) if there exists ε > 0 such that As usual, we say that ū is a strict local solution if the above inequality is strict whenever u ≠ ū. The reader is referred to [12, Definition 3.3 and Lemma 3.4] for a discussion of different notions of local solutions.

Theorem 2.4
Under Assumptions 1 and 3, (P) has at least one solution. Moreover, if ū is a local solution of (P), then there exist two unique elements ȳ, φ̄ ∈ H 2 (Ω) ∩ H 1 0 (Ω) such that This theorem follows from [12, Theorems 3.1 and 3.6, and Corollary 3.7]. We finish this section by establishing the second order optimality conditions. To this end, we define the cone of critical directions as follows: Let us observe that (2.14) implies that In the case where there are no control constraints, namely U ad = L 2 (Ω), then J ′(ū) = 0 and Cū = L 2 (Ω).
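The displayed optimality system in Theorem 2.4 did not survive extraction. For this class of problems, and consistently with the projection relation ū = Proj [α,β] (−φ̄/ν) exploited in Sect. 6, the first order system presumably consists of the state equation, the adjoint equation and a variational inequality; the sketch below is a reconstruction under that assumption:

```latex
\begin{aligned}
&A\bar y + f(x,\bar y) = \bar u \ \ \text{in } \Omega, &\qquad \bar y &= 0 \ \text{on } \Gamma,\\
&A^{*}\bar\varphi + \frac{\partial f}{\partial y}(x,\bar y)\,\bar\varphi
  = \bar y - y_d \ \ \text{in } \Omega, &\qquad \bar\varphi &= 0 \ \text{on } \Gamma,\\
&\int_\Omega \bigl(\bar\varphi + \nu\bar u\bigr)\bigl(u - \bar u\bigr)\,\mathrm{d}x
  \ge 0 \quad \forall u \in U_{ad}.
\end{aligned}
```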

Approximation of the state equation
In this section we consider the finite element discretization of Eq. (1.1). The goal is to prove the existence of solutions of the discrete problems and to derive some error estimates. We proceed in three steps. First we study the linear equation; see Lemma 3.1. Next we replace the local Lipschitz condition stated in Assumption 2 by the more restrictive global condition (3.12). Using this condition, we prove the existence of a unique discrete solution in Theorem 3.5 and error estimates in Theorem 3.6. Finally, we remove assumption (3.12) to obtain the main result of this section, Theorem 3.7. It will be assumed, without express mention, that Assumption 1 holds. From Theorem 2.2 we know that, under Assumptions 1 and 2, given u ∈ L 2 (Ω), (1.1) has a unique solution y ∈ H 2 (Ω) ∩ H 1 0 (Ω). In the rest of the section, u denotes a fixed element of L 2 (Ω) and y the corresponding solution of (1.1).
To formulate a discrete version of (1.1) we introduce a quasi-uniform family of triangulations {T h } h>0 of Ω̄; cf. [5, Definition (4.4.13)]. We denote by N h the number of interior nodes of T h . Associated with these triangulations we consider the finite dimensional spaces where P 1 (T ) denotes the space of polynomials on T of degree less than or equal to one. Now, we introduce the discrete version of (1.1) as follows From Theorem 2.1 we have Due to the presence of b in the definition of the bilinear form a, it is not necessarily coercive. However, we can prove (see [12, Lemma 2.1]) that it satisfies Gårding's inequality. There exists C Λ,b such that From [12, Corollary 2.6] we know that A * : is an isomorphism. Then, we argue similarly to [32] to deduce the following result. Lemma 3.1 Let a 0 ∈ L 2 (Ω) be a non-negative function. There exists h A,a 0 > 0 depending on A and a 0 L 2 (Ω) such that the variational problem has a unique solution for every h ≤ h A,a 0 and for every u ∈ L 2 (Ω). Moreover, there exists a constant C A,a 0 depending on A and a 0 such that Proof Because of the linearity of the system (3.3), it is enough to show the existence of h A,a 0 such that the only solution of the homogeneous problem is y h = 0. Therefore, let us assume that y h ∈ Y h satisfies Then, from Gårding's inequality (3.2) and the fact that a 0 ≥ 0 we get Then, there exist constants ĉ 2 and ĉ ∞ such that From the estimates (3.8) and (3.9) we get Taking h 1 > 0 such that Now, we select h 2 as follows We conclude the proof by establishing the estimate (3.4). To this end we set h A,a 0 = min{h 1 , h 2 }/2. Let y ∈ Y be the solution of Ay = u in Ω and let y h ∈ Y h be the solution of (3.3) for h ≤ h A,a 0 . Then, using again ψ and ψ h , and arguing as we did above, we get Then, using that h ≤ h 1 , taking into account that , see Lemma 3.2 below, and arguing as above, we deduce Then, from the last inequalities we infer Young's inequality implies This implies (3.4).
Indeed, it is enough to observe that y = A −1 u.

and φ 2 in
Then, the identity ψ = ψ 1 − ψ 2 holds. From the comparison principle proved in [12, Lemma 2.8] we know that all the above functions are non-negative. Using the same lemma and the fact that Now, from Theorem 2.1 we obtain where C depends on A * . Combining the above inequalities, it follows that Using the above estimate and applying again Theorem 2.1 to the equation which proves the lemma.

Remark 3.3
Notice that in the proof we have also established the existence of a constant K ∞ A * , which does not depend on a 0 , such that Since the non-linear discrete Eq. (3.1) is neither monotone nor coercive, the proof of existence or uniqueness of a solution is not obvious. We will establish the existence for h small enough. In a first step, we make the following assumption. Assumption 2' f : Ω × R −→ R is a Carathéodory function, monotone non-decreasing with respect to the second variable, and satisfying ∀x ∈ Ω and ∀y 1 , y 2 ∈ R. (3.12)

Remark 3.4 Let us observe that under Assumption 2, given M > 0 and setting
This property will be used later to remove Assumption 2'.
Condition (3.12) is more restrictive than (2.3); indeed, Assumption 2 follows immediately from it. Later we will get rid of this assumption. Now, we address the existence of a solution of (3.1). In the next theorem we apply Lemma 3.1 with a 0 = 0, and we set h A = h A,0 and C A = C A,0 .

Theorem 3.5 If Assumptions 1 and 2' hold, then there exists
From this inequality and (3.12) we deduce that From this estimate, the continuity of F h and Brouwer's fixed point theorem, we obtain the existence of at least one fixed point y h . Obviously, y h is a solution of (3.1). It remains to prove the uniqueness of the solution for h small enough. Let us assume that y h,1 , y h,2 ∈ Y h are two solutions of (3.1). Then, subtracting the equations satisfied by these solutions, we get Using (3.2) along with the monotonicity of f we get We define . Then, from (3.14), the definition of a 0 , the choice of ψ and ψ h ∈ Y h , and using (1.4), (1.5), (3.8) and that y h,1 and y h,2 are solutions of (3.1), we infer From this inequality we obtain that y h,2 − y h, Next we prove some error estimates for y − y h .
where y and y h denote the solutions of (1.1) and (3.1).

Proof
The proof is divided into three steps. Step 1. We proceed similarly as in the proof of the above theorem. We define Then, subtracting the equations satisfied by y and y h , using the estimate of Lemma 3.2 and taking ĉ 2 as in (3.8), we obtain which proves the desired estimate with a constant Hence, we have with (2.4) From the estimate proved in Step 1, (3.2) along with the monotonicity of f , and (3.17), we get for which proves Step 2 with Step 3. Proof of (3.16). The proof of (3.16) follows from (3.15) and the inverse inequality Then, according to Remark 3.4, f M satisfies the conditions (3.12). Moreover, if y is the solution of (1.1), then f M (x, y(x)) = f (x, y(x)) in Ω, thus y also satisfies According to Theorem 3.5, there exists h̃ M > 0 such that (3.20) has a unique solution for every h < h̃ M . Moreover, from Theorem 3.6 we have the estimate Taking h M such that we have Then, y h and ỹ h are two different solutions of (3.20) for every h < h M , which contradicts the uniqueness already established.
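To illustrate how a discrete semilinear equation of this type can be solved in practice, the following sketch applies a Picard-type iteration to a one-dimensional finite-difference analogue. This is a toy illustration, not the paper's finite element setting: the convection coefficient b = 5, the monotone nonlinearity f(y) = y³, the control u ≡ 1, and all discretization choices are placeholder assumptions.

```python
import numpy as np

# Toy 1D analogue of the non-coercive semilinear state equation:
#   -y'' + b y' + f(y) = u on (0,1),  y(0) = y(1) = 0,
# with the monotone nonlinearity f(y) = y**3, solved by the Picard iteration
#   A y_{k+1} = u - f(y_k),
# where A discretizes diffusion + convection with central differences.
def solve_semilinear(u, b=5.0, n=200, tol=1e-12, maxit=500):
    h = 1.0 / n
    x = np.linspace(h, 1.0 - h, n - 1)            # interior nodes
    # Diffusion part: tridiagonal (2, -1, -1) / h^2, dense for simplicity.
    A = (np.diag(2.0 * np.ones(n - 1))
         - np.diag(np.ones(n - 2), 1)
         - np.diag(np.ones(n - 2), -1)) / h**2
    # Convection part: central differences (cell Peclet number b*h/2 < 1).
    A += b * (np.diag(np.ones(n - 2), 1) - np.diag(np.ones(n - 2), -1)) / (2.0 * h)
    y = np.zeros(n - 1)
    for _ in range(maxit):
        y_new = np.linalg.solve(A, u(x) - y**3)
        if np.max(np.abs(y_new - y)) < tol:
            return x, y_new
        y = y_new
    return x, y

x, y = solve_semilinear(lambda x: np.ones_like(x))
```

For the small solution values arising here the cubic term is a contraction, so the iteration converges in a handful of steps; this mirrors, very loosely, the fixed-point structure behind the existence argument.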
Using the previous theorem, we are going to establish a well-defined local mapping u → y h (u) by discarding those solutions of (3.1) with large C(Ω̄)-norms. Theorem 3.8 Suppose that Assumptions 1 and 2 hold. Let ȳ ∈ Y be the solution of (1.1) corresponding to the control ū ∈ L 2 (Ω). Given arbitrary ρ > 0, there exist ρ * > 0 and h 0 > 0 such that (3.1) has a unique solution y h (u) ∈ B̄ Y ρ * (ȳ) for every u ∈ B̄ ρ (ū) ⊂ L 2 (Ω) and for all h < h 0 , where Furthermore, there exist constants K 2 and K ∞ such that where y u and y h (u) are the solutions of (1.1) and (3.1), respectively, associated with the control u.
Proof Let us fix ρ > 0. In [12, Lemma 3.5], the existence of a constant M ρ was proved such that Hence, we have Then, for every u ∈ B̄ ρ (ū), (3.24) implies that y u ∈ B Y ρ * (ȳ) holds. Furthermore, (3.22), (3.23) and (3.25) imply . It remains to prove the uniqueness. This follows from the fact that h 0 ≤ h M and, thanks to (3.25), any element y h ∈ B̄ Y ρ * (ȳ) satisfies

Approximation of the control problem (P)
In this section, we discretize the control problem (P) and study the convergence of the discretizations. To this end, we suppose without express mention that Assumptions 1 and 2 hold. Let us consider the functional J : L 2 (Ω) × L 2 (Ω) → R given by Let us denote by U h one of the following two spaces: where P 0 (T ) and P 1 (T ) denote the spaces of polynomials on T of degree 0 and ≤ 1, respectively. We also set U ad,h = U h ∩ U ad . If U h = U 0 h , then Π h denotes the projection onto U h in the L 2 (Ω) sense; if U h = U 1 h , then Π h denotes Carstensen's quasi-interpolation operator. In both cases it is known that Π h u converges to u in L 2 (Ω) as h tends to 0 for all u ∈ L 2 (Ω), and Π h u ∈ U ad,h for all u ∈ U ad .
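For the piecewise constant space U 0 h, the L 2 (Ω) projection Π h acts elementwise as a mean value. A minimal one-dimensional sketch (the uniform partition of (0, 1) and the concrete function sin are illustrative assumptions, not taken from the paper):

```python
import numpy as np

# L^2 projection onto piecewise constants on a uniform 1D partition:
# (Pi_h u)|_T is the mean value of u over the cell T. For u = sin, each cell
# average is computed exactly from the antiderivative -cos. As h -> 0 the
# averages converge to u, and box constraints alpha <= u <= beta are
# preserved, so Pi_h maps U_ad into U_ad,h.
def project_P0_sin(n):
    edges = np.linspace(0.0, 1.0, n + 1)
    return (np.cos(edges[:-1]) - np.cos(edges[1:])) / np.diff(edges)

coarse = project_P0_sin(8)    # 8 cells
fine = project_P0_sin(64)     # 64 cells: averages much closer to sin
```

The cell average differs from the midpoint value by O(h²), which is the mechanism behind the convergence Π h u → u in L 2 (Ω).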
We will approximate Problem (P) by the problem

ū is a solution of (P) with associated state ȳ, and (ȳ
Proof Claim 1-Existence of discrete solutions: Let us prove the existence of a solution of (P h ) for every h small enough.
Since F h is continuous and U ad,h is closed, then the set of feasible points (y h , u h ) Moreover, J is continuous and coercive. Hence, it is enough to prove the existence of feasible points for (P h ). We choose a constant function u ∈ U ad to guarantee u ∈ U ad,h for every h > 0. This can be done by taking u ≡ α if α > −∞, u ≡ β if β < ∞, or u ≡ 0 otherwise. According to Theorem 3.7, there exists h 0 > 0 such that (4.1) has a solution y h (u) ∈ Y h for every h < h 0 satisfying y h (u) → y u in Y . Therefore, (y h (u), u) is a feasible point of (P h ) for every h < h 0 .
Claim 2-Uniform boundedness of discrete solutions in H 1 0 (Ω) × L 2 (Ω) and weak convergence: Let us denote by {(ȳ h , ū h )} h<h 0 a sequence of solutions of problems (P h ). We prove the boundedness of this sequence in H 1 0 (Ω) × L 2 (Ω). Since (ȳ h , ū h ) satisfies (4.1), taking z h = ȳ h in (4.1) we deduce from (3.2) and the monotonicity of f From here we obtain ∀h < h 0 .
This inequality yields the boundedness of the sequence in H 1 0 (Ω) × L 2 (Ω); hence, it has weakly convergent subsequences in this topology. Now, we take a subsequence, denoted in the same way, such that

Claim 3-Validity of the state equation for the limit element:
The proof of this claim is split into three main steps: First, we prove that f (·,ȳ h ) → f (·,ȳ) strongly in L 1 (Ω). Next, we use this to prove (4.3), which is a weak version of (1.1) for bounded test functions. Finally, we prove thatȳ ∈ L ∞ (Ω) to conclude this part of the proof.
To this end, we get from (4.1) and the boundedness of Let ε > 0 be arbitrarily small. Using (2.3) with M = 2C/ε, we deduce the existence of a function φ ε ∈ L 2 (Ω) such that From the integrability of φ ε we infer the existence of λ 0 > 0 such that Let us set λ = 2Cλ 0 ε and Ω h,λ = {x ∈ Ω : | f (x, ȳ h (x)) − f (x, 0)| > λ}. We notice the following properties: From here, and using (4.2), we infer Since λ was chosen independently of h, the uniform integrability follows and . For instance, we can take z h Carstensen's quasi-interpolation of z; see [7]. Hence, using Lebesgue's dominated convergence theorem, we can pass to the limit in (4.1) and deduce that Finally, we prove that ȳ ∈ L ∞ (Ω) and, consequently, by a truncation argument, it will follow that (4.3) holds for all z ∈ H 1 0 (Ω). Let us set ã(y, z) = a(y, z) + C Λ,b (y, z) L 2 (Ω) , where C Λ,b is given by (3.2). Then we have that ã is coercive in H 1 0 (Ω) and (4.4) The above identity holds, in particular, for y k = Proj [−k,+k] (ȳ) for every k ≥ 1: Moreover, from Fatou's lemma and (4.5), denoting g = ū + C Λ,b ȳ − f (x, 0) ∈ L 2 (Ω) and taking into account that ã(ȳ, y k ) ≥ ã(y k , y k ) ≥ 0, we have , we can apply Lebesgue's dominated convergence theorem to pass to the limit in (4.5): Then, combining this identity and (4.5), we get for ỹ k = ȳ − y k : From the monotonicity of f and the definition of ỹ k we get Then, we can proceed as in [34, Theorem 4.1] or [35, §7.2] to infer the existence of k 0 such that ỹ k = 0 for k ≥ k 0 . Hence, ȳ ∈ L ∞ (Ω) holds. Moreover, from (2.5) and (2.6) we deduce that f (·, ȳ) ∈ L 2 (Ω). Therefore, we have that Aȳ ∈ L 2 (Ω) and, consequently, ȳ ∈ C(Ω̄); see [12, Corollary 2.2]. Thus, ȳ ∈ Y and (4.3) implies that ȳ is the solution of (1.1) associated with ū. Claim 4-Optimality of ū: Let us prove that ū is a solution of (P). First, notice that it follows from the inclusion U ad,h ⊂ U ad that ū ∈ U ad . To prove the optimality, we take an arbitrary element u ∈ U ad and set u h = Π h u ∈ U ad,h .
Moreover, Theorem 3.7 implies that there exists a solution y h (u h ) ∈ Y h of (4.1) for every h small enough, and y h (u h ) → y u in Y . Hence, we deduce from the optimality of (ȳ h , ū h ) Since u was taken arbitrarily in U ad , the above inequalities prove that ū is a solution of (P).
Next, we prove a kind of converse theorem. More precisely, we assume that ū ∈ U ad is a strict local minimum of (P) with associated state ȳ. This means that there exists ρ > 0 such that Under the assumptions of Theorem 4.1, there exist ρ * > 0 and h 0 > 0 such that for every u ∈ B̄ ρ (ū) there exists a unique solution y h (u) ∈ B̄ Y ρ * (ȳ) of (4.1); see Theorem 3.8. Then, for every h < h 0 we have a well defined mapping G h : Associated with J h we define the discrete control problem converges to the same limit y h (u h ), it follows that the whole sequence converges to y h (u h ). This proves the continuity of G h and, consequently, the continuity of J h . Therefore, (P ρ h ) consists of the minimization of a continuous function on a non-empty compact set, which implies the existence of a solution ū h .
It remains to prove that {ū h } h≤h ρ converges to ū strongly in L 2 (Ω). First, from the boundedness of {ū h } h≤h ρ ⊂ B̄ ρ (ū) and the inclusions U ad,h ⊂ U ad we deduce the existence of a subsequence, denoted in the same way, and an element ũ ∈ B̄ ρ (ū) ∩ U ad such that ū h ⇀ ũ in L 2 (Ω). This implies that yū h → yũ strongly in Y ; see [12, Theorem 2.9]. Therefore, using (3.22) and (3.23) we infer This convergence and the optimality of ū h imply (4.8) This inequality and (4.7) lead to the identity ū = ũ. Moreover, (4.8) implies that ū h → ū strongly in L 2 (Ω). This property is satisfied by every weakly convergent subsequence of {ū h } h≤h ρ , hence the whole sequence converges strongly to ū.

Remark 4.3
By selecting h ρ sufficiently small, we have that the solutions ū h of (P ρ h ) belong to the open ball B ρ (ū). Indeed, this is an obvious consequence of the strong convergence ū h → ū in L 2 (Ω). From now on, we will assume that h ρ has been chosen so that ū h is included in the open ball. From Theorem 3.8 we deduce that (y h (ū h ), ū h ) is a local solution of (P h ). Thus, Theorem 4.2 proves that strict local solutions of (P) can be approximated by local solutions of (P h ).
The next goal is to derive the optimality conditions satisfied by a solution of (P ρ h ). To this end, we first analyze the differentiability of the mapping G h and the functional J h . Theorem 4.4 Suppose that Assumptions 1 and 3 hold. Then, there exists h̃ ρ ≤ h 0 such that for every h < h̃ ρ the mapping G h : It is clear that F h is of class C 2 and F h (G h (u), u) = 0 ∀u ∈ B ρ (ū). Hence, the differentiability of G h is a consequence of the implicit function theorem applied to F h . We only need to prove that is an isomorphism. This is equivalent to proving that (4.9) has a unique solution z h ∈ Y h for every v ∈ L 2 (Ω). We prove this. First we observe that Then, Lemma 3.1 implies the existence of h̃ ρ ≤ h 0 depending on C f ,ρ * and A such that (4.9) has a unique solution for every h ≤ h̃ ρ and for all (u, v) ∈ B ρ (ū) × L 2 (Ω).
As an immediate corollary of the above theorem we get the differentiability of the objective functional J h .

(4.10)
where ϕ h (u) ∈ Y h is the adjoint state, i.e. it is the solution of the variational problem (4.11) Let us observe that (4.11) is a linear system of equations, adjoint to the one defined by (4.9). Therefore, the existence and uniqueness of a solution of (4.11) is a consequence of the same property for (4.9). Now, we can formulate the first order optimality conditions satisfied by a solution of (P ρ h ).

Moreover, for any of these solutions there exist two unique functions ȳ h
(4.14) Proof The existence of a solution of (P ρ h ) in the open ball follows from Remark 4.3. Then, the inequality J h ′(ū h )(u h − ū h ) ≥ 0 ∀u h ∈ U ad,h holds for every h < h̃. This, along with (4.10), leads directly to (4.12)-(4.14).

Error estimates
In this section, we suppose that Assumptions 1 and 3 hold. In the whole section, ū will denote a strict local solution of (P) with associated state ȳ and adjoint state φ̄. Following Theorem 4.6, in the sequel we will assume that h < h̃, and we consider the discrete problems (P ρ h ) having solutions ū h ∈ B ρ (ū) ∩ U ad,h and satisfying the optimality conditions (4.12)-(4.14). We know that ū h → ū strongly in L 2 (Ω). The goal is to provide some error estimates for the difference ū − ū h . We will distinguish two cases depending on the set U ad . First we analyze the case where U ad ≠ L 2 (Ω); next we treat the case where U ad = L 2 (Ω). Let us prove a preliminary result that we will use later.
Theorem 5.1 Let u ∈ B ρ (ū) be arbitrary. Let ϕ ∈ H 2 (Ω) ∩ H 1 0 (Ω) and ϕ h ∈ Y h denote the solutions of (2.11) and (4.11), respectively. Then, there exist constants k 2 and k ∞ such that ) and we estimate both summands. For the first summand we subtract the equations (2.11) and (5.4): Using the mean value theorem, we have that there exists a measurable function 0 < θ(x) < 1 such that, if we name ŷ = y h (u) + θ(y u − y h (u)), then Using Lemma 3.2, (5.3), the boundedness of {ϕ h } h in C(Ω̄), and (5.5) we infer To estimate the term ϕ h − ϕ h (u) we observe that the equation satisfied by ϕ h (u) is the corresponding discretization of the equation satisfied by ϕ h . Both equations are linear. Hence, we can use [32] to deduce that Using the estimate in L 2 (Ω), interpolation error estimates, and an inverse inequality, we obtain Alternatively, the reader may apply Theorem 3.6 with a linear function f , replacing the equation by its adjoint. Finally, (5.1) and (5.2) follow from the above estimates and (5.6).
Error estimates can be deduced from the abstract error estimate of [17, Theorem 2.14].
Lemma 5.2 Let ū be a local minimizer of (P) with associated state ȳ and satisfying (2.16). Let {(ȳ h , ū h )} be a sequence of local minimizers of the problems (P h ) converging strongly to (ȳ, ū) in H 1 0 (Ω) × L 2 (Ω). Proof We use [17, Theorem 2.14]. To this end, we have to check Assumptions (A2), (A3) and (A7) of [17]. First we observe that there exist positive constants r , M 1 and M 2 such that for all v, v 1 , v 2 ∈ L 2 (Ω) and all u ∈ U ad such that ū −u L 2 (Ω) < r Moreover, for every ε > 0 there exists δ > 0 such that for all u 1 , Hence, (A2) holds. Assumption (A3) says that for any element u ∈ U ad there exists a family {u h } h>0 with u h ∈ U ad,h such that u − u h L 2 (Ω) → 0 when h → 0, which is well known to be satisfied for our choices of U ad,h . Finally, estimate (5.1) implies Therefore, Assumption (A7) holds with ε h = h 2 . Then, [17, Theorem 2.14] yields the existence of a constant C independent of h such that and the result follows.
Next, we obtain error estimates for unconstrained problems.

Theorem 5.3
Suppose U ad = L 2 (Ω) and set U h = U i h , i = 0, 1. Let ū be a local minimizer of (P) with associated state ȳ and satisfying (2.16). Let {(ȳ h , ū h )} be a sequence of local minimizers of the problems (P h ) converging strongly to (ȳ, ū) in H 1 0 (Ω) × L 2 (Ω). Then there exists h 0 > 0 such that Proof We apply Lemma 5.2. In this case J ′(ū) = 0. For i = 0 we take u h = Π h ū, and for i = 1 we take u h = I h ū, the nodal interpolation of ū; the result follows from the approximation properties of the projection in the L 2 (Ω) sense and of the nodal interpolation, respectively.
In the following result, we obtain error estimates for constrained problems.
Let ū be a local minimizer of (P) with associated state ȳ and satisfying (2.16). Let {(ȳ h , ū h )} be a sequence of local minimizers of the problems (P h ) converging strongly to (ȳ, ū) in H 1 0 (Ω) × L 2 (Ω). Proof We apply again Lemma 5.2 with u h = Π h ū ∈ U ad,h , where we recall that Π h is either the linear projection in the L 2 (Ω) sense onto U 0 h or Carstensen's quasi-interpolation operator, depending on the approximation space for the controls. In both cases we have that ū − u h L 2 (Ω) ≤ Ch; see [21, Lemma 4.3] for Carstensen's quasi-interpolation operator. For the last term we have where the estimate u h − ū H −1 (Ω) ≤ Ch 2 follows by duality for the L 2 (Ω) projection and is proved in [21, Lemma 4.4] for Carstensen's quasi-interpolation operator.
Finally, we deduce error estimates in the norm of L ∞ (Ω). We start with a result for the adjoint state.
Proof By the triangle inequality Using either Theorem 5.3 or Theorem 5.4, we have that there exists some h 0 > 0 such that ū h ∈ B̄ ρ (ū) for all h < h 0 . Therefore, we can use (5.2) to obtain Using the same technique as in the proof of Theorem 5.1 and the Sobolev embedding Next, we use [12, Lemma 3.5] and either the estimates proved in Theorem 5.4 or the ones proved in Theorem 5.3, depending on whether we have control constraints or not. Since ū h ∈ B̄ ρ (ū) for all h < h 0 , we know that there is a constant M B̄ ρ (ū) such that The result follows from the previous estimates, just taking into account that 2 − n/2 ≤ 1.
To deduce error estimates for the control variable in L ∞ (Ω), we replace Assumption (2.2) and the assumption on the target y d by the following one, which is not very restrictive in practice: b ∈ L p̃ (Ω) with p̃ > n; div b, y d ∈ L q (Ω) with q > 2, (5.8) Using that Ω is convex, we know that there exists some 2 < p ≤ min{ p̃, q} such that ϕ ∈ W 2, p (Ω); see, e.g., [25] for n = 2 and [19, Corollary 3.12] for n = 3.

Corollary 5.6
Let ū be a local minimizer of (P) with associated state ȳ and satisfying (2.16). Let {(ȳ h , ū h )} be a sequence of local minimizers of the problems (P h ) converging strongly to (ȳ, ū) in H 1 0 (Ω) × L 2 (Ω). Suppose further that one of the following conditions is satisfied: Case 1-U h = U 1 h and U ad = L 2 (Ω). The optimality conditions (2.14) and (4.14) and Corollary 5.5 lead directly to Case 2-U h = U 0 h and (5.8) holds. In this case (2.14) and (4.14) lead to where Proj [α,β] (s) = max(α, min(β, s)) and ū T is the constant value of ū h on the triangle T . From the mean value theorem, for every element T ∈ T h , we deduce the existence of some x T ∈ T such that Since Proj [α,β] is a contraction, we have that for every T ∈ T h and almost every x ∈ T , Since φ̄ ∈ W 2, p (Ω) for some p > 2, by the Sobolev embedding theorem also φ̄ ∈ C 0,δ (Ω̄) for δ = 1 if n = 2 and some 1/2 < δ ≤ 1 depending on p if n = 3. Therefore, there exists a constant Λ φ̄ > 0 such that If there are no control constraints, we are in the situation of Case 1, so we assume that −∞ < α or β < +∞. In this case, (4.14) implies that ū h is the projection in the L 2 (Ω)-sense of − 1 ν φ̄ h onto U ad,h , but we do not have a pointwise projection formula.
The estimate follows from the results of [28, Sections 3, 4]. Notice that, although that reference deals with linear equations, the proof only requires L 2 (Ω)-error estimates for the control, which we have in Theorem 5.4; L ∞ (Ω)-error estimates and Lipschitz regularity for the adjoint state, which we have from Corollary 5.5 and assumption (5.8); and the fact that the discrete optimal control is a projection in the L 2 (Ω)-sense of − 1 ν φ̄ h . Notice also that the technique of proof cannot be carried over to n = 3, since the analogue of [28, Lemma 3.5] for n = 3 does not hold.
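The pointwise projection Proj [α,β] (s) = max(α, min(β, s)) appearing above is 1-Lipschitz, which is what transfers L ∞ (Ω) estimates for the adjoint state to the control. A quick numerical check of this contraction property (the bounds α = 0, β = 1 and the random sample are arbitrary illustrative choices):

```python
import numpy as np

# Proj_[alpha,beta](s) = max(alpha, min(beta, s)), written with np.clip.
# Being 1-Lipschitz, |Proj(a) - Proj(b)| <= |a - b| pointwise.
def proj(s, alpha=0.0, beta=1.0):
    return np.clip(s, alpha, beta)

rng = np.random.default_rng(0)
a, b = rng.normal(size=1000), rng.normal(size=1000)
lipschitz_ok = np.all(np.abs(proj(a) - proj(b)) <= np.abs(a - b) + 1e-15)
```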

Remark 5.7
Under additional regularity conditions, higher orders of convergence can be proved. Indeed, let us suppose that ϕ u ∈ W 2, p (Ω) for some p > n if u ∈ L ∞ (Ω). For n = 2, condition (5.8) is sufficient for this regularity, while for n = 3 we have to assume that b, div b, y d ∈ L p̃ (Ω) for some p̃ > 3 and also that the internal angles of Ω are small enough; see [19]. Using the same technique as in the proof of From this estimate, it can be proved, as in Corollaries 5.5 and 5.6, that where Λ φ̄ is the Lipschitz constant of φ̄. Assume now that ϕ u ∈ W 2, p (Ω) for all p < +∞ if u ∈ L ∞ (Ω). If further U h = U 1 h and U ad = L 2 (Ω), then we obtain, by setting p = | log h| in the above inequality, See [11, Lemma 3] for the proof of a similar result. This high regularity can be achieved, for instance, if b, div b, y d ∈ L p̃ (Ω) for all p̃ < +∞ and Ω is a rectangle or a rectangular parallelepiped or its boundary Γ is of class C 1,1 . Also, when U h = U 1 h and U ad ≠ L 2 (Ω), the order of convergence usually observed in experiments for the L 2 (Ω)-error of the control is O(h 3/2 ). A detailed explanation of this phenomenon can be found in [10, Section 10]. In our case, this order is achieved if p > n. The proof is based on the assumption that the measure of the set ∪{T ∈ T h : ū ∉ H 2 (T )} is of order h. This assumption is not restrictive and is usually satisfied in practice; see [27].

Numerical experiments
We are going to build an example with an explicitly known local solution satisfying the second order sufficient optimality condition (2.16).
To define the target state y d , we first define φ̄(x) = −Π n i=1 x i (1 − x i ) and ū(x) = Proj [α,β] (−φ̄(x)/ν). Next, we take ȳ ∈ H 2 (Ω) ∩ H 1 0 (Ω), the solution of the state equation, and set y d (x) = Δφ̄(x) + div(b(x)φ̄(x)) + ȳ(x) − ∂ f ∂ y (x, ȳ(x))φ̄(x). (In practice, we do not have ȳ, but we can use y h (ū) to compute a good approximation of y d .) With these choices, it is clear that (ū, ȳ, φ̄) satisfies the first order optimality conditions (2.12)-(2.14). From (2.10), we have Since φ̄(x) < 0 for all x ∈ Ω, condition (2.16) holds and hence ū is a local solution of (P). The problem is discretized using the finite element method. To solve the discrete problems, we use a semi-smooth Newton method as described in [10, Section 14]. The success of the conjugate gradient method used to solve the unconstrained quadratic programs arising in the optimization process is an indication that the solutions of the finite dimensional problems are strict local minima.
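The construction of ū from φ̄ can be sketched directly on a grid for n = 2. The values of ν, α and β below are placeholders (the paper's actual choices were lost in extraction); only the formulas φ̄(x) = −x₁(1 − x₁)x₂(1 − x₂) and ū = Proj [α,β] (−φ̄/ν) come from the text.

```python
import numpy as np

# Exact adjoint state and control of the constructed example on the unit
# square: phi_bar(x) = -x1*(1-x1)*x2*(1-x2) < 0 in the interior, and
# u_bar = Proj_[alpha,beta](-phi_bar/nu) via the pointwise clamp.
# nu, alpha, beta are assumed placeholder values, not the paper's.
nu, alpha, beta = 1.0, 0.0, 0.2
xs = np.linspace(0.0, 1.0, 65)
X1, X2 = np.meshgrid(xs, xs, indexing="ij")
phi = -X1 * (1 - X1) * X2 * (1 - X2)          # vanishes on the boundary
u = np.clip(-phi / nu, alpha, beta)           # pointwise projection formula
```

Since x(1 − x) peaks at 1/4, the interior minimum of φ̄ is −1/16 at the center of the square, and ū inherits its maximum there (when the upper bound β is inactive).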
The mesh of size h i = 2 −i is obtained by splitting Ω into (2 i ) n congruent cells, obtained by translation of (0, h i ) n , and dividing each cell into n! n-simplices. In this family of meshes, the experimental order of convergence for the error of the variable z ∈ {u, y, ϕ} measured in the norm of X = L 2 (Ω) or L ∞ (Ω) can be computed as We report the L 2 (Ω) and L ∞ (Ω) experimental orders of convergence of the error for the control, the state, and the adjoint state for i = 8 if n = 2, and i = 5 if n = 3. We summarize the results in Tables 1, 2 and 3.
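On meshes with h i = 2⁻ⁱ, the experimental order of convergence between consecutive levels reduces to a base-2 logarithm of the error ratio. A small sketch with synthetic errors e i = C h i² (the O(h²) rate one expects, e.g., for the state in L 2 (Ω); the data are illustrative, not taken from the paper's tables):

```python
import numpy as np

# Experimental order of convergence on a dyadic mesh family h_i = 2**-i:
#   EOC_i = log(e_i / e_{i+1}) / log(h_i / h_{i+1}) = log2(e_i / e_{i+1}).
def eoc(errors):
    e = np.asarray(errors, dtype=float)
    return np.log2(e[:-1] / e[1:])

h = 2.0 ** -np.arange(3, 9)       # h_3, ..., h_8
errors = 0.7 * h**2               # synthetic O(h^2) errors
rates = eoc(errors)               # all entries equal 2 for exact h^2 decay
```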
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.