1 Introduction

In this paper, we consider the numerical approximation of the optimal control problem

$$\begin{aligned} \text{(P) } \quad \min _{u \in U_{\mathrm{ad}}} J(u) := \frac{1}{2}\int _\varOmega (y_u(x) - y_d(x))^2\, dx + \frac{\nu }{2}\int _\varOmega u^2(x)\, dx, \end{aligned}$$

where \(\varOmega \subset {\mathbb {R}}^n\), \(n=2\) or \(n=3\), is a convex domain with boundary \(\varGamma \), \(y_u\) is the solution of the following state equation

$$\begin{aligned} \left\{ \begin{array}{l} Ay + b(x)\cdot \nabla y + f(x,y) = u \text { in } \varOmega ,\\ y = 0\text { on } \varGamma ,\end{array}\right. \end{aligned}$$
(1.1)

A is an elliptic operator, \(b:\varOmega \longrightarrow {\mathbb {R}}^n\) is a given function, and \(f:\varOmega \times {\mathbb {R}} \longrightarrow {\mathbb {R}}\) is monotone non-decreasing in the second variable. Moreover, \(y_d \in L^2(\varOmega )\) is a given function, \(\nu > 0\), and

$$\begin{aligned} U_{\mathrm{ad}}= \{u \in L^2(\varOmega ) : \alpha \le u(x) \le \beta \text { for a.e. } x \in \varOmega \} \end{aligned}$$

with \(-\infty \le \alpha < \beta \le +\infty \). This problem is studied in [12], where existence and uniqueness results for the equation, as well as existence of optimal controls and optimality conditions, are obtained. For the convenience of the reader, these results are summarized in Sect. 2. In this work, we discretize the problem and obtain approximation results. The reader is also referred to [8, 15] for a similar control problem associated with a non-monotone quasilinear elliptic equation. The main difference with respect to the above equation is that the operators considered in [8, 15] are coercive, while our equation is neither monotone nor coercive.

In Sect. 3 we study the approximation of the state equation by finite elements. The reader is referred to [32] for the linear case or [8, 16, 22] for the case of non-monotone but coercive quasilinear equations. In the quasilinear case, the discrete equation has at least one solution for every h, which easily follows from an application of Brouwer’s fixed point theorem and the coercivity of the operator. However, there is no uniqueness result. In the linear case, we can only prove existence of a solution if h is small enough, but we have a unique solution for each of these values of h. In the semilinear discrete case corresponding to (1.1), the existence of a discrete solution requires, as in the linear case, that the parameter h be small. But, as in the quasilinear case, the uniqueness of discrete solutions is an open issue. We prove existence and uniqueness of a discrete solution if the equation is linear or if the non-monotone term is bounded. In the general case we can prove the existence and uniqueness of a bounded sequence of solutions as h tends to 0, but we cannot rule out the possible existence of a divergent sequence of solutions in the \(L^\infty (\varOmega )\)-norm. Error estimates are provided for the bounded approximations of the solutions of the state equation.

In Sect. 4 we discretize the control problem, using either piecewise constant or continuous piecewise linear approximations of the control. We prove the existence of a number \(h_0 > 0\) such that the discrete optimal control problem has at least one solution \(({\bar{y}}_h,{\bar{u}}_h)\) for every discretization parameter \(h < h_0\). We also prove the boundedness of these solutions in \(H_0^1(\varOmega ) \times L^2(\varOmega )\). Moreover, every limit in the \(H_0^1(\varOmega ) \times L^2(\varOmega )\) weak topology when \(h \rightarrow 0\) of a sequence of discrete solutions is a solution of the continuous optimal control problem. In addition, the convergence is not only weak, it is strong in the \(H_0^1(\varOmega ) \times L^2(\varOmega )\) topology. Next, we define a discrete control-to-state mapping in a neighborhood of a strict solution of the continuous problem, as well as an associated reduced functional, and we state first order optimality conditions. In Sect. 5 we obtain error estimates and in the last section we include a numerical experiment.

To finish this introduction let us mention some papers concerning error estimates for the numerical approximation of non-linear elliptic control problems. Early references for the numerical analysis of linear quadratic control problems are the papers [23] and [24]. The first reference we are aware of dealing with the numerical approximation of optimal control problems governed by a semilinear elliptic equation is [3]; state constraints were included in the analysis in [29]. Different aspects of Neumann boundary optimal control problems have been treated in [9, 13] or [26]. The case of Dirichlet boundary control was first treated in [14]. In all these references, the equations were coercive and monotone. Optimal control problems governed by quasi-linear elliptic equations have been studied in [8, 16, 17, 20]. In these works, the equations were coercive but not monotone. It is also worth mentioning the works [2, 30]. In the first one, the authors investigate under which conditions discrete local minima are indeed global. In the second one the authors study how to numerically verify second order optimality conditions, which are very important for the study of local minima for non-convex optimal control problems.

NOTATION: Throughout the paper we will consider the following operators

$$\begin{aligned}&Ay= - \sum _{i,j = 1}^n\partial _{x_j}(a_{ij}(x)\partial _{x_i}y)\ \text { and }\ A^*\varphi = -\sum _{i, j = 1}^n \partial _{x_i}(a_{ij}(x)\partial _{x_j}\varphi ), \end{aligned}$$
(1.2)
$$\begin{aligned}&{\mathcal {A}}y = Ay + b(x)\cdot \nabla y\ \text { and }\ {\mathcal {A}}^*\varphi = A^*\varphi - {\text {div}}[b(x)\varphi ]. \end{aligned}$$
(1.3)

As usual we will denote by \(C({\bar{\varOmega }})\) the space of continuous functions in \({\bar{\varOmega }}\), the closure of \(\varOmega \). \(C^{0,\delta }({\bar{\varOmega }})\) is the space of Hölder functions in \({\bar{\varOmega }}\) if \(0<\delta <1\) and of Lipschitz functions if \(\delta =1\). For \(p\in [1,+\infty ]\), \(s\ge 0\), we will denote by \(L^p(\varOmega )\) and \(W^{s,p}(\varOmega )\), respectively, the Lebesgue and Sobolev spaces. We also abbreviate \(H^s(\varOmega )=W^{s,2}(\varOmega )\). \(H^1_0(\varOmega )\) is the space of elements in \(H^1(\varOmega )\) with null trace on \(\varGamma \). \(H^{-1}(\varOmega )\) is the dual of \(H^1_0(\varOmega )\). See [1] for definitions and further properties of these spaces. In the Sobolev space \(H_0^1(\varOmega )\) we take the norm

$$\begin{aligned} \Vert y\Vert _{H_0^1(\varOmega )} = \left( \int _\varOmega |\nabla y(x)|^2\, dx\right) ^{\frac{1}{2}}. \end{aligned}$$

According to the Poincaré inequality, there exists a constant \(C_\varOmega \) such that

$$\begin{aligned} \Vert y\Vert _{L^2(\varOmega )} \le C_\varOmega \Vert y\Vert _{H_0^1(\varOmega )} \quad \forall y \in H_0^1(\varOmega ). \end{aligned}$$
(1.4)

From this inequality and Sobolev’s embedding theorem, we also know that there exists a constant \(K_\varOmega \) such that

$$\begin{aligned} \Vert y\Vert _{L^4(\varOmega )} \le K_\varOmega \Vert y\Vert _{H_0^1(\varOmega )} \quad \forall y \in H_0^1(\varOmega ). \end{aligned}$$
(1.5)

We will denote \(Y = H_0^1(\varOmega ) \cap C({\bar{\varOmega }})\), which is a Banach space when endowed with the norm

$$\begin{aligned} \Vert y\Vert _Y = \Vert y\Vert _{H_0^1(\varOmega )} + \Vert y\Vert _{C({\bar{\varOmega }})}. \end{aligned}$$

2 Preliminary results

Assumption 1

We assume that \(\varOmega \) is a convex domain in \({\mathbb {R}}^n\) with \(n = 2\) or 3. We also suppose that \(\varOmega \) is polygonal if \(n = 2\) or polyhedral if \(n = 3\). \(\varGamma \) denotes its boundary, which is Lipschitz. The following conditions are satisfied by the coefficients of the operator \({\mathcal {A}}\):

$$\begin{aligned}&\left\{ \begin{array}{l}a_{ij} \in C^{0,1}({\bar{\varOmega }}) \ \text { for } i, j = 1, \ldots , n,\\ \displaystyle \exists \varLambda > 0 \text { such that } \sum _{i,j = 1}^na_{ij}(x)\xi _i\xi _j \ge \varLambda |\xi |^2\ \ \forall \xi \in {\mathbb {R}}^n \text { and for a.{e.} } x \in \varOmega ,\end{array}\right. \end{aligned}$$
(2.1)
$$\begin{aligned}&b \in L^{{\bar{p}}}(\varOmega ) \text { with } {\bar{p}} > 2 \text { if } n = 2 \text { and } {\bar{p}} \ge 3 \text { if } n = 3, \text { and } {\text {div}}{b} \in L^2(\varOmega ).\qquad \end{aligned}$$
(2.2)

The following properties on the operators \({\mathcal {A}}\) and \({\mathcal {A}}^*\) were proved in Theorem 2.2, Corollary 2.4, Theorem 2.5 and Corollary 2.6 of [12]. The proofs of these results make use of [25, Theorems 2.2.2.3 and 3.2.1.2].

Theorem 2.1

Under Assumption 1, both operators \({\mathcal {A}}\) and \({\mathcal {A}}^*\) define isomorphisms between the spaces \(H_0^1(\varOmega )\) and \(H^{-1}(\varOmega )\), and \(H^2(\varOmega ) \cap H_0^1(\varOmega )\) and \(L^2(\varOmega )\).

Regarding Eq. (1.1), we make the following assumption on the non-linear function f.

Assumption 2

Function \(f:\varOmega \times {\mathbb {R}} \longrightarrow {\mathbb {R}}\) is a Carathéodory function, monotone non-decreasing with respect to the second variable, and satisfying

$$\begin{aligned} \left\{ \begin{array}{l}f(\cdot ,0) \in L^2(\varOmega ) \text { and } \forall M > 0\ \exists \phi _M \in L^2(\varOmega ) \text { such that }\\ |f(x,y_2) - f(x,y_1)| \le \phi _M(x)|y_2 - y_1|\, \text {for a.{e.}}\, x \in \varOmega \, \text{ and }\, |y_i| \le M, \, i = 1, 2.\end{array}\right. \nonumber \\ \end{aligned}$$
(2.3)

The following result concerning existence, uniqueness and regularity of a solution of (1.1) follows from Theorems 2.6 and 2.8 of [12].

Theorem 2.2

Under Assumptions 1 and 2, for every \(u \in L^2(\varOmega )\) the Eq. (1.1) has a unique solution \(y_u \in H^2(\varOmega ) \cap H_0^1(\varOmega )\). Moreover, the estimate

$$\begin{aligned} \Vert y_u\Vert _{H^2(\varOmega )} \le C_{A,f}\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )\ \ \forall u \in L^2(\varOmega ) \end{aligned}$$
(2.4)

holds for some constant \(C_{A,f}\) depending on A and f.

Additional regularity assumptions on f are necessary to consider the differentiability of the functional J.

Assumption 3

We suppose that \(f:\varOmega \times {\mathbb {R}}\longrightarrow {\mathbb {R}}\) is a Carathéodory function of class \(C^2\) with respect to the second variable satisfying:

$$\begin{aligned} f(\cdot ,0)\in L^2(\varOmega ) \text { and } \frac{\partial f}{\partial y}(x,y) \ge 0 \ \text{ for } \text{ a.e. } x \in \varOmega \text { and } \forall y \in {\mathbb {R}}. \end{aligned}$$
(2.5)

For every \(M>0\) there exists a constant \(C_{f,M}>0\) such that

$$\begin{aligned} \left| \frac{\partial f}{\partial y}(x,y)\right| + \left| \frac{\partial ^2 f}{\partial y^2}(x,y)\right| \le C_{f,M} \text{ for } \text{ a.e. } x \in \varOmega \text{ and } \text{ for } \text{ all } |y| \le M. \end{aligned}$$
(2.6)

For every \(M > 0\) and \(\varepsilon > 0\) there exists \(\delta > 0\), depending on M and \(\varepsilon \), such that

$$\begin{aligned} \left| \frac{\partial ^2f}{\partial y^2}(x,y_2) - \frac{\partial ^2f}{\partial y^2}(x,y_1)\right| < \varepsilon \ \text{ if } |y_1|, |y_2| \le M,\ |y_2 - y_1| \le \delta , \text{ for } \text{ a.e. } x \in \varOmega .\nonumber \\ \end{aligned}$$
(2.7)

It is easy to check that Assumption 3 implies Assumption 2. Typical examples of functions satisfying these assumptions are \(f(x,y) = a_0(x)y^{2n+1}\) or \(f(x,y)=a_0(x)\exp (y)\), where \(a_0 \in L^\infty (\varOmega )\), \(a_0(x)\ge 0\), and n is a positive integer.

Concerning the differentiability of J we have the following result [12, Theorems 2.12 and 3.2 and Corollary 2.6].

Theorem 2.3

Let us suppose that Assumptions 1 and 3 hold. Then, the mapping \(G:L^{2}(\varOmega ) \longrightarrow Y\) given by \(G(u) = y_u\) is well defined and of class \(C^2\). Moreover, given \(u, v \in L^{2}(\varOmega )\), \(z_v = DG(u)v\) is the solution of

$$\begin{aligned} \left\{ \begin{array}{l}\displaystyle Az + b(x)\cdot \nabla z + \frac{\partial f}{\partial y}(x,y_u)z = v \text { in } \varOmega ,\\ z = 0\text { on } \varGamma .\end{array}\right. \end{aligned}$$
(2.8)

The functional J is of class \(C^2\). Moreover, given \(u, v, v_1, v_2 \in L^2(\varOmega )\) we have

$$\begin{aligned}&J'(u)v = \int _\varOmega (\varphi _u + \nu u)v\, dx, \end{aligned}$$
(2.9)
$$\begin{aligned}&J''(u)(v_1,v_2) = \int _\varOmega \Big [1 - \varphi _u\frac{\partial ^2f}{\partial y^2}(x,y_u)\Big ]z_{v_1}z_{v_2}\, dx + \nu \int _\varOmega v_1v_2\, dx, \end{aligned}$$
(2.10)

where \(z_{v_i} = DG(u)v_i\), \(i = 1,2\), and \(\varphi _u \in H^2(\varOmega ) \cap H_0^1(\varOmega )\) is the unique solution of the adjoint equation

$$\begin{aligned} \left\{ \begin{array}{l}\displaystyle A^*\varphi - {\text {div}}[b(x)\varphi ] + \frac{\partial f}{\partial y}(x,y_u)\varphi = y_u - y_d \text { in } \varOmega ,\\ \varphi = 0\text { on } \varGamma .\end{array}\right. \end{aligned}$$
(2.11)
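As a rough numerical illustration of formulas (2.9) and (2.11), the following Python sketch compares the adjoint-based derivative with a central finite difference in a one-dimensional analogue. Everything specific here is an assumption made for illustration and is not taken from the paper: the domain \((0,1)\), \(A = -d^2/dx^2\), a constant b, \(f(x,y) = y^3\), a uniform \(P_1\) mesh with nodal (mass-lumped) quadrature, and the hypothetical helper names `fem_matrix`, `solve_state`, `cost` and `gradient`.

```python
import numpy as np

def fem_matrix(n, b):
    """P1 matrix of -y'' + b*y' on a uniform mesh of (0,1) with n interior nodes."""
    h = 1.0 / (n + 1)
    stiff = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    conv = 0.5 * b * (np.eye(n, k=1) - np.eye(n, k=-1))
    return stiff + conv, h

def solve_state(u, A, h):
    """Newton iteration for A y + h*(y^3 - u) = 0 (nodal quadrature, an assumption)."""
    y = np.zeros_like(u)
    for _ in range(50):
        F = A @ y + h * (y**3 - u)
        if np.linalg.norm(F) < 1e-13:
            break
        y = y - np.linalg.solve(A + h * np.diag(3 * y**2), F)
    return y

def cost(u, A, h, y_d, nu):
    """Discrete analogue of J(u) with lumped L2 inner products."""
    y = solve_state(u, A, h)
    return 0.5 * h * np.sum((y - y_d) ** 2) + 0.5 * nu * h * np.sum(u**2)

def gradient(u, A, h, y_d, nu):
    """Nodal values of phi_u + nu*u; phi_u solves the transposed linearized system."""
    y = solve_state(u, A, h)
    phi = np.linalg.solve(A.T + h * np.diag(3 * y**2), h * (y - y_d))
    return phi + nu * u

n, b, nu = 31, 5.0, 0.1
A, h = fem_matrix(n, b)
x = np.linspace(0.0, 1.0, n + 2)[1:-1]   # interior nodes
u = 10.0 * np.sin(2 * np.pi * x)         # arbitrary control
y_d = x * (1.0 - x)                      # arbitrary target state
v = np.cos(3.0 * x)                      # arbitrary direction

eps = 1e-5
fd = (cost(u + eps * v, A, h, y_d, nu) - cost(u - eps * v, A, h, y_d, nu)) / (2 * eps)
ad = h * np.sum(gradient(u, A, h, y_d, nu) * v)   # discrete version of (2.9)
print(fd, ad)
```

The transposed matrix `A.T` plays the role of \({\mathcal {A}}^*\) in (2.11); in this discrete setting the two printed numbers should agree up to finite-difference error.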

Since (P) is not a convex problem, we distinguish between local and global solutions. We say that \({\bar{u}}\) is a local solution of (P) if there exists \(\varepsilon > 0\) such that

$$\begin{aligned} J({\bar{u}}) \le J(u)\quad \forall u \in U_{\mathrm{ad}}: \Vert u - {\bar{u}}\Vert _{L^2(\varOmega )} \le \varepsilon . \end{aligned}$$

As usual, we say that \({\bar{u}}\) is a strict local solution if the above inequality is strict whenever \(u \ne {\bar{u}}\). The reader is referred to [12, Definition 3.3 and Lemma 3.4] for a discussion of different notions of local solutions.

Theorem 2.4

Under Assumptions 1 and 3, (P) has at least one solution. Moreover, if \({\bar{u}}\) is a local solution of (P), then there exist two unique elements \({\bar{y}}, {\bar{\varphi }} \in H^2(\varOmega ) \cap H_0^1(\varOmega )\) such that

$$\begin{aligned}&\left\{ \begin{array}{l} A{\bar{y}} + b(x)\cdot \nabla {\bar{y}} + f(x,{\bar{y}}) = {\bar{u}} \text { in } \varOmega ,\\ {\bar{y}} = 0\text { on } \varGamma ,\end{array}\right. \end{aligned}$$
(2.12)
$$\begin{aligned}&\left\{ \begin{array}{l}\displaystyle A^*{\bar{\varphi }} - {\text {div}}[b(x){\bar{\varphi }}] + \frac{\partial f}{\partial y}(x,{\bar{y}}){\bar{\varphi }} = {\bar{y}} - y_d \text { in } \varOmega ,\\ {\bar{\varphi }} = 0\text { on } \varGamma ,\end{array}\right. \end{aligned}$$
(2.13)
$$\begin{aligned}&\int _\varOmega ({\bar{\varphi }} + \nu {\bar{u}})(u - {\bar{u}})\, dx \ge 0 \quad \forall u \in U_{\mathrm{ad}}. \end{aligned}$$
(2.14)

Further, the regularity \({\bar{u}} \in H^1(\varOmega ) \cap C({\bar{\varOmega }})\) holds. In addition, if \(U_{\mathrm{ad}}= L^2(\varOmega )\), then we have \({\bar{u}} \in H^2(\varOmega ) \cap H_0^1(\varOmega ).\)

This theorem follows from [12, Theorems 3.1 and 3.6, and Corollary 3.7].
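A standard pointwise reformulation of the variational inequality (2.14), not stated explicitly above, is \({\bar{u}}(x) = {\text {Proj}}_{[\alpha ,\beta ]}(-{\bar{\varphi }}(x)/\nu )\) for a.e. \(x \in \varOmega \). The Python sketch below checks this on sampled values; the random samples standing in for \({\bar{\varphi }}\), the bounds, and the name `project_control` are hypothetical and serve only as an illustration.

```python
import numpy as np

def project_control(phi, nu, alpha, beta):
    """Pointwise projection of -phi/nu onto the interval [alpha, beta]."""
    return np.clip(-phi / nu, alpha, beta)

rng = np.random.default_rng(0)
phi = rng.normal(size=200)            # hypothetical samples of the adjoint state
nu, alpha, beta = 0.5, -1.0, 1.0      # hypothetical data
u_bar = project_control(phi, nu, alpha, beta)

# d plays the role of phi_bar + nu*u_bar in (2.14)
d = phi + nu * u_bar

# Discrete analogue of (2.14): sum d*(u - u_bar) >= 0 for admissible u
worst = min(
    float(np.sum(d * (rng.uniform(alpha, beta, size=phi.size) - u_bar)))
    for _ in range(1000)
)
print(worst)
```

The sign pattern of `d` on the sets where `u_bar` equals `alpha` or `beta` is the discrete counterpart of the sign conditions on \({\bar{\varphi }} + \nu {\bar{u}}\) derived from (2.14).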

We finish this section by establishing the second order optimality conditions. To this end, we define the cone of critical directions as follows:

$$\begin{aligned}&C_{{\bar{u}}} = \{v \in L^2(\varOmega ) : J'({\bar{u}})v = 0 \text { and } (2.15) \text { holds}\},\\&v(x)\left\{ \begin{array}{ll} \ge 0 &{}\quad \text{ if }\;{\bar{u}}(x)=\alpha , \\ \le 0 &{} \quad \text{ if }\;{\bar{u}}(x)=\beta . \end{array}\right. \end{aligned}$$
(2.15)

Let us observe that (2.14) implies that

$$\begin{aligned} {\bar{\varphi }}(x)+\nu {\bar{u}}(x) \left\{ \begin{array}{ll} \ge 0&{}\quad \text {if}\; {\bar{u}}(x) = \alpha ,\\ \le 0&{}\quad \text {if}\; {\bar{u}}(x) = \beta .\end{array}\right. \end{aligned}$$

Therefore, if \(v \in L^2(\varOmega )\) satisfies (2.15), then \(J'({\bar{u}})v \ge 0\) holds, and \(J'({\bar{u}})v = 0\) if and only if \(v(x) = 0\) whenever \({\bar{\varphi }}(x) +\nu {\bar{u}}(x) \ne 0\).

In the case where there are no control constraints, namely \(U_{\mathrm{ad}}= L^2(\varOmega )\), we have \(J'({\bar{u}}) = 0\) and \(C_{{\bar{u}}} = L^2(\varOmega )\).

Theorem 2.5

Under Assumptions 1 and 3, if \({\bar{u}}\) is a local solution of (P), then \(J''({\bar{u}})v^2 \ge 0\) \(\forall v \in C_{{\bar{u}}}\). Conversely, if \({\bar{u}} \in U_{\mathrm{ad}}\) satisfies (2.12)–(2.14) along with \(({\bar{y}},{\bar{\varphi }})\) and

$$\begin{aligned} J''({\bar{u}})v^2 > 0\quad \forall v \in C_{{\bar{u}}}\setminus \{0\}, \end{aligned}$$
(2.16)

then there exist \(\varepsilon > 0\) and \(\kappa > 0\) such that

$$\begin{aligned} J({\bar{u}}) + \frac{\kappa }{2}\Vert u - {\bar{u}}\Vert ^2_{L^2(\varOmega )} \le J(u) \ \ \forall u \in U_{\mathrm{ad}}: \Vert u - {\bar{u}}\Vert _{L^2(\varOmega )} \le \varepsilon . \end{aligned}$$
(2.17)

This result was established in [12, Corollary 3.9].

3 Approximation of the state equation

In this section we consider the finite element discretization of Eq. (1.1). The goal is to prove the existence of a solution for the discrete problems and to derive some error estimates. We proceed in three steps. First we study the linear equation; see Lemma 3.1. Next we replace the local Lipschitz condition stated in Assumption 2 by the more restrictive global condition (3.12). Using this condition, we prove the existence of a unique discrete solution in Theorem 3.5 and error estimates in Theorem 3.6. Finally, we remove assumption (3.12) to obtain the main result of this section, Theorem 3.7. It will be assumed, without express mention, that Assumption 1 holds.

From Theorem 2.2 we know that, under Assumptions 1 and 2, given \(u \in L^2(\varOmega )\), (1.1) has a unique solution \(y \in H^2(\varOmega ) \cap H_0^1(\varOmega )\). In the rest of the section u denotes a fixed element of \(L^2(\varOmega )\) and y the corresponding solution of (1.1).

To formulate a discrete version of (1.1) we introduce a quasi-uniform family of triangulations \(\{{\mathcal {T}}_{h}\}_{h>0}\) of \({\bar{\varOmega }}\); cf. [5, Definition (4.4.13)]. We denote by \(N_h\) the number of interior nodes of \({\mathcal {T}}_{h}\). Associated with these triangulations we consider the finite dimensional spaces

$$\begin{aligned} Y_h = \{y_h\in C({\bar{\varOmega }}):\ y_{h|T}\in P_1(T)\ \forall T\in {\mathcal {T}}_h\text { and } y_h\equiv 0 \text{ on } \varGamma \}, \end{aligned}$$

where \(P_1(T)\) denotes the space of polynomials in T of degree less than or equal to one. Now, we introduce the discrete version of (1.1) as follows

$$\begin{aligned} \left\{ \begin{array}{l}\text {Find } y_h \in Y_h \text { such that }\\ a(y_h,z_h) + \displaystyle \int _\varOmega f(x,y_h(x))z_h(x)\, dx = \displaystyle \int _\varOmega u(x)z_h(x)\, dx \ \ \forall z_h \in Y_h.\end{array}\right. \end{aligned}$$
(3.1)

Above \(a:H^1(\varOmega ) \times H^1(\varOmega ) \longrightarrow {\mathbb {R}}\) denotes the bilinear form associated to the operator \({\mathcal {A}}\)

$$\begin{aligned} a(y_1,y_2)&= \langle {\mathcal {A}}y_1,y_2\rangle _{H^{-1}(\varOmega ),H_0^1(\varOmega )}\\&= \int _\varOmega \Big (\sum _{i, j = 1}^na_{ij}(x)\partial _{x_i}y_1\partial _{x_j}y_2 + [b(x)\cdot \nabla y_1]y_2\Big )\, dx. \end{aligned}$$

From Theorem 2.1 we have

$$\begin{aligned} |a(y_1,y_2)| \le \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Vert y_1\Vert _{H_0^1(\varOmega )}\Vert y_2\Vert _{H_0^1(\varOmega )}. \end{aligned}$$

Due to the presence of b in the definition of the bilinear form a, it is not necessarily coercive. However, it satisfies Gårding’s inequality, see [12, Lemma 2.1]: there exists \(C_{\varLambda ,b}\) such that

$$\begin{aligned} a(z,z) \ge \frac{\varLambda }{4}\Vert z\Vert ^2_{H_0^1(\varOmega )} - C_{\varLambda ,b}\Vert z\Vert ^2_{L^2(\varOmega )}\quad \forall z \in H_0^1(\varOmega ). \end{aligned}$$
(3.2)
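To make the discrete problem (3.1) concrete, here is a minimal Python sketch of a one-dimensional analogue solved by Newton's method. It is only an illustration under assumptions that are not part of the paper: the domain \((0,1)\), \(A = -d^2/dx^2\), a constant b, \(f(x,y) = y^3\), nodal quadrature for the integrals involving f and u, a manufactured exact solution \(y(x) = \sin (\pi x)\), and the hypothetical function name `solve_discrete_state`.

```python
import numpy as np

def solve_discrete_state(n, b=5.0, max_iter=30, tol=1e-10):
    """P1 discretization of -y'' + b*y' + y^3 = u on (0,1), y(0) = y(1) = 0."""
    h = 1.0 / (n + 1)
    x = np.linspace(0.0, 1.0, n + 2)[1:-1]          # interior nodes
    y_exact = np.sin(np.pi * x)                     # manufactured solution
    u = (np.pi**2 * np.sin(np.pi * x)               # -y''
         + b * np.pi * np.cos(np.pi * x)            # b*y'
         + y_exact**3)                              # f(x, y) = y^3
    stiff = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    conv = 0.5 * b * (np.eye(n, k=1) - np.eye(n, k=-1))
    A = stiff + conv                                # matrix of the bilinear form a
    y = np.zeros(n)
    for _ in range(max_iter):
        F = A @ y + h * (y**3 - u)                  # discrete residual
        if np.linalg.norm(F) < tol:
            break
        y = y - np.linalg.solve(A + h * np.diag(3 * y**2), F)
    residual = np.linalg.norm(A @ y + h * (y**3 - u))
    error = np.max(np.abs(y - y_exact))
    return y, residual, error

_, res_coarse, err_coarse = solve_discrete_state(31)
_, res_fine, err_fine = solve_discrete_state(63)
print(err_coarse, err_fine)
```

The matrix `A` is not symmetric because of the convection part, which mirrors the lack of symmetry of the bilinear form a. The sketch says nothing about uniqueness of discrete solutions; it only computes one solution starting from the zero initial guess.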

From [12, Corollary 2.6] we know that \({\mathcal {A}}^*:H^2(\varOmega ) \cap H_0^1(\varOmega ) \longrightarrow L^2(\varOmega )\) is an isomorphism. Then, we argue similarly to [32] to deduce the following result.

Lemma 3.1

Let \(a_0 \in L^2(\varOmega )\) be a non-negative function. There exists \(h_{{\mathcal {A}},a_0} > 0\) depending on \({\mathcal {A}}\) and \(\Vert a_0\Vert _{L^2(\varOmega )}\) such that the variational problem

$$\begin{aligned} \left\{ \begin{array}{l}\text {Find } y_h \in Y_h \text { such that }\\ \displaystyle a(y_h,z_h) + \int _\varOmega a_0(x) y_h(x) z_h(x)\, dx = \int _\varOmega u(x)z_h(x)\, dx \ \ \forall z_h \in Y_h\end{array}\right. \end{aligned}$$
(3.3)

has a unique solution for every \(h \le h_{{\mathcal {A}},a_0}\) and for every \(u \in L^2(\varOmega )\). Moreover, there exists a constant \(C_{{\mathcal {A}},a_0}\) depending on \({\mathcal {A}}\) and \(a_0\) such that

$$\begin{aligned} \Vert y_h\Vert _{H_0^1(\varOmega )} \le C_{{\mathcal {A}},a_0}\Vert {\mathcal {A}}^{-1}u\Vert _{H_0^1(\varOmega )}\quad \forall h \le h_{{\mathcal {A}},a_0}. \end{aligned}$$
(3.4)

Proof

Because of the linearity of the system (3.3), it is enough to show the existence of \(h_{{\mathcal {A}},a_0}\) such that the only solution of the homogeneous problem is \(y_h = 0\). Therefore, let us assume that \(y_h \in Y_h\) satisfies

$$\begin{aligned} a(y_h,z_h) + \int _\varOmega a_0(x)y_h(x)z_h(x)\, dx = 0 \quad \forall z_h \in Y_h. \end{aligned}$$
(3.5)

Then, from Gårding’s inequality (3.2) and the fact that \(a_0 \ge 0\) we get

$$\begin{aligned} 0 = a(y_h,y_h) + \int _\varOmega a_0(x)y^2_h(x)\, dx \ge \frac{\varLambda }{4}\Vert y_h\Vert ^2_{H_0^1(\varOmega )} - C_{\varLambda ,b}\Vert y_h\Vert ^2_{L^2(\varOmega )}, \end{aligned}$$

hence

$$\begin{aligned} \Vert y_h\Vert _{H_0^1(\varOmega )} \le 2\sqrt{\frac{C_{\varLambda ,b}}{\varLambda }}\Vert y_h\Vert _{L^2(\varOmega )}. \end{aligned}$$
(3.6)

Now, let us take \(\psi \in H^2(\varOmega ) \cap H_0^1(\varOmega )\) satisfying \({\mathcal {A}}^*\psi + a_0\psi = y_h\) in \(\varOmega \). We have

$$\begin{aligned} \Vert \psi \Vert _{H^2(\varOmega )} \le K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )\Vert y_h\Vert _{L^2(\varOmega )}; \end{aligned}$$

see Lemma 3.2 below. Let us denote by \({\hat{\psi }}_h \in Y_h\) the \(H_0^1(\varOmega )\) projection of \(\psi \) on \(Y_h\), i.e.:

$$\begin{aligned} \int _\varOmega \nabla {\hat{\psi }}_h \cdot \nabla z_h\, dx = \int _\varOmega \nabla \psi \cdot \nabla z_h\, dx \quad \forall z_h \in Y_h. \end{aligned}$$
(3.7)

Then, there exist constants \({\hat{c}}_2\) and \({\hat{c}}_\infty \) such that

$$\begin{aligned}&\Vert \psi - {\hat{\psi }}_h\Vert _{H_0^1(\varOmega )} \le {\hat{c}}_2 h\Vert \psi \Vert _{H^2(\varOmega )} \le {\hat{c}}_2 K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h\Vert y_h\Vert _{L^2(\varOmega )}, \end{aligned}$$
(3.8)
$$\begin{aligned}&\Vert \psi - {\hat{\psi }}_h\Vert _{L^\infty (\varOmega )} \le {\hat{c}}_\infty h^{2 - \frac{n}{2}}\Vert \psi \Vert _{H^2(\varOmega )} \le {\hat{c}}_\infty K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h^{2 - \frac{n}{2}}\Vert y_h\Vert _{L^2(\varOmega )}; \end{aligned}$$
(3.9)

see Theorems 18.1 and 19.3 of [18].

Now, from (3.5) we get

$$\begin{aligned} \Vert y_h\Vert ^2_{L^2(\varOmega )}&= \int _\varOmega ({\mathcal {A}}^*\psi + a_0\psi )y_h\, dx = a(y_h,\psi ) + \int _\varOmega a_0\psi y_h \, dx\\&= a(y_h,\psi - {\hat{\psi }}_h) + \int _\varOmega a_0(\psi - {\hat{\psi }}_h) y_h \, dx\\&\le \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Vert y_h\Vert _{H_0^1(\varOmega )}\Vert \psi - {\hat{\psi }}_h\Vert _{H_0^1(\varOmega )}\\&\qquad + \Vert a_0\Vert _{L^2(\varOmega )}\Vert y_h\Vert _{L^2(\varOmega )}\Vert \psi - {\hat{\psi }}_h\Vert _{L^\infty (\varOmega )}. \end{aligned}$$

From the estimates (3.8) and (3.9) we get

$$\begin{aligned} \Vert y_h\Vert ^2_{L^2(\varOmega )}&\le \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} \Vert y_h\Vert _{H_0^1(\varOmega )}{\hat{c}}_2 K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h\Vert y_h\Vert _{L^2(\varOmega )}\\&\quad + \Vert a_0\Vert _{L^2(\varOmega )}{\hat{c}}_\infty K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h^{2 - \frac{n}{2}}\Vert y_h\Vert _{L^2(\varOmega )}^2. \end{aligned}$$

Taking \(h_1 > 0\) such that

$$\begin{aligned} \Vert a_0\Vert _{L^2(\varOmega )}{\hat{c}}_\infty K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h_1^{2 - \frac{n}{2}} = \frac{1}{2}, \end{aligned}$$
(3.10)

we deduce for \(h \le h_1\)

$$\begin{aligned} \Vert y_h\Vert _{L^2(\varOmega )} \le 2\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} {\hat{c}}_2 K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h\Vert y_h\Vert _{H_0^1(\varOmega )}. \end{aligned}$$

Now, we select \(h_2\) as follows

$$\begin{aligned} 4\sqrt{\frac{C_{\varLambda ,b}}{\varLambda }} \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} {\hat{c}}_2 K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h_2 = 1. \end{aligned}$$
(3.11)

Finally, we infer from (3.6) that \(y_h = 0\) if \(h < \min \{h_1,h_2\}\).
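The last step can be written out as follows; this is only a restatement of the preceding estimates. For \(h \le h_1\), combining (3.6) with the previous inequality and using the definition (3.11) of \(h_2\), we have

$$\begin{aligned} \Vert y_h\Vert _{H_0^1(\varOmega )}&\le 2\sqrt{\frac{C_{\varLambda ,b}}{\varLambda }}\Vert y_h\Vert _{L^2(\varOmega )}\\&\le 4\sqrt{\frac{C_{\varLambda ,b}}{\varLambda }} \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} {\hat{c}}_2 K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h\Vert y_h\Vert _{H_0^1(\varOmega )} = \frac{h}{h_2}\Vert y_h\Vert _{H_0^1(\varOmega )}. \end{aligned}$$

Hence \((1 - h/h_2)\Vert y_h\Vert _{H_0^1(\varOmega )} \le 0\), which forces \(y_h = 0\) whenever \(h < h_2\).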

Let us conclude the proof by establishing the estimate (3.4). To this end we set \(h_{{\mathcal {A}},a_0} = \min \{h_1,h_2\}/2\). Let \(y \in Y\) be the solution of \({\mathcal {A}}y = u\) in \(\varOmega \) and let \(y_h \in Y_h\) be the solution of (3.3) for \(h \le h_{{\mathcal {A}},a_0}\). Then, using again \(\psi \) and \({\hat{\psi }}_h\), and arguing similarly as we did above, we get

$$\begin{aligned} \Vert y_h\Vert ^2_{L^2(\varOmega )}&= \int _\varOmega ({\mathcal {A}}^*\psi + a_0\psi )y_h\, dx = a(y_h,\psi ) + \int _\varOmega a_0\psi y_h \, dx\\&= a(y_h,\psi - {\hat{\psi }}_h) + \int _\varOmega a_0(\psi - {\hat{\psi }}_h) y_h \, dx + \int _\varOmega u{\hat{\psi }}_h\, dx\\&= a(y_h,\psi - {\hat{\psi }}_h) + \int _\varOmega a_0(\psi - {\hat{\psi }}_h) y_h \, dx + a(y,{\hat{\psi }}_h)\\&\le \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} \Vert y_h\Vert _{H_0^1(\varOmega )}\Vert \psi - {\hat{\psi }}_h\Vert _{H_0^1(\varOmega )} \\&\quad + \Vert a_0\Vert _{L^2(\varOmega )}\Vert y_h\Vert _{L^2(\varOmega )}\Vert \psi - {\hat{\psi }}_h\Vert _{L^\infty (\varOmega )}\\&\quad + \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} \Vert y\Vert _{H_0^1(\varOmega )}\Vert {\hat{\psi }}_h\Vert _{H_0^1(\varOmega )}. \end{aligned}$$

Then, using that \(h \le h_1\), and taking into account that \(\Vert {\hat{\psi }}_h\Vert _{H_0^1(\varOmega )} \le \Vert \psi \Vert _{H_0^1(\varOmega )} \le \Vert \psi \Vert _{H^2(\varOmega )}\le K_{{\mathcal {A}}^*} \Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big ) \Vert y_h\Vert _{L^2(\varOmega )}\), see Lemma 3.2 below, and arguing as above we deduce

$$\begin{aligned} \Vert y_h\Vert _{L^2(\varOmega )}&\le 2\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} {\hat{c}}_2 K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )h\Vert y_h\Vert _{H_0^1(\varOmega )}\\&\quad + 2\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} K_{{\mathcal {A}}^*} \Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big ) \Vert y\Vert _{H_0^1(\varOmega )}. \end{aligned}$$

Moreover, since \(h \le h_2/2\) we obtain

$$\begin{aligned} \Vert y_h\Vert _{L^2(\varOmega )}&\le \frac{1}{4}\sqrt{\frac{\varLambda }{C_{\varLambda ,b}}} \Vert y_h\Vert _{H_0^1(\varOmega )} \\&\quad + 2\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} K_{{\mathcal {A}}^*} \Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big ) \Vert y\Vert _{H_0^1(\varOmega )}, \end{aligned}$$

and

$$\begin{aligned} \Vert y_h\Vert ^2_{L^2(\varOmega )}&\le \frac{1}{8}\frac{\varLambda }{C_{\varLambda ,b}}\Vert y_h\Vert ^2_{H_0^1(\varOmega )} \\&\quad + 8\Vert {\mathcal {A}}\Vert ^2_{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} K^2_{{\mathcal {A}}^*} \Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )^2 \Vert y\Vert ^2_{H_0^1(\varOmega )}. \end{aligned}$$

Now, from (3.2) we obtain

$$\begin{aligned}&\frac{\varLambda }{4}\Vert y_h\Vert ^2_{H_0^1(\varOmega )} - C_{\varLambda ,b}\Vert y_h\Vert ^2_{L^2(\varOmega )} \le a(y_h,y_h) + \int _\varOmega a_0(x)y^2_h(x)\, dx\\&\quad = \int _\varOmega u y_h \, dx = a(y,y_h) \le \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Vert y\Vert _{H_0^1(\varOmega )}\Vert y_h\Vert _{H_0^1(\varOmega )}. \end{aligned}$$

Then, from the last inequalities we infer

$$\begin{aligned}&\frac{\varLambda }{8}\Vert y_h\Vert ^2_{H_0^1(\varOmega )} \le \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Vert y\Vert _{H_0^1(\varOmega )}\Vert y_h\Vert _{H_0^1(\varOmega )}\\&\quad + C_{\varLambda ,b}8\Vert {\mathcal {A}}\Vert ^2_{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} K^2_{{\mathcal {A}}^*} \Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )^2 \Vert y\Vert ^2_{H_0^1(\varOmega )}. \end{aligned}$$

Young’s inequality implies

$$\begin{aligned} \frac{\varLambda }{16}\Vert y_h\Vert ^2_{H_0^1(\varOmega )} \le \Vert {\mathcal {A}}\Vert ^2_{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Big (\frac{4}{\varLambda } + 8C_{\varLambda ,b} K^2_{{\mathcal {A}}^*} \Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )^2 \Big ) \Vert y\Vert ^2_{H_0^1(\varOmega )}. \end{aligned}$$
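In more detail, Young’s inequality \(ab \le \frac{\varepsilon }{2}a^2 + \frac{1}{2\varepsilon }b^2\) is applied here with \(\varepsilon = \varLambda /8\), \(a = \Vert y_h\Vert _{H_0^1(\varOmega )}\) and \(b = \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Vert y\Vert _{H_0^1(\varOmega )}\), which gives

$$\begin{aligned} \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Vert y\Vert _{H_0^1(\varOmega )}\Vert y_h\Vert _{H_0^1(\varOmega )} \le \frac{\varLambda }{16}\Vert y_h\Vert ^2_{H_0^1(\varOmega )} + \frac{4}{\varLambda }\Vert {\mathcal {A}}\Vert ^2_{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Vert y\Vert ^2_{H_0^1(\varOmega )}, \end{aligned}$$

so that the term \(\frac{\varLambda }{16}\Vert y_h\Vert ^2_{H_0^1(\varOmega )}\) can be absorbed into the left-hand side \(\frac{\varLambda }{8}\Vert y_h\Vert ^2_{H_0^1(\varOmega )}\) of the previous inequality.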

This implies (3.4). Indeed, it is enough to observe that \(y = {\mathcal {A}}^{-1}u\).\(\square \)

Lemma 3.2

Let \(a_0 \in L^2(\varOmega )\) be a non-negative function. For every \(u\in L^2(\varOmega )\) there exists a unique solution \(\psi \in H^2(\varOmega )\cap H^1_0(\varOmega )\) of

$$\begin{aligned} \left\{ \begin{array}{l} {\mathcal {A}}^*\psi + a_0\psi =u \text{ in } \varOmega , \\ \psi =0 \text{ on } \varGamma .\end{array}\right. \end{aligned}$$

Moreover, there exists a constant \(K_{{\mathcal {A}}^*}\) only depending on \({\mathcal {A}}^*\) such that

$$\begin{aligned} \Vert \psi \Vert _{H^2(\varOmega )}\le K_{{\mathcal {A}}^*}\Big (\Vert a_0\Vert _{L^2(\varOmega )} + 1\Big )\Vert u\Vert _{L^2(\varOmega )}\quad \forall u \in L^2(\varOmega ). \end{aligned}$$

Proof

The existence and uniqueness of a solution \(\psi \in H^2(\varOmega ) \cap H_0^1(\varOmega )\) follows from [12, Corollary 2.6]. Let us prove the estimate. To this end we set \(u=u^+-u^-\), where \(u^+ =\max \{u,0\}\) and \(u^-=-\min \{u,0\}\). Now, we take \(\psi _1, \psi _2, \phi _1\) and \(\phi _2\) in \(H^2(\varOmega ) \cap H_0^1(\varOmega )\) satisfying

$$\begin{aligned} {\mathcal {A}}^*\psi _1 + a_0\psi _1 = u^+\&\text { and } \ {\mathcal {A}}^*\psi _2 + a_0\psi _2 = u^-,\\ {\mathcal {A}}^*\phi _1 = u^+\&\text { and } \ {\mathcal {A}}^*\phi _2 = u^-. \end{aligned}$$

Then, the identity \(\psi = \psi _1 - \psi _2\) holds. From the comparison principle proven in [12, Lemma 2.8] we know that all the above functions are non-negative. Using the same Lemma and the fact that \({\mathcal {A}}^*(\psi _1 - \phi _1) = -a_0\psi _1 \le 0\) and \({\mathcal {A}}^*(\psi _2 - \phi _2) = -a_0\psi _2 \le 0\), we infer that \(0 \le \psi _1 \le \phi _1\) and \(0 \le \psi _2 \le \phi _2\). Hence, we have

$$\begin{aligned} \Vert \psi \Vert _{L^\infty (\varOmega )} \le \Vert \psi _1\Vert _{L^\infty (\varOmega )} + \Vert \psi _2\Vert _{L^\infty (\varOmega )} \le \Vert \phi _1\Vert _{L^\infty (\varOmega )} + \Vert \phi _2\Vert _{L^\infty (\varOmega )}. \end{aligned}$$

Now, from Theorem 2.1 we obtain

$$\begin{aligned} \Vert \phi _1\Vert _{L^\infty (\varOmega )} \le C\Vert u^+\Vert _{L^2(\varOmega )} \ \text { and }\ \Vert \phi _2\Vert _{L^\infty (\varOmega )} \le C\Vert u^-\Vert _{L^2(\varOmega )}, \end{aligned}$$

where C depends on \({\mathcal {A}}^*\). Combining the above inequalities, it follows that

$$\begin{aligned} \Vert \psi \Vert _{L^\infty (\varOmega )} \le C\Big (\Vert u^+\Vert _{L^2(\varOmega )} + \Vert u^-\Vert _{L^2(\varOmega )}\Big ) \le 2C\Vert u\Vert _{L^2(\varOmega )}. \end{aligned}$$

Using the above estimate and applying again Theorem 2.1 to the equation \({\mathcal {A}}^*\psi = u - a_0\psi \) we infer

$$\begin{aligned}&\Vert \psi \Vert _{H^2(\varOmega )} \le C\Big (\Vert u\Vert _{L^2(\varOmega )}+ \Vert a_0\Vert _{L^2(\varOmega )}\Vert \psi \Vert _{L^\infty (\varOmega )}\Big )\\&\quad \le C\max \{1,2C\}\Big (1 + \Vert a_0\Vert _{L^2(\varOmega )}\Big )\Vert u\Vert _{L^2(\varOmega )}, \end{aligned}$$

which proves the lemma.\(\square \)
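The splitting \(u = u^+ - u^-\) and the elementary bound \(\Vert u^+\Vert _{L^2(\varOmega )} + \Vert u^-\Vert _{L^2(\varOmega )} \le 2\Vert u\Vert _{L^2(\varOmega )}\) used in the proof can be checked numerically. The following is only an illustrative sketch under assumptions that are not part of the paper: a uniform grid on (0, 1), a rectangle-rule norm, a sample control \(u(x) = \sin (2\pi x) - 0.3\), and helper names (`split_parts`, `l2_norm`) chosen here for convenience.

```python
import numpy as np

def split_parts(u):
    """Return (u_plus, u_minus) with u = u_plus - u_minus and both parts >= 0."""
    u_plus = np.maximum(u, 0.0)
    u_minus = -np.minimum(u, 0.0)
    return u_plus, u_minus

def l2_norm(v, h):
    """Discrete L^2(0,1) norm (rectangle rule on a uniform grid of spacing h)."""
    return np.sqrt(h * np.sum(v**2))

# Hypothetical sample control on a uniform grid (an assumption for illustration).
n = 1000
h = 1.0 / n
x = (np.arange(n) + 0.5) * h
u = np.sin(2.0 * np.pi * x) - 0.3

u_plus, u_minus = split_parts(u)
lhs = l2_norm(u_plus, h) + l2_norm(u_minus, h)   # ||u^+|| + ||u^-||
rhs = 2.0 * l2_norm(u, h)                        # 2 ||u||
```

Since \(u^+\) and \(u^-\) have disjoint supports, the sum of their squared norms equals \(\Vert u\Vert ^2_{L^2(\varOmega )}\), so `lhs` never exceeds `rhs`.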

Remark 3.3

Notice that in the proof, we have also established the existence of a constant \(K^\infty _{{\mathcal {A}}^*}\), which does not depend on \(a_0\), such that

$$\begin{aligned} \Vert \psi \Vert _{L^\infty (\varOmega )}\le K^\infty _{{\mathcal {A}}^*}\Vert u\Vert _{L^2(\varOmega )}. \end{aligned}$$

Since the non-linear discrete Eq. (3.1) is neither monotone nor coercive, neither the existence nor the uniqueness of a solution is obvious. We will establish the existence for h small enough. As a first step, we make the following assumption.

Assumption 2’

\(f:\varOmega \times {\mathbb {R}} \longrightarrow {\mathbb {R}}\) is a Carathéodory function, monotone non-decreasing with respect to the second variable, and satisfying

$$\begin{aligned} \left\{ \begin{array}{l} \exists \phi _f \in L^2(\varOmega ) \text { such that } |f(x,y)| \le \phi _f(x) \ \forall y\in {\mathbb {R}}\text { and }\\ |f(x,y_2) - f(x,y_1)| \le \phi _f(x)|y_2 - y_1| \text{ for } \text{ a.e. } x \in \varOmega \text{ and } \forall y_1,y_2 \in {\mathbb {R}}.\end{array}\right. \nonumber \\ \end{aligned}$$
(3.12)

Remark 3.4

Let us observe that under Assumption 2, given \(M > 0\) and setting \(f_M(x,y) = f(x,{\text {Proj}}_{[-M,+M]}(y))\), we have that \(f_M\) satisfies Assumption 2’. Indeed, we have

$$\begin{aligned} \left\{ \begin{array}{l}\displaystyle |f_M(x,y)| \le |f(x,0)| + \phi _M(x)|{\text {Proj}}_{[-M,+M]}(y)| \le |f(x,0)| + \phi _M(x)M,\\ |f_M(x,y_2) - f_M(x,y_1)| \le \phi _M(x)|y_2 - y_1|.\end{array}\right. \nonumber \\ \end{aligned}$$
(3.13)

Hence, (3.12) holds with

$$\begin{aligned} \phi _f(x) = |f(x,0)| + \phi _M(x)\max (M,1). \end{aligned}$$

This property will be used later to remove the Assumption 2’.
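The truncation in Remark 3.4 can be illustrated with a concrete nonlinearity. The sketch below is not taken from the paper: it assumes the hypothetical choice \(f(x,y) = y^3\), for which \(f(x,0) = 0\) and \(\phi _M = 3M^2\) is a Lipschitz constant on \([-M,+M]\), and it checks the two bounds of (3.12) on random samples using the function \(\phi _f\) given in the remark.

```python
import numpy as np

def f(x, y):
    """Hypothetical nonlinearity f(x, y) = y**3: monotone, locally Lipschitz, unbounded."""
    return y**3

def f_M(x, y, M):
    """Truncated nonlinearity f_M(x, y) = f(x, Proj_[-M, M](y))."""
    return f(x, np.clip(y, -M, M))

M = 2.0
phi_M = 3.0 * M**2                    # Lipschitz constant of y**3 on [-M, M]
phi_f = 0.0 + phi_M * max(M, 1.0)     # |f(x,0)| + phi_M * max(M, 1), as in the remark

# Random sample points well outside [-M, M] (an assumption for illustration).
rng = np.random.default_rng(0)
y1 = rng.uniform(-10.0, 10.0, size=10_000)
y2 = rng.uniform(-10.0, 10.0, size=10_000)

# First line of (3.12): uniform bound on f_M.
bound_ok = bool(np.all(np.abs(f_M(0.0, y1, M)) <= phi_f))
# Second line of (3.12): global Lipschitz bound on f_M.
lipschitz_ok = bool(np.all(
    np.abs(f_M(0.0, y2, M) - f_M(0.0, y1, M)) <= phi_f * np.abs(y2 - y1) + 1e-12
))
```

The untruncated \(f\) satisfies neither bound globally, which is why the projection onto \([-M,+M]\) is needed.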

Condition (3.12) is more restrictive than (2.3); in particular, Assumption 2’ implies Assumption 2. Later we will remove Assumption 2’. Now, we address the existence of a solution of (3.1). In the next theorem we apply Lemma 3.1 with \(a_0 = 0\), and we set \(h_{\mathcal {A}} = h_{{\mathcal {A}},0}\) and \(C_{\mathcal {A}} = C_{{\mathcal {A}},0}\).

Theorem 3.5

If Assumptions 1 and 2’ hold, then there exists \(0 < h_{{\mathcal {A}},f} \le h_{\mathcal {A}}\) independent of u such that (3.1) has a unique solution for every \(h < h_{{\mathcal {A}},f}\).

Proof

Let us take \(h \in (0,h_{\mathcal {A}}]\). We define the function \(F_h:Y_h \longrightarrow Y_h\), where \(F_h(w_h) = y_{h}(w_h)\) satisfies

$$\begin{aligned} a(y_h(w_h),z_h) = \int _\varOmega [u(x) - f(x,w_h(x))]z_h(x)\, dx \ \ \forall z_h \in Y_h \end{aligned}$$

From Lemma 3.1 we know that \(y_h(w_h)\) is well defined and

$$\begin{aligned} \Vert y_h(w_h)\Vert _{H_0^1(\varOmega )} \le C_{\mathcal {A}}\Vert {\mathcal {A}}^{-1}(u - f(\cdot ,w_h))\Vert _{H_0^1(\varOmega )}. \end{aligned}$$

From this inequality and (3.12) we deduce that \(\Vert y_h(w_h)\Vert _{H_0^1(\varOmega )} \le r\) \(\forall w_h \in Y_h\) with

$$\begin{aligned} r&= C_{\mathcal {A}}\sup _{w_h \in Y_h}\Vert {\mathcal {A}}^{-1}(u - f(\cdot ,w_h))\Vert _{H_0^1(\varOmega )}\\&\le C_{\mathcal {A}}\Vert {\mathcal {A}}^{-1}\Vert _{{\mathcal {L}}(H^{-1}(\varOmega ),H_0^1(\varOmega ))}C_\varOmega (\Vert u\Vert _{L^2(\varOmega )} + \Vert \phi _f\Vert _{L^2(\varOmega )}). \end{aligned}$$

From this estimate, the continuity of \(F_h\) and Brouwer’s fixed point theorem we obtain the existence of at least one fixed point \(y_h\). Obviously, \(y_h\) is a solution of (3.1). It remains to prove the uniqueness of a solution for h small enough. Let us assume that \(y_{h,1}, y_{h,2} \in Y_h\) are two solutions of (3.1). Then, subtracting the equations satisfied by these solutions and testing with \(y_{h,2} - y_{h,1}\), we get

$$\begin{aligned} a(y_{h,2} - y_{h,1},y_{h,2} - y_{h,1}) + \int _\varOmega [f(x,y_{h,2}) - f(x,y_{h,1})](y_{h,2} - y_{h,1})\, dx = 0. \end{aligned}$$

Using (3.2) along with the monotonicity of f we get

$$\begin{aligned} \frac{\varLambda }{4}\Vert y_{h,2} - y_{h,1}\Vert ^2_{H_0^1(\varOmega )} - C_{\varLambda ,b}\Vert y_{h,2} - y_{h,1}\Vert ^2_{L^2(\varOmega )}\le 0. \end{aligned}$$
(3.14)

We define

$$\begin{aligned} a_0(x) = \left\{ \begin{array}{cl}\displaystyle \frac{f(x,y_{h,2}(x)) - f(x,y_{h,1}(x))}{y_{h,2}(x) - y_{h,1}(x)} &{}\qquad \text {if }\ y_{h,2}(x) \ne y_{h,1}(x),\\ 0 &{}\qquad \text {otherwise.}\end{array}\right. \end{aligned}$$

From (3.12) we get that \(a_0 \in L^2(\varOmega )\) and \(\Vert a_0\Vert _{L^2(\varOmega )} \le \Vert \phi _f\Vert _{L^2(\varOmega )}\). Now we take \(\psi \in H^2(\varOmega ) \cap H_0^1(\varOmega )\) such that \({\mathcal {A}}^*\psi + a_0\psi = y_{h,2} - y_{h,1}\) in \(\varOmega \). According to Lemma 3.2 we have that \(\Vert \psi \Vert _{H^2(\varOmega )} \le \) \(K_{{\mathcal {A}}^*}\big (1+\Vert \phi _f\Vert _{L^2(\varOmega )}\big )\Vert y_{h,2} - y_{h,1}\Vert _{L^2(\varOmega )}\). We denote by \({\hat{\psi }}_h \in Y_h\) the \(H_0^1(\varOmega )\)-projection of \(\psi \) in \(Y_h\); see (3.7). Then, from (3.14), the definition of \(a_0\), the choice of \(\psi \) and \({\hat{\psi }}_h \in Y_h\), and using (1.4), (1.5), (3.8) and that \(y_{h,1}\) and \(y_{h,2}\) are solutions of (3.1) we infer

$$\begin{aligned}&\frac{\varLambda }{4C_{\varLambda ,b}}\Vert y_{h,2} - y_{h,1}\Vert ^2_{H_0^1(\varOmega )} \le \Vert y_{h,2} - y_{h,1}\Vert ^2_{L^2(\varOmega )} = \int _\varOmega [{\mathcal {A}}^*\psi + a_0\psi ](y_{h,2} - y_{h,1})\, dx\\&\quad = a(y_{h,2} - y_{h,1},\psi ) + \int _\varOmega [f(x,y_{h,2}) - f(x,y_{h,1})]\psi \, dx\\&\quad = a(y_{h,2} - y_{h,1},\psi - {\hat{\psi }}_h) + \int _\varOmega [f(x,y_{h,2}) - f(x,y_{h,1})](\psi - {\hat{\psi }}_h)\, dx\\&\quad \le \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}\Vert y_{h,2} - y_{h,1}\Vert _{H_0^1(\varOmega )}\Vert \psi - {\hat{\psi }}_h\Vert _{H_0^1(\varOmega )}\\&\qquad + \Vert \phi _f\Vert _{L^2(\varOmega )}\Vert y_{h,2} - y_{h,1}\Vert _{L^4(\varOmega )}\Vert \psi - {\hat{\psi }}_h\Vert _{L^4(\varOmega )}\\&\quad \le \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}+\Vert \phi _f\Vert _{L^2(\varOmega )}K_\varOmega ^2\Big )\Vert \psi - {\hat{\psi }}_h\Vert _{H_0^1(\varOmega )}\Vert y_{h,2} - y_{h,1}\Vert _{H_0^1(\varOmega )}\\&\quad \le \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}+\Vert \phi _f\Vert _{L^2(\varOmega )}K^2_\varOmega \Big ) {\hat{c}}_2 h\Vert \psi \Vert _{H^2(\varOmega )}\Vert y_{h,2} - y_{h,1}\Vert _{H_0^1(\varOmega )}\\&\quad \le \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}+\Vert \phi _f\Vert _{L^2(\varOmega )}K^2_\varOmega \Big ) K_{{\mathcal {A}}^*}\big (1+\Vert \phi _f\Vert _{L^2(\varOmega )}\big ) {\hat{c}}_2\ h\\&\qquad \Vert y_{h,2} - y_{h,1}\Vert _{L^2(\varOmega )}\Vert y_{h,2} - y_{h,1}\Vert _{H_0^1(\varOmega )}\\&\quad \le \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}+\Vert \phi _f\Vert _{L^2(\varOmega )}K^2_\varOmega \Big )C_\varOmega K_{{\mathcal {A}}^*}\big (1+\Vert \phi _f\Vert _{L^2(\varOmega )}\big ) {\hat{c}}_2\ h\\&\qquad \Vert y_{h,2} - y_{h,1}\Vert _{H_0^1(\varOmega )}^2. 
\end{aligned}$$

From this inequality we obtain that \(y_{h,2} - y_{h,1} = 0\) if

$$\begin{aligned} h < h_{{\mathcal {A}},f} = \min \left\{ h_{\mathcal {A}},\frac{\varLambda }{4C_{\varLambda ,b}{\tilde{C}}}\right\} , \end{aligned}$$

where

$$\begin{aligned} {\tilde{C}} = \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))}+\Vert \phi _f\Vert _{L^2(\varOmega )}K^2_\varOmega \Big )C_\varOmega K_{{\mathcal {A}}^*}\big (1+\Vert \phi _f\Vert _{L^2(\varOmega )}\big ) {\hat{c}}_2. \end{aligned}$$

\(\square \)
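The map \(F_h\) of the above proof can be made concrete in one space dimension. The sketch below is an illustration under several assumptions that are not in the paper: \(\varOmega = (0,1)\), \(a(y,z) = \int y'z'\,dx + b\int y'z\,dx\) with a constant b, P1 elements on a uniform mesh, nodal quadrature for the load integral, and the bounded nonlinearity \(f(x,y) = \arctan (y)\). Brouwer’s theorem is not constructive; the simple iteration \(w_h \mapsto F_h(w_h)\) used here is a swapped-in technique that happens to converge for this mild example and is not guaranteed to converge in general.

```python
import numpy as np

def assemble(n, b):
    """P1 matrix on a uniform mesh of (0,1) for a(y,z) = int y'z' dx + b int y'z dx
    (interior nodes only; b is a hypothetical constant convection coefficient)."""
    h = 1.0 / n
    m = n - 1                                    # number of interior nodes
    K = (np.diag(2.0 / h * np.ones(m))
         + np.diag(-1.0 / h * np.ones(m - 1), 1)
         + np.diag(-1.0 / h * np.ones(m - 1), -1))
    # Galerkin convection matrix C[i, j] = b * int phi_j' phi_i dx (skew-symmetric).
    C = (np.diag(0.5 * b * np.ones(m - 1), 1)
         - np.diag(0.5 * b * np.ones(m - 1), -1))
    return K + C, h

def fixed_point_map(A, h, u, w, f):
    """One application of F_h: solve a(y_h, z_h) = int (u - f(w_h)) z_h dx.
    The load integral is approximated by nodal quadrature (an assumption)."""
    return np.linalg.solve(A, h * (u - f(w)))

f = np.arctan                                    # bounded, Lipschitz, non-decreasing
n, b = 64, 1.0
A, h = assemble(n, b)
x = np.linspace(0.0, 1.0, n + 1)[1:-1]           # interior nodes
u = 10.0 * np.sin(np.pi * x)                     # hypothetical control

y = np.zeros(n - 1)
for _ in range(50):
    y_new = fixed_point_map(A, h, u, y, f)
    increment = np.max(np.abs(y_new - y))        # size of the last update
    y = y_new

# Residual of the discrete nonlinear equation at the computed fixed point.
residual = np.max(np.abs(A @ y + h * f(y) - h * u))
```

A vanishing `residual` indicates that the iterate is a fixed point of \(F_h\), that is, a solution of the discrete equation for this particular model.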

Next we prove some error estimates for \(y - y_h\).

Theorem 3.6

If Assumptions 1 and 2’ hold, then there exist \(h_0 \in (0,h_{{\mathcal {A}},f}]\) and constants \(M_{{\mathcal {A}},f}\) and \(M_{\infty ,{\mathcal {A}},f}\) independent of u such that for every \(h < h_0\)

$$\begin{aligned}&\Vert y - y_h\Vert _{L^2(\varOmega )} + h\Vert y - y_h\Vert _{H_0^1(\varOmega )} \le M_{{\mathcal {A}},f}\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h^2, \end{aligned}$$
(3.15)
$$\begin{aligned}&\Vert y - y_h\Vert _{L^\infty (\varOmega )} \le M_{\infty ,{\mathcal {A}},f}\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h^{2 - \frac{n}{2}}, \end{aligned}$$
(3.16)

where y and \(y_h\) denote the solutions of (1.1) and (3.1).

Proof

The proof is divided into three steps.

Step 1. \(\Vert y - y_h\Vert _{L^2(\varOmega )} \le K_1h\Vert y - y_h\Vert _{H_0^1(\varOmega )}\).

We proceed similarly as we did in the proof of the above theorem. We define

$$\begin{aligned} a_0(x) = \left\{ \begin{array}{cl}\displaystyle \frac{f(x,y(x)) - f(x,y_h(x))}{y(x) - y_h(x)} &{}\qquad \text {if } y(x) \ne y_h(x),\\ 0 &{}\qquad \text {otherwise.}\end{array}\right. \end{aligned}$$

Now we take \(\psi \in H^2(\varOmega ) \cap H_0^1(\varOmega )\) such that \({\mathcal {A}}^*\psi + a_0\psi = y - y_h\) in \(\varOmega \), and \({\hat{\psi }}_h \in Y_h\) as the projection of \(\psi \) in \(Y_h\). Then, subtracting the equations satisfied by y and \(y_h\), using the estimate of Lemma 3.2 and taking \({\hat{c}}_2\) as in (3.8), we obtain

$$\begin{aligned}&\Vert y - y_h\Vert _{L^2(\varOmega )}^2 = \int _\varOmega ({\mathcal {A}}^*\psi + a_0\psi )(y - y_h)\, dx\\&\quad = a(y-y_h,\psi ) + \int _\varOmega [f(x,y) - f(x,y_h)]\psi \, dx\\&\quad = a(y-y_h,\psi - {\hat{\psi }}_h) + \int _\varOmega [f(x,y) - f(x,y_h)](\psi - {\hat{\psi }}_h) \, dx\\&\quad \le \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} + \Vert \phi _f\Vert _{L^2(\varOmega )}K_\varOmega ^2\Big )\Vert y - y_h\Vert _{H_0^1(\varOmega )}\Vert \psi - {\hat{\psi }}_h\Vert _{H_0^1(\varOmega )}\\&\quad \le \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} + \Vert \phi _f\Vert _{L^2(\varOmega )}K_\varOmega ^2\Big )\Vert y - y_h\Vert _{H_0^1(\varOmega )}{\hat{c}}_2h\Vert \psi \Vert _{H^2(\varOmega )}\\&\quad \le K_1\Vert y - y_h\Vert _{H_0^1(\varOmega )}\Vert y - y_h\Vert _{L^2(\varOmega )}h, \end{aligned}$$

which proves the desired estimate with a constant

$$\begin{aligned} K_1 = \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} + \Vert \phi _f\Vert _{L^2(\varOmega )}K_\varOmega ^2\Big ){\hat{c}}_2 K_{{\mathcal {A}}^*}\Big (1+\Vert \phi _f\Vert _{L^2(\varOmega )}\Big ) \end{aligned}$$

independent of u.

Step 2. \(\Vert y - y_h\Vert _{H_0^1(\varOmega )} \le K_2\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h\).

Let us denote by \(\hat{y}_h \in Y_h\) the projection of y in \(Y_h\), so that

$$\begin{aligned} \int _\varOmega \nabla {\hat{y}}_h\nabla z_h\, dx = \int _\varOmega \nabla y\nabla z_h\, dx \quad \forall z_h \in Y_h. \end{aligned}$$

Hence, we have with (2.4)

$$\begin{aligned} \Vert y - \hat{y}_h\Vert _{H_0^1(\varOmega )} \le {\hat{c}}_2 C_{A,f}\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h. \end{aligned}$$
(3.17)

From the estimate proved in Step 1, (3.2) along with the monotonicity of f, and (3.17) we get for

$$\begin{aligned} h < h_0 = \min \left\{ h_{{\mathcal {A}},f}, \sqrt{\frac{\varLambda }{20 K_1^2C_{\varLambda ,b}}}\right\} \end{aligned}$$

the following estimate

$$\begin{aligned}&\frac{\varLambda }{5}\Vert y - y_h\Vert ^2_{H_0^1(\varOmega )} \le a(y - y_h,y - y_h) + \int _\varOmega [f(x,y) - f(x,y_h)](y - y_h)\, dx\\&\quad = a(y - y_h,y - {\hat{y}}_h) + \int _\varOmega [f(x,y) - f(x,y_h)](y - {\hat{y}}_h)\, dx\\&\quad \le \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} + \Vert \phi _f\Vert _{L^2(\varOmega )}K_\varOmega ^2\Big )\Vert y - y_h\Vert _{H_0^1(\varOmega )}\Vert y - {\hat{y}}_h\Vert _{H_0^1(\varOmega )}\\&\quad \le \Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} + \Vert \phi _f\Vert _{L^2(\varOmega )}K_\varOmega ^2\Big )\Vert y - y_h\Vert _{H_0^1(\varOmega )}{\hat{c}}_2C_{A,f}\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h, \end{aligned}$$

which proves Step 2 with

$$\begin{aligned} K_2 = \frac{5}{\varLambda }\Big (\Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H_0^1(\varOmega ),H^{-1}(\varOmega ))} + \Vert \phi _f\Vert _{L^2(\varOmega )}K_\varOmega ^2\Big ){{\hat{c}}_2} C_{A,f}, \end{aligned}$$

and (3.15) follows from Steps 1 and 2 for \(M_{{\mathcal {A}},f} = K_2(1 + K_1)\).

Step 3. Proof of (3.16). The proof of (3.16) follows from (3.15) and the inverse inequality

$$\begin{aligned} \Vert z_h\Vert _{L^\infty (\varOmega )}\le c_{\infty ,2} \Vert z_h\Vert _{L^2(\varOmega )}h^{-\frac{n}{2}}\ \forall z_h\in Y_h, \end{aligned}$$

where \(c_{\infty ,2}\) is independent of h; see [18, Theorem 17.2]. Though the proof is quite classical, we include it here for the convenience of the reader. Taking \({\tilde{y}}_h \in Y_h\) as the function interpolating y at the nodes of the triangulation, we know, see [18, Theorem 16.2 and Theorem 17.1], that

$$\begin{aligned} \Vert y - {\tilde{y}}_h\Vert _{L^\infty (\varOmega )} \le {\tilde{c}}_\infty h^{2 - \frac{n}{2}}\Vert y\Vert _{H^2(\varOmega )} \ \text { and }\ \Vert y - {\tilde{y}}_h\Vert _{L^2(\varOmega )} \le {\tilde{c}}_2 h^2\Vert y\Vert _{H^2(\varOmega )}. \end{aligned}$$

Hence, we have with (2.4)

$$\begin{aligned}&\Vert y - y_h\Vert _{L^\infty (\varOmega )} \le \Vert y - {\tilde{y}}_h\Vert _{L^\infty (\varOmega )} + \Vert {\tilde{y}}_h - y_h\Vert _{L^\infty (\varOmega )}\\&\quad \le \Vert y - {\tilde{y}}_h\Vert _{L^\infty (\varOmega )} + c_{\infty ,2}\Vert {\tilde{y}}_h - y_h\Vert _{L^2(\varOmega )}h^{-\frac{n}{2}}\\&\quad \le \Vert y - {\tilde{y}}_h\Vert _{L^\infty (\varOmega )} +c_{\infty ,2} \Big (\Vert {\tilde{y}}_h - y\Vert _{L^2(\varOmega )} + \Vert y - y_h\Vert _{L^2(\varOmega )}\Big )h^{-\frac{n}{2}}\\&\quad \le ({\tilde{c}}_\infty C_{A,f} + c_{\infty ,2} {\tilde{c}}_2 C_{A,f} + c_{\infty ,2} M_{{\mathcal {A}},f})\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h^{2 - \frac{n}{2}}, \end{aligned}$$

which implies (3.16). \(\square \)
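The quadratic rate in (3.15) can be observed experimentally by comparing errors on two meshes. The sketch below is only a one-dimensional illustration and does not reproduce the setting of the theorem (where \(n=2\) or \(n=3\)): it assumes the model problem \(-y'' + y^3 = u\) on (0, 1) with the manufactured solution \(y(x) = \sin (\pi x)\), P1 elements with nodal quadrature for the zero-order terms, a Newton iteration for the discrete equation, and a discrete \(L^2\) norm of the nodal errors. The function name `solve_semilinear` is hypothetical.

```python
import numpy as np

def solve_semilinear(n, newton_steps=20):
    """Solve -y'' + y**3 = u on (0,1), y(0)=y(1)=0, with P1 elements on a uniform
    mesh and nodal quadrature for the zero-order terms (an assumption)."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]            # interior nodes
    K = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
    y_exact = np.sin(np.pi * x)                       # manufactured solution
    u = np.pi**2 * y_exact + y_exact**3               # matching right-hand side
    y = np.zeros(n - 1)
    for _ in range(newton_steps):
        residual = K @ y + h * y**3 - h * u
        jacobian = K + np.diag(3.0 * h * y**2)
        y = y - np.linalg.solve(jacobian, residual)
    error = np.sqrt(h * np.sum((y - y_exact) ** 2))   # discrete L^2 error at the nodes
    return h, error

h1, e1 = solve_semilinear(16)
h2, e2 = solve_semilinear(32)
# Experimental order of convergence from two consecutive meshes.
observed_order = np.log(e1 / e2) / np.log(h1 / h2)
```

An observed order close to 2 is consistent with the \(h^2\) bound for the \(L^2(\varOmega )\) error, although a single experiment of this kind does not verify the constants or the threshold \(h_0\).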

Now, we replace Assumption 2’ by the weaker Assumption 2.

Theorem 3.7

Under Assumptions 1 and 2, for all \(M \ge 1 + \Vert y\Vert _{C({\bar{\varOmega }})}\) there exists \(h_M > 0\) such that, for every \(h < h_M\), (3.1) has a unique solution \(y_h\) satisfying \(\Vert y_h\Vert _{C({\bar{\varOmega }})} \le M\). Moreover, there exist constants \(K_M\) and \(K_{\infty ,M}\) independent of u such that

$$\begin{aligned}&\Vert y - y_h\Vert _{L^2(\varOmega )} + h\Vert y - y_h\Vert _{H_0^1(\varOmega )} \le K_M\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h^2, \end{aligned}$$
(3.18)
$$\begin{aligned}&\Vert y - y_h\Vert _{L^\infty (\varOmega )} \le K_{\infty ,M}\Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h^{2 - \frac{n}{2}}. \end{aligned}$$
(3.19)

Further, if there exist other solutions \(\{{\tilde{y}}_h\}_{h < h_M}\) of (3.1) with \(y_h \ne \tilde{y}_h\) for all h, then \(\lim _{h \rightarrow 0}\Vert {\tilde{y}}_h\Vert _{C({\bar{\varOmega }})} = \infty \).

Proof

Given M, we define the function \(f_M:\varOmega \times {\mathbb {R}} \longrightarrow {\mathbb {R}}\) by

$$\begin{aligned} f_M(x,y) = f(x,{\text {Proj}}_{[-M,+M]}(y)). \end{aligned}$$

Then, according to Remark 3.4, \(f_M\) satisfies the conditions (3.12). Moreover, if y is the solution of (1.1), then \(f_M(x,y(x)) = f(x,y(x))\) in \(\varOmega \), thus y also satisfies

$$\begin{aligned} \left\{ \begin{array}{l} Ay + b(x)\cdot \nabla y + f_M(x,y) = u \text { in } \varOmega ,\\ y = 0\text { on } \varGamma .\end{array}\right. \end{aligned}$$

According to Theorem 3.5, there exists \({\tilde{h}}_M = h_{{\mathcal {A}},f_M}\), which depends on \({\mathcal {A}}\) and \(\Vert f(x,0)\Vert _{L^2(\varOmega )}+\max (M,1)\Vert \phi _M\Vert _{L^2(\varOmega )}\) but is independent of u, such that the variational problem

$$\begin{aligned} \left\{ \begin{array}{l}\text {Find } y_h \in Y_h \text { such that }\\ a(y_h,z_h) + \displaystyle \int _\varOmega f_M(x,y_h(x))z_h(x)\, dx = \displaystyle \int _\varOmega u(x)z_h(x)\, dx \ \ \forall z_h \in Y_h.\end{array}\right. \qquad \end{aligned}$$
(3.20)

has a unique solution for every \(h < {\tilde{h}}_M\). Moreover, from Theorem 3.6 we have the estimate

$$\begin{aligned} \Vert y - y_h\Vert _{L^\infty (\varOmega )} \le M_{\infty ,{\mathcal {A}},f_M} \Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h^{2 - \frac{n}{2}}. \end{aligned}$$

Taking \(h_M\) such that

$$\begin{aligned} 0 < h_M \le {\tilde{h}}_M\ \text { and }\ M_{\infty ,{\mathcal {A}},f_M} \Big (\Vert u\Vert _{L^2(\varOmega )} + 1\Big )h_M^{2 - \frac{n}{2}} \le 1, \end{aligned}$$
(3.21)

we have

$$\begin{aligned} \Vert y_h\Vert _{C({\bar{\varOmega }})} \le \Vert y - y_h\Vert _{C({\bar{\varOmega }})} + \Vert y\Vert _{C({\bar{\varOmega }})} \le M. \end{aligned}$$

Hence, \(f_M(x,y_h(x)) = f(x,y_h(x))\) in \(\varOmega \) for all \(h < h_M\). Consequently, \(y_h\) is a solution of (3.1). Furthermore, if \({\hat{y}}_h\) is another solution of (3.1) such that \(\Vert {\hat{y}}_h\Vert _{C({\bar{\varOmega }})}\le M\), then \({\hat{y}}_h\) also solves (3.20). Hence, Theorem 3.5 implies that \(y_h = {\hat{y}}_h\). In addition, the estimates (3.18) and (3.19) follow from (3.15) and (3.16).

Finally, let us assume that \(\{{\tilde{y}}_h\}_{h < h_M}\) are solutions of (3.1) with \(y_h \ne {\tilde{y}}_h\). We argue by contradiction and assume that there exists a constant \(C_\infty \) such that \(\Vert {\tilde{y}}_h\Vert _{C({\bar{\varOmega }})} \le C_\infty \) for all \(h < h_M\). We take \({\tilde{M}} = \max \{1 + \Vert y\Vert _{C({\bar{\varOmega }})},C_\infty \}\). Then, \(y_h\) and \({\tilde{y}}_h\) are two different solutions of (3.20), with M replaced by \({\tilde{M}}\), for every \(h < h_{{\tilde{M}}}\), which contradicts the uniqueness already established. \(\square \)

Using the previous theorem, we are going to establish a well-defined local mapping \(u_h \rightarrow y_h\) by ignoring those solutions of (3.1) with large \(C({\bar{\varOmega }})\)-norms.

Theorem 3.8

Suppose that Assumptions 1 and 2 hold. Let \({\bar{y}} \in Y\) be the solution of (1.1) corresponding to the control \({\bar{u}} \in L^2(\varOmega )\). Given \(\rho > 0\) arbitrary, there exist \(\rho ^* > 0\) and \(h_0 > 0\) such that (3.1) has a unique solution \(y_h(u) \in \bar{B}^Y_{\rho ^*}({\bar{y}})\) for every \(u \in {\bar{B}}_\rho ({\bar{u}}) \subset L^2(\varOmega )\) and for all \(h < h_0\), where

$$\begin{aligned} {\bar{B}}_\rho ({\bar{u}}) = \{u \in L^2(\varOmega ) : \Vert u - {\bar{u}}\Vert _{L^2(\varOmega )} \le \rho \} \text{ and } {\bar{B}}^Y_{\rho ^*}({\bar{y}}) = \{y \in Y : \Vert y - {\bar{y}}\Vert _Y \le \rho ^*\}. \end{aligned}$$

Furthermore, there exist constants \(K_2\) and \(K_\infty \) such that

$$\begin{aligned}&\Vert y_u - y_h(u)\Vert _{L^2(\varOmega )} + h\Vert y_u - y_h(u)\Vert _{H_0^1(\varOmega )} \le K_2\Big (\Vert {\bar{u}}\Vert _{L^2(\varOmega )} + \rho + 1\Big )h^2, \end{aligned}$$
(3.22)
$$\begin{aligned}&\Vert y_u - y_h(u)\Vert _{L^\infty (\varOmega )} \le K_\infty \Big (\Vert {\bar{u}}\Vert _{L^2(\varOmega )} + \rho + 1\Big )h^{2 - \frac{n}{2}}\ \ \forall u \in {\bar{B}}_\rho ({\bar{u}}), \end{aligned}$$
(3.23)

where \(y_u\) and \(y_h(u)\) are the solutions of (1.1) and (3.1), respectively, associated with the control u.

Proof

Let us fix \(\rho > 0\). In [12, Lemma 3.5], the existence of a constant \(M_\rho \) was proved such that

$$\begin{aligned} \Vert y_u - {\bar{y}}\Vert _Y \le M_\rho \Vert u - {\bar{u}}\Vert _{L^2(\varOmega )} \le M_\rho \rho \quad \forall u \in {\bar{B}}_\rho ({\bar{u}}). \end{aligned}$$
(3.24)

Hence, we have

$$\begin{aligned} \Vert y_u\Vert _{C({\bar{\varOmega }})} \le \Vert {\bar{y}}\Vert _{C({\bar{\varOmega }})} + M_\rho \rho \quad \forall u \in {\bar{B}}_\rho ({\bar{u}}). \end{aligned}$$

Now, we set \(M = 1 + \Vert {\bar{y}}\Vert _{C({\bar{\varOmega }})} + M_\rho \rho \). According to Theorem 3.7 and (3.21), there exists \(h_M > 0\) such that for all \(h < h_M\) and for every \(u \in {\bar{B}}_\rho ({\bar{u}})\) (3.1) has a unique solution \(y_h(u)\) satisfying \(\Vert y_h(u)\Vert _{C({\bar{\varOmega }})} \le M\). Moreover, the estimates (3.18) and (3.19) hold for \(y_u - y_h(u)\). Further, it is enough to observe that \(\Vert u\Vert _{L^2(\varOmega )} \le \Vert {\bar{u}}\Vert _{L^2(\varOmega )} + \rho \) holds to deduce (3.22) and (3.23). Finally, we define \(\rho ^* = 1 + M_\rho \rho \) and take \(h_0 \in (0,h_M]\) such that

$$\begin{aligned} \Big (\Vert {\bar{u}}\Vert _{L^2(\varOmega )} + \rho + 1\Big )(K_2h_0 + K_\infty h_0^{2 - \frac{n}{2}}) \le 1. \end{aligned}$$
(3.25)

Then, for every \(u \in {\bar{B}}_\rho ({\bar{u}})\), (3.24) implies that \(y_u \in B^Y_{\rho ^*}({\bar{y}})\) holds. Furthermore, (3.22), (3.23) and (3.25) imply

$$\begin{aligned} \Vert y_h(u) - {\bar{y}}\Vert _Y \le \Vert y_h(u) - y_u\Vert _Y + \Vert y_u - {\bar{y}}\Vert _Y< 1 + M_\rho \rho = \rho ^*\quad \forall h < h_0. \end{aligned}$$

Hence, \(y_h(u) \in B^Y_{\rho ^*}({\bar{y}})\) holds. Thus, we have proved that (3.1) has a solution in \(B^Y_{\rho ^*}({\bar{y}}) \cap Y_h\) for every \(u \in {\bar{B}}_\rho ({\bar{u}})\). It remains to prove the uniqueness. This follows from the fact that \(h_0 \le h_M\) and, thanks to (3.25), any element \(y_h \in \bar{B}^Y_{\rho ^*}({\bar{y}})\) satisfies

$$\begin{aligned} \Vert y_h\Vert _{C({\bar{\varOmega }})} \le \Vert y_h - {\bar{y}}\Vert _{C({\bar{\varOmega }})} + \Vert {\bar{y}}\Vert _{C({\bar{\varOmega }})} \le \rho ^* + \Vert {\bar{y}}\Vert _{C({\bar{\varOmega }})} = M.\qquad \square \end{aligned}$$


4 Approximation of the control problem (P)

In this section, we discretize the control problem (P) and study the convergence of the discretizations. To this end, we suppose without express mention that Assumptions 1 and 2 hold.

Let us consider the functional \({\mathcal {J}}:L^2(\varOmega )\times L^2(\varOmega )\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} {\mathcal {J}}(y,u) = \frac{1}{2}\int _\varOmega (y(x) - y_d(x))^2\,dx+ \frac{\nu }{2}\int _\varOmega u^2\,dx. \end{aligned}$$

Let us denote by \({\mathcal {U}}_h\) one of the following two spaces:

$$\begin{aligned} {\mathcal {U}}_h= {\mathcal {U}}_h^0:=\{u_h\in L^2(\varOmega ):\ u_{h|T}\in P_0(T)\ \forall T\in {\mathcal {T}}_h\},\\ {\mathcal {U}}_h= {\mathcal {U}}_h^1:=\{u_h\in C({\bar{\varOmega }}):\ u_{h|T}\in P_1(T)\ \forall T\in {\mathcal {T}}_h\}, \end{aligned}$$

where \(P_0(T)\) and \(P_1(T)\) denote the spaces of polynomials in T of degree 0 and \(\le 1\), respectively. We also set \(U_{\mathrm{ad},h}= {\mathcal {U}}_h\cap U_{\mathrm{ad}}\). If \({\mathcal {U}}_h={\mathcal {U}}_h^0\), then we denote by \(\varPi _h:L^2(\varOmega ) \longrightarrow {\mathcal {U}}_h^0\) the linear \(L^2(\varOmega )\)-projection. If \({\mathcal {U}}_h= {\mathcal {U}}_h^1\), then \(\varPi _h:L^2(\varOmega ) \longrightarrow {\mathcal {U}}_h^1\) denotes Carstensen’s quasi-interpolation operator. In both cases it is known that \(\varPi _h u\) converges to u in \(L^2(\varOmega )\) as h tends to 0 for all \(u\in L^2(\varOmega )\), and \(\varPi _h u\in U_{\mathrm{ad},h}\) for all \(u\in U_{\mathrm{ad}}\).
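For the piecewise constant space \({\mathcal {U}}_h^0\), the two properties of \(\varPi _h\) mentioned above can be seen on a small example, since the \(L^2(\varOmega )\)-projection reduces to taking the average on each element. The sketch below is a one-dimensional illustration under assumptions of our own: a uniform mesh of (0, 1), midpoint quadrature for the averages, hypothetical bounds \(\alpha = -0.5\), \(\beta = 0.5\), and a sample admissible control. It does not cover Carstensen’s operator for \({\mathcal {U}}_h^1\).

```python
import numpy as np

def cell_points(k, n_cells, n_quad):
    """Midpoint quadrature points inside the k-th cell of a uniform mesh of (0,1)."""
    h = 1.0 / n_cells
    return k * h + (np.arange(n_quad) + 0.5) * h / n_quad

def project_p0(u, n_cells, n_quad=200):
    """Cell averages of u: the L^2 projection onto piecewise constants."""
    return np.array([np.mean(u(cell_points(k, n_cells, n_quad)))
                     for k in range(n_cells)])

def projection_error(u, n_cells, n_quad=200):
    """Approximate L^2(0,1) norm of u - Pi_h u by midpoint quadrature."""
    averages = project_p0(u, n_cells, n_quad)
    h = 1.0 / n_cells
    squared = sum(h * np.mean((u(cell_points(k, n_cells, n_quad)) - averages[k]) ** 2)
                  for k in range(n_cells))
    return np.sqrt(squared)

alpha, beta = -0.5, 0.5      # hypothetical control bounds

def u(x):
    """A hypothetical admissible control: a sine wave clipped to [alpha, beta]."""
    return np.clip(np.sin(2.0 * np.pi * x), alpha, beta)

coarse = project_p0(u, 8)                                # values of Pi_h u on 8 cells
errors = [projection_error(u, n) for n in (8, 64)]       # error on a coarse and a fine mesh
```

Averages of values in \([\alpha ,\beta ]\) remain in \([\alpha ,\beta ]\), which is why the projected control stays admissible, and the error decreases as the mesh is refined.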

We will approximate Problem (P) by the problem

$$\begin{aligned} ({\mathcal {P}}_h) \qquad \min \{ {\mathcal {J}}(y_h,u_h):\ (y_h,u_h)\in Y_h\times U_{\mathrm{ad},h} \text{ satisfies } (4.1)\}, \end{aligned}$$

where

$$\begin{aligned} a(y_h,z_h)+\int _\varOmega f(x,y_h(x))z_h(x)\,dx = \int _\varOmega u_h(x) z_h(x)\, dx\ \forall z_h\in Y_h. \end{aligned}$$
(4.1)

Theorem 4.1

There exists \(h_0>0\) such that problem \(({\mathcal {P}}_h)\) has at least one solution \(({\bar{y}}_h,{\bar{u}}_h)\) for all \(h<h_0\). Moreover, if \(\{({\bar{y}}_h,{\bar{u}}_h)\}_{h < h_0}\) is a sequence of solutions of problems \(({\mathcal {P}}_h)\), then it is bounded in \(H_0^1(\varOmega ) \times L^2(\varOmega )\) and there exist subsequences converging weakly in \(H_0^1(\varOmega ) \times L^2(\varOmega )\). In addition, if a subsequence, denoted in the same way, satisfies \(({\bar{y}}_h,{\bar{u}}_h) \rightharpoonup ({\bar{y}},{\bar{u}})\) in \(H_0^1(\varOmega ) \times L^2(\varOmega )\) as \(h \rightarrow 0\), then \(({\bar{y}},{\bar{u}}) \in Y \times U_{\mathrm{ad}}\), \({\bar{u}}\) is a solution of (P) with associated state \({\bar{y}}\), and \(({\bar{y}}_h,{\bar{u}}_h) \rightarrow ({\bar{y}},{\bar{u}})\) strongly in \(H_0^1(\varOmega ) \times L^2(\varOmega )\).

Proof

Claim 1—Existence of discrete solutions: Let us prove the existence of a solution of \(({\mathcal {P}}_h)\) for every h small enough. Let \({F}_h:Y_h \times U_{\mathrm{ad},h}\longrightarrow Y_h^*\) be the mapping defined by

$$\begin{aligned} \langle {F}_h(y_h,u_h),w_h\rangle _{Y_h^*,Y_h} = a(y_h,w_h) + \int _\varOmega [f(x,y_h) - u_h]w_h\, dx \quad \forall w_h \in Y_h. \end{aligned}$$

Since \(F_h\) is continuous and \(U_{\mathrm{ad},h}\) is closed, the set of feasible points \((y_h,u_h)\) for \(({\mathcal {P}}_h)\), which is \(F_h^{-1}(\{0\})\), is closed in \(Y_h\times {\mathcal {U}}_h\). Moreover, \({\mathcal {J}}\) is continuous and coercive. Hence, it is enough to prove the existence of feasible points for \(({\mathcal {P}}_h)\). We choose a constant function \(u \in U_{\mathrm{ad}}\), which guarantees \(u \in U_{\mathrm{ad},h}\) for every \(h > 0\). For instance, we can take \(u \equiv \alpha \) if \(\alpha > -\infty \), or \(u \equiv \beta \) if \(\beta < \infty \), or \(u \equiv 0\) otherwise. According to Theorem 3.7, there exists \(h_0 > 0\) such that (4.1) has a solution \(y_h(u) \in Y_h\) for every \(h < h_0\) satisfying \(y_h(u) \rightarrow y_u\) in Y. Therefore, \((y_h(u),u)\) is a feasible point for \(({\mathcal {P}}_h)\) for every \(h < h_0\).

Claim 2—Uniform boundedness of discrete solutions in \(H_0^1(\varOmega ) \times L^2(\varOmega )\) and weak convergence: Let us denote by \(\{({\bar{y}}_h,{\bar{u}}_h)\}_{h < h_0}\) a sequence of solutions for problems \(({\mathcal {P}}_h)\). We prove the boundedness of this sequence in \(H_0^1(\varOmega ) \times L^2(\varOmega )\). Since

$$\begin{aligned} {\mathcal {J}}({\bar{y}}_h,{\bar{u}}_h) \le {\mathcal {J}}(y_h(u),u) \rightarrow {\mathcal {J}}(y_u,u), \end{aligned}$$

we infer that \(\{({\bar{y}}_h,{\bar{u}}_h)\}_{h < h_0}\) is bounded in \(L^2(\varOmega ) \times L^2(\varOmega )\). Moreover, since \(({\bar{y}}_h,{\bar{u}}_h)\) satisfies (4.1), taking \(z_h = {\bar{y}}_h\) in (4.1) we deduce from (3.2) and the monotonicity of f

$$\begin{aligned}&\frac{\varLambda }{4}\Vert {\bar{y}}_h\Vert _{H^1_0(\varOmega )}^2 - C_{\varLambda ,b}\Vert {\bar{y}}_h\Vert _{L^2(\varOmega )}^2 \le a({\bar{y}}_h,{\bar{y}}_h) + \int _\varOmega [f(x,{\bar{y}}_h) - f(x,0)]{\bar{y}}_h\, dx\\&\quad = \int _\varOmega [{\bar{u}}_h - f(x,0)]{\bar{y}}_h\, dx. \end{aligned}$$

From here we obtain, for all \(h < h_0\),

$$\begin{aligned} \frac{\varLambda }{4}\Vert {\bar{y}}_h\Vert _{H^1_0(\varOmega )}^2&\le C_{\varLambda ,b}\Vert {\bar{y}}_h\Vert _{L^2(\varOmega )}^2 + \Big (\Vert {\bar{u}}_h\Vert _{L^2(\varOmega )} + \Vert f(\cdot ,0)\Vert _{L^2(\varOmega )}\Big )\Vert {\bar{y}}_h\Vert _{L^2(\varOmega )}. \end{aligned}$$

This inequality along with the boundedness of \(\{({\bar{y}}_h,{\bar{u}}_h)\}_{h < h_0}\) in \(L^2(\varOmega ) \times L^2(\varOmega )\) implies that \(\{{\bar{y}}_h\}_{h < h_0}\) is bounded in \(H_0^1(\varOmega )\) as well.

Since \(\{({\bar{y}}_h,{\bar{u}}_h)\}_{h < h_0}\) is bounded in \(H^1_0(\varOmega )\times L^2(\varOmega )\), it has weakly convergent subsequences in this topology. Now, we take a subsequence, denoted in the same way, such that

$$\begin{aligned}&({\bar{y}}_h,{\bar{u}}_h) {\mathop {\rightharpoonup }\limits ^{h \rightarrow 0}} ({\bar{y}},{\bar{u}})\ \text { in } H_0^1(\varOmega ) \times L^2(\varOmega ),\\&{\bar{y}}_h {\mathop {\longrightarrow }\limits ^{h \rightarrow 0}} {\bar{y}}\ \text { in } L^2(\varOmega ) \text { and } {\bar{y}}_h(x) \rightarrow {\bar{y}}(x) \text { a.e.~in } \varOmega . \end{aligned}$$

Claim 3—Validity of the state equation for the limit element: The proof of this claim is split into three main steps: First, we prove that \(f(\cdot ,{\bar{y}}_h) \rightarrow f(\cdot ,{\bar{y}})\) strongly in \(L^1(\varOmega )\). Next, we use this to prove (4.3), which is a weak version of (1.1) for bounded test functions. Finally, we prove that \({\bar{y}}\in L^\infty (\varOmega )\) to conclude this part of the proof.

To prove that \(f(\cdot ,{\bar{y}}_h) \rightarrow f(\cdot ,{\bar{y}})\) strongly in \(L^1(\varOmega )\), we show that \(\{f(x,{\bar{y}}_h(x)) - f(x,0)\}_{h < h_0}\) is uniformly integrable. Then, the convergence will follow from Vitali’s convergence theorem and the pointwise convergence of the sequence \(f(x,{\bar{y}}_h(x)) \rightarrow f(x,{\bar{y}}(x))\) in \(\varOmega \); see [4, Volume I, Theorem 4.5.4] or [31, Chapter 6, Exercise 10].

We get from (4.1) and the boundedness of \(\{({\bar{y}}_h,{\bar{u}}_h)\}_{h < h_0}\) in \(H_0^1(\varOmega ) \times L^2(\varOmega )\) the existence of a constant C such that

$$\begin{aligned}&\int _\varOmega [f(x,{\bar{y}}_h) - f(x,0)]{\bar{y}}_h\, dx \nonumber \\&\quad = \int _\varOmega [{\bar{u}}_h - f(x,0)]{\bar{y}}_h\, dx - a({\bar{y}}_h,{\bar{y}}_h) \le C\ \ \forall h < h_0. \end{aligned}$$
(4.2)

Let \(\varepsilon > 0\) be arbitrarily small. Using (2.3) with \(M=2C/\varepsilon \), we deduce the existence of a function \(\phi _\varepsilon \in L^2(\varOmega )\) such that

$$\begin{aligned} |f(x,y) - f(x,0)| \le \phi _\varepsilon (x)|y| \le \phi _\varepsilon (x)\frac{2C}{\varepsilon }\ \ \text { if }\ \ |y| \le \frac{2C}{\varepsilon }. \end{aligned}$$

From the integrability of \(\phi _\varepsilon \) we infer the existence of \(\lambda _0 > 0\) such that

$$\begin{aligned} \int _{\{x : \phi _\varepsilon (x) \ge \lambda _0\}}\phi _\varepsilon (x)\, dx \le \frac{\varepsilon ^2}{4C}. \end{aligned}$$

Let us set \(\lambda = \frac{2C\lambda _0}{\varepsilon }\) and \(\varOmega _{h,\lambda } = \{x \in \varOmega : |f(x,{\bar{y}}_h(x)) - f(x,0)| > \lambda \}\). We notice the following properties:

  • If \(x \in \varOmega _{h,\lambda }\) and \(|{\bar{y}}_h(x)| > \frac{2C}{\varepsilon }\), then

    $$\begin{aligned} |f(x,{\bar{y}}_h(x)) - f(x,0)| \le \frac{\varepsilon }{2C}[f(x,{\bar{y}}_h(x)) - f(x,0)]{\bar{y}}_h(x). \end{aligned}$$
  • If \(x \in \varOmega _{h,\lambda }\) and \(|{\bar{y}}_h(x)| \le \frac{2C}{\varepsilon }\) then

    $$\begin{aligned} |f(x,{\bar{y}}_h(x)) - f(x,0)| \le \phi _\varepsilon (x)\frac{2C}{\varepsilon } \text { and } \phi _\varepsilon (x) \ge \lambda _0. \end{aligned}$$

From here, and using (4.2), we infer

$$\begin{aligned}&\int _{\varOmega _{h,\lambda }}|f(x,{\bar{y}}_h(x)) - f(x,0)|\, dx \le \frac{2C}{\varepsilon }\int _{\{x : \phi _\varepsilon (x) \ge \lambda _0\}}\phi _\varepsilon (x)\, dx \\&\quad + \frac{\varepsilon }{2C}\int _{\varOmega _{h,\lambda }}[f(x,{\bar{y}}_h(x)) - f(x,0)]{\bar{y}}_h(x)\, dx \le \varepsilon . \end{aligned}$$

Since \(\lambda \) was chosen independently of h, the uniform integrability follows and \(f(\cdot ,{\bar{y}}_h) \rightarrow f(\cdot ,{\bar{y}})\) strongly in \(L^1(\varOmega )\).

Next, given \(z \in H_0^1(\varOmega ) \cap L^\infty (\varOmega )\), we can take a sequence \(\{z_h\}_{h < h_0}\) with \(z_h \in Y_h\) for every h such that \(z_h \rightarrow z\) in \(H^1_0(\varOmega )\) and \(\Vert z_h\Vert _{L^\infty (\varOmega )}\le \Vert z\Vert _{L^\infty (\varOmega )}\). For instance, we can take \(z_h\) as Carstensen’s quasi-interpolation of z; see [7]. Hence, using Lebesgue’s dominated convergence theorem, we can pass to the limit in (4.1) and deduce that

$$\begin{aligned} a({\bar{y}},z) + \int _\varOmega f(x,{\bar{y}})z\, dx = \int _\varOmega {\bar{u}} z\, dx\ \forall z \in H_0^1(\varOmega ) \cap L^\infty (\varOmega ). \end{aligned}$$
(4.3)

Finally, we prove that \({\bar{y}}\in L^\infty (\varOmega )\), and consequently, by a truncation argument, it will follow that (4.3) holds for all \(z\in H^1_0(\varOmega )\). Let us set

$$\begin{aligned} {\tilde{a}}(y,z) = a(y,z) + C_{\varLambda ,b}\int _\varOmega yz \, dx, \end{aligned}$$

where \(C_{\varLambda ,b}\) is given by (3.2). Then we have that \({\tilde{a}}\) is coercive in \(H_0^1(\varOmega )\) and

$$\begin{aligned} {\tilde{a}}({\bar{y}},z) + \int _\varOmega [f(x,{\bar{y}}) - f(x,0)]z \, dx = \int _\varOmega [{\bar{u}} + C_{\varLambda ,b}{\bar{y}}- f(x,0)]z\, dx \ \ \forall z \in Y.\nonumber \\ \end{aligned}$$
(4.4)

The above identity holds, in particular, for \(y_k = {\text {Proj}}_{[-k,+k]}({\bar{y}})\) for every \(k \ge 1\):

$$\begin{aligned}&{\tilde{a}}({\bar{y}},y_k) + \int _\varOmega [f(x,{\bar{y}}) - f(x,0)]y_k \, dx \nonumber \\&\quad = \int _\varOmega [{\bar{u}} + C_{\varLambda ,b}{\bar{y}}- f(x,0)]y_k\, dx \ \ \forall k \ge 1. \end{aligned}$$
(4.5)

Moreover, from Fatou’s Lemma, (4.5), denoting \(g = {\bar{u}} + C_{\varLambda ,b}{\bar{y}}- f(x,0) \in L^2(\varOmega )\) and taking into account that \(\tilde{a}({\bar{y}},y_k)\ge \tilde{a}(y_k,y_k)\ge 0\), we have

$$\begin{aligned} \int _\varOmega [f(x,{\bar{y}}(x)) - f(x,0)]{\bar{y}}(x)\, dx \le&\liminf _{k \rightarrow \infty }\int _\varOmega [f(x,{\bar{y}}(x)) - f(x,0)]y_k(x)\, dx \nonumber \\ \le&\liminf _{k\rightarrow \infty }\int _\varOmega g(x) y_k(x)\, dx\le \int _\varOmega g(x) {\bar{y}}(x)\, dx. \end{aligned}$$
(4.6)

Hence, \([f(\cdot ,{\bar{y}}) - f(\cdot ,0)]{\bar{y}} \in L^1(\varOmega )\) holds. Since \(0 \le [f(x,{\bar{y}}(x)) - f(x,0)]y_k(x) \le [f(x,{\bar{y}}(x)) - f(x,0)]{\bar{y}}(x)\), we can apply Lebesgue’s dominated convergence theorem to pass to the limit in (4.5):

$$\begin{aligned} {\tilde{a}}({\bar{y}},{\bar{y}}) + \int _\varOmega [f(x,{\bar{y}}) - f(x,0)]{\bar{y}} \, dx = \int _\varOmega [{\bar{u}} + C_{\varLambda ,b}{\bar{y}}- f(x,0)]{\bar{y}}\, dx. \end{aligned}$$

Then, combining this identity and (4.5), we get for \(y^k = {\bar{y}} - y_k\):

$$\begin{aligned} {\tilde{a}}({\bar{y}},y^k) + \int _\varOmega [f(x,{\bar{y}}) - f(x,0)]y^k \, dx = \int _\varOmega gy^k\, dx \ \ \forall k \ge 1. \end{aligned}$$

From the monotonicity of f and the definition of \(y^k\) we get

$$\begin{aligned} \frac{\varLambda }{4}\Vert y^k\Vert ^2_{H_0^1(\varOmega )} \le {\tilde{a}}(y^k,y^k) \le {\tilde{a}}({\bar{y}},y^k) \le \int _\varOmega gy^k\, dx\ \ \forall k \ge 1. \end{aligned}$$

Then, we can proceed as in [34, Theorem 4.1] or [35, §7.2] to infer the existence of \(k_0\) such that \(y^k = 0\) for \(k \ge k_0\). Hence, \({\bar{y}} \in L^\infty (\varOmega )\) holds. Moreover, from (2.5) and (2.6) we deduce that \(f(\cdot ,{\bar{y}}) \in L^2(\varOmega )\). Therefore, we have that \({\mathcal {A}}{\bar{y}} \in L^2(\varOmega )\) and, consequently, \({\bar{y}} \in C({\bar{\varOmega }})\); see [12, Corollary 2.2]. Thus, \({\bar{y}} \in Y\) and (4.3) implies that \({\bar{y}}\) is the solution of (1.1) associated with \({\bar{u}}\).

Claim 4—Optimality of \({\bar{u}}\): Let us prove that \({\bar{u}}\) is a solution of (P). First, notice that it follows from the inclusion \(U_{\mathrm{ad},h}\subset U_{\mathrm{ad}}\) that \({\bar{u}} \in U_{\mathrm{ad}}\). To prove the optimality, we take an arbitrary element \(u \in U_{\mathrm{ad}}\) and set \(u_h = \varPi _hu \in U_{\mathrm{ad},h}\). Moreover, Theorem 3.7 implies that there exists \(y_h(u_h) \in Y_h\) solution of (4.1) for every h small enough and such that \(y_h(u_h) \rightarrow y_u\) in Y. Hence, we deduce from the optimality of \(({\bar{y}}_h,{\bar{u}}_h)\)

$$\begin{aligned}&J({\bar{u}}) = {\mathcal {J}}({\bar{y}},{\bar{u}}) \le \liminf _{h \rightarrow 0}{\mathcal {J}}({\bar{y}}_h,{\bar{u}}_h) \le \limsup _{h \rightarrow 0}{\mathcal {J}}({\bar{y}}_h,{\bar{u}}_h)\\&\le \limsup _{h \rightarrow 0}{\mathcal {J}}(y_h(u_h),u_h) = {\mathcal {J}}(y_u,u) = J(u). \end{aligned}$$

Since u was arbitrary in \(U_{\mathrm{ad}}\), the above inequalities prove that \({\bar{u}}\) is a solution of (P).

Claim 5—Strong convergence in \(H^1_0(\varOmega )\times L^2(\varOmega )\). If we take above \(u = {\bar{u}}\) we deduce from the previous inequalities and the strong convergence \({\bar{y}}_h \rightarrow {\bar{y}}\) in \(L^2(\varOmega )\) that \(\Vert {\bar{u}}_h\Vert _{L^2(\varOmega )} \rightarrow \Vert {\bar{u}}\Vert _{L^2(\varOmega )}\) and, hence, \({\bar{u}}_h \rightarrow {\bar{u}}\) strongly in \(L^2(\varOmega )\). Finally, we prove the strong convergence \({\bar{y}}_h \rightarrow {\bar{y}}\) in \(H_0^1(\varOmega )\) as follows

$$\begin{aligned}&a({\bar{y}},{\bar{y}}) \le \liminf _{h \rightarrow 0}a({\bar{y}}_h,{\bar{y}}_h) \le \limsup _{h \rightarrow 0}a({\bar{y}}_h,{\bar{y}}_h)\\&\quad \le \limsup _{h \rightarrow 0}\int _\varOmega \big [{\bar{u}}_h - f(x,0)\big ]{\bar{y}}_h\, dx - \liminf _{h \rightarrow 0}\int _\varOmega [f(x,{\bar{y}}_h) - f(x,0)]{\bar{y}}_h\, dx\\&\quad = \int _\varOmega \big [{\bar{u}} - f(x,0)\big ]{\bar{y}}\, dx - \int _\varOmega [f(x,{\bar{y}}) - f(x,0)]{\bar{y}}\, dx = a({\bar{y}},{\bar{y}}), \end{aligned}$$

where we have used (4.6). The above inequalities imply that \(a({\bar{y}}_h,{\bar{y}}_h) \rightarrow a({\bar{y}},{\bar{y}})\). Hence, from (3.2) and the weak convergence \({\bar{y}}_h \rightharpoonup {\bar{y}}\) in \(H_0^1(\varOmega )\) we infer that \({\bar{y}}_h \rightarrow {\bar{y}}\) strongly in \(H_0^1(\varOmega )\).\(\square \)

Next, we prove a kind of converse theorem. More precisely, we assume that \({\bar{u}} \in U_{\mathrm{ad}}\) is a strict local minimum of (P) with associated state \({\bar{y}}\). This means that there exists \(\rho > 0\) such that

$$\begin{aligned} J({\bar{u}}) < J(u)\quad \forall u \in (\bar{B}_\rho ({\bar{u}}) \cap U_{\mathrm{ad}})\setminus \{{\bar{u}}\}. \end{aligned}$$
(4.7)

Under the assumptions of Theorem 4.1 there exist \(\rho ^* > 0\) and \(h_0 > 0\) such that for every \(u \in \bar{B}_\rho ({\bar{u}})\) there exists a unique solution \(y_h(u) \in \bar{B}^Y_{\rho ^*}({\bar{y}})\) of (4.1); see Theorem 3.8. Then, for every \(h < h_0\) we have a well defined mapping \(G_h:\bar{B}_\rho ({\bar{u}}) \longrightarrow \bar{B}^Y_{\rho ^*}({\bar{y}}) \cap Y_h\) given by \(G_h(u) = y_h(u)\). Now we define the functional \(J_h:\bar{B}_\rho ({\bar{u}}) \longrightarrow {\mathbb {R}}\) by

$$\begin{aligned} J_h(u) = {\mathcal {J}}(y_h(u),u) = \frac{1}{2}\int _\varOmega (y_h(u) - y_d)^2\, dx + \frac{\nu }{2}\int _\varOmega u^2\, dx. \end{aligned}$$

Associated with \(J_h\) we define the discrete control problem

$$\begin{aligned} (\hbox {P}_h^{\rho }) \quad \min _{u_h \in \bar{B}_\rho ({\bar{u}}) \cap U_{\mathrm{ad},h}} J_h(u_h). \end{aligned}$$

Theorem 4.2

Under Assumptions 1 and 2, and with the above notations, there exists \(h_\rho \in (0,h_0]\) such that \((\hbox {P}_h^{\rho })\) has at least one solution \({\bar{u}}_h\) for every \(h \le h_\rho \). Moreover, the convergence \({\bar{u}}_h {\mathop {\longrightarrow }\limits ^{h \rightarrow 0}} {\bar{u}}\) in \(L^2(\varOmega )\) holds.

Proof

Since \(\varPi _h{\bar{u}} {\mathop {\longrightarrow }\limits ^{h \rightarrow 0}} {\bar{u}}\), there exists \(h_\rho \in (0,h_0]\) such that \(\varPi _h{\bar{u}} \in \bar{B}_\rho ({\bar{u}}) \cap U_{\mathrm{ad},h}\) for every \(h \le h_\rho \). Hence, \(\bar{B}_\rho ({\bar{u}}) \cap U_{\mathrm{ad},h}\) is a compact non-empty subset in \({\mathcal {U}}_h\) for every \(h \le h _\rho \). Let us prove that \(J_h\) is continuous. Let \(\{u_{hk}\}_{k = 1}^\infty \subset \bar{B}_\rho ({\bar{u}})\) be a sequence converging to \(u_h\) in \({\mathcal {U}}_h\). Let \(\{y_h(u_{hk})\}_{k = 1}^\infty \subset \bar{B}^Y_{\rho ^*}({\bar{y}}) \cap Y_h\) be the associated discrete states. From the boundedness of this sequence in the finite-dimensional space \(Y_h\), we deduce the existence of a subsequence, denoted in the same way, such that \(y_h(u_{hk}) {\mathop {\longrightarrow }\limits ^{k \rightarrow \infty }} y_h\) for some \(y_h \in \bar{B}^Y_{\rho ^*}({\bar{y}}) \cap Y_h\). Now, it is immediate to pass to the limit in Eq. (4.1) satisfied by \((y_h(u_{hk}),u_{hk})\) and to deduce that \(y_h = y_h(u_h)\). Since every convergent subsequence of \(\{y_h(u_{hk})\}_{k = 1}^\infty \) has the same limit \(y_h(u_h)\), it follows that the whole sequence converges to \(y_h(u_h)\). This proves the continuity of \(G_h\) and, consequently, the continuity of \(J_h\). Therefore, \((\hbox {P}_h^{\rho })\) consists of the minimization of a continuous function on a non-empty compact set, which implies the existence of a solution \({\bar{u}}_h\).

It remains to prove that \(\{{\bar{u}}_h\}_{h \le h_\rho }\) converges to \({\bar{u}}\) strongly in \(L^2(\varOmega )\). First, from the boundedness of \(\{{\bar{u}}_h\}_{h \le h_\rho } \subset \bar{B}_\rho ({\bar{u}})\) and the inclusions \(U_{\mathrm{ad},h}\subset U_{\mathrm{ad}}\) we deduce the existence of a subsequence, denoted in the same way, and an element \({\tilde{u}} \in \bar{B}_\rho ({\bar{u}}) \cap U_{\mathrm{ad}}\) such that \({\bar{u}}_h \rightharpoonup {\tilde{u}}\) in \(L^2(\varOmega )\). This implies that \(y_{{\bar{u}}_h} \rightarrow y_{{\tilde{u}}}\) strongly in Y; see [12, Theorem 2.9]. Therefore, using (3.22) and (3.23) we infer

$$\begin{aligned} \Vert y_h({\bar{u}}_h) - y_{{\tilde{u}}}\Vert _Y \le \Vert y_h({\bar{u}}_h) - y_{{\bar{u}}_h}\Vert _Y + \Vert y_{{\bar{u}}_h} - y_{{\tilde{u}}}\Vert _Y \longrightarrow 0. \end{aligned}$$

This convergence and the optimality of \({\bar{u}}_h\) imply

$$\begin{aligned} J({\tilde{u}}) \le \liminf _{h \rightarrow 0}J_h({\bar{u}}_h) \le \limsup _{h \rightarrow 0}J_h({\bar{u}}_h) \le \limsup _{h \rightarrow 0}J_h(\varPi _h{\bar{u}}) = J({\bar{u}}). \end{aligned}$$
(4.8)

This inequality and (4.7) lead to the identity \({\bar{u}} = {\tilde{u}}\). Moreover, (4.8) implies that \({\bar{u}}_h \rightarrow {\bar{u}}\) strongly in \(L^2(\varOmega )\). This property is satisfied by every weakly convergent subsequence of \(\{{\bar{u}}_h\}_{h \le h_\rho }\), hence the whole sequence converges strongly to \({\bar{u}}\).\(\square \)
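The last step of the proof rests on a standard Hilbert space argument, sketched here for completeness. From (4.8) with \({\tilde{u}} = {\bar{u}}\) we get \(J_h({\bar{u}}_h) \rightarrow J({\bar{u}})\), and since \(y_h({\bar{u}}_h) \rightarrow {\bar{y}}\) in \(L^2(\varOmega )\), the tracking terms converge, so that \(\Vert {\bar{u}}_h\Vert _{L^2(\varOmega )} \rightarrow \Vert {\bar{u}}\Vert _{L^2(\varOmega )}\). Combined with the weak convergence \({\bar{u}}_h \rightharpoonup {\bar{u}}\) along the subsequence, this gives

$$\begin{aligned} \Vert {\bar{u}}_h - {\bar{u}}\Vert ^2_{L^2(\varOmega )} = \Vert {\bar{u}}_h\Vert ^2_{L^2(\varOmega )} - 2\int _\varOmega {\bar{u}}_h{\bar{u}}\, dx + \Vert {\bar{u}}\Vert ^2_{L^2(\varOmega )} \longrightarrow \Vert {\bar{u}}\Vert ^2_{L^2(\varOmega )} - 2\Vert {\bar{u}}\Vert ^2_{L^2(\varOmega )} + \Vert {\bar{u}}\Vert ^2_{L^2(\varOmega )} = 0. \end{aligned}$$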

Remark 4.3

By selecting \(h_\rho \) sufficiently small, we have that the solutions \({\bar{u}}_h\) of \((\hbox {P}_h^{\rho })\) belong to the open ball \(B_\rho ({\bar{u}})\). Indeed, this is an obvious consequence of the strong convergence \({\bar{u}}_h \rightarrow {\bar{u}}\) in \(L^2(\varOmega )\). From now on, we will assume that \(h_\rho \) has been chosen so that \({\bar{u}}_h\) is included in the open ball. From Theorem 3.8 we deduce that \((y_h({\bar{u}}_h),{\bar{u}}_h)\) is a local solution of \(({\mathcal {P}}_h)\). Thus, Theorem 4.2 proves that strict local solutions of (P) can be approximated by local solutions of \(({\mathcal {P}}_h)\).

The next goal is to derive the optimality conditions satisfied by a solution of \((\hbox {P}_h^{\rho })\). To this end, we first analyze the differentiability of the mapping \(G_h\) and the functional \(J_h\).

Theorem 4.4

Suppose that Assumptions 1 and 3 hold. Then, there exists \({\bar{h}}_\rho \le h_0\) such that for every \(h < \bar{h}_\rho \) the mapping \(G_h:B_\rho ({\bar{u}}) \longrightarrow Y_h\) is of class \(C^2\). Moreover, if \(z_h = G'_h(u)v\) for \(u \in B_\rho ({\bar{u}})\) and \(v \in L^2(\varOmega )\), then \(z_h\) is the unique solution of the variational problem

$$\begin{aligned} \left\{ \begin{array}{l}\text {Find } z_h \in Y_h \text { such that }\\ \displaystyle a(z_h,w_h) + \int _\varOmega \frac{\partial f}{\partial y}(x,y_h(u))z_h(x) w_h(x)\, dx = \int _\varOmega v(x)w_h(x)\, dx \ \ \forall w_h \in Y_h.\end{array}\right. \nonumber \\ \end{aligned}$$
(4.9)

Proof

For every \(h < h_0\) let \({\mathcal {F}}_h:Y_h \times B_\rho ({\bar{u}}) \longrightarrow Y_h^*\) be the mapping defined by

$$\begin{aligned} \langle {\mathcal {F}}_h(y_h,u),w_h\rangle _{Y_h^*,Y_h} = a(y_h,w_h) + \int _\varOmega [f(x,y_h) - u]w_h\, dx \quad \forall w_h \in Y_h. \end{aligned}$$

It is clear that \({\mathcal {F}}_h\) is of class \(C^2\) and \({\mathcal {F}}_h(G_h(u),u) = 0\) \(\forall u \in B_\rho ({\bar{u}})\). Hence, the differentiability of \(G_h\) is a consequence of the implicit function theorem applied to \({\mathcal {F}}_h\). We only need to prove that

$$\begin{aligned}&\frac{\partial {\mathcal {F}}_h}{\partial y}(y_h(u),u):Y_h \longrightarrow Y_h^*\\&\left\langle \frac{\partial {\mathcal {F}}_h}{\partial y}(y_h(u),u)z_h,w_h\right\rangle _{Y_h^*,Y_h} = a(z_h,w_h) + \int _\varOmega \frac{\partial f}{\partial y}(x,y_h(u))z_hw_h\, dx \end{aligned}$$

is an isomorphism. This is equivalent to proving that (4.9) has a unique solution \(z_h \in Y_h\) for every \(v \in L^2(\varOmega )\). We prove this. First we observe that \(y_h(u) \in \bar{B}^Y_{\rho ^*}({\bar{y}})\) \(\forall u \in B_\rho ({\bar{u}})\). Therefore, \(\Vert y_h(u)\Vert _{C({\bar{\varOmega }})} \le \rho ^*\) \(\forall u \in B_\rho ({\bar{u}})\) holds. From (2.6) we infer the existence of a constant \(C_{f,\rho ^*}\) such that

$$\begin{aligned} \Big |\frac{\partial f}{\partial y}(x,y_h(u))\Big | \le C_{f,\rho ^*}\quad \forall u \in B_\rho ({\bar{u}}). \end{aligned}$$

Then, Lemma 3.1 implies the existence of \(\bar{h}_\rho \le h_0\) depending on \(C_{f,\rho ^*}\) and \({\mathcal {A}}\) such that (4.9) has a unique solution for every \(h \le \bar{h}_\rho \) and for all \((u,v) \in B_\rho ({\bar{u}}) \times L^2(\varOmega )\).\(\square \)

As an immediate corollary of the above theorem we get the differentiability of the objective functional \(J_h\).

Corollary 4.5

Under the assumptions of Theorem 4.4, the mapping \(J_h:B_\rho ({\bar{u}}) \longrightarrow {\mathbb {R}}\) is of class \(C^2\) \(\forall h < \bar{h}_\rho \), and

$$\begin{aligned} J'_h(u)v = \int _\varOmega (\varphi _h(u) + \nu u)v\, dx \quad \forall (u,v) \in B_\rho ({\bar{u}}) \times L^2(\varOmega ), \end{aligned}$$
(4.10)

where \(\varphi _h(u) \in Y_h\) is the adjoint state, i.e. it is the solution of the variational problem

$$\begin{aligned} \left\{ \begin{array}{l}\text {Find } \varphi _h \in Y_h \text { such that }\\ \displaystyle a(w_h,\varphi _h) + \int _\varOmega \frac{\partial f}{\partial y}(x,y_h(u))\varphi _h w_h\, dx = \int _\varOmega (y_h(u) - y_d)w_h\, dx \ \ \forall w_h \in Y_h.\end{array}\right. \nonumber \\ \end{aligned}$$
(4.11)

Let us observe that (4.11) is a linear system of equations, adjoint to the one defined by (4.9). Therefore, the existence and uniqueness of a solution of (4.11) is a consequence of the same property for (4.9).
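The adjoint representation (4.10)–(4.11) can be illustrated on a toy problem. The following is a minimal sketch under assumptions that are not part of the paper: a one-dimensional domain \((0,1)\) with a uniform mesh, P1 elements for both state and control, the nonlinearity \(f(y) = y^3\), a constant convection coefficient, and a lumped quadrature rule for the nonlinear term. All function names are hypothetical. The sketch solves the discrete state equation by Newton's method, solves the transposed linearized system for the adjoint state, and compares \(J'_h(u)v\) with a central finite difference.

```python
import numpy as np

def assemble(N, b=1.0):
    """P1 matrices on a uniform mesh of (0,1) with N interior nodes (toy setting)."""
    h = 1.0 / (N + 1)
    off = np.ones(N - 1)
    K = (2.0 * np.eye(N) - np.diag(off, 1) - np.diag(off, -1)) / h          # int y' w'
    C = 0.5 * b * (np.diag(off, 1) - np.diag(off, -1))                      # int b y' w
    M = h / 6.0 * (4.0 * np.eye(N) + np.diag(off, 1) + np.diag(off, -1))    # int y w
    return h, K + C, M

def f(y):
    return y**3            # assumed monotone nonlinearity

def df(y):
    return 3.0 * y**2

def solve_state(A, M, h, u, tol=1e-12, maxit=50):
    """Newton's method for A y + h f(y) = M u (nonlinear term lumped at the nodes)."""
    y = np.zeros_like(u)
    for _ in range(maxit):
        res = A @ y + h * f(y) - M @ u
        if np.linalg.norm(res) < tol:
            break
        y = y - np.linalg.solve(A + h * np.diag(df(y)), res)
    return y

def cost(A, M, h, u, yd, nu):
    y = solve_state(A, M, h, u)
    return 0.5 * (y - yd) @ M @ (y - yd) + 0.5 * nu * u @ M @ u

def gradient(A, M, h, u, yd, nu):
    """Vector g with J_h'(u) v = v @ g, computed through the discrete adjoint state."""
    y = solve_state(A, M, h, u)
    # The adjoint system uses the transpose of the linearized state matrix.
    phi = np.linalg.solve((A + h * np.diag(df(y))).T, M @ (y - yd))
    return M @ (phi + nu * u)

N, nu = 49, 1e-2
h, A, M = assemble(N)
x = np.linspace(h, 1.0 - h, N)
yd = np.sin(np.pi * x)           # arbitrary target
u = 10.0 * x * (1.0 - x)         # arbitrary control
v = np.cos(3.0 * x)              # arbitrary direction

ad = v @ gradient(A, M, h, u, yd, nu)
eps = 1e-5
fd = (cost(A, M, h, u + eps * v, yd, nu) - cost(A, M, h, u - eps * v, yd, nu)) / (2.0 * eps)
print(ad, fd)
```

Because the convection term makes the stiffness matrix nonsymmetric, the transpose in the adjoint solve matters; this mirrors the exchange of arguments \(a(w_h,\varphi _h)\) in (4.11) compared with \(a(z_h,w_h)\) in (4.9).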

Now, we can formulate the first order optimality conditions satisfied by a solution of \((\hbox {P}_h^{\rho })\).

Theorem 4.6

Assume that \(h < \bar{h} = \min \{h_\rho ,\bar{h}_\rho \}\) with \(h_\rho \) and \(\bar{h}_\rho \) given by Theorems 4.2 and 4.4. Then, \((\hbox {P}_h^{\rho })\) has a solution \({\bar{u}}_h\) in the open ball \(B_\rho ({\bar{u}})\) for every \(h < \bar{h}\). Moreover, for any of these solutions there exist two unique functions \({\bar{y}}_h, {\bar{\varphi }}_h \in Y_h\) satisfying

$$\begin{aligned}&a({\bar{y}}_h,w_h) + \int _\varOmega f(x,{\bar{y}}_h)w_h\, dx = \int _\varOmega {\bar{u}}_hw_h\, dx \ \ \forall w_h \in Y_h, \end{aligned}$$
(4.12)
$$\begin{aligned}&a(w_h,{\bar{\varphi }}_h) + \int _\varOmega \frac{\partial f}{\partial y}(x,{\bar{y}}_h){\bar{\varphi }}_hw_h\, dx = \int _\varOmega ({\bar{y}}_h - y_d)w_h\, dx \ \ \forall w_h \in Y_h, \end{aligned}$$
(4.13)
$$\begin{aligned}&\int _\varOmega ({\bar{\varphi }}_h + \nu {\bar{u}}_h)(u_h - {\bar{u}}_h)\, dx \ge 0 \quad \forall u_h \in U_{\mathrm{ad},h}. \end{aligned}$$
(4.14)

Proof

The existence of a solution of \((\hbox {P}_h^{\rho })\) in the open ball follows from Remark 4.3. Then, the inequality \(J'_h({\bar{u}}_h)(u_h - {\bar{u}}_h) \ge 0\) \(\forall u_h \in U_{\mathrm{ad},h}\) holds for every \(h < {\bar{h}}\). This along with (4.10) leads directly to (4.12)–(4.14). \(\square \)

5 Error estimates

In this section, we suppose that Assumptions 1 and 3 hold. In the whole section \({\bar{u}}\) will denote a strict local solution of (P) with associated state \({\bar{y}}\) and adjoint state \({\bar{\varphi }}\). Following Theorem 4.6, in the sequel we will assume that \(h < {\bar{h}}\), and we consider the discrete problems \((\hbox {P}_h^{\rho })\) having solutions \({\bar{u}}_h \in B_\rho ({\bar{u}}) \cap U_{\mathrm{ad},h}\) and satisfying the optimality conditions (4.12)–(4.14). We know that \({\bar{u}}_h \rightarrow {\bar{u}}\) strongly in \(L^2(\varOmega )\). The goal is to provide some error estimates for the difference \({\bar{u}} - {\bar{u}}_h\). We will distinguish two cases depending on the set \(U_{\mathrm{ad}}\): the unconstrained case \(U_{\mathrm{ad}}= L^2(\varOmega )\), treated in Theorem 5.3, and the constrained case \(U_{\mathrm{ad}}\subsetneq L^2(\varOmega )\), treated in Theorem 5.4. Let us first prove a preliminary result that we will use later.

Theorem 5.1

Let \(u \in B_\rho ({\bar{u}})\) be arbitrary. Let \(\varphi \in {H^2(\varOmega )\cap H^1_0(\varOmega )}\) and \(\varphi _h \in Y_h\) denote the solutions of (2.11) and (4.11), respectively. Then, there exist constants \(k_2\) and \(k_\infty \) such that

$$\begin{aligned}&\Vert \varphi - \varphi _h\Vert _{L^2(\varOmega )} + h\Vert \varphi - \varphi _h\Vert _{H_0^1(\varOmega )} \le k_2h^2\quad \forall u \in B_\rho ({\bar{u}}). \end{aligned}$$
(5.1)
$$\begin{aligned}&\Vert \varphi - \varphi _h\Vert _{L^\infty (\varOmega )} \le k_\infty h^{2 - \frac{n}{2}}\quad \forall u \in B_\rho ({\bar{u}}). \end{aligned}$$
(5.2)

Proof

First, we recall that \(\Vert y_h(u)\Vert _{C({\bar{\varOmega }})} \le \rho ^*\) \(\forall u \in B_\rho ({\bar{u}})\). Hence, with (2.6) we get

$$\begin{aligned} \Big |\frac{\partial f}{\partial y}(x,y_h(u))\Big | + \Big |\frac{\partial ^2f}{\partial y^2}(x,y_h(u))\Big | \le C_{f,\rho ^*}\quad \forall u \in B_\rho ({\bar{u}}). \end{aligned}$$
(5.3)

Now we introduce the function \(\varphi ^h \in H^2(\varOmega ) \cap H_0^1(\varOmega )\), the unique solution of

$$\begin{aligned} \left\{ \begin{array}{l}\displaystyle A^*\varphi ^h - {\text {div}}[b(x)\varphi ^h] + \frac{\partial f}{\partial y}(x,y_h(u))\varphi ^h = y_h(u) - y_d \text { in } \varOmega ,\\ \varphi ^h = 0\text { on } \varGamma .\end{array}\right. \end{aligned}$$
(5.4)

From Lemma 3.2 we deduce the existence of a constant \(C_1\) such that \(\Vert \varphi ^h\Vert _{H^2(\varOmega )} \le C_1\) \(\forall u \in B_\rho ({\bar{u}})\). From the continuous embedding \(H^2(\varOmega ) \subset C({\bar{\varOmega }})\) we get that \(\Vert \varphi ^h\Vert _{C({\bar{\varOmega }})} \le C_2\) \(\forall u \in B_\rho ({\bar{u}})\). On the other hand, from (3.18) we obtain for some constant \(K_2\)

$$\begin{aligned} \Vert y_u - y_h(u)\Vert _{L^2(\varOmega )} \le K_2 h^2\quad \forall u \in B_\rho ({\bar{u}}). \end{aligned}$$
(5.5)

Now, we write \(\varphi - \varphi _h = (\varphi - \varphi ^h) + (\varphi ^h - \varphi _h(u))\) and we estimate both summands. For the first summand we subtract the equations (2.11) and (5.4):

$$\begin{aligned}&A^*(\varphi - \varphi ^h) - {\text {div}}(b(x)(\varphi -\varphi ^h)) + \frac{\partial f}{\partial y}(x,y_u)(\varphi - \varphi ^h) \\&\quad = (y_u - y_h(u)) + \left[ \frac{\partial f}{\partial y}(x,y_h(u)) - \frac{\partial f}{\partial y}(x,y_u)\right] \varphi ^h. \end{aligned}$$

Using the mean value theorem, we have that there exists a measurable function \(0<\theta (x)<1\) such that, if we name \({\hat{y}} = y_h(u)+\theta (y_u-y_h(u))\), then

$$\begin{aligned}\frac{\partial f}{\partial y}(x,y_h(u)) - \frac{\partial f}{\partial y}(x,y_u) = \frac{\partial ^2 f}{\partial y^2}(x,{\hat{y}})(y_h(u)-y_u).\end{aligned}$$

Using Lemma 3.2, (5.3), the boundedness of \(\{\varphi ^h\}_h\) in \(C({\bar{\varOmega }})\), and (5.5) we infer

$$\begin{aligned}&\Vert \varphi - \varphi ^h\Vert _{H^2(\varOmega )}\nonumber \\&\quad \le C_3\Big (\Vert y_u - y_h(u)\Vert _{L^2(\varOmega )} + \Big \Vert \frac{\partial f}{\partial y}(x,y_h(u)) - \frac{\partial f}{\partial y}(x,y_u)\Big \Vert _{L^2(\varOmega )}\Vert \varphi ^h\Vert _{C({\bar{\varOmega }})}\Big )\nonumber \\&\quad \le C_3\Big ( 1 + \Big \Vert \frac{\partial ^2 f}{\partial y^2}(x,{\hat{y}})\Big \Vert _{L^\infty (\varOmega )}\Vert \varphi ^h\Vert _{C({\bar{\varOmega }})}\Big ) \Vert y_u - y_h(u)\Vert _{L^2(\varOmega )}\nonumber \\&\quad \le C_3(1 + C_{f,\rho ^*}C_2)\Vert y_u - y_h(u)\Vert _{L^2(\varOmega )} \le C_3(1 + C_{f,\rho ^*}C_2)K_2h^2\ \ \forall u \in B_\rho ({\bar{u}}). \end{aligned}$$
(5.6)

To estimate the term \(\varphi ^h - \varphi _h(u)\) we observe that the equation satisfied by \(\varphi _h(u)\) is the corresponding discretization of the equation satisfied by \(\varphi ^h\). Both equations are linear. Hence, we can use [32] to deduce that

$$\begin{aligned}&\Vert \varphi ^h - \varphi _h(u)\Vert _{L^2(\varOmega )} + h\Vert \varphi ^h - \varphi _h(u)\Vert _{H_0^1(\varOmega )} \le C_4h^2\quad \forall u \in B_\rho ({\bar{u}}).\end{aligned}$$

Using the estimate in \(L^2(\varOmega )\), interpolation error estimates, and an inverse inequality we obtain

$$\begin{aligned}&\Vert \varphi ^h - \varphi _h(u)\Vert _{L^\infty (\varOmega )} \le C_5h^{2 - \frac{n}{2}}\quad \forall u \in B_\rho ({\bar{u}}). \end{aligned}$$
(5.7)

Alternatively, the reader may apply Theorem 3.6 with a linear function f, replacing the equation by its adjoint. Finally, (5.1) and (5.2) follow from the above estimates and (5.6).\(\square \)

Error estimates can be deduced from the abstract error estimate of [17, Theorem 2.14].

Lemma 5.2

Let \({\bar{u}}\) be a local minimizer of (P) with associated state \({\bar{y}}\) and satisfying (2.16). Let \(\{({\bar{y}}_h,{\bar{u}}_h)\}\) be a sequence of local minimizers of the problems \(({\mathcal {P}}_h)\) converging strongly to \(({\bar{y}},{\bar{u}})\) in \(H_0^1(\varOmega ) \times L^2(\varOmega )\) (see Theorem 4.2 and Remark 4.3). Then, there exists \(h_0>0\) such that for every \(h < h_0\)

$$\begin{aligned} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\varOmega )}\le C\big [ h^4 + \Vert {\bar{u}}-u_h\Vert ^2_{L^2(\varOmega )} + J'({\bar{u}})(u_h-{\bar{u}})\big ]^{1/2}\ \forall u_h\in U_{\mathrm{ad},h}. \end{aligned}$$

Proof

We use [17, Theorem 2.14]. To this end, we have to check the Assumptions (A2), (A3) and (A7) of [17]. First we observe that there exist positive constants r, \(M_1\) and \(M_2\) such that for all \(v,v_1,v_2\in L^2(\varOmega )\) and all \(u\in U_{\mathrm{ad}}\) such that \(\Vert {\bar{u}}-u\Vert _{L^2(\varOmega )}<r\)

$$\begin{aligned} |J'(u)v|\le M_1\Vert v\Vert _{L^2(\varOmega )},\ |J''(u)(v_1,v_2)|\le M_2\Vert v_1\Vert _{L^2(\varOmega )}\Vert v_2\Vert _{L^2(\varOmega )}. \end{aligned}$$

Moreover, for every \(\varepsilon >0\) there exists \(\delta >0\) such that for all \(u_1,u_2\in L^\infty (\varOmega )\) with \(\Vert u_i-{\bar{u}}\Vert _{L^\infty (\varOmega )}<r\), \(i=1,2\), and \(\forall v \in L^2(\varOmega )\)

$$\begin{aligned} \Vert u_1-u_2\Vert _{L^\infty (\varOmega )}<\delta \Rightarrow \left\{ \begin{array}{c} |(J'(u_1)-J'(u_2))v|\le \varepsilon \Vert v\Vert _{L^2(\varOmega )} \\ |(J''(u_1)-J''(u_2))v^2|\le \varepsilon \Vert v\Vert _{L^2(\varOmega )}^2. \end{array} \right. \end{aligned}$$

Hence, (A2) holds. Assumption (A3) says that for any element \(u \in U_{\mathrm{ad}}\) there exists a family \(\{u_h\}_{h > 0}\) with \(u_h \in U_{\mathrm{ad},h}\) such that \(\Vert u - u_h\Vert _{L^2(\varOmega )} \rightarrow 0\) when \(h \rightarrow 0\), which is well known to be satisfied for our choices of \(U_{\mathrm{ad},h}\). Finally, estimate (5.1) implies

$$\begin{aligned} |(J'_h(u)-J'(u))(u_h-{\bar{u}})| =&\, \Big |\int _\varOmega (\varphi _h(u)-\varphi _u)(u_h-{\bar{u}})\, dx\Big |\\ \le&\, \Vert \varphi _h(u)-\varphi _u\Vert _{L^2(\varOmega )}\Vert u_h-{\bar{u}}\Vert _{L^2(\varOmega )}\\ \le&\, k_2 h ^2 \Vert u_h-{\bar{u}}\Vert _{L^2(\varOmega )}. \end{aligned}$$

Therefore, Assumption (A7) holds with \(\varepsilon _h = h^2\). Then, [17, Theorem 2.14] yields the existence of a constant C independent of h such that

$$\begin{aligned} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\varOmega )}\le C\big [h^4 + \Vert {\bar{u}}-u_h\Vert ^2_{L^2(\varOmega )} + J'({\bar{u}})(u_h-{\bar{u}})\big ]^{1/2}\ \forall u_h\in U_{\mathrm{ad},h}\ \forall h<h_0, \end{aligned}$$

and the result follows.\(\square \)

Next, we obtain error estimates for unconstrained problems.

Theorem 5.3

Suppose \(U_{\mathrm{ad}}= L^2(\varOmega )\) and set \({\mathcal {U}}_h = {\mathcal {U}}_h^i\), \(i=0,1\). Let \({\bar{u}}\) be a local minimizer of (P) with associated state \({\bar{y}}\) and satisfying (2.16). Let \(\{({\bar{y}}_h,{\bar{u}}_h)\}\) be a sequence of local minimizers of the problems \(({\mathcal {P}}_h)\) converging strongly to \(({\bar{y}},{\bar{u}})\) in \(H_0^1(\varOmega ) \times L^2(\varOmega )\). Then there exists \(h_0>0\) such that

$$\begin{aligned} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\varOmega )}\le C h^{1+i}\quad \forall h < h_0. \end{aligned}$$

Proof

We apply Lemma 5.2. In this case \(J'({\bar{u}})=0\). For \(i=0\) we take \(u_h=\varPi _h {\bar{u}}\), and for \(i=1\) we take \(u_h=I_h{\bar{u}}\), the nodal interpolation of \({\bar{u}}\). The result then follows from the approximation properties of the projection in the \(L^2(\varOmega )\) sense and of the nodal interpolation, respectively.\(\square \)
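In numerical experiments, rates such as \(h^{1+i}\) are usually checked through the experimental order of convergence between consecutive meshes. The following minimal sketch shows that computation; the function name is hypothetical and the error values are synthetic placeholders generated from an exact \(h^2\) law, not results from this paper.

```python
import math

def observed_orders(hs, errs):
    """Experimental order of convergence between consecutive mesh sizes."""
    return [
        math.log(errs[k] / errs[k + 1]) / math.log(hs[k] / hs[k + 1])
        for k in range(len(hs) - 1)
    ]

# Synthetic data: errors that behave exactly like 0.7 * h^2 (the case i = 1).
hs = [2.0 ** (-k) for k in range(3, 8)]
errs = [0.7 * h ** 2 for h in hs]

orders = observed_orders(hs, errs)
print(orders)
```

With measured errors \(\Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\varOmega )}\) in place of the synthetic ones, the computed orders would be expected to approach \(1+i\) as h decreases.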

In the following result, we obtain error estimates for constrained problems.

Theorem 5.4

Suppose \(-\infty < \alpha \) or \(\beta <\infty \) and set \({\mathcal {U}}_h = {\mathcal {U}}_h^i\), \(i=0,1\). Let \({\bar{u}}\) be a local minimizer of (P) with associated state \({\bar{y}}\) and satisfying (2.16). Let \(\{({\bar{y}}_h,{\bar{u}}_h)\}\) be a sequence of local minimizers of the problems \(({\mathcal {P}}_h)\) converging strongly to \(({\bar{y}},{\bar{u}})\) in \(H_0^1(\varOmega ) \times L^2(\varOmega )\). Then there exists \(h_0>0\) such that

$$\begin{aligned} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\varOmega )}\le C h \quad \forall h < h_0. \end{aligned}$$

Proof

We apply again Lemma 5.2 with \(u_h=\varPi _h{\bar{u}}\in U_{\mathrm{ad},h}\), where we recall that \(\varPi _h\) is either the linear projection in the \(L^2(\varOmega )\) sense onto \({\mathcal {U}}_h^0\) or Carstensen’s quasi-interpolation operator, depending on the approximation space for the controls. In both cases we have that \(\Vert {\bar{u}}-u_h\Vert _{L^2(\varOmega )} \le C h\); see [21, Lemma 4.3] for Carstensen’s quasi-interpolation operator. For the last term we have

$$\begin{aligned}J'({\bar{u}})(u_h-{\bar{u}})&= \int _\varOmega ({\bar{\varphi }}+\nu {\bar{u}})(u_h-{\bar{u}})dx\le C\Vert {\bar{\varphi }}+\nu {\bar{u}}\Vert _{H^1_0(\varOmega )}\Vert u_h-{\bar{u}}\Vert _{H^{-1}(\varOmega )}\\&\le C h^2,\end{aligned}$$

where the estimate \(\Vert u_h-{\bar{u}}\Vert _{H^{-1}(\varOmega )} \le C h^2\) follows by duality for the \(L^2(\varOmega )\) projection and is proved in [21, Lemma 4.4] for Carstensen’s quasi-interpolation operator. \(\square \)
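For the reader's convenience, the way the three terms of Lemma 5.2 combine can be written out explicitly. Here \(C_a\) and \(C_b\) are labels introduced only for this sketch: \(C_a\) denotes the constant in \(\Vert {\bar{u}}-u_h\Vert _{L^2(\varOmega )} \le C_a h\) and \(C_b\) the constant in \(J'({\bar{u}})(u_h-{\bar{u}}) \le C_b h^2\). Since \(h^4 \le h^2\) for \(h \le 1\), we get

$$\begin{aligned} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\varOmega )}\le C\big [h^4 + C_a^2h^2 + C_bh^2\big ]^{1/2} \le C\big (1 + C_a^2 + C_b\big )^{1/2}h \quad \forall h < \min \{h_0,1\}. \end{aligned}$$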

Finally, we deduce error estimates in the norm of \(L^\infty (\varOmega )\). We start with a result for the adjoint state.

Corollary 5.5

Let us suppose that the assumptions of Theorem 5.3 or Theorem 5.4 are satisfied and let \({\bar{\varphi }}\in H^2(\varOmega )\cap H^1_0(\varOmega )\) and \({\bar{\varphi }}_h\in Y_h\) be the solutions of (2.13) and (4.13). Then there exist \(h_0>0\) and a constant C independent of h such that

$$\begin{aligned} \Vert {\bar{\varphi }}-{\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )}\le C h^{2-n/2} \quad \forall h < h_0. \end{aligned}$$

Proof

By the triangle inequality

$$\begin{aligned} \Vert {\bar{\varphi }}-{\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )}\le \Vert {\bar{\varphi }}-\varphi _{{\bar{u}}_h}\Vert _{L^\infty (\varOmega )}+ \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )}. \end{aligned}$$

Using either Theorem 5.3 or Theorem 5.4, we have that there exists some \(h_0>0\) such that \({\bar{u}}_h\in {\bar{B}}_\rho ({\bar{u}})\) for all \(h<h_0\). Therefore, we can use (5.2) to obtain

$$\begin{aligned} \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )}\le k_\infty h^{2-n/2}\end{aligned}$$

Using the same technique as in the proof of Theorem 5.1 and the Sobolev embedding \(H^2(\varOmega )\hookrightarrow L^\infty (\varOmega )\), which is valid for \(n\le 3\), we have that

$$\begin{aligned} \Vert {\bar{\varphi }}-\varphi _{{\bar{u}}_h}\Vert _{L^\infty (\varOmega )} \le C_3(1 + C_{f,\rho ^*}C_2)\Vert {\bar{y}} - y_{{\bar{u}}_h}\Vert _{L^2(\varOmega )}. \end{aligned}$$

Next, we use [12, Lemma 3.5] and either the estimate proved in Theorem 5.4 or the ones proved in Theorem 5.3, depending on whether we have control constraints or not. Since \({\bar{u}}_h\in {\bar{B}}_\rho ({\bar{u}})\) for all \(h<h_0\), we know that there is a constant \(M_{{\bar{B}}_\rho ({\bar{u}})}\) such that

$$\begin{aligned} \Vert {\bar{y}} - y_{{\bar{u}}_h}\Vert _{L^2(\varOmega )}\le M_{{\bar{B}}_\rho ({\bar{u}})}\Vert {\bar{u}} - {\bar{u}}_h\Vert _{L^2(\varOmega )}\le Ch.\end{aligned}$$

The result follows from the previous estimates, just taking into account that \(2-n/2\le 1\). \(\square \)
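Spelled out, the combination of the three estimates of the proof reads as follows, where C denotes the constant in the last inequality \(\Vert {\bar{y}} - y_{{\bar{u}}_h}\Vert _{L^2(\varOmega )}\le Ch\) and we use that \(h \le h^{2-n/2}\) whenever \(h \le 1\):

$$\begin{aligned} \Vert {\bar{\varphi }}-{\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )}\le C_3(1 + C_{f,\rho ^*}C_2)\,Ch + k_\infty h^{2-n/2} \le \big [C_3(1 + C_{f,\rho ^*}C_2)\,C + k_\infty \big ]h^{2-n/2}\quad \forall h < \min \{h_0,1\}. \end{aligned}$$

In particular, the exponent is \(2-n/2 = 1\) for \(n=2\) and \(2-n/2 = 1/2\) for \(n=3\).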

To deduce error estimates for the control variable in \(L^\infty (\varOmega )\), we replace Assumption (2.2) and the assumption on the target \(y_d\) by the following one, which is not very restrictive in practice:

$$\begin{aligned} b\in L^{{\bar{p}}}(\varOmega ) \text{ with } {\bar{p}}>n,\ {\text {div}}b,\, y_d \in L^q(\varOmega ) \text{ with } q>2. \end{aligned}$$
(5.8)

Using that \(\varOmega \) is convex, we know that there exists some \(2<p\le \min \{{\bar{p}}, q\}\) such that \({\bar{\varphi }} \in W^{2,p}(\varOmega )\); see, e.g., [25] for \(n=2\) and [19, Corollary 3.12] for \(n=3\).

Corollary 5.6

Let \({\bar{u}}\) be a local minimizer of (P) with associated state \({\bar{y}}\) and satisfying (2.16). Let \(\{({\bar{y}}_h,{\bar{u}}_h)\}\) be a sequence of local minimizers of the problems \(({\mathcal {P}}_h)\) converging strongly to \(({\bar{y}},{\bar{u}})\) in \(H_0^1(\varOmega ) \times L^2(\varOmega )\). Suppose further that one of the following conditions is satisfied:

  1. \({\mathcal {U}}_h = {\mathcal {U}}_h^1\) and \(U_{\mathrm{ad}}= L^2(\varOmega )\);

  2. \({\mathcal {U}}_h = {\mathcal {U}}_h^0\) and (5.8) holds;

  3. \({\mathcal {U}}_h = {\mathcal {U}}_h^1\), \(n=2\) and (5.8) holds.

Then there exist \(h_0>0\) and a constant C independent of h such that

$$\begin{aligned} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^\infty (\varOmega )}\le C h^{2-n/2} \quad \forall h < h_0. \end{aligned}$$

Proof

Case 1: \({\mathcal {U}}_h = {\mathcal {U}}_h^1\) and \(U_{\mathrm{ad}}= L^2(\varOmega )\). The optimality conditions (2.14) and (4.14) and Corollary 5.5 lead directly to

$$\begin{aligned} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^\infty (\varOmega )} = \frac{1}{\nu } \Vert {\bar{\varphi }}-{\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )}\le C h^{2-n/2} \quad \forall h < h_0. \end{aligned}$$

Case 2: \({\mathcal {U}}_h = {\mathcal {U}}_h^0\) and (5.8) holds. In this case (2.14) and (4.14) lead to

$$\begin{aligned} {\bar{u}}(x) = {\text {Proj}}_{[\alpha ,\beta ]}\left( -\frac{1}{\nu }{\bar{\varphi }}(x)\right) ,\ {\bar{u}}_T = {\text {Proj}}_{[\alpha ,\beta ]}\left( -\frac{1}{|T|\nu }\int _T {\bar{\varphi }}_h(x)dx\right) \ \forall T\in {\mathcal {T}}_h, \end{aligned}$$

where \({\text {Proj}}_{[\alpha ,\beta ]}(s)=\max (\alpha ,\min (\beta ,s))\) and \({\bar{u}}_T\) is the constant value of \({\bar{u}}_h\) on the element T. From the mean value theorem for integrals, for every element \(T\in {\mathcal {T}}_h\), we deduce the existence of some \(x_T\in T\) such that

$$\begin{aligned} \int _T {\bar{\varphi }}_h(x)dx = |T| {\bar{\varphi }}_h(x_T). \end{aligned}$$

Since \({\text {Proj}}_{[\alpha ,\beta ]}(s)\) is Lipschitz continuous with constant 1, we have that for every \(T\in {\mathcal {T}}_h\) and almost every \(x\in T\),

$$\begin{aligned} |{\bar{u}}(x)-{\bar{u}}_h(x)| = |{\bar{u}}(x)- {\bar{u}}_T| \le \frac{1}{\nu }|{\bar{\varphi }}(x) - {\bar{\varphi }}_h(x_T)|. \end{aligned}$$
(5.9)
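The two ingredients used to obtain (5.9), the element-wise projection formula and the Lipschitz property of \({\text {Proj}}_{[\alpha ,\beta ]}\), can be illustrated with a short sketch. It assumes a one-dimensional mesh (the paper works with \(n=2\) or \(n=3\)) and a continuous piecewise linear discrete adjoint state, so that the mean over an element is the average of the two endpoint values; the function names and the numerical values are hypothetical.

```python
import numpy as np

def proj(s, alpha, beta):
    """Pointwise projection onto [alpha, beta]: max(alpha, min(beta, s))."""
    return np.maximum(alpha, np.minimum(beta, s))

def piecewise_constant_control(phi_nodes, nu, alpha, beta):
    """Element values u_T = Proj(-(1/(|T| nu)) int_T phi_h) on a 1D mesh.

    For a P1 function on an interval, the mean over an element equals the
    average of its two endpoint values, so no quadrature is needed here.
    """
    element_means = 0.5 * (phi_nodes[:-1] + phi_nodes[1:])
    return proj(-element_means / nu, alpha, beta)

alpha, beta = -0.5, 0.5

# Lipschitz property with constant 1: |Proj(s) - Proj(t)| <= |s - t|.
rng = np.random.default_rng(0)
s, t = rng.normal(size=1000), rng.normal(size=1000)
gap = np.abs(proj(s, alpha, beta) - proj(t, alpha, beta)) - np.abs(s - t)

# Nodal values of a made-up discrete adjoint state on three elements.
phi_nodes = np.array([0.0, -0.04, -0.02, 0.06])
u_T = piecewise_constant_control(phi_nodes, nu=0.05, alpha=alpha, beta=beta)
print(gap.max(), u_T)
```

In this example the middle element is the only one where the upper bound \(\beta \) is active.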

Since \({\bar{\varphi }}\in W^{2,p}(\varOmega )\) for some \(p>2\), by the Sobolev imbedding theorem, also \({\bar{\varphi }}\in C^{0,\delta }({\bar{\varOmega }})\) for \(\delta = 1\) if \(n=2\) and some \(1/2<\delta \le 1\) depending on p if \(n=3\). Therefore, there exists a constant \(\varLambda _{{\bar{\varphi }}}>0\) such that

$$\begin{aligned} |{\bar{\varphi }}(x) - {\bar{\varphi }}_h(x_T)|\le & {} |{\bar{\varphi }}(x) - {\bar{\varphi }}(x_T)| + |{\bar{\varphi }}(x_T) - {\bar{\varphi }}_h(x_T)| \nonumber \\\le & {} \varLambda _{{\bar{\varphi }}} h^\delta + \Vert {\bar{\varphi }} - {\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )}. \end{aligned}$$
(5.10)

From (5.9), (5.10), Corollary 5.5 and the fact that \(2-n/2\le \delta \), we have that

$$\begin{aligned} |{\bar{u}}(x)-{\bar{u}}_h(x)| \le C h^{2-n/2}\ \text{ for } \text{ a.e. } x\in \varOmega , \end{aligned}$$

and the result follows.
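For the reader's convenience, the last step can be written out explicitly. This is only a sketch: it assumes \(h<1\), and \(C_\varphi \) is a label introduced here for the constant of the \(L^\infty (\varOmega )\)-estimate of the adjoint state in Corollary 5.5. Since \(h^\delta \le h^{2-n/2}\) whenever \(h<1\) and \(2-n/2\le \delta \), combining (5.9) and (5.10) gives

$$\begin{aligned} |{\bar{u}}(x)-{\bar{u}}_h(x)| \le \frac{1}{\nu }\left( \varLambda _{{\bar{\varphi }}} h^\delta + C_\varphi h^{2-n/2}\right) \le \frac{\varLambda _{{\bar{\varphi }}} + C_\varphi }{\nu }\, h^{2-n/2}. \end{aligned}$$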

Case 3 \({\mathcal {U}}_h = {\mathcal {U}}_h^1\), \(n=2\) and (5.8) holds. If there are no control constraints, we are in the situation of Case 1, so we assume that \(-\infty <\alpha \) or \(\beta < +\infty \). In this case, (4.14) implies that \({\bar{u}}_h\) is the projection in the \(L^2(\varOmega )\)-sense of \(-\frac{1}{\nu }{\bar{\varphi }}_h\) onto \(U_{\mathrm{ad},h}\), but we do not have a pointwise projection formula.

The estimate follows from the results of [28, Sections 3 and 4]. Although that reference deals with linear equations, the proof only requires the following ingredients: \(L^2(\varOmega )\)-error estimates for the control, which we have from Theorem 5.4; \(L^\infty (\varOmega )\)-error estimates and Lipschitz regularity for the adjoint state, which we have from Corollary 5.5 and assumption (5.8); and the fact that the discrete optimal control is the projection in the \(L^2(\varOmega )\)-sense of \(-\frac{1}{\nu }{\bar{\varphi }}_h\). Notice also that the technique of proof cannot be carried over to \(n=3\), since the analogue of [28, Lemma 3.5] for \(n=3\) does not hold. \(\square \)

Remark 5.7

Under additional regularity conditions, higher orders of convergence can be proved. Indeed, let us suppose that \(\varphi _{u}\in W^{2,p}(\varOmega )\) for some \(p>n\) if \(u\in L^\infty (\varOmega )\). For \(n=2\), condition (5.8) is sufficient for this regularity, while for \(n=3\) we have to assume that \(b, {\text {div}}b,y_d\in L^{{\bar{p}}}(\varOmega )\) for some \({\bar{p}}>3\) and also that the internal angles of \(\varOmega \) are small enough; see [19]. Using the same technique as in the proof of Theorem 5.1 together with [33, Theorem 2.1], [6, Theorem 2.2], and the fact that \(\{{\bar{u}}_h\}\) is bounded in \(L^\infty (\varOmega )\), we obtain

$$\begin{aligned}&\Vert \varphi _{{\bar{u}}_h} - {\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )} \le C h^{2 - \frac{n}{p}} |\log h| \Vert \varphi _{{\bar{u}}_h}\Vert _{W^{2,p}(\varOmega )} \le C p |\log h| h^{2 - \frac{n}{p}} . \end{aligned}$$

From this estimate, it can be proved, as in Corollaries 5.5 and 5.6, that

$$\begin{aligned} \Vert {\bar{u}} - {\bar{u}}_h\Vert _{L^\infty (\varOmega )} =&\frac{1}{\nu }\Vert {\bar{\varphi }} - {\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )} \le C p|\log h| h^{2 - \frac{n}{p}}&\text{ if } {\mathcal {U}}_h = {\mathcal {U}}_h^1 \text{ and } U_{\mathrm{ad}}= L^2(\varOmega )\\ \Vert {\bar{u}} - {\bar{u}}_h\Vert _{L^\infty (\varOmega )} \le&\varLambda _{{\bar{\varphi }}} h + \frac{1}{\nu }\Vert {\bar{\varphi }} - {\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )} \le C h&\text{ if } {\mathcal {U}}_h = {\mathcal {U}}_h^0, \end{aligned}$$

where \(\varLambda _{{\bar{\varphi }}} \) is the Lipschitz constant of \({\bar{\varphi }}\).

Assume that \(\varphi _{u}\in W^{2,p}(\varOmega )\) for all \(p<+\infty \) if \(u\in L^\infty (\varOmega )\). If further \({\mathcal {U}}_h = {\mathcal {U}}_h^1\) and \(U_{\mathrm{ad}}= L^2(\varOmega )\), then, setting \(p=|\log h|\) in the above inequality, we obtain

$$\begin{aligned}\Vert {\bar{u}} - {\bar{u}}_h\Vert _{L^\infty (\varOmega )} = \frac{1}{\nu }\Vert {\bar{\varphi }} - {\bar{\varphi }}_h\Vert _{L^\infty (\varOmega )} \le C |\log h|^2 h^2.\end{aligned}$$
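The effect of the choice \(p=|\log h|\) can be checked by a short computation. The following sketch assumes \(0<h<e^{-n}\), so that \(|\log h| = -\log h\) and \(p>n\), and it assumes that the constant C in the estimate above does not depend on p:

$$\begin{aligned} h^{-\frac{n}{p}} = \exp \left( -\frac{n}{|\log h|}\log h\right) = e^{n}, \qquad \text{so that}\qquad C p|\log h|\, h^{2 - \frac{n}{p}} = C e^{n} |\log h|^2 h^2. \end{aligned}$$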

See [11, Lemma 3] for the proof of a similar result. This high regularity can be achieved, for instance, if \(b, {\text {div}}b,\,y_d\in L^{{\bar{p}}}(\varOmega )\) for all \({\bar{p}}<+\infty \) and \(\varOmega \) is a rectangle or a rectangular parallelepiped or its boundary \(\varGamma \) is of class \(C^{1,1}\).

Also, when \({\mathcal {U}}_h = {\mathcal {U}}_h^1\) and \(U_{\mathrm{ad}}\subsetneq L^2(\varOmega )\), the order of convergence usually observed in experiments for the \(L^2(\varOmega )\)-error of the control is \(O(h^{3/2})\). A detailed explanation of this phenomenon can be found in [10, Section 10]. In our case, this order is achieved if \(p>n\). The proof is based on the assumption that the measure of the set \(\cup \{T\in {\mathcal {T}}_h:\,{\bar{u}}\not \in H^2(T)\}\) is of order h. This assumption is not restrictive and is usually satisfied in practice; see [27].

6 Numerical experiments

We are going to build an example with an explicitly known local solution satisfying the second order sufficient optimality condition (2.16).

Let us consider \(\varOmega = (0,1)^n\), \(Ay = -\varDelta y\), \(f(x,y)=\exp (y)\), \(\nu = 1\) and \(b(x) = (B(x_1),0)\) if \(n=2\) or \(b(x) = (B(x_1),0,0)\) if \(n=3\), where \(B(x) = 5x^{3/4}(1-2x)\). With these choices, Assumptions 1, 2 and 3 are satisfied, but notice that Assumption 2’ does not hold. The lower control constraint will be \(\alpha =-\infty \), and we will investigate both the case without an upper constraint, \(\beta = \infty \), and the constrained case \(\beta = 2^{-2n-1}\). To define the target state \(y_d\), we first define \({\bar{\varphi }}(x) = -\prod _{i=1}^n x_i(1-x_i)\) and \({\bar{u}}(x) = {\text {Proj}}_{[\alpha ,\beta ]}(-{\bar{\varphi }}(x)/\nu )\). Next, we take \({\bar{y}}\in H^2(\varOmega )\cap H^1_0(\varOmega )\), the solution of the state equation for \(u={\bar{u}}\), and set \(y_d(x) = \varDelta {\bar{\varphi }}(x) + {\text {div}}(b(x){\bar{\varphi }}(x)) + {\bar{y}}(x) - \frac{\partial f}{\partial y} (x,{\bar{y}}(x)){\bar{\varphi }}(x)\). (In practice, we do not have \({\bar{y}}\), but we can use \(y_h({\bar{u}})\) to compute a good approximation of \(y_d\).)
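The data of this example are simple enough to evaluate pointwise. The following Python sketch is only an illustration and is not the code used for the experiments; the function names `B`, `phi_bar`, `proj` and `u_bar` are chosen here for convenience. It evaluates the convection profile, the adjoint state and the control, and shows that \(-{\bar{\varphi }}\) attains its maximum \(2^{-2n}\) at the centre of the cube, so the bound \(\beta = 2^{-2n-1}\) is half of that maximum and the upper constraint is active near the centre.

```python
import math

def B(x1):
    # Convection profile from the text: B(x) = 5 x^{3/4} (1 - 2x)
    return 5.0 * x1 ** 0.75 * (1.0 - 2.0 * x1)

def phi_bar(x):
    # Adjoint state of the example: -prod_i x_i (1 - x_i)
    return -math.prod(xi * (1.0 - xi) for xi in x)

def proj(s, alpha, beta):
    # Pointwise projection onto [alpha, beta]
    return max(alpha, min(beta, s))

def u_bar(x, alpha=-math.inf, beta=math.inf, nu=1.0):
    # Control of the example: Proj_[alpha,beta](-phi_bar(x)/nu)
    return proj(-phi_bar(x) / nu, alpha, beta)

if __name__ == "__main__":
    for n in (2, 3):
        centre = (0.5,) * n
        beta = 2.0 ** (-2 * n - 1)
        # unconstrained value at the centre, then the value with the upper bound
        print(n, u_bar(centre), u_bar(centre, beta=beta))
```

The target \(y_d\) is not computed in this sketch, since it requires the state \({\bar{y}}\), which is only available through a discrete solve.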

With these choices, it is clear that \(({\bar{u}},{\bar{y}},{\bar{\varphi }})\) satisfies the first order optimality conditions (2.12)–(2.14). From (2.10), we have

$$\begin{aligned}J''({\bar{u}})v^2 = \int _\varOmega \left( 1-{\bar{\varphi }}(x) e^{{\bar{y}}(x)}\right) z_v^2(x)dx +\int _\varOmega v^2(x) dx \ \forall v\in L^2(\varOmega ).\end{aligned}$$

Since \({\bar{\varphi }}(x) < 0\) for all \(x\in \varOmega \), the condition (2.16) holds and hence \({\bar{u}}\) is a local solution of (P).
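The positivity behind this claim can be made explicit. The following is a sketch under the assumption that (2.16) asks for \(J''({\bar{u}})v^2>0\) for the nonzero directions v of the relevant cone. Since \({\bar{\varphi }}<0\) and \(e^{{\bar{y}}}>0\) in \(\varOmega \), the weight \(1-{\bar{\varphi }}e^{{\bar{y}}}\) is bounded from below by 1, and hence

$$\begin{aligned} J''({\bar{u}})v^2 \ge \int _\varOmega z_v^2(x)dx + \int _\varOmega v^2(x) dx \ge \Vert v\Vert ^2_{L^2(\varOmega )} \quad \forall v\in L^2(\varOmega ). \end{aligned}$$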

The problem is discretized using the finite element method. To solve the discrete problems, we use a semi-smooth Newton method as described in [10, Section 14]. The success of the conjugate gradient method used to solve the unconstrained quadratic programs arising in the optimization process is an indication that the solutions of the finite dimensional problems are strict local minima.

The mesh of size \(h_i=2^{-i}\) is obtained by splitting \(\varOmega \) into \(2^{in}\) congruent cells obtained by translation of \((0,h_i)^n\) and dividing each cell into n! \(n\)-simplices. In this family of meshes, the experimental order of convergence for the error of the variable \(z\in \{u,y,\varphi \}\) measured in the norm of \(X =L^2(\varOmega )\) or \(L^\infty (\varOmega )\) can be computed as

$$\begin{aligned}{EOC_i = \log _2(\Vert {\bar{z}}_{h_{i-1}}-{\bar{z}}\Vert _X)- \log _2(\Vert {\bar{z}}_{h_{i}}-{\bar{z}}\Vert _X).} \end{aligned}$$
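As an illustration of this formula, the following Python sketch computes the experimental order of convergence from a list of errors on consecutive meshes, and counts the simplices of the mesh described above. The helper names `eoc` and `num_simplices` are hypothetical, and the error values are synthetic numbers of the form \(c\,h_i^2\) chosen only to exercise the formula; they are not results from the experiments.

```python
import math

def eoc(errors):
    # errors[k] is the error on the k-th mesh of a sequence with h halved
    # at each step; returns log2(e_{i-1}) - log2(e_i) for consecutive pairs
    return [math.log2(e_prev) - math.log2(e_cur)
            for e_prev, e_cur in zip(errors, errors[1:])]

def num_simplices(i, n):
    # 2^{i n} congruent cells, each divided into n! simplices
    return 2 ** (i * n) * math.factorial(n)

if __name__ == "__main__":
    # Synthetic errors behaving like 0.5 * h_i^2 with h_i = 2^{-i} (illustration only)
    synthetic = [0.5 * (2.0 ** -i) ** 2 for i in range(3, 7)]
    print(eoc(synthetic))
    # Mesh sizes for the finest levels mentioned in the text
    print(num_simplices(8, 2), num_simplices(5, 3))
```

For errors that decay exactly like \(h^2\), each entry returned by `eoc` equals 2 up to rounding, which is the behaviour the formula is designed to detect.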

We report on the \(L^2(\varOmega )\) and \(L^\infty (\varOmega )\) experimental order of convergence of the error for the control, the state, and the adjoint state for \(i=8\) if \(n=2\), and \(i=5\) if \(n=3\). We summarize the results in Tables 1, 2 and 3.

Table 1 Experimental order of convergence for the control error
Table 2 Experimental order of convergence for the state error
Table 3 Experimental order of convergence for the adjoint state error