1 Introduction

Let us consider a domain \(\Omega \subset {\mathbb {R}}^2\) with a polygonal boundary \(\Gamma \). We are concerned with the Neumann boundary control problem

$$\begin{aligned} \mathrm{(P)} \quad \min _{u \in U_\textrm{ad}} J(u):= \frac{1}{2}\int _\Omega (y_u(x) - y_d(x))^2\, \,\textrm{d}x+ \frac{\nu }{2}\int _\Gamma u^2(x)\, \,\textrm{d}x+\int _\Gamma y_u(x) g_\varphi (x)\, \,\textrm{d}x\end{aligned}$$

where \(y_d \in L^2(\Omega )\) and \(g_\varphi \in L^2(\Gamma )\) are given functions, \(\nu > 0\),

$$\begin{aligned} U_\textrm{ad}= \{u \in L^2(\Gamma ): u_{a}\le u(x) \le u_{b}\text { for a.e. } x \in \Gamma \} \end{aligned}$$

with \(-\infty \le u_{a}< u_{b}\le +\infty \), and \(y_u\) is the solution of

$$\begin{aligned} \left\{ \begin{array}{rcll} Ay + b(x)\cdot \nabla y + a_0(x) y &=& 0 &\quad \text{ in } \Omega ,\\ \partial _{n_A} y &=& u &\quad \text{ on } \Gamma . \end{array}\right. \end{aligned}$$
(1.1)

Assumptions regarding the symmetric second order differential operator A and the coefficients b and \(a_0\) will be described later. Let us just emphasize now that we will make no assumptions on b and \(a_0\) that would imply coerciveness of the associated bilinear form.

The main objective of this paper is to discretize the optimal control problem using the finite element method and to obtain error estimates for the approximations of the optimal control in terms of the discretization parameter h. We keep the assumptions as weak as possible in order to isolate the essential difficulties. The results are valid for possibly non-convex domains and for both quasi-uniform and graded meshes. Although the theory for Neumann boundary optimal control problems governed by elliptic equations is quite complete, to the best of our knowledge, the issues that arise when the elliptic operator governing the equation is not coercive have not been addressed yet; see Casas, Mateos and Tröltzsch 2005 [1], Casas and Mateos 2007 [2], Mateos and Rösch 2011 [3], Apel, Pfefferer and Rösch 2012 and 2015 [4, 5], Krumbiegel and Pfefferer 2015 [6] or the thesis by Winkler 2015 [7]. The only papers we are aware of that deal with optimal control problems governed by a non-coercive elliptic equation concern distributed controls; see Casas, Mateos and Rösch 2020 and 2021 [8, 9]. In both papers, this fact and the convexity of the domain are used in an essential way in some of the proofs, and hence those results are not applicable to our problem.

We will see that problem (P) has a unique solution \({\bar{u}}\), and that it satisfies the optimality conditions, which we state now in an informal way: there exist \({\bar{y}}\) and \({\bar{\varphi }}\) such that

$$\begin{aligned} \left\{ \begin{array}{rcll} A{\bar{y}} + b(x)\cdot \nabla {\bar{y}} + a_0(x) {\bar{y}} &=& 0 &\quad \text{ in } \Omega ,\\ \partial _{n_A} {\bar{y}} &=& {\bar{u}} &\quad \text{ on } \Gamma , \end{array}\right. \end{aligned}$$
(1.2a)
$$\begin{aligned} \left\{ \begin{array}{rcll} A{\bar{\varphi }} -\nabla \cdot (b(x){\bar{\varphi }}) + a_0(x) {\bar{\varphi }} &=& {\bar{y}}-y_d &\quad \text { in } \Omega ,\\ \partial _{n_A} {\bar{\varphi }} + {\bar{\varphi }}\, b\cdot n &=& g_\varphi &\quad \text { on } \Gamma , \end{array}\right. \end{aligned}$$
(1.2b)
$$\begin{aligned} \int _\Gamma ({\bar{\varphi }} + \nu {\bar{u}})(u-{\bar{u}})\,\,\textrm{d}x\ge 0\ \forall u\in U_\textrm{ad}. \end{aligned}$$
(1.2c)

Since (P) is a linear-quadratic strictly convex problem, existence and uniqueness of the solution follow in a standard way once we have proved existence and uniqueness of the solution of the state equation and continuity of the control-to-state mapping. But, since we will not formulate any assumptions on b or \(a_0\) that would lead to a coercive operator, this task is not standard. In particular, \(\textrm{div}\,b\) may be large, so that the usual assumption \(a_0-\frac{1}{2}\textrm{div}\,b\ge c_0>0\) is not satisfied. This will be done in Sect. 2.

In Sect. 3 we investigate the regularity properties of the solutions of the state equation and the adjoint state equations. Since these are different, we perform this task in two steps resulting in Theorems 3.4 and 3.5, respectively. We obtain results in Hilbert–Sobolev, in Sobolev–Slobodeckiĭ and in weighted Sobolev spaces, with our focus on treating the numerical approximation of (P) in non-convex domains. The regularity results in non-weighted spaces serve us as intermediate results to prove the error estimates in weighted Sobolev spaces, but they are also of independent interest. Note that, although regularity results for elliptic boundary value problems are widely investigated, see, e. g., the monographs [10,11,12,13,14], the particular results which we need for our approximation theory were not available for non-coercive problems with variable coefficients.

In Sect. 4 we study the numerical discretization of both the state and adjoint state equation. We obtain existence and uniqueness of the solution as well as error estimates. Our results are valid in convex and non-convex domains and for quasi-uniform and graded meshes, with possibly a non-optimal grading parameter \(\mu \).

With these results at hand, we will be able to deduce existence, uniqueness, and optimality conditions in Sect. 5. Moreover, regularity properties of the optimal solution and its related state and adjoint state are given in terms of weighted Sobolev spaces. Finally, we will discretize the control problem. The control is approximated using piecewise constant functions whereas the state and adjoint state are discretized by continuous piecewise linear functions. A close inspection of the proofs in the above-mentioned papers about Neumann control problems suggests that, if no postprocessing step is done, the order of convergence of the error in \(L^2(\Gamma )\) for the control variable will be limited by the order of convergence of the finite element error in \(H^1(\Omega )\) for the state or the adjoint state equation; see, e.g., the proof of Lemma 4.7 in [1]. This means that, for a non-convex domain and a quasi-uniform mesh, the order of convergence that can be obtained by applying the usual techniques in optimal control together with the regularity results and the finite element error estimates provided in this paper is approximately \(h^\lambda \), where \(1/2<\lambda <1\). For instance, in the problem shown as an example in Sect. 6, \(h^{2/3}\) would be expected. Nevertheless, the numerical experiments clearly show order h, and we are able to prove this in Theorem 5.7: if the corner singularities are of type \(r^{\lambda _j}\), the index j counting the corners, and the mesh is graded near the corners with parameter \(\mu _j\), then the approximation order of the control is \(s^*\le 1\) with \(s^*<\frac{3\lambda _j}{2\mu _j}\), i.e., \(s^*=1\) is achieved if \(\mu _j<\frac{3}{2}\lambda _j\) for all j. In the works by Apel, Pfefferer and Rösch [4, 5] a stronger grading \(\mu _j<\lambda _j\) is used to obtain an optimal order of convergence for the so-called post-processed control, i.e., the pointwise projection onto the admissible set of \(-{\bar{\varphi }}_h/\nu \), where \({\bar{\varphi }}_h\) is the discrete adjoint state associated with the discrete optimal control. In Theorem 4.2.1 of the thesis of Winkler [7] it is shown that for quasi-uniform meshes, i.e., \(\mu _j=1\), the order \(s^*=1\) is achieved for any angle.
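To fix ideas about the orders involved, consider the value \(\lambda _j=2/3\) behind the \(h^{2/3}\) prediction just mentioned. The bound of Theorem 5.7 then reads

$$\begin{aligned} s^*<\frac{3\lambda _j}{2\mu _j}=\frac{1}{\mu _j}, \end{aligned}$$

so quasi-uniform meshes (\(\mu _j=1\)) give every order \(s^*<1\), while any grading \(\mu _j<1\) at that corner already yields the full order \(s^*=1\).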

2 Existence, Uniqueness and Continuous Dependence of the Solution of the State and Adjoint State Equations

On A, b and \(a_0\) we make the following assumptions.

Assumption 2.1

A is the operator given by

$$\begin{aligned} Ay= - \sum _{i,k = 1}^{2}\partial _{x_k}(a_{ik}(x)\partial _{x_i}y) \ \text { with }\ a_{ik} \in L^\infty (\Omega ), \end{aligned}$$

\(a_{ik}=a_{ki}\) for \(1\le i,k\le 2\), and satisfying the following ellipticity condition:

$$\begin{aligned} \exists \Lambda > 0 \text{ such } \text{ that } \sum _{i,k = 1}^{2}a_{ik}(x)\xi _i\xi _k \ge \Lambda \vert \xi \vert ^2\ \ \forall \xi \in \mathbb {R}^2 \text{ and } \text{ for } \text{ a.a. } x \in \Omega .\qquad \end{aligned}$$
(2.1)

The function \(b:\Omega \rightarrow \mathbb {R}^2\) satisfies \(b\in L^{{\hat{p}}}(\Omega )^2\) with \({{\hat{p}}} > 2\) and there exists \({{\hat{q}}}>1\) such that \(\nabla \cdot b\in L^{{\hat{q}}}(\Omega )\) and \(b\cdot n\in L^{{\hat{q}}}(\Gamma )\). For the function \(a_0:\Omega \rightarrow \mathbb {R}\) it is assumed that \(a_0 \in L^{{\hat{q}}}(\Omega )\), \(a_0(x)\ge 0\) for a.e. \(x\in \Omega \) and there exists \(E\subset \Omega \) with \(\vert E\vert >0\) such that \(a_0(x)\ge \Lambda /2\) for a.e. \(x\in E\).

Remark 2.2

Note that this assumption does not lead to a coercive bilinear form. While Assumption 2.1 is sufficient for proving existence and uniqueness of the solution, further regularity of the coefficients must be imposed to establish adequate regularity results for the solution. The reader is referred to the results of Sect. 3 for the details required in the different scenarios.
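For orientation, a simple, purely illustrative choice of data satisfying Assumption 2.1 while violating the usual condition \(a_0-\frac{1}{2}\nabla \cdot b\ge c_0>0\) recalled in the introduction is obtained for \(A=-\Delta \) (so that \(\Lambda =1\)) by taking

$$\begin{aligned} b(x) = (M x_1,0)^T \ \text { with } M>1 \quad \text { and }\quad a_0 = \tfrac{1}{2}\chi _E \ \text { for some } E\subset \Omega \text { with } \vert E\vert >0, \end{aligned}$$

since then \(b\), \(\nabla \cdot b=M\) and \(b\cdot n\) are bounded, \(a_0\ge 0\) with \(a_0=\Lambda /2\) on \(E\), but \(a_0-\frac{1}{2}\nabla \cdot b\le \frac{1-M}{2}<0\) in \(\Omega \).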

Before addressing the main results of this section, we recall some well known inequalities that will be used throughout this paper.

We will often use the following form of Hölder’s inequality: for \(q,p_1,\ldots , p_k\in [1,\infty ]\) such that \(1/p_1+\cdots +1/p_k\le 1/q\) and \(f_i\in L^{p_i}(\Omega )\), \(i=1,\ldots ,k\) there exists a constant \(C_\Omega =\vert \Omega \vert ^{1/q-(1/p_1+\cdots +1/p_k)}\), such that \(\Vert f_1\cdots f_k\Vert _{L^q(\Omega )}\le C_\Omega \Vert f_1\Vert _{L^{p_1}(\Omega )}\cdots \Vert f_k\Vert _{L^{p_k}(\Omega )}\).
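For instance, applied with \(k=2\), \(f_1=\vert b\vert \), \(f_2=\vert \nabla y\vert \), \(p_1={{\hat{p}}}\), \(p_2=2\) and \(1/q=1/{{\hat{p}}}+1/2\) (so that \(C_\Omega =1\)), it gives

$$\begin{aligned} \Vert b\cdot \nabla y\Vert _{L^{q}(\Omega )} \le \Vert \,\vert b\vert \,\vert \nabla y\vert \,\Vert _{L^{q}(\Omega )}\le \Vert b\Vert _{L^{{\hat{p}}}(\Omega )^2}\Vert \nabla y\Vert _{L^2(\Omega )^2}, \end{aligned}$$

an estimate that appears repeatedly in the proofs below.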

The inequality

$$\begin{aligned} \Vert y\Vert _{H^1(\Omega )} \le C_E(\Vert \nabla y\Vert _{L^2(\Omega )}+\Vert y\Vert _{L^2(E)})\quad \forall y \in H^1(\Omega ) \end{aligned}$$
(2.2)

is a generalization of Poincaré’s inequality and can be found, e.g., in [15, Theorem 11.19]. In dimension 2, Sobolev’s embedding theorem gives that for all \(r<\infty \) there exists \(K_{\Omega ,r}>0\) such that

$$\begin{aligned} \Vert y\Vert _{L^r(\Omega )} \le K_{\Omega ,r}\Vert y\Vert _{H^1(\Omega )}\quad \forall y \in H^1(\Omega ). \end{aligned}$$
(2.3)

We will denote by \(\langle \cdot ,\cdot \rangle _\Omega \) the duality product in \(H^1(\Omega )'\times H^1(\Omega )\) and by \(\langle \cdot ,\cdot \rangle _\Gamma \) the duality product in \(H^{1/2}(\Gamma )'\times H^{1/2}(\Gamma )\). We notice that any \(g\in H^{1/2}(\Gamma )'\) defines an element in \(H^1(\Omega )'\), which will be denoted in the same way by

$$\begin{aligned} \langle g,z\rangle _\Omega = \langle g,\textrm{tr}z\rangle _\Gamma \ \forall z\in H^1(\Omega ). \end{aligned}$$
(2.4)

In this case, we will simply write \(\langle g,z\rangle _\Gamma \). Also notice that for any fixed \(q>1\), the functions \(f\in L^q(\Omega )\) and \(g\in L^q(\Gamma )\) define elements in \(H^1(\Omega )'\) and \(H^{1/2}(\Gamma )'\) respectively by

$$\begin{aligned} \langle f,z\rangle _\Omega = \int _\Omega fz\,\textrm{d}x,\qquad \langle g,z\rangle _\Gamma = \int _\Gamma gz\,\,\textrm{d}x\ \forall z\in H^1(\Omega ). \end{aligned}$$
(2.5)

For every \(y\in H^1(\Omega )\), we define \({\mathcal {A}}y\) by

$$\begin{aligned} \langle {\mathcal {A}}y,z\rangle _\Omega = \int _\Omega \sum _{i,k = 1}^2 a_{ik}\partial _{x_i}y\partial _{x_k} z \,\textrm{d}x+ \int _\Omega ( b\cdot \nabla y )z\,\textrm{d}x+ \int _\Omega a_0 y z\,\textrm{d}x\ \forall z\in H^1(\Omega ). \nonumber \\ \end{aligned}$$
(2.6)

Using this operator, we have that the weak form of the state Eq. (1.1) is: find \(y_u\in H^1(\Omega )\) such that

$$\begin{aligned} \langle {\mathcal {A}}y_u,z\rangle _\Omega = \langle u, z\rangle _\Gamma \ \forall z\in H^1(\Omega ). \end{aligned}$$
(2.7)
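For smooth enough functions, the connection between (2.7) and (1.1) is given by Green’s formula

$$\begin{aligned} \int _\Omega \sum _{i,k = 1}^2 a_{ik}\partial _{x_i}y\,\partial _{x_k} z \,\textrm{d}x= \int _\Omega (Ay)\,z\,\textrm{d}x+ \int _\Gamma \partial _{n_A}y\, z\,\,\textrm{d}x, \end{aligned}$$

where \(\partial _{n_A}y = \sum _{i,k=1}^2 a_{ik}\partial _{x_i}y\, n_k\) denotes the conormal derivative associated with A.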

We first prove continuity of the operator \({\mathcal {A}}\) and Gårding’s inequality. We adapt the proof of [8, Lemma 2.1].

Lemma 2.3

Under Assumption 2.1 we have that \({\mathcal {A}}\in {\mathcal {L}}(H^1(\Omega ),H^{1}(\Omega )')\) and there exists a constant \(C_{\Lambda ,E,b}\) such that

$$\begin{aligned} \langle {\mathcal {A}} z,z\rangle _\Omega \ge \frac{\Lambda }{8 C_E^2}\Vert z\Vert ^2_{H^1(\Omega )} - C_{\Lambda ,E,b}\Vert z\Vert ^2_{L^2(\Omega )}\quad \forall z \in H^1(\Omega ) \end{aligned}$$
(2.8)

where \(\Lambda \) and \(C_E\) are the constants from (2.1) and (2.2), respectively.

Proof

Let us show that \({\mathcal {A}}\) is a linear and continuous operator. Denote \(S =\{z\in H^1(\Omega ):\ \Vert z\Vert _{H^1(\Omega )} = 1\}\). We split \({\mathcal {A}}\) into three parts \({\mathcal {A}}_i\), \(i=1,2,3\), corresponding to the three integrals in (2.6).

$$\begin{aligned} \Vert {\mathcal {A}}_1 y\Vert _{H^1(\Omega )'}&= \sup _{z\in S} \int _\Omega \sum _{i,k = 1}^2 a_{ik}\partial _{x_i}y\partial _{x_k} z \,\textrm{d}x\\&\le \sup _{z\in S} 4 \max _{1\le i,k\le 2}\Vert a_{ik}\Vert _{L^\infty (\Omega )} \Vert \nabla y\Vert _{L^2(\Omega )^2}\Vert \nabla z\Vert _{L^2(\Omega )^2} \\&\le 4 \max _{1\le i,k\le 2}\Vert a_{ik}\Vert _{L^\infty (\Omega )}\Vert y\Vert _{H^1(\Omega )}. \end{aligned}$$

Take now \(s_p>1\) such that \(1/s_p = 1/{{\hat{p}}} + 1/2\). From (2.3) and Hölder’s inequality we infer for every \(y \in H^1(\Omega )\)

$$\begin{aligned} \Vert {\mathcal {A}}_2 y\Vert _{H^1(\Omega )'}&= \sup _{z\in S} \int _\Omega ( b\cdot \nabla y )z\,\textrm{d}x\le \Vert b\cdot \nabla y\Vert _{L^{s_p}(\Omega )} \Vert z\Vert _{L^{s'_p}(\Omega )} \nonumber \\&\le K_{\Omega ,s'_p} \Vert b\Vert _{L^{{\hat{p}}}(\Omega )^2}\Vert \nabla y\Vert _{L^2(\Omega )^2} \le K_{\Omega ,s'_p} \Vert b\Vert _{L^{{\hat{p}}}(\Omega )^2}\Vert y\Vert _{H^1(\Omega )}. \end{aligned}$$

Fix now some \(s_q\in (1,{{\hat{q}}})\) and take \(r\in (1,+\infty )\) such that \(1/{{\hat{q}}}+1/r = 1/s_q\). From (2.3) we infer that

$$\begin{aligned} \Vert {\mathcal {A}}_3 y\Vert _{H^1(\Omega )'}&= \sup _{z\in S} \int _\Omega a_0 y z\,\textrm{d}x\le \Vert a_0 y\Vert _{L^{s_q}(\Omega )}\Vert z\Vert _{L^{s'_q}(\Omega )} \nonumber \\&\le K_{\Omega ,s'_q} \Vert a_0\Vert _{L^{{{\hat{q}}}}(\Omega )}\Vert y\Vert _{L^{r}(\Omega )} \le K_{\Omega ,s'_q} K_{\Omega ,r}\Vert a_0\Vert _{L^{{{\hat{q}}}}(\Omega )}\Vert y\Vert _{H^1(\Omega )}. \end{aligned}$$

Hence, we have that \({\mathcal {A}}\) is a well-defined linear and continuous operator.

Let us prove (2.8). Using Assumption 2.1, (2.2), and Young’s and Hölder’s inequalities we get

$$\begin{aligned} \langle {\mathcal {A}} z,z\rangle _\Omega&\ge \Lambda \Vert \nabla z\Vert ^2_{L^2(\Omega )^2} + \frac{\Lambda }{2} \Vert z\Vert ^2_{L^2(E)} - \Vert \nabla z\Vert _{L^2(\Omega )^2}\Vert bz\Vert _{L^2(\Omega )^2} \\&\ge \frac{\Lambda }{2}\Vert \nabla z\Vert ^2_{L^2(\Omega )^2} + \frac{\Lambda }{2} \Vert z\Vert ^2_{L^2(E)} - \frac{1}{2\Lambda }\Vert bz\Vert ^2_{L^2(\Omega )^2}\\&\ge \frac{ \Lambda }{4C_E^2}\Vert z\Vert ^2_{H^1(\Omega )} - \frac{1}{2\Lambda }\Vert b\Vert ^2_{L^{{{\hat{p}}}}(\Omega )^2}\Vert z\Vert ^2_{L^{\frac{2 {{\hat{p}}}}{{{\hat{p}}} - 2}}(\Omega )}. \end{aligned}$$

Observe that the assumption \({{\hat{p}}} > 2\) implies that \(2< \dfrac{2 {{\hat{p}}}}{{{\hat{p}}} - 2} < \infty \). Now, we apply Lions’ Lemma [16, Chapter 2, Lemma 6.1] to the chain of embeddings \(H^1(\Omega ) \subset \subset L^{\frac{2 {{\hat{p}}}}{ {{\hat{p}}} - 2}}(\Omega ) \subset L^2(\Omega )\), the first one being compact and the second one continuous, to deduce the existence of a constant \(C_0\) depending on \(\Lambda \), \(C_E\) and \(\Vert b\Vert _{L^{{\hat{p}}}(\Omega )^2}\) such that

$$\begin{aligned} \Vert z\Vert _{L^{\frac{2{{\hat{p}}}}{{{\hat{p}}} - 2}}(\Omega )} \le \frac{\Lambda }{2^{3/2}\Vert b\Vert _{L^{{{\hat{p}}}}(\Omega )^2}C_E}\Vert z\Vert _{H^1(\Omega )} + C_0\Vert z\Vert _{L^2(\Omega )}. \end{aligned}$$
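Indeed, squaring this estimate (using \((a+b)^2\le 2a^2+2b^2\)) and multiplying by \(\frac{1}{2\Lambda }\Vert b\Vert ^2_{L^{{{\hat{p}}}}(\Omega )^2}\) gives

$$\begin{aligned} \frac{1}{2\Lambda }\Vert b\Vert ^2_{L^{{{\hat{p}}}}(\Omega )^2}\Vert z\Vert ^2_{L^{\frac{2{{\hat{p}}}}{{{\hat{p}}} - 2}}(\Omega )} \le \frac{\Lambda }{8 C_E^2}\Vert z\Vert ^2_{H^1(\Omega )} + \frac{C_0^2\Vert b\Vert ^2_{L^{{{\hat{p}}}}(\Omega )^2}}{\Lambda }\Vert z\Vert ^2_{L^2(\Omega )}. \end{aligned}$$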

Inserting this bound into the previous chain of inequalities, we conclude (2.8) with

$$\begin{aligned} C_{\Lambda ,E,b} = \frac{C_0^2\Vert b\Vert ^2_{L^{{{\hat{p}}}}(\Omega )^2}}{\Lambda } \end{aligned}$$

and the proof is complete. \(\square \)

Remark 2.4

Notice that, to prove Lemma 2.3, we use neither \(\nabla \cdot b\in L^{{\hat{q}}}(\Omega )\) nor \(b\cdot n\in L^{{\hat{q}}}(\Gamma )\) for some \({{\hat{q}}}>1\).

We denote by \({\mathcal {A}}^*\) the adjoint operator of \({\mathcal {A}}\), so that \({\mathcal {A}}^*z\in H^1(\Omega )'\) for every \(z\in H^1(\Omega )\). In the next lemma, we justify that, under the mild Assumption 2.1, we can integrate by parts and obtain the expected form of the adjoint Eq. (1.2b).

Lemma 2.5

Suppose that Assumption 2.1 holds. Then

$$\begin{aligned} \langle {\mathcal {A}}^* z,y \rangle _\Omega = \int _\Omega \sum _{i,k = 1}^2 a_{ki}\partial _{x_i}z\partial _{x_k} y \,\textrm{d}x- \int _\Omega y\nabla \cdot ( b z)\,\textrm{d}x+ \int _\Gamma y z b\cdot n\,\,\textrm{d}x+ \int _\Omega a_0 y z\,\textrm{d}x\end{aligned}$$

for all \(y\in H^1(\Omega )\).

Proof

By definition

$$\begin{aligned} \langle {\mathcal {A}}^* z,y \rangle _\Omega =\langle {\mathcal {A}} y,z \rangle _\Omega \ \forall y,z\in H^1(\Omega ) \end{aligned}$$

and we only have to justify that, under Assumption 2.1, we can do integration by parts to get

$$\begin{aligned} \int _\Omega (b\cdot \nabla y) z\,\textrm{d}x= -\int _\Omega y\nabla \cdot (bz)\,\textrm{d}x+ \int _\Gamma yz b\cdot n \,\,\textrm{d}x. \end{aligned}$$

This is equivalent to proving that we can apply the Gauss theorem to obtain

$$\begin{aligned} \int _\Omega \nabla \cdot (yzb)\,\textrm{d}x= \int _\Gamma yz b\cdot n \,\,\textrm{d}x. \end{aligned}$$

Using that \(y,z\in H^1(\Omega )\hookrightarrow L^r(\Omega )\) for all \(r<+\infty \), \(b\in L^{{\hat{p}}}(\Omega )^2\) for some \({{\hat{p}}}>2\) and \(\nabla \cdot b\in L^{{\hat{q}}}(\Omega )\) for some \({{\hat{q}}}>1\), applying Hölder’s inequality, we have

$$\begin{aligned} \nabla (yz)\cdot b=z\nabla y\cdot b+ y\nabla z\cdot b\in L^{s_1}(\Omega )\ \text { for every }s_1<\frac{2{{\hat{p}}}}{2+{{\hat{p}}}}\quad \text { and }\quad yz\nabla \cdot b\in L^{\frac{{{\hat{q}}}+1}{2}}(\Omega ), \end{aligned}$$

so \(\nabla \cdot (yzb)\in L^s(\Omega )\) for \(s = \min \left\{ s_1,\dfrac{{{\hat{q}}}+1}{2}\right\} \), which satisfies \(1<s <2\) provided \(s_1>1\). From Assumption 2.1, it is also clear that \(yzb\in L^s(\Omega )^2\), and using [17, Lemma II.1.2.2], we deduce that \(yzb\) has a normal trace \(yzb\cdot n\) defined in the space \((W^{1-1/s',s'}(\Gamma ))'\) via the Gauss theorem: for every \(v\in W^{1,s'}(\Omega )\)

$$\begin{aligned} \langle yzb\cdot n,v\rangle _{(W^{1-1/s',s'}(\Gamma ))',W^{1-1/s',s'}(\Gamma )} = \int _\Omega \nabla \cdot (v yzb) \,\textrm{d}x. \end{aligned}$$

Since we are assuming that \(b\cdot n\in L^{{\hat{q}}}(\Gamma )\) for some \({{\hat{q}}}>1\), we have \(yz b\cdot n\in L^{\frac{{{\hat{q}}}+1}{2}}(\Gamma )\hookrightarrow L^s(\Gamma )\). Therefore, we have that

$$\begin{aligned} \langle yzb\cdot n,v\rangle _{(W^{1-1/s',s'}(\Gamma ))',W^{1-1/s',s'}(\Gamma )} = \int _\Gamma v yzb\cdot n\,\,\textrm{d}x. \end{aligned}$$

Taking \(v = 1\) in the above equalities, we have that

$$\begin{aligned} \int _\Omega \nabla \cdot (yzb) \,\textrm{d}x= \int _\Gamma yzb\cdot n\,\,\textrm{d}x, \end{aligned}$$

and the proof is complete. \(\square \)

Next, we adapt the proof of [8, Theorem 2.2] to show existence and uniqueness of the solution of the state equation.

Lemma 2.6

Under Assumption 2.1, the linear operator \({\mathcal {A}}:H^1(\Omega ) \longrightarrow H^{1}(\Omega )'\) is an isomorphism.

Proof

Let us first see that \({\mathcal {A}}\) is injective. Consider \(y\in H^1(\Omega )\) such that \({\mathcal {A}}y =0\). We will prove that \(y\le 0\). The contrary inequality follows by arguing on \(-y\). Suppose there exists some \({\mathcal {O}}\subset \Omega \) with positive measure such that \(y (x) > 0\) if \(x\in {\mathcal {O}}\). Take \(0< \rho < \text {ess}\sup _{x\in \Omega } y(x)\le +\infty \) and define \(y_\rho (x) = (y(x)-\rho )^+ = \max \{y(x)-\rho ,0\}\). Denote \(\Omega _\rho = \{x\in \Omega :\ \nabla y_\rho (x)\ne 0\}\). Notice that \(y_\rho \in H^1(\Omega )\),

$$\begin{aligned} \nabla y_\rho (x) = \left\{ \begin{array}{cl} \nabla y(x) & \text { if } y(x) > \rho ,\\ 0 & \text { if } y(x) \le \rho , \end{array} \right. \end{aligned}$$

which means that \(\Omega _\rho \subset \{x:\ y(x)>\rho \}\). We also remark that \(y_\rho (x) = 0\text { if }y(x)\le 0\), and that \(y(x)\ge y_\rho (x) \ge 0\) if \(y(x) \ge 0 \). Using these properties, and Hölder’s and Young’s inequalities, we have that

$$\begin{aligned} 0&= \langle {\mathcal {A}} y ,y_\rho \rangle _\Omega = \int _\Omega \sum _{i,k = 1}^2 a_{ik}\partial _{x_i}y\partial _{x_k} y_\rho \,\textrm{d}x+ \int _\Omega ( b\cdot \nabla y )y_\rho \,\textrm{d}x+ \int _\Omega a_0 y y_\rho \,\textrm{d}x\\&\ge \int _{\Omega _\rho } \sum _{i,k = 1}^2 a_{ik}\partial _{x_i}y_\rho \partial _{x_k} y_\rho \,\textrm{d}x+ \int _{\Omega _\rho } ( b\cdot \nabla y_\rho )y_\rho \,\textrm{d}x+ \int _{\Omega } a_0 y_\rho y_\rho \,\textrm{d}x\\&\ge \Lambda \Vert \nabla y_\rho \Vert ^2_{L^2(\Omega _\rho )} - \Vert b\Vert _{L^{{\hat{p}}}(\Omega _\rho )^2}\Vert \nabla y_\rho \Vert _{L^2(\Omega _\rho )} \Vert y_\rho \Vert _{L^\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}(\Omega _\rho )} + \frac{\Lambda }{2}\Vert y_\rho \Vert _{L^2(E)}^2\\&\ge \frac{\Lambda }{2}\Vert \nabla y_\rho \Vert ^2_{L^2(\Omega _\rho )} -\frac{1}{2\Lambda } \Vert b\Vert _{L^{{\hat{p}}}(\Omega _\rho )^2}^2 \Vert y_\rho \Vert _{L^\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}(\Omega _\rho )}^2+ \frac{\Lambda }{2}\Vert y_\rho \Vert _{L^2(E)}^2\\&= \frac{\Lambda }{2}\Vert \nabla y_\rho \Vert ^2_{L^2(\Omega )} -\frac{1}{2\Lambda } \Vert b\Vert _{L^{{\hat{p}}}(\Omega _\rho )^2}^2 \Vert y_\rho \Vert _{L^\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}(\Omega _\rho )}^2+ \frac{\Lambda }{2}\Vert y_\rho \Vert _{L^2(E)}^2 \end{aligned}$$

Next we use that \(\Omega _\rho \subset \Omega \), (2.3), (2.2) and the just proved inequality to obtain:

$$\begin{aligned} \Vert y_\rho \Vert _{L^\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}(\Omega _\rho )}^2&\le \Vert y_\rho \Vert _{L^\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}(\Omega )}^2 \le K_{\Omega ,\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}}^2 \Vert y_\rho \Vert _{H^1(\Omega )}^2 \\&\le 2 K_{\Omega ,\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}}^2 C_E^2 \left( \Vert \nabla y_\rho \Vert ^2_{L^2(\Omega )} + \Vert y_\rho \Vert _{L^2(E)}^2\right) \\&\le \frac{2 K_{\Omega ,\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}}^2 C_E^2}{\Lambda ^2} \Vert b\Vert _{L^{{\hat{p}}}(\Omega _\rho )^2}^2 \Vert y_\rho \Vert _{L^\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}(\Omega _\rho )}^2 \end{aligned}$$

We can deduce from this a positive lower bound for the norm of b in \(L^{{\hat{p}}}(\Omega _\rho )^2\) independent of \(\rho \):

$$\begin{aligned} \Vert b\Vert _{L^{{\hat{p}}}(\Omega _\rho )^2}\ge \frac{\Lambda }{\sqrt{2}K_{\Omega ,\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}} C_E} > 0. \end{aligned}$$

But we have that \(\vert \Omega _\rho \vert \rightarrow 0\) as \(\rho \rightarrow \text {ess}\sup _{x\in \Omega } y(x)\); see [8, Theorem 2.2]. Since \(b\in L^{{\hat{p}}}(\Omega )^2\), the absolute continuity of the integral then yields \(\Vert b\Vert _{L^{{\hat{p}}}(\Omega _\rho )^2}\rightarrow 0\), contradicting the lower bound above.

Finally, we just have to check that the range of \({\mathcal {A}}\) is dense and closed. Since we have already established Gårding’s inequality (2.8) for the operator \({\mathcal {A}}\), the proof of closedness in [8, Theorem 2.2] applies to our case after replacing the norms in \(H^1_0(\Omega )\) and \(H^{-1}(\Omega )\) by the norms in \(H^1(\Omega )\) and its dual space, and thus it is omitted. By a well known duality argument, the denseness of the range of \({\mathcal {A}}\) follows from the injectivity of \({\mathcal {A}}^*\).

The argument used above to obtain the injectivity of \({\mathcal {A}}\) does not work for \({\mathcal {A}}^*\). Notice that at one point we use that \(\int _\Omega (b\cdot \nabla y) y_\rho \,\textrm{d}x= \int _{\Omega _\rho } (b\cdot \nabla y_\rho ) y_\rho \,\textrm{d}x\). When dealing with the adjoint operator, we would find the term \(\int _\Omega (b\cdot \nabla z_\rho ) z\,\textrm{d}x\), which in general is different from \(\int _{\Omega _\rho } (b\cdot \nabla z_\rho ) z_\rho \,\textrm{d}x\). But we can obtain injectivity of the adjoint operator as follows. Consider \(z\in H^1(\Omega )\) such that \({\mathcal {A}}^*z =0\). For all \(\varepsilon \ge 0\) define

$$\begin{aligned} \Omega ^\varepsilon = \{x\in \Omega :\ \vert z(x)\vert > \varepsilon \}. \end{aligned}$$

Let us see that \(\vert \Omega ^0 \vert = 0\), which readily implies that \(z=0\). Let us define \(z^\varepsilon (x) = {\text {Proj}}_{[-\varepsilon ,\varepsilon ]}(z(x))\). Using integration by parts, that \(z=0\) in \(\Omega \setminus \Omega ^0\), that \(\nabla z^\varepsilon =0\) in \(\Omega ^\varepsilon \) and \(\nabla z^\varepsilon = \nabla z\) in \(\Omega {\setminus }\Omega ^\varepsilon \), and that \(z z^\varepsilon \ge (z^\varepsilon )^2\), we have

$$\begin{aligned} 0&= \langle {\mathcal {A}}^*z ,z^\varepsilon \rangle _\Omega \\&= \int _\Omega \sum _{i,k = 1}^2 a_{ki}\partial _{x_i}z\partial _{x_k} z^\varepsilon \,\text {d}x- \int _\Omega z^\varepsilon \nabla \cdot ( b z)\,\text {d}x+ \int _\Gamma z^\varepsilon z b\cdot n\,\,\text {d}x+ \int _\Omega a_0 z^\varepsilon z\,\text {d}x\\&= \int _\Omega \sum _{i,k = 1}^2 a_{ki}\partial _{x_i}z\partial _{x_k} z^\varepsilon \,\text {d}x+ \int _\Omega z b\cdot \nabla z^\varepsilon \,\text {d}x+ \int _\Omega a_0 z^\varepsilon z\,\text {d}x\\&\ge \Lambda \Vert \nabla z^\varepsilon \Vert ^2_{L^2(\Omega )^2} - \Vert b\Vert _{L^{{{\hat{p}}}}(\Omega ^0\setminus \Omega ^\varepsilon )} \Vert \nabla z^\varepsilon \Vert _{L^2(\Omega )^2} \Vert z^\varepsilon \Vert _{L^{\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}}(\Omega ^0\setminus \Omega ^\varepsilon )} +\frac{\Lambda }{2}\Vert z^\varepsilon \Vert _{L^2(E)}^2\\&\ge \frac{\Lambda }{2} \Vert \nabla z^\varepsilon \Vert ^2_{L^2(\Omega )^2} -\frac{1}{2\Lambda } \Vert b\Vert _{L^{{{\hat{p}}}}(\Omega ^0\setminus \Omega ^\varepsilon )}^2 \Vert z^\varepsilon \Vert _{L^{\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}}(\Omega ^0\setminus \Omega ^\varepsilon )}^2 +\frac{\Lambda }{2}\Vert z^\varepsilon \Vert _{L^2(E)}^2. \end{aligned}$$

So, using this and that \(\vert z^\varepsilon (x)\vert \le \varepsilon \) for a.e. \(x\not \in \Omega ^\varepsilon \) we get

$$\begin{aligned} \Vert z^\varepsilon \Vert _{H^1(\Omega )}^2&\le 2 C_E^2 \left( \Vert \nabla z^\varepsilon \Vert ^2_{L^2(\Omega )^2} + \Vert z^\varepsilon \Vert _{L^2(E)}^2\right) \\&\le \frac{2 C_E^2}{\Lambda ^2} \Vert b\Vert _{L^{{{\hat{p}}}}(\Omega ^0\setminus \Omega ^\varepsilon )}^2 \Vert z^\varepsilon \Vert _{L^{\frac{2{{\hat{p}}}}{{{\hat{p}}}-2}}(\Omega ^0\setminus \Omega ^\varepsilon )}^2 \le \frac{2 C_E^2}{\Lambda ^2} \Vert b\Vert _{L^{{{\hat{p}}}}(\Omega ^0\setminus \Omega ^\varepsilon )}^2 \vert \Omega ^0\setminus \Omega ^\varepsilon \vert ^{\frac{{{\hat{p}}}-2}{{{\hat{p}}}}} \varepsilon ^2. \end{aligned}$$

On the other hand, using that \(\vert z^\varepsilon \vert =\varepsilon \) in \(\Omega ^\varepsilon \) and the previous inequality, we have

$$\begin{aligned} \vert \Omega ^\varepsilon \vert&= \frac{1}{\varepsilon ^2}\int _{\Omega ^\varepsilon }z^\varepsilon (x)^2\,\text {d}x\le \frac{1}{\varepsilon ^2}\Vert z^\varepsilon \Vert _{L^2(\Omega )}^2 \le \frac{1}{\varepsilon ^2}\Vert z^\varepsilon \Vert _{H^1(\Omega )}^2 \\&\le \frac{2 C_E^2}{\Lambda ^2} \Vert b\Vert _{L^{{{\hat{p}}}}(\Omega ^0\setminus \Omega ^\varepsilon )}^2 \vert \Omega ^0\setminus \Omega ^\varepsilon \vert ^{\frac{{{\hat{p}}}-2}{{{\hat{p}}}}}. \end{aligned}$$

Since \(\vert \Omega ^0{\setminus }\Omega ^\varepsilon \vert = \textrm{meas}\{x\in \Omega : 0< \vert z(x)\vert < \varepsilon \}\rightarrow 0\) as \(\varepsilon \rightarrow 0\), we have proved that \(\vert \Omega ^\varepsilon \vert \rightarrow 0\) as \(\varepsilon \rightarrow 0\) and hence \(\vert \Omega ^0\vert =0\). \(\square \)

Corollary 2.7

Under Assumption 2.1, the linear operator \({\mathcal {A}}^*:H^1(\Omega ) \longrightarrow H^{1}(\Omega )'\) is an isomorphism.

3 Regularity of the Solution of the State and Adjoint State Equations

To obtain further regularity, from now on we will suppose

Assumption 3.1

The coefficients \(a_{ik}\) belong to \(C^{0,1}({\bar{\Omega }})\), \(1\le i,k\le 2\).

Let us denote by m the number of sides of \(\Gamma \) and \(\{S_j\}_{j=1}^m\) its vertices, ordered counterclockwise. For convenience denote also \(S_0=S_m\) and \(S_{m+1}=S_1\). We denote by \(\Gamma _j\) the side of \(\Gamma \) connecting \(S_{j}\) and \(S_{j+1}\), and by \(\omega _j\in (0,2\pi )\) the angle interior to \(\Omega \) at \(S_j\), i.e., the angle defined by \(\Gamma _{j}\) and \(\Gamma _{j-1}\), measured counterclockwise. Notice that \(\Gamma _{0}=\Gamma _m\). We use \((r_j,\theta _j)\) as local polar coordinates at \(S_j\), with \(r_j=\vert x-S_j\vert \) and \(\theta _j\) the angle defined by \(\Gamma _j\) and the segment \([S_j,x]\). In order to describe and analyze the regularity of the functions near the corners, we will introduce for every \(j\in \{1,\ldots ,m\}\) the infinite cone

$$\begin{aligned} K_j=\{x\in \mathbb {R}^2:0<r_j,\,0<\theta _j<\omega _j\}. \end{aligned}$$

For every \(j\in \{1,\ldots ,m\}\) we call \(A_j\) the operator with constant coefficients, corresponding to the corner \(S_j\), given by

$$\begin{aligned} A_j y = -\sum _{i,k = 1}^{2}\partial _{x_k}(a_{ik}(S_j)\partial _{x_i}y). \end{aligned}$$

We denote by \(\lambda _j\) the leading singular exponent associated with the operator \(A_j\) at the corner \(S_j\), i.e., the smallest \(\lambda _j>0\) such that there exists a solution of the form \(y_j=r_j^{\lambda _j}\varphi _j(\theta _j)\), with \(\varphi _j\) smooth enough, of

$$\begin{aligned} A_j y_j =0 \text{ in } K_j,\ \partial _{n_{A_j}} y_j =0 \text{ on } \partial K_j. \end{aligned}$$

For instance, for \(Ay = -\Delta y\) we have \(\lambda _j = \pi /\omega _j\). We denote \(\lambda = \min \{\lambda _j\}\).
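For the Laplacian this means, in particular, that

$$\begin{aligned} \omega _j<\pi \ \Longrightarrow \ \lambda _j>1, \qquad \omega _j\in (\pi ,2\pi )\ \Longrightarrow \ \lambda _j=\frac{\pi }{\omega _j}\in \left( \tfrac{1}{2},1\right) , \end{aligned}$$

so that \(\lambda <1\) occurs precisely at reentrant corners, i.e., for non-convex domains.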

With the usual technique of taking a partition of unity to localize the problem at the corners, freezing the coefficients and performing an appropriate linear change of variables, the classical results for the Laplace operator are also valid in our case; see, e.g., [18, Section 2.1] for a detailed example of the application of this technique. Notice that the symmetry hypothesis \(a_{ik}=a_{ki}\) introduced in Assumption 2.1 implies that the same change of variables that transforms \(A_j\) into \(-\Delta \) transforms the conormal derivative \(\partial _{n_{A_j}}\) into the normal derivative \(\partial _n\) in the new variables, and not into an oblique derivative.

We continue with regularity results for problems with \(b\equiv 0\) and \(a_0\equiv 0\) and use the standard Sobolev and Sobolev–Slobodeckiĭ spaces but also weighted Sobolev spaces as follows. Let \(k\in \mathbb {N}_0\), \(1\le p\le \infty \), and \(\mathbf {\beta }=(\beta _1,\ldots ,\beta _m)^T\in \mathbb {R}^m\). For \(j\in \{1,\ldots ,m\}\), ball-neighborhoods \(\Omega _{R_j}\) of \(S_j\) with radius \(R_j\le 1\) and \(\Omega ^0:=\Omega \setminus \bigcup _{j=1}^m \Omega _{R_j/2}\) we define norms via

$$\begin{aligned} \Vert v\Vert _{W^{k,p}_{\beta _j}(\Omega _{R_j})}^p&= \sum _{\mid \alpha \mid \le k} \Vert r_j^{\beta _j}D^\alpha v\Vert _{L^p(\Omega _{R_j})}^p, \\ \Vert v\Vert _{V^{k,p}_{\beta _j}(\Omega _{R_j})}^p&= \sum _{\mid \alpha \mid \le k} \Vert r_j^{\beta _j-k+\mid \alpha \mid }D^\alpha v\Vert _{L^p(\Omega _{R_j})}^p, \end{aligned}$$

where the standard modification for \(p=\infty \) is used. The spaces \(W^{k,p}_{\mathbf {\beta }}(\Omega )\) and \(V^{k,p}_{\mathbf {\beta }}(\Omega )\) denote the set of all functions v such that

$$\begin{aligned} \Vert v\Vert _{W^{k,p}_{\mathbf {\beta }}(\Omega )}&:= \Vert v\Vert _{W^{k,p}(\Omega ^0)} + \sum _{j=1}^m \Vert v\Vert _{W^{k,p}_{\beta _j}(\Omega _{R_j})}, \\ \Vert v\Vert _{V^{k,p}_{\mathbf {\beta }}(\Omega )}&:= \Vert v\Vert _{W^{k,p}(\Omega ^0)} + \sum _{j=1}^m \Vert v\Vert _{V^{k,p}_{\beta _j}(\Omega _{R_j})}, \end{aligned}$$

respectively, are finite. The corresponding seminorms are defined by setting \(\vert \alpha \vert =k\) instead of \(\vert \alpha \vert \le k\). For the definition of the corresponding trace spaces \(W^{k-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\), \(V^{k-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\), \(W^{k-1/p,p}_{\mathbf {\beta }}(\Gamma )\) and \(V^{k-1/p,p}_{\mathbf {\beta }}(\Gamma )\) we refer to [13, Sect. 6.2.10], see also [19, Section 2.2]. We will also use the notation \(L^p_{\mathbf {\beta }}(\Omega )\) for \(W^{0,p}_{\mathbf {\beta }}(\Omega )\).
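To see how these weights interact with the corner singularities, consider the prototype singular function \(y_j=r_j^{\lambda _j}\varphi _j(\theta _j)\) introduced above, whose second order derivatives behave like \(r_j^{\lambda _j-2}\) near \(S_j\). For \(\vert \alpha \vert =2\) we then have

$$\begin{aligned} \Vert r_j^{\beta _j}D^\alpha y_j\Vert _{L^p(\Omega _{R_j})}^p \sim \int _0^{R_j} r^{p(\beta _j+\lambda _j-2)}\,r\,\textrm{d}r<\infty \quad \text { if } \beta _j>2-\frac{2}{p}-\lambda _j, \end{aligned}$$

which is exactly the lower bound on \(\beta _j\) required in the results below.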

Lemma 3.2

Suppose that Assumption 3.1 holds. Consider \(f\in H^1(\Omega )'\) and \(g\in H^{1/2}(\Gamma )'\) such that

$$\begin{aligned} \langle f,1\rangle _\Omega + \langle g,1\rangle _\Gamma = 0, \end{aligned}$$

and let \(y\in H^{1}(\Omega )\) be the unique solution, up to a constant, of

$$\begin{aligned} \int _\Omega \sum _{i,k=1}^2 a_{ik}\partial _{x_i}y\partial _{x_k} z\,\textrm{d}x= \langle f,z\rangle _{\Omega } + \langle g,z\rangle _\Gamma \quad \forall z\in H^1(\Omega ). \end{aligned}$$

We have the following regularity results.

(a) If \(f\in H^{2-t}(\Omega )'\), and \(g\in \displaystyle \prod _{j=1}^m H^{t-3/2}(\Gamma _j)\) for some \(1<t<1+\lambda \), \(t\le 2\), then

$$\begin{aligned} y\in H^{t}(\Omega ) \text{ and } \vert y\vert _{H^{t}(\Omega )}\le C_{A,t}\Big (\Vert f\Vert _{ H^{2-t}(\Omega )'}+\sum _{j=1}^m\Vert g\Vert _{H^{t-3/2}(\Gamma _j)}\Big ). \end{aligned}$$

(b) If \(f\in L^r(\Omega )\) and \(g\in \displaystyle \prod _{j=1}^m W^{1-1/r,r}(\Gamma _j)\) for some \(1<r<\dfrac{2}{2-\lambda }\) if \(\lambda <2\), \(r>1\) arbitrary if \(\lambda \ge 2\), then

$$\begin{aligned} y\in W^{2,r}(\Omega ) \text{ and } \vert y\vert _{W^{2,r}(\Omega )}\le C_{A,r}\Big (\Vert f\Vert _{ L^{r}(\Omega )}+\sum _{j=1}^m\Vert g\Vert _{W^{1-1/r,r}(\Gamma _j)}\Big ). \end{aligned}$$

(c) Consider \(s\in (1,\infty )\) and \({\mathbf {\beta }}\) such that \(2-\dfrac{2}{s}-\lambda _j<\beta _j<2-\dfrac{2}{s}\), \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\). If \(f\in L^s_{\mathbf {\beta }}(\Omega )\) and \(g\in \prod _{j=1}^m V^{1-1/s,s}_{\mathbf {\beta }}(\Gamma _j)\), then

$$\begin{aligned} y\in W^{2,s}_{\mathbf {\beta }}(\Omega ) \text{ and } \vert y\vert _{W^{2,s}_{\mathbf {\beta }}(\Omega )}\le C_{A,{\mathbf {\beta }}}\Big (\Vert f\Vert _{ L^{s}_{\mathbf {\beta }}(\Omega )}+\sum _{j=1}^m\Vert g\Vert _{V^{1-1/s,s}_{\mathbf {\beta }}(\Gamma _j)}\Big ). \end{aligned}$$

Remark 3.3

Let us briefly comment on the function spaces appearing in the lemma. Notice that for \(t=2\), \(H^{t-2}(\Omega ) = H^{2-t}(\Omega )=L^2(\Omega )\), and for \(3/2<t\), \(H^{2-t}(\Omega ) = H^{2-t}_0(\Omega )\) and hence \(H^{t-2}(\Omega ) = H^{2-t}(\Omega )'\). Nevertheless, for \(1<t<3/2\), \(H^{2-t}(\Omega )' \ne H^{t-2}(\Omega )\). Also take into account that

$$\begin{aligned} \displaystyle \prod _{j=1}^m H^{t-3/2}(\Gamma _j) = H^{t-3/2}(\Gamma )&\text{ if } t<2,&\displaystyle \prod _{j=1}^m W^{1-1/r,r}(\Gamma _j)= W^{1-1/r,r}(\Gamma )&\text{ if } r<2. \end{aligned}$$

We remark that the mapping \(y\mapsto \partial _{n_A}y\) is linear and continuous from \(H^2(\Omega )\) onto \(\prod _{j=1}^m H^{1/2}(\Gamma _j)\); see [10, Theorem 1.5.2.8].

Regarding weighted spaces, we notice that \(V^{1-1/s,s}_{\mathbf {\beta }}(\Gamma )=W^{1-1/s,s}_{\mathbf {\beta }}(\Gamma )\) if \(\beta _j>1-\frac{2}{s}\) or \(\beta _j<-\frac{2}{s}\) for all \(j\in \{1,\ldots ,m\}\), while these spaces differ by a constant in the vicinity of each corner \(S_j\) where \(-\frac{2}{s}<\beta _j<1-\frac{2}{s}\), see [12, Theorem 2.1] or [14, page 131].

Proof of Lemma 3.2

The result in (a) can be deduced from [20, Theorem 9.2] for \(1<t<3/2\), from [21, Theorem (23.3)] for \(3/2<t<2\), and from [10, Corollary 4.4.4.14] for \(t=2\). The case \(t=3/2\) follows by interpolation. Statement (b) follows from [10, Corollary 4.4.4.14]. Part (c) follows by standard arguments but we did not find this particular result in the literature. Therefore we sketch the proof here for the case of constant coefficients. As said above, the result in the case of Lipschitz coefficients follows from this one using the localization-and-freezing technique.

We will use [13, Theorem 1.2.5] stating a similar result for a cone K and weighted V-spaces. For the problem under consideration and in our notation it says that \(y\in V^{2,s}_\beta (K)\) if \(f\in L^s_\beta (K)\) and \(g\in V^{1-1/s,s}_\beta (\partial K{\setminus } O)\) provided that \(s\in (1,\infty )\) and \(2-\frac{2}{s}-\beta \not \in \{k\lambda , k\in \mathbb {Z}\}\). To satisfy the latter condition we assume \(2-\frac{2}{s}-\lambda _j<\beta _j < 2-\frac{2}{s}\) for all \(j\in \{1,\ldots ,m\}\).

The localization of the problem to the vicinity of a vertex of the domain \(\Omega \) is achieved by using cut-off functions \(\zeta _j:\Omega \rightarrow [0,1]\) with \(\zeta _j\equiv 1\) in \(\Omega _{R_j/2}\), \(\zeta _j\equiv 0\) in \(\Omega {\setminus }\Omega _{R_j}\), and \(\partial _{n_{A_j}}\zeta _j=0\) on \(\partial \Omega \cap \partial \Omega _{R_j}\). We split \(y\in H^1(\Omega )\) into

$$\begin{aligned} y=\sum _{j=1}^m y_j+w,\quad \text {where}\quad y_j=\zeta _j(y-y(S_j)). \end{aligned}$$

With this construction we get \(y_j(S_j)=0\) and \(\text {supp}\,y_j\subset {\bar{\Omega }}_{R_j}\), so that we can consider the problem \(A y_j=f_j\) with Neumann boundary condition \(\partial _{n_{A_j}}y_j=g_j=\zeta _j g\) in the cone \(K_j\). For \(f_j\), we have

$$\begin{aligned} f_j=A\big (\zeta _j(y-y(S_j))\big )={\left\{ \begin{array}{ll} \zeta _jf & \text {in }\Omega _{R_j/2}, \\ \zeta _jf-{\mathfrak {b}}_j\cdot \nabla y-{\mathfrak {a}}_j(y-y(S_j)) & \text {in }\Omega _{R_j}\setminus \Omega _{R_j/2}, \\ 0 & \text {in }K_j\setminus \Omega _{R_j}, \end{array}\right. } \end{aligned}$$

with smooth functions \({\mathfrak {b}}_j\) and \({\mathfrak {a}}_j\) due to the constant coefficients in A. From \(f\in L^s_{\mathbf {\beta }}(\Omega )\) and \(y\in H^1(\Omega )\) we conclude \(f_j\in L^{{\hat{s}}}_{\beta _j}(K_j)\) with \({\hat{s}}=\min (s,2)\), where we use that \(\beta _j\ge 0\). Moreover, the assumption \(g\in \prod _{j=1}^mV^{1-1/s,s}_{\mathbf {\beta }}(\Gamma _j)\) gives \(g_j\in V^{1-1/s,s}_{\beta _j}(\partial K_j{\setminus } O_j)\), so that [13, Theorem 1.2.5] yields \(y_j\in V^{2,{\hat{s}}}_{\beta _j}(K_j)\hookrightarrow W^{2,{\hat{s}}}_{\beta _j}(K_j)\). Since the function w does not contain corner singularities, we have \(w\in W^{2,s}(\Omega )\), and we obtain \(y\in W^{2,{\hat{s}}}_{\mathbf {\beta }}(\Omega )\). If \(s\le 2\) we are done.

Otherwise, when \(s>2\), we have \(y\in H^2(\Omega _{R_j}{\setminus }\Omega _{R_j/2})\hookrightarrow W^{1,s}(\Omega _{R_j}{\setminus }\Omega _{R_j/2})\), and we can repeat the argument with \(f_j\in L^s_{\beta _j}(K_j)\) and \(y_j\in V^{2,s}_{\beta _j}(K_j)\hookrightarrow W^{2,s}_{\beta _j}(K_j)\), leading to \(y\in W^{2,s}_{\mathbf {\beta }}(\Omega )\). \(\square \)

Theorem 3.4

Suppose that Assumptions 2.1 and 3.1 hold. Consider \(f\in H^1(\Omega )'\) and \(u\in H^{1/2}(\Gamma )'\) and let \(y\in H^{1}(\Omega )\) be the unique solution of

$$\begin{aligned} \langle {\mathcal {A}}y,z\rangle _\Omega = \langle f,z\rangle _\Omega + \langle u,z\rangle _\Gamma \quad \forall z\in H^1(\Omega ). \end{aligned}$$
(3.1)

We have the following regularity results.

(a) If \(a_0\in L^q(\Omega )\), \(f\in H^{2-t}(\Omega )'\) and \(u\in \prod _{j=1}^m H^{t-3/2}(\Gamma _j)\) for some t such that \(1<t<1+\lambda \), \(t\le 2\) and \(q = \dfrac{2}{3-t}\), then \(y\in H^{t}(\Omega )\) and there exists a constant \(C_{{\mathcal {A}},t}>0\) such that

$$\begin{aligned} \Vert y\Vert _{H^{t}(\Omega )}\le C_{{\mathcal {A}},t}(\Vert f\Vert _{H^{2-t}(\Omega )'}+\sum _{j=1}^m\Vert u\Vert _{H^{t-3/2}(\Gamma _j)}). \end{aligned}$$

(b) If \(a_0\in L^r(\Omega )\), \(f\in L^r(\Omega )\) and \(u\in \prod _{j=1}^m W^{1-1/r,r}(\Gamma _j)\) for some \(r\in (1,{{\hat{p}}}]\) satisfying \(r<\dfrac{2}{2-\lambda }\) in case of \(\lambda <2\), then \(y\in W^{2,r}(\Omega )\) and there exists a constant \(C_{{\mathcal {A}},r}>0\) such that

$$\begin{aligned} \Vert y\Vert _{W^{2,r}(\Omega )}\le C_{{\mathcal {A}},r}(\Vert f\Vert _{ L^{r}(\Omega )}+\sum _{j=1}^m\Vert u\Vert _{ W^{1-1/r,r}(\Gamma _j)}). \end{aligned}$$

(c) If \(a_0\in L^{ p}_{\mathbf {\beta }}(\Omega )\), \(f\in L^{p}_{\mathbf {\beta }}(\Omega )\) and \(u\in \prod _{j=1}^m W^{1-1/{ p}, p}_{\mathbf {\beta }}(\Gamma _j)\) for some \(p\in (1,2]\) and some \(\mathbf {\beta }\) such that \(2-\dfrac{2}{p}-\lambda _j<\beta _j <2-\dfrac{2}{p}\) and \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\), then \(y\in W^{2,p}_{\mathbf {\beta }}(\Omega )\) and there exists a constant \(C_{{\mathcal {A}},\mathbf {\beta },p}>0\) such that

$$\begin{aligned} \Vert y\Vert _{W^{2,p}_{\mathbf {\beta }}(\Omega )}\le C_{{\mathcal {A}},{\mathbf {\beta }},p}(\Vert f\Vert _{ L^{p}_{\mathbf {\beta }}(\Omega )}+\sum _{j=1}^m\Vert u\Vert _{W^{1-1/{p},p}_{\mathbf {\beta }}(\Gamma _j)}). \end{aligned}$$

Proof

Let us define

$$\begin{aligned}F = -b\cdot \nabla y -a_0 y.\end{aligned}$$

From the proof of Lemma 2.3, we know that \(F\in H^1(\Omega )'\). Also, taking \(z=1\) in (3.1), we have that

$$\begin{aligned} \langle f+F,1\rangle _\Omega + \langle u,1\rangle _\Gamma = 0, \end{aligned}$$

so the conditions of Lemma 3.2 apply to the problem

$$\begin{aligned} \langle A y,z\rangle _\Omega = \langle f+F,z\rangle _\Omega + \langle u,z\rangle _\Gamma \quad \forall z\in H^1(\Omega ). \end{aligned}$$

We have to investigate the regularity of F.

(a) For \(1<\tau \le t\), define \(S = \{z\in H^{2-\tau }(\Omega ):\ \Vert z\Vert _{H^{2-\tau }(\Omega )} = 1\}\). We have that \(F \in H^{2-\tau }(\Omega )'\) if and only if

$$\begin{aligned} \Vert F\Vert _{H^{2-\tau }(\Omega )'} = \sup _{z\in S}\vert \langle F,z\rangle _\Omega \vert < +\infty . \end{aligned}$$

Applying Hölder’s inequality, we can deduce the existence of a constant \(C_\Omega >0\), which may depend on the measure of \(\Omega \), such that

$$\begin{aligned} \vert \langle F,z\rangle _\Omega \vert&= \left| \int _\Omega (b\cdot \nabla y+a_0 y)z\,\textrm{d}x\right| \nonumber \\&\le C_\Omega \big (\Vert b\Vert _{L^{{\hat{p}}}(\Omega )^2} \Vert \nabla y\Vert _{L^{r_p}(\Omega )} + \Vert a_0\Vert _{L^q(\Omega )} \Vert y\Vert _{L^{r_q}(\Omega )}\big )\Vert z\Vert _{L^s(\Omega )} \end{aligned}$$
(3.2)

where

$$\begin{aligned} \frac{1}{{{\hat{p}}}}+\frac{1}{r_p}+\frac{1}{s}\le 1,\quad \frac{1}{q}+\frac{1}{r_q}+\frac{1}{s}\le 1. \end{aligned}$$
(3.3)

Let us also notice that \(H^{2-\tau }(\Omega )\hookrightarrow L^s(\Omega )\) if

$$\begin{aligned} \tau = 1 + \frac{2}{s}. \end{aligned}$$
(3.4)

We will apply a boot-strap argument.

Step 1. We know that \(y\in H^1(\Omega )\), so \(r_p=2\), and by (2.3) we can take any finite exponent \(r_q\). Noting that \(q>1\), using (3.3) and taking

$$\begin{aligned} \frac{1}{s} = \min \left\{ 1-\frac{1}{{{\hat{p}}}}-\frac{1}{r_p},1-\frac{1}{q}-\frac{1}{r_q}\right\} , \end{aligned}$$

we have that \(1/s>0\) for \(r_q\) big enough and both conditions in (3.3) are satisfied. Hence we deduce that \(F\in H^{2-\tau }(\Omega )'\), with \(\tau \) given by (3.4). Since \(u\in \prod _{j = 1}^{m}H^{t-3/2}(\Gamma _j)\), a direct application of Lemma 3.2 yields that \(y\in H^{\min \{t,\tau \}}(\Omega )\). If \(\tau \ge t\), the proof is complete.

Step 2. Otherwise we have that \(\nabla y\in H^{\tau -1}(\Omega )^2\hookrightarrow L^{r_p}(\Omega )^2\) for

$$\begin{aligned} \frac{1}{r_p} = 1-\frac{\tau }{2} \end{aligned}$$

and, since \(\tau > 1\), we can take \(r_q = +\infty \). As before, we select

$$\begin{aligned} \frac{1}{s} = \min \left\{ 1-\frac{1}{{{\hat{p}}}}-\frac{1}{r_p},1-\frac{1}{q}\right\} . \end{aligned}$$

We have two possibilities now.

Step 3. If \(\dfrac{1}{s} = 1-\dfrac{1}{q}\), then, applying (3.4) and taking into account our choice of q, we have that \(y\in H^{{\hat{\tau }}}(\Omega )\) with

$$\begin{aligned} {\hat{\tau }} = 1+\frac{2}{s} = 3-\frac{2}{q} = t, \end{aligned}$$

and the proof is complete.

Step 4. Otherwise, \(\dfrac{1}{s} = 1-\dfrac{1}{{{\hat{p}}}}-\dfrac{1}{r_p}\) and we will have \(y\in H^{{\hat{\tau }}}(\Omega )\) with

$$\begin{aligned} {\hat{\tau }} = 1+\frac{2}{s} = 1+2-\frac{2}{{{\hat{p}}}}-(2-\tau ) = \tau +1-\frac{2}{{{\hat{p}}}}, \end{aligned}$$

and we have advanced a fixed amount \(1-\dfrac{2}{{{\hat{p}}}}\). If \(\hat{\tau } \ge t\), the proof is complete.

Step 5. Otherwise, we redefine \(\tau = {\hat{\tau }}\) and go back to Step 2.

Every time we repeat the process, either we finish the proof or we increase \(\tau \) by the fixed amount \(1-\frac{2}{{{\hat{p}}}}\), so the proof ends in a finite number of steps.
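To illustrate the bookkeeping with concrete (purely illustrative) values, take \({{\hat{p}}}=4\), \(t=2\) and hence \(q=2\). Step 1 gives, for \(r_q\) large enough,

$$\begin{aligned} \frac{1}{s}=1-\frac{1}{{{\hat{p}}}}-\frac{1}{2}=\frac{1}{4},\qquad \tau =1+\frac{2}{s}=\frac{3}{2}, \end{aligned}$$

so \(y\in H^{3/2}(\Omega )\); in the second pass \(1/r_p=1-\tau /2=1/4\), hence \(1/s=\min \{1-\frac{1}{4}-\frac{1}{4},\,1-\frac{1}{2}\}=\frac{1}{2}\) and \({\hat{\tau }}=1+\frac{2}{s}=2=t\), so the boot-strap stops after two passes.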

(b) From the Sobolev embedding theorem, we have that

$$\begin{aligned}f\in L^r(\Omega )\hookrightarrow H^{2-t}(\Omega )'\text { and }u\in \displaystyle \prod _{j=1}^m W^{1-1/r,r}(\Gamma _j)\hookrightarrow \displaystyle \prod _{j=1}^m H^{t-3/2}(\Gamma _j)\end{aligned}$$

for \(t =\min \{2, 3-2/r\}\). The conditions imposed on r imply that \(1<t<1+\lambda \), \(t\le 2\), so we can apply Theorem 3.4(a) to obtain \(y\in H^t(\Omega )\) and we readily have that \(y\in L^\infty (\Omega )\) and hence \(a_0 y\in L^r(\Omega )\). Let us investigate the regularity of \(b\cdot \nabla y\). We use again a boot-strap argument.

We have that \(\nabla y\in H^{t-1}(\Omega )\hookrightarrow L^{\frac{2}{2-t}}(\Omega )\). Therefore \(b\cdot \nabla y\in L^s(\Omega )\) where

$$\begin{aligned} \frac{1}{s} = \frac{1}{{{\hat{p}}}}+\frac{2-t}{2}<1. \end{aligned}$$

Applying Lemma 3.2(b), we have that \(y\in W^{2,\min \{s,r\}}(\Omega )\). If \(s\ge r\), the proof is complete. Otherwise, we have that \(\nabla y\in W^{1,s}(\Omega )\hookrightarrow L^{s^*}(\Omega )\), with

$$\begin{aligned} \frac{1}{s^*} = \frac{1}{s}-\frac{1}{2}. \end{aligned}$$

Therefore \(b\cdot \nabla y\in L^{{\hat{s}}}(\Omega )\) where

$$\begin{aligned} \frac{1}{{\hat{s}}} = \frac{1}{{{\hat{p}}}}+\frac{1}{s^*} = \frac{1}{{{\hat{p}}}} + \frac{1}{s}-\frac{1}{2} = \frac{1}{s}-\left( \frac{1}{2}-\frac{1}{{{\hat{p}}}}\right) . \end{aligned}$$

If \(\frac{1}{{\hat{s}}}\le \frac{1}{r}\), then the proof is complete. Otherwise, we can rename \(s:={\hat{s}}\) and repeat the argument subtracting at each step the positive constant \(\dfrac{1}{2}-\dfrac{1}{{{\hat{p}}}}\) until \(\dfrac{1}{{\hat{s}}}\le \dfrac{1}{r}\).

(c) To obtain this result, we want to apply Lemma 3.2(c), but the boundary datum in that result is in the space \(\prod _{j = 1}^m V^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\), while the boundary datum in this result is in \(\prod _{j = 1}^m W^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\). Taking into account Remark 3.3, it is clear that for \(p<2\), the condition \(\beta _j\ge 0\) implies that \(\beta _j > 1-2/p\) and hence \(W^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j) = V^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\) for all \(j\in \{1,\ldots ,m\}\). If \(p=2\), we define

$$\begin{aligned} u_s =\displaystyle \sum _{\beta _j>0} u\zeta _j, \end{aligned}$$

where the \(\zeta _j\) are the cutoff functions introduced in the proof of Lemma 3.2(c). Taking into account again Remark 3.3 and noting that \(u_s\equiv 0\) in a neighbourhood of the corners \(S_j\) with \(\beta _j=0\), it is readily deduced that \(u_s\in \prod _{j=1}^m V^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\). We also have that the function \(u_r = u-u_s\in \prod _{j=1}^m W^{1-1/p,p}(\Gamma _j)\), because \(u_r\equiv 0\) in a neighbourhood of the corners \(S_j\) such that \(\beta _j > 0\). In the same way we define

$$\begin{aligned} f_s = \displaystyle \sum _{\beta _j>0} f\zeta _j \in L^p_{\mathbf {\beta }}(\Omega )\text { and } f_r= f-f_s\in L^p(\Omega ), \end{aligned}$$

and consider \(y_s,y_r\in H^1(\Omega )\) such that

$$\begin{aligned} \langle {\mathcal {A}}y_r,z\rangle _\Omega = \langle f_r,z\rangle _\Omega + \langle u_r,z\rangle _\Gamma ,\text { and } \langle {\mathcal {A}}y_s,z\rangle _\Omega = \langle f_s,z\rangle _\Omega + \langle u_s,z\rangle _\Gamma \quad \forall z\in H^1(\Omega ), \end{aligned}$$

so that \(y =y_r+y_s\). As an application of Theorem 3.4(b), \(y_r\in W^{2,2}(\Omega )\), which is continuously embedded in \(W^{2,2}_{\mathbf {\beta }}(\Omega )\) because \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\).

Taking into account the above considerations, in the rest of the proof we assume that \(\beta _j > 1-2/p\). If \(p< 2\) then this holds, as discussed before. If \(p=2\), only the term \(y_s\) remains to be studied and, abusing notation, we write \(u\), \(f\) and \(y\) instead of \(u_s\), \(f_s\) and \(y_s\). In both cases we can then use both that \(u\in \prod _{j=1}^m W^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\), which is needed to have an embedding in a non-weighted Sobolev space, and \(u\in \prod _{j=1}^m V^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\), which is needed to apply Lemma 3.2(c).

From [19, Lemma 2.29(ii)], we deduce that \(L^{p}_{\mathbf {\beta }}(\Omega )\hookrightarrow L^r(\Omega )\) for all \(r<\frac{2}{{\beta _j}+2/p}\le \frac{2}{2/p}= p\) for all \(j\in \{1,\ldots ,m\}\). On the other hand, using the definition of the \(\prod _{j = 1}^mW^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\)-norm and [19, Lemma 2.29(i)], we have the embedding \(\prod _{j = 1}^mW^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\hookrightarrow \prod _{j=1}^m W^{1-1/r,r}(\Gamma _j)\) for the same r as above. We notice at this point that the assumption \(\beta _j<2-\frac{2}{p}\) implies that \(\frac{2}{{\beta _j}+2/p}>1\), and \(2-\frac{2}{p}-\lambda _j <{\beta _j}\) implies \(r < \dfrac{2}{2-\lambda _j}\) for all j, therefore we can choose some \(r>1\) satisfying the assumptions of Theorem 3.4(b) and we have that \(a_0\in L^r(\Omega )\), \(f\in L^r(\Omega )\), and \(u\in \prod _{j=1}^m W^{1-1/r,r}(\Gamma _j)\). By Theorem 3.4(b) we obtain \(y\in W^{2,r}(\Omega )\) for some \(r>1\).
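For completeness, the first of these embeddings can be checked directly near \(S_j\): Hölder’s inequality with the exponents \(p/r\) and \(p/(p-r)\) gives

$$\begin{aligned} \int _{\Omega _{R_j}}\vert v\vert ^r\,\textrm{d}x\le \Big (\int _{\Omega _{R_j}} r_j^{-\frac{\beta _j r p}{p-r}}\,\textrm{d}x\Big )^{\frac{p-r}{p}}\, \Vert r_j^{\beta _j}v\Vert _{L^p(\Omega _{R_j})}^{r}, \end{aligned}$$

and the first factor is finite whenever \(\frac{\beta _j r p}{p-r}<2\), that is, whenever \(r<\frac{2}{\beta _j+2/p}\).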

In particular, the result \(y\in W^{2,r}(\Omega )\) implies \(y\in L^\infty (\Omega )\), and hence \(a_0 y \in L^{ p}_{{\mathbf {\beta }}}(\Omega )\). We also have that \(\nabla y\in W^{1,r}(\Omega )^2\hookrightarrow L^{s_y}(\Omega )^2\) for \(s_y = \dfrac{2r}{2-r}\) if \(r<2\), any \(s_y<+\infty \) if \(r=2\) and \(s_y=+\infty \) if \(r>2\). From this we deduce that \(b\cdot \nabla y \in L^s(\Omega )\) for

$$\begin{aligned} \frac{1}{s} = \frac{1}{{{\hat{p}}}}+\frac{1}{s_y}. \end{aligned}$$

Now we use that \({\mathbf {\beta }}\ge 0\) to deduce that \(b\cdot \nabla y\in L^{s}_{{\mathbf {\beta }}}(\Omega )\) and hence \(F = -b\cdot \nabla y -a_0 y \in L^{\min \{s,p\}}_{{\mathbf {\beta }}}(\Omega )\). By applying Lemma 3.2(c), we have that \(y\in W^{2,{\min \{s,p\}}}_{\mathbf {\beta }}(\Omega )\). If \(s\ge p\), the proof is complete.

Otherwise, in case \(s < p \le 2\), from Sobolev’s embedding theorem, we have that \(\nabla y\in W^{1,s}_{\mathbf {\beta }}(\Omega )\hookrightarrow L^{s_y}_{\mathbf {\beta }}(\Omega )\) for

$$\begin{aligned} \frac{1}{s_y}=\frac{1}{s}-\frac{1}{2}=\frac{2-s}{2s}\iff s_y = \frac{2s}{2-s}. \end{aligned}$$

Since \(\mathbf {\beta }\ge \textbf{0}\), using that \(b\in L^{{\hat{p}}}(\Omega )^2\), we have that \(b\cdot \nabla y \in L^{{\hat{s}}}_{\mathbf {\beta }}(\Omega )\), where

$$\begin{aligned} \frac{1}{{\hat{s}}} = \frac{1}{s_y}+\frac{1}{{{\hat{p}}}} = \frac{1}{s}-\left( \frac{1}{2}-\frac{1}{{{\hat{p}}}}\right) . \end{aligned}$$
(3.5)

By applying Lemma 3.2(c), we have that \(y\in W^{2,\min \{p,\hat{s}\}}_{\mathbf {\beta }}(\Omega )\). If \({\hat{s}}\ge p\), the proof is complete. Otherwise, we redefine \(s:={\hat{s}}\) and repeat the last step. Since at each iteration we subtract the positive constant \(\dfrac{1}{2}-\dfrac{1}{{{\hat{p}}}}\) from \(\frac{1}{s}\), the proof ends in a finite number of steps. \(\square \)

We conjecture that the result of Theorem 3.4(c) holds for \(p\in (1,{\hat{p}}]\), but the proof is limited to \(p\le 2\).

Notice that the operator \({\mathcal {A}}^*\) is different from \({\mathcal {A}}\), and hence the results in Theorem 3.4 are not immediately applicable. For the adjoint state equation, we will need an additional assumption on \(b\cdot n\), which stems from the boundary term arising in the integration by parts.

Theorem 3.5

Suppose Assumptions 2.1 and 3.1 hold. Consider \(f\in H^1(\Omega )'\) and \(g\in H^{1/2}(\Gamma )'\) and let \(\varphi \in H^1(\Omega )\) be the unique solution of

$$\begin{aligned} \langle {\mathcal {A}}^*\varphi ,z\rangle _\Omega = \langle f,z\rangle _{\Omega }+\langle g,z\rangle _\Gamma \quad \forall z\in H^1(\Omega ). \end{aligned}$$
(3.6)

(a) If \(a_0,\, \nabla \cdot b\in L^q(\Omega )\), \(b\cdot n\in L^{q_\Gamma }(\Gamma )\cap H^{t-3/2}(\Gamma )\), \(f\in H^{2-t}(\Omega )'\), and \(g\in \prod _{j=1}^m H^{t-3/2}(\Gamma _j)\) for \(1<t<1+\lambda \), \(t\le 2\), \(q = \dfrac{2}{3-t}\), and \(q_\Gamma = \min \{2,1/(2-t)\}\), then \(\varphi \in H^{t}(\Omega )\), and there exists a constant \(C_{{\mathcal {A}}^*,t}>0\) such that

$$\begin{aligned} \Vert \varphi \Vert _{H^{t}(\Omega )} \le C_{{\mathcal {A}}^*,t}\Big ( \Vert f\Vert _{ H^{2-t}(\Omega )'} + \sum _{j=1}^m\Vert g\Vert _{H^{t-3/2}(\Gamma _j)}\Big ). \end{aligned}$$

(b) If \(a_0,\, \nabla \cdot b,\, f\in L^r(\Omega )\), and \(g,b\cdot n\in \prod _{j=1}^m W^{1-1/r,r}(\Gamma _j)\) for some \(r\in (1,{{\hat{p}}}]\) satisfying \(r<\dfrac{2}{2-\lambda }\) in case of \(\lambda <2\), then \(\varphi \in W^{2,r}(\Omega )\), and there exists a constant \(C_{{\mathcal {A}}^*,r}>0\) such that

$$\begin{aligned} \Vert \varphi \Vert _{W^{2,r}(\Omega )} \le C_{{\mathcal {A}}^*,r}\Big (\Vert f\Vert _{ L^{r}(\Omega )} + \sum _{j=1}^m\Vert g\Vert _{W^{1-1/r,r}(\Gamma _j)}\Big ). \end{aligned}$$

(c) If \(a_0,\, \nabla \cdot b,\, f\in L^{p}_{{\mathbf {\beta }}}(\Omega )\), and \(b\cdot n,\,g\in \prod _{j = 1}^m W^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\) for some \(p\in (1,2]\) and some \(\mathbf {\beta }\) such that \( 2-\frac{2}{p}-\lambda _j<\beta _j < 2-\frac{2}{p}\), \(\beta _j \ge 0\), for all \(j\in \{1,\ldots ,m\}\), then \(\varphi \in W^{2,p}_{\mathbf {\beta }}(\Omega )\) and there exists a constant \(C_{{\mathcal {A}}^*,\mathbf {\beta },p}>0\) such that

$$\begin{aligned} \Vert \varphi \Vert _{W^{2,p}_{{\mathbf {\beta }}}(\Omega )}\le C_{{\mathcal {A}}^*,\mathbf {\beta },p}\Big (\Vert f\Vert _{ L^{p}_{\mathbf {\beta }}(\Omega )}+ \sum _{j=1}^m\Vert g\Vert _{W^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)}\Big ). \end{aligned}$$

Proof

The expression for \(\langle {\mathcal {A}}^*\varphi ,z\rangle _\Omega \) is derived in Lemma 2.5. Using the product rule, we have that the function \(\varphi \) satisfies

$$\begin{aligned} \int _\Omega \sum _{i,k = 1}^2 a_{ki}\partial _{x_i}\varphi \partial _{x_k} z \,\textrm{d}x&-\int _\Omega (b\cdot \nabla \varphi ) z \,\textrm{d}x+ \int _\Omega a_0 \varphi z\,\textrm{d}x\\ =&\int _\Omega (\nabla \cdot b )\varphi z\,\textrm{d}x-\int _\Gamma \varphi (b\cdot n) z\,\,\textrm{d}x+ \langle f,z\rangle _\Omega +\langle g,z\rangle _\Gamma \end{aligned}$$

and we can apply Theorem 3.4 to this problem provided \(\varphi \nabla \cdot b\) and \(\varphi b\cdot n\) are in the appropriate spaces.

Notice that statement (a) for \(t=2\) is the same as statement (b) for \(r=2\). We will prove (a) for \(t<2\), and refer to (b) for \(t=2\).

Step 1: First, we prove \(W^{1,\delta }(\Omega )\) regularity for some \(\delta >2\).

Let us write the equation as

$$\begin{aligned} \left\{ \begin{array}{rcll} A\varphi + \varphi &=& f+\varphi \nabla \cdot b + b\cdot \nabla \varphi - a_0\varphi + \varphi &\text { in }\Omega ,\\ \partial _{n_A}\varphi &=& -(b\cdot n)\, \varphi +g &\text { on }\Gamma . \end{array} \right. \end{aligned}$$

This is a Neumann problem posed on a Lipschitz domain. We will apply the regularity results in [22]. To that end, we first investigate the existence of \(r_f>2\) and \(q_\Gamma > 1\) such that \(f\in W^{1,r_f'}(\Omega )'\) and \(b\cdot n\in L^{q_\Gamma }(\Gamma )\). In each of the three cases, we have:

(a) \(f\in H^{2-t}(\Omega )'\hookrightarrow W^{1,r_f'}(\Omega )'\) for \(r_f=\dfrac{2}{2-t}>2\) since \(1<t<2\). The exponent \(q_\Gamma \) is given in the theorem.

(b) \(f\in L^r(\Omega )\hookrightarrow W^{1,r_f'}(\Omega )'\) for \(r_f = \dfrac{2r}{2-r} > 2\) if \(1<r<2\) and all \(r_f<+\infty \) if \(r\ge 2\). In this case we take \(q_\Gamma = r>1\).

(c) \(f\in L^{p}_{\mathbf {\beta }}(\Omega )\subset L^r(\Omega )\) for \(r<\dfrac{2}{\beta _j+\frac{2}{p}}\) for all \(j\in \{1,\ldots ,m\}\). The condition \(\beta _j < 2-\dfrac{2}{p}\) implies that \(\dfrac{2}{\beta _j+\frac{2}{p}} > 1\), so we can choose \(r>1\) and \(L^r(\Omega )\hookrightarrow W^{1,r_f'}(\Omega )'\) for \(r_f = \dfrac{2r}{2-r} > 2\). Therefore \(f\in W^{1,r_f'}(\Omega )'\) for all \(2< r_f <\dfrac{2p}{2-(1-\beta _j) p}\). In this case we take \(q_\Gamma = r>1\).

Note also that in each of the three cases the assumptions on \(a_0\) and \(\nabla \cdot b\) differ, but in any case there exists \(q_0>1\) such that \(a_0, \nabla \cdot b\in L^{q_0}(\Omega )\).

Let us check that also \(F = \varphi \nabla \cdot b + b\cdot \nabla \varphi - a_0\varphi +\varphi \in W^{1,r_\Omega '}(\Omega )'\) for some \(r_\Omega >2\). To this end define \(r_\varphi \), \(s_\Omega \) and \(r_\Omega \) by

$$\begin{aligned} \frac{1}{s_\Omega }=\frac{1}{r_\varphi }= \min \left\{ \frac{1}{2}\left( 1-\frac{1}{q_0}\right) , \frac{1}{2}-\frac{1}{p}\right\} \in (0,\tfrac{1}{2})\text { and } \frac{1}{r_\Omega } = \frac{1}{2}-\frac{1}{s_\Omega }\in (0,\tfrac{1}{2}) \end{aligned}$$

such that

$$\begin{aligned} \frac{1}{r_\varphi }+\frac{1}{q_0}+\frac{1}{s_\Omega }\le 1\text { and }\frac{1}{p}+\frac{1}{2}+\frac{1}{s_\Omega }\le 1 \end{aligned}$$

and \(W^{1,r_\Omega '}(\Omega )\hookrightarrow L^{s_\Omega }(\Omega )\). Using Lemma 2.6, we have that \(\varphi \in H^1(\Omega )\hookrightarrow L^{r_\varphi }(\Omega )\). Denote \(S = \{z\in W^{1,r_\Omega '}(\Omega ): \Vert z\Vert _{W^{1,r_\Omega '}(\Omega )} = 1\}\). Then

$$\begin{aligned} \Vert&F\Vert _{W^{1,r_\Omega '}(\Omega )'} = \sup _{z\in S}\int _\Omega \left( \varphi \nabla \cdot b + b\cdot \nabla \varphi + a_0\varphi -\varphi \right) z\,\textrm{d}x\\&\le C\sup _{z\in S} \big (\Vert \varphi \Vert _{L^{r_\varphi }(\Omega )} \Vert 1+a_0+\nabla \cdot b\Vert _{L^{q_0}(\Omega )} + \Vert b\Vert _{L^p(\Omega )}\Vert \nabla \varphi \Vert _{L^2(\Omega )} \big )\Vert z\Vert _{L^{s_\Omega }(\Omega )}\\&\le C_{r_\Omega }\sup _{z\in S}\big (\Vert \varphi \Vert _{L^{r_\varphi }(\Omega )} (\Vert 1+a_0+\nabla \cdot b\Vert _{L^{q_0}(\Omega )} )+ \Vert b\Vert _{L^p(\Omega )}\Vert \nabla \varphi \Vert _{L^2(\Omega )} \big )\Vert z\Vert _{W^{1,r_\Omega '}(\Omega )}\\&= C_{r_\Omega }\big (\Vert \varphi \Vert _{L^{r_\varphi }(\Omega )} (\Vert 1+a_0+\nabla \cdot b\Vert _{L^{q_0}(\Omega )} )+ \Vert b\Vert _{L^p(\Omega )}\Vert \nabla \varphi \Vert _{L^2(\Omega )} \big ). \end{aligned}$$

On the boundary, we want to check that \(b\cdot n \varphi \in W^{-1/r_\Gamma ,r_\Gamma }(\Gamma ) = W^{1/r_\Gamma ,r_\Gamma '}(\Gamma )'\) for some \(r_\Gamma >2\). To this end, define \(\hat{r}_\varphi \), \(s_\Gamma \) and \(r_\Gamma \) by

$$\begin{aligned} \frac{1}{s_\Gamma }=\frac{1}{\hat{r}_\varphi } = \frac{1}{2}\left( 1-\frac{1}{q_\Gamma }\right) \in (0,\tfrac{1}{2})\text { and }\frac{1}{r_\Gamma } = \frac{1}{2}\left( 1-\frac{1}{s_\Gamma }\right) \in (0,\tfrac{1}{2}) \end{aligned}$$

such that

$$\begin{aligned} \frac{1}{\hat{r}_\varphi }+\frac{1}{q_\Gamma }+\frac{1}{s_\Gamma } = 1 \end{aligned}$$

and \(W^{1/r_\Gamma ,r_\Gamma '}(\Gamma )\hookrightarrow L^{s_\Gamma }(\Gamma )\). From Lemma 2.3 and the trace theorem, we have that \(\varphi \in H^{1/2}(\Gamma )\hookrightarrow L^{\hat{r}_\varphi }(\Gamma )\). Denote \(S = \{z\in W^{1/r_\Gamma ,r_\Gamma '}(\Gamma ):\ \Vert z\Vert _{W^{1/r_\Gamma ,r_\Gamma '}(\Gamma )} =1\}\). Then

$$\begin{aligned}&\Vert b\cdot n \varphi \Vert _{W^{-1/r_\Gamma ,r_\Gamma }(\Gamma )} = \sup _{z\in S}\int _\Gamma b\cdot n \varphi z\,\,\textrm{d}x\le \sup _{z\in S} \Vert b\cdot n\Vert _{L^{q_\Gamma }(\Gamma )} \Vert \varphi \Vert _{L^{\hat{r}_\varphi }(\Gamma )} \Vert z\Vert _{L^{s_\Gamma }(\Gamma )}\\&\le C_{r_\Gamma } \sup _{z\in S} \Vert b\cdot n\Vert _{L^{q_\Gamma }(\Gamma )} \Vert \varphi \Vert _{L^{\hat{r}_\varphi }(\Gamma )} \Vert z\Vert _{W^{1/r_\Gamma ,r_\Gamma '}(\Gamma )} = C_{r_\Gamma } \Vert \varphi \Vert _{L^{\hat{r}_\varphi }(\Gamma )}\Vert b\cdot n\Vert _{L^{q_\Gamma }(\Gamma )}. \end{aligned}$$

Noting that for a general Lipschitz domain the \(W^{1,\delta }(\Omega )\) regularity is limited to \(\delta \le 4\) (see [22]), we deduce from the previous estimates that \(\varphi \in W^{1,\delta }(\Omega )\) for \(\delta = \min \{4,r_f,r_\Omega ,r_\Gamma \}>2\).

Step 2: Let us check that \(\varphi \nabla \cdot b\) and \(\varphi b\cdot n\) satisfy the regularity assumptions for the source and the Neumann data respectively of the different cases of Theorem 3.4.

(a) On one hand \(\varphi \in W^{1,\delta }(\Omega )\hookrightarrow L^\infty (\Omega )\) and the assumption \(\nabla \cdot b\in L^q(\Omega )\) imply \(\varphi \nabla \cdot b\in L^q(\Omega )\hookrightarrow H^{2-t}(\Omega )'\), by the definition of q. On the boundary, by the trace theorem \(\varphi \in W^{1-1/\delta ,\delta }(\Gamma )\). If \(1<t\le 3/2\), then we use that \(W^{1-1/\delta ,\delta }(\Gamma )\hookrightarrow L^\infty (\Gamma )\) to conclude that \(\varphi b\cdot n\in L^{q_\Gamma }(\Gamma )\hookrightarrow H^{t-3/2}(\Gamma )\). The last inclusion follows by duality and the Sobolev imbedding \(H^{3/2-t}(\Gamma )\hookrightarrow L^{\frac{1}{t-1}}(\Gamma )\). If \(3/2<t<2\), we use that \(W^{1-1/\delta ,\delta }(\Gamma )\hookrightarrow H^{s_1}(\Gamma )\) for \(s_1=1-1/\delta > 1/2\). Since we are assuming that \(b\cdot n\in H^{s_2}(\Gamma )\) with \(s_2 = t-3/2\in (0,1/2)\), from the trace theorem and the multiplication theorem [23, Theorem 7.4], we have that \(\varphi b\cdot n\in H^{t-3/2}(\Gamma )\).

The result follows from Theorem 3.4(a).

(b) Using again that \(\varphi \in L^\infty (\Omega )\), we readily deduce that \(\varphi \nabla \cdot b\in L^r(\Omega )\). Let us check that \(\varphi b\cdot n\in \prod _{j = 1}^m W^{1-1/r,r}(\Gamma _j)\).

For all \(j\in \{1,\ldots ,m\}\), by the trace theorem and the assumption on \(b\cdot n\) we deduce the existence of \(B_j\in W^{1,r}(\Omega )\) such that the trace of \(B_j\) on \(\Gamma _j\) is \(b\cdot n\).

Suppose first that \(r\le 2\). Then, a straightforward application of the multiplication Lemma 3.6 below (in the case \(\beta _j=0\)) yields \(\varphi B_j\in W^{1,r}(\Omega )\), and hence, its trace on \(\Gamma _j\) belongs to \(W^{1-1/r,r}(\Gamma _j)\). Therefore, \(\varphi b\cdot n\in \prod _{j = 1}^m W^{1-1/r,r}(\Gamma _j)\) and the result follows from Theorem 3.4(b).

If \(r > 2\), from the previous paragraph we have that \(\varphi \in W^{2,2}(\Omega )\hookrightarrow W^{1,\delta }(\Omega )\) for all \(\delta < +\infty \). Repeating the previous argument, we obtain the desired result.

(c) Since \(\varphi \in L^\infty (\Omega )\) and \(\nabla \cdot b\in L^p_{\mathbf {\beta }}(\Omega )\), we have that \(\varphi \nabla \cdot b\in L^p_{\mathbf {\beta }}(\Omega )\). Next, we show that \(\varphi b\cdot n\in \prod _{j = 1}^m W^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\).

For all \(j\in \{1,\ldots ,m\}\), by the trace theorem and the assumption on \(b\cdot n\) we deduce the existence of \(B_j\in W^{1,p}_{\mathbf {\beta }}(\Omega )\) such that the trace of \(B_j\) on \(\Gamma _j\) is \(b\cdot n\).

Since \(p\le 2<\delta \), a straightforward application of the multiplication Lemma 3.6 below yields \(\varphi B_j\in W^{1,p}_{\mathbf {\beta }}(\Omega )\), and hence, its trace on \(\Gamma _j\) belongs to \(W^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\). Therefore, \(\varphi b\cdot n\in \prod _{j = 1}^m W^{1-1/p,p}_{\mathbf {\beta }}(\Gamma _j)\) and the result follows from Theorem 3.4(c). \(\square \)

It remains to prove the multiplication theorem used in the proofs of cases (b) and (c) in Theorem 3.5.

Lemma 3.6

(A multiplication theorem in weighted Sobolev spaces) Let \(1<q < +\infty \). Consider \(\varphi \in W^{1,\delta }(\Omega )\) for some \(\delta >\max \{2, q \}\) and \(\psi \in W^{1,q }_{{\mathbf {\beta }}}(\Omega )\) for some \(\mathbf {\beta }\) such that \(2-\frac{2}{q }-\lambda _j< \beta _j <2-\frac{2}{q }\), \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\). Then \(\psi \varphi \in W^{1,q }_{{\mathbf {\beta }}}(\Omega )\).

Proof

Since \(\delta > 2\), \(\varphi \in L^\infty (\Omega )\). Also it is clear that \(\psi \in L^{q}_{\mathbf {\beta }}(\Omega )\), and hence \(\psi \varphi \in L^{q}_{\mathbf {\beta }}(\Omega )\).

Let us check that also \(\vert \nabla (\psi \varphi )\vert \in L^{q}_{\mathbf {\beta }}(\Omega )\). We write \(\nabla (\psi \varphi ) = \varphi \nabla \psi + \psi \nabla \varphi \). Using again that \(\varphi \in L^\infty (\Omega )\) it is immediate to deduce that \(\vert \nabla \psi \vert \in L^{q}_{\mathbf {\beta }}(\Omega )\) implies that \(\vert \varphi \nabla \psi \vert \in L^{q}_{\mathbf {\beta }}(\Omega )\).

Checking that the term \(\psi \vert \nabla \varphi \vert \in L^{q}_{\mathbf {\beta }}(\Omega )\) is more involved. By localizing the problem at corner \(x_j\), and applying Hölder’s inequality we obtain

$$\begin{aligned} \int _{{\Omega _{R_j}}} (r^{{\beta _j}} \psi \vert \nabla \varphi \vert )^q\,\textrm{d}x\le \Vert r^{{\beta _j}} \psi \Vert _{L^{\frac{{q}\delta }{\delta -{q}}}({\Omega _{R_j}})}^{q}\Vert \nabla \varphi \Vert _{L^\delta ({\Omega _{R_j}})}^{q}, \end{aligned}$$

and therefore it is sufficient to prove that \(r^{\beta _j}\psi \in L^{\frac{q \delta }{\delta -{q}}}(\Omega _{R_j})\). Let us introduce \(1\le q_\delta < q \) and \(2\le q_\delta ^*<+\infty \) such that

$$\begin{aligned} \frac{1}{q_\delta ^*}=\min \left\{ \frac{1}{2},\frac{1}{q}-\frac{1}{\delta }\right\} \text { and } \frac{1}{q_\delta }=\min \left\{ 1,\frac{1}{q}+\frac{1}{2}-\frac{1}{\delta }\right\} =\frac{1}{q_\delta ^*}+\frac{1}{2} \end{aligned}$$

so that \(q_\delta ^* \ge \frac{q \delta }{\delta -q}\), and \(W^{1,q_\delta }(\Omega _{R_j})\hookrightarrow L^{q_\delta ^*}(\Omega _{R_j})\hookrightarrow L^{\frac{q \delta }{\delta -{q}}}(\Omega _{R_j})\). We are going to prove that \(r^{\beta _j}\psi \in W^{1,q_{\delta }}(\Omega _{R_j})\).

First of all we notice that \(\nabla (r^{{\beta _j}} \psi ) = r^{{\beta _j}} \nabla \psi + \psi \nabla r^{{\beta _j}}\). By definition of \(W^{1,q}_{\mathbf {\beta }}(\Omega )\), we have that \( r^{\beta _j} \vert \nabla \psi \vert \in L^{q}(\Omega _{R_j})\hookrightarrow L^{q_\delta }(\Omega _{R_j})\).

For the second term we notice that \(\vert \psi \nabla r^{\beta _j}\vert \sim r^{{{\beta _j}}-1} \psi \). Since \(1-2/q_\delta =\max \{-1,\frac{2}{\delta }-\frac{2}{q}\} <0\le \beta _j\), we have that \(W^{1,q_{\delta }}_{\mathbf {\beta }}(\Omega _{R_j})\hookrightarrow L^{q_{\delta }}_{\mathbf {\beta }-1}(\Omega _{R_j})\); see e.g. [19, Lemma 2.29(i)]. We deduce that \(\psi \in W^{1,q}_{\mathbf {\beta }}(\Omega _{R_j}) \hookrightarrow W^{1,q_\delta }_{\mathbf {\beta }}(\Omega _{R_j})\hookrightarrow L^{q_{\delta }}_{\mathbf {\beta }-1}(\Omega _{R_j})\). This means that \(r^{\beta _j-1} \psi \in L^{q_{\delta }}(\Omega _{R_j})\), and we gather that \(\vert \psi \nabla r^{{\beta _j}}\vert \in L^{q_{\delta }}(\Omega _{R_j})\).

Therefore \(\nabla (r^{\beta _j} \psi )\in L^{q_{\delta }}(\Omega _{R_j})\), so we have that \(r^{\beta _j} \psi \in W^{1,q_{\delta }}({\Omega _{R_j}})\).

Using this, we conclude that \(\psi \vert \nabla \varphi \vert \in L^{q}_{{\mathbf {\beta }}}(\Omega )\) and consequently \(\vert \nabla (\psi \varphi )\vert \in L^{q}_{{\mathbf {\beta }}}(\Omega )\), which leads to the desired result. \(\square \)
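To fix ideas, in the situation of Theorem 3.5(b) with \(q=2\) and \(\delta =4\) (the largest value allowed on a Lipschitz domain), the exponents introduced in the proof become

$$\begin{aligned} \frac{1}{q_\delta ^*}=\min \Big \{\frac{1}{2},\frac{1}{2}-\frac{1}{4}\Big \}=\frac{1}{4},\qquad \frac{1}{q_\delta }=\frac{1}{4}+\frac{1}{2}=\frac{3}{4}, \end{aligned}$$

so that \(q_\delta ^*=4=\frac{q\delta }{\delta -q}\), \(q_\delta =4/3\), \(W^{1,4/3}(\Omega _{R_j})\hookrightarrow L^{4}(\Omega _{R_j})\), and \(1-2/q_\delta =-1/2<0\le \beta _j\), as required.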

4 Discretization

Consider a family of regular triangulations \(\{{\mathcal {T}}_h\}\) graded with mesh grading parameters \(\mu _j\in (0,1]\), \(j\in \{1,\ldots ,m\}\) in the sense of [24, Section 3.1], see also [25]. As usual, \(Y_h\subset H^1(\Omega )\cap C({\bar{\Omega }})\) is the space of continuous piecewise linear functions.

Lemma 4.1

There exists a constant \(c_{\mathbf {\mu }}>0\) such that

$$\begin{aligned} \Vert \psi - I_h\psi \Vert _{H^1(\Omega )}\le c_{\mathbf {\mu }} h^s \Vert \psi \Vert _{W^{2,2}_{\mathbf {\beta }}(\Omega )}\ \forall \psi \in W^{2,2}_{\mathbf {\beta }}(\Omega ), \end{aligned}$$

where \(I_h \) is the Lagrange interpolation operator, the vector \(\mathbf {\beta }\) satisfies that \(1-\lambda _j< \beta _j < 1\) and \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\) and the exponent s satisfies that \(s\le 1\) and \(s <\frac{\lambda _j}{\mu _j}\) for all \(j\in \{1,\ldots ,m\}\).

Proof

The case \(\mu _j=1\) (quasi-uniform mesh) is classical. For \(\mu _j<\lambda _j\) see [4, Lemma 4.1]. The case \(\lambda _j\le \mu _j<1\) can be proved with the same techniques and the additional idea that \(h_T\sim h^s r_T^{1-s\mu _j}\), \(1-s\mu _j=\beta _j > 1-\lambda _j\); see equation (3.14) in [24, Theorem 3.2], where it was used for a Dirichlet problem. \(\square \)
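For readers implementing such meshes, the grading of [24, Section 3.1] prescribes, roughly, a local element size \(h_T\sim h^{1/\mu _j}\) for elements touching the corner \(x_j\) and \(h_T\sim h\, r_T^{1-\mu _j}\) otherwise, where \(r_T\) is the distance from T to \(x_j\). The following lines are our own illustration (not code from the paper) of this rule and of the largest interpolation order s admitted by Lemma 4.1:

```python
import numpy as np

def local_mesh_size(r_T, h, mu):
    """Target element size of a mesh graded towards a corner in the sense of
    [24, Sec. 3.1]: h**(1/mu) near the corner, h * r_T**(1-mu) away from it."""
    r_T = np.asarray(r_T, dtype=float)
    return np.where(r_T < h**(1.0 / mu), h**(1.0 / mu), h * r_T**(1.0 - mu))

def interpolation_order(lam, mu):
    """Supremum of the exponents s allowed by Lemma 4.1 (s <= 1, s < lam/mu)."""
    return min(1.0, lam / mu)

lam = 2.0 / 3.0                      # singular exponent of the L-shaped domain
for mu in (1.0, 0.6):                # quasi-uniform vs. graded mesh
    print(mu, interpolation_order(lam, mu))
# mu = 1.0 gives s close to 2/3; any mu < lam recovers the optimal s = 1.
```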

Define the bilinear form \(a(y,z)= \langle {\mathcal {A}}y,z\rangle _\Omega \). For a datum \(u\in H^{1/2}(\Gamma )'\), the discrete state equation reads

$$\begin{aligned} a(y_h,z_h)=\langle u, z_h\rangle _\Gamma \ \forall z_h\in Y_h. \end{aligned}$$
(4.1)

Existence and uniqueness of the solution of this equation are not immediate since \(a(\cdot ,\cdot )\) is not coercive over \(Y_h\).
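In matrix form, (4.1) is a sparse non-symmetric linear system. If \(a(\cdot ,\cdot )\) were coercive, the system matrix would be positive definite for every h; here its invertibility is only guaranteed for \(h\le h_0\) by Theorem 4.2 below. A minimal sketch (our own illustration; the matrices are assumed to be assembled by a standard P1 finite element code):

```python
import scipy.sparse.linalg as spla

def solve_discrete_state(K, C, M0, rhs):
    """Solve the discrete state equation (4.1) written in matrix form.

    K, C, M0 : sparse matrices of the diffusion, convection and reaction
               parts of a(.,.) on the nodal basis of Y_h (assembled elsewhere);
    rhs      : load vector with entries <u, phi_i>_Gamma.

    The matrix K + C + M0 is non-symmetric and in general indefinite, since
    a(.,.) is not coercive; Theorem 4.2 guarantees invertibility for h <= h_0.
    """
    Ah = (K + C + M0).tocsc()
    # sparse LU factorization (cf. the use of Matlab's lu in Section 6)
    return spla.splu(Ah).solve(rhs)
```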

Theorem 4.2

There exists \(h_0>0\) that depends on A, b, \(a_0\), \(\Omega \) and the mesh grading parameter \(\mathbf {\mu }\), such that the system (4.1) has a unique solution for every \(h\le h_0\) and every \(u\in H^{1/2}(\Gamma )'\). Further, there exists a constant \(K_0\) that depends on A, b, \(a_0\), \(\Omega \) and is independent of \(\mathbf {\mu }\) and h such that

$$\begin{aligned} \Vert y_h\Vert _{H^1(\Omega )}\le K_0\Vert {\mathcal {A}}^{-1} u\Vert _{H^1(\Omega )} \ \forall h\le h_0. \end{aligned}$$
(4.2)

The scheme of the proof is similar to that of [9, Lemma 3.1] for distributed control problems with homogeneous Dirichlet boundary conditions, but that proof is done for quasi-uniform meshes and uses this fact explicitly; see equations (3.8) and (3.9) in [9]. Since the mesh grading and the boundary terms imply some extra technicalities, we include a complete proof for the convenience of the reader.

Proof

Due to the linearity of the system, to show existence it is sufficient to prove uniqueness of solution in the case \(u=0\). Suppose \(y_h\in Y_h\) satisfies

$$\begin{aligned} a(y_h,z_h) = 0\ \forall z_h\in Y_h. \end{aligned}$$
(4.3)

Taking \(z_h=y_h\) and using Gårding’s inequality established in Lemma 2.3, we have that

$$\begin{aligned} 0=a(y_h,y_h)= \langle {\mathcal {A}} y_h,y_h\rangle _\Omega \ge \frac{\Lambda }{8 C_E^2}\Vert y_h\Vert ^2_{H^1(\Omega )} - C_{\Lambda ,E,b}\Vert y_h\Vert ^2_{L^2(\Omega )}. \end{aligned}$$

Therefore

$$\begin{aligned} \Vert y_h\Vert _{H^1(\Omega )}\le 2C_E\sqrt{\frac{ 2 C_{\Lambda ,E,b}}{\Lambda }}\Vert y_h\Vert _{L^2(\Omega )}. \end{aligned}$$
(4.4)

Since \(y_h\in L^2(\Omega )\subset L^2_{\mathbf {\beta }}(\Omega )\) for all \(\mathbf {\beta }\ge \textbf{0}\) such that \(1-\lambda _j< \beta _j < 1\) for all \(j\in \{1,\ldots ,m\}\), from Theorem 3.5(c), we have that there exists a unique \(\psi \in W^{2,2}_{\mathbf {\beta }}(\Omega )\) such that

$$\begin{aligned} a(z,\psi ) = \int _\Omega y_h z\textrm{d}x\ \forall z\in H^1(\Omega ) \end{aligned}$$
(4.5)

and there exists a constant \(C_{{\mathcal {A}}^*,\mathbf {\beta }}\) such that

$$\begin{aligned} \Vert \psi \Vert _{W^{2,2}_{\mathbf {\beta }}(\Omega )}\le C_{{\mathcal {A}}^*,\mathbf {\beta }}\Vert y_h\Vert _{L^2_{\mathbf {\beta }}(\Omega )}. \end{aligned}$$

Let us denote by \({\hat{\psi }}_h\in Y_h\) the Ritz–Galerkin projection of \(\psi \) onto \(Y_h\) in the sense of \(H^1(\Omega )\), i.e., \({\hat{\psi }}_h\) is the unique solution of

$$\begin{aligned} \int _\Omega (\nabla {\hat{\psi }}_h\nabla z_h + {\hat{\psi _h}} z_h )\textrm{d}x = \int _\Omega (\nabla \psi \nabla z_h + \psi z_h )\textrm{d}x\ \forall z_h\in Y_h. \end{aligned}$$

From [5, Eq. (4.2)], Theorem 3.5(c), and the embedding \(L^2(\Omega )\hookrightarrow L^2_{\mathbf {\beta }}(\Omega )\), with embedding constant 1 due to the choice \(R_j\le 1\), we have that there exists a constant \({\hat{c}}_{\mathbf {\mu }}\) such that

$$\begin{aligned} \Vert \psi -{\hat{\psi }}_h\Vert _{H^1(\Omega )}\le {\hat{c}}_{\mathbf {\mu }} h^{s}\Vert \psi \Vert _{W^{2,2}_\beta (\Omega )} \le {\hat{c}}_{\mathbf {\mu }} C_{{\mathcal {A}}^*,\beta } h^s \Vert y_h\Vert _{L^2_{\mathbf {\beta }}(\Omega )} \le {\hat{c}}_{\mathbf {\mu }} C_{{\mathcal {A}}^*,\beta } h^s \Vert y_h\Vert _{L^2(\Omega )}, \nonumber \\ \end{aligned}$$
(4.6)

where \(s\le 1\) and \(s <\frac{\lambda _j}{\mu _j}\) for all \(j\in \{1,\ldots ,m\}\); see Lemma 4.1. Taking \(z=y_h\) in the adjoint Eq. (4.5), and \(z_h={\hat{\psi _h}}\) in the homogeneous discrete Eq. (4.3), we deduce

$$\begin{aligned} \Vert y_h\Vert ^2_{L^2(\Omega )}&= a(y_h,\psi ) = a(y_h,\psi -{\hat{\psi }}_h) \le \Vert {\mathcal {A}}\Vert \Vert y_h\Vert _{H^1(\Omega )}\Vert \psi -{\hat{\psi }}_h\Vert _{H^1(\Omega )}\\&\le {\hat{c}}_{\mathbf {\mu }} C_{{\mathcal {A}}^*,\mathbf {\beta }} \Vert {\mathcal {A}}\Vert \Vert y_h\Vert _{H^1(\Omega )} h^s \Vert y_h\Vert _{L^2(\Omega )}. \end{aligned}$$

Throughout the proof we denote \(\Vert {\mathcal {A}}\Vert = \Vert {\mathcal {A}}\Vert _{{\mathcal {L}}(H^1(\Omega ),H^1(\Omega )')}\). Choosing \(h_0\) such that

$$\begin{aligned} {\hat{c}}_{\mathbf {\mu }} C_{{\mathcal {A}}^*,\mathbf {\beta }} \Vert {\mathcal {A}} \Vert h_0^s=\frac{1}{2}\frac{1}{2C_E} \sqrt{ \frac{\Lambda }{ 2 C_{\Lambda ,E,b}}}, \end{aligned}$$
(4.7)

we have that, for all \(h \le h_0\)

$$\begin{aligned} \Vert y_h\Vert _{L^2(\Omega )}\le \frac{1}{2}\frac{1}{2C_E} \sqrt{ \frac{\Lambda }{ 2 C_{\Lambda ,E,b}}} \Vert y_h\Vert _{H^1(\Omega )}. \end{aligned}$$

Using this and estimate (4.4), we deduce that

$$\begin{aligned} \Vert y_h\Vert _{H^1(\Omega )}\le \frac{1}{2}\Vert y_h\Vert _{H^1(\Omega )}\ \forall h \le h_0, \end{aligned}$$

and hence \(y_h=0\).

Take now \(u\in H^{1/2}(\Gamma )'\) and denote \(y={\mathcal {A}}^{-1}u\). For \(h\le h_0\), let \(y_h\) be the solution of (4.1). Taking \(z=y_h\) in the adjoint Eq. (4.5), and \(z_h={\hat{\psi _h}}\) in the discrete Eq. (4.1), we deduce

$$\begin{aligned} \Vert y_h\Vert ^2_{L^2(\Omega )}&= a(y_h,\psi ) = a(y_h,\psi -{\hat{\psi }}_h)+ \langle u,{\hat{\psi }}_h\rangle _\Gamma = a(y_h,\psi -{\hat{\psi }}_h)+a(y,{\hat{\psi }}_h)\\&\le \Vert {\mathcal {A}}\Vert \left( \Vert y_h\Vert _{H^1(\Omega )} \Vert \psi -{\hat{\psi }}_h\Vert _{H^1(\Omega )} + \Vert y\Vert _{H^1(\Omega )}\Vert {\hat{\psi }}_h\Vert _{H^1(\Omega )}\right) \\&\le {\hat{c}}_{\mathbf {\mu }} C_{{\mathcal {A}}^*,\mathbf {\beta }} \Vert {\mathcal {A}}\Vert \Vert y_h\Vert _{H^1(\Omega )} h^s \Vert y_h\Vert _{L^2(\Omega )} + c_{\mathbf {\beta }} C_{{\mathcal {A}}^*,\mathbf {\beta }} \Vert {\mathcal {A}}\Vert \Vert y\Vert _{H^1(\Omega )}\Vert y_h\Vert _{L^2(\Omega )}, \end{aligned}$$

where we have used that \(\Vert {\hat{\psi }}_h\Vert _{H^1(\Omega )}\le \Vert \psi \Vert _{H^1(\Omega )}\le c_{\mathbf {\beta }}\Vert \psi \Vert _{W^{2,2}_{\mathbf {\beta }}(\Omega )}\le c_{\mathbf {\beta }} C_{{\mathcal {A}}^*,\mathbf {\beta }}\Vert y_h\Vert _{L^2(\Omega )}\); see [19, Lemma 2.29(i)] for the embedding \(W^{2,2}_{\mathbf {\beta }}(\Omega )\hookrightarrow H^1(\Omega )\). Now, using that \(h\le h_0\) and (4.7) we have

$$\begin{aligned} \Vert y_h\Vert _{L^2(\Omega )}&\le \frac{1}{2}\frac{1}{2C_E} \sqrt{ \frac{\Lambda }{ 2 C_{\Lambda ,E,b}}} \Vert y_h\Vert _{H^1(\Omega )} +c_{\mathbf {\beta }} C_{{\mathcal {A}}^*,\mathbf {\beta }} \Vert {\mathcal {A}}\Vert \Vert y\Vert _{H^1(\Omega )}, \end{aligned}$$

and applying Young’s inequality we deduce

$$\begin{aligned} \Vert y_h\Vert _{L^2(\Omega )}^2 \le \frac{1}{16}\frac{1}{C_E^2} \frac{\Lambda }{ C_{\Lambda ,E,b}} \Vert y_h\Vert _{H^1(\Omega )}^2 +2 c_{\mathbf {\beta }}^2C_{{\mathcal {A}}^*,\mathbf {\beta }}^2 \Vert {\mathcal {A}}\Vert ^2 \Vert y\Vert _{H^1(\Omega )}^2. \end{aligned}$$
(4.8)

Using Gårding’s inequality, the discrete Eq. (4.1) and \(y={\mathcal {A}}^{-1}u\), we infer

$$\begin{aligned} \frac{\Lambda }{8 C_E^2}\Vert y_h\Vert ^2_{H^1(\Omega )}&- C_{\Lambda ,E,b}\Vert y_h\Vert ^2_{L^2(\Omega )} \le a(y_h,y_h)\nonumber \\&= \langle u,y_h\rangle _\Gamma = a(y,y_h) \le \Vert {\mathcal {A}}\Vert \Vert y\Vert _{H^1(\Omega )} \Vert y_h\Vert _{H^1(\Omega )}. \end{aligned}$$
(4.9)

Multiplying (4.8) by \(C_{\Lambda ,E,b}\) and using the resulting inequality in (4.9), we obtain

$$\begin{aligned} \frac{\Lambda }{16C_E^2}&\Vert y_h\Vert _{H^1(\Omega )}^2 \le 2 c_{\mathbf {\beta }}^2C_{{\mathcal {A}}^*,\mathbf {\beta }}^2 \Vert {\mathcal {A}}\Vert ^2 \Vert y\Vert _{H^1(\Omega )}^2 + \Vert {\mathcal {A}}\Vert \Vert y\Vert _{H^1(\Omega )} \Vert y_h\Vert _{H^1(\Omega )}\\&\le 2 c_{\mathbf {\beta }}^2 C_{{\mathcal {A}}^*,\mathbf {\beta }}^2 \Vert {\mathcal {A}}\Vert ^2 \Vert y\Vert _{H^1(\Omega )}^2 +\frac{8 C_E^2}{\Lambda }\Vert {\mathcal {A}}\Vert ^2 \Vert y\Vert _{H^1(\Omega )}^2 + \frac{\Lambda }{32C_E^2} \Vert y_h\Vert _{H^1(\Omega )}^2, \end{aligned}$$

where in the second step we have used Young’s inequality. Gathering the terms with \(\Vert y_h\Vert _{H^1(\Omega )}^2\) and taking the square root, we finally obtain:

$$\begin{aligned} \Vert y_h\Vert _{H^1(\Omega )}&\le \frac{4\sqrt{2}\,C_E}{\sqrt{\Lambda }} \Vert {\mathcal {A}}\Vert \left( 2c_{\mathbf {\beta }}^2 C_{{\mathcal {A}}^*,\mathbf {\beta }}^2 + \frac{8 C_E^2}{\Lambda }\right) ^{1/2} \Vert {\mathcal {A}}^{-1}u\Vert _{H^1(\Omega )}. \end{aligned}$$

Notice that the constant depends on \(\mathbf {\beta }\), which is itself limited by the value of \(\mathbf {\lambda }\), and hence the constant will finally depend on \(\mathbf {\lambda }\). \(\square \)

Theorem 4.3

There exists \(h_0^*>0\) that depends on A, b, \(a_0\), \(\Omega \) and the mesh grading parameter \(\mathbf {\mu }\), such that the discrete adjoint problem

$$\begin{aligned} a(z_h,\varphi _h)=\langle y,z_h\rangle _\Omega \ \forall z_h\in Y_h \end{aligned}$$
(4.10)

has a unique solution for every \(y\in H^1(\Omega )'\) and every \(0< h \le h_0^*\). Further, there exists a constant \(K_0^*\) that depends on A, b, \(a_0\), \(\Omega \) and is independent of \(\mathbf {\mu }\) and h such that

$$\begin{aligned} \Vert \varphi _h\Vert _{H^1(\Omega )}\le K_0^*\Vert ({\mathcal {A}}^{*})^{-1}y\Vert _{H^1(\Omega )}\ \forall h<h_0^*. \end{aligned}$$
(4.11)

Proof

Existence and uniqueness of the solution of the discrete adjoint Eq. (4.10) follow for all \(0<h<h_0\) due to the finite-dimensional character of the problem. To get the estimate (4.11), we follow the steps of the proof of Theorem 4.2. Notice that in this case, the value of \(h_0^*\), which is used explicitly in the proof, may be different from the value of \(h_0\) provided in (4.7). \(\square \)

The following estimate is an immediate consequence of the previous results, Lemma 2.6, Corollary 2.7 and the trace theorem.

Corollary 4.4

Let \({\bar{h}}=\min \{h_0,h_0^*\}\) with \(h_0\) from Theorem 4.2 and \(h_0^*\) from Theorem 4.3. For \(u\in L^2(\Gamma )\) let \(y_h\in Y_h\) be the unique solution of (4.1). There exists a constant \(c_2>0\) that depends on the data of the problem, but not on the mesh grading parameters \(\mathbf {\mu }\) or on h, such that, for all \(h<{\bar{h}}\)

$$\begin{aligned} \Vert y_h\Vert _{L^2(\Gamma )}&\le c_2\Vert u\Vert _{L^2(\Gamma )}. \end{aligned}$$
(4.12)

Proof

Let us denote by \(C_{\textrm{TR}}\) the norm of the trace operator from \(H^1(\Omega )\) to \(L^2(\Gamma )\). We use Theorem 4.2, Lemma 2.6, and the fact that u can be seen as an element of \(H^1(\Omega )'\) and \(\Vert u\Vert _{H^1(\Omega )'} \le C_{\textrm{TR}} \Vert u\Vert _{L^2(\Gamma )}\), cf. (2.4) and (2.5). A straightforward estimation shows that

$$\begin{aligned} \Vert y_h\Vert _{L^2(\Gamma )}&\le C_{\textrm{TR}}\Vert y_h\Vert _{H^1(\Omega )} \le C_{\textrm{TR}} K_0\Vert {\mathcal {A}}^{-1}u\Vert _{H^1(\Omega )} \le C_{\textrm{TR}} K_0 \Vert {\mathcal {A}}^{-1}\Vert C_{\textrm{TR}} \Vert u\Vert _{L^2(\Gamma )}, \end{aligned}$$

where \(\Vert {\mathcal {A}}^{-1}\Vert \) denotes the norm in \({\mathcal {L}}(H^1(\Omega )',H^1(\Omega ))\). The result follows for \(c_2 = C_{\textrm{TR}}^2 K_0 \Vert {\mathcal {A}}^{-1}\Vert \). \(\square \)

Theorem 4.5

(Error estimates in the domain). For \(0< h<{\bar{h}}\), where \({\bar{h}}\) is defined in Corollary 4.4, and \(u\in H^{1/2}(\Gamma )'\), let \(y_h\in Y_h\) be the solution of (4.1) and \(y\in H^1(\Omega )\) be the solution of (3.1) for \(f=0\). There exists \(C>0\) that depends on A, b, \(a_0\), \(\Omega \) but is independent of h such that

$$\begin{aligned} \Vert y-y_h\Vert _{L^2(\Omega )} \le C h^s \Vert u\Vert _{H^{1/2}(\Gamma )'}. \end{aligned}$$
(4.13)

If further \(u\in \prod _{j=1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)\), where \(1-\lambda _j< \beta _j < 1\) and \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\), there exists \(C>0\) that depends on A, b, \(a_0\), \(\Omega \), \(\mathbf {\beta }\), and the mesh grading parameter \(\mathbf {\mu }\), but is independent of h and u such that

$$\begin{aligned} \Vert y-y_h\Vert _{L^2(\Omega )}+ h^s \Vert y-y_h\Vert _{H^1(\Omega )} \le C h^{2s} \Vert y\Vert _{W^{2,2}_{\mathbf {\beta }}(\Omega )} \le C h^{2s} \sum _{j=1}^m\Vert u\Vert _{ W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)}\nonumber \\ \end{aligned}$$
(4.14)

for all \(s\le 1\) and \(s <\frac{\lambda _j}{\mu _j}\) for all \(j\in \{1,\ldots ,m\}\).

Furthermore, for all \(f\in L^{2}_{\mathbf {\beta }}(\Omega )\) and \(g\in \prod _{j=1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)\), let \(\varphi \in W^{2,2}_{\mathbf {\beta }}(\Omega )\) be the solution of (3.6) and \(\varphi _h\) be the unique solution of

$$\begin{aligned} a(z_h,\varphi _h) = \int _\Omega fz_h\,\textrm{d}x+ \int _\Gamma g z_h\,\textrm{d}x\ \forall z_h\in Y_h. \end{aligned}$$

Then

$$\begin{aligned} \Vert \varphi -\varphi _h\Vert _{L^2(\Omega )}&+ h^s \Vert \varphi -\varphi _h\Vert _{H^1(\Omega )} \le C h^{2s} \Vert \varphi \Vert _{W^{2,2}_{\mathbf {\beta }}(\Omega )} \nonumber \\&\le C h^{2s} \left( \Vert f\Vert _{L^{2}_{\mathbf {\beta }}(\Omega )} + \sum _{j=1}^m \Vert g\Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)}\right) . \end{aligned}$$
(4.15)

Proof

We will prove (4.13) and (4.14). The proof of (4.15) follows the same lines.

We first prove that

$$\begin{aligned} \Vert y-y_h\Vert _{L^2(\Omega )}\le C_{{\mathcal {A}}^*,\beta } {\hat{c}}_{\mathbf {\mu }} \Vert {\mathcal {A}}\Vert h^s \Vert y-y_h\Vert _{H^1(\Omega )} \end{aligned}$$
(4.16)

Let \(\psi \in W^{2,2}_{\mathbf {\beta }}(\Omega )\) be the solution of the adjoint problem

$$\begin{aligned} a(z,\psi ) =\int _\Omega (y-y_h)z\textrm{d}x\ \forall z\in H^1(\Omega ) \end{aligned}$$

and let \({\hat{\psi }}_h\in Y_h\) be its Ritz–Galerkin projection onto \(Y_h\) in the sense of \(H^1(\Omega )\), as in the proof of Theorem 4.2. We have, with (4.6), that

$$\begin{aligned} \Vert y-y_h\Vert _{L^2(\Omega )}^2&= a(y-y_h,\psi ) = a(y-y_h,\psi -{\hat{\psi }}_h) \\&\le \Vert {\mathcal {A}}\Vert \Vert y-y_h\Vert _{H^1(\Omega )}\Vert \psi -{\hat{\psi }}_h\Vert _{H^1(\Omega )} \\&\le C_{{\mathcal {A}}^*,\beta } {\hat{c}}_{\mathbf {\mu }} \Vert {\mathcal {A}}\Vert h^s \Vert y-y_h\Vert _{H^1(\Omega )}\Vert y-y_h\Vert _{L^2(\Omega )} \end{aligned}$$

and (4.16) follows. Estimate (4.13) follows from this, Theorem 4.2 and Lemma 2.6.

Using Gårding’s inequality established in Lemma 2.3, estimate (4.16), and the definition of \(h_0>0\) in (4.7), we have that for all \(h < h_0\)

$$\begin{aligned} \frac{\Lambda }{8 C_E^2} \Vert y-y_h\Vert _{H^1(\Omega )}^2&\le a(y-y_h,y-y_h) + C_{\Lambda ,E,b} \Vert y-y_h\Vert _{L^2(\Omega )}^2 \\&\le a(y-y_h,y-y_h) + C_{\Lambda ,E,b} \Big (C_{{\mathcal {A}}^*,\beta } {\hat{c}}_{\mathbf {\mu }} \Vert {\mathcal {A}}\Vert h^s\Big )^2 \Vert y-y_h\Vert _{H^1(\Omega )}^2\\&\le a(y-y_h,y-y_h) + \frac{1}{4} \frac{\Lambda }{8 C_E^2} \Vert y-y_h\Vert _{H^1(\Omega )}^2, \end{aligned}$$

and hence

$$\begin{aligned} \frac{3\Lambda }{32 C_E^2} \Vert y-y_h\Vert _{H^1(\Omega )}^2\le a(y-y_h,y-y_h) \end{aligned}$$
(4.17)

Using Theorem 3.4(c) and Lemma 4.1, we obtain

$$\begin{aligned} \Vert y- I_h y\Vert _{H^1(\Omega )}\le {\hat{c}}_{\mathbf {\mu }} h^s \Vert y\Vert _{W^{2,2}_{\mathbf {\beta }}(\Omega )} \le {\hat{c}}_{\mathbf {\mu }}C_{{\mathcal {A}},\mathbf {\beta }} h^s \sum _{j=1}^m\Vert u\Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)}. \end{aligned}$$
(4.18)

Using that \(a(y,I_h y) = a(y_h,I_h y)\), (4.17) and the above inequality, we have that

$$\begin{aligned} \frac{3\Lambda }{32 C_E^2} \Vert y-y_h\Vert _{H^1(\Omega )}^2&\le a(y-y_h,y-I_h y) \le \Vert {\mathcal {A}}\Vert \Vert y-y_h\Vert _{H^1(\Omega )}\Vert y-I_h y\Vert _{H^1(\Omega )}\\&\le {\hat{c}}_{\mathbf {\mu }}C_{{\mathcal {A}},\mathbf {\beta }} \Vert {\mathcal {A}}\Vert h^s \Vert y-y_h\Vert _{H^1(\Omega )}\sum _{j=1}^m\Vert u\Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)}, \end{aligned}$$

and the result follows. \(\square \)

Corollary 4.6

There exists \(C>0\) that depends on A, b, \(a_0\), \(\Omega \), and the mesh grading parameter \(\mathbf {\mu }\), but is independent of h such that for \(0< h<{\bar{h}}\)

$$\begin{aligned} \Vert y-y_h\Vert _{L^2(\Omega )} \le C h^{3s/2} \Vert u\Vert _{L^{2}(\Gamma )}\ \forall u\in L^2(\Gamma ) \end{aligned}$$
(4.19)

for all \(s\le 1\) and \(s <\frac{\lambda _j}{\mu _j}\) for all \(j\in \{1,\ldots ,m\}\).

Further, for all \(f\in L^{2}_{\mathbf {\beta }}(\Omega )\), \(g\in \prod _{j=1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)\) and \(\theta \in (0,1)\), we have

$$\begin{aligned} \Vert \varphi -\varphi _h\Vert _{H^{\theta }(\Omega )}\le C h^{(2-\theta )s} \left( \Vert f\Vert _{L^{2}_{\mathbf {\beta }}(\Omega )} + \sum _{j=1}^m \Vert g\Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)}\right) , \end{aligned}$$
(4.20)

where C is independent of \(\theta \).

Proof

If \(u\in H^{1/2}(\Gamma )\), then, by (4.14) and the embedding \(H^{1/2}(\Gamma )\hookrightarrow W^{1/2,2}_{\mathbf {\beta }}(\Gamma )\hookrightarrow \prod _{j=1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)\) for some \(\mathbf {\beta }\) with \(\beta _j\ge 0\), \(1-\lambda _j< \beta _j < 1\), we obtain

$$\begin{aligned} \Vert y-y_h\Vert _{L^2(\Omega )} \le C h^{2s} \Vert u\Vert _{H^{1/2}(\Gamma )}. \end{aligned}$$

The first result follows by complex interpolation between this estimate and (4.13).
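Indeed, interpolating the linear map \(u\mapsto y-y_h\) between the two endpoint estimates and using \([H^{1/2}(\Gamma )',H^{1/2}(\Gamma )]_{1/2}=L^2(\Gamma )\), we get

$$\begin{aligned} \Vert y-y_h\Vert _{L^2(\Omega )} \le \big (C h^{s}\big )^{1/2}\big (C h^{2s}\big )^{1/2}\Vert u\Vert _{L^2(\Gamma )} = C h^{3s/2}\Vert u\Vert _{L^2(\Gamma )}. \end{aligned}$$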

The second one follows by interpolation between the estimates for \(\theta = 0\) and \(\theta =1\) that follow from (4.15). \(\square \)

5 Analysis of the Control Problem

Now, we turn to the analysis of the control problem

$$\begin{aligned} \mathrm{(P)} \quad \min _{u \in U_\textrm{ad}} J(u):= \frac{1}{2}\int _\Omega (y_u(x) - y_d(x))^2\, \,\textrm{d}x+ \frac{\nu }{2}\int _\Gamma u^2(x)\, \,\textrm{d}x+\int _\Gamma y_u(x) g_\varphi (x)\, \,\textrm{d}x, \end{aligned}$$

where \(y_u\in H^1(\Omega )\) solves (2.7). For every \(u\in H^{1/2}(\Gamma )'\), we define \(\varphi _u\in H^1(\Omega )\) as the unique solution of

$$\begin{aligned} \langle z, {\mathcal {A}}^*\varphi _u\rangle _\Omega = \int _\Omega ( y_u-y_d)z\,\textrm{d}x+\int _\Gamma g_\varphi z\,\textrm{d}x\ \forall z\in H^1(\Omega ). \end{aligned}$$

We have that

$$\begin{aligned} J'(u)v = \int _\Gamma (\varphi _u+\nu u)v\,\textrm{d}x. \end{aligned}$$
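Indeed, if \(z_v\in H^1(\Omega )\) denotes the derivative of the control-to-state map at u in the direction v, i.e. \(\langle {\mathcal {A}}z_v,z\rangle _\Omega =\int _\Gamma v z\,\textrm{d}x\) for all \(z\in H^1(\Omega )\), then

$$\begin{aligned} J'(u)v&= \int _\Omega (y_u-y_d) z_v\,\textrm{d}x+ \nu \int _\Gamma u v\,\textrm{d}x+\int _\Gamma g_\varphi z_v\,\textrm{d}x\\&= \langle z_v,{\mathcal {A}}^*\varphi _u\rangle _\Omega + \nu \int _\Gamma u v\,\textrm{d}x= \langle {\mathcal {A}}z_v,\varphi _u\rangle _\Omega + \nu \int _\Gamma u v\,\textrm{d}x=\int _\Gamma (\varphi _u+\nu u) v\,\textrm{d}x. \end{aligned}$$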

Theorem 5.1

For any \(y_d\in L^2(\Omega )\) and \(g_\varphi \in L^2(\Gamma )\), problem (P) has a unique solution \({\bar{u}}\in U_\textrm{ad}\) and there exist \({\bar{y}},{\bar{\varphi }}\in H^1(\Omega )\) such that

$$\begin{aligned} \langle {\mathcal {A}}{\bar{y}},z\rangle _\Omega&= \int _\Gamma {\bar{u}} z\,\textrm{d}x{} & {} \forall z\in H^1(\Omega ),\\ \langle z, {\mathcal {A}}^*{\bar{\varphi }}\rangle _\Omega&= \int _\Omega ({\bar{y}}-y_d)z\,\textrm{d}x+\int _\Gamma g_\varphi z\,\textrm{d}x{} & {} \forall z\in H^1(\Omega ),\\ \int _\Gamma ({\bar{\varphi }}+\nu {\bar{u}})(u-{\bar{u}})\,\textrm{d}x&\ge 0{} & {} \forall u\in U_\textrm{ad}, \end{aligned}$$

and \({\bar{u}}\in H^{1/2}(\Gamma )\).

If, further, \(g_\varphi \in \prod _{j=1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)\) for some \(\mathbf {\beta }\) such that \(1-\lambda _j<\beta _j < 1\) and \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\), then \({\bar{y}}, {\bar{\varphi }}\in W^{2,2}_{\mathbf {\beta }}(\Omega )\cap C({\bar{\Omega }})\), \({\bar{\varphi }}\in W^{3/2,2}_{\mathbf {\beta }}(\Gamma )\cap C(\Gamma )\), \({\bar{u}}\in C(\Gamma )\).

If, moreover, the weights also satisfy \(\beta _j < 1/2\), for all \(j\in \{1,\ldots ,m\}\) then \({\bar{\varphi }}, {\bar{u}}\in H^1(\Gamma )\).

Proof

The existence of the solution follows from the appropriate continuity properties of the involved operators that are deduced from Lemma 2.6. Uniqueness is deduced from the strict convexity of the functional. The first order optimality conditions then follow in a standard way from the Euler–Lagrange equation \(J'({\bar{u}})(u-{\bar{u}})\ge 0\) for all \(u\in U_\textrm{ad}\) and Corollary 2.7. The \(H^1(\Omega )\) regularity of \({\bar{y}}\) follows from Lemma 2.3 and the regularity of the adjoint state from Lemma 2.6. By the trace theorem, we have that \({\bar{\varphi }}\in H^{1/2}(\Gamma )\). This and the projection formula

$$\begin{aligned} {\bar{u}}(x) = {\text {Proj}}_{[u_{a},u_{b}]}\left( -\frac{{\bar{\varphi }}(x)}{\nu }\right) , \end{aligned}$$
(5.1)

which follows in a standard way from the third optimality condition, imply the regularity of \({\bar{u}}\).
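In computations, (5.1) amounts to a pointwise clipping of the scaled adjoint trace. A minimal sketch acting on nodal (or edgewise) values of \({\bar{\varphi }}\) on \(\Gamma \), with possibly infinite bounds:

```python
import numpy as np

def project_control(phi_on_gamma, nu, ua=-np.inf, ub=np.inf):
    """Projection formula (5.1): u = Proj_[ua, ub](-phi/nu), applied
    componentwise to values of the adjoint state on the boundary."""
    return np.clip(-np.asarray(phi_on_gamma, dtype=float) / nu, ua, ub)
```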

Suppose now that \(g_\varphi \) belongs to \(L^2(\Gamma )\cap \prod _{j=1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)\) for some \(\mathbf {\beta }\) such that \( 1-\lambda _j<\beta _j < 1\) and \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\). The \(W^{2,2}_{\mathbf {\beta }}(\Omega )\) regularity of the state and adjoint state follows from a bootstrapping argument: since \({\bar{y}}\in H^1(\Omega )\) and \(\beta _j\ge 0\) for all j, we have that \({\bar{y}}-y_d\in L^2(\Omega )\hookrightarrow L^2_{\mathbf {\beta }}(\Omega )\). From Theorem 3.5(c) we deduce that \({\bar{\varphi }}\in W^{2,2}_{\mathbf {\beta }}(\Omega )\). This readily implies that \({\bar{\varphi }}\in W^{3/2,2}_{\mathbf {\beta }}(\Gamma )\). Using that \(L^2_{\mathbf {\beta }}(\Omega )\subset L^r(\Omega )\) for all \(1<r < 2/(1+\beta _j)\), we deduce from Theorem 3.5(b) that \({\bar{\varphi }}\in W^{2,r}(\Omega )\hookrightarrow C({\bar{\Omega }})\), so \({\bar{\varphi }}\in C(\Gamma )\). Again the projection formula leads to \({\bar{u}}\in C(\Gamma )\).

If \(\beta _j < 1/2\), then \(2/(1+\beta _j) > 4/3\), so there exists \(r>4/3\) such that \({\bar{\varphi }}\in W^{2,r}(\Omega )\hookrightarrow H^{3-2/r}(\Omega )\). Since \(3-2/r>3/2\), by the trace theorem we have that \({\bar{\varphi }}\in C(\Gamma )\) and \({\bar{\varphi }}\in H^1(\Gamma _j)\) for every \(j\in \{1,\ldots ,m\}\), hence \({\bar{\varphi }}\in H^1(\Gamma )\). This last implication holds because \(\Gamma \) is one-dimensional and polygonal. This regularity is preserved by the projection formula, and therefore \({\bar{u}}\in H^1(\Gamma )\). \(\square \)

Notice that for any polygonal domain \(\lambda _j > 1/2\) for all \(j\in \{1,\ldots ,m\}\), so the condition \(\beta _j < 1/2\) may be a constraint on the regularity of the datum \(g_\varphi \), but it is not a constraint on the domain. Although some of the intermediate results below can be proved for \(g_\varphi \in L^2(\Gamma )\), the main result requires \(H^1(\Gamma )\) regularity of the optimal control, so in the rest of the work we make the following assumption.

Assumption 5.2

We assume that \(g_\varphi \in \prod _{j=1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)\) for some \(\mathbf {\beta }\) such that \(1-\lambda _j<\beta _j<1/2\), \(\beta _j\ge 0\) for all \(j\in \{1,\ldots ,m\}\). We denote

$$\begin{aligned} M_d = \Vert y_d\Vert _{L^2(\Omega )} + \sum _{j =1}^m\Vert g_\varphi \Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)} + 1. \end{aligned}$$

For every \(u\in L^2(\Gamma )\), we will denote \(y_h(u)\) the solution of the discrete state Eq. (4.1) and \(\varphi _h(u)\) the solution of

$$\begin{aligned} a(z_h,\varphi _h) = \int _\Omega (y_h(u)-y_d) z_h\,\textrm{d}x+ \int _\Gamma g_\varphi z_h\,\textrm{d}x\ \forall z_h\in Y_h. \end{aligned}$$

Our discrete functional reads

$$\begin{aligned} J_h(u) = \frac{1}{2}\int _\Omega (y_h(u)-y_d)^2\,\textrm{d}x+ \frac{\nu }{2}\int _\Gamma u^2\,\textrm{d}x+\int _\Gamma y_h(u) g_\varphi \,\textrm{d}x. \end{aligned}$$

To discretize the control, we notice that every triangulation \({\mathcal {T}}_h\) of \(\Omega \) defines a segmentation \({\mathcal {E}}_h\) of \(\Gamma \) and define \(U_{h,\mathrm ad}= U_h\cap U_\textrm{ad}\), where

$$\begin{aligned} U_h= \{u_h\in L^2(\Gamma ):\ u_{h\vert E}\in {\mathcal {P}}^0(E)\ \forall E\in {\mathcal {E}}_h\}. \end{aligned}$$

Here and elsewhere \({\mathcal {P}}^i(K)\) is the set of polynomials of degree i in the set K. For every \(u\in L^1(\Gamma )\), we define \(Q_h u\in U_h\) by

$$\begin{aligned} Q_h u(x) = \displaystyle \frac{1}{h_E} \int _E u\,\textrm{d}x \text{ if } x\in E, \end{aligned}$$

where \(E\in {\mathcal {E}}_h\) and \(h_E\) is the length of E. Notice that \(u\in U_\textrm{ad}\) implies \(Q_h u\in U_{h,\mathrm ad}\).
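The operator \(Q_h\) is a plain edgewise average. A short sketch of how it can be evaluated, assuming the control is available as a callable and using a two-point Gauss rule on each edge (exact for cubic polynomials along the edge):

```python
import numpy as np

def Qh(u, edges):
    """Edgewise mean values defining Q_h u in U_h.

    u     : callable, u(x) for x of shape (2,), the control on Gamma;
    edges : array of shape (n_edges, 2, 2) with the endpoints of each
            boundary edge E of the segmentation E_h.
    """
    vals = np.empty(len(edges))
    g = 1.0 / np.sqrt(3.0)
    for k, (a, b) in enumerate(edges):
        m, d = 0.5 * (a + b), 0.5 * (b - a)
        # (1/h_E) * int_E u ds, with h_E = |b - a|, two-point Gauss quadrature
        vals[k] = 0.5 * (u(m - g * d) + u(m + g * d))
    return vals
```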

Lemma 5.3

There exists a constant \(C>0\) independent of h such that, for every \(u\in H^1(\Gamma )\),

$$\begin{aligned} \Vert u-Q_h u\Vert _{(H^{1}(\Gamma ))'}+ h \Vert u-Q_h u\Vert _{L^2(\Gamma )} \le C h^2\Vert u\Vert _{H^1(\Gamma )}. \end{aligned}$$

If Assumption 5.2 holds, then we also have that

$$\begin{aligned} \left| \int _\Gamma (\varphi _u + \nu u)(u-Q_h u)\,\textrm{d}x\right| \le C h^2\left( \Vert u\Vert _{H^1(\Gamma )}^2+ M_d^2 \right) . \end{aligned}$$

Proof

It is well known that for every \(E\in {\mathcal {E}}_h\) we have \(\Vert u-Q_h u\Vert _{L^2(E)} \le C h_E\Vert u\Vert _{H^1(E)}\). Using that \(h_E\le c h\), we have

$$\begin{aligned} \Vert u-Q_h u\Vert _{L^2(\Gamma )}^2 = \sum _{E\in {\mathcal {E}}_h} \Vert u-Q_h u\Vert _{L^2(E)}^2 \le C \sum _{E\in {\mathcal {E}}_h}h_E^2\Vert u\Vert _{H^1(E)}^2 \le C h^2 \Vert u\Vert _{H^1(\Gamma )}^2. \end{aligned}$$

The estimate for the norm in \(H^1(\Gamma )'\) follows now by duality since \(\int _\Gamma (u-Q_hu)w_h\,\textrm{d}x=0\) for all \(w_h\in U_h\). This estimate implies the third one taking into account that, using the same arguments as in the proof of Theorem 5.1, \(\varphi _u\in H^1(\Gamma )\), and

$$\begin{aligned} \Vert&\varphi _u\Vert _{H^1(\Gamma )} \le C \Vert \varphi _u\Vert _{W^{2,2}_{\mathbf {\beta }}(\Omega )} \le C \left( \Vert y_u-y_d\Vert _{L^2_{\mathbf {\beta }}(\Omega )} + \sum _{j = 1}^m \Vert g_\varphi \Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)}\right) \\&\le C \left( \Vert y_u\Vert _{L^2(\Omega )}+ \Vert y_d\Vert _{L^2(\Omega )} + \sum _{j = 1}^m \Vert g_\varphi \Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)}\right) \le C \left( \Vert u\Vert _{L^2(\Gamma )}+ M_d\right) . \end{aligned}$$

Therefore, we obtain

$$\begin{aligned} \left| \int _\Gamma (\varphi _u + \nu u)(u-Q_h u)\,\textrm{d}x\right|&\le \Vert \varphi _u + \nu u\Vert _{H^1(\Gamma )} \Vert u-Q_h u\Vert _{H^1(\Gamma )'}\\&\le C \left( M_d + \Vert u\Vert _{L^2(\Gamma )} + \nu \Vert u\Vert _{H^1(\Gamma )}\right) h^2 \Vert u\Vert _{H^1(\Gamma )} \end{aligned}$$

and the result follows using Young’s inequality. \(\square \)

Our discrete problem reads

$$\begin{aligned} (\textrm{P}_h)\ \min _{u_h\in U_{h,\mathrm ad}} J_h(u_h). \end{aligned}$$

Existence and uniqueness of the solution of problem \((\textrm{P}_h)\), as well as first order optimality conditions, follow in a standard way. We state the result in the next theorem for further reference.

Theorem 5.4

For every \(0<h<{\bar{h}}\), problem \((\textrm{P}_h) \) has a unique solution \(\bar{u}_h\in U_{h,\mathrm ad}\). Further, if we denote \({\bar{y}}_h = y_h({\bar{u}}_h)\) and \({\bar{\varphi }}_h = \varphi _h({\bar{u}}_h)\), then

$$\begin{aligned} \int _\Gamma ({\bar{\varphi }}_h + \nu {\bar{u}}_h)(u_h-{\bar{u}}_h)\,\textrm{d}x\ge 0\ \forall u_h\in U_{h,\mathrm ad}. \end{aligned}$$
(5.2)

Before stating and proving the main theorem of this section, we prove two auxiliary results.

Lemma 5.5

There exists \(C>0\) independent of h, \(y_d\) and \(g_\varphi \) such that for all \(0<h<{\bar{h}}\),

$$\begin{aligned} \Vert {\bar{y}}\Vert _{H^1(\Omega )}+\Vert {\bar{u}}\Vert _{H^{1/2}(\Gamma )} + \Vert {\bar{y}}_h\Vert _{H^1(\Omega )}+\Vert \bar{u}_h\Vert _{L^2(\Gamma )} \le C \left( \Vert y_d\Vert _{L^2(\Omega )} + \Vert g_\varphi \Vert _{L^2(\Gamma )}+1\right) . \end{aligned}$$

If, moreover, Assumption 5.2 holds, then

$$\begin{aligned} \Vert {\bar{u}}\Vert _{H^1(\Gamma )}\le C M_d. \end{aligned}$$

Proof

Consider a fixed \(u_{\textrm{ad}}\in U_\textrm{ad}\) such that \(u_{\textrm{ad}}\in U_{h,\mathrm ad}\) for all \(h>0\). Using that \(\Vert \bar{y}_h-y_d\Vert ^2_{L^2(\Omega )}\ge 0\) and the optimality of \({\bar{u}}_h\) together with Young’s inequality and estimate (4.12), we have for all \(\varepsilon > 0\) that

$$\begin{aligned} \frac{\nu }{2}&\Vert {\bar{u}}_h\Vert ^2_{L^2(\Gamma )}\le J_h(\bar{u}_h)- \int _\Gamma y_h({\bar{u}}_h)g_\varphi \,\textrm{d}x\\&\le J_h(u_{\textrm{ad}}) + \varepsilon \Vert y_h(\bar{u}_h)\Vert ^2_{L^2(\Gamma )} + \frac{1}{4\varepsilon } \Vert g_\varphi \Vert _{L^2(\Gamma )}^2\\&\le \frac{1}{2}\Vert y_h(u_{\textrm{ad}}) - y_d\Vert _{L^2(\Omega )}^2 + \frac{\nu }{2}\Vert u_{\textrm{ad}}\Vert ^2_{L^2(\Gamma )} + \int _\Gamma y_h(u_{\textrm{ad}}) g_\varphi \,\textrm{d}x\\&\qquad + \varepsilon c_2^2 \Vert {\bar{u}}_h\Vert ^2_{L^2(\Gamma )} + \frac{1}{4\varepsilon } \Vert g_\varphi \Vert _{L^2(\Gamma )}^2\\&\le \Vert y_h(u_{\textrm{ad}})\Vert _{L^2(\Omega )}^2+ \Vert y_d\Vert _{L^2(\Omega )}^2 + (\frac{\nu }{2}+c_2^2)\Vert u_{\textrm{ad}}\Vert ^2_{L^2(\Gamma )} \\&\qquad + \varepsilon c_2^2 \Vert {\bar{u}}_h\Vert ^2_{L^2(\Gamma )} + \frac{1+\varepsilon }{4\varepsilon } \Vert g_\varphi \Vert _{L^2(\Gamma )}^2 \end{aligned}$$

where \(c_2\) is introduced in (4.12). Taking \(\varepsilon = \nu /(4c_2^2)\), we readily deduce that \(\{{\bar{u}}_h\}\) is uniformly bounded in \(L^2(\Gamma )\). The estimate for \(\Vert \bar{y}_h\Vert _{H^1(\Omega )}\) follows from this one and estimate (4.2).

Estimates for \(\Vert {\bar{u}}\Vert _{L^2(\Gamma )}\) and \(\Vert \bar{y}\Vert _{H^1(\Omega )}\) follow in a similar way. From this last one and Lemma 2.6 an estimate for \(\Vert {\bar{\varphi }}\Vert _{H^1(\Omega )}\) in terms of the data is obtained. The trace theorem and the projection formula (5.1) lead to the estimate for \(\Vert {\bar{u}}\Vert _{H^{1/2}(\Gamma )}\).

If Assumption 5.2 holds, then, using the estimate for \(\Vert {\bar{y}}\Vert _{L^2(\Omega )}\) and noting that the condition \(\beta _j<1/2\) implies \(\prod _{j=1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)\hookrightarrow L^2(\Gamma )\) and hence

$$\begin{aligned} \Vert y_d\Vert _{L^2(\Omega )} + \Vert g_\varphi \Vert _{L^{2}(\Gamma )} + 1\le M_d, \end{aligned}$$

we obtain an estimate of \(\Vert {\bar{\varphi }}\Vert _{W^{2,2}_{\mathbf {\beta }}(\Omega )}\) in terms of \(M_d\). The trace theorem and the projection formula (5.1) lead to the estimate for \(\Vert {\bar{u}}\Vert _{H^1(\Gamma )}\). \(\square \)

In the rest of the work, s represents any positive number satisfying \(s\le 1\) and \(s<\lambda _j/\mu _j\) for all \(j\in \{1,\ldots ,m\}\).

Lemma 5.6

Suppose Assumption 5.2 holds. Then, there exists \(C>0\) independent of h, \(y_d\), \(g_\varphi \) and \(\{{\bar{u}}_h\}\) such that

$$\begin{aligned} \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h\Vert _{L^2(\Omega )}\le C h^{3s/2} M_d. \end{aligned}$$
(5.3)

Moreover, for all \(\theta \in (1/2,1]\) we have the following estimate:

$$\begin{aligned} \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h \Vert _{L^2(\Gamma )} \le C h^{(2-\theta )s}M_d. \end{aligned}$$
(5.4)

Proof

By the triangle inequality

$$\begin{aligned} \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h\Vert _{L^2(\Omega )}\le \Vert \varphi _{{\bar{u}}_h}-\varphi ^h\Vert _{L^2(\Omega )} + \Vert \varphi ^h-{\bar{\varphi }}_h\Vert _{L^2(\Omega )}, \end{aligned}$$
(5.5)

where \(\varphi ^h\) is the unique element in \(H^1(\Omega )\) such that \(a(z,\varphi ^h) = \int _\Omega ({\bar{y}}_h-y_d)z\,\textrm{d}x+ \int _\Gamma g_\varphi z\,\textrm{d}x\) for all \(z\in H^1(\Omega )\), i.e., \({\bar{\varphi }}_h\) is the finite element approximation of \(\varphi ^h\).

Let us estimate the first term in the right hand side of (5.5). Noting that

$$\begin{aligned} a(z,\varphi _{{\bar{u}}_h}-\varphi ^h) = \int _\Omega (y_{\bar{u}_h}-y_h({\bar{u}}_h)) z\,\textrm{d}x\ \forall z\in H^1(\Omega ), \end{aligned}$$

we deduce from Theorem 3.5 the existence of \(C>0\) independent of h such that

$$\begin{aligned} \Vert \varphi _{{\bar{u}}_h}-\varphi ^h\Vert _{L^2(\Omega )}\le C \Vert y_{{\bar{u}}_h}-y_h({\bar{u}}_h)\Vert _{L^2(\Omega )}. \end{aligned}$$
(5.6)

Applying the finite element error estimate for the state (4.19) of Corollary 4.6 and Lemma 5.5, we have

$$\begin{aligned} \Vert y_{{\bar{u}}_h}-y_h({\bar{u}}_h)\Vert _{L^2(\Omega )}&\le C h^{3s/2}\Vert {\bar{u}}_h\Vert _{L^2(\Gamma )} \\&\le C h^{3s/2}(\Vert y_d\Vert _{L^2(\Omega )} + \Vert g_\varphi \Vert _{L^2(\Gamma )}+1)\le C h^{3s/2} M_d. \end{aligned}$$

This, together with (5.6) leads to

$$\begin{aligned} \Vert \varphi _{{\bar{u}}_h}-\varphi ^h\Vert _{L^2(\Omega )}\le C h^{3s/2}M_d. \end{aligned}$$
(5.7)

To estimate the second summand in the right hand side of (5.5) we apply the finite element error estimate (4.15), the uniform boundedness result in Lemma 5.5 and the embedding \(\prod _{j = 1}^m W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j) \hookrightarrow L^2(\Gamma )\):

$$\begin{aligned} \Vert \varphi ^h-{\bar{\varphi }}_h\Vert _{L^2(\Omega )}&\le C \left( \Vert {\bar{y}}_h-y_d\Vert _{L^2(\Omega )} + \sum _{j =1}^m\Vert g_\varphi \Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)}\right) h^{2s}\\&\le C \left( \Vert {\bar{y}}_h\Vert _{L^2(\Omega )}+ \Vert y_d\Vert _{L^2(\Omega )} + \sum _{j =1}^m\Vert g_\varphi \Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)} \right) h^{2s}\\&\le C \left( 2\Vert y_d\Vert _{L^2(\Omega )} + \Vert g_\varphi \Vert _{L^2(\Gamma )} + \sum _{j =1}^m\Vert g_\varphi \Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)} +1 \right) h^{2s}\\&\le C \left( \Vert y_d\Vert _{L^2(\Omega )} + \sum _{j =1}^m\Vert g_\varphi \Vert _{W^{1/2,2}_{\mathbf {\beta }}(\Gamma _j)} +1 \right) h^{2s} = C h^{2s}M_d. \end{aligned}$$

Estimate (5.3) then follows from (5.5) together with this last estimate and (5.7).

Let us prove (5.4). First we notice that for \(1/2<\theta \le 1\), the trace operator is continuous from \(H^\theta (\Omega )\) to \(L^2(\Gamma )\), so

$$\begin{aligned} \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h \Vert _{L^2(\Gamma )}&\le C \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h \Vert _{H^\theta (\Omega )} . \end{aligned}$$

To estimate the term \(\Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h \Vert _{H^\theta (\Omega )}\), we first introduce \(\phi _h\in Y_h\), the finite element approximation of \(\varphi _{{\bar{u}}_h}\), that satisfies \(a(z_h,\phi _h) = \int _\Omega ( y_{{\bar{u}}_h}-y_d)z_h\,\textrm{d}x+ \int _\Gamma g_\varphi z_h\,\textrm{d}x\) for all \(z_h\in Y_h\). The difference \(\phi _h-{\bar{\varphi }}_h\) satisfies \(a(z_h,\phi _h-{\bar{\varphi }}_h) = \int _\Omega (y_{{\bar{u}}_h}-{\bar{y}}_h) z_h\,\textrm{d}x\) for all \(z_h\in Y_h\). From the continuity estimate for the discrete adjoint equation of Theorem 4.3 we deduce that

$$\begin{aligned} \Vert \phi _h-{\bar{\varphi }}_h \Vert _{H^1(\Omega )}\le C\Vert y_{\bar{u}_h}- {\bar{y}}_h\Vert _{L^2(\Omega )}. \end{aligned}$$
(5.8)

Using the triangle inequality, the fact that \(\theta \le 1\), the finite element error estimate (4.20) for the adjoint state from Corollary 4.6, (5.8), and the finite element error estimate (4.19) for the state equation, together with the uniform boundedness of \(\Vert \bar{u}_h\Vert _{L^2(\Gamma )}\) provided in Lemma 5.5, we obtain

$$\begin{aligned} \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h \Vert _{H^\theta (\Omega )}&\le \Vert \varphi _{{\bar{u}}_h}-\phi _h \Vert _{H^\theta (\Omega )} + \Vert \phi _h-{\bar{\varphi }}_h \Vert _{H^1(\Omega )}\\&\le C\left( h^{(2-\theta )s} M_d + \Vert y_{{\bar{u}}_h}- {\bar{y}}_h\Vert _{L^2(\Omega )} \right) \\&\le C\left( h^{(2-\theta )s} M_d + h^{3s/2}\Vert {\bar{u}}_h\Vert _{L^2(\Gamma )} \right) \le C h^{(2-\theta )s} M_d, \end{aligned}$$

where the last inequality is a result of Lemma 5.5 and the condition \(\theta > 1/2\).\(\square \)

We are now in a position to prove the main result of this section.

Theorem 5.7

Suppose Assumption 5.2 holds. Then, there exists a constant \(C>0\) independent of h, \(y_d\) and \(g_\varphi \) such that, for all \(0< h<{\bar{h}}\)

$$\begin{aligned} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\Gamma )} \le C h^{s^*} M_d, \end{aligned}$$

for all \(s^*\le 1\) such that \(s^* <\dfrac{3}{2}\dfrac{\lambda _j}{\mu _j}\) for all \(j\in \{1,\ldots ,m\}\).

Proof

Testing the equality \(a(z,{\bar{\varphi }}-\varphi _{{\bar{u}}_h}) = \int _\Omega ({\bar{y}}-y_{{\bar{u}}_h})z\,\textrm{d}x\) for \(z = {\bar{y}}-y_{{\bar{u}}_h}\) and using the state equation, we have that

$$\begin{aligned} 0\le \Vert {\bar{y}}-y_{{\bar{u}}_h}\Vert ^2_{L^2(\Omega )} = a(\bar{y}-y_{{\bar{u}}_h},{\bar{\varphi }}-\varphi _{{\bar{u}}_h}) = \int _\Gamma (\bar{u}-{\bar{u}}_h)({\bar{\varphi }}-\varphi _{{\bar{u}}_h})\,\textrm{d}x. \end{aligned}$$

So we can write

$$\begin{aligned} \nu \Vert {\bar{u}}&-{\bar{u}}_h\Vert _{L^2(\Gamma )}^2 \le \int _\Gamma ({\bar{\varphi }}-\varphi _{{\bar{u}}_h} + \nu ({\bar{u}}-{\bar{u}}_h))( {\bar{u}}-{\bar{u}}_h)\,\textrm{d}x\\&= \int _\Gamma ({\bar{\varphi }}-{\bar{\varphi }}_h + \nu ({\bar{u}}-\bar{u}_h))({\bar{u}}-{\bar{u}}_h)\,\textrm{d}x+ \int _\Gamma ({\bar{\varphi }}_h-\varphi _{\bar{u}_h} )( {\bar{u}}-{\bar{u}}_h)\,\textrm{d}x= I+II. \end{aligned}$$

Let us bound the first term. First we insert in appropriate places \(Q_h{\bar{u}}\) and \({\bar{u}}\). Next, we apply the first order optimality conditions for the continuous and discrete problem. Finally we insert \(\varphi _{{\bar{u}}_h}\) to obtain

$$\begin{aligned} I&= \int _\Gamma ({\bar{\varphi }}-{\bar{\varphi }}_h + \nu ({\bar{u}}-\bar{u}_h) )( {\bar{u}}- Q_h{\bar{u}})\,\textrm{d}x+ \int _\Gamma ({\bar{\varphi }}-{\bar{\varphi }}_h + \nu ({\bar{u}}-{\bar{u}}_h) )( Q_h{\bar{u}}-{\bar{u}}_h)\,\textrm{d}x\\&= \int _\Gamma ({\bar{\varphi }}-{\bar{\varphi }}_h + \nu ({\bar{u}}-{\bar{u}}_h) )( {\bar{u}}- Q_h{\bar{u}})\,\textrm{d}x+ \int _\Gamma ({\bar{\varphi }} + \nu {\bar{u}} )( Q_h{\bar{u}}-{\bar{u}}_h)\,\textrm{d}x\\&\quad + \int _\Gamma ({\bar{\varphi }}_h + \nu {\bar{u}}_h)( {\bar{u}}_h-Q_h{\bar{u}})\,\textrm{d}x\\&= \int _\Gamma ({\bar{\varphi }}-{\bar{\varphi }}_h + \nu ({\bar{u}}-{\bar{u}}_h) )( {\bar{u}}- Q_h{\bar{u}})\,\textrm{d}x+ \int _\Gamma ({\bar{\varphi }} + \nu {\bar{u}} )( Q_h{\bar{u}}-{\bar{u}})\,\textrm{d}x\\&\quad + \int _\Gamma ({\bar{\varphi }} + \nu {\bar{u}} )( {\bar{u}}-{\bar{u}}_h)\,\textrm{d}x+ \int _\Gamma ({\bar{\varphi }}_h + \nu {\bar{u}}_h)( {\bar{u}}_h-Q_h{\bar{u}})\,\textrm{d}x\\&\le \int _\Gamma ({\bar{\varphi }}-{\bar{\varphi }}_h + \nu ({\bar{u}}-\bar{u}_h) )( {\bar{u}}- Q_h{\bar{u}})\,\textrm{d}x+\int _\Gamma ({\bar{\varphi }} + \nu {\bar{u}} )( Q_h{\bar{u}}-{\bar{u}})\,\textrm{d}x\\&= \int _\Gamma ({\bar{\varphi }}-\varphi _{{\bar{u}}_h} + \nu ({\bar{u}}-\bar{u}_h) )( {\bar{u}}- Q_h{\bar{u}})\,\textrm{d}x+ \int _\Gamma (\varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h )( {\bar{u}}- Q_h{\bar{u}})\,\textrm{d}x\\&\quad + \int _\Gamma ({\bar{\varphi }} + \nu {\bar{u}} )( Q_h{\bar{u}}-\bar{u})\,\textrm{d}x= I_A + I_B + I_C. \end{aligned}$$

From Lemmas 5.3 and 5.5, it is clear that \(I_C\le C h^2 M_d^2\).

Let us study \(I_A\). Testing the equality \(a(z,{\bar{\varphi }}-\varphi _{{\bar{u}}_h}) = \int _\Omega ({\bar{y}}-y_{\bar{u}_h})z\,\textrm{d}x\) for \(z = {\bar{y}}-y_{Q_h{\bar{u}}}\) and using the state equation, Cauchy–Schwarz inequality, and Theorem 3.4(a), we obtain

$$\begin{aligned} \int _\Gamma ({\bar{\varphi }}&-\varphi _{{\bar{u}}_h})({\bar{u}}- Q_h{\bar{u}})\,\textrm{d}x= a({\bar{y}}-y_{Q_h{\bar{u}}},{\bar{\varphi }}-\varphi _{{\bar{u}}_h}) = \int _\Omega ({\bar{y}}-y_{{\bar{u}}_h})({\bar{y}}-y_{Q_h{\bar{u}}})\,\textrm{d}x\\&\le \Vert y_{{\bar{u}}-Q_h{\bar{u}}}\Vert _{L^2(\Omega )} \Vert y_{\bar{u}-{\bar{u}}_h}\Vert _{L^2(\Omega )} \le C \Vert {\bar{u}}-Q_h\bar{u}\Vert _{L^2(\Gamma )} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\Gamma )} \end{aligned}$$

Using this and Lemmas 5.3 and 5.5, we obtain

$$\begin{aligned} I_A = \int _\Gamma ({\bar{\varphi }}-\varphi _{{\bar{u}}_h} + \nu ({\bar{u}}-\bar{u}_h) )( {\bar{u}}- Q_h{\bar{u}})\,\textrm{d}x\le C h \Vert {\bar{u}}-\bar{u}_h\Vert _{L^2(\Gamma )}M_d. \end{aligned}$$

Next we bound \(I_B\) and II. By the Cauchy-Schwarz inequality, we have that, for every \(v\in L^2(\Gamma )\),

$$\begin{aligned} \int _\Gamma (\varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h)v\,\textrm{d}x\le \Vert \varphi _{{\bar{u}}_h}-{\bar{\varphi }}_h\Vert _{L^2(\Gamma )} \Vert v\Vert _{L^2(\Gamma )}. \end{aligned}$$
(5.9)

Taking \(v = {\bar{u}}- Q_h{\bar{u}}\) in (5.9) and using (5.4) and Lemmas 5.3 and 5.5, we conclude that

$$\begin{aligned} I_B \le \Vert \varphi _{{\bar{u}}_h}-\bar{\varphi }_h\Vert _{L^2(\Gamma )} \Vert {\bar{u}}- Q_h\bar{u}\Vert _{L^2(\Gamma )} \le C h^{(2-\theta )s+1} M_d^2. \end{aligned}$$

Finally, taking \(v = {\bar{u}}-{\bar{u}}_h\) in (5.9) and using (5.4), we have

$$\begin{aligned} II \le C h^{(2-\theta )s} \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\Gamma )} M_d. \end{aligned}$$

Gathering all the estimates we have that

$$\begin{aligned} \nu \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\Gamma )}^2&\le C( h \Vert \bar{u}-{\bar{u}}_h\Vert _{L^2(\Gamma )} M_d + h^{(2-\theta )s+1} M_d^2 \\&\qquad + h^2 M_d^2 + h^{(2-\theta )s}\Vert {\bar{u}}-\bar{u}_h\Vert _{L^2(\Gamma )} M_d) \end{aligned}$$

and the proof concludes using Young’s inequality. Notice that the appearance of the terms \(h \Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\Gamma )} M_d \) and \(h^2 M_d^2\) implies that the resulting exponent \(s^*\) is less than or equal to one. On the other hand, since \(\theta > 1/2\), the term \(h^{(2-\theta )s}\Vert {\bar{u}}-{\bar{u}}_h\Vert _{L^2(\Gamma )} M_d\) yields the bound \(s^*\le (2-\theta )s< \dfrac{3}{2}s <\dfrac{3}{2}\dfrac{\lambda _j}{\mu _j}\). Finally, from the term \(h^{(2-\theta )s+1} M_d^2\) we obtain the bound \(s^*\le \min \{(2-\theta )s,1\}\), so no new conditions are imposed on \(s^*\). \(\square \)

6 A Numerical Example

Let \(\Omega \) be the L-shaped domain \(\Omega = \{x\in \mathbb {R}^2: r<\sqrt{2}, \theta < 3\pi /2\}\cap (-1,1)^2\). We consider a functional of the form

$$\begin{aligned} J(u) = \frac{1}{2}\int _\Omega (y_u(x)-y_d(x))^2 \,\textrm{d}x+\frac{\nu }{2}\int _\Gamma u(x)^2 \,\textrm{d}x+\int _\Gamma y_u(x) g_\varphi (x) \,\textrm{d}x, \end{aligned}$$

where

$$\begin{aligned} \left\{ \begin{array}{rccl} -\Delta y_u + b\cdot \nabla y_u + a_0 y_u &{}{}=&{}{} f &{} \text{ in } \Omega ,\\ \partial _{n} y &{}{}=&{}{} u+g_y&{} \text{ on } \Gamma .\end{array}\right. \end{aligned}$$

with data \(\nu \), \(y_d\), \(g_\varphi \), b, \(a_0\), \(g_y\) described below. The inclusion of the data f and \(g_y\) is useful to write a problem with known exact solution. Notice that, if we denote by \(y_0\in L^2(\Omega )\) the state related to \(u\equiv 0\) and redefine \(y_d:=y_d-y_0\) and \(y_u:=y_u-y_0\), the problem fits into the framework of problem (P) and Eq. (1.1).

Let \((r,\theta )\) be the polar coordinates in the plane, \(r\ge 0\), \(\theta \in [0,2\pi ]\). The interior angle at the vertex of the domain located at the origin is \(\omega = \omega _1 = 3\pi /2\) and we denote \(\lambda = \lambda _1 =\pi /\omega _1 = 2/3\). For \(j=2,\ldots ,6\), \(\omega _j=\pi /2\) and \(\lambda _j=2\).

We introduce \({\bar{y}} = r^\lambda \cos (\lambda \theta ),\) \({\bar{\varphi }} = -{\bar{y}}\) and \({\bar{u}} = -{\bar{\varphi }}/\nu \) on \(\Gamma \) and, for some \(\alpha > -3/2\) and some \(\delta \ge 0\), we consider \(b(x) = \delta r^{\alpha +1}(\cos \theta ,\sin \theta )^T\) and \(a_0(x) = r^\alpha \).

The data for this problem are defined as \(f = b\cdot \nabla {\bar{y}}+ a_0 {\bar{y}}\), \( g_y = \partial _{n} {\bar{y}} - {\bar{u}}\) on \(\Gamma \), \( y_d = {\bar{y}} +\nabla \cdot ({\bar{\varphi }} b)-a_0{\bar{\varphi }}\) and \(g_\varphi = \partial _n{\bar{\varphi }} + (b{\bar{\varphi }})\cdot n\).

For all \(\alpha > -2\), \(b\in L^{{\hat{p}}}(\Omega )\) for some \({{\hat{p}}}>2\) (Assumption 2.1). For \(\alpha > -1-\beta \), \(a_0,\nabla \cdot b, f, y_d\in L^2_{\mathbf {\beta }}(\Omega )\) and \(b\cdot n, g_y, g_\varphi \in W^{1/2,2}_{\mathbf {\beta }}(\Gamma )\), so the assumptions of Theorems 3.4(c) and 3.5(c) hold. If we impose \(\beta <1/2\) (assumption in Theorem 5.1), we have that for \(\alpha > -3/2\) all the assumptions of the paper hold. In our experiments, we fix \(\alpha = -1.25\).
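Since \(\Delta {\bar{y}}=0\) and b is purely radial, the manufactured data have simple closed forms; for instance \(f=b\cdot \nabla {\bar{y}}+a_0{\bar{y}}=(1+\delta \lambda )r^{\alpha +\lambda }\cos (\lambda \theta )\). The following lines are our own transcription of the exact solution and coefficients in polar coordinates; the value of \(\nu \) below is only a placeholder.

```python
import numpy as np

lam, alpha, delta, nu = 2.0 / 3.0, -1.25, 6.0, 1.0   # nu: placeholder value

def polar(x):
    r = np.hypot(x[0], x[1])
    th = np.mod(np.arctan2(x[1], x[0]), 2.0 * np.pi)  # theta in [0, 2*pi)
    return r, th

def ybar(x):                      # exact state: r^lam * cos(lam*theta)
    r, th = polar(x)
    return r**lam * np.cos(lam * th)

def phibar(x):                    # exact adjoint state: phibar = -ybar
    return -ybar(x)

def ubar(x):                      # exact control on Gamma: ubar = -phibar/nu
    return -phibar(x) / nu

def b(x):                         # convection field: delta * r^(alpha+1) * e_r
    r, th = polar(x)
    return delta * r**(alpha + 1.0) * np.array([np.cos(th), np.sin(th)])

def a0(x):                        # reaction coefficient: r^alpha
    return polar(x)[0]**alpha

def f(x):                         # source: b . grad(ybar) + a0 * ybar
    r, th = polar(x)
    return (1.0 + delta * lam) * r**(alpha + lam) * np.cos(lam * th)
```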

The given \({\bar{u}}\) is the solution of the control problem

$$\begin{aligned} \mathrm{(P)} \min _{u\in L^2(\Gamma )}J(u), \end{aligned}$$

with related state \({\bar{y}}\) and adjoint state \({\bar{\varphi }}\), which satisfy the optimality system

$$\begin{aligned} \left\{ \begin{array}{rccl} -\Delta {\bar{y}} + b\cdot \nabla {\bar{y}} + a_0 {\bar{y}} &{}{}=&{}{} f &{} \text{ in } \Omega ,\\ \partial _{n_A} {\bar{y}} &{}{}=&{}{} g_y+{\bar{u}}&{} \text{ on } \Gamma ,\end{array}\right. \\ \left\{ \begin{array}{rccl} -\Delta {\bar{\varphi }} -\nabla \cdot (b \bar{\varphi }) + a_0 {\bar{\varphi }} &{}{}=&{}{} {\bar{y}}-y_d &{}{} \text{ in } \Omega ,\\ \partial _{n} {\bar{\varphi }} + {\bar{\varphi }} b\cdot n &{}{}=&{}{} g_\varphi &{}{} \text{ on } \Gamma ,\end{array}\right. \\ \begin{array}{rccl}{\bar{u}}= & {} -{\bar{\varphi }}/\nu&\text { on } \Gamma .\end{array} \end{aligned}$$

It is clear that \({\bar{y}},{\bar{\varphi }}\in W^{2,2}_{\mathbf {\beta }}(\Omega )\) and \({\bar{u}}\in H^1(\Gamma )\cap W^{1/2,2}_{\mathbf {\beta }}(\Gamma )\) for \(\mathbf {\beta }=(\beta ,0,0,0,0,0)\) with \(\beta > 1-\lambda = 1/3\).

For \(\delta = 6\), we have checked numerically that the operator is not coercive.

To discretize the problem we use the finite element approximation described in this work. We use a family of graded meshes obtained by bisection; see, e.g., [25, Figure 1.2]. This meshing method does not lead to superconvergence properties in the gradients. The code has been written in Matlab and run on a desktop PC with an Intel(R) Core(TM) i5-7500 CPU at 3.4GHz and 24GB of RAM. The meshes have been prepared using functions provided by Johannes Pfefferer. The finite element approximations are obtained with code prepared by us, and the linear systems are solved using Matlab’s [L,U,P,Q,D] = lu(S) factorization. The optimization of the resulting finite-dimensional quadratic program is done using Matlab’s pcg.

First we check estimates (4.14) and (4.15) for the error in the solution of the boundary value problem. For appropriately graded meshes, \(\mu < 2/3 = \lambda \), we expect order \(h^2\) in \(L^2(\Omega )\) and order h in \(H^1(\Omega )\). For a quasi-uniform family, \(\mu = 1\), we have \(s < 2/3\), so we expect order \(h^{1.33}\) in \(L^2(\Omega )\) and order \(h^{0.66}\) in \(H^1(\Omega )\). We summarize the results in Tables 1, 2, 3 and 4. We include results for both the state and adjoint state equation. Notice that \({{\tilde{\varphi }}}_h\) is the finite element approximation of \({\bar{\varphi }}\), obtained using the exact \({\bar{y}}\), i.e., \(a(z_h,{{\tilde{\varphi }}}_h) = \int _\Omega (\bar{y}-y_d)z_h\,\textrm{d}x+ \int _\Gamma g_\varphi z_h\,\textrm{d}x\) for all \(z_h\in Y_h\).

Table 1 Errors and experimental orders of convergence for the boundary value problem
Table 2 Errors and experimental orders of convergence for the boundary value problem
Table 3 Errors and experimental orders of convergence for the boundary value problem
Table 4 Errors and experimental orders of convergence for the boundary value problem

Next, we turn to the control problem and check the estimate in Theorem 5.7. Notice that we should obtain order of convergence h for both graded and quasi-uniform meshes. We summarize the results in Table 5.

Table 5 Errors and experimental orders of convergence for the optimal control problem
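The expected orders quoted in this section, and the experimental orders reported in Tables 1, 2, 3, 4 and 5, can be reproduced with a few lines; the experimental order between two consecutive meshes is computed in the usual way, which we assume is what the tables report.

```python
import numpy as np

lam = 2.0 / 3.0                                  # singular exponent of the L-shape

def expected_orders(mu, eps=1e-12):
    """Orders predicted by (4.14)/(4.15) and Theorem 5.7 for mesh grading mu."""
    s = min(1.0, lam / mu - eps)                 # s <= 1 and s < lam/mu
    s_star = min(1.0, 1.5 * lam / mu - eps)      # control rate of Theorem 5.7
    return {"L2": 2.0 * s, "H1": s, "control": s_star}

print(expected_orders(1.0))    # quasi-uniform: L2 ~ 1.33, H1 ~ 0.67, control ~ 1
print(expected_orders(0.6))    # graded, mu < lam: L2 = 2, H1 = 1, control = 1

def eoc(err_coarse, err_fine, h_coarse, h_fine):
    """Experimental order of convergence between two consecutive meshes."""
    return np.log(err_coarse / err_fine) / np.log(h_coarse / h_fine)
```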

Note that in this example the regularity of the adjoint state is even \({\bar{\varphi }}\in W^{2,\infty }_{\mathbf {\gamma }}(\Gamma )\) for \(\mathbf {\gamma }=(\gamma ,0,0,0,0,0)\) with \(\gamma >4/3\). This leads to superconvergence in the \(L^2(\Omega )\) and \(L^2(\Gamma )\) norms of both the state and the adjoint state: although we would only expect order of convergence 1, as for the control, we obtain the same order of convergence as for the boundary value problem, i.e., 1.33 or almost 2 in our examples. This phenomenon will be studied in a future paper.