1 Introduction

Arguably one of the most fundamental results in finite element analysis is the best approximation result for the Galerkin method, known as Cea’s lemma [20], which together with approximation estimates for finite element functions results in quasi-optimal error estimates for finite element methods [3, 27, 44, 48]. This result, which we will review below, essentially says that if a (2m)-order elliptic problem, \(m\ge 1\), is approximated with \(H^m\)-conforming finite elements of local polynomial order p, the error in the \(H^m\)-norm is of order \(h^{p+1-m}\) for a sufficiently smooth solution, and that this rate is optimal compared to approximation: the best interpolant of the exact solution has similar accuracy.

For ill-posed elliptic problems the situation is different. On the continuous level existence can only be guaranteed after regularisation of the problem. The two main approaches are Tikhonov regularisation [45] and quasi-reversibility [31]. These two approaches are strongly related (see for instance [7]). The main effort in the error analysis has been to estimate the perturbation induced by the addition of regularisation, and how to choose the associated regularisation operator or parameter [6, 28, 35, 38, 43]. The error due to approximation in finite dimensional spaces of such regularised problems has also been analysed [24, 36, 41].

There is also a rich literature on projection methods for ill-posed problems where the discretisation serves as regularisation and refinement has to stop as soon as the effect of perturbations in data becomes dominant [22, 23, 26, 30, 42]. These methods are often based on least squares, and the convergence of the approximate solution to the exact solution for unperturbed data has been proven in several works. Various stopping criteria for mesh refinement have also been proposed in order to avoid degeneration due to pollution from perturbations. However, no convergence-rate results in which the discretisation error and the perturbation error are both included appear in these references.

The use of conditional stability (continuous dependence on data under the assumption of a certain a priori bound) to obtain more complete error estimates has been proposed in [9,10,11, 18] for a class of finite element methods based on weakly consistent regularisation/stabilisation in a primal-dual framework. Here stability is obtained through a combination of consistent stabilisation and Tikhonov regularisation, scaled with the mesh parameter to obtain weak consistency. The upshot is that for this class of methods an error analysis exists, where the computational error is bounded in terms of the mesh parameter and perturbations of data, with constants depending on Sobolev norms of the exact solution. Similarly to the well-posed case, the error estimates for this approach combine the stability of the physical problem with the numerical stability of the computational method and the approximability of the finite element space. Contrary to the well-posed case, numerical stability cannot be deduced from the physical stability, but has to be a consequence of the design of the stabilisation terms. This means that the stabilisation in this framework is bespoke, and must be designed to combine optimal (weak) consistency and sufficient numerical stability. There is often a tension between these two design criteria. As noted above, sometimes Tikhonov regularisation, scaled with the mesh parameter, may be used in the framework. An interesting feature is that the bespoke character also allows for the integration of the dependence of the estimates on physical parameters and different problem regimes [16, 17]. Other physical models that have been considered in this framework include data assimilation for fluids [5, 13, 18] and wave equations [12]. Common to all these references is the fact that the error estimates reflect the stability of the continuous problem and the approximation order of the finite element space, which seems to be an optimality property of the methods.
No rigorous proof, however, has been given for this optimality. The objective of the present work is to show, in the model case of unique continuation for the Laplace equation, that the proposed error estimates are indeed optimal.

For ill-posed PDEs that are conditionally stable, error estimates in terms of the modulus of continuity in the conditional stability, the consistency error and the best approximation error have also been obtained in [21]. Based on least squares with the norms and the regularisation term dictated by the conditional stability estimate, this variation of quasi-reversibility relies on working with discrete dual norms and constructing Fortin projectors. By choosing the regularisation parameter in terms of the consistency error and the best approximation error, the obtained error bound reflects the conditional stability estimate (qualitatively optimal). Conditional stability estimates have also been used to obtain some bounds on the generalisation error for physics-informed neural networks solving ill-posed PDEs [39]. The question of optimality for both these kinds of methods is included in our discussion.

Another well-known ill-posed problem is analytic continuation, which, similarly to unique continuation, possesses conditional stability under the assumption of an a priori bound. We will not discuss this problem here; for its conditional stability/conditioning and numerical approximations, we refer the reader to [46, 47] and the references therein.

1.1 Unique Continuation Problem

Let \(0< r_1< r_2 < R\) and write \(\omega := B(r_1)\) and \(B:= B(r_2)\) where B(r) is the open ball of radius \(r > 0\), with the centre at the origin in \({\mathbb {R}}^n\). The objective is to solve the continuation problem: given the restriction \(u|_\omega \) to the subset \(\omega \), find the restriction \(u|_B\) when u satisfies \(\Delta u = 0\) in B(R).

Further, letting \(r_2< r_3 < R\) and writing \(\Omega := B(r_3)\), it is classical [29] that the following conditional stability estimate holds:

$$\begin{aligned} \Vert u\Vert _{L^2(B)} \lesssim \Vert u\Vert _{L^2(\omega )}^\alpha \Vert u\Vert _{L^2(\Omega )}^{(1-\alpha )}, \end{aligned}$$
(1)

where \(\alpha \in (0,1)\) and the implicit constant do not depend on the harmonic function u. Estimate (1) is often called a three-ball inequality and in this case the constants can be given explicitly, see Theorem 1 below. We may view the unique continuation problem as finding \(u \in H^1(\Omega )\) such that

$$\begin{aligned} \left\{ \begin{array}{rcl} -\Delta u &{}=&{} 0 \text{ in } \Omega , \\ u\vert _\omega &{}=&{} q \text{ in } \omega , \end{array} \right. \end{aligned}$$
(2)

with a priori knowledge on the size of the solution in \(\Omega \) as prescribed by the \(L^2(\Omega )\)-norm in (1).

1.2 Motivation and Outline

The motivation of this paper comes from error estimates obtained for primal-dual finite element methods applied to this problem, with perturbed data \(u\vert _\omega =q + \delta q\), of the form

$$\begin{aligned} \Vert u - u_h\Vert _{L^2(B)} \lesssim h^\alpha \Vert u\Vert _{H^2(\Omega )} + h^{\alpha - 1} \Vert \delta q\Vert _{L^2(\omega )}, \end{aligned}$$
(3)

which have been shown in [14, 16, 17] for different variations of the second order elliptic equation. Here \(\alpha \in (0,1)\) is the exponent in (1) and \(h>0\) denotes the mesh parameter defining the characteristic length scale of the finite dimensional space. This is the case of piecewise affine approximation; however, the estimate generalises in a natural way to higher order approximation, as we shall see below. One can obtain a similar bound in the \(H^1\)-norm over B. In the counterfactual case that \(\alpha = 1\) one would then recover an estimate that is optimal compared to interpolation. Hence a natural question is whether the bound (3) in the \(L^2\)-norm can be improved upon, since it is suboptimal by one order in h when compared to interpolation in the case \(\alpha =1\).

We show in this paper that if the coefficient \(\alpha \) in (1) is optimal and depends continuously on \(r_3\) (Theorem 1), then regardless of the underlying method, no sequence of approximations to (2) can converge with a rate better than that given by (3) (Theorem 2) without increasing the sensitivity to perturbations. We also point out that although the discussion focuses on the finite element method, the definition of optimal convergence given and the proof of optimality hold for any method producing an approximating sequence in \(H^1\) (or relying on such a sequence for the analysis).

The paper is organised as follows. In Sect. 2 we will discuss the notion of optimality of finite element approximations. First we revisit the classical finite element analysis for well-posed problems. In Sect. 2.2 we then discuss how the ideas of the well-posed case translate to the ill-posed case. This leads us to a definition of optimal approximation for the problem (2), and we prove in Sect. 2.3 that no approximation method can converge at a better rate than that given by this definition. Finally, in Sect. 3 we show that optimality can indeed be attained by presenting a finite element method with optimal error estimates which extend (3) to higher order approximations.

2 Optimal Error Estimates for Elliptic Problems

In this section we will first briefly recall the theory of optimal error estimates for Galerkin approximations of well-posed second order elliptic problems. We then consider the ill-posed model problem (2) and discuss how the construction that led to optimal approximation in the well-posed case can be adapted to this situation. This leads us to a definition of optimality of approximate solutions in the ill-posed case. We let \(V:=H^1(\Omega )\) and \(V_0:=H^1_0(\Omega )\).

For simplicity we consider the Poisson problem for \(f \in V_0'\):

$$\begin{aligned} \left\{ \begin{array}{rcl} -\Delta u &{} = &{} f \text{ in } \Omega \\ u &{} = &{} 0 \text{ on } \partial \Omega . \end{array} \right. \end{aligned}$$
(4)

We define the associated weak formulation by: find \(u \in V_0\) such that

$$\begin{aligned} a(u,v) = \ell (v) \quad \forall v \in V_0, \end{aligned}$$
(5)

where \(a(u,v):= \int _\Omega \nabla u \cdot \nabla v ~\text{ d }x\) and \(\ell (v):= \langle f,v\rangle _{V'_0,V_0}\). It is well known that a is a coercive (\(V_0\)-elliptic) and continuous bilinear form on \(V_0 \times V_0\), and that \(\ell \) is a bounded, hence continuous, linear functional on \(V_0\).

It then follows from the Lax-Milgram lemma [32] that the weak formulation admits a unique solution satisfying the stability estimate

$$\begin{aligned} \Vert u\Vert _V \le \Vert \ell \Vert _{{V_0'}} \end{aligned}$$
(6)

where \(\Vert \cdot \Vert _V:= \sqrt{a(\cdot ,\cdot )}\) and the dual norm is defined by

$$\begin{aligned} \Vert \ell \Vert _{{V_0'}}:= \sup _{v \in V_0 \setminus \{0\}} \frac{|\ell (v)|}{\Vert v\Vert _V}. \end{aligned}$$

Introducing a finite dimensional subspace \(V_{0h} \subset V_0\) we define the Galerkin method, find \(u_h \in V_{0h}\) such that

$$\begin{aligned} a(u_h,v_h) = \ell _\delta (v_h) \quad \forall v_h \in V_{0h}, \end{aligned}$$
(7)

where \(\ell _\delta \) denotes a perturbed right hand side \(\ell _\delta (v):= \langle f + \delta f,v\rangle _{V_0',V_0}\), with \(\delta f \in {V_0'}\) and we assume \(\Vert \delta f\Vert _{{V_0'}}\) to be known. The associated linear system is invertible since a is coercive.

Let \(\bar{u}_h\in V_{0 h}\) be the solution for the unperturbed right hand side, satisfying

$$\begin{aligned} a\left( \bar{u}_h, v_h\right) =\ell \left( v_h\right) \quad \forall v_h \in V_{0 h}. \end{aligned}$$

Then we have that \(\left\| u-\bar{u}_h\right\| _V=\inf _{w_h \in V_{0h}}\left\| u-w_h\right\| _V\), by Galerkin orthogonality, and that \(\left\| u_h-\bar{u}_h\right\| _V \le \Vert \delta f\Vert _{V_0^{\prime }}\). Since \(\Delta : V_0 \rightarrow V_0^{\prime }\) is an isometric isomorphism, an application of the triangle inequality for the approximation error \(e:= u- u_h\) gives that

$$\begin{aligned} \Vert e\Vert _V \le \inf _{w_h \in V_{0h}} \Vert \Delta (u - w_h)\Vert _{{V_0'}}+ \Vert \delta f\Vert _{{V_0'}}. \end{aligned}$$
(8)

This is equivalent to the classical result of Cea’s lemma, but written in a form suitable for our purposes. If \(u\in H^{k+1}(\Omega )\) and \(V_{0h}\) is the space of \(H^1\)-conforming piecewise polynomial finite elements of order k we immediately have by approximation that

$$\begin{aligned} \Vert e\Vert _V \lesssim h^k |u|_{H^{k+1}(\Omega )} + \Vert \delta f\Vert _{{V_0'}}, \end{aligned}$$
(9)

where \(|\cdot |_{H^{k+1}(\Omega )}\) stands for the seminorm. Observe how the Lipschitz stability of (6) combines with the approximation properties of the finite element space to yield an optimal error estimate. Perturbations in data lead to stagnation of the error at the level of the perturbation.
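The convergence rate in (9) is easy to observe in practice. The following is a minimal numerical sketch (in Python, using only the standard library); the one-dimensional setting, the mass-lumped load vector and the sample meshes are our illustrative choices, not part of the analysis. It solves \(-u'' = \pi ^2 \sin (\pi x)\) on (0, 1) with homogeneous Dirichlet conditions using piecewise affine elements (\(k=1\)) and estimates the observed rate in the \(H^1\)-seminorm.

```python
import math

def solve_poisson_p1(N):
    """P1 finite elements for -u'' = f on (0,1), u(0) = u(1) = 0,
    with f = pi^2 sin(pi x), so the exact solution is u = sin(pi x)."""
    h = 1.0 / (N + 1)
    f = lambda x: math.pi ** 2 * math.sin(math.pi * x)
    # Mass-lumped load vector b_i ~ h f(x_i); the stiffness matrix is
    # tridiagonal with 2/h on the diagonal and -1/h off the diagonal.
    b = [h * f((i + 1) * h) for i in range(N)]
    a, d, c = -1.0 / h, 2.0 / h, -1.0 / h
    # Thomas algorithm: forward elimination, then back substitution.
    cp, bp = [0.0] * N, [0.0] * N
    cp[0], bp[0] = c / d, b[0] / d
    for i in range(1, N):
        m = d - a * cp[i - 1]
        cp[i] = c / m
        bp[i] = (b[i] - a * bp[i - 1]) / m
    U = [0.0] * N
    U[-1] = bp[-1]
    for i in range(N - 2, -1, -1):
        U[i] = bp[i] - cp[i] * U[i + 1]
    # H1-seminorm error by two-point Gauss quadrature on each element
    # (midpoints alone would see gradient superconvergence).
    nodes = [0.0] + U + [0.0]
    g = h / (2.0 * math.sqrt(3.0))
    err2 = 0.0
    for i in range(N + 1):
        duh = (nodes[i + 1] - nodes[i]) / h
        xm = (i + 0.5) * h
        for xq in (xm - g, xm + g):
            err2 += 0.5 * h * (math.pi * math.cos(math.pi * xq) - duh) ** 2
    return math.sqrt(err2)

e1, e2 = solve_poisson_p1(31), solve_poisson_p1(63)  # h = 1/32, 1/64
rate = math.log(e1 / e2) / math.log(2.0)
print(round(rate, 2))  # observed rate, close to k = 1 as predicted by (9)
```

Halving the mesh size roughly halves the \(H^1\)-error, consistent with the unperturbed part of (9) for \(k=1\).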

2.1 Optimal Three-Ball Estimate

Three-ball estimates such as (1) for solutions of second-order elliptic equations are well-known in the literature, see e.g. the review [1] or [8]. However, such results typically contain constants that depend implicitly on the geometry and the coefficients of the differential operator, and whose optimality is not clear [8]. We aim here to give a result in the case of the Laplace operator which, barring optimality, is a variation of existing results in the literature, see [37, Theorem 1] and [2, Eq. (1.2)]. We will consider only the two and three dimensional cases, for which we prove the following three-ball estimate in \(L^2\)-norms with optimal explicit constants.

Theorem 1

Let \(n\in \{2,3\}\) and \(B(r) \subset {\mathbb {R}}^n\) be the open ball of radius \(r > 0\). Let \(0< r_1< r_2 < r_3\). Then for all functions u harmonic in \(B(r_3)\) there holds

$$\begin{aligned} \Vert u\Vert _{L^2(B(r_2))} \le \Vert u\Vert _{L^2(B(r_1))}^\alpha \Vert u\Vert _{L^2(B(r_3))}^{(1-\alpha )}, \end{aligned}$$
(10)

where

$$\begin{aligned} \alpha := \frac{\beta }{1+\beta } = \frac{\log (r_3) - \log (r_2)}{\log (r_3) - \log (r_1)}, \quad \beta := \frac{\log (r_3) - \log (r_2)}{\log (r_2) - \log (r_1)}. \end{aligned}$$
(11)

Moreover, there does not exist \({\tilde{\alpha }} > \alpha \) such that

$$\begin{aligned} \Vert u\Vert _{L^2(B(r_2))} \lesssim \Vert u\Vert _{L^2(B(r_1))}^{{\tilde{\alpha }}} \Vert u\Vert _{L^2(B(r_3))}^{(1-{\tilde{\alpha }})}. \end{aligned}$$
(12)

Proof

For any \(r > 0\) and \(\theta \in (0,1)\) there holds

$$\begin{aligned} \left\| u \right\| _{L^2(B(r))} \le \left\| u \right\| _{L^2(B(\theta r))}^{1/2} \left\| u \right\| _{L^2(B(\theta ^{-1} r))}^{1/2}, \end{aligned}$$
(13)

see e.g. [2, Eq. (1.2)]. We aim to transform this estimate into (10). We take the logarithm of (13) and write \(f(t) = \log \left\| u \right\| _{L^2(B(\exp (t)) )}\) to obtain

$$\begin{aligned} f(\log (r)) \le \frac{1}{2} f(\log (\theta r)) + \frac{1}{2} f(\log (\theta ^{-1} r)). \end{aligned}$$

Notice that \(\log (\theta r) + \log (\theta ^{-1} r) = 2 \log (r)\), so that writing \(t = \log (\theta r)\) and \(s = \log (\theta ^{-1} r)\), we obtain that

$$\begin{aligned} f(\tfrac{1}{2} (s + t )) \le \tfrac{1}{2} ( f(s) + f(t)) , \end{aligned}$$
(14)

yielding convexity of f. Hence, for every \(\alpha \in (0,1)\) and \(s,t \in {\mathbb {R}}\)

$$\begin{aligned} f(\alpha s + (1-\alpha ) t ) \le \alpha f(s) + (1-\alpha ) f(t). \end{aligned}$$
(15)

We now set \(r = r_2\) and \(\theta = r_1 / r_2\). Then \(r_1 = \theta r\) and \(r_3 = \theta ^{-\beta } r\), with \(\beta \) given by (11). Taking \(s = \log (\theta r)\) and \(t = \log (\theta ^{-\beta } r)\) there holds

$$\begin{aligned} \alpha s + (1-\alpha ) t&= \alpha \log \theta + \alpha \log r - (1-\alpha ) \beta \log \theta + (1-\alpha ) \log r\\&= (\alpha - (1-\alpha ) \beta ) \log \theta + \log r = \log r, \end{aligned}$$

since \((1-\alpha ) \beta = \alpha \). With this choice, taking the exponential of (15) gives (10).

Suppose now that (12) holds for some \({\tilde{\alpha }} > 0\). We will show that \({\tilde{\alpha }} \le \alpha \). Let us consider first the two dimensional case. Identifying \({\mathbb {R}}^2\) with \({\mathbb {C}}\), consider the function \(u(z) = z^{n-1}\), which is harmonic for every \(n \in {\mathbb {N}}\) (with a slight abuse of notation, n here denotes an arbitrary positive integer rather than the dimension). The same argument is valid for its real part. Using polar coordinates we have that, for \(\rho >0\),

$$\begin{aligned} \Vert u\Vert _{L^2(B(\rho ))}^2 = 2 \pi \int _0^\rho r^{2(n-1) + 1} ~\text{ d }r = c_n \rho ^{2n}, \quad c_n = \frac{\pi }{n}. \end{aligned}$$
(16)

Notice that we have equality in (10) for \(\alpha = \beta /(1+\beta ) = \frac{\log (r_3) - \log (r_2)}{\log (r_3) - \log (r_1)}\).

Recalling that \(r_1/r_2 = \theta ,\, r_3/r_2 = \theta ^{-\beta }\), estimate (12) reads as

$$\begin{aligned} 1 \lesssim \theta ^{n ({\tilde{\alpha }} - \beta (1-{\tilde{\alpha }}))} . \end{aligned}$$
(17)

As \(n \in {\mathbb {N}}\) is arbitrary and \(\theta \in (0,1)\) we must have \({\tilde{\alpha }} - \beta (1-{\tilde{\alpha }}) \le 0\). In other words,

$$\begin{aligned} {\tilde{\alpha }} \le \frac{\beta }{1+\beta } = \alpha . \end{aligned}$$

We turn to the three dimensional case, and consider the function

$$\begin{aligned} u(x^1, x^2, x^3) = z^{n-1}, \quad z = x^1 + i x^2. \end{aligned}$$
(18)

As above, this is harmonic for \(n \in {\mathbb {N}}\). Passing to spherical coordinates there holds

$$\begin{aligned} \Vert u\Vert _{L^2(B(\rho ))}^2 = 2 \pi \int _0^\pi \int _0^\rho (r\sin \theta )^{2(n-1) + 1} r ~\text{ d }r \text{ d }\theta = c_n \rho ^{2n + 1}, \end{aligned}$$
(19)

where the constant \(c_n\) can be written using the Gamma function

$$\begin{aligned} c_n = \frac{2 \pi ^{3/2} \varGamma (n)}{(2n + 1)\varGamma (n + 1/2)}. \end{aligned}$$

The conclusion follows as in the two dimensional case. \(\square \)

Note that the same explicit constants as in Theorem 1 appear in Hadamard’s three-circle theorem (in \(L^{\infty }\)-norms) for holomorphic functions.
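The mechanism behind the sharpness part of the proof can be observed numerically. In the following Python sketch (the radii and the trial exponent are arbitrary choices of ours), the norms (16) of \(u(z) = z^{n-1}\) give exact equality in (10), while for any exponent exceeding \(\alpha \) the quotient of the right-hand side of (12) by its left-hand side decays geometrically in n, so no uniform constant can exist.

```python
import math

# Radii and the exponent alpha from (11); the values are arbitrary samples.
r1, r2, r3 = 0.5, 1.0, 2.0
alpha = (math.log(r3) - math.log(r2)) / (math.log(r3) - math.log(r1))

def norm2d(n, rho):
    # ||z^(n-1)||_{L2(B(rho))} = sqrt(pi/n) * rho^n, cf. (16).
    return math.sqrt(math.pi / n) * rho ** n

# Exact equality in (10) for every power n.
for n in (1, 5, 20):
    lhs = norm2d(n, r2)
    rhs = norm2d(n, r1) ** alpha * norm2d(n, r3) ** (1 - alpha)
    assert abs(lhs - rhs) <= 1e-9 * rhs

# For a trial exponent larger than alpha the quotient RHS/LHS of (12)
# tends to zero as n grows, contradicting any uniform constant.
ta = alpha + 0.05
quot = [norm2d(n, r1) ** ta * norm2d(n, r3) ** (1 - ta) / norm2d(n, r2)
        for n in (1, 10, 100)]
assert quot[0] > quot[1] > quot[2]
print(quot)  # strictly decreasing towards zero
```

With these radii \(\alpha = 1/2\), and the quotient equals \(2^{(1-2{\tilde{\alpha }})n}\), which vanishes as \(n \rightarrow \infty \) whenever \({\tilde{\alpha }} > 1/2\).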

Remark 1

In Theorem 1 we proved the optimality and continuous dependence of the exponent \(\alpha \) for unique continuation subject to the Laplace equation. In a more general setting, a discussion of optimality of three-ball inequalities can be found in [25] for elliptic and parabolic problems. In [33, 34] some cases in fluid mechanics and elasticity are considered for which optimality is claimed.

2.2 Definition of Optimal Convergence for Ill-Posed Problems with Conditional Stability

In this section we will try to mimic the development in the well-posed case for the problem (2) and point out where things go wrong. We will do this with minimal reference to a particular approximation method to keep the discussion general. However, in Sect. 3 we introduce a method for which the programme can be carried out.

First we will derive a weak formulation. This time, since no boundary conditions are imposed on u, we must take the trial space to be V. To make the form \(a\) consistent with the problem, the test space must be chosen as \(V_0\), since keeping V would imply a homogeneous Neumann condition on the boundary. We may then write a weak formulation of the problem (2): find \(u \in V\) such that \(u\vert _{\omega } = q\) and

$$\begin{aligned} a(u,v) = 0 \quad \forall v \in V_0. \end{aligned}$$

We know that the exact solution satisfies this formulation and that (1) holds. Assume now that we have an approximation \(u_h \in V_h\) obtained using the perturbed data \({\tilde{q}}:= q + \delta q\), where \(\delta q \in L^2(\omega )\). Observe that although an exact solution will most likely not exist for this perturbed data, a discrete approximation of the unperturbed exact solution u may still be constructed. Similarly as before, the error \(e:=u-u_h\) satisfies

$$\begin{aligned} a(e,v) = \ell _h(v) \quad \forall v \in V_0, \end{aligned}$$

where

$$\begin{aligned} \ell _h(v):= -a(u_h, v), \, \text{ with } \Vert \ell _h\Vert _{{V_0'}} = \Vert \Delta u_h\Vert _{{V_0'}}. \end{aligned}$$

Observe that even if \(u_h\) is produced using a Galerkin procedure, we cannot use here the same techniques as when proving (8), since the trial space V in this case is larger than the test space \(V_0\).

As before we would now like to apply a stability estimate, this time (1), with the perturbation entering through the right hand side. However this is not possible, since there is no right hand side in (2) or (1). Instead we first decompose \(e = e_0 + {\tilde{e}}\), where \(e_0 \in V_0\) solves the well-posed problem

$$\begin{aligned} a(e_0,v) = \ell _h(v) \quad \forall v \in V_0, \end{aligned}$$

and \({\tilde{e}}\) solves (2) with \({\tilde{e}}\vert _{\omega } = (e - e_0)\vert _{\omega }\). Using the triangle inequality and then applying (6) to \(e_0\) and (1) to \({\tilde{e}}\) we arrive at

$$\begin{aligned} \Vert e\Vert _{L^2(B)} \le \Vert e_0\Vert _{V} + \Vert {\tilde{e}}\Vert _{L^2(B)} \lesssim \Vert \Delta u_h\Vert _{{V_0'}} +\Vert {\tilde{e}}\Vert ^{\alpha }_{L^2(\omega )} \Vert {\tilde{e}}\Vert ^{1-\alpha }_{L^2(\Omega )}. \end{aligned}$$

Using once again the triangle inequality this leads to

$$\begin{aligned} \Vert {\tilde{e}}\Vert ^{\alpha }_{L^2(\omega )} \Vert \tilde{e}\Vert ^{1-\alpha }_{L^2(\Omega )} \lesssim (\Vert e\Vert _{L^2(\omega )} + \Vert \Delta u_h\Vert _{{V_0'}})^{\alpha } (\Vert e\Vert _{L^2(\Omega )} + \Vert \Delta u_h\Vert _{{V_0'}})^{1-\alpha }. \end{aligned}$$

We conclude that any approximation \(u_h\) must satisfy the bound

$$\begin{aligned} \begin{aligned} \Vert e\Vert _{L^2(B)}&\lesssim \Vert \Delta u_h\Vert _{{V_0'}} +(\Vert e\Vert _{L^2(\omega )} + \Vert \Delta u_h\Vert _{{V_0'}})^{\alpha } (\Vert e\Vert _{L^2(\Omega )} + \Vert \Delta u_h\Vert _{{V_0'}})^{1-\alpha } \\&\lesssim (\Vert q- u_h\Vert _{L^2(\omega )} + \Vert \Delta u_h\Vert _{{V_0'}})^{\alpha } (\Vert e\Vert _{L^2(\Omega )} + \Vert \Delta u_h\Vert _{{V_0'}})^{1-\alpha }. \end{aligned} \end{aligned}$$
(20)

If we assume that the term \(\Vert e\Vert _{L^2(\Omega )}\) is bounded, then inequality (20) gives an a posteriori bound for the error in the \(L^2(B)\)-norm.

For the sake of discussion, we will, for a moment, consider an approximation \(u_h\) satisfying certain properties. These properties can be thought of as design criteria for the numerical method, since as it turns out they lead to optimal convergence. In Sect. 3 we construct a finite element method with these properties.

  1.

    Bound on the equation residual:

    $$\begin{aligned} \Vert \Delta u_h\Vert _{{V_0'}} \lesssim h^k|u|_{H^{k+1}(\Omega )} + \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$
    (21)

    Observe that this means that the residual convergence in the ill-posed case is as good as the residual convergence in the well-posed case, see (9).

  2.

    Bound on the data fitting term:

    $$\begin{aligned} \Vert q- u_h\Vert _{L^2(\omega )} \lesssim h^k|u|_{H^{k+1}(\Omega )}+ \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$
    (22)

    This term is suboptimal by one order in h compared to interpolation, but nothing can be gained by assuming better convergence, since the term is always dominated by the contribution from \(\Vert \Delta u_h\Vert _{{V_0'}}\) in the bound (20). Strengthening the norm on \(\omega \), on the other hand, is possible provided the perturbation \(\delta q\) also has additional smoothness.

  3.

    Finally we need to assume an a priori bound on \(u_h\):

    $$\begin{aligned} \Vert u_h\Vert _{L^2(\Omega )} \lesssim |u|_{H^{k+1}(\Omega )} +h^{-k} \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$
    (23)

    The rationale for this choice is that it is the strongest control that can be achieved through Tikhonov regularisation without affecting the convergence order when the data is unperturbed, assuming that the previous two assumptions hold.

Injecting these three bounds into (20) we get the error estimate

$$\begin{aligned} \Vert u-u_h\Vert _{L^2(B)} \lesssim h^{\alpha k} \Vert u\Vert _{H^{k+1}(\Omega )} + h^{-(1-\alpha ) k} \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$
(24)
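The absorption of the three bounds into (20) is elementary but easy to get wrong, so we record a numerical sanity check. In the Python sketch below (the values of k, \(\alpha \), the norms and the noise levels are arbitrary samples of ours, and all hidden constants are set to one), the right-hand side of (20) evaluated with (21)-(23) stays within a fixed factor of the right-hand side of (24) over a wide range of h.

```python
# Sanity check: (20) combined with (21)-(23) is dominated by (24),
# up to a fixed constant. All hidden constants are set to one.
k, alpha = 1, 0.5           # arbitrary sample values
U = 1.0                     # stands for |u|_{H^{k+1}(Omega)}
for delta in (0.0, 1e-4, 1e-2):                # noise level ||delta q||
    for h in (1e-1, 1e-2, 1e-3, 1e-4, 1e-6):
        R = h ** k * U + delta                 # residual bound (21)
        data = h ** k * U + delta              # data-fitting bound (22)
        apriori = U + h ** (-k) * delta        # a priori bound (23)
        # RHS of (20), with ||e||_{L2(Omega)} <= U + apriori:
        lhs20 = R + (data + R) ** alpha * (U + apriori + R) ** (1 - alpha)
        # RHS of (24):
        rhs24 = h ** (alpha * k) * U + h ** (-(1 - alpha) * k) * delta
        assert lhs20 <= 8.0 * rhs24
print("ok")
```

The ratio observed in the samples above never exceeds a small single-digit constant, as the derivation of (24) predicts.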

A generic version of this estimate can be obtained by decoupling the rate of convergence from the sensitivity to perturbations, by considering the following bound

$$\begin{aligned} \Vert u-u_h\Vert _{L^2(B)} \lesssim h^{\alpha _1 k} \Vert u\Vert _{H^{k+1}(\Omega )} + h^{(\alpha _2 - 1) k} \Vert \delta q\Vert _{L^2(\omega )}, \end{aligned}$$
(25)

for \(\alpha _1,\, \alpha _2\in (0,1)\). Denoting the upper bound here by

$$\begin{aligned} E(h):=h^{\alpha _1 k} \Vert u\Vert _{H^{k+1}(\Omega )} + h^{(\alpha _2 - 1) k} \Vert \delta q\Vert _{L^2(\omega )}, \end{aligned}$$

we see that E has a unique critical point

$$\begin{aligned} h_{\min } := \left( \frac{1-\alpha _2}{\alpha _1} \frac{\Vert \delta q\Vert _{L^2(\omega )}}{\Vert u\Vert _{H^{k+1}(\Omega )}}\right) ^{1/((1+\alpha _1-\alpha _2)k)}, \end{aligned}$$
(26)

which is a minimum since \(E''(h_{\min }) > 0\). Hence from (25) we get that

$$\begin{aligned} \Vert u-u_{h_{\min }}\Vert _{L^2(B)} \lesssim \Vert u\Vert _{H^{k+1}(\Omega )}^{1-\tilde{\alpha }} \Vert \delta q\Vert _{L^2(\omega )}^{\tilde{\alpha }}, \end{aligned}$$
(27)

with \({\tilde{\alpha }}:= \frac{\alpha _1}{1+\alpha _1 - \alpha _2}\). Notice that \(\tilde{\alpha } = \alpha \) when \(\alpha _1 = \alpha _2 = \alpha \). Considering convergence with respect to perturbations \(\delta q\) in this bound, one would like to have \(\tilde{\alpha }\) as large as possible.
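The computation leading from (26) to (27) can be double-checked numerically. In the following Python sketch the parameter values are arbitrary samples of ours; the check confirms that the closed-form \(h_{\min }\) is a critical point of E and that \(E(h_{\min })\) scales as in (27).

```python
import math

a1, a2, k = 0.6, 0.4, 2          # arbitrary sample exponents and order
U, D = 1.0, 1e-6                 # stand for ||u||_{H^{k+1}} and ||delta q||

def E(h):
    # Upper bound in (25), hidden constants set to one.
    return h ** (a1 * k) * U + h ** ((a2 - 1) * k) * D

# Closed-form minimiser (26).
h_min = ((1 - a2) / a1 * D / U) ** (1.0 / ((1 + a1 - a2) * k))

# Criticality: a symmetric difference quotient of E at h_min vanishes
# to within the quadrature error of the quotient itself.
dE = (E(h_min * 1.0001) - E(h_min * 0.9999)) / (0.0002 * h_min)
assert abs(dE) < 1e-6 * E(h_min) / h_min

# E(h_min) scales like U^(1 - ta) * D^ta with ta = a1/(1 + a1 - a2), cf. (27).
ta = a1 / (1 + a1 - a2)
scale = E(h_min) / (U ** (1 - ta) * D ** ta)
assert 0.1 < scale < 10.0        # an O(1) constant
print(scale)
```

Changing U and D by several orders of magnitude leaves the printed constant unchanged, which is exactly the content of (27).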

Based on the discussion above, we now propose a definition of what it means for a family of approximations \(\{u_h\}\) to an ill-posed problem of the form (2) to be optimally convergent.

Definition 1

Assume that \(u\in H^{k+1}(\Omega ),\, k\in {\mathbb {N}}\), solves the unique continuation problem (2). Let \(\alpha \in (0,1)\) be the largest value for which the conditional stability estimate (1) holds. Let \(\{u_h\}_{h>0}\) be a family of functions in \(H^1(\Omega )\). If the family \(\{u_h\}\) satisfies the inequality (25) with \(\frac{\alpha _1}{1+\alpha _1 - \alpha _2} = \alpha \), then we say that its convergence is optimal.

Remark 2

An optimal \(\alpha \in (0,1)\) in the stability estimate (1) is provided in Theorem 1. We prove below that, independently of the method used, no family \(\{u_h\} \subset H^1(\Omega )\) of approximations to the solution of (2) can satisfy (25) with \(\frac{\alpha _1}{1+\alpha _1 - \alpha _2} > \alpha \). In particular, no method can exceed the convergence rate in (24) without increasing the sensitivity to data perturbations nor can it improve this sensitivity without decreasing the convergence rate, i.e. there exist no \(\alpha _1, \alpha _2 \in [\alpha ,1)\) with \(\alpha _1 > \alpha \) or \(\alpha _2 > \alpha \), such that

$$\begin{aligned} \Vert u-u_h\Vert _{L^2(B)} \lesssim h^{\alpha _1 k} \Vert u\Vert _{H^{k+1}(\Omega )} + h^{(\alpha _2 - 1) k} \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$

Remark 3

Constructing such an optimal method with \(\alpha _1 > \alpha \) is currently an open problem. For a small enough noise level, such a method would realise the best possible error upper bound (27) with a larger \(h_{\min }\), i.e. at a lower computational cost. Indeed, if

$$\begin{aligned} \Vert \delta q\Vert _{L^2(\omega )} < \alpha \Vert u\Vert _{H^{k+1}(\Omega )}/(1-\alpha ), \end{aligned}$$

then setting \(\frac{\alpha _1}{1+\alpha _1-\alpha _2} = \alpha \) we see that (26) becomes

$$\begin{aligned} h_{\min }=\left( (\alpha ^{-1}-1) \frac{\Vert \delta q\Vert _{L^2(\omega )}}{\Vert u\Vert _{H^{k+1}(\Omega )}}\right) ^{\frac{\alpha }{\alpha _1 k}}. \end{aligned}$$
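The substitution leading to this expression for \(h_{\min }\) is a one-line computation, which the following Python sketch double-checks with arbitrary sample values of ours.

```python
# Check of the substitution in Remark 3: with a1/(1 + a1 - a2) = alpha
# enforced, the generic minimiser (26) reduces to the displayed formula.
alpha, a1, k = 0.5, 0.7, 2      # arbitrary sample values
a2 = 1 + a1 - a1 / alpha        # enforces a1 / (1 + a1 - a2) = alpha
U, D = 1.0, 1e-3                # sample sizes, with D < alpha * U / (1 - alpha)
h_generic = ((1 - a2) / a1 * D / U) ** (1.0 / ((1 + a1 - a2) * k))
h_remark = ((1.0 / alpha - 1.0) * D / U) ** (alpha / (a1 * k))
assert abs(h_generic - h_remark) <= 1e-9 * h_remark
print(h_generic)
```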

2.3 Proof of Optimality

The following Caccioppoli-type inequality is known but we give a short proof for the convenience of the reader.

Lemma 1

Let \(r_3< r_4 < R\) and \(k \ge 0\). Then for all \(w \in H^{k+1}(B(R))\) satisfying \(\Delta w = 0\) in B(R) there holds

$$\begin{aligned} \Vert w\Vert _{H^{k+1}(B(r_3))} \lesssim \Vert w\Vert _{L^2(B(r_4))}. \end{aligned}$$

Proof

Divide the interval \((r_3,r_4)\) into \(k+1\) subintervals \((R_j,R_{j+1})\) of equal length, with \(R_j=r_3 + j\, \delta R\), \(R_0=r_3\), \(R_{k+1} = r_4\), and \(\delta R = R_{j+1} - R_j = (r_4 - r_3)/(k+1)\). For an index \(j=0,\ldots ,k\), choose \(\chi \in C_0^\infty (B(R_{j+1}))\) such that \(\chi = 1\) in \(B(R_j)\), and write \(v = \chi y\), where \(y \in H^1(B(R_{j+1}))\) and \(\Delta y = 0\). Then, if \([\Delta , \chi ]\) denotes the commutator \(\Delta \chi - \chi \Delta \),

$$\begin{aligned} \left\{ \begin{aligned} \Delta v&= [\Delta , \chi ] y&\text{ in }&B(R_{j+1}) \\ v&= 0&\text{ on }&\partial B(R_{j+1}). \end{aligned} \right. \end{aligned}$$

and therefore by (6) we have that

$$\begin{aligned} \begin{aligned} \Vert y\Vert _{H^1(B(R_j))}&\le \Vert v\Vert _{H^1(B(R_{j+1}))} \le \Vert \Delta v\Vert _{H^{-1}(B(R_{j+1}))}\\&\lesssim \Vert [\Delta , \chi ] y\Vert _{H^{-1}(B(R_{j+1}))} \lesssim \Vert y\Vert _{L^2(B(R_{j+1}))}. \end{aligned} \end{aligned}$$
(28)

Here in the last step we used that

$$\begin{aligned} ([\Delta , \chi ] y,w)_{B(R_{j+1})}&= ((\Delta \chi ) y + 2 \nabla \chi \cdot \nabla y,w)_{B(R_{j+1})}\\&= - ((\Delta \chi ) y,w)_{B(R_{j+1})}-2 (y, \nabla \chi \cdot \nabla w)_{B(R_{j+1})} \\&\lesssim \Vert y\Vert _{L^2(B(R_{j+1}))} \Vert w\Vert _{H^{1}(B(R_{j+1}))}. \end{aligned}$$

Let \(y= D^{k-j} w\) where \(D^{k-j}\) denotes an arbitrary partial derivative of order \(k-j\), \(j = 0,\ldots ,k\). Then \(\Delta y = 0\). It follows from equation (28) that

$$\begin{aligned} \Vert y\Vert _{H^1(B(R_j))} \lesssim \Vert y\Vert _{L^2(B(R_{j+1}))}. \end{aligned}$$

By applying this to all partial derivatives of order \(k-j\), we see that

$$\begin{aligned} \Vert w\Vert _{H^{k+1-j}(B(R_j))} \lesssim \Vert w\Vert _{H^{k-j}(B(R_{j+1}))}. \end{aligned}$$

Hence by applying this inequality sequentially for \(j = 0,\ldots ,k\) we see that

$$\begin{aligned} \Vert w\Vert _{H^{k+1}(B(r_3))} = \Vert w\Vert _{H^{k+1}(B(R_0))} \lesssim \ldots \lesssim \Vert w\Vert _{H^{1}(B(R_{k}))} \lesssim \Vert w\Vert _{L^2(B(r_4))}. \end{aligned}$$

This concludes the proof. \(\square \)

Theorem 2

Let \(0<r_1<r_2<r_3<R\) and let \(\omega = B(r_1)\), \(B = B(r_2)\), \(\Omega = B(r_3)\). Let \(\alpha \in (0,1)\) be the optimal exponent in Theorem 1. Let \(u\in H^{k+1}(B(R))\) satisfy \(\Delta u = 0\) in B(R) and let \(q = u|_\omega \). Consider a family of mappings \(\{F_h\}_{h>0},\, F_h: L^2(\omega ) \rightarrow H^1(\Omega ),\, F_h(q + \delta q) =: u_h \), for all \(\delta q \in L^2(\omega )\). Then there exist no \(\alpha _1, \alpha _2 \in (0,1)\) with \(\frac{\alpha _1}{1+\alpha _1 - \alpha _2} > \alpha \), such that

$$\begin{aligned} \Vert u-u_h\Vert _{L^2(B)} \lesssim h^{\alpha _1 k} \Vert u\Vert _{H^{k+1}(\Omega )} + h^{(\alpha _2 - 1) k} \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$
(29)

In particular, there exist no \(\alpha _1, \alpha _2 \in [\alpha ,1)\) with \(\alpha _1 > \alpha \) or \(\alpha _2 > \alpha \) such that (29) holds.

Proof

We give a proof by contradiction. Assume that there exist \(\alpha _1, \alpha _2 \in (0,1)\) with

$$\begin{aligned} {\tilde{\alpha }}:= \frac{\alpha _1}{1+\alpha _1 - \alpha _2} > \alpha \end{aligned}$$

such that (29) holds. Taking \(u=0\) and \(\delta q = {\tilde{u}}|_\omega \) for \({\tilde{u}}\) satisfying \(\Delta {\tilde{u}} = 0\) in B(R), the estimate (29) reduces to

$$\begin{aligned} \Vert F_h({\tilde{u}}|_\omega )\Vert _{L^2(B)} \lesssim h^{(\alpha _2 - 1) k} \Vert {\tilde{u}}\Vert _{L^2(\omega )}. \end{aligned}$$

Using (29) again with \(u = {\tilde{u}}\) and \(\delta q = 0\), we get

$$\begin{aligned} \Vert {\tilde{u}} - F_h({\tilde{u}}|_\omega )\Vert _{L^2(B)} \lesssim h^{\alpha _1 k} \Vert {\tilde{u}}\Vert _{H^{k+1}(\Omega )}. \end{aligned}$$

Hence

$$\begin{aligned} \begin{aligned} \Vert {\tilde{u}}\Vert _{L^2(B)}&\le \Vert {\tilde{u}} - F_h({\tilde{u}}|_\omega )\Vert _{L^2(B)}+ \Vert F_h({\tilde{u}}|_\omega )\Vert _{L^2(B)} \\&\lesssim h^{\alpha _1 k } \Vert {\tilde{u}}\Vert _{H^{k+1}(\Omega )} + h^{(\alpha _2 - 1) k} \Vert {\tilde{u}}\Vert _{L^2(\omega )}. \end{aligned} \end{aligned}$$
(30)

We will write \(u = {\tilde{u}}\) from now on, and recall that u is an arbitrary solution to \(\Delta u = 0\) in \(B(R)\). For a nonzero u, we define

$$\begin{aligned} r := \frac{\Vert u\Vert _{L^2(\omega )}}{\Vert u\Vert _{H^{k+1}(\Omega )}}, \end{aligned}$$

and choose \(h > 0\) such that

$$\begin{aligned} h^{\alpha _1 k} \Vert u\Vert _{H^{k+1}(\Omega )} = h^{(\alpha _2 - 1) k} \Vert u\Vert _{L^2(\omega )}, \end{aligned}$$

that is,

$$\begin{aligned} h = r^{1/((\alpha _1+1-\alpha _2)k)}. \end{aligned}$$

With this choice, inequality (30) reduces to

$$\begin{aligned} \Vert u\Vert _{L^2(B)} \lesssim \Vert u\Vert _{L^2(\omega )}^{{\tilde{\alpha }}} \Vert u\Vert _{H^{k+1}(\Omega )}^{1-{\tilde{\alpha }}}, \end{aligned}$$
(31)

which trivially holds for the zero solution also. Observe that (31) would immediately contradict the optimality of \(\alpha \) in Theorem 1 if the \(H^{k+1}\)-norm on its right-hand side were an \(L^2\)-norm. To weaken this norm, we can use Lemma 1 to get that \(\Vert u\Vert _{H^{k+1}(B(r_3))} \lesssim \Vert u\Vert _{L^2(B(r_4))}\) for \(r_3< r_4 < R\). Hence, using this bound in (31) leads to

$$\begin{aligned} \Vert u\Vert _{L^2(B)} \lesssim \Vert u\Vert _{L^2(\omega )}^{{\tilde{\alpha }}} \Vert u\Vert _{L^2(B(r_4))}^{1-{\tilde{\alpha }}}. \end{aligned}$$
(32)

We now denote by \(\hat{\alpha }\) the optimal exponent corresponding to \(r_4\) in the three-ball estimate in Theorem 1, for which

$$\begin{aligned} \Vert u\Vert _{L^2(B)} \lesssim \Vert u\Vert _{L^2(\omega )}^{\hat{\alpha }} \Vert u\Vert _{L^2(B(r_4))}^{1-\hat{\alpha }}, \end{aligned}$$
(33)

for any harmonic function u. This means that such an inequality cannot hold with an exponent larger than \(\hat{\alpha }\). However, since \(\hat{\alpha }\) depends continuously on \(r_4\), by considering \(r_4 > r_3\) sufficiently close to \(r_3\) we can get \(\hat{\alpha }\) arbitrarily close to \(\alpha \), i.e. \(\tilde{\alpha }> \hat{\alpha } > \alpha \). Thus inequality (32) holds with \(\tilde{\alpha } > \hat{\alpha }\), which contradicts the optimality of \(\hat{\alpha }\) in (33).

Let us finally show that if \(\alpha _1, \alpha _2 \in [\alpha ,1)\) with \(\alpha _1 > \alpha \) or \(\alpha _2 > \alpha \), then \({\tilde{\alpha }} > \alpha \). Consider first the case \(\alpha _1 < \alpha _2\). As \(\alpha _j \in [\alpha ,1)\), \(j=1,2\), there holds \(\alpha _2 - \alpha _1 \in (0,1)\) and

$$\begin{aligned} {\tilde{\alpha }} \ge \frac{\alpha }{1-(\alpha _2 - \alpha _1)} > \alpha . \end{aligned}$$

For the case \(\alpha _1 \ge \alpha _2\), we have that \(\alpha _1 = \alpha + \epsilon \) for some \(\epsilon > 0\), and

$$\begin{aligned} {\tilde{\alpha }} - \alpha \ge \frac{\alpha + \epsilon }{1 + \epsilon } - \alpha = \frac{(1-\alpha )\epsilon }{1 + \epsilon } > 0, \end{aligned}$$

which concludes the proof. \(\square \)
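The elementary algebra underlying the proof can be verified directly. The following sketch (with sample values chosen arbitrarily by us, purely for illustration) checks that the balancing choice of h equalises the two terms of (30), that the resulting bound takes the interpolation form (31) with exponent \({\tilde{\alpha }}\), and that the case analysis at the end of the proof indeed yields \({\tilde{\alpha }} > \alpha \).

```python
import math

# Sample values (arbitrary choices, for illustration only).
alpha, k = 0.4, 3
m, M = 0.01, 5.0   # stand-ins for ||u||_{L2(omega)} and ||u||_{H^{k+1}(Omega)}

def tilde_alpha(a1, a2):
    return a1 / (1 + a1 - a2)

# Balancing choice of h: h^{a1 k} M = h^{(a2-1) k} m, i.e. h = r^{1/((a1+1-a2)k)}.
a1, a2 = 0.7, 0.5
r = m / M
h = r ** (1.0 / ((a1 + 1 - a2) * k))
t1 = h ** (a1 * k) * M
t2 = h ** ((a2 - 1) * k) * m
assert math.isclose(t1, t2)          # the two terms of (30) coincide

# Both terms equal the Hoelder-type bound m^ta * M^(1-ta), as in (31).
ta = tilde_alpha(a1, a2)
assert math.isclose(t1, m ** ta * M ** (1 - ta))

# Case analysis: a1, a2 in [alpha, 1) with a1 > alpha or a2 > alpha gives ta > alpha.
for pair in [(0.45, 0.6), (0.6, 0.45), (0.4, 0.41), (0.41, 0.4)]:
    assert tilde_alpha(*pair) > alpha
```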

Remark 4

The proof of Theorem 2 remains valid if (29) is assumed to hold with \(\Vert u\Vert _{H^{k+1}(\Omega )}\) replaced by a weaker norm.

Remark 5

The approximation \(u_h\) in Theorem 2 depends only on h and \(q+\delta q\). This result does not exclude the possibility of a regularisation method that uses more information, for example the size of the perturbation \(\delta q\). The optimal method that we present in the following section can also use this information, see Remark 6.

3 Primal-Dual Finite Element Methods with Weakly Consistent Regularisation

In this section we will use a finite element method with weakly consistent stabilisation to construct a sequence of approximate solutions for unique continuation (2) that satisfy the error estimate (24), showing that the optimal convergence for this ill-posed problem can be attained by a discrete approximation method. This discussion is based on ideas from [11, 14], modified to match the assumptions of the theoretical developments above.

Let \(\{{\mathcal {T}}\}\) be a quasi-uniform family of triangulations of \(\Omega \), where triangles T with curved boundaries are allowed so that the covering of \(\Omega \) is exact [4, 49]. On these meshes we define a \(C^0\) finite element space \(V_h \subset H^1(\Omega )\), consisting of piecewise polynomials of order k (after mapping of the triangles to a reference element). We also let \(V_{0h} = V_h \cap H^1_0(\Omega )\). It then follows that there exist interpolants \(\Pi _h:H^{1}(\Omega ) \rightarrow V_h\) [4, Corollary 4.1] and \(\Pi _h^0: H^1_0(\Omega ) \rightarrow V_{0h}\) [4, Corollary 5.2] for which the following interpolation estimates hold

$$\begin{aligned} \Vert u - \Pi _h u\Vert _{T} + h \Vert \nabla (u - \Pi _h u)\Vert _T + h^2 \Vert D^2(u - \Pi _h u)\Vert _{T} \lesssim h^{k+1} |u|_{H^{k+1}(\Delta _T)}, \end{aligned}$$
(34)

where \(\Delta _T:= \{T' \in \mathcal {T}: T' \cap T \ne \emptyset \}\) and \(D^2 u\) is the Hessian of u, and

$$\begin{aligned} \Vert w - \Pi ^0_h w\Vert _{L^2(\Omega )} + h \Vert \Pi _h^0 w\Vert _V \lesssim h \Vert w\Vert _V. \end{aligned}$$
(35)
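As a quick numerical illustration of the first bound in (34), consider the one-dimensional case with \(k=1\) (a toy setting of our own choosing, not part of the analysis above): the \(L^2\)-error of the piecewise-linear nodal interpolant of a smooth function should decrease by a factor of roughly \(2^{k+1} = 4\) when the mesh size is halved.

```python
import numpy as np

def p1_interp_error(n_el):
    """L2(0,1) error of the piecewise-linear nodal interpolant of sin(pi x)."""
    x = np.linspace(0.0, 1.0, n_el + 1)
    err2 = 0.0
    for e in range(n_el):
        # midpoint quadrature with 200 points per element
        ds = (x[e + 1] - x[e]) / 200
        s = x[e] + ds * (np.arange(200) + 0.5)
        va, vb = np.sin(np.pi * x[e]), np.sin(np.pi * x[e + 1])
        lin = va + (vb - va) * (s - x[e]) / (x[e + 1] - x[e])
        err2 += np.sum((np.sin(np.pi * s) - lin) ** 2) * ds
    return np.sqrt(err2)

ratio = p1_interp_error(16) / p1_interp_error(32)
assert 3.7 < ratio < 4.3   # order h^{k+1} = h^2 for k = 1
```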

We will also use the broken norm defined by

$$\begin{aligned} \Vert v\Vert _{\mathcal {T}}:= \left( \sum _{T \in \mathcal {T}} \Vert v\Vert _T^2\right) ^{1/2}. \end{aligned}$$

To set up the numerical method, we formulate the continuation problem (2) as a pde-constrained optimisation problem and consider the Lagrangian \(L_h: V_h \times V_{0h} \rightarrow {\mathbb {R}}\),

$$\begin{aligned} L_h(u_h,z_h) := \underbrace{{\textstyle \frac{1}{2}} \Vert u_h - \tilde{q}\Vert ^2_{L^2(\omega )}}_{\text {data fit}} + \underbrace{a(u_h,z_h)}_{\text {pde constraint}} + \underbrace{{\textstyle \frac{1}{2}} s(u_h,u_h) - {\textstyle \frac{1}{2}} a(z_h,z_h)}_{\text {discrete regularisation}}. \end{aligned}$$

By taking its saddle points, we define the finite element method as follows: find \((u_h,z_h) \in V_h \times V_{0h}\) such that

$$\begin{aligned} \begin{aligned} a(u_h,w_h) - a(z_h,w_h)&= 0 \\ a(v_h,z_h) + s(u_h,v_h) + (u_h,v_h)_{L^2(\omega )}&= ({\tilde{q}},v_h)_{L^2(\omega )} \end{aligned} \end{aligned}$$
(36)

for all \((v_h,w_h) \in V_h \times V_{0h}\), with

$$\begin{aligned} \begin{aligned} s(u_h,v_h):={}&\sum _{T \in \mathcal {T}} \left( (h_T^2 \Delta u_h, \Delta v_h)_T +\sum _{F \in \partial T} (h_T \llbracket \nabla u_h \rrbracket _F, \llbracket \nabla v_h \rrbracket _F)_{F\setminus \partial \Omega } \right) \\&+h^{2k} (u_h,v_h)_{L^2(\Omega )} \end{aligned} \end{aligned}$$
(37)

where F denotes a face of a triangle T and the jump of the gradient over a face F is defined by \(\llbracket \nabla u_h \rrbracket _F:= \nabla u_h\vert _{T_1} \cdot n_{T_1} + \nabla u_h\vert _{T_2} \cdot n_{T_2}\) for \(F = {\bar{T}}_1 \cap {\bar{T}}_2\), with \(n_{T_m}\) the outward pointing unit normal of the triangle \(T_m\). For a more compact formulation we introduce the global form \(A_h\),

$$\begin{aligned} A_h[(x_h,y_h),(v_h,w_h)] :={}&a(x_h,w_h) - a(y_h,w_h)\\&+a(v_h,y_h) + s(x_h,v_h) + (x_h,v_h)_{L^2(\omega )} \end{aligned}$$

to write: find \((u_h,z_h) \in V_h \times V_{0h}\) such that

$$\begin{aligned} A_h[(u_h,z_h),(v_h,w_h)] = ({\tilde{q}},v_h)_{L^2(\omega )} \end{aligned}$$
(38)

for all \((v_h,w_h) \in V_h \times V_{0h}\). Observe that this form satisfies the consistency property

$$\begin{aligned} A_h[( u - u_h, -z_h),(v_h,w_h)] = h^{2k} (u,v_h)_{L^2(\Omega )} - (\delta q, v_h)_{L^2(\omega )}. \end{aligned}$$
(39)

To show that this method satisfies the error bound (24), we only need to verify that it satisfies (21), (22) and (23) (which represent the design criteria for the method). To this end we introduce the norm

$$\begin{aligned} |\!|\!|(v_h,w_h)|\!|\!|_S^2:= s(v_h,v_h)+ \Vert w_h\Vert ^2_V+ \Vert v_h\Vert ^2_{L^2(\omega )} \end{aligned}$$

and we observe that the formulation satisfies the positivity property

$$\begin{aligned} |\!|\!|(u_h,z_h)|\!|\!|_S^2 = A_h[(u_h,z_h),(u_h,-z_h)] \end{aligned}$$
(40)

which ensures the existence of a discrete solution \((u_h, z_h) \in V_h \times V_{0h}\) for all \(h>0\).
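To make the structure of (36)–(37) concrete, the following is a minimal one-dimensional sketch, entirely a toy setup of our own choosing (\(\Omega = (0,1)\), \(\omega = (0,0.3)\), \(P_1\) elements, \(k=1\), the harmonic solutions being the affine functions, unperturbed data): the primal-dual problem is assembled as a block saddle-point system with primal unknowns in \(V_h\) and dual unknowns in \(V_{0h}\). For \(P_1\) elements \(\Delta u_h = 0\) elementwise, so the stabilisation (37) reduces to the gradient-jump penalty plus the \(h^{2k}\) zero-order term.

```python
import numpy as np

# Toy 1D analogue of (36)-(37): u'' = 0 on Omega = (0,1), data on omega = (0,0.3).
n_el = 100
h = 1.0 / n_el
x = np.linspace(0.0, 1.0, n_el + 1)
n = n_el + 1

# stiffness matrix a(u, v) = (u', v')
K = np.zeros((n, n))
for e in range(n_el):
    K[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h

# consistent P1 mass matrices restricted to a region (by element midpoint)
def mass(region):
    M = np.zeros((n, n))
    me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    for e in range(n_el):
        if region(0.5 * (x[e] + x[e + 1])):
            M[e:e + 2, e:e + 2] += me
    return M

M_omega = mass(lambda xm: xm < 0.3)   # data-fitting term (u_h, v_h)_omega
M_all = mass(lambda xm: True)

# gradient-jump penalty: [[u_h']] at an interior node is a scaled second difference
J = np.zeros((n, n))
for i in range(1, n_el):
    d = np.zeros(n)
    d[i - 1], d[i], d[i + 1] = 1.0 / h, -2.0 / h, 1.0 / h
    J += h * np.outer(d, d)

k = 1
S = J + h ** (2 * k) * M_all          # stabilisation s(.,.) of (37) for P1

# dual space V_{0h}: interior nodes only
I0 = np.arange(1, n_el)
A0 = K[np.ix_(I0, np.arange(n))]      # a(phi_j, psi_i)
K0 = K[np.ix_(I0, I0)]                # a(psi_j, psi_i)

# exact harmonic (here: affine) solution, unperturbed data q = u|_omega
u_ex = 1.0 + 2.0 * x
rhs = np.concatenate([M_omega @ u_ex, np.zeros(len(I0))])

# block saddle-point system corresponding to (36)
Amat = np.block([[S + M_omega, A0.T], [A0, -K0]])
u_h = np.linalg.solve(Amat, rhs)[:n]

err_omega = np.sqrt((u_ex - u_h) @ M_omega @ (u_ex - u_h))  # small: controlled by the S-norm
```

The upper-left block \(S + M_\omega\) and the lower-right block \(K_0\) are both symmetric positive definite, which makes the block matrix invertible, mirroring the role of the positivity property (40) in the continuous analysis.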

We proceed by first proving convergence in the S-norm, which immediately gives (22). The other two bounds, and the satisfaction of (24), then follow as a corollary. First we establish an approximation result in the S-norm.

Lemma 2

Let \(v \in H^{k+1}(\Omega )\), then there holds

$$\begin{aligned} |\!|\!|(v - \Pi _h v,0)|\!|\!|_S \lesssim h^{k} \Vert v\Vert _{H^{k+1}(\Omega )}. \end{aligned}$$

Proof

By the definition of the S-norm we see that

$$\begin{aligned} \begin{aligned} |\!|\!|(v - \Pi _h v,0)|\!|\!|_S^2 ={}&\Vert h \Delta (v - \Pi _h v)\Vert ^2_{\mathcal {T}} + \sum _{T \in \mathcal {T}} \sum _{F \in \partial T\setminus \partial \Omega } h_T \Vert \llbracket \nabla \Pi _h v \rrbracket _F\Vert ^2_F \\&+ h^{2k} \Vert v - \Pi _h v\Vert _{L^2(\Omega )}^2 + \Vert v - \Pi _h v\Vert ^2_{L^2(\omega )}. \end{aligned} \end{aligned}$$

By the approximation property (34) we have that

$$\begin{aligned} \Vert h {\Delta } (v - {\Pi }_h v)\Vert ^2_{\mathcal {T}} + h^{2k} \Vert v - {\Pi }_h v\Vert _{L^2(\Omega )}^2+ \Vert v - {\Pi }_h v\Vert ^2_{L^2(\omega )} \lesssim h^{2k} |v|^{2}_{H^{k+1}(\mathrm{\Omega })}. \end{aligned}$$

For the term measuring the jump of \(\Pi _h v\) over element faces we note that

$$\begin{aligned} \begin{aligned} \Vert \llbracket \nabla \Pi _h v \rrbracket _F\Vert ^2_F&= \Vert \llbracket \nabla (v - \Pi _h v) \rrbracket _F\Vert ^2_F \\&\lesssim h^{-1} \Vert \nabla (v - \Pi _h v)\Vert ^2_{T_1 \cup T_2} + h \Vert D^2 (v - \Pi _h v)\Vert ^2_{T_1 \cup T_2} \end{aligned} \end{aligned}$$

where we used the regularity of v (so that \(\llbracket \nabla v \rrbracket _F = 0\)) and the trace inequality [40]

$$\begin{aligned} \Vert v\Vert _{\partial T} \lesssim h^{-\frac{1}{2}} \Vert v\Vert _T + h^{\frac{1}{2}} \Vert \nabla v\Vert _T, \quad \forall v \in H^1(T). \end{aligned}$$

We conclude by applying (34) once again and summing over all the faces. \(\square \)
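The scaled trace inequality used above can be spot-checked numerically; the sketch below evaluates, in one dimension on \(T = (0,h)\) (toy instances of our own choosing, where \(\Vert v\Vert _{\partial T}^2 = v(0)^2 + v(h)^2\)), the ratio of the left- to the right-hand side and confirms it stays of moderate size as h shrinks.

```python
import numpy as np

def l2_norm(f, h, m=20000):
    """Midpoint-rule L2(0,h) norm of f."""
    ds = h / m
    s = ds * (np.arange(m) + 0.5)
    return np.sqrt(np.sum(f(s) ** 2) * ds)

def trace_ratio(v, dv, h):
    """||v||_{dT} divided by h^{-1/2}||v||_T + h^{1/2}||v'||_T on T = (0,h)."""
    lhs = np.sqrt(v(0.0) ** 2 + v(h) ** 2)
    rhs = h ** -0.5 * l2_norm(v, h) + h ** 0.5 * l2_norm(dv, h)
    return lhs / rhs

cases = [(lambda s: np.sin(10 * s) + 1.0, lambda s: 10 * np.cos(10 * s)),
         (lambda s: s ** 2, lambda s: 2 * s)]
for h in (0.1, 0.01):
    for v, dv in cases:
        assert trace_ratio(v, dv, h) < 2.0   # ratio bounded uniformly in h
```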

Proposition 1

Let \((u_h,z_h)\) denote the solution to (38) and let \(u \in H^{k+1}(\Omega )\) be the solution to (2), then there holds

$$\begin{aligned} |\!|\!|(u - u_h,z_h)|\!|\!|_S \lesssim h^{k} \Vert u\Vert _{H^{k+1}(\Omega )} + \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$

Proof

First we decompose the error \(u -u_h = u - \Pi _h u + \underbrace{\Pi _h u - u_h}_{=:e_h}\) into its continuous and discrete parts. By the triangle inequality and Lemma 2 it is enough to bound \(|\!|\!|(e_h,z_h)|\!|\!|_S\). Using (40) and (39) we have

$$\begin{aligned} \begin{aligned} |\!|\!|(e_h, z_h)|\!|\!|_S^2 = A_h[(e_h,-z_h),(e_h,z_h)] ={}&A_h[(\Pi _h u - u,0), (e_h,z_h)]\\&- (\delta q, e_h)_{L^2(\omega )} + h^{2k} (u, e_h)_{L^2(\Omega )}. \end{aligned} \end{aligned}$$

For the last two terms on the right hand side we have

$$\begin{aligned} -(\delta q, e_h)_{L^2(\omega )} + h^{2k} (u, e_h)_{L^2(\Omega )} \le (\Vert \delta q\Vert _{L^2(\omega )} + h^{k} \Vert u\Vert _{L^2(\Omega )}) |\!|\!|(e_h, 0)|\!|\!|_S. \end{aligned}$$

Finally, the following continuity holds

$$\begin{aligned} A_h[(\Pi _h u - u,0), (e_h,z_h)] \lesssim \Vert \Pi _h u - u\Vert _* |\!|\!|(e_h, z_h)|\!|\!|_S \end{aligned}$$
(41)

where

$$\begin{aligned} \Vert v \Vert _*:= \Vert v\Vert _V + |\!|\!|(v,0)|\!|\!|_S. \end{aligned}$$

To prove the continuity (41) recall that by definition

$$\begin{aligned}{} & {} A_h[(\Pi _h u - u,0), (e_h,z_h)] \\{} & {} \quad = a(\Pi _h u - u,z_h) + s(\Pi _h u - u,e_h) + (\Pi _h u - u,e_h)_{L^2(\omega )}. \end{aligned}$$

Using the Cauchy–Schwarz inequality we have that

$$\begin{aligned} a(\Pi _h u - u,z_h) \le \Vert \Pi _h u - u\Vert _V |\!|\!|(0, z_h)|\!|\!|_S \end{aligned}$$

and

$$\begin{aligned} s(\Pi _h u - u,e_h) + (\Pi _h u - u,e_h)_{L^2(\omega )} \le |\!|\!|(\Pi _h u - u,0)|\!|\!|_S |\!|\!|(e_h,0)|\!|\!|_S. \end{aligned}$$

We end the proof by observing that by equation (34) and Lemma 2 there holds

$$\begin{aligned} \Vert \Pi _h u - u \Vert _* \lesssim h^{k} |u|_{H^{k+1}(\Omega )}. \end{aligned}$$

\(\square \)

Corollary 1

Under the same hypotheses as in Proposition 1 there holds

$$\begin{aligned} \Vert u_h\Vert _{L^2(\Omega )} \lesssim \Vert u\Vert _{H^{k+1}(\Omega )} + h^{-k}\Vert \delta q\Vert _{L^2(\omega )} \end{aligned}$$

and

$$\begin{aligned} \Vert \Delta u_h\Vert _{H^{-1}(\Omega )} \lesssim h^{k} \Vert u\Vert _{H^{k+1}(\Omega )} + \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$

Finally \(u_h\) satisfies the error bound (24).

Proof

First we observe that the third claim is an immediate consequence of the first two and Proposition 1. Indeed, this follows from the discussion of Sect. 2.2, using the error bound (20) and Eqs. (21)–(23).

The first inequality is immediate by Proposition 1 observing that

$$\begin{aligned} \Vert u_h\Vert _{L^2(\Omega )} \le \Vert u-u_h\Vert _{L^2(\Omega )}+\Vert u\Vert _{L^2(\Omega )}, \end{aligned}$$

and, for the first term in the right hand side,

$$\begin{aligned} \Vert u-u_h\Vert _{L^2(\Omega )} \le h^{-k} |\!|\!|(u-u_h,0)|\!|\!|_S \lesssim \Vert u\Vert _{H^{k+1}(\Omega )} + h^{-k}\Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$

For the second inequality, by definition

$$\begin{aligned} \Vert \Delta u_h\Vert _{H^{-1}(\Omega )} = \sup _{w \in V_0\setminus \{0\}} \frac{a(u_h,w)}{\Vert w\Vert _V}. \end{aligned}$$

Using (39), followed by integration by parts, we see that for all \(w_h \in V_{0h}\)

$$\begin{aligned} a(u_h,w)= & {} a(u_h-u,w) = a(u_h - u,w - w_h) + a(z_h,w_h) \\= & {} \sum _{T \in \mathcal {T}} \left( -(\Delta (u_h - u),w - w_h)_{L^2(T)} + (\llbracket \nabla u_h \rrbracket _F,w - w_h)_{L^2(\partial T \setminus \partial \Omega )} \right) + a(z_h,w_h). \end{aligned}$$

Choosing \(w_h = \Pi ^0_h w\) and using the Cauchy–Schwarz inequality in the first term of the right hand side and the continuity of a in the second, followed by (35), we see that

$$\begin{aligned}&\sum _{T\in \mathcal {T}} \left( -(\Delta (u_h - u),w - w_h)_{L^2(T)} + (\llbracket \nabla u_h \rrbracket _F,w - w_h)_{L^2(\partial T \setminus \partial \Omega )} \right) + a(z_h,w_h) \\&\quad \lesssim |\!|\!|(u_h - u,z_h)|\!|\!|_S \Vert w\Vert _V. \end{aligned}$$

The conclusion now follows using Proposition 1 to obtain the desired bound

$$\begin{aligned} \Vert \Delta u_h\Vert _{H^{-1}(\Omega )} = \sup _{w \in V_0\setminus \{0\}} \frac{a(u_h,w)}{\Vert w\Vert _V} \lesssim h^{k} \Vert u\Vert _{H^{k+1}(\Omega )} + \Vert \delta q\Vert _{L^2(\omega )}. \end{aligned}$$

\(\square \)

Remark 6

Both for the well-posed problem (4) and the ill-posed problem (2) there is a lower bound for how well the exact solution can be approximated if the data are perturbed. In the well-posed case the limit is trivially given by \(\Vert \delta f\Vert _{{V_0'}}\) in (9), whereas in the ill-posed case the lower bound occurs when

$$\begin{aligned} h = h_{\min } = (\Vert \delta q\Vert _{L^2(\omega )}/ \Vert u\Vert _{H^{k+1}(\Omega )})^{1/k}, \end{aligned}$$

see (26). If \(h_{\min }\) is known, the numerical scheme can be designed to stagnate at the level of the best approximation, by modifying the last term in the definition of the stabilisation (37) to read \(\max (h,h_{\min })^{2k} (u_h,v_h)_{L^2(\Omega )} \). This shows the connection between this stabilising term and classical Tikhonov regularisation, and similar tools as for the latter can be applied here to optimise the parameter with respect to perturbations in data. It is straightforward to show that this leads to stagnation at

$$\begin{aligned} \Vert u - u_h\Vert _{L^2(B)} \lesssim h_{\min }^{\alpha k} \Vert u\Vert _{H^{k+1}(\Omega )} = \Vert \delta q\Vert _{L^2(\omega )}^{\alpha } \Vert u\Vert _{H^{k+1}(\Omega )}^{1-\alpha }. \end{aligned}$$

Here the implicit constant may depend on k. We see that increasing k will increase the value of \(h_{\min }\), so that the best approximation is obtained on a coarser mesh. However, due to the k-dependence of \(\Vert u\Vert _{H^{k+1}(\Omega )}\), stagnation may take place at a different, potentially higher, value of the error. A similar kind of bound was obtained in [21, Theorem 2.2].
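The stagnation level quoted above follows from substituting \(h_{\min }\) into the a priori part of the bound; the algebra can be checked directly (sample values below are our own, for illustration only).

```python
import math

# delta stands in for ||dq||_{L2(omega)}, M for ||u||_{H^{k+1}(Omega)}.
alpha, k = 0.5, 2
delta, M = 1e-4, 3.0

h_min = (delta / M) ** (1.0 / k)
lhs = h_min ** (alpha * k) * M            # h_min^{alpha k} ||u||_{H^{k+1}(Omega)}
rhs = delta ** alpha * M ** (1 - alpha)   # ||dq||^alpha ||u||^{1-alpha}
assert math.isclose(lhs, rhs)             # the two expressions of the stagnation level agree
```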

4 Conclusion

In this paper we have shown that the convergence order of the approximation error for unique continuation problems, obtained by combining the approximation orders of the data fitting and the pde-residual with the conditional stability, cannot be improved without increasing the sensitivity to perturbations. This shows that the asymptotic accuracy of the methods for unique continuation discussed in [10, 11, 14,15,16,17, 21] is optimal, in the sense that it is impossible to design a method with better convergence properties. The only remaining possibilities to enhance the accuracy of approximation methods are either to resort to adaptivity, or to introduce some additional a priori assumption that makes the continuous problem more stable, such as finite dimensionality of target quantities (see [19]).