1 Introduction

Ill-posed and inverse problems for elastodynamics occur in various geophysical (Yaman et al. 2013, Section 3) and medical applications (Doyley 2012). Analyzing these problems under realistic assumptions is the subject of current research and often requires the use of sophisticated mathematical tools, see e.g. Rachele (2000), Stefanov et al. (2021), Bhattacharyya et al. (2022) for the application of microlocal techniques to the recovery of material parameters from certain types of boundary measurements. Furthermore, numerical methods which make optimal use of the latest analytical results and lead to provably convergent and reliable solutions of the inverse problems are in high demand. In this paper we aim to present such a method for the unique continuation problem of time-harmonic elastodynamics. Arguably, this is the simplest ill-posed problem encountered in this field. We consider this problem here because the understanding of the stability properties of the continuous problem and the tools required for its numerical treatment are now sufficiently advanced to allow for a fairly complete convergence analysis.

The unique continuation problem for the elastic wave equation is formulated as follows. Let \(\Omega \subset {\mathbb {R}}^d\), \(d \ge 2\) be a bounded Lipschitz domain. Given \(f \in [L^2(\Omega )]^d\) we seek to find the wave displacement \(u \in V:= [H^1(\Omega )]^d\) fulfilling

$$\begin{aligned} \mathcal {L}u = f \text { in } \Omega , \end{aligned}$$
(1)

where

$$\begin{aligned} \mathcal {L}u:={-} \nabla \cdot \sigma (u) {-} \rho u, \quad \sigma (u):= 2 \mu \mathcal {E}(u) {+} \lambda \left( \nabla \cdot u \right) I, \quad \mathcal {E}(u):= \frac{1}{2} \left( \nabla {u} {+} \nabla {u}^T \right) .\nonumber \\ \end{aligned}$$
(2)

The Lamé coefficients \(\lambda (x),\mu (x)\) and the density \(\rho (x)\) are assumed to be known. If suitable boundary conditions on \(\partial \Omega \) are given, then this problem is well-posed and approximate solutions of any desired accuracy can be obtained using standard numerical methods. However, here we will consider the case in which no information on the wave displacement on the boundary is provided. To partially compensate for this lack of information, we assume instead that measurements of u in some open subset \(\omega \subset \Omega \) are available, that is

$$\begin{aligned} u = u_{\omega } \quad \text { in } \omega . \end{aligned}$$
(3)
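To make the constitutive law in Eq. (2) concrete, the following minimal sketch evaluates \(\mathcal {E}(u)\) and \(\sigma (u)\) at a single point from a given displacement gradient. The function names `strain` and `stress` and the nested-list representation of \(\nabla u\) are illustrative assumptions and not part of the method described in this paper.

```python
def strain(grad_u):
    """Symmetric gradient E(u) = (grad u + grad u^T) / 2 for a d x d nested list."""
    d = len(grad_u)
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(d)]
            for i in range(d)]


def stress(grad_u, mu, lam):
    """Stress sigma(u) = 2 mu E(u) + lam (div u) I at one point (hypothetical helper)."""
    d = len(grad_u)
    eps = strain(grad_u)
    div_u = sum(grad_u[i][i] for i in range(d))  # divergence = trace of the gradient
    return [[2.0 * mu * eps[i][j] + (lam * div_u if i == j else 0.0)
             for j in range(d)] for i in range(d)]
```

By construction the returned tensor is symmetric, in agreement with the definition of \(\sigma (u)\) in Eq. (2).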

The objective is then to continue the solution into a larger subset \(B \subset \Omega \). Note that this problem is ill-posed since continuous dependence on the data fails, i.e. an estimate of the form \(\left\Vert u\right\Vert _{B} \le C ( \left\Vert f\right\Vert _{ \Omega } + \left\Vert u\right\Vert _{ \omega } )\), where \(B \subset \Omega \) is such that \(B {\setminus } \omega \ne \emptyset \), is in general not valid. Here we introduced the shorthand \( \left\Vert f\right\Vert _{ M }:= \left\Vert f\right\Vert _{ [L^2(M)]^d } \) for a subset \(M \subset {\mathbb {R}}^d\). It is possible, though, to obtain [see Lin et al. (2010, 2011) and Sect. 2 for details] a conditional stability estimate of the form

$$\begin{aligned} \left\Vert u\right\Vert _{B} \le C \left( \left\Vert f\right\Vert _{ \Omega } + \left\Vert u\right\Vert _{\Omega } \right) ^{1-\tau } \left( \left\Vert f\right\Vert _{ \Omega } + \left\Vert u\right\Vert _{ \omega } \right) ^{\tau } \end{aligned}$$
(4)

on a subset \(\omega \subset B \subset \Omega \) such that \(B {\setminus } \omega \) does not touch the boundary of \(\Omega \). Note that the first factor in Eq. (4) involves \(\left\Vert u\right\Vert _{\Omega }\) and that the ill-posedness of the problem increases with decreasing Hölder exponent \(\tau \in (0,1)\).
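The effect of the Hölder exponent can be illustrated with a small numerical sketch that evaluates the right-hand side of Eq. (4). The constant \(C\) and the exponent \(\tau \) are not known explicitly in practice, so the values used below are assumptions chosen purely for illustration; the function name is hypothetical.

```python
def conditional_stability_bound(f_norm, u_norm_Omega, u_norm_omega, tau, C=1.0):
    """Right-hand side of the Hoelder-type estimate in Eq. (4).

    C and tau are unknown in practice; the values passed here are assumptions.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie in (0, 1)")
    a_priori = f_norm + u_norm_Omega   # factor involving the global norm of u
    data = f_norm + u_norm_omega       # factor involving the measured data
    return C * a_priori ** (1.0 - tau) * data ** tau
```

For instance, with \(f = 0\), \(\left\Vert u\right\Vert _{\Omega } = 1\), \(\left\Vert u\right\Vert _{\omega } = 10^{-4}\) and \(C = 1\), the bound is about \(10^{-2}\) for \(\tau = 1/2\) but only about \(10^{-1}\) for \(\tau = 1/4\).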

A lack of well-posedness precludes the use of many established numerical methods (e.g. standard finite elements) which heavily rely on this property to obtain reliable approximate solutions. In order to apply these methods anyway, the continuous problem is usually approximated by a series of well-posed problems which are perturbations of the original problem. We will follow a different approach here based on casting the original data assimilation problem as a constrained optimization problem at the discrete level. This discrete problem is unstable since no regularization has been introduced at the continuous level. Subsequently, regularization will be added at the discrete level by utilizing stabilization terms well-known in the finite element community. An appropriate choice thereof allows us to conduct an error analysis which exploits the conditional stability estimate of Eq. (4) and leads to explicit convergence rates (see Theorem 10).

Our method is based on a general framework for noncoercive problems that has been introduced by Burman (2013) and was thereupon applied to a variety of problems including unique continuation and source reconstruction for the Poisson problem (Burman et al. 2018) as well as data assimilation for the heat (Burman and Oksanen 2018) and linearized Navier–Stokes equations (Boulakia et al. 2020). Concerning time-harmonic wave equations, Nechita (2020) treated unique continuation for the constant coefficient Helmholtz equation in his dissertation using piecewise affine finite elements, see also Burman et al. (2019). A hybridized high order method for the same problem has been analyzed in Burman et al. (2021). In relation to the literature, the contributions of the present paper are as follows:

  • We generalize the method from Burman et al. (2019) to the case of elastic wave propagation, in particular we treat the Lamé system instead of the scalar Helmholtz equation.

  • Additionally, we carry out an error analysis for arbitrary polynomial orders and investigate the benefits of using higher order polynomials in numerical experiments. In contrast to the hybrid high order method presented in Burman et al. (2021), standard \(H^1\)-conforming finite elements are employed in this work.

  • Whereas the publications (Burman et al. 2019, 2021) treat the case of constant coefficients, we allow for a spatial dependence of the material parameters and present numerical experiments for the practically relevant setting of a jumping shear modulus. The price for working at this level of generality is that, in contrast to the cited works, our error analysis is not explicit in the wavenumber.

The remainder of this paper is structured as follows. In Sect. 2 we give a precise statement of the conditional stability estimate of Eq. (4), whose actual derivation is deferred to Section A. In Sect. 3 we introduce a stabilized finite element method to numerically approximate the unique continuation problem from Eqs. (1)–(3). Section 4 presents an analysis which leads to \(L^2\)-error estimates first for the case of unperturbed (Theorem 8) and then for perturbed data (Theorem 10). Numerical experiments that confirm our theoretical findings and investigate additional aspects are presented in Sect. 5. We finish with a conclusion and an outlook towards future research.

2 Conditional stability result for the continuous problem

Deriving unique continuation or conditional stability results for the Lamé system requires some regularity assumptions on the coefficients. The foundation for the conditional stability estimate employed in this paper is a three ball inequality derived in Lin et al. (2011) which is based on the following assumption.

Assumption 1

Let \(\mu \in C^{0,1}(\Omega )\), and let \(\lambda ,\rho \in L^{\infty }(\Omega )\) satisfy

$$\begin{aligned} \left\{ \begin{array}{llll} &{} \mu (x) \ge \delta _0, &{} \lambda (x) + 2 \mu (x) \ge \delta _0 > 0 \quad \forall x \in \Omega , \\ &{} \left\Vert \mu \right\Vert _{C^{0,1}(\Omega )} + \left\Vert \lambda \right\Vert _{L^{\infty }(\Omega )} \le M_0, &{} \left\Vert \rho \right\Vert _{L^{\infty }(\Omega )} \le M_0. \end{array}\right. \end{aligned}$$
(5)

for some positive constants \(\delta _0\) and \(M_0\). Here,

$$\begin{aligned} \left\Vert g\right\Vert _{C^{0,1}(\Omega )}:= \left\Vert g\right\Vert _{L^{\infty }(\Omega ) } + \left\Vert \nabla g \right\Vert _{L^{\infty }(\Omega ) }. \end{aligned}$$

The three ball inequality of Lin et al. (2011, Theorem 1.1) takes the following form.

Theorem 1

Let the origin of \({\mathbb {R}}^d\) be contained in \(\Omega \). There exists a positive number \({\tilde{R}} < 1\), depending only on \(d,M_0,\delta _0\), such that if

$$\begin{aligned} 0< R_1< R_2< R_3 \le R_0 \text { and } R_1/R_3< R_2 / R_3 < {\tilde{R}}, \end{aligned}$$

then

$$\begin{aligned} \int \limits _{ \left|x\right|< R_2 } \left|u\right|^2 \textrm{d}x \le C \left( \int \limits _{ \left|x\right|< R_1 } \left|u\right|^2 \textrm{d}x \right) ^{\tau } \left( \int \limits _{ \left|x\right| < R_3 } \left|u\right|^2 \textrm{d}x \right) ^{1 - \tau } \end{aligned}$$
(6)

for \(u \in H^{1}_{\text {loc}}(B_{R_0})\) satisfying \(\mathcal {L}u = 0\) in \(B_{R_0}\), where the constant C depends on \(R_2 / R_3, d,M_0,\delta _0\), and \(0< \tau < 1\) depends on \(R_1/R_3, R_2 / R_3, d, M_0, \delta _0\). Moreover, for fixed \(R_2\) and \(R_3\), the exponent \(\tau \) behaves like \(1/(-\log R_1 )\) where \(R_1\) is sufficiently small.
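The logarithmic degeneration of the exponent stated at the end of Theorem 1 can be visualized with a short sketch. It reproduces only the scale \(1/(-\log R_1)\); the actual \(\tau \) also depends on constants that are not modelled here, and the function name is a hypothetical choice.

```python
import math


def tau_scale(R1):
    """Asymptotic scale 1 / (-log R1) of the exponent tau for small R1.

    Only the scaling is reproduced; the true tau involves further constants.
    """
    if not 0.0 < R1 < 1.0:
        raise ValueError("R1 must lie in (0, 1)")
    return 1.0 / (-math.log(R1))
```

Shrinking the inner radius by several orders of magnitude therefore reduces this scale only slowly, but it never stops decreasing, which reflects the increasing ill-posedness for small data regions.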

To obtain a conditional stability result from Theorem 1 that is suitable for our purpose, we require well-posedness of the interior impedance problem. Let us fix some notation before stating the required result.

  • As in the introduction let \(V:= [H^1(\Omega )]^d\) denote the usual Sobolev space of real-valued functions with square integrable weak derivatives up to first order and \(V_0:= [H^1_0(\Omega )]^d\) denote V with homogeneous Dirichlet boundary conditions included. We use a prime to denote the corresponding dual spaces, i.e. \(V^{\prime }\) and \(V_0^{\prime }\).

  • Let \(V_{\mathbb {C}}\) denote the Sobolev space \([H^1(\Omega )]^d\) of functions taking values in the complex numbers with inner product \((u,v)_{V_{\mathbb {C}}}:= \int \limits _{\Omega } ( \nabla u \nabla {\bar{v}} + u {\bar{v}} ) \; \textrm{d}x\) and \(V_{\mathbb {C}}^{\prime }\) denote its dual space. The corresponding space with homogeneous Dirichlet boundary conditions and its dual are denoted by \(V_{{\mathbb {C}},0}\) and \(V_{{\mathbb {C}},0}^{\prime }\), respectively.

Assumption 2

Let \(\Omega \) be a bounded Lipschitz domain, \(f \in V_{\mathbb {C}}^{\prime }\) and \(k > 0\). We assume that there exists a unique solution \( u \in V_{\mathbb {C}}\) of the problem

$$\begin{aligned} \left\{ \begin{array}{ll} \mathcal {L} u = f &{}\quad \text { in } \Omega ,\\ \sigma (u) \cdot {\textbf{n}}_{\partial \Omega } + i ku = 0&{} \quad \text { on } \partial \Omega , \end{array}\right. \end{aligned}$$
(7)

fulfilling the stability bound

$$\begin{aligned} \left\Vert u\right\Vert _{V_{\mathbb {C}}} \le C \left\Vert f\right\Vert _{V_{\mathbb {C}}^{\prime }}. \end{aligned}$$
(8)

Here, \({\textbf{n}}_{ \partial \Omega }\) denotes the exterior normal vector on \(\partial \Omega \). The constant C is assumed to be independent of u and f but may depend on the material parameters, k and the domain \(\Omega \).

If \(\partial \Omega \) and the Lamé coefficients are sufficiently smooth, then Assumption 2 follows by exploiting elliptic regularity. Indeed, as Korn’s inequality and the assumption \(\lambda (x) + 2 \mu (x) > 0 \) yield a Gårding inequality, the Fredholm alternative implies the desired well-posedness provided that uniqueness can be shown. To this end, note that a solution of Eq. (7) with \(f=0\) has to vanish on \(\partial \Omega \) which follows by taking the imaginary part of the weak formulation using \({\bar{u}}\) as a test function. If one can now show that \(\sigma (u) \cdot {\textbf{n}}_{\partial \Omega }\) vanishes as well on \(\partial \Omega \), then u can be extended by zero to an \(H^1\)-solution in all of \({\mathbb {R}}^d\) which implies by Theorem 1 that it must vanish everywhere. At this point, smoothness assumptions are required to obtain that \(\mathcal {L}u = 0\) in \(\Omega \) which implies vanishing of \(\sigma (u) \cdot {\textbf{n}}_{\partial \Omega }\) on the boundary using integration by parts.
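The step involving the imaginary part can be spelled out as follows. This is a sketch under the additional assumption that \(\mu ,\lambda \) and \(\rho \) are real-valued, with boundary integrals understood in the appropriate duality sense. Testing Eq. (7) with \(f=0\) against \({\bar{u}}\), integrating by parts and inserting the impedance condition \(\sigma (u) \cdot {\textbf{n}}_{\partial \Omega } = -iku\) gives

$$\begin{aligned} 0&= \int \limits _{\Omega } \left[ \sigma (u):\mathcal {E}({\bar{u}}) - \rho \left|u\right|^2 \right] \textrm{d}x - \int \limits _{\partial \Omega } (\sigma (u) \cdot {\textbf{n}}_{\partial \Omega }) \cdot {\bar{u}} \; \textrm{d}S \\&= \int \limits _{\Omega } \left[ 2 \mu \left|\mathcal {E}(u)\right|^2 + \lambda \left|\nabla \cdot u\right|^2 - \rho \left|u\right|^2 \right] \textrm{d}x + i k \left\Vert u\right\Vert _{ \partial \Omega }^2. \end{aligned}$$

The volume integral is real, so the imaginary part of this identity reads \(k \left\Vert u\right\Vert _{ \partial \Omega }^2 = 0\), and \(k>0\) yields \(u = 0\) on \(\partial \Omega \).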

For applications to high-frequency wave propagation it is important to understand how the constant C in the stability bound given in Eq. (8) depends on the coefficient \(\rho \) in Eq. (2). According to the next lemma, for a homogeneous medium the dependence is fortunately no worse than linear.

Lemma 2

Let \(\Omega \) be a bounded Lipschitz domain, \(f \in V_{\mathbb {C}}^{\prime }\) and \(k>0\). If \(\mu , \lambda \) and \(\rho =k^2\) are constant with \(k \ge 1\) and \(d=3\), then

$$\begin{aligned} \left\Vert \nabla u\right\Vert _{ \Omega } + k \left\Vert u\right\Vert _{ \Omega } \le C k^2 \left\Vert f\right\Vert _{V_{\mathbb {C}}^{\prime }}, \end{aligned}$$
(9)

holds for the solution u of Eq. (7), where the constant C is independent of k.

Proof

Given in Section A. \(\square \)

Let us remark that Lemma 2 is obtained as a corollary of Brown and Gallistl (2022, Theorem 2.7), in which the authors proved Eq. (9) with \(\left\Vert f\right\Vert _{ \Omega }\) on the right hand side and a factor of k. Actually, a sharper bound which is \(\mathcal {O}(1)\) in k has been obtained in Chaumont-Frelet and Nicaise (2019, Proposition 4.3) by imposing stronger smoothness assumptions on \(\partial \Omega \) and requiring that \(\Omega \) is star-shaped. Hence, under these additional assumptions the bound in Eq. (9) could be lowered from \(k^2\) to k.

Utilizing well-posedness of Eq. (7) allows us to mold the three-ball inequality of Theorem 1 into a form which is suitable for the numerical analysis in Sect. 4.

Corollary 3

Let u be a solution of \(\mathcal {L} u = f \in V_0^{\prime }\). Consider subdomains \(\omega \subset B \subset \Omega \) such that \(B {\setminus } \omega \) does not touch the boundary of \(\Omega \). Then there exists a constant \(C>0\) and \(\tau \in (0,1)\) such that

$$\begin{aligned} \left\Vert u\right\Vert _{L^2(B)} \le C \left( \left\Vert f\right\Vert _{ V_{0}^{\prime } } + \left\Vert u\right\Vert _{[L^2(\Omega )]^d} \right) ^{1-\tau } \left( \left\Vert f\right\Vert _{ V_{0}^{\prime } } + \left\Vert u\right\Vert _{ [L^2(\omega )]^d} \right) ^{\tau }. \end{aligned}$$
(10)

Proof

Given in Section A. \(\square \)

3 Discretisation

In this section we introduce a stabilized finite element method to numerically approximate the unique continuation problem given in Eqs. (1)–(3). In Sect. 3.1 triangulations and finite element spaces are defined. We proceed in Sect. 3.2 by defining a Lagrangian functional from which numerical approximations to the wave displacement will be obtained as saddle points. In Sect. 3.3 we specify stabilization terms for this Lagrangian. Suitable norms for the error analysis are presented in Sect. 3.4. Interpolation operators and certain stability estimates for them are considered in Sect. 3.5.

3.1 Finite element spaces

For the analysis it will be assumed that the domain \(\Omega \) is polygonal. This is consistent with the regularity requirement on \(\Omega \) stated in Assumption 2. Consider then a family \( \mathcal {T} = \{ \mathcal {T}_h \}_{ h >0}\) of triangulations of \(\Omega \) consisting of simplices \(K \in \mathcal {T}_h\) such that the intersection of any two distinct ones is either a common vertex, a common edge or a common face. Further assume that the family \(\mathcal {T}\) is quasi-uniform and fitted to the subsets \(\omega \) and B. Let \(\mathcal {F}_i\) denote the set of all interior facets of the triangulation. Let \(X_h^{p}\) be the standard \(H^1\)-conforming finite element space of piecewise polynomials of order p on \(\mathcal {T}_h\). Define

$$\begin{aligned} V_h^p:= [X_h^p]^d, \qquad W_h^p:= V_h^p \cap V_0. \end{aligned}$$
(11)

For ease of notation the superscript p will usually be omitted below.
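As a small illustration of how the polynomial order p affects the size of the discrete spaces, the following sketch counts the local degrees of freedom per simplex. It uses the standard dimension formula \(\binom{p+d}{d}\) for polynomials of degree at most p in d variables; the function name `local_dofs` is a hypothetical choice.

```python
from math import comb


def local_dofs(p, d):
    """Number of scalar and vector-valued Lagrange dofs of order p on a d-simplex."""
    scalar = comb(p + d, d)      # dimension of the polynomials of degree <= p
    return scalar, d * scalar    # V_h^p uses d copies of X_h^p
```

For example, quadratic elements on tetrahedra carry 10 scalar and hence 30 vector-valued local degrees of freedom.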

3.2 Lagrangian and optimality conditions

The weak formulation of the partial differential equation (PDE) constraint in Eq. (1) is given by: Find \(u \in V\) such that \(a_h(u,v) = (f,v)_{\Omega } \) for all \(v \in V_0\), where

$$\begin{aligned} a_h(u,v):= \int \limits _{\Omega } \left[ \sigma (u):\mathcal {E}(v) - \rho uv \right] \textrm{d}x, \end{aligned}$$
(12)

and \((v,w)_{M}:= (v,w)_{ [L^2(M)]^d }\) for any \(M \subset {\mathbb {R}}^d\). Following Burman et al. (2019) and Nechita (2020), we define the Lagrangian

$$\begin{aligned} L(u_h,z_h)&= \frac{1}{2} \left\Vert u_h - u_{\omega } \right\Vert _{ \omega }^2 + \frac{1}{2} s_{\gamma }(u_h-u,u_h-u) + \frac{1}{2} s_{\alpha }(u_h,u_h) \nonumber \\&\quad - \frac{1}{2} s^{*}(z_h,z_h) + a_h(u_h,z_h) + s_{\beta }(u_h,z_h) - (f,z_h)_ {\Omega } \end{aligned}$$
(13)

which contains aside from the data fidelity and PDE constraint four stabilization terms \(s_{\gamma },s_{\alpha }\),\(s_{\beta }\) and \(s^{*}\) that will be specified later. Since a solution u of Eq. (1) is explicitly inserted in \(s_{\gamma }\) above, we have to be careful in choosing this stabilization so that it can indeed be implemented using only the given data f and \(u_{\omega }\). We will see below in Eq. (25) that this is indeed the case.

The first order optimality conditions lead to the equations: Find \((u_h,z_h) \in V_h \times W_h\) such that

$$\begin{aligned} \begin{array}{l}(u_h,v_h)_{ \omega } + s_{\gamma }(u_h-u,v_h) + s_{\alpha }(u_h,v_h) + a_h(v_h,z_h) + s_{\beta }(v_h,z_h) = (u_{\omega },v_h)_{ \omega }, \\ \qquad a_h(u_h,w_h) - s^{*}(z_h,w_h) + s_{\beta }(u_h,w_h) = (f,w_h)_{ \Omega } \end{array}\nonumber \\ \end{aligned}$$
(14)

for all \((v_h,w_h) \in V_h \times W_h\). This can be written in the compact form: Find \((u_h,z_h) \in V_h \times W_h\) such that

$$\begin{aligned} A[(u_h,z_h),(v_h,w_h)] {=} (u_{\omega },v_h)_{ \omega } {+} s_{\gamma }(u,v_h) {+} (f,w_h)_{ \Omega } \quad \forall (v_h,w_h) \in V_h \times W_h,\nonumber \\ \end{aligned}$$
(15)

with

$$\begin{aligned} A[(u_h,z_h),(v_h,w_h)] := & {} (u_h,v_h)_{ \omega } + s_{\gamma }(u_h,v_h) + s_{\alpha }(u_h,v_h) + a_h(v_h,z_h) \nonumber \\{} & {} \quad + s_{\beta }(v_h,z_h) - s^{*}(z_h,w_h) + a_h(u_h,w_h) + s_{\beta }(u_h,w_h).\nonumber \\ \end{aligned}$$
(16)
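In matrix form, Eq. (15) becomes a symmetric saddle-point system. The sketch below assembles its block structure from three hypothetical input matrices: `G` collects \((\cdot ,\cdot )_{\omega } + s_{\gamma } + s_{\alpha }\) on \(V_h\), `K` collects \(a_h + s_{\beta }\) with rows indexed by a basis of \(W_h\) and columns by a basis of \(V_h\), and `S_star` represents \(s^{*}\) on \(W_h\). How these matrices are obtained from a finite element library is not shown and would depend on the software used.

```python
def assemble_saddle_point(G, K, S_star):
    """Return the block matrix [[G, K^T], [K, -S_star]] as nested lists.

    G is n x n, K is m x n and S_star is m x m (all hypothetical inputs).
    """
    n, m = len(G), len(S_star)
    assert all(len(row) == n for row in G)
    assert len(K) == m and all(len(row) == n for row in K)
    assert all(len(row) == m for row in S_star)
    # Rows tested with v_h: [G | K^T]
    top = [list(G[i]) + [K[j][i] for j in range(m)] for i in range(n)]
    # Rows tested with w_h: [K | -S_star]
    bottom = [list(K[j]) + [-S_star[j][l] for l in range(m)] for j in range(m)]
    return top + bottom
```

If `G` and `S_star` are symmetric, the assembled matrix is symmetric but indefinite because of the negative lower-right block, which is typical for systems derived from a Lagrangian.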

3.3 Stabilization

In this section we introduce suitable stabilization terms which are crucial for our method to operate properly. For well-definedness and consistency of the stabilization some regularity assumptions on the coefficients are required. To keep the exposition clear, we will first introduce all stabilization terms in Sect. 3.3.1 under the assumption of smooth coefficients and then specify in Sect. 3.3.2 the minimal regularity requirement under which specific parts of the stabilization can be activated.

3.3.1 Definition

We start by introducing a notation for the j-th order jumps of the stress \(\sigma (u)\) in normal direction over the interior facets:

$$\begin{aligned} J_j(u_h,v_h):= \sum \limits _{F \in \mathcal {F}_i} \int \limits _{F} h^{2j-1} \llbracket (\nabla ^{j-1} \sigma (u_h)) \cdot {\textbf{n}} \rrbracket \llbracket (\nabla ^{j-1} \sigma (v_h)) \cdot {\textbf{n}} \rrbracket \; \textrm{d}S, \quad j \ge 1.\nonumber \\ \end{aligned}$$
(17)

Here, the jump over a facet \(F = K_1 \cap K_2\) for two neighboring simplices \(K_1,K_2 \in \mathcal {T}_h\) is defined as

$$\begin{aligned} \llbracket (\nabla ^{j-1} \sigma (u_h)) \cdot {\textbf{n}} \rrbracket := \left. (\nabla ^{j-1} \sigma (u_h))\right| _{K_1} \cdot {\textbf{n}}_1 + \left. (\nabla ^{j-1} \sigma (u_h))\right| _{K_2} \cdot {\textbf{n}}_2, \end{aligned}$$
(18)

where \({\textbf{n}}_i\) are the outward pointing normal vectors of \(K_i,i=1,2\). These jump terms appear in both stabilizers \(s_{\beta }\) and \(s_{\gamma }\), albeit applied to different variables and serving different purposes. The stabilizer

$$\begin{aligned} s_{\beta }(u_h,w_h):= \sum \limits _{j=1}^{p} \beta _j J_j(u_h,w_h) \end{aligned}$$
(19)

for penalty parameters \(\beta _j \in {\mathbb {R}}, j=1,\ldots ,p\), represents a perturbation of the original PDE constraint whose effectiveness in mitigating pollution effects will be investigated in numerical experiments (see Sect. 5.3). This is inspired by the well-known continuous interior penalty (CIP)-FEM for the Helmholtz equation, see e.g. Wu (2013), Zhu and Wu (2013), Du and Wu (2015), Zhou and Wu (2022). Let us mention that this stabilization term is optional in the sense that the final error estimate stated in Theorem 10 holds even for \(\beta _j = 0, j=1,\ldots ,p\).

Jump terms also appear as part of the stabilizer

$$\begin{aligned} s_{\gamma }(u_h,v_h):= \sum \limits _{j=1}^{p} \gamma _{j} J_j(u_h,v_h) + \gamma _{\text {GLS}}h^2 (\mathcal {L}u_h,\mathcal {L} v_h)_{ \mathcal {T}_h }, \end{aligned}$$
(20)

which is required to guarantee unique solvability of Eq. (15). Here,

$$\begin{aligned} h^2 (\mathcal {L}u_h,\mathcal {L} v_h)_{ \mathcal {T}_h }:= h^2 \sum \limits _{K \in \mathcal {T}_h} (\mathcal {L}u_h,\mathcal {L} v_h)_{K}, \end{aligned}$$

is a Galerkin least squares stabilization. We will require that the penalty parameters satisfy

$$\begin{aligned} \gamma _1> 0 \text { and } \gamma _j \ge \max \{0,\left|\beta _j\right|\} \text { for } j=2,\ldots ,p \text { and } \gamma _{\text {GLS}}> 0. \end{aligned}$$
(21)
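In an implementation it may be convenient to validate the chosen penalty parameters against Eq. (21) before assembling the system. The following sketch is one possible way to do this; the function name and the list-based storage of \(\gamma _j\) and \(\beta _j\) are assumptions. Note that \(\max \{0,\left|\beta _j\right|\} = \left|\beta _j\right|\) and that Eq. (21) places no restriction on \(\beta _1\).

```python
def satisfies_eq_21(gamma, beta, gamma_gls):
    """Check gamma_1 > 0, gamma_j >= |beta_j| for j >= 2, and gamma_GLS > 0.

    gamma and beta are lists of length p; entry j-1 holds gamma_j and beta_j.
    """
    if len(gamma) != len(beta) or not gamma:
        raise ValueError("gamma and beta must be non-empty and of equal length")
    first_ok = gamma[0] > 0.0
    # Entries with index >= 1 correspond to j >= 2.
    higher_ok = all(g >= abs(b) for g, b in zip(gamma[1:], beta[1:]))
    return first_ok and higher_ok and gamma_gls > 0.0
```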

Similarly to Burman et al. (2021), we additionally add a discrete Tikhonov regularization term

$$\begin{aligned} s_{\alpha }(u_h,v_h):= \alpha h^{2p} (u_h,v_h)_{ \Omega }, \end{aligned}$$
(22)

for some \(\alpha >0\) to control the \(L^2\)-norm of the approximation \(u_h\) on all of \(\Omega \).

The stabilization for the dual variable is defined as

$$\begin{aligned} s^{*}(z_h,w_h):= \int \limits _{\Omega } \nabla z_h: \nabla w_h \; \textrm{d}x. \end{aligned}$$
(23)
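The mesh-dependent scalings of the primal stabilization terms introduced above can be summarized in a short sketch. It merely collects the factors \(\gamma _j h^{2j-1}\) from Eqs. (17) and (20), \(\gamma _{\text {GLS}} h^2\) from Eq. (20) and \(\alpha h^{2p}\) from Eq. (22); the function name and the dictionary layout are hypothetical, and the polynomial order is inferred from the number of jump parameters.

```python
def stabilization_weights(h, gamma, gamma_gls, alpha):
    """Mesh-dependent weights of the stabilization terms for order p = len(gamma)."""
    p = len(gamma)
    return {
        # gamma_j * h^(2j-1) multiplies the j-th order jump term J_j
        "jumps": [g * h ** (2 * j - 1) for j, g in enumerate(gamma, start=1)],
        # gamma_GLS * h^2 multiplies the element-wise least squares term
        "gls": gamma_gls * h ** 2,
        # alpha * h^(2p) multiplies the L2 inner product on Omega
        "tikhonov": alpha * h ** (2 * p),
    }
```

The Tikhonov weight decays with the highest power of h, so this term becomes weaker relative to the other terms as the mesh is refined.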

3.3.2 Regularity requirements

We will propose a numerical method that is well-defined for jumping shear moduli as they occur in practical applications, e.g. in seismology. Even though such jumps will violate the regularity Assumption 1 for the three ball inequality of Theorem 1 on which our error estimates will be based, we can nevertheless implement our method and carry out numerical experiments if jumps occur, see Sect. 5.5.

In the Galerkin least squares stabilization given in Eq. (20) the strong form of the differential operator \(\mathcal {L}\) is applied in an element-wise fashion. This requires the following assumption on the Lamé coefficients.

Assumption 3

We will assume that the meshes \(\mathcal {T}_h\) can be constructed so that possible singularities of \(\mu \) and \(\lambda \) only occur on element edges (for \(d=2\)) or faces (for \(d=3\)), that is \(\mu , \lambda \in H^1(\mathcal {T}_h)\), where

$$\begin{aligned} H^1(\mathcal {T}_h):= \{ v \in L^2(\Omega ) \mid \forall K \in \mathcal {T}_h, v|_{K} \in H^1(K) \}. \end{aligned}$$

This assumption appears to be realistic for applications in global seismic wave propagation in which meshes are usually constructed to respect the singularities of stratified reference earth models, see e.g. Komatitsch and Tromp (2002, Figure 6).

Next we will discuss the other contribution in Eq. (19) and Eq. (20), i.e. the jump terms over the facets. The analysis presented in Sect. 4 requires that these terms are consistent, i.e. if u is a weak solution of Eqs. (1)–(3), then

$$\begin{aligned} \gamma _{j} J_j(u,v) = \beta _{j} J_j(u,v) = 0 \text { for } j=1,\ldots ,p \end{aligned}$$
(24)

has to hold for all \(v \in V+V_h\). Since the meshes are assumed to be aligned with discontinuities of the Lamé coefficients, this automatically holds for \(j=1\) as weak solutions are required to satisfy \(\llbracket \sigma (u) \cdot {\textbf{n}} \rrbracket = 0 \) across an interface over which the coefficients exhibit jumps. However, higher order jumps do not need to vanish and so it is not conducive to penalize them. Therefore, we will set \(\beta _j\) and \(\gamma _{j} \) to zero for \(j \ge 2\) unless the Lamé coefficients are smooth. Note that this is consistent with Eq. (21).

Assumption 4

If \(\mu , \lambda \notin C^{\infty }(\Omega )\) then \(\beta _j = \gamma _{j} = 0\) for \(j \ge 2\).

Under Assumption 4 we have that Eq. (24) holds in any case since either the corresponding penalty parameter vanishes or because \(\llbracket (\nabla ^{j-1} \sigma (u)) \cdot {\textbf{n}} \rrbracket = 0\) for a sufficiently regular solution \(u \in [H^{p+1}(\Omega )]^{d}\), see e.g. Di Pietro and Ern (2011, Lemma 1.23). A possibility to relax Assumption 4 could be to allow for spatially-varying penalty parameters which vanish in regions where the material parameters are non-smooth and may take positive values elsewhere. We conclude this subsection by noting that

$$\begin{aligned} s_{\gamma }(u,v_h) = \gamma _{\text {GLS}}h^2 (\mathcal {L}u,\mathcal {L} v_h)_{ \mathcal {T}_h } = \gamma _{\text {GLS}}h^2 (f,\mathcal {L} v_h)_{ \mathcal {T}_h }, \end{aligned}$$
(25)

holds, which shows that the right hand side of Eq. (15) is known.
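Assumption 4 can be enforced programmatically by switching off the higher order jump penalties whenever the coefficients are not smooth. The sketch below assumes that smoothness is supplied by the user as a Boolean flag, since it cannot be detected automatically in general; the function name is hypothetical.

```python
def apply_assumption_4(gamma, beta, coefficients_smooth):
    """Return copies of (gamma, beta) with entries for j >= 2 set to zero
    unless the Lame coefficients are flagged as smooth."""
    if coefficients_smooth:
        return list(gamma), list(beta)
    # Index 0 corresponds to j = 1, which is always kept.
    gamma_out = [g if j == 0 else 0.0 for j, g in enumerate(gamma)]
    beta_out = [b if j == 0 else 0.0 for j, b in enumerate(beta)]
    return gamma_out, beta_out
```

The resulting parameters still satisfy \(\gamma _j \ge \left|\beta _j\right|\) for \(j \ge 2\), since both sides are zero.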

3.4 Norms and inf-sup condition

We define

$$\begin{aligned} \left\Vert u_h\right\Vert _{V_h}:= \left( s_{\gamma }(u_h,u_h) + s_{\alpha }(u_h,u_h) \right) ^{1/2}, \quad \left\Vert z_h\right\Vert _{W_h}:= s^{*}(z_h,z_h)^{1/2}, \end{aligned}$$
(26)

for \(u_h \in V_h\) and \(z_h \in W_h\). Since \(\alpha >0\) and thanks to the Friedrichs inequality on \(W_h \subset V_0\), cf. Eq. (39), these expressions indeed define norms on \(V_h\), respectively \(W_h\). On the product space \(V_h \times W_h\) we define

$$\begin{aligned} \left\Vert (u_h,z_h)\right\Vert _{s}^2:= \left\Vert u_h\right\Vert ^2_{V_h} + \left\Vert u_h\right\Vert ^2_{ \omega } + \left\Vert z_h\right\Vert _{W_h}^2, \quad (u_h,z_h) \in V_h \times W_h \end{aligned}$$
(27)

which then also defines a norm on \(V_h \times W_h\). Note that

$$\begin{aligned} A[(u_h,z_h),(u_h,-z_h)]= & {} \left\Vert u_h\right\Vert _{ [L^2(\omega )]^d}^2 + \left\Vert u_h\right\Vert _{V_h}^2 + \left\Vert z_h\right\Vert _{W_h}^2 \nonumber \\= & {} \left\Vert (u_h,z_h)\right\Vert _s \left\Vert (u_h,-z_h)\right\Vert _{s}, \end{aligned}$$
(28)

which implies the inf-sup condition

$$\begin{aligned} \sup _{ (v_h,w_h) \in V_h \times W_h} \frac{ A[(u_h,z_h),(v_h,w_h)] }{ \left\Vert (v_h,w_h)\right\Vert _s } \ge C \left\Vert (u_h,z_h) \right\Vert _s. \end{aligned}$$
(29)

Here and in the following \(C>0\) denotes a generic constant independent of h but possibly depending on the stabilization parameters.
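The cancellation behind Eq. (28) can be checked numerically on a toy example. In the sketch below, `G`, `K` and `S_star` are small hypothetical matrices standing in for \((\cdot ,\cdot )_{\omega } + s_{\gamma } + s_{\alpha }\), \(a_h + s_{\beta }\) and \(s^{*}\), respectively; they are not obtained from an actual discretization.

```python
def dot(x, y):
    """Euclidean inner product of two lists."""
    return sum(a * b for a, b in zip(x, y))


def matvec(M, x):
    """Matrix-vector product for a nested-list matrix."""
    return [dot(row, x) for row in M]


def A_form(G, K, S_star, u, z, v, w):
    """Matrix version of A[(u,z),(v,w)] from Eq. (16) (toy illustration)."""
    return (dot(v, matvec(G, u))          # (u,v)_omega + s_gamma(u,v) + s_alpha(u,v)
            + dot(z, matvec(K, v))        # a_h(v,z) + s_beta(v,z)
            - dot(w, matvec(S_star, z))   # - s^*(z,w)
            + dot(w, matvec(K, u)))       # a_h(u,w) + s_beta(u,w)


G = [[2.0, 0.0], [0.0, 3.0]]
K = [[1.0, 4.0]]
S_star = [[5.0]]
u, z = [1.0, 2.0], [3.0]

# Test with (v, w) = (u, -z): the two coupling terms cancel.
lhs = A_form(G, K, S_star, u, z, u, [-zi for zi in z])
rhs = dot(u, matvec(G, u)) + dot(z, matvec(S_star, z))
```

Both `lhs` and `rhs` evaluate to the same number here, mirroring the fact that the terms involving \(a_h\) and \(s_{\beta }\) drop out for the test pair \((u_h,-z_h)\).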

3.5 Interpolation

Let \(\Pi _h:V \rightarrow V_h\) denote the Scott-Zhang interpolation operator, which preserves homogeneous boundary conditions and fulfills (see Scott and Zhang 1990) the following stability

$$\begin{aligned} \left\Vert \Pi _h u\right\Vert _{ [H^1(\Omega )]^d } \le C \left\Vert u\right\Vert _{[H^1(\Omega )]^d}, \quad \forall u \in [H^1(\Omega )]^d \end{aligned}$$
(30)

and approximation property:

$$\begin{aligned} \left\Vert u - \Pi _h u\right\Vert _{[H^m(\Omega )]^d} \le C h^{s-m} \left\Vert u\right\Vert _{[H^{s}(\Omega )]^d}, \; \forall u \in [H^s(\Omega )]^d, \end{aligned}$$
(31)

with \(1 \le s \le p+1\) and \(0 \le m \le s\).
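Approximation estimates of the type in Eq. (31) predict algebraic rates \(h^{s-m}\). A common way to compare such predictions with computed errors is to estimate the observed rate from two mesh sizes, as in the following sketch. The function name is hypothetical, and the formula assumes that the error behaves like \(C h^{r}\) with the same constant on both meshes.

```python
import math


def observed_rate(h_coarse, err_coarse, h_fine, err_fine):
    """Estimate r in err ~ C h^r from errors on two mesh sizes."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)
```

For example, halving h while the error drops by a factor of four corresponds to an observed rate of approximately two.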

We will now derive some further approximation and stability results for this interpolation required for the analysis in Sect. 4. To this end, the following standard [see e.g. Brenner and Scott (2008, Eq. 10.3.9)] continuous trace inequality will be employed: There exists a constant \(C>0\) such that

$$\begin{aligned} \left\Vert v\right\Vert _{ \partial K } \le C \left( h^{-1/2} \left\Vert v\right\Vert _{ K } + h^{1/2} \left\Vert \nabla v \right\Vert _{ K } \right) , \; \forall v \in [H^1(K)]^d. \end{aligned}$$
(32)

Before we proceed, let us give a remark discussing solutions with low regularity.

Remark 1

Below we will assume that the solution u of Eqs. (1)–(3) is in \( [H^{p+1}(\Omega )]^d\) for \(p \ge 1\). If the Lamé coefficients are allowed to have jumps, then it is not realistic to assume that the solution enjoys such a high global regularity. However, according to Assumption 3 the jumps are limited to subdomains which are respected by the mesh. This would allow us to split \(\Omega \) into subdomains \(\Omega _i\) such that the restriction of u is in \( [H^{p+1}(\Omega _i)]^d\) for each i and treat each subdomain separately. To keep the analysis simple we will only consider such a scenario in our numerical experiments, see Sect. 5.5.

Lemma 4

(Weak consistency) Assume that \(u \in [H^{p+1}(\Omega )]^d\) is a solution of Eqs. (1)–(3). Then there exists a constant \(C>0\) such that there holds

$$\begin{aligned} \sum \limits _{j=1}^{p} \gamma _{j} J_j(\Pi _h u,\Pi _h u) \le C h^{2p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }^2. \end{aligned}$$
(33)

Proof

As discussed in Sect. 3.3.2 we have \( \gamma _{j} J_j( u, u) = \gamma _{j} J_j( u, \Pi _h u) = 0\) for \(j=1,\ldots ,p\). Therefore, inserting u, using the trace inequality of Eq. (32) and approximation properties of \(\Pi _h\) given in Eq. (31) yields:

$$\begin{aligned}&\sum \limits _{j=1}^{p} \gamma _{j} J_j(\Pi _h u, \Pi _h u) = \sum \limits _{j=1}^{p} \gamma _{j} \sum \limits _{F \in \mathcal {F}_i} \int \limits _{F} h^{2j-1} \llbracket (\nabla ^{j-1} \sigma ( \Pi _h u -u )) \cdot {\textbf{n}} \rrbracket ^2 \textrm{d}S \\&\quad \le C \sum \limits _{j=1}^{p} h^{2j-2} \left\Vert \nabla ^{j-1} \sigma \left( \Pi _h u - u \right) \right\Vert _{ [L^2(\Omega )]^d }^2 + h^{2j} \left\Vert \nabla ^{j-1} \sigma \left( \Pi _h u - u \right) \right\Vert _{ [H^1(\Omega )]^d }^2 \\&\quad \le C \sum \limits _{j=1}^{p} h^{2j-2} \left\Vert \left( \Pi _h u - u \right) \right\Vert _{ [H^{j}(\Omega )]^d }^2 + h^{2j} \left\Vert \left( \Pi _h u - u \right) \right\Vert _{ [H^{j+1}(\Omega )]^d }^2 \\&\quad \le C \sum \limits _{j=1}^{p} h^{2j-2} h^{2(p+1-j)} \left\Vert u \right\Vert _{ [H^{p+1}(\Omega )]^d }^2 + h^{2j} h^{2(p+1-j-1)} \left\Vert u \right\Vert _{ [H^{p+1}(\Omega )]^d }^2 \\&\quad \le C h^{2p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }^2. \end{aligned}$$

\(\square \)
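For the reader's convenience, the bookkeeping of the powers of h in the last step of the preceding proof can be made explicit. For every \(j = 1,\ldots ,p\) the two contributions combine to the same order,

$$\begin{aligned} h^{2j-2} \, h^{2(p+1-j)} = h^{2p}, \qquad h^{2j} \, h^{2(p+1-(j+1))} = h^{2p}, \end{aligned}$$

so that each summand is of order \(h^{2p}\) and the sum over j only affects the constant C, which may then depend on p.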

Corollary 5

Assume that \(u \in [H^{p+1}(\Omega )]^d\) is a solution of Eqs. (1)–(3). Then there exists a constant \(C>0\) such that

$$\begin{aligned} \left\Vert \Pi _h u - u \right\Vert _{V_h} \le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }. \end{aligned}$$
(34)

Proof

We have

$$\begin{aligned} \left\Vert \Pi _h u - u \right\Vert _{V_h}^2&= s_{\gamma }(\Pi _h u -u, \Pi _h u - u ) + s_{\alpha }(\Pi _h u -u, \Pi _h u - u ) \\&= \sum \limits _{j=1}^{p} \gamma _{j} J_j(\Pi _h u,\Pi _h u) {+} \gamma _{\text {GLS}} h^2 \left\Vert \mathcal {L}(\Pi _h u {-} u) \right\Vert _{ \mathcal {T}_h }^2 {+} \alpha h^{2p} \left\Vert \Pi _h u {-}u\right\Vert ^2_{ \Omega }. \end{aligned}$$

In view of Lemma 4, it only remains to treat the last two terms. We have

$$\begin{aligned} \left\Vert \Pi _h u -u\right\Vert ^2_{ \Omega } \le C h^{2(p+1)} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }^2 \end{aligned}$$

by Eq. (31). For the other term it follows from Assumption 3 on the coefficients and the approximation properties (Eq. (31)) of \(\Pi _h\) that

$$\begin{aligned} h^2 \left\Vert \mathcal {L}(\Pi _h u - u) \right\Vert _{ \mathcal {T}_h }^2 \le C h^2 \left\Vert \Pi _h u - u \right\Vert _{ [H^2(\mathcal {T}_h)]^d }^2 \le C h^2 h^{2(p-1)} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }^2. \end{aligned}$$

Combining these estimates yields the claim. \(\square \)

4 Error analysis

This section is concerned with the derivation of convergence rates for the stabilized finite element method introduced in Sect. 3. In Sect. 4.1 we first consider the case of unperturbed data. The perturbed case can then be treated in Sect. 4.2 by minor modification of the proofs for the unperturbed situation.

4.1 Unperturbed data

To obtain error estimates, we will apply the conditional stability estimate from Corollary 3 to the error \(u-u_h\). Controlling the arising terms on the right hand side of Eq. (10) requires estimates on the residual

$$\begin{aligned} \langle r,w \rangle := a_h(u_h-u,w) = a_h(u_h,w) - (f,w)_{ \Omega }, \quad w \in V_0. \end{aligned}$$

The next lemma provides one of the essential bounds for this purpose.

Lemma 6

  (a) There exists a constant \(C>0\) such that

    $$\begin{aligned} a_h(u,v) \le C \left\Vert u\right\Vert _{V_h} \left( h^{-1} \left\Vert v\right\Vert _{ \Omega } + \left\Vert \nabla v\right\Vert _{ \Omega } \right) , \end{aligned}$$
    (35)

    for all \( u \in V_h + [H^{p+1}(\Omega )]^{d}\) and \( v \in V_0\).

  (b) There exists a constant \(C>0\) such that

    $$\begin{aligned} s_{\beta }(u, w_h ) \le C \left\Vert u\right\Vert _{V_h} \left\Vert \nabla w_h\right\Vert _{ \Omega }, \; \forall u \in V_h + [H^{p+1}(\Omega )]^{d}, \; \forall w_h \in W_h. \end{aligned}$$
    (36)

Proof

  (a) Element-wise integration by parts yields

    $$\begin{aligned} a_h(u,v)&= \sum \limits _{K \in \mathcal {T}_h} \int \limits _{K} \left[ \sigma (u):\nabla v - \rho u v \right] \; \textrm{d}x \\&= \sum \limits _{K \in \mathcal {T}_h} \int \limits _{K} \left[ - \nabla \cdot \sigma (u) - \rho u \right] v \; \textrm{d}x + \sum \limits _{K \in \mathcal {T}_h} \int \limits _{\partial K} \sigma (u) \cdot {\textbf{n}} v \; \textrm{d}S \\&= \sum \limits _{K \in \mathcal {T}_h} \int \limits _{K} \mathcal {L} u v \; \textrm{d}x + \sum \limits _{F \in \mathcal {F}_i} \int \limits _{F} \llbracket \sigma (u) \cdot {\textbf{n}} \rrbracket v \; \textrm{d}S \\&:= \textrm{I} + \textrm{II}. \end{aligned}$$

    We control the first term by means of the Galerkin least squares stabilization:

    $$\begin{aligned} \textrm{I} \le \left( h^2 (\mathcal {L} u, \mathcal {L} u)_{ \mathcal {T}_h } \right) ^{1/2} h^{-1} \left\Vert v\right\Vert _{ \Omega } \le C \left\Vert u\right\Vert _{V_h} h^{-1} \left\Vert v\right\Vert _{ \Omega }. \end{aligned}$$

    The penalty on the normal jumps of \(\sigma (u)\) over the facets allows us to estimate the second term:

    $$\begin{aligned} \textrm{II}&\le C \left( \sum \limits _{F \in \mathcal {F}_i} \int \limits _{F} h \llbracket \sigma (u) \cdot {\textbf{n}} \rrbracket ^2 \; \textrm{d}S \right) ^{1/2} \left( \sum \limits _{F \in \mathcal {F}_i} h^{-1} \left\Vert v\right\Vert _{ F }^2 \right) ^{1/2} \\&\le C \left( \sum \limits _{F \in \mathcal {F}_i} \int \limits _{F} h \llbracket \sigma (u) \cdot {\textbf{n}} \rrbracket ^2 \; \textrm{d}S \right) ^{1/2} \left( \sum \limits _{K \in \mathcal {T}_h} h^{-2} \left\Vert v\right\Vert _{ K }^2 + \left\Vert \nabla v\right\Vert _{ K }^2 \right) ^{1/2} \\&\le C J_1(u,u)^{1/2} \left( h^{-1} \left\Vert v\right\Vert _{ \Omega } + \left\Vert \nabla v\right\Vert _{ \Omega } \right) , \end{aligned}$$

    where the trace inequality in Eq. (32) has been employed. Note that \(J_1(u,u)^{1/2} \le C \left\Vert u\right\Vert _{V_h}\) thanks to \(\gamma _1 > 0\). Combining both contributions yields the claim.

  (b) Making use of the inverse inequalities \(\left\Vert w_h\right\Vert _F \le C h^{-1/2} \left\Vert w_h\right\Vert _K\) and \(\left\Vert \nabla w_h\right\Vert _K \le C h^{-1} \left\Vert w_h\right\Vert _K\) yields

    $$\begin{aligned} \sum \limits _{j=1}^{p} \left|\beta _j\right| J_j(w_h,w_h)= & {} \sum \limits _{j=1}^{p} \left|\beta _j\right| \sum \limits _{F \in \mathcal {F}_i} \int \limits _{F} h^{2j-1} \llbracket (\nabla ^{j-1} \sigma (w_h)) \cdot {\textbf{n}} \rrbracket ^2 \; \textrm{d}S \\\le & {} C \sum \limits _{j=1}^{p} \sum \limits _{K \in \mathcal {T}_h} h^{2j-1} h^{-1} \left\Vert \nabla ^{j-1} \sigma (w_h) \right\Vert _{K}^2 \\\le & {} C \sum \limits _{j=1}^{p} \sum \limits _{K \in \mathcal {T}_h} h^{2j-1} h^{-1} h^{-2(j-1)} \left\Vert \sigma (w_h) \right\Vert _{K}^2 \\\le & {} C \sum \limits _{j=1}^{p} \sum \limits _{K \in \mathcal {T}_h} \left\Vert \nabla w_h \right\Vert _{K}^2. \end{aligned}$$

    Combining this with the Cauchy-Schwarz inequality

    $$\begin{aligned} s_{\beta }(u, w_h ){} & {} \le \left( \sum \limits _{j=1}^{p} \left|\beta _j\right| J_j(u,u) \right) ^{1/2} \left( \sum \limits _{j=1}^{p} \left|\beta _j\right| J_j(w_h,w_h) \right) ^{1/2} \\{} & {} \le C \left\Vert u\right\Vert _{V_h} \left\Vert \nabla w_h\right\Vert _{ \Omega } \end{aligned}$$

    yields the claim. Here we used that

    $$\begin{aligned} \sum \limits _{j=1}^{p} \left|\beta _j\right| J_j(u,u) \le C \sum \limits _{j=1}^{p} \gamma _{j} J_j(u,u), \end{aligned}$$

    which follows from the assumption given in Eq. (21) on the penalty parameters.

\(\square \)
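The cancellation of the powers of h in part (b) can be checked mechanically. The following is a minimal sketch with a hypothetical helper name; it only reproduces the exponent bookkeeping of the proof and is not part of the method itself.

```python
def net_h_exponent(j: int) -> int:
    """Net power of h in front of ||sigma(w_h)||_K^2 in the proof of Lemma 6 (b)."""
    penalty_weight = 2 * j - 1       # h^{2j-1}: weight of the jump term J_j
    trace_step = -1                  # h^{-1}: facet-to-element inverse inequality, squared
    derivative_step = -2 * (j - 1)   # h^{-2(j-1)}: (j-1) derivatives removed, squared
    return penalty_weight + trace_step + derivative_step


if __name__ == "__main__":
    for j in range(1, 5):
        print(f"j = {j}: net exponent of h = {net_h_exponent(j)}")
```

The exponent vanishes for every j, which is why the bound in Eq. (36) carries no explicit power of h.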

We will later apply Lemma 6 (a) to \(u_h -u\), which will result in a term \(\left\Vert u_h-u\right\Vert _{V_h}\) on the right-hand side of Eq. (35). Since

$$\begin{aligned} \left\Vert u_h-u\right\Vert _{V_h} \le \left\Vert u_h- \Pi _h u\right\Vert _{V_h} + \left\Vert \Pi _h u - u \right\Vert _{V_h} \end{aligned}$$

and we can already control \(\left\Vert \Pi _h u - u \right\Vert _{V_h}\) by Eq. (34), it remains to consider \(\left\Vert u_h- \Pi _h u\right\Vert _{V_h}\). To this end, we prove the next lemma.

Lemma 7

Assume that \(u \in [H^{p+1}(\Omega )]^d\) is a solution of Eqs. (1)–(3) and let \((u_h,z_h) \in V_h \times W_h\) be the solution to Eq. (15). Then there exists \(C>0\) such that for all \(h \in (0,1)\) it holds that

$$\begin{aligned} \left\Vert (u_h - \Pi _h u, z_h)\right\Vert _{s} \le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }. \end{aligned}$$
(37)

Proof

It suffices to prove that for \((v_h,w_h) \in V_h \times W_h\) the inequality

$$\begin{aligned} A[ (u_h - \Pi _h u,z_h),(v_h,w_h) ] \le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } \left\Vert (v_h,w_h) \right\Vert _{s} \end{aligned}$$
(38)

holds, because then the inf-sup condition in Eq. (29) yields

$$\begin{aligned} C \left\Vert (u_h - \Pi _h u, z_h)\right\Vert _{s}\le & {} \!\!\!\! \sup _{ (v_h,w_h) \in V_h \times W_h} \!\!\!\! \frac{ A[(u_h -\Pi _h u,z_h),(v_h,w_h)] }{ \left\Vert (v_h,w_h)\right\Vert _s } \\\le & {} C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }. \end{aligned}$$

To prove Eq. (38), we use Eq. (15) to arrive at

$$\begin{aligned} A[ (u_h - \Pi _h u,z_h),(v_h,w_h) ]&= A[ (u_h,z_h),(v_h,w_h) ] \\&\qquad - (\Pi _h u,v_h)_{ \omega } - s_{\gamma }(\Pi _h u, v_h) - s_{\alpha }(\Pi _h u, v_h) \\&\qquad - a_h(\Pi _h u,w_h) - s_{\beta }(\Pi _h u , w_h ) \\&= (u_{\omega },v_h)_{ \omega } + s_{\gamma }(u,v_h) + \underbrace{(f,w_h)_{ \Omega }}_{= a_h(u,w_h) } \\&\qquad - (\Pi _h u,v_h)_{ \omega } - s_{\gamma }(\Pi _h u, v_h) - s_{\alpha }(\Pi _h u, v_h) \\&\qquad - a_h(\Pi _h u,w_h) - s_{\beta }(\Pi _h u , w_h ) \\&= (u - \Pi _h u,v_h)_{ \omega } + a_h(u - \Pi _h u,w_h) \\&\qquad {+} s_{\gamma }(u {-} \Pi _h u, v_h) {-} s_{\alpha }(\Pi _h u, v_h) {+} s_{\beta }(u {-}\Pi _h u , w_h ). \end{aligned}$$

Here we also employed the consistency of the jump penalties, cf. Eq. (24).

  • The first term is bounded by using Cauchy-Schwarz and the approximation properties of \(\Pi _h\):

    $$\begin{aligned} (u - \Pi _h u,v_h)_{ \omega } \le \left\Vert u - \Pi _h u\right\Vert _{ \Omega } \left\Vert v_h \right\Vert _{ \omega } \le C h^{p+1} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d} \left\Vert v_h \right\Vert _{\omega }. \end{aligned}$$
  • For the second term we have

    $$\begin{aligned} a_h(u-\Pi _h u,w_h)&= \int \limits _{\Omega } \left[ \sigma (u-\Pi _h u) :\nabla w_h - \rho (u -\Pi _h u) w_h \right] \textrm{d}x \\&\le C \left( \left\Vert \nabla ( u - \Pi _h u) \right\Vert _{ \Omega } \left\Vert \nabla w_h \right\Vert _{ \Omega } + \left\Vert u - \Pi _h u\right\Vert _{ \Omega } \left\Vert w_h \right\Vert _{ \Omega } \right) \\&\le C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d} \left\Vert \nabla w_h \right\Vert _{ \Omega } + h^{p+1} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d} \left\Vert w_h \right\Vert _{ \Omega } \right) \\&\le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d} \left\Vert w_h\right\Vert _{W_h}, \end{aligned}$$

    where we used the approximation properties of \(\Pi _h\) and Friedrichs inequality

    $$\begin{aligned} \left\Vert w_h \right\Vert _{ [L^2(\Omega )]^d } \le C \left\Vert \nabla w_h \right\Vert _{ [L^2(\Omega )]^d }, \quad \forall w_h \in W_h \subset V_0. \end{aligned}$$
    (39)
  • For the third term we obtain from Eq. (34) that

    $$\begin{aligned} s_{\gamma }(u - \Pi _h u, v_h) \le \left\Vert u - \Pi _h u\right\Vert _{V_h} \left\Vert v_h \right\Vert _{V_h} \le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d} \left\Vert v_h\right\Vert _{V_h}. \end{aligned}$$
  • The second to last term is bounded by

    $$\begin{aligned} s_{\alpha }(\Pi _h u, v_h)&= \sqrt{\alpha } h^{p}(\Pi _h u, \sqrt{\alpha } h^{p}v_h)_{ \Omega } \le C \sqrt{\alpha } h^p \left\Vert u\right\Vert _{ [H^1(\Omega )]^d } \left\Vert v_h \right\Vert _{V_h}. \end{aligned}$$
  • For the last term we can use Eq. (36) to obtain

    $$\begin{aligned} s_{\beta }(u -\Pi _h u, w_h ) \le C \left\Vert u - \Pi _h u\right\Vert _{V_h} \left\Vert \nabla w_h\right\Vert _{ \Omega } \le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d} \left\Vert \nabla w_h\right\Vert _{ \Omega }. \end{aligned}$$

Combining these estimates yields Eq. (38). \(\square \)

We are now in a position to derive an \(L^2\)-error estimate for unperturbed data. For \(p=1\) it is comparable with Burman et al. (2019, Theorem 1) for the Helmholtz equation except that the dependence on the wavenumber is implicit in our estimate.

Theorem 8

Let the subdomains \(\omega \) and B of \(\Omega \) be defined as in Corollary 3. Assume that \(u \in [H^{p+1}(\Omega )]^d\) is a solution to Eqs. (1)–(3) and let \((u_h,z_h) \in V_h \times W_h\) be the solution to Eq. (15). Then there exist \(C>0\) and \(\tau \in (0,1)\) such that

$$\begin{aligned} \left\Vert u - u_h\right\Vert _{ B } \le C h^{\tau p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }. \end{aligned}$$
(40)

Proof

Consider the residual

$$\begin{aligned} \langle r,w \rangle := a_h(u_h-u,w) = a_h(u_h,w) - (f,w)_{ \Omega }, \quad w \in V_0. \end{aligned}$$

Taking \(v_h = 0\) in Eq. (15) yields:

$$\begin{aligned} a_h(u_h,w_h) = (f,w_h)_{\Omega } + s^{*}(z_h,w_h) - s_{\beta }(u_h,w_h) \quad \forall w_h \in W_h. \end{aligned}$$

Using this identity with \(w_h = \Pi _h w\) implies

$$\begin{aligned} \langle r,w \rangle&= a_h(u_h,w) - (f,w)_{ \Omega } - a_h(u_h,\Pi _h w) + a_h(u_h,\Pi _h w) \\&= a_h(u_h,w - \Pi _h w) - (f, w - \Pi _h w )_{ \Omega } + s^{*}(z_h,\Pi _h w ) - s_{\beta }(u_h, \Pi _h w ) \\&= a_h(u_h - u,w - \Pi _h w) + s^{*}(z_h,\Pi _h w ) - s_{\beta }(u_h - u, \Pi _h w ). \end{aligned}$$
  • From Lemma 6 we obtain that

    $$\begin{aligned} a_h(u_h {-} u,w {-} \Pi _h w)&\le C \left\Vert u_h {-}u\right\Vert _{V_h} \left( h^{-1} \left\Vert w {-} \Pi _h w \right\Vert _{ \Omega } {+} \left\Vert \nabla \left( w {-} \Pi _h w \right) \right\Vert _{ \Omega } \right) \\&\le C \left\Vert u_h -u\right\Vert _{V_h} \left\Vert w\right\Vert _{ [H^1(\Omega )]^d}, \end{aligned}$$

    by the properties in Eq. (30) and Eq. (31) of \(\Pi _h\). Further, from Lemma 7 and Eq. (34) we obtain

    $$\begin{aligned} \left\Vert u_h {-}u\right\Vert _{V_h} \le \left\Vert u_h {-} \Pi _h u\right\Vert _{V_h} {+} \left\Vert \Pi _h u {-}u\right\Vert _{V_h} \le C h^{p} \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d }. \end{aligned}$$
    (41)
  • To bound the second term, we again use Lemma 7 and the \(H^1\)-stability of \(\Pi _h\):

    $$\begin{aligned} s^{*}(z_h,\Pi _h w ) \le \left\Vert z_h\right\Vert _{W_h} \left\Vert \Pi _h w \right\Vert _{W_h} \le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } \left\Vert w\right\Vert _{ [H^{1}(\Omega )]^d }. \end{aligned}$$
  • The last term is treated by invoking Eq. (36) and then proceeding as in Eq. (41):

    $$\begin{aligned} s_{\beta }(u_h {-} u, \Pi _h w ) \le C \left\Vert u{-}u_h\right\Vert _{V_h} \left\Vert w\right\Vert _{ [H^1(\Omega )]^d} \le C h^{p} \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d } \left\Vert w\right\Vert _{ [H^{1}(\Omega )]^d }. \end{aligned}$$

Hence, the following residual norm estimate holds

$$\begin{aligned} \left\Vert r\right\Vert _{ V_{0}^{\prime } } \le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }. \end{aligned}$$

Using the conditional stability estimate from Corollary 3 for \(u-u_h\) (note that in Eq. (10) we have \(f = r\) in \(V_0^{\prime }\)) yields the following error estimate

$$\begin{aligned} \left\Vert u{-}u_h\right\Vert _{ B } {\le } C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d } {+} \left\Vert u{-}u_h\right\Vert _{ \omega } \right) ^{\tau } \left( h^{p} \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d } {+} \left\Vert u{-}u_h\right\Vert _{ \Omega } \right) ^{1{-}\tau }. \end{aligned}$$
  • From Eq. (31) and Lemma 7 we obtain

    $$\begin{aligned} \left\Vert u-u_h\right\Vert _{ \omega } \le \left\Vert u- \Pi _h u\right\Vert _{ \omega } + \left\Vert \Pi _h u - u_h\right\Vert _{ \omega } \le C h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }. \end{aligned}$$
  • We also have

    $$\begin{aligned} \left\Vert u-u_h\right\Vert _{ \Omega }{} & {} \le \left\Vert u- \Pi _h u\right\Vert _{ \Omega } + \left\Vert u_h - \Pi _h u\right\Vert _{ \Omega }\\{} & {} \le C h^{p+1} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \left\Vert u_h - \Pi _h u\right\Vert _{ \Omega }. \end{aligned}$$

    It remains to estimate \(\left\Vert u_h - \Pi _h u\right\Vert _{ [L^2(\Omega )]^d }\). By definition of \(s_{\alpha }(\cdot ,\cdot )\), see Eq. (22), we have

    $$\begin{aligned} \left\Vert u_h - \Pi _h u\right\Vert _{ \Omega }&= \alpha ^{-1/2} h^{-p} s_{\alpha }( u_h - \Pi _h u, u_h - \Pi _h u )^{1/2} \\&\le C h^{-p} \left\Vert u_h - \Pi _h u\right\Vert _{V_h} \le C \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }, \end{aligned}$$

    where the last inequality follows by Lemma 7.

It follows that

$$\begin{aligned} \left\Vert u{-}u_h\right\Vert _{ B } {\le } C \left( h^p \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d } \right) ^{\tau } \left( \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d } \right) ^{1{-}\tau } {=} C h^{p \tau } \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d }. \end{aligned}$$

\(\square \)
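The last step of the proof combines a small and a large quantity through the exponent \(\tau \). A minimal numerical sketch of this step is given below; the function names are ours, the constant is set to one, and the value of \(\tau \) is a hypothetical input, since the analysis only guarantees \(\tau \in (0,1)\).

```python
def conditional_stability_bound(small: float, large: float, tau: float) -> float:
    """Evaluate small**tau * large**(1 - tau), the shape of the Hoelder-type estimate."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie in (0, 1)")
    return small**tau * large ** (1.0 - tau)


def l2_error_bound(h: float, p: int, tau: float, u_norm: float) -> float:
    """Right-hand side of Eq. (40) with C = 1 (the true constant is unknown)."""
    # First factor: terms of order h^p ||u||; second factor: terms of order ||u||.
    return conditional_stability_bound(h**p * u_norm, u_norm, tau)


if __name__ == "__main__":
    for h in (0.2, 0.1, 0.05):
        print(h, l2_error_bound(h, p=2, tau=0.5, u_norm=3.0))
```

By construction the result coincides with \(h^{\tau p} \left\Vert u\right\Vert \), so halving h reduces the bound by the factor \(2^{-\tau p}\) rather than \(2^{-p}\).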

4.2 Perturbed data

We now proceed to the case of perturbed data

$$\begin{aligned} {\tilde{u}}_{\omega }:= u_{\omega } + \delta u, \quad {\tilde{f}}:= f + \delta f \end{aligned}$$

with unperturbed data \(u_{\omega }\) and f from Eq. (3) and Eq. (1), respectively, and perturbations \(\delta u \in [L^2(\omega )]^d\) and \(\delta f \in [L^2(\Omega )]^d\) measured by

$$\begin{aligned} \delta ({\tilde{u}}_{\omega },{\tilde{f}}):= \left\Vert \delta u\right\Vert _{ \omega } + h \left\Vert \delta f \right\Vert _{ \Omega } + \left\Vert \delta f \right\Vert _{H^{-1}(\Omega ) }. \end{aligned}$$
(42)

In view of Eq. (25), we have

$$\begin{aligned} \gamma _{\text {GLS}}h^2 ({\tilde{f}},\mathcal {L} v_h)_{ \mathcal {T}_h } = s_{\gamma }(u,v_h) + \gamma _{\text {GLS}}h^2 (\delta f,\mathcal {L} v_h)_{ \mathcal {T}_h }, \end{aligned}$$

so that the saddle points of the corresponding perturbed Lagrangian now satisfy:

$$\begin{aligned} A[(u_h,z_h),(v_h,w_h)] = ({\tilde{u}}_{\omega },v_h)_{ \omega } + s_{\gamma }(u,v_h) + \gamma _{\text {GLS}}h^2 (\delta f,\mathcal {L} v_h)_{ \mathcal {T}_h } + ({\tilde{f}},w_h)_{ \Omega }\nonumber \\ \end{aligned}$$
(43)

for all \((v_h,w_h) \in V_h \times W_h\). Let us first prove the analogue of Lemma 7 for perturbed data.

Lemma 9

Assume that \(u \in [H^{p+1}(\Omega )]^d\) is a solution of the unperturbed problem in Eqs. (1)–(3) and let \((u_h,z_h) \in V_h \times W_h\) be the solution of the perturbed problem in Eq. (43). Then there exists \(C>0\) such that for all \(h \in (0,1)\) it holds that

$$\begin{aligned} \left\Vert (u_h - \Pi _h u, z_h)\right\Vert _{s} \le C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) . \end{aligned}$$
(44)

Proof

Proceeding as in the proof of Lemma 7 we use Eq. (43) to arrive at

$$\begin{aligned}&A[ (u_h - \Pi _h u,z_h),(v_h,w_h) ] \\&\quad = (u_{\omega },v_h)_{ \omega } + (\delta u,v_h)_{ \omega } + s_{\gamma }(u,v_h) + \gamma _{\text {GLS}}h^2 (\delta f,\mathcal {L} v_h)_{ \mathcal {T}_h } + a_h(u,w_h) \\&\qquad + (\delta f, w_h)_{ \Omega } - (\Pi _h u,v_h)_{ \omega } - s_{\gamma }(\Pi _h u, v_h) - s_{\alpha }(\Pi _h u, v_h) \\&\qquad - a_h(\Pi _h u,w_h) - s_{\beta }(\Pi _h u , w_h ) \\&\quad = (u - \Pi _h u,v_h)_{ \omega } + a_h(u - \Pi _h u,w_h) + s_{\gamma }(u - \Pi _h u, v_h) - s_{\alpha }(\Pi _h u, v_h) \\&\qquad + s_{\beta }(u -\Pi _h u , w_h )+ \gamma _{\text {GLS}}h^2 (\delta f,\mathcal {L} v_h)_{ \mathcal {T}_h }+ (\delta u,v_h)_{ \omega } + (\delta f, w_h)_{ \Omega }. \end{aligned}$$

The terms that do not involve the perturbations are bounded as in the proof of Lemma 7. The terms including the perturbations are estimated by

$$\begin{aligned}{} & {} \gamma _{\text {GLS}}h^2 (\delta f,\mathcal {L} v_h)_{ \mathcal {T}_h } + (\delta u,v_h)_{ \omega } + (\delta f, w_h)_{ \Omega } \\{} & {} \quad \le \gamma _{\text {GLS}}h \left\Vert \delta f \right\Vert _{ \Omega } h \left\Vert \mathcal {L} v_h\right\Vert _{ \mathcal {T}_h } + \left\Vert \delta u\right\Vert _{ \omega } \left\Vert v_h\right\Vert _{ \omega } + \left\Vert \delta f \right\Vert _{ H^{-1}(\Omega ) } \left\Vert w_h\right\Vert _{ H^1(\Omega ) } \\{} & {} \quad \le C \left( h \left\Vert \delta f \right\Vert _{ \Omega } + \left\Vert \delta f \right\Vert _{ H^{-1}(\Omega ) } +\left\Vert \delta u\right\Vert _{ \omega } \right) \left( \left\Vert v_h\right\Vert _{V_h } + \left\Vert w_h\right\Vert _{ W_h } \right) \\{} & {} \quad \le C \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \left\Vert (v_h,w_h) \right\Vert _{s}, \end{aligned}$$

where Friedrichs inequality, see Eq. (39), has been employed. \(\square \)

With this lemma being established we can show the analogue of Theorem 8 for perturbed data. Our result is comparable with Burman et al. (2019, Theorem 3) for the Helmholtz equation obtained for \(p=1\). It is also comparable with Burman et al. (2021, Theorem 5.7) for the case of higher polynomial orders p except that the latter publication even controls the \(H^1\)-norm in B. This is out of scope here since it would require a conditional stability estimate which additionally controls the gradient of u, see Burman et al. (2021, Lemma 3.1).

Theorem 10

Let the subdomains \(\omega \) and B of \(\Omega \) be defined as in Corollary 3. Assume that \(u \in [H^{p+1}(\Omega )]^d\) is a solution to the unperturbed problem in Eqs. (1)–(3) and let \((u_h,z_h) \in V_h \times W_h\) be the solution of the perturbed problem in Eq. (43). Then there exist \(C>0\) and \(\tau \in (0,1)\) such that

$$\begin{aligned} \left\Vert u - u_h\right\Vert _{B} \le C h^{\tau p} \left( \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + h^{-p} \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) . \end{aligned}$$
(45)

Proof

Following the proof of Theorem 8, the residual can now be written as

$$\begin{aligned} \langle r,w \rangle = a_h(u_h-u,w - \Pi _h w) + s^{*}(z_h,\Pi _h w ) - s_{\beta }(u_h - u, \Pi _h w ) + (\delta f,\Pi _h w )_{ \Omega }. \end{aligned}$$

We estimate the first three terms similarly to the proof of Theorem 8, but now appealing to Lemma 9 instead of Lemma 7.

  • As in the proof of Theorem 8 we obtain

    $$\begin{aligned} a_h(u_h-u,w - \Pi _h w)&\le C \left\Vert u_h -u\right\Vert _{V_h} \left\Vert w\right\Vert _{ [H^1(\Omega )]^d} \\&\le C \left( \left\Vert \Pi _h u - u\right\Vert _{V_h} + \left\Vert u_h - \Pi _h u \right\Vert _{V_h} \right) \left\Vert w\right\Vert _{ [H^1(\Omega )]^d} \\&\le C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) \left\Vert w\right\Vert _{ [H^1(\Omega )]^d}, \end{aligned}$$

    where Lemma 9 and Eq. (34) have been employed.

  • Furthermore,

    $$\begin{aligned} s^{*}(z_h,\Pi _h w ){} & {} \le \left\Vert z_h\right\Vert _{W_h} \left\Vert \Pi _h w \right\Vert _{W_h} \\{} & {} \le C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) \left\Vert w\right\Vert _{ [H^{1}(\Omega )]^d }, \end{aligned}$$

    by invoking Lemma 9 again.

  • The term involving \(s_{\beta }\) is treated by using the inequality in Eq. (36) and then proceeding as above to estimate \(\left\Vert u -u_h\right\Vert _{V_h}\), i.e.

    $$\begin{aligned} s_{\beta }(u_h - u, \Pi _h w )&\le C \left\Vert u-u_h\right\Vert _{V_h} \left\Vert w\right\Vert _{ [H^1(\Omega )]^d} \\&\le C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) \left\Vert w\right\Vert _{ [H^1(\Omega )]^d}. \end{aligned}$$
  • The perturbation term is easily bounded by using Cauchy-Schwarz and the stability of the interpolation:

    $$\begin{aligned} (\delta f,\Pi _h w )_{ \Omega } \le \left\Vert \delta f \right\Vert _{ H^{-1}( \Omega ) } \left\Vert \Pi _h w \right\Vert _{ H^1(\Omega ) } \le C \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \left\Vert w \right\Vert _{ H^1(\Omega ) }. \end{aligned}$$

It follows that

$$\begin{aligned} \left\Vert r\right\Vert _{V_0^{\prime }} \le C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) . \end{aligned}$$

The conditional stability estimate from Corollary 3 therefore leads to the error estimate

$$\begin{aligned} \left\Vert u-u_h\right\Vert _{ B }&\le C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \delta ({\tilde{u}}_{\omega },{\tilde{f}}) + \left\Vert u-u_h\right\Vert _{ \omega } \right) ^{\tau } \\&\quad \times \left( h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \delta ({\tilde{u}}_{\omega },{\tilde{f}}) + \left\Vert u-u_h\right\Vert _{ \Omega } \right) ^{1-\tau }. \end{aligned}$$
  • From Lemma 9 and Eq. (31) we obtain

    $$\begin{aligned} \left\Vert u-u_h\right\Vert _{ \omega }&\le \left\Vert u- \Pi _h u\right\Vert _{ \omega } + \left\Vert \Pi _h u - u_h\right\Vert _{ \omega } \\&\le C \left( h^{p} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) . \end{aligned}$$
  • We also have

    $$\begin{aligned} \left\Vert u-u_h\right\Vert _{ \Omega }&\le \left\Vert u- \Pi _h u\right\Vert _{ \Omega } + \left\Vert u_h - \Pi _h u\right\Vert _{ \Omega } \\&\le C h^{p+1} \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + \left\Vert u_h - \Pi _h u\right\Vert _{ \Omega } \end{aligned}$$

    and by definition of \(s_{\alpha }(\cdot ,\cdot )\), see Eq. (22), and Lemma 9 it holds that

    $$\begin{aligned} \left\Vert u_h - \Pi _h u\right\Vert _{ \Omega }\le & {} C h^{-p} \left\Vert u_h - \Pi _h u\right\Vert _{V_h} \\\le & {} C \left( \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + h^{-p} \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) . \end{aligned}$$

It follows that

$$\begin{aligned} \left\Vert u{-}u_h\right\Vert _{ B }\le & {} C \left( h^p \left[ \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d } {+} h^{{-}p} \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right] \right) ^{\tau }\\{} & {} \times \left( \left\Vert u\right\Vert _{ [H^{p{+}1}(\Omega )]^d } {+} h^{{-}p} \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) ^{1{-}\tau } \\= & {} C h^{\tau p} \left( \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + h^{-p} \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right) . \end{aligned}$$

\(\square \)

Remark 2

Notice that if \(h < h_{\textrm{min}}\) for \(h_{\textrm{min}}:= (\delta ({\tilde{u}}_{\omega },{\tilde{f}}) / \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d })^{1/p} \) the data perturbation term in Eq. (45) dominates so that further refinement of the mesh will lead to poorer accuracy. Hence, refinement should be stopped at \(h =h_{\textrm{min}}\). Alternatively, the coefficient in front of the Tikhonov term in Eq. (22) can be bounded from below to ensure that the error stagnates for \(h < h_{\textrm{min}}\), as explained in detail in Burman et al. (2021, Remark 5.1). However, this also requires an estimate of \(\left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d }\) and the noise level.
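The threshold of Remark 2 is easy to evaluate once estimates of the noise level and of the solution norm are at hand. The sketch below uses hypothetical function names, sets the constant in Eq. (45) to one and treats \(\tau \) as a user-supplied value; the numbers are illustrative and not taken from the experiments.

```python
def h_min(delta: float, u_norm: float, p: int) -> float:
    """Mesh width below which the perturbation term in Eq. (45) dominates."""
    return (delta / u_norm) ** (1.0 / p)


def bound_terms(h: float, p: int, tau: float, u_norm: float, delta: float):
    """Return (approximation term, perturbation term) of Eq. (45) with C = 1."""
    approximation = h ** (tau * p) * u_norm
    perturbation = h ** (tau * p) * h ** (-p) * delta
    return approximation, perturbation


if __name__ == "__main__":
    delta, u_norm, p, tau = 1e-4, 1.0, 2, 0.5  # illustrative values only
    h_star = h_min(delta, u_norm, p)
    for h in (4 * h_star, 2 * h_star, h_star, h_star / 2):
        approximation, perturbation = bound_terms(h, p, tau, u_norm, delta)
        print(f"h = {h:.4f}: approximation {approximation:.3e}, perturbation {perturbation:.3e}")
```

At \(h = h_{\textrm{min}}\) both terms coincide, and for smaller h the perturbation term is the larger one.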

5 Numerical experiments

In this section we present a selection of numerical experiments to confirm the analytically derived error estimate of Theorem 10 and shed light on several additional aspects which exceed the scope of our present analysis. All experiments have been implemented using the open-source computing platform FEniCSx (Alnæs et al. 2014; Scroggs et al. 2022). To check our results we also implemented some of the numerical experiments in Netgen/NGSolve (Schöberl 1997, 2014) and observed good qualitative agreement. A Docker image containing all software and instructions to reproduce the numerical experiments shown in this paper can be obtained from the Zenodo repository: Burman and Preuss (2022).

In all the numerical experiments we will set \(\rho = k^2\) for a constant \(k>0\) representing the wavenumber. For the experiments in Sects. 5.1–5.3 we consider the following geometrical setup. Let \(\Omega = [0,1]^2\) be the unit square and let the measurement domain \(\omega \) and the target domain B be given by

$$\begin{aligned} \omega = \Omega \setminus [0.1,0.9] \times [0.25,1] \text { and } B = \Omega \setminus [0.1,0.9] \times [0.95,1]. \end{aligned}$$
(46)

These subdomains are displayed in Fig. 1. We consider a sequence of meshes which are obtained by successive refinements of an initial mesh which is shown in Fig. 1a. All these meshes fulfill our assumption (see Sect. 3.1) of being fitted to the subdomains.
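The sets in Eq. (46) can be encoded as point-membership tests, for instance to tag mesh cells. The following is a minimal sketch; the function names are our own, and treating the removed rectangles as closed is an arbitrary convention on the interfaces.

```python
def in_domain(x: float, y: float) -> bool:
    """Unit square Omega = [0, 1]^2."""
    return 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0


def in_removed_rectangle(x: float, y: float, y_low: float) -> bool:
    """Rectangle [0.1, 0.9] x [y_low, 1] that is cut out of Omega in Eq. (46)."""
    return 0.1 <= x <= 0.9 and y_low <= y <= 1.0


def in_omega(x: float, y: float) -> bool:
    """Measurement domain: Omega without [0.1, 0.9] x [0.25, 1]."""
    return in_domain(x, y) and not in_removed_rectangle(x, y, 0.25)


def in_B(x: float, y: float) -> bool:
    """Target domain: Omega without [0.1, 0.9] x [0.95, 1]."""
    return in_domain(x, y) and not in_removed_rectangle(x, y, 0.95)


if __name__ == "__main__":
    for point in [(0.5, 0.1), (0.5, 0.5), (0.5, 0.97), (0.05, 0.5)]:
        print(point, "omega:", in_omega(*point), "B:", in_B(*point))
```

Since the rectangle removed for B is contained in the one removed for \(\omega \), every point of \(\omega \) also lies in B.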

Fig. 1 Subdomains from Eq. (46) for the numerical experiments in Sects. 5.1–5.3

5.1 Tuning of stabilization parameters

Even though the convergence rates in Theorem 10 hold for any finite \(\gamma _1,\gamma _{\text {GLS}},\alpha >0\), optimizing the stabilizing parameters can have a significant impact on the quality of the obtained numerical solution and the stability of the linear systems. For simplicity we will set the non-essential stabilization parameters, i.e. \(\beta _j\) for \( j \ge 1 \) and \( \gamma _{j}\) for \( j \ge 2 \), to zero in all numerical experiments to follow except for Sect. 5.3 where their potential benefits are investigated. Here we will optimize for the remaining essential parameters.

Fig. 2 Relative \(L^2\)-error \(\left\Vert u-u_h\right\Vert _{ B } / \left\Vert u\right\Vert _{ B } \) in B for geometrical setup of Fig. 1 and oscillatory reference solution given in Eq. (48) for different stabilization parameters. For each of these plots only one penalty parameter has been varied, while the other parameters remain fixed. For the left column we set \(\gamma _{\text {GLS}}= 10^{-12}, \alpha = 10^{-3}\), for the middle column \(\gamma _1 = 10^{-5}/p^{3.5}, \alpha = 10^{-3} \) and for the right column \(\gamma _1 = \gamma _{\text {GLS}}= 10^{-5}/p^{3.5}\)

The Lamé coefficients for this experiment will be chosen as

$$\begin{aligned} \mu = 1 + \frac{1}{2} \sin (x) \sin (y), \qquad \lambda = 1.25 + \frac{1}{2} \cos (x) \cos (y). \end{aligned}$$
(47)

The right hand side f is manufactured so that the exact solution of the problem is given by

$$\begin{aligned} u(x,y) = \sin (k \pi x) \sin ( k \pi y) \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \end{aligned}$$
(48)
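For reference, the coefficients in Eq. (47) and the reference solution in Eq. (48) can be written as plain functions. This is an illustrative sketch with hypothetical names and is independent of the finite element implementation used for the experiments.

```python
import math


def mu(x: float, y: float) -> float:
    """Lame coefficient mu from Eq. (47)."""
    return 1.0 + 0.5 * math.sin(x) * math.sin(y)


def lam(x: float, y: float) -> float:
    """Lame coefficient lambda from Eq. (47)."""
    return 1.25 + 0.5 * math.cos(x) * math.cos(y)


def u_exact(x: float, y: float, k: float):
    """Reference solution from Eq. (48); both components are identical."""
    value = math.sin(k * math.pi * x) * math.sin(k * math.pi * y)
    return (value, value)


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    print("min mu on grid    :", min(mu(x, y) for x in grid for y in grid))
    print("min lambda on grid:", min(lam(x, y) for x in grid for y in grid))
    print("u at (0.25, 0.25), k = 2:", u_exact(0.25, 0.25, 2))
```

On the unit square both sine and cosine are nonnegative, so \(\mu \ge 1\) and \(\lambda \ge 1.25\) there. For integer k the reference solution vanishes on \(\partial \Omega \).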

The dependence of the relative errors \(\left\Vert u-u_h\right\Vert _{ B } / \left\Vert u\right\Vert _{ B } \) and the condition number of the system matrix on the penalty parameters is displayed in Fig. 2 for \(k=6\) on a fixed mesh which is obtained by two consecutive refinements of the initial mesh shown in Fig. 1a. Firstly, it can be noticed that the error in terms of \(\gamma _1\) behaves like a well with an approximate minimum at \(\gamma _1 = 10^{-5}/p^{3.5}\). The error invariably has to increase as \(\gamma _1\) goes to zero since we lose control over the condition number of the linear system. We now fix \(\gamma _1 = 10^{-5}/p^{3.5}\) and show the behavior of the error as \(\gamma _{\text {GLS}}\) varies in the central plot of Fig. 2. It seems that \(\gamma _{\text {GLS}}\) merely has to be chosen sufficiently small. However, let us mention that if \(\gamma _1\) were chosen smaller, e.g. \(\gamma _1=10^{-12}\), then the error would also exhibit a well-like structure similar to the one shown in the left column of Fig. 2. It therefore seems appropriate to set \(\gamma _{\text {GLS}}= \gamma _1\) from now on. The right plot of Fig. 2 shows that the Tikhonov parameter \(\alpha \) has almost no influence on the error or the condition number. We will set \(\alpha = 10^{-3}\) from now on.

Fig. 3 Condition number of the system matrix for \(k=6\) in terms of the number of degrees of freedom, respectively the mesh width

As shown in Fig. 2 the condition number of the linear systems is already very high on a moderately refined mesh and appears to scale unfavourably with p. This is investigated further in Fig. 3 which displays the condition number of the linear systems for our choice of penalty parameters under mesh refinement. An approximate scaling of \(\mathcal {O}(h^{-2.5-p})\) is observed. Hence, when using higher polynomial orders one is more likely to encounter ill-conditioning effects. This issue should be kept in mind when analyzing the numerical results on fine meshes, in particular for orders \(p>1\).
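A scaling exponent of this kind can be read off by a least-squares fit in log-log coordinates. The sketch below is generic and uses synthetic values generated from the law \(h^{-(2.5+p)}\) itself, not the data behind Fig. 3; the names are ours.

```python
import math


def loglog_slope(hs, values):
    """Least-squares slope of log(values) against log(hs)."""
    xs = [math.log(h) for h in hs]
    ys = [math.log(v) for v in values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Synthetic data only: condition numbers following c * h^{-(2.5 + p)} exactly.
p = 2
hs = [0.5**level for level in range(2, 7)]
synthetic_cond = [10.0 * h ** (-(2.5 + p)) for h in hs]

if __name__ == "__main__":
    print(f"fitted exponent for p = {p}: {loglog_slope(hs, synthetic_cond):.3f}")
```

With measured condition numbers in place of the synthetic list, the fitted slope gives the observed exponent; the prefactor does not influence it.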

5.2 Data perturbations

With the stabilization parameters determined as above let us now proceed to the numerical verification of Theorem 10. The exact solution, respectively the exact data \(u_{\omega }\) and f, will be chosen as in Sect. 5.1, but we will assume now that only perturbed data

$$\begin{aligned} {\tilde{u}}_{\omega }:= u_{\omega } + \delta u, \quad {\tilde{f}}:= f + \delta f \end{aligned}$$

with random perturbations

$$\begin{aligned} \left\Vert \delta u\right\Vert _{ \omega } = \mathcal {O}\left( h^{p-\theta }\right) , \quad \left\Vert \delta f\right\Vert _{ \Omega } = \mathcal {O}\left( h^{p-\theta }\right) , \end{aligned}$$

for some \(\theta \in {\mathbb {N}}_0\) is available for implementing our method. According to Theorem 10 we have the error bound

$$\begin{aligned} \left\Vert u - u_h\right\Vert _{ B } \le C h^{\tau p-\theta } \left( 1 + \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } \right) , \end{aligned}$$
(49)

which means that achieving convergence requires the condition \(\tau p-\theta > 0\).
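The step from Theorem 10 to Eq. (49) can be spelled out as follows. The sketch assumes \(h \le 1\) and uses the continuous embedding of \(L^2(\Omega )\) into \(H^{-1}(\Omega )\), that is \(\left\Vert \delta f \right\Vert _{H^{-1}(\Omega ) } \le C \left\Vert \delta f \right\Vert _{ \Omega }\).

$$\begin{aligned} \delta ({\tilde{u}}_{\omega },{\tilde{f}})&= \left\Vert \delta u\right\Vert _{ \omega } + h \left\Vert \delta f \right\Vert _{ \Omega } + \left\Vert \delta f \right\Vert _{H^{-1}(\Omega ) } \le C h^{p-\theta }, \\ h^{\tau p} \left( \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + h^{-p} \delta ({\tilde{u}}_{\omega },{\tilde{f}}) \right)&\le C h^{\tau p} \left( \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } + h^{-\theta } \right) \le C h^{\tau p-\theta } \left( 1 + \left\Vert u\right\Vert _{ [H^{p+1}(\Omega )]^d } \right) . \end{aligned}$$

The last inequality uses \(h^{\theta } \le 1\). Since \(\tau < 1\), the exponent \(\tau p - \theta \) is negative whenever \(\theta \ge p\).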

Fig. 4 Relative error \(\left\Vert u-u_h\right\Vert _{ B } / \left\Vert u\right\Vert _{ B } \) for geometrical setup shown in Fig. 1 in terms of the strength of the data perturbation

The relative errors for \(\theta =0\) for two different wavenumbers \(k=1\) and \(k=6\) are compared in the first row of Fig. 4. For both cases the observed convergence rates are consistent with Theorem 10. As one may expect, for a smooth reference solution higher polynomial orders deliver higher accuracy with fewer degrees of freedom compared to piecewise affine linear elements. Note that the errors are in general higher for \(k=6\) than for \(k=1\). The case \(p=3\) is a lucky exception. The dependence of the error on the wavenumber will be investigated more thoroughly in Sect. 5.3. Let us now turn to the discussion of the second row of Fig. 4 which displays the results for stronger perturbations. According to Eq. (49), we would expect the \(p=1\) method to diverge for \(\theta = 1\) which is confirmed by Fig. 4. The \(p=2\) method still converges, albeit at an extremely slow rate, whereas the \(p=3\) method at least manages to converge linearly. Based on this result it is consistent that for \(\theta =2\) convergence is no longer observed for any \(p \le 3\) as shown in the lower right plot of Fig. 4.
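Observed convergence rates of the kind discussed here are commonly computed from errors on two consecutive meshes. The following sketch shows this computation together with the exponent predicted by Eq. (49). The error values are synthetic and the value of \(\tau \) is hypothetical, since the analysis only guarantees \(\tau \in (0,1)\).

```python
import math


def observed_rates(hs, errors):
    """Pairwise rates log(e_i / e_{i+1}) / log(h_i / h_{i+1})."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(hs[i] / hs[i + 1])
        for i in range(len(errors) - 1)
    ]


def predicted_rate(p: int, tau: float, theta: int) -> float:
    """Exponent tau * p - theta from Eq. (49); tau has to be supplied by the user."""
    return tau * p - theta


# Synthetic errors following h^{tau p - theta} exactly, with a hypothetical tau = 0.5.
hs = [0.5**level for level in range(1, 6)]
synthetic_errors = [h ** predicted_rate(3, 0.5, 0) for h in hs]

if __name__ == "__main__":
    print("observed rates:", [round(rate, 3) for rate in observed_rates(hs, synthetic_errors)])
    print("predicted rate for p = 1, theta = 1:", predicted_rate(1, 0.5, 1))
```

A negative predicted exponent, as for \(p=1\) and \(\theta =1\) with any admissible \(\tau \), means that the bound in Eq. (49) does not guarantee convergence.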

5.3 Pollution error

In this section the dependence of the error on the wavenumber k will be investigated. To this end, we stick to the setup of the previous two subsections. Based on results for the CIP-FEM applied to well-posed Helmholtz equations available in the literature (Wu 2013; Zhu and Wu 2013; Du and Wu 2015; Zhou and Wu 2022), one would expect a scaling of

$$\begin{aligned} k \left\Vert u-u_h\right\Vert _{ B } + \left\Vert \nabla u- \nabla u_h\right\Vert _{ B } \sim k \end{aligned}$$
(50)

as k increases when \(kh < 1\) remains constant. To connect to these results, note that the method proposed in this article can also be applied in the setting in which boundary data is available on \(\partial \Omega \). Here we assume that Dirichlet data is given on the whole boundary, which we impose strongly (alternatively, a weak imposition following the technique of Nitsche could be used). We will refer to this as the “well-posed” problem to distinguish it from the “ill-posed” problem we usually consider in this paper.

Fig. 5

The weighted error \( k \left\Vert u-u_h\right\Vert _{ B } + \left\Vert \nabla u- \nabla u_h\right\Vert _{ B } \) under mesh refinement for constant kh using \(\beta _j=0\) for \( j \ge 1 \)

In Fig. 5 the weighted error in Eq. (50) is then displayed for these two different settings. For this experiment the case of unperturbed data was considered. For order \(p=1\) a linear scaling in k is observed regardless of whether the well-posed or ill-posed problem is considered. However, for \(p=2\) the error for the ill-posed problem grows significantly faster than linearly and appears to be unstable. This could possibly also be related to ill-conditioning of the linear systems, cf. Sect. 5.1. Similar results are observed for \(p=3\).
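Whether the weighted error of Eq. (50) scales linearly in k can be quantified by fitting an exponent \(\alpha \) in a model of the form \(C k^{\alpha }\). The following Python sketch shows one way to do this with a least-squares fit in log-log coordinates. The helper names `weighted_error` and `loglog_slope` are hypothetical, and the numbers used below are synthetic placeholders that only demonstrate the fit; they are not results from the experiments in this paper.

```python
import math


def weighted_error(k, l2_error, grad_error):
    """Weighted error k * ||u - u_h||_B + ||grad u - grad u_h||_B from Eq. (50)."""
    return k * l2_error + grad_error


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x), i.e. alpha in y ~ C * x**alpha."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# SYNTHETIC data for illustration only (not measurements from the paper).
ks = [1, 2, 4, 8]
linear_like = [0.3 * k for k in ks]            # mimics the expected scaling ~ k
superlinear_like = [0.3 * k ** 2.5 for k in ks]  # mimics faster-than-linear growth

print("fitted exponent (linear case):     ", loglog_slope(ks, linear_like))
print("fitted exponent (superlinear case):", loglog_slope(ks, superlinear_like))
```

A fitted exponent close to one corresponds to the scaling expected from the well-posed theory, whereas a clearly larger exponent indicates the faster-than-linear growth discussed above.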

Let us now investigate if this phenomenon can be mitigated by choosing \(\beta _j > 0\), which amounts to adding the stabilization term \(s_{\beta }\), see Eq. (19), to the Lagrangian. The results displayed in Fig. 6 show that by choosing \(\beta _2\) large enough a linear scaling of the weighted error for \(p=2\) can indeed be achieved. However, a comparison of the middle and right plot of Fig. 6 shows that this comes at the expense of increasing the overall error in the ill-posed case significantly, which is especially noticeable for lower wavenumbers. Note that this behavior does not appear in the well-posed case (the red lines in the middle and right plots of Fig. 6 are nearly identical), which has been run using exactly the same stabilization parameters. Similar results are obtained for \(\beta _j < 0\), i.e. only the magnitude of \(\beta _j\) matters. We conclude by recording that care has to be taken when applying higher order methods to the ill-posed elastodynamics problem as the wavenumber increases. Further research is required to find satisfactory remedies for this issue.

Fig. 6

The scaled error \( k \left\Vert u-u_h\right\Vert _{ B } + \left\Vert \nabla u- \nabla u_h\right\Vert _{ B } \) under mesh refinement for constant kh utilizing the additional stabilization term \(s_{\beta }\) defined in Eq. (19)

5.4 Influence of the geometry

It is well known, see e.g. Burman et al. (2019), Nechita (2020), Burman et al. (2021), that the geometry of the data and target sets has a major influence on the quality of the reconstruction outside the data domain. Roughly speaking, the best results can be expected if the target set B is part of the convex hull of the data set \(\omega \) as in the setup shown in Fig. 1. To increase the level of difficulty, let us now shrink the data set to

$$\begin{aligned} \omega = [0,0.1] \times [0,\xi ] \cup [0.9,1.0] \times [0,\xi ] \cup [0.1,0.9] \times [0,0.25] \end{aligned}$$
(51)

for \(\xi = 0.6\). This splits the target domain into two halves

$$\begin{aligned} B_-:= [0,1] \times [0,\xi ], \qquad B_+:= [0.1,0.9] \times [\xi ,0.95], \end{aligned}$$
(52)

where \(B_-\) is in the convex hull of \(\omega \) while \(B_+\) is not. A sketch of the geometrical setup is given in Fig. 7.
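The convexity statement can be checked directly from Eqs. (51)–(52). The Python sketch below builds the convex hull of the corner points of the three rectangles forming \(\omega \) (using Andrew's monotone chain algorithm, which is our choice for this illustration and not part of the method of this paper) and tests the midpoints of \(B_-\) and \(B_+\) for membership. All function names are hypothetical.

```python
XI = 0.6

# Rectangles (x0, x1, y0, y1) forming the data set omega in Eq. (51).
OMEGA_RECTS = [(0.0, 0.1, 0.0, XI), (0.9, 1.0, 0.0, XI), (0.1, 0.9, 0.0, 0.25)]
# Target subdomains from Eq. (52).
B_MINUS = (0.0, 1.0, 0.0, XI)
B_PLUS = (0.1, 0.9, XI, 0.95)


def corners(rect):
    x0, x1, y0, y1 = rect
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


def cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def in_convex_polygon(poly, q, tol=1e-12):
    """True if q lies inside or on the boundary of the CCW polygon poly."""
    n = len(poly)
    return all(cross(poly[i], poly[(i + 1) % n], q) >= -tol for i in range(n))


def midpoint(rect):
    x0, x1, y0, y1 = rect
    return (0.5 * (x0 + x1), 0.5 * (y0 + y1))


hull = convex_hull([c for r in OMEGA_RECTS for c in corners(r)])
print("hull vertices:", hull)
print("midpoint of B_- in hull:", in_convex_polygon(hull, midpoint(B_MINUS)))
print("midpoint of B_+ in hull:", in_convex_polygon(hull, midpoint(B_PLUS)))
```

The hull of \(\omega \) is the rectangle \([0,1] \times [0,\xi ]\), which coincides with \(B_-\), whereas \(B_+\) touches it only along the line \(y = \xi \).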

Fig. 7

The data and target domains for the geometry defined in Eqs. (51)–(52)

Let us consider constant coefficients \(\mu = 1\) and \(\lambda = 1.25\) throughout the entire domain and \(k=1\) to render the remainder of the problem as simple as possible. The relative errors (using unperturbed data) in the two subdomains \(B_{\pm }\) are displayed in Fig. 8 as solid lines. A stark contrast can be observed (note the different scaling of the vertical axes). While near-optimal rates of \(\mathcal {O}(h^p)\) are obtained in \(B_{-}\), we have to use \(p=3\) to reach linear convergence rates in \(B_{+}\). This is a clear indication that the conditional stability, in particular the value of the exponent \(\tau \) in Eq. (10), is very sensitive to the geometry of the sets \(\omega \) and B.

5.4.1 Adding additional information on divergence of wave displacement

Let us check if these results can be improved if more a priori information is provided. Now we will assume that not only u is given in \(\omega \) as data, but additionally \(\nabla \cdot u = q \) is available in the entire domain \(\Omega \). This basically means that the divergence part of the stress tensor \(\sigma (u)\) in Eq. (2) is known. The proposed method can easily be modified to cover this case by adding \(\frac{1}{2} \left\Vert \nabla \cdot u_h - q \right\Vert _{ \Omega }^2\) as an additional term to the Lagrangian in Eq. (13). The relative \(L^2\)-errors obtained for the same problem as above are displayed as dashed lines in Fig. 8. Even though a significant decrease in the absolute value of the errors is observed, the asymptotic convergence rates improve only marginally. Hence, additional information on the divergence apparently does not enhance the conditional stability of the problem.
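To spell out the modification, suppose the Lagrangian of Eq. (13) is written as \(L(u_h,z_h)\). This symbol, the dual variable \(z_h\) and the notation \(\tilde{L}\) are assumptions introduced here for illustration only. The augmented functional then reads

$$\begin{aligned} \tilde{L}(u_h,z_h):= L(u_h,z_h) + \frac{1}{2} \left\Vert \nabla \cdot u_h - q \right\Vert _{ \Omega }^2 . \end{aligned}$$

Differentiating the additional term with respect to \(u_h\) in a direction \(v_h\) gives

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d} t} \, \frac{1}{2} \left\Vert \nabla \cdot (u_h + t v_h) - q \right\Vert _{ \Omega }^2 \Big |_{t=0} = \left( \nabla \cdot u_h - q, \nabla \cdot v_h \right) _{\Omega }, \end{aligned}$$

so that, under these assumptions, the first-order optimality conditions acquire the symmetric term \(( \nabla \cdot u_h, \nabla \cdot v_h )_{\Omega }\) on the left-hand side and the data term \(( q, \nabla \cdot v_h )_{\Omega }\) on the right-hand side.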

Fig. 8

The solid lines display the relative \(L^2\)-errors in the two different parts of the target domain: \(B_{-}\) is contained in the convex hull of the data domain while \(B_{+}\) is outside of it. The dashed lines show the same quantities when additional information on \( \nabla \cdot u\) in \(\Omega \) is included in the Lagrangian

5.5 Jumping shear modulus

5.5.1 Jump in a plane

Being able to treat Lamé parameters that exhibit jump discontinuities is of particular interest in applications. For example, jumps of \(\mu \) and \(\lambda \) occur at positions of seismic discontinuities in the Earth’s mantle. To emulate this behavior in our toy problem, we introduce an artificial interface \(\Gamma := \{ (x,y) \in \Omega \mid y = \eta \}\) and consider a piecewise constant shear modulus

$$\begin{aligned} \mu = {\left\{ \begin{array}{ll} \mu _{+}, &{} \text {for } y > \eta , \\ \mu _{-}, &{} \text {for } y < \eta . \end{array}\right. } \end{aligned}$$
(53)

A weak solution has to fulfill the interface conditions

$$\begin{aligned} \llbracket u\rrbracket _{\Gamma } = 0; \quad \llbracket \sigma (u) \cdot {\textbf{n}}\rrbracket _{\Gamma } = 0, \quad \text {across } \Gamma . \end{aligned}$$
(54)

Let us denote \(\Omega _{+}:= \Omega \cap \{ y > \eta \}\) and \(\Omega _{-}:= \Omega \cap \{ y < \eta \}\).

Fig. 9

Relative errors for geometry shown in Fig. 7 for a shear modulus which jumps in the plane separating the subdomains \(B_{\pm }\). We consider \(k=4\) and measure the errors in the convex and non-convex part of the target domain separately. The solid lines display \(\left\Vert u-u_h\right\Vert _{ B_{-} } / \left\Vert u\right\Vert _{ B_{-} } \) while the dashed lines show \(\left\Vert u-u_h\right\Vert _{ B_{+} } / \left\Vert u\right\Vert _{ B_{+} } \)

We make the following ansatz for the wave displacement \(u^{+}\) in \(\Omega _+\) and \(u^{-}\) in \(\Omega _{-}\):

$$\begin{aligned} u^{+} = \begin{pmatrix} (a_1 + b_1 y + c_1 y^2) \sin (k \pi x) \\ (a_2 + b_2 y + c_2 y^2) \cos (k \pi x) \end{pmatrix}, \; u^{-} = \begin{pmatrix} \sin (k \pi x) \cos ( k \pi (y-\eta )) \\ \cos (k \pi x) \cos ( k \pi (y-\eta )) \end{pmatrix}.\nonumber \\ \end{aligned}$$
(55)

As shown in Section B, the interface conditions of Eq. (54) can be fulfilled by choosing:

$$\begin{aligned} \begin{aligned}&b_1 = 0, \qquad c_1 = \frac{k \pi }{2 \eta } \left[ \frac{\mu _+ - \mu _{-} }{ \mu _{+} } \right] , \qquad a_1 = 1 - c_1 \eta ^2, \\&b_2 = 1, \qquad c_2 = -\frac{1}{2 \eta }, \qquad a_2 = 1 - b_2 \eta - c_2 \eta ^2. \end{aligned} \end{aligned}$$
(56)
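The choice of coefficients in Eq. (56) can be sanity-checked numerically. The Python sketch below evaluates the ansatz of Eq. (55) on the interface \(y = \eta \) and measures the jumps of the displacement and of the traction \(\sigma (u) \cdot {\textbf{n}}\) with \({\textbf{n}} = (0,1)^T\) from Eq. (54). Derivatives are approximated by central finite differences, which is a simplification made for this illustration, and all function names are hypothetical. The parameter values \(k=4\), \(\eta =0.6\), \(\lambda = 1.25\) and \(\mu _{\pm } \in \{1,2\}\) are those used in the experiment of this section.

```python
import math


def coefficients(k, eta, mu_plus, mu_minus):
    """Coefficients (a1, b1, c1), (a2, b2, c2) as given in Eq. (56)."""
    b1 = 0.0
    c1 = k * math.pi / (2.0 * eta) * (mu_plus - mu_minus) / mu_plus
    a1 = 1.0 - c1 * eta ** 2
    b2 = 1.0
    c2 = -1.0 / (2.0 * eta)
    a2 = 1.0 - b2 * eta - c2 * eta ** 2
    return (a1, b1, c1), (a2, b2, c2)


def make_u_plus(k, coef1, coef2):
    (a1, b1, c1), (a2, b2, c2) = coef1, coef2

    def u(x, y):
        return ((a1 + b1 * y + c1 * y * y) * math.sin(k * math.pi * x),
                (a2 + b2 * y + c2 * y * y) * math.cos(k * math.pi * x))
    return u


def make_u_minus(k, eta):
    def u(x, y):
        c = math.cos(k * math.pi * (y - eta))
        return (math.sin(k * math.pi * x) * c, math.cos(k * math.pi * x) * c)
    return u


def traction(u, x, y, mu, lam, h=1e-5):
    """Approximate sigma(u) n for n = (0, 1) using central differences."""
    ux = [(p - m) / (2 * h) for p, m in zip(u(x + h, y), u(x - h, y))]
    uy = [(p - m) / (2 * h) for p, m in zip(u(x, y + h), u(x, y - h))]
    div = ux[0] + uy[1]
    return (mu * (uy[0] + ux[1]), 2 * mu * uy[1] + lam * div)


def interface_jumps(k, eta, mu_plus, mu_minus, lam, n=21):
    """Maximal jumps of u and of sigma(u) n over n sample points on y = eta."""
    coef1, coef2 = coefficients(k, eta, mu_plus, mu_minus)
    up, um = make_u_plus(k, coef1, coef2), make_u_minus(k, eta)
    jump_u = jump_t = 0.0
    for i in range(n):
        x = i / (n - 1)
        jump_u = max(jump_u, *(abs(a - b) for a, b in zip(up(x, eta), um(x, eta))))
        tp = traction(up, x, eta, mu_plus, lam)
        tm = traction(um, x, eta, mu_minus, lam)
        jump_t = max(jump_t, *(abs(a - b) for a, b in zip(tp, tm)))
    return jump_u, jump_t


for mu_p, mu_m in [(2.0, 1.0), (1.0, 2.0)]:
    print(mu_p, mu_m, interface_jumps(4, 0.6, mu_p, mu_m, 1.25))
```

Both jumps should vanish up to rounding and finite-difference error, which confirms that the ansatz yields an admissible weak solution across \(\Gamma \).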

For the numerical experiment we consider the geometry shown in Fig. 7 and set \(\eta = \xi = 0.6\) so that the jump occurs in the plane separating the subdomains \(B_{\pm }\) and is respected by the mesh. A contrast of about two between \(\mu _{+}\) and \(\mu _{-}\) is realistic for applications in seismology, see e.g. the reference Earth model of Dziewonski and Anderson (1981). We therefore consider \(\mu _{\pm } \in \{1,2 \}\) and let \(\lambda = 1.25\). The relative errors for unperturbed data are displayed in Fig. 9 for \(k=4\). The results are similar to those of Sect. 5.4, in which a globally constant shear modulus was considered: we achieve rates of nearly \(\mathcal {O}(h^p)\) in \(B_{-}\), whereas the method struggles in \(B_+\) but does not break down either. By comparing the left and the right plot in Fig. 9 we notice that doubling the value of \(\mu \) apparently leads to a reduction of the errors in the respective subdomain. This is reasonable since it basically amounts to halving the wavenumber in this subdomain. Overall, the presence of a jump appears to have little influence for the considered problem despite the fact that the theoretical error estimate given in Theorem 8 cannot be applied to this case due to insufficient regularity of the shear modulus.

Fig. 10

Relative \(L^2\)-errors \(\left\Vert u-u_h\right\Vert _{ B_{-} } / \left\Vert u\right\Vert _{ B_{-} } \) (solid) and \(\left\Vert u-u_h\right\Vert _{ B_{+} } / \left\Vert u\right\Vert _{ B_{+} } \) (dashed) for geometry from Fig. 10a. Here, the shear modulus is \(\mu _+\) in \(B_+ \cup B_-\) and \(\mu _-\) in the complement

5.5.2 Data only at bottom with jump between target domain and exterior

Finally, we consider an even more challenging setup shown in Fig. 10a to explore the limits of the proposed method. Now data \(\omega = [0,1] \times [0,0.25]\) is only available at the bottom of the domain. The shear modulus is set to \(\mu _e\) in the exterior and to \(\mu _i\) inside the target domain

$$\begin{aligned} B_- \cup B_+ = [x_L,x_R] \times [y_L,y_R] = [0.25,0.75] \times [0.25,0.9], \end{aligned}$$
(57)

which is separated by the plane \( \{ y = 0.6 \}\) into two halves. Note that in this example the entire target domain is situated outside the convex hull of the data set. However, \(B_-\) is closer to the data set than \(B_+\) which should aid the reconstruction. We set \(\lambda = 1.25\) and use the following reference solution

$$\begin{aligned} u = \zeta ^2 \begin{pmatrix} \cos (k \pi x) \sin (k \pi y) \\ \cos (k \pi x) \cos (k \pi y) \end{pmatrix} \text { in } B_- \cup B_+, \; u = \zeta ^2 \begin{pmatrix} \sin (k \pi x) \sin (k \pi y) \\ \sin (k \pi x) \cos (k \pi y) \end{pmatrix} \text { else, }\nonumber \\ \end{aligned}$$
(58)

for \(\zeta := (x-x_L)(x-x_R)(y-y_L)(y-y_R)\). The results shown in Fig. 10b for \(k=4\) are already so poor even without a jump that we decided to lower the wavenumber even further to \(k=1\) to investigate whether the effect of a jump can be detected. However, the results for \(k=1\) shown in Fig. 10c for different combinations of \(\mu _i\) and \(\mu _e\) do not provide evidence that the presence of a jump is of significant importance here. Instead, the major variable appears to be the distance between the data and target domain, which accounts for the observation that the errors are about two orders of magnitude larger in \(B_+\) than in \(B_-\). Another important factor is the size of the shear modulus in the target domain, as already observed in Sect. 5.5.1. Finally, we remark that our method also performed well in several further setups featuring jump discontinuities in the shear modulus not shown in this article.
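A plausible reading of the prefactor \(\zeta ^2\) in Eq. (58), which is our interpretation and not stated explicitly above, is that it forces both u and its gradient to vanish on the boundary of \(B_- \cup B_+\), so that interface conditions of the type in Eq. (54) hold there for any choice of \(\mu _i\) and \(\mu _e\). The Python sketch below checks this numerically for both branches of Eq. (58) using central finite differences; the function names are hypothetical.

```python
import math

# Target domain from Eq. (57).
X_L, X_R, Y_L, Y_R = 0.25, 0.75, 0.25, 0.9


def zeta(x, y):
    return (x - X_L) * (x - X_R) * (y - Y_L) * (y - Y_R)


def u_inside(x, y, k):
    """Branch of Eq. (58) used in the target domain."""
    z2 = zeta(x, y) ** 2
    cx = math.cos(k * math.pi * x)
    return (z2 * cx * math.sin(k * math.pi * y), z2 * cx * math.cos(k * math.pi * y))


def u_outside(x, y, k):
    """Branch of Eq. (58) used in the complement."""
    z2 = zeta(x, y) ** 2
    sx = math.sin(k * math.pi * x)
    return (z2 * sx * math.sin(k * math.pi * y), z2 * sx * math.cos(k * math.pi * y))


def max_abs_gradient(u, x, y, k, h=1e-6):
    """Largest entry (in modulus) of the finite-difference gradient of u."""
    dx = [(p - m) / (2 * h) for p, m in zip(u(x + h, y, k), u(x - h, y, k))]
    dy = [(p - m) / (2 * h) for p, m in zip(u(x, y + h, k), u(x, y - h, k))]
    return max(abs(v) for v in dx + dy)


def boundary_points(n=9):
    """Sample points on the four edges of the rectangle in Eq. (57)."""
    pts = []
    for i in range(n):
        t = i / (n - 1)
        x = X_L + t * (X_R - X_L)
        y = Y_L + t * (Y_R - Y_L)
        pts += [(x, Y_L), (x, Y_R), (X_L, y), (X_R, y)]
    return pts


def max_trace(k):
    """Maximal |u| and |grad u| over the boundary samples, for both branches."""
    max_u = max_g = 0.0
    for (x, y) in boundary_points():
        for u in (u_inside, u_outside):
            max_u = max(max_u, *(abs(v) for v in u(x, y, k)))
            max_g = max(max_g, max_abs_gradient(u, x, y, k))
    return max_u, max_g


print(max_trace(1))
print(max_trace(4))
```

Since both branches and their gradients vanish on the interface, the displacement and the traction are continuous there irrespective of the contrast between \(\mu _i\) and \(\mu _e\).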

6 Conclusion

In this paper we presented a high order stabilized finite element method for unique continuation subject to the Lamé system. The method proceeds by first formulating the data assimilation problem as an ill-posed minimization problem at the discrete level and then adding carefully chosen stabilization terms to enhance numerical stability without leading to an exaggerated perturbation of the solution. Convergence rates have been derived and verified in numerical experiments. It turned out that higher order polynomial degrees in the FEM can on the one hand improve efficiency, but on the other hand are at greater risk of suffering from ill-conditioning effects. We have also observed numerically that the geometry of the data and target domains plays a crucial role for the conditional stability of the problem, which suggests that the wavenumber-explicit convergence results proven in Burman et al. (2019), Nechita (2020) for the constant coefficient Helmholtz equation under specific convexity assumptions on the geometry may extend to the Lamé system of elastodynamics. Moreover, in our numerical experiments the geometry appeared to be of far greater importance than the regularity of the Lamé coefficients, as our good numerical results for a discontinuous shear modulus suggest. However, this point deserves further investigation since it is of course possible that we simply failed to trigger a problematic behavior in our limited set of experiments.

Another interesting direction for future research could be to extend our methodology to the reconstruction of the Lamé parameters from measurements of the wave displacement in the interior of \(\Omega \). Due to its relevance in practical applications, this problem has already attracted significant research interest, see e.g. McLaughlin et al. (2010), Lechleiter and Schlasche (2017), Davies et al. (2019). Note also that the potential of the augmented Lagrangian method (on which our approach is to some extent based) for parameter identification problems is well-established, see Ito and Kunisch (1990), Chan and Tai (2003).