1 Introduction

1.1 Overview

The design of a pointwise symmetric stress approximation \(\sigma _h\in L^2(\varOmega ;{\mathbb {S}})\) with divergence in \(L^2(\varOmega ;{\mathbb {R}}^d)\), written \(\sigma _h \in H({\text {div}},\varOmega ; \mathbb S)\), has been a long-standing challenge [2] and the first positive examples in [5] initiated what nowadays is called the finite element exterior calculus [4]. The a posteriori error analysis of mixed finite element methods in elasticity started with [11] on PEERS [3], where the asymmetric stress approximation \(\gamma _h\) arises in the discretization as a Lagrange multiplier to enforce weakly the stress symmetry. This allows the treatment of the term \({\mathbb {C}}^{-1} \sigma _h + \gamma _h\) as an approximation of the (nonsymmetric) functional matrix Du for the displacement field [11] with the arguments of [1, 9] developed for mixed finite element schemes for a Poisson model problem. Here and throughout, \({\mathbb {C}}\) denotes a fourth-order elasticity tensor with two Lamé constants \(\lambda \) and \(\mu \) and \({\mathbb {C}}^{-1}\) is its inverse. Mixed finite element methods appear attractive in the incompressible limit for they typically avoid the locking phenomenon [12] as \(\lambda \rightarrow \infty \).

For mixed finite element methods like the symmetric Arnold–Winther finite element schemes [5], the subtle term is the nonconforming residual: Given any piecewise polynomial \(\sigma _h \in H({\text {div}},\varOmega ; {\mathbb {S}})\), compute an upper bound \(\eta ({\mathscr {T}},\sigma _h)\) of

$$\begin{aligned} \inf _{v\in V} \left\| {\mathbb {C}}^{-1/2} \sigma _h-{\mathbb {C}}^{1/2}\varepsilon (v) \right\| _{L^2(\varOmega )} \lesssim \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

Despite general results in this direction [10, 17, 18], this task had been addressed only by the computation of an approximation to the optimal v with Green strain \(\varepsilon (v):={\text {sym}}D v\) or of some skew-symmetric approximation \(\gamma _h\) motivated from the first results in [11] on PEERS. In fact, any choice of a piecewise smooth and pointwise skew-symmetric \(\gamma _h\) allows for an a posteriori error control of the symmetric stress error \(\sigma -\sigma _h\) in [15]. Its efficiency, however, depends on the (unknown and uncontrolled) efficiency of the choice of \(\gamma _h\) as an approximation to the skew-symmetric part \(\gamma \) of Du.

This paper proposes the first reliable and efficient explicit residual-based a posteriori error estimator of the nonconforming residual with the typical contributions to \(\eta ({\mathscr {T}},\sigma _h)\) computed from the (known) Green strain approximation \(\varvec{\varepsilon }_h:= {\mathbb {C}}^{-1} \sigma _h\). Besides oscillations of the applied forces in the volume and along the Neumann boundary, there is a volume contribution \(h_T^2\Vert {\text {rot}}{\text {rot}}\varvec{\varepsilon }_h\Vert _{L^2(T)}\) for each triangle \(T\in {\mathscr {T}}\) and an edge contribution with the jump \([\varvec{\varepsilon }_h]_E\) across an interior edge E with unit normal \(\nu _E\), tangential unit vector \(\tau _E\), and length \(h_E\), namely

$$\begin{aligned} h_E^{1/2} \left\| \tau _E \cdot [\varvec{\varepsilon }_h]_E \tau _E \right\| _{L^2(E)} + h_E^{3/2} \left\| \tau _E \cdot [{\text {rot}}_{NC} \varvec{\varepsilon }_h]_E - \partial ( \nu _E \cdot [\varvec{\varepsilon }_h]_E \tau _E)/\partial s \right\| _{L^2(E)}, \end{aligned}$$

and corresponding modifications on the edges on the Dirichlet boundary with the (possibly inhomogeneous) Dirichlet data; cf. Remark 2 for some partial simplification of the last term displayed.

The analysis is restricted to the two-dimensional case, since it involves explicit calculations in two dimensions without any reference to the exterior calculus but with inhomogeneous Dirichlet and Neumann boundary data. The main result is reliability and efficiency to control the stress error robustly in the sense that the multiplicative generic constants hidden in the notation \(\lesssim \) depend neither on the (local or global) mesh-size nor on the parameter \(\lambda >0\) but may depend on \(\mu >0\) and on the shape regularity of the underlying triangulation \({\mathscr {T}}\) of the domain \(\varOmega \) into triangles through a lower bound of the minimal angle therein.

1.2 Linear elastic model problem

The elastic body \(\varOmega \) is a simply-connected bounded Lipschitz domain \(\varOmega \subset {\mathbb {R}}^2\) in the plane with a (connected) polygonal boundary \(\partial \varOmega = \varGamma _D\cup \varGamma _N\) split into parts. The displacement boundary \(\varGamma _D\) is compact and of positive surface measure, while the traction boundary is the relatively open complement \(\varGamma _N=\partial \varOmega \backslash \varGamma _D\) with outer unit normal vector \(\nu \). Given \(u_D\in H^1(\varOmega ;{\mathbb {R}}^2)\), the volume force \(f\in L^2( \varOmega ; {\mathbb {R}}^2)\), and the applied surface traction \(g \in L^2( \varGamma _N; {\mathbb {R}}^2)\), the linear elastic problem seeks a displacement \(u\in H^1(\varOmega ;{\mathbb {R}}^2)\) and a symmetric stress tensor \(\sigma \in H({\text {div}},\varOmega ; {\mathbb {S}})\) with

$$\begin{aligned} \begin{aligned}&\displaystyle -{\text {div}}\sigma = f \quad \text {and}\quad \sigma = {\mathbb {C}}\varepsilon (u)\quad \text{ in } \varOmega ,\\&\displaystyle u = u_D\quad \text{ on } \varGamma _D, \qquad \sigma \nu = g\quad \text{ on } \varGamma _N. \end{aligned} \end{aligned}$$
(1)

Throughout this paper, given the Lamé parameters \(\lambda ,\mu >0\) for isotropic linear elasticity, the positive definite fourth-order elasticity tensor \({\mathbb {C}}\) acts as \({\mathbb {C}}E:=2\mu \, E+ \lambda \,{\text {tr}}(E)\,1_{2\times 2} \) on any matrix \(E\in {\mathbb {S}}\) with trace \({\text {tr}}(E)\) and the \(2\times 2\) unit matrix \(1_{2\times 2} \). Note that \(u_D\) acts in (1) only on \(\varGamma _D\) and is an extension of the continuous function \(u_D\in C(\varGamma _D;{\mathbb {R}}^2)\) also supposed to belong to the edgewise second order Sobolev space \( H^2( {\mathscr {E}}(\varGamma _D))\) below to allow second derivatives with respect to the arc length along boundary edges.
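The action of \({\mathbb {C}}\) and of its inverse can be made concrete with a minimal sketch. The closed form of \({\mathbb {C}}^{-1}\) used below is not stated in the text; it follows from taking the trace of \({\mathbb {C}}E\) in two dimensions, \({\text {tr}}({\mathbb {C}}E)=2(\lambda +\mu )\,{\text {tr}}(E)\). The function names `apply_C` and `apply_C_inv` are hypothetical and serve for illustration only.

```python
def apply_C(E, lam, mu):
    """Return C E = 2*mu*E + lam*tr(E)*I for a symmetric 2x2 matrix E (nested lists)."""
    tr = E[0][0] + E[1][1]
    return [[2.0 * mu * E[i][j] + (lam * tr if i == j else 0.0)
             for j in range(2)] for i in range(2)]


def apply_C_inv(S, lam, mu):
    """Return C^{-1} S = (S - lam/(2*(lam+mu)) * tr(S) * I) / (2*mu) in two dimensions.

    This inverse is derived from tr(C E) = 2*(lam+mu)*tr(E); it is an
    assumption-free consequence of the definition of C, but not quoted from the text.
    """
    tr = S[0][0] + S[1][1]
    shift = lam / (2.0 * (lam + mu)) * tr
    return [[(S[i][j] - (shift if i == j else 0.0)) / (2.0 * mu)
             for j in range(2)] for i in range(2)]


# Round trip for a sample symmetric matrix and a nearly incompressible material.
E = [[1.0, 0.5], [0.5, -2.0]]
E_back = apply_C_inv(apply_C(E, lam=1.0e4, mu=1.0), lam=1.0e4, mu=1.0)
```

In this form the trace part of \({\mathbb {C}}^{-1}\sigma \) is scaled by \(1/(2(\lambda +\mu ))\) and the deviatoric part by \(1/(2\mu )\), so \({\mathbb {C}}^{-1}\) stays bounded as \(\lambda \rightarrow \infty \).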

More essential will be a discussion on the precise conditions on the Neumann data g and its discrete approximation \(g_h\) below for they belong to the essential boundary conditions in the mixed finite element method based on the dual formulation.

In addition to the set of homogeneous displacements V and the aforementioned stress space \(H({\text {div}},\varOmega ;{\mathbb {S}}) \), namely,

$$\begin{aligned} V&:=\{v\in H^1(\varOmega ;{\mathbb {R}}^2)\ \big \vert \ v|_{\varGamma _D}=0\},\\ H({\text {div}},\varOmega ;{\mathbb {S}})&:= \{ \tau \in L^2(\varOmega ;{\mathbb {S}}) \ \big \vert \ {\text {div}}\tau \in L^2(\varOmega ;{\mathbb {R}}^2)\}, \end{aligned}$$

and with the exterior unit normal vector \(\nu \) along \(\partial \varOmega \), the inhomogeneous stress space

$$\begin{aligned} \varSigma (g) := \left\{ \sigma \in H({\text {div}},\varOmega ;{\mathbb {S}})\ \big \vert \int _{\varGamma _N}\psi \cdot (\sigma \nu )\;ds = \int _{\varGamma _N}\psi \cdot g\;ds \text { for all } \psi \in V \right\} \end{aligned}$$

is defined with respect to the Neumann data \(g\in L^2(\varGamma _N)\) and, in particular, \( \varSigma _0 :=\varSigma (0)\) abbreviates the stress space with homogeneous Neumann boundary conditions.

Given data \(u_D, f,g\) as before, the dual weak formulation of (1) seeks \((\sigma ,u)\in \varSigma (g)\times L^2(\varOmega ;{\mathbb {R}}^2)\) with

$$\begin{aligned} \begin{aligned} \int _\varOmega \sigma :{\mathbb {C}}^{-1}\tau \;dx + \int _\varOmega u\cdot {\text {div}}\tau \;dx&= \int _{\varGamma _D} u_D\cdot (\tau \nu )\,ds\quad \text {for all }\tau \in \varSigma _0,\\ \int _\varOmega v\cdot {\text {div}}\sigma \;dx&= -\int _\varOmega f\cdot v\,dx\quad \text {for all } v\in L^2(\varOmega ;{\mathbb {R}}^2). \end{aligned} \end{aligned}$$
(2)

It is well known that the two formulations are equivalent and well posed in the sense that they allow for unique solutions in the above spaces; the solutions are actually slightly more regular according to the reduced elliptic regularity theory. The reader is referred to textbooks on finite element methods [6,7,8] for proofs and further descriptions.

Throughout this paper, the model problem considers truly mixed boundary conditions with the hypothesis that both \(\varGamma _D\) and \(\varGamma _N\) have positive length. The remaining cases of a pure Neumann problem or a pure Dirichlet problem require only standard modifications, to which the analysis adapts immediately. The presentation focuses on the case of isotropic linear elasticity with constant Lamé parameters \(\lambda \) and \(\mu \) for brevity; many results carry over to more general situations (cf. Remarks 1 and 2 for instance).

1.3 Mixed finite element discretization

Let \({\mathscr {T}}\) denote a shape-regular triangulation of \(\varOmega \) into triangles (in the sense of Ciarlet [8]) with set of nodes \({\mathscr {N}}\), set of interior edges \({\mathscr {E}}(\varOmega )\), set of Dirichlet edges \({\mathscr {E}}(\varGamma _D)\) and set of Neumann edges \({\mathscr {E}}(\varGamma _N)\). The triangulation is compatible with the boundary pieces \(\varGamma _D\) and \(\varGamma _N\) in that the boundary condition changes only at vertices in \({\mathscr {N}}\) and \(\varGamma _D\) (resp. \(\overline{\varGamma _N}\)) is partitioned into \({\mathscr {E}}(\varGamma _D)\) (resp. \({\mathscr {E}}(\varGamma _N)\)).

The piecewise polynomials (piecewise with respect to the triangulation \({\mathscr {T}}\)) of total degree at most \(k\in {\mathbb {N}}_0\) are denoted as \(P_k({\mathscr {T}})\), their vector- or matrix-valued versions as \(P_k({\mathscr {T}};{\mathbb {R}}^2)\) or \(P_k({\mathscr {T}};{\mathbb {R}}^{2\times 2})\) etc. The subordinated Arnold–Winther finite element space \( AW _k({\mathscr {T}})\) of index \(k\in {\mathbb {N}}\) [5] reads

$$\begin{aligned} { AW }_k({\mathscr {T}}):=\left\{ \tau \in P_{k+2}({\mathscr {T}};{\mathbb {S}})\cap H({\text {div}},\varOmega ;{\mathbb {S}})\, \big |\,{\text {div}}\tau \in P_k({\mathscr {T}};{\mathbb {R}}^2)\right\} . \end{aligned}$$
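As a small illustration of this definition, the number of local shape functions on a single triangle can be counted directly. The sketch below assumes that the divergence maps \(P_{k+2}(T;{\mathbb {S}})\) onto \(P_{k+1}(T;{\mathbb {R}}^2)\), so that the constraint \({\text {div}}\tau \in P_k(T;{\mathbb {R}}^2)\) removes exactly \(\dim P_{k+1}(T;{\mathbb {R}}^2)-\dim P_k(T;{\mathbb {R}}^2)\) degrees of freedom. The helper names are hypothetical, and the count concerns one element only, not the global space.

```python
def dim_P(k, components=1):
    """Dimension of polynomials of total degree <= k in two variables, times the number of components."""
    return components * (k + 1) * (k + 2) // 2


def local_dim_AW(k):
    """Local dimension of the Arnold-Winther shape function space of index k on one triangle.

    Counting sketch: symmetric 2x2 matrices have 3 independent entries, and the
    divergence constraint removes the homogeneous part of degree k+1 in both components.
    """
    full = dim_P(k + 2, components=3)                              # P_{k+2}(T; S)
    constraints = dim_P(k + 1, components=2) - dim_P(k, components=2)
    return full - constraints


lowest_order = local_dim_AW(1)
```

For the lowest-order case \(k=1\) this count gives 24 local degrees of freedom.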

The Neumann boundary conditions are essential conditions and are traditionally implemented by some approximation \(g_h\) to g in the normal trace space

$$\begin{aligned} G({\mathscr {T}}):=\left\{ (\tau _h \nu ) |_{\varGamma _N} \in L^2(\varGamma _N;{\mathbb {R}}^2) \,\big |\, \tau _h\in AW _k({\mathscr {T}}) \right\} \end{aligned}$$

(recall that \(\nu \) is the exterior unit normal along the boundary). Given any \(g_h\in G({\mathscr {T}})\), the discrete stress approximations are sought in the non-void affine subspace

$$\begin{aligned} \varSigma (g_h,{\mathscr {T}}):=\varSigma (g_h) \cap AW _k({\mathscr {T}}) \end{aligned}$$

of \( AW _k({\mathscr {T}})\) with test functions in the linear subspace \(\varSigma (0,{\mathscr {T}}):=\varSigma _0\,\cap \, AW _k({\mathscr {T}})\). Then there exists a unique discrete solution \(\sigma _h\in \varSigma (g_h,{\mathscr {T}})\) and \( u_h\in V_h:= P_k({\mathscr {T}};{\mathbb {R}}^2)\) to

$$\begin{aligned} \begin{aligned} \int _\varOmega \sigma _h :{\mathbb {C}}^{-1}\tau _h\,dx+\int _\varOmega u_h\cdot {\text {div}}\tau _h\,dx&= \int _{\varGamma _D} u_D \cdot (\tau _h\nu )\, ds \quad \text {for all } \tau _h\in \varSigma (0,{\mathscr {T}}),\\ \int _\varOmega v_{h}\cdot {\text {div}}\sigma _h\, dx&= -\int _\varOmega f\cdot v_{h}\, dx\quad \text { for all } v_{h}\in V_h. \end{aligned} \end{aligned}$$
(3)

The explicit design of a Fortin projection leads in [5] to quasi-optimal a priori error estimates for an exact solution \((\sigma , u)\in (\varSigma (g)\cap H^{k+2}(\varOmega ;{\mathbb {S}})) \times H^{k+2}(\varOmega )\) to (1) and the approximate solution \((\sigma _h, u_h)\) to (3), namely (with the maximal mesh-size h)

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{L^2(\varOmega )}&\lesssim h^{m}\Vert \sigma \Vert _{H^m(\varOmega )}\quad \text{ for } 1\le m \le k+2,\\ \Vert u- u_h\Vert _{L^2(\varOmega )}&\lesssim h^{m}\Vert u\Vert _{H^{m+1}(\varOmega )}\quad \text{ for } 1\le m \le k+1. \end{aligned}$$

Another stable pair of different and mesh-dependent norms in [14] implies the \(L^2\) best approximation of the stress error \(\sigma -\sigma _h\) up to a generic multiplicative constant and data oscillations on f under some extra condition (N) on the Neumann data approximation \(g_h\) implied by the first and zero moment orthogonality assumption \(g-g_h\perp P_1( {\mathscr {E}}(\varGamma _N);{\mathbb {R}}^2)\) (\(\perp \) indicates orthogonality in \(L^2(\varGamma _N)\)) met in all the numerical examples of this paper.

For simple benchmark examples with piecewise polynomial data f and g, there is even a superconvergence phenomenon visible in numerical examples. The arguments of this paper allow a proof of fourth-order convergence of the \(L^2\) stress error \(\Vert \sigma -\sigma _h\Vert ={\mathscr {O}}( h^4 )\) in the lowest-order Arnold–Winther method with \(k=1\) for a smooth stress \(\sigma \in H^4(\varOmega ;{\mathbb {S}})\) with \(f=f_h\in P_1({\mathscr {T}};{\mathbb {R}}^2)\) and \(g=g_h\in G({\mathscr {T}})\). (In fact, once the data are not piecewise affine, the arising oscillation terms are only of at most third order and the aforementioned convergence estimates are sharp.)

This is stated as Theorem 5 in the appendix, because the a priori error analysis lies outside of the main focus of this work. It is surprising though that adaptive mesh-refining suggested with this paper recovers this higher convergence rate even for the inconsistent Neumann data in the Cook membrane benchmark example below.
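A convergence rate such as the fourth order mentioned above is typically read off from errors on a sequence of meshes through the empirical order of convergence. The following minimal sketch shows that computation. The error values are synthetic, generated from the model \(e=0.3\,h^4\) purely for illustration, and are not results of this paper; the function name is hypothetical.

```python
import math


def convergence_orders(mesh_sizes, errors):
    """Empirical orders log(e_i / e_{i+1}) / log(h_i / h_{i+1}) between consecutive meshes."""
    return [math.log(errors[i] / errors[i + 1]) / math.log(mesh_sizes[i] / mesh_sizes[i + 1])
            for i in range(len(errors) - 1)]


# Synthetic data (NOT measurements): errors generated from the model e = 0.3 * h**4.
mesh_sizes = [0.5, 0.25, 0.125]
synthetic_errors = [0.3 * h ** 4 for h in mesh_sizes]
orders = convergence_orders(mesh_sizes, synthetic_errors)
```

For adaptively refined meshes the maximal mesh-size is usually replaced by a power of the number of degrees of freedom; that variant is not shown here.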

1.4 Explicit residual-based a posteriori error estimator

The novel explicit residual-based error estimator for the discrete solution \((\sigma _h, u_h)\) to (3) depends only on the Green strain approximation \({\mathbb {C}}^{-1}\sigma _h\) and its piecewise derivatives and jumps across edges.

Given any edge E of length \(h_E\), let \(\nu _E\) denote the unit normal vector (chosen with a fixed orientation such that it points outside along the boundary \(\partial \varOmega \) of \(\varOmega \)) and let \(\tau _E\) denote its tangential unit vector; by convention \(\tau _E = (0,-1; 1,0) \nu _E\) with the indicated asymmetric \(2\times 2\) matrix. The tangential derivative \(\tau _E\cdot \nabla \bullet \) along an edge (or boundary) is identified with the one-dimensional derivative \(\partial \bullet /\partial s\) with respect to the arc-length parameter s. The jump \([v]_E\) of any piecewise continuous scalar, vector, or matrix v across an interior edge \(E = \partial T_+\cap \partial T_-\) shared by the two triangles \(T_+\) and \(T_-\) such that \(\nu _E\) points outside \(T_+\) along \(E\subset \partial T_+\) reads

$$\begin{aligned}{}[v]_E := (v|_{T_+})|_E -(v|_{T_-})|_E. \end{aligned}$$

The rotation acts on a vector field \(\varPhi \) (and row-wise on matrices) via \({\text {rot}}\varPhi := \partial _1 \varPhi _2 - \partial _2 \varPhi _1\) and \({\text {rot}}_{NC}\) denotes its piecewise application.
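As a brief illustration (a direct computation from the definition above, included for orientation only), two row-wise rotations of a smooth symmetric matrix field \(\varvec{\varepsilon }\) give

$$\begin{aligned} {\text {rot}}{\text {rot}}\varvec{\varepsilon } = \partial _1 \left( \partial _1 \varvec{\varepsilon }_{22} - \partial _2 \varvec{\varepsilon }_{21}\right) - \partial _2 \left( \partial _1 \varvec{\varepsilon }_{12} - \partial _2 \varvec{\varepsilon }_{11}\right) = \partial _1^2 \varvec{\varepsilon }_{22} - 2\,\partial _1\partial _2 \varvec{\varepsilon }_{12} + \partial _2^2 \varvec{\varepsilon }_{11}. \end{aligned}$$

For a strain \(\varvec{\varepsilon }=\varepsilon (v)={\text {sym}}D v\) of a smooth displacement v, the entries \(\varvec{\varepsilon }_{11}=\partial _1 v_1\), \(\varvec{\varepsilon }_{22}=\partial _2 v_2\), and \(\varvec{\varepsilon }_{12}=(\partial _1 v_2+\partial _2 v_1)/2\) lead to

$$\begin{aligned} {\text {rot}}{\text {rot}}\varepsilon (v) = \partial _1^2\partial _2 v_2 - \partial _1\partial _2\left( \partial _1 v_2+\partial _2 v_1\right) + \partial _2^2\partial _1 v_1 = 0. \end{aligned}$$

This is the classical compatibility condition for planar strains and suggests why a term with \({\text {rot}}{\text {rot}}\) of a strain approximation can serve as a measure of its nonconformity.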

Under the present notation and with the abbreviation \(\varvec{\varepsilon }_h:={\mathbb {C}}^{-1}\sigma _h\) used throughout, the explicit residual-based a posteriori error estimator reads

$$\begin{aligned}&\eta ^2({\mathscr {T}},\sigma _h)\nonumber \\&\quad :=\sum _{T\in {\mathscr {T}}}h_T^4 \Vert {\text {rot}}{\text {rot}}\varvec{\varepsilon }_h\Vert _{L^2(T)}^2 +{\text {osc}}^2(f,{\mathscr {T}}) + {\text {osc}}^2(g-g_h,{\mathscr {E}}(\varGamma _N)) \nonumber \\&\qquad +\displaystyle \sum _{E\in {\mathscr {E}}(\varOmega )} \left( h_E \Vert \tau _E \cdot [\varvec{\varepsilon }_h]_E \tau _E \Vert _{L^2(E)}^2 + h_E^3 \Vert \tau _E\cdot \left( [{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E - \frac{ \partial [\varvec{\varepsilon }_h]_E \nu _E }{\partial s }\right) \Vert _{L^2(E)}^2 \right) \nonumber \\&\qquad +\displaystyle \sum _{E\in {\mathscr {E}}(\varGamma _D)} \left( h_E\Vert \tau _E \cdot \left( \varvec{\varepsilon }_h\tau _E - \frac{ \partial u_D}{\partial s}\right) \Vert _{L^2(E)}^2 +\, h_E^3 \Vert \tau _E\cdot {\text {rot}}\varvec{\varepsilon }_h- \nu _E\cdot \left( \frac{ \partial \varvec{\varepsilon }_h\tau _E }{\partial s } -\frac{ \partial ^2 u_D}{ \partial s^2}\right) \Vert _{L^2(E)}^2\right) \nonumber \\ \end{aligned}$$
(4)

for the oscillations \({\text {osc}}(f,{\mathscr {T}})\) of the volume force and the oscillations of the traction boundary condition \({\text {osc}}(g-g_h,{\mathscr {E}}(\varGamma _N))\), defined below.
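The structure of (4) as a weighted sum of squared local norms can be sketched as follows. This is a minimal illustration under the assumption that all local \(L^2\) norms and the two squared oscillation terms have already been computed by some finite element code; the function name `estimator` and the tuple layout of its arguments are hypothetical.

```python
import math


def estimator(volume_terms, interior_edge_terms, dirichlet_edge_terms, osc_f_sq, osc_g_sq):
    """Assemble eta(T, sigma_h) from precomputed local quantities, following the weights in (4).

    volume_terms:         iterable of (h_T, norm of rot rot eps_h on T)
    interior_edge_terms:  iterable of (h_E, norm of first jump term, norm of second jump term)
    dirichlet_edge_terms: iterable of (h_E, norm of first boundary term, norm of second boundary term)
    osc_f_sq, osc_g_sq:   squared oscillations of f and of g - g_h
    """
    eta_sq = osc_f_sq + osc_g_sq
    eta_sq += sum(h ** 4 * r ** 2 for h, r in volume_terms)
    eta_sq += sum(h * a ** 2 + h ** 3 * b ** 2 for h, a, b in interior_edge_terms)
    eta_sq += sum(h * a ** 2 + h ** 3 * b ** 2 for h, a, b in dirichlet_edge_terms)
    return math.sqrt(eta_sq)


# Made-up local values, chosen only so that the sum is easy to check by hand.
eta = estimator(volume_terms=[(0.5, 4.0)],
                interior_edge_terms=[(0.5, 2.0, 4.0)],
                dirichlet_edge_terms=[(0.5, 0.0, 0.0)],
                osc_f_sq=2.0, osc_g_sq=2.0)
```

In an adaptive algorithm the individual summands, rather than the total, would be kept per triangle or edge to drive the marking; that bookkeeping is omitted in this sketch.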

Theorem 1

(reliability) There exists a mesh-size and \(\lambda \) independent constant \(C_{\text {rel}} \) (which may depend on \(\mu \) and on the shape-regularity of the triangulation \({\mathscr {T}}\) through a global lower bound of the minimal angle therein) such that the exact (resp. discrete) stress \(\sigma \) from (1) [resp. \(\sigma _h\) from (3)] with \(g-g_h\perp P_0({\mathscr {E}}(\varGamma _N);{\mathbb {R}}^2)\) and the error estimator (4) satisfy

$$\begin{aligned} \Vert \sigma - \sigma _h \Vert _{L^2(\varOmega )} \le C_{\text {rel}} \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

The a posteriori error estimator \(\eta ({\mathscr {T}},\sigma _h)\) already involves two data oscillation terms \({\text {osc}}(f,{\mathscr {T}})\) and \({\text {osc}}(g-g_h,{\mathscr {E}}(\varGamma _N))\) defined as the square roots of the respective terms in

$$\begin{aligned}&\displaystyle {\text {osc}}^2(f,{\mathscr {T}}) := \sum _{T\in {\mathscr {T}}} h_T^2\Vert f{-}f_h\Vert _{L^2(T)}^2 \text { for the }L^2\text { projection }f_h \text { of } f \text { onto } P_k({\mathscr {T}};{\mathbb {R}}^2); \\&\displaystyle {\text {osc}}^2(g-g_h,{\mathscr {E}}(\varGamma _N)) := \sum _{E\in {\mathscr {E}}(\varGamma _N)} h_E \Vert g-g_h\Vert _{L^2(E)}^2. \end{aligned}$$

For any edge E and a degree \(m\ge k+2\), let \(\varPi _{m,E}:L^2(E)\rightarrow P_{m}(E)\) denote the \(L^2\) projection onto polynomials of degree at most m. For any \(E\in {\mathscr {E}}(\varGamma _D)\) define the two Dirichlet data oscillation terms

$$\begin{aligned} {\text {osc}}^2_I(u_D,E):= & {} h_E\Vert (1-\varPi _{m,E}) \partial (u_D\cdot \tau _E)/\partial s \Vert _{L^2(E)}^2, \end{aligned}$$
(5)
$$\begin{aligned} {\text {osc}}^2_{II}(u_D,E):= & {} h_E^3\Vert (1-\varPi _{m,E}) \partial ^2 ( u_D\cdot \nu _E )/ \partial s^2\Vert _{L^2(E)}^2. \end{aligned}$$
(6)

Their sum defines the overall Dirichlet data approximation \({\text {osc}}(u_D,{\mathscr {E}}(\varGamma _D))\) as the square root of

$$\begin{aligned} {\text {osc}}^2(u_D,{\mathscr {E}}(\varGamma _D)):=\sum _{E\in {\mathscr {E}}(\varGamma _D)} \left( {\text {osc}}^2_I(u_D,E)+{\text {osc}}^2_{II}(u_D,E)\right) . \end{aligned}$$
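The terms (5) and (6) have the same shape: a power of \(h_E\) times the squared \(L^2(E)\) distance of a derivative of the Dirichlet data to the polynomials of degree at most m. A minimal sketch of one such term is given below. It assumes that the edge is parametrized by \(t\in [-1,1]\) with \(s=h_E(t+1)/2\), that the derivative is supplied as a function `w` of t, and that a composite Simpson rule is accurate enough; the quadrature choice and all names are assumptions made for illustration.

```python
def legendre(j, t):
    """Evaluate the Legendre polynomial P_j at t by the three-term recurrence."""
    p_prev, p = 1.0, t
    if j == 0:
        return p_prev
    for n in range(1, j):
        p_prev, p = p, ((2 * n + 1) * t * p - n * p_prev) / (n + 1)
    return p


def integrate(func, n_sub=200):
    """Composite Simpson rule on [-1, 1] with an even number n_sub of subintervals."""
    step = 2.0 / n_sub
    total = func(-1.0) + func(1.0)
    for i in range(1, n_sub):
        total += (4.0 if i % 2 else 2.0) * func(-1.0 + i * step)
    return total * step / 3.0


def edge_oscillation_sq(w, h_E, m, power):
    """Approximate h_E**power * ||(1 - Pi_m) w||_{L2(E)}^2 for w given on the reference interval.

    power = 1 with w = d(u_D . tau_E)/ds corresponds to (5);
    power = 3 with w = d^2(u_D . nu_E)/ds^2 corresponds to (6).
    """
    # Legendre coefficients of the L2 projection onto polynomials of degree <= m.
    coeffs = [(2 * j + 1) / 2.0 * integrate(lambda t: w(t) * legendre(j, t))
              for j in range(m + 1)]

    def residual(t):
        return w(t) - sum(c * legendre(j, t) for j, c in enumerate(coeffs))

    # The factor h_E / 2 is the Jacobian of the map from [-1, 1] onto the edge.
    return h_E ** power * (h_E / 2.0) * integrate(lambda t: residual(t) ** 2)
```

With data that are polynomials of degree at most m along the edge, the returned value vanishes up to quadrature error, in line with the role of these terms as pure data oscillations.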

The analysis of Sect. 3 is local and states for each of the five local residuals an upper bound related to the error in a neighborhood. The global efficiency is displayed as follows.

Theorem 2

(efficiency) There exists a mesh-size and \(\lambda \) independent constant \(C_{\text {eff}} \) (which may depend on \(\mu \) and on the shape-regularity of the triangulation \({\mathscr {T}}\) through a global lower bound of the minimal angle therein) such that the exact (resp. discrete) stress \(\sigma \) from (1) [resp. \(\sigma _h\) from (3)] with \(g-g_h\perp P_0({\mathscr {E}}(\varGamma _N);{\mathbb {R}}^2)\) and the error estimator (4) satisfy

$$\begin{aligned} C_{\text {eff}}^{-1} \eta ({\mathscr {T}},\sigma _h) \le \Vert \sigma - \sigma _h \Vert _{L^2(\varOmega )} +{\text {osc}}(f,{\mathscr {T}})+{\text {osc}}(g-g_h,{\mathscr {E}}(\varGamma _N))+{\text {osc}}(u_D,{\mathscr {E}}(\varGamma _D)). \end{aligned}$$

1.5 Outline of the paper

The remaining parts of this paper provide a mathematical proof of Theorems 1 and 2 and numerical evidence in computational experiments on the novel a posteriori error estimation and its robustness as well as on associated mesh-refining algorithms.

The proof of the reliability of Theorem 1 in Sect. 2 adopts arguments of [11, 15] and carries out two integrations by parts on each triangle plus one-dimensional integration by parts along all edges. The resulting terms are in fact locally efficient in Sect. 3 with minor generalizations of the bubble-function methodology due to Verfürth [24]. The five lemmas of Sect. 3 give slightly sharper results and in total imply Theorem 2.

The point in Theorems 1 and 2 is that the universal constants \(C_{\text {rel}}\) and \(C_{\text {eff}}\) may depend on the Lamé parameter \(\mu \) but are independent of the critical Lamé parameter \(\lambda \) as supported by the benchmark examples of the concluding Sect. 4. Adaptive mesh-refining proves to be highly effective with the novel a posteriori error estimator even for incompatible Neumann data. Four benchmark examples with the Poisson ratio \(\nu =0.3\) or 0.4999 provide numerical evidence of the robustness of the reliable and efficient a posteriori error estimation and for the fourth-order convergence of Theorem 5.

Three appendices highlight some improvements in the numerical benchmarks: Appendix A explains the improved convergence order for piecewise affine data and B and C explain how to treat incompatible Neumann data successfully.

1.6 Comments on general notation

Standard notation on Lebesgue and Sobolev spaces and norms is adopted throughout this paper and, for brevity, \(\Vert \cdot \Vert :=\Vert \cdot \Vert _{L^2(\varOmega )}\) denotes the \(L^2\) norm. The piecewise action of a differential operator is denoted with a subindex NC, e.g., \(\nabla _{NC}\) denotes the piecewise gradient \((\nabla _{NC} \bullet )|_T := \nabla (\bullet |_T)\) for all \(T\in {\mathscr {T}}\). Sobolev functions are usually defined on open sets and the notation \(W^{m,p}(T)\) (resp. \(W^{m,p}({\mathscr {T}})\)) substitutes \(W^{m,p}({\text {int}}(T))\) for a (compact) triangle T and its interior \({\text {int}}(T)\) (resp. \(W^{m,p}({\text {int}}({\mathscr {T}}))\)) and their vector and matrix versions.

For a differentiable function \(\phi \), \({\text {Curl}}\phi := (-\partial _2 \phi , \partial _1 \phi )\) is the rotated gradient; for a two-dimensional vector field \(\varPhi \), \({\text {Curl}}\varPhi \) is the \(2\times 2\) matrix-valued rotated gradient

$$\begin{aligned} {\text {Curl}}\varPhi := (-\partial _2 \varPhi _1 , \partial _1 \varPhi _1 ; -\partial _2 \varPhi _2 , \partial _1 \varPhi _2 ) =D\varPhi (0,1; -1,0). \end{aligned}$$

(The signs are not uniquely determined in the literature and some care is required.)

The colon denotes the scalar product \(A:B:=\sum _{\alpha ,\beta =1,2} A_{\alpha ,\beta } B_{\alpha ,\beta }\) of \(2\times 2\) matrices A and B. The inequality \(A\lesssim B\) between two terms A and B abbreviates \(A\le C\, B\) with some multiplicative generic constant C, which is independent of the mesh-size and independent of the one Lamé parameter \(\lambda \ge 0\) but may depend on the other \(\mu >0\) and may depend on the shape-regularity of the underlying triangulation \({\mathscr {T}}\) and the parameter k related to the polynomial degree of the scheme.

2 Proof of reliability

This section is devoted to the proof of Theorem 1 based on a Helmholtz decomposition of [11] with two parts as decomposed in Theorem 3 below. The critical part is the \(L^2\) product of \({\mathbb {C}}^{-1} (\sigma -\sigma _h)\) times the \({\text {Curl}}\) of an unknown function \({\text {Curl}}\beta \). The observation from [15] is that one can find an Argyris finite element approximation \(\beta _h\) to \(\beta \in H^2(\varOmega )\) such that the continuous function \(\phi :=\beta -\beta _h\in H^2(\varOmega )\) vanishes at all vertices \({\mathscr {N}}\) of the triangulation. Two integrations by parts on each triangle plus one-dimensional integration by parts along the edges \({\mathscr {E}}\) of the triangulation eventually lead to a key identity.

Lemma 1

(representation formula) Any function \(\varvec{\varepsilon }_h\in H^2({\mathscr {T}};{\mathbb {S}})\) (i.e. \(\varvec{\varepsilon }_h\) is piecewise in \(H^2\) with values in \({\mathbb {S}}\)) and any \(\phi \in H^2(\varOmega )\) with \(\phi (z)=0\) at all vertices \(z\in {\mathscr {N}}\) in the regular triangulation \({\mathscr {T}}\) satisfy

$$\begin{aligned}&(\varvec{\varepsilon }_h, {\text {Curl}}^2 \phi )_{L^2(\varOmega )} =({\text {rot}}_{NC}{\text {rot}}_{NC}\varvec{\varepsilon }_h,\phi )_{L^2(\varOmega )}\\&\quad +\sum _{E\in {\mathscr {E}}(\varOmega )} \left( (\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E,\partial _{\nu _E}\phi )_{L^2(E)} - \left( [{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E - \frac{ \partial [\varvec{\varepsilon }_h]_E\nu _E}{\partial s} ,\phi \, \tau _E \right) _{L^2(E)} \right) \\&\quad +\sum _{E\in {\mathscr {E}}(\partial \varOmega )} \left( (\tau _E\cdot \varvec{\varepsilon }_h\tau _E,\partial _{\nu _E}\phi )_{L^2(E)} - \left( {\text {rot}}\varvec{\varepsilon }_h-\frac{ \partial \varvec{\varepsilon }_h\nu _E}{\partial s} , \phi \, \tau _E\right) _{L^2(E)}\right) . \end{aligned}$$

The subsequent integration by parts formula is utilized frequently throughout this paper for \(\phi \in H^1(\varOmega ;{\mathbb {R}}^2)\) and \(\varPsi \in H^1(\varOmega ;{\mathbb {R}}^{2\times 2})\)

$$\begin{aligned} \int _\varOmega \varPsi :{\text {Curl}}\phi \, dx+ \int _\varOmega \phi \cdot {\text {rot}}\varPsi \, dx = \int _{\partial \varOmega } \phi \cdot \varPsi \tau _E\, ds. \end{aligned}$$

Any differentiable (scalar) function \(\varphi \) satisfies the elementary relations

$$\begin{aligned} \tau _E\cdot {\text {Curl}}\varphi = \partial \varphi /\partial \nu _E \quad \text {and}\quad \nu _E\cdot {\text {Curl}}\varphi = - \partial \varphi /\partial s=- \partial \varphi /\partial \tau _E \quad \text {on }E\in {\mathscr {E}}. \end{aligned}$$
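The two elementary relations depend on the sign conventions for \(\tau _E\) and \({\text {Curl}}\) fixed earlier, so a quick numerical check is a useful safeguard. The sketch below evaluates both sides for one sample function \(\varphi (x,y)=x^2y+3y\), an arbitrary unit normal, and an arbitrary point; all three are chosen only for illustration, and the helper names are hypothetical.

```python
import math


def tangent(nu):
    """tau_E = (0,-1; 1,0) nu_E, i.e. the normal rotated counterclockwise by 90 degrees."""
    return (-nu[1], nu[0])


def curl_scalar(grad):
    """Curl phi = (-d_2 phi, d_1 phi), the rotated gradient of a scalar function."""
    return (-grad[1], grad[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def grad_phi(x, y):
    """Gradient of the sample function phi(x, y) = x**2 * y + 3 * y."""
    return (2.0 * x * y, x * x + 3.0)


nu = (math.cos(0.7), math.sin(0.7))   # arbitrary unit normal
tau = tangent(nu)
g = grad_phi(1.5, -0.4)               # arbitrary evaluation point
c = curl_scalar(g)

tangential_curl = dot(tau, c)   # should equal the normal derivative  d(phi)/d(nu)
normal_curl = dot(nu, c)        # should equal minus the tangential derivative  -d(phi)/ds
normal_derivative = dot(nu, g)
tangential_derivative = dot(tau, g)
```

With the opposite orientation of \(\tau _E\) or the opposite sign in \({\text {Curl}}\), exactly one of the two convention-dependent signs would flip, which is the care the remark in Sect. 1.6 asks for.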

Proof

Integrate by parts twice on each triangle and rearrange the remaining boundary terms to deduce (with the abbreviation \({\text {rot}}_{NC}{\text {rot}}_{NC}\equiv {\text {rot}}_{NC}^2\))

$$\begin{aligned}&(\varvec{\varepsilon }_h,{\text {Curl}}^2\phi )_{L^2(\varOmega )} = ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\varOmega )} \\&\quad +\sum _{E\in {\mathscr {E}}(\varOmega )} \left( ([\varvec{\varepsilon }_h]_E\tau _E,{\text {Curl}}\phi )_{L^2(E)} -([{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E\cdot \tau _E,\phi )_{L^2(E)} \right) \\&\quad +\sum _{E\in {\mathscr {E}}(\partial \varOmega )} \left( (\varvec{\varepsilon }_h\tau _E,{\text {Curl}}\phi )_{L^2(E)} -({\text {rot}}\varvec{\varepsilon }_h\cdot \tau _E,\phi )_{L^2(E)} \right) . \end{aligned}$$

The term \( ([\varvec{\varepsilon }_h]_E\tau _E,{\text {Curl}}\phi )_{L^2(E)} \) in the above sum is split into orthogonal components

$$\begin{aligned} {\text {Curl}}\phi= & {} (\tau _E\cdot {\text {Curl}}\phi )\tau _E+ (\nu _E\cdot {\text {Curl}}\phi )\nu _E\\= & {} (\tau _E\cdot {\text {Curl}}\phi )\tau _E- (\partial \phi /\partial s )\nu _E\quad \text {on }E\in {\mathscr {E}}. \end{aligned}$$

Since \(\phi \) vanishes at the vertices, an integration by parts along each interior edge E for the last term shows \( ([\varvec{\varepsilon }_h]_E\tau _E, (\partial \phi /\partial s )\nu _E)_{L^2(E)} = - (\partial [\varvec{\varepsilon }_h]_E\tau _E/\partial s , \phi \nu _E)_{L^2(E)}\). This proves

$$\begin{aligned} ([\varvec{\varepsilon }_h]_E\tau _E,{\text {Curl}}\phi )_{L^2(E)} = (\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E,\partial _{\nu _E}\phi )_{L^2(E)} + \left( \frac{\partial \nu _E\cdot [\varvec{\varepsilon }_h]_E\tau _E}{\partial s},\phi \right) _{L^2(E)}. \end{aligned}$$

The same formula holds for any boundary edge E when \([\varvec{\varepsilon }_h]_E\) is replaced by \(\varvec{\varepsilon }_h\). The combination of the latter identities with the first displayed formula of this proof verifies the asserted representation formula. \(\square \)

The contribution of \(\varepsilon (u)={\mathbb {C}}^{-1} \sigma \) times the \({\text {Curl}}^2\phi \in L^2(\varOmega ;{\mathbb {S}})\) exclusively leads to boundary terms. Throughout this paper, suppose that the Dirichlet data \(u_D\) satisfies \(u_D\in C(\varGamma _D)\cap H^2({\mathscr {E}}(\varGamma _D))\) in the sense that \(u_D\) is globally continuous with \(u_D|_E\in H^2(E;{\mathbb {R}}^2)\) for all \(E\in {\mathscr {E}}(\varGamma _D)\).

Lemma 2

(boundary terms) Any Sobolev function \(v\in H^1(\varOmega ;{\mathbb {R}}^2) \) with boundary values \(u_D\in C(\varGamma _D)\cap H^2({\mathscr {E}}(\varGamma _D))\) on \(\varGamma _D\) and any \(\phi \in H^2(\varOmega )\) with \(\phi =\partial \phi /\partial \nu =0\) along \(\varGamma _N\) with \(\phi (z)=0\) for any vertex z of \(\varGamma _D\) in its relative interior satisfy

$$\begin{aligned} (\varepsilon (v) , {\text {Curl}}^2 \phi )_{L^2(\varOmega )}=\sum _{E\in {\mathscr {E}}(\varGamma _D)} \left( \left( \frac{ \partial u_D}{\partial s } , \frac{ \partial \phi }{\partial \nu _E}\, \tau _E \right) _{L^2(E)} + \left( \frac{\partial ^2 u_D}{\partial s^2 } ,\phi \, \nu _E \right) _{L^2(E)}\right) . \end{aligned}$$

Proof

A density argument shows that it suffices to prove this identity for smooth functions v and \(\phi \), when integration by parts arguments show that the left-hand side is equal to

$$\begin{aligned} \int _{\partial \varOmega } {\text {Curl}}\phi \cdot \frac{\partial v }{\partial s}ds= \sum _{E\in {\mathscr {E}}(\partial \varOmega )} \int _{E} \left( \frac{\partial \phi }{\partial \nu _E} \frac{\partial (v\cdot \tau _E)}{\partial s} + \phi \frac{\partial ^2 (v\cdot \nu _E) }{\partial s^2}\right) ds . \end{aligned}$$

The equality follows from an orthogonal split \({\text {Curl}}\phi = (\tau \cdot {\text {Curl}}\phi )\tau + (\nu \cdot {\text {Curl}}\phi )\nu \) into the normal and tangential directions of \(\nu \) and \(\tau \) along the boundary \(\partial \varOmega \) followed by an integration by parts along \(\partial \varOmega \) with \(\phi (z)=0\) for vertices z in \(\varGamma _D\) with a jump of the normal unit vector. The substitution of the boundary conditions concludes the proof. \(\square \)

The consequence of the previous two lemmas is a representation formula for the error times a typical function \({\text {Curl}}^2 \phi \). To understand why the contributions on the Neumann boundary of \(\phi \) and \(\nabla \phi \) disappear along \(\varGamma _N\), some details on the Helmholtz decomposition are recalled from the literature. For this, let \(\varGamma _0,\ldots , \varGamma _J\) denote the compact connected components of \(\overline{\varGamma _N}\).

Theorem 3

(Helmholtz decomposition [11, Lemma 3.2]) For \(\sigma -\sigma _h\in L^2(\varOmega ;{\mathbb {S}})\), there exists \(\alpha \in V\), constant vectors \(c_0,\ldots ,c_J\in {\mathbb {R}}^2\) with \(c_0=0\) and \(\beta \in H^2(\varOmega )\) with \(\int _\varOmega \beta \, dx = 0\) and \({\text {Curl}}\beta = c_j \) on \(\varGamma _j\subseteq \varGamma _N\) for all \(j=0,\ldots ,J\) such that

$$\begin{aligned} \sigma -\sigma _h = {\mathbb {C}}\varepsilon (\alpha ) + {\text {Curl}}{\text {Curl}}\beta . \end{aligned}$$
(7)

\(\square \)

The second ingredient is an approximation \(\beta _h\) of \(\beta \) from the Helmholtz decomposition in Theorem 3 based on the Argyris finite element functions \( A({\mathscr {T}}) \subset C^1(\varOmega )\cap P_5({\mathscr {T}})\) [7, 8, 20]. The local mesh-size \(h_{\mathscr {T}}\in P_0({\mathscr {T}})\) in the triangulation \({\mathscr {T}}\) is defined as its diameter \(h_{\mathscr {T}}|_T:=h_T\) on each triangle \(T\in {\mathscr {T}}\).

Lemma 3

(quasi-interpolation) Given any \(\beta \) as in Theorem 3 there exists some \(\beta _h\in A({\mathscr {T}})\) such that \(\phi :=\beta -\beta _h \in H^2(\varOmega ) \) vanishes at any vertex \(z\in {\mathscr {N}}\) of the triangulation, \(\phi \) and its normal gradient \(\nabla \phi \cdot \nu \) vanish on \(\varGamma _N\), and the local approximation and stability property holds in the sense that

$$\begin{aligned} \Vert h_{\mathscr {T}}^{-2} \phi \Vert + \Vert h_{\mathscr {T}}^{-1}{\text {Curl}}\phi \Vert +\Vert {\text {Curl}}^2\phi \Vert \lesssim \Vert \beta \Vert _{H^2(\varOmega )}. \end{aligned}$$

Proof

This has been (partly) utilized in [15] and also follows from [21]. \(\square \)

The combination of all aforementioned arguments leads to the following estimate as an answer to the question of Sect. 1.1 in terms of directional derivatives of \(\varvec{\varepsilon }_h:= {\mathbb {C}}^{-1}\sigma _h\). Recall the definition of \(\eta ({\mathscr {T}},\sigma _h)\) from (4).

Theorem 4

(key result) Let \(\sigma \in H({\text {div}},\varOmega ;{\mathbb {S}})\) solve (1) and let \(\sigma _h \in AW _k({\mathscr {T}})\) solve (3). Given \(\beta \) from Theorem 3 and its quasi-interpolation \(\beta _h\) from Lemma 3, the difference \(\phi :=\beta -\beta _h\) satisfies

$$\begin{aligned} ({\mathbb {C}}^{-1}(\sigma -\sigma _h), {\text {Curl}}^2\phi )_{L^2(\varOmega )} \lesssim | \beta |_{H^2(\varOmega )} \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

Proof

Lemmas 1 and 2 lead to a formula for \((\varvec{\varepsilon }_h, {\text {Curl}}^2\phi )_{L^2(\varOmega )}\), \(\varvec{\varepsilon }_h:= {\mathbb {C}}^{-1}\sigma _h\), in which all the contributions for \(E\in {\mathscr {E}}(\varGamma _N)\) with \(\phi \) and \(\nabla \phi \) vanish along \(\varGamma _N\). The remaining formula reads

$$\begin{aligned}&({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\phi )_{L^2(\varOmega )} = - ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\varOmega )}\\&\quad -\sum _{E\in {\mathscr {E}}(\varOmega )} \left( \left( \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E,\frac{\partial \phi }{\partial \nu _E}\right) _{L^2(E)}- \left( [{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E - \frac{ \partial [\varvec{\varepsilon }_h]_E\nu _E}{\partial s} ,\phi \, \tau _E \right) _{L^2(E)} \right) \\&\quad +\sum _{E\in {\mathscr {E}}(\varGamma _D)} \left( \left( \frac{\partial u_D}{\partial s} - \varvec{\varepsilon }_h\tau _E, \tau _E\, \frac{\partial \phi }{\partial \nu _E} \right) _{L^2(E)} \right. \\&\quad \left. + \left( \tau _E\cdot \left( {\text {rot}}_{NC}\varvec{\varepsilon }_h-\frac{\partial ( \varvec{\varepsilon }_h\nu _E)}{\partial s} \right) +\frac{\partial ^2 u_D\cdot \nu _E}{\partial s^2 } ,\phi \right) _{L^2(E)}\right) . \end{aligned}$$

The proof concludes with Cauchy–Schwarz inequalities, trace inequalities, and the approximation estimates of Lemma 3. The remaining details are nowadays standard arguments in the a posteriori error analysis of nonconforming and mixed finite element methods and hence are omitted. \(\square \)

Before the proof of Theorem 1 concludes this section, three remarks and one lemma are in order.

Remark 1

(nonconstant coefficients) The main parts of the reliability analysis of this section hold for rather general material tensors \({\mathbb {C}}\) as long as \(\varvec{\varepsilon }_h:= {\mathbb {C}}^{-1}\sigma _h\) allows for the existence of the traces and the derivatives in the error estimator (4) in the respective \(L^2\) spaces. This is the case, for instance, if \(\lambda \) and \(\mu \) are piecewise smooth with respect to the underlying triangulation \({\mathscr {T}}\).

Remark 2

(constant coefficients) The overall assumption of constant Lamé parameters \(\lambda \) and \(\mu \) allows a simplification in the error estimator (4). It suffices to have \(\mu \) globally continuous and \(\mu \) and \(\lambda \) piecewise smooth to guarantee

$$\begin{aligned} \frac{ \partial [\varvec{\varepsilon }_h]_E\nu _E}{\partial s} \cdot \tau _E =0\quad \text {along }E\in {\mathscr {E}}(\varOmega ). \end{aligned}$$

(The proof utilizes the structure of \({\mathbb {C}}^{-1}\) with \({\mathbb {C}}^{-1} E = \frac{1}{2\mu } (E - \frac{\lambda }{2(\lambda +\mu )} {\text {tr}}(E) 1_{2\times 2})\) for any \(E\in {\mathbb {S}}\) as a linear combination of the identity and some scalar multiple of the \(2\times 2\) unit matrix. The terms with the identity lead to \(1/(2\mu )\) times the jump \([\sigma _h]_E\nu _E=0\) of the \(H({\text {div}})\) conforming stress approximations. The jump terms with the unit matrix (even with jumps of \(\lambda \)) are multiplied with the orthogonal unit vectors \(\nu _E\) and \(\tau _E\) and so vanish as well.)
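The formula for \({\mathbb {C}}^{-1}\) quoted in Remark 2 can be checked numerically. The following minimal sketch assumes the isotropic form \({\mathbb {C}}E = 2\mu E + \lambda \,{\text {tr}}(E)\, 1_{2\times 2}\) in two dimensions, which is not restated in this section; the helper names `apply_C` and `apply_C_inv` and the sample values are hypothetical. It also illustrates why the multiple of the unit matrix drops out when tested with the orthogonal unit vectors \(\nu _E\) and \(\tau _E\).

```python
import math

def apply_C(E, lam, mu):
    """Assumed isotropic law in 2D: C E = 2*mu*E + lam*tr(E)*I."""
    tr = E[0][0] + E[1][1]
    return [[2 * mu * E[i][j] + (lam * tr if i == j else 0.0) for j in range(2)]
            for i in range(2)]

def apply_C_inv(E, lam, mu):
    """Formula of Remark 2: C^{-1} E = (E - lam/(2(lam+mu)) tr(E) I) / (2 mu)."""
    tr = E[0][0] + E[1][1]
    c = lam / (2 * (lam + mu))
    return [[(E[i][j] - (c * tr if i == j else 0.0)) / (2 * mu) for j in range(2)]
            for i in range(2)]

# Arbitrary symmetric sample matrix and sample Lame parameters (illustrative only).
E = [[1.0, 0.5], [0.5, -2.0]]
lam, mu = 3.0, 1.5

# Round trip: C^{-1}(C E) should reproduce E.
back = apply_C_inv(apply_C(E, lam, mu), lam, mu)
max_dev = max(abs(back[i][j] - E[i][j]) for i in range(2) for j in range(2))

# Orthogonal unit vectors nu_E and tau_E for an arbitrary edge direction.
t = 0.7
nu_E = (math.cos(t), math.sin(t))
tau_E = (-math.sin(t), math.cos(t))

def tau_dot_A_nu(A):
    """Evaluate tau_E . (A nu_E) for a 2x2 matrix A."""
    return sum(tau_E[i] * A[i][j] * nu_E[j] for i in range(2) for j in range(2))

# The trace part is a multiple of the unit matrix and tau_E . nu_E = 0,
# so only the term E/(2 mu) contributes.
shear_gap = abs(tau_dot_A_nu(apply_C_inv(E, lam, mu)) - tau_dot_A_nu(E) / (2 * mu))
print(max_dev, shear_gap)
```

Both printed quantities should be of the order of rounding errors.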

Remark 3

(related work) Although the work [22] concerns a different problem (bending of a plate of fourth order) with a different discretization (even nonconforming in \(H({\text {div}})\)), some technical parts of that paper are related to those of this paper by a rotation of the underlying coordinate system and the substitution of \({\text {div}}{\text {div}}\) by \({\text {rot}}{\text {rot}}\) etc. Another Helmholtz decomposition also allows for a discrete version and thereby enables a proof of optimal convergence of an adaptive algorithm with arguments from [13, 19].

A technical detail related to the robustness in \(\lambda \rightarrow \infty \) is a well-known lemma that controls the trace of a matrix \(E\in {\mathbb {R}}^{2\times 2}\) by its deviatoric part \({\text {dev}} E:= E-{\text {tr}}(E)/2\, 1_{2\times 2}\) and its divergence measured in the dual \(V^*\subset H^{-1}(\varOmega ;{\mathbb {R}}^2)\) of V, namely

$$\begin{aligned} \Vert {\text {div}}\tau \Vert _{-1} := \sup _{\begin{array}{c} v\in V \\ | v |_{H^1(\varOmega )} =1 \end{array}}{\int _\varOmega \tau : Dv\, dx} \quad \text {for all } \tau \in L^2(\varOmega ;{\mathbb {R}}^{2\times 2}). \end{aligned}$$

Lemma 4

(tr-dev-div) Let \(\varSigma _0\) be a closed subspace of \(H({\text {div}},\varOmega ;{\mathbb {R}}^{2\times 2})\), which does not contain the constant tensor \(1_{2\times 2}\). Then any \(\tau \in \varSigma _0\) satisfies

$$\begin{aligned} \Vert {\text {tr}}(\tau )\Vert _{L^2(\varOmega )} \lesssim \Vert {\text {dev}}\tau \Vert _{L^2(\varOmega )} + \Vert {\text {div}}\tau \Vert _{-1}. \end{aligned}$$

Proof

There are several variants of the tr-dev-div lemma known in the literature [6, Proposition 9.1.1]. The version in [11, Theorem 4.1] explicitly displays a version with \(\Vert {\text {div}}\tau \Vert \) replacing \(\Vert {\text {div}}\tau \Vert _{-1}\). Since its proof is immediately adapted to prove the asserted version, further details are omitted. \(\square \)

The remaining part of this section outlines why Theorem 1 follows from Theorem 4 with the arguments from [11, 15]. The energy norms for any \(v\in V\) and \(\tau \in H({\text {div}},\varOmega ;{\mathbb {S}}) \) read

$$\begin{aligned} |||v |||^2:=\int _\varOmega \varepsilon (v):{\mathbb {C}}\varepsilon (v)\, dx \quad \text {and}\quad \Vert \tau \Vert _{{\mathbb {C}}^{-1}}^2:=\int _\varOmega \tau : {\mathbb {C}}^{-1}\tau \, dx. \end{aligned}$$

The remaining residual is denoted by

$$\begin{aligned} {{\,\mathrm{Res}\,}}(v) := \int _\varOmega f\cdot v\, dx + \int _{\varGamma _N} g\cdot v\, ds - \int _\varOmega \sigma _h:\varepsilon (v)\, dx\quad \text {for all }v\in V \end{aligned}$$

with its dual norm

$$\begin{aligned} |||{{\,\mathrm{Res}\,}} |||_* := \sup _{\begin{array}{c} v\in V \\ |||v |||=1 \end{array}}{{{\,\mathrm{Res}\,}}(v)} . \end{aligned}$$

It is shown in the proof of [15, Theorem 3.1] that \(\alpha \in V\) and \(\beta \in H^2(\varOmega )\) from the Helmholtz decomposition of the error \(\sigma -\sigma _h\) in Theorem 3 are orthogonal with respect to the \(L^2\) scalar product weighted with \({\mathbb {C}}^{-1}\). This implies

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}}^2 = (\sigma -\sigma _h,\varepsilon (\alpha ))_{L^2(\varOmega )} + ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\beta )_{L^2(\varOmega )}. \end{aligned}$$
(8)

Let \(\beta _h\) denote the quasi-interpolation of \(\beta \) from Lemma 3. It is known [15] that \({\text {Curl}}^2\beta _h\) is a divergence-free element of \(\varSigma (0,{\mathscr {T}})\). Therefore, (2) and (3) imply

$$\begin{aligned} ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\beta _h)_{L^2(\varOmega )} =0. \end{aligned}$$

Thus, with \(\phi =\beta -\beta _h\), the second term of (8) equals \( ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\phi )_{L^2(\varOmega )} \) and hence is controlled in the key estimate of Theorem 4 as

$$\begin{aligned} ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\beta )_{L^2(\varOmega )} \lesssim |\beta |_{H^2(\varOmega )} \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

Lemma 4 applies to \(\varSigma _0\) as the subspace of all \(\tau \in H({\text {div}},\varOmega ;{\mathbb {S}})\) with homogeneous Neumann data \(\tau \nu =0\) along \(\varGamma _N\). Since \(\tau :={\text {Curl}}^2\beta \) is divergence free (by the relation \({\text {div}}{\text {Curl}}=0\)) and since \(\tau \nu =-\partial {\text {Curl}}\beta /\partial s\) along \(\varGamma _N\) (owing to the aforementioned elementary relations and the convention that the first \({\text {Curl}}\) acts row-wise on \({\text {Curl}}\beta \)), where \({\text {Curl}}\beta \) in Theorem 3 is piecewise constant, it follows that \(\tau \in \varSigma _0\). On the other hand \(1_{2\times 2}\notin \varSigma _0\) because \(\varGamma _N\ne \emptyset \). Consequently, Lemma 4 implies \(\Vert {\text {Curl}}^2 \beta \Vert \lesssim \Vert {\text {dev}} {\text {Curl}}^2 \beta \Vert \). This and elementary calculations with \({\mathbb {C}}^{-1}\) lead to

$$\begin{aligned} | \beta |_{H^2(\varOmega )} = \Vert {\text {Curl}}^2 \beta \Vert \lesssim \Vert {\text {dev}} {\text {Curl}}^2 \beta \Vert \lesssim \Vert {\text {Curl}}^2 \beta \Vert _{{\mathbb {C}}^{-1}}. \end{aligned}$$

The combination with the estimate resulting from Theorem 4 proves

$$\begin{aligned} ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\beta )_{L^2(\varOmega )} \lesssim \Vert {\text {Curl}}^2\beta \Vert _{{\mathbb {C}}^{-1}}\, \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

This, the stability \( \Vert {\text {Curl}}^2\beta \Vert _{{\mathbb {C}}^{-1}} \le \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}} \), and \(|||\alpha |||=|||{{\,\mathrm{Res}\,}} |||_{*}\) lead in (8) to

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}} \lesssim |||{{\,\mathrm{Res}\,}} |||_{*} + \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$
(9)

The remaining term is the estimate of the dual norm \(|||{{\,\mathrm{Res}\,}} |||_{*}\) of the residual which is done, e.g., in [15, Lemma 3.3] (under the assumption \(g-g_h\perp P_0({\mathscr {E}}(\varGamma _N))\))

$$\begin{aligned} |||{{\,\mathrm{Res}\,}} |||_{*} \lesssim {\text {osc}}(f,{\mathscr {T}})+ {\text {osc}}(g-g_h,{\mathscr {E}}(\varGamma _N))\le \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

This and (9) imply

$$\begin{aligned} \Vert {\text {dev}}(\sigma -\sigma _h)\Vert \lesssim \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}} \lesssim \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

For any test function \(v\in V\) with \( | v |_{H^1(\varOmega )} =1\), \(\int _\varOmega (\sigma -\sigma _h) : Dv\, dx= {{\,\mathrm{Res}\,}}(v)\) and so

$$\begin{aligned} \Vert {\text {div}}(\sigma -\sigma _h)\Vert _{-1} = \sup _{\begin{array}{c} v\in V \\ | v |_{H^1(\varOmega )} =1 \end{array}}{{{\,\mathrm{Res}\,}}(v)} \le \sup _{\begin{array}{c} v\in V \\ \Vert \varepsilon (v) \Vert =1 \end{array}}{{{\,\mathrm{Res}\,}}(v)} \le 2\mu \, |||{{\,\mathrm{Res}\,}} |||_{*} \lesssim \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

(In the second last step one utilizes that \( 2\mu \, E:E \le E:{\mathbb {C}}E\) for all \(E\in {\mathbb {S}}\).) The combination of Lemma 4 for \(\tau =\sigma -\sigma _h\) with the previous displayed estimates concludes the proof of \(\Vert \sigma -\sigma _h\Vert \lesssim \eta ({\mathscr {T}},\sigma _h)\). There exist several appropriate choices of \(\varSigma _0\subset H({\text {div}},\varOmega ;{\mathbb {S}})\) in this last step. Recall that \(\varGamma _N\) is the union of connectivity components and so pick one edge \(E_0\) in this polygon and consider \(\varSigma _0:=\{ \tau \in H({\text {div}},\varOmega ;{\mathbb {S}}): \int _{E_0} \tau \nu \, ds =0\}\) with \(1_{2\times 2}\notin \varSigma _0\). This choice of \(E_0\) and so \(\varSigma _0\) depend only on \(\varGamma _N\) (independent of \({\mathscr {T}}\)). Since \(g-g_h=(\sigma -\sigma _h)\nu \) along \(E_0\) has (piecewise on \({\mathscr {E}}(E_0)\), whence in total) an integral mean zero, Lemma 4 indeed applies to \(\tau =\sigma -\sigma _h\in \varSigma _0\). \(\square \)
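The elementary inequality \(2\mu \, E:E \le E:{\mathbb {C}}E\) used in the proof above can be verified in one line. The following computation assumes the isotropic form \({\mathbb {C}}E = 2\mu E + \lambda \,{\text {tr}}(E)\, 1_{2\times 2}\) with \(\lambda \ge 0\), which is not restated in this section but is consistent with the formula for \({\mathbb {C}}^{-1}\) in Remark 2:

$$\begin{aligned} E:{\mathbb {C}}E = 2\mu \, E:E + \lambda \, {\text {tr}}(E)\, (1_{2\times 2}:E) = 2\mu \, E:E + \lambda \, {\text {tr}}(E)^2 \ge 2\mu \, E:E \quad \text {for all }E\in {\mathbb {S}}. \end{aligned}$$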

3 Local efficiency analysis

The local efficiency follows with the bubble-function technique for \(C^1\) finite elements [24, Sec 3.7]. This section focuses on a constant \({\mathbb {C}}\) for linear isotropic elasticity with constant Lamé parameters \(\lambda \) and \(\mu \) such that \(\varvec{\varepsilon }_h:={\mathbb {C}}^{-1}\sigma _h\in P_{k+2}({\mathscr {T}})\) for some \(\sigma _h\in AW _k({\mathscr {T}})\) is a polynomial of degree at most \(k+2\). Apart from this, the Lamé parameters do not further arise in this section.

The moderate point of departure is the volume term for each triangle \(T\in {\mathscr {T}}\) with barycentric coordinates \(\lambda _1,\lambda _2,\lambda _3\in P_1(T)\) and their product, the cubic volume bubble function, \(b_T:=27\,\lambda _1\lambda _2\lambda _3 \in W^{1,\infty }_0(T)\) plus its square \(b_T^2\in W^{2,\infty }_0(T)\) with \(0\le b_T^2\le 1\), \( \Vert b_T\Vert _{L^2(T)} \lesssim 1\), and \(|b_T|_{H^2(T)} \lesssim h_T^{-2}\) etc.
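The scaling \(b_T=27\,\lambda _1\lambda _2\lambda _3\) normalizes the maximum of the volume bubble to one, attained at the barycenter. A minimal sketch of a numerical check in barycentric coordinates follows; the function name `volume_bubble` and the sampling resolution are hypothetical choices for illustration.

```python
def volume_bubble(l1, l2):
    """Cubic volume bubble b_T = 27*l1*l2*l3 in barycentric coordinates (l3 = 1 - l1 - l2)."""
    l3 = 1.0 - l1 - l2
    return 27.0 * l1 * l2 * l3

# Sample b_T on a regular grid of barycentric coordinates covering the triangle.
n = 60
values = [volume_bubble(i / n, j / n) for i in range(n + 1) for j in range(n + 1 - i)]

b_min, b_max = min(values), max(values)
b_center = volume_bubble(1.0 / 3.0, 1.0 / 3.0)   # value at the barycenter
b_edge = volume_bubble(0.0, 0.4)                 # a point on the boundary of T
print(b_min, b_max, b_center, b_edge)
```

Up to rounding, the sampled values lie in \([0,1]\), so the same holds for \(b_T^2\).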

Lemma 5

(efficiency of volume residual) Any \(v\in H^1(T;{\mathbb {R}}^2)\), \(T\in {\mathscr {T}}\), satisfies

$$\begin{aligned} h_T^2 \Vert {\text {rot}}{\text {rot}}\varvec{\varepsilon }_h\Vert _{L^2(T)} \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(T)}. \end{aligned}$$

Proof

An inverse estimate for the polynomial \({\text {rot}}{\text {rot}}\varvec{\varepsilon }_h\equiv {\text {rot}}^2\varvec{\varepsilon }_h\) implies the estimate

$$\begin{aligned} \Vert {\text {rot}}^2 \varvec{\varepsilon }_h\Vert _{L^2(T)}^2 \lesssim \Vert b_T {\text {rot}}^2 \varvec{\varepsilon }_h\Vert ^2_{L^2(T)} = ({\text {rot}}^2\varvec{\varepsilon }_h, b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h)_{L^2(T)}. \end{aligned}$$

Lemma 1 with \(\phi =b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h\) and \(( \varepsilon (v) , {\text {Curl}}^2\phi )_{L^2(T)} = 0\) leads to

$$\begin{aligned} \begin{aligned} \Vert b_T {\text {rot}}^2 \varvec{\varepsilon }_h\Vert ^2_{L^2(T)}&= ( \varvec{\varepsilon }_h- \varepsilon (v) , {\text {Curl}}^2({b_T^2} {\text {rot}}^2 \varvec{\varepsilon }_h) )_{L^2(T)}\\&\le \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(T)} \Vert {\text {Curl}}^2(b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h)\Vert _{L^2(T)}. \end{aligned} \end{aligned}$$

This and the inverse estimate \(\Vert {\text {Curl}}^2(b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h)\Vert _{L^2(T)} \lesssim h_T^{-2} \Vert b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h\Vert _{L^2(T)} \) imply

$$\begin{aligned} \Vert {\text {rot}}^2 \varvec{\varepsilon }_h\Vert _{L^2(T)}^2 \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(T)} h_T^{-2} \Vert {\text {rot}}^2 \varvec{\varepsilon }_h\Vert _{L^2(T)}. \end{aligned}$$

This concludes the proof. \(\square \)

The localization of the first edge residual involves the piecewise quadratic edge-bubble function \(b_E\) with support \(T_+\cup T_-\) for an interior edge \(E=\partial T_+\cap \partial T_-\) shared by the two triangles \(T_+\) and \(T_-\) with edge-patch \(\omega _E := {\text {int}}(T_+\cup T_-)\). With an appropriate scaling, \(b_E|_T=4\lambda _1\lambda _2\) holds for the two barycentric coordinates \(\lambda _1,\lambda _2\) on \(T\in \{T_+,T_-\}\) associated with the two vertices of E. Then \(b_E\in W^{1,\infty }(\omega _E)\) and \(b_E^2\in W^{2,\infty }(\omega _E)\) satisfy \(0\le b_E^2\le b_E\le 1\) and \(|b_E|_{H^1(E)} \lesssim h_E^{-1}\) etc.

The remaining technical detail is an extension of functions on the edge E to \(\omega _E\). Throughout this section those functions are polynomials and given \(\rho _E\in P_m(E)\), their coefficients (in some fixed basis) already define an algebraic object that is a natural extension \(\rho \in P_m({\hat{E}})\) along the straight line \({\hat{E}}:={\text {mid}}(E)+{\mathbb {R}}\, \tau _E\) that extends E with midpoint \({\text {mid}}(E)\) and tangential unit vector \(\tau _E\). This and

$$\begin{aligned} P_E(\rho _E)(x):= \rho ( \tau _E\cdot (x- {\text {mid}}(E))) \quad \text {for all }x\in {\mathbb {R}}^2 \end{aligned}$$

define a linear extension operator \(P_E:P_m(E)\rightarrow C^\infty ({\mathbb {R}}^2)\) with \(P_E(\rho _E)=\rho _E\) on E for any \(\rho _E\in P_m(E)\), which is constant in the normal direction, \(\nabla P_E(\rho _E)\cdot \nu _E\equiv 0\). This design is different from that in [24].
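The extension operator \(P_E\) can be sketched in a few lines. The snippet below is an illustration under stated assumptions rather than the implementation behind this paper: the polynomial \(\rho _E\) is represented by hypothetical monomial coefficients in the signed arclength \(s=\tau _E\cdot (x-{\text {mid}}(E))\), and the function name `make_extension` is made up for this example.

```python
import math

def make_extension(a, b, coeffs):
    """Return P_E(rho_E) for the edge E = conv(a, b).

    coeffs[i] multiplies s**i, where s = tau_E . (x - mid(E)) is the signed
    arclength measured from the midpoint of E (assumed monomial basis).
    """
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    length = math.hypot(b[0] - a[0], b[1] - a[1])
    tau = ((b[0] - a[0]) / length, (b[1] - a[1]) / length)   # unit tangent
    nu = (tau[1], -tau[0])                                   # unit normal

    def extension(x):
        s = tau[0] * (x[0] - mid[0]) + tau[1] * (x[1] - mid[1])
        return sum(c * s**i for i, c in enumerate(coeffs))

    return extension, nu

# Example edge and quadratic polynomial (illustrative values).
P, nu_E = make_extension((0.0, 0.0), (2.0, 2.0), [1.0, -2.0, 0.5])

x_on_line = (1.5, 1.5)                                       # a point on E
x_shifted = (x_on_line[0] + 0.3 * nu_E[0], x_on_line[1] + 0.3 * nu_E[1])

# P_E(rho_E) is constant in the normal direction, so both values agree.
normal_variation = abs(P(x_on_line) - P(x_shifted))
print(P(x_on_line), normal_variation)
```

Since the extension only depends on the tangential coordinate, its normal derivative vanishes identically, which is the property used in the proofs below.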

Lemma 6

(efficiency of first interior edge residual) Any \(v\in H^1(\omega _E;{\mathbb {R}}^2)\), \(E\in {\mathscr {E}}(\varOmega )\), satisfies

$$\begin{aligned} h_E^{1/2} \Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\Vert _{L^2(E)} \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(\omega _E)}. \end{aligned}$$

Proof

Since \(\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\in P_{k+2}(E)\) is a polynomial, the above extension \(P_E(\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E)\) and the function \(b\in W^{2,\infty }_0(\omega _E)\) with

$$\begin{aligned} b(x):= b_E^2(x)\, \nu _E\cdot (x- {\text {mid}}(E))\quad \text {for all }x\in {\mathbb {R}}^2 \end{aligned}$$
(10)

define some function \(\phi := b\, P_E(\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E)\). Since \(b=0\) and \(\nabla b\cdot \nu _E= b_E^2\) along E, the test function \(\phi \in H^2_0(\omega _E)\subset H_0^2(\varOmega )\) leads in Lemma 1 to

$$\begin{aligned} (\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E, \partial _{\nu _E} \phi )_{L^2(E)}= (\varvec{\varepsilon }_h, {\text {Curl}}^2 \phi )_{L^2(\omega _E)} - ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)}. \end{aligned}$$

Since \(\partial _{\nu _E} \phi =b_E^2 \, \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\) on E and \(\varepsilon (v)\perp {\text {Curl}}^2\phi \), an inverse estimate shows

$$\begin{aligned} \Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\Vert ^2_{L^2(E)} \lesssim (\varvec{\varepsilon }_h-\varepsilon (v), {\text {Curl}}^2 \phi )_{L^2(\omega _E)} - ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)}. \end{aligned}$$

At the heart of the bubble-function methodology are inverse and trace inequalities that allow for appropriate scaling properties [24] under the overall assumption of shape-regularity. In the present case, one power of \(h_E\approx h_{T_\pm }\) is hidden in the function b and

$$\begin{aligned} h_E^{1/2}\, | \phi |_{H^2(\omega _E)} + h_E^{-3/2} \Vert \phi \Vert _{L^2(\omega _E)} \lesssim \Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E \Vert _{L^2(E)}. \end{aligned}$$
(11)

The combination with the previous estimate results in

$$\begin{aligned}&\Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\Vert ^2_{L^2(E)} \\&\quad \lesssim \Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E \Vert _{L^2(E)} \left( h_E^{-1/2} \Vert \varvec{\varepsilon }_h-\varepsilon (v)\Vert _{L^2(\omega _E)} + h_E^{3/2} \Vert {\text {rot}}_{NC}^2\varvec{\varepsilon }_h\Vert _{L^2(\omega _E)}\right) . \end{aligned}$$

This and Lemma 5 conclude the proof. \(\square \)
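The two properties of the function b from (10) used in the proof above, namely \(b=0\) and \(\nabla b\cdot \nu _E=b_E^2\) along E, can be checked numerically on a concrete edge patch. The sketch below uses a hypothetical patch of two triangles and a central difference quotient across E; all names and coordinates are illustrative assumptions.

```python
import math

def cross(o, p, q):
    """Twice the signed area of the triangle (o, p, q)."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])

def edge_bubble(x, a, b, c_plus, c_minus):
    """b_E = 4*lam_a*lam_b on the triangle of the patch containing x.

    Assumes c_plus lies on the side of E = conv(a, b) with cross(a, b, c_plus) > 0
    and that x lies inside the patch (b_E is zero outside, not modeled here).
    """
    c = c_plus if cross(a, b, x) >= 0 else c_minus
    area = cross(a, b, c)
    lam_a = cross(x, b, c) / area   # barycentric coordinate of vertex a
    lam_b = cross(a, x, c) / area   # barycentric coordinate of vertex b
    return 4 * lam_a * lam_b

def b_weighted(x, a, b, c_plus, c_minus):
    """b(x) = b_E(x)^2 * nu_E . (x - mid(E)) as in (10)."""
    length = math.hypot(b[0] - a[0], b[1] - a[1])
    nu = (-(b[1] - a[1]) / length, (b[0] - a[0]) / length)   # normal towards c_plus
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    dist = nu[0] * (x[0] - mid[0]) + nu[1] * (x[1] - mid[1])
    return edge_bubble(x, a, b, c_plus, c_minus) ** 2 * dist

# Hypothetical edge patch: E = conv(a, b) with opposite vertices c_plus and c_minus.
a, b = (0.0, 0.0), (1.0, 0.0)
c_plus, c_minus = (0.3, 0.9), (0.6, -0.8)
patch = (a, b, c_plus, c_minus)

x = (0.3, 0.0)          # a point on E; here nu_E = (0, 1)
h = 1e-6
value_on_E = b_weighted(x, *patch)
normal_derivative = (b_weighted((x[0], x[1] + h), *patch)
                     - b_weighted((x[0], x[1] - h), *patch)) / (2 * h)
expected = edge_bubble(x, *patch) ** 2
print(value_on_E, normal_derivative, expected)
```

The difference quotient agrees with \(b_E^2\) up to terms of order h, in line with the product rule \(\nabla b = \nabla (b_E^2)\, \nu _E\cdot (x-{\text {mid}}(E)) + b_E^2\, \nu _E\), whose first term vanishes on E.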

For any edge \(E\in {\mathscr {E}}(\varGamma _D)\), the edge-bubble function \(b_E=4\lambda _1\lambda _2\in W^{1,\infty }(\omega _E)\) is defined with the two barycentric coordinates \(\lambda _1,\lambda _2\) associated with the two vertices of E and vanishes on the remaining sides \(\partial \omega _E{\setminus } E\) of the aligned triangle \(\overline{\omega _E}\). The Dirichlet data \(u_D\) allow for some polynomial approximation \(\varPi _{m,E} u_{D}\in P_{m}(E)\) of degree at most \(m\ge k+2\); recall the definition of \({\text {osc}}_I(u_D,E)\) from (5).

Lemma 7

(efficiency of first boundary edge residual) Any \(v\in H^1(\omega _E;{\mathbb {R}}^2)\) with \(v|_E=u_D|_E\) along \(E\in {\mathscr {E}}(\varGamma _D)\) satisfies

$$\begin{aligned} h_E^{1/2} \Vert \tau _E\cdot ( \varvec{\varepsilon }_h\tau _E - \partial u_D/\partial s)\Vert _{L^2(E)} \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(\omega _E)} + {\text {osc}}_I(u_D,E). \end{aligned}$$

Proof

Since \(\tau _E\cdot \varvec{\varepsilon }_h\tau _E\) is a polynomial of degree at most \(k+2\le m\) along the exterior edge E, the residual \(\tau _E\cdot ( \varvec{\varepsilon }_h\tau _E - \partial u_D/\partial s)\) is well approximated by its \(L^2\) projection \(\rho _E:= (\tau _E\cdot ( \varvec{\varepsilon }_h\tau _E - \varPi _{m,E} \partial u_D/\partial s))\) onto \(P_{m}(E)\). The Pythagoras theorem based on the \(L^2\) orthogonality reads

$$\begin{aligned} h_E \Vert \tau _E\cdot ( \varvec{\varepsilon }_h\tau _E - \partial u_D/\partial s)\Vert _{L^2(E)}^2= h_E \Vert \rho _E\Vert _{L^2(E)}^2+{\text {osc}}^2_I(u_D,E) \end{aligned}$$

and it remains to bound \(h_E^{1/2} \Vert \rho _E\Vert _{L^2(E)}\) by the right-hand side of the claimed inequality. The extension \(P_E\rho _E\in C^\infty ({\mathbb {R}}^2)\) and the function b from (10) lead to an admissible test function \( \phi := b P_E \rho _E \in W^{2,\infty }_0(\omega _E)\). Two successive integrations by parts as in Lemma 1 show

$$\begin{aligned} (\varepsilon (v),{\text {Curl}}^2\phi )_{L^2(\omega _E)}=(\partial u_D/\partial s ,\tau _E (\nu _E\cdot \nabla \phi ))_{L^2(E)}. \end{aligned}$$

This and Lemma 1 lead to

$$\begin{aligned} \left( \tau _E\cdot \left( \varvec{\varepsilon }_h\tau _E- \frac{ \partial u_D}{\partial s}\right) , \frac{\partial \phi }{\partial \nu _E} \right) _{L^2(E)} = (\varvec{\varepsilon }_h-\varepsilon (v) , {\text {Curl}}^2 \phi )_{L^2(\omega _E)} - ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)}. \end{aligned}$$

Since \(\partial _{\nu _E} \phi = b_E^2 \rho _E\) along E and \(\rho _E\) is the \(L^2\) projection of \(\tau _E\cdot ( \varvec{\varepsilon }_h\tau _E- \partial u_D/\partial s)\), the left-hand side equals \(\Vert b_E \rho _E\Vert _{L^2(E)}^2 - ( ( 1-\varPi _{m,E}) \partial u_D/\partial s, b_E^2 \rho _E )_{L^2(E)} \). The scaling argument which leads to (11) shows that the left-hand side of (11) is \(\lesssim \Vert \rho _E\Vert _{L^2(E)}\). The combination with the previously displayed identity leads to

$$\begin{aligned} \Vert \rho _E\Vert _{L^2(E)}^2\lesssim & {} \Vert \rho _E\Vert _{L^2(E)} \left( h_E^{-1/2} \Vert \varvec{\varepsilon }_h-\varepsilon (v)\Vert _{L^2(\omega _E)} \right. \\&\left. + h_E^{3/2} \Vert {\text {rot}}_{NC}^2\varvec{\varepsilon }_h\Vert _{L^2(\omega _E)} + h_E^{-1/2} {\text {osc}}_I(u_D,E) \right) . \end{aligned}$$

This and Lemma 5 conclude the proof. \(\square \)

The edge-bubble functions for the second edge residuals are defined slightly differently to ensure some vanishing normal derivative.

Lemma 8

(efficiency of second interior edge residual) Any \(v\in H^1(\omega _E;{\mathbb {R}}^2)\), \(E\in {\mathscr {E}}(\varOmega )\), satisfies

$$\begin{aligned} h_E^{3/2} \Vert \tau _E\cdot ( [{\text {rot}}_{NC} \varvec{\varepsilon }_h]_E - \partial [\varvec{\varepsilon }_h]_E/\partial s\, \nu _E)\Vert _{L^2(E)} \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(\omega _E)}. \end{aligned}$$

Proof

There are many ways to define an edge-bubble function for this situation. One may first select an open ball \(B(x_E,2r_E) \subset \omega _E\) around a point \(x_E\in E\) with maximal radius \(2r_E\), which is entirely included in \(\omega _E\). The characteristic function \(\chi _{B(x_E,r_E)}\) of the smaller ball \(B(x_E,r_E)\) may be regularized with a standard mollification \(\eta _{r_E}\) to define the smooth function \(b:=\chi _{B(x_E,r_E)}*\eta _{r_E}\in C_c^\infty (\omega _E)\) with values in [0, 1] and with \(\nabla b\cdot \nu _E=0\) along E. The polynomial \(\rho _E:= \tau _E\cdot ( [{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E - \partial [\varvec{\varepsilon }_h]_E/\partial s \, \nu _E)\) and its extension \(P_E \rho _E\) define the test function \(\phi := b P_E \rho _E\in C_0^\infty (\omega _E)\) in Lemma 1. The representation formula and \((\varepsilon (v),{\text {Curl}}^2\phi )_{L^2(\omega _E)}=0\) lead to

$$\begin{aligned} \Vert b^{1/2} \rho _E\Vert _{L^2(E)}^2 = (\varepsilon (v)-\varvec{\varepsilon }_h,{\text {Curl}}^2\phi )_{L^2(\omega _E)} + ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)} . \end{aligned}$$

The inverse inequality \(\Vert \rho _E\Vert _{L^2(E)} \lesssim \Vert b^{1/2} \rho _E\Vert _{L^2(E)}\), Cauchy–Schwarz inequalities, and the right scaling properties of \(\phi \) lead to

$$\begin{aligned} \Vert \rho _E\Vert _{L^2(E)}^2\lesssim \Vert \rho _E\Vert _{L^2(E)}\left( h_E^{-3/2} \Vert \varvec{\varepsilon }_h-\varepsilon (v) \Vert _{L^2(\omega _E)} + h_E^{1/2}\Vert {\text {rot}}_{NC}^2\varvec{\varepsilon }_h\Vert _{L^2(\omega _E)} \right) . \end{aligned}$$

This and Lemma 5 conclude the proof. \(\square \)

The efficiency of the last edge contribution involves the second Dirichlet data oscillation \({\text {osc}}_{II}(u_D,E)\) from (6).

Lemma 9

(efficiency of second boundary edge residual) Any \(v\in H^1(\omega _E;{\mathbb {R}}^2)\) with \(v|_E=u_D|_E\) along \(E\in {\mathscr {E}}(\varGamma _D)\) satisfies

$$\begin{aligned}&h_E^{3/2} \Vert \tau _E\cdot {\text {rot}}\varvec{\varepsilon }_h- \nu _E\cdot \left( \frac{ \partial \varvec{\varepsilon }_h\tau _E }{\partial s } -\frac{ \partial ^2 u_D}{ \partial s^2}\right) \Vert _{L^2(E)}\\&\quad \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(\omega _E)} + {\text {osc}}_{II}(u_D,E). \end{aligned}$$

Proof

Select a maximal open ball \(B(x_E,2r_E)\cap \varOmega \subset \omega _E\) around a point \(x_E\in E\) with maximal radius \(2r_E\) such that \(B(x_E,2r_E)\cap \omega _E\) is a half ball. The regularization \(b:=\chi _{B(x_E,r_E)}*\eta _{r_E}\in C_c^\infty ({\mathbb {R}}^2)\) of the characteristic function \(\chi _{B(x_E,r_E)}\) attains values in [0, 1] and a positive integral mean \(h_E^{-1} \int _E b\, ds \approx 1\) along E (depending only on the shape regularity of \({\mathscr {T}}\)); b vanishes on \(\partial \omega _E{\setminus } E\) and its normal derivative \(\nabla b\cdot \nu \) vanishes along the entire boundary \(\partial \omega _E\).

The Pythagoras theorem \( \Vert \rho \Vert _{L^2(E)}^2 = \Vert \rho _E \Vert _{L^2(E)}^2 + h_E^{-3} {\text {osc}}_{II}^2(u_D,E)\) for the residual \(\rho :=\tau _E\cdot {\text {rot}}\varvec{\varepsilon }_h- \nu _E\cdot (\frac{ \partial \varvec{\varepsilon }_h\tau _E }{\partial s } -\frac{ \partial ^2 u_D}{ \partial s^2})\) and its \(L^2\) projection \(\rho _E:=\varPi _{m,E}\rho \) onto \(P_m(E)\) reduces the proof to the estimation of \(\Vert \rho _E \Vert _{L^2(E)}\). The normal derivative of \(\phi := b\, P_E \rho _E\in C^\infty (\overline{\omega _E})\) vanishes along the boundary \(\partial \omega _E\) and Lemma 1 shows

$$\begin{aligned} \left( {\text {rot}}\varvec{\varepsilon }_h-\frac{ \partial \varvec{\varepsilon }_h\nu _E}{\partial s} , b \rho _E \tau _E\right) _{L^2(E)} = ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)}-(\varvec{\varepsilon }_h, {\text {Curl}}^2 \phi )_{L^2(\omega _E)}. \end{aligned}$$

The arguments in Lemma 2 show \( (\partial ^2 u_D/\partial s^2 , b\rho _E \, \nu _E)_{L^2(E)} = (\varepsilon (v), {\text {Curl}}^2 \phi )_{L^2(\omega _E)}\). The combination of the two identities leads to a formula for \( (\rho ,b\rho _E) _{L^2(E)}\). Since \(\rho -\rho _E\) is controlled in \({\text {osc}}_{II}^2(u_D,E)\), this and an inverse inequality in the beginning result in

$$\begin{aligned}&\Vert \rho _E\Vert _{L^2(E)}^2\lesssim (b\rho _E,\rho _E) _{L^2(E)} =(\rho , b\rho _E) _{L^2(E)}- (\rho -\rho _E, b\rho _E) _{L^2(E)} \\&\; \lesssim ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)} -(\varvec{\varepsilon }_h-\varepsilon (v), {\text {Curl}}^2 \phi )_{L^2(\omega _E)} +\Vert \rho _E\Vert _{L^2(E) }h_E^{-3/2} {\text {osc}}_{II}(u_D,E). \end{aligned}$$

The scaling properties of \(\phi \) and its derivatives are as in the proof of the previous lemma. With Lemma 5 in the end, this concludes the proof. \(\square \)

4 Numerical examples

This section is devoted to numerical experiments for four different domains to demonstrate robustness in the reliability and efficiency of the a posteriori error estimator \(\eta ({\mathscr {T}}_\ell ,\sigma _\ell )\). The implementation follows [12, 15, 16] for \(k=1\) with Lamé parameters \(\lambda \) and \(\mu \) from \(\lambda = E\nu /( (1+\nu )(1-2\nu ))\) and \(\mu ={E}/(2(1+\nu ))\) for a Young’s modulus \(E=10^5\) and the two Poisson ratios \( \nu =0.3\) and \( \nu = 0.4999\).
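To make the two parameter regimes concrete, the following minimal sketch evaluates the stated formulas for \(\lambda \) and \(\mu \); the function name `lame_parameters` is a hypothetical helper and not part of the implementation in [12, 15, 16].

```python
def lame_parameters(young, poisson):
    """Lame parameters from Young's modulus E and Poisson ratio nu (formulas of Sect. 4)."""
    lam = young * poisson / ((1 + poisson) * (1 - 2 * poisson))
    mu = young / (2 * (1 + poisson))
    return lam, mu

ratios = {}
for poisson in (0.3, 0.4999):
    lam, mu = lame_parameters(1e5, poisson)
    ratios[poisson] = lam / mu          # equals 2*nu/(1 - 2*nu)
    print(f"nu={poisson}: lambda={lam:.4e}, mu={mu:.4e}, lambda/mu={lam / mu:.1f}")
```

The ratio \(\lambda /\mu = 2\nu /(1-2\nu )\) grows from 1.5 for \(\nu =0.3\) to 4999 for \(\nu =0.4999\), which is the nearly incompressible regime addressed by the robustness claims.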

4.1 Academic example

The model problem (1) on the unit square \(\varOmega = (0,1)^2 \) with homogeneous Dirichlet boundary conditions and the right-hand side \(f=(f_1,f_2)\),

$$\begin{aligned} f_1(x,y) = -f_2(y,x)= -2\mu \pi ^3\cos (\pi y)\sin (\pi y)(2\cos (2\pi x) - 1)\quad \text {for }(x,y)\in \varOmega , \end{aligned}$$

allows the smooth exact solution

$$\begin{aligned} u(x,y) {=} \pi \, \sin (\pi x)\sin (\pi y)\, \left( \cos (\pi y) \sin (\pi x), -\cos (\pi x) \sin (\pi y)\right) \quad \text {for }(x,y){\in }\varOmega . \end{aligned}$$

Note that f depends only on the Lamé parameter \(\mu \) and not on the critical Lamé parameter \(\lambda \). For uniform mesh refinement, Fig. 1 displays the robust third-order convergence of the a posteriori error estimator \(\eta ({\mathscr {T}}_\ell ,\sigma _\ell )\) as well as the Arnold–Winther finite element stress error. The convergence is robust in the Poisson ratio \(\nu \rightarrow 1/2\) and the a posteriori error estimator proves to be reliable and efficient. In this example, the oscillations \({\text {osc}}(f,{\mathscr {T}}_\ell )\) dominate the a posteriori error estimator.

This typical observation motivates numerical examples with \(f\equiv 0\) in the sequel.
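The right-hand side of the academic example can be cross-checked against the exact solution. The sketch below assumes the sign convention \(f=-{\text {div}}\,\sigma \) with \(\sigma =2\mu \,\varepsilon (u)+\lambda \,{\text {tr}}(\varepsilon (u))\,1_{2\times 2}\) for the model problem (1), which is not restated in this section, and approximates the derivatives with nested central differences. The evaluation point, the step size, and all function names are hypothetical choices; the step size is not tuned for the nearly incompressible case.

```python
import math

def u_exact(x, y):
    """Exact displacement of the academic example."""
    s = math.pi * math.sin(math.pi * x) * math.sin(math.pi * y)
    return (s * math.cos(math.pi * y) * math.sin(math.pi * x),
            -s * math.cos(math.pi * x) * math.sin(math.pi * y))

def f_formula(x, y, mu):
    """Right-hand side as stated in the text, with f_1(x,y) = -f_2(y,x)."""
    def f1(a, b):
        return (-2 * mu * math.pi**3 * math.cos(math.pi * b) * math.sin(math.pi * b)
                * (2 * math.cos(2 * math.pi * a) - 1))
    return (f1(x, y), -f1(y, x))

def stress(x, y, lam, mu, h):
    """sigma = 2*mu*eps(u) + lam*tr(eps(u))*I with central-difference gradients."""
    ux = [(p - m) / (2 * h) for p, m in zip(u_exact(x + h, y), u_exact(x - h, y))]
    uy = [(p - m) / (2 * h) for p, m in zip(u_exact(x, y + h), u_exact(x, y - h))]
    e11, e22, e12 = ux[0], uy[1], 0.5 * (uy[0] + ux[1])
    tr = e11 + e22
    return ((2 * mu * e11 + lam * tr, 2 * mu * e12),
            (2 * mu * e12, 2 * mu * e22 + lam * tr))

def f_numeric(x, y, lam, mu, h=1e-3):
    """Approximate -div(sigma) row-wise by central differences of the stress."""
    sxp, sxm = stress(x + h, y, lam, mu, h), stress(x - h, y, lam, mu, h)
    syp, sym = stress(x, y + h, lam, mu, h), stress(x, y - h, lam, mu, h)
    return tuple(-((sxp[i][0] - sxm[i][0]) + (syp[i][1] - sym[i][1])) / (2 * h)
                 for i in range(2))

young, poisson = 1e5, 0.3
lam = young * poisson / ((1 + poisson) * (1 - 2 * poisson))
mu = young / (2 * (1 + poisson))

fn = f_numeric(0.3, 0.7, lam, mu)
ff = f_formula(0.3, 0.7, mu)
rel_err = max(abs(p - q) for p, q in zip(fn, ff)) / max(abs(c) for c in ff)
print(rel_err)
```

The agreement reflects that the exact displacement is divergence free, so that the \(\lambda \)-dependent part of the stress vanishes and \(f=-\mu \,\varDelta u\) depends on \(\mu \) only.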

Fig. 1 Convergence history plot in academic example

Fig. 2 Domain circular inclusion

Fig. 3 Convergence history plot in circular inclusion benchmark

4.2 Circular inclusion

The second benchmark example from the literature models a rigid circular inclusion in an infinite plate for the domain \(\varOmega \) with rather mixed boundary conditions indicated with mechanical symbols in Fig. 2. The exact solution [23] to the model problem (1) reads (with polar coordinates \((r,\phi )\) and \(\kappa = 3-4\nu \), \(\gamma = 2\nu - 1\), \(a=1/4\))

$$\begin{aligned} u_r&= \frac{1}{8\mu r}\left( (\kappa -1)r^2 + 2\gamma a^2 + \left( 2r^2 -\frac{2(\kappa +1)a^2}{\kappa } + \frac{2 a^4}{\kappa r^2}\right) \cos (2\phi )\right) ,\\ u_\phi&= -\frac{1}{8\mu r}\left( 2r^2-\frac{2(\kappa -1)a^2}{\kappa } - \frac{2a^4}{\kappa r^2}\right) \sin (2\phi ). \end{aligned}$$
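A quick plausibility check of the displayed formulas is that the displacement vanishes on the circle \(r=a\), as one expects for a rigid inclusion. The following sketch evaluates \(u_r\) and \(u_\phi \) as stated above; the function name `inclusion_displacement` and the sample angles are hypothetical.

```python
import math

def inclusion_displacement(r, phi, poisson, mu, a=0.25):
    """Polar components (u_r, u_phi) of the exact solution stated in Sect. 4.2."""
    kappa = 3 - 4 * poisson
    gamma = 2 * poisson - 1
    u_r = ((kappa - 1) * r**2 + 2 * gamma * a**2
           + (2 * r**2 - 2 * (kappa + 1) * a**2 / kappa + 2 * a**4 / (kappa * r**2))
           * math.cos(2 * phi)) / (8 * mu * r)
    u_phi = -(2 * r**2 - 2 * (kappa - 1) * a**2 / kappa - 2 * a**4 / (kappa * r**2)) \
        * math.sin(2 * phi) / (8 * mu * r)
    return u_r, u_phi

young, poisson = 1e5, 0.3
mu = young / (2 * (1 + poisson))

# Largest displacement component on the inclusion boundary r = a = 1/4.
on_inclusion = max(abs(c)
                   for k in range(8)
                   for c in inclusion_displacement(0.25, k * math.pi / 8, poisson, mu))

# For comparison, a point away from the inclusion.
away = inclusion_displacement(1.0, 0.0, poisson, mu)
print(on_inclusion, away)
```

Both components vanish at \(r=a\) up to rounding, because \(\kappa -1+2\gamma =0\) and the coefficients of \(\cos (2\phi )\) and \(\sin (2\phi )\) cancel there.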

The approximation of the circular inclusion through a polygon is rather critical for the higher-order Arnold–Winther MFEM. In the absence of an implementation of parametric boundaries, adaptive mesh refinement is necessary for higher improvements. The adaptive algorithm of this section is the same for all examples and acts on polygons; in particular, it does not monitor the curved boundary, but whenever some edge at the curved part \(\varGamma _D\) is refined in this example, the midpoint is a new node and projected onto \(\varGamma _D\). The convergence history plot in Fig. 3 shows a reduced convergence for uniform refinement, while adaptive refinement (of the circular boundary) leads to optimal third-order convergence.

Fig. 4 L-shaped domain

4.3 L-shaped benchmark

Consider the rotated L-shaped domain with Dirichlet and Neumann boundary depicted in Fig. 4. The exact solution reads in polar coordinates

$$\begin{aligned} u_r(r,\phi )&= \frac{r^{\alpha }}{2\mu } \left( -(\alpha +1)\cos ((\alpha +1)\phi ) + (C_2-\alpha -1)C_1\cos ((\alpha -1)\phi )\right) ,\\ u_\phi (r,\phi )&= \frac{r^{\alpha }}{2\mu } \left( (\alpha +1)\sin ((\alpha +1)\phi ) + (C_2+\alpha -1)C_1\sin ((\alpha -1)\phi )\right) . \end{aligned}$$

The constants are \(C_1 := -\cos ((\alpha +1)\omega )/\cos ((\alpha -1)\omega )\) and \(C_2:= 2(\lambda + 2\mu )/(\lambda +\mu )\), where \(\alpha = 0.544483736782\) is the first root of \(\alpha \sin (2\omega )+\sin (2\omega \alpha )=0\) for \(\omega = 3\pi /4\). The volume force \(f\equiv 0\) and the Neumann boundary data \(g\equiv 0\) vanish, and the Dirichlet boundary conditions \(u_D\) are extracted from the exact solution.
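The stated value of \(\alpha \) can be reproduced numerically. The sketch below uses plain bisection on the bracket \([0.5,0.6]\), where the left-hand side of the characteristic equation changes sign; the function names and the bracket are choices made for this illustration, and the Lamé parameters passed to the second function are left to the caller.

```python
import math


def singular_exponent(omega=3.0 * math.pi / 4.0, lo=0.5, hi=0.6, tol=1e-14):
    """First positive root of alpha*sin(2*omega) + sin(2*omega*alpha) = 0.

    Plain bisection; the bracket [lo, hi] is an assumption that happens
    to contain the root for omega = 3*pi/4.
    """
    def f(alpha):
        return alpha * math.sin(2.0 * omega) + math.sin(2.0 * omega * alpha)

    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("no sign change on the given bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def lshape_constants(alpha, lam, mu, omega=3.0 * math.pi / 4.0):
    """Constants C_1 and C_2 of the exact solution."""
    c1 = -math.cos((alpha + 1.0) * omega) / math.cos((alpha - 1.0) * omega)
    c2 = 2.0 * (lam + 2.0 * mu) / (lam + mu)
    return c1, c2


alpha = singular_exponent()
```

Since \(\sin (2\omega )=-1\) for \(\omega =3\pi /4\), the equation reduces to \(\sin (3\pi \alpha /2)=\alpha \), which also has the trivial root \(\alpha =0\); the bracket excludes it.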

Figure 5 shows suboptimal convergence \({\mathscr {O}}(N_\ell ^{-0.27})\) for uniform mesh-refinement, namely the expected rate \(\alpha \) in terms of the maximal mesh-size (equivalent to \(\alpha /2\approx 0.27\) in terms of \(N_\ell \) in two space dimensions), and fourth-order \(L^2\) stress convergence for adaptive mesh-refinement.
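Rates of this kind are typically read off a convergence history plot as the negative slope in a log-log diagram. The following sketch fits that slope by least squares; the function name is hypothetical and the data in the usage example are synthetic (generated from the exact power law \(N^{-\alpha /2}\)), not measured values from the experiments.

```python
import math


def empirical_rate(ndofs, errors):
    """Negative least-squares slope of log(error) versus log(ndof)."""
    xs = [math.log(n) for n in ndofs]
    ys = [math.log(e) for e in errors]
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return -num / den


# Synthetic illustration: uniform refinement in 2D quadruples the number
# of unknowns per level, and the error is modelled as N**(-alpha/2).
alpha = 0.544483736782
ndofs = [100 * 4**level for level in range(6)]
errors = [n ** (-alpha / 2.0) for n in ndofs]
rate = empirical_rate(ndofs, errors)
```

For quasi-uniform meshes in two dimensions the number of unknowns scales like \(h^{-2}\), so a rate \(\alpha \) in the mesh-size corresponds to \(\alpha /2\) in the number of unknowns.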

Despite the singular solution, the adaptive algorithm recovers the higher convergence of Theorem 5 as in [15].

Fig. 5. Convergence history plot in L-shaped benchmark for \(\nu =0.4999\)

Fig. 6. Cook membrane

Fig. 7. Convergence history plot in Cook’s membrane benchmark

4.4 Cook membrane problem

One of the more popular benchmarks in computational mechanics is the tapered panel \(\varOmega \) with the vertices ABCD of Fig. 6 clamped on the left side \(\varGamma _D={\text {conv}}(D,A)\) (with \(u_D\equiv 0\)) under no volume force (\(f\equiv 0\)) but applied surface tractions \(g = (0,1)\) along \({\text {conv}}(B,C)\) and traction free on the remaining parts \({\text {conv}}(A,B)\) and \({\text {conv}}(C,D)\) along the Neumann boundary.

This example is a particularly difficult one for the Arnold–Winther MFEM because of the incompatible Neumann boundary conditions at the right corners [12, 15, 16]. That is, although g is piecewise constant, g does not belong to \(G({\mathscr {T}})\) for any triangulation. At the two Neumann corner vertices B and C we therefore strongly impose the values \(\sigma _\ell (B) = (0.2491 , 0.7283; 0.7283 , 0.6676)\) and \(\sigma _\ell (C) = ( 3/20, 11/20 ; 11/20 , 11/60)\) for the design of \(g_\ell \in G({\mathscr {T}}_\ell )\).

Since the exact solution is unknown, the error approximation rests on a reference solution \({\tilde{\sigma }}\) computed from a \(P_5({\mathscr {T}})\) displacement approximation on the uniform refinement of the finest adapted triangulation.

The large pre-asymptotic range of the convergence history plot in Fig. 7 illustrates the difficulties of the Arnold–Winther finite element method in case of incompatible Neumann boundary conditions according to its nodal degrees of freedom. Once the resulting and dominating boundary oscillations (caused by the necessary choice of discrete compatible Neumann conditions in \(G({\mathscr {T}}_\ell )\)) \({\text {osc}}(g-g_\ell ,{\mathscr {E}}_\ell (\varGamma _N))\) are resolved through adaptive mesh-refining, even the fourth-order \(L^2\) stress convergence is visible in a long asymptotic range in (the approximated error and) the equivalent error estimator.

This example underlines that adaptive mesh-refining is indispensable in computational mechanics: it yields optimal rates and large savings in computational time and memory compared to naive uniform mesh-refining.

With the modifications of the Arnold–Winther MFEM for incompatible Neumann data as outlined in Appendix B, which only involve adjustments of the right-hand side at the critical incompatible nodal stress degrees of freedom, we observe optimal convergence rates from the very beginning in Fig. 8 without any visible pre-asymptotically reduced convergence caused by incompatible Neumann boundary conditions.

Fig. 8. Convergence history plot in Cook’s membrane benchmark with modifications near the right corner points

4.5 Comments

The generic constants in this paper are not worked out explicitly, and so a numerical comparison with the earlier paper [15] cannot be quantitative. It is conjectured that the residual-based error estimation with the reliability constants (for a guaranteed upper error bound) overestimates the true error by up to an order of magnitude.

The qualitative comparison in Fig. 5 (without the reliability constants for the estimators) provides numerical evidence that the error estimators of this paper converge with the same convergence rates as those from [15] and it also indicates global equivalence of the errors with the two error estimators. The theoretical evidence in [15] for efficiency depends on unrealistically high regularity assumptions – unlike the general efficiency results of this paper.