Residual-based a posteriori error analysis for symmetric mixed Arnold–Winther FEM

Carstensen, Carsten; Gallistl, Dietmar; Gedicke, Joscha

doi:10.1007/s00211-019-01029-7


Open access
Published: 28 February 2019

Volume 142, pages 205–234, (2019)

Numerische Mathematik


Abstract

This paper introduces an explicit residual-based a posteriori error analysis for the symmetric mixed finite element method in linear elasticity after Arnold–Winther with pointwise symmetric and $H({\text {div}})$-conforming stress approximation. The residual-based a posteriori error estimator of this paper is reliable and efficient and truly explicit in that it depends solely on the symmetric stress and needs neither additional information on some skew-symmetric part of the gradient nor any efficient approximation thereof. Hence, it is straightforward to implement an adaptive mesh-refining algorithm. Numerical experiments verify the proven reliability and efficiency of the new a posteriori error estimator and illustrate the improved convergence rate in comparison to uniform mesh-refining. A higher convergence rate for piecewise affine data is observed in the $L^2$ stress error and reproduced in non-smooth situations by the adaptive mesh-refining strategy.


1 Introduction

1.1 Overview

The design of a pointwise symmetric stress approximation $\sigma _h\in L^2(\varOmega ;{\mathbb {S}})$ with divergence in $L^2(\varOmega ;{\mathbb {R}}^d)$, written $\sigma _h \in H({\text {div}},\varOmega ; \mathbb S)$, has been a long-standing challenge [2] and the first positive examples in [5] initiated what nowadays is called the finite element exterior calculus [4]. The a posteriori error analysis of mixed finite element methods in elasticity started with [11] on PEERS [3], where the asymmetric stress approximation $\gamma _h$ arises in the discretization as a Lagrange multiplier to enforce weakly the stress symmetry. This allows the treatment of the term ${\mathbb {C}}^{-1} \sigma _h + \gamma _h$ as an approximation of the (nonsymmetric) functional matrix Du for the displacement field [11] with the arguments of [1, 9] developed for mixed finite element schemes for a Poisson model problem. Here and throughout, ${\mathbb {C}}$ denotes a fourth-order elasticity tensor with two Lamé constants $\lambda $ and $\mu $ and ${\mathbb {C}}^{-1}$ is its inverse. Mixed finite element methods appear attractive in the incompressible limit for they typically avoid the locking phenomenon [12] as $\lambda \rightarrow \infty $.

For mixed finite element methods like the symmetric Arnold–Winther finite element schemes [5], the subtle term is the nonconforming residual: Given any piecewise polynomial $\sigma _h \in H({\text {div}},\varOmega ; {\mathbb {S}})$, compute an upper bound $\eta ({\mathscr {T}},\sigma _h)$ of

$$\begin{aligned} \inf _{v\in V} \left\| {\mathbb {C}}^{-1/2} \sigma _h-{\mathbb {C}}^{1/2}\varepsilon (v) \right\| _{L^2(\varOmega )} \lesssim \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

Despite general results in this direction [10, 17, 18], this task had been addressed only by the computation of an approximation to the optimal v with Green strain $\varepsilon (v):={\text {sym}}D v$ or of some skew-symmetric approximation $\gamma _h$ motivated from the first results in [11] on PEERS. In fact, any choice of a piecewise smooth and pointwise skew-symmetric $\gamma _h$ allows for an a posteriori error control of the symmetric stress error $\sigma -\sigma _h$ in [15]. Its efficiency, however, depends on the (unknown and uncontrolled) efficiency of the choice of $\gamma _h$ as an approximation to the skew-symmetric part $\gamma $ of Du.

This paper proposes the first reliable and efficient explicit residual-based a posteriori error estimator of the nonconforming residual with the typical contributions to $\eta ({\mathscr {T}},\sigma _h)$ computed from the (known) Green strain approximation $\varvec{\varepsilon }_h:= {\mathbb {C}}^{-1} \sigma _h$. Besides oscillations of the applied forces in the volume and along the Neumann boundary, there is a volume contribution $h_T^2\Vert {\text {rot}}{\text {rot}}\varvec{\varepsilon }_h\Vert _{L^2(T)}$ for each triangle $T\in {\mathscr {T}}$ and an edge contribution with the jump $[\varvec{\varepsilon }_h]_E$ across an interior edge E with unit normal $\nu _E$, tangential unit vector $\tau _E$, and length $h_E$, namely

$$\begin{aligned} h_E^{1/2} \left\| \tau _E \cdot [\varvec{\varepsilon }_h]_E \tau _E \right\| _{L^2(E)} + h_E^{3/2} \left\| \tau _E \cdot [{\text {rot}}_{NC} \varvec{\varepsilon }_h]_E - \partial ( \nu _E \cdot [\varvec{\varepsilon }_h]_E \tau _E)/\partial s \right\| _{L^2(E)}, \end{aligned}$$

and a corresponding modification on the edges of the Dirichlet boundary with the (possibly inhomogeneous) Dirichlet data; cf. Remark 2 for some partial simplification of the last term displayed.

The analysis is restricted to the two-dimensional case, since it involves explicit calculations in two dimensions without any reference to the exterior calculus but with inhomogeneous Dirichlet and Neumann boundary data. The main result is reliability and efficiency to control the stress error robustly in the sense that the multiplicative generic constants hidden in the notation $\lesssim $ depend neither on the (local or global) mesh-size nor on the parameter $\lambda >0$ but may depend on $\mu >0$ and on the shape regularity of the underlying triangulation ${\mathscr {T}}$ of the domain $\varOmega $ into triangles through a lower bound of the minimal angle therein.

1.2 Linear elastic model problem

The elastic body $\varOmega $ is a simply-connected bounded Lipschitz domain $\varOmega \subset {\mathbb {R}}^2$ in the plane with a (connected) polygonal boundary $\partial \varOmega = \varGamma _D\cup \varGamma _N$ split into parts. The displacement boundary $\varGamma _D$ is compact and of positive surface measure, while the traction boundary is the relatively open complement $\varGamma _N=\partial \varOmega \backslash \varGamma _D$ with outer unit normal vector $\nu $. Given $u_D\in H^1(\varOmega ;{\mathbb {R}}^2)$, the volume force $f\in L^2( \varOmega ; {\mathbb {R}}^2)$, and the applied surface traction $g \in L^2( \varGamma _N; {\mathbb {R}}^2)$, the linear elastic problem seeks a displacement $u\in H^1(\varOmega ;{\mathbb {R}}^2)$ and a symmetric stress tensor $\sigma \in H({\text {div}},\varOmega ; {\mathbb {S}})$ with

$$\begin{aligned} \begin{aligned}&\displaystyle -{\text {div}}\sigma = f \quad \text {and}\quad \sigma = {\mathbb {C}}\varepsilon (u)\quad \text{ in } \varOmega ,\\&\displaystyle u = u_D\quad \text{ on } \varGamma _D, \qquad \sigma \nu = g\quad \text{ on } \varGamma _N. \end{aligned} \end{aligned}$$

(1)

Throughout this paper, given the Lamé parameters $\lambda ,\mu >0$ for isotropic linear elasticity, the positive definite fourth-order elasticity tensor ${\mathbb {C}}$ acts as ${\mathbb {C}}E:=2\mu \, E+ \lambda \,{\text {tr}}(E)\,1_{2\times 2} $ on any matrix $E\in {\mathbb {S}}$ with trace ${\text {tr}}(E)$ and the $2\times 2$ unit matrix $1_{2\times 2} $. Note that $u_D$ acts in (1) only on $\varGamma _D$ and is an extension of the continuous function $u_D\in C(\varGamma _D;{\mathbb {R}}^2)$ also supposed to belong to the edgewise second order Sobolev space $ H^2( {\mathscr {E}}(\varGamma _D))$ below to allow second derivatives with respect to the arc length along boundary edges.
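The action of ${\mathbb {C}}$ and its inverse can be sketched in a few lines. The following is a minimal illustration, assuming $2\times 2$ matrices stored as nested lists; the function names `apply_C` and `apply_C_inv` are hypothetical. The inverse follows from taking the trace of ${\mathbb {C}}E$, which gives ${\text {tr}}({\mathbb {C}}E)=2(\mu +\lambda ){\text {tr}}(E)$ in two dimensions and hence ${\mathbb {C}}^{-1}\sigma = (\sigma - \lambda \,{\text {tr}}(\sigma )/(2(\lambda +\mu ))\,1_{2\times 2})/(2\mu )$.

```python
def apply_C(E, lam, mu):
    """Return C E = 2*mu*E + lam*tr(E)*I for a 2x2 matrix E (nested lists)."""
    tr = E[0][0] + E[1][1]
    return [[2.0 * mu * E[i][j] + (lam * tr if i == j else 0.0)
             for j in range(2)] for i in range(2)]


def apply_C_inv(S, lam, mu):
    """Return C^{-1} S in 2D, using tr(C E) = 2*(lam + mu)*tr(E)."""
    tr = S[0][0] + S[1][1]
    shift = lam * tr / (2.0 * (lam + mu))  # equals lam * tr(C^{-1} S)
    return [[(S[i][j] - (shift if i == j else 0.0)) / (2.0 * mu)
             for j in range(2)] for i in range(2)]
```

Applied to the unit matrix, `apply_C_inv` returns $1_{2\times 2}/(2(\lambda +\mu ))$, so ${\mathbb {C}}^{-1}$ stays bounded as $\lambda \rightarrow \infty $, in line with the robustness discussed in this paper.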

More essential will be a discussion on the precise conditions on the Neumann data g and its discrete approximation $g_h$ below for they belong to the essential boundary conditions in the mixed finite element method based on the dual formulation.

In addition to the set of homogeneous displacements V and the aforementioned stress space $H({\text {div}},\varOmega ;{\mathbb {S}}) $, namely,

$$\begin{aligned} V&:=\{v\in H^1(\varOmega ;{\mathbb {R}}^2)\ \big \vert \ v|_{\varGamma _D}=0\},\\ H({\text {div}},\varOmega ;{\mathbb {S}})&:= \{ \tau \in L^2(\varOmega ;{\mathbb {S}}) \ \big \vert \ {\text {div}}\tau \in L^2(\varOmega ;{\mathbb {R}}^2)\}, \end{aligned}$$

and with the exterior unit normal vector $\nu $ along $\partial \varOmega $, the inhomogeneous stress space

$$\begin{aligned} \varSigma (g) := \left\{ \sigma \in H({\text {div}},\varOmega ;{\mathbb {S}})\ \big \vert \int _{\varGamma _N}\psi \cdot (\sigma \nu )\;ds = \int _{\varGamma _N}\psi \cdot g\;ds \text { for all } \psi \in V \right\} \end{aligned}$$

is defined with respect to the Neumann data $g\in L^2(\varGamma _N)$ and, in particular, $ \varSigma _0 :=\varSigma (0)$ abbreviates the stress space with homogeneous Neumann boundary conditions.

Given data $u_D, f,g$ as before, the dual weak formulation of (1) seeks $(\sigma ,u)\in \varSigma (g)\times L^2(\varOmega ;{\mathbb {R}}^2)$ with

$$\begin{aligned} \begin{aligned} \int _\varOmega \sigma :{\mathbb {C}}^{-1}\tau \;dx + \int _\varOmega u\cdot {\text {div}}\tau \;dx&= \int _{\varGamma _D} u_D\cdot (\tau \nu )\,ds\quad \text {for all }\tau \in \varSigma _0,\\ \int _\varOmega v\cdot {\text {div}}\sigma \;dx&= -\int _\varOmega f\cdot v\,dx\quad \text {for all } v\in L^2(\varOmega ;{\mathbb {R}}^2). \end{aligned} \end{aligned}$$

(2)

It is well known that the two formulations are equivalent and well posed in the sense that they allow for unique solutions in the above spaces; the solutions are actually slightly more regular according to the reduced elliptic regularity theory. The reader is referred to textbooks on finite element methods [6,7,8] for proofs and further descriptions.

Throughout this paper, the model problem considers truly mixed boundary conditions with the hypothesis that both $\varGamma _D$ and $\varGamma _N$ have positive length. The remaining cases of a pure Neumann problem or a pure Dirichlet problem require only standard modifications and can be treated analogously. The presentation focuses on the case of isotropic linear elasticity with constant Lamé parameters $\lambda $ and $\mu $ for brevity; many results carry over to more general situations (cf. Remarks 1 and 2 for instance).

1.3 Mixed finite element discretization

Let ${\mathscr {T}}$ denote a shape-regular triangulation of $\varOmega $ into triangles (in the sense of Ciarlet [8]) with set of nodes ${\mathscr {N}}$, set of interior edges ${\mathscr {E}}(\varOmega )$, set of Dirichlet edges ${\mathscr {E}}(\varGamma _D)$ and set of Neumann edges ${\mathscr {E}}(\varGamma _N)$. The triangulation is compatible with the boundary pieces $\varGamma _D$ and $\varGamma _N$ in that the boundary condition changes only at some vertex in ${\mathscr {N}}$ and $\varGamma _D$ (resp. $\overline{\varGamma _N}$) is partitioned in ${\mathscr {E}}(\varGamma _D)$ (resp. ${\mathscr {E}}(\varGamma _N)$).

The piecewise polynomials (piecewise with respect to the triangulation ${\mathscr {T}}$) of total degree at most $k\in {\mathbb {N}}_0$ are denoted as $P_k({\mathscr {T}})$, their vector- or matrix-valued versions as $P_k({\mathscr {T}};{\mathbb {R}}^2)$ or $P_k({\mathscr {T}};{\mathbb {R}}^{2\times 2})$ etc. The subordinated Arnold–Winther finite element space $ AW _k({\mathscr {T}})$ of index $k\in {\mathbb {N}}$ [5] reads

$$\begin{aligned} { AW }_k({\mathscr {T}}):=\left\{ \tau \in P_{k+2}({\mathscr {T}};{\mathbb {S}})\cap H({\text {div}},\varOmega ;{\mathbb {S}})\, \big |\,{\text {div}}\tau \in P_k({\mathscr {T}};{\mathbb {R}}^2)\right\} . \end{aligned}$$
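The size of the local space on one triangle follows from a simple count. The sketch below assumes that the divergence maps $P_{k+2}(T;{\mathbb {S}})$ onto $P_{k+1}(T;{\mathbb {R}}^2)$, a property taken from the construction in [5] and not restated here, so that the constraint ${\text {div}}\tau \in P_k(T;{\mathbb {R}}^2)$ removes exactly $\dim P_{k+1}(T;{\mathbb {R}}^2)-\dim P_k(T;{\mathbb {R}}^2)$ degrees of freedom. The helper names are hypothetical.

```python
def dim_Pk(k, components=1):
    """Dimension of polynomials of total degree <= k in 2D, per component."""
    return components * (k + 1) * (k + 2) // 2


def local_dim_AW(k):
    """Local dimension of AW_k on one triangle (assumes div is onto P_{k+1})."""
    symmetric_matrices = dim_Pk(k + 2, components=3)  # 3 independent entries
    constraints = dim_Pk(k + 1, components=2) - dim_Pk(k, components=2)
    return symmetric_matrices - constraints
```

For the lowest-order case $k=1$ the count gives 24 local stress degrees of freedom, paired with `dim_Pk(1, components=2)`, that is 6, local displacement degrees of freedom in $V_h$.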

The Neumann boundary conditions are essential conditions and are traditionally implemented by some approximation $g_h$ to g in the normal trace space

$$\begin{aligned} G({\mathscr {T}}):=\left\{ (\tau _h \nu ) |_{\varGamma _N} \in L^2(\varGamma _N;{\mathbb {R}}^2) \,\big |\, \tau _h\in AW _k({\mathscr {T}}) \right\} \end{aligned}$$

(recall that $\nu $ is the exterior unit normal along the boundary). Given any $g_h\in G({\mathscr {T}})$, the discrete stress approximations are sought in the non-void affine subspace

$$\begin{aligned} \varSigma (g_h,{\mathscr {T}}):=\varSigma (g_h) \cap AW _k({\mathscr {T}}) \end{aligned}$$

of $ AW _k({\mathscr {T}})$ with test functions in the linear subspace $\varSigma (0,{\mathscr {T}}):=\varSigma _0\,\cap \, AW _k({\mathscr {T}})$. Then there exists a unique discrete solution $\sigma _h\in \varSigma (g_h,{\mathscr {T}})$ and $ u_h\in V_h:= P_k({\mathscr {T}};{\mathbb {R}}^2)$ to

$$\begin{aligned} \begin{aligned} \int _\varOmega \sigma _h :{\mathbb {C}}^{-1}\tau _h\,dx+\int _\varOmega u_h\cdot {\text {div}}\tau _h\,dx&= \int _{\varGamma _D} u_D \cdot (\tau _h\nu )\, ds \quad \text {for all } \tau _h\in \varSigma (0,{\mathscr {T}}),\\ \int _\varOmega v_{h}\cdot {\text {div}}\sigma _h\, dx&= -\int _\varOmega f\cdot v_{h}\, dx\quad \text { for all } v_{h}\in V_h. \end{aligned} \end{aligned}$$

(3)

The explicit design of a Fortin projection leads in [5] to quasi-optimal a priori error estimates for an exact solution $(\sigma , u)\in (\varSigma (g)\cap H^{k+2}(\varOmega ;{\mathbb {S}})) \times H^{k+2}(\varOmega )$ to (1) and the approximate solution $(\sigma _h, u_h)$ to (3), namely (with the maximal mesh-size h)

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{L^2(\varOmega )}&\lesssim h^{m}\Vert \sigma \Vert _{H^m(\varOmega )}\quad \text{ for } 1\le m \le k+2,\\ \Vert u- u_h\Vert _{L^2(\varOmega )}&\lesssim h^{m}\Vert u\Vert _{H^{m+1}(\varOmega )}\quad \text{ for } 1\le m \le k+1. \end{aligned}$$
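In computations, such rates are commonly checked through the experimental order of convergence between two meshes. The following minimal sketch assumes that error values and mesh-sizes on two consecutive uniform refinements are available; the function name `eoc` is hypothetical.

```python
import math


def eoc(h_coarse, err_coarse, h_fine, err_fine):
    """Experimental order of convergence from two (mesh-size, error) pairs.

    If err behaves like C * h**m, the returned value approximates m.
    """
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)
```

For an error that behaves exactly like $C h^3$, halving the mesh-size reduces the error by a factor of 8 and `eoc` returns 3.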

Another stable pair of different and mesh-depending norms in [14] implies the $L^2$ best approximation of the stress error $\sigma -\sigma _h$ up to a generic multiplicative constant and data oscillations on f under some extra condition (N) on the Neumann data approximation $g_h$ implied by the first and zero moment orthogonality assumption $g-g_h\perp P_1( {\mathscr {E}}(\varGamma _N);{\mathbb {R}}^2)$ ($\perp $ indicates orthogonality in $L^2(\varGamma _N)$) met in all the numerical examples of this paper.

For simple benchmark examples with piecewise polynomial data f and g, there is even a superconvergence phenomenon visible in numerical examples. The arguments of this paper allow a proof of fourth-order convergence of the $L^2$ stress error $\Vert \sigma -\sigma _h\Vert ={\mathscr {O}}( h^4 )$ in the lowest-order Arnold–Winther method with $k=1$ for a smooth stress $\sigma \in H^4(\varOmega ;{\mathbb {S}})$ with $f=f_h\in P_1({\mathscr {T}};{\mathbb {R}}^2)$ and $g=g_h\in G({\mathscr {T}})$. (In fact, once the data are not piecewise affine, the arising oscillation terms are only of at most third order and the aforementioned convergence estimates are sharp.)

This is stated as Theorem 5 in the appendix, because the a priori error analysis lies outside of the main focus of this work. It is surprising though that the adaptive mesh-refining suggested in this paper recovers this higher convergence rate even for the inconsistent Neumann data in the Cook membrane benchmark example below.

1.4 Explicit residual-based a posteriori error estimator

The novel explicit residual-based error estimator for the discrete solution $(\sigma _h, u_h)$ to (3) depends only on the Green strain approximation ${\mathbb {C}}^{-1}\sigma _h$ and its piecewise derivatives and jumps across edges.

Given any edge E of length $h_E$, let $\nu _E$ denote the unit normal vector (chosen with a fixed orientation such that it points outside along the boundary $\partial \varOmega $ of $\varOmega $) and let $\tau _E$ denote its tangential unit vector; by convention $\tau _E = (0,-1; 1,0) \nu _E$ with the indicated asymmetric $2\times 2$ matrix. The tangential derivative $\tau _E\cdot \nabla \bullet $ along an edge (or boundary) is identified with the one-dimensional derivative $\partial \bullet /\partial s$ with respect to the arc-length parameter s. The jump $[v]_E$ of any piecewise continuous scalar, vector, or matrix v across an interior edge $E = \partial T_+\cap \partial T_-$ shared by the two triangles $T_+$ and $T_-$ such that $\nu _E$ points outside $T_+$ along $E\subset \partial T_+$ reads

$$\begin{aligned}{}[v]_E := (v|_{T_+})|_E -(v|_{T_-})|_E. \end{aligned}$$

The rotation acts on a vector field $\varPhi $ (and row-wise on matrices) via ${\text {rot}}\varPhi := \partial _1 \varPhi _2 - \partial _2 \varPhi _1$ and ${\text {rot}}_{NC}$ denotes its piecewise application.
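The conventions of this subsection can be illustrated pointwise. The sketch below assumes $2\times 2$ matrices stored as nested lists and, for the rotation, a matrix field that is affine on a triangle, $M(x)=A+B\,x_1+C\,x_2$ with constant matrices $A,B,C$; all function names are hypothetical.

```python
def tangent(nu):
    """tau_E = (0,-1; 1,0) nu_E: the normal rotated counterclockwise by 90 degrees."""
    return (-nu[1], nu[0])


def jump(value_plus, value_minus):
    """[v]_E = v|_{T+} - v|_{T-} at a point of E, for 2x2 matrices."""
    return [[value_plus[i][j] - value_minus[i][j] for j in range(2)]
            for i in range(2)]


def bilinear(a, M, b):
    """Return a . (M b) for vectors a, b and a 2x2 matrix M."""
    return sum(a[i] * M[i][j] * b[j] for i in range(2) for j in range(2))


def rot_rowwise_affine(B, C):
    """Row-wise rot of M(x) = A + B*x1 + C*x2: d_1 M_i2 - d_2 M_i1 per row i."""
    return (B[0][1] - C[0][0], B[1][1] - C[1][0])
```

With `eps_plus` and `eps_minus` the traces of $\varvec{\varepsilon }_h$ from $T_+$ and $T_-$ at a point of $E$, the call `bilinear(tangent(nu), jump(eps_plus, eps_minus), tangent(nu))` evaluates the tangential-tangential jump $\tau _E \cdot [\varvec{\varepsilon }_h]_E \tau _E$ at that point.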

With the present notation and the abbreviation $\varvec{\varepsilon }_h:={\mathbb {C}}^{-1}\sigma _h$ used throughout, the explicit residual-based a posteriori error estimator reads

$$\begin{aligned}&\eta ^2({\mathscr {T}},\sigma _h)\nonumber \\&\quad :=\sum _{T\in {\mathscr {T}}}h_T^4 \Vert {\text {rot}}{\text {rot}}\varvec{\varepsilon }_h\Vert _{L^2(T)}^2 +{\text {osc}}^2(f,{\mathscr {T}}) + {\text {osc}}^2(g-g_h,{\mathscr {E}}(\varGamma _N)) \nonumber \\&\qquad +\displaystyle \sum _{E\in {\mathscr {E}}(\varOmega )} \left( h_E \Vert \tau _E \cdot [\varvec{\varepsilon }_h]_E \tau _E \Vert _{L^2(E)}^2 + h_E^3 \Vert \tau _E\cdot \left( [{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E - \frac{ \partial [\varvec{\varepsilon }_h]_E \nu _E }{\partial s }\right) \Vert _{L^2(E)}^2 \right) \nonumber \\&\qquad +\displaystyle \sum _{E\in {\mathscr {E}}(\varGamma _D)} \left( h_E\Vert \tau _E \cdot \left( \varvec{\varepsilon }_h\tau _E - \frac{ \partial u_D}{\partial s}\right) \Vert _{L^2(E)}^2 +\, h_E^3 \Vert \tau _E\cdot {\text {rot}}\varvec{\varepsilon }_h- \nu _E\cdot \left( \frac{ \partial \varvec{\varepsilon }_h\tau _E }{\partial s } -\frac{ \partial ^2 u_D}{ \partial s^2}\right) \Vert _{L^2(E)}^2\right) \nonumber \\ \end{aligned}$$

(4)

for the oscillations ${\text {osc}}(f,{\mathscr {T}})$ of the volume force and the oscillations of the traction boundary condition ${\text {osc}}(g-g_h,{\mathscr {E}}(\varGamma _N))$, defined below.
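Once the local norms are available, the assembly of (4) is a weighted sum of squares. The following minimal sketch assumes that the $L^2$ norms of the volume residual and of the two edge residuals have already been computed by some quadrature; the function name `estimator` and the data layout are hypothetical and not prescribed by this paper.

```python
import math


def estimator(volume, interior_edges, dirichlet_edges, osc_f_sq, osc_g_sq):
    """Assemble eta(T, sigma_h) from precomputed local L2 norms.

    volume:          iterable of (h_T, ||rot rot eps_h||_{L2(T)})
    interior_edges:  iterable of (h_E, first edge norm, second edge norm),
                     the two norms being those weighted by h_E and h_E^3 in (4)
    dirichlet_edges: same layout for the Dirichlet edge terms in (4)
    osc_f_sq, osc_g_sq: squared data oscillations osc^2(f) and osc^2(g - g_h)
    """
    eta_sq = osc_f_sq + osc_g_sq
    eta_sq += sum(h ** 4 * r ** 2 for h, r in volume)
    eta_sq += sum(h * a ** 2 + h ** 3 * b ** 2 for h, a, b in interior_edges)
    eta_sq += sum(h * a ** 2 + h ** 3 * b ** 2 for h, a, b in dirichlet_edges)
    return math.sqrt(eta_sq)
```

Keeping the summands per triangle and per edge, rather than only the total, gives the local refinement indicators that an adaptive mesh-refining algorithm would use.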

Theorem 1

(reliability) There exists a mesh-size and $\lambda $ independent constant $C_{\text {rel}} $ (which may depend on $\mu $ and on the shape-regularity of the triangulation ${\mathscr {T}}$ through a global lower bound of the minimal angle therein) such that the exact (resp. discrete) stress $\sigma $ from (1) [resp. $\sigma _h$ from (3)] with $g-g_h\perp P_0({\mathscr {E}}(\varGamma _N);{\mathbb {R}}^2)$ and the error estimator (4) satisfy

$$\begin{aligned} \Vert \sigma - \sigma _h \Vert _{L^2(\varOmega )} \le C_{\text {rel}} \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

The a posteriori error estimator $\eta ({\mathscr {T}},\sigma _h)$ already involves two data oscillation terms ${\text {osc}}(f,{\mathscr {T}})$ and ${\text {osc}}(g-g_h,{\mathscr {E}}(\varGamma _N))$ defined as the square roots of the respective terms in

$$\begin{aligned}&\displaystyle {\text {osc}}^2(f,{\mathscr {T}}) := \sum _{T\in {\mathscr {T}}} h_T^2\Vert f{-}f_h\Vert _{L^2(T)}^2 \text { for the }L^2\text { projection }f_h \text { of } f \text { onto } P_k({\mathscr {T}};{\mathbb {R}}^2); \\&\displaystyle {\text {osc}}^2(g-g_h,{\mathscr {E}}(\varGamma _N)) := \sum _{E\in {\mathscr {E}}(\varGamma _N)} h_E \Vert g-g_h\Vert _{L^2(E)}^2. \end{aligned}$$
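The Neumann oscillation can be approximated edge by edge with a quadrature rule. The sketch below assumes that $g$ and $g_h$ are given on each edge as functions of the arc length $s\in [0,h_E]$ returning a pair of components, and it uses a three-point Gauss rule as an illustrative choice; the function names are hypothetical.

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1]: (node, weight) pairs.
GAUSS3 = [(-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0)]


def l2_norm_sq_on_edge(func, h_E):
    """Approximate the integral of |func(s)|^2 over [0, h_E].

    func returns a tuple of components; the rule is exact whenever
    |func|^2 is a polynomial of degree at most 5 in s.
    """
    total = 0.0
    for x, w in GAUSS3:
        s = 0.5 * h_E * (x + 1.0)  # map the reference node to [0, h_E]
        total += 0.5 * h_E * w * sum(c * c for c in func(s))
    return total


def osc_sq_neumann(edges):
    """osc^2(g - g_h): sum of h_E * ||g - g_h||^2_{L2(E)} over the given edges.

    edges: iterable of (h_E, g, g_h) with g, g_h functions of the arc length.
    """
    total = 0.0
    for h_E, g, g_h in edges:
        diff = lambda s, g=g, g_h=g_h: tuple(a - b for a, b in zip(g(s), g_h(s)))
        total += h_E * l2_norm_sq_on_edge(diff, h_E)
    return total
```

The volume oscillation ${\text {osc}}^2(f,{\mathscr {T}})$ has the same structure with a triangle quadrature and the weight $h_T^2$.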

For any edge E and a degree $m\ge k+2$, let $\varPi _{m,E}:L^2(E)\rightarrow P_{m}(E)$ denote the $L^2$ projection onto polynomials of degree at most m. For any $E\in {\mathscr {E}}(\varGamma _D)$ define the two Dirichlet data oscillation terms

$$\begin{aligned} {\text {osc}}^2_I(u_D,E):= & {} h_E\Vert (1-\varPi _{m,E}) \partial (u_D\cdot \tau _E)/\partial s \Vert _{L^2(E)}^2, \end{aligned}$$

(5)

$$\begin{aligned} {\text {osc}}^2_{II}(u_D,E):= & {} h_E^3\Vert (1-\varPi _{m,E}) \partial ^2 ( u_D\cdot \nu _E )/ \partial s^2\Vert _{L^2(E)}^2. \end{aligned}$$

(6)

Their sum defines the overall Dirichlet data approximation ${\text {osc}}(u_D,{\mathscr {E}}(\varGamma _D))$ as the square root of

$$\begin{aligned} {\text {osc}}^2(u_D,{\mathscr {E}}(\varGamma _D)):=\sum _{E\in {\mathscr {E}}(\varGamma _D)} \left( {\text {osc}}^2_I(u_D,E)+{\text {osc}}^2_{II}(u_D,E)\right) . \end{aligned}$$

The analysis of Sect. 3 is local and states for each of the five local residuals an upper bound related to the error in a neighborhood. The global efficiency is displayed as follows.

Theorem 2

(efficiency) There exists a mesh-size and $\lambda $ independent constant $C_{\text {eff}} $ (which may depend on $\mu $ and on the shape-regularity of the triangulation ${\mathscr {T}}$ through a global lower bound of the minimal angle therein) such that the exact (resp. discrete) stress $\sigma $ from (1) [resp. $\sigma _h$ from (3)] with $g-g_h\perp P_0({\mathscr {E}}(\varGamma _N);{\mathbb {R}}^2)$ and the error estimator (4) satisfy

$$\begin{aligned} C_{\text {eff}}^{-1} \eta ({\mathscr {T}},\sigma _h) \le \Vert \sigma - \sigma _h \Vert _{L^2(\varOmega )} +{\text {osc}}(f,{\mathscr {T}})+{\text {osc}}(g-g_h,{\mathscr {E}}(\varGamma _N))+{\text {osc}}(u_D,{\mathscr {E}}(\varGamma _D)). \end{aligned}$$

1.5 Outline of the paper

The remaining parts of this paper provide a mathematical proof of Theorems 1 and 2 and numerical evidence in computational experiments on the novel a posteriori error estimation and its robustness as well as on associated mesh-refining algorithms.

The proof of the reliability of Theorem 1 in Sect. 2 adopts arguments of [11, 15] and carries out two integrations by parts on each triangle plus a one-dimensional integration by parts along all edges. The resulting terms are in fact locally efficient, as shown in Sect. 3 with slight generalizations of the bubble-function methodology due to Verfürth [24]. The five lemmas of Sect. 3 give slightly sharper results and in total imply Theorem 2.

The point in Theorems 1 and 2 is that the universal constants $C_{\text {rel}}$ and $C_{\text {eff}}$ may depend on the Lamé parameter $\mu $ but are independent of the critical Lamé parameter $\lambda $ as supported by the benchmark examples of the concluding Sect. 4. Adaptive mesh-refining proves to be highly effective with the novel a posteriori error estimator even for incompatible Neumann data. Four benchmark examples with the Poisson ratio $\nu =0.3$ or 0.4999 provide numerical evidence of the robustness of the reliable and efficient a posteriori error estimation and for the fourth-order convergence of Theorem 5.

Three appendices highlight some improvements in the numerical benchmarks: Appendix A explains the improved convergence order for piecewise affine data and B and C explain how to treat incompatible Neumann data successfully.

1.6 Comments on general notation

Standard notation on Lebesgue and Sobolev spaces and norms is adopted throughout this paper and, for brevity, $\Vert \cdot \Vert :=\Vert \cdot \Vert _{L^2(\varOmega )}$ denotes the $L^2$ norm. The piecewise action of a differential operator is denoted with a subindex NC, e.g., $\nabla _{NC}$ denotes the piecewise gradient $(\nabla _{NC} \bullet )|_T := \nabla (\bullet |_T)$ for all $T\in {\mathscr {T}}$. Sobolev functions are usually defined on open sets and the notation $W^{m,p}(T)$ (resp. $W^{m,p}({\mathscr {T}})$) substitutes $W^{m,p}({\text {int}}(T))$ for a (compact) triangle T and its interior ${\text {int}}(T)$ (resp. $W^{m,p}({\text {int}}({\mathscr {T}}))$) and their vector and matrix versions.

For a differentiable function $\phi $, ${\text {Curl}}\phi := (-\partial _2 \phi , \partial _1 \phi )$ is the rotated gradient; for a two-dimensional vector field $\varPhi $, ${\text {Curl}}\varPhi $ is the $2\times 2$ matrix-valued rotated gradient

$$\begin{aligned} {\text {Curl}}\varPhi := (-\partial _2 \varPhi _1 , \partial _1 \varPhi _1 ; -\partial _2 \varPhi _2 , \partial _1 \varPhi _2 ) =D\varPhi (0,1; -1,0). \end{aligned}$$

(The signs are not uniquely determined in the literature and some care is required.)
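Since sign conventions differ, a small check of the identity ${\text {Curl}}\varPhi = D\varPhi \,(0,1;-1,0)$ can be helpful. The sketch below assumes the Jacobian $D\varPhi $ at a point is stored as a nested list with entries `D[i][j]` standing for $\partial _{j+1}\varPhi _{i+1}$ (zero-based indices); the function names are hypothetical.

```python
ROTATION = [[0.0, 1.0], [-1.0, 0.0]]  # the matrix (0,1; -1,0) from the text


def curl_from_jacobian(D):
    """Curl Phi = (-d_2 Phi_1, d_1 Phi_1; -d_2 Phi_2, d_1 Phi_2) by definition."""
    return [[-D[0][1], D[0][0]], [-D[1][1], D[1][0]]]


def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

For any Jacobian `D`, the two expressions `curl_from_jacobian(D)` and `matmul2(D, ROTATION)` agree entry by entry.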

The colon denotes the scalar product $A:B:=\sum _{\alpha ,\beta =1,2} A_{\alpha ,\beta } B_{\alpha ,\beta }$ of $2\times 2$ matrices A, B. The inequality $A\lesssim B$ between two terms A and B abbreviates $A\le C\, B$ with some multiplicative generic constant C, which is independent of the mesh-size and independent of the one Lamé parameter $\lambda \ge 0$ but may depend on the other $\mu >0$ and may depend on the shape-regularity of the underlying triangulation ${\mathscr {T}}$ and the parameter k related to the polynomial degree of the scheme.

2 Proof of reliability

This section is devoted to the proof of Theorem 1 based on a Helmholtz decomposition of [11] with two parts as decomposed in Theorem 3 below. The critical part is the $L^2$ product of ${\mathbb {C}}^{-1} (\sigma -\sigma _h)$ times the ${\text {Curl}}$ of an unknown function ${\text {Curl}}\beta $. The observation from [15] is that one can find an Argyris finite element approximation $\beta _h$ to $\beta \in H^2(\varOmega )$ such that the continuous function $\phi :=\beta -\beta _h\in H^2(\varOmega )$ vanishes at all vertices ${\mathscr {N}}$ of the triangulation. Two integrations by parts on each triangle plus a one-dimensional integration by parts along the edges ${\mathscr {E}}$ of the triangulation eventually lead to a key identity.

Lemma 1

(representation formula) Any function $\varvec{\varepsilon }_h\in H^2({\mathscr {T}};{\mathbb {S}})$ (i.e. $\varvec{\varepsilon }_h$ is piecewise in $H^2$ with values in ${\mathbb {S}}$) and any $\phi \in H^2(\varOmega )$ with $\phi (z)=0$ at all vertices $z\in {\mathscr {N}}$ in the regular triangulation ${\mathscr {T}}$ satisfy

$$\begin{aligned}&(\varvec{\varepsilon }_h, {\text {Curl}}^2 \phi )_{L^2(\varOmega )} =({\text {rot}}_{NC}{\text {rot}}_{NC}\varvec{\varepsilon }_h,\phi )_{L^2(\varOmega )}\\&\quad +\sum _{E\in {\mathscr {E}}(\varOmega )} \left( (\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E,\partial _{\nu _E}\phi )_{L^2(E)} - \left( [{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E - \frac{ \partial [\varvec{\varepsilon }_h]_E\nu _E}{\partial s} ,\phi \, \tau _E \right) _{L^2(E)} \right) \\&\quad +\sum _{E\in {\mathscr {E}}(\partial \varOmega )} \left( (\tau _E\cdot \varvec{\varepsilon }_h\tau _E,\partial _{\nu _E}\phi )_{L^2(E)} - \left( {\text {rot}}\varvec{\varepsilon }_h-\frac{ \partial \varvec{\varepsilon }_h\nu _E}{\partial s} , \phi \, \tau _E\right) _{L^2(E)}\right) . \end{aligned}$$

The subsequent integration by parts formula is utilized frequently throughout this paper for $\phi \in H^1(\varOmega ;{\mathbb {R}}^2)$ and $\varPsi \in H^1(\varOmega ;{\mathbb {R}}^{2\times 2})$

$$\begin{aligned} \int _\varOmega \varPsi :{\text {Curl}}\phi \, dx+ \int _\varOmega \phi \cdot {\text {rot}}\varPsi \, dx = \int _{\partial \varOmega } \phi \cdot \varPsi \tau _E\, ds. \end{aligned}$$

Any differentiable (scalar) function $\varphi $ satisfies the elementary relations

$$\begin{aligned} \tau _E\cdot {\text {Curl}}\varphi = \partial \varphi /\partial \nu _E \quad \text {and}\quad \nu _E\cdot {\text {Curl}}\varphi = - \partial \varphi /\partial s=- \partial \varphi /\partial \tau _E \quad \text {on }E\in {\mathscr {E}}. \end{aligned}$$
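These two relations depend only on the conventions ${\text {Curl}}\varphi = (-\partial _2 \varphi , \partial _1 \varphi )$ and $\tau _E = (0,-1; 1,0) \nu _E$ and can be checked pointwise. The sketch below assumes the gradient of $\varphi $ at a point of $E$ is given as a pair; the function names are hypothetical.

```python
def curl_scalar(grad):
    """Curl phi = (-d_2 phi, d_1 phi), the rotated gradient."""
    return (-grad[1], grad[0])


def tangent(nu):
    """tau_E = (0,-1; 1,0) nu_E."""
    return (-nu[1], nu[0])


def dot(a, b):
    """Euclidean scalar product of two pairs."""
    return a[0] * b[0] + a[1] * b[1]


def check_relations(grad, nu, tol=1e-12):
    """Return True if both elementary relations hold at this point."""
    tau = tangent(nu)
    curl = curl_scalar(grad)
    normal_ok = abs(dot(tau, curl) - dot(grad, nu)) < tol   # tau.Curl = d/dnu
    tangential_ok = abs(dot(nu, curl) + dot(grad, tau)) < tol  # nu.Curl = -d/ds
    return normal_ok and tangential_ok
```

The check succeeds for every gradient and every unit normal, because both relations are algebraic identities in the components of $\nabla \varphi $ and $\nu _E$.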

Proof

Integrate by parts twice on each triangle and rearrange the remaining boundary terms to deduce (with the abbreviation ${\text {rot}}_{NC}{\text {rot}}_{NC}\equiv {\text {rot}}_{NC}^2$)

$$\begin{aligned}&(\varvec{\varepsilon }_h,{\text {Curl}}^2\phi )_{L^2(\varOmega )} = ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\varOmega )} \\&\quad +\sum _{E\in {\mathscr {E}}(\varOmega )} \left( ([\varvec{\varepsilon }_h]_E\tau _E,{\text {Curl}}\phi )_{L^2(E)} -([{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E\cdot \tau _E,\phi )_{L^2(E)} \right) \\&\quad +\sum _{E\in {\mathscr {E}}(\partial \varOmega )} \left( (\varvec{\varepsilon }_h\tau _E,{\text {Curl}}\phi )_{L^2(E)} -({\text {rot}}\varvec{\varepsilon }_h\cdot \tau _E,\phi )_{L^2(E)} \right) . \end{aligned}$$

The term $ ([\varvec{\varepsilon }_h]_E\tau _E,{\text {Curl}}\phi )_{L^2(E)} $ in the above sum is split into orthogonal components

$$\begin{aligned} {\text {Curl}}\phi= & {} (\tau _E\cdot {\text {Curl}}\phi )\tau _E+ (\nu _E\cdot {\text {Curl}}\phi )\nu _E\\= & {} (\tau _E\cdot {\text {Curl}}\phi )\tau _E- (\partial \phi /\partial s )\nu _E\quad \text {on }E\in {\mathscr {E}}. \end{aligned}$$

Since $\phi $ vanishes at the vertices, an integration by parts along each interior edge E for the last term shows $ ([\varvec{\varepsilon }_h]_E\tau _E, (\partial \phi /\partial s )\nu _E)_{L^2(E)} = - (\partial [\varvec{\varepsilon }_h]_E\tau _E/\partial s , \phi \nu _E)_{L^2(E)}$. This proves

$$\begin{aligned} ([\varvec{\varepsilon }_h]_E\tau _E,{\text {Curl}}\phi )_{L^2(E)} = (\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E,\partial _{\nu _E}\phi )_{L^2(E)} + \left( \frac{\partial \nu _E\cdot [\varvec{\varepsilon }_h]_E\tau _E}{\partial s},\phi \right) _{L^2(E)}. \end{aligned}$$

The same formula holds for any boundary edge E when $[\varvec{\varepsilon }_h]_E$ is replaced by $\varvec{\varepsilon }_h$. The combination of the latter identities with the first displayed formula of this proof verifies the asserted representation formula. $\square $

The contribution of $\varepsilon (u)={\mathbb {C}}^{-1} \sigma $ times the ${\text {Curl}}^2\phi \in L^2(\varOmega ;{\mathbb {S}})$ exclusively leads to boundary terms. Throughout this paper, suppose that the Dirichlet data $u_D$ satisfies $u_D\in C(\varGamma _D)\cap H^2({\mathscr {E}}(\varGamma _D))$ in the sense that $u_D$ is globally continuous with $u_D|_E\in H^2(E;{\mathbb {R}}^2)$ for all $E\in {\mathscr {E}}(\varGamma _D)$.

Lemma 2

(boundary terms) Any Sobolev function $v\in H^1(\varOmega ;{\mathbb {R}}^2) $ with boundary values $u_D\in C(\varGamma _D)\cap H^2({\mathscr {E}}(\varGamma _D))$ on $\varGamma _D$ and any $\phi \in H^2(\varOmega )$ with $\phi =\partial \phi /\partial \nu =0$ along $\varGamma _N$ with $\phi (z)=0$ for any vertex z of $\varGamma _D$ in its relative interior satisfy

$$\begin{aligned} (\varepsilon (v) , {\text {Curl}}^2 \phi )_{L^2(\varOmega )}=\sum _{E\in {\mathscr {E}}(\varGamma _D)} \left( \left( \frac{ \partial u_D}{\partial s } , \frac{ \partial \phi }{\partial \nu _E}\, \tau _E \right) _{L^2(E)} + \left( \frac{\partial ^2 u_D}{\partial s^2 } ,\phi \, \nu _E \right) _{L^2(E)}\right) . \end{aligned}$$

Proof

A density argument shows that it suffices to prove this identity for smooth functions v and $\phi $, when integration by parts arguments show that the left-hand side is equal to

$$\begin{aligned} \int _{\partial \varOmega } {\text {Curl}}\phi \cdot \frac{\partial v }{\partial s}ds= \sum _{E\in {\mathscr {E}}(\partial \varOmega )} \int _{E} \left( \frac{\partial \phi }{\partial \nu _E} \frac{\partial (v\cdot \tau _E)}{\partial s} + \phi \frac{\partial ^2 (v\cdot \nu _E) }{\partial s^2}\right) ds . \end{aligned}$$

The equality follows from an orthogonal split ${\text {Curl}}\phi = (\tau \cdot {\text {Curl}}\phi )\tau + (\nu \cdot {\text {Curl}}\phi )\nu $ into the normal and tangential directions of $\nu $ and $\tau $ along the boundary $\partial \varOmega $ followed by an integration by parts along $\partial \varOmega $ with $\phi (z)=0$ for vertices z in $\varGamma _D$ with a jump of the normal unit vector. The substitution of the boundary conditions concludes the proof. $\square $

The consequence of the previous two lemmas is a representation formula for the error times a typical function ${\text {Curl}}^2 \phi $. To understand why the contributions on the Neumann boundary of $\phi $ and $\nabla \phi $ disappear along $\varGamma _N$, some details on the Helmholtz decomposition are recalled from the literature. For this, let $\varGamma _0,\ldots , \varGamma _J$ denote the compact connectivity components of $\overline{\varGamma _N}$.

Theorem 3

(Helmholtz decomposition [11, Lemma 3.2]) For $\sigma -\sigma _h\in L^2(\varOmega ;{\mathbb {S}})$, there exist $\alpha \in V$, constant vectors $c_0,\ldots ,c_J\in {\mathbb {R}}^2$ with $c_0=0$, and $\beta \in H^2(\varOmega )$ with $\int _\varOmega \beta \, dx = 0$ and ${\text {Curl}}\beta = c_j $ on $\varGamma _j\subseteq \varGamma _N$ for all $j=0,\ldots ,J$ such that

$$\begin{aligned} \sigma -\sigma _h = {\mathbb {C}}\varepsilon (\alpha ) + {\text {Curl}}{\text {Curl}}\beta . \end{aligned}$$

(7)

$\square $

The second ingredient is an approximation $\beta _h$ of $\beta $ from the Helmholtz decomposition in Theorem 3 based on the Argyris finite element functions $ A({\mathscr {T}}) \subset C^1(\varOmega )\cap P_5({\mathscr {T}})$ [7, 8, 20]. The local mesh-size $h_{\mathscr {T}}\in P_0({\mathscr {T}})$ in the triangulation ${\mathscr {T}}$ is the piecewise constant function with $h_{\mathscr {T}}|_T:=h_T$, the diameter of $T$, on each triangle $T\in {\mathscr {T}}$.

Lemma 3

(quasi-interpolation) Given any $\beta $ as in Theorem 3 there exists some $\beta _h\in A({\mathscr {T}})$ such that $\phi :=\beta -\beta _h \in H^2(\varOmega ) $ vanishes at any vertex $z\in {\mathscr {N}}$ of the triangulation, $\phi $ and its normal gradient $\nabla \phi \cdot \nu $ vanish on $\varGamma _N$, and the local approximation and stability property holds in the sense that

$$\begin{aligned} \Vert h_{\mathscr {T}}^{-2} \phi \Vert + \Vert h_{\mathscr {T}}^{-1}{\text {Curl}}\phi \Vert +\Vert {\text {Curl}}^2\phi \Vert \lesssim \Vert \beta \Vert _{H^2(\varOmega )}. \end{aligned}$$

Proof

This has been (partly) utilized in [15] and also follows from [21]. $\square $

The combination of all aforementioned arguments leads to the following estimate as an answer to the question of Sect. 1.1 in terms of directional derivatives of $\varvec{\varepsilon }_h:= {\mathbb {C}}^{-1}\sigma _h$. Recall the definition of $\eta ({\mathscr {T}},\sigma _h)$ from (4).

Theorem 4

(key result) Let $\sigma \in H({\text {div}},\varOmega ;{\mathbb {S}})$ solve (1) and let $\sigma _h \in AW _k({\mathscr {T}})$ solve (3). Given $\beta $ from Theorem 3 and its quasi-interpolation $\beta _h$ from Lemma 3, the difference $\phi :=\beta -\beta _h$ satisfies

$$\begin{aligned} ({\mathbb {C}}^{-1}(\sigma -\sigma _h), {\text {Curl}}^2\phi )_{L^2(\varOmega )} \lesssim | \beta |_{H^2(\varOmega )} \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

Proof

Lemmas 1 and 2 lead to a formula for $(\varvec{\varepsilon }_h, {\text {Curl}}^2\phi )_{L^2(\varOmega )}$ with $\varvec{\varepsilon }_h:= {\mathbb {C}}^{-1}\sigma _h$, in which all contributions of the edges $E\in {\mathscr {E}}(\varGamma _N)$ vanish because $\phi $ and $\nabla \phi $ vanish along $\varGamma _N$. The remaining formula reads

$$\begin{aligned}&({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\phi )_{L^2(\varOmega )} = - ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\varOmega )}\\&\quad -\sum _{E\in {\mathscr {E}}(\varOmega )} \left( \left( \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E,\frac{\partial \phi }{\partial \nu _E}\right) _{L^2(E)}- \left( [{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E - \frac{ \partial [\varvec{\varepsilon }_h]_E\nu _E}{\partial s} ,\phi \, \tau _E \right) _{L^2(E)} \right) \\&\quad +\sum _{E\in {\mathscr {E}}(\varGamma _D)} \left( \left( \frac{\partial u_D}{\partial s} - \varvec{\varepsilon }_h\tau _E, \tau _E\, \frac{\partial \phi }{\partial \nu _E} \right) _{L^2(E)} \right. \\&\quad \left. + \left( \tau _E\cdot \left( {\text {rot}}_{NC}\varvec{\varepsilon }_h-\frac{\partial ( \varvec{\varepsilon }_h\nu _E)}{\partial s} \right) +\frac{\partial ^2 u_D\cdot \nu _E}{\partial s^2 } ,\phi \right) _{L^2(E)}\right) . \end{aligned}$$

The proof concludes with Cauchy–Schwarz inequalities, trace inequalities, and the approximation estimates of Lemma 3. The remaining details are nowadays standard arguments in the a posteriori error analysis of nonconforming and mixed finite element methods and hence are omitted. $\square $

Before the proof of Theorem 1 concludes this section, three remarks and one lemma are in order.

Remark 1

(nonconstant coefficients) The main parts of the reliability analysis of this section hold for rather general material tensors ${\mathbb {C}}$ as long as $\varvec{\varepsilon }_h:= {\mathbb {C}}^{-1}\sigma _h$ allows for the existence of the traces and the derivatives in the error estimator (4) in the respective $L^2$ spaces. This is the case, for instance, if $\lambda $ and $\mu $ are piecewise smooth with respect to the underlying triangulation ${\mathscr {T}}$.

Remark 2

(constant coefficients) The overall assumption of constant Lamé parameters $\lambda $ and $\mu $ allows a simplification in the error estimator (4). It suffices to have $\mu $ globally continuous and $\mu $ and $\lambda $ piecewise smooth to guarantee

$$\begin{aligned} \frac{ \partial [\varvec{\varepsilon }_h]_E\nu _E}{\partial s} \cdot \tau _E =0\quad \text {along }E\in {\mathscr {E}}(\varOmega ). \end{aligned}$$

(The proof utilizes the structure of ${\mathbb {C}}^{-1}$ with ${\mathbb {C}}^{-1} E = \frac{1}{2\mu } (E - \frac{\lambda }{2(\lambda +\mu )} {\text {tr}}(E) 1_{2\times 2})$ for any $E\in {\mathbb {S}}$ as a linear combination of the identity mapping $E\mapsto E$ and some scalar multiple of the $2\times 2$ unit matrix. The terms with the identity mapping lead to $1/(2\mu )$ times the jump $[\sigma _h]_E\nu _E=0$ of the $H({\text {div}})$-conforming stress approximations. The jump terms with the unit matrix (even with jumps of $\lambda $) are multiplied with the orthogonal unit vectors $\nu _E$ and $\tau _E$ and so vanish as well.)
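The formula for ${\mathbb {C}}^{-1}$ in Remark 2 can be checked numerically. The following Python sketch is not part of the implementation behind this paper. It assumes the standard isotropic form ${\mathbb {C}}E = 2\mu E + \lambda \,{\text {tr}}(E)\, 1_{2\times 2}$ in two dimensions; the function names, the sample matrix, and the sample values of $\lambda $, $\mu $, $\nu _E$, $\tau _E$ are hypothetical.

```python
# Plain 2x2 matrices as nested lists; all names and sample values are illustrative.

def trace(E):
    return E[0][0] + E[1][1]

def apply_C(E, lam, mu):
    """Assumed isotropic tensor in 2D: C E = 2*mu*E + lam*tr(E)*I."""
    t = trace(E)
    return [[2 * mu * E[i][j] + (lam * t if i == j else 0.0) for j in range(2)]
            for i in range(2)]

def apply_C_inv(S, lam, mu):
    """Formula of Remark 2: C^{-1} S = (S - lam/(2(lam+mu)) tr(S) I) / (2 mu)."""
    c = lam / (2 * (lam + mu)) * trace(S)
    return [[(S[i][j] - (c if i == j else 0.0)) / (2 * mu) for j in range(2)]
            for i in range(2)]

def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

lam, mu = 57.0, 38.0                      # sample Lame parameters (assumption)
E = [[1.0, 0.3], [0.3, -2.0]]             # sample symmetric matrix
roundtrip = apply_C_inv(apply_C(E, lam, mu), lam, mu)
max_err = max(abs(roundtrip[i][j] - E[i][j]) for i in range(2) for j in range(2))

# The unit-matrix part contributes tau . (I nu) = tau . nu = 0 for orthogonal nu, tau.
nu, tau = [0.6, 0.8], [-0.8, 0.6]
identity = [[1.0, 0.0], [0.0, 1.0]]
unit_term = dot(tau, matvec(identity, nu))

print(max_err, unit_term)
```

The round trip reproduces the sample matrix up to rounding, and the last quantity illustrates why the unit-matrix contribution drops out of the tangential component regardless of the scalar factor in front of it.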

Remark 3

(related work) Although the work [22] concerns a different problem (bending of a plate of fourth order) with a different discretization (even nonconforming in $H({\text {div}})$), some technical parts of that paper are related to those of this paper by a rotation of the underlying coordinate system and the substitution of ${\text {div}}{\text {div}}$ by ${\text {rot}}{\text {rot}}$ etc. Another Helmholtz decomposition also allows for a discrete version and thereby enables a proof of optimal convergence of an adaptive algorithm with arguments from [13, 19].

A technical detail related to the robustness in $\lambda \rightarrow \infty $ is a well known lemma that controls the trace of a matrix $E\in {\mathbb {R}}^{2\times 2}$ by its deviatoric part ${\text {dev}} E:= E-{\text {tr}}(E)/2\, 1_{2\times 2}$ and its divergence measured in the dual $V^*\subset H^{-1}(\varOmega ;{\mathbb {R}}^2)$ of V, namely

$$\begin{aligned} \Vert {\text {div}}\tau \Vert _{-1} := \sup _{\begin{array}{c} v\in V \\ | v |_{H^1(\varOmega )} =1 \end{array}}{\int _\varOmega \tau : Dv\, dx} \quad \text {for all } \tau \in L^2(\varOmega ;{\mathbb {R}}^{2\times 2}). \end{aligned}$$

Lemma 4

(tr-dev-div) Let $\varSigma _0$ be a closed subspace of $H({\text {div}},\varOmega ;{\mathbb {R}}^{2\times 2})$, which does not contain the constant tensor $1_{2\times 2}$. Then any $\tau \in \varSigma _0$ satisfies

$$\begin{aligned} \Vert {\text {tr}}(\tau )\Vert _{L^2(\varOmega )} \lesssim \Vert {\text {dev}}\tau \Vert _{L^2(\varOmega )} + \Vert {\text {div}}\tau \Vert _{-1}. \end{aligned}$$

Proof

There are several variants of the tr-dev-div lemma known in the literature [6, Proposition 9.1.1]. The one in [11, Theorem 4.1] is stated with $\Vert {\text {div}}\tau \Vert $ in place of $\Vert {\text {div}}\tau \Vert _{-1}$. Since its proof is immediately adapted to prove the asserted version, further details are omitted. $\square $

The remaining part of this section outlines why Theorem 1 follows from Theorem 4 with the arguments from [11, 15]. The energy norms for any $v\in V$ and $\tau \in H({\text {div}},\varOmega ;{\mathbb {S}}) $ read

$$\begin{aligned} |||v |||^2:=\int _\varOmega \varepsilon (v):{\mathbb {C}}\varepsilon (v)\, dx \quad \text {and}\quad \Vert \tau \Vert _{{\mathbb {C}}^{-1}}^2:=\int _\varOmega \tau : {\mathbb {C}}^{-1}\tau \, dx. \end{aligned}$$

The remaining residual is denoted by

$$\begin{aligned} {{\,\mathrm{Res}\,}}(v) := \int _\varOmega f\cdot v\, dx + \int _{\varGamma _N} g\cdot v\, ds - \int _\varOmega \sigma _h:\varepsilon (v)\, dx\quad \text {for all }v\in V \end{aligned}$$

with its dual norm

$$\begin{aligned} |||{{\,\mathrm{Res}\,}} |||_* := \sup _{\begin{array}{c} v\in V \\ |||v |||=1 \end{array}}{{{\,\mathrm{Res}\,}}(v)} . \end{aligned}$$

It is shown in the proof of [15, Theorem 3.1] that $\alpha \in V$ and $\beta \in H^2(\varOmega )$ from the Helmholtz decomposition of the error $\sigma -\sigma _h$ in Theorem 3 are orthogonal with respect to the $L^2$ scalar product weighted with ${\mathbb {C}}^{-1}$. This implies

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}}^2 = (\sigma -\sigma _h,\varepsilon (\alpha ))_{L^2(\varOmega )} + ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\beta )_{L^2(\varOmega )}. \end{aligned}$$

(8)

Let $\beta _h$ denote the quasi-interpolation of $\beta $ from Lemma 3. It is known [15] that ${\text {Curl}}^2\beta _h$ is a divergence-free element of $\varSigma (0,{\mathscr {T}})$. Therefore, (2) and (3) imply

$$\begin{aligned} ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\beta _h)_{L^2(\varOmega )} =0. \end{aligned}$$

Thus, with $\phi =\beta -\beta _h$, the second term of (8) equals $ ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\phi )_{L^2(\varOmega )} $ and hence is controlled in the key estimate of Theorem 4 as

$$\begin{aligned} ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\beta )_{L^2(\varOmega )} \lesssim |\beta |_{H^2(\varOmega )} \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

Lemma 4 applies to $\varSigma _0$ as the subspace of all $\tau \in H({\text {div}},\varOmega ;{\mathbb {S}})$ with homogeneous Neumann data $\tau \nu =0$ along $\varGamma _N$. Since $\tau :={\text {Curl}}^2\beta $ is divergence free (by the relation ${\text {div}}{\text {Curl}}=0$) and since $\tau \nu =-\partial {\text {Curl}}\beta /\partial s$ along $\varGamma _N$ (owing to the aforementioned elementary relations and the convention that the first ${\text {Curl}}$ acts row-wise on ${\text {Curl}}\beta $), where ${\text {Curl}}\beta $ in Theorem 3 is piecewise constant, it follows that $\tau \in \varSigma _0$. On the other hand $1_{2\times 2}\notin \varSigma _0$ because $\varGamma _N\ne \emptyset $. Consequently, Lemma 4 implies $\Vert {\text {Curl}}^2 \beta \Vert \lesssim \Vert {\text {dev}} {\text {Curl}}^2 \beta \Vert $. This and elementary calculations with ${\mathbb {C}}^{-1}$ lead to

$$\begin{aligned} | \beta |_{H^2(\varOmega )} = \Vert {\text {Curl}}^2 \beta \Vert \lesssim \Vert {\text {dev}} {\text {Curl}}^2 \beta \Vert \lesssim \Vert {\text {Curl}}^2 \beta \Vert _{{\mathbb {C}}^{-1}}. \end{aligned}$$

The combination with the estimate resulting from Theorem 4 proves

$$\begin{aligned} ({\mathbb {C}}^{-1}(\sigma -\sigma _h),{\text {Curl}}^2\beta )_{L^2(\varOmega )} \lesssim \Vert {\text {Curl}}^2\beta \Vert _{{\mathbb {C}}^{-1}}\, \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

This, the stability $ \Vert {\text {Curl}}^2\beta \Vert _{{\mathbb {C}}^{-1}} \le \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}} $, and $|||\alpha |||=|||{{\,\mathrm{Res}\,}} |||_{*}$ lead in (8) to

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}} \lesssim |||{{\,\mathrm{Res}\,}} |||_{*} + \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

(9)

It remains to estimate the dual norm $|||{{\,\mathrm{Res}\,}} |||_{*}$ of the residual, which is done, e.g., in [15, Lemma 3.3] (under the assumption $g-g_h\perp P_0({\mathscr {E}}(\varGamma _N))$),

$$\begin{aligned} |||{{\,\mathrm{Res}\,}} |||_{*} \lesssim {\text {osc}}(f,{\mathscr {T}})+ {\text {osc}}(g-g_h,{\mathscr {E}}(\varGamma _N))\le \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

This and (9) imply

$$\begin{aligned} \Vert {\text {dev}}(\sigma -\sigma _h)\Vert \lesssim \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}} \lesssim \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

For any test function $v\in V$ with $ | v |_{H^1(\varOmega )} =1$, $\int _\varOmega (\sigma -\sigma _h) : Dv\, dx= {{\,\mathrm{Res}\,}}(v)$ and so

$$\begin{aligned} \Vert {\text {div}}(\sigma -\sigma _h)\Vert _{-1} = \sup _{\begin{array}{c} v\in V \\ | v |_{H^1(\varOmega )} =1 \end{array}}{{{\,\mathrm{Res}\,}}(v)} \le \sup _{\begin{array}{c} v\in V \\ \Vert \varepsilon (v) \Vert =1 \end{array}}{{{\,\mathrm{Res}\,}}(v)} \le 2\mu \, |||{{\,\mathrm{Res}\,}} |||_{*} \lesssim \eta ({\mathscr {T}},\sigma _h). \end{aligned}$$

(In the second last step one utilizes that $ 2\mu \, E:E \le E:{\mathbb {C}}E$ for all $E\in {\mathbb {S}}$.) The combination of Lemma 4 for $\tau =\sigma -\sigma _h$ with the previous displayed estimates concludes the proof of $\Vert \sigma -\sigma _h\Vert \lesssim \eta ({\mathscr {T}},\sigma _h)$. There exist several appropriate choices of $\varSigma _0\subset H({\text {div}},\varOmega ;{\mathbb {S}})$ in this last step. Recall that $\varGamma _N$ is the union of connectivity components and so pick one edge $E_0$ in this polygon and consider $\varSigma _0:=\{ \tau \in H({\text {div}},\varOmega ;{\mathbb {S}}): \int _{E_0} \tau \nu \, ds =0\}$ with $1_{2\times 2}\notin \varSigma _0$. This choice of $E_0$ and so $\varSigma _0$ depend only on $\varGamma _N$ (independent of ${\mathscr {T}}$). Since $g-g_h=(\sigma -\sigma _h)\nu $ along $E_0$ has (piecewise on ${\mathscr {E}}(E_0)$, whence in total) an integral mean zero, Lemma 4 indeed applies to $\tau =\sigma -\sigma _h\in \varSigma _0$. $\square $

3 Local efficiency analysis

The local efficiency follows with the bubble-function technique for $C^1$ finite elements [24, Sec 3.7]. This section focuses on a constant ${\mathbb {C}}$ for linear isotropic elasticity with constant Lamé parameters $\lambda $ and $\mu $ such that $\varvec{\varepsilon }_h:={\mathbb {C}}^{-1}\sigma _h\in P_{k+2}({\mathscr {T}})$ for some $\sigma _h\in AW _k({\mathscr {T}})$ is a polynomial of degree at most $k+2$. Apart from this, the Lamé parameters do not further arise in this section.

The moderate point of departure is the volume term for each triangle $T\in {\mathscr {T}}$ with barycentric coordinates $\lambda _1,\lambda _2,\lambda _3\in P_1(T)$ and their product, the cubic volume bubble function, $b_T:=27\,\lambda _1\lambda _2\lambda _3 \in W^{1,\infty }_0(T)$ plus its square $b_T^2\in W^{2,\infty }_0(T)$ with $0\le b_T^2\le 1$, $ \Vert b_T\Vert _{L^2(T)} \lesssim 1$, and $|b_T|_{H^2(T)} \lesssim h_T^{-2}$ etc.
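The elementary properties of the cubic volume bubble function can be illustrated on a sample triangle. The following Python sketch is only an illustration; the triangle, the helper names, and the sampling grid are hypothetical choices. It evaluates $b_T=27\,\lambda _1\lambda _2\lambda _3$ through the barycentric coordinates and samples it on a lattice of points in $T$.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of the point p in the triangle with vertices a, b, c."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    l2 = ((p[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (p[1] - a[1])) / det
    l3 = ((b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1])) / det
    return 1.0 - l2 - l3, l2, l3

def volume_bubble(p, a, b, c):
    """Cubic bubble b_T = 27 * lambda_1 * lambda_2 * lambda_3."""
    l1, l2, l3 = barycentric(p, a, b, c)
    return 27.0 * l1 * l2 * l3

# Sample triangle (assumption, any non-degenerate triangle works).
a, b, c = (0.0, 0.0), (2.0, 0.0), (0.5, 1.5)
center = ((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)
edge_mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

# Sample b_T on a lattice of barycentric points, boundary included.
n = 20
samples = []
for i in range(n + 1):
    for j in range(n + 1 - i):
        l1, l2 = i / n, j / n
        l3 = 1.0 - l1 - l2
        p = (l1 * a[0] + l2 * b[0] + l3 * c[0], l1 * a[1] + l2 * b[1] + l3 * c[1])
        samples.append(volume_bubble(p, a, b, c))

print(volume_bubble(center, a, b, c), volume_bubble(edge_mid, a, b, c),
      min(samples), max(samples))
```

The factor 27 normalizes the maximum, attained at the barycenter with $\lambda _1=\lambda _2=\lambda _3=1/3$, to one, and $b_T$ vanishes on $\partial T$ because one barycentric coordinate vanishes on each edge.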

Lemma 5

(efficiency of volume residual) Any $v\in H^1(T;{\mathbb {R}}^2)$, $T\in {\mathscr {T}}$, satisfies

$$\begin{aligned} h_T^2 \Vert {\text {rot}}{\text {rot}}\varvec{\varepsilon }_h\Vert _{L^2(T)} \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(T)}. \end{aligned}$$

Proof

An inverse estimate for the polynomial ${\text {rot}}{\text {rot}}\varvec{\varepsilon }_h\equiv {\text {rot}}^2\varvec{\varepsilon }_h$ implies the estimate

$$\begin{aligned} \Vert {\text {rot}}^2 \varvec{\varepsilon }_h\Vert _{L^2(T)}^2 \lesssim \Vert b_T {\text {rot}}^2 \varvec{\varepsilon }_h\Vert ^2_{L^2(T)} = ({\text {rot}}^2\varvec{\varepsilon }_h, b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h)_{L^2(T)}. \end{aligned}$$

Lemma 1 with $\phi =b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h$ and $( \varepsilon (v) , {\text {Curl}}^2\phi )_{L^2(T)} = 0$ leads to

$$\begin{aligned} \begin{aligned} \Vert b_T {\text {rot}}^2 \varvec{\varepsilon }_h\Vert ^2_{L^2(T)}&= ( \varvec{\varepsilon }_h- \varepsilon (v) , {\text {Curl}}^2({b_T^2} {\text {rot}}^2 \varvec{\varepsilon }_h) )_{L^2(T)}\\&\le \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(T)} \Vert {\text {Curl}}^2(b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h)\Vert _{L^2(T)}. \end{aligned} \end{aligned}$$

This and the inverse estimate $\Vert {\text {Curl}}^2(b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h)\Vert _{L^2(T)} \lesssim h_T^{-2} \Vert b_T^2 {\text {rot}}^2 \varvec{\varepsilon }_h\Vert _{L^2(T)} $ imply

$$\begin{aligned} \Vert {\text {rot}}^2 \varvec{\varepsilon }_h\Vert _{L^2(T)}^2 \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(T)} h_T^{-2} \Vert {\text {rot}}^2 \varvec{\varepsilon }_h\Vert _{L^2(T)}. \end{aligned}$$

This concludes the proof. $\square $

The localization of the first edge residual involves the piecewise quadratic edge-bubble function $b_E$ with support $T_+\cup T_-$ for an interior edge $E=\partial T_+\cap \partial T_-$ shared by the two triangles $T_+$ and $T_-$ with edge-patch $\omega _E := {\text {int}}(T_+\cup T_-)$. With an appropriate scaling, $b_E|_T=4\lambda _1\lambda _2$ for the two barycentric coordinates $\lambda _1,\lambda _2$ on $T\in \{T_+,T_-\}$ associated with the two vertices of E. Then $b_E\in W^{1,\infty }(\omega _E)$ and $b_E^2\in W^{2,\infty }(\omega _E)$ satisfy $0\le b_E^2\le b_E\le 1$ and $|b_E|_{H^1(E)} \lesssim h_E^{-1}$ etc.

The remaining technical detail is an extension of functions on the edge E to $\omega _E$. Throughout this section those functions are polynomials and given $\rho _E\in P_m(E)$, their coefficients (in some fixed basis) already define an algebraic object that is a natural extension $\rho \in P_m({\hat{E}})$ along the straight line ${\hat{E}}:={\text {mid}}(E)+{\mathbb {R}}\, \tau _E$ that extends E with midpoint ${\text {mid}}(E)$ and tangential unit vector $\tau _E$. This and

$$\begin{aligned} P_E(\rho _E)(x):= \rho ( \tau _E\cdot (x- {\text {mid}}(E))) \quad \text {for all }x\in {\mathbb {R}}^2 \end{aligned}$$

define a linear extension operator $P_E:P_m(E)\rightarrow C^\infty ({\mathbb {R}}^2)$ with $P_E(\rho _E)=\rho _E$ on E for any $\rho _E\in P_m(E)$, which is constant in the normal direction, $\nabla P_E(\rho _E)\cdot \nu _E\equiv 0$. This design is different from that in [24].
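The extension operator $P_E$ is simple enough to be sketched in a few lines. The following Python illustration is a sketch under stated assumptions: the polynomial $\rho $ is stored through its monomial coefficients in the arc-length parameter $t=\tau _E\cdot (x-{\text {mid}}(E))$, and the edge, the orientation of $\nu _E$, the coefficients, and all names are hypothetical.

```python
import math

def make_extension(coeffs, mid, tau):
    """Return P_E(rho_E): x -> rho(tau . (x - mid)) with rho(t) = sum_k coeffs[k] * t**k."""
    def extended(x):
        t = tau[0] * (x[0] - mid[0]) + tau[1] * (x[1] - mid[1])
        return sum(ck * t**k for k, ck in enumerate(coeffs))
    return extended

# Sample edge from p0 to p1 (assumption).
p0, p1 = (0.0, 0.0), (3.0, 4.0)
length = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
tau = ((p1[0] - p0[0]) / length, (p1[1] - p0[1]) / length)
nu = (tau[1], -tau[0])          # one of the two unit normals (sign convention assumed)
mid = ((p0[0] + p1[0]) / 2, (p0[1] + p1[1]) / 2)

coeffs = [1.0, -2.0, 0.5]       # rho(t) = 1 - 2 t + 0.5 t^2
ext = make_extension(coeffs, mid, tau)

# A point on E and the same point shifted in the normal direction.
x_on_edge = (mid[0] + 1.2 * tau[0], mid[1] + 1.2 * tau[1])
x_shifted = (x_on_edge[0] + 0.7 * nu[0], x_on_edge[1] + 0.7 * nu[1])

print(ext(x_on_edge), ext(x_shifted))
```

Both evaluations agree because the argument of $\rho $ depends on $x$ only through its tangential component, which is the property $\nabla P_E(\rho _E)\cdot \nu _E\equiv 0$ stated above.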

Lemma 6

(efficiency of first interior edge residual) Any $v\in H^1(\omega _E;{\mathbb {R}}^2)$, $E\in {\mathscr {E}}(\varOmega )$, satisfies

$$\begin{aligned} h_E^{1/2} \Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\Vert _{L^2(E)} \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(\omega _E)}. \end{aligned}$$

Proof

Since $\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\in P_{k+2}(E)$ is a polynomial, the above extension $P_E(\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E)$ and the function $b\in W^{2,\infty }_0(\omega _E)$ with

$$\begin{aligned} b(x):= b_E^2(x)\, \nu _E\cdot (x- {\text {mid}}(E))\quad \text {for all }x\in {\mathbb {R}}^2 \end{aligned}$$

(10)

define some function $\phi := b\, P_E(\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E)$. Since $b=0$ and $\nabla b\cdot \nu _E= b_E^2$ along E, the test function $\phi \in H^2_0(\omega _E)\subset H_0^2(\varOmega )$ leads in Lemma 1 to

$$\begin{aligned} (\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E, \partial _{\nu _E} \phi )_{L^2(E)}= (\varvec{\varepsilon }_h, {\text {Curl}}^2 \phi )_{L^2(\omega _E)} - ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)}. \end{aligned}$$

Since $\partial _{\nu _E} \phi =b_E^2 \, \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E$ on E and $\varepsilon (v)\perp {\text {Curl}}^2\phi $, an inverse estimate shows

$$\begin{aligned} \Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\Vert ^2_{L^2(E)} \lesssim (\varvec{\varepsilon }_h-\varepsilon (v), {\text {Curl}}^2 \phi )_{L^2(\omega _E)} - ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)}. \end{aligned}$$

At the heart of the bubble-function methodology are inverse and trace inequalities that allow for appropriate scaling properties [24] under the overall assumption of shape-regularity. In the present case, one power of $h_E\approx h_{T_\pm }$ is hidden in the function b and

$$\begin{aligned} h_E^{1/2}\, | \phi |_{H^2(\omega _E)} + h_E^{-3/2} \Vert \phi \Vert _{L^2(\omega _E)} \lesssim \Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E \Vert _{L^2(E)}. \end{aligned}$$

(11)

The combination with the previous estimate results in

$$\begin{aligned}&\Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E\Vert ^2_{L^2(E)} \\&\quad \lesssim \Vert \tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E \Vert _{L^2(E)} \left( h_E^{-1/2} \Vert \varvec{\varepsilon }_h-\varepsilon (v)\Vert _{L^2(\omega _E)} + h_E^{3/2} \Vert {\text {rot}}_{NC}^2\varvec{\varepsilon }_h\Vert _{L^2(\omega _E)}\right) . \end{aligned}$$

This and Lemma 5 conclude the proof. $\square $
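For the reader's convenience, the two properties of the function $b$ from (10) along E that enter the previous proof can be verified with the product rule. This short computation only uses $\nu _E\cdot (x-{\text {mid}}(E))=0$ for $x\in E$ and $\nabla P_E(\rho _E)\cdot \nu _E\equiv 0$, with the abbreviation $\rho _E:=\tau _E\cdot [\varvec{\varepsilon }_h]_E\tau _E$ introduced here for brevity:

$$\begin{aligned} \nabla b\cdot \nu _E&= \left( \nabla (b_E^2)\cdot \nu _E\right) \, \nu _E\cdot (x- {\text {mid}}(E)) + b_E^2\, \nu _E\cdot \nu _E = b_E^2 \quad \text {on }E,\\ \partial _{\nu _E} \phi&= (\nabla b\cdot \nu _E)\, P_E(\rho _E) + b\, \nabla P_E(\rho _E)\cdot \nu _E = b_E^2\, \rho _E \quad \text {on }E. \end{aligned}$$

The one-sided normal derivatives of $b_E^2$ may differ on the two sides of E, but they are multiplied by the factor $\nu _E\cdot (x- {\text {mid}}(E))$, which vanishes on E.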

For any edge $E\in {\mathscr {E}}(\varGamma _D)$, the edge-bubble function reads $b_E=4\lambda _1\lambda _2\in W^{1,\infty }(\omega _E)$ for the two barycentric coordinates $\lambda _1,\lambda _2$ associated with the two vertices of E, and $b_E$ vanishes on the remaining sides $\partial \omega _E{\setminus } E$ of the aligned triangle $\overline{\omega _E}$. The Dirichlet data $u_D$ allows for some polynomial approximation $\varPi _{m,E} u_{D}\in P_{m}(E)$ of degree at most $m$ for some $m\ge k+2$; recall the definition of ${\text {osc}}_I(u_D,E)$ from (5).

Lemma 7

(efficiency of first boundary edge residual) Any $v\in H^1(\omega _E;{\mathbb {R}}^2)$ with $v|_E=u_D|_E$ along $E\in {\mathscr {E}}(\varGamma _D)$ satisfies

$$\begin{aligned} h_E^{1/2} \Vert \tau _E\cdot ( \varvec{\varepsilon }_h\tau _E - \partial u_D/\partial s)\Vert _{L^2(E)} \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(\omega _E)} + {\text {osc}}_I(u_D,E). \end{aligned}$$

Proof

Since $\tau _E\cdot \varvec{\varepsilon }_h\tau _E$ is a polynomial of degree at most $k+2\le m$ along the exterior edge E, the residual $\tau _E\cdot ( \varvec{\varepsilon }_h\tau _E - \partial u_D/\partial s)$ is well approximated by its $L^2$ projection $\rho _E:= (\tau _E\cdot ( \varvec{\varepsilon }_h\tau _E - \varPi _{m,E} \partial u_D/\partial s))$ onto $P_{m}(E)$. The Pythagoras theorem based on the $L^2$ orthogonality reads

$$\begin{aligned} h_E \Vert \tau _E\cdot ( \varvec{\varepsilon }_h\tau _E - \partial u_D/\partial s)\Vert _{L^2(E)}^2= h_E \Vert \rho _E\Vert _{L^2(E)}^2+{\text {osc}}^2_I(u_D,E) \end{aligned}$$

and it remains to bound $h_E^{1/2} \Vert \rho _E\Vert _{L^2(E)}$ by the right-hand side of the claimed inequality. The extension $P_E\rho _E\in C^\infty ({\mathbb {R}}^2)$ and the function b from (10) lead to an admissible test function $ \phi := b P_E \rho _E \in W^{2,\infty }_0(\omega _E)$. Two successive integrations by parts as in Lemma 1 show

$$\begin{aligned} (\varepsilon (v),{\text {Curl}}^2\phi )_{L^2(\omega _E)}=(\partial u_D/\partial s ,\tau _E (\nu _E\cdot \nabla \phi ))_{L^2(E)}. \end{aligned}$$

This and Lemma 1 lead to

$$\begin{aligned} \left( \tau _E\cdot \left( \varvec{\varepsilon }_h\tau _E- \frac{ \partial u_D}{\partial s}\right) , \frac{\partial \phi }{\partial \nu _E} \right) _{L^2(E)} = (\varvec{\varepsilon }_h-\varepsilon (v) , {\text {Curl}}^2 \phi )_{L^2(\omega _E)} - ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)}. \end{aligned}$$

Since $\partial _{\nu _E} \phi = b_E^2 \rho _E$ along E and $\rho _E$ is the $L^2$ projection of $\tau _E\cdot ( \varvec{\varepsilon }_h\tau _E- \partial u_D/\partial s)$, the left-hand side equals $\Vert b_E \rho _E\Vert _{L^2(E)}^2 - ( ( 1-\varPi _{m,E}) \partial u_D/\partial s, b_E^2 \rho _E )_{L^2(E)} $. The scaling argument which leads to (11) shows that the left-hand side of (11) is $\lesssim \Vert \rho _E\Vert _{L^2(E)}$. The combination with the previously displayed identity leads to

$$\begin{aligned} \Vert \rho _E\Vert _{L^2(E)}^2\lesssim & {} \Vert \rho _E\Vert _{L^2(E)} \left( h_E^{-1/2} \Vert \varvec{\varepsilon }_h-\varepsilon (v)\Vert _{L^2(\omega _E)} \right. \\&\left. + h_E^{3/2} \Vert {\text {rot}}_{NC}^2\varvec{\varepsilon }_h\Vert _{L^2(\omega _E)} + h_E^{-1/2} {\text {osc}}_I(u_D,E) \right) . \end{aligned}$$

This and Lemma 5 conclude the proof. $\square $

The edge-bubble functions for the second edge residuals are defined slightly differently to ensure some vanishing normal derivative.

Lemma 8

(efficiency of second interior edge residual) Any $v\in H^1(\omega _E;{\mathbb {R}}^2)$, $E\in {\mathscr {E}}(\varOmega )$, satisfies

$$\begin{aligned} h_E^{3/2} \Vert \tau _E\cdot ( [{\text {rot}}_{NC} \varvec{\varepsilon }_h]_E - \partial [\varvec{\varepsilon }_h]_E/\partial s\, \nu _E)\Vert _{L^2(E)} \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(\omega _E)}. \end{aligned}$$

Proof

There are many ways to define an edge-bubble function for this situation. One may first select an open ball $B(x_E,2r_E) \subset \omega _E$ around a point $x_E\in E$ with maximal radius $2r_E$. The characteristic function $\chi _{B(x_E,r_E)}$ of the smaller ball $B(x_E,r_E)$ may be regularized with a standard mollification $\eta _{r_E}$ to define the smooth function $b:=\chi _{B(x_E,r_E)}*\eta _{r_E}\in C_c^\infty (\omega _E)$ with values in [0, 1] and with $\nabla b\cdot \nu _E=0$ along E. The polynomial $\rho _E:= \tau _E\cdot ( [{\text {rot}}_{NC}\varvec{\varepsilon }_h]_E - \partial [\varvec{\varepsilon }_h]_E/\partial s \, \nu _E)$ and its extension $P_E \rho _E$ define the test function $\phi := b P_E \rho _E\in C_0^\infty (\omega _E)$ in Lemma 1. The representation formula and $(\varepsilon (v),{\text {Curl}}^2\phi )_{L^2(\omega _E)}=0$ lead to

$$\begin{aligned} \Vert b^{1/2} \rho _E\Vert _{L^2(E)}^2 = (\varepsilon (v)-\varvec{\varepsilon }_h,{\text {Curl}}^2\phi )_{L^2(\omega _E)} + ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)} . \end{aligned}$$

The inverse inequality $\Vert \rho _E\Vert _{L^2(E)} \lesssim \Vert b^{1/2} \rho _E\Vert _{L^2(E)}$, Cauchy-Schwarz inequalities, and the right scaling properties of $\phi $ lead to

$$\begin{aligned} \Vert \rho _E\Vert _{L^2(E)}^2\lesssim \Vert \rho _E\Vert _{L^2(E)}\left( h_E^{-3/2} \Vert \varvec{\varepsilon }_h-\varepsilon (v) \Vert _{L^2(\omega _E)} + h_E^{1/2}\Vert {\text {rot}}_{NC}^2\varvec{\varepsilon }_h\Vert _{L^2(\omega _E)} \right) . \end{aligned}$$

This and Lemma 5 conclude the proof. $\square $

The efficiency of the last edge contribution involves the second Dirichlet data oscillation ${\text {osc}}_{II}(u_D,E)$ from (6).

Lemma 9

(efficiency of second boundary edge residual) Any $v\in H^1(\omega _E;{\mathbb {R}}^2)$ with $v|_E=u_D|_E$ along $E\in {\mathscr {E}}(\varGamma _D)$ satisfies

$$\begin{aligned}&h_E^{3/2} \Vert \tau _E\cdot {\text {rot}}\varvec{\varepsilon }_h- \nu _E\cdot \left( \frac{ \partial \varvec{\varepsilon }_h\tau _E }{\partial s } -\frac{ \partial ^2 u_D}{ \partial s^2}\right) \Vert _{L^2(E)}\\&\quad \lesssim \Vert \varvec{\varepsilon }_h- \varepsilon (v) \Vert _{L^2(\omega _E)} + {\text {osc}}_{II}(u_D,E). \end{aligned}$$

Proof

Select a maximal open ball $B(x_E,2r_E)\cap \varOmega \subset \omega _E$ around a point $x_E\in E$ with maximal radius $2r_E$ such that $B(x_E,2r_E)\cap \omega _E$ is a half ball. The regularization $b:=\chi _{B(x_E,r_E)}*\eta _{r_E}\in C_c^\infty ({\mathbb {R}}^2)$ of the characteristic function $\chi _{B(x_E,r_E)}$ attains values in [0, 1] and a positive integral mean $h_E^{-1} \int _E b\, ds \approx 1$ along E (depending only on the shape regularity of ${\mathscr {T}}$); b vanishes on $\partial \omega _E{\setminus } E$ and its normal derivative $\nabla b\cdot \nu $ vanishes along the entire boundary $\partial \omega _E$.

The Pythagoras theorem $ \Vert \rho \Vert _{L^2(E)}^2 = \Vert \rho _E \Vert _{L^2(E)}^2 + h_E^{-3} {\text {osc}}_{II}^2(u_D,E)$ for the residual $\rho :=\tau _E\cdot {\text {rot}}\varvec{\varepsilon }_h- \nu _E\cdot (\frac{ \partial \varvec{\varepsilon }_h\tau _E }{\partial s } -\frac{ \partial ^2 u_D}{ \partial s^2})$ and its $L^2$ projection $\rho _E:=\varPi _{m,E}\rho $ onto $P_m(E)$ reduces the proof to the estimation of $\Vert \rho _E \Vert _{L^2(E)}$. The normal derivative of $\phi := b\, P_E \rho _E\in C^\infty (\overline{\omega _E})$ vanishes along the boundary $\partial \omega _E$ and Lemma 1 shows

$$\begin{aligned} \left( {\text {rot}}\varvec{\varepsilon }_h-\frac{ \partial \varvec{\varepsilon }_h\nu _E}{\partial s} , b \rho _E \tau _E\right) _{L^2(E)} = ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)}-(\varvec{\varepsilon }_h, {\text {Curl}}^2 \phi )_{L^2(\omega _E)}. \end{aligned}$$

The arguments in Lemma 2 show $ (\partial ^2 u_D/\partial s^2 , b\rho _E \, \nu _E)_{L^2(E)} = (\varepsilon (v), {\text {Curl}}^2 \phi )_{L^2(\omega _E)}$. The combination of the two identities leads to a formula for $ (\rho ,b\rho _E) _{L^2(E)}$. Since $\rho -\rho _E$ is controlled in ${\text {osc}}_{II}^2(u_D,E)$, this and an inverse inequality in the beginning result in

$$\begin{aligned}&\Vert \rho _E\Vert _{L^2(E)}^2\lesssim (b\rho _E,\rho _E) _{L^2(E)} =(\rho , b\rho _E) _{L^2(E)}- (\rho -\rho _E, b\rho _E) _{L^2(E)} \\&\; \lesssim ({\text {rot}}_{NC}^2\varvec{\varepsilon }_h,\phi )_{L^2(\omega _E)} -(\varvec{\varepsilon }_h-\varepsilon (v), {\text {Curl}}^2 \phi )_{L^2(\omega _E)} +\Vert \rho _E\Vert _{L^2(E) }h_E^{-3/2} {\text {osc}}_{II}(u_D,E). \end{aligned}$$

The scaling properties of $\phi $ and its derivatives are as in the proof of the previous lemma. With Lemma 5 in the end, this concludes the proof. $\square $

4 Numerical examples

This section is devoted to numerical experiments for four different domains to demonstrate robustness in the reliability and efficiency of the a posteriori error estimator $\eta ({\mathscr {T}}_\ell ,\sigma _\ell )$. The implementation follows [12, 15, 16] for $k=1$ with Lamé parameters $\lambda $ and $\mu $ from $\lambda = E\nu /( (1+\nu )(1-2\nu ))$ and $\mu ={E}/(2(1+\nu ))$ for a Young’s modulus $E=10^5$ and the two Poisson ratios $ \nu =0.3$ and $ \nu = 0.4999$.
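The conversion from Young's modulus and Poisson ratio to the Lamé parameters can be sketched as follows. This Python snippet merely evaluates the two formulas above for the two parameter sets of this section; the function name is hypothetical and the snippet is not taken from the implementation in [12, 15, 16].

```python
def lame_parameters(young, poisson):
    """Lame parameters from Young's modulus E and Poisson ratio nu (formulas above)."""
    lam = young * poisson / ((1.0 + poisson) * (1.0 - 2.0 * poisson))
    mu = young / (2.0 * (1.0 + poisson))
    return lam, mu

for nu in (0.3, 0.4999):
    lam, mu = lame_parameters(1e5, nu)
    print(f"nu = {nu}: lambda = {lam:.6g}, mu = {mu:.6g}, lambda/mu = {lam / mu:.6g}")
```

The ratio $\lambda /\mu = 2\nu /(1-2\nu )$ equals 1.5 for $\nu =0.3$ and 4999 for $\nu =0.4999$, so the second parameter set probes the nearly incompressible regime $\lambda \rightarrow \infty $ in which robustness is of interest.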

4.1 Academic example

The model problem (1) on the unit square $\varOmega = (0,1)^2 $ with homogeneous Dirichlet boundary conditions and the right-hand side $f=(f_1,f_2)$,

$$\begin{aligned} f_1(x,y) = -f_2(y,x)= -2\mu \pi ^3\cos (\pi y)\sin (\pi y)(2\cos (2\pi x) - 1)\quad \text {for }(x,y)\in \varOmega , \end{aligned}$$

allows the smooth exact solution

$$\begin{aligned} u(x,y) {=} \pi \, \sin (\pi x)\sin (\pi y)\, \left( \cos (\pi y) \sin (\pi x), -\cos (\pi x) \sin (\pi y)\right) \quad \text {for }(x,y){\in }\varOmega . \end{aligned}$$

Note that f depends only on the Lamé parameter $\mu $ and not on the critical Lamé parameter $\lambda $. For uniform mesh refinement, Fig. 1 displays the robust third-order convergence of the a posteriori error estimator $\eta ({\mathscr {T}}_\ell ,\sigma _\ell )$ as well as the Arnold–Winther finite element stress error. The convergence is robust in the Poisson ratio $\nu \rightarrow 1/2$ and the a posteriori error estimator proves to be reliable and efficient. In this example, the oscillations ${\text {osc}}(f,{\mathscr {T}}_\ell )$ dominate the a posteriori error estimator.

This typical observation motivates numerical examples with $f\equiv 0$ in the sequel.
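The right-hand side $f$ and the exact solution $u$ of this subsection can be cross-checked numerically. The following Python sketch is not part of the paper's implementation. It assumes that (1) takes the form $-{\text {div}}\,{\mathbb {C}}\varepsilon (u)=f$ with ${\mathbb {C}}E=2\mu E+\lambda \,{\text {tr}}(E)1_{2\times 2}$, so that a divergence-free $u$ leads to $f=-\mu \varDelta u$. The value $\mu =1$, the sample point, the step sizes, and the function names are hypothetical; derivatives are approximated by central finite differences.

```python
import math

MU = 1.0  # sample value (assumption); f scales linearly with mu

def u(x, y):
    """Exact solution of Sect. 4.1."""
    s = math.pi * math.sin(math.pi * x) * math.sin(math.pi * y)
    return (s * math.cos(math.pi * y) * math.sin(math.pi * x),
            -s * math.cos(math.pi * x) * math.sin(math.pi * y))

def f1(x, y):
    return (-2.0 * MU * math.pi**3 * math.cos(math.pi * y) * math.sin(math.pi * y)
            * (2.0 * math.cos(2.0 * math.pi * x) - 1.0))

def f(x, y):
    # Second component from the symmetry f_1(x, y) = -f_2(y, x).
    return (f1(x, y), -f1(y, x))

def divergence(x, y, h=1e-4):
    """Central-difference approximation of div u."""
    du1dx = (u(x + h, y)[0] - u(x - h, y)[0]) / (2 * h)
    du2dy = (u(x, y + h)[1] - u(x, y - h)[1]) / (2 * h)
    return du1dx + du2dy

def minus_mu_laplace(x, y, h=1e-3):
    """Five-point approximation of -mu * Laplace(u), componentwise."""
    c = u(x, y)
    e, w, n, s = u(x + h, y), u(x - h, y), u(x, y + h), u(x, y - h)
    return tuple(-MU * (e[i] + w[i] + n[i] + s[i] - 4.0 * c[i]) / h**2
                 for i in range(2))

x0, y0 = 0.3, 0.2  # arbitrary interior sample point
print(divergence(x0, y0), minus_mu_laplace(x0, y0), f(x0, y0))
```

Up to the discretization error of the finite differences, the divergence of $u$ vanishes and $-\mu \varDelta u$ matches $f$, which illustrates why $f$ does not depend on $\lambda $.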

4.2 Circular inclusion

The second benchmark example from the literature models a rigid circular inclusion in an infinite plate for the domain $\varOmega $ with rather mixed boundary conditions indicated with mechanical symbols in Fig. 2. The exact solution [23] to the model problem (1) reads (with polar coordinates $(r,\phi )$ and $\kappa = 3-4\nu $, $\gamma = 2\nu - 1$, $a=1/4$)

$$\begin{aligned} u_r&= \frac{1}{8\mu r}\left( (\kappa -1)r^2 + 2\gamma a^2 + \left( 2r^2 -\frac{2(\kappa +1)a^2}{\kappa } + \frac{2 a^4}{\kappa r^2}\right) \cos (2\phi )\right) ,\\ u_\phi&= -\frac{1}{8\mu r}\left( 2r^2-\frac{2(\kappa -1)a^2}{\kappa } - \frac{2a^4}{\kappa r^2}\right) \sin (2\phi ). \end{aligned}$$
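The displayed formulas are easy to evaluate, and one property consistent with a rigid inclusion of radius $a$ can be checked directly: both displacement components vanish for $r=a$. The following Python sketch is only an illustration; the function name and the sample values $\mu =1$ and $\nu =0.3$ are hypothetical choices and not those of the experiments.

```python
import math

def displacement_polar(r, phi, nu, mu, a=0.25):
    """Evaluate (u_r, u_phi) from the displayed formulas in polar coordinates."""
    kappa = 3.0 - 4.0 * nu
    gamma = 2.0 * nu - 1.0
    u_r = ((kappa - 1.0) * r**2 + 2.0 * gamma * a**2
           + (2.0 * r**2 - 2.0 * (kappa + 1.0) * a**2 / kappa
              + 2.0 * a**4 / (kappa * r**2)) * math.cos(2.0 * phi)) / (8.0 * mu * r)
    u_phi = -((2.0 * r**2 - 2.0 * (kappa - 1.0) * a**2 / kappa
               - 2.0 * a**4 / (kappa * r**2)) * math.sin(2.0 * phi)) / (8.0 * mu * r)
    return u_r, u_phi

a = 0.25
# Displacement on the circle r = a for a few angles, and at one point away from it.
on_inclusion = [displacement_polar(a, phi, nu=0.3, mu=1.0) for phi in (0.0, 0.7, 2.0)]
away = displacement_polar(0.5, 0.0, nu=0.3, mu=1.0)

print(on_inclusion, away)
```

The cancellation at $r=a$ follows from $\kappa -1+2\gamma =0$ and from the fact that the coefficients of $\cos (2\phi )$ and $\sin (2\phi )$ reduce to zero for $r=a$.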

The approximation of the circular inclusion through a polygon is rather critical for the higher-order Arnold–Winther MFEM. In the absence of an implementation of parametric boundaries, adaptive mesh refinement is necessary to improve the convergence rate. The adaptive algorithm of this section is the same for all examples and acts on polygons; in particular, it does not monitor the curved boundary, but whenever some edge at the curved part $\varGamma _D$ is refined in this example, the midpoint is a new node and projected onto $\varGamma _D$. The convergence history plot in Fig. 3 shows a reduced convergence for uniform refinement, while adaptive refinement (of the circular boundary) leads to optimal third-order convergence.

4.3 L-shaped benchmark

Consider the rotated L-shaped domain with Dirichlet and Neumann boundary depicted in Fig. 4. The exact solution reads in polar coordinates

$$\begin{aligned} u_r(r,\phi )&= \frac{r^{\alpha }}{2\mu } \left( -(\alpha +1)\cos ((\alpha +1)\phi ) + (C_2-\alpha -1)C_1\cos ((\alpha -1)\phi )\right) ,\\ u_\phi (r,\phi )&= \frac{r^{\alpha }}{2\mu } \left( (\alpha +1)\sin ((\alpha +1)\phi ) + (C_2+\alpha -1)C_1\sin ((\alpha -1)\phi )\right) . \end{aligned}$$

The constants are $C_1 := -\cos ((\alpha +1)\omega )/\cos ((\alpha -1)\omega )$ and $C_2:= 2(\lambda + 2\mu )/(\lambda +\mu )$, where $\alpha = 0.544483736782$ is the first root of $\alpha \sin (2\omega )+\sin (2\omega \alpha )=0$ for $\omega = 3\pi /4$. The volume force $f\equiv 0$ and the Neumann boundary data $g\equiv 0$ vanish, and the Dirichlet boundary conditions $u_D$ are extracted from the exact solution.
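The quoted value of $\alpha $ can be reproduced with a few lines of Python. The sketch below uses plain bisection; the bracket $[0.5,0.6]$ and the sample Lamé parameters $\lambda =\mu =1$ are assumptions made for illustration only.

```python
import math


def first_root_alpha(omega=0.75 * math.pi, lo=0.5, hi=0.6, tol=1e-14):
    """Bisection for alpha*sin(2*omega) + sin(2*omega*alpha) = 0.

    The bracket [lo, hi] is an assumption chosen so that it contains the
    value of alpha quoted in the text.
    """

    def f(alpha):
        return alpha * math.sin(2.0 * omega) + math.sin(2.0 * omega * alpha)

    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def l_shape_constants(alpha, lam, mu, omega=0.75 * math.pi):
    """Constants C_1 and C_2 of the exact solution for given Lame parameters."""
    c1 = -math.cos((alpha + 1.0) * omega) / math.cos((alpha - 1.0) * omega)
    c2 = 2.0 * (lam + 2.0 * mu) / (lam + mu)
    return c1, c2


alpha = first_root_alpha()
c1, c2 = l_shape_constants(alpha, lam=1.0, mu=1.0)  # sample Lame parameters
```

For $\omega =3\pi /4$ the equation reduces to $\sin (3\pi \alpha /2)=\alpha $, which changes sign exactly once in the chosen bracket.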

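Figure 5 shows the suboptimal convergence rate ${\mathscr {O}}(N_\ell ^{-0.27})$ for uniform mesh-refinement, which is the expected rate $\alpha $ in terms of the maximal mesh-size (since $h\approx N_\ell ^{-1/2}$ on uniform meshes in two dimensions, $h^{\alpha }\approx N_\ell ^{-\alpha /2}$ with $\alpha /2\approx 0.27$), and fourth-order $L^2$ stress convergence for adaptive mesh-refinement.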
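Rates of this kind are usually read off a convergence history as slopes in a log-log plot. The following sketch computes such empirical rates; the sequence it is applied to is synthetic (generated from the power law $N^{-\alpha /2}$) and is not data from the paper, and the function name `empirical_rates` is an illustrative choice.

```python
import math


def empirical_rates(ndof, errors):
    """Slopes -log(e_{l+1}/e_l) / log(N_{l+1}/N_l) between consecutive levels."""
    levels = list(zip(ndof, errors))
    return [
        -math.log(e1 / e0) / math.log(n1 / n0)
        for (n0, e0), (n1, e1) in zip(levels, levels[1:])
    ]


# Synthetic sequence (not data from the paper): errors proportional to
# h^alpha with h = N^(-1/2), which mimics uniform refinement in 2D.
alpha = 0.544483736782
ndof = [100 * 4**level for level in range(6)]
errors = [n ** (-alpha / 2.0) for n in ndof]
rates = empirical_rates(ndof, errors)
```

In the same units, fourth-order convergence in the mesh-size corresponds to the slope $2$ with respect to the number of degrees of freedom.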

Despite the singular solution, the adaptive algorithm recovers the higher convergence of Theorem 5 as in [15].

4.4 Cook membrane problem

One of the more popular benchmarks in computational mechanics is the tapered panel $\varOmega $ with the vertices A, B, C, D of Fig. 6 clamped on the left side $\varGamma _D={\text {conv}}(D,A)$ (with $u_D\equiv 0$) under no volume force ($f\equiv 0$) but applied surface tractions $g = (0,1)$ along ${\text {conv}}(B,C)$ and traction free on the remaining parts ${\text {conv}}(A,B)$ and ${\text {conv}}(C,D)$ along the Neumann boundary.

This example is a particularly difficult one for the Arnold–Winther MFEM because of the incompatible Neumann boundary conditions on the right corners [12, 15, 16]. That means, although g is piecewise constant, g does not belong to $G({\mathscr {T}})$ for any triangulation. In the two Neumann corner vertices B and C we therefore strongly impose the values $\sigma _\ell (B) = (0.2491 , 0.7283; 0.7283 , 0.6676)$ and $\sigma _\ell (C) = ( 3/20, 11/20 ; 11/20 , 11/60)$ for the design of $g_\ell \in G({\mathscr {T}}_\ell )$.

Since the exact solution is unknown, the error approximation rests on a reference solution ${\tilde{\sigma }}$ computed as $P_5({\mathscr {T}})$ displacement approximation on the uniform refinement of the finest adapted triangulation.

The large pre-asymptotic range of the convergence history plot in Fig. 7 illustrates the difficulties of the Arnold–Winther finite element method in case of incompatible Neumann boundary conditions according to its nodal degrees of freedom. Once the resulting and dominating boundary oscillations (caused by the necessary choice of discrete compatible Neumann conditions in $G({\mathscr {T}}_\ell )$) ${\text {osc}}(g-g_\ell ,{\mathscr {E}}_\ell (\varGamma _N))$ are resolved through adaptive mesh-refining, even the fourth-order $L^2$ stress convergence is visible in a long asymptotic range in (the approximated error and) the equivalent error estimator.

This example underlines that adaptive mesh-refining is unavoidable in computational mechanics with optimal rates and a large saving in computational time and memory compared to naive uniform mesh-refining.

With the modifications of the Arnold–Winther MFEM for incompatible Neumann data as outlined in Appendix B, which only involve adjustments of the right-hand side at the critical incompatible nodal stress degrees of freedom, we observe optimal convergence rates from the very beginning in Fig. 8 without any visible pre-asymptotically reduced convergence caused by incompatible Neumann boundary conditions.

4.5 Comments

The generic constants in this paper are not worked out explicitly in detail and so a numerical comparison with the earlier paper [15] cannot be quantitative. It is conjectured that the residual-based error estimation with the reliability constants (for a guaranteed upper error bound) overestimates the true error up to an order of magnitude.

The qualitative comparison in Fig. 5 (without the reliability constants for the estimators) provides numerical evidence that the error estimators of this paper converge with the same convergence rates as those from [15] and it also indicates global equivalence of the errors with the two error estimators. The theoretical evidence in [15] for efficiency depends on unrealistically high regularity assumptions – unlike the general efficiency results of this paper.

References

1. Alonso, A.: Error estimators for a mixed method. Numer. Math. 74(4), 385–395 (1996)
2. Arnold, D.N.: Differential complexes and numerical stability. In: Proceedings of the International Congress of Mathematicians, Vol. I (Beijing, 2002), pp. 137–157. Higher Ed. Press, Beijing (2002)
3. Arnold, D.N., Brezzi, F., Douglas, J.: PEERS: a new mixed finite element for plane elasticity. Jpn. J. Appl. Math. 1, 347–367 (1984)
4. Arnold, D.N., Falk, R.S., Winther, R.: Finite element exterior calculus, homological techniques, and applications. Acta Numer. 15, 1–155 (2006)
5. Arnold, D.N., Winther, R.: Mixed finite elements for elasticity. Numer. Math. 92(3), 401–419 (2002)
6. Boffi, D., Brezzi, F., Fortin, M.: Mixed Finite Element Methods and Applications. Springer Series in Computational Mathematics, vol. 44. Springer, Heidelberg (2013)
7. Braess, D.: Finite Elements. Theory, Fast Solvers, and Applications in Elasticity Theory, 3rd edn. Cambridge University Press, Cambridge (2007)
8. Brenner, S.C., Scott, L.R.: The Mathematical Theory of Finite Element Methods. Texts in Applied Mathematics, vol. 15, 3rd edn. Springer, New York (2008)
9. Carstensen, C.: A posteriori error estimate for the mixed finite element method. Math. Comput. 66(218), 465–476 (1997)
10. Carstensen, C.: A unifying theory of a posteriori finite element error control. Numer. Math. 100(4), 617–637 (2005)
11. Carstensen, C., Dolzmann, G.: A posteriori error estimates for mixed FEM in elasticity. Numer. Math. 81(2), 187–209 (1998)
12. Carstensen, C., Eigel, M., Gedicke, J.: Computational competition of symmetric mixed FEM in linear elasticity. Comput. Methods Appl. Mech. Eng. 200(41–44), 2903–2915 (2011)
13. Carstensen, C., Feischl, M., Page, M., Praetorius, D.: Axioms of adaptivity. Comput. Math. Appl. 67(6), 1195–1253 (2014)
14. Carstensen, C., Gallistl, D., Schedensack, M.: $L^2$ best approximation of the elastic stress in the Arnold–Winther FEM. IMA J. Numer. Anal. 36(3), 1096–1119 (2016)
15. Carstensen, C., Gedicke, J.: Robust residual-based a posteriori Arnold–Winther mixed finite element analysis in elasticity. Comput. Methods Appl. Mech. Eng. 300, 245–264 (2016)
16. Carstensen, C., Günther, D., Reininghaus, J., Thiele, J.: The Arnold–Winther mixed FEM in linear elasticity. Part I: Implementation and numerical verification. Comput. Methods Appl. Mech. Eng. 197, 3014–3023 (2008)
17. Carstensen, C., Hu, J.: A unifying theory of a posteriori error control for nonconforming finite element methods. Numer. Math. 107(3), 473–502 (2007)
18. Carstensen, C., Peterseim, D., Schröder, A.: The norm of a discretized gradient in $H({\text {div}})^{*}$ for a posteriori finite element error analysis. Numer. Math. 132, 519–539 (2016)
19. Carstensen, C., Rabus, H.: Axioms of adaptivity with separate marking for data resolution. SIAM J. Numer. Anal. 55(6), 2644–2665 (2017)
20. Ciarlet, P.G.: The Finite Element Method for Elliptic Problems. Studies in Mathematics and Its Applications, vol. 4. North-Holland Publishing Co., Amsterdam-New York-Oxford (1978)
21. Girault, V., Scott, L.R.: Hermite interpolation of nonsmooth functions preserving boundary conditions. Math. Comput. 71(239), 1043–1074 (2002)
22. Huang, J., Huang, X., Xu, Y.: Convergence of an adaptive mixed finite element method for Kirchhoff plate bending problems. SIAM J. Numer. Anal. 49(2), 574–607 (2011)
23. Kouhia, R., Stenberg, R.: A linear nonconforming finite element method for nearly incompressible elasticity and Stokes flow. Comput. Methods Appl. Mech. Eng. 124(3), 195–212 (1995)
24. Verfürth, R.: A Review of a Posteriori Error Estimation and Adaptive Mesh-Refinement Techniques. Wiley, Hoboken (1996)


Acknowledgements

Open access funding provided by Austrian Science Fund (FWF). The work has been written, while the three authors enjoyed the hospitality of the Hausdorff Research Institute of Mathematics in Bonn, Germany, during the Hausdorff Trimester Program Multiscale Problems: Algorithms, Numerical Analysis and Computation.

Author information

Authors and Affiliations

Humboldt-Universität zu Berlin, Berlin, Germany
Carsten Carstensen
University of Twente, Enschede, The Netherlands
Dietmar Gallistl
Faculty of Mathematics, University of Vienna, Vienna, Austria
Joscha Gedicke


Corresponding author

Correspondence to Joscha Gedicke.

Additional information


The research of the first author (CC) has been supported by the Deutsche Forschungsgemeinschaft in the Priority Program 1748 ‘Reliable simulation techniques in solid mechanics. Development of non-standard discretization methods, mechanical and mathematical analysis’ under the project ‘Foundation and application of generalized mixed FEM towards nonlinear problems in solid mechanics’ (CA 151/22-1). The second author (DG) has been supported by the Deutsche Forschungsgemeinschaft (DFG) through CRC 1173. The third author (JG) has been funded by the Austrian Science Fund (FWF) through the Project P 29197-N32.

Appendices

A Fourth-order convergence of the stress in $L^2$

This appendix explains a high-order convergence phenomenon observed in some numerical benchmark examples for the lowest-order Arnold–Winther method. Adopt the notation from this paper for $k=1$ and let $\sigma $ solve (1) and let $\sigma _h \in AW _k({\mathscr {T}})$ solve (3).

Theorem 5

(fourth-order convergence) Suppose $f=f_h\in P_1({\mathscr {T}};{\mathbb {R}}^2)$ and $g=g_h\in G({\mathscr {T}})$ and suppose that the stress $\sigma \in H^4(\varOmega ;{\mathbb {S}})$. Then the $L^2$ stress error satisfies (with the maximal mesh-size h)

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{L^2(\varOmega )} \lesssim h^4 \Vert \sigma \Vert _{H^4(\varOmega )}. \end{aligned}$$

Proof

Since the stress error $\sigma -\sigma _h\in H^4({\mathscr {T}};{\mathbb {S}})$ is divergence-free, $\alpha $ vanishes in (7) and $\sigma -\sigma _h = {\text {Curl}}^2 \beta \in H^4({\mathscr {T}};{\mathbb {S}})$. Since $\beta \in H^2(\varOmega )$ is piecewise in $C^2$, it follows that $\beta \in C^1({\overline{\varOmega }})$. The Arnold–Winther finite elements have nodal degrees of freedom at the vertices and hence $\sigma _h$ is continuous at each vertex $z\in {\mathscr {N}}$. Hence the second derivatives of $\beta \in C^2({\mathscr {T}})\cap C^1({\overline{\varOmega }})$ are continuous at each vertex $z\in {\mathscr {N}}$. It follows that the nodal interpolation operator $I_A$ associated to the Argyris finite element space $ A({\mathscr {T}}) \subset C^1(\varOmega )\cap P_5({\mathscr {T}})$ exists for $\beta $ in the classical sense and is composed of the piecewise local interpolation. This defines $\beta _h= I_A\beta $ and the divergence-free test function $\tau _h:={\text {Curl}}^2\beta _h\in AW _k(\mathscr {T})$ in (1) and in (3). Consequently,

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{L^2(\varOmega )}^2= (\sigma -\sigma _h , {\text {Curl}}^2(\beta -\beta _h))_{L^2(\varOmega )} \le \Vert \sigma -\sigma _h\Vert _{L^2(\varOmega )} \, |\beta -I_A\beta |_{H^2(\varOmega )}. \end{aligned}$$

This and standard local interpolation error estimates for the nodal interpolation of the quintic Argyris finite elements [7, 8, 20] show

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{L^2(\varOmega )}\lesssim h^4 |\beta |_{H^6({\mathscr {T}})}=h^4 |\sigma |_{H^4(\varOmega )}. \end{aligned}$$

(With $\sigma -\sigma _h={\text {Curl}}^2\beta $ and $|\sigma _h |_{H^4({\mathscr {T}})} =0$ for piecewise cubic $\sigma _h$ in the last step.) $\square $
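For orientation, the exponent in the last estimate can be traced as follows. This is a sketch under the assumption that the standard local estimate for interpolation with quintic polynomials applies to $\beta \in H^6(T)$ on every triangle $T\in {\mathscr {T}}$ with diameter $h_T\le h$:

$$\begin{aligned} |\beta -I_A\beta |_{H^2(T)} \lesssim h_T^{6-2}\, |\beta |_{H^6(T)} \quad \text {and}\quad |\beta |_{H^6(T)} \approx |{\text {Curl}}^2\beta |_{H^4(T)} = |\sigma -\sigma _h |_{H^4(T)} = |\sigma |_{H^4(T)}. \end{aligned}$$

The sum of the squares over all $T\in {\mathscr {T}}$ and $h_T\le h$ then lead to $|\beta -I_A\beta |_{H^2(\varOmega )}\lesssim h^4 |\sigma |_{H^4(\varOmega )}$; the two powers of $h$ lost in comparison with the approximation order six of $P_5$ are those of the $H^2$ seminorm.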

B General Neumann data

The nodal degrees of freedom of the Arnold–Winther stresses do not allow a nodal interpolation of arbitrary Neumann data. As documented in [15, 16] the performance of the numerical method indeed suffers from that property.

In this work, the following alternative is proposed. Let $g\in L^2(\varGamma ;{\mathbb {R}}^2)$ be the Neumann data. Define $g_h:=\varPi ^\varGamma _{k+2} g$ where $\varPi ^\varGamma _{k+2}$ denotes the $L^2$ projection onto $P_{k+2}({\mathscr {E}}(\varGamma );{\mathbb {R}}^2)$. Note that $g_h$ may be discontinuous. Let $\sigma ^P\in H({\text {div}},\varOmega ;{\mathbb {S}})$ be any piecewise polynomial stress approximation with $\sigma ^P\nu = g_h$ on $\varGamma _N$. The explicit design of such a particular solution is outlined in Appendix C below.

The proposed scheme is to seek a solution $\sigma _h^0\in \varSigma (0,{\mathscr {T}})$ and $u_h\in V_h$ such that, for all $\tau _h\in \varSigma (0,{\mathscr {T}})$ and all $v_{h}\in V_h$,

$$\begin{aligned} \int _\varOmega \sigma _h^0 :{\mathbb {C}}^{-1}\tau _h\,dx+\int _\varOmega u_h\cdot {\text {div}}\tau _h\,dx&= \int _{\varGamma _D} u_D \cdot (\tau _h\nu )\, ds -\int _\varOmega \sigma ^P :{\mathbb {C}}^{-1}\tau _h\,dx, \\ \int _\varOmega v_{h}\cdot {\text {div}}\sigma _h^0\, dx&= - \int _\varOmega (f+{\text {div}}\sigma ^P)\cdot v_{h}\, dx. \end{aligned}$$

(12)

Then $\sigma _h := \sigma _h^0 + \sigma ^P$ satisfies the Neumann boundary conditions along $\varGamma $ as well as $-{\text {div}}\sigma _h=f$ in $\varOmega $. Note that this modification merely affects the right-hand side while the system matrix remains unchanged.
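The remark on the unchanged system matrix can be illustrated on the algebraic level. The following Python sketch uses small hypothetical matrices `A` and `B` as stand-ins for the assembled blocks of (12) (they are not Arnold–Winther matrices) and a hand-written Gaussian elimination instead of a library solver; all names are assumptions made for illustration.

```python
def solve_dense(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


# Hypothetical toy blocks: A ~ (C^{-1} sigma, tau), B ~ (div sigma, v),
# d ~ Dirichlet boundary term, f ~ volume force.
A = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 4.0]]
B = [[1.0, 1.0, 1.0]]
d = [1.0, 0.0, 0.0]
f = [0.5]
sigma_p = [0.1, -0.2, 0.3]  # coefficients of a particular stress sigma^P

n, m = len(A), len(B)
# The saddle-point matrix [[A, B^T], [B, 0]] does not involve sigma^P.
K = [A[i] + [B[j][i] for j in range(m)] for i in range(n)]
K += [B[j] + [0.0] * m for j in range(m)]

# Only the right-hand side is shifted by the particular solution.
A_sp = [sum(A[i][k] * sigma_p[k] for k in range(n)) for i in range(n)]
B_sp = [sum(B[j][k] * sigma_p[k] for k in range(n)) for j in range(m)]
rhs = [d[i] - A_sp[i] for i in range(n)] + [-(f[j] + B_sp[j]) for j in range(m)]

solution = solve_dense(K, rhs)
sigma_0, u = solution[:n], solution[n:]
sigma = [s0 + sp for s0, sp in zip(sigma_0, sigma_p)]

# Residuals of B sigma = -f and A sigma + B^T u = d for the full stress.
equilibrium_residual = [
    sum(B[j][k] * sigma[k] for k in range(n)) + f[j] for j in range(m)
]
constitutive_residual = [
    sum(A[i][k] * sigma[k] for k in range(n))
    + sum(B[j][i] * u[j] for j in range(m))
    - d[i]
    for i in range(n)
]
```

A practical consequence is that a factorisation of `K` computed for compatible data could be reused when only $\sigma ^P$ changes.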

The scheme allows for a direct a priori error analysis. The following result states a quasi-optimal a priori error estimate provided $\sigma ^P$ is chosen sufficiently accurate.

Theorem 6

(a priori error estimate) The discrete solution $(\sigma _h,u_h)$ to the modified scheme (12) satisfies

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}} \lesssim \inf _{\sigma _h^\star \in \varSigma (0,{\mathscr {T}})} \left( \Vert \sigma -\sigma _h^\star -\sigma ^P\Vert _{{\mathbb {C}}^{-1}} + \Vert \varPi _k(f+{\text {div}}(\sigma _h^\star +\sigma ^P))\Vert \right) . \end{aligned}$$

Proof

The discrete inf-sup condition (with appropriate discrete test functions $\tau _h\in \varSigma (0,{\mathscr {T}})$ and $v_h\in V_h$ with norm 1) and the discrete equations (12) show that, for any $\sigma _h^\star \in \varSigma (0,{\mathscr {T}})$,

$$\begin{aligned} \begin{aligned}&\Vert \sigma _h^\star -\sigma _h^0\Vert _{{\mathbb {C}}^{-1}} + \Vert {\text {div}}(\sigma _h^\star -\sigma _h^0)\Vert + \Vert u_h-\varPi _k u\Vert \\&\quad \lesssim (\sigma _h^\star -\sigma _h^0,{\mathbb {C}}^{-1}\tau _h)_{L^2(\varOmega )} + ({\text {div}}(\sigma _h^\star -\sigma _h^0),v_h)_{L^2(\varOmega )} + ({\text {div}}\tau _h, u_h-\varPi _k u)_{L^2(\varOmega )} \\&\quad = (\sigma _h^\star +\sigma ^P-\sigma ,{\mathbb {C}}^{-1}\tau _h)_{L^2(\varOmega )} + (\varPi _k({\text {div}}\sigma _h^\star +f+{\text {div}}\sigma ^P),v_h)_{L^2(\varOmega )} \\&\quad \le \Vert \sigma -\sigma _h^\star -\sigma ^P\Vert _{{\mathbb {C}}^{-1}} + \Vert \varPi _k(f+{\text {div}}(\sigma _h^\star +\sigma ^P))\Vert . \end{aligned} \end{aligned}$$

The decomposition $\sigma _h=\sigma _h^0+\sigma ^P$ and the triangle inequality

$$\begin{aligned} \Vert \sigma -\sigma _h\Vert _{{\mathbb {C}}^{-1}} \le \Vert \sigma -\sigma _h^\star -\sigma ^P\Vert _{{\mathbb {C}}^{-1}} + \Vert \sigma _h^\star -\sigma _h^0\Vert _{{\mathbb {C}}^{-1}} \end{aligned}$$

thus imply the stated bound. $\square $

The a posteriori error estimates from Theorem 1, Sects. 2 and 3 hold verbatim also for this case. For the efficiency proof it is required that $\sigma ^P$ is piecewise polynomial. Appendix C proposes an explicit design as a discrete particular solution.

C Neumann boundary conditions

This section explains the modification of the lowest-order Arnold–Winther finite element method, which requires a different treatment of the nodal degrees of freedom at a vertex z on the Neumann boundary $\varGamma _N$ in the presence of incompatible Neumann data.

Two situations arise at the vertex $z\in {\mathscr {N}}$ in the relative interior of $\varGamma _N$ with neighboring triangles ${\mathscr {T}}(z):=\{ T\in {\mathscr {T}}: z\in {\mathscr {N}}(T)\}=:\{T_1,\ldots ,T_J\}$ enumerated counterclockwise. For $J=1$ there is no option to modify nodal degrees of freedom to allow for incompatible Neumann boundary conditions at the vertex z and one requires $J\ge 2$ (resp. $J\ge 3$) in case the angle at the polygon $\varGamma _N$ is different from $\pi $ (resp. equal to $\pi $). The idea behind the required modification of the Arnold–Winther finite element space $ AW _k({\mathscr {T}})$ is to split the various degrees of freedom $\sigma _{ AW }|_{T_j}(z)\in {\mathbb {S}}$ for $j=1,\ldots ,J$, which coincide in $ AW _k({\mathscr {T}})$. This modification leads to some conforming and piecewise Arnold–Winther space $ AW '_k({\mathscr {T}})\subset H({\text {div}},\varOmega ;{\mathbb {S}}) $ and its modified finite element space

$$\begin{aligned} \varSigma '(g_h,{\mathscr {T}}):=\varSigma (\varPi _k g) \cap AW '_k({\mathscr {T}}) \end{aligned}$$

for the edgewise $L^2$ projection $\varPi _k g$ of the Neumann data g.

1.1 Two triangles at an interior vertex of the Neumann boundary

In the first part suppose that $J=2$ and that the node z is a vertex of the polygon $\varGamma _N$ with an interior angle $\omega _1+\omega _2\ne \pi $ for the interior angle $\omega _j$ of the triangle $T_j$ at the vertex z. Let $\alpha $ denote the angle of the edge $E_1\subset \overline{\varGamma _N}$ in the global coordinate system, so that the interior edge $E_2$ shared by $T_1$ and $T_2$ has the angle $\varphi :=\alpha +\omega _1$, while the remaining edge $E_3\subset \overline{\varGamma _N}$ has the angle $\beta =\alpha +\omega _1+\omega _2$; the respective normals $\nu _{E_1}$, $\nu _{E_2}$, and $-\nu _{E_3}$ read $(\sin (\psi ),-\cos (\psi ))$ for $\psi =\alpha $, $\varphi $, and $\beta $; $\nu _{E_1}$ and $\nu _{E_3}$ point outwards of the domain.

Let $(\sigma _{11}^{(j)}, \sigma _{12}^{(j)}, \sigma _{22}^{(j)})$ denote the three components of $\sigma '_{ AW }|_{T_j}(z)\in {\mathbb {S}}$ for $j=1,2$. Those six variables (rather than three $\sigma _{11}^{(1)}=\sigma _{11}^{(2)}$ etc. for the classical nodal values of $ AW $) are required to satisfy boundary conditions

$$\begin{aligned} \sigma '_{\textit{AW}}|_{T_1}(z)\nu _{E_1} = \varPi _k (g|_{E_1})(z) \quad \text {and}\quad \sigma '_{\textit{AW}}|_{T_2}(z)\nu _{E_3} = \varPi _k (g|_{E_3})(z) \end{aligned}$$

and, for $H({\text {div}},\varOmega ;{\mathbb {S}}) $-conformity, the interface conditions

$$\begin{aligned} \sigma '_{\textit{AW}}|_{T_1}(z)\nu _{E_2} =\sigma '_{\textit{AW}}|_{T_2}(z)\nu _{E_2}. \end{aligned}$$

This $6\times 6$ linear system of equations has a unique solution (despite possibly incompatible conditions provided by $\varPi _k (g|_{E_1})(z) $ and $\varPi _k (g|_{E_3})(z) $). The proof requires the regularity of the corresponding $6\times 6$ coefficient matrix

$$\begin{aligned} \begin{pmatrix} \sin \alpha &{}\quad -\cos \alpha &{}\quad 0 &{}&{}&{} \\ 0 &{}\quad \sin \alpha &{}\quad -\cos \alpha &{}&{}&{} \\ \sin \varphi &{}\quad -\cos \varphi &{}\quad 0 &{}\quad -\sin \varphi &{}\quad \cos \varphi &{}\quad 0 \\ 0 &{}\quad \sin \varphi &{}\quad -\cos \varphi &{}\quad 0 &{}\quad -\sin \varphi &{}\quad \cos \varphi \\ &{}&{}&{}\quad -\sin \beta &{}\quad \cos \beta &{}\quad 0 \\ &{}&{}&{}\quad 0 &{}\quad -\sin \beta &{}\quad \cos \beta \end{pmatrix} \end{aligned}$$

with empty space representing $2\times 3$ zero blocks, which is multiplied with the coefficient vector

$$\begin{aligned}(\sigma _{11}^{(1)}, \sigma _{12}^{(1)}, \sigma _{22}^{(1)},\sigma _{11}^{(2)}, \sigma _{12}^{(2)}, \sigma _{22}^{(2)})\in {\mathbb {R}}^6 \end{aligned}$$

representing $(\sigma '_{\textit{AW}}|_{T_1}, \sigma '_{\textit{AW}}|_{T_2})$ at z.
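The nodal system can be assembled and solved directly. The sketch below builds the $6\times 6$ matrix from the $2\times 3$ blocks of the display above and solves it for sample angles and sample (incompatible) tractions; the function names, the angles, and the data are illustrative assumptions, and the elimination routine is a plain textbook one.

```python
import math


def n_block(psi):
    """2x3 block mapping (s11, s12, s22) to sigma * (sin psi, -cos psi)."""
    s, c = math.sin(psi), math.cos(psi)
    return [[s, -c, 0.0], [0.0, s, -c]]


def solve_dense(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def nodal_system(alpha, omega1, omega2, g1, g3):
    """Matrix and right-hand side of the six conditions at the vertex z."""
    phi, beta = alpha + omega1, alpha + omega1 + omega2
    na, nphi, nb = n_block(alpha), n_block(phi), n_block(beta)
    zero = [0.0, 0.0, 0.0]
    matrix = [
        na[0] + zero,                     # traction on E_1 (triangle T_1)
        na[1] + zero,
        nphi[0] + [-v for v in nphi[0]],  # normal continuity across E_2
        nphi[1] + [-v for v in nphi[1]],
        zero + [-v for v in nb[0]],       # traction on E_3 (triangle T_2)
        zero + [-v for v in nb[1]],
    ]
    return matrix, [g1[0], g1[1], 0.0, 0.0, g3[0], g3[1]]


# Sample angles with omega1 + omega2 != pi and sample edge tractions.
matrix, rhs = nodal_system(0.2, math.pi / 3.0, math.pi / 4.0, (1.0, 0.0), (0.0, 1.0))
x = solve_dense(matrix, rhs)
stress_t1, stress_t2 = x[:3], x[3:]
residual = max(
    abs(sum(row[k] * x[k] for k in range(6)) - b) for row, b in zip(matrix, rhs)
)
```

The two nodal stresses `stress_t1` and `stress_t2` differ in general, which is exactly the additional freedom that the classical nodal degrees of freedom do not provide.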

There are several ways to cross-check the regularity of this coefficient matrix. One reduces it to the regularity of the $3\times 3$ matrix

$$\begin{aligned} \begin{pmatrix} \cos ^2\alpha &{} \sin \alpha \, \cos \alpha &{} \sin ^2\alpha \\ \cos ^2\varphi &{} \sin \varphi \, \cos \varphi &{} \sin ^2\varphi \\ \cos ^2\beta &{} \sin \beta \, \cos \beta &{} \sin ^2\beta \end{pmatrix} \end{aligned}$$

as follows. The following abbreviations apply throughout this section

$$\begin{aligned} m(\psi ):=(\cos ^2\psi , \sin \psi \, \cos \psi , \sin ^2\psi )^T \quad \text {and}\quad N(\psi ):= \begin{pmatrix} \sin \psi &{} -\cos \psi &{} 0 \\ 0 &{} \sin \psi &{} -\cos \psi \end{pmatrix}. \end{aligned}$$

Any vector $(x_1,\ldots ,x_6)$ in the kernel of the above displayed $6\times 6$ coefficient matrix satisfies in particular $N(\alpha )(x_1,x_2,x_3)^T=0\in {\mathbb {R}}^2$. Hence it is parallel to the cross-product of the two row vectors in the $2\times 3$ matrix $N(\alpha )$ and so $(x_1,x_2,x_3)^T || m(\alpha )$. The same is true for $(x_4,x_5,x_6)^T || m(\beta )$. The remaining two conditions for $(x_1,\ldots ,x_6)$ to be a kernel vector read $N(\varphi )(x_4-x_1,x_5-x_2,x_6-x_3)^T=0\in {\mathbb {R}}^2$ and so $(x_4-x_1,x_5-x_2,x_6-x_3) || m(\varphi )$. This leads to $(x_1,\ldots ,x_6)=0\in {\mathbb {R}}^6$ if and only if the displayed $3\times 3$ matrix $ (m(\alpha ),m(\varphi ),m(\beta ))^T\in {\mathbb {R}}^{3\times 3}$ is regular. Its determinant $\det $ depends on $\omega _1$ and $\omega _2$ and not on $\alpha $ if one substitutes $\varphi :=\alpha +\omega _1$ and $\beta := \alpha +\omega _1+\omega _2$; the elementary proof abbreviates the Vandermonde determinant

$$\begin{aligned} \det =:\left| \begin{matrix} \cos ^2 &{} \sin \, \cos &{} \sin ^2 \\ \alpha &{} \varphi &{}\beta \end{matrix} \right| \end{aligned}$$

of a $3\times 3$ matrix with columns determined by the three functions $\cos ^2, \sin \, \cos , \sin ^2$ and rows evaluated respectively at $\alpha ,\varphi ,\beta $. The derivative $\partial \det /\partial \alpha $ of the determinant $\det $ with respect to the variable $\alpha $ (with $\omega _1$ and $\omega _2$ fixed) reads

$$\begin{aligned} \left| \begin{matrix} -2\sin \, \cos &{} \sin \, \cos &{} \sin ^2 \\ \alpha &{}\varphi &{}\beta \end{matrix} \right| + \left| \begin{matrix} \cos ^2&{} \cos ^2 -\sin ^2 &{} \sin ^2 \\ \alpha &{}\varphi &{}\beta \end{matrix} \right| + \left| \begin{matrix} \cos ^2&{} \sin \, \cos &{} 2\sin \,\cos \\ \alpha &{}\varphi &{}\beta \end{matrix} \right| . \end{aligned}$$

Each of these three determinants vanishes (for the columns are linearly dependent). This proves that $\partial \det /\partial \alpha =0$ and so one may choose without loss of generality $\alpha =0$ to compute

$$\begin{aligned} \det =\sin (\omega _1)\sin (\omega _2)\sin (\omega _1+\omega _2). \end{aligned}$$

Since $\beta -\alpha = \omega _1+\omega _2\ne \pi $, $\det \ne 0$ and the design of a minimum norm solution of the above six conditions is feasible for any (in particular incompatible) discrete data.
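The closed formula and its independence of $\alpha $ are easy to confirm numerically. The following sketch (with illustrative function names and sample angles) evaluates the determinant of $(m(\alpha ),m(\varphi ),m(\beta ))^T$ for several rotations $\alpha $ and also for a straight boundary with $\omega _1+\omega _2=\pi $.

```python
import math


def m_vec(psi):
    """m(psi) = (cos^2 psi, sin psi cos psi, sin^2 psi)."""
    s, c = math.sin(psi), math.cos(psi)
    return (c * c, s * c, s * s)


def det3(rows):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def vertex_determinant(alpha, omega1, omega2):
    """Determinant of the 3x3 matrix with rows m(alpha), m(phi), m(beta)."""
    phi, beta = alpha + omega1, alpha + omega1 + omega2
    return det3([m_vec(alpha), m_vec(phi), m_vec(beta)])


omega1, omega2 = math.pi / 3.0, math.pi / 4.0
closed_form = math.sin(omega1) * math.sin(omega2) * math.sin(omega1 + omega2)
# The determinant should not change when the local frame is rotated by alpha.
deviations = [
    abs(vertex_determinant(alpha, omega1, omega2) - closed_form)
    for alpha in (0.0, 0.3, 1.0, 2.5)
]
# Straight boundary (omega1 + omega2 = pi): the determinant degenerates.
straight = vertex_determinant(0.3, math.pi / 3.0, 2.0 * math.pi / 3.0)
```

The vanishing value of `straight` reflects that two triangles do not suffice at a node where the Neumann boundary is flat.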

Provided the angle at a node in the relative interior of the Neumann boundary is $\pi $, the design of this subsection may be infeasible (although solutions exist for discrete compatible data) and $J\ge 3$ is required.

1.2 Three triangles at a node in the Neumann boundary

Suppose ${\mathscr {T}}(z)=\{T_1,T_2,T_3\}$ enumerated counterclockwise with interior angles $\omega _1,\omega _2,\omega _3$ at the vertex z and $E_1$ parallel to $E_4$ on $\overline{\varGamma _N}$ and $\omega _1+\omega _2+\omega _3=\pi $ with $\beta =\alpha +\pi $. Let $(\sigma _{11}^{(j)}, \sigma _{12}^{(j)}, \sigma _{22}^{(j)})$ denote the three components of $\sigma '_{\textit{AW}}|_{T_j}(z)\in {\mathbb {S}}$ for $j=1,2,3$. With the aforementioned block matrices, the four Neumann boundary conditions and the four interface conditions for $H({\text {div}},\varOmega ;{\mathbb {S}}) $-conformity can be summarized into a linear system of equations with the $8\times 9$ coefficient matrix

$$\begin{aligned} \begin{pmatrix} N(\alpha ) &{} &{} \\ N(\alpha +\omega _1)&{} -N(\alpha +\omega _1)&{} \\ &{} N(\alpha +\omega _1+\omega _2) &{} -N(\alpha +\omega _1+\omega _2) \\ &{}&{} N(\alpha ) \end{pmatrix}. \end{aligned}$$

In order to prove that the design of $(\sigma _{11}^{(1)}, \sigma _{12}^{(1)}\ldots , \sigma _{22}^{(3)})\in {\mathbb {R}}^9$ is always possible (even for incompatible discrete Neumann data), it suffices to prove that this coefficient matrix has full rank. The subsequent regular transformation matrix

$$\begin{aligned} T:=\begin{pmatrix} -\sin \alpha &{}-\cos \alpha &{} \cos ^2\alpha \\ \cos \alpha &{} -\sin \alpha &{} \sin \alpha \, \cos \alpha \\ \sin \alpha &{} \cos \alpha &{} \sin ^2\alpha \end{pmatrix} \end{aligned}$$

leads to a new set of variables $(\tau _{11}^{(1)}, \tau _{12}^{(1)}\ldots , \tau _{22}^{(3)})\in {\mathbb {R}}^9$ by $(\sigma _{11}^{(j)}, \sigma _{12}^{(j)}, \sigma _{22}^{(j)})^T:= T (\tau _{11}^{(j)}$, $\tau _{12}^{(j)}, \tau _{22}^{(j)})^T$ for $j=1,2,3$. Elementary trigonometry shows

$$\begin{aligned} {\widetilde{N}}(\psi -\alpha ):=N(\psi )T=\begin{pmatrix} -\cos (\psi -\alpha ) &{} -\sin (\psi -\alpha ) &{} \cos \alpha \, \sin (\psi -\alpha ) \\ \sin (\psi -\alpha ) &{} -\cos (\psi -\alpha ) &{} \sin \alpha \, \sin (\psi -\alpha ) \end{pmatrix}. \end{aligned}$$
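The trigonometric identity $N(\psi )T={\widetilde{N}}(\psi -\alpha )$ can be cross-checked numerically. The sketch below multiplies the two matrices for sample angles; the function names and the values of $\alpha $ and $\psi $ are illustrative assumptions.

```python
import math


def n_block(psi):
    """2x3 matrix N(psi)."""
    s, c = math.sin(psi), math.cos(psi)
    return [[s, -c, 0.0], [0.0, s, -c]]


def t_matrix(alpha):
    """3x3 transformation matrix T for the angle alpha."""
    s, c = math.sin(alpha), math.cos(alpha)
    return [[-s, -c, c * c], [c, -s, s * c], [s, c, s * s]]


def n_tilde(psi, alpha):
    """Claimed closed form of N(psi) T in terms of psi - alpha."""
    sd, cd = math.sin(psi - alpha), math.cos(psi - alpha)
    return [
        [-cd, -sd, math.cos(alpha) * sd],
        [sd, -cd, math.sin(alpha) * sd],
    ]


def matmul(a, b):
    """Product of two small dense matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


alpha, psi = 0.4, 1.3
product = matmul(n_block(psi), t_matrix(alpha))
expected = n_tilde(psi, alpha)
max_diff = max(
    abs(product[i][j] - expected[i][j]) for i in range(2) for j in range(3)
)
# For psi = alpha the product is the negative 2x2 unit matrix plus a zero column.
at_alpha = matmul(n_block(alpha), t_matrix(alpha))
```
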

In particular, $-N(\alpha ) T=-{\widetilde{N}}(0)$ consists of the $2\times 2$ unit matrix followed by a zero column, which shows that the kernel vectors $(\tau _{11}^{(1)}, \tau _{12}^{(1)}\ldots $, $\tau _{22}^{(3)})\in {\mathbb {R}}^9$ of the aforementioned $8\times 9$ coefficient matrix satisfy $\tau _{11}^{(j)}=0= \tau _{12}^{(j)}$ for $j=1$ and $j=3$. Moreover, the remaining components $\tau _{22}^{(1)}, \tau _{11}^{(2)}, \tau _{12}^{(2)}, \tau _{22}^{(2)}$, and $\tau _{22}^{(3)}$ of a (transformed) kernel vector satisfy

$$\begin{aligned} \begin{pmatrix} \cos (\omega _1) &{} \sin (\omega _1) &{} -2\cos \alpha \, \sin (\omega _1) &{} -\cos \alpha \, \sin (\omega _1) \\ -\sin (\omega _1) &{} \cos (\omega _1) &{} -2\sin \alpha \, \sin (\omega _1) &{} -\sin \alpha \, \sin (\omega _1) \\ -\cos (\omega _1+\omega _2) &{} -\sin (\omega _1+\omega _2) &{} \cos \alpha \, \sin (\omega _1+\omega _2) &{} 2\cos \alpha \, \sin (\omega _1+\omega _2) \\ \sin (\omega _1+\omega _2) &{} -\cos (\omega _1+\omega _2) &{} \sin \alpha \, \sin (\omega _1+\omega _2) &{} 2\sin \alpha \, \sin (\omega _1+\omega _2) \end{pmatrix} \begin{pmatrix} \tau _{11}^{(2)} \\ \tau _{12}^{(2)} \\ \tau _{22}^{(2)}- \tau _{22}^{(1)}\\ \tau _{22}^{(2)}- \tau _{22}^{(3)} \end{pmatrix}=0. \end{aligned}$$

It remains to prove that the determinant of this $4\times 4$ coefficient matrix is nonzero, because this proves $\tau _{11}^{(2)} =0= \tau _{12}^{(2)}$ and $\tau _{22}^{(1)}= \tau _{22}^{(2)}=\tau _{22}^{(3)}$ and so the null space is one-dimensional and the rank of the above $8\times 9$ matrix is 8, i.e., all conditions can be satisfied by a one-dimensional (whence non-empty) solution space.

The expansion of the determinant of the $4\times 4$ coefficient matrix confirms that the determinant is equal to $ - 3\sin (\omega _1)\sin (\omega _2) \sin (\omega _1+\omega _2)$ and so negative for $0<\omega _1< \omega _1+\omega _2<\pi $.
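The stated value of the determinant can be confirmed numerically for sample angles. The sketch below types in the $4\times 4$ matrix displayed above and expands its determinant recursively; the function names and the angles are illustrative assumptions.

```python
import math


def det(matrix):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def kernel_matrix(alpha, omega1, omega2):
    """The 4x4 coefficient matrix displayed above."""
    a, b = math.cos(alpha), math.sin(alpha)
    s1, c1 = math.sin(omega1), math.cos(omega1)
    s12, c12 = math.sin(omega1 + omega2), math.cos(omega1 + omega2)
    return [
        [c1, s1, -2.0 * a * s1, -a * s1],
        [-s1, c1, -2.0 * b * s1, -b * s1],
        [-c12, -s12, a * s12, 2.0 * a * s12],
        [s12, -c12, b * s12, 2.0 * b * s12],
    ]


# Sample angles with 0 < omega1 < omega1 + omega2 < pi.
alpha, omega1, omega2 = 0.7, 0.9, 1.1
numeric = det(kernel_matrix(alpha, omega1, omega2))
closed_form = -3.0 * math.sin(omega1) * math.sin(omega2) * math.sin(omega1 + omega2)
```

The factor $3$ arises because the last two columns are the combinations $-2s_1p+s_{12}q$ and $-s_1p+2s_{12}q$ of two fixed vectors $p,q$ (with $s_1=\sin \omega _1$ and $s_{12}=\sin (\omega _1+\omega _2)$), so that the determinant picks up the factor $(-2)(2)-(1)(-1)=-3$ times $s_1s_{12}$.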

1.3 More triangles at a node in the Neumann boundary

The general situation is that there are $J\ge 2$ (resp. $J\ge 3$) triangles at a node $z\in \varGamma _N$ in the relative interior of the Neumann boundary $\varGamma _N$ with an interior angle $\ne \pi $ (resp. $=\pi $) of the polygon $\varGamma _N$ at z. The above analysis shows that the modified Arnold–Winther space $\textit{AW}'_k({\mathscr {T}})\subset H({\text {div}},\varOmega ;{\mathbb {S}}) $ allows for solutions for $J=2$ (resp. $J=3$). In case that J is larger, one may choose a partition of $T_1,\ldots ,T_J$ in two (resp. three) groups $T_1,\ldots ,T_k,T_{k+1},\ldots T_J$ (resp. $T_1,\ldots ,T_k,T_{k+1},\ldots ,T_\ell ,T_{\ell +1},\ldots T_J$). Then the extra constraint $\sigma '_{\textit{AW}}|_{T_1}=\ldots =\sigma '_{\textit{AW}}|_{T_k}$ and $\sigma '_{\textit{AW}}|_{T_{k+1}}=\ldots =\sigma '_{\textit{AW}}|_{T_J}$ for each group reduces the discussion to the above calculations with two triangles (resp. to the calculations with three triangles).

1.4 Modified Arnold–Winther FEM for Neumann boundary

The implementation of nodes at the Neumann boundary concerns only z in (the relative interior of) $\varGamma _N$ in case of incompatible discrete Neumann data. The most general case may be implemented via Lagrange multipliers for the interface conditions of the nodal values of $\sigma '_{\textit{AW}}|_{T_j}(z)$ in $\textit{AW}'_k({\mathscr {T}})\subset H({\text {div}},\varOmega ;{\mathbb {S}}) $.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article

Cite this article

Carstensen, C., Gallistl, D. & Gedicke, J. Residual-based a posteriori error analysis for symmetric mixed Arnold–Winther FEM. Numer. Math. 142, 205–234 (2019). https://doi.org/10.1007/s00211-019-01029-7


Received: 02 March 2017
Revised: 03 October 2018
Published: 28 February 2019
Issue Date: 01 June 2019
DOI: https://doi.org/10.1007/s00211-019-01029-7
