1 Introduction

Adaptive mesh-refining is vital in the computational sciences and engineering, with optimal rates known for many linear problems [19, 21, 29, 54]. Apart from eigenvalue problems [6, 23, 24, 33], much less is known for stationary nonlinear PDEs. The few positive results in the literature mainly concern conforming FEM with plain convergence results [4, 12, 22, 41, 56]. An important exception is the p-Laplacian in [5], where the notion of a quasi-norm enables two-sided error control. The next larger class of convex minimization problems from [27] emerged in the relaxation of non-convex minimization problems with enforced microstructures and is the focus of this paper. This class is characterized by a two-sided growth condition on a \(C^1\) energy density W with an additional convexity control that enables a unique stress \(\mathrm {D}W(\mathrm {D}u)\) independent of the multiple minimizers u on the continuous level. In fact, there is no further control of the convex closed set of minimizers beyond a priori boundedness. This leads to the reliability-efficiency gap [14] in the a posteriori error control: If the mesh-size tends to zero, the known guaranteed lower and upper error bounds converge with different convergence rates. In other words, the efficiency index tends to infinity. This dramatic loss of sharp error control does not prevent convergence of an adaptive algorithm in general, but it makes the analysis of plain convergence much harder and seemingly disables any proof of optimal rates.

The numerical experiments in [28, 38] motivate this paper on the adaptive HHO. The only nonconforming scheme known to converge for general convex minimization problems is [52] for the first-order Crouzeix–Raviart schemes and, to the best of the authors' knowledge, there is no contribution on the convergence of an adaptive higher-order nonconforming scheme for nonlinear PDEs in the literature. In fact, this paper is the first to guarantee plain convergence of HHO schemes at all, even for linear PDEs. The reason is a negative power of the mesh-size in the stabilization terms that is overcome in dG schemes for linear PDEs by over-penalization in [7] to be close to conforming approximations (and then enable arguments for optimal convergence rates) and recently by generalized Galerkin solutions in a limit space in [47] for plain convergence. One advantage of the HHO methodology is the absence of a stabilization parameter and, hence, this argument is not employed in this paper.

The main contributions of this paper are adaptive HHO methods with and without stabilization with guaranteed plain convergence for the class of convex minimization problems from [27]. Three types of results are available for those schemes.

  1. (a)

    If W is \(C^1\) and convex with two-sided p-growth, then the minimal discrete energies converge to the exact minimal energy.

  2. (b)

    If furthermore W satisfies the convexity control in the class of degenerate convex minimization problems of [27], then the discrete stress approximations converge to the (unique) exact stress \(\sigma \).

  3. (c)

    If W is even strongly convex, then the discrete approximations of the gradients converge to the gradient \(\mathrm {D}u\) of the (unique) exact minimizer u.

The two-sided growth condition excludes problems that exhibit the Lavrentiev gap phenomenon [48] and so we only comment on the lowest-order schemes that overcome the Lavrentiev gap owing to the Jensen inequality.

Numerical experiments are carried out on simplicial meshes, but the design of the stabilized HHO method allows for polytopal meshes for a fairly flexible mesh-design, e.g., in 3D. The mesh-refinement of those schemes is less developed (e.g., in comparison with [55] on simplicial meshes) and remains an important aspect for future research.

The remaining parts of this paper are organized as follows. Section 2 introduces the continuous minimization problem, the adaptive mesh-refining algorithm, and the main results of this paper. Section 3 reviews the discretization with the HHO methodology on simplicial triangulations without stabilization. Section 4 departs from discrete compactness, proves the plain convergence of an adaptive scheme, and concludes with an application to the Lavrentiev gap. Section 5 treats HHO methods on general polytopal meshes with stabilization and proves the results of Sect. 4.2. Numerical results for three model examples from Sect. 2.4 below are presented in Sect. 6 with conclusions drawn from the numerical experiments.

2 Mathematical setting and main results

This paper analyzes the convergence of an adaptive mesh-refining algorithm based on the hybrid high-order methodology [35, 37, 39] for convex minimization problems with a two-sided p-growth.

2.1 Continuous problem

Given a bounded polyhedral Lipschitz domain \(\Omega \subset {\mathbb {R}}^n\) and \(1< p < \infty \), let \(W \in C^1({\mathbb {M}})\) with \({\mathbb {M}}{:}{=}{\mathbb {R}}^{m \times n}\) satisfy

  1. (A1)

    (convexity) W is convex;

  2. (A2)

    (two-sided growth) \(c_{1}|A|^p - c_{2} \le W(A) \le c_{3}|A|^p + c_{4}\) for all \(A \in {\mathbb {M}}\).

The constants \(c_{1}, c_{3} > 0\) and \(c_{2}, c_{4} \ge 0\) are universal in this paper and independent of the argument \(A \in {\mathbb {M}}\); the same universality applies to \(c_{5}, c_{6}\) in (2.5)–(2.6). Throughout this paper, the boundary \(\partial \Omega \) of the domain \(\Omega \) is divided into a compact Dirichlet part \(\Gamma _\mathrm {D}\) with positive surface measure and a relatively open (and possibly empty) Neumann part \(\Gamma _\mathrm {N} = \partial \Omega \setminus \Gamma _\mathrm {D}\). Given \(f \in L^{p'}(\Omega ;{\mathbb {R}}^m)\), \(g \in L^{p'}(\Gamma _\mathrm {N};{\mathbb {R}}^m)\) with \(1/p + 1/p' = 1\), and \(u_\mathrm {D} \in V {:}{=}W^{1,p}(\Omega ;{\mathbb {R}}^m)\), minimize the energy functional

$$\begin{aligned} E(v) {:}{=}\int _\Omega (W(\mathrm {D}v) - f \cdot v) \,{\mathrm {d}}x - \int _{\Gamma _\mathrm {N}} g \cdot v \,{\mathrm {d}}s \end{aligned}$$
(2.1)

among admissible functions \(v \in {\mathcal {A}}{:}{=}u_\mathrm {D} + V_\mathrm {D}\) subject to the Dirichlet boundary condition \(v|_{\Gamma _\mathrm {D}} = u_\mathrm {D}|_{\Gamma _\mathrm {D}}\) and \(V_\mathrm {D} {:}{=}\{v \in V: v|_{\Gamma _{\mathrm {D}}} \equiv 0\}\).

2.2 Adaptive hybrid high-order method (AHHO)

The adaptive algorithm computes a sequence of discrete approximations of the minimal energy \(\min E({\mathcal {A}})\) in the affine space \({\mathcal {A}}= u_\mathrm {D} + V_\mathrm {D}\) of admissible functions in a successive loop over the steps outlined below. The first version of the adaptive algorithm focuses on the newest-vertex-bisection (NVB) [55] and the first HHO method without stabilization on triangulations into simplices. It will be generalized to polytopal meshes in Sect. 5.

  1. 1.

    INPUT. The input is a regular initial triangulation \({\mathcal {T}}_0\) of \(\Omega \) into simplices, a polynomial degree \(k \ge 0\), a positive parameter \(0 < \varepsilon \le k+1\), and a bulk parameter \(0< \theta < 1\).

  2. 2.

    SOLVE. Let \({\mathcal {T}}_\ell \) denote the triangulation associated to the level \(\ell \in {\mathbb {N}}_0\) with the set of all sides \({\mathcal {F}}_\ell \). The hybrid high-order method utilizes the discrete ansatz space \(V({\mathcal {T}}_\ell ) {:}{=}P_k({\mathcal {T}}_\ell ;{\mathbb {R}}^m) \times P_k({\mathcal {F}}_\ell ;{\mathbb {R}}^m)\) with a split of the discrete variables \(v_\ell = (v_{{\mathcal {T}}_\ell }, v_{{\mathcal {F}}_\ell })\) into a volume variable \(v_{{\mathcal {T}}_\ell } \in P_k({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\) and a skeleton variable \(v_{{\mathcal {F}}_\ell } \in P_k({\mathcal {F}}_\ell ;{\mathbb {R}}^m)\) of polynomial degree at most \(k \ge 0\) with respect to the simplices (\({\mathcal {T}}_\ell \)) and the sides (\({\mathcal {F}}_\ell \)) in the triangulation \({\mathcal {T}}_\ell \). The proposed numerical scheme replaces \(\mathrm {D}v\) in (2.1) by a gradient reconstruction \({\mathcal {G}}_\ell \) in the space of piecewise Raviart–Thomas finite element functions \(\Sigma ({\mathcal {T}}_\ell ) = \mathrm {RT}_k^\mathrm {pw}({\mathcal {T}}_\ell ;{\mathbb {M}})\) for a shape-regular triangulation \({\mathcal {T}}_\ell \) of \(\Omega \) into simplices. The details on the gradient reconstruction \({\mathcal {G}}_\ell \) are postponed to Sect. 3.4. The discrete problem computes a discrete minimizer \(u_\ell \) of

    $$\begin{aligned} E_\ell (v_\ell ) {:}{=}\int _\Omega (W({\mathcal {G}}v_\ell ) - f \cdot v_{{\mathcal {T}}_\ell }) \,{\mathrm {d}}x - \int _{\Gamma _\mathrm {N}} g \cdot v_{{\mathcal {F}}_\ell } \,{\mathrm {d}}s \end{aligned}$$
    (2.2)

    among \(v_\ell = (v_{{\mathcal {T}}_\ell }, v_{{\mathcal {F}}_\ell }) \in {\mathcal {A}}({\mathcal {T}}_\ell )\) with the discrete analog \({\mathcal {A}}({\mathcal {T}}_\ell )\) of \({\mathcal {A}}\) so that \(v_{{\mathcal {F}}_\ell }|_F = \Pi _F^k u_\mathrm {D}\) for any Dirichlet side \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\), where \(\Pi _F^k\) is the \(L^2\) projection onto the polynomials \(P_k(F)\) of degree at most k. Let \(\sigma _\ell {:}{=}\Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}W({\mathcal {G}}u_\ell ) \in \Sigma ({\mathcal {T}}_\ell )\) be the \(L^2\) projection of \(\mathrm {D}W({\mathcal {G}}u_\ell )\) onto \(\Sigma ({\mathcal {T}}_\ell )\). Further details on the hybrid high-order method follow in Sect. 3 below.

  3. 3.

    REFINEMENT INDICATORS. The computation of the refinement indicator \(\eta _\ell \) utilizes an elliptic potential reconstruction \({\mathcal {R}}_\ell u_\ell \in P_{k+1}({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\) of the discrete minimizer \(u_\ell = (u_{{\mathcal {T}}_\ell }, u_{{\mathcal {F}}_\ell }) \in {\mathcal {A}}({\mathcal {T}}_\ell )\) computed in SOLVE. The definition of \({\mathcal {R}}_\ell u_\ell \) follows in (3.4)–(3.5) below. Any interior side \(F \in {\mathcal {F}}_\ell (\Omega )\) is shared by two simplices \(T_+, T_- \in {\mathcal {T}}_\ell \) with \(F = T_+ \cap T_-\). The jump \([{\mathcal {R}}_\ell u_\ell ]_F\) along F is defined by \([{\mathcal {R}}_\ell u_\ell ]_F {:}{=}({\mathcal {R}}_\ell u_\ell )|_{T_+} - ({\mathcal {R}}_\ell u_\ell )|_{T_-} \in P_{k+1}(F;{\mathbb {R}}^m)\). Given a positive parameter \(0 < \varepsilon \le k+1\), compute the local refinement indicator

    $$\begin{aligned} \eta _\ell ^{(\varepsilon )}(T)&{:}{=}|T|^{(\varepsilon p - p)/n}\Vert \Pi _T^k({\mathcal {R}}_\ell u_\ell - u_T)\Vert _{L^p(T)}^p + |T|^{\varepsilon p'/n}\Vert \sigma _\ell - \mathrm {D}W({\mathcal {G}}u_\ell )\Vert _{L^{p'}(T)}^{p'}\nonumber \\&\quad + |T|^{p'/n}\Vert (1 - \Pi _{T}^k) f\Vert _{L^{p'}(T)}^{p'} + |T|^{1/n} \sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})} \Vert (1 - \Pi _{F}^k) g\Vert _{L^{p'}(F)}^{p'}\nonumber \\&\quad + |T|^{(\varepsilon p + 1 - p)/n} \Big (\sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})} \Vert {\mathcal {R}}_\ell u_\ell - u_\mathrm {D}\Vert _{L^p(F)}^p\nonumber \\&\quad + \sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Omega )} \Vert [{\mathcal {R}}_\ell u_\ell ]_F\Vert _{L^p(F)}^p + \sum _{F \in {\mathcal {F}}_\ell (T)} \Vert \Pi _F^k (({\mathcal {R}}_\ell u_\ell )|_T - u_F)\Vert _{L^p(F)}^p\Big ) \end{aligned}$$
    (2.3)

    for all \(T \in {\mathcal {T}}_\ell \) of volume |T| and sides \({\mathcal {F}}_\ell (T)\) with the abbreviation \(u_T {:}{=}u_{{\mathcal {T}}_\ell }|_T\) and \(u_F {:}{=}u_{{\mathcal {F}}_\ell }|_F\). Let \(\eta _\ell ^{(\varepsilon )} {:}{=}\sum _{T \in {\mathcal {T}}_\ell } \eta _\ell ^{(\varepsilon )}(T)\). The refinement indicator is motivated by the discrete compactness from Theorem 4.1 below. In fact, if \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\), then there exists a \(v \in {\mathcal {A}}\) such that, up to a subsequence, \({\mathcal {G}}_\ell u_\ell \rightharpoonup \nabla v\) weakly in \(L^p(\Omega ;{\mathbb {M}})\) and \(u_{{\mathcal {T}}_\ell } \rightharpoonup v\) weakly in \(L^p(\Omega ;{\mathbb {R}}^m)\) as \(\ell \rightarrow \infty \). It turns out that v is a minimizer of the continuous energy E from (2.1).

  4. 4.

    MARK and REFINE. Given a positive bulk parameter \(0< \theta < 1\), select a subset \({\mathfrak {M}}_\ell \subset {\mathcal {T}}_\ell \) of minimal cardinality such that

    $$\begin{aligned} \theta \eta _\ell ^{(\varepsilon )} \le \eta _\ell ^{(\varepsilon )}({\mathfrak {M}}_\ell ) {:}{=}\sum _{T \in {\mathfrak {M}}_\ell } \eta _\ell ^{(\varepsilon )}(T). \end{aligned}$$
    (2.4)

    This marking strategy is known as Dörfler marking. The marked simplices are refined by the newest-vertex bisection [55] to define \({\mathcal {T}}_{\ell +1}\).

  5. 5.

    OUTPUT. The output is a sequence of shape-regular triangulations \(({\mathcal {T}}_\ell )_{\ell \in {\mathbb {N}}_0}\), the corresponding discrete minimizers \((u_\ell )_{\ell \in {\mathbb {N}}_0}\), the discrete stresses \((\sigma _\ell )_{\ell \in {\mathbb {N}}_0}\), and the refinement indicators \((\eta _\ell ^{(\varepsilon )})_{\ell \in {\mathbb {N}}_0}\). On each level \(\ell \ge 0\), let \({\mathcal {J}}_{\ell } u_\ell \in V\) denote the conforming post-processing of \(u_\ell \) from Lemma 3.4 below.
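The marking step (2.4) can be illustrated in a few lines. The following Python sketch selects a set of minimal cardinality by collecting the largest local indicators first; the function name `doerfler_marking` and the list-based data layout are assumptions made for this illustration and are not taken from the paper or from any particular library.

```python
def doerfler_marking(eta, theta):
    """Return indices of a minimal-cardinality set M with
    theta * sum(eta) <= sum(eta[j] for j in M), cf. (2.4).

    eta   -- list of nonnegative local indicators, one per simplex
             (hypothetical data layout for this sketch)
    theta -- bulk parameter with 0 < theta < 1
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("bulk parameter must satisfy 0 < theta < 1")
    total = sum(eta)
    # Visiting the indicators in decreasing order makes the first
    # admissible prefix a set of minimal cardinality.
    order = sorted(range(len(eta)), key=lambda j: eta[j], reverse=True)
    marked, accumulated = [], 0.0
    for j in order:
        if accumulated >= theta * total:
            break
        marked.append(j)
        accumulated += eta[j]
    return marked


# Hypothetical indicator values for four simplices.
example_eta = [1.0, 5.0, 2.0, 2.0]
print(doerfler_marking(example_eta, 0.5))
```

Sorting costs \(O(N \log N)\) operations for N simplices; this sketch favors readability over the linear-time variants discussed in the literature.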

2.3 Main results

The main results establish the convergence of the sequence \((E_\ell (u_\ell ))_{\ell \in {\mathbb {N}}_0}\) of minimal discrete energies computed by AHHO towards the exact minimal energy.

Theorem 2.1

(plain convergence) Given the input \({\mathcal {T}}_0\), \(k \in {\mathbb {N}}_0\), \(0 < \varepsilon \le k+1\), \(0< \theta < 1\), let \(({\mathcal {T}}_\ell )_{\ell \in {\mathbb {N}}_0}\), \((u_\ell )_{\ell \in {\mathbb {N}}_0}\), and \((\sigma _\ell )_{\ell \in {\mathbb {N}}_0}\) be the output of the adaptive algorithm AHHO from Sect. 2.2. If W satisfies (A1)–(A2), then (a)–(d) hold.

  1. (a)

    \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = \min E({\mathcal {A}})\).

  2. (b)

    The sequence of the post-processing \(({\mathcal {J}}_\ell u_\ell )_{\ell \in {\mathbb {N}}_0}\) is bounded in \(V = W^{1,p}(\Omega ;{\mathbb {R}}^m)\) and any weak accumulation point of \(({\mathcal {J}}_\ell u_\ell )_{\ell \in {\mathbb {N}}_0}\) in V minimizes E in \({\mathcal {A}}\).

  3. (c)

    Suppose there exists \(c_{5} > 0\) such that W satisfies, for all \(A,B \in {\mathbb {M}}\),

    $$\begin{aligned} |A - B|^r \le c_{5}(1 + |A|^s + |B|^s)(W(A) - W(B) - \mathrm {D}W(B):(A - B)) \end{aligned}$$
    (2.5)

    with parameters r, s from Table 1. Then the minimizer u of E in \({\mathcal {A}}\) is unique and \(\lim _{\ell \rightarrow \infty } {\mathcal {G}}_\ell u_\ell = \mathrm {D}u\) (strongly) in \(L^p(\Omega ;{\mathbb {M}})\) holds for the entire sequence.

  4. (d)

    Suppose there exists \( c_{6} > 0\) such that W satisfies, for all \(A,B \in {\mathbb {M}}\),

    $$\begin{aligned} \begin{aligned} |\mathrm {D}W(A) - \mathrm {D}W(B)|^{{\widetilde{r}}}&\le c_{6}(1 + |A|^{{\widetilde{s}}} + |B|^{{\widetilde{s}}})\\&\quad \times (W(A) - W(B) - \mathrm {D}W(B):(A - B)) \end{aligned} \end{aligned}$$
    (2.6)

    with parameters \({\widetilde{r}}, {\widetilde{s}}\) from Table 1. Then the stress \(\sigma {:}{=}\mathrm {D}W(\mathrm {D}u) \in L^{p'}(\Omega ;{\mathbb {M}})\) is unique (independent of the choice of a (possibly nonunique) minimizer u) and \(\lim _{\ell \rightarrow \infty } \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) = \sigma \) (strongly) in \(L^{p'}(\Omega ;{\mathbb {M}})\) and \(\sigma _\ell \rightharpoonup \sigma \) (weakly) in \(L^{p'}(\Omega ;{\mathbb {M}})\) hold for the entire sequence.

Table 1 Parameters r, s, \({\widetilde{r}}\), \({\widetilde{s}}\) in Theorem 2.1 and \(t,{\widetilde{t}}\) in Sect. 4.2

A second focus is on the classical HHO method [34, 35, 37] on general polytopal meshes \({\mathcal {M}}_\ell \) with a stabilization \(\mathrm {s}_\ell (\bullet ,\bullet )\) defined in (5.1) below. The convergence of AHHO for the stabilized HHO method on polytopal meshes is established under two assumptions (M1)–(M2). Further details on (M1)–(M2) and on the stabilized HHO method follow in Sect. 5.

Theorem 2.2

(plain convergence for stabilized HHO) Given the input \({\mathcal {M}}_0\), \(k \in {\mathbb {N}}_0\), \(0 < \varepsilon \le \min \{k+1, (k+1)/(p-1)\}\), \(0< \theta < 1\), let \(({\mathcal {M}}_\ell )_{\ell \in {\mathbb {N}}_0}\), \((u_\ell )_{\ell \in {\mathbb {N}}_0}\), and \((\sigma _\ell )_{\ell \in {\mathbb {N}}_0}\) be the output of the adaptive algorithm from Sect. 2.2. If (M1)–(M2) hold, then (a)–(d) from Theorem 2.1 hold verbatim and \(\lim _{\ell \rightarrow \infty } \mathrm {s}_\ell (u_\ell ;u_\ell ) = 0\).

Notice that an additional restriction on the parameter \(\varepsilon \) is imposed in Theorem 2.2 to control the stabilization \(\mathrm {s}_{\ell }\). The proofs of Theorems 2.1 and 2.2 are postponed to Sects. 4 and 5.

2.4 Examples

Theorem 2.1 applies to the following scalar examples with \(m = 1\).

2.4.1 p-Laplace

The minimization of the energy \(E:{\mathcal {A}}\rightarrow {\mathbb {R}}\) with the energy density \(W \in C^1({\mathbb {R}}^n)\),

$$\begin{aligned} W(a) {:}{=}|a|^p/p \quad \text {for any } a\in {\mathbb {R}}^n \text { with } 1< p < \infty , \end{aligned}$$

is related to the nonlinear PDE \(- \mathrm {div}\sigma = f \in L^{p'}(\Omega )\) with \(\sigma {:}{=}\nabla W(\nabla u) = |\nabla u|^{p-2}\nabla u \in L^{p'}(\Omega ;{\mathbb {R}}^n)\) subject to the boundary conditions \(\sigma \nu = g\) on \(\Gamma _{\mathrm {N}}\) and \(u = u_\mathrm {D}\) on \(\Gamma _{\mathrm {D}}\). The energy density W satisfies (A1)–(A2) and (2.5)–(2.6) [15, 44]. It is worth noticing that the convergence results of this paper are new even for a linear model problem with \(p = 2\) for the two HHO algorithms.
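The relation between the energy density and the stress of the p-Laplace example can be checked numerically. The following Python sketch evaluates \(W(a) = |a|^p/p\) and \(\mathrm {D}W(a) = |a|^{p-2}a\) and compares the latter with a central difference quotient; the function names, the step size, and the sample point are assumptions made for this illustration only.

```python
import math


def energy_density(a, p):
    """W(a) = |a|^p / p for a vector a given as a list of floats."""
    return math.sqrt(sum(x * x for x in a)) ** p / p


def stress(a, p):
    """DW(a) = |a|^(p-2) a, extended by 0 at a = 0 (valid for p > 1)."""
    norm = math.sqrt(sum(x * x for x in a))
    if norm == 0.0:
        return [0.0 for _ in a]
    return [norm ** (p - 2) * x for x in a]


def central_difference(a, p, h=1e-6):
    """Approximate DW(a) componentwise by central difference quotients."""
    grad = []
    for i in range(len(a)):
        plus, minus = list(a), list(a)
        plus[i] += h
        minus[i] -= h
        grad.append((energy_density(plus, p) - energy_density(minus, p)) / (2 * h))
    return grad


# Hypothetical sample point with |a| = 5 and exponent p = 3.
print(stress([3.0, 4.0], 3.0), central_difference([3.0, 4.0], 3.0))
```

The identity \(|\mathrm {D}W(a)|^{p'} = |a|^p\) behind \(\sigma \in L^{p'}(\Omega ;{\mathbb {R}}^n)\) can be verified at the same sample point.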

2.4.2 Optimal design problem

The optimal design problem seeks the optimal distribution of two materials with fixed amounts to fill a given domain for maximal torsion stiffness [4, 46]. For fixed parameters \(0< \xi _1 < \xi _2\) and \(0< \mu _1 < \mu _2\) with \(\xi _1\mu _2 = \xi _2\mu _1\), the energy density \(W(a) {:}{=}\psi (\xi )\), \(a \in {\mathbb {R}}^n\), \(\xi {:}{=}|a| \ge 0\) with

$$\begin{aligned} \psi (\xi ) {:}{=}{\left\{ \begin{array}{ll} \mu _2 \xi ^2/2 &{}\text{ if } 0 \le \xi \le \xi _1,\\ \xi _1\mu _2(\xi - \xi _1/2) &{}\text{ if } \xi _1 \le \xi \le \xi _2,\\ \mu _1 \xi ^2/2 - \xi _1\mu _2(\xi _1/2 - \xi _2/2) &{}\text{ if } \xi _2 \le \xi \end{array}\right. } \end{aligned}$$

satisfies (A1)–(A2) and (2.6) [4, Prop. 4.2].
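The compatibility condition \(\xi _1\mu _2 = \xi _2\mu _1\) is what makes \(\psi \) continuously differentiable at the two breakpoints. The following Python sketch evaluates the three branches of \(\psi \) and of \(\psi '\) separately and compares neighboring branches at \(\xi _1\) and \(\xi _2\); the parameter values and all function names are hypothetical choices for this illustration.

```python
def psi_branches(xi, xi1, xi2, mu1, mu2):
    """Values of the three branch formulas of psi at xi (in the order of the definition)."""
    return (
        mu2 * xi ** 2 / 2,
        xi1 * mu2 * (xi - xi1 / 2),
        mu1 * xi ** 2 / 2 - xi1 * mu2 * (xi1 / 2 - xi2 / 2),
    )


def dpsi_branches(xi, xi1, xi2, mu1, mu2):
    """Derivatives of the three branch formulas of psi at xi."""
    return (mu2 * xi, xi1 * mu2, mu1 * xi)


def psi(xi, xi1, xi2, mu1, mu2):
    """Piecewise definition of psi for xi >= 0."""
    if xi < 0:
        raise ValueError("psi is defined for xi >= 0 only")
    low, mid, high = psi_branches(xi, xi1, xi2, mu1, mu2)
    if xi <= xi1:
        return low
    if xi <= xi2:
        return mid
    return high


# Hypothetical parameters with xi1 * mu2 == xi2 * mu1.
XI1, XI2, MU1, MU2 = 1.0, 2.0, 1.0, 2.0
for breakpoint in (XI1, XI2):
    print(breakpoint,
          psi_branches(breakpoint, XI1, XI2, MU1, MU2),
          dpsi_branches(breakpoint, XI1, XI2, MU1, MU2))
```

At \(\xi _1\) the first two branches agree in value and derivative, and at \(\xi _2\) the last two do.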

2.4.3 Relaxed two-well problem

Given distinct \(F_1, F_2 \in {\mathbb {R}}^n\) in the two-well problem of [30], the convex envelope W of \(|F - F_1|^2|F - F_2|^2\) for \(F \in {\mathbb {R}}^n\) reads

$$\begin{aligned} W(F) = \max \{0,|F - B|^2 - |A|^2\}^2 + 4\big (|A|^2 |F-B|^2 - (A \cdot (F - B))^2 \big ) \end{aligned}$$

with \(A = (F_2 - F_1)/2\), \(B = (F_1 + F_2)/2\), and satisfies (A1)–(A2) and (2.6) [22, 27].
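Two elementary properties of the convex envelope can be checked directly from the formula: W vanishes at both wells (and at their midpoint B), and W never exceeds the non-convex density \(|F - F_1|^2|F - F_2|^2\). The following Python sketch tests both at sample points; the wells, the sampling box, and all function names are assumptions made for this illustration.

```python
import random


def dot(x, y):
    """Euclidean scalar product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))


def two_well(F, F1, F2):
    """Non-convex density |F - F1|^2 |F - F2|^2."""
    d1 = [f - g for f, g in zip(F, F1)]
    d2 = [f - g for f, g in zip(F, F2)]
    return dot(d1, d1) * dot(d2, d2)


def relaxed_two_well(F, F1, F2):
    """Convex envelope W(F) from the displayed formula."""
    A = [(b - a) / 2 for a, b in zip(F1, F2)]
    B = [(a + b) / 2 for a, b in zip(F1, F2)]
    G = [f - b for f, b in zip(F, B)]
    return (max(0.0, dot(G, G) - dot(A, A)) ** 2
            + 4 * (dot(A, A) * dot(G, G) - dot(A, G) ** 2))


def max_violation(F1, F2, samples=1000, seed=0):
    """Largest sampled value of W(F) - |F - F1|^2 |F - F2|^2 (should not be positive)."""
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(samples):
        F = [rng.uniform(-3.0, 3.0) for _ in F1]
        worst = max(worst, relaxed_two_well(F, F1, F2) - two_well(F, F1, F2))
    return worst


# Hypothetical wells in the plane.
print(max_violation([0.0, 0.0], [2.0, 0.0]))
```

Outside the ball \(|F - B| < |A|\) the two densities coincide, so the sampled difference there is zero up to rounding.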

2.5 Notation

Standard notation for Sobolev and Lebesgue spaces applies throughout this paper with the abbreviations \(V {:}{=}W^{1,p}(\Omega ;{\mathbb {R}}^m) = W^{1,p}(\Omega )^m\) and \(V_\mathrm {D} {:}{=}W^{1,p}_\mathrm {D}(\Omega ;{\mathbb {R}}^m) = \{v \in V: v|_{\Gamma _{\mathrm {D}}} = 0\}\). In particular, \((\bullet , \bullet )_{L^2(\Omega )}\) denotes the scalar product of \(L^2(\Omega )\) and \(W^{p'}(\mathrm {div},\Omega ;{\mathbb {M}}) {:}{=}W^{p'}(\mathrm {div},\Omega )^m\) is the matrix-valued version of

$$\begin{aligned} W^{p'}(\mathrm {div},\Omega ) {:}{=}\{\tau \in L^{p'}(\Omega ;{\mathbb {R}}^n): \mathrm {div}\tau \in L^{p'}(\Omega )\}. \end{aligned}$$
(2.7)

For any \(A,B \in {\mathbb {M}}{:}{=}{\mathbb {R}}^{m \times n}\), A : B denotes the Euclidean scalar product of A and B, which induces the Frobenius norm \(|A| {:}{=}(A:A)^{1/2}\) in \({\mathbb {M}}\). The context-dependent notation \(|\bullet |\) denotes the length of a vector, the Frobenius norm of a matrix, the Lebesgue measure of a subset of \({\mathbb {R}}^n\), or the counting measure of a discrete set. For \(1< p < \infty \), \(p' = p/(p-1)\) denotes the Hölder conjugate of p with \(1/p + 1/p' = 1\). The notation \(A \lesssim B\) abbreviates \(A \le CB\) for a generic constant C independent of the mesh-size and \(A \approx B\) abbreviates \(A \lesssim B \lesssim A\).

3 Hybrid high-order method without stabilization

This section recalls the discrete ansatz space and reconstruction operators from the HHO methodology [35, 37, 39] for convenient reading.

3.1 Triangulation

A regular triangulation \({\mathcal {T}}_\ell \) of \(\Omega \) in the sense of Ciarlet is a finite set of closed simplices T of positive volume \(|T| > 0\) with boundary \(\partial T\) and outer unit normal \(\nu _T\) such that \(\cup _{T \in {\mathcal {T}}_\ell } T = {\overline{\Omega }}\) and two distinct simplices are either disjoint or share one common (lower-dimensional) subsimplex (vertex or edge in 2D and vertex, edge, or face in 3D). Let \({\mathcal {F}}_\ell (T)\) denote the set of the \(n+1\) hyperfaces of T, called sides of T. Define the set of all sides \({\mathcal {F}}_\ell {:}{=}\cup _{T \in {\mathcal {T}}_\ell } {\mathcal {F}}_\ell (T)\), the set of interior sides \({\mathcal {F}}_\ell (\Omega ) {:}{=}{\mathcal {F}}_\ell \setminus \{F \in {\mathcal {F}}_\ell : F \subset \partial \Omega \}\), the set of Dirichlet sides \({\mathcal {F}}_{\ell }(\Gamma _{\mathrm {D}}) {:}{=}\{F \in {\mathcal {F}}_\ell : F \subset \Gamma _{\mathrm {D}}\}\), and the set of Neumann sides \({\mathcal {F}}_{\ell }(\Gamma _{\mathrm {N}}) {:}{=}\{F \in {\mathcal {F}}_\ell : F \subset \Gamma _{\mathrm {N}}\}\) of \({\mathcal {T}}_\ell \).

For any interior side \(F \in {\mathcal {F}}_\ell (\Omega )\), there exist exactly two simplices \(T_+, T_- \in {\mathcal {T}}_\ell \) such that \(\partial T_+ \cap \partial T_- = F\). The orientation of the unit normal \(\nu _F = \nu _{T_+}|_F = -\nu _{T_-}|_F\) along F is fixed beforehand. Define the side patch \(\omega _F {:}{=}\mathrm {int}(T_+ \cup T_-)\) of F. Let \([v]_F {:}{=}(v|_{T_+})|_F - (v|_{T_-})|_F \in L^1(F)\) denote the jump of \(v \in L^1(\omega _F)\) with \(v \in W^{1,1}(T_+)\) and \(v \in W^{1,1}(T_-)\) across F (with the abbreviations \(W^{1,1}(T_+) {:}{=}W^{1,1}(\mathrm {int}(T_+))\) and \(W^{1,1}(T_-) {:}{=}W^{1,1}(\mathrm {int}(T_-))\)). For any boundary side \(F \in {\mathcal {F}}_\ell (\partial \Omega ) {:}{=}{\mathcal {F}}_\ell \setminus {\mathcal {F}}_\ell (\Omega )\), there is a unique \(T \in {\mathcal {T}}_\ell \) with \(F \in {\mathcal {F}}_\ell (T)\). Then \(\omega _F = \mathrm {int}(T)\), \(\nu _F {:}{=}\nu _T\), and \([v]_F {:}{=}(v|_T)|_F\). The differential operators \(\mathrm {div}_{\mathrm {pw}}\) and \(\mathrm {D}_\mathrm {pw}\) depend on the triangulation \({\mathcal {T}}_\ell \) and denote the piecewise application of \(\mathrm {div}\) and \(\mathrm {D}\) without explicit reference to \({\mathcal {T}}_\ell \).

The shape regularity of a triangulation \({\mathcal {T}}\) is the minimum \(\min _{T \in {\mathcal {T}}} \varrho (T)\) of all ratios \(\varrho (T) {:}{=}r_i/r_c \le 1\) of the maximal radius \(r_i\) of an inscribed ball and the minimal radius \(r_c\) of a circumscribed ball for a simplex \(T \in {\mathcal {T}}\).

3.2 Discrete spaces

The discrete ansatz space of the HHO methods consists of piecewise polynomials on the triangulation \({\mathcal {T}}_\ell \) and on the skeleton \(\partial {\mathcal {T}}_\ell {:}{=}\cup {\mathcal {F}}_\ell \). For a simplex or a side \(M \subset {\mathbb {R}}^n\) of diameter \(h_M\), let \(P_k(M)\) denote the space of polynomials of degree at most \(k \in {\mathbb {N}}_0\) regarded as functions defined in M. The \(L^2\) projection \(\Pi _M^k v \in P_k(M)\) of \(v \in L^1(M)\) is uniquely defined by

$$\begin{aligned} \int _{M} \varphi _k (1-\Pi _M^k)v \,{\mathrm {d}}x = 0 \quad \text {for any } \varphi _k \in P_k(M). \end{aligned}$$
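The orthogonality condition above determines \(\Pi _M^k v\) through a small linear system. The following Python sketch computes the projection for the special case \(M = (0,1)\) and a polynomial v in exact rational arithmetic; the restriction to the unit interval, the monomial basis, and the function names are assumptions made for this illustration.

```python
from fractions import Fraction


def moment(coeffs, i):
    """Exact value of int_0^1 x^i * sum_j coeffs[j] x^j dx."""
    return sum(Fraction(c) * Fraction(1, i + j + 1) for j, c in enumerate(coeffs))


def l2_projection_unit_interval(coeffs, k):
    """Monomial coefficients of the L2 projection onto P_k(0,1)
    of the polynomial v(x) = sum_j coeffs[j] x^j."""
    n = k + 1
    # Gram matrix of the monomials: int_0^1 x^i x^j dx = 1 / (i + j + 1).
    gram = [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]
    rhs = [moment(coeffs, i) for i in range(n)]
    # Gaussian elimination without pivoting; the Gram matrix is symmetric
    # positive definite, so all pivots are nonzero.
    for col in range(n):
        for row in range(col + 1, n):
            factor = gram[row][col] / gram[col][col]
            for j in range(col, n):
                gram[row][j] -= factor * gram[col][j]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    sol = [Fraction(0)] * n
    for i in reversed(range(n)):
        tail = sum(gram[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (rhs[i] - tail) / gram[i][i]
    return sol


# Projection of v(x) = x^2 onto affine functions on (0,1).
print(l2_projection_unit_interval([0, 0, 1], 1))
```

For \(v(x) = x^2\) and \(k = 1\) the sketch returns the coefficients of \(x - 1/6\), and the residual \(x^2 - x + 1/6\) has vanishing moments against 1 and x.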

The gradient reconstruction in \(T \in {\mathcal {T}}_\ell \) maps into the space of Raviart–Thomas finite element functions

$$\begin{aligned} \mathrm {RT}_k(T)&{:}{=}P_k(T;{\mathbb {R}}^n) + x P_k(T) \subset P_{k+1}(T;{\mathbb {R}}^n). \end{aligned}$$

Let \(P_k({\mathcal {T}}_\ell )\), \(P_k({\mathcal {F}}_\ell )\), and \(\mathrm {RT}^\mathrm {pw}_k({\mathcal {T}}_\ell )\) denote the space of piecewise functions with respect to the mesh \({\mathcal {T}}_\ell \) or \({\mathcal {F}}_\ell \) and with restrictions to T or F in \(P_k(T)\), \(P_k(F)\), and \(\mathrm {RT}_k(T)\). The \(L^2\) projections \(\Pi _{{\mathcal {T}}_\ell }^k\) and \(\Pi _{{\mathcal {F}}_\ell }^k\) onto the discrete spaces \(P_k({\mathcal {T}}_\ell )\) and \(P_k({\mathcal {F}}_\ell )\) are the global versions of \(\Pi _T^k\) and \(\Pi _F^k\), e.g., \((\Pi _{{\mathcal {T}}_\ell }^k v)|_T {:}{=}\Pi _T^k (v|_T)\) for \(v \in L^1(\Omega )\). For vector-valued functions \(v \in L^1(\Omega ;{\mathbb {R}}^m)\), the \(L^2\) projection \(\Pi _{{\mathcal {T}}_\ell }^k\) onto \(P_k({\mathcal {T}}_\ell ;{\mathbb {R}}^m) {:}{=}P_k({\mathcal {T}}_\ell )^m\) applies componentwise. This convention extends to the \(L^2\) projections onto \(P_k(M;{\mathbb {R}}^m) {:}{=}P_k(M)^m\) and \(P_k({\mathcal {F}}_\ell ;{\mathbb {R}}^m) {:}{=}P_k({\mathcal {F}}_\ell )^m\). The space of lowest-order Crouzeix–Raviart finite element functions reads

$$\begin{aligned} \begin{aligned} \mathrm {CR}^1({\mathcal {T}}_\ell )&{:}{=}\{v_\mathrm {CR}\in P_1({\mathcal {T}}_\ell ) : v_\mathrm {CR}\text { is continuous}\\&\qquad \text {at midpoints of } F \text { for all } F \in {\mathcal {F}}_\ell (\Omega )\}. \end{aligned} \end{aligned}$$
(3.1)

Define the mesh-size function \(h_\ell \in P_0({\mathcal {T}}_\ell )\) with \(h_\ell |_T \equiv |T|^{1/n}\) for all \(T \in {\mathcal {T}}_\ell \), the (volume data) oscillation \(\mathrm {osc}(f,{\mathcal {T}}_\ell )^{p'} {:}{=}\sum _{T \in {\mathcal {T}}_\ell } h_T^{p'}\Vert (1 - \Pi _{T}^k) f\Vert _{L^{p'}(T)}^{p'}\), and the (Neumann data) oscillation \(\mathrm {osc}_\mathrm {N}(g,{\mathcal {F}}_{\ell }(\Gamma _{\mathrm {N}}))^{p'} {:}{=}\sum _{F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})} h_F\Vert (1 - \Pi _F^k) g\Vert _{L^{p'}(F)}^{p'}\) with the diameter \(h_F = \mathrm {diam}(F)\) of \(F \in {\mathcal {F}}_\ell \). (Notice that the shape regularity of \({\mathcal {T}}_\ell \) implies the equivalence \(h_F \approx h_T \approx |T|^{1/n}\) for all \(T \in {\mathcal {T}}_\ell , F \in {\mathcal {F}}_\ell (T)\).)

3.3 HHO ansatz space

For fixed \(k \in {\mathbb {N}}_0\), let \(V({\mathcal {T}}_\ell ) {:}{=}P_k({\mathcal {T}}_\ell ;{\mathbb {R}}^m) \times P_k({\mathcal {F}}_\ell ;{\mathbb {R}}^m)\) denote the discrete ansatz space for V in HHO methods [35, 37]. The notation \(v_\ell \in V({\mathcal {T}}_\ell )\) means that \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) = ((v_T)_{T \in {\mathcal {T}}_\ell },(v_F)_{F \in {\mathcal {F}}_\ell })\) for some \(v_{{\mathcal {T}}_\ell } \in P_k({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\) and \(v_{{\mathcal {F}}_\ell } \in P_k({\mathcal {F}}_\ell ;{\mathbb {R}}^m)\) with the identification \(v_T {:}{=}v_{{\mathcal {T}}_\ell }|_T \in P_k(T;{\mathbb {R}}^m)\) and \(v_F {:}{=}v_{{\mathcal {F}}_\ell }|_F \in P_k(F;{\mathbb {R}}^m)\) for all \(T \in {\mathcal {T}}_\ell \), \(F \in {\mathcal {F}}_\ell \). The discrete space \(V({\mathcal {T}}_\ell )\) is endowed with the seminorm

$$\begin{aligned} \Vert v_\ell \Vert _\ell ^p {:}{=}\Vert \mathrm {D}_\mathrm {pw}v_{{\mathcal {T}}_\ell }\Vert _{L^p(\Omega )}^p + \sum _{T \in {\mathcal {T}}_\ell } \sum _{F \in {\mathcal {F}}_\ell (T)} h_F^{1-p}\Vert v_T - v_F\Vert _{L^p(F)}^p \end{aligned}$$
(3.2)

for any \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in V({\mathcal {T}}_\ell )\). The set \({\mathcal {F}}_\ell \setminus {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\) of non-Dirichlet sides gives rise to the space \(P_k({\mathcal {F}}_\ell \setminus {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}});{\mathbb {R}}^m)\) of piecewise polynomials \(v_{{\mathcal {F}}_\ell } \in P_k({\mathcal {F}}_\ell ;{\mathbb {R}}^m)\) with the convention \(v_{{\mathcal {F}}_\ell }|_F \equiv 0\) on \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\) to model homogeneous Dirichlet boundary conditions along the side \(F \subset \Gamma _{\mathrm {D}}\). The discrete linear space \(V_\mathrm {D}({\mathcal {T}}_\ell ) {:}{=}P_k({\mathcal {T}}_\ell ;{\mathbb {R}}^m) \times P_k({\mathcal {F}}_\ell \setminus {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}});{\mathbb {R}}^m) \subset V({\mathcal {T}}_\ell )\), equipped with the norm \(\Vert \bullet \Vert _\ell \) from (3.2), is the discrete analogue to \(V_\mathrm {D} = W^{1,p}_\mathrm {D}(\Omega ;{\mathbb {R}}^m)\). The interpolation

$$\begin{aligned} \mathrm {I}_\ell : V \rightarrow V({\mathcal {T}}_\ell ), v \mapsto (\Pi _{{\mathcal {T}}_\ell }^k v, \Pi _{{\mathcal {F}}_\ell }^k v) \end{aligned}$$
(3.3)

gives rise to the discrete space \({\mathcal {A}}({\mathcal {T}}_\ell ) {:}{=}\mathrm {I}_\ell u_\mathrm {D} + V_{\mathrm {D}}({\mathcal {T}}_\ell )\) of admissible functions.

3.4 Reconstruction operators

The reconstruction operators defined in this section link the two components of \(v_\ell \in V({\mathcal {T}}_\ell )\) and provide discrete approximations \({\mathcal {R}}_\ell v_\ell \) and \({\mathcal {G}}_\ell v_\ell \) of the displacement \(v \in V\) and its derivative \(\mathrm {D}v \in L^p(\Omega ;{\mathbb {M}})\).

Potential reconstruction Given \(T \in {\mathcal {T}}_\ell \) and \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in V({\mathcal {T}}_\ell )\) with the convention \(v_T = v_{{\mathcal {T}}_\ell }|_T\) and \(v_F = v_{{\mathcal {F}}_\ell }|_F\) for all \(F \in {\mathcal {F}}_\ell (T)\) from Sect. 3.3, the local potential reconstruction \({\mathcal {R}}_T v_\ell \in P_{k+1}(T;{\mathbb {R}}^m)\) satisfies

$$\begin{aligned} \begin{aligned}&\int _T \mathrm {D}{\mathcal {R}}_T v_\ell : \mathrm {D}\varphi _{k+1} \,{\mathrm {d}}x \\&\quad = -\int _T \Delta \varphi _{k+1} \cdot v_T \,{\mathrm {d}}x + \sum _{F \in {\mathcal {F}}_\ell (T)} \int _F v_F \cdot (\mathrm {D}\varphi _{k+1} \nu _T)|_F \,{\mathrm {d}}s \end{aligned} \end{aligned}$$
(3.4)

for all \(\varphi _{k+1} \in P_{k+1}(T;{\mathbb {R}}^m)\). The bilinear form \((\mathrm {D}\bullet , \mathrm {D}\bullet )_{L^2(T)}\) on the left-hand side of (3.4) defines a scalar product in the quotient space \(P_{k+1}(T;{\mathbb {R}}^m)/{\mathbb {R}}^m\) and the right-hand side of (3.4) is a linear functional in \(P_{k+1}(T;{\mathbb {R}}^m)/{\mathbb {R}}^m\). The Riesz representation \({\mathcal {R}}_T v_\ell \in P_{k+1}(T;{\mathbb {R}}^m)\) of this linear functional in \(P_{k+1}(T;{\mathbb {R}}^m)/{\mathbb {R}}^m\) equipped with the energy scalar product is selected by

$$\begin{aligned} \int _T {\mathcal {R}}_T v_\ell \,{\mathrm {d}}x = \int _{T} v_T \,{\mathrm {d}}x. \end{aligned}$$
(3.5)

The unique solution \({\mathcal {R}}_T v_\ell \in P_{k+1}(T;{\mathbb {R}}^m)\) to (3.4)–(3.5) gives rise to the potential reconstruction operator \({\mathcal {R}}_\ell : V({\mathcal {T}}_\ell ) \rightarrow P_{k+1}({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\) with restriction \(({\mathcal {R}}_\ell v_\ell )|_T {:}{=}{\mathcal {R}}_T v_\ell \) on each simplex \(T \in {\mathcal {T}}_\ell \) for any \(v_\ell \in V({\mathcal {T}}_\ell )\).

Gradient reconstruction The gradient is reconstructed in the space \(\Sigma ({\mathcal {T}}_\ell ) = \mathrm {RT}_k^\mathrm {pw}({\mathcal {T}}_\ell ;{\mathbb {M}})\) of piecewise Raviart–Thomas finite element functions [1, 28]. Given \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in V({\mathcal {T}}_\ell )\), its gradient reconstruction \({\mathcal {G}}_\ell v_\ell \in \Sigma ({\mathcal {T}}_\ell )\) solves

$$\begin{aligned} \int _{\Omega } {\mathcal {G}}_\ell v_\ell :\tau _\ell \,{\mathrm {d}}x&= -\int _{\Omega } v_{{\mathcal {T}}_\ell } \cdot \mathrm {div}_\mathrm {pw}\tau _\ell \,{\mathrm {d}}x + \sum _{F \in {\mathcal {F}}_\ell } \int _F v_F \cdot [\tau _\ell \nu _F]_F \,{\mathrm {d}}s \end{aligned}$$
(3.6)

for all \(\tau _\ell \in \Sigma ({\mathcal {T}}_\ell )\). In other words, \({\mathcal {G}}_\ell v_\ell \) is the Riesz representation of the linear functional on the right-hand side of (3.6) in the Hilbert space \(\Sigma ({\mathcal {T}}_\ell )\) endowed with the \(L^2\) scalar product. Since \(\mathrm {D}_\mathrm {pw}P_{k+1}({\mathcal {T}}_\ell ;{\mathbb {R}}^m) \subset \Sigma ({\mathcal {T}}_\ell )\), it follows that \(\mathrm {D}_\mathrm {pw}{\mathcal {R}}_\ell v_\ell \) is the \(L^2\) projection of \({\mathcal {G}}_\ell v_\ell \) onto \(\mathrm {D}_\mathrm {pw}P_{k+1}({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\).
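
The projection property can be verified directly. The following sketch assumes the usual convention that \([\tau _\ell \nu _F]_F\) reduces to \((\tau _\ell |_T) \nu _T\) on the sides of T whenever \(\tau _\ell \) vanishes outside of T. Given \(T \in {\mathcal {T}}_\ell \) and \(\varphi _{k+1} \in P_{k+1}(T;{\mathbb {R}}^m)\), the choice \(\tau _\ell = \mathrm {D}\varphi _{k+1}\) in T and \(\tau _\ell = 0\) elsewhere in (3.6), the identity \(\mathrm {div}\,\mathrm {D}\varphi _{k+1} = \Delta \varphi _{k+1}\), and (3.4) show

$$\begin{aligned} \int _T {\mathcal {G}}_\ell v_\ell : \mathrm {D}\varphi _{k+1} \,{\mathrm {d}}x&= -\int _T v_T \cdot \Delta \varphi _{k+1} \,{\mathrm {d}}x + \sum _{F \in {\mathcal {F}}_\ell (T)} \int _F v_F \cdot (\mathrm {D}\varphi _{k+1} \nu _T)|_F \,{\mathrm {d}}s\\&= \int _T \mathrm {D}{\mathcal {R}}_T v_\ell : \mathrm {D}\varphi _{k+1} \,{\mathrm {d}}x. \end{aligned}$$

Hence \({\mathcal {G}}_\ell v_\ell - \mathrm {D}_\mathrm {pw}{\mathcal {R}}_\ell v_\ell \) is \(L^2\) orthogonal to \(\mathrm {D}_\mathrm {pw}P_{k+1}({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\).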

Lemma 3.1

(properties of \({\mathcal {G}}\)) Any \(v \in V\) and \(v_\ell \in V({\mathcal {T}}_\ell )\) satisfy (a) \(\Vert v_\ell \Vert _\ell \approx \Vert {{\mathcal {G}}_\ell v_\ell }\Vert _{L^p(\Omega )}\) and (b) \(\Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}v = {\mathcal {G}}_\ell \mathrm {I}_\ell v\). There exist positive constants \(C_{\mathrm {dF}}\) and \(C_{\mathrm {dtr}}\) that only depend on \(\Omega \), the shape regularity of \({\mathcal {T}}_\ell \), k, and p such that (c) \(\Vert v_{{\mathcal {T}}_\ell }\Vert _{L^p(\Omega )} \le C_{\mathrm {dF}}\Vert {{\mathcal {G}}_\ell v_\ell }\Vert _{L^p(\Omega )}\) and (d) \(\Vert v_{{\mathcal {F}}_\ell }\Vert _{L^p(\Gamma _\mathrm {N})} \le C_{\mathrm {dtr}}\Vert {{\mathcal {G}}_\ell v_\ell }\Vert _{L^p(\Omega )}\) hold for all \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in V_\mathrm {D}({\mathcal {T}}_\ell )\).

Proof

The proofs of (a)–(b) are outlined in [1, 28]. The discrete Sobolev embedding \(\Vert v_{{\mathcal {T}}_\ell }\Vert _{L^p(\Omega )} \lesssim \Vert v_\ell \Vert _\ell \) follows as in [11, 34, 36]. Theorem 4.4 in [11] and (c) lead to \(\Vert v_{{\mathcal {T}}_\ell }\Vert _{L^p(\Gamma _{\mathrm {N}})} \lesssim \Vert v_\ell \Vert _\ell \). This and the triangle inequality \(\Vert v_{{\mathcal {F}}_\ell }\Vert _{L^p(\Gamma _{\mathrm {N}})} \le \Vert v_{{\mathcal {T}}_\ell }\Vert _{L^p(\Gamma _{\mathrm {N}})} + \Vert v_{{\mathcal {T}}_\ell } - v_{{\mathcal {F}}_\ell }\Vert _{L^p(\Gamma _{\mathrm {N}})}\) imply \(\Vert v_{{\mathcal {F}}_\ell }\Vert _{L^p(\Gamma _{\mathrm {N}})} \lesssim \Vert v_\ell \Vert _\ell + \Vert v_{{\mathcal {T}}_\ell } - v_{{\mathcal {F}}_\ell }\Vert _{L^p(\Gamma _{\mathrm {N}})}\). The latter term is controlled by \(\mathrm {diam}(\Omega )^{1/p'}\Vert v_\ell \Vert _\ell \). This concludes the proof of (d). \(\square \)

3.5 Discrete problem

Lemma 3.1 implies the coercivity of \(E_\ell \) in \({\mathcal {A}}({\mathcal {T}}_\ell )\) with respect to the discrete seminorm \(\Vert {\mathcal {G}}_\ell \bullet \Vert _{L^p(\Omega )}\) and the existence and the boundedness of discrete minimizers \(u_\ell \) below.

Theorem 3.2

(Discrete minimizers) The minimal discrete energy \(\inf E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell ))\) is attained. There exists a positive constant \(C_{1} > 0\) that depends only on \(c_{1}, c_{2}, \Omega \), \(\Gamma _{\mathrm {D}}\), \(u_\mathrm {D}\), f, g, the shape regularity of \({\mathcal {T}}_\ell \), k, and p with \(\Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )} \le C_{1}\) for all discrete minimizers \(u_\ell \in \arg \min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell ))\). Any discrete stress \(\sigma _\ell {:}{=}\Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) \in L^{p'}(\Omega ;{\mathbb {M}})\) satisfies the discrete Euler–Lagrange equations

$$\begin{aligned} \int _\Omega \sigma _\ell : {\mathcal {G}}_\ell v_\ell \,{\mathrm {d}}x = \int _\Omega f \cdot v_{{\mathcal {T}}_\ell } \,{\mathrm {d}}x + \int _{\Gamma _{\mathrm {N}}} g \cdot v_{{\mathcal {F}}_\ell } \,{\mathrm {d}}s \end{aligned}$$
(3.7)

for all \(v_\ell = (v_{{\mathcal {T}}_\ell }, v_{{\mathcal {F}}_\ell }) \in V_\mathrm {D}({\mathcal {T}}_\ell )\). If W satisfies (2.5), then \(u_\ell = \arg \min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell ))\) is unique. If W satisfies (2.6), then \(\mathrm {D}W({\mathcal {G}}_\ell u_\ell ) \in L^{p'}(\Omega ;{\mathbb {M}})\) is unique (independent of the choice of a (possibly non-unique) discrete minimizer \(u_\ell \)).

Proof

The boundedness \(\inf E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell )) > - \infty \) of \(E_\ell \) in \({\mathcal {A}}({\mathcal {T}}_\ell )\) follows from the lower p-growth of W, the discrete Friedrichs inequality, and the discrete trace inequality from Lemma 3.1, cf., e.g., [27, 28, 32]. The direct method in the calculus of variations [32] implies the existence of discrete minimizers \(u_\ell \in \arg \min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell ))\). The bound \(\Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )} \le C_{1}\) is a consequence of the coercivity of \(E_\ell \) in \({\mathcal {A}}({\mathcal {T}}_\ell )\) with respect to \(\Vert \bullet \Vert _\ell \) as in [28]. If W satisfies (2.5), then W is strictly convex and the discrete minimizer \(u_\ell \in \arg \min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell ))\) is unique. If W satisfies (2.6), then the uniqueness of \(\mathrm {D}W({\mathcal {G}}_\ell u_\ell )\) follows as in [16, 27, 28]. \(\square \)

Remark 3.3

(\(H(\mathrm {div})\) conformity) The discrete Euler–Lagrange equations (3.7) imply that the normal jumps \([\sigma _\ell \nu _F]_F\) of \(\sigma _\ell = \Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\) vanish along all interior sides \(F \in {\mathcal {F}}_\ell (\Omega )\), i.e., the normal components of \(\sigma _\ell \) are continuous [28, Theorem 3.2]. In other words, \(\sigma _\ell \in \Sigma ({\mathcal {T}}_\ell ) \cap W^{p'}(\mathrm {div},\Omega ;{\mathbb {M}})\) with \(\mathrm {div}\sigma _\ell = -\Pi _{{\mathcal {T}}_\ell }^k f\) and \(\sigma _\ell \nu _F = \Pi _F^k g\) for all \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})\).
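
A brief sketch of the argument behind Remark 3.3 reads as follows. It assumes the standard properties \(\mathrm {div}_\mathrm {pw}\tau _\ell \in P_k({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\) and \((\tau _\ell \nu _F)|_F \in P_k(F;{\mathbb {R}}^m)\) of piecewise Raviart–Thomas functions \(\tau _\ell \in \Sigma ({\mathcal {T}}_\ell )\), which are not restated in this section, and interprets \([\tau _\ell \nu _F]_F\) as the normal trace on boundary sides. The definition (3.6) with \(\tau _\ell = \sigma _\ell \) and (3.7) provide, for all \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in V_\mathrm {D}({\mathcal {T}}_\ell )\),

$$\begin{aligned} -\int _\Omega v_{{\mathcal {T}}_\ell } \cdot \mathrm {div}_\mathrm {pw}\sigma _\ell \,{\mathrm {d}}x + \sum _{F \in {\mathcal {F}}_\ell } \int _F v_F \cdot [\sigma _\ell \nu _F]_F \,{\mathrm {d}}s = \int _\Omega f \cdot v_{{\mathcal {T}}_\ell } \,{\mathrm {d}}x + \int _{\Gamma _{\mathrm {N}}} g \cdot v_{{\mathcal {F}}_\ell } \,{\mathrm {d}}s. \end{aligned}$$

The choice \(v_{{\mathcal {F}}_\ell } = 0\) with arbitrary \(v_{{\mathcal {T}}_\ell }\) leads to \(\mathrm {div}_\mathrm {pw}\sigma _\ell = -\Pi _{{\mathcal {T}}_\ell }^k f\). The choice \(v_{{\mathcal {T}}_\ell } = 0\) with \(v_{{\mathcal {F}}_\ell }\) supported on a single side F shows \([\sigma _\ell \nu _F]_F = 0\) for interior sides and \(\sigma _\ell \nu _F = \Pi _F^k g\) for Neumann sides.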

3.6 Conforming companion

The companion operator \({\mathcal {J}}_\ell : V({\mathcal {T}}_\ell ) \rightarrow V\) is a right-inverse of the interpolation \(\mathrm {I}_\ell : V \rightarrow V({\mathcal {T}}_\ell )\) in the spirit of [18, 23, 26, 42]. In particular, \({\mathcal {J}}_\ell \) preserves the moments

$$\begin{aligned} \Pi _{{\mathcal {T}}_\ell }^k {\mathcal {J}}_\ell v_\ell = v_{{\mathcal {T}}_\ell } \quad \text {and}\quad \Pi _{{\mathcal {F}}_\ell }^k {\mathcal {J}}_\ell v_\ell = v_{{\mathcal {F}}_\ell } \quad \text {for any } v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in V({\mathcal {T}}_\ell ). \end{aligned}$$
(3.8)

An explicit construction of \({\mathcal {J}}_\ell v_\ell \) with the following properties is presented in [42, Section 4.3] for simplicial triangulations.

Lemma 3.4

(right-inverse) There exists a linear operator \({\mathcal {J}}_\ell : V({\mathcal {T}}_\ell ) \rightarrow V\) with (3.8) such that any \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in V({\mathcal {T}}_\ell )\) satisfies, for all \(T \in {\mathcal {T}}_\ell \),

$$\begin{aligned}&\Vert {\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(T)}^p \lesssim \sum _{E \in {\mathcal {F}}_\ell (\Omega ), E \cap T \ne \emptyset } h_E^{1-p}\Vert [{\mathcal {R}}_\ell v_\ell ]_E\Vert _{L^p(E)}^p \nonumber \\&\quad + \sum _{F \in {\mathcal {F}}_\ell (T)} h_F^{1-p}\Vert \Pi _F^k(({\mathcal {R}}_\ell v_\ell )_T - v_F)\Vert _{L^p(F)}^p + h_T^{-p}\Vert \Pi _{T}^k({\mathcal {R}}_\ell v_\ell - v_{T})\Vert _{L^p(T)}^p. \end{aligned}$$
(3.9)

In particular, \({\mathcal {J}}_\ell \) is stable in the sense that \(\Vert \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} \le \Lambda _0 \Vert v_\ell \Vert _\ell \) holds with the constant \(\Lambda _0\) that exclusively depends on k, p, and the shape regularity of \({\mathcal {T}}_\ell \).

Proof

For \(p = 2\), the right-hand side of (3.9) is an upper bound for \(\Vert \mathrm {D}({\mathcal {R}}_\ell v_\ell - {\mathcal {J}}_\ell v_\ell )\Vert _{L^2(T)}^2\), cf. [42, Proof of Proposition 4.7], and scaling arguments confirm this for \(1< p < \infty \). The \(L^p\) stability of the \(L^2\) projection [34, Lemma 3.2] and the orthogonality \({\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \perp \mathrm {RT}_k(T;{\mathbb {M}})\) in \(L^2(T;{\mathbb {M}})\) imply \(\Vert {\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(T)} \lesssim \Vert \mathrm {D}({\mathcal {R}}_\ell v_\ell - {\mathcal {J}}_\ell v_\ell )\Vert _{L^p(T)}\). This proves (3.9). The right-hand side of (3.9) can be bounded by

$$\begin{aligned}&\sum _{E \in {\mathcal {F}}_\ell (\Omega ), E \cap T \ne \emptyset } h_E^{1-p}\Vert [{\mathcal {R}}_\ell v_\ell ]_E\Vert _{L^p(E)}^p + \sum _{F \in {\mathcal {F}}_\ell (T)} h_F^{1-p}\Vert \Pi _F^k(({\mathcal {R}}_\ell v_\ell )_T - v_F)\Vert _{L^p(F)}^p \nonumber \\&\quad + h_T^{-p}\Vert \Pi _{T}^k({\mathcal {R}}_\ell v_\ell - v_{T})\Vert _{L^p(T)}^p \lesssim \sum _{K \in {\mathcal {T}}_\ell , K \cap T \ne \emptyset } \sum _{E \in {\mathcal {F}}_\ell (K)} h_E^{1-p}\Vert v_K - v_E\Vert _{L^p(E)}^p \end{aligned}$$
(3.10)

with a hidden constant that only depends on the shape regularity of \({\mathcal {T}}_\ell \), k, and p [42, Proof of Proposition 4.7]. The sum of this over all simplices \(T \in {\mathcal {T}}_\ell \), a triangle inequality, and the shape regularity of \({\mathcal {T}}_\ell \) imply \(\Vert {\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} \lesssim \Vert v_\ell \Vert _\ell \). This, a reverse triangle inequality, and the norm equivalence from Lemma 3.1a conclude the stability \(\Vert \mathrm {D}{\mathcal {J}}_{\ell } v_\ell \Vert _{L^p(\Omega )} \lesssim \Vert v_\ell \Vert _\ell \). \(\square \)
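
Spelled out, the final step of the proof is the one-line estimate

$$\begin{aligned} \Vert \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} \le \Vert {\mathcal {G}}_\ell v_\ell \Vert _{L^p(\Omega )} + \Vert {\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} \lesssim \Vert v_\ell \Vert _\ell , \end{aligned}$$

where the first term is controlled by Lemma 3.1a and the second term by the sum of (3.9)–(3.10) over all simplices. The constant \(\Lambda _0\) collects the hidden constants of both bounds.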

4 Proof of Theorem 2.1

This section is devoted to the proof of the convergence results in Theorem 2.1.

4.1 Discrete compactness

The proof of Theorem 2.1 departs from a discrete compactness in the spirit of [11, 34, 36] and generalizes [34]. Recall the mesh-size function \(h_\ell \in P_0({\mathcal {T}}_\ell )\) from Sect. 3.2 with \(h_\ell |_T = |T|^{1/n}\) for \(T \in {\mathcal {T}}_\ell \) and the seminorm \(\Vert \bullet \Vert _\ell \) in \(V({\mathcal {T}}_\ell )\) from (3.2).

Theorem 4.1

(discrete compactness) Let \(({\mathcal {T}}_\ell )_{\ell \in {\mathbb {N}}_0}\) be a uniformly shape-regular sequence of triangulations and let \((v_\ell )_{\ell \in {\mathbb {N}}_0}\) satisfy \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in {\mathcal {A}}({\mathcal {T}}_\ell )\) for all \(\ell \in {\mathbb {N}}_0\). Suppose that the sequence \((\Vert v_\ell \Vert _\ell )_{\ell \in {\mathbb {N}}_0}\) is bounded and that \(\lim _{\ell \rightarrow \infty } \mu _\ell (v_\ell ) = 0\) with

$$\begin{aligned} \mu _\ell (v_\ell ) {:}{=}\Vert h_\ell ^{k+1}({\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell )\Vert _{L^p(\Omega )}^p + \sum _{F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})} h_F^{kp+1}\Vert {\mathcal {J}}_\ell v_\ell - u_\mathrm {D}\Vert _{L^p(F)}^p. \end{aligned}$$
(4.1)

Then there exist a subsequence \((v_{\ell _j})_{j \in {\mathbb {N}}_0}\) of \((v_{\ell })_{\ell \in {\mathbb {N}}_0}\) and a weak limit \(v \in {\mathcal {A}}\) such that \({\mathcal {J}}_{\ell _j} v_{\ell _j} \rightharpoonup v\) weakly in V and \({\mathcal {G}}_{\ell _j} v_{\ell _j} \rightharpoonup \mathrm {D}v\) weakly in \(L^p(\Omega ;{\mathbb {M}})\) as \(j \rightarrow \infty \).

Proof

The first part of the proof proves the uniform boundedness

$$\begin{aligned} \Vert {\mathcal {J}}_\ell v_\ell \Vert _{W^{1,p}(\Omega )} \lesssim \Vert v_\ell \Vert _\ell + \Vert u_\mathrm {D}\Vert _{W^{1,p}(\Omega )} \lesssim 1 \end{aligned}$$
(4.2)

of the sequence \({\mathcal {J}}_\ell v_\ell \) in \(W^{1,p}(\Omega ;{\mathbb {R}}^m)\). Since \(\Vert \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )}\lesssim \Vert v_\ell \Vert _\ell \) from the stability of \({\mathcal {J}}_\ell \) in Lemma 3.4, it remains to show \(\Vert {\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} \lesssim \Vert v_\ell \Vert _\ell + \Vert u_\mathrm {D}\Vert _{W^{1,p}(\Omega )}\) to obtain (4.2). The triangle inequality implies

$$\begin{aligned} \Vert {\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} \le \Vert {\mathcal {J}}_\ell v_\ell - v_{{\mathcal {T}}_\ell }\Vert _{L^p(\Omega )} + \Vert v_{{\mathcal {T}}_\ell } - \Pi _{{\mathcal {T}}_\ell }^k u_\mathrm {D}\Vert _{L^p(\Omega )} + \Vert \Pi _{{\mathcal {T}}_\ell }^k u_\mathrm {D}\Vert _{L^p(\Omega )}. \end{aligned}$$
(4.3)

The right-inverse \({\mathcal {J}}_\ell \) of the interpolation \(\mathrm {I}_\ell \) from Lemma 3.4 satisfies the \(L^2\) orthogonality \({\mathcal {J}}_\ell v_\ell - v_{{\mathcal {T}}_\ell } \perp P_k({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\). This, a piecewise application of the Poincaré inequality, a triangle inequality, and \(\Vert h_\ell \Vert _{L^\infty (\Omega )} \le \mathrm {diam}(\Omega )\) lead to

$$\begin{aligned} \Vert {\mathcal {J}}_\ell v_\ell - v_{{\mathcal {T}}_\ell }\Vert _{L^p(\Omega )}&\lesssim \Vert h_\ell \mathrm {D}_\mathrm {pw}({\mathcal {J}}_\ell v_\ell - v_{{\mathcal {T}}_\ell })\Vert _{L^p(\Omega )}\nonumber \\&\lesssim \Vert \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} + \Vert \mathrm {D}_\mathrm {pw}v_{{\mathcal {T}}_\ell }\Vert _{L^p(\Omega )}. \end{aligned}$$
(4.4)

Since \(v_\ell - \mathrm {I}_\ell u_\mathrm {D} \in V_\mathrm {D}({\mathcal {T}}_\ell )\), the Sobolev embedding from Lemma 3.1c and a triangle inequality show \(\Vert v_{{\mathcal {T}}_\ell } - \Pi _{{\mathcal {T}}_\ell }^k u_\mathrm {D}\Vert _{L^p(\Omega )} \lesssim \Vert {\mathcal {G}}_\ell (v_\ell - \mathrm {I}_\ell u_\mathrm {D})\Vert _{L^p(\Omega )} \le \Vert {\mathcal {G}}_\ell v_\ell \Vert _{L^p(\Omega )} + \Vert {\mathcal {G}}_\ell \mathrm {I}_\ell u_\mathrm {D}\Vert _{L^p(\Omega )}\). This, the equivalence \(\Vert {\mathcal {G}}_\ell v_\ell \Vert _{L^p(\Omega )} \approx \Vert v_\ell \Vert _\ell \) from Lemma 3.1a, the commutativity \({\mathcal {G}}_\ell \mathrm {I}_\ell u_\mathrm {D} = \Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}u_\mathrm {D}\) from Lemma 3.1b, and the \(L^p\) stability of the \(L^2\) projection \(\Pi _{\Sigma ({\mathcal {T}}_\ell )}\) [34, Lemma 3.2] provide

$$\begin{aligned} \Vert v_{{\mathcal {T}}_\ell } - \Pi _{{\mathcal {T}}_\ell }^k u_\mathrm {D}\Vert _{L^p(\Omega )} \lesssim \Vert v_\ell \Vert _\ell + \Vert \mathrm {D}u_\mathrm {D}\Vert _{L^p(\Omega )}. \end{aligned}$$
(4.5)

Lemma 3.4 and the definition of the discrete norm \(\Vert v_\ell \Vert _\ell \) in (3.2) prove that the right-hand side of (4.4) is controlled by \(\Vert v_\ell \Vert _\ell \). Hence, the combination of (4.3)–(4.5) concludes (4.2).

The Banach–Alaoglu theorem [10, Theorem 3.18] ensures the existence of a (not relabelled) subsequence of \(({\mathcal {J}}_\ell v_\ell )_{\ell \in {\mathbb {N}}_0}\) and a weak limit \(v \in V\) such that \({\mathcal {J}}_\ell v_\ell \rightharpoonup v\) weakly in V as \(\ell \rightarrow \infty \). Lemma 3.1a assures that the sequence \(({\mathcal {G}}_\ell v_\ell )_{\ell \in {\mathbb {N}}_0}\) is uniformly bounded in \(L^p(\Omega ;{\mathbb {M}})\). Hence there exist a (not relabelled) subsequence of \((v_\ell )_{\ell \in {\mathbb {N}}_0}\) and its weak limit \(G \in L^p(\Omega ;{\mathbb {M}})\) such that \({\mathcal {G}}_\ell v_\ell \rightharpoonup G\) weakly in \(L^p(\Omega ;{\mathbb {M}})\) as \(\ell \rightarrow \infty \). The second part of the proof verifies \(\mathrm {D}v = G\) in \(\Omega \) and \(v = u_\mathrm {D}\) on \(\Gamma _{\mathrm {D}}\) (and so \(v \in {\mathcal {A}}\)). Since \(\mathrm {I}_\ell {\mathcal {J}}_\ell v_\ell = v_\ell \), the commutativity \({\mathcal {G}}_\ell v_\ell = \Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}{\mathcal {J}}_\ell v_\ell \) from Lemma 3.1b proves the \(L^2\) orthogonality \({\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \perp \Sigma ({\mathcal {T}}_\ell )\). This and an integration by parts verify, for all \(\Phi \in C^{\infty }({\overline{\Omega }};{\mathbb {M}})\) with \(\Phi \equiv 0\) on \(\Gamma _{\mathrm {N}}\), that

$$\begin{aligned} \int _\Omega {\mathcal {G}}_\ell v_\ell : \Phi \,{\mathrm {d}}x&= \int _\Omega ({\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell ) : \Phi \,{\mathrm {d}}x + \int _\Omega \mathrm {D}{\mathcal {J}}_\ell v_\ell : \Phi \,{\mathrm {d}}x\nonumber \\&= \int _\Omega ({\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell ) : (1 - \Pi _{\Sigma ({\mathcal {T}}_\ell )}) \Phi \,{\mathrm {d}}x - \int _\Omega {\mathcal {J}}_\ell v_\ell \cdot \mathrm {div}\Phi \,{\mathrm {d}}x\nonumber \\&\quad + \int _{\Gamma _{\mathrm {D}}} ({\mathcal {J}}_\ell v_\ell - u_\mathrm {D}) \cdot \Phi \nu \,{\mathrm {d}}s + \int _{\Gamma _{\mathrm {D}}} u_\mathrm {D} \cdot \Phi \nu \,{\mathrm {d}}s. \end{aligned}$$
(4.6)

The approximation property of piecewise polynomials, also known under the name Bramble–Hilbert lemma [9, Lemma 4.3.8], leads to

$$\begin{aligned} \Vert h_\ell ^{-(k+1)}(1 - \Pi _{\Sigma ({\mathcal {T}}_\ell )}) \Phi \Vert _{L^{p'}(\Omega )} \lesssim |\Phi |_{W^{k+1,p'}(\Omega )}. \end{aligned}$$
(4.7)

This and a Hölder inequality imply

$$\begin{aligned} \begin{aligned}&\int _\Omega ({\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell ) : (1 - \Pi _{\Sigma ({\mathcal {T}}_\ell )}) \Phi \,{\mathrm {d}}x\\&\quad \lesssim \Vert h_\ell ^{k+1}({\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell )\Vert _{L^p(\Omega )} |\Phi |_{W^{k+1,p'}(\Omega )}. \end{aligned} \end{aligned}$$
(4.8)

The \(L^2\) orthogonality \(({\mathcal {J}}_\ell v_\ell - u_\mathrm {D})|_F \perp P_k(F;{\mathbb {R}}^m)\) for each side \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\) on the Dirichlet boundary, a piecewise application of the trace inequality, and (4.7) imply

$$\begin{aligned} \int _{\Gamma _{\mathrm {D}}} ({\mathcal {J}}_\ell v_\ell - u_\mathrm {D}) \cdot \Phi \nu \,{\mathrm {d}}s \lesssim \Big (\sum _{F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})} h_F^{kp + 1} \Vert {\mathcal {J}}_\ell v_\ell - u_\mathrm {D}\Vert _{L^p(F)}^p\Big )^{1/p}|\Phi |_{W^{k+1,p'}(\Omega )}. \end{aligned}$$
(4.9)

The right-hand sides of (4.8)–(4.9) vanish in the limit as \(\ell \rightarrow \infty \) by assumption (4.1). This, (4.6), \({\mathcal {G}}_\ell v_\ell \rightharpoonup G\) in \(L^p(\Omega ;{\mathbb {M}})\), and \({\mathcal {J}}_\ell v_\ell \rightharpoonup v\) in V prove

$$\begin{aligned} \int _\Omega (G : \Phi + v \cdot \mathrm {div}\Phi ) \,{\mathrm {d}}x - \int _{\Gamma _{\mathrm {D}}} u_\mathrm {D} \cdot \Phi \nu \,{\mathrm {d}}s = 0 \end{aligned}$$

for all \(\Phi \in C^\infty ({\overline{\Omega }};{\mathbb {M}})\) with \(\Phi \equiv 0\) on \(\Gamma _{\mathrm {N}}\). This implies \(\mathrm {D}v = G\) a.e. in \(\Omega \) with \(v = u_\mathrm {D}\) on \(\Gamma _{\mathrm {D}}\) and concludes the proof. \(\square \)
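
The last implication can be unfolded in two steps; this is a sketch of the standard argument with technical details (e.g., density of the admissible test functions) taken for granted. Test functions \(\Phi \) with compact support in \(\Omega \) remove the boundary integral, so G is the weak gradient of v. With \(\mathrm {D}v = G\), an integration by parts and \(\Phi \equiv 0\) on \(\Gamma _{\mathrm {N}}\) turn the displayed identity into

$$\begin{aligned} \int _{\Gamma _{\mathrm {D}}} (v - u_\mathrm {D}) \cdot \Phi \nu \,{\mathrm {d}}s = \int _\Omega (\mathrm {D}v : \Phi + v \cdot \mathrm {div}\Phi ) \,{\mathrm {d}}x - \int _{\Gamma _{\mathrm {D}}} u_\mathrm {D} \cdot \Phi \nu \,{\mathrm {d}}s = 0 \end{aligned}$$

for all \(\Phi \in C^\infty ({\overline{\Omega }};{\mathbb {M}})\) with \(\Phi \equiv 0\) on \(\Gamma _{\mathrm {N}}\), whence \(v = u_\mathrm {D}\) on \(\Gamma _{\mathrm {D}}\) in the sense of traces.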

Since \({\mathcal {J}}_\ell v_\ell \) cannot attain the exact value \(u_\mathrm {D}\) on \(\Gamma _{\mathrm {D}}\) in general, a (Dirichlet boundary data) oscillation arises in (4.1), but is controlled by the contributions of \(\eta _\ell ^{(\varepsilon )}\).

Lemma 4.2

(Dirichlet boundary data oscillation) Given \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\), let \(T \in {\mathcal {T}}_\ell \) be the unique simplex with \(F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\). Then it holds, for all \(v_\ell = (v_{{\mathcal {T}}_\ell },v_{{\mathcal {F}}_\ell }) \in V({\mathcal {T}}_\ell )\), that

$$\begin{aligned} \Vert {\mathcal {J}}_\ell v_\ell - u_\mathrm {D}\Vert _{L^p(F)}^p&\lesssim \sum _{E \in {\mathcal {F}}_\ell (\Omega ), E \cap F \ne \emptyset } \Vert [{\mathcal {R}}_\ell v_\ell ]_E\Vert _{L^p(E)}^p\\&\quad + \Vert \Pi _F^k({\mathcal {R}}_\ell v_\ell - v_F)\Vert _{L^p(F)}^p + \Vert {\mathcal {R}}_\ell v_\ell - u_\mathrm {D}\Vert _{L^p(F)}^p. \end{aligned}$$

Proof

The proof of Lemma 4.2 utilizes standard averaging and bubble-function techniques, cf., e.g., [20, 42, 57, 58]; further details are therefore omitted. \(\square \)

4.2 Plain convergence

Before the remaining parts of this subsection prove Theorem 2.1, the following lemma establishes the reduction of the mesh-size function \(h_\ell \) with \(h_\ell |_T \equiv |T|^{1/n}\) for all \(T \in {\mathcal {T}}_\ell \).

Lemma 4.3

(Mesh-size reduction) Given the output \(({\mathcal {T}}_\ell )_{\ell \in {\mathbb {N}}_0}\) of AHHO from Sect. 2.2, let \(\Omega _\ell {:}{=}\mathrm {int}(\cup ({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}))\) for all levels \(\ell \in {\mathbb {N}}_0\). Then it holds

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \Vert h_{\ell }\Vert _{L^\infty (\Omega _{\ell })} = 0. \end{aligned}$$
(4.10)

Proof

The proof is omitted for two reasons. First, this is known from [50, Lemma 9] and, second, it is a particular case of Lemma 5.2 below. \(\square \)
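
The statement (4.10) concerns the refined simplices only and does not require the global mesh-size to tend to zero. The following Python toy example illustrates this difference; it assumes \(n = 1\), \(\Omega = (0,1)\), and a hypothetical refinement rule that bisects only the interval touching the origin on every level. This rule is a stand-in for illustration and is not the output of AHHO.

```python
def bisect_towards_origin(levels):
    """Bisect, on every level, only the interval touching x = 0.

    Returns the mesh-size on the refined region Omega_l and the global
    maximal mesh-size for each level (n = 1, so h|_T = |T|).
    """
    mesh = [(0.0, 1.0)]
    refined_sizes, global_sizes = [], []
    for _ in range(levels):
        a, b = mesh[0]  # the only element of T_l \ T_{l+1}
        refined_sizes.append(b - a)
        global_sizes.append(max(right - left for left, right in mesh))
        mid = 0.5 * (a + b)
        mesh = [(a, mid), (mid, b)] + mesh[1:]
    return refined_sizes, global_sizes
```

In this example the mesh-size on the refined region halves on every level, while the global maximal mesh-size stays at 1/2 from the first refinement on.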

Proof of Theorem 2.1

This proof is motivated by [4, 13, 22, 49, 50, 52] and is divided into five steps.

Step 1 establishes \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) = 0\). Since no suitable residual-based a posteriori control is available in the general setting (A1)–(A2), standard arguments, e.g., reliability, efficiency, or estimator reduction [22, 41, 49] fail. The proof of Step 1 rather relies on a positive power of the mesh-size that arises from the smoothness of test functions in Theorem 4.1. This is done in [52] for a similar setting and in [11, 34, 36] for uniform mesh-refinements. Let \(\mu _\ell ^{(\varepsilon )}(T)\) abbreviate some contributions of \(\eta _\ell ^{(\varepsilon )}(T)\) from (2.3) related to \(\mu _\ell (u_\ell )\) in (4.1), namely

$$\begin{aligned}&\mu _\ell ^{(\varepsilon )}(T) {:}{=}|T|^{(\varepsilon p - p)/n}\Vert \Pi _T^k({\mathcal {R}}_\ell u_\ell - u_T)\Vert _{L^p(T)}^p\nonumber \\&\quad + |T|^{(\varepsilon p + 1 - p)/n}\Big (\sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})} \Vert {\mathcal {R}}_\ell u_\ell - u_\mathrm {D}\Vert _{L^p(F)}^p\nonumber \\&\quad + \sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Omega )} \Vert [{\mathcal {R}}_\ell u_\ell ]_F\Vert _{L^p(F)}^p + \sum _{F \in {\mathcal {F}}_\ell (T)} \Vert \Pi _F^k (({\mathcal {R}}_\ell u_\ell )|_T - u_F)\Vert _{L^p(F)}^p\Big ). \end{aligned}$$
(4.11)

Denote \(\mu _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) {:}{=}\sum _{T \in {\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}} \mu _\ell ^{(\varepsilon )}(T)\). Given any \(T \in {\mathcal {T}}_\ell \) and \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})\), the \(L^{p'}\) stability of the \(L^2\) projection \(\Pi _{\mathrm {RT}_k(T;{\mathbb {M}})}\) resp. \(\Pi _T^k\) or \(\Pi _F^k\) [34, Lemma 3.2] implies \(\Vert \sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(T)} \lesssim \Vert \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(T)}\) resp. \(\Vert (1 - \Pi _T^k) f\Vert _{L^{p'}(T)} \lesssim \Vert f\Vert _{L^{p'}(T)}\) or \(\Vert (1 - \Pi _F^k) g\Vert _{L^{p'}(F)} \lesssim \Vert g\Vert _{L^{p'}(F)}\). Since \(\mu _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) \le \Vert h_\ell \Vert _{L^{\infty }(\Omega _\ell )}^{\varepsilon p}\mu _\ell ^{(0)}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1})\) with \(\Vert h_\ell \Vert _{L^{\infty }(\Omega _\ell )} = \sup _{T \in {\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}} |T|^{1/n}\) and \(\Omega _\ell = \mathrm {int}(\cup ({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}))\) from Lemma 4.3, this leads to

$$\begin{aligned} \eta _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1})&\lesssim \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{\varepsilon p} \mu _\ell ^{(0)}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) + \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{\varepsilon p'}\Vert \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(\Omega )}^{p'}\nonumber \\&\quad + \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{p'}\Vert f\Vert _{L^{p'}(\Omega )}^{p'} + \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}\Vert g\Vert _{L^{p'}(\Gamma _{\mathrm {N}})}^{p'}. \end{aligned}$$
(4.12)

The two-sided growth \(|A|^p - 1 \lesssim W(A) \lesssim |A|^p + 1\) implies \(|\mathrm {D}W(A)|^{p'} \lesssim |A|^p + 1\) [28, Lemma 2.1] and so Theorem 3.2 provides \(\Vert \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(\Omega )}^{p'} \lesssim \Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )}^p + |\Omega | \lesssim 1\). This and (4.10) prove

$$\begin{aligned} \begin{aligned}&\lim _{\ell \rightarrow \infty } \Big (\Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{\varepsilon p'}\Vert \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(\Omega )}^{p'}\\&\quad + \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{p'}\Vert f\Vert _{L^{p'}(\Omega )}^{p'} + \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}\Vert g\Vert _{L^{p'}(\Gamma _{\mathrm {N}})}^{p'}\Big ) = 0, \end{aligned} \end{aligned}$$
(4.13)

whence, in order to obtain \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) = 0\), it suffices to prove that \(\mu _\ell ^{(0)}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1})\) is uniformly bounded. The estimate (3.10) provides control over all but one contribution of \(\mu _\ell ^{(0)}(T)\) in (4.11), namely \(\Vert h_F^{-1/p'}({\mathcal {R}}_\ell u_\ell - u_\mathrm {D})\Vert _{L^p(F)}\) for any \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\). Triangle inequalities and \(u_F = \Pi _F^k u_\mathrm {D}\) imply

$$\begin{aligned} \begin{aligned}&\Vert h_F^{-1/p'}({\mathcal {R}}_\ell u_\ell - u_\mathrm {D})\Vert _{L^p(F)} \le \Vert h_F^{-1/p'}(1 - \Pi _{F}^k){\mathcal {R}}_\ell u_\ell \Vert _{L^p(F)}\\&\quad + \Vert h_F^{-1/p'}\Pi _F^k({\mathcal {R}}_\ell u_\ell - u_F)\Vert _{L^p(F)} + \Vert h_F^{-1/p'}(1 - \Pi _{F}^k)u_\mathrm {D}\Vert _{L^p(F)}. \end{aligned} \end{aligned}$$
(4.14)

Given any \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\), let \(T \in {\mathcal {T}}_\ell \) be the unique simplex with \(F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\). The \(L^p\) stability of the \(L^2\) projection \(\Pi _F^k\) [34, Lemma 3.2] and a trace inequality show \(\Vert h_F^{-1/p'}(1 - \Pi _F^k) u_\mathrm {D}\Vert _{L^p(F)} \lesssim \Vert \mathrm {D}u_\mathrm {D}\Vert _{L^p(T)}\) and \(\Vert h_F^{-1/p'}(1 - \Pi _F^k) ({\mathcal {R}}_\ell u_\ell )|_F\Vert _{L^p(F)} \lesssim \Vert \mathrm {D}{\mathcal {R}}_\ell u_\ell \Vert _{L^p(T)}\). Recall that \(\mathrm {D}_\mathrm {pw}{\mathcal {R}}_\ell u_\ell \) is the \(L^2\) projection of \({\mathcal {G}}_\ell u_\ell \) onto \(\mathrm {D}_\mathrm {pw}P_{k+1}({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\), whence the \(L^p\) stability of \(L^2\) projections [34, Lemma 3.2] proves \(\Vert \mathrm {D}{\mathcal {R}}_\ell u_\ell \Vert _{L^p(T)} \lesssim \Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(T)}\). Hence, the right-hand side of (4.14) is controlled by \(\Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(T)} + \Vert \mathrm {D}u_\mathrm {D}\Vert _{L^p(T)} + \Vert h_F^{-1/p'}\Pi _F^k({\mathcal {R}}_\ell u_\ell - u_F)\Vert _{L^p(F)}\). This, (3.10), and \(\Vert u_\ell \Vert _\ell \approx \Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )} \le C_{1}\) from Lemma 3.1a and Theorem 3.2 lead to

$$\begin{aligned} \mu _\ell ^{(0)}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) \le \mu _\ell ^{(0)} {:}{=}\sum _{T \in {\mathcal {T}}_\ell } \mu _\ell ^{(0)}(T) \lesssim \Vert u_\ell \Vert _\ell ^p + \Vert \mathrm {D}u_\mathrm {D}\Vert _{L^p(\Omega )}^p \lesssim 1. \end{aligned}$$

Hence, the combination of (4.12)–(4.13) with \(\lim _{\ell \rightarrow \infty } \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )} = 0\) in (4.10) confirms \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) = 0\).
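
The role of the parameter \(\varepsilon \) in Step 1 becomes transparent in the weights of (4.11). With \(h_\ell |_T = |T|^{1/n}\), the weights factorize, for any \(T \in {\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}\), as

$$\begin{aligned} |T|^{(\varepsilon p - p)/n} = \big (|T|^{1/n}\big )^{\varepsilon p} |T|^{-p/n} \le \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{\varepsilon p} |T|^{-p/n} \quad \text {and}\quad |T|^{(\varepsilon p + 1 - p)/n} \le \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{\varepsilon p} |T|^{(1 - p)/n}. \end{aligned}$$

Both bounds use \(\varepsilon > 0\), which is presumably part of the definition (2.3) that is not restated in this section. The sum over all \(T \in {\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}\) recovers \(\mu _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) \le \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{\varepsilon p}\mu _\ell ^{(0)}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1})\), so the uniformly bounded quantity \(\mu _\ell ^{(0)}\) is multiplied by a factor that tends to zero by (4.10).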

Step 2 establishes \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\). Recall the set \({\mathfrak {M}}_\ell \) of marked simplices on level \(\ell \in {\mathbb {N}}_0\) from Sect. 2.2. Since all simplices in \({\mathfrak {M}}_\ell \subset {\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}\) are refined and the Dörfler marking enforces \(\eta _\ell ^{(\varepsilon )} \le \theta ^{-1}\eta _\ell ^{(\varepsilon )}({\mathfrak {M}}_\ell ) \le \theta ^{-1}\eta _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1})\) in (2.4), the convergence \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )}({\mathcal {T}}_\ell \setminus {\mathcal {T}}_{\ell +1}) = 0\) in Step 1 implies \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\).
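
The marking step used in Step 2 can be sketched in a few lines of Python. The criterion (2.4) is not restated in this section; the sketch assumes the form \(\theta \eta _\ell ^{(\varepsilon )} \le \eta _\ell ^{(\varepsilon )}({\mathfrak {M}}_\ell )\) with \(0 < \theta \le 1\) suggested by the inequality above, nonnegative local contributions, and a greedy selection by sorting. The function name `doerfler_marking` is a hypothetical helper.

```python
import numpy as np


def doerfler_marking(eta, theta):
    """Return indices of a smallest set M with theta * sum(eta) <= sum(eta[M]).

    eta   : nonnegative local estimator contributions eta(T), one per simplex
    theta : bulk parameter with 0 < theta <= 1
    """
    eta = np.asarray(eta, dtype=float)
    order = np.argsort(eta)[::-1]  # largest contributions first
    cumulative = np.cumsum(eta[order])
    # First position where the accumulated sum reaches the bulk threshold.
    first = int(np.searchsorted(cumulative, theta * eta.sum(), side="left"))
    count = min(first + 1, eta.size)  # guard against round-off for theta = 1
    return order[:count]
```

For the contributions `[0.1, 0.4, 0.2, 0.3]` and `theta = 0.5`, the two largest entries (indices 1 and 3) already carry 70% of the total and form the marked set.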

Step 3 provides the lower energy bound (LEB)

$$\begin{aligned} \begin{aligned} \mathrm {LEB}_\ell&{:}{=}E_\ell (u_\ell ) + \int _\Omega (1 - \Pi _{\Sigma ({\mathcal {T}}_\ell )}) \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) : \mathrm {D}u \,{\mathrm {d}}x\\&\quad - C_{3}\big (\mathrm {osc}(f,{\mathcal {T}}_\ell ) + \mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_{\ell }(\Gamma _{\mathrm {N}}))\big ) \le E(u). \end{aligned} \end{aligned}$$
(4.15)

The convexity of \(W \in C^1({\mathbb {M}})\) implies \(\mathrm {D}W({\mathcal {G}}_\ell u_\ell ):(\mathrm {D}u - {\mathcal {G}}_\ell u_\ell ) \le W(\mathrm {D}u) - W({\mathcal {G}}_\ell u_\ell )\) a.e. in \(\Omega \). The integral of this inequality with \(\sigma _\ell {:}{=}\Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\) reads

$$\begin{aligned} 0&\le \int _\Omega \big (W(\mathrm {D}u) - W({\mathcal {G}}_\ell u_\ell ) - (1 - \Pi _{\Sigma ({\mathcal {T}}_\ell )}) \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) : \mathrm {D}u\big ) \,{\mathrm {d}}x\nonumber \\&\quad - \int _\Omega \sigma _\ell : (\mathrm {D}u - {\mathcal {G}}_\ell u_\ell ) \,{\mathrm {d}}x. \end{aligned}$$
(4.16)

The commutativity \(\Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}u = {\mathcal {G}}_\ell \mathrm {I}_\ell u\) from Lemma 3.1b and the discrete Euler–Lagrange equations (3.7) lead to

$$\begin{aligned} \int _\Omega \sigma _\ell : (\mathrm {D}u - {\mathcal {G}}_\ell u_\ell ) \,{\mathrm {d}}x = \int _\Omega f \cdot (\Pi _{{\mathcal {T}}_\ell }^k u - u_{{\mathcal {T}}_\ell }) \,{\mathrm {d}}x + \int _{\Gamma _\mathrm {N}} g \cdot (\Pi _{{\mathcal {F}}_\ell }^k u - u_{{\mathcal {F}}_\ell }) \,{\mathrm {d}}s. \end{aligned}$$

The substitution of this in (4.16), the definition of E in (2.1), and the definition of \(E_\ell \) in (2.2) result in

$$\begin{aligned} \begin{aligned} 0&\le E(u) - E_\ell (u_\ell ) - \int _\Omega (1 - \Pi _{\Sigma ({\mathcal {T}}_\ell )}) \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) : \mathrm {D}u \,{\mathrm {d}}x\\&\quad + \int _\Omega (u - \Pi _{{\mathcal {T}}_\ell }^k u) \cdot (f - \Pi _{{\mathcal {T}}_\ell }^k f) \,{\mathrm {d}}x + \int _{\Gamma _{\mathrm {N}}} (u - \Pi _{{\mathcal {F}}_\ell }^k u) \cdot (g - \Pi _{{\mathcal {F}}_\ell }^k g) \,{\mathrm {d}}s. \end{aligned} \end{aligned}$$
(4.17)

The final two integrals on the right-hand side of (4.17) give rise to the data oscillations \(\mathrm {osc}(f,{\mathcal {T}}_\ell )\) and \(\mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_\ell (\Gamma _{\mathrm {N}}))\) defined in Sect. 3.2. In fact, a Hölder inequality and a piecewise application of the Poincaré inequality show

$$\begin{aligned} \int _\Omega (u - \Pi _{{\mathcal {T}}_\ell }^k u) \cdot (f - \Pi _{{\mathcal {T}}_\ell }^k f) \,{\mathrm {d}}x \lesssim \Vert \mathrm {D}u\Vert _{L^p(\Omega )} \mathrm {osc}(f,{\mathcal {T}}_\ell ). \end{aligned}$$

A trace inequality and the Bramble–Hilbert lemma [9, Lemma 4.3.8] lead, for all \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})\) and the unique \(T \in {\mathcal {T}}_\ell \) with \(F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})\), to \(\Vert h_F^{-1/p'}(u - \Pi _F^k u)\Vert _{L^{p}(F)} \lesssim \Vert \mathrm {D}u\Vert _{L^p(T)}\). Consequently,

$$\begin{aligned} \int _{\Gamma _{\mathrm {N}}} (u - \Pi _{{\mathcal {F}}_\ell }^k u) \cdot (g - \Pi _{{\mathcal {F}}_\ell }^k g) \,{\mathrm {d}}s \lesssim \Vert \mathrm {D}u\Vert _{L^p(\Omega )} \mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})). \end{aligned}$$

The lower p-growth \(c_{1}|A|^p - c_{2} \le W(A)\) for all \(A \in {\mathbb {M}}\) implies the coercivity of E in the seminorm \(\Vert \mathrm {D}\bullet \Vert _{L^p(\Omega )}\) and so the bound \(\Vert \mathrm {D}u\Vert _{L^p(\Omega )} \le C_{2}\) with a positive constant \(C_{2}\) that exclusively depends on \(c_{1}, c_{2}, \Omega \), \(\Gamma _{\mathrm {D}}\), f, g, and \(u_\mathrm {D}\), cf., e.g., [32, Theorem 4.1]. Thus there exists a positive constant \(C_{3}\) independent of the mesh-size with

$$\begin{aligned} \begin{aligned}&\int _\Omega (u - \Pi _{{\mathcal {T}}_\ell }^k u) \cdot (f - \Pi _{{\mathcal {T}}_\ell }^k f) \,{\mathrm {d}}x + \int _{\Gamma _{\mathrm {N}}} (u - \Pi _{{\mathcal {F}}_\ell }^k u) \cdot (g - \Pi _{{\mathcal {F}}_\ell }^k g) \,{\mathrm {d}}s\\&\quad \le C_{3}\big (\mathrm {osc}(f,{\mathcal {T}}_\ell ) + \mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_\ell (\Gamma _{\mathrm {N}}))\big ). \end{aligned} \end{aligned}$$
(4.18)

The combination of this with (4.17) concludes the proof of (4.15).

Step 4 establishes \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = E(u)\). On the one hand, the discrete compactness from Theorem 4.1 and the weak lower semicontinuity of the energy functional imply \(E(u) \le \liminf _{\ell \rightarrow \infty } \mathrm {LEB}_\ell \). On the other hand, \(\mathrm {LEB}_\ell \le E(u)\) from (4.15). This proves \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = E(u)\) as follows. Given any \(\Phi \in C^\infty (\Omega ;{\mathbb {M}})\), the definition \(\sigma _\ell {:}{=}\Pi _{\Sigma ({\mathcal {T}}_\ell )} \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\), a Hölder inequality, and (4.7) show

$$\begin{aligned} \Big |\!\int _{\Omega } (\sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell )):\Phi \,{\mathrm {d}}x\Big |&= \Big |\!\int _{\Omega } (\sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell )):(1 - \Pi _{\Sigma ({\mathcal {T}}_\ell )})\Phi \,{\mathrm {d}}x\Big | \nonumber \\&\lesssim \Vert h_\ell ^{k+1}(\sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell ))\Vert _{L^{p'}(\Omega )}|\Phi |_{W^{k+1,p}(\Omega )}. \end{aligned}$$
(4.19)

Since \(\Vert h_\ell ^{k+1}(\sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell ))\Vert _{L^{p'}(\Omega )}^{p'} \le \eta _\ell ^{(k+1)} \lesssim \eta _\ell ^{(\varepsilon )} \rightarrow 0\) as \(\ell \rightarrow \infty \) from Step 2, the right-hand side of (4.19) vanishes in the limit as \(\ell \rightarrow \infty \). This, the density of \(C^\infty (\Omega ;{\mathbb {M}})\) in \(L^p(\Omega ;{\mathbb {M}})\), and the uniform boundedness of the sequence \((\sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell ))_{\ell \in {\mathbb {N}}_0}\) in \(L^{p'}(\Omega ;{\mathbb {M}})\) prove \(\sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) \rightharpoonup 0\) (weakly) in \(L^{p'}(\Omega ;{\mathbb {M}})\) as \(\ell \rightarrow \infty \). In particular,

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \int _{\Omega } (\sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell )):\mathrm {D}u \,{\mathrm {d}}x = 0. \end{aligned}$$
(4.20)

Recall \(\mu _\ell (u_\ell )\) from (4.1) and \(\mu _\ell ^{(\varepsilon )}\) from (4.11). The combination of (3.9) with the bound of the Dirichlet data oscillation from Lemma 4.2 and the equivalence \(h_F \approx h_T \approx |T|^{1/n}\) for all \(T \in {\mathcal {T}}_\ell \), \(F \in {\mathcal {F}}_\ell (T)\) from the shape regularity of \({\mathcal {T}}_\ell \) result in

$$\begin{aligned} \mu _\ell (u_\ell ) \lesssim \mu _\ell ^{(k+1)} \le {\text {diam}}(\Omega )^{(k+1-\varepsilon )p}\mu _\ell ^{(\varepsilon )} \lesssim \eta _\ell ^{(\varepsilon )}. \end{aligned}$$

This and \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\) from Step 2 imply \(\lim _{\ell \rightarrow \infty } \mu _\ell (u_\ell ) = 0\). Since \(\Vert u_\ell \Vert _\ell \approx \Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )} \le C_{1}\) from Lemma 3.1a and Theorem 3.2, the discrete compactness from Theorem 4.1 leads to a (not relabelled) subsequence of \((u_\ell )_{\ell \in {\mathbb {N}}_0}\) and a weak limit \(v \in {\mathcal {A}}\) such that \({\mathcal {J}}_\ell u_\ell \rightharpoonup v\) weakly in V and \({\mathcal {G}}_\ell u_\ell \rightharpoonup \mathrm {D}v\) weakly in \(L^p(\Omega ;{\mathbb {M}})\) as \(\ell \rightarrow \infty \). The boundedness of the linear trace operator \(\gamma : V \rightarrow L^p(\partial \Omega ;{\mathbb {R}}^m)\) [10, Chapter 9] implies \(({\mathcal {J}}_\ell u_\ell )|_{\partial \Omega } \rightharpoonup v|_{\partial \Omega }\) (weakly) in \(L^p(\partial \Omega ;{\mathbb {R}}^m)\). Hence

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \int _{\Gamma _{\mathrm {N}}} g \cdot {\mathcal {J}}_\ell u_\ell \,{\mathrm {d}}s = \int _{\Gamma _\mathrm {N}} g \cdot v \,{\mathrm {d}}s. \end{aligned}$$

This, \({\mathcal {J}}_{\ell } u_\ell \rightharpoonup v\) (weakly) in V, \({\mathcal {G}}_\ell u_\ell \rightharpoonup \mathrm {D}v\) (weakly) in \(L^p(\Omega ;{\mathbb {M}})\), the sequential weak lower semicontinuity of the functional \(\int _\Omega W(\bullet ) \,{\mathrm {d}}x\) in \(L^p(\Omega ;{\mathbb {M}})\), and (3.8) verify

$$\begin{aligned} E(v)&\le \liminf _{\ell \rightarrow \infty } \Big (\int _\Omega (W({\mathcal {G}}_\ell u_\ell ) - f \cdot {\mathcal {J}}_\ell u_\ell ) \,{\mathrm {d}}x - \int _{\Gamma _{\mathrm {N}}} g \cdot {\mathcal {J}}_\ell u_\ell \,{\mathrm {d}}s\Big )\nonumber \\&= \liminf _{\ell \rightarrow \infty } \Big (E_\ell (u_\ell ) - \int _\Omega {\mathcal {J}}_\ell u_\ell \cdot (1 - \Pi _{{\mathcal {T}}_\ell }^k) f \,{\mathrm {d}}x - \int _{\Gamma _{\mathrm {N}}} {\mathcal {J}}_\ell u_\ell \cdot (1 - \Pi _{{\mathcal {F}}_\ell }^k) g \,{\mathrm {d}}s\Big ). \end{aligned}$$
(4.21)

As in (4.18), a piecewise application of the Poincaré inequality, the trace inequality, the approximation property of polynomials, and the uniform bound \(\Vert \mathrm {D}{\mathcal {J}}_\ell u_\ell \Vert _{L^p(\Omega )} \lesssim 1\) from (4.2) confirm

$$\begin{aligned} \begin{aligned}&\Big |\!\int _\Omega {\mathcal {J}}_\ell u_\ell \cdot (1 - \Pi _{{\mathcal {T}}_\ell }^k) f \,{\mathrm {d}}x\Big | + \Big |\!\int _{\Gamma _{\mathrm {N}}} {\mathcal {J}}_\ell u_\ell \cdot (1 - \Pi _{{\mathcal {F}}_\ell }^k) g \,{\mathrm {d}}s\Big |\\&\quad \lesssim \mathrm {osc}(f,{\mathcal {T}}_\ell ) + \mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})). \end{aligned} \end{aligned}$$
(4.22)

Since \(\mathrm {osc}(f,{\mathcal {T}}_\ell )^{p'} + \mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_\ell (\Gamma _{\mathrm {N}}))^{p'} \lesssim \eta _\ell ^{(\varepsilon )}\) and \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\) from Step 2, the LEB from (4.15) and (4.20)–(4.22) lead to

$$\begin{aligned} E(u) \le E(v) \le \liminf _{\ell \rightarrow \infty } E_\ell (u_\ell ) = \liminf _{\ell \rightarrow \infty } \mathrm {LEB}_\ell \le E(u). \end{aligned}$$

Hence \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = \lim _{\ell \rightarrow \infty } \mathrm {LEB}_\ell = E(u)\) for a (not relabelled) subsequence of \((u_\ell )_{\ell \in {\mathbb {N}}_0}\). Since the above arguments from Step 4 apply to all subsequences of \((u_\ell )_{\ell \in {\mathbb {N}}_0}\) and the limit E(u) is unique, this holds for the entire sequence.

Step 5 finishes the proof. Suppose that W satisfies (2.5). Then the arguments from [27, 28] show, for all \(\varrho , \xi \in L^p(\Omega ;{\mathbb {M}})\) and \(r, t\) from Table 1, that

$$\begin{aligned} \begin{aligned} \Vert \varrho - \xi \Vert _{L^p(\Omega )}^r&\le 3c_{5}(|\Omega | + \Vert \varrho \Vert ^{p}_{L^p(\Omega )} + \Vert \xi \Vert _{L^p(\Omega )}^p)^{t/t'}\\&\quad \times \int _\Omega (W(\varrho ) - W(\xi ) - \mathrm {D}W(\xi ) : (\varrho - \xi )) \,{\mathrm {d}}x. \end{aligned} \end{aligned}$$
(4.23)

The choice \(\varrho {:}{=}\mathrm {D}u\) and \(\xi {:}{=}{\mathcal {G}}_\ell u_\ell \) in (4.23) and the bounds \(\Vert \mathrm {D}u\Vert _{L^p(\Omega )} \le C_{2}\) and \(\Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )} \le C_{1}\) lead, with the constant \(C_{4} {:}{=}3 c_{5}(|\Omega | + C_{1}^p + C_{2}^p)^{t/t'}\), to

$$\begin{aligned} \begin{aligned}&C_{4}^{-1} \Vert \mathrm {D}u - {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )}^r\\&\quad \le \int _\Omega (W(\mathrm {D}u) - W({\mathcal {G}}_\ell u_\ell ) - \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) : (\mathrm {D}u - {\mathcal {G}}_\ell u_\ell )) \,{\mathrm {d}}x. \end{aligned} \end{aligned}$$
(4.24)

The right-hand side of (4.24) coincides with the right-hand side of (4.16). The latter is bounded by the right-hand side of (4.15) in Step 3. This implies

$$\begin{aligned} C_{4}^{-1}\Vert \mathrm {D}u - {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )}^r \le E(u) - \mathrm {LEB}_\ell . \end{aligned}$$
(4.25)

Step 4 proves that \(E(u) - \mathrm {LEB}_\ell \) vanishes in the limit as \(\ell \rightarrow \infty \). Thus,

$$\begin{aligned} \lim _{\ell \rightarrow \infty } {\mathcal {G}}_\ell u_\ell = \mathrm {D}u \text { (strongly) in } L^p(\Omega ;{\mathbb {M}}). \end{aligned}$$

If W satisfies (2.6), then [28, Lemma 4.2] implies, for all \(\varrho , \xi \in L^p(\Omega ;{\mathbb {M}})\) and \({\widetilde{r}},{\widetilde{t}}\) from Table 1, that

$$\begin{aligned} \begin{aligned} \Vert \mathrm {D}W(\varrho ) - \mathrm {D}W(\xi )\Vert _{L^{p'}(\Omega )}^{{\widetilde{r}}}&\le 3c_{6}(|\Omega | + \Vert \varrho \Vert ^{p}_{L^p(\Omega )} + \Vert \xi \Vert _{L^p(\Omega )}^p)^{{\widetilde{t}}/{\widetilde{t}}'}\\&\quad \times \int _\Omega (W(\varrho ) - W(\xi ) - \mathrm {D}W(\xi ) : (\varrho - \xi )) \,{\mathrm {d}}x, \end{aligned} \end{aligned}$$
(4.26)

whence the left-hand side of (4.25) can be replaced by \(C_{5}^{-1}\Vert \sigma - \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(\Omega )}^{{\widetilde{r}}}\) with \(C_{5} {:}{=}3c_{6}(|\Omega | + C_{1}^p + C_{2}^p)^{{\widetilde{t}}/{\widetilde{t}}'}\). This and Step 4 conclude the proof of

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) = \sigma \text { (strongly) in } L^{p'}(\Omega ;{\mathbb {M}}). \end{aligned}$$

Remark 4.4

(necessity of \(\varepsilon > 0\)) The counterexample in [52, Subsection 3.4] shows that the restriction \(\varepsilon > 0\) is necessary. Indeed, for \(k = 0\) and the data W, \(\Omega \), \(\Gamma _{\mathrm {D}}\), \(\Gamma _{\mathrm {N}}\), f, g, \(u_\mathrm {D}\), and \(({\mathcal {T}}_\ell )_{\ell \in {\mathbb {N}}_0}\) from [52, Subsection 3.4], there exists a sequence of discrete minimizers \((u_\ell )_{\ell \in {\mathbb {N}}_0}\) such that \({\mathcal {J}}_\ell u_\ell \rightharpoonup 0\) weakly in V and \({\mathcal {G}}_\ell u_\ell \rightharpoonup 0\) weakly in \(L^p(\Omega ;{\mathbb {M}})\) as \(\ell \rightarrow \infty \), but \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} \ne 0\).

4.3 The Lavrentiev gap

A particular challenge in the computational calculus of variations is the Lavrentiev phenomenon \(\inf E({\mathcal {A}}) < \inf E({\mathcal {A}}\cap W^{1,\infty }(\Omega ))\) [48]. Its presence is equivalent to the failure of standard conforming FEMs [17, Theorem 2.1] in the sense that a wrong minimal energy is approximated. As a remedy, the nonconforming Crouzeix–Raviart FEM in [3, 51, 52] can overcome the Lavrentiev gap under fairly general assumptions on W: Throughout the remaining parts of this section, let \(W \in C^1({\mathbb {M}})\) be convex with the one-sided lower growth

$$\begin{aligned} c_{1}|A|^p - c_{2} \le W(A) \text { for all } A \in {\mathbb {M}}\text { and some } 1< p < \infty . \end{aligned}$$

(A two-sided growth of W excludes a Lavrentiev gap.) Since there is no upper growth of W, the dual variable \(\sigma {:}{=}\mathrm {D}W(\mathrm {D}u)\) is not guaranteed to be in \(L^{p'}(\Omega ;{\mathbb {M}})\). This denies access to the Euler–Lagrange equations and, therefore, the convergence analysis of [51, 52] solely relies on the Jensen inequality. For \(k = 0\), HHO methods can overcome the Lavrentiev gap because the Crouzeix–Raviart FEM can.

Lemma 4.5

(Lower-energy bound for \(k = 0\)) Let \(k = 0\). There exists a positive constant \(C_{6}\) such that, for all levels \(\ell \in {\mathbb {N}}_0\),

$$\begin{aligned} \min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell )) - C_{6}\big (\Vert h_\ell f\Vert _{L^{p'}(\Omega )} + \mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_\ell (\Gamma _{\mathrm {N}}))\big ) \le \min E({\mathcal {A}}). \end{aligned}$$

Proof

Recall the discrete space \(\mathrm {CR}^1({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\) of Crouzeix–Raviart finite element functions from (3.1). Define

$$\begin{aligned} \mathrm {CR}^1_\mathrm {D}({\mathcal {T}}_\ell ;{\mathbb {R}}^m) {:}{=}\{v_\mathrm {CR}\in \mathrm {CR}^1({\mathcal {T}}_\ell ;{\mathbb {R}}^m) : v_\mathrm {CR}({\text {mid}}(F)) = 0 \text { for all } F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {D}})\} \end{aligned}$$

and the nonconforming interpolation \(\mathrm {I}_\mathrm {CR}: V \rightarrow \mathrm {CR}^1({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\) [31] with

$$\begin{aligned} \mathrm {I}_\mathrm {CR}v ({\text {mid}}(F)) {:}{=}\int _F v \,{\mathrm {d}}s/|F| \quad \text {for all } F \in {\mathcal {F}}_\ell , v \in V. \end{aligned}$$

The discrete CR-FEM minimizes the nonconforming energy

$$\begin{aligned} E_\mathrm {NC}(v_\mathrm {CR}) {:}{=}\int _\Omega (W(\mathrm {D}_\mathrm {pw}v_\mathrm {CR}) - \Pi _{{\mathcal {T}}_\ell }^0 f \cdot v_\mathrm {CR}) \,{\mathrm {d}}x - \int _{\Gamma _{\mathrm {N}}} \Pi _{{\mathcal {F}}_\ell }^0 g \cdot v_\mathrm {CR}\,{\mathrm {d}}s \end{aligned}$$

among \(v_\mathrm {CR}\in {\mathcal {A}}_\mathrm {NC} {:}{=}\mathrm {I}_\mathrm {CR} u_\mathrm {D} + \mathrm {CR}^1_\mathrm {D}({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\). A straightforward modification of the proof of [52, Lemma 4] shows, for a positive constant \(C_{6}\), that

$$\begin{aligned} \min E_\mathrm {NC}({\mathcal {A}}_\mathrm {NC}) - C_{6}\big (\Vert h_\ell f\Vert _{L^{p'}(\Omega )} + \mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_\ell (\Gamma _{\mathrm {N}}))\big ) \le \min E({\mathcal {A}}). \end{aligned}$$
(4.27)

Notice that \(\mathrm {I}_\mathrm {CR}\) does not provide the \(L^2\) orthogonality \(\mathrm {I}_\mathrm {CR}v - v \perp P_0({\mathcal {T}}_\ell ;{\mathbb {R}}^m)\) in \(L^2(\Omega ;{\mathbb {R}}^m)\), but \((\mathrm {I}_\mathrm {CR}v - v)|_F \perp P_0(F;{\mathbb {R}}^m)\) in \(L^2(F;{\mathbb {R}}^m)\) for all \(F \in {\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})\) and \(v \in V\). Hence the Neumann boundary data oscillations \(\mathrm {osc}_{\mathrm {N}}(g,{\mathcal {F}}_\ell (\Gamma _{\mathrm {N}}))\) arise in (4.27), but \(\Vert h_\ell f\Vert _{L^{p'}(\Omega )}\) cannot be replaced by \(\mathrm {osc}(f,{\mathcal {T}}_\ell )\). For any \(v_\mathrm {CR}\in {\mathcal {A}}_\mathrm {NC}\), \(v_\ell {:}{=}(\Pi _{{\mathcal {T}}_\ell }^0 v_\mathrm {CR}, \Pi _{{\mathcal {F}}_\ell }^0 v_\mathrm {CR}) \in {\mathcal {A}}({\mathcal {T}}_\ell )\) satisfies \({\mathcal {G}}_\ell v_\ell = \mathrm {D}_\mathrm {pw}v_\mathrm {CR}\) and hence, \(\min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell )) \le \min E_\mathrm {NC}({\mathcal {A}}_\mathrm {NC})\). This and (4.27) conclude the proof. \(\square \)
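The inequality \(\min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell )) \le \min E_\mathrm {NC}({\mathcal {A}}_\mathrm {NC})\) in the last step of the proof rests on the identity \(E_\ell (v_\ell ) = E_\mathrm {NC}(v_\mathrm {CR})\). A short verification is sketched here under the assumption that, for \(k = 0\), the discrete energy from Sect. 3 reads \(E_\ell (v_\ell ) = \int _\Omega (W({\mathcal {G}}_\ell v_\ell ) - f \cdot v_{{\mathcal {T}}_\ell }) \,{\mathrm {d}}x - \int _{\Gamma _{\mathrm {N}}} g \cdot v_{{\mathcal {F}}_\ell } \,{\mathrm {d}}s\). Its definition lies outside this part of the text, but this form is consistent with (4.21). Since the \(L^2\) projections are self-adjoint,

$$\begin{aligned} \int _\Omega f \cdot \Pi _{{\mathcal {T}}_\ell }^0 v_\mathrm {CR}\,{\mathrm {d}}x = \int _\Omega \Pi _{{\mathcal {T}}_\ell }^0 f \cdot v_\mathrm {CR}\,{\mathrm {d}}x \quad \text {and}\quad \int _{\Gamma _{\mathrm {N}}} g \cdot \Pi _{{\mathcal {F}}_\ell }^0 v_\mathrm {CR}\,{\mathrm {d}}s = \int _{\Gamma _{\mathrm {N}}} \Pi _{{\mathcal {F}}_\ell }^0 g \cdot v_\mathrm {CR}\,{\mathrm {d}}s. \end{aligned}$$

This and \({\mathcal {G}}_\ell v_\ell = \mathrm {D}_\mathrm {pw}v_\mathrm {CR}\) give

$$\begin{aligned} E_\ell (v_\ell ) = \int _\Omega (W(\mathrm {D}_\mathrm {pw}v_\mathrm {CR}) - \Pi _{{\mathcal {T}}_\ell }^0 f \cdot v_\mathrm {CR}) \,{\mathrm {d}}x - \int _{\Gamma _{\mathrm {N}}} \Pi _{{\mathcal {F}}_\ell }^0 g \cdot v_\mathrm {CR}\,{\mathrm {d}}s = E_\mathrm {NC}(v_\mathrm {CR}). \end{aligned}$$

The choice of a minimizer \(v_\mathrm {CR}\) of \(E_\mathrm {NC}\) in \({\mathcal {A}}_\mathrm {NC}\) then leads to \(\min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell )) \le E_\ell (v_\ell ) = \min E_\mathrm {NC}({\mathcal {A}}_\mathrm {NC})\).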

The discrete compactness from Theorem 4.1, the LEB in (4.27), and straightforward modifications of the proof of Theorem 2.1 lead to \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = \min E({\mathcal {A}})\) for the output \((u_\ell )_{\ell \in {\mathbb {N}}_0}\) of the adaptive algorithm in Sect. 2.2 with the refinement indicator, for all \(T \in {\mathcal {T}}_\ell \),

$$\begin{aligned} \eta _\ell ^{(\varepsilon )}(T) {:}{=}\mu _\ell ^{(\varepsilon )}(T) +|T|^{p'/n}\Vert f\Vert _{L^{p'}(T)}^{p'} + |T|^{1/n}\sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _{\mathrm {N}})} \Vert (1 - \Pi _F^0) g\Vert _{L^{p'}(F)}^{p'}. \end{aligned}$$

For \(k \ge 1\), the consistency error \(\sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\) arises in (4.15), but is not guaranteed to be bounded in \(L^{p'}(\Omega ;{\mathbb {M}})\) in the limit as \(\ell \rightarrow \infty \) in general. Thus, in the absence of further conditions, the convergence \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = \min E({\mathcal {A}})\) cannot be proven for \(k \ge 1\) with this methodology.

5 Stabilized HHO method on polytopal meshes

The classical HHO methodology [35, 37] allows even polytopal partitions of the domain \(\Omega \). The assumption (M1) on the mesh follows the works [35, 37, 40].

5.1 Polytopal meshes

Let \({\mathcal {M}}_\ell \) be a finite collection of closed polytopes of positive volume with overlap of volume measure zero that cover \({\overline{\Omega }} = \cup _{K \in {\mathcal {M}}_\ell } K\). A side S of the mesh \({\mathcal {M}}_\ell \) is the (in general disconnected) closed subset of a hyperplane \(H_S \subset \Omega \) with positive (\((n-1)\)-dimensional) surface measure such that either (a) there exist \(K_1, K_2 \in {\mathcal {M}}_\ell \) with \(S = \partial K_1 \cap \partial K_2 \cap H_S\) (interior side) or (b) there exists \(K \in {\mathcal {M}}_\ell \) with \(S = \partial K \cap \partial \Omega \cap H_S\) (boundary side). Let \(\Sigma _\ell \) denote the set of all sides of \({\mathcal {M}}_\ell \) and adapt the notation \(\Sigma _\ell (K)\), \(\Sigma _\ell (\Omega )\), \(\Sigma _\ell (\Gamma _\mathrm {D})\), and \(\Sigma _\ell (\Gamma _{\mathrm {N}})\) from Sect. 3.1. The convergence results of this section are established under the assumptions (M1)–(M2) below.

  1. (M1)

    Assume that there exists a universal constant \(\varrho > 0\) such that, for all levels \(\ell \in {\mathbb {N}}_0\), \({\mathcal {M}}_\ell \) admits a shape-regular simplicial subtriangulation \({\mathcal {T}}_\ell \) with the shape regularity \(\ge \varrho \) defined in Sect. 3.1 and, for each simplex \(T \in {\mathcal {T}}_\ell \), there exists a unique cell \(K \in {\mathcal {M}}_\ell \) with \(T \subseteq K\) and \(\varrho h_K \le h_T\).

  2. (M2)

    Assume the existence of a universal constant \(0< \gamma < 1\) such that \(|{\widehat{K}}| \le \gamma |K|\) holds for all \(K \in {\mathcal {M}}_\ell \setminus {\mathcal {M}}_{\ell +1}\), \({\widehat{K}} \in {\mathcal {M}}_{\ell +1}\) with \({\widehat{K}} \subset K\), and all levels \(\ell \in {\mathbb {N}}_0\), i.e., the volume measure of each child \({\widehat{K}}\) of a refined cell K is at most \(\gamma |K|\).

The assumption (M1) is typical for the error analysis of HHO methods on polytopal meshes, cf., e.g., [34, 35, 37, 40, 42]. The assumption (M2) holds for the newest-vertex bisection on simplicial triangulations with \(\gamma = 1/2\).

Remark 5.1

(equivalence of side lengths) The assumption (M1) ensures that \(h_S \approx h_K \approx |K|^{1/n}\) holds for all \(K \in {\mathcal {M}}_\ell \) and \(S \in \Sigma _\ell (K)\) with equivalence constants that exclusively depend on the universal constant \(\varrho \) in (M1) [40, Lemma 1.42].

Lemma 5.2

(mesh-size reduction) If the sequence \(({\mathcal {M}}_\ell )_{\ell \in {\mathbb {N}}_0}\) satisfies (M2), then the mesh-size function \(h_\ell \in P_0({\mathcal {M}}_\ell )\) with \(h_\ell |_K {:}{=}|K|^{1/n}\) for all \(K \in {\mathcal {M}}_\ell \) satisfies \(\lim _{\ell \rightarrow \infty } \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )} = 0\) for \(\Omega _\ell {:}{=}\mathrm {int}(\cup ({\mathcal {M}}_\ell \setminus {\mathcal {M}}_{\ell +1}))\).

Proof

Given any \(j \in {\mathbb {N}}_0\) and \(\alpha _j {:}{=}\gamma ^{j}|\Omega |\), define the set \({\mathcal {M}}(j) \subset \cup _{\ell \in {\mathbb {N}}_0} {\mathcal {M}}_\ell \) of all polytopes K with volume measure \(\alpha _{j+1} < |K| \le \alpha _j\). Since the volume measure of any refined polytope is reduced at least by the factor \(\gamma \), the polytopes of \({\mathcal {M}}(j)\) are not children of each other and so \(|K \cap T| = 0\) holds for any two distinct polytopes \(K, T \in {\mathcal {M}}(j)\). This implies that the cardinality \(|{\mathcal {M}}(j)|\) of \({\mathcal {M}}(j)\) satisfies \(|{\mathcal {M}}(j)| < \gamma ^{-(j+1)}\). For any level \(\ell \in {\mathbb {N}}_0\), select some \(K_\ell \in {\mathcal {M}}_\ell \setminus {\mathcal {M}}_{\ell +1}\) with \(|K_\ell | = \Vert h_\ell \Vert _{L^{\infty }(\Omega _\ell )}^n\). Since \(K_\ell \notin {\mathcal {M}}_j\) for all \(j > \ell \), the polytopes \(K_0,K_1,K_2,\dots \) are pairwise distinct. Given \(N \in {\mathbb {N}}_0\), the number \(|\{\ell \in {\mathbb {N}}_0 : |K_\ell | > \alpha _{N+1}\}|\) of all indices \(\ell \in {\mathbb {N}}_0\) with \(|K_\ell | > \alpha _{N+1}\) is bounded by \(|{\mathcal {M}}(0)| + |{\mathcal {M}}(1)| + \dots + |{\mathcal {M}}(N)| \le (\gamma ^{-(N+1)} - 1)/(1 - \gamma )\). Hence there exists an index L such that \(\Vert h_\ell \Vert ^n_{L^\infty (\Omega _\ell )} = |K_\ell | \le \alpha _{N+1}\) for all \(\ell \ge L\). Since N is arbitrary and \(\alpha _{N+1} \rightarrow 0\) as \(N \rightarrow \infty \), this proves the claim. Notice that Lemma 4.3 follows for simplicial triangulations with \(\gamma = 1/2\). \(\square \)
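The counting argument in this proof can be made explicit. The following lines only restate quantities from the proof; the numerical values at the end are an illustrative choice. The polytopes of a nonempty set \({\mathcal {M}}(j)\) have pairwise disjoint interiors and volume larger than \(\alpha _{j+1}\), whence

$$\begin{aligned} \alpha _{j+1}|{\mathcal {M}}(j)| < \sum _{K \in {\mathcal {M}}(j)} |K| \le |\Omega | \quad \text {and so}\quad |{\mathcal {M}}(j)| < |\Omega |/\alpha _{j+1} = \gamma ^{-(j+1)}. \end{aligned}$$

The geometric sum of these bounds reads

$$\begin{aligned} \sum _{j = 0}^N \gamma ^{-(j+1)} = \gamma ^{-1}\,\frac{\gamma ^{-(N+1)} - 1}{\gamma ^{-1} - 1} = \frac{\gamma ^{-(N+1)} - 1}{1 - \gamma }. \end{aligned}$$

For instance, \(\gamma = 1/2\) and \(N = 2\) give \(2 + 4 + 8 = 14 = (2^3 - 1)/(1/2)\).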

5.2 Stabilization

The classical HHO method [1, 34] utilizes a gradient reconstruction \({\mathcal {G}}_\ell : V({\mathcal {M}}_\ell ) \rightarrow \Sigma ({\mathcal {M}}_\ell )\) in the space \(\Sigma ({\mathcal {M}}_\ell ) {:}{=}P_k({\mathcal {M}}_\ell ;{\mathbb {M}})\) of matrix-valued piecewise polynomials of total degree at most k. The discrete seminorm \(\Vert \bullet \Vert _\ell \) of \(V({\mathcal {M}}_\ell )\) and the operators \(\mathrm {I}_\ell \), \({\mathcal {R}}_\ell \), \({\mathcal {G}}_\ell \) of this section are defined by the formulas (3.2)–(3.6) in Sect. 3.3 with adapted notation, i.e., \({\mathcal {T}}_\ell \) (resp. \({\mathcal {F}}_\ell \)) is replaced by \({\mathcal {M}}_\ell \) (resp. \(\Sigma _\ell \)).

Remark 5.3

(need of stabilization) The kernel of the gradient reconstruction \({\mathcal {G}}_\ell \) restricted to \(V_\mathrm {D}({\mathcal {M}}_\ell )\) is not trivial. For instance, any \(v_\ell = (v_{{\mathcal {M}}_\ell }, 0) \in V_\mathrm {D}({\mathcal {M}}_\ell )\) with \(v_{{\mathcal {M}}_\ell } \in P_k({\mathcal {M}}_\ell ;{\mathbb {R}}^m)\) and \(v_{{\mathcal {M}}_\ell } \perp P_{k-1}({\mathcal {M}}_\ell ;{\mathbb {R}}^m)\) (with the convention \(P_{-1}({\mathcal {M}}_\ell ;{\mathbb {R}}^m) {:}{=}\{0\}\)) satisfies \({\mathcal {G}}_\ell v_\ell = 0\) and the norm equivalence in Lemma 3.1a fails. On simplicial meshes, a gradient reconstruction in any discrete space \(\Sigma ({\mathcal {M}}_\ell )\) with \(\mathrm {RT}_k({\mathcal {M}}_\ell ;{\mathbb {M}}) \subset \Sigma ({\mathcal {M}}_\ell )\) is stable, but the commutativity from Lemma 3.1b may fail if \(\Sigma ({\mathcal {M}}_\ell )\) is too large, e.g., \(\Sigma ({\mathcal {M}}_\ell ) = P_{k+1}({\mathcal {M}}_\ell ;{\mathbb {M}})\) [1].
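To see why such a \(v_\ell \) lies in the kernel, assume the standard HHO definition of the gradient reconstruction through an integration by parts. The formulas (3.2)–(3.6) are not repeated in this section, so the following form is an assumption: for all \(K \in {\mathcal {M}}_\ell \) with unit outer normal \(\nu _K\) and all \(\tau \in P_k(K;{\mathbb {M}})\),

$$\begin{aligned} \int _K {\mathcal {G}}_\ell v_\ell : \tau \,{\mathrm {d}}x = -\int _K v_K \cdot {\text {div}}\,\tau \,{\mathrm {d}}x + \sum _{S \in \Sigma _\ell (K)} \int _S v_S \cdot \tau \nu _K \,{\mathrm {d}}s. \end{aligned}$$

For \(v_\ell = (v_{{\mathcal {M}}_\ell }, 0)\), all side integrals vanish. Since \({\text {div}}\,\tau \in P_{k-1}(K;{\mathbb {R}}^m)\), the orthogonality \(v_{{\mathcal {M}}_\ell } \perp P_{k-1}({\mathcal {M}}_\ell ;{\mathbb {R}}^m)\) shows that the volume integral vanishes as well. Hence \(\int _K {\mathcal {G}}_\ell v_\ell : \tau \,{\mathrm {d}}x = 0\) for all \(\tau \in P_k(K;{\mathbb {M}})\) and so \({\mathcal {G}}_\ell v_\ell = 0\). In the lowest-order case \(k = 0\), \({\text {div}}\,\tau = 0\) and every piecewise constant \(v_{{\mathcal {M}}_\ell }\) with vanishing side values belongs to the kernel.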

The stabilization function \(\mathrm {s}_\ell : V({\mathcal {M}}_\ell ) \times V({\mathcal {M}}_\ell ) \rightarrow {\mathbb {R}}\) in the HHO methodology is defined, for any \(u_\ell , v_\ell = (v_{{\mathcal {M}}_\ell },v_{\Sigma _\ell }) \in V({\mathcal {M}}_\ell )\) and any side \(S \in \Sigma _\ell (K)\) of \(K \in {\mathcal {M}}_\ell \) with diameter \(h_S = \mathrm {diam}(S)\), by \(\mathrm {s}_\ell (u_\ell ; v_\ell ) {:}{=}\sum _{K \in {\mathcal {M}}_\ell } \mathrm {s}_K(u_\ell ; v_\ell )\) and

$$\begin{aligned} \begin{aligned} {\mathcal {S}}_{K,S} v_\ell&{:}{=}\Pi _S^k (v_S - v_K - (1 - \Pi _K^k) ({\mathcal {R}}_\ell v_\ell )|_K) \in P_k(S;{\mathbb {R}}^m),\\ \mathrm {s}_K(u_\ell ; v_\ell )&{:}{=}\sum _{S \in \Sigma _\ell (K)} h_S^{1-p} \int _S |{{\mathcal {S}}_{K,S} u_\ell }|^{p-2} {\mathcal {S}}_{K,S} u_\ell \cdot {\mathcal {S}}_{K,S} v_\ell \,{\mathrm {d}}s. \end{aligned} \end{aligned}$$
(5.1)

Notice that \(\mathrm {s}_\ell (\bullet ;\bullet )\) is linear in the second component, but not in the first unless \(p = 2\). The relevant properties of \(\mathrm {s}_\ell (\bullet ;\bullet )\) are summarized below.

Lemma 5.4

(stabilization) Any \(u_\ell , v_\ell = (v_{{\mathcal {M}}_\ell },v_{\Sigma _\ell }) \in V({\mathcal {M}}_\ell )\), \(v \in V\), and \(K \in {\mathcal {M}}_\ell \) satisfy (a)–(e) with parameters \(p, r, s, t\) from Table 1.

  1. (a)

    \(\Vert v_\ell \Vert _\ell ^p \approx \Vert {\mathcal {G}}_\ell v_\ell \Vert _{L^p(\Omega )}^p + \mathrm {s}_{\ell }(v_\ell ;v_\ell )\).

  2. (b)

    \(\mathrm {s}_K(\mathrm {I}_\ell v; \mathrm {I}_\ell v)^{1/p} \lesssim \min _{\varphi _{h} \in P_{k+1}(K;{\mathbb {R}}^m)} \Vert \mathrm {D}(v - \varphi _{h})\Vert _{L^p(K)}\). In particular, if \(v \in W^{k+2,p}(K;{\mathbb {R}}^m)\), then \(\mathrm {s}_K(\mathrm {I}_\ell v; \mathrm {I}_\ell v)^{1/p} \lesssim h_K^{k+1}|v|_{W^{k+2,p}(K)}\).

  3. (c)
    $$\begin{aligned} \begin{aligned} \mathrm {s}_K(v_\ell ;v_\ell )&\lesssim h_K^{-p}\Vert \Pi _{K}^k({\mathcal {R}}_\ell v_\ell - v_{K})\Vert _{L^p(K)}^p\\&\quad + \sum _{S \in \Sigma _\ell (K)} h_S^{1-p}\Vert \Pi _S^k(({\mathcal {R}}_\ell v_\ell )|_K - v_S)\Vert _{L^p(S)}^p. \end{aligned} \end{aligned}$$
  4. (d)

    \(\mathrm {s}_K(u_\ell ;v_\ell ) \le \mathrm {s}_K(u_\ell ;u_\ell )^{1/p'}\mathrm {s}_K(v_\ell ;v_\ell )^{1/p}\).

  5. (e)
    $$\begin{aligned} \begin{aligned}&\sum _{K \in {\mathcal {M}}_\ell } \sum _{S \in \Sigma _\ell (K)} \Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} (u_\ell - v_\ell )\Vert _{L^p(S)}^r\\&\quad \lesssim \big (1 + \mathrm {s}_{\ell }(u_\ell ;u_\ell ) + \mathrm {s}_\ell (v_\ell ;v_\ell )\big )^{t/t'} (\mathrm {s}_\ell (v_\ell ;v_\ell )/p - \mathrm {s}_\ell (u_\ell ;v_\ell ) + \mathrm {s}_{\ell }(u_\ell ;u_\ell )/p'). \end{aligned} \end{aligned}$$

Proof

The norm equivalence in (a) is established in [37, Lemma 4] for \(p= 2\) and extended to \(1 \le p < \infty \) in [34, Lemma 5.2]; the approximation property (b) is [42, Lemma 3.2]. The upper bound (c) follows immediately from a triangle and a discrete trace inequality. The proof of (d) concerns \(K \in {\mathcal {M}}_\ell \) and \(S \in \Sigma _\ell (K)\). A Hölder inequality with the exponents p, \(p'\) and \(1-p+1/p' = (1-p)/p'\) show

$$\begin{aligned}&h_S^{1-p} \int _S |{\mathcal {S}}_{K,S} u_\ell |^{p-2} {\mathcal {S}}_{K,S} u_\ell \cdot {\mathcal {S}}_{K,S} v_\ell \,{\mathrm {d}}s\\&\quad \le \Vert h_S^{(1-p)/p'}|{\mathcal {S}}_{K,S} u_\ell |^{p-2}{\mathcal {S}}_{K,S} u_\ell \Vert _{L^{p'}(S)}\Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} v_\ell \Vert _{L^p(S)}\\&\quad = \Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} u_\ell \Vert _{L^p(S)}^{p/p'}\Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} v_\ell \Vert _{L^p(S)}. \end{aligned}$$

The sum of this over all \(S \in \Sigma _\ell (K)\) and a Hölder inequality for sums with the exponents \(p'\) and p prove (d). The proof of (e) departs from the function \(W(a) {:}{=}|a|^p/p\) for \(a \in {\mathbb {R}}^m\) with the convexity control (2.5). The integral of (2.5) over the side S leads to (4.23) for all \(\varrho , \xi \in L^p(S;{\mathbb {R}}^m)\) and \(\Omega \) (resp. \({\mathbb {M}}\)) replaced by S (resp. \({\mathbb {R}}^m\)). The choice \(\varrho {:}{=}h_S^{-1/p'}{\mathcal {S}}_{K,S} v_\ell \) and \(\xi {:}{=}h_S^{-1/p'}{\mathcal {S}}_{K,S} u_\ell \) in (4.23) leads to

$$\begin{aligned}&(3c_{5})^{-1}(|S| + \Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} u_\ell \Vert _{L^p(S)}^p + \Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} v_\ell \Vert _{L^p(S)}^p)^{-t/t'}\nonumber \\&\quad \times \Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} (u_\ell - v_\ell )\Vert _{L^p(S)}^r \le \Vert h_S^{-1/p'} {\mathcal {S}}_{K,S} v_\ell \Vert ^p_{L^p(S)}/p \nonumber \\&\qquad - \int _S h_S^{1-p}|{\mathcal {S}}_{K,S} u_\ell |^{p-2}{\mathcal {S}}_{K,S} u_\ell \cdot {\mathcal {S}}_{K,S} v_\ell \,{\mathrm {d}}s + \Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} u_\ell \Vert _{L^p(S)}^p/p'. \end{aligned}$$
(5.2)

The sum of this over all \(S \in \Sigma _\ell (K)\) and \(K \in {\mathcal {M}}_\ell \) concludes the proof of (e). \(\square \)
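The summation step in the proof of (d) can be written out with the abbreviations \(a_S {:}{=}\Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} u_\ell \Vert _{L^p(S)}\) and \(b_S {:}{=}\Vert h_S^{-1/p'}{\mathcal {S}}_{K,S} v_\ell \Vert _{L^p(S)}\), which are introduced here for illustration only. Since \(-p/p' = 1-p\), the definition (5.1) provides \(\mathrm {s}_K(u_\ell ;u_\ell ) = \sum _{S \in \Sigma _\ell (K)} a_S^p\) and \(\mathrm {s}_K(v_\ell ;v_\ell ) = \sum _{S \in \Sigma _\ell (K)} b_S^p\), so that

$$\begin{aligned} \mathrm {s}_K(u_\ell ;v_\ell ) \le \sum _{S \in \Sigma _\ell (K)} a_S^{p/p'} b_S \le \Big (\sum _{S \in \Sigma _\ell (K)} a_S^{p}\Big )^{1/p'} \Big (\sum _{S \in \Sigma _\ell (K)} b_S^{p}\Big )^{1/p} = \mathrm {s}_K(u_\ell ;u_\ell )^{1/p'}\mathrm {s}_K(v_\ell ;v_\ell )^{1/p}. \end{aligned}$$

As a plausibility check of (e), consider \(p = 2\). Then \(\mathrm {s}_\ell \) is a symmetric bilinear form and the last factor on the right-hand side of (e) reduces to

$$\begin{aligned} \mathrm {s}_\ell (v_\ell ;v_\ell )/2 - \mathrm {s}_\ell (u_\ell ;v_\ell ) + \mathrm {s}_\ell (u_\ell ;u_\ell )/2 = \mathrm {s}_\ell (u_\ell - v_\ell ;u_\ell - v_\ell )/2 = \frac{1}{2}\sum _{K \in {\mathcal {M}}_\ell } \sum _{S \in \Sigma _\ell (K)} \Vert h_S^{-1/2}{\mathcal {S}}_{K,S} (u_\ell - v_\ell )\Vert _{L^2(S)}^2. \end{aligned}$$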

5.3 Stabilized HHO method on a polytopal mesh

The discrete problem minimizes the discrete energy

$$\begin{aligned} E_\ell (v_\ell ) {:}{=}\int _\Omega (W({\mathcal {G}}_\ell v_\ell ) - f \cdot v_{{\mathcal {M}}_\ell }) \,{\mathrm {d}}x - \int _{\Gamma _{\mathrm {N}}} g \cdot v_{\Sigma _\ell } \,{\mathrm {d}}s + \mathrm {s}_\ell (v_\ell ;v_\ell )/p \end{aligned}$$
(5.3)

among \(v_\ell = (v_{{\mathcal {M}}_\ell }, v_{\Sigma _\ell }) \in {\mathcal {A}}({\mathcal {M}}_\ell )\).

Theorem 5.5

(discrete minimizers) The minimal discrete energy \(\inf E_\ell ({\mathcal {A}}({\mathcal {M}}_\ell ))\) is attained. There exists a constant \(C_{7} > 0\) that merely depends on \(c_{1}, c_{2}, \Omega \), \(\Gamma _{\mathrm {D}}\), \(u_\mathrm {D}\), f, g, \(\varrho \) in (M1), k, and p with \(\Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )}^p + \mathrm {s}_\ell (u_\ell ;u_\ell ) \le C_{7}^p\) for any discrete minimizer \(u_\ell \in \arg \min E_\ell ({\mathcal {A}}({\mathcal {M}}_\ell ))\). Any discrete stress \(\sigma _\ell {:}{=}\Pi _{{\mathcal {M}}_\ell }^k \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\) satisfies the discrete Euler–Lagrange equations

$$\begin{aligned} \int _\Omega \sigma _\ell : {\mathcal {G}}_\ell v_\ell \,{\mathrm {d}}x = \int _\Omega f \cdot v_{{\mathcal {M}}_\ell } \,{\mathrm {d}}x + \int _{\Gamma _{\mathrm {N}}} g \cdot v_{\Sigma _\ell } \,{\mathrm {d}}s - \mathrm {s}_\ell (u_\ell ;v_\ell ) \end{aligned}$$
(5.4)

for all \(v_\ell = (v_{{\mathcal {M}}_\ell }, v_{\Sigma _\ell }) \in V_\mathrm {D}({\mathcal {M}}_\ell )\). If W satisfies (2.5), then \(u_\ell = \arg \min E_\ell ({\mathcal {A}}({\mathcal {M}}_\ell ))\) is unique. If W satisfies (2.6), then \(\mathrm {D}W({\mathcal {G}}_\ell u_\ell ) \in L^{p'}(\Omega ;{\mathbb {M}})\) is unique (independent of the choice of a (possibly non-unique) discrete minimizer \(u_\ell \)).

Proof

The proof follows that of Theorem 3.2. The lower growth of W leads to the coercivity of \(E_\ell \) in \({\mathcal {A}}({\mathcal {M}}_\ell )\) with respect to the seminorm \(\Vert \bullet \Vert _\ell ^p \approx \Vert {\mathcal {G}}_\ell \bullet \Vert _{L^p(\Omega )}^p + \mathrm {s}_\ell (\bullet ;\bullet )\) from Lemma 5.4a. This implies the existence of discrete minimizers and the bound \(\Vert {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )}^p + \mathrm {s}_\ell (u_\ell ;u_\ell ) \le C_{7}^p\) for all \(u_\ell \in \arg \min E_\ell ({\mathcal {A}}({\mathcal {M}}_\ell ))\). If W satisfies (2.5), then the strict convexity of W and of \(\mathrm {s}_\ell \) from Lemma 5.4e lead to the uniqueness of \(u_\ell = \arg \min E_\ell ({\mathcal {A}}({\mathcal {M}}_\ell ))\). If W satisfies (2.6), then the uniqueness of \(\mathrm {D}W({\mathcal {G}}_\ell u_\ell )\) follows as in [16, 27, 28]. \(\square \)

The following lemma extends Lemma 3.4 to polytopal meshes.

Lemma 5.6

There exists a linear operator \({\mathcal {J}}_{\ell } : V({\mathcal {M}}_\ell ) \rightarrow V\) such that any \(v_\ell = (v_{{\mathcal {M}}_\ell }, v_{\Sigma _\ell }) \in V({\mathcal {M}}_\ell )\) satisfies

$$\begin{aligned} \Pi _{{\mathcal {M}}_\ell }^k {\mathcal {J}}_\ell v_\ell = v_{{\mathcal {M}}_\ell } \quad \text {and}\quad \Pi _{\Sigma _\ell }^k {\mathcal {J}}_{\ell } v_\ell = v_{\Sigma _\ell } \end{aligned}$$
(5.5)

and, for any \(K \in {\mathcal {M}}_\ell \), the estimate

$$\begin{aligned} \begin{aligned}&\Vert {\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(K)}^p \lesssim \sum _{S \in \Sigma _\ell (\Omega ), S \cap K \ne \emptyset } h_S^{1-p}\Vert [{\mathcal {R}}_\ell v_\ell ]_S\Vert _{L^p(S)}^p\\&\quad + \sum _{S \in \Sigma _\ell (K)} h_S^{1-p}\Vert ({\mathcal {R}}_\ell v_\ell )|_K - v_S\Vert _{L^p(S)}^p + h_K^{-p} \Vert {\mathcal {R}}_\ell v_\ell - v_K\Vert _{L^p(K)}^p. \end{aligned} \end{aligned}$$
(5.6)

In particular, \({\mathcal {J}}_\ell \) is stable in the sense that \(\Vert \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} \le \Lambda _1 \Vert v_\ell \Vert _\ell \) holds with the constant \(\Lambda _1\) that exclusively depends on k, p, and \(\varrho \) in (M1).

Proof

The construction of the conforming operator \({\mathcal {J}}_\ell \) on polytopal meshes in [42, Section 5] utilizes averaging and bubble-function techniques on the subtriangulation \({\mathcal {T}}_\ell \) and gives rise to an upper bound of \(\Vert {\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(K)}^p\), namely

$$\begin{aligned}&\sum _{S \in \Sigma _\ell (\Omega ), S \cap K \ne \emptyset } h_S^{1-p}\Vert [{\mathcal {R}}_\ell v_\ell ]_S\Vert _{L^p(S)}^p + \sum _{T \in {\mathcal {T}}_\ell , T \subset K} h_T^{-p}\Vert \Pi _T^k({\mathcal {R}}_\ell v_\ell - v_K)\Vert ^p_{L^p(T)} \nonumber \\&\quad + \sum _{S \in \Sigma _\ell (K)} \sum _{F \in {\mathcal {F}}_\ell , F \subset S} h_F^{1-p} \Vert \Pi _{F}(({\mathcal {R}}_\ell v_\ell )|_K - v_S)\Vert ^p_{L^p(F)}. \end{aligned}$$
(5.7)

Since the \(L^2\) projection \(\Pi _T^k\) (resp. \(\Pi _F^k\)) is stable in \(L^p(T;{\mathbb {R}}^m)\) (resp. \(L^p(F;{\mathbb {R}}^m)\)) [34, Lemma 3.2], it can be omitted in (5.7). This, the equivalence \(h_T \approx h_K\) for all \(K \in {\mathcal {M}}_\ell \), \(T \in {\mathcal {T}}_\ell \) with \(T \subset K\) from (M1), and \(h_F \approx h_S\) for all \(S \in \Sigma _\ell \), \(F \in {\mathcal {F}}_\ell \) with \(F \subset S\) from (M1) and Remark 5.1 show (5.6). This implies the stability \(\Vert \mathrm {D}{\mathcal {J}}_\ell v_\ell \Vert _{L^p(\Omega )} \lesssim \Vert v_\ell \Vert _\ell \), cf., e.g., [42, Subsection 4.3] for more details. Notice that the computation of the right-hand side of (5.6) does not require explicit information on the subtriangulation \({\mathcal {T}}_\ell \). \(\square \)

Remark 5.7

(discrete compactness) The discrete compactness from Theorem 4.1 holds verbatim with \({\mathcal {T}}_\ell \) (resp. \({\mathcal {F}}_\ell \)) replaced by \({\mathcal {M}}_\ell \) (resp. \(\Sigma _\ell \)). Notice that \({\mathcal {G}}_\ell \) from Sect. 5.2 and \({\mathcal {J}}_\ell \) from Lemma 5.6 in this section are different objects. With adapted notation, all arguments from the proof of Theorem 4.1 apply verbatim. Indeed, the commutativity \(\Pi _{\Sigma ({\mathcal {M}}_\ell )} \mathrm {D}v = {\mathcal {G}}_\ell \mathrm {I}_\ell v\) for all \(v \in V\) from Lemma 3.1b remains valid [1, 34]. This and (5.5) imply the \(L^2\) orthogonality \({\mathcal {G}}_\ell v_\ell - \mathrm {D}{\mathcal {J}}_\ell v_\ell \perp \Sigma ({\mathcal {M}}_\ell )\) for any \(v_\ell \in V({\mathcal {M}}_\ell )\). This is the key argument in the proof of Theorem 4.1 and provides a positive power of the mesh-size in (4.1).

5.4 Proof of Theorem 2.2

Given any \(K \in {\mathcal {M}}_\ell \), Lemma 5.6 motivates the refinement indicator

$$\begin{aligned} \eta _\ell ^{(\varepsilon )}(K)&{:}{=}\mu _\ell ^{(\varepsilon )}(K) + |K|^{\varepsilon p'/n}\Vert \sigma _\ell - \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(K)}^{p'}\\&\quad + |K|^{p'/n}\Vert (1 - \Pi _K^k) f\Vert _{L^{p'}(K)}^{p'} + |K|^{1/n}\sum _{S \in \Sigma _\ell (K) \cap \Sigma _\ell (\Gamma _{\mathrm {N}})}\Vert (1 - \Pi _S^k) g\Vert _{L^{p'}(S)}^{p'} \end{aligned}$$

with

$$\begin{aligned} \begin{aligned} \mu _\ell ^{(\varepsilon )}(K)&{:}{=}|K|^{(\varepsilon p - p)/n} \Vert {\mathcal {R}}_\ell u_\ell - u_K\Vert _{L^p(K)}^p\\&\quad + |K|^{(\varepsilon p + 1 - p)/n} \Big (\sum _{S \in \Sigma _\ell (K)\cap \Sigma _\ell (\Gamma _{\mathrm {D}})} \Vert {\mathcal {R}}_\ell u_\ell - u_\mathrm {D}\Vert _{L^p(S)}^p\\&\quad + \sum _{S \in \Sigma _\ell (K)\cap \Sigma _\ell (\Omega )} \Vert [{\mathcal {R}}_\ell u_\ell ]_S\Vert _{L^p(S)}^p + \sum _{S \in \Sigma _\ell (K)} \Vert ({\mathcal {R}}_\ell u_\ell )|_K - u_S\Vert _{L^p(S)}^p\Big ). \end{aligned} \end{aligned}$$
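The mesh-size scaling in \(\mu _\ell ^{(\varepsilon )}(K)\) can be illustrated by a minimal sketch. It assumes that the p-th powers of the local \(L^p\) norms have been computed elsewhere; the function name `mu_indicator` and its arguments are hypothetical and not part of the implementation described in this paper.

```python
def mu_indicator(volume, eps, p, n, volume_term, side_terms):
    """Evaluate mu_l^(eps)(K) from precomputed p-th powers of local L^p norms.

    volume      : |K| > 0
    eps, p, n   : parameter eps, growth p, space dimension n
    volume_term : ||R_l u_l - u_K||_{L^p(K)}^p
    side_terms  : p-th powers of all side contributions on K (Dirichlet
                  mismatch, jumps, and (R_l u_l)|_K - u_S), in any order
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    # Weight |K|^{(eps p - p)/n} of the volume contribution.
    weight_volume = volume ** ((eps * p - p) / n)
    # Weight |K|^{(eps p + 1 - p)/n} shared by all side contributions.
    weight_sides = volume ** ((eps * p + 1 - p) / n)
    return weight_volume * volume_term + weight_sides * sum(side_terms)
```

Both weights blow up as \(|K| \rightarrow 0\) for small \(\varepsilon \), so the indicator only becomes small if the norms decay sufficiently fast.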

The remaining parts of this section are devoted to the proof of the convergence results in Theorem 2.2.

Proof of Theorem 2.2

The proof follows that of Theorem 2.1.

Step 1 establishes \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\). The key argument from Step 1 of the proof of Theorem 2.1 is the positive power of the mesh size in \(\eta _\ell ^{(\varepsilon )}\) in the sense that

$$\begin{aligned} \eta _\ell ^{(\varepsilon )}({\mathcal {M}}_\ell \setminus {\mathcal {M}}_{\ell +1})&\lesssim \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{\varepsilon p}(\Vert u_\ell \Vert _\ell ^p + \Vert \mathrm {D}u_\mathrm {D}\Vert _{L^p(\Omega )}^p)\\&\quad + \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}^{\varepsilon p'} \Vert \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(\Omega )}^{p'}\\&\quad + \Vert h_\ell \Vert _{L^{\infty }(\Omega _\ell )}^{p'}\Vert f\Vert _{L^{p'}(\Omega )}^{p'} + \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )}\Vert g\Vert _{L^{p'}(\Gamma _{\mathrm {N}})}^{p'}. \end{aligned}$$

Hence \(\lim _{\ell \rightarrow \infty } \Vert h_\ell \Vert _{L^\infty (\Omega _\ell )} = 0\) from Lemma 5.2 implies \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )}({\mathcal {M}}_\ell \setminus {\mathcal {M}}_{\ell +1}) = 0\). This and the Dörfler marking in (2.4) conclude \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\).
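The marking step used in this argument can be sketched as follows. This is a standard greedy realization of a Dörfler-type criterion with bulk parameter \(\theta \); the precise form of (2.4) is not reproduced here, and the name `doerfler_marking` is hypothetical.

```python
def doerfler_marking(indicators, theta):
    """Return indices of a set M of minimal cardinality such that
    theta * sum(indicators) <= sum(indicators[j] for j in M).

    indicators : nonnegative local refinement indicators eta(K)
    theta      : bulk parameter with 0 < theta <= 1
    """
    if not 0 < theta <= 1:
        raise ValueError("theta must lie in (0, 1]")
    total = sum(indicators)
    # Sort cell indices by decreasing indicator; the largest contributions
    # are marked first, which yields a set of minimal cardinality.
    order = sorted(range(len(indicators)),
                   key=lambda j: indicators[j], reverse=True)
    marked, accumulated = [], 0.0
    for j in order:
        if accumulated >= theta * total:
            break
        marked.append(j)
        accumulated += indicators[j]
    return marked
```

Sorting costs \(O(N \log N)\) operations for N cells; this is a design choice for brevity and not a statement about the implementation used for the experiments.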

Step 2 provides a LEB with the extra stabilization term \(\mathrm {s}_{\ell }(u_\ell ;\mathrm {I}_\ell u)\), namely

$$\begin{aligned} \mathrm {LEB}_\ell&{:}{=}E_\ell (u_\ell ) + \int _\Omega (1 - \Pi _{{\mathcal {M}}_\ell }^k) \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) : \mathrm {D}u \,{\mathrm {d}}x - \mathrm {s}_\ell (u_\ell ;\mathrm {I}_\ell u)\nonumber \\&\quad - C_{3}\big (\mathrm {osc}(f,{\mathcal {M}}_\ell ) + \mathrm {osc}_{\mathrm {N}}(g,\Sigma _{\ell }(\Gamma _{\mathrm {N}}))\big ) \le E(u) - \mathrm {s}_\ell (u_\ell ;u_\ell )/p' \le E(u). \end{aligned}$$
(5.8)

The commutativity \(\Pi _{\Sigma ({\mathcal {M}}_\ell )} \mathrm {D}u = {\mathcal {G}}_\ell \mathrm {I}_\ell u\) from Lemma 3.1b and the discrete Euler–Lagrange equations (5.4) show that

$$\begin{aligned} \begin{aligned} \int _\Omega \sigma _\ell : (\mathrm {D}u - {\mathcal {G}}_\ell u_\ell ) \,{\mathrm {d}}x&= \int _\Omega f \cdot (\Pi _{{\mathcal {M}}_\ell }^k u - u_{{\mathcal {M}}_\ell }) \,{\mathrm {d}}x\\&\quad + \int _{\Gamma _\mathrm {N}} g \cdot (\Pi _{\Sigma _\ell }^k u - u_{\Sigma _\ell }) \,{\mathrm {d}}s - \mathrm {s}_\ell (u_\ell ; \mathrm {I}_\ell u - u_\ell ). \end{aligned} \end{aligned}$$
(5.9)

This, (4.16), and (4.18) (with adapted notation) conclude the proof of (5.8).

Step 3 establishes \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = E(u)\). Notice from (5.6) that \(\eta _\ell ^{(\varepsilon )}\) is an upper bound for \(\mu _\ell (u_\ell )\) in (4.1). Hence the discrete compactness (from Remark 5.7) implies the existence of a (not relabelled) subsequence of \((u_\ell )_{\ell \in {\mathbb {N}}_0}\) and a weak limit \(v \in {\mathcal {A}}\) such that \({\mathcal {J}}_{\ell } u_\ell \rightharpoonup v\) weakly in V and \({\mathcal {G}}_\ell u_\ell \rightharpoonup \mathrm {D}v\) weakly in \(L^p(\Omega ;{\mathbb {M}})\) as \(\ell \rightarrow \infty \). The only difference between the LEB in (5.8) and that in (4.15) for simplicial meshes is the additional term \(\mathrm {s}_{\ell }(u_\ell ;\mathrm {I}_\ell u)\) in this proof.

Lemma 5.8

(convergence of \(\mathrm {s}_{\ell }(u_\ell ;\mathrm {I}_\ell u)\)) Given a sequence \((u_\ell )_{\ell \in {\mathbb {N}}_0}\) with \(u_\ell \in V({\mathcal {M}}_\ell )\) for all \(\ell \in {\mathbb {N}}_0\), suppose that \(\mathrm {s}_\ell (u_\ell ;u_\ell ) \le C_{8}\) for a universal constant \(C_{8}\) independent of the level \(\ell \) and \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\) with \(\varepsilon \le \min \{k+1,(k+1)/(p-1)\}\). Then

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \mathrm {s}_\ell (u_\ell ;\mathrm {I}_\ell u) = 0. \end{aligned}$$
(5.10)

Proof of Lemma 5.8

The proof of (5.10) first establishes this for smooth functions. Given any \(\varphi \in C^\infty ({\overline{\Omega }};{\mathbb {R}}^m)\), the Hölder inequality from Lemma 5.4d, \(h_K \approx |K|^{1/n}\) from Remark 5.1, and the interpolation error from Lemma 5.4b prove

$$\begin{aligned} \begin{aligned} |\mathrm {s}_{\ell }(u_\ell ;\mathrm {I}_\ell \varphi )|&\le \sum _{K \in {\mathcal {M}}_\ell } \mathrm {s}_{K}(u_\ell ;u_\ell )^{1/p'}\mathrm {s}_{K}(\mathrm {I}_\ell \varphi ;\mathrm {I}_\ell \varphi )^{1/p}\\&\lesssim \Big (\sum _{K \in {\mathcal {M}}_\ell } |K|^{(k+1)p'/n}\mathrm {s}_{K}(u_\ell ;u_\ell )\Big )^{1/p'} |\varphi |_{W^{k+2,p}(\Omega )}. \end{aligned} \end{aligned}$$
(5.11)

Lemma 5.4c implies that \(\eta _\ell ^{(\varepsilon )}\) controls the stabilization in the sense that

$$\begin{aligned} \sum _{K \in {\mathcal {M}}_\ell } |K|^{(k+1)p'/n}\mathrm {s}_{K}(u_\ell ;u_\ell ) \lesssim \Vert h_\ell \Vert _{L^\infty (\Omega )}^{((k+1)p' - \varepsilon p)/n}\eta _\ell ^{(\varepsilon )}. \end{aligned}$$
(5.12)

The restriction \(\varepsilon \le (k+1)/(p-1)\) provides \((k+1)p' - \varepsilon p \ge 0\), so that the power of the mesh-size on the right-hand side of (5.12) remains bounded. Hence \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\) implies that the right-hand side of (5.12) vanishes in the limit as \(\ell \rightarrow \infty \). This and (5.11)–(5.12) lead to \(\lim _{\ell \rightarrow \infty } \mathrm {s}_{\ell }(u_\ell ;\mathrm {I}_\ell \varphi ) = 0\) for all \(\varphi \in C^\infty ({\overline{\Omega }};{\mathbb {R}}^m)\). Given any \(\delta > 0\), let \(\varphi \in C^\infty ({\overline{\Omega }};{\mathbb {R}}^m)\) be such that \(\Vert \mathrm {D}(u - \varphi )\Vert _{L^p(\Omega )} \le \delta \). The interpolation error from Lemma 5.4b proves \(\mathrm {s}_\ell (\mathrm {I}_\ell (u - \varphi ); \mathrm {I}_\ell (u - \varphi )) \le C_{9}^p \Vert {\mathrm {D}(u - \varphi )}\Vert _{L^p(\Omega )}^p \le C_{9}^p \delta ^p\) with a universal constant \(C_{9} > 0\). The convergence \(\lim _{\ell \rightarrow \infty } \mathrm {s}_{\ell }(u_\ell ;\mathrm {I}_\ell \varphi ) = 0\) implies the existence of \(N \in {\mathbb {N}}_0\) with \(|\mathrm {s}_\ell (u_\ell ;\mathrm {I}_\ell \varphi )| \le \delta \) for all \(\ell \ge N\). This, a triangle inequality, a Hölder inequality, and the bound \(\mathrm {s}_\ell (u_\ell ;u_\ell ) \le C_{8}\) (by assumption) verify

$$\begin{aligned} |\mathrm {s}_\ell (u_\ell ; \mathrm {I}_\ell u)|&\le |\mathrm {s}_\ell (u_\ell ;\mathrm {I}_\ell \varphi )| + |\mathrm {s}_\ell (u_\ell ; \mathrm {I}_\ell (u - \varphi ))| \le |\mathrm {s}_\ell (u_\ell ;\mathrm {I}_\ell \varphi )|\\&\quad + \mathrm {s}_\ell (u_\ell ;u_\ell )^{1/p'}\mathrm {s}_\ell (\mathrm {I}_\ell (u - \varphi ); \mathrm {I}_\ell (u - \varphi ))^{1/p} \le (1 + C_{8}^{1/p'}C_{9}) \delta . \end{aligned}$$

This concludes the proof of \(\lim _{\ell \rightarrow \infty } \mathrm {s}_\ell (u_\ell ;\mathrm {I}_\ell u) = 0\) in (5.10). \(\square \)
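The role of the restriction on \(\varepsilon \) in Lemma 5.8 can be made explicit by a short computation with the conjugate exponent \(p' = p/(p-1)\), assuming \(1< p < \infty \):

$$\begin{aligned} (k+1)p' - \varepsilon p = p\Big (\frac{k+1}{p-1} - \varepsilon \Big ) \ge 0 \quad \Longleftrightarrow \quad \varepsilon \le \frac{k+1}{p-1}. \end{aligned}$$

Hence the exponent of the mesh-size on the right-hand side of (5.12) is nonnegative under the assumption of Lemma 5.8, and it is positive whenever the inequality for \(\varepsilon \) is strict.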

We return to the proof of Theorem 2.2 and recall \(\mathrm {s}_\ell (u_\ell ;u_\ell ) \le C_{7}^p\) from Theorem 5.5 and \(\lim _{\ell \rightarrow \infty } \eta _\ell ^{(\varepsilon )} = 0\) from Step 1. Hence Lemma 5.8 applies and (5.10) follows. With the additional argument (5.10) and the remaining conclusions that lead to (4.21) in the proof of Theorem 2.1, \(E(u) \le E(v) \le \liminf _{\ell \rightarrow \infty } \mathrm {LEB}_\ell \le E(u)\) follows for the weak limit v. This implies \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = \lim _{\ell \rightarrow \infty } \mathrm {LEB}_\ell = E(u)\). Since \(\mathrm {s}_\ell (u_\ell ;u_\ell )/p' \le E(u) - \mathrm {LEB}_\ell \) from (5.8), \(\mathrm {s}_\ell (u_\ell ;u_\ell )\) vanishes in the limit as \(\ell \rightarrow \infty \). If W satisfies (2.5), then the choice \(\varrho {:}{=}\mathrm {D}u\) and \(\xi {:}{=}{\mathcal {G}}_\ell u_\ell \) in (4.23), the identity (5.9), and the data oscillations from (4.18) imply

$$\begin{aligned} C_{10}^{-1}\Vert \mathrm {D}u - {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )}^r + \mathrm {s}_\ell (u_\ell ;u_\ell )/p' \le E(u) - \mathrm {LEB}_\ell \end{aligned}$$
(5.13)

with the constant \(C_{10} {:}{=}3 c_{5}(|\Omega | + C_{2}^p + C_{7}^p)^{t/t'}\) and \(r, t\) from Table 1. This shows \(\lim _{\ell \rightarrow \infty } {\mathcal {G}}_\ell u_\ell = \mathrm {D}u\) (strongly) in \(L^p(\Omega ;{\mathbb {M}})\). If W satisfies (2.6), then (4.26) holds and \(C_{10}^{-1}\Vert \mathrm {D}u - {\mathcal {G}}_\ell u_\ell \Vert _{L^p(\Omega )}^r\) on the left-hand side of (5.13) can be replaced by \(C_{11}^{-1}\Vert \sigma - \mathrm {D}W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(\Omega )}^{{\widetilde{r}}}\) with \(C_{11} {:}{=}3 c_{6}(|\Omega | + C_{2}^p + C_{7}^p)^{{\widetilde{t}}/{\widetilde{t}}'}\) and \({\widetilde{r}},{\widetilde{t}}\) from Table 1. Hence \(\lim _{\ell \rightarrow \infty } \mathrm {D}W({\mathcal {G}}_\ell u_\ell ) = \sigma \) (strongly) in \(L^{p'}(\Omega ;{\mathbb {M}})\). \(\square \)

6 Numerical examples

Some remarks on the implementation precede the numerical benchmarks for the three examples of Sect. 2.4 and the experiments in the Foss–Hrusa–Mizel example with the Lavrentiev gap in Sect. 6.5.

6.1 Implementation

The realization in MATLAB follows that of [28, Subsubsection 5.1.1] with the parameters \(\texttt {FunctionTolerance} = \texttt {OptimalityTolerance} = \texttt {StepTolerance} = 10^{-15}\) and \(\texttt {MaxIterations} = \texttt {Inf}\) for improved accuracy.

The class of minimization problems at hand allows, in general, for multiple exact and discrete solutions. The numerical experiments select one (of those) by the approximation in fminunc with the initial value computed as follows. On the coarse initial triangulation \({\mathcal {T}}_0\), the initial value \(v_0 = (v_{{\mathcal {T}}_0},v_{{\mathcal {F}}_0}) \in V({\mathcal {T}}_0)\) is defined by \(v_{{\mathcal {T}}_0} \equiv 1\), \(v_{{\mathcal {F}}_0}|_F \equiv 1\) on any \(F \in {\mathcal {F}}_0(\Omega )\), and \(v_{{\mathcal {F}}_0}|_F = \Pi _F^k u_\mathrm {D}\) for all \(F \in {\mathcal {F}}_0(\Gamma _{\mathrm {D}})\). On each refinement \({\mathcal {T}}_{\ell +1}\) of some triangulation \({\mathcal {T}}_\ell \), the initial approximation is defined by a prolongation of the output \(u_\ell \) of the call fminunc on the coarse triangulation \({\mathcal {T}}_\ell \). The prolongation maps \(u_\ell \) onto \(v_{\ell +1} {:}{=}\mathrm {I}_{\ell +1} {\mathcal {J}}_{\ell } u_\ell \in V({\mathcal {T}}_{\ell +1})\).

The numerical integration of polynomials is exact with the quadrature formula in [45]. For non-polynomial functions such as \(W({\mathcal {G}}_\ell v_\ell )\) with \(v_\ell \in V({\mathcal {T}}_\ell )\), the number of chosen quadrature points allows for exact integration of polynomials of order \(p(k+1)\) with the growth p of W and the polynomial order k of the discretization; the same quadrature formula also applies to the integration of the dual energy density \(W^*\) in (6.2). The implementation is based on the in-house AFEM software package in MATLAB [2, 8]. Adaptive computations are carried out with \(\theta = 0.5\), \(\varepsilon = (k+1)/100\), and the polynomial degrees k from Fig. 1. Undisplayed computer experiments suggest only marginal influence of the choice of \(\varepsilon \) on the convergence rates of the errors.
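The parameter choices of this subsection can be collected in a small sketch. The helper names are hypothetical, and the positivity of \(\varepsilon \) in the admissibility check is an assumption of this sketch (Lemma 5.8 only states the upper bound).

```python
def quadrature_degree(p, k):
    """Polynomial degree p(k+1) that the quadrature integrates exactly,
    with the growth p of W and the polynomial degree k of the scheme."""
    return p * (k + 1)


def epsilon_choice(k):
    """Parameter eps = (k+1)/100 used in the adaptive computations."""
    return (k + 1) / 100


def epsilon_is_admissible(eps, k, p):
    """Check 0 < eps <= min{k+1, (k+1)/(p-1)} for p > 1.

    The upper bound is the restriction in Lemma 5.8; the lower bound
    eps > 0 is an assumption made for this illustration.
    """
    if p <= 1:
        raise ValueError("p must be larger than 1")
    return 0 < eps <= min(k + 1, (k + 1) / (p - 1))
```

For instance, \(\varepsilon = (k+1)/100\) passes this check for \(p = 4\) and all \(k = 0,\dots ,4\).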

The uniform or adaptive mesh-refinement leads to convergence history plots of the energy error \(|E(u) - E_\ell (u_\ell )|\) or the stress error \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{p'}(\Omega )}^2\) plotted against the number of degrees of freedom (ndof) in Figs. 2–11. (Recall the scaling \(\mathrm {ndof} \propto h_{\max }^{-2}\) in 2D for uniform mesh refinements with maximal mesh size \(h_{\max }\) in a log-log plot.) In the numerical experiments without a priori knowledge of u, the reference value displayed for \(\min E({\mathcal {A}})\) stems from an Aitken extrapolation of the numerical results for a sequence of uniformly refined triangulations.
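The two post-processing steps mentioned above, the empirical convergence rate with respect to ndof and the Aitken extrapolation of a reference energy, can be sketched as follows. The function names are hypothetical, and the classical Aitken \(\Delta ^2\) formula on three consecutive values is an assumption; the text does not specify how the extrapolation is applied in the experiments.

```python
import math


def empirical_rates(ndofs, errors):
    """Rates r_j between consecutive levels under the model
    error ~ ndof^(-r), i.e., the negative slope in a log-log plot."""
    return [
        -math.log(errors[j + 1] / errors[j])
        / math.log(ndofs[j + 1] / ndofs[j])
        for j in range(len(errors) - 1)
    ]


def aitken_extrapolation(e0, e1, e2):
    """Aitken delta-squared extrapolation from three consecutive values,
    e.g., discrete energies on three uniform refinement levels."""
    denominator = e2 - 2.0 * e1 + e0
    if denominator == 0.0:
        raise ZeroDivisionError("second difference vanishes")
    return e2 - (e2 - e1) ** 2 / denominator
```

The extrapolation is exact for sequences of the form \(E_j = E + c q^j\) and is otherwise only a heuristic for the limit.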

Fig. 1 Polynomial degrees \(k = 0,\dots ,4\) in the numerical benchmarks of Sect. 6

6.2 The p-Laplace equation

Fig. 2 Initial triangulation \({\mathcal {T}}_0\) (left) of the L-shaped domain and convergence history plot (right) of \(|E(u) - E_\ell (u_\ell )|\) with k from Fig. 1 on uniform (dashed line) and adaptive (solid line) triangulations for the p-Laplace benchmark in Sect. 6.2

Fig. 3 Adaptive triangulations of the L-shaped domain into 492 triangles (1238 dofs) for \(k = 0\) (left) and 490 triangles (7824 dofs) for \(k = 3\) (right) for the p-Laplace benchmark in Sect. 6.2

Fig. 4 Convergence history plot of \(\Vert \nabla u - {\mathcal {G}}_\ell u_\ell \Vert _{L^4(\Omega )}^2\) (left) and \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{4/3}(\Omega )}^2\) (right) with k from Fig. 1 on uniform (dashed line) and adaptive (solid line) triangulations for the p-Laplace benchmark in Sect. 6.2

Fig. 5 Material distribution of the L-shaped domain (left) and convergence history plot (right) of \(\mathrm {RHS}_\ell \) in (6.1) with k from Fig. 1 on uniform (dashed line) and adaptive (solid line) triangulations for the optimal design problem in Sect. 6.3

Fig. 6 Adaptive triangulation of the L-shaped domain into 1510 triangles (3721 dofs) for \(k = 0\) (left) and 1351 triangles (21,450 dofs) for \(k = 3\) (right) for the optimal design problem in Sect. 6.3

The third numerical benchmark from [15, Section 6] for the p-Laplace problem in Sect. 2.4.1 considers \(p = 4\), the right-hand side

$$\begin{aligned} f(r,\varphi ) {:}{=}343/2048 r^{-11/8} \sin (7\varphi /8), \end{aligned}$$

on the L-shaped domain \(\Omega {:}{=}(-1,1)^2\setminus ([0,1)\times (-1,0])\) with the initial triangulation \({\mathcal {T}}_0\) displayed in Fig. 2a, the Dirichlet boundary data \(u_\mathrm {D}(r,\varphi ) {:}{=}r^{7/8}\sin (7\varphi /8)\) on \(\Gamma _{\mathrm {D}} {:}{=}(\{0\} \times [-1,0]) \cup ([0,1] \times \{0\})\), and the Neumann boundary data

$$\begin{aligned} g(r,\varphi ) {:}{=}343/512 r^{-3/8} (-\sin (\varphi /8), \cos (\varphi /8)) \cdot \nu \end{aligned}$$

in polar coordinates with the outer normal unit vector \(\nu \) on \(\Gamma _{\mathrm {N}} {:}{=}\partial \Omega \setminus \Gamma _{\mathrm {D}}\). The minimal energy \(\min E({\mathcal {A}}) = -1.4423089582447\) is attained at the unique minimizer

$$\begin{aligned} u(r,\varphi ) {:}{=}r^{7/8}\sin (7\varphi /8). \end{aligned}$$
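The consistency of the data f and g with the exact solution u can be checked numerically. The following sketch assumes the stress \(\sigma = |\nabla u|^{p-2}\nabla u\) of the p-Laplace problem with \(p = 4\), so that \(f = -\mathrm {div}\,\sigma \) and \(g = \sigma \cdot \nu \); the finite-difference helpers and the sample point are illustrative choices and not part of the benchmark.

```python
import math


def polar(x, y):
    """Polar coordinates (r, phi); atan2 gives the intended angle for y > 0."""
    return math.hypot(x, y), math.atan2(y, x)


def u(x, y):
    r, phi = polar(x, y)
    return r ** (7 / 8) * math.sin(7 * phi / 8)


def f(x, y):
    r, phi = polar(x, y)
    return 343 / 2048 * r ** (-11 / 8) * math.sin(7 * phi / 8)


def stress_exact(x, y):
    """Vector field whose normal component defines g in the text."""
    r, phi = polar(x, y)
    c = 343 / 512 * r ** (-3 / 8)
    return -c * math.sin(phi / 8), c * math.cos(phi / 8)


def stress_fd(x, y, h=1e-3):
    """|grad u|^2 grad u with central differences for grad u (p = 4)."""
    gx = (u(x + h, y) - u(x - h, y)) / (2 * h)
    gy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    s = gx * gx + gy * gy
    return s * gx, s * gy


def minus_div_stress_fd(x, y, h=1e-3):
    """Central-difference approximation of -div(|grad u|^2 grad u)."""
    dsx = stress_fd(x + h, y, h)[0] - stress_fd(x - h, y, h)[0]
    dsy = stress_fd(x, y + h, h)[1] - stress_fd(x, y - h, h)[1]
    return -(dsx + dsy) / (2 * h)
```

At a point away from the origin such as \((0.3, 0.4)\), the finite-difference values agree with `f` and `stress_exact` up to the discretization error of the difference quotients.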

Since u is singular at the origin, reduced convergence rates are expected for uniform mesh-refining. Figure 2b displays the suboptimal convergence rate 0.75 for the energy error \(|E(u) - E_\ell (u_\ell )|\) and all polynomial degrees \(k = 0, \dots , 4\). The adaptive mesh-refining algorithm refines towards the origin as depicted in Fig. 3, with a stronger local refinement for larger polynomial degree k. Since W satisfies (2.5)–(2.6), the interest is on the displacement error \(\Vert \nabla u - {\mathcal {G}}_\ell u_\ell \Vert _{L^4(\Omega )}\) and the stress error \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{4/3}(\Omega )}\). On uniformly refined meshes, \(\Vert \nabla u - {\mathcal {G}}_\ell u_\ell \Vert _{L^4(\Omega )}^2\) converges with the suboptimal convergence rate 0.375 and adaptive computation improves the convergence rate to 0.8 for \(k = 0\) and 2.5 for \(k = 4\) as depicted in Fig. 4a. Figure 4b displays the convergence rate 1 for the stress error \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{4/3}(\Omega )}^2\) on uniform triangulations for all \(k = 0, \dots , 4\). This is optimal for \(k = 0\), but not for \(k \ge 1\). The adaptive mesh-refining algorithm recovers the optimal convergence rates \(k+1\) for \(k \ge 1\).

6.3 The optimal design problem

Consider W from Sect. 2.4.2 for \(\mu _1 = 1\), \(\mu _2 = 2\), \(\xi _1 = \sqrt{2\lambda \mu _1/\mu _2}\), and \(\xi _2 = \mu _2\xi _1/\mu _1\) with the fixed parameter \(\lambda = 0.0145\) on the L-shaped domain \(\Omega {:}{=}(-1,1)^2\setminus ([0,1)\times (-1,0])\) from [4, Figure 1.1]. Let \(f \equiv 1\) in \(\Omega \) and \(u_\mathrm {D} \equiv 0\) on \(\Gamma _{\mathrm {D}} = \partial \Omega \) with the reference value \(\min E({\mathcal {A}}) = -0.0745512\).

The material distribution in Fig. 5a consists of two homogeneous phases, an interior (red) and a boundary (yellow) layer, and a transition layer, also called the microstructure zone, with a fine mixture of the two materials [4, 16, 25, 28]. The approximated volume fractions \(\Lambda (|\Pi _{{\mathcal {T}}_\ell }^0 {\mathcal {G}}_\ell u_\ell |)\) for a discrete minimizer \(u_\ell \) with \(\Lambda (\xi ) = 0\) if \(0 \le \xi \le \xi _1\), \(\Lambda (\xi ) = (\xi - \xi _1)/(\xi _2 - \xi _1)\) if \(\xi _1 \le \xi \le \xi _2\), and \(\Lambda (\xi ) = 1\) if \(\xi \ge \xi _2\), define the colour map of the fraction plot of Fig. 5a. Since W satisfies (2.6), Theorem 2.1 implies the convergence of \(|E(u) - E_\ell (u_\ell )|\) and \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{2}(\Omega )}\). Since the exact solution is unknown, the numerical experiment computes \(\mathrm {RHS}_\ell \) in

$$\begin{aligned} \begin{aligned}&\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^2(\Omega )}^2 + |E(u) - E_\ell (u_\ell )|\\&\quad \lesssim \mathrm {RHS}_\ell {:}{=}E_\ell (u_\ell ) - E^*(\sigma _\ell ) + \mathrm {osc}(f,{\mathcal {T}}_\ell ) + \Vert {\mathcal {G}}_\ell u_\ell - \nabla {\mathcal {J}}_\ell u_\ell \Vert _{L^2(\Omega )}^2 \end{aligned} \end{aligned}$$
(6.1)

from [28, Theorem 4.6] with the convex conjugate \(W^* \in C({\mathbb {M}})\) [53, Corollary 12.2.2] and the dual energy

$$\begin{aligned} E^*(\sigma _\ell ) {:}{=}-\int _\Omega W^*(\sigma _\ell ) \,{\mathrm {d}}x. \end{aligned}$$
(6.2)

Figure 5b displays the suboptimal convergence rate 0.4 for \(\mathrm {RHS}_\ell \) on uniform triangulations. The adaptive algorithm refines towards the reentrant corner and the boundaries of the microstructure zone as displayed in Fig. 6. This improves the convergence rates up to 1.2 for \(k = 4\). Undisplayed computer experiments show significant improvement for the convergence rates of \(\mathrm {RHS}_\ell \) for examples with small microstructure zones in agreement with the related empirical observations in [25].
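The thresholds \(\xi _1, \xi _2\) and the piecewise affine map \(\Lambda \) behind the colour map of the material distribution can be sketched as follows; the function names are hypothetical and the argument `xi` stands for the modulus \(|\Pi _{{\mathcal {T}}_\ell }^0 {\mathcal {G}}_\ell u_\ell |\) on one triangle.

```python
import math


def thresholds(lam, mu1, mu2):
    """xi_1 = sqrt(2 lam mu1 / mu2) and xi_2 = mu2 xi_1 / mu1."""
    xi1 = math.sqrt(2.0 * lam * mu1 / mu2)
    xi2 = mu2 * xi1 / mu1
    return xi1, xi2


def volume_fraction(xi, xi1, xi2):
    """Piecewise affine Lambda(xi): 0 below xi1, 1 above xi2,
    and linear interpolation in the transition region."""
    if xi <= xi1:
        return 0.0
    if xi >= xi2:
        return 1.0
    return (xi - xi1) / (xi2 - xi1)
```

With \(\lambda = 0.0145\), \(\mu _1 = 1\), and \(\mu _2 = 2\) from this benchmark, the thresholds are \(\xi _1 = \sqrt{0.0145}\) and \(\xi _2 = 2\xi _1\); values of \(\Lambda \) strictly between 0 and 1 indicate the microstructure zone.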

Fig. 7 Initial triangulation (left) of the rectangular domain \(\Omega \) and convergence history plot (right) of \(|E(u) - E_\ell (u_\ell )|\) with k from Fig. 1 on uniform (dashed line) and adaptive (solid line) triangulations for the two-well benchmark in Sect. 6.4

6.4 The relaxed two-well benchmark

Let \(\Omega {:}{=}(0,1) \times (0,3/2)\) with pure Dirichlet boundary \(\Gamma _{\mathrm {D}} {:}{=}\partial \Omega \). The computational benchmark from [15] considers the two distinct wells \(F_1 = -(3,2)/\sqrt{13} = -F_2\) in the definition of W from Sect. 2.4.3 and introduces an additional quadratic term \(\Vert \zeta - v\Vert _{L^2(\Omega )}^2\) in the energy

$$\begin{aligned} E(v) {:}{=}\int _\Omega (W(\nabla v) - f v) \,{\mathrm {d}}x + \Vert \zeta - v\Vert _{L^2(\Omega )}^2/2 \end{aligned}$$

for all \(v \in {\mathcal {A}}{:}{=}u_\mathrm {D} + W^{1,4}_0(\Omega )\) with \(f(x,y) {:}{=}- 3\varrho ^5/128 - \varrho ^3/3\), \(\zeta (x,y) {:}{=}\varrho ^3/24 + \varrho \),

$$\begin{aligned} u(x,y) {:}{=}u_\mathrm {D}(x,y) {:}{=}{\left\{ \begin{array}{ll} f(x,y) &{}\text{ if } -1/2 \le \varrho \le 0,\\ \zeta (x,y) &{}\text{ if } 0 \le \varrho \le 1/2 \end{array}\right. } \end{aligned}$$

at \((x,y) \in {\mathbb {R}}^2\) and \(\varrho {:}{=}(3(x-1) + 2y)/\sqrt{13}\). Since E is strictly convex in \({\mathcal {A}}\), the minimal energy \(\min E({\mathcal {A}}) = E(u) = 0.1078147674\) is attained at the unique minimizer u. The discrete minimizer \(u_\ell = (u_{{\mathcal {T}}_\ell }, u_{{\mathcal {F}}_\ell })\) of the discrete energy

$$\begin{aligned} E_\ell (v_\ell )&{:}{=}\int _\Omega (W({\mathcal {G}}_\ell v_\ell ) - f v_{{\mathcal {T}}_\ell }) \,{\mathrm {d}}x + \Vert \zeta - v_{{\mathcal {T}}_\ell }\Vert _{L^2(\Omega )}^2/2 \end{aligned}$$

among \(v_\ell = (v_{{\mathcal {T}}_\ell }, v_{{\mathcal {F}}_\ell }) \in {\mathcal {A}}({\mathcal {T}}_\ell )\) is unique in the volume component \(u_{{\mathcal {T}}_\ell }\) only. The convergence analysis can be extended to the situation at hand with the refinement indicator \({\widetilde{\eta }}_\ell ^{(\varepsilon )}(T) {:}{=}\eta _\ell ^{(\varepsilon )}(T) + |T|\Vert (1 - \Pi _T^k) \zeta \Vert _{L^2(T)}^2\) and leads to \(\lim _{\ell \rightarrow \infty } E_\ell (u_\ell ) = E(u)\), \(\lim _{\ell \rightarrow \infty } \nabla W({\mathcal {G}}_\ell u_\ell ) = \sigma \) (strongly) in \(L^{4/3}(\Omega ;{\mathbb {R}}^2)\), and \(\lim _{\ell \rightarrow \infty } u_{{\mathcal {T}}_\ell } = u\) (strongly) in \(L^4(\Omega )\).

Fig. 8 Adaptive triangulation (left) of the rectangular domain \(\Omega \) into 1192 triangles (7104 dofs) for \(k = 1\) and convergence history plot (right) of \(\Vert u - u_{{\mathcal {T}}_\ell }\Vert _{L^2(\Omega )}^2\) with k from Fig. 1 on uniform (dashed line) and adaptive (solid line) triangulations for the two-well benchmark in Sect. 6.4

Fig. 9 Convergence history plot of \(\Vert \nabla u - {\mathcal {G}}_\ell u_\ell \Vert _{L^4(\Omega )}^2\) (left) and \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{4/3}(\Omega )}^2\) (right) with k from Fig. 1 on uniform (dashed line) and adaptive (solid line) triangulations for the two-well benchmark in Sect. 6.4

The exact solution u is piecewise smooth and the derivative \(\nabla u\) jumps across the interface \(\Gamma = \mathrm {conv}\{(1,0),(0,3/2)\}\). For an aligned initial triangulation, where \(\Gamma \) coincides with the sides of the triangulation, the numerical results from [28] display optimal convergence rates \(k+1\) for \(|E(u) - E_\ell (u_\ell )|\), \(\Vert \sigma - \sigma _\ell \Vert _{L^{4/3}(\Omega )}^2\), \(\Vert u - u_{{\mathcal {T}}_\ell }\Vert _{L^2(\Omega )}^2\), and \(\Vert \nabla u - {\mathcal {G}}_\ell u_\ell \Vert _{L^4(\Omega )}^2\) on uniformly refined meshes. Since a priori information on u is not available in general, this numerical benchmark considers the non-aligned initial triangulation \({\mathcal {T}}_0\) in Fig. 7a, where \(\Gamma \) cannot be resolved exactly (not even with adaptive refinements of \({\mathcal {T}}_0\)). In this case, Carstensen and Jochimsen [14] predicted

$$\begin{aligned} \Vert (1 - \Pi _{{\mathcal {T}}_\ell }^0) u\Vert _{L^4(\Omega )} + \Vert (1 - \Pi _{{\mathcal {T}}_\ell }^0) \sigma \Vert _{L^{4/3}(\Omega )} \lesssim H_\ell , \Vert (1 - \Pi _{{\mathcal {T}}_\ell }^0) \nabla u\Vert _{L^4(\Omega )} \lesssim H_\ell ^{1/4} \end{aligned}$$

for \(H_\ell {:}{=}\Vert h_\ell \Vert _{L^\infty (\Omega )}\). These expected (optimal) convergence rates on uniform meshes are indeed observed empirically for the lowest-order HHO scheme. Figures 7b, 8b and 9 display the convergence rates 1, 1, 1/4, and 1 for \(|E(u) - E_\ell (u_\ell )|\), \(\Vert u - u_{{\mathcal {T}}_\ell }\Vert _{L^2(\Omega )}^2\), \(\Vert \nabla u - {\mathcal {G}}_\ell u_\ell \Vert _{L^4(\Omega )}^2\), and \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{4/3}(\Omega )}^2\), respectively. This improves the convergence rate 3/4 of the stress error from the lowest-order Courant FEM in [14]. The adaptive algorithm generates meshes with a strong local mesh-refinement near the interface \(\Gamma \) and improves the convergence rate of \(|E(u) - E_\ell (u_\ell )|\) to 2.2 in Fig. 7b, of \(\Vert u - u_{{\mathcal {T}}_\ell }\Vert _{L^2(\Omega )}^2\) to 2 in Fig. 8b, and of \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{4/3}(\Omega )}^2\) to 2.5 in Fig. 9b for polynomial degrees \(k \ge 2\). For \(k = 1\), adaptive mesh-refinement leads to marginal improvements only. Since optimal convergence rates are obtained for \(\Vert u - u_{{\mathcal {T}}_\ell }\Vert _{L^2(\Omega )}\) and \(\Vert \sigma - \nabla W({\mathcal {G}}_\ell u_\ell )\Vert _{L^{4/3}(\Omega )}\) with \(k = 0\) on uniform meshes, there is not much gain from adaptive computation in this case.
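The rates in terms of ndof can be related to the prediction in terms of \(H_\ell \) by the scaling \(\mathrm {ndof} \propto H_\ell ^{-2}\) for uniform mesh-refinement in 2D recalled in Sect. 6.1. The following conversion is a sketch under the assumption that the squared discretization errors behave like the squared projection errors displayed above:

$$\begin{aligned} \Vert (1 - \Pi _{{\mathcal {T}}_\ell }^0) \sigma \Vert _{L^{4/3}(\Omega )}^2 \lesssim H_\ell ^2 \propto \mathrm {ndof}^{-1} \quad \text {and}\quad \Vert (1 - \Pi _{{\mathcal {T}}_\ell }^0) \nabla u\Vert _{L^4(\Omega )}^2 \lesssim H_\ell ^{1/2} \propto \mathrm {ndof}^{-1/4}. \end{aligned}$$

This is consistent with the empirical rates 1 for the squared stress error and 1/4 for the squared gradient error on uniform meshes.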

6.5 Modified Foss–Hrusa–Mizel benchmark

The final example considers a modified Foss–Hrusa–Mizel [43] benchmark in [52], extended to the domain \(\Omega {:}{=}(-1,1) \times (0,1)\) with \(\Gamma _1 {:}{=}[-1,0] \times \{0\}\), \(\Gamma _2 {:}{=}[0,1] \times \{0\}\), \(\Gamma _3 {:}{=}\{x = (x_1,x_2) \in \partial \Omega : x_1 = -1 \text { or } x_1 = 1 \text { or } x_2 = 1\}\), and the initial triangulation \({\mathcal {T}}_0\) of Fig. 10a. Define the energy density \(W(A) {:}{=}(|A|^2 - 2\det A)^4 + |A|^2/2\) for all \(A \in {\mathbb {M}}{:}{=}{\mathbb {R}}^{2 \times 2}\), the set

$$\begin{aligned} {\mathcal {A}}{:}{=}\{v = (v_1,v_2) \in W^{1,2}(\Omega ;{\mathbb {R}}^2) : v_1 \equiv 0 \text { on } \Gamma _1, v_2 \equiv 0 \text { on } \Gamma _2, v = u_\mathrm {D} \text { on } \Gamma _3\} \end{aligned}$$

of admissible functions in \(W^{1,2}(\Omega ;{\mathbb {R}}^2)\) with \(u_\mathrm {D} {:}{=}(\cos (\varphi /2), \sin (\varphi /2))\) in polar coordinates, and the vanishing right-hand side \(f \equiv 0\). The minimal energy \(E(u) = \min E({\mathcal {A}}) = 0.88137023556\) of

$$\begin{aligned} E(v) {:}{=}\int _\Omega W(\mathrm {D}v) \,{\mathrm {d}}x \text { among } v \in {\mathcal {A}}\end{aligned}$$

is attained at \(u {:}{=}r^{1/2}(\cos (\varphi /2), \sin (\varphi /2))\) in polar coordinates. The energy density \(W \in C^1({\mathbb {M}})\) is convex and satisfies the lower growth \(W(A) \ge |A|^2/2\) of order \(p = 2\), but no upper growth of order 2.
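The structure of this energy density can be explored with a short sketch. The function names are hypothetical; the gradient of u is computed by hand from the polar representation, and the identity \(|A|^2 - 2\det A = (A_{11} - A_{22})^2 + (A_{12} + A_{21})^2\) explains why the first term of W is nonnegative and vanishes at \(A = \mathrm {D}u\).

```python
import math


def energy_density(A):
    """W(A) = (|A|^2 - 2 det A)^4 + |A|^2 / 2 for a 2x2 matrix A
    given as nested sequences ((a11, a12), (a21, a22))."""
    (a11, a12), (a21, a22) = A
    frobenius_sq = a11 ** 2 + a12 ** 2 + a21 ** 2 + a22 ** 2
    det = a11 * a22 - a12 * a21
    return (frobenius_sq - 2.0 * det) ** 4 + frobenius_sq / 2.0


def exact_gradient(x, y):
    """Du for u = r^(1/2) (cos(phi/2), sin(phi/2)) away from the origin;
    rows are the gradients of the two components (hand-derived)."""
    r = math.hypot(x, y)
    phi = math.atan2(y, x)
    a = 0.5 * math.cos(phi / 2) / math.sqrt(r)
    b = 0.5 * math.sin(phi / 2) / math.sqrt(r)
    return [[a, b], [-b, a]]
```

In this sketch, \(W(\mathrm {D}u) = |\mathrm {D}u|^2/2 = 1/(4r)\) holds pointwise, so only the quadratic part of W contributes to the energy of u.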

Fig. 10 Initial triangulation \({\mathcal {T}}_0\) (left) of \(\Omega \) and empirical verification of the Lavrentiev gap (right) for the modified Foss–Hrusa–Mizel benchmark in Sect. 6.5: convergence history plot of \(|E(u) - E_\ell (u_\ell )|\) for the Courant FEM (dotted line) and the lowest-order HHO method on uniform (dashed line) and adaptive (solid line) triangulations

Fig. 11 Adaptive triangulation of \(\Omega \) (left) into 614 triangles (7336 dofs) for \(k = 1\) and convergence history plot (right) of \(|E(u) - E_\ell (u_\ell )|\) with k from Fig. 1 on uniform (dashed line) and adaptive (solid line) triangulations for the modified Foss–Hrusa–Mizel benchmark in Sect. 6.5

The application of the discrete compactness to this model example with free boundary requires the modified refinement indicator

$$\begin{aligned} \eta _\ell ^{(\varepsilon )}(T)&{:}{=}|T|^{\varepsilon - 1}\Vert \Pi _{T}^k(({\mathcal {R}}_\ell u_\ell )|_T - u_T)\Vert ^2_{L^2(T)} + |T|^{\varepsilon - 1/2}\\&\quad \times \Big (\sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _1)} \Vert ({\mathcal {R}}_\ell u_\ell )|_F \cdot e_1\Vert _{L^2(F)}^2 + \sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _2)} \Vert ({\mathcal {R}}_\ell u_\ell )|_F \cdot e_2\Vert _{L^2(F)}^2\\&\quad + \sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Gamma _3)} \Vert ({\mathcal {R}}_\ell u_\ell )|_F - u_\mathrm {D}\Vert _{L^2(F)}^2 + \sum _{F \in {\mathcal {F}}_\ell (T) \cap {\mathcal {F}}_\ell (\Omega )} \Vert [{\mathcal {R}}_\ell u_\ell ]_F\Vert _{L^2(F)}^2\\&\quad + \sum _{F \in {\mathcal {F}}_\ell (T)} \Vert \Pi _F(({\mathcal {R}}_\ell u_\ell )|_T - u_F)\Vert _{L^2(F)}^2\Big ) \end{aligned}$$

with the j-th canonical unit vector \(e_j \in {\mathbb {R}}^2\). Since the presence of the Lavrentiev gap is equivalent to the failure of conforming FEMs [17, Theorem 2.1], the lowest-order HHO can be utilized to detect the Lavrentiev gap, cf. Sect. 4.3. Figure 10b provides empirical evidence that there is a Lavrentiev gap: \(|E(u) - E_\ell (u_\ell )|\) converges with the suboptimal convergence rate 0.5 on uniformly refined meshes, but the Courant FEM seems to approximate a wrong energy. The adaptive mesh-refining algorithm refines towards the origin as depicted in Fig. 11a. It is outlined in Sect. 4.3 that a convergence proof of AHHO for minimization problems with the Lavrentiev gap is impossible with the known mathematical methodology for \(k \ge 1\). It comes as a welcome surprise that optimal convergence rates \(k+1\) are obtained for any polynomial degree k on adaptively refined meshes in Fig. 11b.

6.6 Conclusions

The numerical results from Sect. 6 confirm the theoretical findings in Theorem 2.1. In particular, the convergence of the energy \(\lim _{\ell \rightarrow \infty } \min E_\ell ({\mathcal {A}}({\mathcal {T}}_\ell )) = \min E({\mathcal {A}})\) is observed in all examples. The introduced adaptive mesh-refining algorithm of Sect. 2.2 provides efficient approximations of singular solutions and even leads to improved empirical convergence rates. The choice of the parameter \(\varepsilon \) only has marginal influence on the convergence rates and convergence is observed for \(\varepsilon = 0\) in undisplayed computer experiments. Better convergence rates are obtained for larger polynomial degrees k. The computer experiments provide empirical evidence that the HHO method can overcome the Lavrentiev gap for any polynomial degree k.