Convergent adaptive hybrid higher-order schemes for convex minimization

This paper proposes two convergent adaptive mesh-refining algorithms for the hybrid high-order method in convex minimization problems with two-sided p-growth. Examples include the p-Laplacian, an optimal design problem in topology optimization, and the convexified double-well problem. The hybrid high-order method utilizes a gradient reconstruction in the space of piecewise Raviart–Thomas finite element functions without stabilization on triangulations into simplices or in the space of piecewise polynomials with stabilization on polytopal meshes. The main results imply the convergence of the energy and, under further convexity properties, of the approximations of the primal resp. dual variable. Numerical experiments illustrate an efficient approximation of singular minimizers and improved convergence rates for higher polynomial degrees. Computer simulations provide striking numerical evidence that an adopted adaptive HHO algorithm can overcome the Lavrentiev gap phenomenon even with empirical higher convergence rates.


Introduction
Adaptive mesh-refining is vital in the computational sciences and engineering with optimal rates known in many linear problems [54,29,21,19].Besides eigenvalue problems [33,24,23,6] much less is known for stationary nonlinear PDEs.The few positive results in the literature concern mainly conforming FEM with plain convergence results [56,4,22,41,12].An important exception is the p-Laplacian in [5], where the notion of a quasi-norm enables two-sided error control.The next larger class of convex minimization problems from [27] emerged in the relaxation of nonconvex minimization problems with enforced microstructures and this is in the focus of this paper.This class is characterized by a two-sided growth condition on a C 1 energy density W with an additional convexity control that enables a unique stress DW (Du) independent of the multiple minimizers u on the continuous level.In fact, there is no further control of the convex closed set of minimizers in beyond a priori boundedness.This leads to the reliability-efficiency gap [14] in the a posteriori error control: If the mesh-size tends to zero, the known guaranteed lower and upper error bounds converge with a different convergence rate.In other words, the efficiency index tends to infinity.This dramatic loss of sharp error control does not prevent convergence of an adaptive algorithm in general, but it makes the analysis of plain convergence much harder and seemingly disables any proof of optimal rates.
The numerical experiments in [38,28] motivate this paper on the adaptive HHO.The only nonconforming scheme known to converge for general convex minimization problems is [52] for the first-order Crouzeix-Raviart schemes and, according to the knowledge of the authors, there is no contribution for the convergence of an adaptive higher-order nonconforming scheme for nonlinear PDEs in the literature.In fact, this paper is the first one to guarantee plain convergence even for linear PDEs for the HHO schemes at all.The reason is a negative power of the meshsize in the stabilization terms that is overcome in dG schemes for linear PDEs by over-penalization in [7] to be close to conforming approximations (and then enable arguments for optimal convergence rates) and recently by generalized Galerkin solutions in a limit space in [47] for plain convergence.One advantage of the HHO methodology is the absence of a stabilization parameter and, hence, this argument is not employed in this paper.
The main contributions of this paper are adaptive HHO methods with and without stabilization with guaranteed plain convergence for the class of convex minimization problems from [27].Three types of results are available for those schemes.
(a) If W is C 1 and convex with two-sided p-growth, then the minimal discrete energies converge to the exact minimal energy.
(b) If furthermore W satisfies the convexity control in the class of degenerate convex minimization problems of [27], then the discrete stress approximations converge to the (unique) exact stress σ.
(c) If W is even strongly convex, then the discrete approximations of the gradients converge to the gradient Du of the (unique) exact minimizer u.
The two-sided growth condition excludes problems that exhibit the Lavrentiev gap phenomenon [48] and so we only comment on the lowest-order schemes that overcome the Lavrentiev gap owing to the Jensen inequality.
Numerical experiments are carried out on simplicial meshes, but the design of the stabilized HHO method allows for polytopal meshes for a fairly flexible meshdesign, e.g., in 3D.The mesh-refinement of those schemes is less elaborated (e.g., in comparison with [55] on simplicial meshes) and remains as an important aspect for future research.
The remaining parts of this paper are organized as follows.Section 2 introduces the continuous minimization problem, the adaptive mesh-refining algorithm, and the main results of this paper.Section 3 reviews the discretization with the HHO methodology on simplicial triangulations without stabilization.Section 4 departs from discrete compactness, proves the plain convergence of an adaptive scheme, and concludes with an application to the Lavrentiev gap.Section 5 treats HHO methods on general polytopal meshes with stabilization and proves the results of Subsection 4.2.Numerical results for three model examples from Subsection 2.4 below are presented in Section 6 with conclusions drawn from the numerical experiments.

Mathematical setting and main results
This paper analyzes the convergence of an adaptive mesh-refining algorithm based on the hybrid high-order methodology [37,35,39] for convex minimization problems with a two-sided p-growth.
The constants c 1 , c 3 > 0 and c 2 , c 4 ≥ 0 are universal in this paper and independent of the argument A ∈ M; the same universality applies to c 5 , c 6 in (2.5)- (2.6).Throughout this paper, the boundary ∂Ω of the domain Ω is divided into a compact Dirichlet part Γ D with positive surface measure and a relatively open (and possibly empty) Neumann part Γ N = ∂Ω \ Γ D .Given f ∈ L p (Ω; R m ), g ∈ L p (Γ N ; R m ) with 1/p + 1/p = 1, and u D ∈ V := W 1,p (Ω; R m ), minimize the energy functional amongst admissible functions v ∈ A := u D + V D subject to the Dirichlet boundary condition v|

Adaptive hybrid high-order method (AHHO)
The adaptive algorithm computes a sequence of discrete approximations of the minimal energy min E(A) in the affine space A = u D + V D of admissible functions in a successive loop over the steps outlined below.The first version of the adaptive algorithm focuses on the newest-vertex-bisection (NVB) [55] and the first HHO method without stabilization on triangulations into simplices.It will be generalized to polytopal meshes in Section 5.

INPUT.
The input is a regular initial triangulation T 0 of Ω into simplices, a polynomial degree k ≥ 0, a positive parameter 0 < ε ≤ k + 1, and a bulk parameter 0 < θ < 1.

SOLVE.
Let T denote the triangulation associated to the level ∈ N 0 with the set of all sides F .The hybrid high-order method utilizes the discrete ansatz space of polynomial degree at most k ≥ 0 with respect to the simplices (T ) and the sides (F ) in the triangulation T .The proposed numerical scheme replaces Dv in (2.1) by a gradient reconstruction G in the space of piecewise Raviart-Thomas finite element functions Σ(T ) = RT pw k (T ; M) for a shape-regular triangulation T of Ω into simplices.The details on the gradient reconstruction G are postponed to Subsection 3.4.The discrete problem computes a discrete minimizer u of , where Π k F is the L 2 projection onto the polynomials P k (F ) of degree at most k.Let σ := Π Σ(T ) DW (Gu ) ∈ Σ(T ) be the L 2 projection of DW (Gu ) onto Σ(T ).Further details on the hybrid high-order method follow in Section 3 below.
for all T ∈ T of volume |T | and sides F (T ) with the abbreviation u T := u T | T and . The refinement indicator is motivated by the discrete compactness from Theorem 4.1 below.In fact, if lim →∞ η (ε) = 0, then there exists a v ∈ A such that, up to a subsequence, G u ∇v weakly in L p (Ω; M) and u T v weakly in L p (Ω; R m ) as → ∞.It turns out that v is a minimizer of the continuous energy E from (2.1). 4. MARK and REFINE.Given a positive bulk parameter 0 < θ < 1, select a subset M ⊂ T of minimal cardinality such that This marking strategy is known as Dörfler marking.The marked simplices are refined by the newest-vertex bisection [55] to define T +1 .

Main results
The main results establish the convergence of the sequence (E (u )) ∈N 0 of minimal discrete energies computed by AHHO towards the exact minimal energy.(b) The sequence of the post-processing (J u ) ∈N 0 is bounded in V = W 1,p (Ω; R n ) and any weak accumulation point of (J u ) ∈N 0 in V minimizes E in A.
(c) Suppose there exists c 5 > 0 such that W satisfies, for all A, B ∈ M, with parameters r, s from Table 1.Then the minimizer u of E in A is unique and lim →∞ G u = Du (strongly) in L p (Ω; M) holds for the entire sequence.
(d) Suppose there exists c 6 > 0 such that W satisfies, for all A, B ∈ M, with parameters r, s from Table 1.Then the stress σ := DW (Du) ∈ L p (Ω; M) is unique (independent of the choice of a (possibly nonunique) minimizer u) and lim →∞ DW (G u ) = σ (strongly) in L p (Ω; M) and σ σ (weakly) in L p (Ω; M) hold for the entire sequence.case

Examples
Theorem 2.1 applies to the following scalar examples with m = 1.

p-Laplace
The minimization of the energy E : A → R with the energy density It is worth noticing that the convergence results of this paper are new even for a linear model problem with p = 2 for the two HHO algorithms.

Notation
Standard notation for Sobolev and Lebesgue functions applies throughout this paper with the abbreviations (2.7) For any A, B ∈ M := R m×n , A : B denotes the Euclidean scalar product of A and B, which induces the Frobenius norm |A| := (A : A) 3 Hybrid high-order method without stabilization This section recalls the discrete ansatz space and reconstruction operators from the HHO methodology [37,35,39] for convenient reading.

Triangulation
).For any boundary side F ∈ F (∂Ω) := F \ F (Ω), there is a unique T ∈ T with F ∈ F (T ).Then ω F = int(T ), ν F := ν T , and [v] F := (v| T )| F .The differential operators div pw and D pw depend on the triangulation T and denote the piecewise application of div and D without explicit reference to T .
The shape regularity of a triangulation T is the minimum min T ∈T (T ) of all ratios (T ) := r i /r c ≤ 1 of the maximal radius r i of an inscribed ball and the minimal radius r c of a circumscribed ball for a simplex T ∈ T .

Discrete spaces
The discrete ansatz space of the HHO methods consists of piecewise polynomials on the triangulation T and on the skeleton ∂T := ∪F .For a simplex or a side M ⊂ R n of diameter h M , let P k (M ) denote the space of polynomials of degree at most k ∈ N 0 regarded as functions defined in M .The The gradient reconstruction in T ∈ T maps in the space of Raviart-Thomas finite element functions Let P k (T ), P k (F ), and RT pw k (T ) denote the space of piecewise functions with respect to the mesh T or F and with restrictions to T or F in P k (T ), P k (F ), and RT k (T ).The L 2 projections Π k T and Π k F onto the discrete spaces P k (T ) and P k (F ) are the global versions of Π k T and Π k F , e.g., ( T onto P k (T ; R m ) := P k (T ) m applies componentwise.This convention extends to the L 2 projections onto P k (M ; R m ) := P k (M ) m and P k (F ; R m ) := P k (F ) m .The space of lowest-order Crouzeix-Raviart finite element functions reads Define the mesh-size function , and the (Neumann data) oscillation osc N (g,

HHO ansatz space
For fixed k ∈ N 0 , let V (T ) := P k (T ; R m ) × P k (F ; R m ) denote the discrete ansatz space for V in HHO methods [37,35].
gives rise to the discrete space A(T ) := I u D + V D (T ) of admissible functions.

Reconstruction operators
The reconstruction operators defined in this section link the two components of v ∈ V (T ) and provide discrete approximations R v and G v of the displacement v ∈ V and its derivative Dv ∈ L 2 (Ω; M).
for all ϕ k+1 ∈ P k+1 (T ; R m ).The bilinear form (D•, D•) L 2 (T ) on the left-hand side of (3.4) defines a scalar product in the quotient space P k+1 (T ; R m )/R m and the right-hand side of (3.4) is a linear functional in The unique solution R T v ∈ P k+1 (T ; R m ) to (3.4)- (3.5) gives rise to the potential reconstruction operator R : Gradient reconstruction.The gradient is reconstructed in the space Σ(T ) = RT pw k (T ; M) of piecewise Raviart-Thomas finite element functions [1,28].
for all τ ∈ Σ(T ).In other words, G v is the Riesz representation of the linear functional on the right-hand side of (3.6) in the Hilbert space Σ(T ) endowed with the L 2 scalar product.Since There exist positive constants C dF and C dtr that only depend on Ω, the shape regularity of T , k, and Proof.The proofs of (a)-(b) are outlined in [1,28].The discrete Sobolev embedding v T L p (Ω) v follows as in [11,36,34].Theorem 4.4 in [11] and (c) lead to v .This and the triangle inequality . The latter term is controlled by diam(Ω) 1/p v .This concludes the proof of (d).

Discrete problem
Lemma 3.1 implies the coercivity of E in A(T ) with respect to the discrete seminorm G • L p (Ω) and the existence and the boundedness of discrete minimizers u below.

Theorem 3.2 (discrete minimizers). The minimal discrete energy inf E (A(T )) is attained. There exists a positive constant
the choice of a (possibly non-unique) discrete minimizer u ).
Proof.The boundedness inf E (A(T )) > −∞ of E in A(T ) follows from the lower p-growth of W , the discrete Friedrichs, and the discrete trace inequality from Lemma 3.1, cf., e.g., [27,32,28].The direct method in the calculus of variations [32] implies the existence of discrete minimizers u ∈ arg min E (A(T )).The bound • as in [28].If W satisfies (2.5), then W is strictly convex and the discrete minimizer u ∈ arg min E (A(T )) is unique.If W satisfies (2.6), then the uniqueness of DW (G u ) follows as in [27,16,28].

Conforming companion
The companion operator J : [23,18,42,26].In particular, J preserves the moments An explicit construction of J v on simplicial meshes is presented in [42, Section 4.3] for simplicial triangulations with the following properties.

Lemma 3.4 (right-inverse).
There exists a linear operator J : In particular, J is stable in the sense that DJ v L p (Ω) ≤ Λ 0 v holds with the constant Λ 0 that exclusively depends on k, p, and the shape regularity of T .
Proof.For p = 2, the right-hand side of (3.9) is an upper bound for D . This proves (3.9).The right-hand side of (3.9) can be bounded by with a hidden constant that only depends on the shape regularity of T , k, and p [42, Proof of Proposition 4.7].The sum of this over all simplices T ∈ T , a triangle inequality, and the shape regularity of v .This, a reverse triangle inequality, and the norm equivalence from Lemma 3.1.aconclude the stability DJ v L p (Ω) v .

Proof of Theorem 2.1
This section is devoted to the proof of the convergence results in Theorem 2.1.

Discrete compactness
The proof of Theorem 2.1 departs from a discrete compactness in spirit of [11,36,34] and generalizes [34].Recall the mesh-size function h ∈ P 0 (T ) from Subsection 3.2 with h | T := |T | 1/n for T ∈ T and the seminorm • in V (T ) from (3.2).
Theorem 4.1 (discrete compactness).Given a uniformly shape-regular sequence (T ) ∈N 0 of triangulations and (v ) Suppose that the sequence ( v ) ∈N 0 is bounded and suppose that lim →∞ µ (v ) = 0 with Then there exist a subsequence (v j ) j∈N 0 of (v ) ∈N 0 and a weak limit v ∈ A such that J j v j v weakly in V and G j v j Dv weakly in L p (Ω; M) as j → ∞.
Proof.The first part of the proof proves the uniform boundedness to obtain (4.2).The triangle inequality implies 3) The right-inverse J of the interpolation I from Lemma 3.4 satisfies the L 2 orthogonality J v − v T ⊥ P k (T ; R m ).This, a piecewise application of the Poincaré inequality, a triangle inequality, and The Banach-Alaoglu theorem [10,Theorem 3.18] ensures the existence of a (not relabelled) subsequence of (J v ) ∈N 0 and a weak limit v ∈ V such that J v v weakly in V as → ∞.Lemma 3.1.aassures that the sequence (G v ) ∈N 0 is uniformly bounded in L p (Ω; M).Hence there exist a (not relabelled) subsequence of (v ) ∈N 0 and its weak limit . This and an integration by parts verify, for all Φ ∈ C ∞ (Ω; M) with Φ ≡ 0 on Γ N , that The approximation property of piecewise polynomials, also known under the name Bramble-Hilbert lemma [9, Lemma 4.3.8],leads to This and a Hölder inequality imply The L 2 orthogonality (J u − u D )| F ⊥ P k (F ; R m ) for each side F ∈ F (Γ D ) on the Dirichlet boundary, a piecewise application of the trace inequality, and (4.7) imply The right-hand sides of (4.8)-(4.9)vanish in the limit as → ∞ by assumption (4.1).This, (4.6 Proof.The proof of Lemma 4.2 utilizes standard averaging and bubble-function techniques, cf., e.g, [20,58,57,42]; further details are therefore omitted.

Plain convergence
Before the remaining parts of this subsection prove Theorem 2.1, the following lemma establishes the reduction of the mesh-size function h with h | T ≡ |T | 1/n for all T ∈ T .
Lemma 4.3 (mesh-size reduction).Given the output (T ) ∈N 0 of AHHO from Subsection 2.2, let Ω := int(∪(T \ T +1 )) for all level ∈ N 0 .Then it holds Proof.The proof is omitted for two reasons.First this is known from [50, Lemma 9] and second it is a particular case of Lemma 5.2 below.
Step 1 establishes lim →∞ η (ε) (T \ T +1 ) = 0. Since no suitable residual-based a posteriori control is available in the general setting (A1)-(A2), standard arguments, e.g., reliability, efficiency, or estimator reduction [49,22,41] fail.The proof of Step 1 rather relies on a positive power of the mesh-size that arises from the smoothness of test functions in Theorem 4.1.This is done in [52] for a similar setting and in [11,36,34] for uniform mesh-refinements.Let µ (ε) (T ) abbreviate some contributions of 3) related to µ (u ) in (4.1), namely The two-sided growth whence, in order to obtain lim →∞ η (ε) (T \ T +1 ) = 0, it suffices to prove that ) is uniformly bounded.The estimate (3.10) provides control over all but only one contribution of µ (0) (T ) in (4.11); that is h Given any F ∈ F (Γ D ), let T ∈ T be the unique simplex with F ∈ F (T ) ∩ F (Γ D ).The L p stability of the L 2 projection Π k F [34, Lemma 3.2] and a trace inequality show h Recall that DR u is the L 2 projection of G u onto D pw P k+1 (T ), whence the L p stability of L 2 projections [34, Lemma 3.2] proves DR u L p (T ) G u L p (T ) .Hence, the right-hand side of (4.14) is controlled by . This, (3.10), and u ≈ G u L p (Ω) ≤ C 1 from Lemma 3.1.aand Theorem 3.2 lead to Hence, the combination of (4.12)-(4.13)with lim →∞ h L ∞ (Ω ) = 0 in (4.10) con- Step 2 establishes lim →∞ η (ε) = 0. Recall the set M of marked simplices on level ∈ N 0 from Subsection 2.2.Since all simplices in M ⊂ T \ T +1 are refined and the Dörfler marking enforces η Step 3 provides the lower energy bound (LEB) The substitution of this in (4.16), the definition of E in (2.1), and the definition of (4.17) The final two integrals on the right-hand side of (4.17) give rise to the data oscillations osc(f, T ) and osc N (g, F (Γ N )) defined in Subsection 3.2.In fact, a Hölder inequality and a piecewise application of the Poincaré inequality show A trace inequality and the Bramble-Hilbert lemma [9, Lemma 4.3.8]leads, for all F ∈ F (Γ N ) and the unique T ∈ T with F ∈ F (T ) ∩ F (Γ N ), to h The lower p-growth c 1 |A| p − c 2 ≤ W (A) for all A ∈ M implies the coercivity of E in the seminorm D• L p (Ω) and so the bound Du L p (Ω) ≤ C 2 with a positive constant C 2 , that exclusively depends on c 1 , c 2 , Ω, Γ D , f , g, and u D , cf., e.g, [32,Theorem 4.1].Thus there exists a positive constant C 3 independent of the mesh-size with The combination of this with (4.17) concludes the proof of (4.15).
Step 4 establishes lim →∞ E (u ) = E(u).On the one hand, the discrete compactness from Theorem 4.1 and the weak lower semicontinuity of the energy functional imply E(u) ≤ lim inf →∞ LEB .On the other hand, LEB ≤ E(u) from (4.15).This proves lim →∞ E (u ) = E(u) as follows.Given any Φ ∈ C ∞ (Ω; M), the definition σ := Π Σ(T ) DW (G u ), a Hölder inequality, and (4.7) show This and lim →∞ η (ε) = 0 from Step 2 imply lim →∞ µ (u ) = 0. Since u ≈ G u L p (Ω) ≤ C 1 from Lemma 3.1.aand Theorem 3.2, the discrete compactness from Theorem 4.1 leads to a (not relabelled) subsequence of (u ) ∈N 0 and a weak limit v ∈ A such that J u v weakly in V and G u Dv weakly in L p (Ω; M) as → ∞.The boundedness of the linear trace operator γ : This, J u v (weakly) in V , G u Dv (weakly) in L p (Ω; M), the sequential weak lower semicontinuity of the functional Ω W (•) dx in L p (Ω; M), and (3.8) verify As in (4.18), a piecewise application of the Poincaré inequality, the trace inequality, the approximation property of polynomials, and the uniform bound DJ u L p (Ω) 1 from (4.2) confirm Hence lim →∞ E (u ) = lim →∞ LEB = E(u) for a (not relabelled) subsequence of (u ) ∈N 0 .Since the above arguments from Step 4 apply to all subsequences of (u ) ∈N 0 and the limit E(u) is unique, this holds for the entire sequence.
Step 5 is the finish of the proof.Suppose that W satisfies (2.5).Then the arguments from [27,28] show, for all , ξ ∈ L p (Ω; M) and r, t from Table 1, that The choice := Du and ξ := G u in (4.23) and the bounds Du The right-hand side of (4.24) coincides with the right-hand side of (4.16).The latter is bounded by the right-hand side of (4.15) in Step 3.This implies Step 4 proves that E(u) − LEB vanishes in the limit as → ∞.Thus, If W satisfies (2.6), then [28,Lemma 4.2] implies, for all , ξ ∈ L p (Ω; M) and r, t from Table 1, that Remark 4.4 (necessity of ε > 0).The counter example in [52,Subsection 3.4] shows that the restriction ε > 0 is necessary.Indeed, for k = 0, the data W , Ω, Γ D , Γ N , f , g, u D , and (T ) ∈N 0 from [52, Subsection 3.4], there exists a sequence of discrete minimizers (u ) ∈N 0 such that J u 0 weakly in V and G u 0 weakly in L p (Ω; M) as → ∞, but lim →∞ η (ε) = 0.

The Lavrentiev gap
A particular challenge in the computational calculus of variations is the Lavrentiev phenomenon inf E(A) < inf E(A ∩ W 1,∞ (Ω)) [48].Its presence is equivalent to the failure of standard conforming FEMs [17,Theorem 2.1] in the sense that a wrong minimal energy is approximated.As a remedy, the nonconforming Crouzeix-Raviart FEM in [51,52,3] can overcome the Lavrentiev gap under fairly general assumptions on W : Throughout the remaining parts of this section, let W ∈ C 1 (M) be convex with the one-sided lower growth (A two-sided growth of W excludes a Lavrentiev gap.)Since there is no upper growth of W , the dual variable σ := DW (Du) is not guaranteed to be in L p (Ω; M).This denies an access to the Euler-Lagrange equations and, therefore, the convergence analysis of [51,52] solely relies on the Jensen inequality.For k = 0, HHO methods can overcome the Lavrentiev gap because the Crouzeix-Raviart FEM can.Proof.Recall the discrete space CR 1 (T ; R m ) of Crouzeix-Raviart finite element functions from (3.1).Define and the nonconforming interpolation I CR : V → CR 1 (T ; R m ) [31] with The discrete CR-FEM minimizes the non-conforming energy A straight-forward modification of the proof of [52,Lemma 4] shows, for a positive constant Notice that I CR does not provide the
The discrete compactness from Theorem 4.1, the LEB in (4.27), and straightforward modifications of the proof of Theorem 2.1 lead to lim →∞ E (u ) = min E(A) for the output (u ) ∈N 0 of the adaptive algorithm in Subsection 2.2 with the refinement indicator, for all T ∈ T , For k ≥ 1, the consistency error σ −DW (G u ) arises in (4.15), but is not guaranteed to be bounded in L p (Ω; M) in the limit as → ∞ in general.Thus, in the absence of further conditions, the convergence lim →∞ E (u ) = min E(A) cannot be proven for k ≥ 1 with this methodology.

Polytopal meshes
Let M be a finite collection of closed polytopes of positive volume with overlap of volume measure zero that cover Ω = ∪ K∈M K.A side S of the mesh M is the (in general disconnected) closed subset of a hyperplane H S ⊂ Ω with positive ((n − 1)dimensional) surface measure such that either (a) there exist (M1) Assume that there exists a universal constant > 0 such that, for all level ∈ N 0 , M admits a shape-regular simplicial subtriangulation T with the shape regularity ≥ defined in Subsection 3.1 and, for each simplex T ∈ T , there exists a unique cell K ∈ M with T ⊆ K and h K ≤ h T .
(M2) Assume the existence of a universal constant 0 < γ < 1 such that, | K| ≤ γ|K| holds for all K ∈ M \ M +1 , K ∈ M +1 with K ⊂ K, and level ∈ N 0 , i.e., the volume measure of all children K of a refined cell K is at most γ|K|.
The assumption (M1) is typical for the error analysis of HHO methods on polytopal meshes, cf., e.g., [40,37,35,34,42].The assumption (M2) holds for the newestvertex bisection on simplicial triangulations with γ = 1/2.Remark 5.1 (equivalence of side lengths).The assumption (M1) ensures that h S ≈ h K ≈ |K| 1/n holds for all K ∈ M and S ∈ Σ (K) with equivalence constants that exclusively depend on the universal constant in (M1) [40,Lemma 1.42].Lemma 5.2 (mesh-size reduction).Suppose that the sequence (M ) ∈N 0 satisfies (M2), then the mesh-size function h ∈ P 0 (M ) with h Proof.Given any j ∈ N 0 and α j := γ j |Ω|, define the set M(j) ⊂ ∪ ∈N 0 M of all polytopes K with volume measure α j+1 < |K| ≤ α j .Since the volume measure of any refined polytope is at least reduced by the factor γ, the polytopes of M(j) are not children of each other and so |K ∩T | = 0 holds for any two distinct polytopes K, T ∈ M(j).This implies that the cardinality for all ≥ L. Notice that Lemma 4.3 follows for simplicial triangulations with γ = 1/2.

Stabilization
The classical HHO method [34,1] utilizes a gradient reconstruction G : V (M ) → Σ(M ) in the space Σ(M ) := P k (M ; M) of matrix-valued piecewise polynomials of total degree at most k.The discrete seminorm • of V (M ) and the operators I , R , G of this section are defined by the formulas (3.2)-(3.6) in Subsection 3.3 with adapted notation, i.e., T (resp.F ) is replaced by M (resp.Σ ).

The stabilization function s :
Notice that s (•; •) is linear in the second component, but not in the first unless p = 2.The relevant properties of s (•; •) are summarized below.
The sum of this over all S ∈ Σ (K) and a Cauchy inequality prove (d).The proof of (e) departs from the function W (a) := |a| p /p for a ∈ R m with the convexity control (2.5).The integral of (2.5) over the side S leads to (4.23) for all , ξ ∈ L p (S; R m ) and Ω (resp.M) replaced by S (resp.R m ).The choice := h −1/p S S K,S v and The sum of this over all S ∈ Σ (K) and K ∈ M concludes the proof of (e).

Stabilized HHO method on a polytopal mesh
The discrete problem minimizes the discrete energy    [27,16,28].
The following lemma extends Lemma 3.4 to polytopal meshes.Lemma 5.6.There exists a linear operator J : and, for any K ∈ M , the estimate (5.6) In particular, J is stable in the sense that DJ v L p (Ω) ≤ Λ 1 v holds with the constant Λ 1 that exclusively depends on k, p, and in (M1).
Proof.The construction of the conforming operator J on polytopal meshes in [42,Section 5] utilizes averaging and bubble-function techniques on the subtriangulation T and give rise an upper bound of G v − DJ v p L p (K) , namely (5.7) Lemma 3.2], it can be omitted in (5.7).This, the equivalence h T ≈ h K for all K ∈ M , T ∈ T with T ⊂ K from (M1), and h F ≈ h S for all S ∈ Σ , F ∈ F with F ⊂ S from (M1) and Remark 5.1 show (5.6).This implies the stability DJ v L p (Ω) v , cf., e.g., [42,Subsection 4.3] for more details.Notice that the computation of the right-hand side of (5.6) does not require explicit information on the subtriangulation T .
Remark 5.7 (discrete compactness).The discrete compactness from Theorem 4.1 holds verbatim with T (resp.F ) replaced by M (resp.Σ ).Notice that G from Subsection 5.2 and J from Lemma 5.6 in this section are different objects.With adapted notation, all arguments from the proof of Theorem 4.1 apply verbatim.Indeed, the commutativity Π Σ(M ) Dv = G I v for all v ∈ V from Lemma 3.1.bremains valid [34,1].This and (5.5) imply the L 2 orthogonality G v − DJ v ⊥ Σ(M ) for any v ∈ V (M ).This is the key argument in the proof of Theorem 4.1 and provides a positive power of the mesh-size in (4.1).

Proof of Theorem 2.2
Given any K ∈ M , Lemma 5.6 motivates the refinement indicator The remaining parts of this section are devoted to the proof of the convergence results in Theorem 2.2.
Proof of Theorem 2.2.The proof follows that of Theorem 2.1.
We return to proof of Theorem 2.

Numerical examples
Some remarks on the implementation precede the numerical benchmarks for the three examples of Subsection 2.4 and the experiments in the Foss-Hrusa-Mizel example with the Lavrentiev gap in Subsection 6.5.
The class of minimization problems at hand allows, in general, for multiple exact and discrete solutions.The numerical experiments select one (of those) by the approximation in fminunc with the initial value computed as follows.On the coarse initial triangulations T 0 , the initial value . On each refinement T +1 of some triangulation T , the initial approximation is defined by a prolongation of the output u of the call fminunc on the coarse triangulation T .The prolongation maps u onto v +1 := I +1 J u ∈ V (T +1 ).
The numerical integration of polynomials is exact with the quadrature formula in [45]: For non-polynomial functions such as W (G v ) with v ∈ V (T ), the number of chosen quadrature points allows for exact integration of polynomials of order p(k + 1) with the growth p of W and the polynomial order k of the discretization; the same quadrature formula also applies to the integration of the dual energy density W * in (6.2).The implementation is based on the in-house AFEM software package in MATLAB [2,8].Adaptive computations are carried out with θ = 0.5, ε = (k +1)/100, and the polynomial degrees k from Figure 1.Undisplayed computer experiments suggest only marginal influence of the choice of ε on the convergence rates of the errors.
The uniform or adaptive mesh-refinement leads to convergence history plots of the energy error |E(u) − E (u )| or the stress error σ − ∇W (G u ) 2 L p (Ω) plotted against the number of degrees of freedom (ndof) in Figure 2-Figure 11 below.(Recall the scaling ndof ∝ h 2 max in 2D for uniform mesh refinements with maximal mesh size h max in a log-log plot.)In the numerical experiments without a priori knowledge of u, the reference value displayed for min E(A) stems from an Aitken extrapolation of the numerical results for a sequence of uniformly refined triangulations.The material distribution in Figure 5.a consists of two homogenous phases, an interior (red) and a boundary (yellow) layer, and a transition layer, also called microstructure zone with a fine mixture of the two materials [4,16,25,28]

The relaxed two-well benchmark
Let Ω := (0, 1)×(0, 3/2) with pure Dirichlet boundary Γ D := ∂Ω.The computational benchmark from [15] considers the two distinct wells The exact solution u is piecewise smooth and the derivative ∇u jumps across the interface Γ = conv{(1, 0), (0, 3/2)}.For an aligned initial triangulation, where Γ coincides with the sides of the triangulation, the numerical results from [28]  L 4/3 (Ω) (right) with k from Figure 1 on uniform (dashed line) and adaptive (right) triangulations for the two-well benchmark in Subsection 6.4 initial triangulation T 0 in Figure 7.a, where Γ cannot be resolved exactly (even not with adaptively refined triangulations of T 0 ).In this case, [14] predicted for H := h L ∞ (Ω) .These expected (optimal) convergence rates on uniform meshes are indeed observed empirically for the lowest-order HHO scheme., respectively.This improves the convergence rate 3/4 of the stress error from the lowestorder Courant FEM in [14].The adaptive algorithm generates adaptive meshes with Subsection 2.2 provides efficient approximations of singular solutions and even leads to improved empirical convergence rates.The choice of the parameter ε only has marginal influence on the convergence rates and convergence is observed for ε = 0 in undisplayed computer experiments.Better convergence rates are obtained for larger polynomial degrees k.The computer experiments provide empirical evidence that the HHO method can overcome the Lavrentiev gap for any polynomial degree k.

3 .
REFINEMENT INDICATORS.The computation of the refinement indicator η utilizes an elliptic potential reconstruction R u ∈ P k+1 (T ; R m ) of the discrete minimizer u = (u T , u F ) ∈ A(T ) computed in SOLVE.The definition of R u follows in (3.4)-(3.5)below.Any interior side F ∈ F (Ω) is shared by two simplices and (σ ) ∈N 0 be the output of the adaptive algorithm AHHO from Subsection 2.2.Assume that W satisfies (A1)-(A2), then (a)-(d) hold.(a) lim →∞ E (u ) = min E(A).

A
regular triangulation T of Ω in the sense of Ciarlet is a finite set of closed simplices T of positive volume |T | > 0 with boundary ∂T and outer unit normal ν T such that ∪ T ∈T T = Ω and two distinct simplices are either disjoint or share one common (lower-dimensional) subsimplex (vertex or edge in 2D and vertex, edge, or face in 3D).Let F (T ) denote the set of the n + 1 hyperfaces of T , called sides of T .Define the set of all sides F := ∪ T ∈T F (T ), the set of interior sides F (Ω) := F \ {F ∈ F : F ⊂ ∂Ω}, the set of Dirichlet sides F (Γ D ) := {F ∈ F : F ⊂ Γ D }, and the set of Neumann sides F (Γ N ) := {F ∈ F : F ⊂ Γ N } of T .For any interior side F ∈ F (Ω), there exist exactly two simplexes T + , T − ∈ T such that ∂T + ∩ ∂T − = F .The orientation of the outer normal unit ν

. 5 )
Lemma 3.4  and the definition of the discrete norm v in (3.2) prove that the right-hand side of (4.4) is controlled by v .Hence, the combination of (4.3)-(4.5)conclude (4.2).

Lemma 4 . 2 (
Φν ds = 0 for all Φ ∈ C ∞ (Ω; M) with Φ ≡ 0 on Γ N .This implies Dv = G a.e. in Ω with v = u D on Γ D and concludes the proof.Since J v cannot attain the exact value u D on Γ D in general, a (Dirichlet boundary data) oscillation arises in (4.1), but is controlled by the contributions of η(ε) .Dirichlet boundary data oscillation).Given F ∈ F (Γ D ), let T ∈ T be the unique simplex with F ∈ F (T ) ∩ F (Γ D ).Then it holds, for all

Proof.
The norm equivalence in (a) is established [37, Lemma 4] for p = 2 and extended to 1 ≤ p < ∞ in [34, Lemma 5.2]; the approximation property (b) is [42, Lemma 3.2].The upper bound (c) follows immediately from a triangle and a discrete trace inequality.The proof of (d) concerns K ∈ M and S ∈ Σ (K).A Hölder inequality with the exponents p, p and 1 − p + 1/p = (1 − p)/p show

Theorem 5 . 5 (
discrete minimizers).The minimal discrete energy inf E (A(M )) is attained.There exists a positive constant C 7 > 0 that merely depends on c 1 , c 2 , Ω, Γ D , u D , f , g, in (M1), k, and p with G u p L p (Ω) + s (u ; u ) ≤ C p 7 for any discrete minimizer u ∈ arg min E (A(M )).Any discrete stress σ := Π k M DW (G u ) satisfies the discrete Euler-Lagrange equations

Ω
5), then u = arg min E (A(M )) is unique.If W satisfies (2.6), then DW (G u ) ∈ L p (Ω; M) is unique (independent of the choice of a (possibly non-unique) discrete minimizer u ).Proof.The proof follows that of Theorem 3.2.The norm equivalence in Lemma 5.4.a and the lower growth of W lead to the coercivity of E in A(M ) with respect to the seminorm • p ≈ G • p L p (Ω) + s (•; •) from Lemma 5.4.a.This implies the existence of discrete minimizers and the bound G u p L p (Ω) + s (u ; u ) ≤ C p 7 for all u ∈ arg min E (A(M )).If W satisfies (2.5), then the strict convexity of W and of s in Lemma 5.4.cleads to the uniqueness of u = arg min E (A(M )).If W satisfies (2.6), then the uniqueness of DW (G u ) follows as in

2 L 2 28 ,Figure 5 .
Figure 5.b displays the suboptimal convergence rate 0.4 for RHS on uniform triangulations.The adaptive algorithm refines towards the reentrant corner and the boundaries of the microstructure zone as displayed in Figure6.This improves the convergence rates up to 1.2 for k = 4. Undisplayed computer experiments show significant improvement for the convergence rates of RHS for examples with small microstructure zones in agreement with the related empirical observations in[25].

F 1 = −( 3 ,ndofFigure 7 : 2 L 2
Figure 7: Initial triangulation (left) of the rectangular domain Ω and convergence history plot (right) of |E(u) − E (u )| with k from Figure 1 on uniform (dashed line) and adaptive (solid line) triangulations for the two-well benchmark in Subsection 6.4

2 L 2 (Figure 8 : 2 L 2 ndofFigure 9 :
Figure 8: Adaptive triangulation (left) of the rectangular domain Ω into 1192 triangles (7104 dofs) for k = 1 and convergence history plot (right) of u − u T 2 L 2 (Ω) with k from Figure 1 on uniform (dashed line) and adaptive (solid line) triangulations for the two-well benchmark in Subsection 6.4

Table 1 :
Parameters r, s, r, s in Theorem 2.1 and t, t in Subsection 4.2 1/2in M. The context-depending notation | • | denotes the length of a vector, the Frobenius norm of a matrix, the Lebesgue measure of a subset of R n , or the counting measure of a discrete set.For 1 < p < ∞, p = p/(p − 1) denotes the Hölder conjugate of p with 1/p + 1/p = 1.The notation A B abbreviates A ≤ CB for a generic constant C independent of the mesh-size and A ≈ B abbreviates A B A.
42, Proof of Proposition 4.7], and scaling arguments confirm this for 1 < p < ∞.The L p stability of the L 2 projection [34, Lemma 3.2] and the orthogonality Hence the Neumann boundary data oscillations osc N (g, F (Γ N )) arise in (4.27), but h f L p (Ω) cannot be replaced by osc(f, T ).For any v CR