1 Introduction

In this paper we study the regularity of weak solutions to the Euler–Lagrange system

$$\begin{aligned} - {{\,\textrm{div}\,}}F'(\nabla u) = 0 \end{aligned}$$
(1)

in \(\Omega \subset \mathbb R^n\) where \(u :\Omega \rightarrow \mathbb R^N\) is a vector-valued mapping, that is u satisfies

$$\begin{aligned} \int _{\Omega } F'(\nabla u): \nabla \varphi \,\textrm{d}x = 0 \end{aligned}$$
(2)

for all \(\varphi \in C^{\infty }_c(\Omega ,\mathbb R^N).\) Henceforth referred to as F-extremals, solutions to (1) are critical points of the functional

$$\begin{aligned} {\mathcal {F}}(w) = \int _{\Omega } F(\nabla w(x)) \,\textrm{d}x. \end{aligned}$$
(3)
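
As a formal sketch of why critical points of (3) satisfy (2), assume F is \(C^1\) and that differentiation under the integral sign is justified (an assumption made here for illustration, valid for instance under suitable growth bounds on \(F'\)). Then for any \(\varphi \in C^{\infty }_c(\Omega ,\mathbb R^N)\) the first variation reads

$$\begin{aligned} \left. \frac{\textrm{d}}{\textrm{d}t}\right| _{t=0} {\mathcal {F}}(u+t\varphi ) = \int _{\Omega } \left. \frac{\textrm{d}}{\textrm{d}t}\right| _{t=0} F(\nabla u + t\nabla \varphi ) \,\textrm{d}x = \int _{\Omega } F'(\nabla u): \nabla \varphi \,\textrm{d}x, \end{aligned}$$

so requiring this to vanish for all such \(\varphi \) is precisely (2), and (1) is its distributional form.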

There is a considerable literature studying the partial regularity theory for minimisers of such functionals, under a suitably strict version of the quasiconvexity condition introduced by Morrey [40]. A striking feature of the vectorial (\(n,N\ge 2\)) setting is that minimisers need not be everywhere regular (see for instance [11, 37, 39, 44]), so the best we can hope for are partial regularity results. In the quasiconvex setting the first result in this direction was due to Evans [14] which has been extended considerably since; we refer the interested reader to the monograph of Giusti [20] and the references therein.

For arbitrary weak solutions of the above equation however, the work of Müller and Šverák [43] shows that we cannot hope for improved regularity results. Developing the theory of convex integration for Lipschitz mappings they constructed highly irregular solutions to (1), including Lipschitz solutions that fail to be \(C^1\) in any open subset and compactly supported solutions whose gradient is \(L^q\)-integrable if and only if \(q \le 2.\) These results have been extended by Kristensen and Taheri [33] for weak local minimisers, and by Székelyhidi [50] for strongly polyconvex integrands.

However it is well-known that if u is suitably regular, we can infer higher regularity by a bootstrap argument. This follows for instance using the classical Schauder estimates, where if the integrand F is smooth and suitably convex, any \(C^{1,\alpha }\) solution for \(\alpha \in (0,1)\) can be shown to be smooth. A natural question is to ask whether this a-priori Hölder condition can be further relaxed.

This question was considered by Moser in the preprint [42], who claimed it was sufficient to assume that u was Lipschitz such that \(\nabla u\) lies in the space \({{\,\textrm{VMO}\,}}\) of functions of vanishing mean oscillation as introduced by Sarason [45]. This condition was motivated by related regularity results for linear elliptic systems, where the work of Chiarenza et al. [8] established \(W^{2,p}\) estimates for linear uniformly elliptic equations where the coefficient matrix A was assumed to be in \({{\,\textrm{VMO}\,}}.\) A similar statement was established by Campos Cordero [7] for quasiconvex integrands through different means, noting also an inconsistency in the proof in [42]. In this paper we will extend these results, establishing regularity up to the boundary in a more general setting.

While we focus on the case of F-extremals to illustrate the main ideas, it turns out the arguments do not make use of the variational structure and extend to more general Legendre–Hadamard elliptic systems. We will sketch this extension in Sect. 5, where higher-order equations are also considered.

1.1 Setup and main results

We will study the following class of integrands; we refer the reader to Sect. 1.3 for the precise notational conventions.

Hypotheses 1.1

For \(n \ge 2\) and \(N \ge 1,\) let \(F :\mathbb R^{Nn} \rightarrow \mathbb R\) satisfy the following.

  1. (H0)

    F is of class \(C^2.\)

  2. (H1)

    There is \(q \ge 2\) such that F satisfies the natural growth condition

    $$\begin{aligned} |F(z)|\le K (1+|z|)^q \end{aligned}$$

    for all \(z \in \mathbb R^{Nn}.\)

  3. (H2)

    \(F''\) satisfies a strict Legendre–Hadamard condition, namely for all \(z \in \mathbb R^{N n}\) we have

    $$\begin{aligned} F''(z)(\xi \otimes \eta ):(\xi \otimes \eta ) \ge 0 \end{aligned}$$

    for all \(\xi \in \mathbb R^N\) and \(\eta \in \mathbb R^n,\) with equality if and only if \(\xi \otimes \eta = 0.\)
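
To illustrate Hypotheses 1.1 with a simple example (chosen here for illustration, not taken from the results of this paper), let \(n=N=2,\) \(\kappa \in \mathbb R,\) and consider \(F(z) = \frac{1}{2}|z|^2 + \kappa \det z.\) This is a quadratic polynomial, so (H0) holds, and since \(|\det z|\le \frac{1}{2}|z|^2\) it satisfies (H1) with \(q=2.\) As \(\det \) is itself a quadratic form we have \(F''(z)\zeta :\zeta = |\zeta |^2 + 2\kappa \det \zeta \) for all \(z,\zeta \in \mathbb R^{Nn},\) and hence

$$\begin{aligned} F''(z)(\xi \otimes \eta ):(\xi \otimes \eta ) = |\xi \otimes \eta |^2 + 2\kappa \det (\xi \otimes \eta ) = |\xi |^2|\eta |^2 \end{aligned}$$

for all \(\xi ,\eta \in \mathbb R^2,\) using that \(\det (\xi \otimes \eta )=0\) for rank-one matrices. Thus (H2) holds for every \(\kappa ,\) whereas F is convex only when \(|\kappa |\le 1\): taking \(\zeta = \textrm{diag}(1,-1)\) gives \(F''(z)\zeta :\zeta = 2-2\kappa ,\) and taking \(\zeta \) to be the identity gives \(2+2\kappa .\)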

A key feature of our results is that we only need to assume a strict Legendre–Hadamard condition, which is closely related to rank-one convexity of F; as the construction of Šverák [49] illustrates, rank-one convexity is strictly weaker than the quasiconvexity condition of Morrey. We also highlight that we do not require control in the \(L^q\) scales from below, so this allows for all growth conditions of type (1, q), that is

$$\begin{aligned} |z|-1 \lesssim F(z) \lesssim |z|^q +1 \end{aligned}$$
(4)

The key ideas are contained in the following interior regularity theorem, which we will prove in Sect. 2. For the precise definition of \({{\,\textrm{BMO}\,}}\) functions we adopt in the text we refer the reader to Sect. 3.1.

Theorem 1.2

(\({{\,\textrm{BMO}\,}}\) \(\varepsilon \)-regularity theorem) Suppose F satisfies Hypotheses 1.1. Then for all \(M > 0\) and \(\alpha \in (0,1),\) there is \(\varepsilon = \varepsilon (M,F,\alpha )>0\) such that for any ball \(B_R(x_0) \subset \mathbb R^n\) if u is F-extremal in \(B_R(x_0)\) with \(|(\nabla u)_{B_R(x_0)}|\le M\) and

$$\begin{aligned} \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(B_R)} \le \varepsilon , \end{aligned}$$
(5)

we have u is \(C^{1,\alpha }\) on \(\overline{B_{R/2}(x_0)}.\)

We will follow a similar strategy to the partial regularity theory for minimisers, which traces back to the works of Morrey [41] and Giusti and Miranda [21] in the variational setting. This will involve establishing a suitable Caccioppoli inequality and a harmonic approximation result, which are combined in a final iteration argument. For the former we will use a modification of an estimate which appeared in Moser [42], and for the latter we will follow a recent approach of Gmeineder and Kristensen [22] adapted to our setting.

We will also establish an analogous result up to the boundary, using ideas from Kronz [35] and Campos Cordero [7]. We will prove this in Sect. 4, and will rely on technical results established in Sect. 3. Here we denote \(\Omega _R(x_0) = \Omega \cap B_R(x_0).\)

Theorem 1.3

(Boundary \({{\,\textrm{BMO}\,}}\) \(\varepsilon \)-regularity theorem) Suppose F satisfies Hypotheses 1.1, \(\Omega \subset \mathbb R^n\) is a \(C^{1,\beta }\) domain for some \(\beta \in (0,1),\) \(g \in C^{1,\beta }({\overline{\Omega }},\mathbb R^N),\) and let \(u \in W^{1,q}_g(\Omega ,\mathbb R^N)\) be F-extremal. Then for each \(\alpha \in (0,\beta )\) and \(M>0\) there is \(\varepsilon = \varepsilon (M,F,\Omega ,g,\beta ,\alpha )>0\) and \({\widetilde{R}}_0 = {\widetilde{R}}_0(M,F,\Omega ,g,\beta ,\alpha )>0\) such that if \(x_0 \in \partial \Omega \) and \(0<R<{\widetilde{R}}_0\) with \(|(\nabla u)_{\Omega _R(x_0)}|\le M\) and

$$\begin{aligned} \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(\Omega _R(x_0))} \le \varepsilon , \end{aligned}$$
(6)

we have u is \(C^{1,\alpha }\) on \(\overline{\Omega _{R/2}(x_0)}.\)

Patching these local regularity results we can infer global consequences, for which we will need some notation. Following [38], we define the infinitesimal mean oscillation of \(f \in {{\,\textrm{BMO}\,}}(\Omega ,\mathbb R^{Nn})\) as

(7)

Note that \(\{f\}_{\textrm{osc}(\Omega )} = 0\) if and only if \(f \in {{\,\textrm{VMO}\,}}(\Omega ,\mathbb R^{Nn}).\)

Corollary 1.4

(Regularity of almost \({{\,\textrm{VMO}\,}}\) Lipschitz solutions) Suppose F satisfies Hypotheses 1.1, let \(\Omega \subset \mathbb R^n\) be a \(C^{1,\beta }\) domain for some \(\beta \in (0,1)\) and \(g \in C^{1,\beta }({\overline{\Omega }},\mathbb R^N).\) Then for each \(M>0\) and \(\alpha \in (0,\beta ),\) there is \(\varepsilon = \varepsilon (M,F,\Omega ,g,\beta ,\alpha )>0\) such that if \(u \in W^{1,\infty }_g(\Omega ,\mathbb R^N)\) is F-extremal such that \(\left\Vert \nabla u\right\Vert _{L^{\infty }(\Omega )} \le M,\) and

$$\begin{aligned} \{\nabla u\}_{\textrm{osc}(\Omega )} \le \varepsilon , \end{aligned}$$
(8)

then \(u \in C^{1,\alpha }({\overline{\Omega }},\mathbb R^N).\)

It is unclear if the Lipschitz assumption can be removed; the infinitesimal mean oscillation assumption requires us to consider balls of arbitrarily small radius, which in turn requires a uniform bound on all averages \(|(\nabla u)_{\Omega _R(x_0)}|\) for all \(x_0 \in {\overline{\Omega }}\) and \(R>0\) small. By the Lebesgue differentiation theorem, however, this is equivalent to assuming that \(\nabla u\) is bounded.

We point out that Dolzmann et al. [13] constructed an example of a minimiser of a quasiconvex integrand whose gradient is unbounded but lies in \({{\,\textrm{BMO}\,}}.\) We will later investigate whether this Lipschitz assumption can be relaxed under further assumptions, but for the moment we will record several other straightforward consequences.

Corollary 1.5

(Partial regularity of \({{\,\textrm{VMO}\,}}\) solutions) Suppose F satisfies Hypotheses 1.1, let \(\Omega \subset \mathbb R^n\) be a \(C^{1,\beta }\) domain for some \(\beta \in (0,1)\) and \(g \in C^{1,\beta }({\overline{\Omega }},\mathbb R^N).\) Then if \(u \in W^{1,q}_g(\Omega ,\mathbb R^N)\) is F-extremal such that \(\nabla u \in {{\,\textrm{VMO}\,}}(\Omega ;\mathbb R^{Nn}),\) letting

$$\begin{aligned} {\mathcal {R}}_{{\overline{\Omega }}} = \left\{ x \in {\overline{\Omega }}: \limsup _{R \rightarrow 0} |(\nabla u)_{\Omega _R(x)}|< \infty \right\} , \end{aligned}$$
(9)

we have \({\mathcal {R}}_{{\overline{\Omega }}} \subset {\overline{\Omega }}\) is a relatively open subset of full measure and u is \(C^{1,\alpha }\) on \({\mathcal {R}}_{{\overline{\Omega }}}\) for all \(\alpha \in (0,\beta ).\)

We can also obtain a global regularity result if we assume \(\nabla u\) is suitably small in both \(L^1\) and \({{\,\textrm{BMO}\,}}.\) The \(L^1\) smallness condition allows us to cover \({\overline{\Omega }}\) by finitely many balls \(B_{R_k}(x_k)\) such that each \(|(\nabla u)_{\Omega _{R_k}(x_k)}|\le 1 + \left[ \nabla g\right] _{L^{\infty }(\Omega )},\) on which we can apply our \(\varepsilon \)-regularity result to obtain the following.

Corollary 1.6

(Regularity of \({{\,\textrm{BMO}\,}}\)-small solutions) Suppose F satisfies Hypotheses 1.1, let \(\Omega \subset \mathbb R^n\) be a \(C^{1,\beta }\) domain for some \(\beta \in (0,1)\) and \(g \in C^{1,\beta }({\overline{\Omega }},\mathbb R^N).\) Then for each \(\alpha \in (0,\beta )\) there is \(\varepsilon = \varepsilon (F,\Omega ,g,\beta ,\alpha )>0\) such that if \(u \in W^{1,q}_g(\Omega ,\mathbb R^N)\) is F-extremal in \(\Omega \) with \(\nabla u \in {{\,\textrm{BMO}\,}}(\Omega ,\mathbb R^{Nn})\) satisfying

$$\begin{aligned} \left\Vert \nabla u-\nabla g\right\Vert _{L^1(\Omega )} + \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(\Omega )} \le \varepsilon , \end{aligned}$$
(10)

then \(u \in C^{1,\alpha }({\overline{\Omega }},\mathbb R^N).\)

Finally we will show that the Lipschitz condition in Corollary 1.4 can be removed if we assume the following uniformly controlled growth condition.

Hypotheses 1.7

For \(n \ge 2,\) \(N \ge 1\) and \(p \ge 2,\) let \(F :\mathbb R^{Nn} \rightarrow \mathbb R\) satisfy the following.

  1. (H0)

    F is of class \(C^2.\)

  2. (H1)

    We have \((1+|z|^2)^{-(p-2)}F''(z)\) is bounded and uniformly continuous on \(\mathbb R^{Nn}.\)

  3. (H2)

    For all \(z \in \mathbb R^{N n}\) we have

    $$\begin{aligned} F''(z)(\xi \otimes \eta ) \cdot (\xi \otimes \eta ) \ge \lambda (1+|z|)^{p-2} |\xi |^2 |\eta |^2 \end{aligned}$$

    for all \(\xi \in \mathbb R^N\) and \(\eta \in \mathbb R^n.\)

Note the growth and continuity hypothesis (H1) is satisfied if F is a polynomial; in particular this includes the example of Šverák [49].

Theorem 1.8

(\({{\,\textrm{BMO}\,}}\) \(\varepsilon \)-regularity in the uniformly elliptic case) Suppose \(F :\mathbb R^{Nn} \rightarrow \mathbb R\) satisfies Hypotheses 1.7, let \(\Omega \subset \mathbb R^n\) be a bounded \(C^{1,\beta }\) domain, and let \(g \in C^{1,\beta }({\overline{\Omega }},\mathbb R^N)\) for some \(\beta \in (0,1).\) Then for each \(\alpha \in (0,\beta )\) there is \(\varepsilon =\varepsilon (F,\Omega ,g,\beta ,\alpha )>0\) such that if \(u \in W^{1,p}_g(\Omega ;\mathbb R^N)\) is F-extremal in \(\Omega \) and

$$\begin{aligned} \{\nabla u\}_{\textrm{osc}(\Omega )} \le \varepsilon , \end{aligned}$$
(11)

then \(u \in C^{1,\alpha }({\overline{\Omega }},\mathbb R^N).\)

1.2 Connection to minimisers and quasiconvexity

In the context of strictly quasiconvex integrands, there is a close connection between sufficiency results (whether extremals are minimising) and regularity of the extremal. One of the early results in the quasiconvex setting is due to Zhang [51], who showed that a \(C^2\) extremal is absolutely minimising on small balls \(B \subset \Omega .\) In the opposite direction, it was shown by Kristensen and Taheri in [33, Theorem 4.1] that if \(u \in W^{1,p} \cap W^{1,q}_{{{\,\textrm{loc}\,}}}\) is a \(W^{1,q}\)-local minimiser for some \(1 \le q \le \infty ,\) then we can establish a partial regularity theorem (we refer the reader to the aforementioned paper for the precise terminology and results).

It was moreover established in [33, Theorems 6.1, 7.1] that if u is a Lipschitz extremal with strictly positive second variation (that is, it is a weak local minimiser) then it is minimising among perturbations such that \(\left[ \nabla \varphi \right] _{{{\,\textrm{BMO}\,}}(\Omega )}\) is small, but that this is too weak to infer improved regularity, as shown through counterexamples. The former statement uses a modular version of the Fefferman–Stein inequality which we also use (see Sect. 3.1), and the latter follows by adapting the construction of Müller and Šverák [43]. For Lipschitz weak local minimisers however, it was shown by Campos Cordero [7] that we can infer global regularity if we additionally assume that \(\nabla u \in {{\,\textrm{VMO}\,}}(\Omega ).\) The proof loosely follows the compensated compactness argument used in [33, Section 4].

We see that Corollary 1.4 generalises the above result in [7], by removing the condition on the second variation and allowing F to merely satisfy a strict Legendre–Hadamard condition (H2). Here the Legendre–Hadamard condition can be seen to be a natural relaxation in the following sense; it is proved by Kristensen [31] that (H2) implies that F is locally quasiconvex in the sense that for each \(z_0 \in \mathbb R^{Nn}\) there exists a quasiconvex function G such that \(F=G\) in a neighbourhood of \(z_0.\) Our argument, which builds upon ideas of Moser [42], streamlines this process by establishing regularity directly. In particular we note that the same Fefferman–Stein estimate used for the \({{\,\textrm{BMO}\,}}\)-sufficiency result in [33] is crucially used to obtain a Caccioppoli-type inequality in [42] and Sect. 2.2.

1.3 Basic notation

We will briefly fix some notation that will be used throughout the text. We will equip \(\mathbb R^n\) with the Lebesgue measure \({\mathcal {L}}^n,\) and if \(A \subset \mathbb R^n\) is non-empty and open such that \({\mathcal {L}}^n(A)<\infty ,\) for any \(f \in L^1(A,\mathbb R^k)\) with \(k\ge 1\) we define

$$\begin{aligned} (f)_A = \frac{1}{{\mathcal {L}}^n(A)} \int _A f \,\textrm{d}x. \end{aligned}$$
(12)

We also denote by \(B_R(x_0)\) the open ball in \(\mathbb R^n\) centred at \(x_0\) with radius R, and for \(\Omega \subset \mathbb R^n\) open write \(\Omega _R(x_0) = \Omega \cap B_R(x_0).\) We may write \(B_R,\) \(\Omega _R\) respectively if the centre point \(x_0\) is clear from context.

We will denote by \(\mathbb R^{Nn}\) the space of \(N \times n\) real matrices, which we equip with the inner product \(z: w = {{\,\textrm{tr}\,}}(z^tw)\) and \(\ell ^2\)-norm \(|z|^2 = z: z\) for \(z,w \in \mathbb R^{Nn}.\) For a differentiable map \(F :\mathbb R^{Nn} \rightarrow \mathbb R\) we define its derivative \(F' :\mathbb R^{Nn} \rightarrow \mathbb R^{Nn}\) as

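$$\begin{aligned} F'(z): w = \left. \frac{\textrm{d}}{\textrm{d}t}\right| _{t=0} F(z+tw) \quad \text {for all } w \in \mathbb R^{Nn}, \end{aligned}$$
(13)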

and for a differentiable map \(A :\mathbb R^{Nn} \rightarrow \mathbb R^{Nn}\) its derivative \(A'(z)\) will be a linear map \(\mathbb R^{Nn} \rightarrow \mathbb R^{Nn}\) at each \(z \in \mathbb R^{Nn},\) defined by

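$$\begin{aligned} A'(z)w = \left. \frac{\textrm{d}}{\textrm{d}t}\right| _{t=0} A(z+tw) \quad \text {for all } w \in \mathbb R^{Nn}. \end{aligned}$$
(14)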

If F is \(C^2\) this allows us to define \(F'',\) which satisfies \(F''(z)v: w = F''(z)w: v\) for all \(z, v, w \in \mathbb R^{Nn}.\)

Additionally C will denote a constant that may change from line to line; if not specified, constants appearing in proofs depend only on the parameters that the resulting estimate depends on.

2 Interior regularity for F-extremals

We begin by considering the interior regularity theory for solutions to the Euler–Lagrange system. While the techniques extend to the general case, we will present a detailed proof in this simplified setting first to illustrate the key ideas. We will refer to Sect. 3 for some auxiliary results, but since we only apply them on balls B they can be obtained through simpler means.

2.1 Estimates for F

We will consider \(F :\mathbb R^{Nn} \rightarrow \mathbb R\) satisfying Hypotheses 1.1, and fix \(M>0.\) Since \(F''(z)\) is uniformly continuous on compact subsets, there is \(\Lambda _M>0\) and a modulus of continuity function \(\omega _M :[0,\infty ) \rightarrow [0,1]\) such that

$$\begin{aligned} |F''(z)|&\le \Lambda _M, \end{aligned}$$
(15)
$$\begin{aligned} |F''(z)-F''(w)|&\le \Lambda _M \omega _M(|z-w|) \end{aligned}$$
(16)

for all \(z,w \in \mathbb R^{Nn}\) with \(|z|,|w|\le M+1.\) Here \(\omega _M\) can be chosen to be a non-decreasing, continuous, and concave function such that \(\omega _M(0)=0.\) Also since the strict Legendre–Hadamard condition holds uniformly on compact subsets, there is \(\lambda _M>0\) such that for all \(z \in \mathbb R^{Nn}\) with \(|z|\le M\) we have

$$\begin{aligned} F''(z)(\xi \otimes \eta ): (\xi \otimes \eta ) \ge \lambda _M |\xi |^2|\eta |^2 \end{aligned}$$
(17)

for all \(\xi \in \mathbb R^N\) and \(\eta \in \mathbb R^n.\) Now for \(w \in \mathbb R^{Nn}\) with \(|w|\le M,\) following Acerbi and Fusco [1] consider the shifted integrand

$$\begin{aligned} F_w(z) = F(z+w) - F(w) - F'(w)z. \end{aligned}$$
(18)

Since \(F''\) satisfies a Legendre–Hadamard condition, we infer F is rank-one convex and so its derivative satisfies \(|F'(z)|\le C(n,N)K(1+|z|)^{q-1}.\) Hence \(F_w\) satisfies the growth conditions

$$\begin{aligned} |F_w(z)|&\le K_M( |z|^2 + |z|^q), \end{aligned}$$
(19)
$$\begin{aligned} |F'_w(z)|&\le K_M( |z|+ |z|^{q-1}) \end{aligned}$$
(20)

where

$$\begin{aligned} K_M = \Lambda _M + C(N,n)K, \end{aligned}$$
(21)

using the mean value theorem and distinguishing between the cases when \(|z|\le 1\) and \(|z|>1.\) A similar argument gives the comparison estimate

$$\begin{aligned} |F''_w(0)z - F'_w(z)|\le K_M\,\omega _M(|z|)(|z|+|z|^{q-1}). \end{aligned}$$
(22)
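
For \(|z|\le 1\) the estimates (19), (20) and (22) can be sketched as follows; this is a standard Taylor expansion argument, written out here under the assumption \(|w|\le M\) so that \(|w+tz|\le M+1\) for all \(t \in [0,1].\) Since \(F_w(0)=0,\) \(F_w'(0)=0\) and \(F_w''(z) = F''(z+w),\) we have

$$\begin{aligned} F_w'(z) = \int _0^1 F''(w+tz)z \,\textrm{d}t, \qquad F_w(z) = \int _0^1 (1-t)\, F''(w+tz)z: z \,\textrm{d}t, \end{aligned}$$

so (15) gives \(|F_w'(z)|\le \Lambda _M|z|\) and \(|F_w(z)|\le \frac{1}{2}\Lambda _M|z|^2.\) Similarly, by (16) and the monotonicity of \(\omega _M,\)

$$\begin{aligned} |F''_w(0)z - F'_w(z)|= \left| \int _0^1 \left( F''(w)-F''(w+tz)\right) z \,\textrm{d}t\right| \le \Lambda _M |z|\int _0^1 \omega _M(t|z|) \,\textrm{d}t \le \Lambda _M\, \omega _M(|z|)\,|z|. \end{aligned}$$

The case \(|z|>1\) instead relies on the growth bounds for F and \(F'.\)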

2.2 Caccioppoli-type inequality

We now prove the following weakening of the Caccioppoli inequality of the second kind introduced by Evans [14], which is a staple for many partial regularity proofs in the quasiconvex setting. The following estimate was essentially proved by Moser [42], and involves applying the modular version of the estimate of Fefferman and Stein [15] established in Sect. 3.1 (see also Remark 2.3 at the end of this subsection).

Lemma 2.1

(Caccioppoli-type inequality) Suppose F satisfies Hypotheses 1.1, and let \(M \ge 1.\) Then if u is F-extremal in some ball \(B_R(x_0) \subset \Omega \) such that \(\nabla u \in {{\,\textrm{BMO}\,}}(B_R,\mathbb R^{Nn})\) with \(\left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(B_R)} \le 1\) and \(|(\nabla u)_{B_R}|\le M,\) then setting

(23)

there is \({\widetilde{M}} = C(n)M\) and \(C=C(n,N,q,K_{{\widetilde{M}}}/\lambda _{{\widetilde{M}}})>0\) such that

(24)

with \(\gamma (t) :[0,\infty ) \rightarrow [0,1]\) a non-decreasing, continuous function such that \(\gamma (0)=0,\) depending on \(\omega _{{\widetilde{M}}}\) and q only.

This choice of \(a_R\) is due to Kronz [34], whose significance is illustrated in the lemma below; this is essentially contained in [34, Lemma 2(ii)], applying the Poincaré inequality in \(W^{1,1}\) instead.

Lemma 2.2

If \(u \in W^{1,2}(B_R,\mathbb R^N)\), we have \(a_R\) defined as in (23) satisfies

(25)

for any affine map \(a: \mathbb R^n \rightarrow \mathbb R^N.\) Further we have the estimate

(26)

In particular if \(\nabla u \in {{\,\textrm{BMO}\,}}(B_R,\mathbb R^{Nn})\) we have \(|\nabla a_R - (\nabla u)_{B_R}|\le C(n) \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(B_R)}.\)

Proof of Lemma 2.1

Set \({\widetilde{F}}(z) = F_{\nabla a_R}(z)\) as in (18), and note by Lemma 2.2 that

$$\begin{aligned} |\nabla a_R|\le |(\nabla u)_{B_R}|+ C(n)\left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(B_R)} \le {\widetilde{M}}. \end{aligned}$$
(27)

Also fix a cutoff \(\eta \in C^{\infty }_c(B_R)\) such that \(1_{B_{R/2}} \le \eta \le 1_{B_R}\) and \(|\nabla \eta |\le \frac{C}{R}.\) Putting \(w = u-a_R\) we have w is \({\widetilde{F}}\)-extremal since u is F-extremal, and so testing the equation against \(\phi = \eta ^2w\) gives

(28)

Also since \({\widetilde{F}}''(0) = F''(\nabla a_R)\) satisfies the strict Legendre–Hadamard condition (17) with \(|\nabla a_R|\le {{\widetilde{M}}},\) applying this to \(\eta w \in W^{1,2}_0(\Omega ,\mathbb R^N)\) gives (see for instance [20, Theorem 10.1]),

(29)

Taking the difference of (28), (29) and rearranging we get

(30)

where we have used the comparison estimate (22) to control the first term along with the fact that \(\eta ^2 \le 1,\) and the growth estimates (15), (20) for the additional terms. We apply the modular Fefferman–Stein estimate (Corollary 3.5) to the first term, noting that \(\nabla w = \nabla u - (\nabla u)_{B_R}\) so

(31)

where we have used Lemma 2.2 along with the fact that \(\omega _{{\widetilde{M}}}(ts) \le t\, \omega _{{\widetilde{M}}}(s)\) for \(t \ge 1,\) \(s \ge 0\) in the last line. Hence combining these with the earlier estimate and using Young’s inequality to absorb the \(|\nabla (\eta w)|^2\) term we arrive at

(32)

Note that if \(q=2,\) we do not get the \(|\nabla w|^{2(q-1)}\) term. Otherwise by the John–Nirenberg inequality (Proposition 3.3) and Lemma 2.2 we can bound

(33)
(34)

Therefore if we let \(\gamma (t) = \min \{1,\left( \omega _{{\widetilde{M}}}(t)(1+t^{q-2}) + t^{2(q-2)}\right) \}\) (omitting the \(t^{q-2}\) terms if \(q=2\)) we deduce that

(35)

as required. \(\square \)

Remark 2.3

We have referred to Sect. 3.1 for the John–Nirenberg and modular Fefferman–Stein estimates, however in the interior case they can also be deduced from the corresponding statements in the full space using a cutoff argument. We will omit the details, but the argument is similar to that found in [42]; in this case the modular estimate can be proved more simply via a good-\(\lambda \) estimate (see [33, Lemma 6.2]).
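
The scaling property \(\omega _{{\widetilde{M}}}(ts) \le t\,\omega _{{\widetilde{M}}}(s)\) used in the proof of Lemma 2.1 is an elementary consequence of concavity together with \(\omega _{{\widetilde{M}}}(0)=0.\) As a short sketch, for \(t \ge 1\) and \(s \ge 0\) we write \(s = \frac{1}{t}(ts) + \left( 1-\frac{1}{t}\right) \cdot 0,\) so that concavity gives

$$\begin{aligned} \omega _{{\widetilde{M}}}(s) \ge \frac{1}{t}\,\omega _{{\widetilde{M}}}(ts) + \left( 1-\frac{1}{t}\right) \omega _{{\widetilde{M}}}(0) = \frac{1}{t}\,\omega _{{\widetilde{M}}}(ts). \end{aligned}$$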

2.3 Harmonic approximation and interior regularity

Our second ingredient is a comparison estimate for solutions to the linearised system. The following duality argument is an adaptation of the estimate proved in [22]. The linear theory we need will straightforwardly follow from the strict Legendre–Hadamard condition satisfied by \(F''(\nabla a),\) and we will refer the reader to [20, Chapter 10] for details.

Lemma 2.4

(Interior harmonic approximation) Suppose F satisfies Hypotheses 1.1, let \(M >0,\) and suppose u is F-extremal in some ball \(B_R(x_0) \subset \Omega \) such that \(\nabla u \in {{\,\textrm{BMO}\,}}(B_R,\mathbb R^{Nn})\) with \(\left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(B_R)} \le 1\) and \(|(\nabla u)_{B_R}|\le M.\) Then letting \(a_R\) be as in (23), the unique solution \(h \in W^{1,2}(B_R,\mathbb R^N)\) to the problem

(36)

satisfies the \(L^2\) estimate

(37)

with \({\widetilde{M}} = C(n)M\) and \(C=C(n,N,K_{{\widetilde{M}}}/\lambda _{{\widetilde{M}}})>0,\) and further the comparison estimate

(38)

with \(C=C(n,N,q,K_{{\widetilde{M}}}/\lambda _{{\widetilde{M}}})\) and some \(\gamma :[0,\infty ) \rightarrow [0,1]\) increasing and continuous such that \(\gamma (0)=0,\) depending on \(\omega _{{\widetilde{M}}}\) and q only.

Proof

By replacing F with \(\lambda _{{\widetilde{M}}}^{-1}F,\) we can replace \((K_{{\widetilde{M}}},\lambda _{{\widetilde{M}}})\) with \((K_{{\widetilde{M}}}/\lambda _{{\widetilde{M}}},1),\) where \({\widetilde{M}} \ge 1\) is as in (27). Put \(w = u-a_R\) and \({\widetilde{F}} = F_{\nabla a_R}.\) Then the existence of a unique \(h \in W^{1,2}_w(B_R,\mathbb R^N)\) follows from \(L^2\)-coercivity of \({\widetilde{F}}''(0)\) (see [20, Theorem 10.1]) which gives (37). Then for any \(\phi \in W^{1,2}_0(B_R,\mathbb R^N)\) we have

(39)

where we have used the fact that w is \({\widetilde{F}}\)-extremal and the comparison estimate (22). Now choose \(\phi \) to be the unique solution in \(W^{1,2}_0 \cap W^{2,2}(B_R,\mathbb R^N)\) to the problem

$$\begin{aligned} - {{\,\textrm{div}\,}}{\widetilde{F}}''(0)\nabla \phi = w-h \end{aligned}$$
(40)

in \(B_R\) (see [20, Theorem 10.3]), so in particular by symmetry of \({\widetilde{F}}''(0)\) this satisfies

(41)

Moreover \(\phi \) satisfies a \(W^{2,2}\) estimate which combined with the Poincaré–Sobolev inequality (noting \((\nabla \phi )_{B_R} = 0\)) gives

$$\begin{aligned} \left\Vert \nabla \phi \right\Vert _{L^{2^*}(B_R)} \le C(n)\left\Vert \nabla ^2 \phi \right\Vert _{L^{2}(B_R)} \le C\left\Vert w-h\right\Vert _{L^{2}(B_R)} \end{aligned}$$
(42)

where \(2^* = \frac{2n}{n-2}\) provided \(n > 2.\) For this choice of \(\phi ,\) applying Hölder’s inequality and rearranging (39) using (42) we get

(43)

If \(n=2\) we use the fact that \(\left\Vert \nabla \phi \right\Vert _{L^4(B_R)} \le C R^{\frac{1}{2}} \left\Vert w-h\right\Vert _{L^2(B_R)}\) to get the slightly modified estimate

(44)

In both cases, since \(\omega _{{\widetilde{M}}} \le 1\) is concave, by Jensen’s inequality we have

(45)

and by the John–Nirenberg estimate (Proposition 3.3) and Lemma 2.2 we can also estimate

(46)

Putting everything together the result follows by taking \(\gamma (t) = \min \{1,\omega _{{\widetilde{M}}}(t)^{\frac{2}{n}} \left( 1+t^{2(q-2)}\right) \},\) modified suitably if \(n=2.\) \(\square \)

From here Theorem 1.2 follows by combining the above estimates to get a suitable decay estimate, which can be applied iteratively. This approach is standard among many partial regularity proofs, and we follow a similar argument to that found in [22].

Proof of Theorem 1.2

We will begin by establishing the following decay estimate for the excess energy

(47)

Claim For any \(B_{r}(x) \subset B_{R}(x_0)\) and \(\sigma \in \left( 0,\frac{1}{4}\right) \) for which \(|(\nabla u)_{B_{2\sigma r}(x)}|,|(\nabla u)_{B_r(x)}|\le 2^{n+1}M\) and \(\left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(B_r(x))} \le 1\) we have

$$\begin{aligned} E(x,\sigma r) \le C \left( \sigma ^2 + \sigma ^{-(n+2)}\gamma \left( \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(B_r(x))}\right) \right) E(x,r), \end{aligned}$$
(48)

where \(C=C(n,N,q,K_M/\lambda _M)>0\) and \(\gamma \) satisfies both Lemmas 2.1 and 2.4 with \(C_*(n) M\) in place of M.

Indeed let \(a_r\) be as in (23) centred at x,  and apply the harmonic approximation result (Lemma 2.4) in \(B_r(x)\) to get \(h \in W^{1,2}_{u-a_r}(B_r(x),\mathbb R^N)\) solving

$$\begin{aligned} -{{\,\textrm{div}\,}}F''(\nabla a_r)\nabla h = 0, \end{aligned}$$
(49)

which satisfies

(50)

Now letting \(a_h(y) = h(x) + \nabla h(x) \cdot (y-x),\) since \(B_{2\sigma r}(x) \subset B_{r/2}(x)\) we have

(51)

using interior regularity for h (see for instance [20, Theorem 10.7]). We will use these in conjunction with the Caccioppoli-type inequality (Lemma 2.1) applied in \(B_{2\sigma r}(x),\) letting \(a_{2\sigma r}\) be given by (23) we have

(52)

Now using the estimates (50) and (51) and the minimising property (25) we can estimate

(53)

So the claim follows by combining the above two estimates.

We now iteratively apply the claim for suitably chosen parameters. Since \(|(\nabla u)_{B_R(x_0)}|\le M,\) for all \(x \in B_{R/2}(x_0)\) we have \(|(\nabla u)_{B_{R/2}(x)}|\le M\) and so

$$\begin{aligned} |(\nabla u)_{B_{\sigma R/2}(x)}|\le |(\nabla u)_{B_{R/2}(x)}|+ |(\nabla u)_{B_{\sigma R/2}(x)} - (\nabla u)_{B_{R/2}(x)}|\le 2^nM + \sigma ^{-n} E(x,R/2). \end{aligned}$$
(54)

Iteratively applying this therefore gives

$$\begin{aligned} |(\nabla u)_{B_{\sigma ^k R/2}(x)}|\le 2^nM + \sigma ^{-n}\sum _{j=0}^{k-1} E(x,\sigma ^jR/2). \end{aligned}$$
(55)

Since \(E(x,r) \le \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(B_R(x_0))}^2 \le \varepsilon ^2\) for all \(r < R/2,\) we see that if \(\sigma ^{-n}\varepsilon ^2 \le 2^nM,\) we can apply the claimed decay estimate (48) to obtain

$$\begin{aligned} E(x,\sigma R/2) \le C \left( \sigma ^2 + \sigma ^{-n-2} \gamma (\varepsilon )\right) E(x,R/2). \end{aligned}$$
(56)

Fix \(\alpha \in (0,1),\) and choose \(\sigma \le \frac{1}{4}\) such that \(C\sigma ^2 \le \frac{1}{2} \sigma ^{2\alpha }.\) Then we can take \(\varepsilon >0\) small enough so \(C \sigma ^{-(n+2)} \gamma (\varepsilon ) \le \frac{1}{2} \sigma ^{2\alpha }\) and \(\sigma ^{-n}\varepsilon ^2 \sum _j \sigma ^{\alpha j} \le 2^nM.\) Then we can inductively check that (48) gives

$$\begin{aligned} E(x,\sigma ^k R/2) \le \sigma ^{2\alpha k} E(x,R/2), \end{aligned}$$
(57)

and by (55) we can ensure

$$\begin{aligned} |(\nabla u)_{B_{\sigma ^k R/2}(x)}|\le 2^{n+1}M \end{aligned}$$
(58)

for each \(k\ge 1.\) Hence for each \(r \in (0,R/2),\) choosing k such that \(\sigma ^k R/2 \le r < \sigma ^{k-1}R/2\) we deduce that

$$\begin{aligned} E(x,r) \le C \left( \frac{r}{R}\right) ^{2\alpha } E(x_0,R). \end{aligned}$$
(59)

This verifies the Campanato–Meyers characterisation of Hölder continuity (see for instance [20, Theorem 2.9]), allowing us to conclude that \(u \in C^{1,\alpha }(\overline{B_{R/2}(x_0)},\mathbb R^N)\) as required. \(\square \)
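
To spell out the induction behind (57), here is a sketch assuming the hypotheses of the claim hold at the scale \(r = \sigma ^k R/2\) and using that \(\gamma \) is non-decreasing. With \(\sigma \) and \(\varepsilon \) chosen as in the proof, (48) gives

$$\begin{aligned} E(x,\sigma ^{k+1}R/2) \le C\left( \sigma ^2 + \sigma ^{-(n+2)}\gamma (\varepsilon )\right) E(x,\sigma ^kR/2) \le \left( \tfrac{1}{2}\sigma ^{2\alpha } + \tfrac{1}{2}\sigma ^{2\alpha }\right) E(x,\sigma ^kR/2) = \sigma ^{2\alpha }E(x,\sigma ^kR/2), \end{aligned}$$

which combined with the inductive hypothesis \(E(x,\sigma ^kR/2) \le \sigma ^{2\alpha k}E(x,R/2)\) yields (57) with \(k+1\) in place of k. Note that the requirement \(C\sigma ^2 \le \frac{1}{2}\sigma ^{2\alpha }\) is equivalent to \(\sigma \le (2C)^{-\frac{1}{2(1-\alpha )}},\) which is where \(\alpha <1\) is used.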

3 Preliminaries for boundary regularity

Before we consider the boundary case, we will collect some technical results which will be used in our subsequent regularity proofs. While these results are largely known, some care was needed in keeping track of the associated constants.

3.1 \({{\,\textrm{BMO}\,}}\) in domains

We will review some preliminary results about \({{\,\textrm{BMO}\,}}\) functions and fix our conventions. For any \(D \subset \mathbb R^n\) open, we define the Fefferman–Stein maximal function associated to \(f \in L^1_{{{\,\textrm{loc}\,}}}(D,\mathbb R^{Nn})\) as

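$$\begin{aligned} {\mathcal {M}}_{D}^{\#}f(x) = \sup _{x \in B \subset D} \frac{1}{{\mathcal {L}}^n(B)} \int _B |f-(f)_B|\,\textrm{d}y, \end{aligned}$$
(60)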

where we are taking the supremum over balls B. Using this we can define the John–Nirenberg space \({{\,\textrm{BMO}\,}}(D,\mathbb R^{Nn})\) of functions of bounded mean oscillation in D as the space of \(f \in L^1_{{{\,\textrm{loc}\,}}}(D,\mathbb R^{Nn})\) for which \({\mathcal {M}}_{D}^{\#}f \in L^{\infty }(D).\) We equip this space with the seminorm \(\left[ f\right] _{{{\,\textrm{BMO}\,}}(D)} = \Vert {\mathcal {M}}_{D}^{\#}f\Vert _{L^{\infty }(D)}.\)

While we wish to apply the results in this section to domains which are piecewise \(C^{1,\beta },\) in order to understand the dependence of constants on the domain D it will be convenient to work with John domains; these were first introduced by John [27] and later named by Martio and Sarvas [36]. The definition given here is slightly different to what appeared in the original papers, but can be found for instance in [46].

Definition 3.1

For \(\delta \in (0,1)\) we say a bounded domain \(D \subset \mathbb R^n\) is a \(\delta \)-John domain if there exists \(x_0 \in D,\) called the John centre, such that for all \(x \in D\) there is a rectifiable curve \(\gamma :[0,d] \rightarrow D\) parametrised by arclength such that \(\gamma (0)=x,\) \(\gamma (d) = x_0,\) and

$$\begin{aligned} {{\,\textrm{dist}\,}}(\gamma (t),\partial D) \ge \delta t \end{aligned}$$
(61)

for all \(t \in [0,d].\)
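
As a simple example (a sketch included for illustration, not needed for the results below), any ball \(B_R(x_0)\) satisfies (61) using the straight segment from x to the centre. Indeed for \(x \in B_R(x_0)\setminus \{x_0\}\) put \(d = |x-x_0|\) and \(\gamma (t) = x + t\,\frac{x_0-x}{d},\) so that \(|\gamma (t)-x_0|= d-t\) and

$$\begin{aligned} {{\,\textrm{dist}\,}}(\gamma (t),\partial B_R(x_0)) = R-(d-t) = (R-d)+t \ge t \ge \delta t \end{aligned}$$

for all \(t \in [0,d]\) and any \(\delta \in (0,1).\) Hence balls are \(\delta \)-John domains for every \(\delta \in (0,1),\) with John centre \(x_0.\)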

This can be viewed as a twisted cone condition, and since bounded Lipschitz domains satisfy a uniform cone condition (see for instance [2, Section 4.4]) it follows that they are John domains. Moreover we have the following localisation property.

Proposition 3.2

Let \(\Omega \) be a \(C^1\) domain. Then there is \(R_0 > 0\) and \(\delta =\delta (n)>0\) such that for all \(x_0 \in {\overline{\Omega }}\) and \(0<R<R_0,\) we have \(\Omega _R(x_0)\) is a \(\delta \)-John domain.

We will postpone the proof to Sect. 3.2, which may be of independent interest. This is the main reason why we have introduced these domains; if we can establish estimates in domains \(\Omega _R(x_0)\) where the associated constant only depends on \(\delta ,\) then the constant holds uniformly among these domains. This is particularly useful for our purposes where the dependence on \(\delta \) naturally enters when considering estimates involving \({{\,\textrm{BMO}\,}}.\) However we believe this is more generally a useful way to keep track of the constants for various technical estimates applied on \(\Omega _R(x_0),\) which may not be easily controlled by the Lipschitz norm.
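For orientation, we record the simplest instance of Definition 3.1; this is an illustrative example only and is not needed in the sequel. Take \(D = B_1(0)\) with John centre 0. For \(x \in B_1(0) \setminus \{0\}\) consider the radial segment \(\gamma (t) = x - t \frac{x}{|x|}\) for \(t \in [0,|x|],\) which is parametrised by arclength and joins x to 0. Since \(|\gamma (t)|= |x|- t\) we get

$$\begin{aligned} {{\,\textrm{dist}\,}}(\gamma (t),\partial B_1(0)) = 1 - |\gamma (t)|= 1 - |x|+ t \ge t \ge \delta t \end{aligned}$$

for all \(t \in [0,|x|]\) and every \(\delta \in (0,1).\) Hence the unit ball is a \(\delta \)-John domain for every such \(\delta ,\) and by translating and scaling the same holds for any ball.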

The first result we need is a global version of the John–Nirenberg inequality, which was proved in greater generality by Smith and Stegenga [47] and Hurri-Syrjänen [24]. We will sketch the proof to clarify the dependence of constants.

Proposition 3.3

(Global John–Nirenberg estimate [24, 47]) Suppose D is a bounded \(\delta \)-John domain, and \(f \in {{\,\textrm{BMO}\,}}(D,\mathbb R^{Nn}).\) Then for all \(1 \le p < \infty ,\) there is \(C = C(n,p,\delta ) > 0\) such that

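$$\begin{aligned} \frac{1}{{\mathcal {L}}^n(D)} \int _D |f-(f)_D|^p \,\textrm{d}x \le C \left[ f\right] _{{{\,\textrm{BMO}\,}}(D)}^p. \end{aligned}$$
(62)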

Proof sketch

The strategy is to take a Whitney decomposition \(W = \{ Q_j\}\) of D as given in [48, Section VI.1], and apply the John–Nirenberg inequality on each \(Q_j,\) which is easily adapted from the original argument in [28] (see also [20, Corollary 2.2]). To patch these local estimates we can use Whitney chains following [29] to show that

$$\begin{aligned} \int _D |f-(f)_D|^p \,\textrm{d}x \le C(n,p)\left( {\mathcal {L}}^n(D) + \int _D k_D(x_0,x)^p \,\textrm{d}x\right) \left[ f\right] _{{{\,\textrm{BMO}\,}}(D)}^p, \end{aligned}$$
(63)

for a distinguished point \(x_0 \in D,\) where \(k_D\) is the quasi-hyperbolic distance introduced in [17] defined by

$$\begin{aligned} k_{D}(x_1,x_2) = \inf _{\gamma } \int _{\gamma } \frac{1}{{{\,\textrm{dist}\,}}(x,\partial D)} \,\textrm{d}t, \end{aligned}$$
(64)

taking the infimum over all rectifiable curves \(\gamma \) connecting \(x_1, x_2 \in D.\) To verify the integrability of \(k_D(x_0,\cdot )^p,\) letting \(x_0\) be the John centre it is shown in [16] that for all \(x \in D,\)

$$\begin{aligned} k_D(x_0,x) \le \frac{1}{\delta } \log \frac{{{\,\textrm{dist}\,}}(x_0,\partial D)}{{{\,\textrm{dist}\,}}(x,\partial D)} + \frac{1}{\delta }\left( 1+\log \left( 1+\delta ^{-1}\right) \right) . \end{aligned}$$
(65)

Using this and keeping track of constants in the proof of [46, Theorem 4] we have \(k_D\) satisfies the integrability condition

$$\begin{aligned} \int _D k_D(x_0,x)^p \,\textrm{d}x \le C(n,p,\delta ) {\mathcal {L}}^n(D), \end{aligned}$$
(66)

from which the result follows. \(\square \)
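To illustrate the integrability condition (66) in the simplest case (an illustrative computation only, not part of the argument above), take \(D = B_1(0)\) and \(x_0 = 0.\) Using the radial segment from 0 to x as a competitor in (64), and noting that \({{\,\textrm{dist}\,}}(y,\partial B_1(0)) = 1-|y|,\) we get

$$\begin{aligned} k_D(0,x) \le \int _0^{|x|} \frac{\textrm{d}s}{1-s} = \log \frac{1}{1-|x|}. \end{aligned}$$

Passing to polar coordinates, bounding \(r^{n-1} \le 1\) and substituting \(s = 1-r\) then gives

$$\begin{aligned} \int _D k_D(0,x)^p \,\textrm{d}x \le n\omega _n \int _0^1 \left( \log \frac{1}{s}\right) ^p \,\textrm{d}s = n\omega _n \Gamma (p+1) = n \Gamma (p+1) {\mathcal {L}}^n(D), \end{aligned}$$

where \(\omega _n = {\mathcal {L}}^n(B_1(0))\) and the last integral is evaluated via the substitution \(s = e^{-u}.\) So the logarithmic blow-up of \(k_D\) at the boundary is integrable to every power p.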

We will also need a modular version of the Fefferman–Stein theorem [15, Theorem 5] that holds up to the boundary. This estimate in the full space appeared in the work of Kristensen and Taheri [33] where it was proven by means of a good-\(\lambda \) estimate; however, to obtain estimates up to the boundary we will need a more refined approach using the extrapolation results of Cruz-Uribe et al. [9]. We will briefly recall the notion of N-functions considered in [9]; these are mappings \(\Phi :[0,\infty ) \rightarrow [0,\infty )\) which are continuous, convex, and strictly increasing such that

$$\begin{aligned} \lim _{t \rightarrow 0^+} \frac{\Phi (t)}{t} = 0, \quad \lim _{t \rightarrow \infty } \frac{\Phi (t)}{t} = \infty . \end{aligned}$$
(67)

For such a \(\Phi \) we can associate a conjugate function \({\overline{\Phi }}(t) = \sup _{s>0} \{st - \Phi (s)\},\) which can be shown to also be an N-function. We say \(\Phi \in \Delta _2\) if there is \(C>0\) such that the doubling property \(\Phi (2t) \le C\Phi (t)\) holds, in which case the minimal C will be denoted by \(\Delta _2(\Phi ).\) We also say \(\Phi \in \nabla _2\) if \({\overline{\Phi }} \in \Delta _2\) and write \(\nabla _2(\Phi ) = \Delta _2({\overline{\Phi }});\) note this holds if there is \(r>0\) such that \(\Phi (rt) \ge 2r \Phi (t)\) for all \(t \ge 0.\)
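As a concrete illustration of these definitions (a standard example, recorded here only for orientation), consider the power function \(\Phi (t) = t^p\) with \(1<p<\infty \) and write \(p' = \frac{p}{p-1}.\) The supremum defining \({\overline{\Phi }}(t)\) is attained at \(s = (t/p)^{\frac{1}{p-1}},\) which gives

$$\begin{aligned} {\overline{\Phi }}(t) = (p-1)\,p^{-p'}\, t^{p'}, \quad \Phi (2t) = 2^{p}\,\Phi (t), \quad {\overline{\Phi }}(2t) = 2^{p'}\,{\overline{\Phi }}(t). \end{aligned}$$

Hence \(\Phi \in \Delta _2 \cap \nabla _2\) with \(\Delta _2(\Phi ) = 2^p\) and \(\nabla _2(\Phi ) = 2^{p'}.\) Moreover \(\Phi (rt) = r^p \Phi (t) \ge 2r\Phi (t)\) holds precisely when \(r \ge 2^{\frac{1}{p-1}},\) which is consistent with the sufficient condition for \(\nabla _2\) stated above.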

Proposition 3.4

(Modular Fefferman–Stein estimate) Let \(D \subset \mathbb R^n\) be a bounded \(\delta \)-John domain, and \(\Phi \) an N-function such that \(\Phi \in \Delta _2 \cap \nabla _2.\) Then there is \(C=C(n,\delta ,\Delta _2(\Phi ),\nabla _2(\Phi ))>0\) such that

$$\begin{aligned} \int _{D} \Phi (|f-(f)_{D}|) \,\textrm{d}x \le C \int _{D} \Phi \left( {\mathcal {M}}^{\#}_{D}f\right) \,\textrm{d}x \end{aligned}$$
(68)

for all \(f \in L^1_{{{\,\textrm{loc}\,}}}(D,\mathbb R^{Nn})\) such that both sides are finite.

This result is essentially proved in the work of Diening et al. [12] in greater generality, however to obtain a modular estimate a slight modification is required in the proof.

Proof

We first need a weighted \(L^p\) estimate in D, so let \(1<p<\infty \) and \(w \in A_p.\) Then for any cube Q it is shown in [12, Corollary 7.2] that

(69)

for all \(f \in L^p(Q,w,\mathbb R^{Nn}),\) that is \(f :Q \rightarrow \mathbb R^{Nn}\) such that \(|f|^pw\) is integrable on Q. By applying this to \(f - (f)_Q\) and noting that \(|f - (f)_Q|\le {\mathcal {M}}_Q^{\#}f\) we deduce that

$$\begin{aligned} \int _Q |f-(f)_Q|^p w\,\textrm{d}x \le C\left( n,p,\left[ w\right] _{A_p}\right) \int _Q |{\mathcal {M}}_Q^{\#}f|^p w \,\textrm{d}x. \end{aligned}$$
(70)

To extend this to John domains we can apply [26, Theorem 3]; note it is proved in [4, Lemma 2.1] that a \(\delta \)-John domain D is a \({\mathcal {F}}(\sigma ,N)\)-domain as in [26], where \(\sigma = \min \left\{ \frac{10}{9},\frac{n+1}{n}\right\} \) and \(N=N(n,\delta ).\) Thus we obtain

$$\begin{aligned} \int _D |f-(f)_{Q_0}|^pw \,\textrm{d}x \le C\left( n,p,\left[ w\right] _{A_p},\delta \right) \int _D |{\mathcal {M}}_D^{\#}f|^p w \,\textrm{d}x \end{aligned}$$
(71)

for all \(1<p<\infty \) and \(w \in A_p,\) for a distinguished cube \(Q_0 \subset D.\) A similar estimate appears in [12, Theorem 5.23], however the above is slightly sharper as we estimate \(|f - (f)_{Q_0}|\) instead of \(|f - (f)_D|\) which is important in the sequel.

Now applying the modular extrapolation theorem in [9] (see also [10, Chapter 4]) to the family of pairs \((|f-(f)_{Q_0}|1_{D}, |{\mathcal {M}}^{\#}_Df|1_D)\) we obtain

$$\begin{aligned} \int _{D} \Phi (|f-(f)_{Q_0}|) \,\textrm{d}x \le C(n,\delta ,\Delta _2(\Phi ),\nabla _2(\Phi )) \int _{D} \Phi \left( {\mathcal {M}}^{\#}_{D}f\right) \,\textrm{d}x. \end{aligned}$$
(72)

Replacing the average \((f)_{Q_0}\) by \((f)_D\) using the doubling property and convexity of \(\Phi ,\) the result follows. \(\square \)

We wish to apply this result to \(\Phi (t) = \omega (t)t^p\) with \(p>1,\) where \(\omega :[0,\infty ) \rightarrow [0,1]\) is a continuous, non-decreasing, concave function such that \(\omega (0)=0\) as in Sect. 2.1. A technical complication arises as this need not be convex in general, but adapting a construction in Kokilashvili and Krbec [30] we can work with a modified \({\widetilde{\Phi }}\) which is convex instead.

Corollary 3.5

Suppose \(D \subset \mathbb R^n\) is a bounded \(\delta \)-John domain, \(1<p<\infty ,\) and \(\omega :[0,\infty ) \rightarrow [0,1]\) is non-decreasing, continuous, concave with \(\omega (0)=0.\) Then there is \(C = C(n,p,\delta )>0\) such that for all \(f \in {{\,\textrm{BMO}\,}}(D,\mathbb R^{Nn})\) we have

$$\begin{aligned} \int _{D} \omega (|f-(f)_D|)|f-(f)_D|^p \,\textrm{d}x \le C\,\omega \left( \left[ f\right] _{{{\,\textrm{BMO}\,}}(D)}\right) \int _{D} |f-(f)_D|^p \,\textrm{d}x. \end{aligned}$$
(73)

Proof

We will first construct an N-function \({\widetilde{\Phi }}\) such that

$$\begin{aligned} {\widetilde{\Phi }}(t) \le \Phi (t) \le {\widetilde{\Phi }}(2at) \end{aligned}$$
(74)

for all \(t \ge 0,\) where \(a \ge 1\) is to be determined. Since \(\omega \) is non-decreasing we have \(\Phi (t) \le \frac{1}{2a} \Phi (at)\) with \(a = 2^{\frac{1}{p-1}},\) and so by [30, Lemmas 1.1.1, 1.2.3] we get

$$\begin{aligned} {\widetilde{\Phi }}(t) = \frac{1}{a} \int _0^{\frac{t}{a}} \sup _{0< \tau < s} \left( \omega (\tau ) \tau ^{p-1}\right) \,\textrm{d}s \end{aligned}$$
(75)

is convex and increasing on \([0,\infty )\) satisfying (74). Further since \(\Phi \) satisfies \(\Phi (2t) \le 2^{p+1}\Phi (t)\) and \(\Phi (at) \ge 2a\Phi (t)\) we can infer that \({\widetilde{\Phi }}(2t) \le 2^{p+1}{\widetilde{\Phi }}(t)\) and \({\widetilde{\Phi }}(at) \ge 2a{\widetilde{\Phi }}(t)\) also, so \({\widetilde{\Phi }} \in \Delta _2 \cap \nabla _2\) and the associated constants can be chosen to depend on p only.
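For convenience we spell out why \(\Phi (t) = \omega (t)t^p\) has the two growth properties used here; this is a short check using only the stated assumptions on \(\omega .\) Since \(\omega \) is concave with \(\omega (0) = 0\) we have \(\omega (t) \ge \frac{1}{2} \omega (2t) + \frac{1}{2}\omega (0) = \frac{1}{2}\omega (2t),\) and since \(\omega \) is non-decreasing and \(a \ge 1\) we have \(\omega (at) \ge \omega (t).\) Therefore

$$\begin{aligned} \Phi (2t)&= \omega (2t)\, 2^p t^p \le 2\,\omega (t)\, 2^p t^p = 2^{p+1}\Phi (t), \\ \Phi (at)&= \omega (at)\, a^p t^p \ge a \cdot a^{p-1}\, \omega (t) t^p = 2a\,\Phi (t), \end{aligned}$$

where the last equality uses \(a^{p-1} = 2\) for \(a = 2^{\frac{1}{p-1}}.\)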

Now applying Proposition 3.4 to \({\widetilde{\Phi }}\) and using (74), for \(f \in L^p(D,\mathbb R^{Nn})\) we deduce that

$$\begin{aligned} \int _D \Phi \left( |f - (f)_D|\right) \,\textrm{d}x&\le C \int _D \Phi \left( {\mathcal {M}}_D^{\#}f \right) \,\textrm{d}x\nonumber \\&\le C\,\omega (\left\Vert f\right\Vert _{{{\,\textrm{BMO}\,}}(D)}) \int _{\mathbb R^n} |{\mathcal {M}}(f 1_D)|^p \,\textrm{d}x, \end{aligned}$$
(76)

where we have used the fact that \({\mathcal {M}}_D^{\#}f \le \left\Vert f\right\Vert _{{{\,\textrm{BMO}\,}}(D)}\) and \({\mathcal {M}}_D^{\#}f \le {\mathcal {M}}(f 1_D),\) where \({\mathcal {M}}\) is the Hardy–Littlewood maximal operator on \(\mathbb R^n\) defined for \(g \in L^1_{{{\,\textrm{loc}\,}}}(\mathbb R^n)\) by

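$$\begin{aligned} {\mathcal {M}}g(x) = \sup _{B \ni x} \frac{1}{{\mathcal {L}}^n(B)} \int _{B} |g|\,\textrm{d}y, \end{aligned}$$
(77)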

taking the supremum over all balls \(B \subset \mathbb R^n\) containing x. The Hardy–Littlewood maximal theorem asserts \({\mathcal {M}}\) is bounded on \(L^p(\mathbb R^n)\) for \(1<p \le \infty \) (see for instance [48, Theorem I.1]), so applying this with \(g = f\, 1_D\), (76) becomes

$$\begin{aligned} \int _D \Phi \left( |f - (f)_D|\right) \,\textrm{d}x \le C \omega (\left\Vert f\right\Vert _{{{\,\textrm{BMO}\,}}(D)}) \int _D |f|^p \textrm{d}x \end{aligned}$$
(78)

as required. \(\square \)

3.2 Localisation near the boundary

For the Caccioppoli-type estimate in the interior (Lemma 2.1), our strategy involved testing the equation against \(\phi = \eta (u-a)\) with \(\eta \) a cutoff and a an affine approximation to u in a ball. This will need to be modified for the boundary case to ensure our test function \(\phi \) vanishes on \(\partial \Omega .\) In this section we collect the necessary technical ingredients to construct a suitable replacement function, using ideas of Kronz [35] along with the refinements of Campos Cordero [7].

Let \(\Omega \subset \mathbb R^n\) be a bounded \(C^{1,\beta }\) domain, that is, \(\partial \Omega \) can locally be written as the graph of a \(C^{1,\beta }\) function in the following sense; for all \(x_0 \in \partial \Omega ,\) there is \(R_0>0\) and a unit vector \(\nu _{x_0} \in \mathbb R^n\) such that letting \(T_{x_0}= \langle \nu _{x_0} \rangle ^{\perp }\) denote the orthogonal complement, there is a map

$$\begin{aligned} \gamma :T_{x_0} \cap B_{R_0} \rightarrow \mathbb R \end{aligned}$$
(79)

which is of class \(C^{1,\beta }\) such that we have \(\nabla \gamma (0) = 0\) and

$$\begin{aligned} \Omega \cap B_{R_0}(x_0)&= B_{R_0}(x_0) \cap \left\{ x_0 + y + \lambda \nu : y \in T_{x_0} \cap B_{R_0}, \lambda < \gamma (y)\right\} , \end{aligned}$$
(80)
$$\begin{aligned} \partial \Omega \cap B_{R_0}(x_0)&= B_{R_0}(x_0) \cap \left\{ x_0+ y + \gamma (y) \nu : y \in T_{x_0} \cap B_{R_0}\right\} . \end{aligned}$$
(81)

Note this also allows us to define Lipschitz domains and \(C^{k,\beta }\) domains analogously. In the \(C^{1,\beta }\) case, this implies there is an outward facing unit normal \(\nu _{\partial \Omega }\) given by \(\nu _{\partial \Omega }(x_0) = \nu _{x_0}\) at each \(x_0 \in \partial \Omega .\) This also allows us to construct a defining function \(\rho = \rho _{\Omega } \in C^{1,\beta }(\mathbb R^n)\) with the property that

$$\begin{aligned} \Omega = \{ x \in \mathbb R^n: \rho (x) < 0\},\quad \mathbb R^n \setminus {\overline{\Omega }} = \{x \in \mathbb R^n: \rho (x)>0\}, \end{aligned}$$
(82)

and such that \(\nabla \rho (x) \ne 0\) in \(\partial \Omega ,\) by locally defining \(\rho (x) = \langle (x-x_0), \nu \rangle - \gamma (x-x_0) \) in \(B_{R_0}(x_0)\) and patching using a partition of unity. Note that \(\nabla \rho (x)\) is normal to \(\partial \Omega \) at each \(x \in \partial \Omega ,\) so we have \(\nu _{\partial \Omega }(x) = \frac{\nabla \rho (x)}{|\nabla \rho (x)|}.\) We also define the associated \(C^{1,\beta }\)-constant of \(\Omega \) as

$$\begin{aligned} \left\Vert \Omega \right\Vert _{C^{1,\beta }} = \inf \left\{ \sup _{1 \le j \le N} \left\Vert \nabla \gamma _j\right\Vert _{C^{0,\beta }(T_{x_j} \cap B_{R_j}(x_j))}\right\} , \end{aligned}$$
(83)

where the infimum is taken over collections \(\{\gamma _j,x_j,R_j\}_{j=1}^N\) where \(\{B_{R_j}(x_j)\}\) covers \(\partial \Omega \) and each \(\Omega \cap B_{R_j}(x_j)\) is represented as the graph of the \(C^{1,\beta }\) function \(\gamma _j.\)

The idea is to use this defining function \(\rho \) as a replacement for the affine approximation, considering maps of the form

$$\begin{aligned} a(x) = \xi \, \frac{\rho (x)}{|\nabla \rho (x_0)|}, \end{aligned}$$
(84)

with \(\xi \in \mathbb R^N.\) Since \(\nabla a = \xi \otimes \frac{\nabla \rho (x)}{|\nabla \rho (x_0)|}\) which is close to \(\xi \otimes \nu _{x_0}\) however, taking \(\xi = (\nabla v \cdot \nu _{x_0})_{\Omega _R(x_0)}\) only allows us to control the normal component compared to the full derivative \(\nabla a = (\nabla u)_{B_R(x_0)}\) from the interior case. It turns out this is sufficient however; this is illustrated by the following result, which is an adaptation of an observation of Campos Cordero [7].

Lemma 3.6

Let \(\Omega \subset \mathbb R^n\) be a bounded \(C^{1,\beta }\) domain and let \(p > \frac{3}{2}.\) There is \(R_0 > 0\) and \(C>0\) such that for all \(x_0 \in \partial \Omega \) and \(0< R < R_0,\) for all \(v \in W^{1,p}(\Omega _R(x_0),\mathbb R^N)\) such that \(v = 0\) on \(\partial \Omega \cap B_R(x_0)\) we have

(85)

Proof

Fix \(x_0 \in \partial \Omega ,\) then by translating and rotating we can assume \(x_0=0\) and \(\nu (x_0)=e_n,\) and take \(R_0>0\) small enough so we can write \(\Omega _{R_0}(x_0)\) as the graph of some \(\gamma .\) We have

(86)

where we write \(\nabla _j v = \nabla v \cdot e_j,\) so we need to estimate the tangential derivatives. We proceed analogously to [7, Lemma 5.6] with minor modifications to account for the curved boundary, so letting \(\rho \) be the defining function for \(\Omega \) as above we consider

$$\begin{aligned} {\widetilde{v}}(x) = v(x) - (\nabla _nv)_{\Omega _R} \frac{\rho (x)}{|\nabla \rho (0)|}. \end{aligned}$$
(87)

Note that \({\widetilde{v}}\) still vanishes on \(\partial \Omega \cap B_R,\) so a similar argument to [7] gives

$$\begin{aligned} \int _{\Omega _R} \nabla _i {\widetilde{v}} \,\textrm{d}x = \int _{\Omega \cap \partial B_R} {\widetilde{v}}(x) \frac{x_i}{R} \,\textrm{d}{\mathcal {H}}^{n-1}(x) = \int _{\Omega _R} \nabla _n{\widetilde{v}}(x) \frac{x_i}{(R^2-|x'|^2)^{\frac{1}{2}}} \,\textrm{d}x, \end{aligned}$$
(88)

where the only difference is that \({\widetilde{v}}\) vanishes at \((x',\gamma (x'))\) writing \(x=(x',x_n).\) This can then be estimated using Hölder’s inequality as in [7] to get

(89)

Now using the fact that \(\rho \) is of class \(C^{1,\beta }\) we deduce that

(90)

where we used (89) in the second line. Thus combining with (86) the result follows. \(\square \)

We close this subsection with the proof of Proposition 3.2, which will be a consequence of the following more general result.

Lemma 3.7

Let \(\Omega \) be a Lipschitz domain with \(\left\Vert \Omega \right\Vert _{C^{0,1}} <1.\) Then there is \(R_0>0\) such that for all \(x_0 \in {\overline{\Omega }}\) and \(0<R<R_0,\) we have \(\Omega _R(x_0) = \Omega \cap B_R(x_0)\) is a \(\delta \)-John domain, where \(\delta \) can be chosen to depend on n and \(\left\Vert \Omega \right\Vert _{C^{0,1}}\) only.

Proof

Put \(L:= \left\Vert \Omega \right\Vert _{C^{0,1}} < 1.\) Let \(R_0>0\) such that \(\Omega _R(x_0)\) can be written as the graph of a Lipschitz function \(\gamma \) when \(R<R_0.\) By means of a rigid motion assume that \(x_0 =0,\) \(\nu (x_0) = -e_n\) and \(T_{x_0} = H = \{x \in \mathbb R^n: x_n = 0\}.\) Moreover by rescaling we can assume that \(R=1,\) so we have

$$\begin{aligned} \Omega \cap B = \left\{ x \in B: x_n > \gamma (x')\right\} . \end{aligned}$$
(91)

By assumption we have \(|\nabla \gamma |\le L\) a.e. in \(H \cap B\) and \(\gamma (0)=0\), which implies that \(|\gamma (x')|\le L |x'|.\) Therefore noting \(x_n = \gamma (x')\) on \(\partial \Omega \) we have

$$\begin{aligned} \partial \Omega \cap B \subset \left\{ x \in B: |x'|\ge \frac{|x|}{\sqrt{L^2+1}} \right\} =: S_{L}. \end{aligned}$$
(92)
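To spell out the inclusion (92), which follows directly from the bound \(|\gamma (x')|\le L|x'|\) above: for \(x = (x',x_n) \in \partial \Omega \cap B\) we have \(x_n = \gamma (x'),\) and hence

$$\begin{aligned} |x|^2 = |x'|^2 + |\gamma (x')|^2 \le (1+L^2)|x'|^2, \end{aligned}$$

which rearranges to \(|x'|\ge \frac{|x|}{\sqrt{L^2+1}}.\)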

Moreover \(S_L\) can be seen as the union of all cones

$$\begin{aligned} C(n,\theta _L):= \left\{ x \in \mathbb R^n: |x \cdot n|\ge |x |\cos \theta _L \right\} \end{aligned}$$
(93)

intersected with B for all \(n \in S^{n-2} \times \{0\},\) where \(\cos \theta _L = 1/\sqrt{L^2+1}.\) Note that \(\theta _L < \frac{\pi }{4}\) if and only if \(L < 1.\) We will also let \(S_1\) be as in (92) where L is replaced by 1. Also since \(\Omega \cap B \supset B^+ {\setminus } S_L\) where \(B^+ = \{ x \in B: x_n > 0\},\) we will choose \(y_0 = \frac{1}{2} e_n\) in \(B^+ {\setminus } S_L\) to be our John centre. Since \(B^+ \setminus S_L\) is convex, it is shown by Martio and Sarvas [36, Remark 2.4(c)] that it is a John domain with constant \(\frac{1}{2\sqrt{2}}\), since \(B_{\frac{1}{2\sqrt{2}}}(y_0) \subset B^+ {\setminus } S_1 \subset B^+ {\setminus } S_L \subset B.\)

Now let \(x=(x',x_n) \in \Omega \cap S_L\), noting that \(x' \ne 0\) necessarily. We wish to construct a piecewise linear path from x to \(y_0\) verifying the John domain assumption, as drawn in Fig. 1, which will involve some elementary geometry.

Fig. 1 Construction of the path \(\gamma \)

Let \(\omega = \frac{(x',0)}{|x'|}\) and put \(x_t = x + \frac{t}{|x|} (-x_n \omega + |x'|e_n),\) which is parametrised by arclength. We also let \(\theta _t \in (0,2\pi )\) such that \(x_t = |x_t|\left( \omega \cos \theta _t + e_n \sin \theta _t\right) \); note that \(\theta _0 \in (\theta _L, \frac{\pi }{4})\). We now claim that \(|x_t|\) decreases at least linearly in t provided \(x_t \in S_1,\) so that \({{\,\textrm{dist}\,}}(x_t, \partial B_1) = 1 - |x_t|\) grows at least linearly. To see this, consider the triangle formed by the points \(P = 0,\) \(Q = x\) and \(R = x_t;\) then we have the angles \(\angle RPQ = \pi - (\frac{\pi }{4} + \theta _t)\) and \(\angle PQR = \theta _L + \frac{\pi }{4}\). Then \(|Q - P|= |x|\le 1\), \(|R - P |= |x_t|\), \(|R-Q|= t\), and \(\angle PQR = \frac{\pi }{4} - \theta _0\). By the cosine rule we have

$$\begin{aligned} |x_t|^2= |x|^2 + t^2 - 2 t |x|\cos \left( \frac{\pi }{4} - \theta _0 \right) =: p(t). \end{aligned}$$
(94)

Let \(t_0>0\) be the unique value such that \(\theta _{t_0} = \frac{\pi }{4}\), which is where \(x_t\) exits \(S_1.\) In this case, since \(\angle QRP = \frac{\pi }{2}\) we have \(t_0 = |x|\sin (\frac{\pi }{4}-\theta _0).\) Note also that \(|x_{t_0}|= |x|\cos (\frac{\pi }{4}-\theta _0)\). Therefore for \(t \in (0,t_0)\) we have

$$\begin{aligned} p'(t) = 2 \left( t - |x|\cos \left( \frac{\pi }{4}-\theta _0 \right) \right) \ge 2|x|\left( \cos \left( \frac{\pi }{4}-\theta _0 \right) -t_0\right) = - 2 \sqrt{2} |x|\sin (\theta _0). \end{aligned}$$
(95)

Hence we deduce that

$$\begin{aligned} |x |- |x_t|= -\frac{1}{2} \int _0^t \frac{p'(s)}{\sqrt{p(s)}}\,\textrm{d}s \ge \sqrt{2} \sin (\theta _0) \frac{|x|}{|x_{t_0}|} \ge \delta t:= \frac{\sqrt{2} \sin (\theta _L)}{\cos \left( \frac{\pi }{4}-\theta _L\right) } t, \end{aligned}$$
(96)

so it follows that

$$\begin{aligned} {{\,\textrm{dist}\,}}(x_t,\partial B) = 1 - |x_t|\ge \delta t. \end{aligned}$$
(97)

Also since \(\gamma \) is L-Lipschitz, we have

$$\begin{aligned} \left( x + C\left( e_n,\frac{\pi }{2} - \theta _L\right) \right) \cap B \subset \Omega \cap B. \end{aligned}$$
(98)

Indeed if \(y \in C(e_n,\frac{\pi }{2}-\theta _L)\) then \(y_n > L|y'|\) and hence

$$\begin{aligned} x_n + y_n > \gamma (x') + L |y'|\ge \gamma (x'+y'), \end{aligned}$$
(99)

so \(x + y \in \Omega \) provided \(|x+y|\le 1\). Since \(x_t\) lies in the cone \(x + C(e_n,\frac{\pi }{2})\), some more trigonometry gives

$$\begin{aligned} {{\,\textrm{dist}\,}}(x_t,\partial \Omega ) \ge {{\,\textrm{dist}\,}}\left( x_t, x + \partial C\left( e_n,\frac{\pi }{2}-\theta _L\right) \right) = |x_t - x|\sin \left( \frac{\pi }{4}-\theta _L\right) = \sigma t, \end{aligned}$$
(100)

where \(\sigma = \sin \left( \frac{\pi }{4}-\theta _L \right) .\) Combining the above two estimates we deduce that

$$\begin{aligned} {{\,\textrm{dist}\,}}(x_t, \partial (\Omega \cap B)) \ge \min \{\delta ,\sigma \} t \end{aligned}$$
(101)

for all \(0<t<t_0.\) We can then join \(x_{t_0}\) to the John centre \(y_0\) via a line segment to conclude, which is also how the case \(x \in \Omega \setminus S_L\) is treated. \(\square \)

3.3 Reference estimates up to the boundary

We will also need some reference estimates for linear elliptic systems for the harmonic approximation step. We consider a linear mapping \(\mathbb A: \mathbb R^{Nn} \rightarrow \mathbb R^{Nn}\) which is symmetric in the sense that \(v: \mathbb A w = \mathbb Av: w,\) satisfying the uniform Legendre–Hadamard ellipticity condition

$$\begin{aligned} \lambda |\xi |^2|\eta |^2 \le \mathbb A(\xi \otimes \eta ): (\xi \otimes \eta ) \le \Lambda |\xi |^2|\eta |^2 \end{aligned}$$
(102)

holds for all \(\xi \in \mathbb R^N,\) \(\eta \in \mathbb R^n\) with \(\lambda >0.\) By means of the Fourier transform one can infer that for any \(\Omega \subset \mathbb R^n\) open the estimate

$$\begin{aligned} \int _{\Omega } |\nabla \varphi |^2 \,\textrm{d}x \le \frac{1}{\lambda }\int _{\Omega } \mathbb A \nabla \varphi : \nabla \varphi \,\textrm{d}x \end{aligned}$$
(103)

holds for all \(\varphi \in W^{1,2}_0(\Omega ,\mathbb R^N),\) so the Lax–Milgram lemma shows that the associated operator \(-{{\,\textrm{div}\,}}(\mathbb A \nabla \cdot ): W^{1,2}_0(\Omega ,\mathbb R^N) \rightarrow W^{-1,2}(\Omega ,\mathbb R^N)\) is an isomorphism.
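For the reader's convenience we sketch the Fourier argument behind (103); this is the standard computation, written under the assumed convention \({\widehat{\varphi }}(k) = \int _{\mathbb R^n} \varphi (x) e^{-2\pi i x \cdot k} \,\textrm{d}x\) and first for \(\varphi \in C^{\infty }_c(\Omega ,\mathbb R^N)\) extended by zero to \(\mathbb R^n.\) Since \(\widehat{\nabla \varphi }(k) = 2\pi i\, {\widehat{\varphi }}(k) \otimes k,\) writing \({\widehat{\varphi }}(k) = a + ib\) with \(a,b \in \mathbb R^N,\) the symmetry of \(\mathbb A\) cancels the imaginary cross terms and (102) gives

$$\begin{aligned} \mathbb A({\widehat{\varphi }} \otimes k): (\overline{{\widehat{\varphi }}} \otimes k) = \mathbb A(a \otimes k): (a \otimes k) + \mathbb A(b \otimes k): (b \otimes k) \ge \lambda |{\widehat{\varphi }}|^2 |k|^2. \end{aligned}$$

Applying Plancherel's theorem twice we therefore obtain

$$\begin{aligned} \int _{\Omega } \mathbb A \nabla \varphi : \nabla \varphi \,\textrm{d}x = 4\pi ^2 \int _{\mathbb R^n} \mathbb A({\widehat{\varphi }} \otimes k): (\overline{{\widehat{\varphi }}} \otimes k) \,\textrm{d}k \ge \lambda \int _{\mathbb R^n} 4\pi ^2 |k|^2 |{\widehat{\varphi }}|^2 \,\textrm{d}k = \lambda \int _{\Omega } |\nabla \varphi |^2 \,\textrm{d}x, \end{aligned}$$

and the case of general \(\varphi \in W^{1,2}_0(\Omega ,\mathbb R^N)\) follows by density.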

In the interior case we considered the same setting, but we used uniform and \(W^{2,2}\) estimates which could be found in many sources such as [20]. For boundary regularity we wish to establish analogous estimates for \(\Omega _R(x_0),\) however such domains are merely piecewise \(C^{1,\beta }\) which is too weak to expect estimates in those scales. To circumvent this we will need to replace \(\Omega _R\) by a suitably regular domain following an argument used by Kristensen and Mingione [32], and obtain weakened estimates which will be sufficient for our purposes.

Lemma 3.8

Let \(\Omega \subset \mathbb R^n\) be a bounded \(C^{1,\beta }\) domain and let \(\mathbb A\) be symmetric and uniformly Legendre–Hadamard elliptic as above. Then there is \(R_0>0\) such that for each \(x_0 \in \partial \Omega \) and \(0< R < R_0,\) there exists a \(C^{1,\beta }\) domain \({\widetilde{\Omega }}_R(x_0)={\widetilde{\Omega }}_R\) such that

$$\begin{aligned} \overline{\Omega _{R/2}(x_0)} \subset {\widetilde{\Omega }}_R \subset \Omega _{R}(x_0), \end{aligned}$$
(104)

on which the following solvability results hold.

  1. (i)

    If \(v \in W^{1,2}({\widetilde{\Omega }}_R(x_0))\) such that \(v=0\) on \(\partial \Omega \cap \partial {\widetilde{\Omega }}_R(x_0),\) the unique \(h \in W^{1,2}_v({\widetilde{\Omega }}_R(x_0))\) solving

    (105)

    is of class \(C^{1,\beta }\) in \({\widetilde{\Omega }}_R \cup \left( \partial \Omega \cap \partial {\widetilde{\Omega }}_R(x_0)\right) \) with the associated estimate

    (106)
  2. (ii)

    If \(2 \le p<\infty \) and \(F \in L^p(\Omega ,\mathbb R^{nN}),\) then there is a unique \(u \in W^{1,p}_0(\Omega ,\mathbb R^N)\) solving

    (107)

    which satisfies the estimate

    $$\begin{aligned} \int _{{\widetilde{\Omega }}_R(x_0)} |\nabla u|^p \,\textrm{d}x \le C(n,N,p,\Lambda /\lambda ,\left\Vert \Omega \right\Vert _{C^{1,\beta }}) \int _{{\widetilde{\Omega }}_R(x_0)} |F|^p \,\textrm{d}x. \end{aligned}$$
    (108)

Proof

Fix a smooth domain \(A \subset \mathbb R^n\) such that \({\overline{B}}_{\frac{5}{6}}(0)^+ \subset A \subset B_1(0)^+.\) Using the graph representation above we can construct a diffeomorphism \(\psi :B_{R_0}(x_0) \rightarrow U \subset B_1(0)\) such that \(A \subset U,\) \(\psi (B_{R_0} \cap \Omega ) = U \cap \mathbb R^n_+,\) and such that \(D\psi (x_0)\) is orthogonal. Hence by shrinking \(R_0\) if necessary we can assume that

$$\begin{aligned} B_{\frac{5R}{6R_0}}(0) \subset \psi (B_R(x_0)) \subset B_{\frac{6R}{5R_0}} \end{aligned}$$
(109)

for all \(R \in (0,R_0).\) Hence if we let \({\widetilde{\Omega }}_R = \psi ^{-1}\left( \frac{18R}{25R_0}A\right) \) this satisfies

$$\begin{aligned} \overline{\Omega _{R/2}} \subset \psi ^{-1}\left( {\overline{B}}_{\frac{3R}{5R_0}}(0)^+\right) \subset {\widetilde{\Omega }}_R \subset \psi ^{-1}\left( B_{\frac{18R}{25R_0}}(0)^+\right) \subset \Omega _{R}, \end{aligned}$$
(110)

as claimed. Now if \(\varphi \in W^{1,2}({\widetilde{\Omega }}_R,\mathbb R^N),\) setting \({\widetilde{\varphi }} = \varphi \circ \psi ^{-1}\) we have for \(y = \psi (x)\) that

$$\begin{aligned} -{{\,\textrm{div}\,}}(\mathbb A \nabla \varphi ) = - {{\,\textrm{div}\,}}(\widetilde{\mathbb A} \nabla {\widetilde{\varphi }}) \end{aligned}$$
(111)

where we define

$$\begin{aligned} \widetilde{\mathbb A}(y)v: w = |\det (\nabla \psi (y))|^{-1}\, \mathbb A (\nabla \psi (x)v): (\nabla \psi (x)w) \end{aligned}$$
(112)

for \(y = \psi (x)\) and all \(v,w \in \mathbb R^{Nn}.\) We can check \(\widetilde{\mathbb A}\) is Legendre–Hadamard elliptic and \(\beta \)-Hölder continuous with constants depending on \(n, \lambda , \Lambda \) and \(\left\Vert \Omega \right\Vert _{C^{1,\beta }},\) noting \(\nabla \psi \in C^{0,\beta }\) with bounded inverse. Hence (i) and (ii) follow by analogous estimates on \(A_R:= \frac{18R}{25R_0} A\) applying the classical Schauder and Calderón–Zygmund estimates respectively; see for instance Theorems 10.12, 10.17 in [20] for details. \(\square \)

Remark 3.9

The second estimate (ii) replaces \(W^{2,2}\) estimates by weaker bounds in \(W^{1,p},\) which suffices for our application. We will apply this with \(f \in L^2({\widetilde{\Omega }}_R(x_0),\mathbb R^N)\) by using the Newtonian potential to define

$$\begin{aligned} F(x) = \frac{-1}{n\omega _n}\int _{{\widetilde{\Omega }}_R(x_0)} f(y) \frac{x-y}{|x-y|^{n}} \,\textrm{d}y, \end{aligned}$$
(113)

which satisfies \(-{{\,\textrm{div}\,}}F = f\chi _{{\widetilde{\Omega }}_R(x_0)}\) in \(\mathbb R^n.\) By standard potential estimates (see for instance Lemmas 7.12, 7.14, and Theorem 9.9 in [19]) there is \(C=C(n,p)\) such that

$$\begin{aligned} \left\Vert F\right\Vert _{L^p({\widetilde{\Omega }}_R(x_0))} \le C {\mathcal {L}}^n\left( {\widetilde{\Omega }}_R(x_0)\right) ^{\frac{1}{n} + \frac{1}{p} - \frac{1}{2}} \left\Vert f\right\Vert _{L^2({\widetilde{\Omega }}_R(x_0))}, \end{aligned}$$
(114)

provided \(\frac{1}{2}-\frac{1}{p} \le \frac{1}{n}\) with \(1 \le p<\infty ,\) which puts us in the setting of the above lemma.
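To make the admissible range concrete (a simple unpacking of the condition \(\frac{1}{2}-\frac{1}{p} \le \frac{1}{n},\) restricted to \(p \ge 2\) as in Lemma 3.8(ii)): every \(p \in [2,\infty )\) is admissible when \(n=2,\) while for \(n \ge 3\) the condition reads

$$\begin{aligned} 2 \le p \le \frac{2n}{n-2}, \end{aligned}$$

so for example \(p \le 6\) when \(n=3.\) In the case \(p=2\) the prefactor in (114) is \({\mathcal {L}}^n({\widetilde{\Omega }}_R(x_0))^{\frac{1}{n}},\) which is at most \(\omega _n^{\frac{1}{n}} R\) since \({\widetilde{\Omega }}_R(x_0) \subset B_R(x_0);\) this is the gain of one power of R expected when passing from f to F.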

Finally we conclude by stating a Poincaré inequality we will use extensively later. For the case of the modified domain, this follows by flattening the boundary and rescaling the smooth domain A, whereas in \(\Omega _R\) we can extend by zero to \(B_R(x_0)\) and apply the corresponding inequality there.

Lemma 3.10

(Poincaré inequality) Let \(\Omega \subset \mathbb R^n\) be a bounded \(C^{1,\beta }\) domain and let \(R_0>0,\) \({\widetilde{\Omega }}_R(x_0)\) as in Lemma 3.8 above. Then for all \(x_0 \in \partial \Omega ,\) \(0<R<R_0,\) \(1<p<\infty ,\) for \(u \in W^{1,p}({\widetilde{\Omega }}_R(x_0),\mathbb R^N)\) such that \(u = 0\) on \(\partial \Omega \cap B_R(x_0)\) in the trace sense we have

$$\begin{aligned} R^{\frac{n}{p}-\frac{n}{q} - 1} \left\Vert u\right\Vert _{L^q({\widetilde{\Omega }}_R)} \le C \left\Vert \nabla u\right\Vert _{L^p({\widetilde{\Omega }}_R)} \end{aligned}$$
(115)

for all \(1 \le q < \infty \) such that \(\frac{1}{p} - \frac{1}{q} \le \frac{1}{n},\) with \(C=C(n,p,q,\beta ,\left\Vert \Omega \right\Vert _{C^{1,\beta }})>0.\) Also the same conclusion holds for \(\Omega _R(x_0)\) in place of \({\widetilde{\Omega }}_R(x_0).\)
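The power of R in (115) is the one dictated by scaling, which can be checked as follows; here E denotes a generic measurable set and is introduced only for this illustration. If u is defined on \(E_R = x_0 + RE\) and we set \(v(y) = u(x_0 + Ry)\) for \(y \in E,\) then a change of variables gives

$$\begin{aligned} \left\Vert u\right\Vert _{L^q(E_R)} = R^{\frac{n}{q}} \left\Vert v\right\Vert _{L^q(E)}, \quad \left\Vert \nabla u\right\Vert _{L^p(E_R)} = R^{\frac{n}{p}-1} \left\Vert \nabla v\right\Vert _{L^p(E)}. \end{aligned}$$

Hence \(R^{\frac{n}{p}-\frac{n}{q}-1} \left\Vert u\right\Vert _{L^q(E_R)} = R^{\frac{n}{p}-1}\left\Vert v\right\Vert _{L^q(E)},\) and an estimate of the form (115) on \(E_R\) with constant C is equivalent to \(\left\Vert v\right\Vert _{L^q(E)} \le C \left\Vert \nabla v\right\Vert _{L^p(E)}\) on the rescaled set. The content of the lemma is therefore that this rescaled constant is uniform in \(x_0\) and R.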

4 Regularity up to the boundary for F-extremals

We now use the results from the previous section to prove Theorem 1.3. The framework will be analogous to the interior regularity theory, involving establishing a Caccioppoli-type inequality and a harmonic approximation result.

We will continue to use the notation introduced in Sect. 2.1. Additionally, given a bounded \(C^{1,\beta }\) domain \(\Omega \subset \mathbb R^n,\) we will fix \(R_0>0\) and \(\delta \in (0,1)\) such that \(\Omega _R(x_0)\) is a \(\delta \)-John domain for all \(x_0 \in \partial \Omega ,\) \(0<R<R_0,\) and given \(\rho \) as above we will also assume that we have \({\mathcal {L}}^n(\Omega _R(x_0)) \ge 4^{-n}{\mathcal {L}}^n(B_R(x_0))\) and

(116)

for all \(R<R_0.\) Shrinking \(R_0\) further if necessary, we will moreover assume Proposition 3.2 and Lemmas 3.6, 3.8 and 3.10 from the previous section hold with this choice of \(R_0.\)

4.1 Boundary Caccioppoli-type inequality

Lemma 4.1

(Boundary Caccioppoli-type inequality) Suppose F satisfies Hypotheses 1.1, let \(M\ge 1,\) and suppose \(\Omega \subset \mathbb R^n\) is a bounded \(C^{1,\beta }\) domain for some \(\beta \in (0,1).\) Given \(g \in C^{1,\beta }({\overline{\Omega }},\mathbb R^N),\) there is \(R_0=R_0(n,\Omega )>0\) such that the following holds. Suppose \(x_0 \in \partial \Omega ,\) \(0<R<R_0,\) and \(u \in W^{1,q}_g(\Omega ,\mathbb R^N)\) is F-extremal in \(\Omega _R(x_0)\) such that \(\nabla u \in {{\,\textrm{BMO}\,}}(\Omega _R(x_0),\mathbb R^{Nn}),\) \(\left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(\Omega _R(x_0))}\le 1,\) and \(|(\nabla u)_{\Omega _R}|\le M.\) Then if we define

$$\begin{aligned} a_R(x) = \xi _R\, \frac{\rho (x)}{|\nabla \rho (x_0)|} = \frac{((u-g) \rho )_{\Omega _R}}{(\rho ^2)_{\Omega _R}}\rho (x), \end{aligned}$$
(117)

with \(\rho \) the defining function for \(\Omega \) as in Sect. 3.2, we have the estimate

(118)

where setting \({\widetilde{M}} = C(n,\beta ,\left\Vert \Omega \right\Vert _{C^{1,\beta }},\left\Vert \nabla g\right\Vert _{C^{0,\beta }(\Omega )})M,\) \(\gamma :[0,\infty ) \rightarrow [0,1]\) is a non-decreasing continuous function satisfying \(\gamma (0)=0\) depending on \(\omega _{{\widetilde{M}}}\) and q only, and

$$\begin{aligned} C=C\left( n,N,q,K_{{\widetilde{M}}}/\lambda _{{\widetilde{M}}},\delta ,\left\Vert \Omega \right\Vert _{C^{1,\beta }},R_0,\left[ \nabla g\right] _{C^{0,\beta }(\Omega )}\right) >0. \end{aligned}$$
(119)

The main technical obstruction is that we need a suitable test function \(\phi \) vanishing on \(\partial \Omega \cap B_R(x_0)\) in our coercivity estimates. We will achieve this without flattening the boundary, using ideas from Campos Cordero [6, Chapter 4] and results from Sect. 3.2.

Remark 4.2

Similarly as in the interior case, the choice of \(a_R(x)\) in (117) ensures that

$$\begin{aligned} \xi \mapsto \int _{\Omega _R} \left|u-g- \xi \, \frac{\rho }{|\nabla \rho (x_0)|}\right|^2 \,\textrm{d}x \end{aligned}$$
(120)

is minimised in \(\xi \in \mathbb R^N\) when \(\xi = \xi _R\) from (117), as noted by Kronz [35]. If we set \(\xi = (\nabla (u-g) \cdot \nu (x_0))_{\Omega _R},\) these can be compared through the estimate

(121)

where \(C=C(n,\beta ,\left\Vert \Omega \right\Vert _{C^{1,\beta }})>0.\) This is proved in [35, Lemma 2(ii)], relying on the Poincaré inequality (Lemma 3.10) and (116).
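The minimality property in (120) can be verified by a direct computation, which we include as a short check. Writing \(c = |\nabla \rho (x_0)|\) and \(v = u-g,\) we expand

$$\begin{aligned} \int _{\Omega _R} \left|v - \xi \, \frac{\rho }{c}\right|^2 \,\textrm{d}x = \int _{\Omega _R} |v|^2 \,\textrm{d}x - \frac{2}{c}\, \xi \cdot \int _{\Omega _R} v \rho \,\textrm{d}x + \frac{|\xi |^2}{c^2} \int _{\Omega _R} \rho ^2 \,\textrm{d}x, \end{aligned}$$

which is a strictly convex quadratic function of \(\xi \) since \(\int _{\Omega _R} \rho ^2 \,\textrm{d}x > 0.\) Setting the gradient in \(\xi \) to zero gives the unique minimiser

$$\begin{aligned} \xi = c\, \frac{\int _{\Omega _R} v\rho \,\textrm{d}x}{\int _{\Omega _R} \rho ^2 \,\textrm{d}x} = |\nabla \rho (x_0)|\, \frac{((u-g)\rho )_{\Omega _R}}{(\rho ^2)_{\Omega _R}}, \end{aligned}$$

which agrees with the value of \(\xi _R\) determined by the two expressions for \(a_R\) in (117).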

Proof

Let \(R_0>0\) as in the beginning of this section, and define

$$\begin{aligned} w(x) = u(x) - g(x) - a_R(x), \end{aligned}$$
(122)

noting that \(w = 0\) on \(\partial \Omega \cap B_R(x_0).\) We also fix a cutoff \(\eta \in C^{\infty }_c(B_R(x_0))\) such that \(1_{B_{R/2}(x_0)} \le \eta \le 1_{B_R(x_0)}\) and \(|\nabla \eta |\le \frac{C}{R},\) and consider the shifted functional \({\widetilde{F}}(z) = F_{z_{R}}(z)\) as in (18) where

$$\begin{aligned} z_{R} = \xi _R \otimes \nu _{x_0} + (\nabla g)_{\Omega _R}, \end{aligned}$$
(123)

with \(\xi _R\) as in (117). Using the Poincaré inequality (Lemma 3.10), we can choose \({\widetilde{M}}\) so that

(124)

Now by the strict Legendre–Hadamard condition applied to \(\eta w\) and testing the equation (1) against \(\eta ^2w\) we have

(125)

We can absorb the \(\nabla (\eta w)\) terms using Cauchy-Schwarz and Young’s inequality; for the last term we can use the growth estimate (20) for \({\widetilde{F}}'\) to estimate

(126)

Hence since \(\eta ^2 \le 1\) we deduce that

(127)

where the final term can be omitted if \(q=2.\) For the second term we note that since \(g, \rho \) are \(C^{1,\beta }\) we have

$$\begin{aligned} |z_{R} - \nabla a_R - \nabla g|\le \left|\xi _R \otimes \nu _{x_0}\right|\frac{|\nabla \rho (x)-\nabla \rho (x_0)|}{|\nabla \rho (x_0)|} + |\nabla g - (\nabla g)_{\Omega _R}|\le CM R^{\beta } \end{aligned}$$
(128)

in \(\Omega _R,\) where \(C = C(\left\Vert \Omega \right\Vert _{C^{1,\beta }}, \left[ \nabla g\right] _{C^{0,\beta }})>0.\) For the first term we apply the comparison estimate (22); writing \(\Phi (t) = \omega _{{\widetilde{M}}}(t)(t^2+t^{2(q-1)})\) this gives

(129)

noting that \(\omega _{{\widetilde{M}}}(t)\le 1.\) Now we estimate

$$\begin{aligned} |\nabla u - z_{R}|&\le |\nabla u - (\nabla u)_{\Omega _R}|+ |\xi _R - ((\nabla (u - g)) \cdot \nu _{x_0})_{\Omega _R}|\nonumber \\&\quad + |(\nabla (u - g))_{\Omega _R} - ((\nabla ( u - g)) \cdot \nu _{x_0})_{\Omega _R} \otimes \nu _{x_0}|. \end{aligned}$$
(130)

By Remark 4.2 the second term can be estimated as

$$\begin{aligned}&|\xi _R - (\nabla (u-g)\cdot \nu _{x_0})_{\Omega _R}|\nonumber \\&\quad \le C\left( \int _{\Omega _R} \left|\nabla u - \nabla g - (\nabla (u-g) \cdot \nu _{x_0})_{\Omega _R} \otimes \nu _{x_0}\right|^2 \,\textrm{d}x\right) ^{\frac{1}{2}} + CMR^{\beta } \nonumber \\&\quad \le C |(\nabla (u - g))_{\Omega _R} - ((\nabla ( u - g)) \cdot \nu _{x_0})_{\Omega _R} \otimes \nu _{x_0}|\nonumber \\&\qquad + C\left( \int _{\Omega _R} \left|\nabla u - \nabla g - (\nabla (u-g))_{\Omega _R}\right|^2 \,\textrm{d}x\right) ^{\frac{1}{2}} + CMR^{\beta }, \end{aligned}$$
(131)

and applying Campos Cordero’s trick (Lemma 3.6) followed by the John–Nirenberg estimate (Proposition 3.3) we have

(132)

for \(p \in \{2,q\}.\) Also applying the modular Fefferman–Stein estimate (Corollary 3.5) we can bound

(133)

Now since \(\left[ \nabla g\right] _{{{\,\textrm{BMO}\,}}(\Omega _R)} \le CR^{\beta }\) and \(\Phi (R^{\beta }) \le \left( 1+R_0^{2(q-2)}\right) R^{2\beta },\) we can combine the above using the doubling property of \(\Phi \) to get

(134)

To complete the estimate, note by the John–Nirenberg inequality (Proposition 3.3) that

(135)

and similarly

(136)

Hence putting everything together gives

(137)

from which the result follows taking \(\gamma (t)=\min \{1,\omega _{{\widetilde{M}}}(t)(1+t^{2(q-2)})+t^{2(q-2)}\},\) omitting the \(t^{2(q-2)}\) terms if \(q=2.\) \(\square \)

4.2 Boundary harmonic approximation

Lemma 4.3

(Boundary harmonic approximation) Suppose F satisfies Hypotheses 1.1, let \(M \ge 1,\) and suppose \(\Omega \subset \mathbb R^n\) is a bounded \(C^{1,\beta }\) domain and \(g \in C^{1,\beta }({\overline{\Omega }},\mathbb R^N),\) for some \(\beta \in (0,1).\) Suppose \(x_0 \in \partial \Omega ,\) \(0<R <R_0\) with \(R_0=R_0(n,\Omega )>0\) and \(u \in W^{1,q}_g(\Omega _{R},\mathbb R^N)\) is F-extremal in \(\Omega _R(x_0)\) with \(\nabla u \in {{\,\textrm{BMO}\,}}(\Omega _{R},\mathbb R^{Nn}),\) \(\left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(\Omega _R)} \le 1,\) and \(|(\nabla u)_{\Omega _R}|\le M.\)

Then letting \({\widetilde{\Omega }}_R\) be as in Lemma 3.8, the unique solution \(h \in W^{1,2}({\widetilde{\Omega }}_R,\mathbb R^N)\) to the Dirichlet problem

(138)

with \(a_{R}, z_{R}\) as in (117), (123) respectively satisfies

$$\begin{aligned} \int _{{\widetilde{\Omega }}_R} |\nabla h|^2 \,\textrm{d}x \le C \int _{{\widetilde{\Omega }}_R} |\nabla (u-g-a_{R})|^2 \,\textrm{d}x, \end{aligned}$$
(139)

where \(C=C\left( n,K_{{\widetilde{M}}}/\lambda _{{\widetilde{M}}}\right) >0\) with \({\widetilde{M}} = C\left( n,\beta ,\left\Vert \Omega \right\Vert _{C^{1,\beta }},\left\Vert \nabla g\right\Vert _{C^{0,\beta }(\Omega )}\right) M.\) Moreover we have the remainder estimate

(140)

where \(C=C\left( n,N,q,K_{{\widetilde{M}}}/\lambda _{{\widetilde{M}}},\left\Vert \Omega \right\Vert _{C^{1,\beta }},\left[ \nabla g\right] _{C^{0,\beta }(\Omega )}\right) >0\) and \(\gamma :[0,\infty ) \rightarrow [0,1]\) is a non-decreasing continuous function such that \(\gamma (0)=0,\) depending on n, q and \(\omega _{{\widetilde{M}}}\) only.

Proof

We will assume \(n\ge 3\) so that the Sobolev embedding applies, making modifications similar to those in the interior case if \(n=2.\) Additionally we will use arguments similar to those in the proof of Lemma 4.1, which we will not reproduce in detail, in particular choosing \(R_0, {\widetilde{M}}\) in the same way. As in the interior case we will also replace F with \(\lambda _{{\widetilde{M}}}^{-1}F.\) Letting \({\widetilde{F}} = F_{z_{R}}\) be the shifted functional with \(z_{R}\) as in (123) and setting \(w = u-g-a_{R},\) note that for \(\phi \in W^{1,2}_0({\widetilde{\Omega }}_R,\mathbb R^N)\) we have

(141)

where we used the comparison estimate (22). We now choose \(\phi \) to be the unique solution to the Dirichlet problem

(142)

Since \(w-h \in L^2({\widetilde{\Omega }}_R) \hookrightarrow W^{-1,2^*}({\widetilde{\Omega }}_R)\) by Remark 3.9, by Lemma 3.8(ii) with \(p=2^*\) we obtain the estimate \(\left\Vert \nabla \phi \right\Vert _{L^{2^*}({\widetilde{\Omega }}_R)} \le C \left\Vert w-h\right\Vert _{L^2({\widetilde{\Omega }}_R)}.\) Therefore for this choice of \(\phi \) we get

(143)

where we have used Hölder and Jensen’s inequalities (here \(2_* = \frac{2n}{n+2}\)), and absorbed the term on the right-hand side. Arguing by splitting \(|\nabla u - z_{R}|\) as in (130) from the previous section (proof of Lemma 4.1) we arrive at the estimate

(144)

with \(\gamma (t) = \min \{1,\omega _{{\widetilde{M}}}(t)^{\frac{2}{n}}(1+t^{2(q-2)})\},\) as required. \(\square \)
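The exponent \(\frac{2}{n}\) appearing in \(\gamma \) is consistent with the following sketch of how Hölder and Jensen's inequalities can be combined. This is an illustration under stated assumptions, not a reproduction of (143): we take a measurable set \(U\) of finite positive measure, \(f \in L^2(U),\) and \(\omega \) concave with values in [0, 1]. Applying Hölder with the conjugate exponents \(\frac{n+2}{2}\) and \(\frac{n+2}{n} = \frac{2}{2_*}\) gives

$$\begin{aligned} \left( \frac{1}{|U|}\int _U \omega (|f|)^{2_*} |f|^{2_*} \,\textrm{d}x\right) ^{\frac{2}{2_*}}&\le \left( \frac{1}{|U|}\int _U \omega (|f|)^{n} \,\textrm{d}x\right) ^{\frac{2}{n}} \frac{1}{|U|}\int _U |f|^2 \,\textrm{d}x \nonumber \\&\le \omega \left( \frac{1}{|U|}\int _U |f|\,\textrm{d}x\right) ^{\frac{2}{n}} \frac{1}{|U|}\int _U |f|^2 \,\textrm{d}x, \end{aligned}$$

where we used \(2_* \cdot \frac{n+2}{2} = n,\) the bound \(\omega ^n \le \omega \) (valid since \(0 \le \omega \le 1\)), and Jensen's inequality for the concave function \(\omega \) in the last step.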

4.3 Boundary \(\varepsilon \)-regularity and the controlled case

We now combine the estimates from the previous sections to conclude as in the interior case.

Proof of Theorem 1.3

For \(B_r(x) \subset B_{R_0}(x_0)\) with \(x \in {\overline{\Omega }}\) we consider the excess energy

(145)

so by assumption and Proposition 3.3 there is \(C_1=C_1(n,\delta )>0\) such that \(E(x,r) \le C_1\varepsilon ^2,\) which we can assume is less than 1.

Claim If \(x \in \partial \Omega \) and \(r>0\) so that \(\Omega _r(x) \subset \Omega _R(x_0)\) and \(\sigma \in \left( 0,\frac{1}{4}\right) \) for which

$$\begin{aligned} |(\nabla u)_{\Omega _{2\sigma r}(x)}|, |(\nabla u)_{\Omega _r(x)}|\le 2^{3n+1}M, \end{aligned}$$
(146)

we have

$$\begin{aligned} E(x,\sigma r) \le C \left( \sigma ^{2\beta } + \sigma ^{-(n+2)} \gamma \left( \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(\Omega _r(x))}\right) \right) E(x,r) + CM^{2(q-1)}\sigma ^{-(n+2)}r^{2\beta }, \end{aligned}$$
(147)

where \(\gamma \) is as in Lemmas 4.1 and 4.3 with \(2^{3n+1}M\) in place of M,  and

$$\begin{aligned} C=C\left( n,N,q,K_{{\widetilde{M}}}/\lambda _{{\widetilde{M}}},\delta , \left\Vert \Omega \right\Vert _{C^{1,\beta }},R_0,\left[ \nabla g\right] _{C^{0,\beta }(\Omega )}\right) >0. \end{aligned}$$
(148)

Proof of claim Applying the Caccioppoli-type inequality (Lemma 4.1) we have

(149)

where \(a_{2\sigma r}\) is given by (117) in \(\Omega _{2\sigma r}(x).\) Also by the boundary harmonic approximation (Lemma 4.3) in \(\Omega _{r}(x)\) the unique solution \(h \in W^{1,2}({\widetilde{\Omega }}_{r}(x),\mathbb R^N)\) solving

(150)

satisfies

(151)

noting that \(\Omega _{r/2}(x) \subset {\widetilde{\Omega }}_r(x) \subset \Omega _r(x).\) Now by Remark 4.2 we have

(152)

for all \(\xi \in \mathbb R^N,\) so taking \(\xi = \xi _r + (\nabla h \cdot \nu _{x})_{\Omega _{2\sigma r}(x)}\) we can split

(153)

For the second term we use the Poincaré inequality (Lemma 3.10) and Lemma 3.6 to estimate

(154)

where we have used the bound \(|(\nabla h)_{\Omega _{2\sigma r}(x)}\cdot \nu _x|^2 \le CM^2\sigma ^{-n}.\) Now as h vanishes on \(\partial \Omega \cap \partial \Omega _r(x),\) using (106) from Lemma 3.8(ii) we have the estimate

(155)

where the last line is obtained by arguing as in the proof of Lemma 4.1. Hence it follows that

(156)

so the claim follows by putting everything together.

We now argue analogously as in the interior case; note for \(x \in \partial \Omega \cap B_{R/2}(x_0)\) we have \(|(\nabla u)_{\Omega _{R/2}(x)}|\le 2^{3n}M,\) and so \(|(\nabla u)_{\Omega _{\sigma R/2}(x)}|\le 2^{3n}M+C_1\sigma ^{-n}\varepsilon \le 2^{3n+1}M\) for \(\varepsilon >0\) sufficiently small. Hence applying the claim gives

$$\begin{aligned} E(x,\sigma R/2) \le C\left( \sigma ^{2\beta } + \sigma ^{-(n+2)}\gamma (\varepsilon )\right) E(x,R/2) + CM^{2(q-1)}\sigma ^{-(n+2)}{\widetilde{R}}_0^{2(\beta -\alpha )}R^{2\alpha }. \end{aligned}$$
(157)

We choose \(\sigma \in \left( 0,\frac{1}{4}\right) \) such that \(C\sigma ^{2\beta } \le \frac{1}{4} \sigma ^{2\alpha },\) and \(\varepsilon >0\) such that \(C\sigma ^{-(n+2)}\gamma (\varepsilon ) \le \frac{1}{4} \sigma ^{2\alpha }.\) We then choose \({\widetilde{R}}_0>0\) such that \(CM^{2(q-1)}\sigma ^{-(n+2)}{\widetilde{R}}_0^{2(\beta -\alpha )} \le \kappa \sigma ^{2\alpha }\) for \(0<\kappa <1\) to be chosen to get

$$\begin{aligned} E(x,\sigma R/2) \le \frac{1}{2} \sigma ^{2\alpha } E(x,R/2) + \kappa \left( \sigma R\right) ^{2\alpha }. \end{aligned}$$
(158)

Further shrinking \(\varepsilon >0\) if necessary and taking \(\kappa >0\) small enough so

$$\begin{aligned} \sigma ^{-(n+2)}(C_1\varepsilon +\kappa ) \sum _j \sigma ^{\alpha j} \le 3^nM, \end{aligned}$$
(159)

we can iteratively argue that for all \(k \ge 0,\)

$$\begin{aligned} |(\nabla u)_{B_{\sigma ^kR/2}(x)}|&\le 2^{3n+1}M, \end{aligned}$$
(160)
$$\begin{aligned} E(x,\sigma ^kR/2)&\le 2^{-k}\sigma ^{2\alpha k}E(x,R/2) + (\sigma ^kR)^{2\alpha }. \end{aligned}$$
(161)

Hence it follows that \(E(x,r) \le Cr^{2\alpha }\) for all \(r \in (0,R/2).\)
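The inductive step behind (161) can be sketched as follows; this is our own bookkeeping, assuming that (158) is available at every scale \(\sigma ^kR\) (which is what the bound (160) is used for) and that \(\kappa \le \frac{1}{2}.\) Writing \(E_k = E(x,\sigma ^kR/2),\) which is shorthand introduced only here, (158) at scale \(\sigma ^kR\) reads \(E_{k+1} \le \frac{1}{2}\sigma ^{2\alpha }E_k + \kappa (\sigma ^{k+1}R)^{2\alpha }.\) If (161) holds for some \(k \ge 0,\) then

$$\begin{aligned} E_{k+1}&\le \frac{1}{2}\sigma ^{2\alpha }\left( 2^{-k}\sigma ^{2\alpha k}E_0 + (\sigma ^kR)^{2\alpha }\right) + \kappa (\sigma ^{k+1}R)^{2\alpha } \nonumber \\&= 2^{-(k+1)}\sigma ^{2\alpha (k+1)}E_0 + \left( \frac{1}{2}+\kappa \right) (\sigma ^{k+1}R)^{2\alpha } \nonumber \\&\le 2^{-(k+1)}\sigma ^{2\alpha (k+1)}E_0 + (\sigma ^{k+1}R)^{2\alpha }, \end{aligned}$$

which is (161) with \(k+1\) in place of k. The case \(k=0\) is trivial, and the passage from the discrete radii \(\sigma ^kR/2\) to all \(r \in (0,R/2)\) costs only a factor depending on \(\sigma \) and n.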

By the interior case we also have \(E(x,r) \le Cr^{2\alpha }\) when \(B(x,r) \subset \Omega _{R}(x_0)\) with \(x \in B_{R/2}.\) We can extend this to all \(x \in \Omega _{R/2}(x_0)\) and \(0< r < R/2\) by a covering argument (adjusting constants as necessary), so by the Campanato–Meyers characterisation we get u is \(C^{1,\alpha }\) in \({\overline{\Omega }}_{R/2}(x_0),\) as required. \(\square \)

We now turn to the proof of Theorem 1.8. The key point is the following lemma, which asserts that we obtain estimates analogous to those established in Sect. 2.1, with a precise dependence on \(|w|\le M\).

Lemma 4.4

Suppose F satisfies Hypotheses 1.7 for some \(p \ge 2.\) Then there is \(K>0\) such that the shifted integrand \(F_w\) defined as in (18) satisfies

$$\begin{aligned} |F_w(z)|&\le K(1+|w|)^{p-2}( |z|^2 + |z|^p), \end{aligned}$$
(162)
$$\begin{aligned} |F'_w(z)|&\le K(1+|w|)^{p-2}( |z|+ |z|^{p-1}), \end{aligned}$$
(163)
$$\begin{aligned} |F''_w(0)|&\le K(1+|w|)^{p-2}, \end{aligned}$$
(164)

and

$$\begin{aligned} |F''_w(0)z - F'_w(z)|\le K(1+|w|)^{p-2}\omega (|z|)(|z|+|z|^{p-1}). \end{aligned}$$
(165)

for all \(z,w \in \mathbb R^{Nn},\) with \(\omega :[0,\infty ) \rightarrow [0,1]\) a non-decreasing, continuous, and concave function such that \(\omega (0)=0.\)

Proof

Quantifying (H1), we have that \(F''(z)/(1+|z|)^{p-2}\) is bounded by K, and we let \(\omega \) denote the associated modulus of continuity. From this (164) immediately follows, as do (162) and (163) by noting that

$$\begin{aligned} |F_w(z) |\le \left\{ \begin{array}{ll} CK (1+|w|)^{p-2} |z|^2 &{} \quad \text {if}\;|z|\le 1, \\ \Lambda (1 + |z|)^p &{} \quad \text {if}\;|z|> 1, \end{array} \right. \end{aligned}$$
(166)

and similarly for \(F_w'(z).\) Also if \(|z|\le 1\) we have

$$\begin{aligned} |F''(w+z) - F''(w)|&\le (1+|w+z|)^{p-2} \left|\frac{F''(w+z)}{(1+|w+z|)^{p-2}} - \frac{F''(w)}{(1+|w|)^{p-2}}\right|\nonumber \\&\quad + \frac{|F''(w)|}{(1+|w|)^{p-2}} \left|(1+|w|)^{p-2}- (1+|w+z|)^{p-2} \right|\nonumber \\&\le (2+ |w|)^{p-2} K \omega (|z|) + CK (1 + |w|)^{p-2} |z|, \end{aligned}$$
(167)

where the second term is estimated by distinguishing between the cases \(p \in [2,3]\) and \(p > 3\). Hence taking \({\widetilde{\omega }}(t) = \min \{1, \omega (t) + t\}\) we deduce that

$$\begin{aligned} |F_w''(0)z - F_w'(z)|\le C K(1+|w|)^{p-2} {\widetilde{\omega }}(|z|) |z|\end{aligned}$$
(168)

for \(|z |\le 1,\) and when \(|z|\ge 1\) we use (163), (164) to estimate

$$\begin{aligned} |F_w''(0)z - F_w'(z)|\le C K(1+|w|)^{p-2} ( |z|+ |z|^{p-1}), \end{aligned}$$
(169)

so combining these (165) follows, replacing \(CK, {\widetilde{\omega }}\) by \(K, \omega \) respectively. \(\square \)
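The passage from the second-derivative bound (167) to the first-order estimate (168) can be sketched via the fundamental theorem of calculus. Here we assume that the shifted integrand in (18) has the form \(F_w(z) = F(w+z) - F(w) - F'(w)z,\) so that \(F_w'(z) = F'(w+z) - F'(w)\) and \(F_w''(0) = F''(w).\) Then for \(|z|\le 1\) we have

$$\begin{aligned} F_w'(z) - F_w''(0)z&= \int _0^1 \left( F''(w+tz) - F''(w)\right) z \,\textrm{d}t, \nonumber \\ |F_w'(z) - F_w''(0)z|&\le \sup _{t \in [0,1]} |F''(w+tz) - F''(w)|\,|z|\le CK(1+|w|)^{p-2}\,{\widetilde{\omega }}(|z|)\,|z|, \end{aligned}$$

where the last step uses the bound on \(|F''(w+tz)-F''(w)|\) at the points tz, the monotonicity of \(\omega ,\) and \((2+|w|)^{p-2} \le 2^{p-2}(1+|w|)^{p-2}.\)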

Proof of Theorem 1.8

Owing to Lemma 4.4, the constants \(K_M, \lambda _M\) from Sect. 2.1 can be chosen so that \(K_M / \lambda _M\) is independent of \(M \ge |z_0|.\) Similarly, we have the modulus of continuity \(\omega = \omega _M\) is also independent of M. Hence we claim the following excess decay estimate

$$\begin{aligned} E(x,\sigma r)&\le C\left( \sigma ^{2\beta }+ \sigma ^{-(n+2)} \gamma \left( \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(\Omega _r(x))}\right) \right) E(x,r) \nonumber \\&\quad + C(1+|(\nabla u)_{\Omega _{2\sigma r}}|+ |(\nabla u)_{\Omega _{r}}|)^{2}\sigma ^{-(n+2)}r^{2\beta } \end{aligned}$$
(170)

holds for all \(x \in {\overline{\Omega }},\) \(R>0\) such that either \(B_R(x) \subset \Omega \) or \(x \in \partial \Omega \) and \(0<R<R_0\) (with \(R_0=R_0(n,\Omega )>0\)). Indeed this follows from the excess decay estimates (48), (147) from the proofs of Theorems 1.2 and 1.3 respectively. Letting \(M>0\) be such that \(|(\nabla u)_{\Omega _{2\sigma r}}|+ |(\nabla u)_{\Omega _{r}}|\le CM,\) in the above estimates C and \(\gamma \) depend on M only through \(K_M/\lambda _M\) and \(\omega _M\), and hence under our assumptions they are independent of M. Note that in the interior case the second term can be omitted.

Fix \(\varepsilon >0\) to be determined. Then there is \(0<R<\frac{R_0}{2}\) for which there exists a finite covering of \(\Omega \) by balls \(\{B_R(x_j)\}\) where either \(B_R(x_j) \subset \Omega \) or \(x_j \in \partial \Omega ,\) and \(\left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(\Omega _{2R}(x_j))} \le 2\varepsilon \le 1\) for each j. Let \(M>0\) be such that \(|(\nabla u)_{\Omega _{2R}(x_j)}|\le M\) for all j; then observe that for all \(x \in {\overline{\Omega }}\) and \(0<r<R\) we have \(|(\nabla u)_{\Omega _r(x)}|\le C(n)M\left( 1+ \log (R/r)\right) .\) Hence the excess decay estimate becomes

$$\begin{aligned} E(x,\sigma r) \le C\left( \sigma ^{2\beta } + \sigma ^{-(n+2)}\gamma (2\varepsilon )\right) E(x,r) + CM^2 \sigma ^{-(2n+2)} r^{2\beta } \left( 1+\log (R/r)\right) \qquad \end{aligned}$$
(171)

whenever \(0<r<R,\) and modifying constants this holds for all \(x \in {\overline{\Omega }}.\)
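The logarithmic bound on the averages \((\nabla u)_{\Omega _r(x)}\) used above follows from a standard telescoping argument, which we sketch here under the assumption that the sets \(\Omega _s(x)\) satisfy the doubling condition \(|\Omega _s(x)|\le C(n)|\Omega _{s/2}(x)|\) for \(0<s\le R.\) Writing \(f = \nabla u\) and suppressing the centre x, for each such s we have

$$\begin{aligned} |(f)_{\Omega _{s/2}} - (f)_{\Omega _s}|\le \frac{1}{|\Omega _{s/2}|}\int _{\Omega _{s/2}} |f - (f)_{\Omega _s}|\,\textrm{d}x \le \frac{|\Omega _s|}{|\Omega _{s/2}|}\,\frac{1}{|\Omega _s|}\int _{\Omega _s} |f-(f)_{\Omega _s}|\,\textrm{d}x \le C(n)\left[ f\right] _{{{\,\textrm{BMO}\,}}(\Omega _R)}. \end{aligned}$$

Choosing the integer \(m \ge 0\) with \(2^{-m-1}R < r \le 2^{-m}R,\) summing over \(s = 2^{-j}R\) for \(0 \le j \le m-1\) and comparing \(\Omega _r\) with \(\Omega _{2^{-m}R}\) in the same way gives

$$\begin{aligned} |(f)_{\Omega _r}|\le |(f)_{\Omega _R}|+ C(n)(m+1)\left[ f\right] _{{{\,\textrm{BMO}\,}}(\Omega _R)} \le |(f)_{\Omega _R}|+ C(n)\left( 1+\log _2(R/r)\right) \left[ f\right] _{{{\,\textrm{BMO}\,}}(\Omega _R)}. \end{aligned}$$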

Now choose \(\sigma \in (0,\frac{1}{4})\) such that \(C\sigma ^{2\beta } \le \frac{1}{4} \sigma ^{2\alpha },\) and \(\varepsilon >0\) such that \(C\sigma ^{-(n+2)}\gamma (2\varepsilon ) \le \frac{1}{4} \sigma ^{2\alpha }.\) Then choose \(0<r_0<R\) such that \(CM^2\sigma ^{-(2n+2)}r_0^{2(\beta -\alpha )} \left( 1+\log (R/r_0)\right) \le \sigma ^{2\alpha }.\) This gives

$$\begin{aligned} E(x,\sigma r) \le \frac{1}{2}\sigma ^{2\alpha }E(x,r) + (\sigma r)^{2\alpha }, \end{aligned}$$
(172)

from which the result follows by iteration as in the proof of Theorem 1.3. \(\square \)

5 Extensions

Up until now we have confined our discussion to the setting of autonomous integrands; however, the framework we developed extends to more general elliptic systems and higher order equations. Rather than state the most general case possible, we will aim to highlight the changes necessary to adapt our arguments to these more general situations.

5.1 Quasilinear elliptic systems

While our motivation for this investigation arose from studying the behaviour of extremals, it turns out our arguments do not make use of the variational structure of the equation. We will illustrate this by considering general Legendre–Hadamard elliptic systems, and also show how lower order terms can be handled.

More precisely we consider weak solutions to the equation

$$\begin{aligned} -{{\,\textrm{div}\,}}A(x,u,\nabla u) + B(x,u,\nabla u) = 0 \end{aligned}$$
(173)

in \(\Omega ,\) subject to the following conditions.

Hypotheses 5.1

Let \(n \ge 2,\) \(N\ge 1,\) \(\beta \in (0,1),\) \(q\ge 2\) and \(\Omega \subset \mathbb R^n\) a bounded \(C^{1,\beta }\) domain. We consider Carathéodory functions

$$\begin{aligned} A :{\overline{\Omega }} \times \mathbb R^N \times \mathbb R^{Nn}&\rightarrow \mathbb R^{Nn}, \end{aligned}$$
(174)
$$\begin{aligned} B :{\overline{\Omega }} \times \mathbb R^N \times \mathbb R^{Nn}&\rightarrow \mathbb R^{N}, \end{aligned}$$
(175)

satisfying the following (we use \(D_u, D_z\) to denote partial derivatives in u, z respectively).

  1. (A1)

    For all \((x,u,z) \in {\overline{\Omega }} \times \mathbb R^N \times \mathbb R^{Nn}\) we have

    $$\begin{aligned} |A(x,u,z)|+ |B(x,u,z)|\le K(1+|z|^{q-1}). \end{aligned}$$
  2. (A2)

    The map \(z \mapsto A(x,u,z)\) is continuously differentiable for each (x, u), and for all \(M>0\) there are \(\Lambda _M>0\) and a continuous, non-decreasing concave function \(\omega _M :[0,\infty ) \rightarrow [0,1]\) satisfying \(\omega _M(0)=0\) such that

    $$\begin{aligned} |D_zA(x,u,z_1) - D_zA(x,u,z_2)|\le \Lambda _M \omega _M(|z_1-z_2|) \end{aligned}$$

    for all \(x \in {\overline{\Omega }},\) \(|u|\le M\) and \(|z_1|,|z_2|\le M+1.\)

  3. (A3)

    For all \(M>0,\) for \(x \in {\overline{\Omega }}\) and \(|u|,|z|\le M\) we have the strong Legendre–Hadamard ellipticity condition

    $$\begin{aligned} D_zA(x,u,z)(\xi \otimes \eta ): (\xi \otimes \eta ) \ge \lambda _M |\xi |^2 |\eta |^2 \end{aligned}$$

    for all \(\xi \in \mathbb R^N\) and \(\eta \in \mathbb R^n.\)

  4. (A4)

    For all \(x_1,x_2 \in {\overline{\Omega }},\) \(u_1,u_2 \in \mathbb R^N\) and \(z \in \mathbb R^{Nn}\) we have

    $$\begin{aligned} |A(x_1,u_1,z) - A(x_2,u_2,z)|\le K(1+|z|^{q-1}) \varrho _{\beta }(|x_1-x_2|+ |u_1-u_2|), \end{aligned}$$

    where \(\varrho _{\beta }(t) = \min \{1,t^{\beta }\}.\)

Remark 5.2

A special case of the above is the Euler–Lagrange system associated to the non-autonomous integrand \(F=F(x,u,z).\) Here the Euler–Lagrange system reads

$$\begin{aligned} - {{\,\textrm{div}\,}}D_zF(x,u,\nabla u) + D_uF(x,u,\nabla u) = 0, \end{aligned}$$
(176)

so we need F to be \(C^2\) in z and \(C^1\) in x,  such that Hypotheses 5.1 are satisfied with \(A(x,u,z)=D_zF(x,u,z)\) and \(B(x,u,z)=D_uF(x,u,z).\)

Theorem 5.3

(\({{\,\textrm{BMO}\,}}\) \(\varepsilon \)-regularity theorem for elliptic systems) Suppose \(\Omega , A, B\) satisfy Hypotheses 5.1 and suppose \(u \in W^{1,q}_g(\Omega ,\mathbb R^N)\) solves (173) with \(g \in C^{1,\beta }({\overline{\Omega }},\mathbb R^N).\) Then for each \(\alpha \in (0,\beta )\) and \(M>0\) there are \(\varepsilon >0\) and \({\widetilde{R}}_0>0\) such that if \(x_0 \in {\overline{\Omega }}\) and \(R \in (0,{\widetilde{R}}_0)\) are such that \(|(\nabla u)_{\Omega _R(x_0)}|\le M\) and

$$\begin{aligned} \left[ \nabla u\right] _{{{\,\textrm{BMO}\,}}(\Omega _R(x_0))} \le \varepsilon , \end{aligned}$$
(177)

then u is \(C^{1,\alpha }\) in \(\overline{\Omega _{R/2}(x_0)}.\)

Step 0: Reduction and linearisation Our strategy will be similar to before; we fix \(x_0 \in {\overline{\Omega }}\) and \(R>0\) such that either \(B_R(x_0) \subset \Omega ,\) or \(x_0 \in \partial \Omega \) and \(0<R<R_0\) with \(R_0>0\) as in the start of Sect. 4. We will focus our attention on the boundary case, as the interior case is similar but simpler. We will also fix \(M>0\) such that \(|(\nabla u)_{\Omega _R(x_0)}|\le M\).

We first observe that we can suppress the u-dependence; since \(\nabla u \in {{\,\textrm{BMO}\,}}(\Omega ,\mathbb R^{Nn})\) we can use the John–Nirenberg and Sobolev inequalities to obtain \(u \in C^{0,\chi }({\overline{\Omega }},\mathbb R^N)\) for all \(\chi \in (0,1).\) Then fixing any \({\widetilde{\beta }} \in (\alpha ,\beta )\) and taking \(\chi = {\widetilde{\beta }}/\beta ,\) we see that \(x \mapsto A(x,u(x),z)\) and \(x \mapsto B(x,u(x),z)\) are \({\widetilde{\beta }}\)-Hölder continuous in \({\overline{\Omega }}.\) Hence changing the constant K in (A4) (depending on n, \(\Omega ,\) M) we can assume A, B are independent of u.
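For A, the Hölder continuity claimed here can be checked directly from (A4); the following is a sketch in which \(\left[ u\right] _{C^{0,\chi }}\) denotes the Hölder seminorm of u on \({\overline{\Omega }}\) and \(d = {{\,\textrm{diam}\,}}(\Omega )\) is notation introduced only here. Since \(\varrho _{\beta }\) is non-decreasing with \(\varrho _{\beta }(t)\le t^{\beta }\) and \(|x_1-x_2|\le d^{1-\chi }|x_1-x_2|^{\chi },\) for all \(x_1,x_2 \in {\overline{\Omega }}\) we have

$$\begin{aligned} |A(x_1,u(x_1),z) - A(x_2,u(x_2),z)|&\le K(1+|z|^{q-1})\,\varrho _{\beta }\left( |x_1-x_2|+ \left[ u\right] _{C^{0,\chi }}|x_1-x_2|^{\chi }\right) \nonumber \\&\le K\left( d^{1-\chi } + \left[ u\right] _{C^{0,\chi }}\right) ^{\beta }(1+|z|^{q-1})\,|x_1-x_2|^{\chi \beta }, \end{aligned}$$

and \(\chi \beta = {\widetilde{\beta }}\) by the choice of \(\chi .\)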

We then consider the linearisation

$$\begin{aligned} {\widetilde{A}}(z) = A(x_0,z+z_0)-A(x_0,z_0), \end{aligned}$$
(178)

which satisfies the growth estimates

$$\begin{aligned} |{\widetilde{A}}(z)|&\le K_M (|z|+ |z|^{q-1})\end{aligned}$$
(179)
$$\begin{aligned} |{\widetilde{A}}'(0)|&\le K_M \end{aligned}$$
(180)
$$\begin{aligned} |{\widetilde{A}}'(0)z - {\widetilde{A}}(z)|&\le K_M\,\omega _{{\widetilde{M}}}(|z|)(|z|+|z|^{q-1}) \end{aligned}$$
(181)

for all \(z \in \mathbb R^{Nn}\), along with the coercivity estimate

(182)

for all \(\phi \in W^{1,2}_0(\Omega ,\mathbb R^N).\)

From here one can proceed as in the autonomous case detailed in Sects. 2 and 4, replacing \({\widetilde{F}}'\) with \({\widetilde{A}}.\) We will sketch how the details can be modified; the only difference is that we obtain extra terms arising from the x-dependence and the presence of the lower order term B.

Step 1: Caccioppoli inequality We claim that

(183)

with \(\gamma (t) = \min \{1, \omega _{{\widetilde{M}}}(t)(1+t^{2(q-2)}) + t^{2(q-2)}\}\), omitting the \(t^{2(q-2)}\) terms if \(q =2\). To show this, as before we will fix a cutoff \(\eta \in C^{\infty }_c(B_R)\) satisfying \(1_{B_{R/2}} \le \eta \le 1_{B_R},\) \(|\nabla \eta |\le \frac{C}{R}.\) We take \(a_R\) as in (117) and set \(z_R = \nabla a_R(x_0) + (\nabla g)_{\Omega _R}\). We then consider the linearisation \({\widetilde{A}}(z)\) with this choice of \(z_R\), and also put \(w=u-g-a_R.\) By the Legendre–Hadamard condition (182) we have

(184)

and since u weakly solves (173) we have

(185)

so combining these estimates we obtain

(186)

We can argue exactly as in the autonomous case (proof of Lemma 4.1) to estimate the terms \(I, I\!I, I\!V\) as before using (179), (180), (181). For the remaining terms note that

(187)

where we have used the fact that and (A4) in the second line, and the last line follows from similar bounds given in the proof of Lemma 4.1. Finally for the last term we can estimate

(188)

where we have used the Poincaré inequality (Lemma 3.10) in the last line, which allows us to absorb the \(\nabla (\eta w)\) term. Hence the result follows by putting everything together.

Step 2: Harmonic approximation We now introduce the harmonic approximation which solves

(189)

along with the dual problem

(190)

which lies in \(W^{1,2^*}_0({\widetilde{\Omega }}_R,\mathbb R^N).\) Using \(\phi \) as a test function we obtain

(191)

The first two terms can be estimated as in Lemma 4.3, and for the latter two terms we have (making suitable modifications if \(n=2\)),

(192)
(193)

which can be controlled similarly as the previous step using along with the Poincaré inequality (Lemma 3.10) with \(\phi \) for the second term. Therefore we obtain the remainder estimate

(194)

where \(\gamma (t) = \min \{1,\omega _{{\widetilde{M}}}(t)^{\frac{2}{n}}(1+t^{2(q-2)})\}.\)

Step 3: Excess decay and conclusion Now we can combine the above two estimates to deduce decay estimates for the excess energy (145). Since the estimates (183) and (194) are identical to the estimates established in Lemmas 4.1, 4.3, we can argue exactly as in Sect. 4.3 to conclude. Thus we have established Theorem 5.3.

5.2 Higher order integrands

We will also outline how analogous results can be obtained for kth order problems. For this fix \(k \ge 1,\) and let \(\mathbb M_k = {{\,\textrm{Sym}\,}}_k(\mathbb R^n,\mathbb R^N)\) denote the space of symmetric k-linear maps \((\mathbb R^{n})^k \rightarrow \mathbb R^N.\) If \(\xi \in \mathbb R^N\) and \(\eta \in \mathbb R^n,\) we write \(\eta ^k = \eta \, \otimes \cdots \otimes \, \eta \) to denote the k-fold tensor product and identify elements \(\xi \otimes \eta ^{k} \in \mathbb M_k\) to send \((x_1,\dots ,x_k) \mapsto \xi \sum _{|\alpha |=k} x^{\alpha }\eta ^{\alpha }.\) Similarly to the case \(k=1,\) for \(z,w \in \mathbb M_k\) we write \(z:w = \sum _{|\alpha |=k} z(e^{\alpha }).w(e^{\alpha }),\) where we take tensor powers of the standard orthonormal basis \(\{e_i\}\) for \(\mathbb R^n.\) This defines an inner product and hence an associated norm \(|\cdot |\) on \(\mathbb M_k.\)

We will consider extremals of the functional

$$\begin{aligned} {\mathcal {F}}(w) = \int _{\Omega } F(\nabla ^kw(x)) \,\textrm{d}x, \end{aligned}$$
(195)

where \(F :\mathbb M_k \rightarrow \mathbb R\) and \(\nabla ^ku\) denotes the kth order partial derivatives of u. These satisfy the Euler–Lagrange equation

$$\begin{aligned} (-1)^k \nabla ^k: F'(\nabla ^ku) = 0 \end{aligned}$$
(196)

weakly in \(\Omega \) in the sense that

$$\begin{aligned} \int _{\Omega } F'(\nabla ^ku): \nabla ^k \varphi \,\textrm{d}x = 0 \end{aligned}$$
(197)

for all \(\varphi \in C^{\infty }_c(\Omega ,\mathbb R^N).\) The minimising case has been studied for instance in [18, 23, 34], and also by the author in [25] where similar arguments are employed to what is considered below.
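The weak formulation (197) is the first variation of (195). As a sketch, assume differentiation under the integral sign is justified, for instance by a growth bound of the form \(|F'(z)|\le C(1+|z|)^{q-1}\) together with \(\nabla ^ku \in L^q\) (this bound is an assumption made for the illustration). Then for \(\varphi \in C^{\infty }_c(\Omega ,\mathbb R^N)\) we have

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\bigg |_{t=0} {\mathcal {F}}(u+t\varphi ) = \int _{\Omega } \frac{\textrm{d}}{\textrm{d}t}\bigg |_{t=0} F(\nabla ^ku + t\nabla ^k\varphi ) \,\textrm{d}x = \int _{\Omega } F'(\nabla ^ku): \nabla ^k\varphi \,\textrm{d}x, \end{aligned}$$

so critical points of \({\mathcal {F}}\) satisfy (197). Formally integrating by parts k times moves the derivatives onto \(F'(\nabla ^ku)\) at the cost of a factor \((-1)^k,\) which gives the strong form (196).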

Hypotheses 5.4

For \(n \ge 2,\) \(N,k\ge 1,\) let \(F :\mathbb M_k \rightarrow \mathbb R\) be a \(C^2\) integrand satisfying the natural growth condition

$$\begin{aligned} |F(z)|\le K(1+|z|)^q \end{aligned}$$
(198)

for all \(z \in \mathbb M_k\) with \(q \ge 2,\) and the strict Legendre–Hadamard condition

$$\begin{aligned} F''(z_0)(\xi \otimes \eta ^k): (\xi \otimes \eta ^k) \ge 0 \end{aligned}$$
(199)

for all \(z_0\) and all \(\xi \in \mathbb R^N,\) \(\eta \in \mathbb R^n,\) with equality if and only if \(\xi \otimes \eta ^k = 0.\)
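One way to see how the pointwise condition (199) yields a uniform ellipticity constant on bounded sets is the following compactness sketch, included for illustration. Since F is \(C^2,\) the map \((z_0,\xi ,\eta ) \mapsto F''(z_0)(\xi \otimes \eta ^k): (\xi \otimes \eta ^k)\) is continuous and, by (199), strictly positive on the compact set \(\{|z_0|\le M\} \times \{|\xi |=1\} \times \{|\eta |=1\}.\) Hence for each \(M>0\) the quantity

$$\begin{aligned} \lambda _M = \min \left\{ F''(z_0)(\xi \otimes \eta ^k): (\xi \otimes \eta ^k) \,:\, |z_0|\le M,\ |\xi |= 1,\ |\eta |= 1 \right\} \end{aligned}$$

is strictly positive, and since the quadratic form is homogeneous of degree 2 in \(\xi \) and degree 2k in \(\eta \) we obtain

$$\begin{aligned} F''(z_0)(\xi \otimes \eta ^k): (\xi \otimes \eta ^k) \ge \lambda _M |\xi |^2 |\eta |^{2k} \end{aligned}$$

for all \(|z_0|\le M,\) \(\xi \in \mathbb R^N\) and \(\eta \in \mathbb R^n.\)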

Theorem 5.5

(Higher order \({{\,\textrm{BMO}\,}}\) \(\varepsilon \)-regularity theorem) Suppose F satisfies Hypotheses 5.4, \(\Omega \) is a bounded \(C^{1,\beta }\) domain for some \(\beta \in (0,1),\) and \(g \in C^{k,\beta }({\overline{\Omega }},\mathbb R^N).\) Then for each \(\alpha \in (0,\beta )\) and \(M>0,\) there are \(\varepsilon >0\) and \({\widetilde{R}}_0>0\) such that if \(x_0 \in {\overline{\Omega }},\) \(0<R<{\widetilde{R}}_0,\) and \(u \in W^{k,q}_g(\Omega ,\mathbb R^N)\) is F-extremal in \(\Omega _R(x_0)\) with \(|(\nabla ^k u)_{\Omega _R(x_0)}|\le M\) and

$$\begin{aligned} \left[ \nabla ^ku\right] _{{{\,\textrm{BMO}\,}}(\Omega _R(x_0))} \le \varepsilon , \end{aligned}$$
(200)

we have u is \(C^{k,\alpha }\) in \(\overline{\Omega _{R/2}(x_0)}.\)

As in Sect. 2.1, for each \(M>0\) there are \(K_M, \lambda _M > 0\) and a non-decreasing continuous and concave function \(\omega _M :[0,\infty ) \rightarrow [0,1]\) satisfying \(\omega _M(0)=0\) for which the following holds. For \(z_0 \in \mathbb M_k\) such that \(|z_0|\le M\) we define

$$\begin{aligned} F_{z_0}(z) = F(z_0+z) - F(z_0) - F'(z_0)z. \end{aligned}$$
(201)

This satisfies the same growth and perturbation estimates as in (20), (22), namely

$$\begin{aligned} |F_{z_0}(z)|&\le K_M( |z|^2 + |z|^q), \end{aligned}$$
(202)
$$\begin{aligned} |F'_{z_0}(z)|&\le K_M( |z|+ |z|^{q-1}),\end{aligned}$$
(203)
$$\begin{aligned} |F_{z_0}''(0)|&\le K_M, \end{aligned}$$
(204)
$$\begin{aligned} |F_{z_0}''(0)z - F_{z_0}'(z)|&\le K_M\,\omega _M(|z|)\left( |z|+ |z|^{q-1} \right) \end{aligned}$$
(205)

for all \(z \in \mathbb M_k\), along with the coercivity estimate

$$\begin{aligned} \int _{\mathbb R^n} F_{z_0}''(0) \nabla ^k\varphi : \nabla ^k\varphi \,\textrm{d}x \ge \lambda _M \int _{\mathbb R^n} |\nabla ^k\varphi |^2 \,\textrm{d}x \end{aligned}$$
(206)

for all \(\varphi \in C^{\infty }_c(\mathbb R^n,\mathbb R^N).\)

We will also need the following extension of Campos Cordero’s estimate (Lemma 3.6).

Lemma 5.6

Suppose \(\Omega \) is a bounded \(C^{k,\beta }\) domain for some \(\beta \in (0,1)\) and \(p>\frac{3}{2},\) then there is \(R_0>0\) such that for all \(x_0 \in \partial \Omega ,\) \(0<R<R_0\) and \(v \in W^{k,p}(\Omega _R(x_0),\mathbb R^N)\) such that \(\nabla _{\nu }^jv = \nabla ^jv \cdot \nu ^j = 0\) on \(\partial \Omega \cap B_R(x_0)\) for each \(0\le j \le k-1,\) we have the estimate

(207)

with \(C=C(n,k,\beta ,p,\Omega ).\)

Proof

As in the \(k=1\) case, by translation and rotation we can assume that \(x_0 = 0\) and \(\nu (x_0) = e_n\), and put

$$\begin{aligned} {\tilde{v}}(x) = v(x) - (\nabla _n^kv)_{\Omega _R} \frac{\rho (x)^k}{|\nabla \rho (0)|^k}. \end{aligned}$$
(208)

Here \(\rho \) is the defining function from Sect. 3.2, which can be chosen to be of class \(C^{k,\beta }\) since \(\partial \Omega \) is of this regularity.

Claim For any multi-index \(|{\tilde{\alpha }}|\le k-1\) and \(1 \le i \le n-1,\) there is \(C>0\) such that

(209)

Proof of claim: Arguing as in the proof of Lemma 3.6, applying (89) with \(\nabla ^{{\tilde{\alpha }}}{\tilde{v}}\) in place of \({\tilde{v}}\) gives

(210)

and so by the triangle inequality

(211)

The second term can be estimated as

(212)

To estimate the \(\rho ^k\) term we use the uniform estimate

$$\begin{aligned} \frac{1}{|\nabla \rho (0)|^k}|\nabla ^k(\rho ^k)(x) - \nabla ^k(\rho ^k)(0)|\le C R^{\beta } \end{aligned}$$
(213)

holding for all \(x \in \Omega _R\), which follows by noting that \(\nabla ^k(\rho ^k)\) is of class \(C^{0,\beta }\) such that \(\nabla ^k(\rho ^k)(0) = (\nabla \rho (0))^k\). Combining the estimates the claim follows.

We can now conclude by iterating (209) to show that for any \(|\alpha |= k,\) we have

(214)

To pass from v to \({\tilde{v}}\) we note that

$$\begin{aligned} |(\nabla ^{\alpha }v)_{\Omega _R(x_0)}|- |(\nabla ^{\alpha }{\tilde{v}})_{\Omega _R(x_0)}|\le |(\nabla _n^k v)_{\Omega _R}|\frac{|(\nabla ^{\alpha }(\rho ^k))_{\Omega _R}|}{|\nabla \rho (0)|^k} \le C |(\nabla _n^k v)_{\Omega _R}|R^{\beta }, \end{aligned}$$
(215)

noting again that \(\nabla ^{\alpha }(\rho ^k)\) is a \(C^{0,\beta }\)-function vanishing at the origin. Hence we can conclude by estimating

(216)

where we used both (214) and (215) in the last line. \(\square \)

With this technical estimate in hand, we can turn to the proof of Theorem 5.5. We fix \(x_0 \in {\overline{\Omega }}\) and choose \(R>0\) such that either \(B_R(x_0) \subset \Omega ,\) or \(x_0 \in \partial \Omega \) and \(R <R_0\) with \(R_0\) as in the start of Sect. 4. In the interior case we let \(a_R :\mathbb R^n \rightarrow \mathbb R^N\) be the kth order polynomial satisfying

(217)

and \(\left( \nabla ^j(u-a_R)\right) _{B_R(x_0)}=0\) for each \(0 \le j \le k-1,\) and in the boundary case we take

$$\begin{aligned} a_R(x) = \xi _R \frac{\rho (x)^k}{|\nabla \rho (x_0)|^k} = \frac{((u-g) \rho ^k)_{\Omega _R(x_0)}}{(\rho ^{2k})_{\Omega _R(x_0)}} \rho (x)^k. \end{aligned}$$
(218)

We then set \(w=u-g-a_R\) and \(z_R = \nabla ^ka_R(x_0) + (\nabla ^kg)_{\Omega _R},\) omitting the g-terms in the interior case. Since \(\rho \) vanishes at \(x_0\) we note that \(\nabla ^k(\rho ^k)(x_0) = (\nabla \rho (x_0))^k,\) and so \(\nabla ^ka_R(x_0) = \xi _R \otimes \nu _{x_0}^k\) in the boundary case. We assume \(|(\nabla ^k u)_{\Omega _R(x_0)}|\le M,\) so then \(|z_R|\le {\widetilde{M}} = CM.\) As before we write \({\widetilde{F}} = F_{z_R}\). In what follows we will focus on the boundary case; the interior case is similar but usually simpler.

Step 1: Caccioppoli-type inequality. We will show that

(219)

with \(\gamma (t) = \min \{1,\omega _{{\widetilde{M}}}(t)(1+t^{2(q-2)}) + t^{2(q-2)}\}\), omitting the \(t^{2(q-2)}\) terms if \(q =2\).

This will involve a slight modification to account for intermediate derivatives. Fix \(0<t<s<R\) and let \(\eta \in C^{\infty }_c(B_R(x_0))\) be such that \(1_{B_t} \le \eta \le 1_{B_s}\) with \(|\nabla ^j\eta |\le C(s-t)^{-j}\) for each \(0\le j \le k.\) Then applying the coercivity estimate (206) to \(\eta ^k w\) and testing the equation (197) against \(\eta ^{2k}w\) we have

(220)

Then for the last term \(I\!V\) we use (202) and uniform bounds on \(\eta \) to estimate

(221)

For the remaining terms we estimate I using (205), and for \(I\!I,\) \(I\!I\!I\) we use (204). By splitting terms using Young’s inequality to absorb terms of the form \(\nabla ^k(\eta ^kw)\) (as in the proof of Lemma 4.1) we arrive at

$$\begin{aligned} \int _{\Omega _R} |\nabla ^k(\eta ^k w)|^2 \,\textrm{d}x&\le C \int _{\Omega _R} \omega _{{\widetilde{M}}}(|\nabla ^ku-z_R|) \left( |\nabla ^k u -z_R|^2 + |\nabla ^k u - z_R|^{2(q-1)}\right) \,\textrm{d}x \nonumber \\&\quad + C \int _{\Omega _R} |\nabla ^k u - z_R|^{2(q-1)} \,\textrm{d}x \nonumber \\&\quad + C \int _{\Omega _R} |\nabla ^k a_R - \nabla ^kg - z_R|^2 + |\nabla ^k a_R - \nabla ^kg - z_R|^{2(q-1)} \,\textrm{d}x\nonumber \\&\quad + C \sum _{j=0}^{k-1} \frac{1}{(s-t)^{2j}}\int _{\Omega _s} |\nabla ^{k-j}w|^2 \,\textrm{d}x. \end{aligned}$$
(222)

where the second term does not arise if \(q = 2\). For the last term we use an interpolation estimate to bound the intermediate derivatives \(\left\Vert \nabla ^{k-j}w\right\Vert _{L^2(\Omega _s)},\) using for instance [2, Lemma 5.6] (applied in \(B_s(x_0)\) after extending by zero). Applying this to each term, we can bound

$$\begin{aligned} C \sum _{j=0}^{k-1} \frac{1}{(s-t)^{2j}}\int _{\Omega _s} |\nabla ^{k-j}w|^2 \,\textrm{d}x \le \frac{1}{2} \int _{\Omega _t} |\nabla ^kw|^2 \,\textrm{d}x + \frac{C}{(s-t)^{2k}} \int _{\Omega _s} |w|^2 \,\textrm{d}x, \end{aligned}$$
(223)

so then we can absorb the \(\nabla ^k w\) term by a standard iteration argument (for instance [20, Lemma 6.1]). For the remaining terms we can bound \(|\nabla ^ka_R-\nabla ^kg-z_R|\le CR^{\beta }\) and for the \(|\nabla ^ku-z_R|\) term we note that

(224)

Here the first inequality generalises the estimate of Kronz [35] used in Remark 4.2, and involves noting that for \(R>0\) sufficiently small and applying the Poincaré inequality k times. In the second line we apply Lemma 5.6. Now we can replace \(z_R\) with \((\nabla ^k u)_{\Omega _R}\) in the first two terms in (222), allowing us to apply the modular Fefferman–Stein estimate (Corollary 3.5) and the John–Nirenberg inequality (Proposition 3.3) to infer the claimed estimate (219).
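To make the role of interpolation in (223) concrete, the following sketch records one standard way such a bound arises. The parameter \(\delta \) and the constant \(C_0\) are introduced here for illustration only, and the precise form of the inequality in [2, Lemma 5.6] may differ. Suppose that for each \(1 \le j \le k-1\) and every admissible \(\varepsilon >0\) we have

$$\begin{aligned} \int _{\Omega _s} |\nabla ^{k-j}w|^2 \,\textrm{d}x \le \varepsilon \int _{\Omega _s} |\nabla ^kw|^2 \,\textrm{d}x + C_0\, \varepsilon ^{-\frac{k-j}{j}} \int _{\Omega _s} |w|^2 \,\textrm{d}x. \end{aligned}$$

Choosing \(\varepsilon = \delta (s-t)^{2j}\) with \(\delta \in (0,1)\) and dividing by \((s-t)^{2j}\) gives

$$\begin{aligned} \frac{1}{(s-t)^{2j}}\int _{\Omega _s} |\nabla ^{k-j}w|^2 \,\textrm{d}x \le \delta \int _{\Omega _s} |\nabla ^kw|^2 \,\textrm{d}x + \frac{C_0\, \delta ^{-\frac{k-j}{j}}}{(s-t)^{2k}} \int _{\Omega _s} |w|^2 \,\textrm{d}x, \end{aligned}$$

since \((s-t)^{-2j}(s-t)^{-2(k-j)} = (s-t)^{-2k}.\) Summing over j and taking \(\delta \) small relative to the constant C then produces a small multiple of the top-order term together with the weight \((s-t)^{-2k}\) on \(|w|^2,\) which is the shape of the right-hand side of (223).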

Step 2: Harmonic approximation. Now we take the unique \(h \in W^{k,2}({\widetilde{\Omega }}_R,\mathbb R^N)\) solving the Dirichlet problem

(225)

where \({\widetilde{\Omega }}_R\) is as in Proposition 3.8, noting it can be chosen to be \(C^{k,\beta }\) to match the regularity of the boundary. For the duality argument we also consider the unique solution \(\phi \in W^{k,2^*}({\widetilde{\Omega }}_R,\mathbb R^N)\) to

(226)

which we claim satisfies the scaled estimate

$$\begin{aligned} \left\Vert \nabla ^k\phi \right\Vert _{L^{2^*}({\widetilde{\Omega }}_R)} \le R^{-1} \left\Vert \nabla ^{k-1}(w-h)\right\Vert _{L^2({\widetilde{\Omega }}_R)}. \end{aligned}$$
(227)

For the excess decay estimate we will also need the Hölder estimate

(228)

These results go back to [5] (see also [3]), but they can also be straightforwardly adapted from the second order case detailed in [20, Chapter 10].

Given these estimates we can argue analogously to the proofs of Lemmas 2.4 and 4.3 to show that

(229)

with \(\gamma (t) = \min \{1,\omega _{{\widetilde{M}}}(t)^{\frac{1}{n}}(1+ t^{2(q-2)})\},\) suitably modified if \(n=2.\) Indeed we can write

(230)

and we split the first term using Hölder, invoking the \(L^{2^*}\) estimates for \(\nabla ^k\phi .\) Replacing \(z_R\) by \((\nabla ^k u)_{\Omega _R}\) and using the John–Nirenberg inequality, the claimed estimate (229) follows.

Step 3: Excess decay estimate. Finally, to conclude, we consider the higher-order excess

(231)

Then assuming \(|(\nabla ^ku)_{\Omega _{2\sigma r}(x)}|, |(\nabla ^ku)_{\Omega _r(x)}|\le 2^{3n+1}M\) we can combine the previous two estimates to deduce the decay estimate

$$\begin{aligned} E(x,\sigma r) \le C \left( \sigma ^{2\beta } + \sigma ^{-(n+2k)} \gamma \left( \left[ \nabla ^ku\right] _{{{\,\textrm{BMO}\,}}(\Omega _r(x))}\right) \right) E(x,r) + C\sigma ^{-(n+2k)}r^{2\beta }. \end{aligned}$$
(232)

Now we can iterate in the usual way to establish Theorem 5.5.
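For orientation, the iteration can be sketched as follows, under assumptions that are labelled here and not taken from the text: we fix an auxiliary exponent \(\alpha \in (0,\beta ),\) assume \(\gamma (t) \rightarrow 0\) as \(t \rightarrow 0,\) and assume that the bounds on the averages of \(\nabla ^ku\) required for (232) persist at every scale (this is not verified in the sketch). First choose \(\sigma \in (0,1)\) with \(2C\sigma ^{2\beta } \le \sigma ^{2\alpha },\) and then suppose the BMO seminorm is small enough at each scale that \(\gamma \left( \left[ \nabla ^ku\right] _{{{\,\textrm{BMO}\,}}(\Omega _r(x))}\right) \le \sigma ^{n+2k+2\beta }.\) Then (232) reads

$$\begin{aligned} E(x,\sigma r) \le \sigma ^{2\alpha } E(x,r) + C_{\sigma } r^{2\beta }, \qquad C_{\sigma } = C\sigma ^{-(n+2k)}, \end{aligned}$$

and applying this at the radii \(r, \sigma r, \dots , \sigma ^{j-1}r\) yields

$$\begin{aligned} E(x,\sigma ^j r)&\le \sigma ^{2\alpha j} E(x,r) + C_{\sigma } r^{2\beta } \sum _{i=0}^{j-1} \sigma ^{2\alpha (j-1-i)} \sigma ^{2\beta i} \nonumber \\&\le \sigma ^{2\alpha j} \left( E(x,r) + \frac{C_{\sigma }\sigma ^{-2\alpha }}{1-\sigma ^{2(\beta -\alpha )}}\, r^{2\beta } \right) , \end{aligned}$$

where the second line uses \(\sigma ^{2\alpha (j-1-i)} \sigma ^{2\beta i} = \sigma ^{2\alpha (j-1)}\sigma ^{2(\beta -\alpha )i}\) and sums the geometric series. This is the geometric decay of the excess from which Hölder-type estimates are usually read off.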