1 Introduction and main result

In this paper, we study \(C^1\)-regularity of minimizers of integral functionals of the Calculus of Variations with widely degenerate convex integrands of the form

$$\begin{aligned} {\mathcal {F}}(u) := \int _\Omega \big [F(Du) + f\cdot u\big ] \mathrm {d}x, \end{aligned}$$
(1.1)

where \(\Omega \subset {\mathbb {R}}^n\), \(n\ge 2\), is a bounded domain and \(u:\Omega \rightarrow {\mathbb {R}}^N\), \(N\ge 1\), is a possibly vector-valued function. We concentrate on the prototype integrand

$$\begin{aligned} F(\xi ) := \tfrac{1}{p}(|\xi | - 1)_+^p, \end{aligned}$$
(1.2)

for some \(p>1\). The datum f is required to belong to \(L^{n+\sigma }\) for some \(\sigma >0\). The functional \({\mathcal {F}}\) with the specific integrand F from (1.2) is the prototype for a class of more general functionals where F is a convex function vanishing inside some convex set and satisfying specific growth and ellipticity assumptions. For the sake of clarity, the results in this paper are stated and proved for the functional \({\mathcal {F}}(u)\) as in (1.1)–(1.2). However, we expect our techniques to apply to a general class of integrands with a widely degenerate structure as well. The functional \({\mathcal {F}}(u)\) and its associated Euler–Lagrange system

$$\begin{aligned} \mathrm {div}\bigg (\big (|Du|-1\big )^{p-1}_+\frac{Du}{|Du|}\bigg ) = f \end{aligned}$$
(1.3)

naturally arise in problems of optimal transport with congestion effects. In fact, minimizing (1.1) with \(N=1\) and the integrand from (1.2) is equivalent to the dual minimization problem

$$\begin{aligned} \min \bigg \{ \int _\Omega {\mathcal {H}} (\sigma )\,\mathrm {d}x:\sigma \in L^q(\Omega ,{\mathbb {R}}^n),\, {{\,\mathrm{div}\,}}\sigma =f, \, \sigma \cdot \nu _{\partial \Omega }=0\bigg \}, \end{aligned}$$
(1.4)

where the integrand

$$\begin{aligned} {{\mathcal {H}}(\sigma )=H(|\sigma |), \qquad \hbox { with } H(t)= t +\tfrac{1}{q} t^{q} \quad \hbox {and}\quad \tfrac{1}{p}+\tfrac{1}{q}=1} \end{aligned}$$

is the convex conjugate of F, or equivalently \(F={\mathcal {H}}^*\), and \(\sigma \) represents the traffic flow. The function \(g(t)=H'(t)\) models the congestion effect. Note that \(\sigma \mapsto g(\sigma )\) is increasing and \(g(0)=1>0\), so that moving in an empty street has nonzero cost. As shown in [4], the unique minimizer \(\sigma (x)\) of (1.4) is given by \(D_\xi F(Du(x))\). We refer to [2, 3, 4, 6, 7, 36] and the references therein for detailed motivations and for the physical meaning of the regularity of minimizers. It would be interesting to investigate whether there are applications of the vectorial problem. However, our main motivation for considering the very degenerate system (1.3) is mathematical.
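
For the reader's convenience, the duality between F and \({\mathcal {H}}\) can be checked by a direct computation. The following is only a sketch: we write \(t=|\sigma |>0\) and use that F is radial, so that the supremum defining the conjugate reduces to one over \(s=|\xi |\ge 0\). The supremum is attained where \(t=(s-1)^{p-1}\), that is at \(s=1+t^{q-1}\), and since \((q-1)p=q\) we find

$$\begin{aligned} F^*(\sigma ) = \sup _{s\ge 0}\Big [ ts - \tfrac{1}{p}(s-1)_+^p\Big ] = t\big (1+t^{q-1}\big ) - \tfrac{1}{p}\, t^{q} = t + \tfrac{1}{q}\, t^{q} = H(t). \end{aligned}$$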

In connection with congested traffic dynamic problems, the regularity of minimizers, as well as the regularity of weak solutions of the associated autonomous Euler–Lagrange system, has been an active field of research in recent years. For instance, in [2, 4, 9, 24] Lipschitz regularity of minimizers has been established under suitable assumptions on the datum f.

At this point it is worthwhile to observe that, in general, no more than Lipschitz regularity can be expected for solutions of equations or systems as in (1.3). Indeed, when \(f=0\), every 1-Lipschitz continuous function solves (1.3). On the other hand, in the scalar case \(N=1\), assuming \(f\in L^{n+\sigma } \) for some \(\sigma >0\), it was shown by Santambrogio and Vespri [36] for \(n=2\) and Figalli and Colombo [10, 11] for \(n\ge 2\) that the composition of an arbitrary continuous function vanishing on the set \(\{|\nabla u|\le 1\}\) with \(\nabla u\) is continuous.

Our aim in this paper is to investigate \(C^1\)-regularity of minimizers in the vectorial case \(N\ge 1\). In general, regularity in the vectorial case is much more delicate and minimizers may be irregular although the integrand is smooth, cf. [15, 38]. In this respect, regularity can be expected only for integrands with special structure. The first result in this direction has been obtained by Uhlenbeck [40] for the p-Laplace system when \(p\ge 2\). She proved that weak solutions are of class \(C^{1,\alpha }\). The scalar case had previously been established by Ural’ceva [41], while the case \(p\in (1,2)\) was obtained by Tolksdorf [39]. As already mentioned, we cannot hope for a \(C^{1,\alpha }\)-regularity result for the elliptic system (1.3), since Lipschitz continuity is optimal. However, we are able to establish in the vectorial setting that the composition \({\mathcal {K}}(Du)\) is continuous for any continuous function \({\mathcal {K}}:{\mathbb {R}}^{Nn}\rightarrow {\mathbb {R}}\) vanishing on \(\{\xi \in {\mathbb {R}}^{Nn} : |\xi |\le 1\}\). This phenomenon is somewhat reminiscent of comparable results for the Stefan problem, in which the continuity of the energy cannot be shown, but the temperature shows a logarithmic type continuity, cf. [16, 31].

1.1 Statement of the main result

Before formulating the main regularity result, we need to introduce some notation. The natural energy space for (local) minimizers of the integral functional \({\mathcal {F}}\) is the Sobolev space \(W^{1,p}(\Omega ,{\mathbb {R}}^N)\). Then, (local) minimizers in \(W^{1,p}_{\mathrm{loc}}(\Omega ,{\mathbb {R}}^N)\) of the functional \({\mathcal {F}}\) are weak solutions of the Euler–Lagrange system

$$\begin{aligned} {{\,\mathrm{div}\,}}{\mathbf {A}}(Du) = f \end{aligned}$$
(1.5)

and vice versa, where

$$\begin{aligned} {\textbf {A}}(\xi ):=h(|\xi |)\xi , \quad \text{ with } h(t):= \frac{(t-1)_+^{p-1}}{t} \ \ \text{ for } t\in {\mathbb {R}}_+, \end{aligned}$$

for some \(p>1\). A function \(u\in W^{1,p}_{\mathrm{loc}}(\Omega ,{\mathbb {R}}^N)\) is a weak solution of the Euler–Lagrange system (1.5) if and only if

$$\begin{aligned} \int _\Omega {\mathbf {A}}(Du)\cdot D\varphi \,\mathrm {d}x= -\int _\Omega f\cdot \varphi \,\mathrm {d}x\end{aligned}$$

holds true for any testing function \(\varphi \in C_0^\infty (\Omega ,{\mathbb {R}}^N)\). Our main result proves the continuity of the composition \({\mathcal {K}}(Du)\) explained above.
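
The weak formulation above is the first variation of \({\mathcal {F}}\). As a brief formal sketch (differentiation under the integral sign is taken for granted here), note that \(D_\xi F(\xi ) = (|\xi |-1)_+^{p-1}\frac{\xi }{|\xi |} = {\mathbf {A}}(\xi )\), so that for a minimizer u and a testing function \(\varphi \) we have

$$\begin{aligned} 0 = \frac{\mathrm {d}}{\mathrm {d}t}\Big |_{t=0} {\mathcal {F}}(u+t\varphi ) = \int _\Omega \big [ {\mathbf {A}}(Du)\cdot D\varphi + f\cdot \varphi \big ]\,\mathrm {d}x, \end{aligned}$$

which is the identity displayed above with the term involving f moved to the right-hand side.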

Theorem 1.1

Let \(p>1\), \(f\in L^{n+\sigma }(\Omega ,{\mathbb {R}}^N)\) for some \(\sigma >0\) and \(u\in W^{1,p}_{\mathrm{loc}} (\Omega ,{\mathbb {R}}^N)\) be a weak solution of (1.5) in \(\Omega \). Then,

$$\begin{aligned} {\mathcal {K}}(Du)\in C^0(\Omega ) \end{aligned}$$

for any continuous function \({\mathcal {K}}:{\mathbb {R}}^{Nn}\rightarrow {\mathbb {R}}\) vanishing on \(\{\xi \in {\mathbb {R}}^{Nn} : |\xi |\le 1\}\).

By carefully tracing the dependence of constants on the parameter \(\delta \) in the proof of Theorem 3.6, one could determine an explicit modulus of continuity of \({\mathcal {G}}(Du)\), where \({\mathcal {G}}\) is defined in (2.2). However, it is not clear if \({\mathcal {G}}(Du)\) is Hölder continuous in general. For a different very degenerate elliptic equation a counterexample to Hölder continuity is provided in [11].

Theorem 1.1 can be regarded as the vectorial analog of the regularity results of Santambrogio and Vespri [36, Theorem 11] and of Figalli and Colombo [11, Theorem 1.1] as far as the model type integral functional is considered. The vectorial case cannot be treated with the methods from [11, 36], since these are tailored to the scalar case. Nevertheless, some steps in our proof are similar, for example, the approximation procedure by some sequence of uniformly elliptic problems. The main difference in our proof is that we are establishing a variant of DiBenedetto’s and Manfredi’s proofs of \(C^{1,\alpha }\)-regularity of minimizers to p-energy type functionals [17, 32]. It is also inspired by the arguments from DiBenedetto and Friedman’s pioneering results on \(C^{1,\alpha }\)-regularity for parabolic p-Laplacian systems [18, 20]. Roughly speaking, our strategy is the adaptation of De Giorgi’s approach to the level of gradients in combination with Campanato type comparison arguments. The past has shown that De Giorgi’s approach is extremely flexible. Therefore, we expect that our approach can be transferred to larger classes of widely degenerate functionals in the vectorial case. However, and in order to keep the individual steps as simple as possible, we limit ourselves to treating the model case.

1.2 Strategy of the proof

A few words concerning the overall strategy of the proof are in order. First, we observe that weak solutions of (1.5) are Lipschitz continuous. This has been proved in [2, 4, 9]. Moreover, functionals as in (1.1) fit into the broader context of asymptotically convex functionals, i.e. functionals having a p-Laplacian type structure only at infinity. This class of functionals has been widely studied since the local Lipschitz regularity result by Chipot and Evans [8]. In particular, we mention generalizations allowing super- and sub-quadratic growth [28, 30, 35] and lower order terms [34]. Extensions to various other settings can be found in the non-exhaustive list [12, 13, 14, 23, 24, 25, 26, 27, 37].

The proof of Theorem 1.1 is divided into several steps and starts with an approximation procedure. Indeed, by replacing h(t) by \(h_\varepsilon (t):= h(t)+\varepsilon \) for \(\varepsilon >0\) and considering instead of (1.5) the Dirichlet problem on a ball compactly contained in \(\Omega \) associated to the regularized coefficients \(h_\varepsilon \) and with Dirichlet boundary datum u, we obtain a sequence of more regular approximating mappings \(u_\varepsilon \). In particular, \(u_\varepsilon \) has second weak derivatives in \(L^2_{\text {loc}}\). In Sect. 3.1 we summarize the most important properties, i.e. uniform energy bounds, uniform quantitative interior \(L^\infty \)-gradient bounds, uniform quantitative higher differentiability \(W^{2,2}\)-estimates, and finally strong \(L^p\)-convergence of \({\mathcal {G}}_\delta (Du_\varepsilon )\rightarrow \mathcal G_\delta (Du)\) in the limit \(\varepsilon \downarrow 0\). The nonlinear mapping \({\mathcal {G}}_\delta :{\mathbb {R}}^{Nn}\rightarrow {\mathbb {R}}^{Nn}\) with \(\delta \in (0,1]\) is defined by

$$\begin{aligned} {\mathcal {G}}_\delta (\xi ) := \frac{(|\xi |-1-\delta )_+}{|\xi |}\,\xi , \quad \text{ for } \xi \in {\mathbb {R}}^{Nn}. \end{aligned}$$

Observe that \({\mathcal {G}}_\delta \) vanishes on the larger set \(\{ |\xi |\le 1+\delta \}\). The reason for considering \(\mathcal G_\delta \) is that on the set \(\{ |\xi |> 1+\delta \}\) the system (1.5) is non-degenerate in the sense that the vector field \({\mathbf {A}}\) admits a uniform ellipticity bound from below, with constants depending on \(\delta \). This point of view has already been exploited in [11, 36]. As a first main result we prove that \({\mathcal {G}}_\delta (Du_\varepsilon )\) is Hölder continuous uniformly with respect to \(\varepsilon \). However, the constants in the quantitative estimate, i.e. the Hölder exponent and the Hölder norm, may blow up when \(\delta \downarrow 0\). We distinguish between two different regimes: the degenerate and the non-degenerate regime. The degenerate regime is characterized by the fact that the measure of those points in a ball in which \(|{\mathcal {G}}_{\delta }(Du_\varepsilon )|\) is far from its supremum is large, while the non-degenerate regime is characterized by the opposite. In the non-degenerate regime we compare \(u_\varepsilon \) with a solution of a linearized system. This allows us to derive a quantitative \(L^2\)-excess-improvement for \({\mathcal {G}}_{2\delta }(Du_\varepsilon )\) on some smaller ball (see Proposition 3.4). This step utilizes a suitable comparison estimate and the higher integrability of \(u_\varepsilon \). On the smaller ball we are again in the non-degenerate regime, so that the argument can be iterated, yielding a Campanato-type estimate for the \(L^2\)-excess of \({\mathcal {G}}_{2\delta }(Du_\varepsilon )\). In the degenerate regime we establish that \(U_\varepsilon := ( |Du_\varepsilon |-1-\delta )_+^2\) is a subsolution to a linear uniformly elliptic equation with measurable coefficients; of course the ellipticity constants depend on \(\delta \) and blow up as \(\delta \downarrow 0\). 
At this stage a De Giorgi type argument allows a reduction of the modulus of \({\mathcal {G}}_{\delta }(Du_\varepsilon )\) on some smaller ball (see Proposition 3.5). However, on this smaller scale it is not clear whether we are in the degenerate or the non-degenerate regime. Therefore one needs to distinguish between these two regimes again. In the non-degenerate regime we can conclude as above, while in the degenerate regime the reduction of the modulus of \({\mathcal {G}}_{\delta }(Du_\varepsilon )\) applies again. This argument can be iterated as long as we stay in the degenerate regime. However, if at a certain scale the switching from degenerate to non-degenerate occurs, the above Campanato type decay applies. If no switching occurs, we have at any scale of the iteration process a reduction of the modulus of \({\mathcal {G}}_{\delta }(Du_\varepsilon )\). This, however, shows that the supremum of \(| {\mathcal {G}}_{\delta }(Du_\varepsilon )|\), and hence also that of \(| {\mathcal {G}}_{2\delta }(Du_\varepsilon )|\), on shrinking concentric balls converges to 0. Altogether this leads to a quantitative Hölder estimate for \({\mathcal {G}}_{2\delta }(Du_\varepsilon )\) which remains stable under the already established convergence \(u_\varepsilon \rightarrow u\) as \(\varepsilon \downarrow 0\). The final step consists in passing to the limit \(\delta \downarrow 0\) and concluding that \({\mathcal {G}}(Du):=\frac{(|Du|-1)_+}{|Du|}Du\) is continuous. This can be achieved by an application of the Arzelà–Ascoli theorem. It is here that we lose control of the quantitative Hölder exponent. At this point the continuity of \({\mathcal {K}}(Du)\) for any continuous function \({\mathcal {K}}\) vanishing on \(\{|\xi |\le 1\}\) is an immediate consequence.
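
One way to make the last step explicit is the following sketch; the auxiliary function \(\widetilde{{\mathcal {K}}}\) is introduced here only for illustration. Since \({\mathcal {G}}\) maps \(\{|\xi |>1\}\) bijectively onto \({\mathbb {R}}^{Nn}\setminus \{0\}\) with inverse \(\eta \mapsto \frac{|\eta |+1}{|\eta |}\eta \) (see the proof of Lemma 2.3), we may set

$$\begin{aligned} \widetilde{{\mathcal {K}}}(\eta ) := {\mathcal {K}}\Big (\tfrac{|\eta |+1}{|\eta |}\,\eta \Big ) \quad \text{ for } \eta \not =0, \qquad \widetilde{{\mathcal {K}}}(0):=0, \qquad \text{ so } \text{ that } \quad {\mathcal {K}}(Du) = \widetilde{{\mathcal {K}}}\big ({\mathcal {G}}(Du)\big ). \end{aligned}$$

The function \(\widetilde{{\mathcal {K}}}\) is continuous at the origin, because \(\big |\frac{|\eta |+1}{|\eta |}\eta \big | = 1+|\eta |\rightarrow 1\) as \(\eta \rightarrow 0\) and \({\mathcal {K}}\) is uniformly continuous on compact sets and vanishes on the closed unit ball. Hence the continuity of \({\mathcal {G}}(Du)\) carries over to \({\mathcal {K}}(Du)\).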

2 Notation and preliminary results

2.1 Notation

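For the open ball of radius \(\varrho >0\) and center \(x_o\in {\mathbb {R}}^n\) we write \(B_\varrho (x_o)\subset {\mathbb {R}}^n\). The mean value of a function \(v\in L^1(B_\varrho (x_o),{\mathbb {R}}^k)\) is defined by

$$\begin{aligned} (v)_{x_o,\varrho } := \frac{1}{|B_\varrho (x_o)|}\int _{B_\varrho (x_o)} v \,\mathrm {d}x. \end{aligned}$$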

If the center is clear from the context we omit the reference to the center and write \(B_\varrho \) respectively \((v)_\varrho \) for short. For the standard scalar product on Euclidean spaces \({\mathbb {R}}^k\) as well as the space \({\mathbb {R}}^{k n}\) of \(k\times n\) matrices, we use the notation \(\xi \cdot \eta \). Finally, we use the notation \(\nabla u\) for the gradient of a scalar function u, while we use Du for a vector field u.

Throughout this paper we abbreviate

$$\begin{aligned} g(t) := \frac{(t-1)_+^p}{t} \qquad \text{ for } t\in {\mathbb {R}}_+ \end{aligned}$$
(2.1)

and

$$\begin{aligned} {\mathcal {G}}(\xi ) := \frac{(|\xi |-1)_+}{|\xi |}\, \xi \qquad \text{ for } \xi \in {\mathbb {R}}^{k},\, k\in {\mathbb {N}}. \end{aligned}$$
(2.2)

Observe that

$$\begin{aligned} g(|\xi |)\xi = |{\mathcal {G}}(\xi )|^{p-1} {\mathcal {G}}(\xi ) \qquad \text{ for } \text{ any } \xi \in {\mathbb {R}}^{k}. \end{aligned}$$
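
This identity follows directly from the definitions (2.1) and (2.2). As a short check, recall that \(|{\mathcal {G}}(\xi )| = (|\xi |-1)_+\), so that for \(\xi \not =0\)

$$\begin{aligned} |{\mathcal {G}}(\xi )|^{p-1}\, {\mathcal {G}}(\xi ) = (|\xi |-1)_+^{p-1}\,\frac{(|\xi |-1)_+}{|\xi |}\,\xi = \frac{(|\xi |-1)_+^{p}}{|\xi |}\,\xi = g(|\xi |)\,\xi . \end{aligned}$$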

Moreover, for \(\delta \in (0,1]\) we define

$$\begin{aligned} {\mathcal {G}}_{\delta }(\xi ) := \frac{(|\xi |-1-\delta )_+}{|\xi |}\, \xi \qquad \text{ for } \xi \in {\mathbb {R}}^{k},\, k\in {\mathbb {N}}\end{aligned}$$
(2.3)

and note that \({\mathcal {G}}_0\equiv {\mathcal {G}}\).

Generic constants are denoted by c. They may vary from line to line. Relevant dependencies on parameters and special constants will be suitably emphasized using parentheses or subscripts.

2.2 Algebraic inequalities

In this section, we summarize the relevant algebraic inequalities that will be needed later on. The first lemma follows from an elementary computation.

Lemma 2.1

For \(\eta ,\zeta \in {\mathbb {R}}^k_{\not =0}\), \(k\in {\mathbb {N}}\) we have

$$\begin{aligned} \bigg |\frac{\eta }{|\eta |} - \frac{\zeta }{|\zeta |}\bigg | \le \frac{2}{|\eta |}|\eta -\zeta |. \end{aligned}$$
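
The elementary computation can be sketched as follows: adding and subtracting \(\zeta /|\eta |\) and using the reverse triangle inequality \(\big ||\zeta |-|\eta |\big |\le |\eta -\zeta |\), we obtain

$$\begin{aligned} \bigg |\frac{\eta }{|\eta |} - \frac{\zeta }{|\zeta |}\bigg | = \bigg |\frac{\eta -\zeta }{|\eta |} + \frac{|\zeta |-|\eta |}{|\eta |}\,\frac{\zeta }{|\zeta |}\bigg | \le \frac{|\eta -\zeta |}{|\eta |} + \frac{\big ||\zeta |-|\eta |\big |}{|\eta |} \le \frac{2}{|\eta |}\,|\eta -\zeta |. \end{aligned}$$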

The next lemma can be deduced as in [29, Lemma 8.3].

Lemma 2.2

For any \(\alpha >0\), there exists a constant \(c=c(\alpha )\) such that, for all \(\eta , \zeta \in {\mathbb {R}}^k_{\not = 0}\), \(k\in {\mathbb {N}}\), we have

$$\begin{aligned} \tfrac{1}{c}\big ||\eta |^{\alpha -1}\eta - |\zeta |^{\alpha -1}\zeta \big | \le \big (|\eta | + |\zeta |\big )^{\alpha -1}|\eta -\zeta | \le c \big ||\eta |^{\alpha -1}\eta - |\zeta |^{\alpha -1}\zeta \big |. \end{aligned}$$

Lemma 2.3

Let \(\delta \ge 0\) and \(\eta ,\zeta \in {\mathbb {R}}^k\), \(k\in {\mathbb {N}}\). Then, for \({\mathcal {G}}_\delta \) as defined in (2.3) we have

$$\begin{aligned} |{\mathcal {G}}_\delta (\eta )-{\mathcal {G}}_\delta (\zeta )| \le 3 |\eta -\zeta |. \end{aligned}$$

Moreover, if \(\delta >0\) and \(|\eta |\ge 1+\delta \) there holds

$$\begin{aligned} |\eta -\zeta | \le \big (1+ \tfrac{2}{\delta }\big )\,|\mathcal G(\eta )-{\mathcal {G}}(\zeta )|. \end{aligned}$$

Proof

We distinguish between different cases. If \(|\eta |,|\zeta |\le 1+\delta \) the inequality holds trivially since \(\mathcal G_\delta (\eta )=0={\mathcal {G}}_\delta (\zeta )\). If \(|\eta |,|\zeta |> 1+\delta \), we apply Lemma 2.1 and obtain

$$\begin{aligned} |{\mathcal {G}}_\delta (\eta )-{\mathcal {G}}_\delta (\zeta )|&= \bigg |\frac{|\eta |-1-\delta }{|\eta |}\,\eta - \frac{|\zeta |-1-\delta }{|\zeta |}\,\zeta \bigg | \\&\le |\eta -\zeta | + (1+\delta ) \bigg |\frac{\eta }{|\eta |} - \frac{\zeta }{|\zeta |}\bigg | \\&\le 3|\eta -\zeta |. \end{aligned}$$

If \(|\eta |> 1+\delta \) and \(|\zeta |\le 1+\delta \), we have

$$\begin{aligned} |{\mathcal {G}}_\delta (\eta )-{\mathcal {G}}_\delta (\zeta )| = |\mathcal G_\delta (\eta )| = |\eta |-1-\delta \le |\eta |-|\zeta | \le |\eta -\zeta | . \end{aligned}$$

The case when \(|\eta |\le 1+\delta \) and \(|\zeta |> 1+\delta \) is similar; we only have to interchange the roles of \(\eta \) and \(\zeta \). Combining these cases gives the first assertion of the lemma.

Now, we come to the proof of the second assertion. First, we consider the case \(|\zeta |\le 1\) in which \({\mathcal {G}}(\zeta )=0\). In this case we have

$$\begin{aligned} \frac{|\eta -\zeta |}{|{\mathcal {G}}(\eta )- {\mathcal {G}}(\zeta )|}&= \frac{|\eta -\zeta |}{|{\mathcal {G}}(\eta )|} = \frac{|\eta -\zeta |}{|\eta |-1} \le \frac{|\eta |+1}{|\eta |-1} = 1+\frac{2}{|\eta |-1} \le 1+\frac{2}{\delta }. \end{aligned}$$

Next, we consider the case \(|\zeta |>1\). Recall that by assumption \(|\eta |\ge 1+\delta \). We start with the observation that \(\mathcal G:{\mathbb {R}}^k\setminus \{ |\eta |\le 1\}\rightarrow {\mathbb {R}}^k\setminus \{ 0\}\) is a one to one mapping whose inverse mapping is given by \(\mathcal G^{-1}({\tilde{\xi }})= \frac{|{\tilde{\xi }}|+1}{|{\tilde{\xi }}|}{\tilde{\xi }}\). We let \({\tilde{\eta }} := {\mathcal {G}}(\eta )\) and \({\tilde{\zeta }} := \mathcal G(\zeta )\) and estimate with Lemma 2.1

$$\begin{aligned} \frac{\big |{\mathcal {G}}^{-1}({\tilde{\eta }}) -\mathcal G^{-1}({\tilde{\zeta }})\big |}{|{\tilde{\eta }} -{\tilde{\zeta }}|} = \frac{\left| \frac{|{\tilde{\eta }}|+1}{|{\tilde{\eta }}|}{\tilde{\eta }} - \frac{|{\tilde{\zeta }}|+1}{|{\tilde{\zeta }}|}{\tilde{\zeta }}\right| }{|{\tilde{\eta }} -{\tilde{\zeta }}|} = \frac{\left| {\tilde{\eta }} + \frac{{\tilde{\eta }}}{|{\tilde{\eta }}|} - {\tilde{\zeta }} - \frac{{\tilde{\zeta }}}{|{\tilde{\zeta }}|}\right| }{|{\tilde{\eta }} -{\tilde{\zeta }} |} \le 1 + \frac{2}{|{\tilde{\eta }}|} \le 1+ \frac{2}{\delta }. \end{aligned}$$

In the last estimate we used \(|{\tilde{\eta }}| = |\eta |-1 \ge \delta \). Inserting the definitions of \({\tilde{\eta }}\) and \(\tilde{\zeta }\) into the previous inequality, the claim follows. \(\square \)
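
The formula for the inverse mapping used in the proof can be verified directly. As a short check, for \({\tilde{\xi }}\not =0\) the vector \(\xi := \frac{|{\tilde{\xi }}|+1}{|{\tilde{\xi }}|}{\tilde{\xi }}\) satisfies \(|\xi | = |{\tilde{\xi }}|+1>1\), and therefore

$$\begin{aligned} {\mathcal {G}}(\xi ) = \frac{|\xi |-1}{|\xi |}\,\xi = \frac{|{\tilde{\xi }}|}{|{\tilde{\xi }}|+1}\cdot \frac{|{\tilde{\xi }}|+1}{|{\tilde{\xi }}|}\,{\tilde{\xi }} = {\tilde{\xi }}. \end{aligned}$$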

Lemma 2.4

There exists a constant \(c=c(p)\) such that for any \(a>1\) and \(b\ge 0\) we have

$$\begin{aligned} |h(b) - h(a)|b \le c(p)\,\frac{[a-1 + (b-1)_+]^{p-1}}{a-1} |b-a|. \end{aligned}$$

Proof

We apply Lemma 2.2 with \(\alpha =p-1>0\) to obtain

$$\begin{aligned} |h(a)-h(b)| b&= \Big | h(a)(b-a) +(a-1)^{p-1} - (b-1)_+^{p-1}\Big |\\&\le h(a)|b-a| + \big |(a-1)^{p-1} - (b-1)_+^{p-1}\big | \\&\le h(a)|b-a| + c\,\big [(a-1) + (b-1)_+\big ]^{p-2} |b-a| \\&\le c(p)\,\frac{[a-1 + (b-1)_+]^{p-1}}{a-1} |b-a|. \end{aligned}$$

This proves the claim. \(\square \)
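
For completeness, the last step in the chain of inequalities can be spelled out as follows; the abbreviation S is introduced here only for this remark. With \(S:=(a-1)+(b-1)_+\ge a-1>0\) and \(p-1>0\) we have

$$\begin{aligned} h(a) = \frac{(a-1)^{p-1}}{a} \le \frac{(a-1)^{p-1}}{a-1} \le \frac{S^{p-1}}{a-1} \qquad \text{ and } \qquad S^{p-2} = \frac{S^{p-1}}{S} \le \frac{S^{p-1}}{a-1}, \end{aligned}$$

so that both terms in the second to last line are bounded by a multiple of \(\frac{S^{p-1}}{a-1}|b-a|\).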

Lemma 2.5

For \(a>1\) we have

$$\begin{aligned} |h'(a)| \le \frac{p(a-1)^{p-2}}{a}. \end{aligned}$$

Moreover, for \(a,b>1\) there holds

$$\begin{aligned} \big |h'(b)b - h'(a)a\big | \le c(p)\big [(a-1)^{p-3} + (b-1)^{p-3}\big ] |b-a|. \end{aligned}$$

Proof

By direct computation we have for \(a>1\) that

$$\begin{aligned} h '(a)&= \frac{(p-1)(a-1)^{p-2}a - (a-1)^{p-1}}{a^2} = \frac{(a-1)^{p-2}[p-2+\frac{1}{a}]}{a}, \end{aligned}$$
(2.4)

from which the first claim immediately follows. We now turn our attention to the second claim. We may assume that \(1<a<b\); otherwise we interchange the role of a and b. In view of (2.4) we find

$$\begin{aligned} |h'(b)b - h'(a)a| \le (b-1)^{p-2} \big |\tfrac{1}{a}-\tfrac{1}{b}\big | + \big |(p-2)+\tfrac{1}{a}\big |\big |(b-1)^{p-2} - (a-1)^{p-2}\big | . \end{aligned}$$

For the first term we have

$$\begin{aligned} (b-1)^{p-2} \big |\tfrac{1}{a}-\tfrac{1}{b}\big | = \frac{(b-1)^{p-2}}{ab} |a-b| \le (b-1)^{p-3} |a-b| . \end{aligned}$$

For the second term we estimate

$$\begin{aligned}&\big |(p-2)+\tfrac{1}{a}\big |\big |(b-1)^{p-2} - (a-1)^{p-2}\big |\\&\quad \le p\big |(b-1)^{p-2} - (a-1)^{p-2}\big | \\&\quad \le p|p-2|\,\max _{t\in [a,b]}(t-1)^{p-3} |b-a| \\&\quad \le p|p-2|\,\big [(a-1)^{p-3} + (b-1)^{p-3}\big ] |b-a| . \end{aligned}$$

This completes the proof of the lemma. \(\square \)

Lemma 2.6

For any \(t\in {\mathbb {R}}_+\) we have

$$\begin{aligned} g(t)^2 \le h(t)(t-1)_+^p \end{aligned}$$

and

$$\begin{aligned} g(t)^2 + g'(t)^2t^2 \le \frac{p^2}{p-1} \big [h(t) + h'(t)t\big ] (t-1)_+^p. \end{aligned}$$

Proof

The first assertion can be achieved, since for \(t\in {\mathbb {R}}_+\) we have

$$\begin{aligned} g(t)^2&= \frac{(t-1)_+^{2p}}{t^2} \le \frac{(t-1)_+^{2p-1}}{t} = h(t)(t-1)_+^p. \end{aligned}$$

For the second claim, we first observe that both sides are zero for \(t\le 1\). Therefore, it remains to consider \(t>1\). Recalling (2.4) we compute

$$\begin{aligned} h (t) +h'(t)t = (p-1)(t-1)^{p-2}. \end{aligned}$$
(2.5)
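
In more detail, inserting the definition of h and the formula (2.4) for \(h'\), the computation behind (2.5) reads

$$\begin{aligned} h(t) + h'(t)t = \frac{(t-1)^{p-1}}{t} + (t-1)^{p-2}\Big [ p-2+\frac{1}{t}\Big ] = (t-1)^{p-2}\Big [ \frac{t-1}{t} + p-2 + \frac{1}{t}\Big ] = (p-1)(t-1)^{p-2}. \end{aligned}$$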

Moreover, we have

$$\begin{aligned} g'(t) = \frac{(t-1)^{p-1}[(p-1)t+1]}{t^2}, \end{aligned}$$

so that

$$\begin{aligned} g(t)^2 + g'(t)^2t^2&= \frac{(t-1)^{2p}}{t^2} +\frac{(t-1)^{2p-2}[(p-1)t+1]^2}{t^2} \\&= \frac{(t-1)^{2p-2}}{t^2}\Big [ (t-1)^{2} +\big [ (p-1)t+1\big ]^2\Big ]\\&\le \frac{(t-1)^{2p-2}}{t^2}\Big [ (t-1) +\big [ (p-1)t+1\big ]\Big ]^2\\&= p^2 (t-1)^{2p-2}. \end{aligned}$$

Dividing the previous inequality by \(h(t) +h'(t)t\) we infer that

$$\begin{aligned} \frac{g(t)^2 + g' (t)^2t^2}{h(t) + h'(t)t}&\le \frac{p^2(t-1)^{2p-2}}{( p-1)(t-1)^{p-2}} = \frac{p^2}{p-1}(t-1)^p \end{aligned}$$

holds, proving the second claimed inequality. \(\square \)

2.3 Bilinear forms

For \(\varepsilon \in [0,1]\) we define

$$\begin{aligned} h_\varepsilon (t):= h(t) + \varepsilon , \qquad \text{ for } t\in {\mathbb {R}}_{\ge 0}. \end{aligned}$$

We note that \(h_\varepsilon \in C^1({\mathbb {R}}_{\ge 0})\) for \(p>2\), while for \(p\le 2\), we have \(h_\varepsilon \in W^{1,1}({\mathbb {R}}_{> 0})\cap C^1\big ([0,1)\cup (1,\infty )\big )\). For \(p=2\) we additionally have \(h_\varepsilon \in W^{1,\infty }({\mathbb {R}}_{\ge 0})\). In any case, \(h_\varepsilon \equiv \varepsilon \) on the interval [0, 1]. Moreover, we let

$$\begin{aligned} {\textbf {A}}_\varepsilon (\xi ) := h_\varepsilon (|\xi |)\xi , \qquad \text{ for } \xi \in {\mathbb {R}}^{Nn}. \end{aligned}$$
(2.6)

For \(\xi \in {\mathbb {R}}^{Nn}\setminus \{0\}\) with \(|\xi |\not = 1\) if \(1<p<2\) we define the bilinear forms

$$\begin{aligned} \varvec{{\mathcal {A}}}_\varepsilon (\xi )(\eta ,\zeta ) := h_\varepsilon (|\xi |)\eta \cdot \zeta + h_\varepsilon '(|\xi |)|\xi | \sum _{i,j=1}^{N} \sum _{\alpha ,\beta ,\gamma =1}^{n} \frac{\xi _\alpha ^i\eta _{\alpha \gamma }^i\, \xi _\beta ^j\zeta _{\beta \gamma }^j}{|\xi |^2} \quad \text{ for } \eta ,\zeta \in {\mathbb {R}}^{Nn^2} \end{aligned}$$

and

$$\begin{aligned} \varvec{{\mathcal {B}}}_\varepsilon (\xi )(\eta ,\zeta ) := h_\varepsilon (|\xi |)\eta \cdot \zeta + h_\varepsilon '(|\xi |)|\xi | \sum _{i,j=1}^{N} \sum _{\alpha ,\beta =1}^{n} \frac{\xi _\alpha ^i\eta _{\alpha }^i\, \xi _\beta ^j\zeta _{\beta }^j}{|\xi |^2} \quad \text{ for } \eta ,\zeta \in {\mathbb {R}}^{Nn} \end{aligned}$$
(2.7)

and

$$\begin{aligned} \varvec{{\mathcal {C}}}_\varepsilon (\xi )(\eta ,\zeta ) := h_\varepsilon (|\xi |) \eta \cdot \zeta + h'_\varepsilon (|\xi |)|\xi | \sum _{i=1}^{N} \sum _{\alpha ,\beta =1}^{n} \frac{\xi ^i_{\alpha }\eta _\alpha \,\xi ^i_{\beta }\zeta _\beta }{|\xi |^2} \quad \text{ for } \eta ,\zeta \in {\mathbb {R}}^n. \end{aligned}$$
(2.8)

Observe that all forms are symmetric in the arguments \(\eta \) and \(\zeta \). Due to the special structure of \(h_\varepsilon \) and \(h_\varepsilon '\) the compositions \(\varvec{{\mathcal {A}}}_\varepsilon (Dv)\), \(\varvec{{\mathcal {B}}}_\varepsilon (Dv)\) and \(\varvec{{\mathcal {C}}}_\varepsilon (Dv)\) are well defined for \(v\in W^{1,p}\) as integrable functions. Therefore integral calculations involving these quantities make sense.

The next Lemma provides the relevant ellipticity and boundedness properties of the bilinear forms \(\varvec{{\mathcal {A}}}_\varepsilon (\xi )\), \(\varvec{{\mathcal {B}}}_\varepsilon (\xi )\) and \(\varvec{{\mathcal {C}}}_\varepsilon (\xi )\). The following abbreviations

$$\begin{aligned} \varvec{\lambda }(t)&:= \min \big \{ h(t), (p-1)( t-1)^{p-2} \big \} \qquad \text{ for } t>1, \end{aligned}$$

and

$$\begin{aligned} \varvec{\Lambda }(t)&:= \max \big \{ h(t), (p-1)( t-1)^{p-2} \big \} \qquad \text{ for } t>1, \end{aligned}$$

and \(\varvec{\lambda }(t)=0=\varvec{\Lambda }(t)\) for \(0\le t\le 1\) prove to be useful in the formulation of the Lemma.

Lemma 2.7

Let \(\varepsilon \in [0,1]\) and \(\xi \in {\mathbb {R}}^{Nn}\setminus \{0\}\). The bilinear form \(\varvec{{\mathcal {A}}}_\varepsilon (\xi )\) defined above satisfies

$$\begin{aligned} \big [\varepsilon +\varvec{\lambda }(|\xi |)\big ]|\zeta |^2 \le \varvec{{\mathcal {A}}}_\varepsilon (\xi )(\zeta ,\zeta ) \le \big [ \varepsilon +\varvec{\Lambda }(|\xi |)\big ]|\zeta |^2 \qquad \text{ for } \text{ any } \zeta \in {\mathbb {R}}^{Nn^2}. \end{aligned}$$

The analogous estimates hold for the bilinear form \(\varvec{{\mathcal {B}}}_\varepsilon \) and any \(\eta ,\zeta \in {\mathbb {R}}^{Nn}\), as well as for \(\varvec{{\mathcal {C}}}_\varepsilon \) and any \(\eta ,\zeta \in {\mathbb {R}}^n\).

Proof

If \(|\xi |\le 1\) the inequality holds trivially. Therefore, it remains to consider the case \(|\xi |>1\). We first establish the lower bound. If \(h'_\varepsilon (|\xi |)\ge 0\) we omit the second term in the definition of \(\varvec{{\mathcal {A}}}_\varepsilon \) and obtain

$$\begin{aligned} \varvec{{\mathcal {A}}}_\varepsilon (\xi )(\zeta ,\zeta )&\ge h_\varepsilon (|\xi |)|\zeta |^2 \ge \big [\varepsilon + \varvec{\lambda }(|\xi |)\big ] |\zeta |^2, \end{aligned}$$

while for \(h'_\varepsilon (|\xi |)<0\) we use the Cauchy–Schwarz inequality and (2.5) to conclude

$$\begin{aligned} \varvec{{\mathcal {A}}}_\varepsilon (\xi )(\zeta ,\zeta )&\ge \big [\varepsilon + h(|\xi |) + h'(|\xi |)|\xi |\big ] |\zeta |^2\\&= \big [\varepsilon + (p-1)(|\xi |-1)^{p-2}\big ]|\zeta |^2 \ge \big [\varepsilon + \varvec{\lambda }(|\xi |)\big ] |\zeta |^2. \end{aligned}$$

Now, we turn our attention to the upper bound. If \(h'_\varepsilon (|\xi |)\ge 0\) we have

$$\begin{aligned} \varvec{{\mathcal {A}}}_\varepsilon (\xi )(\zeta ,\zeta )&\le \big [ h_\varepsilon (|\xi |) + h'_\varepsilon (|\xi |)|\xi |\big ] |\zeta |^2\\&= \big [\varepsilon + (p-1)(|\xi |-1)^{p-2}\big ]|\zeta |^2 \le \big [\varepsilon +\varvec{\Lambda }(|\xi |)\big ]|\zeta |^2, \end{aligned}$$

while for \(h'_\varepsilon (|\xi |)<0\) we obtain

$$\begin{aligned} \varvec{{\mathcal {A}}}_\varepsilon (\xi )(\zeta ,\zeta ) \le h_\varepsilon (|\xi |)|\zeta |^2 \le \big [\varepsilon +\varvec{\Lambda }(|\xi |)\big ]|\zeta |^2. \end{aligned}$$

This proves the claim for the bilinear form \(\varvec{\mathcal A}_\varepsilon \). The corresponding estimates for \(\varvec{\mathcal B}_\varepsilon \) and \(\varvec{{\mathcal {C}}}_\varepsilon \) follow in the same way. \(\square \)
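
The use of the Cauchy–Schwarz inequality in the proof can be made explicit as follows; the notation \(\zeta _\gamma \) is introduced here only for this remark. Writing \(\zeta _\gamma := (\zeta ^i_{\alpha \gamma })_{i,\alpha }\in {\mathbb {R}}^{Nn}\) for \(\gamma =1,\dots ,n\), the sum in the definition of \(\varvec{{\mathcal {A}}}_\varepsilon (\xi )(\zeta ,\zeta )\) satisfies

$$\begin{aligned} 0\le \sum _{i,j=1}^{N} \sum _{\alpha ,\beta ,\gamma =1}^{n} \frac{\xi _\alpha ^i\zeta _{\alpha \gamma }^i\, \xi _\beta ^j\zeta _{\beta \gamma }^j}{|\xi |^2} = \sum _{\gamma =1}^{n} \frac{(\xi \cdot \zeta _\gamma )^2}{|\xi |^2} \le \sum _{\gamma =1}^{n} |\zeta _\gamma |^2 = |\zeta |^2 . \end{aligned}$$

Depending on the sign of \(h'_\varepsilon (|\xi |)\), the second term in the bilinear form can therefore either be dropped or be estimated by \(h'_\varepsilon (|\xi |)|\xi ||\zeta |^2\).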

It should also be mentioned that the coercive symmetric bilinear forms satisfy the Cauchy–Schwarz inequality. In particular, we have

$$\begin{aligned} \big |\varvec{{\mathcal {C}}}_\varepsilon (\xi )(\eta ,\zeta )\big | \le \sqrt{\varvec{{\mathcal {C}}}_\varepsilon (\xi )(\eta ,\eta )} \sqrt{\varvec{{\mathcal {C}}}_\varepsilon (\xi )(\zeta ,\zeta )} \qquad \text{ for } \text{ any } \eta ,\zeta \in {\mathbb {R}}^{n}. \end{aligned}$$

In the next Lemma we put together the monotonicity and growth properties of the vector field \({\mathbf {A}}_\varepsilon \).

Lemma 2.8

Let \(\varepsilon \in [0,1]\) and \(\xi ,{\tilde{\xi }}\in {\mathbb {R}}^{k}\) with \(|\xi |>1\). Then, we have

$$\begin{aligned} \big |{\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big | \le c(p)\,\bigg [\varepsilon + \frac{[(|\xi |-1) + (|{\tilde{\xi }}|-1)_+]^{p-1}}{|\xi |-1}\bigg ] |{\tilde{\xi }} - \xi | \end{aligned}$$

and

$$\begin{aligned} \big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot ({\tilde{\xi }}-\xi )&\ge \bigg [\varepsilon + \frac{\min \{1,p-1\}}{2^{p+1}} \frac{(|\xi |-1)^p}{|\xi |(|\xi | + |{\tilde{\xi }}|)}\bigg ] |{\tilde{\xi }}-\xi |^2. \end{aligned}$$

Proof

The first inequality results from the following chain of inequalities

$$\begin{aligned} |{\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )|&\le h_\varepsilon (|\xi |)|{\tilde{\xi }}-\xi | + \big |h(|{\tilde{\xi }}|) - h(|\xi |)\big ||{\tilde{\xi }}| \\&\le h_\varepsilon (|\xi |)|{\tilde{\xi }}-\xi | + c\,\frac{[(|\xi |-1) + (|{\tilde{\xi }}|-1)_+]^{p-1}}{|\xi |-1} \big ||{\tilde{\xi }}|-|\xi |\big |\\&\le c(p)\,\bigg [\varepsilon + \frac{[(|\xi |-1) + (|{\tilde{\xi }}|-1)_+]^{p-1}}{|\xi |-1}\bigg ] |{\tilde{\xi }}-\xi |. \end{aligned}$$

In passing from the first to the second line we used Lemma 2.4. For the proof of the second inequality we abbreviate

$$\begin{aligned} \xi _s:=\xi +s({\tilde{\xi }}-\xi ), \quad \text{ for } s\in [0,1]. \end{aligned}$$

Keeping this in mind we compute

$$\begin{aligned} \big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }})&- {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot ({\tilde{\xi }}-\xi ) \\&= \int _0^1 \frac{\mathrm {d}}{\mathrm {d}s} {\mathbf {A}}_\varepsilon (\xi _s)\,\mathrm {d}s\cdot ({\tilde{\xi }}-\xi ) \\&= \int _0^1\bigg [ h_\varepsilon (|\xi _s|)({\tilde{\xi }}-\xi ) +\frac{h'_\varepsilon (|\xi _s|)}{|\xi _s|} \, \xi _s\cdot ({\tilde{\xi }}-\xi )\, \xi _s \bigg ] \,\mathrm {d}s\cdot ({\tilde{\xi }}-\xi ) \\&= \int _0^1 \varvec{{\mathcal {B}}}_\varepsilon (\xi _s)({\tilde{\xi }}-\xi ,{\tilde{\xi }}-\xi ) \,\mathrm {d}s\\&\ge \int _0^1 \big [\varepsilon +\varvec{\lambda }(|\xi _s|)\big ]\,\mathrm {d}s\,|{\tilde{\xi }}-\xi |^2 \\&\ge \bigg [\varepsilon + \frac{\min \{1,p-1\}}{|\xi | + |{\tilde{\xi }}|} \int _0^1 (|\xi _s|-1)_+^{p-1} \,\mathrm {d}s\bigg ] \,|{\tilde{\xi }}-\xi |^2. \end{aligned}$$
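
The last step in the chain above rests on a pointwise lower bound for \(\varvec{\lambda }\), which we sketch here. For \(t>1\) we have \((t-1)^{p-2} = \frac{(t-1)^{p-1}}{t-1}\ge \frac{(t-1)^{p-1}}{t}\), and for \(0<t\le 1\) both sides below vanish, so that

$$\begin{aligned} \varvec{\lambda }(t) = \min \big \{ h(t), (p-1)(t-1)^{p-2}\big \} \ge \min \{1,p-1\}\,\frac{(t-1)_+^{p-1}}{t} \qquad \text{ for } t>0. \end{aligned}$$

Applying this with \(t=|\xi _s|\le |\xi |+|{\tilde{\xi }}|\) gives the factor \(\frac{1}{|\xi |+|{\tilde{\xi }}|}\) in front of the integral.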

In turn we used Lemma 2.7 and the elementary inequality \( (|\xi _s|-1)_+ \le |\xi _s|\le |\xi |+|{\tilde{\xi }}|\). Now, we distinguish whether or not \(|{\tilde{\xi }}|\le |\xi |\). If \(|{\tilde{\xi }}|\le |\xi |\), then

$$\begin{aligned} |\xi _s| \ge (1-s)|\xi | - s|{\tilde{\xi }}|\ge (1-2s)|\xi | > 1 \qquad \forall \, s\in \big [ 0,\tfrac{|\xi |-1}{2|\xi |}\big ). \end{aligned}$$

For \(s\in \big [ 0, \tfrac{|\xi |-1}{4|\xi |}\big ]\) this implies a bound from below in the form

$$\begin{aligned} (|\xi _s|-1)_+&= |\xi _s|-1 \ge (1-2s)|\xi | - 1 \ge \bigg [1 - \frac{|\xi |-1}{2|\xi |}\bigg ]|\xi | - 1 = \tfrac{1}{2}(|\xi |-1). \end{aligned}$$

So it follows that

$$\begin{aligned} \int _0^1 (|\xi _s|-1)_+^{p-1} \,\mathrm {d}s\ge \int _0^{\tfrac{|\xi |-1}{4|\xi |}} (|\xi _s|-1)_+^{p-1}\, \mathrm {d}s\ge \frac{1}{2^{p+1}}\frac{(|\xi |-1)^p}{|\xi |} \end{aligned}$$

holds true. In the case that \(|{\tilde{\xi }}|> |\xi |\), we estimate \(|\xi _s|\) from below by

$$\begin{aligned} |\xi _s| \ge s|{\tilde{\xi }}| - (1-s)|\xi |> (2s-1)|\xi | > 1 \qquad \forall \, s\in \big (\tfrac{|\xi |+1}{2|\xi |},1\big ]. \end{aligned}$$

Therefore, for \(s\in \big [\frac{3|\xi |+1}{4|\xi |}, 1\big ]\) we obtain

$$\begin{aligned} (|\xi _s|-1)_+ = |\xi _s|-1 \ge (2s-1)|\xi | - 1 \ge \bigg [\frac{3|\xi |+1}{2|\xi |} - 1\bigg ]|\xi | - 1 = \tfrac{1}{2}(|\xi |-1). \end{aligned}$$

This yields

$$\begin{aligned} \int _0^1 (|\xi _s|-1)_+^{p-1} \,\mathrm {d}s\ge \int _{\tfrac{3|\xi |+1}{4|\xi |}}^1 (|\xi _s|-1)_+^{p-1}\, \mathrm {d}s\ge \frac{1}{2^{p+1}}\frac{(|\xi |-1)^p}{|\xi |} . \end{aligned}$$
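
As a quick check, both lower bounds arise in the same way: the interval of integration has length \(\frac{|\xi |-1}{4|\xi |}\) (in the second case because \(1-\frac{3|\xi |+1}{4|\xi |}=\frac{|\xi |-1}{4|\xi |}\)), and on it the integrand is bounded from below by \(\big [\tfrac{1}{2}(|\xi |-1)\big ]^{p-1}\). Multiplying the two gives

$$\begin{aligned} \frac{|\xi |-1}{4|\xi |}\,\Big [\tfrac{1}{2}(|\xi |-1)\Big ]^{p-1} = \frac{(|\xi |-1)^p}{2^{2}\,2^{p-1}\,|\xi |} = \frac{1}{2^{p+1}}\frac{(|\xi |-1)^p}{|\xi |}. \end{aligned}$$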

Inserting this above, we obtain the second claim of the Lemma. \(\square \)

Lemma 2.9

Let \(\varepsilon ,\delta \in (0,1]\) and \(\xi ,{\tilde{\xi }}\in {\mathbb {R}}^{Nn}\). Then, we have

$$\begin{aligned} \varepsilon ^{\frac{1}{2}}|\xi -{\tilde{\xi }}|^2 + \big |\mathcal G_\delta (\xi )-{\mathcal {G}}_\delta ({\tilde{\xi }})\big |^p \le \varepsilon ^{\frac{1}{2}}|\xi |^2 + c\varepsilon ^{-\frac{1}{2}}\big (\mathbf{A}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot ({\tilde{\xi }}-\xi ) \end{aligned}$$

for a constant \(c=c(p,\delta )\).

Proof

If \(|\xi |,|{\tilde{\xi }}|\le 1+\delta \), we have \(\mathcal G_\delta (\xi )=0={\mathcal {G}}_\delta ({\tilde{\xi }})\). Therefore, the desired inequality follows from the second inequality in Lemma 2.8 after omitting the positive second term in the bracket on the right-hand side.

Therefore, it remains to consider the case where either \(|\xi |>1+\delta \) or \(|{\tilde{\xi }}|>1+\delta \). We again distinguish two cases and start with \(|\xi |\ge |{\tilde{\xi }}|\). Note that this implies \(|\xi |>1+\delta \). If \(p\ge 2\) we use Lemma 2.3 to conclude

$$\begin{aligned} \big |{\mathcal {G}}_\delta ({\tilde{\xi }})-{\mathcal {G}}_\delta (\xi )\big |^p \le 3^p|{\tilde{\xi }} - \xi |^p \le 2^{p-2}3^p |\xi |^{p-2}|{\tilde{\xi }} - \xi |^2, \end{aligned}$$

while in the case \(p< 2\) we use Young’s inequality and Lemma 2.3 to obtain

$$\begin{aligned} \big |{\mathcal {G}}_\delta ({\tilde{\xi }})-{\mathcal {G}}_\delta (\xi )\big |^p&= |\xi |^{\frac{p(2-p)}{2}} |\xi |^{\frac{p(p-2)}{2}} \big |{\mathcal {G}}_\delta ({\tilde{\xi }})-{\mathcal {G}}_\delta (\xi )\big |^p \\&\le \tfrac{1}{2}\varepsilon ^{\frac{1}{2}}|\xi |^{p} + c\varepsilon ^{-\frac{2-p}{2p}}|\xi |^{p-2} \big |{\mathcal {G}}_\delta ({\tilde{\xi }})-{\mathcal {G}}_\delta (\xi )\big |^2 \\&\le \tfrac{1}{2}\varepsilon ^{\frac{1}{2}}|\xi |^{2} + c \varepsilon ^{-\frac{1}{2}}|\xi |^{p-2}|{\tilde{\xi }} - \xi |^2 , \end{aligned}$$

for a constant \(c=c(p)\). Combining both cases, taking into account the elementary inequalities \(\frac{1}{|\xi |}\le \frac{2}{|\xi |+|{\tilde{\xi }}|}\) and \(|\xi |\le (1+\frac{1}{\delta })(|\xi |-1)\), and finally applying Lemma 2.8, we obtain

$$\begin{aligned} \varepsilon ^{\frac{1}{2}}|{\tilde{\xi }}-\xi |^2 + \big |\mathcal G_\delta (\xi )-{\mathcal {G}}_\delta ({\tilde{\xi }})\big |^p&\le \varepsilon ^{\frac{1}{2}}|{\tilde{\xi }}-\xi |^2 + \tfrac{1}{2} \varepsilon ^{\frac{1}{2}} |\xi |^{2} + \frac{c\, (|\xi |-1)_+^p}{\varepsilon ^{\frac{1}{2}}|\xi |(|{\tilde{\xi }}| + |\xi |)} \,\big |{\tilde{\xi }}-\xi \big |^2 \\&\le \tfrac{1}{2} \varepsilon ^{\frac{1}{2}} |\xi |^2 + c\,\varepsilon ^{-\frac{1}{2}}\big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - \mathbf{A}_\varepsilon (\xi )\big ) \cdot ({\tilde{\xi }}-\xi ), \end{aligned}$$

where \(c=c(p,\delta )\). This proves the claimed inequality in this case. In the remaining case, i.e. \(|{\tilde{\xi }}|>|\xi |\), we obtain a similar estimate. We only have to replace \(|\xi |^2\) by \(|{\tilde{\xi }}|^2\) on the right-hand side. Then, we use the estimate \(|{\tilde{\xi }}|^2 \le 2(|{\tilde{\xi }}-\xi |^2 + |\xi |^2)\) and absorb \(\varepsilon ^{\frac{1}{2}}|{\tilde{\xi }}-\xi |^2\) by Lemma 2.8 into the second term on the right-hand side. In this way, we obtain

$$\begin{aligned} \varepsilon ^{\frac{1}{2}}|{\tilde{\xi }}-\xi |^2 + \big |\mathcal G_\delta (\xi )-{\mathcal {G}}_\delta ({\tilde{\xi }})\big |^p&\le \tfrac{1}{2}\varepsilon ^{\frac{1}{2}} |{\tilde{\xi }}|^2 + c\,\varepsilon ^{-\frac{1}{2}}\big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - \mathbf{A}_\varepsilon (\xi )\big ) \cdot ({\tilde{\xi }}-\xi ) \\&\le \varepsilon ^{\frac{1}{2}}|\xi |^2 + c\,\varepsilon ^{-\frac{1}{2}}\big (\mathbf{A}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot ({\tilde{\xi }}-\xi ), \end{aligned}$$

proving the claim also in this case. \(\square \)
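
For the reader's convenience, we sketch how Young's inequality enters in the case \(p<2\) of the preceding proof. The abbreviations \(a\), \(b\), \(\eta \) and the explicit constant are our own notation and do not appear in the argument above. With \(a:=|\xi |^{\frac{p(2-p)}{2}}\), \(b:=|\xi |^{\frac{p(p-2)}{2}}\big |{\mathcal {G}}_\delta ({\tilde{\xi }})-{\mathcal {G}}_\delta (\xi )\big |^p\) and \(\eta :=\big (\tfrac{1}{2-p}\,\varepsilon ^{\frac{1}{2}}\big )^{\frac{2-p}{2}}\), Young's inequality with the conjugate exponents \(\frac{2}{2-p}\) and \(\frac{2}{p}\) gives

$$\begin{aligned} a\,b = (\eta a)\,\frac{b}{\eta }&\le \tfrac{2-p}{2}\,(\eta a)^{\frac{2}{2-p}} + \tfrac{p}{2}\,\Big (\frac{b}{\eta }\Big )^{\frac{2}{p}} \\&= \tfrac{1}{2}\varepsilon ^{\frac{1}{2}}|\xi |^{p} + \tfrac{p}{2}\,(2-p)^{\frac{2-p}{p}}\,\varepsilon ^{-\frac{2-p}{2p}}\,|\xi |^{p-2}\big |{\mathcal {G}}_\delta ({\tilde{\xi }})-{\mathcal {G}}_\delta (\xi )\big |^2 . \end{aligned}$$

The passage to the last line of the estimate in the proof then only uses \(|\xi |^p\le |\xi |^2\) (since \(|\xi |>1\) and \(p<2\)) and \(\varepsilon ^{-\frac{2-p}{2p}}\le \varepsilon ^{-\frac{1}{2}}\) (since \(\varepsilon \le 1\) and \(\frac{2-p}{2p}<\frac{1}{2}\) for \(p>1\)), together with the Lipschitz bound from Lemma 2.3.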

In the following Lemma we quantify the remainder term in the linearization of \({\mathbf {A}}_\varepsilon \). In the application, it can be assumed that the linearization only takes place in points \(\xi \) with \(|\xi |>1\) in a quantifiable way. The precise statement is

Lemma 2.10

Let \(\varepsilon \in [0,1]\), and \(\xi ,{\tilde{\xi }}\in {\mathbb {R}}^{Nn}\) with \(|\xi | \ge 1+\frac{1}{4}\mu \) and \(|\xi |, |{\tilde{\xi }}|\le 1+2\mu \) for some \(\mu >0\). Then, we have

$$\begin{aligned} \big |\varvec{{\mathcal {B}}}_\varepsilon (\xi )({\tilde{\xi }}-\xi ,\zeta ) - \big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot \zeta \big | \le c(p)\,\mu ^{p-3}|{\tilde{\xi }}-\xi |^2|\zeta |, \qquad \forall \zeta \in {\mathbb {R}}^{Nn}. \end{aligned}$$

Proof

We distinguish two cases. We start with the case \(|\xi -{\tilde{\xi }}|\le \frac{1}{8}\mu \). For \(s\in [0,1]\) we write \( \xi _s:=\xi +s({\tilde{\xi }}-\xi ) \). Note that

$$\begin{aligned} |\xi _s| \ge |\xi |-s|{\tilde{\xi }}-\xi | \ge 1+\tfrac{1}{4}\mu - \tfrac{1}{8}\mu = 1+\tfrac{1}{8}\mu \qquad \forall \,s\in [0,1]. \end{aligned}$$
(2.9)

Similarly to the computations in the proof of Lemma 2.8 we have

$$\begin{aligned} \big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot \zeta&= \int _0^1 \frac{\mathrm {d}}{\mathrm {d}s} \mathbf{A}_\varepsilon \big (\xi +s({\tilde{\xi }}-\xi )\big ) \cdot \zeta \,\mathrm {d}s\\&= \int _0^1\bigg [ h_\varepsilon (|\xi _s|)({\tilde{\xi }}-\xi ) +\frac{h_\varepsilon '(|\xi _s|)}{|\xi _s|} \xi _s\cdot ({\tilde{\xi }}-\xi )\, \xi _s \bigg ]\cdot \zeta \,\mathrm {d}s\\&= \int _0^1 \varvec{\mathcal B}_\varepsilon (\xi _s)({\tilde{\xi }}-\xi ,\zeta ) \,\mathrm {d}s. \end{aligned}$$

This allows us to re-write

$$\begin{aligned} \big |\varvec{{\mathcal {B}}}_\varepsilon (\xi )({\tilde{\xi }}-\xi ,\zeta ) - \big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot \zeta \big |&= \bigg |\int _0^1 \big [\varvec{\mathcal B}_\varepsilon (\xi )-\varvec{{\mathcal {B}}}_\varepsilon (\xi _s)\big ] ({\tilde{\xi }}-\xi ,\zeta ) \,\mathrm {d}s\bigg |. \end{aligned}$$

We decompose and estimate the integrand appearing on the right-hand side and obtain in this way

$$\begin{aligned} \big |\big [\varvec{{\mathcal {B}}}_\varepsilon (\xi ) - \varvec{\mathcal B}_\varepsilon (\xi _s)\big ]({\tilde{\xi }}-\xi ,\zeta )\big |&\le \mathbf{I}+\mathbf {II}, \end{aligned}$$

where

$$\begin{aligned} {\mathbf {I}}&:= \big |h (|\xi |) - h (|\xi _s|)\big | |{\tilde{\xi }}-\xi | |\zeta | \\ \mathbf {II}&:= \bigg |\bigg (h'(|\xi |)|\xi | \frac{\xi _\alpha ^i\, \xi _\beta ^j}{|\xi |^2} - h'(|\xi _s|)|\xi _s| \frac{(\xi _s)_\alpha ^i\, (\xi _s)_\beta ^j}{|\xi _s|^2}\bigg ) ({\tilde{\xi }}-\xi )_{\alpha }^i \zeta _{\beta }^j\bigg | \end{aligned}$$

In view of Lemma 2.4 we have

$$\begin{aligned} \big |h_\varepsilon (|\xi |) - h_\varepsilon (|\xi _s|)\big |&\le c(p) \frac{\big [|\xi |-1 + (|\xi _s|-1)_+\big ]^{p-1}}{|\xi _s|(|\xi |-1)} |\xi -\xi _s| \\&\le c(p)\,\mu ^{p-3} |\xi -{\tilde{\xi }}|, \end{aligned}$$

which immediately implies

$$\begin{aligned} {\mathbf {I}}\le c(p)\mu ^{p-3} |\xi -{\tilde{\xi }}|^2|\zeta |. \end{aligned}$$

For the second term we use Lemma 2.1, Lemma 2.5, the assumptions on \(\xi ,{\tilde{\xi }}\) and (2.9) to obtain

$$\begin{aligned} \mathbf {II}&\le \Bigg [\sum _{\alpha ,\beta =1}^n\sum _{i,j=1}^N \bigg (h'(|\xi |)|\xi | \frac{\xi _\alpha ^i\, \xi _\beta ^j}{|\xi |^2} - h'(|\xi _s|)|\xi _s| \frac{(\xi _s)_\alpha ^i\, (\xi _s)_\beta ^j}{|\xi _s|^2}\bigg )^2 \Bigg ]^{\frac{1}{2}} |{\tilde{\xi }}-\xi ||\zeta | \\&= \bigg | h'(|\xi |)|\xi |\frac{\xi \otimes \xi }{|\xi |^2} - h'(|\xi _s|)|\xi _s|\frac{\xi _s\otimes \xi _s}{|\xi _s|^2}\bigg | |{\tilde{\xi }}-\xi ||\zeta | \\&\le \bigg [ |h'(|\xi |)||\xi | \bigg | \frac{\xi \otimes \xi }{|\xi |^2}-\frac{\xi _s\otimes \xi _s}{|\xi _s|^2}\bigg | + \big |h'(|\xi |)|\xi | - h'(|\xi _s|)|\xi _s|\big | \bigg ] |{\tilde{\xi }}-\xi ||\zeta | \\&\le \bigg [ 2|h'(|\xi |)|\frac{|\xi |+|\xi _s|}{|\xi |}|\xi -\xi _s| + \big |h'(|\xi |)|\xi | - h'(|\xi _s|)|\xi _s|\big | \bigg ] |{\tilde{\xi }}-\xi ||\zeta | \\&\le c(p)\bigg [ (|\xi |-1)^{p-3}\frac{|\xi |+|\xi _s|}{|\xi |} + (|\xi |-1)^{p-3}+(|\xi _s|-1)^{p-3} \bigg ]|{\tilde{\xi }}-\xi |^2|\zeta | \\&\le c(p)\,\mu ^{p-3} |{\tilde{\xi }}-\xi |^2|\zeta | . \end{aligned}$$

Inserting the preceding estimates above, we find that

$$\begin{aligned} \big |\varvec{{\mathcal {B}}}_\varepsilon (\xi )({\tilde{\xi }}-\xi ,\zeta ) - \big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot \zeta \big |&\le c(p)\,\mu ^{p-3} |{\tilde{\xi }}-\xi |^2|\zeta | . \end{aligned}$$

At this stage it remains to consider the case \(|\xi -{\tilde{\xi }}|> \frac{1}{8}\mu \). Note that

$$\begin{aligned} \varvec{\mathcal B}_\varepsilon (\xi )({\tilde{\xi }}-\xi ,\zeta )=\varepsilon ({\tilde{\xi }}-\xi )\cdot \zeta + \varvec{{\mathcal {B}}}_0(\xi )({\tilde{\xi }}-\xi ,\zeta ). \end{aligned}$$

Therefore, by Lemma 2.7 and the assumption \(\mu <8|{\tilde{\xi }} -\xi |\) we obtain

$$\begin{aligned}&\big |\varvec{{\mathcal {B}}}_\varepsilon (\xi )({\tilde{\xi }}-\xi ,\zeta )- \big ({\mathbf {A}}_\varepsilon ({\tilde{\xi }}) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot \zeta \big | \\&\quad = \big |\varvec{{\mathcal {B}}}_0(\xi )({\tilde{\xi }}-\xi ,\zeta ) - \big (h(|{\tilde{\xi }}|){\tilde{\xi }} - h(|\xi |)\xi \big ) \cdot \zeta \big | \\&\quad \le \Big [\varvec{\Lambda }(|\xi |) |{\tilde{\xi }}-\xi | + (|{\tilde{\xi }}|-1)_+^{p-1} + (|\xi |-1)^{p-1} \Big ] |\zeta | \\&\quad \le c(p)\big [\mu ^{p-2}|{\tilde{\xi }}-\xi | + \mu ^{p-1}\big ]|\zeta | \\&\quad \le c(p)\,\mu ^{p-3}|{\tilde{\xi }}-\xi |^2|\zeta |. \end{aligned}$$

Joining both cases yields the claim. \(\square \)
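
The last step in the second case of the preceding proof is elementary; for completeness, we spell it out with explicit numerical constants (which are ours and are not claimed to be optimal). Using \(\mu <8|{\tilde{\xi }}-\xi |\) once in the first term and twice in the second, we have

$$\begin{aligned} \mu ^{p-2}|{\tilde{\xi }}-\xi | + \mu ^{p-1} = \mu ^{p-3}\big [\mu \,|{\tilde{\xi }}-\xi | + \mu ^2\big ] \le \mu ^{p-3}\big [8 + 64\big ]\,|{\tilde{\xi }}-\xi |^2 = 72\,\mu ^{p-3}|{\tilde{\xi }}-\xi |^2 . \end{aligned}$$

Note that this holds for any \(p>1\), since only the positivity of \(\mu ^{p-3}\) is used.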

Lemma 2.11

For any \(\varepsilon \in [0,1]\), any ball \(B_R\subset {\mathbb {R}}^n\) and any \(v\in W^{2,2}_{\mathrm{loc}}(B_R,{\mathbb {R}}^{N})\) we have

$$\begin{aligned} \big |D\big [g (|Dv|) Dv\big ]\big |^2 \le \frac{2p^2}{p-1}\, \varvec{{\mathcal {A}}}_\varepsilon (Dv) \big (D^2v,D^2v\big )\, (|Dv|-1)_+^p \qquad \text{ a.e. } \text{ in } B_R, \end{aligned}$$

where g is defined in (2.1).

Proof

For \(\alpha \in \{1,\dots ,n\}\) we compute

$$\begin{aligned} \big | D_\alpha \big [g(|Dv|) Dv\big ]\big |^2&= \big | g(|Dv|)D_\alpha Dv + g'(|Dv|) D_\alpha |Dv|Dv\big |^2 \\&\le 2\Big [ g(|Dv|)^2\big |D_\alpha Dv\big |^2 + g'(|Dv|)^2|Dv|^2 \big |D_\alpha |Dv|\big | ^2\Big ] . \end{aligned}$$

Summing with respect to \(\alpha \in \{1,\dots ,n\}\), we obtain

$$\begin{aligned} \big | D \big [g(|Dv|) Dv\big ]\big |^2&= \sum _{\alpha =1}^n \big | D_\alpha \big [g(|Dv|) Dv\big ]\big |^2 \\&\le 2\Big [g(|Dv|)^2 |D^2v|^2 + g'(|Dv|)^2 |Dv|^2 \big | \nabla |Dv|\big |^2\Big ] . \end{aligned}$$

In the preceding inequality we replace the term \(g(|Dv|)^2\) by \(h(|Dv|)(|Dv|-1)_+^p\), which is possible by an application of the first inequality in Lemma 2.6. Moreover, we would also like to replace \(g'(|Dv|)^2|Dv|^2\) by \(h'(|Dv|) |Dv|(|Dv|-1)_+^p\). To this aim we have to distinguish two cases. If \(h'(|Dv|)\ge 0\), we use the second inequality in Lemma 2.6 to replace \(g'(|Dv|)^2|Dv|^2\) by \([h(|Dv|)+h'(|Dv|) |Dv|](|Dv|-1)_+^p\). Thereby, we may omit the positive term \(g(|Dv|)^2\) on the left-hand side. Inserting this above and using also Kato’s inequality in the form \(|\nabla |Dv||\le |D^2 v|\), we obtain

$$\begin{aligned}&\big | D \big [g(|Dv|) Dv\big ]\big |^2 \\&\quad \le \frac{p^2}{p-1}(|Dv|-1)_+^p\Big [h(|Dv|) |D^2v|^2 + \big [h(|Dv|) + h'(|Dv|) |Dv|\big ] \big | \nabla |Dv|\big |^2\Big ] \\&\quad \le \frac{2p^2}{p-1}(|Dv|-1)_+^p\Big [h(|Dv|) |D^2v|^2 + h'(|Dv|) |Dv|\big | \nabla |Dv|\big |^2\Big ] \\&\quad \le \frac{2p^2}{p-1}\, \varvec{\mathcal A}_\varepsilon (Dv)\big (D^2v,D^2v\big )\, (|Dv|-1)_+^p. \end{aligned}$$

Otherwise, if \(h'(|Dv|)< 0\) we use Kato’s inequality twice and again the second inequality in Lemma 2.6 to obtain

$$\begin{aligned} \big | D \big [g(|Dv|) Dv\big ]\big |^2&\le 2\big [g(|Dv|)^2 + g'(|Dv|)^2|Dv|^2 \big ] |D^2v|^2 \\&\le \frac{2p^2}{p-1}(|Dv|-1)_+^p \big [h(|Dv|) + h'(|Dv|) |Dv| \big ] |D^2v|^2\\&\le \frac{2p^2}{p-1}(|Dv|-1)_+^p\Big [h(|Dv|) |D^2v|^2 + h'(|Dv|) |Dv| \big | \nabla |Dv|\big |^2\Big ] \\&\le \frac{2p^2}{p-1}\, \varvec{\mathcal A}_\varepsilon (Dv)\big (D^2v,D^2v\big )\, (|Dv|-1)_+^p. \end{aligned}$$

This proves the asserted inequality also in the second case. \(\square \)
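
Kato's inequality, as used in the preceding proof, can be verified directly. The following is a short sketch under the assumption \(v\in W^{2,2}_{\mathrm{loc}}\) from the lemma. On the set \(\{Dv\ne 0\}\), the chain rule and the Cauchy–Schwarz inequality give for each \(\alpha \in \{1,\dots ,n\}\)

$$\begin{aligned} D_\alpha |Dv| = \frac{Dv\cdot D_\alpha Dv}{|Dv|}, \qquad \text{and hence}\qquad \big |\nabla |Dv|\big |^2 = \sum _{\alpha =1}^n \big |D_\alpha |Dv|\big |^2 \le \sum _{\alpha =1}^n |D_\alpha Dv|^2 = |D^2v|^2 . \end{aligned}$$

On the set \(\{Dv=0\}\) the inequality is trivial, since the weak gradient of the Sobolev function \(|Dv|\) vanishes almost everywhere on its zero set.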

3 Proof of Theorem 1.1

In this section we will prove Theorem 1.1 under the hypothesis that Propositions 3.4 and 3.5 below are true. The remainder of the paper is then devoted to the proof of those two propositions.

Here and in the following we denote by \(u\in W^{1,p}(\Omega ,{\mathbb {R}}^N)\) a weak solution of (1.5). We first observe that \(u\in W^{1,\infty }_{\mathrm{loc}} (\Omega ,{\mathbb {R}}^N)\); cf. [2, 4] for the scalar case and [9] for the vectorial case. Therefore, we may always assume that \(Du\) is locally bounded in \(\Omega \).

3.1 Regularization

The first step in the proof consists in the construction of more regular approximating solutions. To this aim we consider a fixed ball \(B_{R}\equiv B_{R}(y_o)\Subset \Omega \). We let \(\varepsilon \in (0,1]\) and \({\mathfrak {p}}:= \max \{p,2\}\) and recall the definition of the regularized coefficients \({\mathbf {A}}_\varepsilon \) from (2.6). By \(u_\varepsilon \in u+W_0^{1,{\mathfrak {p}}} (B_{R},{\mathbb {R}}^N)\) we denote the unique weak solution of the regularized elliptic system

$$\begin{aligned} \left\{ \begin{array}{cl} {{\,\mathrm{div}\,}}{\mathbf {A}}_\varepsilon (Du_\varepsilon ) = f, &{} \quad \text{ in } B_{R},\\[5pt] u_\varepsilon =u, &{}\quad \text{ on } \partial B_{R}. \end{array}\right. \end{aligned}$$
(3.1)

The weak formulation of (3.1) is

$$\begin{aligned} \int _{B_R} {\mathbf {A}}_\varepsilon (Du_\varepsilon )\cdot D\varphi \,\mathrm {d}x= -\int _{B_R} f\cdot \varphi \,\mathrm {d}x\qquad \forall \,\varphi \in W^{1,{\mathfrak {p}}}_0 \big (B_R,{\mathbb {R}}^N\big ). \end{aligned}$$
(3.2)

Note that \(u_\varepsilon \in W^{1,\infty }_{\mathrm{loc}} (B_{R},{\mathbb {R}}^N)\cap W^{2,2}_{\mathrm{loc}} (B_{R},{\mathbb {R}}^N)\). This can be retrieved similarly as in [33].

Lemma 3.1

For any \(\varepsilon \in (0,1]\) we have \(u_\varepsilon \in W^{1,\infty }_\mathrm{loc} (B_{R},{\mathbb {R}}^N)\cap W^{2,2}_{\mathrm{loc}} (B_{R},{\mathbb {R}}^N)\). Moreover, for any ball \(B_{2\varrho }(x_o)\Subset B_{R}\) the (uniform with respect to \(\varepsilon \)) quantitative \(L^\infty \)-gradient bound

with \(c=c(n,N,p,\sigma ,\Vert f\Vert _{L^{n+\sigma }(B_R)})\) and the quantitative \(W^{2,2}\)-estimate

$$\begin{aligned} \int _{B_{ \varrho }(x_o)} | D^2u_\varepsilon |^2 \, \mathrm {d}x&\le \frac{c}{\varepsilon \varrho ^2} \bigg [ \int _{B_{2 \varrho }(x_o)} \big (|Du_\varepsilon |^{p} + \varepsilon |Du_\varepsilon |^2\big )\, \mathrm {d}x+ \frac{\varrho ^2}{\varepsilon }\int _{B_{2 \varrho }(x_o)} |f|^2\, \mathrm {d}x\bigg ] \end{aligned}$$

with \(c=c(n,N,p)\) hold true.

Our first observation is a uniform energy bound for \(Du_\varepsilon \).

Lemma 3.2

There exists a constant \(c=c(n,p)\) such that for any \(\varepsilon \in (0,1]\) we have

$$\begin{aligned} \int _{B_R} \big (|Du_\varepsilon |^{p} + \varepsilon |Du_\varepsilon |^2\big ) \,\mathrm {d}x&\le c\int _{B_R}\big (|Du|^p+ \varepsilon |Du|^2+1\big )\,\mathrm {d}x+ c\,R^n\Vert f\Vert _{L^n(B_R)}^{\frac{p}{p-1}}. \end{aligned}$$

Proof

The desired estimate can be deduced with a standard argument by testing the weak formulation (3.2) with the test-function \(\varphi := u_\varepsilon -u\). Indeed, we have

$$\begin{aligned}&\int _{B_R} \big [(|Du_\varepsilon |-1)_+^{p} + \varepsilon |Du_\varepsilon |^2\big ] \,\mathrm {d}x\\&\quad \le \int _{B_R} h_\varepsilon (|Du_\varepsilon |) Du_\varepsilon \cdot Du_\varepsilon \,\mathrm {d}x\\&\quad = \int _{B_R} h_\varepsilon (|Du_\varepsilon |) Du_\varepsilon \cdot Du\,\mathrm {d}x- \int _{B_R} f\cdot (u_\varepsilon -u)\,\mathrm {d}x. \end{aligned}$$

By Young’s inequality we obtain for the first integral on the right-hand side

$$\begin{aligned} \int _{B_R} h_\varepsilon (|Du_\varepsilon |) Du_\varepsilon \cdot Du\,\mathrm {d}x&\le \int _{B_R} \big [(|Du_\varepsilon |-1)_+^{p-1} |Du| +\varepsilon |Du_\varepsilon ||Du|\big ] \,\mathrm {d}x\\&\le \tfrac{1}{4} \int _{B_R}(|Du_\varepsilon |-1)_+^{p} \,\mathrm {d}x+ c \int _{B_R} |Du| ^p \,\mathrm {d}x\\&\quad + \tfrac{\varepsilon }{2} \int _{B_R} \big (|Du_\varepsilon |^2+ |Du |^2\big ) \,\mathrm {d}x. \end{aligned}$$

The second integral on the right-hand side is estimated with Hölder’s and Sobolev’s inequality, so that

$$\begin{aligned}&\bigg |\int _{B_R} f\cdot (u_\varepsilon -u)\,\mathrm {d}x\bigg | \\&\quad \le \bigg [\int _{B_R} |f|^n\,\mathrm {d}x\bigg ]^\frac{1}{n} \bigg [\int _{B_R} |u_\varepsilon -u|^{\frac{n}{n-1}} \,\mathrm {d}x\bigg ]^{\frac{n-1}{n}} \\&\quad \le c\bigg [\int _{B_R} |f|^n\,\mathrm {d}x\bigg ]^\frac{1}{n} \int _{B_R} |Du_\varepsilon -Du| \,\mathrm {d}x\\&\quad \le c\, |B_R|^{\frac{p-1}{p}} \bigg [\int _{B_R} |f|^n\,\mathrm {d}x\bigg ]^\frac{1}{n} \bigg [\int _{B_R} \big [(|Du_\varepsilon |-1)_+^p + |Du|^p + 1\big ] \,\mathrm {d}x\bigg ]^\frac{1}{p} \\&\quad \le c\,R^n\Vert f\Vert _{L^n(B_R)}^{\frac{p}{p-1}} + \tfrac{1}{4}\int _{B_R} \big [(|Du_\varepsilon |-1)_+^p + |Du|^p + 1\big ] \,\mathrm {d}x, \end{aligned}$$

where \(c=c(n,p)\). We insert these inequalities above and reabsorb the terms containing \(Du_\varepsilon \) from the right-hand side into the left. In this way, we get

$$\begin{aligned}&\int _{B_R} \big [(|Du_\varepsilon |-1)_+^{p} + \varepsilon |Du_\varepsilon |^2\big ] \,\mathrm {d}x\\&\quad \le c\int _{B_R} \big [|Du|^p+ \varepsilon |Du|^2 + 1\big ] \mathrm {d}x+ c\,R^n\Vert f\Vert _{L^n(B_R)}^{\frac{p}{p-1}}, \end{aligned}$$

with a constant \(c=c(n,p)\). The desired uniform energy bound can easily be deduced from the preceding inequality. \(\square \)
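
The final deduction in the preceding proof can be made explicit as follows. Since \(|\xi |\le (|\xi |-1)_+ + 1\) for any \(\xi \in {\mathbb {R}}^{Nn}\), the convexity of \(t\mapsto t^p\) yields

$$\begin{aligned} |Du_\varepsilon |^p \le \big [(|Du_\varepsilon |-1)_+ + 1\big ]^p \le 2^{p-1}\big [(|Du_\varepsilon |-1)_+^p + 1\big ] , \end{aligned}$$

so that

$$\begin{aligned} \int _{B_R} |Du_\varepsilon |^p\,\mathrm {d}x\le 2^{p-1}\int _{B_R} (|Du_\varepsilon |-1)_+^p\,\mathrm {d}x+ 2^{p-1}|B_R| . \end{aligned}$$

The additional term \(2^{p-1}|B_R|\) is absorbed into the contribution of the constant \(1\) in the integral on the right-hand side of the asserted estimate.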

The next lemma ensures strong convergence of the approximating solutions, in the sense that \({\mathcal {G}}_\delta (Du_\varepsilon )\) strongly converges to \({\mathcal {G}}_\delta (Du)\) in \(L^p\).

Lemma 3.3

Let \(\delta \in (0,1]\) and \(u_\varepsilon \) with \(\varepsilon \in (0,1]\) be the unique solution of the Dirichlet problem (3.1). Then, we have

$$\begin{aligned} {\mathcal {G}}_\delta (Du_\varepsilon )\rightarrow {\mathcal {G}}_\delta (Du)\;\; \text{ in } L^p\big (B_R,{\mathbb {R}}^{Nn}\big ) \text{ as } \varepsilon \downarrow 0. \end{aligned}$$

Proof

Testing (3.2) and the weak formulation of (1.5) with \(\varphi :=u_\varepsilon -u\) we have

$$\begin{aligned} \int _{B_R} \big ( {\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_\varepsilon (Du)\big ) \cdot (Du_\varepsilon -Du)\,\mathrm {d}x= \varepsilon \int _{B_R} Du\cdot (Du-Du_\varepsilon )\,\mathrm {d}x. \end{aligned}$$

Using Lemma 2.9 and Young’s inequality we find

$$\begin{aligned}&\int _{B_R} \Big [\varepsilon ^{\frac{1}{2}} |Du_\varepsilon -Du|^2 + \big |{\mathcal {G}}_\delta (Du_\varepsilon ) -{\mathcal {G}}_\delta (Du)\big |^p \Big ]\,\mathrm {d}x\\&\quad \le \varepsilon ^{\frac{1}{2}}\int _{B_R} |Du|^2 \,\mathrm {d}x+ c\varepsilon ^{-\frac{1}{2}} \int _{B_R} \big ( {\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_\varepsilon (Du)\big ) \cdot (Du_\varepsilon -Du)\,\mathrm {d}x\\&\quad \le \varepsilon ^{\frac{1}{2}}\int _{B_R} |Du|^2 \,\mathrm {d}x+ c\varepsilon ^{\frac{1}{2}} \int _{B_R} |Du||Du_\varepsilon -Du|\,\mathrm {d}x\\&\quad \le \varepsilon ^{\frac{1}{2}} \int _{B_R} |Du_\varepsilon -Du|^2 \,\mathrm {d}x+ c \varepsilon ^{\frac{1}{2}} \int _{B_R} |Du|^2 \,\mathrm {d}x. \end{aligned}$$

Re-absorbing the first integral from the right into the left-hand side, we obtain

$$\begin{aligned} \int _{B_R}\big |{\mathcal {G}}_\delta (Du_\varepsilon ) -\mathcal G_\delta (Du)\big |^p\,\mathrm {d}x&\le c \varepsilon ^{\frac{1}{2}} \int _{B_R} |Du|^2 \,\mathrm {d}x. \end{aligned}$$

The integral on the right-hand side is finite, since \(Du\in L^\infty _{\mathrm{loc}}(\Omega , {\mathbb {R}}^{Nn})\). Therefore, the preceding inequality implies strong convergence \(\mathcal G_\delta (Du_\varepsilon )\rightarrow {\mathcal {G}}_\delta (Du)\) in \(L^p(B_R,{\mathbb {R}}^{Nn})\) as \(\varepsilon \downarrow 0\). \(\square \)
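
We briefly indicate where the identity at the beginning of the preceding proof comes from. The sketch assumes, as the computations above suggest, that \({\mathbf {A}}_\varepsilon (\xi )={\mathbf {A}}_0(\xi )+\varepsilon \xi \) and that the weak formulation of (1.5) is (3.2) with \({\mathbf {A}}_0(Du)\) in place of \({\mathbf {A}}_\varepsilon (Du_\varepsilon )\); both are read off from the surrounding text rather than quoted from it. Subtracting the two weak formulations with \(\varphi =u_\varepsilon -u\), the terms containing \(f\) cancel and we get

$$\begin{aligned} 0&= \int _{B_R} \big ({\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_0(Du)\big ) \cdot (Du_\varepsilon -Du)\,\mathrm {d}x\\&= \int _{B_R} \big ({\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_\varepsilon (Du)\big ) \cdot (Du_\varepsilon -Du)\,\mathrm {d}x+ \varepsilon \int _{B_R} Du\cdot (Du_\varepsilon -Du)\,\mathrm {d}x. \end{aligned}$$

Moving the last integral to the left-hand side yields the stated identity.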

3.2 Hölder-continuity of \({\mathcal {G}}_{\delta }(Du_\varepsilon )\)

We recall the definition of \({\mathcal {G}}_{\delta }\) from (2.3). In this subsection we will prove that \({\mathcal {G}}_{\delta }(Du_\varepsilon )\) is locally Hölder continuous in \(B_R\) for any \(\delta \in (0,1]\). This will be achieved in Theorem 3.6. Thereby, it is essential that all constants are independent of \(\varepsilon \).

The proof of Theorem 3.6 relies on the distinction between two different regimes: the degenerate and the non-degenerate regime. In the non-degenerate regime we will prove an excess-decay estimate for \({\mathcal {G}}_{\delta }(Du_\varepsilon )\) (see Proposition 3.4), while in the degenerate regime we establish a reduction of the modulus of \({\mathcal {G}}_{\delta }(Du_\varepsilon )\) (see Proposition 3.5). The precise setup is as follows. We consider a ball \(B_R\Subset \Omega \) and denote by \(u_\varepsilon \) the unique weak solution of the Dirichlet problem (3.1). For \(0<r_o<R\) we let \(r_1:=\frac{1}{2} (R+r_o)\). Then, by Lemma 3.1 and Lemma 3.2, \(Du_\varepsilon \) is bounded on \(B_{r_1}\), uniformly with respect to \(\varepsilon \). More precisely, there exists a constant

$$\begin{aligned} M=M\big (n,N,p,R-r_o,\Vert Du\Vert _{L^p(B_R)}, ...\big ) \end{aligned}$$
(3.3)

independent of \(\varepsilon >0\), such that \(\Vert Du_\varepsilon \Vert _{L^\infty (B_{r_1})}\le M\). We may assume that \(M\ge 3\). Now, we consider a center \(x_o\in B_{r_o}\) and a radius \(\varrho \le r_1\) such that \(B_{2\varrho }(x_o)\subset B_{r_1}\). On this ball we have

$$\begin{aligned} \sup _{B_{2\varrho }(x_o)} |Du_\varepsilon | \le 1+\delta +\mu \end{aligned}$$
(3.4)

for some \(\mu >0\) such that

$$\begin{aligned} 1+\delta +\mu \le M. \end{aligned}$$
(3.5)

Next, for \(\nu \in (0,1)\) we define the super-level set of \(|Du_\varepsilon |\) by

$$\begin{aligned} E_\varrho ^\nu (x_o) := \big \{x\in B_\varrho (x_o): |Du_\varepsilon (x)|-1-\delta > (1-\nu )\mu \big \}. \end{aligned}$$

The definition of the super-level set allows us to distinguish between the degenerate regime which is characterized by the measure condition \(|B_\varrho (x_o)\setminus E_\varrho ^\nu (x_o)|\ge \nu |B_\varrho (x_o)|\) and the non-degenerate regime which is characterized by the reversed inequality. Roughly speaking, in the degenerate regime the set of points with \(|Du_\varepsilon |\) small has large measure, while in the non-degenerate regime the set of points with \(|Du_\varepsilon |\) small is small in measure. We start with the latter one. In the following we abbreviate \(\beta :=\frac{\sigma }{n+\sigma }\).

Proposition 3.4

Let \(\varepsilon , \delta \in (0,1]\) and

$$\begin{aligned} 0<\delta <\mu . \end{aligned}$$
(3.6)

Then, there exist \(\nu =\nu (n,N,p,\sigma ,\Vert f\Vert _{L^{n+\sigma }(B_R)}, M,\delta )\in (0,\frac{1}{4}]\) and \({\hat{\varrho }}={\hat{\varrho }}(n,N,p,\sigma ,\Vert f\Vert _{L^{n+\sigma }(B_R)}, M,\delta )\in (0,1]\) such that the following holds: Whenever \(B_{2\varrho }(x_o)\subset B_{r_1}\) is a ball with radius \(\varrho \le {\hat{\varrho }}\) and center \(x_o\in B_{r_o}\), \(u_\varepsilon \) is the unique weak solution of the Dirichlet problem (3.1), and hypotheses (3.4) and (3.5) and the measure condition

$$\begin{aligned} \big |B_{\varrho }(x_o)\setminus E_\varrho ^\nu (x_o)\big | < \nu \big |B_{\varrho }(x_o)\big | \end{aligned}$$
(3.7)

are satisfied, then the limit

$$\begin{aligned} \Gamma _{x_o} := \lim _{r\downarrow 0} \big (\mathcal G_{2\delta }(Du_\varepsilon )\big )_{x_o,r} \end{aligned}$$
(3.8)

exists, and the excess decay estimate

(3.9)

holds true. Moreover, we have

$$\begin{aligned} |\Gamma _{x_o}| \le \mu . \end{aligned}$$

The statement for the degenerate regime is as follows.

Proposition 3.5

Let \(\varepsilon , \delta \in (0,1]\), \(\mu >0\) and \(\nu \in (0,\frac{1}{4}]\). Then, there exist constants \(\kappa =\kappa (n,p,\sigma ,\Vert f\Vert _{n+\sigma },M,\delta ,\nu )\in [2^{-\beta /2},1)\) and \(c_o=c_o(n,p,\Vert f\Vert _{n+\sigma },M,\delta ,\nu )\ge 1\) such that the following holds: Whenever \(B_{2\varrho }(x_o)\subset B_{r_1}\) is a ball with center \(x_o\in B_{r_o}\), \(u_\varepsilon \) is the unique weak solution of the Dirichlet problem (3.1), and hypotheses (3.4) and (3.5) and the measure condition

$$\begin{aligned} \big |B_{\varrho }(x_o)\setminus E_{\varrho }^\nu (x_o)\big | \ge \nu \big |B_{\varrho }(x_o)\big | \end{aligned}$$
(3.10)

are satisfied, then, either

$$\begin{aligned} \mu ^2<c_o\varrho ^{\beta }, \end{aligned}$$

or

$$\begin{aligned} \sup _{B_{\varrho /2}(x_o)}\big |{\mathcal {G}}_{\delta }(Du_\varepsilon )\big | \le \kappa \mu \end{aligned}$$

holds true.

We postpone the proofs of Propositions 3.4 and 3.5 to Chapters 4–6 and continue by formulating the main result of this subsection.

Theorem 3.6

Let \(\varepsilon ,\delta \in (0,1]\) and \(u_\varepsilon \) be the unique weak solution of the Dirichlet problem (3.1) in \(B_R\). Then, \({\mathcal {G}}_{\delta }(Du_\varepsilon )\) is Hölder continuous in \(B_{r_o}\) for any \(0<r_o<R\) with Hölder-exponent \(\alpha _\delta \in (0,1)\) and a Hölder constant \(c_\delta \) both depending on \(n,N,p,\sigma ,\Vert f\Vert _{L^{n+\sigma }(B_R)},M\) and \(\delta \).

Proof

By

$$\begin{aligned} \nu = \nu (n,N,p, \sigma , \Vert f\Vert _{L^{n+\sigma }(B_R)},M,\delta )\in (0,\tfrac{1}{4}] \end{aligned}$$

and

$$\begin{aligned} {\hat{\varrho }} = {\hat{\varrho }}(n,N,p,\sigma , \Vert f\Vert _{L^{n+\sigma }(B_R)},M,\delta ) \in (0,1] \end{aligned}$$

we denote the constants from Proposition 3.4 and by

$$\begin{aligned}&\kappa = \kappa (n,p,\sigma ,\Vert f\Vert _{n+\sigma },M,\delta ,\nu )\in [2^{-\beta /2},1),\\&c_o = c_o(n,p,\Vert f\Vert _{n+\sigma },M,\delta ,\nu )\ge 1 \end{aligned}$$

we denote the ones from Proposition 3.5. The dependence of \(\nu \) on the structural parameters implies that \(\kappa \) depends on \(n,N,p,\sigma ,\Vert f\Vert _{n+\sigma },M\) and \(\delta \). Finally, we let \(\mu =M-1-\delta \) and

$$\begin{aligned} \varrho _*= \min \bigg \{{\hat{\varrho }}, \bigg [\frac{(\kappa \mu )^2}{c_o}\bigg ]^{\frac{1}{\beta }}\bigg \}. \end{aligned}$$

We consider a ball \(B_{2\varrho } (x_o)\subset B_{r_1}\) with center \(x_o\in B_{r_o}\) and \(\varrho \le \varrho _*\) as described above. On this ball we have (3.3)–(3.5) satisfied. Our aim now is to prove that \({\mathcal {G}}_{2\delta }(Du_\varepsilon )\) is Hölder continuous in \(B_{r_o}\) with Hölder exponent

$$\begin{aligned} \alpha := -\frac{\log \kappa }{\log 2} \in \big (0,\tfrac{\beta }{2}\big ]. \end{aligned}$$

Replacing \(2\delta \) by \(\delta \), this in turn proves the claim of the theorem. We proceed in two steps.

Step 1: We prove that the limit

$$\begin{aligned} \Gamma _{x_o} := \lim _{r\downarrow 0} \big (\mathcal G_{2\delta }(Du_\varepsilon )\big )_{x_o,r} \end{aligned}$$

exists and that

(3.11)

holds true for a constant \(c=c(n,p, \sigma ,\Vert f\Vert _{n+\sigma },M,\delta )\). To this aim, we define for \(i\in {\mathbb {N}}_0\) radii

$$\begin{aligned} \varrho _i:=2^{-i} \varrho \quad \text{ and } \quad \mu _i := \kappa ^i\mu \end{aligned}$$

and observe that

$$\begin{aligned} \mu _i = \kappa ^i\mu \le 2^{-\alpha i} \mu = \Big (\frac{\varrho _{i}}{\varrho }\Big )^{\alpha } \mu \qquad \text{ for } \text{ any } i\in {\mathbb {N}}. \end{aligned}$$
(3.12)
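
For orientation, we note how (3.12) and the range of \(\alpha \) follow from the definitions. By the choice of \(\alpha \) we have \(\kappa =2^{-\alpha }\), so that (3.12) in fact holds with equality:

$$\begin{aligned} \kappa ^i = 2^{-\alpha i} = \big (2^{-i}\big )^{\alpha } = \Big (\frac{\varrho _i}{\varrho }\Big )^{\alpha }. \end{aligned}$$

Moreover, the condition \(\kappa \in [2^{-\beta /2},1)\) can be rewritten as

$$\begin{aligned} -\tfrac{\beta }{2}\log 2 \le \log \kappa < 0 \qquad \Longleftrightarrow \qquad 0< \alpha = -\frac{\log \kappa }{\log 2} \le \tfrac{\beta }{2}. \end{aligned}$$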

Now, suppose that assumption (3.10) holds on \(B_\varrho (x_o)\). Then, Proposition 3.5 yields that either \(\mu ^2< c_o \varrho ^\beta \) or

$$\begin{aligned} \sup _{B_{\varrho _1}(x_o)}|{\mathcal {G}}_{\delta }(Du_\varepsilon )| \le \kappa \mu = \mu _1. \end{aligned}$$

Note that the first alternative cannot happen, since it would imply

$$\begin{aligned} \mu ^2 < c_o\varrho ^{\beta } \le c_o\varrho _*^{\beta } \le \kappa ^2\mu ^2. \end{aligned}$$

Hence, we conclude that (3.4) holds on \(B_{\varrho _1}(x_o)\) with \(\mu =\mu _1\). If the measure condition (3.10) is satisfied with \(\varrho =\varrho _1\) and \(\mu =\mu _1\), then a second application of Proposition 3.5 yields that either \(\mu _1^2< c_o \varrho _1^\beta \) or

$$\begin{aligned} \sup _{B_{\varrho _2}(x_o)}|{\mathcal {G}}_{\delta }(Du_\varepsilon )| \le \kappa \mu _1 = \mu _2. \end{aligned}$$

As before, the first alternative cannot happen, since it would imply

$$\begin{aligned} \mu _1^2 < c_o\varrho _{1}^{\beta } \le 2^{-\beta }(\kappa \mu )^2 \le \kappa ^2\mu _1^2. \end{aligned}$$
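
The same computation excludes the first alternative at every later stage. As a sketch, using only \(\varrho \le \varrho _*\), the definition of \(\varrho _*\) and the lower bound \(\kappa ^2\ge 2^{-\beta }\), we have for any \(i\in {\mathbb {N}}_0\)

$$\begin{aligned} c_o\varrho _i^{\beta } = 2^{-i\beta }c_o\varrho ^{\beta } \le 2^{-i\beta }c_o\varrho _*^{\beta } \le 2^{-i\beta }(\kappa \mu )^2 \le \kappa ^{2i}\kappa ^2\mu ^2 = \kappa ^2\mu _i^2 < \mu _i^2 . \end{aligned}$$

Hence \(\mu _i^2<c_o\varrho _i^\beta \) is impossible, and Proposition 3.5 always yields the second alternative on \(B_{\varrho _i}(x_o)\).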

Assume now that (3.10) is satisfied for \(i=1,\dots ,i_o-1\) up to some \(i_o\in {\mathbb {N}}\), i.e. that (3.10) holds true on the balls \(B_{\varrho _i}(x_o)\) with \(\mu =\mu _i\). Then, we iteratively conclude that

$$\begin{aligned} \sup _{B_{\varrho _i}(x_o)}|{\mathcal {G}}_{\delta }(Du_\varepsilon )| \le \mu _i, \quad \text{ for } i=0,\dots ,i_o. \end{aligned}$$
(3.13)

Now assume that (3.10) fails to hold for some \(i_o\in {\mathbb {N}}_0\). If \(\mu _{i_o}>\delta \), the hypotheses of Proposition 3.4 are satisfied on \(B_{\varrho _{i_o}}(x_o)\) and we conclude that the limit

$$\begin{aligned} \Gamma _{x_o} := \lim _{r\downarrow 0} \big (\mathcal G_{2\delta }(Du_\varepsilon )\big )_{x_o,r} \end{aligned}$$

exists and that

Moreover, we have

$$\begin{aligned} |\Gamma _{x_o}| \le \mu _{i_o}. \end{aligned}$$
(3.14)

Therefore, we obtain from the preceding inequality and (3.12) that

holds true for any \(0<r\le \varrho _{i_o}\). For a radius \(r\in (\varrho _{i_o},\varrho ]\) there exists \(i\in \{0,\dots ,i_o\}\) such that \(\varrho _{i+1}<r\le \varrho _i\). Using (3.13), (3.14) and (3.12) we obtain

Combining the preceding two inequalities, we have shown (3.11) provided \(\mu _{i_o}>\delta \).

In the case \(\mu _{i_o}\le \delta \), we have \({\mathcal {G}}_{2\delta }(Du_\varepsilon )=0\) on \(B_{\varrho _{i_o}}(x_o)\). Combining this with (3.13) and keeping in mind that \(|{\mathcal {G}}_{2\delta }(Du_\varepsilon )|\le |{\mathcal {G}}_{\delta }(Du_\varepsilon )|\), we obtain

$$\begin{aligned} \sup _{B_{\varrho _i}(x_o)}\big |{\mathcal {G}}_{2\delta }(Du_\varepsilon )\big | \le \mu _i, \quad \text{ for } \text{ any } i\in \{0,1,\dots \}. \end{aligned}$$
(3.15)

In the final case, when (3.10) holds for any \(i\in {\mathbb {N}}\), estimate (3.13) is satisfied for any \(i\in {\mathbb {N}}\) and hence we obtain (3.15) also in this case. Estimate (3.15), however, implies

$$\begin{aligned} \Gamma _{x_o} := \lim _{r\downarrow 0} \big (\mathcal G_{2\delta }(Du_\varepsilon )\big )_{x_o,r} = 0. \end{aligned}$$

For \(r\in (0,\varrho ]\) we find \(i\in \{0,1,\dots \}\) such that \(\varrho _{i+1}<r\le \varrho _i\). Then, (3.15) and (3.12) imply

This establishes (3.11) in the remaining cases.

Step 2: Here, we prove that the Lebesgue representative \(x\mapsto \Gamma _x\) of \({\mathcal {G}}_{2\delta }(Du_\varepsilon )\) is Hölder continuous in \(B_{r_o}\). The proof is standard once the excess decay (3.11) is established. For the reader's convenience, we give the details. We consider \(x_1,x_2\in B_{r_o}\). If \(r:=|x_1-x_2|\le \varrho _*\), we define \({\tilde{x}}:=\frac{1}{2}(x_1+x_2)\) and obtain from Step 1 that

This can be re-written in the form

$$\begin{aligned} \big |\Gamma _{x_1} - \Gamma _{x_2}\big | \le c\,\bigg (\frac{|x_1-x_2|}{\varrho _*}\bigg )^{\alpha } \mu . \end{aligned}$$

Otherwise, if \(r=|x_1-x_2|>\varrho _*\), we trivially have

$$\begin{aligned} \big |\Gamma _{x_1} - \Gamma _{x_2}\big | \le 2\mu \le 2\,\bigg (\frac{|x_1-x_2|}{\varrho _*}\bigg )^{\alpha } \mu . \end{aligned}$$

Together, this establishes that the Lebesgue representative \(x\mapsto \Gamma _x\) of \({\mathcal {G}}_{2\delta }(Du_\varepsilon )\) is Hölder continuous in \(B_{r_o}\) with Hölder exponent \(\alpha \). Note that \(\alpha \) admits the same dependencies as \(\kappa \), i.e. \(\alpha =\alpha (n,N,p,\sigma ,\Vert f\Vert _{n+\sigma },M,\delta )\). This finishes the proof of the theorem. \(\square \)

3.3 Continuity of \({\mathcal {G}}(Du)\)

In this subsection it is important that all estimates are independent of \(\varepsilon \). More precisely, constants might depend on \(\delta \), but are independent of \(\varepsilon \).

Proof of Theorem 1.1

We let \(\varepsilon \in (0,1]\) and consider a fixed ball \(B_{R}\equiv B_{R}(y_o)\Subset \Omega \). By \(u_\varepsilon \) we denote the weak solution to (3.1) constructed in Sect. 3.1. Next, we fix \(\delta \in (0,1]\) and \(r\in (0,R)\). From Theorem 3.6 we know that \({\mathcal {G}}_{\delta }(Du_\varepsilon )\) is Hölder continuous in \({\overline{B}}_r\) with Hölder exponent \(\alpha _\delta \in (0,1)\) and constant \(c_\delta \), both depending at most on \(n,N,p,\sigma ,\Vert f\Vert _{n+\sigma },M\), and \(\delta \). From Lemma 3.3 we know that \(\mathcal G_\delta (Du_\varepsilon )\rightarrow {\mathcal {G}}_\delta (Du)\) in \(L^p(B_R,{\mathbb {R}}^{Nn})\) as \(\varepsilon \downarrow 0\). This implies that there exists a subsequence \(\varepsilon _i\downarrow 0\) as \(i\rightarrow \infty \) such that \({\mathcal {G}}_\delta (Du_{\varepsilon _i})\rightarrow \mathcal G_\delta (Du)\) a.e. in \(B_R\). On the other hand, since the Hölder bounds are independent of \(\varepsilon \), the Arzelà–Ascoli theorem yields that, after passing to a further subsequence if necessary, \({\mathcal {G}}_\delta (Du_{\varepsilon _i})\) converges uniformly on compact subsets of \(B_R\). Therefore, the limit \({\mathcal {G}}_\delta (Du)\) is Hölder continuous in \({\overline{B}}_r\) with Hölder exponent \(\alpha _\delta \in (0,1)\) and constant \(c_\delta \). In particular, \({\mathcal {G}}_\delta (Du)\) is continuous in \({\overline{B}}_r\) for any \(\delta \in (0,1]\). Moreover, we have \({\mathcal {G}}_\delta (Du)\rightarrow {\mathcal {G}}(Du)\) uniformly in \(\overline{B}_r\) as \(\delta \downarrow 0\). Indeed,

$$\begin{aligned} \big |{\mathcal {G}}_\delta (Du) - {\mathcal {G}}(Du)\big |&= \bigg |(|Du|-1-\delta )_+\frac{Du}{|Du|} - (|Du|-1)_+\frac{Du}{|Du|}\bigg | \\&= \big |(|Du|-1-\delta )_+ - (|Du|-1)_+\big | \le \delta \end{aligned}$$

in \({\overline{B}}_r\). As the uniform limit of a sequence of continuous functions, \({\mathcal {G}}(Du)\) itself is continuous on \({\overline{B}}_r\). Observe that \({\mathcal {G}}(Du)\) is also uniformly continuous on \({\overline{B}}_r\).
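For completeness, the bound by \(\delta \) in the display above can be checked by cases. Writing \(s:=|Du|\), a short verification (our own elaboration, not part of the original argument) reads

$$\begin{aligned} (s-1)_+ - (s-1-\delta )_+ = \left\{ \begin{array}{ll} 0, & \text{ if } s\le 1,\\ s-1, & \text{ if } 1<s\le 1+\delta ,\\ \delta , & \text{ if } s>1+\delta , \end{array} \right. \end{aligned}$$

so that \(0\le (s-1)_+-(s-1-\delta )_+\le \delta \) for every \(s\ge 0\). The bound is therefore uniform with respect to the point in \({\overline{B}}_r\).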

Now, let \({\mathcal {K}}:{\mathbb {R}}^{Nn}\rightarrow {\mathbb {R}}\) be any continuous function vanishing on \(\{\xi \in {\mathbb {R}}^{Nn} : |\xi |\le 1\}\). Since \(u\in W^{1,\infty }_{\mathrm{loc}} (\Omega ,{\mathbb {R}}^N)\), we find \(M>0\) such that \(|Du|\le M\) on \({\overline{B}}_r\). By \(\omega :{\mathbb {R}}_+\rightarrow {\mathbb {R}}_+\) we denote the modulus of continuity of \({\mathcal {K}}\) on \(\{\xi \in {\mathbb {R}}^{Nn} : |\xi |\le M\}\), i.e. for any \(\xi ,\zeta \in {\mathbb {R}}^{Nn}\) with \(|\xi |,|\zeta |\le M\) we have \(|{\mathcal {K}}(\xi )-{\mathcal {K}}(\zeta )|\le \omega (|\xi -\zeta |)\). Next, given \(\varepsilon \in (0,1)\) we choose \(\delta >0\) such that

$$\begin{aligned} \big |{\mathcal {G}}(Du(x)) - {\mathcal {G}}(Du(y))\big |< \varepsilon \qquad \text{ for } \text{ any } x,y\in {\overline{B}}_r \hbox { with } |x-y|< \delta . \end{aligned}$$
(3.16)

We now distinguish two cases. First, we assume \(|Du(x)|\le 1+\sqrt{\varepsilon }\). If \(|Du(x)|\ge 1\) we use \({\mathcal {K}}=0\) on \(\{|\xi |\le 1\}\) to conclude

$$\begin{aligned} |{\mathcal {K}}(Du(x))| = \bigg |{\mathcal {K}}(Du(x)) - \mathcal K\Big (\frac{Du(x)}{|Du(x)|}\Big )\bigg | \le \omega \bigg ( \Big |Du(x) - \frac{Du(x)}{|Du(x)|}\Big | \bigg ) \le \omega \big (\sqrt{\varepsilon }\big ). \end{aligned}$$

The preceding estimate trivially holds if \(|Du(x)|\le 1\). Moreover, by (3.16) we have

$$\begin{aligned} \big (|Du(y)|-1\big )_+&= \big |{\mathcal {G}}(Du(y))\big | \le \big |{\mathcal {G}}(Du(y)) - {\mathcal {G}}(Du(x))\big | + \big |{\mathcal {G}}(Du(x))\big |\\&\le \varepsilon + \big (|Du(x)|-1\big )_+ \le \varepsilon +\sqrt{\varepsilon }\le 2\sqrt{\varepsilon }. \end{aligned}$$

This implies \(|Du(y)|\le 1+2\sqrt{\varepsilon }\). Similarly as above we conclude

$$\begin{aligned} |{\mathcal {K}}(Du(y))| \le \omega \big (2\sqrt{\varepsilon }\big ). \end{aligned}$$

Combining the estimates from above, we end up with

$$\begin{aligned} \big |{\mathcal {K}}(Du(x)) - {\mathcal {K}}(Du(y))\big | \le \omega \big (\sqrt{\varepsilon }\big ) + \omega \big (2\sqrt{\varepsilon }\big ) \le 2\omega \big (2\sqrt{\varepsilon }\big ). \end{aligned}$$

Now, we consider the case \(|Du(x)|> 1+\sqrt{\varepsilon }\). Here, Lemma 2.3 and (3.16) imply

$$\begin{aligned} |Du(x) - Du(y)| \le \big (1+ \tfrac{2}{\sqrt{\varepsilon }}\big )\, \big |{\mathcal {G}}(Du(x))-{\mathcal {G}}(Du(y))\big | \le \varepsilon + 2\sqrt{\varepsilon }\le 3\sqrt{\varepsilon }\, , \end{aligned}$$

proving

$$\begin{aligned} \big |{\mathcal {K}}(Du(x)) - {\mathcal {K}}(Du(y))\big | \le \omega \big (3\sqrt{\varepsilon }\big ). \end{aligned}$$
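As a bookkeeping remark of our own, the two cases can be merged into a single estimate. We assume here that \(\omega \) is non-decreasing, which is implicit in the step \(\omega (\sqrt{\varepsilon })+\omega (2\sqrt{\varepsilon })\le 2\omega (2\sqrt{\varepsilon })\) used above, and we recall that \(\varepsilon \le \sqrt{\varepsilon }\) for \(\varepsilon \in (0,1)\). Then

$$\begin{aligned} \big |{\mathcal {K}}(Du(x)) - {\mathcal {K}}(Du(y))\big | \le \max \big \{ 2\omega \big (2\sqrt{\varepsilon }\big ), \omega \big (3\sqrt{\varepsilon }\big )\big \} \le 2\omega \big (3\sqrt{\varepsilon }\big ) \qquad \text{ for } \text{ any } x,y\in {\overline{B}}_r \hbox { with } |x-y|<\delta , \end{aligned}$$

and the right-hand side tends to zero as \(\varepsilon \downarrow 0\), since \({\mathcal {K}}\) is uniformly continuous on the compact set \(\{|\xi |\le M\}\).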

Hence, \({\mathcal {K}}(Du)\) is continuous on \({\overline{B}}_r\). Since \(B_r\Subset B_R\Subset \Omega \) were arbitrary, we have shown that \({\mathcal {K}}(Du)\) is continuous in \(\Omega \). This completes the proof of the theorem. \(\square \)

As mentioned above, this finishes the proof of our main result, Theorem 1.1, under the assumption that Propositions 3.4 and 3.5 hold. The rest of the paper is devoted to the proofs of these two propositions.

4 Conclusions from the differentiated system

4.1 The main integral inequality for second derivatives

Throughout this subsection we assume as a general requirement that \(u_\varepsilon :B_R\rightarrow {\mathbb {R}}^N\) is a weak solution to the regularized system (3.1). Instead of \(u_\varepsilon \), we write \(u\) for the sake of simplicity. In contrast, we will continue to use the subscript \(\varepsilon \) in the notation for the coefficient \(h_\varepsilon \) and its associated bilinear forms, such as \(\varvec{{\mathcal {C}}}_\varepsilon \). We recall that the bilinear forms have been defined in Sect. 2.3.

For some index \(\beta =1,\dots ,n\) we differentiate the regularized system (3.1) with respect to \(x_\beta \) and obtain

$$\begin{aligned} \int _{B_R} D_\beta \big [h_\varepsilon (|Du|)Du\big ] \cdot D\varphi \, \mathrm {d}x= \int _{B_R} f\cdot D_\beta \varphi \,\mathrm {d}x, \end{aligned}$$
(4.1)

for any \(\varphi \in W^{1,p}_0(B_R,{\mathbb {R}}^N)\). We have

$$\begin{aligned} D_\beta \big [h_\varepsilon (|Du|)D_\alpha u^i\big ]&= h_\varepsilon (|Du|)D_\alpha D_\beta u^i + h'_\varepsilon (|Du|)|Du|\frac{D_\alpha u^i D_\gamma u^j u^j_{x_\gamma x_{\beta }}}{|Du|^2}\\&= \bigg [ h_\varepsilon (|Du|)\delta ^{ij}\delta _{\gamma \alpha } + h'_\varepsilon (|Du|)|Du|\frac{D_\alpha u^i D_\gamma u^j}{|Du|^2}\bigg ] u^j_{x_\gamma x_\beta }. \end{aligned}$$
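The second term in the first line stems from the chain rule. As a brief sketch (valid at points where \(Du\ne 0\), with summation over the repeated indices \(\gamma \) and \(j\) understood), one has

$$\begin{aligned} D_\beta |Du|&= D_\beta \big (D_\gamma u^j D_\gamma u^j\big )^{\frac{1}{2}} = \frac{D_\gamma u^j\, u^j_{x_\gamma x_\beta }}{|Du|}, \\ D_\beta \big [h_\varepsilon (|Du|)\big ]\, D_\alpha u^i&= h'_\varepsilon (|Du|)\, \frac{D_\alpha u^i D_\gamma u^j\, u^j_{x_\gamma x_\beta }}{|Du|}, \end{aligned}$$

which coincides with the expression above after multiplying and dividing by \(|Du|\).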

In (4.1) we choose the testing function \(\varphi =\zeta \phi (|Du|)D_\beta u\), where \(\zeta \in C^1_0(B_R)\) is non-negative and \(\phi \in W^{1,\infty }_{\text {loc}}({\mathbb {R}}_{\ge 0},{\mathbb {R}}_{\ge 0})\) is non-decreasing. Note that

$$\begin{aligned} D_\alpha \varphi&= \zeta \phi (|Du|) D_\alpha D_\beta u + \zeta \phi '(|Du|)D_\alpha |Du| D_\beta u + D_\alpha \zeta \phi (|Du|) D_\beta u. \end{aligned}$$

The resulting equations are then summed with respect to \(\beta \) from 1 to n. This leads to

$$\begin{aligned}&\int _{B_R} \Big [h_\varepsilon (|Du|) |D^2u|^2 + h_\varepsilon '(|Du|)|Du| \big | \nabla |Du|\big |^2 \Big ] \phi (|Du|)\zeta \, \mathrm {d}x\\&\qquad + \int _{B_R} \Big [h_\varepsilon (|Du|)|Du| \big |\nabla |Du|\big |^2 + h_\varepsilon '(|Du|) \big |Du\nabla |Du|\big |^2 \Big ] \phi '(|Du|) \zeta \,\mathrm {d}x\\&\qquad + \int _{B_R} \Big [\underbrace{ h_\varepsilon (|Du|)|Du| \nabla |Du|\cdot \nabla \zeta + h_\varepsilon '(|Du|) Du\nabla |Du|\cdot Du\nabla \zeta }_{=\,\varvec{{\mathcal {C}}}_\varepsilon (Du)(\nabla |Du|,\nabla \zeta )|Du|} \Big ] \phi (|Du|)\,\mathrm {d}x\\&\quad = \int _{B_R} f \cdot D_\beta \big [ \zeta \phi (|Du|)D_\beta u\big ] \,\mathrm {d}x. \end{aligned}$$

Now, we compute the right-hand side.

$$\begin{aligned}&\int _{B_R} f \cdot D_\beta \big [ \zeta \phi (|Du|)D_\beta u\big ] \,\mathrm {d}x\\&\quad = \int _{B_R} f \cdot D_\beta D_\beta u\, \phi (|Du|) \zeta \,\mathrm {d}x+ \int _{B_R} f \cdot \phi '(|Du|) Du\nabla |Du| \zeta \,\mathrm {d}x\\&\qquad + \int _{B_R} f \cdot \phi (|Du|) D u \nabla \zeta \,\mathrm {d}x\\&\quad = {\mathbf {R}}_1 + {\mathbf {R}}_2 + {\mathbf {R}}_3, \end{aligned}$$

with the obvious meaning of \({\mathbf {R}}_i\). For the first term, we have by Young’s inequality

$$\begin{aligned} {\mathbf {R}}_1 \le \tau \int _{B_R} h_\varepsilon (|Du|) |D^2u|^2 \phi (|Du|) \zeta \,\mathrm {d}x+ \tfrac{1}{\tau } \int _{B_R} \frac{|f|^2 \phi (|Du|)}{h_\varepsilon (|Du|)} \zeta \,\mathrm {d}x\end{aligned}$$

for any \(\tau \in (0,1)\). Similarly, we get

$$\begin{aligned} {\mathbf {R}}_2 \le \tau \int _{B_R} h_\varepsilon (|Du|) |Du|\big |\nabla |Du|\big |^2 \phi '(|Du|)\zeta \,\mathrm {d}x+ \tfrac{1}{\tau }\int _{B_R} \frac{|f|^2 \phi '(|Du|)|Du|}{h_\varepsilon (|Du|)} \zeta \,\mathrm {d}x\end{aligned}$$

and

$$\begin{aligned} {\mathbf {R}}_3 \le \int _{B_R} |f| \phi (|Du|) |Du| |\nabla \zeta | \,\mathrm {d}x. \end{aligned}$$

Inserting this above and re-absorbing the terms containing second derivatives from the right-hand side into the left, we obtain

$$\begin{aligned}&\int _{B_R} \Big [(1-\tau )h_\varepsilon (|Du|) |D^2u|^2 + h_\varepsilon '(|Du|) |Du| \big | \nabla |Du|\big |^2\Big ] \phi (|Du|)\zeta \, \mathrm {d}x\\&\qquad + \int _{B_R} \bigg [(1-\tau )h_\varepsilon (|Du|) \big |\nabla |Du|\big |^2 + h_\varepsilon '(|Du|)|Du| \frac{\big |Du \nabla |Du|\big |^2}{|Du|^2} \bigg ] \phi '(|Du|)|Du| \zeta \,\mathrm {d}x\\&\qquad + \int _{B_R} \varvec{{\mathcal {C}}}_\varepsilon (Du)\big (\nabla |Du|,\nabla \zeta \big ) \phi (|Du|)|Du|\,\mathrm {d}x\\&\quad \le \tfrac{1}{\tau } \int _{B_R} |f|^2 \bigg [\frac{\phi (|Du|)}{h_\varepsilon (|Du|)} + \frac{\phi '(|Du|)|Du|}{h_\varepsilon (|Du|)}\bigg ] \zeta \,\mathrm {d}x+ \int _{B_R} |f| \phi (|Du|) |Du| |D\zeta | \,\mathrm {d}x\end{aligned}$$

for any non-negative function \(\zeta \in C^1_0(B_R)\). In the preceding inequality the parameter \(\tau \) is at our disposal. We choose \(\tau =\frac{1}{2}\min \{1,p-1\}\). For the first factor, i.e. the term \([\dots ]\), in the integrand of the first integral on the left-hand side we have

$$\begin{aligned}&(1-\tau )h_\varepsilon (|Du|) |D^2u|^2 + h_\varepsilon '(|Du|) |Du| \big | \nabla |Du|\big |^2 - \tfrac{1}{2} \varvec{{\mathcal {A}}}_\varepsilon (Du)\big (D^2u,D^2u\big )\\&\quad = \tfrac{1}{2}\max \{0,2-p\} h_\varepsilon (|Du|) |D^2u|^2 + \tfrac{1}{2} h_\varepsilon '(|Du|) |Du| \big | D|Du|\big |^2 \ge 0. \end{aligned}$$

In fact, if \(h_\varepsilon '(|Du|)\ge 0\) the inequality is obvious. If otherwise \(h_\varepsilon '(|Du|)< 0\), which can only happen if \(p<2\) and \(|Du|>1\), the result follows by an application of Kato’s inequality and (2.4). Indeed

$$\begin{aligned}&\tfrac{1}{2}\max \{0,2-p\} h_\varepsilon (|Du|) |D^2u|^2 + \tfrac{1}{2} h_\varepsilon '(|Du|) |Du| \big | \nabla |Du|\big |^2 \\&\quad \ge \tfrac{1}{2} \Big [(2-p)h (|Du|) + h'(|Du|) |Du| \Big ]|D^2u|^2 \\&\quad = \tfrac{1}{2} \bigg [ (2-p)\frac{(|Du|-1)_+^{p-1}}{|Du|} + \frac{(|Du|-1)_+^{p-2}[(p-2)|Du|+1]}{|Du|} \bigg ]|D^2u|^2 \\&\quad = \frac{(|Du|-1)_+^{p-2}}{2|Du|} \Big [ (2-p)(|Du|-1) + (p-2)|Du|+1 \Big ]|D^2u|^2 \\&\quad = (p-1)\frac{(|Du|-1)_+^{p-2}}{2|Du|} |D^2u|^2 \ge 0. \end{aligned}$$
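The expression for \(h'(t)\,t\) used in the third line can be verified directly. Assuming, as that line suggests, that \(h(t)=\frac{(t-1)_+^{p-1}}{t}\) (the precise definition is given earlier in the paper), one computes for \(t>1\)

$$\begin{aligned} h'(t)&= \frac{(p-1)(t-1)^{p-2}\,t - (t-1)^{p-1}}{t^2} = \frac{(t-1)^{p-2}\big [(p-2)t+1\big ]}{t^2}, \\ h(t) + h'(t)\,t&= \frac{(t-1)^{p-2}\big [(t-1) + (p-2)t+1\big ]}{t} = (p-1)(t-1)^{p-2}. \end{aligned}$$

In particular, \(h'(t)<0\) requires \((p-2)t+1<0\), which is only possible for \(p<2\); this is consistent with the case distinction made above.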

For the term in brackets of the second integral on the left-hand side a similar computation applies. The result of the calculation is

$$\begin{aligned}&(1-\tau )h_\varepsilon (|Du|) \big |D|Du|\big |^2 + h_\varepsilon '(|Du|)|Du| \frac{\big |Du \nabla |Du|\big |^2}{|Du|^2} \\&\quad - \tfrac{1}{2} \varvec{{\mathcal {C}}}_\varepsilon (Du)\big (\nabla |Du|,\nabla |Du|\big ) \ge 0. \end{aligned}$$

Using the last two inequalities, we obtain an inequality which can be interpreted in two ways. On the one hand, it can be seen as an energy inequality for the second derivatives of \(u\). On the other hand, after discarding the two non-negative terms containing second derivatives on the left-hand side, it implies that \(|Du|\) is a subsolution of an elliptic equation with measurable coefficients.

Lemma 4.1

Let \(\varepsilon \in (0,1]\) and \(u=u_\varepsilon \) a weak solution to the regularized system (3.1) on \(B_R\). Then, for any non-decreasing function \(\phi \in W^{1,\infty }_{\text {loc}}({\mathbb {R}}_{\ge 0},{\mathbb {R}}_{\ge 0})\) and any non-negative testing function \(\zeta \in C^1_0(B_R)\) we have

$$\begin{aligned}&\int _{B_R}\Big [ \varvec{{\mathcal {A}}}_\varepsilon (Du)\big (D^2u,D^2u\big ) \phi (|Du|) + \varvec{\mathcal C}_\varepsilon (Du) \big (\nabla |Du|,\nabla |Du|\big ) \phi '(|Du|)|Du|\Big ] \zeta \,\mathrm {d}x\\&\qquad + 2 \int _{B_R} \varvec{{\mathcal {C}}}_\varepsilon (Du)\big (\nabla |Du|,\nabla \zeta \big ) \phi (|Du|)|Du| \,\mathrm {d}x\\&\quad \le c\int _{B_R} |f|^2 \bigg [\frac{\phi (|Du|)}{h_\varepsilon (|Du|)} + \frac{\phi '(|Du|)|Du|}{h_\varepsilon (|Du|)}\bigg ] \zeta \,\mathrm {d}x+ 2 \int _{B_R} |f| \phi (|Du|) |Du| |\nabla \zeta | \,\mathrm {d}x, \end{aligned}$$

where \(c:=\frac{4}{\min \{1,p-1\}}\).\(\Box \)
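The value of the constant can be traced as follows; this is a bookkeeping remark on our part. With the choice \(\tau =\frac{1}{2}\min \{1,p-1\}\), the inequality preceding the lemma carries the factor \(\frac{1}{\tau }\) in front of the integral containing \(|f|^2\), while the forms \(\varvec{{\mathcal {A}}}_\varepsilon \) and \(\varvec{{\mathcal {C}}}_\varepsilon \) appear with the factor \(\frac{1}{2}\) on the left-hand side. Multiplying the whole inequality by 2 therefore gives

$$\begin{aligned} c = \frac{2}{\tau } = \frac{4}{\min \{1,p-1\}}, \end{aligned}$$

and explains the factor 2 in front of the two integrals containing \(\nabla \zeta \).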

4.2 Subsolution to an elliptic equation

We start by showing that \((|Du_\varepsilon |-1-\delta )_+^2\) is a sub-solution of a certain elliptic equation. More precisely

Lemma 4.2

Let \(\varepsilon \in (0,1]\) and \(u_\varepsilon \in W^{1,p}(B_R,{\mathbb {R}}^N)\) be a weak solution of the regularized system (3.1) satisfying (3.4) and (3.5) on \(B_\varrho (x_o)\subset B_{r_o}\Subset B_R\). Then, the function

$$\begin{aligned} U_\varepsilon := \big (|Du_\varepsilon |-1-\delta \big )_+^2 \end{aligned}$$
(4.2)

is a sub-solution of a linear elliptic equation on \(B_\varrho (x_o)\), in the sense that

$$\begin{aligned} \int _{B_\varrho (x_o)} A_{\alpha \beta } D_\alpha U_\varepsilon D_\beta \zeta \,\mathrm {d}x\le c \int _{B_\varrho (x_o)} |f|^2 \zeta \,\mathrm {d}x+ c\, \mu \int _{B_\varrho (x_o)} |f| |D\zeta | \,\mathrm {d}x, \end{aligned}$$
(4.3)

holds true for any non-negative test function \(\zeta \in C^1_0(B_\varrho (x_o))\) and with a universal constant \(c=c(p,M,\delta )\). The coefficients \(A_{\alpha \beta }\) are given by

$$\begin{aligned} A_{\alpha \beta } \eta _\alpha \zeta _\beta = |Du_\varepsilon |\,\varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon )(\eta ,\zeta ) \qquad \text{ for } \eta ,\zeta \in {\mathbb {R}}^n, \end{aligned}$$

where \(\varvec{{\mathcal {C}}}_\varepsilon \) is the bilinear form defined in (2.8).

Proof

We apply Lemma 4.1 with \(\phi (t)=(t-1-\delta )_+\). Due to Lemma 2.7 the first two integrals on the left-hand side are non-negative and therefore can be discarded. In this way, we obtain with \(c=\frac{4}{\min \{1,p-1\}}\) that

$$\begin{aligned} {\mathbf {L}}&:= \int _{B_\varrho (x_o)} |Du_\varepsilon |\,\varvec{\mathcal C}_\varepsilon (Du_\varepsilon )\big (\nabla |Du_\varepsilon |,\nabla \zeta \big ) \phi (|Du_\varepsilon |) \,\mathrm {d}x\nonumber \\&\,\le c\int _{B_\varrho (x_o)} |f|^2 \bigg [\frac{\phi (|Du_\varepsilon |)}{h_\varepsilon (|Du_\varepsilon |)} + \frac{\phi '(|Du_\varepsilon |)|Du_\varepsilon |}{h_\varepsilon (|Du_\varepsilon |)}\bigg ] \zeta \,\mathrm {d}x\nonumber \\&\quad + \int _{B_\varrho (x_o)} |f| \phi (|Du_\varepsilon |) |Du_\varepsilon | |D\zeta | \,\mathrm {d}x=:{\mathbf {R}} \end{aligned}$$
(4.4)

holds true for any non-negative \(\zeta \in C^1_0(B_\varrho (x_o))\). To proceed further we compute

$$\begin{aligned} \frac{\phi (t)}{h_\varepsilon (t)} \le \frac{\phi (t)}{h(t)} = \frac{t(t-1-\delta )_+}{(t-1)^{p-1}} \le \frac{t^2}{\delta ^{p-1}} \end{aligned}$$

and

$$\begin{aligned} \frac{\phi '(t)t}{h_\varepsilon (t)} \le \frac{\phi '(t)t}{h(t)} = \frac{\varvec{\chi }_{\{ t>1+\delta \}}t^2}{(t-1)^{p-1}} \le \frac{t^2}{\delta ^{p-1}}. \end{aligned}$$

The above inequalities together with the general assumptions (3.4), (3.5) allow us to estimate

$$\begin{aligned} \frac{\phi (|Du_\varepsilon |)}{h_\varepsilon (|Du_\varepsilon |)} + \frac{\phi '(|Du_\varepsilon |)|Du_\varepsilon |}{h_\varepsilon (|Du_\varepsilon |)} \le \frac{2M^2}{\delta ^{p-1}}. \end{aligned}$$

Moreover, we have \(\phi (|Du_\varepsilon |) |Du_\varepsilon |\le M\mu \). In this way, we obtain for the right-hand side in (4.4) the estimate

$$\begin{aligned} {\mathbf {R}}&\le c \int _{B_\varrho (x_o)} |f|^2 \zeta \,\mathrm {d}x+ c\, \mu \int _{B_\varrho (x_o)} |f| |D\zeta | \,\mathrm {d}x, \end{aligned}$$

where \(c=c(p,M,\delta )\). Now, we consider the left-hand side in (4.4). Observe that \(\nabla U_\varepsilon =2\phi (|Du_\varepsilon |)\nabla |Du_\varepsilon |\), so that by the linearity of \(\varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon )\) with respect to the first variable we have

$$\begin{aligned} \varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon ) \big (\nabla |Du_\varepsilon |,\nabla \zeta \big ) \phi (|Du_\varepsilon |)&= \tfrac{1}{2} \varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon ) \big (\nabla U_\varepsilon ,\nabla \zeta \big ). \end{aligned}$$

For the left-hand side this has the consequence that

$$\begin{aligned} {\mathbf {L}}&= \tfrac{1}{2} \int _{B_\varrho (x_o)} |Du_\varepsilon | \varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon ) \big (\nabla U_\varepsilon ,\nabla \zeta \big )\, \mathrm {d}x= \tfrac{1}{2} \int _{B_\varrho (x_o)} A_{\alpha \beta }(x)D_\alpha U_\varepsilon D_\beta \zeta \,\mathrm {d}x\end{aligned}$$

holds true. Here, we have taken into account the definition of the coefficients \(A_{\alpha \beta }\). Altogether we have shown the claim (4.3). \(\square \)

The coefficients \(A_{\alpha \beta }\) in (4.3) can be explicitly written as

$$\begin{aligned} A_{\alpha \beta } := h_\varepsilon (|Du_\varepsilon |) |Du_\varepsilon |\bigg [ \delta _{\alpha \beta } + \frac{h_\varepsilon '(|Du_\varepsilon |)D_\alpha u_\varepsilon \cdot D_\beta u_\varepsilon }{h_\varepsilon (|Du_\varepsilon |)|Du_\varepsilon |}\bigg ]. \end{aligned}$$

They are only degenerate elliptic due to the factor \(h_\varepsilon (|Du_\varepsilon |)|Du_\varepsilon |\) which, for \(\varepsilon =0\), vanishes on the set \(\{|Du|\le 1\}\). On the other hand \(U_\varepsilon \) has its support in the set \(B_R\cap \{|Du_\varepsilon |\ge 1+\delta \}\). This allows us to modify the coefficients on \(B_R \cap \{|Du_\varepsilon |\le 1+\delta \}\). This idea will lead us to an energy estimate for \(U_\varepsilon \) in the next lemma.
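The explicit form of \(A_{\alpha \beta }\) stated above can be read off from the underbraced identity in Sect. 4.1. As a sketch, we assume that this identity extends to arbitrary \(\eta ,\zeta \in {\mathbb {R}}^n\) in the form \(|Du_\varepsilon |\,\varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon )(\eta ,\zeta ) = h_\varepsilon (|Du_\varepsilon |)|Du_\varepsilon |\,\eta \cdot \zeta + h_\varepsilon '(|Du_\varepsilon |)\,(Du_\varepsilon \eta )\cdot (Du_\varepsilon \zeta )\), where \((Du_\varepsilon \eta )^i = D_\alpha u^i_\varepsilon \eta _\alpha \); the definition (2.8) itself is not reproduced here. Then

$$\begin{aligned} A_{\alpha \beta }\eta _\alpha \zeta _\beta&= h_\varepsilon (|Du_\varepsilon |)|Du_\varepsilon |\,\delta _{\alpha \beta }\eta _\alpha \zeta _\beta + h_\varepsilon '(|Du_\varepsilon |)\,\big (D_\alpha u_\varepsilon \cdot D_\beta u_\varepsilon \big )\eta _\alpha \zeta _\beta , \end{aligned}$$

and factoring out \(h_\varepsilon (|Du_\varepsilon |)|Du_\varepsilon |\) yields the bracketed expression.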

Lemma 4.3

Let \(\varepsilon \in (0,1]\) and \(u_\varepsilon \in W^{1,p}(B_R,{\mathbb {R}}^N)\) be a weak solution of the regularized system (3.1) satisfying (3.4) and (3.5) on \(B_\varrho (x_o)\subset B_{r_o}\Subset B_R\) and denote by \(U_\varepsilon \) the function defined in (4.2). Then, for any \(k>0\) and any \(\tau \in (0,1)\) we have

$$\begin{aligned}&\int _{B_{\tau \varrho }(x_o)} | D(U_\varepsilon -k)_+|^2 \,\mathrm {d}x\\&\quad \le \frac{c}{(1-\tau )^2\varrho ^2} \int _{B_\varrho (x_o)} (U_\varepsilon -k)_+^2 \,\mathrm {d}x+ c\,\Vert f\Vert _{n+\sigma }^2 \big |B_\varrho (x_o)\cap \{U_\varepsilon >k\}\big |^{1-\frac{2}{n}+\frac{2\beta }{n}}, \end{aligned}$$

where \(c=c(n,p,M,\delta )\) and \(\beta =\frac{\sigma }{n+\sigma }\).

Proof

We let \(A_{\alpha \beta }\) be the coefficients defined in Lemma 4.2 and extend them from \(B_R \cap \{ |Du_\varepsilon |> 1+\delta \}\) to the complement \(B_R \cap \{ |Du_\varepsilon |\le 1+\delta \}\) by letting \(A_{\alpha \beta }\equiv \delta _{\alpha \beta }\). The new coefficients \({\widetilde{A}}_{\alpha \beta }\) are thus defined by

$$\begin{aligned} {\widetilde{A}}_{\alpha \beta }(x) := \left\{ \begin{array}{cl} \delta _{\alpha \beta }, &{}\text{ on } \big \{x\in B_R : |Du_\varepsilon (x)|\le 1+\delta \big \},\\[7pt] A_{\alpha \beta }(x), &{}\text{ on } \big \{x\in B_R : |Du_\varepsilon (x)|>1+\delta \big \}. \end{array} \right. \end{aligned}$$

From this definition and Lemma 4.2 we observe that \(U_\varepsilon \) is a weak sub-solution also with the modified coefficients. More precisely, we have that

$$\begin{aligned} \int _{B_\varrho (x_o)} {\widetilde{A}}_{\alpha \beta } D_\alpha U_\varepsilon D_\beta \zeta \,\mathrm {d}x\le c \int _{B_\varrho (x_o)} |f|^2 \zeta \,\mathrm {d}x+ c\, \mu \int _{B_\varrho (x_o)} |f| |D\zeta | \,\mathrm {d}x, \end{aligned}$$
(4.5)

for any non-negative \(\zeta \in C^1_0(B_\varrho (x_o))\).

We now investigate the upper bound and ellipticity of the coefficients \({\widetilde{A}}_{\alpha \beta }\). We will show that there exist \(0<\lambda \le \Lambda <\infty \), both depending at most on \(p\), \(M\) and \(\delta \), such that

$$\begin{aligned} \lambda |\zeta |^2\le \widetilde{A}_{\alpha \beta }(x)\zeta _\alpha \zeta _\beta \le \Lambda |\zeta |^2 \end{aligned}$$

for any \(x\in B_{r_o}\) and \(\zeta \in {\mathbb {R}}^n\). We start with the former one. On the set where \(|Du_\varepsilon |\le 1+\delta \) the upper bound holds with \(\Lambda =1\), while on the set where \(|Du_\varepsilon |>1+\delta \) we have from Lemma 2.7 that

$$\begin{aligned} {\widetilde{A}}_{\alpha \beta }\zeta _\alpha \zeta _\beta&= |Du_\varepsilon |\,\varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon )(\zeta ,\zeta ) \\&\le |Du_\varepsilon |\big [ \varepsilon +\varvec{\Lambda }(|Du_\varepsilon |)\big ] |\zeta |^2 \\&= \Big [\varepsilon |Du_\varepsilon | + \max \big \{ (|Du_\varepsilon |-1)^{p-1}, (p-1)|Du_\varepsilon |(|Du_\varepsilon |-1)^{p-2} \big \} \Big ] |\zeta |^2 \\&\le \bigg [M+\frac{pM^{p}}{\delta }\bigg ]|\zeta |^2, \end{aligned}$$

which proves the claim with \(\Lambda =\Lambda (p,M,\delta )=M+\frac{pM^{p}}{\delta }\). Similarly, on the set where \(|Du_\varepsilon |\le 1+\delta \) the ellipticity holds with \(\lambda =1\), while on the set where \(|Du_\varepsilon |>1+\delta \) we have from Lemma 2.7 that

$$\begin{aligned} {\widetilde{A}}_{\alpha \beta }\zeta _\alpha \zeta _\beta&= |Du_\varepsilon |\,\varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon )(\zeta ,\zeta ) \\&\ge |Du_\varepsilon |\big [ \varepsilon +\varvec{\lambda }(|Du_\varepsilon |)\big ] |\zeta |^2 \\&\ge \min \big \{ (|Du_\varepsilon |-1)^{p-1}, (p-1)|Du_\varepsilon |(|Du_\varepsilon |-1)^{p-2} \big \} |\zeta |^2 \\&\ge \min \{1,p-1\} \,\delta ^{p-1} |\zeta |^2, \end{aligned}$$

which proves the claim with \(\lambda =\lambda (p,\delta )=\min \{1,p-1\} \,\delta ^{p-1}\).
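The final steps in the two preceding displays rest on elementary inequalities, which we sketch here under the assumptions \(\varepsilon \le 1\) and \(1+\delta <t\le M\) for \(t:=|Du_\varepsilon |\) (so that \(\delta<t-1<t\) and \(M>1\ge \delta \)). For the lower bound,

$$\begin{aligned} (t-1)^{p-1}\ge \delta ^{p-1}, \qquad (p-1)\,t\,(t-1)^{p-2} \ge (p-1)(t-1)^{p-1} \ge (p-1)\delta ^{p-1}, \end{aligned}$$

while for the upper bound

$$\begin{aligned} \varepsilon t\le M, \qquad (t-1)^{p-1}\le M^{p-1}\le \frac{M^{p}}{\delta }, \qquad (p-1)\,t\,(t-1)^{p-2} = (p-1)\,t\,\frac{(t-1)^{p-1}}{t-1} \le \frac{(p-1)M^{p}}{\delta }. \end{aligned}$$

Taking the minimum, respectively the maximum, of the two expressions gives the stated values of \(\lambda \) and \(\Lambda \).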

Now, the claimed energy estimate follows in a standard way by choosing in (4.5) a test-function of the form \(\zeta =\eta ^2(U_\varepsilon -k)_+\) with a cut-off function \(\eta \in C^1_0(B_\varrho (x_o))\) with \(\eta \equiv 1\) on \(B_{\tau \varrho }(x_o)\) and \(|\nabla \eta |\le \frac{2}{(1-\tau )\varrho }\), cf. [19, Chapter 10.1]. \(\square \)
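The exponent of the measure of the superlevel set in Lemma 4.3 presumably stems from Hölder's inequality applied to \(|f|^2\). As a brief check of our own, with the shorthand \(A_k:=B_\varrho (x_o)\cap \{U_\varepsilon >k\}\) (a notation introduced only for this remark), we have

$$\begin{aligned} \int _{A_k} |f|^2\,\mathrm {d}x\le \Vert f\Vert _{n+\sigma }^2\, |A_k|^{1-\frac{2}{n+\sigma }}, \qquad 1-\frac{2}{n+\sigma } = 1-\frac{2}{n} + \frac{2\sigma }{n(n+\sigma )} = 1-\frac{2}{n}+\frac{2\beta }{n}, \end{aligned}$$

with \(\beta =\frac{\sigma }{n+\sigma }\) as in the statement of the lemma.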

4.3 Energy estimates

Here, we assume that the hypotheses of Proposition 3.4 are in force. Our starting point is again Lemma 4.1. This time we keep the two non-negative terms containing the quadratic forms \(\varvec{{\mathcal {A}}}_\varepsilon \) and \(\varvec{\mathcal C}_\varepsilon \) on the left-hand side. For any non-decreasing function \(\phi \in W^{1,\infty }_{\text {loc}}({\mathbb {R}}_{\ge 0},{\mathbb {R}}_{\ge 0})\) and any non-negative function \(\zeta =\eta ^2\in C^1_0(B_\varrho (x_o))\) we have

$$\begin{aligned}&\int _{B_\varrho (x_o)} \Big [\varvec{{\mathcal {A}}}_\varepsilon (Du_\varepsilon )\big (D^2u_\varepsilon ,D^2u_\varepsilon \big ) \phi (|Du_\varepsilon |)\\&\qquad + \varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon ) \big (\nabla |Du_\varepsilon |,\nabla |Du_\varepsilon |\big ) \phi '(|Du_\varepsilon |)|Du_\varepsilon |\Big ] \eta ^2 \,\mathrm {d}x\\&\quad \le 4\bigg |\int _{B_\varrho (x_o)} \varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon )\big (\nabla |Du_\varepsilon |,\nabla \eta \big ) \phi (|Du_\varepsilon |)|Du_\varepsilon | \eta \,\mathrm {d}x\bigg | \\&\qquad + c(p) \int _{B_\varrho (x_o)} |f|^2 \bigg [\frac{\phi (|Du_\varepsilon |)}{h_\varepsilon (|Du_\varepsilon |)} + \frac{\phi '(|Du_\varepsilon |)|Du_\varepsilon |}{h_\varepsilon (|Du_\varepsilon |)}\bigg ] \eta ^2 \,\mathrm {d}x\\&\qquad + 4\int _{B_\varrho (x_o)} |f| \phi (|Du_\varepsilon |) |Du_\varepsilon | |\nabla \eta |\eta \,\mathrm {d}x\\&\quad =: 4{\mathbf {I}} + c(p)\mathbf {II} + 4\mathbf {III} \end{aligned}$$

with the obvious meaning of \({\mathbf {I}}\)–\(\mathbf {III}\). For \(\mathbf {III}\) we have

$$\begin{aligned} 4\mathbf {III}&\le 2\int _{B_\varrho (x_o)} \bigg [ |f|^2 \frac{\phi '(|Du_\varepsilon |)|Du_\varepsilon |}{h_\varepsilon (|Du_\varepsilon |)} \,\eta ^2 + h_\varepsilon (|Du_\varepsilon |) \frac{\phi ^2(|Du_\varepsilon |)|Du_\varepsilon |}{\phi '(|Du_\varepsilon |)} |\nabla \eta |^2\bigg ] \,\mathrm {d}x. \end{aligned}$$
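The preceding bound is an instance of Young's inequality \(2ab\le a^2+b^2\). As a sketch, at points where \(\phi '(|Du_\varepsilon |)>0\) one may take (the symbols \(a\) and \(b\) are introduced only for this remark)

$$\begin{aligned} a := |f| \bigg (\frac{\phi '(|Du_\varepsilon |)|Du_\varepsilon |}{h_\varepsilon (|Du_\varepsilon |)}\bigg )^{\frac{1}{2}}\eta , \qquad b := \bigg (\frac{h_\varepsilon (|Du_\varepsilon |)}{\phi '(|Du_\varepsilon |)|Du_\varepsilon |}\bigg )^{\frac{1}{2}} \phi (|Du_\varepsilon |)|Du_\varepsilon |\,|\nabla \eta |, \end{aligned}$$

so that \(4|f|\phi (|Du_\varepsilon |)|Du_\varepsilon ||\nabla \eta |\eta = 4ab\le 2(a^2+b^2)\), where \(b^2 = h_\varepsilon (|Du_\varepsilon |)\frac{\phi ^2(|Du_\varepsilon |)|Du_\varepsilon |}{\phi '(|Du_\varepsilon |)}|\nabla \eta |^2\).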

Moreover, we estimate the integral \({\mathbf {I}}\) by the Cauchy–Schwarz inequality and obtain

$$\begin{aligned} 4{\mathbf {I}}&\le \int _{B_\varrho (x_o)} \varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon ) \big (\nabla |Du_\varepsilon |,\nabla |Du_\varepsilon |\big ) \phi '(|Du_\varepsilon |)|Du_\varepsilon |\eta ^2 \,\mathrm {d}x\\&\quad + 4\int _{B_\varrho (x_o)} \varvec{\mathcal C}_\varepsilon (Du_\varepsilon ) \big (\nabla \eta ,\nabla \eta \big ) \frac{\phi ^2(|Du_\varepsilon |)|Du_\varepsilon |}{\phi '(|Du_\varepsilon |)} \,\mathrm {d}x\\&\le \int _{B_\varrho (x_o)} \varvec{{\mathcal {C}}}_\varepsilon (Du_\varepsilon ) \big (\nabla |Du_\varepsilon |,\nabla |Du_\varepsilon |\big ) \phi '(|Du_\varepsilon |)|Du_\varepsilon |\eta ^2 \,\mathrm {d}x\\&\quad + 4\int _{B_\varrho (x_o)} \big [ \varepsilon +\varvec{\Lambda }(|Du_\varepsilon |)\big ] \frac{\phi ^2(|Du_\varepsilon |)|Du_\varepsilon |}{\phi '(|Du_\varepsilon |)} |\nabla \eta |^2\,\mathrm {d}x. \end{aligned}$$

In passing from the second-to-last line to the last one, we used the upper bound from Lemma 2.7 to estimate the second integral. Inserting the results above and re-absorbing the first term from the right into the left, we find that

$$\begin{aligned}&\int _{B_\varrho (x_o)} \varvec{{\mathcal {A}}}_\varepsilon (Du_\varepsilon )\big (D^2u_\varepsilon ,D^2u_\varepsilon \big ) \phi (|Du_\varepsilon |) \eta ^2 \,\mathrm {d}x\nonumber \\&\quad \le 4\int _{B_\varrho (x_o)} \big [ \varepsilon +\varvec{\Lambda }(|Du_\varepsilon |) + h_\varepsilon (|Du_\varepsilon |)\big ] \frac{\phi ^2(|Du_\varepsilon |)|Du_\varepsilon |}{\phi '(|Du_\varepsilon |)} |\nabla \eta |^2 \,\mathrm {d}x\nonumber \\&\qquad + c \int _{B_\varrho (x_o)} |f|^2\, \frac{\phi (|Du_\varepsilon |) + \phi '(|Du_\varepsilon |)|Du_\varepsilon |}{h_\varepsilon (|Du_\varepsilon |)} \,\eta ^2 \,\mathrm {d}x\end{aligned}$$
(4.6)

holds true with a constant \(c=c(p)\). We now choose \(\phi (t):=(t-1)_+^p{\tilde{\phi }}(t)\), where \({\tilde{\phi }}\in W^{1,\infty }_{\text {loc}}({\mathbb {R}}_{\ge 0},{\mathbb {R}}_{\ge 0})\) is non-decreasing. Note that

$$\begin{aligned} \phi '(t) = (t-1)_+^{p-1}\big [p{\tilde{\phi }}(t) + (t-1)_+{\tilde{\phi }}'(t)\big ]. \end{aligned}$$

For \(t\in [0,1+2\mu ]\) we compute

$$\begin{aligned}&\big [ \varepsilon +\varvec{\Lambda }(t) + h_\varepsilon (t)\big ] \frac{\phi ^2(t)t}{\phi '(t)} \\&\quad \le 2\big [ \varepsilon +\varvec{\Lambda }(t)\big ] \frac{(t-1)_+^{p+1} t{\tilde{\phi }}^2(t)}{p{\tilde{\phi }}(t) + (t-1)_+{\tilde{\phi }}'(t)} \\&\quad \le 2\Big [ \varepsilon (t-1)_+^{p+1}t + \max \big \{(t-1)_+^{2p}, (p-1)(t-1)_+^{2p-1}t\big \} \Big ] \frac{{\tilde{\phi }}^2(t)}{p{\tilde{\phi }}(t) + (t-1)_+{\tilde{\phi }}'(t)} \\&\quad \le c\,\big [\varepsilon \mu ^{p+2} + \mu ^{2p}\big ] \frac{{\tilde{\phi }}^2(t)}{p{\tilde{\phi }}(t) + (t-1)_+{\tilde{\phi }}'(t)} \\&\quad \le c(p, M,\delta )\, \frac{\mu ^{2p}{\tilde{\phi }}^2(t)}{p{\tilde{\phi }}(t) + (t-1)_+{\tilde{\phi }}'(t)}. \end{aligned}$$

Here, we used \(\delta \le \mu \le M\) from (3.5) and (3.6), which implies on the one hand \(t\le 1+2\mu \le \big (2+\frac{1}{\delta }\big )\mu \), and on the other hand \(\mu ^2\le \max \{\delta ^{2-p}, M^{2-p}\}\mu ^p\). Next, we compute
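For the reader's convenience, the two elementary facts just used can be verified as follows (our own elaboration). Since \(\delta \le \mu \), we have \(1\le \frac{\mu }{\delta }\), and since \(s\mapsto s^{2-p}\) is monotone on \([\delta ,M]\), its values there are bounded by the larger of its two endpoint values. Hence

$$\begin{aligned} 1+2\mu \le \frac{\mu }{\delta } + 2\mu = \Big (2+\frac{1}{\delta }\Big )\mu , \qquad \mu ^2 = \mu ^{2-p}\mu ^p \le \max \big \{\delta ^{2-p}, M^{2-p}\big \}\,\mu ^p. \end{aligned}$$

Together with \(\varepsilon \le 1\), the second inequality gives \(\varepsilon \mu ^{p+2}\le \mu ^2\mu ^{p}\le \max \{\delta ^{2-p}, M^{2-p}\}\,\mu ^{2p}\), which is what is needed in the last step of the preceding display.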

$$\begin{aligned} \frac{\phi (t) + \phi '(t)t}{h_\varepsilon (t)}&\le \frac{\phi (t) + \phi '(t)t}{h(t)} \\&= \big [(t-1)_+{\tilde{\phi }}(t) + pt{\tilde{\phi }}(t) + (t-1)_+ {\tilde{\phi }}'(t)t\big ]t \\&\le c(p,M,\delta )\, [{\tilde{\phi }}(t) + {\tilde{\phi }}'(t)t]. \end{aligned}$$
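The last inequality can be made explicit; the following is a sketch with one admissible, by no means optimal, choice of the constant. Using \(0\le (t-1)_+\le t\le 1+2\mu \le 1+2M\) and \({\tilde{\phi }},{\tilde{\phi }}'\ge 0\), we find

$$\begin{aligned} \big [(t-1)_+{\tilde{\phi }}(t) + pt{\tilde{\phi }}(t) + (t-1)_+ {\tilde{\phi }}'(t)t\big ]t&\le (p+1)\,t^2{\tilde{\phi }}(t) + t^2\,{\tilde{\phi }}'(t)t \\&\le (p+1)(1+2M)^2\,\big [{\tilde{\phi }}(t) + {\tilde{\phi }}'(t)t\big ]. \end{aligned}$$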

Due to assumptions (3.4) and (3.6) we know that \(|Du_\varepsilon |\le 1+2\mu \) on \(B_\varrho (x_o)\). This allows us to use the preceding estimates in (4.6) to bound the right-hand side from above. Moreover, by Lemma 2.11 the left-hand side in (4.6) can be estimated from below. Proceeding in this way we obtain

$$\begin{aligned}&\int _{B_\varrho (x_o)} \big |D\big [g (|Du_\varepsilon |)Du_\varepsilon \big ]\big |^2 {\tilde{\phi }}(|Du_\varepsilon |) \eta ^2 \,\mathrm {d}x\nonumber \\&\quad \le c \int _{B_\varrho (x_o)} \frac{\mu ^{2p}{\tilde{\phi }}^2(|Du_\varepsilon |)}{p{\tilde{\phi }}(|Du_\varepsilon |) + (|Du_\varepsilon |-1)_+{\tilde{\phi }}'(|Du_\varepsilon |)} \,|\nabla \eta |^2 \,\mathrm {d}x\nonumber \\&\qquad + c \int _{B_\varrho (x_o)} |f|^2\, \big [{\tilde{\phi }}(|Du_\varepsilon |) + {\tilde{\phi }}'(|Du_\varepsilon |)|Du_\varepsilon |\big ]\eta ^2 \,\mathrm {d}x, \end{aligned}$$
(4.7)

for any \(\eta \in C^1_0(B_\varrho (x_o))\). The constant \(c\) depends only on \(p\), \(M\), and \(\delta \).

Different concrete choices of \({\tilde{\phi }}\) in (4.7) result in two important energy inequalities. The first one is

Lemma 4.4

Let \(\varepsilon \in (0,1]\) and \(u_\varepsilon \in W^{1,p}(B_R,{\mathbb {R}}^N)\) be a weak solution of the regularized system (3.1) such that hypotheses (3.4), (3.5) and (3.6) are in force on \(B_\varrho (x_o)\subset B_{r_o}\Subset B_R\). Then, for any \(\tau \in (0,1)\) there holds

$$\begin{aligned} \int _{B_{\tau \varrho }(x_o)} \big |D\big [g (|Du_\varepsilon |)Du_\varepsilon \big ]\big |^2 \mathrm {d}x\le c \bigg [\frac{\mu ^{2p}}{\varrho ^2(1-\tau )^2} + \varrho ^{-\frac{2n}{n+\sigma }} \Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \bigg ] |B_\varrho | \end{aligned}$$

for some universal constant \(c=c(n,p,M,\delta )\).

Proof

We apply inequality (4.7) with the choice \({\tilde{\phi }}\equiv 1\). The cut-off function \(\eta \in C^1_0(B_\varrho (x_o))\) is chosen such that \(\eta \equiv 1\) in \(B_{\tau \varrho }(x_o)\), \(0\le \eta \le 1\), and \(|\nabla \eta |\le \frac{2}{(1-\tau )\varrho }\). This leads us to

$$\begin{aligned} \int _{B_{\tau \varrho }(x_o)} \big |D\big [g (|Du_\varepsilon |)Du_\varepsilon \big ]\big |^2 \,\mathrm {d}x&\le c \int _{B_{\varrho }(x_o)} \big [\mu ^{2p}|\nabla \eta |^2 + |f|^2 \big ] \,\mathrm {d}x\\&\le \frac{c\,\mu ^{2p}}{(1-\tau )^2} \,\varrho ^{n-2} + c\, \varrho ^{n-\frac{2n}{n+\sigma }} \Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \end{aligned}$$

with a constant \(c=c(n,p,M,\delta )\), which is the claimed energy estimate. \(\square \)
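The passage from the first to the second line in the proof above uses the bound on \(|\nabla \eta |\) and Hölder's inequality. As a short sketch of these two steps,

$$\begin{aligned} \int _{B_\varrho (x_o)} \mu ^{2p}|\nabla \eta |^2\,\mathrm {d}x&\le \frac{4\mu ^{2p}}{(1-\tau )^2\varrho ^2}\,|B_\varrho | = \frac{c(n)\,\mu ^{2p}}{(1-\tau )^2}\,\varrho ^{n-2}, \\ \int _{B_\varrho (x_o)} |f|^2\,\mathrm {d}x&\le \Vert f\Vert ^2_{L^{n+\sigma }(B_R)}\, |B_\varrho |^{1-\frac{2}{n+\sigma }} = c(n,\sigma )\, \varrho ^{n-\frac{2n}{n+\sigma }} \Vert f\Vert ^2_{L^{n+\sigma }(B_R)}. \end{aligned}$$

Writing \(\varrho ^{n-2}\) and \(\varrho ^{n-\frac{2n}{n+\sigma }}\) as multiples of \(\varrho ^{-2}|B_\varrho |\) and \(\varrho ^{-\frac{2n}{n+\sigma }}|B_\varrho |\), respectively, gives the form stated in Lemma 4.4. The factor \(c(n,\sigma )=|B_1|^{1-\frac{2}{n+\sigma }}\) can be bounded by a constant depending only on \(n\), since \(0<1-\frac{2}{n+\sigma }<1\).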

The second energy estimate is

Lemma 4.5

Let \(\nu \in (0,\frac{1}{4}]\), \(\varepsilon \in (0,1]\) and \(u_\varepsilon \in W^{1,p}(B_R,{\mathbb {R}}^N)\) be a weak solution of the regularized system (3.1) such that hypotheses (3.4), (3.5), (3.6) and (3.7) are in force on \(B_\varrho (x_o)\subset B_{r_o}\Subset B_R\). Then, for any \(\tau \in (0,1)\) we have

$$\begin{aligned} \int _{E_{\tau \varrho }^\nu (x_o)} \big |D\big [g (|Du_\varepsilon |)Du_\varepsilon \big ]\big |^2 \,\mathrm {d}x\le c\bigg [\frac{\mu ^{2p}\nu }{\varrho ^2(1-\tau )^2} + \frac{\varrho ^{-\frac{2n}{n+\sigma }} }{\nu } \Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \bigg ] |B_\varrho |, \end{aligned}$$

for a constant \(c=c(n,p,M,\delta )\).

Proof

This time we choose

$$\begin{aligned} {\tilde{\phi }}(t)=(t-1-\delta -k)_+^2\; \text{ with } k:= (1-2\nu )\mu \end{aligned}$$

in inequality (4.7), and obtain

$$\begin{aligned}&\int _{B_\varrho (x_o)} \eta ^2 \big |D\big [g (|Du_\varepsilon |)Du_\varepsilon \big ]\big |^2 \big (|Du_\varepsilon |-1-\delta -k\big )_+^2 \,\mathrm {d}x\nonumber \\&\quad \le c \int _{B_\varrho (x_o)} \frac{\mu ^{2p}\big (|Du_\varepsilon |-1-\delta -k\big )_+^3}{p(|Du_\varepsilon |-1-\delta -k) + 2(|Du_\varepsilon |-1)} \,|\nabla \eta |^2 \,\mathrm {d}x\nonumber \\&\qquad + c \int _{B_\varrho (x_o)} |f|^2\, |Du_\varepsilon | \big (|Du_\varepsilon |-1-\delta -k\big )_+\eta ^2 \,\mathrm {d}x. \end{aligned}$$
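The form of this inequality can be checked directly from (4.7). The following computation is a sketch, in which the abbreviation \(s:=(t-1-\delta -k)_+\) is introduced only for this purpose. For \(t>1+\delta +k\) we have \({\tilde{\phi }}(t)=s^2\), \({\tilde{\phi }}'(t)=2s\) and \((t-1)_+=t-1\), so that

$$\begin{aligned} \frac{{\tilde{\phi }}^2(t)}{p{\tilde{\phi }}(t)+(t-1)_+{\tilde{\phi }}'(t)}&= \frac{s^4}{ps^2+2(t-1)s} = \frac{s^3}{ps+2(t-1)},\\ {\tilde{\phi }}(t)+{\tilde{\phi }}'(t)\,t&= s^2+2st = s\,(s+2t) \le 3\,t\,s, \end{aligned}$$

where the last step uses \(s\le t\). The factor 3 is absorbed into the constant c.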

On \(B_\varrho (x_o)\cap \{|Du_\varepsilon |>1+\delta +k\}\) we have

$$\begin{aligned} \big (|Du_\varepsilon |-1-\delta -k\big )_+ \le \mu -k = \mu -(1-2\nu )\mu = 2\nu \mu , \end{aligned}$$

and

$$\begin{aligned} p\big (|Du_\varepsilon |-1-\delta -k\big ) + 2(|Du_\varepsilon |-1) \ge 2(\delta +k) \ge 2k \ge \mu , \end{aligned}$$

since \(\nu \le \frac{1}{4}\). Again, we choose \(\eta \in C_0^1(B_\varrho (x_o))\) to be a non-negative cut-off function with \(\eta \equiv 1\) in \(B_{\tau \varrho }(x_o)\), \(0\le \eta \le 1\), and \(|\nabla \eta |\le \frac{2}{(1-\tau )\varrho }\). This, together with the fact that \(|Du_\varepsilon |\le 1+2\mu \) on \(B_\varrho (x_o)\), allows us to estimate the right-hand side in the above inequality. Indeed, we have

$$\begin{aligned}&\int _{B_{\tau \varrho }(x_o)} \big |D\big [g (|Du_\varepsilon |)Du_\varepsilon \big ]\big |^2 \big (|Du_\varepsilon |-1-\delta -k\big )_+^2 \,\mathrm {d}x\nonumber \\&\quad \le c\bigg [\frac{\nu ^3\mu ^{2p+2}}{(1-\tau )^2}\,\varrho ^{n-2} + \nu \mu ^2 \int _{B_\varrho (x_o)} |f|^2 \,\mathrm {d}x\bigg ]. \end{aligned}$$
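The pointwise bounds entering this estimate can be written out as follows. This is a sketch that uses only the inequalities stated above together with \(\delta <\mu \), which is how (3.6) is used elsewhere in the text. First, \(\nu \le \frac{1}{4}\) gives \(2k=2(1-2\nu )\mu \ge \mu \). On the set \(\{|Du_\varepsilon |>1+\delta +k\}\) we therefore have

$$\begin{aligned} \frac{\mu ^{2p}\big (|Du_\varepsilon |-1-\delta -k\big )_+^3}{p(|Du_\varepsilon |-1-\delta -k)+2(|Du_\varepsilon |-1)}\,|\nabla \eta |^2&\le \frac{\mu ^{2p}(2\nu \mu )^3}{\mu }\cdot \frac{4}{(1-\tau )^2\varrho ^2} = \frac{32\,\nu ^3\mu ^{2p+2}}{(1-\tau )^2\varrho ^2},\\ |Du_\varepsilon |\,\big (|Du_\varepsilon |-1-\delta -k\big )_+&\le (1+2\mu )\,2\nu \mu \le 2\big (2+\tfrac{1}{\delta }\big )\nu \mu ^2 . \end{aligned}$$

Integrating the first bound over \(B_\varrho (x_o)\), whose measure is a dimensional constant times \(\varrho ^n\), produces the factor \(\varrho ^{n-2}\).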

Therefore it remains to estimate the left-hand side from below. The integral has to be taken only on the set of points \(x\in B_\varrho (x_o)\) with \(|Du_\varepsilon (x)|-1-\delta >k = (1-2\nu )\mu \). We shrink this set to those points satisfying the stronger condition \(|Du_\varepsilon (x)|-1-\delta>(1-\nu )\mu >k\), i.e. to \(E_{\tau \varrho }^\nu (x_o)\). On this set we have

$$\begin{aligned} |Du_\varepsilon |-1-\delta -k \ge (1-\nu )\mu - (1-2\nu )\mu = \nu \mu . \end{aligned}$$

Inserting this above we conclude that

$$\begin{aligned} \nu ^2\mu ^2\int _{E_{\tau \varrho }^\nu (x_o)} \!\! \big | D\big [g (|Du_\varepsilon |)Du_\varepsilon \big ]\big |^2 \,\mathrm {d}x\le c\bigg [ \frac{\nu ^3\mu ^{2p+2}}{\varrho ^2(1-\tau )^2} + \nu \mu ^2 \varrho ^{-\frac{2n}{n+\sigma }} \Vert f\Vert ^2_{L^{n+\sigma }(B_R)}\bigg ]|B_\varrho | \end{aligned}$$

holds true. This proves the claim. \(\square \)
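The passage from the last display to the statement of the lemma is a division by \(\nu ^2\mu ^2\). As a brief check of the powers involved,

$$\begin{aligned} \frac{\nu ^3\mu ^{2p+2}}{\nu ^2\mu ^2} = \nu \,\mu ^{2p}, \qquad \frac{\nu \mu ^2}{\nu ^2\mu ^2} = \frac{1}{\nu }, \end{aligned}$$

which are exactly the factors \(\mu ^{2p}\nu \) and \(\nu ^{-1}\) in the two terms of the claimed estimate.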

5 The non-degenerate regime

The aim of this section is to prove Proposition 3.4. Throughout this section we presume the following general assumptions. For given \(\varepsilon \in (0,1]\) we denote by \(u_\varepsilon \in W^{1,p}(B_R,{\mathbb {R}}^N)\) the unique weak solution of the regularized system (3.1). Moreover, we assume that for some \(\delta \in (0,1]\) and \(\mu >\delta \) and a ball \(B_{2\varrho }(x_o)\subset B_{r_1}\) with \(\varrho \le 1\) assumptions (3.4)–(3.6) are in force. We denote by

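$$\begin{aligned} \Phi (x_o,\varrho ) := \frac{1}{|B_\varrho (x_o)|} \int _{B_\varrho (x_o)} \big |Du_\varepsilon - (Du_\varepsilon )_{x_o,\varrho }\big |^2 \,\mathrm {d}x \end{aligned}$$
(5.1)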

the \(L^2\)-excess of \(Du_\varepsilon \) on \(B_\varrho (x_o)\), i.e. the \(L^2\)-mean square deviation of \(Du_\varepsilon \) from its mean value \((Du_\varepsilon )_{x_o,\varrho }\).

5.1 Higher integrability

An ingredient in the proof of Proposition 3.4 is the following higher integrability result.

Lemma 5.1

Under the general assumptions of Sect. 5 there exist \(\vartheta =\vartheta (n,p,\sigma ,M,\delta ) \in (0,\min \{\tfrac{1}{2},\frac{n+\sigma }{2}-1\}]\) and \(c=c(n,p,M,\delta )\) such that for any \(\xi \in {\mathbb {R}}^{Nn}\) satisfying

$$\begin{aligned} 1+\delta +\tfrac{1}{4}\mu \le |\xi | \le 1+\delta +\mu , \end{aligned}$$

we have

Proof

We consider a ball \(B_s (z_o)\subset B_\varrho (x_o)\). We test the weak form (3.2) of the elliptic system by the testing function

$$\begin{aligned} \varphi :=\eta ^2w, \quad \text{ where } \ w := u_\varepsilon - (u_\varepsilon )_{z_o,s} -\xi (x-z_o) \end{aligned}$$

and \(\eta \in C_0^1(B_s (z_o))\) is a standard cut-off function with \(\eta \equiv 1\) in \(B_{s/2}(z_o)\), \(0\le \eta \le 1\), and \(|\nabla \eta |\le \frac{4}{s}\). We obtain

$$\begin{aligned} 0&= \int _{B_s (z_o)} \big [ {\mathbf {A}}_\varepsilon (Du_\varepsilon )\cdot D\varphi + f\cdot \varphi \big ]\, dx\\&= \int _{B_s (z_o)} \Big [ \big ({\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot D\varphi + f\cdot \varphi \Big ]\, \mathrm {d}x\\&= \int _{B_s (z_o)} \Big [ \big ({\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot \big [ \eta ^2 Dw +2\eta \nabla \eta \otimes w\big ] + f\cdot \varphi \Big ] \, \mathrm {d}x. \end{aligned}$$

We use the monotonicity of \({\mathbf {A}}_\varepsilon \) from Lemma 2.8 in order to estimate the first term from below. Due to our assumption on \(\xi \) and (3.4) we have \(|\xi |+|Du_\varepsilon |\le 5|\xi |\) and therefore obtain

$$\begin{aligned} \int _{B_s (z_o)}&\eta ^2 \big ({\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot Dw \, \mathrm {d}x\ge \bigg [\varepsilon + \lambda \, \frac{(|\xi |-1)^p}{|\xi |^2}\bigg ] \int _{B_s (z_o)} \eta ^2 |Dw|^2\,\mathrm {d}x, \end{aligned}$$

where \(\lambda =\frac{1}{5\cdot 2^{p+1}} \min \{1,p-1\}\). To bound the second integral from above we use the structural upper bound from Lemma 2.8. Moreover, we observe that \((|Du_\varepsilon |-1)_+\le 4 (|\xi |-1)\) due to our assumption on \(\xi \). This allows us to estimate

$$\begin{aligned}&2\bigg |\int _{B_s (z_o)} \eta \big ({\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_\varepsilon (\xi )\big ) \cdot \nabla \eta \otimes w\,\mathrm {d}x\bigg |\\&\quad \le c\,\big [\varepsilon + (|\xi |-1)^{p-2}\big ] \int _{B_s (z_o)} \eta |Dw||w||\nabla \eta |\, dx\\&\quad \le \bigg [\varepsilon + \tfrac{1}{2}\lambda \, \frac{(|\xi |-1)^p}{|\xi |^2}\bigg ] \int _{B_s (z_o)} \eta ^2 |Dw|^2\,\mathrm {d}x\\&\qquad + c \big [\varepsilon + |\xi |^2(|\xi |-1)^{p-4}\big ] \int _{B_s (z_o)}|w|^2|\nabla \eta |^2\,\mathrm {d}x. \end{aligned}$$

for a constant \(c=c(p)\). Re-absorbing terms on the left-hand side, we find that

$$\begin{aligned}&\tfrac{1}{2}\lambda \, \frac{(|\xi |-1)^p}{|\xi |^2} \int _{B_s (z_o)} \eta ^2|Dw|^2 \,\mathrm {d}x\\&\quad \le c \big [1 + |\xi |^2(|\xi |-1)^{p-4}\big ] \int _{B_s (z_o)}|w|^2|D\eta |^2\,\mathrm {d}x+ \int _{B_s (z_o)} |f||w| \,\mathrm {d}x, \end{aligned}$$

where \(c=c(p)\). Due to the assumption on \(\xi \), the assumption (3.5), and the particular choice of \(\eta \), an application of Hölder’s inequality and the Sobolev–Poincaré inequality yields a reverse Hölder inequality of the form

with a constant \(c=c(n,p, M,\delta )\). The dependence of c upon M only occurs in the sub-quadratic case of \(p<2\). The claim, i.e. the higher integrability, now follows with Gehring’s lemma, since \(Dw=Du_\varepsilon -\xi \), cf. [1, Theorem 3.22], [22, Theorem 2.4] and [42, Theorem 3.3]. Note that \(\vartheta \) can always be diminished if necessary. \(\square \)
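The absorption step in the preceding proof rests on Young's inequality. The following lines are a sketch of that step, with the abbreviations \(a:=\eta |Dw|\), \(b:=|w||\nabla \eta |\) and \(m:=|\xi |-1>0\) introduced only here. Pointwise we have

$$\begin{aligned} c\,\varepsilon \,ab&\le \varepsilon \,a^2 + \tfrac{1}{4}c^2\varepsilon \,b^2,\\ c\,m^{p-2}ab&\le \tfrac{1}{2}\lambda \,\frac{m^p}{|\xi |^2}\,a^2 + \frac{c^2}{2\lambda }\,|\xi |^2\,m^{p-4}\,b^2 . \end{aligned}$$

Subtracting the resulting upper bound from the lower bound obtained by monotonicity leaves the coefficient

$$\begin{aligned} \bigg [\varepsilon + \lambda \,\frac{m^p}{|\xi |^2}\bigg ] - \bigg [\varepsilon + \tfrac{1}{2}\lambda \,\frac{m^p}{|\xi |^2}\bigg ] = \tfrac{1}{2}\lambda \,\frac{m^p}{|\xi |^2} \end{aligned}$$

in front of \(\int _{B_s (z_o)}\eta ^2|Dw|^2\,\mathrm {d}x\), and \(\varepsilon \le 1\) explains the replacement of \(\varepsilon \) by 1 in the remaining term.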

5.2 Comparison with a linear system

In this section we will consider the weak solution \(v\in u_\varepsilon + W^{1,2}_0(B_{\varrho /2}(x_o),{\mathbb {R}}^N)\) of the linear elliptic system

$$\begin{aligned} \int _{B_{\varrho /2}(x_o)}\varvec{\mathcal B}_\varepsilon \big ((Du_\varepsilon )_{x_o,\varrho /2}\big ) (Dv,D\varphi ) \,\mathrm {d}x= 0, \end{aligned}$$
(5.2)

for any \(\varphi \in W^{1,2}_0(B_{\varrho /2}(x_o),{\mathbb {R}}^N)\) as comparison function to our solution \(u_\varepsilon \) of the regularized elliptic system (3.1). Recall that \(\varvec{\mathcal B}_\varepsilon \) has been defined in (2.7).

Lemma 5.2

Let the general assumptions of Sect. 5 be in force and assume that

$$\begin{aligned} 1+\delta +\tfrac{1}{4}\mu \le \big |(Du_\varepsilon )_{x_o,\varrho }\big | \le 1+\delta +\mu . \end{aligned}$$
(5.3)

Then, there exist \(\vartheta =\vartheta (n,p,M,\sigma ,\delta )\in (0,\min \{\tfrac{1}{2},\frac{n+\sigma }{2}-1\}]\) and \(c=c(n,p,M,\delta ,\Vert f\Vert _{L^{n+\sigma }(B_R)})\) such that

Here, v is the unique weak solution of the Dirichlet problem (5.2) and \(\beta =\frac{\sigma }{n+\sigma }\).

Proof

Throughout the proof we omit the reference to the center \(x_o\) and write \(B_\varrho \) instead of \(B_\varrho (x_o)\). Moreover, we abbreviate \(\xi :=(Du_\varepsilon )_\varrho \). Using the weak form (3.2) of the elliptic system we obtain

$$\begin{aligned} 0&= \int _{B_{\varrho /2}} {\mathbf {A}}_\varepsilon (Du_\varepsilon ) \cdot D\varphi \, \mathrm {d}x+ \int _{B_{\varrho /2}} f\cdot \varphi \,\mathrm {d}x\\&= \int _{B_{\varrho /2}} \big ({\mathbf {A}}_\varepsilon (Du_\varepsilon ) - \mathbf{A}_\varepsilon (\xi )\big )\cdot D\varphi \, \mathrm {d}x+ \int _{B_{\varrho /2}} f\cdot \varphi \,\mathrm {d}x, \end{aligned}$$

for any \(\varphi \in W^{1,2}_0(B_{\varrho /2},{\mathbb {R}}^N)\). Using also the fact that v is a weak solution of the linear elliptic system (5.2), we find that

$$\begin{aligned}&\int _{B_{\varrho /2}} \varvec{{\mathcal {B}}}_\varepsilon (\xi )(Du_\varepsilon -Dv,D\varphi ) \,\mathrm {d}x\\&\quad = \int _{B_{\varrho /2}} \varvec{{\mathcal {B}}}_\varepsilon (\xi )(Du_\varepsilon ,D\varphi ) \,\mathrm {d}x\\&\quad = \int _{B_{\varrho /2}} \varvec{{\mathcal {B}}}_\varepsilon (\xi )(Du_\varepsilon -\xi ,D\varphi ) \,\mathrm {d}x\\&\quad = \int _{B_{\varrho /2}} \Big [ \varvec{{\mathcal {B}}}_\varepsilon (\xi )(Du_\varepsilon -\xi , D\varphi ) - \big ({\mathbf {A}}_\varepsilon (Du_\varepsilon ) - {\mathbf {A}}_\varepsilon (\xi )\big )\cdot D\varphi \Big ]\,\mathrm {d}x- \int _{B_{\varrho /2}} f\cdot \varphi \,\mathrm {d}x\\&\quad \le c(p)\,\mu ^{p-3} \int _{B_{\varrho /2}} |Du_\varepsilon -\xi |^2 |D\varphi | \,\mathrm {d}x+ \bigg (\int _{B_{\varrho /2}} |f|^2 \,\mathrm {d}x\bigg )^{\frac{1}{2}} \bigg (\int _{B_{\varrho /2}} |\varphi |^2 \,\mathrm {d}x\bigg )^{\frac{1}{2}}. \end{aligned}$$

In the last step we used Lemma 2.10, which is applicable since (3.4) and (5.3) are in force. Since \(u_\varepsilon - v \in W^{1,2}_0(B_{\varrho /2},{\mathbb {R}}^N)\), we may choose the testing function \(\varphi =u_\varepsilon -v\). Together with the bound from below from Lemma 2.7 and Hölder’s and Poincaré’s inequality this leads us to

$$\begin{aligned}&\gamma \mu ^{p-2} \int _{B_{\varrho /2}} |Du_\varepsilon -Dv|^2 \,\mathrm {d}x\\&\quad \le \int _{B_{\varrho /2}} \varvec{{\mathcal {B}}}_\varepsilon (\xi )(Du_\varepsilon -Dv,Du_\varepsilon -Dv) \,\mathrm {d}x\nonumber \\&\quad \le c(p)\,\mu ^{p-3} \int _{B_{\varrho /2}} |Du_\varepsilon -\xi |^2 |Du_\varepsilon -Dv| \,\mathrm {d}x\\&\qquad + \bigg (\int _{B_{\varrho /2}} |f|^2 \,\mathrm {d}x\bigg )^{\frac{1}{2}} \bigg (\int _{B_{\varrho /2}} |u_\varepsilon -v|^2 \,\mathrm {d}x\bigg )^{\frac{1}{2}}\\&\quad \le c(p)\,\mu ^{p-3} \bigg (\int _{B_{\varrho /2}} |Du_\varepsilon -\xi |^4 \,\mathrm {d}x\bigg )^{\frac{1}{2}} \bigg (\int _{B_{\varrho /2}} |Du_\varepsilon -Dv|^2 \,\mathrm {d}x\bigg )^{\frac{1}{2}} \\&\qquad + c(n)\,\varrho \bigg (\int _{B_{\varrho /2}} |f|^2 \,\mathrm {d}x\bigg )^{\frac{1}{2}} \bigg (\int _{B_{\varrho /2}} |Du_\varepsilon -Dv|^2 \,\mathrm {d}x\bigg )^{\frac{1}{2}} \end{aligned}$$

for a constant \(\gamma =\gamma (p,\delta )>0\). We divide both sides by

$$\begin{aligned} \gamma \mu ^{p-2}\big [\int _{B_{\varrho /2}} |Du_\varepsilon -Dv|^2 \,\mathrm {d}x\big ]^{\frac{1}{2}}, \end{aligned}$$

square the result and finally take means. This implies

with a constant \(c=c(n,p,M,\delta )\). Here we have also used \(\delta <\mu \le M\). At this stage, we want to reduce the integrability exponent on the right-hand side from 4 to \(2(1+\vartheta )\), where \(\vartheta =\vartheta (n,p,M,\delta )\in (0,\min \{\tfrac{1}{2},\frac{n+\sigma }{2}-1\}]\) is the integrability exponent from the higher integrability Lemma 5.1. This is possible since \(|Du_\varepsilon |\) and \(|\xi |\) are bounded by \(( 2+\frac{1}{\delta })\mu \) on account of (3.4) and (3.6). Then, the application of the higher integrability lemma yields

where \(c=c(n,p,M,\delta )\). Inserting this inequality above and noting that \(\varrho \le 1\) finishes the proof of the lemma. \(\square \)
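The algebra behind the step "divide, square and take means" in the preceding proof can be sketched as follows. The abbreviation \(X:=\big (\int _{B_{\varrho /2}}|Du_\varepsilon -Dv|^2\,\mathrm {d}x\big )^{\frac{1}{2}}\) is introduced only for this illustration, and we assume \(X>0\), since otherwise there is nothing to prove. Dividing the last chain of inequalities by \(\gamma \mu ^{p-2}X\) and using \((a+b)^2\le 2a^2+2b^2\) gives

$$\begin{aligned} X^2 \le \frac{2\,c(p)^2}{\gamma ^2\mu ^2} \int _{B_{\varrho /2}} |Du_\varepsilon -\xi |^4\,\mathrm {d}x + \frac{2\,c(n)^2\varrho ^2}{\gamma ^2\mu ^{2(p-2)}} \int _{B_{\varrho /2}} |f|^2\,\mathrm {d}x . \end{aligned}$$

After taking means, Hölder's inequality applied to the term containing f yields

$$\begin{aligned} \frac{\varrho ^2}{|B_{\varrho /2}|}\int _{B_{\varrho /2}} |f|^2\,\mathrm {d}x \le \varrho ^2\,|B_{\varrho /2}|^{-\frac{2}{n+\sigma }}\,\Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \le c(n)\,\varrho ^{2-\frac{2n}{n+\sigma }}\,\Vert f\Vert ^2_{L^{n+\sigma }(B_R)} = c(n)\,\varrho ^{2\beta }\,\Vert f\Vert ^2_{L^{n+\sigma }(B_R)}, \end{aligned}$$

which is where the exponent \(\beta =\frac{\sigma }{n+\sigma }\) comes from.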

The following a priori estimate for solutions to linear elliptic systems can be inferred from [5] once the ellipticity conditions for the quadratic form \(\varvec{\mathcal B}_\varepsilon \big ((Du_\varepsilon )_{x_o,\varrho }\big )\) are established; see also [21, Theorem 2.3].

Lemma 5.3

Let the general assumptions of Sect. 5 be in force and assume that (5.3) holds true. Then, the weak solution \(v\in W^{1,2}(B_{\varrho /2}(x_o),{\mathbb {R}}^N)\) of the linear elliptic system (5.2) satisfies \(v\in W^{2,2}_{\text {loc}}(B_{\varrho /2}(x_o),{\mathbb {R}}^N)\) and there exists a constant \(c_o=c_o(n,N,p,\delta )\) such that for any \(\tau \in (0,\frac{1}{2}]\) we have

Proof

As mentioned before, the a priori estimate is standard. The constant \(c_o\) depends on the dimensions n and N, the ellipticity constant and the upper bound of the quadratic form \(\varvec{\mathcal B}_\varepsilon \big ((Du_\varepsilon )_{x_o,\varrho }\big )\). Due to assumption (5.3) and Lemma 2.7 these quantities only depend on p and \(\delta \). \(\square \)

5.3 Exploiting the measure theoretic information

The aim of this subsection is to convert the measure theoretic information (3.7) into a lower bound for the mean value of \(Du_\varepsilon \) and smallness of the excess.

Lemma 5.4

Let the general assumptions of Sect. 5 be in force. Furthermore, assume that (3.7) holds for some \(\nu \in (0,\frac{1}{4}]\). Then there exists a constant \(c=c(n,p,M,\delta )\) such that for any \(\tau \in [\frac{1}{2},1)\) there holds

$$\begin{aligned} \Phi (x_o,\tau \varrho ) \le c\, \mu ^2\bigg [ \frac{\nu ^{\frac{2}{n}}}{(1-\tau )^2}+ \frac{\varrho ^{2\beta }}{\nu } \Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \bigg ]. \end{aligned}$$

Proof

Throughout the proof we omit the reference to the center \(x_o\) and write \(B_\varrho \) instead of \(B_\varrho (x_o)\). We define \(\zeta \in {\mathbb {R}}^{Nn}\) by

$$\begin{aligned} |\zeta |^{p-1} \zeta = \big (|{\mathcal {G}}(Du_\varepsilon )|^{p-1} {\mathcal {G}}(Du_\varepsilon ) \big )_{\tau \varrho } \end{aligned}$$

and let

$$\begin{aligned} {\tilde{\zeta }} := {\mathcal {G}}^{-1}(\zeta ). \end{aligned}$$

Note that \(|\zeta |\le \delta +\mu \) by (3.4) and \(|{\tilde{\zeta }}|\le 1+\delta +\mu \). Due to the minimality of the integral average \((Du_\varepsilon )_{\tau \varrho }\) with respect to the mapping \(\xi \mapsto \int _{B_{\tau \varrho }} |Du_\varepsilon - \xi |^2 \,\mathrm {d}x\), we have

We recall that \(|Du_\varepsilon |,|{\tilde{\zeta }}|\le 1+\delta +\mu \) and hence by (3.6) we have \(|Du_\varepsilon |,|{\tilde{\zeta }}|\le (2+\frac{1}{\delta })\mu \). Due to assumption (3.7) we therefore obtain for the second integral

$$\begin{aligned} \mathbf {II} \le \frac{c(\delta )\,\mu ^2}{|B_{\tau \varrho }|}\, \big |B_{\varrho }\setminus E_{\varrho }^\nu \big | \le \frac{c(\delta )\,\nu \mu ^2}{\tau ^n}. \end{aligned}$$

For the estimate of \({\mathbf {I}}\) we first note that \(|Du_\varepsilon |\ge 1+\frac{3}{2}\delta \) on \(E^\nu _{\tau \varrho }\) since \(\nu \in (0,\frac{1}{4}]\) and \(\mu \ge \delta \). Therefore, the application of Lemma 2.3 yields

$$\begin{aligned} {\mathbf {I}} \le \frac{c(\delta )}{|B_{\tau \varrho }|} \int _{E_{\tau \varrho }^\nu } \big |{\mathcal {G}}(Du_\varepsilon ) - \zeta \big |^2 \,\mathrm {d}x. \end{aligned}$$

Next, we note that

$$\begin{aligned} |{\mathcal {G}}(Du_\varepsilon )| + |\zeta | \ge |{\mathcal {G}}(Du_\varepsilon )| = (|Du_\varepsilon |-1)_+> \delta +(1-\nu )\mu > \tfrac{1}{2} \mu \quad \text{ on } E_{\tau \varrho }^\nu . \end{aligned}$$

Using this information, Lemma 2.2, the choice of \(\zeta \), and Poincaré’s inequality we obtain

We once again decompose the domain of integration into \(E_{\tau \varrho }^\nu \) and \(B_{\tau \varrho }\setminus E_{\tau \varrho }^\nu \). Subsequently applying Hölder’s inequality and taking into account assumption (3.7) leads us to

$$\begin{aligned} {\mathbf {I}}&\le \frac{c(p,\delta )\,\varrho ^2}{\mu ^{2p-2}|B_{\tau \varrho }|} \bigg [\! \int _{E_{\tau \varrho }^\nu } \big | D\big [g(Du_\varepsilon )Du_\varepsilon \big ] \big |^2 \,\mathrm {d}x+ \nu ^{\frac{2}{n}} \int _{B_{\tau \varrho }\setminus E_{\tau \varrho }^\nu } \big | D\big [g(Du_\varepsilon )Du_\varepsilon \big ] \big |^2 \,\mathrm {d}x\! \bigg ]. \end{aligned}$$

We note that \(\tau \ge \frac{1}{2}\) and hence \(|B_{\tau \varrho }|\ge c(n)\varrho ^n\). For the first integral we use Lemma 4.5, while for the second one we use Lemma 4.4 and the assumption \(\mu \ge \delta \). In this way we obtain

$$\begin{aligned} {\mathbf {I}}&\le \frac{c\,\mu ^2}{(1-\tau )^2} \big [\nu + \nu ^{\frac{2}{n}}\big ] + \frac{c\,\varrho ^{2-\frac{2n}{n+\sigma }}}{\nu \mu ^{2p-2}} \Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \\&\le c\, \mu ^2\bigg [ \frac{\nu ^{\frac{2}{n}}}{(1-\tau )^2} + \frac{\varrho ^{\frac{2\sigma }{n+\sigma }}}{\nu } \Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \bigg ], \end{aligned}$$

for a constant \(c=c(n,p,M,\delta )\). Inserting this above yields the desired estimate. \(\square \)
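The passage between the two lines of the last display uses three elementary facts, collected here as a short sketch:

$$\begin{aligned} 2-\frac{2n}{n+\sigma } = \frac{2\sigma }{n+\sigma } = 2\beta , \qquad \nu \le \nu ^{\frac{2}{n}}, \qquad \frac{1}{\mu ^{2p-2}} = \frac{\mu ^2}{\mu ^{2p}} \le \frac{\mu ^2}{\delta ^{2p}} . \end{aligned}$$

The second relation holds because \(\nu \in (0,1)\) and \(\frac{2}{n}\le 1\), and the third because \(\mu \ge \delta \). The factor \(\delta ^{-2p}\) is absorbed into the constant c.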

Lemma 5.5

Let the general assumptions of Sect. 5 be in force. Then, for any \(\theta \in (0,\frac{1}{64}]\) there exist \(\nu =\nu (n,p,M,\delta ,\theta )\in (0,\frac{1}{4}]\) and \(\varrho _o=\varrho _o(n,p,\sigma ,\Vert f\Vert _{L^{n+\sigma }(B_R)},M,\delta ,\theta )\in (0,1]\) such that the smallness assumption \(\varrho \le \varrho _o\) and the measure theoretic hypothesis (3.7) imply

$$\begin{aligned} \big |(Du_\varepsilon )_{x_o,\varrho }\big | \ge 1+\delta +\tfrac{1}{2}\mu \qquad \text{ and }\qquad \Phi (x_o,\varrho ) \le \theta \mu ^2. \end{aligned}$$
(5.4)

Proof

We let \(\tau \in [\frac{1}{2},1)\), \(\nu \in (0,\frac{1}{4}]\) and \(\varrho _o\in (0,1]\). Consider \(B_\varrho (x_o)\subset B_R\) with \(\varrho \le \varrho _o\). For convenience in notation we omit the reference to the center \(x_o\). Using the minimality of \((Du)_{\varrho }\) with respect to the mapping \(\xi \mapsto \int _{B_{\varrho }} |Du - \xi |^2 \,\mathrm {d}x\) and decomposing the domain of integration into \(B_{\tau \varrho }\) and \(B_\varrho \setminus B_{\tau \varrho }\), we obtain

For the first integral we use Lemma 5.4 and obtain

$$\begin{aligned} {\mathbf {I}} = \tau ^n \Phi (\tau \varrho ) \le c\, \mu ^2\bigg [ \frac{\nu ^{\frac{2}{n}}}{(1-\tau )^2} + \frac{\varrho ^{2\beta }}{\nu } \Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \bigg ], \end{aligned}$$

where \(c=c(n,p,M,\delta )\). For the second integral we use \(|Du_\varepsilon |\le 1+\delta +\mu \le c(\delta )\mu \) and get

$$\begin{aligned} \mathbf {II}&\le \frac{4(1+\delta +\mu )^2 |B_{\varrho }\setminus B_{\tau \varrho }|}{|B_{\varrho }|} \le c(\delta )\,\mu ^2 (1-\tau ^n) \le c(n,\delta )\,\mu ^2 (1-\tau ), \end{aligned}$$

so that

$$\begin{aligned} \Phi (\varrho ) \le c\,\mu ^2 \bigg [\frac{\nu ^{\frac{2}{n}}}{(1-\tau )^2} + (1-\tau ) + \frac{\varrho ^{2\beta }}{\nu } \Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \bigg ], \end{aligned}$$

for a constant \(c=c(n,p,M,\delta )\). Now, we first choose \(\tau \in [\frac{1}{2},1)\) in dependence on \(n,p,M, \delta \) and \(\theta \) in such a way that \(c(1-\tau )\le \frac{1}{3}\theta \). Subsequently, we choose \(\nu \in (0,\frac{1}{4}]\) in dependence on \(n,p, M,\delta \) and \(\theta \) such that

$$\begin{aligned} \nu \le \min \bigg \{ \bigg (\frac{\theta (1-\tau )^2}{3c}\bigg )^{\frac{n}{2}} \,, \ \frac{\delta }{4(1+\delta )} \bigg \}. \end{aligned}$$

Finally, we choose \(\varrho _o\in (0,1]\) such that

$$\begin{aligned} \varrho _o^{2\beta } \le \frac{\nu \theta }{3c\big (1+\Vert f\Vert _{L^{n+\sigma }(B_R)}^2\big )}. \end{aligned}$$

In this way we obtain (5.4)\(_2\).
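To see how the three choices combine, note that each of them bounds one term in the estimate for \(\Phi (\varrho )\) by \(\frac{1}{3}\theta \). The following is a sketch of that bookkeeping, for \(\varrho \le \varrho _o\):

$$\begin{aligned} c\,(1-\tau ) \le \tfrac{1}{3}\theta , \qquad \frac{c\,\nu ^{\frac{2}{n}}}{(1-\tau )^2} \le \tfrac{1}{3}\theta , \qquad \frac{c\,\varrho ^{2\beta }}{\nu }\Vert f\Vert ^2_{L^{n+\sigma }(B_R)} \le \frac{c\,\varrho _o^{2\beta }}{\nu }\big (1+\Vert f\Vert ^2_{L^{n+\sigma }(B_R)}\big ) \le \tfrac{1}{3}\theta . \end{aligned}$$

The second bound is equivalent to \(\nu \le \big (\frac{\theta (1-\tau )^2}{3c}\big )^{\frac{n}{2}}\). Summing the three bounds and multiplying by \(\mu ^2\) gives \(\Phi (\varrho )\le \theta \mu ^2\).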

To prove (5.4)\(_1\), we first observe that the measure theoretic assumption (3.7) implies

$$\begin{aligned} |E_{\varrho }^\nu | > (1-\nu )|B_\varrho |. \end{aligned}$$

Hence, due to the definition of the set \(E_{\varrho }^\nu \), we obtain

$$\begin{aligned} \int _{B_\varrho } |Du_\varepsilon | \,\mathrm {d}x&\ge \int _{E_{\varrho }^\nu } |Du_\varepsilon | \,\mathrm {d}x\ge \big (1+\delta +(1-\nu )\mu \big ) |E_{\varrho }^\nu | \\&\ge (1-\nu )\big (1+\delta +(1-\nu )\mu \big ) |B_\varrho | . \end{aligned}$$

On the other hand, due to (5.4)\(_2\), we have

so that

$$\begin{aligned} \big |(Du_\varepsilon )_{\varrho }\big | \ge (1-\nu )\big (1+\delta +(1-\nu )\mu \big ) - \sqrt{\theta }\mu . \end{aligned}$$

Due to the choice of \(\nu \) and the fact that \(n\ge 2\) we have \(\nu \le (\frac{1}{3}\theta )^{\frac{n}{2}}\le \frac{1}{3}\theta \le \frac{1}{2}\sqrt{\theta }\) and \(\nu \le \frac{\delta }{4(1+\delta )}\). Together with the assumptions \(\delta \le \mu \) and \(\theta \le \frac{1}{64}\) we obtain

$$\begin{aligned}&-\nu (1+\delta ) + (1-\nu )^2\mu - \sqrt{\theta }\mu - \tfrac{1}{2}\mu \ge -\tfrac{1}{4} \delta + \big (\tfrac{1}{2} - 2\nu - \sqrt{\theta }\big ) \mu \\&\ge \big [\tfrac{1}{4} - 2\sqrt{\theta }\big ] \mu \ge 0. \end{aligned}$$

Inserting this above yields the claim (5.4)\(_1\) and finishes the proof of the lemma. \(\square \)
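The final chain of inequalities can be unpacked as follows. This is a sketch under the additional assumption that the constant c in the choice of \(\nu \) satisfies \(c\ge 1\), which can be arranged by enlarging c, so that indeed \(\nu \le (\frac{1}{3}\theta )^{\frac{n}{2}}\). Expanding the product in the lower bound for \(|(Du_\varepsilon )_\varrho |\) gives

$$\begin{aligned} (1-\nu )\big (1+\delta +(1-\nu )\mu \big ) - \sqrt{\theta }\mu - \big (1+\delta +\tfrac{1}{2}\mu \big ) = -\nu (1+\delta ) + (1-\nu )^2\mu - \sqrt{\theta }\mu - \tfrac{1}{2}\mu . \end{aligned}$$

The right-hand side is then estimated with the four elementary bounds

$$\begin{aligned} \nu (1+\delta ) \le \tfrac{1}{4}\delta \le \tfrac{1}{4}\mu , \qquad (1-\nu )^2 \ge 1-2\nu , \qquad 2\nu \le \sqrt{\theta }, \qquad \sqrt{\theta } \le \tfrac{1}{8}, \end{aligned}$$

which together yield the lower bound \(\big [\tfrac{1}{4}-2\sqrt{\theta }\big ]\mu \ge 0\).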

5.4 Proof of Proposition 3.4

Our aim in this subsection is to prove Proposition 3.4. We start with an excess-decay estimate for the excess \(\Phi (x_o,\varrho )\) of \(Du_\varepsilon \).

Lemma 5.6

Assume that the general hypotheses of Sect. 5 are in force. Let \(\tau \in (0,\frac{1}{2}]\) and \(\vartheta =\vartheta (n,p,M,\delta )\in (0,\frac{1}{2}]\) be the exponent from Lemma 5.2. If

$$\begin{aligned} |(Du_\varepsilon )_{x_o,\varrho }| \ge 1 + \delta + \tfrac{1}{4}\mu \qquad \text{ and }\qquad \Phi (x_o,\varrho ) \le \tau ^{\frac{n+2}{\vartheta }}\mu ^2 , \end{aligned}$$
(5.5)

hold true, then we have the quantitative excess decay estimate

$$\begin{aligned} \Phi (x_o,\tau \varrho ) \le c_*\Big [ \tau ^{2}\Phi (x_o,\varrho ) + \tau ^{-n}\varrho ^{2\beta }\mu ^2 \Big ] \end{aligned}$$

with a constant \(c_*=c_*(n,N, p,\Vert f\Vert _{L^{n+\sigma }(B_R)},M,\delta )\).

Proof

Throughout the proof we omit the reference to the center \(x_o\) and write \(B_\varrho \) instead of \(B_\varrho (x_o)\). By \(v\in u_\varepsilon +W^{1,2}_0(B_{\varrho /2},{\mathbb {R}}^N)\) we denote the unique weak solution of the linear elliptic system (5.2). For \(\tau \in (0,\frac{1}{2}]\), we have

In view of Lemma 5.3 we deduce

where \(c=c(n,N,p,\delta )\). Inserting this above and applying Lemma 5.2 and assumption (5.5)\(_2\), we end up with

Note that the constant \(c_*\) depends on \(n,N, p,\Vert f\Vert _{L^{n+\sigma }(B_R)}, M\) and \(\delta \). \(\square \)

Proof of Proposition 3.4

By \(\vartheta =\vartheta (n,p,\sigma ,M,\delta )\in (0,\min \{\tfrac{1}{2},\frac{n+\sigma }{2}-1\}]\) we denote the constant from Lemma 5.2 and by \(c_*=c_*(n,N, p, \Vert f\Vert _{L^{n+\sigma }(B_R)}, M,\delta )\) the one from Lemma 5.6. For \(\beta =\frac{\sigma }{n+\sigma }\in (0,1)\) we define \(\tau \in (0,\frac{1}{8}]\) by

$$\begin{aligned} \tau := \min \Big \{\tfrac{1}{8},2^{-\frac{1}{\beta }}, (10c_*)^{-\frac{1}{2(1-\beta )}}\Big \}. \end{aligned}$$

For the particular choice \(\theta =\tau ^{\frac{n+2}{\vartheta }}\) we let \(\varrho _o=\varrho _o(n,p,\sigma ,\Vert f\Vert _{L^{n+\sigma }(B_R)},M,\delta )\in (0,1]\) be the radius from Lemma 5.5. Finally, we define

$$\begin{aligned} {\hat{\varrho }} := \min \Big \{\varrho _o, (2c_*)^{-\frac{1}{2\beta }}\tau ^{1+\frac{2n+2}{\beta \vartheta }}\Big \}, \end{aligned}$$

so that \({\hat{\varrho }}\) depends on \(n,N,p,\sigma , \Vert f\Vert _{L^{n+\sigma }(B_R)},M,\delta \). In the following we consider a ball \(B_{2\varrho }(x_o)\subset B_{r_1}\) with \(\varrho \le {\hat{\varrho }}\). As before, we omit the reference to the center \(x_o\) and write \(B_\varrho \) instead of \(B_\varrho (x_o)\). By \(\nu =\nu (n,p,M,\delta ,\theta =\frac{1}{2}\tau ^{\frac{n+2}{\vartheta }})\in (0,\frac{1}{4}]\) we denote the constant from Lemma 5.5 and assume that (3.7) is satisfied for this particular choice of \(\nu \). Note that by our choice of \(\tau \) the parameter \(\nu \) depends on \(n,N,p,\sigma ,\Vert f\Vert _{L^{n+\sigma }(B_R)},M\) and \(\delta \). From Lemma 5.5 applied with \(\theta =\tau ^{\frac{n+2}{\vartheta }}\) we infer that

$$\begin{aligned} \big |(Du)_{\varrho }\big | \ge 1 + \delta + \tfrac{1}{2} \mu \qquad \text{ and }\qquad \Phi (\varrho ) \le \tau ^{\frac{n+2}{\vartheta }}\mu ^2. \end{aligned}$$
(5.6)

By induction we shall prove that for any \(i\in {\mathbb {N}}\) we have

$$\begin{aligned}&\Phi (\tau ^i\varrho ) \le \tau ^{\frac{n+2}{\vartheta }}\tau ^{2\beta i} \mu ^2\qquad \qquad \qquad \qquad \qquad \qquad \qquad {({\text{ I }})_{i}} \end{aligned}$$

and

$$\begin{aligned}&\big |(Du)_{\tau ^i\varrho }\big | \ge 1 + \delta + \bigg [\frac{1}{2} - \frac{1}{8} \sum _{j=0}^{i-1} 2^{-j}\bigg ] \mu .\qquad \quad \qquad \qquad {({\text{ II }})_{i}} \end{aligned}$$

For \(i=1\) we can apply Lemma 5.6, since (5.6) ensures that the assumptions of the lemma are satisfied. Then, (I)\(_1\) follows from Lemma 5.6, (5.6)\(_2\) and our choices of \(\tau \) and \({\hat{\varrho }}\), since

$$\begin{aligned} \Phi (\tau \varrho )&\le c_*\Big [\tau ^{2}\Phi (\varrho ) + \tau ^{-n}\varrho ^{2\beta }\mu ^2 \Big ] \\&\le \tfrac{1}{2} \tau ^{2\beta }\Phi (\varrho ) + \frac{c_*}{\tau ^n}\varrho ^{2\beta }\mu ^2 \\&\le \tau ^{\frac{n+2}{\vartheta }}\tau ^{2\beta } \bigg [\tfrac{1}{2} + \frac{c_*\hat{\varrho }^{2\beta }}{\tau ^{\frac{n+2}{\vartheta }+n+2\beta }}\bigg ] \mu ^2 \\&\le \tau ^{\frac{n+2}{\vartheta }}\tau ^{2\beta i}\mu ^2 . \end{aligned}$$
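The two places where the choices of \(\tau \) and \({\hat{\varrho }}\) enter can be verified explicitly. The following is a sketch of that check. Since \(\tau \le (10c_*)^{-\frac{1}{2(1-\beta )}}\), we have

$$\begin{aligned} c_*\tau ^2 = c_*\tau ^{2(1-\beta )}\,\tau ^{2\beta } \le \tfrac{1}{10}\,\tau ^{2\beta } \le \tfrac{1}{2}\,\tau ^{2\beta }. \end{aligned}$$

Moreover, the definition of \({\hat{\varrho }}\) gives

$$\begin{aligned} c_*{\hat{\varrho }}^{2\beta } \le c_*\,(2c_*)^{-1}\tau ^{2\beta +\frac{4n+4}{\vartheta }} = \tfrac{1}{2}\,\tau ^{2\beta +\frac{4n+4}{\vartheta }} \le \tfrac{1}{2}\,\tau ^{\frac{n+2}{\vartheta }+n+2\beta }, \end{aligned}$$

where the last inequality uses \(\tau <1\) together with \(\frac{4n+4}{\vartheta }-\frac{n+2}{\vartheta }=\frac{3n+2}{\vartheta }\ge n\). Hence the bracket in the third line of the display is bounded by \(\tfrac{1}{2}+\tfrac{1}{2}=1\).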

For the proof of (II)\(_1\) we use (5.6)\(_2\) and \(\tau ^{\frac{n+2}{\vartheta }}\le \tau ^{n+2}\) to obtain

so that

$$\begin{aligned} \big |(Du)_{\tau \varrho } - (Du)_\varrho \big | \le \tau \mu \le \tfrac{1}{8}\mu . \end{aligned}$$

Together with (5.6)\(_1\) this implies (II)\(_1\).

Now, we consider \(i>1\) and prove (I)\(_i\) and (II)\(_i\) assuming that (I)\(_{i-1}\) and (II)\(_{i-1}\) hold. From (I)\(_{i-1}\) and (II)\(_{i-1}\) we observe that the assumptions of Lemma 5.6 as formulated in (5.5) are satisfied on \(B_{\tau ^{i-1}\varrho }\). Therefore, applying the lemma with \(\tau ^{i-1}\varrho \) instead of \(\varrho \), recalling the choices of \(\tau \) and \({\hat{\varrho }}\) and joining the result with (I)\(_{i-1}\) yields

$$\begin{aligned} \Phi (\tau ^i\varrho )&\le c_*\Big [\tau ^{2}\Phi (\tau ^{i-1}\varrho ) + \tau ^{-n}(\tau ^{i-1}\varrho )^{2\beta }\mu ^2 \Big ] \\&\le \tfrac{1}{2} \tau ^{2\beta }\Phi (\tau ^{i-1}\varrho ) + \frac{c_*}{\tau ^n}(\tau ^{i-1}\varrho )^{2\beta }\mu ^2 \\&\le \tau ^{\frac{n+2}{\vartheta }}\tau ^{2\beta i} \bigg [\tfrac{1}{2} + \frac{c_*\hat{\varrho }^{2\beta }}{\tau ^{\frac{n+2}{\vartheta }+n+2\beta }}\bigg ] \mu ^2 \\&\le \tau ^{\frac{n+2}{\vartheta }}\tau ^{2\beta i}\mu ^2 . \end{aligned}$$

This proves (I)\(_i\). Moreover, from (I)\(_{i-1}\) and \(\tau ^{\frac{n+2}{\vartheta }}\le \tau ^{n+2}\) we obtain

so that

$$\begin{aligned} |(Du)_{\tau ^i\varrho } - (Du)_{\tau ^{i-1}\varrho }| \le \tau ^{\beta (i-1)}\tau \,\mu \le \tfrac{1}{8} 2^{-(i-1)}\,\mu , \end{aligned}$$

by our choice of \(\tau \). Together with (II)\(_{i-1}\), this proves (II)\(_i\).
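Two small computations make the induction step transparent. They are recorded here as a sketch. First, \(\tau \le 2^{-\frac{1}{\beta }}\) gives \(\tau ^{\beta (i-1)}\le 2^{-(i-1)}\), which together with \(\tau \le \frac{1}{8}\) yields the last inequality in the preceding display. Second, the geometric sum in (II)\(_i\) satisfies

$$\begin{aligned} \frac{1}{8}\sum _{j=0}^{i-1} 2^{-j} = \frac{1}{4}\big (1-2^{-i}\big ) < \frac{1}{4}, \end{aligned}$$

so that (II)\(_i\) implies \(|(Du)_{\tau ^i\varrho }|\ge 1+\delta +\tfrac{1}{4}\mu \). Since also \(\tau ^{2\beta i}\le 1\) in (I)\(_i\), both conditions in (5.5) hold on \(B_{\tau ^i\varrho }\), which is what allows Lemma 5.6 to be applied again in the next step.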

We now come to the proof of (3.8) and (3.9). For \(i\in {\mathbb {N}}\) we obtain from the minimizing property of the mean value, Lemma 2.3, (I)\(_i\), (5.6) and our choice of \(\tau \) that

(5.7)

This allows us to compute

Given \(j<k\), we use the preceding inequality to conclude that

$$\begin{aligned} \big |\big ({\mathcal {G}}_{2\delta }(Du)\big )_{\tau ^j\varrho } - \big (\mathcal G_{2\delta }(Du)\big )_{\tau ^k\varrho }\big |&\le \sum _{i=j+1}^{k} \big |\big ({\mathcal {G}}_{2\delta }(Du)\big )_{\tau ^i\varrho } - \big ({\mathcal {G}}_{2\delta }(Du)\big )_{\tau ^{i-1}\varrho }\big | \nonumber \\&\le \tau ^{\frac{n}{2}+1}\sum _{i=j+1}^{k} \tau ^{\beta (i-1)}\, \mu \le \tau ^{\frac{n}{2}+1}\frac{\tau ^{\beta j}}{1-\tau ^{\beta }}\, \mu \nonumber \\&\le 2\tau ^{\frac{n}{2}+1} \tau ^{\beta j}\, \mu . \end{aligned}$$
(5.8)
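The summation in (5.8) is a geometric series. Written out as a short check:

$$\begin{aligned} \sum _{i=j+1}^{k} \tau ^{\beta (i-1)} = \tau ^{\beta j}\sum _{m=0}^{k-j-1} \tau ^{\beta m} \le \frac{\tau ^{\beta j}}{1-\tau ^{\beta }} \le 2\,\tau ^{\beta j}, \end{aligned}$$

where the last step uses \(\tau ^{\beta }\le \frac{1}{2}\), a consequence of \(\tau \le 2^{-\frac{1}{\beta }}\).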

This shows that \(((\mathcal G_{2\delta }(Du))_{\tau ^i\varrho })_{i=1}^\infty \) is a Cauchy sequence and therefore the limit

$$\begin{aligned} \Gamma _{x_o} := \lim _{i\rightarrow \infty } \big (\mathcal G_{2\delta }(Du)\big )_{\tau ^i\varrho } \end{aligned}$$

exists. Passing to the limit \(k\rightarrow \infty \) in (5.8) yields

$$\begin{aligned} \big |\big ({\mathcal {G}}_{2\delta }(Du)\big )_{\tau ^j\varrho } - \Gamma _{x_o}\big | \le 2\tau ^{\frac{n}{2}+1} \tau ^{\beta j} \mu \qquad \text{ for } \text{ any } j\in {\mathbb {N}}. \end{aligned}$$

Joining this with (5.7), we find

For \(r\in (0,\varrho ]\) there exists \(j\in {\mathbb {N}}_0\) such that \(\tau ^{j+1}\varrho <r\le \tau ^{j}\varrho \). Then, we obtain from the last inequality

This implies

so that also

$$\begin{aligned} \Gamma _{x_o} = \lim _{r\downarrow 0} \big (\mathcal G_{2\delta }(Du)\big )_{r}. \end{aligned}$$

Finally, due to assumption (3.4) we have \(|(\mathcal G_{2\delta }(Du))_{r}|\le \mu \) for any \(0<r\le \varrho \), which implies \(|\Gamma _{x_o}|\le \mu \). This finishes the proof of Proposition 3.4. \(\square \)

6 The degenerate regime

Our aim in this section is to prove Proposition 3.5, which treats the degenerate regime. The proof relies on a De Giorgi type reduction argument reducing the supremum of \(U_\varepsilon =(|Du_\varepsilon |-1-\delta )_+^2\) under the measure theoretic assumption (3.10). The starting point is the energy estimate for \(U_\varepsilon \) from Lemma 4.3.

As in Section 5, we first formulate the general assumptions. For \(\varepsilon \in (0,1]\) we denote by \(u_\varepsilon \in W^{1,p}(B_R,{\mathbb {R}}^N)\) the unique weak solution to the Dirichlet problem (3.1) associated to the regularized system. We assume that (3.4) is in force for some \(\mu ,\delta >0\) on some ball \(B_{2\varrho }(x_o)\subset B_{r_1}\Subset B_R\). Let \(U_\varepsilon :=(|Du_\varepsilon |-1-\delta )_+^2\) denote the function defined in (4.2). Note that (3.4) implies

$$\begin{aligned} \sup _{B_{2\varrho }(x_o)} U_\varepsilon \le \mu ^2. \end{aligned}$$

Moreover, we set \(\beta :=\frac{\sigma }{n+\sigma }\in (0,1)\).

We start with a De Giorgi type lemma for \(U_\varepsilon \), which can for instance be deduced as in [19, Chap. 10, Proposition 4.1] by the use of the energy estimate from Lemma 4.3. For the reader’s convenience we provide the proof in the appendix, Sect. 7.

Lemma 6.1

(Reducing the supremum) Assume that the general assumptions of Sect. 6 are in force and let \(\theta \in (0,1)\). Then, there exists \({\tilde{\nu }}={\tilde{\nu }}(n,p, \Vert f\Vert _{n+\sigma }, M,\delta )\in (0,1)\) such that the measure theoretic assumption

$$\begin{aligned} \big |\big \{x\in B_{\varrho }(x_o): U_\varepsilon (x)> (1-\theta )\mu ^2\big \}\big | < {\tilde{\nu }}\,\big |B_{\varrho }(x_o)\big |, \end{aligned}$$

implies that either

$$\begin{aligned} \mu ^2<\frac{\varrho ^{\beta }}{\theta }, \end{aligned}$$

or

$$\begin{aligned} U_\varepsilon \le \big (1-\tfrac{1}{2}\theta \big )\mu ^2 \qquad \text{ in } B_{\varrho /2}(x_o) \end{aligned}$$

holds true. \(\Box \)

The next lemma can be deduced as in [19, Chap. 10, Proposition 5.1], again utilizing the energy estimate from Lemma 4.3; see also Sect. 7.

Lemma 6.2

Assume that the general assumptions of Sect. 6 are in force and assume that (3.10) is satisfied for some \(\nu \in (0,1)\). Then, for any \(i_*\in {\mathbb {N}}\) we either have

$$\begin{aligned} \mu ^2<2^{i_*}\varrho ^{\beta }/\nu \end{aligned}$$

or

$$\begin{aligned} \big |\big \{x\in B_{\varrho }(x_o): U_\varepsilon (x)> (1-2^{-i_*}\nu )\mu ^2\big \}\big | < \frac{c_*}{\nu \sqrt{i_*}}\,\big |B_{\varrho }(x_o)\big | \end{aligned}$$

for a constant \(c_*=c_*(n,p,\Vert f\Vert _{n+\sigma }, M,\delta )\).

Now, we have all the prerequisites at hand to provide the

Proof of Proposition 3.5

Let \({\tilde{\nu }}\in (0,1)\) and \(c_*\) be the constants from Lemmas 6.1 and 6.2. Note that both depend on \(n,p,\Vert f\Vert _{n+\sigma },M\) and \(\delta \). Choose \(i_*\in {\mathbb {N}}\) such that

$$\begin{aligned} i_*\ge \Big (\frac{c_*}{\nu {\tilde{\nu }}}\Big )^2 . \end{aligned}$$

Then \(i_*\) depends on \(n,p,\Vert f\Vert _{n+\sigma },M,\delta \) and \(\nu \). Lemma 6.2 implies that either \(\mu ^2<2^{i_*}\varrho ^{\beta }/\nu \), or

$$\begin{aligned} \big |\big \{x\in B_{\varrho }(x_o): U_\varepsilon (x)> (1-2^{-i_*}\nu )\mu ^2\big \}\big | < \frac{c_*}{\nu \sqrt{i_*}}\,\big |B_{\varrho }(x_o)\big | \le {\tilde{\nu }}\,\big |B_{\varrho }(x_o)\big |. \end{aligned}$$
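The last inequality in the preceding display is a direct consequence of the choice of \(i_*\). Spelled out, taking square roots in that choice gives

$$\begin{aligned} \sqrt{i_*}\ge \frac{c_*}{\nu {\tilde{\nu }}} \quad \Longrightarrow \quad \frac{c_*}{\nu \sqrt{i_*}}\le {\tilde{\nu }}. \end{aligned}$$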

In the first case the proposition is proved with \(c_o=2^{i_*}/\nu \), while in the second case we may apply Lemma 6.1 with \(\theta =2^{-i_*}\nu \). Therefore either \(\mu ^2<2^{i_*}\varrho ^{\beta }/\nu \) or

$$\begin{aligned} U_\varepsilon \le \big (1-2^{-(i_*+1)}\nu \big )\mu ^2 \qquad \text{ in } B_{\varrho /2}(x_o). \end{aligned}$$
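For the record, both alternatives result from those of Lemma 6.1 by inserting \(\theta =2^{-i_*}\nu \), since

$$\begin{aligned} \frac{\varrho ^{\beta }}{\theta } = \frac{2^{i_*}\varrho ^{\beta }}{\nu } \qquad \text{ and } \qquad 1-\tfrac{1}{2}\theta = 1-2^{-(i_*+1)}\nu . \end{aligned}$$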

The first alternative coincides with the first alternative above, while the second one implies the sup-bound for \({\mathcal {G}}_{\delta }(Du_\varepsilon )\) for any \(\kappa \ge \sqrt{1-2^{-(i_*+1)}\nu }\), since \(U_\varepsilon =|{\mathcal {G}}_{\delta }(Du_\varepsilon )|^2\). Therefore we may choose \(\kappa \in [2^{-\beta /2},1)\) as required, for instance \(\kappa :=\max \big \{2^{-\beta /2},\sqrt{1-2^{-(i_*+1)}\nu }\big \}\). \(\square \)