1 Introduction

A classical identity, which links the Laplacian \(\Delta {\textbf {u}}\) of a vector-valued function \({\textbf {u}}\in C^3(\Omega , {\mathbb R^N})\) to its Hessian \(\nabla ^2 {\textbf {u}}\), tells us that

$$\begin{aligned} |\Delta {\textbf {u}}|^2 = \textrm{{div}} \Big ( (\Delta {\textbf {u}})^T \nabla {\textbf {u}}- \tfrac{1}{2} \nabla |\nabla {\textbf {u}}|^2\Big ) + |\nabla ^2 {\textbf {u}}|^2 \quad \hbox {in }\quad \Omega , \end{aligned}$$
(1.1)

where \(\Omega \) is an open set in \({\mathbb R^n}\). Here, and in what follows, \(n\ge 2\), \(N \ge 1\), and the gradient \(\nabla {\textbf {u}}\) of a function \({\textbf {u}}: \Omega \rightarrow \mathbb {R}^N\) is regarded as the matrix in \(\mathbb {R}^{N\times n}\) whose rows are the gradients in \(\mathbb {R}^{1 \times n}\) of the components \(u^1, \, \dots \, , u^N\) of \({\textbf {u}}\). Moreover, the suffix “T” stands for transpose.
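Identity (1.1) can be checked directly in components. The following is a brief sketch, in which \(\partial _j\) denotes a partial derivative and repeated indices \(j,k \in \{1,\dots ,n\}\) and \(\alpha \in \{1,\dots ,N\}\) are summed; this index notation is adopted here only for illustration.

$$\begin{aligned} \textrm{{div}} \big ((\Delta {\textbf {u}})^T \nabla {\textbf {u}}\big )&= \partial _j \big (\Delta u^\alpha \, \partial _j u^\alpha \big ) = |\Delta {\textbf {u}}|^2 + \partial _j \Delta u^\alpha \, \partial _j u^\alpha , \\ \tfrac{1}{2}\, \textrm{{div}} \big (\nabla |\nabla {\textbf {u}}|^2\big )&= \partial _j \big (\partial _k u^\alpha \, \partial _j \partial _k u^\alpha \big ) = |\nabla ^2 {\textbf {u}}|^2 + \partial _k u^\alpha \, \partial _k \Delta u^\alpha . \end{aligned}$$

Subtracting the second line from the first, the third-order terms cancel and (1.1) follows. This is the only place where the assumption \({\textbf {u}}\in C^3\) is needed.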

Identity (1.1) dates back more than a century: it can be found in [11] for \(n=2\); see also [43, 58]. It has applications, for instance, in the second-order \(L^2\)-regularity theory for solutions to the Poisson system for the Laplace operator

$$\begin{aligned} - \Delta {\textbf {u}}= {\textbf {f}}\quad \text {in}\quad \Omega , \end{aligned}$$
(1.2)

where \({\textbf {f}}: \Omega \rightarrow \mathbb R^N\). Indeed, identity (1.1) enables one to bound the integral of \(|\nabla ^2 {\textbf {u}}|^2\) over some set in \(\Omega \) by the integral of \(|\Delta {\textbf {u}}|^2\) over the same set, plus a boundary integral involving the expression under the divergence operator. Of course, since the equations in the linear system (1.2) are uncoupled, its theory is reduced to that of its single equations.
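The integral bound just described can be made explicit as follows. Assume, for illustration, that \(\omega \) is a bounded open set with Lipschitz boundary such that \(\overline{\omega } \subset \Omega \), and denote by \(\nu \) its outward unit normal; the symbols \(\omega \) and \(\nu \) are introduced here and are not part of the notation used elsewhere. Integrating (1.1) over \(\omega \) and applying the divergence theorem gives, for \({\textbf {u}}\in C^3(\Omega , {\mathbb R^N})\),

$$\begin{aligned} \int _\omega |\nabla ^2 {\textbf {u}}|^2\, dx = \int _\omega |\Delta {\textbf {u}}|^2\, dx - \int _{\partial \omega } \Big ( (\Delta {\textbf {u}})^T \nabla {\textbf {u}}- \tfrac{1}{2} \nabla |\nabla {\textbf {u}}|^2\Big ) \cdot \nu \, d\mathcal H^{n-1}. \end{aligned}$$

In this form, second-order estimates reduce to controlling the boundary integral, which is where cut-off functions (for local bounds) or the geometry of the boundary (for global bounds) typically come into play.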

The second-order regularity theory of nonlinear equations and systems is much less developed, even for the basic p-Laplace equation or system

$$\begin{aligned} - \textbf{div} (|\nabla {\textbf {u}}|^{p-2}\nabla {\textbf {u}}) = {\textbf {f}}\qquad \text {in}\quad \Omega , \end{aligned}$$
(1.3)

where \(p>1\) and “\( \textbf{div}\)” denotes the \(\mathbb {R}^N\)-valued divergence operator. Standard results concern weak differentiability properties of the expression \(|\nabla {\textbf {u}}|^{\frac{p-2}{2}}\nabla {\textbf {u}}\). They trace back to [61] for \(p>2\), and to [1, 22] for every \(p >1\). The case of a single equation was earlier considered in [62]. Further developments are in [8, 20, 33].

As demonstrated by several more recent contributions, the regularity of solutions to p-Laplacian type equations and systems is often most neatly described in terms of the expression \(|\nabla {\textbf {u}}|^{{p-2}}\nabla {\textbf {u}}\) appearing under the divergence operator in (1.3). This surfaces, for instance, from BMO and Hölder bounds of [39], potential estimates of [47], rearrangement inequalities of [28], pointwise oscillation estimates of [13], regularity results up to the boundary of [14]. More results in this connection can be found e.g. in [3, 30, 48].

Differentiability properties of \(|\nabla {\textbf {u}}|^{{p-2}}\nabla {\textbf {u}}\) have customarily been detected under strong regularity assumptions on the right-hand side \({\textbf {f}}\). This is the case of [50], where local solutions are considered. High regularity of the right-hand side is also assumed in [35], where results for boundary value problems can be found under smoothness assumptions on \(\partial \Omega \). Both papers [35, 50] deal with scalar problems, i.e. with the case when \(N=1\). Fractional-order regularity of the gradient of solutions to quasilinear equations of p-Laplacian type has been studied in [57], and in the more recent contributions [3, 18, 21, 55, 56]. The question of fractional-order regularity of the quantity \(|\nabla {\textbf {u}}|^{{p-2}}\nabla {\textbf {u}}\), when \(N=1\) and the right-hand side of Eq. (1.3) is in divergence form, is addressed in [5], where, in particular, sharp results are obtained for \(n=2\).

Optimal second-order \(L^2\)-estimates for solutions to a class of problems, including (1.3) for every \(p >1\), in the scalar case, have recently been established in [31]. Loosely speaking, these estimates tell us that \(|\nabla {\textbf {u}}|^{p-2}\nabla {\textbf {u}}\in W^{1,2}\) if and only if \({\textbf {f}}\in L^2\). Such a property is shown to hold both locally, and, under minimal regularity assumptions on the boundary, also globally. Parallel results are derived in [32] for vectorial problems, namely for \(N\ge 2\), but for the restricted range of powers \(p >\frac{3}{2}\). The results of [31, 32] rely upon the idea that, in the nonlinear case, the role of the pointwise identity (1.1) can be performed by a pointwise inequality. The latter amounts to a bound from below for the square of the right-hand side of (1.3) by the square of the derivatives of \(|\nabla {\textbf {u}}|^{p-2}\nabla {\textbf {u}}\), plus an expression in divergence form. The restriction for the admissible values of p in the vectorial case stems from this pointwise inequality.

In the present paper we offer an enhanced pointwise inequality in the same spirit, with best possible constant, for a class of nonlinear differential operators of the form \(- \textbf{div}(a(|\nabla {\textbf {u}}|) \nabla {\textbf {u}})\). The relevant inequality holds under general assumptions on the function a, which also allow growths that are not necessarily of power type. Importantly, our inequality improves the available results even in the case when the operator is the p-Laplacian, namely when \(a(t)=t^{p-2}\). In particular, for this special choice, it entails the existence of a constant \(c>0\) such that

$$\begin{aligned}&\big |\textrm{\textbf{div}} (|\nabla {\textbf {u}}|^{p-2}\nabla {\textbf {u}})\big |^2\nonumber \\ {}&\quad \ge \textrm{{div}} \Big [|\nabla {\textbf {u}}|^{2(p-2)} \Big ((\Delta {\textbf {u}})^T \nabla {\textbf {u}}- \tfrac{1}{2} \nabla |\nabla {\textbf {u}}|^2\Big )\Big ] + c\, |\nabla {\textbf {u}}|^{2(p-2)} |\nabla ^2 {\textbf {u}}|^2 \end{aligned}$$
(1.4)

in \(\{\nabla {\textbf {u}}\ne 0\}\) if and only if either \(N=1\) and \(p>1\), or \(N \ge 2\) and \(p>2(2-\sqrt{2})\approx 1.1715\).

The differential inequality to be presented, in its general version, is the crucial point of departure in our proof of the local and global \(W^{1,2}\)-regularity for the expression \(a(|\nabla {\textbf {u}}|) \nabla {\textbf {u}}\) for systems of the form

$$\begin{aligned} - \textrm{\textbf{div}}( a(|\nabla {{\textbf {u}}}|) \nabla \textbf{u} ) = \textbf{f} \quad \textrm{in}\,\,\, \Omega . \end{aligned}$$
(1.5)

Regularity issues for equations and systems driven by non standard nonlinearities, encompassing (1.5), are nowadays the subject of a rich literature. A non exhaustive sample of contributions along this direction of research includes [2, 4, 6, 7, 16, 19, 23, 26, 27, 36, 39, 40, 44, 45, 49, 51, 60].

Let us incidentally note that system (1.5) is the Euler equation of the functional

$$\begin{aligned} J({{\textbf {u}}}) = \int _\Omega B(|\nabla {{\textbf {u}}}|) - \textbf{f}\cdot {{\textbf {u}}}\,\, dx. \end{aligned}$$
(1.6)

Here, the dot “\( \, \cdot \, \)” stands for scalar product, and \(B:[0, \infty ) \rightarrow [0, \infty )\) is the function defined as

$$\begin{aligned} B(t) = \int _0^t b(s)\, ds \qquad \text {for} \quad t \ge 0, \end{aligned}$$
(1.7)

where the function \(b: [0, \infty ) \rightarrow [0, \infty )\) is given by

$$\begin{aligned} b(t)= a(t) t \qquad \hbox {for}\quad t >0, \end{aligned}$$
(1.8)

and \(b(0)=0\).

Under the assumptions to be imposed on a, the function B and the functional J turn out to be strictly convex. In particular, if \(a(t)=t^{p-2}\), then \(B(t)=\frac{1}{p} t^p\), and J agrees with the usual energy functional associated with the p-Laplace system (1.3).
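The link between the functional (1.6) and system (1.5) can be seen through a formal first-variation computation. In the sketch below, \(\varvec{\varphi }\) denotes a smooth compactly supported test function \(\Omega \rightarrow \mathbb R^N\) (a symbol introduced here for illustration), the integrand is understood to vanish where \(\nabla {\textbf {u}}=0\), and differentiation under the integral sign is taken for granted.

$$\begin{aligned} \frac{d}{d\varepsilon } J({\textbf {u}}+ \varepsilon \varvec{\varphi })\Big |_{\varepsilon =0}&= \int _\Omega B'(|\nabla {\textbf {u}}|)\, \frac{\nabla {\textbf {u}}\cdot \nabla \varvec{\varphi }}{|\nabla {\textbf {u}}|} - {\textbf {f}}\cdot \varvec{\varphi }\, dx \\&= \int _\Omega a(|\nabla {\textbf {u}}|)\, \nabla {\textbf {u}}\cdot \nabla \varvec{\varphi } - {\textbf {f}}\cdot \varvec{\varphi }\, dx , \end{aligned}$$

since \(B'(t)=b(t)=a(t)t\) by (1.7) and (1.8). The vanishing of this expression for every \(\varvec{\varphi }\) is the weak formulation of (1.5). For \(a(t)=t^{p-2}\) one has \(b(t)=t^{p-1}\), whence \(B(t)=\int _0^t s^{p-1}\, ds = \frac{1}{p}t^p\).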

We shall focus on the case when \(N \ge 2\), the case of equations being already fully covered by the results of [31]. In particular, our regularity results apply to the p-Laplacian system (1.3) for every

$$\begin{aligned} p>2(2-\sqrt{2}) \approx 1.1715. \end{aligned}$$
(1.9)

Hence, we extend the range of the admissible exponents p known until now, which was limited to \(p>\frac{3}{2}\).

In the light of the pointwise inequality (1.4), the lower bound (1.9) for p is optimal for our approach to the second-order regularity of solutions to the p-Laplace system (1.3). Whether such a restriction is really indispensable for this regularity, or whether it can be dropped as in the case \(N=1\), where every \(p>1\) is admitted, is a challenging open problem.

2 Main results

The statement of the general differential inequality requires some notation. Given a positive function \(a \in C^1((0, \infty ))\), we define the indices

$$\begin{aligned} i_a= \inf _{t>0} \frac{t a'(t)}{a(t)} \qquad \hbox {and} \qquad s_a= \sup _{t >0} \frac{t a'(t)}{a(t)}, \end{aligned}$$
(2.1)

where \(a'\) stands for the derivative of a. Plainly, if \(a(t)=t^{p-2}\), then \(i_a=s_a=p-2\).
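To illustrate how the indices behave for a growth that is not of pure power type, consider the function \(a(t)=t^{p-2}+t^{q-2}\) with exponents \(p<q\). This example is chosen here for illustration and is not taken from the results stated below. A direct computation gives

$$\begin{aligned} \frac{t a'(t)}{a(t)} = \frac{(p-2)t^{p-2}+(q-2)t^{q-2}}{t^{p-2}+t^{q-2}} = \theta (t)\,(p-2) + \big (1-\theta (t)\big )(q-2), \qquad \theta (t) = \frac{1}{1+t^{q-p}} \in (0,1). \end{aligned}$$

Since \(\theta (t)\rightarrow 1\) as \(t \rightarrow 0^+\) and \(\theta (t) \rightarrow 0\) as \(t\rightarrow \infty \), one obtains \(i_a=p-2\) and \(s_a=q-2\), neither of which is attained at any \(t>0\).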

Moreover, for \(N\ge 1\), we define the continuous increasing function \(\kappa _N : [1, \infty ) \rightarrow \mathbb R\) by

$$\begin{aligned} \kappa _1 (p) = {\left\{ \begin{array}{ll} (p-1)^2 &{}\qquad \text {if}\quad p \in [1,2) \\ 1 &{}\qquad \text {if}\quad p \in [2, \infty ), \end{array}\right. } \end{aligned}$$
(2.2)

if \(N=1\), and

$$\begin{aligned} \kappa _N (p) = {\left\{ \begin{array}{ll} 1- \frac{1}{8}(4-p)^2 &{}\quad \text {if}\quad p \in [1, \frac{4}{3}) \\ (p-1)^2 &{}\quad \text {if }\quad p \in [\frac{4}{3},2) \\ 1 &{}\quad \text {if}\quad p \in [2, \infty ), \end{array}\right. } \end{aligned}$$
(2.3)

if \(N \ge 2\).
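The branches in (2.3) match at the junction points, and the difference between the scalar and the vectorial constant can be written in closed form. A short verification:

$$\begin{aligned} 1- \tfrac{1}{8}\big (4-\tfrac{4}{3}\big )^2&= 1 - \tfrac{8}{9} = \tfrac{1}{9} = \big (\tfrac{4}{3}-1\big )^2, \qquad (2-1)^2 = 1, \\ \kappa _1(p) - \kappa _N(p)&= (p-1)^2 - 1 + \tfrac{1}{8}(4-p)^2 = \tfrac{1}{8}(3p-4)^2 \ge 0 \qquad \text {for}\quad p \in [1, \tfrac{4}{3}). \end{aligned}$$

Thus \(\kappa _N \le \kappa _1\) for \(N \ge 2\), with equality exactly for \(p \ge \frac{4}{3}\). At the left endpoint, \(\kappa _1(1)=0\), whereas \(\kappa _N(1) = -\frac{1}{8}\) for \(N \ge 2\).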

Theorem 2.1

(General pointwise inequality). Let \(n \ge 2\) and \(N \ge 1\). Let \(\Omega \) be an open set in \({\mathbb R^n}\) and let \({\textbf {u}}\in C^3(\Omega , {\mathbb R^N})\).

(i) [Nonsingular case] Assume that the function \(a \in C^0([0, \infty ))\) is such that:

$$\begin{aligned}&a(t)>0 \qquad \text { for }\quad t>0, \end{aligned}$$
(2.4)
$$\begin{aligned}&i_a\ge -1, \end{aligned}$$
(2.5)

and

$$\begin{aligned} b \in C^1([0, \infty )), \end{aligned}$$
(2.6)

where b is the function defined by (1.8). Then

$$\begin{aligned}&\big |\textrm{\textbf{div}} \big (a(|\nabla {\textbf {u}}|)\nabla {\textbf {u}}\big )\big |^2 \nonumber \\&\quad \ge \textrm{{div}} \Big [a(|\nabla {\textbf {u}}|)^2 \Big ( (\Delta {\textbf {u}})^T \nabla {\textbf {u}}- \tfrac{1}{2} \nabla |\nabla {\textbf {u}}|^2\Big )\Big ] + \kappa _N (i_a+2) a(|\nabla {\textbf {u}}|)^2 |\nabla ^2 {\textbf {u}}|^2 \end{aligned}$$
(2.7)

in \(\Omega \), where \(\kappa _N \) is defined as in (2.2)–(2.3). Moreover, the constant \( \kappa _N (i_a+2)\) is sharp.

(ii) [General case] If a is just defined in \((0,\infty )\), \(a \in C^1((0, \infty ))\), and conditions (2.4) and (2.5) are fulfilled, then inequality (2.7) continues to hold in the set \(\{\nabla {\textbf {u}}\ne 0\}\).

Remark 2.2

Observe that the assumption (2.6) need not be fulfilled by the functions a appearing in the elliptic systems (1.5) to be considered. Such an assumption fails, for instance, when \(a(t)=t^{p-2}\) with \(1<p<2\). This calls for a regularization argument for a in our applications of inequality (2.7) to the solutions to the systems in question. The solutions to the regularized systems will also enjoy the smoothness properties required on the function \({\textbf {u}}\) in Part (i) of Theorem 2.1. On the other hand, the functions a in the original systems do satisfy the conditions required in Part (ii) of Theorem 2.1 for the validity of inequality (2.7) outside the set \({\{{\nabla {\textbf {u}}= 0}\}}\) of critical points of the function \({\textbf {u}}\).
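The failure of (2.6) for \(a(t)=t^{p-2}\) with \(1<p<2\) can be read off from \(b(t)=t^{p-1}\), since \(b'(t)=(p-1)t^{p-2}\rightarrow \infty \) as \(t\rightarrow 0^+\). As a purely illustrative sketch of what a regularization may look like, one standard choice is shown below; the family \(a_\varepsilon \) is an assumption made here and need not coincide with the regularization actually employed in the proofs.

$$\begin{aligned} a_\varepsilon (t) = (\varepsilon ^2+t^2)^{\frac{p-2}{2}}, \qquad b_\varepsilon (t) = t\,(\varepsilon ^2+t^2)^{\frac{p-2}{2}}, \qquad \frac{t a_\varepsilon '(t)}{a_\varepsilon (t)} = (p-2)\frac{t^2}{\varepsilon ^2+t^2} \qquad \text {for}\quad t>0. \end{aligned}$$

For \(\varepsilon >0\), the function \(b_\varepsilon \) is smooth on \([0,\infty )\), so that (2.6) holds, and for \(1<p<2\) the quotient on the right ranges in \((p-2,0)\). Hence \(i_{a_\varepsilon }=p-2=i_a\), and the constant \(\kappa _N(i_a+2)\) in (2.7) is not affected by this particular regularization.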

In the special case when \(a(t)=t^{p-2}\), Theorem 2.1 yields the following inequality for the p-Laplace operator we alluded to in Sect. 1.

Corollary 2.3

(Pointwise inequality for the p-Laplacian). Let \(n \ge 2\) and \(N \ge 1\). Let \(\Omega \) be an open set in \({\mathbb R^n}\) and let \({\textbf {u}}\in C^3(\Omega , {\mathbb R^N})\). Assume that \(p\ge 1\). Then

$$\begin{aligned}&\big |\textrm{\textbf{div}} (|\nabla {\textbf {u}}|^{p-2}\nabla {\textbf {u}})\big |^2 \nonumber \\&\quad \ge \textrm{{div}} \Big [|\nabla {\textbf {u}}|^{2(p-2)} \Big ((\Delta {\textbf {u}})^T \nabla {\textbf {u}}- \tfrac{1}{2} \nabla |\nabla {\textbf {u}}|^2\Big )\Big ]+ \kappa _N(p) |\nabla {\textbf {u}}|^{2(p-2)} |\nabla ^2 {\textbf {u}}|^2 \end{aligned}$$
(2.8)

in \(\{\nabla {\textbf {u}}\ne 0\}\). Moreover, the constant \( \kappa _N(p)\) is sharp.

Notice that, if \(N=1\), then

$$\begin{aligned} \kappa _1(p)>0 \quad \text {if}\quad p >1, \end{aligned}$$
(2.9)

whereas, if \(N \ge 2\),

$$\begin{aligned} \kappa _N (p)>0 \quad \text { if } \quad p>2(2-\sqrt{2}). \end{aligned}$$
(2.10)

The gap between (2.9) and (2.10) is responsible for the different implications of inequality (2.7) in view of second-order \(L^2\)-estimates for solutions to

$$\begin{aligned} - \textrm{\textbf{div}}( a(|\nabla {{\textbf {u}}}|) \nabla \textbf{u} ) = \textbf{f} \quad \textrm{in}\,\,\, \Omega \,, \end{aligned}$$
(2.11)

according to whether \(N=1\) or \(N \ge 2\). Indeed, inequality (2.7) is of use for this purpose only if \(\kappa _N(i_a+2)>0\).
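The threshold in (2.10) follows from an elementary computation with the first branch of (2.3). For \(p \in [1, \frac{4}{3})\), where \(4-p>0\),

$$\begin{aligned} \kappa _N(p)>0 \iff (4-p)^2< 8 \iff 4-p < 2\sqrt{2} \iff p > 2(2-\sqrt{2}), \end{aligned}$$

and \(2(2-\sqrt{2}) < \frac{4}{3}\), so the threshold indeed falls in this branch, whereas \(\kappa _N(p) \ge \frac{1}{9}\) for \(p \ge \frac{4}{3}\). In terms of the index \(i_a\), the requirement \(\kappa _N(i_a+2)>0\) for \(N\ge 2\) thus reads \(i_a > 2(1-\sqrt{2}) \approx -0.8284\), which is condition (2.12) in the theorems below.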

Since we are concerned with \(L^2\)-estimates, the datum \(\textbf{f}\) in (2.11) is assumed to be merely square integrable. Solutions in a suitably generalized sense have thus to be considered. For instance, the existence of standard weak solutions to the p-Laplace system (1.3) is only guaranteed if \(p\ge \frac{2n}{n+2}\). In the scalar case, various definitions of solutions—entropy solutions, renormalized solutions, SOLA—that allow for right-hand sides that are just integrable functions, or even finite measures, are available in the literature, and turn out to be a posteriori equivalent. Note that these solutions need not be even weakly differentiable. The case of systems is more delicate and has been less investigated. A notion of solution, which is well tailored for our purposes and will be adopted, is patterned on the approach of [41]. Loosely speaking, the solutions in question are only approximately differentiable, and are pointwise limits of solutions to approximating problems with smooth right-hand sides.

The outline of the derivation of the second-order \(L^2\)-bounds for these solutions to system (2.11) via Theorem 2.1 is analogous to the one of [31]. However, new technical obstacles have to be faced, due to the non-polynomial growth of the coefficient a in the differential operator. In particular, an \(L^1\)-estimate, of independent interest, for the expression \(a(|\nabla {{\textbf {u}}}|) \nabla \textbf{u}\) for merely integrable data \({\textbf {f}}\) is established. Such an estimate is already available in the literature for equations, but seems to be new for systems, and its proof requires an ad hoc Sobolev type inequality in Orlicz spaces.

Our local estimate for system (2.11) reads as follows. In the statement, \(B_R\) and \(B_{2R}\) denote concentric balls, with radii R and 2R, respectively.

Theorem 2.4

(Local estimates). Let \(\Omega \) be an open set in \(\mathbb R^n\), with \(n \ge 2\), and let \(N \ge 2\). Assume that the function \(a: (0, \infty ) \rightarrow (0, \infty )\) is continuously differentiable, and satisfies

$$\begin{aligned} i_a>2(1-\sqrt{2})\,, \end{aligned}$$
(2.12)

and

$$\begin{aligned} s_a< \infty \,. \end{aligned}$$
(2.13)

Let \({\textbf {f}}\in L^2_\textrm{loc}(\Omega , {\mathbb R^N})\) and let \({{\textbf {u}}}\) be an approximable local solution to system (2.11). Then

$$\begin{aligned} a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\in W^{1,2}_\textrm{loc}(\Omega , {\mathbb R^{N\times n}}), \end{aligned}$$
(2.14)

and there exists a constant \(C=C(n, N, i_a, s_a)\) such that

$$\begin{aligned}&R^{-1}{\big \Vert {a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}}\big \Vert }_{L^2(B_R, {\mathbb R^{N\times n}})} + \,{\big \Vert {\nabla \big (a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\big )}\big \Vert }_{L^2(B_R, \mathbb R^{N\times n\times n})} \nonumber \\&\quad \le C \Big (\,\Vert {\textbf {f}}\Vert _{L^2(B_{2R}, {\mathbb R^N})} + R^{-\frac{n}{2}-1}\Vert a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\Vert _{L^{1}(B_{2R}, {\mathbb R^{N\times n}})}\Big ) \end{aligned}$$
(2.15)

for any ball \(B_{2R} \subset \subset \Omega \).

Remark 2.5

In particular, if \(\Omega = \mathbb R^n\) and, for instance, \(a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\in L^1(\mathbb R^n, {\mathbb R^{N\times n}})\), then passing to the limit in inequality (2.15) as \(R\rightarrow \infty \) tells us that

$$\begin{aligned} {\big \Vert {\nabla \big (a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\big )}\big \Vert }_{L^2(\mathbb R^n, \mathbb R^{N\times n\times n})} \le C \Vert {\textbf {f}}\Vert _{L^2(\mathbb R^n, {\mathbb R^N})}. \end{aligned}$$
(2.16)
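The passage to the limit leading to (2.16) can be spelled out as follows. Dropping the first (nonnegative) term on the left-hand side of (2.15) and enlarging the domains of integration on the right-hand side, one has, for balls centered at a fixed point and with C independent of R,

$$\begin{aligned} {\big \Vert {\nabla \big (a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\big )}\big \Vert }_{L^2(B_R, \mathbb R^{N\times n\times n})} \le C \Big (\Vert {\textbf {f}}\Vert _{L^2(\mathbb R^n, {\mathbb R^N})} + R^{-\frac{n}{2}-1}\Vert a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\Vert _{L^{1}(\mathbb R^n, {\mathbb R^{N\times n}})}\Big ). \end{aligned}$$

As \(R \rightarrow \infty \), the last term tends to 0 because the \(L^1\) norm is finite, while the left-hand side increases to the norm over \(\mathbb R^n\) by monotone convergence. Of course, the resulting bound is informative only when \({\textbf {f}}\in L^2(\mathbb R^n, {\mathbb R^N})\).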

We next deal with global estimates for solutions to system (2.11), subject to Dirichlet homogeneous boundary conditions. Namely, we consider solutions to problems of the form

$$\begin{aligned} {\left\{ \begin{array}{ll} -\textrm{\textbf{div}} (a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}} ) = {{\textbf {f}}} &{} \textrm{in}\,\,\, \Omega \\ {{\textbf {u}}} =0 &{} \textrm{on}\,\,\, \partial \Omega \,. \end{array}\right. } \end{aligned}$$
(2.17)

As shown by classical counterexamples, even in the linear case, global estimates involving second-order derivatives of solutions can only hold under suitable regularity assumptions on \(\partial \Omega \). Specifically, information on the (weak) curvatures of \(\partial \Omega \) is relevant in this connection. Convexity of the domain \(\Omega \), which results in a positive semidefinite second fundamental form of \(\partial \Omega \), is well known to ensure bounds in \(W^{2,2}(\Omega , {\mathbb R^{N}})\) for the solution \({{\textbf {u}}}\) to the homogeneous Dirichlet problem associated with the linear system (1.2) in terms of the \(L^2(\Omega , {\mathbb R^N})\) norm of \({\textbf {f}}\); see [43]. The following result provides us with an analogue for problem (2.17), for the same class of nonlinearities a as in Theorem 2.4.

Theorem 2.6

(Global estimates in convex domains) Let \(\Omega \) be any bounded convex open set in \({\mathbb R^n}\), with \(n \ge 2\), and let \(N \ge 2\). Assume that the function \(a: (0, \infty ) \rightarrow (0, \infty )\) is continuously differentiable and fulfills conditions (2.12) and (2.13). Let \({{\textbf {f}}} \in L^2(\Omega , {\mathbb R^N})\) and let \({{\textbf {u}}}\) be an approximable solution to the Dirichlet problem (2.17). Then

$$\begin{aligned} a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\in W^{1,2}(\Omega , {\mathbb R^{N\times n}}), \end{aligned}$$
(2.18)

and

$$\begin{aligned} C_1 \Vert {{\textbf {f}}}\Vert _{L^2(\Omega , {\mathbb R^N})} \le \Vert a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le C_2 \Vert {{\textbf {f}}}\Vert _{L^2(\Omega , {\mathbb R^N})} \end{aligned}$$
(2.19)

for some positive constants \(C_1=C_1(n, N, i_a, s_a)\) and \(C_2=C_2(N, i_a, s_a, \Omega )\).

The global assumption on the signature of the second fundamental form of \(\partial \Omega \) entailed by the convexity of \(\Omega \) can be replaced by local conditions on the relevant fundamental form. This is the subject of Theorem 2.7.

The finest assumption on \(\partial \Omega \) that we are able to allow for amounts to a decay estimate of the integral of its weak curvatures over subsets of \(\partial \Omega \) whose diameter approaches zero, in terms of their capacity. Specifically, suppose that \(\Omega \) is a bounded Lipschitz domain such that \(\partial \Omega \in W^{2,1}\). This means that the domain \(\Omega \) is locally the subgraph of a Lipschitz continuous function of \((n-1)\) variables, which is also twice weakly differentiable. Denote by \(\mathcal B\) the weak second fundamental form on \(\partial \Omega \), by \(|\mathcal B|\) its norm, and set

$$\begin{aligned} \mathcal K_\Omega (r) = \sup _{ \begin{array}{c} E\subset \partial \Omega \cap B_r(x) \\ x\in \partial \Omega \end{array} } \frac{\int _E |\mathcal B|d\mathcal H^{n-1}}{\textrm{cap}_{B_1(x)} (E)}\qquad \hbox {for}\quad r\in (0,1)\,. \end{aligned}$$
(2.20)

Here, \(B_r(x)\) stands for the ball centered at x, with radius r, the notation \(\textrm{cap}_{B_1(x)}(E)\) is adopted for the capacity of the set E relative to the ball \(B_1(x)\), and \(\mathcal H^{n-1}\) is the \((n-1)\)-dimensional Hausdorff measure. The decay we hinted at above consists in a smallness condition on the limit as \(r\rightarrow 0^+\) of the function \(\mathcal K_\Omega (r)\). The smallness depends on \(\Omega \) through its diameter \(d_\Omega \) and its Lipschitz characteristic \(L_\Omega \). The latter quantity is defined as the maximum among the Lipschitz constants of the functions that locally describe the intersection of \(\partial \Omega \) with balls centered on \(\partial \Omega \), and the reciprocals of their radii. Here, and in similar occurrences in what follows, the dependence of a constant on \(d_\Omega \) and \(L_\Omega \) is understood just via an upper bound for them.

Theorem 2.7 also provides us with an ensuing alternate assumption on \(\partial \Omega \), which only depends on integrability properties of the weak curvatures of \(\partial \Omega \). Precisely, it requires the membership of \(|\mathcal B|\) in a suitable function space \(X(\partial \Omega )\) over \(\partial \Omega \) defined in terms of weak type norms, and a smallness condition on the decay of these norms of \(|\mathcal B|\) over balls centered on \(\partial \Omega \). This membership will be denoted by \(\partial \Omega \in W^2X\). The relevant weak space is defined as

$$\begin{aligned} X (\partial \Omega )= {\left\{ \begin{array}{ll} L^{n-1, \infty }(\partial \Omega ) &{} \quad \hbox {if} \quad n \ge 3, \\ L^{1, \infty } \log L (\partial \Omega )&{} \quad \hbox {if }\quad n =2. \end{array}\right. } \end{aligned}$$
(2.21)

Here, \(L^{n-1, \infty }(\partial \Omega )\) denotes the weak-\(L^{n-1}(\partial \Omega )\) space, and \(L^{1, \infty } \log L (\partial \Omega )\) denotes the weak-\(L\log L (\partial \Omega )\) space (also called Marcinkiewicz spaces), with respect to the \((n-1)\)-dimensional Hausdorff measure.

Theorem 2.7

(Global estimates under minimal boundary regularity). Let \(\Omega \) be a bounded Lipschitz domain in \({\mathbb R^n}\), \(n \ge 2\), such that \(\partial \Omega \in W^{2,1}\), and let \(N \ge 2\). Assume that the function \(a: (0, \infty ) \rightarrow (0, \infty )\) is continuously differentiable and fulfills conditions (2.12) and (2.13). Let \({{\textbf {f}}} \in L^2(\Omega , {\mathbb R^N})\) and let \({{\textbf {u}}}\) be an approximable solution to the Dirichlet problem (2.17). Then there exists a constant \(c=c(n, N, i_a, s_a, L_\Omega , d_\Omega )\) such that, if

$$\begin{aligned} \lim _{r\rightarrow 0^+} \mathcal K _\Omega (r) < c, \end{aligned}$$
(2.22)

then \(a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\), and inequality (2.19) holds.

In particular, if \(\partial \Omega \in W^2X\), where \(X(\partial \Omega )\) is the space defined by (2.21), then there exists a constant \(c=c(n, N, i_a, s_a, L_\Omega , d_\Omega )\) such that, if

$$\begin{aligned} \lim _{r\rightarrow 0^+} \Big (\sup _{x \in \partial \Omega } \Vert \mathcal B \Vert _{X(\partial \Omega \cap B_r(x))}\Big ) < c \,, \end{aligned}$$
(2.23)

then \(a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\), and inequality (2.19) holds.

Remark 2.8

We emphasize that the assumptions on \(\partial \Omega \) in Theorem 2.7 are essentially sharp. For instance, the mere finiteness of the limit in (2.22) is not sufficient for the conclusion to hold. As shown in [53, 54], there exists a one-parameter family of domains \(\Omega \) such that \(\mathcal K_{\Omega }(r)<\infty \) for \(r\in (0,1)\) and the solution to the homogeneous Dirichlet problem for (1.2), with a smooth right-hand side \({\textbf {f}}\), belongs to \(W^{2,2}({\Omega }, \mathbb R^N)\) only for those values of the parameter which make the limit in (2.22) smaller than a critical (explicit) value.

A similar phenomenon occurs in connection with assumption (2.23). An example from [46] demonstrates its optimality even for the scalar p-Laplace equation. Actually, open sets \(\Omega \subset \mathbb R^3\), with \(\partial {\Omega } \in W^2L^{2, \infty }\), are displayed where the solution \({\textbf {u}}\) to the homogeneous Dirichlet problem for (1.3), with \(N=1\), \(p\in (\tfrac{3}{2},2]\) and a smooth right-hand side \({\textbf {f}}\), is such that \(|\nabla u|^{p-2}\nabla u \notin W^{1,2}(\Omega , \mathbb R^n)\). This lack of regularity is due to the fact that the limit in (2.23), though finite, is not small enough. Similarly, if \(n=2\) there exist open sets \(\Omega \), with \(\partial {\Omega } \in W^2L^{1, \infty } \log L\), for which the limit in (2.23) exceeds some threshold, and where the solution to the homogeneous Dirichlet problem for (1.2), with a smooth right-hand side, does not belong to \(W^{2,2}({\Omega })\); see [53].

Remark 2.9

The one-parameter family of domains \(\Omega \) mentioned in the first part of Remark 2.8 with regard to condition (2.22) is such that \(\partial {\Omega } \notin W^2L^{n-1, \infty }\) if \(n \ge 3\). Hence, assumption (2.23) is not fulfilled even for those values of the parameter which render (2.22) true. This shows that the latter assumption is indeed weaker than (2.23).

Remark 2.10

Condition (2.23) certainly holds if \(n \ge 3\) and \(\partial {\Omega } \in W^{2,n-1}\), and if \(n=2\) and \(\partial {\Omega } \in W^{2}L\log L\) (and hence, if \(\partial {\Omega } \in W^{2,q}\) for some \(q>1\)). This is due to the fact that, under these assumptions, the limit in (2.23) vanishes. In particular, assumption (2.23) is satisfied if \(\partial \Omega \in C^2\).

3 The pointwise inequality

This section is devoted to the proof of Theorem 2.1, which is split into several lemmas. The point of departure is a pointwise identity, of possible independent use, stated in Lemma 3.1.

Given a positive function \(a \in C^1((0, \infty ))\), we define the function \(Q_a: (0, \infty ) \rightarrow \mathbb R\) as

$$\begin{aligned} Q_a(t) = \frac{t a'(t)}{a(t)}\quad \text {for}\quad t>0. \end{aligned}$$
(3.1)

Hence,

$$\begin{aligned} i_a= \inf _{t>0} Q_a(t) \quad \hbox {and} \quad s_a= \sup _{t >0} Q_a(t), \end{aligned}$$
(3.2)

where \(i_a\) and \(s_a\) are the indices given by (2.1).

Lemma 3.1

Let n, N, \(\Omega \) and \({\textbf {u}}\) be as in Theorem 2.1.

(i) Assume that the function \(a \in C^0([0, \infty ))\) and satisfies conditions (2.4)–(2.6). Then

$$\begin{aligned} \big |\textrm{\textbf{div}} (a(|\nabla {\textbf {u}}|)\nabla {\textbf {u}})\big |^2&= \textrm{{div}} \Big [a(|\nabla {\textbf {u}}|)^2 \Big ( (\Delta {\textbf {u}})^T \nabla {\textbf {u}}- \tfrac{1}{2} \nabla |\nabla {\textbf {u}}|^2\Big )\Big ] \nonumber \\&\quad + a(|\nabla {\textbf {u}}|)^{2} \Bigg [ |\nabla ^2 {\textbf {u}}|^2 +2 Q_a(|\nabla {\textbf {u}}|)|\nabla |\nabla {\textbf {u}}||^2 \nonumber \\&\quad +Q_a(|\nabla {\textbf {u}}|)^2\bigg | \frac{\nabla {\textbf {u}}}{|\nabla {\textbf {u}}|} (\nabla |\nabla {\textbf {u}}|)^T \bigg |^2\Bigg ] \quad \text {in }\quad \Omega , \end{aligned}$$
(3.3)

where the last two addends in square brackets on the right-hand side of equation (3.3) have to be interpreted as 0 if \(\nabla {\textbf {u}}=0\).

(ii) If a is just defined in \((0,\infty )\), \(a \in C^1((0, \infty ))\), and conditions (2.4) and (2.5) are fulfilled, then equation (3.3) continues to hold in the set \(\{\nabla {\textbf {u}}\ne 0\}\).

The next corollary follows from Lemma 3.1, applied with \(a(t)=t^{p-2}\).

Corollary 3.2

Let n, N, \(\Omega \) and \({\textbf {u}}\) be as in Theorem 2.1. Assume that \(p\ge 1\). Then

$$\begin{aligned} \big |\textrm{\textbf{div}} (|\nabla {\textbf {u}}|^{p-2}\nabla {\textbf {u}})\big |^2 =\&\textrm{{div}} \Big [|\nabla {\textbf {u}}|^{2(p-2)} \Big ((\Delta {\textbf {u}})^T \nabla {\textbf {u}}- \tfrac{1}{2} \nabla |\nabla {\textbf {u}}|^2\Big )\Big ] \nonumber \\&+ |\nabla {\textbf {u}}|^{2(p-2)} \Bigg [ |\nabla ^2 {\textbf {u}}|^2 +2 (p-2)|\nabla |\nabla {\textbf {u}}||^2\nonumber \\&+(p-2)^2\bigg | \frac{\nabla {\textbf {u}}}{|\nabla {\textbf {u}}|} (\nabla |\nabla {\textbf {u}}|)^T \bigg |^2\Bigg ] \end{aligned}$$
(3.4)

in \(\{\nabla {\textbf {u}}\ne 0\}\).

Proof of Lemma 3.1

Part (i). The following chain can be deduced via straightforward computations:

$$\begin{aligned}&\big | \textrm{\textbf{div}} \big (a(|\nabla {\textbf {u}}|)\nabla {\textbf {u}}\big )\big |^2\nonumber \\&\quad = \big | a(|\nabla {\textbf {u}}|) \Delta {\textbf {u}}+ a'(|\nabla {\textbf {u}}|) \nabla {\textbf {u}}(\nabla |\nabla {\textbf {u}}|)^T \big |^2 \nonumber \\&\quad = a(|\nabla {\textbf {u}}|)^2 \big (|\Delta {\textbf {u}}|^2 - |\nabla ^2 {\textbf {u}}|^2\big ) + a(|\nabla {\textbf {u}}|)^2 |\nabla ^2 {\textbf {u}}|^2 \nonumber \\&\qquad + a'(|\nabla {\textbf {u}}|)^2 \big | \nabla {\textbf {u}}(\nabla |\nabla {\textbf {u}}|)^T \big |^2 + 2 a(|\nabla {\textbf {u}}|) a'(|\nabla {\textbf {u}}|) \Delta {\textbf {u}}\cdot \nabla {\textbf {u}}(\nabla |\nabla {\textbf {u}}|)^T \nonumber \\&\quad = a(|\nabla {\textbf {u}}|)^2\Big ( \textrm{{div}} (( \Delta {\textbf {u}})^T \nabla {\textbf {u}}) - \tfrac{1}{2}\textrm{{div}} (\nabla |\nabla {\textbf {u}}|^2)\Big ) + a(|\nabla {\textbf {u}}|)^2 |\nabla ^2 {\textbf {u}}|^2 \nonumber \\&\qquad + a'(|\nabla {\textbf {u}}|)^2 \big | \nabla {\textbf {u}}(\nabla |\nabla {\textbf {u}}|)^T \big |^2 + 2 a(|\nabla {\textbf {u}}|) a'(|\nabla {\textbf {u}}|) \Delta {\textbf {u}}\cdot \nabla {\textbf {u}}(\nabla |\nabla {\textbf {u}}|)^T. \end{aligned}$$
(3.5)
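The individual steps of the chain (3.5) can be traced as follows. The first equality is the product and chain rule applied componentwise; with summation over \(j \in \{1,\dots ,n\}\) and for each \(\alpha \in \{1,\dots ,N\}\) (index notation used here only for illustration), at points where \(\nabla {\textbf {u}}\ne 0\),

$$\begin{aligned} \partial _j \big (a(|\nabla {\textbf {u}}|)\, \partial _j u^\alpha \big ) = a(|\nabla {\textbf {u}}|)\, \Delta u^\alpha + a'(|\nabla {\textbf {u}}|)\, \partial _j |\nabla {\textbf {u}}| \, \partial _j u^\alpha , \end{aligned}$$

and the last sum is the \(\alpha \)-th component of the vector \(\nabla {\textbf {u}}(\nabla |\nabla {\textbf {u}}|)^T \in \mathbb R^N\). The second equality expands the square and adds and subtracts \(a(|\nabla {\textbf {u}}|)^2 |\nabla ^2 {\textbf {u}}|^2\). The third one replaces \(|\Delta {\textbf {u}}|^2 - |\nabla ^2 {\textbf {u}}|^2\) by the divergence term provided by identity (1.1).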

Notice that equation (3.5) also holds at the points where \(|\nabla {\textbf {u}}|=0\), provided that the terms involving the factor \(a'(|\nabla {\textbf {u}}|)\) are interpreted as 0. This is due to the fact that all the terms in question also contain the factor \(\nabla {\textbf {u}}\) and, by assumption (2.6),

$$\lim _{t\rightarrow 0^+} a'(t)t=0.$$
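The limit above can be justified in a few lines from (1.8), (2.6) and the continuity of a at 0. Indeed, \(b'(t)=a(t)+a'(t)t\) for \(t>0\), and \(b(0)=0\), whence

$$\begin{aligned} b'(0) = \lim _{t\rightarrow 0^+} \frac{b(t)-b(0)}{t} = \lim _{t\rightarrow 0^+} a(t) = a(0), \qquad \lim _{t\rightarrow 0^+} a'(t)t = \lim _{t\rightarrow 0^+} \big (b'(t) - a(t)\big ) = b'(0)-a(0) = 0, \end{aligned}$$

where the continuity of \(b'\) at 0 is used in the second chain of equalities.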

Moreover,

$$\begin{aligned}&a(|\nabla {\textbf {u}}|)^2 \textrm{{div}} ((\Delta {\textbf {u}})^T \nabla {\textbf {u}})\nonumber \\&\quad =\textrm{{div}}\big ( a(|\nabla {\textbf {u}}|)^2 (\Delta {\textbf {u}})^T\nabla {\textbf {u}}\big ) - 2 a(|\nabla {\textbf {u}}|) a'(|\nabla {\textbf {u}}|) \Delta {\textbf {u}}\cdot \nabla {\textbf {u}}(\nabla |\nabla {\textbf {u}}|)^T, \end{aligned}$$
(3.6)

and

$$\begin{aligned} \tfrac{1}{2} a(|\nabla {\textbf {u}}|)^2 \textrm{{div}}\big (\nabla |\nabla {\textbf {u}}|^2\big ) = \tfrac{1}{2}\textrm{{div}}\big ( a(|\nabla {\textbf {u}}|)^2\,\nabla |\nabla {\textbf {u}}|^2\big ) - 2a(|\nabla {\textbf {u}}|) a'(|\nabla {\textbf {u}}|)|\nabla {\textbf {u}}||\nabla |\nabla {\textbf {u}}||^2. \end{aligned}$$
(3.7)

From equations (3.5)–(3.7) one deduces that

$$\begin{aligned} \big | \textrm{\textbf{div}} (a(|\nabla {\textbf {u}}|)\nabla {\textbf {u}})\big |^2&= \textrm{{ div}}\big (a(|\nabla {\textbf {u}}|)^2(\Delta {\textbf {u}})^T \nabla {\textbf {u}}\big ) - \tfrac{1}{2} \textrm{{div}}\big (a(|\nabla {\textbf {u}}|)^2\,\nabla |\nabla {\textbf {u}}|^2\big ) \nonumber \\&\quad + a(|\nabla {\textbf {u}}|)^2 |\nabla ^2 {\textbf {u}}|^2 + a'(|\nabla {\textbf {u}}|)^2 \big | \nabla {\textbf {u}}(\nabla |\nabla {\textbf {u}}|)^T \big |^2 \nonumber \\&\quad + 2a(|\nabla {\textbf {u}}|) a'(|\nabla {\textbf {u}}|)|\nabla {\textbf {u}}||\nabla |\nabla {\textbf {u}}||^2. \end{aligned}$$
(3.8)

If \(\nabla {\textbf {u}}=0\), then the last two addends on the right-hand side of Eq. (3.8) vanish. Hence, Eq. (3.3) follows. Assume next that \(\nabla {\textbf {u}}\ne 0\). Then, from Eq. (3.8) we obtain that

$$\begin{aligned} \big | \textrm{\textbf{div}} (a(|\nabla {\textbf {u}}|)\nabla {\textbf {u}})\big |^2&=\textrm{{div}}\big (a(|\nabla {\textbf {u}}|)^2 (\Delta {\textbf {u}})^T \nabla {\textbf {u}}\big ) - \tfrac{1}{2} \textrm{{div}}\big (a(|\nabla {\textbf {u}}|)^2\,\nabla |\nabla {\textbf {u}}|^2\big ) \nonumber \\&\quad \ + a(|\nabla {\textbf {u}}|)^2 \Bigg [ |\nabla ^2 {\textbf {u}}|^2 +\bigg (\frac{a'(|\nabla {\textbf {u}}|) |\nabla {\textbf {u}}|}{a(|\nabla {\textbf {u}}|)}\bigg )^2 \bigg | \frac{\nabla {\textbf {u}}}{|\nabla {\textbf {u}}|} (\nabla |\nabla {\textbf {u}}|)^T \Bigg |^2 \nonumber \\&\quad +2 \frac{a'(|\nabla {\textbf {u}}|) |\nabla {\textbf {u}}|}{a(|\nabla {\textbf {u}}|)}|\nabla |\nabla {\textbf {u}}||^2\Bigg ]. \end{aligned}$$

The proof of Eq. (3.3) is complete.

Part (ii). The conclusion follows from the above computations, on disregarding the comments on the points where \(\nabla {\textbf {u}}=0\). \(\square \)

Having identity (3.3) at our disposal, the point is now to derive a sharp lower bound for the second addend on its right-hand side. This will be accomplished via Lemma 3.6 below. Its proof requires a delicate analysis of the quadratic form, depending on the entries of the Hessian matrix \(\nabla ^2 {\textbf {u}}\), which appears in square brackets in the expression to be bounded. This analysis relies upon some critical linear-algebraic steps that are presented in the next three lemmas.

In what follows, \(\mathbb {R}^{n\times n}_{{\textrm {sym}}}\) denotes the space of symmetric matrices in \(\mathbb {R}^{n \times n}\). The dot “\( \, \cdot \,\)” is employed to denote scalar product of vectors or matrices, and the symbol “\( \otimes \)” for tensor product of vectors. Also, I stands for the identity matrix in \(\mathbb {R}^{n \times n}\).

Lemma 3.3

Let \(\omega \in {\mathbb R^n}\) be such that \({|{\omega }|} =1\). Then

$$\begin{aligned} {|{H\omega }|}^2-\tfrac{1}{2} {|{\omega \cdot H\omega }|}^2-\tfrac{1}{2} {|{H}|}^2 =- \tfrac{1}{2} {|{H_{\omega ^\perp }}|}^2 \end{aligned}$$
(3.9)

for every \(H \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\), where \(H_{\omega ^\perp } = (I - \omega \otimes \omega ) H (I - \omega \otimes \omega )\).

Proof

Let \(\{e_1, \dots , e_n\}\) denote the canonical basis in \({\mathbb R^n}\) and let \(\{\theta _1, \dots , \theta _n\}\) be an orthonormal basis of \({\mathbb R^n}\) such that \(\theta _1 = \omega \). Let \(Q \in \mathbb {R}^{n \times n}\) be the matrix whose columns are \(\theta _1, \dots , \theta _n\). Hence, \(\omega = Q e_1\). Next, let \(R= Q^T H Q\). Clearly, \(R \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\). Denote by \(r_{ij}\) the entries of R. Computations show that

$$\begin{aligned} {{|{H\omega }|}^2-\tfrac{1}{2} {|{\omega \cdot H\omega }|}^2 -\tfrac{1}{2} {|{H}|}^2 }&= {|{R e_1}|}^2-\tfrac{1}{2} {|{e_1 \cdot R e_1}|}^2-\tfrac{1}{2} {|{R}|}^2\\&= \sum _{i=1}^n {|{r_{i1}}|}^2-\tfrac{1}{2} {|{r_{11}}|}^2-\tfrac{1}{2} \sum _{i,j=1}^n{|{r_{ij}}|}^2 \\&=\tfrac{1}{2}\sum _{j=1}^n {|{r_{1j}}|}^2 + \tfrac{1}{2} \sum _{i=1}^n {|{r_{i1}}|}^2-\tfrac{1}{2} {|{r_{11}}|}^2-\tfrac{1}{2} \sum _{i,j=1}^n{|{r_{ij}}|}^2\\&=-\tfrac{1}{2} \sum _{i,j \ge 2}{|{r_{ij}}|}^2 \\&= -\tfrac{1}{2} {|{(I - e_1 \otimes e_1) R (I- e_1 \otimes e_1)}|}^2\\&= -\tfrac{1}{2} {|{(I - \omega \otimes \omega ) H (I - \omega \otimes \omega )}|}^2. \end{aligned}$$

Hence, Eq. (3.9) follows. \(\square \)
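
As an illustration of identity (3.9), which is not needed in what follows, consider the case when \(n=2\), \(\omega = e_1\) and \(H = \begin{pmatrix} a & b \\ b & c \end{pmatrix}\) for some \(a, b, c \in \mathbb {R}\) (a choice made here only for concreteness). Then \(|H\omega |^2 = a^2 + b^2\), \(|\omega \cdot H \omega |^2 = a^2\) and \(|H|^2 = a^2 + 2b^2 + c^2\), so that

$$\begin{aligned} |H\omega |^2-\tfrac{1}{2} |\omega \cdot H\omega |^2-\tfrac{1}{2} |H|^2 = (a^2 + b^2) - \tfrac{1}{2} a^2 - \tfrac{1}{2} (a^2 + 2b^2 + c^2) = -\tfrac{1}{2} c^2, \end{aligned}$$

in agreement with the fact that \(H_{\omega ^\perp } = c\, e_2 \otimes e_2\) in this case.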

Given a vector \(\omega \in {\mathbb R^n}\), define the set

$$\begin{aligned} E(\omega ) = {\big \{{ H \omega \,:\, H \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}, {|{H}|} \le 1}\big \}}. \end{aligned}$$

It is easily verified that \(E(\omega )\) is a convex set in \({\mathbb R^n}\) for every \(\omega \in {\mathbb R^n}\). Lemma 3.4 below tells us that, in fact, \(E(\omega )\) is an ellipsoid, centered at 0 (which reduces to \(\{0\}\) if \(\omega =0\)). This assertion will be verified by showing that, for each \(\omega \in {\mathbb R^n}\), there exists a positive definite matrix \(W \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\) such that \(E(\omega )\) agrees with the ellipsoid

$$\begin{aligned} F(W) = {\big \{{ x\in {\mathbb R^n}:\, x\cdot W^{-1} x\le 1}\big \}}, \end{aligned}$$
(3.10)

where \(W^{-1}\) stands for the inverse of W. This is the content of Lemma 3.4 below. In its proof, we shall make use of the alternative representation

$$\begin{aligned} F(W) = {\big \{{x\in {\mathbb R^n}:\, y \cdot x \le \sqrt{y \cdot Wy} \quad \text {for every}\quad y \in \mathbb {R}^n}\big \}}, \end{aligned}$$
(3.11)

which follows, for instance, via a maximization argument for the ratio of the two sides of the inequality in (3.11) for each given \(x\in {\mathbb R^n}\).
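
One possible way to carry out this maximization, sketched here under the assumption that W is positive definite, makes use of the substitution \(y = W^{-1/2} z\) and of the Cauchy-Schwarz inequality:

$$\begin{aligned} \sup _{y \ne 0} \frac{y \cdot x}{\sqrt{y \cdot Wy}} = \sup _{z \ne 0} \frac{z \cdot W^{-1/2} x}{|z|} = \big |W^{-1/2} x\big | = \sqrt{x \cdot W^{-1} x} \qquad \text {for}\quad x \in {\mathbb R^n}. \end{aligned}$$

Thus, the inequality in (3.11) holds for every \(y \in \mathbb {R}^n\) if and only if \(x \cdot W^{-1} x \le 1\), namely if and only if x belongs to the set defined by (3.10).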

Also, observe that, as a consequence of Eq. (3.11),

$$\begin{aligned} {|{x}|} = x \cdot \widehat{x} \le \sqrt{ \widehat{x} \cdot W \widehat{x}} \qquad \text {for every}\quad x \in F(W)\setminus \{0\}. \end{aligned}$$
(3.12)

Here, and in what follows, we adopt the notation

$$\begin{aligned} \widehat{x} = \frac{x}{|x|} \qquad \hbox {for}\quad x \in {\mathbb R^n}\setminus \{0\}. \end{aligned}$$

Lemma 3.4

Given \(\omega \in {\mathbb R^n}\), let \(W(\omega ) \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\) be defined as

$$\begin{aligned} W(\omega ) = \tfrac{1}{2} \big ( {|{\omega }|}^2 I + \omega \otimes \omega \big ). \end{aligned}$$
(3.13)

Then \(W(\omega )\) is positive definite, and

$$\begin{aligned} E(\omega ) = F(W(\omega )). \end{aligned}$$
(3.14)

In particular,

$$\begin{aligned} H \omega \in {|{H}|}\,F\big ( W(\omega )\big ) \quad \text {for every}\quad \omega \in {\mathbb R^n}\hbox { and }H \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}. \end{aligned}$$
(3.15)

Proof

Equation (3.14) trivially holds if \(\omega =0\). Thus, by a scaling argument, it suffices to consider the case when \({|{\omega }|}=1\). We begin by showing that \(E(\omega ) \subset F(W(\omega ))\). One can verify that, since \({|{\omega }|}=1\),

$$\begin{aligned} W(\omega )^{-1} = 2I - \omega \otimes \omega . \end{aligned}$$
(3.16)
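
The verification amounts to a direct multiplication. Since \((\omega \otimes \omega )^2 = {|{\omega }|}^2\, \omega \otimes \omega = \omega \otimes \omega \) when \({|{\omega }|}=1\), one has that

$$\begin{aligned} W(\omega ) (2I - \omega \otimes \omega ) = \tfrac{1}{2} (I + \omega \otimes \omega )(2I - \omega \otimes \omega ) = \tfrac{1}{2} \big (2I + \omega \otimes \omega - (\omega \otimes \omega )^2\big ) = I. \end{aligned}$$

The same kind of computation shows that \(W(\omega )\omega = \omega \) and \(W(\omega ) y = \tfrac{1}{2} y\) for every \(y \perp \omega \). Hence the eigenvalues of \(W(\omega )\) are 1 and \(\tfrac{1}{2}\), which is consistent with the positive definiteness of \(W(\omega )\) for \({|{\omega }|}=1\).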

Let \(H \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\) be such that \({|{H}|} \le 1\). Owing to equation (3.16) and to Lemma 3.3,

$$\begin{aligned} H\omega \cdot W(\omega )^{-1} H\omega = 2 {|{H \omega }|}^2 - {\big |{\omega \cdot H \omega }\big |}^2 \le {|{H}|}^2 \le 1. \end{aligned}$$
(3.17)

This shows that \(H \omega \in F(W(\omega ))\). The inclusion \(E(\omega ) \subset F(W(\omega ))\) is thus established.

Let us next prove that \(F(W(\omega )) \subset E(\omega )\). Let \(x \in F(W(\omega ))\). We have to exhibit a matrix \(H\in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\) such that \({|{H}|} \le 1\) and \(x = H \omega \). To this end, consider the decomposition \(x=t \omega + s \omega ^\perp \), for suitable \(s, t \in \mathbb {R}\), where \(\omega ^\perp \perp \omega \) and \({|{\omega ^\perp }|}=1\). Since \(x \in F(W(\omega ))\), one has that \(x \cdot W(\omega ) ^{-1} x \le 1\). Furthermore,

$$\begin{aligned} x \cdot W(\omega )^{-1}x =(t \omega {+} s \omega ^\perp ) \cdot (2 I - \omega \otimes \omega ) (t \omega {+} s \omega ^\perp ) {=} 2(t^2 + s^2) - t^2 = t^2 + 2s^2. \end{aligned}$$

Hence, \(t^2 + 2s^2 \le 1\). We claim that the matrix H, defined as

$$\begin{aligned} H = t\,\omega \otimes \omega + s\,(\omega ^\perp \otimes \omega + \omega \otimes \omega ^\perp ), \end{aligned}$$

has the desired properties. Indeed, \(H \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\),

$$\begin{aligned} {|{H}|}^2&= \textrm{tr}(H^T H) = t^2 + 2s^2 \le 1, \\ H \omega&= t \omega + s \omega ^\perp = x. \end{aligned}$$

This proves that \(x \in E(\omega )\). The inclusion \(F(W(\omega )) \subset E(\omega )\) hence follows. \(\square \)

In view of the statement of the next lemma, we introduce the following notation. Given N vectors \(\omega ^\alpha \in {\mathbb R^n}\) and N matrices \(H^\alpha \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\), with \(\alpha =1, \dots , N\), we set

$$\begin{aligned} J&= {\bigg |{ \sum _{\alpha =1}^N H^\alpha \omega ^\alpha }\bigg |}^2,&J_0&= \sum _{\alpha =1}^N{\bigg |{ \omega ^\alpha \cdot \sum _{\beta =1}^N H^\beta \omega ^\beta }\bigg |}^2,&J_1&= \sum _{\alpha =1}^N {|{H^\alpha }|}^2. \end{aligned}$$
(3.18)

Lemma 3.5

Let \(N \ge 2\), \(0 \le \delta \le \frac{1}{2}\) and \(\delta +\sigma \ge 1\). Assume that the vectors \(\omega ^\alpha \in {\mathbb R^n}\) and the matrices \(H^\alpha \in \mathbb {R}^{n \times n}_{{\textrm {sym}}}\), with \(\alpha =1, \dots , N\), satisfy the following constraints:

$$\begin{aligned}&\sum _{\alpha =1}^N {|{\omega ^\alpha }|}^2 \le 1, \end{aligned}$$
(3.19)
$$\begin{aligned}&\sum _{\alpha =1}^N {|{H^\alpha }|}^2 \le 1. \end{aligned}$$
(3.20)

Then

$$\begin{aligned} J - \delta J_0 - \sigma J_1&\le {\left\{ \begin{array}{ll} 0 &{}\qquad \text {if}\quad \delta \in [0,\frac{1}{3}],\\ \max {\Big \{{0,\frac{(\delta +1)^2}{8\delta } - \sigma }\Big \}} &{}\qquad \text {if}\quad \delta \in (\frac{1}{3},\frac{1}{2}]. \end{array}\right. } \end{aligned}$$
(3.21)

Proof

Given \(\delta \) and \(\sigma \) as in the statement, set

$$\begin{aligned} \mathcal {D}_{\delta ,\sigma } = J - \delta J_0 - \sigma J_1. \end{aligned}$$

The quantities \(J_0\), J and \(J_1\) are 1-homogeneous with respect to the quantity \(\sum _{\alpha =1}^N {|{H^\alpha }|}^2\). Moreover, inequality (3.21) trivially holds if the latter quantity vanishes. Thereby, it suffices to prove this inequality under the assumption that \( \sum _{\alpha =1}^N {|{H^\alpha }|}^2=1\), namely that

$$\begin{aligned} J_1 = 1. \end{aligned}$$
(3.22)

On setting \(\zeta = \sum _{\alpha =1}^N H^\alpha \omega ^\alpha \), one has that

$$\begin{aligned} J = {|{\zeta }|}^2 \qquad \text {and} \qquad J_0 = \sum _{\alpha =1}^N {|{ \omega ^\alpha \cdot \zeta }|}^2. \end{aligned}$$

Therefore,

$$\begin{aligned} J_0 \le {|{\zeta }|}^2 \sum _{\alpha =1}^N {|{\omega ^\alpha }|}^2 \le {|{\zeta }|}^2 = J. \end{aligned}$$
(3.23)

Owing to Lemma 3.4,

$$\begin{aligned} H^\alpha \omega ^\alpha \in {|{H^\alpha }|} F(W^\alpha ) \end{aligned}$$

for \(\alpha =1, \dots , N\), where \(W^\alpha = {|{\omega ^\alpha }|}^2 \frac{1}{2} (I+ \widehat{\omega ^\alpha } \otimes \widehat{\omega ^\alpha })\). Thus, by equations (3.15) and (3.12),

$$\begin{aligned} H^\alpha \omega ^\alpha \cdot \widehat{\zeta }&\le {|{H^\alpha }|} \sqrt{\widehat{\zeta }\cdot W^\alpha \widehat{\zeta }} = {|{H^\alpha }|} {|{\omega ^\alpha }|} \sqrt{ \tfrac{1}{2} + \tfrac{1}{2} {|{\widehat{\omega ^\alpha } \cdot \widehat{\zeta }}|}^2} \end{aligned}$$
(3.24)

for \(\alpha =1, \dots , N\). Since

$$\begin{aligned} \zeta&= ( \zeta \cdot \widehat{\zeta })\widehat{\zeta } = \sum _{\alpha =1}^N (H^\alpha \omega ^\alpha \cdot \widehat{\zeta }) \widehat{\zeta }, \end{aligned}$$

equation (3.24) implies that

$$\begin{aligned} {|{\zeta }|}&\le \sum _{\alpha =1}^N {\big |{H^\alpha \omega ^\alpha \cdot \widehat{\zeta }}\big |} \le \sum _{\alpha =1}^N {|{H^\alpha }|} {|{\omega ^\alpha }|} \sqrt{ \tfrac{1}{2} + \tfrac{1}{2} {|{\widehat{\omega ^\alpha } \cdot \widehat{\zeta }}|}^2}. \end{aligned}$$

Hence,

$$\begin{aligned} {|{\zeta }|}^2&\le \tfrac{1}{2} \bigg (\sum _{\alpha =1}^N {|{H^\alpha }|} {|{\omega ^\alpha }|} \sqrt{ 1 + {|{\widehat{\omega ^\alpha } \cdot \widehat{\zeta }}|}^2} \bigg )^2. \end{aligned}$$
(3.25)

On setting \( \widehat{J_0} = \sum _{\alpha =1}^N {|{\smash {\omega ^\alpha \cdot \widehat{\zeta }}}|}^2\), we obtain that

$$\begin{aligned} \widehat{J_0} = \sum _{\alpha =1}^N {|{\omega ^\alpha }|}^2 {|{\smash {\widehat{\omega ^\alpha }\cdot \widehat{\zeta }}}|}^2 \qquad \text {and} \qquad J_0 = {|{\zeta }|}^2 \widehat{J_0}. \end{aligned}$$

Note that \(\widehat{J_0} \le 1\), inasmuch as \(J_0 \le J= {|{\zeta }|}^2\). Moreover, by equation (3.22),

$$\begin{aligned} \mathcal {D}_{\delta ,\sigma }&= J - \delta J_0 - \sigma = {|{\zeta }|}^2 \big (1 - \delta \widehat{J_0}\big ) - \sigma . \end{aligned}$$
(3.26)

From inequality (3.25) and equation (3.26) we deduce that

$$\begin{aligned} \mathcal {D}_{\delta ,\sigma }&\le \tfrac{1}{2} \bigg (\sum _{\alpha =1}^N {|{H^\alpha }|} {|{\omega ^\alpha }|} \sqrt{ 1 + {|{\widehat{\omega ^\alpha } \cdot \widehat{\zeta }}|}^2} \bigg )^2 \Big (1 - \delta \sum _{\alpha =1}^N {|{\omega ^\alpha }|}^2 {|{\smash {\widehat{\omega ^\alpha }\cdot \widehat{\zeta }}}|}^2 \Big ) - \sigma . \end{aligned}$$
(3.27)

Next, define the function \(g\,:\ [0,1]^N \times [0,1]^N \times [0,1]^N \rightarrow \mathbb {R}\) as

$$\begin{aligned} \begin{aligned} g(h,s,t)&= \tfrac{1}{2} \Big ( \sum _{\alpha =1}^N h_\alpha t_\alpha \sqrt{1 + s_\alpha ^2} \Big )^2 \bigg (1 - \delta \Big ( \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2 \Big ) \bigg ) - \sigma \\&\quad \text {for}\quad (h,s,t) \in [0,1]^N \times [0,1]^N \times [0,1]^N, \end{aligned} \end{aligned}$$
(3.28)

where \(h=(h_1, \dots , h_N)\), \(s=(s_1, \dots , s_N)\) and \(t=(t_1, \dots , t_N)\). Inequality (3.27) then takes the form

$$\begin{aligned} \mathcal {D}_{\delta ,\sigma }&\le g((|H^1|, \dots , |H^N|), ({|{\widehat{\omega ^1} \cdot \widehat{\zeta }}|}, \dots , {|{\widehat{\omega ^N} \cdot \widehat{\zeta }}|}), (|\omega ^1|, \dots , |\omega ^N|)). \end{aligned}$$

Our purpose is now to maximize the function g under the constraints

$$\begin{aligned} \sum _{\alpha =1}^N t_\alpha ^2 \le 1,\quad \sum _{\alpha =1}^N h_\alpha ^2 = 1. \end{aligned}$$
(3.29)

We claim that the maximum of g can only be attained if \(\sum _{\alpha =1}^N t_\alpha ^2 = 1\). To verify this claim, it suffices to show that

$$\begin{aligned}&g(h, s, \tau t) \le g(h,s,t)\nonumber \\ {}&\quad \text {for every}\quad (h,s,t) \in [0,1]^N \times [0,1]^N \times [0,1]^N\hbox { and }\tau \in [0,1]. \end{aligned}$$
(3.30)

Plainly,

$$\begin{aligned} g(h,s,\tau t) = \tfrac{1}{2} \tau ^2\Big ( \sum _{\alpha =1}^N h_\alpha t_\alpha \sqrt{1 + s_\alpha ^2} \Big )^2 \bigg (1 - \tau ^2\delta \Big ( \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2 \Big ) \bigg ) - \sigma \end{aligned}$$

for \((h,s,t) \in [0,1]^N \times [0,1]^N \times [0,1]^N\) and \(\tau \in [0,1]\). Note that

$$\begin{aligned} 0&\le \delta \Big ( \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2 \Big ) \le \delta \Big ( \sum _{\alpha =1}^N t_\alpha ^2 \Big ) \le \delta \le \tfrac{1}{2}. \end{aligned}$$
(3.31)

Thus, for each fixed \((h,s,t) \in [0,1]^N \times [0,1]^N \times [0,1]^N\), we have that

$$\begin{aligned} g(h, s, \tau t) = c_1 \tau ^2 (1- c_2 \tau ^2) - \sigma \qquad \text {for}\quad \tau \in [0,1], \end{aligned}$$
(3.32)

for suitable constants \(c_1 \ge 0\) and \(0\le c_2 \le \tfrac{1}{2}\), depending on \((h,s,t)\). Since the polynomial on the right-hand side of Eq. (3.32) is nondecreasing for \(\tau \in [0,1]\), inequality (3.30) follows. As a consequence, constraints (3.29) can be equivalently replaced by

$$\begin{aligned} \sum _{\alpha =1}^N t_\alpha ^2&=1 \qquad \text {and} \qquad \sum _{\alpha =1}^N h_\alpha ^2 =1. \end{aligned}$$
(3.33)
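
The monotonicity in \(\tau \) exploited in the derivation of (3.30) can be checked by differentiation. A short sketch follows, with \(c_1 = \tfrac{1}{2} \big ( \sum _{\alpha =1}^N h_\alpha t_\alpha \sqrt{1 + s_\alpha ^2} \big )^2\) and \(c_2 = \delta \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2\) read off from the expression for \(g(h,s,\tau t)\) displayed above, so that \(g(h,s,\tau t) = c_1 \tau ^2 (1 - c_2 \tau ^2) - \sigma \):

$$\begin{aligned} \frac{d}{d\tau } \Big [ c_1 \tau ^2 \big (1 - c_2 \tau ^2\big ) \Big ] = 2 c_1 \tau \big (1 - 2 c_2 \tau ^2\big ) \ge 0 \qquad \text {for}\quad \tau \in [0,1], \end{aligned}$$

since \(c_1 \ge 0\) and \(0 \le c_2 \le \tfrac{1}{2}\) by (3.31).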

Let us maximize the function \(g(h,s,t)\) with respect to h, under the constraint \(\sum _{\alpha =1}^N h_\alpha ^2 = 1\). Let \((h_1, \dots , h_N)\) be any point where the maximum is attained. Then, there exists a Lagrange multiplier \(\lambda \in \mathbb {R}\) such that

$$\begin{aligned} t_\alpha \sqrt{1 + s_\alpha ^2} \bigg ( \sum _{\gamma =1}^N h_\gamma t_\gamma \sqrt{1 + s_\gamma ^2} \bigg ) \bigg (1 - \delta \Big ( \sum _{\gamma =1}^N t_\gamma ^2 s_\gamma ^2 \Big ) \bigg )&= 2\lambda h_\alpha \quad \hbox {for}\quad \alpha =1, \dots , N. \end{aligned}$$
(3.34)

Multiplying through equation (3.34) by \(h_\beta \), and then subtracting equation (3.34), with \(\alpha \) replaced by \(\beta \) and multiplied by \(h_\alpha \), yields

$$\begin{aligned} \bigg ( \sum _{\gamma =1}^N h_\gamma t_\gamma \sqrt{1 + s_\gamma ^2} \bigg ) \bigg (1 - \delta \Big ( \sum _{\gamma =1}^N t_\gamma ^2 s_\gamma ^2 \Big ) \bigg ) \Big ( h_\beta t_\alpha \sqrt{1 + s_\alpha ^2} - h_\alpha t_\beta \sqrt{1 + s_\beta ^2} \Big )&= 0 \end{aligned}$$
(3.35)

for \(\alpha , \beta = 1,\dots , N\). Owing to equation (3.31), we have that \(\big (1 - \delta \big ( \sum _{\gamma =1}^N t_\gamma ^2 s_\gamma ^2 \big ) \big ) \ge \frac{1}{2}\). Next, if \( \sum _{\gamma =1}^N h_\gamma t_\gamma \sqrt{1 + s_\gamma ^2}=0\), then \(h_1t_1 = \dots = h_N t_N = 0\), whence \(g(h,s,t) = -\sigma \), so that \(\mathcal {D}_{\delta ,\sigma } \le -\sigma \le 0\), and inequality (3.21) holds trivially. Therefore, we may assume that \( \sum _{\gamma =1}^N h_\gamma t_\gamma \sqrt{1 + s_\gamma ^2} > 0\) in what follows. Under this assumption, equation (3.35) tells us that

$$\begin{aligned} h_\beta t_\alpha \sqrt{1 + s_\alpha ^2}&= h_\alpha t_\beta \sqrt{1 + s_\beta ^2} \end{aligned}$$
(3.36)

for  \(\alpha , \beta = 1,\dots , N\). Combining equations (3.33) and (3.36) yields

$$\begin{aligned} \begin{aligned} t_\alpha ^2(1+s_\alpha ^2)&= t_\alpha ^2(1+s_\alpha ^2) \sum _{\beta =1}^N h_\beta ^2 = h_\alpha ^2 \sum _{\beta =1}^N t_\beta ^2(1+s_\beta ^2) = h_\alpha ^2 \bigg ( 1 + \sum _{\beta =1}^N t_\beta ^2s_\beta ^2 \bigg ) \end{aligned} \end{aligned}$$
(3.37)

for  \(\alpha = 1,\dots , N\). Hence,

$$\begin{aligned} h_\alpha t_\alpha \sqrt{1+s_\alpha ^2}&= h_\alpha ^2 \sqrt{1 + \sum _{\beta =1}^N t_\beta ^2s_\beta ^2} \end{aligned}$$
(3.38)

for \(\alpha = 1,\dots , N\). From equations (3.28), (3.38) and (3.33) we deduce that

$$\begin{aligned} g(h,s,t)&\le \tfrac{1}{2} \left( \sum _{\alpha =1}^N h_\alpha ^2 \sqrt{1+\sum _{\beta =1}^N t_\beta ^2s_\beta ^2} \right) ^2 \bigg (1 - \delta \Big ( \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2 \Big ) \bigg ) - \sigma \\&= \tfrac{1}{2} \bigg ( 1+\sum _{\beta =1}^N t_\beta ^2s_\beta ^2 \bigg ) \bigg (1 - \delta \Big ( \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2 \Big ) \bigg ) - \sigma = \psi \bigg ( \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2 \bigg ), \end{aligned}$$

where \(\psi : [0,1] \rightarrow \mathbb R\) is the function defined as

$$\begin{aligned} \psi (r) = \tfrac{1}{2} (1 +r) \big (1 - \delta r \big ) - \sigma \quad \text {for}\quad r \in [0,1]. \end{aligned}$$

Set \(\rho = \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2\), and notice that \(\rho \in [0,1]\), since \(0 \le \sum _{\alpha =1}^N t_\alpha ^2 s_\alpha ^2 \le \sum _{\alpha =1}^N t_\alpha ^2=1\). Thereby, the maximum of the function g on \([0,1]^N \times [0,1]^N \times [0,1]^N\) under constraints (3.33) agrees with the maximum of the function \(\psi \) on [0, 1]. It is easily verified that, if \(\delta \in [0, \frac{1}{3}]\), then \(\max _{r \in [0,1]}\psi (r) = \psi (1)\). Hence, since we are assuming that \(\delta + \sigma \ge 1\),

$$\begin{aligned} \mathcal {D}_{\delta ,\sigma } \le \psi (1) = 1-\delta -\sigma \le 0. \end{aligned}$$

On the other hand, if \(\delta \in (\frac{1}{3}, \frac{1}{2}]\), then \(\max _{r \in [0,1]}\psi (r) = \psi (\frac{1-\delta }{2\delta })\). Therefore,

$$\begin{aligned} \mathcal {D}_{\delta ,\sigma }&\le \psi \Big ( \frac{1-\delta }{2\delta } \Big ) = \frac{(\delta +1)^2}{8\delta } - \sigma . \end{aligned}$$
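
For completeness, the elementary maximization of \(\psi \) behind the last two displays can be sketched as follows. Since

$$\begin{aligned} \psi '(r) = \tfrac{1}{2} \big (1 - \delta - 2\delta r\big ) \qquad \text {for}\quad r \in [0,1], \end{aligned}$$

one has that \(\psi ' \ge \tfrac{1}{2}(1 - 3\delta ) \ge 0\) in [0, 1] if \(\delta \in [0, \frac{1}{3}]\), whence the maximum is attained at \(r=1\) and \(\psi (1) = 1 - \delta - \sigma \). If, instead, \(\delta \in (\frac{1}{3}, \frac{1}{2}]\), then \(\psi '\) vanishes at \(r = \frac{1-\delta }{2\delta } \in [0,1)\), and

$$\begin{aligned} \psi \Big ( \frac{1-\delta }{2\delta } \Big ) = \tfrac{1}{2} \cdot \frac{1+\delta }{2\delta } \cdot \frac{1+\delta }{2} - \sigma = \frac{(\delta +1)^2}{8\delta } - \sigma . \end{aligned}$$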

The proof of inequality (3.21) is complete. \(\square \)

Lemma 3.6

Let n, N, \(\Omega \) and \({\textbf {u}}\) be as in Theorem 2.1. Given \(p\ge 1\), let \(\kappa _N (p)\) be the constant defined by (2.2)–(2.3). Then

$$\begin{aligned} {|{\nabla ^2 {\textbf {u}}}|}^2 + 2(p-2){\big |{\nabla {|{\nabla {\textbf {u}}}|}}\big |}^2 + (p-2)^2 {\bigg |{ \frac{\nabla {\textbf {u}}}{{|{\nabla {\textbf {u}}}|}} (\nabla {|{\nabla {\textbf {u}}}|})^T}\bigg |}^2&\ge \kappa _N (p) {|{\nabla ^2 {\textbf {u}}}|}^2 \end{aligned}$$
(3.39)

in \(\{\nabla {\textbf {u}}\ne 0\}\). Moreover, the constant \(\kappa _N (p)\) is sharp in (3.39).

Proof

Case \(N=1\). Inequality (3.39) trivially holds if \(p\ge 2\). Let us focus on the case when \(1\le p < 2\). Notice that, on setting

$$\begin{aligned} \omega = \frac{(\nabla u)^T}{|\nabla u|}\in {\mathbb R^n}\quad \text {and} \quad H= \nabla ^2 u \in \mathbb {R}^{n\times n}_\textrm{sym} \end{aligned}$$

at any point in \(\{\nabla u \ne 0\}\), we have that

$$\begin{aligned} {|{H\omega }|}^2= {\big |{\nabla {|{\nabla u}|}}\big |}^2, \quad {|{\omega \cdot H\omega }|}^2= {\bigg |{ \frac{\nabla u}{{|{\nabla u}|}} (\nabla {|{\nabla u}|})^T}\bigg |}^2, \quad {|{H}|}^2= {|{\nabla ^2 u}|}^2. \end{aligned}$$

Therefore, by equation (3.9),

$$\begin{aligned} {\big |{\nabla {|{\nabla u}|}}\big |}^2&\le \tfrac{1}{2} {\bigg |{ \frac{\nabla u}{{|{\nabla u}|}} (\nabla {|{\nabla u}|})^T}\bigg |}^2 + \tfrac{1}{2} {|{\nabla ^2 u}|}^2. \end{aligned}$$

Consequently, the following chain holds:

$$\begin{aligned}&{|{\nabla ^2 u}|}^2 + 2(p-2){\big |{\nabla {|{\nabla u}|}}\big |}^2 + (p-2)^2 {\bigg |{ \frac{\nabla u}{{|{\nabla u}|}} (\nabla {|{\nabla u}|})^T}\bigg |}^2 \\&\quad \ge \big (1 + (p-2) \big ) {|{\nabla ^2 u}|}^2 + \big ( (p-2) + (p-2)^2\big ) {\bigg |{ \frac{\nabla u}{{|{\nabla u}|}} (\nabla {|{\nabla u}|})^T}\bigg |}^2 \\&\quad \ge (p-1) {|{\nabla ^2 u}|}^2 + (p-1)(p-2) {\bigg |{ \frac{\nabla u}{{|{\nabla u}|}} (\nabla {|{\nabla u}|})^T}\bigg |}^2 \\&\quad \ge \big ((p-1) + (p-1)(p-2)\big ) {|{\nabla ^2 u}|}^2 \\&\quad = (p-1)^2 {|{\nabla ^2 u}|}^2. \end{aligned}$$

Hence, inequality (3.39) follows.

As far as the sharpness of the constant is concerned, if \(p\ge 2\), consider the function \(u: \mathbb {R}^n\setminus {\{{0}\}} \rightarrow \mathbb {R}\) given by

$$\begin{aligned} u (x)= |x| \quad \text {for }\quad x \in \mathbb {R}^n\setminus {\{{0}\}} . \end{aligned}$$

Since \(\nabla {|{\nabla u}|} = 0\), equality holds in (3.39) for every \(x \in \mathbb {R}^n\setminus {\{{0}\}}\). On the other hand, if \(p \in [1, 2)\), consider the function \(u: \mathbb {R}^n\rightarrow \mathbb {R}\) defined as

$$\begin{aligned} u (x)= \tfrac{1}{2} x_1^2 \quad \text {for}\quad x \in \mathbb {R}^n. \end{aligned}$$

One has that

$$\begin{aligned} {|{\nabla ^2 u}|}^2 ={\big |{\nabla {|{\nabla u}|}}\big |}^2 = {\bigg |{ \frac{\nabla u}{{|{\nabla u}|}} (\nabla {|{\nabla u}|})^T}\bigg |}^2 =1 \quad \text {in} \quad \{x \in {\mathbb R^n}: x_1 \ne 0\}. \end{aligned}$$

Hence, equality holds in (3.39) at every point \(x \in \mathbb {R}^n\) such that \(x_1 \ne 0\), namely in the whole set \(\{\nabla u \ne 0\}\).

Case \(N \ge 2\). It suffices to prove that inequality (3.39) holds at every point \(x\in \{\nabla {\textbf {u}}\ne 0\}\) under the assumption that \({|{\nabla ^2 {\textbf {u}}(x)}|}\) equals either 0 or 1. Indeed, if \({|{\nabla ^2 {\textbf {u}}(x)}|}\ne 0\) at some point x, then the function given by \(\overline{{\textbf {u}}}= \frac{{\textbf {u}}}{{|{\nabla ^2 {\textbf {u}}(x)}|}}\) fulfills \({|{\nabla ^2 \overline{{\textbf {u}}}(x)}|}=1\). Hence, inequality (3.39) for \({\textbf {u}}\) at the point x follows from the same inequality applied to \(\overline{{\textbf {u}}}\).

If \(p\ge 2\), inequality (3.39) holds trivially. Thus, we may focus on the case when \(p \in [1,2)\). In this case, we make use of Lemma 3.5. Define

$$\begin{aligned} \omega ^\alpha = \frac{(\nabla u^\alpha )^T}{|\nabla {\textbf {u}}|}\in {\mathbb R^n}\quad \text {and} \quad H^\alpha = \nabla ^2 u^\alpha \in \mathbb {R}^{n\times n}_\textrm{sym} \end{aligned}$$

for \(\alpha =1, \dots , N\), at any point in \(\{\nabla {\textbf {u}}\ne 0\}\). In particular, assumptions (3.19) and (3.20) are satisfied with this choice. Computations show that

$$\begin{aligned} J= {\big |{\nabla {|{\nabla {\textbf {u}}}|}}\big |}^2, \quad J_0= {\bigg |{ \frac{\nabla {\textbf {u}}}{{|{\nabla {\textbf {u}}}|}} (\nabla {|{\nabla {\textbf {u}}}|})^T}\bigg |}^2, \quad J_1 = {|{\nabla ^2 {\textbf {u}}}|}^2, \end{aligned}$$
(3.40)

where J, \(J_0\) and \(J_1\) are defined as in (3.18).

Next, let \(\delta = \frac{2-p}{2}\). Notice that \(\delta \in [0, \frac{1}{2}]\), and that \(\delta \in (0, \frac{1}{3}]\) if and only if \(p \in [\frac{4}{3},2)\). We next choose \(\sigma = \frac{p}{2}\) if \(p \in [\frac{4}{3},2)\), and \(\sigma = \frac{(\delta +1)^2}{8\delta }= \frac{1}{16} \frac{(4-p)^2}{2-p}\) if \(p\in [1, \frac{4}{3})\). Observe that \(\delta +\sigma =1\) in the former case, and \(\delta +\sigma > 1\) in the latter. Thus, the assumptions on \(\delta \) and \(\sigma \) of Lemma 3.5 are fulfilled. Furthermore, by our choice of \(\sigma \), the maximum on the right-hand side of inequality (3.21) equals 0 when \(\delta >\frac{1}{3}\), namely when \(p\in [1, \frac{4}{3})\). From inequality (3.21) we infer that

$$\begin{aligned} J&\le \tfrac{2-p}{2} J_0 + \sigma J_1. \end{aligned}$$

This inequality is equivalent to

$$\begin{aligned} J_1 + 2(p-2)J + (p-2)^2 J_0 \ge \big (1- 2\sigma (2-p)\big ) J_1. \end{aligned}$$

Since \(1- 2\sigma (2-p) = \kappa _N(p)\), inequality (3.39) follows.
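
The value of \(1 - 2\sigma (2-p)\) can be made explicit for the two choices of \(\sigma \) made above. The following computation is only a check of the arithmetic. The resulting expressions are meant to be compared with the definition (2.2)–(2.3) of \(\kappa _N(p)\), which is not reproduced here:

$$\begin{aligned} 1 - 2\sigma (2-p) = {\left\{ \begin{array}{ll} 1 - p(2-p) = (p-1)^2 &{}\qquad \text {if}\quad \sigma = \frac{p}{2},\\ 1 - \tfrac{1}{8} (4-p)^2 &{}\qquad \text {if}\quad \sigma = \frac{1}{16} \frac{(4-p)^2}{2-p}. \end{array}\right. } \end{aligned}$$

The two expressions agree at the threshold \(p = \frac{4}{3}\), where both equal \(\frac{1}{9}\).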

In order to prove the sharpness of the constant \(\kappa _N(p)\), let us distinguish the cases when \(p\ge 2\), \(p \in [\frac{4}{3}, 2)\) and \(p\in [1, \frac{4}{3})\).

If \(p\ge 2\), consider the function \({\textbf {u}}: \mathbb {R}^n\setminus {\{{0}\}} \rightarrow \mathbb {R}^N\) given by

$$\begin{aligned} {\textbf {u}}(x)= (|x|, 0, \dots ,0) \quad \text {for}\quad x \in \mathbb {R}^n\setminus {\{{0}\}} . \end{aligned}$$

Since \(\nabla {|{\nabla {\textbf {u}}}|} = 0\), equality holds in (3.39) for every \(x \in \mathbb {R}^n\setminus {\{{0}\}}\).

If \(p \in [\frac{4}{3}, 2)\), consider the function \({\textbf {u}}: \mathbb {R}^n\rightarrow \mathbb {R}^N\) defined as

$$\begin{aligned} {\textbf {u}}(x)= (\tfrac{1}{2} x_1^2, 0, \dots ,0) \quad \text {for}\quad x \in \mathbb {R}^n. \end{aligned}$$

One has that

$$\begin{aligned} {|{\nabla ^2 {\textbf {u}}}|}^2 ={\big |{\nabla {|{\nabla {\textbf {u}}}|}}\big |}^2 = {\bigg |{ \frac{\nabla {\textbf {u}}}{{|{\nabla {\textbf {u}}}|}} (\nabla {|{\nabla {\textbf {u}}}|})^T}\bigg |}^2 =1 \quad \text {in}\quad \{x \in {\mathbb R^n}: x_1 \ne 0\}. \end{aligned}$$

Thus, equality holds in (3.39) at every point \(x \in \mathbb {R}^n\) such that \(x_1 \ne 0\), namely in the whole set \(\{\nabla {\textbf {u}}\ne 0\}\).

If \(p \in [1, \frac{4}{3})\), set \(r_0 = \frac{p}{2(2-p)}\).

Let \(e_1, e_2\) denote the first two vectors of the canonical basis of \({\mathbb R^n}\). Define

$$\begin{aligned} t_1&= \sqrt{r_0},&\qquad \omega ^1&= t_1 e_1, \\ t_2&= \sqrt{1-r_0},&\qquad \omega ^2&= t_2 e_2, \\ h_1&= \sqrt{\frac{2r_0}{1+r_0}},&\qquad H^1&= h_1 e_1 \otimes e_1, \\ h_2&= \sqrt{\frac{1-r_0}{1+r_0}},&\qquad H^2&= h_2 \tfrac{1}{\sqrt{2}} \big ( e_1 \otimes e_2 + e_2 \otimes e_1\big ), \end{aligned}$$

and \(\omega ^3 = \dots = \omega ^N = 0\), \(H^3 = \dots = H^N = 0\). Then

$$\begin{aligned} \sum _{\alpha =1}^N {|{\omega ^\alpha }|}^2 = {|{\omega ^1}|}^2 + {|{\omega ^2}|}^2 = 1. \end{aligned}$$
(3.41)

Moreover,

$$\begin{aligned} J_1&= \sum _{\alpha =1}^N {|{H^\alpha }|}^2 = {|{H^1}|}^2 + {|{H^2}|}^2= 1, \end{aligned}$$
(3.42)
$$\begin{aligned} J&= {\Bigg |{\sum _{\alpha =1}^N H^\alpha \omega ^\alpha }\Bigg |}^2 = {|{H^1 \omega ^1 + H^2 \omega ^2}|}^2 = \bigg |\Big ( h_1 t_1 + \frac{1}{\sqrt{2}} h_2 t_2 \Big ) e_1\bigg |^2 \nonumber \\&= \bigg |\sqrt{\frac{1+r_0}{2}} e_1\bigg |^2 =\frac{1+r_0}{2}, \end{aligned}$$
(3.43)
$$\begin{aligned} J_0&= \sum _{\alpha =1}^N {|{\ \omega ^\alpha \cdot (H^1 \omega ^1 + H^2 \omega ^2)}|}^2 = {\big |{ \omega ^ 1\cdot (H^1 \omega ^1 + H^2 \omega ^2) }\big |}^2 = \frac{r_0(1+r_0)}{2}. \end{aligned}$$
(3.44)

Now, let \({\textbf {u}}: {\mathbb R^n}\rightarrow {\mathbb R^N}\) be a polynomial of degree two such that \(\nabla u^\alpha (0)^T = \omega ^\alpha \) and \(\nabla ^2u^\alpha = H^\alpha \) for \(\alpha =1, \dots , N\). Formulas (3.40), combined with (3.42)–(3.44), tell us that

$$\begin{aligned}&{|{\nabla ^2 {\textbf {u}}}|}^2 + 2(p-2){\big |{\nabla {|{\nabla {\textbf {u}}}|}}\big |}^2 + (p-2)^2 {\bigg |{ \frac{\nabla {\textbf {u}}}{{|{\nabla {\textbf {u}}}|}} (\nabla {|{\nabla {\textbf {u}}}|})^T}\bigg |}^2 \\&\quad = 1- \tfrac{1}{8}(4-p)^2 = \kappa _N (p) {|{\nabla ^2 {\textbf {u}}}|}^2 \quad \hbox {at}\quad 0. \end{aligned}$$
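
The first equality above can be checked by direct arithmetic. Observe that \(r_0 < 1\) for \(p \in [1, \frac{4}{3})\), so that \(t_2\) and \(h_2\) are well defined, and that \(1 + r_0 = \frac{4-p}{2(2-p)}\). Hence, by (3.42)–(3.44),

$$\begin{aligned} J_1 + 2(p-2)J + (p-2)^2 J_0&= 1 - (2-p)(1+r_0) + \tfrac{1}{2} (2-p)^2 r_0 (1+r_0) \\&= 1 - \tfrac{1}{2}(4-p) + \tfrac{1}{8} p(4-p) = 1 - \tfrac{1}{8} (4-p)^2. \end{aligned}$$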

Hence, equality holds in (3.39) for \(x =0\). \(\square \)

We are now in a position to prove Theorem 2.1.

Proof of Theorem 2.1

By Lemma 3.6, applied with \(p= Q_a(|\nabla {\textbf {u}}|) +2\), and the monotonicity of the function \(\kappa _N\) one has that

$$\begin{aligned}&a(|\nabla {\textbf {u}}|)^2\Bigg [ |\nabla ^2 {\textbf {u}}|^2 +2 Q_a(|\nabla {\textbf {u}}|)|\nabla |\nabla {\textbf {u}}||^2 +Q_a(|\nabla {\textbf {u}}|)^2\bigg | \frac{\nabla {\textbf {u}}}{|\nabla {\textbf {u}}|} (\nabla |\nabla {\textbf {u}}|)^T \bigg |^2 \Bigg ] \nonumber \\&\quad \ge \kappa _N \big (Q_a(|\nabla {\textbf {u}}|)+2\big ) a(|\nabla {\textbf {u}}|)^2 |\nabla ^2 {\textbf {u}}|^2 \nonumber \\&\quad \ge \kappa _N \big (i_a+2\big ) a(|\nabla {\textbf {u}}|)^2 |\nabla ^2 {\textbf {u}}|^2 \quad \text {in }\quad \{\nabla {\textbf {u}}\ne 0\}. \end{aligned}$$
(3.45)

Thus, under the assumptions of Part (ii), inequality (2.7) holds at every point in the set \(\{\nabla {\textbf {u}}\ne 0\}\), owing to equation (3.3) and inequality (3.45).

On the other hand, if the stronger assumptions of Part (i) are in force, then equation (3.3) also holds at every point in the set \(\{\nabla {\textbf {u}}= 0\}\), provided that the expression

$$\begin{aligned} 2 Q_a(|\nabla {\textbf {u}}|)|\nabla |\nabla {\textbf {u}}||^2 +Q_a(|\nabla {\textbf {u}}|)^2\bigg | \frac{\nabla {\textbf {u}}}{|\nabla {\textbf {u}}|} (\nabla |\nabla {\textbf {u}}|)^T \bigg |^2 \end{aligned}$$

is interpreted as 0. Hence, inequality (2.7) holds in \(\{\nabla {\textbf {u}}= 0\}\) as well, inasmuch as \(\kappa _N \big (i_a+2\big )\le 1\).

In order to verify the optimality of the constant \(\kappa _N(i_a+2)\) in inequality (2.7), pick a function \(\overline{{\textbf {u}}}\) and a point \(x_0\) from the proof of Lemma 3.6 such that \(\nabla \overline{{\textbf {u}}}(x_0)\ne 0\) and equality holds in inequality (3.39), with \({\textbf {u}}=\overline{{\textbf {u}}}\) and \(p=i_a +2\), at the point \(x_0\). Namely,

$$\begin{aligned}&{|{\nabla ^2 \overline{{\textbf {u}}}(x_0)}|}^2 + 2i_a{\big |{\nabla {|{\nabla \overline{{\textbf {u}}}}|}(x_0)}\big |}^2 + i_a^2 {\bigg |{ \frac{\nabla \overline{{\textbf {u}}}(x_0)}{{|{\nabla \overline{{\textbf {u}}}(x_0)}|}} (\nabla {|{\nabla \overline{{\textbf {u}}}}|}(x_0))^T}\bigg |}^2\nonumber \\&\quad = \kappa _N (i_a+2) {|{\nabla ^2 \overline{{\textbf {u}}}(x_0)}|}^2. \end{aligned}$$
(3.46)

By the definition of the index \(i_a\), given \(\varepsilon >0\) there exists \(t_0 \in (0, \infty )\) such that

$$\begin{aligned} i_a \le Q_a(t_0) \le i_a + \varepsilon . \end{aligned}$$
(3.47)

Define the function \({\textbf {u}}= \frac{t_0 \overline{{\textbf {u}}}}{|\nabla \overline{{\textbf {u}}}(x_0)|}\), so that \(|\nabla {\textbf {u}}(x_0)|= t_0\). From identity (3.3), equation (3.46) and inequality (3.47) we obtain that

$$\begin{aligned}&\frac{\big |\textrm{\textbf{div}} (a(|\nabla {\textbf {u}}|)\nabla {\textbf {u}})\big |^2 - \textrm{{div}} \Big [a(|\nabla {\textbf {u}}|)^2 \Big ( (\Delta {\textbf {u}})^T \nabla {\textbf {u}}- \tfrac{1}{2} \nabla |\nabla {\textbf {u}}|^2\Big )\Big ]}{a(|\nabla {\textbf {u}}|)^{2} |\nabla ^2 {\textbf {u}}|^2 }\Bigg |_{x=x_0} \nonumber \\ {}&\quad = \frac{|\nabla ^2 {\textbf {u}}(x_0)|^2 +2 Q_a(t_0)|\nabla |\nabla {\textbf {u}}|(x_0)|^2 +Q_a(t_0)^2\Big | \frac{\nabla {\textbf {u}}(x_0)}{|\nabla {\textbf {u}}(x_0)|} (\nabla |\nabla {\textbf {u}}|(x_0))^T \Big |^2}{ |\nabla ^2 {\textbf {u}}(x_0)|^2 } \nonumber \\ {}&\quad = \frac{|\nabla ^2 \overline{{\textbf {u}}}(x_0)|^2 +2 Q_a(t_0)|\nabla |\nabla \overline{{\textbf {u}}}|(x_0)|^2 +Q_a(t_0)^2\Big | \frac{\nabla \overline{{\textbf {u}}}(x_0)}{|\nabla \overline{{\textbf {u}}}(x_0)|} (\nabla |\nabla \overline{{\textbf {u}}}|(x_0))^T \Big |^2}{ |\nabla ^2 \overline{{\textbf {u}}}(x_0)|^2 } \nonumber \\ {}&\quad \le \frac{|\nabla ^2 \overline{{\textbf {u}}}(x_0)|^2 \!+\!2 ( i_a \!+\! \varepsilon )|\nabla |\nabla \overline{{\textbf {u}}}|(x_0)|^2 \!+\! (i_a^2 \!+\! 2\varepsilon |i_a|\!+\! \varepsilon ^2)\Big | \frac{\nabla \overline{{\textbf {u}}}(x_0)}{|\nabla \overline{{\textbf {u}}}(x_0)|} (\nabla |\nabla \overline{{\textbf {u}}}|(x_0))^T \Big |^2}{ |\nabla ^2 \overline{{\textbf {u}}}(x_0)|^2 } \nonumber \\ {}&\quad = \kappa _N(i_a+2)+ \frac{2 \varepsilon |\nabla |\nabla \overline{{\textbf {u}}}|(x_0)|^2 + (2\varepsilon |i_a|+ \varepsilon ^2)\Big | \frac{\nabla \overline{{\textbf {u}}}(x_0)}{|\nabla \overline{{\textbf {u}}}(x_0)|} (\nabla |\nabla \overline{{\textbf {u}}}|(x_0))^T\Big |^2}{ |\nabla ^2 \overline{{\textbf {u}}}(x_0)|^2 }. \end{aligned}$$
(3.48)

Hence, the optimality of the constant \(\kappa _N(i_a+2)\) in inequality (2.7) follows, owing to the arbitrariness of \(\varepsilon \). \(\square \)

4 Function spaces

An appropriate functional framework for the analysis of solutions to systems of the general form (1.5) is provided by the Orlicz–Sobolev spaces associated with the energy integral appearing in the functional (1.6). These spaces generalize the classical Sobolev spaces: the role of powers in the definition of the norm is played by more general Young functions. Section 4.1 is devoted to some basic definitions and properties of Young functions and of Orlicz–Sobolev spaces. A Poincaré type inequality for functions in these spaces, of use for our purposes, is established as well. In Section 4.2 we collect specific properties of the Young function (and of its perturbations) entering the definition of the particular Orlicz–Sobolev ambient space associated with system (2.11).

4.1 Young functions and Orlicz–Sobolev spaces

A Young function \(A : [0, \infty ) \rightarrow [0, \infty ]\) is a convex function such that \(A (0)=0\). The Young conjugate of a Young function A is the Young function \(\widetilde{A}\) defined as

$$\begin{aligned} \widetilde{A} (t) = \sup \{st - A (s): s \ge 0\} \quad \hbox {for}\quad t \ge 0. \end{aligned}$$
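As an illustration, consider the model power case; the exponent \(p>1\) is introduced here only for the sake of the example. If \(A(t) = \frac{t^p}{p}\), then the supremum in the definition of \(\widetilde{A}(t)\) is attained at \(s = t^{\frac{1}{p-1}}\), and

$$\begin{aligned} \widetilde{A}(t) = t \cdot t^{\frac{1}{p-1}} - \frac{1}{p}\, t^{\frac{p}{p-1}} = \frac{t^{p'}}{p'} \quad \text{for}\quad t \ge 0, \end{aligned}$$

where \(p' = \frac{p}{p-1}\). Notice that the very definition of \(\widetilde{A}\) implies Young's inequality \(st \le A(s) + \widetilde{A}(t)\) for \(s, t \ge 0\).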

A Young function (and, more generally, an increasing function) A is said to belong to the class \(\Delta _2\), or to satisfy the \(\Delta _2\)-condition, if there exists a constant \(c>1\) such that

$$\begin{aligned} A (2t ) \le c A (t) \quad \hbox {for }t>0. \end{aligned}$$
(4.1)

Let \(i_A\) and \(s_A\) be the indices associated with a continuously differentiable function A as in (2.1), with a replaced by A. Namely,

$$\begin{aligned} i_A= \inf _{t>0} \frac{t A'(t)}{A(t)} \quad \hbox {and} \qquad s_A= \sup _{t >0} \frac{t A'(t)}{A(t)}. \end{aligned}$$
(4.2)

One has that \(A \in \Delta _2\) if and only if \(s_A<\infty \). The constant c in inequality (4.1) depends on \(s_A\). Also, \(\widetilde{A} \in \Delta _2\) if and only if \(i_A>1\).
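The dependence of the constant in (4.1) on \(s_A\) can be made explicit. The following is a sketch, assuming that \(s_A<\infty \). Since \(\frac{d}{dt}\big (A(t)t^{-s_A}\big ) = t^{-s_A-1}\big (tA'(t) - s_A A(t)\big ) \le 0\) for \(t>0\), the function \(A(t)t^{-s_A}\) is non-increasing, whence

$$\begin{aligned} A(2t) \le 2^{s_A} A(t) \quad \text{for}\quad t>0. \end{aligned}$$

For instance, the function \(A(t)=t^p\), with \(p \ge 1\), has \(i_A=s_A=p\), and inequality (4.1) holds with \(c=2^p\). On the other hand, for \(A(t) = t^p\log (1+t)\) one has

$$\begin{aligned} \frac{tA'(t)}{A(t)} = p + \frac{t}{(1+t)\log (1+t)} \quad \text{for}\quad t>0, \end{aligned}$$

and, since the last fraction decreases from 1 to 0 as t ranges from 0 to \(\infty \), we have that \(i_A = p\) and \(s_A=p+1\).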

The Orlicz space \(L^{A}(\Omega )\) is the Banach function space of those real-valued measurable functions \(u : \Omega \rightarrow \mathbb R\) whose Luxemburg norm

$$\begin{aligned} \Vert u\Vert _{L^A (\Omega )} = \inf \bigg \{\lambda >0: \int _\Omega A \bigg (\frac{|u|}{\lambda }\bigg ) \,d x \le 1\bigg \} \end{aligned}$$

is finite. The Orlicz space \(L^{A}(\Omega , {\mathbb R^N})\) of \({\mathbb R^N}\)-valued functions, and the Orlicz spaces \(L^{A}(\Omega , \mathbb R^{N\times n})\) and \(L^{A}(\Omega , \mathbb R^{N\times n\times n})\) of \(\mathbb R^{N\times n}\)-valued and \(\mathbb R^{N\times n\times n}\)-valued functions, respectively, are defined analogously.
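To see how the Luxemburg norm reproduces the Lebesgue norm, take \(A(t) = t^p\) for some \(p \ge 1\), an exponent chosen here just as an example. Then

$$\begin{aligned} \int _\Omega A \bigg (\frac{|u|}{\lambda }\bigg )\, dx = \lambda ^{-p} \int _\Omega |u|^p\, dx \le 1 \quad \text{if and only if}\quad \lambda \ge \Vert u\Vert _{L^p(\Omega )}, \end{aligned}$$

so that \(\Vert u\Vert _{L^A(\Omega )} = \Vert u\Vert _{L^p(\Omega )}\) and \(L^A(\Omega ) = L^p(\Omega )\).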

The Orlicz-Sobolev space \(W^{1, A }(\Omega )\) is the Banach space

$$\begin{aligned} W^{1, A}(\Omega ) = \{u \in L^A (\Omega ): u \hbox { is weakly differentiable in }\Omega \hbox { and }\nabla u \in L^A (\Omega , {\mathbb R^n}) \}\,, \end{aligned}$$
(4.3)

and is equipped with the norm

$$\begin{aligned} \Vert u\Vert _{W^{1,A} (\Omega )} = \Vert u\Vert _{L^{A} (\Omega )}+ \Vert \nabla u\Vert _{L^A (\Omega , {\mathbb R^n})}. \end{aligned}$$

The space \(W^{1, A}_\textrm{loc} (\Omega )\) is defined accordingly. By \(W^{1,A} _0(\Omega )\) we denote the subspace of \(W^{1,A} (\Omega )\) of those functions in \(W^{1,A} (\Omega )\) whose extension by 0 outside \(\Omega \) is weakly differentiable in the whole of \({\mathbb R^n}\). The notation \((W^{1,A}_0 (\Omega ))'\) stands for the dual of \(W^{1,A} _0(\Omega )\). If \(\Omega \) has finite Lebesgue measure \(|\Omega |\), then the functional \( \Vert \nabla u\Vert _{L^A (\Omega , {\mathbb R^n})}\) defines a norm in \(W^{1,A} _0(\Omega )\) equivalent to \(\Vert u\Vert _{W^{1,A} (\Omega )}\).

The space \(C^\infty _0(\Omega )\) is dense in \(W^{1,A} _0(\Omega )\) if \(A\in \Delta _2\). Moreover, \(W^{1,A} _0(\Omega )\) is reflexive if both \(A\in \Delta _2\) and \( \widetilde{A} \in \Delta _2\), and hence if \(i_A>1\) and \(s_A<\infty \).

The Orlicz-Sobolev space \(W^{1,A} (\Omega , {\mathbb R^N})\) of \({\mathbb R^N}\)-valued functions, its variants \(W^{1,A}_\textrm{loc} (\Omega , {\mathbb R^N})\) and \(W^{1,A}_0 (\Omega , {\mathbb R^N})\), and the space \((W^{1,A}_0 (\Omega , {\mathbb R^N}))'\) are defined analogously.

If \(|\Omega |<\infty \) and the Young function \(A\in \Delta _2\), then the Poincaré type inequality

$$\begin{aligned} \int _\Omega A(|u|)\, dx \le c \int _\Omega A(|\nabla u|)\, dx \end{aligned}$$
(4.4)

holds for some constant \(c=c(n, |\Omega |, s_A)\) and for every function \(u \in W^{1,A}_0(\Omega )\). Inequality (4.4) follows, for instance, from [59, Lemma 3].

In order to bound lower-order terms appearing in our global estimate, we also need a stronger, yet non-optimal, Sobolev-Poincaré type inequality for functions in \(W^{1,A}_0(\Omega )\) with an Orlicz target space smaller than \(L^A(\Omega )\). This is the subject of Theorem 4.1 below, which generalizes a version of the relevant inequality with optimal Orlicz target space from [26] (see also [25] for an equivalent form).

Assume that the Young function A and the number \(\sigma >1\) satisfy the conditions

$$\begin{aligned} \int _0 \bigg (\frac{t}{A(t)}\bigg )^{\frac{1}{\sigma -1}}\,dt < \infty \end{aligned}$$
(4.5)

and

$$\begin{aligned} \int ^\infty \bigg (\frac{t}{A(t)}\bigg )^{\frac{1}{\sigma -1}}\,dt = \infty . \end{aligned}$$
(4.6)

Then, we define the function \(H_\sigma : [0, \infty ) \rightarrow [0, \infty )\) as

$$\begin{aligned} H_\sigma (s) = \bigg (\int _0^s \bigg (\frac{t}{A(t)}\bigg )^{\frac{1}{\sigma -1}}\,dt \bigg )^{\frac{1}{\sigma '}} \qquad \hbox {for}\quad s \ge 0, \end{aligned}$$
(4.7)

and the Young function \(A_\sigma \) as

$$\begin{aligned} A_\sigma (t) = A(H_\sigma ^{-1}(t)) \qquad \hbox {for }\quad t \ge 0. \end{aligned}$$
(4.8)
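As an illustration of definitions (4.7) and (4.8), let \(A(t) = t^p\) with \(1 \le p < \sigma \); the exponent p is an assumption made only for this example. Then \(\big (\frac{t}{A(t)}\big )^{\frac{1}{\sigma -1}} = t^{-\frac{p-1}{\sigma -1}}\) with \(\frac{p-1}{\sigma -1}<1\), so that conditions (4.5) and (4.6) are fulfilled. Moreover,

$$\begin{aligned} H_\sigma (s) = \bigg (\frac{\sigma -1}{\sigma -p}\, s^{\frac{\sigma -p}{\sigma -1}}\bigg )^{\frac{1}{\sigma '}} = \kappa \, s^{\frac{\sigma -p}{\sigma }}, \qquad \kappa = \bigg (\frac{\sigma -1}{\sigma -p}\bigg )^{\frac{1}{\sigma '}}, \end{aligned}$$

and hence

$$\begin{aligned} A_\sigma (t) = \Big (\frac{t}{\kappa }\Big )^{\frac{\sigma p}{\sigma -p}} \quad \text{for}\quad t \ge 0. \end{aligned}$$

In particular, if \(\sigma = n\) and \(p<n\), then \(A_n(t)\) is a constant multiple of \(t^{p^*}\), where \(p^* = \frac{np}{n-p}\) is the Sobolev conjugate of p.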

Theorem 4.1

Let \(\Omega \) be an open set in \({\mathbb R^n}\) with \(|\Omega |<\infty \). Assume that the Young function A and the number \(\sigma \ge n\) fulfill conditions (4.5) and (4.6). Then there exists a constant \(c=c(n, \sigma )\) such that

$$\begin{aligned} \int _\Omega A_\sigma \Bigg (\frac{|u(x)|}{c|\Omega |^{\frac{1}{n} - \frac{1}{\sigma }}\big (\int _\Omega A(|\nabla u |)dy\big )^{1/\sigma }}\Bigg )\, dx \le \int _\Omega A(|\nabla u|)dx \end{aligned}$$
(4.9)

for every \(u \in W^{1,A}_0(\Omega )\).
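As a consistency check, we sketch how inequality (4.9) reads for powers. Let \(A(t)=t^p\) with \(1 \le p<\sigma \), an exponent introduced here only as an example. A direct computation from (4.7) and (4.8) gives \(A_\sigma (t) = (t/\kappa )^{q}\), where \(q = \frac{\sigma p}{\sigma -p}\) and \(\kappa = \kappa (p, \sigma )\). Thus, inequality (4.9) takes the form

$$\begin{aligned} \int _\Omega |u|^q\, dx \le \big (c\, \kappa \, |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }}\big )^q \bigg (\int _\Omega |\nabla u|^p\, dx\bigg )^{1+ \frac{q}{\sigma }}. \end{aligned}$$

Since \(1 + \frac{q}{\sigma } = \frac{q}{p}\), this is equivalent to

$$\begin{aligned} \Vert u\Vert _{L^q(\Omega )} \le c\, \kappa \, |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }}\, \Vert \nabla u\Vert _{L^p(\Omega , {\mathbb R^n})}, \end{aligned}$$

which, for \(\sigma =n\) and \(p<n\), has the form of the classical Sobolev inequality with exponent \(q = \frac{np}{n-p}\).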

Proof

By the Pólya-Szegö principle on the decrease of the functional on the right-hand side of inequality (4.9) under symmetric decreasing rearrangement of functions \(u \in W^{1,A}_0(\Omega )\) (see [15]), it suffices to prove inequality (4.9) in the case when \(\Omega \) is a ball and the trial functions u are nonnegative and radially decreasing. As a consequence, this inequality will follow if we show that

$$\begin{aligned} \int _0^{|\Omega |} A_\sigma \Bigg (\frac{\int _s^{|\Omega |}\varphi (r)r^{-\frac{1}{n'}}\, dr}{c |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }} \big (\int _0^{|\Omega |} A(\varphi (r))dr\big )^{1/\sigma }}\Bigg )\, ds \le \int _0^{|\Omega |} A(\varphi (s))\, ds \end{aligned}$$
(4.10)

for a suitable constant c as in the statement and for every measurable function \(\varphi : (0, |\Omega |)\rightarrow [0, \infty )\). Let S be the linear operator defined as

$$\begin{aligned} S\varphi (s) = \int _s^{|\Omega |}\varphi (r)r^{-\frac{1}{n'}}\, dr \quad \text {for}\quad s \in (0, |\Omega |), \end{aligned}$$
(4.11)

for every measurable function \(\varphi : (0, |\Omega |) \rightarrow \mathbb R\) that makes the integral on the right-hand side converge. One has that

$$\begin{aligned} \Vert S\varphi \Vert _{L^{\sigma '}(0, |\Omega |)}&= \bigg (\int _0^{|\Omega |}|S\varphi (s)|^{\sigma '}\, ds\bigg )^{\frac{1}{\sigma '}} \le \bigg (\int _0^{|\Omega |}s^{-\frac{\sigma '}{n'}}\bigg (\int _s^{|\Omega |}|\varphi (r)|\, dr\bigg )^{\sigma '}\, ds\bigg )^{\frac{1}{\sigma '}} \nonumber \\ {}&\le \Vert \varphi \Vert _{L^1(0, |\Omega |)} \bigg (\int _0^{|\Omega |}s^{-\frac{\sigma '}{n'}}\, ds\bigg )^{\frac{1}{\sigma '}} =c |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }} \Vert \varphi \Vert _{L^1(0, |\Omega |)} \end{aligned}$$
(4.12)

for a suitable constant \(c=c(n, \sigma )\) and for every \(\varphi \in L^1(0, |\Omega |)\). Also, by the Hardy-Littlewood inequality for rearrangements,

$$\begin{aligned} \Vert S\varphi \Vert _{L^{\infty }(0, |\Omega |)}&\le \int _0^{|\Omega |}|\varphi (r)|r^{-\frac{1}{n'}}\, dr \le \int _0^{|\Omega |}\varphi ^*(r)r^{-\frac{1}{n'}}\, dr \nonumber \\ {}&\le |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }} \int _0^{|\Omega |}\varphi ^*(r)r^{-\frac{1}{\sigma '}}\, dr = |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }} \Vert \varphi \Vert _{L^{\sigma , 1}(0, |\Omega |)} \end{aligned}$$
(4.13)

for every \(\varphi \in L^{\sigma , 1}(0, |\Omega |)\). Here, \(\varphi ^*\) denotes the decreasing rearrangement of \(\varphi \), and \(L^{\sigma , 1}(0, |\Omega |)\) is the Lorentz space whose norm is defined by the last integral in equation (4.13). Owing to equations (4.12) and (4.13), the interpolation theorem established in [26, Theorem 4] can be applied to deduce inequality (4.10). \(\square \)
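The elementary computations behind the powers of \(|\Omega |\) in (4.12) and (4.13) can be sketched as follows. If \(\sigma >n\), then \(\sigma ' < n'\), and the last integral in (4.12) can be evaluated in closed form:

$$\begin{aligned} \bigg (\int _0^{|\Omega |}s^{-\frac{\sigma '}{n'}}\, ds\bigg )^{\frac{1}{\sigma '}} = \Big (1- \frac{\sigma '}{n'}\Big )^{-\frac{1}{\sigma '}} |\Omega |^{\frac{1}{\sigma '} - \frac{1}{n'}} = \Big (1- \frac{\sigma '}{n'}\Big )^{-\frac{1}{\sigma '}} |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }}. \end{aligned}$$

As for (4.13), since \(\frac{1}{n} - \frac{1}{\sigma } \ge 0\), one has that

$$\begin{aligned} r^{-\frac{1}{n'}} = r^{\frac{1}{n} - \frac{1}{\sigma }}\, r^{-\frac{1}{\sigma '}} \le |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }}\, r^{-\frac{1}{\sigma '}} \quad \text{for}\quad r \in (0, |\Omega |), \end{aligned}$$

which accounts for the third inequality in that chain.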

The next lemma tells us that the assumptions of Theorem 4.1 are certainly fulfilled if A satisfies the \(\Delta _2\)-condition, provided that \(\sigma \) is sufficiently large.

Lemma 4.2

Let A be a continuously differentiable Young function satisfying the \(\Delta _2\)-condition and let \(\sigma >s_A\). Then conditions (4.5) and (4.6) are fulfilled.

Proof

Owing to the definition of \(s_A\), one verifies via differentiation that the function \(\frac{A(t)}{t^{s_A}}\) is non-increasing. Thus,

$$\begin{aligned} A(t) \ge A(1) t^{s_A} \quad \text {if} \quad t\in (0,1], \end{aligned}$$
(4.14)

and

$$\begin{aligned} A(t) \le A(1) t^{s_A} \quad \text {if}\quad t\in [1, \infty ). \end{aligned}$$
(4.15)

Equations (4.5) and (4.6) follow from (4.14) and (4.15), respectively. \(\square \)
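For the reader's convenience, we sketch the computation behind the last step, assuming that \(0<A(1)<\infty \). Set \(\gamma = \frac{s_A-1}{\sigma -1}\), and observe that \(\gamma <1\), inasmuch as \(\sigma >s_A\). Inequalities (4.14) and (4.15) then yield

$$\begin{aligned} \int _0^1 \bigg (\frac{t}{A(t)}\bigg )^{\frac{1}{\sigma -1}}\,dt&\le A(1)^{-\frac{1}{\sigma -1}} \int _0^1 t^{-\gamma }\, dt = \frac{A(1)^{-\frac{1}{\sigma -1}}}{1-\gamma } < \infty , \\ \int _1^\infty \bigg (\frac{t}{A(t)}\bigg )^{\frac{1}{\sigma -1}}\,dt&\ge A(1)^{-\frac{1}{\sigma -1}} \int _1^\infty t^{-\gamma }\, dt = \infty . \end{aligned}$$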

4.2 Young functions built upon the function a

Given a continuously differentiable function \(a: (0, \infty ) \rightarrow (0,\infty )\) such that \(i_a \ge -1\), let b and B be the functions defined by (1.8) and (1.7), respectively. Our assumption on \(i_a\) ensures that b is a non-decreasing function, and hence B is a Young function.

One has that

$$\begin{aligned} i_b= i_a+1\quad \hbox { and }\quad s_b=s_a+1. \end{aligned}$$
(4.16)
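Equations (4.16) amount to a one-line computation, which we sketch on recalling that \(b(t) = a(t)t\), as b is used throughout this section. Indeed,

$$\begin{aligned} \frac{t b'(t)}{b(t)} = \frac{t \big (a'(t)t + a(t)\big )}{a(t)t} = \frac{t a'(t)}{a(t)} + 1 \quad \text{for}\quad t>0, \end{aligned}$$

and (4.16) follows on taking the infimum and the supremum over \(t>0\).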

Also

$$\begin{aligned} i_B\ge i_b+1 \quad \hbox { and }\quad s_B\le s_b +1. \end{aligned}$$
(4.17)

Thus, if \(s_a<\infty \), then the functions b and B satisfy the \(\Delta _2\)-condition, and if \(i_a>-1\), then the function \(\widetilde{B}\) satisfies the \(\Delta _2\)-condition.

Hence, if \(s_a<\infty \), then for every \(\lambda >1\) there exists a constant \(c=c(\lambda , s_a)>1\) such that

$$\begin{aligned} b(\lambda t) \le c b(t) \qquad \hbox {for}\quad t \ge 0, \end{aligned}$$
(4.18)

and

$$\begin{aligned} B(\lambda t) \le c B(t) \qquad \hbox {for}\quad t \ge 0. \end{aligned}$$
(4.19)

Moreover,

$$\begin{aligned} tb'(t) \le (s_a + 1) b(t) \quad \text {for}\quad t>0, \end{aligned}$$
(4.20)

and

$$\begin{aligned} B(t) \le tb(t) \le (s_a + 2) B(t) \quad \text {for}\quad t>0. \end{aligned}$$
(4.21)
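A sketch of the argument for (4.21) reads as follows, under the additional assumption that \(tb(t) \rightarrow 0\) as \(t \rightarrow 0^+\). The first inequality holds since b is non-decreasing and \(B(t) = \int _0^t b(\tau )\, d\tau \). As for the second one, by (4.20),

$$\begin{aligned} t b(t) = \int _0^t \big (b(\tau ) + \tau b'(\tau )\big )\, d\tau \le (s_a +2) \int _0^t b(\tau )\, d\tau = (s_a+2) B(t) \quad \text{for}\quad t>0. \end{aligned}$$

The constant \(s_a+2\) cannot be improved in general. For instance, if \(a(t) = t^{p-2}\) for some \(p>1\), an example chosen here for illustration, then \(s_a = p-2\), \(b(t) = t^{p-1}\) and \(B(t) = \frac{t^p}{p}\), whence \(tb(t) = t^p = (s_a+2)B(t)\).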

Since \(\widetilde{B} (b(t)) \le B(2t)\) for \(t \ge 0\), there exists a constant \(c=c(s_a)\) such that

$$\begin{aligned} \widetilde{B} (b(t)) \le c B(t) \quad \text {for}\quad t\ge 0. \end{aligned}$$
(4.22)

Finally, if \(i_a>-1\) and \(s_a<\infty \), then

$$\begin{aligned} a(1)\min \{t^{i_a}, t^{s_a}\} \le a(t) \le a(1)\max \{t^{i_a}, t^{s_a}\} \qquad \text {for}\quad t>0. \end{aligned}$$
(4.23)
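Inequality (4.23) can be derived by integrating the logarithmic derivative of a. We sketch the argument, with \(i_a\) and \(s_a\) defined as in (4.2), with A replaced by a. One has that \(\frac{i_a}{\tau } \le \frac{a'(\tau )}{a(\tau )} \le \frac{s_a}{\tau }\) for \(\tau >0\). Integrating over (1, t) if \(t \ge 1\), and over (t, 1) if \(0<t \le 1\), gives

$$\begin{aligned} a(1) t^{i_a}&\le a(t) \le a(1) t^{s_a} \quad \text{if}\quad t \ge 1, \\ a(1) t^{s_a}&\le a(t) \le a(1) t^{i_a} \quad \text{if}\quad 0<t \le 1, \end{aligned}$$

and the two cases can be combined as in (4.23).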

If the function a is as above and \(\varepsilon >0\), we define the function \(a_\varepsilon : [0,\infty ) \rightarrow (0,\infty )\) as

$$\begin{aligned} a_\varepsilon (t) = a(\sqrt{t^2 + \varepsilon ^2})\quad \hbox {for}\quad t \ge 0. \end{aligned}$$
(4.24)

The functions \(b_\varepsilon \) and \(B_\varepsilon \) are defined as in (1.8) and (1.7), with a replaced by \(a_\varepsilon \).

Lemma 4.3

Assume that the function \(a: (0, \infty ) \rightarrow (0, \infty )\) is continuously differentiable in \((0, \infty )\) and that \(i_a>-1\) and \(s_a<\infty \). Let \( \varepsilon >0\) and let \(a_ \varepsilon \) be the function defined by (4.24). Then

$$\begin{aligned} i_{a_\varepsilon } \ge \min \{i_a, 0\} \quad \hbox {and} \quad s_{a_\varepsilon } \le \max \{s_a, 0\}\,, \end{aligned}$$
(4.25)

where \(i_{a_\varepsilon }\) and \(s_{a_\varepsilon }\) are defined as in (2.1), with a replaced by \(a_\varepsilon \).

Let b, B, \(b_\varepsilon \) and \(B_\varepsilon \) be the functions defined above. Then there exist constants \(c_1, c_2, c_3\), depending only on \(s_a\), such that

$$\begin{aligned} c_1B(t) -c_2 B(\varepsilon ) \le a_\varepsilon (t) t^2 \le c_3(B(t) + B(\varepsilon )) \quad \text {for}\quad t \ge 0. \end{aligned}$$
(4.26)

Moreover, there exists a constant \(c=c(s_a)\) such that

$$\begin{aligned} B_\varepsilon (t) \le c(B(t) + B(\varepsilon )) \quad \text {for}\quad t \ge 0, \end{aligned}$$
(4.27)

and

$$\begin{aligned} \widetilde{B} (b_\varepsilon (t)) \le c(B(t) + B(\varepsilon )) \quad \text {for}\quad t \ge 0. \end{aligned}$$
(4.28)

Proof

Property (4.25) can be verified by straightforward computations. Consider equation (4.26). One has that

$$\begin{aligned}&a_\varepsilon (t) t^2 \le a(t+\varepsilon ) t^2 \le (s_a+2)B(t+ \varepsilon )\le (s_a+2) (B(2t) + B(2\varepsilon )) \nonumber \\&\quad \le c(B(t)+B(\varepsilon )) \quad \text {for}\quad t\ge 0, \end{aligned}$$
(4.29)

for some constant \(c=c(s_a)\), where the second inequality holds by (4.21) and the last one by (4.1). This proves the second inequality in (4.26). As for the first one, observe that

$$\begin{aligned} B(t) \le B(t+\varepsilon ) \le B(2t) + B(2\varepsilon ) \le c B(t) + c B(\varepsilon ) \quad \text {for}\quad t\ge 0, \end{aligned}$$
(4.30)

for some constant \(c=c(s_a)\), where we have made use of inequality (4.1) again. Now,

$$\begin{aligned} B(t)&= \int _0^t a(\tau )\tau \, d\tau \le \int _0^t a(\tau +\varepsilon )(\tau +\varepsilon ) \, d\tau \le \int _0^t a(2\sqrt{\tau ^2 +\varepsilon ^2})2\sqrt{\tau ^2 +\varepsilon ^2} \, d\tau \nonumber \\&\le c \int _0^t a(\sqrt{\tau ^2 +\varepsilon ^2})\sqrt{\tau ^2 +\varepsilon ^2} \, d\tau \le c\, t \,a(\sqrt{t^2 +\varepsilon ^2})\sqrt{t^2 +\varepsilon ^2}\nonumber \\&= c \,a_\varepsilon (t) t \sqrt{t^2 +\varepsilon ^2} \quad \text {for}\quad t\ge 0, \end{aligned}$$
(4.31)

for some constant \(c=c(s_a)\), where the third inequality is due to (4.18). On the other hand,

$$\begin{aligned} a_\varepsilon (t) t \sqrt{t^2 +\varepsilon ^2} \le \sqrt{2} a_\varepsilon (t) t^2 \qquad \text {if}\quad t\ge \varepsilon , \end{aligned}$$
(4.32)

and

$$\begin{aligned} a_\varepsilon (t) t \sqrt{t^2 +\varepsilon ^2} \le \sqrt{2} a_\varepsilon (\varepsilon ) \varepsilon ^2 = \sqrt{2} a (\sqrt{2} \varepsilon ) \varepsilon ^2 \le c B(\varepsilon )\qquad \text {if}\quad 0\le t\le \varepsilon , \end{aligned}$$
(4.33)

for some constant \(c=c(s_a)\), where the last inequality holds thanks to (4.21). Combining inequalities (4.31)–(4.33) yields

$$\begin{aligned} B(t) \le c a_\varepsilon (t) t^2 + c B(\varepsilon ) \qquad \text {for }\quad t\ge 0, \end{aligned}$$

for some constant \(c=c(s_a)\). Hence, the first inequality in (4.26) follows.

Inequality (4.27) holds because of the first inequality in (4.21), applied with B replaced by \(B_\varepsilon \), and of the second inequality in (4.26).

Inequality (4.28) is a consequence of the following chain:

$$\begin{aligned} \widetilde{B}(b_\varepsilon (t) )&= \widetilde{B}(a (\sqrt{t^2 +\varepsilon ^2}) t)\le \widetilde{B}(b (\sqrt{t^2 +\varepsilon ^2}))\nonumber \\ {}&\le \widetilde{B}(b (t+\varepsilon )) \le c B(t+\varepsilon ) \le c' (B(t) + B(\varepsilon )) \quad \text {for}\quad t\ge 0, \end{aligned}$$
(4.34)

for some constants c and \(c'\) depending on \(s_a\). Notice that we have made use of property (4.22) in the last but one inequality, and of property (4.1) in the last inequality. \(\square \)
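The computation behind property (4.25) can be sketched as follows. Set \(s = \sqrt{t^2 + \varepsilon ^2}\). By the chain rule, \(a_\varepsilon '(t) = a'(s)\frac{t}{s}\), whence

$$\begin{aligned} \frac{t a_\varepsilon '(t)}{a_\varepsilon (t)} = \frac{t^2}{t^2+\varepsilon ^2}\, \frac{s a'(s)}{a(s)} \quad \text{for}\quad t>0. \end{aligned}$$

Since \(\frac{t^2}{t^2+\varepsilon ^2} \in (0,1)\) and \(i_a \le \frac{s a'(s)}{a(s)} \le s_a\), the right-hand side lies between \(\min \{i_a, 0\}\) and \(\max \{s_a, 0\}\). Property (4.25) follows on taking the infimum and the supremum over \(t>0\).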

Lemma 4.4

Assume that the function \(a: (0, \infty ) \rightarrow (0, \infty )\) is continuously differentiable in \((0, \infty )\) and that \(i_a>-1\) and \(s_a<\infty \). Let \( \varepsilon >0\) and let \(a_ \varepsilon \) be the function defined by (4.24). Let \(M>0\). Then there exists a constant \(c=c(i_a, s_a, \varepsilon , M)\) such that

$$\begin{aligned} |P-Q| \le c |a_\varepsilon (|P|)P -a_\varepsilon (|Q|)Q| \end{aligned}$$
(4.35)

for every \(P, Q \in \mathbb R^{N\times n}\) such that \(|P|\le M\) and \(|Q|\le M\).

Proof

Our assumptions on a legitimate an application of [38, Lemma 21], whence we deduce that there exists a positive constant \(c=c(i_{a_\varepsilon }, s_{a_\varepsilon })\) such that

$$\begin{aligned} c \big [a_\varepsilon (|P|+|Q|) + a_\varepsilon '(|P|+|Q|) (|P|+|Q|)\big ] |P-Q|^2 \le (a_\varepsilon (|P|)P - a_\varepsilon (|Q|)Q) \cdot (P-Q) \end{aligned}$$
(4.36)

for every \(P, Q \in \mathbb R^{N\times n}\). Via inequalities (4.36) and (4.25) we deduce that

$$\begin{aligned} c (1+ \min \{i_a, 0\}) a_\varepsilon (|P|+|Q|) |P-Q| \le |a_\varepsilon (|P|)P - a_\varepsilon (|Q|)Q| \end{aligned}$$
(4.37)

for every \(P, Q \in \mathbb R^{N\times n}\). Inequality (4.35) hence follows, since

$$a_\varepsilon (|P|+|Q|) \ge \min \big \{a(t): \varepsilon \le t \le \sqrt{4M^2+ \varepsilon ^2}\big \}>0$$

if \(|P|\le M\) and \(|Q|\le M\), and \((1+ \min \{i_a, 0\})>0\). \(\square \)
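The passage from (4.36) to (4.37) can be sketched as follows. Set \(s = |P|+|Q|\) and assume that \(P \ne Q\). By the definition of \(i_{a_\varepsilon }\) and the first inequality in (4.25),

$$\begin{aligned} a_\varepsilon (s) + a_\varepsilon '(s) s \ge (1+ i_{a_\varepsilon })\, a_\varepsilon (s) \ge (1+ \min \{i_a, 0\})\, a_\varepsilon (s), \end{aligned}$$

whereas, by the Cauchy–Schwarz inequality,

$$\begin{aligned} (a_\varepsilon (|P|)P - a_\varepsilon (|Q|)Q) \cdot (P-Q) \le |a_\varepsilon (|P|)P - a_\varepsilon (|Q|)Q|\, |P-Q|. \end{aligned}$$

Inserting these two bounds into (4.36) and dividing through by \(|P-Q|\) yields (4.37).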

One more function associated with a function a as above and with a number \(\varepsilon >0\) will be needed in our proofs. The function in question is denoted by \(V_\varepsilon : {\mathbb R^{N\times n}}\rightarrow {\mathbb R^{N\times n}}\) and is defined as

$$\begin{aligned} V_\varepsilon (P)= \sqrt{a_\varepsilon (|P|)}P \qquad \text { for }\quad P \in {\mathbb R^{N\times n}}. \end{aligned}$$
(4.38)
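For instance, in the model case when \(a(t) = t^{p-2}\) for some \(p>1\), an exponent introduced here only as an example, one has that \(a_\varepsilon (t) = (t^2+\varepsilon ^2)^{\frac{p-2}{2}}\) and

$$\begin{aligned} V_\varepsilon (P) = \big (|P|^2 + \varepsilon ^2\big )^{\frac{p-2}{4}} P \qquad \text{for}\quad P \in {\mathbb R^{N\times n}}, \end{aligned}$$

which is the customary auxiliary function in the regularity theory of the regularized p-Laplace system. In general, \(|V_\varepsilon (P)|^2 = a_\varepsilon (|P|)|P|^2\).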

Lemma 4.5

Assume that the function \(a: (0, \infty ) \rightarrow (0, \infty )\) is continuously differentiable and such that \(i_a>-1\) and \(s_a<\infty \). Let \( \varepsilon >0\) and let \(a_ \varepsilon \) be the function defined by (4.24). Then

$$\begin{aligned} a_\varepsilon (|P|) P \rightarrow a(|P|) P \quad \text {as}\quad \varepsilon \rightarrow 0^+, \end{aligned}$$
(4.39)

uniformly for P in any compact subset of \({\mathbb R^{N\times n}}\).

Moreover,

$$\begin{aligned} (a_\varepsilon (|P|) P - a_\varepsilon (|Q|) Q)\cdot (P - Q) \approx \big |V_\varepsilon (P) -V_\varepsilon (Q)\big |^2 \quad \text {for}\quad P, Q \in {\mathbb R^{N\times n}}, \end{aligned}$$
(4.40)

where the relation \(\approx \) means that the two sides are bounded by each other, up to positive multiplicative constants depending only on \(i_a\) and \(s_a\).

Proof

Fix any \(0<\ell < L\) and assume that \(\varepsilon \in [0, 1]\). Since \(a\in C^1(0, \infty )\), if \(\ell \le |P| \le L\) then

$$\begin{aligned}&|a_\varepsilon (|P|)P - a(|P|)P| \le |P| |a_\varepsilon (|P|) - a(|P|)| \\ \nonumber&\qquad \le \max _{t\in [\ell , \sqrt{L^2+1}]}|a'(t)|(\sqrt{|P|^2+\varepsilon ^2}- |P|)\le \max _{t\in [\ell , \sqrt{L^2+1}]}|a'(t)| \varepsilon . \end{aligned}$$
(4.41)

Moreover, if \(|P|\le 1\), then, by the second inequality in (4.23) applied with a replaced by \(a_\varepsilon \) and by the first inequality in (4.25),

$$\begin{aligned} |a_\varepsilon (|P|)P| \le a_\varepsilon (1) |P|^{1+ \min \{i_a,0\}} \le \max _{t\in [1, \sqrt{2}]}|a(t)| |P|^{1+ \min \{i_a,0\}}. \end{aligned}$$
(4.42)

Now, let \(L>0\). Fix any \(\sigma >0\). By inequality (4.42), there exists \(\ell >0\) such that

$$\begin{aligned} |a_\varepsilon (|P|)P-a(|P|)P| \le |a_\varepsilon (|P|)P|+ |a(|P|)P| \le \sigma \end{aligned}$$
(4.43)

for every \(\varepsilon \in [0, 1]\), provided that \(|P|<\ell \). On the other hand, inequality (4.41) ensures that there exists \(\varepsilon _0\in (0,1)\) such that

$$\begin{aligned} |a_\varepsilon (|P|)P - a(|P|)P|<\sigma \end{aligned}$$
(4.44)

if \(\ell \le |P| \le L\). From inequalities (4.43) and (4.44) we deduce that, if \(0\le \varepsilon <\varepsilon _0\), then

$$\begin{aligned} |a_\varepsilon (|P|)P - a(|P|)P|<\sigma \qquad \text {if}\quad |P|\le L. \end{aligned}$$
(4.45)

This shows that the limit (4.39) holds uniformly for \(|P|\le L\).

As far as Eq. (4.40) is concerned, it follows from [37, Lemma 41] that, since we are assuming that \(i_{a_\varepsilon }>-1\) and \(s_{a_\varepsilon }<\infty \), the ratio of the two sides of this equation is bounded from below and from above by positive constants depending only on a lower bound for \(i_{a_\varepsilon }\) and an upper bound for \(s_{a_\varepsilon }\). Owing to inequalities (4.25), we have that \(i_{a_\varepsilon }\ge \min \{i_a, 0\}>-1\) and \(s_{a_\varepsilon }\le \max \{s_a, 0\}<\infty \) for every \(\varepsilon >0\). This implies that Eq. (4.40) actually holds up to equivalence constants depending only on \(i_a\) and \(s_a\). \(\square \)
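The last two inequalities in (4.41) rest on the mean value theorem and on an elementary bound, which we record here as a sketch. If \(\ell \le |P| \le L\) and \(\varepsilon \in [0,1]\), then both \(|P|\) and \(\sqrt{|P|^2+\varepsilon ^2}\) belong to the interval \([\ell , \sqrt{L^2+1}]\), and, for \(\varepsilon >0\),

$$\begin{aligned} \sqrt{|P|^2+\varepsilon ^2}- |P| = \frac{\varepsilon ^2}{\sqrt{|P|^2+\varepsilon ^2}+ |P|} \le \frac{\varepsilon ^2}{\varepsilon } = \varepsilon . \end{aligned}$$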

5 Second-order regularity: local solutions

The definition of generalized local solution to the system

$$\begin{aligned} - \textrm{\textbf{div}}( a(|\nabla {{\textbf {u}}}|) \nabla \textbf{u} ) = \textbf{f} \quad \textrm{in}\,\,\, \Omega \, \end{aligned}$$
(5.1)

that will be adopted is inspired by the results of [41], and involves the notion of approximate differentiability. Recall that a measurable function \({{\textbf {u}}}: \Omega \rightarrow {\mathbb R^N}\) is said to be approximately differentiable at \(x \in \Omega \) if there exists a matrix \(\textrm{ap} \nabla {{\textbf {u}}}(x) \in \mathbb R^{N\times n}\) such that, for every \(\varepsilon >0\),

$$\begin{aligned} \lim _{r \rightarrow 0^+} \frac{\big |\{y\in B_r(x): \frac{1}{r} |{{\textbf {u}}}(y)-{{\textbf {u}}}(x)- \textrm{ap} \nabla {{\textbf {u}}}(x) (y-x)|>\varepsilon \}\big |}{r^n} =0. \end{aligned}$$

If \({{\textbf {u}}}\) is approximately differentiable at every point in \(\Omega \), then the function \(\textrm{ap} \nabla {{\textbf {u}}}: \Omega \rightarrow \mathbb R^{N\times n}\) is measurable.

Assume that a is as in Theorem 2.4 and let \({\textbf {f}}\in L^q_\textrm{loc}(\Omega , {\mathbb R^N})\) for some \(q\ge 1\). An approximately differentiable function \({{\textbf {u}}}: \Omega \rightarrow {\mathbb R^N}\) is called a local approximable solution to system (5.1) if \(a(|\textrm{ap}\nabla {{\textbf {u}}}|) |\textrm{ap}\nabla {{\textbf {u}}}| \in L^{1}_\textrm{loc}(\Omega )\), and there exist a sequence \(\{{\textbf {f}}_k\}\subset C^\infty (\Omega , {\mathbb R^N})\), with \({\textbf {f}}_k \rightarrow {\textbf {f}}\) in \(L^q_\textrm{loc} (\Omega , {\mathbb R^N})\), and a corresponding sequence of local weak solutions \(\{{{\textbf {u}}}_k\}\) to the systems

$$\begin{aligned} - \textrm{\textbf{div}}( a(|\nabla {{\textbf {u}}}_k|) \nabla \textbf{u} _k) = \textbf{f}_k \quad \textrm{in}\,\,\, \Omega \,, \end{aligned}$$
(5.2)

such that

$$\begin{aligned} {{\textbf {u}}}_k \rightarrow {{\textbf {u}}}\quad \hbox {and} \quad \nabla {{\textbf {u}}}_k \rightarrow \textrm{ap} \nabla {{\textbf {u}}}\quad \hbox {a.e. in }\quad \Omega , \end{aligned}$$
(5.3)

and

$$\begin{aligned} \lim _{k \rightarrow \infty } \int _{\Omega '}a(|\nabla {{\textbf {u}}}_k|) |\nabla {{\textbf {u}}}_k| \, dx =\int _{\Omega '} a(|\textrm{ap}\nabla {{\textbf {u}}}|) |\textrm{ap}\nabla {{\textbf {u}}}|\, dx\, \end{aligned}$$
(5.4)

for every open set \(\Omega ' \subset \subset \Omega \). In what follows, we shall denote \(\textrm{ap}\nabla {{\textbf {u}}}\) simply by \(\nabla {\textbf {u}}\).

Weak solutions to system (5.1) are defined in a standard way if \({\textbf {f}}\in L^1_\textrm{loc}(\Omega , {\mathbb R^N}) \cap (W^{1,B}_0(\Omega , {\mathbb R^N}))'\), where B is the Young function defined via (1.7). Namely, a function \({{\textbf {u}}}\in W^{1,B}_\textrm{loc}(\Omega , \mathbb R^N)\) is called a local weak solution to this system if

$$\begin{aligned} \int _{\Omega '} a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\cdot \nabla {\varvec{\varphi }}\,dx = \int _{\Omega '} {{\textbf {f}}}\cdot {\varvec{\varphi }}\,dx \end{aligned}$$
(5.5)

for every open set \(\Omega ' \subset \subset \Omega \), and every function \({\varvec{\varphi }}\in W^{1,B}_0(\Omega ', \mathbb R^N)\).

Inequality (2.7) enters the proof of Theorem 2.4 through Lemma 5.1 below. The latter will be applied to solutions to systems which approximate system (2.11), and involve regularized differential operators and smooth right-hand sides. Lemma 5.1 can be deduced from Theorem 2.1 and inequality (2.10), along the same lines as in the proof of [32, Theorem 3.1, Inequality (3.4)]. The details are omitted, for brevity. We seize this opportunity to point out an incorrect dependence on the radius R of the constants in that inequality, due to a flaw in the scaling argument in the derivation of [32, Inequality (3.43)].

Lemma 5.1

Let \(n \ge 2\), \(N \ge 2\), and let \(\Omega \) be an open set in \({\mathbb R^n}\). Assume that the function \(a \in C^1([0, \infty ))\) satisfies conditions (2.4)–(2.6). Then there exists a constant \(C= C(n,N, i_a, s_a)\) such that

$$\begin{aligned}&R^{-1}{\big \Vert {a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}}\big \Vert }_{L^2(B_R, {\mathbb R^{N\times n}})} + \,{\big \Vert {\nabla \big (a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\big )}\big \Vert }_{L^2(B_R, \mathbb R^{N\times n\times n})}\nonumber \\&\quad \le C\Big (\Vert \textrm{\textbf{div}} ( a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}})\Vert _{L^2(B_{2R}, \mathbb R^{N})} + R^{-\frac{n}{2}-1}\Vert a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\Vert _{L^1(B_{2R}, \mathbb R^{N\times n})}\Big )\nonumber \\ \end{aligned}$$
(5.6)

for every function \({{\textbf {u}}}\in C^3(\Omega , {\mathbb R^N})\) and any ball \(B_{2R} \subset \subset \Omega \).
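Inequality (5.6) is invariant under dilations, and this provides a quick consistency check on the powers of R. The following is a sketch, in which the ball is assumed to be centered at the origin and the auxiliary function \({\textbf {v}}\) is introduced only for this purpose. Set \({\textbf {v}}(y) = R^{-1}{\textbf {u}}(Ry)\), so that \(\nabla {\textbf {v}}(y) = \nabla {\textbf {u}}(Ry)\), and write \(F_{{\textbf {w}}} = a(|\nabla {\textbf {w}}|)\nabla {\textbf {w}}\). Then \(F_{{\textbf {v}}}(y) = F_{{\textbf {u}}}(Ry)\), and a change of variables gives

$$\begin{aligned} \Vert F_{{\textbf {u}}}\Vert _{L^2(B_R)}&= R^{\frac{n}{2}}\Vert F_{{\textbf {v}}}\Vert _{L^2(B_1)},&\Vert \nabla F_{{\textbf {u}}}\Vert _{L^2(B_R)}&= R^{\frac{n}{2}-1}\Vert \nabla F_{{\textbf {v}}}\Vert _{L^2(B_1)}, \\ \Vert \textrm{\textbf{div}} F_{{\textbf {u}}}\Vert _{L^2(B_{2R})}&= R^{\frac{n}{2}-1}\Vert \textrm{\textbf{div}} F_{{\textbf {v}}}\Vert _{L^2(B_2)},&\Vert F_{{\textbf {u}}}\Vert _{L^1(B_{2R})}&= R^{n}\Vert F_{{\textbf {v}}}\Vert _{L^1(B_2)}. \end{aligned}$$

Hence, each of the four terms in (5.6) equals \(R^{\frac{n}{2}-1}\) times the corresponding term for \({\textbf {v}}\) with \(R=1\), and the inequality for \(R=1\) implies the inequality for every \(R>0\) with the same constant C.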

Proof of Theorem 2.4

Let us temporarily assume that

$$\begin{aligned} {\textbf {f}}\in C^\infty (\Omega , {\mathbb R^N})\,, \end{aligned}$$
(5.7)

and that \({\textbf {u}}\) is a local weak solution to system (5.1). Observe that, thanks to equations (2.12) and (4.25),

$$\begin{aligned} i_{a_\varepsilon } > 2(1-\sqrt{2})\,. \end{aligned}$$
(5.8)
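Inequality (5.8) rests on the sign of the threshold. The following short verification assumes that condition (2.12) is the lower bound \(i_a > 2(1-\sqrt{2})\), as the form of (5.8) suggests. Since \(2(1-\sqrt{2})<0\), the first inequality in (4.25) yields

$$\begin{aligned} i_{a_\varepsilon } \ge \min \{i_a, 0\} > 2(1-\sqrt{2}) \end{aligned}$$

for every \(\varepsilon >0\).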

Let \(B_{2R} \subset \subset \Omega \) and, given \(\varepsilon \in (0,1)\), let \({{\textbf {u}}}_\varepsilon \in {\textbf {u}}+ W^{1,B}_0(B_{2R}, {\mathbb R^N})\) be the weak solution to the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} - \textrm{\textbf{div}} (a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon ) = {\textbf {f}}&{} \textrm{in}\,\,\, B_{2R} \\ {{\textbf {u}}}_\varepsilon ={{\textbf {u}}}&{} \textrm{on}\,\,\, \partial B_{2R} \,. \end{array}\right. } \end{aligned}$$
(5.9)

We claim that

$$\begin{aligned} {{\textbf {u}}}_\varepsilon \in C^{\infty }(B_{2R}, {\mathbb R^N}). \end{aligned}$$
(5.10)

Actually, as a consequence of [39, Corollary 5.5], \(\nabla {{\textbf {u}}}_\varepsilon \in L^{\infty }_\textrm{loc}(B_{2R}, {\mathbb R^{N\times n}})\) and there exists a constant C, independent of \(\varepsilon \), such that

$$\begin{aligned} \Vert \nabla {{\textbf {u}}}_\varepsilon \Vert _{L^\infty (B_{R}, {\mathbb R^{N\times n}})}\le C. \end{aligned}$$
(5.11)

The same result also tells us that \(a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon \in C^{\alpha }_\textrm{loc}(B_{2R}, {\mathbb R^{N\times n}})\) for some \(\alpha \in (0,1)\). Therefore, by inequality (4.35), we have that \(\nabla {{\textbf {u}}}_\varepsilon \in C^{\alpha }_\textrm{loc}(B_{2R}, {\mathbb R^{N\times n}})\) as well. Hence, \(a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |) \in C^{1,\alpha }_\textrm{loc}(B_{2R})\), and by the Schauder theory for linear elliptic systems, \({{\textbf {u}}}_\varepsilon \in C^{2,\alpha }_\textrm{loc}(B_{2R}, {\mathbb R^N})\). An iteration argument relying upon the Schauder theory again yields property (5.10).

We claim that

$$\begin{aligned} \int _{B_{2R}} B(|\nabla {{\textbf {u}}}_\varepsilon |)\, dx \le C \bigg (\int _{B_{2R}}\widetilde{B}(|{\textbf {f}}|)\,dx + \int _{B_{2R}}B(|\nabla {{\textbf {u}}}|)\, dx + B(\varepsilon )\bigg )\, \end{aligned}$$
(5.12)

for some constant \(C=C(n, N, s_a, R)\) and for \(\varepsilon \in (0,1)\). Indeed, choosing \({{\textbf {u}}}_\varepsilon - {{\textbf {u}}}\in W^{1,B}_0(B_{2R}, {\mathbb R^N})\) as a test function in the weak formulation of problem (5.9) results in

$$\begin{aligned} \int _{B_{2R}} a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon \cdot (\nabla {{\textbf {u}}}_\varepsilon - \nabla {{\textbf {u}}})\, dx = \int _{B_{2R}} {\textbf {f}}\cdot ({{\textbf {u}}}_\varepsilon - {{\textbf {u}}})\, dx\,. \end{aligned}$$
(5.13)

The Poincaré inequality (4.4) implies that

$$\begin{aligned} \int _{B_{2R}}B(|{{\textbf {u}}}_\varepsilon - {{\textbf {u}}}|)\, dx \le C \int _{B_{2R}}B(|\nabla {{\textbf {u}}}_\varepsilon - \nabla {{\textbf {u}}}|)\, dx \end{aligned}$$
(5.14)

for some constant \(C=C(n, s_a, R)\).

Fix \(\delta \in (0,1)\). From equation (5.13), the first inequality in (4.26), and inequalities (5.14), (4.22) and (4.27) one obtains that

$$\begin{aligned}&c_1\int _{B_{2R}}B(|\nabla {{\textbf {u}}}_\varepsilon |)\, dx \nonumber \\&\quad \le \int _{B_{2R}} |{\textbf {f}}| |{{\textbf {u}}}_\varepsilon - {{\textbf {u}}}|\, dx + C \int _{B_{2R}}a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |) |\nabla {{\textbf {u}}}_\varepsilon ||\nabla {{\textbf {u}}}|\, dx + C R^n B(\varepsilon ) \nonumber \\&\quad \le C_1 \int _{B_{2R}} \widetilde{B}(|{\textbf {f}}|)\, dx + \delta \int _{B_{2R}} B(|{{\textbf {u}}}_\varepsilon - {{\textbf {u}}}|)\, dx \nonumber \\&\qquad + \delta \int _{B_{2R}}\widetilde{B}_\varepsilon (a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |) |\nabla {{\textbf {u}}}_\varepsilon |)\, dx + C_1 \int _{B_{2R}}B_\varepsilon (|\nabla {{\textbf {u}}}|)\, dx + C R^n B(\varepsilon ) \nonumber \\&\quad \le C_1 \int _{B_{2R}} \widetilde{B}(|{\textbf {f}}|)\, dx + \delta C_2 \int _{B_{2R}}B(|\nabla {{\textbf {u}}}_\varepsilon |)\, dx + C_3 \int _{B_{2R}}B(|\nabla {{\textbf {u}}}|)\, dx \nonumber \\&\qquad + \delta C_4 \int _{B_{2R}}B_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\, dx + C_1 \int _{B_{2R}}B_\varepsilon (|\nabla {{\textbf {u}}}|)\, dx + C R^n B(\varepsilon ) \nonumber \\&\quad \le C_1 \int _{B_{2R}} \widetilde{B}(|{\textbf {f}}|)\, dx + \delta C_5 \int _{B_{2R}}B(|\nabla {{\textbf {u}}}_\varepsilon |)\, dx + C_6 \int _{B_{2R}}B(|\nabla {{\textbf {u}}}|)\, dx + C R^n B(\varepsilon ) \end{aligned}$$
(5.15)

for suitable constants \(C_2\), \(C_4\) and \(C_5\) depending on \(n, N, s_a, R\), and constants \(C_1\), \(C_3\) and \(C_6\) depending also on \(\delta \). Inequality (5.12) follows from (5.15), on choosing \(\delta \) small enough.
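The terms carrying the factor \(\delta \) in (5.15) arise from a weighted form of Young's inequality, which we sketch for the pair \(B\), \(\widetilde{B}\). Given \(\delta \in (0,1)\) and \(s, t \ge 0\), Young's inequality and the convexity of B, combined with \(B(0)=0\), give

$$\begin{aligned} st = (\delta s)\Big (\frac{t}{\delta }\Big ) \le B(\delta s) + \widetilde{B}\Big (\frac{t}{\delta }\Big ) \le \delta B(s) + \widetilde{B}\Big (\frac{t}{\delta }\Big ). \end{aligned}$$

Since \(\widetilde{B}\) satisfies the \(\Delta _2\)-condition when \(i_a>-1\), the last term does not exceed \(C\widetilde{B}(t)\) for some constant C depending on \(\delta \) and on the \(\Delta _2\)-constant of \(\widetilde{B}\). This is how the products \(|{\textbf {f}}| |{{\textbf {u}}}_\varepsilon - {{\textbf {u}}}|\) are split into a small multiple of \(B(|{{\textbf {u}}}_\varepsilon - {{\textbf {u}}}|)\) and a multiple of \(\widetilde{B}(|{\textbf {f}}|)\).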

Coupling inequality (5.12) with the Poincaré inequality (4.4) tells us that the family \(\{{\textbf {u}}_\varepsilon \}\) is bounded in \(W^{1,B}(B_{2R}, {\mathbb R^N})\). Since under assumptions (2.12) and (2.13) the latter space is reflexive, there exist a sequence \(\{\varepsilon _k\}\) and a function \({\textbf {v}}\in W^{1,B}(B_{2R}, {\mathbb R^N})\) such that \(\varepsilon _k \rightarrow 0^+\) and

$$\begin{aligned} {\textbf {u}}_{\varepsilon _k} \rightharpoonup {\textbf {v}}\qquad \text {in}\quad W^{1,B}(B_{2R}, {\mathbb R^N}). \end{aligned}$$
(5.16)

Choosing the test function \({\textbf {u}}_{\varepsilon _k} - {\textbf {u}}\) for system (2.11), and subtracting the resultant equation from (5.13) enables us to deduce that, given any \(\delta >0\),

$$\begin{aligned} \int _{B_{2R}}&\big (a_{\varepsilon _k}(|\nabla {{\textbf {u}}}_{\varepsilon _k}|)\nabla {{\textbf {u}}}_{\varepsilon _k} - a_{\varepsilon _k}(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\big )\cdot (\nabla {{\textbf {u}}}_{\varepsilon _k} - \nabla {{\textbf {u}}})\, dx \nonumber \\ {}&= \int _{B_{2R}}\big (a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}- a_{\varepsilon _k}(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\big ) \cdot (\nabla {{\textbf {u}}}_{\varepsilon _k} - \nabla {{\textbf {u}}})\, dx \nonumber \\&\le \delta \int _{B_{2R}} B(|\nabla {{\textbf {u}}}_{\varepsilon _k}|) + B(|\nabla {{\textbf {u}}}|) \, dx + C \int _{B_{2R}} \widetilde{B}\big (|a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}- a_{\varepsilon _k}(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}|\big ) dx \end{aligned}$$
(5.17)

for some constant \(C=C(\delta , s_a)\). Owing to equation (4.40), there exists a constant \(c=c(i_a, s_a)\) such that

$$\begin{aligned}&\int _{B_{2R}} |V_{\varepsilon _k} (\nabla {\textbf {u}}_{\varepsilon _k}) - V(\nabla {\textbf {u}})|^2\,dx \le 2\int _{B_{2R}} |V_{\varepsilon _k} (\nabla {\textbf {u}}_{\varepsilon _k}) - V_{\varepsilon _k}(\nabla {\textbf {u}})|^2\,dx \nonumber \\&\qquad +2\int _{B_{2R}} |V_{\varepsilon _k} (\nabla {\textbf {u}}) - V(\nabla {\textbf {u}})|^2\,dx \nonumber \\&\quad \le c \int _{B_{2R}}\big (a_{\varepsilon _k}(|\nabla {{\textbf {u}}}_{\varepsilon _k}|)\nabla {{\textbf {u}}}_{\varepsilon _k} - a_{\varepsilon _k}(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\big )\cdot (\nabla {{\textbf {u}}}_{\varepsilon _k} - \nabla {{\textbf {u}}})\, dx\nonumber \\&\qquad + 2\int _{B_{2R}} |V_{\varepsilon _k} (\nabla {\textbf {u}}) - V(\nabla {\textbf {u}})|^2\,dx. \end{aligned}$$
(5.18)
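
The first inequality in (5.18) only uses an elementary pointwise estimate, recorded here as a sketch. The symbols \(X\) and \(Y\) are shorthand introduced for this purpose, with \(X = V_{\varepsilon _k} (\nabla {\textbf {u}}_{\varepsilon _k}) - V_{\varepsilon _k}(\nabla {\textbf {u}})\) and \(Y = V_{\varepsilon _k} (\nabla {\textbf {u}}) - V(\nabla {\textbf {u}})\). Since \(2 X\cdot Y \le |X|^2 + |Y|^2\),

$$\begin{aligned} |X+Y|^2 = |X|^2 + 2 X\cdot Y + |Y|^2 \le 2|X|^2 + 2|Y|^2, \end{aligned}$$

and the inequality follows on integrating over \(B_{2R}\).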

Combining Eqs. (5.18), (5.17) and (5.12) yields

$$\begin{aligned} \int _{B_{2R}} |V_{\varepsilon _k} (\nabla {\textbf {u}}_{\varepsilon _k}) - V(\nabla {\textbf {u}})|^2\,dx&\le \delta c \bigg (\int _{B_{2R}}\widetilde{B}(|{\textbf {f}}|)\,dx + \int _{B_{2R}}B(|\nabla {{\textbf {u}}}|)\, dx + B(\varepsilon )\bigg ) \nonumber \\ {}&\quad +c \int _{B_{2R}} \widetilde{B}\big (|a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}- a_{\varepsilon _k}(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}|\big ) dx \nonumber \\ {}&\quad + 2\int _{B_{2R}} |V_{\varepsilon _k} (\nabla {\textbf {u}}) - V(\nabla {\textbf {u}})|^2\,dx \end{aligned}$$
(5.19)

for some constant \(c=c(n,N,R,i_a,s_a)\). Inequalities (4.22) and (4.28) entail that

$$\begin{aligned} \widetilde{B}\big (|a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}- a_{\varepsilon _k}(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}|\big ) \le c (B(|\nabla {\textbf {u}}|) + B({\varepsilon _k})) \quad \text {a.e. in }\quad B_{2R}, \end{aligned}$$
(5.20)

for some constant \(c=c(s_a)\). Furthermore, from inequality (4.26) one infers that

$$\begin{aligned} |V_{\varepsilon _k} (\nabla {\textbf {u}})|^2 \le c (B(|\nabla {\textbf {u}}|) + B({\varepsilon _k})) \quad \text {a.e. in }\quad B_{2R}, \end{aligned}$$
(5.21)

for some constant \(c=c(s_a)\). Thanks to inequalities (5.20) and (5.21), and to property (4.39), the last two integrals on the right-hand side of inequality (5.19) tend to 0 as \(k\rightarrow \infty \), via the dominated convergence theorem. Owing to the same theorem, equation (5.19) implies that

$$\begin{aligned} \lim _{k\rightarrow \infty }\int _{B_{2R}} |V_{\varepsilon _k} (\nabla {\textbf {u}}_{\varepsilon _k}) - V(\nabla {\textbf {u}})|^2\,dx \le \delta c \end{aligned}$$
(5.22)

for every \(\delta \in (0,1)\). Thereby,

$$\begin{aligned} V_{\varepsilon _k}(\nabla {\textbf {u}}_{\varepsilon _k}) \rightarrow V(\nabla {\textbf {u}}) \quad \text {in}\quad L^{2}(B_{2R}, {\mathbb R^{N\times n}}), \end{aligned}$$
(5.23)

and, on passing to a subsequence, still indexed by k,

$$\begin{aligned} V_{\varepsilon _k}(\nabla {\textbf {u}}_{\varepsilon _k}) \rightarrow V(\nabla {\textbf {u}}) \quad \text {a.e. in}\quad B_{2R}. \end{aligned}$$
(5.24)

An argument analogous to that of [40, Lemma 4.8] shows that the function \((\varepsilon , P) \mapsto V_{\varepsilon }^{-1}(P)\) is continuous. Thus, one can deduce from equation (5.24) that

$$\begin{aligned} \nabla {\textbf {u}}_{\varepsilon _k} \rightarrow \nabla {\textbf {u}}\quad \text {a.e. in}\quad B_{2R}. \end{aligned}$$
(5.25)

Hence, Eq. (5.16) implies that \({\textbf {v}}={\textbf {u}}\) and

$$\begin{aligned} {\textbf {u}}_{\varepsilon _k} \rightharpoonup {\textbf {u}}\qquad \text {in}\quad W^{1,B}(B_{2R}, {\mathbb R^N}). \end{aligned}$$
(5.26)

Inequalities (4.26) and (5.12), and the monotonicity of the function \(b_{\varepsilon _k}\), yield

$$\begin{aligned} \int _{B_{2R}} a_{\varepsilon _k}(|\nabla {\textbf {u}}_{\varepsilon _k}|) |\nabla {\textbf {u}}_{\varepsilon _k}|\, dx&\le \int _{\{|\nabla {\textbf {u}}_{\varepsilon _k}|\le 1\} \cap B_{2R}} a_{\varepsilon _k}(|\nabla {\textbf {u}}_{\varepsilon _k}|) |\nabla {\textbf {u}}_{\varepsilon _k}|\, dx \nonumber \\&\quad + \int _{B_{2R}} a_{\varepsilon _k}(|\nabla {\textbf {u}}_{\varepsilon _k}|) |\nabla {\textbf {u}}_{\varepsilon _k}|^2\, dx \nonumber \\ {}&\le cR^n b_{\varepsilon _k}(1) + c \int _{B_{2R}}B (|\nabla {\textbf {u}}_{\varepsilon _k}|) \,dx + c R^n B(\varepsilon _k) \le C \end{aligned}$$
(5.27)

for some constants c and C independent of k.
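
The splitting in the first inequality of (5.27) can be spelled out pointwise. The sketch below assumes, as the notation suggests, that \(b_{\varepsilon _k}(t) = a_{\varepsilon _k}(t)t\), and writes \(P\) for the value of \(\nabla {\textbf {u}}_{\varepsilon _k}\) at a given point (a shorthand introduced here). Since \(|P| \le |P|^2\) when \(|P| > 1\), and \(b_{\varepsilon _k}\) is monotone,

$$\begin{aligned} a_{\varepsilon _k}(|P|)|P|&= a_{\varepsilon _k}(|P|)|P|\, \chi _{\{|P|\le 1\}} + a_{\varepsilon _k}(|P|)|P|\, \chi _{\{|P|> 1\}} \nonumber \\&\le b_{\varepsilon _k}(|P|)\, \chi _{\{|P|\le 1\}} + a_{\varepsilon _k}(|P|)|P|^2 \le b_{\varepsilon _k}(1) + a_{\varepsilon _k}(|P|)|P|^2. \end{aligned}$$

Integrating over \(B_{2R}\) produces the term \(cR^n b_{\varepsilon _k}(1)\) from the first addend.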

Thanks to assumption (5.8) and to property (4.25), Lemma 5.1 can be applied with a replaced by \(a_{\varepsilon _k}\). The use of inequality (5.6) of this lemma for the function \({\textbf {u}}_{\varepsilon _k}\), and the equation in (5.9), ensure that

$$\begin{aligned}&R^{-1}{\big \Vert {a_{\varepsilon _k}(|\nabla {\textbf {u}}_{\varepsilon _k}|) \nabla {\textbf {u}}_{\varepsilon _k}}\big \Vert }_{L^2(B_R, {\mathbb R^{N\times n}})} + \,{\big \Vert {\nabla \big (a_{\varepsilon _k}(|\nabla {\textbf {u}}_{\varepsilon _k}|) \nabla {\textbf {u}}_{\varepsilon _k}\big )}\big \Vert }_{L^2(B_R, \mathbb R^{N\times n\times n})}\nonumber \\&\quad \le C\Big (\Vert {\textbf {f}}\Vert _{L^2(B_{2R}, \mathbb R^{N})} + R^{-\frac{n}{2}-1}\Vert a_{\varepsilon _k}(|\nabla {\textbf {u}}_{\varepsilon _k}|)\nabla {\textbf {u}}_{\varepsilon _k}\Vert _{L^1(B_{2R}, \mathbb R^{N\times n})}\Big ) \end{aligned}$$
(5.28)

for some constant \(C=C(n,N, i_a, s_a)\). Owing to inequalities (5.27) and (5.28), the sequence \(\{a_{\varepsilon _k}(|\nabla {{\textbf {u}}}_{\varepsilon _k}|)\nabla {{\textbf {u}}}_{\varepsilon _k}\}\) is bounded in \(W^{1,2}(B_R, {\mathbb R^{N\times n}})\). Thus, there exists a function \({\textbf {U}}\in W^{1,2}(B_R, {\mathbb R^{N\times n}})\), and a subsequence of \(\{\varepsilon _k\}\), still indexed by k, such that

$$\begin{aligned}&a_{\varepsilon _k}(|\nabla {{\textbf {u}}}_{\varepsilon _k}|) \nabla {{\textbf {u}}}_{\varepsilon _k} \rightarrow {\textbf {U}}\quad \hbox {in}\quad L^2(B_R, {\mathbb R^{N\times n}})\nonumber \\ {}&\quad \hbox {and} \quad a_{\varepsilon _k}(|\nabla {{\textbf {u}}}_{\varepsilon _k}|) \nabla {{\textbf {u}}}_{\varepsilon _k} \rightharpoonup {\textbf {U}}\quad \hbox {in}\quad W^{1,2}(B_R, {\mathbb R^{N\times n}}). \end{aligned}$$
(5.29)

Combining property (4.39) with Eqs. (5.11), (5.25) and (5.29) yields

$$\begin{aligned} a(|\nabla {\textbf {u}}|) \nabla {\textbf {u}}= {\textbf {U}}\in W^{1,2}(B_R, {\mathbb R^{N\times n}}). \end{aligned}$$
(5.30)

On passing to the limit as \(k \rightarrow \infty \), from equations (5.28), (5.29) and (5.30) we infer that

$$\begin{aligned}&R^{-1}{\big \Vert {a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}}\big \Vert }_{L^2(B_R, {\mathbb R^{N\times n}})} + \,{\big \Vert {\nabla \big (a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\big )}\big \Vert }_{L^2(B_R, \mathbb R^{N\times n\times n})}\nonumber \\&\quad \le C\Big (\Vert {\textbf {f}}\Vert _{L^2(B_{2R}, \mathbb R^{N})} + R^{-\frac{n}{2}-1}\Vert a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\Vert _{L^1(B_{2R}, \mathbb R^{N\times n})}\Big ). \end{aligned}$$
(5.31)

It remains to remove assumption (5.7). Suppose that \({\textbf {f}}\in L^2_\textrm{loc}(\Omega , {\mathbb R^N})\). Let \({{\textbf {u}}}\) be an approximable local solution to equation (2.11), and let \({\textbf {f}}_k\) and \({{\textbf {u}}}_k\) be as in the definition of this kind of solution. Applying inequality (5.31) to the function \({{\textbf {u}}}_k\) tells us that \(a(|\nabla {{\textbf {u}}}_k|) \nabla {{\textbf {u}}}_k \in W^{1,2}(B_R, {\mathbb R^{N\times n}})\), and

$$\begin{aligned}&R^{-1}{\big \Vert {a(|\nabla {{\textbf {u}}}_k|) \nabla {{\textbf {u}}}_k}\big \Vert }_{L^2(B_R, {\mathbb R^{N\times n}})} + \,{\big \Vert {\nabla \big (a(|\nabla {{\textbf {u}}}_k|) \nabla {{\textbf {u}}}_k\big )}\big \Vert }_{L^2(B_R, \mathbb R^{N\times n\times n})}\nonumber \\&\quad \le C\Big (\Vert {\textbf {f}}_k\Vert _{L^2(B_{2R}, \mathbb R^{N})} + R^{-\frac{n}{2}-1}\Vert a(|\nabla {{\textbf {u}}}_k|)\nabla {{\textbf {u}}}_k\Vert _{L^1(B_{2R}, \mathbb R^{N\times n})}\Big ) \end{aligned}$$
(5.32)

for some constant C independent of k. Hence, by equation (5.4), the sequence \(\{a(|\nabla {{\textbf {u}}}_k|)\nabla {{\textbf {u}}}_k\}\) is bounded in \(W^{1,2}(B_R, {\mathbb R^{N\times n}})\). Thereby, there exist a subsequence, still indexed by k, and a function \({\textbf {U}}\in W^{1,2}(B_R, {\mathbb R^{N\times n}})\), such that

$$\begin{aligned}&a(|\nabla {{\textbf {u}}}_k|)\nabla {{\textbf {u}}}_k \rightarrow {\textbf {U}}\quad \hbox {in}\quad L^2(B_R, {\mathbb R^{N\times n}})\nonumber \\&\quad \hbox {and} \quad a(|\nabla {{\textbf {u}}}_k|)\nabla {{\textbf {u}}}_k \rightharpoonup {\textbf {U}}\quad \hbox {in}\quad W^{1,2}(B_R, {\mathbb R^{N\times n}}). \end{aligned}$$
(5.33)

By assumption (5.3), we have that \(\nabla {{\textbf {u}}}_k \rightarrow \nabla {{\textbf {u}}}\) a.e. in \(\Omega \). Hence, thanks to properties (5.33),

$$\begin{aligned} a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}= {\textbf {U}}\in W^{1,2}(B_R, {\mathbb R^{N\times n}})\,. \end{aligned}$$
(5.34)

Inequality (2.15) follows on passing to the limit as \(k\rightarrow \infty \) in (5.32), via (5.4), (5.33) and (5.34). \(\square \)

6 Second-order regularity: Dirichlet problems

Generalized solutions, in the approximable sense, to the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} -\textrm{\textbf{div}} (a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}} ) = {{\textbf {f}}} &{} \textrm{in}\,\,\, \Omega \\ {{\textbf {u}}} =0 &{} \textrm{on}\,\,\, \partial \Omega \,, \end{array}\right. } \end{aligned}$$
(6.1)

are defined in analogy with the local solutions introduced in Sect. 5.

Assume that a is as in Theorems 2.6 and 2.7 and let \({\textbf {f}}\in L^q(\Omega , {\mathbb R^N})\) for some \(q \ge 1\). An approximately differentiable function \({{\textbf {u}}}: \Omega \rightarrow {\mathbb R^N}\) is called an approximable solution to the Dirichlet problem (6.1) if there exists a sequence \(\{{{\textbf {f}}}_k\} \subset C^{\infty }_0(\Omega , \mathbb R^N)\) such that \({{\textbf {f}}}_k \rightarrow {\textbf {f}}\) in \(L^q(\Omega , \mathbb R^N)\), and the sequence \(\{{{\textbf {u}}}_k\}\) of weak solutions to the Dirichlet problems

$$\begin{aligned} {\left\{ \begin{array}{ll} -\textrm{\textbf{div}} (a(|\nabla {{\textbf {u}}}_k|)\nabla {{\textbf {u}}} _k) = {{\textbf {f}}}_k &{} \textrm{in}\,\,\, \Omega \\ {{\textbf {u}}}_k =0 &{} \textrm{on}\,\,\, \partial \Omega \, \end{array}\right. } \end{aligned}$$
(6.2)

satisfies

$$\begin{aligned} {{\textbf {u}}}_k \rightarrow {{\textbf {u}}}\quad \hbox {and} \quad \nabla {{\textbf {u}}}_k \rightarrow \textrm{ap}\nabla {{\textbf {u}}}\quad \hbox {a.e. in}\quad \Omega . \end{aligned}$$
(6.3)

As above, in what follows \( \textrm{ap}\nabla {{\textbf {u}}}\) will simply be denoted by \(\nabla {\textbf {u}}\).

Recall that, under the assumption that \({\textbf {f}}\in L^1(\Omega , {\mathbb R^N}) \cap (W^{1,B}_0(\Omega , {\mathbb R^N}))'\), a function \({{\textbf {u}}}\in W^{1,B}_0(\Omega , \mathbb R^N)\) is called a weak solution to the Dirichlet problem (6.1) if

$$\begin{aligned} \int _\Omega a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\cdot \nabla {\varvec{\varphi }}\,dx = \int _{\Omega } {{\textbf {f}}}\cdot {\varvec{\varphi }}\,dx \end{aligned}$$
(6.4)

for every \({\varvec{\varphi }}\in W^{1,B}_0(\Omega , \mathbb R^N)\). A unique weak solution to problem (6.1) exists whenever \(|\Omega |<\infty \).

The notion of approximable solution to the Dirichlet problem (6.1) introduced above is closely related to those appearing in [9, 12, 34] in the case of equations of p-Laplacian type. The existence of approximable solutions to the Dirichlet problem (6.1), in the case of equations and with \({\textbf {f}}\in L^1(\Omega )\), was proved in [29]. Systems of p-Laplacian type were treated in [41, 42], whereas the existence of approximable solutions for systems with a more general growth as in (6.1) has very recently been established in [24]. In the latter paper, data \({\textbf {f}}\) in \(L^1(\Omega , \mathbb R^N)\), and even in the space of finite Radon measures, are considered. In the definition of approximable solution adopted in [24] the approximate gradient \(\textrm{ap}\nabla {{\textbf {u}}}\) is replaced by an alternate notion of generalized gradient, which involves truncations of vector-valued functions. The results of the present paper also apply to those solutions, provided that the gradient is interpreted accordingly. Besides other ingredients, the result of [24] relies upon the use of arguments from the proof of Proposition 6.2 below, which appeared in a preliminary version of the present paper.

Before accomplishing the proof of our global estimates, we recall the notions of capacity and of Marcinkiewicz spaces that enter conditions (2.22) and (2.23), respectively, in the statement of Theorem 2.7.

The capacity \(\textrm{cap}_{\Omega }(E)\) of a set \(E \subset {\Omega }\) relative to \({\Omega }\) is defined as

$$\begin{aligned} \textrm{cap}_{\Omega }(E) = \inf \bigg \{\int _{\Omega } |\nabla v|^2\, dx : v\in C^{0,1}_0({\Omega }), v\ge 1 \,\hbox {on }\, E\bigg \}. \end{aligned}$$
(6.5)

Here, \(C^{0,1}_0({\Omega })\) denotes the space of Lipschitz continuous, compactly supported functions in \({\Omega }\).

The Marcinkiewicz space \(L^{q, \infty } (\partial \Omega )\) is the Banach function space endowed with the norm defined as

$$\begin{aligned} \Vert \psi \Vert _{L^{q, \infty } (\partial \Omega )} = \sup _{s \in (0, \mathcal H^{n-1}(\partial \Omega ))} s ^{\frac{1}{q}} \psi ^{**}(s) \end{aligned}$$
(6.6)

for a measurable function \(\psi \) on \(\partial \Omega \). Here, \(\psi ^{**}(s)= \frac{1}{s}\smallint _0^s \psi ^* (r)\, dr\) for \(s >0\), where \(\psi ^*\) denotes the decreasing rearrangement of \(\psi \). The Marcinkiewicz space \(L^{1, \infty } \log L (\partial \Omega )\) is equipped with the norm given by

$$\begin{aligned} \Vert \psi \Vert _{L^{1, \infty } \log L (\partial \Omega )} = \sup _{s \in (0, \mathcal H^{n-1}(\partial \Omega ))} s \log \big (1+ \tfrac{C}{s}\big ) \psi ^{**}(s), \end{aligned}$$
(6.7)

for any constant \(C>\mathcal H^{n-1}(\partial \Omega )\). Different constants C result in equivalent norms in (6.7).
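
As an illustration of definition (6.6), consider a model function \(\psi \) whose decreasing rearrangement is \(\psi ^*(r) = r^{-\frac{1}{q}}\) for some \(q>1\). Both this function and the restriction \(q>1\) are assumptions made here for the sake of the example, and \(q' = \frac{q}{q-1}\). A direct computation gives

$$\begin{aligned} \psi ^{**}(s) = \frac{1}{s}\int _0^s r^{-\frac{1}{q}}\, dr = q'\, s^{-\frac{1}{q}}, \qquad \hbox {so that}\qquad \Vert \psi \Vert _{L^{q, \infty } (\partial \Omega )} = \sup _{s \in (0, \mathcal H^{n-1}(\partial \Omega ))} s ^{\frac{1}{q}}\, q'\, s^{-\frac{1}{q}} = q'. \end{aligned}$$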

Lemma 6.1 is related to Theorems 2.6 and 2.7 in the same way that Lemma 5.1 is related to Theorem 2.4. Lemma 6.1 follows from Theorem 2.1 and inequality (2.10), via the same proof as that of [31, Theorem 3.1, Part (ii)].

Lemma 6.1

Let \(n \ge 2\), \(N \ge 2\), and let \(\Omega \) be a bounded open set in \({\mathbb R^n}\) with \(\partial \Omega \in C^2\). Let \(\mathcal K_\Omega \) be the function defined by (2.20). Assume that a is a function as in Theorem 2.1, Part (i), which also fulfills conditions (2.12) and (2.13). There exists a constant \(c=c(n,N, i_a, s_a, L_\Omega , d_\Omega )\) such that, if

$$\begin{aligned} \mathcal K_\Omega (r) \le \mathcal K(r) \quad \hbox {for}\quad r \in (0,1), \end{aligned}$$
(6.8)

for some function \(\mathcal K : (0,1) \rightarrow [0, \infty )\) satisfying

$$\begin{aligned} \lim _{r \rightarrow 0^+} \mathcal K(r) <c\,, \end{aligned}$$
(6.9)

then

$$\begin{aligned} \Vert a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le C \big (\Vert \textrm{\textbf{div}} ( a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}})\Vert _{L^2(\Omega , {\mathbb R^N})}+ \Vert a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\Vert _{L^1(\Omega , {\mathbb R^{N\times n}})}\big ) \, \end{aligned}$$
(6.10)

for some constant \(C=C(n,N, i_a, s_a, L_\Omega , d_\Omega , \mathcal K)\), and for every function \({{\textbf {u}}}\in C^3(\Omega , {\mathbb R^N})\cap C^2(\overline{\Omega }, {\mathbb R^N})\) such that

$$\begin{aligned} {{\textbf {u}}}=0 \quad \hbox {on} \quad \partial \Omega . \end{aligned}$$
(6.11)

In particular, if \(\Omega \) is convex, then inequality (6.10) holds whatever \(\mathcal K_\Omega \) is, and the constant C in (6.10) only depends on \(n,N, i_a, s_a, L_\Omega , d_\Omega \).

The following gradient bound for solutions to the Dirichlet problem (6.1) is needed to deal with lower-order terms appearing in our global estimates.

Proposition 6.2

Assume that \(n\ge 2\), \(N \ge 2\). Let \(\Omega \) be an open set in \({\mathbb R^n}\) such that \(|\Omega |<\infty \). Assume that the function \(a: [0, \infty ) \rightarrow [0, \infty )\) is continuously differentiable in \((0, \infty )\) and fulfills conditions (2.12) and (2.13). Let \({\textbf {f}}\in L^1(\Omega , {\mathbb R^N})\cap (W^{1,B}_0(\Omega , {\mathbb R^N}))'\) and let \({{\textbf {u}}}\) be the weak solution to the Dirichlet problem (6.1). Then there exists a constant \(C=C(n, N, i_a, s_a, |\Omega |)\) such that

$$\begin{aligned} \Vert a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\Vert _{L^{1}(\Omega , {\mathbb R^{N\times n}})} \le C \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}. \end{aligned}$$
(6.12)

The same conclusion holds if \({\textbf {f}}\in L^1(\Omega , {\mathbb R^N})\) and \({{\textbf {u}}}\) is an approximable solution to the Dirichlet problem (6.1).

Proof

Assume that \({\textbf {f}}\in L^1(\Omega , {\mathbb R^N})\cap (W^{1,B}_0(\Omega , {\mathbb R^N}))'\) and that \({{\textbf {u}}}\) is the weak solution to the Dirichlet problem (6.1). Given \(t>0\), let \(T_t({\textbf {u}}): \Omega \rightarrow {\mathbb R^N}\) be the function defined by

$$\begin{aligned} T_t({\textbf {u}}) = {\left\{ \begin{array}{ll} {\textbf {u}}&{} \quad \text {in}\quad \{|{\textbf {u}}|\le t\} \\ \displaystyle t\frac{{\textbf {u}}}{|{\textbf {u}}|} &{} \quad \text {in}\quad \{|{\textbf {u}}|> t\}. \end{array}\right. } \end{aligned}$$
(6.13)

Then \(T_t({\textbf {u}}) \in W^{1,B}_0(\Omega , {\mathbb R^N})\), and

$$\begin{aligned} \nabla T_t({\textbf {u}}) = {\left\{ \begin{array}{ll} \nabla {\textbf {u}}&{} \quad \text {a.e. in} \quad \{|{\textbf {u}}|\le t\} \\ \displaystyle \frac{t}{|{\textbf {u}}|} \Big (I - \frac{{\textbf {u}}}{|{\textbf {u}}|} \otimes \frac{{\textbf {u}}}{|{\textbf {u}}|} \Big ) \nabla {\textbf {u}}&{} \quad \text {a.e. in}\quad \{|{\textbf {u}}|> t\}. \end{array}\right. } \end{aligned}$$
(6.14)

Observe that

$$a(|P|)P \cdot (I-\omega \otimes \omega )P\ge 0$$

for every matrix \(P \in {\mathbb R^{N\times n}}\) and any vector \(\omega \in {\mathbb R^N}\) such that \(|\omega |\le 1\). Thus, on making use of \(T_t({\textbf {u}})\) as a test function \({\varvec{\varphi }}\) in equation (6.4), one deduces that

$$\begin{aligned} \int _{\{|{\textbf {u}}|\le t\}} a(|\nabla {\textbf {u}}|) |\nabla {\textbf {u}}|^2\, dx&\le \int _{\Omega } a(|\nabla {\textbf {u}}|) \nabla {\textbf {u}}\cdot \nabla T_t({\textbf {u}}) \, dx = \int _\Omega {\textbf {f}}\cdot T_t({\textbf {u}}) \, dx \nonumber \\ {}&= \int _{\{|{\textbf {u}}|\le t\}} {\textbf {f}}\cdot {\textbf {u}}\, dx + \int _{\{|{\textbf {u}}|>t\}} {\textbf {f}}\cdot \displaystyle t\frac{{\textbf {u}}}{|{\textbf {u}}|}\, dx \le t \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}. \end{aligned}$$
(6.15)
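
The sign condition stated before (6.15), and the way it enters that chain, can be checked by a direct computation. The following is a sketch, in which \(|\cdot |\) is assumed to denote the Euclidean (Frobenius) norm of a matrix. For \(P \in {\mathbb R^{N\times n}}\) and \(\omega \in {\mathbb R^N}\) with \(|\omega |\le 1\),

$$\begin{aligned} P \cdot (I-\omega \otimes \omega )P = |P|^2 - |\omega ^T P|^2 \ge |P|^2 - |\omega |^2 |P|^2 = (1-|\omega |^2)|P|^2 \ge 0, \end{aligned}$$

since \((\omega \otimes \omega )P = \omega \,(\omega ^T P)\) and \(|\omega ^T P| \le |\omega ||P|\) by the Cauchy-Schwarz inequality. Multiplying by \(a(|P|)\ge 0\) and by \(t/|{\textbf {u}}|>0\), with \(P = \nabla {\textbf {u}}\) and \(\omega = {\textbf {u}}/|{\textbf {u}}|\), shows via (6.14) that \(a(|\nabla {\textbf {u}}|) \nabla {\textbf {u}}\cdot \nabla T_t({\textbf {u}}) \ge 0\) a.e. in \(\{|{\textbf {u}}|> t\}\), whence the first inequality in (6.15). The last inequality there follows from the bound \(|T_t({\textbf {u}})| \le t\) in \(\Omega \).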

Hence, by the first inequality in (4.21),

$$\begin{aligned} \int _{\{|{\textbf {u}}|\le t\}} B(|\nabla {\textbf {u}}|) \, dx\le t \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}. \end{aligned}$$
(6.16)

On the other hand, the chain rule for vector-valued functions ensures that the function \(|{\textbf {u}}| \in W^{1,B}_0(\Omega )\), and \(|\nabla {\textbf {u}}| \ge |\nabla |{\textbf {u}}||\) a.e. in \(\Omega \). Inequality (6.16) thus implies that

$$\begin{aligned} \int _{\{|{\textbf {u}}|< t\}} B(|\nabla |{\textbf {u}}||) \, dx\le t \Vert {\textbf {f}}\Vert _{L^1(\Omega , \mathbb R^N)} \quad \hbox {for}\quad t>0. \end{aligned}$$
(6.17)
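
The pointwise inequality \(|\nabla |{\textbf {u}}|| \le |\nabla {\textbf {u}}|\) used to pass from (6.16) to (6.17) can be sketched as follows, writing \(\partial _j\) for the partial derivative with respect to \(x_j\) (notation introduced here). In the set \(\{{\textbf {u}}\ne 0\}\), the chain rule gives \(\partial _j |{\textbf {u}}| = \frac{{\textbf {u}}}{|{\textbf {u}}|}\cdot \partial _j {\textbf {u}}\), and hence, by the Cauchy-Schwarz inequality,

$$\begin{aligned} |\nabla |{\textbf {u}}||^2 = \sum _{j=1}^n \Big (\frac{{\textbf {u}}}{|{\textbf {u}}|}\cdot \partial _j {\textbf {u}}\Big )^2 \le \sum _{j=1}^n |\partial _j {\textbf {u}}|^2 = |\nabla {\textbf {u}}|^2, \end{aligned}$$

whereas \(\nabla |{\textbf {u}}| = 0\) a.e. in \(\{{\textbf {u}}= 0\}\). Since \(B\) is increasing and \(\{|{\textbf {u}}|< t\} \subset \{|{\textbf {u}}|\le t\}\), inequality (6.17) follows from (6.16).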

The standard chain rule for Sobolev functions entails that \({T_t}(|{\textbf {u}}|) \in W^{1,B}(\Omega )\). Let \(\sigma > \max \{s_a +2, n\}\). Hence, \(\sigma > \max \{s_B, n\}\), inasmuch as \(s_B\le s_b+1 = s_a+2\). Owing to Lemma 4.2, the assumptions of Theorem 4.1 are fulfilled, with A replaced by B and with this choice of \(\sigma \). An application of the Orlicz-Sobolev inequality (4.9) to the function \({T_t}(|{\textbf {u}}|)\) tells us that

$$\begin{aligned} \int _\Omega B_\sigma \Bigg (\frac{|{T_t}(|{\textbf {u}}|)|}{C \big (\int _\Omega B(|\nabla {T_t}(|{\textbf {u}}|)|)dy\big )^{1/\sigma }}\Bigg )\, dx \le \int _\Omega B(|\nabla ({T_t}(|{\textbf {u}}|))|)dx, \end{aligned}$$
(6.18)

where \(C=c |\Omega |^{\frac{1}{n} - \frac{1}{\sigma }}\). Here, \(B_\sigma \) denotes the function defined as in (4.7)–(4.8), with A replaced by B. One has that

$$\begin{aligned} \int _\Omega B(|\nabla {T_t}(|{\textbf {u}}|)|)dx&= \int _{\{|{\textbf {u}}|< t\}} B(|\nabla |{\textbf {u}}||)dx \quad \hbox {for}\quad t>0, \end{aligned}$$
(6.19)
$$\begin{aligned} |{T_t}(|{\textbf {u}}|)|&= t \quad \hbox {in}\quad \{|{\textbf {u}}|\ge t\}, \end{aligned}$$
(6.20)

and

$$\begin{aligned} \{|{T_t}(|{\textbf {u}}|)|\ge t\} = \{|{\textbf {u}}|\ge t\} \quad \hbox {for}\quad t>0. \end{aligned}$$
(6.21)

Thus,

$$\begin{aligned}&|\{|{\textbf {u}}|\ge t\}| B_\sigma \bigg (\frac{ t}{C (\int _{\{|{\textbf {u}}|< t\}} B(|\nabla |{\textbf {u}}||)dy)^{\frac{1}{\sigma }} }\bigg )\nonumber \\&\quad \le \int _{\{|{\textbf {u}}|\ge t\}} B_\sigma \Bigg (\frac{|{T_t}(|{\textbf {u}}|)|}{C \big (\int _{\{|{\textbf {u}}|< t\}} B(|\nabla |{\textbf {u}}||)dy\big )^{1/\sigma }}\Bigg )\, dx \nonumber \\&\quad \le \int _{\{|{\textbf {u}}|< t\}} B(|\nabla |{\textbf {u}}||)dx \end{aligned}$$
(6.22)

for \(t>0\). Hence, by inequality (6.17),

$$\begin{aligned} |\{|{\textbf {u}}|\ge t\}| B_\sigma \bigg (\frac{ t}{C ( t \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})})^{\frac{1}{\sigma }}}\bigg ) \le t \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})} \qquad \hbox {for }\quad t > 0. \end{aligned}$$
(6.23)

From inequality (6.16) we deduce that

$$\begin{aligned} |\{B(|\nabla {\textbf {u}}|)>s, |{\textbf {u}}|\le t\}|\le & {} \frac{1}{s} \int _{\{B(|\nabla {\textbf {u}}|)>s, |{\textbf {u}}|\le t\}} B(|\nabla {\textbf {u}}|)\, dx \nonumber \\\le & {} \frac{t \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}}{s} \quad \hbox {for}\quad t>0\quad \hbox {and }s > 0. \end{aligned}$$
(6.24)

Coupling inequalities (6.24) and (6.23) yields

$$\begin{aligned} |\{B(|\nabla {\textbf {u}}|)>s\}|&\le |\{|{\textbf {u}}|>t\}| + |\{B(|\nabla {\textbf {u}}|)>s,|{\textbf {u}}|\le t\}| \nonumber \\ {}&\le \frac{t \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}}{B_\sigma (Ct^{\frac{1}{\sigma '}}/\Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}^{\frac{1}{\sigma }})} + \frac{ t \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}}{s}\nonumber \\&\qquad \hbox {for}\quad t>0\quad \hbox {and}\quad s>0. \end{aligned}$$
(6.25)

The choice \(t = \big (\tfrac{1}{C} \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}^{1/\sigma } B_\sigma ^{-1}(s)\big )^{\sigma '}\) in inequality (6.25) results in

$$\begin{aligned} |\{B(|\nabla {\textbf {u}}|)>s\}| \le \ \frac{2 \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}^{\sigma '}}{C^{\sigma '} }\frac{B_\sigma ^{-1}(s)^{\sigma '}}{s} \quad \hbox {for}\quad s>0. \end{aligned}$$
(6.26)
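
The effect of this choice of \(t\) can be verified by a short computation, sketched here under the assumption that \(\sigma ' = \frac{\sigma }{\sigma -1}\) is the Hölder conjugate of \(\sigma \), and with the shorthand \(F = \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}\) introduced for brevity. The argument of \(B_\sigma \) in (6.25) is read as \(C t^{\frac{1}{\sigma '}}/F^{\frac{1}{\sigma }}\), which is the form resulting from the simplification \(t/(tF)^{\frac{1}{\sigma }} = t^{\frac{1}{\sigma '}}/F^{\frac{1}{\sigma }}\) in (6.23), with the constant \(C\) adjusted accordingly. For the chosen \(t\),

$$\begin{aligned} \frac{C t^{\frac{1}{\sigma '}}}{F^{\frac{1}{\sigma }}} = B_\sigma ^{-1}(s), \qquad t F = C^{-\sigma '} F^{\frac{\sigma '}{\sigma }+1} B_\sigma ^{-1}(s)^{\sigma '} = C^{-\sigma '} F^{\sigma '} B_\sigma ^{-1}(s)^{\sigma '}, \end{aligned}$$

because \(\frac{\sigma '}{\sigma }+1 = \frac{1}{\sigma -1} + 1 = \sigma '\). Both addends on the right-hand side of (6.25) thus equal \(tF/s\), and their sum is the right-hand side of (6.26).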

Next, set \(s= B(b^{-1}(\tau ))\) in (6.26) and make use of (4.8) to obtain that

$$\begin{aligned} |\{b(|\nabla {\textbf {u}}|)>\tau \}| \le \frac{2 \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}^{\sigma '}}{C^{\sigma '} } \frac{H_\sigma (b^{-1}(\tau ))^{\sigma '}}{B(b^{-1}(\tau ))} \qquad \hbox {for}\quad \tau >0, \end{aligned}$$
(6.27)

where \(H_\sigma \) is defined as in (4.7), with A replaced by B. Thanks to inequality (6.27),

$$\begin{aligned} \int _\Omega b(|\nabla {\textbf {u}}|) \, dx&= \int _0^\infty |\{b(|\nabla {\textbf {u}}|)>\tau \}|\, d\tau \nonumber \\&\le \lambda |\Omega | + 2 C^{-\sigma '} \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}^{\sigma '}\int _\lambda ^\infty \frac{H_\sigma (b^{-1}(\tau ))^{\sigma '}}{B(b^{-1}(\tau ))}\, d\tau \end{aligned}$$
(6.28)

for \(\lambda >0\). Owing to inequalities (4.20) and (4.21), and to Fubini’s theorem, the following chain holds:

$$\begin{aligned} \int _\lambda ^\infty&\frac{H_\sigma (b^{-1}(\tau ))^{\sigma '}}{B(b^{-1}(\tau ))}\, d\tau \le \int _{b^{-1}(\lambda )} ^\infty \frac{H_\sigma (s)^{\sigma '}}{sB(s)} b(s) ds \nonumber \\&= \int _{b^{-1}(\lambda )} ^\infty \frac{b(s)}{sB(s)} \int _0^s \bigg (\frac{t}{B(t)}\bigg )^{\frac{1}{\sigma -1}}\,dt \,ds \nonumber \\ {}&\le (s_a+2) \int _{b^{-1}(\lambda )} ^\infty \frac{1}{s^2} \int _0^s \bigg (\frac{t}{B(t)}\bigg )^{\frac{1}{\sigma -1}}\,dt \,ds \nonumber \\ {}&= (s_a\!+\!2)\bigg (\int _0^{b^{-1}(\lambda )} \bigg (\frac{t}{B(t)}\bigg )^{\frac{1}{\sigma -1}}\!\! \int _{b^{-1}(\lambda )} ^\infty \frac{ds}{s^2} \, dt \!+\! \int _{b^{-1}(\lambda )} ^\infty \bigg (\frac{t}{B(t)}\bigg )^{\frac{1}{\sigma -1}} \!\!\int _{t} ^\infty \frac{ds}{s^2} \, dt\!\bigg ) \nonumber \\ {}&= (s_a+2)\bigg (\frac{1}{b^{-1}(\lambda )} \int _0^{b^{-1}(\lambda )} \bigg (\frac{t}{B(t)}\bigg )^{\frac{1}{\sigma -1}} \, dt + \int _{b^{-1}(\lambda )} ^\infty \bigg (\frac{t}{B(t)}\bigg )^{\frac{1}{\sigma -1}} \frac{dt}{t}\bigg ) \nonumber \\ {}&\le (s_a+2)^{\sigma '}\bigg (\frac{1}{b^{-1}(\lambda )} \int _0^{b^{-1}(\lambda )} \bigg (\frac{1}{b(t)}\bigg )^{\frac{1}{\sigma -1}} \, dt + \int _{b^{-1}(\lambda )} ^\infty \bigg (\frac{1}{b(t)}\bigg )^{\frac{1}{\sigma -1}} \frac{dt}{t}\bigg ) \end{aligned}$$
(6.29)

for \(\lambda >0\). The function \(\frac{t^{s_a+1+\varepsilon }}{b(t)}\) is increasing for every \(\varepsilon >0\). Hence, if \(0<\varepsilon < \sigma - s_a -2\), then

$$\begin{aligned} \frac{1}{b^{-1}(\lambda )}&\int _0^{b^{-1}(\lambda )} \bigg (\frac{1}{b(t)}\bigg )^{\frac{1}{\sigma -1}} \, dt = \frac{1}{b^{-1}(\lambda )} \int _0^{b^{-1}(\lambda )} \bigg (\frac{t^{s_a+1+\varepsilon }}{b(t)}\bigg )^{\frac{1}{\sigma -1}} t^{- \frac{s_a+1+\varepsilon }{\sigma -1}}\, dt \nonumber \\ {}&\le \frac{1}{b^{-1}(\lambda )} \bigg (\frac{b^{-1}(\lambda )^{s_a+1+\varepsilon }}{\lambda }\bigg )^{\frac{1}{\sigma -1}} \int _0^{b^{-1}(\lambda )} t^{- \frac{s_a+1+\varepsilon }{\sigma -1}}\, dt\nonumber \\&= \frac{\sigma -1}{\sigma - s_a-2-\varepsilon } \lambda ^{-\frac{1}{\sigma -1}} \quad \text {for}\quad \lambda > 0. \end{aligned}$$
(6.30)
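
The last equality in (6.30) is a matter of exponent bookkeeping, sketched here with the shorthands \(\rho = b^{-1}(\lambda )\) and \(\beta = \frac{s_a+1+\varepsilon }{\sigma -1}\) (both introduced for this purpose). The restriction \(\varepsilon < \sigma - s_a -2\) amounts to \(\beta <1\), which makes the power \(t^{-\beta }\) integrable near 0. Hence,

$$\begin{aligned} \int _0^{\rho } t^{-\beta }\, dt = \frac{\rho ^{1-\beta }}{1-\beta } = \frac{\sigma -1}{\sigma - s_a-2-\varepsilon }\, \rho ^{1-\beta }, \qquad \frac{1}{\rho }\, \Big (\frac{\rho ^{s_a+1+\varepsilon }}{\lambda }\Big )^{\frac{1}{\sigma -1}} \rho ^{1-\beta } = \rho ^{-1+\beta +1-\beta }\, \lambda ^{-\frac{1}{\sigma -1}} = \lambda ^{-\frac{1}{\sigma -1}}. \end{aligned}$$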

On the other hand, if \(0<\varepsilon < i_a +1\), then the function \(\frac{t^{\varepsilon }}{b(t)}\) is decreasing. Hence,

$$\begin{aligned} \int _{b^{-1}(\lambda )} ^\infty \bigg (\frac{1}{b(t)}\bigg )^{\frac{1}{\sigma -1}} \frac{dt}{t}&= \int _{b^{-1}(\lambda )} ^\infty \bigg (\frac{t^\varepsilon }{b(t)}\bigg )^{\frac{1}{\sigma -1}} t^{-\frac{\varepsilon }{\sigma -1}-1}\, dt\nonumber \\&\le \bigg (\frac{b^{-1}(\lambda )^\varepsilon }{\lambda }\bigg )^{\frac{1}{\sigma -1}} \int _{b^{-1}(\lambda )} ^\infty t^{-\frac{\varepsilon }{\sigma -1}-1}\, dt \nonumber \\&= \frac{\sigma -1}{\varepsilon }\lambda ^{-\frac{1}{\sigma -1}} \quad \text {for}\quad \lambda > 0. \end{aligned}$$
(6.31)
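
The last equality in (6.31) can be checked in the same spirit, with the shorthands \(\rho = b^{-1}(\lambda )\) and \(\gamma = \frac{\varepsilon }{\sigma -1}>0\) (introduced here). The positivity of \(\gamma \) ensures the convergence of the integral at infinity, and

$$\begin{aligned} \int _{\rho }^\infty t^{-\gamma -1}\, dt = \frac{\rho ^{-\gamma }}{\gamma } = \frac{\sigma -1}{\varepsilon }\, \rho ^{-\frac{\varepsilon }{\sigma -1}}, \qquad \Big (\frac{\rho ^{\varepsilon }}{\lambda }\Big )^{\frac{1}{\sigma -1}} \rho ^{-\frac{\varepsilon }{\sigma -1}} = \lambda ^{-\frac{1}{\sigma -1}}. \end{aligned}$$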

Inequalities (6.29)–(6.31) entail that there exists a constant \(c=c(\sigma , i_a, s_a)\) such that

$$\begin{aligned} \int _\lambda ^\infty \frac{H_\sigma (b^{-1}(\tau ))^{\sigma '}}{B(b^{-1}(\tau ))}\, d\tau \le c \lambda ^{-\frac{1}{\sigma -1}} \quad \text {for}\quad \lambda >0. \end{aligned}$$
(6.32)

Inequality (6.12) follows from (6.28) and (6.32), with the choice \(\lambda = \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}\).
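
The final step can be made explicit as follows. This is a sketch, which assumes that \(b(t) = a(t)t\) and that \(\sigma ' = \frac{\sigma }{\sigma -1}\), and uses the shorthand \(F = \Vert {\textbf {f}}\Vert _{L^1(\Omega , {\mathbb R^N})}\), taken to be positive. The integral over \((0, \infty )\) in (6.28) is split at \(\lambda \). The part over \((0, \lambda )\) does not exceed \(\lambda |\Omega |\), since the integrand is bounded by \(|\Omega |\), and the part over \((\lambda , \infty )\) is estimated via (6.27) and (6.32). Choosing \(\lambda = F\) yields

$$\begin{aligned} \int _\Omega a(|\nabla {\textbf {u}}|) |\nabla {\textbf {u}}| \, dx = \int _\Omega b(|\nabla {\textbf {u}}|) \, dx \le F |\Omega | + 2 c\, C^{-\sigma '} F^{\sigma '} F^{-\frac{1}{\sigma -1}} = \big (|\Omega | + 2 c\, C^{-\sigma '}\big ) F, \end{aligned}$$

because \(\sigma ' - \frac{1}{\sigma -1} = 1\). This is inequality (6.12).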

The assertion about the case when \({\textbf {f}}\in L^1(\Omega , {\mathbb R^N})\) and \({{\textbf {u}}}\) is an approximable solution to the Dirichlet problem (6.1) follows on applying inequality (6.12) with \({\textbf {f}}\) and \({{\textbf {u}}}\) replaced by the functions \({\textbf {f}}_k\) and \({{\textbf {u}}}_k\) appearing in the definition of approximable solutions, and passing to the limit as \(k \rightarrow \infty \) in the resultant inequality. Fatou’s lemma plays a role here. \(\square \)

A last preliminary result, proved in [32, Lemma 5.2], is needed in an approximation argument for the domain \(\Omega \) in our proof of Theorem 2.7.

Lemma 6.3

Let \(\Omega \) be a bounded Lipschitz domain in \({\mathbb R^n}\), \(n \ge 2\) such that \(\partial \Omega \in W^{2,1}\). Assume that the function \(\mathcal K_\Omega (r)\), defined as in (2.20), is finite-valued for \(r\in (0,1)\). Then there exist positive constants \(r_0\) and C, depending on n, \(d_\Omega \) and \(L_\Omega \), and a sequence of bounded open sets \(\{\Omega _m\}\), such that \(\partial \Omega _m \in C^\infty \), \(\Omega \subset \Omega _m\), \(\lim _{m \rightarrow \infty }|\Omega _m \setminus \Omega | = 0\), the Hausdorff distance between \(\Omega _m\) and \(\Omega \) tends to 0 as \(m \rightarrow \infty \),

$$\begin{aligned} L_{\Omega _m} \le C, \quad d_{\Omega _m} \le C \end{aligned}$$
(6.33)

and

$$\begin{aligned} \mathcal K_{\Omega _m}(r) \le C \mathcal K_{\Omega } (r) \end{aligned}$$
(6.34)

for \(r\in (0, r_0)\) and \(m \in \mathbb N\).

Proof of Theorem 2.7

It suffices to prove Part (i). Part (ii) will then follow, since, by [32, Lemmas 3.5 and 3.7],

$$\begin{aligned} \mathcal K_{\Omega } (r) \le C \sup _{x\in \partial {\Omega }}\Vert \mathcal B\Vert _{X(\partial {\Omega } \cap B_r(x))} \qquad \hbox {for}\quad r \in (0, r_0), \end{aligned}$$
(6.35)

for suitable constants \(r_0\) and C depending on n, \(L_{\Omega }\) and \(d_{\Omega }\).

We split the proof into three separate steps, where approximation arguments for the differential operator, the domain and the datum on the right-hand side of the system, respectively, are provided.

Step 1. Assume that the additional conditions

$$\begin{aligned} {\textbf {f}}\in C^\infty _0(\Omega , {\mathbb R^N})\,, \end{aligned}$$
(6.36)

and

$$\begin{aligned} \partial \Omega \in C^\infty \, \end{aligned}$$
(6.37)

are in force. Given \(\varepsilon \in (0,1)\), we denote by \({{\textbf {u}}}_\varepsilon \) the weak solution to the system

$$\begin{aligned} {\left\{ \begin{array}{ll} - \textrm{\textbf{div}}(a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |) \nabla {{\textbf {u}}}_\varepsilon ) = {{\textbf {f}}} &{} \textrm{in}\,\,\, \Omega \\ {{\textbf {u}}_\varepsilon } =0 &{} \textrm{on}\,\,\, \partial \Omega \,, \end{array}\right. } \end{aligned}$$
(6.38)

where \(a_\varepsilon \) is the function defined by (4.24). An application of [30, Theorem 2.1] tells us that

$$\begin{aligned} \Vert \nabla {{\textbf {u}}}_\varepsilon \Vert _{L^\infty (\Omega , {\mathbb R^{N\times n}})} \le C \end{aligned}$$
(6.39)

for some constant C independent of \(\varepsilon \). Let us notice that the statement of [30, Theorem 2.1] yields inequality (6.39) under the assumption that the function \(a_\varepsilon \) be either increasing or decreasing; such an additional assumption can however be dropped via a slight variant in the proof. Inequality (6.39) implies that, for each \(\varepsilon \in (0,1)\),

$$\begin{aligned} c_1 \le a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |) \le c_2 \qquad \hbox {in}\quad \Omega \end{aligned}$$
(6.40)

for suitable positive constants \(c_1\) and \(c_2\).

A classical result by Elcrat and Meyers [10, Theorem 8.2] enables us to deduce, via properties (6.36), (6.37) and (6.40), that \(\textbf{u}_\varepsilon \in W^{2,2}(\Omega , {\mathbb R^N})\). Consequently, \(\textbf{u}_\varepsilon \in W^{1,2}_0(\Omega , {\mathbb R^N}) \cap W^{1,\infty }(\Omega , {\mathbb R^N}) \cap W^{2,2}(\Omega , {\mathbb R^N})\). One can then find a sequence \(\{\textbf{u}_{\varepsilon ,k}\}_{k\in \mathbb N} \subset C^\infty (\Omega , {\mathbb R^N})\cap C^2(\overline{\Omega }, {\mathbb R^N})\) such that \(\textbf{u}_{\varepsilon ,k} = 0\) on \(\partial \Omega \) for \(k \in \mathbb N\), and

$$\begin{aligned}&\textbf{u}_{\varepsilon ,k} \rightarrow {{\textbf {u}}}_\varepsilon \quad \hbox {in}\quad W^{1,2}_0(\Omega , {\mathbb R^N}), \quad \textbf{u}_{\varepsilon ,k} \rightarrow {{\textbf {u}}}_\varepsilon \quad \hbox {in}\quad W^{2,2}(\Omega , {\mathbb R^N}),\nonumber \\&\quad \nabla \textbf{u}_{\varepsilon ,k} \rightarrow \nabla {{\textbf {u}}}_\varepsilon \quad \hbox {a.e. in}\quad \Omega , \end{aligned}$$
(6.41)

as \(k \rightarrow \infty \) — see e.g. [17, Chapter 2, Corollary 3]. One also has that

$$\begin{aligned} \Vert \nabla \textbf{u}_{\varepsilon ,k}\Vert _{L^\infty (\Omega , {\mathbb R^{N\times n}})} \le C \Vert \nabla {{\textbf {u}}}_\varepsilon \Vert _{L^\infty (\Omega , {\mathbb R^{N\times n}})}\qquad \quad \end{aligned}$$
(6.42)

for some constant C independent of k, and, by the chain rule for vector-valued Sobolev functions [52, Theorem 2.1], \(|\nabla |\nabla \textbf{u}_{\varepsilon ,k}|| \le |\nabla ^2 \textbf{u}_{\varepsilon ,k}|\) a.e. in \(\Omega \). Moreover, [30, Equation (6.12)] tells us that

$$\begin{aligned} - \textrm{\textbf{div}} (a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k}|)\nabla \textbf{u}_{\varepsilon ,k} ) \rightarrow {{\textbf {f}}} \quad \hbox {in}\quad L^2(\Omega , {\mathbb R^N}), \end{aligned}$$
(6.43)

as \(k \rightarrow \infty \). Assumption (2.22) enables us to apply inequality (6.10), with a replaced by \(a_\varepsilon \) and \({{\textbf {u}}}\) replaced by \(\textbf{u}_{\varepsilon ,k}\), to deduce that

$$\begin{aligned}&\Vert a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k}|) \nabla \textbf{u}_{\varepsilon ,k}\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \nonumber \\&\quad \le C \Big ( \Vert \textrm{\textbf{div}} ( a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k}|) \nabla \textbf{u}_{\varepsilon ,k})\Vert _{L^2(\Omega , {\mathbb R^N})} + \Vert a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k}|)\nabla \textbf{u}_{\varepsilon ,k}\Vert _{L^1(\Omega , {\mathbb R^{N\times n}})}\Big )\nonumber \\ \end{aligned}$$
(6.44)

for \(k\in \mathbb N\), and for some constant \(C=C(n, N, i_a, s_a, L_\Omega , d_\Omega , \mathcal K_\Omega )\). Notice that this constant actually depends on the function \(a_\varepsilon \) only through \(i_a\) and \(s_a\), and it is hence independent of \(\varepsilon \), owing to (4.25). Equations (6.42)–(6.44) ensure that the sequence \(\{a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k}|)\nabla \textbf{u}_{\varepsilon ,k}\}\) is bounded in \(W^{1,2}(\Omega , {\mathbb R^{N\times n}})\). Therefore, there exist a subsequence of \(\{\textbf{u}_{\varepsilon ,k}\}\), still denoted by \(\{\textbf{u}_{\varepsilon ,k}\}\), and a function \(\textbf{U}_\varepsilon \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\) such that

$$\begin{aligned}&a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k}|) \nabla \textbf{u}_{\varepsilon ,k} \rightarrow \textbf{U}_\varepsilon \quad \hbox {in}\quad L^2(\Omega , {\mathbb R^{N\times n}}),\nonumber \\&\quad a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k}|) \nabla \textbf{u}_{\varepsilon ,k} \rightharpoonup \textbf{U}_\varepsilon \quad \hbox {in}\quad W^{1,2}(\Omega , {\mathbb R^{N\times n}}). \end{aligned}$$
(6.45)

Equation (6.41) entails that \(\nabla \textbf{u}_{\varepsilon ,k} \rightarrow \nabla {{\textbf {u}}}_\varepsilon \) a.e. in \(\Omega \). As a consequence,

$$\begin{aligned} a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k}|)\nabla \textbf{u}_{\varepsilon ,k} \rightarrow a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon \quad \hbox {a.e. in}\quad \Omega , \end{aligned}$$
(6.46)

as \(k \rightarrow \infty \). From equations (6.46) and (6.45) one infers that

$$\begin{aligned} a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |) \nabla {{\textbf {u}}}_\varepsilon = \textbf{U}_\varepsilon \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\,. \end{aligned}$$
(6.47)
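
The identification in (6.47) can be sketched as follows, using only standard facts. The strong convergence in (6.45) provides a further subsequence \(\{k_j\}\) (a label introduced here purely for illustration) along which the convergence to \(\textbf{U}_\varepsilon \) also holds a.e. in \(\Omega \). Since a.e. limits are unique, equation (6.46) then gives

$$\begin{aligned} \textbf{U}_\varepsilon = \lim _{j \rightarrow \infty } a_\varepsilon (|\nabla \textbf{u}_{\varepsilon ,k_j}|) \nabla \textbf{u}_{\varepsilon ,k_j} = a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon \quad \hbox {a.e. in}\quad \Omega . \end{aligned}$$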

Furthermore, passing to the limit as \(k \rightarrow \infty \) in (6.44) yields

$$\begin{aligned} \Vert a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon \Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le C \big ( \Vert {{\textbf {f}}}\Vert _{L^2(\Omega , {\mathbb R^N}) } + \Vert a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon \Vert _{L^1(\Omega , {\mathbb R^{N\times n}})}\big ). \end{aligned}$$
(6.48)

Here, Eqs. (6.45) and (6.47) have been exploited to pass to the limit on the left-hand side, and equations (6.42) and (6.43) on the right-hand side. Combining Eqs. (6.48) and (6.39) tells us that

$$\begin{aligned} \Vert a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon \Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le C \end{aligned}$$
(6.49)

for some constant C, independent of \(\varepsilon \). By the last inequality, the family of functions \(\{a_\varepsilon (|\nabla {{\textbf {u}}}_\varepsilon |)\nabla {{\textbf {u}}}_\varepsilon \}\) is uniformly bounded in \(W^{1,2}(\Omega , {\mathbb R^{N\times n}})\) for \( \varepsilon \in (0,1)\). Therefore, there exist a sequence \(\{\varepsilon _m\}\) converging to 0 and a function \(\textbf{U} \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\) such that

$$\begin{aligned}&a_{\varepsilon _m}(|\nabla {{\textbf {u}}}_{\varepsilon _m}|) \nabla {{\textbf {u}}}_{\varepsilon _m} \rightarrow \textbf{U} \,\,\hbox {in}\quad L^2(\Omega , {\mathbb R^{N\times n}}),\nonumber \\&\quad a_{\varepsilon _m}(|\nabla {{\textbf {u}}}_{\varepsilon _m}|) \nabla {{\textbf {u}}}_{\varepsilon _m} \rightharpoonup \textbf{U} \, \,\hbox {in}\quad W^{1,2}(\Omega , {\mathbb R^{N\times n}}). \end{aligned}$$
(6.50)

An argument parallel to that of the proof of (5.25) yields

$$\begin{aligned} \nabla {{\textbf {u}}}_{\varepsilon _m} \rightarrow \nabla {{\textbf {u}}}\qquad \hbox {a.e. in}\quad \Omega . \end{aligned}$$
(6.51)

We omit the details, for brevity. Let us just point out that, in this argument, one has to make use of the inequality

$$\begin{aligned} \int _{\Omega } B(|\nabla {{\textbf {u}}}_{\varepsilon _m}|)\, dx \le C \bigg (\int _{\Omega }\widetilde{B}(|{\textbf {f}}|)\,dx + B(\varepsilon _m)\bigg )\,, \end{aligned}$$
(6.52)

instead of (5.12). Inequality (6.52) easily follows on choosing \({{\textbf {u}}}_{\varepsilon _m}\) as a test function in the definition of weak solution to problem (6.38), with \(\varepsilon = \varepsilon _m\). Coupling equations (6.50) and (6.51) implies that

$$\begin{aligned} a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}= \textbf{U} \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\,. \end{aligned}$$
(6.53)

On the other hand, on exploiting Eqs. (6.51) and (6.39), the dominated convergence theorem for Lebesgue integrals and inequality (6.12) (applied with a and \({\textbf {u}}\) replaced by \(a_{\varepsilon _m}\) and \({{\textbf {u}}}_{\varepsilon _m}\)), one can deduce that

$$\begin{aligned} \lim _{m \rightarrow \infty } \Vert a_{\varepsilon _m}(|\nabla {{\textbf {u}}}_{\varepsilon _m}|)\nabla {{\textbf {u}}}_{\varepsilon _m}\Vert _{L^1(\Omega , {\mathbb R^{N\times n}})} = \Vert a(|\nabla {{\textbf {u}}}|)\nabla {{\textbf {u}}}\Vert _{L^{1}(\Omega , {\mathbb R^{N\times n}})} \le C \Vert {\textbf {f}}\Vert _{L^2(\Omega , {\mathbb R^N})} \end{aligned}$$
(6.54)

for some constant \(C=C(n,N,i_a, s_a, |\Omega |)\). Combining Eqs. (6.48), (6.50), (6.53) and (6.54) yields

$$\begin{aligned} \Vert a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le C \Vert {{\textbf {f}}}\Vert _{L^2(\Omega , {\mathbb R^N})} \end{aligned}$$
(6.55)

for some constant \(C= C(n, N, i_a, s_a, L_\Omega , d_\Omega , \mathcal K_\Omega )\).

Step 2. Assume now that the temporary condition (6.36) is still in force, but \(\Omega \) is just as in the statement. Let \(\{\Omega _m\}\) be a sequence of open sets approximating \(\Omega \) in the sense of Lemma 6.3. For each \(m \in \mathbb N\), denote by \({\textbf {u}}_m\) the weak solution to the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} - \textrm{\textbf{div}} (a(|\nabla {{\textbf {u}}}_m|)\nabla {{\textbf {u}}}_m ) = {{\textbf {f}}} &{} \textrm{in}\,\,\, \Omega _m \\ {{\textbf {u}}}_m =0 &{} \textrm{on}\,\,\, \partial \Omega _m \,, \end{array}\right. } \end{aligned}$$
(6.56)

where \({{\textbf {f}}}\) is continued by 0 outside \(\Omega \). Owing to our assumptions on \(\Omega _m\), inequality (6.55) holds for \({\textbf {u}}_m\). Thereby, there exists a constant \(C(n, N, i_a, s_a, L_\Omega , d_\Omega , \mathcal K_\Omega )\) such that

$$\begin{aligned} \Vert a(|\nabla {{\textbf {u}}}_m|) \nabla {{\textbf {u}}}_m\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})}&\le \Vert a(|\nabla {{\textbf {u}}}_m|) \nabla {{\textbf {u}}}_m\Vert _{W^{1,2}(\Omega _m, {\mathbb R^{N\times n}})} \nonumber \\&\le C \Vert {{\textbf {f}}}\Vert _{L^2(\Omega _m, {\mathbb R^N})}= C \Vert {{\textbf {f}}}\Vert _{L^2(\Omega , {\mathbb R^N})}. \end{aligned}$$
(6.57)

Observe that the dependence of the constant C on the specified quantities, and, in particular, its independence of m, is due to properties (6.33) and (6.34).

Thanks to (6.57), the sequence \(\{a(|\nabla {{\textbf {u}}}_m |) \nabla {{\textbf {u}}}_m\}\) is bounded in \(W^{1,2}(\Omega , {\mathbb R^{N\times n}})\), and hence there exist a subsequence, still denoted by \(\{{{\textbf {u}}}_m\}\), and a function \({{\textbf {U}}} \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\) such that

$$\begin{aligned} a(|\nabla {{\textbf {u}}}_{m}|) \nabla {{\textbf {u}}}_{m}&\rightarrow {{\textbf {U}}}\quad \hbox {in}\quad L^2(\Omega , {\mathbb R^{N\times n}}), \nonumber \\ a(|\nabla {{\textbf {u}}}_{m}|)\nabla {{\textbf {u}}}_{m}&\rightharpoonup {{\textbf {U}}}\quad \hbox {in}\quad W^{1,2}(\Omega , {\mathbb R^{N\times n}}). \end{aligned}$$
(6.58)

We now notice that there exists \(\alpha \in (0,1)\), independent of m, such that \({{\textbf {u}}}_m\in C^{1,\alpha }_\textrm{loc}(\Omega , {\mathbb R^N})\), and for every open set \(\Omega ' \subset \subset \Omega \) with a smooth boundary

$$\begin{aligned} \Vert {{\textbf {u}}}_{m}\Vert _{C^{1,\alpha }(\Omega ', {\mathbb R^N})} \le C, \end{aligned}$$
(6.59)

for some constant C, independent of m. To verify this assertion, one can make use of [39, Corollary 5.5] and of inequality (4.35) to deduce that, for each open set \(\Omega '\) as above, there exists a constant C, independent of m, such that

$$\begin{aligned} \Vert \nabla {{\textbf {u}}}_{m}\Vert _{C^{\alpha }(\Omega ', {\mathbb R^{N\times n}})} \le C. \end{aligned}$$
(6.60)

Since the function \({\textbf {f}}\) satisfies assumption (6.36), a basic energy estimate for weak solutions tells us that

$$\begin{aligned} \int _{\Omega _m} B(|\nabla {{\textbf {u}}}_{m}|)\, dx \le C \end{aligned}$$
(6.61)

for some constant C independent of m. Thus, as a consequence of the Poincaré inequality (4.4),

$$\begin{aligned} \int _{\Omega _m} B(|{{\textbf {u}}}_{m}|)\, dx \le C\,, \end{aligned}$$
(6.62)

where the constant C is independent of m, since \(|\Omega _m|\) is uniformly bounded. Inequalities (6.60) and (6.62), via a Sobolev type inequality, tell us that

$$\begin{aligned} \Vert {{\textbf {u}}}_{m}\Vert _{L^\infty (\Omega ', {\mathbb R^N})} \le C\, \end{aligned}$$
(6.63)

for some constant C independent of m. Inequality (6.59) follows from (6.60) and (6.63).
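
To make the last step explicit, assume the norm convention \(\Vert {{\textbf {u}}}\Vert _{C^{1,\alpha }(\Omega ', {\mathbb R^N})} = \Vert {{\textbf {u}}}\Vert _{L^\infty (\Omega ', {\mathbb R^N})} + \Vert \nabla {{\textbf {u}}}\Vert _{C^{\alpha }(\Omega ', {\mathbb R^{N\times n}})}\). This convention is an assumption made here for illustration; equivalent conventions only affect the value of the constant. Then a possible way to write the conclusion is

$$\begin{aligned} \Vert {{\textbf {u}}}_{m}\Vert _{C^{1,\alpha }(\Omega ', {\mathbb R^N})} = \Vert {{\textbf {u}}}_{m}\Vert _{L^\infty (\Omega ', {\mathbb R^N})} + \Vert \nabla {{\textbf {u}}}_{m}\Vert _{C^{\alpha }(\Omega ', {\mathbb R^{N\times n}})} \le C, \end{aligned}$$

where the first term is bounded via (6.63), the second via (6.60), and C denotes the sum of the two constants appearing there, hence is independent of m.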

On passing, if necessary, to another subsequence, we deduce from inequality (6.59) that there exists a function \({{\textbf {v}}} \in C^1(\Omega , {\mathbb R^N})\) such that

$$\begin{aligned} {{\textbf {u}}}_{m} \rightarrow {{\textbf {v}}}\, \quad \hbox {and}\quad \nabla {{\textbf {u}}}_{m} \rightarrow \nabla {{\textbf {v}}} \quad \hbox {in}\quad \Omega , \end{aligned}$$
(6.64)

pointwise. Hence,

$$\begin{aligned} a(|\nabla {{\textbf {u}}}_{m}|) \nabla {{\textbf {u}}}_{m} \rightarrow a(|\nabla \textbf{v}|) \nabla {{\textbf {v}}} \quad \hbox {in}\quad \Omega . \end{aligned}$$
(6.65)

Owing to equations (6.65) and (6.58),

$$\begin{aligned} a(|\nabla {{\textbf {v}}}|) \nabla {{\textbf {v}}} = {{\textbf {U}}} \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\,. \end{aligned}$$
(6.66)

Next, we pick a test function \({\varvec{\varphi }}\in C^\infty _0(\Omega , {\mathbb R^N})\) (continued by 0 outside \(\Omega \)) in the definition of weak solution to problem (6.56). Passing to the limit as \(m \rightarrow \infty \) in the resulting equation yields, via (6.58) and (6.66),

$$\begin{aligned} \int _\Omega a(|\nabla {{\textbf {v}}}|) \nabla {{\textbf {v}}} \cdot \nabla {\varvec{\varphi }}\, dx = \int _\Omega {{\textbf {f}}} \cdot {\varvec{\varphi }}\, dx\,. \end{aligned}$$
(6.67)
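
The limit passage leading to (6.67) can be sketched as follows, under the assumption, suggested by the first inequality in (6.57), that \(\Omega \subset \Omega _m\). Since \({\varvec{\varphi }}\) vanishes outside \(\Omega \) and, by (6.58) and (6.66), \(a(|\nabla {{\textbf {u}}}_m|) \nabla {{\textbf {u}}}_m \rightarrow a(|\nabla {{\textbf {v}}}|) \nabla {{\textbf {v}}}\) in \(L^2(\Omega , {\mathbb R^{N\times n}})\), one has

$$\begin{aligned} \int _\Omega {{\textbf {f}}} \cdot {\varvec{\varphi }}\, dx&= \int _{\Omega _m} a(|\nabla {{\textbf {u}}}_m|) \nabla {{\textbf {u}}}_m \cdot \nabla {\varvec{\varphi }}\, dx = \int _{\Omega } a(|\nabla {{\textbf {u}}}_m|) \nabla {{\textbf {u}}}_m \cdot \nabla {\varvec{\varphi }}\, dx \nonumber \\&\rightarrow \int _\Omega a(|\nabla {{\textbf {v}}}|) \nabla {{\textbf {v}}} \cdot \nabla {\varvec{\varphi }}\, dx \end{aligned}$$

as \(m \rightarrow \infty \).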

Inequality (6.61) tells us that \(\int _\Omega B(|\nabla {{\textbf {u}}}_m|) \, dx \le C\) for some constant C independent of m. Therefore, by the pointwise convergence in (6.64) and Fatou's lemma, this inequality is still true if \({{\textbf {u}}}_m\) is replaced with \({{\textbf {v}}}\). Consequently, thanks to inequality (4.22), \(\int _\Omega \widetilde{B}(a(|\nabla \textbf{v}|) |\nabla {{\textbf {v}}}|)\, dx <\infty \). Thus, since under our assumptions on a the space \(C^\infty _0(\Omega , {\mathbb R^N})\) is dense in \(W^{1,B}_0(\Omega , {\mathbb R^N})\), equation (6.67) holds for every function \({\varvec{\varphi }}\in W^{1,B}_0(\Omega , {\mathbb R^N})\) as well. Hence, \({{\textbf {v}}}\) is a weak solution to the Dirichlet problem (2.17), and, inasmuch as such a solution is unique, \({{\textbf {v}}}={{\textbf {u}}}\). Moreover, by equations (6.57) and (6.58), there exists a constant \(C=C(n, N, i_a, s_a, L_\Omega , d_\Omega , \mathcal K_\Omega )\) such that

$$\begin{aligned} \Vert a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le C \Vert {{\textbf {f}}}\Vert _{L^2(\Omega , {\mathbb R^N})}. \end{aligned}$$
(6.68)
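
One way to read how (6.57) and (6.58) combine into (6.68), sketched here via the standard weak lower semicontinuity of the norm: since \({{\textbf {v}}}={{\textbf {u}}}\), equation (6.66) and the weak convergence in (6.58) give

$$\begin{aligned} \Vert a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} = \Vert {{\textbf {U}}}\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le \liminf _{m \rightarrow \infty } \Vert a(|\nabla {{\textbf {u}}}_m|) \nabla {{\textbf {u}}}_m\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le C \Vert {{\textbf {f}}}\Vert _{L^2(\Omega , {\mathbb R^N})}, \end{aligned}$$

where the last inequality holds by (6.57).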

Step 3. Finally, assume that both \(\Omega \) and \({\textbf {f}}\) are as in the statement.

The definition of approximable solution entails that there exists a sequence \(\{{{\textbf {f}}}_k\} \subset C^\infty _0(\Omega , {\mathbb R^N})\) such that \({{\textbf {f}}}_k \rightarrow {{\textbf {f}}}\) in \(L^2(\Omega , {\mathbb R^N})\) and the sequence of weak solutions \(\{{{\textbf {u}}}_k\} \subset W^{1,B}_0(\Omega , {\mathbb R^N})\) to problems (6.2) fulfills \({{\textbf {u}}}_k \rightarrow {{\textbf {u}}}\) and \(\nabla {{\textbf {u}}}_k \rightarrow \nabla {{\textbf {u}}}\) a.e. in \(\Omega \). An application of inequality (6.68), with \({\textbf {u}}\) and \({\textbf {f}}\) replaced by \({\textbf {u}}_k\) and \({\textbf {f}}_k\), tells us that \(a(|\nabla {{\textbf {u}}}_k|)\nabla {{\textbf {u}}}_k \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\), and

$$\begin{aligned} \Vert a(|\nabla {{\textbf {u}}}_k|) \nabla {{\textbf {u}}}_k\Vert _{W^{1,2}(\Omega , {\mathbb R^{N\times n}})} \le C_1 \Vert {{\textbf {f}}}_k\Vert _{L^2(\Omega , {\mathbb R^N})} \le C_2 \Vert {{\textbf {f}}}\Vert _{L^2(\Omega , {\mathbb R^N})}\, \end{aligned}$$
(6.69)

for some constants \(C_1\) and \(C_2\), depending on N, \(i_a\), \(s_a\) and \(\Omega \). Therefore, the sequence \(\{a(|\nabla {{\textbf {u}}}_k|)\nabla {{\textbf {u}}}_k\}\) is bounded in \( W^{1,2}(\Omega , {\mathbb R^{N\times n}})\), whence there exist a subsequence, still indexed by k, and a function \({{\textbf {U}}} \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\) such that

$$\begin{aligned} a(|\nabla {{\textbf {u}}}_{k}|)\nabla {{\textbf {u}}}_{k} \rightarrow {{\textbf {U}}}\quad \hbox {in}\quad L^2(\Omega , {\mathbb R^{N\times n}}), \quad a(|\nabla {{\textbf {u}}}_{k}|) \nabla {{\textbf {u}}}_{k} \rightharpoonup \textbf{U}\quad \hbox {in}\quad W^{1,2}(\Omega , {\mathbb R^{N\times n}}). \end{aligned}$$
(6.70)

Inasmuch as \(\nabla {{\textbf {u}}}_k \rightarrow \nabla {{\textbf {u}}}\) a.e. in \(\Omega \), one hence deduces that \(a(|\nabla {{\textbf {u}}}|) \nabla {{\textbf {u}}}={{\textbf {U}}} \in W^{1,2}(\Omega , {\mathbb R^{N\times n}})\). Thereby, the second inequality in (2.19) follows from equations (6.69) and (6.70). The first inequality in (2.19) holds trivially. The proof is complete. \(\square \)

Proof of Theorem 2.6

The proof parallels that of Theorem 2.7. However, Step 2 requires a variant. The sequence \(\{\Omega _m\}\) of bounded sets, with smooth boundaries, coming into play in this step has to be chosen in such a way that they are convex and approximate \({\Omega }\) from outside with respect to the Hausdorff distance. Inequalities (6.33) automatically hold in this case. Moreover, inequality (6.34) is not needed, inasmuch as the constant C in (6.10) does not depend on the function \(\mathcal K_{\Omega }\) if \({\Omega }\) is convex. \(\square \)