1 Introduction

Let \(1<p<\infty \) and consider an open and bounded domain \({\Omega \subset \mathbb {R}^n}\). We want to find \({u\in W^{1,p}_g(\Omega )=g+W^{1,p}_0(\Omega ,\mathbb {R}^N)}\) such that

$$\begin{aligned} div \,a(x,D u)&= \,div \,f \quad \text { in } \Omega \end{aligned}$$
(1.1)

where \({a:\Omega \times \mathbb {R}^{N \times n}\rightarrow \mathbb {R}^{N \times n}}\) is a matrix-valued elliptic structure field satisfying controlled p-growth conditions. We assume that \(f\in L^{p'}(\Omega ,\mathbb {R}^{N\times n})\) and \(g\in W^{1,p}(\Omega ,\mathbb {R}^N)\). Here \(p'\) denotes the Hölder conjugate of p and (1.1) is to be understood in the sense of distributions. To be precise, we make the following assumptions on a:

$$\begin{aligned}&\text {For every } z\in \mathbb {R}^{N\times n},\ a(\cdot ,z) \text { is measurable in } \Omega \text { and, for almost every }\nonumber \\ {}&x\in \Omega ,\ a(x,\cdot ) \text { is continuously differentiable in } \mathbb {R}^{N\times n}. \end{aligned}$$
(A1)

There are \(\Lambda _a\ge \lambda _a>0\) and \(\mu \ge 0\) such that, for almost every \(x\in \Omega \) and for all \(z,\xi \in \mathbb {R}^{N\times n}\),

$$\begin{aligned} \lambda _a (\mu ^2+|z|^2)^\frac{p-2}{2}|\xi |^2\le \partial _z a(x,z)\xi \cdot \xi \le \Lambda _a (\mu ^2+|z|^2)^\frac{p-2}{2}|\xi |^2. \end{aligned}$$
(A2)
$$\begin{aligned}&\text {There is } c>0 \text { such that, for all } x\in \Omega ,\ z\in \mathbb {R}^{N \times n}, \text { it holds that}\nonumber \\ {}&{|a(x,z)|\le c\left( 1 +|z|^{p-1}\right) }. \end{aligned}$$
(A3)

For simplicity of presentation, in addition to (A1)–(A3), we make the further assumption that

$$\begin{aligned} \text {For almost every } x\in \Omega \text { and every } z\in \mathbb {R}^{N\times n},\, \partial _z a(x,z) \text { is symmetric.} \end{aligned}$$
(A4)

Remark 1.1

We have chosen to focus on systems of the form (1.1). However, with minor changes to (A1)–(A3), all the theory we develop applies equally to systems of the form \(div \,a(x,u,D u)=div \,f\), where \({a:\Omega \times \mathbb {R}^N\times \mathbb {R}^{N\times n}\rightarrow \mathbb {R}^{N\times n}}\). Further, (A4) can be dropped if the results of this paper are modified as indicated in Remark 3.3.

We will only concern ourselves with the vectorial case \(N\ge 2\) as the theory in the scalar case \(N=1\) is different and much more can be said.

The study of elliptic systems of the form (1.1) is well established with a very long list of important results. For an introduction we refer to [22, 24] as well as the classical [33]. It is well known that in general, without stronger structural assumptions than ellipticity and controlled p-growth, solutions to (1.1) will not be \(C^{1,\alpha }\)- or even \(C^{0,\alpha }\)- regular. In fact, problems already arise in the simplest case when \(p=2\). An initial counterexample for systems with \({a\equiv a(x,z)}\) (with \(n\ge 3\)) was developed in [14], where a quadratic system was given for which solutions are not Hölder-continuous and, in fact, fail to be bounded. While the x-dependence of the system in [14] is discontinuous, almost at the same time an example was found of a higher-order elliptic operator with analytic coefficients but discontinuous solutions [40]. We remark at this point that, if \(n=2\), solutions to (1.1) are Hölder continuous [43]. When \(a\equiv a(x,y,z)\) does indeed depend on y, it is possible to give examples where a is analytic, but solutions to (1.1) are nowhere continuous, see [30, 49], building on examples with analytic fields a but discontinuous solutions in [25]. In this set-up, an example of a system in optimal dimensions \(n>2\), \(N=2\) with discontinuous solutions is given in [20]. However, the lack of regularity occurs already without x-dependence of the coefficients. Even when \(a\equiv a(z)\) is quadratic and analytic, \(C^{1,\alpha }\)-regularity of solutions need not hold, if \(n>2\), as [44] shows.

Due to this lack of regularity, the general regularity theory of systems of type (1.1) proceeds through notions of partial regularity, that is, regularity outside of a context-dependent small relatively closed set. Since our focus here lies on results holding everywhere in the domain, we refer the reader to [22, 24, 42] for results and references in this direction.

Regularity results holding in the full domain are known only in a few special cases. The radial Uhlenbeck structure \(a\equiv a(\left| z \right| )z\), originally developed in [52], is well known to guarantee full \(C^{1,\alpha }\)-regularity. In [50], the result is generalised to fields of the form g(z, z)z, where \(g(\cdot ,\cdot )\) is a symmetric, positive definite bilinear form. In particular, this theory covers the model case of the p-Laplace operator \(a(x,y,z)=|z|^{p-2}z\). There now exists a vast literature concerning the regularity theory of (1.1) if \(a\equiv a(|z|)z\) and we refer the reader to [17, 18, 23, 35, 38, 39] for further results and references.

A second direction of everywhere regularity results, in the case where \(a\equiv a(z)\), concerns the case where the modulus of continuity of \(\partial _z a(z)\) is not too large compared to the ellipticity constant of a(z). \(C^{1,\alpha }_{loc }\)-regularity of solutions to (1.1) is shown in this set-up in [11,12,13].

A further direction, closely related to the results of this paper, consists of results of Cordes–Nirenberg type. For linear elliptic PDE, the classical Cordes–Nirenberg results [9, 45] state that solutions of

$$\begin{aligned} a_{ij}(x)\partial _{ij}u(x)=f(x) \end{aligned}$$
(1.2)

have interior \(C^{1,\alpha }\) regularity, if \(f\in L^\infty \) and \(|a_{ij}-\delta _{ij}|<\varepsilon \) for sufficiently small \(\varepsilon \). That is, \(C^{1,\alpha }\)-regularity is obtained when the coefficients \(\{a_{ij}\}\) are sufficiently close to the identity matrix in an appropriate sense. Note that the identity matrix is nothing but the coefficients corresponding to the Laplacian. In [8] the result is extended to fully non-linear elliptic equations with solutions in the sense of viscosity solutions. A result of a similar spirit is [19], where local BMO-regularity for extremals of (a suitable relaxed version of) \(dist _K^2(D u)\) is shown. Here \(dist _K(z)\) denotes the distance of \(z\in \mathbb {R}^{N\times n}\) to the compact set \(K\subset \mathbb {R}^{N\times n}\).

We also comment on the case where \(a(x,z)=a(x)z+b(x,z)z\) and \(\Vert b(x,z)\Vert \le c(n)\lambda _a\). Here a(x) is a positive definite matrix, with smallest eigenvalue uniformly bounded below in \(\Omega \) by \(\lambda _a>0\) and c(n) is a constant depending only on n. If c(n) is sufficiently small, then bounded solutions of (1.1) are \(\alpha \)-Hölder-continuous [53].

In this paper we obtain regularity results for fields a that are close to a suitable reference field b, see the start of Sect. 3 for a precise definition of our notion of reference field. Suppose a, b satisfy (A1) and (A2), and let \(A(x,z)=\partial _z a(x,z), B(x,z)=\partial _z b(x,z)\). Denote by \(\lambda _{B^{-1}A}(x,z), \Lambda _{B^{-1}A}(x,z)\) the smallest and largest eigenvalue of \(B^{-1}(x,z)A(x,z)\), respectively. We note that then \(\lambda _{B^{-1}A}\) has a lower bound, and \(\Lambda _{B^{-1}A}(x,z)\) has an upper bound, that is independent of x, z. Denote these bounds by \(\lambda _{a,b}\) and \(\Lambda _{a,b}\), respectively. We set

$$\begin{aligned} K_{a,b}=\frac{\Lambda _{a,b}-\lambda _{a,b}}{\lambda _{a,b}+\Lambda _{a,b}} \end{aligned}$$
(1.3)

We will operate in a regime where \(K_{a,b}\) is sufficiently small and remark that in all cases we are able to make the smallness assumption on \(K_{a,b}\) explicit. In this regime, we are able to transfer existence and regularity results for solutions of the system \(div \,b(x,D u)=div \,f\) to solutions of the system \({div \,a(x,D u)=div \,f}\).

The observation of [32] is that, if a is a matrix-valued elliptic structure field satisfying controlled quadratic growth, then considering the scheme

$$\begin{aligned} {\left\{ \begin{array}{ll} -\Delta u_{n+1}=-\Delta u_n + \gamma div \,a(x,D u_n) -\gamma div \,f &{}\quad \text { in } \Omega \\ u_{n+1}= g &{}\quad \text { on } \partial \Omega , \end{array}\right. } \end{aligned}$$

with any choice of \(u_0\in W^{1,2}_g(\Omega )\), gives a sequence \(\{u_n\}\) that converges in \(W^{1,2}(\Omega )\) to a solution of (1.1). The problem is to be understood in the sense of distributions. [32] further shows convergence of the scheme in appropriate Morrey spaces, obtaining \(C^{0,\alpha }\)- and \(C^{1,\alpha }\)-regularity of solutions to (1.1) under (sharp in the case of \(C^{0,\alpha }\)) smallness assumptions on \(K_{-\Delta ,a}\). We will return to this point shortly.

We extend this result to matrix-valued elliptic structure fields with p-growth.

Theorem 1

Let \(1<p<\infty \), \(f\in L^{p'}(\Omega ,\mathbb {R}^{N \times n})\) and \(g\in W^{1,p}(\Omega ,\mathbb {R}^N)\). Suppose a, b satisfy (A1)–(A3) and (A4). Assume that the problem

$$\begin{aligned} div \,b(x,D u)= div \,F \text { in } \Omega \end{aligned}$$

has a unique solution in \(W^{1,p}_g(\Omega )\) for any choice of \(F\in L^{p'}(\Omega )\). If

$$\begin{aligned} {\left\{ \begin{array}{ll} \dfrac{\Lambda _b}{\lambda _b}K_{a,b} \left( \dfrac{3+\sqrt{5}}{2}\right) ^\frac{p-2}{2p} 6^{p-2}(p-1)< 1 &{}\quad \text { for } p\ge 2 \\ \dfrac{\Lambda _b}{\lambda _b}K_{a,b} \left( \dfrac{3-\sqrt{5}}{2}\right) ^\frac{(p-2)}{4}\dfrac{2^{2-p}}{p-1}< 1 &{}\quad \text { for } p\le 2, \end{array}\right. } \end{aligned}$$

then there is \(\gamma >0\) such that, if \({u_0\in W^{1,p}_g(\Omega )}\), and \({u_{n+1}\in W^{1,p}_g(\Omega )}\) is inductively defined to be the weak solution of

$$\begin{aligned} div \,b(x,D u_{n+1}) = div \,( b(x,D u_n)-\gamma a(x,D u_n)+ \gamma f) \text { in } \Omega \end{aligned}$$
(1.4)

in the class \(W^{1,p}_g(\Omega )\), then \(u_n\rightarrow u\) in \(W^{1,p}(\Omega )\). Here u is the unique solution in \(W^{1,p}_g(\Omega )\) of (1.1).
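The smallness condition of Theorem 1 is explicit, so it can be evaluated directly for given data. The following Python sketch does this; the function names `theorem1_lhs` and `theorem1_applies` and the sample values are illustrative assumptions and are not part of the paper.

```python
import math

def theorem1_lhs(p, dispersion_b, K):
    """Left-hand side of the smallness condition in Theorem 1.

    p            -- growth exponent, 1 < p < infinity
    dispersion_b -- the ratio Lambda_b / lambda_b of the reference field b
    K            -- the constant K_{a,b} from (1.3)
    """
    if p <= 1:
        raise ValueError("p must satisfy 1 < p < infinity")
    if p >= 2:
        const = (3 + math.sqrt(5)) / 2
        return (dispersion_b * K * const ** ((p - 2) / (2 * p))
                * 6 ** (p - 2) * (p - 1))
    const = (3 - math.sqrt(5)) / 2
    return dispersion_b * K * const ** ((p - 2) / 4) * 2 ** (2 - p) / (p - 1)

def theorem1_applies(p, dispersion_b, K):
    """True if the smallness condition of Theorem 1 is satisfied."""
    return theorem1_lhs(p, dispersion_b, K) < 1

# Hypothetical sample data: Lambda_b / lambda_b = 1 and K_{a,b} = 0.2.
quadratic_case = theorem1_lhs(2.0, 1.0, 0.2)
```

For \(p=2\) both branches reduce to \(\frac{\Lambda _b}{\lambda _b}K_{a,b}\), so the condition becomes more restrictive as p moves away from 2 in either direction.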

Remark 1.2

We note that if \(p=2\) and \(b(x,z) = z\), we recover precisely the situation in [32].
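To illustrate the mechanics of the iteration (1.4) in the situation of Remark 1.2, the following Python sketch runs it in a one-dimensional scalar toy problem on \((0,1)\) with \(g=0\), \(p=2\) and \(b(x,z)=z\). This toy setting (\(n=N=1\)) lies outside the vectorial case treated in this paper and is used only because the update can then be written down explicitly: the derivative of the new iterate is the mean-free part of \(u_n'-\gamma (a(u_n')-f)\). The field `a`, the right-hand side and the uniform mesh with piecewise constant data are hypothetical choices made for the sake of the example.

```python
import math

def a(z):
    # Hypothetical field: its derivative 1 + 0.5 / (1 + z^2) lies in (1, 1.5],
    # so lambda = 1 and Lambda = 1.5 are admissible ellipticity bounds.
    return z + 0.5 * math.atan(z)

def koshelev_1d(f_cells, gamma, steps):
    """Iterate (1.4) with b(x, z) = z on a uniform mesh of (0, 1).

    The unknown is the cellwise constant derivative d of a piecewise linear
    function vanishing at both endpoints, so d must have mean zero.
    Returns the final d and the discrete L^2 norms of successive differences.
    """
    m = len(f_cells)
    d = [0.0] * m  # derivative of the initial guess u_0 = 0
    history = []
    for _ in range(steps):
        G = [d[i] - gamma * (a(d[i]) - f_cells[i]) for i in range(m)]
        mean = sum(G) / m
        new = [g - mean for g in G]  # enforce zero boundary values
        diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(new, d)) / m)
        history.append(diff)
        d = new
    return d, history

lam, Lam = 1.0, 1.5
gamma = 2.0 / (lam + Lam)        # choice of gamma as in Remark 3.3
K = (Lam - lam) / (Lam + lam)    # K_{a,b} from (1.3), here equal to 0.2
m = 64
f_cells = [math.sin(2 * math.pi * (i + 0.5) / m) for i in range(m)]
d, history = koshelev_1d(f_cells, gamma, 25)
```

In this toy problem successive differences contract by at least the factor K in each step, and at the limit \(a(u')-f\) is constant on \((0,1)\), which is the one-dimensional form of (1.1).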

Remark 1.3

It will actually be the case that when Theorem 1 applies, the convergence will be at linear rate, that is for some \(R<1\),

$$\begin{aligned} \Vert u_{n+1}-u\Vert _{W^{1,p}(\Omega )}\le R \Vert u_n-u\Vert _{W^{1,p}(\Omega )}. \end{aligned}$$

Further, \(R^{\max (p-1,p/2)} \sim K_{a,b}\), that is the speed of convergence depends on the size of the eigenvalues of \((\partial _z b(x,z))^{-1}\partial _z a(x,z)\). We comment further on this and give an explicit formula for R in Remark 3.4.

We are only able to prove convergence for some \(\gamma >0\). However, in the quadratic case when \(b(x,z)=z\), convergence holds for all sufficiently small \(\gamma >0\) [32]. Moreover, in our numerical experiments we also observe convergence for all sufficiently small values of \(\gamma >0\), c.f. Figure 3. It would be interesting to formally prove that Theorem 1 holds for any sufficiently small \(\gamma >0\), but we have been unable to do so.

The iterative scheme (1.4) naturally lends itself as a numerical scheme to solve (1.1) using finite element approximations. Take \(g=0\) and suppose \(\Omega \) is a regular polytope. Consider a family of shape-regular triangulations \(\{T_h\}_{h\in (0,1]}\) of \(\Omega \) of mesh-size h. Denote by \(X_h\) the space of continuous piecewise linear (subordinate to \(T_h\)) functions. Take \(u_0\in X_h\). Define \(u_{n+1}^h\in X_h\) inductively as a solution of the problem: For all \(\phi \in X_h\),

$$\begin{aligned} {\left\{ \begin{array}{ll}\int _\Omega b(x,D u_{n+1}^h)\cdot D \phi \,\mathrm {d}x = \int _\Omega (b(x,D u_n^h)-\gamma a(x,D u_n^h)+\gamma f)\cdot D \phi \,\mathrm {d}x \\ u_{n+1}^h = 0 \qquad \text { on } \partial \Omega \end{array}\right. } \end{aligned}$$
(1.5)

Note that, due to (A1)–(A3) and the theory of monotone operators, (1.5) is well-defined. We prove in Sect. 4 that under the assumptions of Theorem 1 this scheme converges in \(W^{1,p}(\Omega )\) to a solution of the problem

$$\begin{aligned} {\left\{ \begin{array}{ll}\int _\Omega a(x,D u^h)\cdot D \phi \,\mathrm {d}x = \int _\Omega f\cdot D \phi \,\mathrm {d}x \quad &{}\forall \phi \in X_h\\ u^h = 0 &{}\text { on } \partial \Omega . \end{array}\right. } \end{aligned}$$

Under the additional assumption that, if \(F\in L^{p'}(\Omega )\), then there is \(\alpha >0\) such that the equation \(div \,b(x,D u)= div \,F\) has a solution u in \(W^{1+\alpha ,p}(\Omega )\), and, moreover, we have the following estimate for some \(c>0\):

$$\begin{aligned} \Vert u\Vert _{W^{1+\alpha ,p}(\Omega )}\le c\left( 1+\Vert u\Vert _{W^{1,p}(\Omega )}+\Vert F\Vert _{L^{p'}(\Omega )}^{1/(p-1)}\right) , \end{aligned}$$
(1.6)

we show that \(u^h\rightarrow u\) in \(W^{1,p}(\Omega )\) at rate \(O(h^\frac{2\alpha }{\max (2,p)})\) as \(h\rightarrow 0\) where \(u\in W^{1,p}_g(\Omega )\) is the solution of (1.1). We note that (1.6) is satisfied by many fields of interest, e.g. the p-Laplacian \(b(x,z)=|z|^{p-2}z\), [48].

If \(p=2\) and \(b(x,z)=z\), the proposed numerical scheme falls into the class of iterative linearised Galerkin schemes studied in [29]. We refer to this reference for an overview and further references regarding such schemes. An advantage of the schemes studied in this paper, and of unified iterative schemes of the form studied in [29] in general, is that it is possible to show global convergence results at linear rate for a wide class of problems. This should be compared with nonlinear Newton schemes for which convergence can usually only be expected locally, albeit at quadratic rate. An example of a scheme, fitting into the framework of unified iterative schemes, that has been used to study some problems of p-Laplace type where \(a(x,z)=a(x,|z|)z\) and a(x, |z|) has \(p-2\)-growth, is the so-called Kačanov iteration, originally introduced in [31], and defined by solving

$$\begin{aligned} div \,a(x,|D u_n|)D u_{n+1} = div \,f. \end{aligned}$$

Versions of this scheme have been studied in [15, 21, 27, 28, 54], but in general have been restricted to the case \(p\in [1,2]\).

We stress that, in general, the scheme studied in this paper cannot be used to replace schemes proposed in the literature to solve nonlinear elliptic systems. Indeed, we always require the existence of a numerical scheme which solves the corresponding problem \(div \,b(x,D u)= div \,F\) for the reference field. However, in any set-up where a numerical scheme for a reference field exists, we can apply the iterative scheme to perturbations a of the reference field. As an example, in the case of the Kačanov scheme, the iterative scheme applies to perturbations of the p-Laplace operator which need not satisfy any structural assumption of the form \(a(x,z)=a(x,|z|)z\). Further, no regularity theory is required for the perturbation a. Nevertheless, it is instructive to compare our numerical results and experiments to those available in the literature, and in particular to some results regarding the p-Laplace operator \(b(x,z)=|z|^{p-2} z\). In [6], optimal error bounds, that is O(h) bounds, where h is the mesh-size, are proven for the scalar p-Laplace equation. For \(p\le 2\), these error bounds are in \(W^{1,p}\)-norm and are thus replicated by our convergence rates if (1.6) holds with \(\alpha =1\). However, the regularity requirement on solutions in [6] is stronger than (1.6). For \(p\ge 2\), the optimal convergence result in [6] is achieved in \(W^{1,4/3}(\Omega )\). This is a general theme in the study of numerical schemes for nonlinear elliptic systems: optimal convergence rates are achieved in norms adapted to the operator, often referred to as quasi-norms. For example, such quasi-norms were studied for the p-Laplace equation in [37], systems of the form \(b(x,z)=div \,k(x,|z|)z\) in [36] and for systems satisfying Orlicz-type growth conditions in [16]. Specialising to reference operators satisfying such assumptions, we believe it should be possible to perform an analysis of the numerical scheme proposed here, in the framework of appropriate quasi-norms. However, for simplicity and also in order to keep our assumptions on the reference operator b(x, z) minimal, we restrict our analysis to the question of \(W^{1,p}\)-convergence.

In [32], the iterative process is used to derive sharp conditions on the dispersion \(\Lambda _a/\lambda _a\) of elliptic systems with quadratic growth that guarantee \(C^{0,\alpha }\)-regularity of solutions. An alternative proof was suggested in [34]. Further, under additional assumptions, conditions for \(C^{1,\alpha }\)-regularity of solutions are presented. In this spirit, we obtain Cordes–Nirenberg type results with regard to Calderón–Zygmund estimates for perturbations of fields of p-Laplace type. We recall a result of [7].

Theorem 2

(Theorem 2.1 in [7]) Let \(1<p<\infty \), \(q_0\in [1,\infty )\) and \({\Omega \subset \mathbb {R}^n}\) be a bounded domain with Lipschitz boundary. Suppose b(x, z) satisfies (A1)-(A3) with \(\mu =0\) and assume in addition that there exists a bounded non-negative function \({\Phi :[0,\infty )\rightarrow [0,\infty )}\) with \(\lim _{\rho \rightarrow \infty }\Phi (\rho )=0\) such that

$$\begin{aligned} |b(x,z)-|z|^{p-2}z|\le \Phi (|z|)(1+|z|^{p-1}). \end{aligned}$$

Then there is a small constant \(\delta _0>0\) such that for any \(q\in [1,q_0]\) and \({\delta \in (0,\delta _0]}\), if \(\partial \Omega \) has Lipschitz constant smaller than \(\delta \) and if \(g\in W^{1,pq}(\Omega )\), \(F\in L^{p'q}(\Omega )\), then there exists a unique weak solution \(u\in W^{1,p}_g(\Omega )\) of \(div \,b(x,D u)=div \,F\). Moreover, there is a constant \(C_0>0\) so that

$$\begin{aligned} \int _\Omega |D u|^{pq} \,\mathrm {d}x\le C_0\left( 1+\Vert D g\Vert _{L^{pq}(\Omega ,\mathbb {R}^{n\times N})}+\Vert F\Vert _{L^{p' q}(\Omega ,\mathbb {R}^{n\times N})}^\frac{1}{p-1}\right) . \end{aligned}$$

Here \(C_0\) and \(\delta _0\) depend only on \(n,N,p,q_0,\lambda _b,\Lambda _b\) and \(\Omega \). In particular, this implies that \(u\in W^{1,pq}_g(\Omega )\).

We obtain the following extension:

Theorem 3

Let \(1<p<\infty \), \(q_0\in [1,\infty )\). Suppose \(\Omega \) is a Lipschitz domain. Let \(q\in [1,q_0]\). Take \(f\in L^{p'q}(\Omega )\) and \(g\in W^{1,pq}(\Omega )\). Assume for some field b the assumptions of Theorem 2 are satisfied and let \(C_0\) be the constant defined there. Assume that a satisfies assumptions (A1)-(A3) and (A4) with \(\mu =0\) and that the assumptions of Theorem 1 are satisfied. If

$$\begin{aligned} \dfrac{C_0^{p-1}\Lambda _b K_{a,b}}{(p-1)}<1, \end{aligned}$$

then the solution u of (1.1) satisfies

$$\begin{aligned} \Vert D u\Vert _{L^{pq}(\Omega )}\lesssim 1+\Vert D g\Vert _{L^{pq}(\Omega )}+\Vert f\Vert _{L^{p'q}(\Omega )}^\frac{1}{p-1}. \end{aligned}$$
(1.7)

Further, we obtain a similar result in the scalar setting with regard to weighted estimates for perturbations of equations of p-Laplace type. We recall the following special case of the main result of [46]:

Theorem 4

(c.f. Theorem 2.1 in [46]) Suppose \(N=1\). Let \(1<p<q<\infty \) and let w be an \(A_{q/p}\) weight. Suppose \(\Omega \) is a \(C^1\)-domain. Suppose \(B\in VMO(\Omega )\) and B satisfies

$$\begin{aligned} \lambda |\xi |^2\le B(x)\xi \cdot \xi \le \Lambda |\xi |^2. \end{aligned}$$

Set \(b(x,z)=(B(x)z\cdot z)^\frac{p-2}{2} z\). Then there exists a positive constant \(C_1>0\) such that the following holds. For a given vector field \({F\in L^\frac{q}{p-1}_w(\mathbb {R}^n)}\), there is a unique weak solution \({u\in W^{1,p}_0(\Omega )}\) of \(div \,b(x,D u)=div \,F\), satisfying \(D u\in L^q_w(\Omega )\) with the estimate

$$\begin{aligned} \Vert D u\Vert _{L^q_w(\Omega )}\le C_1 \Vert F\Vert _{L^\frac{q}{p-1}_w(\Omega )}. \end{aligned}$$

Here the constant \(C_1\) depends only on \(n, p, q, R, \lambda ,\Lambda \) and \([w]_{q/ p}\).

We use the Koshelev iteration to obtain the following extension.

Theorem 5

Suppose \(N=1\). Let \(1<p<q<\infty \) and \(w\in A_{q/p}\). Assume \(\Omega \) is a \(C^1\) domain and let \(g=0\), \(f\in L^\frac{q}{p-1}_w(\Omega )\). Suppose b satisfies the assumptions of Theorem 4. Let \(C_1\) be the constant from Theorem 4. Suppose a, b satisfy the assumptions of Theorem 1. If

$$\begin{aligned} \frac{C_1^{(p-1)/q} K_{a,b}\Lambda _b}{(p-1)}<1, \end{aligned}$$

then the solution v of (1.1) satisfies the estimate

$$\begin{aligned} \Vert D v\Vert _{L^q_w(\Omega )}\lesssim 1+\Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}^\frac{1}{p-1} \end{aligned}$$
(1.8)

The structure of this paper is as follows. In Sect. 2 we explain our notation and present a number of preliminary results. In Sect. 3 we show convergence in \(W^{1,p}(\Omega )\) of the iterative process. In Sect. 4 we present our numerical experiments regarding the iterative process. In Sect. 5 we use the iterative process to prove higher differentiability and weighted estimates of perturbations to equations with p-Laplace structure.

2 Preliminaries and notation

Throughout, \(\Omega \subset \mathbb {R}^n\) will be a domain. c will denote a positive constant depending only on \(\Omega ,N,p\) that may change from line to line. We write \(a\lesssim b\) if there is \(c>0\) depending only on \(\Omega ,N,p\) such that \(a\le c b\).

Let \(1\le p<\infty \). We denote by \(p'\) the Hölder conjugate of p and write \(L^p(\Omega )=L^p(\Omega ,\mathbb {R}^N)\) and \(W^{1,p}(\Omega )=W^{1,p}(\Omega ,\mathbb {R}^N)\) for the usual Lebesgue and Sobolev spaces respectively. \(VMO(\Omega )\) denotes the space of maps with vanishing mean oscillation. We also employ the standard fractional Sobolev spaces \(W^{k,p}(\Omega )\) where \({k\in (0,\infty )}\), whose theory can be found for example in [51]. We recall in particular the following fact from [26]: if \(p\in (1,\infty )\), \(\alpha >0\) and \(\Omega \) is a regular polytope, then for \(u\in W^{1+\alpha ,p}(\Omega )\) the best approximation v to u in the space of continuous piecewise linear functions subordinate to a shape-regular triangulation of \(\Omega \) of mesh-size h satisfies

$$\begin{aligned} \Vert u-v\Vert _{W^{1,p}_\infty (\Omega )}\lesssim h^\alpha \Vert u\Vert _{W^{1+\alpha ,p}_\infty (\Omega )}. \end{aligned}$$
(2.1)

For \(z\in \mathbb {R}^n\), \(r>0\) we denote by \(B_r(z)\) the open ball of radius r around z.

Given vectors \(u, v\in \mathbb {R}^n\) we denote by |v| the Euclidean norm and we denote the inner product in \(\mathbb {R}^n\) by \(u\cdot v\). Given a matrix \(A\in \mathbb {R}^{n \times n}\) we denote by \(\Vert A\Vert \) the operator norm of A. If A is positive definite, we denote its smallest eigenvalue by \(\lambda _A\) and its largest by \(\Lambda _A\). In particular when \(a\equiv a(x,z):\Omega \times \mathbb {R}^{N\times n}\rightarrow \mathbb {R}^{N\times n}\) is such that \(a(x,\cdot )\) is differentiable in \(\mathbb {R}^{N\times n}\) for almost every \(x\in \Omega \), we view \(\partial _z a(x,z)\) both as a matrix and as a linear form. If \(\partial _z a(x,z)\) is positive definite, we denote \(\lambda _a = \lambda _{\partial _z a(x,z)}\), \(\Lambda _a=\Lambda _{\partial _z a(x,z)}\).

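We recall the definition of Muckenhoupt weights. For a fixed \(1< p <\infty \), we say that a weight \(w :\mathbb {R}^n\rightarrow [0,\infty )\) belongs to \(A_p\) if w is locally integrable and there is a constant \(C>0\) such that, for all balls \(B_r(z)\subset \mathbb {R}^n\), we have

$$\begin{aligned} \left( \frac{1}{|B_r(z)|}\int _{B_r(z)} w \,\mathrm {d}x\right) \left( \frac{1}{|B_r(z)|}\int _{B_r(z)} w^{-\frac{1}{p-1}} \,\mathrm {d}x\right) ^{p-1}\le C. \end{aligned}$$

The smallest such constant is denoted by \([w]_p\).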

Given a weight w on \(\Omega \), we denote the weighted Lebesgue-spaces by \(L^p_w(\Omega )\).

Given \(\mu \ge 0\), \(p\in (1,\infty )\) and \(v\in \mathbb {R}^n\) we write \(V_{\mu ,p}(v) = (\mu ^2+|v|^2)^\frac{p-2}{4}v\). When the choice of p is clear from the context we suppress the index and write \(V_{\mu }(v)=V_{\mu ,p}(v)\).

We recall some well-known tools for dealing with p-growth. The following Lemma is standard, but for our application it is desirable that all constants tend to 1 as \(\gamma \rightarrow 0\). The author has been unable to find a reference in the literature with bounds having this behaviour and thus a proof of the version detailed below can be found in the appendix.

Lemma 2.1

Let \(\xi ,\eta \in \mathbb {R}^m\) for some \(m>0\). For \(\gamma \ge 0\) and \(\mu \ge 0\) we have

$$\begin{aligned} \frac{1}{6^{\gamma }(2\gamma +1)}(\mu ^2+|\eta |^2+|\eta -\xi |^2)^\gamma&\le \int _0^1 (\mu ^2+|t\xi +(1-t)\eta |^2)^\gamma \,\mathrm {d}t\\&\le 2^\gamma (\mu ^2+|\eta |^2+|\eta -\xi |^2)^\gamma \end{aligned}$$

If \(\gamma \in (-1/2,0]\) we have

$$\begin{aligned} 2^\gamma (\mu ^2+|\eta |^2+|\eta -\xi |^2)^\gamma&\le \int _0^1 (\mu ^2+|t\xi +(1-t) \eta |^2)^\gamma \,\mathrm {d}t\\&\le \frac{1}{4^\gamma (\gamma +1)}(\mu ^2+|\eta |^2+|\xi -\eta |^2)^\gamma \end{aligned}$$

We also recall that the \(V_\mu \)-functional enjoys a Young-type inequality.

Lemma 2.2

(cf. [1], Lemma 2.3) Let \(x,y\in \mathbb {R}^{N \times n}\), \(\mu \ge 0\) and \(1\le p\). Let \(\varepsilon >0\). Then with

$$\begin{aligned} C_\varepsilon = \max \left( \frac{1}{4\varepsilon },\frac{(p-1)^{p-1}}{p^p\varepsilon ^{p-1}}\right) , \end{aligned}$$

it holds that

$$\begin{aligned} (\mu ^2+|x|^2)^\frac{p-2}{2}x\cdot y\le \varepsilon |V_\mu (x)|^2+ C_\varepsilon |V_\mu (y)|^2. \end{aligned}$$

Further we note the following elementary estimate:

Lemma 2.3

For \(a,b\in \mathbb {R}^n\) we have

$$\begin{aligned} \frac{3-\sqrt{5}}{2}(|a|^2+|b|^2)\le |a|^2+|a-b|^2\le \frac{3+\sqrt{5}}{2}(|a|^2+|b|^2) \end{aligned}$$
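The constants in Lemma 2.3 can be understood as follows: \(|a|^2+|a-b|^2=2|a|^2-2a\cdot b+|b|^2\) is a quadratic form in (a, b) whose coefficient matrix has trace 3 and determinant 1, hence eigenvalues \(\frac{3\pm \sqrt{5}}{2}\). The Python sketch below checks the two-sided bound on random samples; the helper names, the sample size and the tolerance are illustrative assumptions.

```python
import math
import random

LOWER = (3 - math.sqrt(5)) / 2
UPPER = (3 + math.sqrt(5)) / 2

def sq_norm(v):
    """Squared Euclidean norm of a vector given as a list."""
    return sum(x * x for x in v)

def lemma_2_3_holds(a, b, tol=1e-12):
    """Check LOWER (|a|^2+|b|^2) <= |a|^2+|a-b|^2 <= UPPER (|a|^2+|b|^2)."""
    middle = sq_norm(a) + sq_norm([x - y for x, y in zip(a, b)])
    total = sq_norm(a) + sq_norm(b)
    return LOWER * total - tol <= middle <= UPPER * total + tol

random.seed(0)
samples = [([random.uniform(-1, 1) for _ in range(3)],
            [random.uniform(-1, 1) for _ in range(3)]) for _ in range(1000)]
all_ok = all(lemma_2_3_holds(a, b) for a, b in samples)
```

A random test of this kind does not replace the proof, but it is a cheap guard against transcription errors in the constants.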

We close this section by recalling a linear algebra result from [32].

Lemma 2.4

(c.f. Lemma 1.1.2 in [32]) Let \(A\in \mathbb {R}^{n\times n}\) be positive definite. Write \(A=A^++A^-\) where \(A^+, A^-\) are the symmetric and skew-symmetric part of A, respectively. Set \({C=A^+A^--A^-A^+-(A^-)^2}\) and denote by \(\sigma \) the largest eigenvalue of C. Further set

$$\begin{aligned} K_\gamma {:=} \inf _{\gamma >0}\Vert I-\gamma A\Vert . \end{aligned}$$

Then \(K_\gamma \) is attained with the choice

$$\begin{aligned} {\left\{ \begin{array}{ll} \gamma = \dfrac{\lambda _{A}}{\sigma + \lambda _{A}^2}\qquad &{} \text { if } \sigma \ge \dfrac{\lambda _{A}(\Lambda _{A}-\lambda _{A})}{2}\\ \gamma = \dfrac{2}{\Lambda _{A}+\lambda _{A}}\qquad &{}\text { if } \sigma \le \dfrac{\lambda _{A}(\Lambda _{A}-\lambda _{A})}{2} \end{array}\right. } \end{aligned}$$

and takes the value

$$\begin{aligned} {\left\{ \begin{array}{ll} K_\gamma ^2 = \dfrac{\sigma }{\sigma + \lambda _{A}^2} \quad &{} \text { if } \sigma \ge \dfrac{\lambda _{A}(\Lambda _{A}-\lambda _{A})}{2}\\[20pt] K_\gamma ^2 = \dfrac{(\Lambda _{A}-\lambda _{A})^2+4\sigma }{(\Lambda _{A}+\lambda _{A})^2} \quad &{} \text { if } \sigma \le \dfrac{\lambda _{A}(\Lambda _{A}-\lambda _{A})}{2}. \end{array}\right. } \end{aligned}$$
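The case distinction of Lemma 2.4 translates directly into a small routine. The following Python sketch is one way to write it; the function name `optimal_gamma` and the sample values are assumptions made for illustration.

```python
import math

def optimal_gamma(lam, Lam, sigma):
    """Return (gamma, K) following the case distinction of Lemma 2.4.

    lam, Lam -- smallest and largest eigenvalue of A, 0 < lam <= Lam
    sigma    -- largest eigenvalue of C (sigma = 0 if A is symmetric)
    """
    threshold = lam * (Lam - lam) / 2
    if sigma >= threshold:
        gamma = lam / (sigma + lam ** 2)
        K = math.sqrt(sigma / (sigma + lam ** 2))
    else:
        gamma = 2 / (Lam + lam)
        K = math.sqrt(((Lam - lam) ** 2 + 4 * sigma) / (Lam + lam) ** 2)
    return gamma, K

# Hypothetical symmetric example: lam = 1, Lam = 3, sigma = 0.
gamma_sym, K_sym = optimal_gamma(1.0, 3.0, 0.0)
```

In the symmetric case \(\sigma =0\) this gives \(\gamma =\frac{2}{\Lambda _A+\lambda _A}\) and \(K_\gamma =\frac{\Lambda _A-\lambda _A}{\Lambda _A+\lambda _A}\). The two branches agree at the threshold \(\sigma =\frac{\lambda _A(\Lambda _A-\lambda _A)}{2}\), so the returned values depend continuously on \(\sigma \).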

Remark 2.5

An inspection of the proof shows that if \(\lambda \le \lambda _A\), \(\Lambda _A\le \Lambda \) and \(\sigma \le \sigma '\), then with

$$\begin{aligned} {\left\{ \begin{array}{ll} K_\gamma ^2 = \dfrac{\sigma '}{\sigma ' + \lambda ^2} \quad \gamma = \dfrac{\lambda }{\sigma ' + \lambda ^2} \quad &{} \text { if } \sigma ' \ge \dfrac{\lambda (\Lambda -\lambda )}{2}\\[20pt] K_\gamma ^2 = \dfrac{(\Lambda -\lambda )^2+4\sigma '}{(\Lambda +\lambda )^2} \quad \gamma = \dfrac{2}{\Lambda +\lambda }\qquad &{} \text { if } \sigma ' \le \dfrac{\lambda (\Lambda -\lambda )}{2}, \end{array}\right. } \end{aligned}$$

it still holds that \(\Vert I-\gamma A\Vert \le K_\gamma \).

3 The iterative process

We call \(b:\Omega \times \mathbb {R}^{N \times n}\rightarrow \mathbb {R}^{N\times n}\) a reference field if the following holds: b satisfies (A1)-(A3) and (A4). In addition, whenever \(F\in L^{p'}(\Omega ,\mathbb {R}^{N \times n})\) and \(g\in W^{1,p}(\Omega ,\mathbb {R}^N)\), there exists a unique solution \(u\in W^{1,p}_g(\Omega )\) to the problem

$$\begin{aligned} div \,b(x,D u) = div \,F. \end{aligned}$$
(3.1)

(3.1) is to be understood in the sense of distributions.

Remark 3.1

The p-Laplacian \(|z|^{p-2}z\) is an example of such a field. We encourage the reader to think of this case on a first reading.

Recall the definition of the iterative process. Let \({u_0\in W^{1,p}_g(\Omega )}\) and \({f\in L^{p'}(\Omega )}\). Take \(\gamma >0\) to be chosen later. Define \(u_{n+1}\in W^{1,p}_g(\Omega )\) to be the weak solution of the problem (3.1) with the choice

$$\begin{aligned} F = b(x,D u_n)-\gamma a(x,D u_n)+ \gamma f. \end{aligned}$$
(3.2)

Note that (A3) ensures that \(F\in L^{p'}(\Omega ,\mathbb {R}^{N \times n})\), so that the sequence \(\{u_n\}\) is well-defined.

We want to show that \(u_n\rightarrow u\) in \(W^{1,p}(\Omega )\) where u is a solution of (1.1). The key to proving convergence is the following observation from linear algebra:

Lemma 3.2

Suppose \(A,B\in \mathbb {R}^{m\times m}\) are positive definite and symmetric. Then, with the choice \(\gamma = \frac{2}{\Lambda _{B^{-1}A}+\lambda _{B^{-1}A}}\) and

$$\begin{aligned} K=\frac{\Lambda _{B^{-1}A}-\lambda _{B^{-1}A}}{\Lambda _{B^{-1}A}+\lambda _ {B^{-1}A}}<1, \end{aligned}$$

the estimate

$$\begin{aligned} \Vert B-\gamma A\Vert \le K\Vert B\Vert \end{aligned}$$

holds.

Proof

Applying Lemma 2.4, and noting that since A, B are symmetric, \({\sigma =0}\), there holds

$$\begin{aligned} \Vert B-\gamma A\Vert&\le \Vert B\Vert \Vert I-\gamma B^{-1}A\Vert = \Vert B\Vert \frac{\Lambda _{B^{-1}A}-\lambda _{B^{-1}A}}{\Lambda _{B^{-1}A}+\lambda _{B^{-1}A}} = \Vert B\Vert K \end{aligned}$$

\(\square \)
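The estimate of Lemma 3.2 is easy to test numerically. The sketch below does so for a pair of symmetric positive definite \(2\times 2\) matrices, using closed-form eigenvalue formulas so that only the Python standard library is needed. The matrices and all helper names are hypothetical choices for illustration.

```python
import math

def sym_eigs(m):
    """Eigenvalues (small, large) of a symmetric 2x2 matrix [[p, q], [q, r]]."""
    (p, q), (_, r) = m
    mean, half = (p + r) / 2, math.hypot((p - r) / 2, q)
    return mean - half, mean + half

def op_norm(m):
    """Euclidean operator norm of a symmetric 2x2 matrix."""
    return max(abs(e) for e in sym_eigs(m))

def gen_eigs(A, B):
    """Eigenvalues of B^{-1}A, computed as the roots t of det(A - t B) = 0."""
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    c2 = b11 * b22 - b12 ** 2                   # det B
    c1 = a11 * b22 + a22 * b11 - 2 * a12 * b12
    c0 = a11 * a22 - a12 ** 2                   # det A
    disc = math.sqrt(max(c1 ** 2 - 4 * c2 * c0, 0.0))
    return (c1 - disc) / (2 * c2), (c1 + disc) / (2 * c2)

# Hypothetical symmetric positive definite matrices.
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, 0.2], [0.2, 3.0]]

lam, Lam = gen_eigs(A, B)
gamma = 2 / (Lam + lam)
K = (Lam - lam) / (Lam + lam)
D = [[B[i][j] - gamma * A[i][j] for j in range(2)] for i in range(2)]
lhs, rhs = op_norm(D), K * op_norm(B)
```

For these matrices `lhs` does not exceed `rhs`, in line with the lemma. Since A and B are symmetric, so is \(B-\gamma A\), and its operator norm is the largest absolute value of its eigenvalues, which is what `op_norm` computes.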

Remark 3.3

Due to Remark 2.5, if a, b satisfy (A1), (A2) and (A4), then with

$$\begin{aligned} K_{a,b} = \frac{\Lambda _{a,b}-\lambda _{a,b}}{\Lambda _{a,b}+\lambda _{a,b}} \qquad \gamma = \frac{2}{\Lambda _{a,b}+\lambda _{a,b}} \end{aligned}$$

it holds for almost every \(x\in \Omega \) and every \(z\in \mathbb {R}^{N\times n}\) that

$$\begin{aligned} \Vert \partial _z b(x,z)-\gamma \partial _z a(x,z)\Vert \le K_{a,b}\Vert \partial _z b(x,z)\Vert . \end{aligned}$$

In general, we set

$$\begin{aligned} K_\gamma {:=} \inf \{c>0:\Vert \partial _z b(x,z)-\gamma \partial _z a(x,z)\Vert \le c \Vert \partial _z b(x,z)\Vert \text { for } x\in \Omega , z\in \mathbb {R}^{N\times n}\}. \end{aligned}$$
(3.3)

In the case where A, B are non-symmetric, the choice of \(\gamma \) and K needs to be modified with the obvious adaptations to the above proof coming from Lemma 2.4. With these modifications the symmetry assumption (A4) may be dropped in all results of this paper.

We are now able to prove Theorem 1, which we restate for the convenience of the reader.

Theorem 6

Suppose \(b:\Omega \times \mathbb {R}^{N \times n}\rightarrow \mathbb {R}^{N\times n}\) is a reference field. Consider \({a:\Omega \times \mathbb {R}^{N \times n}\rightarrow \mathbb {R}^{N\times n}}\) satisfying (A1)-(A3) and (A4). If

$$\begin{aligned} {\left\{ \begin{array}{ll} \dfrac{\Lambda _b}{\lambda _b}K_{a,b} \left( \dfrac{3+\sqrt{5}}{2}\right) ^\frac{p-2}{2p} 6^{p-2}(p-1)< 1 &{}\quad \text { for } p\ge 2 \\ \dfrac{\Lambda _b}{\lambda _b}K_{a,b} \left( \dfrac{3-\sqrt{5}}{2}\right) ^\frac{(p-2)}{4}\dfrac{2^{2-p}}{p-1}< 1 &{}\quad \text { for } p\le 2 \end{array}\right. } \end{aligned}$$

then (1.1) has a unique solution \(u\in W^{1,p}_g(\Omega )\). Moreover, there is \(\gamma >0\) such that if the sequence \({u_n\in W^{1,p}_g(\Omega )}\) is generated via (3.2), then \(u_n\rightarrow u\) in \(W^{1,p}(\Omega )\).

Remark 3.4

From the proof it will be clear that, in fact, \(\{u_n\}\) generated via (3.2) converges to the weak solution of (1.1) whenever \(\gamma >0\) is such that

$$\begin{aligned} {\left\{ \begin{array}{ll} R^{p-1} =\dfrac{\Lambda _b}{\lambda _b}K_\gamma \left( \dfrac{3+\sqrt{5}}{2} \right) ^\frac{p-2}{2p} 6^{p-2}(p-1)<1 &{}\quad \text { if } p\ge 2 \\ R^{p/2}=\dfrac{\Lambda _b}{\lambda _b}K_\gamma \left( \dfrac{3-\sqrt{5}}{2}\right) ^\frac{(p-2)}{4}\dfrac{2^{2-p}}{p-1}< 1 &{}\quad \text { if } p\le 2. \end{array}\right. } \end{aligned}$$

Here \(K_\gamma \) is the constant from (3.3). Moreover, considering Remark 3.3 and noting that

$$\begin{aligned} \Lambda _{a,b}\le \frac{\Lambda _a}{\lambda _b} \qquad \lambda _{a,b}\ge \frac{\lambda _a}{\Lambda _b}, \end{aligned}$$

we can estimate

$$\begin{aligned} \inf _\gamma K_\gamma \le K_{a,b} \le \frac{\Lambda _a\Lambda _b-\lambda _a\lambda _b}{\Lambda _a\Lambda _b+\lambda _a\lambda _b}. \end{aligned}$$

Further, the convergence occurs at linear rate R. To be precise, the following estimates hold:

$$\begin{aligned} \Vert u_{n+1}-u_n\Vert _{W^{1,p}(\Omega )}&\le R^n \Vert u_1-u_0\Vert _{W^{1,p}(\Omega )}\\ \Vert u_{n+1}-u\Vert _{W^{1,p}(\Omega )}&\le R^n \Vert u_0-u\Vert _{W^{1,p}(\Omega )}, \end{aligned}$$

where u is the weak solution of (1.1).
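To get a feel for how restrictive the smallness condition is, the following minimal sketch evaluates the upper bound on \(K_{a,b}\) and the rate R exactly as written in Remark 3.4. The function names `k_upper_bound` and `rate` are our own and purely illustrative; the sketch only transcribes the displayed formulas and does not compute \(K_\gamma \) from (3.3).

```python
import math

def k_upper_bound(lam_a, Lam_a, lam_b, Lam_b):
    """Upper bound on K_{a,b} from Remark 3.4 in terms of the ellipticity constants."""
    return (Lam_a * Lam_b - lam_a * lam_b) / (Lam_a * Lam_b + lam_a * lam_b)

def rate(p, lam_b, Lam_b, K):
    """Rate R from Remark 3.4; the remark gives R^(p-1) for p >= 2 and R^(p/2) for p <= 2."""
    ratio = Lam_b / lam_b
    if p >= 2:
        golden = (3.0 + math.sqrt(5.0)) / 2.0
        power = ratio * K * golden ** ((p - 2) / (2 * p)) * 6.0 ** (p - 2) * (p - 1)
        return power ** (1.0 / (p - 1))
    golden = (3.0 - math.sqrt(5.0)) / 2.0
    power = ratio * K * golden ** ((p - 2) / 4) * 2.0 ** (2 - p) / (p - 1)
    return power ** (2.0 / p)

# Illustrative values (assumptions): p = 2 with the identity as reference field.
K = k_upper_bound(lam_a=1.0, Lam_a=3.0, lam_b=1.0, Lam_b=1.0)
R = rate(2, 1.0, 1.0, K)
```

For \(p=2\) both branches reduce to \(R=\frac{\Lambda _b}{\lambda _b}K\), while for larger p the factor \(6^{p-2}(p-1)\) forces K to be considerably smaller before \(R<1\) holds.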

Before presenting the proof we want to briefly outline the main idea. Consider the sequence \((u_n)\) generated via (3.2). Subtracting the equations defining \(u_{n+1}\) and \(u_n\) we find

$$\begin{aligned}&div \,\left( b(x,D u_{n+1})-b(x,D u_n)\right) \\&=div \,\Big (b(x,D u_n)-b(x,D u_{n-1})-\gamma \big (a(x,D u_n)-a(x,D u_{n-1})\big )\Big ) \end{aligned}$$

Using the mean-value theorem and denoting \(A(x,z)=\partial _z a(x,z)\), as well as \({B(x,z)=\partial _z b(x,z)}\), we may rewrite this as

$$\begin{aligned} div \,\left( B(x,\xi _{n+1})D (u_{n+1}-u_n)\right) = div \,\left( (B-\gamma A)(x,{\tilde{\xi }}_n)D (u_n-u_{n-1})\right) \end{aligned}$$

for some \(\xi _{n+1}\) lying on the line segment between \(D u_{n+1}\) and \(D u_n\) and \({\tilde{\xi }}_n\) lying on the line segment between \(D u_n\) and \(D u_{n-1}\). If \(\Vert B(x,z)-\gamma A(x,z)\Vert \) is sufficiently small, uniformly in x and z, then applying the ellipticity assumption to bound the left-hand side from below, we are able to show that for some \(R<1\),

$$\begin{aligned} \Vert D (u_{n+1}-u_n)\Vert _{L^p(\Omega )}\le R\Vert D (u_n-u_{n-1})\Vert _{L^p(\Omega )}. \end{aligned}$$

We then conclude easily.
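The contraction mechanism is easiest to see in a finite-dimensional caricature. The sketch below is only an analogy under our own simplifying assumptions: \(p=2\), the reference field is the identity, and a is replaced by a symmetric positive definite matrix acting on vectors, so that one step of the iteration reads \(u_{n+1}=u_n-\gamma (Au_n-f)\). The names `matvec`, `norm` and `koshelev_linear` are hypothetical helpers.

```python
import math

def matvec(A, v):
    """Dense matrix-vector product for small lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))

def koshelev_linear(A, f, gamma, steps):
    """Iterate u_{n+1} = u_n - gamma * (A u_n - f) starting from u_0 = 0.

    Successive differences satisfy d_{n+1} = (I - gamma * A) d_n, so for
    symmetric A they contract with factor max |1 - gamma * eigenvalue|.
    """
    u = [0.0] * len(f)
    iterates = [u]
    for _ in range(steps):
        Au = matvec(A, u)
        u = [ui - gamma * (aui - fi) for ui, aui, fi in zip(u, Au, f)]
        iterates.append(u)
    return iterates

# Toy data (assumption): eigenvalues of A are (5 -+ sqrt(5)) / 2.
A = [[2.0, 1.0], [1.0, 3.0]]
f = [1.0, 1.0]
gamma = 2.0 / 5.0  # 2 / (smallest + largest eigenvalue)
iterates = koshelev_linear(A, f, gamma, 60)
```

With this choice of \(\gamma \) the factor is \((\Lambda -\lambda )/(\Lambda +\lambda )=\sqrt{5}/5\), which has the same structure as the bound on \(K_{a,b}\) in Remark 3.4.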

Proof

Let \(\gamma \) be as in Remark 3.3 and write \(K=K_{a,b}\). Throughout the proof we write \(A(x,z)= \partial _z a(x,z)\) and \(B(x,z)=\partial _z b(x,z)\).

Let \(n\ge 1\). Test the equations defining \(u_{n+1}\) and \(u_n\) against \(u_{n+1}-u_n\) to find:

$$\begin{aligned}&\int _\Omega \big (b(x,D u_{n+1})-b(x,D u_n)\big )D (u_{n+1}-u_n)\,\mathrm {d}x\\&\quad =\int _\Omega \big (b(x,D u_n)-b(x,D u_{n-1})\big )D (u_{n+1}-u_n)\\&\qquad \,-\gamma \big (a(x,D u_n)-a(x,D u_{n-1})\big )D (u_{n+1}-u_n)\,\mathrm {d}x\\&\quad =\int _\Omega \int _0^1 \big (B(x,z_n(\theta ))-\gamma A(x,z_n(\theta ))\big )D (u_n-u_{n-1})\cdot D (u_{n+1}-u_n)\ d\theta \,\mathrm {d}x. \end{aligned}$$

Here \(z_n(\theta ) = (1-\theta ) D u_{n-1}+ \theta D u_n\). The last line follows from an application of the mean value theorem. Now proceed to estimate both sides of this equality.

We focus first on the case \(p\ge 2\). Using again the mean-value theorem and writing \(z_{n+1}(\theta ) = (1-\theta ) D u_{n+1}+ \theta D u_n\), by assumption (A2), the left-hand side gives:

$$\begin{aligned}&\int _\Omega \left( b(x,D u_{n+1})-b(x,D u_n)\right) D (u_{n+1}-u_n)\,\mathrm {d}x\\&\quad = \int _\Omega \int _0^1 B(x,z_{n+1}(\theta ))D (u_{n+1}-u_n)\cdot D (u_{n+1}-u_n) \,\mathrm {d}\theta \,\mathrm {d}x\\&\quad \ge \lambda _b \int _\Omega \int _0^1 (\mu ^2+|z_{n+1}(\theta )|^2)^\frac{p-2}{2}| D (u_{n+1}-u_n)|^2\,\mathrm {d}\theta \,\mathrm {d}x\\&\quad \ge \frac{\lambda _b}{6^\frac{p-2}{2}(p-1)} \int _\Omega (\mu ^2+|D u_{n+1}|^2+|D u_{n+1}-D u_n|^2)^\frac{p-2}{2}|D (u_{n+1}-u_n)|^2\,\mathrm {d}x, \end{aligned}$$

where the last line uses Lemma 2.1.

Note that \(\Vert B(x,w)\Vert \le \Lambda _b(\mu ^2+|w|^2)^\frac{p-2}{2}\) by (A2). Hence on the right-hand side by Lemmas 3.2 and 2.3,

$$\begin{aligned}&\int _\Omega \int _0^1 \left( B(x,z_n(\theta ))-\gamma A(x,z_n(\theta ))\right) D (u_n-u_{n-1})\cdot D (u_{n+1}-u_n)\,\mathrm {d}\theta \,\mathrm {d}x\\&\le \Lambda _b K\int _\Omega \int _0^1 (\mu ^2+|z_n(\theta )|^2)^\frac{p-2}{2} |D u_n-D u_{n-1}||D u_{n+1}-D u_n|\,\mathrm {d}\theta \,\mathrm {d}x\\&\le \Lambda _b K \int _\Omega (\mu ^2+|D u_n|^2+|D u_n-D u_{n-1}|^2)^\frac{p-2}{2}\\&\qquad \times |D (u_n-u_{n-1})||D (u_{n+1}-u_n)|\,\mathrm {d}x =I. \end{aligned}$$

We now apply Lemma 2.2 and Lemma 2.1 to find for any \(\varepsilon >0\),

$$\begin{aligned} I&\le \Lambda _b K \Big (C_\varepsilon \int _\Omega V_{(\mu ^2+|D u_n|^2)^{1/2}}(D (u_{n+1}-u_{n}))^2\\&\quad +\varepsilon \int _\Omega V_{(\mu ^2+|D u_n|^2)^{1/2}}(D (u_n-u_{n-1}))^2\Big )\\&\le \Lambda _b K \Big (\varepsilon \int _\Omega V_{(\mu ^2+|D u_n|^2)^{1/2}}(D (u_n-u_{n-1}))^2+C_\varepsilon \left( \frac{3+\sqrt{5}}{2}\right) ^\frac{p-2}{2}\\&\quad \times \int _\Omega (\mu ^2+|D u_{n+1}|^2+|D u_{n+1}-D u_n|^2)^\frac{p-2}{2}|D u_{n+1}-D u_n|^2\Big ). \end{aligned}$$

Combining these two estimates and re-arranging gives:

$$\begin{aligned}&\int _\Omega (\mu ^2+|D u_{n+1}|^2+|D u_{n+1}-D u_n|^2)^\frac{p-2}{2}|D (u_{n+1}-u_n)|^2\,\mathrm {d}x\\&\quad \le \frac{\varepsilon C_1}{C_0-C_\varepsilon C_2} \int _\Omega (\mu ^2+|D u_n|^2+|D (u_n-u_{n-1})|^2)^\frac{p-2}{2}|D (u_n-u_{n-1})|^2\,\mathrm {d}x, \end{aligned}$$

where \(C_0 = \frac{\lambda _b}{6^\frac{p-2}{2}(p-1)}\), \(C_1 = \Lambda _b K\) and \(C_2 = (\frac{3+\sqrt{5}}{2})^\frac{p-2}{2}\Lambda _b K\).

Optimising in \(\varepsilon \) and using the hypothesis, this shows by induction that

$$\begin{aligned} \int _\Omega (|D u_n|^2+|D (u_{n+1}-u_n)|^2)^\frac{p-2}{2}|D (u_{n+1}-u_n)|^2\,\mathrm {d}x\rightarrow 0 \end{aligned}$$

at linear rate.

As \(p\ge 2\), it immediately follows that \(\{u_n\}\) converges in \(W^{1,p}(\Omega )\) at linear rate. Necessarily the limit u lies in \(W^{1,p}_g(\Omega )\) and it is a solution of problem (1.1). Moreover considering any solution \(v\in W^{1,p}_g(\Omega )\) of (1.1) and considering the above estimates with the starting point

$$\begin{aligned}&\int _\Omega (b(x,D u_{n+1})-b(x,D v))D (u_{n+1}-v) \,\mathrm {d}x\\&\quad = \int _\Omega (b(x,D u_n)-b(x,D v) -\gamma (a(x,D u_n)-a(x,D v)))\cdot D (u_{n+1}-v)\,\mathrm {d}x \end{aligned}$$

we find that, for any \(u_0\in W^{1,p}_g(\Omega )\), the corresponding iterative process converges to v in \(W^{1,p}_g(\Omega )\). In particular, the solution to (1.1) is unique.

We now consider \(p<2\). Proceed as before to obtain

$$\begin{aligned}&\int _\Omega \big (b(x,D u_{n+1})-b(x,D u_n)\big )D (u_{n+1}-u_n)\,\mathrm {d}x\\&\quad \ge \lambda _b \int _\Omega (\mu ^2+|D u_{n+1}|^2+|D u_{n+1}-D u_n|^2)^\frac{p-2}{2}|D (u_{n+1}-u_n)|^2\,\mathrm {d}x \end{aligned}$$

and

$$\begin{aligned}&\int _\Omega \int _0^1 \left( B(x,z_n(\theta ))-\gamma A(x,z_n(\theta ))\right) D (u_n-u_{n-1})\cdot D (u_{n+1}-u_n)\,\mathrm {d}\theta \,\mathrm {d}x\\&\quad \le \frac{\Lambda _b K}{2^{p-2}(p-1)} \Bigg (\left( \frac{3-\sqrt{5}}{2}\right) ^\frac{p-2}{2}C_\varepsilon \int _\Omega (\mu ^2+|D u_{n+1}|^2+|D u_{n+1}-D u_n|^2)^\frac{p-2}{2}\\&\qquad \times |D (u_{n+1}-u_n)|^2\,\mathrm {d}x \\&\qquad + \varepsilon \int _\Omega (\mu ^2+|D u_n|^2+|D u_n-D u_{n-1}|^2)^\frac{p-2}{2}|D (u_n-u_{n-1})|^2\,\mathrm {d}x\Bigg ). \end{aligned}$$

Combining these estimates and our assumptions, we again find that

$$\begin{aligned} \int _\Omega (|D u_n|^2+|D u_{n+1}|^2)^\frac{p-2}{2}|D (u_{n+1}-u_n)|^2\,\mathrm {d}x\rightarrow 0 \end{aligned}$$

at linear rate.

Note that by Hölder there is \(c>0\) such that

$$\begin{aligned}&\int _\Omega (|D u_n|+|D (u_{n+1}-u_n)|)^{p-2}|D (u_{n+1}-u_n)|^2\,\mathrm {d}x\\&\quad \ge c \frac{\Vert D (u_{n+1}-u_n)\Vert _p^2}{\Vert D u_n\Vert _p^{2-p}+\Vert D u_{n+1}\Vert _p^{2-p}}. \end{aligned}$$

If \(\{u_n\}\) is a bounded sequence in \(W^{1,p}(\Omega )\), convergence of \(\{u_n\}\) in \(W^{1,p}(\Omega )\) and existence and uniqueness of solutions to (1.1) follow by repeating the arguments of the case \(p\ge 2\). For clarity of presentation, and as it essentially follows by repeating arguments similar to those of this proof, we postpone the proof of boundedness to Lemma 3.5. \(\square \)

Lemma 3.5

Suppose the assumptions of Theorem 6 hold. Then \(\{u_n\}\) is bounded in \(W^{1,p}(\Omega )\).

Proof

Let \(\gamma \) be as in Remark 3.3 and write \(K = K_{a,b}\). We write \(A(x,z)=\partial _z a(x,z)\) and \(B(x,z)=\partial _z b(x,z)\).

Let \(n\ge 1\). Test the equation defining \(u_{n+1}\) against \(u_{n+1}\) to find:

$$\begin{aligned}&\int _\Omega \left( b(x,D u_{n+1})-b(x,0)\right) D u_{n+1}\,\mathrm {d}x\\&\quad =\int _\Omega \left( b(x,D u_n)-b(x,0)\right) D u_{n+1}\\&\quad \quad -\gamma \left( a(x,D u_n)-a(x,0)\right) D u_{n+1}+\gamma a(x,0)D u_{n+1}\ d x\\&\quad =\int _\Omega \int _0^1 \left( B(x,z_n(\theta ))-\gamma A(x,z_n(\theta ))\right) D u_n\cdot D u_{n+1}+\gamma a(x,0)D u_{n+1}\,\mathrm {d}\theta \,\mathrm {d}x. \end{aligned}$$

Here \(z_n(\theta ) = \theta D u_n\). The last line follows from an application of the mean value theorem. Now proceed to estimate both sides of this equality exactly as in Theorem 6.

We focus first on the case \(p\ge 2\). Using again the mean-value theorem and writing \(z_{n+1}(\theta ) = \theta D u_{n+1}\), by assumption (A2), the left-hand side gives:

$$\begin{aligned}&\int _\Omega \left( b(x,D u_{n+1})-b(x,0)\right) D u_{n+1}\,\mathrm {d}x\\&\quad = \int _\Omega \int _0^1 B(x,z_{n+1}(\theta ))D u_{n+1}\cdot D u_{n+1}\,\mathrm {d}\theta \,\mathrm {d}x\\&\quad \ge \lambda _b \int _\Omega \int _0^1 (\mu ^2+|z_{n+1}(\theta )|^2)^\frac{p-2}{2}|D u_{n+1}|^2\,\mathrm {d}\theta \,\mathrm {d}x\\&\quad \ge \frac{\lambda _b}{(p-1)6^\frac{p-2}{2}}\int _\Omega (\mu ^2+|D u_{n+1}|^2)^\frac{p-2}{2}|D u_{n+1}|^2\,\mathrm {d}x, \end{aligned}$$

where the last line uses Lemma 2.1.

Note that \(\Vert B(x,z)\Vert \le \Lambda _b(\mu ^2+|z|^2)^\frac{p-2}{2}\) by (A2). Hence on the right-hand side by Lemmas 3.2 and 2.1, as well as (A3)

$$\begin{aligned}&\int _\Omega \int _0^1 \left( B(x,z_n(\theta ))-\gamma A(x,z_n(\theta ))\right) D u_n\cdot D u_{n+1}+\gamma a(x,0)\cdot D u_{n+1}\,\mathrm {d}\theta \,\mathrm {d}x\\&\quad \le \Lambda _b K\int _\Omega \int _0^1 (\mu ^2+|z_n(\theta )|^2)^\frac{p-2}{2} |D u_n||D u_{n+1}|+C\gamma |D u_{n+1}|\mathrm {d\theta }\,\mathrm {d}x\\&\quad \le \Lambda _b K \int _\Omega (\mu ^2+|D u_n|^2)^\frac{p-2}{2}|D u_n||D (u_{n+1})|\,\mathrm {d}x + C\gamma \int _\Omega |D u_{n+1}|\,\mathrm {d}x = I + II. \end{aligned}$$

We now apply Lemma 2.2 to find

$$\begin{aligned} I&\le \Lambda _b K \left( C_\varepsilon \int _\Omega V_{\mu }(D u_{n+1})^2+\varepsilon \int _\Omega V_{\mu }(D u_n)^2\right) . \end{aligned}$$

Young’s inequality also gives

$$\begin{aligned} II&\le C\gamma \Vert D u_{n+1}\Vert _{L^p(\Omega )}\le \varepsilon _1\Vert D u_{n+1}\Vert _{L^p(\Omega )}^p + C(\varepsilon _1)\\&\le \varepsilon _1 \Vert V_\mu (D u_{n+1})\Vert _{L^2(\Omega )}^2+C(\varepsilon _1). \end{aligned}$$

Combining these two estimates and re-arranging gives:

$$\begin{aligned}&\int _\Omega (\mu ^2+|D u_{n+1}|^2)^\frac{p-2}{2}|D u_{n+1}|^2\,\mathrm {d}x\\&\quad \le \frac{\varepsilon C_1}{C_0-C_\varepsilon C_2} \int _\Omega (\mu ^2+|D u_n|^2)^\frac{p-2}{2}|D u_n|^2\,\mathrm {d}x\\&\quad \quad +\frac{1}{C_0-C_\varepsilon C_2}(C(\varepsilon _1)+\varepsilon _1\Vert V_\mu (D u_{n+1})\Vert _{L^2(\Omega )}^2), \end{aligned}$$

where \(C_0 = \frac{\lambda _b}{6^\frac{p-2}{2}(p-1)}\) and \(C_1 = C_2 = \Lambda _b K\).

Optimising in \(\varepsilon \), we can ensure that \(\frac{\varepsilon C_1}{C_0-C_\varepsilon C_1}<1\). Now choosing \(\varepsilon _1\) sufficiently small, we conclude that

$$\begin{aligned} \Vert V_\mu (D u_{n+1})\Vert _{L^2(\Omega )}^2\le \eta \Vert V_\mu (D u_n)\Vert _{L^2(\Omega )}^2 + C \end{aligned}$$
(3.4)

for some constants \(\eta \in (0,1)\) and \(C>0\). The conclusion follows using induction.

For \(p\le 2\) we argue similarly to again obtain (3.4), now with \(C_0 = \lambda _b\), \(C_1 = \Lambda _b K\), \(C_2 = \Lambda _b K (p-1)^{-1}4^{(2-p)/2}\). Optimising in \(\varepsilon \) we can again ensure that

$$\begin{aligned} \dfrac{\varepsilon C_1}{C_0-C_\varepsilon C_1}<1 \end{aligned}$$

and use this to conclude as before. \(\square \)
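The induction that closes both cases of the proof is the elementary fact that a recursion of the form \(x_{n+1}\le \eta x_n + C\) with \(\eta \in (0,1)\) stays bounded. A minimal sketch of this step, with hypothetical names `iterate_recursion` and `uniform_bound` and with \(x_n\) standing in for \(\Vert V_\mu (D u_n)\Vert _{L^2(\Omega )}^2\), is the following.

```python
def iterate_recursion(eta, C, x0, steps):
    """Worst case of x_{n+1} <= eta * x_n + C: take equality at every step."""
    xs = [x0]
    for _ in range(steps):
        xs.append(eta * xs[-1] + C)
    return xs

def uniform_bound(eta, C, x0):
    """Bound valid for every n.

    Unrolling gives x_n <= eta^n * x0 + (1 - eta^n) * C / (1 - eta), a convex
    combination of x0 and C / (1 - eta), hence at most their maximum.
    """
    return max(x0, C / (1.0 - eta))

# Illustrative values (assumptions, not constants from the paper).
xs = iterate_recursion(eta=0.9, C=2.0, x0=100.0, steps=200)
bound = uniform_bound(0.9, 2.0, 100.0)
```

The bound depends on \(\eta \), C and the starting value only, which is exactly the uniformity in n that Lemma 3.5 asserts.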

4 Numerical analysis and experiments

Throughout this section we assume that a(x, z) and b(x, z) satisfy assumptions (A1)-(A3) and (A4). We further assume that the assumptions of Theorem 6 hold. In particular, this will imply that, with an appropriate choice of \(\gamma \), the iterative process (3.2) converges in \(W^{1,p}(\Omega )\) to a solution of (1.1) and so, in particular, the iterates \(\{u_n\}\) are uniformly bounded in \(W^{1,p}(\Omega )\). We fix such a choice of \(\gamma \) from now on.

Recall that we study the following numerical scheme: Suppose \(\Omega \) is a regular polytope. Consider a sequence of shape-regular triangulations \(\{T_h\}_{h\in (0,1]}\) of \(\Omega \) of mesh-size h. Denote by \(X_h\) the space of continuous piecewise linear (subordinate to \(T_h\)) functions. Choose \(u_0\in X_h\). Define \(u_{n+1}^h\in X_h\) inductively as a solution of the problem: For all \(\phi \in X_h\),

$$\begin{aligned} {\left\{ \begin{array}{ll} \int _\Omega b(x,D u_{n+1}^h)\cdot D \phi \,\mathrm {d}x = \int _\Omega (b(x,D u_n^h)-\gamma a(x,D u_n^h)+\gamma f)\cdot D \phi \,\mathrm {d}x \\ u_{n+1}^h = 0 \quad \text { on } \partial \Omega \end{array}\right. } \end{aligned}$$
(4.1)

Note that, due to (A1)-(A3) and the theory of monotone operators, (4.1) is well-defined.
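To make the structure of (4.1) concrete, here is a minimal one-dimensional sketch under strong simplifying assumptions of our own: \(\Omega =(0,1)\), \(p=2\), \(b(x,z)=z\), \(a(x,z)=\alpha (x)z\) with \(\alpha \) piecewise constant on a uniform mesh, and piecewise linear elements. All function names are hypothetical, and the tridiagonal solver stands in for whatever linear solver is used in practice.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system without pivoting (adequate for the SPD stiffness matrix)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def gradients(u, h):
    """Elementwise constant derivative of a piecewise linear function."""
    return [(u[e + 1] - u[e]) / h for e in range(len(u) - 1)]

def scheme_step(u, alpha, f, gamma, h):
    """One step of (4.1) with b(x,z) = z and a(x,z) = alpha(x) z in 1D."""
    g = gradients(u, h)
    # Elementwise value of b(D u_n) - gamma * a(D u_n) + gamma * f.
    flux = [ge - gamma * ae * ge + gamma * fe for ge, ae, fe in zip(g, alpha, f)]
    m = len(u) - 2  # number of interior nodes
    # Testing against the hat function at interior node j gives flux[j-1] - flux[j].
    rhs = [flux[i] - flux[i + 1] for i in range(m)]
    off = [-1.0 / h] * m
    interior = thomas(off, [2.0 / h] * m, off, rhs)
    return [0.0] + interior + [0.0]  # homogeneous Dirichlet data

def residual(u, alpha, f, h):
    """Largest nodal residual of the discrete problem (4.2)."""
    g = gradients(u, h)
    r = [alpha[e] * g[e] - f[e] for e in range(len(g))]
    return max(abs(r[j - 1] - r[j]) for j in range(1, len(g)))

# Illustrative data (assumptions): 16 elements, alpha in [1, 3], gamma = 1/2.
M = 16
h = 1.0 / M
mids = [(e + 0.5) * h for e in range(M)]
alpha = [1.0 + 2.0 * x for x in mids]
f = [x * x for x in mids]
u = [0.0] * (M + 1)
for _ in range(80):
    u = scheme_step(u, alpha, f, 0.5, h)
```

In this toy setting each step only requires a solve with the stiffness matrix of the reference field, and the fixed point of the iteration satisfies the discrete problem (4.2).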

4.1 Analysis of the numerical scheme

In this section we present our main results regarding convergence, a-priori and a-posteriori estimates for (4.1). We begin with the following convergence result:

Theorem 7

Let \(u_0\in X_h\) and consider the sequence \(\{u_n^h\}\) generated by (4.1) starting from \(u_0\). Then \(u_n^h\rightarrow u^h\) in \(W^{1,p}(\Omega )\) where \(u^h\in X_h\) solves

$$\begin{aligned} {\left\{ \begin{array}{ll} \int _\Omega a(x,D u^h)\cdot D \phi \,\mathrm {d}x = \int _\Omega f\cdot D \phi \,\mathrm {d}x &{}\quad \forall \phi \in X_h\\ u^h = 0 &{}\quad \text { on } \partial \Omega . \end{array}\right. } \end{aligned}$$
(4.2)

Moreover there is \(0<C_1<1\) such that

$$\begin{aligned} \Vert u_{n+1}^h-u_n^h\Vert _{W^{1,p}(\Omega )}\le C_1^n\Vert u_1-u_0\Vert _{W^{1,p}(\Omega )} \end{aligned}$$

Proof

This follows from carrying out line by line the proof of Theorem 6, replacing \(\{u_n\}\) with \(\{u_n^h\}\) and u with \(u^h\) respectively. \(\square \)

We now wish to show that \(u^h\rightarrow u\) in \(W^{1,p}(\Omega )\) as \(h\rightarrow 0\) where \({u\in W^{1,p}_0(\Omega )}\) solves (1.1). In order to prove this we make the following regularity assumption on b: If \(F\in L^{p'}(\Omega )\) then there is \(\alpha >0\) such that (3.1) has a solution u in \(W^{1+\alpha ,p}(\Omega )\) and moreover we have the following estimate for some \(c>0\):

$$\begin{aligned} \Vert u\Vert _{W^{1+\alpha ,p}(\Omega )}\le c\left( 1+\Vert u\Vert _{W^{1,p}(\Omega )}+\Vert F\Vert _{L^{p'}(\Omega )}^{1/(p-1)}\right) \end{aligned}$$
(4.3)

Remark 4.1

(4.3) is satisfied for example when \(b(x,z)=|z|^{p-2}z\) with any choice of \(\alpha >0\) such that \({\alpha < \min \left( 1/(p-1)^2,(p-1)^2\right) }\), see [48].

We proceed to study the effect of decreasing the mesh-size h. We use the notation of Theorem 7.

Theorem 8

Set \(g=0\). Assume the assumptions of Theorem 7 hold. Suppose moreover that (4.3) is satisfied. Choose \(u_0\in X_h\) and let \(\{u_n^h\}\) be the sequence generated by (4.1). Suppose \(u\in W^{1,p}_0(\Omega )\) solves (1.1). Then \(u\in W^{1+\alpha ,p}(\Omega )\) and \(u^h\rightarrow u\) in \(W^{1,p}(\Omega )\) as \(h\rightarrow 0\). Moreover we have the estimate

$$\begin{aligned} \Vert u_{n+1}^h-u\Vert _{W^{1,p}(\Omega )}\le c h^\frac{2\alpha }{\max (2,p)}c\left( \Vert f\Vert _{L^{p'}(\Omega )}\right) +C_1^n\Vert u_1-u_0\Vert _{W^{1,p} (\Omega )}. \end{aligned}$$

Proof

Consider the iterative process \(\{u_n\}\) started from 0. By Theorem 6, \(u_n\rightarrow u\) in \(W^{1,p}(\Omega )\). Using (4.3), we find

$$\begin{aligned}&\Vert u_n\Vert _{W^{1+\alpha ,p}(\Omega )}\\&\quad \le c\left( 1+\Vert u_n\Vert _{W^{1,p}(\Omega )}+\Vert b(x,D u_n)-\gamma a(x,D u_n)+\gamma f\Vert _{L^{p'}(\Omega )}^{1/(p-1)}\right) \end{aligned}$$

Repeating arguments from the proof of Lemma 3.5 we deduce using induction that there is \(c(\Vert f\Vert _{L^{p'}(\Omega )})\) such that for all \(n\ge 0\),

$$\begin{aligned} \Vert u_n\Vert _{W^{1+\alpha ,p}(\Omega )}\le c(\Vert f\Vert _{L^{p'}(\Omega )}). \end{aligned}$$

Extracting a weakly convergent subsequence we conclude the same estimate holds for u.

Let v be the best approximation to u in \(X_h\). Using (A2), the fact that \(u,u^h\) solve (1.1) and (4.2) respectively we find

$$\begin{aligned}&\lambda _a\int _\Omega (\mu ^2+|D u|^2+|D u^h|^2)^\frac{p-2}{2}|D u-D u^h|^2\,\mathrm {d}x\\&\quad \le \int _\Omega (a(x,D u)-a(x,D u^h))D (u-u^h)\,\mathrm {d}x\\&\quad = \int _\Omega (a(x,D u)-a(x,D u^h))D (u-v)\,\mathrm {d}x\\&\quad \lesssim \Lambda _a \int _\Omega (\mu ^2+|D u|^2+|D (u-u^h)|^2)^\frac{p-2}{2}|D (u-u^h)||D (u-v)|\,\mathrm {d}x\\&\quad \lesssim C_\varepsilon \Lambda _a \int _\Omega (\mu ^2+|D u|^2+|D u^h|^2)^\frac{p-2}{2}|D (u-u^h)|^2\,\mathrm {d}x \\&\qquad +\varepsilon \Lambda _a \int _\Omega (\mu ^2+|D u|^2+|D v|^2)^\frac{p-2}{2}|D (u-v)|^2\,\mathrm {d}x \end{aligned}$$

where to obtain the last line we have used Lemma 2.2. Choosing \(\varepsilon \) sufficiently large we conclude using (2.1)

$$\begin{aligned} I&=\int _\Omega (\mu ^2+|D u|^2+|D u^h|^2)^\frac{p-2}{2}|D (u-u^h)|^2\nonumber \\&\lesssim \int _\Omega (\mu ^2+|D u|^2+|D v|^2)^\frac{p-2}{2}|D (u-v)|^2\,\mathrm {d}x\nonumber \\&\le \Vert D (u-v)\Vert _{L^p(\Omega )}^2\left( 1+\Vert D u\Vert _{W^{1,p}(\Omega )}^{p-1}+\Vert D (u-v)\Vert _{W^{1,p}(\Omega )}^{p-1}\right) \nonumber \\&\lesssim h^{2\alpha } \Vert u\Vert _{W^{1+\alpha ,p}(\Omega )}\left( 1+\Vert u\Vert _{W^{1,p}(\Omega )}^{p-1}+h^{\alpha (p-1)}\Vert u\Vert _{W^{1+\alpha ,p}(\Omega )}^{p-1}\right) \end{aligned}$$
(4.4)

If \(p\ge 2\), \(I\ge \Vert D (u-u^h)\Vert _{L^p(\Omega )}^p\) whereas if \(p\le 2\), we have by applying Hölder’s inequality \({I\ge \Vert D (u-u^h)\Vert _{L^p(\Omega )}^2 (1+\Vert D u\Vert _{L^p(\Omega )}+\Vert D u^h\Vert _{L^p(\Omega )})^{p-2}}\). Recalling the standard estimates \(\Vert D u\Vert _{L^p(\Omega )}\le \Vert f\Vert _{L^{p'}(\Omega )}^{1/(p-1)}\) as well as that \({\Vert D u^h\Vert _{L^p(\Omega )}\le \Vert f\Vert _{L^{p'}(\Omega )}^{1/(p-1)}}\), we conclude by combining (4.4), Theorem 7 and the inequality

$$\begin{aligned} \Vert u_{n+1}^h-u\Vert _{W^{1,p}(\Omega )}\le \Vert u_{n+1}^h-u^h\Vert _{W^{1,p}(\Omega )}+\Vert u^h-u\Vert _{W^{1,p}(\Omega )}. \end{aligned}$$

\(\square \)
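The mesh-dependent part of the estimate in Theorem 8 decays like \(h^{2\alpha /\max (2,p)}\). As a small worked illustration, the sketch below evaluates this exponent together with the admissible range of \(\alpha \) quoted in Remark 4.1 for \(b(x,z)=|z|^{p-2}z\). The function names are hypothetical, and the supremum is not attained since Remark 4.1 requires a strict inequality.

```python
def alpha_sup_p_laplace(p):
    """Supremum of admissible alpha in Remark 4.1 for b(x,z) = |z|^(p-2) z."""
    return min(1.0 / (p - 1) ** 2, (p - 1) ** 2)

def mesh_rate_exponent(p, alpha):
    """Exponent of h in the estimate of Theorem 8."""
    return 2.0 * alpha / max(2.0, p)

# Exponents approached as alpha tends to its supremum, for a few values of p.
limits = {p: mesh_rate_exponent(p, alpha_sup_p_laplace(p)) for p in (1.5, 2.0, 3.0)}
```

For \(p=2\) the exponent approaches 1, the usual first-order rate for piecewise linear elements, and it degrades as p moves away from 2 in either direction.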

We also have an a-posteriori error bound. The proof follows [29].

Proposition 4.2

Assume the conditions of Theorem 7 hold. Then we have

$$\begin{aligned} \Vert D (u_n^h-u^h)\Vert _{L^p(\Omega )}\le C(\Vert f\Vert _{L^{p'}(\Omega )},\Vert D u_0\Vert _{L^p(\Omega )})\Vert D (u_n^h-u_{n-1}^h)\Vert _{L^p(\Omega )}^\frac{2}{\max (2,p)}. \end{aligned}$$

Proof

We compute using (A2) and Lemma 2.2,

$$\begin{aligned}&\lambda _a \gamma \int _\Omega (\mu ^2+|D (u^h-u_{n-1}^h)|^2+|D u_{n-1}^h|^2)^\frac{p-2}{2}|D u^h-D u_{n-1}^h|^2\,\mathrm {d}x\\&\lesssim \gamma \int _\Omega (a(x,D u^h)-a(x,D u_{n-1}^h))\cdot D (u^h-u_{n-1}^h)\,\mathrm {d}x\\&= \int _\Omega \gamma f\cdot D (u^h-u_{n-1}^h) \\&\quad + \big (b(x,D u_n^h)-b(x,D u_{n-1}^h)-\gamma f\big )\cdot D (u^h-u_{n-1}^h)\,\mathrm {d}x\\&\lesssim \int _\Omega \left( \mu ^2+|D u_{n-1}^h|^2+|D (u_n^h-u_{n-1}^h)|^2\right) ^\frac{p-2}{2}\\&\quad \times |D (u_n^h-u_{n-1}^h)| |D (u^h-u_{n-1}^h)|\,\mathrm {d}x\\&\le \varepsilon \int _\Omega \left( \mu ^2+|D u_{n-1}^h|^2+|D (u_n^h-u_{n-1}^h)|^2\right) ^\frac{p-2}{2}|D (u_n^h-u_{n-1}^h)|^2\,\mathrm {d}x\\&\quad +C_\varepsilon \int _\Omega (\mu ^2+|D (u^h-u_{n-1}^h)|^2+|D u_{n-1}^h|^2)^\frac{p-2}{2}|D u^h-D u_{n-1}^h|^2\,\mathrm {d}x. \end{aligned}$$

Thus choosing \(\varepsilon \) sufficiently large, re-arranging, employing by now standard arguments and recalling that \(\{u_n^h\}\) is bounded uniformly in \(W^{1,p}(\Omega )\) we obtain

$$\begin{aligned} \Vert D (u^h-u_{n-1}^h)\Vert _{L^p(\Omega )}\lesssim c\left( \Vert D u^h\Vert _{L^p(\Omega )},\Vert D u_0\Vert _{L^p(\Omega )}\right) \Vert D (u_n^h-u_{n-1}^h)\Vert _{L^p(\Omega )}^\frac{2}{\max (2,p)}. \end{aligned}$$

By the triangle inequality, and using \(\Vert u^h\Vert _{W^{1,p}(\Omega )}\lesssim \Vert f\Vert _{L^{p'}(\Omega )}^{1/(p-1)}\), we conclude the desired estimate:

$$\begin{aligned} \Vert D (u_n^h-u^h)\Vert _{L^p(\Omega )}&\le \Vert D (u_{n-1}^h-u^h)\Vert _{L^p(\Omega )}+\Vert D (u_n^h-u_{n-1}^h)\Vert _{L^p(\Omega )}\\&\le c\left( \Vert f\Vert _{L^{p'}(\Omega )},\Vert D u_0\Vert _{L^p(\Omega )}\right) \Vert D (u_n^h-u_{n-1}^h)\Vert _{L^p(\Omega )}^\frac{2}{\max (2,p)}. \end{aligned}$$

\(\square \)

We close this section by detailing a modification of the numerical scheme, following the algorithm outlined in [29]. We strengthen our assumptions on \(\{T_h\}\) and assume \(\{T_k\}_{k\in \mathbb {N}}\) is a sequence of shape-regular triangulations with mesh-size \(h_k\rightarrow 0\) as \(k\rightarrow \infty \). Moreover, we assume that \(T_k\) is obtained from \(T_{k-1}\) by refinement. Denote by \(X^k\) the space of continuous piecewise linear functions subordinate to \(T_k\). We then consider the following algorithm:

Algorithm 1 (figure a)

From the results of this section it is clear that, under the assumptions of Theorem 8, Algorithm 1 converges to a solution of (1.1) as \(k_{max}\rightarrow \infty \).

4.2 Numerical experiments

The results presented in this section are obtained using Firedrake [2,3,4,5, 10, 47]. Throughout this section \(\Omega = [0,1]^3\). We consider shape-regular triangulations of \(\Omega \) with uniformly spaced nodes at distance \(h =2^{-i}\). Note that in this set-up the mesh-size is \(\sqrt{3}h\).

A linear example: We first consider a linear example where we know the exact solution. We choose,

$$\begin{aligned} f = \left( \begin{matrix}8\pi &{} 8\pi &{} 8\pi \\ 10\pi &{} 10\pi &{} 10\pi \\ 2\pi &{} 2\pi &{} 2\pi \end{matrix}\right) (v,v,v)^T \,\text { where } v = \left( \begin{matrix}\cos (2\pi x)\sin (2\pi y)\sin (2\pi z) \\ \sin (2\pi x)\cos (2\pi y) \sin (2\pi z) \\ \sin (2\pi x)\sin (2\pi y)\cos (2\pi z)\end{matrix}\right) \end{aligned}$$

and consider the problem

$$\begin{aligned} {\left\{ \begin{array}{ll}-div \,A\,D u = div \,f \quad &{}\text { in } \Omega \\ u = 0 &{}\text { on } \partial \Omega \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} A = \left( \begin{matrix} 1 &{} 1 &{} 2 \\ 0 &{} 2 &{} 3 \\ 0 &{} 0 &{} 1 \end{matrix}\right) . \end{aligned}$$

Note that with \({\tilde{u}}(x,y,z)=\sin (2\pi x)\sin (2\pi y)\sin (2\pi z)\),

$$\begin{aligned} u(x,y,z)=({\tilde{u}}(x,y,z),{\tilde{u}}(x,y,z),{\tilde{u}}(x,y,z))^T \end{aligned}$$

is the exact solution of this problem.

For the iteration scheme we choose \(u_0 = 0\), \(b(x,z)=z\) and \(\gamma = 2/3\) or \(\gamma = 1/2\). We solve each iteration step using GMRES with an incomplete LU factorisation to precondition the problem. The iteration is terminated when \(\Vert u_{n+1}-u_n\Vert _{H^1(\Omega )}\le 10^{-9}\).

We record the \(H^1\)-error of the numerical solution v computed using the iterative scheme (4.1) in Table 1. We also record the number of iterations needed.

Table 1 \(\Vert u-v\Vert _{H^1(\Omega )}\)

We also compute the numerical solution \(u^h\) directly using an LU-factorisation. We record the \(H^1\)-distance between \(u_n^h\) and \(u^h\) for two different choices of \(\gamma \) in Fig. 1.

Fig. 1 \(\Vert u_n^h-u^h\Vert _{H^1(\Omega )}\)

A nonlinear example: We also consider the nonlinear problem

$$\begin{aligned} {\left\{ \begin{array}{ll} -div \,(1+|D u|^4) AD u + |u|^4 u = (y,x^2,z^2+x^2)^T\quad &{}\text { in } \Omega \\ u = 0 &{}\text { on } \partial \Omega \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} A=\left( \begin{matrix} 1 &{} 3 &{} 5 \\ 0 &{} 2 &{} 4 \\ 0 &{} 0 &{} 1\end{matrix}\right) . \end{aligned}$$

We set \(b(x,z)=(1+|z|^4)z\), \(\gamma = 0.65\) and \(u_0 = 0\) and employ (4.1). Each (non-linear) iteration step is solved using the standard ’solve’-method in Firedrake. This utilises a nonlinear Newton linesearch scheme where the linear step is computed using GMRES. Denote the solution obtained in this way by \(u_{dir}\). We record the \(H^1\)-distance \(\Vert u_{dir}-u_n^h\Vert _{H^1(\Omega )}\) in Fig. 2.

Fig. 2 \(\Vert u_{dir}-u_n^h\Vert _{H^1(\Omega )}\)

For this problem, we moreover compare computing times for (4.1) and Algorithm 1 in Table 2. For (4.1) we use \(\Vert u_{n+1}-u_n\Vert _{H^1(\Omega )}\le 10^{-9}\) as our termination condition, while for Algorithm 1 we choose \(tol _k = 10^{-(i+4)}\) when \(h=2^{-i}\).

Table 2 runtime in seconds

Remark 4.3

We do not apply a scheme optimised for the p-Laplacian type operator \(b(x,z)=(1+|z|^4)z\) in order to solve each iteration step. Improving our computation in this way should yield a further reduction in runtime.

Finally, for the same problem, we consider the effect of the choice of \(\gamma \) on the iteration. We record in Fig. 3 the number of iterations required to obtain convergence when \(\gamma \) varies between 0 and 0.9. For values of \(\gamma \) larger than \(\approx 0.95\), the algorithm diverges. Similar to the linear case, and as predicted by our theory, the number of iterations needed to obtain convergence is independent of the choice of h. Hence, computing solutions for a large value of h is an efficient way to find the optimal value of \(\gamma \) to use for smaller values of h.
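The qualitative dependence on \(\gamma \) can be reproduced in a finite-dimensional toy model. The sketch below is not the experiment reported in Fig. 3: it assumes \(p=2\), the identity as reference field and a symmetric positive definite \(2\times 2\) matrix in place of a, and counts the steps of \(u_{n+1}=u_n-\gamma (Au_n-f)\) needed to reach a prescribed increment tolerance. The name `count_iterations` and all numerical values are our own choices.

```python
import math

def count_iterations(A, f, gamma, tol=1e-9, max_iter=10000):
    """Number of steps until the increment drops below tol, or None on failure.

    The iteration is u_{n+1} = u_n - gamma * (A u_n - f) with u_0 = 0.
    None is returned if the increments blow up or max_iter is reached.
    """
    u = [0.0] * len(f)
    for n in range(1, max_iter + 1):
        Au = [sum(a * x for a, x in zip(row, u)) for row in A]
        new = [ui - gamma * (aui - fi) for ui, aui, fi in zip(u, Au, f)]
        inc = math.sqrt(sum((x - y) ** 2 for x, y in zip(new, u)))
        if inc <= tol:
            return n
        if inc > 1e12:
            return None
        u = new
    return None

# Toy data (assumption): the largest eigenvalue of A is (5 + sqrt(5)) / 2.
A = [[2.0, 1.0], [1.0, 3.0]]
f = [1.0, 1.0]
sweep = {gamma: count_iterations(A, f, gamma) for gamma in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6)}
```

In this model the iteration count is smallest near \(\gamma =2/(\lambda +\Lambda )\) and the iteration diverges once \(\gamma \) exceeds \(2/\Lambda \), mirroring the sharp divergence threshold observed in the experiment.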

Fig. 3 The effect of varying \(\gamma \)

Remark 4.4

Note that we are not able to prove convergence for all small values of \(\gamma \) in Theorem 8. Convergence nevertheless holds for all sufficiently small \(\gamma \) in our experiment. It would be interesting to understand this phenomenon.

5 Improved regularity results

In this section, we give two examples where the iterative process can be used to obtain regularity statements about solutions to perturbations of a system which admits regular solutions.

5.1 A first example: Calderón–Zygmund type estimates

We will use fields satisfying the assumptions of Theorem 2 as reference fields and apply the iterative process to obtain the following result.

Theorem 9

Let \(1<p<\infty \), \(q_0\in [1,\infty )\). Suppose \(\Omega \) is a Lipschitz domain. Let \(q\in [1,q_0]\). Take \(f\in L^{p'q}(\Omega )\) and \(g\in W^{1,pq}(\Omega )\). Assume for some field b the assumptions of Theorem 2 are satisfied and let \(C_0\) be the constant defined there. Assume that a satisfies assumptions (A1)-(A3) and (A4) with \(\mu =0\). Suppose further that with this choice of a, b the assumptions of Theorem 1 are satisfied. If

$$\begin{aligned} C_0'=\dfrac{C_0^{p-1}\Lambda _b K_{a,b}}{(p-1)}<1, \end{aligned}$$

where \(K_{a,b}\) is given by (1.3), then the solution u of (1.1) satisfies

$$\begin{aligned} \Vert D u\Vert _{L^{pq}(\Omega )}\lesssim \left( 1+\Vert D g\Vert _{L^{pq}(\Omega )}+\Vert f\Vert _{L^{p'q}(\Omega )}^\frac{1}{p-1}\right) . \end{aligned}$$

Proof

Let \(\{u_n\}\) be the sequence generated by the iterative process with \(u_0 = 0\) and with the choice of \(\gamma \) as in Theorem 1. Write \(K=K_{a,b}\). By Theorem 2, Lemma 3.2 and (A3), we have, for some \(c>0\), the estimate

$$\begin{aligned}&\Vert D u_{n+1}\Vert _{L^{pq}(\Omega )} \\&\quad \le C_0\left( 1+\Vert D g\Vert _{L^{pq}(\Omega )}+ \Vert b(x,D u_n)-\gamma a(x,D u_n)+\gamma f\Vert _{L^{p'q}(\Omega )}^\frac{1}{p-1}\right) \\&\quad \le C_0\Big (1+c+\Vert D g\Vert _{L^{pq}(\Omega )}\\&\qquad + \Vert b(x,D u_n)-b(x,0)-\gamma (a(x,D u_n)-a(x,0))+\gamma f\Vert _{L^{p'q}(\Omega )}^\frac{1}{p-1}\Big )\\&\quad \le C_0\Big (1+c+\Vert D g\Vert _{L^{pq}(\Omega )}+(\Lambda _b K)^\frac{1}{p-1}\left\| \int _0^1 \theta ^{p-2}|D u_n|^{p-1}\,\mathrm {d}\theta \right\| _{L^{p'q}(\Omega )}^\frac{1}{p-1}\\&\qquad +\gamma \Vert f\Vert _{L^{p'q}(\Omega )}^\frac{1}{p-1}\Big )\\&\quad = C_0\left( 1+c+\Vert D g\Vert _{L^{pq}(\Omega )}+\frac{(\Lambda _b K)^\frac{1}{p-1}}{(p-1)^\frac{1}{p-1}}\Vert D u_n\Vert _{L^{pq}(\Omega )}+\gamma \Vert f\Vert _{L^{p'q}(\Omega )}^\frac{1}{p-1}\right) . \end{aligned}$$

Thus, as by assumption \(C_0'<1\), we find by induction

$$\begin{aligned} \Vert D u_{n+1}\Vert _{L^{pq}(\Omega )}\lesssim \frac{(C_0')^\frac{1}{p-1}}{1-(C_0')^\frac{1}{p-1}} \left( 1+\Vert D g\Vert _{L^{pq}(\Omega )}+\Vert f\Vert _{L^{p'q}(\Omega )}^\frac{1}{p-1}\right) . \end{aligned}$$

Extracting a weakly convergent subsequence and noting that \(u_n\rightarrow u\) in \(W^{1,p}(\Omega )\) by Theorem 6, where u is the solution of (1.1), we find the desired estimate holds. \(\square \)

5.2 A second example: weighted estimates and Hölder continuity

We can use the Koshelev iteration to perturb Theorem 4 as follows:

Theorem 10

Let \(N=1\). Suppose a is a field satisfying (A1)-(A3) and (A4) with \(\mu =0\). Let \(b(x,z)=(B(x)z\cdot z)^\frac{p-2}{2}\) and assume that with this choice of b, the assumptions of Theorem 4 are satisfied. Let \(C_1\) be the constant obtained in Theorem 4. Assume that the assumptions of Theorem 1 hold with this choice of a, b. If

$$\begin{aligned} C_1' =\dfrac{C_1^\frac{p-1}{q} K_{a,b} \Lambda _b}{(p-1)}<1, \end{aligned}$$

where \(K_{a,b}\) is given by (1.3), then the solution v of the boundary value problem (1.1) satisfies the estimate

$$\begin{aligned} \Vert D v\Vert _{L^q_w(\Omega )}\lesssim 1+\Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}^\frac{1}{p-1}. \end{aligned}$$

Proof

The proof is similar to the proof of Theorem 9. Let \(\{u_n\}\) be given by the iterative process with \(u_0 = 0\) and \(\gamma \) chosen as in Theorem 6.

Note that \(b(x,0)-\gamma a(x,0)+\gamma f\in L^\frac{q}{p-1}_w(\Omega )\). Thus we find,

$$\begin{aligned}&\Vert D u_{n+1}\Vert _{L^q_w(\Omega )}^q \\&\quad \le C_1 \int _\Omega |b(x,D u_n)-\gamma a(x,D u_n)+\gamma f|^\frac{q}{p-1}w(x)\,\mathrm {d}x\\&\quad \le C_1 \int _\Omega |b(x,D u_n)-b(x,0)-\gamma (a(x,D u_n)-a(x,0))|^\frac{q}{p-1}w(x)\,\mathrm {d}x \\&\qquad +C_1 \int _\Omega |b(x,0)-\gamma a(x,0)+\gamma f|^\frac{q}{p-1}w(x)\,\mathrm {d}x\\&\quad \le C_1 K_{a,b}^\frac{q}{p-1}\Lambda _b^\frac{q}{p-1}(p-1)^{-\frac{q}{p-1}} \Vert D u_n\Vert _{L^q_w(\Omega )}^q+ c(C_1,p,q)\left( 1+\Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}^\frac{q}{p-1}\right) . \end{aligned}$$

Hence by induction, as by assumption \(C_1'<1\),

$$\begin{aligned} \Vert D u_{n+1}\Vert _{L^q_w(\Omega )}^q\lesssim \dfrac{(C_1')^\frac{q}{p-1}}{1-(C_1')^\frac{q}{p-1}}\left( 1+\Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}^\frac{q}{p-1}\right) . \end{aligned}$$

Extracting a weakly convergent subsequence and noting that \(u_n\rightarrow v\) in \(W^{1,p}(\Omega )\) by Theorem 6, where v solves (1.1), we obtain the desired estimate after passing to the limit. \(\square \)

We note that of particular interest is the choice \(w(x) = |x|^\alpha \), which can be used to obtain estimates in the familiar Morrey spaces and hence, through the Morrey-Sobolev embedding, allows one to obtain continuity statements. Recall the definition of the \(L^{q,\theta }\)-Morrey-norm:

$$\begin{aligned} \Vert u\Vert _{L^{q,\theta }(\Omega )} = \sup _{0<r<\text {diam}(\Omega ),z\in \Omega }r^\frac{\theta -n}{q}\Vert u\Vert _{L^q(B_r(z)\cap \Omega )}, \end{aligned}$$

where \(\theta \in (0,n)\). We assume that all the assumptions and the notation of Theorem 10 hold and show how to deduce estimates in Morrey spaces.

Fix \(z\in \Omega \), \(r\in (0,\text {diam}(\Omega ))\) and choose for \(\rho \in (0,\theta )\),

$$\begin{aligned} w(x)= \min \left( |x-z|^{-n+\theta -\rho },r^{-n+\theta -\rho }\right) . \end{aligned}$$

Due to [41, Lemma 3.4], w is an \(A_s\) weight for any \(1<s<\infty \). Thus by Theorem 10,

$$\begin{aligned} \Vert D u\Vert _{L^{q,\theta }(B_r(z)\cap \Omega )}^q&\le r^{n-\theta +\rho }\Vert D u\Vert _{L^q_w(B_r(z)\cap \Omega )}^q\\&\le r^{n-\theta +\rho }c(C,p,q)\left( 1+\Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}^\frac{q}{p-1}\right) . \end{aligned}$$
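The first inequality in the display above rests on the fact that the truncated weight is constant on \(B_r(z)\): there it equals \(r^{-n+\theta -\rho }\), so the weighted and unweighted \(L^q\)-norms over the ball differ only by this factor. A minimal sketch of the weight, with a hypothetical helper `weight` and with points represented as coordinate tuples, is the following.

```python
import math

def weight(x, z, r, n, theta, rho):
    """w(x) = min(|x - z|^(-n+theta-rho), r^(-n+theta-rho)).

    The exponent is negative because theta < n and rho > 0, so the power of
    |x - z| exceeds the cap exactly when |x - z| < r.
    """
    exponent = -n + theta - rho
    dist = math.sqrt(sum((xi - zi) ** 2 for xi, zi in zip(x, z)))
    cap = r ** exponent
    if dist == 0.0:
        return cap  # the untruncated power is infinite at the centre
    return min(dist ** exponent, cap)

# Illustrative parameters (assumptions): n = 2, theta = 1, rho = 1/2.
z, r = (0.5, 0.5), 0.25
inside = weight((0.6, 0.5), z, r, 2, 1.0, 0.5)
outside = weight((0.9, 0.5), z, r, 2, 1.0, 0.5)
```

Inside the ball the value is the cap, while outside it decays like \(|x-z|^{-n+\theta -\rho }\); the truncation removes the singularity at z.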

It remains to estimate \(\Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}\). For this we proceed exactly as in [41] but provide the argument here for the sake of completeness. We will show that \(\Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}^q\le c \Vert f\Vert _{L^{q,\theta }(\Omega )}^q r^{-\rho }\), which will conclude the proof. It is convenient to introduce \(f'\) where \(|f'|^{p-2}f' = f\). For \(\alpha >0\) we denote the set \({E_\alpha = \{x\in \Omega :|f'|>\alpha \}}\). Then we can write

$$\begin{aligned} \Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}^q = \Vert f'\Vert _{L^q_w(\Omega )}^q&= q \int _0^\infty \alpha ^q \int _{E_\alpha } w(x)\,\mathrm {d}x\frac{d\alpha }{\alpha }\\&\le q\int _0^\infty \alpha ^q \int _0^{r^{-n+\theta -\rho }} \left| E_\alpha \cap B_{\beta ^\frac{1}{-n+\theta -\rho }}(z)\right| \,\mathrm {d}\beta \frac{d\alpha }{\alpha }. \end{aligned}$$

We now estimate the inner integral as follows:

$$\begin{aligned} \int _0^{r^{-n+\theta -\rho }} \left| E_\alpha \cap B_{\beta ^\frac{1}{-n+\theta -\rho }}(z)\right| d\beta&\le \sum _{i=1}^\infty 2^{-i}r^{-n+\theta -\rho }\left| E_\alpha \cap B_{r 2^\frac{-i}{-n+\theta -\rho }}(z)\right| \\&\le \, 2 \int _0^{\frac{1}{2} r^{-n+\theta -\rho }} \beta \left| E_\alpha \cap B_{\beta ^\frac{1}{-n+\theta -\rho }}(z)\right| \frac{d\beta }{\beta }. \end{aligned}$$

Now returning to the original estimate and applying Fubini’s theorem we conclude

$$\begin{aligned} \Vert f\Vert _{L^\frac{q}{p-1}_w(\Omega )}^q&\le 2q \int _0^\infty \alpha ^q \int _0^{\frac{1}{2} r^{-n+\theta -\rho }}\beta \left| E_\alpha \cap B_{\beta ^\frac{1}{-n+\theta -\rho }}(z)\right| \frac{\,\mathrm {d}\beta }{\beta }\frac{\,\mathrm {d}\alpha }{\alpha }\\&\le 2q \Vert f\Vert _{L^{q,\theta }(\Omega )}^q \int _0^{\frac{1}{2} r^{-n+\theta -\rho }} \beta ^{1+\frac{n-\theta }{-n+\theta -\rho }}\frac{\,\mathrm {d}\beta }{\beta }\le c \Vert f\Vert _{L^{q,\theta }(\Omega )}^q r^{-\rho }. \end{aligned}$$

This gives the desired result.