1 Introduction

1.1 Nonlocal equations

We study the Sobolev regularity of weak solutions to linear nonlocal integro-differential equations of the form

$$\begin{aligned} L_A u = f \quad \text {in } \Omega \subset {\mathbb {R}}^n, \end{aligned}$$
(1.1)

where \(\Omega \subset {\mathbb {R}}^n\) is a domain (= open set) and \(A:{\mathbb {R}}^n \times {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) is a coefficient. In addition, for some fixed parameter \(s \in (0,1)\) the nonlocal operator \(L_A\) is formally given by

$$\begin{aligned} L_A u(x) {:}{=} p.v. \int _{{\mathbb {R}}^n} \frac{A(x,y)}{|x-y|^{n+2s}} (u(x)-u(y))dy, \quad x \in \Omega . \end{aligned}$$
(1.2)

Throughout the paper, for the sake of simplicity we assume that \(n>2s\). Furthermore, we require that the coefficient A is measurable and that there exists some constant \(\Lambda \ge 1\) such that

$$\begin{aligned} \Lambda ^{-1} \le A(x,y) \le \Lambda \quad \text {for almost all } x,y \in {\mathbb {R}}^n. \end{aligned}$$
(1.3)

Moreover, we assume that A is symmetric, that is,

$$\begin{aligned} A(x,y)=A(y,x) \quad \text {for almost all } x,y \in {\mathbb {R}}^n. \end{aligned}$$
(1.4)

We define \({\mathcal {L}}_0(\Lambda )\) as the class of all such measurable coefficients A that satisfy (1.3) and (1.4).

Building on the results and techniques from our previous work [43], the aim of this paper is to show that under appropriate regularity assumptions on A and f, weak solutions to (1.1), which are initially assumed to belong to the fractional Sobolev space \(W^{s,2}({\mathbb {R}}^n)\), in fact belong to higher-order spaces \(W^{t,p}_{loc}(\Omega )\) for some \(p>2\) and any \(s \le t<\min \{2s,1\}\). For the relevant definitions of these spaces, we refer to Sect. 2.

Concerning our precise notion of weak solutions, denoting by \(W^{s,2}_c(\Omega )\) the set of all functions that belong to \(W^{s,2}({\mathbb {R}}^n)\) and are compactly supported in \(\Omega \), we have the following definition.

Definition

Given \(f \in L^\frac{2n}{n+2s}_{loc}(\Omega )\), we say that \(u \in W^{s,2}({\mathbb {R}}^n)\) is a weak solution of the equation \(L_A u = f\) in \(\Omega \), if

$$\begin{aligned} \int _{{\mathbb {R}}^n} \int _{{\mathbb {R}}^n} \frac{A(x,y)}{|x-y|^{n+2s}} (u(x)-u(y))(\varphi (x)-\varphi (y))dydx = \int _{\Omega } f \varphi dx \quad \forall \varphi \in W^{s,2}_c(\Omega ).\quad \end{aligned}$$
(1.5)

1.2 VMO coefficients

Before stating our main results, we need to recall our notion of coefficients with vanishing mean oscillation which was introduced in [43].

Definition

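Let \(\delta >0\) and \(A \in {\mathcal {L}}_0(\Lambda )\). We say that A is \(\delta \)-vanishing in a ball \(B \subset {\mathbb {R}}^n\), if for any \(r>0\) and all \(x_0,y_0 \in B\) with \(B_r(x_0) \subset B\) and \(B_r(y_0) \subset B\), we have

$$\begin{aligned} \frac{1}{|B_r(x_0)| \, |B_r(y_0)|} \int _{B_r(x_0)} \int _{B_r(y_0)} |A(x,y)-{\overline{A}}_{r,x_0,y_0}| dydx \le \delta , \end{aligned}$$

where \({\overline{A}}_{r,x_0,y_0}{:}{=} \frac{1}{|B_r(x_0)| \, |B_r(y_0)|} \int _{B_r(x_0)} \int _{B_r(y_0)} A(x,y)dydx\).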

Moreover, we say that A is \((\delta ,R)\)-BMO in a domain \(\Omega \subset {\mathbb {R}}^n\) and for some \(R>0\), if for any \(z \in \Omega \) and any \(0<r\le R\) with \(B_r(z) \Subset \Omega \), A is \(\delta \)-vanishing in \(B_r(z)\).

Finally, we say that A is VMO in \(\Omega \), if for any \(\delta >0\), there exists some \(R>0\) such that A is \((\delta ,R)\)-BMO in \(\Omega \).

Let us briefly put the above definition into a more classical context. In case A belongs to the classical space of functions with vanishing mean oscillation \(\text {VMO}({\mathbb {R}}^{2n})\) (see e.g. [34,  Section 2.1.1], [19] or [46]), then A is also VMO in \({\mathbb {R}}^n\). However, our assumption that A is VMO in \(\Omega \) is more general, in the sense that we essentially only assume A to be of vanishing mean oscillation in some arbitrarily small open neighbourhood of the diagonal in \(\Omega \times \Omega \), while away from the diagonal in \(\Omega \times \Omega \) and outside of \(\Omega \times \Omega \) A is not required to possess any regularity at all. In particular, any coefficient A that is continuous in an open neighbourhood of the diagonal in \(\Omega \times \Omega \) is VMO in \(\Omega \). Nevertheless, continuity close to the diagonal is not essential in order for a coefficient to be VMO.

Indeed, the class of discontinuous VMO functions is actually rather rich. For instance, assuming that \(\Omega \) contains the origin, if for some \(\alpha \in (0,1)\) we have

$$\begin{aligned} \begin{aligned} A(x,y)= {\left\{ \begin{array}{ll} \text {sin} \left( |\text {log}(|x|+|y|)|^\alpha \right) +2 &{} \text { if } x \ne 0 \text { or } y \ne 0 \\ 0 &{} \text { if } x=y=0 \end{array}\right. } \end{aligned} \end{aligned}$$
(1.6)

or

$$\begin{aligned} \begin{aligned} A(x,y)= {\left\{ \begin{array}{ll} \text {sin} \left( \text {log}|\text {log}(|x|+|y|)| \right) +2 &{} \text { if } x \ne 0 \text { or } y \ne 0 \\ 0 &{} \text { if } x=y=0 \end{array}\right. } \end{aligned} \end{aligned}$$
(1.7)

in an open neighbourhood of \(\text {diag}(\Omega \times \Omega )\), then A is VMO in \(\Omega \). However, in both cases A is discontinuous at \(x=y=0\).
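
To make the discontinuity explicit, one can evaluate A at diagonal points approaching the origin. The following is only a sketch; the radii \(\rho _k\) and \(\rho _k^\prime \) are auxiliary quantities introduced here for illustration. For (1.6) and \(k \in {\mathbb {N}}\), set

$$\begin{aligned} \rho _k {:}{=} \exp \left( -\left( \tfrac{\pi }{2}+2\pi k \right) ^{1/\alpha } \right) , \quad \rho _k^\prime {:}{=} \exp \left( -\left( \tfrac{3\pi }{2}+2\pi k \right) ^{1/\alpha } \right) , \end{aligned}$$

so that \(\rho _k,\rho _k^\prime \rightarrow 0\) as \(k \rightarrow \infty \). For \(x=y\) we then obtain

$$\begin{aligned} A(x,x)= {\left\{ \begin{array}{ll} \sin \left( \tfrac{\pi }{2}+2\pi k \right) +2 = 3 &{} \text { if } |x|=\rho _k/2 \\ \sin \left( \tfrac{3\pi }{2}+2\pi k \right) +2 = 1 &{} \text { if } |x|=\rho _k^\prime /2, \end{array}\right. } \end{aligned}$$

so that A has no limit as \((x,y) \rightarrow (0,0)\). The same computation applies to (1.7) with \(\rho _k {:}{=} \exp \left( -\exp \left( \tfrac{\pi }{2}+2\pi k \right) \right) \) and \(\rho _k^\prime {:}{=} \exp \left( -\exp \left( \tfrac{3\pi }{2}+2\pi k \right) \right) \). Note also that both formulas take values in \([1,3]\) for \((x,y) \ne (0,0)\), which is consistent with the bounds (1.3) for \(\Lambda =3\) wherever A is given by these formulas.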

1.3 Main results

We are now in the position to state our main results.

Theorem 1.1

Let \(\Omega \subset {\mathbb {R}}^n\) be a domain, \(s \in (0,1)\) and \(\Lambda \ge 1\). If \(A \in {\mathcal {L}}_0(\Lambda )\) is VMO in \(\Omega \), then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation

$$\begin{aligned} L_A u = f \text { in } \Omega , \end{aligned}$$

any \(p \in (2,\infty )\) and any \(s \le t <\min \{2s,1\}\), we have the implication

$$\begin{aligned} f \in L^\frac{np}{n+(2s-t)p}_{loc}(\Omega ) \implies u \in W^{t,p}_{loc}(\Omega ). \end{aligned}$$

If we are only interested in arriving at the conclusion that \(u \in W^{t,p}_{loc}(\Omega )\) for some fixed t and some fixed p, then it suffices for A to be small in BMO, as our second main result indicates, in which we also state an explicit estimate on the solution.

Theorem 1.2

Let \(\Omega \subset {\mathbb {R}}^n\) be a domain, \(s \in (0,1)\), \(\Lambda \ge 1\) and \(R>0\). Moreover, fix some \(p \in (2,\infty )\) and some \(s<t<\min \{2s,1\}\). Then there exists some small enough \(\delta =\delta (p,n,s,t,\Lambda )>0\), such that if \(A \in {\mathcal {L}}_0(\Lambda )\) is \((\delta ,R)\)-BMO in \(\Omega \), then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation

$$\begin{aligned} L_A u = f \quad \text {in } \Omega , \end{aligned}$$

we have the implication

$$\begin{aligned} f \in L^\frac{np}{n+(2s-t)p}_{loc}(\Omega ) \implies u \in W^{t,p}_{loc}(\Omega ). \end{aligned}$$

In addition, for all relatively compact bounded open sets \({\Omega ^\prime } \Subset {\Omega ^{\prime \prime }} \Subset \Omega \), we have the estimate

$$\begin{aligned}{}[u]_{W^{t,p}(\Omega ^\prime )} \le C \left( [u]_{W^{s,2}({\mathbb {R}}^n)} + ||f||_{L^{\frac{np}{n+(2s-t)p}}(\Omega ^{\prime \prime })} \right) , \end{aligned}$$
(1.8)

where \(C=C(n,s,t,\Lambda ,R,p,\Omega ^\prime ,\Omega ^{\prime \prime })>0\).

We stated Theorems 1.1 and 1.2 in terms of the higher integrability exponent p at which we arrive. Since in some circumstances it might be more natural to instead prescribe the integrability of the source function f, we also state the following reformulation of Theorem 1.1.

Theorem 1.3

Let \(\Omega \subset {\mathbb {R}}^n\) be a domain, \(s \in (0,1)\), \(\Lambda \ge 1\), fix some \(s \le t < \min \{2s,1\}\) and let \(f \in L^q_{loc}(\Omega )\) for some \(q \in \left( \frac{2n}{n+2(2s-t)},\infty \right) \). In addition, assume that \(A \in {\mathcal {L}}_0(\Lambda )\) is VMO in \(\Omega \). Then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation \(L_A u = f \text { in } \Omega ,\) we have

$$\begin{aligned} u \in {\left\{ \begin{array}{ll} W^{t,\frac{nq}{n-(2s-t)q}}_{loc}(\Omega ) , &{} \text { if } q<\frac{n}{2s-t} \\ W^{t,p}_{loc}(\Omega ) \text { for any } p \in (1,\infty ), &{} \text { if } q \ge \frac{n}{2s-t}. \end{array}\right. } \end{aligned}$$
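
For orientation, the exponents in Theorem 1.3 can be read off from those in Theorem 1.1 by a short computation, which we sketch here. Setting \(q=\frac{np}{n+(2s-t)p}\) and solving for p gives

$$\begin{aligned} q = \frac{np}{n+(2s-t)p} \iff qn = p \left( n-(2s-t)q \right) \iff p = \frac{nq}{n-(2s-t)q}, \end{aligned}$$

where the last step requires \(q<\frac{n}{2s-t}\). Since \(q=\frac{n}{n/p+(2s-t)}\) is increasing in p, equals \(\frac{2n}{n+2(2s-t)}\) at \(p=2\) and tends to \(\frac{n}{2s-t}\) as \(p \rightarrow \infty \), the range \(p \in (2,\infty )\) corresponds exactly to \(q \in \left( \frac{2n}{n+2(2s-t)},\frac{n}{2s-t} \right) \). As a purely illustrative instance, for \(n=3\), \(s=1/2\), \(t=3/4\) and \(q=2\) we have \(\frac{2n}{n+2(2s-t)}=\frac{12}{7}<2<12=\frac{n}{2s-t}\) and

$$\begin{aligned} p = \frac{nq}{n-(2s-t)q} = \frac{6}{3-\frac{1}{2}} = \frac{12}{5}. \end{aligned}$$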

Since for any \(s< t < \min \{2s,1\}\) we have \(\frac{2n}{n+2(2s-t)} < 2\), Theorem 1.3 in particular implies the following higher differentiability result for nonlocal equations with right-hand side in \(L^2\).

Theorem 1.4

Let \(\Omega \subset {\mathbb {R}}^n\) be a domain, \(s \in (0,1)\), \(\Lambda \ge 1\) and \(f \in L^2_{loc}(\Omega )\). In addition, assume that \(A \in {\mathcal {L}}_0(\Lambda )\) is VMO in \(\Omega \). Then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation \(L_A u = f \text { in } \Omega ,\) we have \(u \in W^{t,2}_{loc}(\Omega )\) for any \(s< t < \min \{2s,1\}\).

Remark 1.5

Actually, the conclusions of Theorems 1.1, 1.3 and 1.4 also remain valid for a class of coefficients A that in general might not be VMO, including in particular irregular coefficients that are translation invariant inside of \(\Omega \).

More precisely, our approach is flexible enough in order to include the case when \(A \in {\mathcal {L}}_0(\Lambda )\) satisfies \(A(x,y)=a(x-y)\) for all \(x,y \in \Omega \) and some measurable function \(a: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\), but is not required to satisfy any additional regularity assumption. For a more elaborate discussion regarding this extension of our main results, we refer to Remark 8.1.

1.4 Local elliptic equations with VMO coefficients

From the point of view of the regularity theory for local elliptic equations, our main results can be considered to be somewhat surprising. In order to illustrate this at first glance surprising nature of our main results, let us briefly consider local second-order elliptic equations in divergence form of the type

$$\begin{aligned} \text {div}(B \nabla u)= f \quad \text {in } \Omega , \end{aligned}$$
(1.9)

where the matrix of coefficients \(B=\{b_{ij}\}_{i,j=1}^n\) is assumed to be uniformly elliptic and bounded. As is rigorously established, for instance, in [25], equation (1.9) can be thought of as a local analogue of the nonlocal equation (1.1) corresponding to the limit case \(s=1\). Therefore, it is natural to guess that the regularity properties of solutions to the nonlocal equation (1.1) should in some sense correspond to those of equation (1.9). However, it turns out that in the context of higher regularity, this is not true at all.

A classical fact (see e.g. [39, 49]) is that if the coefficients \(b_{ij}\) are continuous in \(\Omega \) and if \(f \in L^\frac{np}{n+p}_{loc}(\Omega )\) for some \(p>2\), then weak solutions \(u \in W^{1,2}_{loc}(\Omega )\) of equation (1.9) belong to \(W^{1,p}_{loc}(\Omega )\). While for equations with general measurable coefficients such a gain of regularity is not achievable, it was nevertheless realized later (see [19]) that the above assertion remains true if the continuity assumption on the coefficients is relaxed to assuming that the coefficients belong to the space of functions with vanishing mean oscillation \(\text {VMO}(\Omega )\) (see also e.g. [2, 4, 21, 30] for some more general developments). In addition, if one is only interested in obtaining \(W^{1,p}_{loc}\) regularity for some fixed p, then similar to our Theorem 1.2, in more recent years it was observed that it suffices for B to be small in BMO, see [8, 9]. However, in contrast to our main results, the results mentioned above do not yield any differentiability gain.

And indeed, in order to gain any amount of differentiability along the Sobolev scale in the setting of local equations, a corresponding amount of differentiability has to be imposed on the coefficients, which can already be observed in one-dimensional examples (see e.g. [32,  section 1]). Thus, in the setting of local elliptic equations with VMO or even continuous coefficients in general no differentiability gain at all is attainable. In contrast, our main results show that in the setting of nonlocal equations with VMO coefficients, the differentiability of weak solutions improves quite significantly. Let us give some further illustrations of these improved regularizing effects of nonlocal equations contained in our main results.

In fact, in the case when \(s \le 1/2\), we are able to almost match the optimal Calderón–Zygmund-type Sobolev regularity for the fractional Laplacian, which corresponds to the case when the coefficient A is constant. Namely, it is known that for the weak solution of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} (-\Delta )^s u = f &{} \text { in } \Omega \\ u=0 &{} \text { a.e. in } {\mathbb {R}}^n \setminus \Omega , \end{array}\right. } \end{aligned}$$

we have \(u \in W^{2s,p}_{loc}(\Omega )\) whenever \(f \in L^p(\Omega )\) for some \(p \in \left[ 2,\infty \right) \) (see [5]), while our main results show that despite the presence of a general VMO coefficient A in (1.1), for \(s \le 1/2\) weak solutions of (1.1) still belong to \(W^{t,p}_{loc}(\Omega )\) for any \(t<2s\) whenever \(f \in L^p_{loc}(\Omega )\). This is in sharp contrast to the setting of local second-order equations, since weak solutions \(u \in W^{1,2}_{loc}(\Omega )\) to the Poisson equation \(\Delta u= f\) in \(\Omega \) belong to \(W^{2,p}_{loc}(\Omega )\) whenever \(f \in L^p_{loc}(\Omega )\), gaining a full weak derivative, while as mentioned above, in the presence of VMO coefficients in (1.9) in general not even a gain of fractional differentiability can be expected.
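
The claim above that \(f \in L^p_{loc}(\Omega )\) suffices in the case \(s \le 1/2\) follows from Theorem 1.1 by comparing exponents, as the following brief sketch indicates. For \(s \le 1/2\) we have \(\min \{2s,1\}=2s\), and for any \(p \in (2,\infty )\) and any \(s \le t<2s\) we have \((2s-t)p>0\), so that

$$\begin{aligned} \frac{np}{n+(2s-t)p}< \frac{np}{n} = p. \end{aligned}$$

Therefore, by Hölder's inequality on relatively compact subsets of \(\Omega \), we have \(L^p_{loc}(\Omega ) \subset L^\frac{np}{n+(2s-t)p}_{loc}(\Omega )\), and Theorem 1.1 yields \(u \in W^{t,p}_{loc}(\Omega )\).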

In the case when \(s > 1/2\), our main results only yield differentiability for any \(t<1\), so that in this case we are no longer able to almost match the optimal Sobolev regularity for the fractional Laplacian. However, this seems natural to us, since we do not expect that the differentiability of solutions to local second-order equations can be exceeded by solutions to corresponding nonlocal equations of lower order. Nevertheless, for \(s \ge 1/2\) our main results in particular show that weak solutions to nonlocal equations with VMO coefficients of the type (1.1) almost share the amount of differentiability that weak solutions to local equations with VMO coefficients of the type (1.9) possess, despite the fact that the order of such nonlocal equations is lower.

1.5 Previous related results

By now, there is a substantial body of work concerning the regularity theory for weak solutions to nonlocal equations of the type (1.1).

This is especially true concerning regularity results of purely nonlocal type, in the sense that the obtained results do not have analogues in the regularity theory of local elliptic equations. This line of results was started in the papers [32] and [47], where it was demonstrated that in the case of general bounded measurable coefficients \(A \in {\mathcal {L}}_0(\Lambda )\), weak solutions to nonlocal equations of the type (1.1) are slightly higher differentiable and higher integrable, provided the right-hand side satisfies \(f \in L^q_{loc}\) for some \(q>\frac{2n}{n+2s}\). Our main results show that under the additional assumption that A is VMO, the conclusions of the results in [32, 47] can be improved to gaining larger amounts of differentiability and integrability.

Concerning results on higher Sobolev regularity for nonlocal equations of the type (1.1), in [36] Mengesha, Schikorra and Yeepo proved results similar to our Theorem 1.1 in the case when \(\Omega ={\mathbb {R}}^n\) and under the assumption that the mapping \(x \mapsto A(x,y)\) is globally Hölder continuous for some arbitrarily small Hölder exponent. Since this Hölder continuity assumption on A in particular does not include discontinuous coefficients of VMO-type like (1.6) and (1.7), in [36,  p. 10] the authors raised the question whether the regularity gain they obtained remains valid for coefficients that merely belong to VMO. Therefore, one of the main achievements of the present paper is that our main results answer this question affirmatively, even establishing the desired regularity in the slightly more general case when the coefficient is merely assumed to be small in BMO. Moreover, in contrast to [36] we are also able to include translation invariant coefficients that do not satisfy any smoothness assumption, see Remark 8.1. In addition, we rely on a completely different set of techniques compared to the ones applied in [36]. Namely, while the key ingredient in [36] is given by commutator estimates, our approach is based on a delicate interplay between comparison estimates and so-called dual pairs (see Sect. 1.7).

Furthermore, in [43] we proved weaker versions of the main results in the present paper, in the sense that the differentiability gain obtained in [43] depends on n, s and in particular on the amount of integrability that we are able to gain, while in our main results stated in Sect. 1.3 an arbitrarily small gain of integrability suffices in order to gain differentiability in the full range \(s< t < \min \{2s,1\}\). For this reason, the amount of differentiability gained in [43] only matches the one in this paper in the case when a very large amount of integrability is prescribed on the right-hand side f, while in general the differentiability gain in this work exceeds the one obtained in [43] by a very substantial amount.

This is probably illustrated best in the setting of our Theorem 1.4: For \(f \in L^2_{loc}(\Omega )\), [43,  Theorem 1.3] only implies that \(u \in W^{t,2}_{loc}(\Omega )\) for any t in the restricted range

$$\begin{aligned} s<t < t_{n,s} {:}{=} {\left\{ \begin{array}{ll} \frac{ns+4s^2}{n+2s} , &{} \text { if } s \le 1/2 \\ \frac{ns+4s-4s^2}{n+2-2s}, &{} \text { if } s > 1/2. \end{array}\right. } \end{aligned}$$

In particular, e.g. for \(n=2\) and \(s=1/2\), [43,  Theorem 1.3] yields differentiability for any \(t<2/3\), while in this case our Theorem 1.4 yields differentiability for any \(t<1\). In higher dimensions, the improvement in differentiability gain becomes even more visible. In fact, for any \(s \in (0,1)\) and any fixed \(\varepsilon >0\), there exists some large enough \(n=n(s,\varepsilon )\) such that \(t_{n,s} < s+\varepsilon \), so that the gain of differentiability in [43] is in general very small in the case when f merely belongs to \(L^2_{loc}(\Omega )\). On the other hand, for \(f \in L^2_{loc}(\Omega )\) our Theorem 1.4 implies that \(u \in W^{t,2}_{loc}(\Omega )\) in the whole range \(s<t<\min \{2s,1\}\), independently of n.
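
The dependence of \(t_{n,s}\) on the dimension can be made explicit by subtracting s from the definition above; the following elementary computation is included only as a sketch:

$$\begin{aligned} t_{n,s}-s = {\left\{ \begin{array}{ll} \frac{ns+4s^2-s(n+2s)}{n+2s} = \frac{2s^2}{n+2s} , &{} \text { if } s \le 1/2 \\ \frac{ns+4s-4s^2-s(n+2-2s)}{n+2-2s} = \frac{2s(1-s)}{n+2-2s}, &{} \text { if } s > 1/2. \end{array}\right. } \end{aligned}$$

In both cases the right-hand side tends to 0 as \(n \rightarrow \infty \), which gives \(t_{n,s}<s+\varepsilon \) for n large enough. For \(n=2\) and \(s=1/2\), the first case yields \(t_{2,1/2}=\frac{1}{2}+\frac{1/2}{3}=\frac{2}{3}\), in accordance with the value stated above.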

Moreover, in [42] for \(p \in (2,\infty )\) it was proved that weak solutions u to (1.1) belong to \(W^{s,p}_{loc}(\Omega )\) whenever \(f \in L^\frac{np}{n+sp}_{loc}(\Omega )\) and A is continuous in \(\Omega \times \Omega \), which corresponds to the case of no differentiability gain as in the setting of local equations.

Also, in the case when \(f \in L^2_{loc}(\Omega )\) and \(A \in C^s(\Omega \times \Omega )\), by using difference quotients, in [16] it was shown that weak solutions to (1.1) belong to \(W^{t,2}_{loc}(\Omega )\) for any \(t < 2s\), which also follows from our Theorem 1.4 in the case when \(s \le 1/2\). In other words, in this case we not only do not need \(C^s\) regularity of the coefficient A, but not even continuity of A in order to achieve this higher differentiability result. In fact, it is sufficient for A to be VMO in \(\Omega \).

More results regarding Sobolev regularity for nonlocal equations are for example proved in [3, 6, 22, 26, 28, 29, 35, 37, 40], while various results on Hölder regularity are proved in [7, 11,12,13,14,15, 17, 18, 23, 24, 27, 33, 41, 44, 45, 48]. Furthermore, for some regularity results concerning nonlocal equations similar to (1.1) in the case when the right-hand side is merely a measure, we refer to [31].

1.6 Some remaining open questions and possible extensions

First of all, while the differentiability gain in [43] is, as discussed in Sect. 1.5, in general substantially smaller than the gain we achieve in our main results, the main results in [43] also hold for certain nonlinear generalizations of equation (1.1), while in this paper and also in [36] only linear equations are considered. Thus, a natural question is whether the improved differentiability gain in the present paper remains valid for nonlinear equations.

In addition, in [36] the lower bound we imposed on A in (1.3) is only assumed to hold at the diagonal, so that another naturally arising question is if the lower bound on A can be relaxed to hold only at the diagonal also in the case when A is merely VMO.

Furthermore, in [1], in the case of the fractional Laplacian, that is, in the special case when A is constant, a global regularity result corresponding to our Theorem 1.3 was proved under the additional restriction that \(q<\frac{1}{t-s}\), which is sharp when dealing with regularity up to the boundary. In view of this global regularity result, another interesting question is to what extent the conclusions of our main results, which deal with local regularity, remain valid up to the boundary.

Moreover, we believe that our approach is flexible enough in order to generalize our main results to include so-called local weak solutions as considered e.g. in [6, 7] or [41], essentially only assuming that \(u \in W^{s,2}_{loc}(\Omega )\) and the finiteness of the nonlocal tails of u. However, since including this slightly more general notion of solutions would require a revision of the previous work [43] and most notably [32], we decided not to insist on this point.

Finally, another feature of our approach is that it also enables us to prove local \(W^{t,p}\) estimates in the case when the right-hand side f in (1.1) is replaced by the fractional Laplacian or even by sums of more general nonlocal operators, see Theorem 7.7 and Remark 7.8.

1.7 Approach

Before commencing with the technical part of the paper, in this section we give a heuristic summary of our approach, in particular since we believe that the techniques displayed in this work have the potential to be useful in a large variety of situations involving nonlocal equations.

As mentioned, in the previous paper [43], we proved weaker versions of the main results in the present paper, gaining only a restricted amount of differentiability that depends on the amount of integrability we are able to gain. This was achieved by introducing ideas that on the one hand allow us to prove suitable comparison estimates in our nonlocal setting, and on the other hand allow us to combine various highly nontrivial covering techniques introduced in the papers [10, 32]. Our approach in this paper essentially combines the techniques implemented in [43] with some novel insights that enable us to gain differentiability independently of the integrability gain. Since an in-depth heuristic description of the philosophy of the approach from [43] was already given in [43,  Section 1.5], here we focus on emphasizing the main novelties of the approach used in this work compared to the one applied in [43].

The objects at the heart of the approach from [43] are certain fractional gradients given by so-called dual pairs. Namely, for some fixed \(\theta \in \left( 0,\frac{1}{2} \right) \), we define a gradient-type function U and a Borel measure \(\mu \) on \({\mathbb {R}}^{2n}\) as follows. For any function \(u:{\mathbb {R}}^n \rightarrow {\mathbb {R}}\) and \((x,y) \in {\mathbb {R}}^{2n}\) with \(x \ne y\), we define the function

$$\begin{aligned} U(x,y){:}{=}\frac{|u(x)-u(y)|}{|x-y|^{s+\theta }}. \end{aligned}$$
(1.10)

In addition, for any measurable set \(E \subset {\mathbb {R}}^{2n}\), set

$$\begin{aligned} \mu (E){:}{=} \int _{E} \frac{dxdy}{|x-y|^{n-2\theta }}. \end{aligned}$$
(1.11)

For any domain \(\Omega \subset {\mathbb {R}}^n\), we then clearly have \(u \in W^{s,2}(\Omega )\) if and only if \(u \in L^2(\Omega )\) and \(U \in L^2(\Omega \times \Omega ,\mu )\), so that in some sense the function U and the measure \(\mu \) are in duality. Regarding larger exponents, by a simple computation, for any \(p > 2\) and \({\widetilde{s}}{:}{=}s+\theta \left( 1-\frac{2}{p} \right) >s\) we have

$$\begin{aligned} u \in W^{{\widetilde{s}},p}(\Omega ) \quad \text {if and only if} \quad u \in L^p(\Omega ) \text { and } U \in L^p(\Omega \times \Omega ,\mu ). \end{aligned}$$
(1.12)
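
The simple computation behind (1.12) can be sketched as follows, assuming the standard Gagliardo seminorm \([u]_{W^{{\widetilde{s}},p}(\Omega )}\) (the precise definitions are those of Sect. 2). Inserting (1.10) and (1.11), we obtain

$$\begin{aligned} \int _{\Omega \times \Omega } U^p d\mu = \int _{\Omega } \int _{\Omega } \frac{|u(x)-u(y)|^p}{|x-y|^{p(s+\theta )}} \frac{dydx}{|x-y|^{n-2\theta }} = \int _{\Omega } \int _{\Omega } \frac{|u(x)-u(y)|^p}{|x-y|^{n+p{\widetilde{s}}}} dydx = [u]_{W^{{\widetilde{s}},p}(\Omega )}^p, \end{aligned}$$

since \(p(s+\theta )+n-2\theta = n+p \left( s+\theta \left( 1-\frac{2}{p} \right) \right) = n+p{\widetilde{s}}\). For \(p=2\) the same identity holds with \({\widetilde{s}}=s\), which is the case \(W^{s,2}(\Omega )\) mentioned above.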

Therefore, a key feature of this approach to fractional-type gradients is that by proving higher integrability of the gradient-type function U with respect to the measure \(\mu \), we do not only gain regularity along the integrability scale of fractional Sobolev spaces, but also a substantial amount of higher differentiability! In [43], this property of such dual pairs of the type \((U,\mu )\) was then exploited by proving that in the restricted range \(0<\theta < \min \{s,1-s\}\), we have \(U \in L^p_{loc}(\Omega \times \Omega ,\mu )\), which in turn then also gives some higher differentiability as indicated above. However, the amount of differentiability gained in this fashion is in general strictly smaller than the amount we gain in our main results, since a small amount of integrability gain also only yields a small gain along the differentiability scale. In the present paper, we overcome this issue by considering also fractional gradients and dual pairs of higher order. More precisely, the key idea is to iteratively replace the function U by fractional gradient-type functions of the type

$$\begin{aligned} U_{\alpha }(x,y){:}{=}\frac{|u(x)-u(y)|}{|x-y|^{\alpha +\theta _\alpha }} \end{aligned}$$
(1.13)

and the above measure \(\mu \) by measures of the form

$$\begin{aligned} \mu _{\alpha }(E){:}{=} \int _{E} \frac{dxdy}{|x-y|^{n-2\theta _\alpha }}, \end{aligned}$$
(1.14)

where \(s \le \alpha <\min \{2s,1\}\) and \(\theta _\alpha {:}{=} s+\theta -\alpha \). In a similar way as above, for any \(p > 2\) and \({{\widetilde{\alpha }}}{:}{=}\alpha +\theta _\alpha \left( 1-\frac{2}{p} \right) >\alpha \), we have

$$\begin{aligned} u \in W^{{{\widetilde{\alpha }}},p}(\Omega ) \quad \text {if and only if} \quad u \in L^p(\Omega ) \text { and } U_\alpha \in L^p(\Omega \times \Omega ,\mu _\alpha ). \end{aligned}$$
(1.15)

With these notions in place, let us now briefly sketch the further approach and in particular the iteration argument that leads to achieving \(W^{t,p}_{loc}\) regularity for any \(s<t<\min \{2s,1\}\).

First, we observe that instead of directly proving the desired regularity for nonlocal equations of the type \(L_A u=f\), for technical reasons it is more appropriate for us to first focus on proving regularity for equations of the type \(L_A u=(-\Delta )^s g\), where \((-\Delta )^s\) denotes the fractional Laplacian. This is because once we are able to transfer a sufficient amount of regularity from g to u, in view of the known \(H^{2s,p}\) estimates for the fractional Laplacian, we can then first transfer regularity from f to some solution g of \((-\Delta )^s g=f\) and then from g to weak solutions u of (1.1). Thus, we focus on proving that for weak solutions to \(L_A u = (-\Delta )^s g\), for any \(s<t<\min \{2s,1\}\) we have the implication

$$\begin{aligned} g \in W^{t,p}_{loc} \implies u \in W^{t,p}_{loc}. \end{aligned}$$
(1.16)

Instead of proving this implication directly, roughly speaking we focus on proving implications of the type

$$\begin{aligned} G_\alpha \in L^p_{loc}(\Omega \times \Omega ,\mu _\alpha ) \implies U_\alpha \in L^p_{loc}(\Omega \times \Omega ,\mu _\alpha ) \end{aligned}$$
(1.17)

for any \(\alpha \in [s,\min \{2s,1\})\), where \(G_\alpha \) is defined in the same way as \(U_\alpha \) with u replaced by g. Since \(\theta _\alpha \) decreases as \(\alpha \) increases, this exactly leads to the implication (1.16) for any \(s<t<\min \{2s,1\}\).

In order to prove the implication (1.17), we make use of a covering argument implemented in detail in [43]. The main idea is to cover the level sets \(\left\{ {\mathcal {M}}(U_\alpha ^2) > \lambda ^2 \right\} \) of the maximal function of \(U_\alpha \) by dyadic cubes in order to show that these level sets decay sufficiently fast with respect to \(\mu _\alpha \), which in view of standard measure-theoretic arguments then implies the desired implication (1.17). However, since the above level sets are subsets of \({\mathbb {R}}^{2n}\) instead of \({\mathbb {R}}^{n}\), in our setup we have to run an exit time argument in \({\mathbb {R}}^{2n}\) instead of \({\mathbb {R}}^{n}\) in order to cover the level set of U by Calderón–Zygmund cubes in \({\mathbb {R}}^{2n}\), which leads to rather severe technical difficulties. In particular, since close to the diagonal the information given by the equation can be used much more efficiently, an additional cover of the diagonal in terms of balls is constructed. However, since a large part of this technical covering argument works almost in exactly the same way as the one applied in [43], as indicated before, in this paper we primarily focus on the nontrivial modifications necessary in order to prove the implication (1.17) in the higher-order case when \(\alpha >s\).

Namely, probably the most crucial complication in contrast to [43] arises in the arguments applied in order to control the measures of the balls in the mentioned additional diagonal cover. In [43], the central tool in order to achieve this is given by a comparison estimate. More precisely, in [43] the function U was locally approximated in \(L^2(\mu )\) by a corresponding function V, which is given as in (1.10) with u replaced by a weak solution v of the corresponding homogeneous equation \(L_{A_0} v=0\) with locally "frozen" coefficient \(A_0\). Equivalently, it was proved that the difference \(w{:}{=}u-v\) is small in \(W^{s,2}\) whenever g is small in \(W^{s,2}\), which can be shown by testing the equation with w itself. The mentioned covering argument then essentially allows us to transfer regularity from v to u. More precisely, in [41] it was shown that such weak solutions v to homogeneous equations with locally constant coefficients belong to \(C^\beta \) for any \(0<\beta < \min \{2s,1\}\), which suffices in order to transfer enough regularity from v to u in order to obtain the desired result.

In contrast, proving such a comparison estimate for higher-order fractional gradients is more involved, since in this case the order of the gradient-type function no longer matches the order of the equation already in \(L^2\). We resolve this issue as follows. In order to show that \(U_\alpha \) is close to \(V_\alpha \) in \(L^2(\mu _\alpha )\) or equivalently, that \(w=u-v\) is small in \(W^{\alpha ,2}\) whenever g is small in \(W^{\alpha ,2}\), roughly speaking we additionally assume that w satisfies an estimate of the form

$$\begin{aligned}{}[w]_{W^{\alpha ,2}} \lesssim [w]_{W^{s,2}} + [g]_{W^{\alpha ,m}} + \text { tail terms} \end{aligned}$$
(1.18)

for some \(m>2\). This additional estimate then essentially allows us to reduce the problem of proving the smallness of w in \(W^{\alpha ,2}\) to showing the smallness of w in \(W^{s,2}\), which was already done in [43]. In addition, while in [43] it was necessary to locally freeze the coefficient A, since in view of the Sobolev embedding the main results in [43] already imply a \(C^\beta \) estimate for any \(0<\beta < \min \{2s,1\}\) in the case when A is merely VMO, in our situation freezing the coefficient is no longer necessary. However, in order to arrive at our main results, it then still remains to remove the assumption that the estimate (1.18) holds.

We achieve this as follows. Since in the case when \(\alpha =s\) the estimate (1.18) holds trivially, in this case (which corresponds to [43]) we already achieve some higher differentiability or more precisely, we obtain that the implication (1.16) holds for some small enough \(t_1>s\). But since due to the linearity of the equation, \(w=u-v\) also satisfies the equation \(L_A w = (-\Delta )^s g\), the estimate (1.18) is therefore also satisfied for \(\alpha =t_1\). Thus, through the procedure we sketched above, we obtain that u and w satisfy the implication (1.17) for \(\alpha =t_1\), leading to the implication (1.16) for some \(t_2>t_1\), exceeding the amount of differentiability obtained in [43]. Iterating this procedure finitely many times then indeed leads to the implication (1.16) in the full range \(s<t < \min \{2s,1\}\).

1.8 Brief outline of the paper

The paper is organized as follows. In Sect. 2, we define the fractional Sobolev spaces \(W^{s,p}\) and mention some of their properties that we use throughout the paper. In Sect. 3, we then further discuss the notion of fractional gradients given by dual pairs introduced in the previous Sect. 1.7.

The rest of the paper is then devoted to the proof of our main results. In Sect. 4 we implement the approximation argument for higher-order fractional gradients mentioned in Sect. 1.7. In Sect. 5, we turn to proving certain good-\(\lambda \) inequalities, both at the diagonal and far away from the diagonal. These good-\(\lambda \) inequalities then allow us to carry out an adaptation of the covering argument from [43,  Section 7] for higher-order fractional gradients. Since the covering argument needed in our setting follows very closely the steps in [43,  Section 7], in Sect. 6 we only explain the required adaptations in order to arrive at the desired level set estimate. In Sect. 7, this level set estimate is then used along with some delicate iteration arguments in order to prove a priori estimates for weak solutions. Finally, in Sect. 8 these a priori estimates are then combined with smoothing techniques in order to arrive at our main results.

1.9 Some notation

For convenience, let us fix some notation which we use throughout the paper. By \(C\), \(c\) and \(C_i,c_i\), \(i \in {\mathbb {N}}_0\), we always denote positive constants, while dependences of the constants on parameters will be shown in parentheses. As usual, by

$$\begin{aligned} B_r(x_0){:}{=} \{x \in {\mathbb {R}}^n \mid |x-x_0|<r \} \end{aligned}$$

we denote the open euclidean ball with center \(x_0 \in {\mathbb {R}}^n\) and radius \(r>0\). We also set \(B_r{:}{=}B_r(0)\). In addition, by

$$\begin{aligned} Q_r(x_0){:}{=} \{x \in {\mathbb {R}}^n \mid |x-x_0|_\infty <r/2\} \end{aligned}$$

we denote the open cube with center \(x_0 \in {\mathbb {R}}^n\) and sidelength \(r>0\). Moreover, if \(E \subset {\mathbb {R}}^n\) is measurable, then by |E| we denote the n-dimensional Lebesgue measure of E. If \(0<|E|<\infty \), then for any \(u \in L^1(E)\) we define

$$\begin{aligned} {\overline{u}}_{E}{:}{=} \frac{1}{|E|} \int _{E} u(x)dx. \end{aligned}$$

As indicated in Sect. 1.7, throughout this paper, we often consider integrals and functions on \({\mathbb {R}}^{2n}={\mathbb {R}}^{n} \times {\mathbb {R}}^{n}\). Instead of dealing with the usual euclidean balls in \({\mathbb {R}}^{2n}\), for this purpose it is more convenient for us to use the balls generated by the norm

$$\begin{aligned} ||(x_0,y_0)||{:}{=}\max \{|x_0|,|y_0|\}, \quad (x_0,y_0) \in {\mathbb {R}}^{2n}. \end{aligned}$$

These balls with center \((x_0,y_0) \in {\mathbb {R}}^{2n}\) and radius \(r>0\) are denoted by \({\mathcal {B}}_r(x_0,y_0)\) and are of the form

$$\begin{aligned} {\mathcal {B}}_r(x_0,y_0)=B_r(x_0) \times B_r(y_0). \end{aligned}$$
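
Indeed, this product structure can be read off directly from the definition of the norm. Spelled out for convenience, for any \((x,y) \in {\mathbb {R}}^{2n}\) we have

$$\begin{aligned} ||(x,y)-(x_0,y_0)||<r \quad \Longleftrightarrow \quad \max \{|x-x_0|,|y-y_0|\}<r \quad \Longleftrightarrow \quad x \in B_r(x_0) \text { and } y \in B_r(y_0). \end{aligned}$$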

In the case when \(x_0=y_0\) we also write \({\mathcal {B}}_r(x_0){:}{=}{\mathcal {B}}_r(x_0,x_0)\) and call such balls diagonal balls. We also set \({\mathcal {B}}_r{:}{=}{\mathcal {B}}_r(0)\). Similarly, for \(x_0,y_0 \in {\mathbb {R}}^n\) and \(r>0\) we define \({\mathcal {Q}}_r(x_0,y_0){:}{=}Q_r(x_0) \times Q_r(y_0)\), \({\mathcal {Q}}_r(x_0){:}{=}{\mathcal {Q}}_r(x_0,x_0)\) and \({\mathcal {Q}}_r{:}{=}{\mathcal {Q}}_r(0)\).

2 Fractional Sobolev spaces

Definition

Let \(\Omega \subset {\mathbb {R}}^n\) be a domain. For \(p \in [1,\infty )\) and \(s \in (0,1)\), we define the fractional Sobolev space

$$\begin{aligned} W^{s,p}(\Omega ){:}{=}\left\{ u \in L^p(\Omega ) \mathrel {\Big |} \int _{\Omega } \int _{\Omega } \frac{|u(x)-u(y)|^p}{|x-y|^{n+sp}}dydx<\infty \right\} \end{aligned}$$

with norm

$$\begin{aligned} ||u||_{W^{s,p}(\Omega )} {:}{=} \left( ||u||_{L^p(\Omega )}^p + [u]_{W^{s,p}(\Omega )}^p \right) ^{1/p} , \end{aligned}$$

where

$$\begin{aligned}{}[u]_{W^{s,p}(\Omega )}{:}{=}\left( \int _{\Omega } \int _{\Omega } \frac{|u(x)-u(y)|^p}{|x-y|^{n+sp}}dydx \right) ^{1/p} . \end{aligned}$$

In addition, we define the corresponding local fractional Sobolev spaces by

$$\begin{aligned} W^{s,p}_{loc}(\Omega ){:}{=} \left\{ u \in L^p_{loc}(\Omega ) \mid u \in W^{s,p}(\Omega ^\prime ) \text { for any domain } \Omega ^\prime \Subset \Omega \right\} . \end{aligned}$$

Also, we define the space

$$\begin{aligned} W^{s,p}_0(\Omega ){:}{=} \left\{ u \in W^{s,p}({\mathbb {R}}^n) \mid u = 0 \text { in } {\mathbb {R}}^n \setminus \Omega \right\} . \end{aligned}$$

We use the following fractional Poincaré inequality, see [38,  Section 4].

Lemma 2.1

(fractional Poincaré inequality) Let \(s \in (0,1)\), \(p \in [1,\infty )\), \(r>0\) and \(x_0 \in {\mathbb {R}}^n\). For any \(u \in L^p(B_r(x_0))\), we have

$$\begin{aligned} \int _{B_r(x_0)} \left| u(x)- {\overline{u}}_{B_r(x_0)} \right| ^p dx \le C r^{sp} \int _{B_r(x_0)} \int _{B_r(x_0)} \frac{|u(x)-u(y)|^p}{|x-y|^{n+sp}}dydx, \end{aligned}$$

where \(C=C(s,p)>0\).

Proposition 2.2

Let \(\Omega \subset {\mathbb {R}}^n\) be a Lipschitz domain, \(s \in (0,1)\) and \(p \in [1,\infty )\).

  • If \(sp<n\), then we have the continuous embedding

    $$\begin{aligned} W^{s,p}(\Omega ) \hookrightarrow L^\frac{np}{n-sp}(\Omega ). \end{aligned}$$
  • If \(sp=n\), then for any \(q \in [1,\infty )\) we have the continuous embedding

    $$\begin{aligned} W^{s,p}(\Omega ) \hookrightarrow L^q(\Omega ). \end{aligned}$$
  • If \(sp>n\), then we have the continuous embedding

    $$\begin{aligned} W^{s,p}(\Omega ) \hookrightarrow C^{s-\frac{n}{p}}(\Omega ). \end{aligned}$$

In addition, if \(sp>n\) and \(\Omega =B_r(x_0)\) for some \(r>0\) and some \(x_0 \in {\mathbb {R}}^n\), then for any \(u \in W^{s,p}(B_r(x_0))\), we have

$$\begin{aligned}{}[u]_{C^{s-\frac{n}{p}}(B_r(x_0))} \le C [u]_{W^{s,p}(B_r(x_0))}, \end{aligned}$$
(2.1)

where \(C=C(n,s,p)>0\).

Proof

The above three embeddings follow from [20,  Theorem 6.7, Theorem 6.10, Theorem 8.2]. Let us now prove (2.1). Define \(u_r(x){:}{=}u(rx+x_0)\). Applying the third of the above embeddings to \({\widetilde{u}}_r{:}{=} u_r-\overline{\left( u_r \right) }_{B_1} \in W^{s,p}(B_1)\) and then using the fractional Poincaré inequality (Lemma 2.1), along with changes of variables leads to

$$\begin{aligned} r^{s-\frac{n}{p}} [u]_{C^{s-\frac{n}{p}}(B_r(x_0))}&= [u_r]_{C^{s-\frac{n}{p}}(B_1)} \\&= [{\widetilde{u}}_r]_{C^{s-\frac{n}{p}}(B_1)} \\&\le C_1 \left( \int _{B_1} \int _{B_1} \frac{|{\widetilde{u}}_r(x)-{\widetilde{u}}_r(y)|^{p}}{|x-y|^{n+sp}}dydx + \int _{B_1} |{\widetilde{u}}_r(x)|^p dx \right) ^\frac{1}{p} \\&\le C \left( \int _{B_1} \int _{B_1} \frac{|u_r(x)-u_r(y)|^{p}}{|x-y|^{n+sp}}dydx \right) ^\frac{1}{p} \\&= C r^{s-\frac{n}{p}} \left( \int _{B_r(x_0)} \int _{B_r(x_0)} \frac{|u(x)-u(y)|^{p}}{|x-y|^{n+sp}}dydx \right) ^\frac{1}{p}, \end{aligned}$$

where \(C_1\) and C depend only on n, s and p. Since the factor \(r^{s-\frac{n}{p}}\) cancels out on both sides, the proof is finished. \(\square \)
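
For the reader's convenience, the two changes of variables used in the first and the last step of the above chain can be sketched as follows. Here we assume that \([\cdot ]_{C^{\gamma }}\) denotes the usual Hölder seminorm. The shorthand \(\gamma {:}{=}s-\frac{n}{p}\) and the substitution \(x^\prime =rx+x_0\), \(y^\prime =ry+x_0\) are introduced only for this computation.

$$\begin{aligned}{}[u_r]_{C^{\gamma }(B_1)}&= \sup _{x \ne y \in B_1} \frac{|u(rx+x_0)-u(ry+x_0)|}{|x-y|^{\gamma }} = r^{\gamma } \sup _{x^\prime \ne y^\prime \in B_r(x_0)} \frac{|u(x^\prime )-u(y^\prime )|}{|x^\prime -y^\prime |^{\gamma }} = r^{\gamma } [u]_{C^{\gamma }(B_r(x_0))}, \\ \int _{B_1} \int _{B_1} \frac{|u_r(x)-u_r(y)|^{p}}{|x-y|^{n+sp}}dydx&= r^{-2n} r^{n+sp} \int _{B_r(x_0)} \int _{B_r(x_0)} \frac{|u(x^\prime )-u(y^\prime )|^{p}}{|x^\prime -y^\prime |^{n+sp}}dy^\prime dx^\prime = r^{sp-n} [u]_{W^{s,p}(B_r(x_0))}^p . \end{aligned}$$

Taking the p-th root in the second identity produces the factor \(r^{s-\frac{n}{p}}=r^{\gamma }\), which is the same power of r as in the first identity.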

For the following Lemma, we refer to [43,  Lemma 2.4].

Lemma 2.3

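(fractional Sobolev–Poincaré inequality) Let \(s \in (0,1)\), \(p \in [1,\infty )\), \(r>0\) and \(x_0 \in {\mathbb {R}}^n\). In addition, let

$$\begin{aligned} q \in {\left\{ \begin{array}{ll} \left[ 1,\frac{np}{n-sp} \right] , &{} \text {if } sp<n \\ {[}1,\infty ), &{} \text {if } sp \ge n. \end{array}\right. } \end{aligned}$$

Then for any \(u \in W^{s,p}(B_r(x_0))\), we have

$$\begin{aligned} \left( \frac{1}{|B_r(x_0)|} \int _{B_r(x_0)} \left| u(x)- {\overline{u}}_{B_r(x_0)} \right| ^q dx \right) ^\frac{1}{q} \le C r^{s} \left( \frac{1}{|B_r(x_0)|} \int _{B_r(x_0)} \int _{B_r(x_0)} \frac{|u(x)-u(y)|^p}{|x-y|^{n+sp}}dydx \right) ^\frac{1}{p}, \end{aligned}$$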

where \(C=C(n,s,p,q)>0\).

For \(p \in (1,\infty )\) and \(s \in (0,2)\), denote by \(H^{s,p}(\Omega )\) the standard Bessel potential spaces on \(\Omega \), see e.g. [43,  Section 2]. The following embedding result follows from [51,  Theorem 2.5], where it is given in the more general context of Besov and Triebel–Lizorkin spaces.

Proposition 2.4

Let \(1< p_0< p< p_1 < \infty \), \(s \in (0,2)\), \(s_0,s_1 \in (0,1)\) and assume that \(\Omega \subset {\mathbb {R}}^n\) is a smooth domain. If \( s_0 - \frac{n}{p_0} = s - \frac{n}{p} = s_1 - \frac{n}{p_1}, \) then

$$\begin{aligned} W^{s_0,p_0}(\Omega ) \hookrightarrow H^{s,p}(\Omega ) \hookrightarrow W^{s_1,p_1}(\Omega ). \end{aligned}$$

Unlike the first-order Sobolev spaces \(W^{1,p}(\Omega )\) on a bounded domain \(\Omega \subset {\mathbb {R}}^n\), the fractional Sobolev spaces \(W^{s,p}(\Omega )\) are not contained in each other as the integrability exponent p decreases. Nevertheless, the following result essentially shows that the mentioned inclusions are almost true.

Proposition 2.5

Let \(1< p_0 \le p<\infty \), \(s \in (0,1)\) and assume that \(\Omega \subset {\mathbb {R}}^n\) is a smooth bounded domain. Then for any \(t \in (s,1)\), we have

$$\begin{aligned} W^{t,p}(\Omega ) \hookrightarrow W^{s,p_0}(\Omega ). \end{aligned}$$

In addition, if \(\Omega =B_r(x_0)\) for some \(r>0\) and some \(x_0 \in {\mathbb {R}}^n\), then for any \(u \in W^{t,p}(B_r(x_0))\), we have

$$\begin{aligned}{}[u]_{W^{s,p_0}(B_r(x_0))} \le Cr^{\frac{n}{p_0}-\frac{n}{p}+t-s} [u]_{W^{t,p}(B_r(x_0))}, \end{aligned}$$
(2.2)

where \(C=C(n,s,t,p,p_0)>0\).

Proof

By [43,  Proposition 2.6], for \(0<\varepsilon <\min \left\{ t-s,\frac{2n}{p},2n \left( 1-\frac{1}{p_0} \right) \right\} \), we have \( W^{t,p}(\Omega ) \hookrightarrow W^{t-\varepsilon ,p_0}(\Omega ).\) Since by [20,  Proposition 2.1], we also have \( W^{t-\varepsilon ,p_0}(\Omega ) \hookrightarrow W^{s,p_0}(\Omega ) ,\) we arrive at the embedding \( W^{t,p}(\Omega ) \hookrightarrow W^{s,p_0}(\Omega ).\)

In order to prove (2.2), set \(u_r(x){:}{=}u(rx+x_0)\). By using the above embedding with respect to \({\widetilde{u}}_r{:}{=}u_r-\overline{\left( u_r \right) }_{B_1}\) and then the fractional Poincaré inequality (Lemma 2.1), along with changing variables we conclude that

$$\begin{aligned}&\left( \int _{B_r(x_0)} \int _{B_r(x_0)} \frac{|u(x)-u(y)|^{p_0}}{|x-y|^{n+sp_0}}dydx \right) ^\frac{1}{p_0} \\&\quad = r^{\frac{n}{p_0}-s} \left( \int _{B_1} \int _{B_1} \frac{|{\widetilde{u}}_r(x)-{\widetilde{u}}_r(y)|^{p_0}}{|x-y|^{n+sp_0}}dydx \right) ^\frac{1}{p_0} \\&\quad \le C_1 r^{\frac{n}{p_0}-s} \left( \int _{B_1} \int _{B_1} \frac{|{\widetilde{u}}_r(x)-{\widetilde{u}}_r(y)|^{p}}{|x-y|^{n+tp}}dydx + \int _{B_1} |{\widetilde{u}}_r(x)|^p dx \right) ^\frac{1}{p}\\&\quad \le C_2 r^{\frac{n}{p_0}-s} \left( \int _{B_1} \int _{B_1} \frac{|u_r(x)-u_r(y)|^{p}}{|x-y|^{n+tp}}dydx \right) ^\frac{1}{p} \\&\quad = C_2 r^{\frac{n}{p_0}-\frac{n}{p}+t-s} \left( \int _{B_r(x_0)} \int _{B_r(x_0)} \frac{|u(x)-u(y)|^{p}}{|x-y|^{n+tp}}dydx \right) ^\frac{1}{p}, \end{aligned}$$

where \(C_1\) and \(C_2\) depend only on \(n,p,p_0,s\) and t. This proves (2.2). \(\square \)
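
The power of r in (2.2) can be traced through the above chain as follows (a bookkeeping sketch only). The first change of variables contributes \(r^{\frac{n}{p_0}-s}\), while rescaling the \(W^{t,p}\) seminorm from \(B_1\) back to \(B_r(x_0)\) contributes \(r^{t-\frac{n}{p}}\), so that

$$\begin{aligned} r^{\frac{n}{p_0}-s} \cdot r^{t-\frac{n}{p}} = r^{\frac{n}{p_0}-\frac{n}{p}+t-s}. \end{aligned}$$

Since \(p_0 \le p\) and \(s<t\), this exponent is positive. For instance, for \(p_0=p\) the factor reduces to \(r^{t-s}\).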

3 Fractional gradients on \({\mathbb {R}}^{2n}\)

3.1 Basic properties of dual pairs

Fix some \(t \in (0,1)\) and some \(\theta \in \left( 0,\frac{1}{2} \right) \). For any function \(u:{\mathbb {R}}^n \rightarrow {\mathbb {R}}\) and \((x,y) \in {\mathbb {R}}^{2n}\) with \(x \ne y\), we define the function

$$\begin{aligned} U_{t,\theta }(x,y){:}{=}\frac{|u(x)-u(y)|}{|x-y|^{t+\theta }}. \end{aligned}$$
(3.1)

Moreover, we define a Borel measure \(\mu _\theta \) on \({\mathbb {R}}^{2n}\) as follows. For any measurable set \(E \subset {\mathbb {R}}^{2n}\), set

$$\begin{aligned} \mu _{\theta }(E){:}{=} \int _{E} \frac{dxdy}{|x-y|^{n-2\theta }}. \end{aligned}$$
(3.2)

The following Lemma follows by a straightforward computation, see [43,  Lemma 3.1].

Lemma 3.1

Let \(p \ge 2\) and set \(t_\theta {:}{=}t+\theta \left( 1-\frac{2}{p} \right) \). Then we have

$$\begin{aligned} u \in W^{t_\theta ,p}(\Omega ) \quad \text {if and only if} \quad u \in L^p(\Omega ) \text { and } U_{t,\theta } \in L^p(\Omega \times \Omega ,\mu _\theta ) \end{aligned}$$

and

$$\begin{aligned} ||U_{t,\theta }||_{L^p(\Omega \times \Omega ,\mu _\theta )}=[u]_{W^{t_\theta ,p}(\Omega )}. \end{aligned}$$
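
The computation behind Lemma 3.1 is short and may be sketched as follows. It only combines the definitions (3.1) and (3.2) with the identity \(t_\theta p = tp+\theta (p-2)\):

$$\begin{aligned} \int _{\Omega } \int _{\Omega } U_{t,\theta }^p d\mu _\theta&= \int _{\Omega } \int _{\Omega } \frac{|u(x)-u(y)|^p}{|x-y|^{(t+\theta )p}} \frac{dydx}{|x-y|^{n-2\theta }} \\&= \int _{\Omega } \int _{\Omega } \frac{|u(x)-u(y)|^p}{|x-y|^{n+tp+\theta (p-2)}}dydx = [u]_{W^{t_\theta ,p}(\Omega )}^p . \end{aligned}$$

In particular, for \(p=2\) we have \(t_\theta =t\), so that the \(L^2(\mu _\theta )\) norm of \(U_{t,\theta }\) coincides with the \(W^{t,2}\) seminorm of u for every admissible \(\theta \).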

The next Proposition contains some further important properties of the measure \(\mu _\theta \) which we use frequently throughout the paper, usually without explicit reference. For a proof, we refer to [43,  Proposition 3.2].

Proposition 3.2

  1. (i)

    For all \(r>0\) and \(x_0 \in {\mathbb {R}}^n\), we have

    $$\begin{aligned} \mu _\theta ({\mathcal {B}}_r(x_0))= \mu _\theta ({\mathcal {B}}_r) =cr^{n+2\theta }, \end{aligned}$$

    where \(c=c(n,\theta )>0\).

  2. (ii)

    (volume doubling property) For any \((x_0,y_0) \in {\mathbb {R}}^{2n}\), any \(r>0\) and any \(M >0\), we have

    $$\begin{aligned} \mu _\theta ({\mathcal {B}}_{Mr}(x_0,y_0)) = M^{n+2 \theta } \mu _\theta ({\mathcal {B}}_{r}(x_0,y_0)). \end{aligned}$$
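
For orientation, the scaling behind part (i) can be sketched as follows. The substitution \(x=x_0+rx^\prime \), \(y=x_0+ry^\prime \) is introduced only for this computation:

$$\begin{aligned} \mu _\theta ({\mathcal {B}}_r(x_0)) = \int _{B_r(x_0)} \int _{B_r(x_0)} \frac{dxdy}{|x-y|^{n-2\theta }} = r^{2n} r^{-(n-2\theta )} \int _{B_1} \int _{B_1} \frac{dx^\prime dy^\prime }{|x^\prime -y^\prime |^{n-2\theta }} = r^{n+2\theta } \mu _\theta ({\mathcal {B}}_1). \end{aligned}$$

Thus one may take \(c=\mu _\theta ({\mathcal {B}}_1)\), which is finite since \(\theta >0\) implies that \(z \mapsto |z|^{-(n-2\theta )}\) is locally integrable on \({\mathbb {R}}^n\).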

We will also frequently use the following relation between fractional gradients of different order.

Lemma 3.3

Let \(0<s \le t<1\), \(p \ge 2\), \(\theta \in \left( 0,\frac{1}{2} \right) \) and set \(\theta _{t} {:}{=} s+\theta -t\). Then for any \(r >0\), any \(x_0 \in {\mathbb {R}}^n\) and any \(u \in W^{t,p}(B_r(x_0))\), we have

where \(C=C(n,s,t,\theta ,p)>0\).

Proof

Using that \(tp+\theta _t(p-2)-sp-\theta (p-2)=2(t-s)\), we have

where all constants depend only on \(n,s,t,\theta \) and p. \(\square \)
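
The exponent identity used at the beginning of the proof can be verified directly from the definition \(\theta _t=s+\theta -t\), which gives \(\theta _t-\theta =s-t\):

$$\begin{aligned} tp+\theta _t(p-2)-sp-\theta (p-2) = (t-s)p+(\theta _t-\theta )(p-2) = (t-s)p-(t-s)(p-2) = 2(t-s). \end{aligned}$$

Note also that \(t+\theta _t=s+\theta \), so that the functions \(U_{t,\theta _t}\) and \(U_{s,\theta }\) coincide pointwise and only the measures \(\mu _{\theta _t}\) and \(\mu _\theta \) differ.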

3.2 The Hardy–Littlewood maximal function

Another tool we use is the Hardy–Littlewood maximal function with respect to the measure \(\mu _\theta \).

Definition

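Let \(F \in L^1_{loc}({\mathbb {R}}^{2n},\mu _\theta )\). We define the Hardy–Littlewood maximal function \({\mathcal {M}} F: {\mathbb {R}}^{2n} \rightarrow [0,\infty ]\) of F with respect to \(\mu _\theta \) by

$$\begin{aligned} {\mathcal {M}} (F)(x,y){:}{=} \sup _{\rho >0} \frac{1}{\mu _\theta ({\mathcal {B}}_\rho (x,y))} \int _{{\mathcal {B}}_\rho (x,y)} |F| d\mu _\theta , \quad (x,y) \in {\mathbb {R}}^{2n}. \end{aligned}$$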

Moreover, for any open set \(E \subset {\mathbb {R}}^{2n}\), we define

$$\begin{aligned} {\mathcal {M}}_{E} (F) {:}{=} {\mathcal {M}} \left( F \chi _{E} \right) , \end{aligned}$$

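where \(\chi _{E}\) is the characteristic function of E. In addition, for any \(r>0\) we define

$$\begin{aligned} {\mathcal {M}}_{\ge r} (F)(x,y){:}{=} \sup _{\rho \ge r} \frac{1}{\mu _\theta ({\mathcal {B}}_\rho (x,y))} \int _{{\mathcal {B}}_\rho (x,y)} |F| d\mu _\theta \end{aligned}$$

and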

$$\begin{aligned} {\mathcal {M}}_{\ge r,E} (F) {:}{=} {\mathcal {M}}_{\ge r} \left( F \chi _{E} \right) . \end{aligned}$$

The following result shows that the Hardy–Littlewood maximal function is well-behaved in the context of \(L^p\) spaces. Since in view of Proposition 3.2, \(\mu _\theta \) is a doubling measure with doubling constant \(2^{n+2\theta }\), the result follows directly from [50,  Chapter 1, Section 3, Theorem 1].

Proposition 3.4

Let E be an open subset of \({\mathbb {R}}^{2n}\).

  1. (i)

    (weak p-p estimates) If \(F \in L^p(E,\mu _\theta )\) for some \(p \ge 1\) and \(\lambda >0\), then

    $$\begin{aligned} \mu _\theta \left( \{x \in E \mid {\mathcal {M}}_E(F)(x) > \lambda \} \right) \le \frac{C}{\lambda ^p} \int _{E} |F|^p d\mu _\theta , \end{aligned}$$

    where C depends only on \(n,\theta \) and p.

  2. (ii)

    (strong p-p estimates) If \(F \in L^p(E,\mu _\theta )\) for some \(p \in (1,\infty ]\), then

    $$\begin{aligned} ||{\mathcal {M}}_E (F)||_{L^p(E,\mu _\theta )} \le C ||F||_{L^p(E,\mu _\theta )}, \end{aligned}$$

    where C depends only on n, \(\theta \) and p.

The following result is a direct consequence of the Lebesgue differentiation theorem with respect to \(\mu _\theta \), see [43,  Corollary 3.5].

Proposition 3.5

Let \(F \in L_{loc}^1({\mathbb {R}}^{2n},\mu _\theta )\). Then for almost every \((x,y) \in {\mathbb {R}}^{2n}\), we have

$$\begin{aligned} |F(x,y)| \le {\mathcal {M}}(F)(x,y). \end{aligned}$$

In addition, for any open set \(E \subset {\mathbb {R}}^{2n}\) and any \(p \in [1,\infty ]\), we have

$$\begin{aligned} ||F||_{L^p(E,\mu _\theta )} \le ||{\mathcal {M}}_E (F)||_{L^p(E,\mu _\theta )}. \end{aligned}$$
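
The second assertion follows from the first one applied to \(F \chi _E\). As a brief sketch for \(p \in [1,\infty )\), using that \(|F \chi _E| \le {\mathcal {M}}(F \chi _E)={\mathcal {M}}_E(F)\) almost everywhere, we have

$$\begin{aligned} \int _{E} |F|^p d\mu _\theta = \int _{E} |F \chi _E|^p d\mu _\theta \le \int _{E} {\mathcal {M}}_E (F)^p d\mu _\theta , \end{aligned}$$

and the case \(p=\infty \) follows in the same way by taking essential suprema over E.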

4 An approximation argument

From now on, we fix some \(s \in (0,1)\) and some parameter

$$\begin{aligned} \theta \in (0,\min \{s,1-s\}) \end{aligned}$$
(4.1)

to be chosen later. In addition, for the fractional gradients \(U_{s,\theta }\), \(V_{s,\theta }\) and \(G_{s,\theta }\) of functions \(u,v,g:{\mathbb {R}}^n \rightarrow {\mathbb {R}}\) and the measure \(\mu _\theta \), we are going to use the abbreviated notation

$$\begin{aligned} U{:}{=}U_{s,\theta }, \quad V{:}{=}V_{s,\theta }, \quad G{:}{=}G_{s,\theta }, \quad \mu {:}{=}\mu _\theta . \end{aligned}$$

Definition

Given \(g \in W^{s,2}({\mathbb {R}}^n)\), we say that \(u \in W^{s,2}({\mathbb {R}}^n)\) is a weak solution of the equation \(L_A u = (-\Delta )^s g\) in \(\Omega \), if

$$\begin{aligned}&\int _{{\mathbb {R}}^n} \int _{{\mathbb {R}}^n} \frac{A(x,y)}{|x-y|^{n+2s}} (u(x)-u(y))(\varphi (x)-\varphi (y))dydx \\&\quad = C_{n,s} \int _{{\mathbb {R}}^n} \int _{{\mathbb {R}}^n} \frac{g(x)-g(y)}{|x-y|^{n+2s}} (\varphi (x)-\varphi (y))dydx \quad \forall \varphi \in W^{s,2}_0(\Omega ). \end{aligned}$$

In addition, we also need the following definition.

Definition

Let \(\Omega \) be a domain and consider functions \(h \in W^{s,2}({\mathbb {R}}^n)\) and \(f \in L^\frac{2n}{n+2s}(\Omega )\). We say that \(v \in W^{s,2}({\mathbb {R}}^n)\) is a weak solution of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} L_{A} v = f &{} \text { in } \Omega \\ v = h &{} \text { a.e. in } {\mathbb {R}}^n \setminus \Omega , \end{array}\right. } \end{aligned}$$

if we have \(v = h \text { a.e. in } {\mathbb {R}}^n \setminus \Omega \) and

$$\begin{aligned} \int _{{\mathbb {R}}^n} \int _{{\mathbb {R}}^n} \frac{A(x,y)}{|x-y|^{n+2s}} (v(x)-v(y))(\varphi (x)-\varphi (y))dydx = \int _{\Omega } f \varphi dx \quad \forall \varphi \in W^{s,2}_0(\Omega ). \end{aligned}$$

The following comparison estimate follows from [43,  Proposition 5.1] by taking \(A={\widetilde{A}}\) and \(f={\widetilde{f}} =0\).

Proposition 4.1

Let \(x_0 \in {\mathbb {R}}^n\), \(r>0\), \(g \in W^{s,2}({\mathbb {R}}^n)\) and \(A \in {\mathcal {L}}_0(\Lambda )\). Moreover, let \(u \in W^{s,2}({\mathbb {R}}^n)\) be a weak solution of the equation

$$\begin{aligned} L_A u = (-\Delta )^s g \quad \text {in } B_{2r}(x_0), \end{aligned}$$
(4.2)

and let \(v \in W^{s,2}({\mathbb {R}}^n)\) be the unique weak solution of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} L_A v = 0 &{} \text { in } B_{2r}(x_0) \\ v = u &{} \text { a.e. in } {\mathbb {R}}^n \setminus B_{2r}(x_0). \end{array}\right. } \end{aligned}$$
(4.3)

Then the function \(w{:}{=}u-v \in W^{s,2}_0(B_{2r}(x_0))\) satisfies

where \(C=C(n,s,\theta ,\Lambda )>0\).

We continue by fixing some further notation and some assumptions which we will use throughout the rest of this paper. From now on, we fix some \(\Lambda \ge 1\), some \(\delta >0\) to be chosen small enough, some coefficient \(A \in {\mathcal {L}}_0(\Lambda )\) that is \(\delta \)-vanishing in \(B_{5n}\) and some \(p \in (2,\infty )\). Moreover, we fix another number \(q \in [2,p)\) and define

$$\begin{aligned} q_\alpha ^\star {:}{=}{\left\{ \begin{array}{ll} \frac{nq}{n-\alpha q}, &{} \text {if } n>\alpha q \\ 2p, &{} \text {if } n \le \alpha q. \end{array}\right. } \end{aligned}$$
(4.4)

In addition, we fix a number m in the range

$$\begin{aligned} 2 \le m < \min \left\{ \frac{2(n-s)}{n-2s},p \right\} \end{aligned}$$
(4.5)

and define

$$\begin{aligned} q_0{:}{=}\max \{m,q\}. \end{aligned}$$
(4.6)

Furthermore, we fix some function \(g \in W^{s,2}({\mathbb {R}}^n)\) and a weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation

$$\begin{aligned} L_A u = (-\Delta )^s g \quad \text{ in } B_{5n} \end{aligned}$$
(4.7)

and set

(4.8)

where \(M_0 \ge 1 \) remains to be chosen large enough. From now on, we also fix some number

$$\begin{aligned} \alpha \in [s,\min \{2s,1\}) \end{aligned}$$

and assuming that \(\theta >\alpha -s\), we define a corresponding parameter by

$$\begin{aligned} \theta _\alpha {:}{=} s+\theta -\alpha >0 \end{aligned}$$
(4.9)

with associated gradient-type functions

$$\begin{aligned} U_\alpha (x,y)&{:}{=}&U_{\alpha ,\theta _\alpha }(x,y)=\frac{|u(x)-u(y)|}{|x-y|^{\alpha +\theta _\alpha }}, \quad \\ G_\alpha (x,y)&{:}{=}&G_{\alpha ,\theta _\alpha }(x,y)=\frac{|g(x)-g(y)|}{|x-y|^{\alpha +\theta _\alpha }} \end{aligned}$$

and with associated measure

$$\begin{aligned} \mu _\alpha (E){:}{=}\mu _{\theta _\alpha }(E)= \int _{E} \frac{dxdy}{|x-y|^{n-2\theta _\alpha }}, \quad E \subset {\mathbb {R}}^{2n} \text { measurable}. \end{aligned}$$
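
It may help to record how these objects relate to U and \(\mu \). The following is only a sketch based on (4.9) and Lemma 3.1. Since \(\alpha +\theta _\alpha =s+\theta \) and \(n-2\theta _\alpha =n-2\theta +2(\alpha -s)\), we have

$$\begin{aligned} U_\alpha (x,y)=U(x,y), \quad d\mu _\alpha = \frac{d\mu }{|x-y|^{2(\alpha -s)}}, \quad ||U_\alpha ||_{L^2(\Omega \times \Omega ,\mu _\alpha )}=[u]_{W^{\alpha ,2}(\Omega )}. \end{aligned}$$

In other words, passing from \((U,\mu )\) to \((U_\alpha ,\mu _\alpha )\) does not change the gradient-type function itself, but only shifts weight towards the diagonal \(x=y\), and this is what encodes the higher differentiability of order \(\alpha \).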

In addition, from this point on we assume that for any \(x_0 \in {\mathbb {R}}^n\), \(r>0\) such that \(B_r(x_0) \subset B_{5n}\), and any weak solution \(u_0 \in W^{s,2}({\mathbb {R}}^n)\) of \(L_A u_0=(-\Delta )^s g\) in \(B_{r}(x_0)\), we have a higher differentiability estimate of the form

(4.10)

where \(C=C(n,s,\alpha ,\theta ,\Lambda ,m)>0\) and

$$\begin{aligned} U_0(x,y){:}{=}\frac{|u_0(x)-u_0(y)|}{|x-y|^{s+\theta }}. \end{aligned}$$

Lemma 4.2

Let \(M>0\), \(x_0 \in B_{\frac{\sqrt{n}}{2}}\), \(r \in \left( 0,\frac{\sqrt{n}}{2} \right) \) and \(\lambda \ge \lambda _0\). Then for any \(\varepsilon _0 >0\), there exists some small enough \(\delta = \delta (\varepsilon _0,n,s,\alpha ,\theta ,\Lambda ,m,M) \in (0,1)\), such that under the assumptions that

$$\begin{aligned} {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x_0) \le M\lambda ^2, \quad {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0})(x_0) \le M\lambda ^{q_0} \delta ^{q_0} , \end{aligned}$$
(4.11)

for the unique weak solution \(v \in W^{s,2}({\mathbb {R}}^n)\) of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} L_{A} v = 0 &{} \text { in } B_{6r}(x_0) \\ v = u &{} \text { a.e. in } {\mathbb {R}}^n \setminus B_{6r}(x_0) \end{array}\right. } \end{aligned}$$
(4.12)

and the function

$$\begin{aligned} W_\alpha (x,y){:}{=}\frac{|u(x)-v(x)-u(y)+v(y)|}{|x-y|^{\alpha +\theta _\alpha }}, \quad (x,y) \in {\mathbb {R}}^{2n}, \end{aligned}$$
(4.13)

we have

$$\begin{aligned} \int _{{\mathcal {B}}_{2r}(x_0)} W_\alpha ^2 d \mu _\alpha \le \varepsilon _0^2 \lambda ^2 \mu _\alpha ({\mathcal {B}}_r(x_0)) . \end{aligned}$$
(4.14)

Moreover, the function

$$\begin{aligned} V_\alpha (x,y){:}{=}\frac{|v(x)-v(y)|}{|x-y|^{\alpha +\theta _\alpha }}, \quad (x,y) \in {\mathbb {R}}^{2n} \end{aligned}$$

satisfies the estimate

$$\begin{aligned} ||V_\alpha ||_{L^\infty ({\mathcal {B}}_{2r}(x_0),d\mu _\alpha )} \le N_0 \lambda \end{aligned}$$
(4.15)

for some constant \(N_0= N_0(n,s,\alpha ,\theta ,\Lambda ,M)>0\).

Remark 4.3

In the above Lemma and in the rest of this paper, the Hardy–Littlewood maximal function is always considered with respect to the measure \(\mu _\alpha \).

Proof

Fix \(x_0 \in B_{\frac{\sqrt{n}}{2}}\) and \(r \in \left( 0,\frac{\sqrt{n}}{2} \right) \). Let \(l \in {\mathbb {N}}\) be determined by \(2^{l-1} r < \sqrt{n} \le 2^{l} r\); note that \(l \ge 2\), since \(r<\frac{\sqrt{n}}{2}\). Then for any \(k < l\), by (4.11) we have

(4.16)

On the other hand, in view of (4.8) and the inclusions

$$\begin{aligned} {\mathcal {B}}_{2^k \sqrt{n}}(x_0) \subset {\mathcal {B}}_{2^{k+l-1}4r}(x_0) \subset {\mathcal {B}}_{2^k 4\sqrt{n}}(x_0) \subset {\mathcal {B}}_{2^k 5n}, \end{aligned}$$

we have

(4.17)

where \(C_1=C_1(n,\theta )>0\). Moreover, by Lemma 3.3 for any \(k \le l-1\) we have

where \(C_2=C_2(n,s,\alpha ,\theta )\). Now combining the previous display with (4.17), (4.16) and the facts that \(\theta < s\) and \(\lambda \ge \lambda _0\), we arrive at

(4.18)

where \(C_3=C_3(n,s,\alpha ,\theta ,M)>0\). In a similar way as in (4.17), we have

Therefore, using Lemma 3.3 along with Hölder’s inequality, we obtain

(4.19)

Since \(w{:}{=}u-v\) is a weak solution of \(L_A w = (-\Delta )^s g\) in \(B_{6r}(x_0)\), w satisfies the estimate (4.10), which combined with Proposition 4.1, Hölder’s inequality, (4.16) and (4.19) yields

where all constants depend only on \(n,s,\alpha ,\theta ,\Lambda ,m,M\) and the last inequality was obtained by choosing \(\delta \) sufficiently small. This proves (4.14).

Let us now prove the estimate (4.15). Define

$$\begin{aligned} \theta _0{:}{=}\frac{\min \{s,1-s\} + \theta }{2} \in (\theta ,\min \{s,1-s\}), \quad p_0{:}{=} \frac{n+2\theta _0}{\theta _0-\theta } \in (2,\infty ). \end{aligned}$$

Since A is \(\delta \)-vanishing in \(B_{5n}\) and therefore \((\delta ,5n)\)-BMO in \(B_{5n}\), by [43,  Theorem 9.1], after choosing \(\delta \) smaller if necessary, we have \(v \in W^{s+\theta _0(1-2/p_0),p_0}(B_{4r}(x_0))\) and thus, by Lemma 3.1, \(V_{s,\theta _0} \in L^{p_0}({\mathcal {B}}_{4r}(x_0),\mu _{\theta _0})\). Therefore, [43,  Corollary 8.6] yields the estimate

where \(C_7=C_7(n,s,p_0,\theta _0,\Lambda )>0\), and therefore

where \(C_8=C_8(n,s,p_0,\theta _0,\Lambda )\) and we used that

$$\begin{aligned} \frac{n+2\theta _0}{p_0}-\theta _0+\theta = 0. \end{aligned}$$

Since \(s+\theta _0(1-2/p_0)-\frac{n}{p_0}=s+\theta =\alpha +\theta _\alpha \), combining the previous display with the fractional Sobolev embedding given by (2.1) yields

(4.20)

where \(C_{9}\) and \(C_{10}\) depend only on \(n,s,p_0,\theta _0,\Lambda \). Now in view of Proposition 4.1 along with (4.16), (4.19) and (4.18), we have

where all constants depend only on \(n,s,\alpha ,\theta ,\Lambda ,M\). Therefore, combining the last display with (4.20) yields

$$\begin{aligned} ||V_\alpha ||_{L^\infty ({\mathcal {B}}_{2r}(x_0),d\mu _\alpha )} \le [v]_{C^{\alpha +\theta _\alpha }(B_{2r}(x_0))} \le N_0 \lambda , \end{aligned}$$

for some \(N_0=N_0(n,s,\alpha ,\theta ,\Lambda ,M)>0\), which proves the estimate (4.15). This finishes the proof. \(\square \)
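
The exponent bookkeeping in the proof of (4.15) can be checked directly from the choice of \(p_0\), which is equivalent to \(\frac{n+2\theta _0}{p_0}=\theta _0-\theta \). As a short verification sketch:

$$\begin{aligned} \frac{n+2\theta _0}{p_0}-\theta _0+\theta&= (\theta _0-\theta )-\theta _0+\theta = 0, \\ s+\theta _0 \left( 1-\frac{2}{p_0} \right) -\frac{n}{p_0}&= s+\theta _0-\frac{n+2\theta _0}{p_0} = s+\theta _0-(\theta _0-\theta ) = s+\theta = \alpha +\theta _\alpha . \end{aligned}$$

Moreover, \(p_0>2\) because \(n+2\theta _0>2(\theta _0-\theta )\), and the Hölder exponent \(s+\theta \) lies in (0, 1) since \(\theta <1-s\) by (4.1).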

5 Good-\(\lambda \) inequalities

In this section, we prove some good-\(\lambda \) inequalities which serve as key ingredients in the covering arguments from [43,  Section 7]. Although the proofs of the results in this section are similar to the ones of the corresponding good-\(\lambda \) inequalities in [43,  Section 6], since the presence of higher-order fractional gradients requires quite a few adaptations, for the sake of coherence we nevertheless provide most of the details.

5.1 Diagonal good-\(\lambda \) inequalities

We start by proving good-\(\lambda \) inequalities at the diagonal, which are somewhat akin to corresponding ones in the local setting, see e.g. [8, 10].

Lemma 5.1

There is a constant \(N_d=N_d(n,s,\alpha ,\theta ,\Lambda ) \ge 1\), such that the following holds. For any \(\varepsilon > 0\) and any \(\kappa >0\) there exists some small enough \(\delta = \delta (\varepsilon ,\kappa ,n,s,\alpha ,\theta ,\Lambda ,m) \in (0,1)\), such that for any \(\lambda \ge \lambda _0\), any \(r \in \left( 0,\frac{\sqrt{n}}{2} \right) \) and any point \(x_0 \in Q_1\) with

$$\begin{aligned} \mu _\alpha \left( \left\{ (x,y) \in {\mathcal {B}}_{r}(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y) > N_d^2 \lambda ^2 \right\} \right) \ge \kappa \varepsilon \mu _\alpha ({\mathcal {B}}_r(x_0)), \end{aligned}$$
(5.1)

we have

$$\begin{aligned} \begin{aligned} {\mathcal {B}}_r(x_0)&\subset \left\{ (x,y) \in {\mathcal {B}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y)> \lambda ^2 \right\} \\&\cup \left\{ (x,y) \in {\mathcal {B}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} \left( G_\alpha ^{q_0} \right) (x,y) > \lambda ^{q_0} \delta ^{q_0} \right\} . \end{aligned} \end{aligned}$$
(5.2)

Proof

Let \(\varepsilon _0 >0\) and \(M>0\) be constants to be chosen and consider the corresponding \(\delta = \delta (\varepsilon _0,n,s,\alpha ,\theta ,\Lambda ,m,M) \in (0,1)\) given by Lemma 4.2. Fix \(\varepsilon , \kappa > 0\), \(r \in \left( 0,\frac{\sqrt{n}}{2} \right) \), \(x_0 \in Q_1\) and assume that (5.1) holds, but that (5.2) is false, so that there exists a point \((x^\prime ,y^\prime ) \in {\mathcal {B}}_r(x_0)\) such that

$$\begin{aligned} {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x^\prime ,y^\prime ) \le \lambda ^2, \quad {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} \left( G_\alpha ^{q_0} \right) (x^\prime ,y^\prime ) \le \lambda ^{q_0} \delta ^{q_0}. \end{aligned}$$

Thus, for any \(\rho >0\) we have

(5.3)

Observe that for any \(\rho \ge r\), we have \({\mathcal {B}}_\rho (x_0) \subset {\mathcal {B}}_{2\rho }(x^\prime ,y^\prime ) \subset {\mathcal {B}}_{3\rho }(x_0)\). Together with (5.3), we obtain

and similarly

so that \(U_\alpha \) and \(G_\alpha \) satisfy the condition (4.11) with \(M=3^{n+2\theta _\alpha }\). Therefore, by Lemma 4.2 the weak solution \(v \in W^{s,2}({\mathbb {R}}^n)\) of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} L_{A} v = 0 &{} \text { in } B_{6r}(x_0) \\ v = u &{} \text { a.e. in } {\mathbb {R}}^n \setminus B_{6r}(x_0) \end{array}\right. } \end{aligned}$$

satisfies

$$\begin{aligned} \int _{{\mathcal {B}}_{2r}(x_0)} W_\alpha ^2 d\mu _\alpha \le \varepsilon _0^2 \lambda ^2 \mu _\alpha ({\mathcal {B}}_{r}(x_0)) , \end{aligned}$$
(5.4)

where \(W_\alpha \) is given as in (4.13). In addition, also by Lemma 4.2 there exists a constant \(N_0=N_0(n,s,\alpha ,\theta , \Lambda ) >0\) such that

$$\begin{aligned} ||V_\alpha ||_{L^\infty ({\mathcal {B}}_{2r}(x_0))}^2 \le N_0^2 \lambda ^2. \end{aligned}$$
(5.5)

Next, we set \(N_d {:}{=} (\max \{ 4 N_0^2, 5^{n+2\theta _\alpha } \})^{1/2} > 1\) and claim that

$$\begin{aligned} \begin{aligned}&\left\{ (x,y) \in {\mathcal {B}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} ( U_\alpha ^2 )(x,y)> N_d^2\lambda ^2 \right\} \\&\quad \subset \left\{ (x,y) \in {\mathcal {B}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{2r}(x_0)} ( W_\alpha ^2 )(x,y) > N_0^2\lambda ^2 \right\} . \end{aligned} \end{aligned}$$
(5.6)

To see this, assume that

$$\begin{aligned} (x_1,y_1) \in \left\{ (x,y) \in {\mathcal {B}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{2r}(x_0)} ( W_\alpha ^2 ) (x,y) \le N_0^2\lambda ^2 \right\} . \end{aligned}$$
(5.7)

For \( \rho < r\), we have \({\mathcal {B}}_\rho (x_1,y_1) \subset {\mathcal {B}}_r(x_1,y_1) \subset {\mathcal {B}}_{2r}(x_0)\), so that together with (5.7) and (5.5) we deduce

On the other hand, for \(\rho \ge r\) we have \( {\mathcal {B}}_\rho (x_1,y_1) \subset {\mathcal {B}}_{3 \rho }(x^\prime ,y^\prime ) \subset {\mathcal {B}}_{5 \rho }(x_1,y_1)\), so that (5.3) implies

Thus, we have

$$\begin{aligned} (x_1,y_1) \in \left\{ (x,y) \in {\mathcal {B}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} ( U_\alpha ^2 )(x,y) \le N_d^2 \lambda ^2 \right\} , \end{aligned}$$

which implies (5.6). Now using (5.6), the weak 1-1 estimate from Proposition 3.4 and (5.4), we conclude that there exists some constant \(C=C(n,\theta _\alpha )>0\) such that

$$\begin{aligned}&\mu _\alpha \left( \left\{ (x,y) \in {\mathcal {B}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y)> N_d^2\lambda ^2 \right\} \right) \\&\quad \le \mu _\alpha \left( \left\{ (x,y) \in {\mathcal {B}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{2r}(x_0)} ( W_\alpha ^2 )(x,y) > N_0^2\lambda ^2 \right\} \right) \\&\quad \le \frac{C}{N_0^2\lambda ^2} \int _{{\mathcal {B}}_{2r}(x_0)} W_\alpha ^2 d\mu _\alpha \\&\quad \le \frac{C}{N_0^2} \mu _\alpha ({\mathcal {B}}_{r}(x_0)) \varepsilon _0^2 < \varepsilon \kappa \mu _\alpha ({\mathcal {B}}_{r}(x_0)), \end{aligned}$$

where the last inequality is obtained by choosing \(\varepsilon _0\) and thus also \(\delta \) sufficiently small. This contradicts (5.1) and thus finishes our proof. \(\square \)
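
The ball inclusions used in the above proof follow from the triangle inequality applied in each of the two components. As a sketch for the first component, where \((x^\prime ,y^\prime ),(x_1,y_1) \in {\mathcal {B}}_r(x_0)\) and \(\rho \ge r\), we have

$$\begin{aligned} |x-x_0|<\rho&\implies |x-x^\prime | \le |x-x_0|+|x_0-x^\prime |<\rho +r \le 2\rho , \\ |x-x^\prime |<2\rho&\implies |x-x_0| \le |x-x^\prime |+|x^\prime -x_0|<2\rho +r \le 3\rho , \\ |x-x_1|<\rho&\implies |x-x^\prime | \le |x-x_1|+|x_1-x_0|+|x_0-x^\prime |<\rho +2r \le 3\rho , \\ |x-x^\prime |<3\rho&\implies |x-x_1| \le |x-x^\prime |+|x^\prime -x_0|+|x_0-x_1|<3\rho +2r \le 5\rho . \end{aligned}$$

The same estimates hold for the second component, and by the scaling property of \(\mu _\alpha \) the radius ratios 3 and 5 are the source of the factors \(3^{n+2\theta _\alpha }\) and \(5^{n+2\theta _\alpha }\) appearing in the choice of M and \(N_d\).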

5.2 Off-diagonal reverse Hölder inequalities

While in the setting of local elliptic equations of the form (1.9) proving analogues of the above diagonal good-\(\lambda \) inequalities is sufficient in order to establish the desired Sobolev regularity, in our nonlocal setting which involves fractional gradients defined on \({\mathbb {R}}^{2n}\), it is also necessary to prove an analogue of Lemma 5.1 on balls that are far away from the diagonal. However, since far away from the diagonal the equation cannot be used very efficiently, in this situation no useful comparison estimates are available.

In order to bypass this loss of information, as in [43] we replace the comparison estimates used in the diagonal setting by certain off-diagonal reverse Hölder inequalities with diagonal correction terms, which in view of an iteration argument in the end will still be sufficiently strong tools in order to deduce the desired regularity.

For this reason, in addition to the assumption that u satisfies the estimate (4.10), from now on we assume that for any \(r>0\), \(x_0 \in {\mathbb {R}}^n\) with \(B_{r}(x_0) \subset B_{5n}\), \(U_\alpha \) satisfies an estimate of the form

(5.8)

where \(C_q\) depends only on \(q,n,s,\alpha ,\theta ,m\) and \(\Lambda \).

Proposition 5.2

Let \(r>0\), \(x_0,y_0 \in {\mathbb {R}}^n\) and suppose that for some \(\gamma \in (0,1]\) we have \(\text {dist}(B_r(x_0),B_r(y_0)) \ge \gamma r\). Then we have

where \(C_{nd}=C_{nd}(n,s,\alpha ,\theta ,\Lambda ,\gamma ,m,q,p) \ge 1\) and \(q_\alpha ^\star \) is given by (4.4).

Proof

Choose points \(x_1 \in {\overline{B}}_r(x_0)\) and \(y_1 \in {\overline{B}}_r(y_0)\) such that \(\text {dist}(B_r(x_0),B_r(y_0))=|x_1-y_1|\). For any \((x,y) \in {\mathcal {B}}_r(x_0,y_0)\), we observe that

$$\begin{aligned} |x-y|&\le |x_1-y_1| + |x_1-x| + |y_1-y| \\&\le \text {dist}(B_r(x_0),B_r(y_0)) + 4 r \le 5 \text {dist}(B_r(x_0),B_r(y_0))/\gamma . \end{aligned}$$

Here we used that \(|x_1-x| \le 2r\) and \(|y_1-y| \le 2r\), as well as \(\gamma r \le \text {dist}(B_r(x_0),B_r(y_0))\) and \(\gamma \le 1\). Together with the definition of \(\text {dist}(B_r(x_0),B_r(y_0))\), for any \((x,y) \in {\mathcal {B}}_r(x_0,y_0)\) we obtain

$$\begin{aligned} 1 \le \frac{|x-y|}{\text {dist}(B_r(x_0),B_r(y_0))} \le 5 /\gamma . \end{aligned}$$
(5.9)

Thus, by taking into account the definition of the measure \(\mu _\alpha \), we conclude that

$$\begin{aligned} \frac{c_1 r^{2n}}{\text {dist}(B_r(x_0),B_r(y_0))^{n-2\theta _\alpha }} \le \mu _\alpha ({\mathcal {B}}_r(x_0,y_0)) \le \frac{C_1 r^{2n}}{\text {dist}(B_r(x_0),B_r(y_0))^{n-2\theta _\alpha }}, \end{aligned}$$
(5.10)

where \(c_1=c_1(n,\gamma ,\theta _\alpha ) \in (0,1)\) and \(C_1=C_1(n,\theta _\alpha ) \ge 1\). By (5.10) and (5.9), we have

where \(C_2=C_2(n,\gamma ,\theta _\alpha )\ge 1\). In view of Minkowski’s inequality, we can further estimate the integral on the right-hand side as follows

By using the fractional Sobolev–Poincaré inequality (Lemma 2.3) and then the estimate (5.8), for \(I_1\) we obtain

where \(C_3,C_4\) and \(C_5\) depend only on \(n,s,\alpha ,\theta \) and \(\theta _\alpha \). In the same way, for \(I_2\) we deduce that

Finally, by the Cauchy–Schwarz inequality, (5.10) and (5.9), for \(I_3\) we have

where \(C_6=C_6(n,\gamma ,\theta _\alpha ) \ge 1\) and \(C_7=C_7(n,\gamma ,\theta _\alpha ) \ge 1\). The claim now follows by combining the last five displays, so that the proof is finished. \(\square \)
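For orientation, the two-sided bound (5.10) can be read off from (5.9) as follows. This is only a sketch, and it assumes, as the form of (5.10) suggests, that \(\mu _\alpha \) has the density \(|x-y|^{-(n-2\theta _\alpha )}\) with respect to the Lebesgue measure on \({\mathbb {R}}^{2n}\) with \(n-2\theta _\alpha >0\); the actual definition is given earlier in the paper and is not restated here. The symbols \(d{:}{=}\text {dist}(B_r(x_0),B_r(y_0))\), \(K\) (the upper constant in (5.9), which depends only on \(\gamma \)) and \(\omega _n{:}{=}|B_1|\) are shorthand introduced only for this remark. Since \(d \le |x-y| \le Kd\) on \({\mathcal {B}}_r(x_0,y_0)\), we get

$$\begin{aligned} \mu _\alpha ({\mathcal {B}}_r(x_0,y_0))&= \int _{B_r(x_0)} \int _{B_r(y_0)} \frac{dy\,dx}{|x-y|^{n-2\theta _\alpha }}, \\ \frac{\omega _n^2 r^{2n}}{(Kd)^{n-2\theta _\alpha }}&\le \mu _\alpha ({\mathcal {B}}_r(x_0,y_0)) \le \frac{\omega _n^2 r^{2n}}{d^{n-2\theta _\alpha }}. \end{aligned}$$

In particular, only the lower constant picks up a dependence on \(\gamma \), which is in line with the stated dependencies of \(c_1\) and \(C_1\).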

5.3 Off-diagonal good-\(\lambda \) inequalities

In what follows, we fix some \(\varepsilon \in (0,1)\) to be chosen small enough and set

$$\begin{aligned} N_{\varepsilon ,q}{:}{=}\frac{C_{nd} C_{s,\theta } C_\alpha N_d 10^{10n}}{ \varepsilon ^{1/q_\alpha ^\star }}, \end{aligned}$$
(5.11)

where \(N_d=N_d(n,s,\alpha ,\theta ,\Lambda ) \ge 1\) is given by Lemma 5.1, \(C_{nd}=C_{nd}(n,s,\alpha ,\theta ,\Lambda ,\gamma ,m,q,p) \ge 1\) is given by Proposition 5.2 with \(\gamma \) to be chosen and

$$\begin{aligned} 1 \le C_{s,\theta }{:}{=} \sum _{k=1}^\infty 2^{-k(s-\theta )} < \infty , \end{aligned}$$
(5.12)

while \(C_\alpha =C_\alpha (n,s,\alpha ,\theta )>0\) is given by Lemma 3.3 with \(t=\alpha \). Moreover, for all \(r \in \left( 0,\frac{\sqrt{n}}{2} \right) \) and all \((x_0,y_0) \in {\mathcal {Q}}_1\) we define

$$\begin{aligned} {{\widetilde{\phi }}}(r,x_0,y_0){:}{=} \frac{r}{\text {dist}(B_\frac{r}{2}(x_0),B_\frac{r}{2}(y_0))}. \end{aligned}$$
(5.13)
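Two elementary observations about these quantities may be helpful. Both are sketches under the assumption that \(0<\theta <s<1\), which is how the range (4.1) of \(\theta \) is used later in the paper. First, the series in (5.12) is geometric and can be summed in closed form:

$$\begin{aligned} C_{s,\theta } = \sum _{k=1}^\infty \left( 2^{-(s-\theta )} \right) ^k = \frac{2^{-(s-\theta )}}{1-2^{-(s-\theta )}} = \frac{1}{2^{s-\theta }-1}. \end{aligned}$$

This is finite because \(s>\theta \), and it is at least 1 because \(s-\theta <1\) gives \(2^{s-\theta }<2\). Second, if \(|x_0-y_0| \ge (3\sqrt{n}+1)r\), which is the separation condition imposed in Lemma 5.3, then the balls \(B_{\frac{r}{2}}(x_0)\) and \(B_{\frac{r}{2}}(y_0)\) are disjoint, so that

$$\begin{aligned} \text {dist}(B_\frac{r}{2}(x_0),B_\frac{r}{2}(y_0)) = |x_0-y_0|-r \ge 3\sqrt{n}\, r, \qquad {{\widetilde{\phi }}}(r,x_0,y_0) \le \frac{1}{3\sqrt{n}} \le 1. \end{aligned}$$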

Lemma 5.3

For any \(\lambda \ge \lambda _0\), \(r \in \left( 0,\frac{\sqrt{n}}{2} \right) \) and any point \((x_0,y_0) \in {\mathcal {Q}}_1\) satisfying \(|x_0-y_0| \ge (3\sqrt{n}+1)r\) and

$$\begin{aligned} \mu _\alpha \left( \left\{ (x,y) \in {\mathcal {B}}_{\frac{\sqrt{n}}{2}r}(x_0,y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y) > N_{\varepsilon ,q}^2 \lambda ^2 \right\} \right) \ge \varepsilon \mu _\alpha ({\mathcal {B}}_{\frac{r}{2}}(x_0,y_0)),\nonumber \\ \end{aligned}$$
(5.14)

we have

$$\begin{aligned} {\mathcal {B}}_{\frac{r}{2}}(x_0,y_0)&\subset \left\{ (x,y) \in {\mathcal {B}}_{\frac{r}{2}}(x_0,y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y)> \lambda ^2 \right\} \\&\cup \left\{ (x,y) \in {\mathcal {B}}_{\frac{r}{2}}(x_0,y_0) \mid {{\mathcal {M}}}_{\ge r,{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,x)> 3^{n+2\theta _\alpha }N_d^2 {{\widetilde{\phi }}}(r,x_0,y_0)^{-2(\alpha +\theta _\alpha )} \lambda ^2 \right\} \\&\cup \left\{ (x,y) \in {\mathcal {B}}_{\frac{r}{2}}(x_0,y_0) \mid {{\mathcal {M}}}_{\ge r,{\mathcal {B}}_{5n}} (U_\alpha ^2)(y,y)> 3^{n+2\theta _\alpha }N_d^2 {{\widetilde{\phi }}}(r,x_0,y_0)^{-2(\alpha +\theta _\alpha )} \lambda ^2 \right\} \\&\cup \left\{ (x,y) \in {\mathcal {B}}_{\frac{r}{2}}(x_0,y_0) \mid {{\mathcal {M}}}_{\ge r,{\mathcal {B}}_{5n}} (G_\alpha ^{q_0})(x,x)> 3^{n+2\theta _\alpha } {{\widetilde{\phi }}}(r,x_0,y_0)^{-q_0(\alpha +\theta _\alpha )} \lambda ^{q_0} \right\} \\&\cup \left\{ (x,y) \in {\mathcal {B}}_{\frac{r}{2}}(x_0,y_0) \mid {{\mathcal {M}}}_{\ge r,{\mathcal {B}}_{5n}} (G_\alpha ^{q_0})(y,y) > 3^{n+2\theta _\alpha } {{\widetilde{\phi }}}(r,x_0,y_0)^{-q_0(\alpha +\theta _\alpha )} \lambda ^{q_0} \right\} . \end{aligned}$$

Proof

Assume that (5.14) holds, but that the conclusion is false, so that there exists a point \((x^\prime ,y^\prime ) \in {\mathcal {B}}_{\frac{r}{2}}(x_0,y_0)\) such that

$$\begin{aligned}&{{\mathcal {M}}}_{{\mathcal {B}}_{5n}}(U_\alpha ^2)(x^\prime ,y^\prime ) \le \lambda ^2, \\&{{\mathcal {M}}}_{\ge r,{\mathcal {B}}_{5n}} (U_\alpha ^2)(x^\prime ,x^\prime ) \\&\quad \le 3^{n+2\theta _\alpha }N_d^2 {{\widetilde{\phi }}}(r,x_0,y_0)^{-2(\alpha +\theta _\alpha )} \lambda ^2, \\&{{\mathcal {M}}}_{\ge r,{\mathcal {B}}_{5n}} (U_\alpha ^2)(y^\prime ,y^\prime ) \\&\quad \le 3^{n+2\theta _\alpha }N_d^2 {{\widetilde{\phi }}}(r,x_0,y_0)^{-2(\alpha +\theta _\alpha )} \lambda ^2, \\&{{\mathcal {M}}}_{\ge r,{\mathcal {B}}_{5n}} (G_\alpha ^{q_0})(x^\prime ,x^\prime ) \\&\quad \le 3^{n+2\theta _\alpha } {{\widetilde{\phi }}}(r,x_0,y_0)^{-{q_0}(\alpha +\theta _\alpha )} \lambda ^{q_0}, \\&{{\mathcal {M}}}_{\ge r,{\mathcal {B}}_{5n}} (G_\alpha ^{q_0})(y^\prime ,y^\prime ) \\&\quad \le 3^{n+2\theta _\alpha } {{\widetilde{\phi }}}(r,x_0,y_0)^{-{q_0}(\alpha +\theta _\alpha )} \lambda ^{q_0}. \end{aligned}$$

Therefore, for any \(\rho \ge r\) we have

(5.15)
(5.16)

and similarly

(5.17)

Since for any \(\rho \ge r\) we have \({\mathcal {B}}_\rho (x_0,y_0) \subset {\mathcal {B}}_{2\rho }(x^\prime ,y^\prime ) \subset {\mathcal {B}}_{3\rho }(x_0,y_0)\), from (5.15) we deduce

(5.18)
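The inclusions \({\mathcal {B}}_\rho (x_0,y_0) \subset {\mathcal {B}}_{2\rho }(x^\prime ,y^\prime ) \subset {\mathcal {B}}_{3\rho }(x_0,y_0)\) used in this step follow from the triangle inequality. As a sketch, we assume the product convention \({\mathcal {B}}_\rho (x_0,y_0)=B_\rho (x_0) \times B_\rho (y_0)\) from the earlier sections and recall that \((x^\prime ,y^\prime ) \in {\mathcal {B}}_{\frac{r}{2}}(x_0,y_0)\) and \(\rho \ge r\). For \(x \in B_\rho (x_0)\), respectively for \(x \in B_{2\rho }(x^\prime )\), we have

$$\begin{aligned} |x-x^\prime |&\le |x-x_0| + |x_0-x^\prime |< \rho + \frac{r}{2} \le \frac{3\rho }{2}< 2\rho , \\ |x-x_0|&\le |x-x^\prime | + |x^\prime -x_0|< 2\rho + \frac{r}{2} \le \frac{5\rho }{2} < 3\rho . \end{aligned}$$

The same two estimates hold with \((x,x_0,x^\prime )\) replaced by \((y,y_0,y^\prime )\), which gives both inclusions.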

Since for any \(\rho \ge r\) we have \({\mathcal {B}}_\rho (x_0) \subset {\mathcal {B}}_{2\rho }(x^\prime )\), together with (5.16) we observe that

(5.19)

and similarly by using (5.17) instead of (5.16), we obtain

(5.20)

By the same reasoning, (5.19) and (5.20) hold also with \(x_0\) replaced by \(y_0\). Next, we claim that

$$\begin{aligned} \begin{aligned}&\left\{ (x,y) \in {\mathcal {B}}_{\frac{\sqrt{n}}{2}r}(x_0,y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}}( U_\alpha ^2 )(x,y)> N_{\varepsilon ,q}^2 \lambda ^2 \right\} \\&\quad \subset \left\{ (x,y) \in {\mathcal {B}}_{\frac{ \sqrt{n}}{2}r}(x_0,y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{\frac{3\sqrt{n}}{2}r}(x_0,y_0)} ( U_\alpha ^2 )(x,y) > N_{\varepsilon ,q}^2 \lambda ^2 \right\} . \end{aligned} \end{aligned}$$
(5.21)

To see this, assume that

$$\begin{aligned} (x_1,y_1) \in \left\{ (x,y) \in {\mathcal {B}}_{\frac{\sqrt{n}}{2}r}(x_0,y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{\frac{3\sqrt{n}}{2}r}(x_0,y_0)} ( U_\alpha ^2 ) (x,y) \le N_{\varepsilon ,q}^2 \lambda ^2 \right\} . \end{aligned}$$
(5.22)

For \( \rho < \sqrt{n}r\), we have \({\mathcal {B}}_\rho (x_1,y_1) \subset {\mathcal {B}}_{\sqrt{n}r}(x_1,y_1) \subset {\mathcal {B}}_{\frac{3\sqrt{n}}{2}r}(x_0,y_0)\), so that along with (5.22) we deduce

On the other hand, for \(\rho \ge \sqrt{n}r\) we have \( {\mathcal {B}}_\rho (x_1,y_1) \subset {\mathcal {B}}_{3 \rho }(x^\prime ,y^\prime ) \subset {\mathcal {B}}_{5 \rho }(x_1,y_1)\), so that (5.15) implies

Thus, we have

$$\begin{aligned} (x_1,y_1) \in \left\{ (x,y) \in {\mathcal {B}}_r(x_0,y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}}( U_\alpha ^2 )(x,y) \le N_{\varepsilon ,q}^2 \lambda ^2 \right\} , \end{aligned}$$

which implies (5.21). As in the proof of Lemma 4.2, let \(l \in {\mathbb {N}}\) be determined by \(2^{l-1} r < \sqrt{n} \le 2^{l} r\), and note that \(l \ge 2\) since \(r<\frac{\sqrt{n}}{2}\). Then for any \(k < l\), by (5.19) and (5.20) we have

(5.23)

Moreover, in view of (4.8), the inclusions

$$\begin{aligned} {\mathcal {B}}_{2^k \frac{n}{2}}(x_0) \subset {\mathcal {B}}_{2^{k+l-1} \frac{3\sqrt{n}}{2} r}(x_0) \subset {\mathcal {B}}_{2^k \frac{3n}{2}}(x_0) \subset {\mathcal {B}}_{2^k 5n} \end{aligned}$$

and the fact that \({{\widetilde{\phi }}}(r,x_0,y_0) \le 1\), we have

(5.24)

Together with (5.23) and the assumption that \(\lambda \ge \lambda _0\), along with using Lemma 3.3, we obtain

(5.25)

By a similar reasoning as above, (5.24) holds also with U replaced by G, so that along with Lemma 3.3 and Hölder’s inequality, we deduce

(5.26)

Again, by the same arguments as above (5.25) and (5.26) also hold for \(x_0\) replaced by \(y_0\). Therefore, together with the weak \(\frac{q_\alpha ^\star }{2}-\frac{q_\alpha ^\star }{2}\) estimate for the Hardy–Littlewood maximal function, Proposition 5.2 with \(\gamma =\frac{1}{3\sqrt{n}}\), (5.18), (5.25), (5.20), (5.26) and taking into account (5.11), we arrive at

which contradicts (5.14) and thus finishes the proof. \(\square \)

Next, we restate the previous lemma in terms of cubes instead of balls, which is vital in order to make it applicable in the context of Calderón–Zygmund cube decompositions as used in the covering argument in [43,  Section 7]. In analogy to the quantity \({{\widetilde{\phi }}}(r,x_0,y_0)\) defined in (5.13), for any \(r \in \left( 0,\frac{\sqrt{n}}{2} \right) \) and all \(x_0,y_0 \in {\mathbb {R}}^n\) with \(|x_0-y_0|>\sqrt{n}r\), we define the quantity

$$\begin{aligned} \phi (r,x_0,y_0){:}{=}\frac{r}{\text {dist}(Q_r(x_0),Q_r(y_0))}. \end{aligned}$$
(5.27)

Since the proof of the following result works almost exactly like the one in [43,  Corollary 6.4] by using our Lemma 5.3 instead of [43,  Lemma 6.3] and by replacing in [43] the measure \(\mu \) by \(\mu _\alpha \), the function U by \(U_\alpha \) and the parameters s and \(\theta \) by \(\alpha \) and \(\theta _\alpha \), respectively, we omit the proof and instead refer to [43,  Corollary 6.4].

Corollary 5.4

For any \(\lambda \ge \lambda _0\), \(r \in \left( 0,\frac{\sqrt{n}}{2} \right) \) and any point \((x_0,y_0) \in {\mathcal {Q}}_1\) satisfying \(|x_0-y_0| \ge (3\sqrt{n}+1)r\) and

$$\begin{aligned} \mu _\alpha \left( \left\{ (x,y) \in {\mathcal {Q}}_{r}(x_0,y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}}(U_\alpha ^2)(x,y)> N_{\varepsilon ,q}^2 \lambda ^2 \right\} \right) > \varepsilon \mu _\alpha ({\mathcal {Q}}_r(x_0,y_0)),\nonumber \\ \end{aligned}$$
(5.28)

we have

$$\begin{aligned}&\mu _\alpha ({\mathcal {Q}}_{r}(x_0,y_0)) \\&\quad \le (\sqrt{n})^{n+2\theta _\alpha } \bigg ( \mu _\alpha \left( \left\{ (x,y) \in {\mathcal {Q}}_r(x_0,y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y)> \lambda ^2 \right\} \right) \\&\qquad + \phi (r,x_0,y_0)^{n-2 \theta _\alpha } \mu _\alpha \\&\qquad \times \left( \left\{ (x,y) \in {\mathcal {Q}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y)> N_d^2 \phi (r,x_0,y_0)^{-2(\theta _\alpha +\alpha )} \lambda ^2 \right\} \right) \\&\qquad + \phi (r,x_0,y_0)^{n-2 \theta _\alpha } \mu _\alpha \\&\qquad \times \left( \left\{ (x,y) \in {\mathcal {Q}}_r(y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y)> N_d^2 \phi (r,x_0,y_0)^{-2(\theta _\alpha +\alpha )} \lambda ^2 \right\} \right) \\&\qquad + \phi (r,x_0,y_0)^{n-2 \theta _\alpha } \mu _\alpha \\&\qquad \times \left( \left\{ (x,y) \in {\mathcal {Q}}_r(x_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0})(x,y)> \phi (r,x_0,y_0)^{-{q_0}(\theta _\alpha +\alpha )} \lambda ^{q_0} \right\} \right) \\&\qquad + \phi (r,x_0,y_0)^{n-2 \theta _\alpha } \mu _\alpha \\&\qquad \times \left( \left\{ (x,y) \in {\mathcal {Q}}_r(y_0) \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0})(x,y) > \phi (r,x_0,y_0)^{-{q_0}(\theta _\alpha +\alpha )}\lambda ^{q_0} \right\} \right) \bigg ). \end{aligned}$$

6 Level set estimates

By combining the good-\(\lambda \) inequalities given by Lemma 5.1 and Corollary 5.4 with a technically involved covering argument, it is possible to deduce a level set estimate which will then imply the desired \(L^p\) estimate for \(U_\alpha \) with respect to \(\mu _\alpha \). Since the mentioned covering argument was already implemented in great detail in [43,  Section 7] and up to some minor straightforward adjustments the argument needed in our setting works exactly like the one in [43,  Section 7], we omit most of the technical details leading to this level set estimate.

More precisely, in the arguments of [43,  Section 7], we need to replace the ball \({\mathcal {B}}_{4n}\) by \({\mathcal {B}}_{5n}\), the function U by \(U_\alpha \), the measure \(\mu \) by \(\mu _\alpha \) and the parameters s and \(\theta \) by \(\alpha \) and \(\theta _\alpha \), respectively, while the good-\(\lambda \) inequalities given by [43,  Lemma 6.1, Corollary 6.4] need to be replaced by our corresponding good-\(\lambda \) inequalities given by Lemma 5.1 and Corollary 5.4. If, in addition, we take into account our different definition (4.8) of the number \(\lambda _0\) in comparison to [43,  Formula (5.10)], we arrive at the following level set estimate, which corresponds to [43,  Corollary 7.8].

Proposition 6.1

Assume that the estimate (4.10) is satisfied in any ball contained in \(B_{5n}\) with respect to \(\alpha \) and that the estimate (5.8) is satisfied in any ball contained in \(B_{5n}\) with respect to q. Then there exists some \(\varepsilon _0=\varepsilon _0(n,\theta _\alpha ) \in (0,1)\), such that the following is true. Let \(\varepsilon \in (0,\varepsilon _0]\) and let \(\delta =\delta (\varepsilon ,n,s,\alpha ,\theta ,\Lambda ,m)>0\) be given by Lemma 5.1. Then after choosing the number \(M_0=M_0(n,\theta _\alpha )>0\) in (4.8) large enough, for any \(\lambda \ge \lambda _0\) we have

$$\begin{aligned}&\mu _\alpha \left( \left\{ (x,y) \in {\mathcal {Q}}_{1} \mid {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)(x,y)> N_{\varepsilon ,q}^2 \lambda ^2 \right\} \right) \\&\quad \le C \left( \frac{\varepsilon }{\lambda ^2} \int _{{\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)> \lambda ^2 \right\} } {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 ) d\mu _\alpha \right. \\&\left. \qquad + \frac{1}{\delta ^{q_0} \lambda ^{q_0}} \int _{{\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0}) > \delta ^{q_0} \lambda ^{q_0} \right\} } {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0} ) d\mu _\alpha \right) , \end{aligned}$$

where \(C=C(n,\alpha ,\theta _\alpha )>0\).

We remark that the number \(\varepsilon _0\) arises from the restriction [43,  Formula (7.26)] adapted to our setting, that is, we have

$$\begin{aligned} \varepsilon _0=\frac{1}{4 (\sqrt{n})^{n+2\theta _\alpha }c} \end{aligned}$$

for some \(c=c(n,\theta _\alpha ) \ge 1\). In addition, the number \(M_0\) needs to be chosen large enough in [43,  Formula (7.7)] adapted to our setting. More precisely, in our setting [43,  Formula (7.7)] needs to be replaced by

where all constants depend only on \(n,s,\alpha \) and \(\theta \) and the last inequality is obtained by choosing \(M_0\) large enough in (4.8) and taking into account our definition (4.8) of \(\lambda _0\).

7 A priori estimates

In order to establish a priori estimates for weak solutions to the equation \(L_{A} u = (-\Delta )^s g\), we need the following standard alternative characterization of the \(L^p\) norm which follows from Fubini’s theorem in a straightforward way.

Lemma 7.1

Let \(\nu \) be a \(\sigma \)-finite measure on \({\mathbb {R}}^n\) and let \(h:\Omega \rightarrow [0,+\infty ]\) be a \(\nu \)-measurable function in a domain \(\Omega \subset {\mathbb {R}}^n\). Then for any \(0< \beta < \infty \), we have

$$\begin{aligned} \int _{\Omega } h^\beta d\nu = \beta \int _0^{\infty } \lambda ^{\beta -1} \nu \left( \{x \in \Omega \mid h(x)>\lambda \} \right) d\lambda . \end{aligned}$$
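A brief sketch of why this identity holds may be useful. For fixed \(x \in \Omega \), we have \(h(x)^\beta = \beta \int _0^{h(x)} \lambda ^{\beta -1} d\lambda = \beta \int _0^\infty \lambda ^{\beta -1} \chi _{\{h(x)>\lambda \}} d\lambda \), where \(\chi \) denotes the indicator function. Integrating over \(\Omega \) and interchanging the order of integration, which is permitted for non-negative integrands and \(\sigma \)-finite measures, yields

$$\begin{aligned} \int _{\Omega } h^\beta d\nu = \beta \int _0^{\infty } \lambda ^{\beta -1} \int _{\Omega } \chi _{\{h>\lambda \}} d\nu \, d\lambda = \beta \int _0^{\infty } \lambda ^{\beta -1} \nu \left( \{x \in \Omega \mid h(x)>\lambda \} \right) d\lambda . \end{aligned}$$

As an illustration, which is our own unpacking and assumes \({\widetilde{q}}>2\), the choice \(\beta ={\widetilde{q}}-2\), \(h={{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )^\frac{1}{2}\) and \(d\nu ={{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )d\mu _\alpha \) on \({\mathcal {Q}}_1\) gives

$$\begin{aligned} \int _{{\mathcal {Q}}_1} \left( {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 ) \right) ^\frac{{\widetilde{q}}}{2} d\mu _\alpha = ({\widetilde{q}}-2) \int _0^{\infty } \lambda ^{{\widetilde{q}}-3} \int _{{\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)> \lambda ^2 \right\} } {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 ) d\mu _\alpha \, d\lambda . \end{aligned}$$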

Proposition 7.2

Let \(q \in [2,p)\) and \({\widetilde{q}} \in (q_0,q_\alpha ^\star )\), where \(q_0\) is given by (4.6). Then there exists some small enough \(\delta = \delta (n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}}) > 0\) such that if \(A \in {\mathcal {L}}_0(\Lambda )\) is \(\delta \)-vanishing in \({\mathcal {B}}_{5n}\) and \(g \in W^{s,2}({\mathbb {R}}^n)\) satisfies \(G_\alpha \in L^{{\widetilde{q}}}({\mathcal {B}}_{5n},\mu _\alpha )\), then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation \(L_{A} u = (-\Delta )^s g\) in \(B_{5n}\) that satisfies \(U_\alpha \in L^{{\widetilde{q}}}({\mathcal {B}}_{5n},\mu _\alpha )\), the estimate (4.10) in any ball contained in \(B_{5n}\) with respect to \(\alpha \) and (5.8) in any ball contained in \(B_{5n}\) with respect to q, we have

where \(C=C(n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}},p)>0\).

Proof

Let \(\varepsilon \) be a number to be chosen small enough and consider the corresponding \(\delta = \delta (\varepsilon ,n,s,\alpha ,\theta ,\Lambda ,m) > 0\) given by Lemma 5.1. Then by using Lemma 7.1 multiple times, first with \(\beta ={\widetilde{q}}\), \(h={{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )^\frac{1}{2}\) and \(d\nu =d\mu _\alpha \), then with \(\beta ={\widetilde{q}}-2\), \(h={{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )^\frac{1}{2}\) and \(d\nu ={{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )d\mu _\alpha \), and also with \(\beta ={\widetilde{q}}-q_0\), \(h={{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0} )^\frac{1}{q_0}\) and \(d\nu ={{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0} )d\mu _\alpha \), together with a change of variables, Proposition 6.1 and the definition of \(N_{\varepsilon ,q}\) from (5.11), we obtain

$$\begin{aligned}&\int _{{\mathcal {Q}}_1} \left( {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 ) \right) ^\frac{{\widetilde{q}}}{2} d\mu _\alpha \\&\quad = {\widetilde{q}} \int _0^{\infty } \lambda ^{{\widetilde{q}}-1} \mu _\alpha \left( {\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )> \lambda ^2 \right\} \right) d\lambda \\&\quad = {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \int _0^{\infty } \lambda ^{{\widetilde{q}}-1} \mu _\alpha \left( {\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )> N_{\varepsilon ,q}^2 \lambda ^2 \right\} \right) d\lambda \\&\quad = {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \int _0^{\lambda _0} \lambda ^{{\widetilde{q}}-1} \mu _\alpha \left( {\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )> N_{\varepsilon ,q}^2 \lambda ^2 \right\} \right) d\lambda \\&\qquad + {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \int _{\lambda _0}^{\infty } \lambda ^{{\widetilde{q}}-1} \mu _\alpha \left( {\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 )> N_{\varepsilon ,q}^2 \lambda ^2 \right\} \right) d\lambda \\&\quad \le {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \mu _\alpha ({\mathcal {Q}}_1) \lambda _0^{{\widetilde{q}}} \\&\qquad + C_1 {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \varepsilon \int _{0}^{\infty } \lambda ^{{\widetilde{q}}-3} \int _{{\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2)> \lambda ^2 \right\} } {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 ) d\mu _\alpha d\lambda \\&\qquad + C_1 {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}}\delta ^{-q_0} \int _{0}^{\infty } \lambda ^{{\widetilde{q}}-q_0-1} \int _{{\mathcal {Q}}_1 \cap \left\{ {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0}) > \delta ^{q_0} \lambda ^{q_0} \right\} } {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0} ) d\mu _\alpha d\lambda 
\\&\quad = {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \mu _\alpha ({\mathcal {Q}}_1) \lambda _0^{{\widetilde{q}}} \\&\qquad + C_1 {\widetilde{q}} C_{nd} C_{s,\theta } C_\alpha N_d 10^{10n} \varepsilon ^{1-{\widetilde{q}}/q_\alpha ^\star } \int _{{\mathcal {Q}}_1} \left( {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 ) \right) ^\frac{{\widetilde{q}}}{2} d\mu _\alpha \\&\qquad + C_1 {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \delta ^{-q_0} \int _{{\mathcal {Q}}_1} \left( {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0} ) \right) ^\frac{{\widetilde{q}}}{q_0} d\mu _\alpha , \end{aligned}$$

where \(C_1=C_1(n,s,\alpha ,\theta ) \ge 1\). Next, we set

$$\begin{aligned} \varepsilon {:}{=} \min \left\{ \varepsilon _0, \left( 2 C_1 {\widetilde{q}} C_{nd} C_{s,\theta } C_\alpha N_d 10^{10n} \right) ^{-\frac{q_\alpha ^\star }{q_\alpha ^\star -{\widetilde{q}}}} \right\} , \end{aligned}$$

so that \(\varepsilon \) is a valid choice in Proposition 6.1 and moreover, we have

$$\begin{aligned} C_1 {\widetilde{q}} C_{nd} C_{s,\theta } C_\alpha N_d 10^{10n} \varepsilon ^{1-{\widetilde{q}}/q_\alpha ^\star } \le \frac{1}{2}. \end{aligned}$$
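To see that the above choice of \(\varepsilon \) indeed yields the last display, abbreviate \(K{:}{=}C_1 {\widetilde{q}} C_{nd} C_{s,\theta } C_\alpha N_d 10^{10n}\), a shorthand introduced only for this remark. Since \({\widetilde{q}}<q_\alpha ^\star \), the exponent \(1-{\widetilde{q}}/q_\alpha ^\star =(q_\alpha ^\star -{\widetilde{q}})/q_\alpha ^\star \) is positive, so that \(t \mapsto t^{1-{\widetilde{q}}/q_\alpha ^\star }\) is increasing on \((0,\infty )\) and therefore

$$\begin{aligned} \varepsilon ^{1-{\widetilde{q}}/q_\alpha ^\star } \le \left( (2K)^{-\frac{q_\alpha ^\star }{q_\alpha ^\star -{\widetilde{q}}}} \right) ^{\frac{q_\alpha ^\star -{\widetilde{q}}}{q_\alpha ^\star }} = \frac{1}{2K}, \qquad \text {that is,} \quad K \varepsilon ^{1-{\widetilde{q}}/q_\alpha ^\star } \le \frac{1}{2}. \end{aligned}$$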

Since in addition by assumption we have \(U_\alpha \in L^{{\widetilde{q}}}({\mathcal {B}}_{5n},\mu _\alpha )\), by Proposition 3.4 we have

$$\begin{aligned} \int _{{\mathcal {Q}}_1} \left( {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 ) \right) ^\frac{{\widetilde{q}}}{2} d\mu _\alpha < \infty , \end{aligned}$$

so that we can reabsorb the second to last term on the right-hand side of the first display of the proof into the left-hand side, which yields

$$\begin{aligned} \int _{{\mathcal {Q}}_1} \left( {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (U_\alpha ^2 ) \right) ^\frac{{\widetilde{q}}}{2} d\mu _\alpha \le&2 {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \mu _\alpha ({\mathcal {Q}}_1) \lambda _0^{{\widetilde{q}}}+ 2 C_1 {\widetilde{q}} N_{\varepsilon ,q}^{{\widetilde{q}}} \delta ^{-q_0} \int _{{\mathcal {Q}}_1} \left( {{\mathcal {M}}}_{{\mathcal {B}}_{5n}} (G_\alpha ^{q_0} ) \right) ^\frac{{\widetilde{q}}}{q_0} d\mu _\alpha . \end{aligned}$$

Now in view of Propositions 3.5 and 3.4, taking into account the definition of \(\lambda _0\) from (4.8) along with using the estimate (4.10) with \(u_0=u\) and Hölder’s inequality, we obtain

where we also used that \(m \le q_0 \le {\widetilde{q}}\) and all constants depend only on \(n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}}\) and p. This proves the desired estimate with \(C=C_5^{1/{\widetilde{q}}}\). \(\square \)

Corollary 7.3

Consider some \(q \in [2,p)\) and some \({\widetilde{q}} \in (q_0,q_\alpha ^\star )\). Then there exists some small enough \(\delta = \delta (n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}}) > 0\) such that if \(A \in {\mathcal {L}}_0(\Lambda )\) is \(\delta \)-vanishing in \(B_1\) and \(g \in W^{s,2}({\mathbb {R}}^n)\) satisfies \(G_\alpha \in L^{{\widetilde{q}}}({\mathcal {B}}_1,\mu _\alpha )\), then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation \(L_{A} u = (-\Delta )^s g\) in \(B_1\) that satisfies \(U_\alpha \in L^{{\widetilde{q}}}({\mathcal {B}}_1,\mu _\alpha )\), the estimate (4.10) in any ball contained in \(B_1\) with respect to \(\alpha \) and (5.8) in any ball contained in \(B_1\) with respect to q, we have the estimate

(7.1)

where \(C=C(n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}},p)>0\).

Proof

There exists some small enough \(r_1 \in \left( 0,1 \right) \) such that for any \(z \in B_{1/2}\), we have

$$\begin{aligned} B_{5nr_1}(z) \Subset B_1. \end{aligned}$$
(7.2)

Fix some \(z \in B_{1/2}\) and consider the scaled functions \(u_z,g_z \in W^{s,2}({\mathbb {R}}^n)\) given by

$$\begin{aligned} u_z(x){:}{=}u(r_1 x+z), \quad g_z(x){:}{=}g(r_1 x+z) \end{aligned}$$

and also

$$\begin{aligned} A_z(x,y){:}{=} A(r_1 x+z,r_1 y+z). \end{aligned}$$
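The reason this scaling is compatible with the equation can be sketched as follows. The computation is purely formal: it works with the pointwise expression (1.2) rather than with the weak formulation, and it uses the standard homogeneity of the fractional Laplacian. Writing \(x^\prime {:}{=}r_1 x+z\) and substituting \(y^\prime =r_1 y+z\), so that \(dy=r_1^{-n}dy^\prime \) and \(|x-y|=|x^\prime -y^\prime |/r_1\), we obtain

$$\begin{aligned} L_{A_z} u_z(x)&= p.v. \int _{{\mathbb {R}}^n} \frac{A(r_1 x+z,r_1 y+z)}{|x-y|^{n+2s}} (u(r_1 x+z)-u(r_1 y+z))dy \\&= r_1^{2s} \, p.v. \int _{{\mathbb {R}}^n} \frac{A(x^\prime ,y^\prime )}{|x^\prime -y^\prime |^{n+2s}} (u(x^\prime )-u(y^\prime ))dy^\prime = r_1^{2s} (L_A u)(x^\prime ), \\ (-\Delta )^s g_z(x)&= r_1^{2s} \left( (-\Delta )^s g \right) (x^\prime ). \end{aligned}$$

Both sides pick up the same factor \(r_1^{2s}\), so the relation \(L_A u=(-\Delta )^s g\) at the point \(x^\prime \) corresponds to \(L_{A_z}u_z=(-\Delta )^s g_z\) at the point x.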

Since A is \(\delta \)-vanishing in \({\mathcal {B}}_1\), we see that \(A_z\) clearly is \(\delta \)-vanishing in \(B_{\frac{1}{5nr_1}}(-z) \supset B_{5n}\). Furthermore, in view of (7.2), \(u_z\) is a weak solution of \(L_{A_z} u_z = (-\Delta )^s g_z\) in \(B_{\frac{1}{5nr_1}}(-z) \supset B_{5n}\). Now fix some \(r>0\) and some \(x_0 \in {\mathbb {R}}^n\) such that \(B_r(x_0) \subset B_{5n}\). Then again in view of (7.2), we clearly have

$$\begin{aligned} B_{r_1 r}(r_1 x_0+z) \subset B_1, \end{aligned}$$

so that by the assumption that the estimate (5.8) holds for any ball contained in \(B_1\), the estimate (5.8) holds with respect to the ball \(B_{r_1 r}(r_1 x_0+z)\). Together with changes of variables and taking into account (4.9), by straightforward computations similar to [43,  Formula (8.2)] it is now easy to verify that the functions

$$\begin{aligned} (U_\alpha )_z(x,y){:}{=}\frac{|u_z(x)-u_z(y)|}{|x-y|^{\alpha +\theta _\alpha }}, \quad (G_\alpha )_z(x,y){:}{=}\frac{|g_z(x)-g_z(y)|}{|x-y|^{\alpha +\theta _\alpha }}, \end{aligned}$$
$$\begin{aligned} U_z(x,y){:}{=}\frac{|u_z(x)-u_z(y)|}{|x-y|^{s+\theta }}, \quad G_z(x,y){:}{=}\frac{|g_z(x)-g_z(y)|}{|x-y|^{s+\theta }} \end{aligned}$$

satisfy the estimate (4.10) in any ball contained in \(B_{5n}\) with respect to \(\alpha \) and the estimate (5.8) in any ball contained in \(B_{5n}\) with respect to q. Since in addition the assumption that \(U_\alpha \in L^{{\widetilde{q}}}({\mathcal {B}}_1,\mu _\alpha )\) clearly implies that \((U_\alpha )_z \in L^{{\widetilde{q}}} \left( {\mathcal {B}}_{\frac{1}{5nr_1}}(-z),\mu _\alpha \right) \subset L^{{\widetilde{q}}}\left( {\mathcal {B}}_{5n} ,\mu _\alpha \right) \), by Proposition 7.2 we obtain that

where \(C_4=C_4(n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}},p)>0\). By combining the last display with another straightforward computation involving changes of variables (cf. [43,  Formula (8.3)]), we obtain

where again \(C_5=C_5(n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}},p)>0\). Since \(\left\{ B_{r_1/2}(z) \right\} _{z \in B_{1/2}}\) is an open covering of the compact set \({\overline{B}}_{1/2}\), there is a finite subcover \(\left\{ B_{r_1/2}(z_j) \right\} _{j=1}^N\) of \(B_{1/2}\). Thus, summing up the above estimates applied with \(z=z_j\) over \(j=1,\ldots ,N\) in essentially the same way as in the last display in the proof of [43,  Corollary 8.3] yields the estimate (7.1), which finishes the proof. \(\square \)

In view of another straightforward scaling argument (cf. [43,  Corollary 8.4]), we also have the following scaled version of Corollary 7.3.

Corollary 7.4

Let \(r>0\) and \(z \in {\mathbb {R}}^n\) and consider some \(q \in [2,p)\) and some \({\widetilde{q}} \in (q_0,q_\alpha ^\star )\). Then there exists some small enough \(\delta = \delta (n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}}) > 0\) such that if \(A \in {\mathcal {L}}_0(\Lambda )\) is \(\delta \)-vanishing in \({\mathcal {B}}_r(z)\) and \(g \in W^{s,2}({\mathbb {R}}^n)\) satisfies \(G_\alpha \in L^{{\widetilde{q}}}({\mathcal {B}}_r(z),\mu _\alpha )\), then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation \(L_{A} u = (-\Delta )^s g\) in \(B_r(z)\) that satisfies \(U_\alpha \in L^{{\widetilde{q}}}({\mathcal {B}}_r(z),\mu _\alpha )\), the estimate (4.10) in any ball contained in \(B_r(z)\) with respect to \(\alpha \) and (5.8) in any ball contained in \(B_r(z)\) with respect to q, we have the estimate

where \(C=C(n,s,\alpha ,\theta ,\Lambda ,m,q,{\widetilde{q}},p)>0\).

Next, we use an iteration argument in order to drop the assumption (5.8) and obtain higher integrability all the way up to the exponent p.

Proposition 7.5

Let \(r>0\), \(z \in {\mathbb {R}}^n\), \(s \in (0,1)\) and \(p \in (m,\infty )\), where m satisfies (4.5). Then there exists some small enough \(\delta = \delta (p,n,s,\alpha ,\theta ,\Lambda ,m) > 0\) such that if \(A \in {\mathcal {L}}_0(\Lambda )\) is \(\delta \)-vanishing in \({\mathcal {B}}_{r}(z)\) and \(g \in W^{s,2}({\mathbb {R}}^n)\) satisfies \(G \in L^{p}({\mathcal {B}}_{r}(z),\mu )\), then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of the equation \(L_{A} u = (-\Delta )^s g\) in \(B_{r}(z)\) that satisfies \(U_\alpha \in L^{p}({\mathcal {B}}_{r}(z),\mu _\alpha )\) and the estimate (4.10) in any ball contained in \(B_r(z)\), we have

(7.3)

where \(C=C(n,s,\alpha ,\theta ,\Lambda ,m,p)>0\).

Proof

Define iteratively a sequence \(\{q_i\}_{i=1}^\infty \) of real numbers by

$$\begin{aligned} q_1{:}{=}2, \quad q_{i+1}{:}{=} \min \{(q_i+(q_i)^\star )/2,p\}, \end{aligned}$$

where as in (4.4) we let

$$\begin{aligned} (q_i)^\star ={\left\{ \begin{array}{ll} \frac{nq_i}{n-\alpha q_i}, &{} \text {if } n>\alpha q_i \\ 2p, &{} \text {if } n \le \alpha q_i. \end{array}\right. } \end{aligned}$$

Since for any i with \(n>\alpha q_{i+1}\) we have

$$\begin{aligned} \left( q_i+\frac{nq_i}{n-\alpha q_i} \right) /2 -q_i =\frac{nq_{i}}{2(n-\alpha q_{i})} - \frac{q_i}{2} \ge \frac{4s}{2(n-\alpha )}>0, \end{aligned}$$

there clearly exists some \(i_p \in {\mathbb {N}}\) such that \(q_{i_p} = p\).
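The number of steps can be made explicit. The following is a sketch that relies on the lower bound in the preceding display and on \(2=q_1 \le q_i \le p\). If \(n \le \alpha q_i\) for some i, then \((q_i)^\star =2p\) and hence \(q_{i+1}=\min \{(q_i+2p)/2,p\}=p\); in particular, \(q_2=p\) if \(n \le 2\alpha \). Otherwise \(n>2\alpha \), and with the shorthand \(d{:}{=}\frac{2s}{n-\alpha }>0\), which is introduced only for this remark, we have

$$\begin{aligned} q_{i+1} \ge \min \{q_i+d,p\} \quad \text {for all } i \in {\mathbb {N}}, \qquad \text {so that} \quad q_i \ge \min \{2+(i-1)d,p\} \end{aligned}$$

by induction on i. Consequently, \(q_i=p\) as soon as

$$\begin{aligned} i \ge 1+\frac{(p-2)(n-\alpha )}{2s}. \end{aligned}$$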

Since the estimate (5.8) is trivially satisfied for \(q=q_1=2\), and in view of the additional assumption that \(U_\alpha \in L^{p}({\mathcal {B}}_{r}(z),\mu _\alpha )\) we in particular have \(U_\alpha \in L^{q_1}({\mathcal {B}}_{r}(z),\mu _\alpha )\), if we choose \(\delta \) small enough such that Corollary 7.4 is applicable with \(q=2\) and \({\widetilde{q}} = q_2\), then all assumptions of Corollary 7.4 are satisfied with respect to \(q=q_1=2\) and \({\widetilde{q}} = q_2\in (\min \{m,q_1\},(q_1)^\star )\), so that we obtain

(7.4)

where \(C_1=C_1(n,s,\alpha ,\theta ,\Lambda ,m,p)>0\). If \(i_p=2\), then \(q_2=p\) and the proof is finished. Otherwise, we observe that since r and z are arbitrary, the estimate (7.4) holds also in any ball that is contained in \(B_r(z)\), so that the estimate (5.8) is satisfied with respect to \(q=q_2\) in any ball contained in \(B_r(z)\). Since also \(U_\alpha \in L^{p}({\mathcal {B}}_{r}(z),\mu _\alpha ) \subset L^{q_2}({\mathcal {B}}_{r}(z),\mu _\alpha )\), if we choose \(\delta \) smaller if necessary such that Corollary 7.4 is applicable with \(q=q_2\) and \({\widetilde{q}} = q_3\), then all assumptions of Corollary 7.4 are satisfied with respect to \(q=q_2\) and \({\widetilde{q}} = q_3 = (q_2+(q_2)^\star )/2 \in (q_2,(q_2)^\star )\), so that we obtain the estimate

where \(C_2=C_2(n,s,\alpha ,\theta ,\Lambda ,m,p)>0\). If \(i_p=3\), then \(q_3=p\) and the proof is finished. Otherwise, iterating this procedure \(i_p-1\) times and using that \(q_{i_p}=p\) also leads to the estimate (7.3). \(\square \)

Finally, by another delicate iteration argument we also drop the assumption that the estimate (4.10) holds, achieving an a priori higher differentiability estimate for any \(s<t<\min \{2s,1\}\).

Proposition 7.6

Let \(r>0\), \(z \in {\mathbb {R}}^n\), \(s \in (0,1)\), \(s<t<\min \{2s,1\}\) and \(p \in (2,\infty )\). Then there exists some small enough \(\delta = \delta (p,n,s,t,\Lambda ) > 0\) such that if \(A \in {\mathcal {L}}_0(\Lambda )\) is \(\delta \)-vanishing in \({\mathcal {B}}_{r}(z)\) and g belongs to \(W^{s,2}({\mathbb {R}}^n) \cap W^{t,p}(B_r(z))\), then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n) \cap W^{t,p}(B_r(z))\) of the equation \(L_{A} u = (-\Delta )^s g\) in \(B_{r}(z)\), we have

$$\begin{aligned}{}[u]_{W^{t,p}(B_{r/2}(z))} \le C \left( [u]_{W^{s,2}({\mathbb {R}}^n)} + [g]_{W^{t,p}(B_{r}(z))} + [g]_{W^{s,2}({\mathbb {R}}^n)} \right) , \end{aligned}$$
(7.5)

where \(C=C(n,s,t,\Lambda ,p,r)>0\).

Proof

Fix some \(s<t<\min \{2s,1\}\) and some \(p \in (2,\infty )\). All constants in this proof will only depend on \(n,s,t,\Lambda ,p\) and r. First of all, the assumption that \(u \in W^{t,p}(B_r(z))\) implies that \(U_\alpha =U_{\alpha ,\theta _\alpha } \in L^p(B_r(z),\mu _\alpha )\) for any \(s \le \alpha <\min \{2s,1\}\) such that \(\alpha + \left( 1-\frac{2}{p} \right) \theta _\alpha \le t\).

Let \(\delta >0\) be a number to be chosen small enough, fix some \(0<\gamma <\min \{2s,1\}-t\) and choose the parameter \(\theta \) by \(\theta {:}{=}\min \{s,1-s\}-\gamma \in (0,\min \{s,1-s\})\), so that in particular \(t<s+\theta \). In addition, define sequences of parameters \(\{m_k\}_{k \in {\mathbb {N}}}\) and \(\{\varepsilon _k\}_{k \in {\mathbb {N}}}\) by

$$\begin{aligned} m_k&{:}{=}\frac{1}{k} \min \left\{ \frac{2n-3s}{n-2s},1+\frac{p}{2} \right\} \\&\quad +\left( 1-\frac{1}{k} \right) \min \left\{ \frac{2(n-s)}{n-2s},p \right\} \in \left( 2, \min \left\{ \frac{2(n-s)}{n-2s},p \right\} \right) \end{aligned}$$

and

$$\begin{aligned} \varepsilon _k{:}{=}1-\frac{2}{m_k} \in (0,1). \end{aligned}$$

In particular, note that as indicated above, for any \(k \in {\mathbb {N}}\) the parameter \(m_k\) belongs to the range given by (4.5). Define inductively further sequences of parameters \(\{t_k\}_{k \in {\mathbb {N}}_0}\) and \(\{\theta _{t_k}\}_{k \in {\mathbb {N}}_0}\) by \(t_0{:}{=}s\), \(\theta _{t_0}{:}{=}\theta \) and

$$\begin{aligned} t_k{:}{=}t_{k-1}+\frac{\varepsilon _k \theta _{t_{k-1}}}{2}, \quad \theta _{t_{k}}{:}{=}s+\theta -t_{k}, \quad k \ge 1. \end{aligned}$$

Let

$$\begin{aligned} \varepsilon _\star {:}{=} \lim _{k \rightarrow \infty } \varepsilon _k = 1-2 / \min \left\{ \frac{2(n-s)}{n-2s},p \right\} >0. \end{aligned}$$

Since the sequence \(\{t_k\}_{k \in {\mathbb {N}}_0}\) is strictly increasing and bounded by \(s+\theta \), the limit \(t_\star {:}{=} \lim _{k \rightarrow \infty } t_k\) exists and satisfies \(t_\star =t_\star +\frac{\varepsilon _\star }{2}(s+\theta -t_\star ),\) which leads to \(t_\star =s+\theta \). Thus, since we have \(t<s+\theta =t_\star \), there exists a non-negative integer \({\widetilde{k}}\) such that \(t_{{\widetilde{k}}} < t\) but \(t_{{\widetilde{k}} +1} \ge t\). Also, define

$$\begin{aligned} \theta _t{:}{=}\frac{t-t_{{\widetilde{k}}}}{1-2/p}, \quad {{\widetilde{\theta }}} {:}{=} \theta _t+t_{{\widetilde{k}}}-s \end{aligned}$$

and note that since \(p \ge m_{{\widetilde{k}}}\), we have

$$\begin{aligned} \theta _t \le \frac{t-t_{{\widetilde{k}} +1}+\varepsilon _{{\widetilde{k}}} \theta _{t_{{\widetilde{k}}}}}{\varepsilon _{{\widetilde{k}}}} \le \theta _{t_{{\widetilde{k}}}}=s+\theta -t_{{\widetilde{k}}}, \end{aligned}$$

which implies that

$$\begin{aligned} 0<{{\widetilde{\theta }}} \le \theta < \min \{s,1-s\}. \end{aligned}$$

Thus, \({{\widetilde{\theta }}}\) also belongs to the range (4.1) and the relation (4.9) is satisfied for \(\theta _\alpha =\theta _t\), \(\alpha =t_{{\widetilde{k}}}\) and with \(\theta \) replaced by \({{\widetilde{\theta }}}\), that is, we have \(\theta _t=s+{{\widetilde{\theta }}} - t_{{\widetilde{k}}}\). In addition, observe that

$$\begin{aligned} t_{{\widetilde{k}}} + \left( 1-\frac{2}{p} \right) \theta _t=t. \end{aligned}$$
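For the reader's convenience, we sketch why \(t_k \rightarrow s+\theta \) as \(k \rightarrow \infty \); the following computation only unwinds the definitions above and is meant as an illustration. Since \(\theta _{t_{k-1}}=s+\theta -t_{k-1}\) for every \(k \ge 1\), the recursion defining \(t_k\) can be rewritten as

$$\begin{aligned} s+\theta -t_k = \left( 1-\frac{\varepsilon _k}{2} \right) (s+\theta -t_{k-1}) = \theta \prod _{j=1}^k \left( 1-\frac{\varepsilon _j}{2} \right) , \quad k \ge 1. \end{aligned}$$

Moreover, we have \(m_k \ge m_1>2\) and thus \(\varepsilon _k \ge \varepsilon _1>0\) for all \(k \in {\mathbb {N}}\), so that

$$\begin{aligned} 0< s+\theta -t_k \le \theta \left( 1-\frac{\varepsilon _1}{2} \right) ^k \xrightarrow {k \rightarrow \infty } 0. \end{aligned}$$

As a purely illustrative instance, for the sample values \(n=3\), \(s=1/2\) and \(p=4\) (chosen here only for the sake of example) one obtains \(m_1=9/4\), \(\varepsilon _1=1/9\) and \(\varepsilon _\star =1/5\), so that the gap \(s+\theta -t_k\) shrinks at least by the factor \(17/18\) in each step.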

If \({\widetilde{k}}=0\), then since for \(\alpha =t_0=s\), the estimate (4.10) is trivially satisfied with \(m=2\), by Corollary 7.5 with \(\theta _\alpha =\theta _t\) and with \(\theta \) replaced by \({{\widetilde{\theta }}}\), for \(\delta \) small enough we have

In the case when \({\widetilde{k}}=0\), the proof is finished. If on the other hand \({\widetilde{k}}>0\), then for any \(x_0 \in B_r(z)\) and any \(r^\prime >0\) such that \(B_{r^\prime }(x_0) \subset B_r(z)\), using Proposition 2.5, Corollary 7.5 with p replaced by \(m_1\) along with Lemma 3.3 yields

for \(\delta \) small enough and any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of \(L_{A} u = (-\Delta )^s g\) in \(B_{r}(z)\). Thus, since in addition \(C_7\) does not depend on r and \(r^\prime \), we conclude that the estimate (4.10) is satisfied in any ball contained in \(B_r(z)\) with respect to \(\alpha =t_1\) and \(m=m_1\). Therefore, in the case when \({\widetilde{k}}=1\), once again by Corollary 7.5 with \(\theta _\alpha =\theta _t\) (which is applicable since \(m_1<p\)) and with \(\theta \) replaced by \({{\widetilde{\theta }}}\), we see that

for \(\delta \) small enough, so that in this case the proof is finished. If \({\widetilde{k}}>1\), then since \(m_2>m_1\), for any \(x_0 \in B_r(z)\) and any \(r^\prime >0\) such that \(B_{r^\prime }(x_0) \subset B_r(z)\), by Proposition 2.5, Corollary 7.5 with p replaced by \(m_2\) and Lemma 3.3, for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n)\) of \(L_{A} u = (-\Delta )^s g\) in \(B_{r}(z)\) and \(\delta \) small enough we have

where \(C_{14}\) does not depend on r and \(r^\prime \), so that (4.10) is satisfied in any ball contained in \(B_r(z)\) with respect to \(\alpha =t_2\) and \(m=m_2\). Thus, if \({\widetilde{k}}=2\), again by applying Corollary 7.5 with respect to \(\theta _\alpha =\theta _t\) and with \(\theta \) replaced by \({{\widetilde{\theta }}}\) we see that the desired estimate (7.5) holds, so that in this case the proof is finished. If \({\widetilde{k}}>2\), then iterating the above procedure \({\widetilde{k}}\) times also leads to the estimate (7.5), which finishes the proof. \(\square \)

We are now able to prove an a priori \(W^{t,p}\) estimate for equations of the type \(L_A u = (-\Delta )^s g\) in the case when A is small in BMO.

Theorem 7.7

Let \(\Omega \subset {\mathbb {R}}^n\) be a domain, \(s \in (0,1)\), \(\Lambda \ge 1\), \(s<t<\min \{2s,1\}\), \(p \in (2,\infty )\) and \(R>0\). Then there exists some small enough \(\delta = \delta (p,n,s,t,\Lambda ) > 0\) such that if \(A \in {\mathcal {L}}_0(\Lambda )\) is \((\delta ,R)\)-BMO in \(\Omega \) and g belongs to \(W^{s,2}({\mathbb {R}}^n) \cap W^{t,p}(\Omega )\), then for any weak solution \(u \in W^{s,2}({\mathbb {R}}^n) \cap W^{t,p}(\Omega )\) of the equation \(L_{A} u = (-\Delta )^s g\) in \(\Omega \) and any relatively compact domain \(\Omega ^\prime \Subset \Omega \), we have

$$\begin{aligned}{}[u]_{W^{t,p}(\Omega ^\prime )} \le C \left( [u]_{W^{s,2}({\mathbb {R}}^n)} + [g]_{W^{t,p}(\Omega )} + [g]_{W^{s,2}({\mathbb {R}}^n)} \right) , \end{aligned}$$
(7.6)

where \(C=C(n,s,t,\Lambda ,R,p,\Omega ^\prime ,\Omega )>0\).

Proof

Fix a relatively compact bounded domain \({\Omega ^\prime } \Subset \Omega \) and let \(\delta =\delta (p,n,s,t,\Lambda )>0\) be given by Proposition 7.6. There exists some \(r \in (0,R)\) such that for any \(z \in \Omega ^\prime \), we have \(B_r(z) \Subset \Omega \). Since A is \((\delta ,R)\)-BMO in \(\Omega \), for any \(z \in \Omega ^\prime \) we conclude that A is \(\delta \)-vanishing in \(B_{r}(z)\). Also, since \(u \in W^{t,p}(\Omega )\), we have \(u \in W^{t,p}(B_{r}(z))\) for any \(z \in \Omega ^\prime \). Therefore, by Proposition 7.6, for any \(z \in \Omega ^\prime \) we obtain the estimate

$$\begin{aligned}{}[u]_{W^{t,p}(B_{r/2}(z))} \le C_1 \left( [u]_{W^{s,2}({\mathbb {R}}^n)} + [g]_{W^{t,p}(B_{r}(z))} + [g]_{W^{s,2}({\mathbb {R}}^n)} \right) , \end{aligned}$$
(7.7)

where \(C_1=C_1(n,s,t,\Lambda ,p,r)\).

Since \(\left\{ B_{r/2}(z) \right\} _{z \in {\Omega ^\prime }}\) is an open covering of \(\overline{\Omega ^\prime }\) and \(\overline{\Omega ^\prime }\) is compact, there exists a finite subcover \(\left\{ B_{r/2}(z_i) \right\} _{i=1}^N\) of \(\overline{\Omega ^\prime }\) and hence of \(\Omega ^\prime \). Now summing over \(i=1,\ldots ,N\) and using the estimate (7.7) for \(z=z_i\) (\(i=1,\ldots ,N\)) yields

$$\begin{aligned} \begin{aligned}{}[u]_{W^{t,p}(\Omega ^\prime )}&\le \sum _{i=1}^N [u]_{W^{t,p}(B_{r/2}(z_i))} \\&\le \sum _{i=1}^N C_2 \left( [u]_{W^{s,2}({\mathbb {R}}^n)} + [g]_{W^{t,p}(B_{r}(z_i))} + [g]_{W^{s,2}({\mathbb {R}}^n)} \right) \\&\le C_{2}N \left( [u]_{W^{s,2}({\mathbb {R}}^n)} + [g]_{W^{t,p}({\Omega })} + [g]_{W^{s,2}({\mathbb {R}}^n)} \right) , \end{aligned} \end{aligned}$$
(7.8)

where \(C_{2}=C_{2}(n,s,t,\Lambda ,p,r)>0\). Since N depends only on \(\Omega ^\prime \) and \({\Omega }\), while r depends only on \(R,\Omega ^\prime \) and \(\Omega \), this proves the estimate (7.6), so that the proof is finished. \(\square \)
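One admissible choice of the radius r in the above proof, stated here only as an illustration, is the following. Since \(\Omega ^\prime \Subset \Omega \), the quantity \(d{:}{=}\text {dist}(\Omega ^\prime ,{\mathbb {R}}^n \setminus \Omega )\) is positive, with the convention \(d=\infty \) if \(\Omega ={\mathbb {R}}^n\), and we may take

$$\begin{aligned} r{:}{=}\frac{1}{2} \min \{R,d\} \in (0,R). \end{aligned}$$

Indeed, for any \(z \in \Omega ^\prime \) we then have

$$\begin{aligned} \overline{B_r(z)} \subset \left\{ x \in {\mathbb {R}}^n \mid \text {dist}(x,\Omega ^\prime ) \le r \right\} \subset \Omega , \end{aligned}$$

since \(r<d\), so that \(B_r(z) \Subset \Omega \). In particular, with this choice r depends only on R, \(\Omega ^\prime \) and \(\Omega \).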

Remark 7.8

Since it might be useful in some applications, we remark that the statement of Theorem 7.7 can be generalized to the setting of a right-hand side that is given by a more general nonlocal operator or even by sums of more general nonlocal operators. For some \(l \in {\mathbb {N}}\) and \(i=1,\ldots ,l\), consider measurable functions \(D_i:{\mathbb {R}}^n \times {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) such that

$$\begin{aligned} \sum _{i=1}^l |D_i(x,y)| \le \Lambda \text { for almost all } x,y \in {\mathbb {R}}^n. \end{aligned}$$
(7.9)

In addition, fix functions \(g_i \in W^{s,2}({\mathbb {R}}^n) \cap W^{t,p}(\Omega )\) and let \(u \in W^{s,2}({\mathbb {R}}^n) \cap W^{t,p}(\Omega )\) be a weak solution of the more general nonlocal equation \(L_A u =\sum _{i=1}^l L_{D_i} g_i\) in \(\Omega \), that is, assume that

$$\begin{aligned}&\int _{{\mathbb {R}}^n} \int _{{\mathbb {R}}^n} \frac{A(x,y)}{|x-y|^{n+2s}} (u(x)-u(y))(\varphi (x)-\varphi (y))dydx \\&\quad = \sum _{i=1}^l \int _{{\mathbb {R}}^n} \int _{{\mathbb {R}}^n} \frac{D_i(x,y)}{|x-y|^{n+2s}}(g_i(x)-g_i(y)) (\varphi (x)-\varphi (y))dydx \quad \forall \varphi \in W^{s,2}_0(\Omega ). \end{aligned}$$

Then the following is true. For s, t and p as in Theorem 7.7, there exists some small enough \(\delta = \delta (p,n,s,t,\Lambda ) > 0\) such that if \(A \in {\mathcal {L}}_0(\Lambda )\) is \((\delta ,R)\)-BMO in \(\Omega \) for some \(R>0\), then for any relatively compact domain \(\Omega ^\prime \Subset \Omega \), we have the a priori estimate

$$\begin{aligned}{}[u]_{W^{t,p}(\Omega ^\prime )} \le C \left( [u]_{W^{s,2}({\mathbb {R}}^n)} + \sum _{i=1}^l [g_i]_{W^{t,p}(\Omega )} + \sum _{i=1}^l [g_i]_{W^{s,2}({\mathbb {R}}^n)} \right) , \end{aligned}$$
(7.10)

where \(C=C(n,s,t,\Lambda ,R,p,\Omega ^\prime ,\Omega )>0\).

This is true since the statement of our comparison estimate given by Proposition 4.1 remains valid for weak solutions u of such equations of the form \(L_A u =\sum _{i=1}^l L_{D_i} g_i\), which can be easily seen by using the bound (7.9) in the estimation of the appropriately adapted integral \(I_2\) in [43,  Proposition 5.1], while the adaptations required to account for the summation over \(i=1,\ldots ,l\) are straightforward and do not change the proofs in any conceptually significant way.

8 Proofs of the main results

We are now in the position to prove our main results.

Proof of Theorem 1.2

Fix relatively compact bounded domains \({\Omega ^\prime } \Subset \Omega _0 \Subset {\Omega ^{\prime \prime }} \Subset \Omega \), where we assume that \(\Omega _0\) is a smooth domain. Let \(\delta =\delta (p,n,s,t,\Lambda )>0\) be given by Theorem 7.7 and let \(\{\psi _m\}_{m=1}^\infty \) be a sequence of standard mollifiers in \({\mathbb {R}}^{n}\) with the properties

$$\begin{aligned} \psi _m \in C_0^\infty (B_{1/m}), \quad \psi _m \ge 0, \quad \int _{{\mathbb {R}}^{n}} \psi _m(x)dx=1 \quad \text {for all } m \in {\mathbb {N}}. \end{aligned}$$
(8.1)

For any \(m \in {\mathbb {N}}\) and \(x \in \Omega _m {:}{=} \left\{ x \in \Omega \mid \text {dist}(x, \partial \Omega ) > 1/m \right\} \), we now define

$$\begin{aligned} f_m(x){:}{=} \int _{\Omega } f(y)\psi _m(x-y)dy. \end{aligned}$$

Next, observe that there exists some large enough \(m_0 \in {\mathbb {N}}\), such that \(\Omega ^{\prime \prime } \subset \Omega _m\) for all \(m \ge m_0\). Since \(f \in L^\frac{np}{n+(2s-t)p}_{loc}(\Omega )\) and \({\Omega ^{\prime \prime }} \Subset \Omega \), by standard properties of mollifiers we have

$$\begin{aligned} f_m \xrightarrow {m \rightarrow \infty } f \quad \text {in } L^\frac{np}{n+(2s-t)p}({\Omega ^{\prime \prime }}) \end{aligned}$$
(8.2)

and \(f_m \in L^\infty ({\Omega ^{\prime \prime }})\) for any \(m \ge m_0\). In addition, for any \(m \ge m_0\), by [42,  Proposition 4.1] there exists a unique weak solution \(u_m \in W^{s,2}({\mathbb {R}}^n)\) of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} L_{A} u_m = f_m &{} \text { in } \Omega ^{\prime \prime } \\ u_m = u &{} \text { a.e. in } {\mathbb {R}}^n \setminus \Omega ^{\prime \prime }. \end{array}\right. } \end{aligned}$$
(8.3)

Since \(w_m{:}{=}u-u_m \in W_0^{s,2}(\Omega ^{\prime \prime })\) is a weak solution of the equation \(L_A w_m = f-f_m\) in \(\Omega ^{\prime \prime }\), using \(w_m\) itself as a test function in this equation, along with the lower bound in (1.3), Hölder’s inequality and the fractional Sobolev inequality (see [20,  Theorem 6.5]), we obtain

$$\begin{aligned} \begin{aligned}{}[w_m]_{W^{s,2}({\mathbb {R}}^n)}^2&\le \Lambda \int _{{\mathbb {R}}^n} \int _{{\mathbb {R}}^n} A(x,y) \frac{(w_m(x)-w_m(y))^2}{|x-y|^{n+2s}}dydx \\&= \Lambda \int _{\Omega ^{\prime \prime }} (f-f_m)w_m dx \\&\le \Lambda ||f-f_m||_{L^\frac{2n}{n+2s}(\Omega ^{\prime \prime })} ||w_m||_{L^\frac{2n}{n-2s}({\mathbb {R}}^n)} \\&\le C_1 ||f-f_m||_{L^\frac{np}{n+(2s-t)p}(\Omega ^{\prime \prime })} [w_m]_{W^{s,2}({\mathbb {R}}^n)}, \end{aligned} \end{aligned}$$
(8.4)

where \(C_{1}=C_{1}(n,s,t,p,\Lambda ,\Omega ^{\prime \prime })>0\), so that along with (8.2), we deduce that

$$\begin{aligned}{}[w_m]_{W^{s,2}({\mathbb {R}}^n)} \le C_{1} ||f-f_m||_{L^\frac{np}{n+(2s-t)p}(\Omega ^{\prime \prime })} \xrightarrow {m \rightarrow \infty } 0 \end{aligned}$$

and

$$\begin{aligned} \lim _{m \rightarrow \infty } [u_m]_{W^{s,2}({\mathbb {R}}^n)} = [u]_{W^{s,2}({\mathbb {R}}^n)}. \end{aligned}$$
(8.5)
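The exponents appearing in (8.4) and the limit (8.5) can be checked as follows; this routine verification is included only as a sketch, and the notation \(q_p\) is used only here. First, the exponents \(\frac{2n}{n+2s}\) and \(\frac{2n}{n-2s}\) are Hölder conjugate, since

$$\begin{aligned} \frac{n+2s}{2n}+\frac{n-2s}{2n}=1. \end{aligned}$$

Next, setting \(q_p{:}{=}\frac{np}{n+(2s-t)p}\) and using that \(p>2\) and \(t \ge s\), we have

$$\begin{aligned} \frac{1}{q_p}=\frac{1}{p}+\frac{2s-t}{n}<\frac{1}{2}+\frac{s}{n}=\frac{n+2s}{2n}, \end{aligned}$$

so that \(q_p>\frac{2n}{n+2s}\). Since \(\Omega ^{\prime \prime }\) is bounded, Hölder’s inequality therefore yields

$$\begin{aligned} ||f-f_m||_{L^\frac{2n}{n+2s}(\Omega ^{\prime \prime })} \le |\Omega ^{\prime \prime }|^{\frac{n+2s}{2n}-\frac{1}{q_p}} ||f-f_m||_{L^{q_p}(\Omega ^{\prime \prime })}, \end{aligned}$$

which explains the dependence of the constant in the last line of (8.4) on \(\Omega ^{\prime \prime }\). Finally, the limit (8.5) follows from the reverse triangle inequality for the seminorm, namely

$$\begin{aligned} \left| [u_m]_{W^{s,2}({\mathbb {R}}^n)}-[u]_{W^{s,2}({\mathbb {R}}^n)} \right| \le [u-u_m]_{W^{s,2}({\mathbb {R}}^n)}=[w_m]_{W^{s,2}({\mathbb {R}}^n)} \xrightarrow {m \rightarrow \infty } 0. \end{aligned}$$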

Next, for any \(m \in {\mathbb {N}}\) let \(g_m \in W^{s,2}({\mathbb {R}}^n)\) be the unique weak solution of the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} (-\Delta )^s g_m = f_m &{} \text { in } \Omega ^{\prime \prime } \\ g_m = 0 &{} \text { a.e. in } {\mathbb {R}}^n \setminus \Omega ^{\prime \prime }. \end{array}\right. } \end{aligned}$$
(8.6)

Then by a similar reasoning as in (8.4), each function \(g_m\) satisfies the estimate

$$\begin{aligned}{}[g_m]_{W^{s,2}({\mathbb {R}}^n)} \le C_2 ||f_m||_{L^{\frac{np}{n+(2s-t)p}}(\Omega ^{\prime \prime })}, \end{aligned}$$
(8.7)

where \(C_2=C_2(n,s,t,p,\Omega ^{\prime \prime })>0\). In addition, by the local \(H^{2s,p}\) estimates for the fractional Laplacian (see [43,  Theorem 4.4]), we have the estimate

$$\begin{aligned} ||g_m||_{H^{2s,\frac{np}{n+(2s-t)p}}(\Omega _0)} \le C_3 ||f_m||_{L^{\frac{np}{n+(2s-t)p}}(\Omega ^{\prime \prime })}, \end{aligned}$$
(8.8)

where \(C_3=C_3(n,s,t,p,\Omega _0,\Omega ^{\prime \prime })>0\). Also, by Proposition 2.4, we have

$$\begin{aligned}{}[g_m]_{W^{t,p}(\Omega _0)} \le C_4 ||g_m||_{H^{2s,\frac{np}{n+(2s-t)p}}(\Omega _0)}, \end{aligned}$$
(8.9)

where \(C_4=C_4(n,s,t,p,\Omega _0)>0\). In view of (8.3) and (8.6), \(u_m\) is a weak solution of the equation

$$\begin{aligned} L_{A} u_m= (-\Delta )^s g_m \quad \text {in } \Omega ^{\prime \prime }. \end{aligned}$$

Since \(f_m \in L^\infty (\Omega ^{\prime \prime })\), by [43,  Theorem 1.4] we have \(u_m \in C^{\beta }_{loc}(\Omega _0)\) for any \(\beta \in (0,\min \{2s,1\})\) and thus \(u_m \in W^{t,p}(\Omega _0)\). Therefore, by Theorem 7.7, (8.7), (8.9) and (8.8), we have

$$\begin{aligned}{}[u_m]_{W^{t,p}(\Omega ^{\prime })}&\le C_5 \left( [u_m]_{W^{s,2}({\mathbb {R}}^n)} + [g_m]_{W^{t,p}(\Omega _0)} + [g_m]_{W^{s,2}({\mathbb {R}}^n)} \right) \\&\le C_6 \left( [u_m]_{W^{s,2}({\mathbb {R}}^n)} + ||f_m||_{L^{\frac{np}{n+(2s-t)p}}(\Omega ^{\prime \prime })} \right) , \end{aligned}$$

where all constants depend only on \(n,s,t,\Lambda ,p,\Omega ^\prime ,\Omega ^{\prime \prime }\) and \(\Omega _0\). Combining the previous display with Fatou’s Lemma (which is applicable after passing to a subsequence if necessary), (8.5) and (8.2), we conclude that

$$\begin{aligned} \begin{aligned}{}[u]_{W^{t,p}(\Omega ^{\prime })}&\le \liminf _{m \rightarrow \infty } [u_m]_{W^{t,p}(\Omega ^{\prime })} \\&\le C_7 \lim _{m \rightarrow \infty } \left( [u_m]_{W^{s,2}({\mathbb {R}}^n)} + ||f_m||_{L^{\frac{np}{n+(2s-t)p}}(\Omega ^{\prime \prime })} \right) \\&= C_{7} \left( [u]_{W^{s,2}({\mathbb {R}}^n)} + ||f||_{L^{\frac{np}{n+(2s-t)p}}(\Omega ^{\prime \prime })} \right) , \end{aligned} \end{aligned}$$
(8.10)

where \(C_7=C_7(n,s,t,\Lambda ,p,\Omega ^{\prime },\Omega ^{\prime \prime })>0\). This proves the estimate (1.8).

The assertion that \(u \in L^p_{loc}(\Omega )\) now follows by a simple iteration argument for which we refer to the proof of [43,  Theorem 9.1], so that we conclude that \(u \in W^{t,p}_{loc}(\Omega )\). This finishes the proof. \(\square \)

Proof of Theorem 1.1

The case when \(t=s\) follows directly from [43,  Theorem 1.1]. Next, fix some \(p>2\), some \(s<t<\min \{2s,1\}\) and consider the corresponding \(\delta =\delta (p,n,s,t,\Lambda )>0\) given by Theorem 1.2. Since A is assumed to be VMO in \(\Omega \), there exists some \(R>0\) such that A is \((\delta ,R)\)-BMO in \(\Omega \). Thus, by Theorem 1.2 we obtain that \(u \in W^{t,p}_{loc}(\Omega )\) whenever \(f \in L^\frac{np}{n+(2s-t)p}_{loc}(\Omega )\), which finishes the proof. \(\square \)

Proof of Theorem 1.3

Fix some t such that \(s \le t < \min \{2s,1\}\), some \(q \in \left( \frac{2n}{n+2(2s-t)},\infty \right) \) and some \(f \in L^q_{loc}(\Omega )\). First, we assume that \(q<\frac{n}{2s-t}\). Then we have \(n>(2s-t)q\) and set \(p{:}{=}\frac{nq}{n-(2s-t)q}>2\), so that we have \(q=\frac{np}{n+(2s-t)p}\) and thus \(f \in L^\frac{np}{n+(2s-t)p}_{loc}(\Omega )\). Therefore, by Theorem 1.1 we obtain \(u \in W^{t,p}_{loc}(\Omega ) = W^{t,\frac{nq}{n-(2s-t)q}}_{loc}(\Omega )\).

If on the other hand \(q \ge \frac{n}{2s-t}\), then for any \(p \in (2,\infty )\) we have \(\frac{np}{n+(2s-t)p} \le q\) and thus \(f \in L^\frac{np}{n+(2s-t)p}_{loc}(\Omega )\), so that again by Theorem 1.1 we obtain \(u \in W^{t,p}_{loc}(\Omega )\). In view of Proposition 2.5, the conclusion that \(u \in W^{t,p}_{loc}(\Omega )\) for \(p \in (2,\infty )\) and for any t in the range \(s \le t < \min \{2s,1\}\) also implies that \(u \in W^{t,p}_{loc}(\Omega )\) for any \(p \in (1,\infty )\), so that the proof is finished. \(\square \)
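The elementary relations between q and p used in the above proof can be verified directly; the following lines are only a sketch of this computation. If \(0<(2s-t)q<n\) and \(p=\frac{nq}{n-(2s-t)q}\), then

$$\begin{aligned} \frac{1}{p}=\frac{n-(2s-t)q}{nq}=\frac{1}{q}-\frac{2s-t}{n}, \quad \text {that is,} \quad \frac{1}{q}=\frac{1}{p}+\frac{2s-t}{n}=\frac{n+(2s-t)p}{np}. \end{aligned}$$

In particular, we have

$$\begin{aligned} p>2 \iff \frac{1}{q}-\frac{2s-t}{n}<\frac{1}{2} \iff q>\frac{2n}{n+2(2s-t)}, \end{aligned}$$

which is precisely the assumed lower bound on q. Concerning the second case, since \(2s-t>0\), for any \(p \in (2,\infty )\) we have

$$\begin{aligned} \frac{np}{n+(2s-t)p}<\frac{np}{(2s-t)p}=\frac{n}{2s-t} \le q. \end{aligned}$$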

Proof of Theorem 1.4

Fix some \(s<t<\min \{2s,1\}\) and some \(t^\prime \) such that \(t<t^\prime <\min \{2s,1\}\). Since \(f \in L^2_{loc}(\Omega )\) and \(\frac{2n}{n+2(2s-t^\prime )}<2\), Theorem 1.3 implies that \(u \in W^{t^\prime ,\frac{2n}{n-2(2s-t^\prime )}}_{loc}(\Omega )\). Since \(\frac{2n}{n-2(2s-t^\prime )} >2\), by Proposition 2.5 we arrive at \(u \in W^{t,2}_{loc}(\Omega )\), so that the proof is finished. \(\square \)

Remark 8.1

As already indicated in Remark 1.5, our main results remain valid for another class of coefficients A that in general might not be VMO in \(\Omega \). Namely, the conclusions of Theorems 1.1, 1.3 and 1.4 remain true if instead we assume that there exists some small \(\varepsilon >0\) such that

$$\begin{aligned} \lim _{h \rightarrow 0} \sup _{\begin{array}{c} {x,y \in K}\\ {|x-y| \le \varepsilon } \end{array}} |A(x+h,y+h)-A(x,y)| =0 \quad \text {for any compact set } K \subset \Omega . \end{aligned}$$
(8.11)

In fact, in the present paper we only use the assumption that A is VMO in \(\Omega \) in order to ensure that the Hölder estimate for corresponding homogeneous equations given by (4.20) holds, which in this case is guaranteed by the results from [43]. If instead A satisfies the assumption (8.11), then this Hölder estimate actually follows from [41,  Theorem 1.1] combined with [43,  Lemma 5.1], so that our proofs and main results remain valid under the assumption (8.11).

As mentioned, the condition (8.11) is for example satisfied in the case when \(A \in {\mathcal {L}}_0(\Lambda )\) is translation invariant in \(\Omega \), that is, if we have \(A(x,y)=a(x-y)\) for all \(x,y \in \Omega \) and some measurable function \(a: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\). Since in this case A is otherwise not required to satisfy any additional smoothness assumption, A might not be VMO in \(\Omega \) but still satisfies (8.11).