Higher Hölder regularity for nonlocal equations with irregular kernel

We study the higher Hölder regularity of local weak solutions to a class of nonlinear nonlocal elliptic equations with kernels that satisfy a mild continuity assumption. An interesting feature of our main result is that the obtained regularity is better than one might expect when considering corresponding results for local elliptic equations in divergence form with continuous coefficients. Therefore, in some sense our result can be considered to be of purely nonlocal type, following the trend of various such purely nonlocal phenomena observed in recent years. Our approach can be summarized as follows. First, we use certain test functions that involve discrete fractional derivatives in order to obtain higher Hölder regularity for homogeneous equations driven by a locally translation invariant kernel, while the global behaviour of the kernel is allowed to be more general. This enables us to deduce the desired regularity in the general case by an approximation argument.

$\frac{A(x,y)}{|x-y|^{n+2s}} \Phi(u(x)-u(y))\,dy$, $x \in \Omega$, is a nonlocal operator. Throughout the paper, for simplicity we assume that $n > 2s$. Furthermore, the function $A : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is measurable and we assume that there exists a constant $\lambda \ge 1$ such that
(2) $\lambda^{-1} \le A(x,y) \le \lambda$ for almost all $x, y \in \mathbb{R}^n$.
Moreover, we require $A$ to be symmetric, i.e.
(3) $A(x,y) = A(y,x)$ for almost all $x, y \in \mathbb{R}^n$.
We call such a function $A$ a kernel coefficient. We define $L_0(\lambda)$ as the class of all such measurable kernel coefficients $A$ that satisfy the conditions (2) and (3). Moreover, in our main results $\Phi : \mathbb{R} \to \mathbb{R}$ is assumed to satisfy Lipschitz continuity and monotonicity assumptions, namely (4) and (5), where for simplicity we use the same constant $\lambda \ge 1$ as in (2). In particular, if $\Phi(t) = t$, then the operator $L^\Phi_A$ reduces to a linear nonlocal operator which is widely considered in the literature.

Define the fractional Sobolev space
$W^{s,2}(\Omega) := \left\{ u \in L^2(\Omega) : \int_\Omega \int_\Omega \frac{|u(x)-u(y)|^2}{|x-y|^{n+2s}}\,dy\,dx < \infty \right\}$
and denote by $W^{s,2}_{loc}(\Omega)$ the set of all functions $u \in L^2_{loc}(\Omega)$ that belong to $W^{s,2}(\Omega')$ for any relatively compact open subset $\Omega'$ of $\Omega$. In addition, we define the tail space $L^1_{2s}(\mathbb{R}^n)$. We remark that for any function $u \in L^1_{2s}(\mathbb{R}^n)$, the quantity $\int_{\mathbb{R}^n \setminus B_R(x_0)} \frac{|u(y)|}{|x_0-y|^{n+2s}}\,dy$ is finite for all $R > 0$, $x_0 \in \mathbb{R}^n$.

For all measurable functions $u, \varphi : \mathbb{R}^n \to \mathbb{R}$, we define the nonlocal form associated with $A$ and $\Phi$, provided that the defining expression is well-defined and finite. This is for example the case if $u \in W^{s,2}_{loc}(\Omega) \cap L^1_{2s}(\mathbb{R}^n)$ and $\varphi \in W^{s,2}_c(\Omega)$, where by $W^{s,2}_c(\Omega)$ we denote the set of all functions that belong to $W^{s,2}(\Omega)$ and are compactly supported in $\Omega$.
In the literature, various types of weak solutions with varying generality are considered. In this paper, we adopt the following very general notion of local weak solutions which is for example used in [1] and [2].
We remark that the right-hand side of (6) is finite by the fractional Sobolev embedding (cf. [9,Theorem 6.5]). It is noteworthy that the above notion of local weak solutions contains most other notions of weak solutions considered in the literature, such as the ones considered in e.g. [8] or [22].
In our main result, we need to impose an additional continuity assumption on $A$, namely
(7) $\lim_{h \to 0} \sup_{x,y \in K} |A(x+h,y+h) - A(x,y)| = 0$ for any compact set $K \subset \Omega$.
In particular, the condition (7) is satisfied if A is either continuous in Ω × Ω or if A belongs to the following subclass of L 0 (λ) which plays an important role in our proof of the desired regularity.

Definition.
Let $\Omega$ be a domain and $\lambda \ge 1$. We say that a kernel coefficient $A_0 \in L_0(\lambda)$ belongs to the class $L_1(\lambda, \Omega)$ if there exists a measurable function $a : \mathbb{R}^n \to \mathbb{R}$ such that $A_0(x,y) = a(x-y)$ for all $x, y \in \Omega$.
A kernel coefficient that belongs to the class $L_1(\lambda, \Omega)$ can be thought of as being translation invariant, but only inside of $\Omega$. We also call such a kernel coefficient locally translation invariant.
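The claim that locally translation invariant kernel coefficients satisfy the condition (7) can be checked directly. The following is a short sketch, assuming that (7) is read as $\lim_{h \to 0} \sup_{x,y \in K} |A(x+h,y+h) - A(x,y)| = 0$ for every compact $K \subset \Omega$; the auxiliary quantity $d$ is introduced here only for illustration.

```latex
% Sketch: A_0 in L_1(lambda, Omega) satisfies (7).
% Assumption: d denotes the distance from the compact set K to the
% boundary of Omega (any positive number if Omega is the whole space).
Let $K \subset \Omega$ be compact and set
$d := \operatorname{dist}(K, \partial\Omega) > 0$.
For $|h| < d$ and $x, y \in K$ we have $x+h,\, y+h \in \Omega$, hence
\begin{align*}
A_0(x+h, y+h) &= a\bigl((x+h) - (y+h)\bigr) \\
              &= a(x-y) \\
              &= A_0(x,y).
\end{align*}
Therefore
\[
\sup_{x,y \in K} |A_0(x+h,y+h) - A_0(x,y)| = 0
\quad \text{for all } |h| < d,
\]
so the limit as $h \to 0$ vanishes.
```

Note that no continuity of $a$ is used: the difference vanishes identically for small $h$, which is why no smoothness is needed for kernels of this class.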
We note that the condition (7) is also satisfied by some more general choices of kernel coefficients, for example by those of the form $A(x,y) = A'(x,y)A_0(x,y)$, where $A' \in L_0(\lambda^{1/2})$ is continuous in $\Omega \times \Omega$ and $A_0$ belongs to the class $L_1(\lambda^{1/2}, \Omega)$, but is not required to satisfy any continuity or smoothness assumption. Moreover, we stress that the condition (7) only restricts the behaviour of $A$ inside of $\Omega \times \Omega$, while outside of $\Omega \times \Omega$ a more general behaviour is possible. We are now in the position to state our main result.
Remark 1.2. In order to provide some context, let us briefly consider the local elliptic equation in divergence form of the type
(9) $\operatorname{div}(B\nabla u) = 0$ in $\Omega$,
where the matrix of coefficients $B = \{b_{ij}\}_{i,j=1}^n$ is assumed to be uniformly elliptic and bounded. The equation (9) can in some sense be thought of as a local analogue of the nonlocal equation (1) corresponding to the limit case $s = 1$. A classical regularity result states that if the coefficients $b_{ij}$ are continuous, then weak solutions $u \in W^{1,2}_{loc}(\Omega)$ of the equation (9) are locally Hölder continuous for any exponent $\alpha \in (0,1)$, see for example [13, Corollary 5.18]. Heuristically, one might therefore expect that the optimal regularity in the setting of nonlocal equations with continuous kernel coefficient should not exceed $C^s$ regularity. Nevertheless, Theorem 1.1 in particular shows that weak solutions to nonlocal equations of the type $L^\Phi_A u = 0$ in $\Omega$ are locally $C^\alpha$ for any $0 < \alpha < \min\{2s, 1\}$ whenever $A \in L_0(\lambda)$ is continuous, exceeding $C^s$ regularity. In particular, in the case when $s \ge 1/2$, weak solutions to homogeneous nonlocal equations with continuous kernel coefficients enjoy the same amount of Hölder regularity as weak solutions to corresponding local equations with continuous coefficients, despite the fact that the order of such nonlocal equations is lower.
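To make the comparison of exponents in Remark 1.2 concrete, here is a small worked computation. The two sample values of $s$ are chosen purely for illustration and do not appear in the original text.

```latex
% Worked comparison of the exponent min{2s,1} with s.
% The sample values s = 1/4 and s = 3/4 are illustrative assumptions.
For every $s \in (0,1)$ we have $2s > s$ and $1 > s$, hence
\[
\min\{2s, 1\} > s ,
\]
so the range $0 < \alpha < \min\{2s,1\}$ always contains exponents
$\alpha > s$. For instance:
\begin{align*}
s = \tfrac14 &: \quad \min\{2s,1\} = \tfrac12,
  \quad \text{so } C^{\alpha} \text{ regularity for all } \alpha < \tfrac12 ; \\
s = \tfrac34 &: \quad \min\{2s,1\} = 1,
  \quad \text{so } C^{\alpha} \text{ regularity for all } \alpha < 1 .
\end{align*}
```

The second case illustrates the statement that for $s \ge 1/2$ the range of admissible Hölder exponents coincides with the one for local equations with continuous coefficients.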
Such at first sight unexpected additional regularity is however not atypical in the context of nonlocal equations and has been observed in various previous works in the context of Sobolev regularity. For example, in [18] and [24] it is shown that already in the setting of a general kernel coefficient $A \in L_0(\lambda)$, weak solutions to nonlocal equations of the type (1) are slightly higher differentiable than initially assumed along the scale of Sobolev spaces, which is a phenomenon not shared by local elliptic equations of the type (9) with coefficients that are merely measurable. Another result in this direction was recently proved in [21], where the authors in particular show that if $A \in L_0(\lambda)$ is Hölder continuous with some arbitrary Hölder exponent and $\Phi(t) = t$, then weak solutions of the equation $L^\Phi_A u = 0$ in $\mathbb{R}^n$ belong to $W^{\alpha,p}_{loc}(\mathbb{R}^n)$ for any $\alpha < \min\{2s, 1\}$ and any $2 \le p < \infty$, while for local equations of the type (9) with corresponding Hölder continuous coefficients no comparable gain in differentiability is achievable. In particular, by the Sobolev embedding this result implies that such weak solutions belong to $C^\alpha_{loc}(\mathbb{R}^n)$ for any $0 < \alpha < \min\{2s, 1\}$, which is consistent with our main result. Our main result shows that this amount of higher Hölder regularity is also enjoyed by local weak solutions of possibly nonlinear equations driven by kernel coefficients of class $L_0(\lambda)$ that satisfy the continuity assumption (7).

Remark 1.3. Besides being interesting for its own sake, one of our main motivations is that Theorem 1.1 also has some interesting potential applications concerning the Sobolev regularity of solutions to nonlocal equations. A first such application can briefly be summarized as follows.
In [22], in the main result it is assumed that A is globally translation invariant, i.e. that A belongs to the class L 1 (λ, R n ). However, this assumption is only used in order to ensure that the Hölder estimate (8) from Theorem 1.1 is valid, which up to this point was only known for translation invariant kernels, cf. [22,Theorem 4.6]. Since otherwise the proofs in [22] only rely on the properties (2) and (3) of A, from Theorem 1.1 above we conclude that the statement of [22, Theorem 1.1] is also true for general kernel coefficients A of class L 0 (λ) that satisfy the condition (7).
1.2. Approach and previous results. As mentioned, our approach is strongly influenced by an approach introduced in [2], where a similar result concerning higher Hölder regularity is proved for the fractional p-Laplacian in the superquadratic case when p ≥ 2. Although we restrict ourselves to the quadratic case when p = 2, in contrast to [2] we deal with a nonlinearity already in the quadratic setting and most importantly, we also treat equations driven by general kernel coefficients A that satisfy the mild assumption (7), while in [2] only the case when A ≡ 1 is considered. Let us briefly summarize our approach, highlighting the differences to the one used in [2].
First, we prove the higher Hölder regularity for homogeneous equations driven by a locally translation invariant kernel coefficient, see section 3. As in [2], the main idea in this case is to test the equation with certain monotone power functions of discrete fractional derivatives leading to an incremental higher integrability and differentiability result on the scale of certain Besov-type spaces. However, in our setting we also need to carefully use the local translation invariance and the bounds imposed on A, and also the assumptions (4) and (5) imposed on Φ in order to overcome the difficulties that arise due to the presence of the general kernel and the general type of nonlinearity. Moreover, we remark that restricting ourselves to equations with quadratic growth has the advantage that the proof of this incremental higher regularity result simplifies quite substantially in some other respects. The obtained incremental gain in regularity is then iterated, in order for the desired Hölder regularity to follow by embedding.
In section 4, we then treat the general case of inhomogeneous equations driven by a kernel coefficient satisfying the condition (7) by an approximation argument. In the corresponding approximation argument applied in [2], the solution is approximated by a solution of a corresponding equation with zero right-hand side, while the nonlocal operator driving the equation is left unchanged. In order to be able to treat equations with a general kernel coefficient A of class L 0 (λ) that satisfies only the continuity assumption (7), in addition to freezing the right-hand side, we also need to locally replace A by a corresponding locally translation invariant kernel coefficient, which is possible in view of the assumption (7). Since by the first part of the proof the desired Hölder regularity is already known for solutions to equations with locally translation invariant kernel coefficients, we can then transfer this regularity from the approximate solution to the solution itself. In other words, in some sense we locally freeze the coefficient, in order to transfer the regularity from an equation for which the higher regularity can be proved directly to an equation driven by a less regular kernel. This strategy can be thought of as a nonlocal counterpart of corresponding techniques widely used in the study of higher regularity for local elliptic equations, although we stress that in our nonlocal setting we have to overcome a number of additional difficulties which are not present in the local setting in order to execute such an approximation argument successfully.
Moreover, we believe that just like in the local setting, the approximation techniques developed in this paper are flexible enough to be adapted to proving other higher regularity results for nonlocal equations similar to (1). Regarding other related regularity results, in [12] a similar result is proved in the linear case when $\Phi(t) = t$, where $A$ is required to be locally close enough to $b\left(\frac{x-y}{|x-y|}\right)$ for some even function $b : S^{n-1} \to \mathbb{R}$ that is bounded between two positive constants, which in the limit is contained in our assumption (7), see also Remark 4.4 below. More results concerning higher Hölder regularity for various types of nonlocal equations are for instance contained in [11], [23], [5], [6] and [14]. Furthermore, results regarding basic Hölder regularity for nonlocal equations are proved for example in [8], [15], [25] and [19], while results concerning Sobolev regularity can be found for example in [18], [24], [1], [7], [21] and [22]. Finally, for some regularity results concerning nonlocal equations similar to (1) in the more general setting of measure data, we refer to [17].

Preliminaries
2.1. Some notation. Let us fix some notation which we use throughout the paper. By $C$, $c$, $C_i$ and $c_i$, $i \in \mathbb{N}_0$, we always denote positive constants, while dependences of the constants on parameters will be shown in parentheses. As usual, by $B_r(x_0)$ and its closure we denote the open and closed ball with center $x_0 \in \mathbb{R}^n$ and radius $r > 0$, respectively. Moreover, if $E \subset \mathbb{R}^n$ is measurable, then by $|E|$ we denote the $n$-dimensional Lebesgue measure of $E$. If $0 < |E| < \infty$, then for any $u \in L^1(E)$ we define the integral average of $u$ over $E$. Next, for any $p \in (1, \infty)$ we define the function $J_p : \mathbb{R} \to \mathbb{R}$. Moreover, for any measurable function $\psi : \mathbb{R}^n \to \mathbb{R}$ and any $h \in \mathbb{R}^n$, we define the translate $\psi_h$ and the difference $\delta_h \psi$.

2.2. The nonlocal tail. In this section, for convenience we state and prove the following two simple results concerning the nonlocal tail of a function which we use frequently throughout the paper.
Lemma 2.1. Let $s \in (0,1)$ and $0 < r < R$. Then for any $x \in B_r$ and any $u \in L^1_{2s}(\mathbb{R}^n)$, we have

Proof. The claim follows directly from the observation that for any $x \in B_r$ and any $y \in \mathbb{R}^n \setminus B_R$, we have

Lemma 2.2. Let $s \in (0,1)$, $r > 0$ and $x_0 \in B_1$ such that $B_r(x_0) \subset B_1$. Then for any $u \in L^1_{2s}(\mathbb{R}^n)$, we have

Proof. Since by assumption $x_0 \in B_{1-r}$, with the help of Lemma 2.1 we obtain which finishes the proof.

, so that we have Moreover, we define the space The following Poincaré-type inequality associated to the space $W^{s,2}$ will frequently be used throughout the paper.
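The observation underlying the proof of Lemma 2.1 above can be made explicit as follows. This is a sketch under the assumption that $B_r$ and $B_R$ denote balls centered at the origin and that the lemma compares the tail integral centered at $x$ with the one centered at the origin; the displayed form of the resulting bound is a reconstruction and not quoted from the text.

```latex
% Sketch of the elementary inequality behind Lemma 2.1.
% Assumption: B_r = B_r(0) and B_R = B_R(0).
Let $x \in B_r$ and $y \in \mathbb{R}^n \setminus B_R$, so that
$|x| < r$ and $|y| \ge R$. Then $r \le \frac{r}{R}|y|$ and therefore
\begin{align*}
|x-y| &\ge |y| - |x| \\
      &\ge |y| - r \\
      &\ge \Bigl(1 - \frac{r}{R}\Bigr)|y|
       = \frac{R-r}{R}\,|y| .
\end{align*}
Raising this to the power $n+2s$ and integrating gives
\[
\int_{\mathbb{R}^n \setminus B_R} \frac{|u(y)|}{|x-y|^{n+2s}}\,dy
\le \Bigl(\frac{R}{R-r}\Bigr)^{n+2s}
\int_{\mathbb{R}^n \setminus B_R} \frac{|u(y)|}{|y|^{n+2s}}\,dy .
\]
```

For example, with $r = 7/8$ and $R = 15/16$ the factor becomes $15^{n+2s}$, which is the form of the constant $C_4$ appearing in the proof of Lemma 4.1.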

2.4. Besov-type spaces. Next, let us introduce some function spaces of Besov-type. In order to do so, for $q \in [1, \infty)$ and any function $u \in L^q(\mathbb{R}^n)$ we define two quantities, which in turn enable us to define two Besov-type spaces. The following embedding result can be found in [4, Lemma 2.3].
Finally, the following result can be found in [1, Proposition 2.6].
Next, we prove two elementary inequalities which involve the function J p defined in section 2.1 and are based on the monotonicity property (5) of Φ.
so that in this case both sides of the inequality vanish. Next, we consider the case when $a - c \neq b - d$. In view of the monotonicity assumption (5) imposed on $\Phi$, we have Moreover, by [20, page 71], for all $x, y \in \mathbb{R}$ we have Since the last term on the right-hand side is non-negative, by choosing Multiplying the inequality (11) with the one in the previous display leads to so that the claim follows by cancelling the factor $((a-b)-(c-d))^2$ from both sides.
Proof. If $a - c = b - d$, then both sides of the inequality vanish. Next, let us consider the case when $a - c \neq b - d$. In view of (5), we have The right-hand side of the above estimate can be further estimated by applying [2, Lemma A.1] with $p = \beta + 1$ and $q = 2$, which yields The claim now follows by combining the last two displays.
2.6. Some preliminary estimates. The following Caccioppoli-type inequality can be proved in essentially the same way as the one in [18, Theorem 3.1].
). Moreover, assume that A ∈ L 0 (λ) and that the Borel function Φ : R → R satisfies

Then for any local weak solution
where C = C(n, s, λ, r, R) > 0.
We remark that the assumptions in (12) are clearly implied by the assumptions Φ(0) = 0, (4) and (5) which are used in our main results.
The following result on local boundedness is essentially given by [3, Theorem 3.8], where the result below is stated under the stronger assumption that $u \in W^{s,2}_0(B_R(x_0))$ and in the setting of the fractional $p$-Laplacian, which applied to our setting means that strictly speaking it only contains the case when $\Phi(t) = t$ and $A(x,y) \equiv 1$. Nevertheless, an inspection of the proof shows that it remains valid for local weak solutions, see also [2, Theorem 3.2]. Moreover, the case of a general $\Phi$ and a general $A$ can easily be treated by noting that the Caccioppoli-type inequality from [3, Proposition 3.5] remains valid for such a general $\Phi$ and a general $A$ by simply applying the bounds imposed on $\Phi$ and $A$ whenever appropriate in a similar fashion as in [18, Theorem 3.1]. Therefore, we have the following result.
2s . Moreover, consider a kernel coefficient A ∈ L 0 (λ) and assume that the Borel function Φ : R → R satisfies (12). Then for any local weak solution u ∈ W s,2 (B R (x 0 )) ∩ L 1 2s (R n ) of the equation we have the estimate where C = C(n, s, λ, q, σ) > 0.

Higher Hölder regularity for homogeneous equations with locally translation invariant kernel
3.1. Incremental higher integrability and differentiability.
The key ingredient to proving the desired higher Hölder regularity for homogeneous equations with locally translation invariant kernel is provided by the following incremental higher integrability and differentiability result on the scale of Besov-type spaces. In the case of the fractional p-Laplacian for p ≥ 2, the below result was proved in [2, Proposition 5.1]. Besides the fact that we treat equations with arbitrary locally translation invariant kernels, it is also interesting that in our setting of equations with quadratic growth, we are able to directly prove both higher integrability and differentiability, while for possibly degenerate equations as in [2] it is necessary to first obtain a pure higher integrability result (cf. [2, Proposition 4.1]), which is then used in order to also obtain higher differentiability. We remark that this additional higher differentiability does not seem to have a counterpart in the context of local equations and is one of the main reasons why in our nonlocal setting we are able to exceed C s regularity. Moreover, note that although at this point we work with solutions that are bounded, this assumption will later be removed by using Theorem 2.11.
Step 1: Discrete differentiation of the equation. Set r := R − 4h 0 > 0 and fix some h ∈ R n such that 0 < |h| < h 0 . Let η ∈ C ∞ 0 (B R ) be a non-negative Lipschitz cutoff function satisfying Let us show that the function . Thus, since the product of a function belonging to W s,2 (B R ) and a Lipschitz function also belongs to and are compactly supported in B R−h0 , in view of [2, Lemma 2.11] in particular both ϕ and ϕ −h belong to W s,2 c (B 1 ), so that both ϕ and ϕ −h are admissible test functions in (13). Therefore, using ϕ −h as a test function in (13) along with a change of variables yields By subtracting (16) from (15) and dividing by 0 < |h| < h 0 , we obtain Next, splitting the above integral and taking into account the choice of ϕ, we arrive at where we used that η vanishes identically outside of B (R+r)/2 .
Step 2: Preliminary estimation of the local term I 1 . Since A ∈ L 1 (λ, B 1 ), we have A(x, y) = a(x − y) for all x, y ∈ B 1 and some measurable function a : R n → R. Since for x, y ∈ B R we have x + h, y + h ∈ B 1 , it follows that for all x, y ∈ B R we have Therefore, we can rewrite I 1 as follows Let us now concentrate on estimating I 1 . First of all, we observe that Therefore, we obtain Next, using the Lipschitz bound (4), Young's inequality and then Lemma 2.8 with β = q, for the negative term in the last display we deduce where ε > 0 is arbitrary. By choosing ε := 1 2λ 2 , combining the last two displays yields where C 3 = C 3 (λ) > 0. By using Lemma 2.9 with β = q, we can further estimate the first term of the previous display, which along with the bounds (2) of A leads to where c = c(λ, q) > 0 and C 4 = C 4 (λ) > 0. Next, for simplicity we write and observe that by using the convexity of the function t → t 2 , we obtain Combining (18) with the last display yields where C 5 = C 5 (λ, q) > 0. By combining the above estimate for I 1 with the identity I 1 +I 2 +I 3 = 0, we arrive at where C 6 = C 6 (λ, q) > 0 and Our next goal is to estimate the terms I 1,1 , |I 2 | and |I 3 |.
Step 3: Estimating the local term $I_{1,1}$. In order to estimate $I_{1,1}$, observe that for any $x \in B_R$, changing variables and integrating in polar coordinates yields where we used that $R + h_0 \le 1$ and $\|u\|_{L^\infty(B_1)} \le 1$ in order to obtain the last inequality and In the same way we have so that we obtain

Step 4: Estimating the nonlocal terms $I_2$ and $I_3$. Next, let us estimate the nonlocal terms $I_2$ and $I_3$, which can be treated in the same way. Since $\|u\|_{L^\infty(B_1)} \le 1$ and $(R+r)/2 + h_0 \le 1$, by additionally using the bound (4) of $\Phi$ with $t = u_h(x) - u_h(y)$ and $t' = 0$, for almost every $x \in B_{(R-r)/2}$ and any $y \in \mathbb{R}^n \setminus B_R$ we have By using the upper bound in (2) of $A$ (which trivially also holds for $A_h$) and the fact that $0 \le \eta \le 1$ and then the last two displays, we deduce For any $x \in B_{(R+r)/2}$, we have $B_{(R-r)/2}(x) \subset B_R$, which in view of integration in polar coordinates along with the fact that $R - r = 4h_0$ leads to where $C_9 = C_9(n,s) > 0$ and $C_{10} = C_9 (2h_0)^{-2s}$. Using Lemma 2.1, the change of variables $z = y + h$ and then Lemma 2.2, for any $x \in B_{(R+r)/2}$ we obtain where $C_{11} = C_{11}(n,s,h_0) > 0$. Here we also used the fact that $R > 4h_0$ and the bounds imposed on $u$. The term involving $u$ can be estimated similarly. In fact, by using Lemma 2.1 and Lemma 2.2, for any $x \in B_{(R+r)/2}$ we obtain where $C_{12} = C_{12}(n,s,h_0) > 0$. By combining the above estimates with (22) and the observation that $|I_3|$ can be estimated in the same way, we arrive at where $C_{13} = C_{13}(n,s,\lambda,h_0) > 0$. By combining this estimate with (21) and (19), we find the estimate (23) $|\delta_h u|$ where $C_{14} = C_{14}(n,s,q,\lambda,h_0) > 0$.
Step 5: Conclusion. Let $\xi \in \mathbb{R}^n \setminus \{0\}$ with $|\xi| < h_0$, to be chosen later. Applying Lemma 2.7 with where $C_{15} = C_{15}(q) > 0$. Here we also used that $\eta \equiv 1$ in $B_r$ in order to obtain the last inequality. Next, we observe that by the discrete Leibniz rule (cf. [2, Formula (2.1)]), we can write We arrive at where $C_{16} = 2C_{15}$. By applying the first part of Proposition 2.6 with for the first term on the right-hand side of (24) we obtain where $C_{17} = C_{17}(n,s) > 0$ and $C_{18} = C_{18}(n,s,h_0) > 0$. By using that $\eta$ is Lipschitz and that $|\xi| < h_0$, along with the assumption that $\|u\|_{L^\infty(B_1)} \le 1$, we estimate the second term on the right-hand side of (24) as follows where $C_{19} = C_{19}(n,h_0) > 0$. Therefore, we arrive at where $C_{20} = C_{20}(n,s,q,h_0) > 0$. We now choose $\xi = h$ and take the supremum over $h$ for $0 < |h| < h_0$, so that together with (23) we obtain where $C_{21} = C_{21}(n,s,q,h_0,\lambda) > 0$. Next, we use the fact that by [2, Lemma 2.6] applied with $\beta = (1 + \vartheta q)/q < 1$, on the right-hand side of (25) we can replace the first-order difference quotient by a corresponding second-order difference quotient in the following way where $C_{22} = C_{22}(n,q,\vartheta,h_0) > 0$. By combining the last display with (25) and using that where $C = C(n,s,q,\vartheta,h_0,\lambda) > 0$. Since $r = R - 4h_0$, the proof is finished.

3.2. An iteration argument. We now use an iteration argument based on Proposition 3.1 in order to obtain the following higher Hölder regularity result. (4) and (5) with respect to $\lambda$ and assume that $u \in W^{s,2}(B_R(x_0)) \cap L^1$ where $C = C(n,s,\lambda,\alpha) > 0$.
Consider the scaled function and also Observe that u 1 belongs to W s,2 (B 1 ) ∩ L 1 2s (R n ) ∩ L ∞ (B 1 ) and is a weak solution of L Φ1 A1 u 1 = 0 in B 1 . Moreover, it is easy to verify that A 1 ∈ L 1 (λ, B 1 ) and that Φ 1 satisfies (4) and (5) with respect to λ. Furthermore, by using changes of variables it is straightforward to verify that u 1 Therefore, the conclusion of Proposition 3.1 is valid with respect to u 1 . For i ∈ N 0 , we define the sequences In particular, we have We split the further proof into two cases.
Along with Lemma 2.5 with our choice of β and q = q i∞ (which is applicable in view of (29)), we obtain (36) Finally, rescaling yields the desired estimate, namely (26). This finishes the proof in the case when s ≤ 1/2.
By imitating the arguments used to conclude in case 1 (cf. (35) and (36)), which in particular involves applying Lemma 2.5 with $\beta = 1 - \varepsilon$ and $q = q_{i_\infty + j_\infty}$ (which is applicable in view of (38)), we conclude that $[u_1]_{C^\alpha(B_{1/2})} \le C = C(n,s,\lambda,\alpha)$ for a constant $C$ possibly different from the one in (36). The desired estimate (26) now once again simply follows by rescaling, which finishes the proof.

Higher Hölder regularity by approximation
We now use an approximation argument inspired by [2, section 6] and [5] in order to prove Theorem 1.1 under full generality. In order to do so, we need the following definition.
Then the unique weak solution $v \in W^{s,2}(B_1) \cap L^1_{2s}(\mathbb{R}^n)$ of the problem

Proof. First of all, we remark that the existence of a unique weak solution of the problem (44) belonging to $W^{s,2}(B_1) \cap L^1_{2s}(\mathbb{R}^n)$ can be shown almost exactly as in [16, Theorem 1 and Remark 3] by using the theory of monotone operators and additionally taking into account the bounds (4) and (5) imposed on $\Phi$. We now argue by contradiction. Assume that the conclusion is not true. Then there exist some $\tau > 0$, sequences of kernel coefficients $\{A_m\}_{m=1}^\infty$ and $\{\widetilde{A}_m\}_{m=1}^\infty$ of class $L_0(\lambda)$, a sequence of functions $\{\Phi_m\}_{m=1}^\infty$ satisfying (4) and (5), and sequences , such that for any $m$ the function $u_m$ is a local weak solution of the problem $\frac{|u_m(y)|}{|y|^{n+2s}}\,dy \le M$, but for any $m$ the unique weak solution $v_m \in W^{s,2}(B_1) \cap L^1_{2s}(\mathbb{R}^n)$ of (49) In view of (2), (5) and using $w_m := u_m - v_m \in W^{s,2}_0(B_{7/8})$ as a test function in (49) and also in (46), we obtain . By using (4) and (48), we further estimate $I_1$ as follows .
In order to proceed, we observe that since n > 2s, we have q > n 2s > 2n n+2s , so that Hölder's inequality and (48) yield where C 1 = C 1 (n, s, q) > 0. By using the Cauchy-Schwarz inequality, Theorem 2.10, (47) and (51), for I 1,1 we obtain where C 2 and C 3 depend only on n, s, λ, q and M . For I 1,2 , by using Lemma 2.1, the Cauchy-Schwarz-inequality, the fractional Friedrichs-Poincaré inequality (Lemma 2.3) and (47), we have where C 4 = 15 n+2s , C 5 = C 5 (n, s) > 0 and C 6 = C 6 (n, s, M ) > 0. Similarly, by using Lemma 2.1, the Cauchy-Schwarz-inequality, Lemma 2.2, Lemma 2.3 and (47), for I 1,3 we obtain where again all the constants depend only on n, s and M . Next, by using Hölder's inequality, the fractional Sobolev inequality (cf. [9, Theorem 6.5]) and (51), we estimate I 2 in the following way where C 10 = C 10 (n, s, q) > 0. Putting the above estimates together, we arrive at for some C 11 = C 11 (n, s, λ, q, M ) > 0. Combining this estimate with the fractional Friedrichs-Poincaré inequality (Lemma 2.3) leads to In other words, we have In view of Theorem 2.11, Theorem 2.12, the fact that u m = v m a.e. in R n \ B 7/8 and Lemma 2.2, we have |u m (y)| |y| n+2s dy , so that in view of (52) and (47) the sequence {v m } ∞ m=1 is uniformly bounded in B 3/4 and has uniformly bounded C β seminorms in B 3/4 , where β = β(n, s, λ, q) > 0. Moreover, in view of (47) and Theorem 2.12, the sequence {u m } ∞ m=1 is also uniformly bounded in B 3/4 and has uniformly bounded C β seminorms in B 3/4 . In particular, the same is also true for the sequence {u m −v m } ∞ m=1 . Therefore, by the Arzelà-Ascoli theorem, by passing to a subsequence if necessary, we obtain that In particular, for m large enough we have which contradicts (50). This finishes the proof.
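The exponent inequality used at the start of the estimate of $I_1$ in the proof above, namely $q > \frac{n}{2s} > \frac{2n}{n+2s}$, follows from the standing assumption $n > 2s$. A short verification is sketched here; the notation $2^*_s$ for the fractional Sobolev exponent is an assumption introduced only for this computation.

```latex
% Verification of n/(2s) > 2n/(n+2s) under the assumption n > 2s.
% Assumed notation: 2^*_s := 2n/(n-2s), the fractional Sobolev exponent.
Since all quantities are positive,
\begin{align*}
\frac{n}{2s} > \frac{2n}{n+2s}
  &\iff n + 2s > 4s \\
  &\iff n > 2s .
\end{align*}
Moreover, $\frac{2n}{n+2s}$ is the H\"older conjugate of
$2^*_s = \frac{2n}{n-2s}$, because
\[
\frac{n-2s}{2n} + \frac{n+2s}{2n} = 1 .
\]
```

This is why a right-hand side in $L^q$ with $q > \frac{n}{2s}$ can be paired with test functions controlled by the fractional Sobolev inequality on a bounded domain.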
Next, we use the above Lemma and Theorem 3.2 in order to prove the desired higher Hölder regularity in the case when A is close enough to a locally translation invariant kernel coefficient.
Proof. We divide the proof into two parts.
Step 1: Regularity at the origin. In this step, our aim is to prove that for any $0 < \varepsilon < \Theta$ and any $0 < r < 1$, there exists some small enough $\delta > 0$ such that if $A$, $\widetilde{A}$, $f$ and $u$ are as above, then for some constant $C_1 = C_1(n,s,\lambda,\varepsilon) > 0$. In order to accomplish this, we fix some $0 < \varepsilon < \Theta$ and observe that it suffices to prove that there exist $0 < \rho < \frac{1}{3}$ and $\delta > 0$ such that if $A$, $\widetilde{A}$, $f$ and $u$ are as above, then for any $k \in \mathbb{N}_0$ we have where $M_0 := 1 + \int_{\mathbb{R}^n \setminus B_1} \frac{dy}{|y|^{n+2s}} < \infty$. Indeed, assume that (59) were true. Since for any $0 < r < 1$ there exists some $k \in \mathbb{N}_0$ such that $\rho^{k+1} < r \le \rho^k$, by the first inequality in (59) we would arrive at which would prove (58) with $C_1 = \frac{2}{\rho^{\Theta-\varepsilon}}$. In order to prove (59), we proceed by induction. In the case when $k = 0$, (59) is true by the assumptions (57).
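The reduction from a bound at the discrete scales $\rho^k$ to a bound at every scale $0 < r < 1$ can be sketched as follows. Since the precise form of (59) is not reproduced here, the computation assumes a generic decay estimate with a hypothetical constant $c_0 > 0$.

```latex
% Sketch: from discrete scales rho^k to all scales 0 < r < 1.
% Assumption (hypothetical form of the first inequality in (59)):
%   sup_{B_{rho^k}} |u - u(0)| <= c_0 rho^{k(Theta - eps)} for all k.
Let $0 < r < 1$ and choose $k \in \mathbb{N}_0$ with
$\rho^{k+1} < r \le \rho^k$. Then $B_r \subset B_{\rho^k}$ and
\begin{align*}
\sup_{x \in B_r} |u(x) - u(0)|
  &\le \sup_{x \in B_{\rho^k}} |u(x) - u(0)| \\
  &\le c_0\, \rho^{k(\Theta-\varepsilon)} \\
  &= c_0\, \rho^{-(\Theta-\varepsilon)}\, \rho^{(k+1)(\Theta-\varepsilon)} \\
  &\le \frac{c_0}{\rho^{\Theta-\varepsilon}}\; r^{\Theta-\varepsilon},
\end{align*}
where the last step uses $\rho^{k+1} < r$ and $\Theta - \varepsilon > 0$.
```

The price of passing between consecutive scales is thus a single factor $\rho^{-(\Theta-\varepsilon)}$, which is independent of $k$.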
Next, suppose that (59) holds up to $k$ and let us prove that it is also true for $k+1$. Let $\tau > 0$ be a parameter to be chosen small enough and consider the corresponding $\delta = \delta(\tau,n,s,\lambda,q,M) > 0$ given by Lemma 4.1, where $M := 2 + M_0$. Assume that (55) is satisfied with respect to this $\delta$. Furthermore, define and We note that $A_k \in L_0(\lambda)$, $\widetilde{A}_k \in L_1(\lambda, B_{1/\rho^k}) \subset L_1(\lambda, B_1)$ and that $\Phi_k$ satisfies (4) and (5) with respect to $\lambda$. Moreover, $w_k$ belongs to $W^{s,2}(B_1) \cap L^1_{2s}(\mathbb{R}^n)$ and is a local weak solution of where we have also used that $\Theta \le 2s - \frac{n}{q}$ and thus $k\left(2s - (\Theta - \varepsilon) - \frac{n}{q}\right) \ge k\varepsilon \ge 0$. Moreover, by the induction hypothesis we have Therefore, by Lemma 4.1 the unique weak solution $v_k \in W^{s,2}($ Together with the fact that $w_k(0) = 0$, we obtain that for any $x \in B_{1/3}$ we have Our next goal is to prove that the right-hand side of the previous estimate is uniformly bounded by a constant that does not depend on $k$. In order to do so, we observe that since $\widetilde{A}_k \in L_1(\lambda,$ where $C_2 = C_2(n,s,\lambda,\Theta,\varepsilon) > 0$. For the first term of the right-hand side, in view of (61) and (60) we have In order to estimate the tail term, we observe that by the same argument used in order to obtain (52), we have where $C_3 = C_3(n,s,\lambda,q) > 0$. Together with the fact that $v_k = w_k$ in $\mathbb{R}^n \setminus B_{7/8}$, Lemma 2.2 and (60), we deduce where $C_6 = C_6(n,s,\lambda,q,\delta) > 0$. Finally, for the Sobolev seminorm by Theorem 2.10 and the above estimates we have where $C_7$ and $C_8$ do not depend on $k$. By combining the above estimates with (62) and (63), we obtain that for any $x \in B_{1/3}$ we have
(64) $|w_k(x)| \le 2\tau + C_9 |x|^{\Theta - \varepsilon/2}$,
where again $C_9$ does not depend on $k$. Next, define By choosing $\tau$ small enough such that $2\tau < \rho^\Theta$, in view of (64), we obtain
(65) $|w_{k+1}(x)| \le 2\tau \rho^{\varepsilon-\Theta} + C_9 \rho^{\varepsilon-\Theta} |\rho x|^{\Theta-\varepsilon/2} \le \left(1 + C_9 |x|^{\Theta-\varepsilon/2}\right) \rho^{\varepsilon/2}$ for all $x \in B_{1/(3\rho)}$.
In particular, by choosing $\rho$ small enough such that $\rho \le (1 + C_9)^{-2/\varepsilon}$ and recalling that $\rho < 1/3$, we arrive at $\|w_{k+1}\|_{L^\infty(B_1)} \le 1$.
By definition of $w_{k+1}$ this is equivalent to which proves the first estimate in (59) for $k+1$. In order to prove the second estimate in (59) for $k+1$, we observe that (65) implies where $C_{10} := (1 + C_9) \int_{\mathbb{R}^n \setminus B_1} \frac{dy}{|y|^{n+2s+\varepsilon/2-\Theta}} < \infty$ does not depend on $k$ and is finite because $2s + \varepsilon/2 - \Theta \ge \frac{n}{q} + \varepsilon/2 > 0$. Furthermore, by using a change of variables and the first bound in (60), we obtain where $C_{11} := 3^{n+2s}\, 2|B_1| < \infty$. Moreover, again by a change of variables and the second bound in (60), we deduce Note that in the last two estimates we also used that $\rho < 1$ and that $\varepsilon - \Theta + 2s \ge \varepsilon/2$. By combining the last three displays and choosing $\rho$ small enough such that we arrive at which proves the second estimate in (59) for $k+1$. Therefore, (59) is true for any $k \in \mathbb{N}_0$, which in particular also proves (58) under the assumptions (55) and (57), where $\delta$ is chosen as above.
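The second inequality in (65) and the resulting bound $\|w_{k+1}\|_{L^\infty(B_1)} \le 1$ used in the induction step above can be checked term by term. The following computation only uses the choices $2\tau < \rho^\Theta$, $0 < \rho < 1$ and $\rho \le (1+C_9)^{-2/\varepsilon}$ stated in the proof.

```latex
% Term-by-term check of (65) and of the sup bound on B_1.
% First term: uses 2*tau < rho^Theta and 0 < rho < 1.
\[
2\tau\,\rho^{\varepsilon-\Theta}
  < \rho^{\Theta}\rho^{\varepsilon-\Theta}
  = \rho^{\varepsilon}
  \le \rho^{\varepsilon/2} .
\]
% Second term: collect the powers of rho.
\begin{align*}
C_9\,\rho^{\varepsilon-\Theta}\,|\rho x|^{\Theta-\varepsilon/2}
  &= C_9\,\rho^{\varepsilon-\Theta+\Theta-\varepsilon/2}\,|x|^{\Theta-\varepsilon/2} \\
  &= C_9\,\rho^{\varepsilon/2}\,|x|^{\Theta-\varepsilon/2} .
\end{align*}
% Sum of both terms gives the right-hand side of (65).
Adding both bounds yields
\[
|w_{k+1}(x)| \le \bigl(1 + C_9 |x|^{\Theta-\varepsilon/2}\bigr)\rho^{\varepsilon/2}.
\]
For $|x| \le 1$ this is at most $(1+C_9)\rho^{\varepsilon/2}$, and
\[
\rho \le (1+C_9)^{-2/\varepsilon}
\;\Longrightarrow\;
(1+C_9)\,\rho^{\varepsilon/2} \le 1 .
\]
```

The loss from the exponent $\Theta - \varepsilon/2$ in (64) to $\Theta - \varepsilon$ in the normalization is exactly what produces the spare factor $\rho^{\varepsilon/2}$ that absorbs the constant $1 + C_9$.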
Therefore, we are in the position to apply step 1 to u z , which yields sup x∈Br |u z (x) − u z (0)| ≤ C 1 r Θ−ε , 0 < r < 1.
In order to obtain the estimate (8) in our main result with its precise scaling, we now first prove our main result at scale 1 by using scaling and covering arguments. The general case will then follow by another scaling argument.

Theorem 4.3. Let $\lambda \ge 1$ and $f \in L^q(B_1)$ for some $q > \frac{n}{2s}$. Consider a kernel coefficient $A \in L_0(\lambda)$ that satisfies $|A(x+h,y+h) - A(x,y)| = 0$.
Therefore, by additionally taking into account the symmetry of $A$ we see that the kernel coefficient defined by Moreover, $\widetilde{A}$ clearly belongs to the class $L_1(\lambda, B_{r_z}(z))$. In the case when $u \equiv 0$, the desired Hölder regularity trivially holds. Otherwise, set
$M_z := \sup_{x \in B_{r_z}(z)} |u(x)| + r_z^{2s} \int_{\mathbb{R}^n \setminus B_{r_z}(z)} \frac{|u(y)|}{|z-y|^{n+2s}}\,dy + \frac{r_z^{2s-n/q}}{\delta} \|f\|_{L^q(B_{r_z}(z))} > 0$.
Consider the scaled functions $u_z \in W^{s,2}(B_1) \cap L^1_{2s}(\mathbb{R}^n)$ and $f_z \in L^q(B_1)$ given by
$u_z(x) := \frac{1}{M_z} u(r_z x + z)$, $f_z(x) := \frac{r_z^{2s}}{M_z} f(r_z x + z)$
and also
$A_z(x,y) := A(r_z x + z, r_z y + z)$, $\widetilde{A}_z(x,y) := \widetilde{A}(r_z x + z, r_z y + z)$,
together with a correspondingly rescaled function $\Phi_z$. We note that $u_z$ is a local weak solution of $L^{\Phi_z}_{A_z} u_z = f_z$ in $B_1$. Moreover, observe that $A_z \in L_0(\lambda)$ and $\widetilde{A}_z \in L_1(\lambda, B_1)$, while $\Phi_z$ satisfies (4) and (5) with respect to $\lambda$. Furthermore, by using changes of variables it is easy to verify that $u_z$ and $f_z$ satisfy $\int_{\mathbb{R}^n \setminus B_1} \frac{|u_z(y)|}{|y|^{n+2s}}\,dy \le 1$, $\|f_z\|_{L^q(B_1)} \le \delta$, while (70) implies that Therefore, in view of (71) and (72) the assumptions (55) and (57) from Proposition 4.2 are verified with respect to $u_z$, $f_z$, $A_z$ and $\widetilde{A}_z$, so that by Proposition 4.2 we obtain $[u_z]_{C^{\Theta-\varepsilon}(B_{1/2})} \le C_1(n,s,\lambda,q)$.
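The scaling behind the definitions of $u_z$ and $f_z$ can be verified by a change of variables. The computation below is a sketch under two assumptions that are not taken from the text: the weak formulation is written as a double integral tested against $\varphi$, and the rescaled nonlinearity is taken to be $\Phi_z(t) := \Phi(M_z t)/M_z$. The test function $\psi$ is a hypothetical name for the rescaled $\varphi$.

```latex
% Sketch of the scaling computation.
% Assumptions (not from the text):
%   (i)  weak formulation:
%        \iint A \Phi(u(x)-u(y)) (\varphi(x)-\varphi(y)) |x-y|^{-n-2s} dy dx
%          = \int f \varphi dx;
%   (ii) \Phi_z(t) := \Phi(M_z t) / M_z;
%   (iii) \psi(x') := \varphi((x'-z)/r_z).
Substituting $x' = r_z x + z$, $y' = r_z y + z$, we have
$|x-y| = |x'-y'|/r_z$, $dy\,dx = r_z^{-2n}\,dy'\,dx'$ and
$\Phi_z(u_z(x)-u_z(y)) = \Phi(u(x')-u(y'))/M_z$, so that
\begin{align*}
&\iint A_z(x,y)\,\Phi_z(u_z(x)-u_z(y))\,
   \frac{\varphi(x)-\varphi(y)}{|x-y|^{n+2s}}\,dy\,dx \\
&\quad= \frac{r_z^{\,n+2s}}{M_z\, r_z^{\,2n}}
   \iint A(x',y')\,\Phi(u(x')-u(y'))\,
   \frac{\psi(x')-\psi(y')}{|x'-y'|^{n+2s}}\,dy'\,dx' \\
&\quad= \frac{r_z^{\,2s-n}}{M_z} \int f(x')\,\psi(x')\,dx' \\
&\quad= \frac{r_z^{\,2s}}{M_z} \int f(r_z x + z)\,\varphi(x)\,dx
 = \int f_z(x)\,\varphi(x)\,dx .
\end{align*}
% Norm of the rescaled right-hand side.
Similarly,
\[
\|f_z\|_{L^q(B_1)}
 = \frac{r_z^{\,2s-n/q}}{M_z}\,\|f\|_{L^q(B_{r_z}(z))}
 \le \delta ,
\]
since $M_z \ge \frac{r_z^{\,2s-n/q}}{\delta}\|f\|_{L^q(B_{r_z}(z))}$
by the definition of $M_z$.
```

The three summands in $M_z$ are thus normalized so that the supremum, the tail and the right-hand side of the rescaled problem each meet the smallness requirement at scale 1.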

Remark 4.4.
As can be seen from the above proofs, the assumption (7) in our main result can actually be slightly weakened. In fact, it is enough to assume that for any $x_0 \in \Omega$, there exists some small enough radius $r_{x_0} > 0$ and some $A_{x_0} \in L_1(\lambda, B_{r_{x_0}}(x_0))$ such that where $\delta = \delta(\alpha,n,s,\lambda,q) > 0$ is given by Proposition 4.2. In other words, roughly speaking it suffices that inside of $\Omega \times \Omega$, $A$ is locally close enough to being translation invariant. This slight "room for error" is typical when one uses approximation techniques in order to obtain regularity results, see for example [5].