Higher Hölder regularity for nonlocal equations with irregular kernel

We study the higher Hölder regularity of local weak solutions to a class of nonlinear nonlocal elliptic equations with kernels that satisfy a mild continuity assumption. An interesting feature of our main result is that the obtained regularity is better than one might expect when considering corresponding results for local elliptic equations in divergence form with continuous coefficients. Therefore, in some sense our result can be considered to be of purely nonlocal type, following the trend of various such purely nonlocal phenomena observed in recent years. Our approach can be summarized as follows. First, we use certain test functions that involve discrete fractional derivatives in order to obtain higher Hölder regularity for homogeneous equations driven by a locally translation invariant kernel, while the global behaviour of the kernel is allowed to be more general. This enables us to deduce the desired regularity in the general case by an approximation argument.


Basic setting and main result
In this work, we study the higher Hölder regularity of solutions to nonlinear nonlocal equations of the form
$$L_A^\Phi u = f \quad \text{in } \Omega \tag{1}$$
driven by a kernel that potentially exhibits a very irregular behaviour. More precisely, by modifying an approach introduced in [2], we prove that so-called local weak solutions to such equations are locally Hölder continuous with some explicitly determined Hölder exponent. Here $s \in (0,1)$, $\Omega \subset \mathbb{R}^n$ is a domain (= open set), $f : \mathbb{R}^n \to \mathbb{R}$ is a given function and
$$L_A^\Phi u(x) = \int_{\mathbb{R}^n} \frac{A(x,y)}{|x-y|^{n+2s}}\, \Phi(u(x)-u(y))\,dy, \quad x \in \Omega,$$
is a nonlocal operator. Throughout the paper, for simplicity we assume that $n > 2s$. Furthermore, the function $A : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is measurable and we assume that there exists a constant $\lambda \geq 1$ such that
$$\lambda^{-1} \leq A(x,y) \leq \lambda \quad \text{for almost all } x, y \in \mathbb{R}^n. \tag{2}$$
Moreover, we require $A$ to be symmetric, i.e.
$$A(x,y) = A(y,x) \quad \text{for almost all } x, y \in \mathbb{R}^n. \tag{3}$$
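We call such a function $A$ a kernel coefficient. We define $L_0(\lambda)$ as the class of all such measurable kernel coefficients $A$ that satisfy the conditions (2) and (3). Moreover, in our main results $\Phi : \mathbb{R} \to \mathbb{R}$ is assumed to be a continuous function satisfying $\Phi(0) = 0$ and the following Lipschitz continuity and monotonicity assumptions, namely
$$|\Phi(t) - \Phi(t')| \leq \lambda\,|t - t'| \quad \text{for all } t, t' \in \mathbb{R} \tag{4}$$
and
$$(\Phi(t) - \Phi(t'))\,(t - t') \geq \lambda^{-1} (t - t')^2 \quad \text{for all } t, t' \in \mathbb{R}, \tag{5}$$
where for simplicity we use the same constant $\lambda \geq 1$ as in (2). In particular, if $\Phi(t) = t$, then the operator $L_A^\Phi$ reduces to a linear nonlocal operator which is widely considered in the literature. The above conditions are for example satisfied by any $C^1$ function $\Phi$ with $\Phi(0) = 0$ such that the image of the first derivative of $\Phi$ is contained in $[\lambda^{-1}, \lambda]$. Define the fractional Sobolev space
$$W^{s,2}(\Omega) := \left\{ u \in L^2(\Omega) \;:\; \int_\Omega \int_\Omega \frac{|u(x)-u(y)|^2}{|x-y|^{n+2s}}\,dy\,dx < \infty \right\}$$
and denote by $W^{s,2}_{loc}(\Omega)$ the set of all functions $u \in L^2_{loc}(\Omega)$ that belong to $W^{s,2}(\Omega')$ for any relatively compact open subset $\Omega'$ of $\Omega$. In addition, we define the tail space
$$L^1_{2s}(\mathbb{R}^n) := \left\{ u \in L^1_{loc}(\mathbb{R}^n) \;:\; \int_{\mathbb{R}^n} \frac{|u(y)|}{1+|y|^{n+2s}}\,dy < \infty \right\}.$$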
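Returning to the assumptions on the nonlinearity, the claim that a C^1 function with derivative in [λ^{-1}, λ] is admissible can be verified with the mean value theorem. The following short derivation is only a sketch: it reads the Lipschitz assumption as an upper bound by λ|t − t'| and the monotonicity assumption as a lower bound by λ^{-1}(t − t')^2, which is an assumption based on the surrounding text.

```latex
% Sketch, assuming \Phi \in C^1(\mathbb{R}), \Phi(0) = 0 and \lambda^{-1} \le \Phi' \le \lambda.
\begin{align*}
\Phi(t)-\Phi(t') &= \Phi'(\xi)\,(t-t')
  \quad \text{for some } \xi \text{ between } t \text{ and } t',\\
|\Phi(t)-\Phi(t')| &= \Phi'(\xi)\,|t-t'| \;\le\; \lambda\,|t-t'|,\\
(\Phi(t)-\Phi(t'))\,(t-t') &= \Phi'(\xi)\,(t-t')^2 \;\ge\; \lambda^{-1}\,(t-t')^2 .
\end{align*}
```

As an illustration, the function t ↦ t + (1/2) sin t has derivative 1 + (1/2) cos t ∈ [1/2, 3/2], so it is admissible with λ = 2.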
We remark that for any function $u \in L^1_{2s}(\mathbb{R}^n)$, the quantity
$$\int_{\mathbb{R}^n \setminus B_R(x_0)} \frac{|u(y)|}{|x_0-y|^{n+2s}}\,dy$$
is finite for all $R > 0$, $x_0 \in \mathbb{R}^n$. For all measurable functions $u, \varphi : \mathbb{R}^n \to \mathbb{R}$, we define
$$\mathcal{E}_A^\Phi(u,\varphi) := \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \frac{A(x,y)}{|x-y|^{n+2s}}\, \Phi(u(x)-u(y))\,(\varphi(x)-\varphi(y))\,dy\,dx,$$
provided that the above expression is well-defined and finite. This is for example the case if $u \in W^{s,2}_{loc}(\Omega) \cap L^1_{2s}(\mathbb{R}^n)$ and $\varphi \in W^{s,2}_c(\Omega)$, where by $W^{s,2}_c(\Omega)$ we denote the set of all functions that belong to $W^{s,2}(\Omega)$ and are compactly supported in $\Omega$.
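The finiteness of the tail quantity for functions in the tail space follows from an elementary comparison of weights. The derivation below is a sketch under the assumption that the tail space is defined through the weight 1/(1 + |y|^{n+2s}) and that the tail integral is taken over the complement of the ball B_R(x_0); the constant C_0 is introduced here only for illustration.

```latex
% For y \in \mathbb{R}^n \setminus B_R(x_0) we have |x_0 - y| \ge R, hence
\begin{align*}
1 &\le R^{-(n+2s)}\,|x_0-y|^{n+2s},\\
|y| &\le |x_0| + |x_0-y| \le \Bigl(1+\tfrac{|x_0|}{R}\Bigr)|x_0-y|,\\
1+|y|^{n+2s} &\le C_0\,|x_0-y|^{n+2s},
  \qquad C_0 := R^{-(n+2s)} + \Bigl(1+\tfrac{|x_0|}{R}\Bigr)^{n+2s}.
\end{align*}
% Consequently
\[
\int_{\mathbb{R}^n\setminus B_R(x_0)} \frac{|u(y)|}{|x_0-y|^{n+2s}}\,dy
\;\le\; C_0 \int_{\mathbb{R}^n} \frac{|u(y)|}{1+|y|^{n+2s}}\,dy \;<\; \infty .
\]
```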
In the literature, various types of weak solutions with varying generality are considered. In this paper, we adopt the following very general notion of local weak solutions which is for example used in [1] and [2].
Definition Let $f \in L^{\frac{2n}{n+2s}}_{loc}(\Omega)$. We say that $u \in W^{s,2}_{loc}(\Omega) \cap L^1_{2s}(\mathbb{R}^n)$ is a local weak solution of the equation $L_A^\Phi u = f$ in $\Omega$, if
$$\mathcal{E}_A^\Phi(u,\varphi) = \int_\Omega f \varphi\,dx \quad \text{for all } \varphi \in W^{s,2}_c(\Omega). \tag{6}$$
We remark that the right-hand side of (6) is finite by the fractional Sobolev embedding (cf. [9, Theorem 6.5]). It is noteworthy that the above notion of local weak solutions contains most other notions of weak solutions considered in the literature, such as the ones considered in e.g. [8] or [22]. In our first main result, we are going to impose an additional continuity assumption on $A$. Namely, we assume that there exists some small $\varepsilon > 0$ such that
$$\lim_{h \to 0} \; \sup_{\substack{x,y \in K \\ |x-y| \leq \varepsilon}} |A(x+h,y+h) - A(x,y)| = 0 \quad \text{for any compact set } K \subset \Omega. \tag{7}$$
In particular, the condition (7) is satisfied if $A$ is either continuous close to the diagonal in $\Omega \times \Omega$ or if $A$ belongs to the following subclass of $L_0(\lambda)$ which plays an important role in our proof of the desired regularity.
Definition Let $\Omega \subset \mathbb{R}^n$ be a domain and $\lambda \geq 1$. We say that a kernel coefficient $A_0 \in L_0(\lambda)$ belongs to the class $L_1(\lambda, \Omega)$, if there exists a measurable function $a : \mathbb{R}^n \to \mathbb{R}$ such that $A_0(x,y) = a(x-y)$ for all $x, y \in \Omega$.
A kernel coefficient that belongs to the class $L_1(\lambda, \Omega)$ can be thought of as being translation invariant, but only inside of $\Omega$. We also call such a kernel coefficient locally translation invariant. We note that the condition (7) is also satisfied by some more general choices of kernel coefficients, for example if
$$A(x,y) = A'(x,y)\,A_0(x,y),$$
where $A' \in L_0(\lambda^{1/2})$ is continuous near the diagonal in $\Omega \times \Omega$ and $A_0$ belongs to the class $L_1(\lambda^{1/2}, \Omega)$, but is not required to satisfy any continuity or smoothness assumption. Moreover, we stress that the condition given by (7) only restricts the behaviour of $A$ close to the diagonal in $\Omega \times \Omega$, while away from the diagonal in $\Omega \times \Omega$ and outside of $\Omega \times \Omega$ a more general behaviour is possible.
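To illustrate the difference between a locally translation invariant kernel coefficient and one that violates the continuity condition (7), the following minimal Python sketch evaluates a discrete stand-in for the supremum in (7) on a one-dimensional grid. The profile `a`, the coefficient `A_jump`, the grid and the helper `shift_oscillation` are hypothetical choices made only for this illustration, and a finite grid can of course only suggest, not prove, the behaviour of the supremum.

```python
from fractions import Fraction
from math import floor

LAMBDA = 2  # ellipticity constant: 1/LAMBDA <= A <= LAMBDA


def a(z):
    """Even, bounded, deliberately discontinuous profile (hypothetical example)."""
    return Fraction(LAMBDA) if floor(abs(z) * 10) % 2 == 0 else Fraction(1, LAMBDA)


def A0(x, y):
    """Translation invariant kernel coefficient A0(x, y) = a(x - y)."""
    return a(x - y)


def A_jump(x, y):
    """Symmetric, bounded coefficient that jumps when x or y crosses 0."""
    return Fraction(LAMBDA) if (x > 0 and y > 0) else Fraction(1)


def shift_oscillation(A, h, grid, eps):
    """Grid version of sup over x, y in K with |x - y| <= eps of |A(x+h, y+h) - A(x, y)|."""
    return max(
        abs(A(x + h, y + h) - A(x, y))
        for x in grid
        for y in grid
        if abs(x - y) <= eps
    )


# Exact rational arithmetic avoids rounding artefacts near the jumps of a.
grid = [Fraction(k, 100) for k in range(-50, 51)]  # sample of K = [-1/2, 1/2]
eps = Fraction(1, 4)
shifts = [Fraction(1, 10 ** j) for j in range(1, 5)]  # h = 0.1, 0.01, ...

osc_invariant = [shift_oscillation(A0, h, grid, eps) for h in shifts]
osc_jump = [shift_oscillation(A_jump, h, grid, eps) for h in shifts]
```

In this sketch the oscillation of `A0` vanishes for every shift although `a` is discontinuous, whereas the oscillation of `A_jump` stays equal to 1 however small the shift is, so the analogue of (7) fails on any compact set containing the origin.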
We are now in the position to state our main results.
Theorem 1.1 Let $\Omega \subset \mathbb{R}^n$ be a domain, $s \in (0,1)$, $\lambda \geq 1$ and $f \in L^q_{loc}(\Omega)$ for some $q > \frac{n}{2s}$. Consider a kernel coefficient $A \in L_0(\lambda)$ that satisfies the condition (7) for some $\varepsilon > 0$ and suppose that $\Phi$ satisfies (4) and (5) with respect to $\lambda$. Moreover, assume that $u \in W^{s,2}_{loc}(\Omega) \cap L^1_{2s}(\mathbb{R}^n)$ is a local weak solution of the equation $L_A^\Phi u = f$ in $\Omega$. Then for any $0 < \alpha < \min\{2s - \frac{n}{q}, 1\}$, we have $u \in C^\alpha_{loc}(\Omega)$. Furthermore, for all $R > 0$, $x_0 \in \Omega$ such that $B_R(x_0) \Subset \Omega$ and any $\sigma \in (0,1)$, we have the Hölder estimate (8), whose right-hand side involves the tail term $\int_{\mathbb{R}^n \setminus B_R(x_0)} \frac{|u(y)|}{|x_0-y|^{n+2s}}\,dy$, where $C = C(n, s, \lambda, \alpha, q, \sigma, \varepsilon) > 0$.
If we focus on obtaining Hölder regularity for some fixed exponent $0 < \alpha < \min\{2s - \frac{n}{q}, 1\}$, then we can slightly weaken the assumption on $A$ as follows. Roughly speaking, in this case it is enough to require that $A$ is locally close enough to being translation invariant, while the condition (7) essentially means that $A$ is locally arbitrarily close to being translation invariant. This slight "room for error" is typical when one uses approximation techniques in order to obtain regularity results, see for example [5].

Remark 1.3
In order to provide some context, let us briefly consider the local elliptic equation in divergence form of the type
$$\operatorname{div}(B \nabla u) = 0 \quad \text{in } \Omega, \tag{9}$$
where the matrix of coefficients $B = \{b_{ij}\}_{i,j=1}^n$ is assumed to be uniformly elliptic and bounded. The equation (9) can in some sense be thought of as a local analogue of the nonlocal equation (1) corresponding to the limit case $s = 1$. A classical regularity result states that if the coefficients $b_{ij}$ are continuous, then weak solutions $u \in W^{1,2}_{loc}(\Omega)$ of the equation (9) are locally Hölder continuous for any exponent $\alpha \in (0,1)$, see for example [13, Corollary 5.18]. Heuristically, one might therefore expect that the optimal regularity in the setting of nonlocal equations with continuous kernel coefficient should not exceed $C^s$ regularity. Nevertheless, Theorem 1.1 in particular shows that weak solutions to nonlocal equations of the type $L_A^\Phi u = 0$ in $\Omega$ are locally $C^\alpha$ for any $0 < \alpha < \min\{2s, 1\}$ whenever $A \in L_0(\lambda)$ is continuous, exceeding $C^s$ regularity. In particular, in the case when $s \geq 1/2$, weak solutions to homogeneous nonlocal equations with continuous kernel coefficients enjoy the same amount of Hölder regularity as weak solutions to corresponding local equations with continuous coefficients, despite the fact that the order of such nonlocal equations is lower.
Such at first sight unexpected additional regularity is however not untypical in the context of nonlocal equations and has been observed in various previous works in the context of Sobolev regularity. For example, in [18] and [24] it is shown that already in the setting of a general kernel coefficient A ∈ L 0 (λ), weak solutions to nonlocal equations of the type (1) are slightly higher differentiable than initially assumed along the scale of Sobolev spaces, which is a phenomenon not shared by local elliptic equations of the type (9) with coefficients that are merely measurable.
Another result in this direction was recently proved in [21], where the authors in particular show that if $A \in L_0(\lambda)$ is Hölder continuous with some arbitrary Hölder exponent and $\Phi(t) = t$, then weak solutions of the equation $L_A^\Phi u = 0$ in $\mathbb{R}^n$ belong to $W^{\alpha,p}_{loc}(\mathbb{R}^n)$ for any $\alpha < \min\{2s, 1\}$ and any $2 \leq p < \infty$, while for local equations of the type (9) with corresponding Hölder continuous coefficients no comparable gain in differentiability is achievable. In particular, by the Sobolev embedding this result implies that such weak solutions belong to $C^\alpha_{loc}(\mathbb{R}^n)$ for any $0 < \alpha < \min\{2s, 1\}$, which is consistent with our main result. Our main result shows that this amount of higher Hölder regularity is also enjoyed by local weak solutions of possibly nonlinear equations driven by kernel coefficients of class $L_0(\lambda)$ that satisfy the continuity assumption (7).
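The exponent range of Theorem 1.1 and its comparison with the heuristic C^s threshold discussed in this remark can be made concrete with a small computation. The helper `holder_threshold` below is a hypothetical name used only for this sketch; it returns the quantity min{2s − n/q, 1}, and the theorem covers every exponent strictly below that value.

```python
import math


def holder_threshold(s, n, q=math.inf):
    """Return min(2s - n/q, 1), the supremum of the admissible Hölder exponents.

    q = math.inf models a locally bounded right-hand side, which in particular
    covers the homogeneous case f = 0 and gives min(2s, 1).
    """
    if not 0 < s < 1:
        raise ValueError("s must lie in (0, 1)")
    if q <= n / (2 * s):
        raise ValueError("the theorem requires q > n / (2s)")
    return min(2 * s - n / q, 1.0)


# Homogeneous equations: the threshold min(2s, 1) always exceeds s.
low_order = holder_threshold(0.25, n=2)    # 2s = 0.5 < 1
high_order = holder_threshold(0.75, n=2)   # capped at 1, as for local equations
# Inhomogeneous equation with f in L^q_loc, q = 4, n = 1, s = 1/2.
with_rhs = holder_threshold(0.5, n=1, q=4)
```

For s = 1/4 the threshold is 1/2, twice the order of the equation, while for s ≥ 1/2 and f = 0 it equals 1, matching the range α ∈ (0, 1) known for local equations with continuous coefficients.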

Remark 1.4
Besides being interesting for its own sake, one of our main motivations is that Theorem 1.1 also has some interesting potential applications concerning the Sobolev regularity of solutions to nonlocal equations. A first such application can briefly be summarized as follows. In [22], in the main result it is assumed that A is globally translation invariant, i.e. that A belongs to the class L 1 (λ, R n ). However, this assumption is only used in order to ensure that the Hölder estimate (8) from Theorem 1.1 is valid, which up to this point was only known for translation invariant kernels, cf. [22,Theorem 4.6]. Since otherwise the proofs in [22] only rely on the properties (2) and (3) of A, from Theorem 1.1 above we conclude that the statement of [22,Theorem 1.1] is also true for general kernel coefficients A of class L 0 (λ) that satisfy the condition (7).

Approach and previous results
As mentioned, our approach is strongly influenced by an approach introduced in [2], where a similar result concerning higher Hölder regularity is proved for the fractional p-Laplacian in the superquadratic case when p ≥ 2. Although for simplicity we restrict ourselves to the quadratic case when p = 2, in contrast to [2] we deal with a nonlinearity already in the quadratic setting and most importantly, we also treat equations driven by general kernel coefficients A that satisfy the mild assumption (7), while in [2] only the case when A ≡ 1 is considered. Also, we stress that by combining our techniques with some more techniques from [2], our approach could be modified in order to treat also nonlinearities with nonlinear growth of the type $\Phi(t) \approx t^{p-1}$. However, since the additional difficulties arising from such a generalization were already dealt with in [2] and we instead want to focus on the difficulties arising from considering equations with general coefficients, we decided not to pursue this direction in this work.
Let us briefly summarize our approach, highlighting the differences to the one used in [2]. First, we prove the higher Hölder regularity for homogeneous equations driven by a locally translation invariant kernel coefficient, see Sect. 3. As in [2], the main idea in this case is to test the equation with certain monotone power functions of discrete fractional derivatives leading to an incremental higher integrability and differentiability result on the scale of certain Besov-type spaces. However, in our setting we also need to carefully use the local translation invariance and the bounds imposed on A, and also the assumptions (4) and (5) imposed on Φ in order to overcome the difficulties that arise due to the presence of the general kernel and the general type of nonlinearity. Moreover, we remark that restricting ourselves to equations with linear growth has the advantage that the proof of this incremental higher regularity result simplifies quite substantially in some other respects. The obtained incremental gain in regularity is then iterated, in order for the desired Hölder regularity to follow by embedding.
In Sect. 4, we then treat the general case of inhomogeneous equations driven by a kernel coefficient satisfying the condition (7) by an approximation argument. In the corresponding approximation argument applied in [2], the solution is approximated by a solution of a corresponding equation with zero right-hand side, while the nonlocal operator driving the equation is left unchanged. In order to be able to treat equations with a general kernel coefficient A of class L 0 (λ) that satisfies only the continuity assumption (7), in addition to freezing the right-hand side, we also need to locally replace A by a corresponding locally translation invariant kernel coefficient, which is possible in view of the assumption (7). Since by the first part of the proof the desired Hölder regularity is already known for solutions to equations with locally translation invariant kernel coefficients, we can then transfer this regularity from the approximate solution to the solution itself. In other words, in some sense we locally freeze the coefficient, in order to transfer the regularity from an equation for which the higher regularity can be proved directly to an equation driven by a less regular kernel. This strategy can be thought of as a nonlocal counterpart of corresponding techniques widely used in the study of higher regularity for local elliptic equations, although we stress that in our nonlocal setting we have to overcome a number of additional difficulties which are not present in the local setting in order to execute such an approximation argument successfully. Moreover, we believe that just like in the local setting, the approximation techniques developed in this paper are flexible enough in order to be adaptable to also proving other higher regularity results for nonlocal equations similar to (1).
Regarding other related regularity results, in [12] a similar result is proved in the linear case when $\Phi(t) = t$, where A is required to be locally close enough to $b\left(\frac{x-y}{|x-y|}\right)$ for some even function $b : S^{n-1} \to \mathbb{R}$ that is bounded between two positive constants, which is contained in our assumption on A in Theorem 1.2. More results concerning higher Hölder regularity for various types of nonlocal equations are for instance contained in [5,6,11,23] and [14]. Furthermore, results regarding basic Hölder regularity for nonlocal equations are proved for example in [8,15,19,25], while results concerning Sobolev regularity can be found for example in [1,7,10,18,21,22,24]. Finally, for some regularity results concerning nonlocal equations similar to (1) in the more general setting of measure data, we refer to [17].

Some notation
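Let us fix some notation which we use throughout the paper. By $C$, $c$, $C_i$ and $c_i$, $i \in \mathbb{N}_0$, we always denote positive constants, while dependences on parameters of the constants will be shown in parentheses. As usual, by $B_r(x_0) := \{x \in \mathbb{R}^n : |x - x_0| < r\}$ and $\overline{B}_r(x_0) := \{x \in \mathbb{R}^n : |x - x_0| \leq r\}$ we denote the open and closed ball with center $x_0 \in \mathbb{R}^n$ and radius $r > 0$, respectively. Moreover, if $E \subset \mathbb{R}^n$ is measurable, then by $|E|$ we denote the n-dimensional Lebesgue measure of $E$. If $0 < |E| < \infty$, then for any $u \in L^1(E)$ we define the average of $u$ over $E$ by $\frac{1}{|E|} \int_E u(x)\,dx$.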

The nonlocal tail
In this section, for convenience we state and prove the following two simple results concerning the nonlocal tail of a function which we use frequently throughout the paper.
Lemma 2.1 Let s ∈ (0, 1) and 0 < r < R. Then for any x ∈ B r and any u ∈ L 1 2s (R n ), we have Proof The claim follows directly from the observation that for any x ∈ B r and any y ∈ R n \ B R , we have Proof Since by assumption x 0 ∈ B 1−r , with the help of Lemma 2.1 we obtain which finishes the proof.

The fractional Sobolev space W s,2
First of all, for notational convenience for any domain ⊂ R n we define the seminorm associated to the space W s,2 ( ) by Moreover, we define the space The following Poincaré-type inequality associated to the space W s,2 will frequently be used throughout the paper.

Besov-type spaces
Next, let us introduce some function spaces of Besov-type. In order to do so, for q ∈ [1, ∞) and any function u ∈ L q (R n ) we define the quantities This enables us to define the two Besov-type spaces The following embedding result can be found in [4, Lemma 2.3].
Finally, the following result can be found in [1, Proposition 2.6].

Some elementary inequalities
The proof of the following elementary inequality can be found in [2, Lemma A.3].

Lemma 2.7
For all X , Y ∈ R and any p ≥ 1, we have Next, we prove two elementary inequalities which involve the function J p defined in Sect. 2.1 and are based on the monotonicity property (5) of .
so that in this case both sides of the inequality vanish. Next, we consider the case when a−c = b−d. In view of the monotonicity assumption (5) imposed on , we have Moreover, by [20, page 71], for all x, y ∈ R we have Since the last term on the right-hand side is non-negative, by choosing y = a − b and Multiplying the inequality (11) with the one in the previous display leads to so that the claim follows by simplifying the factor ((a − b) − (c − d)) 2 from both sides.
The right-hand side of the above estimate can be further estimated by applying [2, Lemma A.1] with p = q + 1 and q = 2, which yields The claim now follows by combining the last two displays.

Some preliminary estimates
The following Caccioppoli-type inequality can be proved in essentially the same way as the one in [18, Theorem 3.1].
Theorem 2.10 Let 0 < r < R, x 0 ∈ R n , λ ≥ 1 and f ∈ L 2n n+2s (B R (x 0 )). Moreover, assume that A ∈ L 0 (λ) and that the Borel function : R → R satisfies Then for any local weak solution u ∈ W s, We remark that the assumptions in (12) are clearly implied by the assumptions (0) = 0, (4) and (5) which are used in our main results.
The following result on local boundedness is essentially given by [3, Theorem 3.8], where the below result is stated under the stronger assumption that $u \in W^{s,2}_0(B_R(x_0))$ and in the setting of the fractional p-Laplacian, which applied to our setting means that strictly speaking it only contains the case when $\Phi(t) = t$ and $A(x,y) \equiv 1$. Nevertheless, an inspection of the proof shows that it remains valid for local weak solutions, see also [2, Theorem 3.2]. Moreover, the case of a general Φ and a general A can easily be treated by noting that the Caccioppoli-type inequality from [3, Proposition 3.5] remains valid for such a general Φ and a general A by simply applying the bounds imposed on Φ and A whenever appropriate in a similar fashion as in [18, Theorem 3.1]. Therefore, we have the following result.
2s . Moreover, consider a kernel coefficient A ∈ L 0 (λ) and assume that the Borel function : R → R satisfies (12). Then for any local weak solution u ∈ W s,2 (B R (x 0 )) ∩ L 1 2s (R n ) of the equation In the case when f = 0 and (t) = t, the following result concerning basic Hölder regularity follows from [

Incremental higher integrability and differentiability
The key ingredient to proving the desired higher Hölder regularity for homogeneous equations with locally translation invariant kernel is provided by the following incremental higher integrability and differentiability result on the scale of Besov-type spaces. In the case of the fractional p-Laplacian for p ≥ 2, the below result was proved in [2, Proposition 5.1].
Besides the fact that we treat equations with arbitrary locally translation invariant kernels, it is also interesting that in our setting of equations with linear growth, we are able to directly prove both higher integrability and differentiability, while for possibly degenerate equations as in [2] it is necessary to first obtain a pure higher integrability result (cf. [2, Proposition 4.1]), which is then used in order to also obtain higher differentiability. We remark that this additional higher differentiability does not seem to have a counterpart in the context of local equations and is one of the main reasons why in our nonlocal setting we are able to exceed C s regularity. Moreover, note that although at this point we work with solutions that are bounded, this assumption will later be removed by using Theorem 2.11.

Proof
Step 1: Discrete differentiation of the equation. Set r := R − 4h 0 > 0 and fix some h ∈ R n such that 0 < |h| < h 0 . Let η ∈ C ∞ 0 (B R ) be a non-negative Lipschitz cutoff function satisfying Let us show that the function Thus, since the product of a function belonging to W s,2 (B R ) and a Lipschitz function also belongs to and are compactly supported in B R−h 0 , in view of [2, Lemma 2.11] in particular both ϕ and ϕ −h belong to W s,2 c (B 1 ), so that both ϕ and ϕ −h are admissible test functions in (13). Therefore, using ϕ −h as a test function in (13) along with a change of variables yields where we have set A h (x, y) := A(x + h, y + h). Moreover, testing (13) with ϕ yields By subtracting (16) from (15) and dividing by 0 < |h| < h 0 , we obtain (17) Next, splitting the above integral and taking into account the choice of ϕ, we arrive at where where we used that η vanishes identically outside of B (R+r )/2 .
Step 2: Preliminary estimation of the local term I 1 .
for all x, y ∈ B 1 and some measurable function a : R n → R. Since for x, y ∈ B R we have x + h, y + h ∈ B 1 , it follows that for all x, y ∈ B R we have Therefore, we can rewrite I 1 as follows Let us now concentrate on estimating I 1 . First of all, we observe that Therefore, we obtain Next, using the Lipschitz bound (4), Young's inequality and then Lemma 2.8, for the negative term in the last display we deduce where ε > 0 is arbitrary. By choosing ε := 1 2λ 2 , combining the last two displays yields where C 3 = C 3 (λ) > 0. By using Lemma 2.9, we can further estimate the first term of the previous display, which along with the bounds (2) of A leads to where c = c(λ, q) > 0 and C 4 = C 4 (λ) > 0. Next, for simplicity we write and observe that by using the convexity of the function t → t 2 , we obtain Combining (18) with the last display yields where C 5 = C 5 (λ, q) > 0. By combining the above estimate for I 1 with the identity I 1 + I 2 + I 3 = 0, we arrive at where C 6 = C 6 (λ, q) > 0 and Our next goal is to estimate the terms I 1,1 , |I 2 | and |I 3 |.
Step 3: Estimating the local term I 1,1 . In order to estimate I 1,1 , observe that for any x ∈ B R changing variables and integrating in polar coordinates yields where C 7 = C 7 (n, s) > 0. Since by construction η is Lipschitz with Lipschitz constant C 1 4h 0 , along with (20) we obtain where we used that R + h 0 ≤ 1 and ||u|| L ∞ (B 1 ) ≤ 1 in order to obtain the last inequality and C 8 = C 8 (n, s, q, λ, h 0 ) > 0. In the same way we have |δ h u(y)| q |h| 1+ϑq dy, so that we obtain
Step 4: Estimating the nonlocal terms I 2 and I 3 . Next, let us estimate the nonlocal terms I 2 and I 3 , which can be treated in the same way. Since ||u|| L ∞ (B 1 ) ≤ 1 and (R+r )/2+h 0 ≤ 1, by additionally using the bound (4) of Φ with t = u h (x) − u h (y) and t′ = 0, for almost every x ∈ B (R−r )/2 and any y ∈ R n \ B R we have By using the upper bound in (2) of A (which trivially also holds for A h ) and the fact that 0 ≤ η ≤ 1 and then the last two displays, we deduce For any x ∈ B (R+r )/2 , we have B (R−r )/2 (x) ⊂ B R , which in view of integration in polar coordinates along with the fact that R − r = 4h 0 leads to where C 9 = C 9 (n, s) > 0 and C 10 = C 9 (2h 0 ) −2s . Using Lemma 2.1, the change of variables z = y + h and then Lemma 2.2, for any x ∈ B (R+r )/2 we obtain where C 11 = C 11 (n, s, h 0 ) > 0. Here we also used the fact that R > 4h 0 and the bounds imposed on u. The term involving u can be estimated similarly. In fact, by using Lemma 2.1 and Lemma 2.2, for any x ∈ B (R+r )/2 we obtain
$$\int_{\mathbb{R}^n \setminus B_R} \frac{|u(y)|}{|x-y|^{n+2s}}\,dy \leq (2h_0)^{-(n+2s)} \int_{\mathbb{R}^n \setminus B_R} \frac{|u(y)|}{|y|^{n+2s}}\,dy \leq C_{12},$$
Step 5: Conclusion. Let ξ ∈ R n \ {0} be chosen such that |ξ| < h 0 . Applying Lemma 2.7 with where C 15 = C 15 (q) > 0. Here we also used that η ≡ 1 in B r in order to obtain the last inequality. Next, we observe that by the discrete Leibniz rule (cf. [2, Formula (2.1)]), we can write We arrive at , where C 17 = C 17 (n, s) > 0 and C 18 = C 18 (n, s, h 0 ) > 0. By using that η is Lipschitz and that |ξ| < h 0 , along with the assumption that ||u|| L ∞ (B 1 ) ≤ 1 we estimate the second term on the right-hand side of (24) as follows where C 20 = C 20 (n, s, q, h 0 ) > 0. We now choose ξ = h and take the supremum over h for 0 < |h| < h 0 , so that together with (23) we obtain sup 0<|h|<h 0 where C 21 = C 21 (n, s, q, h 0 , λ) > 0. Next, we use the fact that by [2, Lemma 2.6] applied with β = (1 + ϑq)/q < 1, on the right-hand side of (25) we can replace the first-order difference quotient by a corresponding second-order difference quotient in the following way By combining the last display with (25) and using that ||u|| where C = C(n, s, q, ϑ, h 0 , λ) > 0. Since r = R − 4h 0 , the proof is finished.
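The first-order and second-order differences and the discrete Leibniz rule used in the proof above can be checked numerically. The sketch below assumes one standard form of the rule, δ_h(uv) = u_h δ_h v + v δ_h u with u_h(x) = u(x + h), which may differ in the arrangement of terms from the formula cited from [2]; the function names and the sample functions are hypothetical.

```python
import math


def delta(u, h):
    """First-order difference: (delta_h u)(x) = u(x + h) - u(x)."""
    return lambda x: u(x + h) - u(x)


def delta2(u, h):
    """Second-order difference, obtained by applying delta_h twice."""
    return delta(delta(u, h), h)


def leibniz_defect(u, v, h, x):
    """|delta_h(uv) - (u_h * delta_h v + v * delta_h u)| evaluated at x."""
    lhs = delta(lambda t: u(t) * v(t), h)(x)
    rhs = u(x + h) * delta(v, h)(x) + v(x) * delta(u, h)(x)
    return abs(lhs - rhs)


u = math.sin
v = lambda t: t * t + 1.0
points = [k / 10 for k in range(-20, 21)]
steps = [0.5, 0.1, 0.01]

# Largest violation of the Leibniz rule over the sample (rounding error only).
max_defect = max(leibniz_defect(u, v, h, x) for h in steps for x in points)

# delta2 agrees with the expanded formula u(x + 2h) - 2u(x + h) + u(x).
max_delta2_gap = max(
    abs(delta2(u, h)(x) - (u(x + 2 * h) - 2 * u(x + h) + u(x)))
    for h in steps
    for x in points
)
```

Both quantities are zero up to floating-point rounding, which reflects that the identities are purely algebraic and hold for arbitrary functions and shifts.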

An iteration argument
We now use an iteration argument based on Proposition 3.1 in order to obtain the following higher Hölder regularity result.
Proof If u ≡ 0 a.e., then the assertion is trivially satisfied. Otherwise, set Consider the scaled function and also Observe that u 1 belongs to W s, and is a weak solution of L 1 A 1 u 1 = 0 in B 1 . Moreover, it is easy to verify that A 1 ∈ L 1 (λ, B 1 ) and that 1 satisfies (4) and (5) with respect to λ. Furthermore, by using changes of variables it is straightforward to verify that u 1 satisfies Therefore, the conclusion of Proposition 3.1 is valid with respect to u 1 . For i ∈ N 0 , we define the sequences In particular, we have lim We split the further proof into two cases. Case 1: s ≤ 1/2. Fix 0 < α < 2s. In view of (28), we can find some large enough i ∞ ∈ N such that α < 1 For i = 0, ..., i ∞ , define We note that Since s ≤ 1/2, for i = 0, ..., i ∞ − 1 we have 0 < (1 + ϑ i q i )/q i < 1. Therefore, for i = 0, ..., i ∞ − 1 we can apply Proposition 3.1 to so that along with (30) and the observation that by construction we obtain the following estimates sup 0<|h|<h 0 and sup 0<|h|<h 0 where C 0 = C 0 (n, s, λ, α). Combining the above estimates leads to the estimate sup 0<|h|<h 0 where C 1 = C 1 (n, s, λ, α) > 0. By taking into account the relation and then using the second part of Proposition 2.6 and then (27), we deduce sup 0<|h|<h 0 By combining (31) with (32) and setting we arrive at sup 0<|h|<h 0 In order to proceed, we fix a cutoff function χ ∈ C ∞ 0 (B 5/8 ) with the properties where by ∇ 2 χ we denote the Hessian of χ and C 4 = C 4 (n) > 0. In particular, since 0 < β < 1, for any h ∈ R n with |h| > 0 we have (33) and (27), for 0 < |h| < h 0 we obtain Since moreover by (27), for |h| ≥ h 0 we have by Lemma 2.4 it follows that s, λ, α).
(35) Along with Lemma 2.5 with our choice of β and q = q i ∞ (which is applicable in view of (29)), we obtain s, λ, α).
By imitating the arguments used to conclude in case 1 (cf. (35) and (36)), which in particular involves applying Lemma 2.5 with β = 1 − ε and q = q i ∞ + j ∞ (which is applicable in view of (38)), we conclude that for a different constant C as the one in (36). The desired estimate (26) now once again simply follows by rescaling, which finishes the proof.

Higher Hölder regularity by approximation
We now use an approximation argument inspired by [2, section 6] and [5] in order to prove Theorem 1.1 and Theorem 1.2 under full generality. In order to do so, we need the following definition.

Moreover, suppose that A is another kernel coefficient of class
and let u ∈ W s,2 (B 1 ) ∩ L 1 2s (R n ) be a local weak solution of Then the unique weak solution v ∈ W s,2 (B 1 ) ∩ L 1 2s (R n ) of the problem Proof First of all, we remark that the existence of a unique weak solution of the problem (44) belonging to W s,2 (B 1 ) ∩ L 1 2s (R n ) can be shown almost exactly as in [16, Theorem 1 and Remark 3] by using the theory of monotone operators and additionally taking into account the bounds (4) and (5) imposed on Φ. We now argue by contradiction. Assume that the conclusion is not true. Then there exist some (4) and (5), and sequences , such that for any m the function u m is a local weak solution of the problem sup but for any m the unique weak solution v m ∈ W s,2 ( In view of (2), (5) and using w m := u m − v m ∈ W s,2 0 (B 7/8 ) as a test function in (49) and also in (46), we obtain By using (4) and (48), we further estimate I 1 as follows .
In order to proceed, we observe that since n > 2s, we have q > n 2s > 2n n+2s , so that Hölder's inequality and (48) yield where C 1 = C 1 (n, s, q) > 0. By using the Cauchy-Schwarz inequality, Theorem 2.10, (47) and (51), for I 1,1 we obtain where C 4 = 15 n+2s , C 5 = C 5 (n, s) > 0 and C 6 = C 6 (n, s, M) > 0. Similarly, by using Lemma 2.1, the Cauchy-Schwarz-inequality, Lemma 2.2, Lemma 2.3 and (47), for I 1,3 we obtain where again all the constants depend only on n, s and M. Next, by using Hölder's inequality, the fractional Sobolev inequality (cf. [9, Theorem 6.5]) and (51), we estimate I 2 in the following way where C 10 = C 10 (n, s, q) > 0. Putting the above estimates together, we arrive at In other words, we have lim so that in view of (52) and (47) the sequence {v m } ∞ m=1 is uniformly bounded in B 3/4 and has uniformly bounded C β seminorms in B 3/4 , where β = β(n, s, λ, q) > 0. Moreover, in view of (47) and Theorem 2.12, the sequence {u m } ∞ m=1 is also uniformly bounded in B 3/4 and has uniformly bounded C β seminorms in B 3/4 . In particular, the same is also true for the sequence {u m − v m } ∞ m=1 . Therefore, by the Arzelà-Ascoli theorem, by passing to a subsequence if necessary, we obtain that the sequence {u m − v m } ∞ m=1 converges uniformly in B 3/4 to some function h. Since by (53) up to passing to another subsequence we have which by uniqueness of the limit implies that h = 0 a.e. in B 3/4 , we arrive at In particular, for m large enough we have which contradicts (50). This finishes the proof.
Next, we use the above Lemma and Theorem 3.2 in order to prove the desired higher Hölder regularity in the case when A is close enough to a locally translation invariant kernel coefficient.
Proof We divide the proof into two parts.
Step 1: Regularity at the origin. In this step, our aim is to prove that for any 0 < ε < Θ and any 0 < r < 1, there exists some small enough δ > 0 such that if A, A, f and u are as above, then sup for some constant C 1 = C 1 (n, s, λ, ε) > 0. In order to accomplish this, we fix some 0 < ε < Θ and observe that it suffices to prove that there exist 0 < ρ < 1/3 and δ > 0 such that if A, A, f and u are as above, then for any k ∈ N 0 we have where M 0 := 1 + R n \B 1 dy |y| n+2s < ∞. Indeed, assume that (59) were true. Since for any 0 < r < 1 there exists some k ∈ N 0 such that ρ k+1 < r ≤ ρ k , by the first inequality in (59) we would arrive at which would prove (58) with C 1 = 2/ρ^(Θ−ε). In order to prove (59), we proceed by induction. In the case when k = 0, (59) is true by the assumptions (57). Next, suppose that (59) holds up to k and let us prove that it is also true for k + 1. Let τ > 0 be chosen small enough and consider the corresponding δ = δ(τ, n, s, λ, q, M) > 0 given by Lemma 4.1, where M := 2 + M 0 . Assume that (55) is satisfied with respect to this δ. Furthermore, define and λ, B 1 ) and that Φ k satisfies (4) and (5) with respect to λ. Moreover, w k belongs to W s,2 (B 1 ) ∩ L 1 2s (R n ) and is a local weak solution of L k A k w k = f k in B 1 , while by (55) we have where we have also used that Θ ≤ 2s − n/q and thus k(2s − (Θ − ε) − n/q) ≥ kε ≥ 0.
Moreover, by the induction hypothesis we have

Therefore, by Lemma 4.1 the unique weak solution v k ∈ W s,2 ( Together with the fact that $w_k(0) = 0$, we obtain that for any $x \in B_{1/3}$ we have Our next goal is to prove that the right-hand side of the previous estimate is uniformly bounded by a constant that does not depend on $k$. In order to do so, we observe that since where C 2 = C 2 (n, s, λ, , ε) > 0. For the first term on the right-hand side, in view of (61) and (60) we have In order to estimate the tail term, we observe that by the same argument used in order to obtain (52), we have where $C_6 = C_6(n, s, \lambda, q, \delta) > 0$. Finally, for the Sobolev seminorm, by Theorem 2.10 and the above estimates we have where $C_7$ and $C_8$ do not depend on $k$. By combining the above estimates with (62) and (63), we obtain that for any $x \in B_{1/3}$ we have where again $C_9$ does not depend on $k$. Next, define By choosing τ small enough such that 2τ < ρ , in view of (64), we obtain In particular, by choosing $\rho$ small enough such that $\rho \le (1 + C_9)^{-2/\varepsilon}$ and recalling that $\rho < 1/3$, we arrive at $\|w_{k+1}\|_{L^\infty(B_1)} \le 1$. By definition of $w_{k+1}$ this is equivalent to which proves the first estimate in (59) for $k + 1$. In order to prove the second estimate in (59) for $k + 1$, we observe that (65) implies where C 10 := (1 + C 9 ) R n \B 1 dy |y| n+2s+ε/2− < ∞ does not depend on $k$ and is finite because 2s + ε/2 − ≥ n/q + ε/2 > 0. Furthermore, by using a change of variables and the first bound in (60), we obtain where $C_{11} := 3^{n+2s} \, 2|B_1| < \infty$. Moreover, again by a change of variables and the second bound in (60), we deduce Note that in the last two estimates we also used that $\rho < 1$ and that ε − + 2s ≥ ε/2. By combining the last three displays and choosing $\rho$ small enough such that we arrive at which proves the second estimate in (59) for $k + 1$. Therefore, (59) is true for any $k \in \mathbb{N}_0$, which in particular also proves (58) under the assumptions (55) and (57), where $\delta$ is chosen as above.
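Two elementary computations underlie Step 1 and may be worth recording explicitly. In the sketch below, $\gamma > 0$ is a generic decay exponent, $\alpha > 0$ a generic Hölder exponent, $K$ a generic constant and $\omega_{n-1}$ the surface measure of the unit sphere; all four are placeholder symbols introduced here. The second display assumes a bound at geometric scales of the form $\sup_{B_{\rho^k}} |u - u(0)| \le K \rho^{k\alpha}$, which is the usual shape of such an induction hypothesis and is not claimed to reproduce (59) verbatim.

```latex
% Tail integrals over the complement of the unit ball, in polar coordinates:
\[
\int_{\mathbb{R}^n \setminus B_1} \frac{dy}{|y|^{n+\gamma}}
= \omega_{n-1} \int_1^\infty t^{n-1}\, t^{-n-\gamma}\, dt
= \omega_{n-1} \int_1^\infty t^{-1-\gamma}\, dt
= \frac{\omega_{n-1}}{\gamma} < \infty
\qquad (\gamma > 0).
\]
% From geometric scales to all radii: if \rho^{k+1} < r \le \rho^k, then
\[
\sup_{B_r} |u - u(0)|
\le \sup_{B_{\rho^k}} |u - u(0)|
\le K \rho^{k\alpha}
= K \rho^{-\alpha} \rho^{(k+1)\alpha}
\le K \rho^{-\alpha} r^{\alpha}.
\]
```

With $\gamma = 2s$ the first identity gives the finiteness of $M_0$; the finiteness of $C_{10}$ follows in the same way once the excess of its exponent over $n$ is known to be positive, as noted in the text. The second chain shows why a bound along the scales $\rho^k$ suffices, at the cost of a factor $\rho^{-\alpha}$ in the constant.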
Step 2: Regularity in a ball. Next, we show the desired higher Hölder regularity in the whole ball $B_{1/2}$. We fix some 0 < ε < and take the corresponding small enough δ from Step 1. We note that A z ∈ L 0 (λ), A z ∈ L 1 (λ, B 1 ) and that L satisfies (4) and (5) with respect to λ. Moreover, u z is a local weak solution of L L A z u z = f z in $B_1$ and by (55) we have Additionally, by (57) Therefore, we are in a position to apply Step 1 to u z , which yields sup x∈B r |u z (x) − u z (0)| ≤ C 1 r −ε , 0 < r < 1.
By rewriting this estimate in terms of u, for any $z \in B_{1/2}$ we obtain sup x∈B r (z) Now fix two points $x, y \in B_{1/2}$. Then applying (66) with $r = \frac{|x - y|}{2} < 1/2$ and $z = (x + y)/2$ yields which proves the desired Hölder regularity of u.
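The passage from an estimate centered at every point to a Hölder bound on pairs of points can be spelled out as follows. This is a sketch under the assumption that (66) has the form $\sup_{B_r(z)} |u - u(z)| \le K r^{\alpha}$ for all $z \in B_{1/2}$ and all admissible radii $r$, where $K$ and $\alpha$ are placeholder symbols for the constant and the exponent appearing there.

```latex
% Midpoint argument: z = (x+y)/2 lies in B_{1/2} by convexity, and
% |x - z| = |y - z| = r with r = |x-y|/2.
\[
|u(x) - u(y)|
\le |u(x) - u(z)| + |u(z) - u(y)|
\le 2 K r^{\alpha}
= 2^{1-\alpha} K\, |x-y|^{\alpha}.
\]
```

Since $x$ and $y$ lie on the boundary of $B_r(z)$, the second inequality uses the supremum bound on the closed ball, which is available because $u$ is continuous; alternatively one may apply the estimate with a slightly larger radius and let it decrease to $r$.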
In order to obtain the estimate (8) in our main results with its precise scaling, we now first prove Theorem 1.2 at scale 1 by using scaling and covering arguments. The general case will then follow by another scaling argument.
Consider the functions $u_1 \in W^{s,2}(B_1) \cap L^1_{2s}(\mathbb{R}^n)$ and $f_1 \in L^q(B_1)$ given by and also where A Rz+x 0 exists for any $z \in B_1$ since in this case we have $Rz + x_0 \in B_R(x_0)$. We note that for any $z \in B_1$ and $r_z := r_{Rz+x_0}/R > 0$, we have (A 1 ) z ∈ L 1 (λ, B r z (z)) and In addition, $u_1$ is a local weak solution of L A 1 u 1 = f 1 in $B_1$. Therefore, by Theorem 4.3 along with some changes of variables, for any $\sigma \in (0, 1)$ we obtain the estimate $\frac{|u(y)|}{|x_0 - y|^{n+2s}}\, dy + R^{2s - \frac{n}{q}} \|f\|_{L^q(B_R(x_0))}$, which proves the estimate (8). Furthermore, since x 0 ∈ is arbitrary, we in particular obtain that u ∈ C α loc ( ).
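The factor $R^{2s - \frac{n}{q}}$ in front of the norm of $f$ can be traced back to the rescaling. The computation below is a sketch assuming the standard choice $f_1(x) := R^{2s} f(Rx + x_0)$ for $x \in B_1$; this definition is our assumption, since the defining display is not reproduced above.

```latex
% Change of variables y = Rx + x_0, dy = R^n dx:
\[
\|f_1\|_{L^q(B_1)}
= R^{2s} \left( \int_{B_1} |f(Rx + x_0)|^q \, dx \right)^{1/q}
= R^{2s} \left( R^{-n} \int_{B_R(x_0)} |f(y)|^q \, dy \right)^{1/q}
= R^{2s - \frac{n}{q}} \, \|f\|_{L^q(B_R(x_0))}.
\]
```

The power $R^{2s}$ in the assumed definition of $f_1$ reflects the order $2s$ of the operator, while the power $R^{-n/q}$ comes from the Jacobian of the change of variables.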
Therefore, by additionally taking into account the symmetry of A, we see that the kernel coefficient defined by $A_z(x, y) := \frac{1}{2} \left( A(x - y + z, z) + A(y - x + z, z) \right)$ satisfies $\|A - A_z\|_{L^\infty(B_{r_z}(z) \times B_{r_z}(z))} \le \delta$ and clearly belongs to the class $L_1(\lambda, B_{r_z}(z))$. Since $z \in B_R(x_0)$ is arbitrary, all assumptions from Theorem 1.2 are satisfied with replaced by $B_R(x_0)$. Therefore, by Theorem 1.2 we see that the estimate (8) holds in any ball B R (x 0 ) . In addition, since x 0 ∈ is arbitrary, we obtain that u ∈ C α loc ( ).
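The claim that the averaged kernel coefficient belongs to the class of locally translation invariant kernel coefficients can be checked by hand. The sketch below assumes that membership in this class amounts to symmetry, translation invariance and the bounds (2); this is our reading, since the definition of the class is given earlier in the paper and not in this section. The vector $h \in \mathbb{R}^n$ is a placeholder introduced here.

```latex
% Symmetry: exchanging x and y exchanges the two summands.
\[
A_z(y,x) = \tfrac12 \bigl( A(y-x+z,z) + A(x-y+z,z) \bigr) = A_z(x,y).
\]
% Translation invariance: only the difference x - y enters.
\[
A_z(x+h, y+h) = \tfrac12 \bigl( A(x-y+z,z) + A(y-x+z,z) \bigr) = A_z(x,y)
\qquad \text{for all } h \in \mathbb{R}^n .
\]
% Ellipticity: an average of two values in [\lambda^{-1}, \lambda].
\[
\lambda^{-1} \le A_z(x,y) \le \lambda
\qquad \text{whenever } \lambda^{-1} \le A(\,\cdot\,, z) \le \lambda
\text{ at the two points involved.}
\]
```

The last line is stated pointwise on purpose: the bounds (2) hold only almost everywhere, so evaluating $A$ along the slice $A(\,\cdot\,, z)$ relies on the continuity assumption on the kernel.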
Acknowledgements The author wants to thank the anonymous referee for careful reading and useful remarks that led to improvements of the paper.
Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License.