Strong Feller Property for SDEs Driven by Multiplicative Cylindrical Stable Noise

We consider the stochastic differential equation $dX_t = A(X_{t-})\,dZ_t$, $X_0 = x$, driven by a cylindrical α-stable process $Z_t$ in $\mathbb{R}^d$, where α ∈ (0,1) and d ≥ 2. We assume that the determinant of $A(x) = (a_{ij}(x))$ is bounded away from zero, and that $a_{ij}(x)$ are bounded and Lipschitz continuous. We show that for any fixed γ ∈ (0,α) the semigroup $P_t$ of the process $X_t$ satisfies $|P_{t} f(x) - P_{t} f(y)| \le c t^{-\gamma /\alpha } |x - y|^{\gamma } ||f||_{\infty }$ for an arbitrary bounded Borel function f. Our approach is based on Levi's method.

It is well known that the SDE (1) has a unique strong solution $X_t$, see e.g. [30, Theorem 34.7 and Corollary 35.3]. By [33, Corollary 3.3], $X_t$ is a Feller process.
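Let $E^x$ denote the expected value of the process X starting from x and $B_b(\mathbb{R}^d)$ denote the set of all bounded Borel functions $f : \mathbb{R}^d \to \mathbb{R}$. For any t ≥ 0, $x \in \mathbb{R}^d$ and $f \in B_b(\mathbb{R}^d)$ the semigroup of X is given by

$P_t f(x) = E^x f(X_t)$. (5)

The main result of this paper is the following theorem.

Theorem 1.1 For any γ ∈ (0, α), τ > 0, t ∈ (0, τ], $x, y \in \mathbb{R}^d$ and $f \in B_b(\mathbb{R}^d)$ we have

$|P_t f(x) - P_t f(y)| \le c\, t^{-\gamma/\alpha} |x - y|^{\gamma} ||f||_{\infty}$, (6)

where c depends on τ, α, d, $\eta_1$, $\eta_2$, $\eta_3$, γ.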
Recently, estimates of the type $|P_t f(x) - P_t f(y)| \le c_t |x - y|^{\gamma} ||f||_{\infty}$ or $|\nabla_x P_t f(x)| \le c_t ||f||_{\infty}$ have been intensively studied for semigroups of solutions of SDEs $dX_t = A(X_{t-})\,dZ_t + b(X_t)\,dt$ driven by Lévy processes Z with jumps. In [37] such results were proved for the Ornstein-Uhlenbeck jump process X. In [38] the above estimates were shown in the case when the driving process Z is a subordinated Brownian motion. There are also known results when the Lévy measure ν of Z satisfies $\nu(dz) \ge c\,\mathbf{1}_{\{|z| \le r\}} |z|^{-d-\alpha}\,dz$ for some α ∈ (0, 2) and c, r > 0, see [27]. Another well explored case is when Z has a nondegenerate diffusion part, as done in [40]. The above estimates were also studied in [32] for solutions of SDEs in an infinite-dimensional Hilbert space driven by an additive cylindrical stable noise (see [32, Theorem 5.7]).
Note that Theorem 1.1 implies the strong Feller property of the semigroup $P_t$. This property also follows from [10], see Remark 4.23. The strong Feller property for SDEs driven by additive cylindrical Lévy processes has been intensively studied recently (see e.g. [11,32,39]). It is worth mentioning that the strong Feller property and gradient estimates for the semigroups associated to SDEs driven by Lévy processes in $\mathbb{R}^d$ with jumps, with absolutely continuous Lévy measures, have been examined for many years (see e.g. [25,28,34,36,38,40,42]).
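The step from Theorem 1.1 to the strong Feller property is short enough to write out. The sketch below uses only the Hölder bound of the theorem and assumes the usual meaning of the terms: $P_t f(x) = E^x f(X_t)$, and strong Feller means that $P_t$ maps bounded Borel functions to bounded continuous functions.

```latex
% Fix t \in (0,\tau], \gamma \in (0,\alpha) and f \in B_b(\mathbb{R}^d).
% Assumption: P_t f(x) = E^x f(X_t), so that |P_t f(x)| \le \|f\|_\infty.
\[
|P_t f(x) - P_t f(y)| \le c\, t^{-\gamma/\alpha}\, |x-y|^{\gamma}\, \|f\|_{\infty}
\xrightarrow[\; y \to x \;]{} 0 ,
\qquad
\sup_{x} |P_t f(x)| \le \|f\|_{\infty} .
\]
% Hence x \mapsto P_t f(x) is bounded and H\"older continuous of order \gamma.
```

Since τ > 0 is arbitrary, the same conclusion holds for every t > 0.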
The SDE (1) (with multiplicative noise) was studied by Bass and Chen in [1]. They proved existence and uniqueness of weak solutions of the SDE (1) under very mild assumptions on the matrices A(x) (i.e. they assumed that A(x) are continuous and bounded in x and nondegenerate for each x). In [24] the SDE (1) was considered for diagonal matrices A(x) whose diagonal coefficients are bounded away from zero and from infinity and are Hölder continuous. Under these assumptions the corresponding transition density $p^A(t, x, y)$ was constructed and Hölder estimates of $x \mapsto p^A(t, x, y)$ were obtained. These estimates imply the strong Feller property of the corresponding semigroup.
The case of non-diagonal matrices A(x), treated in this paper, is much more difficult. We use Levi's method to construct the semigroup $P_t$ and to prove Theorem 1.1. However, there are many problems in applying this method when the multiplicative coefficient A(x) in the SDE (1) may perform a non-trivial rotation. Therefore we had to introduce some new ideas. Below we briefly describe the main steps in our approach.
We divide K into two parts, K = L + R, where L corresponds to the truncated Lévy measure μ(w) dw, the appropriate truncation of the Lévy measure $|w|^{-1-\alpha}\,dw$ (supp(μ) ⊂ [−1, 1]). The definition of the truncated measure is presented at the beginning of Section 2.
Roughly speaking, L corresponds to small jumps of the process X and R corresponds to big jumps of the process X. Our first aim is to construct the heat kernel u(t, x, y) corresponding to the operator L. This is done by using Levi's method, which is heuristically explained below.
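At first we choose a kernel $\tilde u(t, x, y)$ which serves as the first-order approximation of u(t, x, y). By r(t, x, y) we denote the resulting error, that is $u(t, x, y) = \tilde u(t, x, y) + r(t, x, y)$. Put

$q_0(t, x, y) = -\left(\frac{\partial}{\partial t} - L_x\right) \tilde u(t, x, y)$. (8)

Since u(t, x, y) is the heat kernel for L, at least formally, one has

$r(t, x, y) = \sum_{n=0}^{\infty} \int_0^t \int_{\mathbb{R}^d} \tilde u(t - s, x, z)\, q_n(s, z, y)\, dz\, ds$, (9)

where $q_n$ is defined inductively as

$q_n(t, x, y) = \int_0^t \int_{\mathbb{R}^d} q_0(t - s, x, z)\, q_{n-1}(s, z, y)\, dz\, ds$. (10)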
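The heuristic behind Levi's method can be traced through a purely formal computation. The sketch below assumes the standard parametrix conventions, namely $q_0 = -(\partial_t - L_x)\tilde u$ and $\tilde u(t, x, z) \to \delta_z(x)$ as $t \to 0^+$, and ignores all convergence questions; the symbol $\circledast$ is introduced here only for illustration. It is meant as orientation, not as the rigorous argument of Section 3.

```latex
% Notation (assumed for this sketch):
%   (f \circledast g)(t,x,y) = \int_0^t \int_{\mathbb{R}^d} f(t-s,x,z)\, g(s,z,y)\, dz\, ds.
\begin{align*}
(\partial_t - L_x)\, u = 0,\quad u = \tilde u + r
  &\;\Longrightarrow\; (\partial_t - L_x)\, r = -(\partial_t - L_x)\,\tilde u = q_0 ,\\
\text{ansatz } r = \tilde u \circledast q
  &\;\Longrightarrow\; (\partial_t - L_x)\, r = q - q_0 \circledast q ,\\
\text{hence } q = q_0 + q_0 \circledast q
  &\;\Longrightarrow\; q = \sum_{n=0}^{\infty} q_n , \qquad q_n = q_0 \circledast q_{n-1} \ (n \ge 1).
\end{align*}
```

The term q in the second line is the boundary contribution at s = t coming from the initial condition of $\tilde u$, and the last line is the Neumann series of the Volterra-type equation $q = q_0 + q_0 \circledast q$.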
As the approximation $\tilde u(t, x, y)$ we choose the heat kernel of the operator L frozen at $y \in \mathbb{R}^d$. More precisely, we put $\tilde u(t, x, y) = p_y(t, x - y)$, where $p_y(t, \cdot)$ is the heat kernel of $L_y$, the operator L frozen at y (i.e. $\frac{\partial}{\partial t} p_y(t, z) = L_y p_y(t, \cdot)(z)$, t > 0, $z \in \mathbb{R}^d$).

Typically, in many papers using Levi's method, the first step was to obtain precise bounds for $q_0(t, x, y)$ which allow one to estimate $q_n(t, x, y)$ inductively pointwise. In our case it seems impossible to obtain such precise bounds, hence we prove (see Proposition 3.9) some crude estimates for $q_0(t, x, y)$ and its integrals with respect to dx or dy, which are sufficient for our purposes. The main tools to prove Proposition 3.9 are Lemma 3.6 and the estimate (18). This key estimate (18) is proven using the techniques and results from [23], [22] and [35].

After constructing the transition density u(t, x, y) we use the technique developed by Knopova and Kulik [19] to show that $U_t f(x) := \int_{\mathbb{R}^d} u(t, x, y) f(y)\, dy$ satisfies the appropriate heat equation in the so-called approximate setting. In the next step we construct the semigroup $T_t$ for the solution of the SDE (1) (driven by the non-truncated process). Roughly speaking, this construction is based on adding big jumps to the truncated process (Meyer-type construction). Next we show that $T_t f(x)$ satisfies the appropriate heat equation in the approximate setting (see Lemma 4.18), which allows us to prove that the constructed semigroup $T_t$ is in fact the semigroup $P_t$.
Let $K_y$ denote the operator K frozen at a point $y \in \mathbb{R}^d$, and let $p^K_y(t, \cdot)$ be the heat kernel of $K_y$, that is $\frac{\partial}{\partial t} p^K_y(t, z) = K_y p^K_y(t, \cdot)(z)$ for any t > 0, $z \in \mathbb{R}^d$. One may ask why the Levi method is applied to the operator L (corresponding to the truncated Lévy measure) and not to the operator K. The reason is that (in general) the heat kernel $p^K_y(t, \cdot)$ of the operator $K_y$ does not have good integrability properties, i.e. $\int_{\mathbb{R}^d} p^K_y(t, x - y)\, dy = \infty$ for some choices of matrices A(x) and some t > 0, $x \in \mathbb{R}^d$. Therefore it cannot be used as the first-order approximation of the heat kernel of K.
Our current technique is restricted to the case α ∈ (0, 1). The main difficulty for α ∈ [1, 2) is that in that case one has to effectively estimate the expression

where $p_y(t, x)$ is the frozen density for the truncated process (see Section 3 for the precise definition of $p_y(t, x)$). Our crucial estimate (18) allows a suitable estimate of (13) but fails to bound (12) in a way sufficient for our purpose.
We point out that the existence of transition densities p(t, x, y) of the process X is already well known, see [10]. In our paper we obtain a representation of these densities, see Remark 4.26. One may ask about the boundedness of p(t, x, y). It turns out that for some choices of matrices A(x) (satisfying (2), (3), (4)) and for some t > 0, the density p(t, ·, ·) is unbounded (see Remark 4.24).

The paper is organized as follows. In Section 2 we study properties of the transition density of a suitably truncated one-dimensional stable process. These properties are crucial in the sequel. In Section 3 we construct the transition density u(t, x, y) of the solution of (1) driven by the truncated process. We also show that it satisfies the appropriate equation in the approximate setting. In Section 4 we construct the transition semigroup of the solution of (1). We also prove Theorems 1.1 and 1.2.

Preliminaries
All constants appearing in this paper are positive and finite. In the whole paper we fix τ > 0, α ∈ (0, 1), d ∈ N, d ≥ 2, $\eta_1$, $\eta_2$, $\eta_3$, where $\eta_1$, $\eta_2$, $\eta_3$ appear in (2), (3) and (4). We adopt the convention that constants denoted by c (or $c_1$, $c_2$, ...) may change their value from one use to the next. In the whole paper, unless explicitly stated otherwise, we understand that constants denoted by c (or $c_1$, $c_2$, ...) depend on τ, α, d, $\eta_1$, $\eta_2$, $\eta_3$. We also understand that they may depend on the choice of the constants ε and γ. The standard inner product of $x, y \in \mathbb{R}^d$ is denoted by x · y.
For any t > 0, $x \in \mathbb{R}^d$ we define the measure $\sigma_t(x, \cdot)$ by

for any Borel set $A \subset \mathbb{R}^d$. $P^x$ denotes the distribution of the process X starting from $x \in \mathbb{R}^d$.

It is well known that the density of the Lévy measure of the one-dimensional symmetric standard α-stable process is given by $A_\alpha |x|^{-1-\alpha}$. In the sequel we will need to truncate this density. The truncated density will be denoted by $\mu^{(\delta)}$.

At first we find $\mu^{(\delta)}$ for δ = 1. Let $\nu(x) = A_\alpha |x|^{-1-\alpha}$. Observe that the tangent line to the graph of ν(x) at (1, ν(1)) is $l : y - A_\alpha = -A_\alpha (1 + \alpha)(x - 1)$ and it crosses the horizontal axis at $x_0 = 1 + \frac{1}{1+\alpha} < 2$. Let $(x - 2)^2 + (y - r)^2 = r^2$, r > 0, be a circle tangent to the horizontal axis at the point (2, 0) and tangent to the line l. Let $(\tilde x, \tilde y)$ be the point at which the circle touches the line l. It is clear that $1 < \tilde x < x_0$. We put $\mu^{(1)}$

It is obvious that $\mu^{(1)}$ is convex, decreasing, continuously differentiable on (0, ∞) and satisfies all the other requirements if δ = 1. For arbitrary δ > 0 we define $\mu^{(\delta)}(x) = \delta^{-\alpha-1} \mu^{(1)}(x/\delta)$. Since $\nu(x) = \delta^{-\alpha-1} \nu(x/\delta)$, we see that $\mu^{(\delta)}$ possesses all desired properties since $\mu^{(1)}$ does. We also define

By $g^{(\delta)}_t$ we denote the heat kernel corresponding to $G^{(\delta)}$, that is

It is well known that $g^{(\delta)}_t$ belongs to $C^1((0, \infty))$ as a function of t and belongs to $C^2(\mathbb{R})$ as a function of z. We also note that as a function of x it is symmetric and nonincreasing on [0, ∞). This is due to the fact that $\mu^{(\delta)}$ also has that property.
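Two elementary facts used in this construction can be checked by direct computation: the location of the point $x_0$ where the tangent line meets the horizontal axis, and the scaling invariance of ν. The sketch uses only $\nu(x) = A_\alpha |x|^{-1-\alpha}$ as given above.

```latex
% Tangent line to \nu(x) = A_\alpha x^{-1-\alpha} at x = 1:
%   \nu(1) = A_\alpha, \quad \nu'(1) = -A_\alpha (1+\alpha).
\begin{align*}
0 - A_\alpha = -A_\alpha (1+\alpha)(x_0 - 1)
  &\;\Longrightarrow\; x_0 = 1 + \frac{1}{1+\alpha} < 2 \quad (\alpha > 0),\\
\delta^{-\alpha-1}\, \nu(x/\delta)
  = \delta^{-\alpha-1} A_\alpha\, |x|^{-1-\alpha}\, \delta^{1+\alpha}
  &= A_\alpha |x|^{-1-\alpha} = \nu(x).
\end{align*}
```

For α ∈ (0, 1) the first line gives $x_0 \in (3/2, 2)$, which leaves room for the circle tangent to the axis at (2, 0).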
Proof By Lemma 2.1 we get Now the assertion follows from Lemmas 2.1 and 2.2.
Proof First we note that and by the substitution s = aw we have Using (18) we get

. This yields
Now we estimate $I_2$. If $t^{1/\alpha} > 2\delta|a|$ then $I_2 = 0$, so we assume that $t^{1/\alpha} \le 2\delta|a|$, and using (17) we obtain

denote the transition density of the one-dimensional symmetric standard α-stable process. It follows from [2]

for $|s| \ge t^{1/\alpha}$, and using the Chapman-Kolmogorov equation for g

for |x| ≤ |ε|. Let now $|x| \ge \varepsilon \ge 4\delta|a| \ge 2t^{1/\alpha}$. Then |x + s| ≥ |x|/2 for s ≤ 2δ|a|, and we obtain

Proof In the proof we assume that constants c may additionally depend on m and n. We use Theorem 3 of [16]. Let $f(s) = A_\alpha s^{-1-\alpha}$ for s ≤ δ and $f(s) = A_\alpha \delta^{n-1-\alpha} s^{-n}$ for s > δ. It is then obvious that the assumptions (1) and (2) of Theorem 3 in [16] hold and it follows that

Clearly, for |x| ≥ 1 and t ∈ (0, τ] we have $f(|x|) \approx |x|^{-n}$ and 1 + |x|

This implies the assertion of the lemma.

Lemma 2.6
There is a constant C = C(α) such that for δ ∈ (0, 1], a ≥ 0, and any t > 0,

Proof We have

Next,

In the sequel we will need a version of the inverse map theorem for a Lipschitz function $f : \mathbb{R}^n \to \mathbb{R}^n$, n ∈ N. The corresponding theorem is the main result in [9]; however, it is not formulated in a way suitable for our purpose. Below, closely following the arguments from [9], we provide the version we need. It is well known that the Jacobi matrix $J_f(y)$ of f exists for almost every y. For any $y_0 \in \mathbb{R}^n$ we define (see Definition 1 in [9]) the generalized Jacobian, denoted $\partial f(y_0)$, as the convex hull of the set of matrices which can be obtained as limits of $J_f(y_n)$ when $y_n \to y_0$.
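A one-dimensional illustration may help fix the definition of the generalized Jacobian. The function below is chosen here purely as an example and does not appear in the paper.

```latex
% Illustration only (n = 1): f(y) = |y| is Lipschitz and differentiable for y \neq 0.
\[
J_f(y) = \operatorname{sign}(y) \ \ (y \neq 0), \qquad
\partial f(0) = \operatorname{conv}\{-1, 1\} = [-1, 1], \qquad
\partial f(y_0) = \{\operatorname{sign}(y_0)\} \ \ (y_0 \neq 0).
\]
```

Here $0 \in \partial f(0)$, so the generalized Jacobian at the origin contains a singular matrix, in line with the fact that f is not injective near 0. Lemma 2.7 below rules out this situation by a quantitative nondegeneracy assumption.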
We denote by B(x, r) the open ball with center $x \in \mathbb{R}^n$ and radius r > 0. For any matrix M we denote by $||M||_{\infty}$ the maximum of the absolute values of its entries.

Lemma 2.7
Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a Lipschitz map and $x \in \mathbb{R}^n$. Suppose that for any $y \in \mathbb{R}^n$ the generalized Jacobian ∂f(y) consists of matrices which can be represented as M(x) + R, where the matrices M(x), R satisfy the following conditions: there are positive β and η such that $||R||_{\infty} \le \eta |x - y|$ and $|v M(x)^T| \ge 2\beta$ for every $v \in \mathbb{R}^n$, |v| = 1. Then f is injective on B(x, β/(nη)) and we have $B(f(x), \beta^2/(2n\eta)) \subset f(B(x, \beta/(n\eta)))$.
Proof Let v be an arbitrary unit vector in $\mathbb{R}^n$. Let M ∈ ∂f(y) and let $z = v M(x)^T$. Since $M^T = M(x)^T + R^T$, the scalar product of z and $w = v M^T = z + v R^T$ can be estimated as follows

Next, taking $w^* = z/|z|$ we have for |x − y| ≤ β/(nη),

Using this fact we can apply Lemma 3 and Lemma 4 of [9] to claim that for every $y_1$,

which shows that f is injective in the ball B(x, β/(nη)). Next, by similar arguments, we show that

which proves that all matrices from the set ∂f(y) are of full rank if |y − x| ≤ β/(nη). Finally, we can apply Lemma 5 of [9] to show that the image under f of the ball B(x, β/(nη)) contains the ball $B(f(x), \beta^2/(2n\eta))$.
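The full-rank claim in the proof above can be traced through a short estimate. The sketch assumes that $||R||_{\infty}$ is the largest absolute value of an entry of R and that |·| is the Euclidean norm; the displays of the original proof may be organized differently.

```latex
% Assumption: \|R\|_\infty = \max_{i,j} |r_{ij}|, so the Frobenius norm of the
% n x n matrix R is at most n \|R\|_\infty. Here v is a unit vector.
\begin{align*}
|v R^T| &\le \Big( \sum_{i,j} r_{ij}^2 \Big)^{1/2} |v|
          \le n\, \|R\|_{\infty} \le n \eta\, |x - y| ,\\
|v M^T| &= |v M(x)^T + v R^T| \ge |v M(x)^T| - |v R^T|
          \ge 2\beta - n \eta\, |x - y| \ge \beta
\quad \text{whenever } |x - y| \le \beta/(n\eta).
\end{align*}
```

Since the lower bound holds for every unit vector v, the matrix $M^T$ has trivial kernel, so every M ∈ ∂f(y) is invertible for such y.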

Construction and Properties of the Transition Density of the Solution of (1) Driven by the Truncated Process
The approach in this section is based on Levi's method (cf. [12,26,29]). This method was applied in the framework of pseudodifferential operators by Kochubei [20] to construct a fundamental solution to the related Cauchy problem as well as the transition density for the corresponding Markov process. In recent years it was used in several papers to study transition densities of Lévy-type processes, see e.g. [5,7,8,13,15,17-19,21]. Levi's method was also used to study gradient and Schrödinger perturbations of fractional Laplacians, see e.g. [4,6,41].
Let us fix ε ∈ (0, 1] (it will be chosen later). Recall that for given ε the constant δ is chosen according to Lemma 2.1. For such fixed ε, δ we abbreviate t depends also on τ > 0 which we keep fixed.
Let us recall that Our first aim in this section will be to construct the heat kernel u(t, x, y) corresponding to the operator L. This will be done by using Levi's method.
Note that the coordinates of B(x) satisfy the conditions (2) and (4) with possibly different constants $\eta_1^*$ and $\eta_3^*$, but taking maximums we can assume that $\eta_1^* = \eta_1$ and $\eta_3^* = \eta_3$. We also denote $||B||_{\infty} = \max\{|b_{ij}| : i, j \in \{1, \ldots, d\}\}$. For any t > 0, $x, y \in \mathbb{R}^d$ we define

Recall that $L_y$ is the "freezing" operator given by (11). It may be easily checked that for each fixed $y \in \mathbb{R}^d$ the function $p_y(t, \cdot)$ is the heat kernel of $L_y$, that is $\frac{\partial}{\partial t} p_y(t, z) = L_y p_y(t, \cdot)(z)$ for t > 0, $z \in \mathbb{R}^d$. For any t > 0, $x, y \in \mathbb{R}^d$ we also define

Now we use the approach which was explained in detail in the Introduction. The approximation $\tilde u(t, x, y)$ of u(t, x, y) is defined by $\tilde u(t, x, y) = p_y(t, x - y)$. For $x, y \in \mathbb{R}^d$, t > 0, the kernel $q_0(t, x, y)$ is given by (8). We have

hence by (24) we get

For $x, y \in \mathbb{R}^d$, t > 0 and n ∈ N the kernels $q_n(t, x, y)$ are given by (10). For $x, y \in \mathbb{R}^d$, t > 0 we define

The kernel u(t, x, y) is given by (9). We have

In this section we will show that $q_n(t, x, y)$, q(t, x, y), u(t, x, y) are well defined and we will obtain estimates of these functions. First, we will get some simple properties of $p_y(t, x)$ and $r_y(t, x)$.
Using the definition of p y (t, x) and properties of g t (x) we obtain the following regularity properties of p y (t, x).
Proof The estimates follow from Lemma 2.5 and the same arguments as in the proof of (26).
There is a positive $\varepsilon_0 = \varepsilon_0(\eta_1, \eta_3, \eta_4, \eta_5, d) \le \frac{1}{2\eta_5}$ such that the map x and its Jacobian determinant denoted by J x have the property | x (w, y)| ≤ 1,

Moreover, the map y and its Jacobian determinant denoted by J y have the property | y (w, x)| ≤ 1,

Proof In the proof we assume that constants c may additionally depend on $\eta_4$, $\eta_5$. We prove the statement for the map x only.
Proof In the proof we assume that constants c may additionally depend on η 4 , η 5 .

This implies that
Note that the functionsb i =b i (x, y) have the same properties (29,30) as b * i . To evaluate the integral |x−y|≤ε 0 A 1 l dy, which in fact is an integral with respect to dw dy, we introduce new variables (w, ξ ) in R d+1 , given by (w, ξ ) = . We recall that x was defined in Lemma 3.6. Note that the vector ξ = (ξ 1 , . . . , ξ d ) can be written as From this we infer that |w||x − y| ≤ c(|ξ | + |w|)|w|.
Let Q x = {(w, y) : |y − x| ≤ ε 0 , |w| ≤ ε 0 }. Due to Lemma 3.6, almost surely on Q x , the absolute value of the Jacobian determinant of the map x is bounded from below and above by two positive constants and x is an injective transformation.
Observing that the support of the density μ is contained in [−ε 0 , ε 0 ] and then applying the above change of variables, we have where the last equality follows from the general change of variable formula for injective Lipschitz maps (see e.g. [14,Theorem 3]). Since |ξ | ≤ 1 for (w, ξ ) ∈ V x , we get Applying Lemma 2.6 we have Finally, Similarly we obtain |y−x|≤ε 0 A 2 l dy ≤ ct −1/2 , which completes the proof of the first bound. To estimate |y−x|≤ε 0 A l dx we proceed exactly in the same way.

y + a i (x)w)−p y (t, x −y + a i (y)w) μ(w) dw.
For i = 1, . . . , d we put We have q 0 (t, x, y) = R 1 + . . . + R d . It is clear that it is enough to handle R 1 alone. Note that We will use the following abbreviations Note that k 10 = 1 and k i0 = 0, 2 ≤ i ≤ d.
Using similar arguments as in the proof of Proposition 3.9 we obtain the following result.

Proposition 3.10 For any
For any δ 1 > 0, We have lim uniformly with respect to x ∈ R d .
Note that the coordinates of the matrix B(y) have partial derivatives y almost surely, bounded uniformly. We can calculate the absolute value of the Jacobian determinant J˜ x (y), y almost surely, as Next,

Lemma 3.11
For any t > 0, $x \in \mathbb{R}^d$ and n ∈ N the kernel $q_n(t, x, y)$ is well defined. For any t ∈ (0, τ], $x \in \mathbb{R}^d$ and n ∈ N we have

For any t ∈ (0, τ], $x, y \in \mathbb{R}^d$ and n ∈ N we have

$|q_n(t, x, y)| \le \frac{c_1 c_2^n\, t^{n/2 - 1}}{(n!)^{1/2}\, t^{d/\alpha}}$.
Proof By Proposition 3.9, there is a constant $c^*$ such that for any $x, y \in \mathbb{R}^d$, t ∈ (0, τ] we have

It follows from (47) that there is p ≥ 1 such that for n ∈ N,

We define $c_1 = p c^* \ge c^*$ and $c_2 = 2^{d/\alpha + 1} c_1 (2 + p) > c_1$.
We will prove (48), (49), (50) simultaneously by induction. They are true for n = 0 by (52), (54), (55) and the choice of $c_1$. Assume that (48), (49), (50) are true for some n ∈ N; we will show them for n + 1. By the definition of $q_n(t, x, y)$ and the induction hypothesis we obtain

Hence we get (50) for n + 1. In particular this gives that the kernel $q_{n+1}(t, x, y)$ is well defined.
By the definition of q n (t, x, y), (54) and the induction hypothesis we obtain
Using our induction hypothesis, (48) and (49) we get for |x − y| ≥ n + 2, which proves (51) for n + 1 since by the choice of constants 2c 1 c n+1 By standard estimates, one easily gets where C 1 depends on C.
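A factor of the form $(n!)^{-1/2}$ in bounds of this kind typically comes from iterating a time singularity of order $t^{-1/2}$. The following sketch is not the argument of the proof above, which also tracks the space variable and the constant p. It only assumes a bound $Q_0(t) \le c\, t^{-1/2}$ for the quantity $Q_n(t) := \sup_x \int_{\mathbb{R}^d} |q_n(t, x, y)|\, dy$, a name introduced here for illustration, and shows how the factorial decay emerges from the recursion $q_n = q_0 \circledast q_{n-1}$.

```latex
% Assumption: Q_0(t) \le c\, t^{-1/2}, where Q_n(t) := \sup_x \int |q_n(t,x,y)|\, dy
% (Q_n is a hypothetical name used only in this sketch).
\begin{align*}
Q_{n+1}(t) &\le \int_0^t Q_0(t-s)\, Q_n(s)\, ds, \qquad
\int_0^t (t-s)^{-1/2}\, s^{\frac{n+1}{2} - 1}\, ds
   = t^{\frac{n}{2}}\, B\!\left(\tfrac12, \tfrac{n+1}{2}\right),\\
Q_n(t) &\le c^{\,n+1}\, t^{\frac{n-1}{2}} \prod_{k=0}^{n-1} B\!\left(\tfrac12, \tfrac{k+1}{2}\right)
        = \frac{(c \sqrt{\pi})^{\,n+1}}{\Gamma\!\left(\frac{n+1}{2}\right)}\; t^{\frac{n-1}{2}} .
\end{align*}
```

The product telescopes because $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ and $\Gamma(1/2) = \sqrt{\pi}$. By the duplication formula, $\Gamma\!\left(\frac{n+1}{2}\right) \Gamma\!\left(\frac{n}{2} + 1\right) = 2^{-n} \sqrt{\pi}\, n!$, so $1/\Gamma\!\left(\frac{n+1}{2}\right)$ is bounded by $C_1 C_2^n (n!)^{-1/2}$ for suitable constants $C_1$, $C_2$, which is the shape of the bound in Lemma 3.11.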

|q(s, z, y)| dz ds.
For 0 < s < t/2 we have

and, by Proposition 3.12, for t/2 < s < t,

Hence,

where (58) and Proposition 3.10 were applied to estimate the integrals with respect to the space variable. Let a be the constant found in Proposition 3.12. Assume that |x − y| ≥ 1 + a. By Corollary 3.3, for 0 < s < t, we have

Proposition 3.12 implies that for 0 < s < t,

Hence,

Combining (61) and (62) we obtain the desired pointwise estimates of u(t, x, y). Next, (59) and (60) immediately follow from (57), (58) and Proposition 3.10.
Proof We estimate the term for i = 1. By Lemma 3.1, for γ = 1, we get for $w \in \mathbb{R}$

Recall that, if |w| ≥ 2δ, then μ(w) = 0. So we may assume that |w| ≤ 2δ. By Corollary 3.3, we get

Proof The lemma follows easily from Proposition 3.10.
For any t > 0, x, y ∈ R d we define Clearly we have u(t, x, y) = p y (t, x − y) + ϕ y (t, x).
Now, following ideas from [19], we will define the so-called approximate solutions.
. By the same arguments as in the proof of Corollary 3.13 we obtain the following result.
By the same arguments as in the proof of Lemma 3.19 (iv) we obtain the following result.
The next lemma is similar to [19,Lemma 4.2].

Construction and Properties of the Semigroup of X t
Let us introduce the following notation Note that by (7), for any x ∈ R d and f ∈ B b (R d ), we have We denote, for any x ∈ R d and f ∈ B b (R d ), For any t ≥ 0, ξ ∈ [0, 1], x ∈ R d and n ∈ N, We observe that n,t . Finally, we remark that all these functions have nothing in common with the functions or x used in Section 3.

Lemma 4.1 n,t f (x) and
(ξ ) Proof We will only show the result for n,t f (x) using the induction. The proof for n,t f (x) is almost the same.
Let c be the constant from (59) and put c 1 = (λ ∨ 1)c. For n = 0 (87) follows from (59). Assume that (87) holds for n ≥ 0, we will show it for n + 1. Indeed, applying (59) and (82), we get For any x ∈ R d we define n,t f (x), t > 0, Our ultimate aim will be to show that for any t > 0 we have T t = P t , where P t is given by (5).
By Lemma 4.1, we obtain Next, we obtain the following regularity results concerning operators T t .

Theorem 4.3 For any
Proof For any t ∈ (0, τ ], x ∈ R d by Corollary 3.13 we get Using the same arguments as in Lemma 4.1 for any t ∈ (0, τ ], x ∈ R d , n ∈ N, n ≥ 1 one gets | n,t f (x)| ≤ c n t n−1 f 1−γ ∞ f γ 1 /(n−1)!, which implies the assertion of the theorem.
Let t ∈ (0, τ ], n ∈ N, i, j ∈ {1, . . . , d}. The fact that ∂ ∂x i (ξ ) n,t f (x) is well defined and continuous as a function of x ∈ R d follows from (101), Lemmas 3.4, 3.5, Proposition 3.12 and Lemma 3.19. By the above arguments, we also get (vi) there exists a nonnegative function p(t, x, y) in (t, x, y) ∈ (0, ∞) × R d × R d ; for each fixed t > 0, x ∈ R d the function y → p(t, x, y) is Lebesgue measurable, equation (1) has the Feller property, see eg. [33]. This implies that the sequence of measures m n (dy) = p(t, x n , y)dy is weakly convergent to m 0 (dy) = p(t, x 0 , y)dy. Hence the set of the truncated measures 1 m n (dy) is weakly convergent to 1 m 0 (dy) provided that ∂ has the Lebesgue measure 0. By compact embedding any subsequence {x k n } has a subsequence {x l n } such that p(t, x l n , ·)1 is convergent to q(·) in L 1 ( ). Due to the weak convergence of the truncated measures we have that q(·) = p(t, x 0 , ·) in L 1 ( ). This implies lim sup n→∞ f (y)p t (x n , y)dy − f (y)p t (x 0 , y)dy ≤ lim sup provided is large enough. Indeed, sup n≥0 m n ( c ) can be made arbitrarily small by the tightness of {m n ; n ≥ 0}, which follows from some moment estimates on X t . This shows that the above limit must be 0. It is worth noticing that this argument works in our setup for all 0 < α < 2. On the other hand we do not think that from the results of [10] one can obtain our main estimate (6).

Remark 4.24
For any α ∈ (0, 1), d ≥ 2 there exist A(x) satisfying (2)-(4) and t > 0 such that $P_t : L^1(\mathbb{R}^d) \to L^{\infty}(\mathbb{R}^d)$ is not bounded. For simplicity we will present an example for d = 2, but similar examples can be constructed for d > 2.