Abstract
Let \((P_t)\) be the transition semigroup of the Markov family \((X^x(t))\) defined by the SDE
where \(Z=\left( Z_1, \ldots , Z_d\right) ^*\) is a system of independent real-valued Lévy processes. Using the Malliavin calculus we establish the following gradient formula
where the random field Y does not depend on f. Moreover, in the important cylindrical \(\alpha \)-stable case \(\alpha \in (0,2)\), where \(Z_1, \ldots , Z_d\) are \(\alpha \)-stable processes, we are able to prove sharp \(L^1\)-estimates for Y(t, x). Uniform estimates on \(\nabla P_tf(x)\) are also given.
1 Introduction
Let \((P_t)\) be the transition semigroup of a Markov family \(X=(X^x(t))\) on \({\mathbb {R}}^d\), that is
In this paper X is given by the stochastic differential equation
where \(b:{\mathbb {R}}^d\rightarrow {\mathbb {R}}^d\) is a Lipschitz mapping of class \(C^2({\mathbb {R}}^d,{\mathbb {R}}^d)\) and
is a Lévy process in \({\mathbb {R}}^d\). We assume that \(Z_j, \ j=1,\ldots , d\), are independent real-valued Lévy processes. We denote by \(m_j\) the Lévy measure of \(Z_j\). Recall that
We assume that each \(Z_j\) is of purely jump type
where \(\Pi _j(\mathrm{d}s, \mathrm{d}\xi )\) is a Poisson random measure on \([0,+\infty )\times {\mathbb {R}}\) with intensity measure \(\mathrm{d}s\, m_j(\mathrm{d}\xi )\).
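For orientation, we record the form that the display (2) presumably takes; this is our reconstruction, based on the additive structure of the noise used throughout (cf. Sect. 4):

```latex
\mathrm{d}X^x(t) = b\left( X^x(t)\right) \mathrm{d}t + \mathrm{d}Z(t), \qquad X^x(0)=x\in {\mathbb {R}}^d .
```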
The main aim of this article is to establish the following gradient formula
where the random field Y does not depend on f. Gradient formulae of this type date back to [5, 10] and are frequently called the Bismut–Elworthy–Li formulae. Note that [5] uses an approach based on the Girsanov transformation, whereas [10] introduces martingale methods to derive formulae like (4) in the Gaussian setting; this approach also works for jump diffusions with a non-degenerate Gaussian component (cf. Section 5 in [20]).
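For the reader's convenience we record, componentwise (recall that gradients are rows), the form that the display (4) presumably takes; this reconstruction is inferred from Theorem 1 and Remark 3:

```latex
\frac{\partial }{\partial x_j} P_t f(x) = {\mathbb {E}}\left[ f\left( X^x(t)\right) Y_j(t,x)\right] , \qquad j=1,\ldots ,d,\ t>0,\ x\in {\mathbb {R}}^d ,
```

for \(f\in B_b({\mathbb {R}}^d)\), with Y as in Theorem 1.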
One important consequence of (4) is the strong Feller property of the semigroup \((P_t)\), e.g. [7,8,9, 18], which in particular motivates our interest in this topic. Moreover, such gradient formulae allow the computation of Greeks for pay-off functions in mathematical finance, e.g. [6, 11]. In particular, in [11] the authors apply the Malliavin calculus on the Wiener space to the sensitivity analysis of asset price dynamics models.
For Lévy-driven SDEs with a possibly degenerate Gaussian component, the Bismut–Elworthy–Li formula has been obtained in [21] under the assumption that the Lévy measure has a density with respect to Lebesgue measure on \({\mathbb {R}}^d\); see also [22, 23] for the Bismut–Elworthy–Li formula for an SDE driven by a subordinated Brownian motion. In our study we focus on the more difficult situation, where the noise is given by a collection of one-dimensional Lévy processes, and is thus quite singular.
In plain words, the substantial complication in our case is that the class of random vector fields which are “admissible” for the noise, in the sense that they allow an integration-by-parts formula, is much more restricted. Namely, in our case only the “coordinate axis” differentiability directions in \({\mathbb {R}}^d\) are allowed, while in the case of a Lévy measure with a density there is no limitation on these directions. For the first advances in the Malliavin calculus for Lévy noises supported by a (singular) collection of curves, we refer to [16].
In the important cylindrical \(\alpha \)-stable case (i.e., when each \(Z_j\) is \(\alpha \)-stable) with \(\alpha \in (0,2)\) we obtain the sharp estimate
The method we use to obtain (5) seems to be of independent interest. It has two main steps. The first is a bound for \( {\mathbb {E}}\left| Y(t)-Y(t,x)\right| \), where Y(t) corresponds to Y(t, x) when \(b=0\) in (2), i.e., \(X^x(t) = x + Z(t)\). The second step concerns \({{\mathbb {E}}}\left| Y(t)\right| \) (see Sect. 8). Both steps require sharp estimates and are quite involved (see in particular Sects. 6.2 and 8). Formula (5) implies the bound (\(\Vert \cdot \Vert _\infty \) stands for the supremum norm)
It seems that estimate (6) is also new when \(0 < \alpha \le 1\); it cannot be obtained by a perturbation argument, which is available when \(\alpha >1\). In fact we will establish (6) for any process Z whose small jumps are similar to those of an \(\alpha \)-stable process. Recall that estimates like (6) for \(\alpha >1\) hold even in some non-degenerate multiplicative cases (see Theorem 1.1 in [15]; in that result the Lipschitz case \(\gamma =1\) requires \(\alpha >1\)). We expect that our approach should also work for SDEs with multiplicative cylindrical noise; such an extension is a subject of our ongoing research.
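We record the form that the displays (5) and (6) presumably take; this reconstruction is inferred from the sharp rate established in Sect. 8 for the cylindrical \(\alpha \)-stable case:

```latex
{\mathbb {E}}\left| Y(t,x)\right| \le C_T\, t^{-\frac{1}{\alpha }}, \qquad \Vert \nabla P_t f\Vert _\infty \le C_T\, t^{-\frac{1}{\alpha }}\, \Vert f\Vert _\infty , \qquad t\in (0,T],
```

with \(C_T\) independent of x, t and f.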
Let us mention that from the analytical point of view we are concerned with the gradient estimates of the solution to the following equation with a non-local operator
\(u(0,x) =f(x)\), where \(e_j, j=1, \dots , d\), is the canonical basis of \({\mathbb {R}}^d\).
2 Main result
Let \(Q_tf(x)={\mathbb {E}}\, f(Z^x(t))\) be the transition semigroup corresponding to the Lévy process \(Z^x(t)= x + Z(t)\). The proof of the following theorem, concerning BEL formulae for \((P_t)\) and \((Q_t)\), is postponed to Sect. 6.
Theorem 1
Let \(P=(P_t)\) be given by (1), (2). Assume that:
(i)
\(b\in C^2({\mathbb {R}}^d, {\mathbb {R}}^d)\) has bounded derivatives \(\frac{\partial b_i}{\partial \xi _j}\), \(\frac{\partial ^2 b_i}{\partial \xi _j\partial \xi _k}\), \(i,j,k=1,\ldots , d\).
(ii)
There is a \(\rho >0\) such that
$$\begin{aligned} \liminf _{\varepsilon \downarrow 0} \varepsilon ^{\rho } m_j\{|\xi |\ge \varepsilon \}\in (0,+\infty ], \quad j=1,\ldots , d. \end{aligned}$$
(iii)
There exists \(r>0\) such that each \(m_j\) restricted to the interval \((- r, r)\) is absolutely continuous with respect to Lebesgue measure. Moreover, the density \(\rho _j= \frac{\mathrm{d}m_j}{\mathrm{d}\xi }\) is of class \(C^1((-r,r){\setminus } \{0\})\) and there exists \(\kappa >1\) such that for all j,
$$\begin{aligned} \int _{- r}^{r} |\xi |^\kappa \rho _j(\xi )\,\mathrm{d}\xi&<+\infty , \end{aligned}$$(7)
$$\begin{aligned} \int _{-r} ^{r} |\xi |^{2\kappa } \left( \frac{\rho '_j(\xi )}{\rho _j(\xi )}\right) ^2 \rho _j(\xi )\,\mathrm{d}\xi&<+\infty , \end{aligned}$$(8)
$$\begin{aligned} \int _{-r} ^{r} |\xi |^{2\kappa -2} \rho _j(\xi )\,\mathrm{d}\xi&<+\infty . \end{aligned}$$(9)
Then there are integrable random fields \(Y(t)= \left( Y_1(t), \ldots , Y_d(t)\right) \), and \(Y(t,x)= \left( Y_1(t,x),\ldots , Y_d(t,x)\right) \), \(t> 0\), \(x\in {\mathbb {R}}^d\), such that for any \(f\in B_b({\mathbb {R}}^d)\), \(t>0\), \(x\in {\mathbb {R}}^d\),
and the Bismut–Elworthy–Li formula (4) for \((P_t)\) holds. Moreover, for any \(T >0\) there is a constant C, independent of \(t \in (0,T]\) and of x, such that
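We record the form that the displayed bounds (10) and (11) presumably take; this reconstruction is inferred from Remark 1 (the rate \(-\frac{\kappa }{\rho }+\frac{1}{2}\)) and Remark 6 (the rate \(-\frac{\kappa }{\rho }+\frac{3}{2}\) for the difference term):

```latex
{\mathbb {E}}\left| Y(t)\right| + {\mathbb {E}}\left| Y(t,x)\right| \le C\, t^{-\frac{\kappa }{\rho }+\frac{1}{2}}, \qquad
{\mathbb {E}}\left| Y(t,x)-Y(t)\right| \le C\, t^{-\frac{\kappa }{\rho }+\frac{3}{2}}.
```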
Remark 1
Note that, as expected, the rate \(-\frac{\kappa }{\rho }+ \frac{1}{2}\) depends only on the small jumps of Z.
Remark 2
In fact we have formulae for the fields appearing in Theorem 1. Namely,
where:
-
the matrix-valued random fields \(A(t)= [A_{k,j}(t)]\in M(d\times d)\) and \(A(t,x)=[A_{k,j}(t,x)]\in M(d\times d)\) are given by
$$\begin{aligned} \begin{aligned} A(t)&= \left[ {\mathbb {D}} Z(t)\right] ^{-1}, \\ A(t,x)&= \left[ {\mathbb {D}} X^x(t)\right] ^{-1}\nabla X^x(t), \qquad {\mathbb {P}}\text {-a.s.} \end{aligned} \end{aligned}$$(13)
We note that the matrix A(t) is diagonal with entries
$$\begin{aligned} A_{j,j}(t)=\left( \int _0^t \int _{{\mathbb {R}}} V_j(s,\xi _j)\Pi _j(\mathrm{d}s,\mathrm{d}\xi _j)\right) ^{-1}. \end{aligned}$$
-
\({\mathbb {D}} Z(t)\) and \({\mathbb {D}} X^x(t)\) are the Malliavin derivatives (see Sect. 3 and formulae (27) and (29)) of Z(t) and \(X^x(t)\) respectively, with respect to the field \(V= (V_1,\ldots , V_d)\),
$$\begin{aligned} V_j(t,\xi )= \phi _{\delta }(\xi _j)\psi _{{\delta }}(t) = V_j(t,\xi _j). \end{aligned}$$(14)
Here \(\psi _{{\delta }}\in C^\infty ({\mathbb {R}})\) and \(\phi _\delta \in C^\infty ({\mathbb {R}}{\setminus }\{0\})\) are non-negative functions such that
$$\begin{aligned} \psi _{{\delta }}(z)={\left\{ \begin{array}{ll} 0&{}\text {if }|z|\ge {\delta },\\ 1&{}\text {if }|z|\le \frac{\delta }{2} \end{array}\right. }, \qquad \phi _{\delta }(z)= |z|^{\kappa }\psi _{ \delta }(z), \end{aligned}$$(15)
with \(\kappa \) appearing in assumption (iii) of Theorem 1, and
$$\begin{aligned} \delta \in (0, r] \;\; \text { small enough.} \end{aligned}$$
-
\(\nabla X^x(t)\) is the derivative in probability of \(X^x\) with respect to the initial condition x,
-
\(D_{k}^*{\mathbf {1}}(t)\) is the adjoint derivative operator calculated on the constant function \({\mathbf {1}}\), see Sect. 3, Lemma 3.
Remark 3
The fields Y(t) and Y(t, x) are not uniquely determined by the BEL formulae. In particular, the BEL formula for \((Q_t)\) holds with Y(t) replaced by \(Y(t)+ \eta (t)\), where \(\eta (t)\) is any zero-mean random variable independent of \(Z^x(t)\). Note that the conditional expectations \({\mathbb {E}}\left( Y(t)\vert Z^x(t)\right) \) and \({\mathbb {E}}\left( Y(t,x)\vert X^x(t)\right) \) are uniquely determined. On the other hand, \({\mathbb {E}}\left| Y(t)\right| \) and \({\mathbb {E}}\left| Y(t,x)\right| \) may depend on the choice of the fields.
Estimate (10) implies new uniform gradient estimates
Although (16) is quite general, it is not sharp in the relevant cylindrical \(\alpha \)-stable case with \(\alpha \in (0,2)\). In this case \(\rho =\alpha \) and \(\kappa \) is any real number satisfying \(\kappa > 1+ \frac{\alpha }{2}\). Therefore we only get that for any \(\varepsilon >0\) and \(T<+\infty \) there is a constant \(C_{\varepsilon ,T}\) such that for any \(f\in B_b({\mathbb {R}}^d)\),
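The rate in (17) can be checked directly: substituting \(\rho = \alpha \) and \(\kappa = 1+\frac{\alpha }{2}+\varepsilon '\) into the exponent of (16) gives

```latex
-\frac{\kappa }{\rho }+\frac{1}{2} \;=\; -\frac{1+\alpha /2+\varepsilon '}{\alpha }+\frac{1}{2} \;=\; -\frac{1}{\alpha }-\frac{\varepsilon '}{\alpha },
```

so that (17) presumably reads \(\Vert \nabla P_t f\Vert _\infty \le C_{\varepsilon ,T}\, t^{-\frac{1}{\alpha }-\varepsilon }\Vert f\Vert _\infty \), \(t\in (0,T]\), with \(\varepsilon = \varepsilon '/\alpha \); this identification of (17) is our reconstruction.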
We will improve the previous estimate in Sect. 8 by considering \(\varepsilon =0\). To this purpose we will also use the next remark.
Remark 4
Our main theorem also provides estimate (11) for \({\mathbb {E}}\left| Y(t,x)- Y(t)\right| \). This can be useful. Indeed, if for some specific Lévy processes \(Z_j\) we have
or even if \({\mathbb {E}}\left| {\mathbb {E}}\left( Y(t)\vert X^x(t)\right) \right| \le C_Tt^{-\eta }\) for some \(\eta \) such that
where \(\kappa \) verifies our assumptions, then we can improve (10) and get, for \(t \in (0,T]\),
By (19) one deduces
In particular when \(Z_j\) are independent real \(\alpha \)-stable processes, \(\alpha \in (0,2)\), we will get in Sect. 8 the crucial estimate
Combining (11) with (20) we deduce in the cylindrical \(\alpha \)-stable case
(where \(C'_{T}\) is independent of x and t) and the sharp gradient estimate
Remark 5
The time-dependent case could also be considered. This is the case when the drift b(x) is replaced by b(t, x) (assuming that \(b: [0,T] \times {{\mathbb {R}}}^d \rightarrow {{\mathbb {R}}}^d\) is Borel and verifies \(|b(t,x)| \le C(1+ |x|)\), \(b(t, \cdot ) \in C^2({{\mathbb {R}}}^d, {{\mathbb {R}}}^d)\) with all spatial derivatives bounded uniformly in \(t \in [0,T]\)). In this situation one deals with a time-dependent Markov semigroup \((P_{st})\). Fixing \(s \in [0,T)\) one could obtain a formula for \(\nabla (P_{st} f)(x)\) with \(s < t \le T\), \(f \in {\mathcal B}_b({{\mathbb {R}}}^d)\), which generalizes (4). The strategy is basically the same as in this paper but the computations would be much more involved.
As mentioned in the introduction, a difficulty in the proof of Theorem 1 is to show that the Malliavin derivative of the solution \({\mathbb {D}} (X^x(t))\) in the direction of a suitable random field V is invertible and that the inverse is integrable with a sufficiently large power. The idea (see the proof of our Lemma 5) is to show that \({\mathbb {D}} (X^x(t))\approx {\mathbb {D}}Z(t)\), where \({\mathbb {D}}Z(t)\) is a diagonal matrix with the terms \(\int _0^t \int _{{\mathbb {R}}} V_j(s,\xi _j)\Pi _j(\mathrm{d}s,\mathrm{d}\xi _j)\) on the diagonal. Therefore the integrability of \(\left( {\mathbb {D}} (X^x(t))\right) ^{-1}\) follows from the known fact (see Sect. 5) that
On the other hand, several technical difficulties arise in proving the sharp bounds for \( {\mathbb {E}}\left| Y(t)-Y(t,x)\right| \) and \({{\mathbb {E}}}\left| Y(t)\right| \).
Finally, we mention that an attempt to prove (4) was made in [4] by the martingale approach used in [21] (see, in particular, Lemma A.3 in [4]). However, the BEL formula in [4] does not seem to be correct, since there is a gap in the proof, in passing from formula (48) to (49) on page 1450 of [4], which consists of an invalid application of the chain rule. It seems that the complication here is substantial, and it is difficult to adapt directly the approach used in [21] to the current setting, where, because of the singularity of the noise, it is hard to guarantee invertibility of the Malliavin derivative with respect to one vector field. Exactly this crucial point is our reason to use a matrix-valued Malliavin derivative of the solution with respect to a vector-valued field \(V= (V_1,\ldots , V_d)\).
3 Malliavin calculus
In this section we adapt, in a very direct way, the classical concepts and results of Bass and Cranston [3] and Norris [17] to the case of \(Z=(Z_1,\ldots , Z_d)^*\) being a Lévy process in \({\mathbb {R}}^d\) with independent coordinates \(Z_j\). For more information on Malliavin calculus for jump processes we refer the reader to the book of Ishikawa [12] (see also [2] and the references therein).
We assume that \(Z=\left( Z_1,\ldots , Z_d\right) ^*\) is defined on a probability space \((\Omega ,{\mathcal {F}},{\mathbb {P}})\). By the Lévy–Itô decomposition
where \(\Pi \) is the Poisson random measure on \(E:= [0,+\infty )\times {\mathbb {R}}^d\) with intensity measure \(\mathrm{d}s\mu (\mathrm{d}\xi )\),
Moreover, as the coordinates of Z are independent,
where \(\delta _0\) is the Dirac \(\delta \)-function, and \(m_j(\mathrm{d}\xi _j)\) is the Lévy measure of \(Z_j\). Note that
where \(\Pi _j\) are independent Poisson random measures, each on \([0,+\infty )\times {\mathbb {R}}^d\), with intensity measure \(\mathrm{d}s \mu _j\) (we use the same symbol as for the one-dimensional \(\Pi _j(\mathrm{d}s, \mathrm{d}\xi )\) appearing in (3) when no confusion may arise).
Consider the filtration
The Poisson random measure \(\Pi \) can be treated as a random element in the space \({\mathbb {Z}}_+(E)\) of integer-valued measures on \((E,{\mathcal {B}})\) with the \({\sigma }\)-field \({\mathcal {G}}\) generated by the family of functions
Definition 1
Let \(p\in (0,+\infty )\). We call a random variable \(\Psi :\Omega \mapsto {\mathbb {R}}\) an \(L^p\)-functional of \(\Pi \) if there is a sequence of bounded measurable functions \(\varphi _n:{\mathbb {Z}}_+(E)\mapsto {\mathbb {R}}\) such that
A random variable \(\Psi :\Omega \mapsto {\mathbb {R}}\) is called an \(L^0\)-functional of \(\Pi \) if, instead of (23), the convergence in probability holds
The space of all \(L^p\)-functionals of \(\Pi \) is denoted by \(L^p(\Pi )\). Note that for \(p\ge 1\), \(L^p(\Pi )\) is a Banach space with the norm \(\Vert \Psi \Vert _{L^p(\Pi )}= ({\mathbb {E}}\left| \Psi \right| ^p)^{1/p}\), and for \(p\in (0,1)\), \(L^p(\Pi )\) is a Polish space with the metric \(\rho _{L^p(\Pi )}(\Phi , \Psi )={\mathbb {E}}\left| \Phi -\Psi \right| ^p\).
Assume now that \(V=(V_1,\ldots , V_d):[0,+\infty )\times {\mathbb {R}}^d\mapsto {\mathbb {R}}^d\) is a field given by (14) and (15). The parameter \(\delta \) appearing in (15) will be specified later. Define transformations \({\mathcal {Q}}^\varepsilon _k\), \(\varepsilon >0\) and \(k=1,\ldots ,d\), \({\mathcal {Q}}^{\varepsilon }_k:{\mathbb {Z}}_+(E)\mapsto {\mathbb {Z}}_+(E)\) as follows
where \(e_k, k=1, \dots , d\), is the canonical basis of \({\mathbb {R}}^d\).
Now let \(\Psi \in L^0(\Pi )\). Write
where \(\varphi _n:{\mathbb {Z}}_+(E) \mapsto {\mathbb {R}}\) are such that (24) holds true. It follows from Lemma 2 below that \({\mathcal {Q}}^{\varepsilon }_{k}\Psi \) is well defined, that is, the limit exists and does not depend on the particular choice of the approximating sequence \((\varphi _n)\).
Definition 2
We call \(\Psi \in L^0(\Pi )\) differentiable (with respect to the field \(V= (V_1,\ldots , V_d)\)) if there exist limits in probability
Here \(D_k \Psi \) is the Malliavin derivative of \(\Psi \) along the direction \(V_k e_k\).
If \(\Psi \in L^0(\Pi )\) is differentiable then we set
The proof of the following chain rule is standard and left to the reader.
Lemma 1
Assume that \(\Psi _1, \ldots ,\Psi _m\) are differentiable functionals of \(\Pi \). Then for any \(f\in C^1_b({\mathbb {R}}^m)\) the variable \(f\left( \Psi _1, \ldots ,\Psi _m\right) \) is differentiable and
Let \(\rho _k=\frac{\mathrm{d}m_k}{\mathrm{d}x}\) be the density of the Lévy measure \(m_k\) restricted to \((-r,r){\setminus }\{0\} \subset {\mathbb {R}}\). We artificially extend \(\rho _k\) by putting \(\rho _k(0)=1\). Given \(\varepsilon \in [-1,1]\) sufficiently small and \(k=1,\ldots , d\), define
and
where \(\mu _k\) is defined in (22) and
Note that the set
is of Lebesgue measure zero.
We will need the following result (see e.g. [17] or [13]).
Lemma 2
The process \(M^{\varepsilon }_k \) is a martingale and for all \(T\ge 0\), and \(m\in {\mathbb {R}}\), \({\mathbb {E}} \left[ M^{\varepsilon }_k(T)\right] ^m<+\infty \). Let \(T\in (0,+\infty )\). Then, under the probability \(\mathrm{d}{\mathbb {P}}^\varepsilon = M^{\varepsilon }_k(T)\mathrm{d}{\mathbb {P}}\), \({\mathcal {Q}}^{\varepsilon }_k(\Pi )\) restricted to \([0,T]\times {\mathbb {R}}^d\) is a Poisson random measure with intensity \({ \mu _k (\mathrm{d}\xi )\mathrm{d}s }\).
The following lemma provides an integration-by-parts formula for the derivative \(D_k\). For completeness we repeat some elements of a proof from [17].
Lemma 3
For any \(1 \le q\le 2\) and \(t\in (0,+\infty )\), the random variable
is q-integrable. Assume that \(p \ge 2\) and that \({ \Phi } \in L^p(\Pi )\) is differentiable and \({\mathfrak {F}}_t\)-measurable. Then \( {\mathbb {E}} D_k\Phi = {\mathbb {E}} \Phi D_k^*{\mathbf {1}}(t)\).
Proof
Note that the process \(D_{k}^*{\mathbf {1}}(t)\) is well defined and q-integrable thanks to (8) and (9). The integrability follows from the fact, see e.g. [1], Theorem 4.4.23, or [19], Lemma 8.22, that one has
By Lemma 2 we have
Thus
where
Consequently, we need to show that \(D_{k}^*{\mathbf {1}}(t)= -R(t)\).
Since
we have
Finally
\(\square \)
4 Malliavin derivative of \(X^x\)
Let \(X^x(t)=\left[ X_1^x(t), \ldots , X_d^x(t)\right] ^*\in {\mathbb {R}}^d\) be the value of the solution at time t. We use the convention that vectors in \({\mathbb {R}}^d\) are columns, and derivatives (gradients) are rows. Using the chain rule (see Lemma 1) it is easy to check that each of its coordinates is a differentiable functional of \(\Pi \) and that the \(d\times d\)-matrix valued process \({\mathbb {D}} X^x(t)\),
satisfies the following random ODE
(cf. Section 5 in [3]) where \(Z^V(t)= \left[ Z^V_{ij}(t)\right] \), \(t \ge 0,\) is a \(d\times d\)-matrix valued process
Note that \(\int _{{\mathbb {R}}} \vert V_j(t,\xi )\vert m_j(\mathrm{d}\xi )<+\infty \) thanks to (7), and therefore, the process \(Z^V\) is well defined and q-integrable for any \(q\in [1,+\infty )\). The integrability follows from the so-called Kunita inequality (see [1]) and assumption (7). In fact the Kunita inequality ensures that for \(q\ge 2\),
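For the reader's convenience we record the standard form of the Kunita inequality alluded to here (see e.g. [1]); the display below, stated for a generic predictable integrand H against the compensated measure \({\widetilde{\Pi }}\) with Lévy measure m, is our statement of the inequality, not the lost original:

```latex
{\mathbb {E}}\sup _{s\le t}\left| \int _0^s\!\!\int _{{\mathbb {R}}} H(r,\xi )\,{\widetilde{\Pi }}(\mathrm{d}r,\mathrm{d}\xi )\right| ^q
\le C_q\left\{ {\mathbb {E}}\left( \int _0^t\!\!\int _{{\mathbb {R}}} |H(r,\xi )|^2\, m(\mathrm{d}\xi )\mathrm{d}r\right) ^{q/2}
+ {\mathbb {E}}\int _0^t\!\!\int _{{\mathbb {R}}} |H(r,\xi )|^q\, m(\mathrm{d}\xi )\mathrm{d}r\right\} .
```

Together with assumption (7), this yields the claimed q-integrability of \(Z^V\).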
Clearly we have:
Let \(\nabla X^x(t)\) be the derivative in probability of the solution with respect to the initial value
Note that the process \(X^x\) need not be integrable. However, as the noise is additive and b has bounded derivatives, \( \nabla X^x(t)\) exists, is p-integrable for any \(p\ge 1\), and
Since b has bounded derivatives, we have the next result in which \(\Vert \cdot \Vert \) is the operator norm on the space of real \(d\times d\)-matrices.
Lemma 4
For all \(t\ge 0\) and \(x\in {\mathbb {R}}^d\), \(\nabla X^x(t)\) is an invertible matrix. Moreover, there is a constant C such that
Moreover, there is a constant C, possibly depending on T, such that
As a simple consequence of (27) and Lemma 4 we have
Let
Then \({\mathbb {D}}X^x(t)= \nabla X^x(t)M(t,x)\) and consequently the matrix valued process \(A=[A_{k,j}(t,x)]\) given by (13) satisfies
The proof of the following lemma is moved to the next section (Sect. 5).
Lemma 5
Assume that the parameter \({\delta }\) in (15) is small enough. Let \(p\ge 1\). The Malliavin matrix \({\mathbb {D}} X^x(t)\) is invertible and p-integrable. Moreover, the matrix valued process \(A=[A_{k,j}(t,x)]\) given by (13) or (32) is differentiable and p-integrable.
5 Proof of Lemma 5
As before \(\Vert \cdot \Vert \) denotes
the operator norm on the space of real \(d\times d\)-matrices. Moreover for a random \(d \times d\)-matrix B we set
Lemma 6
(i) For any \(t>0\), the matrix \(Z^V(t)\) is invertible, \({\mathbb {P}}\)-a.s. Moreover, for any \(p \ge 1\), \(T>0\), there is a constant \(C= C(p,T)\) such that
(ii) Assume that the parameter \({\delta }\) in (15) is small enough (possibly depending on the dimension d). Then the matrix M(t, x) is invertible, \({\mathbb {P}}\)-a.s. Moreover, for any \(p \ge 1\) and any \(T>0\), there is a constant \(C= C(p, T)\) such that
where \(A(t,x)=\left( M(t,x)\right) ^{-1}\).
Proof
The first part of the lemma follows from Corollary 1 from Sect. 7 below. To show the second part note that
where \(R(t,x):= \left( \nabla X^x(t)\right) ^{-1}-I\) is a random variable taking values in the space of \(d\times d\) matrices. Note that
with \(\triangle Z(s) = Z(s) - Z({s-}) \), where \({{\tilde{V}}}(s, z)\) is a diagonal matrix, \(s \ge 0,\) \(z \in {{\mathbb {R}}}^d\), such that
Moreover, \({\mathbb {P}}\)-a.s., \(Z^V(t)= \sum _{0< s \le t} {{\tilde{V}}}(s, \triangle Z(s))\) is convergent by (7) and it is also invertible. We write
We would like to obtain, for \(\delta >0\) small enough, \(t>0\),
To this purpose we consider
Recall that \((e_j)\) is the canonical basis of \({{\mathbb {R}}}^d\). We get for \(j =1, \ldots , d\), \({{\mathbb {P}}}\)-a.s.,
and so
where C is independent of \(x \in {{\mathbb {R}}}^d\), \(t \ge 0 \) and \(\omega \), \({{\mathbb {P}}}\)-a.s. Above we used the second estimate of Lemma 4: \(\Vert R(s,x)\Vert \le Cs\). We will also need \(| Q(t,x) e_j|\le 1/2\); to this end we have to take \(\delta \) sufficiently small. In fact we require \(\delta C\le 1/2\).
Therefore, as \(Z^V(t)\) is invertible, the matrix M(t, x) is invertible and \(A(t,x)=\left( M(t,x)\right) ^{-1}\) satisfies (35). Moreover
Consequently, we have
and (33) follows. The proof is complete. \(\square \)
Remark 6
We note that in the previous proof it is important to have a term like \( \int _0^t R(s,x)\mathrm{d}Z^V(s) \, (Z^V(t))^{-1} \) (cf. (34)). Such a term can be estimated in a sharp way by \(\min (Ct, 1/2)\). On the other hand, a term like \( (Z^V(t))^{-1}\int _0^t R(s,x)\mathrm{d}Z^V(s) \) would be difficult to estimate in a sharp way (we can estimate its \(L^2\)-norm by \(C t^{-\frac{ \kappa }{\rho } \, + \, \frac{3}{2}}\)). In this respect see also the computations in Sect. 6.2.
5.1 Proof of Lemma 5
Since b has bounded derivatives of the first and second order, \(\nabla X^x(t)\) and \(\left( \nabla X^x(t)\right) ^{-1}\) are differentiable and p-integrable. Next, thanks to (9), the matrix valued process \(Z^V\) given by (28) is also differentiable, p-integrable, and
Therefore, as
b has bounded derivatives of the first and second order, and \(\mathrm{d}Z^V(t)\) is p-integrable and differentiable, we infer that \({\mathbb {D}}X^x(t)\) is p-integrable and differentiable. Clearly \(\nabla X^x(t)\) is invertible. By Lemma 6, the matrix M(t, x) given by (31) is invertible, p-integrable and differentiable. Since, (cf. (30) and (31)),
and, by Lemma 6, \(A(t,x):= \left( M(t,x)\right) ^{-1}\) is p-integrable, we infer that \({\mathbb {D}}X^x(t)\) is invertible, and \(\left( {\mathbb {D}}X^x(t)\right) ^{-1}\) is p-integrable.
We can show the differentiability of \( \left( {\mathbb {D}}X^x(t)\right) ^{-1}\) or equivalently of A(t, x) in a standard way based on the observation that
6 Proof of Theorem 1
By Lemma 5 the random field Y(t, x) given by (12) is well defined and integrable. By an approximation argument, see e.g. [21], Corollary 3.1 and its proof given in Section 4.3, or [14], see also [18], Lemma 2.2 for gradient estimates, it is enough to show that for any \(f\in C_b^1({\mathbb {R}}^d)\) we have (4). To this end note that
Since, by Lemma 1,
and, by Lemma 5 the matrix \({\mathbb {D}} X^x(t)\) is invertible, we have
where A(t, x) is given by (13) or equivalently by (32), and, as gradients are row vectors, \(e_j^{*}\) is the transpose of \(e_j\). By the chain rule we have
Hence, by Lemma 3, we have (4) with Y given by (12). The same arguments can be applied to show the BEL formula for the Lévy semigroup.
The proof of (10) and (11) is more difficult, and it is divided into the following two parts.
6.1 Lévy case
Assume that \(b\equiv 0\), that is, \(X^x(t)= Z^x(t)\). Let us fix a time horizon \(T<+\infty \). We prove estimate (10) for the process Y(t) corresponding to the pure Lévy case.
We have, for \(j=1,\ldots ,d\),
where \(A(t)= \left[ {\mathbb {D}}Z^x(t)\right] ^{-1}= \left[ Z^V(t)\right] ^{-1}\) and \(Z^V(t)\) is a diagonal matrix defined in (28). Therefore
where \(D_j^*{\mathbf {1}}(t)\) and \(D_j Z^{V}_{j,j}\) are given by (26) and (38), respectively. We have
By Lemma 6, there is a constant \(C_1\) such that \({\mathbb {E}}\left| Z_{jj}^V(t) \right| ^{-2}\le C_1t^{-\frac{2\kappa }{\rho } }\). Next there are constants \(C_2\) and \(C_3\) such that
where the last estimate follows from (8) and (9). Therefore there is a constant \(C_4\) such that
Let us now observe that
here in the last inequality we have used an elementary inequality
valid for any non-negative real numbers \(\{x_k\}\). Thus
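(The elementary inequality referred to above is, we believe, the subadditivity bound

```latex
\Big( \sum _k x_k \Big) ^{\gamma } \le \sum _k x_k^{\gamma }, \qquad \gamma \in (0,1],
```

presumably applied with \(\gamma = 1/2\); this identification is our reconstruction.)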
Therefore, by Lemma 6,
Note that \(\int _{{\mathbb {R}}} \left( \phi _\delta '(\xi _j)\right) ^2 m_j(\mathrm{d}\xi _j)<+\infty \) thanks to (9). Summing up, we can find a constant C such that
which is the desired estimate. \(\square \)
6.2 General case
Recall that M and \(A=M^{-1}\) are given by (31) and (32), respectively. Let \(T>0.\) We prove first that (for \(\delta >0\) small enough) there is a constant c such that for \(t\in (0,T]\),
By Lemma 6 there is a constant \(C>0\) such that
Therefore, (45) follows from (39) by using the Cauchy–Schwarz inequality. Clearly (44) follows from (40) and (45).
It is much harder to evaluate the \(L^1\)-norm of the term
Recall that \(R(s,x):= \left( \nabla X^x(s)\right) ^{-1}-I\). Moreover,
is differentiable, p-integrable, and we have (see also (38)):
We have \(\Vert R(s,x)\Vert \le C_1s\), \(s \in [0,T]\); moreover, there are non-negative random variables \(\eta (s)\), integrable with an arbitrary power, such that, \({{\mathbb {P}}}\)-a.s., \(0 \le \eta (s) \le \eta (t)\), \(0 \le s \le t \le T\),
where \(C_2\) is independent of s. Indeed, using that \(\mathrm{d}\nabla X^x(t)= \nabla b(X^x(t))\nabla X^x(t)\mathrm{d}t,\) \(\nabla X^x(0)=I\),
Since \(\nabla b\) is bounded we have \(\Vert R(s,x)\Vert \le C_1s\). After differentiation we obtain
By (30), there is a constant \(C_3\) such that for all \(t\in [0,T]\), \(\Vert D_kX^x(t)\Vert \le C_3 \Vert Z^V(t)\Vert \). Therefore there is a constant \(C_4\) such that
and consequently
We will show that I(t, x) is a proper perturbation of the already estimated
The proof will be completed as soon as we show that there is a constant \(C_6\) such that
This will imply that
Collecting (44) and (51) will give the estimate for \({{\mathbb {E}}}\left| Y(t,x)\right| \).
Let us prove (50). Recalling that \(A(t,x) = (M(t,x))^{-1}\) we have to estimate
where
We have
Using (33) we infer
Since
we can use (41) and get
see (9). Hence we have
We now evaluate \(J_2\). By Lemma 6 we have
Next
We will argue as in the proof of Lemma 6. Note that
Recall that, for \(\delta \) small enough,
where \({{\tilde{U}}}(s, z)\) is a diagonal matrix, \(s \ge 0,\) \(z \in {{\mathbb {R}}}^d\), such that
Hence
see (38). We deduce that
Therefore, in order to estimate \(J_2\), it remains to consider
where \({{\tilde{V}}}(s, z)\) is a diagonal matrix, \(s \ge 0,\) \(z \in {{\mathbb {R}}}^d\), such that \( ({{\tilde{V}}}(s, z))_{ii} \) \( = V_i(s,z_i).\) Using the bound (48) we obtain, for \(j =1, \ldots , d,\) \({{\mathbb {P}}}\)-a.s.,
It follows that
Summing up we have
To treat \(J_1\) we note that by Lemma 6 we have
We write
The more difficult term is
By (37) we have
Hence, by (36),
where \({{\tilde{C}}}_1 \) is independent of x, \(t \in (0,T]\) and \(\omega \), \({{\mathbb {P}}}\)-a.s. The term
can be treated as the first term in (52). Therefore we have
Summing up we have
Since
where \(c_3 \) is independent of x and \(\omega \), \({{\mathbb {P}}}\)-a.s., we have
and the proof is complete. \(\square \)
7 An integrability result
Assume that \({\mathcal {M}}\) is a Poisson random measure on \([0,+\infty )\times {\mathbb {R}}\) with intensity measure \(\mathrm{d}t m(\mathrm{d}\xi )\). Given a measurable \(h:{\mathbb {R}}\mapsto [0,+\infty )\) let
Then for any \(\beta >0\),
Using the identity
we obtain
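The identity in question is, we believe, the classical Gamma-function representation of negative moments; writing \(I(t)=\int _0^t\int _{{\mathbb {R}}} h(\xi )\,{\mathcal {M}}(\mathrm{d}s,\mathrm{d}\xi )\) for the functional defined above (our notation), it reads

```latex
\lambda ^{-\beta } = \frac{1}{\Gamma (\beta )}\int _0^{+\infty } u^{\beta -1} e^{-\lambda u}\, \mathrm{d}u, \qquad \lambda >0,\ \beta >0,
```

which, applied with \(\lambda = I(t)\) and combined with the exponential formula \({\mathbb {E}}\, e^{-u I(t)} = \exp \big ( -t \int _{{\mathbb {R}}} (1- e^{-u h(\xi )})\, m(\mathrm{d}\xi )\big )\), yields a bound on \({\mathbb {E}}\, I(t)^{-\beta }\).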
Using this method one can obtain (see Norris [17]) the following result.
Lemma 7
If for a certain \(\rho >0\),
then
Let \(\phi _\delta \in C^\infty ({\mathbb {R}}\setminus \{0\})\) be given by (15). Assume that \(m(\mathrm{d}\xi )\) satisfies hypothesis (ii) of Theorem 1 and \(h=\phi _\delta \). Then
Consequently, by Lemma 7 we have the following result:
Corollary 1
For any \(q\ge 1\) there is a constant \(C= C(q,T)\) such that
Moreover,
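The displays of Corollary 1 are reconstructed here from their uses in Sects. 6 and 8 (e.g. the bounds \({\mathbb {E}}\left| Z_{jj}^V(t) \right| ^{-2}\le C_1t^{-\frac{2\kappa }{\rho } }\) and \({\mathbb {E}}\left| Z^V(t) \right| ^{-q}\le C_1t^{-\frac{\kappa q}{\alpha } }\)); the first display presumably reads

```latex
{\mathbb {E}}\left[ \left( \int _0^t \int _{{\mathbb {R}}} \phi _\delta (\xi )\, {\mathcal {M}}(\mathrm{d}s,\mathrm{d}\xi )\right) ^{-q}\right] \le C\, t^{-\frac{\kappa q}{\rho }}, \qquad t\in (0,T],
```

with the "Moreover" display presumably applying this to the diagonal entries \(Z^V_{jj}(t)\).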
8 Sharp estimates in the cylindrical \(\alpha \)-stable case
Here we are concerned with a rather general perturbation of the \(\alpha \)-stable case. Indeed, in this case we can improve the estimate on Y(t) given in Sect. 6.1. According to Remark 4, this estimate leads to the sharp gradient estimates (6).
Below in (54) we will strengthen hypotheses (8) and (9). In Remark 7 we clarify the validity of the new assumptions in the relevant cylindrical \(\alpha \)-stable case.
Lemma 8
Let \(\alpha \in (0,2)\). Suppose that all the assumptions of Theorem 1 hold with \(\rho = \alpha \) and for some \( \kappa > 1 + \alpha /2 \). Moreover, suppose that, for the same \(\kappa \),
and there exists \(p\in (1,2)\) such that
Then the following estimate holds for the \({\mathbb {R}}^d\)-valued process Y (cf. (43)):
Remark 7
We provide a sufficient condition under which all the hypotheses of Lemma 8 hold. To this purpose recall that \(\rho _j\) is the \(C^1\)-density of the Lévy measure \(m_j\) associated to the process \(Z_j\); such a density exists on \((-r, r) {\setminus } \{0\}\), with \(r >0\) as in (iii) of Theorem 1.
Moreover, \(l_\alpha (\xi ):= \vert \xi \vert ^{-1-\alpha }\) denotes the density of the Lévy measure of a symmetric one-dimensional \(\alpha \)-stable process, \(\alpha \in (0,2)\).
Assume that there is a positive constant c such that, for \(\xi \in (- r, r) {\setminus } \{ 0\},\)
\(j =1, \ldots , d\). It is easy to check that (57) implies all the assumptions of Lemma 8 with arbitrary \(\kappa \in (1+\alpha /2, 1+\alpha )\). Thus under condition (57) we obtain (56) and the sharp gradient estimates (6).
Proof
To prove the result we can assume \(d=1\) so that \(Y_1 = Y \); \(\Pi \) is the associated Poisson random measure and we set \(m_1 =\mu \) for the corresponding Lévy measure having \(C^1\)-density \(\rho _1=\rho \) on \((- r, r)\).
It is enough to show (56) for small t, say \(0 < t^{1/\alpha } \vee t \le \delta /2 \le 1\), where \(\delta \le r\) is small enough.
Note that \(\phi _\delta (\xi )=|\xi |^{\kappa }\) for \(|\xi |\le \delta /2\). Moreover, recall that \(\psi _\delta (t)=1\) for \(t\le \delta /2\). Let us fix \( \kappa = 1+ \frac{3}{4} \alpha . \) We have
We have
We first show that
We concentrate on \(D^*{\mathbf {1}}(t) \):
Concerning \(I_1(t)\), we can improve some estimates of Sect. 6.1 using the Hölder inequality (because \(\xi \) is separated from 0): for the given \(p \in (1,2)\) and q with \(1/p+1/q=1\) we have
By Corollary 1, there is a constant \(C_1\) such that \({\mathbb {E}}\left| Z^V(t) \right| ^{-q}\le C_1t^{-\frac{\kappa q}{\alpha } }\); recall that \(\rho =\alpha \) now. Since \(p\in (1,2)\), there exists a positive constant c such that
see e.g. Lemma 8.22 in [19]. Since \(\phi _\delta (\xi )=|\xi |^{\kappa } \psi _{\delta } (\xi )\), it follows that \(|\phi _\delta '(\xi ) | \le C_{\delta } |\xi |^{\kappa -1}\) for \(|\xi |\le \delta \).
By (55) we have
with some constant \(C_3\). Combined with the previous inequality, this gives
Therefore by the Hölder inequality
For \(I_2(t)\) we proceed in a similar way. Namely, by the Cauchy inequality, the isometry formula, Lemma 6 and (54), we find
which completes the proof of (58). Next we show that
To this end note that
Hence, by the same arguments as in the derivation of (59) (recall that \(|\phi _\delta '(\xi ) | \le C_{\delta } |\xi |^{\kappa -1}\) for \(|\xi |\le \delta \)), we obtain
Set
and
Since \(\phi _\delta (\xi )=\vert \xi \vert ^\kappa \) if \(\vert \xi \vert \le \delta /2\), we have
We now deal with \(H(t)\). Since
we have, arguing as in (60) and using (54) again,
which finishes the proof of (61). \(\square \)
References
Applebaum, D.: Lévy Processes and Stochastic Calculus, 2nd edn. Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge (2011)
Bally, V., Clement, E.: Integration by parts formula and applications to equations with jumps. Probab. Theory Relat. Fields 151, 613–657 (2011)
Bass, R.F., Cranston, M.: The Malliavin calculus for pure jump processes and applications to local time. Ann. Probab. 14, 490–532 (1986)
Bessaih, H., Hausenblas, E., Razafimandimby, P.A.: Ergodicity of stochastic shell models driven by pure jump noise. SIAM J. Math. Anal. 48, 1423–1458 (2014)
Bismut, J.M.: Calcul des variations stochastique et processus de sauts. Z. Wahrsch. Verw. Gebiete 63, 147–235 (1983)
Davis, M., Johansson, M.: Malliavin Monte Carlo Greeks for jump diffusions. Stochastic Process. Appl. 116, 101–129 (2006)
Dong, Z., Song, Y., Xie, Y.: Derivative formula and coupling property for linear SDEs driven by Lévy processes. Acta Math. Appl. Sin. Engl. Ser. 35, 708–721 (2019)
Dong, Z., Peng, X., Song, Y., Zhang, X.: Strong Feller properties for degenerate SDEs with jumps. Ann. Inst. Henri Poincaré Probab. Stat. 52, 888–897 (2016)
Du, K., Zhang, X.: Optimal gradient estimates of heat kernels of stable-like operators. Proc. Am. Math. Soc. 147, 3559–3565 (2019)
Elworthy, K.D., Li, X.-M.: Formulae for the derivatives of heat semigroups. J. Funct. Anal. 125, 252–286 (1994)
Fournie, E., Lasry, J.M., Lebuchoux, J., Lions, P.L., Touzi, N.: Applications of Malliavin calculus to Monte Carlo methods in finance. Finance Stoch. 3, 391–412 (1999)
Ishikawa, Y.: Stochastic Calculus of Variations for Jump Processes, 2nd edn. De Gruyter Studies in Mathematics 54. Walter de Gruyter, Berlin (2016)
Ivanenko, D.O., Kulik, A.M.: Malliavin calculus approach to statistical inference for Lévy driven SDE’s. Methodol. Comput. Appl. Probab. 17(1), 107–123 (2013)
Kawai, R., Takeuchi, A.: Greeks formulas for an asset price model with gamma processes. Math. Finance 21, 723–742 (2011)
Kulczycki, T., Ryznar, M.: Semigroup properties of solutions of SDEs driven by Lévy processes with independent coordinates, preprint arXiv:1906.07173
Léandre, R.: Régularité de processus de sauts dégénéré. Ann. Inst. H. Poincaré Probab. Statist. 21, 125–146 (1985)
Norris, J.R.: Integration by parts for jump processes. In: Séminaire de Probabilités XXII, Lecture Notes in Math. 1321, pp. 271–315. Springer (1988)
Peszat, S., Zabczyk, J.: Strong Feller property and irreducibility for diffusions on Hilbert spaces. Ann. Probab. 23, 157–172 (1995)
Peszat, S., Zabczyk, J.: Stochastic Partial Differential Equations with Lévy Noise. Cambridge University Press, Cambridge (2007)
Priola, E., Zabczyk, J.: Liouville theorems for nonlocal operators. J. Funct. Anal. 216, 455–490 (2004)
Takeuchi, A.: Bismut–Elworthy–Li-type formulae for stochastic differential equations with jumps. J. Theoret. Probab. 23, 576–604 (2010)
Wang, F.Y., Xu, L., Zhang, X.: Gradient estimates for SDEs driven by multiplicative Lévy noise. J. Funct. Anal. 269, 3195–3219 (2015)
Zhang, X.: Derivative formulas and gradient estimates for SDEs driven by \(\alpha \)-stable processes. Stochastic Process. Appl. 123, 1213–1228 (2013)
Acknowledgements
We would like to thank Prof. Jerzy Zabczyk for very useful discussions on the topic. We also thank the anonymous referee for their careful reading and for several useful comments and corrections on a previous version of the paper.
Funding
Open access funding provided by Università degli Studi di Pavia within the CRUI-CARE Agreement.
The work of Alexei Kulik was supported by Polish National Science Center Grant 2019/33/B/ST1/02923. The work of Szymon Peszat was supported by Polish National Science Center Grant 2017/25/B/ST1/02584. The work of Enrico Priola was supported by the Grant 346300 for IMPAN from the Simons Foundation and the matching 2015–2019 Polish MNiSW fund.
Cite this article
Kulik, A.M., Peszat, S. & Priola, E. Gradient formula for transition semigroup corresponding to stochastic equation driven by a system of independent Lévy processes. Nonlinear Differ. Equ. Appl. 30, 7 (2023). https://doi.org/10.1007/s00030-022-00810-2