Support Theorem for Lévy-driven Stochastic Differential Equations

We provide a support theorem for the law of the solution to a stochastic differential equation (SDE) with jump noise. This theorem applies to quite general Lévy-driven SDEs and is illustrated by examples with rather degenerate jump noises, where the theorem leads to an informative description of the support.


Oleksii Kulyk (Oleksii.Kulyk@pwr.edu.pl), Faculty of Pure and Applied Mathematics, Wrocław University of Science and Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland

Introduction
The aim of this note is to describe the topological support of the law of the solution to a general stochastic differential equation (SDE) with jump noise. For diffusions, such a description is provided by the classical Stroock-Varadhan support theorem, see [1]. The natural question of extending this theorem to SDEs with jumps was studied intensively in the late 1990s and early 2000s by H. Kunita, Y. Ishikawa, and T. Simon, see [2], [3], [4]. It turns out that, in the jump noise setting, two cases should be naturally separated, named in [3] 'Type I' and 'Type II' SDEs. In plain words, an SDE is of 'Type I' if the small jumps are absolutely integrable and of 'Type II' otherwise. A 'Type I' SDE admits a simple and intuitively clear description of the support; this case was studied completely in [3]. 'Type II' SDEs have been studied only under additional limitations, which can be understood as certain technical convexity and scaling-type assumptions on the Lévy measure of the noise. The latter 'scaling' assumption (see H2 in [3] or (10) below) is quite restrictive and requires the jump noise to be close, in a sense, to an α-stable one. This limitation is of a technical nature and is caused by the method used in [3], [4] rather than by the problem itself; hence, a natural question arises of how to remove it in order to get a general support theorem for SDEs with jumps, free from any technical limitations. This question was answered in the one-dimensional setting in [5] and for canonical (Marcus) equations in [6]. For general multidimensional Itô equations with jumps, it apparently requires alternative methods and has remained open since the early 2000s.
In this note, we propose a method for proving a support theorem for general 'Type II' SDEs, based on a change of measure and free from any unnatural technical limitations. The description we obtain for the topological support of the law of solutions to SDEs with jumps is of considerable importance because of its natural applications, in particular to the study of the ergodic properties of the solution to the SDE, considered as a Markov process, and to the strong maximum principle for the generator of this process; see the further discussion in Sect. 2.3.
The structure of the paper is as follows. In Sect. 2, the main statement is formulated and accompanied by a discussion and examples. The proof of the main statement is given in Sect. 3; to improve readability, the proof of the key lemma (Lemma 3.1) is postponed to a separate Sect. 4. For the same reason, the proof of the technical estimate (19) is given in Appendix A.

Preliminaries
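Let $N(du,dt)$ be a Poisson point measure (PPM) on $\mathbb{R}^d\times[0,\infty)$ with the intensity measure $\mu(du)\,dt$, where $\mu(du)$ is a Lévy measure, i.e.,
$$\int_{\mathbb{R}^d}\big(|u|^2\wedge 1\big)\,\mu(du)<\infty.$$
We consider an SDE in $\mathbb{R}^m$
$$dX_t=b(X_t)\,dt+\int_{|u|\le 1}c(X_{t-},u)\,\tilde N(du,dt)+\int_{|u|>1}c(X_{t-},u)\,N(du,dt),\qquad X_0=x_0,\tag{1}$$
where $\tilde N(du,dt)=N(du,dt)-\mu(du)\,dt$ is the corresponding compensated PPM. The coefficient $c(x,u)$ is assumed to have the form
$$c(x,u)=\sigma(x)u+r(x,u),\tag{2}$$
where $r(x,u)$ is negligible, in a sense, w.r.t. $|u|$ for small values of $|u|$ (see below). In the case $r(x,u)\equiv 0$, Eq. (1) transforms to
$$dX_t=b(X_t)\,dt+\sigma(X_{t-})\,dZ_t,$$
where the Lévy process $Z$ is given by its Itô-Lévy decomposition
$$Z_t=\int_0^t\int_{|u|\le 1}u\,\tilde N(du,ds)+\int_0^t\int_{|u|>1}u\,N(du,ds).$$
We assume the following.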
There exist constants $C > 0$, $\beta > 0$ such that If $\beta \le 1$, then $H_2$ yields that the whole function $c(x,u)$ is absolutely integrable w.r.t. $\mu(du)$ on the set $\{|u|\le 1\}$; that is, Eq. (1) is of Type I in the terminology of [3]. This (comparatively simple) case has already been studied completely; hence, in what follows we consider only the case $\beta > 1$. Note that, in this case, $|r(x,u)| \le C(|x-x_0|+1)|u|^\beta \ll |u|$ for small $|u|$, and the linear part $\sigma(x)u$ is the principal one in the decomposition (2) for $c(x,u)$.
Recall that the Skorokhod space $D([0,T],\mathbb{R}^m)$ is the set of càdlàg functions on $[0,T]$, $T>0$, endowed with the Skorokhod metric; the latter is defined in terms of the set $\Lambda_{0,T}$ of strictly increasing continuous functions $\lambda$ mapping $[0,T]$ onto itself. This space is Polish, see e.g. [7, Section 14]. Under the assumptions $H_1$, $H_2$, SDE (1) has a unique strong solution $X$, see [8, Theorem IV.9.1]. We fix a time horizon $T>0$ and consider this solution on the time interval $[0,T]$. The law of this solution in the Skorokhod space $D([0,T],\mathbb{R}^m)$ will be denoted by $\mathrm{Law}_{x_0,T}(X)$. We aim to describe the support of this law. Recall that the (topological) support of a measure $\kappa$ on a metric space $S$ with Borel $\sigma$-algebra is the minimal closed subset $F$ such that $\kappa(S\setminus F)=0$; this set is denoted by $\operatorname{supp}(\kappa)$. Alternatively, $x\in\operatorname{supp}(\kappa)$ if, and only if, $\kappa(B)>0$ for any open ball $B$ centered at $x$.

The Main Statement
To describe the support of $\mathrm{Law}_{x_0,T}(X)$, we introduce some constructions. First, we introduce a kernel $J(x,dy)$ on $\mathbb{R}^m$ by and define the set $A\subset\mathbb{R}^m\times\mathbb{R}^m$ of 'admissible jumps' by
$$(x,y)\in A \iff y\in\operatorname{supp}(J(x,\cdot)).$$
Next, denote by $L$ the set of $\ell\in\mathbb{R}^d$ such that $\int_{|u|\le 1}|u\cdot\ell|\,\mu(du)<\infty$; here and below we use the notation $a\cdot b$ for the scalar product in $\mathbb{R}^d$. It is easy to check that $L$ is a vector subspace of $\mathbb{R}^d$; in what follows, we call it the 'integrability subspace' for $\mu$. Denote by $u_L$ the orthogonal projection of $u$ on $L$; then $\int_{|u|\le 1}|u_L|\,\mu(du)<\infty$, and under the assumptions $H_1$, $H_2$ the following function is well defined and Lipschitz continuous: Denote by $F^{\mathrm{const}}_{0,T}$ and $F^{\mathrm{step}}_{0,T}$ the classes of constant and piece-wise constant functions $f:[0,T]\to L^\perp$, respectively. Finally, denote by $S^{\mathrm{const}}_{0,T,x_0}$ and $S^{\mathrm{step}}_{0,T,x_0}$ the classes of functions $\varphi$ satisfying the following piece-wise ordinary differential equations where $\{t_k\}\subset(0,T)$ is an arbitrary finite set, $f$ is an arbitrary function from the class $F^{\mathrm{const}}_{0,T}$ or $F^{\mathrm{step}}_{0,T}$, respectively, and the jumps of the function $\varphi$ at the time moments $\{t_k\}$ satisfy $(\varphi_{t_k-},\varphi_{t_k})\in A$ for any $k$.
Note that each such $\varphi$ can be obtained as follows. Let $x_0$, $\{t_k\}$, and $f$ be given; we can assume $t_1<t_2<\dots$. We first solve the Cauchy problem for the ODE on the interval $[0,t_1)$ with the initial condition $\varphi_0=x_0$. Then, we determine the value $\varphi_{t_1}$, which should obey the admissibility assumption (6); note that $\varphi_{t_1-}$ is already well defined. Then, we use $\varphi_{t_1}$ as the initial value for the (new) Cauchy problem on the time interval $[t_1,t_2)$, and so on. Denote by $S^{\mathrm{const}}_{0,T,x_0}$ and S

In plain words, the above description of the support can be explained as follows. The SDE (1) contains two parts: the 'deterministic flow' part, which corresponds to the drift, and the 'stochastic jump part', which corresponds to the PPM. These two parts are still related through the compensator term, which is involved in the stochastic integral w.r.t. $\tilde N(du,dt)$. In the simple case where $L=\mathbb{R}^d$, and thus the SDE is of Type I, these parts can be completely separated, and (1) can be written in the form In this case, (5) does not contain the part with $f$ because $L^\perp=\{0\}$, and the description of the support of the law of the solution $X$ is intuitively clear: the solution follows the deterministic flow defined by the effective drift, then makes a jump admissible for the stochastic jump part, then again follows the deterministic flow, etc. This description was rigorously proved in [3, Theorem I]. The general (Type II) case is more sophisticated, since the compensator term cannot be separated from the stochastic integral. Theorem 2.1 actually tells us that the above description of the support remains essentially correct, with the following two important changes:
• the effective drift involves only the parts of $c(x,u)$ which are absolutely integrable, i.e., $r(x,u)$ and $\sigma(x)u_L$;
• the 'non-integrability subspace' $L^\perp$ induces an extra drift part, which may act at

Such a description of the support was conjectured by T. Simon, see [3, Remark 3.3(c)], by analogy with a similar result proved for Lévy processes in [9]. Theorem 2.1 proves this conjecture in wide generality, i.e., without any specific assumptions on the SDE except the natural conditions $H_1$, $H_2$, which yield strong existence of the solution.
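The step-by-step construction of the skeleton function φ described above (solve the ODE up to the first jump time, apply a jump, restart from the new value) can be sketched numerically. The following is a minimal illustration only, under several assumptions that are not part of the paper: the scalar case m = 1, an explicit Euler scheme, a user-supplied `drift(t, x)` standing in for the right-hand side of the ODE (5), and a user-supplied `jump_map(k, x_left)` returning the post-jump value. The names `skeleton_path`, `drift` and `jump_map` are hypothetical, and admissibility of the jumps is not checked; it is the caller's responsibility.

```python
import math
from typing import Callable, List, Sequence, Tuple


def skeleton_path(
    x0: float,
    drift: Callable[[float, float], float],
    jump_times: Sequence[float],
    jump_map: Callable[[int, float], float],
    T: float,
    h: float = 1e-3,
) -> List[Tuple[float, float]]:
    """Explicit Euler approximation of a piece-wise ODE with prescribed jumps.

    drift(t, x)     : hypothetical stand-in for the right-hand side of the ODE.
    jump_map(k, xl) : post-jump value at the k-th jump time, given the left
                      limit xl; it should return an admissible point, which
                      is NOT verified here.
    Returns a list of (time, value) pairs; at each jump time the list holds
    the left limit first and the post-jump value second.
    """
    times = sorted(jump_times)
    if any(t <= 0.0 or t >= T for t in times):
        raise ValueError("jump times must lie strictly inside (0, T)")
    breakpoints = [0.0] + times + [T]
    phi = x0
    path = [(0.0, phi)]
    for k in range(len(breakpoints) - 1):
        a, b = breakpoints[k], breakpoints[k + 1]
        n = max(1, math.ceil((b - a) / h))  # Euler steps on [a, b)
        dt = (b - a) / n
        for i in range(n):
            phi += dt * drift(a + i * dt, phi)
            path.append((a + (i + 1) * dt, phi))
        if k < len(times):
            # phi is now the left limit at the jump time b; apply the jump
            # and restart the Cauchy problem from the new value.
            phi = jump_map(k, phi)
            path.append((b, phi))
    return path
```

For instance, `skeleton_path(0.0, lambda t, x: 1.0, [0.5], lambda k, x: x + 2.0, 1.0)` follows a unit drift on [0, 0.5), jumps by 2 at time 0.5, and then continues with the same drift up to T = 1.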

Remark 2.2
Instead of the global Lipschitz conditions in $H_1$, $H_2$, one can assume their local analogues combined with a condition which prevents the strong solution to SDE (1) from blowing up, e.g., the linear growth condition on the coefficients. Theorem 2.1 can be extended to such a setting by the usual localization technique.

Remark 2.3
Assumption $H_2$ with $\beta>1$ contains a structural limitation: the infinite variation part of the SDE is linear in the jump variable $u$. Removing this limitation would require considering $x$-dependent non-integrability subspaces instead of $\{\sigma(x)f,\ f\in L^\perp\}$ and piece-wise differential inclusions instead of (5); such an extension is a topic for further research. Still, the limitation mentioned above is not very restrictive in practice. For instance, if $c(x,u)$ is $C^2$-smooth in $u$, then (the local version of) $H_2$ holds true with $\beta=2$ by the Taylor formula; in this case, (4) holds true just because $\mu$ is a Lévy measure.
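The Taylor-formula argument in this remark can be checked numerically on a toy coefficient. The sketch below is an illustration only: the coefficient c(x, u) = x(e^u − 1) with m = d = 1 is a hypothetical choice that does not come from the paper, and only the growth bound |r(x, u)| ≤ C|x||u|^2 on {|u| ≤ 1} is examined, not the full assumption. For this choice, σ(x) = x is the derivative of c in u at u = 0, and the second-order Taylor bound gives C = e/2.

```python
import math


def c(x: float, u: float) -> float:
    """Hypothetical smooth coefficient c(x, u) = x * (exp(u) - 1)."""
    return x * math.expm1(u)


def sigma(x: float) -> float:
    """Linear part: derivative of c(x, u) in u at u = 0."""
    return x


def r(x: float, u: float) -> float:
    """Remainder r(x, u) = c(x, u) - sigma(x) * u."""
    return c(x, u) - sigma(x) * u


def max_ratio(x: float, n: int = 200) -> float:
    """Largest value of |r(x, u)| / |u|^2 over a grid of nonzero u in [-1, 1]."""
    grid = [k / n for k in range(-n, n + 1) if k != 0]
    return max(abs(r(x, u)) / (u * u) for u in grid)
```

For x = 1 the ratio is maximal at u = 1, where it equals e − 2 ≈ 0.718; this stays below the Taylor constant e/2, consistent with a bound of order |u|^2, i.e., β = 2.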

Remark 2.4
Our description of admissible jumps differs from the one adopted in [3], which requires that These two descriptions coincide for $c(x,u)$ continuous in $u$; for discontinuous $c(x,u)$, the latter one is no longer applicable. To see this, one can consider a simple example where $m=d=1$, $\mu$ is a finite discrete measure which has full support in $\mathbb{R}$,

Discussion: Applications and Examples
Support theorems arise naturally in the study of ergodic properties of the Markov process associated with the SDE; namely, they provide a natural tool for proving that the process is topologically irreducible, see [ Thus, in order to prove topological irreducibility of the Markov process associated with (1), it is sufficient to show that Another application of support theorems dates back to the original paper by Stroock and Varadhan [1] and concerns the strong maximum principle for the generator L of the Markov process $X$, which states that any sub-harmonic function for L which reaches its maximum on the given set is constant on this set. For a detailed discussion, we refer the interested reader to [8, Theorem IV.8.3] or [3, Remark 3.3(b)]. Here, we mention briefly the following simple corollary: if $\bigcup_{T>0}S_T(x)$ is dense in $\mathbb{R}^m$ for any $x\in\mathbb{R}^m$, then the strong maximum principle for L holds true on the entire $\mathbb{R}^m$. For this property to hold, it is clearly sufficient that (9) holds true for some $T>0$.
Below, we give two simple sufficient conditions where (9) holds true for any T > 0.

Example 2.5 (Jump noise satisfying the 'cone condition').
Let there exist $\theta\in(0,1)$ such that, for any $\ell\in\mathbb{R}^d$ with $|\ell|=1$ and any $\varepsilon>0$, the intersection of the cone {u : To prove this assertion, one can take $f\equiv 0$ and, for given $x,y\in\mathbb{R}^m$, $\varepsilon>0$ and $T>0$, arrange sequences of (frequent) jump times $\{t_k\}$ and (small) jump amplitudes $u_k$ which force the solution $\varphi$ to (5) with $x_0=x$ to take a final value with $|\varphi_T-y|<\varepsilon$. This construction is essentially the same as in the proof of [11, Proposition 4.17]; thus, we omit the details here.

Example 2.6 (Jump noise of the 'strong Type II').
Let there be no integrability directions for $\mu(du)$; that is, $L=\{0\}$. Assume also that $\sigma(x)$ is point-wise non-degenerate. Then, To prove this assertion, one can ignore the jump part in (5) and consider the solution to the ODE

Let us give two more particular examples illustrating the range of applicability of Theorem 2.1 and of the sufficient conditions from Example 2.5 and Example 2.6. For that purpose, we recall the auxiliary 'scaling' assumption imposed in [3], [4]: for some $\alpha\in(0,2)$, The following example concerns cylindrical Lévy processes, which have been studied extensively in recent decades, e.g., [12], [13], [14].

Example 2.7
Let $\mu(du)$ be the Lévy measure of the Lévy process $Z=(Z_1,\dots,Z_d)$ with independent components $Z_i$, $i=1,\dots,d$. Let the components be symmetric $\alpha_i$-stable processes on $\mathbb{R}$; then, the scaling assumption (10)

Proof of Theorem 2.1
To prove the announced statement, it is sufficient to prove the following two inclusions: supp The proof of the first inclusion is simple and standard; the corresponding argument was explained in [3], and for the reader's convenience we outline it here. Consider a family of SDEs with It is easy to check that $b_\eta\to b$ as $\eta\to 0$ uniformly on compact subsets of $\mathbb{R}^m$. Then, by the usual stochastic calculus technique one can show that in probability; see the proof of a similar statement in Lemma 4.2. On the other hand, Eq. (12) can be written in the form: The PPM $N$, restricted to $\{|u|\ge\eta,\ t\in[0,T]\}$, has a finite set of atoms ('jumps'). Then, the solution to the latter equation can be represented path-wise as a collection of solutions to ODEs of the form (5), where $f\equiv\upsilon_\eta$ belongs to $F^{\mathrm{const}}_{0,T}$, $\{t_k\}$ are the time instants of the jumps, and $\Delta_{t_k}\varphi=c(\varphi_{t_k-},u_k)$, where $\{u_k\}$ are the amplitudes of the jumps for $N$. Because is admissible for any $k$. Therefore, for any $\eta$, and completes the first inclusion in (11).

The second inclusion is the main part of the theorem. In order to proceed with its proof, we rewrite the original SDE (1) in the form where $\eta>0$ is a (small) parameter which is yet to be chosen; recall that $\upsilon_\eta$ is given by (14). Consider the SDE (16) for $X^{\eta,\mathrm{trunc}}$, which can be seen as a modification of (15) with 'large jumps' being truncated, i.e., with the PPM $N(du,dt)$ changed to its restriction to $\{|u|<\eta\}\times[0,T]$. We will denote by $X^{x,S,\eta,\mathrm{trunc}}$ the solution of (16) with $X_S=x$.
Let $f\in F^{\mathrm{step}}_{0,T}$, $x\in\mathbb{R}^m$, $S\in[0,T]$ be fixed and $\varphi^{x,S,f}$ be the solution of the ODE The following statement is the cornerstone of the entire proof. In what follows, we denote by $B(x,r)$ the open ball in $\mathbb{R}^m$ with center $x$ and radius $r$.

Lemma 3.1 (The Key Lemma) Let $f\in F^{\mathrm{step}}_{0,T}$ be fixed. There exists $\rho\in(0,1)$ such that, for any given $x\in\mathbb{R}^m$ and $\gamma>0$, there exists $\eta_{f,x,\gamma}>0$ such that

We postpone the proof of Lemma 3.1 to a separate Sect. 4; here, we explain the argument which provides the second inclusion in (11) once this key lemma is proved. Though quite standard, this argument is a bit cumbersome; hence, we divide the exposition into several steps.
Step 1: Choosing the number and approximate instants of 'big jumps'. Fix $\eta>0$ and decompose The big-jump part of $N$ a.s. has a finite number of atoms with $t\in[0,T]$; say, It is well known that the two PPMs in this decomposition, which carry the atoms with $|u|<\eta$ and $|u|\ge\eta$ respectively, are independent. In addition, the number $J$ of atoms of the big-jump part has the Poisson distribution with the intensity $T\mu(|u|\ge\eta)$, and conditioned on the event $\{J=K\}$, the random vectors $\{\xi_j\}_{j=1}^K$, $\{\tau_j\}_{j=1}^K$ are independent. The corresponding (conditional) laws are the $K$-fold product $(P_\eta)^{\otimes K}$ of the normalized restriction of $\mu$ to $\{|u|\ge\eta\}$ for $\{\xi_j\}_{j=1}^K$, and the uniform distribution on the simplex $\{0<s_1<\dots<s_K<T\}$ for $\{\tau_j\}_{j=1}^K$.

Let $\varphi\in S^{\mathrm{step}}_{0,T,x_0}$ be fixed with the corresponding function $f\in F^{\mathrm{step}}_{0,T}$ and points $\{t_k,\ k=1,\dots,K\}$ from Eq. (5) for $\varphi$. Denote $x_k=\varphi_{t_k-}$, $y_k=\varphi_{t_k}$; we can and will assume that $x_k\ne y_k$, because otherwise we can simply exclude the point $t_k$ from Eq. (5). Denote also $t_0=0$, $t_{K+1}=T$; then for any positive which is positive.
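The structure of the 'big jumps' used in Step 1 (a Poisson number of atoms on [0, T], with jump marks independent of the jump times) can be illustrated by a short simulation. This is a sketch under assumptions that are not taken from the paper: the mass `rate` = μ(|u| ≥ η) is assumed finite and supplied by the caller, the marks are scalar, and `sample_mark` is a hypothetical user-supplied sampler for the normalized restriction of μ to {|u| ≥ η}. The atoms are generated through exponential inter-arrival times, which is a standard equivalent way to obtain a Poisson number of time-ordered atoms.

```python
import random
from typing import Callable, List, Tuple


def sample_big_jumps(
    rate: float,
    T: float,
    sample_mark: Callable[[random.Random], float],
    rng: random.Random,
) -> List[Tuple[float, float]]:
    """Sample the atoms (tau_j, xi_j) of the big-jump part on [0, T].

    rate        : assumed value of mu(|u| >= eta), must be finite.
    sample_mark : hypothetical sampler for one jump amplitude xi_j.
    The returned list is ordered in time; its length is Poisson distributed
    with mean rate * T.
    """
    if rate <= 0.0:
        return []  # no big jumps at all
    atoms = []
    t = rng.expovariate(rate)  # first arrival time
    while t <= T:
        atoms.append((t, sample_mark(rng)))
        t += rng.expovariate(rate)  # independent exponential waiting time
    return atoms
```

A call such as `sample_big_jumps(2.0, 5.0, lambda r: r.choice([-1.0, 1.0]), random.Random(0))` returns one realization of the time-ordered pairs (τ_j, ξ_j); averaging the list length over many realizations should give a value close to rate · T = 10.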
Step 2: Linking the instants of 'big jumps' for $X$ with the discontinuities of $\varphi$. The process $X$ follows the truncated SDE (16) between the 'big jump instants' $\tau_j$, and at these instants satisfies Denote by $X^{s_1,\dots,s_K}$ the similar process with $J=K$ and $\tau_k=s_k$: it follows the truncated SDE (16) on each $[s_{k-1},s_k)$ and satisfies $\Delta_{s_k}X^{s_1,\dots,s_K}=c(X^{s_1,\dots,s_K}_{s_k-},\xi_k)$, $k=1,\dots,K$; the random vector $\{\xi_j\}_{j=1}^K$ has the law $(P_\eta)^{\otimes K}$. Then We will apply this inequality with for the reader's convenience, we prove this relation in Appendix A. Hence, we can fix $\delta(\varepsilon)>0$ such that and thus for any

Step 3: Some preliminary estimates for the distance between $X^{s_1,\dots,s_K}$ and $\varphi^{s_1,\dots,s_K}$. In what follows, $\rho\in(0,1)$ is the number given by Lemma 3.1. Denote note that $\xi_k$ is independent of $G_{s_k}$ for any $k$. The following holds:
(I) For any $k=0,\dots,K$ and $\gamma>0$, the conditional probability w.r.t. $F_{s_k}$ for the event and by Lemma 3.1 it is bounded from below by $p^{\mathrm{trunc}}(\eta,f,y_k,\gamma)$ on the set provided that $\eta\le\eta_{f,y_k,\gamma}$;
(II) for any $k=1,\dots,K$ and $\gamma>0$, the conditional probability w.r.t. $G_{s_k}$ for the event
Recall that each pair $(x_k,y_k)=(\varphi_{t_k-},\varphi_{t_k})$ is admissible; hence, for any $\gamma>0$ and $k$, then by the assumptions $H_1$, $H_2$ there exists $\eta_*>0$ such that, for all $k$, This yields for any $\gamma\in(0,\gamma_*]$ Moreover, because $c(x,u)$ is continuous w.r.t. $x$, we have by the usual weak continuity arguments that, for each $\gamma\in(0,\gamma_*]$ and $k$, there exists $\gamma'>0$ such that

Step 4: Specifying the parameters and completing the proof. Now, we can specify the free parameters $\eta$, $\gamma$, $\delta$ in the above estimates and finalize the entire proof. Let us define iteratively the parameters $\gamma_k$ for $k=K+1,\dots,1$ as follows: Take and let for $k=K,\dots,1$ where $\gamma'_{k+1}$ is such that (22) holds true with this $\gamma'$ and $\gamma=\gamma_{k+1}$.
We will use these parameters to estimate P sup It follows from the calculations in Appendix A that $\delta\in(0,\delta(\varepsilon))$ can be taken small enough such that, for any $s_1,\dots,s_K$ with $|t_k-s_k|<\delta$, $k=1,\dots,K$, We fix such $\delta>0$ and the truncation level Then $A_k\in G_{s_{k+1}}$, $B_k\in F_{s_k}$, $k=1,\dots,K$, and we have the following:
• By Lemma 3.1, for $k=0,\dots,K$, $P(A_k\,|\,F_{s_k})\ge p^{\mathrm{trunc}}(\eta,f,y_k,\gamma_{k+1})$ a.s. on the set $B_k$.

Proof of the Key Lemma
We begin the proof of Lemma 3.1 with the following auxiliary result.

Lemma 4.1 For any $w\in L^\perp$ and $\eta>0$, there exist $\zeta\in(0,\eta)$ and a function $g:\mathbb{R}^d\to[-\tfrac12,\tfrac12]$ such that $g(u)=0$ whenever either $|u|\le\zeta$ or $|u|\ge\eta$ and

Proof For a given $0<\zeta<\eta$, denote by $G^\eta_\zeta$ the set of all functions $g:\mathbb{R}^d\to[-\tfrac12,\tfrac12]$ such that $g(u)=0$ whenever either $|u|\le\zeta$ or $|u|\ge\eta$. This set is convex and symmetric; hence, is a symmetric convex subset of $L^\perp$, and so is the set The statement of the lemma is equivalent to Assuming (25) to fail, we have that $V^\eta_0$ is a proper symmetric convex subset of $L^\perp$, and thus, there exist $\ell\in L^\perp\setminus\{0\}$ and $c\ge 0$ such that Since $0\in V^\eta_0$, we get $c\ge 0$, which proves the second inequality in (26); the first inequality then follows by the symmetry of $V^\eta_0$. It follows from (26) that, for every $\zeta\in(0,\eta)$ and $g\in G^\eta_\zeta$, in the first identity we have used that $u_L\in L$ is orthogonal to $\ell\in L^\perp$. Taking we get from the previous inequality that
$$\int_{\zeta<|u|<\eta}|\ell\cdot u|\,\mu(du)\le 2c,\qquad \zeta\in(0,\eta),$$
and passing to the limit as $\zeta\to 0$ we obtain
$$\int_{|u|<\eta}|\ell\cdot u|\,\mu(du)\le 2c<+\infty.$$
This means that $\ell\in L$, which contradicts the fact that $\ell\in L^\perp\setminus\{0\}$.
We write the compensated PPM in (16) in the form $N(du,dt)-\mu(du)\,dt$ and consider the same SDE with another PPM $Q^{f,\eta}(du,dt)$ which has the intensity measure $(1+g^{f,\eta}_t(u))\,\mu(du)\,dt$: Similarly to the notation used in Lemma 3.1, we denote by Y

Proof For simplicity of notation, we take $S=0$, $Q=T$ and omit the index $S$. We also consider the scalar case $d=1$; for $d>1$ similar estimates should be performed coordinate-wise. The stochastic integral part in Eq. (28) can be written as [0, T], η ∈ (0, 1] In addition, $\sigma$ is Lipschitz and a piece-wise constant function $f\in F^{\mathrm{step}}_{0,T}$ is bounded; hence, we can assume that $L$ is large enough for $|\sigma(x)f_t-\sigma(y)f_t|\le L|x-y|$ to hold. Then on the set $A_{\varepsilon,\eta}:=$ sup which by the Gronwall inequality yields Combined with (34), (35), this proves (36) and completes the proof of (19).