Abstract
This article establishes the cutoff phenomenon in the Wasserstein distance for systems of nonlinear ordinary differential equations with a dissipative stable fixed point subject to small additive Markovian noise. This result extends those of Barrera, Högele, Pardo (EJP2021), obtained in the more restrictive setting of Blumenthal-Getoor index \(\alpha >3/2\), to the formulation in Wasserstein distance, which allows us to cover general Lévy processes with some given finite moment. The main proof techniques are based on the close control of the errors in a version of the Hartman–Grobman theorem and the adaptation of the linear theory established in Barrera, Högele, Pardo (JSP2021). In particular, they rely on the precise asymptotics of the nonlinear flow and the nonstandard shift linearity property of the Wasserstein distance, which was established by the authors in (JSP2021). The main examples are the nonlinear Fermi–Pasta–Ulam–Tsingou gradient flow and dissipative nonlinear oscillators subject to small (and possibly degenerate) Brownian or arbitrary \(\alpha \)-stable noise.
1 Introduction
In this paper, we study the asymptotics of the ergodic behavior of the following stochastic differential equation (SDE)
for small noise intensity \(\varepsilon >0\), where the vector field \(b\in \mathcal {C}^2(\mathbb {R}^d,\mathbb {R}^d)\) satisfies \(b(0)=0\) and the following dissipative condition.
Hypothesis 1
(Dissipativity) There exists a constant \(\delta >0\) such that
The noise process \(L=(L_t)_{t\geqslant 0}\) in (1.1) is a Lévy process with values in \(\mathbb {R}^d\) on a given probability space \((\Omega , \mathcal {F}, \mathbb {P})\). It is well-known that the law of L is characterized by the triplet \((a,\Sigma , \nu )\), where \(a\in \mathbb {R}^d\), \(\Sigma \in \mathbb {R}^{d \times d}\) is a non-negative definite matrix and \(\nu : \mathcal {B}(\mathbb {R}^d) \rightarrow [0, \infty ]\) is a locally finite Borel measure satisfying
For \(\nu =0\) the process L is a multidimensional Brownian motion with drift, while for \(a=0\) and \(\Sigma =0\) we have a multidimensional pure jump process such as a compound Poisson process or an \(\alpha \)-stable process, in particular, the Cauchy process for \(\alpha =1\). We refer to [1, 14, 16, 18, 22] for further details on Lévy processes. Under Hypothesis 1, it is known that the SDE (1.1) has a pathwise unique strong solution, see for instance Theorem 1.1 in [10], here denoted by \(X^\varepsilon (x):=(X^\varepsilon _t(x))_{t\geqslant 0}\). Moreover, \(X^\varepsilon (x)\) is a Markov process and, in particular, it satisfies the Feller property, see Proposition 2.1 in [21].
In order to present the main results of this paper, we formally introduce the Wasserstein distance of order \(p_*\). We assume some finite moment for \(L_t\) and hence \(X^{\varepsilon }_t(x)\) for all \(t\geqslant 0\).
Hypothesis 2
(Finite \(p_*\)-th moment) There exists \(p_*>0\) such that
This article shows the cutoff phenomenon for the family of processes \((X^{\varepsilon }(x))_{\varepsilon >0}\) with respective invariant measures \((\mu ^\varepsilon )_{\varepsilon >0}\) under the Wasserstein distance \(\mathcal {W}_{p_*}\) of order \(p_*>0\). For \(p_*>1\) we characterize the following cutoff profile asymptotics
where \(\mathfrak {t}_\varepsilon =\frac{1}{\mathfrak {q}}|\ln (\varepsilon )|+\frac{\ell -1}{\mathfrak {q}}\ln (|\ln (\varepsilon )|)\) for some explicit positive constants \(\mathfrak {q},\ell ,C\) that depend on x in terms of an \(\omega \)-limit set of the rotational part for the Hartman–Grobman linearization of \(X^0(x)\).
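To get a feeling for the size of the cutoff time scale, the formula for \(\mathfrak {t}_\varepsilon \) can be evaluated numerically. The following is a minimal sketch; the function name `cutoff_time` and the argument names `q` and `ell` (standing for \(\mathfrak {q}\) and \(\ell \)) are illustrative choices and not notation from the article.

```python
import math


def cutoff_time(eps: float, q: float, ell: int) -> float:
    """Return |ln(eps)|/q + (ell - 1)/q * ln(|ln(eps)|) for 0 < eps < 1."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    if q <= 0.0 or ell < 1:
        raise ValueError("q must be positive and ell a positive integer")
    log_eps = abs(math.log(eps))
    # Leading logarithmic term plus the iterated-logarithm correction,
    # which vanishes for ell = 1.
    return log_eps / q + (ell - 1) / q * math.log(log_eps)


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-8):
        print(eps, cutoff_time(eps, q=1.0, ell=1), cutoff_time(eps, q=1.0, ell=2))
```

For \(\ell =1\) the correction term disappears and the time scale is purely logarithmic in \(\varepsilon \).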
For such processes \((X^{\varepsilon }(x))_{\varepsilon >0}\) where (1.3) fails, we establish the following weaker window cutoff asymptotics
Our results generalize those of [2] to nonlinear vector fields, and those of [3, 5] and [6] to the Wasserstein distance, which covers second order equations with degenerate noise. For a detailed introduction on the subject we refer to the aforementioned articles, in particular, see Table 1.1 in [3]. There is a particular advantage of studying this problem under the Wasserstein distance rather than in the total variation distance. While the Wasserstein distance only requires the existence of moments of \(X^{\varepsilon }(x)\) of a given order, the total variation distance requires in addition the existence and regularity of its density. The latter brings further requirements for the Lévy process L which can be quite restrictive, see [3] for further details. Furthermore, in the Wasserstein case, at least when \(X^{\varepsilon }(x)\) has moments of order \(p> 1\), the cutoff phenomenon of \((X^{\varepsilon }(x))_{\varepsilon > 0}\) is completely determined by an explicit function (see Theorem 2 below), here called the cutoff profile. On the contrary, in the total variation case the profile function can be very involved and even hard to simulate in examples.
In [4], the cutoff phenomenon with respect to the total variation distance is studied for SDEs of the type (1.1) in the one-dimensional case, with L being a standard Brownian motion and a general drift coefficient b satisfying Hypothesis 1. Since scalar systems are gradient systems, there is always a cutoff profile which can be given explicitly in terms of the Gauss error function. The follow-up work [5] covers the multidimensional case, where the picture is considerably richer, due to the presence of strong and complicated rotational patterns. The authors characterize sharply the existence of a cutoff profile in terms of the omega limit sets appearing in the long-term behavior of the matrix exponential function \(e^{-\mathcal {Q} t}x\) in Lemma B.2 in [5], which plays an analogous role in this article. The paper [6] is the first attempt to study the cutoff phenomenon for such models with jumps. More precisely, [6] covers the cutoff phenomenon with respect to the total variation distance for generalized Ornstein-Uhlenbeck processes. These processes satisfy an SDE of the form (1.1) with L being a Lévy process and \(b(x)=\mathcal {Q} x\), where \(\mathcal {Q}\) is a square real matrix whose eigenvalues have positive real parts. The proof methods are based on concise Fourier inversion techniques. Due to the aforementioned regularity required by the total variation distance, the results in [6] are given under the hypothesis of continuous densities of the marginals, which to date has not been characterized in simple terms. The cutoff profile function in [6] is given in terms of the Lévy-Ornstein-Uhlenbeck limiting measure for \(\varepsilon =1\) and measured in the total variation distance. Such profile functions are theoretically highly insightful, but almost impossible to calculate and simulate in examples. As in [5], the characterization of the existence of a cutoff profile remains in abstract terms of the behavior of the mentioned profile function on a suitably defined omega limit set. The Wasserstein case is treated in [2] where, contrary to the total variation case, it is noted that the profile function takes an explicit and simple shape. Finally, [3] treats the cutoff phenomenon with respect to the total variation distance for (1.1) with b satisfying Hypothesis 1 and driven by a Lévy process in the rather restrictive class of strongly locally layered stable processes (see Definition 1.4 in [3]).
In this article we combine a nonlinear version of the Wasserstein estimates of [2], with the Freidlin-Wentzell first order approximation of (1.1) in the spirit of [3] and the fine properties of the Wasserstein distance given in Lemma 2.1, in particular, the non-standard shift linearity of Lemma 2.1.d).
The manuscript is organized in four parts. After the exposition of the setting and the presentation of the main results in Sect. 2, we illustrate our findings for the nonlinear Fermi–Pasta–Ulam–Tsingou gradient system and a class of nonlinear oscillators in Sect. 3. The main steps of the proof of the cutoff phenomenon are given in Sect. 4, while the auxiliary technical results, such as the exponential ergodicity in Wasserstein distance and the coupling between the original nonlinear system and the Freidlin-Wentzell linearization, are given in the “Appendix”.
2 Setting and Main Results
2.1 Fine Properties of the Wasserstein Distance
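For any two probability distributions \(\mu _1\) and \(\mu _2\) on \(\mathbb {R}^d\) with finite \(p_*\)-th moment for some \(p_*>0\), we define the Wasserstein \(p_*\)-distance between them as follows
$$\begin{aligned} \mathcal {W}_{p_*}(\mu _1,\mu _2)=\inf _{\Pi }\Big (\int _{\mathbb {R}^d\times \mathbb {R}^d} |v_1-v_2|^{p_*}\, \Pi (\mathrm {d}v_1,\mathrm {d}v_2)\Big )^{1\wedge (1/p_*)}, \end{aligned}$$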
where the infimum is taken over all couplings (joint distributions on \(\mathbb {R}^d\times \mathbb {R}^d\)) \(\Pi \) with marginals \(\mu _1\) and \(\mu _2\). We refer to [12, 20] and references therein for more details. For convenience of notation we do not distinguish a random variable U and its law \(\mathbb {P}_U\) as an argument of \(\mathcal {W}_{p_*}\). That is, for random variables \(U_1\), \(U_2\) and probability measure \(\mu \) we write \(\mathcal {W}_{p_*}(U_1, U_2)\) instead of \(\mathcal {W}_{p_*}(\mathbb {P}_{U_1}, \mathbb {P}_{U_2})\), \(\mathcal {W}_{p_*}(U_1, \mu )\) instead of \(\mathcal {W}_{p_*}(\mathbb {P}_{U_1}, \mu )\) etc. The next result establishes properties of the Wasserstein distance which turn out to be important for our arguments.
Lemma 2.1
(Properties of \(\mathcal {W}_{p_*}\)) For \(p_*>0\), \(u_1,u_2\in \mathbb {R}^d\), \(c\in \mathbb {R}\) and \(U_1\) and \(U_2\) being random vectors in \(\mathbb {R}^d\) with finite \(p_*\)-th moment we have the following:
(a) The Wasserstein distance \(\mathcal {W}_{p_*}\) is a metric.
(b) Translation invariance: \(\mathcal {W}_{p_*}(u_1+U_1,u_2+U_2)=\mathcal {W}_{p_*}(u_1-u_2+U_1,U_2)\).
(c) Homogeneity:
$$\begin{aligned} \mathcal {W}_{p_*}(c\cdot U_1,c\cdot U_2)= {\left\{ \begin{array}{ll} |c|\;\mathcal {W}_{p_*}(U_1,U_2)&{}\text { for } p_*\in [1,\infty ),\\ |c|^{p_*}\;\mathcal {W}_{p_*}(U_1,U_2)&{}\text { for } p_*\in (0,1). \end{array}\right. } \end{aligned}$$
(d) Shift linearity: For \(p_*\geqslant 1\) it follows
$$\begin{aligned} \mathcal {W}_{p_*}(u_1+U_1,U_1)=|u_1|. \end{aligned}$$(2.1)
For \(p_*\in (0,1)\) we have
$$\begin{aligned} \max \{|u_1|^{p_*}-2\mathbb {E}[|U_1|^{p_*}],0\}\leqslant \mathcal {W}_{p_*}(u_1+U_1,U_1)\leqslant |u_1|^{p_*}. \end{aligned}$$(2.2)
(e) Domination: For any given coupling \(\tilde{\Pi }\) between \(U_1\) and \(U_2\) it follows
$$\begin{aligned} \mathcal {W}_{p_*}(U_1, U_2) \leqslant \Big (\int _{\mathbb {R}^d\times \mathbb {R}^d} |v_1-v_2|^{p_*} \tilde{\Pi }(\mathrm {d}v_1,\mathrm {d}v_2)\Big )^{1\wedge (1/p_*)}. \end{aligned}$$
(f) Characterization: Let \((U_n)_{n\in \mathbb {N}}\) be a sequence of random vectors with finite \(p_*\)-th moments and U a random vector with finite \(p_*\)-th moment. Then the following statements are equivalent:
(1) \(\mathcal {W}_{p_*}(U_n, U) \rightarrow 0\) as \(n\rightarrow \infty \).
(2) \(U_n {\mathop {\longrightarrow }\limits ^{d}} U\) as \(n \rightarrow \infty \) and \(\mathbb {E}[|U_n|^{p_*}] \rightarrow \mathbb {E}[|U|^{p_*}]\) as \(n\rightarrow \infty \).
For \(p_*\in (0,1)\) equality (2.1) is false in general, see Remark 2.4 in [2]. The proof of the previous lemma is given in Lemma 2.2 in [2].
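Items (c) and (d) of Lemma 2.1 can be checked numerically on empirical measures in dimension one. The sketch below is an illustration and not a proof. It relies on the standard fact that, for \(p\geqslant 1\), the optimal coupling between two samples of equal size on the real line pairs the order statistics. The helper name `wasserstein_1d`, the sample sizes and the chosen distributions are assumptions made for this illustration only.

```python
import random


def wasserstein_1d(xs, ys, p):
    """Empirical W_p for p >= 1 between two equally sized samples on the real line.

    For convex costs on the line the monotone (sorted) pairing is optimal.
    """
    if p < 1:
        raise ValueError("this sketch only covers p >= 1")
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(a - b) ** p for a, b in pairs) / len(xs)
    return cost ** (1.0 / p)


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
other = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
u, c, p = 0.75, -3.0, 2.0

# (d) shift linearity: W_p(u + U, U) = |u| for p >= 1
shift = wasserstein_1d([u + s for s in sample], sample, p)
# (c) homogeneity: W_p(c U1, c U2) = |c| W_p(U1, U2) for p >= 1
scaled = wasserstein_1d([c * s for s in sample], [c * o for o in other], p)
base = wasserstein_1d(sample, other, p)
print(shift, scaled, abs(c) * base)
```

The case \(p\in (0,1)\) is deliberately excluded, since the cost \(|v_1-v_2|^{p}\) is then concave and the sorted pairing is in general no longer optimal.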
The following result yields the existence of a unique invariant distribution for (1.1) under Hypotheses 1 and 2. Moreover, under the Wasserstein distance, the strong solution of (1.1) is exponentially ergodic.
Proposition 1
(Existence of a unique invariant distribution) Under Hypothesis 1 and Hypothesis 2 for some \(p_*>0\) there exists a unique invariant probability measure \(\mu ^\varepsilon \) such that
The proof is given in “Appendix 1”.
2.2 Hartman–Grobman Asymptotics
The zeroth-order approximation of a smooth dynamical system on a finite time horizon [0, T] subject to small perturbations is given by the deterministic system, that is, \((X^0_t(x))_{t\in [0,T]}\). Our main results treat small asymptotics close to the stable state 0 which translates to meaningful time scales \(t_\varepsilon \rightarrow \infty \), as \(\varepsilon \rightarrow 0\), in Theorems 1 and 2. Before we state our main result, we first provide the long-time asymptotics of \(X^0_t(x)\) in terms of the spectral decomposition of the solution \(t\mapsto e^{-Db(0)t}x^*\) of the respective linear system for some \(x^*\) in a small neighbourhood of the origin.
Lemma 2.2
(Asymptotic Hartman–Grobman) Assume Hypothesis 1. Then for any \(x\in \mathbb {R}^{d}\setminus \{0\}\) there exist:
(i) positive constants \( \mathfrak {q}^x, \tau ^x,\ell ^x, m^x\) with \(\ell ^x,m^x\in \{1,\ldots ,d\}\),
(ii) angular velocities \(\theta ^{x}_{1},\dots ,\theta ^x_{m^x}\in \mathbb {R}\), where all \(\theta ^x_k \ne 0\) come in pairs \((\theta ^x_{j_*},\theta ^x_{j_*+1})=(\theta ^x_{j_*}, -\theta ^x_{j_*})\),
(iii) linearly independent vectors \(v_1^x,\dots ,v_{m^x}^x\) in \(\mathbb {C}^d\) which are complex conjugate \((v^x_{j_*},v^x_{j_*+1})=(v^x_{j_*}, \bar{v}^x_{j_*})\) whenever \((\theta ^x_{j_*},\theta ^x_{j_*+1})=(\theta ^x_{j_*}, -\theta ^x_{j_*})\),
such that
Moreover,
The formal proof of the previous lemma is given in Lemma B.2 in Appendix B of [5].
Remark 2.3
(1) Convention: Note that \(\theta ^x_k=0\) is true for at most one index \(k\in \{1,\ldots , m^x\}\). If such an index shows up in \(\theta ^x_{1},\ldots , \theta ^x_{m^x}\) we adopt the convention that \(\theta ^x_1=0\) and \(v_1^x\in \mathbb {R}^d\), and hence \(m^x=2n+1\) for some \(n\in \mathbb {N}_0\). Otherwise, \(m^x=2n\) for some \(n\in \mathbb {N}_0\) and we eliminate \(\theta ^x_1\) and count the angular velocities as follows \(\theta ^x_2,\ldots , \theta ^x_{2n+1}\).
(2) Note that the linearly independent complex vectors \(v_1^x,\dots ,v_{m^x}^x\) in \(\mathbb {C}^d\) not only depend on x but also crucially on the dissipation time \(\tau ^x\) of the deterministic system to a Hartman–Grobman domain of conjugacy U. We stress that \(\tau ^x\) is not unique since \(X^0_{t+\tau ^x}(x) \in U\) for all \(t\geqslant 0\).
(3) A word about the parameters \(\ell ^x\), \(\mathfrak {q}^x\) and \(m^x\) in Lemma 2.2. By the Hartman–Grobman theorem there are open sets \(0\in U, V\subset \mathbb {R}^d\) and a homeomorphism \(H: U\rightarrow V\) with \(H(0) = 0\) satisfying for all \(u\in U\) and \(t\geqslant 0\)
$$\begin{aligned} H(X^0_t(u)) = e^{- Db(0) t}H(u). \end{aligned}$$(2.6)
In fact, by Hypothesis 1 we have that H is a \(\mathcal {C}^1\)-diffeomorphism, see the original paper [8] or Theorem (Hartman), Sect. 2.8, p. 127 in [13]. In [8] it is shown that H can be chosen to be
$$\begin{aligned} H(x) = x + o(|x|)_{|x| \rightarrow 0}. \end{aligned}$$
Let \(\tilde{u} = X^0_{\tau ^x}(x) \in U\). With the help of a linear coordinate change W we obtain the Jordan normal form \(Db(0) = W^{-1} J(Db(0)) W\) and (using the linearity of the semigroup)
$$\begin{aligned} H(X^0_{t+\tau ^x}(x)) = W^{-1} e^{- J(Db(0)) t} (W H(\tilde{u})). \end{aligned}$$
We denote \(\tilde{w} = W H(\tilde{u})\). Now, the parameters \(\ell ^x\), \(\mathfrak {q}^x\) and \(m^x\) are given as follows. Consider the sequence of generalized eigenspaces \(H_{j}\) of J(Db(0)) such that
$$\begin{aligned} \mathbb {R}^d = H_1\oplus \dots \oplus H_{k_*}. \end{aligned}$$
By construction, \(\tilde{w} \in G(\tilde{w}) := \text{ span }(\{H_k~|~\text{ where } 1\leqslant k\leqslant k_*: ~\text{ proj }(\tilde{w}, H_k)\ne 0\})\). Note that \(G(\tilde{w})\) is unique. We consider the restriction
$$\begin{aligned} \tilde{J}(\tilde{w}):= J(Db(0))\big |_{G(\tilde{w})}. \end{aligned}$$
Now, \(\mathfrak {q}^x\) is the smallest real part of the spectrum of \(\tilde{J}(\tilde{w})\), \(\ell ^x\) is the dimension of the largest Jordan block of \(\tilde{J}(\tilde{w})\) whose eigenvalue has real part \(\mathfrak {q}^x\), and \(m^x\) is the number of Jordan blocks associated to \(\mathfrak {q}^x\) and \(\ell ^x\). Note that in case of a non-real eigenvalue with real part \(\mathfrak {q}^x\) and Jordan block size \(\ell ^x\), we have \(m^x\geqslant 2\). For an extensive numerical example for a linear chain of oscillators we refer to Sect. 4.3.2 in [2].
2.3 Main Results
Our first main result establishes the \(\infty /0\) collapse of the Wasserstein distance between the law of the current state \(X^{\varepsilon }_t(x)\) and the dynamical equilibrium \(\mu ^\varepsilon \) along the critical time scale \(\mathfrak {t}^x_\varepsilon \) given in (2.7) under mild conditions.
Theorem 1
(Window cutoff) Let b satisfy Hypothesis 1 and \(\nu \) satisfy Hypothesis 2 for some \(p_*>0\). Fix \(x\in \mathbb {R}^d\setminus \{0\}\) and consider the notation in the asymptotic Hartman–Grobman representation \(\mathfrak {q}^x>0\), \(\ell ^x , m^x \in \{1,\ldots , d\}\), \(\theta ^x_1,\dots ,\theta ^x_{m^x} \in [0,2\pi )\), \(v^x_1,\dots ,v^x_{m^x} \in \mathbb {C}^d\) and \(\tau ^x>0\) of Lemma 2.2.
Then the family of processes \((X^{\varepsilon }(x))_{\varepsilon >0}\) exhibits a window cutoff phenomenon on the time scale
and for all asymptotically constant window sizes \(w_\varepsilon \), that is, \(w_\varepsilon \rightarrow w>0\) as \(\varepsilon \rightarrow 0\), in the following sense. For all \(0<p< p_*\) we have
The second main result provides two characterizations for the proper limits (\(\varepsilon \rightarrow 0\)) of the expressions in (2.8) for any fixed \(r\in \mathbb {R}\). That is to say, we characterize under which conditions the asymptotics (1.3) is satisfied. In addition, it yields the precise shape of the limit, which turns out to be a simple exponential function for \(p\in [1,p_*)\).
Theorem 2
(Dynamical profile cutoff characterization for \(p_*>0\)) Let the assumptions (and the notation) of Theorem 1 be valid for some \(p_*>0\). Consider the unique strong solution \((\mathcal {O}_t)_{t\geqslant 0}\) of the linear system
where \(\mathcal {O}_\infty \) is the unique invariant probability distribution of (2.9).
(1) Then for any \(0<p< p_*\) the following statements are equivalent.
(i) For any \(\lambda >0\), the function \(\omega (x)\ni u\mapsto \mathcal {W}_{p}(\lambda u+\mathcal {O}_\infty ,\mathcal {O}_\infty )\) is constant, where
$$\begin{aligned} \omega (x):= \Big \{ \text {accumulation points of } \sum _{k=1}^m e^{i t \theta ^x_k} v^x_k \text { as } t\rightarrow \infty \Big \}. \end{aligned}$$
(ii) The family of processes \((X^{\varepsilon }(x))_{\varepsilon >0}\) exhibits a profile cutoff for any \(0< p< p_*\) as follows
$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \frac{\mathcal {W}_{p}(X^\varepsilon _{\mathfrak {t}^x_\varepsilon +r\cdot w_\varepsilon }(x),\mu ^\varepsilon )}{\varepsilon ^{1\wedge p}}= \mathcal {P}^{x}_{p}(r) \quad \text { for any } r\in \mathbb {R}, \end{aligned}$$
where
$$\begin{aligned} \mathcal {P}^{x}_{p}(r):=\mathcal {W}_{p}\Big (\kappa ^x(r)\cdot v+ \mathcal {O}_\infty ,\mathcal {O}_\infty \Big ) \qquad \text{ for } \text{ any } v\in \omega (x) \end{aligned}$$(2.10)
and
$$\begin{aligned} \kappa ^x(r)= \frac{e^{-\mathfrak {q}^x r\cdot w}}{e^{\mathfrak {q}^x \tau ^x}(\mathfrak {q}^x)^{\ell ^x-1}}. \end{aligned}$$
(2) For \(p_*> 1\) and \(p\in [1,p_*)\) the profile has the shape
$$\begin{aligned} \mathcal {P}^{x}_{p}(r)=\kappa ^x(r)\cdot |v|\quad \text { for all }v\in \omega (x) \end{aligned}$$
if and only if \(\omega (x)\) is contained in a sphere in \(\mathbb {R}^d\) with respect to the Euclidean norm.
(3) We recall the convention of Remark 2.3. Let \(p_*> 1\) and \(p\in [1,p_*)\). If the angles \(\theta ^x_{2},\ldots , \theta ^x_{2n}\) satisfy the following non-resonance condition
$$\begin{aligned} h_1\theta _2 + \ldots + h_n \theta _{2n} \notin 2 \pi \cdot \mathbb {Z} \qquad \text{ for } \text{ all } (h_1, \ldots , h_n)\in \mathbb {Z}^n\setminus \{0\}, \end{aligned}$$(2.11)
then the statements (i) and (ii) in item (1) are equivalent to the following normal growth condition of the asymptotic Hartman–Grobman linearization: The family of limiting vectors \((v_1^x,\mathsf {Re}\,v^x_2,\mathsf {Im}\, v^x_2,\ldots ,\mathsf {Re}\, v^x_{2n},\mathsf {Im}\, v^x_{2n})\) is orthogonal in \(\mathbb {R}^d\) and satisfies
$$\begin{aligned} |\mathsf {Re}\, v^x_{2k}|=|\mathsf {Im}\, v^x_{2k}|\qquad \text{ for } \text{ all } \quad k=1,\ldots ,n. \end{aligned}$$
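The scaling factor \(\kappa ^x(r)\) and the explicit profile of item (2) in Theorem 2 are straightforward to evaluate. The following sketch assumes \(p\in [1,p_*)\) and that \(\omega (x)\) lies on a sphere, so that \(|v|\) is the same for all \(v\in \omega (x)\). The function and argument names (`kappa`, `profile_p_ge_1`, `v_norm`) are hypothetical and chosen for readability.

```python
import math


def kappa(r, w, q, tau, ell):
    """kappa^x(r) = exp(-q r w) / (exp(q tau) * q**(ell - 1))."""
    return math.exp(-q * r * w) / (math.exp(q * tau) * q ** (ell - 1))


def profile_p_ge_1(r, v_norm, w, q, tau, ell):
    """Profile kappa^x(r) * |v| from item (2), for p in [1, p_*)."""
    return kappa(r, w, q, tau, ell) * v_norm


if __name__ == "__main__":
    # Illustrative parameter values, not taken from the article.
    for r in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(r, profile_p_ge_1(r, v_norm=1.0, w=1.0, q=1.0, tau=0.5, ell=1))
```

The profile is a decreasing exponential in r: it grows without bound as \(r\rightarrow -\infty \) and vanishes as \(r\rightarrow \infty \), which is the \(\infty /0\) collapse around the cutoff time.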
Remark 2.4
We stress that \(\mathcal {O}_\infty = \lim _{t\rightarrow \infty } \mathcal {O}_t\) in \(\mathcal {W}_{p_*}\) and due to Hypothesis 1 (in combination with Hypothesis 2) the distribution of \(\mathcal {O}_\infty \) does not depend on any deterministic initial condition of (2.9).
Due to their relevance as physical observables, we formulate the corresponding window cutoff result for the respective moments.
Corollary 2.5
(Moments cutoff) Let the assumptions (and the notation) of Theorem 1 be valid for some \(p_*>0\). Then for any \(0< p<p_*\) it follows
3 Examples
In this section we present two examples which illustrate the applicability of Theorem 1 and Theorem 2 to nonlinear dynamics with degenerate noise.
Example 3.1
(The Fermi–Pasta–Ulam–Tsingou potential) We consider the nonlinear Langevin gradient system
for the strongly convex quartic Fermi–Pasta–Ulam–Tsingou potential \(\mathcal {U}(x) = \frac{1}{2} |x|^2 + \frac{1}{4}|x|^4\), \(x\in \mathbb {R}^d\) subject to degenerate noise \(\mathrm {d}L_t\). For any Lévy process L satisfying Hypothesis 2 for some \(p_*>0\) the system (3.1) exhibits a profile cutoff due to Theorem 2 where the cutoff time is given by \(\mathfrak {t}_\varepsilon ^x = |\ln (\varepsilon )|\). For \(p_*>1\) and any \(p\in [1,p_*)\) the profile function in \(\mathcal {W}_{p}\) is always of the following exponential shape
where \(\tau ^{x}:=\min \{t\geqslant 0:|X^0_t(x)|\leqslant R_0/2\}\) and \(R_0\) is a small radius inside of which the Hartman–Grobman conjugation is valid. Note that \(\tau ^x\) can be replaced by any upper bound of \(\tau ^x\), such as for instance \((1/\delta )\ln (2|x|/R_0)\) given by Hypothesis 1.
In particular, the profile cutoff (3.2) is valid for \(L=L^\alpha \) being a (possibly degenerate) \(\alpha \)-stable process with index \(\alpha \in (1,2]\). Note that for the limiting case of a possibly degenerate Cauchy process (\(\alpha =1\)) and in fact of any \(L^\alpha \) with index \(\alpha \in (0,1)\), Theorem 2 also yields a profile cutoff. However, the profile function is not explicit. This is due to the absence of a finite first moment and the lack of the shift linearity (2.1). In other words, the profile function is given in (2.10) for \(p\in (0,\alpha )\) and, to our knowledge, it is unknown how to simplify it further. Note that the case of \(\alpha \in (0,3/2]\) is new and is not covered in [3].
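The cutoff time \(|\ln (\varepsilon )|\) in Example 3.1 reflects the decay rate of the deterministic flow. The following sketch assumes the sign convention that the noiseless dynamics is \(\dot{x}=-\nabla \mathcal {U}(x)=-(1+|x|^2)x\), which preserves the direction of x, so that \(r=|x|\) solves the scalar equation \(\dot{r}=-r-r^3\). Setting \(s=r^2\) gives \(s/(1+s)=s_0/(1+s_0)\,e^{-2t}\), hence \(e^{t}r(t)\rightarrow \sqrt{s_0/(1+s_0)}\). This closed form is derived here for illustration and is not stated in the article; all function names are hypothetical.

```python
import math


def radial_drift(r):
    """Radial part of -grad U for U(x) = |x|^2/2 + |x|^4/4."""
    return -r - r ** 3


def rk4_radius(r0, t_end, steps):
    """Integrate r' = -r - r^3 with the classical Runge-Kutta scheme."""
    h = t_end / steps
    r = r0
    for _ in range(steps):
        k1 = radial_drift(r)
        k2 = radial_drift(r + 0.5 * h * k1)
        k3 = radial_drift(r + 0.5 * h * k2)
        k4 = radial_drift(r + h * k3)
        r += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r


def exact_radius(r0, t):
    """Closed form from s/(1+s) = s0/(1+s0) * exp(-2t), with s = r^2."""
    c = r0 ** 2 / (1.0 + r0 ** 2)
    decay = c * math.exp(-2.0 * t)
    return math.sqrt(decay / (1.0 - decay))


r0, t_end = 2.0, 10.0
numeric = rk4_radius(r0, t_end, steps=10_000)
limit = math.sqrt(r0 ** 2 / (1.0 + r0 ** 2))
# The rescaled radius exp(t) * r(t) stabilizes, i.e. r(t) decays like exp(-t).
print(math.exp(t_end) * numeric, math.exp(t_end) * exact_radius(r0, t_end), limit)
```

Since \(|X^0_t(x)|\) decays like \(e^{-t}\), the rescaled drift \(\varepsilon ^{-1}X^0_t(x)\) is of order one precisely for t of order \(|\ln (\varepsilon )|\), in line with \(\mathfrak {q}^x=1\) and \(\ell ^x=1\).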
Example 3.2
(Nonlinear non-gradient with degenerate noise) For \(F,\mathcal {H}\in \mathcal {C}^2(\mathbb {R}^2,\mathbb {R})\) we consider the following perturbed simple harmonic oscillator with unit angular frequency given in Section 4 of [19] subject to a small noise perturbation
where \(\mathcal {L}=(\mathcal {L}_t)_{t\geqslant 0}\) is a one dimensional Lévy process with finite \(p_*\)-th moments. The Jacobian matrix \(Jb(v_1,v_2)\) at \((v_1,v_2)\) of the respective vector field \(b:\mathbb {R}^2\rightarrow \mathbb {R}^2\) is given by
It is enough to prove the existence of a positive constant \(\delta \) such that for any \(u_1,u_2,v_1,v_2\in \mathbb {R}\) it follows
For instance, for a nonlinear perturbation of a linear oscillator, that is, \(F(v_1,v_2)=\eta \) for some \(\eta >0\), the preceding condition reads
For \(\mathcal {L}\) satisfying Hypothesis 2 with \(p_*\), and F, \(\mathcal {H}\) fulfilling (3.3) Theorem 1 implies window cutoff for any initial condition \((X^{\varepsilon ,1}_0,X^{\varepsilon ,2}_0)=x\in \mathbb {R}^2\setminus \{0\}\) and any \(p\in (0,p_*)\). The cutoff time is given by
Note that this result is new even in the Brownian case since the results of [3] and [5] are stated for the total variation distance which requires regularity of the transition probabilities given in the setting of non-degenerate noise. In our case, the Wasserstein distance circumvents this difficulty by the continuity of \(\mathcal {W}_{p}(x+X,X)\) for any \(X\in L^{p}\) as \(|x|\rightarrow 0\) and \(|x|\rightarrow \infty \), while for the total variation distance this requires absolute continuity of the distribution of X. We refer to [3], Lemma 1.17 in Subsection 1.3.5, for an example where the continuity of the total variation distance under shifts is not valid.
In the sequel, we characterize the existence of a profile cutoff under (3.3) in terms of the linearization at the stable state (0, 0). Let \(a:=-\partial ^2_{11}\mathcal {H}(0,0)\), \(b:=-\partial ^2_{22}\mathcal {H}(0,0)\), \(c:=-\partial ^2_{12}\mathcal {H}(0,0)\) and \(\eta _0:=-F(0,0)\). Then
Note that \(\eta _0=c\) implies that the eigenvalues of Jb(0, 0) are the numbers a and b which are positive and hence by Theorem 2 profile cutoff is valid. In the sequel we assume \(\eta _0 \ne c\). Then the eigenvalues of Jb(0, 0) are given by
with corresponding eigenvectors
In addition,
For \(\Delta \geqslant 0\) Theorem 2 yields a profile cutoff phenomenon. For \(\Delta <0\) Theorem 1 implies the weaker window cutoff phenomenon, however, by part (3) of Theorem 2 the stronger profile cutoff for \(p_*>1\) and \(p\in [1,p_*)\) is valid if and only if
which is equivalent to the special case \(a=b\) and \(c=0\). In other words, \(e^{-Jb(0,0)t}=e^{-at}R(\theta t)\), where \(R(\theta t)\) is an orthogonal \(2\times 2\) matrix with angle \(\theta t\).
Remark 3.3
(A word about the linear dynamics) In [2] the authors study (1.1) for the linear vector field \(b(x)=\mathcal {Q}x\) for any Hurwitz stable matrix \(-\mathcal {Q}\), that is, \(\mathsf {Re}(\lambda )<0\) for any eigenvalue \(\lambda \) of \(-\mathcal {Q}\). Under these assumptions, the results of Theorem 1 and Theorem 2 are obtained.
It is not hard to see that Hypothesis 1 implies \(\mathsf {Re}(\lambda )\leqslant -\delta \) for any eigenvalue \(\lambda \) of \(-\mathcal {Q}\) and hence Hurwitz stability. However, the dissipativity condition (1.2) which is assumed in order to control the nonlinear vector field, is strictly stronger than Hurwitz stability. For instance, the vector field \(b:\mathbb {R}^2 \rightarrow \mathbb {R}^2\) given by \(b(x)=\mathcal {Q}x\) with
has eigenvalues with real part \(-\lambda /2<0\), but it does not satisfy Hypothesis 1. Note that the dissipativity condition (1.2) is not even satisfied locally in a neighborhood of the origin.
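The gap between Hurwitz stability and dissipativity in Remark 3.3 can be checked for any concrete \(2\times 2\) matrix. The sketch below makes two assumptions that should be kept in mind. First, it assumes that for a linear field \(b(x)=\mathcal {Q}x\) the dissipativity condition amounts to \(\langle \mathcal {Q}x,x\rangle \geqslant \delta |x|^2\), that is, to a positive smallest eigenvalue of the symmetric part of \(\mathcal {Q}\). Second, the matrix used is a hypothetical example chosen for this illustration and is not the matrix of the remark.

```python
import cmath


def eigenvalues_2x2(m):
    """Eigenvalues of a real 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def dissipativity_constant(m):
    """Smallest eigenvalue of the symmetric part (m + m^T)/2.

    This is the largest delta with <m x, x> >= delta |x|^2 for all x.
    """
    (a, b), (c, d) = m
    off = (b + c) / 2.0
    sym = ((a, off), (off, d))
    return min(ev.real for ev in eigenvalues_2x2(sym))


Q = ((1.0, 4.0), (0.0, 1.0))  # hypothetical matrix, not the one from the text
print(eigenvalues_2x2(Q))         # spectrum of Q: positive real parts, so -Q is Hurwitz stable
print(dissipativity_constant(Q))  # negative: no delta > 0 satisfies the quadratic bound
```

For this matrix the quadratic form \(\langle \mathcal {Q}x,x\rangle =x_1^2+4x_1x_2+x_2^2\) is negative at \(x=(1,-1)\), although both eigenvalues of \(\mathcal {Q}\) equal 1.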
4 Proofs of the Main Results
4.1 The First Order Approximation
We define the Freidlin-Wentzell first order approximation given by
![](http://media.springernature.com/lw214/springer-static/image/art%3A10.1007%2Fs10884-022-10138-1/MediaObjects/10884_2022_10138_Equ18_HTML.png)
where \((\mathcal {Y}^{x}_t)_{t\geqslant 0}\) is the unique strong solution of the linear inhomogeneous SDE
![](http://media.springernature.com/lw324/springer-static/image/art%3A10.1007%2Fs10884-022-10138-1/MediaObjects/10884_2022_10138_Equ19_HTML.png)
In [3], Lemma C.4 in Section C.4 it is shown that \(Y^{\varepsilon }_t(x)\) converges in total variation distance to a unique limiting distribution \(\mu ^\varepsilon _*\) as \(t\rightarrow \infty \). Moreover, it is shown there that \(\mu ^\varepsilon _*{\mathop {=}\limits ^{d}}\varepsilon \mathcal {O}_\infty \), where \(\mathcal {O}_\infty \) is the unique invariant probability distribution of the homogeneous Ornstein-Uhlenbeck dynamics
In the sequel we reduce the nonlinear ergodic convergence of \(X^\varepsilon _t(x)\) to the ergodic convergence of the Freidlin-Wentzell linearization \(Y^\varepsilon _t(x)\) in (4.4) up to error terms. For any \(0<p\leqslant p_*\), by the triangle inequality it follows that
for any \(t\geqslant 0\), \(x\in \mathbb {R}^d\). Analogously we estimate
Combining the preceding inequalities we obtain the linear approximation
for any \(t\geqslant 0\), \(x\in \mathbb {R}^d\). In Proposition 2 given in “Appendix B.2” we show that for any \(t_\varepsilon =O(|\ln (\varepsilon )|)\) and \(0< p < p_*\) the following limit holds
Moreover, in Lemma B.2 we show that for \(0< p < p_*\)
4.2 Derivation of the Cutoff Phenomenon
In the sequel, we analyze the asymptotic behavior of \(\mathcal {W}_{p}(Y^\varepsilon _t(x), \mu ^\varepsilon _*)\cdot \varepsilon ^{-(1\wedge p)}\) from which we recognize the cutoff of the Freidlin-Wentzell linearization \(Y^\varepsilon _t(x)\). By the triangle inequality, translation invariance, homogeneity and shift linearity given in Lemma 2.1 we obtain for \(0< p\leqslant p_*\)
Analogously we deduce
Consequently,
The right-hand side of (4.7) does not depend on \(\varepsilon \) and by Lemma B.3 it tends to 0 as \(t\rightarrow \infty \). It is therefore enough to study the precise long-term behavior of \(\mathcal {W}_{p}(\varepsilon ^{-1}\cdot X^0_t(x) + \mathcal {O}_\infty , \mathcal {O}_\infty )\) in order to derive the cutoff phenomenon.
4.3 Proof of Theorem 1
For any \(0< p< p_*\), with \(\mathfrak {t}^x_\varepsilon \) and \(w_\varepsilon \) as given in the statement of Theorem 1 and \(r\in \mathbb {R}\), the relations (4.4), (4.5), (4.6) and (4.7) yield
For short, we define
Claim A.
and
for any \(0<p< p_*\). In particular, the limit
Proof of Claim A
In the sequel we study the asymptotics of the drift term \(X^0_t(x) \cdot \varepsilon ^{-1}\). A straightforward calculation shows
The preceding limit implies with the help of the spectral decomposition (2.4) given in Lemma 2.2 and the triangle inequality that
We set
Analogous reasoning yields
In the sequel it remains to show that \(R^x_\varepsilon \rightarrow 0\) as \(\varepsilon \rightarrow 0\). By the continuity of \(z\rightarrow \mathcal {W}_{p}(z+\mathcal {O}_\infty ,\mathcal {O}_\infty )\) at \(z=0\) it is enough to prove
which is valid due to the limit (2.4) and (4.10). This finishes the proof of Claim A. \(\square \)
In the sequel, we prove the window cutoff asymptotics in (2.8). Note that \(\Lambda ^x(\varepsilon )\) is uniformly bounded on \(\varepsilon \in (0,1]\). For any accumulation point U (as \(\varepsilon \rightarrow 0\)) of \(\big (\mathcal {W}_{p}( \Lambda ^x(\varepsilon ) + \mathcal {O}_\infty , \mathcal {O}_\infty )\big )_{\varepsilon \in (0,1]}\) there exists a sequence \((\varepsilon _k)_{k\in \mathbb {N}}\), \(\varepsilon _k\rightarrow 0\) as \(k\rightarrow \infty \), such that
The Bolzano–Weierstrass theorem for the sequence \((\Lambda (\varepsilon _k))_{k\in \mathbb {N}}\), the limit (4.10) and the continuity of \(\mathcal {W}_{p}\) yield
In particular,
where \({\hat{u}},\check{u}\in \omega (x)\) and \(\check{u}\ne 0\) by (2.5). Hence item (d) in Lemma 2.1 implies
This finishes the proof of Theorem 1. \(\square \)
4.4 Proof of Theorem 2
We keep the notation (4.8) of the proof of Theorem 1. By (4.9) it is enough to prove that the limit
We recall the definition of \(\Lambda ^x(\varepsilon )\) (4.8) and the limit (4.10). By (4.12) we have
For \(p\geqslant 1\), the shift linearity given in item d) of Lemma 2.1 implies
Combining (4.14) and (4.15) we infer
Hence (4.14) and (4.16) imply that the limit (4.13) exists if and only if the right-hand side of (4.16) has exactly one element. This is equivalent to \(\omega (x)\) being contained in a sphere in \(\mathbb {R}^d\) with respect to the Euclidean distance. For \(p\in (0,1)\) the shift linearity is not valid and the argument cannot be carried beyond (4.14). Consequently, (4.14) holds true and the limit (4.13) exists if and only if for all \(\lambda >0\) the function
This finishes the proof of Theorem 2. \(\square \)
Data Availability Statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
References
1. Applebaum, D.: Lévy Processes and Stochastic Calculus, 2nd edn. Cambridge University Press, Cambridge (2009)
2. Barrera, G., Högele, M.A., Pardo, J.C.: Cutoff thermalization for Ornstein–Uhlenbeck systems with small Lévy noise in the Wasserstein distance. J. Stat. Phys. 184(27) (2021)
3. Barrera, G., Högele, M.A., Pardo, J.C.: The cutoff phenomenon in total variation for nonlinear Langevin systems with small layered stable noise. Electron. J. Probab. 26, 1–76 (2021)
4. Barrera, G., Jara, M.: Abrupt convergence of stochastic small perturbations of one dimensional dynamical systems. J. Stat. Phys. 163(1), 113–138 (2016)
5. Barrera, G., Jara, M.: Thermalisation for small random perturbation of dynamical systems. Ann. Appl. Probab. 30(3), 1164–1208 (2020)
6. Barrera, G., Pardo, J.C.: Cut-off phenomenon for Ornstein–Uhlenbeck processes driven by Lévy processes. Electron. J. Probab. 25(15), 1–33 (2020)
7. Da Prato, G., Gatarek, D., Zabczyk, J.: Invariant measures for semilinear stochastic equations. Stoch. Anal. Appl. 10(4), 387–408 (1992)
8. Hartman, P.: On local homeomorphisms of Euclidean spaces. Bol. Soc. Mat. Mexicana 2(5), 220–241 (1960)
9. Kallianpur, G., Sundar, P.: Stochastic Analysis and Diffusion Processes. Oxford University Press, Oxford (2014)
10. Majka, M.: A note on existence of global solutions and invariant measures for jump SDEs with locally one-sided Lipschitz drift. Probab. Math. Stat. 40(1), 37–55 (2020)
11. Mikami, T.: Asymptotic expansions of the invariant density of a Markov process with a small parameter. Ann. Inst. H. Poincaré Probab. Stat. 24(3), 403–424 (1988)
12. Panaretos, V., Zemel, Y.: An Invitation to Statistics in Wasserstein Space. Springer (2020)
13. Perko, L.: Differential Equations and Dynamical Systems. Texts in Applied Mathematics, vol. 7, 3rd edn. Springer, New York (2001)
14. Protter, P.: Stochastic Integration and Differential Equations, 2nd edn. Springer-Verlag, Berlin (2004)
15. Saint Loubert Bié, E.: Étude d’une EDPS conduite par un bruit poissonnien. Probab. Theory Relat. Fields 111(2), 287–321 (1998)
16. Sato, K.: Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Cambridge (1999)
17. Siorpaes, P.: Applications of pathwise Burkholder–Davis–Gundy inequalities. Bernoulli 24(4B), 3222–3245 (2018)
18. Situ, R.: Theory of Stochastic Differential Equations with Jumps and Applications. Springer, New York (2005)
19. Tudoran, R.M.: On the coercivity of continuously differentiable vector fields. Qual. Theory Dyn. Syst. 19(2), Paper No. 58, 1–7 (2020)
20. Villani, C.: Optimal Transport. Old and New. Springer-Verlag, Berlin (2009)
21. Wang, J.: Regularity of semigroups generated by Lévy type operators via coupling. Stoch. Process. Appl. 120(9), 1680–1700 (2010)
22. Watanabe, S., Ikeda, N.: Stochastic Differential Equations and Diffusion Processes. North-Holland Publishing Co., Amsterdam-New York, Kodansha Ltd., Tokyo (1981)
Acknowledgements
The authors would like to thank the anonymous referee for her/his valuable comments, which have led to a significant improvement of the manuscript. The authors would also like to thank Carlos Gustavo Tamm de Araújo Moreira (Gugu) at IMPA for clarifying comments on the Hartman–Grobman theorem.
Funding
Open Access funding provided by University of Helsinki including Helsinki University Central Hospital. The research of GB has been supported by the Academy of Finland, via the Matter and Materials Profi4 University Profiling Action, an Academy Project (Project No. 339228) and the Finnish Centre of Excellence in Randomness and STructures (Project No. 346306). GB would also like to express his gratitude to the University of Helsinki for all the facilities used during the realization of this work. The research of MAH has been supported by the proyecto de la Convocatoria 2020–2021: “Stochastic dynamics of systems perturbed with small Markovian noise with applications in biophysics, climatology and statistics” of the Facultad de Ciencias at Universidad de los Andes.
Author information
Contributions
All authors have contributed equally to the paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Appendix A. Existence of the Invariant Measure
A.1 Invariant distribution \(\mu ^\varepsilon \)
In the sequel we show the existence of a unique invariant distribution \(\mu ^\varepsilon \) of the solution of (1.1) for any \(\varepsilon >0\). We stress that beyond the existence of moments (Hypothesis 2), this does not include any regularity such as absolute continuity whatsoever in our setting. For instance, our setting covers nonlinear oscillators with degenerate noise in Example 3.2.
We recall the standing assumptions Hypothesis 1 with \(\delta >0\) and Hypothesis 2 with \(p_*>0\). For the existence of the invariant probability measure \(\mu ^\varepsilon \) it is enough to verify the following condition by [7], p. 388. For some \(x\in \mathbb {R}^d\), the limit
Hypotheses 1 and 2 imply inequality (D.3) p. 71 in [3]. That is to say, for \(\gamma \in (0,1\wedge p_*)\) there exist positive constants \(C_1,C_2,C_3\) such that for all \(x\in \mathbb {R}^d\), \(\varepsilon >0\), \(t\geqslant 0\), \(A=\varepsilon \Pi \), \(c=\varepsilon \)
where \(C_3= c^\gamma +\frac{1}{\gamma \delta } \big (\gamma \delta c^\gamma + C_1\Vert A\Vert ^\gamma + C_2 c^{\gamma -2}\Vert A\Vert ^{2} \big )=\varepsilon ^\gamma \cdot \big (2+ \frac{1}{\gamma \delta }( C_1\Vert \Pi \Vert ^\gamma +C_2 \Vert \Pi \Vert ^2) \big ) \). Inequality (A.2) implies (A.1) with the help of the Markov inequality.
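The simplification of \(C_3\) for \(A=\varepsilon \Pi \) and \(c=\varepsilon \) is a direct substitution, using \(\Vert A\Vert =\varepsilon \Vert \Pi \Vert \) and \(c^{\gamma -2}\Vert A\Vert ^2=\varepsilon ^{\gamma }\Vert \Pi \Vert ^2\). The following Python sketch compares both expressions numerically. The function names and the sample values of \(C_1\), \(C_2\), \(\delta \), \(\gamma \) and \(\Vert \Pi \Vert \) are hypothetical placeholders, not constants taken from [3].

```python
import math

def c3_general(c, norm_a, gamma, delta, c1, c2):
    """C_3 written in terms of c and ||A||."""
    inner = (gamma * delta * c**gamma
             + c1 * norm_a**gamma
             + c2 * c**(gamma - 2) * norm_a**2)
    return c**gamma + inner / (gamma * delta)

def c3_scaled(eps, norm_pi, gamma, delta, c1, c2):
    """C_3 after substituting A = eps * Pi and c = eps."""
    return eps**gamma * (2 + (c1 * norm_pi**gamma + c2 * norm_pi**2) / (gamma * delta))

# Placeholder values (assumptions for illustration only).
gamma, delta, c1, c2, norm_pi = 0.5, 1.3, 2.0, 0.7, 1.9
for eps in (1.0, 0.1, 0.001):
    lhs = c3_general(eps, eps * norm_pi, gamma, delta, c1, c2)
    rhs = c3_scaled(eps, norm_pi, gamma, delta, c1, c2)
    assert math.isclose(lhs, rhs, rel_tol=1e-9)
```

In particular, \(C_3\) is proportional to \(\varepsilon ^\gamma \) under this substitution.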
For the uniqueness, it is enough to verify the following condition given in Theorem 11.4.3 in [9]. For any given positive numbers \(\eta \), \(\delta \) and R, there exists a positive constant S such that
Hypotheses 1, 2 and the additivity of the noise imply (D.5) p. 71 in [3]. In other words, for any \(\gamma \in (0,1\wedge p_*)\), \(x,y\in \mathbb {R}^d\), \(t\geqslant 0\), \(\varepsilon >0\), \(c=\varepsilon \) we have
The preceding inequality implies (A.3) with the help of the Markov inequality.
A.2 Convergence to \(\mu ^\varepsilon \) in \(\mathcal {W}_{p_*}\) for \(p_*>0\)
Due to Hypothesis 1 and the additivity of the noise, the natural coupling yields
Since \(\mu ^\varepsilon \) is an invariant measure and \(X^\varepsilon \) is a Feller process, disintegration and (A.4) imply
The preceding right-hand side tends to zero as \(t\rightarrow \infty \) provided that \(\int _{\mathbb {R}^d}|y|^{1\wedge p_*}\mu ^{\varepsilon }(\mathrm {d}y)<\infty \) which is shown in (2.84) p. 48 in [3].
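The natural (synchronous) coupling drives two solutions with the same noise path, so that the noise cancels in the difference and only the dissipative drift remains. The following Python sketch illustrates this for an Euler discretization. It is a minimal illustration under assumptions that are not taken from the article: dimension one, the sign convention \(\mathrm {d}X=-b(X)\mathrm {d}t+\varepsilon \mathrm {d}L\), the sample drift \(b(x)=x+x^3\) (dissipative with \(\delta =1\)), Brownian noise, and the hypothetical names `b` and `coupled_gaps`.

```python
import math
import random

def b(x):
    """Sample dissipative drift (assumption): b(0) = 0 and b'(x) = 1 + 3x^2 >= 1."""
    return x + x**3

def coupled_gaps(x0, y0, eps, h, n_steps, seed=0):
    """Euler scheme for dX = -b(X) dt + eps dW from x0 and y0 with shared noise.

    Returns the list of gaps |X_n - Y_n| for n = 0, ..., n_steps.
    """
    rng = random.Random(seed)
    x, y = x0, y0
    gaps = [abs(x - y)]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))  # same increment for both paths
        x, y = x - h * b(x) + eps * dw, y - h * b(y) + eps * dw
        gaps.append(abs(x - y))
    return gaps

delta, h = 1.0, 1e-3
gaps = coupled_gaps(x0=2.0, y0=-1.0, eps=0.1, h=h, n_steps=5000)
# Contraction |X_t(x) - X_t(y)| <= exp(-delta t) |x - y| along the time grid.
ok = all(g <= math.exp(-delta * n * h) * gaps[0] + 1e-12 for n, g in enumerate(gaps))
print(ok)
```

With `eps=0.0` and `y0=0.0` the second path stays at the fixed point, so the same check illustrates the deterministic bound \(|X^0_t(x)|\leqslant e^{-\delta t}|x|\) for this sample drift.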
Appendix B. \(L^{p}\) Estimates for \(p\in (0,p_*)\)
We recall the Lévy–Khinchin formula of L with characteristic triplet \((a,\Sigma ,\nu )\)
and the pathwise Lévy-Itô representation
where \((B_t)_{t\geqslant 0}\) is a standard Brownian motion in \(\mathbb {R}^d\), N is a Poisson random measure on \([0,\infty )\times \mathbb {R}^d\) with intensity measure \(\mathrm {d}t\otimes \nu (\mathrm {d}z)\) and \(\tilde{N}\) is the compensated counterpart of N. See [16] for further details on Lévy processes.
We recall the standing assumptions Hypothesis 1 with \(\delta >0\) and Hypothesis 2 with \(p_*>0\).
B.1 Localization
We start with the probability estimate of the event
where \(\mathcal {Y}^x\) is given in (4.2). Note that \(\mathcal {Z}_\cdot (0) = \mathcal {Y}^0_\cdot \) satisfies
for \(x= 0\).
Lemma B.1
For any \(\gamma \in (0,p_*\wedge 1]\) there is a positive constant C such that for any \(\vartheta \geqslant 1\), \(x\in \mathbb {R}^d\) and \(t\geqslant 0\) we have
Proof
By Theorem 1 in [17] we have
In particular, it follows
By Hypothesis 1 we obtain \( \int _0^t \langle H_{s-}, - Db(X^0_s(x)) \mathcal {Y}^x_s \rangle \mathrm {d}s \leqslant 0 \) a.s. Hence
We continue term by term. By the Chebyshev inequality we obtain
and
Finally, for \(\gamma \in (0,p_*\wedge 1]\) we have
where we have used the subadditivity of the power \(\gamma \) in the sense of Subsection 1.1.2, see formula (1.6) in [15]. This finishes the proof of the statement. \(\square \)
B.2 First Order Approximation
We start with some technical preliminaries. The map \(u \mapsto |u|^{p}\) for \(p \in (0, 2)\) is not twice continuously differentiable, which is required for the application of Itô’s formula. To overcome this, we use the \(\mathcal {C}^2\) approximation of the norm \(|x|_c:= \sqrt{|x|^2 + c^2}\), \(c> 0\), with the limiting case \(|x|_0 = |x|\). It is well-behaved in the following sense. For any \(c>0\) we have
Furthermore, it is straightforward to verify for \(G(x)=|x|^p_c\) the following calculations
The \(L_1\)-matrix norm \(\Vert \cdot \Vert _1\) of the respective Hessian \(H_{G}(x)\), \(x\in \mathbb {R}^d\), can be estimated as follows
For details of the estimates, we refer to p. 69 in [3]. Since \(p \in (0,2)\) and \(c\leqslant |x|_c\), we obtain
Proposition 2
We keep the notation of Theorem 1. Then for any \(x\in \mathbb {R}^d\), \(r\in \mathbb {R}\) and \(p\in (0,p_*)\) it follows
Proof
By the domination property of the Wasserstein distance in Lemma 2.1 it is enough to show the preceding limit in the respective \(L^{p}\) space. By (4.1) we have
Let \(\Delta ^\varepsilon _t := X^\varepsilon _t(x)-Y^\varepsilon _t(x)\), \(t\geqslant 0\). Then
where \((\mathcal {Y}^x_t)_{t\geqslant 0}\) is given in (4.2). An elementary estimate of the \(p_*\)-th power of a sum yields for all \(t\geqslant 0\)
where \(C_{p_*}\) is a positive constant. Since \((\mathcal {Y}^x_t)_{t\geqslant 0}\) satisfies a dissipative linear equation, it exhibits the same integrability as L, which is straightforward to verify. There are a positive constant \(\tilde{C}_{p_*}\) and a function \(S_{p_*}(t)\) of at most polynomial order such that
For the first term of the right-hand side of (B.6), Lemmas B.4 and B.5 yield the following estimate. For any \(\eta \in (0,p_*)\) there is a map \(R_\eta :[0,\infty )\rightarrow [0,\infty )\) which increases with polynomial order as t tends to infinity, such that
We start with the case \(p_*>1\) and \(p\in (1,p_*)\). The Hölder inequality implies
where \(\tilde{R}_\eta \) is a function of at most polynomial order as t tends to infinity and \(\eta '=\eta (p-1) p_*^{-1}\). For \(\eta \) small enough we fix \(\eta '\in (0,{1}/{4})\). Since \(p_*>1\), we may choose \(p\in (1,p_*)\) and \(\theta \in (0,{1}/{4})\). We split
where
First we prove that
where \(C(|x|)=\max \limits _{|u|\leqslant |x|+1}|D^2b(u)|\). The choice of \(\eta '\) and \(\theta \) yields \(1-\eta '-2\theta >{1}/{4}\). For notational convenience, we use the differential formalism, however, we stress that all differential inequalities are understood in the integral sense. Since \(p> 1\), the chain rule, Hypothesis 1 and Cauchy-Schwarz inequality imply
On the event \(\mathcal {A}^\varepsilon _t\), Taylor’s theorem applied to b implies
Taking expectation, the integral monotonicity, Fubini’s theorem and (B.9) yield
Bearing in mind \(|\Delta ^\varepsilon _0|^{p}=0\), we have
Therefore
We continue with the estimate on the complement of \(\mathcal {A}^\varepsilon _t\). We show
where \(\mathcal {R}(t)\) is a function of at most polynomial order. Indeed, by Hölder’s inequality and the inequalities (B.6), (B.7) and (B.8) we have
where \( \mathcal {R}(t):=\max \{\big ( C_{p_*}R_\eta (t)\big )^{\frac{p}{p_*}},\big (C_{p_*}\tilde{C}_{p_*}S_{p_*}(t) \big )^{\frac{p}{p_*}}\}\). As a consequence,
Combining estimates (B.12), (B.13) in decomposition (B.10) we obtain a positive constant \(C:=C(p_*,p,\delta , |x|, |D^2b|)\) such that for any \(t\geqslant 0\)
By Lemma B.1 there exists a positive constant C such that for all \(\gamma \in (0, 1)\) for the choice \(\vartheta = \varepsilon ^{-\theta /\gamma }\) and any \(t\geqslant 0\) it follows
We further restrict \(\theta \) such that additionally \(0<\theta <\min \{\frac{2\eta p}{p_*-p},\frac{1}{4}\}\). Hence, with the help of inequality (B.14) and (B.15) we have
where \(\mathcal {R}_1\) and \(\mathcal {R}_2\) are functions of at most polynomial order. Consequently we obtain the desired limit
We continue with the case \(p_*>0\) and \(p\in (0,1\wedge p_*]\). Let \(\theta \in (0,{1}/{4})\) and recall the event \(\mathcal {A}^\varepsilon _t\) in (B.11). For \(p\in (0,1\wedge p_*]\) we split
We start with the term \(J_1\). Since \(|\cdot |^p\) is not differentiable, we apply the chain rule for the smooth approximation \(|x|^{p}_c=(\sqrt{|x|^2+c^2})^{p}\). Hypothesis 1 then yields
Due to \(|X^0_t(x)|\leqslant e^{-\delta t}|x|\) for all \(t\geqslant 0\) and \(x\in \mathbb {R}^d\), Taylor’s expansion for b on the event \(\mathcal {A}^\varepsilon _t\) implies
where \(C(|x|)=\max \limits _{|u|\leqslant |x|+1}|D^2b(u)|\). Hence
The integral version of the Grönwall inequality with negative linearity given in Lemma 1 in [11] implies for all \(t\geqslant 0\)
For \(p\ne 1\) we have the following. Since \(c>0\) is arbitrary and \(\theta \in (0,{1}/{4})\), the choice \(c=\varepsilon ^{1+{\eta }/{p}}\) with \(\eta \in (0,\frac{p}{2(1-p)})\) in (B.16) yields for any \(r\in \mathbb {R}\)
The case of \(p=1\) follows by the choice \(c=\varepsilon ^{2}\) in (B.16).
We continue with the term \(J_2\). By the subadditivity of the power \(p\leqslant 1\) and the Hölder inequality for the index \(p'/p\) where \(p'\in (p,p_*)\) and r is such that \(p/p'+1/r=1\) we have
By Lemma B.5 we obtain for all \(t\geqslant 0\)
Note that for all \(t\geqslant 0\) it follows
Lemma A.1 in [3] yields the existence of a positive constant C(r, |x|) such that
Combining (B.18) with inequalities (B.15), (B.19), (B.20) and (B.21) gives
Since \(\mathbb {E}[|\mathcal {Y}^x_{t}|^{p'}]\leqslant \mathcal {R}(t)\), where \(\mathcal {R}\) is a function of at most polynomial order, we have
The right-hand side of the preceding inequality equals zero. The preceding argument combined with (B.17) yields the desired limit (B.5). \(\square \)
B.3 Asymptotic First Order Approximation
Lemma B.2
For any \(p\in (0,p_*)\) we have
Proof
First we observe that \(Y^\varepsilon _t(0) = \mathcal {Z}^\varepsilon _t(0)\) for any \(t\geqslant 0\), \(\varepsilon >0\), where \((\mathcal {Z}_t^\varepsilon (0))_{t\geqslant 0}\) is given in (B.2). By abuse of notation, we write \((X^\varepsilon _t(\mu ^\varepsilon ))_{t\geqslant 0}\) (and analogously \((\mathcal {Z}^\varepsilon _t(\mu ^\varepsilon _*))_{t\geqslant 0}\)) for the process starting at a random vector with distribution \(\mu ^\varepsilon \) (respectively \(\mu ^\varepsilon _*\)) independent of the noise process L. Since \(X^{\varepsilon }_t(\mu ^\varepsilon ){\mathop {=}\limits ^{d}}\mu ^\varepsilon \) and \(\mathcal {Z}^{\varepsilon }_t(\mu ^\varepsilon _*){\mathop {=}\limits ^{d}}\mu ^\varepsilon _*\) for any \(t\geqslant 0\), the triangle inequality yields
By Proposition 2 for \(x= 0\), we have
By disintegration, inequalities (A.4) and (2.84) in [3] imply
for some positive constant C. As a consequence,
Analogously,
Combining (B.22) with the estimates (B.23), (B.24) and (B.25) completes the proof. \(\square \)
Lemma B.3
For any \(p\in (0,p_*)\) we have
Proof
Recall that \(\mathcal {O}_\infty \) is the limiting and invariant distribution of the homogeneous Ornstein–Uhlenbeck process \((\mathcal {Z}_t(x))_{t\geqslant 0}\) defined in (B.2), that is, \(\mathcal {O}_\infty {\mathop {=}\limits ^{d}} \mathcal {Z}_\infty \). Since \(-Db(X^0_t(x))\) converges exponentially fast to \(-Db(0)\), it is natural to expect that the flow of \((\mathcal {Y}^x_t)_{t\geqslant 0}\) behaves as the flow of \((\mathcal {Z}_t(x))_{t\geqslant 0}\) for large t. In [3], Lemma C.3, it is shown that \(\mathcal {Y}^x_t \rightarrow \mathcal {O}_\infty \) as \(t \rightarrow \infty \) in law. However, the law \(\mathcal {O}_\infty \) is not invariant under the random dynamics of \((\mathcal {Y}^x_t)_{t\geqslant 0}\) due to the time inhomogeneity. Analogously to (A.5) we deduce
We start with the proof of the statement. The triangle inequality yields
where the second term on the right-hand side tends to 0 as \(t\rightarrow \infty \) due to (B.27). Thus it remains to prove \( \mathcal {W}_{p}(\mathcal {Y}^x_t,\mathcal {Z}_t(0)) \rightarrow 0\), as \(t \rightarrow \infty \). Since
we derive the respective \(L^{p}\) estimates. By (4.2) and (B.2) we obtain
We first consider the case \(p_*>1\) and \(p\in (1,p_*)\). The chain rule and Hypothesis 1 yield
where \(C(|x|)=\max \limits _{|u|\leqslant |x|+1}|D^2b(u)|\). Taking expectations and using the monotonicity of the integral and Fubini’s theorem, we obtain
By Young’s inequality and \(|X^0_t(x)|\leqslant e^{-\delta t}|x|\) for any \(t\geqslant 0\) and \(x\in \mathbb {R}^d\) it follows
A straightforward calculation yields (for any \(p>0\)) that there exist functions \(P_1(t)\) and \(P_2(t)\) of polynomial order (depending on p, \(\delta \), |x|) such that
Therefore,
The integral version of the Grönwall inequality with negative linearity given in Lemma 1 in [11] yields
Therefore,
Combining (B.27) and (B.30) in (B.28) we conclude (B.26).
We continue with the case \(p \in (0, p_*\wedge 1]\). Note that the case \(p_*>1\) and \(p\in (0,1]\) is also covered in the sequel. By Lemma B.1 there exists a positive constant C such that for the choice \(\gamma =p\), \(\vartheta = e^{\frac{\delta }{2} t}\) and any \(t\geqslant 0\) it follows
We split
We start with the term \(I_1\). The chain rule for \(|x|^{p}_c=(\sqrt{|x|^2+c^2})^{p}\) and Hypothesis 1 yield
where \(C(|x|)=\max \limits _{|u|\leqslant |x|+1}|D^2b(u)|\). On the event \((\mathcal {D}^0_t)^{\mathsf {c}}\) we have
due to \(|X^0_t(x)|\leqslant e^{-\delta t}|x|\) for all \(t\geqslant 0\) and \(x\in \mathbb {R}^d\). Hence
The Grönwall inequality in [11] implies
Then
which yields \(\lim _{t\rightarrow \infty } \mathbb {E}[|\mathcal {Y}^x_t-\mathcal {Z}_t(0)|^{p}\mathbf {1}((\mathcal {D}^0_t)^{\mathsf {c}})]=0\).
We continue with the term \(I_2\). By the Hölder inequality for the index \(p'/p\) where \(p'=(p+p_*)/2\) and r the conjugate index of \(p'/p\) we have
By (B.29) and (B.31) the right-hand side of (B.32) tends to zero as \(t\rightarrow \infty \). As a consequence we have \(\mathcal {W}_{p}(\mathcal {Y}^x_t,\mathcal {Z}_t(0))\leqslant (\mathbb {E}[|\mathcal {Y}^x_t-\mathcal {Z}_t(0)|^{p}])^{1\wedge (1/p)}\) which tends to zero as \(t\rightarrow \infty \). By (B.27) and (B.28) we obtain (B.26). \(\square \)
B.4 Auxiliary Moment Estimates
Lemma B.4
For any \(2\leqslant p< p_*\) (and \(p = 2\) if \(p_* = 2\)) there is a function of at most polynomial order R(t) as \(t \rightarrow \infty \) and \(\varepsilon _0\in (0, 1]\) such that for any \(t\geqslant 0\) and \(0< \varepsilon < \varepsilon _0\) we have
Proof
First note that for \(G(u) = |u|^{p_*},p_*\geqslant 2\) we have
Recall the notation (B.1) for L. The Itô formula for \(\Theta ^\varepsilon _t= X^\varepsilon _t(x)-X^0_t(x)\) yields
Taking expectation yields
By the mean value theorem we have
and
Hence there is a positive constant K such that
For \(p_*=2\) we have directly \( \mathbb {E}[|\Theta ^\varepsilon _t|^{p_*}]\leqslant \varepsilon ^2 K t. \) For \(p_*>2\) we continue in (B.33) with Young’s inequality
for \(\varepsilon <(\frac{\delta p_*}{2K})^{1/2}\). Grönwall’s lemma applied to the preceding estimate yields the a priori estimate \( \mathbb {E}[|\Theta ^\varepsilon _t|^{p_*}] \leqslant \varepsilon ^2 K t^2 =: \varepsilon ^2 R_0(t). \) Inserting the a priori estimate in (B.33) and using the Hölder inequality for \(p_*>2\) we obtain
By induction we deduce after i iterations of the bootstrap the estimate
for a polynomial order function \(R_i(t)\). Clearly, \(\lim _{i\rightarrow \infty } 2 \sum _{j=0}^i \Big (\frac{p_*-2}{p_*}\Big )^j = p_*\) and therefore for any \(0<p<p_*\) there is an iteration \(i_0=i_0(p_*,p)\) such that we obtain \( \mathbb {E}[|\Theta ^\varepsilon _t|^{p_*}] \leqslant \varepsilon ^{p} R_{i_0}(t) \). This finishes the proof of the lemma. \(\square \)
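The number of bootstrap iterations can be made explicit. The sketch below assumes, as the limit above suggests, that after \(i\) iterations the exponent of \(\varepsilon \) is the partial sum \(2\sum _{j=0}^{i}\big (\frac{p_*-2}{p_*}\big )^j = p_*\big (1-\big (\frac{p_*-2}{p_*}\big )^{i+1}\big )\), and that \(\varepsilon \in (0,1)\), so that a larger exponent gives a smaller bound. The function names are hypothetical.

```python
def bootstrap_exponent(p_star, i):
    """Exponent 2 * sum_{j=0}^{i} q^j with q = (p_star - 2) / p_star."""
    q = (p_star - 2.0) / p_star
    return 2.0 * sum(q**j for j in range(i + 1))

def iterations_needed(p_star, p, max_iter=10_000):
    """Smallest i whose exponent is at least p, for p_star > 2 and 0 < p < p_star."""
    if not (p_star > 2 and 0 < p < p_star):
        raise ValueError("need p_star > 2 and 0 < p < p_star")
    for i in range(max_iter):
        if bootstrap_exponent(p_star, i) >= p:
            return i
    raise RuntimeError("max_iter too small")

for p in (2.0, 3.0, 3.9):
    print(p, iterations_needed(4.0, p))
```

For \(p_*=4\) the exponents are \(2, 3, 3.5, 3.75, \ldots \), so the required number of iterations grows as \(p\) approaches \(p_*\) but remains finite for every \(p<p_*\).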
Lemma B.5
Let \(p_*>0\). Then for any \(p\in (0,2\wedge p_*)\) there exists a positive constant \(C_{p}\) such that for any \(t\geqslant 0\) and \(\varepsilon >0\) we have
Proof
Without loss of generality let \(p_*\in (0,2]\). Itô’s formula yields for \(\Theta ^\varepsilon _t = X^{\varepsilon }_t(x) -X^0_t(x)\) and the function \(G(z)=|z|^{p}_c\)
Taking expectation and using Hypothesis 1 we have
Since \(|x|^2=|x|^2_c-c^2\), we obtain
In the sequel we estimate the second order term for small increments with the help of (B.4) by
For the large increments, we use the mean value theorem and obtain
For \(p\in (0,1]\), note that \(|x+y|^p_c\leqslant |x|^p+|y|^p+c^p\) for all \(x,y \in \mathbb {R}^d\). Then we have for all \(t\geqslant 0\)
For \(p>1\), due to \(|x+y|^{p-1}_c\leqslant |x|^{p-1}+|y|^{p-1}+c^{p-1}\) for all \(x,y\in \mathbb {R}^d\), we split the intermediate value as follows
where we have used in the last line the following weighted Young inequality
with \(K_2=\int _{|z|>1} | z|\nu (\mathrm {d}z)+1\) and \(K_3=({\delta }/{2})^{p/(p-1)}\) followed by \(|y|\leqslant |y|_c\). Combining (B.35) with (B.37) for \(p\geqslant 1\), and (B.36) with (B.37) for \(p< 1\), respectively, in (B.34) we obtain
where \(K_0=C(p,d){{\,\mathrm{trace}\,}}(\Sigma ^{1/2}(\Sigma ^{1/2})^{*})\). Since \(|x|^p\leqslant |x|^p_c\), the choice \(c=c_\varepsilon =\varepsilon \) yields for all \(t\geqslant 0\) \( \mathbb {E}[|\Theta ^\varepsilon _t|^p]\leqslant \varepsilon ^p (1+Ct) \) for some constant \(C=C(p,\delta )\). This completes the proof of the lemma. \(\square \)
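The elementary properties of the regularized norm \(|x|_c=\sqrt{|x|^2+c^2}\) used in this proof, namely \(|x|\leqslant |x|_c\leqslant |x|+c\) and, for \(p\in (0,1]\), \(|x+y|^p_c\leqslant |x|^p+|y|^p+c^p\), can be sanity-checked numerically. The following Python sketch does so on random vectors. The helper names, the dimension and the sampling ranges are arbitrary choices made for illustration.

```python
import math
import random

def norm(x):
    """Euclidean norm |x|."""
    return math.sqrt(sum(t * t for t in x))

def norm_c(x, c):
    """Regularized norm |x|_c = sqrt(|x|^2 + c^2)."""
    return math.sqrt(sum(t * t for t in x) + c * c)

def check_properties(trials=1000, dim=3, seed=0, tol=1e-12):
    """Return True if both inequalities hold on all random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        y = [rng.uniform(-5, 5) for _ in range(dim)]
        c = rng.uniform(1e-3, 2.0)
        p = rng.uniform(0.05, 1.0)
        s = [a + b for a, b in zip(x, y)]
        # |x| <= |x|_c <= |x| + c
        if not (norm(x) <= norm_c(x, c) + tol and norm_c(x, c) <= norm(x) + c + tol):
            return False
        # |x + y|_c^p <= |x|^p + |y|^p + c^p for p in (0, 1]
        if norm_c(s, c) ** p > norm(x) ** p + norm(y) ** p + c ** p + tol:
            return False
    return True

print(check_properties())
```

The second inequality follows from the first together with the subadditivity of \(t\mapsto t^p\) on \([0,\infty )\) for \(p\in (0,1]\).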
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Barrera, G., Högele, M.A. & Pardo, J.C. The Cutoff Phenomenon in Wasserstein Distance for Nonlinear Stable Langevin Systems with Small Lévy Noise. J Dyn Diff Equat 36, 251–278 (2024). https://doi.org/10.1007/s10884-022-10138-1
DOI: https://doi.org/10.1007/s10884-022-10138-1
Keywords
- Cutoff phenomenon
- Exponential ergodicity
- Lévy processes
- Nonlinear Langevin dynamics
- Nonstandard properties of the Wasserstein distance