Drift estimation for a Lévy-driven Ornstein–Uhlenbeck process with heavy tails

We consider the problem of estimating the drift parameter of an ergodic Ornstein–Uhlenbeck type process driven by a Lévy process with heavy tails. The process is observed continuously on a long time interval [0, T], T → ∞. We prove that the statistical model is locally asymptotically mixed normal and that the maximum likelihood estimator is asymptotically efficient.

The problem is to study asymptotic properties of the corresponding statistical model and to show that the maximum likelihood estimator of θ is asymptotically efficient in an appropriate sense. Although the continuous time observations are far from being realistic in applications, they are of theoretical importance since they can be considered as a limit of high frequency discrete models.
Since we deal with continuous observations, it is natural to assume that the Gaussian component of the Lévy process Z is not degenerate. In this case, the laws of observations corresponding to different values of θ are equivalent and the likelihood ratio has an explicit form.
Many papers are devoted to inference for Lévy-driven SDEs. Most of the literature treats discrete-time observations, in both the high- and low-frequency settings. A general theory of likelihood inference for continuously observed jump-diffusions can be found in Sørensen (1991).
A complete analysis of drift estimation for continuously observed ergodic and nonergodic Ornstein-Uhlenbeck processes driven by a Brownian motion can be found in Höpfner (2014, Chapter 8.1).
For continuously observed square integrable Lévy driven Ornstein-Uhlenbeck processes, the local asymptotic normality (LAN) of the model and the asymptotic efficiency of the maximum likelihood estimator of the drift have been derived by Mai (2012, 2014) with the help of the theory of exponential families, see Küchler and Sørensen (1997).
High frequency estimation of a square integrable Lévy driven Ornstein-Uhlenbeck process with non-vanishing Gaussian component has been performed by Mai (2012, 2014). Kawai (2013) studied the asymptotics of the Fisher information for three characterizing parameters of Ornstein-Uhlenbeck processes with jumps under low frequency and high frequency discrete sampling. The existence of all moments of the Lévy process was assumed. Tran (2017) considered the ergodic Ornstein-Uhlenbeck process driven by a Brownian motion and a compensated Poisson process, whose drift and diffusion coefficients as well as its jump intensity depend on unknown parameters. He obtained the LAN property of the model in the high frequency setting.
We also mention the works by Long (2007, 2009a, b), Long (2009) and Zhang and Zhang (2013) devoted to the least-square estimation of parameters of the Ornstein-Uhlenbeck process driven by an α-stable Lévy process.
In this paper, we fill this gap and analyse a continuously observed ergodic Ornstein-Uhlenbeck process driven by a Lévy process with heavy regularly varying tails of index −α, α ∈ (0, 2), in the presence of a Gaussian component. It turns out that the log-likelihood in this model is quadratic; however, the model is not asymptotically normal, and we prove only the local asymptotic mixed normality (LAMN) property. We refer to Le Cam and Yang (2000) and Höpfner (2014) for the general theory of estimation for LAMN models.
The fact that the prelimiting log-likelihood is quadratic automatically implies that the maximum likelihood estimator is asymptotically efficient in the sense of Jeganathan's convolution theorem and attains the local asymptotic minimax bound. Another feature of our model is that the asymptotic observed information has a spectrally positive α/2-stable distribution. This implies that the limiting law of the maximum likelihood estimator has tails of the order exp(−x^α) and hence finite moments of all orders.
The paper is organized as follows. In the next section we formulate the assumptions of our model and the main results of the paper. Section 3 contains auxiliary results that will be used in the proof of the main Theorem 2.5. In particular, we calculate the tail of a product of two iid heavy-tailed random variables (Lemma 3.2), determine the conditional law of inter-arrival times of a Poisson process, and prove the technically involved Lemma 3.7. Finally, in Sect. 4, the proofs of the main results are presented.

Setting and the main result
Consider a stochastic basis (Ω, ℱ, 𝔽, P), the filtration 𝔽 being right-continuous. Let Z be a Lévy process with the characteristic triplet (σ², b, ν) and the Lévy–Itô decomposition
$$Z_t = \sigma W_t + bt + \int_0^t\int_{|z|\le 1} z\,\tilde N(dz,ds) + \int_0^t\int_{|z|>1} z\,N(dz,ds), \tag{2.1}$$
where W is a standard one-dimensional Brownian motion, N is a Poisson random measure on R\{0} with the Lévy measure ν satisfying ∫_R (z² ∧ 1) ν(dz) < ∞, Ñ is the compensated Poisson random measure, and b ∈ R.
For θ ∈ R, let X be an Ornstein-Uhlenbeck type process being a solution of the SDE
$$dX_t = -\theta X_t\,dt + dZ_t, \tag{2.2}$$
where θ ∈ R is an unknown parameter. The initial value X_0 is an ℱ_0-measurable random variable whose distribution does not depend on θ. Note that X has an explicit representation
$$X_t = X_0 e^{-\theta t} + \int_0^t e^{-\theta (t-s)}\,dZ_s, \quad t\ge 0,$$
see, e.g. Applebaum (2009, Sections 4.3.5 and 6.3) and Sato (1999, Section 17).
Let D = D([0, ∞), R) be the space of real-valued càdlàg functions ω: [0, ∞) → R equipped with the Skorokhod topology and the Borel σ-algebra B(D). The space (D, B(D)) is Polish, and B(D) coincides with the σ-algebra generated by the coordinate projections. We define a (right-continuous) filtration G = (G_t)_{t≥0} consisting of σ-algebras
$$\mathcal G_t = \bigcap_{s>t}\sigma(\omega_r\colon r\le s), \quad t\ge 0.$$
For each θ ∈ R, the process X = (X_t)_{t≥0} induces a measure P^θ on the path space (D, B(D)). Let
$$P^\theta_T = P^\theta\big|_{\mathcal G_T}$$
be the restriction of P^θ to the σ-algebra G_T. In order to establish the equivalence of the laws P^θ_T and P^{θ_0}_T, θ, θ_0 ∈ R, we have to make the following assumption.
A_σ: The Brownian component of Z is non-degenerate, i.e. σ > 0.
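To make the dynamics dX = −θX dt + dZ concrete, the following minimal Python sketch simulates such a process on a time grid. It is an illustration only and not part of the paper: the name `simulate_ou`, the symmetric Pareto jump sizes, and the approximation of at most one jump per step are assumptions made here, and the compound Poisson jump part is merely a stand-in for a general heavy-tailed Lévy measure.

```python
import math
import random


def simulate_ou(theta, sigma, x0, horizon, n_steps, jump_rate=0.0,
                alpha=1.5, seed=0):
    """Simulate X on a grid, where dX = -theta * X dt + dZ.

    Z = sigma * W + (compound Poisson process with symmetric Pareto(alpha) jumps).
    The jump part is a hypothetical stand-in for a heavy-tailed Levy measure.
    """
    rng = random.Random(seed)
    h = horizon / n_steps
    decay = math.exp(-theta * h)
    # Exact standard deviation of the Gaussian part over one step
    # (the Brownian value sigma * sqrt(h) is used when theta == 0).
    if theta != 0.0:
        gauss_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    else:
        gauss_sd = sigma * math.sqrt(h)
    path = [x0]
    for _ in range(n_steps):
        x = decay * path[-1] + gauss_sd * rng.gauss(0.0, 1.0)
        # Assumption: at most one jump per step (reasonable if jump_rate * h << 1),
        # placed at the end of the step.
        if rng.random() < jump_rate * h:
            size = rng.paretovariate(alpha)  # P(size > r) = r**(-alpha), r >= 1
            x += size if rng.random() < 0.5 else -size
        path.append(x)
    return path
```

With `sigma=0.0` and `jump_rate=0.0` the recursion reduces to the deterministic decay X_0 e^{−θt}, which gives a simple sanity check of the explicit representation.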
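Proposition 2.1 Let A_σ hold true. Then for each T > 0 and any θ, θ_0 ∈ R, P^θ_T ∼ P^{θ_0}_T, and the likelihood ratio is given by
$$\frac{dP^\theta_T}{dP^{\theta_0}_T}(\omega) = \exp\Big(-\frac{\theta-\theta_0}{\sigma^2}\int_0^T \omega_s\,dm^{(\theta_0)}_s - \frac{(\theta-\theta_0)^2}{2\sigma^2}\int_0^T \omega_s^2\,ds\Big), \tag{2.3}$$
where
$$m^{(\theta_0)}_t = \omega_t - \omega_0 + \theta_0\int_0^t \omega_s\,ds - bt - \int_0^t\int_{|z|\le 1} z\,\big(\mu(dz,ds)-\nu(dz)\,ds\big) - \int_0^t\int_{|z|>1} z\,\mu(dz,ds)$$
is the continuous local martingale component of ω under the measure P^{θ_0}_T, and the random measure
$$\mu(dz,ds) = \sum_{s>0}\mathbb{1}\{\Delta\omega_s\neq 0\}\,\delta_{(\Delta\omega_s,\,s)}(dz,ds)$$
is defined by the jumps of ω.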
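Because the log-likelihood is quadratic in θ, its maximiser has a closed form. Assuming the likelihood ratio has the Girsanov form with reference parameter θ_0 = 0, the maximiser is −∫ω dm / ∫ω² ds, where m is the observed path with the drift bt and the jumps removed. The Python sketch below discretises this ratio. It rests on assumptions that are not taken from the paper: finitely many jumps whose per-step increments are known, a known b, and the hypothetical name `drift_mle`.

```python
def drift_mle(path, h, jumps=None, b=0.0):
    """Discretised estimator  -sum_i X_i * dm_i / sum_i X_i**2 * h.

    path  : values X_0, X_1, ..., X_n on a grid with step h
    jumps : per-step jump increments (assumed observed); None means no jumps
    b     : drift of the driving Levy process (assumed known)
    dm_i is the path increment with the jump and the drift b * h removed.
    """
    n = len(path) - 1
    if jumps is None:
        jumps = [0.0] * n
    num = 0.0
    den = 0.0
    for i in range(n):
        dm = path[i + 1] - path[i] - jumps[i] - b * h
        num += path[i] * dm
        den += path[i] ** 2 * h
    return -num / den
```

On a noiseless Euler path X_{i+1} = X_i − θX_i h (plus known jumps) the estimator returns θ exactly up to rounding; with a Brownian component the estimation error is the discretised version of −∫ω dm / ∫ω² ds minus θ.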

Consider a family of statistical experiments
Our goal is to establish local asymptotic mixed normality (LAMN) of these experiments under the assumption that the process Z has heavy tails. We make the following assumption.
A_ν: The Lévy measure ν has a regularly varying heavy tail of the order α ∈ (0, 2), i.e.
$$H(R) := \int_{|z|>R}\nu(dz) \in RV_{-\alpha}.$$
In other words, H: (0, ∞) → (0, ∞) and there is a positive function l = l(R) slowly varying at infinity such that
$$H(R) = \frac{l(R)}{R^{\alpha}}, \quad R>0.$$
Let
$$\tilde H(R) := \alpha\int_R^\infty \frac{H(z)}{z}\,dz, \quad R>0.$$
Since H(z) > 0, z > 0, the function H̃ is absolutely continuous and strictly decreasing. Moreover, by Karamata's theorem, see e.g. Resnick (2007, Theorem 2.1 (a)), applied to the function z ↦ H(z)/z, we have H̃(R) ∼ H(R) as R → ∞. We use the function H̃ to introduce the continuous and monotonically decreasing scaling {φ_T}_{T>0} defined by the relation
$$\frac{1}{T} = \tilde H\Big(\frac{1}{\varphi_T}\Big). \tag{2.5}$$
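The scaling relation 1/T = H̃(1/φ_T) can be solved numerically for any continuous, strictly decreasing tail function. The Python sketch below does this by bisection on a logarithmic scale. The function name `scaling_phi`, the bracketing interval, and the pure power tail used in the usage note are assumptions chosen for illustration.

```python
import math


def scaling_phi(tail, T, lo=1e-12, hi=1e12, iters=200):
    """Solve tail(1 / phi) = 1 / T for phi by bisection on a log scale.

    `tail` must be continuous and strictly decreasing on (0, infinity),
    and the solution r = 1 / phi is assumed to lie in [lo, hi].
    """
    target = 1.0 / T
    for _ in range(iters):
        mid = math.sqrt(lo * hi)      # geometric midpoint of the bracket
        if tail(mid) > target:        # tail still too large: r must grow
            lo = mid
        else:
            hi = mid
    return 1.0 / math.sqrt(lo * hi)
```

For a pure power tail c·R^(−α), for example `scaling_phi(lambda r: 2.0 * r ** -1.5, 1000.0)`, the result agrees with the closed form (cT)^(−1/α), and φ_T decreases as T grows.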

Remark 2.2
We make use of the absolutely continuous and strictly decreasing function H̃ just for convenience in order to avoid technicalities connected with the inversion of càdlàg functions. For instance, one can equivalently define φ_T in terms of a generalized inverse of H itself, see Bingham et al. (1987, Chapter 1.5.7).

Example 2.4
Let the jump part of the process Z be an α-stable Lévy process, i.e. for α ∈ (0, 2) and
The main result is the LAMN property of our model.

Theorem 2.5 Let A_σ and A_ν hold true. Then the family of statistical experiments (2.4) is locally asymptotically mixed normal at each
where N is a standard Gaussian random variable and S^(α/2) is an independent spectrally positive α/2-stable random variable with the Laplace transform
Theorem 2.5 is based on the following key result.

Theorem 2.6 Let
where S^(α/2) is a random variable with the Laplace transform (2.6).
Corollary 2.7 Let A_σ and A_ν hold true. Then for each θ_0 > 0
Proposition 2.1 and Theorem 2.5 allow us to establish the asymptotic distribution of the maximum likelihood estimator θ̂_T of θ. Moreover, the special form of the likelihood ratio guarantees that θ̂_T is asymptotically efficient.

Corollary 2.8 1. Let A_σ hold true. Then the maximum likelihood estimator θ̂_T of θ satisfies
The maximum likelihood estimator θ̂_T is asymptotically efficient in the sense of the convolution theorem and the local asymptotic minimax theorem for LAMN models, see Höpfner (2014, Theorems 7.10 and 7.12).

Remark 2.9
It is instructive to determine the tails of the random variable N / and in particular all moments of the r.h.s. of (2.9) are finite.

Auxiliary results
We decompose the Lévy process Z into a compound Poisson process with heavy jumps, and the rest. Consider the non-decreasing function R_T = T^ρ: [1, ∞) → [1, ∞), where ρ ≥ 0 will be chosen later. Denote by η^T the compound Poisson process formed by the jumps of Z of absolute value larger than R_T. Denote also by N^T the Poisson counting process of η^T; it is a Poisson process with intensity H(R_T).
The next lemma will be used to determine the tail behaviour of the product of any two independent normalized jumps |J^T_k||J^T_l|/R_T², k ≠ l.
Lemma 3.2 Let U_R ≥ 1 and V_R ≥ 1 be two independent random variables with probability distribution function Then for each ε ∈ (0, α) there is C(ε) > 0 such that for all R ≥ 1 and all x ≥ 1
Proof Recall that Potter's bounds (Resnick 2007, Proposition 2.6 (ii)) imply that for each ε > 0 there is C_0(ε) > 0 such that for each x ≥ 1 and R ≥ 1 For x > 1 we write R (x). Then Eventually, for some C(ε) > 0.
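A rough bound of the type in Lemma 3.2 can be illustrated in the simplest special case. Assume, purely for illustration, that U and V are iid exact Pareto variables with P(U > u) = u^(−α), u ≥ 1, with no dependence on R. Conditioning on U then gives P(UV > x) = x^(−α)(1 + α ln x), and maximising (1 + α ln x)x^(−ε) over x ≥ 1 gives the constant (α/ε)e^(ε/α − 1) for ε ∈ (0, α). The Python sketch below checks both statements numerically; the function names are hypothetical.

```python
import math


def product_tail_exact(x, alpha):
    """P(U * V > x) for iid U, V with P(U > u) = u**(-alpha), u >= 1."""
    if x <= 1.0:
        return 1.0
    return x ** (-alpha) * (1.0 + alpha * math.log(x))


def product_tail_numeric(x, alpha, n=20000):
    """Same probability by conditioning on U (midpoint rule in log u)."""
    if x <= 1.0:
        return 1.0
    # P(UV > x) = P(U > x) + int_1^x P(V > x/u) * alpha * u**(-alpha-1) du
    total = x ** (-alpha)
    step = math.log(x) / n
    for i in range(n):
        u = math.exp((i + 0.5) * step)
        density = alpha * u ** (-alpha - 1.0)
        total += (x / u) ** (-alpha) * density * u * step  # du = u d(log u)
    return total


def rough_bound_constant(alpha, eps):
    """sup over x >= 1 of x**(alpha - eps) * P(UV > x), for 0 < eps < alpha."""
    return (alpha / eps) * math.exp(eps / alpha - 1.0)
```

The logarithmic factor is what forces the loss of ε in the exponent: the product tail is not bounded by a constant times x^(−α), but it is bounded by C(ε)x^(−α+ε).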

Remark 3.3
A finer tail asymptotics of products of iid non-negative Pareto type random variables can be found in Rosiński and Woyczyński (1987, Theorem 2.1) and Jessen and Mikosch (2006, Lemma 4.1 (4)). In Lemma 3.2, however, we establish rather rough estimates which are valid for the families of iid random variables {U R , V R } R≥1 .
The following useful Lemma will be used to determine the conditional distribution of the interarrival times of the compound Poisson process η T .

6)
where σ_k is a Beta(k, m − k + 1)-distributed random variable with density
$$f_{\sigma_k}(x) = \frac{m!}{(k-1)!\,(m-k)!}\,x^{k-1}(1-x)^{m-k}, \quad x\in[0,1]. \tag{3.7}$$
Proof It is well known that the conditional distribution of the arrival times τ_1, ..., τ_m, given that N_T = m, coincides with the distribution of the order statistics obtained from m samples from the population with uniform distribution on [0, T], see Sato (1999, Proposition 3.4). Let, for brevity, T = 1. The joint density of (τ_j, τ_{j+k}), 1 ≤ j < j + k ≤ m, is well known, see e.g. Balakrishnan and Nevzorov (2003, Chapter 11.10): Hence, the probability density of the difference τ_{j+k} − τ_j is obtained by integrating the joint density.
Recalling the definition of the Beta-function, we get which yields the desired result.
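The distribution of the spacing can be checked by simulation. The Python sketch below uses the standard fact that, for m iid uniform points on [0, 1], the difference between the (j+k)-th and the j-th order statistic has the Beta(k, m − k + 1) law, whose mean is k/(m + 1). The function names, sample size and seed are arbitrary choices made for this illustration.

```python
import math
import random


def spacing_density(x, m, k):
    """Beta(k, m - k + 1) density: law of U_(j+k) - U_(j) for m uniform points."""
    coef = math.factorial(m) / (math.factorial(k - 1) * math.factorial(m - k))
    return coef * x ** (k - 1) * (1.0 - x) ** (m - k)


def simulated_spacing_mean(m, j, k, n_samples=20000, seed=1):
    """Monte Carlo mean of U_(j+k) - U_(j) for m iid uniform points on [0, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        pts = sorted(rng.random() for _ in range(m))
        total += pts[j + k - 1] - pts[j - 1]  # order statistics are 1-based
    return total / n_samples
```

Note that the law of the spacing depends on k and m but not on j, which is why a single random variable σ_k suffices for all pairs (τ_j, τ_{j+k}).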

Proof
The process t ↦ φ_T² [η^T]_t is a compound Poisson process with Lévy measure ν_T with the tail Integrating by parts yields Since the first summands on the r.h.s. of (3.8) vanish, it is left to evaluate the integral term. Taking into account (2.5), namely that 1/T = H̃(1/φ_T), we write for any u_0 > 0 It is evident that lim T →∞ Resnick (2007, Proposition 2.4), the convergence Further we estimate Note that y ↦ yH(y) is integrable at 0 by the definition of the Lévy measure, 0 ≤ −∫_0^1 y² dH(y) < ∞, and the integration by parts. Eventually by Karamata's theorem (Resnick 2007, Theorem 2.1 (a)) Hence choosing u_0 > 0 sufficiently small and letting T → ∞ we obtain the convergence of K_T to the cumulant of a spectrally positive stable random variable
Lemma 3.6 For any ρ ∈ [0, 1/α) and any θ > 0 By the Itô isometry and Lemma 3.1, for any ε > 0 we estimate for each s ≥ 0 (3.10) Analogously, Lemma 3.1 yields and hence for each s ≥ 0 For any ρ ∈ [0, 1/α) we can choose ε > 0 sufficiently small such that the bounds in (3.10), (3.11) and (3.12) converge to 0 as T → ∞, which gives (3.9). Integrating these inequalities w.r.t. s ∈ [0, T] results in an additional factor T on the r.h.s. of these estimates, and convergence to 0 still holds true for ε > 0 sufficiently small.

Lemma 3.7
For any ρ > 1/(2α) and any θ > 0
Proof The Ornstein-Uhlenbeck process X^{η^T} as well as its integral w.r.t. η^T can be written explicitly in the form of sums: As always, we agree that Σ_{j=k}^m = 0 for m < k. Note that for N^T_T = 0 and N^T_T = 1, (3.13) We also take into account that for all m ≥ 2 and 1 ≤ j < j + k ≤ m where U_T, V_T are iid random variables with probability law and σ_k, k = 1, ..., m − 1, is a Beta(k, m − k + 1)-distributed random variable independent of U_T and V_T with probability density (3.7). For each m ≥ 0 denote by P^(m)_T the conditional law P( · | N^T_T = m). For some ε ∈ (0, (2 − α)/α) which will be chosen sufficiently small later, and for each m ≥ 2 define the family of positive weights is the normalizing constant. With this construction for each m ≥ 2 (3.14) Let γ > 0. In order to show that the sum (3.13) multiplied by φ_T² converges to zero, we take into account (3.14) and write Applying Lemma 3.2 and the independence of U_T V_T and σ_k we obtain for some ε > 0 where we have used the well known relation ∫_0^∞ a u^n e^{−au} du = n!/a^n, a > 0, n ≥ 0, as well as the elementary estimates (m − 1)^{α−ε} ≤ m² and k^{(2/α − ε)(α − ε)} ≤ k² which are valid for ε > 0 and α ∈ (0, 2). Hence To evaluate the inner sum we use the formula Σ_{j=0}^∞ (j + k)² a^j / j! = e^a (a² + 2ak + a + k²) to obtain (3.16) Combining (3.15) and (3.16), it is left to estimate two summands. For the first one, we use the formula Σ_{k=1}^∞ k² q^k = q(q + 1)/(1 − q)³, |q| < 1, to get For the second summand, we use the formula Σ_{k=1}^∞ k²(k + 1)² q^k = 4q(q² + 4q + 1)/(1 − q)⁵, |q| < 1, to get Combining (3.15) with the bounds for S_1 and S_2 we obtain Since ρ > 1/(2α), one can choose ε > 0 sufficiently small to obtain the limit p(T) → 0, T → ∞.
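The three summation formulas used in the proof are elementary and easy to confirm numerically. The Python sketch below compares truncated partial sums with the closed forms; the function names and truncation levels are choices made for this illustration.

```python
import math


def poisson_weighted_square_sum(a, k, terms=200):
    """Partial sum of  sum_{j>=0} (j + k)**2 * a**j / j!."""
    total, term = 0.0, 1.0  # term holds a**j / j!
    for j in range(terms):
        total += (j + k) ** 2 * term
        term *= a / (j + 1)
    return total


def power_series(weight, q, terms=5000):
    """Partial sum of  sum_{k>=1} weight(k) * q**k  for |q| < 1."""
    return sum(weight(k) * q ** k for k in range(1, terms + 1))


# Closed forms quoted in the proof.
def closed_poisson(a, k):
    return math.exp(a) * (a * a + 2 * a * k + a + k * k)


def closed_k2(q):
    return q * (q + 1) / (1 - q) ** 3


def closed_k2_k1_2(q):
    return 4 * q * (q * q + 4 * q + 1) / (1 - q) ** 5
```

The first identity is the second moment of J + k for a Poisson(a) variable J, multiplied by e^a; the other two follow from differentiating the geometric series.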

Proofs of the main results
Proof of Theorem 2.6 Let ρ ∈ (1/(2α), 1/α) be fixed. With the help of the decomposition (3.1) we may write as well as (4.1). It is easy to check that φ_T ∫_0^T X^T_s dW_s →d 0. Indeed, due to the independence of X_0 and W, φ_T ∫_0^T X_0 e^{−θs} dW_s = φ_T · X_0 · ∫_0^T e^{−θs} dW_s → 0 a.s. and obviously by Lemma 3.7 Finally by the estimate (3.10) of Lemma 3.6
Taking into account the argument in the proof of Theorem 2.6, we conclude that it is sufficient to consider the limiting behaviour of the pair φ_T

The processes η T and W are independent and
Proof of Theorem 2.5 The statement of the theorem follows immediately from Proposition 2.1 and Corollary 2.7. Indeed, for each θ_0 > 0 and u ∈ R we use the formula (2.3) for the likelihood ratio as well as semimartingale decompositions (2.1) and (2.2) to conclude that
Proof of Corollary 2.8 The relation (2.8) follows from Proposition 2.1. Due to the linear-quadratic form of the likelihood ratio, the maximum likelihood estimator coincides with the so-called central sequence. This implies the asymptotic efficiency in the aforementioned sense. The limit (2.9) follows from Corollary 2.7.
Proof of Remark 2.9 For x > 0, By the well known property of the Gaussian distribution p_2(x) ≤ √(2/π) e^{−x^α/2} / x^{α/2}.
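The Gaussian property invoked here is presumably the classical tail bound P(|N| > y) ≤ √(2/π) e^{−y²/2}/y for y > 0; substituting y = x^{α/2} produces a factor e^{−x^α/2}. The Python sketch below checks this bound numerically with the complementary error function. The function names are hypothetical, and the identification with the bound used in the proof is an assumption.

```python
import math


def gaussian_two_sided_tail(y):
    """P(|N| > y) for a standard Gaussian N."""
    return math.erfc(y / math.sqrt(2.0))


def mills_bound(y):
    """Upper bound sqrt(2/pi) * exp(-y**2 / 2) / y, valid for y > 0."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * y * y) / y
```

For small y the bound exceeds 1 and is uninformative; it becomes sharp as y grows, which is the regime relevant for tail estimates.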
Acknowledgements Open Access funding provided by Projekt DEAL. The authors thank the DAAD exchange programme Eastern Partnership for financial support. A.G. thanks Friedrich Schiller University Jena for hospitality. The authors are grateful to the anonymous referees for their valuable comments and careful reading of the manuscript.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License.