Abstract
Donsker-type functional limit theorems are proved for empirical processes arising from discretely sampled increments of a univariate Lévy process. In the asymptotic regime the sampling frequencies increase to infinity and the limiting object is a Gaussian process that can be obtained from the composition of a Brownian motion with a covariance operator determined by the Lévy measure. The results are applied to derive the asymptotic distribution of natural estimators for the distribution function of the Lévy jump measure. As an application we deduce Kolmogorov–Smirnov type tests and confidence bands.
1 Introduction
Suppose that \((L_t: t \geqslant 0)\) is a real-valued Lévy process defined on some probability space \((\Omega , {\mathcal {A}}, \Pr )\) and we observe \(n\) of its increments
sampled at frequency \(1/\Delta >0\). Equivalently the \(X_k\)’s are drawn i.i.d. from some infinitely divisible distribution \({{\mathrm{{\mathbb {P}}}}}_\Delta \), with corresponding empirical measures \({{\mathrm{{\mathbb {P}}}}}_{\Delta , n} = \frac{1}{n} \sum _{k=1}^n \delta _{X_k}\).
Lévy processes are increasingly popular in stochastic modelling. A question of key importance is how the structure of the Lévy process, particularly its jump behaviour, can be recovered from these observed increments. From a statistical point of view it is natural to consider a growing observation horizon \(n \Delta \rightarrow \infty \). If simultaneously \(\Delta = \Delta _n\) approaches zero one speaks of a ‘high-frequency’ sampling regime, as opposed to ‘low-frequency’ sampling where \(\Delta \) remains fixed. Inference problems of this kind have recently gained increased attention. Jongbloed et al. [22] studied nonparametric inference for Lévy-driven Ornstein–Uhlenbeck processes. Belomestny and Reiß [3] treat nonparametric estimation of Lévy processes in a financial model. Low-frequency observations were considered, e.g., by Neumann and Reiß [20], Belomestny [25], Gugushvili [2] as well as Nickl and Reiß [27], whereas Figueroa-López [12, 13] treats high-frequency observations. Nonparametric estimation of Lévy processes in a model selection context was studied by Comte and Genon-Catalot [6] and Kappus [23]. A general discussion of the literature and further references can be found in the recent survey paper [30].
By the Lévy–Khintchine representation [31] the Lévy process \((L_t: t \geqslant 0)\) is entirely characterised by three parameters: the diffusion coefficient \(\sigma ^2\) describing the Brownian motion component, the centring or drift parameter \(\gamma \), and the Lévy measure \(\nu \). Recovering the Lévy process can thus be reduced to recovering the Lévy triplet \((\sigma ^2, \gamma , \nu )\). Statistical inference for the one-dimensional parameters \(\sigma ^2, \gamma \) can be based on standard statistics such as the quadratic variation and the sample average of the increments, or on spectral estimators, see Sect. 4 for discussion and references.
An intrinsically more complex problem than inference on \(\sigma ^2\) and \(\gamma \) is the recovery of the Lévy measure \(\nu \), which describes the jump behaviour of the Lévy process. We recall that the Lévy measures are exactly the positive Borel measures \(\nu \) on \({{\mathrm{{\mathbb {R}}}}}\) such that
Thus a natural target is to recover the cumulative distribution function
from the observed increments; it encodes both local and global information about \(\nu \). The presence of \((1\wedge x^2)\) smooths the singularity that \(\nu \) may possess at the origin. Other possibilities to smooth the singularity exist and our results will cover functions from a general class (see Sect. 3). In particular this will include recovery of the distribution function
of the Lévy measure at any point \(t\ne 0\).
For statistical applications, inference on the functions \(N, {\mathcal {N}}\) in the uniform norm \(\Vert \cdot \Vert _\infty \) on the real line is of particular interest, paralleling the classical Donsker–Kolmogorov–Smirnov central limit theorems
in the space of bounded functions on \({{\mathrm{{\mathbb {R}}}}}\), where \(F_n\) is the empirical distribution function of a random sample from distribution \(F\), and where \({\mathbb {G}}_F\) is the \(F\)-Brownian bridge [9, 33]. In the Lévy setting, Nickl and Reiß [27] considered an estimator for the distribution function \({\mathcal {N}}(t)\), \(|t|\geqslant \zeta ,\) based on low-frequency observations (\(\Delta \) fixed) and proved such a Donsker–Kolmogorov–Smirnov theorem. The purpose of the present article is to derive such results when also \(\Delta \rightarrow 0\). The main message is that high-frequency observations reveal much finer statistical properties of the Lévy measure, and inference is possible for a much larger class of Lévy processes than considered in [27], including processes with a nonzero Gaussian component. Moreover, the theory does not only cover nonlinear ‘inversion’ estimators based on the Lévy–Khintchine formula, but also ‘linear’ estimators based on elementary counting statistics. At the heart of these results is a general purpose uniform central limit theorem for a basic ‘smoothed empirical process’ arising from the \(X_k\)’s in (1), see Theorem 11 below.
In the next section we introduce the estimators and give the main results as well as some statistical applications. In Sect. 3 we show how to reduce the proofs to the study of a unified smoothed empirical process, and in Sect. 4 we discuss our conditions and their interpretation in a variety of concrete examples of Lévy processes. The remainder of the article is then devoted to the proofs of our results.
2 Main results: asymptotic inference on the Lévy measure \(\nu \)
In this section we study two approaches to estimate the distribution functions \(N, {\mathcal {N}}\) of a Lévy measure, based on discrete observations (1). The first estimator is constructed by a direct approach and counts the number of increments below a certain threshold, where increments are weighted by \(1\wedge X_k^2\). The second approach relies on the Lévy–Khintchine representation and a spectral regularisation step.
2.1 Basic notation and assumptions
The symbol \(\ell ^\infty (T)\) denotes the space of bounded functions on a set \(T\) normed by the usual supremum norm \(\Vert \cdot \Vert _\infty \). We will measure the smoothness of functions in a local Hölder norm: denoting by \(C(U)=C^0(U)\) the set of all functions on an open set \(U\subseteq {{\mathrm{{\mathbb {R}}}}}\) which are bounded, continuous and real-valued, we define for \(s>0\) the Hölder spaces
where \(\lfloor s \rfloor \) denotes the largest integer strictly smaller than \(s\).
We assume throughout this article that the Lévy measure has finite second moments,
This is equivalent to \({{\mathrm{{\mathbb {P}}}}}_\Delta \) having finite second moments for all \(\Delta >0\) [31].
For our main results we will rely on the following stronger assumption on \(\nu \). Slightly abusing notation we shall use the same symbol for a measure and its Lebesgue density, if the latter exists. Also we use \(\lesssim , \gtrsim , (\sim )\) to denote (two-sided) inequalities up to a multiplicative constant.
Assumption 1
(a) For some \(\varepsilon >0\) we have
$$\begin{aligned} \int _{{{\mathrm{{\mathbb {R}}}}}}|x|^{4+\varepsilon }\nu (\,\mathrm {d}x)<\infty . \end{aligned}$$
(b) The Lévy measure \(\nu \) has a Lebesgue density, also denoted by \(\nu \), and
$$\begin{aligned} (1\wedge x^4)\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}}). \end{aligned}$$
(c) The measure \(x^3{{\mathrm{{\mathbb {P}}}}}_{\Delta }\) admits a Lebesgue density, also denoted by \(x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \), satisfying, as \(\Delta \rightarrow 0\),
$$\begin{aligned} \Vert x^3{{\mathrm{{\mathbb {P}}}}}_{\Delta } \Vert _\infty \lesssim \Delta . \end{aligned}$$
(d) Let \(U\) be a neighbourhood of the origin and \(V \subseteq {{\mathrm{{\mathbb {R}}}}}\). For some \(s>0\) and some finite constants \(c_t>0\), \(t\in V\), we have
$$\begin{aligned} \Vert g_t(-\cdot ) *(x^2\nu )\Vert _{C^s(U)} \leqslant c_t \qquad \text { with } g_{t}(x):= (1\wedge x^{-2})1\!\!1_{(-\infty ,t]}(x). \end{aligned}$$
Assumptions (a) and (b) are a moment condition and a mild regularity condition on the Lévy measure, respectively. Assumption (c) is the key condition and will be discussed in detail in Sect. 4.4. Here we just remark that, for instance under the assumption \(x^3\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\), this condition will be shown to be satisfied whenever the diffusion coefficient is positive (\(\sigma >0\)). Assumption (d) is used to control approximation-theoretic properties of the distribution function of \(x^2\nu \). For global results (\(V={{\mathrm{{\mathbb {R}}}}}\)) we note that (d) is easily seen to be satisfied with a uniform constant \(c>0\) whenever \(x^2\nu \in C^{s-1}({{\mathrm{{\mathbb {R}}}}})\), \(s \geqslant 1\).
Recall that a function \(l\) defined on \((0,\infty )\) is slowly varying at the origin if
A function \(f\) is regularly varying at the origin with exponent \(p\in {{\mathrm{{\mathbb {R}}}}}\) if \(f\) is of the form
with \(l\) slowly varying at the origin. We denote the symmetrised Lévy density by \({\widetilde{\nu }}(x):=\nu ^+(x)+\nu ^-(-x)\), where \(\nu ^+=\nu 1\!\!1_{{{\mathrm{{\mathbb {R}}}}}^+}\) and \(\nu ^-=\nu 1\!\!1_{{{\mathrm{{\mathbb {R}}}}}^-}\).
Throughout the paper we write \(\rightarrow ^{\mathcal {L}}\) to denote convergence in distribution of random elements in a metric space as in Chapter 1 in [33].
2.2 The direct estimation approach
In the high-frequency regime \((\Delta \rightarrow 0)\) inference on \(\nu \) can be based on the following simple observation.
Lemma 2
If the Lévy measure \(\nu \) satisfies (4), then we have the weak convergence
as \(\Delta \rightarrow 0\) in the sense that
for every bounded continuous function \(f: {{\mathrm{{\mathbb {R}}}}}\rightarrow {{\mathrm{{\mathbb {R}}}}}\).
Starting with Lévy processes without diffusion component, that is, with \(\sigma =0\), the asymptotic identification (5) motivates a linear estimator of \(N(t)\) given by
$$\begin{aligned} {\widetilde{N}}_n(t):=\Delta ^{-1}\int _{-\infty }^{t}(1\wedge x^2)\,{{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}(\,\mathrm {d}x) =\frac{1}{n\Delta }\sum _{k=1}^{n}\big (1\wedge X_k^2\big )1\!\!1_{(-\infty ,t]}(X_k), \end{aligned}$$
where \({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}=\frac{1}{n}\sum _{k=1}^{n}\delta _{X_k}\) is the empirical measure of the increments from (1).
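Since \({\widetilde{N}}_n(t)\) is just a weighted count of the increments below the threshold \(t\), it is straightforward to compute. The following minimal sketch in Python/NumPy (the function and variable names are ours, not from the paper) assumes the counting representation \({\widetilde{N}}_n(t)=(n\Delta )^{-1}\sum _{k=1}^{n}(1\wedge X_k^2)1\!\!1\{X_k\leqslant t\}\) described in the text:

```python
import numpy as np

def direct_estimator(X, Delta, t_grid):
    """Evaluate the direct estimator
        N~_n(t) = (n*Delta)^{-1} * sum_k (1 ∧ X_k^2) * 1{X_k <= t}
    simultaneously on a grid of thresholds t."""
    X = np.asarray(X, dtype=float)
    n = X.size
    Xs = np.sort(X)
    # cumulative sums of the weights 1 ∧ X_k^2, in increasing order of X_k
    cumw = np.concatenate([[0.0], np.cumsum(np.minimum(1.0, Xs**2))])
    # number of sorted observations <= each threshold t
    idx = np.searchsorted(Xs, t_grid, side="right")
    return cumw[idx] / (n * Delta)
```

After an \(O(n\log n)\) sort the whole path of the estimator is available, each threshold costing only a binary search; monotonicity of \(t\mapsto {\widetilde{N}}_n(t)\) is automatic from the nonnegative weights.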
Similarly, and including the case \(\sigma \ne 0\), one can estimate the function \({\mathcal {N}}\) by
We start with a theorem for the basic estimator \({\widetilde{N}}_n\).
Theorem 3
Let \(\sigma =0\) and grant Assumption 1 for \(V={{\mathrm{{\mathbb {R}}}}}\), for some \(s\in (0,2]\) and with uniform constant \(\sup _{t\in {{\mathrm{{\mathbb {R}}}}}}c_t=c<\infty \).
Assume either that
(a) the density of \(x \nu \) exists and is of bounded variation, and that the drift \(\gamma _0 := \gamma -\int x \nu (\,\mathrm {d}x)\) equals zero; or that
(b) \({\widetilde{\nu }}(x)=\nu ^+(x)+\nu ^-(-x)\) is regularly varying at zero with exponent \(-(\beta +1)\), for some \(\beta \in (0,2)\) and \(s \in (0,2-\beta )\).
If \(n\rightarrow \infty \) and \(\Delta _{n}\rightarrow 0\) such that
then
where \(\mathbb {G}\) is a tight Gaussian random variable arising from the centred Gaussian process \(\{\mathbb {G}(t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) with covariance
Since estimation at the origin \(t=0\) is included in the last theorem, the assumption \(\sigma =0\) is natural—the simple linear estimator \({\widetilde{N}}_n\) cannot distinguish between arbitrarily small jumps and a Brownian diffusion component. Moreover, setting the drift \(\gamma _0 =0\) in (a) rules out situations where the measure \({{\mathrm{{\mathbb {P}}}}}_\Delta \) has a discrete component \(\delta _{\Delta \gamma _0}\), which causes complications in the analysis. Simultaneous estimation of all parameters of the Lévy triplet without restrictions on \(\gamma \) and \(\sigma \) will be considered by non-linear methods in the next subsection.
The conditions (a) and (b) are required to show that the deterministic ‘bias’ term arising from the basic linear estimator is negligible in the limit distribution (Proposition 17). The case (a) covers many examples of finite activity Lévy processes as well as some limiting cases where the singularity of \(\nu \) at the origin behaves like \(|x|^{-1}\) (see Sect. 4.5 for examples). In contrast case (b) covers infinite activity processes with a singularity of the form \(|x|^{-1-\beta }, \beta \in (0,2)\). The assumption of regular variation of \(\widetilde{\nu }\) at zero is natural in all key examples considered in Sect. 4.5 below—typically the variation exponent will be closely related to the regularity \(s\) of \(N\), and we discuss in Sect. 4.1 how our parameter constraints on \(\beta \) and \(s\) are compatible.
When the origin is excluded from consideration, an argument of [13] can be used to obtain the following result for the linear estimator \({\widetilde{\mathcal {N}}}\), allowing also for \(\sigma \ne 0\):
Theorem 4
Grant Assumptions 1(a)–(c). Let \(\zeta >0\) and suppose that the Lévy density \(\nu \) is Lipschitz continuous in an open set \(V_0\) containing \(V=(-\infty ,-\zeta ]\cup [\zeta ,\infty )\). Let \(n\rightarrow \infty \) and \(\Delta _{n}\rightarrow 0\) such that
Then
where \(\mathbb {W}\) is a tight Gaussian random variable arising from the centred Gaussian process \(\{\mathbb {W}(t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) with covariance, for \(f_t=1\!\!1_{(-\infty ,t]}\) for \(t<0\) and \(f_t=1\!\!1_{[t,\infty )}\) for \(t>0\),
The estimators \({\widetilde{N}}_n,\widetilde{\mathcal {N}}_n\) are ‘linear’ in the observations \({{\mathrm{{\mathbb {P}}}}}_{\Delta , n}\), and their consistency relies on the assumption that \(\Delta _n\) tends to zero fast enough, in Theorems 3 and 4 at least of order \(\Delta _{n}=o(n^{-1/(s+1)})\) for \(s\in (0,2]\). In both theorems a weaker assumption than \(\Delta _{n}=o(n^{-1/3})\) cannot be expected in general: In typical situations the function \({{\mathrm{{\mathbb {P}}}}}(X_{\Delta _n}\leqslant t)\), \(t<0\), can be expressed in terms of \(\Delta _n\) as a series expansion
For a compound Poisson process this follows by conditioning on the number of jumps but it also holds in more general infinite activity cases (see [14]). From the expansion we see that the approximation error \(\Delta _n^{-1}{{\mathrm{{\mathbb {P}}}}}(X_{\Delta _n}\leqslant t)-\nu ((-\infty ,t])\) will not decay faster than \(\Delta _n\), and the assumption \(\Delta _n=o(1/\sqrt{n\Delta _n})\) is expressed equivalently as \(\Delta _{n}=o(n^{-1/3})\).
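To make the conditioning argument explicit, here is the compound Poisson computation (a sketch under the additional assumptions \(\sigma =0\), no drift, jump intensity \(\lambda :=\nu ({{\mathrm{{\mathbb {R}}}}})<\infty \) and jump distribution \(\mu :=\nu /\lambda \)). Conditioning on the number of jumps up to time \(\Delta _n\) yields, for \(t<0\),
$$\begin{aligned} {{\mathrm{{\mathbb {P}}}}}(X_{\Delta _n}\leqslant t)=\sum _{k=1}^\infty e^{-\lambda \Delta _n}\frac{(\lambda \Delta _n)^k}{k!}\,\mu ^{*k}((-\infty ,t]) =\Delta _n\,\nu ((-\infty ,t])+O(\Delta _n^{2}), \end{aligned}$$
where the \(k=0\) term vanishes since \(t<0\), and the \(k=1\) term contributes \(e^{-\lambda \Delta _n}\lambda \Delta _n\,\mu ((-\infty ,t])=\Delta _n\,\nu ((-\infty ,t])+O(\Delta _n^2)\). The \(O(\Delta _n^2)\) term is in general nonzero, which is exactly why the bias \(\Delta _n^{-1}{{\mathrm{{\mathbb {P}}}}}(X_{\Delta _n}\leqslant t)-\nu ((-\infty ,t])\) typically decays no faster than \(\Delta _n\).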
2.3 The spectral estimation approach
Instead of relying on \(\Delta \rightarrow 0\) one can identify the Lévy measure by the Lévy–Khintchine formula
which we give here in Kolmogorov’s version (valid under (4), see (8.8) in [31]). Differentiating the characteristic exponent \(\psi (u)=\Delta ^{-1}\log \varphi _\Delta (u)\), one sees
where \({{\mathrm{{\mathcal {F}}}}}f(u):=\int e^{iux}f(x)\,\mathrm {d}x\) and \({{\mathrm{{\mathcal {F}}}}}\mu (u) := \int e^{iux}\mu (\,\mathrm {d}x)\) for any \(f\in L^1({{\mathrm{{\mathbb {R}}}}})\cup L^2({{\mathrm{{\mathbb {R}}}}})\) and any finite measure \(\mu \), respectively, denotes the Fourier transform. If \({{\mathrm{{\mathcal {F}}}}}^{-1}\) is the inverse Fourier transform we hence have
In contrast to (5) this identification of \(\nu \) is nonlinear in \(\varphi _\Delta = {{\mathrm{{\mathcal {F}}}}}{{\mathrm{{\mathbb {P}}}}}_{\Delta }\), but has the remarkable advantage of being nonasymptotic and valid for all \(\Delta >0\), without relying on a high-frequency approximation \(\Delta \rightarrow 0\). This was exploited in [27] to show that a plug-in of the empirical characteristic function \({\mathcal {F}} {{\mathrm{{\mathbb {P}}}}}_{\Delta , n}\) into (9) can result, for a (naturally) restricted class of Lévy processes, in efficient recovery of \({\mathcal {N}}(t), t \ne 0,\) without the requirement \(\Delta \rightarrow 0\). In the low-frequency case only Lévy processes without diffusion component can be covered. Our high-frequency setting allows us to drop this (otherwise necessary) restriction and to treat Lévy processes with diffusion component and with Lévy measures from a much wider class.
Replacing \(\varphi _\Delta (u)\) in (9) by the empirical characteristic function of the observed increments,
(and its derivatives \(\varphi _{\Delta , n}^{(i)}, i=1,2\), respectively), we obtain an empirical plug-in estimate \({\widehat{\psi }}_n''\) of \(\psi ''\). Recalling the definitions of \(g_t, f_t\) in Assumption 1(d) and in Theorem 4, respectively, the resulting estimators of \(N,{\mathcal {N}}\) are given by
Here \(K_h\) is a kernel such that \({{\mathrm{{\mathcal {F}}}}}K_h\) has compact support, specified in detail below, ensuring in particular that \({\widehat{N}}_n,\widehat{\mathcal {N}}_n\) are well-defined (on sets of probability approaching one). Moreover, \({\widehat{\sigma }}^2\) is any pilot estimate of \(\sigma ^2\). We can estimate \(\sigma ^2\) for instance as in [21] by
where \(c_0>0\) is a suitable numerical constant and where we assume a lower bound on the characteristic function determined by \(\sigma _{\max }>0\). Under suitable conditions Proposition 13 below entails that the estimator \({\widehat{\sigma }}^2\) satisfies
and hence is negligible in the limit process \({\mathbb {G}}\) in the next theorem. While the construction of an optimal estimator of \(\sigma ^2\) in the setting considered here is a topic of independent interest, Theorem 5 below will hold for any plug-in estimator that satisfies (13).
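To make the plug-in step concrete, here is a minimal numerical sketch of \({\widehat{\psi }}_n''\) (Python/NumPy; the function name is ours). It uses \(\psi ''=\Delta ^{-1}\big (\varphi _\Delta ''/\varphi _\Delta -(\varphi _\Delta '/\varphi _\Delta )^2\big )\), the second derivative of \(\Delta ^{-1}\log \varphi _\Delta \), with the empirical characteristic function \(\varphi _{\Delta ,n}(u)=n^{-1}\sum _k e^{iuX_k}\) and its first two derivatives plugged in:

```python
import numpy as np

def psi_hat_second(X, Delta, u):
    """Plug-in estimate of psi''(u), where psi(u) = Delta^{-1} log phi_Delta(u).
    Uses phi(u) = mean(e^{iuX}), phi'(u) = mean(iX e^{iuX}),
    phi''(u) = -mean(X^2 e^{iuX})."""
    X = np.asarray(X, dtype=float)
    u = np.atleast_1d(np.asarray(u, dtype=float))
    e = np.exp(1j * np.outer(u, X))           # shape (len(u), n)
    phi0 = e.mean(axis=1)
    phi1 = (1j * X * e).mean(axis=1)
    phi2 = (-(X**2) * e).mean(axis=1)
    # second derivative of Delta^{-1} log phi
    return (phi2 / phi0 - (phi1 / phi0) ** 2) / Delta
```

In the Kolmogorov parametrisation one has \(\psi ''(u)=-\sigma ^2-{{\mathrm{{\mathcal {F}}}}}[x^2\nu ](u)\), so \(-{\widehat{\psi }}_n''(u)-{\widehat{\sigma }}^2\) serves as the spectral estimate of \({{\mathrm{{\mathcal {F}}}}}[x^2\nu ](u)\), to be multiplied by \({{\mathrm{{\mathcal {F}}}}}K_{h}\) and Fourier-inverted as in the identification (9).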
We regularise with a band-limited kernel \(K_h:=h^{-1}K(h^{-1}{\scriptstyle \bullet } )\) of bandwidth \(h>0\). The following properties of \(K\) are supposed:
The main result for the spectral estimators is the following theorem, where \(\mathbb {G}\) and \(\mathbb {W}\) are tight Gaussian random variables arising from the same Gaussian processes as in Theorems 3 and 4, respectively. For Part (ii) we recall the definition \(f_t=1\!\!1_{(-\infty ,t]}\) for \(t<0\) and \(f_t=1\!\!1_{[t,\infty )}\) for \(t>0\).
Theorem 5
Grant Assumptions 1(a)–(c) and let \(s>0\). Let the kernel satisfy (14) with \(p\geqslant s\vee 2\) and choose \(h_n\sim \Delta _n^{1/2}\). Let either \(\sigma ^2\) be known (in which case \({\widehat{\sigma }}^2 := \sigma ^2\)), or let \({\widehat{\sigma }}^2\) be any estimator satisfying (13). Suppose \(n\rightarrow \infty \) and \(\Delta _{n}\rightarrow 0\) such that
(i) Grant Assumption 1(d) for \(s>0\), for \(V={{\mathrm{{\mathbb {R}}}}}\) and for constants \(c_t\) with \(\sup _{t\in {{\mathrm{{\mathbb {R}}}}}}c_t=c<\infty \). Then
$$\begin{aligned} \sqrt{n\Delta _n}\big (\widehat{N}_{n}-N\big )\rightarrow ^{\mathcal {L}}\mathbb {G} \quad \text {in}\,\ell ^{\infty }({{\mathrm{{\mathbb {R}}}}}). \end{aligned}$$
(ii) Grant Assumption 1(d) for \(s>0\), for \(g_t(x)=x^{-2}f_t(x)\), for \(V=(-\infty ,-\zeta ]\cup [\zeta ,\infty )\), \(\zeta >0\), and for constants \(c_t\) with \(\sup _{t\in V}c_t=c'<\infty \). Then
$$\begin{aligned} \sqrt{n\Delta _n}\big (\widehat{\mathcal {N}}_{n}-\mathcal {N}\big )\rightarrow ^{\mathcal {L}} \mathbb {W} \quad \text {in}\,\ell ^{\infty }(V). \end{aligned}$$
2.4 Limit process and statistical applications
The continuous mapping theorem with the usual sup-norm \(\Vert \cdot \Vert _\infty \) combined with Theorems 3 and 5 yields in particular the limit theorems, as \(n \rightarrow \infty \),
This can be used to construct Kolmogorov–Smirnov tests for Lévy measures and global confidence bands for the function \(N\), as we explain now.
For absolutely continuous Lévy measures \(\nu \) the Gaussian random function \(({\mathbb {G}}(t):t\in {{\mathrm{{\mathbb {R}}}}})\) can be realised as a version of
where \({\mathbb {B}}\) is a standard Brownian motion. An alternative representation is given by \({\mathbb {G}}(t)=\int _{-\infty }^{t}(1\wedge x^2) \nu (x)^{1/2}\,\mathrm {d}{\mathbb {B}}(x)\), where \({\mathbb {B}}\) is a two-sided Brownian motion. We have
so that quantiles of the distribution of \(\Vert {\mathbb {G}}\Vert _\infty \) can be calculated. For example a global asymptotic confidence band for \(N\) can be constructed in the setting of Theorems 3 and 5 by defining
with consistent estimators
of the standard deviation \((\int _{{{\mathrm{{\mathbb {R}}}}}}(1\wedge x^4)\nu (\,\mathrm {d}x))^{1/2}\), and with \(q_\alpha \) the upper \(\alpha \)–quantile, \(0<\alpha <1\), of the distribution of \(\max _{s\in [0,1]}|{\mathbb {B}}(s)|\) (see Example X.5(c) in [10] for its well-known formula). For the confidence band \(C_n\) equal to either
Theorems 3 and 5 imply, under the respective assumptions, that the asymptotic coverage probability of \(C_n\) equals
Theorems 3 and 5 likewise allow the construction of tests: if \(H_0\) is a set of Lévy measures, let \({\mathcal {D}}\) be the set of the corresponding cumulative distribution functions of the form (2). We define the test \(T_n=1\!\!1\{{\mathcal {D}}\cap C_n=\varnothing \}\), rejecting \(H_0\) when \(T_n=1\) and accepting \(H_0\) when \(T_n=0\). This test has asymptotic level \(\alpha \): if \({{\mathrm{{\mathbb {P}}}}}_\vartheta \) is the law of a Lévy process with \(\vartheta \in H_0\), then we have
assuming \(H_0\) satisfies the assumptions of Theorem 3 or 5.
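The quantile \(q_\alpha \) of \(\max _{s\in [0,1]}|{\mathbb {B}}(s)|\) is easy to evaluate numerically from the well-known series formula \(\Pr (\max _{s\in [0,1]}|{\mathbb {B}}(s)|\leqslant x)=\frac{4}{\pi }\sum _{k=0}^{\infty }\frac{(-1)^k}{2k+1}e^{-(2k+1)^2\pi ^2/(8x^2)}\) (cf. Example X.5(c) in [10]). A short Python sketch (helper names ours) evaluates the series and inverts it by bisection:

```python
import math

def cdf_max_abs_bm(x, terms=100):
    """P(max_{0<=s<=1} |B(s)| <= x) for a standard Brownian motion B,
    via the classical alternating series."""
    if x <= 0.0:
        return 0.0
    c = math.pi**2 / (8.0 * x * x)
    s = sum((-1)**k / (2 * k + 1) * math.exp(-(2 * k + 1)**2 * c)
            for k in range(terms))
    return 4.0 / math.pi * s

def upper_quantile(alpha, lo=1e-6, hi=10.0, tol=1e-10):
    """q_alpha with P(max |B| > q_alpha) = alpha, by bisection on the CDF."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_max_abs_bm(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

This gives, for instance, \(q_{0.05}\approx 2.24\) and \(q_{0.10}\approx 1.96\); the series converges extremely fast, so a handful of terms already suffices in practice.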
2.5 Numerical example
Let us briefly illustrate the finite sample performance of the two estimation approaches and their corresponding confidence bands. We apply the procedures to two standard examples of pure jump Lévy processes: a Gamma process and a normal inverse Gaussian (NIG) process. The empirical coverage of the confidence bands reveals the finite sample level of the associated Kolmogorov–Smirnov tests and the size of the bands indicates the power of the tests.
The Gamma process has infinite, but relatively small jump activity (its Blumenthal–Getoor index equals zero). Its Lévy measure is given by the Lebesgue density \( \nu (x)=\frac{c}{x}e^{-\lambda x}, x>0,\) and we choose \(c=30\) and \(\lambda =1\) here. The NIG process can be constructed by subordinating a diffusion with volatility \(s>0\) and drift \(\vartheta \in {{\mathrm{{\mathbb {R}}}}}\) by an inverse Gaussian process with variance \(\kappa >0\). The resulting infinite variation process has Blumenthal–Getoor index equal to one. The NIG process admits an explicit formula for the jump measure and for its law we apply the simulation algorithm from [7], choosing \(s=1.5, \vartheta =0.1\) and \(\kappa =0.5\). Both processes satisfy the assumptions of Theorems 3 and 5, cf. Sect. 4.5.
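For reproducibility we note how the Gamma increments can be generated: the marginal law of a Gamma process with Lévy density \(\nu (x)=\frac{c}{x}e^{-\lambda x}\) at time \(\Delta \) is the Gamma distribution with shape \(c\Delta \) and rate \(\lambda \), so the \(n\) increments are i.i.d. \(\Gamma (c\Delta ,\lambda )\) draws. A sketch (variable names ours):

```python
import numpy as np

def simulate_gamma_increments(n, Delta, c, lam, seed=0):
    """Draw n i.i.d. increments of a Gamma process with Levy density
    nu(x) = (c/x) e^{-lam x}, x > 0: each increment is Gamma(c*Delta, rate lam)."""
    rng = np.random.default_rng(seed)
    return rng.gamma(shape=c * Delta, scale=1.0 / lam, size=n)
```

With the parameters used below (\(n=2000\), \(\Delta =0.01\), \(c=30\), \(\lambda =1\)) the shape parameter \(c\Delta =0.3\) is small, so most increments are tiny; this is exactly the many-small-jumps regime in which the bias of the direct estimator becomes visible.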
We simulate \(n=2{,}000\) increments with observation distance \(\Delta =0.01\). For the spectral estimator we apply a flat top kernel and the universal bandwidth choice \(h=\sqrt{\Delta }\) which turned out to perform well in a variety of settings. Figure 1 shows the true distribution-type function \(N\), the direct estimator \({\widetilde{N}}_n\) from (7) and the spectral estimator \({\widehat{N}}_n\) from (11) for 50 simulations. In each setting the confidence band for level \(\alpha =0.9\), as constructed in the previous section, is plotted for the first simulation result. We clearly see the higher activity of small jumps of the NIG process from the linear growth of \(N\) at zero. On the other hand, the choice of our process parameters yields more pronounced tails of the jump measure for the gamma process.
Direct estimator \({\widetilde{N}}_n\) (left) and spectral estimator \({\widehat{N}}_n\) (right) for the Gamma (top) and NIG process (bottom). Each time 50 estimators (light blue) and the true distribution function (black) are shown. One estimator (blue, solid) with its asymptotic 0.9-confidence band (blue, dashed) is highlighted (colour figure online)
By construction, the direct estimator is not smooth. For the Gamma process it possesses a significant bias: the intensity of the small jumps is systematically underestimated, which results in an overestimation of the larger jumps and thus in too large values of \({\widetilde{N}}_n(t)\) for large \(t\). For the choice \(\Delta =0.001\) this bias of the direct estimator is already negligible. In the simulations of the NIG process, \({\widetilde{N}}_n\) achieves good results that are in line with the asymptotic theory. For the Gamma process the empirical coverage of the \(\alpha =0.9\) confidence bands in 500 Monte Carlo iterations is 0.86 for the spectral estimator, whereas the direct estimator attains an empirical coverage of only 0.59, reflecting the bias problem mentioned above. For the NIG process both estimators yield bands covering the true \(N\) uniformly in \(92\,\%\) of the cases.
3 Unifying empirical process
The key probabilistic challenge in the proofs of Theorems 3–5 is a uniform central limit theorem for certain smoothed empirical processes arising from the sampled increments (1). We show in this section how these processes arise naturally for both estimation approaches considered here.
We will consider slightly more general objects than the distribution function \(N(t) = \int _{-\infty }^t (1 \wedge x^2) \nu (\,\mathrm {d}x)\)—the truncation at one in \((1 \wedge x^2)\) is somewhat arbitrary and, in particular, not smooth. Other truncations such as \(x^2/(1+x^2)\), or variations thereof can be of interest. To accommodate such examples we thus consider recovery of the functionals
where the ‘clipping function’ \(\rho \) satisfies the following condition:
Assumption 6
The function \(\rho \) satisfies \(0< \rho (x) \leqslant C(1 \wedge x^{-2})\) for all \(x \in {{\mathrm{{\mathbb {R}}}}}\) and some constant \(0<C<\infty \). Moreover, \(\rho , x\rho \) are Lipschitz continuous functions of bounded variation (i.e., their weak derivative is equal to a finite signed measure).
This covers the above examples [with either \(\rho (x) = 1 \wedge x^{-2}\) or \(\rho (x) = 1/(1+x^2)\)]. In the definition of the basic estimator (7) and the kernel estimator (11), we only need to replace \((1\wedge x^2)1\!\!1_{(-\infty ,t]}(x)\) by \(x^2g_t(x)\) where now
replacing also \(g_t\) in Assumption 1. The covariance of the limit process in Theorems 3 and 5 then changes to
and the according representation of \({\mathbb {G}}\) in terms of a reparametrised Brownian motion is
Let us turn to the main purpose of this section: We start with the direct estimator \({\widetilde{N}}_n\), which is easier to analyse. The estimation error of \(\widetilde{N}_{n}\) can be decomposed as follows
for any \(t\in {{\mathrm{{\mathbb {R}}}}}\). The first term \(B\) is a deterministic approximation error and the rough idea for controlling it is to view \({{\mathrm{{\mathbb {P}}}}}_\Delta \) as an approximate identity and to use similar arguments as for the approximation error of a kernel estimator. The second term \(S\) is the main stochastic error term driven by the empirical process
where the scaling follows from the intuitive observation that the \(X_k\)’s are drawn i.i.d. from law \({{\mathrm{{\mathbb {P}}}}}_{\Delta }\) and hence satisfy, using that \({{\mathrm{{\mathbb {P}}}}}_\Delta \) is an infinitely divisible distribution,
Turning our attention to the second estimator we decompose \({\widehat{N}}_n-N_\rho \) into three error terms, using (9):
The first term is a deterministic approximation error, which can be bounded by Assumption 1(d) on the smoothness. The last term will be negligible since we assume that \({\widehat{\sigma }}^2\) converges to \(\sigma ^2\) with a faster rate than \(1/\sqrt{n\Delta }\). The key stochastic term is the second one. Compared to the basic estimator \({\widetilde{N}}_n\) we face the additional difficulty that \(\widehat{\psi }_n''\) depends nonlinearly on \({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}\). The following result shows that even after linearisation the resulting term is still different from the basic process \({\widetilde{N}}_n -{{\mathrm{{\mathbb {E}}}}}{\widetilde{N}}_n\) in that it performs a division by \(\varphi _\Delta \) in the spectral domain.
Proposition 7
Grant Assumption 1(a) and assume \(\sup _{u\in [-1/h_n,1/h_n]}|\varphi _{\Delta _n}(u)|^{-1}\lesssim 1\) for some \(h_n \rightarrow 0, \Delta _n \rightarrow 0\). Let the function
satisfy uniformly for \(h_n,\Delta _n\rightarrow 0\), \(\Vert m\Vert _\infty \lesssim 1\). If \(n\Delta _n\rightarrow \infty \) and \(h_n\rightarrow 0\) with \(h_n\gtrsim \Delta _n^{1/2}\), then we have
where
Note that the assumption \(\Vert m\Vert _ \infty \lesssim 1\) uniformly for \(h_n, \Delta _n\rightarrow 0\) is valid if \(K\) is as in (14), \(h_n\sim \sqrt{\Delta _n}\) and \( \sup _{u\in [-1/h_n,1/ h_n]} |\varphi _{\Delta _n}(u)|^{-1} \lesssim 1\).
We refer to \(M_{\Delta ,n}\) as the main stochastic term. To accommodate both (21) and (23) we now study empirical processes
for general \((n,\Delta )\)-dependent Fourier multipliers \(m: {{\mathrm{{\mathbb {R}}}}}\rightarrow {\mathbb {C}}\) satisfying the following condition.
Assumption 8
For every \(n, \Delta \) the twice differentiable functions \(m = m_{n, \Delta }: {{\mathrm{{\mathbb {R}}}}}\rightarrow {\mathbb {C}}\) are either such that
(a) \({{\mathrm{{\mathcal {F}}}}}^{-1}[m_{n, \Delta }]\), \({{\mathrm{{\mathcal {F}}}}}^{-1}[m_{n, \Delta }']\) are finite signed measures with uniformly bounded total variations, or such that
(b) \({{\mathrm{{\mathcal {F}}}}}^{-1}[m_{n, \Delta }]\) is real-valued and \(m_{n, \Delta }\) is supported in \([-C\Delta ^{-1/2}, C \Delta ^{-1/2}]\) for some fixed constant \(C>0\).
(c) In addition to (a) or (b), letting \(\Delta =\Delta _n \rightarrow 0\) as \(n \rightarrow \infty \) we assume that \(m_{n,\Delta } \rightarrow 1\) pointwise on \({{\mathrm{{\mathbb {R}}}}}\), that
for some \(0<c<\infty \) independent of \(n, \Delta \) and that
The above assumption is an adaptation of the usual Mikhlin-type Fourier multiplier conditions to the situation relevant here (see [19], Cor. 4.11). It ensures that \(m\), \(m'\) act as norm-continuous Fourier multipliers on suitable function spaces, which will be a key tool in our proofs. Obviously Assumption 8 covers the case \(m=1\) relevant in (21) above. Moreover, we show in Proposition 19 below that it also covers \(m={\mathcal {F}} K_{(\Delta )}/\varphi _\Delta \) under our conditions on \(\varphi _\Delta \) and \(K_{(\Delta )}\), where \(K_{(\Delta )}\) denotes a kernel as in (14) with bandwidth depending on \(\Delta \). It further includes situations not studied here, such as smoothed empirical processes based on \({{\mathrm{{\mathbb {P}}}}}_{\Delta , n}\) convolved with an approximate identity \(K_h= h^{-1} K(\cdot /h), h:=h_n \rightarrow 0, \int K=1,\) upon setting \(m={\mathcal {F}} K_h\).
With the definition of general \(m=m_{n,\Delta }\) at hand we can now unify the second term \(S(t)\) in (20) and the main stochastic error (23), and study the smoothed empirical process
the identity following from Fubini’s theorem and standard properties of Fourier transforms.
When \(t\) is a fixed point in \({{\mathrm{{\mathbb {R}}}}}\) and \(m=1\) one shows without difficulty that, as \(n \rightarrow \infty \),
whenever \(\nu (\{t\})=0\). More generally one can show convergence of the finite-dimensional distributions of the process \(({\mathbb {G}}_n(t): t \in {{\mathrm{{\mathbb {R}}}}})\) to the process \(({\mathbb {G}}(t): t \in {{\mathrm{{\mathbb {R}}}}})\) from Theorem 3.
Proposition 9
Let \(\Delta = \Delta _n \rightarrow 0\) in such a way that \(n \Delta _n \rightarrow \infty \). Suppose the Lévy process satisfies Assumption 1, that \(\rho \) satisfies Assumption 6, and that \(m\) satisfies Assumption 8. Then as \(n \rightarrow \infty \) we have, for any \(t_1, \ldots , t_k \in {{\mathrm{{\mathbb {R}}}}}\), that
We remark that in this proposition we can omit \((1\wedge x^4)\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) from Assumption 1 as it is only needed later in the proof of the tightness of the process \(\mathbb {G}_n\).
By sample-continuity of Brownian motion, and since the integral in (19) takes values in a fixed compact set, we deduce that there exists a version of \(({\mathbb {G}}(t):t\in {{\mathrm{{\mathbb {R}}}}})\) with uniformly continuous sample paths for the intrinsic covariance metric
and that, moreover, \({{\mathrm{{\mathbb {R}}}}}\) is totally bounded with respect to \(d\). As a consequence we obtain:
Lemma 10
Grant Assumption 6. For \(g_t\) as in (18) and any Lévy measure \(\nu \), the law of the centred Gaussian process \(\{\mathbb {G}(t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) with covariance
defines a tight Gaussian Borel random variable in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\). In particular, there exists a version of the process \(({\mathbb {G}}(t): t \in {{\mathrm{{\mathbb {R}}}}})\) such that
The most difficult part in the proofs of Theorems 3 and 5 is to show that \({\mathbb {G}}_n\) converges in law to \({\mathbb {G}}\) in the space \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) of bounded functions on the real line. Given that convergence of the finite-dimensional distributions and tightness of the limit process have already been established, this can be reduced to showing asymptotic equicontinuity of the process \(({\mathbb {G}}_n(t): t \in {{\mathrm{{\mathbb {R}}}}})\), or equivalently, uniform tightness of the random variables \({\mathbb {G}}_n\) in the Banach space \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) (see Section 1.5 in [33] and (39) below for precise definitions).
Theorem 11
Let \(\Delta = \Delta _n \rightarrow 0\) in such a way that \(n \Delta _n \rightarrow \infty \) and \(\log ^4(1/\Delta _n)=o(n\Delta _n)\). Suppose the Lévy process satisfies Assumption 1, that \(\rho \) satisfies Assumption 6, and that \(m\) satisfies Assumption 8. Then the process \({\mathbb {G}}_n\) from (25) is asymptotically equicontinuous in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\). In particular, \({\mathbb {G}}_n\) is uniformly tight in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\),
and
The proof is based on ideas from the theory of smoothed empirical processes (in particular from [16]). The main mathematical challenges consist in dealing with envelopes of the empirical process that can be as large as \(1/\sqrt{\Delta }\rightarrow \infty \) in the high-frequency setting, and in accommodating the presence of an \(n\)-dependent Fourier multiplier \(m\) that needs to be general enough to allow for \(m={\mathcal {F}} K_{h}/\varphi _\Delta \). The latter requires the treatment of empirical processes that cannot be controlled with the standard bracketing or uniform metric entropy techniques. Our proofs rely on direct arguments for symmetrised empirical processes inspired by [18] and on sharp bounds on certain covering numbers based on a suitable Fourier integral operator inequality for \({{\mathrm{{\mathcal {F}}}}}^{-1}[m]\) in \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-norms.
4 Discussion and examples
4.1 Regularity of \(x^2 \nu \) and the Blumenthal–Getoor index
The regularity index \(s>0\) in Assumption 1 measures the smoothness of the function \(g_t(-\cdot )*(x^2\nu )\). When \(x^2\nu \) is sufficiently regular away from the origin, this will equivalently measure the smoothness of the function \(t \mapsto \int _{-\infty }^t x^2\nu (\,\mathrm {d}x)\), and hence is effectively driven by the singularity that \(\nu \) possesses at zero. The latter can be quantitatively measured by the Blumenthal–Getoor index (see [5])
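In its standard form (the classical definition, recalled here for completeness) the index is given by

```latex
\beta \,:=\, \inf \Big\{ r \geqslant 0 \;:\; \int_{|x| \leqslant 1} |x|^{r}\, \nu(\mathrm{d}x) < \infty \Big\} \;\in\; [0,2].
```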
The Blumenthal–Getoor index \(\beta \) takes values in \([0,2]\) and we have \(\int _{{{\mathrm{{\mathbb {R}}}}}}|x|^\alpha \nu (\,\mathrm {d}x)=c<\infty \) for all \(\alpha \in (\beta ,2]\) (\(\alpha =2\) if \(\beta =2\)). In fact, for such \(\alpha \) and for all intervals \([a,b]\) containing the origin
Provided \(\nu \) is smooth away from zero this shows that the Hölder smoothness of \(\int _{-\infty }^t x^2\nu (\,\mathrm {d}x)\) is at least \(2-\beta ^+\), where \(\beta ^+>\beta \) and \(\beta ^+\geqslant 1\). For a singularity of the form \(\nu (x)=|x|^{-\beta -1}\), \(\beta \in (1,2)\), which corresponds to Blumenthal–Getoor index \(\beta \), we have \(\int _{|x|<t}x^2\nu (\,\mathrm {d}x)=2(2-\beta )^{-1}t^{2-\beta }\), showing that the Hölder smoothness is at most \((2-\beta )\). This argument can be extended to the case where the symmetrised Lévy density \({\widetilde{\nu }}\) is regularly varying: if \({\widetilde{\nu }}\) is regularly varying with exponent \(-(\beta +1)\) at zero, then \(\int _{|x|<t}|x|^2\nu (\,\mathrm {d}x)\) is regularly varying with exponent \((2-\beta )\) at zero by a Tauberian theorem (see e.g. [10], Thm. VIII.9.1). For Blumenthal–Getoor index \(\beta \in (1,2]\) this means that the Hölder regularity of \(\int _{-\infty }^t x^2\nu (\,\mathrm {d}x)\) is at most \((2-\beta )\).
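The inline computation for the model singularity can be displayed as a one-line check: for \(\nu (x)=|x|^{-\beta -1}\) and \(t>0\),

```latex
\int_{|x|<t} x^{2}\,\nu(x)\,\mathrm{d}x
\;=\; 2\int_{0}^{t} x^{1-\beta}\,\mathrm{d}x
\;=\; \frac{2}{2-\beta}\, t^{2-\beta},
```

and \(t \mapsto t^{2-\beta }\) is Hölder continuous at \(t=0\) of order exactly \(2-\beta \in (0,1)\) when \(\beta \in (1,2)\).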
4.2 The drift parameter \(\gamma \)
None of the above estimators \({\widetilde{N}}, {\widehat{N}}, {\widetilde{\mathcal {N}}}, {\widehat{\mathcal {N}}}\) require knowledge, or estimation, of the drift parameter \(\gamma \), which, at any rate, can be naturally estimated by \(L_{n\Delta }/(n\Delta )\). It is interesting to note that the ‘nonlinear’ estimator \({\widehat{N}}_n\) is even invariant under a change of the drift parameter \(\gamma \), as the following lemma shows.
Lemma 12
Let \(Y_k:=X_k-\Delta \gamma ,k=1,\ldots ,n,\) which are increments of a Lévy process with characteristic triplet \((\sigma ,0,\nu )\). Denoting the estimators (11) based on \((X_k)\) and \((Y_k)\) as \({\widehat{N}}_{X,n}\) and \({\widehat{N}}_{Y,n}\), respectively, we obtain
Proof
The drift causes a factor \(e^{-i\Delta \gamma u}\) in the empirical characteristic function \(\varphi _{\Delta ,n,Y}\) such that
\({\widehat{N}}_n\) depends on the observations only through \({\widehat{\psi }}_{n}''\).
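Spelled out (a sketch of the elementary computation), write \(\varphi _{\Delta ,n,X}\) and \(\varphi _{\Delta ,n,Y}\) for the empirical characteristic functions of the samples \((X_k)\) and \((Y_k)\), and use \({\widehat{\psi }}_{n}=\Delta ^{-1}\log \varphi _{\Delta ,n}\):

```latex
\log \varphi_{\Delta,n,Y}(u)
= \log\big( e^{-i\Delta\gamma u}\, \varphi_{\Delta,n,X}(u) \big)
= \log \varphi_{\Delta,n,X}(u) - i\Delta\gamma u,
\qquad\text{hence}\qquad
\widehat{\psi}_{n,Y}''(u) = \widehat{\psi}_{n,X}''(u),
```

since the linear term \(-i\Delta \gamma u\) vanishes upon taking two derivatives.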
Consequently, without loss of generality a specific value of \(\gamma \) can be assumed in the proofs for the estimator \({\widehat{N}}_n\) based on the Lévy–Khintchine representation. In particular, the conditions on \({{\mathrm{{\mathbb {P}}}}}_\Delta \) need to be verified only for one \(\gamma \).
4.3 A pilot estimate of the diffusion coefficient \(\sigma ^2\)
Proposition 13
Suppose the Lévy measure satisfies \(\int |x |^\alpha \nu (\,\mathrm {d}x)<\infty \) for some \(\alpha \in [0,2]\) and the characteristic function is bounded from below via
Let \({\widehat{\sigma }}^2\) be as in (12). Then we have, for \(c_0\) small enough, as \(n \rightarrow \infty \), and uniformly in \(\Delta \leqslant 1\),
The proof follows along the lines of [21] and is omitted. The previous discussion and the examples in Sect. 4.5 below show that the natural connection between smoothness \(s\) and Blumenthal–Getoor index \(\beta \) is given by \(s=2-\beta \). For such \(s\) and with the choice \(c_0=1/6\) the conditions of Theorem 5 ensure that (13) is satisfied provided the infimum in the definition of the Blumenthal–Getoor index is attained. Otherwise it suffices to replace the condition \(\Delta _n=o(n^{-1/(s+1)})\) by the slightly stronger condition \(\Delta _n=o(n^{-1/(s^-+1)})\) for some \(s^-<s\) in order to guarantee (13). Other estimators, based for instance on the truncated quadratic variations of the process, can be considered, and different sets of conditions are possible. As this is beyond the scope of the present paper, we refer to [21] for discussion and references.
4.4 Bounding \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \)
A key condition in all results above is a uniform bound on \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_{\Delta }\Vert _\infty \) of order \(\Delta \). The following proposition shows that this condition already follows from \(\Vert x{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim 1\) and \(x^3\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\). We recall that we always assume \(\int _{{{\mathrm{{\mathbb {R}}}}}}x^2\nu (\,\mathrm {d}x)<\infty \).
Proposition 14
For any Lévy process \((L_t:t\geqslant 0)\) with \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \lesssim 1\) and \(x^3\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) we have \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \) (with constants uniform in \(\Delta )\).
Proof
From \(\varphi _\Delta ''=(\Delta \psi ''+(\Delta \psi ')^2)\varphi _\Delta =\Delta \psi '' \varphi _\Delta +(\Delta \psi '\varphi _{ \Delta /2})^2\) by the infinite divisibility, we conclude
where \(\nu _\sigma =\sigma ^2\delta _0+x^2\nu \). Using \(x(P*Q)=(xP)*Q+P*(xQ)\), we infer further
By assumption and properties of Lévy processes, we have \(||x\nu _\sigma ||_\infty <\infty \), \({{\mathrm{{\mathbb {P}}}}}_\Delta ({{\mathrm{{\mathbb {R}}}}})=1\), \(\nu _\sigma ({{\mathrm{{\mathbb {R}}}}})<\infty \) and \(||x^2{{\mathrm{{\mathbb {P}}}}}_\Delta ||_{L^1}\lesssim \Delta \). This yields
\(\square \)
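The product rule for convolutions used in the proof, \(x(P*Q)=(xP)*Q+P*(xQ)\), follows for measures with densities \(p, q\) by splitting \(x=y+(x-y)\) under the integral:

```latex
x\,(p*q)(x)
= \int_{\mathbb{R}} \big( y + (x-y) \big)\, p(y)\, q(x-y)\,\mathrm{d}y
= \big( (x\,p)*q \big)(x) + \big( p*(x\,q) \big)(x).
```

The general case of finite signed measures follows by the same splitting.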
The condition \(\Vert x{{\mathrm{{\mathbb {P}}}}}_{\Delta _n}\Vert _\infty \lesssim 1\) is satisfied for all basic examples of Lévy processes such as Brownian motion, compound Poisson, Gamma and symmetric (tempered) \(\alpha \)-stable processes. For the latter processes it is interesting to compare the resulting bounds to the small time estimates in [29]. The bound \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \lesssim 1\) is not universal for arbitrary jump behaviour near zero, however: for a completely asymmetric (tempered) 1-stable process one has \({{\mathrm{{\mathbb {P}}}}}_\Delta (-\Delta \log (1/\Delta ))\thicksim \Delta ^{-1}\); see the exceptional case in Example 4.5 of [29].
If \(\int _{{{\mathrm{{\mathbb {R}}}}}}|x|\nu (\,\mathrm {d}x)<\infty \) we can define the drift parameter \(\gamma _0 := \gamma - \int x \nu (\,\mathrm {d}x)\).
Assumption 15
Let \((\sigma ^2,\gamma ,\nu )\) be a Lévy triplet and \(\nu ^+=\nu {1\!\!1}_{{{\mathrm{{\mathbb {R}}}}}^+}\), \(\nu ^-=\nu {1\!\!1}_{{{\mathrm{{\mathbb {R}}}}}^-}\). Consider the following conditions for the two triplets \((\sigma ^2,\gamma ,\nu ^\pm )\):
-
(i)
(diffusive case) \(\sigma >0\)
-
(ii)
(small intensity case) \(\sigma =0\), \(\gamma _0=0\), \(||x\nu ^\pm ||_\infty <\infty \)
-
(iii)
(finite variation case) \(\sigma =0\), \(\gamma _0=0\), \(x\nu ^\pm \) admits a Lebesgue density in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}}\setminus [-\varepsilon ,\varepsilon ])\) for all \(\varepsilon >0\), \(\varepsilon ^{-1}\int _{-\varepsilon }^\varepsilon |x |\nu ^\pm (\,\mathrm {d}x)\lesssim \varepsilon \nu (\pm \varepsilon )\) for \(\varepsilon \downarrow 0\) and
$$\begin{aligned} \liminf _{\varepsilon \downarrow 0}\inf _{t\in (0,1]} \frac{(t\varepsilon )^{-1}\int _{|x |\leqslant t\varepsilon } x^2\nu ^\pm (\,\mathrm {d}x)}{\varepsilon ^2\nu (\pm \varepsilon )t\log (t^{-1})}>0 \end{aligned}$$ -
(iv)
(infinite variation case) \(\sigma =0\), \(\nu ^{\pm }\) admits a Lebesgue density,
$$\begin{aligned} \frac{1}{\varepsilon }\int _{-\varepsilon }^\varepsilon x^2\nu ^\pm (\,\mathrm {d}x)\gtrsim \int _{|x |>\varepsilon }|x |\nu ^\pm (\,\mathrm {d}x)+1\quad \text { for } \varepsilon \in (0,1) \end{aligned}$$
Proposition 16
If each of the triplets \((\sigma ^2,\gamma ,\nu ^\pm )\) of the Lévy process satisfies one of the Assumptions 15(i)–(iv), then \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \lesssim 1\) holds uniformly in \(\Delta \).
4.5 Examples
Let us discuss the applicability of Proposition 16 together with the smoothness conditions on the jump measure from Theorem 5 in a few examples.
-
(i)
Diffusion plus compound Poisson process.
Let \(\nu \) be a finite measure on \({{\mathrm{{\mathbb {R}}}}}\) with a Lebesgue density. Suppose that we have \(\int _{{{\mathrm{{\mathbb {R}}}}}}|x|^{4+\varepsilon }\nu (\,\mathrm {d}x)<\infty \) for some \(\varepsilon >0\) and \(\Vert x^3\nu \Vert _\infty <\infty \). Proposition 16 yields \(\Vert x{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim 1\) if either \(\gamma _0=0\) and \(\nu (x)\lesssim |x|^{-1}\) as \(x\rightarrow 0\), or if \(\sigma >0\).
For \(x^2\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) the global Hölder regularity in Assumption 1(d) is \(s=1\), and for smooth compounding measure \(x^2\nu \in C^r({{\mathrm{{\mathbb {R}}}}})\) it is satisfied with \(s=r+1\).
-
(ii)
Self-decomposable Lévy process.
The jump measures of self-decomposable Lévy processes are characterised by \(\nu (\,\mathrm {d}x)=\frac{k(x)}{|x|}\,\mathrm {d}x\) for a function \(k:{{\mathrm{{\mathbb {R}}}}}\rightarrow {{\mathrm{{\mathbb {R}}}}}_+\) which is monotonically increasing on the negative half line and decreasing on the positive one. An explicit example is given by the Gamma process where \(k(x)=ce^{-\lambda x}1\!\!1_{{{\mathrm{{\mathbb {R}}}}}_+}(x)\) for \(c,\lambda >0\). Note that nontrivial self-decomposable processes have an infinite jump activity. If \(k\) is a bounded function, then Assumption 15(ii) is fulfilled. The smoothness is determined by the Hölder regularity of \(|x|k(x)\). For instance, Gamma processes induce regularity \(s=2\) at \(t=0\) and \(C^\infty \) away from the origin.
-
(iii)
Tempered stable Lévy process.
Let \(L\) be a tempered stable process, that is, a pure jump process with Lévy measure given by the Lebesgue density
$$\begin{aligned} \nu (x)=|x|^{-1-\alpha }\left( c_-e^{-\lambda _-|x|}1\!\!1_{(-\infty ,0)}(x) +c_+e^{-\lambda _+|x|}1\!\!1_{(0,\infty )}(x)\right) \end{aligned}$$with parameters \(c_\pm \geqslant 0,\lambda _\pm >0\) and stability index \(\alpha \in (0,2)\). By the exponential tails of \(\nu \) the moment assumptions are satisfied. For the finite variation case \(\alpha \in (0,1)\) Assumption 15(iii) can be verified since \(\varepsilon ^{-1}\int _{-\varepsilon }^\varepsilon |x|\nu ^\pm (x)\,\mathrm {d}x\sim \varepsilon ^{-\alpha }\sim \varepsilon \nu (\pm \varepsilon )\) and the second condition simplifies to \(t^{-\alpha }/\log (t^{-1})>0\). In the infinite variation case \(\alpha \in (1,2)\) Assumption 15(iv) is satisfied owing to
$$\begin{aligned} \varepsilon ^{-1}\int _{-\varepsilon }^\varepsilon x^2\nu ^\pm (x)\,\mathrm {d}x\sim \varepsilon ^{1-\alpha }\sim \int _{|x|>\varepsilon }|x|\nu ^\pm (x)\,\mathrm {d}x, ~~\varepsilon \in (0,1). \end{aligned}$$Outside of a neighbourhood of zero the Lévy measure is arbitrarily smooth. Due to the cusp of \(x^2\nu (x)\) at the origin the global Hölder regularity is in general given by \(s=2-\alpha \). In the case \(\alpha =1\) and \(c_+=c_-\), \(x^2\nu \) is already Lipschitz continuous at zero and so \(s=2\).
-
(iv)
Jump densities regularly varying at zero.
The first condition in Assumption 15(iii) holds for regularly varying \(\nu \) with \(\alpha <1\), that is \(\nu ^\pm (x)=|x|^{-1-\alpha }l(x)\) with slowly varying \(l\) at zero, by a classical Tauberian theorem (see e.g. [10], Thm. VIII.9.1). The second condition then reduces to
$$\begin{aligned} l(t\varepsilon )\geqslant C_\alpha t^\alpha \log (t^{-1})l(\varepsilon ),\quad C_\alpha >0, \end{aligned}$$uniformly over \(t\in (0,1]\) for small \(\varepsilon >0\), which is always satisfied for \(\alpha >0\). Similarly, Assumption 15(iv) is satisfied if \(\nu ^\pm (x)=|x|^{-1-\alpha }l(x)\) holds with \(\alpha \in (1,2)\) and a slowly varying function \(l\) at zero.
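The elementary asymptotics asserted for the tempered stable process in example (iii) can also be displayed explicitly; a sketch for the finite variation case \(\alpha \in (0,1)\), using \(\nu ^{+}(x)\sim c_{+}x^{-1-\alpha }\) as \(x\downarrow 0\):

```latex
\frac{1}{\varepsilon}\int_{0}^{\varepsilon} x\,\nu^{+}(x)\,\mathrm{d}x
\;\sim\; \frac{c_{+}}{\varepsilon}\int_{0}^{\varepsilon} x^{-\alpha}\,\mathrm{d}x
\;=\; \frac{c_{+}}{1-\alpha}\,\varepsilon^{-\alpha}
\;\sim\; \frac{1}{1-\alpha}\,\varepsilon\,\nu^{+}(\varepsilon),
```

so both quantities are of the order \(\varepsilon ^{-\alpha }\), as claimed; the computation for \(\nu ^{-}\) is identical.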
5 Proofs
We collect the proofs for Sects. 2, 3, 4. Theorem 11 is proved in the next section.
5.1 Proof of Lemma 2
The result is standard; for the convenience of the reader we include a short proof. Using the Lévy–Khintchine formula (8) we see
as \(\Delta \rightarrow 0\). The characteristic function of the probability measure \(c_\Delta ^{-1}x^2\Delta ^{-1}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\) converges pointwise to the characteristic function of \(c^{-1}(\sigma ^2 \delta _0+ x^2\nu (\,\mathrm {d}x))\) as \(\Delta \rightarrow 0\) since
Therefore, we obtain (5) from Lévy’s continuity theorem.
5.2 Proof of Theorem 3
Using decomposition (20), Theorem 3 follows from Theorem 11 with \(m=1\) [which trivially satisfies Assumption 8(a)], if we can show that the ‘bias’ term \(B(t)\) is asymptotically negligible uniformly in \(t \in {{\mathrm{{\mathbb {R}}}}}\). This is achieved in the following proposition.
Proposition 17
Grant the assumptions of Theorem 3. Then it holds
Proof
We decompose the bias into
We start with the first term \(B_1(t)\): Using \(-({{\mathrm{{\mathcal {F}}}}}f)''={{\mathrm{{\mathcal {F}}}}}[x^{2}f]\) for any function \(f\) satisfying \((1\vee x^{2})f\in L^{1}({{\mathrm{{\mathbb {R}}}}})\), we have
Plancherel’s identity and \(\psi ''=-{{\mathrm{{\mathcal {F}}}}}[x^2\nu ]\) then give
The proofs below will imply that the last integral exists, which in particular justifies the preceding manipulations. We shall repeatedly use that \(\sup _{t}\Vert g_{t}\Vert _{L^{1}}\leqslant \Vert \rho \Vert _{L^{1}}\) and \(\sup _{t}\Vert g_{t}\Vert _{BV}\leqslant \Vert \rho \Vert _{BV}\) imply
uniformly in \(t \in {{\mathrm{{\mathbb {R}}}}}\). In case (a) we can use (31), \(\Vert \varphi _\Delta \Vert _\infty =1\), \(|{{\mathrm{{\mathcal {F}}}}}[x\nu ](u)| \lesssim (1+|u|)^{-1}\), the hypothesis \(\gamma _0=0\) and the resulting identity
to bound
For case (b) we will show that
By assumption \({\widetilde{\nu }}(x)=\nu ^+(x)+\nu ^-(-x)\) is regularly varying at zero with exponent \(-(\beta +1)\) and so the function \(H(r):=\int _{|x|<r}x^2\nu (x)\,\mathrm {d}x=\int _0^r x^2 {\widetilde{\nu }}(x)\,\mathrm {d}x\) is regularly varying with exponent \((2-\beta )\) by a Tauberian theorem [10, Thm. VIII.9.1]. In particular, we can bound \(H(r)\) from below: for any \(\beta ^-\in (0,\beta )\) there exists \(r_0>0\) such that \(H(r)\gtrsim r^{2-\beta ^-}\) for all \(r\in (0,r_0)\). By [28] there is a constant \(c>0\) such that
for \(|u|\) sufficiently large. On the other hand, it is easily seen that
for any \(\beta ^{+}\in (\beta ,2)\) and that \(|\psi ''(u)|\) is bounded. In particular, we have \(\varphi _{\Delta },\varphi _{\Delta }''\in L^{2}({{\mathrm{{\mathbb {R}}}}})\). Collecting the above and using (31) implies
Let us distinguish the cases \(\beta \geqslant 1\) and \(\beta <1\), which will yield together (32). We will be using the bounds for \(\varphi _\Delta \) and \(\psi '\) in (33) and (34), respectively.
-
(i)
For \(\beta \geqslant 1\) substituting \(u=\Delta ^{-1/\beta ^{-}}z\) yields
$$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}|B_{1}(t)|&\lesssim \Delta \int (1+|u|)^{(2\beta ^{+}-3)}\exp (-c\Delta |u|^{\beta ^{-}})\,\,\mathrm {d}u\\&\lesssim \Delta ^{(\beta ^{-}-2\beta ^{+}+2)/\beta ^{-}}\int ((1+|z|)^{2\beta ^{+}-3}\vee |z|^{ 2\beta ^{+}-3})\exp (-c|z|^{\beta ^{-}})\,\,\mathrm {d}z, \end{aligned}$$where the integral in the last display is finite owing to \(\beta ^{+}>1\). Noting that \(2\beta ^{+}-\beta ^{-}>\beta \), we conclude that \(|B_{1}|\lesssim \Delta ^{p}\) for any \(p<(2-\beta )/\beta \).
-
(ii)
For \(0<\beta <1\) the boundedness of \(|\psi '|\) and the same substitution yield for any \(\delta >0\)
$$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}|B_{1}(t)|&\lesssim \Delta \int (1+|u|)^{-1}\exp (-c\Delta |u|^{\beta ^{-}})\,\,\mathrm {d}u\\&\leqslant \Delta ^{1-1/\beta ^{-}}\int (1+\Delta ^{-1/\beta ^{-}}|z|)^{-1+\delta }\exp (-c|z|^{ \beta ^{-}})\,\,\mathrm {d}z\\&\leqslant \Delta ^{1-\delta /\beta ^{-}}\int |z|^{-1+\delta }\exp (-c|z|^{\beta ^{-}})\,\,\mathrm {d}z. \end{aligned}$$By choosing \(\delta \) sufficiently small, we obtain \(|B_{1}|\lesssim \Delta ^{p}\) for any \(p<1\).
Let us now consider \(B_{2}\) in (30) which we can write as
For the sake of brevity we define \(h_{t}(y):=(g_{t}(-{\scriptstyle \bullet } )*(x^{2}\nu ))(y)\). We decompose the integration domain into the neighbourhood of the origin \((-U,U)\) and the tails \(\{y:|y|\geqslant U\}\). For small \(y\) the uniform Hölder regularity of \(h_{t}(y)\), for \(|y|<U\), as well as \({{\mathrm{{\mathbb {E}}}}}[|X_{1}|^{2}]\lesssim \Delta \) and Jensen’s inequality yield for \(s\leqslant 1\)
and for \(s>1\) with \(x_y\in [-y,0]\) an intermediate point from the mean value theorem
For the tails we conclude from \(\sup _{t}\Vert h_{t}\Vert _{\infty }\leqslant \Vert \rho \Vert _{\infty }\int x^{2}\nu (\,\mathrm {d}x)\) and Markov’s inequality
The previous two estimates finally yield \(\sup _{t}|B_{2}(t)|\lesssim \Delta ^{s/2}\). \(\square \)
5.3 Proof of Theorem 4
We only prove the case \(V=(-\infty , -\zeta ]\), the general case follows from symmetry arguments that are left to the reader. We use decomposition (20) and apply Theorem 11—with \(m=1\) and \(\rho \) suitably chosen such that \(\rho (x)=x^{-2}\) for all \(x \in (-\infty , -\zeta ]\) —to the stochastic term \(S(t)\). For our choice of \(\Delta _n\) the bias term \(B(t)\) is negligible in the asymptotic distribution in view of Proposition 2.1 in [13] (which holds also for unbounded \(V\) separated away from the origin, as inspection of that proof shows).
5.4 Proof of Theorem 5
For Theorem 5(ii) we choose a suitable \(\rho \) such that \(\rho (x)=x^{-2}\) on \(V\) and we restrict to the case \((-\infty ,-\zeta ]\) since the proof can be easily extended to cover the general case by symmetry arguments. We use the decomposition (22). The third term is negligible in view of (13). Recalling
the following result shows that the deterministic approximation error is negligible in the asymptotic distribution of \(\sqrt{n\Delta _n}({\widehat{N}}_n-N_\rho )\) whenever \(h^{s}=o(1/\sqrt{n\Delta _n})\), valid for our choice of \(\Delta _n\).
Proposition 18
Suppose \(x^2\nu \) is a finite measure satisfying Assumption 1(d). If the kernel satisfies (14) with order \(p\geqslant s\), then
with constants independent of \(t\).
Proof
Using Fubini’s theorem,
The result now follows from Assumption 1(d) and a standard Taylor expansion argument using the order \(p\) of the kernel. \(\square \)
The second, stochastic, term in (22) can be reduced to the linear term from Proposition 7, which we now prove.
Proof of Proposition 7
To linearise \(\psi ''-{\widehat{\psi }}_n''=-\Delta ^{-1}\log (\varphi _{\Delta ,n}/\varphi _\Delta )''\), we set \(F(y)=\log (1+y)\), \(\eta =(\varphi _{\Delta ,n}-\varphi _\Delta )/\varphi _\Delta \), and use
On the event \(\Omega _n:=\{\sup _{|u|\leqslant 1/h}|(\varphi _{\Delta ,n}-\varphi _\Delta )(u)/\varphi _\Delta (u) |\leqslant 1/2\}\) we thus obtain

To estimate \(||\eta ^{(k)} ||_{\ell ^\infty [-h^{-1},h^{-1}]},k=0,1,2\), we note \(|\psi '(u) |\lesssim 1+|u |\), \(|\psi ''(u) |\lesssim 1\) and \(h\gtrsim \Delta ^{1/2}\)
Moreover, from Theorem 1 of [24] we know that under our moment assumption on \(\nu \) (for \(k=0,1,2\) and any \(\delta >0\))
This yields for \(k=0,1,2\)
In combination with \(n(\log h^{-1})^{-1-\delta }\gtrsim n\Delta ^{3(1+\delta )/4}\rightarrow \infty \) for \(\delta \in (0,1/3)\) and \(|1/\varphi _\Delta | \lesssim 1\) on \([-1/h_n, 1/h_n]\) the bound (36) shows also \({{\mathrm{{\mathbb {P}}}}}(\Omega _n)\rightarrow 1\) and then
We decompose the linearised stochastic error into
By the previous estimates we have
Inserting the asymptotics in \(h\), we conclude
By the Plancherel formula and the Cauchy–Schwarz inequality we have
\(\square \)
Finally, to the main stochastic term
we apply Theorem 11. The proof of Theorem 5 is thus complete upon verification of Assumption 8 for the present choice of \(m\). This is achieved in the following proposition.
Proposition 19
Assume that \(K\) satisfies (14) for \(p \geqslant 2\) and that \(\nu \) satisfies \(\int _{{{\mathrm{{\mathbb {R}}}}}}|x|^3\nu (\,\mathrm {d}x)<\infty \). Let \(h=h_n\rightarrow 0\) and \(\Delta =\Delta _n\rightarrow 0\) as \(n\rightarrow \infty \) with \(h^3=o(\Delta )\), \(h^{-1} = O(\Delta ^{-1/2})\). Then \(m_{n,\Delta }(u):={\mathcal {F}} K_h(u)/\varphi _\Delta (u)\), \(u \in {{\mathrm{{\mathbb {R}}}}}\), satisfies Assumption 8.
Proof
We have \(m(-u)=\overline{m(u)}\) so that \({{\mathrm{{\mathcal {F}}}}}^{-1}m\) is real-valued. By the compact support of \({\mathcal {F}} K\) and the assumption on \(h^{-1}\) the support assumption on \(m\) is satisfied. Since \(\varphi _{\Delta } = e^{\Delta \psi }\), we have \(m_{n, \Delta } \rightarrow 1\) pointwise as \(\Delta \rightarrow 0\), \(h\rightarrow 0\). Moreover, by (9) we have \(|\psi ''(u)|\lesssim 1\), hence for \(|u| \leqslant C\Delta ^{-1/2}\) we have
uniformly in \(\Delta \), and thus \(m \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\), using also \(\sup _h\Vert {\mathcal {F}} K_h\Vert _\infty \leqslant \Vert K\Vert _{L^1}\). Next
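Differentiating \(m={\mathcal {F}}K_h/\varphi _\Delta \), and using \(\varphi _\Delta '=\Delta \psi '\varphi _\Delta \) together with \({\mathcal {F}}K_h(u)={\mathcal {F}}K(hu)\) (a sketch, with the convention \({\mathcal {F}}f(u)=\int e^{iux}f(x)\,\mathrm {d}x\)):

```latex
m'(u)
= \frac{h\,({\mathcal F}K)'(hu)}{\varphi_\Delta(u)} \;-\; \Delta\,\psi'(u)\, m(u),
\qquad ({\mathcal F}K)' = {\mathcal F}[\, i x K \,],
```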
so that using \(xK \in L^1, |\psi '(u)| \lesssim 1+|u|, |u| \leqslant h^{-1}=O(\Delta ^{-1/2})\) and the bound for \(m\) above we see
Using \(|\psi ''(u)|\lesssim 1\) we further obtain
On the support of \(m\) we have \(|u|\leqslant h^{-1}\) so that \(\Vert (1+|u|)^{k}m^{(k)}\Vert _\infty \leqslant c\), \(k\in \{0,1,2\}\), follows. Likewise by the support of \(m\) we have \(\Vert m'\Vert _{L^2}\lesssim h^{1/2}\rightarrow 0\) and \(\Delta ^{-1/2}\Vert m''\Vert _{L^2}\lesssim \Delta ^{-1/2}h^{3/2}\rightarrow 0\). \(\square \)
5.5 Convergence of finite-dimensional distributions
We next turn to the proof of Proposition 9.
Definition 20
A function \(g\) is called admissible if it is of bounded variation and satisfies for all \(x, u \in {{\mathrm{{\mathbb {R}}}}}\),
Note that the bound on \({\mathcal {F}} g\) follows from the bounded variation of \(g\), and that \(x^2 g^2\) is of bounded variation whenever \(g\) is admissible.
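A sketch of the first assertion: if additionally \(g\in L^1({{\mathrm{{\mathbb {R}}}}})\) vanishes at \(\pm \infty \), integration by parts against \(\mathrm {d}g\) gives

```latex
|u|\,|{\mathcal F}g(u)|
= \Big| \int_{\mathbb{R}} e^{iux}\,\mathrm{d}g(x) \Big|
\;\leqslant\; \Vert g \Vert_{TV},
\qquad\text{and}\qquad
|{\mathcal F}g(u)| \leqslant \Vert g \Vert_{L^1},
```

so that \(|{\mathcal {F}}g(u)|\lesssim (1+|u|)^{-1}\) with a constant controlled by \(\Vert g\Vert _{TV}\vee \Vert g\Vert _{L^1}\).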
Proposition 21
Let \(g\) be admissible and suppose the conditions of Proposition 9 are satisfied. Then
with variance \(\sigma _g^2=\int _{{{\mathrm{{\mathbb {R}}}}}} x^4g(x)^2\nu (\,\mathrm {d}x)\).
The functions \(g_t=\rho 1\!\!1_{(-\infty ,t]}\) are uniformly bounded in bounded-variation norm and are admissible with constants independent of \(t\in {{\mathrm{{\mathbb {R}}}}}\). The convergence of the finite-dimensional distributions in Proposition 9 hence follows from the Cramér–Wold device since linear combinations of the functions \(g_{t_1},\ldots ,g_{t_k}\) for \(t_1,\ldots ,t_k\in {{\mathrm{{\mathbb {R}}}}}\) are admissible.
For the proof of Proposition 21 we will use the following lemma, whose assumptions are in particular fulfilled for \(m_{n,\Delta }\) satisfying Assumption 8 and for classes of functions with uniform constants in the admissibility definition.
Lemma 22
Let \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \). For \(\Delta \rightarrow 0\) as \(n\rightarrow \infty \) let \(\Vert m_{n,\Delta }\Vert _\infty \) and \(\Vert m_{n,\Delta }'\Vert _\infty \) be uniformly bounded and \(m_{n,\Delta }\rightarrow 1\) pointwise. If \({{\mathrm{{\mathcal {G}}}}}\) is a class of functions such that for all \(u\in {{\mathrm{{\mathbb {R}}}}}\)
then
Proof
We rewrite the term with \(m=m_{n,\Delta }\) as
Using \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \), the term (37) can be estimated by the Cauchy–Schwarz inequality and Plancherel’s identity yielding the bound
The first factor converges to zero by the dominated convergence theorem because \(m\) is uniformly bounded and converges pointwise to one while \(|\mathcal {F}g(u)|\leqslant C(1+|u|)^{-1}\) for all \(g\in {{\mathrm{{\mathcal {G}}}}}\). For the second factor we estimate, using that \(g\) and \(xg\) are uniformly bounded in \(L^2({{\mathrm{{\mathbb {R}}}}})\) and that \(\Vert m\Vert _\infty \) and \(\Vert m'\Vert _\infty \) are uniformly bounded,
which completes the proof of the lemma.\(\square \)
Proof of Proposition 21
We define
We will prove the proposition for general Fourier multipliers satisfying Assumption 8(b), the case where \({{\mathrm{{\mathcal {F}}}}}^{-1} m\) is a finite signed measure is similar (in fact easier) and is omitted. We will verify the conditions of Lyapunov’s central limit theorem, see, e.g., [1, Theorems 28.3 and (28.8)].
Step 1: We will show that \(\lim _{n\rightarrow \infty }{{\mathrm{\hbox {Var}}}}(Y_{n,k})=\int _{{{\mathrm{{\mathbb {R}}}}}} x^4 g(x)^2\nu (\,\mathrm {d}x)\), noting that \(Y_{n,k}\) are real valued. We estimate
where we have used that \({{\mathrm{{\mathbb {E}}}}}[X_1^2]= O(\Delta )\). Consequently, \(\lim _{n\rightarrow \infty }{{\mathrm{\hbox {Var}}}}(Y_{n,k})=\lim _{n\rightarrow \infty }{{\mathrm{{\mathbb {E}}}}}[Y_{n,k}^2]\), which we decompose in the following way:
The last term is the claimed limit. The first limit is zero by Lemma 22. For the second limit we deduce by Lemma 2 that \((x^2 \wedge x^4) {{\mathrm{{\mathbb {P}}}}}_\Delta /\Delta \) converges weakly to the absolutely continuous measure \((x^2 \wedge x^4) \nu \), and thus in particular by the Portmanteau lemma when integrating against the function \((x^2 \vee 1) g(x)^2\), which is of bounded variation. This implies convergence to zero of the second term. This shows \(\lim _{n\rightarrow \infty }{{\mathrm{\hbox {Var}}}}(Y_{n,k})=\int x^4 g(x)^2\nu (\,\mathrm {d}x)\).
Step 2: We verify Lyapunov’s moment condition: For some \(\varepsilon \in (0,1)\) and \(S_n=\sum _{k=1}^nY_{n,k}\)
From the previous step we know \(n^{-1}{{\mathrm{\hbox {Var}}}}(S_n)={{\mathrm{\hbox {Var}}}}(Y_{n,k})\rightarrow \sigma _g^2\) as \(n\rightarrow \infty \). Moreover, by \(|x|^{4+2\varepsilon }\lesssim |1+ix|^{2+\varepsilon }|x|^3\), \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \) and the Hausdorff–Young inequality (e.g., 8.30 on p. 253 in [15])
By Assumption 8, \(m\) and \(m'\) are uniformly bounded, \(\Vert {\mathcal {F}} g\Vert _{L^{(2+\varepsilon )/(1+\varepsilon )}}\) is bounded since \(|{{{\mathrm{{\mathcal {F}}}}}}g(u)|\lesssim (1+|u|)^{-1}\), and
which are finite by \(xg\in L^2({{\mathrm{{\mathbb {R}}}}})\) and by \(u{{\mathrm{{\mathcal {F}}}}}[xg](u)\in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\), respectively. Consequently, \({{\mathrm{{\mathbb {E}}}}}[|Y_{n,k}|^{ 2+\varepsilon }]\lesssim \Delta ^{-\varepsilon /2}\), implying
\(\square \)
5.6 Proof of Proposition 16
For (ii) and (iii) we have \(\int |x| \nu (\,\mathrm {d}x)<\infty \) and will use that the function \(\psi \) in the exponent of the Lévy–Khintchine formula (8) may be written as
For (iii) note \(x{{\mathrm{{\mathbb {P}}}}}_\Delta =(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^+)*{{\mathrm{{\mathbb {P}}}}}_\Delta ^- +(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^-)*{{\mathrm{{\mathbb {P}}}}}_\Delta ^+\) with the corresponding laws for \(\nu ^+,\nu ^-\). It thus suffices to prove \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ^+ ||_\infty + ||x{{\mathrm{{\mathbb {P}}}}}_\Delta ^- ||_\infty \lesssim 1\) and without loss of generality we only consider \({{\mathrm{{\mathbb {P}}}}}_\Delta ^+\) in the proof of case (iii). For (iv) we use the same decomposition but this time the law \({{\mathrm{{\mathbb {P}}}}}_\Delta ^+\) corresponds to the Lévy triplet \((0,\gamma ,\nu ^+)\) so that it also incorporates the drift.
-
(i)
If \(\sigma >0\) holds, then \(|\psi '(u) |\lesssim 1+|u |\) implies
$$\begin{aligned} ||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \leqslant ||\varphi _\Delta ' ||_{L^1}\lesssim \int \Delta (1+|u |)e^{-\Delta \sigma ^2u^2/2}\,\mathrm {d}u\lesssim 1. \end{aligned}$$ -
(ii)
Under the assumptions the measure \(x\nu \) is finite, yielding the identity \(x{{\mathrm{{\mathbb {P}}}}}_\Delta =\Delta (x\nu )*{{\mathrm{{\mathbb {P}}}}}_\Delta \), which implies that even
$$\begin{aligned} ||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \leqslant \Delta ||x\nu ||_\infty . \end{aligned}$$ -
(iii)
Without loss of generality we suppose \(\Vert x\nu ^\pm \Vert _\infty =\infty \). Denote the limit inferior in condition (iii) by \(\delta >0\) and define
$$\begin{aligned} a_\Delta :=\inf \left\{ a>0:\sup _{x>a}\Delta x\nu (x)\leqslant \frac{4}{\delta }\right\} , \end{aligned}$$where \(a_\Delta >0\) follows from \(\lim _{a\rightarrow 0}\sup _{x>a}x\nu (x)=\Vert x\nu ^+\Vert _\infty =\infty \). Since \(\Vert x\nu \Vert _{\ell ^\infty ({{\mathrm{{\mathbb {R}}}}}\setminus [-\varepsilon ,\varepsilon ])}\) is bounded for any \(\varepsilon >0\) we deduce that \(a_\Delta \downarrow 0\) as \(\Delta \rightarrow 0\).
Let us introduce \(\nu _\Delta ^s:=\nu {1\!\!1}_{[0,a_\Delta ]}\) and \(\nu _\Delta ^c:=\nu ^+-\nu _\Delta ^s\). By \(\Vert x\nu _\Delta ^c\Vert _\infty \leqslant \frac{4}{\Delta \delta } \) and the argument in (ii), applied to \(\nu _\Delta ^c\), the corresponding law \({{\mathrm{{\mathbb {P}}}}}_\Delta ^c\) satisfies \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ^c ||_\infty \lesssim 1\). Because of
$$\begin{aligned} x{{\mathrm{{\mathbb {P}}}}}_\Delta =(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^c)*{{\mathrm{{\mathbb {P}}}}}_\Delta ^s+(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^s) *{{\mathrm{{\mathbb {P}}}}}_\Delta ^c =(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^c)*{{\mathrm{{\mathbb {P}}}}}_\Delta ^s+(\Delta x\nu _\Delta ^s)*{{\mathrm{{\mathbb {P}}}}}_\Delta \end{aligned}$$we shall bound \(||\Delta x\nu _\Delta ^s ||_{L^1}\) and \(||{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \). From the assumptions we infer \(||\Delta x\nu _\Delta ^s ||_{L^1}\lesssim a_{\Delta }\) via
$$\begin{aligned} \int _0^{a_\Delta } \Delta x\nu (\,\mathrm {d}x) =\lim _{a\downarrow a_\Delta }\int _0^a\Delta x\nu (\,\mathrm {d}x) \lesssim \limsup _{a\downarrow a_\Delta }a\cdot \Delta a\nu (a)\leqslant \frac{4a_\Delta }{\delta }. \end{aligned}$$On the other hand, by construction there is some \(a_\Delta ^-\in [\frac{1}{2}a_\Delta ,a_\Delta ]\) such that \(\Delta a_\Delta ^-\nu (a_\Delta ^-)\geqslant 4/\delta \). Together with the assumptions, and \(\Vert {{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \leqslant \Vert \varphi _\Delta \Vert _{L^1}\), we see that for \(\varepsilon :=a_\Delta ^-\) sufficiently small, that is for \(\Delta \) small, and for some \(\kappa \in (2,4)\)
$$\begin{aligned} a_\Delta ||{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty&\leqslant 2a_\Delta ^-\int _{-\infty }^\infty e^{-\frac{\Delta }{\kappa } u^2\int _0^{1/|u |} x^2\nu (\,\mathrm {d}x)}\,\mathrm {d}u\\&=2\int _{-\infty }^\infty e^{-\frac{\Delta }{\kappa } (v/a_\Delta ^-)^{2}\int _0^{a_\Delta ^-/|v |} x^2\nu (\,\mathrm {d}x)}\,\mathrm {d}v\\&\leqslant 4+2\int _{|v |>1} e^{-\frac{\delta }{\kappa }\Delta a_\Delta ^-\nu (a_\Delta ^-)\log (|v |)}\,\mathrm {d}v\\&\leqslant 4+4\int _{1}^\infty v^{-4/\kappa }\,\mathrm {d}v\sim 1, \end{aligned}$$which together with the bound on \(||\Delta x\nu _\Delta ^s ||_{L^1}\) yields the result.
(iv)
By Theorem 27.7 in [31] \({{\mathrm{{\mathbb {P}}}}}_\Delta \) admits a Lebesgue density; hence by Fourier inversion \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \leqslant ||\varphi _\Delta ' ||_{L^1}\) and, by the hypothesis on \(\nu ^+\), we estimate for some \(\kappa >0\) and for some small \(c>0\)
$$\begin{aligned} ||x{{\mathrm{{\mathbb {P}}}}}_\Delta ^+ ||_\infty&\leqslant \int _{-\infty }^\infty \Delta {\Bigl |\int _0^\infty (e^{iux}-1)x\nu (\,\mathrm {d}x)+\gamma \Bigr |}e^{-\Delta \int _0^\infty (1-\cos (ux))\nu (\,\mathrm {d}x)}\,\mathrm {d}u\\&\lesssim \int _{-\infty }^\infty \Delta \Big (1+\int _0^\infty (|u |x^2\wedge x)\nu (\,\mathrm {d}x)\Big ) e^{-\frac{\Delta }{\kappa } u^2\int _0^{1/|u |}x^2\nu (\,\mathrm {d}x)}\,\mathrm {d}u\\&\leqslant \int _{-\infty }^\infty \Delta \Big (1+\int _0^\infty (|u |x^2\wedge x)\nu (\,\mathrm {d}x)\Big ) e^{-c\Delta \int _0^\infty (\tfrac{u^2}{2}x^2\wedge |u |x)\nu (\,\mathrm {d}x)}\,\mathrm {d}u. \end{aligned}$$The derivative of the exponent is given by \(-c\Delta {{\mathrm{sgn}}}(u)\int _0^\infty (|u |x^2\wedge x)\nu (\,\mathrm {d}x)\) such that the last line of the display is bounded by
$$\begin{aligned} \int _{-\infty }^\infty \Delta e^{-c\Delta \int _0^\infty (\tfrac{u^2}{2} x^2\wedge |u |x)\nu (\,\mathrm {d}x)}\,\mathrm {d}u +2/c. \end{aligned}$$From \(|u |\int _0^{1/|u |}x^2\nu (\,\mathrm {d}x)\gtrsim 1\) we infer that the integral is at most of order \(\int \Delta e^{-\Delta |u |}\,\mathrm {d}u\thicksim 1\) and the result follows.
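The bounds of cases (i)–(iv) can be sanity-checked numerically in a concrete model. The following sketch (illustrative choices throughout) uses the Gamma subordinator, whose increment law \({{\mathrm{{\mathbb {P}}}}}_\Delta \) is Gamma\((\Delta ,1)\) and whose Lévy measure \(\nu (\,\mathrm {d}x)=x^{-1}e^{-x}\,\mathrm {d}x\) satisfies \(\Vert x\nu \Vert _\infty =1\), so that case (ii) predicts \(\Vert x{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \leqslant \Delta \):

```python
import numpy as np
from scipy.stats import gamma

# Gamma subordinator: the increment law P_Delta is Gamma(shape=Delta, scale=1),
# the Levy measure is nu(dx) = x^{-1} e^{-x} dx, hence ||x nu||_inf = 1 and
# case (ii) gives the bound ||x P_Delta||_inf <= Delta * ||x nu||_inf = Delta.
x = np.linspace(1e-6, 50.0, 200_000)
for Delta in (0.5, 0.1, 0.01):
    # grid approximation (from below) of ||x P_Delta||_inf
    sup_val = np.max(x * gamma.pdf(x, a=Delta))
    assert sup_val <= Delta  # the case (ii) bound holds in this model
```

The supremum is attained near \(x=\Delta \) and behaves like \(\Delta ^\Delta e^{-\Delta }/\Gamma (\Delta )\), which is indeed of order \(\Delta \) as \(\Delta \rightarrow 0\).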
6 Proof of Theorem 11
We recall \(g_t(x)=\rho (x)1\!\!1_{(-\infty ,t]}(x)\) and hence
By Proposition 9 and Theorem 1.5.7 in [33] it suffices to show that there is a semimetric \(d\) such that \(({{\mathrm{{\mathbb {R}}}}},d)\) is totally bounded and for every \(\gamma >0\) we have
We note that \({\mathbb {G}}_n\) forms a triangular array of empirical processes \(\sqrt{n} ({{\mathrm{{\mathbb {P}}}}}_{\Delta , n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta })\) indexed by the class
6.1 Equicontinuity and a change of metric
For \(t\leqslant 0\) we decompose \(\widetilde{g}_t\) into the three terms
Heuristically speaking the main difficulties arise from the fact that \(1\!\!1_{(-\infty , t]}\) is nonintegrable on \({{\mathrm{{\mathbb {R}}}}}\) and discontinuous at \(t\). The above decomposition separates the jump-discontinuity from the non-integrable part, and the third term collects the remainder without discontinuity or integrability issues. We refer to the second term as the ‘critical term’ since it is not regular enough to be treated by the usual metric entropy techniques.
For \(t>0\) we replace \(e^{y-t}\rho (t)1\!\!1_{(-\infty ,t]}(y)\) by \(-e^{t-y}\rho (t)1\!\!1_{(t, \infty )}(y)\), and the proof below proceeds with only notational changes. We thus restrict to \(t \in (-\infty ,0]\).
By the triangle inequality it suffices to show asymptotic equicontinuity for the empirical processes indexed by the three terms in the above decomposition separately, with appropriate metrics \(d^{(i)}\), \(i=1,2,3\); then (39) holds with the overall metric \(d=\max _{i=1,2,3} d^{(i)}\). In view of the variance structure of the limiting process \({\mathbb {G}}\) it is natural to choose the semimetrics
where
and we note \(x^2g_t=g_t^{(1)}+g_t^{(2)}+g_t^{(3)}\). On the other hand the covariance metric compatible with the distribution \({{\mathrm{{\mathbb {P}}}}}_\Delta \) of the \(X_k\)’s driving the empirical process is given by the \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-distance. In the following we will show that a \(\delta \)-increment for the limiting metric \(d^{(i)}\) corresponds, for \(n\) large enough, to a \(\delta \)-increment in the \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-metric on the functions \({\widetilde{g}}_t^{(i)}\). Verifying asymptotic equicontinuity for the whole process then reduces to showing total boundedness of each subclass and that, for each \(i=1, 2, 3\), and every \(\gamma >0\),
where \(\Vert f\Vert _{2,P}:=(\int |f|^2 \,\mathrm {d}P)^{1/2}\). This will permit the application of powerful tools from empirical process theory to control the last probabilities. Before we do this, we demonstrate the reduction to (46) for all three terms in the above decomposition separately. We note that total boundedness of the classes \({\mathcal {G}}^{(i)} = \{g_t^{(i)}: t \in {{\mathrm{{\mathbb {R}}}}}\}\) for the \(d^{(i)}\)-metric follows from entropy computations given in the following subsections.
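Concretely, the triangle-inequality reduction of the preceding paragraphs reads (a restatement of the argument in the notation above, using the linearity \({\widetilde{g}}_t={\widetilde{g}}_t^{(1)}+{\widetilde{g}}_t^{(2)}+{\widetilde{g}}_t^{(3)}\)):
$$\begin{aligned} \bigl |{\mathbb {G}}_n({\widetilde{g}}_s)-{\mathbb {G}}_n({\widetilde{g}}_t)\bigr |\leqslant \sum _{i=1}^{3}\bigl |{\mathbb {G}}_n({\widetilde{g}}_s^{(i)})-{\mathbb {G}}_n({\widetilde{g}}_t^{(i)})\bigr |,\qquad d(s,t)=\max _{i=1,2,3} d^{(i)}(s,t), \end{aligned}$$so that asymptotic equicontinuity of each subprocess with respect to its own semimetric \(d^{(i)}\) yields asymptotic equicontinuity of the full process with respect to \(d\).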
Starting with \(\{{\widetilde{g}}_t^{(1)}: t \leqslant 0\}\), we note that the functions
are uniformly bounded and uniformly Lipschitz continuous. In order to compare \(d^{(1)}\) to the \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-norm on \(\{{\widetilde{g}}_t^{(1)}:t\leqslant 0\}\), we claim
as \(n\rightarrow \infty \). Any class of functions that is uniformly bounded and uniformly Lipschitz continuous is a uniformity class for weak convergence; this follows either from Theorem 1 in [4] or from the well-known fact that the bounded-Lipschitz metric metrises weak convergence. So, the weak convergence in Lemma 2 yields
as \(n\rightarrow \infty \). Next using \(0<\rho (x)\leqslant C (1\wedge x^{-2})\) and the bounded variation of \(\rho \), we see that \({{\mathrm{{\mathcal {G}}}}}:=\{x^{-2}(g_s^{(1)}(x)-g_t^{(1)}(x)):s,t\leqslant 0\}\) satisfies the assumption of Lemma 22 and hence
as \(n\rightarrow \infty \). We conclude that (47) and then also the reduction to (46) holds for \(\{{\widetilde{g}}_t^{(1)}: t \leqslant 0\}\).
A similar reduction for \({\widetilde{g}}_t^{(2)}\) defined in (41) is achieved as follows. As in (47) we claim that
converges to zero as \(n\rightarrow \infty \). To see this we observe that by Lemma 2 the measures \((1\wedge x^4)\Delta ^{-1}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\) converge weakly to \((1\wedge x^4)\nu (\,\mathrm {d}x)\). The limit is absolutely continuous with respect to Lebesgue measure and thus the functions \(g_t^{(2)}(x)/(1\wedge x^2)\), \(t<0\), are \((1\wedge x^4)\nu (\,\mathrm {d}x)\)-almost everywhere continuous. Moreover, the functions
are all contained in a bounded set of the space of bounded variation functions and hence \(\{(g_s^{(2)}(x)/(1\wedge x^{2})-g_t^{(2)}(x)/(1\wedge x^{2}))^2:~s,t<0\}\) forms a uniformity class for weak convergence towards \((1 \wedge x^4)\nu (\,\mathrm {d}x) \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) (after renormalising the measures involved to have mass one and by Theorem 1 in [4]). Consequently
as \(n\rightarrow \infty \), where we recall that \(g_0^{(2)}=0\). To deal with the first term in (49) we define
Lemma 22 can be applied to the class \({{\mathrm{{\mathcal {G}}}}}:=\{y^{-2}(g_s^{(2)}(y)-g_t^{(2)}(y)):s,t\leqslant 0\}\) using that \(y^{-2}g_t(y)\) is uniformly bounded in the space of bounded variation functions, as observed after (50). This yields
as \(n\rightarrow \infty \). Therefore, (49) follows from (51) and (53) if
uniformly in \(t\leqslant 0\). To show this, note that
for \(t<0\) and \({\bar{g}}_0^{(2)}(x)={\widetilde{g}}_0^{(2)}\). We will use the following proposition, which is an adaptation of the pseudo-differential operator inequality Proposition 10 in [27]. We denote the \(L^q\)-Sobolev space for \(q\in (0,\infty )\) and \(s\in {{\mathrm{{\mathbb {N}}}}}\) by \(W^s_q({{\mathrm{{\mathbb {R}}}}}):=\{f\in L^q({{\mathrm{{\mathbb {R}}}}}):\sum _{k=0}^s\Vert f^{(k)}\Vert _{L^q}<\infty \}\) and define \(\Vert f\Vert _{L^2(P)}:=(\int |f|^2 \,\mathrm {d}P)^{1/2}\).
Proposition 23
Let \(P\) be a probability measure with Lebesgue density \(P\) and such that \(\Vert x^{2j+k} P\Vert _\infty <\infty \) for some \(j,k\in {{\mathrm{{\mathbb {N}}}}}\). Let \(f\in L^2({{\mathrm{{\mathbb {R}}}}})\) with \({{\mathrm{\hbox {supp}}}}(f)\cap (-\delta ,\delta )=\varnothing \) for some \(\delta >0\). Then for any \(p,q\in [1,2]\), \(s\in \{1,2\},\) and any compactly supported function \(\mu \in W^s_q({{\mathrm{{\mathbb {R}}}}})\)
provided that the right-hand side is finite. The constant does not depend on \(\mu \), \(\delta \) or \(f\).
Proof
For \(f\in L^2({{\mathrm{{\mathbb {R}}}}})\) and \(s=1,2\) we can show, as in [27], the pseudo-differential operator identity
Let \(\delta ':=\delta /2\). We use Hölder’s inequality, Plancherel’s identity and the Hausdorff–Young inequality to conclude
The result follows by taking the square root.\(\square \)
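For the reader's convenience, the three classical inequalities invoked in this proof are, in the normalisation \({{\mathrm{{\mathcal {F}}}}}f(u)=\int f(x)e^{iux}\,\mathrm {d}x\) and with constants depending only on the exponents,
$$\begin{aligned} \Vert fg\Vert _{L^1}\leqslant \Vert f\Vert _{L^p}\Vert g\Vert _{L^{p'}},\qquad \Vert {{\mathrm{{\mathcal {F}}}}}f\Vert _{L^2}=\sqrt{2\pi }\,\Vert f\Vert _{L^2},\qquad \Vert {{\mathrm{{\mathcal {F}}}}}f\Vert _{L^{q'}}\lesssim \Vert f\Vert _{L^q}, \end{aligned}$$where \(1/p+1/p'=1\), \(1/q+1/q'=1\) and \(q\in [1,2]\) in the Hausdorff–Young inequality.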
We apply Proposition 23 with \(P={{\mathrm{{\mathbb {P}}}}}_\Delta \), \(\mu =m'(-{\scriptstyle \bullet } )\), \(f(y)=g_t^{(2)}(y)/y^2\), \(\delta = |t|\), \(p=1\), \(q=2\), \(k=1\), \(j=1\) and \(s=1\). Using \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \) we estimate (55) for \(t<0\) by
which converges to zero uniformly for \(t<0\) by Assumption 8. Hence, tightness of the empirical processes indexed by \({\widetilde{g}}_t^{(2)}\) can be verified by (46) with \(i=2\).
Finally, we discuss the remaining \({\widetilde{g}}_t^{(3)}={\widetilde{g}}_t-{\widetilde{g}}_t^{(1)}-{\widetilde{g}}_t^{(2)}\). We have
We combine Lemma 22 for \({{\mathrm{{\mathcal {G}}}}}:=\{g_s-g_t:s,t\leqslant 0\}\), with (48), (53) and (54) and obtain
Exactly as in (51) we infer
and thus we obtain the counterpart to (47)
6.2 Asymptotic equicontinuity for the ‘non-critical terms’
We next turn to verifying the asymptotic equicontinuity condition (46) for the terms \({\widetilde{g}}_t^{(i)}, i \in \{1,3\}\). We refer to them as non-critical since uniform tightness of these processes can be deduced directly from existing bracketing metric entropy inequalities for the empirical process.
We recall standard empirical process notation such as \(\Vert G\Vert _{\mathfrak F}:=\sup _{f\in \mathfrak F}|G(f)|\) and \(\Vert f\Vert _{2,P}:=(\int |f|^2 \,\mathrm {d}P)^{1/2}\). We denote by \(H(\varepsilon ,\mathfrak {F},\Vert \cdot \Vert )\) the logarithm of the covering number \(N(\varepsilon ,\mathfrak {F},\Vert \cdot \Vert )\) and by \(H_{[\,]}(\varepsilon ,\mathfrak {F},\Vert \cdot \Vert )\) the logarithm of the bracketing number \(N_{[\,]}(\varepsilon ,\mathfrak {F},\Vert \cdot \Vert )\) (see [33] for definitions). For a class of functions \(\mathfrak F\) we define
We define the functions \(f_t(x):=x^{-2}g_t^{(1)}(x)=(\rho (x)-e^{x-t}\rho (t) )1\!\!1_{(-\infty ,t]}(x)\) and recall \({\widetilde{g}}_t^{(1)}(x)=\Delta ^{-1/2}x^2{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){\mathcal {F}} f_t(u)](x)\). In order to show the equicontinuity condition (46) for \({\widetilde{g}}_t^{(1)}\) we define the corresponding classes
We suppress in the notation the implicit dependence on \(n\) through \(\Delta \). The weak derivative \(D\rho \) is in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) by the Lipschitz continuity of \(\rho \). Since \(\rho \) is also of bounded variation we have \(D\rho \in L^1({{\mathrm{{\mathbb {R}}}}})\cap \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\subseteq L^2({{\mathrm{{\mathbb {R}}}}})\). The class \(\{f_t:t\leqslant 0\}\) is contained in a bounded set of the Sobolev space \(W_2^1({{\mathrm{{\mathbb {R}}}}})\) since the \(L^2({{\mathrm{{\mathbb {R}}}}})\)-norms of \(f_t\) and \(Df_t\) are bounded. By boundedness of \(m\) we conclude that \({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}f_t(u)](x)\), \(t\leqslant 0\), are contained in a bounded subset of \(W_{2}^1({{\mathrm{{\mathbb {R}}}}})\), which embeds continuously into \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\). As an envelope of the class \((\widetilde{\mathfrak {F}})_\delta '\) we can thus take \(F(x):=c\Delta ^{-1/2}x^2\) for some \(c>0\). By Lemma 19.34 in [32] we have
where \(a(\delta ):=\delta /\sqrt{\log N_{[\,]}(\delta ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))}\) and
\(\Delta ^{1/2}x^{-2}(\widetilde{\mathfrak {F}})_\delta '\) is contained in a bounded set of the Besov space \(B^{s}_{22}({{\mathrm{{\mathbb {R}}}}})\) for \(s\leqslant 1\), which does not depend on \(\Delta \) or \(\delta \). Let \(\gamma >0\) be such that \(\int | x |^{4+2\gamma }\nu (\,\mathrm {d}x)<\infty \). We take \(s\in (1/2,1/2+\gamma )\). The proof of Theorem 1 in [26] with \(p=2\), \(q=2\) and \(\beta =0\) yields
where \(\langle x \rangle :=(1+x^2)^{1/2}\). Another way to write the entropy is \(H(\varepsilon ,\Delta ^{1/2}x^{-2}(\widetilde{\mathfrak {F}})_\delta ',\Vert \cdot \langle x \rangle ^{-\gamma }\Vert _\infty )=H(\varepsilon ,\Delta ^{1/2}(\widetilde{\mathfrak {F}} )_\delta ',\Vert \cdot x^{-2}\langle x \rangle ^{-\gamma }\Vert _\infty )\). A ball in the \(\Vert \cdot x^{-2}\langle x \rangle ^{-\gamma }\Vert _\infty \)-norm with centre \(f\) and radius \(\varepsilon \) is a bracket
whose \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-size is given by \(\Vert 2\varepsilon x^2\langle x \rangle ^{\gamma }\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\). Consequently we have
By Theorem 1.1 in [11] (see also [14]) we have \(\Delta ^{-1/2}2\varepsilon \Vert x^2\langle x \rangle ^{\gamma }\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\rightarrow 2\varepsilon \Vert x^2\langle x \rangle ^{\gamma }\Vert _{2,\nu }\) as \(n\rightarrow \infty \). We obtain by a rescaling that
Since \(s>1/2\), the entropy integral \(J_{[\,]}(\delta ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\) is finite and tends to zero as \(\delta \rightarrow 0\). To show that the left-hand side of (56) tends to zero, we first ensure that the entropy integral is small by choosing \(\delta >0\). Upon fixing \(\delta \), and thus for \(a(\delta )\) bounded away from zero uniformly in \(\Delta \), we choose \(n\) large enough such that the second term is small. We recall that we have taken the envelopes to be \(F(x)=c\Delta ^{-1/2}x^2\). We bound
where we multiplied by \(cx^2/(\sqrt{n\Delta }\,a(\delta ))>1\). For \(M\) large enough \(\int x^4 1\!\!1_{\{x^2>M\}}\nu (\,\mathrm {d}x)\) is small. Since \(\Delta ^{-1}\int x^4 1\!\!1_{\{x^2>M\}}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\rightarrow \int x^4 1\!\!1_{\{x^2>M\}}\nu (\,\mathrm {d}x)\) by Theorem 1.1 in [11], \(n\Delta \rightarrow \infty \) as \(n\rightarrow \infty \) and \(a(\delta )\) is bounded away from zero, we have that \(\Delta ^{-1}\int x^41\!\!1_{\{x^2>\sqrt{n\Delta }\,a(\delta )/c\}}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\) is small for \(n\) large enough. So indeed the left-hand side of (56) tends to zero as \(\delta \rightarrow 0\) and \(n\rightarrow \infty \), and we have shown tightness of the empirical process indexed by \(\{{\widetilde{g}}^{(1)}_t:t\leqslant 0\}\).
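To make the role of the condition \(s>1/2\) explicit: an entropy bound of order \(\varepsilon ^{-1/s}\) gives
$$\begin{aligned} \int _0^\delta \sqrt{H_{[\,]}(\varepsilon ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))}\,\mathrm {d}\varepsilon \lesssim \int _0^\delta \varepsilon ^{-1/(2s)}\,\mathrm {d}\varepsilon =\frac{\delta ^{1-1/(2s)}}{1-1/(2s)}, \end{aligned}$$which is finite and tends to zero as \(\delta \rightarrow 0\) precisely because \(1/(2s)<1\).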
Let us now consider the terms associated to
The functions \((y-t)e^{y-t} \rho (t)1\!\!1_{(-\infty ,t]}(y)\) are bounded in \(L^2({{\mathrm{{\mathbb {R}}}}})\) uniformly over \(t\leqslant 0\), and so are their weak derivatives. We conclude that they are contained in a bounded set of \(B^1_{22}({{\mathrm{{\mathbb {R}}}}})\). The functions \(e^{y-t}\rho (t)1\!\!1_{(-\infty ,t]}(y)\), \(t\leqslant 0\), are contained in a bounded set of \(L^2({{\mathrm{{\mathbb {R}}}}})\). Assumption 8 implies, together with the Mikhlin Fourier multiplier theorem (e.g., Corollary 4.11 in [19]), that \(m\) is a Fourier multiplier on every Besov space \(B^s_{pq}({{\mathrm{{\mathbb {R}}}}})\), \(s\in {{\mathrm{{\mathbb {R}}}}}\), \(p,q\in [1,\infty ]\), and, moreover, that \(m'\) is a Fourier multiplier mapping \(B^s_{pq}({{\mathrm{{\mathbb {R}}}}})\) into \(B^{s+1}_{pq}({{\mathrm{{\mathbb {R}}}}})\). We see that \(\Delta ^{1/2}x^{-1}{\widetilde{g}}^{(3)}_t(x)\), \(t\leqslant 0\), are contained in a bounded set of \(B^1_{22}({{\mathrm{{\mathbb {R}}}}})\). We define the class \(\widetilde{{\mathcal {G}}}:=\{{\widetilde{g}}^{(3)}_t:t\leqslant 0\}\).
As an envelope of the class \((\widetilde{\mathcal {G}})_{\delta }'\) we can take \(G(x):=c\Delta ^{-1/2}x\) for some constant \(c>0\). Lemma 19.34 in [32] yields
Again by the proof of Theorem 1 in [26] with \(s=1\), \(p=2\), \(q=2\), \(\beta =0\) and \(\gamma =1\) we have
The entropy can be rewritten as \(H(\varepsilon ,\Delta ^{1/2}(\widetilde{{\mathcal {G}}})_{\delta }',\Vert \cdot x^{-1}\langle x\rangle ^{-1}\Vert _\infty )\). A corresponding \(\varepsilon \)-ball has \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-size \(2\varepsilon \Vert x\langle x \rangle \Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\). By Theorem 1.1 in [11] we have \(\Delta ^{-1/2}\Vert x\langle x\rangle \Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\rightarrow \Vert x\langle x \rangle \Vert _{2,\nu }+\sigma ^2\) as \(n\rightarrow \infty \). Arguing as for \(\widetilde{\mathfrak {F}}\) we obtain
The entropy integral in (57) is finite and converges to zero as \(\delta \rightarrow 0\). The second term \(\sqrt{n}{{\mathrm{{\mathbb {P}}}}}_\Delta G\{G>\sqrt{n}a(\delta )\}\) can be treated exactly as the second term in (56) with \(x^2\) replaced by \(x\). So the \(\lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\) of (57) is zero and thus (46) follows for the functions \({\widetilde{g}}^{(3)}_t\).
6.3 Asymptotic equicontinuity of the ‘critical term’
It remains to show asymptotic equicontinuity of the empirical process indexed by the class
where we recall from (41) that
We refer to this term as ‘critical’: the functions \(q_t\) have a jump discontinuity at \(t\), and controlling its interaction with the operator \({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-\cdot )]\) requires more elaborate techniques than those of the previous section.
We will rely on the following auxiliary result, which is a modification of Theorem 3 in [16] and itself goes back to fundamental ideas in [18]. It is designed to allow for maximally growing envelopes of the empirical process, which is crucial in our setting to impose only minimal conditions on \(\Delta \). Note that condition (a) only requires \(M_n/n^{1/2}\rightarrow 0\) instead of the more stringent condition \(M_n/n^{1/4} \rightarrow 0\) required in Theorem 3 in [16].
Proposition 24
For every \(n \in {{\mathrm{{\mathbb {N}}}}},\) let \(X_{n,j}, j=1, \ldots , n,\) be i.i.d. from law \(P_n\) on a measurable space \((S, {\mathcal {B}})\) and let \(\varepsilon _j\), \(j=1,\ldots ,n,\) be i.i.d. Rademacher random variables independent of the \(X_{n,j}\)’s, all defined on a common probability space \((\Omega , {\mathcal {A}}, \Pr )\). For any sequence \(({\mathcal {Q}}_n)_{n\geqslant 1}\) of classes of measurable functions \(q: S \rightarrow {{\mathrm{{\mathbb {R}}}}}\) and
suppose the following conditions are satisfied for some sequence \(r_n\rightarrow 0\) as \(n\rightarrow \infty \):
(a)
\(\sup _{q\in {\mathcal {Q}}_n}\Vert q\Vert _\infty \leqslant M_n\) for a sequence \(M_n\) such that \(n r_n^{2}{M_n}^{-2}\rightarrow \infty \).
(b)
$$\begin{aligned} \left\| \frac{1}{\sqrt{n}}\sum _{j=1}^n\varepsilon _j q(X_{n,j})\right\| _{({\mathcal {Q}}_n)'_{r}} = o_P(1) \end{aligned}$$
as \(n\rightarrow \infty \).
(c)
There exists \(n_0\in {{\mathrm{{\mathbb {N}}}}}\) such that for all \(n\geqslant n_0\)
$$\begin{aligned} 23 H(r_n,{\mathcal {Q}}_n,L^2(P_n))\leqslant n r_n^{2}{M_n}^{-2}. \end{aligned}$$
(d)
$$\begin{aligned} \lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\int _0^\delta \sqrt{H( \varepsilon ,{\mathcal {Q}}_n, L^2(P_n))}\,\mathrm {d}\varepsilon =0 \end{aligned}$$
Then for all \(\gamma >0\)
Proof
Let \(\gamma >0\) be given. To lighten notation we sometimes omit mentioning \(q,q' \in {\mathcal {Q}}_n\). By Lemma 11.2.6 in [9] we have for \(\delta \in (0,\gamma /\sqrt{2})\)
where the \(\varepsilon _j\) are Rademacher random variables independent of the \((X_{n,j})\), all defined on a large product probability space. Since \(\gamma \) is fixed, we can choose \(\delta \) small enough such that \(\delta <\gamma /2\). Hence it suffices to show for all \(\gamma >0\) that
Let \({\mathcal {H}}={\mathcal {H}}_n\) be a maximal collection of functions \(h_1,\ldots ,h_m\) in \({\mathcal {Q}}_n\) such that \(\Vert h_j-h_k\Vert _{2,P_n}>r_n\) if \(j\ne k\). The closed balls with centres \(h_1,\ldots ,h_m\) of radius \(r_n\) cover \({\mathcal {Q}}_n\). We define
For \(n\) large enough such that \(r_n<\delta /2\) we have
By condition (b) the first term tends to zero. To control the second term we define the event
where we used the notation \(P_n f:=\int _S f \,\mathrm {d}P_n\) for functions \(f:S\rightarrow {{\mathrm{{\mathbb {R}}}}}\). Using Markov’s inequality the second term in (59) can be bounded by
The number of elements in \({\mathcal {H}}'_{2\delta }\) is bounded by
For a single \(h\in {\mathcal {H}}_{2\delta }'\backslash \{0\}\) we have, using Bernstein’s inequality,
Combining the last bound and (61) we obtain
by conditions (a) and (c). It remains to show that the second term in (60) converges to zero. Conditional on the \(X_{n,j}\)’s the process
is subgaussian. Let \(h,h'\in {\mathcal {H}}\) such that \(h-h'\in {\mathcal {H}}_{2\delta }'\backslash \{0\}\). On the event \(A_n\) we have
In particular, on \(A_n\), \(\Vert h-h'\Vert _{2,P_n}<\varepsilon \) implies \(d_Z(h,h')<\sqrt{2}\varepsilon \) for all \(\varepsilon \in (0,2\delta ]\) and all \(h,h'\in {\mathcal {H}}\). We define \(\psi _2(x):=\exp (x^2)-1\) and the norm \(\Vert \xi \Vert _{\psi _2}:=\inf \{c>0:{{\mathrm{{\mathbb {E}}}}}[\psi _2(|\xi |/c)]\leqslant 1\}\). By (4.3.3) in [8] there is a constant \(c>0\) such that \({{\mathrm{{\mathbb {E}}}}}[|\xi |]\leqslant c \Vert \xi \Vert _{\psi _2}\). So we obtain the bound
Next we apply Dudley’s theorem in the form of Corollary 5.1.6 and Remark 5.1.7 in [8] to the process \(Z\). This yields a constant \(K\) such that
a bound independent of \(X\). In order to complete the proof we take expectation with respect to \(X\), consider the limit \(\lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\) of the expression and apply condition (d).\(\square \)
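The symmetrisation step at the start of the proof can be illustrated numerically. The following toy Monte Carlo check (all choices illustrative: uniform data and a three-element class of indicator functions, not the classes of the proof) compares \({{\mathrm{{\mathbb {E}}}}}\Vert \sqrt{n}(P_n-P)\Vert _{{\mathcal {Q}}}\) with \(2\,{{\mathrm{{\mathbb {E}}}}}\Vert n^{-1/2}\sum _j\varepsilon _jq(X_j)\Vert _{{\mathcal {Q}}}\):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 2000
ts = np.array([0.25, 0.5, 0.75])           # small class {1_{(-inf,t]} : t in ts}

emp, sym = 0.0, 0.0
for _ in range(reps):
    X = rng.uniform(size=n)
    eps = rng.choice([-1.0, 1.0], size=n)  # Rademacher signs
    ind = (X[:, None] <= ts)               # n x 3 matrix of q(X_j) values
    # empirical process: sup_t |sqrt(n) (P_n - P) 1_{(-inf,t]}|, here P 1_{(-inf,t]} = t
    emp += np.max(np.abs(np.sqrt(n) * (ind.mean(axis=0) - ts)))
    # symmetrised process: sup_t |n^{-1/2} sum_j eps_j 1_{(-inf,t]}(X_j)|
    sym += np.max(np.abs(eps @ ind / np.sqrt(n)))
emp, sym = emp / reps, sym / reps

# symmetrisation inequality: E||.||_Q <= 2 E||sym||_Q (up to Monte Carlo error)
assert emp <= 2.0 * sym
```

With the seed fixed, the Monte Carlo average of the empirical process norm comes out well below twice that of the symmetrised process, as the inequality predicts.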
To proceed with the tightness proof for the critical term we will show conditions (a) to (d) for \(r_n:=\log (1/\Delta _n)^{-\alpha }\), \(\alpha \in (1/2,1)\), and for the class \( Q_n=\{{\widetilde{g}}_t^{(2)}:t\leqslant 0\}\) defined above.
(a) We rewrite
where the last step also shows that the bounded variation norm of \(t\rho (t)ye^{y-t}1\!\!1_{(-\infty ,t]}(y)\) is bounded uniformly in \(t\leqslant 0\). If \({{\mathrm{{\mathcal {F}}}}}^{-1}[m]\), \({{\mathrm{{\mathcal {F}}}}}^{-1}[m']\) are finite signed measures as in Assumption 8(a), then the bounded variation norms of \({{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-{\scriptstyle \bullet } )]*(t\rho (t)e^{y-t}1\!\!1_{(-\infty ,t]}(y))\) and \({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-{\scriptstyle \bullet } )]*(t\rho (t)ye^{y-t}1\!\!1_{(-\infty ,t]}(y))\) are bounded uniformly in \(t\leqslant 0\) and
where \(\Vert f\Vert _{BV}\) denotes the bounded variation norm equal to the sum of the \(\ell ^\infty \)-norm of \(f\) and the usual total variation norm of the weak derivative \(Df\). For \(m\) supported in \([-C\Delta ^{-1/2},C\Delta ^{-1/2}]\) as in Assumption 8(b), we have
and the Fourier transform of \({\widetilde{g}}^{(2)}_t\) is supported on \([-C\Delta ^{-1/2},C\Delta ^{-1/2}]\). In view of the Littlewood–Paley definition of Besov spaces we can estimate the \(B^1_{11}({{\mathrm{{\mathbb {R}}}}})\)-norm of \({\widetilde{g}}^{(2)}_t\) by \(\log (C/\Delta ^{1/2})\) times its \(B^1_{1\infty }({{\mathrm{{\mathbb {R}}}}})\)-norm. With the Fourier multiplier property of \(m\) and \(m'\) this yields
since the \(B^1_{1\infty }({{\mathrm{{\mathbb {R}}}}})\)-norms of \(t\rho (t)e^{y-t}1\!\!1_{(-\infty ,t]}(y)\) and \(t\rho (t)ye^{y-t}1\!\!1_{(-\infty ,t]}(y)\) are bounded uniformly in \(t\) by integrability and bounded variation. So \(M_n\) can be chosen proportional to \(\Delta ^{-1/2}\log (1/\Delta )\), and \(nr_n^2M_n^{-2}\rightarrow \infty \) by \(\log ^4(1/\Delta )=o(n\Delta )\).
(b) We will show condition (b) by applying a moment inequality for empirical processes under uniform entropy bounds for \(Q_n\). We decompose \({\widetilde{g}}^{(2)}_t\) according to (62). Using that \(B_{11}^{1}({{\mathrm{{\mathbb {R}}}}})\) embeds continuously into the space \(\text {BV}\) of bounded variation functions, the bounds in (a) show that
where we omitted the factors \(t\rho (t)\) and \(t^2\rho (t)\) to obtain translation invariant classes. Since the functions in the class
are of bounded variation, we can write them as the composition of a 1-Lipschitz function with a nondecreasing function. The class of all translates of a nondecreasing function has VC index 2 and thus polynomial \(L^2({{\mathrm{{\mathbb {Q}}}}})\)-covering numbers uniformly in all probability measures \({{\mathrm{{\mathbb {Q}}}}}\) by Theorem 5.1.15 in [8]. The \(\varepsilon \)-covering numbers are preserved under 1-Lipschitz transformations and thus the covering numbers of \(\mathfrak F_n\) are polynomial in \(M_n/\varepsilon \). The \(\varepsilon \)-covering numbers of \(\{t\rho (t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) are polynomial in \(1/\varepsilon \). To obtain an \(\varepsilon \)-covering of the functions in the first term of (62) we cover the class \(\mathfrak {F}_n\) by balls of size \(\varepsilon /2\) and the class \(\{t\rho (t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) by balls of size \(\varepsilon /(2M_n)\). We see that the covering numbers can be bounded by a product of two polynomial covering numbers and thus are polynomial in \(M_n/\varepsilon \). Arguing in the same way for the two other terms in (62) yields polynomial covering numbers for them, too. Using that the covering numbers of \(Q_n\) can be bounded by the product of the covering numbers for the respective terms, we see that the covering numbers of \(Q_n\) are polynomial in \(M_n/\varepsilon \). By Proposition 3 in [17] there exists a universal constant \(L>0\) such that
Condition (b) is satisfied if this maximum tends to zero. We have
and
which tends to zero by \(\log ^4(1/\Delta )=o(n\Delta )\).
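The polynomial covering numbers for translates of a nondecreasing function, used above, can be made concrete in a toy case (not the class \(\mathfrak F_n\) of the proof): for the indicators \(1\!\!1_{(-\infty ,t]}\) and an empirical measure \(Q_n\) with distribution function \(F_n\) one has \(\Vert 1\!\!1_{(-\infty ,s]}-1\!\!1_{(-\infty ,t]}\Vert _{2,Q_n}^2=|F_n(t)-F_n(s)|\), so a greedy net along \(F_n\) has at most \(\lceil \varepsilon ^{-2}\rceil +1\) elements:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.sort(rng.normal(size=500))   # sample defining the empirical measure Q_n
n = len(X)

def covering_number(eps):
    # greedy eps-net for {1_{(-inf,t]}} in L2(Q_n): d(s,t)^2 = |F_n(t) - F_n(s)|,
    # so we add a new net center whenever F_n has moved by more than eps^2
    centers, last_F = 1, 0.0
    for k in range(1, n + 1):        # F_n jumps by 1/n at each data point
        if k / n - last_F > eps**2:  # farther than eps from the current center
            centers += 1
            last_F = k / n
    return centers

for eps in (0.5, 0.2, 0.1):
    # polynomial bound: at most ceil(1/eps^2) + 1 balls of radius eps are needed
    assert covering_number(eps) <= int(np.ceil(eps**-2)) + 1
```

Each new centre increases \(F_n\) by more than \(\varepsilon ^2\), and \(F_n\leqslant 1\), which is exactly the counting argument behind the polynomial bound.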
(c) In order to verify (c), we will show that \(H(\varepsilon ,Q_n,L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\lesssim \log (\varepsilon ^{-1})\) uniformly in \(n\). Applying Proposition 23 with \(j=k=1\), \(p=q=2\) and \(s=2\) yields that for \(\mu =m(-{\scriptstyle \bullet } )\) and for all \(f\in L^2({{\mathrm{{\mathbb {R}}}}})\) with \({{\mathrm{\hbox {supp}}}}(f)\cap (-\delta ,\delta )=\varnothing \) for some \(\delta >0\)
where we used \(\Vert m\Vert _\infty \leqslant C\) and \(\Delta ^{-1/2}\Vert m''\Vert _{L^2}\rightarrow 0\) by Assumption 8.
Let \(M\geqslant 1\) and \(\eta \in [0,1]\). We will distinguish the three cases \(s,t\leqslant -M\), \(s,t\in [-M,-\eta ]\) and \(s,t\in [-\eta ,0]\).
Case 1: Let \(s,t\leqslant -M\). We apply (65) with \(\delta =M\) and \(f(y):=q_s(y)-q_t(y)\), where \(q_t\) is defined in (58). Noting that we can bound \(\Vert q_t\Vert _{L^2}\lesssim M^{-1}\) uniformly in \(t\leqslant 0\), we obtain for \(s,t\leqslant -M\)
Case 2: For the second case let \(-M\leqslant s,t\leqslant -\eta \). We apply (65) with \(\delta =\eta \) to \(f(y):=q_s(y)-q_t(y)\). Without loss of generality we assume \(s\leqslant t\). We estimate
by the Lipschitz continuity of \(x\rho \) and obtain for \(s,t\in [-M,-\eta ]\) with \(|s-t|\leqslant 1\)
Case 3: Let \(-\eta \leqslant s,t\leqslant 0\). We have \(\Vert {\widetilde{g}}_s^{(2)} -{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\leqslant 2\sup _{t\in [-\eta ,0]}\Vert {\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\). We apply Proposition 23 with \(f=q_t\), \(\mu =m(-{\scriptstyle \bullet } )\), \(\delta =|t|\), \(k=1\), \(j=1\), \(p=2\), \(q=2\) and \(s=2\). We have
and consequently \(\Vert {\widetilde{g}}_s^{(2)} -{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim \eta ^{1/2}\) for \(-\eta \leqslant s,t\leqslant 0\).
Having treated these three cases we can show \(N(\varepsilon , Q_n,L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\lesssim \varepsilon ^{-7}\). For an integer \(J>0\) we consider the grid of points \(t_j=-j J^{-6}\) with \(j=J^4, J^4+1, J^4+2,\ldots , J^{7}\). We take \(\eta =J^{-2}\). By Case 3 we see that \(\Vert {\widetilde{g}}_s^{(2)}-{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim J^{-1}\) for all \(s,t\in [-J^{-2},0]\). By Case 2 we have \(\Vert {\widetilde{g}}_s^{(2)}-{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim |s-t|^{1/2}/\eta \leqslant J^{-1}\) for \(s,t\in [-(j+1)J^{-6},-jJ^{-6}]\). And by Case 1 \(\Vert {\widetilde{g}}_s^{(2)}-{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim J^{-1}\) for \(s,t\leqslant -J\).
We have polynomial covering numbers and it suffices for condition (c) that
In (a) we have seen that \(M_n\lesssim \Delta ^{-1/2}\log (1/\Delta )\). For the choice \(r_n=\log (1/\Delta )^{-\alpha }\) with \(\alpha <1\) we obtain
$$\begin{aligned} nr_n^{2}M_n^{-2}\gtrsim \frac{n\Delta }{\log ^{2+2\alpha }(1/\Delta )}, \end{aligned}$$
which tends to infinity by \(\log ^4(1/\Delta )=o(n\Delta )\).
(d) In (c) we have seen that the covering numbers \(N(\varepsilon ,Q_n,L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\) are polynomial in \(\varepsilon ^{-1}\), uniformly in \(n\), so that the condition is satisfied.
References
Bauer, H.: Probability Theory. De Gruyter, Berlin (1996)
Belomestny, D.: Spectral estimation of the fractional order of a Lévy process. Ann. Stat. 38(1), 317–351 (2010)
Belomestny, D., Reiß, M.: Spectral calibration of exponential Lévy models. Fin. Stoch. 10(4), 449–474 (2006)
Billingsley, P., Topsøe, F.: Uniformity in weak convergence. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 7, 1–16 (1967)
Blumenthal, R.M., Getoor, R.K.: Sample functions of stochastic processes with stationary independent increments. J. Math. Mech. 10, 493–516 (1961)
Comte, F., Genon-Catalot, V.: Estimation for Lévy processes from high frequency data within a long time interval. Ann. Stat. 39(2), 803–837 (2011)
Cont, R., Tankov, P.: Financial Modelling with Jump Processes. Chapman & Hall/CRC Financial Mathematics Series, Chapman & Hall/CRC, Boca Raton (2004)
de la Peña, V.H., Giné, E.: Decoupling: From Dependence to Independence. Springer, New York (1999)
Dudley, R.M.: Uniform Central Limit Theorems. Cambridge Studies in Advanced Mathematics, vol. 63. Cambridge University Press, Cambridge (1999)
Feller, W.: An Introduction to Probability Theory and its Applications, vol. II, 2nd edn. Wiley, New York (1971)
Figueroa-López, J.E.: Small-time moment asymptotics for Lévy processes. Stat. Probab. Lett. 78(18), 3355–3365 (2008)
Figueroa-López, J.E.: Nonparametric estimation of Lévy models based on discrete-sampling. In: Optimality. IMS Lecture Notes Monographs Series, vol. 57, pp. 117–146. Inst. Math. Statist, Beachwood (2009)
Figueroa-López, J.E.: Sieve-based confidence intervals and bands for Lévy densities. Bernoulli 17(2), 643–670 (2011)
Figueroa-López, J.E., Houdré, C.: Small-time expansions for the transition distributions of Lévy processes. Stoch. Process. Appl. 119(11), 3862–3889 (2009)
Folland, G.B.: Real Analysis. Modern techniques and their applications, 2nd edn. Wiley, New York (1999)
Giné, E., Nickl, R.: Uniform central limit theorems for kernel density estimators. Probab. Theory Rel. Fields 141(3–4), 333–387 (2008)
Giné, E., Nickl, R.: An exponential inequality for the distribution function of the kernel density estimator, with applications to adaptive estimation. Probab. Theory Rel. Fields 143(3–4), 569–596 (2009)
Giné, E., Zinn, J.: Some limit theorems for empirical processes. Ann. Probab. 12(4), 929–998 (1984). (with discussion)
Girardi, M., Weis, L.: Operator-valued Fourier multiplier theorems on Besov spaces. Math. Nachr. 251(1), 34–51 (2003)
Gugushvili, S.: Nonparametric inference for discretely sampled Lévy processes. Ann. Inst. Henri Poincaré Probab. Stat. 48(1), 282–307 (2012)
Jacod, J., Reiß, M.: A remark on the rates of convergence for integrated volatility estimation in the presence of jumps. Ann. Stat. 42(3), 1131–1144 (2014)
Jongbloed, G., van der Meulen, F.H., van der Vaart, A.W.: Nonparametric inference for Lévy-driven Ornstein–Uhlenbeck processes. Bernoulli 11(5), 759–791 (2005)
Kappus, J.: Adaptive nonparametric estimation for Lévy processes observed at low frequency. Stoch. Process. Appl. 124(1), 730–758 (2014)
Kappus, J., Reiß, M.: Estimation of the characteristics of a Lévy process observed at arbitrary frequency. Stat. Neerl. 64(3), 314–328 (2010)
Neumann, M.H., Reiß, M.: Nonparametric estimation for Lévy processes from low-frequency observations. Bernoulli 15(1), 223–248 (2009)
Nickl, R., Pötscher, B.M.: Bracketing metric entropy rates and empirical central limit theorems for function classes of Besov- and Sobolev-type. J. Theor. Probab. 20(2), 177–199 (2007)
Nickl, R., Reiß, M.: A Donsker theorem for Lévy measures. J. Funct. Anal. 263(10), 3306–3332 (2012)
Orey, S.: On continuity properties of infinitely divisible distribution functions. Ann. Math. Stat. 39, 936–937 (1968)
Picard, J.: Density in small time for Lévy processes. ESAIM Probab. Stat. 1, 357–389 (1997)
Reiß, M.: Testing the characteristics of a Lévy process. Stoch. Process. Appl. 123, 2808–2828 (2013). (special issue international year of statistics)
Sato, K.-I.: Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Cambridge (1999)
van der Vaart, A.W.: Asymptotic Statistics, vol. 3. Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge (1998)
van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer Series in Statistics, Springer, New York (1996)
Acknowledgments
The authors acknowledge insightful remarks from the Associate Editor and two anonymous referees that helped to improve the presentation of the paper. Financial support by the Deutsche Forschungsgemeinschaft via FOR 1735 "Structural Inference in Statistics" is gratefully acknowledged.
Nickl, R., Reiß, M., Söhl, J. et al.: High-frequency Donsker theorems for Lévy measures. Probab. Theory Relat. Fields 164, 61–108 (2016). https://doi.org/10.1007/s00440-014-0607-3
Keywords
- High-frequency inference
- Donsker theorem
- Lévy process
- Empirical process
Mathematics Subject Classification
- Primary 60F05
- Secondary 60G51
- 62G05
