1 Introduction

Suppose that \((L_t: t \geqslant 0)\) is a real-valued Lévy process defined on some probability space \((\Omega , {\mathcal {A}}, \Pr )\) and we observe \(n\) of its increments

$$\begin{aligned} X_k = L_{k \Delta } - L_{(k-1)\Delta },\quad k =1, \ldots , n, \end{aligned}$$
(1)

sampled at frequency \(1/\Delta >0\). Equivalently the \(X_k\)’s are drawn i.i.d. from some infinitely divisible distribution \({{\mathrm{{\mathbb {P}}}}}_\Delta \), with corresponding empirical measures \({{\mathrm{{\mathbb {P}}}}}_{\Delta , n} = \frac{1}{n} \sum _{k=1}^n \delta _{X_k}\).
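To fix ideas, increments of the form (1) are easy to simulate for a jump diffusion, that is, a Brownian motion with drift plus an independent compound Poisson part. The following Python sketch uses hypothetical parameter values (\(\sigma =0.5\), \(\gamma =0.1\), jump intensity \(\lambda =2\) with standard normal jumps) purely for illustration; the function name is ours.

```python
import numpy as np

def sample_increments(n, delta, sigma=0.5, gamma=0.1, lam=2.0, rng=None):
    """Sample n increments X_k = L_{k delta} - L_{(k-1) delta} of a
    jump diffusion: Brownian motion with drift plus an independent
    compound Poisson part with intensity lam and N(0,1) jumps."""
    rng = np.random.default_rng(rng)
    # Gaussian part of each increment: gamma*delta + sigma*sqrt(delta)*Z
    gauss = gamma * delta + sigma * np.sqrt(delta) * rng.standard_normal(n)
    # jump part: Poisson(lam*delta) many standard normal jumps per increment
    counts = rng.poisson(lam * delta, size=n)
    jumps = np.array([rng.standard_normal(c).sum() for c in counts])
    return gauss + jumps

X = sample_increments(10000, 0.01, rng=1)
# sanity check: E[X_1] = gamma*delta, Var[X_1] = delta*(sigma^2 + lam*E[J^2])
```

With these values \({{\mathrm{{\mathbb {E}}}}}[X_1]=\gamma \Delta =10^{-3}\) and \(\mathrm {Var}(X_1)=\Delta (\sigma ^2+\lambda )=0.0225\), which the sample moments reproduce up to Monte Carlo error.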

Lévy processes are increasingly popular in stochastic modelling. A question of key importance is how the structure of the Lévy process, particularly its jump behaviour, can be recovered from these observed increments. From a statistical point of view it is natural to consider a growing observation horizon \(n \Delta \rightarrow \infty \). If simultaneously \(\Delta = \Delta _n\) approaches zero one speaks of a ‘high-frequency’ sampling regime, as opposed to ‘low-frequency’ sampling where \(\Delta \) remains fixed. Inference problems of this kind have recently gained increased attention. Jongbloed et al. [22] studied nonparametric inference for Lévy-driven Ornstein–Uhlenbeck processes. Belomestny and Reiß [3] treat nonparametric estimation of Lévy processes in a financial model. Low-frequency observations were considered, e.g., by Neumann and Reiß [20], Belomestny [25], Gugushvili [2] as well as Nickl and Reiß [27], whereas Figueroa-López [12, 13] treats high-frequency observations. Nonparametric estimation of Lévy processes in a model selection context was studied by Comte and Genon-Catalot [6] and Kappus [23]. A general discussion of the literature and further references can be found in the recent survey paper [30].

By the Lévy–Khintchine representation [31] the Lévy process \((L_t: t \geqslant 0)\) is entirely characterised by three parameters: the diffusion coefficient \(\sigma ^2\) describing the Brownian motion component, the centring or drift parameter \(\gamma \), and the Lévy measure \(\nu \). Recovering the Lévy process can thus be reduced to recovering the Lévy triplet \((\sigma ^2, \gamma , \nu )\). Statistical inference for the one-dimensional parameters \(\sigma ^2, \gamma \) can be based on standard statistics such as the quadratic variation and the sample average of the increments, or on spectral estimators, see Sect. 4 for discussion and references.

An intrinsically more complex problem than inference on \(\sigma ^2\) and \(\gamma \) is the recovery of the Lévy measure \(\nu \), which describes the jump behaviour of the Lévy process. We recall that the Lévy measures are precisely the positive Borel measures \(\nu \) on \({{\mathrm{{\mathbb {R}}}}}\) s.t.

$$\begin{aligned} \int _{{{\mathrm{{\mathbb {R}}}}}} (1 \wedge x^2) \nu (\,\mathrm {d}x) <\infty , \quad \nu (\{0\})=0. \end{aligned}$$

Thus a natural target is to recover the cumulative distribution function

$$\begin{aligned} N(t) = \int _{-\infty }^t (1 \wedge x^2) \nu (\,\mathrm {d}x), \quad t \in {{\mathrm{{\mathbb {R}}}}}, \end{aligned}$$
(2)

from the observed increments; it encodes both local and global information about \(\nu \). The presence of \((1\wedge x^2)\) smooths the singularity that \(\nu \) may possess at the origin. Other possibilities to smooth the singularity exist and our results will cover functions from a general class (see Sect. 3). In particular this will include recovery of the distribution function

$$\begin{aligned} {\mathcal {N}}(t) =\int _{-\infty }^t \nu (\,\mathrm {d}x), \quad t < 0,\quad \text {and} \quad {\mathcal {N}}(t) =\int _{t}^{\infty } \nu (\,\mathrm {d}x), \quad t > 0, \end{aligned}$$
(3)

of the Lévy measure at any point \(t\ne 0\).
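For a concrete Lévy density the functions \(N\) and \({\mathcal {N}}\) can be evaluated by elementary quadrature. The sketch below uses the hypothetical symmetric density \(\nu (x)=e^{-|x|}/x^2\), for which \((1\wedge x^2)\nu (x)=e^{-|x|}/\max (1,x^2)\) is bounded, so a midpoint rule suffices; truncation bounds and grid sizes are arbitrary choices.

```python
import numpy as np

def levy_density(x):
    # hypothetical symmetric Levy density with an |x|^{-2} singularity at 0
    return np.exp(-np.abs(x)) / x**2

def N(t, lo=-40.0, m=400000):
    """N(t) = int_{-inf}^t (1 ^ x^2) nu(dx) by a midpoint rule; note
    (1 ^ x^2) nu(x) = exp(-|x|)/max(1, x^2), which is bounded."""
    x = np.linspace(lo, t, m)
    xm = 0.5 * (x[1:] + x[:-1])
    f = np.exp(-np.abs(xm)) / np.maximum(xm * xm, 1.0)
    return np.sum(f * np.diff(x))

def Ncal(t, hi=40.0, m=200000):
    """Tail function script-N(t) = int_t^infty nu(dx) for t > 0."""
    x = np.linspace(t, hi, m)
    xm = 0.5 * (x[1:] + x[:-1])
    return np.sum(np.exp(-xm) / xm**2 * np.diff(x))
```

By symmetry \(N(0)=N(\infty )/2\) here, which is a convenient check of the quadrature.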

For statistical applications, inference on the functions \(N, {\mathcal {N}}\) in the uniform norm \(\Vert \cdot \Vert _\infty \) on the real line is of particular interest, paralleling the classical Donsker–Kolmogorov–Smirnov central limit theorems

$$\begin{aligned} \sqrt{n} (F_n -F) \rightarrow ^{{\mathcal {L}}} {\mathbb {G}}_F \end{aligned}$$

in the space of bounded functions on \({{\mathrm{{\mathbb {R}}}}}\), where \(F_n\) is the empirical distribution function of a random sample from distribution \(F\), and where \({\mathbb {G}}_F\) is the \(F\)-Brownian bridge [9, 33]. In the Lévy setting, Nickl and Reiß [27] considered an estimator for the distribution function \({\mathcal {N}}(t)\), \(|t|\geqslant \zeta ,\) based on low-frequency observations (\(\Delta \) fixed) and proved such a Donsker–Kolmogorov–Smirnov theorem. The purpose of the present article is to derive such results when also \(\Delta \rightarrow 0\). The main message is that high-frequency observations reveal much finer statistical properties of the Lévy measure, and inference is possible for a much larger class of Lévy processes than considered in [27], including processes with a nonzero Gaussian component. Moreover, the theory does not only cover nonlinear ‘inversion’ estimators based on the Lévy–Khintchine formula, but also ‘linear’ estimators based on elementary counting statistics. At the heart of these results is a general purpose uniform central limit theorem for a basic ‘smoothed empirical process’ arising from the \(X_k\)’s in (1), see Theorem 11 below.

In the next section we introduce the estimators and give the main results as well as some statistical applications. In Sect. 3 we show how to reduce the proofs to the study of a unified smoothed empirical process, and in Sect. 4 we discuss our conditions and their interpretation in a variety of concrete examples of Lévy processes. The remainder of the article is then devoted to the proofs of our results.

2 Main results: asymptotic inference on the Lévy measure \(\nu \)

In this section we study two approaches to estimate the distribution functions \(N, {\mathcal {N}}\) of a Lévy measure, based on discrete observations (1). The first estimator is constructed by a direct approach and counts the number of increments below a certain threshold, where increments are weighted by \(1\wedge X_k^2\). The second approach relies on the Lévy–Khintchine representation and a spectral regularisation step.

2.1 Basic notation and assumptions

The symbol \(\ell ^\infty (T)\) denotes the space of bounded functions on a set \(T\) normed by the usual supremum norm \(\Vert \cdot \Vert _\infty \). We will measure the smoothness of functions in a local Hölder norm: denoting by \(C(U)=C^0(U)\) the set of all functions on an open set \(U\subseteq {{\mathrm{{\mathbb {R}}}}}\) which are bounded, continuous and real-valued, we define for \(s>0\) the Hölder spaces

$$\begin{aligned} C^s(U)&:= \left\{ f\in C(U): \Vert f\Vert _{C^s(U)}:=\sum _{k=0}^{\lfloor s\rfloor }\sup _{x\in U}|f^{(k)}(x)|\right. \\&\left. \quad +\sup _{x,y\in U:x\ne y} \frac{|f^{(\lfloor s\rfloor )}(x)-f^{(\lfloor s\rfloor )}(y)|}{|x-y|^{ s-\lfloor s\rfloor }}<\infty \right\} \end{aligned}$$

where \(\lfloor s \rfloor \) denotes the largest integer strictly smaller than \(s\).

We assume throughout this article that the Lévy measure has finite second moments,

$$\begin{aligned} \int _{{{\mathrm{{\mathbb {R}}}}}} x^2 \nu (\,\mathrm {d}x) <\infty . \end{aligned}$$
(4)

This is equivalent to \({{\mathrm{{\mathbb {P}}}}}_\Delta \) having finite second moments for all \(\Delta >0\) [31].

For our main results we will rely on the following stronger assumption on \(\nu \). Slightly abusing notation we shall use the same symbol for a measure and its Lebesgue density, if the latter exists. Also we use \(\lesssim , \gtrsim , (\sim )\) to denote (two-sided) inequalities up to a multiplicative constant.

Assumption 1

  1. (a)

    For some \(\varepsilon >0\) we have

    $$\begin{aligned} \int _{{{\mathrm{{\mathbb {R}}}}}}|x|^{4+\varepsilon }\nu (d x)<\infty . \end{aligned}$$
  2. (b)

    The Lévy measure \(\nu \) has a Lebesgue density, also denoted by \(\nu \), and

    $$\begin{aligned} (1\wedge x^4)\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}}). \end{aligned}$$
  3. (c)

    The measure \(x^3{{\mathrm{{\mathbb {P}}}}}_{\Delta }\) admits a Lebesgue density, also denoted by \(x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \), satisfying, as \(\Delta \rightarrow 0\),

    $$\begin{aligned} ||x^3{{\mathrm{{\mathbb {P}}}}}_{\Delta } ||_\infty \lesssim \Delta . \end{aligned}$$
  4. (d)

    Let \(U\) be a neighbourhood of the origin and \(V \subseteq {{\mathrm{{\mathbb {R}}}}}\). For some \(s>0\) and some finite constants \(c_t>0\), \(t\in V\), we have

    $$\begin{aligned} \Vert g_t(-\cdot ) *(x^2\nu )\Vert _{C^s(U)} \leqslant c_t \qquad \text { with } g_{t}(x):= (1\wedge x^{-2})1\!\!1_{(-\infty ,t]}(x). \end{aligned}$$

Assumptions (a) and (b) are a moment condition and a mild regularity condition on the Lévy measure, respectively. Assumption (c) is the key condition and will be discussed in detail in Sect. 4.4. Here we just remark that, for instance under the assumption \(x^3\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\), this condition will be shown to be satisfied whenever the diffusion coefficient is positive (\(\sigma >0\)). Assumption (d) is used to control approximation theoretic properties of the distribution function of \(x^2\nu \). For global results (\(V={{\mathrm{{\mathbb {R}}}}}\)) we note that (d) is easily seen to be satisfied with a uniform constant \(c>0\) if \(x^2\nu \in C^{s-1}({{\mathrm{{\mathbb {R}}}}}), s \geqslant 1\).

Recall that a function \(l\) defined on \((0,\infty )\) is slowly varying at the origin if

$$\begin{aligned} \frac{l(tx)}{l(t)}\rightarrow 1,\qquad \text {as } t\rightarrow 0,~~\forall x>0. \end{aligned}$$

A function \(f\) is regularly varying at the origin with exponent \(p\in {{\mathrm{{\mathbb {R}}}}}\) if \(f\) is of the form

$$\begin{aligned} f(x)=x^p l(x) \end{aligned}$$

with \(l\) slowly varying at the origin. We denote the symmetrised Lévy density by \({\widetilde{\nu }}(x):=\nu ^+(x)+\nu ^-(-x)\), where \(\nu ^+=\nu 1\!\!1_{{{\mathrm{{\mathbb {R}}}}}^+}\) and \(\nu ^-=\nu 1\!\!1_{{{\mathrm{{\mathbb {R}}}}}^-}\).

Throughout the paper we write \(\rightarrow ^{\mathcal {L}}\) to denote convergence in distribution of random elements in a metric space as in Chapter 1 in [33].

2.2 The direct estimation approach

In the high-frequency regime \((\Delta \rightarrow 0)\) inference on \(\nu \) can be based on the following simple observation.

Lemma 2

If the Lévy measure \(\nu \) satisfies (4), then we have the weak convergence

$$\begin{aligned} x^2 \frac{{{\mathrm{{\mathbb {P}}}}}_\Delta }{\Delta } \rightarrow \sigma ^2 \delta _0 + x^2 \nu \end{aligned}$$
(5)

as \(\Delta \rightarrow 0\) in the sense that

$$\begin{aligned} \int _{{{\mathrm{{\mathbb {R}}}}}} f(x) x^2 \frac{{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)}{\Delta } \rightarrow \sigma ^2 f(0) + \int _{{{\mathrm{{\mathbb {R}}}}}} f(x) x^2 \nu (\,\mathrm {d}x) \end{aligned}$$
(6)

for every bounded continuous function \(f: {{\mathrm{{\mathbb {R}}}}}\rightarrow {{\mathrm{{\mathbb {R}}}}}\).

Starting with Lévy processes without diffusion component, that is, with \(\sigma =0\), the asymptotic identification (5) motivates a linear estimator of \(N(t)\) given by

$$\begin{aligned} {\widetilde{N}}_n(t):=\int _{-\infty }^{t}(1\wedge x^2)\frac{{{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}(\,\mathrm {d}x)}{\Delta }=\frac{1}{n\Delta }\sum _{k=1}^{n}\big (1\wedge X_k^2\big )1\!\!1_{(-\infty ,t]}(X_k), \end{aligned}$$
(7)

where \({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}=\frac{1}{n}\sum _{k=1}^{n}\delta _{X_k}\) is the empirical measure of the increments from (1).

Similarly, and including the case \(\sigma \ne 0\), one can estimate the function \({\mathcal {N}}\) by

$$\begin{aligned} \widetilde{{\mathcal {N}}}_n(t):=\int _ {{{\mathrm{{\mathbb {R}}}}}} f_t(x)\frac{{{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}(\,\mathrm {d}x)}{\Delta }\quad \text {with}\quad f_t(x):=\Biggl \{\begin{array}{ll} 1\!\!1_{(-\infty ,t]},&{} t<0\\ 1\!\!1_{[t,\infty )},&{} t>0. \end{array} \end{aligned}$$

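In code, both direct estimators reduce to one-line counting statistics over the observed increments. The following Python sketch (function names are ours) mirrors the estimator (7) and the display above.

```python
import numpy as np

def N_tilde(t, X, delta):
    """Direct estimator of N(t): increments below the threshold t,
    weighted by 1 ^ X_k^2 and normalised by n*delta, cf. (7)."""
    return np.sum(np.minimum(1.0, X**2) * (X <= t)) / (len(X) * delta)

def Ncal_tilde(t, X, delta):
    """Direct estimator of script-N(t) for t != 0: a pure counting
    statistic of the increments beyond the threshold t."""
    count = np.sum(X <= t) if t < 0 else np.sum(X >= t)
    return count / (len(X) * delta)
```

For instance, with increments \(X=(-2,0.5,1.5)\) and \(\Delta =0.1\) one gets \({\widetilde{N}}_n(2)=(1+0.25+1)/(3\cdot 0.1)=7.5\).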
We start with a theorem for the basic estimator \({\widetilde{N}}_n\).

Theorem 3

Let \(\sigma =0\) and grant Assumption 1 for \(V={{\mathrm{{\mathbb {R}}}}}\), for some \(s\in (0,2]\) and with uniform constant \(\sup _{t\in {{\mathrm{{\mathbb {R}}}}}}c_t=c<\infty \).

Assume either that

  1. (a)

    the density of \(x \nu \) exists and is of bounded variation, and the drift satisfies \(\gamma _0 := \gamma -\int x \nu (\,\mathrm {d}x)=0\);  or that

  2. (b)

    \({\widetilde{\nu }}(x)=\nu ^+(x)+\nu ^-(-x)\) is regularly varying at zero with exponent \(-(\beta +1)\), \(\beta \in (0,2), s \in (0,2-\beta )\).

If \(n\rightarrow \infty \) and \(\Delta _{n}\rightarrow 0\) such that

$$\begin{aligned} n\Delta _{n}\rightarrow \infty ,\quad \Delta _{n}=o\left( n^{-1/(s+1)}\right) \quad \text { and }\quad \log ^4(1/\Delta _n)=o(n\Delta _n), \end{aligned}$$

then

$$\begin{aligned} \sqrt{n\Delta _n}\big (\widetilde{N}_{n}-N\big )\rightarrow ^{\mathcal {L}}\mathbb {G} \quad \text {in}\quad \ell ^{\infty }({{\mathrm{{\mathbb {R}}}}}), \end{aligned}$$

where \(\mathbb {G}\) is a tight Gaussian random variable arising from the centred Gaussian process \(\{\mathbb {G}(t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) with covariance

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}[\mathbb {G}(t)\mathbb {G}(t')]=\int _{-\infty }^{t \wedge t'}(1\wedge x^4)\nu (\,\mathrm {d}x),\quad t,t'\in {{\mathrm{{\mathbb {R}}}}}. \end{aligned}$$

Since estimation at the origin \(t=0\) is included in the last theorem, the assumption \(\sigma =0\) is natural—the simple linear estimator \({\widetilde{N}}_n\) cannot distinguish between arbitrarily small jumps and a Brownian diffusion component. Moreover, setting the drift \(\gamma _0 =0\) in (a) rules out situations where the measure \({{\mathrm{{\mathbb {P}}}}}_\Delta \) has a discrete component \(\delta _{\Delta \gamma _0}\), which causes complications in the analysis. Simultaneous estimation of all parameters of the Lévy triplet without restrictions on \(\gamma \) and \(\sigma \) will be considered by non-linear methods in the next subsection.

The conditions (a) and (b) are required to show that the deterministic ‘bias’ term arising from the basic linear estimator is negligible in the limit distribution (Proposition 17). The case (a) covers many examples of finite activity Lévy processes as well as some limiting cases where the singularity of \(\nu \) at the origin behaves like \(|x|^{-1}\) (see Sect. 4.5 for examples). In contrast case (b) covers infinite activity processes with a singularity of the form \(|x|^{-1-\beta }, \beta \in (0,2)\). The assumption of regular variation of \(\widetilde{\nu }\) at zero is natural in all key examples considered in Sect. 4.5 below—typically the variation exponent will be closely related to the regularity \(s\) of \(N\), and we discuss in Sect. 4.1 how our parameter constraints on \(\beta \) and \(s\) are compatible.

When the origin is excluded from consideration, an argument of [13] can be used to obtain the following result for the linear estimator \({\widetilde{\mathcal {N}}}\), allowing also for \(\sigma \ne 0\):

Theorem 4

Grant Assumptions 1(a)–(c). Let \(\zeta >0\) and suppose that the Lévy density \(\nu \) is Lipschitz continuous in an open set \(V_0\) containing \(V=(-\infty ,-\zeta ]\cup [\zeta ,\infty )\). Let \(n\rightarrow \infty \) and \(\Delta _{n}\rightarrow 0\) such that

$$\begin{aligned} n\Delta _{n}\rightarrow \infty ,\quad \Delta _{n}=o(n^{-1/3})\quad \text { and }\quad \log ^4(1/\Delta _n)=o(n\Delta _n). \end{aligned}$$

Then

$$\begin{aligned} \sqrt{n\Delta _n}\big (\widetilde{{\mathcal {N}}}_{n}-{\mathcal {N}}\big )\rightarrow ^{\mathcal {L}}\mathbb {W} \quad \text {in}\quad \ell ^{\infty }(V), \end{aligned}$$

where \(\mathbb {W}\) is a tight Gaussian random variable arising from the centred Gaussian process \(\{\mathbb {W}(t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) with covariance, for \(f_t=1\!\!1_{(-\infty ,t]}\) for \(t<0\) and \(f_t=1\!\!1_{[t,\infty )}\) for \(t>0\),

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}[\mathbb {W}(t)\mathbb {W}(t')]=\int _{{{\mathrm{{\mathbb {R}}}}}}f_t(x)f_{t'}(x)\nu (\,\mathrm {d}x),\quad t,t'\in V. \end{aligned}$$

The estimators \({\widetilde{N}}_n,\widetilde{\mathcal {N}}_n\) are ‘linear’ in the observations \({{\mathrm{{\mathbb {P}}}}}_{\Delta , n}\), and their consistency relies on the assumption that \(\Delta _n\) tends to zero fast enough, in Theorems 3 and 4 at least of order \(\Delta _{n}=o(n^{-1/(s+1)})\) for \(s\in (0,2]\). In both theorems a weaker assumption than \(\Delta _{n}=o(n^{-1/3})\) cannot be expected in general: In typical situations the function \({{\mathrm{{\mathbb {P}}}}}(X_{\Delta _n}\leqslant t)\), \(t<0\), can be expressed in terms of \(\Delta _n\) as a series expansion

$$\begin{aligned} {{\mathrm{{\mathbb {P}}}}}(X_{\Delta _n}\leqslant t)=\nu ((-\infty ,t])\Delta _n+b_t\Delta _n^2+O(\Delta _n^3),~~b_t \in (0,\infty ). \end{aligned}$$

For a compound Poisson process this follows by conditioning on the number of jumps, but it also holds in more general infinite activity cases (see [14]). From the expansion we see that the approximation error \(\Delta _n^{-1}{{\mathrm{{\mathbb {P}}}}}(X_{\Delta _n}\leqslant t)-\nu ((-\infty ,t])\) will not decay faster than \(\Delta _n\), and the resulting requirement \(\Delta _n=o(1/\sqrt{n\Delta _n})\) is equivalent to \(\Delta _{n}=o(n^{-1/3})\).

2.3 The spectral estimation approach

Instead of relying on \(\Delta \rightarrow 0\) one can identify the Lévy measure by the Lévy–Khintchine formula

$$\begin{aligned} \varphi _\Delta (u)&:={{\mathrm{{\mathbb {E}}}}}[e^{iuX_k}]=e^{\Delta \psi (u)},\nonumber \\ \psi (u)&=-\frac{\sigma ^2u^2}{2}+i\gamma u+\int _{{{\mathrm{{\mathbb {R}}}}}}\big (e^{iux}-1-iux\big )\nu (\,\mathrm {d}x),\quad u \in {{\mathrm{{\mathbb {R}}}}}, \end{aligned}$$
(8)

which we give here in Kolmogorov’s version (valid under (4), see (8.8) in [31]). Differentiating the characteristic exponent \(\psi (u)=\Delta ^{-1}\log \varphi _\Delta (u)\) twice, one sees

$$\begin{aligned} \psi ''(u)&=\frac{\varphi _\Delta ''(u)\varphi _\Delta (u)-(\varphi _\Delta ')^2(u)}{\Delta \varphi _\Delta ^2(u)} =-\sigma ^2-{{\mathrm{{\mathcal {F}}}}}[x^2\nu ](u), \end{aligned}$$
(9)

where \({{\mathrm{{\mathcal {F}}}}}f(u):=\int e^{iux}f(x)\,\mathrm {d}x\) and \({{\mathrm{{\mathcal {F}}}}}\mu (u) := \int e^{iux}\mu (\,\mathrm {d}x)\) for any \(f\in L^1({{\mathrm{{\mathbb {R}}}}})\cup L^2({{\mathrm{{\mathbb {R}}}}})\) and any finite measure \(\mu \), respectively, denotes the Fourier transform. If \({{\mathrm{{\mathcal {F}}}}}^{-1}\) is the inverse Fourier transform we hence have

$$\begin{aligned} -{\mathcal {F}}^{-1}[\psi ''] = \sigma ^2 \delta _0 + x^2 \nu . \end{aligned}$$
(10)

In contrast to (5) this identification of \(\nu \) is nonlinear in \(\varphi _\Delta = {{\mathrm{{\mathcal {F}}}}}{{\mathrm{{\mathbb {P}}}}}_{\Delta }\), but has the remarkable advantage of being nonasymptotic and valid for all \(\Delta >0\), without relying on a high-frequency approximation \(\Delta \rightarrow 0\). This was exploited in [27] to show that a plug-in of the empirical characteristic function \({\mathcal {F}} {{\mathrm{{\mathbb {P}}}}}_{\Delta , n}\) into (9) can result, for a (naturally) restricted class of Lévy processes, in efficient recovery of \({\mathcal {N}}(t), t \ne 0,\) without the requirement \(\Delta \rightarrow 0\). In the low-frequency case only Lévy processes without diffusion component can be covered. Our high-frequency setting allows us to drop this (otherwise necessary) restriction and to treat Lévy processes with diffusion component and with Lévy measures from a much wider class.
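The identity (9) can be checked numerically in a concrete case. For a compound Poisson process with intensity \(\lambda \) and standard normal jumps (\(\sigma =0\), \(\gamma =0\), \({{\mathrm{{\mathbb {E}}}}}[J]=0\)) one has \(\psi (u)=\lambda (e^{-u^2/2}-1)\), and a finite-difference second derivative of \(\psi \) should match \(-{{\mathrm{{\mathcal {F}}}}}[x^2\nu ](u)\) computed by quadrature. The parameter values in this sketch are arbitrary.

```python
import numpy as np

lam, u = 2.0, 1.3      # jump intensity and evaluation point (arbitrary)

# characteristic exponent of a compound Poisson process with N(0,1) jumps
# (sigma = 0, gamma = 0, E[J] = 0): psi(u) = lam * (exp(-u^2/2) - 1)
psi = lambda v: lam * (np.exp(-v**2 / 2.0) - 1.0)

# psi''(u) by a central second difference
h = 1e-4
psi2 = (psi(u + h) - 2.0 * psi(u) + psi(u - h)) / h**2

# F[x^2 nu](u) = lam * E[J^2 e^{iuJ}] by quadrature (real by symmetry)
x = np.linspace(-12.0, 12.0, 200001)
dx = x[1] - x[0]
dens = lam * np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)
fourier = np.sum(x**2 * np.cos(u * x) * dens) * dx

# identity (9) with sigma = 0: psi''(u) + F[x^2 nu](u) should vanish
```

Indeed both sides equal \(\lambda (u^2-1)e^{-u^2/2}\) here, so the difference vanishes up to discretisation error.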

Replacing \(\varphi _\Delta (u)\) in (9) by the empirical characteristic function of the observed increments,

$$\begin{aligned} \varphi _{\Delta ,n}(u):={{\mathrm{{\mathcal {F}}}}}{{\mathrm{{\mathbb {P}}}}}_{\Delta , n} (u) = \frac{1}{n}\sum _{k=1}^ne^{iuX_k}, \end{aligned}$$

(and its derivatives \(\varphi _{\Delta , n}^{(i)}, i=1,2\), respectively), we obtain an empirical plug-in estimate \({\widehat{\psi }}_n''\) of \(\psi ''\). Recalling the definitions of \(g_t, f_t\) in Assumption 1(d) and in Theorem 4, respectively, the resulting estimators of \(N,{\mathcal {N}}\) are given by

$$\begin{aligned} {\widehat{N}}_n(t)&:=\int _{{{\mathrm{{\mathbb {R}}}}}}g_t(x){{\mathrm{{\mathcal {F}}}}}^{-1}\left[ (-{\widehat{\psi }}''_n-{\widehat{\sigma }}^2){\mathcal {F}} K_h \right] (x)\,\mathrm {d}x,\nonumber \\ \widehat{\mathcal {N}}_n(t)&:=\int _{{{\mathrm{{\mathbb {R}}}}}}x^{-2}f_t(x){{\mathrm{{\mathcal {F}}}}}^{-1}\left[ (-{\widehat{\psi }}''_n-{\widehat{\sigma }}^2){\mathcal {F}} K_h \right] (x)\,\mathrm {d}x. \end{aligned}$$
(11)

Here \(K_h\) is a kernel such that \({{\mathrm{{\mathcal {F}}}}}K_h\) has compact support, specified in detail below, ensuring in particular that \({\widehat{N}}_n,\widehat{\mathcal {N}}_n\) are well-defined (on sets of probability approaching one). Moreover, \({\widehat{\sigma }}^2\) is any pilot estimate of \(\sigma ^2\). We can estimate \(\sigma ^2\) for instance as in [21] by

$$\begin{aligned} {\widehat{\sigma }}^2:=-\frac{2}{\Delta u_n^2}\log (|\varphi _{\Delta ,n}(u_n)|)\qquad \text { with }u_n:=\sqrt{\frac{2c_0\log (n)}{\Delta \sigma _{\max }^2}}, \end{aligned}$$
(12)

where \(c_0>0\) is a suitable numerical constant and \(\sigma _{\max }>0\) quantifies an assumed lower bound on the characteristic function. Under suitable conditions Proposition 13 below entails that the estimator \({\widehat{\sigma }}^2\) satisfies

$$\begin{aligned} {\widehat{\sigma }}^2-\sigma ^2=o_P\big ((n\Delta )^{-1/2}\big ), \end{aligned}$$
(13)

and hence is negligible in the limit process \({\mathbb {G}}\) in the next theorem. While the construction of an optimal estimator of \(\sigma ^2\) in the setting considered here is a topic of independent interest, Theorem 5 below will hold for any plug-in estimator that satisfies (13).
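The plug-in quantities are straightforward to compute from the data. The following Python sketch (our own function names; \(c_0\) and \(\sigma _{\max }\) must be supplied by the user) forms the empirical characteristic function with its derivatives, the estimate of \(\psi ''(u)\) via the quotient in (9), and a pilot estimate of \(\sigma ^2\). Since \(|\varphi _{\Delta ,n}|\leqslant 1\), the logarithm is nonpositive and the minus sign makes the estimate nonnegative in expectation.

```python
import numpy as np

def ecf_and_derivs(u, X):
    """Empirical characteristic function phi_{Delta,n}(u) and its first
    two derivatives, computed from the observed increments X."""
    e = np.exp(1j * u * X)
    return e.mean(), (1j * X * e).mean(), (-(X**2) * e).mean()

def psi2_hat(u, X, delta):
    """Plug-in estimate of psi''(u) via the quotient in (9)."""
    phi, phi1, phi2 = ecf_and_derivs(u, X)
    return (phi2 * phi - phi1**2) / (delta * phi**2)

def sigma2_hat(X, delta, n, c0=0.1, sigma_max=1.0):
    """Pilot estimate of sigma^2 at the spectral point u_n from (12);
    |phi| <= 1 makes log|phi| <= 0, hence the minus sign."""
    u_n = np.sqrt(2.0 * c0 * np.log(n) / (delta * sigma_max**2))
    phi = np.exp(1j * u_n * X).mean()
    return -2.0 / (delta * u_n**2) * np.log(np.abs(phi))
```

For pure Brownian increments with \(\sigma =1\) one has \(\psi ''\equiv -1\), and both plug-in quantities recover this up to Monte Carlo error.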

We regularise with a band-limited kernel \(K_h:=h^{-1}K(h^{-1}{\scriptstyle \bullet } )\) of bandwidth \(h>0\). The following properties of \(K\) are assumed:

$$\begin{aligned}&\int _{{{\mathrm{{\mathbb {R}}}}}} K(x)\,\mathrm {d}x=1,\qquad \int x^lK(x)\,\mathrm {d}x=0 \quad \text { for }l=1,\ldots ,p,\nonumber \\&{{\mathrm{\hbox {supp}}}}{\mathcal {F}} K\subseteq [-1,1],\quad x^{p+1}K(x)\in L^1({{\mathrm{{\mathbb {R}}}}}), ~~p \in {\mathbb {N}}. \end{aligned}$$
(14)

The main result for the spectral estimators is the following theorem, where \(\mathbb {G}\) and \(\mathbb {W}\) are tight Gaussian random variables arising from the same Gaussian processes as in Theorems 3 and 4, respectively. For Part (ii) we recall the definition \(f_t=1\!\!1_{(-\infty ,t]}\) for \(t<0\) and \(f_t=1\!\!1_{[t,\infty )}\) for \(t>0\).

Theorem 5

Grant Assumptions 1(a)–(c) and let \(s>0\). Let the kernel satisfy (14) with \(p\geqslant s\vee 2\) and choose \(h_n\sim \Delta _n^{1/2}\). Let either \(\sigma ^2\) be known (in which case \({\widehat{\sigma }}^2 := \sigma ^2\)), or let \({\widehat{\sigma }}^2\) be any estimator satisfying (13). Suppose \(n\rightarrow \infty \) and \(\Delta _{n}\rightarrow 0\) such that

$$\begin{aligned} n\Delta _{n}\rightarrow \infty ,\quad \Delta _{n}=o(n^{-1/(s+1)})\quad \text { and }\quad \log ^4(1/\Delta _n)=o(n\Delta _n). \end{aligned}$$
  1. (i)

    Grant Assumption 1(d) for \(s>0\), for \(V={{\mathrm{{\mathbb {R}}}}}\) and for constants \(c_t\) with \(\sup _{t\in {{\mathrm{{\mathbb {R}}}}}}c_t=c<\infty \). Then

    $$\begin{aligned} \sqrt{n\Delta _n}\big (\widehat{N}_{n}-N\big )\rightarrow ^{\mathcal {L}}\mathbb {G} \quad \text {in}\,\ell ^{\infty }({{\mathrm{{\mathbb {R}}}}}). \end{aligned}$$
  2. (ii)

    Grant Assumption 1(d) for \(s>0\), for \(g_t(x)=x^{-2}f_t(x)\), for \(V=(-\infty ,-\zeta ]\cup [\zeta ,\infty )\), \(\zeta >0\), and for constants \(c_t\) with \(\sup _{t\in V}c_t=c'<\infty \). Then

    $$\begin{aligned} \sqrt{n\Delta _n}\big (\widehat{\mathcal {N}}_{n}-\mathcal {N}\big )\rightarrow ^{\mathcal {L}} \mathbb {W} \quad \text {in}\,\ell ^{\infty }(V). \end{aligned}$$

2.4 Limit process and statistical applications

The continuous mapping theorem with the usual sup-norm \(\Vert \cdot \Vert _\infty \) combined with Theorems 3 and 5 yields in particular the limit theorems, as \(n \rightarrow \infty \),

$$\begin{aligned} \sqrt{n \Delta } \Vert {\widetilde{N}}_n -N\Vert _\infty \rightarrow ^{{\mathcal {L}}} \Vert {\mathbb {G}}\Vert _\infty \quad \text { and } \quad \sqrt{n \Delta } \Vert {\widehat{N}}_n -N\Vert _\infty \rightarrow ^{{\mathcal {L}}} \Vert {\mathbb {G}}\Vert _\infty . \end{aligned}$$
(15)

This can be used to construct Kolmogorov–Smirnov tests for Lévy measures and global confidence bands for the function \(N\), as we explain now.

For absolutely continuous Lévy measures \(\nu \) the Gaussian random function \(({\mathbb {G}}(t):t\in {{\mathrm{{\mathbb {R}}}}})\) can be realised as a version of

$$\begin{aligned} {\mathbb {G}}(t)&={\mathbb {B}} \left( \int _{-\infty }^t(1\wedge x^4)\nu (\,\mathrm {d}x)\right) , \quad t \in {{\mathrm{{\mathbb {R}}}}}, \end{aligned}$$
(16)

where \({\mathbb {B}}\) is a standard Brownian motion. An alternative representation is given by \({\mathbb {G}}(t)=\int _{-\infty }^{t}(1\wedge x^2) \nu (x)^{1/2}\,\mathrm {d}{\mathbb {B}}(x)\), where \({\mathbb {B}}\) is a two-sided Brownian motion. We have

$$\begin{aligned} {{\mathrm{{\mathbb {P}}}}}\left( \sup _{s\leqslant t}|{\mathbb {G}}(s)|\!\geqslant \! a\!\right) \!=\!{{\mathrm{{\mathbb {P}}}}}\left( \left( \int _{-\infty }^t(1\wedge x^4)\nu (\,\mathrm {d}x)\right) ^{1/2}\max _{s\in [0,1]}|{\mathbb {B}}(s)|\!\geqslant \! a\!\right) ,\quad t \!\in \! {{\mathrm{{\mathbb {R}}}}}\cup \{\infty \}, \end{aligned}$$

so that quantiles of the distribution of \(\Vert {\mathbb {G}}\Vert _\infty \) can be calculated. For example a global asymptotic confidence band for \(N\) can be constructed in the setting of Theorems 3 and 5 by defining

$$\begin{aligned} {\widetilde{C}}_n(t)&:=\left[ {\widetilde{N}}_n(t)-\frac{{\widetilde{d}} q_{\alpha }}{\sqrt{n \Delta }}, {\widetilde{N}}_n(t)+\frac{{\widetilde{d}} q_{\alpha }}{\sqrt{n \Delta }}\right] ,\\ {\widehat{C}}_n(t)&:=\left[ {\widehat{N}}_n(t)-\frac{{\widehat{d}} q_{\alpha }}{\sqrt{n \Delta }}, {\widehat{N}}_n(t)+\frac{{\widehat{d}} q_{\alpha }}{\sqrt{n\Delta }}\right] , \quad t \in {{\mathrm{{\mathbb {R}}}}}, \end{aligned}$$

with consistent estimators

$$\begin{aligned} {\widetilde{d}}&:=\left( \frac{1}{n\Delta }\sum _{k=1}^{n}(1\wedge X_k^4)\right) ^{1/2},\\ {\widehat{d}}&:= \left( \int _{{{\mathrm{{\mathbb {R}}}}}}(x^{-2}\wedge x^2){{\mathrm{{\mathcal {F}}}}}^{-1}\left[ (-{\widehat{\psi }}''_n-{\widehat{\sigma }}^2){\mathcal {F}} K_h \right] (x)\,\mathrm {d}x\right) ^{1/2} \end{aligned}$$

of the standard deviation \((\int _{{{\mathrm{{\mathbb {R}}}}}}(1\wedge x^4)\nu (\,\mathrm {d}x))^{1/2}\), and with \(q_\alpha \) the upper \(\alpha \)–quantile, \(0<\alpha <1\), of the distribution of \(\max _{s\in [0,1]}|{\mathbb {B}}(s)|\) (see Example X.5(c) in [10] for its well-known formula). For the confidence band \(C_n\) equal to either

$$\begin{aligned} {\widetilde{C}}_n := \left\{ f: f(t) \in {\widetilde{C}}_n(t) ~ \forall t \in {{\mathrm{{\mathbb {R}}}}}\right\} , ~\text { or }~{\widehat{C}}_n := \left\{ f: f(t) \in {\widehat{C}}_n(t) ~ \forall t \in {{\mathrm{{\mathbb {R}}}}}\right\} , \end{aligned}$$

Theorems 3 and 5 imply, under the respective assumptions, that the asymptotic coverage probability of \(C_n\) equals

$$\begin{aligned} \lim _{n\rightarrow \infty }{{\mathrm{{\mathbb {P}}}}}\left( N(t)\in C_n(t) \; \forall t\in {{\mathrm{{\mathbb {R}}}}}\right) =1-\alpha . \end{aligned}$$

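Since \(\max _{s\in [0,1]}|{\mathbb {B}}(s)|\) has the classical alternating-series distribution function \({{\mathrm{{\mathbb {P}}}}}(\max _{s\in [0,1]}|{\mathbb {B}}(s)|\leqslant a)=\frac{4}{\pi }\sum _{k\geqslant 0}\frac{(-1)^k}{2k+1}\exp (-(2k+1)^2\pi ^2/(8a^2))\), the quantile \(q_\alpha \) and the half-width of the band \({\widetilde{C}}_n(t)\) can be computed as in the following sketch (function names and bisection tolerances are our own choices).

```python
import numpy as np

def cdf_max_abs_bm(a, terms=100):
    """P(max_{0<=s<=1} |B(s)| <= a) for a standard Brownian motion B,
    via the classical alternating series."""
    k = np.arange(terms)
    s = (-1.0)**k / (2*k + 1) * np.exp(-(2*k + 1)**2 * np.pi**2 / (8.0 * a**2))
    return 4.0 / np.pi * np.sum(s)

def q_alpha(alpha, lo=1e-3, hi=10.0):
    """Upper alpha-quantile of max|B| by bisection on the cdf."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cdf_max_abs_bm(mid) < 1.0 - alpha else (lo, mid)
    return 0.5 * (lo + hi)

def band_halfwidth(X, delta, alpha):
    """Half-width d_tilde * q_alpha / sqrt(n*delta) of the band C_tilde_n."""
    n = len(X)
    d = np.sqrt(np.sum(np.minimum(1.0, X**4)) / (n * delta))
    return d * q_alpha(alpha) / np.sqrt(n * delta)
```

For \(\alpha =0.1\) the bisection gives \(q_\alpha \approx 1.96\).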
Theorems 3 and 5 allow likewise the construction of tests: If \(H_0\) is a set of Lévy measures, let \({\mathcal {D}}\) be the set of the corresponding cumulative distribution functions of the form (2). We define \(T_n=1\!\!1\{{\mathcal {D}}\cap C_n=\varnothing \}\) to reject \(H_0\) and accept \(H_0\) when \(T_n=0\). This test has asymptotic level \(\alpha \): if \({{\mathrm{{\mathbb {P}}}}}_\vartheta \) is the law of a Lévy process from \(\vartheta \in H_0\) then we have

$$\begin{aligned} \lim _{n\rightarrow \infty }{{\mathrm{{\mathbb {P}}}}}_\vartheta (T_n\ne 0)\leqslant \alpha , \end{aligned}$$

assuming \(H_0\) satisfies the assumptions of Theorem 3 or 5.

2.5 Numerical example

Let us briefly illustrate the finite sample performance of the two estimation approaches and their corresponding confidence bands. We apply the procedures to two standard examples of pure jump Lévy processes: a Gamma process and a normal inverse Gaussian (NIG) process. The empirical coverage of the confidence bands reveals the finite sample level of the associated Kolmogorov–Smirnov tests and the size of the bands indicates the power of the tests.

The Gamma process has infinite, but relatively small jump activity (its Blumenthal–Getoor index equals zero). Its Lévy measure is given by the Lebesgue density \( \nu (x)=\frac{c}{x}e^{-\lambda x}, x>0,\) and we choose \(c=30\) and \(\lambda =1\) here. The NIG process can be constructed by subordinating a diffusion with volatility \(s>0\) and drift \(\vartheta \in {{\mathrm{{\mathbb {R}}}}}\) by an inverse Gaussian process with variance \(\kappa >0\). The resulting infinite variation process has Blumenthal–Getoor index equal to one. The NIG process admits an explicit formula for the jump measure and for its law we apply the simulation algorithm from [7], choosing \(s=1.5, \vartheta =0.1\) and \(\kappa =0.5\). Both processes satisfy the assumptions of Theorems 3 and 5, cf. Sect. 4.5.

We simulate \(n=2{,}000\) increments with observation distance \(\Delta =0.01\). For the spectral estimator we apply a flat top kernel and the universal bandwidth choice \(h=\sqrt{\Delta }\) which turned out to perform well in a variety of settings. Figure 1 shows the true distribution-type function \(N\), the direct estimator \({\widetilde{N}}_n\) from (7) and the spectral estimator \({\widehat{N}}_n\) from (11) for 50 simulations. In each setting the confidence band for level \(\alpha =0.9\), as constructed in the previous section, is plotted for the first simulation result. We clearly see the higher activity of small jumps of the NIG process from the linear growth of \(N\) at zero. On the other hand, the choice of our process parameters yields more pronounced tails of the jump measure for the gamma process.
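The Gamma-process experiment is easy to reproduce, since for the Lévy density \(\nu (x)=\frac{c}{x}e^{-\lambda x}\) the increment over time \(\Delta \) is Gamma\((c\Delta , 1/\lambda )\)-distributed. The sketch below simulates the increments with the stated parameters and compares the direct estimator with the true \(N\) obtained by quadrature; the random seed is arbitrary.

```python
import numpy as np

# Gamma process with Levy density nu(x) = (c/x) e^{-lam*x}, x > 0:
# the increment over time delta is Gamma(c*delta, 1/lam) distributed
c, lam, delta, n = 30.0, 1.0, 0.01, 2000
rng = np.random.default_rng(0)
X = rng.gamma(shape=c * delta, scale=1.0 / lam, size=n)

def N_tilde(t):
    """Direct estimator of N(t) from the simulated increments."""
    return np.sum(np.minimum(1.0, X**2) * (X <= t)) / (n * delta)

def N_true(t, m=200000):
    """True N(t) by a midpoint rule; min(1,x^2)*(c/x)e^{-lam*x} -> 0 at 0."""
    x = np.linspace(1e-12, t, m)
    xm = 0.5 * (x[1:] + x[:-1])
    f = np.minimum(1.0, xm**2) * c * np.exp(-lam * xm) / xm
    return np.sum(f * np.diff(x))
```

Comparing N_tilde and N_true on a grid of thresholds reproduces the qualitative bias of the direct estimator at \(\Delta =0.01\) discussed below.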

Fig. 1

Direct estimator \({\widetilde{N}}_n\) (left) and spectral estimator \({\widehat{N}}_n\) (right) for the Gamma (top) and NIG process (bottom). Each time 50 estimators (light blue) and the true distribution function (black) are shown. One estimator (blue, solid) with its asymptotic 0.9-confidence band (blue, dashed) is highlighted (colour figure online)

By construction, the direct estimator is not smooth. For the Gamma process it possesses a significant bias: the intensity of the small jumps is systematically underestimated, which results in an overestimation of the larger jumps and thus too large values of \({\widetilde{N}}_n(t)\) for large \(t\). For the choice \(\Delta =0.001\) this bias of the direct estimator is already negligible. In the simulations of the NIG process, \({\widetilde{N}}_n\) achieves good results that agree with the asymptotic theory. For the Gamma process the empirical coverage of the \(\alpha =0.9\) confidence bands in 500 Monte Carlo iterations is 0.86 for the spectral estimator, while the direct estimator has an empirical coverage of only 0.59, reflecting the bias problem mentioned above. For the NIG process both estimators yield bands covering the true \(N\) uniformly in \(92\,\%\) of the cases.

3 Unifying empirical process

The key probabilistic challenge in the proofs of Theorems 3–5 is a uniform central limit theorem for certain smoothed empirical processes arising from the sampled increments (1). We show in this section how these processes arise naturally for both estimation approaches considered here.

We will consider slightly more general objects than the distribution function \(N(t) = \int _{-\infty }^t (1 \wedge x^2) \nu (\,\mathrm {d}x)\)—the truncation at one in \((1 \wedge x^2)\) is somewhat arbitrary and, in particular, not smooth. Other truncations such as \(x^2/(1+x^2)\), or variations thereof can be of interest. To accommodate such examples we thus consider recovery of the functionals

$$\begin{aligned} N_\rho (t) = \int _{-\infty }^t \rho (x) x^2\nu (\,\mathrm {d}x), \quad t \in {{\mathrm{{\mathbb {R}}}}}, \end{aligned}$$
(17)

where the ‘clipping function’ \(\rho \) satisfies the following condition:

Assumption 6

The function \(\rho \) satisfies \(0< \rho (x) \leqslant C(1 \wedge x^{-2})\) for all \(x \in {{\mathrm{{\mathbb {R}}}}}\) and some constant \(0<C<\infty \). Moreover, \(\rho , x\rho \) are Lipschitz continuous functions of bounded variation (i.e., their weak derivative is equal to a finite signed measure).

This covers the above examples [with either \(\rho (x) = 1 \wedge x^{-2}\) or \(\rho (x) = 1/(1+x^2)\)]. In the definition of the basic estimator (7) and the kernel estimator (11), we only need to replace \((1\wedge x^2)1\!\!1_{(-\infty ,t]}(x)\) by \(x^2g_t(x)\) where now

$$\begin{aligned} g_t(x):=\rho (x)1\!\!1_{(-\infty ,t]}(x), \end{aligned}$$
(18)

replacing also \(g_t\) in Assumption 1. The covariance of the limit process in Theorems 3 and 5 then changes to

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}[\mathbb {G}(s)\mathbb {G}(t)]=\int _{{{\mathrm{{\mathbb {R}}}}}}x^{4}g_s(x)g_t(x)\nu (\,\mathrm {d}x) \end{aligned}$$

and the corresponding representation of \({\mathbb {G}}\) in terms of a reparametrised Brownian motion is

$$\begin{aligned} {\mathbb {G}}(t)={\mathbb {B}}\left( \int _{{{\mathrm{{\mathbb {R}}}}}}x^4g_t^2(x)\nu (\,\mathrm {d}x)\right) ={\mathbb {B}}\left( \int _{{{\mathrm{{\mathbb {R}}}}}}x^4\rho ^2(x)1\!\!1_{(-\infty ,t]}(x)\nu (\,\mathrm {d}x)\right) \!. \end{aligned}$$
(19)

Let us turn to the main purpose of this section: We start with the direct estimator \({\widetilde{N}}_n\), which is easier to analyse. The estimation error of \(\widetilde{N}_{n}\) can be decomposed as follows

$$\begin{aligned} \widetilde{N}_{n}(t)-N_{\rho }(t)&= \int _{{{\mathrm{{\mathbb {R}}}}}}x^{2}g_t(x)\big (\Delta ^{-1}{{\mathrm{{\mathbb {P}}}}}_{ \Delta ,n} (\mathrm {d}x)-\nu (\mathrm {d}x)\big )\nonumber \\&= \int _{{{\mathrm{{\mathbb {R}}}}}}x^2g_t(x) \big (\Delta ^{-1}{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\mathrm {d}x)- \nu (\mathrm {d}x)\big ) \!+\!\int _{{{\mathrm{{\mathbb {R}}}}}}g_t(x) \frac{x^{2}}{\Delta }({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta })(\,\mathrm {d}x)\nonumber \\&=: B(t)+S(t), \end{aligned}$$
(20)

for any \(t\in {{\mathrm{{\mathbb {R}}}}}\). The first term \(B\) is a deterministic approximation error and the rough idea for controlling it is to view \({{\mathrm{{\mathbb {P}}}}}_\Delta \) as an approximate identity and to use arguments similar to those for the approximation error of a kernel estimator. The second term \(S\) is the main stochastic error term, driven by the empirical process

$$\begin{aligned} \sqrt{n\Delta }\left( \frac{x^{2}}{\Delta }{{\mathrm{{\mathbb {P}}}}}_{\Delta ,n} -\frac{x^{2}}{\Delta }{{\mathrm{{\mathbb {P}}}}}_{\Delta }\right) =\sqrt{\frac{n}{\Delta }}x^{2} \big ({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta }\big ), \end{aligned}$$
(21)

where the scaling follows from the intuitive observation that the \(X_k\)’s are drawn i.i.d. from law \({{\mathrm{{\mathbb {P}}}}}_{\Delta }\) and hence satisfy, using that \({{\mathrm{{\mathbb {P}}}}}_\Delta \) is an infinitely divisible distribution,

$$\begin{aligned} {{\mathrm{\hbox {Var}}}}\left( \sum _{k=1}^n X_k\right) = {{\mathrm{\hbox {Var}}}}(L_{n \Delta }) = n \Delta {{\mathrm{\hbox {Var}}}}(L_1). \end{aligned}$$
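This scaling identity can be checked by direct simulation; the sketch below uses a Gamma process (with illustrative parameters), for which \({{\mathrm{\hbox {Var}}}}(L_1)\) is available in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Gamma process: L_t ~ Gamma(shape=a*t, scale=1/lam), hence Var(L_1) = a/lam^2
# and each increment X_k ~ Gamma(a*Delta, 1/lam). Parameter values are illustrative.
a, lam = 2.0, 3.0
n, Delta = 500, 0.02
var_L1 = a / lam ** 2

# Monte Carlo estimate of Var(sum_k X_k) over many independent replications
reps = 20000
sums = rng.gamma(shape=a * Delta, scale=1.0 / lam, size=(reps, n)).sum(axis=1)
print(sums.var(), n * Delta * var_L1)  # both close to 10 * (2/9) = 2.22...
```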

Turning our attention to the second estimator we decompose \({\widehat{N}}_n-N_\rho \) into three error terms, using (9):

$$\begin{aligned} {\widehat{N}}_n(t)-N_\rho (t)&= \int _{{{\mathrm{{\mathbb {R}}}}}}\left( g_t(x){{\mathrm{{\mathcal {F}}}}}^{-1}\left[ (-{\widehat{\psi }}''_n-{\widehat{\sigma }}^2){\mathcal {F}} K_h \right] (x)\,\mathrm {d}x-x^2 g_t(x)\nu (\,\mathrm {d}x)\right) \nonumber \\&=\int _{{{\mathrm{{\mathbb {R}}}}}} g_t(x)\big (K_h*\big (y^2\nu (\,\mathrm {d}y)\big )-x^2\nu \big )(\,\mathrm {d}x)\nonumber \\&\quad +\int _{{{\mathrm{{\mathbb {R}}}}}} g_t(x){{\mathrm{{\mathcal {F}}}}}^{-1}\Big [{\mathcal {F}} K_h(u)\big (\psi ''(u)-{\widehat{\psi }}_n''(u)\big )\Big ](x)\,\mathrm {d}x\nonumber \\&\quad +(\sigma ^2-{\widehat{\sigma }}^2)\int _{{{\mathrm{{\mathbb {R}}}}}}g_t(x)K_h(x)\,\mathrm {d}x. \end{aligned}$$
(22)

The first term is a deterministic approximation error, which can be bounded using the smoothness condition in Assumption 1(d). The last term will be negligible since we assume that \({\widehat{\sigma }}^2\) converges to \(\sigma ^2\) at a faster rate than \(1/\sqrt{n\Delta }\). The key stochastic term is the second one. Compared to the basic estimator \({\widetilde{N}}_n\) we face the additional difficulty that \(\widehat{\psi }_n''\) depends nonlinearly on \({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}\). The following result shows that even after linearisation the resulting term still differs from the basic process \({\widetilde{N}}_n -{{\mathrm{{\mathbb {E}}}}}{\widetilde{N}}_n\) in that it performs a division by \(\varphi _\Delta \) in the spectral domain.

Proposition 7

Grant Assumption 1(a) and assume \(\sup _{u\in [-1/h_n,1/h_n]}|\varphi _{\Delta _n}(u)|^{-1}\lesssim 1\) for some \(h_n \rightarrow 0, \Delta _n \rightarrow 0\). Let the function

$$\begin{aligned} m(u):=\frac{{\mathcal {F}} K(h_nu)}{\varphi _{\Delta _n}(u)} \end{aligned}$$

satisfy \(\Vert m\Vert _\infty \lesssim 1\) uniformly for \(h_n,\Delta _n\rightarrow 0\). If \(n\Delta _n\rightarrow \infty \) and \(h_n\rightarrow 0\) with \(h_n\gtrsim \Delta _n^{1/2}\), then we have

$$\begin{aligned} \int _{{{\mathrm{{\mathbb {R}}}}}} g_t(x){{\mathrm{{\mathcal {F}}}}}^{-1}\Big [{\mathcal {F}} K_{h_n}(u)(\psi -{\widehat{\psi }}_n)''(u)\Big ](x)\,\mathrm {d}x=M_{\Delta ,n}+o_P(1/\sqrt{n\Delta _n}), \end{aligned}$$

where

$$\begin{aligned} M_{\Delta ,n}&:=-\Delta _n^{-1}\int _{{{\mathrm{{\mathbb {R}}}}}} g_t(x)\mathcal{F}^{-1}[\varphi _{\Delta _n}^{-1}(\varphi _{\Delta _n,n}''-\varphi _{\Delta _n}'')\mathcal{F}K_{h_n}](x)\,\mathrm {d}x\nonumber \\&= \int _{{{\mathrm{{\mathbb {R}}}}}} g_t(x) \left( \frac{x^2}{\Delta }({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta })\right) *{{\mathrm{{\mathcal {F}}}}}^{-1}[m](\,\mathrm {d}x). \end{aligned}$$
(23)

Note that the assumption \(\Vert m\Vert _ \infty \lesssim 1\) uniformly for \(h_n, \Delta _n\rightarrow 0\) is valid if \(K\) is as in (14), \(h_n\sim \sqrt{\Delta _n}\) and \( \sup _{u\in [-1/h_n,1/ h_n]} |\varphi _{\Delta _n}(u)|^{-1} \lesssim 1\).

We refer to \(M_{\Delta ,n}\) as the main stochastic term. To accommodate both (21) and (23) we now study empirical processes

$$\begin{aligned} \sqrt{\frac{n}{\Delta }}( x^2 ({{\mathrm{{\mathbb {P}}}}}_{\Delta , n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta })) *{\mathcal {F}}^{-1} m \end{aligned}$$
(24)

for general \((n,\Delta )\)-dependent Fourier multipliers \(m: {{\mathrm{{\mathbb {R}}}}}\rightarrow {\mathbb {C}}\) satisfying the following condition.

Assumption 8

For every \(n, \Delta \) the twice differentiable functions \(m = m_{n, \Delta }: {{\mathrm{{\mathbb {R}}}}}\rightarrow {\mathbb {C}}\) are either such that

  1. (a)

    \({{\mathrm{{\mathcal {F}}}}}^{-1}[m_{n, \Delta }]\), \({{\mathrm{{\mathcal {F}}}}}^{-1}[m_{n, \Delta }']\) are finite signed measures with uniformly bounded total variations, or such that

  2. (b)

    \({{\mathrm{{\mathcal {F}}}}}^{-1}[m_{n, \Delta }]\) is real-valued and \(m_{n, \Delta }\) is supported in \([-C\Delta ^{-1/2}, C \Delta ^{-1/2}]\) for some fixed constant \(C>0\).

  3. (c)

    In addition to (a) or (b), letting \(\Delta =\Delta _n \rightarrow 0\) as \(n \rightarrow \infty \) we assume that \(m_{n,\Delta } \rightarrow 1\) pointwise on \({{\mathrm{{\mathbb {R}}}}}\), that

$$\begin{aligned} \Vert (1+|u|)^{k}m_{n, \Delta }^{(k)}\Vert _\infty \leqslant c, \quad k\in \{0,1,2\}, \end{aligned}$$

for some \(0<c<\infty \) independent of \(n, \Delta \) and that

$$\begin{aligned} \Vert m'_{n, \Delta }\Vert _{L^2}\rightarrow 0,\quad \Delta ^{-1/2}\Vert m''_{n, \Delta }\Vert _{L^2}\rightarrow 0. \end{aligned}$$

The above assumption is an adaptation of the usual Mikhlin-type Fourier multiplier conditions to the situation relevant here (see [19], Cor. 4.11). It ensures that \(m\), \(m'\) act as norm-continuous Fourier multipliers on suitable function spaces, which will be a key tool in our proofs. Obviously Assumption 8 covers the case \(m=1\) relevant in (21) above. Moreover, we show in Proposition 19 below that it also covers \(m={\mathcal {F}} K_{(\Delta )}/\varphi _\Delta \) under our conditions on \(\varphi _\Delta \) and \(K_{(\Delta )}\), where \(K_{(\Delta )}\) denotes a kernel as in (14) with bandwidth depending on \(\Delta \). It also includes other situations not studied further here, such as smoothed empirical processes based on \({{\mathrm{{\mathbb {P}}}}}_{\Delta , n}\) convolved with an approximate identity \(K_h= h^{-1} K(\cdot /h), h:=h_n \rightarrow 0, \int K=1,\) upon setting \(m={\mathcal {F}} K_h\).

With the definition of general \(m=m_{n,\Delta }\) at hand we can now unify the second term \(S(t)\) in (20) and the main stochastic error (23), and study the smoothed empirical process

$$\begin{aligned} \mathbb {G}_n(t)&:= \sqrt{n\Delta }\int _{{{\mathrm{{\mathbb {R}}}}}} g_t(x) \left( \frac{x^2}{\Delta }({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta })\right) * {{\mathrm{{\mathcal {F}}}}}^{-1}[m](\,\mathrm {d}x)\\&= \sqrt{n \Delta }\int _{{{\mathrm{{\mathbb {R}}}}}}{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[g_t](u)](x) \frac{x^2}{\Delta }({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_\Delta )(\,\mathrm {d}x), \quad t \in {{\mathrm{{\mathbb {R}}}}},\nonumber \end{aligned}$$
(25)

the identity following from Fubini’s theorem and standard properties of Fourier transforms.

When \(t\) is a fixed point in \({{\mathrm{{\mathbb {R}}}}}\) and \(m=1\) one shows without difficulty that, as \(n \rightarrow \infty \),

$$\begin{aligned} {{\mathrm{\hbox {Var}}}}({\mathbb {G}}_n(t)) \rightarrow \int _{{{\mathrm{{\mathbb {R}}}}}} x^4 g_t^2(x) \nu (\,\mathrm {d}x) \end{aligned}$$

whenever \(\nu (\{t\})=0\). More generally one can show convergence of the finite-dimensional distributions of the process \(({\mathbb {G}}_n(t): t \in {{\mathrm{{\mathbb {R}}}}})\) to the process \(({\mathbb {G}}(t): t \in {{\mathrm{{\mathbb {R}}}}})\) from Theorem 3.

Proposition 9

Let \(\Delta = \Delta _n \rightarrow 0\) in such a way that \(n \Delta _n \rightarrow \infty \). Suppose the Lévy process satisfies Assumption 1, that \(\rho \) satisfies Assumption 6, and that \(m\) satisfies Assumption 8. Then as \(n \rightarrow \infty \) we have, for any \(t_1, \ldots , t_k \in {{\mathrm{{\mathbb {R}}}}}\), that

$$\begin{aligned} \left[ {\mathbb {G}}_n(t_1), \ldots , {\mathbb {G}}_n(t_k) \right] \rightarrow ^{{\mathcal {L}}} \left[ {\mathbb {G}}(t_1), \ldots , {\mathbb {G}}(t_k)\right] . \end{aligned}$$

We remark that in this proposition we can omit \((1\wedge x^4)\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) from Assumption 1 as it is only needed later in the proof of the tightness of the process \(\mathbb {G}_n\).

By sample-continuity of Brownian motion, and since the integral in (19) takes values in a fixed compact set, we deduce that there exists a version of \(({\mathbb {G}}(t):t\in {{\mathrm{{\mathbb {R}}}}})\) with uniformly continuous sample paths for the intrinsic covariance metric

$$\begin{aligned} d^2(s,t)=\int _{{{\mathrm{{\mathbb {R}}}}}} x^4(g_t(x)-g_s(x))^2\nu (\,\mathrm {d}x)=\int _{{{\mathrm{{\mathbb {R}}}}}} x^4\rho ^2(x)1\!\!1_{(s\wedge t,s\vee t]}\nu (\,\mathrm {d}x), \end{aligned}$$

and that, moreover, \({{\mathrm{{\mathbb {R}}}}}\) is totally bounded with respect to \(d\). As a consequence we obtain:

Lemma 10

Grant Assumption 6. For \(g_t\) as in (18) and any Lévy measure \(\nu \), the law of the centred Gaussian process \(\{\mathbb {G}(t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) with covariance

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}[\mathbb {G}(t)\mathbb {G}(t')]=\int _{{{\mathrm{{\mathbb {R}}}}}}x^4 g_t(x) g_{t'}(x) \nu (\,\mathrm {d}x),\quad t,t'\in {{\mathrm{{\mathbb {R}}}}}, \end{aligned}$$

defines a tight Gaussian Borel random variable in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\). In particular, there exists a version of the process \(({\mathbb {G}}(t): t \in {{\mathrm{{\mathbb {R}}}}})\) such that

$$\begin{aligned} \sup _{t \in {{\mathrm{{\mathbb {R}}}}}} |{\mathbb {G}}(t)|<\infty ~a.s. \end{aligned}$$

The most difficult part in the proofs of Theorems 3 and 5 is to show that \({\mathbb {G}}_n\) converges in law to \({\mathbb {G}}\) in the space \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) of bounded functions on the real line. Given that convergence of the finite-dimensional distributions and tightness of the limit process have already been established, this can be reduced to showing asymptotic equicontinuity of the process \(({\mathbb {G}}_n(t): t \in {{\mathrm{{\mathbb {R}}}}})\), or equivalently, uniform tightness of the random variables \({\mathbb {G}}_n\) in the Banach space \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) (see Section 1.5 in [33] and (39) below for precise definitions).

Theorem 11

Let \(\Delta = \Delta _n \rightarrow 0\) in such a way that \(n \Delta _n \rightarrow \infty \) and \(\log ^4(1/\Delta _n)=o(n\Delta _n)\). Suppose the Lévy process satisfies Assumption 1, that \(\rho \) satisfies Assumption 6, and that \(m\) satisfies Assumption 8. Then the process \({\mathbb {G}}_n\) from (25) is asymptotically equicontinuous in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\). In particular, \({\mathbb {G}}_n\) is uniformly tight in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\),

$$\begin{aligned} {\mathbb {G}}_n \rightarrow ^{{\mathcal {L}}} {\mathbb {G}} \quad \text {in} \quad \ell ^\infty ({{\mathrm{{\mathbb {R}}}}}), \end{aligned}$$

and

$$\begin{aligned} \sup _{t \in {{\mathrm{{\mathbb {R}}}}}} \left| \int _{{{\mathrm{{\mathbb {R}}}}}}g_t(x) \left( \frac{x^2}{\Delta }({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta })\right) * {{\mathrm{{\mathcal {F}}}}}^{-1}[m](\,\mathrm {d}x) \right| = O_P\left( \frac{1}{\sqrt{n \Delta }} \right) \!. \end{aligned}$$
(26)

The proof is based on ideas from the theory of smoothed empirical processes (in particular from [16]). The main mathematical challenges consist in dealing with envelopes of the empirical process that can be as large as \(1/\sqrt{\Delta }\rightarrow \infty \) in the high-frequency setting, and in accommodating the presence of an \(n\)-dependent Fourier multiplier \(m\) that needs to be general enough to allow for \(m={\mathcal {F}} K_{h}/\varphi _\Delta \). The latter requires the treatment of empirical processes that cannot be controlled with the standard bracketing or uniform metric entropy techniques. Our proofs rely on direct arguments for symmetrised empirical processes inspired by [18] and on sharp bounds on certain covering numbers based on a suitable Fourier integral operator inequality for \({{\mathrm{{\mathcal {F}}}}}^{-1}[m]\) in \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-norms.

4 Discussion and examples

4.1 Regularity of \(x^2 \nu \) and the Blumenthal–Getoor index

The regularity index \(s>0\) in Assumption 1 measures the smoothness of the function \(g_t(-\cdot )*(x^2\nu )\). When \(x^2\nu \) is sufficiently regular away from the origin, this will equivalently measure the smoothness of the function \(t \mapsto \int _{-\infty }^t x^2\nu (\,\mathrm {d}x)\), and hence is effectively driven by the singularity that \(\nu \) possesses at zero. The latter can be quantitatively measured by the Blumenthal–Getoor index (see [5])

$$\begin{aligned} \beta : =\inf \left\{ \alpha >0:\int _{|x|<1}|x|^{\alpha }\nu (\,\mathrm {d}x)<\infty \right\} \!=\!\inf \left\{ \alpha >0:\lim _{r\downarrow 0}r^{\alpha }\int _{ {{\mathrm{{\mathbb {R}}}}}\setminus [-r,r]}\,\mathrm {d}\nu \!=\!0\right\} \!. \end{aligned}$$
(27)

The Blumenthal–Getoor index \(\beta \) takes values in \([0,2]\) and we have \(\int _{{{\mathrm{{\mathbb {R}}}}}}|x|^\alpha \nu (\,\mathrm {d}x)=c<\infty \) for all \(\alpha \in (\beta ,2]\) (\(\alpha =2\) if \(\beta =2\)). In fact, for such \(\alpha \) and for all intervals \([a,b]\) containing the origin

$$\begin{aligned} \int _a^b |x|^2\nu (\,\mathrm {d}x) \leqslant \int _a^b|x|^\alpha \nu (\,\mathrm {d}x) (b-a)^{2-\alpha }\leqslant c (b-a)^{2-\alpha }. \end{aligned}$$

Provided \(\nu \) is smooth away from zero this shows that the Hölder smoothness of \(\int _{-\infty }^t x^2\nu (\,\mathrm {d}x)\) is at least \(2-\beta ^+\), where \(\beta ^+>\beta \) and \(\beta ^+\geqslant 1\). For a singularity of the form \(\nu (x)=|x|^{-\beta -1}\), \(\beta \in (1,2)\), which corresponds to Blumenthal–Getoor index \(\beta \), we have \(\int _{|x|<t}x^2\nu (\,\mathrm {d}x)=2(2-\beta )^{-1}t^{2-\beta }\) showing that the Hölder smoothness is at most \((2-\beta )\). This argument can be extended to the case where the symmetrised Lévy density \({\widetilde{\nu }}\) is regularly varying: If \({\widetilde{\nu }}\) is regularly varying with exponent \(-(\beta +1)\) at zero then \(\int _{|x|<t}|x|^2\nu (\,\mathrm {d}x)\) is regularly varying of exponent \((2-\beta )\) at zero by a Tauberian theorem (see e.g. [10], Thm. VIII.9.1). For Blumenthal–Getoor index \(\beta \in (1,2]\) this means that the Hölder regularity of \(\int _{-\infty }^t x^2\nu (\,\mathrm {d}x)\) is at most \((2-\beta )\).
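The power-law case can be verified numerically: for \(\nu (x)=|x|^{-\beta -1}\) the mass function \(\int _{|x|<t}x^2\nu (\,\mathrm {d}x)\) equals \(2(2-\beta )^{-1}t^{2-\beta }\), and its log-log slope recovers the Hölder exponent \(2-\beta \). The values of \(\beta \) and \(t\) below are arbitrary illustrative choices.

```python
import numpy as np

# Check int_{|x|<t} x^2 |x|^{-beta-1} dx = 2/(2-beta) * t^{2-beta} for a
# pure power-law singularity; beta and t are illustrative.
beta, t = 1.5, 0.3
N = 1_000_000
x = (np.arange(N) + 0.5) * (t / N)               # midpoint grid on (0, t)
numeric = 2 * np.sum(x ** (1 - beta)) * (t / N)  # factor 2 by symmetry in x
exact = 2 / (2 - beta) * t ** (2 - beta)
print(numeric, exact)

# log-log slope of t -> int_{|x|<t} x^2 nu(dx) equals the Hoelder exponent 2 - beta
F = lambda s: 2 / (2 - beta) * s ** (2 - beta)
t1, t2 = 0.1, 0.01
slope = (np.log(F(t1)) - np.log(F(t2))) / (np.log(t1) - np.log(t2))
print(slope)  # = 2 - beta = 0.5
```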

4.2 The drift parameter \(\gamma \)

None of the above estimators \({\widetilde{N}}, {\widehat{N}}, {\widetilde{\mathcal {N}}}, {\widehat{\mathcal {N}}}\) require knowledge, or estimation, of the drift parameter \(\gamma \), which, at any rate, can be naturally estimated by \(L_{n\Delta }/(n\Delta )\). It is interesting to note that the ‘nonlinear’ estimator \({\widehat{N}}_n\) is even invariant under a change of the drift parameter \(\gamma \), as the following lemma shows.

Lemma 12

Let \(Y_k:=X_k-\Delta \gamma ,k=1,\ldots ,n,\) which are increments of a Lévy process with characteristic triplet \((\sigma ,0,\nu )\). Denoting the estimators (11) based on \((X_k)\) and \((Y_k)\) as \({\widehat{N}}_{X,n}\) and \({\widehat{N}}_{Y,n}\), respectively, we obtain

$$\begin{aligned} \forall t\in {{\mathrm{{\mathbb {R}}}}}: {\widehat{N}}_{X,n}(t)={\widehat{N}}_{Y,n}(t). \end{aligned}$$

Proof

The drift causes a factor \(e^{-i\Delta \gamma u}\) in the empirical characteristic function \(\varphi _{\Delta ,n,Y}\) such that

$$\begin{aligned} {\widehat{\psi }}_{n,Y}''(u)=\Delta ^{-1} (\log (\varphi _{\Delta ,n,X}(u))-i\Delta \gamma u)''={\widehat{\psi }}_{n,X}''(u). \end{aligned}$$

\({\widehat{N}}_n\) depends on the observations only via \({\widehat{\psi }}_{n}''\). \(\square \)

Consequently, without loss of generality a specific value of \(\gamma \) can be assumed in the proofs for the estimator \({\widehat{N}}_n\) based on the Lévy–Khintchine representation. In particular, the conditions on \({{\mathrm{{\mathbb {P}}}}}_\Delta \) need to be verified only for one \(\gamma \).

4.3 A pilot estimate of the diffusion coefficient \(\sigma ^2\)

Proposition 13

Suppose the Lévy measure satisfies \(\int |x |^\alpha \nu (dx)<\infty \) for some \(\alpha \in [0,2]\) and the characteristic function is bounded from below via

$$\begin{aligned} |\varphi _\Delta (u) |\geqslant \exp (-\Delta \sigma _{max}^2u^2/2)\quad \text { for all } u\geqslant 0. \end{aligned}$$

Let \({\widehat{\sigma }}^2\) be as in (12). Then we have, for \(c_0\) small enough, as \(n \rightarrow \infty \), and uniformly in \(\Delta \leqslant 1\),

$$\begin{aligned} |{\widehat{\sigma }}^2-\sigma ^2 |=O_P\Big ((\log n)^{(\alpha -2)/2}\Delta ^{1-\alpha /2}+(\log n)^{-1}n^{c_0-1/2}\Big ). \end{aligned}$$

The proof follows along the lines of [21] and is omitted. The previous discussion and the examples in Sect. 4.5 below show that the natural connection between smoothness \(s\) and Blumenthal–Getoor index \(\beta \) is given by \(s=2-\beta \). For such \(s\) and with the choice \(c_0=1/6\) the conditions of Theorem 5 ensure that (13) is satisfied provided the infimum in the definition of the Blumenthal–Getoor index is attained. Otherwise it suffices to replace the condition \(\Delta _n=o(n^{-1/(s+1)})\) by the slightly stronger condition \(\Delta _n=o(n^{-1/(s^-+1)})\) for some \(s^-<s\) in order to guarantee (13). Other estimators, based for instance on the truncated quadratic variations of the process, can be considered, and different sets of conditions are possible. As this is beyond the scope of the present paper, we refer to [21] for discussion and references.

4.4 Bounding \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \)

A key condition in all results above is a uniform bound on \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_{\Delta }\Vert _\infty \) of order \(\Delta \). The following proposition shows that this condition already follows from \(\Vert x{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim 1\) and \(x^3\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\). We recall that we always assume \(\int _{{{\mathrm{{\mathbb {R}}}}}}x^2\nu (\,\mathrm {d}x)<\infty \).

Proposition 14

For any Lévy process \((L_t:t\geqslant 0)\) with \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \lesssim 1\) and \(x^3\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) we have \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \) (with constants uniform in \(\Delta )\).

Proof

From \(\varphi _\Delta ''=(\Delta \psi ''+(\Delta \psi ')^2)\varphi _\Delta =\Delta \psi '' \varphi _\Delta +(\Delta \psi '\varphi _{ \Delta /2})^2\) by the infinite divisibility, we conclude

$$\begin{aligned} x^2{{\mathrm{{\mathbb {P}}}}}_\Delta =\Delta \nu _\sigma *{{\mathrm{{\mathbb {P}}}}}_\Delta +4(x{{\mathrm{{\mathbb {P}}}}}_{\Delta /2})*(x{{\mathrm{{\mathbb {P}}}}}_{\Delta /2}), \end{aligned}$$
(28)

where \(\nu _\sigma =\sigma ^2\delta _0+x^2\nu \). Using \(x(P*Q)=(xP)*Q+P*(xQ)\), we infer further

$$\begin{aligned} x^3{{\mathrm{{\mathbb {P}}}}}_\Delta =\Delta \Big ( (x\nu _\sigma )*{{\mathrm{{\mathbb {P}}}}}_\Delta +\nu _\sigma *(x{{\mathrm{{\mathbb {P}}}}}_\Delta )\Big )+8(x{{\mathrm{{\mathbb {P}}}}}_{\Delta /2})*(x^2{{\mathrm{{\mathbb {P}}}}}_{\Delta /2}). \end{aligned}$$

By assumption and properties of Lévy processes, we have \(||x\nu _\sigma ||_\infty <\infty \), \({{\mathrm{{\mathbb {P}}}}}_\Delta ({{\mathrm{{\mathbb {R}}}}})=1\), \(\nu _\sigma ({{\mathrm{{\mathbb {R}}}}})<\infty \) and \(||x^2{{\mathrm{{\mathbb {P}}}}}_\Delta ||_{L^1}\lesssim \Delta \). This yields

$$\begin{aligned} ||x^3{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \lesssim \Delta (1+||x{{\mathrm{{\mathbb {P}}}}}_{\Delta } ||_\infty + ||x{{\mathrm{{\mathbb {P}}}}}_{\Delta /2} ||_\infty )\lesssim \Delta . \end{aligned}$$

\(\square \)

The condition \(\Vert x{{\mathrm{{\mathbb {P}}}}}_{\Delta _n}\Vert _\infty \lesssim 1\) is satisfied for all basic examples of Lévy processes like Brownian motion, compound Poisson, Gamma and symmetric (tempered) \(\alpha \)-stable processes. For the latter processes it is interesting to compare the resulting bounds to the small time estimates in [29]. The conjecture that the bound \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \lesssim 1\) holds universally for arbitrary jump behaviour near zero is wrong, however, as the case of a completely asymmetric (tempered) 1-stable process shows, for which \({{\mathrm{{\mathbb {P}}}}}_\Delta (-\Delta \log (1/\Delta ))\thicksim \Delta ^{-1}\) holds; see the exceptional case in Example 4.5 of [29].

If \(\int _{{{\mathrm{{\mathbb {R}}}}}}|x|\nu (\,\mathrm {d}x)<\infty \) we can define the drift parameter \(\gamma _0 := \gamma - \int x \nu (\,\mathrm {d}x)\).

Assumption 15

Let \((\sigma ^2,\gamma ,\nu )\) be a Lévy triplet and \(\nu ^+=\nu {1\!\!1}_{{{\mathrm{{\mathbb {R}}}}}^+}\), \(\nu ^-=\nu {1\!\!1}_{{{\mathrm{{\mathbb {R}}}}}^-}\). Consider the following conditions for the two triplets \((\sigma ^2,\gamma ,\nu ^\pm )\):

  1. (i)

    (diffusive case) \(\sigma >0\)

  2. (ii)

    (small intensity case) \(\sigma =0\), \(\gamma _0=0\), \(||x\nu ^\pm ||_\infty <\infty \)

  3. (iii)

    (finite variation case) \(\sigma =0\), \(\gamma _0=0\), \(x\nu ^\pm \) admits a Lebesgue density in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}}\setminus [-\varepsilon ,\varepsilon ])\) for all \(\varepsilon >0\), \(\varepsilon ^{-1}\int _{-\varepsilon }^\varepsilon |x |\nu ^\pm (\,\mathrm {d}x)\lesssim \varepsilon \nu (\pm \varepsilon )\) for \(\varepsilon \downarrow 0\) and

    $$\begin{aligned} \liminf _{\varepsilon \downarrow 0}\inf _{t\in (0,1]} \frac{(t\varepsilon )^{-1}\int _{|x |\leqslant t\varepsilon } x^2\nu ^\pm (\,\mathrm {d}x)}{\varepsilon ^2\nu (\pm \varepsilon )t\log (t^{-1})}>0 \end{aligned}$$
  4. (iv)

    (infinite variation case) \(\sigma =0\), \(\nu ^{\pm }\) admits a Lebesgue density,

    $$\begin{aligned} \frac{1}{\varepsilon }\int _{-\varepsilon }^\varepsilon x^2\nu ^\pm (\,\mathrm {d}x)\gtrsim \int _{|x |>\varepsilon }|x |\nu ^\pm (\,\mathrm {d}x)+1\quad \text { for } \varepsilon \in (0,1) \end{aligned}$$

Proposition 16

If each of the triplets \((\sigma ^2,\gamma ,\nu ^\pm )\) of the Lévy process satisfies one of the Assumptions 15(i)–(iv), then \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \lesssim 1\) holds uniformly in \(\Delta \).

4.5 Examples

Let us discuss the applicability of Proposition 16 together with the smoothness conditions on the jump measure from Theorem 5 in a few examples.

  1. (i)

    Diffusion plus compound Poisson process.

    Let \(\nu \) be a finite measure on \({{\mathrm{{\mathbb {R}}}}}\) with a Lebesgue density. Suppose that we have \(\int _{{{\mathrm{{\mathbb {R}}}}}}|x|^{4+\varepsilon }\nu (\,\mathrm {d}x)<\infty \) for some \(\varepsilon >0\) and \(\Vert x^3\nu \Vert _\infty <\infty \). Proposition 16 yields \(\Vert x{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim 1\) if either \(\gamma _0=0\) and \(\nu (x)\lesssim |x|^{-1}\) as \(x\rightarrow 0\), or if \(\sigma >0\).

    For \(x^2\nu \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) the global Hölder regularity in Assumption 1(d) is \(s=1\), and for smooth compounding measure \(x^2\nu \in C^r({{\mathrm{{\mathbb {R}}}}})\) it is satisfied with \(s=r+1\).

  2. (ii)

    Self-decomposable Lévy process.

    The jump measures of self-decomposable Lévy processes are characterised by \(\nu (\,\mathrm {d}x)=\frac{k(x)}{|x|}\,\mathrm {d}x\) for a function \(k:{{\mathrm{{\mathbb {R}}}}}\rightarrow {{\mathrm{{\mathbb {R}}}}}_+\) which is monotonically increasing on the negative half line and decreasing on the positive one. An explicit example is given by the Gamma process where \(k(x)=ce^{-\lambda x}1\!\!1_{{{\mathrm{{\mathbb {R}}}}}_+}(x)\) for \(c,\lambda >0\). Note that nontrivial self-decomposable processes have an infinite jump activity. If \(k\) is a bounded function, then Assumption 15(ii) is fulfilled. The smoothness is determined by the Hölder regularity of \(|x|k(x)\). For instance, Gamma processes induce regularity \(s=2\) at \(t=0\) and \(C^\infty \) away from the origin.

  3. (iii)

    Tempered stable Lévy process.

    Let \(L\) be a tempered stable process, that is a pure jump process with Lévy measure given by the Lebesgue density

    $$\begin{aligned} \nu (x)=|x|^{-1-\alpha }\left( c_-e^{-\lambda _-|x|}1\!\!1_{(-\infty ,0)}(x) +c_+e^{-\lambda _+|x|}1\!\!1_{(0,\infty )}(x)\right) \end{aligned}$$

    with parameters \(c_\pm \geqslant 0,\lambda _\pm >0\) and stability index \(\alpha \in (0,2)\). By the exponential tails of \(\nu \) the moment assumptions are satisfied. For the finite variation case \(\alpha \in (0,1)\) Assumption 15(iii) can be verified since \(\varepsilon ^{-1}\int _{-\varepsilon }^\varepsilon |x|\nu ^\pm (x)\,\mathrm {d}x\sim \varepsilon ^{-\alpha }\sim \varepsilon \nu (\pm \varepsilon )\) and the second condition simplifies to \(t^{-\alpha }/\log (t^{-1})>0\). In the infinite variation case \(\alpha \in (1,2)\) Assumption 15(iv) is satisfied owing to

    $$\begin{aligned} \varepsilon ^{-1}\int _{-\varepsilon }^\varepsilon x^2\nu ^\pm (x)\,\mathrm {d}x\sim \varepsilon ^{1-\alpha }\sim \int _{|x|>\varepsilon }|x|\nu ^\pm (x)\,\mathrm {d}x, ~~\varepsilon \in (0,1). \end{aligned}$$

    Outside of a neighbourhood of zero the Lévy measure is arbitrarily smooth. Due to the cusp of \(x^2\nu (x)\) at the origin the global Hölder regularity is in general given by \(s=2-\alpha \). In the case \(\alpha =1\) and \(c_+=c_-\), \(x^2\nu \) is already Lipschitz continuous at zero and so \(s=2\).

  4. (iv)

    Jump densities regularly varying at zero.

    The first condition in Assumption 15(iii) holds for regularly varying \(\nu \) with \(\alpha <1\), that is \(\nu ^\pm (x)=|x|^{-1-\alpha }l(x)\) with slowly varying \(l\) at zero, by a classical Tauberian theorem (see e.g. [10], Thm. VIII.9.1). The second condition then reduces to

    $$\begin{aligned} l(t\varepsilon )\geqslant C_\alpha t^\alpha \log (t^{-1})l(\varepsilon ),\quad C_\alpha >0, \end{aligned}$$

    uniformly over \(t\in (0,1]\) for small \(\varepsilon >0\), which is always satisfied for \(\alpha >0\). Similarly, Assumption 15(iv) is satisfied if \(\nu ^\pm (x)=|x|^{-1-\alpha }l(x)\) holds with \(\alpha \in (1,2)\) and a slowly varying function \(l\) at zero.
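For the tempered stable example (iii) with \(\alpha \in (1,2)\), the scaling \(\varepsilon ^{-1}\int _{-\varepsilon }^\varepsilon x^2\nu ^\pm (x)\,\mathrm {d}x\sim \varepsilon ^{1-\alpha }\) underlying Assumption 15(iv) can be checked numerically; the sketch below takes \(c_\pm =\lambda _\pm =1\) and \(\alpha =3/2\) as illustrative values.

```python
import numpy as np

alpha = 1.5  # infinite variation case, illustrative value

def lhs(eps, N=200_000):
    # eps^{-1} * int_{-eps}^{eps} x^2 nu(x) dx with nu(x) = |x|^{-1-alpha} e^{-|x|}
    x = (np.arange(N) + 0.5) * (eps / N)         # midpoint grid on (0, eps)
    return float(2.0 * np.sum(x ** (1 - alpha) * np.exp(-x)) * (eps / N) / eps)

for eps in (0.1, 0.01, 0.001):
    # rescaled by eps^{alpha-1}, the quantity approaches 2/(2-alpha) = 4
    print(eps, lhs(eps) * eps ** (alpha - 1))
```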

5 Proofs

We collect the proofs for Sects. 2, 3, 4. Theorem 11 is proved in the next section.

5.1 Proof of Lemma 2

The result is standard—for the convenience of the reader we include a short proof. Using the Lévy–Khintchine formula (8) we see

$$\begin{aligned} c_\Delta :=\Delta ^{-1}{{\mathrm{{\mathbb {E}}}}}[X_1^2]= -\Delta ^{-1}\varphi _{\Delta }''(0) = -\psi ''(0)- \Delta (\psi '(0))^2 \rightarrow \sigma ^2+\int x^2 \nu (\,\mathrm {d}x)=:c \end{aligned}$$
(29)

as \(\Delta \rightarrow 0\). The characteristic function of the probability measure \(c_\Delta ^{-1}x^2\Delta ^{-1}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\) converges pointwise to the characteristic function of \(c^{-1}(\sigma ^2 \delta _0+ x^2\nu (\,\mathrm {d}x))\) as \(\Delta \rightarrow 0\) since

$$\begin{aligned} \frac{1}{c_\Delta \Delta }\int e^{iux}x^2 {{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)&=-\frac{1}{c_\Delta \Delta }\varphi _\Delta ''(u)=-\frac{1}{c_\Delta \Delta }(\Delta \psi ''(u)+\Delta ^2(\psi '(u))^2)e^{\Delta \psi (u)}\\&\rightarrow \frac{1}{c}(\sigma ^2+\int e^{iux}x^2 \nu (\,\mathrm {d}x)). \end{aligned}$$

Therefore, we obtain (5) from Lévy’s continuity theorem.
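For a concrete instance of (29), the Gamma process makes every quantity explicit: with \(X_1\sim \mathrm{Gamma}(a\Delta ,1/\lambda )\) one has \(c_\Delta =\Delta ^{-1}{{\mathrm{{\mathbb {E}}}}}[X_1^2]=a(a\Delta +1)/\lambda ^2\rightarrow a/\lambda ^2=\int x^2\nu (\,\mathrm {d}x)\) (here \(\sigma =0\)). The parameter values below are illustrative.

```python
# Gamma process sanity check for (29): with Levy density nu(x) = a*exp(-lam*x)/x,
# X_1 ~ Gamma(a*Delta, 1/lam) gives E[X_1^2] = a*Delta*(a*Delta + 1)/lam^2, so
# c_Delta = a*(a*Delta + 1)/lam^2 decreases to c = a/lam^2 as Delta -> 0.
a, lam = 2.0, 3.0          # illustrative parameters
limit = a / lam ** 2       # c = sigma^2 + int x^2 nu(dx), with sigma = 0
for Delta in (1.0, 0.1, 0.01):
    c_Delta = a * (a * Delta + 1) / lam ** 2   # = Delta^{-1} E[X_1^2]
    print(Delta, c_Delta)  # decreases towards limit = 2/9
```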

5.2 Proof of Theorem 3

Using decomposition (20), Theorem 3 follows from Theorem 11 with \(m=1\) [which trivially satisfies Assumption 8(a)], if we can show that the ‘bias’ term \(B(t)\) is asymptotically negligible uniformly in \(t \in {{\mathrm{{\mathbb {R}}}}}\). This is achieved in the following proposition.

Proposition 17

Grant the assumptions of Theorem 3. Then it holds

$$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}|B(t)|=\sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\Big |\int g_{t}(x)\big (\Delta ^{-1}x^{2}{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\mathrm {d}x)-x^{2} \nu (\mathrm {d}x)\big )\Big |=O(\Delta ^{s/2}). \end{aligned}$$

Proof

We decompose the bias into

$$\begin{aligned} B(t)&= \int g_{t}(x)\big (\Delta ^{-1}x^{2}{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\mathrm {d}x)-((x^2\nu )*{{\mathrm{{\mathbb {P}}}}}_{ \Delta })(\,\mathrm {d}x)\big )\nonumber \\&\quad +\int g_{t}(x)\big (((x^{2}\nu )*{{\mathrm{{\mathbb {P}}}}}_{\Delta })(\,\mathrm {d}x)-x^{2}\nu (\,\mathrm {d}x)\big )\nonumber \\&=: B_{1}(t)+B_{2}(t). \end{aligned}$$
(30)

We start with the first term \(B_1(t)\): Using \(-({{\mathrm{{\mathcal {F}}}}}f)''={{\mathrm{{\mathcal {F}}}}}[x^{2}f]\) for any function \(f\) satisfying \((1\vee x^{2})f\in L^{1}({{\mathrm{{\mathbb {R}}}}})\), we have

$$\begin{aligned} -{{\mathrm{{\mathcal {F}}}}}\big [x^{2}{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\mathrm {d}x)]=\varphi _{\Delta } ''=(\Delta \psi ''+(\Delta \psi ')^{2})\varphi _{\Delta }. \end{aligned}$$

Plancherel’s identity and \(\psi ''=-{{\mathrm{{\mathcal {F}}}}}[x^2\nu ]\) then give

$$\begin{aligned} B_{1}(t)&= \frac{1}{2\pi }\int {{\mathrm{{\mathcal {F}}}}}g_{t}(-u){{\mathrm{{\mathcal {F}}}}}\big [\Delta ^{-1}x^{2}{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\mathrm {d}x)-((x^2\nu )*{{\mathrm{{\mathbb {P}}}}}_{ \Delta })(\,\mathrm {d}x)\big ](u)\,\,\mathrm {d}u\\&= \frac{1}{2\pi }\int {{\mathrm{{\mathcal {F}}}}}g_{t}(-u)\big (-\Delta ^{-1}\varphi _{\Delta }''(u)-{{\mathrm{{\mathcal {F}}}}}[x^2\nu ](u)\varphi _{\Delta } (u)\big )\,\,\mathrm {d}u\\&= -\frac{\Delta }{2\pi }\int {{\mathrm{{\mathcal {F}}}}}g_{t}(-u)\psi '(u)^{2}\varphi _{\Delta }(u)\,\,\mathrm {d}u. \end{aligned}$$

The proofs below will imply that the last integral exists, which in particular justifies the preceding manipulations. We shall repeatedly use that \(\sup _{t}\Vert g_{t}\Vert _{L^{1}}\leqslant \Vert \rho \Vert _{L^{1}}\) and \(\sup _{t}\Vert g_{t}\Vert _{BV}\leqslant \Vert \rho \Vert _{BV}\) imply

$$\begin{aligned} |{{\mathrm{{\mathcal {F}}}}}g_{t}(u)|\lesssim (1+|u|)^{-1},u\in {{\mathrm{{\mathbb {R}}}}}, \end{aligned}$$
(31)

uniformly in \(t \in {{\mathrm{{\mathbb {R}}}}}\). In case (a) we can use (31), \(\Vert \varphi _\Delta \Vert _\infty =1\), \(|{{\mathrm{{\mathcal {F}}}}}[x\nu ](u)| \lesssim (1+|u|)^{-1}\), the hypothesis \(\gamma _0=0\) and the resulting identity

$$\begin{aligned} \psi '(u) = i {{\mathrm{{\mathcal {F}}}}}[x\nu ](u) - i \left( \int x \nu (\,\mathrm {d}x) - \gamma \right) = i {{\mathrm{{\mathcal {F}}}}}[x \nu ](u) \end{aligned}$$

to bound

$$\begin{aligned} |B_1(t)| \leqslant \frac{\Delta }{2\pi }\int |{{\mathrm{{\mathcal {F}}}}}g_{t}(-u)||{{\mathrm{{\mathcal {F}}}}}[x \nu ](u)|^{2}|\varphi _{\Delta }(u)|\,\,\mathrm {d}u \lesssim \Delta \int \frac{1}{(1+|u|)^3} \,\mathrm {d}u =O(\Delta ). \end{aligned}$$

For case (b) we will show that

$$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}|B_{1}(t)|\lesssim \Delta ^{p}\quad \text {for any}\quad p\in \left( 0,\frac{2-\beta }{\beta }\vee 1\right) \!. \end{aligned}$$
(32)

By assumption \({\widetilde{\nu }}(x)=\nu ^+(x)+\nu ^-(-x)\) is regularly varying at zero with exponent \(-(\beta +1)\) and so the function \(H(r):=\int _{|x|<r}x^2\nu (x)\,\mathrm {d}x=\int _0^r x^2 {\widetilde{\nu }}(x)\,\mathrm {d}x\) is regularly varying with exponent \((2-\beta )\) by a Tauberian theorem [10, Thm. VIII.9.1]. In particular we can bound \(H(r)\) from below: for any \(\beta ^-\in (0,\beta )\) there exists \(r_0>0\) such that \(H(r)\gtrsim r^{2-\beta ^-}\) for all \(r\in (0,r_0)\). By [28] there is a constant \(c>0\) such that

$$\begin{aligned} |\varphi _{\Delta }(u)|\lesssim \exp (-c\Delta |u|^{\beta ^{-}}) \end{aligned}$$
(33)

for \(|u|\) sufficiently large. On the other hand, it is easily seen that

$$\begin{aligned} |\psi '(u)|\lesssim 1+|u|^{(\beta ^{+}-1)\vee 0} \end{aligned}$$
(34)

for any \(\beta ^{+}\in (\beta ,2)\) and that \(|\psi ''(u)|\) is bounded. In particular we have \(\varphi _{\Delta },\varphi _{\Delta }''\in L^{2}({{\mathrm{{\mathbb {R}}}}})\). Collecting the above and using (31) implies

$$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}|B_{1}(t)|\lesssim \Delta \int (1+|u|)^{-1} |\psi '(u)|^{2}|\varphi _{\Delta }(u)|\,\,\mathrm {d}u. \end{aligned}$$

Let us distinguish the cases \(\beta \geqslant 1\) and \(\beta <1\), which together will yield (32). Throughout we use the bounds for \(\varphi _\Delta \) and \(\psi '\) in (33) and (34), respectively.

  1. (i)

    For \(\beta \geqslant 1\) substituting \(u=\Delta ^{-1/\beta ^{-}}z\) yields

    $$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}|B_{1}(t)|&\lesssim \Delta \int (1+|u|)^{(2\beta ^{+}-3)}\exp (-c\Delta |u|^{\beta ^{-}})\,\,\mathrm {d}u\\&\lesssim \Delta ^{(\beta ^{-}-2\beta ^{+}+2)/\beta ^{-}}\int ((1+|z|)^{2\beta ^{+}-3}\vee |z|^{ 2\beta ^{+}-3})\exp (-c|z|^{\beta ^{-}})\,\,\mathrm {d}z, \end{aligned}$$

    where the integral in the last display is finite owing to \(\beta ^{+}>1\). Noting that \(2\beta ^{+}-\beta ^{-}>\beta \), we conclude that \(|B_{1}|\lesssim \Delta ^{p}\) for any \(p<(2-\beta )/\beta \).

  2. (ii)

    For \(0<\beta <1\) boundedness of \(|\psi '|\) and the same substitution yields for any \(\delta >0\)

    $$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}|B_{1}(t)|&\lesssim \Delta \int (1+|u|)^{-1}\exp (-c\Delta |u|^{\beta ^{-}})\,\,\mathrm {d}u\\&\leqslant \Delta ^{1-1/\beta ^{-}}\int (1+\Delta ^{-1/\beta ^{-}}|z|)^{-1+\delta }\exp (-c|z|^{ \beta ^{-}})\,\,\mathrm {d}z\\&\leqslant \Delta ^{1-\delta /\beta ^{-}}\int |z|^{-1+\delta }\exp (-c|z|^{\beta ^{-}})\,\,\mathrm {d}z. \end{aligned}$$

    By choosing \(\delta \) sufficiently small, we obtain \(|B_{1}|\lesssim \Delta ^{p}\) for any \(p<1\).
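Combining the two cases gives (32); as a consistency check (our elaboration, not part of the source argument), the exponent obtained in case (i) satisfies

$$\begin{aligned} \frac{\beta ^{-}-2\beta ^{+}+2}{\beta ^{-}}\;\longrightarrow \;\frac{2-\beta }{\beta }\qquad \text {as }\beta ^{-}\uparrow \beta ,\ \beta ^{+}\downarrow \beta , \end{aligned}$$

so every \(p<(2-\beta )/\beta \) is attained, while case (ii) attains every \(p<1\) by letting \(\delta \downarrow 0\).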

Let us now consider \(B_{2}\) in (30), which we can write as

$$\begin{aligned} B_{2}(t)=&\int \int \big (g_{t}(x+y)-g_{t}(x)\big )x^{2}\nu (\,\mathrm {d}x){{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\\ =&\int \big ((g_{t}(-{\scriptstyle \bullet } )*(x^{2}\nu ))(-y)-(g_{t}(-{\scriptstyle \bullet } )*(x^{2} \nu ))(0)\big ){{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y). \end{aligned}$$

For the sake of brevity we define \(h_{t}(y):=(g_{t}(-{\scriptstyle \bullet } )*(x^{2}\nu ))(y)\). We decompose the integration domain into the neighbourhood of the origin \((-U,U)\) and the tails \(\{y:|y|\geqslant U\}\). For small \(y\), the uniform Hölder regularity of \(h_{t}\) on \((-U,U)\), together with \({{\mathrm{{\mathbb {E}}}}}[|X_{1}|^{2}]\lesssim \Delta \) and Jensen’s inequality, yields for \(s\leqslant 1\)

$$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\Big |\int _{|y|<U}\big (h_{t}(-y)-h_{t}(0)\big ){{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\Big |\lesssim \int _{{{\mathrm{{\mathbb {R}}}}}}|y|^{s}{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)={{\mathrm{{\mathbb {E}}}}}[|X_{1}|^{s}]\lesssim \Delta ^{s/2}. \end{aligned}$$

For \(s>1\), with \(x_y\in [-y,0]\) an intermediate point from the mean value theorem, we obtain

$$\begin{aligned}&\sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\Big |\int _{|y|<U}\big (h_{t}(-y)-h_{t}(0)\big ){{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\Big |\\&\quad \leqslant \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\Big |\int _{|y|<U}\big (h_{t}'(x_y)-h_{t}'(0)\big )y{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\Big | + \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\Big |\int _{|y|<U}h_{t}'(0)y{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\Big |\\&\quad \leqslant \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\Big |\int _{|y|<U}|y|^s{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\Big |+ \Big |\int _{{{\mathrm{{\mathbb {R}}}}}}y{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\Big |\sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\left| h_{t}'(0)\right| \\&\qquad + \Big |\int _{|y|\geqslant U}y^2 U^{-1}{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\Big |\sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\left| h_{t}'(0)\right| \\&\quad \lesssim \Delta ^{s/2}+\Delta +\Delta \lesssim \Delta ^{s/2}. \end{aligned}$$

For the tails we conclude from \(\sup _{t}\Vert h_{t}\Vert _{\infty }\leqslant \Vert \rho \Vert _{\infty }\int x^{2}\nu (\,\mathrm {d}x)\) and Markov’s inequality

$$\begin{aligned} \sup _{t\in {{\mathrm{{\mathbb {R}}}}}}\Big |\int _{|y|\geqslant U}\big (h_{t}(-y)-h_{t}(0)\big ){{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}y)\Big |\lesssim {{\mathrm{{\mathbb {P}}}}}_{\Delta }(|X_{1}|\geqslant U)\leqslant U^{-2}{{\mathrm{{\mathbb {E}}}}}[|X_{1}|^{2}]\lesssim \Delta . \end{aligned}$$

The previous two estimates finally yield \(\sup _{t}|B_{2}(t)|\lesssim \Delta ^{s/2}\). \(\square \)

5.3 Proof of Theorem 4

We only prove the case \(V=(-\infty , -\zeta ]\); the general case follows by symmetry arguments that are left to the reader. We use decomposition (20) and apply Theorem 11—with \(m=1\) and \(\rho \) suitably chosen such that \(\rho (x)=x^{-2}\) for all \(x \in (-\infty , -\zeta ]\)—to the stochastic term \(S(t)\). For our choice of \(\Delta _n\) the bias term \(B(t)\) is negligible in the asymptotic distribution in view of Proposition 2.1 in [13] (which holds also for unbounded \(V\) separated away from the origin, as inspection of that proof shows).

5.4 Proof of Theorem 5

For Theorem 5(ii) we choose a suitable \(\rho \) such that \(\rho (x)=x^{-2}\) on \(V\), and we restrict to the case \((-\infty ,-\zeta ]\) since the proof extends to the general case by symmetry arguments. We use the decomposition (22). The third term is negligible in view of (13). Recalling

$$\begin{aligned} g_t(x)=\rho (x)1\!\!1_{(-\infty ,t]}(x),\quad t\in {{\mathrm{{\mathbb {R}}}}}. \end{aligned}$$

The following result shows that the deterministic approximation error is negligible in the asymptotic distribution of \(\sqrt{n\Delta _n}({\widehat{N}}_n-N_\rho )\) whenever \(h^{s}=o(1/\sqrt{n\Delta _n})\), which holds for our choice of \(\Delta _n\).

Proposition 18

Suppose \(x^2\nu \) is a finite measure satisfying Assumption 1(d). If the kernel satisfies (14) with order \(p\geqslant s\), then

$$\begin{aligned} \Big |\int _{{{\mathrm{{\mathbb {R}}}}}} g_{t}(x)\big (K_h*\big (y^2\nu (\,\mathrm {d}y)\big )-x^2\nu \big )(\,\mathrm {d}x)\Big |\lesssim h^{s}, \end{aligned}$$

with constants independent of \(t\).

Proof

Using Fubini’s theorem,

$$\begin{aligned}&\int _{{{\mathrm{{\mathbb {R}}}}}} g_{t}(x)\big (K_h*\big (y^2\nu (\,\mathrm {d}y)\big )-x^2\nu \big )(\,\mathrm {d}x)\nonumber \\&\quad =K_h*g_{t}(-{\scriptstyle \bullet } )*(x^2\nu )(0)-g_{t}(-{\scriptstyle \bullet } )*(x^2\nu )(0). \end{aligned}$$
(35)

The result now follows from Assumption 1(d) and a standard Taylor expansion argument using the order \(p\) of the kernel. \(\square \)
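To spell out the Taylor argument (our reconstruction under the stated assumptions): with \(f:=g_{t}(-{\scriptstyle \bullet } )*(x^2\nu )\), which is uniformly \(s\)-Hölder by Assumption 1(d), the right-hand side of (35) equals \(\int K(z)(f(-hz)-f(0))\,\mathrm {d}z\) since \(\int K=1\). Expanding \(f\) to order \(\lfloor s\rfloor \) and using \(\int z^{j}K(z)\,\mathrm {d}z=0\) for \(1\leqslant j\leqslant \lfloor s\rfloor \) (kernel order \(p\geqslant s\)), only the remainder survives:

$$\begin{aligned} \Big |\int K(z)\big (f(-hz)-f(0)\big )\,\mathrm {d}z\Big |\lesssim \int |K(z)|\,|hz|^{\lfloor s\rfloor }\,|hz|^{s-\lfloor s\rfloor }\,\mathrm {d}z\lesssim h^{s}\int |z|^{s}|K(z)|\,\mathrm {d}z, \end{aligned}$$

uniformly in \(t\).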

The second, stochastic, term in (22) can be reduced to the linear term from Proposition 7, which is proved as follows:

Proof of Proposition 7

To linearise \(\psi ''-{\widehat{\psi }}_n''=-\Delta ^{-1}\log (\varphi _{\Delta ,n}/\varphi _\Delta )''\), we set \(F(y)=\log (1+y)\), \(\eta =(\varphi _{\Delta ,n}-\varphi _\Delta )/\varphi _\Delta \), and use

$$\begin{aligned} (F\circ \eta )''(u)&=F'(\eta (u))\eta ''(u)+F''(\eta (u))\eta '(u)^2\\&=F'(0)\eta ''(u)+O\Big (||F'' ||_\infty \Big (||\eta ||_\infty ||\eta '' || _\infty +||\eta ' ||_\infty ^2\Big )\Big ). \end{aligned}$$

On the event \(\Omega _n:=\{\sup _{|u|\leqslant 1/h}|(\varphi _{\Delta ,n}-\varphi _\Delta )(u)/\varphi _\Delta (u) |\leqslant 1/2\}\) we thus obtain

$$\begin{aligned} \sup _{|u |\leqslant h^{-1}}\big |{\widehat{\psi }}_n''(u)-\psi ''(u)-\Delta ^{-1}\eta ''(u)\big |\lesssim \Delta ^{-1}\big (||\eta ||_\infty ||\eta '' ||_\infty +||\eta ' ||_\infty ^2\big ). \end{aligned}$$

To estimate \(||\eta ^{(k)} ||_{\ell ^\infty [-h^{-1},h^{-1}]},k=0,1,2\), we note that \(|\psi '(u) |\lesssim 1+|u |\), \(|\psi ''(u) |\lesssim 1\) and \(h\gtrsim \Delta ^{1/2}\) yield

$$\begin{aligned} \sup _{u\in [-h^{-1},h^{-1}]}|(\varphi _\Delta ^{-1})'(u) |\!\lesssim \! \Delta h^{-1}\lesssim \Delta ^{1/2},\quad \sup _{u\in [-h^{-1},h^{-1}]}|(\varphi _\Delta ^{-1})''(u) |\!\lesssim \! \Delta ^2h^{-2}\!+\!\Delta \!\lesssim \!\Delta . \end{aligned}$$
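These two bounds can be seen directly from \(\varphi _\Delta ^{-1}=e^{-\Delta \psi }\) (our elaboration):

$$\begin{aligned} (\varphi _\Delta ^{-1})'=-\Delta \psi '\varphi _\Delta ^{-1},\qquad (\varphi _\Delta ^{-1})''=\big (\Delta ^2(\psi ')^{2}-\Delta \psi ''\big )\varphi _\Delta ^{-1}, \end{aligned}$$

together with \(|\psi '(u)|\lesssim 1+|u|\lesssim h^{-1}\), \(|\psi ''(u)|\lesssim 1\) and \(|\varphi _\Delta ^{-1}(u)|\lesssim 1\) on \([-h^{-1},h^{-1}]\).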

Moreover, from Theorem 1 in [24] we know that under our moment assumption on \(\nu \) (for \(k=0,1,2\) and any \(\delta >0\))

$$\begin{aligned} ||(\varphi _{\Delta ,n}-\varphi _\Delta )^{(k)} ||_{\ell ^\infty [-h^{-1},h^{-1}]} =O_P(n^{-1/2}\Delta ^{(k\wedge 1)/2}(\log h^{-1})^{(1+\delta )/2}). \end{aligned}$$
(36)

This yields for \(k=0,1,2\)

$$\begin{aligned} \Vert \eta ^{(k)}\Vert _{\ell ^\infty [-h^{-1},h^{-1}]}&=O_P\big (n^{-1/2}\Delta ^{k/4}(\log h^{-1})^{(1+\delta )/2}\big ). \end{aligned}$$

In combination with \(n(\log h^{-1})^{-1-\delta }\gtrsim n\Delta ^{3(1+\delta )/4}\rightarrow \infty \) for \(\delta \in (0,1/3)\) and \(|1/\varphi _\Delta | \lesssim 1\) on \([-1/h_n, 1/h_n]\), the bound (36) also shows \({{\mathrm{{\mathbb {P}}}}}(\Omega _n)\rightarrow 1\), and then

$$\begin{aligned} \sup _{|u |\leqslant h^{-1}} |{\widehat{\psi }}_n''(u)\!-\!\psi ''(u)-\Delta ^{-1}(\varphi _\Delta ^{-1}(\varphi _{\Delta ,n} \!-\!\varphi _\Delta ))''(u) | \!=\!O_P(n^{-1}\Delta ^{-1/2}\log (h^{-1})^{1+\delta }). \end{aligned}$$

We decompose the linearised stochastic error into

$$\begin{aligned} (\varphi _\Delta ^{-1}(\varphi _{\Delta ,n}\!-\!\varphi _\Delta ))'' \!=\!\varphi _\Delta ^{-1}(\varphi _{\Delta ,n}-\varphi _\Delta )'' +2(\varphi _\Delta ^{-1})'(\varphi _{\Delta ,n}\!-\!\varphi _\Delta )' \!+\!(\varphi _\Delta ^{-1})''(\varphi _{\Delta ,n}\!-\!\varphi _\Delta ). \end{aligned}$$

By the previous estimates we have

$$\begin{aligned} \sup _{|u |\leqslant h^{-1}}|(\varphi _\Delta ^{-1})'(\varphi _{\Delta ,n}-\varphi _\Delta )' |(u)&=O_P( \Delta h^{-1}n^{-1/2}\Delta ^{1/2}(\log h^{-1})^{(1+\delta )/2}),\\ \sup _{|u |\leqslant h^{-1}}|(\varphi _\Delta ^{-1})''(\varphi _{\Delta ,n}-\varphi _\Delta ) |(u)&=O_P( (\Delta ^2 h^{-2}+\Delta )n^{-1/2}(\log h^{-1})^{(1+\delta )/2}). \end{aligned}$$

Inserting the asymptotics in \(h\), we conclude

$$\begin{aligned}&\sup _{|u |\leqslant h^{-1}} |{\widehat{\psi }}_n''(u)-\psi ''(u)-\Delta ^{-1}\varphi _\Delta ^{-1}(\varphi _{\Delta ,n} -\varphi _\Delta )''(u) |\\&\quad \leqslant \sup _{|u |\leqslant h^{-1}}2\Delta ^{-1}|(\varphi _\Delta ^{-1})'(\varphi _{\Delta ,n}-\varphi _\Delta )' |(u) +\sup _{|u |\leqslant h^{-1}}\Delta ^{-1}|(\varphi _\Delta ^{-1})''(\varphi _{\Delta ,n}-\varphi _\Delta ) |(u)\\&\qquad +O_P\big (n^{-1}\Delta ^{-1/2}\log (h^{-1})^{1+\delta }\big )\\&\quad =O_P\left( n^{-1/2}\Delta ^{-1/2}h^{1/2}\left( \Delta h^{-3/2}+\Delta ^{3/2}h^{-5/2}+\Delta ^{1/2}h^{-1/2}\right) (\log h^{-1})^{(1+\delta )/2}\right) \\&\qquad +O_P\big (n^{-1}\Delta ^{-1/2}\log (h^{-1})^{1+\delta }\big )\\&\quad =o_P\big (n^{-1/2}\Delta ^{-1/2}h^{1/2}\big ). \end{aligned}$$

By the Plancherel formula and the Cauchy–Schwarz inequality we have

$$\begin{aligned}&{\Bigl |\int g_t(x){{\mathrm{{\mathcal {F}}}}}^{-1}\Big [{\mathcal {F}} K_h(u)\big ({\widehat{\psi }}_n''(u)-\psi ''(u)-\Delta ^{-1}\varphi _\Delta (u)^{-1}(\varphi _{\Delta ,n}-\varphi _\Delta )''(u)\big )\Big ] (x)\,\mathrm {d}x \Bigr |}\\&\quad \leqslant ||{\mathcal {F}} g_t ||_{L^2}||{\mathcal {F}} K_h ||_{L^2}\sup _{|u |\leqslant h^{-1}}|{\widehat{\psi }}_n''(u)-\psi ''(u)-\Delta ^{-1}\varphi _\Delta (u)^{-1}(\varphi _{ \Delta ,n}-\varphi _\Delta )''(u) |\\&\quad =o_P(n^{-1/2}\Delta ^{-1/2}). \end{aligned}$$

\(\square \)

Finally, to the main stochastic term

$$\begin{aligned} M_{\Delta ,n}&=\Delta _n^{-1}\int g_t(x) \big ({{\mathrm{{\mathcal {F}}}}}^{-1}[\varphi _{\Delta _n}^{-1}{\mathcal {F}} K_{h_n}]*(x^2({{\mathrm{{\mathbb {P}}}}}_{\Delta _n,n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta _n}))\big )(\,\mathrm {d}x)\\&=\Delta _n^{-1}\int \mathcal{F}^{-1}[\varphi _{\Delta _n}^{-1}(-u){\mathcal {F}} K_{h_n}(-u)\mathcal{F}g_t(u)](x)x^2({{\mathrm{{\mathbb {P}}}}}_{\Delta _n,n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta _n})(\,\mathrm {d}x), \end{aligned}$$

we apply Theorem 11. The proof of Theorem 5 is thus complete upon verification of Assumption 8 for the present choice of \(m\). This is achieved in the following proposition.

Proposition 19

Assume that \(K\) satisfies (14) for \(p \geqslant 2\) and that \(\nu \) satisfies \(\int _{{{\mathrm{{\mathbb {R}}}}}}|x|^3\nu (\,\mathrm {d}x)<\infty \). Let \(h=h_n\rightarrow 0\) and \(\Delta =\Delta _n\rightarrow 0\) as \(n\rightarrow \infty \) with \(h^3=o(\Delta )\), \(h^{-1} = O(\Delta ^{-1/2})\). Then \(m_{n,\Delta }(u):={\mathcal {F}} K_h(u)/\varphi _\Delta (u)\), \(u \in {{\mathrm{{\mathbb {R}}}}}\), satisfies Assumption 8.

Proof

We have \(m(-u)=\overline{m(u)}\) so that \({{\mathrm{{\mathcal {F}}}}}^{-1}m\) is real-valued. By the compact support of \({\mathcal {F}} K\) and the assumption on \(h^{-1}\) the support assumption on \(m\) is satisfied. Since \(\varphi _{\Delta } = e^{\Delta \psi }\), we have \(m_{n,\Delta } \rightarrow 1\) pointwise as \(\Delta \rightarrow 0\), \(h\rightarrow 0\). Moreover, by (9) we have \(|\psi ''(u)|\lesssim 1\), hence for \(|u| \leqslant C\Delta ^{-1/2}\) we have

$$\begin{aligned} |\varphi _\Delta (u)|= |e^{\Delta \psi (u)}| \geqslant e^{-\Delta cu^2} \geqslant c'>0 \end{aligned}$$

uniformly in \(\Delta \), and thus \(m \in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\), using also \(\sup _h\Vert {\mathcal {F}} K_h\Vert _\infty \leqslant \Vert K\Vert _{L^1}\). Next

$$\begin{aligned} m' = h \frac{i{{\mathrm{{\mathcal {F}}}}}[xK](h{\scriptstyle \bullet } )}{\varphi _\Delta } - {\mathcal {F}} K_h \frac{\Delta \psi ' \varphi _\Delta }{\varphi ^2_\Delta } \end{aligned}$$

so that using \(xK \in L^1, |\psi '(u)| \lesssim 1+|u|, |u| \leqslant h^{-1}=O(\Delta ^{-1/2})\) and the bound for \(m\) above we see

$$\begin{aligned} |m'(u)| \lesssim (h + \sqrt{\Delta }) \lesssim h. \end{aligned}$$

Using \(|\psi ''(u)|\lesssim 1\) we further obtain

$$\begin{aligned} |m''(u)|&\lesssim (h^2+h\sqrt{\Delta }+\Delta ) \lesssim h^2. \end{aligned}$$

On the support of \(m\) we have \(|u|\leqslant h^{-1}\) so that \(\Vert (1+|u|)^{k}m^{(k)}\Vert _\infty \leqslant c\), \(k\in \{0,1,2\}\), follows. Likewise by the support of \(m\) we have \(\Vert m'\Vert _{L^2}\lesssim h^{1/2}\rightarrow 0\) and \(\Delta ^{-1/2}\Vert m''\Vert _{L^2}\lesssim \Delta ^{-1/2}h^{3/2}\rightarrow 0\). \(\square \)

5.5 Convergence of finite-dimensional distributions

We next turn to the proof of Proposition 9.

Definition 20

A function \(g\) is called admissible if it is of bounded variation and satisfies for all \(x, u \in {{\mathrm{{\mathbb {R}}}}}\),

$$\begin{aligned} |g(x)| \lesssim 1 \wedge x^{-2},\quad |{\mathcal {F}} g(u)|\lesssim (1+|u|)^{-1} \quad \text {and}\quad u{{\mathrm{{\mathcal {F}}}}}[xg](u)\in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}}). \end{aligned}$$

Note that the bound on \({\mathcal {F}} g\) follows from the bounded variation of \(g\), and that \(x^2 g^2\) is of bounded variation whenever \(g\) is admissible.
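For completeness (our elaboration), the first of these claims is the usual integration-by-parts bound: since \(g\) is of bounded variation and vanishes at \(\pm \infty \),

$$\begin{aligned} {\mathcal {F}} g(u)=\int e^{iux}g(x)\,\mathrm {d}x=-\frac{1}{iu}\int e^{iux}\,\mathrm {d}g(x),\qquad u\ne 0, \end{aligned}$$

whence \(|{\mathcal {F}} g(u)|\leqslant \Vert g\Vert _{BV}/|u|\); combined with \(|{\mathcal {F}} g(u)|\leqslant \Vert g\Vert _{L^1}<\infty \), this gives \(|{\mathcal {F}} g(u)|\lesssim (1+|u|)^{-1}\).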

Proposition 21

Let \(g\) be admissible and suppose the conditions of Proposition 9 are satisfied. Then

$$\begin{aligned} \sqrt{n\Delta _n} \int {{\mathrm{{\mathcal {F}}}}}^{-1}[m(-{\scriptstyle \bullet } ){\mathcal {F}} g](x)\frac{x^2}{\Delta _n}({{\mathrm{{\mathbb {P}}}}}_{\Delta _n,n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta _n})(\,\mathrm {d}x)\rightarrow ^{{\mathcal {L}}} {\mathcal {N}}(0,\sigma _g^2) \end{aligned}$$

with variance \(\sigma _g^2=\int _{{{\mathrm{{\mathbb {R}}}}}} x^4g(x)^2\nu (\,\mathrm {d}x)\).

The functions \(g_t=\rho 1\!\!1_{(-\infty ,t]}\) are uniformly bounded in bounded variation and are admissible with constants independent of \(t\in {{\mathrm{{\mathbb {R}}}}}\). The convergence of the finite-dimensional distributions in Proposition 9 hence follows from the Cramér–Wold device since linear combinations of the functions \(g_{t_1},\ldots ,g_{t_k}\) for \(t_1,\ldots ,t_k\in {{\mathrm{{\mathbb {R}}}}}\) are admissible.

For the proof of Proposition 21 we will use the following lemma, whose assumptions are in particular fulfilled for \(m_{n,\Delta }\) satisfying Assumption 8 and for classes of functions with uniform constants in the admissibility definition.

Lemma 22

Let \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \). For \(\Delta \rightarrow 0\) as \(n\rightarrow \infty \) let \(\Vert m_{n,\Delta }\Vert _\infty \) and \(\Vert m_{n,\Delta }'\Vert _\infty \) be uniformly bounded and \(m_{n,\Delta }\rightarrow 1\) pointwise. If \({{\mathrm{{\mathcal {G}}}}}\) is a class of functions such that for all \(u\in {{\mathrm{{\mathbb {R}}}}}\)

$$\begin{aligned} \sup _{g\in {{\mathrm{{\mathcal {G}}}}}}|\mathcal {F}g(u)|\lesssim (1+|u|)^{-1}, \quad \sup _{g\in {{\mathrm{{\mathcal {G}}}}}}\Vert xg(x)\Vert _{L^2}\lesssim 1, \end{aligned}$$

then

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{g\in {{\mathrm{{\mathcal {G}}}}}}\int _{{{\mathrm{{\mathbb {R}}}}}}\Big (x^2{{\mathrm{{\mathcal {F}}}}}^{-1}[m_{n,\Delta } (-u){\mathcal {F}} g(u)](x)-x^2g(x)\Big )^2\frac{{{\mathrm{{\mathbb {P}}}}}_\Delta }{\Delta }(\,\mathrm {d}x)=0. \end{aligned}$$

Proof

We rewrite the term with \(m=m_{n,\Delta }\) as

$$\begin{aligned}&\Delta ^{-1}\int _{{{\mathrm{{\mathbb {R}}}}}}{{\mathrm{{\mathcal {F}}}}}^{-1}[{\mathcal {F}} g(u)(m(-u)-1)](x){{\mathrm{{\mathcal {F}}}}}^{-1}[{{\mathrm{{\mathcal {F}}}}}g(u)(m(-u)-1)](x)x^4{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\nonumber \\&\quad =\frac{-i}{\Delta }\int _{{{\mathrm{{\mathbb {R}}}}}}{{\mathrm{{\mathcal {F}}}}}^{-1}\big [{\mathcal {F}} g(u)(m(-u)-1)\big ](x)\nonumber \\&\qquad \times {{\mathrm{{\mathcal {F}}}}}^{-1}\big [i{{\mathrm{{\mathcal {F}}}}}[xg](u)(m(-u)-1)-{\mathcal {F}} g(u)m'(-u)\big ](x)x^3{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x). \end{aligned}$$
(37)

Using \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \), the term (37) can be estimated by the Cauchy–Schwarz inequality and Plancherel’s identity yielding the bound

$$\begin{aligned}&\int _{{{\mathrm{{\mathbb {R}}}}}}\Big |{{\mathrm{{\mathcal {F}}}}}^{-1}\big [{\mathcal {F}} g(u)(m(-u)\!-\!1)\big ](x){{\mathrm{{\mathcal {F}}}}}^{-1}\big [i{{\mathrm{{\mathcal {F}}}}}[xg](u)(m(-u)\!-\!1)\!-\!{{\mathrm{{\mathcal {F}}}}}g(u)m'(-u)\big ](x)\Big |\,\mathrm {d}x\\&\quad \leqslant \frac{1}{2\pi }\big \Vert {\mathcal {F}} g(u)(m(-u)-1)\big \Vert _{L^2}\big \Vert \big (i{{\mathrm{{\mathcal {F}}}}}[xg](u)(m(-u)-1)-{{\mathrm{{\mathcal {F}}}}}g(u)m'(-u)\big )\big \Vert _{L^2}. \end{aligned}$$

The first factor converges to zero by the dominated convergence theorem because \(m\) is uniformly bounded and converges pointwise to one while \(|\mathcal {F}g(u)|\leqslant C(1+|u|)^{-1}\) for all \(g\in {{\mathrm{{\mathcal {G}}}}}\). For the second factor we estimate, using that \(g\) and \(xg\) are uniformly bounded in \(L^2({{\mathrm{{\mathbb {R}}}}})\) and that \(\Vert m\Vert _\infty \) and \(\Vert m'\Vert _\infty \) are uniformly bounded,

$$\begin{aligned} \big \Vert i{{\mathrm{{\mathcal {F}}}}}[xg](u)(m(-u)\!-\!1)\!-\!{\mathcal {F}} g(u)m'(-u)\big \Vert _{L^2} \lesssim \Vert {{\mathrm{{\mathcal {F}}}}}[xg](u)\Vert _{L^2}+\Vert {\mathcal {F}} g(u)\Vert _{L^2}<\infty , \end{aligned}$$

which completes the proof of the lemma.\(\square \)

Proof of Proposition 21

We define

$$\begin{aligned} S_n := \sum _{k=1}^n Y_{n,k}\quad \text {with}\quad Y_{n,k}:=\Delta ^{-1/2}{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-{\scriptstyle \bullet } ){\mathcal {F}} g](X_k)X_k^2, \end{aligned}$$
(38)

so that the left-hand side in Proposition 21 equals \(n^{-1/2}(S_n-{{\mathrm{{\mathbb {E}}}}}S_n)\).

We will prove the proposition for general Fourier multipliers satisfying Assumption 8(b), the case where \({{\mathrm{{\mathcal {F}}}}}^{-1} m\) is a finite signed measure is similar (in fact easier) and is omitted. We will verify the conditions of Lyapunov’s central limit theorem, see, e.g., [1, Theorems 28.3 and (28.8)].

Step 1: We will show that \(\lim _{n\rightarrow \infty }{{\mathrm{\hbox {Var}}}}(Y_{n,k})=\int _{{{\mathrm{{\mathbb {R}}}}}} x^4 g(x)^2\nu (\,\mathrm {d}x)\), noting that \(Y_{n,k}\) are real-valued. We estimate

$$\begin{aligned} |{{\mathrm{{\mathbb {E}}}}}[Y_{n,k}] |&=\Delta ^{-1/2}{\Bigl |\int _{{{\mathrm{{\mathbb {R}}}}}}\mathcal{F}^{-1}\big [m(-u){{\mathrm{{\mathcal {F}}}}}g(u)\big ](x)x^2{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x) \Bigr |}\\&\leqslant \Delta ^{-1/2}||\mathcal{F}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}g(u)] ||_\infty ||x^2{{\mathrm{{\mathbb {P}}}}}_\Delta ||_{L^1}\\&\lesssim \Delta ^{-1/2} \int _{-C \Delta ^{-1/2}}^{C \Delta ^{-1/2}} (1+|u|)^{-1}\,\mathrm {d}u ~{{\mathrm{{\mathbb {E}}}}}[X_1^2] \\&\lesssim \Delta ^{-1/2}\log (\Delta ^{-1})\Delta \rightarrow 0, \end{aligned}$$

where we have used that \({{\mathrm{{\mathbb {E}}}}}[X_1^2]= O(\Delta )\). Consequently, \(\lim _{n\rightarrow \infty }{{\mathrm{\hbox {Var}}}}(Y_{n,k})=\lim _{n\rightarrow \infty }{{\mathrm{{\mathbb {E}}}}}[Y_{n,k}^2]\), which we decompose in the following way:

$$\begin{aligned} \lim _{n\rightarrow \infty }{{\mathrm{{\mathbb {E}}}}}[Y_{n,k}^2]&=\lim _{n\rightarrow \infty }\Delta ^{-1}\int _{{{\mathrm{{\mathbb {R}}}}}}({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){\mathcal {F}} g(u)](x)x^2)^2{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\\&= \lim _{n\rightarrow \infty }\Delta ^{-1}\int _{{{\mathrm{{\mathbb {R}}}}}}\Big (({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){\mathcal {F}} g(u)](x)x^2)^2-(x^2g(x))^2\Big ){{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\\&\quad +\lim _{n\rightarrow \infty }\left( \Delta ^{-1}\int _{{{\mathrm{{\mathbb {R}}}}}}(x^2g(x))^2{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)-\int _{{{\mathrm{{\mathbb {R}}}}}}(xg(x))^2 x^2\nu (\,\mathrm {d}x)\right) \\&\quad +\int _{{{\mathrm{{\mathbb {R}}}}}}(xg(x))^2x^2\nu (\,\mathrm {d}x). \end{aligned}$$

The last term is the claimed limit. The first limit is zero by Lemma 22. For the second limit we deduce by Lemma 2 that \((x^2 \wedge x^4) {{\mathrm{{\mathbb {P}}}}}_\Delta /\Delta \) converges weakly to the absolutely continuous measure \((x^2 \wedge x^4) \nu \); by the Portmanteau lemma the convergence persists when integrating against the function \((x^2 \vee 1) g(x)^2\), which is of bounded variation. This implies convergence to zero of the second term. This shows \(\lim _{n\rightarrow \infty }{{\mathrm{\hbox {Var}}}}(Y_{n,k})=\int x^4 g(x)^2\nu (\,\mathrm {d}x)\).

Step 2: We verify Lyapunov’s moment condition: For some \(\varepsilon \in (0,1)\) and \(S_n=\sum _{k=1}^nY_{n,k}\)

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{{{\mathrm{\hbox {Var}}}}(S_n)^{1+\varepsilon /2}}\sum _{k=1}^{n} {{\mathrm{{\mathbb {E}}}}}[|Y_{n,k}|^{ 2+\varepsilon }]=0. \end{aligned}$$

From the previous step we know \(n^{-1}{{\mathrm{\hbox {Var}}}}(S_n)={{\mathrm{\hbox {Var}}}}(Y_{n,k})\rightarrow \sigma _g^2\) as \(n\rightarrow \infty \). Moreover, by \(|x|^{4+2\varepsilon }\lesssim |1+ix|^{2+\varepsilon }|x|^3\), \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \) and the Hausdorff–Young inequality (e.g., 8.30 on p. 253 in [15])

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}[|Y_{n,k}|^{2+\varepsilon }]&\lesssim \Delta ^{-1-\varepsilon /2}\int _{{{\mathrm{{\mathbb {R}}}}}}\big |{{\mathrm{{\mathcal {F}}}}}^{-1}\big [m(-u){\mathcal {F}} g(u)\big ](x)x^2\big |^{2+\varepsilon }{{\mathrm{{\mathbb {P}}}}}_{\Delta }(\,\mathrm {d}x)\\&\lesssim \Delta ^{-\varepsilon /2}\int _{{{\mathrm{{\mathbb {R}}}}}}\big |{{\mathrm{{\mathcal {F}}}}}^{-1}\big [m(-u){\mathcal {F}} g(u)\big ](x)(1+ix)\big |^{2+\varepsilon }\,\mathrm {d}x\\ \!&=\!\Delta ^{-\varepsilon /2}\Big \Vert {{\mathrm{{\mathcal {F}}}}}^{-1}\big [m(-u){\mathcal {F}} g(u)\!-\!m'(-u){{\mathrm{{\mathcal {F}}}}}g(u)\!+\!im(-u){{\mathrm{{\mathcal {F}}}}}[xg](u)\big ]\Big \Vert _{L^{2+\varepsilon }}^{2+\varepsilon }\\&\lesssim \Delta ^{-\varepsilon /2}\Big \Vert m(-u){\mathcal {F}} g(u)\!-\!m'(-u){{\mathrm{{\mathcal {F}}}}}g(u)\!+\!im(-u){{\mathrm{{\mathcal {F}}}}}[xg](u)\Big \Vert _{L^{(2+\varepsilon )/(1+\varepsilon )}}^{2+\varepsilon }. \end{aligned}$$

By Assumption 8, \(m\) and \(m'\) are uniformly bounded, \(\Vert {\mathcal {F}} g(u)\Vert _{L^{(2+\varepsilon )/(1+\varepsilon )}}\) is finite owing to \(|{{{\mathrm{{\mathcal {F}}}}}}g(u)|\lesssim (1+|u|)^{-1}\) and

$$\begin{aligned} \big \Vert {{\mathrm{{\mathcal {F}}}}}[xg](u)\big \Vert _{L^{(2+\varepsilon )/(1+\varepsilon )}}&\lesssim \big \Vert {{\mathrm{{\mathcal {F}}}}}[xg](u)\big \Vert _{L^{(2+\varepsilon ) /(1+\varepsilon )}([-1,1])}\\&\quad +\big \Vert {{\mathrm{{\mathcal {F}}}}}[xg] (u)\big \Vert _ { L^ { (2+\varepsilon )/(1+\varepsilon )}([-1,1]^c)}, \end{aligned}$$

which are finite by \(xg\in L^2({{\mathrm{{\mathbb {R}}}}})\) and by \(u{{\mathrm{{\mathcal {F}}}}}[xg](u)\in \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\), respectively. Consequently, \({{\mathrm{{\mathbb {E}}}}}[|Y_{n,k}|^{ 2+\varepsilon }]\lesssim \Delta ^{-\varepsilon /2}\), implying

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{{{\mathrm{\hbox {Var}}}}(S_n)^{1+\varepsilon /2}}\sum _{k=1}^{n} {{\mathrm{{\mathbb {E}}}}}[|Y_{n,k}|^{2+\varepsilon }]\lesssim \lim _{n\rightarrow \infty } \frac{n\Delta _n^{-\varepsilon /2}}{n^{1+\varepsilon /2}} =\lim _{n\rightarrow \infty }(n\Delta _n)^{-\varepsilon /2}=0. \end{aligned}$$

\(\square \)

5.6 Proof of Proposition 16

For (ii) and (iii) we have \(\int |x| \nu (\,\mathrm {d}x)<\infty \) and will use that the function \(\psi \) in the exponent of the Lévy–Khintchine formula (8) may be written as

$$\begin{aligned} \psi (u)=-\frac{\sigma ^2u^2}{2}+i\gamma _0 u+\int _{{{\mathrm{{\mathbb {R}}}}}}(e^{iux}-1)\nu (\,\mathrm {d}x)\quad \text { with }\gamma _0:=\gamma -\int _{{{\mathrm{{\mathbb {R}}}}}}x\nu (\,\mathrm {d}x). \end{aligned}$$

For (iii) note \(x{{\mathrm{{\mathbb {P}}}}}_\Delta =(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^+)*{{\mathrm{{\mathbb {P}}}}}_\Delta ^- +(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^-)*{{\mathrm{{\mathbb {P}}}}}_\Delta ^+\) with the corresponding laws for \(\nu ^+,\nu ^-\). It thus suffices to prove \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ^+ ||_\infty + ||x{{\mathrm{{\mathbb {P}}}}}_\Delta ^- ||_\infty \lesssim 1\) and without loss of generality we only consider \({{\mathrm{{\mathbb {P}}}}}_\Delta ^+\) in the proof of case (iii). For (iv) we use the same decomposition but this time the law \({{\mathrm{{\mathbb {P}}}}}_\Delta ^+\) corresponds to the Lévy triplet \((0,\gamma ,\nu ^+)\) so that it also incorporates the drift.

  1. (i)

    If \(\sigma >0\) holds, then \(|\psi '(u) |\lesssim 1+|u |\) implies

    $$\begin{aligned} ||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \leqslant ||\varphi _\Delta ' ||_{L^1}\lesssim \int \Delta (1+|u |)e^{-\Delta \sigma ^2u^2/2}\,\mathrm {d}u\lesssim 1. \end{aligned}$$
  2. (ii)

    Under the assumptions the measure \(x\nu \) is finite, yielding the identity \(x{{\mathrm{{\mathbb {P}}}}}_\Delta =\Delta (x\nu )*{{\mathrm{{\mathbb {P}}}}}_\Delta \), which gives the stronger bound

    $$\begin{aligned} ||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \leqslant \Delta ||x\nu ||_\infty . \end{aligned}$$
  3. (iii)

    Without loss of generality we suppose \(\Vert x\nu ^\pm \Vert _\infty =\infty \). Denote the limit inferior in condition (iii) by \(\delta >0\) and define

    $$\begin{aligned} a_\Delta :=\inf \left\{ a>0:\sup _{x>a}\Delta x\nu (x)\leqslant \frac{4}{\delta }\right\} , \end{aligned}$$

    where \(a_\Delta >0\) follows from \(\lim _{a\rightarrow 0}\sup _{x>a}x\nu (x)=\Vert x\nu ^+\Vert _\infty =\infty \). Since \(\Vert x\nu \Vert _{\ell ^\infty ({{\mathrm{{\mathbb {R}}}}}\setminus [-\varepsilon ,\varepsilon ])}\) is bounded for any \(\varepsilon >0\) we deduce that \(a_\Delta \downarrow 0\) as \(\Delta \rightarrow 0\).

    Let us introduce \(\nu _\Delta ^s:=\nu ^+{1\!\!1}_{[0,a_\Delta ]}\) and \(\nu _\Delta ^c:=\nu ^+-\nu _\Delta ^s\). By \(\Vert x\nu _\Delta ^c\Vert _\infty \leqslant \frac{4}{\Delta \delta } \) and the argument in (ii), applied to \(\nu _\Delta ^c\), the corresponding law \({{\mathrm{{\mathbb {P}}}}}_\Delta ^c\) satisfies \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ^c ||_\infty \lesssim 1\). Because of

    $$\begin{aligned} x{{\mathrm{{\mathbb {P}}}}}_\Delta =(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^c)*{{\mathrm{{\mathbb {P}}}}}_\Delta ^s+(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^s) *{{\mathrm{{\mathbb {P}}}}}_\Delta ^c =(x{{\mathrm{{\mathbb {P}}}}}_\Delta ^c)*{{\mathrm{{\mathbb {P}}}}}_\Delta ^s+(\Delta x\nu _\Delta ^s)*{{\mathrm{{\mathbb {P}}}}}_\Delta \end{aligned}$$

    we shall bound \(||\Delta x\nu _\Delta ^s ||_{L^1}\) and \(||{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \). From the assumptions we infer \(||\Delta x\nu _\Delta ^s ||_{L^1}\lesssim a_{\Delta }\) via

    $$\begin{aligned} \int _0^{a_\Delta } \Delta x\nu (\,\mathrm {d}x) =\lim _{a\downarrow a_\Delta }\int _0^a\Delta x\nu (\,\mathrm {d}x) \lesssim \limsup _{a\downarrow a_\Delta }a\Delta a\nu (a)\leqslant \frac{4a_\Delta }{\delta }. \end{aligned}$$

    On the other hand, by construction there is some \(a_\Delta ^-\in [\frac{1}{2}a_\Delta ,a_\Delta ]\) such that \(\Delta a_\Delta ^-\nu (a_\Delta ^-)\geqslant 4/\delta \). Together with the assumptions, and \(\Vert {{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \leqslant \Vert \varphi _\Delta \Vert _1\), we see that for \(\varepsilon :=a_\Delta ^-\) sufficiently small, that is for \(\Delta \) small, and for some \(\kappa \in (2,4)\)

    $$\begin{aligned} a_\Delta ||{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty&\leqslant 2a_\Delta ^-\int _{-\infty }^\infty e^{-\frac{\Delta }{\kappa } u^2\int _0^{1/|u |} x^2\nu (\,\mathrm {d}x)}\,\mathrm {d}u\\&=2\int _{-\infty }^\infty e^{-\frac{\Delta }{\kappa } (v/a_\Delta ^-)^{2}\int _0^{a_\Delta ^-/|v |} x^2\nu (\,\mathrm {d}x)}\,\mathrm {d}v\\&\leqslant 4+2\int _{|v |>1} e^{-\frac{\delta }{\kappa }\Delta a_\Delta ^-\nu (a_\Delta ^-)\log (|v |)}\,\mathrm {d}v\\&\leqslant 4+4\int _{1}^\infty v^{-4/\kappa }\,\mathrm {d}v\sim 1, \end{aligned}$$

    which together with the bound on \(||\Delta x\nu _\Delta ^s ||_{L^1}\) yields the result.
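
    The last integral above is finite precisely because \(\kappa <4\); for completeness,

    $$\begin{aligned} \int _{1}^\infty v^{-4/\kappa }\,\mathrm {d}v=\Big [\frac{v^{1-4/\kappa }}{1-4/\kappa }\Big ]_{1}^{\infty }=\frac{\kappa }{4-\kappa }<\infty , \end{aligned}$$

    so the right-hand side is bounded by a constant depending only on \(\kappa \).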

  4. (iv)

    By Theorem 27.7 in [31], \({{\mathrm{{\mathbb {P}}}}}_\Delta \) admits a Lebesgue density, hence by Fourier inversion \(||x{{\mathrm{{\mathbb {P}}}}}_\Delta ||_\infty \leqslant ||\varphi _\Delta ' ||_{L^1}\). Using the hypothesis on \(\nu ^+\), we estimate for some \(\kappa >0\) and some small \(c>0\)

    $$\begin{aligned} ||x{{\mathrm{{\mathbb {P}}}}}_\Delta ^+ ||_\infty&\leqslant \int _{-\infty }^\infty \Delta {\Bigl |\int _0^\infty (e^{iux}-1)x\nu (\,\mathrm {d}x)+\gamma \Bigr |}e^{-\Delta \int _0^\infty (1-\cos (ux))\nu (\,\mathrm {d}x)}\,\mathrm {d}u\\&\lesssim \int _{-\infty }^\infty \Delta \Big (1+\int _0^\infty (|u |x^2\wedge x)\nu (\,\mathrm {d}x)\Big ) e^{-\frac{\Delta }{\kappa } u^2\int _0^{1/|u |}x^2\nu (\,\mathrm {d}x)}\,\mathrm {d}u\\&\leqslant \int _{-\infty }^\infty \Delta \Big (1+\int _0^\infty (|u |x^2\wedge x)\nu (\,\mathrm {d}x)\Big ) e^{-c\Delta \int _0^\infty (\tfrac{u^2}{2}x^2\wedge |u |x)\nu (\,\mathrm {d}x)}\,\mathrm {d}u. \end{aligned}$$

    The derivative of the exponent is given by \(-c\Delta {{\mathrm{sgn}}}(u)\int _0^\infty (|u |x^2\wedge x)\nu (\,\mathrm {d}x)\), so that the last line of the display is bounded by

    $$\begin{aligned} \int _{-\infty }^\infty \Delta e^{-c\Delta \int _0^\infty (\tfrac{u^2}{2} x^2\wedge |u |x)\nu (\,\mathrm {d}x)}\,\mathrm {d}u +2/c. \end{aligned}$$

    From \(|u |\int _0^{1/|u |}x^2\nu (\,\mathrm {d}x)\gtrsim 1\) we infer that the integral is at most of order \(\int \Delta e^{-\Delta |u |}\,\mathrm {d}u\thicksim 1\) and the result follows.
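
    The final bound is an elementary computation, independent of \(\Delta \):

    $$\begin{aligned} \int _{-\infty }^\infty \Delta e^{-\Delta |u |}\,\mathrm {d}u=2\int _0^\infty \Delta e^{-\Delta u}\,\mathrm {d}u=2. \end{aligned}$$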

6 Proof of Theorem 11

We recall \(g_t(x)=\rho (x)1\!\!1_{(-\infty ,t]}(x)\) and hence

$$\begin{aligned} \mathbb {G}_n(t)=\sqrt{n}\int _{{{\mathrm{{\mathbb {R}}}}}}\Delta ^{-1/2}x^2{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){\mathcal {F}} g_t(u)](x)({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_\Delta )(\,\mathrm {d}x), \quad t\in {{\mathrm{{\mathbb {R}}}}}. \end{aligned}$$

By Proposition 9 and Theorem 1.5.7 in [33] it suffices to show that there is a semimetric \(d\) such that \(({{\mathrm{{\mathbb {R}}}}},d)\) is totally bounded and for every \(\gamma >0\) we have

$$\begin{aligned} \lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\Pr \left( \sup _{s,t \in {{\mathrm{{\mathbb {R}}}}}: d(s,t)\leqslant \delta } |\mathbb {G}_n(s)-\mathbb {G}_n(t)|>\gamma \right) =0. \end{aligned}$$
(39)

We note that \({\mathbb {G}}_n\) equals a triangular array of empirical processes \(\sqrt{n} ({{\mathrm{{\mathbb {P}}}}}_{\Delta , n}-{{\mathrm{{\mathbb {P}}}}}_{\Delta })\) indexed by the class

$$\begin{aligned} \widetilde{{{\mathrm{{\mathcal {G}}}}}}_n&:=\{{\widetilde{g}}_t(x):t\in {{\mathrm{{\mathbb {R}}}}}\},\\ \widetilde{g}_t(x)&:=\Delta ^{-1/2}x^2{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-{\scriptstyle \bullet } ){\mathcal {F}} g_t({\scriptstyle \bullet } )](x). \end{aligned}$$

6.1 Equicontinuity and a change of metric

For \(t\leqslant 0\) we decompose \(\widetilde{g}_t\) into the three terms

$$\begin{aligned} {\widetilde{g}}_t^{(1)}(x)&:=\Delta ^{-1/2}x^2{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[(\rho ({\scriptstyle \bullet } ) -e^{{\scriptstyle \bullet } -t}\rho (t) )1\!\!1_{(-\infty ,t]}({\scriptstyle \bullet } )](u)](x),\end{aligned}$$
(40)
$$\begin{aligned} {\widetilde{g}}_t^{(2)}(x)&:=\Delta ^{-1/2}x{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[te^{{\scriptstyle \bullet } -t}\rho (t) 1\!\!1_{(-\infty ,t]}({\scriptstyle \bullet } )](u)](x),\end{aligned}$$
(41)
$$\begin{aligned} {\widetilde{g}}_t^{(3)}(x)&:={\widetilde{g}}_t(x)-{\widetilde{g}}_t^{(1)}(x)-{\widetilde{g}}_t^{(2)}(x). \end{aligned}$$
(42)

Heuristically speaking, the main difficulties arise from the fact that \(1\!\!1_{(-\infty , t]}\) is non-integrable on \({{\mathrm{{\mathbb {R}}}}}\) and discontinuous at \(t\). The above decomposition separates the jump discontinuity from the non-integrable part, and the third term collects the remainder, which has neither discontinuity nor integrability issues. We refer to the second term as the ‘critical term’ since it is not regular enough to be treated by the usual metric entropy techniques.

For \(t>0\) we replace \(e^{y-t}\rho (t)1\!\!1_{(-\infty ,t]}(y)\) by \(-e^{t-y}\rho (t)1\!\!1_{(t, \infty )}(y)\), and the proof below proceeds with only notational changes. We thus restrict to \(t \in (-\infty ,0]\).

By the triangle inequality it suffices to show asymptotic equicontinuity separately for the empirical processes indexed by the three terms of the above decomposition, with appropriate metrics \(d^{(i)}, i=1,2,3\); then (39) holds with the overall metric \(d=\max _i d^{(i)}\). In view of the variance structure of the limiting process \({\mathbb {G}}\) it is natural to choose the semimetrics

$$\begin{aligned} d^{(i)}(s,t) = \sqrt{\int _{{{\mathrm{{\mathbb {R}}}}}} (g_s^{(i)}-g^{(i)}_t )^2(x) \nu (\,\mathrm {d}x) },\quad i=1,2,3, \end{aligned}$$

where

$$\begin{aligned} g_t^{(1)}(x)&:=x^2(\rho (x)-e^{x-t}\rho (t) )1\!\!1_{(-\infty ,t]}(x),\end{aligned}$$
(43)
$$\begin{aligned} g_t^{(2)}(x)&:=xte^{x-t}\rho (t) 1\!\!1_{(-\infty ,t]}(x),\end{aligned}$$
(44)
$$\begin{aligned} g_t^{(3)}(x)&:=x(x-t)e^{x-t}\rho (t) 1\!\!1_{(-\infty ,t]}(x), \end{aligned}$$
(45)

and we note \(x^2g_t=g_t^{(1)}+g_t^{(2)}+g_t^{(3)}\). On the other hand the covariance metric compatible with the distribution \({{\mathrm{{\mathbb {P}}}}}_\Delta \) of the \(X_k\)’s driving the empirical process is given by the \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-distance. In the following we will show that a \(\delta \)-increment for the limiting metric \(d^{(i)}\) corresponds, for \(n\) large enough, to a \(\delta \)-increment in the \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-metric on the functions \({\widetilde{g}}_t^{(i)}\). Verifying asymptotic equicontinuity for the whole process then reduces to showing total boundedness of each subclass and that, for each \(i=1, 2, 3\), and every \(\gamma >0\),

$$\begin{aligned} \lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\Pr \left( \sup _{\Vert {\widetilde{g}}_s^{(i)}-{\widetilde{g}}_t^{(i)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\leqslant \delta } \left| \sqrt{n}\int _{{{\mathrm{{\mathbb {R}}}}}} ({\widetilde{g}}_s^{(i)}-{\widetilde{g}}_t^{(i)})({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_\Delta )(\,\mathrm {d}x)\right| >\gamma \right) =0, \end{aligned}$$
(46)

where \(\Vert f\Vert _{2,P}:=(\int |f|^2 \,\mathrm {d}P)^{1/2}\). This will permit the application of powerful tools from empirical process theory to control the last probabilities. Before we do this, we demonstrate the reduction to (46) for all three terms in the above decomposition separately. We note that total boundedness of the classes \({\mathcal {G}}^{(i)} = \{g_t^{(i)}: t \in {{\mathrm{{\mathbb {R}}}}}\}\) for the \(d^{(i)}\)-metric follows from entropy computations given in the following subsections.
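
As a sanity check, the algebraic identity \(x^2g_t=g_t^{(1)}+g_t^{(2)}+g_t^{(3)}\) noted above can be verified numerically on the set \(\{x\leqslant t\}\), where all indicators equal one; the following sketch uses an arbitrary placeholder for \(\rho \) (the identity holds for any \(\rho \)).

```python
import math

# Definitions (43)-(45) restricted to {x <= t}, where the indicators equal one.
def g1(x, t, rho):
    return x**2 * (rho(x) - math.exp(x - t) * rho(t))

def g2(x, t, rho):
    return x * t * math.exp(x - t) * rho(t)

def g3(x, t, rho):
    return x * (x - t) * math.exp(x - t) * rho(t)

# Placeholder weight function (assumption: the identity holds for any rho).
rho = lambda y: 1.0 / (1.0 + y * y)

for t in (-2.0, -1.0, -0.25):
    for x in (t - 3.0, t - 1.0, t):
        lhs = x**2 * rho(x)  # x^2 * g_t(x) on {x <= t}
        rhs = g1(x, t, rho) + g2(x, t, rho) + g3(x, t, rho)
        assert abs(lhs - rhs) < 1e-12
```

The cancellation behind the check: the second and third terms sum to \(xe^{x-t}\rho (t)(t+(x-t))=x^2e^{x-t}\rho (t)\), which cancels the exponential part of the first term.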

Starting with \(\{{\widetilde{g}}_t^{(1)}: t \leqslant 0\}\), we note that the functions

$$\begin{aligned} x^{-1}g_t^{(1)}(x)&=x(\rho (x)-e^{x-t}\rho (t))1\!\!1_{(-\infty ,t]}(x)\\&=(x\rho (x)-(x-t)e^{x-t}\rho (t)-e^{x-t}t\rho (t))1\!\!1_{(-\infty ,t]}(x),\quad t\leqslant 0, \end{aligned}$$

are uniformly bounded and uniformly Lipschitz continuous. In order to compare \(d^{(1)}\) to the \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-norm on \(\{{\widetilde{g}}_t^{(1)}:t\leqslant 0\}\), we claim

$$\begin{aligned} \sup _{s,t\leqslant 0} \left| \int ({\widetilde{g}}_s^{(1)}(x)-{\widetilde{g}}_t^{(1)}(x))^2{{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)}-\int (g_s^{(1)}(x)-g_t^{(1)}(x))^2\nu (\,\mathrm {d}x)\right| \rightarrow 0 \end{aligned}$$
(47)

as \(n\rightarrow \infty \). Any class of functions that is uniformly bounded and uniformly Lipschitz continuous is a uniformity class for weak convergence; this follows either from Theorem 1 in [4] or from the well-known fact that the bounded-Lipschitz metric metrises weak convergence. So, the weak convergence in Lemma 2 yields

$$\begin{aligned}&\sup _{s,t\leqslant 0}\left| \int (x^{-1}g_s^{(1)}(x)-x^{-1}g_t^{(1)}(x))^2 x^2\Delta ^{ -1 } {{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\right. \\&\quad \left. -\int (x^{-1}g_s^{(1)}(x)-x^{-1}g_t^{(1)}(x))^2x^2\nu (\,\mathrm {d}x)\right| \rightarrow 0 \end{aligned}$$

as \(n\rightarrow \infty \). Next using \(0<\rho (x)\leqslant C (1\wedge x^{-2})\) and the bounded variation of \(\rho \), we see that \({{\mathrm{{\mathcal {G}}}}}:=\{x^{-2}(g_s^{(1)}(x)-g_t^{(1)}(x)):s,t\leqslant 0\}\) satisfies the assumption of Lemma 22 and hence

$$\begin{aligned} \sup _{s,t\leqslant 0}\int \left( ({\widetilde{g}}_s^{(1)}(x)-{\widetilde{g}}_t^{(1)}(x))-\Delta ^{-1/2}(g_s^{(1)}(x)-g_t^{(1)}(x))\right) ^2{{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)}\rightarrow 0 \end{aligned}$$
(48)

as \(n\rightarrow \infty \). We conclude that (47) and then also the reduction to (46) holds for \(\{{\widetilde{g}}_t^{(1)}: t \in {{\mathrm{{\mathbb {R}}}}}\}\).

A similar reduction for \({\widetilde{g}}_t^{(2)}\) defined in (41) is achieved as follows. As in (47) we claim that

$$\begin{aligned}&\sup _{s,t\leqslant 0}\left| \int ({\widetilde{g}}_s^{(2)}-{\widetilde{g}}_t^{(2)})^2\,\mathrm {d}{{\mathrm{{\mathbb {P}}}}}_\Delta -\int (g_s^{(2)}-g_t^{(2)})^2\,\mathrm {d}\nu \right| \nonumber \\&\quad \leqslant \sup _{s,t\leqslant 0}\left| \int ({\widetilde{g}}_s^{(2)}-{\widetilde{g}}_t^{(2)})^2\,\mathrm {d}{{\mathrm{{\mathbb {P}}}}}_\Delta -\int (g_s^{(2)}-g_t^{(2)})^2\frac{\,\mathrm {d}{{\mathrm{{\mathbb {P}}}}}_\Delta }{\Delta }\right| \nonumber \\&\qquad +\sup _{s,t\leqslant 0}\left| \int (g_s^{(2)}-g_t^{(2)})^2\frac{\,\mathrm {d}{{\mathrm{{\mathbb {P}}}}}_\Delta }{\Delta }-\int (g_s^{(2)}-g_t^{(2)})^2\,\mathrm {d}\nu \right| \end{aligned}$$
(49)

converges to zero as \(n\rightarrow \infty \). To see this we observe that by Lemma 2 the measures \((1\wedge x^4)\Delta ^{-1}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\) converge weakly to \((1\wedge x^4)\nu (\,\mathrm {d}x)\). The limit is absolutely continuous with respect to Lebesgue measure and thus the functions \(g_t^{(2)}(x)/(1\wedge x^2)\), \(t<0\), are \((1\wedge x^4)\nu (\,\mathrm {d}x)\)-almost everywhere continuous. Moreover, the functions

$$\begin{aligned} \frac{g_t^{(2)}(x)}{1\wedge x^2}=xte^{x-t}\rho (t)1\!\!1_{(-\infty ,t]}(x)\vee \frac{t}{x}e^{x-t}\rho (t)1\!\!1_{(-\infty ,t]}(x),\quad t<0, \end{aligned}$$
(50)

are all contained in a bounded set of the space of functions of bounded variation and hence \(\{(g_s^{(2)}(x)/(1\wedge x^{2})-g_t^{(2)}(x)/(1\wedge x^{2}))^2:~s,t<0\}\) forms a uniformity class for the weak convergence towards \((1 \wedge x^4)\nu (\,\mathrm {d}x)\) (after renormalising the measures involved to have mass one, and invoking Theorem 1 in [4]). Consequently

$$\begin{aligned} \sup _{s,t\leqslant 0}\left| \int (g_s^{(2)}-g_t^{(2)})^2\frac{\,\mathrm {d}{{\mathrm{{\mathbb {P}}}}}_\Delta }{\Delta } -\int (g_s^{(2)}-g_t^{(2)})^2\,\mathrm {d}\nu \right| \rightarrow 0 \end{aligned}$$
(51)

as \(n\rightarrow \infty \), where we recall that \(g_0^{(2)}=0\). To deal with the first term in (49) we define

$$\begin{aligned} {\bar{g}}_t^{(2)}(x)&:=\Delta ^{-1/2}x^2{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[y^{-1}e^{y-t}t\rho (t) 1\!\!1_{(-\infty ,t]}(y)] (u)](x). \end{aligned}$$
(52)

Lemma 22 can be applied to the class \({{\mathrm{{\mathcal {G}}}}}:=\{y^{-2}(g_s^{(2)}(y)-g_t^{(2)}(y)):s,t\leqslant 0\}\) using that \(y^{-2}g_t^{(2)}(y)\) is uniformly bounded in the space of bounded variation functions, as observed after (50). This yields

$$\begin{aligned} \sup _{s,t\leqslant 0}\int \left( ({\bar{g}}_s^{(2)}-{\bar{g}}_t^{(2)}) -\Delta ^{-1/2}(g_s^{(2)}-g_t^{(2)})\right) ^2\,\mathrm {d}{{\mathrm{{\mathbb {P}}}}}_\Delta \rightarrow 0 \end{aligned}$$
(53)

as \(n\rightarrow \infty \). Therefore, (49) follows from (51) and (53) if

$$\begin{aligned} \Vert {\bar{g}}_t^{(2)} - {\widetilde{g}}_t^{(2)}\Vert _{L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )}\rightarrow 0 \end{aligned}$$
(54)

uniformly in \(t\leqslant 0\). To show this, note that

$$\begin{aligned} {\bar{g}}_t^{(2)}(x)-{\widetilde{g}}_t^{(2)}(x)&=i\Delta ^{-1/2}x {{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-u){{\mathrm{{\mathcal {F}}}}}[y^{-2}g_t^{(2)}(y)](u)](x), \end{aligned}$$
(55)

for \(t<0\) and \({\bar{g}}_0^{(2)}(x)={\widetilde{g}}_0^{(2)}\). We will use the following proposition, which is an adaptation of the pseudo-differential operator inequality of Proposition 10 in [27]. We denote the \(L^q\)-Sobolev space for \(q\in [1,\infty )\) and \(s\in {{\mathrm{{\mathbb {N}}}}}\) by \(W^s_q({{\mathrm{{\mathbb {R}}}}}):=\{f\in L^q({{\mathrm{{\mathbb {R}}}}}):\sum _{k=0}^s\Vert f^{(k)}\Vert _{L^q}<\infty \}\) and define \(\Vert f\Vert _{L^2(P)}:=(\int |f|^2 \,\mathrm {d}P)^{1/2}\).

Proposition 23

Let \(P\) be a probability measure with a Lebesgue density, also denoted by \(P\), such that \(\Vert x^{2j+k} P\Vert _\infty <\infty \) for some \(j,k\in {{\mathrm{{\mathbb {N}}}}}\). Let \(f\in L^2({{\mathrm{{\mathbb {R}}}}})\) with \({{\mathrm{\hbox {supp}}}}(f)\cap (-\delta ,\delta )=\varnothing \) for some \(\delta >0\). Then for any \(p,q\in [1,2]\), \(s\in \{1,2\},\) and any compactly supported function \(\mu \in W^s_q({{\mathrm{{\mathbb {R}}}}})\)

$$\begin{aligned} \Vert x^{j}({{\mathrm{{\mathcal {F}}}}}^{-1}[\mu ]*f)\Vert _{L^2(P)} \lesssim \frac{\Vert x^{2j+k} P\Vert _\infty ^{1/2}}{\delta ^{k/2}}\Vert \mu \Vert _{L^{2p/(2-p)}}\Vert f\Vert _{L^{p}} \!+\!\delta ^j\Vert \mu ^{(s)}\Vert _{L^q}\left\| \frac{f(y)}{y^s}\right\| _{L^q} \end{aligned}$$

provided that the right-hand side is finite. The constant does not depend on \(\mu \), \(\delta \) or \(f\).

Proof

For \(f\in L^2({{\mathrm{{\mathbb {R}}}}})\) and \(s=1,2\) we can show, as in [27], the pseudo-differential operator identity

$$\begin{aligned} ({{\mathrm{{\mathcal {F}}}}}^{-1}[\mu ]*f)(x)=\left( \left( \frac{1}{(i{\scriptstyle \bullet } )^s}{{\mathrm{{\mathcal {F}}}}}^{-1}\big [\mu ^{(s)}\big ] \right) *f\right) (x), \quad x\notin {{\mathrm{\hbox {supp}}}}(f). \end{aligned}$$

Let \(\delta ':=\delta /2\). We use Hölder’s inequality, Plancherel’s identity and the Hausdorff–Young inequality to conclude

$$\begin{aligned}&\int |x|^{2j}|{{\mathrm{{\mathcal {F}}}}}^{-1}[\mu ]*f|^2P(\,\mathrm {d}x)\\&\quad \leqslant \Vert {{\mathrm{{\mathcal {F}}}}}^{-1}[\mu ]*f\Vert ^2_{L^{2}} \Vert x^{2j}\,\mathrm {d}P\Vert _{\ell ^\infty ([-\delta ',\delta ']^c) }\\&\qquad +\Vert {{\mathrm{{\mathcal {F}}}}}^{-1}[\mu ]*f\Vert ^2_{\ell ^\infty ([-\delta ',\delta '])} \int _{-\delta '}^{\delta '}|x|^{2j}P(\,\mathrm {d}x)\\&\quad \lesssim \Vert \mu {\mathcal {F}} f\Vert _{L^2}^2\Vert x^{2j+k} P\Vert _{\infty }(\delta ')^{-k} +\Vert (x^{-s}{{\mathrm{{\mathcal {F}}}}}^{-1}[\mu ^{(s)}](x))*f\Vert ^2_{ L^ { \infty } ([ -\delta ',\delta ']) }(\delta ')^{2j}\\&\quad \lesssim (\delta ')^{-k}\Vert x^{2j+k} P\Vert _{\infty }\Vert \mu \Vert _{L^{2p/(2-p)}}^2 \Vert {\mathcal {F}} f\Vert ^2_{L^{p/(p-1)}} \\&\qquad +(\delta ')^{2j}\Vert {{\mathrm{{\mathcal {F}}}}}^{-1}[\mu ^{(s)}]\Vert _{L^{q/(q-1)}}^2\sup _{x\in [ -\delta ',\delta ']} \left( \int _{{{\mathrm{{\mathbb {R}}}}}}\frac{|f(y)|^q}{|x-y|^{sq}}\,\mathrm {d}y\right) ^{2/q}\\&\quad \lesssim (\delta ')^{-k}\Vert x^{2j+k} P\Vert _{\infty }\Vert \mu \Vert _{L^{2p/(2-p)}}^2\Vert f\Vert ^2_{L^{p}} +(\delta ')^{2j}\Vert \mu ^{(s)}\Vert _{L^q}^2\Vert f(y)/y^s\Vert _{L^q}^2. \end{aligned}$$

The result follows by taking the square root.\(\square \)

We apply Proposition 23 with \(P={{\mathrm{{\mathbb {P}}}}}_\Delta \), \(\mu =m'(-{\scriptstyle \bullet } )\), \(f(y)=g_t^{(2)}(y)/y^2\), \(\delta = |t|\), \(p=1\), \(q=2\), \(k=1\), \(j=1\) and \(s=1\). Using \(\Vert x^3{{\mathrm{{\mathbb {P}}}}}_\Delta \Vert _\infty \lesssim \Delta \) we estimate (55) for \(t<0\) by

$$\begin{aligned}&\Delta ^{-1}\Vert x{{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-u){{\mathrm{{\mathcal {F}}}}}[y^{-2}g_t^{(2)}(y)](u)]\Vert _{L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )}^2\\&\quad \lesssim |t|^{-1}\Vert m'(-{\scriptstyle \bullet } )\Vert _{L^2}^2\Vert y^{-2}g_t^{(2)}(y)\Vert _{L^1}^2+t^2 \Delta ^{-1} \Vert m''(-{\scriptstyle \bullet } )\Vert _{L^2}^2\Vert y^{-3}g_t^{(2)}(y) \Vert _{L^2}^2\\&\quad \lesssim |t|^{-1}\Vert m'(-{\scriptstyle \bullet } )\Vert _{L^2}^2 \left( \int _{-\infty }^{t}y^{-1}e^{y-t}t\rho (t) \,\mathrm {d}y\right) ^2\\&\qquad +\Delta ^{-1} \Vert m''(-{\scriptstyle \bullet } )\Vert _{L^2}^2\Vert e^{y-t}1\!\!1_{(-\infty ,t]}(y)\Vert _{L^2}^2\\&\quad \lesssim \Vert m'(-{\scriptstyle \bullet } )\Vert _{L^2}^2\left( \int _{-\infty }^{t} |y|^{-1}e^{y-t}|t|^{1/2}\rho (t) \,\mathrm {d}y\right) ^2+\Delta ^{-1} \Vert m''(-{\scriptstyle \bullet } )\Vert _{L^2}^2\\&\quad \lesssim \Vert m'(-{\scriptstyle \bullet } )\Vert _{L^2}^2\left( \int _{-\infty }^{-1}e^{y+1}\,\mathrm {d}y+|t|^{1/2}\int _{-1}^{t}|y|^{-1}\,\mathrm {d}y1\!\!1_{\{t>-1\}}\right) ^2+\Delta ^{-1} \Vert m''(-{\scriptstyle \bullet } )\Vert _{L^2}^2\\&\quad \lesssim (1+\max _{t\in [-1,0) } |t|\log (1/|t|)^2)\Vert m'(-{\scriptstyle \bullet } )\Vert _{L^2}^2+\Delta ^{-1} \Vert m''(-{\scriptstyle \bullet } )\Vert _{L^2}^2\\&\quad \lesssim \Vert m'(-{\scriptstyle \bullet } )\Vert _{L^2}^2+\Delta ^{-1} \Vert m''(-{\scriptstyle \bullet } )\Vert _{L^2}^2, \end{aligned}$$

which converges to zero uniformly for \(t<0\) by Assumption 8. Hence, tightness of the empirical processes indexed by \({\widetilde{g}}_t^{(2)}\) can be verified by (46) with \(i=2\).
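
The boundedness of \(\max _{t\in [-1,0)}|t|\log (1/|t|)^2\) used in the display is elementary: with \(s=|t|\), calculus gives the maximiser \(s=e^{-2}\) with value \(4e^{-2}\). A quick numerical illustration (not part of the proof):

```python
import math

# f(s) = s * log(1/s)^2 on (0, 1]; the proof only needs sup f < infinity.
# Calculus: f'(s) = log(1/s) * (log(1/s) - 2) vanishes at s = e^{-2},
# where f(e^{-2}) = 4 * e^{-2} is the maximum.
f = lambda s: s * math.log(1.0 / s) ** 2
grid = [k / 100000 for k in range(1, 100001)]
sup_f = max(f(s) for s in grid)
assert sup_f <= 4 * math.exp(-2)   # the grid never exceeds the analytic maximum
assert sup_f > 0.54                # and comes close to 4e^{-2} ~ 0.5413
```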

Finally, we discuss the remaining \({\widetilde{g}}_t^{(3)}={\widetilde{g}}_t-{\widetilde{g}}_t^{(1)}-{\widetilde{g}}_t^{(2)}\). We have

$$\begin{aligned} g_t^{(3)}(x)=x^2g_t(x)-g_t^{(1)}(x)-g_t^{(2)}(x). \end{aligned}$$

We combine Lemma 22 for \({{\mathrm{{\mathcal {G}}}}}:=\{g_s-g_t:s,t\leqslant 0\}\), with (48), (53) and (54) and obtain

$$\begin{aligned} \sup _{s,t\leqslant 0}\left| \left\| {\widetilde{g}}_s^{(3)}-{\widetilde{g}}_t^{(3)}\right\| _{L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )}-\Delta ^{-1/2}\left\| g_s^{(3)}-g_t^{(3)} \right\| _{ L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )}\right| \rightarrow 0. \end{aligned}$$

Exactly as in (51) we infer

$$\begin{aligned} \sup _{s,t\leqslant 0}\left| \int (g_s^{(3)}-g_t^{(3)})^2\frac{\,\mathrm {d}{{\mathrm{{\mathbb {P}}}}}_\Delta }{\Delta } -\int (g_s^{(3)}-g_t^{(3)})^2\,\mathrm {d}\nu \right| \rightarrow 0 \end{aligned}$$

and thus we obtain the counterpart to (47)

$$\begin{aligned} \sup _{s,t\leqslant 0}\left| \int ({\widetilde{g}}_s^{(3)}-{\widetilde{g}}_t^{(3)})^2{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)-\int (g_s^{(3)}-g_t^{(3)})^2\,\mathrm {d}\nu \right| \rightarrow 0. \end{aligned}$$

6.2 Asymptotic equicontinuity for the ‘non-critical terms’

We next turn to verifying the asymptotic equicontinuity condition (46) for the terms \({\widetilde{g}}_t^{(i)}, i \in \{1,3\}\). We refer to them as non-critical since uniform tightness of these processes can be deduced directly from existing bracketing metric entropy inequalities for the empirical process.

We recall standard empirical process notation such as \(\Vert G\Vert _{\mathfrak F}:=\sup _{f\in \mathfrak F}|G(f)|\) and \(\Vert f\Vert _{2,P}:=(\int |f|^2 \,\mathrm {d}P)^{1/2}\). We denote by \(H(\varepsilon ,\mathfrak {F},\Vert \cdot \Vert )\) the logarithm of the covering number \(N(\varepsilon ,\mathfrak {F},\Vert \cdot \Vert )\) and by \(H_{[\,]}(\varepsilon ,\mathfrak {F},\Vert \cdot \Vert )\) the logarithm of the covering number under bracketing \(N_{[\,]}(\varepsilon ,\mathfrak {F},\Vert \cdot \Vert )\) (see [33] for definitions). For a class of functions \(\mathfrak F\) we define

$$\begin{aligned} \mathfrak {F}_\delta ':=\{f-g:f,g\in \mathfrak {F},\Vert f-g\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\leqslant \delta \}. \end{aligned}$$

We define the functions \(f_t(x):=x^{-2}g_t^{(1)}(x)=(\rho (x)-e^{x-t}\rho (t) )1\!\!1_{(-\infty ,t]}(x)\) and recall \({\widetilde{g}}_t^{(1)}(x)=\Delta ^{-1/2}x^2{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){\mathcal {F}} f_t(u)](x)\). In order to show the equicontinuity condition (46) for \({\widetilde{g}}_t^{(1)}\) we define the corresponding classes

$$\begin{aligned} \widetilde{\mathfrak {F}}:=\{{\widetilde{g}}_t^{(1)}:t\leqslant 0\}. \end{aligned}$$

We suppress in the notation the implicit dependence on \(n\) through \(\Delta \). The weak derivative \(D\rho \) is in \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\) by the Lipschitz continuity of \(\rho \). Since \(\rho \) is also of bounded variation we have \(D\rho \in L^1({{\mathrm{{\mathbb {R}}}}})\cap \ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\subseteq L^2({{\mathrm{{\mathbb {R}}}}})\). The class \(\{f_t:t\leqslant 0\}\) is contained in a bounded set of the Sobolev space \(W_2^1({{\mathrm{{\mathbb {R}}}}})\) since the \(L^2({{\mathrm{{\mathbb {R}}}}})\)-norms of \(f_t\) and \(Df_t\) are bounded uniformly in \(t\). By the boundedness of \(m\) we conclude that the functions \({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}f_t(u)](x)\), \(t\leqslant 0\), are contained in a bounded subset of \(W_{2}^1({{\mathrm{{\mathbb {R}}}}})\), which embeds continuously into \(\ell ^\infty ({{\mathrm{{\mathbb {R}}}}})\). As an envelope of the class \((\widetilde{\mathfrak {F}})_\delta '\) we can thus take \(F(x):=c\Delta ^{-1/2}x^2\) for some \(c>0\). By Lemma 19.34 in [32] we have

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}\Vert \sqrt{n}({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_\Delta )\Vert _{(\widetilde{\mathfrak {F}})_\delta '} \lesssim J_{[\,]}(\delta ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )) + \sqrt{n}{{\mathrm{{\mathbb {P}}}}}_\Delta F\{F>\sqrt{n}a(\delta )\}, \end{aligned}$$
(56)

where \(a(\delta ):=\delta /\sqrt{\log N_{[\,]}(\delta ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))}\) and

$$\begin{aligned} J_{[\,]}(\delta ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )) :=\int _0^\delta \sqrt{\log N_{[\,]}(\varepsilon ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))}\,\mathrm {d}\varepsilon . \end{aligned}$$

The class \(\Delta ^{1/2}x^{-2}(\widetilde{\mathfrak {F}})_\delta '\) is contained in a bounded set of the Besov space \(B^{s}_{22}({{\mathrm{{\mathbb {R}}}}})\) for \(s\leqslant 1\), and this bounded set does not depend on \(\Delta \) or \(\delta \). Let \(\gamma >0\) be such that \(\int | x |^{4+2\gamma }\nu (\,\mathrm {d}x)<\infty \). We take \(s\in (1/2,1/2+\gamma )\). The proof of Theorem 1 in [26] with \(p=2\), \(q=2\) and \(\beta =0\) yields

$$\begin{aligned} H(\varepsilon ,\Delta ^{1/2}x^{-2}(\widetilde{\mathfrak {F}})_\delta ',\Vert \cdot \langle x \rangle ^{-\gamma }\Vert _\infty )\lesssim \varepsilon ^{-1/s}, \end{aligned}$$

where \(\langle x \rangle :=(1+x^2)^{1/2}\). Another way to write the entropy is \(H(\varepsilon ,\Delta ^{1/2}x^{-2}(\widetilde{\mathfrak {F}})_\delta ',\Vert \cdot \langle x \rangle ^{-\gamma }\Vert _\infty )=H(\varepsilon ,\Delta ^{1/2}(\widetilde{\mathfrak {F}} )_\delta ',\Vert \cdot x^{-2}\langle x \rangle ^{-\gamma }\Vert _\infty )\). A ball in the \(\Vert \cdot x^{-2}\langle x \rangle ^{-\gamma }\Vert _\infty \)-norm with centre \(f\) and radius \(\varepsilon \) is a bracket

$$\begin{aligned}{}[f-\varepsilon x^2 \langle x \rangle ^{\gamma },f+\varepsilon x^2\langle x \rangle ^{\gamma }], \end{aligned}$$

whose \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-size is given by \(\Vert 2\varepsilon x^2\langle x \rangle ^{\gamma }\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\). Consequently we have

$$\begin{aligned} H_{[\,]}(\varepsilon \Vert 2x^2\langle x \rangle ^{\gamma }\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta },\Delta ^{1/2}(\widetilde{\mathfrak {F}})_\delta ', L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\!\leqslant \! H(\varepsilon ,\Delta ^{1/2}(\widetilde{\mathfrak {F}})_\delta ',\Vert \cdot x^{-2}\langle x \rangle ^{-\gamma }\Vert _\infty )\lesssim \varepsilon ^{-1/s}. \end{aligned}$$

By Theorem 1.1 in [11] (see also [14]) we have \(\Delta ^{-1/2}2\varepsilon \Vert x^2\langle x \rangle ^{\gamma }\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\rightarrow 2\varepsilon \Vert x^2\langle x \rangle ^{\gamma }\Vert _{2,\nu }\) as \(n\rightarrow \infty \). We obtain by a rescaling that

$$\begin{aligned} H_{[\,]}(\varepsilon ,(\widetilde{\mathfrak {F}})_\delta ', L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )) \lesssim \varepsilon ^{-1/s}. \end{aligned}$$
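
Spelled out, this entropy bound makes the bracketing integral converge at an explicit rate: since \(s>1/2\),

$$\begin{aligned} J_{[\,]}(\delta ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\lesssim \int _0^\delta \varepsilon ^{-1/(2s)}\,\mathrm {d}\varepsilon =\frac{2s}{2s-1}\,\delta ^{(2s-1)/(2s)}\rightarrow 0\quad \text {as }\delta \rightarrow 0. \end{aligned}$$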

Since \(s>1/2\), the entropy integral \(J_{[\,]}(\delta ,(\widetilde{\mathfrak {F}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\) is finite and tends to zero as \(\delta \rightarrow 0\). To show that the left-hand side of (56) tends to zero, we first choose \(\delta >0\) small enough that the entropy integral is small. With \(\delta \) fixed, \(a(\delta )\) is bounded away from zero uniformly in \(\Delta \), and we choose \(n\) large enough that the second term is small. Recall that we have taken the envelope \(F(x)=c\Delta ^{-1/2}x^2\). We bound

$$\begin{aligned} \sqrt{n}{{\mathrm{{\mathbb {P}}}}}_\Delta F\{F>\sqrt{n}a(\delta )\}&\lesssim \sqrt{n}\Delta ^{-1/2}\int x^21\!\!1_{\{x^2>\sqrt{n\Delta }\,a(\delta )/c\}}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\\&\lesssim \Delta ^{-1}\int x^41\!\!1_{\{x^2>\sqrt{n\Delta }\,a(\delta )/c\}}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x), \end{aligned}$$

where we multiplied the integrand by \(cx^2/(\sqrt{n\Delta }\,a(\delta ))>1\). For \(M\) large enough \(\int x^4 1\!\!1_{\{x^2>M\}}\nu (\,\mathrm {d}x)\) is small. Since \(\Delta ^{-1}\int x^4 1\!\!1_{\{x^2>M\}}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\rightarrow \int x^4 1\!\!1_{\{x^2>M\}}\nu (\,\mathrm {d}x)\) by Theorem 1.1 in [11], \(n\Delta \rightarrow \infty \) as \(n\rightarrow \infty \) and \(a(\delta )\) is bounded away from zero, we have that \(\Delta ^{-1}\int x^41\!\!1_{\{x^2>\sqrt{n\Delta }\,a(\delta )/c\}}{{\mathrm{{\mathbb {P}}}}}_\Delta (\,\mathrm {d}x)\) is small for \(n\) large enough. So indeed the left-hand side of (56) tends to zero as \(\delta \rightarrow 0\) and \(n\rightarrow \infty \), and we have shown tightness of the empirical process indexed by \(\{{\widetilde{g}}^{(1)}_t:t\leqslant 0\}\).

Let us now consider the terms associated to

$$\begin{aligned} {\widetilde{g}}_t^{(3)}(x)&={\widetilde{g}}_t(x)-{\widetilde{g}}_t^{(1)}(x)-{\widetilde{g}}_t^{(2)}(x)\\&=\Delta ^{-1/2}x^2{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[e^{y-t}\rho (t) 1\!\!1_{(-\infty ,t]}(y)](u)](x)\\&\quad -\Delta ^{-1/2}x{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[te^{y-t} \rho (t)1\!\!1_{(-\infty ,t]}(y)](u)](x)\\&=i\Delta ^{-1/2}x{{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-u){{\mathrm{{\mathcal {F}}}}}[e^{y-t}\rho (t) 1\!\!1_{(-\infty ,t]}(y)](u)](x)\\&\quad +\Delta ^{-1/2}x{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[(y-t)e^{y-t} \rho (t)1\!\!1_{(-\infty ,t]}(y)](u)](x). \end{aligned}$$

The functions \((y-t)e^{y-t} \rho (t)1\!\!1_{(-\infty ,t]}(y)\) are bounded in \(L^2({{\mathrm{{\mathbb {R}}}}})\) uniformly in \(t\leqslant 0\), and so are their weak derivatives. We conclude that they are contained in a bounded set of \(B^1_{22}({{\mathrm{{\mathbb {R}}}}})\). The functions \(e^{y-t}\rho (t)1\!\!1_{(-\infty ,t]}(y)\), \(t\leqslant 0\), are contained in a bounded set of \(L^2({{\mathrm{{\mathbb {R}}}}})\). Assumption 8 implies, together with the Mikhlin Fourier multiplier theorem (e.g., Corollary 4.11 in [19]), that \(m\) is a Fourier multiplier on every Besov space \(B^s_{pq}({{\mathrm{{\mathbb {R}}}}})\), \(s\in {{\mathrm{{\mathbb {R}}}}}\), \(p,q\in [1,\infty ]\), and, moreover, that \(m'\) is a Fourier multiplier mapping \(B^s_{pq}({{\mathrm{{\mathbb {R}}}}})\) into \(B^{s+1}_{pq}({{\mathrm{{\mathbb {R}}}}})\). We see that \(\Delta ^{1/2}x^{-1}{\widetilde{g}}^{(3)}_t(x)\), \(t\leqslant 0\), are contained in a bounded set of \(B^1_{22}({{\mathrm{{\mathbb {R}}}}})\). We define the class \(\widetilde{{\mathcal {G}}}:=\{{\widetilde{g}}^{(3)}_t:t\leqslant 0\}\).

As an envelope of the class \((\widetilde{\mathcal {G}})_{\delta }'\) we can take \(G(x):=c\Delta ^{-1/2}|x|\) for some constant \(c>0\). Lemma 19.34 in [32] yields

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}\Vert \sqrt{n}({{\mathrm{{\mathbb {P}}}}}_{\Delta ,n}-{{\mathrm{{\mathbb {P}}}}}_\Delta )\Vert _{(\widetilde{\mathcal {G}})_\delta '} \lesssim J_{[\,]}(\delta ,(\widetilde{\mathcal {G}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )) + \sqrt{n}{{\mathrm{{\mathbb {P}}}}}_\Delta G\{G>\sqrt{n}a(\delta )\}. \end{aligned}$$
(57)

Again by the proof of Theorem 1 in [26] with \(s=1\), \(p=2\), \(q=2\), \(\beta =0\) and \(\gamma =1\) we have

$$\begin{aligned} H(\varepsilon ,\Delta ^{1/2}x^{-1}(\widetilde{{\mathcal {G}}})_{\delta }',\Vert \cdot \langle x\rangle ^{-1}\Vert _\infty )\lesssim \varepsilon ^{-1}. \end{aligned}$$

The entropy can be rewritten as \(H(\varepsilon ,\Delta ^{1/2}(\widetilde{{\mathcal {G}}})_{\delta }',\Vert \cdot x^{-1}\langle x\rangle ^{-1}\Vert _\infty )\). A corresponding \(\varepsilon \)-ball is a bracket of \(L^2({{\mathrm{{\mathbb {P}}}}}_\Delta )\)-size \(2\varepsilon \Vert x\langle x \rangle \Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\). By Theorem 1.1 in [11] we have \(\Delta ^{-1/2}\Vert x\langle x\rangle \Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\rightarrow (\Vert x\langle x \rangle \Vert _{2,\nu }^2+\sigma ^2)^{1/2}\) as \(n\rightarrow \infty \). Arguing as for \(\widetilde{\mathfrak {F}}\) we obtain

$$\begin{aligned} H_{[\,]}(\varepsilon ,(\widetilde{{\mathcal {G}}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\lesssim \varepsilon ^{-1}. \end{aligned}$$
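
Here the bracketing integral can be evaluated explicitly:

$$\begin{aligned} J_{[\,]}(\delta ,(\widetilde{{\mathcal {G}}})_\delta ',L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\lesssim \int _0^\delta \varepsilon ^{-1/2}\,\mathrm {d}\varepsilon =2\sqrt{\delta }\rightarrow 0\quad \text {as }\delta \rightarrow 0. \end{aligned}$$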

The entropy integral in (57) is finite and converges to zero as \(\delta \rightarrow 0\). The second term \(\sqrt{n}{{\mathrm{{\mathbb {P}}}}}_\Delta G\{G>\sqrt{n}a(\delta )\}\) can be treated exactly as the second term in (56) with \(x^2\) replaced by \(x\). So the \(\lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\) of (57) is zero and thus (46) follows for the functions \({\widetilde{g}}^{(3)}_t\).

6.3 Asymptotic equicontinuity of the ‘critical term’

It remains to show asymptotic equicontinuity of the empirical process indexed by the class

$$\begin{aligned} {\mathcal {Q}}_n&:=\{{\widetilde{g}}_t^{(2)}:t\leqslant 0\}, \end{aligned}$$

where we recall from (41) that

$$\begin{aligned} {\widetilde{g}}_t^{(2)}(x)=\Delta ^{-1/2}x({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u)]*q_t)(x),~~ q_t(y):=t\rho (t)e^{y-t}1\!\!1_{(-\infty ,t]}(y). \end{aligned}$$
(58)

We refer to this term as ‘critical’: the functions \(q_t\) have a jump discontinuity at \(t\), and controlling its interaction with the operator \({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-\cdot )]\) requires more elaborate techniques than those of the previous section.

We will rely on the following auxiliary result, a modification of Theorem 3 in [16], which itself goes back to fundamental ideas in [18]. It is designed to allow for maximally growing envelopes of the empirical process, which is crucial in our setting to allow for minimal conditions on \(\Delta \). Note that indeed Condition (a) only requires \(M_n/n^{1/2}\rightarrow 0\) instead of the more stringent condition \(M_n/n^{1/4} \rightarrow 0\) required in Theorem 3 of [16].

Proposition 24

For every \(n \in {{\mathrm{{\mathbb {N}}}}},\) let \(X_{n,j}, j=1, \ldots , n,\) be i.i.d. from law \(P_n\) on a measurable space \((S, {\mathcal {B}})\) and let \(\varepsilon _j\), \(j=1,\ldots ,n,\) be i.i.d. Rademacher random variables independent of the \(X_{n,j}\)’s, all defined on a common probability space \((\Omega , {\mathcal {A}}, \Pr )\). For any sequence \(({\mathcal {Q}}_n)_{n\geqslant 1}\) of classes of measurable functions \(q: S \rightarrow {{\mathrm{{\mathbb {R}}}}}\) and

$$\begin{aligned} ({\mathcal {Q}}_n)'_{r}:=\{q-q':q,q'\in {\mathcal {Q}}_n, \Vert q-q'\Vert _{2,P_n}\leqslant r_n\}, n \in {{\mathrm{{\mathbb {N}}}}}, \end{aligned}$$

suppose the following conditions are satisfied for some sequence \(r_n\rightarrow 0\) as \(n\rightarrow \infty \):

  (a)

    \(\sup _{q\in {\mathcal {Q}}_n}\Vert q\Vert _\infty \leqslant M_n\) for a sequence \(M_n\) such that \(n r_n^{2}{M_n}^{-2}\rightarrow \infty \).

  (b)
    $$\begin{aligned} \left\| \frac{1}{\sqrt{n}}\sum _{j=1}^n\varepsilon _j q(X_{n,j})\right\| _{({\mathcal {Q}}_n)'_{r}} = o_P(1) \end{aligned}$$

    as \(n\rightarrow \infty \).

  (c)

    There exists \(n_0\in {{\mathrm{{\mathbb {N}}}}}\) such that for all \(n\geqslant n_0\)

    $$\begin{aligned} 23 H(r_n,{\mathcal {Q}}_n,L^2(P_n))\leqslant n r_n^{2}{M_n}^{-2}. \end{aligned}$$
  (d)
    $$\begin{aligned} \lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\int _0^\delta \sqrt{H( \varepsilon ,{\mathcal {Q}}_n, L^2(P_n))}\,\mathrm {d}\varepsilon =0. \end{aligned}$$

Then for all \(\gamma >0\)

$$\begin{aligned} \lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\Pr \left( \sup _{q,q' \in {\mathcal {Q}}_n: \Vert q-q'\Vert _{2,P_n}\leqslant \delta } \left| \sqrt{n}\int _S (q\!-\! q')\left( \frac{1}{n}\sum _{j=1}^n \delta _{X_{n,j}}\!-\!P_n\right) (\,\mathrm {d}x)\right| >\gamma \right) \!=\!0. \end{aligned}$$

Proof

Let \(\gamma >0\) be given. To lighten notation we sometimes suppress the qualifier \(q,q' \in {\mathcal {Q}}_n\). By Lemma 11.2.6 in [9] we have for \(\delta \in (0,\gamma /\sqrt{2})\)

$$\begin{aligned}&{\Pr }\left( \sup _{\Vert q-q'\Vert _{2,P_n}\leqslant \delta } \frac{1}{\sqrt{n}}\left| \sum _{j=1}^{n} q(X_{n,j})-q'(X_{n,j})-{{\mathrm{{\mathbb {E}}}}}[q(X_{n,j})]+{{\mathrm{{\mathbb {E}}}}}[q'(X_{n,j})]\right| >\gamma \right) \\&\quad \leqslant 4{\Pr }\left( \sup _{ \Vert q-q'\Vert _{2,P_n}\leqslant \delta } \frac{1}{\sqrt{n}}\left| \sum _{j=1}^{n}\varepsilon _j(q(X_{n,j})-q'(X_{n,j} ))\right| >\frac{\gamma -\sqrt{2}\delta }{2}\right) , \end{aligned}$$

where the \(\varepsilon _j\) are Rademacher random variables independent of the \((X_{n,j})\), all defined on a large product probability space. Since \(\gamma \) is fixed and \(\delta \) tends to zero, we may choose \(\delta \) small enough that \(\delta <\gamma /2\). Hence it suffices to show for all \(\gamma >0\) that

$$\begin{aligned} \lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }{\Pr }\left( \sup _{ q\in ({\mathcal {Q}}_n)'_\delta } \frac{1}{\sqrt{n}}\left| \sum _{j=1}^{n}\varepsilon _j q(X_{n,j})\right| >3\gamma \right) =0. \end{aligned}$$

Let \({\mathcal {H}}={\mathcal {H}}_n\) be a maximal collection of functions \(h_1,\ldots ,h_m\) in \({\mathcal {Q}}_n\) such that \(\Vert h_j-h_k\Vert _{2,P_n}>r_n\) if \(j\ne k\). By maximality, the closed balls with centres \(h_1,\ldots ,h_m\) and radius \(r_n\) cover \({\mathcal {Q}}_n\). We define

$$\begin{aligned} {\mathcal {H}}'_\delta :=\{g-h:g,h\in {\mathcal {H}}, \Vert g-h\Vert _{2,P_n}\leqslant \delta \}. \end{aligned}$$
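The packing step above, namely that a maximal \(r_n\)-separated family automatically yields a covering by closed \(r_n\)-balls, can be sanity-checked numerically. The following sketch is illustrative only: a hypothetical one-dimensional point set stands in for \({\mathcal {Q}}_n\), the absolute value for the \(L^2(P_n)\)-distance, and `pack_and_cover` is our own name for the greedy construction.

```python
import random

def pack_and_cover(points, r, dist):
    """Greedily select a maximal r-separated subset (a packing); by
    maximality, the closed r-balls around the selected centres cover points."""
    centres = []
    for p in points:
        if all(dist(p, c) > r for c in centres):
            centres.append(p)
    return centres

random.seed(0)
pts = [random.uniform(0.0, 1.0) for _ in range(200)]
dist = lambda a, b: abs(a - b)
H = pack_and_cover(pts, 0.1, dist)

# packing property: pairwise distances between centres exceed r
assert all(dist(a, b) > 0.1 for i, a in enumerate(H) for b in H[i + 1:])
# covering property: every point lies within r of some centre
assert all(min(dist(p, c) for c in H) <= 0.1 for p in pts)
```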

For \(n\) large enough such that \(r_n<\delta /2\) we have

$$\begin{aligned}&{{\Pr }}\left( \sup _{q\in ({\mathcal {Q}}_n)'_\delta } \frac{1}{\sqrt{n}}\left| \sum _{j=1}^{n}\varepsilon _j q(X_{n,j})\right| >3\gamma \right) \nonumber \\&\quad \leqslant 2{\Pr }\left( \sup _{q\in ({\mathcal {Q}}_n)'_r} \frac{1}{\sqrt{n}}\left| \sum _{j=1}^{n}\varepsilon _j q(X_{n,j})\right| \!>\!\gamma \!\right) \!+\!{\Pr }\left( \max _{h\in {\mathcal {H}}_{2\delta }'} \frac{1}{\sqrt{n}}\left| \sum _{j=1}^{n}\varepsilon _j h(X_{n,j})\right| \!>\!\gamma \!\right) . \end{aligned}$$
(59)

By condition (b) the first term tends to zero. To control the second term we define the event

$$\begin{aligned} A_n:=\left\{ \max _{h\in {\mathcal {H}}_{2\delta }'\backslash \{0\}}\frac{\sum _{j=1}^n h^2(X_{n,j})}{nP_n h^2}<2\right\} \!, \end{aligned}$$

where we used the notation \(P_n f:=\int _S f \,\mathrm {d}P_n\) for functions \(f:S\rightarrow {{\mathrm{{\mathbb {R}}}}}\). Using Markov’s inequality the second term in (59) can be bounded by

$$\begin{aligned}&{\Pr }\left( \max _{h\in {\mathcal {H}}_{2\delta }'} \frac{1}{\sqrt{n}}\left| \sum _{j=1}^{n}\varepsilon _j h(X_{n,j})\right| >\gamma \right) \nonumber \\&\quad \leqslant {\Pr }(A_n^c)+\frac{1}{\gamma }{{\mathrm{{\mathbb {E}}}}}_X{{\mathrm{{\mathbb {E}}}}}_\varepsilon \left[ \left\| \frac{\sum _{j=1}^{n}\varepsilon _j h(X_{n,j})}{\sqrt{n}} \right\| _{{\mathcal {H}}_{2\delta }'}1\!\!1_ { A_n } \right] \,\,\!. \end{aligned}$$
(60)

The number of elements in \({\mathcal {H}}'_{2\delta }\) is bounded by

$$\begin{aligned} \#{\mathcal {H}}'_{2\delta }\leqslant \exp (2 H(r_n,{\mathcal {Q}}_n,L^2(P_n))). \end{aligned}$$
(61)

For a single \(h\in {\mathcal {H}}_{2\delta }'\backslash \{0\}\) we have, using Bernstein’s inequality,

$$\begin{aligned} {\Pr }\left( \frac{\sum _{j=1}^n h^2(X_{n,j})}{nP_n h^2}\geqslant 2\right)&={\Pr }\left( \sum _{j=1}^{n}(h^2(X_{n,j})-P_n h^2)\geqslant n P_n h^2\right) \\&\leqslant \exp \left( -\frac{n^2 (P_n h^2)^2}{2n P_n h^4+8M_n^2 n P_n h^2/3}\right) \\&\leqslant \exp \left( -\frac{nP_n h^2}{11 M_n^2}\right) . \end{aligned}$$
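The constants in this chain reduce to elementary arithmetic: since \(|h|\leqslant 2M_n\) for \(h\in {\mathcal {H}}_{2\delta }'\), we have \(P_n h^4\leqslant 4M_n^2 P_n h^2\), so the Bernstein denominator is at most \((8+8/3)M_n^2 nP_n h^2\leqslant 11 M_n^2 nP_n h^2\). The sketch below records this in exact arithmetic, together with our reading of why condition (c) carries the constant 23: it leaves a margin over the factor 11 appearing here, since \(2/23<1/11\).

```python
from fractions import Fraction

# With |h| <= 2*M_n we get P_n h^4 <= 4*M_n^2 * P_n h^2, so the Bernstein
# denominator 2n*P_n h^4 + (8/3)*M_n^2*n*P_n h^2 is at most c*M_n^2*n*P_n h^2:
c = Fraction(2 * 4) + Fraction(8, 3)   # = 32/3
assert c == Fraction(32, 3) and c < 11

# Condition (c) with constant 23 gives 2*H <= (2/23)*n*r_n^2/M_n^2, and
# 2/23 < 1/11, so the exponent 2*H - n*r_n^2/(11*M_n^2) tends to -infinity.
assert Fraction(2, 23) < Fraction(1, 11)
```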

Combining the last bound and (61) we obtain

$$\begin{aligned} \Pr (A_n^c)\leqslant \exp \left( 2 H(r_n,{\mathcal {Q}}_n,L^2(P_n))-\frac{n r_n^2}{11 M_n^2}\right) \rightarrow 0 \end{aligned}$$

by conditions (a) and (c), since \(2/23<1/11\). It remains to show that the second term in (60) converges to zero. Conditionally on the \(X_{n,j}\)’s the process

$$\begin{aligned} Z(h):=\frac{1}{\sqrt{n}}\sum _{j=1}^{n}\varepsilon _j h(X_{n,j}), \quad h\in {\mathcal {H}}, \end{aligned}$$

is subgaussian. Let \(h,h'\in {\mathcal {H}}\) such that \(h-h'\in {\mathcal {H}}_{2\delta }'\backslash \{0\}\). On the event \(A_n\) we have

$$\begin{aligned} d_Z(h,h')^2&:={{\mathrm{{\mathbb {E}}}}}_\varepsilon \left[ \left( \frac{1}{\sqrt{n}}\sum _{j=1} ^n\varepsilon _j h(X_{n,j})-\frac{1}{\sqrt{n}}\sum _{j=1} ^n\varepsilon _j h'(X_{n,j})\right) ^2\right] \\&=\frac{1}{n}\sum _{j=1}^n(h(X_{n,j})-h'(X_{n,j} ))^2<2 P_n(h-h')^2. \end{aligned}$$

In particular, on \(A_n\) the bound \(\Vert h-h'\Vert _{2,P_n}<\varepsilon \) implies \(d_Z(h,h')<\sqrt{2}\varepsilon \) for all \(\varepsilon \in (0,2\delta ]\) and for all \(h,h'\in {\mathcal {H}}\). We define \(\psi _2(x):=\exp (x^2)-1\) and the norm \(\Vert \xi \Vert _{\psi _2}:=\inf \{c>0:{{\mathrm{{\mathbb {E}}}}}[\psi _2(|\xi |/c)]\leqslant 1\}\). By (4.3.3) in [8] there is a constant \(c>0\) such that \({{\mathrm{{\mathbb {E}}}}}[|\xi |]\leqslant c \Vert \xi \Vert _{\psi _2}\). So we obtain the bound

$$\begin{aligned}&{{\mathrm{{\mathbb {E}}}}}_\varepsilon \left[ \left\| \frac{\sum _{j=1}^{n}\varepsilon _j h(X_{n,j})}{\sqrt{n}} \right\| _{{\mathcal {H}}_{2\delta }'}1\!\!1_ { A_n } \right] \\&\quad \leqslant c \left\| \sup _{d_Z(h,h')<2\sqrt{2}\delta }\left| \frac{\sum _{j=1}^{n}\varepsilon _j (h(X_{n,j})-h'(X_{n,j}))}{\sqrt{n}} \right| 1\!\!1_{A_n} \right\| _{\psi _2}. \end{aligned}$$

Next we apply Dudley’s theorem in the form of Corollary 5.1.6 and Remark 5.1.7 in [8] to the process \(Z\). This yields a constant \(K\) such that

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}_\varepsilon \left[ \left\| \frac{\sum _{j=1}^{n}\varepsilon _j h(X_{n,j})}{\sqrt{n}} \right\| _{{\mathcal {H}}_{2\delta }'}\right] 1\!\!1_ { A_n }&\leqslant K \int _0^{2\sqrt{2}\delta }\left( \log (N(\varepsilon ,{\mathcal {H}}_n,d_Z))\right) ^{1/2}\,\mathrm {d}\varepsilon 1\!\!1_{A_n}\\&\leqslant K \int _0^{2\sqrt{2}\delta }\left( \log (N(\varepsilon /\sqrt{2},{\mathcal {H}}_n,L^2(P_n)))\right) ^{1/2}\,\mathrm {d}\varepsilon 1\!\!1_{A_n},\\&\leqslant \sqrt{2} K \int _0^{2\delta }\left( \log (N(\varepsilon ,{\mathcal {Q}}_n,L^2(P_n)))\right) ^{1/2}\,\mathrm {d}\varepsilon , \end{aligned}$$

a bound independent of \(X\). In order to complete the proof we take expectation with respect to \(X\), consider the limit \(\lim _{\delta \rightarrow 0}\limsup _{n\rightarrow \infty }\) of the expression and apply condition (d).\(\square \)

To proceed with the tightness proof for the critical term we will show conditions (a) to (d) for \(r_n:=\log (1/\Delta _n)^{-\alpha }\), \(\alpha \in (1/2,1)\), and for the class \( Q_n=\{{\widetilde{g}}_t^{(2)}:t\leqslant 0\}\) defined above.

(a) We rewrite

$$\begin{aligned} {\widetilde{g}}^{(2)}_t&=\Delta ^{-1/2}i{{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-u){{\mathrm{{\mathcal {F}}}}}[t\rho (t)e^{y-t} 1\!\!1_{(-\infty ,t]}(y)](u)] \nonumber \\&\quad +\Delta ^{-1/2}{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[t\rho (t)ye^{y-t} 1\!\!1_{(-\infty ,t]}(y)](u)] \nonumber \\&=\Delta ^{-1/2}i{{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-u){{\mathrm{{\mathcal {F}}}}}[t\rho (t)e^{y-t} 1\!\!1_{(-\infty ,t]}(y)](u)]\nonumber \\&\quad +\Delta ^{-1/2}{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[t\rho (t)(y-t)e^{y-t} 1\!\!1_{(-\infty ,t]}(y)](u)]\nonumber \\&\quad +\Delta ^{-1/2}{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[t^2\rho (t)e^{y-t} 1\!\!1_{(-\infty ,t]}(y)](u)], \end{aligned}$$
(62)

where the last step also shows that the bounded variation norm of \(t\rho (t)ye^{y-t}1\!\!1_{(-\infty ,t]}(y)\) is bounded uniformly in \(t\leqslant 0\). If \({{\mathrm{{\mathcal {F}}}}}^{-1}[m]\), \({{\mathrm{{\mathcal {F}}}}}^{-1}[m']\) are finite signed measures as in Assumption 8(a), then the bounded variation norms of \({{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-{\scriptstyle \bullet } )]*(t\rho (t)e^{y-t}1\!\!1_{(-\infty ,t]}(y))\) and \({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-{\scriptstyle \bullet } )]*(t\rho (t)ye^{y-t}1\!\!1_{(-\infty ,t]}(y))\) are bounded uniformly in \(t\leqslant 0\) and

$$\begin{aligned} \Vert {\widetilde{g}}^{(2)}_t\Vert _\infty \leqslant \Vert {\widetilde{g}}^{(2)}_t\Vert _{BV}\lesssim \Delta ^{-1/2}, \end{aligned}$$

where \(\Vert f\Vert _{BV}\) denotes the bounded variation norm, equal to the sum of the sup-norm of \(f\) and the usual total variation norm of the weak derivative \(Df\). For \(m\) supported in \([-C\Delta ^{-1/2},C\Delta ^{-1/2}]\) as in Assumption 8(b), we have

$$\begin{aligned} \Vert {\widetilde{g}}^{(2)}_t\Vert _\infty \lesssim \Vert {\widetilde{g}}^{(2)}_t\Vert _{B^0_{\infty ,1}}\lesssim \Vert {\widetilde{g}}^{(2)}_t\Vert _{B^1_{1,1}} \end{aligned}$$

and the Fourier transform of \({\widetilde{g}}^{(2)}_t\) is supported on \([-C\Delta ^{-1/2},C\Delta ^{-1/2}]\). In view of the Littlewood–Paley definition of Besov spaces we can estimate the \(B^1_{1,1}({{\mathrm{{\mathbb {R}}}}})\)-norm of \({\widetilde{g}}^{(2)}_t\) by \(\log (C/\Delta ^{1/2})\) times its \(B^1_{1\infty }({{\mathrm{{\mathbb {R}}}}})\)-norm. Together with the Fourier multiplier property of \(m\) and \(m'\) this yields

$$\begin{aligned} \Vert {\widetilde{g}}^{(2)}_t\Vert _\infty&\lesssim \log (C/\Delta ^{1/2})\Vert {\widetilde{g}}^{(2)}_t\Vert _{B^1_{1\infty }}\\&\lesssim \Delta ^{-1/2}\log (1/\Delta )(\Vert t\rho (t)e^{y-t}1\!\!1_{(-\infty ,t]}(y) \Vert _{B^1_{1\infty }}\\&\quad +\Vert t\rho (t)ye^{y-t}1\!\!1_{(-\infty ,t]}(y)\Vert _{B^1_{1\infty }})\\&\lesssim \Delta ^{-1/2}\log (1/\Delta ), \end{aligned}$$

since the \(B^1_{1\infty }({{\mathrm{{\mathbb {R}}}}})\)-norms of \(t\rho (t)e^{y-t}1\!\!1_{(-\infty ,t]}(y)\) and \(t\rho (t)ye^{y-t}1\!\!1_{(-\infty ,t]}(y)\) are bounded uniformly in \(t\) by integrability and bounded variation. So \(M_n\) can be chosen proportional to \(\Delta ^{-1/2}\log (1/\Delta )\), and \(nr_n^2M_n^{-2}\rightarrow \infty \) follows from \(\log ^4(1/\Delta )=o(n\Delta )\).
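As a numerical illustration of this envelope condition, the following sketch evaluates \(nr_n^2M_n^{-2}=n\Delta /\log (1/\Delta )^{2+2\alpha }\) along a hypothetical sampling scheme \(\Delta _n=n^{-1/2}\) with \(\alpha =3/4\) (both choices are ours, made only for the demonstration) and checks that the ratio grows:

```python
import math

# Illustrative check that n * r_n^2 / M_n^2 -> infinity under
# log^4(1/Delta) = o(n*Delta).  Assumed scheme: Delta_n = n^{-1/2},
# alpha = 3/4, so the ratio equals n*Delta / log(1/Delta)^3.5.
def ratio(n, alpha=0.75):
    delta = n ** -0.5
    L = math.log(1 / delta)
    r = L ** -alpha              # r_n = log(1/Delta)^(-alpha)
    M = delta ** -0.5 * L        # M_n ~ Delta^(-1/2) * log(1/Delta)
    return n * r ** 2 / M ** 2   # = n*Delta / log(1/Delta)^(2+2*alpha)

vals = [ratio(10 ** k) for k in range(3, 8)]
assert all(b > a for a, b in zip(vals, vals[1:]))  # strictly increasing
```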

(b) We will verify condition (b) by applying a moment inequality for empirical processes under uniform entropy bounds for \(Q_n\). We decompose \({\widetilde{g}}^{(2)}_t\) according to (62). Using that \(B^{1}_{1,1}({{\mathrm{{\mathbb {R}}}}})\) embeds continuously into the space \(\text {BV}\) of functions of bounded variation, the bounds in (a) show that

$$\begin{aligned} \Vert \Delta ^{-1/2}{{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-u){{\mathrm{{\mathcal {F}}}}}[e^{y-t}1\!\!1_{(-\infty ,t]}(y)](u)]\Vert _{BV}&\lesssim \Delta ^{-1/2}\log (1/\Delta ),\nonumber \\ \Vert \Delta ^{-1/2}{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[(y-t)e^{y-t}1\!\!1_{(-\infty ,t]}(y)](u)] \Vert _{BV}&\lesssim \Delta ^{-1/2}\log (1/\Delta ),\\ \Vert \Delta ^{-1/2}{{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u){{\mathrm{{\mathcal {F}}}}}[e^{y-t}1\!\!1_{(-\infty ,t]}(y)](u)]\Vert _{BV}&\lesssim \Delta ^{-1/2}\log (1/\Delta ),\nonumber \end{aligned}$$
(63)

where we omitted the factors \(t\rho (t)\) and \(t^2\rho (t)\) to obtain translation invariant classes. Since the functions in the class

$$\begin{aligned} \mathfrak {F}_n:=\{\Delta ^{-1/2}{{\mathrm{{\mathcal {F}}}}}^{-1}[m'(-u){{\mathrm{{\mathcal {F}}}}}[e^{y-t} 1\!\!1_{(-\infty ,t]}(y)](u)] :t\leqslant 0\} \end{aligned}$$
(64)

are of bounded variation, we can write them as the composition of a 1-Lipschitz function with a nondecreasing function. The class of all translates of a nondecreasing function has VC index 2 and thus has polynomial \(L^2({{\mathrm{{\mathbb {Q}}}}})\)-covering numbers uniformly in all probability measures \({{\mathrm{{\mathbb {Q}}}}}\) by Theorem 5.1.15 in [8]. \(\varepsilon \)-covering numbers are preserved under 1-Lipschitz transformations, and thus the covering numbers of \(\mathfrak F_n\) are polynomial in \(M_n/\varepsilon \). The \(\varepsilon \)-covering numbers of \(\{t\rho (t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) are polynomial in \(1/\varepsilon \). To obtain an \(\varepsilon \)-covering of the functions in the first term of (62) we cover the class \(\mathfrak {F}_n\) by balls of size \(\varepsilon /2\) and the class \(\{t\rho (t):t\in {{\mathrm{{\mathbb {R}}}}}\}\) by balls of size \(\varepsilon /(2M_n)\). The resulting covering numbers are bounded by a product of two polynomial covering numbers and hence are themselves polynomial in \(M_n/\varepsilon \). Arguing in the same way for the two other terms in (62) yields polynomial covering numbers for them, too. Since the covering numbers of \(Q_n\) are bounded by the product of the covering numbers of the respective terms, the covering numbers of \(Q_n\) are polynomial in \(M_n/\varepsilon \). By Proposition 3 in [17] there exists a universal constant \(L>0\) such that

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}\left\| \frac{1}{\sqrt{n}}\sum _{j=1}^n\varepsilon _j q(X_{n,j})\right\| _{(Q_n)'_{r}}\leqslant L \max \left( r_n\sqrt{\log \left( \frac{M_n}{r_n}\right) }, \frac{M_n}{\sqrt{n}} \log \left( \frac{M_n}{r_n}\right) \right) . \end{aligned}$$

Condition (b) is satisfied if this maximum tends to zero. We have

$$\begin{aligned} r_n\sqrt{\log \left( \frac{M_n}{r_n}\right) } \!\lesssim \!\frac{\sqrt{\log \left( \log (1/\Delta )^{1+\alpha }/\Delta ^{1/2}\right) }}{ \log (1/\Delta )^\alpha } \!\lesssim \!\frac{\sqrt{\log (\log (1/\Delta ))}}{\log (1/\Delta )^\alpha }\!+\!\frac{ \sqrt{ \log (1/\Delta )}}{\log (1/\Delta )^\alpha }\!\rightarrow \!0, \end{aligned}$$

and

$$\begin{aligned} \frac{M_n}{\sqrt{n}} \log \left( \frac{M_n}{r_n}\right)&\lesssim \frac{\log (1/\Delta )}{\sqrt{\Delta n}} \log \left( \frac{\log (1/\Delta )^{1+\alpha }}{\Delta ^{1/2}}\right) \\&\lesssim \frac{\log (1/\Delta )\log (\log (1/\Delta ))}{\sqrt{\Delta n}} +\frac{\log (1/\Delta )^2}{\sqrt{\Delta n}}, \end{aligned}$$

which tends to zero by \(\log ^4(1/\Delta )=o(n\Delta )\).
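Both terms of the moment bound can likewise be evaluated numerically. The sketch below again assumes the illustrative scheme \(\Delta _n=n^{-1/2}\), \(\alpha =3/4\), with \(M_n=\Delta ^{-1/2}\log (1/\Delta )\) and \(r_n=\log (1/\Delta )^{-\alpha }\) (all hypothetical choices of ours), and checks that both terms shrink as \(n\) grows:

```python
import math

# Illustrative evaluation of the two terms in the moment bound under the
# assumed scheme Delta_n = n^{-1/2}, alpha = 3/4.
def terms(n, alpha=0.75):
    delta = n ** -0.5
    L = math.log(1 / delta)
    r = L ** -alpha               # r_n
    M = delta ** -0.5 * L         # M_n
    t1 = r * math.sqrt(math.log(M / r))        # r_n * sqrt(log(M_n/r_n))
    t2 = M / math.sqrt(n) * math.log(M / r)    # (M_n/sqrt(n)) * log(M_n/r_n)
    return t1, t2

# both terms are smaller for larger n (slow logarithmic decay for t1)
assert terms(10 ** 12)[0] < terms(10 ** 4)[0]
assert terms(10 ** 12)[1] < terms(10 ** 4)[1]
```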

(c) In order to verify (c), we will show that \(H(\varepsilon ,Q_n,L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\lesssim \log (\varepsilon ^{-1})\) uniformly in \(n\). Applying Proposition 23 with \(j=k=1\), \(p=q=2\) and \(s=2\) yields that for \(\mu =m(-{\scriptstyle \bullet } )\) and for all \(f\in L^2({{\mathrm{{\mathbb {R}}}}})\) with \({{\mathrm{\hbox {supp}}}}(f)\cap (-\delta ,\delta )=\varnothing \) for some \(\delta >0\)

$$\begin{aligned} \Delta ^{-1/2}\Vert x({{\mathrm{{\mathcal {F}}}}}^{-1}[m(-u)]*f)\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }&\lesssim \left( \delta ^{-1/2}\Vert m\Vert _\infty +\delta ^{-1}\Delta ^{-1/2}\Vert m''\Vert _{L^2}\right) \Vert f\Vert _{L^2}\nonumber \\&\lesssim \left( \delta ^{-1/2}\vee \delta ^{-1}\right) \Vert f\Vert _{L^2}, \end{aligned}$$
(65)

where we used \(\Vert m\Vert _\infty \leqslant C\) and \(\Delta ^{-1/2}\Vert m''\Vert _{L^2}\rightarrow 0\) by Assumption 8.

Let \(M\geqslant 1\) and \(\eta \in (0,1]\). We will distinguish the three cases \(s,t\leqslant -M\), \(s,t\in [-M,-\eta ]\) and \(s,t\in [-\eta ,0]\).

Case 1: Let \(s,t\leqslant -M\). We apply (65) with \(\delta =M\) to \(f(y):=q_s(y)-q_t(y)\), where \(q_t\) is defined in (58). Noting that we can bound \(\Vert q_t\Vert _{L^2}\lesssim M^{-1}\) uniformly in \(t\leqslant 0\), we obtain for \(s,t\leqslant -M\)

$$\begin{aligned} \Vert {\widetilde{g}}_s^{(2)} -{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim M^{-3/2}. \end{aligned}$$

Case 2: For the second case let \(-M\leqslant s,t\leqslant -\eta \). We apply (65) with \(\delta =\eta \) to \(f(y):=q_s(y)-q_t(y)\). Without loss of generality we assume \(s\leqslant t\). We estimate

$$\begin{aligned}&\int _{-\infty }^{t}(q_s(y)-q_t(y))^2\,\mathrm {d}y\\&\quad =\int _{-\infty }^{0}\left( s\rho (s)e^{y}-t\rho (t)e^{y+s-t}\right) ^2\,\mathrm {d}y+\int _{s}^{t}t^2\rho (t)^2e^{2(y-t)}\,\mathrm {d}y\\&\quad \leqslant 2\left( s\rho (s)\!-\!t\rho (t)\right) ^2\int _{-\infty }^0e^{2y}\,\mathrm {d}y\!+\!2t^2\rho (t)^2\int _{-\infty }^0(1\!-\!e^{s-t})^2e^{2y}\,\mathrm {d}y \!+\!t^2\rho (t)^2|s\!-\!t|\\&\quad \lesssim |s-t|^2+|s-t| \end{aligned}$$

by the Lipschitz continuity of \(x\mapsto x\rho (x)\), and obtain for \(s,t\in [-M,-\eta ]\) with \(|s-t|\leqslant 1\)

$$\begin{aligned} \Vert {\widetilde{g}}_s^{(2)} -{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim |s-t|^{1/2}/\eta . \end{aligned}$$
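The Case 2 estimate can be checked by numerical quadrature. Since the weight \(\rho \) is not specified in this excerpt, the sketch substitutes the stand-in \(\rho (t)=e^{t}\) (bounded on \((-\infty ,0]\) with \(t\rho (t)\) Lipschitz there) and verifies \(\Vert q_s-q_t\Vert _{L^2}^2\leqslant C(|s-t|^2+|s-t|)\) for a few pairs; the constant \(C=10\) is a generous illustrative choice, not one from the text.

```python
import math

rho = lambda t: math.exp(t)  # hypothetical stand-in weight function

def q(t, y):
    # q_t(y) = t * rho(t) * e^{y-t} * 1{y <= t}, cf. (58)
    return t * rho(t) * math.exp(y - t) if y <= t else 0.0

def l2_sq_diff(s, t, lo=-30.0, n=120000):
    # midpoint-rule approximation of the squared L^2 distance of q_s, q_t
    hi = max(s, t)
    h = (hi - lo) / n
    return sum((q(s, lo + (i + 0.5) * h) - q(t, lo + (i + 0.5) * h)) ** 2
               for i in range(n)) * h

# check ||q_s - q_t||_{L^2}^2 <= 10 * (|s-t|^2 + |s-t|) on sample pairs
for s, t in [(-2.0, -1.9), (-5.0, -4.5), (-1.0, -0.99)]:
    d = abs(s - t)
    assert l2_sq_diff(s, t) <= 10 * (d * d + d)
```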

Case 3: Let \(-\eta \leqslant s,t\leqslant 0\). We have \(\Vert {\widetilde{g}}_s^{(2)} -{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\leqslant 2\sup _{t\in [-\eta ,0]}\Vert {\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\). We apply Proposition 23 with \(f=q_t\), \(\mu =m(-{\scriptstyle \bullet } )\), \(\delta =|t|\), \(k=1\), \(j=1\), \(p=2\), \(q=2\) and \(s=2\). We have

$$\begin{aligned} |t|^{-1}\Vert q_t\Vert _{L^2}^2&=|t|\rho (t)^2\int _{-\infty }^{0}e^{2y}\,\mathrm {d}y\lesssim |t|\quad \text {and}\\ t^2\left\| q_t(y)/y^2\right\| _{L^2}^2&\leqslant \int _{-\infty }^t\frac{t^4}{y^4}\,\mathrm {d}y=|t|\int _{-\infty }^{-1}x^{-4}\,\mathrm {d}x\lesssim |t| \end{aligned}$$

and consequently \(\Vert {\widetilde{g}}_s^{(2)} -{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim \eta ^{1/2}\) for \(-\eta \leqslant s,t\leqslant 0\).
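The elementary integral identity used in Case 3 can be confirmed numerically: the substitution \(u=-1/x\) turns \(\int _{-\infty }^{-1}x^{-4}\,\mathrm {d}x\) into \(\int _0^1u^2\,\mathrm {d}u=1/3\), and by scaling \(t^4\int _{-\infty }^{t}y^{-4}\,\mathrm {d}y=|t|/3\).

```python
# u = -1/x maps (-inf,-1] to (0,1] with dx = u^{-2} du, so
# \int_{-inf}^{-1} x^{-4} dx = \int_0^1 u^2 du = 1/3 (midpoint rule):
n = 100000
val = sum(((i + 0.5) / n) ** 2 for i in range(n)) / n
assert abs(val - 1.0 / 3.0) < 1e-6

# scaling: t^4 * \int_{-inf}^t y^{-4} dy = t^4 * |t|^{-3}/3 = |t|/3
for t in (-0.25, -0.5, -0.9):
    assert abs(t ** 4 * abs(t) ** -3 / 3 - abs(t) / 3) < 1e-12
```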

Having treated these three cases, we can show \(N(\varepsilon , Q_n,L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\lesssim \varepsilon ^{-7}\). For an integer \(J>0\) we consider the grid of points \(t_j=-j J^{-6}\) with \(j=J^4, J^4+1, J^4+2,\ldots , J^{7}\), and we take \(\eta =J^{-2}\). By Case 3 we see that \(\Vert {\widetilde{g}}_s^{(2)}-{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim J^{-1}\) for all \(s,t\in [-J^{-2},0]\). By Case 2 we have \(\Vert {\widetilde{g}}_s^{(2)}-{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim |s-t|^{1/2}/\eta \leqslant J^{-1}\) for \(s,t\in [-(j+1)J^{-6},-jJ^{-6}]\), and by Case 1 \(\Vert {\widetilde{g}}_s^{(2)}-{\widetilde{g}}_t^{(2)}\Vert _{2,{{\mathrm{{\mathbb {P}}}}}_\Delta }\lesssim J^{-1}\) for \(s,t\leqslant -J\).
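The bookkeeping behind this grid can be checked directly; the sketch below instantiates it with the illustrative value \(J=5\) (our choice) and verifies the span, the polynomial number of centres, and the Case 2 accuracy between neighbouring grid points:

```python
import math

J = 5
# grid t_j = -j*J^{-6}, j = J^4, ..., J^7: spans [-J, -J^{-2}], ~J^7 points
grid = [-j * J ** -6 for j in range(J ** 4, J ** 7 + 1)]
assert math.isclose(grid[0], -J ** -2) and math.isclose(grid[-1], -J)
assert len(grid) == J ** 7 - J ** 4 + 1        # polynomially many centres

# Case 2 accuracy between neighbours, with eta = J^{-2} and spacing J^{-6}:
eta, spacing = J ** -2, J ** -6
assert math.isclose(spacing ** 0.5 / eta, 1.0 / J)   # |s-t|^{1/2}/eta = 1/J
```

With accuracy of order \(1/J\) achieved by roughly \(J^7\) balls, setting \(\varepsilon \sim 1/J\) gives covering numbers of order \(\varepsilon ^{-7}\).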

We have polynomial covering numbers and it suffices for condition (c) that

$$\begin{aligned} \frac{nr_n^2}{M_n^2\log \left( {r_n}^{-1}\right) }\rightarrow \infty . \end{aligned}$$

In (a) we have seen that \(M_n\lesssim \Delta ^{-1/2}\log (1/\Delta )\). For the choice \(r_n=\log (1/\Delta )^{-\alpha }\) we obtain

$$\begin{aligned} \frac{n r_n^2}{M_n^2\log (1/r_n)}\gtrsim \frac{n\Delta }{ \log (1/\Delta )^{2+2\alpha }\log (\log (1/\Delta )) }, \end{aligned}$$

which tends to infinity by \(\log ^4(1/\Delta )=o(n\Delta )\).

(d) In (c) we have seen that the covering numbers \(N(\varepsilon ,Q_n,L^2({{\mathrm{{\mathbb {P}}}}}_\Delta ))\) are polynomial in \(\varepsilon ^{-1}\), uniformly in \(n\), so that condition (d) is satisfied.