1 Introduction

Consider a stationary Markov chain \((\Phi _t)\) and a function \(f\) acting on the state space of the Markov chain and mapping into \(\mathbb{R }^d\) for some \(d\ge 1\). For the resulting stationary process \(X_t=f(\Phi _t)\), \(t\in \mathbb{Z }\), the corresponding partial sum process is given by

$$\begin{aligned} S_0=0,\quad S_n= X_1+\cdots +X_n. \end{aligned}$$

We also assume that the finite-dimensional distributions of the process \((X_t)\) are regularly varying with index \(\alpha \); see Sect. 2.1 for a definition. Roughly speaking, this condition ensures that the tails of the finite-dimensional distributions have power law behavior, hence sufficiently high moments of \(X\) are infinite. (Here and in what follows, we write \(Y\) for a generic element of any stationary sequence \((Y_t)\).) Regular variation of a random vector and, more generally, of a stationary sequence is a condition which determines the extremal dependence structure in a flexible way.

For an iid sequence the condition of regular variation of \(X\) with index \(\alpha \in (0,2)\) is necessary and sufficient for the central limit theorem

$$\begin{aligned} a_n^{-1} (S_n-b_n)\stackrel{d}{\rightarrow }\xi _\alpha ,\quad n\rightarrow \infty , \end{aligned}$$

where \(a_n>0\), \(b_n\in \mathbb{R }\), \(n\in \mathbb{N }\), are suitable constants and \(\xi _\alpha \) has an \(\alpha \)-stable distribution in \(\mathbb{R }^d\); see [43] for the limit theorem and [44] for a description of infinite variance stable laws in \(\mathbb{R }^d\). Limit theory with \(\alpha \)-stable limits for dependent sequences was studied in [23, 24] by using the convergence of characteristic functions and in [15] by using the continuous mapping theorem acting on suitable weakly converging point processes; see also [4] for a functional central limit theorem using the same technique. These results were proved for univariate sequences, but [16] proved that the point process convergence results remain valid in the multivariate case by a slight modification of the proofs in [15].

Using ideas from [23, 24], the authors of [1] studied stable limit theory for general univariate sequences; see Theorem 6.1 below. We use this result and the Cramér-Wold device to derive the corresponding limits for all linear combinations \(\theta ^{\prime }S_n\), \(\theta \in \mathbb{S }^{d-1}\), where \(\mathbb{S }^{d-1}\) is the unit sphere in \(\mathbb{R }^d\) with respect to the Euclidean norm. According to Theorem 6.1, the \(\alpha \)-stable limit laws of \(\theta ^{\prime }S_n\) (with suitable normalization and centering) are characterized by the function

$$\begin{aligned} b(\theta )= \lim _{k\rightarrow \infty }\lim _{x\rightarrow \infty } \dfrac{\mathbb{P }(\theta ^{\prime }S_k>x)-\mathbb{P }(\theta ^{\prime }S_{k-1}>x)}{\mathbb{P }(|X|>x)},\quad \theta \in \mathbb{S }^{d-1}. \end{aligned}$$
(1.1)

We discuss the so-called cluster index \(b\) in Sect. 3. The existence of the limits in (1.1) is guaranteed under the conditions of this paper; see Theorem 3.2. Moreover, the cluster index \(b\) determines the \(\alpha \)-stable limit laws in the multivariate case; see Theorem 4.1. In a way, the function \(b\) plays a role similar to that of the extremal index in limit theory for maxima of dependent sequences; see [29] for this notion.

Regular variation is also the key to precise large deviation theory for the sums \(S_n\). In the univariate iid case, classical work by Nagaev and Nagaev [35, 36] shows that relations of the following type hold

$$\begin{aligned} \sup _{x\ge b_n}\Big |\dfrac{\mathbb{P }(S_n>x)}{n\,\mathbb{P }(|X|>x)}- p\Big |\rightarrow 0, \end{aligned}$$

where \(p=\lim _{x\rightarrow \infty }\mathbb{P }(X>x)/\mathbb{P }(|X|>x)\) and \((b_n)\) is a suitably chosen sequence such that \(b_n\rightarrow \infty \) and \(S_n/b_n\stackrel{P}{\rightarrow }0\) as \(n\rightarrow \infty \). Related work for dependent sequences was proved in [33] for linear processes, in [13, 28] for solutions to stochastic recurrence equations, and in [34] for general sequences; for earlier results see also [15, 23, 24]. The results in these papers are all of the type

$$\begin{aligned} \sup _{x\in (b_n,c_n)}\Big |\dfrac{\mathbb{P }(S_n>x)}{n\,\mathbb{P }(|X|>x)}- b(1)\Big |\rightarrow 0, \end{aligned}$$
(1.2)

where \(b(1)\) is the limit in (1.1) for \(d=1\) and \((b_n,c_n)\) are suitable regions tending to infinity.
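For orientation, in the iid univariate case the value \(b(1)=p\) can be read off directly; the following is a sketch (ours, not from the text), using only the standard fact that \(\mathbb{P }(S_k>x)\sim k\,\mathbb{P }(X>x)\) for iid regularly varying summands and every fixed \(k\):

```latex
% iid case: P(S_k > x) ~ k P(X > x) as x -> oo, for every fixed k.
% Hence the numerator of (1.1) telescopes and
\begin{aligned}
b(1) &= \lim_{k\rightarrow\infty}\lim_{x\rightarrow\infty}
        \frac{k\,\mathbb{P}(X>x)-(k-1)\,\mathbb{P}(X>x)}{\mathbb{P}(|X|>x)}
      = \lim_{x\rightarrow\infty}\frac{\mathbb{P}(X>x)}{\mathbb{P}(|X|>x)}
      = p,
\end{aligned}
% in agreement with the Nagaev-type constant p above.
```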

The case of iid multivariate sequences \((X_t)\) was treated in [22], including a corresponding functional large deviation result. In this paper, we get a corresponding large deviation principle for multivariate functions acting on a Markov chain (see Theorem 4.3):

$$\begin{aligned} \dfrac{\mathbb{P }(\lambda _n^{-1}S_n\in \cdot )}{n\,\mathbb{P }(|X|>\lambda _n)}\stackrel{v}{\rightarrow }\nu _\alpha . \end{aligned}$$
(1.3)

Here \(\stackrel{v}{\rightarrow }\) denotes vague convergence on some Borel \(\sigma \)-field, \(\lambda _n\rightarrow \infty \) is a suitable normalizing sequence, and the limit \(\nu _\alpha \) is a measure which is induced by the regular variation of the sums \(S_k\), \(k\ge 1\). As in the case of stable limits, we start by proving the large deviation principle for linear combinations \(\theta ^{\prime } S_n\), exploiting the corresponding result (1.2) with \(b(1)\) replaced by \(b(\theta )\) from (1.1); see Theorem 7.2. This corresponds to (1.3) restricted to half-spaces not containing the origin. In general it is not possible to extend the limit relation (1.3) from half-spaces to arbitrary Borel sets. The extension is, however, possible under additional conditions, for instance when \(\alpha \) is non-integer. As a matter of fact, relation (1.3) cannot be written as a uniform result in the spirit of (1.2), due to its multivariate character.

The paper is organized as follows. In Sect. 2 we introduce regular variation of a stationary sequence and the drift condition of a Markov chain. In Sect. 3 we define the cluster index \(b(\theta )\), \(\theta \in \mathbb{S }^{d-1}\), of a stationary sequence. We prove the existence of the cluster index for multivariate functions acting on a Markov chain under a drift condition (Theorem 3.2). In Sect. 4 we formulate the main results of this paper. They include \(\alpha \)-stable limit theory (Theorem 4.1) and precise large deviation principles (Theorem 4.3) for functions of regenerative Markov chains. In Sect. 5 we calculate the cluster index for several important time series models, including multivariate autoregressive processes, solutions to stochastic recurrence equations, GARCH\((1,1)\) processes and their sample covariance functions. In the remaining sections we prove the results of Sect. 4.

2 Preliminaries

2.1 Regular variation of vectors and sequences of random vectors

In what follows, we will use the notion of regular variation as a suitable way of describing heavy tails of random vectors and sequences of random vectors. We commence with a random vector \(X\) with values in \(\mathbb{R }^d\) for some \(d\ge 1\). We say that this vector (and its distribution) are regularly varying with index \(\alpha >0\) if the following relation holds as \(x\rightarrow \infty \):

$$\begin{aligned} \frac{\mathbb{P }(|X| > ux, X/| X| \in \cdot )}{\mathbb{P }(|X| > x)}\stackrel{w}{\rightarrow }u^{-\alpha }\, \mathbb{P }(\Theta \in \cdot ),\quad u>0. \end{aligned}$$
(2.1)

Here \(\stackrel{w}{\rightarrow }\) denotes weak convergence of finite measures and \(\Theta \) is a vector with values in the unit sphere \(\mathbb{S }^{d-1} = \{x\in \mathbb{R }^d : |x| = 1\}\) of \(\mathbb{R }^d\). Its distribution is the spectral measure of regular variation and depends on the choice of the norm. However, the definition of regular variation does not depend on any concrete norm; we always refer to the Euclidean norm. An equivalent way to define regular variation of \(X\) is to require that there exists a non-null Radon measure \(\mu _X\) on the Borel \(\sigma \)-field of \(\overline{\mathbb{R }}_0^d=\overline{\mathbb{R }}^d\setminus \{0\}\) such that

$$\begin{aligned} n\, \mathbb{P }(a_n^{-1} X\in \cdot )\stackrel{v}{\rightarrow }\mu _X(\cdot ), \end{aligned}$$
(2.2)

where the sequence \((a_n)\) can be chosen such that \(n\,\mathbb{P }(|X|>a_n)\sim 1\) and \(\stackrel{v}{\rightarrow }\) refers to vague convergence. The limit measure \(\mu _X\) necessarily has the property \(\mu _X(u\cdot )= u^{-\alpha }\mu _X(\cdot ),u>0\), which explains the relation with the index \(\alpha \). We refer to [6] for an encyclopedic treatment of one-dimensional regular variation and [40, 41] for the multivariate case.
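For concreteness, a standard one-dimensional illustration (our example, not taken from the text): suppose \(d=1\) and the tails of \(X\) are balanced.

```latex
% d = 1: assume, for a slowly varying function L and constants p, q >= 0
% with p + q = 1,
%   P(X > x) ~ p x^{-alpha} L(x),  P(X < -x) ~ q x^{-alpha} L(x),  x -> oo.
% Then (2.1) holds with spectral measure on S^0 = {-1, +1} given by
%   P(Theta = +1) = p,  P(Theta = -1) = q,
% and the limit measure mu_X of (2.2) satisfies
\begin{aligned}
\mu_X\big((y,\infty)\big)=p\,y^{-\alpha},\qquad
\mu_X\big((-\infty,-y)\big)=q\,y^{-\alpha},\qquad y>0,
\end{aligned}
% which is homogeneous: mu_X(u A) = u^{-alpha} mu_X(A) for all u > 0.
```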

Next consider a strictly stationary sequence \((X_t)_{t\in \mathbb{Z }}\) of \(\mathbb{R }^d\)-valued random vectors with a generic element \(X\). It is regularly varying with index \(\alpha >0\) if every lagged vector \((X_1,\ldots ,X_k)\), \(k\ge 1\), is regularly varying with index \(\alpha \) in the sense of (2.1); see [15]. An equivalent description of a regularly varying sequence \((X_t)\) is achieved by exploiting (2.2): for every \(k\ge 1\), there exists a non-null Radon measure \(\mu _k\) on the Borel \(\sigma \)-field of \( \overline{\mathbb{R }}_0^{dk}\) such that

$$\begin{aligned} n\, \mathbb{P }(a_n^{-1} (X_1,\ldots ,X_k)\in \cdot )\stackrel{v}{\rightarrow }\mu _k, \end{aligned}$$
(2.3)

where \((a_n)\) is chosen such that \(n\,\mathbb{P }(|X_0|>a_n)\sim 1\).

A convenient characterization of a regularly varying sequence \((X_t)\) was given in Theorem 2.1 of [5]: there exists a sequence of \(\mathbb{R }^d\)-valued random vectors \((Y_t)_{t\in \mathbb{Z }}\) such that \(\mathbb{P }(|Y_0| > y) = y^{-\alpha }\) for \(y > 1\) and for \(k\ge 0\),

$$\begin{aligned} \mathbb{P }(x^{-1}(X_{-k},\ldots ,X_k)\in \cdot \mid |X_0| > x)\stackrel{w}{\rightarrow }\mathbb{P }((Y_{-k},\ldots ,Y_k)\in \cdot ),\quad x\rightarrow \infty . \end{aligned}$$

The process \((Y_t)\) is the tail process of \((X_t)\). Writing \(\Theta _t = Y_t/|Y_0|\) for \(t\in \mathbb{Z }\), one also has for \(k\ge 0\),

$$\begin{aligned} \mathbb{P }( |X_0|^{-1}(X_{-k},\ldots ,X_k)\!\in \!\cdot \mid |X_0| \!>\! x) \stackrel{w}{\rightarrow }\mathbb{P }((\Theta _{-k},\ldots ,\Theta _k)\in \cdot ),\quad x\rightarrow \infty .\qquad \end{aligned}$$
(2.4)

We will identify \((Y_t)_{|t|\le k}= |Y_0|\,(\Theta _t)_{|t|\le k}\), \(k\ge 0\). Then \(|Y_0|\) is independent of \((\Theta _t)_{|t|\le k}\) for every \(k\ge 0\). We refer to \((\Theta _t)_{t\in \mathbb{Z }}\) as the spectral tail process of \((X_t)\).
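A minimal example may help fix ideas (ours, not from the text): for an iid regularly varying sequence extremes do not cluster, and the tail process is trivial off lag zero.

```latex
% (X_t) iid and regularly varying with index alpha: given |X_0| > x,
% the rescaled neighbours x^{-1} X_t, t != 0, converge to 0 in probability.
% Hence the tail and spectral tail processes satisfy
\begin{aligned}
Y_t=0,\quad \Theta_t=0 \quad \text{a.s. for } t\ne 0,
\end{aligned}
% while P(|Y_0| > y) = y^{-alpha}, y > 1, and Theta_0 follows the
% spectral measure of X.
```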

We formulate our main condition on the tails of the sequence \((X_t)\):

Condition \(\mathbf{(RV_\alpha )}\): The strictly stationary sequence \((X_t)\) is regularly varying with index \(\alpha >0\) and spectral tail process \((\Theta _t)\).

2.2 The drift condition

Assume that the following drift condition holds for the Markov chain \((\Phi _t)\) for suitable \(p>0\) and an \(\mathbb{R }^d\)-valued function \(f\) acting on the state space of the Markov chain:

Condition \(\mathbf{(DC_{ p})}\): There exist constants \(\beta \in (0,1)\), \(b>0\), and a function \(V:\mathbb{R }^d\rightarrow (0,\infty )\) with \(c_1|x|^{p}\le V(x)\le c_2|x|^p\) for some \(c_1,c_2>0\), such that for any \(y\) in the state space of the Markov chain,

$$\begin{aligned} \mathbb{E }( V(f(\Phi _1))\mid \Phi _0=y)\le \beta \,V(f(y))+b. \end{aligned}$$

We mention that Jensen’s inequality ensures that \(\mathbf{(DC_{ p})}\) implies \(\mathbf{(DC_{ p^{\prime }})}\) for \(p^{\prime }<p\). We exploited condition \(\mathbf{(DC_{ p})}\) in [34], where we proved large deviation principles for strictly stationary sequences of random variables, in particular for irreducible Markov chains.
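As a simple illustration of \(\mathbf{(DC_{ p})}\) (our sketch, not part of the text), consider a univariate AR(1) chain with \(f(x)=x\) and \(V(x)=|x|^p\); for \(p\le 1\) the condition follows directly from subadditivity.

```latex
% Phi_t = phi Phi_{t-1} + Z_t, |phi| < 1, (Z_t) iid with E|Z|^p < oo.
% For p <= 1 the inequality |a + b|^p <= |a|^p + |b|^p yields
\begin{aligned}
\mathbb{E}\big(V(f(\Phi_1))\mid \Phi_0=y\big)
 =\mathbb{E}|\phi y+Z|^p
 \le |\phi|^p\,|y|^p+\mathbb{E}|Z|^p
 =\beta\,V(f(y))+b,
\end{aligned}
% i.e. (DC_p) holds with beta = |phi|^p < 1 and b = E|Z|^p.
```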

If \((\Phi _t)\) is an irreducible Markov chain then \(\mathbf{(DC_{ p})}\) for any \(p>0\) implies \(\beta \)-mixing with geometric rate; see [30], p. 371. Moreover, without loss of generality, by considering the Nummelin splitting scheme, see [37] for details, we will assume that \((\Phi _t)\) possesses an atom \(A\). The notions of drift, small set, atom, etc. used throughout are borrowed from [30]. In what follows, we write \(\mathbb{P }_A(\cdot )= \mathbb{P }(\cdot \mid \Phi _0\in A)\) and \(\mathbb{E }_A\) for the corresponding expectation.

We always assume the existence of some \(M>0\) such that \(\{x\,:\, V(f(x))\le M\}\) is a small set (this is true in all our examples). Then the condition \(\mathbf{(DC_{ p})}\) is equivalent to the existence of constants \(\beta \in (0,1)\) and \(b>0\) such that for any \(y\),

$$\begin{aligned} \mathbb{E }( V(f(\Phi _1))\mid \Phi _0=y)\le \beta \,V(f(y))+b1\!\!1_A(y). \end{aligned}$$

Direct verification of the condition \(\mathbf{(DC_{ p})}\) is in general difficult. We will instead use the following result, whose conditions can often be verified much more easily.

Lemma 2.1

Assume that the stationary Markov chain \((\Phi _{t})\) is aperiodic, irreducible and satisfies the following condition for some \(p>0\) and integer \(m\ge 1\):

Condition \(\mathbf{(DC_{ p,m})}\): (a) There exist \(b>0\) and \(\beta \in (0,1)\) such that for any \(y\) in the state space of the Markov chain,

$$\begin{aligned} \mathbb{E }( V(f(\Phi _m))\mid \Phi _0=y)\le \beta \,V(f(y))+b1\!\!1_A(y), \end{aligned}$$

where \(V\) is the function from \(\mathbf{(DC_{ p})}\).

(b) There exist \(c_1,c_2>0\) such that for any \(y\) in the state space of the Markov chain

$$\begin{aligned} \mathbb{E }( V(f(\Phi _1))\mid \Phi _0=y)\le c_1V(f(y))+c_2. \end{aligned}$$

Then condition \(\mathbf{(DC_{ p})}\) holds.

Proof

Theorem 15.3.3 in [30] says that the drift condition in part (a) of \(\mathbf{(DC_{ p,m})}\) implies \(V\)-geometric regularity of the \(m\)-skeleton Markov chain \((\Phi _{tm})\). Theorem 15.3.6 in [30] yields the equivalence between \(V\)-geometric regularity and \(g\)-geometric regularity of the original Markov chain for a function \(g\) satisfying \(\sum _{t=1}^m\mathbb{E }(g(\Phi _t)\mid \Phi _0=y)=V(f(y))\). Thus the drift condition is satisfied for the original Markov chain and some finite Lyapunov function \(V^{\prime }\ge g\). Making multiple use of part (b) of \(\mathbf{(DC_{ p,m})}\), we can show that there exist constants \(c_1^{\prime },c_2^{\prime }>0\) satisfying \(\sum _{t=1}^m\mathbb{E }(g(\Phi _t)\mid \Phi _0=y)\le c_1^{\prime } V(f(y))+c_2^{\prime }\). Thus \(\mathbf{(DC_{ p})}\) follows for a function \(V^{\prime }(x)=c_1^{\prime \prime }V(x)+c_2^{\prime \prime }\) and suitable constants \(c_1^{\prime \prime },c_2^{\prime \prime }>0\). \(\square \)

Consider the sequence of the hitting times of the atom \(A\) by the Markov chain \((\Phi _t)\), i.e. \(\tau _A(1)=\tau _A=\min \{k>0: \Phi _k\in A\}\) and \(\tau _A(j+1)=\min \{k>\tau _A(j): \Phi _k\in A\}\), \(j\ge 1\). We will write

$$\begin{aligned} S(0)=\sum _{t=1}^{\tau _A}X_t \quad \text{ and }\quad S(i)=\sum _{t=\tau _A(i)+1}^{\tau _A(i+1)}X_t,\quad i\ge 1. \end{aligned}$$
(2.5)

According to the theory in [30], \((\tau _A(i)-\tau _A(i-1))_{i\ge 2}\) and \((S(i))_{i\ge 1}\) constitute iid sequences; this is why we refer to such chains as regenerative Markov chains. The drift condition \(\mathbf{(DC_{ p})}\) is tailored for proving the existence of moments of \(S(1)\) under the existence of moments of \(X_t=f(\Phi _t)\) of the same order.

The drift condition \(\mathbf{(DC_{ p})}\) is useful for proving central limit theory and other asymptotic results for functions of Markov chains. As a benchmark result we quote a central limit theorem which is a simple corollary of Proposition 2.1 in Samur [45]. To apply this result, notice that \(\mathbf{(DC_{ 1})}\) implies condition (D\(_2\)) of [45] for \(|X_t|\) with \(V=c\,|f|\) for sufficiently small \(c>0\).

Theorem 2.2

Assume that the stationary Markov chain \((\Phi _t)\) is aperiodic, irreducible and \((X_t)=(f(\Phi _t))\) satisfies \(\mathbf{(DC_{ 1})}\), \(\mathbb{E }|X|^2<\infty \) and \(\mathbb{E }X=0\). Then the following statements hold:

  (1)

    The partial sum \(S(1)\) has finite second moment.

  (2)

    The central limit theorem \(n^{-1/2}S_n\stackrel{d}{\rightarrow }\mathcal{N }(0,\Sigma )\) holds with

    $$\begin{aligned} \Sigma&= \mathbb{E }_A[S(1)S(1)^{\prime }]\\&= \lim _{k\rightarrow \infty } \mathbb{E }\left[ \left( \sum _{t= 0}^kX_t\right) \left( \sum _{t=0} ^kX_t\right) ^{\prime }-\left( \sum _{t= 1}^k X_t\right) \left( \sum _{t=1}^k X_t\right) ^{\prime }\right] \!. \end{aligned}$$

Together with Theorem 4.1, which deals with the case of infinite variance stable limits, Theorem 2.2 complements the limit theory for partial sums of functions of Markov chains in the case of finite variance summands and Gaussian limits.

3 The cluster index

We commence by considering a general \(\mathbb{R }^d\)-valued stationary process \((X_t)\) satisfying \(\mathbf (RV _\alpha )\) for some \(\alpha >0\). A continuous mapping argument for regular variation (see e.g. [19, 20]) and (2.3) ensure the existence of the limits

$$\begin{aligned} b_k(\theta ) =\lim _{n\rightarrow \infty }n\,\mathbb{P }(\theta ^{\prime }S_k >a_n),\quad k\ge 1,\quad \theta \in \mathbb{S }^{d-1}. \end{aligned}$$

The difference \(b_{k+1}(\theta )-b_k(\theta )\) can be expressed in terms of the spectral tail process \((\Theta _t)\) of \((X_t)\).

Lemma 3.1

Let \((X_t)\) be an \(\mathbb{R }^d\)-valued stationary process satisfying \(\mathbf (RV _\alpha )\) for some \(\alpha >0\). Then, for any \(k\ge 1\),

$$\begin{aligned} b_{k+1}(\theta )-b_k(\theta )=\mathbb{E }\left[ \left( \theta ^{\prime }\sum _{t=0}^k\Theta _t\right) _+^\alpha -\left( \theta ^{\prime }\sum _{t=1}^k\Theta _t\right) _+^\alpha \right] \!. \end{aligned}$$

Proof

We start by observing that each \(b_k(\theta )\) can be expressed in terms of the spectral tail process \((\Theta _t)\). Indeed, \(\mathbf (RV _\alpha )\) yields for every \(k\ge 1\) and \(\theta \in \mathbb{S }^{d-1}\) that

$$\begin{aligned} b_k(\theta )&= \lim _{x\rightarrow \infty }\frac{\mathbb{P }(\theta ^{\prime }S_k>x)}{\mathbb{P }(|X|>x)}\\&= \lim _{x\rightarrow \infty }\frac{\mathbb{P }(\cup _{j=1}^k\{\theta ^{\prime }S_k>x,\theta ^{\prime }X_j>x/k\}\cap \{\theta ^{\prime }X_i<x/k,1\le i<j\})}{\mathbb{P }(|X|>x)}\\&= \lim _{x\rightarrow \infty }\sum _{j=1}^k \Big [\frac{\mathbb{P }( \theta ^{\prime }S_k>x,\theta ^{\prime }X_j>x/k)}{\mathbb{P }(|X|>x)}\\&\quad -\frac{\mathbb{P }( \theta ^{\prime }S_k>x,\theta ^{\prime }X_j>x/k,\max _{1\le i<j}\theta ^{\prime }X_i>x/k)}{\mathbb{P }(|X|>x)}\Big ]. \end{aligned}$$

By stationarity, the summands in the above expression can be written in the form

$$\begin{aligned}&\frac{\mathbb{P }(|X_0|>x/k)}{\mathbb{P }(|X_0|>x)}\left[ \mathbb{P }\left( \theta ^{\prime }\sum _{t=1-j}^{k-j}X_t>x,\theta ^{\prime }X_0>x/k \mid |X_0|>x/k\right) \right. \\&\left. -\mathbb{P }\left( \theta ^{\prime }\sum _{t=1-j}^{k-j}X_t>x,\theta ^{\prime }X_0>x/k,\max _{1-j\le i<0}\theta ^{\prime }X_i>x/k \mid |X_0|>x/k\right) \right] \!. \end{aligned}$$

Here we used the fact that \(\{\theta ^{\prime }X_0>x/k\}\subset \{|X_0|>x/k\}\). Letting \(x\rightarrow \infty \) in the above expressions, applying the conditional limits (2.4) and observing that \(\mathbb{P }(|Y_0|>y)=y^{-\alpha }\), \(y>1\), we obtain the limiting expressions

$$\begin{aligned}&{k^\alpha \left[ \mathbb{P }\left( |Y_0|\theta ^{\prime }\sum _{t=1-j}^{k-j}\Theta _t>k, |Y_0|\theta ^{\prime }\Theta _{0}>1 \right) \right. } \\&\left. -\mathbb{P }\left( |Y_0|\theta ^{\prime }\sum _{t=1-j}^{k-j}\Theta _t>k,|Y_0|\theta ^{\prime }\Theta _0>1,|Y_0|\max _{1-j\le i<0}\theta ^{\prime }\Theta _i>1\right) \right] \\&= \mathbb{E }\left[ \left( \theta ^{\prime }\sum _{t=1-j}^{k-j}\Theta _t\right) _+^\alpha \wedge (k\theta ^{\prime }\Theta _{0})_+^\alpha \right] \\&\quad -\mathbb{E }\left[ \left( \theta ^{\prime }\sum _{t=1-j}^{k-j}\Theta _t\right) _+^\alpha \wedge (k\theta ^{\prime }\Theta _{0})_+^\alpha \wedge \max _{1-j\le i<0}(k\theta ^{\prime }\Theta _i )_+^\alpha \right] . \end{aligned}$$

Hence \(b_k(\theta )\) has representation

$$\begin{aligned} b_k(\theta )\!=\!\sum _{j=1}^k \mathbb{E }\left[ \left( \left( \theta ^{\prime }\sum _{t=1-j}^{k-j}\Theta _t\right) _+^\alpha \!-\!\max _{1-j\le i<0}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\!\wedge \! \left( (k\theta ^{\prime }\Theta _{0})_+^\alpha \!-\!\max _{1-j\le i<0}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\right] ,\qquad \end{aligned}$$

and therefore

$$\begin{aligned}&{b_{k+1}(\theta )-b_k(\theta )}\\&= \mathbb{E }\left[ \left( \theta ^{\prime }\sum _{t=0}^{k}\Theta _t\right) _+^\alpha \wedge (k\theta ^{\prime }\Theta _{0})_+^\alpha \right] \\&+\sum _{j=1}^k \mathbb{E }\left[ \left( \left( \theta ^{\prime }\sum _{t=-j}^{k-j}\Theta _t\right) _+^\alpha -\max _{-j\le i<0}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\wedge \left( (k\theta ^{\prime }\Theta _{0})_+^\alpha -\max _{-j\le i<0}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\right. \\&\left. -\left( \left( \theta ^{\prime }\sum _{t=1-j}^{k-j}\Theta _t\right) _+^\alpha -\max _{1-j\le i<0}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\wedge \left( (k\theta ^{\prime }\Theta _{0})_+^\alpha -\max _{1-j\le i<0}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\right] . \end{aligned}$$

The expectations in the sum are of the type \(\mathbb{E }f(\Theta _{-s},\ldots ,\Theta _t)\) for integrable \(f\) such that \(f(x_{-s},\ldots ,x_t)=0\) if \(x_{-s}=0\), \(s,t\ge 0\). Then, according to Theorem 3.1 (iii) in [5],

$$\begin{aligned} \mathbb{E }f(\Theta _{-s},\ldots ,\Theta _t)= \mathbb{E }\Big ( f(\Theta _0/|\Theta _s|,\ldots , \Theta _{t+s}/|\Theta _s|)\,|\Theta _s|^\alpha \Big ),\quad s,t\ge 0. \end{aligned}$$

Application of this formula and the fact that our functions \(f\) are homogeneous of order \(\alpha \) yield

$$\begin{aligned}&b_{k+1}(\theta )-b_k(\theta ) =\mathbb{E }\left[ \left( \theta ^{\prime }\sum _{t=0}^{k}\Theta _t\right) _+^\alpha \wedge (k\theta ^{\prime }\Theta _{0})_+^\alpha \right] \\&\qquad +\sum _{j=1}^k \mathbb{E }\left[ \left( \left( \theta ^{\prime }\sum _{t=0}^{k}\Theta _t\right) _+^\alpha \!-\!\max _{0\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\!\wedge \! \left( (k\theta ^{\prime }\Theta _{j})_+^\alpha \!-\!\max _{0\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\right. \\&\qquad \left. -\left( \left( \theta ^{\prime }\sum _{t=1}^{k}\Theta _t\right) _+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\wedge \left( (k\theta ^{\prime }\Theta _{j})_+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\right] \\&\quad = \mathbb{E }\left[ \sum _{j=0}^k\left( \left( \theta ^{\prime }\sum _{t=0}^{k}\Theta _t\right) _+^\alpha \!-\!\max _{0\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\!\wedge \! \left( (k\theta ^{\prime }\Theta _{j})_+^\alpha \!-\!\max _{0\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\right. \\&\qquad \left. -\sum _{j=1}^k\left( \left( \theta ^{\prime }\sum _{t=1}^{k}\Theta _t\right) _+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\!\wedge \! \left( (k\theta ^{\prime }\Theta _{j})_+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\right] \\&\quad =\mathbb{E }\left[ \left( \theta ^{\prime }\sum _{t=0}^k\Theta _t\right) _+^\alpha -\left( \theta ^{\prime }\sum _{t=1}^k\Theta _t\right) _+^\alpha \right] \!. \end{aligned}$$

The last identity follows because there exists \(\ell =\min \{1\le j\le k:\,(k\theta ^{\prime }\Theta _{j})_+^\alpha \ge \Big (\theta ^{\prime }\sum _{t=1}^{k}\Theta _t\Big )_+^\alpha \}\) such that

$$\begin{aligned}&\left( \left( \theta ^{\prime }\sum _{t=1}^{k}\Theta _t\right) _+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+=0\quad \text{ for } \text{ all } j>\ell , \end{aligned}$$

and then also

$$\begin{aligned}&\left( \left( \theta ^{\prime }\sum _{t=1}^{k}\Theta _t\right) _+^\alpha -\max _{1\le i<\ell }(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\wedge \left( (k\theta ^{\prime }\Theta _{\ell })_+^\alpha -\max _{1\le i<\ell }(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\\&= \left( \theta ^{\prime }\sum _{t=1}^{k}\Theta _t\right) _+^\alpha -\max _{1\le i<\ell }(k\theta ^{\prime }\Theta _i )_+^\alpha , \end{aligned}$$

and

$$\begin{aligned}&\sum _{j=1}^{\ell -1}\left( \left( \theta ^{\prime }\sum _{t=1}^{k}\Theta _t\right) _+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\wedge \left( (k\theta ^{\prime }\Theta _{j})_+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\\&= \sum _{j=1}^{\ell -1}\left( (k\theta ^{\prime }\Theta _{j})_+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \right) _+\\&= \sum _{j=1}^{\ell -1} \max _{1\le i\le j}(k\theta ^{\prime }\Theta _i )_+^\alpha -\max _{1\le i<j}(k\theta ^{\prime }\Theta _i )_+^\alpha \\&= \max _{1\le i<\ell }(k\theta ^{\prime }\Theta _i )_+^\alpha . \end{aligned}$$

\(\square \)

The remainder of this paper crucially depends on the notion of cluster index of the sequence \((X_t)\), given as the limiting function:

$$\begin{aligned} b(\theta )=\lim _{k\rightarrow \infty } (b_{k+1}(\theta )-b_k(\theta )), \quad \theta \in \mathbb{S }^{d-1}. \end{aligned}$$

In contrast to the quantities \(b_k(\theta )\), the existence of the limits \(b(\theta )\) is not straightforward. The following result yields a sufficient condition for the existence of \(b\).

Theorem 3.2

Assume that \((X_t)\) satisfies \(\mathbf{(RV_\alpha )}\) for some \(\alpha >0\) and that \(X_t=f(\Phi _t)\), \(t\in \mathbb{Z }\), where \(f\) is an \(\mathbb{R }^d\)-valued function acting on the Markov chain \((\Phi _t)\) satisfying \(\mathbf{(DC_{ p})}\) for some positive \(p\in (\alpha -1,\alpha )\). Then the limits

$$\begin{aligned} b(\theta )&= \mathbb{E }\left[ \left( \sum _{t\ge 0}\theta ^{\prime }\Theta _t\right) ^\alpha _+-\left( \sum _{t\ge 1}\theta ^{\prime }\Theta _t\right) ^\alpha _+\right] ,\quad \theta \in \mathbb{S }^{d-1}, \end{aligned}$$

exist and are finite.
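For instance (our check, consistent with the iid case): if \((X_t)\) is iid, the spectral tail process vanishes off lag zero, so the two sums in Theorem 3.2 collapse.

```latex
% iid case: Theta_t = 0 a.s. for t >= 1, hence the second sum vanishes and
\begin{aligned}
b(\theta)=\mathbb{E}\big[(\theta^{\prime}\Theta_0)_+^{\alpha}\big],
\qquad \theta\in\mathbb{S}^{d-1},
\end{aligned}
% and for d = 1 and theta = 1 this recovers b(1) = P(Theta_0 = 1) = p.
```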

Remark 3.3

The cluster index \(b\) of \((X_t)\) is a continuous function on \(\mathbb{S }^{d-1}\). This is shown in the proof below: \(b\) is the uniform limit of continuous functions on \(\mathbb{S }^{d-1}\). The index \(b(\theta )\) is non-negative since it coincides with the Cesàro mean \(\lim _{k\rightarrow \infty } k^{-1} b_k(\theta )\). For \(0<\alpha \le 1\), the sub-additivity of the function \(x\mapsto x_+^\alpha \) implies the inequality \(b(\theta )\le \mathbb{E }[(\theta ^{\prime }\Theta _0)_+^\alpha ]\). Moreover, if \(\mathbb{E }[(\theta ^{\prime }\Theta _0)_+^\alpha ]> 0\) then \(b(\theta )>0\) by an application of the mean value theorem when \(0<\alpha \le 1\). These two properties are shared by the extremal index of a multivariate stationary process. The extremal index admits a similar representation in terms of the spectral tail process, i.e. \(\mathbb{E }[ (\sup _{t\ge 0}\theta ^{\prime }\Theta _t )^\alpha _+- (\sup _{t\ge 1}\theta ^{\prime }\Theta _t )^\alpha _+ ]\); see [4].

Remark 3.4

The limit \(b\) also exists for various classes of stationary processes beyond functions of a Markov chain; see [1, 34] for such examples in the case \(d=1\). The cluster index \(b\) plays a crucial role for characterizing weak and large deviation limits for partial sums of the processes \((X_t)\). This was recognized in [1, 34], and we extend some of these results to the multivariate case in Sect. 4.

Proof

We will show that the limit \(b(\theta )\) defined above exists as \(k\rightarrow \infty \). We start with the case \(\alpha >1\). Then, for \(x,y\in \mathbb{R }\), by the mean value theorem, \(|(x+y)^\alpha _+- x_+^\alpha |\le ( \alpha |y| |x+\xi y|^{\alpha -1})\vee |y|^\alpha \) for some \(\xi \in (0,1)\). Hence, since \(|\theta ^{\prime }\Theta _0|\le 1\) a.s.,

$$\begin{aligned} |b_{k+1}(\theta )-b_k(\theta )|&\le \mathbb{E }\left[ \left( \alpha \,|\theta ^{\prime }\Theta _0|\,\left| \sum _{t=1}^k \theta ^{\prime }\Theta _t+\xi \theta ^{\prime }\Theta _0 \right| ^{\alpha -1}\right) \vee |\theta ^{\prime }\Theta _0|^\alpha \right] \\&\le \mathbb{E }\left[ \left( \alpha \,\left| \sum _{t=1}^k \theta ^{\prime }\Theta _t+\xi \theta ^{\prime }\Theta _0 \right| ^{\alpha -1}\right) \vee 1\right] =I_0. \end{aligned}$$

For \(\alpha \in (1,2]\),

$$\begin{aligned} I_0\le 1+\alpha \sum _{t=0}^k \mathbb{E }|\theta ^{\prime }\Theta _t|^{\alpha -1}. \end{aligned}$$

We will show that the right-hand side is finite, implying that \(\mathbb{E }\big (\sum _{t=0}^\infty |\theta ^{\prime }\Theta _t|\big )^{\alpha -1}<\infty \) and that \(\sum _{t=0}^\infty \theta ^{\prime }\Theta _t\) converges absolutely a.s. An application of Lebesgue dominated convergence shows that the limit \(b(\theta )\) exists and is finite. For \(\alpha >2\), an application of Minkowski’s inequality yields

$$\begin{aligned} I_0\le 1+ \alpha \left( \sum _{t=0}^k (\mathbb{E }|\theta ^{\prime }\Theta _t|^{\alpha -1})^{1/(\alpha -1)}\right) ^{\alpha -1}. \end{aligned}$$

We will show that the right-hand side is finite and then the same argument as for \(\alpha \in (1,2]\) applies. We will achieve the bounds for \(I_0\) by showing that there exists \(c>0\) such that

$$\begin{aligned} \mathbb{E }|\theta ^{\prime }\Theta _t|^{\alpha -1}\le c\,\beta ^t,\quad t\ge 0. \end{aligned}$$
(3.1)

Using the fact that \(\Theta _t=Y_t/|Y_0|\) and \(|Y_0|\) are independent, we obtain for \(t\ge 1\) and \(s=\alpha -1\),

$$\begin{aligned} \mathbb{E }|Y_0|^s\,\mathbb{E }|\theta ^{\prime }\Theta _t|^s\le \mathbb{E }|Y_0|^s\, \mathbb{E }|\Theta _t|^s=\mathbb{E }|Y_t|^s. \end{aligned}$$

By definition of the tail process and Markov’s inequality, for small \(\epsilon >0\) such that \(s (1+\epsilon )<\alpha \),

$$\begin{aligned} \mathbb{E }|Y_t|^s&= \int \limits _0^\infty \mathbb{P }(|Y_t|^s>y) \,dy\\&= \int \limits _0^\infty \lim _{x\rightarrow \infty } \mathbb{P }(|x^{-1}X_t|^s>y\mid |X_0|>x) \,dy \\&\le \int \limits _1^\infty y^{-(1+\epsilon )}\,dy \lim _{x\rightarrow \infty } \dfrac{\mathbb{E }[|X_t|^{s(1+\epsilon )}\,1\!\!1_{\{|X_0|>x\}}]}{x^{s(1+\epsilon )}\mathbb{P }(|X_0|>x)}\\&+\int \limits _0^1 y^{-(1-\epsilon )}\,dy \lim _{x\rightarrow \infty } \dfrac{\mathbb{E }[|X_t|^{s(1-\epsilon )}\,1\!\!1_{\{|X_0|>x\}}]}{x^{s(1-\epsilon )}\mathbb{P }(|X_0|>x)}\\&= R_1+R_2. \end{aligned}$$

By virtue of \(\mathbf (DC _{p})\) for some \(p\in (\alpha -1,\alpha )\), using a recursive argument, we obtain for sufficiently large \(y\) and \(s(1+\epsilon )\le p\),

$$\begin{aligned} \mathbb{E }[|X_t|^{s(1+\epsilon )}\mid \Phi _0=y]\le \beta ^t |f(y)|^{s(1+\epsilon )} + b\,\sum _{j=1}^t\beta ^j. \end{aligned}$$

Using this inequality and Karamata’s theorem (see [6]), for some \(c>0\),

$$\begin{aligned} R_1&\le c\,\lim _{x\rightarrow \infty } \dfrac{\mathbb{E }\big [ 1\!\!1_{\{|X_0|>x\}} \mathbb{E }[|X_t|^{s(1+\epsilon )}\mid \Phi _0]\big ]}{x^{s(1+\epsilon )}\mathbb{P }(|X_0|>x)}\\&\le c\,\beta ^t\,\lim _{x\rightarrow \infty } \dfrac{\mathbb{E }[|X_0|^{s(1+\epsilon )} 1\!\!1_{\{|X_0|>x\}} ]}{x^{s(1+\epsilon )}\mathbb{P }(|X_0|>x)} \le c \beta ^t. \end{aligned}$$

Similarly, \(R_2\le c \beta ^t\). We conclude that (3.1) holds for \(\alpha >1\).

It remains to consider the case \(\alpha \le 1\). We observe that \(|(x+y)_+^\alpha -x_+^\alpha |\le |y|^\alpha \) for any \(x,y\in \mathbb{R }\). Hence

$$\begin{aligned} |b_{k+1}(\theta )-b_k(\theta )|\le \mathbb{E }|\theta ^{\prime }\Theta _0|^\alpha \le 1. \end{aligned}$$

It suffices to show that \(\sum _{t=0}^\infty |\theta ^{\prime }\Theta _t|<\infty \) a.s. This follows if \(\sum _{t=0}^\infty \mathbb{E }|\theta ^{\prime }\Theta _t|^s<\infty \) for some \(s<p\). The proof is analogous, using \(\mathbf{(DC_{ p})}\) for some \(p<\alpha \). \(\square \)

4 Limit theory for functions of regenerative Markov chains

In this section we present the main results of this paper. Throughout we consider an \(\mathbb{R }^d\)-valued process \(X_t=f(\Phi _t)\), \(t\in \mathbb{Z }\), where \((\Phi _t)\) is an irreducible aperiodic Markov chain. We present two types of limit results for the partial sums \((S_n)\) of \((X_t)\): central limit theory with infinite variance stable limits in Theorem 4.1 and precise large deviation results in Theorem 4.3. The proofs of these results are postponed to Sects. 6 and 7.

4.1 Stable limit theory

We start with a central limit theorem with stable limit law.

Theorem 4.1

Consider an \(\mathbb{R }^d\)-valued strictly stationary sequence \((X_t)=(f(\Phi _t))\) satisfying the following conditions:

  • \(\mathbf (RV _\alpha )\) for some \(\alpha \in (0,2)\), \(\mathbb{E }X=0\) if \(\alpha >1\) and \(X\) is symmetric if \(\alpha =1\).

  • \(\mathbf (DC _{p})\) for some \(p\in ((\alpha -1)\vee 0,\alpha )\).

Let \((a_n)\) be a sequence of positive numbers such that \(n\,\mathbb{P }(|X_0|>a_n)\sim 1\). Then the following statements hold:

  1. (1)

    The central limit theorem \(a_n^{-1} S_n \stackrel{d}{\rightarrow }\xi _\alpha \) is satisfied for a centered \(\alpha \)-stable random vector \(\xi _\alpha \) with spectral measure \(\Gamma _\alpha \) on \(\mathbb{S }^{d-1}\) (see [44], Section 2.3, for a definition) given by the relation

    $$\begin{aligned} b(\theta )=C_\alpha \,\int \limits _{\mathbb{S }^{d-1}} (\theta ^{\prime }s)^\alpha _+\Gamma _\alpha (ds) ,\quad \theta \in \mathbb{S }^{d-1}, \end{aligned}$$
    (4.1)

    where \(b\) is the cluster index of \((X_t)\) introduced in Sect.  3 and

    $$\begin{aligned} C_\alpha&= \dfrac{1-\alpha }{\Gamma (2-\alpha ) \cos (\pi \alpha /2)}. \end{aligned}$$
    (4.2)

    If \(b\equiv 0\), then \(\xi _\alpha =0\) a.s.

  2. (2)

    If \(b\ne 0\) the partial sums over full cycles \((S(i))_{i=1,2,\ldots }\) defined in (2.5) are regularly varying with index \(\alpha \) and spectral measure \(\mathbb{P }_{\Theta ^{\prime }}(\cdot )\) on \(\mathbb{S }^{d-1}\) given by

    $$\begin{aligned} d\mathbb{P }_{\Theta ^{\prime }}(ds) =\dfrac{b(s) }{\int _{\mathbb{S }^{d-1}}b(\theta )\,d\mathbb{P }_{\Theta }(\theta )}\, d\mathbb{P }_{\Theta }(ds). \end{aligned}$$
    (4.3)

The proof of Theorem 4.1 is given in Sect. 6. To a large extent, the results of Theorem 4.1 can be extended to the case of non-irreducible Markov chains. A short discussion of this topic will be given at the end of Sect. 6.
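As an informal numerical aside (not part of the formal development): for an exact Pareto tail \(\mathbb{P }(|X_0|>x)=x^{-\alpha }\), \(x\ge 1\), the normalization \(a_n=n^{1/\alpha }\) satisfies \(n\,\mathbb{P }(|X_0|>a_n)=1\), and the constant (4.2) can be evaluated directly. A minimal Python sketch under this illustrative tail assumption:

```python
import math

def C_alpha(alpha: float) -> float:
    """Constant (4.2): C_alpha = (1 - alpha) / (Gamma(2 - alpha) * cos(pi*alpha/2)),
    for alpha in (0, 2), alpha != 1 (interpreted by continuity at alpha = 1)."""
    return (1.0 - alpha) / (math.gamma(2.0 - alpha) * math.cos(math.pi * alpha / 2.0))

def pareto_tail(x: float, alpha: float) -> float:
    """Exact Pareto tail P(|X| > x) = x^{-alpha}, x >= 1 (illustrative assumption)."""
    return x ** (-alpha)

def a_n(n: int, alpha: float) -> float:
    """Normalizing sequence with n * P(|X| > a_n) = 1 for the Pareto tail."""
    return n ** (1.0 / alpha)

alpha = 0.5
n = 10_000
print(C_alpha(alpha))                          # C_{1/2} = sqrt(2/pi) ≈ 0.7979
print(n * pareto_tail(a_n(n, alpha), alpha))   # exactly 1 by construction
```

For instance, at \(\alpha =1/2\) the constant simplifies to \(C_{1/2}=\sqrt{2/\pi }\).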

4.2 A discussion of related stable limit results

Theorem 4.1 complements the central limit theorem with Gaussian limits for \(\mathbb{R }^d\)-valued functions of a Markov chain; see Theorem 2.2 above. For both results, conditions of type \(\mathbf (DC _{p})\) enter the proofs to show the existence of moments of \(S(1)\) under the existence of the corresponding moments for \(X_0\).

The history of stable limit theory for non-linear multivariate time series is short in comparison with the finite variance case. Davis and Mikosch [16] prove a central limit theorem with \(\alpha \)-stable limit for an \(\mathbb{R }^d\)-valued strictly stationary sequence \((X_t)\) satisfying a weak dependence condition. The result is a straightforward extension of the one-dimensional result proved in Theorem 3.1 of Davis and Hsing [15]. We recall the aforementioned results for comparison with Theorem 4.1.

Theorem 4.2

Assume that the strictly stationary \(\mathbb{R }^d\)-valued sequence \((X_t)\) satisfies \(\mathbf (RV _\alpha )\) for some \(\alpha >0\) and the following point process convergence result holds:

$$\begin{aligned} N_n=\sum _{t=1}^n\delta _{a_n^{-1} X_t}\stackrel{d}{\rightarrow }N= \sum _{i=1}^\infty \sum _{j=1}^\infty \delta _{P_i Q_{ij}}, \end{aligned}$$

where \((P_i)\) are the points of a Poisson random measure on \((0,\infty )\) with intensity \(h(y)=\gamma \alpha y^{-\alpha -1}\), \(y>0\), where it is assumed that \(\gamma >0\),Footnote 1 and the sequences \((Q_{ij})_{j\ge 1}\), \(i=1,2,\ldots ,\) are iid, independent of \((P_i)\), with values satisfying \(|Q_{ij}|\le 1\) and \(\sup _{j\ge 1}|Q_{ij}|=1\).

  1. (1)

    If \(\alpha \in (0,1)\) then

    $$\begin{aligned} a_n^{-1} S_n\stackrel{d}{\rightarrow }\xi _\alpha =\sum _{i=1}^\infty \sum _{j=1}^\infty P_i Q_{ij} \end{aligned}$$

    and \(\xi _\alpha \) has an \(\alpha \)-stable distribution.

  2. (2)

    If \(\alpha \in [1,2)\) and for any \(\delta >0\),

    $$\begin{aligned} \lim _{\varepsilon \downarrow 0}\limsup _{n\rightarrow \infty }\mathbb{P }(|S_n(0,\varepsilon ]- \mathbb{E }S_n(0,\varepsilon ] |>\delta )=0, \end{aligned}$$
    (4.4)

    where \(S_n(0,\varepsilon ]=a_n^{-1}\sum _{t=1}^n X_t 1\!\!1_{\{|X_t|\le \varepsilon a_n\}}\), then

    $$\begin{aligned} a_n^{-1} S_n- \mathbb{E }S_n(0,1]\stackrel{d}{\rightarrow }\xi _\alpha , \end{aligned}$$

    where \(\xi _\alpha \) is the distributional limit as \(\varepsilon \downarrow 0\) of

    $$\begin{aligned} \left( \sum _{i=1}^\infty \sum _{j=1}^\infty P_i Q_{ij} 1\!\!1_{(\varepsilon ,\infty )}( P_i |Q_{ij}|) -\int \limits _{\varepsilon <|x|\le 1} x\,\mu _X(dx)\right) \end{aligned}$$

    which exists and has an \(\alpha \)-stable distribution. (Recall that \(\mu _X\) is the limit measure in (2.2).)

The latter result has been the basis for a variety of results for partial sums of strictly stationary processes with infinite variance stable limits; see [4, 16, 31, 46]. The main idea of the proof of Theorem 4.2 is a continuous mapping argument acting on \(N_n\stackrel{d}{\rightarrow }N\), showing that the sums of the points of \(N_n\) converge in distribution to the corresponding sum of the points of \(N\). This method is rather elegant and can be applied to a large variety of strictly stationary vector sequences \((X_t)\). The proofs use advanced point process techniques.
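To make the point process representation concrete: for \(\alpha \in (0,1)\) the Poisson points \(P_i\) with intensity \(\gamma \alpha y^{-\alpha -1}\) (here \(\gamma =1\)) can be realized as \(P_i=\Gamma _i^{-1/\alpha }\), where \(\Gamma _i\) are the arrival times of a unit-rate Poisson process, and the limit in part (1) becomes an a.s. convergent LePage-type series. A hedged simulation sketch, in which the cluster values \(Q_{ij}\) are chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def lepage_points(alpha: float, n_points: int = 2000) -> np.ndarray:
    """Poisson points with intensity alpha * y^{-alpha-1} on (0, inf) (gamma = 1),
    realized as P_i = Gamma_i^{-1/alpha} for unit-rate Poisson arrivals Gamma_i."""
    gammas = np.cumsum(rng.exponential(size=n_points))
    return gammas ** (-1.0 / alpha)

def lepage_series(alpha: float, cluster: int = 3) -> float:
    """Truncated LePage-type series sum_i sum_j P_i * Q_ij for alpha < 1.
    The cluster values are illustrative: Q_i1 = 1 (so sup_j |Q_ij| = 1),
    the rest uniform on [-1, 1], iid across clusters i."""
    p = lepage_points(alpha)
    q = rng.uniform(-1.0, 1.0, size=(p.size, cluster))
    q[:, 0] = 1.0
    return float(np.sum(p[:, None] * q))

p = lepage_points(0.7)
x = lepage_series(0.7)
# For alpha < 1 the full series is a.s. absolutely convergent:
# P_i ~ i^{-1/alpha} with 1/alpha > 1, so sum_i P_i < infinity.
```

The truncation at `n_points` is harmless for \(\alpha <1\) because the discarded points are the smallest ones.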

A characterization of the parameters of the distribution of the multivariate limit \(\xi _\alpha \) in Theorem 4.2 can be given by extending Theorem 3.2 in [15] to the multidimensional case: if

$$\begin{aligned} \mathbb{E }\left( \sum _{j\ge 1} |Q_{1j}|\right) ^\alpha <\infty \end{aligned}$$
(4.5)

then the Lévy spectral measure \(\Gamma _\alpha \) of \(\xi _\alpha \) is described by

$$\begin{aligned} \int \limits _{\mathbb{S }^{d-1}} (\theta ^{\prime }s)^\alpha _+\Gamma _\alpha (ds)=\gamma \frac{\alpha }{2-\alpha }\,\mathbb{E }\big [\big (\sum _{t\ge 1}\theta ^{\prime }Q_{1t}\big )^\alpha _+ \big ],\quad \theta \in \mathbb{S }^{d-1}. \end{aligned}$$

This representation is particularly useful for \(\alpha <1\): in this case, (4.5) is always satisfied. Adapting Theorem 4.2 in terms of the tail process as in Basrak et al. [4], an alternative characterization of the Lévy spectral measure \(\Gamma _\alpha \) is the following: if

$$\begin{aligned} \mathbb{E }\left( \sum _{t\ge 0} |\Theta _{t}|\right) ^\alpha <\infty \end{aligned}$$
(4.6)

then

$$\begin{aligned} \int \limits _{\mathbb{S }^{d-1}} (\theta ^{\prime }s)^\alpha _+\Gamma _\alpha (ds)= C_\alpha ^{-1}\mathbb{E }\Big [\Big (\sum _{t\ge 0}\theta ^{\prime }\Theta _{t}\Big )^\alpha _+1\!\!1_{\{\Theta _i=0,\,\forall i\le -1\}} \Big ],\quad \theta \in \mathbb{S }^{d-1}. \end{aligned}$$

Conditions (4.5) and (4.6) may fail for \(\alpha >1\), e.g. for a GARCH(1,1) model; see Sect. 5.4.

If we assume the conditions of Theorem 4.1, a classical computation for \(\alpha \ne 1\) yields

$$\begin{aligned} \mathbb{E }[\exp (iv^{\prime }\xi _\alpha )]&= \exp \left( -\int \limits _{\mathbb{S }^{d-1}}|v^{\prime }\theta |^\alpha (1-i\text{ sign }(v^{\prime }\theta )\tan (\pi \alpha /2))\Gamma _\alpha (d\theta )\right) \\&= \exp \left( \int \limits _0^\infty \mathbb{E }\left[ \exp \left( iu\sum _{t=1}^\infty v^{\prime }\Theta _t\right) -\exp \left( iu\sum _{t=0}^\infty v^{\prime }\Theta _t\right) \right] \alpha u^{-\alpha -1}\,du\right) \!. \end{aligned}$$

For \(\alpha \in (0,1)\), this form of the limiting stable characteristic function was proved in Basrak and Segers [5].

The additional condition (4.4) is not easily checked for dependent sequences. It is satisfied for stationary \(\rho \)-mixing processes with rate function \(\rho (j)\) satisfying \(\sum _{j\ge 1}\rho (2^j)<\infty \); see [24]. It is also implied by \(\mathbf (DC _{p})\) for functions of an irreducible Markov chain; see [34]. For a (possibly non-irreducible) Markov chain \((X_t)\), condition \(\mathbf (DC _{p})\) is much weaker than this \(\rho \)-mixing condition, which is equivalent to a spectral gap in \(L^2(\mathbb{P })\); see [30].

In our paper, characteristic function based methods are employed which are close to those used in classical limit theory for iid sequences; see e.g. [38]. As in the iid case, Theorem 4.1 yields an explicit form of the characteristic function of the limiting \(\alpha \)-stable random vector. The underlying extremal dependence structure of \((X_t)\) manifests itself in the cluster index \(b(\theta )\), which appears explicitly in the characteristic function. We refer the reader to the extensive discussion in [1] on the comparison of the point process and the characteristic function approaches to stable limit theory. One drawback of our approach is that, in contrast to the point process approach, we do not obtain series representations of \(\xi _\alpha \) in terms of the sequence \((\Theta _t)\).

Recently, the special case of solutions to multivariate stochastic recurrence equations (5.3) has attracted attention; see e.g. [11, 14]. In this case, one can exploit the underlying random iterative contractive structure to derive stable limits without additional restrictions. We mention that drift conditions such as \(\mathbf (DC _{p})\) are automatically satisfied for solutions of stochastic recurrence equations; see Sect. 5.

4.3 Precise large deviations for functions of a Markov chain

In this section, we extend some of the results obtained in [34] for general univariate sequences.Footnote 2 We again focus on \(\mathbb{R }^d\)-valued sequences \((X_t)=(f(\Phi _t))\) for an underlying aperiodic irreducible Markov chain \((\Phi _t)\). The case \(\alpha \in (0,2)\) turns out to be a consequence of Theorem 4.1; the proof is given in Sect. 7.1. The proof in the case \(\alpha >2\) is more involved and requires different techniques; see Sect. 7.2.

Theorem 4.3

Consider an \(\mathbb{R }^d\)-valued strictly stationary sequence \((X_t)=(f(\Phi _t))\) for an aperiodic irreducible Markov chain \((\Phi _t)\). Assume that \((X_t)\) satisfies the condition \(\mathbf (RV _\alpha )\) for some \(\alpha >0\). Let \((\lambda _n)\) be any sequence such that \(\log (\lambda _n)=o(n)\) and, for any \(\varepsilon >0\), \(\lambda _n/n^{1/\alpha +\varepsilon }\rightarrow \infty \) if \(\alpha \in (0,2)\) and \(\lambda _n/n^{0.5+\varepsilon }\rightarrow \infty \) if \(\alpha >2\). Assume either

  1. (1)

    \(\alpha \in (0,2)\) and the conditions of Theorem 4.1 are satisfied, or

  2. (2)

    \(\alpha >2\), \(\alpha \not \in \mathbb{N }\) or \(b(\theta )=b(-\theta )\), \(\theta \in \mathbb{S }^{d-1}\), and \(\mathbf (DC _{p})\) holds for every \(p<\alpha \),

then the following large deviation principle holds:

$$\begin{aligned} \dfrac{\mathbb{P }(\lambda _n^{-1}S_n\in \cdot )}{n\,\mathbb{P }(|X|>\lambda _n)}\stackrel{v}{\rightarrow }\nu _\alpha ,\quad n\rightarrow \infty , \end{aligned}$$
(4.7)

where \(\nu _\alpha \) is a Radon measure on the Borel \(\sigma \)-field of \(\overline{\mathbb{R }}^d_0\) uniquely determined by the relations

$$\begin{aligned} \nu _\alpha (t\{x: \theta ^{\prime }x >1\})\!=\! t^{-\alpha } \nu _\alpha (\{x: \theta ^{\prime }x >1\})=t^{-\alpha } \,b(\theta ),\quad \theta \in \mathbb{S }^{d-1},\; t>0.\qquad \end{aligned}$$
(4.8)

Remark 4.4

The conditions \(\alpha \not \in \mathbb{N }\) or \(b(\cdot )=b(-\cdot )\) are needed to apply inverse results for regular variation. For \(\alpha >2\), we show that the measure \(\nu _\alpha \) on the Borel \(\sigma \)-field of \(\overline{\mathbb{R }}_0^d\) is uniquely determined by its values on sets of the form \(t\{x: \theta ^{\prime }x >1\}\), \(t>0\), \(\theta \in \mathbb{S }^{d-1}\), provided the mentioned additional conditions are met. In general, such conditions cannot be avoided; [21, 26] give counterexamples for integer values of \(\alpha \). In [2, 7, 27] further conditions on the vector \(X\) are given which allow one to recover the measure \(\nu _\alpha \) from its knowledge on the sets \(t\{x: \theta ^{\prime }x >1\}\), \(t>0\), \(\theta \in \mathbb{S }^{d-1}\).

Remark 4.5

The proof of Theorem 4.3 shows that (4.7) holds uniformly for certain intervals of normalizations and for half-spaces not containing the origin. To be precise, the following uniform relations hold

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{x\in \Lambda _n}\Big |\frac{\mathbb{P }(\theta ^{\prime }S_n> x)}{n\,\mathbb{P }(|X|> x)}-b(\theta ) \Big |=0,\quad \theta \in \mathbb{S }^{d-1}, \end{aligned}$$
(4.9)

for regions \(\Lambda _n=(b_n,c_n)\). Here \((b_n)\) satisfies \(b_n=n^{0.5+\varepsilon }\) in the case \(\alpha >2\) and \(b_n=n^{1/\alpha +\varepsilon }\) in the case \(\alpha \in (0,2)\) for any \(\varepsilon >0\), and \((c_n)\) is chosen such that \(c_n>b_n\) and \(\log c_n=o(n)\). Moreover, for (4.9) one does not need the additional conditions \(b(\cdot )=b(-\cdot )\) and \(\alpha \not \in \mathbb{N }\).
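As an informal numerical illustration of (4.9) in the simplest degenerate case: for an iid positive Pareto sequence with \(\alpha <1\) the cluster index is \(b(1)=1\), so the ratio \(\mathbb{P }(S_n>x)/(n\,\mathbb{P }(X>x))\) should be close to 1 for \(x\) in the large deviation region. A hedged Monte Carlo sketch (all parameter choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def ld_ratio(alpha: float, n: int, x: float, trials: int) -> float:
    """Monte Carlo estimate of P(S_n > x) / (n * P(X > x)) for iid Pareto(alpha)
    variables with P(X > t) = t^{-alpha}, t >= 1 (numpy's pareto is the Lomax
    distribution, hence the shift by 1). For iid variables the cluster index is
    b(1) = 1, so deep in the large deviation region the ratio should be near 1."""
    samples = rng.pareto(alpha, size=(trials, n)) + 1.0
    s = samples.sum(axis=1)
    return float(np.mean(s > x)) / (n * x ** (-alpha))

# alpha < 1: no centering is needed; x = 2e4 lies far beyond n^{1/alpha} ~ 133.
r = ld_ratio(alpha=0.8, n=50, x=2e4, trials=100_000)
```

At finite levels the estimate carries a small positive bias from multiple moderately large summands; the ratio only tends to \(b(1)=1\) as the level grows.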

5 Examples

Here we consider several examples of regularly varying stationary processes with index \(\alpha >0\), where the theory of the previous sections applies. In particular, we will determine the tail process \((\Theta _t)\), the cluster index \(b\) and verify the drift condition \(\mathbf (DC _{p})\) for \(p<\alpha \). All models considered fall in the class of functions acting on an aperiodic irreducible Markov chain.

5.1 Vector-autoregressive process

Consider the vector-autoregressive process of order 1 given by

$$\begin{aligned} X_t = A\,X_{t-1} + Z_t,\quad t\in \mathbb{Z }, \end{aligned}$$
(5.1)

where \(A\) is a random \(d\times d\) matrix whose eigenvalues are less than 1 in absolute value, and \(A\) is independent of the iid \(\mathbb{R }^d\)-valued sequence \((Z_t)\) which is regularly varying with index \(\alpha >0\). Then we also have \(\mathbb{E }\Vert A\Vert ^s<1\) for every \(s>0\). Here \(\Vert \cdot \Vert \) denotes the operator norm with respect to the Euclidean norm.

Then a stationary solution \((X_t)\) to (5.1) exists and has representation

$$\begin{aligned} X_t= A^t X_0 + \sum _{i=1}^t A^{t-i} Z_i,\quad t\ge 0; \end{aligned}$$

see [10], Chapter 11. Moreover, \(X_0\) is regularly varying with index \(\alpha \); see [42]. In particular, denoting the limiting measure of the regularly varying vector \(Z_0\) by \(\mu _Z\), it follows from [42] that

$$\begin{aligned} \dfrac{\mathbb{P }( x^{-1} X_0\in \cdot )}{\mathbb{P }(|Z_0|>x)}\stackrel{v}{\rightarrow }\sum _{i=0}^\infty \mathbb{E }\big [\mu _{Z}\big (\{x\in \mathbb{R }^d: A^i x\in \cdot \}\big )\big ]. \end{aligned}$$
(5.2)

Since

$$\begin{aligned} (X_1,\ldots ,X_h)= (A,\ldots , A^h) X_0+ \left( Z_1,\ldots , \sum _{t=1}^h A^{h-t} Z_t \right) , \end{aligned}$$

and \((Z_t)_{t\ge 1}\) is independent of \(X_0\), regular variation of \((X_1,\ldots ,X_h)\) is a consequence of the fact that regular variation is preserved under linear transformations. Let \(C\) be a continuity set relative to the limiting measure \(\mu _{h+1}\) of \((X_0,\ldots ,X_h)\) and \(I_d\) the identity matrix. Since \(X_0\) is independent of \((Z_t)_{t\ge 1}\),

$$\begin{aligned}&\mathbb{P }(x^{-1} (X_0,\ldots ,X_h)\in C \mid |X_0|>x) =\mathbb{P }(x^{-1}(I_d,A,\ldots , A^h) \,X_0 \in C \mid |X_0|>x)\\&\quad +\mathbb{P }\left( x^{-1}\left( 0,Z_1, \sum _{i=1}^2 A^{2-i} Z_i,\ldots , \sum _{i=1}^h A^{h-i} Z_i\right) \in C\right) + o(1)\\&\quad \rightarrow \mathbb{P }((I_d,A,\ldots ,A^h) Y_0\in C) ,\quad x\rightarrow \infty . \end{aligned}$$

Thus we may identify \((\Theta _t)_{t=0,\ldots ,h}\) with \((I_d,A,\ldots , A^h) \Theta _0\). In view of (5.2),

$$\begin{aligned}&\mathbb{P }( X_0/|X_0|\!\in \!\cdot \mid |X_0|\!>\!x)\!\stackrel{w}{\rightarrow }\! \dfrac{\sum _{i=0}^\infty \mathbb{E }\big [\mu _{Z}\big (\{x\!\in \! \mathbb{R }^d: A^i x/|A^i x|\in \cdot ,|A^i x|\!>\!1\}\big )\big ]}{\sum _{i=0}^\infty \mathbb{E }\big [\mu _{Z}\big (\{x\in \mathbb{R }^d: |A^i x|\!>\!1\}\big )\big ]}\\&\quad =\mathbb{P }(\Theta _0\in \cdot ). \end{aligned}$$

Writing \((I_d-A)^{-1}=\sum _{t=0}^\infty A^t\) (the series converges since the spectral radius of \(A\) is smaller than 1), we conclude that

$$\begin{aligned} b(\theta )= \mathbb{E }\Big [\big (\theta ^{\prime } (I_d-A)^{-1}\Theta _0\big )_+^{\alpha } -\big (\theta ^{\prime } A (I_d-A)^{-1} \Theta _0\big )_+^{\alpha } \Big ],\quad \theta \in \mathbb{S }^{d-1}. \end{aligned}$$

Next we show \(\mathbf (DC _{p})\) for \(p<\alpha \). First assume \(p>1\). A Taylor series expansion yields

$$\begin{aligned} \mathbb{E }( |A x+Z_1|^p- |A x|^p)&\le p \mathbb{E }[|Z_1| \,|Ax + \xi Z_1|^{p-1}]\\&\le c\,(\mathbb{E }|Ax|^{p-1}+1)\le c\,( |x|^{p-1} +1) \end{aligned}$$

for some random variable \(\xi \in (0,1)\) a.s. Then for some \(\beta \in (\mathbb{E }\Vert A\Vert ^p,1)\) and sufficiently large \(|x|\),

$$\begin{aligned} \mathbb{E }|A x\!+\!Z_1|^p\!\le \! \mathbb{E }|A x|^p \!+\! c \,(1\!+\! |x|^{p-1})\!\le \! \mathbb{E }\Vert A\Vert ^p |x|^p (1+c\,|x|^{-1}) \!+\!c\le \beta |x|^p +c, \end{aligned}$$

and \(\mathbf (DC _{ p})\) is satisfied. If \(p\le 1\) a simpler argument applies with \(\beta =\mathbb{E }\Vert A\Vert ^p\):

$$\begin{aligned} \mathbb{E }|A x+Z_1|^p\le \mathbb{E }|A x|^p +\mathbb{E }|Z_1|^p\le \beta \,|x|^p +c. \end{aligned}$$

If the Markov chain \((X_t)\) is also aperiodic and irreducible the results in Sect. 4 are directly applicable with \(f(x)=x\).
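To make the cluster index formula above concrete in the simplest setting: for \(d=1\), a deterministic coefficient \(A=a\in (0,1)\) and positive noise (so that \(\Theta _0\equiv 1\)), the expression reduces to the closed form \(b(1)=(1-a)^{-\alpha }(1-a^{\alpha })\). A minimal sketch under these illustrative assumptions:

```python
def ar1_cluster_index(a: float, alpha: float) -> float:
    """Cluster index b(1) for a scalar AR(1) process X_t = a*X_{t-1} + Z_t with
    0 < a < 1 and positive regularly varying noise, so that Theta_0 = 1:
        b(1) = ((1-a)^{-1})^alpha - (a*(1-a)^{-1})^alpha
             = (1-a)^{-alpha} * (1 - a^alpha)."""
    geo = 1.0 / (1.0 - a)          # (I_d - A)^{-1} in dimension d = 1
    return max(geo, 0.0) ** alpha - max(a * geo, 0.0) ** alpha

# Example: a = 0.5, alpha = 1 gives b(1) = 2 - 1 = 1.
print(ar1_cluster_index(0.5, 1.0))   # -> 1.0
```

The `max(..., 0.0)` factors mirror the positive parts \((\cdot )_+\) in the general formula; they are inactive here because both arguments are positive.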

5.2 Random affine mapping

Following Kesten [26], we consider the stochastic recurrence equation

$$\begin{aligned} X_t= A_t\,X_{t-1}+ B_t,\quad t \in \mathbb{Z }, \end{aligned}$$
(5.3)

where \(((A_t,B_t))_{t\in \mathbb{Z }}\) is an iid sequence, \(A_t\) are random \(d\times d\)-matrices and \(B_t\) are \(\mathbb{R }^d\)-valued random vectors. We also assume \(\mathbb{E }\log ^+\Vert A\Vert <\infty \), where \(\Vert \cdot \Vert \) denotes the operator norm with respect to the Euclidean norm, \(\mathbb{E }\log ^+|B|<\infty \), and that the Lyapunov exponent of the stochastic recurrence equation (5.3) is negative. These conditions ensure that an a.s. unique stationary causal solution \((X_t)\) to (5.3) exists; see [8]. Under additional regularity conditions which ensure that the distribution of \(A\) is sufficiently spread out, the equation

$$\begin{aligned} \varrho (\kappa )=\lim _{n\rightarrow \infty } n^{-1} \log \mathbb{E }\Vert A_1\cdots A_n\Vert ^\kappa =0,\quad \kappa >0, \end{aligned}$$
(5.4)

has a unique positive solution \(\alpha \) and \(\theta ^{\prime }X\), \(\theta \in \mathbb{S }^{d-1}\), is regularly varying with index \(\alpha \). Under stronger conditions on \(A\), \(\alpha \) can be calculated as the solution to \(\mathbb{E }\Vert A\Vert ^\kappa =1\), \(\kappa >0\); see [12, 18] for recent results. Kesten [26] had already given conditions which ensured that at least one of the linear combinations \(\theta ^{\prime }X\), \(\theta \in \mathbb{S }^{d-1}\), is regularly varying with index \(\alpha \). In general, one cannot conclude from regular variation of \(\theta ^{\prime }X\), \(\theta \in \mathbb{S }^{d-1}\), that \(X\) is regularly varying; see [21, 26] for some counterexamples.Footnote 3 In [2, 7, 27] conditions are given which ensure that the regular variation of a vector can be recovered from the regular variation of its linear projections. One of these conditions is that \(\alpha \not \in \mathbb{N }\); see [2] for details. In what follows, we will assume that \(X_t\) is regularly varying with index \(\alpha >0\) and that the stronger moment conditions \(\mathbb{E }\Vert A\Vert ^{2(\alpha +\epsilon )}<\infty \) and \(\mathbb{E }|B|^{2(\alpha +\epsilon )}<\infty \) hold for some \(\epsilon >0\). If \(A_t\) and \(B_t\) are independent the milder moment conditions \(\mathbb{E }\Vert A\Vert ^{\alpha +\epsilon }<\infty \) and \(\mathbb{E }|B|^{\alpha +\epsilon }<\infty \) for some \(\epsilon >0\) suffice.
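In dimension \(d=1\) with \(A>0\), Eq. (5.4) reduces to solving \(\mathbb{E }A^\kappa =1\), and \(\alpha \) can be found by root-finding on the convex moment function \(\kappa \mapsto \mathbb{E }A^\kappa \). A sketch for the illustrative choice of lognormal \(A=e^{N(\mu ,\sigma ^2)}\), where \(\mathbb{E }A^\kappa =e^{\mu \kappa +\sigma ^2\kappa ^2/2}\) gives the exact root \(\alpha =-2\mu /\sigma ^2\) to compare against:

```python
def log_moment(kappa: float, mu: float, sigma2: float) -> float:
    """log E A^kappa = mu*kappa + sigma2*kappa^2/2 for lognormal A = exp(N(mu, sigma2));
    working on the log scale avoids overflow for large kappa."""
    return mu * kappa + 0.5 * sigma2 * kappa * kappa

def kesten_index(mu: float, sigma2: float, hi: float = 50.0, tol: float = 1e-10) -> float:
    """Bisection for the unique positive root of E A^kappa = 1, i.e. of
    log_moment(kappa) = 0. Requires a negative Lyapunov exponent E log A = mu < 0
    (so the convex log-moment function dips below zero before crossing up)."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_moment(mid, mu, sigma2) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# mu = -1, sigma2 = 1: the exact root is alpha = -2*mu/sigma2 = 2.
print(kesten_index(-1.0, 1.0))   # -> 2.0 (up to bisection tolerance)
```

For a general distribution of \(A\) the same bisection applies with the log-moment estimated numerically.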

Calculation yields

$$\begin{aligned} X_t=\Pi _t X_0 +R_t,\quad \text{ where }\quad \Pi _t=A_t\cdots A_1,\quad t\ge 1, \end{aligned}$$
(5.5)

where \(\mathbb{E }|R_t|^{\alpha +\epsilon }<\infty \) and hence

$$\begin{aligned} \mathbb{P }( x^{-1} (X_0,\ldots ,X_t)\in \cdot \mid |X_0|>x)&\stackrel{w}{\rightarrow }&\mathbb{P }(|Y_0|\, (I_d,\Pi _1,\ldots ,\Pi _t)\Theta _0\in \cdot ), \end{aligned}$$

where \(\mathbb{P }(X_0/|X_0|\in \cdot \mid |X_0|>x)\stackrel{w}{\rightarrow }\mathbb{P }(\Theta _0\in \cdot )\) and \(\Theta _0\) is independent of \((A_t)_{t\ge 1}\). Therefore \((\Theta _i)_{i=0,\ldots ,t}=(I_d,\Pi _1,\ldots , \Pi _t)\Theta _0\). Writing \(\Pi _0=I_d\), the identity matrix in \(\mathbb{R }^d\), and \((Z_t)\) for the solution of the stochastic recurrence equation (5.3) in the special case \(B=I_d\), we obtain from Theorem 3.2,

$$\begin{aligned} b(\theta )&= \mathbb{E }\Big [\Big (\theta ^{\prime }\sum _{t\ge 0} \Pi _t \Theta _0\Big )^\alpha _+-\Big (\theta ^{\prime }\sum _{t\ge 1}\Pi _t\Theta _0\Big )^\alpha _+\Big ]\nonumber \\&= \mathbb{E }\Big [\Big (\theta ^{\prime }(Z_1+I_d) \Theta _0\Big )^\alpha _+-\Big (\theta ^{\prime }Z_1\Theta _0\Big )^\alpha _+\Big ] ,\quad \theta \in \mathbb{S }^{d-1}, \end{aligned}$$
(5.6)

provided we can show \(\mathbf (DC _{p})\) for the Markov chain \((\Phi _t)=(X_t)\). The formula (5.6) is in agreement with the calculations for \(d=1\) in [1].

Since (5.4) is satisfied we can use Lemma 2.1 for proving \(\mathbf (DC _{p})\). Assuming irreducibility and aperiodicity of the Markov chain \((X_t)\) and exploiting the definition of \(\alpha \) as the solution to (5.4), one can choose \(m\ge 1\) sufficiently large such that \(\mathbb{E }\Vert A_1\cdots A_m\Vert ^p<1\) for any \(p<\alpha \). Indeed, if \(\varrho (p)\ge 0\) for some \(p\in (0,\alpha )\), this would contradict the convexity of \(\varrho \), which has roots at \(0\) and \(\alpha \) and is therefore negative on \((0,\alpha )\). Then the \(m\)-skeleton of the chain satisfies the drift condition, \(\mathbf (DC _{p,m})\) follows and Lemma 2.1 yields \(\mathbf (DC _{p})\). Thus we conclude that the results of Sect. 4 are directly applicable to the Markov chain \((X_t)\) with \(f(x)=x\) if it is also aperiodic and irreducible.

5.3 Sample autocovariance function of one-dimensional random affine mapping

Consider the solution \((X_t)\) to the stochastic recurrence equation (5.3) in the case \(d=1\), under irreducibility and aperiodicity. We assume the conditions and use the notation of Sect. 5.2. In addition, we write

$$\begin{aligned} \Pi _{s,t}=\left\{ \begin{array}{ll} A_s\cdots A_t&{}\quad s\le t,\\ 1&{}\quad \text{ otherwise. }\end{array}\right. \end{aligned}$$

In particular, we assume that \((X_t)\) is regularly varying with index \(\alpha >0\) satisfying \(\mathbb{E }|A|^\alpha =1\), \(\mathbb{E }|A|^{\alpha +\varepsilon }<\infty \) and \(\mathbb{E }|B|^{\alpha +\varepsilon }<\infty \) for some \(\varepsilon >0\). For \(h\ge 0\), consider the process \(\Phi _t=(X_t,X_{t-1},\ldots ,X_{t-h})^{\prime }\), \(t\in \mathbb{Z }\), of lagged vectors. They constitute an \(\mathbb{R }^{h+1}\)-valued stationary, aperiodic and irreducible Markov chain. Similar arguments as in Sect. 5.2 show that the chain is regularly varying with index \(\alpha >0\). We consider the following function acting on the Markov chain \((\Phi _t)\):

$$\begin{aligned} \mathbf X _t= f(\Phi _t)= (X_t \Phi _t,X_t,\Phi _t),\quad t\in \mathbb{Z }. \end{aligned}$$

By convention, we will assume that all vectors are understood as column vectors. The sequence \((\Phi _t)\) satisfies the recursion

$$\begin{aligned} \Phi _t=\left( \begin{matrix}A_t&{}0&{} \cdots &{}0&{}0\\ 1&{}0&{}\cdots &{}0&{}0\\ 0&{}1&{}\cdots &{}0&{}0\\ \vdots &{}\vdots &{}\ddots &{}\vdots &{}\vdots \\ 0&{}0&{}\cdots &{}1&{}0 \end{matrix}\right) \Phi _{t-1} +\left( \begin{matrix}B_t\\ 0\\ 0\\ \vdots \\ 0\end{matrix}\right) = \mathbf A _t \Phi _{t-1}+ \mathbf B _t,\quad t\in \mathbb{Z }. \end{aligned}$$

We will show that \((\mathbf X_t)\) satisfies \(\mathbf (RV _{\alpha /2})\) and \(\mathbf (DC _{p,m})\) for \(m\) sufficiently large, \(V(x)=|x|^p\) and \(p<\alpha /2\). The condition \( \mathbb{E }( V(f(\Phi _1))\mid \Phi _0=y)\le c_1V(f(y))+c_2\) for some positive \(c_1,c_2\) follows immediately from the stochastic recurrence equation

$$\begin{aligned} \mathbf X _t=\left( \begin{matrix}A_t\mathbf A _t&{}\,A_t\mathbf B _t&{}\, B_t\mathbf A _t\\ 0_{1,h+1}&{}\,A_t&{}\,0_{1,h+1}\\ 0_{h+1,h+1}&{}\,0_{h+1,1}&{}\,\mathbf A _t \end{matrix}\right) \mathbf X _{t-1} +\left( \begin{matrix}B_t\mathbf B _t\\ B_t\\ \mathbf B _t\end{matrix}\right) =\mathbf C _t \mathbf X _{t-1}+\mathbf D _t,\quad t\in \mathbb{Z }. \end{aligned}$$

Condition \(\mathbb{E }|\mathbf D |^{(\alpha +\varepsilon )/2}<\infty \) follows by the assumptions. From basic algebra, for \(m\ge h\) the matrix products \(\prod _{t=1}^mA_t\mathbf A _t=\Pi _{1,m} \prod _{t=1}^m \mathbf{A}_t\) can be written as \(\Pi _{h,m}\mathbf M _h\), where the \((h+1)\times (h+1)\) matrix \(\mathbf M _h\) has zero entries except for the first column, given by \((\Pi _{1,h-1},\Pi _{2,h-1},\ldots ,1)\). Products of triangular matrices remain triangular and their diagonal is the product of the diagonals. Thus we obtain

$$\begin{aligned} \mathbf C _m\cdots \mathbf C _1=\left( \begin{matrix}\Pi _{h,m}^2I_h&{}\,0_{1,h+1}&{}\, 0_{h+1,h+1}\\ 0_{1,h+1}&{}\,\Pi _{h,m} &{}\,0_{1,h+1}\\ 0_{h+1,h+1}&{}\,0_{h+1,1}&{}\,\Pi _{h,m} I_h \end{matrix}\right) \widetilde{\mathbf{C }}_h= \widetilde{\mathbf{D }}_m \widetilde{\mathbf{C }}_h, \end{aligned}$$

where \(\widetilde{\mathbf{C }}_h\) is an upper triangular block matrix depending only on \((A_t)_{1\le t\le h-1}\). The matrices \(\widetilde{\mathbf{D }}_m\) and \(\widetilde{\mathbf{C }}_h\) are independent and for some \(c>0\) we have

$$\begin{aligned} \mathbb{E }\Vert \widetilde{\mathbf{D }}_m \widetilde{\mathbf{C }}_h\Vert ^p&\le \mathbb{E }\Vert \widetilde{\mathbf{D }}_m\Vert ^p\,\mathbb{E }\Vert \widetilde{\mathbf{C }}_h\Vert ^p\\&\le c\,\mathbb{E }[|A_m|^{2p}\cdots |A_{h}|^{2p}+|A_m|^{p}\cdots |A_{h}|^{p}]\,\mathbb{E }\Vert \widetilde{\mathbf{C }}_h\Vert ^p. \end{aligned}$$

Since \(p<\alpha /2\), \((\mathbb{E }|A_0|^{2p})^m\rightarrow 0\) and \((\mathbb{E }|A_0|^{p})^m\rightarrow 0\) as \(m\rightarrow \infty \). Thus, for \(m\) sufficiently large, \(\mathbb{E }\Vert \widetilde{\mathbf{D }}_m \widetilde{\mathbf{C }}_h\Vert ^p\le c\, ((\mathbb{E }|A_0|^{2p})^m + (\mathbb{E }|A_0|^{p})^m)<1\), i.e. condition \(\mathbf (DC _{p,m})\) holds, and Lemma 2.1 applies provided we can also show \(\mathbf (RV _{\alpha /2})\) for \((\mathbf{X}_t)\). This is our next goal. Since \(X_t\) and \(\Phi _t\) are regularly varying with index \(\alpha \), we deal with a degenerate case where the limit measure of regular variation of \(\mathbf{X}_t\) is concentrated at zero in the last \(h+2\) components. Then, in view of the definition of the cluster index, \(b\) is the same for \((X_t\Phi _t)\) and \((\mathbf{X}_t)\). Therefore we will calculate \(b\) for \((X_t\Phi _t)\). Abusing notation, we use the same symbols for the tail process. As in Sect. 5.2 we obtain by iteration of the stochastic recurrence equation \(X_t=A_tX_{t-1}+B_t\),

$$\begin{aligned} X_{t} \Phi _{t}&= \Pi _{t-h+1,t} (\Pi _{t-h+1,t},\Pi _{t-h+1,t-1},\ldots ,1)^{\prime } X_{t-h}^2 +\mathbf{R}_t^{(1)}\nonumber \\&= \Pi _{1-h,t-h}^2\Pi _{t-h+1,t} (\Pi _{t-h+1,t},\Pi _{t-h+1,t-1},\ldots ,1)^{\prime } X_{-h}^2 +\mathbf{R}_t^{(2)},\nonumber \\&= \Pi _{1-h ,t} (\Pi _{1-h,t},\Pi _{1-h,t-1},\ldots ,\Pi _{1-h,t-h})^{\prime } X_{-h}^2 +\mathbf{R}_t^{(2)}, \end{aligned}$$
(5.7)

where \(\mathbb{E }|\mathbf{R}_t^{(i)}|^{(\alpha +\varepsilon )/2}<\infty \), \(i=1,2\). Then for \(t\ge 0\),

$$\begin{aligned}&(X_0\Phi _0,\ldots ,X_t \Phi _t)\\&\quad =\left( \begin{matrix}\Pi _{1-h,0}^2 &{}\,\Pi _{1-h,1}^2 &{}\,\cdots &{} \Pi _{1-h,t}^2\\ \Pi _{1-h,0} \Pi _{1-h,-1} &{}\,\Pi _{1-h,1}\Pi _{1-h,0} &{}\,\cdots &{}\,\Pi _{1-h,t} \Pi _{1-h,t-1}\\ \vdots &{}\,\vdots &{}\,\ddots &{}\,\vdots \\ \Pi _{1-h,0} &{}\, \Pi _{1-h,1}A_{1-h}&{}\,\cdots &{}\,\Pi _{1-h,t}\Pi _{1-h,t-h} \end{matrix}\right) X_{-h}^2 +\mathbf{Q}_t,\quad t\in \mathbb{Z }. \end{aligned}$$

and \(\mathbb{E }|\mathbf{Q}_t|^{(\alpha +\varepsilon )/2}<\infty \). In the remainder of this section we assume that \(P(A=0)=0\); the general case can be treated as well but leads to tedious case studies. An application of Corollary 3.2 in Basrak and Segers [5] yields that for continuity sets \(M\),

$$\begin{aligned} \mathbb{P }(x^{-1} (X_0\Phi _0,\ldots ,X_t \Phi _t) \in M \mid |X_0\Phi _0|>x) \rightarrow P(|Y_0| \mathbf{E }_t \in M), \end{aligned}$$

where

$$\begin{aligned} \mathbf{E }_t \stackrel{d}{=}\frac{1}{|\Pi _h|\sqrt{\Pi _{h}^2+\Pi _{h-1}^2+\cdots +1}}\left( \begin{matrix} \Pi _{h}\Pi _h &{}\Pi _{h+1}\Pi _{ h+1} &{}\cdots &{}\Pi _{t+h} \Pi _{t+h} \\ \Pi _h\Pi _{h-1} &{}\Pi _{h+1} \Pi _{h} &{}\cdots &{}\Pi _{t+h}\Pi _{ t+h-1}\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ \Pi _h &{}\Pi _{h+1}\Pi _1&{}\cdots &{}\Pi _{t+h}\Pi _t \end{matrix}\right) \end{aligned}$$

and \(\mathbf{E }_t\) is independent of \(|Y_0|\). The right-hand side can be identified with \((\Theta _0,\ldots ,\Theta _t)\).

An application of Theorem 4.1 now yields a stable limit result for the sample autocovariance function of \((X_t)\): Assume that \((a_n)\) satisfies \(n\,\mathbb{P }(|X_0 \Phi _0|>a_n)\sim 1\). In view of (5.7) and Breiman’s result (see [9]) we also have

$$\begin{aligned} n \mathbb{P }(|X_0 \Phi _0|>a_n)\sim n \mathbb{P }( X^2>a_n) \, \mathbb{E }\Big [\Big (|\Pi _h|\sqrt{1+\Pi _1^2+\cdots + \Pi _h^2}\Big )^{\alpha /2}\Big ]. \end{aligned}$$

In view of Kesten’s result [26], \(\mathbb{P }(|X|>x)\sim c_0 x^{-\alpha }\). Therefore we can choose

$$\begin{aligned} a_n= n^{2/\alpha } \Big (c_0\, \mathbb{E }\Big [\Big (|\Pi _h|\sqrt{1+\Pi _1^2+\cdots + \Pi _h^2}\Big )^{\alpha /2}\Big ]\Big )^{2/\alpha }. \end{aligned}$$

Then we have for \(m\ge 0\), \(\alpha \in (2,4)\),

$$\begin{aligned} \left( a_n^{-1}\sum _{t=1}^{n-h} \big (X_tX_{t+h}- \mathbb{E }(X_0 X_h)\big )\right) _{h=0,\ldots ,m}\stackrel{d}{\rightarrow }\xi _{\alpha /2}, \end{aligned}$$

and for \(\alpha \in (0,2)\),

$$\begin{aligned} \left( a_n^{-1}\sum _{t=1}^{n-h} X_tX_{t+h}\right) _{h=0,\ldots ,m}\stackrel{d}{\rightarrow }\xi _{\alpha /2}, \end{aligned}$$

where \(\xi _{\alpha /2}\) is an \(\alpha /2\)-stable \(\mathbb{R }^{m+1}\)-valued random vector whose characteristic function is given in Theorem 4.1 and \((\Theta _t)_{t\ge 0}\) is described above. This result was proved in Basrak et al. [3], Theorem 2.13. In the case \(\alpha \in (2,4)\) the additional condition (2.20) was needed there; that condition is hard to verify and is avoided in the present paper by establishing condition \(\mathbf (DC _{p})\). Moreover, as in [3], a straightforward application of the continuous mapping theorem yields a corresponding limit result for the sample autocorrelation function; we omit the details. The limit laws in Theorem 2.13 of [3] are expressed in terms of the points of the limiting point processes in Theorem 4.2 above, while our limits are expressed in terms of the cluster index \(b\). Neither representation of the \(\alpha /2\)-stable limits is simple, due to the complicated dependence structure.

5.4 Sample mean of a GARCH\((1,1)\) process and its volatility, sample covariance function of a GARCH\((1,1)\) process

We consider a GARCH\((1,1)\) process \(X_t=\sigma _t\,Z_t\), where \((Z_t)\) is an iid sequence of mean zero unit variance random variables and \((\sigma _t)\) is a sequence of non-negative random variables such that \(\sigma _t^2 = \alpha _0 + \sigma _{t-1}^2 (\alpha _1 Z_{t-1}^2+\beta _1)\). Here \(\alpha _0,\alpha _1,\beta _1\) are positive constants. The latter equation is of Kesten type (5.3) with \(A_t=\alpha _1 Z_{t-1}^2+\beta _1\) and \(B_t=\alpha _0\). We assume that the conditions of Sect. 5.2 are satisfied, in particular,

$$\begin{aligned} \mathbb{P }(\sigma >x)\sim c_0 x^{-\alpha },\quad x\rightarrow \infty , \end{aligned}$$

for some constant \(c_0>0\) and tail index \(\alpha >0\), satisfying \(\mathbb{E }(\alpha _1 Z_{0}^2+\beta _1)^{\alpha /2}=1\). We also assume that \(\mathbb{E }|Z|^{\alpha +\epsilon }<\infty \) for some \(\epsilon >0\). Rewriting (5.5), we have

$$\begin{aligned} (\sigma _0^2,\ldots ,\sigma _t^2)=\sigma _0^2 (1,\Pi _1,\ldots ,\Pi _t)+ \mathbf{R}_t, \end{aligned}$$

where \(\mathbb{E }|\mathbf{R}_t|^{(\alpha +\epsilon )/2}<\infty \) and also \(\mathbb{E }|\Pi _i|^{(\alpha +\epsilon )/2}<\infty \) for \(i\ge 1\). An application of Breiman’s multivariate result (see Basrak et al. [2]) shows that for any continuity set \(M\) as \(x\rightarrow \infty \),

$$\begin{aligned} \dfrac{\mathbb{P }(x^{-1}(\sigma _0,\ldots ,\sigma _t)\in M)}{\mathbb{P }(\sigma >x)}&\sim \dfrac{\mathbb{P }(x^{-1}\sigma _0(1,\Pi _1^{0.5},\ldots ,\Pi _t^{0.5})\in M)}{\mathbb{P }(\sigma >x)}\\&\rightarrow \int \limits _0^\infty \alpha y^{-\alpha -1} P(y(1,\Pi _1^{0.5},\ldots ,\Pi _t^{0.5})\in M) \,dy. \end{aligned}$$

This shows that regular variation of \((\sigma _t)\) with index \(\alpha \) follows from the regular variation of \(\sigma \). This property is inherited by the sequence \((X_t)\). We observe that as \(x\rightarrow \infty \),

$$\begin{aligned}&\dfrac{\mathbb{P }(|(X_0,\ldots ,X_t)-\sigma _0 (Z_0,\Pi _1^{0.5} Z_1,\ldots ,\Pi _t^{0.5}Z_t)|>x)}{\mathbb{P }(\sigma >x)}\\&\quad \le \dfrac{\mathbb{P }( |Z_1| R_1^{0.5}+ \cdots + |Z_t| R_t^{0.5}>x)}{\mathbb{P }(\sigma >x)}=o(1). \end{aligned}$$

In the last step we used the independence of \(Z_i\) and \(R_i\) as well as the moment condition on \(Z\). Condition \(\mathbf (RV _\alpha )\) for \((X_t)\) now follows. This property was proved in Mikosch and Stărică [31] under the additional condition that \(Z\) be symmetric. The above calculation shows that this assumption can be avoided.
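The recursion for \(\sigma _t^2\) and the moment equation \(\mathbb{E }(\alpha _1 Z_0^2+\beta _1)^{\alpha /2}=1\) determining the tail index can be illustrated numerically. The following is a minimal sketch, assuming standard normal innovations; the parameter values \(\alpha _0=10^{-6}\), \(\alpha _1=0.1\), \(\beta _1=0.85\) are illustrative choices, not taken from the paper. The tail index is found by bisection on a Monte Carlo estimate of \(\mathbb{E }A^{\alpha /2}\):

```python
import random

def simulate_garch11(n, alpha0, alpha1, beta1, burn_in=1000, seed=0):
    """Simulate X_t = sigma_t Z_t with sigma_t^2 = alpha0 + sigma_{t-1}^2(alpha1 Z_{t-1}^2 + beta1)."""
    rng = random.Random(seed)
    sigma2 = alpha0 / (1.0 - alpha1 - beta1)  # start at E sigma^2 (requires alpha1 + beta1 < 1)
    z = rng.gauss(0.0, 1.0)
    xs = []
    for t in range(burn_in + n):
        sigma2 = alpha0 + sigma2 * (alpha1 * z * z + beta1)
        z = rng.gauss(0.0, 1.0)
        if t >= burn_in:
            xs.append(sigma2 ** 0.5 * z)
    return xs

def tail_index(alpha1, beta1, n_mc=100_000, seed=1, lo=1e-3, hi=20.0, tol=1e-3):
    """Solve E(alpha1 Z^2 + beta1)^{alpha/2} = 1 for alpha by bisection on a
    Monte Carlo estimate of the expectation (standard normal Z)."""
    rng = random.Random(seed)
    a = [alpha1 * rng.gauss(0.0, 1.0) ** 2 + beta1 for _ in range(n_mc)]

    def h(alpha):  # estimate of E A^{alpha/2} - 1; negative below the root, positive above
        return sum(ai ** (alpha / 2.0) for ai in a) / n_mc - 1.0

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if h(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

x = simulate_garch11(10_000, alpha0=1e-6, alpha1=0.1, beta1=0.85)
alpha_hat = tail_index(0.1, 0.85)
```

For these parameters \(\mathbb{E }A=0.95<1\), so a stationary solution exists, and the estimated tail index comes out near \(\alpha \approx 9\): the marginal law has finite variance while sufficiently high moments are infinite.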

Next we consider the 2-dimensional Markov chain

$$\begin{aligned} \Phi _t=(\sigma _t,X_t)^{\prime }=\sigma _t (1,Z_t)^{\prime },\quad t\in \mathbb{Z }. \end{aligned}$$

A calculation similar to the one above shows that this Markov chain satisfies \(\mathbf (RV _\alpha )\). For \(h\ge 0\) and any continuity set \(N\), observing that \(|\Phi _0|=\sigma _0\sqrt{Z_0^2+1}\), we have

$$\begin{aligned}&\mathbb{P }( x^{-1}(\Phi _0,\ldots ,\Phi _h)\in N\mid |\Phi _0|>x)\\&\sim \mathbb{P }\big ( x^{-1} \sigma _0\big ((1,Z_0)^{\prime }, \Pi _1^{0.5}(1,Z_1)^{\prime },\ldots ,\Pi _{h}^{0.5} (1,Z_h)^{\prime }\big ) \in N\mid |\Phi _0|>x\big )\\&\stackrel{w}{\rightarrow }\mathbb{P }( |Y_0| \big ((1,Z_0)^{\prime },\Pi _1^{0.5}(1,Z_1)^{\prime },\ldots ,\Pi _{h}^{0.5} (1,Z_h)^{\prime }\big )/(Z_0^2+1)^{0.5}\in N). \end{aligned}$$

Identifying the limiting vector with \(|Y_0|(\Theta _0,\ldots , \Theta _h)^{\prime }\), we have for any \(\theta \in \mathbb{S }\),

$$\begin{aligned} b(\theta )&=\mathbb{E }\Big [\Big \{\Big (\theta ^{\prime }(1,Z_0)^{\prime }+ \sum _{t\ge 1}\Pi _{t}^{0.5} \theta ^{\prime } (1,Z_t)^{\prime }\Big )_+^{\alpha }\\&\quad -\Big ( \sum _{t\ge 1}\Pi _{t}^{0.5} \theta ^{\prime } (1,Z_t)^{\prime }\Big )_+^{\alpha }\Big \}\Big /(Z_0^2+1)^{\alpha /2}\Big ]. \end{aligned}$$
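The series appearing in this expression converges almost surely, so \(b(\theta )\) can be approximated by truncating the series and averaging over simulated paths. A Monte Carlo sketch, assuming standard normal innovations; the truncation horizon and the parameter values are illustrative choices, not from the paper:

```python
import random

def cluster_index_b(theta, alpha, alpha1, beta1, n_mc=20_000, horizon=50, seed=2):
    """Monte Carlo approximation of b(theta) for the chain Phi_t = sigma_t (1, Z_t)':
    average of {(theta'(1,Z_0)' + T)_+^alpha - T_+^alpha} / (Z_0^2 + 1)^{alpha/2},
    where T = sum_{t=1}^{horizon} Pi_t^{1/2} theta'(1, Z_t)' truncates the series."""
    rng = random.Random(seed)
    th1, th2 = theta
    acc = 0.0
    for _ in range(n_mc):
        z_prev = rng.gauss(0.0, 1.0)   # Z_0
        z0 = z_prev
        pi, tail = 1.0, 0.0
        for _ in range(horizon):
            pi *= alpha1 * z_prev * z_prev + beta1   # Pi_t = A_1...A_t, A_t = alpha1 Z_{t-1}^2 + beta1
            z_prev = rng.gauss(0.0, 1.0)             # Z_t
            tail += pi ** 0.5 * (th1 + th2 * z_prev)
        head = th1 + th2 * z0 + tail
        acc += (max(head, 0.0) ** alpha - max(tail, 0.0) ** alpha) / (z0 * z0 + 1.0) ** (alpha / 2.0)
    return acc / n_mc
```

For \(\theta =(1,0)^{\prime }\) every summand is strictly positive, so the estimate must come out strictly positive, which is a useful sanity check on an implementation.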

The Markov chain \((\Phi _t)\) is aperiodic and irreducible under classical conditions on the density of \(Z\); see e.g. [31] for details. The condition \(\mathbf (DC _{p})\) for \(p<\alpha \) follows by an application of Lemma 2.1 with \(V(x)=|x|^p\). We recall that for \(m\ge 2\), \(\sigma _m^2=\Pi _{2,m}(\alpha _0+\alpha _1 X_0^2+\beta _1 \sigma _0^2)+ \widetilde{R}_m\), where \(\mathbb{E }|\widetilde{R}_m|^{(\alpha +\epsilon )/2}<\infty \) for some \(\epsilon >0\) and \(\widetilde{R}_m\) is independent of \(Z_m\). We have for \(p<\alpha \) and some \(c>0\),

$$\begin{aligned} \mathbb{E }[|\Phi _m|^p\mid \Phi _0=\mathbf{y}]&= \mathbb{E }|\Pi _{2,m}(\alpha _0+\alpha _1 y_1^2+\beta _1 y_2^2)+\widetilde{R}_m|^{p/2}\mathbb{E }(Z^2+1)^{p/2}\nonumber \\&\le |\mathbf{y}|^p \mathbb{E }|\Pi _{2,m} |^{p/2}\mathbb{E }(Z^2+1)^{p/2} \max (\alpha _1^p,\beta _1^p)+c. \end{aligned}$$
(5.8)

For \(m=1\), we find constants \(c_1,c_2>0\) such that \(\mathbb{E }(V(\Phi _1)|\Phi _0=\mathbf{y})\le c_1 V(\mathbf{y})+c_2\). Since \(\mathbb{E }A^{p/2}<1\) for \(p<\alpha \), \(\mathbf (DC _{p,m})\) holds for sufficiently large \(m\) in view of (5.8). An application of Lemma 2.1 concludes the proof. Thus we may apply the stable limit theory of Theorem 4.1 with \(f(x)=x\) to \((\Phi _t)\) for \(\alpha <2\) and the limit law is determined by the cluster index \(b\) above.
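The drift bound (5.8) can also be probed by simulation: start the chain at a large state and estimate the ratio \(\mathbb{E }[|\Phi _m|^p\mid \Phi _0=\mathbf y]/|\mathbf y|^p\), which should fall below one for suitable \(m\) and \(p<\alpha \). A hedged sketch with normal innovations and illustrative parameters (the normalization by \(\sigma _0^p\) is used as a proxy for \(|\mathbf y|^p\)):

```python
import random

def drift_ratio(y_sigma, m, p, alpha0, alpha1, beta1, n_mc=20_000, seed=4):
    """Monte Carlo check of (DC_{p,m}) for Phi_t = (sigma_t, X_t)': estimate
    E[|Phi_m|^p | sigma_0 = y_sigma] / y_sigma^p, using |Phi_m| = sigma_m sqrt(1 + Z_m^2)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_mc):
        sigma2 = y_sigma ** 2
        z = rng.gauss(0.0, 1.0)
        for _ in range(m):
            sigma2 = alpha0 + sigma2 * (alpha1 * z * z + beta1)
            z = rng.gauss(0.0, 1.0)
        phi_norm = (sigma2 * (1.0 + z * z)) ** 0.5   # |Phi_m|; final z = Z_m is independent of sigma_m
        acc += phi_norm ** p
    return acc / (n_mc * y_sigma ** p)
```

With \(\alpha _1=0.1\), \(\beta _1=0.85\) (tail index near 9), \(p=2<\alpha \) and \(m=20\), the ratio is roughly \(2\,(\mathbb{E }A)^{20}\approx 0.72<1\), in line with the drift condition.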

For \(h\ge 0\) consider the Markov chain, recycling the notation \(\Phi _t\),

$$\begin{aligned} \Phi _t&= (X_t,\sigma _t,\ldots ,X_{t-h},\sigma _{t-h}),\quad t\in \mathbb{Z }. \end{aligned}$$
(5.9)

We also write

$$\begin{aligned} \Phi _t^2&= (X_t^2,\sigma _t^2,\ldots ,X_{t-h}^2,\sigma _{t-h}^2),\quad t\in \mathbb{Z }, \end{aligned}$$

and introduce the function \(f\) acting on \((\Phi _t)\) given by

$$\begin{aligned} \mathbf{Y}_t=f(\Phi _t)= \big (X_t (X_{t-1},\ldots ,X_{t-h}), \Phi _t^2, \Phi _t\big ),\quad t\in \mathbb{Z }. \end{aligned}$$

We intend to show \(\mathbf (DC _{p,m})\) for \(p<\alpha /2\) and some large \(m\). We restrict ourselves to the case \(h=1\); the general case is analogous but requires more bookkeeping. We observe that for a suitable constant \(c>0\),

$$\begin{aligned} |f(\Phi _t)|^p&= |X_t^2 X_{t-1}^2+ X_{t}^4+X_{t-1}^4+\sigma _t^4+ \sigma _{t-1}^4+ X_t^2+X_{t-1}^2+\sigma _t^2+ \sigma _{t-1}^2 |^{p/2}\\&\le c \big ((1+Z_t^4) (1+X_{t-1}^4+\sigma _{t-1}^4) +Z_t^2 (1+\sigma _{t-1}^2+X_{t-1}^2) (1+X_{t-1}^2)\\&+ X_{t-1}^2+1+\sigma _{t-1}^2\big )^{p/2} \end{aligned}$$

Then for suitable constants \(c_1,c_2>0\),

$$\begin{aligned} \mathbb{E }[|f(\Phi _1)|^p\mid \Phi _{0}=\mathbf{y}]&\le c\big (1+ |y_1^2|^p+ |y_2^2|^p+ |y_1|^p+ |y_2|^p\big )\\&\le c_1 |f(\mathbf{y})|^p+c_2. \end{aligned}$$

By a similar argument, for sufficiently large \(m\ge 1\) and suitable constants \(c>0\), recalling that \(\sigma _t^2=\Pi _t\sigma _0^2+R_t\), where \(\sigma _0^2\) is independent of \((\Pi _t,R_t)\), we obtain

$$\begin{aligned} \mathbb{E }[|f(\Phi _m)|^p\mid \Phi _{0}=\mathbf{y}]&\le c\big (1+\mathbb{E }[|\sigma _{m-1}^4|^{p/2}+|\sigma _{m-1}^2|^{p/2}\mid \Phi _{0}=\mathbf{y}]\big )\\&\le c\big (1+ \mathbb{E }[\Pi _m^{2p}]\,|y_2|^{2p} + \mathbb{E }[\Pi _m^{p}]\, |y_2|^{p}\big )\\&\le c\,\big (\mathbb{E }[\Pi _m^{2p}] + \mathbb{E }[\Pi _m^{p}]\big )\,|f(\mathbf{y})|^p+c\\&\le \beta |f(\mathbf{y})|^p+c, \end{aligned}$$

for some \(\beta \in (0,1)\) and sufficiently large \(m\ge 1\). Here we used the fact that \(\mathbb{E }A^{2p}<1\) for \(p<\alpha /2\). Now we can apply Lemma 2.1 to show \(\mathbf (DC _{p})\) for \(p<\alpha /2\).

It remains to show \(\mathbf (RV _{\alpha /2})\) for \((\mathbf{Y}_t)\) defined in (5.9). The \(\Phi _t\)-component of \(\mathbf{Y}_t\) is regularly varying with index \(\alpha \). Therefore, without loss of generality and abusing notation, we will consider the sequence

$$\begin{aligned} \mathbf{Y}_t=f(\Phi _t)= \big (X_t (X_{t-1},\ldots ,X_{t-h}), \Phi _t^2\big ),\quad t\in \mathbb{Z }. \end{aligned}$$

Similar arguments as in the first part of this subsection and as in Sect. 5.2 show for \(t\ge 0\) that

$$\begin{aligned}&\mathbf{Y}_t= \mathbf{R}_t^{(1)}+\sigma _{t-h}^2 \big (Z_t \Pi _{t-h+1,t}^{0.5}(Z_{t-1}\Pi _{t-h+1,t-1}^{0.5},\ldots , Z_{t-h} ), (\Pi _{t-h+1,t}(Z_t^2,1),\ldots , (Z_{t-h}^2,1)) \big )^{\prime }\\&= \mathbf{R}_t^{(2)}+\sigma _{-h}^2 \Pi _{1-h,t-h} \big (Z_t \Pi _{t-h+1,t}^{0.5}(Z_{t-1}\Pi _{t-h+1,t-1}^{0.5},\ldots , Z_{t-h} ),\\&\quad (\Pi _{t-h+1,t}(Z_t^2,1),\ldots , (Z_{t-h}^2,1)) \big )^{\prime }, \end{aligned}$$

where \(\mathbb{E }| \mathbf{R}_t^{(i)}|^{(\alpha +\varepsilon )/2}<\infty \), \(i=1,2\). Therefore

$$\begin{aligned} (\mathbf{Y}_0,\ldots ,\mathbf{Y}_t)^{\prime }=\widetilde{\mathbf{D}}_t\sigma _{-h}^2 + \widetilde{\mathbf{Q}}_t, \end{aligned}$$

where \(\mathbb{E }| \widetilde{\mathbf{Q}}_t|^{(\alpha +\varepsilon )/2}<\infty \) and \(\mathbb{E }| \widetilde{\mathbf{D}}_t|^{(\alpha +\varepsilon )/2}<\infty \) for some \(\varepsilon >0\) and

$$\begin{aligned} \widetilde{\mathbf{D}}_t= \left( \begin{matrix} Z_0Z_{-1}\Pi _{1-h,0}^{0.5}\Pi _{1-h,-1}^{0.5}&{} Z_1Z_{0}A_{1-h}\Pi _{2-h,1}^{0.5}\Pi _{2-h,0}^{0.5}&{} \cdots &{} Z_tZ_{t-1}\Pi _{1-h,t-h}\Pi _{t-h+1,t}^{0.5}\Pi _{t-h+1,t-1}^{0.5}\\ Z_0Z_{-2}\Pi _{1-h,0}^{0.5}\Pi _{1-h,-2}^{0.5}&{} Z_1Z_{-1}A_{1-h}\Pi _{2-h,1}^{0.5}\Pi _{2-h,-1}^{0.5}&{} \cdots &{} Z_tZ_{t-2}\Pi _{1-h,t-h}\Pi _{t-h+1,t}^{0.5}\Pi _{t-h+1,t-2}^{0.5}\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ Z_0Z_{-h}\Pi _{1-h,0}^{0.5}&{} Z_1Z_{1-h}A_{1-h}\Pi _{2-h,1}^{0.5}&{} \cdots &{} Z_tZ_{t-h}\Pi _{1-h,t-h}\Pi _{t-h+1,t}^{0.5} \\ \Pi _{1-h,0}(Z_0^2,1)&{}A_{1-h}\Pi _{2-h,1}(Z_1^2,1)&{}\cdots &{}\Pi _{1-h,t-h}\Pi _{t-h+1,t}(Z_t^2,1)\\ \Pi _{1-h,-1}(Z_{-1}^2,1)&{}A_{1-h}\Pi _{2-h,0}(Z_0^2,1)&{}\cdots &{}\Pi _{1-h,t-h}\Pi _{t-h+1,t-1}(Z_{t-1}^2,1)\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ (Z_{-h}^2,1)&{}A_{1-h} (Z_{1-h}^2,1)&{}\cdots &{}\Pi _{1-h,t-h} (Z_{t-h}^2,1) \end{matrix} \right) . \end{aligned}$$

Notice that \(\sigma _{-h}^2\) and \(\widetilde{\mathbf{D}}_t\) are independent and that \(\sigma _{-h}^2\) is regularly varying with index \(\alpha /2\). Then \(\mathbf (RV _{\alpha /2})\) for \((\mathbf{Y}_0,\ldots ,\mathbf{Y}_t)\) follows by an application of the multivariate Breiman result; see [2]. We omit the calculation of the cluster index; it is similar to the calculation in Sect. 5.2.

Now we can apply Theorem 4.1 to prove limit theory with \(\alpha /2\)-stable limits, \(\alpha <4\), for the sample autocovariance function of the GARCH\((1,1)\) process. The corresponding theory using point process techniques is given in [16, 31], where the limit theory for the sequences \((|X_t|)\) and \((X_t^2)\) was also provided. The same results can be obtained from Theorem 4.1 by calculating the corresponding cluster indices. Applied to the squares \((X_t^2)\) we obtain in particular for \(\alpha \in (2,4)\),

$$\begin{aligned} na_n^{-1}\left( \frac{1}{n}\sum _{t=1}^{n-h} X_t^2X_{t+h}^2-\left( \frac{1}{n}\sum _{t=1}^{n} X_t^2\right) ^2\right) \stackrel{d}{\rightarrow }\xi _{\alpha /4}, \end{aligned}$$
(5.10)

where \(\xi _{\alpha /4}\) is an \(\alpha /4\)-stable random variable whose characteristic function is given in Theorem 4.1 and \((\Theta _t)_{t\ge 0}=(cZ_t^2Z_{t+h}^2\Pi _t\Pi _{t+h})_{t\ge 0}\) for some \(c>0\). In particular, the \(\Theta _t\) are non-negative and thus \(b_-=0\). Then \(\xi _{\alpha /4}\) is supported on \([-(\mathbb{E }X_0^2)^2,\infty )\). We omit further details. Relation (5.10) supports the idea of spurious long-range dependence effects in real-life log-return data, which are often observed to have infinite fourth moment; see [32] for a discussion.
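The statistic on the left-hand side of (5.10), before normalization, is the lag-\(h\) sample autocovariance of the squared sequence; a minimal sketch:

```python
def autocov_of_squares(x, h):
    """Sample statistic inside (5.10), before normalization:
       (1/n) sum_{t=1}^{n-h} X_t^2 X_{t+h}^2  -  ((1/n) sum_t X_t^2)^2 ."""
    n = len(x)
    cross = sum(x[t] ** 2 * x[t + h] ** 2 for t in range(n - h)) / n
    mean_sq = sum(v * v for v in x) / n
    return cross - mean_sq ** 2
```

Applied to simulated GARCH\((1,1)\) data with \(\alpha \in (2,4)\), this statistic decays very slowly in \(n\), which is the source of the spurious long-range dependence effect discussed above.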

6 Proof of Theorem 4.1

6.1 Proof of part (1)

We will use the Cramér-Wold device to show that \((a_n^{-1} \theta ^{\prime }S_n)\) has a (possibly degenerate) \(\alpha \)-stable limit \(\xi _\alpha (\theta )\) for every \(\theta \in \mathbb{S }^{d-1}\). We will apply Theorem 1 in [1] which we recall for convenience:

Theorem 6.1

Assume that \((G_t)\) is a strictly stationary process of random variables, satisfying the following conditions.

  1. The regular variation condition \(\mathbf (RV _\alpha )\) holds for some \(\alpha \in (0,2)\).

  2. The mixing condition (MX): There exists \(m=m_n\rightarrow \infty \) such that \(k_n=[n/m_n]\rightarrow \infty \) and

    $$\begin{aligned} \mathbb{E }\mathrm{e}\,^{it b_n^{-1}S_n(G)}- \Big (\mathbb{E }\mathrm{e}\,^{it b_n^{-1}S_m(G)}\Big )^{k_n}\rightarrow 0,\quad n\rightarrow \infty ,\quad t\in \mathbb{R }, \end{aligned}$$

    where \(S_n(G)=G_1+\cdots +G_n\) and \((b_n)\) is chosen such that \(n\,\mathbb{P }(|G_1|>b_n)\sim 1\).

  3. The anti-clustering condition (AC):

    $$\begin{aligned} \lim _{\ell \rightarrow \infty }\limsup _{n\rightarrow \infty } T_{\ell m}=0, \quad \text{ where }\quad T_{\ell m}= \dfrac{n}{m}\sum _{j=\ell +1}^{m}\mathbb{E }\Big [\, \overline{\big | x\,b_n^{-1}(S_{j}(G)-S_\ell (G))\big |}\;\overline{| x\,b_n^{-1} G_{1}|}\,\Big ],\quad x\in \mathbb{R }, \end{aligned}$$

    holds, where \(m=m_n\) is the same as in (MX) and \(\overline{x} = (x\wedge 2)\vee (-2)\).

  4. The limits

    $$\begin{aligned} c_+=\lim _{\ell \rightarrow \infty }\big (b_+(\ell +1)-b_+(\ell )\big )\quad \text{ and }\quad c_-=\lim _{\ell \rightarrow \infty }\big (b_-(\ell +1)-b_-(\ell )\big ) \end{aligned}$$

    exist. Here \(b_+(\ell ),b_-(\ell )\) are the tail balance parameters given by \(b_+(\ell )=\lim _{n\rightarrow \infty } n\,\mathbb{P }(S_\ell (G)>b_n)\) and \(b_-(\ell )=\lim _{n\rightarrow \infty } n\,\mathbb{P }(S_\ell (G)\le -b_n)\).

  5. For \(\alpha >1\) assume \(\mathbb{E }G_1=0\), and for \(\alpha =1\) assume that \(G_1\) is symmetric.

Then \(c_+\) and \(c_-\) are non-negative and \((b_n^{-1}S_n(G))\) converges in distribution to an \(\alpha \)-stable random variable (possibly zero) with characteristic function \(\psi _{\alpha }(x) = \exp ( -|x|^{\alpha } \chi _{\alpha }(x, c_+, c_-))\), where for \(\alpha \ne 1\) the function \(\chi _\alpha (x,c_+, c_-), x \in \mathbb{R }\), is given by the formula

$$\begin{aligned} \dfrac{\Gamma (2-\alpha )}{ 1-\alpha }\,\Big ((c_++c_-)\,\cos (\pi \alpha /2)-i\,\mathrm{sign}(x) (c_+-c_-)\, \sin (\pi \, \alpha /2)\Big ), \end{aligned}$$

while for \(\alpha = 1\) one has

$$\begin{aligned} \chi _1(x,c_+, c_-) = 0.5\,\pi (c_++c_-) +i\,\mathrm{sign}(x)\,(c_+-c_-) \log |x|,\quad x\in \mathbb{R }. \end{aligned}$$
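The limit characteristic function \(\psi _\alpha \) of Theorem 6.1 is fully explicit; the following sketch evaluates both the \(\alpha \ne 1\) and the \(\alpha =1\) branch, as a direct transcription of the formulas above:

```python
import cmath
import math

def psi_alpha(x, alpha, c_plus, c_minus):
    """Evaluate psi_alpha(x) = exp(-|x|^alpha chi_alpha(x, c+, c-)) from Theorem 6.1."""
    if x == 0:
        return 1.0 + 0j
    sgn = 1.0 if x > 0 else -1.0
    if alpha != 1:
        chi = (math.gamma(2 - alpha) / (1 - alpha)) * (
            (c_plus + c_minus) * math.cos(math.pi * alpha / 2)
            - 1j * sgn * (c_plus - c_minus) * math.sin(math.pi * alpha / 2)
        )
    else:
        chi = 0.5 * math.pi * (c_plus + c_minus) + 1j * sgn * (c_plus - c_minus) * math.log(abs(x))
    return cmath.exp(-abs(x) ** alpha * chi)
```

In the symmetric case \(c_+=c_-\) with \(\alpha \ne 1\) the imaginary part of \(\chi _\alpha \) vanishes and \(\psi _\alpha \) is real-valued, as it should be for a symmetric stable law.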

We will verify the conditions of this theorem for the sequence \(G_i=\theta ^{\prime }X_i\) for fixed \(\theta \in \mathbb{S }^{d-1}\).

6.1.1 The regular variation condition \(\mathbf (RV _\alpha )\) for \((G_t)\)

This condition is straightforward from the definition of \(\mathbf (RV _\alpha )\) for \((X_t)\) and the fact that the function \(f(x)=\theta ^{\prime } x\), \(x\in \mathbb{R }^d\), is continuous and homogeneous.

6.1.2 The anti-clustering condition (AC)

Without loss of generality we assume that \(\mathbf (DC _{p})\) holds for \(V(y)=|y|^{p}\). We also assume \(p\le 1\); for \(p>1\) an application of Jensen’s inequality yields \(\mathbf (DC _{p^{\prime }})\) for any \(p^{\prime }<p\). Since \(p\le 1\) there exists \(c>0\) such that \(y\le c\,y^ p\) for \(y\in [0,2]\). Then one has

$$\begin{aligned} T_{\ell m}&= \dfrac{n}{m}\sum _{j=\ell +1}^{m}\mathbb{E }\Big [\, \overline{\big | x\,b_n^{-1}(S_{j}(G)-S_ \ell (G))\big |}\;\overline{| x\,b_n^{-1} G_{1}|}\,\Big ]\\&\le c \frac{n}{m} \,\sum _{j=\ell +1}^{m}\mathbb{E }\Big [\,\overline{\big | x\,b_n^{-1}(S_{j}(G)-S_\ell (G))\big |^p}\;\overline{ \big | x\,b_n^{-1} G_{1}\big |}\,\Big ]. \end{aligned}$$

Using \(\mathbf (DC _{p})\), a recursive argument yields

$$\begin{aligned} \mathbb{E }( |G_k|^{p} \mid \Phi _1=y)\le \beta ^{k-1} |f(y)|^{p}+b\,\sum _{j=1}^{k-1}\beta ^{j},\quad k\ge 2, \end{aligned}$$
(6.1)

where \(\beta ,b\) appear in \(\mathbf (DC _{p})\). Multiple use of this argument and the subadditivity of the function \(z\mapsto \overline{z}\) on \((0,\infty )\) yield for \(\ell <j\le m\),

$$\begin{aligned} \mathbb{E }\Big [\overline{\big | xb_n^{-1}(S_j(G)-S_{\ell }(G)) \big |^{p}}\mid \Phi _1\Big ]\le c\,\overline{|x|^{p}b_n^{-p}\sum _{i=\ell +1}^m\beta ^i|X_1|^{p}}+ c b_n^{-p} \,m. \end{aligned}$$

Conditioning on \(\Phi _1\), the latter inequality finally yields

$$\begin{aligned} \mathbb{E }T_{\ell m}\!\le \! c \dfrac{n}{m}\sum _{j=\ell +1}^{m}\mathbb{E }\Big [\overline{|x|^{p}b_n^{-p} \sum _{i=1}^j\beta ^i|X_1|^p}\;\overline{ x\,b_n^{-1} |X_{1}|}\Big ]\!+\! c\, \dfrac{ \,m\,n}{b_n^p} \;\mathbb{E }\overline{|xb_n^{-1}X_1|} \!=\!I_1+I_2. \end{aligned}$$

We have \(I_2 \le c b_n^{-p-1} n\, m=o(1)\) if we choose \(m=m_n=\log ^2 n\). It remains to prove that \(I_1\) is asymptotically negligible. An application of Karamata’s theorem yields the bound

$$\begin{aligned}&I_1\le c\,\dfrac{n}{m}\sum _{j=\ell +1}^{m}\mathbb{P }\left( |X_1|\ge c b_n\left( \sum _{i=\ell }^j\beta ^i\right) ^{-1/(p+1)}\right) \le \frac{c}{m} \sum _{j=\ell +1}^{m}\left( \sum _{i=\ell }^j\beta ^i\right) ^{\alpha /(p+1)}\\&\qquad \,\,\le c \beta ^{\ell \alpha /(p+1)}. \end{aligned}$$

The right-hand side vanishes as \(\ell \rightarrow \infty \). Collecting the above bounds, condition (AC) follows.

6.1.3 The mixing condition (MX)

Here we give a significant improvement on Lemma 3 in [1]; in the latter paper it is assumed that \((G_t)\) is strongly mixing. The next result avoids this condition.

Lemma 6.2

Consider a strictly stationary real-valued sequence \((G_t)\) satisfying the conditions \(\mathbf (RV _\alpha )\) for some \(\alpha \in (0,2)\) and (AC). Then (MX) can be replaced by

Condition (MX’): There exists a sequence \((r_n)\) such that \(r_n=o(m_n)\) and

$$\begin{aligned} |\varphi _n^{(\ell )}(t)-\varphi _{n,m-\ell }^k(t)|\rightarrow 0,\qquad t\in \mathbb{R }, \end{aligned}$$

holds for \(\ell =m-r_n\) and \(\ell =r_n\), where

$$\begin{aligned} \varphi _n^{(\ell )}(t)&= \mathbb{E }\left[ \exp \left( itb_n^{-1}\sum _{i=1}^{k_n}\sum _{t=(i-1)m+1}^{im-\ell }G_t\right) \right] \!,\\ \varphi _{n,j}(t)&= \mathbb{E }\left[ \exp \left( itb_n^{-1}\sum _{t=1}^{j}G_t\right) \right] \!,\quad j\ge 1,\quad \varphi _n(t)=\varphi _{n,n}(t),\quad t\in \mathbb{R }. \end{aligned}$$

Proof

Notice that condition (MX) can be written in the form \(\varphi _n(t)-\varphi _{n,m}^k(t)\rightarrow 0\) as \(n\rightarrow \infty \). We have

$$\begin{aligned} \varphi _n(t)-\varphi _{n,m}^k(t)&= [\varphi _n(t)\!-\!\varphi _n^{(r)}(t)] \!+\![\varphi _n^{(r)}(t)\!-\!\varphi _{n,m-r}^k(t)]+ [\varphi _{n,m-r}^k(t)-\varphi _{n,m}^k(t)]\\&= P_1+P_2+P_3. \end{aligned}$$

In view of (MX'), \(P_2\rightarrow 0\). Next we deal with \(P_1\). Assume for simplicity that \(k_n=n/m\) is an integer. We use the classical Bernstein blocking technique, writing

$$\begin{aligned} b_n^{-1}S_n=b_n^{-1}\sum _{i=1}^{k_n}\sum _{t=(i-1)m+1}^{im-r}G_t+b_n^{-1}\sum _{i=1}^{k_n}\sum _{t=im-r+1}^{im}G_t=I_1+I_2. \end{aligned}$$

We will show that \(\mathbb{E }\exp (itI_2)\rightarrow 1\). Condition (MX') implies that \(|\mathbb{E }\exp (itI_2)-\varphi _{n,r}^k(t)|\rightarrow 0\) since \(\ell =m-r\ge r\) and \(\ell /n\rightarrow 0\). Moreover, Lemma 3.5 in [38] yields that \(\varphi _{n,r}^k(t)\rightarrow 1\) if and only if \(k(\varphi _{n,r}(t)-1)\rightarrow 0\). Assuming \(\mathbf (RV _\alpha )\) and (AC), one can follow the proof of Lemma 1 in [1]. We have

$$\begin{aligned} \lim _{q\rightarrow \infty }\limsup _{n\rightarrow \infty }|k\,(\varphi _{n,r}(t)-1)- k\,r\,(\varphi _{n,q}(t)-\varphi _{n,q-1}(t))|=0,\quad t\in \mathbb{R }. \end{aligned}$$

Under \(\mathbf (RV _\alpha )\), an application of Theorem 3 in Section XVII.5 of Feller gives that \(n(\varphi _{n,q}(t)-1)\) converges for all \(q\). We deduce that \(n(\varphi _{n,q}(t)-\varphi _{n,q-1}(t))\) converges too. As \(kr/n\sim r/m\rightarrow 0\) we conclude that \(kr(\varphi _{n,q}(t)-\varphi _{n,q-1}(t))\rightarrow 0\) and then \(k_n(\varphi _{n,r}(t)-1)\rightarrow 0\) which gives the desired result \(\mathbb{E }\exp (itI_2)\rightarrow 1\), equivalently, \(I_2\stackrel{P}{\rightarrow }0\). Since

$$\begin{aligned} |P_1|= \Big |\mathbb{E }\Big [\exp (it I_1)\big ( 1-\exp (it I_2)\big )\Big ]\Big |\le \mathbb{E }\Big |1-\exp (it I_2) \Big |, \end{aligned}$$

dominated convergence yields \(P_1\rightarrow 0\). Finally,

$$\begin{aligned} |P_3|\le k\,\Big |(\varphi _{n,m-r}(t)-1)- (\varphi _{n,m}(t)-1)\Big |, \end{aligned}$$

and the same arguments as above show that \(P_3\rightarrow 0\). \(\square \)

We finish the proof of (MX) for the sequence \((G_t)\). In view of \(\mathbf (DC _{p})\) and the irreducibility and aperiodicity of the chain, \((X_t)\), hence \((G_t)\), is \(\beta \)-mixing, hence strongly mixing, with exponential rate \((\alpha _h)\). We will show (MX) by an application of Lemma 6.2. A standard telescoping sum argument shows that

$$\begin{aligned} |\varphi _n^{(\ell )}(t)-\varphi _{n,m-\ell }^k(t)|&\le c\,k_n \alpha _\ell . \end{aligned}$$

Since we choose \(m=\log ^2 n\) in the proof of (AC), \(k_n \alpha _\ell \le (n/\log ^2 n) \exp (-c \ell _n)\). Thus, choosing \(\ell _n= C\log n\) for some sufficiently large constant \(C>0\) we have \(\ell _n=o(m_n)\), \(k_n \alpha _\ell =o(1)\) and we can also find \(r_n=o(\ell _n)\). This proves (MX’), hence (MX).

6.1.4 Condition \(\mathbf{(TB)}\)

Note that \(\{|\theta ^{\prime } X| >b_n\}\subset \{|X|>b_n\} \). Then

$$\begin{aligned} b_+(\ell )&= \lim _{x\rightarrow \infty } \dfrac{\mathbb{P }(S_\ell (G)>x)}{\mathbb{P }(|\theta ^{\prime }X|>x)}\\&= \lim _{x\rightarrow \infty } \dfrac{\mathbb{P }(\theta ^{\prime }S_\ell >x)}{\mathbb{P }(|X|>x)}\lim _{x\rightarrow \infty }\dfrac{\mathbb{P }(|X|>x)}{\mathbb{P }(|\theta ^{\prime }X|>x)}\\&= b_\ell (\theta ) \,\lim _{x\rightarrow \infty } (\mathbb{P }(|\theta ^{\prime }X|>x\mid |X|>x))^{-1}\\&= b_\ell (\theta )\, (\mathbb{P }(|Y_0| |\theta ^{\prime }\Theta _0|>1))^{-1}\\&= b_\ell (\theta )\, (\mathbb{E }(|\theta ^{\prime }\Theta _0|^\alpha ))^{-1} . \end{aligned}$$

Correspondingly, \(b_-(\ell )=b_\ell (-\theta )(\mathbb{E }(|\theta ^{\prime }\Theta _0|^\alpha ))^{-1}\). Here we assumed that \(\mathbb{E }(|\theta ^{\prime }\Theta _0|^\alpha )\ne 0\). Otherwise, \(b_+(\ell )=b_-(\ell )=0\).

Thus we may apply Theorem 6.1 to conclude that \( b_n^{-1} \theta ^{\prime }S_n \stackrel{d}{\rightarrow }\xi _\alpha (\theta ) \) for an \(\alpha \)-stable random variable \(\xi _\alpha (\theta )\) with characteristic function \(\psi _\alpha (x,\theta )\) given by

$$\begin{aligned}&\mathbb{E }(|\theta ^{\prime }\Theta _0|^\alpha )\,\log \psi _\alpha (x,\theta )\\&\quad \!=\! -|x|^{\alpha } \dfrac{\Gamma (2-\alpha )}{ 1-\alpha }\,\Big ((b(\theta )+ b(-\theta ))\,\cos (\pi \alpha /2)-i\,\mathrm{sign}(x) (b(\theta )- b(-\theta ))\, \sin (\pi \, \alpha /2)\Big ),\quad x\in \mathbb{R }. \end{aligned}$$

The factor \(\mathbb{E }(|\theta ^{\prime }\Theta _0|^\alpha )\) on the left-hand side is due to the normalization \((b_n)\) instead of \((a_n)\). Replacing \((b_n)\) by \((a_n)\), we have for any \(v\in \mathbb{R }^d\) that

$$\begin{aligned}&\mathbb{E }\mathrm{e}\,^{i v^{\prime } (a_n^{-1}S_n)}\rightarrow \\&\exp \left\{ - |v |^{\alpha } C_\alpha ^{-1}\,\Big ((b(v/|v|)\!+\!b(-v/|v|))\,-i\, (b(v/|v|)\!-\! b(-v/|v|))\, \tan (\pi \, \alpha /2)\Big )\right\} , \end{aligned}$$

where \(C_\alpha \) is defined in (4.2). This is the characteristic function of an \(\alpha \)-stable random vector \(\xi _\alpha \). The representation of the Lévy spectral measure \(\Gamma _\alpha \) in (4.1) follows by calculations as in Example 2.3.4 of [44]. Indeed, keeping the notation of [44] and identifying the limiting law, we obtain the equations

$$\begin{aligned} b(\theta )+b(-\theta )&= C_\alpha \,\sigma _\theta ^\alpha =C_\alpha \,\int \limits _{\mathbb{S }^{d-1}} |\theta ^{\prime }s|^\alpha \Gamma _\alpha (ds)=C_\alpha \,\int \limits _{\mathbb{S }^{d-1}} (\theta ^{\prime }s)_+^\alpha \Gamma _\alpha (ds)\nonumber \\&+C_\alpha \,\int \limits _{\mathbb{S }^{d-1}} (-\theta ^{\prime }s)_+^\alpha \Gamma _\alpha (ds) ,\end{aligned}$$

and

$$\begin{aligned} b(\theta )-b(-\theta )&= (b(\theta )+b(-\theta ))\,\beta _\theta \\&= C_\alpha \,\int \limits _{\mathbb{S }^{d-1}} |\theta ^{\prime }s|^\alpha \mathrm{sign}(\theta ^{\prime }s)\Gamma _\alpha (ds)\\&= C_\alpha \,\int \limits _{\mathbb{S }^{d-1}} (\theta ^{\prime }s)_+^\alpha \Gamma _\alpha (ds)-C_\alpha \,\int \limits _{\mathbb{S }^{d-1}} (-\theta ^{\prime }s)_+^\alpha \Gamma _\alpha (ds),\quad \theta \in \mathbb{S }^{d-1}. \end{aligned}$$

The limiting \(\alpha \)-stable distribution is degenerate if and only if \(b(\theta )=0\) for all \(\theta \in \mathbb{S }^{d-1}\).

This proves part (1) of the theorem.

6.2 Stable limit theory for general stationary processes

In this part we give some arguments showing that the results of Theorem 4.1 apply in a much more general context. For this reason, consider a strictly stationary \(\mathbb{R }^d\)-valued sequence \((X_t)\) which is regularly varying with index \(\alpha >0\). Then \(\Phi _t=(X_t,X_{t-1},\ldots )\), \(t\in \mathbb{Z }\), constitutes a Markov chain with infinite-dimensional state space. In this setting, \(\mathbf (DC _{p})\) for \(X_t=f(\Phi _t)\) takes the form:

Condition \(\mathbf (DC _{p}^{\prime })\):

$$\begin{aligned}&\mathbb{E }(|X_1|^p\mid (X_0,X_{-1},\ldots )=(x_0,x_{-1},\ldots ))\\&\quad \le \beta |x_0|^p+b\quad \text{ for } \text{ some } 0<\beta <1 \text{ and } b>0\text{. } \end{aligned}$$

We also need a weak dependence assumption more general than geometric \(\beta \)-mixing which, in the irreducible case, is implied by the drift condition.

Condition (MX\(_m\)): Consider an integer sequence \((m_n)\) such that \(m=m_n\rightarrow \infty \) and \(m_n/n=o(1)\) and also write \(k_n=[n/m]\). There exists a sequence \((r_n)\) such that \(r_n=o(m_n)\) and

$$\begin{aligned} \lim _{n\rightarrow \infty }|\varphi _n^{(\ell )}(s)-\varphi _{n,m-\ell }^k(s)|= 0,\quad s\in \mathbb{R }^d, \end{aligned}$$

holds for both \(\ell =\ell _n=m_n-r_n\) and \(\ell =r_n\), where

$$\begin{aligned} \varphi _n^{(\ell )}(s)&= \mathbb{E }\left[ \exp \left( ia_n^{-1}\sum _{i=1}^{k_n} \sum _{t=(i-1)m+1}^{im-\ell }s^{\prime }X_t\right) \right] \!,\\ \varphi _{n,j}(s)&= \mathbb{E }\left[ \exp \left( ia_n^{-1}\sum _{t=1}^{j}s^{\prime }X_t\right) \right] \!,\quad j\ge 1,\quad \varphi _n(s)=\varphi _{n,n}(s),\quad s\in \mathbb{R }^d. \end{aligned}$$

Condition (MX\(_m\)) is implied by \(\theta \)-weak dependence as introduced by Doukhan and Louhichi [17]: For every \(m\ge 1\), equip \((\mathbb{R }^d)^m\) with the metric \(|\cdot |_m=m^{-1}\sum _{i=1}^m|\cdot |\). A function \(f: (\mathbb{R }^d)^m\mapsto [-1,1]\), \(m\ge 1\), is Lipschitz if

$$\begin{aligned} \sup _{x\ne y}\frac{|f(x)-f(y)|}{|x-y|_m}=\text{ Lip }(f)<\infty . \end{aligned}$$

The \(\theta \)-weak dependence coefficients \((\theta _r)_{r\ge 0}\) are defined for any \(f\) with \(\text{ Lip }(f)=1\) and measurable \(g: (\mathbb{R }^d)^v\mapsto [-1,1]\), \(v\ge 1\), as

$$\begin{aligned} \sup _{k,v\ge 1}\sup _{i_1<\cdots <i_v\le 0\le r\le j_1<\cdots <j_m}|\mathrm{cov}(g(X_{i_1},\ldots ,X_{i_v}),f(X_{j_1},\ldots ,X_{j_m}))|=\theta _r. \end{aligned}$$

Condition (MX \(_m\)) follows if \(\theta _r\rightarrow 0\) for some \(r=r_n=o(m)\) with \(m=m_n\). \(\theta \)-weak dependence covers a wide range of known dependence concepts, including a large variety of mixing conditions; see [17].

In the general case, the following analog of Theorem 4.1 holds. The proof follows along the lines of the proof of Theorem 4.1; irreducibility of \((X_t)\) is replaced by (MX\(_m\)). We omit further details.

Theorem 6.3

Consider an \(\mathbb{R }^d\)-valued strictly stationary sequence \((X_t)\) satisfying the following conditions:

  • \(\mathbf (RV _\alpha )\) for some \(\alpha \in (0,2)\), \(\mathbb{E }X=0\) if \(\alpha >1\) and \(X\) is symmetric if \(\alpha =1\).

  • \(\mathbf {(DC_{p}^{\prime })}\) for some \(p\in ((\alpha -1)\vee 0,\alpha )\).

  • (MX \(_m\)) for \(m_n=o(n^{(p+1)/\alpha -1})\).

Let \((a_n)\) be a sequence of positive numbers such that \(n\,\mathbb{P }(|X_0|>a_n)\sim 1\). Then the statement of part (1) of Theorem 4.1 holds.

6.3 Proof of part (2)

Recall the regenerative structure of the Markov chain \((X_t)\) from Sect. 2.2. We will show that the partial sum \(S(1)\) over a full regenerative cycle is regularly varying with index \(\alpha \). We write

$$\begin{aligned} S_n=S(0)+\sum _{t=1}^{N_A(n)}S(t)+\sum _{i=\tau _A(N_A(n))+1}^nX_i, \end{aligned}$$
(6.2)

where \(N_A(n)=\#\{i\ge 0:\tau _A(i)\le n\}\), \(n\ge 1\), is independent of \((S(i))_{i\ge 1}\). The first and last block sums \(S(0)\) and \(\sum _{i=\tau _A(N_A(n))+1}^nX_i\) are tight. Therefore

$$\begin{aligned} a_n^{-1} S_n=a_n^{-1}\sum _{t=1}^{N_A(n)}S(t) + o_P(1). \end{aligned}$$

By virtue of \(\mathbf (DC _{p})\) for some \(p>0\) the chain \((X_t)\) is geometrically ergodic. Therefore there exists a constant \(\kappa >0\) such that

$$\begin{aligned} \sup _{x\in A}\mathbb{E }_x\mathrm{e}\,^{\kappa \tau _A}<\infty , \end{aligned}$$
(6.3)

(see [30], (15.2) in Theorem 15.0.1) and hence \(\tau _A\) has an exponential moment. By a standard renewal argument, \(N_A(n)/ n \mathop {\rightarrow }\limits ^{\mathrm{a.s.}} (\mathbb{E }\tau _A)^{-1}\). Then for \(\epsilon ,\delta >0\),

$$\begin{aligned}&\mathbb{P }\left( a_n^{-1}\left| \sum _{t=1}^{N_A(n)}S(t)-\sum _{t=1}^{n\,(\mathbb{E }\tau _A)^{-1}}S(t)\right| \ge \epsilon \right) \\&\le \mathbb{P }( |N_A(n)-n (\mathbb{E }\tau _A)^{-1}|>\delta N_A(n))\\&+ \mathbb{P }\left( a_n^{-1}\left| \sum _{t=1}^{|N_A(n)-n (\mathbb{E }\tau _A)^{-1}|}S(t)\right| \ge \epsilon , |N_A(n)-n (\mathbb{E }\tau _A)^{-1}|\le \delta N_A(n)\right) \\&\le o(1)+ c\mathbb{P }\left( a_n^{-1}\left| \sum _{t=1}^{\delta N_A(n)}S(t)\right| \ge 0.5 \epsilon \right) \!. \end{aligned}$$

In the last step we used a maximal inequality of Ottaviani type; see e.g. [38], Chapter 2. The second term on the right-hand side is negligible by first letting \(n\rightarrow \infty \) and then \(\delta \rightarrow 0\), since \(a_n^{-1}\sum _{t=1}^{N_A(n)}S(t)\stackrel{d}{\rightarrow }\xi _\alpha \). Hence

$$\begin{aligned} a_n^{-1}S_n=a_n^{-1}\sum _{t=1}^{n\,(\mathbb{E }\tau _A)^{-1}}S(t)+o_P(1). \end{aligned}$$

In view of part (1), the sum of the iid random vectors \((S(i))\) on the right-hand side has an \(\alpha \)-stable limit. It follows from [43] that \(S(1)\) is regularly varying with index \(\alpha \). This concludes the proof.
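The renewal approximation \(N_A(n)/n\rightarrow (\mathbb{E }\tau _A)^{-1}\) underlying this proof is easy to check by simulation; the sketch below uses iid geometric cycle lengths as an illustrative stand-in for the regeneration times (consistent with the exponential moment in (6.3)):

```python
import math
import random

def renewal_count(n, mean_cycle, seed=3):
    """Count N_A(n) = #{i >= 1 : tau_A(1)+...+tau_A(i) <= n} for iid geometric
    cycle lengths with mean `mean_cycle`; then N_A(n)/n is close to 1/mean_cycle."""
    rng = random.Random(seed)
    p = 1.0 / mean_cycle
    t, count = 0, 0
    while True:
        u = 1.0 - rng.random()                                    # u in (0, 1]
        tau = max(1, math.ceil(math.log(u) / math.log(1.0 - p)))  # geometric(p) via inversion
        if t + tau > n:
            return count
        t += tau
        count += 1
```

With mean cycle length 5 and \(n=10^5\), the count comes out close to \(n/5=20{,}000\), in line with the strong law for \(N_A(n)\).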

7 Proof of Theorem 4.3

7.1 Proof of part (1): The case \(\alpha \in (0,2)\)

Recall the decomposition (6.2) of the partial sums \(S_n\) in terms of the regenerative cycles of the Markov chain. We start with an auxiliary result which deals with the sums over the first and last blocks.

Lemma 7.1

Assume that \(\mathbf (RV _\alpha )\) and \(\mathbf (DC _{p})\) hold for some \(p>\alpha -1\) provided \(\alpha >1\). Then there exists a constant \(c>0\) such that for any sequence \(x=x_n\rightarrow \infty \) as \(n\rightarrow \infty \),

$$\begin{aligned} \mathbb{P }_A \left( \sum _{t=1}^{\tau _A}|X_t|>x\right)&\le c\, \mathbb{P }(|X|>x),\end{aligned}$$
(7.1)
$$\begin{aligned} \mathbb{P }\left( \sum _{t=1}^{\tau _A}|X_t|>x,\tau _A\le n\right)&= o(n \mathbb{P }(|X|>x)). \end{aligned}$$
(7.2)

Proof

We start by proving (7.1). For any random vector \(X\) we write \(\overline{X}= X1\!\!1_{\{|X|\le x\}}\). Then

$$\begin{aligned} \mathbb{P }_A\left( \sum _{t=1}^{\tau _A}|X_t|>x\right) \le \mathbb{P }_A\left( \sum _{t=1}^{\tau _A}\overline{|X_t|}>x/2\right) +\mathbb{P }_A\left( \cup _{t=1}^{\tau _A}\{\overline{|X_t|}\ne |X_t|\}\right) =I_1+I_2. \end{aligned}$$

Using the Wald identity, we have

$$\begin{aligned} I_2 =\mathbb{E }_A\left( \max _{1\le t\le \tau _A}1_{\{|X_t|>x\}}\right) \le \mathbb{E }_A\left( \sum _{t=1}^{\tau _A}1_{\{|X_t|>x\}}\right) = \mathbb{E }_A(\tau _A)\,\mathbb{P }(| X|>x). \end{aligned}$$

Write \(k_0=\lceil \alpha \rceil \) and choose \(0<\beta <1\) such that \(\beta k_0>\alpha \). Since \(k_0(k_0-1)\ge \alpha (\alpha -1)\) for \(\alpha >1\), we will choose \(\beta \) such that \(p/\beta =k_0-1\). Markov’s inequality yields

$$\begin{aligned} I_1\le c \frac{\mathbb{E }_A\Big (\sum _{t=1}^{\tau _A}\overline{| X_t|}\Big )^{\beta k_0}}{x^{\beta k_0}}\le c \frac{\mathbb{E }_A(\sum _{t=1}^{\tau _A}\overline{ |X_t|}^\beta )^{k_0 }}{x^{\beta k_0}}. \end{aligned}$$

Note that \((\overline{|X_t|}^\beta )\) satisfies \(\mathbf (DC _{k_0-1})\). Under the latter condition we may apply Proposition 4.7 of [34] to get \(\mathbb{E }_A(\sum _{t=1}^{\tau _A}\overline{ |X_t|}^\beta )^{k_0 }\le c\mathbb{E }\overline{| X|}^{\beta k_0}\). An application of Karamata’s theorem shows that the right-hand side is bounded by \(c \mathbb{P }(|X|>x)\). This concludes the proof of (7.1).

Now we turn to the proof of (7.2). Abusing notation, we write \(\overline{X}= X1\!\!1_{\{|X|\le x\delta \}}\) for any fixed \(\delta >0\). Then

$$\begin{aligned}&\mathbb{P }\left( \sum _{t=1}^{\tau _A}|X_t|>x,\tau _A\le n\right) \le \mathbb{P }\left( \sum _{t=1}^{\tau _A}\overline{|X_t|}>x/2,\tau _A\le n\right) \\&\quad +\mathbb{P }\left( \cup _{t=1}^{\tau _A}\{\overline{|X_t|}\ne |X_t|\}\right) =I_1^{\prime }+I_2^{\prime }. \end{aligned}$$

Since \(\mathbb{E }\tau _A<\infty \) and \(X\) is regularly varying, we have

$$\begin{aligned} I_2^{\prime }\le \mathbb{E }(\tau _A)\,\mathbb{P }(| X|>x\delta )=o(n\mathbb{P }(|X|>x)). \end{aligned}$$

Similar arguments as above yield

$$\begin{aligned} I_1^{\prime }\le c \frac{\mathbb{E }(\sum _{t=1}^{\tau _A}\overline{|X_t|}1\!\!1_{\{\tau _A\le n\}})^{\beta k_0}}{x^{\beta k_0}}\le c \frac{\mathbb{E }(\sum _{t=1}^{n}\overline{ |X_t|}1\!\!1_{\{\tau _A\ge t\}})^{\beta k_0}}{x^{\beta k_0}}. \end{aligned}$$

An argument similar to the one used in the proof of Theorem 4.6 in [34] shows that

$$\begin{aligned} \mathbb{E }\left( \sum _{t=1}^{n} \overline{|X_t|}^\beta 1\!\!1_{\{\tau _A\ge t\}}\right) ^{ k_0}\le c\,\mathbb{E }\left( \sum _{t=1}^{ n }\overline{|X_t|}^{\beta k_0}1\!\!1_{\{\tau _A\ge t\}}\right) . \end{aligned}$$

Finally, an application of Pitman’s identity [39], Proposition 4.7 in [34] and Karamata’s theorem yields

$$\begin{aligned} \mathbb{E }\left( \sum _{t=1}^{n}\overline{|X_t|}^{\beta k_0}1\!\!1_{\{\tau _A\ge t\}}\right)&= \mathbb{P }(X_0\in A)\,\mathbb{E }_A\left( \sum _{k=0}^{\tau _A-1}\sum _{t=1}^{n}\overline{| X_{k+t}|}^{\beta k_0}1\!\!1_{\{\tau _A\ge k+t\}}\right) \\&\le n\, \mathbb{P }(X_0\in A)\,\mathbb{E }_A\left( \sum _{t=1}^{\tau _A}\overline{|X_{t}|}^{\beta k_0}\right) \\&\le c\, n \mathbb{E }\overline{|X|}^{\beta k_0}\sim c\, n\, x^{\beta k_0}\delta ^{\beta k_0-\alpha }\mathbb{P }(|X|>x). \end{aligned}$$

Since \(\beta k_0>\alpha \) and we can make \(\delta \) as small as we like, we conclude that \(I_1^{\prime }=o\big (n\,\mathbb{P }(|X|>x)\big )\). This concludes the proof of (7.2). \(\square \)

Now we are ready to prove part (1). Since \(\tau _A\) has an exponential moment, it follows that \(\mathbb{P }(\tau _A> n)=o(\mathbb{P }(|X|>\lambda _n))\). Therefore we may prove the result on the event \(\{\tau _A\le n\}\). We write for simplicity \(\mathbb{P }_n(\cdot )=\mathbb{P }(\cdot \cap \{\tau _A\le n\})\). In view of Lemma 7.1 and the decomposition (6.2) of \(S_n\) we may neglect the sums over the first and last cycles, and it suffices to prove the large deviation principle for the process \(\sum _{t=1}^{N_A(n)}S(t)\) of sums over independent cycles. Observe that

$$\begin{aligned} \frac{\mathbb{P }_n(\lambda ^{-1}_n\sum _{t=1}^{N_A(n)}S(t)\in \cdot )}{n\mathbb{P }(|X|\ge \lambda _n)}=\frac{\mathbb{P }_n(\lambda ^{-1}_n\sum _{t=1}^{N_A(n)}S(t)\in \cdot )}{n\mathbb{P }(|S(1)|\ge \lambda _n)}\frac{\mathbb{P }(|S(1)|\ge \lambda _n)}{\mathbb{P }(|X |\ge \lambda _n)}. \end{aligned}$$

The same arguments as in the proof of Lemma 4.12 in [34] (here the conditions \(\lambda _n\rightarrow \infty \) and \(\lambda _n/n^{\delta +1/\alpha }\rightarrow \infty \) for some \(\delta >0\) are crucial) show that for any small \(\xi ,\varepsilon >0\), and any set \(B\) bounded away from zero,

$$\begin{aligned} \dfrac{(1-\varepsilon )\mathbb{P }( \lambda _n^{-1}(1+\xi )^{-1}(1+\varepsilon )^{-1}S(1)\in B )}{\mathbb{E }(\tau _A)\,\mathbb{P }(|S(1)|>\lambda _n)}&\le \dfrac{\mathbb{P }_n \Big (\lambda _n^{-1} \sum _{t=1}^{N_A(n)} S(t)\in B\Big )}{n\mathbb{P }(|S(1)|>\lambda _n)}+o(1)\\&\le \dfrac{\mathbb{P }( \lambda _n^{-1}(1-\xi )^{-1}S(1)\in B)}{\mathbb{E }(\tau _A)\,\mathbb{P }(|S(1)|>\lambda _n)}+o(1). \end{aligned}$$

Assume first that the cluster index \(b\) does not vanish. In view of part (2) of Theorem 4.1 we know that \(S(1)\) is regularly varying with index \(\alpha \) and spectral measure \(\mathbb{P }_{\Theta ^{\prime }}\) given by (4.3), and we also know that

$$\begin{aligned} \dfrac{\mathbb{P }(\lambda _n^{-1} S(1)\in \cdot )}{\mathbb{P }(|S(1)|>\lambda _n)}\stackrel{v}{\rightarrow }\mu _{S(1)}(\cdot ), \end{aligned}$$

for a non-null Radon measure \(\mu _{S(1)}\). Hence, letting \(\varepsilon \rightarrow 0\) and \(\xi \rightarrow 0\), we conclude that

$$\begin{aligned} \dfrac{\mathbb{P }_n\Big (\lambda _n^{-1} \sum _{t=1}^{N_A(n)} S(t)\in \cdot \Big )}{n\mathbb{P }(|S(1)|>\lambda _n)}\stackrel{v}{\rightarrow }\frac{\mu _{S(1)}(\cdot )}{\mathbb{E }(\tau _A)}. \end{aligned}$$

It remains to determine the limit of \(\mathbb{P }(|S(1)|>x)/\mathbb{P }(|X|>x)\) as \(x\rightarrow \infty \). By virtue of the proof of Theorem 4.1, \(a_n^{-1}\sum _{t=1}^{n/\mathbb{E }\tau _A} S(t)\stackrel{d}{\rightarrow }\xi _\alpha \). Then necessarily

$$\begin{aligned} \dfrac{n}{\mathbb{E }\tau _A} \mathbb{P }(a_n^{-1} S(1)\in \cdot )\stackrel{v}{\rightarrow }\nu _\alpha (\cdot ), \end{aligned}$$

where \(\nu _\alpha \) is the Lévy measure of \(\xi _\alpha \). Hence

$$\begin{aligned} \dfrac{\mathbb{P }(|S(1)|>a_n)}{\mathbb{P }(|X|>a_n)} \sim n\mathbb{P }(|S(1)|\ge a_n)\rightarrow \mathbb{E }\tau _A \,\Gamma _\alpha (\mathbb{S }^{d-1}), \end{aligned}$$
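The equivalence in the last display only uses the standard choice of the normalizing sequence \((a_n)\), namely \(n\,\mathbb{P}(|X|>a_n)\rightarrow 1\); a sketch:

```latex
% Multiply and divide by n; n\,\mathbb{P}(|X|>a_n) -> 1 by the choice of (a_n):
\frac{\mathbb{P}(|S(1)|>a_n)}{\mathbb{P}(|X|>a_n)}
  = \frac{n\,\mathbb{P}(|S(1)|>a_n)}{n\,\mathbb{P}(|X|>a_n)}
  \sim n\,\mathbb{P}(|S(1)|>a_n),
% and the limit \mathbb{E}\tau_A\,\Gamma_\alpha(\mathbb{S}^{d-1}) equals
% \mathbb{E}\tau_A\,\nu_\alpha(\{y:|y|>1\}), by the vague convergence above.
```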

where \(\Gamma _\alpha \) is the spectral measure of \(\nu _\alpha \). But from part (2) of Theorem 4.1 we know that \(n\mathbb{P }(|S(1)|\ge a_n)\rightarrow \mathbb{E }\tau _A \int _{\mathbb{S }^{d-1}} b(\theta )\,d\mathbb{P }_\Theta (\theta )\). This proves the result in the non-degenerate case \(b\ne 0\).

In the degenerate case \(b=0\) we have \(\mathbb{P }(|S(1)|\ge x)=o(\mathbb{P }(|X|\ge x))\) as \(x\rightarrow \infty \). By independence of the cycles and since \(\lambda _n/a_n\rightarrow \infty \), we obtain for any set \(B\) bounded away from zero and a suitable \(\gamma >0\) (such that \(B\subset \{y:|y|>\gamma \}\)),

$$\begin{aligned} \mathbb{P }_n \left( \lambda _n^{-1} \sum _{t=1}^{N_A(n)} S(t)\in B\right) \le \mathbb{P }\left( \left| \sum _{t=1}^{N_A(n)} S(t)\right| >\gamma \lambda _n\right) \le n\,c\,\mathbb{P }(|S(1)|\ge c\,\lambda _n)=o(n\, \mathbb{P }(|X|>\lambda _n)). \end{aligned}$$

The desired result in the degenerate case follows.

7.2 Proof of part (2): The case \(\alpha >2\).

We only consider the non-degenerate case \(b\ne 0\). We will apply Theorem 4.6 in [34] for functions of Markov chains in the case \(d=1\).

Theorem 7.2

Let \((G_t)=(f(\Phi _t))\) be a one-dimensional functional of a strictly stationary \(\mathbb{R }\)-valued irreducible aperiodic Markov chain \((\Phi _t)\). Write \(S_n(G)=G_1+\cdots +G_n\), \(n\ge 1\), for the corresponding random walk. Assume that the following conditions are satisfied.

  1. The regular variation condition \((\mathbf{RV}_{\alpha })\) holds for some \(\alpha >2\), and \(\mathbb{E }G=0\).

  2. The anti-clustering condition \((\mathbf{AC})_{\alpha }\):

    $$\begin{aligned} \lim _{k\rightarrow \infty }\limsup _{n\rightarrow \infty } \sup _{x\in \Lambda _n} \delta _k^{-\alpha }\sum _{j= k}^n\mathbb{P }(|G_j|> x\delta _k\mid |G_0|> x\delta _k)=0 \end{aligned}$$

    for a sequence \(\delta _k=o(k^{-2})\), \(k\rightarrow \infty \), and sets \((\Lambda _n)\) such that \(b_n=\inf \Lambda _n\rightarrow \infty \) as \(n\rightarrow \infty \).

  3. The limit \(b_+=\lim _{k\rightarrow \infty }(b_+(k+1)-b_+(k))\) exists, where the constants \((b_+(k))\) are defined in Theorem 6.1.

  4. The drift condition \((\mathbf{DC}_{p})\) holds for every \(p<\alpha \).

Then the precise large deviation principle

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{x\in \Lambda _n}\Big |\frac{\mathbb{P }(S_n(G)> x)}{n\,\mathbb{P }(|G|> x)}-b_+\Big |=0, \end{aligned}$$
(7.3)

holds if \(\Lambda _n=(b_n,c_n)\) for any sequence \((b_n)\) satisfying \(b_n=n^{0.5+\varepsilon }\) for some \(\varepsilon >0\), and any sequence \((c_n)\) such that \(c_n>b_n\) and

$$\begin{aligned} \mathbb{P }(\tau _A>n)= o(n\, \mathbb{P }(|G|>c_n)), \end{aligned}$$
(7.4)

where \(\tau _A= \tau _A(1)\) is the first hitting time of the atom \(A\) of the Markov chain; see Sect. 2.2.

We will apply this result to \(G_t=\theta ^{\prime }X_t\), \(t\in \mathbb{Z }\), for any fixed \(\theta \in \mathbb{S }^{d-1}\) with \(b(\theta )\ne 0\). Note that (7.4) is satisfied since \(\tau _A\) has an exponential moment. Condition \((\mathbf{RV}_{\alpha })\) for \((G_t)\) is satisfied by regular variation of \((X_t)\) in all non-degenerate cases \(b(\theta ) \ne 0\). The existence of the limit \(b_+=\lim _{k\rightarrow \infty }(b_+(k+1)-b_+(k))= b(\theta )/\mathbb{E }|\theta ^{\prime }\Theta _0|^\alpha \) (here we assume that \(\mathbb{E }|\theta ^{\prime }\Theta _0|^\alpha \ne 0\)) is ensured by Theorem 3.2. It remains to check condition \((\mathbf{AC})_{\alpha }\) for \((G_t)\) under \((\mathbf{DC}_{p})\) for \((G_t)\) for every \(p<\alpha \). Note that \((\mathbf{DC}_{p})\) for \((X_t)\) implies \((\mathbf{DC}_{p})\) for \((G_t)\). Using Markov’s inequality of order \(p<\alpha \) and (6.1), we obtain the following bound for \(k\ge 1\) and \(x\in \Lambda _n\):

$$\begin{aligned} \sum _{j= k}^n\mathbb{P }(|G_j|> x \delta _k \mid |G_0|> x\delta _k)&\le \sum _{j= k}^{n}\frac{\mathbb{E }(|G_j|^p1\!\!1_{\{|G_0|>x \delta _k\}})}{x^p\delta _k^p \mathbb{P }(|G_0|>x\delta _k)}\\&\le \sum _{j= k}^{n}\left( \frac{\beta ^{j-1}\mathbb{E }(|X_0|^p1\!\!1_{\{|X_0|>x\delta _k\}})}{x^p\delta _k^p\mathbb{P }(|G_0|>x\delta _k)}+\frac{c}{x^p\delta _k^p}\right) \\&\le c\left( \frac{\beta ^k\mathbb{E }(|X_0|^p1\!\!1_{\{|X_0|>x\delta _k\}})}{x^p\delta _k^p\mathbb{P }(|X_0|>x\delta _k)}+\frac{n}{x^p\delta _k^p}\right) . \end{aligned}$$
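The last inequality of the display is just the geometric series applied to the factor \(\beta ^{j-1}\) from (6.1), with \(0<\beta <1\):

```latex
% Geometric tail sum, 0<\beta<1:
\sum_{j=k}^{n}\beta^{\,j-1}
  \le \sum_{j=k}^{\infty}\beta^{\,j-1}
  = \frac{\beta^{\,k-1}}{1-\beta}
  \le c\,\beta^{k},
% while the constant term c/(x^p\delta_k^p) is summed over at most n
% indices j, producing the term c\,n/(x^p\delta_k^p).
```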

The second term is of the order \(O(n b_n^{-p})=o(1)\) for each fixed \(k\), uniformly for \(x\in \Lambda _n\), since \(p\) can be chosen larger than 2 such that \(p(0.5+\varepsilon )>1\). The first term converges to \(c\beta ^k\) as \(n\rightarrow \infty \), uniformly for \(x\in \Lambda _n\), by applications of Karamata’s theorem and the uniform convergence theorem for regularly varying functions. We conclude that \((\mathbf{AC})_\alpha \) holds since \(\delta _k^{-\alpha }\beta ^k\rightarrow 0\) as \(k\rightarrow \infty \) if we choose \(\delta _k=k^{-2-\varepsilon ^{\prime }}\) for \(\varepsilon ^{\prime }>0\) sufficiently small. Thus all conditions of Theorem 7.2 are satisfied for \((G_t)=(\theta ^{\prime }X_t)\), and therefore (7.3) applies. Since \(\mathbb{P }(|\theta ^{\prime }X|>x)/\mathbb{P }(|X|>x)\rightarrow \mathbb{E }|\theta ^{\prime }\Theta _0|^\alpha \), we can also write (7.3) in the form (4.9).

Now choose \((\lambda _n)\) as in the formulation of the theorem and apply Lemma A.1 below. This proves the theorem.