1 Introduction

Ergodic theorems for nonconventional sums of the form

$$\begin{aligned} S_N=S_N({\omega })=\sum _{k=1}^{N}f_1(T^{q_1(k)}{\omega })f_2(T^{q_2(k)}{\omega })\cdots f_\ell (T^{q_\ell (k)}{\omega }), \end{aligned}$$
(1.1)

were initiated in [11], where they were employed in the proof of Szemerédi’s theorem on arithmetic progressions, while the name “nonconventional” comes from [12]. Here \(T\) is an ergodic measure preserving transformation, the \(f_i\)’s are bounded measurable functions and \(q_i(k)=ik\), \(i=1,\ldots ,\ell \). Since then, results concerning such ergodic theorems under various conditions have evolved into a substantial body of literature. More recently, under appropriate mixing conditions, a strong law of large numbers and a functional central limit theorem were obtained for even more general sums in [19] and [21], respectively.

If for each \(i\) the function \(f_i\) equals the indicator \({\mathbb I}_A\) of the same measurable set \(A\), then the corresponding sum \(S_N=S_N^A({\omega })\) counts the multiple recurrence events \(T^{q_i(k)}{\omega }\in A,\, i=1,\ldots ,\ell \), which occur for \(k\le N\). It was shown in [20] that if we count such multiple arrivals to appropriately shrinking sets \(A_n\), then the sums

$$\begin{aligned} S_{N}^{A_n}({\omega })=\sum _{k=1}^{N}{\mathbb I}_{A_n}(T^{q_1(k)}{\omega }) {\mathbb I}_{A_n}(T^{q_2(k)}{\omega })\cdots {\mathbb I}_{A_n}(T^{q_\ell (k)}{\omega }) \end{aligned}$$
(1.2)

will usually have an asymptotically Poisson distribution for suitably chosen sequences \(N=N_n\rightarrow \infty \) as \(n\rightarrow \infty \). For \(\ell =1\) and \(q_1(k)=k\) this type of result was obtained in a series of papers under various conditions (see, for instance, [3, 5, 17, 18]).

To explain the results of this paper more precisely, let us first specify our setup, which consists of a stationary \(\psi \)-mixing discrete time process \(\xi (k),\, k=0,1,\ldots \) evolving on a finite or countable state space \({\mathcal A}\) and of nonnegative increasing functions \(q_i,\, i=1,\ldots ,\ell \) taking integer values on the integers and such that \(q_1(k)<q_2(k)<\cdots <q_\ell (k)\) when \(k\ge 1\). For each sequence \(a=(a_0,a_1,\ldots )\in {\mathcal A}^{\mathbb N}\) of elements from \({\mathcal A}\) and any \(n\in {\mathbb N}\) denote by \(a^{(n)}\) the string \(a_0,a_1,\ldots ,a_{n-1}\), which also determines an \(n\)-cylinder set \(A^a_n\) in \({\mathcal A}^{\mathbb N}\) consisting of the sequences whose initial \(n\)-string coincides with \(a_0,a_1,\ldots ,a_{n-1}\). For appropriately chosen sequences \(N=N_n\) we are interested in the number of those \(l\le N\) such that the process \(\xi (k)= \xi (k,{\omega }),\, k\ge 0\) repeats the string \(a^{(n)}\) starting at each of the times \(q_1(l),\, q_2(l),\ldots ,q_\ell (l)\). Employing the left shift transformation \(T\) on the sequence space \({\mathcal A}^{\mathbb N}\), we can represent the number in question via \(S^A_N({\omega })\) given by (1.2) with \(A=A^a_n\) and \(N=N_n\), considering \(S^A_N\) as a random variable on the probability space corresponding to the process \(\xi \).
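To make the counting in (1.2) concrete, here is a minimal Python sketch (ours, purely illustrative and not part of the original text) which computes \(S^{A^a_n}_N\) for a single realization of an i.i.d. fair-coin process with \(q_i(k)=ik\); all function and variable names are hypothetical.

```python
import random

def count_multiple_recurrences(omega, pattern, qs, N):
    """S_N of (1.2): the number of k <= N for which the word `pattern`
    starts at each of the positions q_1(k), ..., q_l(k) of `omega`."""
    n = len(pattern)
    def occurs_at(pos):
        return omega[pos:pos + n] == pattern
    return sum(1 for k in range(1, N + 1)
               if all(occurs_at(q(k)) for q in qs))

# Illustration: l = 2, q_i(k) = i*k, the 3-cylinder A = [0, 1, 0] and a fair-coin
# i.i.d. source, with N chosen as in (1.3) for t = 1.
random.seed(0)
ell, t = 2, 1.0
pattern = [0, 1, 0]
p_A = 0.5 ** len(pattern)
N = int(t * p_A ** (-ell))
qs = [lambda k, i=i: i * k for i in range(1, ell + 1)]
omega = [random.randint(0, 1) for _ in range(ell * N + len(pattern))]
print(count_multiple_recurrences(omega, pattern, qs, N))
```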

Viewing such \(S^{A^a_n}_N\) as random variables on \({\mathcal A}^{\mathbb N}\) equipped with a Gibbs \(T\)-invariant measure \({\mathbb P}\), it was shown in [20] that for almost all \(a\in {\mathcal A}^{\mathbb N}\) the sum \(S^{A_n^a}_N\) has asymptotically a Poisson distribution with parameter \(t\) provided

$$\begin{aligned} N=N_n=N_n^a\sim t({\mathbb P}(A^a_n))^{-\ell }\quad \text{ as }\quad n\rightarrow \infty . \end{aligned}$$
(1.3)

Observe that such asymptotic (in \(n\)) results make sense only for sequences \(a\in {\mathcal A}^{\mathbb N}\) such that \({\mathbb P}(A^a_n)>0\) for all \(n\ge 1\), but this is good enough since the set of such \(a\)’s has probability one. We will show in this paper, for linear \(q_i\)’s, that under (1.3) and for large \(n\) the distribution of \(S^{A_n^a}_N\) with \(N\) as in (1.3) is close either to a Poisson distribution with parameter \(t\) or to a compound Poisson distribution. The latter or the former holds true depending on whether or not the string \(a^{(n)}\) looks periodic with a period which is relatively short with respect to \(n\). In the “conventional” \(\ell =1\) case Poisson approximation estimates were obtained in [4] and [5], while compound Poisson approximation estimates for periodic sequences were derived in [18], but even in this case the complete dichotomy described above seems to be new. We observe that in a related setup a certain dichotomy of limiting distributions for first hitting times was described in [13] (see also [7] and references there), while [14] exhibited compound Poisson limits for the distributions of numbers of returns to neighborhoods of periodic points. Note that both the hitting time and the number of returns statistics in [13] and [14] are obtained for balls and absolutely continuous invariant measures of the corresponding dynamical systems, while our results in the symbolic setup hold true for cylinders and all \(\psi \)-mixing shift invariant measures.

Relying on our approximation estimates it is possible to see (assuming (1.3)) for which sequences \(a\) the distribution of \(S_N^{A^a_n}\) approaches for large \(n\) the Poisson distribution and for which the compound Poisson one. Moreover, we will show that under (1.3), for all nonperiodic sequences \(a\) the sum \(S_N^{A^a_n}\) converges in distribution as \(n\rightarrow \infty \) to a Poisson random variable with parameter \(t\). Since the number of periodic points is countable, we obtain Poissonian convergence for all but a countable set of points \(a\), and not only for almost all \(a\) as in previous papers. Observe that convergence of distributions of first hitting times for all nonperiodic points in a related setup (as mentioned above) was obtained in [13]. On the other hand, for periodic sequences \(a\) the limiting distribution may not exist at all, and we provide a corresponding example. We observe that an attempt to construct such an example was made in Sect. 3.4 of [18], but it follows both from our estimates and, actually, already from [5] that for the example in [18] the limiting distribution exists and is Poissonian. We prove also that if \({\mathbb P}\) is a product stationary measure on \({\mathcal A}^{\mathbb N}\) (Bernoulli measure), i.e. if the coordinate projections are independent identically distributed (i.i.d.) random variables, then the limiting distributions, either Poissonian or compound Poissonian, exist for all sequences \(a\). The above results describe completely the limiting behavior as \(n\rightarrow \infty \) in this setup and they are new even for the extensively studied “conventional” \(\ell =1\) case. Furthermore, either of the above convergence results invalidates the example in [18], since the latter counts arrivals to cylinders constructed from a nonperiodic sequence and it is built on a shift space with a product probability measure. Nevertheless, as our nonconvergence example shows, a slight perturbation of independence may already create sequences \(a\) where convergence fails.

Let \(a\in {\mathcal A}^{\mathbb N}\) and let \(\tau _{A^a_n}({\omega })\) be the first time \(k\) when the string \(a^{(n)}\) starts at each of the places \(q_1(k),q_2(k),\ldots ,q_\ell (k)\) of a sequence \({\omega }\in {\mathcal A}^{\mathbb N}\). When \(\ell =1\) and \(q_1(k)=k\) such hitting times were studied in a number of papers (see, for instance, [2, 3] and references there), where it was shown that under an appropriate normalization they have, as \(n\rightarrow \infty \), an exponential limiting distribution. As a corollary of our Poisson and compound Poisson approximations we will extend this type of result here to the nonconventional \(\ell >1\) situation.

Our results are applicable to larger classes of dynamical systems and not only to shifts. Indeed, any expansive endomorphism \(S\) of a compact space \(M\), in particular, any smooth expanding endomorphism of a compact manifold, has a symbolic representation as a one sided shift obtained by taking a finite partition \({\alpha }_1,\ldots ,{\alpha }_k\) of \(M\) into sets of small diameter and assigning to each point \(x\in M\) the sequence \(j_0,j_1,\ldots \) such that \(x\in \cap _{i=0}^\infty S^{-i}{\alpha }_{j_i}\), noticing that the last intersection is a singleton in view of expansivity. Smooth expanding endomorphisms of compact manifolds have many exponentially fast \(\psi \)-mixing invariant measures, namely Gibbs measures constructed from Hölder continuous functions, which ensures a fast decay of our approximation estimates in this case. Our results remain valid also for many invertible dynamical systems having symbolic representations as two sided shifts, notably for Axiom A (in particular, Anosov) diffeomorphisms. Within number theoretic applications the results can be formulated in terms of the occurrence of prescribed strings of digits in base-\(m\) or continued fraction expansions.

The structure of this paper is the following. In the next section we describe precisely our setup and conditions, give the necessary definitions, formulate our main results and discuss further their connections and relevance. In Sect. 3 we state and prove some auxiliary lemmas. In Sects. 4 and 5 our Poisson and compound Poisson approximation results will be proved. In Sect. 6 we prove existence of limiting distributions in the i.i.d. case, while in Sect. 7 we exhibit our nonconvergence example.

2 Preliminaries and main results

We start with a probability space \(({\Omega },{\mathcal F},{\mathbb P})\) such that \({\Omega }\) is a space of sequences \({\mathcal A}^{\mathbb N}\) with entries from a finite or countable set (alphabet) \({\mathcal A}\) which is not a singleton, the \(\sigma \)-algebra \({\mathcal F}\) is generated by the cylinder sets and \({\mathbb P}\) is a probability measure invariant with respect to the left shift \(T\) acting by \((T{\omega })_i={\omega }_{i+1}\) for \({\omega }=({\omega }_0,{\omega }_1,\ldots )\in {\Omega }={\mathcal A}^{{\mathbb N}}\). For each \(I\subset {\mathbb N}\) denote by \({\mathcal F}_I\) the sub \({\sigma }\)-algebra of \({\mathcal F}\) generated by the cylinder sets \([a_i,\, i\in I]=\{{\omega }=({\omega }_0,{\omega }_1,\ldots )\in {\Omega }:\, {\omega }_i=a_i\,\forall i\in I\}\). Without loss of generality we assume that the probability of each 1-cylinder set is positive, i.e. \({\mathbb P}([a])>0\) for every \(a\in {\mathcal A}\), and since \({\mathcal A}\) is not a singleton we have also \(\sup _{a\in {\mathcal A}}{\mathbb P}([a])<1\). Our results will be based on the \(\psi \)-mixing (dependence) coefficient defined for any two \({\sigma }\)-algebras \({\mathcal G},{\mathcal H}\subset {\mathcal F}\) by (see [10]),

$$\begin{aligned} \begin{aligned} \psi ({\mathcal G},{\mathcal H})&=\sup _{B\in {\mathcal G},\, C\in {\mathcal H}}\bigg \{\bigg \vert \frac{{\mathbb P}(B\cap C)}{{\mathbb P}(B){\mathbb P}(C)}-1\bigg \vert ,\,\,{\mathbb P}(B){\mathbb P}(C)\ne 0\bigg \}\\&=\sup \{\Vert {\mathbb E}(g|{\mathcal G})-{\mathbb E}g\Vert _\infty :\, g\,\,\,\text{ is }\,\,\, {\mathcal H}\text{-measurable } \text{ and }\, {\mathbb E}|g|\le 1\} \end{aligned} \end{aligned}$$
(2.1)

where \(\Vert \cdot \Vert _p\) is the \(L^p({\Omega },{\mathbb P})\)-norm. Next, we set

$$\begin{aligned} \psi _m=\sup _{n\in {\mathbb N}}\psi ({\mathcal F}_{0,n},{\mathcal F}_{n+m+1,\infty }) \end{aligned}$$
(2.2)

where \({\mathcal F}_{k,n}={\mathcal F}_{\{i:\,k\le i\le n\}}\) when \(0\le k\le n\le \infty \). It follows from (2.1) and (2.2) that \(\psi _m\) is nonincreasing in \(m\), and the measure \({\mathbb P}\) is called \(\psi \)-mixing if \(\psi _0<\infty \) and \(\psi _m\rightarrow 0\) as \(m\rightarrow \infty \).
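As a numerical illustration of the first expression in (2.1) (this sketch is ours and not part of the original text), the following Python code computes the \(\psi \)-dependence coefficient between the \(\sigma \)-algebras generated by two coordinates of a stationary two-state Markov chain; it only reflects two single coordinates, whereas \(\psi _m\) in (2.2) involves the full past and future \(\sigma \)-algebras, so this is merely a lower-bound illustration under assumed transition probabilities.

```python
from itertools import chain, combinations
import numpy as np

def psi_coefficient(joint):
    """The first expression in (2.1) for the sigma-algebras generated by two
    discrete random variables with joint distribution joint[x, y]."""
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    def subsets(size):
        return chain.from_iterable(combinations(range(size), r) for r in range(1, size + 1))
    best = 0.0
    for B in subsets(joint.shape[0]):
        for C in subsets(joint.shape[1]):
            pB, pC = px[list(B)].sum(), py[list(C)].sum()
            if pB * pC > 0:
                pBC = joint[np.ix_(list(B), list(C))].sum()
                best = max(best, abs(pBC / (pB * pC) - 1.0))
    return best

# Stationary two-state Markov chain; the coefficient between X_0 and X_{m+1}
# decays geometrically (the second eigenvalue of P is 1 - a - b).
a, b = 0.3, 0.2
pi = np.array([b / (a + b), a / (a + b)])
P = np.array([[1 - a, a], [b, 1 - b]])
for m in range(4):
    joint = np.diag(pi) @ np.linalg.matrix_power(P, m + 1)
    print(m, psi_coefficient(joint))
```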

Another important ingredient of our setup is a collection of \(\ell \) positive increasing integer valued functions \(q_1,\ldots ,q_\ell \) defined on positive integers \({\mathbb N}_+\) and such that

$$\begin{aligned} \begin{aligned}&q_1(n)<q_2(n)<\cdots <q_{\ell }(n),\,\,\forall n\in {\mathbb N}_+\,\,\text{ and }\\&\quad \lim _{n\rightarrow \infty }(q_{i+1}(n)-q_i(n))=\infty ,\,\,\forall i=1,\ldots ,\ell -1. \end{aligned} \end{aligned}$$
(2.3)

Our estimates will involve the following quantities related to these functions

$$\begin{aligned} g(n)=\inf _{k\ge n}\min _{1\le i\le \ell -1}(q_{i+1}(k)-q_i(k))\,\,\text{ and } \,\,{\gamma }(n)=\min \{ k\ge 0:\, g(k)\ge 2n\} \end{aligned}$$
(2.4)

where \(n\in {\mathbb N}_+\). In view of the second assumption in (2.3) the function \({\gamma }(n)\) is well defined for all \(n\).
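For instance (an illustrative sketch of ours, not part of the original text), in the linear case \(q_i(k)=d_ik\) the gaps \(q_{i+1}(k)-q_i(k)\) are nondecreasing in \(k\), so \(g(n)\) is attained at \(k=n\) and \({\gamma }(n)\) can be computed directly:

```python
def g(n, qs):
    """g(n) of (2.4) for functions whose consecutive gaps are nondecreasing in k
    (true for q_i(k) = d_i k), so the infimum over k >= n is attained at k = n."""
    return min(qs[i + 1](n) - qs[i](n) for i in range(len(qs) - 1))

def gamma(n, qs):
    """gamma(n) of (2.4): the least k with g(k) >= 2n (we start the search at
    k = 1 since the q_i are only defined on positive integers)."""
    k = 1
    while g(k, qs) < 2 * n:
        k += 1
    return k

qs = [lambda k, d=d: d * k for d in (1, 2, 4)]   # q_i(k) = d_i k with d = 1, 2, 4
print([(n, g(n, qs), gamma(n, qs)) for n in (1, 5, 10)])
```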

Denote by \({\mathcal C}_n\) the set of all \(n\)-cylinders \(A=[a_0,a_1,\ldots , a_{n-1}]\). Here and in what follows we denote as above by \([Q]\) a cylinder if \(Q\) is a finite string of elements from \({\mathcal A}\) but, as usual, when \(Q\) is a number then \([Q]\) denotes the integral part of \(Q\). Similarly to [5] we introduce also for each \(A=[a_0,a_1,\ldots ,a_{n-1}] \in {\mathcal C}_n\) the quantity

$$\begin{aligned} \pi (A)=\min \{ k\in \{ 1,2,\ldots ,n\}:\, A\cap T^{-k}A\ne \emptyset \} \end{aligned}$$
(2.5)

setting also \(A(\pi )=[a_0,a_1,\ldots ,a_{\pi (A)-1}]\in {\mathcal C}_{\pi (A)}\). Next, for each \(A\in {\mathcal C}_n\) we write

$$\begin{aligned} S^A_N=\sum _{k=1}^NX^A_k\,\,\text{ where }\,\, X^A_k=\prod _{i=1}^\ell {\mathbb I}_A\circ T^{q_i(k)} \end{aligned}$$
(2.6)

and often, when \(A\) is fixed and no ambiguity arises, we will drop the index \(A\) in \(X_k^A\) and \(S^A_N\). In order to shorten our formulas we will use throughout this paper the notation \(\wp (x)=xe^x\). The following result exhibits our Poisson approximation estimates (for \(\ell =1\) a similar estimate was obtained in [5]).

Theorem 2.1

Let \(A\in {\mathcal C}_n\) with \({\mathbb P}(A)>0\), \(t>0\), \(N=[t({\mathbb P}(A))^{-\ell }]\) and assume that \(\psi _n<(3/2)^{1/(\ell +1)}-1\). Then

$$\begin{aligned}&\sup _{L\subset {\mathbb N}}\left| {\mathbb P}\{ S^A_N\in L\}-P_t(L)\right| \le 16{\mathbb P}(A)\left( \ell ^2nt+ {\gamma }(n)(1+t^{-1})\right) \nonumber \\&\quad +\,\,6{\mathbb P}(A(\pi ))tn\ell ^2(1+\psi _0)+2\wp \left( 2^\ell t\psi _n+{\gamma }(n){\mathbb P}(A)\right) \end{aligned}$$
(2.7)

where \(P_t(L)\) is the probability assigned to \(L\) by the Poisson distribution with the parameter \(t\). Moreover, if \({\mathbb P}\) is \(\psi \)-mixing (i.e. \(\psi _n\rightarrow 0\) as \(n\rightarrow \infty \)), then (2.7) holds true with

$$\begin{aligned} {\mathbb P}(A)\le e^{-{\Gamma }n}\quad \text{ and }\quad {\mathbb P}(A(\pi ))\le e^{-{\Gamma }\pi (A)} \end{aligned}$$
(2.8)

for some \({\Gamma }>0\) independent of \(A\) and \(n\).

Next, for each \({\omega }=({\omega }_0,{\omega }_1,\ldots )\in {\Omega }\) and \(n\ge 1\) set \(A^{\omega }_n= [{\omega }_0,\ldots ,{\omega }_{n-1}]\). Denote by \({\Omega }_{\mathbb P}\) the set of \({\omega }\in {\Omega }\) such that \({\mathbb P}(A^{\omega }_n)>0\) for all \(n\ge 1\). Clearly, \(T{\Omega }_{\mathbb P}\subset {\Omega }_{\mathbb P}\) and \({\mathbb P}({\Omega }_{\mathbb P})=1\) since \({\Omega }_{\mathbb P}\) is the complement in \({\Omega }\) of the union of all cylinder sets \(A\in {\mathcal C}_n,\, n\ge 1\) such that \({\mathbb P}(A)=0\) and the number of such cylinders is countable. Furthermore, \({\Omega }_{\mathbb P}=\, \)supp\({\mathbb P}\) if we take the discrete topology on \({\mathcal A}\) and the corresponding product topology on \({\Omega }\) since cylinders form an open base of this topology. Theorem 2.1 yields the following asymptotic result.

Corollary 2.2

Assume that the conditions of Theorem 2.1 together with the \(\psi \)-mixing assumption hold true. Then for \({\mathbb P}\)-almost all \({\omega }\in {\Omega }\) there exists \(M({\omega })<\infty \) such that for any \(n\ge M({\omega })\),

$$\begin{aligned}&\sup _{L\subset {\mathbb N}} \left| {\mathbb P}\left\{ S^{A^{\omega }_n}_N\in L\right\} -P_t(L)\right| \le 16e^{-{\Gamma }n} \big (\ell ^2nt+{\gamma }(n)(1+t^{-1})\nonumber \\&\quad +\,\,t\ell ^2 n^4(1+\psi _0)\big )+2\wp \left( 2^\ell t\psi _n+{\gamma }(n)e^{-{\Gamma }n}\right) \end{aligned}$$
(2.9)

provided \(N=N^{\omega }_n=[t({\mathbb P}(A^{\omega }_n))^{-\ell }]\).

Theorem 2.1 yields that if both \(\pi (A^{\omega }_n)\) and \(g(n)\) grow fast enough in \(n\), then under the \(\psi \)-mixing condition the distribution of \(S^A_N\) approaches the Poisson one as \(n\rightarrow \infty \). The fast growth of \(\pi (A^{\omega }_n)\) in \(n\) means that \({\omega }\) is “very” nonperiodic, and Corollary 2.2 says that almost all \({\omega }\)’s fall into this category. The above results can be compared with Theorem 2.3 from [20] where \({\mathbb P}\) was supposed to be a Gibbs invariant measure corresponding to a Hölder continuous function concentrated on a subshift of finite type space (see [9]). The \(\psi \)-mixing coefficient of such a measure decays in \(n\) exponentially fast (see [9]), and so (2.8) holds true then. Under the conditions of [20] the function \(g(n)\) grows faster than logarithmically, and so \({\gamma }(n)\) grows slower than exponentially, which yields a fast decay of the errors in the Poisson approximations above; this improves the result from [20], where no estimates of the convergence rate were obtained. Note also that any Gibbs measure \({\mathbb P}\) gives a positive weight to each cylinder in the corresponding subshift of finite type space, and so in this case the latter space coincides with \({\Omega }_{\mathbb P}\). We observe that in the “conventional” \(\ell =1\) case error estimates similar to those of Theorem 2.1 were obtained for the corresponding Poisson approximations in [4] and [5].
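Since the growth of \(\pi (A^{\omega }_n)\) governs which of the two approximation regimes occurs, we note that \(\pi (A)\) of (2.5) is simply the shortest self-overlap of the word \(a_0,\ldots ,a_{n-1}\). The following short Python sketch (ours, purely illustrative and not part of the original text) computes it.

```python
def pi_of_cylinder(a):
    """pi(A) of (2.5) for A = [a_0, ..., a_{n-1}]: the least k in {1, ..., n}
    with A ∩ T^{-k} A nonempty, i.e. the word agrees with its k-shift on the
    overlap (k = n always qualifies, since the overlap is then empty)."""
    n = len(a)
    for k in range(1, n):
        if all(a[i + k] == a[i] for i in range(n - k)):
            return k
    return n

print(pi_of_cylinder("010101"))   # 2: the word looks periodic with period 2
print(pi_of_cylinder("010011"))   # 6 = n: no self-overlap
```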

Next, we describe our compound Poisson approximations where we assume that

$$\begin{aligned} q_i(k)=d_ik\,\,\text{ for } \text{ some }\,\, d_i\in {\mathbb N},\, i=1,\ldots ,\ell \,\,\text{ with } \,\, 1\le d_1<d_2<\cdots <d_\ell . \end{aligned}$$
(2.10)

For each \(R=[a_0,a_1,\ldots ,a_{r-1}]\in {\mathcal C}_r\) and an integer \(n\ge r\) set

$$\begin{aligned} R^{n/r}=\left( \mathop \cap \limits _{k=0}^{[n/r]-1}T^{-kr}(R)\right) \cap T^{-r[n/r]}\left( [a_0,\ldots , a_{n-r[n/r]-1}]\right) \end{aligned}$$

where we define \([a_0,\ldots ,a_{n-r[n/r]-1}]={\Omega }\) if \(r\) divides \(n\). Observe that if \(A=[a_0,a_1,\ldots ,a_{n-1}],\, A(\pi )=R\) and \(\pi (A)=r\) then \(R^{n/r}=A\).
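In words, \(R^{n/r}\) is the \(n\)-cylinder obtained by repeating the word of \(R\) with period \(r\) and truncating to length \(n\); a tiny illustrative sketch (ours, not part of the original text):

```python
def periodic_extension(R, n):
    """The n-cylinder R^{n/r} built from the r-cylinder R: repeat the word of R
    with period r = len(R) and truncate to length n (n >= r)."""
    r = len(R)
    return (R * ((n + r - 1) // r))[:n]

# If A = [a_0, ..., a_{n-1}], R = A(pi) and r = pi(A), then R^{n/r} = A.
A = "010010"                                 # here pi(A) = 3 and A(pi) = "010"
print(periodic_extension("010", 6) == A)     # True
```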

Theorem 2.3

Let \(A=[a_0,a_1,\ldots ,a_{n-1}]\in {\mathcal C}_n\) with \({\mathbb P}(A)>0\), \(t>0\), \(N=[t({\mathbb P}(A))^{-\ell }]\) and assume that \(\psi _n<(3/2)^{1/(\ell +1)}-1\). Set \(r=\pi (A),\,\, R=[a_0,\ldots ,a_{r-1}]\),

$$\begin{aligned} {\kappa }=\mathrm{lcm}\bigg \{\frac{r}{\gcd \{r,d_i\}}:\, 1\le i\le \ell \bigg \} \quad \text{ and }\quad \rho =\rho _A=\prod _{i=1}^\ell {\mathbb P}\left\{ R^{(n+d_i{\kappa })/r}|A\right\} \end{aligned}$$
(2.11)

where lcm and gcd denote the least common multiple and the greatest common divisor, respectively. Assume that \({\mathbb P}\) is \(\psi \)-mixing. Then

$$\begin{aligned} \sup _{A\in {\mathcal C}_n,\, n\ge 1}\rho _A<1 \end{aligned}$$
(2.12)

and if \({\omega }\in {\Omega }_{\mathbb P}\) is not a periodic sequence then

$$\begin{aligned} \lim _{n\rightarrow \infty }\rho _{A^{\omega }_n}=0 \end{aligned}$$
(2.13)

where \(A_n^{\omega }\) is the same as in Corollary 2.2. Furthermore, let \(W\) be a Poisson random variable with the parameter \(t(1-\rho )\). Then for any \(n>r(d_\ell +6)\) there exists a sequence of i.i.d. random variables \(\eta _1,\eta _2,\ldots \) independent of \(W\) such that \({\mathbb P}\{\eta _1\in \{ 1,\ldots ,[\frac{n}{r}]\}\}=1\) and the compound Poisson random variable \(Z=\sum _{k=1}^W\eta _k\) satisfies

$$\begin{aligned}&\sup _{L\subset {\mathbb N}}\left| {\mathbb P}\{S^A_N\in L\}-{\mathbb P}\{ Z\in L\}\right| \le 2^{2\ell +7}(1+ \psi _0)^{2\ell }\left( d_\ell \ell ^2n^4e^{-{\Gamma }n/2}\right. \nonumber \\&\quad \left. +\,\,\psi _n(1-e^{-{\Gamma }})^{-1}\right) +2\wp \left( 10(1+\psi _0)^{2\ell }d_\ell n^2(t+1) e^{-{\Gamma }n/2}+2^\ell t\psi _n\right) . \end{aligned}$$
(2.14)

We will see in Lemma 3.7 that for any nonperiodic sequence \({\omega }\in {\Omega }_{\mathbb P}\) both (2.13) and \(r^{\omega }_n=\pi (A^{\omega }_n)\rightarrow \infty \) as \(n\rightarrow \infty \) hold true, which together with appropriate estimates for the distribution of the \(\eta _i\)’s constructed in Theorem 2.3 will yield the following result.

Corollary 2.4

Under conditions and notations of Theorem 2.3, for any nonperiodic sequence \({\omega }\in {\Omega }_{\mathbb P}\) the sum \(S_N^{A_n^{\omega }}\) converges in distribution as \(n\rightarrow \infty \) to a Poisson random variable with the parameter \(t\).

The estimates from [6] will play an important role in the proofs of both Theorems 2.1 and 2.3. We observe that both Theorem 2.3 and Corollary 2.4 seem to be new even for the “conventional” case \(\ell =1\). Some compound Poisson approximation estimates were obtained in [18], but there they deal only with cylinders \(A=A^a_n\) constructed from periodic sequences \(a\) and only with geometrically distributed random variables \(\eta _i\) appearing in the compound Poisson random variable \(Z\).
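The quantity \({\kappa }\) of (2.11) is elementary to compute: it is the least \(l\ge 1\) such that \(r\) divides \(d_il\) for every \(i\) (cf. Lemma 3.6 below). A short illustrative Python sketch (ours, not part of the original text):

```python
from functools import reduce
from math import gcd

def kappa(r, ds):
    """kappa of (2.11): lcm over i of r / gcd(r, d_i), i.e. the least l >= 1
    such that r divides d_i * l for every i (cf. Lemma 3.6)."""
    return reduce(lambda x, y: x * y // gcd(x, y), (r // gcd(r, d) for d in ds), 1)

print(kappa(6, [1, 2, 3]))   # lcm(6, 3, 2) = 6
print(kappa(4, [2, 6]))      # lcm(2, 2) = 2
```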

Remark 2.5

In fact, all of the above results can be formulated assuming that \(N\sim t({\mathbb P}(A))^{-\ell }\) instead of \(N=[t({\mathbb P}(A))^{-\ell }]\) where \(A\in {\mathcal C}_n\). Indeed, in view of Lemma 3.2 of the next section, for all \(k\) sufficiently large in comparison with \(n\),

$$\begin{aligned} {\mathbb E}X^A_k\le C({\mathbb P}(A))^\ell \end{aligned}$$

for some \(C>0\), and so

$$\begin{aligned} \sum _{\min (N,t({\mathbb P}(A))^{-\ell })\le k\le \max (N,t({\mathbb P}(A))^{-\ell })} \quad X^A_k\rightarrow 0\,\,\text{ in } \text{ probability } \text{ as }\,\, n\rightarrow \infty . \end{aligned}$$

Next, for any \(A\in {\mathcal C}_n\) define

$$\begin{aligned} \tau _A({\omega })=\min \bigg \{ k\ge 1:\,\prod _{i=1}^\ell {\mathbb I}_A\circ T^{q_i(k)}({\omega })=1\bigg \} \end{aligned}$$
(2.15)

which is the first time \(k\) for which the multiple recurrence event \(\{ T^{q_i(k)}{\omega }\in A,\, i=1,\ldots ,\ell \}\) occurs. Then, combining Theorems 2.1 and 2.3 (applied to the event that none of the multiple recurrence events in question occurs), we derive

Corollary 2.6

Suppose that \({\mathbb P}\) is \(\psi \)-mixing and \(q_i(k)=d_ik,\, i=1,\ldots ,\ell \). Then for any \(A\in {\mathcal C}_n\) with \({\mathbb P}(A)>0\) and \(t\ge 0\),

$$\begin{aligned}&\left| {\mathbb P}\{ ({\mathbb P}(A))^\ell \tau _A>t\}-e^{-(1-\rho _A)t}\right| \nonumber \\&\quad \le \, 2^{2\ell +8}(1+\psi _0)^{2\ell }(t+1)\left( \psi _n(1-e^{-{\Gamma }})^{-1} +d_\ell \ell ^2n^4(1+t^{-1})e^{-{\Gamma }n/(d_\ell +6)}\right) \nonumber \\&\qquad +2\wp \left( 2^\ell t\psi _n+10e^{-{\Gamma }n/2}(1+\psi _0)^{2\ell }d_\ell n^2(t+1)\right) , \end{aligned}$$
(2.16)

provided \(\psi _n<(3/2)^{1/(\ell +1)}-1\).

We observe that when \(\ell =1\) the random variable \(\tau _A\) can be treated as a stopping time, and most direct methods which deal with \(\tau _A\) in this case are based on this fact (see, for instance, [2, 3] and references there). In our nonconventional situation when \(\ell >1\) the random variable \(\tau _A\) depends on the future and it is difficult to deal with it directly. For this reason we obtain Corollary 2.6 as an immediate consequence of the Poisson and compound Poisson approximations applied to the case when no multiple recurrence event occurs until time \(t({\mathbb P}(A))^{-\ell }\). Observe that if we replace \(t\) in (2.16) by \(u=t(1-\rho _A)\) then, in view of (2.12), Corollary 2.6 takes the form of Theorem 1 from [2], which, however, dealt only with the “conventional” \(\ell =1\) case. Recall also that arrivals to more general sets than cylinders may lead to limiting distributions other than Poisson or compound Poisson (see [23]), but it seems plausible that some of our results can be extended to arrivals to balls of more general dynamical systems, as considered in the “conventional” case in [7, 14] and in some references there.
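The hitting time (2.15) is straightforward to evaluate along a single realization, which may help the reader's intuition about the scale \(({\mathbb P}(A))^{-\ell }\) appearing in Corollary 2.6. The following Monte Carlo sketch (ours, with an i.i.d. fair-coin source assumed purely for illustration) computes \(\tau _A\) for \(\ell =2\), \(d=(1,2)\) and a 3-cylinder.

```python
import random

def tau_A(omega, pattern, ds):
    """tau_A of (2.15): the first k >= 1 such that `pattern` starts at each of
    the positions d_1 k, ..., d_l k of `omega`; None if this does not happen
    within the available part of `omega`."""
    n, k = len(pattern), 1
    while max(ds) * k + n <= len(omega):
        if all(omega[d * k:d * k + n] == pattern for d in ds):
            return k
        k += 1
    return None

random.seed(1)
omega = [random.randint(0, 1) for _ in range(200000)]
# P(A) = 1/8 and l = 2, so tau_A is typically of order (P(A))^{-2} = 64.
print(tau_A(omega, [1, 1, 0], ds=[1, 2]))
```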

In Sect. 5 we will derive from Corollary 2.6 together with the Shannon–McMillan–Breiman theorem the following result.

Corollary 2.7

Suppose that (2.10) holds true and the \(\psi \)-mixing coefficient of \({\mathbb P}\) satisfies

$$\begin{aligned} \sum _{n=1}^\infty \psi _n\ln n<\infty . \end{aligned}$$
(2.17)

Then for \({\mathbb P}\)-almost all \({\omega }\in {\Omega }\),

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\big (\ln \tau _{A^{\omega }_n}+\ell \ln {\mathbb P}(A^{\omega }_n)\big )=0 \,\,\,\,{\mathbb P}\text{-a.s. } \end{aligned}$$
(2.18)

If, in addition,

$$\begin{aligned} -\sum _{a\in {\mathcal A}}{\mathbb P}([a])\ln {\mathbb P}([a])<\infty \end{aligned}$$
(2.19)

then for \({\mathbb P}\)-almost all \({\omega }\in {\Omega }\),

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\ln \tau _{A^{\omega }_n}=\ell h_{\mathbb P}(T)\,\,\,\, {\mathbb P}\text{-a.s. } \end{aligned}$$
(2.20)

where \(h_{\mathbb P}(T)\) is the Kolmogorov–Sinai entropy of the shift \(T\) with respect to its invariant measure \({\mathbb P}\).

In the “conventional” \(\ell =1\) case, relying on the estimates of Theorem 1 from [2], it is possible to obtain (2.20) under conditions weaker than (2.17).

Under additional independence conditions we obtain a compound Poisson approximation of \(S^A_N\) which is more specific than the one in Theorem 2.3, being constructed via independent geometrically distributed random variables.

Theorem 2.8

Suppose that the conditions of Theorem 2.3 are satisfied and assume in addition that \({\mathbb P}\) is the product stationary probability on \({\mathcal A}^{\mathbb N}\), i.e. that the coordinate projections from \({\Omega }\) to \({\mathcal A}\) are i.i.d. random variables. Let \(W\) be the same as in Theorem 2.3 and \(\zeta _1, \zeta _2,\ldots \) be i.i.d. random variables independent of \(W\) and having the geometric distribution with the parameter \(\rho \) defined in Theorem 2.3, i.e. \({\mathbb P}\{\zeta _1=k\}=(1-\rho )\rho ^{k-1},\, k=1,2,\ldots \). Let \(Y=\sum _{k=1}^W\zeta _k\). Then

$$\begin{aligned}&\sup _{L\subset {\mathbb N}}\left| {\mathbb P}\{S^A_N\in L\}-{\mathbb P}\{ Y\in L\}\right| \le 2^{2\ell +8}(t+1)\ell ^2d_\ell n^4e^{-{\Gamma }n/2}\nonumber \\&\quad +\,\,2\wp \left( 12d_\ell n^2(t+1) e^{-{\Gamma }n/2}\right) +\frac{t^{[n/r]+1}}{([n/r]+1)!}. \end{aligned}$$
(2.21)

where \(A,\, N,\) and \(r\) are the same as in Theorem 2.3.

We observe that when \({\mathbb P}\) is a product stationary measure on \({\Omega }={\mathcal A}^{\mathbb N}\) then \({\Omega }_{\mathbb P}={\Omega }\) since then, in view of our assumption that \({\mathbb P}([a])>0\) for all \(a\in {\mathcal A}\), any cylinder set has positive probability. The following result, which seems to be new even in the “conventional” \(\ell =1\) case, says that under the independence conditions of Theorem 2.8 the limiting distribution, either Poisson or compound Poisson, always exists.

Theorem 2.9

Suppose that the conditions of Theorem 2.8 are satisfied. For each \({\omega }\in {\Omega }\) and \(n\ge 1\) set \(U^{\omega }_n=S^{A^{\omega }_n}_{N^{\omega }_n}\) where \(A^{\omega }_n\) is the same as in Corollary 2.2, \(N^{\omega }_n=[t({\mathbb P}(A^{\omega }_n))^{-\ell }]\) and \(t>0\). Let \(r^{\omega }_n=\pi (A^{\omega }_n)\), \(R^{\omega }_n=[{\omega }_0,{\omega }_1,\ldots ,{\omega }_{r^{\omega }_n-1}]\) and \(\rho ^{\omega }_n={\mathbb P}(R^{\omega }_n)\). Then the limits

$$\begin{aligned} \lim _{n\rightarrow \infty }r^{\omega }_n=r^{\omega }\quad \text{ and }\quad \lim _{n\rightarrow \infty } \rho _n^{\omega }=\rho ^{\omega }\end{aligned}$$
(2.22)

exist for any \({\omega }\in {\Omega }\).

(i)

    If \({\omega }\) is not a periodic point then \(U^{\omega }_n\) converges in distribution as \(n\rightarrow \infty \) to a Poisson random variable with the parameter \(t\).

(ii)

    If \({\omega }\) is a periodic point (sequence) of a minimal period \(d\) then \(r^{\omega }=d\) and \(\rho ^{\omega }=({\mathbb P}(A^{\omega }_d))^{k_0}\) with

    $$\begin{aligned} k_0=\frac{{\kappa }^{\omega }}{r^{\omega }}\sum _{i=1}^\ell d_i \end{aligned}$$

    where \({\kappa }^{\omega }\) is defined by (2.11) considered with \(r^{\omega }\) in place of \(r\) there. Furthermore, let \(W\) be a Poisson random variable with the parameter \(t(1-\rho ^{\omega })\) and \(\zeta _1,\zeta _2, \ldots \) be a sequence of i.i.d. random variables independent of \(W\) having the geometric distribution with the parameter \(\rho ^{\omega }\), i.e. \({\mathbb P}\{\zeta _1=k\}=(1-\rho ^{\omega })(\rho ^{\omega })^{k-1},\, k=1,2,\ldots \). Then \(U^{\omega }_n\) converges in distribution to \(Y^{\omega }=\sum _{k=1}^W\zeta _k\) as \(n\rightarrow \infty \).

A number theoretic (combinatorial) application of Theorem 2.9 can be described in the following way. Fix a point \(a\in [0,1)\) having a base \(m\ge 2\) expansion \(a=\sum _{i=0}^\infty a_im^{-(i+1)}\). For each point \({\omega }\in [0,1)\) with a base \(m\) expansion \({\omega }=\sum _{i=0}^\infty {\omega }_im^{-(i+1)}\) count the number \(N_n({\omega })\) of those \(i\le tm^{n\ell }\) for which the initial \(n\)-string \(a_0,a_1,\ldots ,a_{n-1}\) of \(a\) is repeated starting at each of the places \(d_1i,d_2i,\ldots ,d_\ell i\) in the expansion of \({\omega }\). Then the random variable \(N_n({\omega })\) on the probability space \(([0,1],\,\text{Leb})\) converges in distribution either to a Poisson (with parameter \(t\)) or to a compound Poisson random variable depending on whether the expansion of \(a\) is nonperiodic or periodic. In particular, for all irrational and for some rational \(a\) the limiting distribution will be Poissonian. This result seems to be new even in the “conventional” \(\ell =1\) case when we count the number \(N_n({\omega })\) of those \(i\le tm^n\) for which the initial \(n\)-string \(a_0,a_1,\ldots ,a_{n-1}\) of \(a\) appears starting from the place \(i\) in the expansion of \({\omega }\).
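A toy numerical version of this counting (our illustration only; the parameters are hypothetical) in base \(m=2\) with \(n=3\), \(d=(1,2)\) and \(t=1\), sampling \({\omega }\) uniformly so that its binary digits are i.i.d. fair coin tosses:

```python
import random

def digit_string_count(omega_digits, pattern, ds, N):
    """Number of i in {1, ..., N} for which `pattern` starts at every position
    d_1 i, ..., d_l i of the digit sequence omega_digits."""
    n = len(pattern)
    return sum(1 for i in range(1, N + 1)
               if all(omega_digits[d * i:d * i + n] == pattern for d in ds))

random.seed(2)
m, ds, t = 2, (1, 2), 1.0
pattern = [0, 1, 0]                          # initial n-string of a, n = 3
N = int(t * m ** (len(pattern) * len(ds)))   # N = t * m^{n*l} = 64
omega_digits = [random.randrange(m) for _ in range(max(ds) * N + len(pattern))]
print(digit_string_count(omega_digits, pattern, ds, N))
```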

Since we also allow countable alphabet shift spaces, our results are applicable to continued fraction expansions as well. There the Gauss map \(Gx=\frac{1}{x}\) (mod 1), \(x\in (0,1],\, G0=0\), preserves the Gauss measure \(\mu _G({\Gamma })=\frac{1}{\ln 2}\int _{\Gamma }\frac{dx}{1+x}\) which is exponentially fast \(\psi \)-mixing (see, for instance, [16]), and \(G\) (restricted to points with infinite continued fraction expansions) is conjugate to the left shift on a sequence space with a countable alphabet. Thus Theorems 2.1 and 2.3 are applicable here as well. On the other hand, Theorem 2.9 does not work for the Gauss measure, which does not produce i.i.d. digits in continued fraction expansions (see, for instance, [22]), and so in order to apply Theorem 2.9 here we have to take other probability measures, conjugate to product measures on the shift space, which produce i.i.d. continued fraction digits. Moreover, these results remain valid for a larger class of transformations generated by so-called \(f\)-expansions (see [16]).
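For completeness, a tiny sketch (ours, purely illustrative) reading continued fraction digits off the orbit of the Gauss map, which is the symbolic coding referred to above; occurrences of a prescribed digit string then correspond to visits to cylinders of the conjugate shift.

```python
import math

def cf_digits(x, length):
    """Continued fraction digits of x in (0, 1) read off the Gauss map orbit
    G x = 1/x (mod 1): the k-th digit is floor(1 / G^{k-1} x)."""
    digits = []
    for _ in range(length):
        y = 1.0 / x
        a = int(y)
        digits.append(a)
        x = y - a
    return digits

print(cf_digits(math.sqrt(2) - 1, 6))   # [2, 2, 2, 2, 2, 2]: a periodic expansion
```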

It follows from both Corollary 2.4 and Theorem 2.9 that the nonconvergence example from Sect. 3.4 in [18] is not correct and, in fact, the limiting distribution exists there and is Poissonian, since the cylinder sets constructed there were based on a nonperiodic point and the probability measure on the sequence space was a product measure, whence both Corollary 2.4 and Theorem 2.9 above are applicable. Still, we construct in Sect. 7 an example of a periodic point \({\omega }\in {\Omega }_{\mathbb P}\) and of a \(\psi \)-mixing shift invariant measure such that \(S^{A^{\omega }_n}_{N^{\omega }_n}\) considered with \(\ell =1\) and \(d_1=1\) does not have a limiting distribution as \(n\rightarrow \infty \). Namely, in order for \(S^{A^{\omega }_n}_{N^{\omega }_n}\) to converge in distribution as \(n\rightarrow \infty \), where \({\omega }\in {\Omega }_{\mathbb P}\) is a periodic point, it is necessary in view of Theorem 2.3 that \({\mathbb P}\{ S^{A^{\omega }_n}_{N^{\omega }_n}=0\}\), which is close to \(\exp (-t(1-\rho _n^{\omega }))\), converges to a limit as \(n\rightarrow \infty \), i.e. \(\lim _{n\rightarrow \infty }\rho _n^{\omega }\) must exist, which will not be the case in our example. Moreover, the \(\psi \)-mixing coefficient \(\psi _l\) in our example equals zero for any \(l\ge 1\) while \(\psi _0<\infty \), i.e. the situation there is as close as possible to independence, where Theorem 2.9 asserts convergence for all points \({\omega }\in {\Omega }\).

Remark 2.10

Observe that our convergence theorems above are based on the approximation estimates of Theorems 2.1 and 2.3, which yield convergence of distributions with respect to the total variation distance, but for the integer valued random variables we are dealing with this is equivalent to convergence in distribution.

Remark 2.11

All of the above results were stated for one-sided shifts, but they remain true essentially without changes for two-sided shifts as well. We can also restrict the discussion to a subshift of finite type space, considering invariant probability measures supported there. This enables us to apply the results to Axiom A diffeomorphisms (see [9]) considered with their Gibbs invariant measures, relying on their symbolic representations via Markov partitions.

3 Auxiliary lemmas

We start with the following result.

Lemma 3.1

Suppose that \({\mathbb P}\) is \(\psi \)-mixing. Then there exists a constant \({\Gamma }>0\) such that for any \(A\in {\mathcal C}_n\),

$$\begin{aligned} {\mathbb P}(A)\le e^{-{\Gamma }n}. \end{aligned}$$
(3.1)

Proof

The proof is contained in Lemma 1 from [1] and in Lemma 1 from [15]. In both places the authors assume summability of \(\psi _n\) but, in fact, both proofs use only convergence of \(\psi _n\) to zero as \(n\rightarrow \infty \), which is \(\psi \)-mixing. In both papers the result is formulated with a constant in front of the exponent in (3.1), but it is clear from the proofs there that this constant can be taken equal to 1. Furthermore, the authors there work with finite alphabet shift spaces, but the proof of this result remains valid without any changes for countable alphabets as well.\(\square \)

We will also need the following result which is, essentially, well known, but for the readers’ convenience we give its simple proof here.

Lemma 3.2

Let \(Q_1,\ldots ,Q_l,\, l\ge 1\) be subsets of nonnegative integers such that for \(i=2,\ldots ,l-1\),

$$\begin{aligned} \max Q_{i-1}+k\le \min Q_i\le \max Q_i\le \min Q_{i+1}-k \end{aligned}$$
(3.2)

for some integer \(k>0\). Then for any \(U_i\in {\mathcal F}_{Q_i},\, i=1,\ldots ,l\),

$$\begin{aligned} \left| {\mathbb P}(\mathop {\cap }\limits _{i=1}^lU_i)-\prod _{i=1}^l{\mathbb P}(U_i)\right| \le \left( (1+\psi _k)^l-1\right) \prod _{i=1}^l {\mathbb P}(U_i). \end{aligned}$$
(3.3)

Proof

By the definition (2.1) and (2.2) of the coefficient \(\psi \),

$$\begin{aligned} \bigg |{\mathbb P}\left( \mathop {\cap }\limits _{i=1}^{j+1}U_i\right) -{\mathbb P}\left( \mathop {\cap }\limits _{i=1}^jU_i\right) {\mathbb P}\left( U_{j+1}\right) \bigg |\le \psi _k{\mathbb P}\left( \mathop {\cap }\limits _{i=1}^jU_i\right) {\mathbb P}(U_{j+1}) \end{aligned}$$
(3.4)

for \(j=1,\ldots ,l-1\). Applying (3.4) successively \(l-1\) times we obtain

$$\begin{aligned} \left| {\mathbb P}\left( \mathop {\cap }\limits _{i=1}^lU_i\right) -\prod _{i=1}^l{\mathbb P}(U_i)\right| \le \psi _k\sum _{j=1}^{l-1} \bigg ({\mathbb P}\left( \mathop {\cap }\limits _{i=1}^jU_i\right) \prod _{i=j+1}^l{\mathbb P}(U_i)\bigg ). \end{aligned}$$
(3.5)

Furthermore, applying (3.4) successively \(j-1\) times we see that

$$\begin{aligned} {\mathbb P}\left( \mathop {\cap }\limits _{i=1}^{j}U_i\right) \le (1+\psi _k){\mathbb P}\left( \mathop {\cap }\limits _{i=1}^{j-1}U_i\right) {\mathbb P}(U_j)\le (1+\psi _k)^{j-1}\prod _{i=1}^j{\mathbb P}(U_i). \end{aligned}$$
(3.6)

This together with (3.5) yields (3.3).\(\square \)

Lemma 3.3

Let \(Q\) and \(\tilde{Q}\) be two subsets of nonnegative integers such that at least one of these sets is finite and

$$\begin{aligned} d=\min _{i\in Q,\, j\in \tilde{Q}}|i-j|>0. \end{aligned}$$

Let \(k\) bound the number of components of \(Q\) and \(\tilde{Q}\) which are separated by some elements of the other set. Assume that \(\psi _d<2^{1/k}-1\). Then

$$\begin{aligned} \psi \left( {\mathcal F}_Q,{\mathcal F}_{\tilde{Q}}\right) \le 2^{2k+2}\psi _d(2-(1+\psi _d)^k)^{-2}. \end{aligned}$$
(3.7)

Proof

Suppose, for instance, that \(Q\) is a finite set. Then \(Q\) and \(\tilde{Q}\) can be represented as disjoint unions

$$\begin{aligned} Q=\mathop {\cup }\limits _{i=1}^kQ_{2i-1}\,\,\text{ and }\,\, \tilde{Q}=\cup _{i=1}^kQ_{2i} \end{aligned}$$

such that \(Q_1\) and \(Q_{2k}\) may be empty sets while all \(Q_j,\, j=2,\ldots ,2k-1\) are nonempty and

$$\begin{aligned} \max Q_{j-1}+d\le \min Q_j\le \max Q_j\le \min Q_{j+1}-d \end{aligned}$$
(3.8)

for \(j=2,\ldots ,2k-1\) where if \(j=2\) and \(Q_1=\emptyset \) or if \(j=2k-1\) and \(Q_{2k}=\emptyset \) then we disregard the first or the last inequality in (3.8), respectively.

Next, let \(U=\cap _{i=1}^kU_{2i-1}\) and \(V=\cap _{i=1}^kU_{2i}\) where \(U_{2i-1}\in {\mathcal F}_{Q_{2i-1}}\) and \(U_{2i}\in {\mathcal F}_{Q_{2i}},\, i=1,\ldots ,k\). Then by Lemma 3.2,

$$\begin{aligned} \bigg |{\mathbb P}(U\cap V)-\prod _{j=1}^{2k}{\mathbb P}(U_j)\bigg |\le \big ((1+\psi _d)^{2k}-1\big ) \prod _{j=1}^{2k}{\mathbb P}(U_j),\end{aligned}$$
(3.9)
$$\begin{aligned} \bigg |{\mathbb P}(U)-\prod _{i=1}^{k}{\mathbb P}(U_{2i-1})\bigg |\le \big ((1+\psi _d)^{k}-1\big ) \prod _{i=1}^{k}{\mathbb P}(U_{2i-1}) \end{aligned}$$
(3.10)

and

$$\begin{aligned} \bigg |{\mathbb P}(V)-\prod _{i=1}^{k}{\mathbb P}(U_{2i})\bigg |\le \big ((1+\psi _d)^{k}-1\big ) \prod _{i=1}^{k}{\mathbb P}(U_{2i}). \end{aligned}$$
(3.11)

Combining (3.9), (3.10) and (3.11) we obtain that

$$\begin{aligned} \bigg |{\mathbb P}(U\cap V)-{\mathbb P}(U){\mathbb P}(V)\bigg |\le \big ((1+\psi _d)^{2k}-1\big )\prod _{i=1}^{k} {\mathbb P}(U_{2i\!-\!1})\bigg ({\mathbb P}(V)\!+\!2\prod _{i=1}^{k}{\mathbb P}(U_{2i})\bigg ).\nonumber \\ \end{aligned}$$
(3.12)

Next, by Lemma 3.2,

$$\begin{aligned} {\mathbb P}(U)\ge (2-(1+\psi _d)^k)\prod _{i=1}^{k}{\mathbb P}(U_{2i-1})\,\,\text{ and }\,\, {\mathbb P}(V)\ge (2-(1+\psi _d)^k)\prod _{i=1}^{k}{\mathbb P}(U_{2i}),\nonumber \\ \end{aligned}$$
(3.13)

which together with (3.12) yields that

$$\begin{aligned}&|{\mathbb P}(U\cap V)-{\mathbb P}(U){\mathbb P}(V)|\nonumber \\&\quad \le ((1+\psi _d)^{2k}-1)(4-(1+\psi _d)^k)(2-(1+\psi _d)^k)^{-2}{\mathbb P}(U){\mathbb P}(V). \end{aligned}$$
(3.14)

The inequality (3.14) remains true when \(U\) and \(V\) are disjoint unions of intersections of sets as above and it will still hold under monotone limits of sets \(U\in {\mathcal F}_Q\) and \(V\in {\mathcal F}_{\tilde{Q}}\). Hence, it holds true for all \(U\in {\mathcal F}_Q\) and \(V\in {\mathcal F}_{\tilde{Q}}\), and so (3.7) follows (see [8]).\(\square \)

We will need also the following estimate of the total variation distance between two Poisson distributions.

Lemma 3.4

For any \(\lambda ,\gamma >0\),

$$\begin{aligned} \sum _{l=0}^\infty \left| P_{\lambda }(l)-P_{\gamma }(l)\right| \le 2e^{|\lambda -\gamma |}|\lambda -\gamma |=2\wp (|{\lambda }-{\gamma }|). \end{aligned}$$
(3.15)

Proof

Assume, for instance, that \({\lambda }\ge {\gamma }\). Then

$$\begin{aligned}&\sum _{n=0}^\infty \left| e^{-{\lambda }}\frac{{\lambda }^n}{n!}-e^{-{\gamma }}\frac{{\gamma }^n}{n!}\right| \le \sum _{n=0}^\infty e^{-{\lambda }}\frac{{\lambda }^n}{n!}\left( \left| 1-e^{{\lambda }-{\gamma }}\right| \right. \\&\quad \left. +\,\,e^{{\lambda }-{\gamma }}\left| 1-\left( \frac{{\gamma }}{{\lambda }}\right) ^n\right| \right) \le \left| 1-e^{{\lambda }-{\gamma }}\right| + e^{{\lambda }-{\gamma }}\left| {\lambda }-{\gamma }\right| \le 2({\lambda }-{\gamma })e^{{\lambda }-{\gamma }} \end{aligned}$$

and (3.15) follows.\(\square \)
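A quick numerical sanity check of (3.15) (ours, not part of the original text); the truncation of the sum at 50 terms is harmless for parameters of this size.

```python
import math

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam, gam = 1.0, 1.2
lhs = sum(abs(poisson_pmf(lam, k) - poisson_pmf(gam, k)) for k in range(50))
rhs = 2 * abs(lam - gam) * math.exp(abs(lam - gam))   # the bound 2*wp(|lam - gam|)
print(lhs, rhs)   # the (truncated) total variation sum stays below the bound
```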

The following two lemmas will be used in the proof of Theorem 2.3.

Lemma 3.5

Let \(H=[a_{0},\ldots ,a_{h-1}]\in {\mathcal C}_h,\, h\ge 1\) and either \(\pi (H)=h\) or \(h\) is not divisible by \(\pi (H)\). Then \(\pi (H^{n/h})=h\) for each \(n\ge 2h\).

Proof

For each \(B\in {\mathcal C}_m\) set

$$\begin{aligned} {\mathcal O}(B)=\left\{ 1\le k\le m:\, B\cap T^{-k}B\ne \emptyset \right\} , \end{aligned}$$

so that \(\pi (B)=\min ({\mathcal O}(B))\). It follows from Theorem 6 and Remark 7 from [25] that we can also write

$$\begin{aligned} \begin{aligned}&{\mathcal O}(B)=\left\{ \pi (B),2\pi (B),\ldots ,\left[ \frac{m}{\pi (B)}\right] \pi (B)\right\} \\&\cup \{ k\in \{ m-\pi (B)+1,\ldots ,m\}:\, B\cap T^{-k}B\ne \emptyset \}. \end{aligned} \end{aligned}$$
(3.16)

Let \(n\ge 2h\) and set \(\pi (H^{n/h})=k\). Clearly, \(h\in {\mathcal O}(H^{n/h})\) and so \(h\ge k\). We suppose that \(h>k\) and arrive at a contradiction. By the definition of \(\pi \) it follows that \(H^{n/h}\cap T^{-k}(H^{n/h})\ne \emptyset \). Therefore, \([a_{k},\ldots ,a_{h-1}]=[a_{0},\ldots ,a_{h-1-k}]\) and \([a_{0},\ldots ,a_{k-1}]=[a_{h-k},\ldots ,a_{h-1}]\), and so \(k,h-k\in \mathcal {O}(H)\). Thus, \(k\ge \pi (H)\) and if \(\pi (H)=h\) we would have \(k\ge h\) which contradicts our assumption. Hence, it remains to consider the case when \(h\) is not divisible by \(\pi (H)\). Since \(k,h-k\in \mathcal {O}(H)\) we have \(\pi (H)\le k,h-k\), and so \(k,h-k\le h-\pi (H)\). This together with (3.16) yields that there exist integers \(a\) and \(b\) such that \(1\le a,b\le [\frac{h}{\pi (H)}],\, k=a\cdot \pi (H)\) and \(h-k=b\cdot \pi (H)\). Therefore, \(h=(h-k)+k=(a+b)\pi (H)\) which contradicts our assumption that \(h\) is not divisible by \(\pi (H)\). Hence, \(h=k\) and the proof is complete. \(\square \)
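The structure (3.16) of the overlap set is easy to observe numerically; the following sketch (ours, purely illustrative) lists \({\mathcal O}(B)\) for a sample word.

```python
def overlap_set(b):
    """O(B) for B = [b_0, ..., b_{m-1}]: the k in {1, ..., m} with B ∩ T^{-k}B
    nonempty, i.e. b agrees with its k-shift on the overlap."""
    m = len(b)
    return {k for k in range(1, m + 1)
            if all(b[i + k] == b[i] for i in range(m - k))}

# Per (3.16): multiples of pi(B) up to [m/pi(B)]*pi(B), plus possibly some k in
# the last window {m - pi(B) + 1, ..., m}.
b = "0100101001"                 # here pi(B) = 5
O = overlap_set(b)
print(sorted(O), min(O))         # [5, 8, 10] 5
```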

The following result will be used in the proof of Theorem 2.3.

Lemma 3.6

Let \(q_i,\, i=1,\ldots ,\ell \) be as in (2.10) and \(n\ge r(d_\ell +1)\). For any positive integer \(m\) satisfying \(m>2d_\ell n\) set

$$\begin{aligned} {\mathcal N}={\mathcal N}_{m,n}=\{l=1,2,\ldots , n:\,\{X_m=1,X_{m+l}=1\}\ne \emptyset \}. \end{aligned}$$

Then \(\min {\mathcal N}={\kappa }\) where \({\kappa }\) is defined by (2.11) with \(n\) being the length of a cylinder \(A\) there and \(r=\pi (A)\).

Proof

It follows from (3.16) that if \(1\le l\le \frac{n-r}{d_{\ell }}\) then \(l\in \mathcal {N}\) if and only if \(r\) divides \(d_il\) for each \(i=1,\ldots ,\ell \). By definition \(r\) divides \({\kappa }d_i\), and by the assumption of the lemma \({\kappa }\le r\le \frac{n-r}{d_{\ell }}\). Hence, \({\kappa }\in \mathcal {N}\), and so \({\kappa }\ge y=\min {\mathcal N}\). Now let \(l\in \mathcal {N}\) satisfy \(l\le \frac{n-r}{d_{\ell }}\). Then \(r\) divides \(d_il\) for each \(1\le i\le \ell \), and so \(\frac{r}{\gcd \{r,d_{i}\}}\) divides \(l\). Thus, \({\kappa }\) divides \(l\), and so \({\kappa }\le l\). It follows that \({\kappa }\le y\), completing the proof of the lemma.\(\square \)

In Sects. 5 and 6 we will need the following results. As in Corollary 2.2 for each \({\omega }=({\omega }_0,{\omega }_1,\ldots )\in {\Omega }\) set \(A_n^{\omega }=[{\omega }_0,\ldots ,{\omega }_{n-1}]\), \(r^{\omega }_n=\pi (A^{\omega }_n)\) and \(R^{\omega }_n=[{\omega }_0,\ldots ,{\omega }_{r^{\omega }_n-1}]\). Next, define \({\kappa }={\kappa }_n^{\omega }\) and \(\rho =\rho _n^{\omega }\) by (2.11) with \(r=r_n^{\omega }\) and \(A=A_n^{\omega }\).

Lemma 3.7

For any \({\omega }\in \Omega \) the limit \(r^{{\omega }}={\mathop {\lim }\limits _{n\rightarrow \infty }r_{n}^{{\omega }}}\) exists. Furthermore, if \({\omega }\) is a periodic point with period \(d\in \mathbb {N}^{+}\) (i.e. the whole path \(\{T^{k}{\omega }\,:\, k\ge 0\}\) of \({\omega }\) consists of \(d\) points) then \(r^{{\omega }}=d\), otherwise \(r^{{\omega }}=\infty \). Assume that \({\mathbb P}\) is \(\psi \)-mixing. If \({\omega }\in {\Omega }_{\mathbb P}\) is not a periodic point then

$$\begin{aligned} \lim _{n\rightarrow \infty }\rho _n^{\omega }=0. \end{aligned}$$
(3.17)

Furthermore, if the independence conditions of Theorem 2.9 are satisfied then the limit \(\rho ^{{\omega }}={\mathop {\lim }\limits _{n\rightarrow \infty }\rho _{n}^{{\omega }}}\) always exists. Moreover, in this case if \({\omega }\) is a periodic point with period \(d\in \mathbb {N}^{+}\) then \(\rho ^{{\omega }}=\big (\mathbb {P}([{\omega }_{0},\ldots ,{\omega }_{d-1}]) \big )^{k_0}\) with \(k_0\) defined in Theorem 2.9 (ii), and otherwise \(\rho ^{{\omega }}=0\).

Proof

Assume that \({\omega }\) is periodic with period \(d\in \mathbb {N}^{+}\) and set \(D=[{\omega }_{0},\ldots ,{\omega }_{d-1}]\). By the definition of a period of a point it follows that \(\pi (D)=d\) or \(\pi (D)\) does not divide \(d\). This together with Lemma 3.5 yields that \(\pi (D^{n/d})=d\) for all \(n\ge 2d\). Moreover, \(T^{d}({\omega })={\omega }\) which implies that \(D^{n/d}=A_{n}^{{\omega }}\) for each \(n\ge d\). From this it follows that

$$\begin{aligned} {\mathop {\lim }\limits _{n\rightarrow \infty }r_{n}^{{\omega }}}=\underset{n\rightarrow \infty }{\lim }\pi (A_{n}^{{\omega }})=\underset{n\rightarrow \infty }{\lim }\pi (D^{n/d})=d \end{aligned}$$

Now assume that \({\omega }\) is not periodic. Given \(n\ge 1\), from the definition of \(\mathcal {O}\) in Lemma 3.5 it follows that \(r_{n+1}^{{\omega }}\in \mathcal {O}(A_{n}^{{\omega }})\cup \{n+1\}\), which implies that \(r_{n}^{{\omega }}\le r_{n+1}^{{\omega }}\). This holds true for all \(n\ge 1\), and so \(r^{{\omega }}={\mathop {\lim }\limits _{n\rightarrow \infty }r_{n}^{{\omega }}}\) exists and lies in the set \(\mathbb {N}^{+}\cup \{\infty \}\). Assume by contradiction that \(r^{{\omega }}<\infty \); then there exists an integer \(M\ge 1\) such that \(r_{n}^{{\omega }}=r^{{\omega }}\) for all \(n\ge M\). From this it follows that \(A_{n}^{{\omega }}=[{\omega }_{0},\ldots ,{\omega }_{r^{{\omega }}-1}]^{n/r^{{\omega }}}\) for all such \(n\), which implies that \({\omega }=[{\omega }_{0},\ldots ,{\omega }_{r^{{\omega }}-1}]^{\infty }\), and so \({\omega }\) is a periodic point, which contradicts our assumption. It follows that \(r^{{\omega }}=\infty \) and the assertion concerning \(r^{\omega }\) is proved.

Next, suppose that \({\omega }\in {\Omega }_{\mathbb P}\) is not a periodic point. Let \(r=r_n^{\omega },\, {\kappa }={\kappa }_n^{\omega },\, A=A_n^{\omega }\) and \(R=R_n^{\omega }\). Since \(r\) divides \(d_i{\kappa }\) for each \(i=1,\ldots ,\ell \) we see by (2.1), (2.2) and (3.1) that

$$\begin{aligned} {\mathbb P}\left( R^{(n+d_i{\kappa })/r}\right) \le (1+\psi _0){\mathbb P}(A){\mathbb P}\left( T^{n_r}R^{(n_r+r)/r}\right) \le (1+\psi _0)e^{-{\Gamma }r^{\omega }_n}{\mathbb P}(A) \end{aligned}$$
(3.18)

where \(n_r=n\) (mod \(r)=n-[n/r]r\). Now (3.17) follows since by the above \(r_n^{\omega }\rightarrow \infty \) as \(n\rightarrow \infty \) when \({\omega }\) is not periodic. Under the conditions of Theorem 2.9 the remaining assertion concerning \(\rho ^{\omega }\) follows from the properties of \(r^{\omega }_n\) derived above together with the independence assumption.\(\square \)

We obtain the assertion (2.12) of Theorem 2.3 as a separate lemma.

Lemma 3.8

Assume that \({\mathbb P}\) is \(\psi \)-mixing. Then (2.12) holds true.

Proof

First, clearly \(\rho _A\le 1\) for any \(A\in {\mathcal C}_n\) and each \(n\ge 1\). Next, let \(A_{n}=[a_0^{(n)},\ldots ,a_{n-1}^{(n)}]\in {\mathcal C}_{n}\) and write

$$\begin{aligned} {\mathbb P}\left( R_{n}^{(n+d_1{\kappa }_{n})/r_{n}}|A_{n}\right) =\frac{{\mathbb P}\big ( R_{n}^{(n+d_1{\kappa }_{n})/r_{n}} \big )}{{\mathbb P}(A_n)}=1-{\delta }_n, \quad {\delta }_n\ge 0 \end{aligned}$$
(3.19)

where \(R_n=[a_0^{(n)},\ldots ,a_{r_n-1}^{(n)}]\), \(r_n=\pi (A_n)\) and \({\kappa }_n\) is given by (2.11) with \(r=r_n\). Then we show by induction that for any \(k\in {\mathbb N}\),

$$\begin{aligned} {\mathbb P}\left( R_n^{(n+kd_1{\kappa }_n)/r_n}\right) \!\ge \left( 1-2^{k-1}{\delta }_n\right) {\mathbb P}(A_n). \end{aligned}$$
(3.20)

Indeed, (3.20) is satisfied for \(k=1\) in view of (3.19). Suppose that (3.20) holds true for \(k=m\) and prove it for \(k=m+1\). Set \(n_r=n-r_n[n/r_n]\) and observe that for any \(l\ge 1\),

$$\begin{aligned}&\left( R_n^{(n+(l-1)d_1{\kappa }_n)/r_n}\cap T^{-(n+(l-1)d_1{\kappa }_n)}\left( {\Omega }{\setminus } T^{n_r}R_n^{(n_r+d_1{\kappa }_n)/r_n}\right) \right) \nonumber \\&\quad \cup R_n^{(n+ld_1{\kappa }_n)/r_n}=R_n^{(n+(l-1)d_1{\kappa }_n)/r_n}\subset A_n \end{aligned}$$
(3.21)

and the union above is disjoint. The induction hypothesis together with (3.21) considered with \(l=m\) yields

$$\begin{aligned}&{\mathbb P}\left( R_n^{(n+md_1{\kappa }_n)/r_n}\cap T^{-(n+md_1{\kappa }_n)} \left( {\Omega }{\setminus } T^{n_r}R_n^{(n_r+d_1{\kappa }_n)/r_n}\right) \right) \nonumber \\&\quad \le {\mathbb P}\left( T^{-d_1{\kappa }_n}\left( R_n^{(n+(m-1)d_1{\kappa }_n)/r_n}\cap T^{-(n+(m-1)d_1{\kappa }_n)} \left( {\Omega }{\setminus } T^{n_r}R_n^{(n_r+d_1{\kappa }_n)/r_n}\right) \right) \right) \nonumber \\&\quad ={\mathbb P}\left( R_n^{(n+(m-1)d_1{\kappa }_n)/r_n}\cap T^{-(n+(m-1)d_1{\kappa }_n)} \left( {\Omega }{\setminus } T^{n_r}R_n^{(n_r+d_1{\kappa }_n)/r_n}\right) \right) \nonumber \\&\quad \le {\delta }_n2^{m-1}{\mathbb P}(A_n). \end{aligned}$$
(3.22)

Employing (3.21) with \(l=m+1\) we obtain from (3.22) and (3.20) for \(k=m\) that

$$\begin{aligned} {\mathbb P}\left( R_n^{(n+(m+1)d_1{\kappa }_n)/r_n}\right) \ge {\mathbb P}\left( R_n^{(n+md_1{\kappa }_n)/r_n} \right) -{\delta }_n2^{m-1}{\mathbb P}(A_n)\ge \left( 1-2^m{\delta }_n\right) {\mathbb P}(A_n), \end{aligned}$$

and so (3.20) holds true with \(k=m+1\) completing the induction.

Now observe that by (2.1), (2.2) and (3.1),

$$\begin{aligned} {\mathbb P}\left( R_n^{(n+kd_1{\kappa }_n)/r_n}\right) \le (1+\psi _0)e^{-{\Gamma }kr_n} {\mathbb P}(A_n). \end{aligned}$$
(3.23)

Since always \(r_n\ge 1\) we can choose \(k\) so large that \((1+\psi _0)e^{-{\Gamma }kr_n}<\frac{1}{2}\) for all \(n\) making the right hand side of (3.23) less than \(\frac{1}{2}{\mathbb P}(A_n)\) for all \(n\). Now suppose by contradiction that there exists a subsequence \(l_m\rightarrow \infty \) as \(m\rightarrow \infty \) such that \({\delta }_{l_m}\rightarrow 0\) as \(m\rightarrow \infty \). Then, we can choose \(n=l_m\) in (3.20) with \(m\) so large that the right hand side of (3.20) will be bigger than \(\frac{1}{2}{\mathbb P}(A_n)\) which leads to a contradiction proving the lemma.\(\square \)

4 Poisson approximation

4.1 Proof of Theorem 2.1

If \(t^{-1}{\gamma }(n)(\mathbb {P}(A))^{\ell }\ge \frac{1}{4}\) then the theorem clearly holds true, and so we can assume that \({\gamma }(n)<\frac{t(\mathbb {P}(A))^{-\ell }}{4}\). Set \(U=S_N^A\). If \(t(\mathbb {P}(A))^{-\ell }<4\) then for each \(L\subset \mathbb {N}\),

$$\begin{aligned}&|\mathbb {P}\{U\in L\}-P_{t}(L)|\nonumber \\&\quad \le \mathbb {P}\{U\ne 0\}+|\mathbb {P}\{U=0\}- P_{t}\{0\}|+P_{t}(\mathbb {N}^{+})\le 2\mathbb {P}\{U\ne 0\}\nonumber \\&\qquad +\,\,2P_{t}(\mathbb {N}^{+})\le 2N\mathbb {P}(A) +2(1-e^{-t})\le 8\mathbb {P}(A)+2t\le 16\mathbb {P}(A). \end{aligned}$$
(4.1)

Again the theorem holds true, so we can assume that \(t(\mathbb {P}(A))^{-\ell }\ge 4\) and then \({\gamma }(n)<\frac{t(\mathbb {P}(A))^{-\ell }}{4}\le N\).

Set \(W=\sum \nolimits _{\alpha ={\gamma }(n)}^{N}X_{\alpha }\), where \(X_{\alpha }=X_{\alpha }^A\) was defined in (2.6), and \(\lambda =EW\). For any \(L\subset \mathbb {N}\),

$$\begin{aligned}&|\mathbb {P}\{U\in L\}-P_{t}(L)|\le |\mathbb {P}\{U\in L\}-\mathbb {P}\{W\in L\}|\nonumber \\&\quad +\,\,|\mathbb {P}\{W\in L\}-P_{\lambda }(L)|+|P_{\lambda }(L)-P_{t}(L)|=\delta _{1} +\delta _{2}+\delta _{3} \end{aligned}$$
(4.2)

where \({\delta }_1,{\delta }_2\) and \({\delta }_3\) denote the first, the second and the third terms in the right hand side of (4.2), respectively. We estimate \(\delta _{1}\) by

$$\begin{aligned} \delta _{1}\le 2\mathbb {P}\{U-W>0\}\le 2\underset{\alpha =1}{\overset{{\gamma }(n)}{\sum }}\mathbb {P}\{X_{\alpha }=1\}\le 2{\gamma }(n) \mathbb {P}(A). \end{aligned}$$
(4.3)

In order to estimate \(\delta _{2}\) we use Theorem 1 from [6]. Note that from the assumption \(\psi _{n}<(3/2)^{1/(\ell +1)}-1\) and from Lemma 3.2 it follows that whenever \({\gamma }(n)\le \alpha \le N\),

$$\begin{aligned} \mathbb {P}\{X_{\alpha }=1\}\ge (2-(1+\psi _n)^\ell )(\mathbb {P}(A))^{\ell }> \frac{1}{2}(\mathbb {P}(A))^{\ell }>0. \end{aligned}$$

Hence, the conditions of Theorem 1 from [6] are satisfied with the collection \(\{X_{{\gamma }(n)},\ldots ,X_{N}\}\). For each \({\alpha }\in {\mathbb N}_+\) satisfying \({\gamma }(n)\le \alpha \le N\) set

$$\begin{aligned} B_{\alpha }=\{{\gamma }(n)\le {\beta }\le N:\,\,\,|q_{i} (\alpha )-q_{j}(\beta )|<2n\,\,\text{ for } \text{ some }\,\, i,j=1,2,\ldots ,\ell \}. \end{aligned}$$

Then by Theorem 1 from [6],

$$\begin{aligned} \delta _{2}\le b_{1}+b_{2}+b_{3} \end{aligned}$$
(4.4)

where

$$\begin{aligned} b_{1}&= \underset{\alpha ={\gamma }(n)}{\overset{N}{\sum }}\left( \,\underset{\beta \in B_{\alpha }}{\sum }\mathbb {P}\{X_{\alpha }=1\}\mathbb {P}\{X_{\beta }=1\}\right) ,\\ b_{2}&= \underset{\alpha ={\gamma }(n)}{\overset{N}{\sum }}\left( \,\underset{\alpha \ne \beta \in B_{\alpha }}{\sum }\mathbb {P}\{X_{\alpha }=1,X_{\beta }=1\}\right) \,\, \text{ and }\\ b_{3}&= \underset{\alpha ={\gamma }(n)}{\overset{N}{\sum }}E\big \vert E(X_{\alpha } -EX_\alpha \mid {\mathcal B}_{\alpha })\big \vert \,\,\text{ with }\,\,{\mathcal B}_{\alpha }=\sigma \{X_{\beta } \,:\,\beta \notin B_{\alpha }\}. \end{aligned}$$

From Lemma 3.2 and the fact that \(|B_{\alpha }|\le 4\ell ^{2}n\) for each \(\alpha \) it follows that

$$\begin{aligned} b_{1}\le N4\ell ^{2}n\big ((1+\psi _n)\mathbb {P}(A)\big )^{2\ell } \le 8\ell ^{2}nt(\mathbb {P}(A))^{\ell }. \end{aligned}$$
(4.5)

Next, we estimate \(b_{2}\). Let \({\gamma }(n)\le \alpha \le N\) and \(\alpha \ne \beta \in B_{\alpha }\). Assume without loss of generality that \(\alpha <\beta \), so \(q_{1}(\alpha ) <q_{1}(\beta )\). If \(q_{1}(\beta )-q_{1}(\alpha )<\pi (A)\) then by the definition of \(\pi (A)\) it follows that \(\mathbb {P}\{X_{\alpha }=1,X_{\beta }=1\}=0\), so we can assume that \(q_{1}(\beta )-q_{1}(\alpha )\ge \pi (A)\). Now by (2.1), (2.2) and Lemmas 3.1 and 3.2,

$$\begin{aligned}&\mathbb {P}\{X_{\alpha }=1,X_{\beta }=1\}\le \mathbb {P}(T^{-q_{1}(\alpha )} ([a_{0},\ldots ,a_{\pi (A)-1}])\cap \{X_{\beta }=1\})\\&\quad \le (1+\psi _{0})\mathbb {P}(A(\pi ))\mathbb {P}\{X_{{\beta }}=1\} \le (1+\psi _0)\mathbb {P}(A(\pi ))(1+\psi _n)^\ell ({\mathbb P}(A))^\ell . \end{aligned}$$

Since by our assumption \((1+\psi _n)^\ell \le 3/2\) we obtain that

$$\begin{aligned} b_{2}\le 6(1+\psi _0)N\ell ^{2}n\mathbb {P}(A(\pi ))(\mathbb {P}(A))^{\ell } \le 6(1+\psi _{0})\ell ^2tn\mathbb {P}(A(\pi )). \end{aligned}$$
(4.6)

In order to estimate \(b_{3}\) we use Lemma 3.3. Fix an integer \({\alpha }\) such that \({\gamma }(n)\le {\alpha }\le N\) and set

$$\begin{aligned} Q&= Q_{\alpha }=\{ q_i({\alpha })+m:\, i=1,\ldots ,\ell ;\, m=0,1,\ldots ,n-1\}\,\,\hbox {and}\\ \tilde{Q}&= \tilde{Q}_{\alpha }=\{ q_j({\beta })+m:\, j=1,\ldots ,\ell ;\,\,{\beta }\not \in B_{\alpha },\, m=0,1,\ldots ,n-1\}. \end{aligned}$$

Then the conditions of Lemma 3.3 are satisfied with \(d=n\) and such \(Q\) and \(\tilde{Q}\). Taking into account that \({\mathcal B}_{\alpha }\subset {\mathcal F}_{\tilde{Q}}\) we derive easily from Lemmas 3.2 and 3.3 that for \(p=EX_{\alpha }\),

$$\begin{aligned}&E\bigr |E(X_{\alpha }-p\mid \mathcal {B_{\alpha }})\bigr |=E\bigr |E(E(X_{\alpha }- p\mid \mathcal {F}_{\tilde{Q}})\mid \mathcal {B_{\alpha }})\bigr |\\&\quad \le EE(\bigr |E(X_{\alpha } -p\mid \mathcal {F}_{\tilde{Q}})\bigr |\mid \mathcal {B_{\alpha }})=E\bigr |E(X_{\alpha }-p\mid \mathcal {F}_{\tilde{Q}})\bigr | \\&\quad \le 2^{2\ell +4}\psi _{n}\mathbb {P}\{X_{\alpha }=1\}\le 2^{2\ell +5}\psi _{n}(\mathbb {P}(A))^{\ell } \end{aligned}$$

Hence,

$$\begin{aligned} b_{3}\le N2^{2\ell +5}\psi _{n}(\mathbb {P}(A))^{\ell }\le 2^{2\ell +5} t\psi _{n}. \end{aligned}$$
(4.7)

In order to estimate \(\delta _{3}\) we use Lemma 3.4 which yields

$$\begin{aligned} \delta _{3}\le \underset{l=0}{\overset{\infty }{\sum }}|P_{\lambda }\{l\}- P_{t}\{l\}|\le 2e^{|\lambda -t|}|\lambda -t|=2\wp (|{\lambda }-t|). \end{aligned}$$

We also have by Lemma 3.2 that

$$\begin{aligned}&|\lambda -t|\le \left| E\left( \underset{\alpha ={\gamma }(n)}{\overset{N}{\sum }}X_{\alpha }\right) - N(\mathbb {P}(A))^{\ell }\right| +(\mathbb {P}(A))^{\ell }\\&\quad \le \underset{\alpha ={\gamma }(n)}{\overset{N}{\sum }}\left| \mathbb {P}\{X_{\alpha }=1\} -(\mathbb {P}(A))^{\ell }\right| +{\gamma }(n)(\mathbb {P}(A))^{\ell }\\&\quad \le N\psi _{n}2^{\ell }(\mathbb {P}(A))^{\ell }+{\gamma }(n)(\mathbb {P}(A))^{\ell } \le 2^{\ell } t\psi _{n}+{\gamma }(n)(\mathbb {P}(A))^{\ell }. \end{aligned}$$

It follows that

$$\begin{aligned} \delta _{3}\le 2\wp (2^{\ell } t\psi _{n}+{\gamma }(n)(\mathbb {P}(A))^{\ell }). \end{aligned}$$
(4.8)

Now (2.7) follows from (4.1), (4.2), (4.3), (4.4), (4.5), (4.6), (4.7) and (4.8) while (2.8) follows from Lemma 3.1, completing the proof of the theorem.\(\square \)

4.2 Proof of Corollary 2.2

Set \(c=3\Gamma ^{-1}\) and fix \(M\in \mathbb {N}\) such that \(M>c\ln M\) and \(\psi _{n}\le (3/2)^{1/(\ell +1)}-1\) for all \(n\ge M\). Denote by \(\Omega ^{*}\) the set of all \(\omega \in \Omega \) for which there exists an \(M(\omega )\ge M\) such that for each \(n\ge M(\omega )\),

$$\begin{aligned} \pi (A_{n}^{\omega })>n-c\ln n\;\text { and }\;\mathbb {P}(A_{n}^{\omega })>0. \end{aligned}$$

Set \(U^{\omega }_n=S_N^{A_n^{\omega }}\). Assuming (2.8) it follows from Theorem 2.1 that for each \(\omega \in \Omega ^{*}\) and \(n\ge M(\omega )\),

$$\begin{aligned}&\underset{L\subset \mathbb {N}}{\sup }\left| \mathbb {P}\{U_{n}^{\omega }\in L\}-P_{t}(L)\right| \\&\quad \le 16e^{-{\Gamma }n}\left( \ell ^2nt+{\gamma }(n)(1+t^{-1})+t\ell ^2n^4(1+\psi _0) \right) +2\wp \left( 2^{\ell } t\psi _{n}+{\gamma }(n)e^{-{\Gamma }n}\right) \end{aligned}$$

which gives (2.11), and it remains to show that \(\Omega ^{*}\) has full measure.

For each \(n\ge M\) set

$$\begin{aligned} B_{n}=\{\omega \,:\,\pi (A_{n}^{\omega })\le n-c\ln n\}. \end{aligned}$$

Fix \(n\ge M\) and let \(d=[n-c\ln n]\). For \(a_0,a_1,\ldots ,a_d\) and \(r\le d\) set \(A^a_r=[a_0,a_1,\ldots ,a_{r-1}]\) and \(D^a_{r,n}=\{{\omega }=({\omega }_0,{\omega }_1, \ldots )\,:\,{\omega }_k=a_{k-r[k/r]}\,\,\forall \, k=r,r+1,\ldots ,n-1\}\). Then by (2.1), (2.2) and Lemma 3.1,

$$\begin{aligned}&\mathbb {P}(B_{n})\le \underset{r=1}{\overset{d}{\sum }}\mathbb {P}\left\{ \omega : A_{n}^{\omega }\cap T^{-r}(A_{n}^{\omega })\ne \emptyset \right\} =\underset{r=1}{\overset{d}{\sum }}\left( \underset{a_{0},\ldots ,a_{r-1} \in \mathcal {A}}{\sum }\mathbb {P}(A^a_r\cap D^a_{r,n})\right) \\&\quad \le (1+\psi _{0})\underset{r=1}{\overset{d}{\sum }}\,\,\underset{a_{0},\ldots ,a_{r-1} \in \mathcal {A}}{\sum }\mathbb {P}(A^a_r)\mathbb {P}(D^a_{r,n}) \le (1+\psi _{0})\underset{r=1}{\overset{d}{\sum }} e^{-\Gamma (n-r)}\\&\qquad \,\,\times \underset{a_{0},\ldots ,a_{r-1}\in \mathcal {A}}{\sum }\mathbb {P}(A^a_r) =(1+\psi _0)\underset{r=1}{\overset{d}{\sum }} e^{-\Gamma (n-r)} \le d(1+\psi _{0}) e^{-\Gamma (n-d)}\\&\quad \le n(1+\psi _{0}) e^{-\Gamma c\ln n} =(1+\psi _{0}) n^{1-\Gamma c}=(1+\psi _{0}) n^{-2}. \end{aligned}$$

It follows that

$$\begin{aligned} \underset{n=M}{\overset{\infty }{\sum }}\mathbb {P}(B_{n})\le (1+\psi _{0}) \underset{n=M}{\overset{\infty }{\sum }}n^{-2}<\infty . \end{aligned}$$

Now from the Borel–Cantelli lemma we obtain that \(\mathbb {P}\{B_{n} \text { i.o.}\}=0\) where i.o. stands for “infinitely often”. Set \(D={\Omega }\!\setminus \!{\Omega }_{\mathbb P}\) which, recall, is the union of cylinders \(A\) with \({\mathbb P}(A)=0\). Since \(\mathbb {P}(D)=0\) and \({\Omega }\!\setminus \!{\Omega }^{*}\subset D\cup \{B_{n}\,\,\text {i.o.}\}\), we conclude that \(\mathbb {P}(\Omega \!\setminus \!\Omega ^{*})=0\), completing the proof of the corollary.\(\square \)
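
The Borel–Cantelli argument above rests on the fact that a typical initial \(n\)-string has no self-overlap shift \(r\le n-c\ln n\), i.e. \(A_n^{\omega }\cap T^{-r}(A_n^{\omega })=\emptyset \) for all such \(r\). The short Python sketch below illustrates this empirically; the i.i.d. Bernoulli(1/2) source, the value of the constant and the identification of the relevant quantity with the smallest self-overlap shift of the prefix are assumptions made only for this illustration.

```python
# Monte Carlo illustration (not part of the proof): for a random sequence the
# initial n-string very rarely intersects its own shift by some r <= n - c*ln(n).
# The i.i.d. Bernoulli(1/2) source and the value of c are illustrative choices.
import math, random

def smallest_overlap_shift(word):
    """Smallest r >= 1 with word[k] == word[k - r] for all r <= k < len(word);
    returns len(word) if no such r exists."""
    n = len(word)
    for r in range(1, n):
        if all(word[k] == word[k - r] for k in range(r, n)):
            return r
    return n

random.seed(0)
n, c, trials = 200, 3.0, 2000
threshold = n - c * math.log(n)
bad = sum(smallest_overlap_shift([random.randint(0, 1) for _ in range(n)]) <= threshold
          for _ in range(trials))
print(f"fraction of prefixes with a self-overlap shift <= n - c ln n: {bad / trials:.4f}")
```

Consistently with the bound \(\mathbb {P}(B_n)\le (1+\psi _0)n^{-2}\) obtained above, the observed fraction is essentially zero.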

5 Compound Poisson approximation

5.1 Proof of Theorem 2.3

First, recall that the assertions concerning \(\rho =\rho _A\) are contained in Lemmas 3.7 and 3.8. Throughout this subsection \(A\in {\mathcal C}_n\) will be fixed, and so we will write \(X_k\) and \(S_N\) for \(X^A_k\) and \(S_N^A\), respectively. Next, set \(K=5d_{\ell }rn\) and \(\hat{X}_{\alpha }=X_{\alpha }\) if \(K<{\alpha }\le N\) and \(\hat{X}_{\alpha }=0\) if \({\alpha }\le K\) or \({\alpha }> N\). Now define

$$\begin{aligned} U=\sum _{{\alpha }=1}^N\hat{X}_{\alpha }=\sum _{{\alpha }=K+1}^NX_{\alpha }\,\,\text{ and }\,\, X_{\alpha ,j}=(1-\hat{X}_{\alpha -{\kappa }})(1-\hat{X}_{\alpha +j{\kappa }}) \prod _{k=0}^{j-1}\hat{X}_{\alpha +k{\kappa }}. \end{aligned}$$

Observe that for any \(m\),

$$\begin{aligned} |S_N-U|\le \sum _{{\alpha }=1}^KX_{\alpha }\,\,\text{ and }\,\, \left| U-\sum _{{\alpha }=K+1}^N \sum _{j=1}^mjX_{{\alpha },j}\right| \le m\sum _{{\alpha }=K+1}^N\prod _{k=0}^mX_{{\alpha }+k{\kappa }} \end{aligned}$$
(5.1)

with \({\kappa }\) defined by (2.11). Introduce also

$$\begin{aligned} I_{0}=\{K+1,\ldots ,N\}\times \{1,\ldots ,n_{0}\}, \end{aligned}$$

where, recall, \(n_{0}=\big [\frac{n}{r}\big ]\),

$$\begin{aligned} \lambda _{\alpha ,j}=EX_{\alpha ,j},\,\lambda =\underset{(\alpha ,j)\in I_{0}}{\sum }\lambda _{\alpha ,j}\,\,\text{ and }\,\,s=t(1-\rho ) \end{aligned}$$

with \(\rho \) defined in (2.11).

Next, we estimate \(|\lambda -s|\). For each \(i=1,2,\ldots ,\ell \) set for brevity \(c_i=n+d_i{\kappa }\). Then

$$\begin{aligned}&|\lambda -s|\le (\mathbb {P}(A))^{\ell }+\left| \underset{\alpha =K+1}{\overset{N}{\sum }}\,\,\underset{j=1}{\overset{n_{0}}{\sum }}\lambda _{\alpha ,j}- N(\mathbb {P}(A))^{\ell }(1-\rho )\right| \\&\quad \le (K+1)(\mathbb {P}(A))^{\ell }+\underset{\alpha =K+1}{\overset{N}{\sum }} \left| \underset{j=1}{\overset{n_{0}}{\sum }}\lambda _{\alpha ,j}- (\mathbb {P}(A))^{\ell }(1-\rho )\right| \\&\quad =(K+1)(\mathbb {P}(A))^{\ell }+\underset{\alpha =K+1}{\overset{N}{\sum }}\left| \mathbb {P}\left( \underset{j=1}{\overset{n_{0}}{\cup }}\{X_{\alpha ,j}=1\}\right) - (\mathbb {P}(A))^{\ell }+\underset{i=1}{\overset{\ell }{\prod }}\mathbb {P} (R^{c_{i}/r})\right| \\&\quad \le 2K e^{-\Gamma n}+\underset{\alpha =K+1}{\overset{N}{\sum }} \mathbb {P}\left\{ \underset{k=0}{\overset{n_{0}}{\prod }}X_{\alpha +k{\kappa }}=1\right\} \\&\qquad +\underset{\alpha =K+1}{\overset{N}{\sum }}\bigg |\mathbb {P}\{(1-X_{\alpha -{\kappa }}) X_{\alpha }=1\}-\mathbb {P}\{X_{\alpha }=1\}+\mathbb {P}\bigg (\underset{i=1}{\overset{\ell }{\cap }}T^{-d_{i}(\alpha -{\kappa })}R^{c_i/r}\bigg ) \bigg |\\&\qquad +\underset{\alpha =K+1}{\overset{N}{\sum }}\bigg (\bigg |\mathbb {P} \{X_{\alpha }=1\}-(\mathbb {P}(A))^{\ell }\bigg |+\bigg |\mathbb {P}\left( \underset{i=1}{\overset{\ell }{\cap }}T^{-d_{i}(\alpha -{\kappa })}R^{c_i/r}\right) -\underset{i=1}{\overset{\ell }{\prod }}\mathbb {P}(R^{c_{i}/r})\bigg |\bigg )\\&\quad =2K e^{-\Gamma n}+\sigma _{1}+\sigma _{2}+\sigma _{3}. \end{aligned}$$

Here \(\sigma _{1},\sigma _{2}\text { and }\sigma _{3}\) denote the first, second and third sums, respectively, and we use in the last inequality above that

$$\begin{aligned} \underset{\alpha =K+1}{\overset{N}{\sum }} \mathbb {P}\left\{ \underset{k=0}{\overset{n_{0}}{\prod }}X_{\alpha +k{\kappa }}=1\right\} \ge \underset{\alpha =K+1}{\overset{N}{\sum }}\left| \mathbb {P}\{(1-X_{\alpha -{\kappa }}) X_{\alpha }=1\}\!-\!\mathbb {P}\left( \mathop {\cup }\limits _{j=1}^{n_0}\{ X_{\alpha ,j}=1\}\right) \right| \!. \end{aligned}$$

In order to estimate \(\sigma _1\) we observe that the choice of \(K\) gives \({\alpha }(d_{i+1}-d_i)\ge 3d_in\) for any \({\alpha }>K\) and \(i=1,\ldots ,\ell -1\). It follows that whenever \(0\le m\le n_0\) and \({\alpha }>K\) there exist disjoint sets of integers \(Q_1,Q_2,\ldots ,Q_\ell \) satisfying (3.2) with \(k\ge 1\) and such that \(T^{-d_i{\alpha }}\cap _{l=0}^{n_0}T^{-d_il{\kappa }}A\in {\mathcal F}_{Q_i}\), \(i=1, \ldots ,\ell \). Since \(r\) divides \(d_i{\kappa }\) and \(d_i{\kappa }\le d_ir<n\) by the assumption, for such \(m\) each \(\cap _{l=0}^{m}T^{-d_il{\kappa }}A\) is contained in \(D_{i,m}\cap T^{-d_im{\kappa }}A\), where \(D_{i,m}\) is a cylinder set of length \(d_im{\kappa }\ge rm\) such that \(D_{i,m}\in {\mathcal F}_{0,d_im{\kappa }-1}\) while, clearly, \(T^{-d_im{\kappa }}A\in {\mathcal F}_{d_im{\kappa },\infty }\). Hence, relying on Lemmas 3.1 and 3.2 we conclude that

$$\begin{aligned}&\sum \limits _{{\alpha }=K+1}^N{\mathbb P}\left\{ \prod _{l=0}^mX_{{\alpha }+l{\kappa }}=1\right\} =\underset{\alpha =K+1}{\overset{N}{\sum }}\mathbb {P}\left( \underset{i=1}{\overset{\ell }{\cap }}T^{-d_{i} \alpha }\mathop {\cap }\limits _{l=0}^{m}T^{-d_il{\kappa }}A\right) \nonumber \\&\quad \le (1+\psi _0)^\ell \sum \limits _{{\alpha }=K+1}^N\prod _{i=1}^\ell {\mathbb P}\left( \mathop {\cap }\limits _{l=0}^{m} T^{-d_il{\kappa }}A\right) \nonumber \\&\quad \le (1+\psi _0)^{2\ell }N({\mathbb P}(A))^\ell e^{-{\Gamma }\ell rm} \le (1+\psi _0)^{2\ell } te^{-{\Gamma }\ell rm}. \end{aligned}$$
(5.2)

In particular, taking \(m=n_0\) we obtain

$$\begin{aligned} \sigma _{1}\le (1+\psi _0)^{2\ell } te^{-{\Gamma }\ell rn_0}\le (1+\psi _0)^{2\ell } te^{-{\Gamma }n/2}. \end{aligned}$$
(5.3)

Next we show that the term \(\sigma _{2}\) vanishes. Since \(r\) divides \(d_i{\kappa }\) for each \(i=1,\ldots ,\ell \) we have that \(R^{c_i/r}=R^{d_i{\kappa }/r} \cap (T^{-d_i{\kappa }}A)\). It follows that for any \({\alpha }>K\),

$$\begin{aligned}&\underset{i=1}{\overset{\ell }{\cap }} T^{-d_i({\alpha }-{\kappa })}R^{c_{i}/r} =\left( \underset{i=1}{\overset{\ell }{\cap }}T^{-d_i({\alpha }-{\kappa })}R^{d_{i}{\kappa }/r}\right) \cap \left( \underset{i=1}{\overset{\ell }{\cap }}T^{-d_i{\alpha }}A\right) \\&\quad =\left( \underset{i=1}{\overset{\ell }{\cap }}T^{-d_i({\alpha }-{\kappa })}A\right) \cap \left( \underset{i=1}{\overset{\ell }{\cap }}T^{-d_i{\alpha }}A\right) = \{X_{\alpha -{\kappa }} X_{\alpha }=1\}. \end{aligned}$$

Hence,

$$\begin{aligned}&\mathbb {P}\{X_{\alpha }=1\}-\mathbb {P}\left( \underset{i=1}{\overset{\ell }{\cap }} T^{-d_i({\alpha }-{\kappa })}R^{c_{i}/r}\right) = \mathbb {P}\{X_{\alpha }=1\}-\mathbb {P}\{X_{\alpha -{\kappa }} X_{\alpha }=1\}\\&\quad =\mathbb {P}\{(1-X_{\alpha -{\kappa }})X_{\alpha }=1\}, \end{aligned}$$

and so \({\sigma }_2=0\).

Next, we estimate \(\sigma _{3}\) using Lemma 3.2 similarly to the above, which gives

$$\begin{aligned} \sigma _{3}\le 2^{\ell }N\psi _{n}(\mathbb {P}(A))^{\ell }\le 2^{\ell } t \psi _{n}. \end{aligned}$$
(5.4)

Hence, by (5.3) and (5.4),

$$\begin{aligned}&|\lambda -s|\le 2Ke^{-\Gamma n}+(1+\psi _{0})^{2\ell } t e^{-\frac{\Gamma }{2} n}+2^{\ell } t\psi _{n}\nonumber \\&\quad \le 2(1+\psi _{0})^{2\ell } K(t+1) e^{-\frac{\Gamma }{2} n} +2^{\ell } t\psi _{n}. \end{aligned}$$
(5.5)

Next, assume that \(\lambda =0\). Then by (5.1),

$$\begin{aligned} \mathbb {P}\{S_N\ne 0\}\le \sum _{{\alpha }=1}^K{\mathbb P}\{ X_{\alpha }=1\}+{\sigma }_1\le K\mathbb {P}(A)+\sigma _{1}. \end{aligned}$$
(5.6)

Let \(\eta _{1},\eta _{2},\ldots \) be a sequence of i.i.d. random variables with \(\mathbb {P}\{\eta _{1}\in \mathbb {N}^{+}\}=1\) independent of a Poisson random variable \(W\) with the parameter \(s\), and set \(Z=\sum _{k=1}^W\eta _k\). Then by (5.3), (5.5) and (5.6), for any \(L\subset {\mathbb N}\),

$$\begin{aligned}&|\mathbb {P}\{S_N\in L\}-\mathbb {P}\{Z\in L\}|\le \mathbb {P}\{S_N\ne 0\}+ |\mathbb {P}\{S_N=0\}-\mathbb {P}\{Z=0\}|\nonumber \\&\qquad +\,\,\mathbb {P}\{Z\ne 0\}\le 2(\mathbb {P}\{S_N\ne 0\}+\mathbb {P}\{Z\ne 0\})= 2(\mathbb {P}\{S_N\ne 0\}+(1-e^{-s}))\nonumber \\&\quad \le 2(\mathbb {P}\{S_N\ne 0\}+s)\le 8(1+\psi _{0})^{2\ell } K(t+1) e^{-\frac{\Gamma }{2} n}+2^{\ell +1} t\psi _{n}. \end{aligned}$$
(5.7)

Hence, if \(\lambda =0\) the theorem follows for any such i.i.d. sequence \(\eta _{1},\eta _{2},\ldots \), and so we can assume that \(\lambda >0\).

Define

$$\begin{aligned} I=\{(\alpha ,j)\in I_{0}\,:\,\mathbb {P}\{X_{\alpha ,j}=1\}>0\}. \end{aligned}$$

If \(\lambda >0\) then \(I\ne \emptyset \). For each \(j\in \{1,\ldots ,n_{0}\}\) set

$$\begin{aligned} \lambda _{j}=\lambda ^{-1}\underset{\alpha =K+1}{\overset{N}{\sum }} \lambda _{\alpha ,j}. \end{aligned}$$

We choose an i.i.d. sequence \(\{\eta _{k}\}_{k=1}^{\infty }\) such that \(\mathbb {P}\{\eta _{1}=j\}=\lambda _{j}\) for each \(1\le j\le n_{0}\) and set, again, \(Z=\sum _{k=1}^W\eta _k\) where, as before, \(W\) is a Poisson random variable with the parameter \(s\) independent of \(\eta _k\)’s. Set \(\mathbf {X}=\{X_{\alpha ,j}\}_{(\alpha ,j)\in I}\) and let \(\mathbf {Y}=\{Y_{\alpha ,j}\}_{(\alpha ,j)\in I}\) be a collection of independent random variables such that each \(Y_{\alpha ,j}\) has the Poisson distribution with the parameter \(\lambda _{\alpha ,j}\). Given \((a_{\alpha ,j})_{(\alpha ,j)\in I}=a\in \mathbb {N}^{I}\) define

$$\begin{aligned} f(a)=\underset{(\alpha ,j)\in I}{\sum }j a_{\alpha ,j}\,\,\text{ and }\,\, h_L(a)={\mathbb I}_{f(a)\in L}. \end{aligned}$$

Then

$$\begin{aligned}&|\mathbb {P}\{S_N\in L\}-\mathbb {P}\{Z\in L\}|\le |\mathbb {P}\{S_N\in L\}- Eh_L(\mathbf {X})|\nonumber \\&\quad +\,\,|Eh_L(\mathbf {X})-Eh_L(\mathbf {Y})|+ |Eh_L(\mathbf {Y})-\mathbb {P}\{Z\in L\}|=\delta _{1}+\delta _{2}+ \delta _{3} \end{aligned}$$
(5.8)

where \({\delta }_1,{\delta }_2\) and \({\delta }_3\) denote the respective terms in the right hand side of (5.8).

By (5.1) and (5.3) we obtain that

$$\begin{aligned}&\delta _{1}\le 2\mathbb {P}\{S_N\ne f(\mathbf {X})\}\le 2\sum \limits _{{\alpha }=1}^K {\mathbb P}\{ X_{\alpha }=1\}+2{\sigma }_1\le 2K{\mathbb P}(A)\nonumber \\&\quad +\,\,2{\sigma }_1\le 2K\mathbb {P}(A)+2(1+\psi _{0})^{2\ell }t e^{-\frac{\Gamma }{2} n}\le 4(1+\psi _{0})^{2\ell } K(t+1)e^{-\frac{\Gamma }{2} n}. \end{aligned}$$
(5.9)

In order to estimate \(\delta _{2}\) we use Theorem 2 from [6]. Note that by the definition of \(I\) for each \((\alpha ,j)\in I\) we have \(\mathbb {P}\{X_{\alpha ,j}=1\}>0\), and so the use of this theorem is justified. For each \((\alpha ,j)\in I\) define

$$\begin{aligned} B_{\alpha ,j}=\{(\beta ,k)\in I\,:\,\exists \, i_{1},i_{2}=1,\ldots ,\ell \,\, \text{ such } \text{ that }\,\, |d_{i_{1}}\alpha -d_{i_{2}}\beta |<K\}. \end{aligned}$$

By Theorem 2 in [6] we see that

$$\begin{aligned} \delta _{2}\le \bigl \Vert {\mathcal L}(\mathbf {X})-{\mathcal L}(\mathbf {Y})\bigr \Vert \le 2(2b_{1}+2b_{2}+b_{3}), \end{aligned}$$
(5.10)

where

$$\begin{aligned} \bigl \Vert {\mathcal L}(\xi )-{\mathcal L}(\zeta )\bigr \Vert =2\sup _{L\subset {\mathbb N}^I}|{\mathbb P}\{\xi \in L\} -{\mathbb P}\{\zeta \in L\}| \end{aligned}$$

is the total variation distance between distributions of \({\mathbb N}^I\)-valued random vectors \(\xi \) and \(\zeta \),

$$\begin{aligned} b_{1}&= \underset{\left( \alpha ,j\right) \in I}{\sum }\left( \underset{(\beta ,k)\in B_{\alpha ,j}}{\sum }\mathbb {P}\{X_{\alpha ,j}=1\}\mathbb {P}\{X_{\beta ,k}=1\}\right) ,\\ b_{2}&= \underset{\left( \alpha ,j\right) \in I}{\sum }\left( \underset{\left( \alpha ,j\right) \ne (\beta ,k)\in B_{\alpha ,j}}{\sum }\mathbb {P}\{X_{\alpha ,j}=1,X_{\beta ,k}=1\}\right) \,\,\text{ and }\\ b_{3}&= \underset{\left( \alpha ,j\right) \in I}{\sum }E\bigr |E(X_{\alpha ,j}- \lambda _{\alpha ,j}\mid {\mathcal B}_{{\alpha },j})\bigr | \end{aligned}$$

where \({\mathcal B}_{{\alpha },j}=\sigma \{X_{\beta ,k}:\,(\beta ,k)\notin B_{\alpha ,j}\}\). For each \((\alpha ,j)\in I\) it follows by Lemma 3.2 that

$$\begin{aligned} \mathbb {P}\{X_{\alpha ,j}=1\}\le \mathbb {P}\{X_{\alpha }=1\} \le 2(\mathbb {P}(A))^{\ell }. \end{aligned}$$

Since the numbers of elements in \(B_{\alpha ,j}\) and in \(I\) do not exceed \(2K\ell ^{2}n\) and \(nN\), respectively, we obtain that

$$\begin{aligned} b_{1}\le 8n^2NK\ell ^{2}(\mathbb {P}(A))^{2\ell }\le 8K\ell ^{2} t n^{2} (\mathbb {P}(A))^{\ell }\le 8K\ell ^{2} tn^{2} e^{-\Gamma n}. \end{aligned}$$
(5.11)

Now we estimate \(b_{2}\). Fix \((\alpha ,j)\in I\), let \((\alpha ,j)\ne (\beta ,k)\in B_{\alpha ,j}\) and set \(F=\{X_{\alpha ,j}=1,X_{\beta ,k}=1\}\). We want to estimate \(\mathbb {P}(F)\) from above. Clearly, if \(\alpha =\beta \) then \(F=\emptyset \), so without loss of generality we can assume that \(\alpha <\beta \). Suppose, first, that \(\alpha +\frac{n-3r}{d_{\ell }}>\beta \) and let us show that in this case \(F=\emptyset \). Indeed, assume by contradiction that \(F\ne \emptyset \); then by Lemma 3.6 it follows that \(\alpha \le \beta -{\kappa }\). Let \({\omega }\in F\); then \(X_{\beta ,k}(\omega )=1\), and so \(X_{\beta -{\kappa }}(\omega )=0\) and \(X_{\beta }(\omega )=1\). Hence, there exists \(1\le i_{0}\le \ell \) such that

$$\begin{aligned} \mathbb {I}_{A}\circ T^{d_{i_{0}}(\beta -{\kappa })}(\omega )=0\text { and } \mathbb {I}_{A}\circ T^{d_{i_{0}}\beta }(\omega )=1. \end{aligned}$$

It follows that for \(c=d_{i_{0}} {\kappa }\),

$$\begin{aligned} \mathbb {I}_{R^{c/r}}\circ T^{d_{i_{0}}(\beta -{\kappa })}(\omega )=0. \end{aligned}$$
(5.12)

By our assumption,

$$\begin{aligned} d_{i_{0}}(\beta -\alpha )\le d_\ell ({\beta }-{\alpha })<n-3r<(n_{0}-2) r. \end{aligned}$$

Write \(d_{i_{0}}(\beta -\alpha )=ur+v\) where \(u,v\in \mathbb {N}\) and \(v<r\). Then \(d_{i_{0}}\alpha +ur\le d_{i_{0}}\alpha +(n_{0}-2) r\). Since \(X_{\alpha ,j}(\omega )=1\), and so \({\mathbb I}_A\circ T^{d_{i_0}{\alpha }}({\omega })=1\), we obtain from the last inequality and the definition of \(n_0\) that \(\mathbb {I}_{R^{2}}\circ T^{d_{i_{0}}\alpha +ur}(\omega )=1\) where \(R^2= R^{2r/r}\) is the concatenation of two copies of \(R\). Since \(X_{\beta ,k}(\omega )=1\), and so \({\mathbb I}_A\circ T^{d_{i_0}{\beta }}({\omega })=1\), we obtain also that

$$\begin{aligned} \mathbb {I}_{R^{2}}\circ T^{d_{i_{0}}\alpha +ur+v}(\omega )= \mathbb {I}_{R^{2}}\circ T^{d_{i_{0}}\beta }(\omega )=1. \end{aligned}$$

Hence, \(R^{2}\cap T^{-v}(R^{2})\ne \emptyset \). By our assumption \(v<r\), and if \(v>0\) then \(\pi (R)\le \pi (R^2)\le v<r\). If \(r\) is not divisible by \(\pi (R)\) then by Lemma 3.5 we would have \(\pi (R^2)=r\), contradicting the above inequality, and if \(\pi (R)\) divides \(r\) then \(\pi (R)\in \mathcal O(A)\), contradicting \(\pi (A)=r\). Hence, \(v=0\), and so \(d_{i_{0}}(\beta -\alpha )=ur\). Since \(X_{\alpha ,j}(\omega )=1\), \(r\) divides \(d_{i_0}{\kappa }\) and \(n\ge ur = d_{i_{0}}(\beta -\alpha )\ge d_{i_0}{\kappa }\), it follows that

$$\begin{aligned} 1&= X_{\alpha }(\omega )\le \mathbb {I}_{A}\circ T^{d_{i_{0}}\cdot \alpha }(\omega ) \le \mathbb {I}_{R^{ur/r}}\circ T^{d_{i_{0}}\cdot \alpha }(\omega )\\&= (\mathbb {I}_{R^{(ur-d_{i_{0}}{\kappa })/r}}\circ T^{d_{i_{0}}\cdot \alpha }(\omega )) \cdot (\mathbb {I}_{R^{c/r}}\circ T^{(d_{i_{0}}\cdot \alpha +ur-d_{i_{0}}{\kappa })}(\omega ))\\&\le \mathbb {I}_{R^{c/r}}\circ T^{(d_{i_{0}}\cdot \alpha +ur-d_{i_{0}}{\kappa })} (\omega )=\mathbb {I}_{R^{c/r}}\circ T^{d_{i_{0}}\cdot (\beta -{\kappa })}(\omega ) \end{aligned}$$

where, again, \(c=d_{i_0}{\kappa }\) and we set \(R^{0/r}={\Omega }\). Hence, \(\mathbb {I}_{R^{c/r}}\circ T^{d_{i_{0}}(\beta -{\kappa })}(\omega )=1\) contradicting (5.12), and so \(F=\emptyset \) in this case.

Thus, we can assume that \(\alpha +\frac{n-3r}{d_{\ell }}\le \beta \), and so \(d_{\ell }\alpha +n\le d_{\ell }\beta +3r\). Hence, by Lemma 3.2,

$$\begin{aligned}&\mathbb {P}\{X_{\alpha ,j}=1,X_{\beta ,k}=1\}\le {\mathbb P}\{ X_{\alpha }=1,\,{\mathbb I}_A\circ T^{d_\ell {\beta }}=1\}\\&\quad \le \mathbb {P}\{X_{\alpha }=1,\mathbb {I}_{R^{(n-3r)/r}}\circ T^{d_{\ell } \beta +3r}=1\}\\&\quad \le (1+\psi _{0})\mathbb {P}\{X_{\alpha }=1\}\mathbb {P}(R^{(n-3r)/r}) \le 4\psi _{0}(\mathbb {P}(A))^{\ell }e^{-\Gamma (n-3r)}. \end{aligned}$$

Since the numbers of elements in \(B_{\alpha ,j}\) and in \(I\) do not exceed \(2K\ell ^{2}n\) and \(nN\), respectively, we obtain that

$$\begin{aligned} b_{2}\le 8n^2NK\ell ^{2}\psi _{0}(\mathbb {P}(A))^{\ell }e^{-\Gamma (n-3r)} \le 8\ell ^{2}\psi _{0} K t n^{2} e^{-\frac{\Gamma }{2} n}. \end{aligned}$$
(5.13)

In order to estimate \(b_{3}\) we use Lemma 3.3 with

$$\begin{aligned} Q&= Q_{{\alpha },j}=\{ d_i({\alpha }+l{\kappa })+m:\, i=1,\ldots ,\ell ;\, l=-1,0,1,\ldots ,j,\\ m&= 0,1,\ldots ,n-1\}\,\, \text{ and }\,\,\tilde{Q}=\tilde{Q}_{{\alpha },j}=\{ d_i({\beta }+l{\kappa })+m:\,{\beta }\not \in B_{{\alpha },j},\\ i&= 1,\ldots ,\ell ,\, l=-1,0,1,\ldots ,n_0,\, m=0,1,\ldots ,n-1\}. \end{aligned}$$

Then by the choice of \(K\) the conditions of Lemma 3.3 are satisfied with \(d=n\) and such \(Q\) and \(\tilde{Q}\). Taking into account that \(\mathcal {B}_{{\alpha },j}\subset {\mathcal F}_{\tilde{Q}}\) we obtain from (2.1), (2.2) and Lemma 3.3 that

$$\begin{aligned}&E\bigl |E(X_{\alpha ,j}-\lambda _{\alpha ,j}\mid \mathcal {B}_{{\alpha },j})\bigl |\,= E\bigl |E(E(X_{\alpha ,j}-\lambda _{\alpha ,j}\mid \mathcal {F}_{\tilde{Q}}) \mid \mathcal {B}_{{\alpha },j})\bigl |\\&\quad \le E\bigl |E(X_{\alpha ,j}-\lambda _{\alpha ,j}\mid \mathcal {F}_{\tilde{Q}})\bigl | \le 2^{2\ell +4}\psi _{n}\mathbb {P}\{X_{\alpha ,j}=1\}. \end{aligned}$$

For each \(1\le i\le \ell \) set \(c_{i}=n+{\kappa }d_{i}(j-1)\). Then by Lemma 3.2 we see that

$$\begin{aligned}&\mathbb {P}\{X_{\alpha ,j}=1\}\le \mathbb {P}\left\{ \underset{i=1}{\overset{\ell }{\cap }} \mathbb {I}_{R^{c_{i}/r}}\circ T^{d_{i}\alpha }\right\} \le 2\underset{i=1}{\overset{\ell }{\prod }}\mathbb {P}(R^{c_{i}/r})\\&\quad \le 2(1+\psi _{0})\underset{i=1}{\overset{\ell }{\prod }}\left( \mathbb {P}(A) e^{-\Gamma {\kappa }d_{i}(j-1)}\right) \le 2(1+\psi _{0})(\mathbb {P}(A))^{\ell } e^{-\Gamma \ell r(j-1)}. \end{aligned}$$

It follows from the above estimates that

$$\begin{aligned} b_{3}&= \underset{(\alpha ,j)\in I}{\sum }E\bigl |E(X_{\alpha ,j}- \lambda _{\alpha ,j}\mid {\mathcal B}_{{\alpha },j})\bigr |\le \underset{\alpha =K+1}{\overset{N}{\sum }}\,\,\underset{j=1}{\overset{n_{0}}{\sum }}2^{2\ell +4}\psi _{n}\mathbb {P}\{X_{\alpha ,j}=1\}\nonumber \\&\le 2^{2\ell +5}(1+\psi _{0})\psi _{n}N(\mathbb {P}(A))^{\ell }\underset{j=1}{\overset{n_{0}}{\sum }}e^{-\Gamma (j-1)}\le 2^{2\ell +5}(1+\psi _{0})\psi _{n} N(\mathbb {P}(A))^{\ell } \nonumber \\&\times \frac{1}{1-e^{-\Gamma }}\le 2^{2\ell +5}(1+\psi _{0})t\psi _{n}(1-e^{-{\Gamma }})^{-1}. \end{aligned}$$
(5.14)

Next, we estimate \(\delta _{3}\). Given a random variable \(\xi \) we denote by \(\varphi _{\xi }\) the characteristic function of \(\xi \). Let \(\xi \) be a Poisson random variable with a parameter \({\lambda }\) independent of \(\{\eta _{k}\}_{k=1}^{\infty }\) and set \(\Psi =\sum \nolimits _{k=1}^ {\xi }\eta _{k}\). Then for each \(x\in \mathbb {R}\),

$$\begin{aligned} \varphi _{\Psi }(x)=\underset{l=0}{\overset{\infty }{\sum }}\mathbb {P}\{\xi =l\} \underset{k=1}{\overset{l}{\prod }}\varphi _{\eta _{k}}(x)=e^{-{\lambda }}\underset{l=0}{\overset{\infty }{\sum }}\frac{\lambda ^{l}}{l!}(\varphi _{\eta _{1}}(x))^{l} =\exp \left( \lambda \underset{j=1}{\overset{n_{0}}{\sum }}\lambda _{j}(e^{i jx}-1)\right) \end{aligned}$$

and

$$\begin{aligned} \varphi _{f(\mathbf {Y})}(x)&= \underset{(\alpha ,j)\in I}{\prod } \varphi _{Y_{\alpha ,j}}(j x) =\exp \left( \underset{(\alpha ,j)\in I}{\sum }\lambda _{\alpha ,j}(e^{i jx}\!-\!1)\right) \\&= \exp \left( \lambda \underset{j=1}{\overset{n_{0}}{\sum }}\lambda _{j}(e^{i jx}\!-\!1)\right) , \end{aligned}$$

warning the reader that the last two formulas are the only places in this paper where \(i\) stands for \(\sqrt{-1}\) and not for an integer. It follows from here that \(f(\mathbf {Y})\) and \(\Psi \) have the same distribution, and so by Lemmas 3.2 and 3.4,

$$\begin{aligned} \quad \delta _{3}&= \left| \mathbb {P}\{\Psi \in L\}-\mathbb {P}\{Z\in L\}\right| \le \underset{l=0}{\overset{\infty }{\sum }}\left| \mathbb {P}\{\xi =l\}-\mathbb {P}\{W=l\}\right| \mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\eta _{k}\in L\right\} \nonumber \\&\le \underset{l=0}{\overset{\infty }{\sum }}\left| \mathbb {P}\{\xi =l\}- \mathbb {P}\{W=l\}\right| \le 2e^{|\lambda -s|}|\lambda -s|=2\wp (|{\lambda }-s|). \end{aligned}$$
(5.15)

Finally, (2.14) follows from (5.5), (5.7), (5.8), (5.9), (5.10), (5.11), (5.13), (5.14) and (5.15), completing the proof of Theorem 2.3.\(\square \)
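
The distributional identity between \(f(\mathbf {Y})\) and the compound Poisson variable \(\Psi \), which was used in the estimate of \(\delta _3\), can also be observed by simulation, since the independent \(Y_{\alpha ,j}\) aggregate over \(\alpha \) into Poisson variables with parameters \({\lambda }{\lambda }_j\). The Python sketch below is only an illustration; the intensity and the weights are arbitrary choices standing in for \({\lambda }\) and \({\lambda }_j\).

```python
# Simulation illustration (not part of the proof) of the identity used above:
# if Y_j ~ Poisson(lambda * lambda_j) are independent, then sum_j j*Y_j has the
# same law as Psi = eta_1 + ... + eta_xi with xi ~ Poisson(lambda) and
# P{eta = j} = lambda_j.  The intensity and the weights below are arbitrary.
import math, random
from collections import Counter

random.seed(1)
lam = 1.5
weights = [0.7, 0.2, 0.1]          # lambda_1, lambda_2, lambda_3 (they sum to 1)
samples = 200_000

def sample_poisson(mean):
    # elementary Poisson sampler, adequate for small means
    L, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def sample_fY():
    # f(Y) = sum_j j * Y_j with independent Y_j ~ Poisson(lam * lambda_j)
    return sum((j + 1) * sample_poisson(lam * w) for j, w in enumerate(weights))

def sample_Psi():
    # Psi = eta_1 + ... + eta_xi with xi ~ Poisson(lam) and P{eta = j} = lambda_j
    xi = sample_poisson(lam)
    return sum(random.choices(range(1, len(weights) + 1), weights)[0] for _ in range(xi))

hist_fY = Counter(sample_fY() for _ in range(samples))
hist_Psi = Counter(sample_Psi() for _ in range(samples))
for v in range(6):
    print(v, round(hist_fY[v] / samples, 4), round(hist_Psi[v] / samples, 4))
```

Up to Monte Carlo error the two empirical laws coincide, in accordance with the characteristic function computation above.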

5.2 Proof of Corollary 2.4

Let \({\omega }\in {\Omega }_{\mathbb P}\) be a nonperiodic sequence, \(A=A_n^{\omega }\) and \(r=r^{\omega }_n= \pi (A^{\omega }_n)\). Assume first that \(n>r^{\omega }_n(d_\ell +6)\) so that Theorem 2.3 can be applied. By (2.12), (5.5) and the definition of \(s=t(1-\rho _A)\) it follows that \({\lambda }\), defined for \(A=A^{\omega }_n\) at the beginning of Sect. 5, is bounded away from 0 by a constant \({\delta }>0\) independent of \(n\), provided that \(n\) is large enough, which suffices for our purposes since \(n\) tends to \(\infty \) here. Then by (5.2) and the definition of the sequence \(\eta _1,\eta _2,\ldots \) in the proof of Theorem 2.3 we obtain that

$$\begin{aligned} {\mathbb P}\{\eta _1=j\}={\lambda }_j={\lambda }^{-1}\sum _{{\alpha }=K+1}^N{\lambda }_{{\alpha },j}\le {\lambda }^{-1} (1+\psi _0)^{\ell +1}te^{-{\Gamma }\ell r(j-1)}. \end{aligned}$$
(5.16)

Let \(\Xi \) be a Poisson random variable with the parameter \(t\) and \(Z=\sum ^W_{l=1}\eta _l\) be the compound Poisson random variable constructed by Theorem 2.3 with \(A=A^{\omega }_n,\,\rho =\rho _n^{\omega }\) and \(r=r^{\omega }_n\). Then for any \(L\subset {\mathbb N}\),

$$\begin{aligned} \left| {\mathbb P}\{\Xi \in L\}-{\mathbb P}\{ Z\in L\}\right|&\le \sum _{l=0}^\infty \left| {\mathbb P}\{\Xi =l\} -{\mathbb P}\{ W=l\}\right| {\mathbb P}\left\{ \sum _{k=1}^l\eta _k\in L\right\} \nonumber \\&+\sum _{l=0}^\infty {\mathbb P}\left\{ \Xi =l\right\} \left| {\mathbb P}\{\sum _{k=1}^l \eta _k\in L\}-{\mathbb I}_L(l)\right| \end{aligned}$$
(5.17)

where \({\mathbb I}_L(l)=1\) if \(l\in L\) and \({\mathbb I}_L(l)=0\) otherwise.

By Lemma 3.4,

$$\begin{aligned} \sum _{l=0}^\infty \left| {\mathbb P}\{\Xi =l\}-{\mathbb P}\{ W=l\}\right| \le 2t\rho ^{\omega }_ne^{t\rho ^{\omega }_n} =2\wp (t\rho ^{\omega }_n). \end{aligned}$$
(5.18)

Next, by (5.16),

$$\begin{aligned}&\left| {\mathbb P}\left\{ \sum _{k=1}^l\eta _k\in L\right\} -{\mathbb I}_L(l)\right| \le 2{\mathbb P}\left\{ \sum _{k=1}^l \eta _k\ne l\right\} \le 2l{\mathbb P}\{\eta _1\ne 1\}\nonumber \\&\quad =2l\sum ^{n_0}_{j=2}{\lambda }_j\le 2l{\lambda }^{-1}(1+\psi _0)^{2\ell }te^{-{\Gamma }\ell r^{\omega }_n}(1-e^{-{\Gamma }\ell r^{\omega }_n})^{-1}. \end{aligned}$$
(5.19)

Now (5.17), (5.18) and (5.19) yield

$$\begin{aligned} |{\mathbb P}\{\Xi \in L\}-{\mathbb P}\{ Z\in L\}|\le 2\wp (t\rho ^{\omega }_n)+2{\lambda }^{-1} (1+\psi _0)^{\ell +1}t^2e^{-{\Gamma }\ell r^{\omega }_n}(1-e^{-{\Gamma }\ell r^{\omega }_n})^{-1}\nonumber \\ \end{aligned}$$
(5.20)

where we used that \(t=E\Xi =e^{-t}\sum _{l=0}^\infty l\frac{t^l}{l!}\).

If \(n\le r^{\omega }_n(d_\ell +6)\) then we apply Theorem 2.1, and so we can write

$$\begin{aligned} \left| {\mathbb P}\left\{ S_N^{A_n^{\omega }}\in L\right\} -{\mathbb P}\{\Xi \in L\}\right| \le \max ({\varepsilon }_1(n)+{\varepsilon }_2(n), \,{\varepsilon }_3(n)) \end{aligned}$$

where \({\varepsilon }_1(n)\) and \({\varepsilon }_2(n)\) are the right hand sides of (2.14) and (5.20), respectively, while

$$\begin{aligned} {\varepsilon }_3(n)&= 16e^{-{\Gamma }n/(d_\ell +6)}\big (\ell ^2nt+{\gamma }(n)(1+t^{-1})+ tn\ell ^2(1+\psi _0)\big )\nonumber \\&+\,\,2\wp \big (2^\ell t\psi _n+{\gamma }(n)e^{-{\Gamma }n}\big ). \end{aligned}$$

Clearly, \({\varepsilon }_1(n),{\varepsilon }_3(n)\rightarrow 0\) as \(n\rightarrow \infty \), and since \(\rho ^{\omega }_n\rightarrow 0\) and \(r^{\omega }_n\rightarrow \infty \) as \(n\rightarrow \infty \) by Lemma 3.7, we also have \({\varepsilon }_2(n)\rightarrow 0\) as \(n\rightarrow \infty \), and so the assertion of Corollary 2.4 follows.\(\square \)

5.3 Proof of Corollary 2.6

Let \(t>0\) be given. If \(n>r(d_{\ell }+6)\) then we take \(W\) and \(Z\) to be as in Theorem 2.3. Note that \(\mathbb {P}\{(\mathbb {P}(A))^{\ell }\tau _{A}>t\}=\mathbb {P}\{S^A_N=0\}\) and \(\mathbb {P}\{Z=0\}={\mathbb P}\{ W=0\}\), and so by Theorem 2.3,

$$\begin{aligned}&\left| \mathbb {P}\left\{ (\mathbb {P}(A))^{\ell }\tau _{A}>t\right\} -\mathbb {P}\{ W=0\}\right| = |\mathbb {P}\{ S^A_N=0\}-\mathbb {P}\{ Z=0\}|\nonumber \\&\quad \le 2^{2\ell +7}(1+\psi _0)^{2\ell }(t+1)\big (d_\ell \ell ^2 n^{4} e^{-\Gamma n/2}+\psi _{n}(1-e^{-{\Gamma }})^{-1}\big )\nonumber \\&\qquad +\,\,2\wp \big (2^\ell t\psi _n+10e^{-{\Gamma }n/2}(1+\psi _0)^{2\ell }d_\ell n^2(t+1)\big ). \end{aligned}$$
(5.21)

On the other hand, if \(n\le r(d_{\ell }+6)\) then by Theorem 2.1 (with \(q_{i}\)’s being linear),

$$\begin{aligned} \left| \mathbb {P}\left\{ (\mathbb {P}(A))^{\ell }\tau _{A}>t\right\} -P_s\{ 0\}\right| \le \left| \mathbb {P}\left\{ S^A_N=0\right\} -P_{t}\{0\}\right| +\left| P_{t}\{0\}-P_{s}\{0\}\right| . \end{aligned}$$
(5.22)

where \(s=t(1-\rho )\). Furthermore, \({\kappa }\ge \frac{r}{d_{1}}\), and so

$$\begin{aligned} \rho \!=\!\underset{i=1}{\overset{\ell }{\prod }}\mathbb {P}\left\{ R^{(n+d_{i} {\kappa })/r} \mid A\right\} \!\le \frac{\mathbb {P}(R^{(n+d_{1} {\kappa })/r})}{\mathbb {P}(A)} \le \psi _{0}\mathbb {P}(R^{(d_{1} {\kappa })/r})\!\le \!\psi _{0} e^{-\Gamma d_{1}{\kappa }}\!\le \!\psi _{0} e^{-\Gamma r}.\nonumber \\ \end{aligned}$$
(5.23)

By Theorem 2.1,

$$\begin{aligned}&\left| \mathbb {P}\{S_N^A=0\}-P_{t}\{0\}\right| \le 16e^{-{\Gamma }n}\left( \ell ^2nt+ {\gamma }(n)(1+t^{-1})\right) \nonumber \\&\quad +\,\,6(1+\psi _0)tn\ell ^2 e^{-\Gamma r}+2\wp \big (2^\ell t\psi _n+ {\gamma }(n)e^{-{\Gamma }n}\big ) \end{aligned}$$
(5.24)

and by (5.23),

$$\begin{aligned} |P_{t}\{0\}-P_{s}\{0\}|\le |t-s|\le t\psi _0 e^{-{\Gamma }r}. \end{aligned}$$
(5.25)

Taking into account that \({\gamma }(n)\le 2n\) when \(q_i\)’s are as in Theorem 2.3 and that \(r\ge n/(d_\ell +6)\) in (5.24) we obtain the estimate of Corollary 2.6 from (5.21), (5.22), (5.23), (5.24) and (5.25).\(\square \)
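
As an illustration of the hitting-time asymptotics of Corollary 2.6, the following Monte Carlo sketch treats the simplest i.i.d. situation. All concrete choices made here (the fair-coin measure, the word \(A=[0,1,1,1]\), \(\ell =2\) with \(q_1(k)=k\), \(q_2(k)=2k\), and the reading of \(\tau _A\) as the first \(k\) with \(X^A_k=1\), which is consistent with the identity \(\mathbb {P}\{(\mathbb {P}(A))^{\ell }\tau _{A}>t\}=\mathbb {P}\{S^A_N=0\}\) used above) are assumptions of this sketch only; since \(n\) is very small the error terms of the corollary are not negligible, so only rough agreement with \(e^{-t}\) should be expected.

```python
# Monte Carlo sketch (illustration only) of the hitting-time asymptotics behind
# Corollary 2.6 in a simple i.i.d. setting: fair-coin measure, word A = 0111
# (no self-overlaps), ell = 2 with q_1(k) = k, q_2(k) = 2k, and tau_A read as
# the first k with X^A_k = 1.  All of these concrete choices are assumptions.
import math, random

random.seed(2)
word = "0111"
n = len(word)
scale = (2.0 ** (-n)) ** 2          # (P(A))^ell for ell = 2
trials, length = 4000, 10_000       # number of trials and bits generated per trial

def hitting_time():
    omega = format(random.getrandbits(length), f"0{length}b")
    cap = (length - n) // 2
    for k in range(1, cap):
        # X_k = 1 iff the word occurs at positions q_1(k) = k and q_2(k) = 2k
        if omega[k:k + n] == word and omega[2 * k:2 * k + n] == word:
            return k
    return cap                      # essentially never reached for these sizes

taus = [scale * hitting_time() for _ in range(trials)]
for t in (0.5, 1.0, 2.0):
    empirical = sum(tau > t for tau in taus) / trials
    print(f"t={t}:  empirical survival = {empirical:.3f},  e^(-t) = {math.exp(-t):.3f}")
```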

5.4 Proof of Corollary 2.7

First, observe that our mixing conditions imply, in particular, that \({\mathbb P}\) is ergodic. Set \(\iota =\sup _{A\in {\mathcal C}_n,\, n\ge 1}\rho _A\) and \(b(n)=\frac{2\ln n}{1-\iota }\) recalling that \(\iota <1\) by (2.12). Now, by Corollary 2.6 for any \(A\in {\mathcal C}_n\) with \({\mathbb P}(A)>0\),

$$\begin{aligned}&{\mathbb P}\left\{ ({\mathbb P}(A))^\ell \tau _A>b(n)\right\} \le e^{-(1-\iota )b(n)}+2^{2\ell +8}(1+\psi _0)^{2\ell }(b(n)+1)\nonumber \\&\quad \times \,\,\bigg ( d_\ell \ell ^2 n^4\bigg (1+ \frac{1}{b(n)}\bigg )e^{-{\Gamma }n/(d_\ell +6)}+\psi _n(1-e^{-{\Gamma }})^{-1}\bigg )\nonumber \\&\quad +\,\,2\wp \big (2^\ell b(n)\psi _n+10e^{-{\Gamma }n/2}(1+\psi _0)^{2\ell }d_\ell n^2 (b(n)+1)\big )\end{aligned}$$
(5.26)

and

$$\begin{aligned}&{\mathbb P}\left\{ ({\mathbb P}(A))^\ell \tau _A\le n^{-2}\right\} =1-{\mathbb P}\left\{ ({\mathbb P}(A))^\ell \tau _A >n^{-2}\right\} \nonumber \\&\quad \le \bigg |1-e^{-(1-\rho _A)n^{-2}}\bigg |+\bigg |{\mathbb P}\left\{ ({\mathbb P}(A))^\ell \tau _A>n^{-2}\right\} - e^{-(1-\rho _A)n^{-2}}\bigg |\nonumber \\&\quad \le n^{-2}+2^{2\ell +8}(1+\psi _0)^{2\ell }\big (n^{-2}+1\big )\big ( d_\ell \ell ^2 n^4 \big (1+n^2\big )e^{-{\Gamma }n/(d_\ell +6)}\nonumber \\&\qquad +\psi _n\big (1-e^{-{\Gamma }}\big )^{-1}\big )+2\wp \big (2^\ell n^{-2}\psi _n+10 e^{-{\Gamma }n/2}(1+\psi _0)^{2\ell }d_\ell (1+n^2)\big ). \end{aligned}$$
(5.27)

Applying (5.26) and (5.27) to \(A=A^{\omega }_n\) with \({\omega }\in {\Omega }_{\mathbb P}\) we obtain by the Borel–Cantelli lemma that there exists a random variable \(m=m(\varpi )\) on \({\Omega }\) which is finite \({\mathbb P}\)-a.s. and such that for all \(n\ge m(\varpi )\),

$$\begin{aligned} n^{-2}<({\mathbb P}(A_n^{\omega }))^\ell \tau _{A^{\omega }_n}(\varpi )\le b(n) \end{aligned}$$
(5.28)

which implies (2.18). Finally, if (2.19) holds true then (2.20) follows from (2.18) and the Shannon–McMillan–Breiman theorem (see, for instance, [24]).\(\square \)

6 I.I.D. case

6.1 Proof of Theorem 2.8

We use the same notation as in the proof of Theorem 2.3 and, as there, we can assume that \(\lambda >0\) and \(N>K=5d_\ell rn\). In order to derive (2.21) we will estimate \(|\mathbb {P}\{Z\in L\}-\mathbb {P} \{Y\in L\}|\) for any \(L\subset \mathbb {N}\) and combine it with (2.14). Observe that under the assumption that the coordinate projections from \(\Omega \) onto \(\mathcal {A}\) are i.i.d. random variables it follows that

$$\begin{aligned} \rho =\underset{i=1}{\overset{\ell }{\prod }}\mathbb {P}\{R^{(n+d_{i} {\kappa })/r} \mid A\}=\underset{i=1}{\overset{\ell }{\prod }}\mathbb {P}(R^{(d_{i} {\kappa })/r}) =\big ({\mathbb P}(R))^{k_0}\le \mathbb {P}(R), \end{aligned}$$
(6.1)

where \(k_0=\frac{{\kappa }}{r}\sum _{i=1}^\ell d_i\), and also that \(\lambda _{\alpha ,j}=(1-\rho )^{2}(\mathbb {P}(A))^{\ell }\rho ^{j-1}\) for each \((\alpha ,j)\in I\). Let \(\eta _1,\eta _2,\ldots \) be the i.i.d. random variables constructed in the proof of Theorem 2.3. Then, for each \( j=1,2,\ldots ,n_{0}=[n/r]\),

$$\begin{aligned}&\mathbb {P}\{\eta _{1}=j\}=\lambda _{j}=\bigg ((N-K)\underset{l=1}{\overset{n_{0}}{\sum }}\lambda _{N,l}\bigg )^{-1}(N-K)\lambda _{N,j}\\&\quad =\bigg ((1-\rho )^{2}(\mathbb {P}(A))^{\ell }\underset{l=1}{\overset{n_{0}}{\sum }} \rho ^{l-1}\bigg )^{-1}(1-\rho )^{2}(\mathbb {P}(A))^{\ell }\rho ^{j-1}\\&\quad =\bigg ((1-\rho )\underset{l=1}{\overset{n_{0}}{\sum }}\rho ^{l-1}\bigg )^{-1} \mathbb {P}\{\zeta _{1}=j\}=\mathbb {P}\{\zeta _{1}=j\mid \zeta _{1}\in \{1,\ldots ,n_{0}\}\} \end{aligned}$$

where \(\zeta _1,\zeta _2,\ldots \) are i.i.d. random variables described in the statement of Theorem 2.8. Furthermore, for any \(j>n_{0}\),

$$\begin{aligned} \mathbb {P}\{\eta _{1}=j\}=0=\mathbb {P}\{\zeta _{1}=j\mid \zeta _{1}\in \{1,\ldots , n_{0}\}\} \end{aligned}$$

Hence, by independence, for any \(l\ge 1\) and each \(j_{1},\ldots ,j_{l}\in \mathbb {N}^{+}\),

$$\begin{aligned} \mathbb {P}\{\eta _{1}=j_{1},\ldots ,\eta _{l}=j_{l}\}=\mathbb {P}\{\zeta _{1}=j_{1},\ldots , \zeta _{l}=j_{l}\mid \zeta _{1},\ldots ,\zeta _{l}\in \{1,\ldots ,n_{0}\}\}. \end{aligned}$$

It follows from here that

$$\begin{aligned} \mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\eta _{k}\in L\right\} =\mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\zeta _{k}\in L\mid \zeta _{1},\ldots ,\zeta _{l}\in \{1,\ldots , n_{0}\}\right\} . \end{aligned}$$

Introduce the event \(\Psi =\{\exists k\le l,k\ge 1:\, \zeta _{k}\notin \{1,\ldots ,n_{0}\}\}\). Then by the law of total probability it follows that

$$\begin{aligned}&\left| \mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\zeta _{k}\in L\right\} -\mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\eta _{k}\in L\right\} \right| \nonumber \\&\quad =\left| \mathbb {P}\{\zeta _{1},\ldots ,\zeta _{l}\in \{1,\ldots ,n_{0}\}\}\mathbb {P} \left\{ \underset{k=1}{\overset{l}{\sum }}\zeta _{k}\in L\mid \zeta _{1},\ldots ,\zeta _{l}\in \{1,\ldots ,n_{0}\}\right\} \right. \nonumber \\&\qquad \left. +\,\,\mathbb {P}(\Psi )\mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\zeta _{k}\in L\mid \Psi \right\} -\mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\zeta _{k}\in L\mid \zeta _{1},\ldots ,\zeta _{l}\in \{1,\ldots ,n_{0}\}\right\} \right| \nonumber \\&\quad \le 2\mathbb {P}(\Psi )\le 2\underset{k=1}{\overset{l}{\sum }}\mathbb {P}\{ \zeta _{k} \notin \{1,\ldots ,n_{0}\}\}=2l\rho ^{n_{0}}. \end{aligned}$$

Set again \(s=t(1-\rho )\). Then

$$\begin{aligned}&\left| \mathbb {P}\{Z\in L\}-\mathbb {P}\{Y\in L\}\right| \le \underset{l=0}{\overset{\infty }{\sum }}\mathbb {P}\{W=l\}\left| \mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }} \eta _{k}\in L\right\} -\mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\zeta _{k}\in L\right\} \right| \\&\quad \le \underset{l=1}{\overset{n_{0}}{\sum }}\left| \mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\eta _{k}\in L\right\} -\mathbb {P}\left\{ \underset{k=1}{\overset{l}{\sum }}\zeta _{k}\in L\right\} \right| +\mathbb {P}\{W>n_{0}\}\\&\quad \le \underset{l=1}{\overset{n_{0}}{\sum }}2l\rho ^{n_{0}}+ e^{-s}\left( e^{s}-\underset{l=0}{\overset{n_{0}}{\sum }}\frac{s^{l}}{l!}\right) \le 2n_{0}^{2}\rho ^{n_{0}}+\frac{t^{n_{0}+1}}{(n_{0}+1)!}. \end{aligned}$$

Taking into account that \(\psi _n\equiv 0\) under the independence assumption, (2.21) follows from here together with (2.14), (6.1) and Lemma 3.1, completing the proof of Theorem 2.8.\(\square \)
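
The computation of \({\lambda }_j\) above indicates that \(\zeta _1\) is geometrically distributed with \(\mathbb {P}\{\zeta _{1}=j\}=(1-\rho )\rho ^{j-1}\), \(j\ge 1\). Under this reading, a minimal Python sketch of the limiting compound Poisson law appearing in Theorem 2.8 is given below; it samples \(Y=\zeta _1+\cdots +\zeta _W\) with \(W\) Poisson distributed with parameter \(s=t(1-\rho )\) and checks that \(EY=t\) and \(\mathbb {P}\{Y=0\}=e^{-s}\). The numerical values of \(t\) and \(\rho \) are arbitrary, and the sketch illustrates the limiting law rather than the construction in the proof.

```python
# Minimal sampler (illustration only) of a compound Poisson law with geometric
# cluster sizes: Y = zeta_1 + ... + zeta_W, W ~ Poisson(t*(1-rho)),
# P{zeta = j} = (1 - rho) * rho**(j - 1), j >= 1.  Values of t, rho are arbitrary.
import math, random

random.seed(3)
t, rho = 2.0, 0.3
s = t * (1 - rho)

def sample_poisson(mean):
    L, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def sample_geometric(rho):
    # number of Bernoulli trials up to and including the first success of probability 1 - rho
    j = 1
    while random.random() < rho:
        j += 1
    return j

def sample_Y():
    return sum(sample_geometric(rho) for _ in range(sample_poisson(s)))

samples = [sample_Y() for _ in range(100_000)]
print("mean of Y :", round(sum(samples) / len(samples), 3), " (should be close to t =", t, ")")
print("P{Y = 0}  :", round(samples.count(0) / len(samples), 3),
      " (exp(-s) =", round(math.exp(-s), 3), ")")
```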

6.2 Proof of Theorem 2.9

In the i.i.d. case the assertion (i) can be derived easily by studying the asymptotic behavior of the compound Poisson distribution constructed in Theorem 2.8 but, since (i) follows from the more general result of Corollary 2.4, we will now prove only the assertion (ii). Assume that \({\omega }\) is periodic with period \(d\in \mathbb {N}^{+}\). From Lemma 3.7 it follows that there exists a positive integer \(M\ge 1\) such that \(r_{n}^{{\omega }}=d\) for all \(n\ge M\). Next, by (6.1), for any \(n\ge M\),

$$\begin{aligned} \rho _{n}^{{\omega }}=\big (\mathbb {P}([{\omega }_{0},\ldots ,{\omega }_{r_{n}^{{\omega }}-1}])\big )^{k_0} =\big (\mathbb {P}([{\omega }_{0},\ldots ,{\omega }_{d-1}])\big )^{k_0}=\rho ^{{\omega }} \end{aligned}$$

which does not depend on \(n\). Hence, for all \(n\ge \max (M,d(d_\ell +6))\) we can apply Theorem 2.8 in order to obtain that

$$\begin{aligned}&\underset{L\subset \mathbb {N}}{\sup }\left| \mathbb {P}\{U_{n}^{{\omega }}\in L\}- \mathbb {P}\{Y\in L\}\right| \le 2^{2\ell +8}(t+1)d_\ell \ell ^2 n^{4}e^{-{\Gamma }n/2}\\&\quad +\,\,2\wp \big (10d_\ell n^2(t+1)e^{-{\Gamma }n/2}\big )+\frac{t^{[n/d]+1}}{([n/d]+1)!} \end{aligned}$$

where \(\Gamma =\min \{-\ln (\mathbb {P}\{\omega _{0}=a\})\,:\, a\in \mathcal {A}\}\). The expression on the right hand side of the last inequality tends to \(0\) as \(n\rightarrow \infty \), and so (ii) is proved.\(\square \)

7 A nonconvergence example

We assume here that \({\mathcal A}=\{ 0,1\}\) and consider the probability space \(({\Omega },{\mathcal F},{\mathbb P})\) such that \({\Omega }=\{ 0,1\}^{\mathbb N},\, {\mathcal F}\) is the \({\sigma }\)-algebra generated by cylinder sets and \(\mathbb {P}\) is a probability measure on \(\Omega \) such that the coordinate projections \(\{\omega _{j}\}_{j=0}^{\infty }\) from \(\Omega \) onto \(\{0,1\}\) are i.i.d. random variables with \(\mathbb {P}\{\omega _{0}=0\}=p_{0}=1-p_{1}=1-\mathbb {P}\{\omega _{0}=1\}\) where \(p_{0},p_{1}>0\), \(p_{0}+p_{1}=1\) and \(p_{0}\ne p_{1}\). As before \(T\) will denote the left shift on \({\Omega }\) and we introduce here another map \(S:\,{\Omega }\rightarrow {\Omega }\) acting by \((S{\omega })_n={\omega }_n+{\omega }_{n+1}\) (mod 2) for any \(n\ge 0\) and \({\omega }=({\omega }_0,{\omega }_1,\ldots )\).

Set \({\mathbb P}_0=S{\mathbb P}\), which is the probability measure on \({\Omega }\) defined by \({\mathbb P}_0(U)={\mathbb P}(S^{-1}U)\) for any \(U\in {\mathcal F}\). We claim that \({\mathbb P}_0\) is \(T\)-invariant. Indeed, let \(A=[a_0,a_1,\ldots ,a_{n-1}]\) be a cylinder set; then

$$\begin{aligned} S^{-1}A=[0,{\alpha }_1,\ldots ,{\alpha }_n]\cup [1,{\beta }_1,\ldots ,{\beta }_n] \end{aligned}$$
(7.1)

where \({\alpha }_i+{\beta }_i=1\) for all \(i=1,\ldots ,n\) and \(a_i={\alpha }_i+{\alpha }_{i+1}\) (mod 2) \(={\beta }_i+{\beta }_{i+1}\) (mod 2) for \(i=0,1,\ldots ,n-1\) where \({\alpha }_0=0\) and \({\beta }_0=1\). Similarly,

$$\begin{aligned} S^{-1}T^{-1}A&= [0,0,{\alpha }_1,\ldots ,{\alpha }_n]\cup [1,0,{\alpha }_1,\ldots ,{\alpha }_n]\cup [1,1,{\beta }_1,\ldots ,{\beta }_n]\nonumber \\&\quad \cup [0,1,{\beta }_1,\ldots ,{\beta }_n]. \end{aligned}$$
(7.2)

It follows from (7.1) and (7.2) that

$$\begin{aligned} {\mathbb P}_0(T^{-1}A)={\mathbb P}(S^{-1}T^{-1}A)={\mathbb P}(S^{-1}A)={\mathbb P}_0(A). \end{aligned}$$
(7.3)

Since (7.3) holds true for all cylinder sets, it remains true for their disjoint unions, and since it is preserved under monotone limits we obtain that (7.3) is satisfied for any \(A\in {\mathcal F}\), which proves our claim.

Next, we will show that \({\mathbb P}_0\) has \(\psi \)-mixing coefficients \(\psi _l\) equal to zero for all \(l\ge 1\), while \(\psi _0<\infty \). First, observe that \(\psi _l\) can be defined using only cylinder sets, namely, \(\psi _l\) is the infimum of constants \(M\ge 0\) such that for each \(n\ge 0\) and any cylinder sets \(A\in {\mathcal F}_{0,n}\) and \(B\in {\mathcal F}_{n+l+1,\infty }\) (as defined at the beginning of Sect. 2),

$$\begin{aligned} |{\mathbb P}_0(A\cap B)-{\mathbb P}_0(A){\mathbb P}_0(B)|\le M{\mathbb P}_0(A){\mathbb P}_0(B). \end{aligned}$$
(7.4)

Indeed, if (7.4) holds true for such cylinder sets then it remains true for their corresponding disjoint unions and it is preserved under monotone limits, which yields that (7.4) is satisfied for any sets \(A\in {\mathcal F}_{0,n}\) and \(B\in {\mathcal F}_{n+l+1,\infty }\), proving the assertion. Now, if \(A\in {\mathcal F}_{0,n}\) and \(B\in {\mathcal F}_{n+l+1,\infty }\) are cylinder sets then, analysing their preimages \(S^{-1}A\) and \(S^{-1}B\) similarly to (7.1), we conclude that \(S^{-1}A\in {\mathcal F}_{0,n+1}\) and \(S^{-1}B\in {\mathcal F}_{n+l+1,\infty }\). Hence, if \(l\ge 1\) then \(S^{-1}A\) and \(S^{-1}B\) are independent events with respect to the probability \({\mathbb P}\), and so

$$\begin{aligned} {\mathbb P}_0(A\cap B)-{\mathbb P}_0(A){\mathbb P}_0(B)={\mathbb P}(S^{-1}A\cap S^{-1}B)-{\mathbb P}(S^{-1}A) {\mathbb P}(S^{-1}B)=0 \end{aligned}$$

implying that \(\psi _l=0\) in this case.

In order to estimate \(\psi _0\) we observe that if \(B=[b_{n},b_{n+1},\ldots , b_{n+m-1}]=\{{\omega }=({\omega }_0,{\omega }_1,\ldots )\,:\,{\omega }_i=b_i\,\text { for }\,n\le i\le n+m-1\}\) then

$$\begin{aligned} S^{-1}B=[{\gamma }_{n},{\gamma }_{n+1},\ldots ,{\gamma }_{n+m}]\cup [{\delta }_{n},{\delta }_{n+1},\ldots ,{\delta }_{n+m}] \end{aligned}$$

where \({\gamma }_i+{\delta }_i=1\) for \(i=n,\ldots ,n+m\) and \({\gamma }_i+{\gamma }_{i+1}\) (mod 2)\(= {\delta }_i+{\delta }_{i+1}\) (mod 2)\(=b_i\) for \(i=n,\ldots ,n+m-1\). This together with (7.1) yields that

$$\begin{aligned} {\mathbb P}_0(A\cap B)={\mathbb P}(S^{-1}A\cap S^{-1}B)=\prod _{i=0}^np_{{\alpha }_i} \prod _{j=n+1}^{n+m}p_{{\gamma }_j}+\prod _{i=0}^np_{{\beta }_i} \prod _{j=n+1}^{n+m}p_{{\delta }_j}, \end{aligned}$$

where \({\alpha }_0=0,\, {\beta }_0=1\) and without loss of generality we assume that \({\alpha }_n={\gamma }_{n}\) and \({\beta }_n={\delta }_{n}\). On the other hand,

$$\begin{aligned} {\mathbb P}_0(A)&= {\mathbb P}(S^{-1}A)=\prod _{i=0}^np_{{\alpha }_i}+\prod _{i=0}^np_{{\beta }_i}\,\, \text{ and }\\ {\mathbb P}_0(B)&= {\mathbb P}(S^{-1}B)=\prod _{j=n}^{n+m}p_{{\gamma }_j}+ \prod _{j=n}^{n+m}p_{{\delta }_j}. \end{aligned}$$

It follows that

$$\begin{aligned} \frac{{\mathbb P}_0(A\cap B)}{{\mathbb P}_0(A){\mathbb P}_0(B)}\le 2\left( p^{-1}_{{\gamma }_{n}}+ p_{{\delta }_{n}}^{-1}\right) =2\left( p^{-1}_0+p^{-1}_1\right) . \end{aligned}$$

Hence, \(\psi _0\le 1+2(p_0^{-1}+p_1^{-1})<\infty \) as required.

Let \({\omega }=1^{\infty }\in \Omega \) and \(t>0\). For each \(n\ge 1\) let \(A_{n}^{{\omega }},\, N_{n}^{{\omega }}\) and \(U_{n}^{{\omega }}=S^{A^{\omega }_n}_{N^{\omega }_n}\) be as defined in Sect. 2. We now show that the sequence \(\{U_{n}^{{\omega }}\}_{n=1}^{\infty }\) does not converge in distribution when we take \(\mathbb {P}_{0}\) as the measure on \(\Omega \). For two strings \(u\) and \(v\) of digits 0 and 1 we denote by \(u\cdot v\) the concatenation of \(u\) with \(v\), and for any integer \(k\ge 1\) we denote by \(u^{k}\) the concatenation of \(u\) with itself \(k\) times. For every even \(n\ge 1\),

$$\begin{aligned} \mathbb {P}_{0}(1^{n})=\mathbb {P}\left( [[1,0]^{n/2}\cdot 1]\right) + \mathbb {P}\left( [[0,1]^{n/2}\cdot 0]\right) =p_{1}^{\frac{n}{2}+1} p_{0}^{\frac{n}{2}}+p_{0}^{\frac{n}{2}+1} p_{1}^{\frac{n}{2}}= (p_{1} p_{0})^{\frac{n}{2}} \end{aligned}$$

and for every odd \(n\ge 1\),

$$\begin{aligned} \mathbb {P}_{0}(1^{n})=\mathbb {P}\left( [[1,0]^{(n+1)/2}]\right) +\mathbb {P} \left( [[0,1]^{(n+1)/2}]\right) =2(p_{1} p_{0})^{\frac{n+1}{2}}. \end{aligned}$$

From this it follows that for every even \(n\ge 1\),

$$\begin{aligned} \rho _{A^{\omega }_n}=\mathbb {P}_{0}\left( 1^{n+1}\mid 1^{n}\right) =\frac{2(p_{1} p_{0})^{\frac{n+2}{2}}}{(p_{1}p_{0})^{\frac{n}{2}}}=2 p_{1} p_{0} \end{aligned}$$

while for every odd \(n\ge 1\),

$$\begin{aligned} \rho _{A^{\omega }_n}=\mathbb {P}_{0}\left( 1^{n+1}\mid 1^{n}\right) =\frac{(p_{1} p_{0})^{\frac{n+1}{2}}}{2(p_{1} p_{0})^{\frac{n+1}{2}}}=\frac{1}{2}. \end{aligned}$$

Setting \(\theta _{n}=\mathbb {P}_{0}\{U_{n}^{{\omega }}=0\}\) for each \(n\ge 1\) we obtain from here together with Theorem 2.3 that

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\theta _{2n}=e^{-t(1-2p_{1}p_{0})}\,\, \text{ and }\,\,\underset{n\rightarrow \infty }{\lim }\theta _{2n+1}= e^{-\frac{t}{2}}. \end{aligned}$$

Since \(p_0\ne p_1\) we have \(1-2p_{1}p_{0}\ne \frac{1}{2}\), and so the sequence \(\{U_{n}^{{\omega }}\}_{n=1}^{\infty }\) does not converge in distribution when we take \(\mathbb {P}_{0}\) as the measure on \(\Omega \).
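
The values of \(\mathbb {P}_{0}(1^{n})\) and of \(\rho _{A^{\omega }_n}\) computed above can be verified directly by enumerating the two preimage cylinders of \(S\) as in (7.1). The short Python sketch below is an illustration only, with an arbitrary admissible choice of \(p_{0}\ne p_{1}\).

```python
# Direct numerical check (illustration only) of the formulas above: P_0([1^n])
# and the ratio P_0(1^{n+1} | 1^n) for an arbitrary choice of p_0 != p_1.
p = {0: 0.3, 1: 0.7}                     # p_0 and p_1

def P0_cylinder(word):
    """P_0([word]) = P(S^{-1}[word]), summing over the two preimage cylinders as in (7.1)."""
    total = 0.0
    for first in (0, 1):
        pre = [first]
        for a in word:                   # (S omega)_i = omega_i + omega_{i+1} (mod 2)
            pre.append((pre[-1] + a) % 2)
        prob = 1.0
        for b in pre:
            prob *= p[b]
        total += prob
    return total

for n in (6, 7):
    ones = (1,) * n
    rho = P0_cylinder(ones + (1,)) / P0_cylinder(ones)
    expected = 2 * p[0] * p[1] if n % 2 == 0 else 0.5
    print(f"n={n}:  P_0(1^n)={P0_cylinder(ones):.6f},  rho={rho:.4f},  expected {expected:.4f}")
```

For even \(n\) this returns \(2p_{0}p_{1}\) and for odd \(n\) it returns \(\frac{1}{2}\), matching the two limits of \(\theta _{n}\) above.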