1 Introduction

A major open problem in ergodic theory is Rokhlin’s question on whether mixing implies mixing of all orders, also called multiple mixing [26]. In most of the known examples of mixing dynamical systems, multiple mixing is now known to hold. Moreover, a positive answer to Rokhlin’s question is actually known to generally hold within various classes of mixing dynamical systems. The most noteworthy are K-systems where multiple mixing always holds [4], horocycle flows [23], mixing systems with singular spectrum that display multiple mixing by a celebrated theorem of Host [13], and finite rank systems since Kalikow showed that rank one and mixing implies multiple mixing [14], a result that was extended to finite rank mixing systems by Ryzhikov [30].

In the second half of the last century, it was discovered that mixing is possible for a class of smooth area preserving flows on surfaces called multi-valued Hamiltonians since they are locally given by Hamilton’s equations. The Hamiltonian \(H\) in question is associated to a closed differential \(1\)-form \(\omega =dH\), the form \(\omega \) being globally defined—but of course this is not necessarily the case for \(H\). The possibility of mixing for these flows was studied in two different cases: first, by Kochergin who obtained mixing when \(\omega \) has higher order zeros and thus the flow has degenerate saddles [17], and then by Arnold [1] who suggested that mixing is possible on a minimal component even in the case where \(\omega \) is Morse but the saddles on the minimal component appear in asymmetric configurations, for example because of a saddle loop. That mixing was indeed possible in the latter context was proved in a particular case by Khanin and Sinai [32]. Absence of mixing in the case of Morse forms was obtained by Kochergin in some particular cases [18] (see also [22]), and later generalized to a typical one form by Ulcigrai [35].

Considering a \(1\)-dimensional section of a multi-valued Hamiltonian flow allows one to view the dynamics on its minimal components as special flows above interval exchange transformations (IET), which in particular situations can be circular rotations. The case of non-degenerate saddles corresponds to ceiling functions with logarithmic singularities, while the case of degenerate saddle points corresponds to ceiling functions with stronger singularities such as integrable power like singularities. In the case of power singularities, [17] proved mixing above any ergodic IET, while [32] established mixing in the case of a single asymmetric logarithmic singularity above a circular rotation with a typical frequency.

The study of the mixing properties of surface flows has seen a revival of interest since the beginning of the 2000s, with results such as the computation of the speed of mixing [6] or extensions of the Khanin-Sinai mixing result to include all irrational translation vectors [19, 20] (see also [21]), or advances in the study of multi-valued Hamiltonian flows on surfaces in the general case where the Poincaré section return map is an IET and not just a circular rotation [3, 35–37].

Mixing surface flows stand today as the main and almost only natural class of mixing transformations for which higher order mixing has been neither established nor disproved. Our aim here is to prove mixing of all orders for a subclass of these systems given by special flows above circular rotations, with ceiling functions having asymmetric logarithmic singularities or integrable power like singularities. For convenience, we will speak of Arnold flows in the first case, and Kochergin flows in the second. Our results will depend on the arithmetic properties of the frequency \(\alpha \) of the rotation on the base, which determines the slope of the unique rotation vector of the corresponding surface flow.

Loosely speaking our main result is as follows (it will be made precise at the end of this introduction, see Corollaries 1.6 and 1.9).

Theorem

Arnold flows are mixing of all orders for a set of \(\alpha \in (0,1)\) of full Lebesgue measure and Kochergin flows are mixing of all orders for a set of \(\alpha \in (0,1)\) of full Hausdorff dimension. This is true in the case of a single singularity and the same result holds if there are many singularities, under a non resonance condition (of full Hausdorff dimension) between the positions of the singularities and the base frequency \(\alpha \).

Similar mixing mechanisms due to orbit shear as in Kochergin and Arnold flows were observed relatively recently such as in [2, 3, 7, 35] and it should be possible to apply the techniques of the current paper to the study of higher order mixing for such parabolic systems.

To explain our approach we need first to make a detour by Ratner’s study of horocycle flows. In the 1980s, Ratner developed a rich machinery to study horocycle flows [27–29] and, in particular, singled out a special way of controlled slow divergence of orbits of nearby points which resulted in the notion of \(H_p\)-property, later called R-property by Thouvenot [33]. This property, to which we will come back in more detail in the sequel, has important dynamical consequences, mainly expressed by a restriction on the possible joining measures of a system having the R-property with other systems, and in particular with itself.

A joining between two dynamical systems \((T,X,\fancyscript{B},\mu )\) and \((S,Y,\fancyscript{C},\nu )\), where \((X,\fancyscript{B},\mu )\) and \((Y,\fancyscript{C},\nu )\) are standard Borel probability spaces, is a measure \(\rho \) on \(X \times Y\) invariant under \(T \times S\) whose marginals on \(X\) and \(Y\) are \(\mu \) and \(\nu \). The definition for flows is similar. An important notion in Ratner’s theory is that of finite extension joinings (FEJ).

Definition 1.1

An ergodic flow \((T_t)_{t\in \mathbb {R}}\) is said to have the FEJ-property, an acronym for finite extension joining, if for every ergodic flow \((S_t)_{t\in \mathbb {R}}\) acting on \((Y,\fancyscript{C},\nu )\) and every ergodic joining \(\rho \) of \((T_t)_{t\in \mathbb {R}}\) and \((S_t)_{t\in \mathbb {R}}\) different from the product measure \(\mu \times \nu \), \(\rho \) yields a flow which is a finite extension of \((S_t)_{t\in \mathbb {R}}\).

It was shown in [31] that a mixing flow with FEJ-property is mixing of all orders. Moreover, it was proved in [27] that a flow with R-property has the FEJ-property. It follows that mixing flows with the R-property are mixing of all orders. Since the R-property for horocycle flows stemmed from polynomial shear along the orbits, and since Kochergin flows displayed a similar polynomial shear along the orbits, the idea that special flows over rotations may enjoy the R-property, and thus be multiple mixing, was then suggested by J-P. Thouvenot in the 1990s (see p. 2 in [9]).

However, whether natural classes of special flows (not necessarily mixing) over irrational rotations may have the R-property remained open until Fra̧czek and Lemańczyk [9, 10] showed that a generalized R-property holds in some classes of special flows with roof functions of bounded variation (which, by [16], are not mixing). More precisely, they introduced a weaker notion than the R-property, called the weak Ratner or WR-property, which nevertheless still implies the FEJ-property (see Definition 2.1 and the comment after it).

Unfortunately, in the mixing examples of special flows under piecewise convex functions with singularities such as Arnold and Kochergin flows, the shear may occur very abruptly as orbits approach the singularity and this may prevent them from having the weak Ratner property. Indeed, we believe that these flows do not have the WR-property (it was observed by the first author that Arnold and Kochergin flows do not have a natural strong Ratner property as described in the next paragraph). This is corroborated by the following result that shows that Kochergin flows, in the context of bounded type frequency in the base (that is a priori favorable to controlling the shear), do not have the WR-property.

Theorem 1

Let \(\alpha \in \mathbb {R}\) be irrational of bounded type and \(f(x)=x^\gamma +r, -1<\gamma <0, r>0\). Then the special flow \((T^t_{\alpha ,f})_{t\in \mathbb {R}}\) defined above the circle rotation \(R_\alpha \) and under the ceiling function \(f\) does not have the WR-property.

Here the circle is \(\mathbb {T}=\mathbb {R}/ \mathbb {Z}\). We recall that an irrational \(\alpha \) is said to be of bounded type or \(\alpha \in \mathrm{DC}(0)\), if the partial quotients in the continued fraction of \(\alpha \) are bounded, i.e. if \(\alpha =[a_0;a_1,a_2,\ldots ]\) and there exists \(K>0\) such that \(a_i <K\) for all \(i \geqslant 1\). We refer to Sect. 3 for the exact definition of special flows. Theorem 1 has another consequence. It is known that every horocycle flow \((h_t)_{t\in \mathbb {R}}\) is loosely Bernoulli [29]; therefore, for every irrational \(\alpha \), there exists a positive function in \(f\in L^1(\mathbb {T})\) such that \((h_t)_{t\in \mathbb {R}}\) is measurably isomorphic to the special flow \((T^t_{\alpha ,f})_{t\in \mathbb {R}}\) [25]. It follows from [16] and the fact that \((h_t)_{t\in \mathbb {R}}\) is mixing that \(f\) is of unbounded variation. Moreover, by [24], \(f\) can be made \(C^1\) except for one point. Since the R-property implies the WR-property and the R-property is an isomorphism invariant, no special flow as in Theorem 1 is isomorphic to a horocycle flow. Actually, this line of thought can be extended to show that horocycle flows are never isomorphic to special flows above an irrational rotation and under a roof function that is convex and \(C^2\) except at one point. For the latter result, one needs to introduce the concept of strong Ratner property, which is also an isomorphism invariant, that specifies the occurrence of slow divergence of nearby orbits to the first time when the orbits do split apart. This is the natural property that Ratner actually obtains for horocycle flows, and it is relatively easy to show that the Kochergin flows do not have it. 
What is more complicated in the proof of the absence of the general R-property for special flows is to make sure that the slow divergence does not occur much later in the future, after the nearby points have split and then come back together (see the proof of Theorem 1).

To bypass Theorem 1 and still use controlled divergence of orbits to show multiple mixing, our approach will be to further weaken the WR-property. Namely, we introduce the SWR-property, which stands for switchable weak Ratner property, that assumes that a pair of nearby points displays the WR-Property either under forward iteration in time or under backward iteration, and this depending on the pair of points. We show that the SWR-property is sufficient to guarantee the same FEJ consequences as the Ratner or the weak Ratner property. Consequently, a mixing flow enjoying the SWR-property is mixing of all orders.

The main idea in showing that Arnold and Kochergin special flows may have the SWR-Property is the following. For these flows, the main contribution to the shear between orbits is due to the visits of the flow lines to the neighborhood of the singularities. With the representation of these flows as special flows above irrational rotations, the shear is translated into the divergence between the Birkhoff sums of the roof functions for nearby points, and this divergence is mainly due to the visits under the base rotation to the neighborhoods of the points where the roof function has its singularities. If the base rotation angle \(\alpha \) is of bounded type, two nearby points will accumulate sufficient stretch while staying sufficiently far from the singularity either when they are iterated forward or when they are iterated backward. In the case of ceiling functions with only logarithmic singularities we are also able to exploit the progressive contribution to the shear of these visits to the singularities to obtain multiple mixing for a full measure set of numbers \(\alpha \).

We now introduce some notations related to the ceiling functions that will be considered and their singularities, after which we will be able to state our exact results on the SWR-property and multiple mixing.

Definition 1.2

Let \(h\) be a positive function, \(h\in C^2(\mathbb {T}\setminus \{0\})\), decreasing on \((0,1)\), with \(\lim _{x\rightarrow 0^+}h(x)=+\infty \) and \(h'\) increasing on \((0,1)\). Let \(f\in C^2(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\) for some numbers \(a_1,\ldots ,a_k\in \mathbb {T}\). We say that \(f\) has singularities of type \(h\) at \(\{a_1,\ldots ,a_k\}\) if

$$\begin{aligned} \lim _{x\rightarrow a_i^+}\frac{f'(x)}{h'(x-a_i)}=A_i\;\; \quad \text {and}\;\; \lim _{x\rightarrow a_i^-}\frac{f'(x)}{h'(a_i-x)}=-B_i, \end{aligned}$$
(1)

for some numbers \(A_i,B_i\geqslant 0, i=1,\ldots ,k\).

Notice that in this definition \(h\) may only reflect a domination on the singularities of \(f\) since the coefficients \(A_i,B_i\) may be equal to zero at some or at all \(i\)’s.

Definition 1.2 will allow us to state our results with some flexibility on the singularities but the reader should keep in mind that the results will target functions that are essentially of the form \(A \ln (a-x)\) or \(A (a-x)^{\gamma }, \gamma \in (-1,0)\), when \(x\) is in a left neighborhood of a singularity \(a\) and similar form on the right side of a singularity. In the case of Arnold flows all the singularities will be assumed to be essentially of logarithmic type (see more general definition in Sect. 1.1) while in the case of Kochergin flows we will be dealing with functions having at least one power like singularity while the other singularities are supposed to be of equal or weaker type (see Sect. 1.2).
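As a hedged illustration of Definition 1.2 (the specific function below is our own example and is not one of the results of the paper), take \(h(x)=-\log (x)\), a single singularity \(a_1=0\), constants \(A,B\geqslant 0\) not both zero, and \(f(x)=-A\log (x)-B\log (1-x)+1\) for \(x\in (0,1)\). Writing points in a left neighborhood of \(0\in \mathbb {T}\) as \(1-y\) with \(y\rightarrow 0^+\), one finds

$$\begin{aligned} \frac{f'(x)}{h'(x)}&=\frac{-\frac{A}{x}+\frac{B}{1-x}}{-\frac{1}{x}}=A-\frac{Bx}{1-x}\longrightarrow A \quad (x\rightarrow 0^+),\\ \frac{f'(1-y)}{h'(y)}&=\frac{-\frac{A}{1-y}+\frac{B}{y}}{-\frac{1}{y}}=\frac{Ay}{1-y}-B\longrightarrow -B \quad (y\rightarrow 0^+), \end{aligned}$$

so that (1) holds with \(A_1=A\) and \(B_1=B\). In this model case the singularity is asymmetric exactly when \(A\ne B\).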

Furthermore, our results can deal with functions having several singularities but require a non resonance condition between these singularities and the rotation frequency \(\alpha \) in the base of the special flow, that we now state.

Our standing assumption is that \(\alpha \in \mathbb {R}\setminus \mathbb {Q}\). We then let \((q_s)\) be the sequence of denominators of the best rational approximations of \(\alpha \). Namely \((q_s)\) is the unique increasing sequence such that \(q_0=1\) and \(\Vert q_s \alpha \Vert <\Vert k\alpha \Vert \) for any \(k <q_{s+1}, k\ne q_s\). We recall (see e.g. [15]) that

$$\begin{aligned} \frac{1}{2q_{s+1}} \leqslant \Vert q_s \alpha \Vert \leqslant \frac{1}{q_{s+1}} \end{aligned}$$
(2)
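As a numerical sanity check of (2) (the choice of \(\alpha \) here is ours, for illustration only), take the golden mean \(\alpha =\frac{\sqrt{5}-1}{2}\approx 0.6180\), whose consecutive denominators include \(2,3,5,8\). Then

$$\begin{aligned} \tfrac{1}{6}\leqslant \Vert 2\alpha \Vert \approx 0.2361\leqslant \tfrac{1}{3},\qquad \tfrac{1}{10}\leqslant \Vert 3\alpha \Vert \approx 0.1459\leqslant \tfrac{1}{5},\qquad \tfrac{1}{16}\leqslant \Vert 5\alpha \Vert \approx 0.0902\leqslant \tfrac{1}{8}. \end{aligned}$$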

Definition 1.3

[Badly approximable singularities] Given \(\alpha \in \mathbb {R}\setminus \mathbb {Q}\), we will say that \(\{a_1,\ldots ,a_k\}\) are badly approximable by \(\alpha \) if there exists \(C>1\) such that for every \(x\in \mathbb {T}\) and every \(s\in \mathbb {N}\), there exists at most one \(i_0\in \{0,\ldots ,q_s-1\}\) such that

$$\begin{aligned} x+i_0\alpha \in \bigcup _{i=1}^k\left[ \frac{-1}{2Cq_s}+a_i,a_i+\frac{1}{2Cq_s}\right] . \end{aligned}$$
(3)

Remark 1.4

It was shown in [11, Lemma 3] that if \(a_i-a_j\in (\mathbb {Q}+\mathbb {Q}\alpha )\setminus (\mathbb {Z}+\mathbb {Z}\alpha )\) whenever \(i\ne j\) then \(\{a_1,\ldots ,a_k\}\) are badly approximable by \(\alpha \).

Note that if there is only one singularity, that is \(k=1\), then by (2) it is always badly approximable by \(\alpha \). The following shows that for \(k \geqslant 2\) the set of singularities that are badly approximable by \(\alpha \) is a thick set in \([0,1]^k\).

Lemma 1.5

[34] Let \(\alpha \in \mathbb {R}\setminus \mathbb {Q}\). For any \(k \in \mathbb {N}\), the set \(E \subset [0,1]^k\) of \(k\)-tuples \((a_1,\ldots ,a_k)\) that are badly approximable by \(\alpha \) is a product of sets of full Hausdorff dimension in \([0,1]\).

Proof

Define

$$\begin{aligned} B(\alpha ) := \left\{ b \in \mathbb {R}: \exists C>0, \forall k \in \mathbb {Z}\setminus 0 : \Vert k \alpha - b\Vert \geqslant \frac{1}{C |k|}\right\} \end{aligned}$$

Then if \((a_1,\ldots ,a_k)\) are such that \(a_i-a_j \in B(\alpha )\) for any \(i,j \in \{1,\ldots ,k\}, i\ne j\), then \((a_1,\ldots ,a_k)\) are badly approximable by \(\alpha \).

But it was proven in [34] (see also [5]) that the set \(B(\alpha )\) is a winning set in the sense of Schmidt (see [5, 34] and references therein). A winning set is of full Hausdorff dimension. Moreover, for a winning set \(B \subset \mathbb {R}\) we have that for any \(x_1,\ldots ,x_n\) the set \(\cap _{s=1}^n (x_s+B)\) is winning. So, if \(a_1, \ldots ,a_l\) are such that \(a_i-a_j \in B(\alpha )\) for any \(i,j \in \{1,\ldots ,l\}, i\ne j\), then the set of \(a \in [0,1]\) such that \(a \in \cap _{s=1}^l (a_s + B(\alpha ))\) is winning, which means that \(a_1, \ldots ,a_l,a_{l+1}\) are badly approximable by \(\alpha \) for a winning set of \(a_{l+1}\). The statement of the Lemma then follows by induction and because a single \(a_1\) is always badly approximable by \(\alpha \). \(\square \)
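The single-singularity case invoked at the end of the proof can be checked directly from (2). The following is only a sketch, and the explicit threshold \(C>2\) is our own choice of constant. Fix \(s\geqslant 1\) and suppose that \(x+i\alpha \) and \(x+j\alpha \), with \(0\leqslant i<j\leqslant q_s-1\), both lie in \(\left[ a_1-\frac{1}{2Cq_s},a_1+\frac{1}{2Cq_s}\right] \). Since \(0<j-i<q_s\), the definition of \((q_s)\) and (2) applied at index \(s-1\) give

$$\begin{aligned} \frac{1}{Cq_s}\geqslant \Vert (j-i)\alpha \Vert \geqslant \Vert q_{s-1}\alpha \Vert \geqslant \frac{1}{2q_s}, \end{aligned}$$

which is impossible as soon as \(C>2\). For \(s=0\) the only admissible index is \(i_0=0\), so there is nothing to check.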

1.1 Logarithmic like singularities

In the case of logarithmic like singularities, the following theorem holds.

Theorem 2

Let \(\alpha \in \mathbb {R}\setminus \mathbb {Q}\) and \(f\in C^2(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\) with the singularities \(\{a_1,\ldots ,a_k\}\) of type \(h\) and badly approximable by \(\alpha \), with some associated constant \(C>1\). Assume that \(\sum _{i=1}^k A_i\ne \sum _{i=1}^k B_i\) and that there exists a constant \(m_0>0\) and a sequence \((x_s)\) such that for every \(s\in \mathbb {N}\), we have \(x_s < \frac{1}{q_s}\) and

  1. \(\lim _{s\rightarrow +\infty } \frac{h'(\frac{x_s}{4C})}{q_sh(\frac{1}{2q_s})}=0, \quad \lim _{s\rightarrow +\infty }x_sq_sh(\frac{1}{2q_s})=+\infty \);

  2. \(\sum _{i\notin K_\alpha }q_ix_i<+\infty \), where \(K_\alpha :=\{s\in \mathbb {N}\;:\; q_{s+1}< \frac{1}{x_s}\}\);

  3. \(h(\frac{1}{2q_s})/h(\frac{1}{2q_{s+1}})>m_0\).

Then the special flow \((T^t_{\alpha ,f})_{t\in \mathbb {R}}\) has the SWR-property.

What is \(x_s\) in Theorem 2? When we compute the shear in the Birkhoff sums of the ceiling function \(f\), we will naturally be able to control only those points that do not go too close to the singularities under the iteration by \(R_\alpha \) on the base. A controllable point will then be a point that stays at distance at least \(x_s\) from the singularities during \(O(q_{s+1})\) iterates in the future or in the past. We choose \(x_s\) so that the contribution to the shear of a single visit to a singularity is negligible with respect to the accumulated shear (either in the future or in the past). To fix ideas, suppose the singularities are reduced to just one one-sided singularity at the origin and observe that if \(f\) at the origin is exactly \(\log \) then the choice \(x_s= \frac{1}{q_s \log ^{\frac{7}{8}}q_s}\) would satisfy 1. Then if \(q_{s+1}< q_s \log ^{\frac{7}{8}} q_s\) we see that for any \(x \in \mathbb {T}\) either up to \(q_{s+1}/2\) in the future or up to \(-q_{s+1}/2\) in the past the orbit of \(x\) by \(R_\alpha \) does not enter the \(x_s\) neighborhood of the origin, and this will show that the progressive accumulation in the Birkhoff sums of the derivative of the ceiling function above the orbit of \(x\) in the region of time between \(O(q_s)\) and \(O(q_{s+1})\) in the future or \(O(-q_{s+1})\) and \(O(-q_s)\) in the past dominates the value of the derivative at the closest entry to the neighborhood of the origin (this is the aim of condition 1). The latter is the crucial fact that we need to show SWR. In the opposite case where for example \(q_{s+1} \geqslant q_s \log q_s\) we have to discard the points that enter the \(x_s\) neighborhood of the origin between time \(-q_s\) and \(q_s\) and then show that the remaining points stay away from the \(x_s\) neighborhood of the origin for \(O(q_{s+1})\) iterations (either in the past or in the future) and conclude as before. This is where condition 2. is necessary, to show that the measure of the discarded points is arbitrarily small.

We refer to the beginning of Sect. 4.1 for an outline of the proof of Theorem 2 in which the different roles of conditions 1., 2. and 3. are all explained in detail.

We now restate Theorem 2 in the particular case of exactly logarithmic singularities, that is when \(h(x)=-\log (x),\) for \(x\in [0,1)\). To be able to choose \(x_s\) such that \(1., 2.\) and \(3.\) are satisfied we need some arithmetic restrictions on \(\alpha \), that we now introduce.

For \(\alpha \in \mathbb {R}\setminus \mathbb {Q}\), let \(K_{\alpha }:=\{n\in \mathbb {N}\;:\;q_{n+1}<q_n\log ^{\frac{7}{8}}(q_n)\}\). We then define in view of \(1.\) and \(2.\) of Theorem 2

$$\begin{aligned} {\mathcal {E}}:=\left\{ \alpha \in \mathbb {T}\setminus \mathbb {Q}\;:\; \sum _{i\notin K_\alpha }\frac{1}{\log ^{\frac{7}{8}}q_i}<+\infty \right\} . \end{aligned}$$

Indeed, for \(\alpha \in {\mathcal {E}}\), 1. and 2. are satisfied with \(x_s:=\frac{1}{q_s \log ^{\frac{7}{8}}q_s}\), for \(s\geqslant 1\).
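This verification can be sketched as follows (our own worked check, using only \(h(x)=-\log (x)\), \(h'(x)=-1/x\), and \(s\) large enough that \(\log q_s>0\)):

$$\begin{aligned} \frac{h'(\frac{x_s}{4C})}{q_sh(\frac{1}{2q_s})}&=\frac{-4Cq_s\log ^{\frac{7}{8}}q_s}{q_s\log (2q_s)}=-\frac{4C\log ^{\frac{7}{8}}q_s}{\log (2q_s)}\longrightarrow 0,\\ x_sq_sh\left( \frac{1}{2q_s}\right)&=\frac{\log (2q_s)}{\log ^{\frac{7}{8}}q_s}\geqslant \log ^{\frac{1}{8}}q_s\longrightarrow +\infty ,\\ \sum _{i\notin K_\alpha }q_ix_i&=\sum _{i\notin K_\alpha }\frac{1}{\log ^{\frac{7}{8}}q_i}<+\infty . \end{aligned}$$

The last sum is finite by the definition of \({\mathcal {E}}\). Note also that with this choice \(\frac{1}{x_s}=q_s\log ^{\frac{7}{8}}q_s\), so the set \(K_\alpha \) of Theorem 2 coincides with the set \(K_\alpha \) defined above.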

Recall first that a number \(\alpha \in \mathbb {R}\setminus \mathbb {Q}\) is said to be Diophantine if there exist \(\tau \geqslant 0\) and \(C(\alpha )>0\) such that for any \((p,q)\in \mathbb {Z}\times \mathbb {N}^*\) we have \(|\alpha -\frac{p}{q}|\geqslant \frac{C(\alpha )}{q^{2+\tau }}\). We call \(\mathcal {D}\) the set of Diophantine numbers.

To have 3., it suffices to assume that \(\alpha \) is Diophantine, since an equivalent definition of \(\alpha \in \mathbb {R}\setminus \mathbb {Q}\) being Diophantine is that there exist \(\tau \geqslant 0\) and \(r_\alpha >0\) such that its sequence of denominators \(q_n\) satisfies \(q_{n+1}<r_\alpha q^{1+\tau }_{n}\) for all \(n\in \mathbb {N}\) (see (2)).
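A sketch of why this growth bound yields 3. when \(h(x)=-\log (x)\) (our own worked check, with \(r_\alpha \) and \(\tau \) as in the bound above): since \(q_{s+1}<r_\alpha q_s^{1+\tau }\),

$$\begin{aligned} \frac{h\left( \frac{1}{2q_s}\right) }{h\left( \frac{1}{2q_{s+1}}\right) }=\frac{\log (2q_s)}{\log (2q_{s+1})}>\frac{\log (2q_s)}{\log (2r_\alpha )+(1+\tau )\log q_s}\longrightarrow \frac{1}{1+\tau }. \end{aligned}$$

Every term of the sequence on the left is positive and its lower limit is at least \(\frac{1}{1+\tau }>0\), so the sequence is bounded below by some \(m_0>0\).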

Hence we have the following

Corollary 1.6

Consider \(h(x)=-\log (x)\). Let \(\alpha \in \mathcal {D}\cap \mathcal {E}\). Let \(f\in C^2(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\) with the singularities \(\{a_1,\ldots ,a_k\}\) of type \(h\) and badly approximable by \(\alpha \). Assume that \(\sum _{i=1}^k A_i\ne \sum _{i=1}^k B_i\). Then \((T^t_{\alpha ,f})_{t\in \mathbb {R}}\) has the SWR-Property.

Proof

We take for \(x_s\) the sequence \(\frac{1}{q_s \log ^{\frac{7}{8}}q_s}\) and easily check the hypothesis of Theorem 2. Therefore \(( T^t_{\alpha ,f})_{t\in \mathbb {R}}\) has the SWR-property. \(\square \)

Corollary 1.6 covers a set of full Lebesgue measure of rotation angles \(\alpha \). Indeed, it is known that the set of Diophantine numbers \(\mathcal D\) has full Lebesgue measure, and we will prove in Appendix B the following result. Denote by \(\lambda \) the Haar measure on \(\mathbb {T}\).

Proposition 1.7

It holds that \(\lambda ({\mathcal {E}})=1\).

We now recall the following results on mixing of special flows with ceiling functions having logarithmic singularities.

Theorem 3

Let \(f\) be as in Corollary 1.6. Then

  (a) ([18]) If \(\sum _j A_j=\sum _j B_j\) then \((T_{\alpha , f}^t)_{t\in \mathbb {R}}\) is not mixing for any \(\alpha \in \mathbb {R}-\mathbb {Q}.\)

  (b) ([19, 20, 32]) If \(\sum _j A_j \ne \sum _j B_j\) then \((T_{\alpha , f}^t)_{t\in \mathbb {R}}\) is mixing for almost every \(\alpha \in \mathbb {R}-\mathbb {Q}.\)

  (c) ([19, 20]) If \(A_j-B_j \ne 0\) have the same sign for all \(j\) then \((T_{\alpha , f}^t)_{t\in \mathbb {R}}\) is mixing for each \(\alpha \in \mathbb {R}-\mathbb {Q}.\)

In Theorem 5 below, we show that the SWR property for \((T_{\alpha , f}^t)_{t\in \mathbb {R}}\) implies the FEJ-property. As an immediate consequence of this and of Theorem 3, Corollary 1.6 and the fact that mixing flows with the FEJ-property are multiple mixing [31] we get the following.

Corollary 1.8

Consider \(h(x)=-\log (x)\). For \(\alpha \in (0,1)\), assume \(f\in C^2(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\) has singularities \(\{a_1,\ldots ,a_k\}\) of type \(h\) and badly approximable by \(\alpha \). If \(A_j-B_j \ne 0\) have the same sign for all \(j\), then \((T_{\alpha , f}^t)_{t\in \mathbb {R}}\) is multiple mixing for each \(\alpha \in \mathcal {E}\cap \mathcal {D}\). If \(\sum _j A_j \ne \sum _j B_j\) then \((T_{\alpha , f}^t)_{t\in \mathbb {R}}\) is multiple mixing for almost every \(\alpha \in \mathcal {E}\cap \mathcal {D}\).

1.2 Power like singularities

Now, we will deal with power like singularities. We suppose \(f\in C^2(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\) with singularities \(\{a_1,\ldots ,a_k\}\) of type \(h\). We assume that there exists \(i\in \{1,\ldots ,k\}\) such that \(A_i^2+B_i^2>0\) in (1).

Theorem 4

Let \(\alpha \) be irrational with bounded partial quotients, that is, \(\alpha \in DC(0)\). Assume that \(\{a_{1},\ldots ,a_k\}\) are badly approximable by \(\alpha \) with some constant \(C >1\). Assume that there exist constants \(D_1,D_2>0\) such that for every \(s\in \mathbb {N}\)

$$\begin{aligned} D_2>\frac{-h'\left( \frac{1}{C^4q_s}\right) }{q_sh\left( \frac{1}{2q_s}\right) }>D_1\;\;\;\text {and}\;\;\; \frac{h\left( \frac{1}{2q_s}\right) }{h\left( \frac{1}{2q_{s+1}}\right) }>D_1. \end{aligned}$$
(4)

Then \(( T^t_{\alpha ,f})_{t\in \mathbb {R}}\) has the SWR-property.

The most interesting case – when \(h\) has power singularities – is discussed in the corollary below.

Corollary 1.9

Let \(\alpha \in DC(0)\). Let \(f\in C^2(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\) with all the singularities \(\{a_1,\ldots ,a_k\}\) of power-like type \(x^{\gamma _i}\) from the left and \(x^{\delta _i}\) from the right, \(-1<\gamma _i,\delta _i<0\). Then, if the points \(\{a_1,\ldots ,a_k\}\) are badly approximable by \(\alpha \), we have that \(( T^t_{\alpha ,f})_{t\in \mathbb {R}}\) has the SWR-Property and is mixing of all orders.

Proof of Corollary 1.9

It is easy to check that (4) in Theorem 4 is satisfied. This gives the SWR-Property. Mixing of \(( T^t_{\alpha ,f})_{t\in \mathbb {R}}\) was established in [17]. Multiple mixing then follows from Theorem 5 and the FEJ-property. \(\square \)
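For orientation, here is a sketch of the check of (4) in the model case \(h(x)=x^{\gamma }\) with \(-1<\gamma <0\) (a simplifying assumption of ours; the corollary itself allows different exponents at different singularities). Using \(h'(x)=\gamma x^{\gamma -1}\), and the bound \(q_{s+1}\leqslant Kq_s\), which follows from the standard recurrence \(q_{s+1}=a_{s+1}q_s+q_{s-1}\) when the partial quotients satisfy \(a_i<K\), we get

$$\begin{aligned} \frac{-h'\left( \frac{1}{C^4q_s}\right) }{q_sh\left( \frac{1}{2q_s}\right) }=\frac{|\gamma |(C^4q_s)^{1-\gamma }}{q_s(2q_s)^{-\gamma }}=|\gamma |\,2^{\gamma }C^{4(1-\gamma )},\qquad \frac{h\left( \frac{1}{2q_s}\right) }{h\left( \frac{1}{2q_{s+1}}\right) }=\left( \frac{q_s}{q_{s+1}}\right) ^{|\gamma |}\geqslant K^{-|\gamma |}. \end{aligned}$$

The first quantity does not depend on \(s\) and the second is bounded below uniformly in \(s\), so in this model case (4) holds for suitable constants \(D_1,D_2>0\).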

Remark 1.10

A stronger version of Theorem 4 actually holds: if \(\alpha \in DC(0)\) and \(f\in C^2(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\) with all the singularities \(\{a_1,\ldots ,a_k\}\) of at most power-like type (\(x^{\gamma _i}\) from the left and \(x^{\delta _i}\) from the right, \(-1<\gamma _i,\delta _i<0\)) and if \(\gamma =\min _{1\leqslant i\leqslant k}\{\gamma _i,\delta _i\}\), then it is sufficient to have the singularities of maximal type badly approximable by \(\alpha \) to guarantee the SWR property. Namely, if the points in \(E=\{a_i\;:\; \min \{\gamma _i,\delta _i\}=\gamma \}\subset \{a_1,\ldots ,a_k\}\) are badly approximable by \(\alpha \), then \((T^t_{\alpha ,f})_{t\in \mathbb {R}}\) has the SWR-Property and is mixing of all orders. The proof in this case is similar to the proof of Theorem 4 and we omit it to avoid overloading the paper with unnecessary technicalities.

1.3 Plan of the paper

In Sect. 2 we introduce the SWR-Property and describe its consequences for joinings. In Sect. 3 we give a criterion involving the Birkhoff sums of the ceiling function that guarantees that a special flow above an isometry has the SWR-property. The treatment of these sections is similar to [9, 10]. In Sect. 4 we study the Birkhoff sums of logarithmic like and power like functions and prove Theorems 2 and 4. Appendix A is devoted to the proof of Theorem 1 on the absence of the WR-property for a subclass of Kochergin flows. Finally Appendix B is devoted to the proof that the set of frequencies for which Theorem 2 holds has full Lebesgue measure.

2 The SWR-property

Let \((X,\fancyscript{B},\mu )\) be a probability standard Borel space. We additionally assume that \(X\) is a complete metric space with a metric \(d\). Let \((T_t)_{t\in \mathbb {R}}\) be an ergodic flow acting on \((X,\fancyscript{B},\mu )\). The R-property used by Ratner in the context of horocycle flows is a property of slow divergence between the orbits of nearby points that can essentially be stated as follows: for any \(\varepsilon >0\) there exists \(\kappa >0\) such that if \(y\) and \(y'\) are sufficiently close to each other and if they are not on the same orbit then there exists \(M=M(y,y')\) such that \(d(T^ty,T^{t+\iota }y')<\varepsilon \) (with \(\iota =\pm 1\)) for \(t\in [M,(1+\kappa )M]\).

It is not difficult to see that the R-property implies the FEJ property. Indeed, if a joining \(\lambda \) is not a finite extension joining then there exist points \(x,y,y'\) such that \((x,y)\) and \((x,y')\) are typical for \(\lambda \) while \(y\) and \(y'\) are arbitrarily close and not in the same orbit, and by the R-property and the Birkhoff ergodic theorem one obtains an extra invariance of \(\lambda \), namely by \(\mathrm{Id} \times T^\iota \), that implies that \(\lambda \) is the product measure.

Actually, it is useful to relax the R-property by asking that the controlled divergence happens for \(x\) and \(y\) outside an exceptional set of points of measure less than \(\varepsilon \) (and for most of the times in \([M,(1+\kappa )M]\)). The proof of the FEJ property remains the same in nature but becomes a bit more technical involving some standard measure theoretical arguments (see for example [33]).

In [10], Definition 4, a slightly weaker version of the \(R\)-property is given, that is called WR or Weak Ratner property, that allows the drift to vary in some fixed compact set away from zero and infinity. There again, the FEJ consequence as well as its proof follow in practically the same way as for the R-property, with some extra standard measure theoretical arguments, and under a “continuity” assumption on orbits (see below). However, as shown in [10] the WR property is more versatile than the R-property and is adapted to nonlinear situations, where the R-property is unlikely to hold, such as in our context of reparametrizations of linear flows with singularities.

Observe now that if in the proof of the FEJ property, we used that \(d(T^{t}y,T^{t+\iota }y')<\varepsilon \) during a large interval of negative times \(t\) instead of positive times, then exactly the same conclusion of extra invariance of \(\lambda \) by \(\mathrm{Id} \times T^\iota \) would still hold. The crucial observation here is that it suffices to check one of the two slow divergences, in the future or in the past, and this possibly depending on the pair of points that is considered. This motivates the introduction of what we call the Switchable Ratner, or SWR, property that we now formally define.

Definition 2.1

(The switchable Ratner property) Fix \(t_0\in \mathbb {R}_+\) and a compact set \(P\subset \mathbb {R}\setminus \{0\}\). One says that the flow has the switchable \(R(t_0,P)\)-property if for every \(\epsilon >0\) and \(N\in \mathbb {N}\) there exist \(\kappa =\kappa (\epsilon ), \delta =\delta (\epsilon ,N)\) and a set \(Z=Z(\epsilon ,N)\in \fancyscript{B}\) with \(\mu (Z)>1-\epsilon \) such that for any \(x,y\in Z\) with \(d(x,y)<\delta \) and \(x\) not in the orbit of \(y\), there exist \(M=M(x,y),L=L(x,y)\in \mathbb {N}\) with \(M,L>N\) and \(\frac{L}{M}\geqslant \kappa \) and \(p=p(x,y)\in P\) such that

$$\begin{aligned} \frac{1}{L}\big |\{n\in [M,M+L]\cap \mathbb {Z}: d(T_{nt_0}(x),T_{nt_0+p}(y))<\epsilon \}\big |>1-\epsilon \end{aligned}$$
(5)

or

$$\begin{aligned} \frac{1}{L}\left| \{n\in [M,M+L]\cap \mathbb {Z}: d(T_{n(-t_0)}(x),T_{n(-t_0)+p}(y))<\epsilon \}\right| >1-\epsilon . \end{aligned}$$
(6)

If the set of \(t_0>0\) such that the flow \((T_t)_{t\in \mathbb {R}}\) has the switchable \(R(t_0,P)\)-property is uncountable, the flow is said to have SWR-property.

For the sake of completeness, we can formally compare the SWR-property with the definition of the WR-property [10]. To have the WR-property, we fix \(P\subset \mathbb {R}\setminus \{0\}\) and \(t_0\in \mathbb {R}\). The flow \((T_t^f)_{t\in \mathbb {R}}\) has the \(R(t_0,P)\)-property if in Definition 2.1, (5) holds (the condition (6) is not taken into account), and \((T_t^f)_{t\in \mathbb {R}}\) has the WR-property if the set of \(t_0\in \mathbb {R}\) such that \((T_t^f)_{t\in \mathbb {R}}\) has the \(R(t_0,P)\)-property is uncountable. Consequently, the SWR-property is weaker than the WR-property (and as Theorem 1 shows, it is strictly weaker).

As we just mentioned, the proof that the SWR property implies the FEJ property is a direct adaptation of the proof of the same implication in the case of the R property or the WR property. For completeness, we present a detailed proof of this fact, which is stated in Theorem 5 below and occupies the rest of the section.

As a standing assumption in all the sequel, we will add one more natural condition on the flow \((T_t)_{t\in \mathbb {R}}\) which can be viewed as “continuity” on orbits. The flow \((T_t)_{t\in \mathbb {R}}\) is called almost continuous [10] if for every \(\epsilon >0\) there exists a set \(X(\epsilon )\) with \(\mu (X(\epsilon ))>1-\epsilon \) such that for every \(\epsilon '>0\) there exists \(\delta '>0\) such that for every \(x\in X(\epsilon )\), we have \(d(T_t(x),T_{t'}(x))<\epsilon '\) for \(t,t'\in [-\delta ',\delta ']\).

Theorem 5

Let \((T_t)_{t\in \mathbb {R}}\) be a weakly mixing almost continuous flow acting on a probability standard Borel space \((X,\fancyscript{B},\mu )\). Assume that \((T_t)_{t\in \mathbb {R}}\) satisfies the SWR-property. Let \((S_t)_{t\in \mathbb {R}}\) be an ergodic flow acting on a probability standard Borel space \((Y,\fancyscript{C},\nu )\) and let \(\rho \in J((T_t)_{t\in \mathbb {R}},(S_t)_{t\in \mathbb {R}})\) be an ergodic joining. Then either \(\rho \) is equal to \(\mu \otimes \nu \) or is a finite extension of the measure \(\nu \).

For the definition and properties of joinings, we refer the reader to [33] or [12]. In the proof of Theorem 5, we will need some lemmas from [10]. But first we state a useful fact that is a simple consequence of the Birkhoff ergodic theorem.

Lemma 2.2

Let \(T,S:(X,\fancyscript{B},\mu )\rightarrow (X,\fancyscript{B},\mu )\) be two ergodic automorphisms and let \(A\in \fancyscript{B}\). For any \(\epsilon ,\delta ,\kappa >0\) there exist \(N=N(\epsilon ,\delta ,\kappa )\) and a measurable set \(Z=Z(\epsilon ,\delta ,\kappa )\) with \(\mu (Z)>1-\delta \) such that for any \(M,L\geqslant N\) with \(\frac{L}{M}\geqslant \kappa \) and any \(x\in Z\) we have

$$\begin{aligned} \left| \frac{1}{L}\sum _{i=M}^{M+L}\chi _A(T^ix)-\mu (A)\right| <\epsilon \end{aligned}$$

and

$$\begin{aligned} \left| \frac{1}{L}\sum _{i=M}^{M+L}\chi _A(S^ix)-\mu (A)\right| <\epsilon . \end{aligned}$$
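The window averages appearing in Lemma 2.2 can be illustrated numerically. The sketch below is only an illustration under explicit assumptions: \(T\) is taken to be the rotation of the circle by the golden mean, \(S=T^{-1}\), and \(A=[0,0.3)\). None of these choices come from the text, and `window_average` is a hypothetical helper name.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # assumed rotation number (golden mean)


def rotation(x):
    """T(x) = x + alpha on R/Z."""
    return (x + ALPHA) % 1.0


def rotation_inv(x):
    """S(x) = x - alpha on R/Z, the inverse of T."""
    return (x - ALPHA) % 1.0


def indicator_A(x):
    """Indicator of the assumed set A = [0, 0.3), so mu(A) = 0.3."""
    return 1 if 0.0 <= x < 0.3 else 0


def window_average(indicator, T, x, M, L):
    """(1/L) * sum_{i=M}^{M+L} indicator(T^i x), as in Lemma 2.2."""
    for _ in range(M):          # move to T^M x
        x = T(x)
    total = 0
    for _ in range(L + 1):      # the window [M, M+L] has L+1 integer times
        total += indicator(x)
        x = T(x)
    return total / L
```

For instance, `window_average(indicator_A, rotation, 0.4, 20000, 5000)` uses a window with \(L/M=1/4\). For a rotation the averages are close to \(\mu (A)\) for every starting point, whereas for a general ergodic automorphism the lemma only provides a large set \(Z\) of good points.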

The following is a consequence of Lemma 5.4. in [10], that is itself based on the Birkhoff ergodic theorem.

Lemma 2.3

Let \((T_t)_{t\in \mathbb {R}}\) be a weakly mixing almost continuous flow acting on \((X,\fancyscript{B},\mu )\), and \((S_t)_{t\in \mathbb {R}}\) be another ergodic flow acting on \((Y,\fancyscript{C},\nu )\). Let \(\rho \in J((T_t)_{t\in \mathbb {R}},(S_t)_{t\in \mathbb {R}})\) be such that \(\rho \) is ergodic for automorphisms \(T_{t_0}\times S_{t_0}\) for some \(t_0>0\) (hence, for \(T_{-t_0}\times S_{-t_0}\)). Let \(P\subset \mathbb {R}\) be non-empty and compact. Let \(A\in \fancyscript{B}\) be such that \(\mu (\partial A)=0\) and \(B\in \fancyscript{C}\). Then, for every \(\epsilon ,\delta ,\kappa >0\) there exist a natural number \(N=N(\epsilon ,\delta ,\kappa )\) and a set \(Z=Z(\epsilon ,\delta ,\kappa )\subset \fancyscript{B}\otimes \fancyscript{C}\) with \(\rho (Z)>1-\delta \) such that for any \(\mathbb {N}\ni M,L\geqslant N\) with \(\frac{L}{M}\geqslant \kappa \) and any \(p\in P\), we have

$$\begin{aligned} \left| \frac{1}{L}\sum _{j=M}^{M+L}\chi _{T_{-p}A\times B}(T_{t_0j}x,S_{t_0j}y)-\rho (T_{-p}A\times B)\right| <\epsilon \end{aligned}$$

and

$$\begin{aligned} \left| \frac{1}{L}\sum _{j=M}^{M+L}\chi _{T_{-p}A\times B}(T_{-t_0j}x,S_{-t_0j}y)-\rho (T_{-p}A\times B)\right| <\epsilon \end{aligned}$$

for every \((x,y)\in Z\).

Proof

The proof is a simple consequence of the following lemma:

Lemma 2.4

(Lemma 5.4. in [10]) Let \((T_t)_{t\in \mathbb {R}}\) be an ergodic almost continuous flow acting on \((X,\fancyscript{B},\mu )\), and \((S_t)_{t\in \mathbb {R}}\) be another ergodic flow acting on \((Y,\fancyscript{C},\nu )\). Let \(\rho \in J((T_t)_{t\in \mathbb {R}},(S_t)_{t\in \mathbb {R}})\) be such that \(\rho \) is ergodic for the automorphism \(T_{t_0}\times S_{t_0}\) for some \(t_0>0\). Let \(P\subset \mathbb {R}\) be non-empty and compact. Let \(A\in \fancyscript{B}\) be such that \(\mu (\partial A)=0\) and \(B\in \fancyscript{C}\). Then, for every \(\epsilon ,\delta ,\kappa >0\) there exist a natural number \(N=N(\epsilon ,\delta ,\kappa )\) and a set \(Z=Z(\epsilon ,\delta ,\kappa )\subset \fancyscript{B}\otimes \fancyscript{C}\) with \(\rho (Z)>1-\delta \) such that for any \(\mathbb {N}\ni M,L\geqslant N\) with \(\frac{L}{M}\geqslant \kappa \) and any \(p\in P\), we have

$$\begin{aligned} \left| \frac{1}{L}\sum _{j=M}^{M+L}\chi _{T_{-p}A\times B}(T_{t_0j}x,S_{t_0j}y)-\rho (T_{-p}A\times B)\right| <\epsilon \end{aligned}$$

for every \((x,y)\in Z\).

One uses the above lemma first for the flows \((T_t)_{t\in \mathbb {R}}\) and \((S_t)_{t\in \mathbb {R}}\) and ergodic joining \(\rho \in J((T_t)_{t\in \mathbb {R}},(S_t)_{t\in \mathbb {R}})\) to get, for \(\epsilon ,\frac{\delta }{2}, \kappa >0\), a natural number \(N_+\in \mathbb {N}\) and a set \(Z_+\subset \fancyscript{B}\otimes \fancyscript{C}\) with \(\rho (Z_+)>1-\frac{\delta }{2}\). Then, for flows \((T_{-t})_{t\in \mathbb {R}}\) and \((S_{-t})_{t\in \mathbb {R}}\) and the same ergodic joining \(\rho \) to get, for \(\epsilon ,\frac{\delta }{2},\kappa >0\), a natural number \(N_-\in \mathbb {N}\) and a set \(Z_-\subset \fancyscript{B}\otimes \fancyscript{C}\) with \(\rho (Z_-)>1-\frac{\delta }{2}\). To finish the proof one takes \(N:=\max (N_+,N_-)\) and \(Z=Z_+\cap Z_-\). \(\square \)

The next lemma is used in the proof of Theorem 3 in [27].

Lemma 2.5

Let \((T_t)_{t\in \mathbb {R}}\) and \((S_t)_{t\in \mathbb {R}}\) be two ergodic flows. Let \(\rho \in J^e((T_t)_{t\in \mathbb {R}},(S_t)_{t\in \mathbb {R}})\) be an ergodic joining. Then if there exists a set \(V\) with \(\rho (V)>0\) such that for any points \((x,y),(x',y)\in V\) either \(x\) is in the orbit of \(x'\) or \(d(x,x')>c_0\) for some constant \(c_0>0\), then \(\rho \) is a finite extension of \(\nu \).

In what follows, we assume that \((X,d)\) is a \(\sigma \)-compact metric space. Let \(A\in \fancyscript{B}\). For \(\eta >0\) we write \(V_\eta (A):=\{x\in X:\; d(x,A)<\eta \}\).

Lemma 2.6

(cf. [10]) For any \(A\in \fancyscript{B}\) there exists \(R\subset (0,+\infty )\) such that \((0,+\infty )\setminus R\) is countable and \(\mu (\partial V_{\eta }(A))=0\) for \(\eta \in R\). In particular, there exists a dense family \((B_i)_{i\geqslant 1}\) in \(\fancyscript{B}\) with the property \(\mu (\partial B_i)=0\) for every \(i\in \mathbb {N}\).

Remark 2.7

Since \((X,d)\) is a Polish space, by Lemma 2.6 and regularity of \(\mu \), there exists a dense family \(\{B_i\}_{i\geqslant 1}\subset \fancyscript{B}\) such that \(\mu (\partial B_i)=0\) for \(i\geqslant 1\).

Proof of Theorem 5

Let \(\rho \in J((T_t)_{t\in \mathbb {R}},(S_t)_{t\in \mathbb {R}})\) be an ergodic joining and \(\rho \ne \mu \times \nu \). Assume that \((T_t)_{t\in \mathbb {R}}\) has the switchable \(R(t_0,P)\)-property and \(\rho \) is ergodic for \(T_{t_0}\times S_{t_0}\) (then \(\rho \) is ergodic for \(T_{-t_0}\times S_{-t_0}\)). Such \(t_0>0\) always exists because an ergodic flow can have at most countably many non-ergodic time automorphisms and, by assumptions, the property \(R(t_0,P)\) is satisfied for uncountably many \(t_0>0\). For simplicity of notation, we assume \(t_0=1\). Let \(\{B_i\}_{i\geqslant 1}\) and \(\{C_i\}_{i\geqslant 1}\) be two countable dense families in the \(\sigma \)-algebras \(\fancyscript{B}\) and \(\fancyscript{C}\), respectively such that for every \(i\geqslant 1, B_i\) is an open set and we have \(\mu (\partial B_i)=0\) (such \(\{B_i\}_{i\geqslant 1}\) always exists by Remark 2.7). Consider the following real function:

$$\begin{aligned} \mathbb {R}\ni t\rightarrow k(t):=\sum _{i,j\geqslant 1}(1/2^{i+j})|\rho (T_t(B_i)\times C_j)-\rho (B_i\times C_j)|. \end{aligned}$$

As in Lemma 5.4. in [10], we conclude that \(k:\mathbb {R}\rightarrow \mathbb {R}\) is a continuous function and that \(k(t)>0\) for any \(t\in \mathbb {R}\setminus \{0\}\). Indeed, this follows from the fact that if for some \(r\in \mathbb {R}\setminus \{0\}\) we have \(\rho (T_{r}(B_i)\times C_j)=\rho (B_i\times C_j)\) for all \(i,j\in \mathbb {N}\), then \(\rho \) is the product measure (recall that \((T_t)_{t\in \mathbb {R}}\) is assumed to be weakly mixing, hence the time-\(r\) automorphism of the flow is ergodic for every \(r\ne 0\)).

The set \(P\subset \mathbb {R}\setminus \{0\}\) is compact, therefore there exists \(\epsilon >0\) such that \(k(p)>\epsilon \) for any \(p\in P\). It follows by the definition of the function \(k\) that there exists a number \(R:=R(\epsilon )\) such that

$$\begin{aligned} \sum _{i,j\geqslant 1}^R(1/2^{i+j})|\rho (T_p(B_i)\times C_j)-\rho (B_i\times C_j)|>\epsilon /2 \end{aligned}$$

for every \(p\in P\). Therefore, for every \(p\in P\), there exist \(1\leqslant i,j\leqslant R\) such that \(|\rho (T_p(B_i)\times C_j)-\rho (B_i\times C_j)|>\epsilon \).

By Lemma 2.6 and the fact that \(B_i\) is open for \(1\leqslant i\leqslant R\), there exists \(\epsilon '<\frac{\epsilon }{8}\) such that for every \(1\leqslant i\leqslant R\)

$$\begin{aligned} \mu (V_{\epsilon '}(B_i)\setminus B_i)<\epsilon ,\quad \text {and}\quad \mu (\partial V_{\epsilon '}(B_i))=0. \end{aligned}$$

It follows by the fact that \(\rho \) is a joining that

$$\begin{aligned}&|\rho (V_{\epsilon '}(B_i)\times C_j)-\rho (B_i\times C_j)|<\frac{\epsilon }{2}\quad \text {and}\quad |\rho (T_{-t}V_{\epsilon '}(B_i)\times C_j)\nonumber \\&\quad -\rho (T_{-t}B_i\times C_j)| <\frac{\epsilon }{2}, \end{aligned}$$
(7)

for \(1\leqslant i,j\leqslant R\) and every \(t\in \mathbb {R}\). By the switchable \(R(1,P)\)-property, let \(\kappa :=\kappa (\epsilon ')\). By Lemma 2.2 applied to \(\frac{\epsilon }{8},\frac{1}{8},\kappa \), the sets \(V_{\epsilon '}(B_i)\times C_j, 1\leqslant i,j\leqslant R\), and to automorphisms \(T_1\times S_1\) and \(T_{-1}\times S_{-1}\), we get \(N_1\in \mathbb {N}\) and a set \(U_1\in \fancyscript{B}\otimes \fancyscript{C}\) with \(\rho (U_1)>\frac{7}{8}\), such that for every \(L,M\geqslant N_1\) with \(\frac{L}{M}\geqslant \kappa \) and every \((x,y)\in U_1\), we have

$$\begin{aligned}&\left| \frac{1}{L}\sum _{k=M}^{M+L}\chi _{V_{\epsilon '}(B_i)\times C_j}(T^kx,S^ky)-\rho (V_{\epsilon '}(B_i)\times C_j)\right| <\frac{\epsilon }{8}\end{aligned}$$
(8)
$$\begin{aligned}&\left| \frac{1}{L}\sum _{k=M}^{M+L}\chi _{V_{\epsilon '}(B_i)\times C_j}(T^{-k}x,S^{-k}y)-\rho (V_{\epsilon '}(B_i)\times C_j)\right| <\frac{\epsilon }{8}. \end{aligned}$$
(9)

Next, by Lemma 2.3 applied to \(\frac{\epsilon }{8},\frac{1}{8},\kappa >0\) and the sets \(B_i\times C_j, 1\leqslant i,j\leqslant R\), there exist \(N_2\in \mathbb {N}\) and a set \(U_2\subset \fancyscript{B}\otimes \fancyscript{C}\) with \(\rho (U_2)>\frac{7}{8}\) such that for every \(L,M\geqslant N_2\) with \(\frac{L}{M}\geqslant \kappa \) and any \(p\in P\), we have

$$\begin{aligned} \left| \frac{1}{L}\sum _{k=M}^{M+L}\chi _{T_{-p}B_i \times C_j}(T_kx,S_ky)-\rho (T_{-p}B_i\times C_j)\right| <\frac{\epsilon }{8} \end{aligned}$$
(10)

and

$$\begin{aligned} \left| \frac{1}{L}\sum _{k=M}^{M+L}\chi _{T_{-p}B_i\times C_j}(T_{-k}x,S_{-k}y)-\rho (T_{-p}B_i\times C_j)\right| <\frac{\epsilon }{8}. \end{aligned}$$
(11)

It follows that if we set \(N_0:=\max (N_1,N_2)\) and \(U_0:=U_1\cap U_2\), then \(\rho (U_0)>\frac{1}{2}\) and for every \(L,M\geqslant N_0\) with \(\frac{L}{M}\geqslant \kappa \) and any \(p\in P\), the Eqs. (8), (9), (10), (11) are satisfied for every \((x,y)\in U_0\). Using the switchable \(R(1,P)\)-property with \(\epsilon '>0\) and \(N_0\in \mathbb {N}\), we obtain \(\delta =\delta (\epsilon ',N_0)\) and \(Z=Z(\epsilon ',N_0)\) with \(\mu (Z)>1-\epsilon '\). Now, we will use Lemma 2.5 with the set \(U:=U_0\cap (Z\times Y)\) (then of course \(\rho (U)>\frac{1}{4}\)) and \(\delta _0=\delta (\epsilon ', N_0)\). To this end we prove that for every \((x,y),(x',y)\in U\) with \(x\) not in the orbit of \(x'\), we have \(d(x,x')\geqslant \delta _0\). Assume on the contrary that \(d(x,x')<\delta _0\). Then by the switchable \(R(1,P)\)-property, there exist \(L_0,M_0>N_0\) with \(\frac{L_0}{M_0}\geqslant \kappa \) and \(p\in P\) such that

$$\begin{aligned} \frac{1}{L_0}\big |\{n\in [M_0,M_0+L_0]: d(T_{n}(x),T_{n+p}(x'))<\epsilon '\}\big |>1-\epsilon ' \end{aligned}$$

or

$$\begin{aligned} \frac{1}{L_0}\big |\{n\in [M_0,M_0+L_0]: d(T_{-n}(x),T_{-n+p}(x'))<\epsilon '\}\big |>1-\epsilon '. \end{aligned}$$

Assume that the first inequality is satisfied. We will use Eqs. (8) and (10) (in case the second one is satisfied, we use Eqs. (9) and (11)). Let \(1\leqslant i_p, j_p\leqslant R\) be the numbers which satisfy \(|\rho (T_p(B_{i_p})\times C_{j_p})-\rho (B_{i_p}\times C_{j_p})|>\epsilon \). Let \(K=K(x,x',p):= \{n\in [M_0,M_0+L_0]: d(T_{n}(x),T_{n+p}(x'))<\epsilon '\}\). It follows that if \(k\in K\) and \(T_{k+p}x'\in B_i\) then \(T_kx\in V_{\epsilon '}(B_i)\). Therefore

$$\begin{aligned} \rho (T_{-p}B_{i_p}\times C_{j_p})\leqslant & {} \frac{1}{L_0}\sum _{k=M_0}^{M_0+L_0}\chi _{T_{-p}B_{i_p}\times C_{j_p}}(T^kx',S^ky)+\frac{\epsilon }{8}\nonumber \\\leqslant & {} \frac{\epsilon ' L_0}{L_0}+\frac{1}{L_0}\sum _{k=M_0}^{M_0+L_0}\chi _{V_{\epsilon '}(B_{i_p})\times C_{j_p}}(T^kx,S^ky)+\frac{\epsilon }{8} \nonumber \\\leqslant & {} \frac{\epsilon }{2}+\rho (V_{\epsilon '}(B_{i_p})\times C_{j_p})<\epsilon + \rho (B_{i_p}\times C_{j_p}).\qquad \end{aligned}$$
(12)

A similar argument shows that \(\rho (B_{i_p}\times C_{j_p})<\epsilon +\rho (T_{-p}B_{i_p}\times C_{j_p})\) and consequently, \(|\rho (B_{i_p}\times C_{j_p})-\rho (T_{-p}B_{i_p}\times C_{j_p})|<\epsilon \). This contradicts our assumption that \(|\rho (T_p(B_{i_p})\times C_{j_p})-\rho (B_{i_p}\times C_{j_p})|>\epsilon \) is satisfied. Therefore, for any \((x,y),(x',y)\in U\) we have \(d(x,x')\geqslant \delta _0\) and an application of Lemma 2.5 completes the proof. \(\square \)

3 SWR-property for special flows

In this section, we will prove a sufficient condition for the SWR-property in the case of special flows over an ergodic isometry. We start by recalling the definition of special flows. Let \(T\) be an automorphism of \((X,\fancyscript{B},\mu )\). Let \(f\in L^1(X,\mu )\) be such that \(f>0\). The special flow \((T_t^f)_{t\in \mathbb {R}}\) defined above \(T\) and under the ceiling function \(f\) is given by

$$\begin{aligned} X \times \mathbb {R}/ \sim\rightarrow & {} X \times \mathbb {R}/ \sim \\ (x,s)\rightarrow & {} (x,s+t), \end{aligned}$$

where \(\sim \) is the identification

$$\begin{aligned} (x, s + f(x)) \sim (T(x),s) \end{aligned}$$
(13)

Equivalently the flow \((T_t^f)_{t\in \mathbb {R}}\) is defined for \(t+s \geqslant 0\) (with a similar definition for negative times) by

$$\begin{aligned} T_t^f(x,s) = (T^n x, t+s-f^{(n)}(x)) \end{aligned}$$

where \(n\) is the unique integer such that

$$\begin{aligned} f^{(n)}(x) \leqslant t+s < f^{(n+1)}(x) \end{aligned}$$
(14)

and

$$\begin{aligned} f^{(n)}(x)=\left\{ \begin{array}{ccc} f(x)+\cdots +f(T^{n-1}x) &{}\quad \hbox {if} &{} n>0\\ 0&{}\quad \hbox {if}&{} n=0\\ -(f(T^nx)+\cdots +f(T^{-1}x))&{}\quad \hbox {if} &{}n<0.\end{array}\right. \end{aligned}$$

If \(T\) preserves a unique probability measure \(\mu \) then the special flow will preserve a unique probability measure that is the normalized product measure of \(\mu \) on the base and the Lebesgue measure on the fibers. If \(X\) is a metric space with a metric \(d\), so is \(X^f\) with the metric \(d^f((x,s),(x',s')):=d(x,x')+|s-s'|\). Moreover, it is easy to show that if \((T_t^f)_{t\in \mathbb {R}}\) is a special flow acting on \(X^f\), then \((T_t^f)_{t\in \mathbb {R}}\) is almost continuous (see Sect. 2) with \(X(\epsilon )=\{(x,s)\in X^f: x\in X, \epsilon <s<f(x)-\epsilon \}\).
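The definition of the special flow through (13) and (14), together with the Birkhoff sums \(f^{(n)}\) for \(n\) of either sign, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the base is the rotation by the golden mean and the ceiling function `roof` is a smooth positive function. Both are illustrative choices, not the flows studied in the paper, and the helper names are hypothetical.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # assumed rotation number


def rotation(x):
    return (x + ALPHA) % 1.0


def rotation_inv(x):
    return (x - ALPHA) % 1.0


def roof(x):
    """Assumed ceiling function, positive and bounded away from zero."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * x)


def birkhoff_sum(f, T, T_inv, x, n):
    """f^{(n)}(x) for any integer n, following the three-case definition."""
    total = 0.0
    if n > 0:
        for _ in range(n):          # f(x) + ... + f(T^{n-1} x)
            total += f(x)
            x = T(x)
    elif n < 0:
        for _ in range(-n):         # -(f(T^{-1} x) + ... + f(T^{n} x))
            x = T_inv(x)
            total -= f(x)
    return total


def special_flow(f, T, T_inv, x, s, t):
    """Image of (x, s), 0 <= s < f(x), under the time-t map of the special flow."""
    u = s + t
    while u >= f(x):    # above the roof: use (x, u) ~ (T x, u - f(x))
        u -= f(x)
        x = T(x)
    while u < 0:        # below the base: use the same identification backwards
        x = T_inv(x)
        u += f(x)
    return x, u
```

The two loops terminate because the ceiling function is bounded away from zero. For \(t+s=f^{(n)}(x)+u\) with \(0\leqslant u<f(T^nx)\) the function returns \((T^nx,u)\), in agreement with the formula \(T_t^f(x,s) = (T^n x, t+s-f^{(n)}(x))\).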

The following general lemma is a direct consequence of Birkhoff ergodic theorem.

Lemma 3.1

Let \(T\) be an ergodic automorphism of \((X,\fancyscript{B},\mu )\). Let \(f\in L^1(X,\mu ), \int _Xf\,d\mu \ne 0\). For every \(\epsilon ,\kappa >0\) there exist \(N=N(\epsilon ,\kappa )\) and a set \(A=A(\epsilon ,\kappa )\) with \(\mu (A)>1-\epsilon \) such that for every \(M\geqslant N\)

$$\begin{aligned} \left| \frac{1}{M}\sum _{i=1}^Mf(T^ix)- \int _{X}f\,d\mu \right| \leqslant \frac{\kappa }{3}\left| \int _X f\,d\mu \right| \end{aligned}$$
(15)

for every \(x\in A\).

Remark 3.2

Assume that additionally \(f\) is positive and bounded away from zero. Fix \(\epsilon ,\kappa >0 (\kappa <|\int _X f\,d\mu |<1/2)\). It follows that there are constants \(r_1,r_2>0\) such that if we take \(x\in A\) then for any \(M,L\geqslant N\) with \(\frac{L}{M}\geqslant \kappa \), we have \(r_1<\frac{f^{(M)}(x)}{M}<r_2\) and

$$\begin{aligned} r_1< & {} \frac{(1-\frac{\kappa }{3})\int f\,d\mu (M+L)-(1+\frac{\kappa }{3})\int f\,d\mu \cdot M}{L}\\\leqslant & {} \frac{f^{(M+L)}(x)-f^{(M)}(x)}{L}<r_2. \end{aligned}$$

We now state the main result of this section. It is similar to Lemma 6 of [9].

Proposition 3.3

Let \(T:(X,d)\rightarrow (X,d)\) be an ergodic isometry and \(f\in L^1(X,\fancyscript{B},\mu )\) a positive function bounded away from zero. Let \((T_t^f)_{t\in \mathbb {R}}\) be the corresponding special flow. Let \(P\subset \mathbb {R}\setminus \{0\}\) be a compact set. Assume that for every \(\epsilon >0\) and \(N\in \mathbb {N}\) there exist \(\kappa =\kappa (\epsilon ), \delta =\delta (\epsilon ,N)\) and a set \(X'=X'(\epsilon ,N)\) with \(\mu (X')>1-\epsilon \), such that for any \(x,y\in X'\) with \(0<d(x,y)<\delta \) there exist \(\mathbb {N}\ni M=M(x,y),L=L(x,y)\) with \(M,L\geqslant N, \frac{L}{M}\geqslant \kappa \) and \(p=p(x,y)\in P\) such that

$$\begin{aligned} |f^{(n)}(x)-f^{(n)}(y)-p|<\epsilon \;\;\text {for every}\;\; n\in [M,M+L] \end{aligned}$$
(16)

or

$$\begin{aligned} |f^{(-n)}(x)-f^{(-n)}(y)-p|<\epsilon \;\text {for every}\;\; n\in [M,M+L]. \end{aligned}$$
(17)

If \(\gamma >0\) is such that the automorphism \(T^f_\gamma \) is ergodic, then \((T^f_t)_{t\in \mathbb {R}}\) has the switchable \(R(\gamma ,P)\)-property. Consequently, \((T^f_t)_{t\in \mathbb {R}}\) has the SWR-property.

Proof

Fix \(\gamma >0\) such that \(T^f_\gamma \) is ergodic. Fix also \(\frac{1}{\Vert f\Vert _{L^1}}>4\epsilon >0\). Apply Remark 3.2 with the constants \(\epsilon /4,\kappa \) to \(f\) and \(T, T^{-1}\), respectively to obtain constants \(D_1,D_2>0\) such that for \(x\in A, \mu (A)>1-\epsilon /2\) (the set \(A\) is the intersection of two relevant sets), we have

$$\begin{aligned}&D_1<\frac{f^{(M)}(x)}{M},\frac{f^{(M+L)}(x)-f^{(M)}(x)}{L},\frac{f^{(-M)}(x)}{-M},\nonumber \\&\quad \frac{f^{(-M-L)}(x)-f^{(-M)}(x)}{-L}<D_2. \end{aligned}$$
(18)

Fix \(N>\frac{2}{D_2\epsilon ^2}\). Let \(\epsilon ':=\min (\frac{D_1\epsilon }{8(\gamma +D_2)},\frac{\epsilon }{16})\). Let \(\kappa ':=\frac{D_1}{D_2}\kappa (\epsilon ')\). Let us consider the set \(X(\epsilon )\) on which \((T_t^f)_{t\in \mathbb {R}}\) is \(\frac{\epsilon }{8}\)- “almost continuous”, that is

$$\begin{aligned} X(\epsilon ):=\left\{ (x,s)\in X^f\;:\; \frac{\epsilon }{8}<s<f(x)-\frac{\epsilon }{8}\right\} . \end{aligned}$$

Now, we will use ergodicity of \(T^f_\gamma \) and \(T^f_{-\gamma }\). It follows that there exist \(N_0:=N(\epsilon )\) and a set \(Z:=Z(\epsilon )\) with \(\mu ^f(Z)>1-\frac{\epsilon }{2}\) and for every \((x,s)\in Z\) and \(n\geqslant N_0\)

$$\begin{aligned} \left| \frac{1}{n}\sum _{k=1}^n \chi _{X(\epsilon )}\left( T^f_{ki}(x,s)\right) -\left( 1-\frac{\epsilon }{4}\right) \right| <\frac{\kappa }{\kappa +1}\frac{\epsilon }{8} \end{aligned}$$
(19)

for \(i=\gamma ,-\gamma \). Moreover, since \(f\in L^1(X,\fancyscript{B},\mu )\), there exists a set \(V=V(\epsilon )\subset X\) with \(\mu (V)>1-\frac{\epsilon }{2}\) and such that for every \(x\in V, f(x)<\frac{2}{\epsilon ^2}\). Define the set \(Z':=Z\cap \{(x,s)\in X^f\,:\; x\in V\}\cap \{(x,s)\in X^f\;: \; x\in A\}\), then \(\mu ^f(Z')>1-\epsilon \).

Let \(\delta ':=\delta (\epsilon ', 2\gamma \frac{\max (N_0,N)}{D_1})\). Take two points \((x,s),(x',s')\in Z'\), such that \(x\ne x'\) and \(d^f((x,s),(x',s'))<\delta '\). It follows by definition of \(d^f\) that \(d(x,x')<\delta '\) and therefore by our assumptions there exist \(M,L\geqslant 2\gamma \frac{\max (N_0,N)}{D_1}\) with \(\frac{L}{M}\geqslant \kappa \) and \(p\in P\) such that either \(|f^{(n)}(x)-f^{(n)}(x')-p|<\epsilon '\) for all \(n\in [M,M+L]\), or \(|f^{(-n)}(x)-f^{(-n)}(x')-p|<\epsilon '\) for all \(n\in [M,M+L]\). We will consider the second case (the proof in the first case goes along the same lines).

Let us define

$$\begin{aligned} M':=\frac{f^{(-M)}(x)-s}{-\gamma }\quad \text { and }\quad L':=\frac{f^{(-M-L)}(x)-f^{(-M)}(x)}{-\gamma }. \end{aligned}$$

By (18) it follows that \(L'=\frac{f^{(-L-M)}(x)-f^{(-M)}(x)}{-L}\frac{-L}{-\gamma }> \frac{-LD_1}{-\gamma }>N\). Similarly, \(\frac{f^{(-M)}(x)-s}{-\gamma }>\frac{f^{(-M)}(x)}{-\gamma }>\frac{MD_1}{\gamma }\), so \(M'>N\). Moreover, since \((x,s)\in Z', s<\frac{2}{\epsilon ^2}<ND_2\leqslant M D_1D_2/(2\gamma )\) (by the choice of \(N\)) and therefore

$$\begin{aligned} \frac{L'}{M'}\geqslant \frac{LD_1}{\gamma }\frac{-\gamma }{f^{(-M)}(x)-s}\geqslant \frac{LD_1}{MD_2}\geqslant \kappa '. \end{aligned}$$

It follows by the properties of \(M',L'\in \mathbb {N}\) that if \((x,s)\in Z'\subset Z\) we have

$$\begin{aligned} \left| \frac{1}{L'}\sum _{k=M'}^{M'+L'}\chi _{X(\epsilon )}\left( T^f_{-k\gamma }(x,s)\right) -(1-\epsilon )\right| <\frac{\epsilon }{2}. \end{aligned}$$
(20)

Take any \(k\in [M',M'+L']\) such that \(T^f_{-k\gamma }(x,s)\in X(\epsilon )\). It follows that there exists a number \(m_k\in [M,M+L]\) such that \(T^f_{-k\gamma }(x,s)=(T^{m_k}x,-k\gamma +s-f^{(-m_k)}(x))\), where, by the fact that \(T^f_{-k\gamma }(x,s)\in X(\epsilon )\), we have \(f^{(-m_k-1)}(x)+\frac{\epsilon }{8}<-k\gamma +s<f^{(-m_k)}(x)-\frac{\epsilon }{8}\). Using additionally the inequality \(|s-s'|<\delta '\), we hence obtain

$$\begin{aligned}&f^{(-m_k-1)}(x')\leqslant f^{(-m_k-1)}(x)+p-\epsilon '\leqslant f^{(-m_k-1)}(x)+p+\frac{\epsilon }{8}-\delta '\\&\quad <-k\gamma +s'+p. \end{aligned}$$

A similar reasoning shows that

$$\begin{aligned} -k\gamma +s'+p<-k\gamma +s+p+\delta '\leqslant f^{(-m_k)}(x)+p-\frac{\epsilon }{8}+\delta '\leqslant f^{(-m_k)}(x'). \end{aligned}$$

Therefore, by the definition of the special flow, we have \(T^f_{-k\gamma +p}(x',s')=(T^{m_k}x',-k\gamma +s'+p-f^{(-m_k)}(x'))\). Consequently,

$$\begin{aligned} d^f(T^f_{-k\gamma }(x,s),T^f_{-k\gamma +p}(x',s'))= & {} d^f((x,s),(x',s'))+|f^{(-m_k)}(x)\\&-f^{(-m_k)}(x')-p|<\epsilon . \end{aligned}$$

Now, the number of \(k\in [M',M'+L']\) such that \(T^f_{-k\gamma }(x,s)\in X(\epsilon )\) is, by (20), at least \((1-\epsilon )L'\) and for any such \(k\) we get that \(d^f(T^f_{-k\gamma }(x,s),T^f_{-k\gamma +p}(x',s'))<\epsilon \). Hence

$$\begin{aligned} \frac{1}{L'}\left| \left\{ k\in [M',M'+L']\;:\;d^f(T^f_{-k\gamma }(x,s),T^f_{-k\gamma +p}(x',s'))<\epsilon \right\} \right| >1-\epsilon . \end{aligned}$$

This gives us the switchable \(R(\gamma ,P)\)-property. Note that since the flow \((T_t^f)_{t\in \mathbb {R}}\) is ergodic, the set of \(\eta \in \mathbb {R}\) such that \(T^f_\eta \) is not ergodic is at most countable and therefore \((T^f_t)_{t\in \mathbb {R}}\) enjoys the SWR-property. \(\square \)

4 SWR-property for smooth special flows with singularities

In this section we will use Proposition 3.3 to prove SWR-property for special flows given by the assumptions in Theorem 2 and Theorem 4. In all the sequel we assume \(\{a_1,\ldots ,a_k\}\) are badly approximable by \(\alpha \) with a constant \(C>1\) (see Definition 1.3). We start with an easy combinatorial fact about the visits of an orbit by the rotation \(R_\alpha \) to the neighborhood of the singularities.

Lemma 4.1

Let \(s\in \mathbb {N}\) be such that \(q_{s+1}>4Cq_s\) and \(x\in \mathbb {T}\). Then

$$\begin{aligned} \{x+j\alpha \}_{j=0}^{[\frac{q_{s+1}}{4C}]} \cap \bigcup _{i=1}^k\left[ \frac{-1}{4Cq_s}+a_i,a_i+\frac{1}{4Cq_s}\right] \subset \{x+rq_s\alpha +i_0\alpha \}_{r=0}^{[\frac{q_{s+1}}{4Cq_s}]}, \end{aligned}$$

where \(i_0\in \{0,\ldots ,q_s-1\}\) is such that \(\rho (\{x+v\alpha \}_{v=0}^{q_s-1},\{a_i\}_{i=1}^k)= \rho (x+i_0\alpha ,\{a_i\}_{i=1}^k)\). For finite sets \(A,B\subset \mathbb {T}\), we use the notation \(\rho (A,B)=\min _{a\in A,b\in B}\Vert a-b\Vert \).

Proof

Observe that for every \(0\leqslant j\leqslant q_s-1, j\ne i_0\) and every \(r=0,\ldots ,[\frac{q_{s+1}}{4Cq_s}]\) we have

$$\begin{aligned} \Vert x+rq_s\alpha +j\alpha -(x+j\alpha )\Vert = \Vert rq_s\alpha \Vert \leqslant r\Vert q_s\alpha \Vert \leqslant \frac{1}{4Cq_s}. \end{aligned}$$
(21)

Moreover, since \(\{a_i\}_{i=1}^k\) are badly approximable [see (3)] and by the definition of \(i_0\) we get

$$\begin{aligned} \{x+j\alpha \}_{j=0}^{q_s-1}\cap \bigcup _{i=1}^k\left[ \frac{-1}{2Cq_s}+a_i,a_i+\frac{1}{2Cq_s}\right] \subset \{x+i_0\alpha \}. \end{aligned}$$
(22)

Therefore, for \(j\ne i_0\) and \(r=0,\ldots ,[\frac{q_{s+1}}{4Cq_s}]\), by (21) and (22)

$$\begin{aligned} \rho (x+j\alpha +rq_s\alpha ,\{a_i\}_{i=1}^{k})\geqslant \rho (x+j\alpha ,\{a_i\}_{i=1}^{k})-\frac{1}{4Cq_s}\geqslant \frac{1}{4Cq_s}. \end{aligned}$$

This finishes the proof. \(\square \)

We will often use the Denjoy-Koksma inequality to control the growth of the Birkhoff sums. For a reference, see for example [4].

Proposition 4.2

(Denjoy-Koksma inequality) Let \(f:\mathbb {T}\rightarrow \mathbb {R}\) be a function of bounded variation. Then

$$\begin{aligned} \left| \sum _{k=0}^{q_n-1}f(x+k\alpha )-q_n\int _\mathbb {T}f\,d\lambda \right| \leqslant \mathrm{Var}\,f, \end{aligned}$$

for every \(x\in \mathbb {T}\) and \(n\in \mathbb {N}\).
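As a sanity check of the Denjoy-Koksma inequality, the following Python sketch computes the denominators \(q_n\) of the convergents of \(\alpha \) and compares the Birkhoff sums along the times \(q_n\) with \(q_n\int f\,d\lambda \). The choices made here are assumptions for illustration only: \(\alpha \) is the golden mean, and \(f(x)=|x-\frac{1}{2}|\), for which \(\int _\mathbb {T}f\,d\lambda =\frac{1}{4}\) and \(\mathrm{Var}\,f=1\). The function names are hypothetical, and the continued fraction is computed in floating point, so only the first few denominators are reliable.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # assumed rotation number (golden mean)


def convergent_denominators(alpha, count):
    """Denominators q_0, q_1, ... of the convergents of an irrational alpha.

    Floating-point sketch: reliable only for a moderate number of terms.
    """
    q_prev, q = 0, 1                     # q_{-1} = 0, q_0 = 1
    x = alpha - math.floor(alpha)
    qs = [q]
    while len(qs) < count:
        x = 1.0 / x
        a = math.floor(x)                # next partial quotient
        x -= a
        q_prev, q = q, a * q + q_prev    # q_{n+1} = a_{n+1} q_n + q_{n-1}
        qs.append(q)
    return qs


def tent(x):
    """Assumed test function |x - 1/2| on [0, 1): integral 1/4, variation 1."""
    return abs(x - 0.5)


def denjoy_koksma_gap(f, integral, alpha, q, x):
    """|sum_{k=0}^{q-1} f(x + k*alpha) - q * integral|."""
    total = sum(f((x + k * alpha) % 1.0) for k in range(q))
    return abs(total - q * integral)
```

For example, `denjoy_koksma_gap(tent, 0.25, ALPHA, 144, 0.123)` should stay below \(\mathrm{Var}\,f=1\). The bound is uniform in \(x\) and in \(n\), which is what makes it useful for the functions with singularities considered below once they are truncated near the singularities.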

The following lemma is a simple consequence of the Denjoy-Koksma inequality. It will be very useful in separating the contribution to the shear of the visits to the neighborhood of the singularities from the rest of the orbit.

Lemma 4.3

Let \(h\in C^2(\mathbb {T}\setminus \{0\})\) be positive and decreasing on \((0,1)\) with \(h'\) increasing on \((0,1)\) and \(\lim _{x\rightarrow 0^+}h(x)=\lim _{x \rightarrow 0^+}(-h'(x))=+\infty \). Denote by \(c_0:=\inf _\mathbb {T}h\). Then for every \(x\in \mathbb {T}\) and \(s\in \mathbb {N}\) we have the following estimates:

$$\begin{aligned}&-q_s\left( h\left( \frac{1}{2q_s}\right) -c_0\right) -2h'\left( \frac{1}{2q_s}\right) >h'^{(q_s)}(x)\geqslant h'(x+j\alpha )\\&\quad -q_s\left( h\left( \frac{1}{2q_s}\right) -c_0\right) +2h'\left( \frac{1}{2q_s}\right) \end{aligned}$$

where \(j\in \{0,\ldots ,q_s-1\}\) is such that \(\min _{\ell \in \{0,\ldots ,q_s-1\}}|x+\ell \alpha |=x+j\alpha \).

Proof

Fix \(s\in \mathbb {N}\). Consider

$$\begin{aligned} \bar{h}(x)={\left\{ \begin{array}{ll} 0,\; &{}\text {if} \;x\in \left[ 0,\frac{1}{2q_s}\right) \\ h'(x), \;&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Then \(\bar{h}\in BV(\mathbb {T})\) and we use the Denjoy-Koksma inequality to obtain \(|\bar{h}^{(q_s)}(x)-q_s\int _\mathbb {T}\bar{h}(t)d\lambda |<\mathrm{Var}\,\bar{h}\). But \(\int _\mathbb {T}\bar{h}(t)d\lambda =c_0-h(\frac{1}{2q_s})\) and \(\mathrm{Var}\,\bar{h} \leqslant - 2h'(\frac{1}{2q_s})\). Moreover, by (2), the set \(\{x+r\alpha \}_{r=0}^{q_s-1}\cap [0,\frac{1}{2q_s}]\) is at most a singleton. We then finish since \(h'^{(q_s)}(x)=\bar{h}^{(q_s)}(x)+h'(x+j\alpha )\chi _{[0,\frac{1}{2q_s}]}(x+j\alpha )\) and \(h'<0\). \(\square \)

Lemma 4.4

Let \(f\in C^2(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\). Assume that for \(i=1,\ldots ,k, \lim _{x\rightarrow a_i^+}\frac{f'(x)}{r_i(x-a_i)}\) and \(\lim _{x\rightarrow a_i^-}\frac{f'(x)}{r_i(a_i-x)}\) exist and are finite, where \(0\leqslant r_i\in C^2(\mathbb {T}\setminus \{0\})\) is decreasing on \((0,1)\) with \(r_i'\) increasing on \((0,1)\). Then there exists a constant \(H>0\) such that

$$\begin{aligned} |f'(x)|<H \left( \sum _{i=1}^k -r_i'(x-a_i)-r_i'(a_i-x)\right) , \end{aligned}$$

for each \(x\in \mathbb {T}\).

Proof

By assumptions, there exist constants \(z_0>0\) and \(K\geqslant 0\) such that for every \(i=1,\ldots ,k\) we have \(|f'(x)|<-Kr_i'(a_i-x)\) for every \(x\in [-z_0+a_i,a_i)\) and \(|f'(x)|<-Kr_i'(x-a_i)\) for every \(x\in (a_i,a_i+z_0]\).

Moreover, since \(f'\in C^1(\mathbb {T}\setminus \{a_1,\ldots ,a_k\})\), there exists a constant \(R>0\) such that \(|f'(x)|<R\) for every \(x\in \mathbb {T}\setminus \bigcup _{i=1}^k[-z_0+a_i,a_i+z_0]\). Denote \(C_0:=\min _{i=1,\ldots ,k}|\sup _{\mathbb {T}}r_i'|\). Now, the constant \(H:=2\max _{i=1,\ldots ,k}\{K,\frac{R}{C_0}\}\) satisfies the assertion of the lemma. \(\square \)

4.1 Proof of Theorem 2

4.1.1 Outline of the proof.

We first give an outline of the proof in which we suppose for simplicity that the ceiling function has just one singularity that is exactly logarithmic. The SWR-property (in the same way as the original Ratner’s property) consists of two parts. First, for two “nearby” points, we need to show that their orbits drift apart by a controlled amount at some time \(M\) (\(M\) may be positive or negative in the case of the SWR property). Second, we need to make sure that their orbits keep essentially the same drift during an interval of time comparable to \(M\). For special flows over rotations (or IET’s) we gave in Proposition 3.3 a characterization of the SWR property based on the divergence for nearby points between the corresponding Birkhoff sums of the ceiling function \(f\). The latter is controlled by the Birkhoff sums of the derivative of the ceiling function that were the subject of investigation of several other works (see e.g. [8, 17, 19, 20, 32, 35]).

In Proposition 4.6 Part a, we prove the first part of the SWR property. Precisely, we want to see a macroscopic yet controlled drift between the Birkhoff sums of two points \(x,y\) with \(1/(q_s \ln q_s) \geqslant d(x,y) \geqslant 1/(q_{s+1} \ln q_{s+1})\) for every \(s\) sufficiently large. We now explain the proof, showing also how the argument simplifies if \(K_{\alpha }=\{s\in \mathbb {N}\;:\;q_{s+1}<q_s(\ln q_s)^{\frac{7}{8}}\}\) contains all sufficiently large integers, in particular if \(\alpha \) is of constant type.

  1. a.

    There exists \(c>0\) such that for any point \(x \in \mathbb {T}\) either \(R^i_\alpha x\) is disjoint from \([-c/q_s,c/q_s]\) for every \(i=0,\ldots ,l< \frac{q_s}{2}\) or \(R^{-i}_\alpha x\) is disjoint from \([-c/q_s,c/q_s]\) for every \(i=1,\ldots ,l< \frac{q_s}{2}\).

  2. b.

    By the Denjoy-Koksma inequality, if the forward (backward) orbit of length \(q_s\) a point \(x\) by \(R_\alpha \) is disjoint with the \(1/(q_s \ln q_s)\) neighborhood of the origin we have that \(|f'^{(q_s)}(x)|\) (or \(|f'^{(-q_s)}(x)|\)) is of order \(q_s \ln q_s\) (this is easy to understand if we rearrange the orbit of \(x\) almost like \(c/q_s, (c+1)/q_s,(c+2)/q_s...\) with \(c \in (0,1)\) far from \(0\) and from \(1\)). Hence for any point \(x\) either \(|f'^{(q_s)}(x)|\) or \(|f'^{(-q_s)}(x)|\) is of order \(q_s \ln q_s\). This is the content of Lemma 4.8. In case \(q_{s+1} \leqslant C q_s\), this would give the macroscopic yet controlled drift between the points \(x,y\) such that \(1/(q_{s+1} \ln q_{s+1}) \leqslant d(x,y) \leqslant 1/(q_{s} \ln q_{s})\) either in the future at time \(q_s\) or in the past at time \(-q_s\). So, if \(\alpha \) is of constant type, we would be done with the proof of Proposition 4.6 Part a.

  3. c.

    In the case \(q_{s+1} \gg q_s\) but \(s \in K_\alpha \), we may have to consider the Birkhoff sums beyond \(q_s\) to see the drift between the orbits of \(x\) and \(y\) such that \(1/(q_{s+1} \ln q_{s+1}) \leqslant d(x,y) \ll 1/(q_{s} \ln q_{s})\). Since \(s \in K_\alpha \), we have that \(1/(q_s \ln q_s) \ll c/q_{s+1}\) hence a. applied to \(s+1\) implies that either up to \(q_{s+1}/2\) in the future or up to \(-q_{s+1}/2\) in the past the orbit of \(x\) by \(R_\alpha \) does not enter the \(1/(q_s \ln q_s)\) neighborhood of the origin (see Lemma 4.7, case \(m \in K_\alpha \)). Using this, the estimate of b. and the Denjoy-Koksma inequality we get by the cocycle identity that \(f'^{(\iota kq_s)}(x)\) behaves like \(kq_s \ln q_s\) for \(s\) sufficiently large, \(k \in [1,O(q_{s+1}/q_s)]\) and \(\iota =1\) or \(-1\) (see Lemma 4.9). As a consequence there exists a time of the form \(n_0q_s, n_0 \in [1,O(q_{s+1}/q_s)]\) such that \(f^{(\iota n_0q_s)}(x)-f^{(\iota n_0q_s)}(y)\) is in some compact set \(P\) away from \(0\), which finishes the proof of Proposition 4.6 Part a in the case \(K_\alpha \) contains all sufficiently large integers.

  4. d.

    In the case \(s \notin K_\alpha \), we define \(x_s := 1/(q_s (\ln q_s)^{7/8})\) and use our arithmetic condition \(\alpha \in \mathcal {E}\) to define a set \(Z\) of almost full measure (see definition of \(Z\) in Sect. 4.1.2) such that \(R^i_\alpha x\) does not enter the \(x_s\) neighborhood of the origin (see Definition (25)) for every \(i\in \{-q_s,\ldots , 0,\ldots , q_s-1\}\) for every \(s \notin K_\alpha \) sufficiently large. This actually implies that either up to \(\frac{q_{s+1}}{4C}\) in the future or up to \(-\frac{q_{s+1}}{4C}\) (\(C\) is a constant coming from Definition 1.3; in case there is only one singularity \(C=1\)) in the past the orbit of \(x\) by \(R_\alpha \) does not enter the \(x_s\) neighborhood of the origin (see Lemma 4.7, case \(m \notin K_\alpha \)). From there the proof of Proposition 4.6 Part a is similar to case c. above, except that condition 3 of Theorem 2, namely \(h(\frac{1}{2q_s})/h(\frac{1}{2q_{s+1}})>m_0\), is used to show that a stretch of order \( n_0q_s \ln q_s\), where \(n_0\) can be taken to be as large as \(O(q_{s+1}/q_s)\), is sufficient to produce a drift between the points \(x,y\) such that \(d(x,y)\) is comparable to \(1/(q_{s+1} \ln q_{s+1})\). This is where the Diophantine condition 3. of Theorem 2 is crucial.

Observe that d. is the only part where we used that in the definition of SWR, it is allowed to discard a small measure set of pairs \((x,y)\) for which the property will not be checked.

Note however that in all the cases above, it is crucial to use the possibility to control the drift in the future or in the past depending on the pair of points.
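The heuristic used in b., that an orbit segment of length \(q_s\) is spread almost like a lattice of step \(1/q_s\), can be checked numerically. The following is only an illustrative sketch: the helper names (`denominators`, `circle_dist`, `min_orbit_gap`) and the choice of the golden-mean rotation are assumptions made for this example and do not come from the proofs below.

```python
import math

def denominators(partial_quotients):
    """Denominators q_0, q_1, ... of the convergents of [0; a_1, a_2, ...]."""
    q_prev, q = 0, 1              # q_{-1} = 0, q_0 = 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def circle_dist(t):
    """Distance from t to the nearest integer, i.e. ||t|| on the circle."""
    return abs(t - round(t))

def min_orbit_gap(alpha, length):
    """Smallest circle distance between distinct points of {i*alpha : 0 <= i < length}."""
    # The distance between two orbit points depends only on the index difference.
    return min(circle_dist(d * alpha) for d in range(1, length))

# Assumed example: golden-mean rotation, all partial quotients equal to 1.
alpha = (math.sqrt(5) - 1) / 2
qs = denominators([1] * 15)       # Fibonacci numbers 1, 1, 2, 3, 5, ...

for s in range(3, 14):
    q = qs[s]
    # gap * q stays bounded away from 0: the orbit is lattice-like at scale 1/q
    print(q, min_orbit_gap(alpha, q) * q)
```

In this example the rescaled gap stays above \(1/2\), which is the quantitative form of the lattice picture for an orbit of length \(q_s\).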

The second part (keeping the drift) is proved in Proposition 4.6 Part b. We need to consider the points \(R_\alpha ^{n_0q_s}x\) and \(R_\alpha ^{n_0q_s}y\) and apply similar arguments as in Part a to bound the drift during time \(\kappa n_0q_s\). The main ingredient is Lemma 4.10, which allows us to bound the drift between the Birkhoff sums in the future (or in the past) up to a time comparable to \(q_{m+1}\) for points that stay away from the singularities in the future (or in the past) during this time. We then have to “situate” \(\kappa n_0 q_s\) relative to the denominators of \(\alpha \) and check that the conditions of Lemma 4.10 are satisfied by \(R_\alpha ^{n_0q_s}x\) and \(R_\alpha ^{n_0q_s}y\). Of course if \(s\) is such that \(q_{s+1}\gg q_s\) (for example if \(s \notin K_\alpha \)) and if \(1/(q_{s+1} \ln q_{s+1}) \ll d(x,y) \ll 1/(q_s \ln q_s)\), then the same argument of Part a would allow us to keep the drift under control for an additional \(\kappa n_0q_s\) time. But in the other cases, where we have in particular to interpolate between the constant type and non-constant type behavior, our proof gets a bit technical and treats different cases separately.

4.1.2 Notations and standing assumptions

In all the proofs of Theorem 2, Theorem 4 and Theorem 1, we will use \(T\) for the irrational rotation \(R_\alpha \). We may assume WLOG that \(\sum _{i=1}^k(A_i-B_i)>0\). Fix \(1 \gg \epsilon >0\) and \(N\in \mathbb {N}\). Let \(d=\sum _{i=1}^k(A_i-B_i)-\min (\frac{1}{10},\frac{\sum _{i=1}^k(A_i-B_i)}{2})>0\). Define \(\kappa =\kappa (\epsilon ):=\frac{\epsilon m_0d}{64(d+1)Hk}\), where \(H\) comes from Lemma 4.4, and \(m_0>0\) is the constant coming from 3. in Theorem 2. With \(C>0\) the constant from the Definition 1.3 of badly approximable singularities, we let

$$\begin{aligned} P:=\left[ -2(d+1),\frac{-dm_0}{32C}\right] \cup \left[ \frac{dm_0}{32C},2(d+1)\right] , \end{aligned}$$
(23)

In the sequel, we will assume \(s \geqslant s_0(\epsilon ,N)=s_0\), where \(s_0\) is a sufficiently large integer, in particular \(\kappa q_{s_0}>N\).

We now summarize the consequences of hypotheses 1., 2., 3. of Theorem 2 that will be useful to us in the sequel. If \(s \geqslant s_0\) we have

$$\begin{aligned}&\frac{|h'\left( \frac{x_s}{4C}\right) |}{q_sh\left( \frac{1}{2q_s}\right) }<\frac{\epsilon }{2},\;\; \frac{x_s}{2C}q_sh\left( \frac{1}{2q_s}\right) >\frac{1}{\epsilon },\nonumber \\&\quad \sum _{s\geqslant s_0,s\notin K_\alpha }x_s q_s<\frac{\epsilon }{16k},\;\; h\left( \frac{1}{2q_s}\right) /h\left( \frac{1}{2q_{s+1}}\right) >m_0 \end{aligned}$$
(24)

We also note that \(h\left( \frac{1}{2q_s}\right) >8C\). Set \(v_s:=\frac{x_s}{4C}\) and define

$$\begin{aligned} W_s:= & {} \left\{ x\in \mathbb {T}\;:\, x-q_s\alpha ,\ldots ,x,\ldots ,x+(q_s-1)\alpha \right. \nonumber \\&\left. \quad \notin \bigcup _{i=1}^k(-4v_s+a_i,a_i+4v_s)\right\} \end{aligned}$$
(25)

and \(Z:=\bigcap _{s\geqslant s_0,s\notin K_\alpha }W_s\).

Observe that \(\lambda (Z)\geqslant 1- \epsilon \) (since \(\lambda (W_s)\geqslant 1-16kv_sq_s \) for every \(s\), and by the third inequality in (24)).
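The bound \(\lambda (W_s)\geqslant 1-16kv_sq_s\) is a union bound: the complement of \(W_s\) is covered by \(2q_sk\) arcs of length \(8v_s\). A minimal numerical sketch of this count follows; the function names, the sample singularities and the value of \(v\) are hypothetical choices made only for illustration.

```python
def union_length(arcs):
    """Total length of a union of arcs (start, length) on the circle R/Z, each length < 1."""
    pieces = []
    for start, length in arcs:
        a = start % 1.0
        b = a + length
        if b <= 1.0:
            pieces.append((a, b))
        else:                          # the arc wraps around 0: split it in two
            pieces.append((a, 1.0))
            pieces.append((0.0, b - 1.0))
    pieces.sort()
    total, cur_a, cur_b = 0.0, None, None
    for a, b in pieces:
        if cur_b is None or a > cur_b:
            if cur_b is not None:
                total += cur_b - cur_a
            cur_a, cur_b = a, b
        else:                          # overlapping pieces are merged
            cur_b = max(cur_b, b)
    if cur_b is not None:
        total += cur_b - cur_a
    return total

def complement_of_W(alpha, q, singularities, v):
    """Arcs of x such that some x + j*alpha, -q <= j <= q-1, is 4v-close to a singularity."""
    return [(a - 4 * v - j * alpha, 8 * v)
            for j in range(-q, q)
            for a in singularities]

# Assumed sample data: golden-mean rotation, two singularities, q = 8.
alpha = (5 ** 0.5 - 1) / 2
arcs = complement_of_W(alpha, 8, [0.0, 0.3], 0.001)
measure_W = 1.0 - union_length(arcs)
print(measure_W, ">=", 1.0 - 16 * 2 * 0.001 * 8)
```

Overlaps between arcs can only make the complement smaller, so the computed measure of \(W_s\) is never below the union bound.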

Set \(\delta :=\frac{1}{q_{s_0}h(\frac{1}{2q_{s_0}})}\). Consider \(x,y\in Z\) with \(0<\Vert x-y\Vert <\delta \). We will assume WLOG that \(x<y\) (we consider the trigonometric order on \(\mathbb {T}\)).

4.1.3 Controlling the drift

The following proposition implies Theorem 2 due to Proposition 3.3.

Proposition 4.5

Consider \(x,y\in Z\) with \(0<\Vert x-y\Vert <\delta \). Then there exists \(p \in P, M,L\geqslant \kappa M \geqslant N\) such that either (16) holds for \(n \in [M,M+L]\) or (17) holds for \(n \in [M,M+L]\).

Proposition 4.5 can be deduced from the following main result on the drift of the Birkhoff sums of a function with logarithmic like singularities. Let \(s:=s(x,y) (s\geqslant s_0)\) be unique such that

$$\begin{aligned} x<y, \quad \frac{1}{q_{s+1}h\left( \frac{1}{2q_{s+1}}\right) }< \Vert x-y\Vert \leqslant \frac{1}{q_sh\left( \frac{1}{2q_s}\right) }. \end{aligned}$$
(26)

We will assume that \(q_{s+1}>2q_s\). If not, then in (26), \(\frac{m_0}{2q_sh(\frac{1}{2q_s})}<\Vert x-y\Vert \) and we repeat the considerations below in the time interval \([q_{s-1},q_s]\). In other words, in this case we will see the drift between \(x\) and \(y\) before time \(q_s\).
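The choice of the scale \(s(x,y)\) in (26) is a simple search among the thresholds \(1/(q_sh(\frac{1}{2q_s}))\). The sketch below illustrates it under the assumption, made only for this example, that \(h(t)=-\ln t\) (the purely logarithmic model); the function name `scale_index` is hypothetical.

```python
import math

def scale_index(dist, qs, h=lambda t: -math.log(t)):
    """Index s with 1/(q_{s+1} h(1/(2 q_{s+1}))) < dist <= 1/(q_s h(1/(2 q_s))), or None.

    qs is a strictly increasing list of denominators; h is the assumed singularity model.
    """
    def threshold(q):
        return 1.0 / (q * h(1.0 / (2 * q)))

    for s in range(len(qs) - 1):
        if threshold(qs[s + 1]) < dist <= threshold(qs[s]):
            return s
    return None

# Assumed sample denominators (golden mean, repeated initial 1 dropped).
qs = [1, 2, 3, 5, 8, 13]
print(scale_index(0.05, qs))
```

Since the thresholds decrease strictly along an increasing sequence of denominators, at most one index can satisfy the two inequalities, which is the uniqueness claimed for \(s(x,y)\).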

Proposition 4.6

Consider \(x,y\in Z\) as in (26).

Part a There exists \(n_0\in \{1,\ldots ,\max (\frac{q_{s+1}}{8Cq_s},1)\}\) satisfying

$$\begin{aligned} f^{(n_0q_s)}(x)-f^{(n_0q_s)}(y)\in P \end{aligned}$$
(27)

or

$$\begin{aligned} f^{(-n_0q_s)}(x)-f^{(-n_0q_s)}(y)\in P \end{aligned}$$
(28)

and such that the following holds

Part b Let \(X=T^{n_0q_s}x\) and \(Y=T^{n_0q_s}y\) if (27) holds, and \(X=T^{-(n_0q_s+1)}x\) and \(Y=T^{-(n_0q_s+1)}y\) if (28) holds. For \(n=1,\ldots ,[\kappa n_0q_s]+1\) we have

$$\begin{aligned} \text {A.}\;|f^{(n)}(X)-f^{(n)}(Y)|<\epsilon \;\;\;\; \text {or} \;\;\;\;\text {B.}\;|f^{(-n)}(X)-f^{(-n)}(Y)|<\epsilon . \end{aligned}$$
(29)

The rest of this section is devoted to the proof of Proposition 4.6. But before this we show how it implies Proposition 4.5 and thus Theorem 2.

Proof of Proposition 4.5

Suppose (27) holds, the other case being similar. If A. from (29) holds, set \(M:=n_0q_s, L:=[\kappa M]+1\) and \(p:=f^{(n_0q_s)}(x)-f^{(n_0q_s)}(y) \in P\). If B. holds, we set \(M:=[\frac{n_0q_s}{1+\kappa }], L:=[\kappa M]+1\) and \(p:=f^{(n_0q_s)}(x)-f^{(n_0q_s)}(y)\in P\). Notice that in both cases \(M,L\geqslant \frac{1}{2} \kappa n_0q_s\geqslant \frac{1}{2}\kappa q_s\geqslant \frac{1}{2} \kappa q_{s_0}\geqslant N\). Finally, using \(A.\) or \(B.\), the cocycle identity for the Birkhoff sums and the triangle inequality, we get that for \(n\in [M,M+L], |f^{(n)}(x)-f^{(n)}(y)-p|< \epsilon \) for some \(p\in P\). \(\square \)
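The bookkeeping of the window \([M,M+L]\) in the proof above can be made concrete. The following sketch only reproduces the two choices of \(M\) and \(L\); the function name `drift_window` and the sample values are assumptions for illustration.

```python
import math

def drift_window(n0, q_s, kappa, forward_control):
    """Return (M, L) as chosen in the proof of Proposition 4.5.

    forward_control=True corresponds to alternative A. of (29),
    forward_control=False to alternative B.
    """
    T = n0 * q_s
    M = T if forward_control else math.floor(T / (1 + kappa))
    L = math.floor(kappa * M) + 1
    return M, L

# Assumed sample values.
print(drift_window(4, 100, 0.25, True))    # window starting at time n0*q_s
print(drift_window(4, 100, 0.25, False))   # window ending near time n0*q_s
```

In case B. the window is shifted to the left so that it ends at time about \(n_0q_s\), since there the drift is controlled in the past of \(T^{n_0q_s}x\); in both cases \(L\geqslant \kappa M\) by construction.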

4.1.4 Proof of Proposition 4.6 Part a

For \(m \in \mathbb {N}\), we will often use the following non resonance conditions of a pair of points \((x,y)\) with the singularities \(\{a_1,\ldots ,a_k\}\) [compare with (25)].

$$\begin{aligned} \bigcup _{j=-q_m}^{q_m-1}T^j[x,y]\cap \bigcup _{i=1}^k[-2v_m+a_i,a_i+2v_m]= & {} \emptyset \end{aligned}$$
(30)
$$\begin{aligned} \bigcup _{j=0}^{\max ([\frac{q_{m+1}}{4C}],q_m)}T^j[x,y]\cap \bigcup _{i=1}^k[-v_m+a_i,a_i+v_m]= & {} \emptyset .\end{aligned}$$
(31)
$$\begin{aligned} \bigcup _{j=1}^{\max ([\frac{q_{m+1}}{4C}],q_m)}T^{-j}[x,y]\cap \bigcup _{i=1}^k[-v_m+a_i,a_i+v_m]= & {} \emptyset . \end{aligned}$$
(32)

Lemma 4.7

Let \(x,y\in \mathbb {T}\) be as in (26). Then for every \(m\) such that \(s_0\leqslant m\leqslant s\), if we have at least one of the following

  1. 1.

    if \(m\notin K_\alpha \) and (30) is satisfied

  2. 2.

    if \(m\in K_\alpha \) and \(q_{m+1}\geqslant 2q_m\),

then we have at least one of (31) or (32).

Proof

Observe first that since by (26), \(\Vert x-y\Vert \leqslant v_s\leqslant v_m\), then it suffices to prove (31) or (32) with just \(x\) instead of \([x,y]\) on the LHS and \( 2v_m\) instead of \(v_m\) on the RHS.

Assume \(m\notin K_\alpha \). Since \(m\geqslant s_0\) we may assume that \(q_{m+1}\geqslant 16Cq_m\).

Let \(t_1\in [-q_m,q_m-1]\cap \mathbb {Z}\) and \(r_1\in \{1,\ldots ,k\}\) be such that

$$\begin{aligned} \rho \left( \{x+j\alpha \}_{j=-q_m}^{q_m-1},\{a_i\}_{i=1}^k\right) =\Vert x+t_1\alpha -a_{r_1}\Vert . \end{aligned}$$

We assume now that \(t_1<0\) and show (31). In case \(t_1\geqslant 0\), (32) would follow similarly.

By (30) we have that \(\Vert x+t_1\alpha -a_{r_1}\Vert \geqslant 2v_m\). Since \(m\notin K_\alpha \), and \(t_1<0\) it follows that for every \(r=0,1,\ldots , [\frac{q_{m+1}}{4Cq_m}]\),

$$\begin{aligned} \Vert x+(t_1+rq_m) \alpha -a_{r_1} \Vert \geqslant \Vert x+t_1\alpha -a_{r_1}\Vert \geqslant 2v_m. \end{aligned}$$

Moreover for every \(i\in \{1,\ldots ,k\}, i\ne r_1\), since \(a_1,\ldots ,a_k\) are badly approximable by \(\alpha \) (see Definition 1.3), we have for every \(r=0,1,\ldots , [\frac{q_{m+1}}{4Cq_m}] \)

$$\begin{aligned} \Vert x+(t_1+rq_m) \alpha -a_{i} \Vert \geqslant \Vert x+t_1\alpha -a_{i}\Vert -r\Vert q_m\alpha \Vert \geqslant \frac{1}{4Cq_m}\geqslant 2v_m. \end{aligned}$$

For \(j \leqslant [\frac{q_{m+1}}{4C}]-1\) and \(j\notin \{t_1,t_1+q_m,\ldots ,t_1+[\frac{q_{m+1}}{4Cq_m}]q_m-1\}\), Lemma 4.1 (for \(i_0=t_1\)) implies

$$\begin{aligned} x+j\alpha \notin \bigcup _{i=1}^k\left[ -\frac{1}{4Cq_m}+a_i,a_i+\frac{1}{4Cq_m}\right] \end{aligned}$$

and since \(2v_m \leqslant 1/(4Cq_m)\), this finishes the proof of (31). \(\square \)

In the following lemma we control the drift between the Birkhoff sums up to \(q_s\) or \(-q_s\) between nearby points that do not go too close to the singularities. Recall that we have assumed that \(d=\sum _{i=1}^k (A_i-B_i)>0\).

Lemma 4.8

For every \(s\geqslant s_0\) we have the following for any points \(x<y\in \mathbb {T}\) if

$$\begin{aligned} \bigcup _{j=0}^{q_s-1}T^j[x,y]\cap \bigcup _{i=1}^k[-2v_s+a_i,a_i+2v_s]=\emptyset \end{aligned}$$
(33)

then

$$\begin{aligned} (d+1)q_sh\left( \frac{1}{2q_s}\right) \Vert x-y\Vert \geqslant f^{( q_s)}(x)-f^{( q_s)}(y)\geqslant dq_sh\left( \frac{1}{2q_s}\right) \Vert x-y\Vert . \end{aligned}$$
(34)

If

$$\begin{aligned} \bigcup _{j=-q_s}^{-1}T^j[x,y]\cap \bigcup _{i=1}^k[-2v_s+a_i,a_i+2v_s]=\emptyset \end{aligned}$$
(35)

then

$$\begin{aligned} (d+1)q_sh\left( \frac{1}{2q_s}\right) \Vert x-y\Vert \geqslant f^{( -q_s)}(x)-f^{( -q_s)}(y)\geqslant dq_sh\left( \frac{1}{2q_s}\right) \Vert x-y\Vert . \end{aligned}$$
(36)

Proof

We show that (33) implies (34), the second part of the Lemma being similar. By (33) and (26), \(f^{(q_s)}\) is differentiable on \([x,y]\). Therefore, there exists \(\theta \in [x,y]\) such that

$$\begin{aligned} f^{(q_s)}(x)-f^{(q_s)}(y)=(x-y)f'^{(q_s)}(\theta ). \end{aligned}$$

It is enough to show that there exists \(d>0\) such that for \(s\geqslant s_0\)

$$\begin{aligned} (d+1)q_sh\left( \frac{1}{2q_s}\right) \geqslant -f'^{(q_s)}(\theta )\geqslant dq_sh\left( \frac{1}{2q_s}\right) . \end{aligned}$$
(37)

For \(s\in \mathbb {N}\), define

$$\begin{aligned} \bar{f'_s}(\theta )={\left\{ \begin{array}{ll} 0,\; &{}\text {if} \;\theta \in \bigcup _{i=1}^k\left[ -\frac{1}{2q_s}+a_i,a_i+\frac{1}{2q_s}\right] \\ f'(\theta ), \;&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

It follows that \(\bar{f'_s}\in BV(\mathbb {T})\) and

$$\begin{aligned} f'^{(q_s)}(\theta )=\bar{f'_s}^{(q_s)}(\theta )+\sum _{i\in J_s} f'(\theta +j_i\alpha )+\sum _{i\in L_s}f'(\theta +l_i\alpha ), \end{aligned}$$
(38)

where \(J_s=\{i \in [1,k] : \exists j_i\in \{0,\ldots ,q_s-1\}:\;\theta +j_i\alpha \in [-\frac{1}{2q_s}+a_i,a_i]\}\) and \(L_s:=\{i \in [1,k]: \exists l_i\in \{0,\ldots ,q_s-1\}:\;\theta +l_i\alpha \in [a_i,a_i+\frac{1}{2q_s}]\}\). Note that for every \(i \in [1,k] \) there exists at most one \( j_i\in \{0,\ldots ,q_s-1\}:\;\theta +j_i\alpha \in [-\frac{1}{2q_s}+a_i,a_i]\).

We apply the Denjoy-Koksma inequality to \(\bar{f'_s}\), to get

$$\begin{aligned} q_s\int _{\mathbb {T}}\bar{f'_s}d\lambda -\mathrm{Var}(\bar{f'_s})\leqslant |\bar{f'_s}^{(q_s)}(\theta )|\leqslant q_s\int _{\mathbb {T}}\bar{f'_s}d\lambda +\mathrm{Var}(\bar{f'_s}). \end{aligned}$$
(39)

We have

$$\begin{aligned} \int _{\mathbb {T}}\bar{f'_s}d\lambda= & {} \sum _{i=1}^k f\left( a_i+\frac{1}{2q_s}\right) -f\left( a_i-\frac{1}{2q_s}\right) \text { and }\mathrm{Var}(\bar{f'_s})\nonumber \\= & {} 2\sum _{i=1}^k\left( f'\left( a_i+\frac{1}{2q_s}\right) +f'\left( a_i-\frac{1}{2q_s}\right) \right) , \end{aligned}$$
(40)

(if \(s\in \mathbb {N}\) is sufficiently large). It follows by the assumptions on \(f'\) and \(h'\) and (30), that if \(s_0(\epsilon ,N)\) is sufficiently large, then for \(s\geqslant s_0\), we have for every \(i=1,\ldots ,k\):

$$\begin{aligned} |f'(\theta +j_i\alpha )|\leqslant & {} (B_i+1)|h'(\theta +j_i\alpha )|\leqslant (B_i+1)\left| h'\left( \frac{x_s}{4C}\right) \right| \nonumber \\\leqslant & {} \epsilon q_sh\left( \frac{1}{2q_s}\right) \quad \text { for } i \in J_s. \end{aligned}$$
(41)

and similarly

$$\begin{aligned} |f'(\theta +l_i\alpha )|\leqslant \epsilon q_sh\left( \frac{1}{2q_s}\right) \quad \text { for } i \in L_s. \end{aligned}$$
(42)

On the other hand, by l’Hospital’s rule

$$\begin{aligned} ((A_i+\epsilon )-(B_i-\epsilon ))h\left( \frac{1}{2q_s}\right)\geqslant & {} f\left( a_i+\frac{1}{2q_s}\right) -f\left( a_i-\frac{1}{2q_s}\right) \nonumber \\\geqslant & {} ((A_i-\epsilon )-(B_i+\epsilon ))h\left( \frac{1}{2q_s}\right) .\qquad \quad \end{aligned}$$
(43)
$$\begin{aligned} |f'\left( a_i+\frac{1}{2q_s}\right) |+|f'\left( a_i-\frac{1}{2q_s}\right) |\leqslant & {} ((A_i+1)+(B_i+1))\left| h'\left( \frac{1}{2q_s}\right) \right| \nonumber \\\leqslant & {} \epsilon q_sh\left( \frac{1}{2q_s}\right) , \end{aligned}$$
(44)

(by \(\frac{x_s}{4C}<\frac{1}{2q_s}\)).

Now, using (38)–(44), we get

$$\begin{aligned}&q_sh\left( \frac{1}{2q_s}\right) \left( \left( \sum _{i=1}^k(A_i-B_i)\right) - 6k\epsilon \right) \leqslant -f'^{(q_s)}(\theta )\\&\quad \leqslant q_sh\left( \frac{1}{2q_s}\right) \left( \left( \sum _{i=1}^k(A_i-B_i)\right) + 6k\epsilon \right) . \end{aligned}$$

which allows us to conclude, since we may assume WLOG that \(\varepsilon \) is sufficiently small (recall that \(x<y\)). \(\square \)

We now concatenate the inequalities of Lemma 4.8 if (31) or (32) are satisfied.

Lemma 4.9

Let \(x,y\in \mathbb {T}\) satisfy (26) and (31) with \(m=s\), then there exists \(n_0\in \{1,\ldots ,\max (\frac{q_{s+1}}{8Cq_s},1)\}\) such that (27) holds. Moreover,

$$\begin{aligned} n_0q_sh\left( \frac{1}{2q_s}\right) \leqslant \frac{2(d+1)}{d\Vert x-y\Vert }. \end{aligned}$$
(45)

If \(x,y\) satisfy (26) and (32) with \(m=s\), then there exists \(n_0\in \{1,\ldots ,\max (\frac{q_{s+1}}{8Cq_s},1)\}\) satisfying (45) such that (28) holds.

Proof

We will assume (31) holds, the other case being similar. We will use repeatedly (34) of Lemma 4.8 with \(x,y\) replaced by \(x+rq_s\alpha , y+rq_s\alpha \) respectively, with \(r=0,1,\ldots ,\max ([\frac{q_{s+1}}{4Cq_s}]-1,0)\). Indeed, by (31) the points \(x+rq_s\alpha ,y+rq_s\alpha \) satisfy (33). Summing up the obtained inequalities we get for any \( R\in [0, \max ([\frac{q_{s+1}}{4Cq_s}],1)]\),

$$\begin{aligned} R\Vert x-y\Vert (d+1)q_sh\left( \frac{1}{2q_s}\right) \!>\! f^{(Rq_s)}(x)- f^{(Rq_s)}(y)\!\geqslant \! R\Vert x-y\Vert dq_sh\left( \frac{1}{2q_s}\right) . \end{aligned}$$
(46)

Let \(e_R:=f^{(Rq_s)}(x)-f^{(Rq_s)}(y)\). Then for \(R\leqslant [\frac{q_{s+1}}{4Cq_s}]-1\) we have that \(e_{R+1}-e_R\leqslant d+1\).

Moreover, by (46) and (26) and the hypothesis \(h(\frac{1}{2q_s})/h(\frac{1}{2q_{s+1}})>m_0\) we get

$$\begin{aligned}&e_{\max \left( \left[ \frac{q_{s+1}}{8Cq_s}\right] ,1\right) }\geqslant \max \left( \frac{q_{s+1}}{8Cq_s},1\right) dq_sh\left( \frac{1}{2q_s}\right) \Vert x-y\Vert \\&\quad \geqslant dm_0\max \left( \frac{1}{8C}-\frac{q_s}{q_{s+1}}, \frac{q_s}{q_{s+1}}\right) \geqslant \frac{dm_0}{16C}. \end{aligned}$$

Therefore, there exists \(n_0\in \{1,\ldots ,\max ([\frac{q_{s+1}}{8Cq_s}],1)\}\) such that

$$\begin{aligned} f^{(n_0q_s)}(x)-f^{(n_0q_s)}(y)=e_{n_0} \in \left[ \frac{dm_0}{32C},2(d+1)\right] \subset P \end{aligned}$$

Moreover, (46) with \(R=n_0\) implies (45)

$$\begin{aligned} n_0q_sh\left( \frac{1}{2q_s}\right) \leqslant \frac{2(d+1)}{d\Vert x-y\Vert }. \end{aligned}$$
(47)

In case (32) is satisfied instead of (31), we show (28) using repeatedly (36) of Lemma 4.8. \(\square \)
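The selection of \(n_0\) in the proof above is a discrete intermediate value argument: the sequence \(e_R\) starts near \(0\), has bounded increments, and eventually exceeds a threshold, so its first entry above the threshold cannot overshoot by more than one increment. A toy sketch follows; the function name `first_entry` and the sample data are hypothetical.

```python
def first_entry(e, lower):
    """Smallest index n >= 1 with e[n] >= lower, or None; e[0] is the value at time 0."""
    for n in range(1, len(e)):
        if e[n] >= lower:
            return n
    return None

# Toy data (assumed): e[R] grows from 0 with increments at most `step`.
step, lower = 0.5, 1.0
e = [0.0, 0.3, 0.7, 1.2, 1.6]
n0 = first_entry(e, lower)

# Since e[n0 - 1] < lower and the last increment is at most `step`,
# the value e[n0] lies in [lower, lower + step).
assert lower <= e[n0] < lower + step
print(n0, e[n0])
```

In Lemma 4.9 the role of `lower` is played by \(\frac{dm_0}{32C}\) and that of `step` by \(d+1\), which is how the value \(e_{n_0}\) is placed in a compact set away from \(0\).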

We are now ready to finish the proof of Part a. of Proposition 4.6. If \(s\notin K_\alpha \), then by the fact that \(x,y\in Z\subset W_s\), it follows that 1. in Lemma 4.7 is satisfied with \(m=s\). If \(s\in K_\alpha \) then 2. in Lemma 4.7 is satisfied with \(m=s\). Therefore we can use Lemma 4.7 for \(x,y\) and \(m=s\). Now, by Lemma 4.9, if (31) holds we have (27), and if (32) holds we have (28). Part a. of Proposition 4.6 is settled; we now turn to Part b.

4.1.5 Proof of Proposition 4.6 Part b

For \(m\geqslant s_0\) and \(\mathbb {N}\cup \{0\}\ni l\leqslant \max (\frac{q_{m+1}}{8Cq_m}-1,0)\) we will consider the following conditions on \(x,y\in \mathbb {T}\) [compare with (31) and (32)]

$$\begin{aligned}&\bigcup _{j=0}^{(l+1)q_m}T^j[x,y]\cap \bigcup _{i=1}^k[-v_m+a_i,a_i+v_m]=\emptyset .\end{aligned}$$
(48)
$$\begin{aligned}&\bigcup _{j=1}^{(l+1)q_m}T^{-j}[x,y]\cap \bigcup _{i=1}^k[-v_m+a_i,a_i+v_m]=\emptyset . \end{aligned}$$
(49)

Lemma 4.10

Let \(x,y\in \mathbb {T}\) satisfy (26) and (48) for some \(m\geqslant s_0\) and \(\mathbb {N}\cup \{0\}\ni l\leqslant \max (\frac{q_{m+1}}{8Cq_m}-1,0)\). Then

$$\begin{aligned}&\text {for every } n=0,\ldots ,(l+1)q_m,\;\left| f^{(n)}(x)-f^{(n)}(y)\right| \nonumber \\&\quad <8kH\Vert x-y\Vert (l+1)q_mh\left( \frac{1}{2q_m}\right) \end{aligned}$$
(50)

Let \(x,y\in \mathbb {T}\) satisfy (26) and (49) for some \(m\geqslant s_0\) and \(\mathbb {N}\cup \{0\}\ni l\leqslant \max (\frac{q_{m+1}}{8Cq_m}-1,0)\). Then

$$\begin{aligned}&\text {for every } n=1,\ldots ,(l+1)q_m,\;|f^{(-n)}(x)-f^{(-n)}(y)|\nonumber \\&\quad <8kH\Vert x-y\Vert (l+1)q_mh\left( \frac{1}{2q_m}\right) . \end{aligned}$$
(51)

Proof

We only give the proof of the first case since the other is similar. For every \(n=0,\ldots ,\max (\frac{q_{m+1}}{4C},q_m)\), there exists \(\theta _n\in [x,y]\) such that \(|f^{(n)}(x)-f^{(n)}(y)|=\Vert x-y\Vert \,|f'^{(n)}(\theta _n)|\). Therefore, using Lemma 4.4, for every \(n=0,\ldots ,(l+1)q_m\), we have

$$\begin{aligned} \left| f^{(n)}(x)-f^{(n)}(y)\right| \leqslant H\Vert x-y\Vert \left( \sum _{i=1}^k-h'^{(n)}(\theta _n-a_i)-h'^{(n)}(a_i-\theta _n)\right) . \end{aligned}$$
(52)

Moreover, by monotonicity of \(h'\), for every \(i=1,\ldots ,k\),

$$\begin{aligned} -h'^{(n)}(\theta _n-a_i)\leqslant -h'^{(n)}(x-a_i)\quad \text { and }\quad -h'^{(n)}(a_i-\theta _n)\leqslant -h'^{(n)}(a_i-y). \end{aligned}$$

Since \(-h'\) is positive, we get that

$$\begin{aligned} -h'^{(n)}(\theta _n-a_i)\leqslant & {} -h'^{((l+1)q_m)}(x-a_i)\quad \text { and }\quad -h'^{(n)}(a_i-\theta _n)\nonumber \\ {}< & {} -h'^{((l+1)q_m)}(a_i-y). \end{aligned}$$
(53)

It follows by Lemma 4.3, (48) and (24) that for every \(u=0,\ldots ,l\)

$$\begin{aligned} \Vert x-y\Vert |h'^{(q_m)}(T^{uq_m}x-a_i)|\leqslant & {} \Vert x-y\Vert \left( q_mh\left( \frac{1}{2q_m}\right) -h'\left( \frac{1}{2q_m}\right) -h'(v_m)\right) \\\leqslant & {} 4\Vert x-y\Vert q_mh\left( \frac{1}{2q_m}\right) . \end{aligned}$$

Hence, summing up over \(u=0,\ldots ,l\), and using the cocycle identity, (52) implies (50).

This finishes the proof. \(\square \)
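The cocycle identity \(f^{(m+n)}(x)=f^{(m)}(x)+f^{(n)}(T^mx)\), used above to sum the estimates over blocks of length \(q_m\), can be checked numerically. The sketch below uses a model roof function with asymmetric logarithmic singularities at \(0\); the specific function, the rotation number and the helper name `birkhoff_sum` are assumptions made only for this illustration.

```python
import math

def birkhoff_sum(g, x, alpha, n):
    """g^{(n)}(x) = sum of g(x + j*alpha mod 1) over j = 0, ..., n-1 (n >= 0)."""
    return sum(g((x + j * alpha) % 1.0) for j in range(n))

def f(t):
    # Assumed model: asymmetric logarithmic singularities at 0 (weights 2 and 1).
    return -2.0 * math.log(t) - math.log(1.0 - t)

alpha = (math.sqrt(5) - 1) / 2     # assumed rotation number
x, m, n = 0.1234, 8, 13

whole = birkhoff_sum(f, x, alpha, m + n)
split = birkhoff_sum(f, x, alpha, m) + birkhoff_sum(f, (x + m * alpha) % 1.0, alpha, n)
print(whole, split)                # the two values agree up to rounding
```

The same splitting applied to \(-h'\), block by block, is what turns the single-block estimate into (50).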

To prove Proposition 4.6 Part b., observe first that if \(s_0\) is sufficiently large, and up to possibly changing \(\kappa \) to \(\kappa ' = \frac{\kappa }{8C}\), one of two possibilities holds: \(\mathbf{1.}\) There exists \(s_0\leqslant m\leqslant s, m \in K_\alpha \), such that \( \kappa n_0 q_s < q_{m} \leqslant 8 C \kappa n_0 q_s\), or \(\mathbf{2.}\) There exist \(s_0\leqslant m\leqslant s\) and \(l \geqslant 1\) such that \( l q_m \leqslant \kappa n_0 q_s < (l+1) q_m \leqslant \frac{q_{m+1}}{8C}\).
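The arithmetic part of this dichotomy, locating the time \(\kappa n_0q_s\) among the denominators, can be sketched as follows. This is a simplified illustration only: the function name `situate` is hypothetical, membership in \(K_\alpha \) is not checked, and the lower bound \(m\geqslant s_0\) is ignored.

```python
def situate(t, qs, C=1):
    """Locate a time t among strictly increasing denominators qs.

    Returns (m, l, fits) where m is the largest index with qs[m] <= t,
    l = floor(t / qs[m]) so that l*qs[m] <= t < (l+1)*qs[m], and
    fits tells whether (l+1)*qs[m] <= qs[m+1] / (8*C), as required in possibility 2.
    Returns None if t is below qs[0] or if qs[m+1] is not available.
    """
    below = [m for m, q in enumerate(qs) if q <= t]
    if not below or below[-1] + 1 >= len(qs):
        return None
    m = below[-1]
    l = int(t // qs[m])
    fits = (l + 1) * qs[m] <= qs[m + 1] / (8 * C)
    return m, l, fits

# Assumed sample data: a constant-type list and a list with a large jump.
print(situate(10.5, [1, 2, 3, 5, 8, 13, 21]))
print(situate(7, [1, 3, 100]))
```

When `fits` is false, the next denominator is within a bounded factor of \(t\), which is the situation covered by the first possibility.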

Case 1. \( \kappa n_0 q_s < q_{m} \leqslant 8 C \kappa n_0 q_s\) with \(s_0\leqslant m\leqslant s, m \in K_\alpha \). Lemma 4.7 implies that either (31) or (32) holds for \(T^{n_0 q_s}x,T^{n_0 q_s}y,m\). Therefore, (48) or (49) holds for \(m\) and \(l=0\). We then apply Lemma 4.10 to \(T^{n_0 q_s}x,T^{n_0 q_s}y,m\) with \(l=0\), and according to whether we have (50) or (51) we will get A. or B. of Proposition 4.6 Part b. Indeed, suppose (50) holds. Then, since \(\kappa n_0 q_s < q_{m}\), for \(n =1, \ldots , [\kappa n_0 q_s]+1\), we have due to (47)

$$\begin{aligned} |f^{(n)}(T^{n_0q_s}x)-f^{(n)}(T^{n_0q_s}y)|&<8kH\Vert x-y\Vert q_mh\left( \frac{1}{2q_m}\right) \\&<16C kH \kappa n_0 q_s \Vert x-y\Vert h\left( \frac{1}{2q_m}\right) \\&<16C kH \kappa \frac{2(d+1)}{d} < \varepsilon . \end{aligned}$$

Case 2. There exist \(s_0\leqslant m\leqslant s\) and \(l \geqslant 1\) such that \( l q_m \leqslant \kappa n_0 q_s <(l+1) q_m \leqslant \frac{q_{m+1}}{8C}\). We will first prove that \(T^{n_0 q_s}x,T^{n_0 q_s}y,m,l\) satisfy the hypothesis of Lemma 4.10. If \(m\in K_\alpha \), then Lemma 4.7 implies that either (31) or (32) holds for \(T^{n_0 q_s}x,T^{n_0 q_s}y,m\). Therefore, since \(l\leqslant \frac{q_{m+1}}{8Cq_m}-1\), either (48) or (49) holds for \(T^{n_0 q_s}x,T^{n_0 q_s}y,m,l\). If \(m\notin K_\alpha \), then we consider two cases:

I. \(m=s\). In this case \(n_0>\frac{1}{\kappa }\) and therefore \(q_{s+1}>32Cq_s\). Since \(l\leqslant [\kappa n_0]+1, n_0\leqslant \frac{q_{s+1}}{8Cq_s}\) and \(\kappa \ll 1\), we get that

$$\begin{aligned}&\bigcup _{j=0}^{(l+1)q_s}T^j[T^{n_0q_s}x,T^{n_0q_s}y]\cap \bigcup _{i=1}^k[-v_s+a_i,a_i+v_s]\\&\qquad \subset \bigcup _{j=0}^{\max (\left[ \frac{q_{s+1}}{4C}\right] ,q_s)}T^j[x,y]\cap \bigcup _{i=1}^k[-v_s+a_i,a_i+v_s]. \end{aligned}$$

Similarly,

$$\begin{aligned}&\bigcup _{j=0}^{(l+1)q_s}T^{-j}[T^{n_0q_s}x,T^{n_0q_s}y]\cap \bigcup _{i=1}^k[-v_s+a_i,a_i+v_s]\\&\qquad \subset \bigcup _{j=0}^{\max ([\frac{q_{s+1}}{4C}],q_s)}T^{-j}[x,y]\cap \bigcup _{i=1}^k[-v_s+a_i,a_i+v_s]. \end{aligned}$$

Note that by Lemma 4.7 \(x,y\) satisfy (31) or (32). Therefore the assumptions of Lemma 4.10 are satisfied for \(T^{n_0q_s}x, T^{n_0q_s}y,s,l\).

II. \(m<s\). Since \(l\leqslant \frac{q_{m+1}}{8Cq_m}-1\), it is enough to show that \(T^{n_0 q_s}x,T^{n_0 q_s}y,m\) satisfy (31) or (32). Due to Lemma 4.7, we just have to check (30) for \(T^{n_0 q_s}x,T^{n_0 q_s}y,m\):

$$\begin{aligned} \bigcup _{j=-q_m}^{q_m-1}T^j[T^{n_0q_s}x,T^{n_0q_s}y]\cap \bigcup _{i=1}^k[-2v_m+a_i,a_i+2v_m]=\emptyset . \end{aligned}$$
(54)

Since \(m\notin K_\alpha \) and \(m<s\), we have

$$\begin{aligned} \frac{1}{q_{s}}\leqslant \frac{1}{q_{m+1}}\ll v_m. \end{aligned}$$
(55)

Moreover, \(\Vert T^{n_0q_s}x-T^{n_0q_s}y\Vert \mathop {<}\limits ^{(26)}\frac{1}{10v_m}\). Therefore, it is enough to show that for \(j\in \{0,\ldots ,q_m-1\}\) we have

$$\begin{aligned} T^{n_0q_s}x+j\alpha \notin \bigcup _{i=1}^k[-3v_m+a_i,a_i+3v_m]. \end{aligned}$$

For this aim, let \(i_0\) and \(r_1\) be such that \(\rho (\{T^{n_0q_s}x+j\alpha \}_{j=0}^{q_m-1},\{a_i\}_{i=1}^k)= \Vert T^{n_0q_s}x+i_0\alpha -a_{r_1}\Vert \). It follows by (3), that for \(i_0\ne j\in \{0,\ldots ,q_{m}-1\}\),

$$\begin{aligned} T^{n_0q_s}x+j\alpha \notin \bigcup _{i=1}^k\left[ -\frac{1}{2Cq_m}+a_i,a_i+\frac{1}{2Cq_m}\right] . \end{aligned}$$
(56)

Next, by the fact that \(m\notin K_\alpha \) and \(x\in B_m (m\geqslant s_0)\), we get that \(\Vert x+i_0\alpha -a_{r_1}\Vert \geqslant 4v_m\), and therefore

$$\begin{aligned} \Vert x+i_0\alpha +n_0q_s\alpha -a_{r_1}\Vert\geqslant & {} \Vert x+i_0\alpha -a_{r_1}\Vert -\Vert n_0q_s\alpha \Vert \geqslant 4 v_m-\frac{n_0}{q_{s+1}}\\ {}\geqslant & {} 4v_m-\frac{1}{8Cq_s}\mathop {\geqslant }\limits ^{(55)} 3 v_m, \end{aligned}$$

(recall that \(n_0\leqslant \frac{q_{s+1}}{8Cq_s}\)) and (54) is thus proved. So in Case 2. at least one of (48) or (49) is satisfied for \(T^{n_0q_s}x, T^{n_0q_s}y,m,l\).

Therefore, we can apply Lemma 4.10 to \(T^{n_0 q_s}x,T^{n_0 q_s}y,m,l\) (recall that \(l q_m \leqslant \kappa n_0 q_s < (l+1) q_m\)). Now, as in Case 1., if (50) holds we get A., and if (51) holds we get B. Indeed, assume WLOG that \(T^{n_0q_s}x, T^{n_0q_s}y,m,l\) satisfy (50) (the proof in the other case is analogous). Using (47) and the fact that \([\kappa n_0 q_s]+1\leqslant (l+1)q_m\leqslant 2\kappa n_0q_s\), we get for \(n =1, \ldots , (l+1)q_m\)

$$\begin{aligned} |f^{(n)}(T^{n_0q_s}x)-f^{(n)}(T^{n_0q_s}y)|&<8kH\Vert x-y\Vert (l+1)q_mh\left( \frac{1}{2q_m}\right) \\&<16 kH \kappa n_0 q_s \Vert x-y\Vert h\left( \frac{1}{2q_s}\right) \\&<16 kH \kappa \frac{2(d+1)}{d} < \varepsilon . \end{aligned}$$

So A. in Proposition 4.6 Part b. holds.

The proof of Proposition 4.6 is thus completed and Theorem 2 follows. \(\square \)

4.2 Proof of Theorem 4

4.2.1 Outline of the proof

The general scheme of the proof is similar to the scheme of the proof of Theorem 2 (see the outline of the proof of the latter theorem in Subsection 4.1). Assume for simplicity that \(f\) has just one right-sided power singularity at \(0\) of type \(x^{-\gamma }\). In this outline we will actually see that the constant type condition is an if and only if condition in the proof of Theorem 4 that we give. Indeed, the following facts are easy to check for an interval \(I=[x,y]\) such that \(y-x \in [1/q_{n+1}^{\gamma +1},1/q_{n}^{\gamma +1}]\).

  1. a.

    If for some \(c\in (0,1), R^i_\alpha I\) is disjoint from \([-c/q_n,c/q_n]\) for every \(i=0,\ldots ,l \leqslant q_n\) then \(f'^{(l)}(\theta )\leqslant C q_n^{1+\gamma }\) for any \(\theta \in I\) for some \(C\) that depends on \(c\) (with a similar statement for negative iterates).

  2. b.

    If for some \(c'\in (0,1), R^{i_0}_\alpha I\) is included in \([-c'/q_n,0)\) for some \(i_0\geqslant 0\) then \(f'^{(i_0+1)}(\theta )-f'^{(i_0)}(\theta )\geqslant C' q_n^{1+\gamma }\) for any \(\theta \in I\) for some \(C'\) that depends on \(c'\) (with a similar statement for negative iterates).

  3. c.

    If \(\alpha \) is of constant type then there exists \(c'>c>0\) such that one of the following holds if \(n\) is sufficiently large: 1. there exists \(i_0\geqslant 0\) such that \(R^{i_0}_\alpha I\) is included in \([-c'/q_n,0)\) and \(R^i_\alpha I\) is disjoint from \([-c/q_n,c/q_n]\) for every \(i=0,\ldots ,i_0\) or 2. there exists \(i_0<0\) such that \(R^{i_0}_\alpha I\) is included in \([-c'/q_n,0)\) and \(R^i_\alpha I\) is disjoint from \([-c/q_n,c/q_n]\) for every \(i=-1,\ldots ,-i_0\).

  4. d.

    If \(\alpha \) is not of constant type and if \(q_{n+1}\gg q_n\) and \(y-x=\varepsilon _n/q_n^{1+\gamma }\) while \(|x-p/q_n|\leqslant \varepsilon _n^2/q_n\) with \(\varepsilon _n\rightarrow 0\), then as long as for \(i \in [0,l]\) (or for \(i \in [-l,-1]\)) \(R^i_\alpha I\) is disjoint from \([-1/(2q_n),1/(2q_n)]\) we have that \(f'^{(l)}(\theta )\leqslant C q_n^{1+\gamma }\) for any \(\theta \in I\), while if \(l\) is the first integer such that \(R^l_\alpha I\) intersects \([-1/(2q_n),1/(2q_n)]\) then \(f'^{(l)}(\theta )\geqslant q_n^{1+\gamma }/\varepsilon _n \).

Now, if \(\alpha \) is of constant type, and if we assume that c.1 holds, then a. and b. imply that either \(f'^{(i_0)}(\theta ) \in [ C' q_n^{1+\gamma },C q_n^{1+\gamma }]\) for every \(\theta \in I\) or \(f'^{(i_0+1)}(\theta ) \in [ C' q_n^{1+\gamma },C q_n^{1+\gamma }]\) for every \(\theta \in I\). Since \(q_{n+1}/q_n\) is bounded this implies a controlled macroscopic drift between the orbit of \(x\) and \(y\) (this is the content of Proposition 4.12 Part a). As in the proof of Theorem 2, we then need to use the same type of arguments to show that the drift remains almost constant during a small additional proportion of time (we do this in the future of \(i_0+1\) or in the past of \(i_0\), and this is the content of Proposition 4.12 Part b and relies on Sublemma 4.16). The case c.2 is treated similarly.

In Sublemma 4.14 we essentially prove a. and in Sublemma 4.15 we essentially prove b.

We now explain why the constant type condition is necessary in our proof. Indeed, if \(\alpha \) is not of constant type, d. gives an example of pairs \(x,y\) for which the drift between the forward orbits jumps from \(\varepsilon _n\) to \(1/\varepsilon _n\) and the same happens for backward orbits, which contradicts the SWR property for this pair. Furthermore, if \(\varepsilon _n\) is taken to converge very slowly to \(0\), such pairs can be produced with \(x \in Z\) for any \(Z\) having positive measure. Observe that this does not imply that the SWR property would not reappear much later in time, but this is very unlikely, as demonstrated for the absence of the WR property in the particular case of Theorem 1.

Observe finally that the same type of pairs \((x,y)\) described in d. show that it is necessary to use the SWR property instead of the WR property. Indeed, only one of the alternatives c.1 or c.2 holds for such pairs and we are obliged, if we want to see a controlled drift, to iterate in the future or in the past.

4.2.2 Notations and standing assumptions

Recall that in the proof of Theorem 4 \(T\) means \(R_\alpha \). We may assume WLOG that \(A_k^2+B_k^2>0\). Let \(C_k=\max (A_k,-B_k)>0\). Recall \(H>0\) coming from Lemma 4.4, \(D_1,D_2>0\), the constants in the hypothesis (4) in Theorem 4 and define

$$\begin{aligned} P:=\left[ -12Hk (D_2+2),-\frac{C_kD_1^2}{16c}\right] \cup \left[ \frac{C_kD_1^2}{16c},12Hk (D_2+2)\right] \end{aligned}$$

where \(c\) is such that for every \(s\in \mathbb {N}, q_{s+1}\leqslant cq_s\).

Fix \(\varepsilon \ll 1\) and \(N\in \mathbb {N}\). We will assume that \(\epsilon <\frac{C_kD_1^2}{8c}\). Let \(\kappa :=\kappa (\epsilon )=\frac{\epsilon }{2(3D_2+2)kCH}\).

Let \(s_0\in \mathbb {N}\) be such that \(q_{s_0-4}\geqslant \frac{1}{\kappa }N\), and \(h(\frac{1}{2q_s})>6C\) for \( s\geqslant s_0\), and for every \(i=1,\ldots ,k\)

$$\begin{aligned}&\left| \frac{f'(x)}{h'(x-a_i)}\right| >\frac{A_i}{2}\;\text {for}\; x\in \left[ a_i,a_i+\frac{1}{q_{s_0-4}}\right] \;\;\text {and}\;\; \left| \frac{f'(x)}{h'(a_i-x)}\right| >\frac{B_i}{2},\nonumber \\&\quad \;\text {for}\; x\in \left[ -\frac{1}{q_{s_0-4}}+a_i,a_i\right] . \end{aligned}$$
(57)

Define \(\delta :=\frac{1}{q_{s_0}h(\frac{1}{2q_{s_0}})}\). We will show that SWR-property holds for all pairs of points \(x,y\in \mathbb {T}\) with \(\Vert x-y\Vert <\delta \).

4.2.3 Controlling the drift

The following proposition implies Theorem 4 due to Proposition 3.3.

Proposition 4.11

Consider \(x,y\in \mathbb {T}\) with \(0<\Vert x-y\Vert <\delta \). Then there exists \(p \in P, M,L\geqslant \kappa M \geqslant N\) such that either (16) or (17) holds for \(n \in [M,M+L]\).

We can assume WLOG that \(x<y\). Let \(s:=s(x,y)\) be unique such that

$$\begin{aligned} \frac{1}{q_{s+1}h\left( \frac{1}{2q_{s+1}}\right) }\leqslant \Vert x-y\Vert <\frac{1}{q_sh\left( \frac{1}{2q_s}\right) }. \end{aligned}$$
(58)

As in the preceding section, Proposition 4.11 follows from

Proposition 4.12

Consider \(x,y\in \mathbb {T}\) as in (58).

Part a. There exists \(i_0 \in \{0,\ldots ,q_{s-2}-1\}\), such that

$$\begin{aligned} \left| f^{(i_0)}(x)-f^{(i_0)}(y)\right| \in P \end{aligned}$$
(59)

or

$$\begin{aligned} \left| f^{(-i_0)}(x)-f^{(-i_0)}(y)\right| \in P. \end{aligned}$$
(60)

Part b. Let \(X=T^{i_0}x\) and \(Y=T^{i_0}y\) if (59) holds, and \(X=T^{-i_0-1}x\) and \(Y=T^{-i_0-1}y\) if (60) holds. Then for \(n=1,\ldots ,[\kappa i_0]+1\) the following holds:

$$\begin{aligned} \text {A.}\left| f^{(n)}(X)-f^{(n)}(Y)\right| <\epsilon \;\;\;\; \text {or} \;\;\;\;\text {B.}\;\left| f^{(-n)}(X)-f^{(-n)}(Y)\right| <\epsilon . \end{aligned}$$
(61)

The rest of Sect. 4.2 is devoted to the proof of Proposition 4.12.

Consider the orbit \(x-q_{s-2}\alpha ,\ldots ,x,\ldots ,x+(q_{s-2}-1)\alpha \) (the length of this orbit is smaller than \(q_s\)). It follows by (3) that there exists at most one \(t_s\in [-q_{s-2},q_{s-2}+1]\) such that \(x+t_s\alpha \in \bigcup _{i=1}^k[-\frac{1}{2Cq_{s}}+a_i,a_i+\frac{1}{2Cq_{s}}]\). Hence at least one of the following two holds:

$$\begin{aligned} \bigcup _{j=0}^{q_{s-2}-1}T^j[x,y]\cap \bigcup _{i=1}^k\left[ -\frac{1}{2Cq_{s}}+a_i,a_i+\frac{1}{2Cq_{s}}\right] =\emptyset \end{aligned}$$
(62)

or

$$\begin{aligned} \bigcup _{j=1}^{q_{s-2}}T^{-j}[x,y]\cap \bigcup _{i=1}^k\left[ -\frac{1}{2Cq_{s}}+a_i,a_i+\frac{1}{2Cq_{s}}\right] =\emptyset . \end{aligned}$$
(63)

The following lemma directly implies Proposition 4.12.

Lemma 4.13

If (62) holds, then (59) and (61) hold. If (63) holds, then (60) and (61) hold.

4.2.4 Proof of Lemma 4.13

We will suppose (62) holds, the proof of the other case being analogous. We will need some lemmas.

Sublemma 4.14

For \(n=0,\ldots ,q_{s-2}-1\),

$$\begin{aligned} |f^{(n)}(x)-f^{(n)}(y)| \leqslant 2kH(3D_2+2). \end{aligned}$$

Proof

By (58) we have, for every \(i=1,\ldots ,k\), \(a_i\notin [x+j\alpha ,y+j\alpha ]\) with \(j\in \{0,\ldots ,q_{s-2}-1\}\). It follows that for \(n=0,\ldots ,q_{s-2}-1\), \(\left| f^{(n)}(x)-f^{(n)}(y)\right| = \left| f'^{(n)}(\theta _n)\right| \Vert x-y\Vert \) for some \(\theta _n\in [x,y]\). Hence, using Lemma 4.4, for every \(n=0,\ldots ,q_{s-2}\) we have

$$\begin{aligned} \left| f^{(n)}(x)-f^{(n)}(y)\right| \leqslant H\Vert x-y\Vert \left( \sum _{i=1}^k(-h'^{(n)}(\theta _n-a_i)-h'^{(n)}(a_i-\theta _n))\right) . \end{aligned}$$
(64)

By the monotonicity of \(h'\) on \((0,1)\) we obtain \(-h'^{(n)}(\theta _n-a_i)\leqslant -h'^{(n)}(x-a_i), -h'^{(n)}(a_i-\theta _n)\leqslant -h'^{(n)}(a_i-y)\). Since \(-h'>0\),

$$\begin{aligned} -h'^{(n)}(x-a_i)-h'^{(n)}(a_i-y)\leqslant -h'^{(q_{s-2})}(x-a_i)-h'^{(q_{s-2})}(a_i-y).\quad \end{aligned}$$
(65)

Using Lemma 4.3 (applied to \(x-a_i\), where \(j_i\in [0,q_{s-2}-1]\) is unique such that \(x+j_i\alpha \in [a_i,a_i+\frac{1}{2q_{s-2}}]\)), we obtain

$$\begin{aligned}&\Vert x-y\Vert \left( -h'^{(q_{s-2})}(x-a_i)\right) \\&\quad \leqslant \Vert x-y\Vert \left( q_{s-2}h\left( \frac{1}{2q_{s-2}}\right) -2h'\left( \frac{1}{2q_{s-2}}\right) -h'(x+j_i\alpha )\right) . \end{aligned}$$

By (62), it follows that for \(n=0,\ldots ,q_{s-2}-1\) we have \(x+n\alpha ,y+n\alpha \notin \bigcup _{i=1}^k[-\frac{1}{2Cq_s}+a_i,a_i+\frac{1}{2Cq_s}]\).

By monotonicity of \(h'\), \(-h'(x+j_i\alpha )<-h'(\frac{1}{2Cq_s})\), and therefore by (4) and (58)

$$\begin{aligned}&\Vert x-y\Vert \left( -h'^{(q_{s-2})}(x-a_i)\right) \leqslant \Vert x-y\Vert \nonumber \\&\quad \times \left( q_{s-2}h\left( \frac{1}{2q_{s-2}}\right) -2h'\left( \frac{1}{2q_{s-2}}\right) -h'\left( \frac{1}{2Cq_s}\right) \right) \nonumber \\&\quad \leqslant \frac{q_{s-2}h\left( \frac{1}{2q_{s-2}}\right) +2D_2q_{s-2}h\left( \frac{1}{2q_{s-2}}\right) +D_2q_sh\left( \frac{1}{2q_s}\right) }{q_sh\left( \frac{1}{2q_s}\right) }\nonumber \\&\quad \leqslant \frac{3D_2q_sh\left( \frac{1}{2q_s}\right) +q_{s-2}h\left( \frac{1}{2q_{s-2}}\right) }{q_sh\left( \frac{1}{2q_s}\right) }\leqslant 3D_2+1. \end{aligned}$$
(66)

Similarly we obtain \(\Vert x-y\Vert \left( -h'^{(q_{s-2})}(a_i-y)\right) <3D_2+1\).

Therefore using (64) and the computations above, for \(n=0,\ldots ,q_{s-2}-1\),

$$\begin{aligned} |f^{(n)}(x)-f^{(n)}(y)|<2kH(3D_2+2). \end{aligned}$$

\(\square \)
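For the reader's convenience, the last step of the proof of Sublemma 4.14 can be written out as a single chain. This is only a worked version of "the computations above", combining (64), (65), (66) and the analogous bound for \(a_i-y\); no new estimate is used.

$$\begin{aligned} |f^{(n)}(x)-f^{(n)}(y)|&\leqslant H\sum _{i=1}^k\Vert x-y\Vert \left( -h'^{(q_{s-2})}(x-a_i)-h'^{(q_{s-2})}(a_i-y)\right) \\&\leqslant H\sum _{i=1}^k\left( (3D_2+1)+(3D_2+1)\right) =2kH(3D_2+1)<2kH(3D_2+2). \end{aligned}$$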

Sublemma 4.15

There exists \(i_0 \in \{0,\ldots ,q_{s-2}-1\}\), such that \(|f^{(i_0)}(x)-f^{(i_0)}(y)| \geqslant \frac{A_kD_1^2}{4c}\).

Proof

Since \(q_{s-2}-q_{s-4}>q_{s-4}+1\), there exists \(i_0\in [q_{s-4},q_{s-2}-2]\) such that

$$\begin{aligned} T^{i_0}x\in \left[ a_{k},a_k+\frac{1}{q_{s-4}}\right] . \end{aligned}$$
(67)

Indeed

$$\begin{aligned} \{T^kx\}_{k=q_{s-4}}^{q_{s-2}-2}= T^{q_{s-4}}x+\{T^k0\}_{k=0}^{q_{s-2}-q_{s-4}-2}\supset T^{q_{s-4}}x+\{T^k0\}_{k=0}^{q_{s-4}-1}, \end{aligned}$$

and \(\{T^k0\}_{k=0}^{q_{s-4}-1}\) is at least \(\frac{1}{q_{s-4}}\)-dense. We have assumed that \(A_k^2+B_k^2>0\). Suppose additionally \(A_k\geqslant -B_k\) (if \(A_k\leqslant -B_k\) then we replace \([a_{k},a_k+\frac{1}{q_{s-4}}]\) by \([-\frac{1}{q_{s-4}}+a_k,a_k]\)). We claim that

$$\begin{aligned} \left| \left( f^{(i_0+1)}(x)-f^{(i_0+1)}(y)\right) -\left( f^{(i_0)}(x)-f^{(i_0)}(y)\right) \right| > \frac{A_kD_1^2}{2c}. \end{aligned}$$

Indeed, the LHS of this inequality is equal to \(|f(x+i_0\alpha )-f(y+i_0\alpha )|=|f'(\theta _{i_0})|\Vert x-y\Vert \), for some \(\theta _{i_0}\in [x+i_0\alpha ,y+i_0\alpha ]\). Now, by (58), \(\theta _{i_0}\in [a_k,a_k+\frac{1}{q_{s-4}}+\frac{1}{q_sh(\frac{1}{2q_s})}]\subset [a_k,a_k+\frac{2}{q_{s-4}}]\). By (57), monotonicity of \(h'\), (4) twice (for \(s\) and \(s+1\)) and (58)

$$\begin{aligned} |f'(\theta _{i_0})|\geqslant & {} \frac{A_k}{2} \left| h'\left( \theta _{i_0}-a_k\right) \right| \geqslant \frac{A_k}{2} \left| h'\left( \frac{2}{q_{s-4}}\right) \right| \geqslant \frac{A_k}{2} \left| h'\left( 2c^4\frac{1}{q_s}\right) \right| \nonumber \\\geqslant & {} \frac{A_k}{2}D_1q_sh\left( \frac{1}{2q_s}\right) \geqslant \frac{A_k}{2}D_1^2 \frac{q_{s+1}}{c}h\left( \frac{1}{2q_{s+1}}\right) \geqslant \frac{A_kD_1^2}{2c}\frac{1}{\Vert x-y\Vert };\nonumber \\ \end{aligned}$$
(68)

and the claim follows. Therefore, one of the numbers \(|f^{(i_0+1)}(x)-f^{(i_0+1)}(y)|\) or \(|f^{(i_0)}(x)-f^{(i_0)}(y)|\) is at least \(\frac{A_kD_1^2}{4c}\). \(\square \)
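The final step in the proof of Sublemma 4.15 is the triangle inequality. Writing \(a:=f^{(i_0+1)}(x)-f^{(i_0+1)}(y)\) and \(b:=f^{(i_0)}(x)-f^{(i_0)}(y)\) (shorthand introduced only for this remark), the claim gives

$$\begin{aligned} \frac{A_kD_1^2}{2c}<|a-b|\leqslant |a|+|b|\leqslant 2\max (|a|,|b|), \end{aligned}$$

so that \(\max (|a|,|b|)>\frac{A_kD_1^2}{4c}\).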

As a consequence of the above sublemmas, we obtain that at least one of the numbers \(f^{(i_0+1)}(x)-f^{(i_0+1)}(y)\) or \(f^{(i_0)}(x)-f^{(i_0)}(y)\) belongs to the set \(P\), and (59) is proved. The next result will give the proof of (61).
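Explicitly, let \(j\in \{i_0,i_0+1\}\) (notation introduced here) be the index produced in the proof of Sublemma 4.15. Since \(i_0\leqslant q_{s-2}-2\), we have \(j\leqslant q_{s-2}-1\), so Sublemma 4.14 applies to \(j\). In the case \(A_k\geqslant -B_k\), where \(C_k=A_k\), the two sublemmas combine to

$$\begin{aligned} \frac{C_kD_1^2}{16c}\leqslant \frac{A_kD_1^2}{4c}\leqslant \left| f^{(j)}(x)-f^{(j)}(y)\right| \leqslant 2kH(3D_2+2)=6kHD_2+4kH\leqslant 12Hk(D_2+2). \end{aligned}$$

Because \(P\) is symmetric about \(0\), the difference \(f^{(j)}(x)-f^{(j)}(y)\) lies in \(P\) whatever its sign.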

Sublemma 4.16

The following hold:

$$\begin{aligned}&\left| f^{(n)}(T^{i_0+1}x)-f^{(n)}(T^{i_0+1}y)\right| <\epsilon \;\;\quad \text {for all}\;0\leqslant n\leqslant \kappa (i_0+1),\end{aligned}$$
(69)
$$\begin{aligned}&\left| f^{(-n)}(T^{i_0}x)-f^{(-n)}(T^{i_0}y)\right| <\epsilon \;\; \quad \text {for all}\; 0\leqslant n\leqslant \kappa (i_0+1). \end{aligned}$$
(70)

Proof

First we show (69). Select (the unique) \(m\in \mathbb {N}\) such that \(q_m\geqslant \kappa (i_0+1)\geqslant q_{m-1}\) (note that \(q_m\ll q_s\)). By (3) applied to \(T^{i_0}(x)\), (67) and the fact that \(q_m\ll q_s\) it follows that

$$\begin{aligned} \{T^{i_0}x,\ldots ,T^{i_0}x+(q_m-1)\alpha \}\cap \bigcup _{i=1}^k\left[ -\frac{1}{2Cq_m}+a_i,a_i+\frac{1}{2Cq_m}\right] =\{T^{i_0}x\}. \end{aligned}$$
(71)

Analogously, by (3) applied to \(T^{i_0}(x)-(q_m-1)\alpha \), we get

$$\begin{aligned} \{T^{i_0}x-(q_m-1)\alpha ,\ldots ,T^{i_0}x\}\cap \bigcup _{i=1}^k\left[ -\frac{1}{2Cq_m}+a_i,a_i+\frac{1}{2Cq_m}\right] =\{T^{i_0}x\}. \end{aligned}$$
(72)

By (71) and using the same arguments that precede (64) we obtain (cf. (65)) for \(n=0,\ldots ,\kappa (i_0+1)\)

$$\begin{aligned}&\left| f^{(n)}(T^{i_0+1}x)-f^{(n)}(T^{i_0+1}y)\right| \leqslant H\Vert x-y\Vert \nonumber \\&\quad \times \left( \sum _{i=1}^k-h'^{(q_m)}(T^{i_0+1}x-a_i)-h'^{(q_m)}\left( a_i-T^{i_0+1}y\right) \right) . \end{aligned}$$
(73)

Then for \(i\in \{1,\ldots ,k\}\), again by repeating the arguments that led to (66), we obtain

$$\begin{aligned} \Vert x-y\Vert \left( -h'^{(q_{m})}(T^{i_0+1}x-a_i)\right)\leqslant & {} \frac{q_{m}h\left( \frac{1}{2q_{m}}\right) +3D_2q_{m}h\left( \frac{1}{2q_{m}}\right) }{q_sh\left( \frac{1}{2q_s}\right) }\\\leqslant & {} \frac{(3D_2+1)q_mh\left( \frac{1}{2q_m}\right) }{q_sh\left( \frac{1}{2q_s}\right) }. \end{aligned}$$

But \(q_m\leqslant c\kappa (i_0+1)<c\kappa q_{s-2}\), thus (by the monotonicity of \(h\)) \(\Vert x-y\Vert \left( -h'^{(q_{m})}(T^{i_0+1}x-a_i)\right) \leqslant (3D_2+1)\frac{c\kappa q_{s-2}}{q_s}=\frac{\epsilon }{4Hk}\), by the definition of \(\kappa \). Similarly, we obtain \(\Vert x-y\Vert \left( -h'^{(q_{m})}(a_i-T^{i_0+1}y)\right) <\frac{\epsilon }{4Hk}\).

Using this and (73) we get

$$\begin{aligned} |f^{(n)}(T^{i_0+1}x)-f^{(n)}(T^{i_0+1}y)|<\epsilon , \end{aligned}$$

which yields (69). To handle (70) we use (72) and proceed as before to obtain first \(|f^{(-n)}(T^{i_0}x)-f^{(-n)}(T^{i_0}y)|=\Vert x-y\Vert \left| f'^{(n)}(\theta _n)\right| \) with \(\theta _n\in [T^{i_0}x-n\alpha ,T^{i_0}y-n\alpha ]\), and then estimate this from above by

$$\begin{aligned}&H\Vert x-y\Vert \left( \sum _{i=1}^k-h'^{(q_m)}(T^{i_0}x-(q_m-1)\alpha -a_i)\right. \\&\left. \quad -h'^{(q_m)}(a_i-(T^{i_0}y-(q_m-1)\alpha ))\right) . \end{aligned}$$

We conclude exactly in the same way as in the first case. \(\square \)
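The arithmetic behind the conclusion of (69) can be spelled out as follows. The first line uses only \(q_{m-1}\leqslant \kappa (i_0+1)\), \(q_m\leqslant cq_{m-1}\) and \(i_0+1<q_{s-2}\); the second uses (73) together with the two bounds by \(\frac{\epsilon }{4Hk}\) obtained in the proof.

$$\begin{aligned} q_m&\leqslant cq_{m-1}\leqslant c\kappa (i_0+1)<c\kappa q_{s-2},\\ \left| f^{(n)}(T^{i_0+1}x)-f^{(n)}(T^{i_0+1}y)\right|&\leqslant H\sum _{i=1}^k\left( \frac{\epsilon }{4Hk}+\frac{\epsilon }{4Hk}\right) =\frac{\epsilon }{2}<\epsilon . \end{aligned}$$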

We proceed to the proof of Lemma 4.13 in the case where (62) is satisfied. If \(f^{(i_0+1)}(x)-f^{(i_0+1)}(y)\in P\), then (69) gives A. in (61); if \(f^{(i_0)}(x)-f^{(i_0)}(y)\in P\), then (70) gives B. in (61). The proof of Lemma 4.13 is thus completed since the case where (63) is satisfied is analogous. \(\square \)

This finishes the proof of Proposition 4.12, thus of Theorem 4. \(\square \)