Multiple mixing for a class of conservative surface flows

Arnold and Kochergin mixing conservative flows on surfaces stand as the main and almost only natural class of mixing transformations for which higher order mixing has not been established, nor disproved. Under suitable arithmetic conditions on their unique rotation vector, of full Lebesgue measure in the first case and of full Hausdorff dimension in the second, we show that these flows are mixing of any order. For this, we show that they display a generalization of the so called Ratner property on slow divergence of nearby orbits, that implies strong restrictions on their joinings, which in turn yields higher order mixing. This is the first case in which the Ratner property is used to prove multiple mixing outside its original context of horocycle flows and we expect our approach will have further applications.


Introduction
A major open problem in ergodic theory is Rokhlin's question on whether mixing implies mixing of all orders, also called multiple mixing [26]. In most of the known examples of mixing dynamical systems, multiple mixing is now known to hold. Moreover, a positive answer to Rokhlin's question is actually known to generally hold within various classes of mixing dynamical systems. The most noteworthy are K-systems where multiple mixing always holds [4], horocycle flows [23], mixing systems with singular spectrum that display multiple mixing by a celebrated theorem of Host [13], and finite rank systems since Kalikow showed that rank one and mixing implies multiple mixing [14], a result that was extended to finite rank mixing systems by Ryzhikov [30].
In the second half of the last century, it was discovered that mixing is possible for a class of smooth area preserving flows on surfaces called multivalued Hamiltonians since they are locally given by Hamilton's equations. The Hamiltonian H in question is associated to a closed differential 1-form ω = dH, the form ω being globally defined, but of course this is not necessarily the case for H. The possibility of mixing for these flows was studied in two different cases: first, by Kochergin who obtained mixing when ω has higher order zeros and thus the flow has degenerate saddles [17], and then by Arnold [1] who suggested that mixing is possible on a minimal component even in the case where ω is Morse but the saddles on the minimal component appear in asymmetric configurations, for example because of a saddle loop. That mixing was indeed possible in the latter context was proved in a particular case by Khanin and Sinai [32]. Absence of mixing in the case of Morse forms was obtained by Kochergin in some particular cases [18] (see also [22]), and later generalized to a typical one form by Ulcigrai [35].
Considering a 1-dimensional section of a multi-valued Hamiltonian flow allows one to view the dynamics on its minimal components as special flows above interval exchange transformations (IET), which in particular situations can be circular rotations. The case of non-degenerate saddles corresponds to ceiling functions with logarithmic singularities, while the case of degenerate saddle points corresponds to ceiling functions with stronger singularities such as integrable power like singularities. In the case of power singularities, [17] proved mixing above any ergodic IET, while [32] established mixing in the case of a single asymmetric logarithmic singularity above a circular rotation with a typical frequency.
The study of the mixing properties of surface flows has seen a revival of interest since the beginning of the 2000s, with results such as the computation of the speed of mixing [6], extensions of the Khanin-Sinai mixing result to include all irrational translation vectors [19,20] (see also [21]), and advances in the study of multi-valued Hamiltonian flows on surfaces in the general case where the Poincaré section return map is an IET and not just a circular rotation [3,35-37].
Mixing surface flows stand today as the main and almost only natural class of mixing transformations for which higher order mixing has not been established, nor disproved. Our aim here is to prove mixing of all orders for a subclass of these systems given by special flows above circular rotations, with ceiling functions having asymmetric logarithmic singularities or integrable power like singularities. For convenience, we will speak of Arnold flows in the first case, and Kochergin flows in the second. Our results will depend on the arithmetics of the frequency α of the rotation on the base, that determines the slope of the unique rotation vector of the corresponding surface flow.
Loosely speaking our main result is as follows (it will be made precise at the end of this introduction, see Corollaries 1.6 and 1.9). Similar mixing mechanisms due to orbit shear as in Kochergin and Arnold flows were observed relatively recently such as in [2,3,7,35] and it should be possible to apply the techniques of the current paper to the study of higher order mixing for such parabolic systems.
To explain our approach we first need to make a detour by Ratner's study of horocycle flows. In the 1980s, Ratner developed a rich machinery to study horocycle flows [27-29] and, in particular, singled out a special way of controlled slow divergence of orbits of nearby points which resulted in the notion of $H_p$-property, later called R-property by Thouvenot [33]. This property, to which we will come back in more detail in the sequel, has important dynamical consequences, mainly expressed by a restriction on the possible joining measures of a system having the R-property with other systems, and in particular with itself.
A joining between two dynamical systems (T, X, B, μ) and (S, Y, C, ν), (X, B, μ) and (Y, C, ν) being standard Borel probability spaces, is a measure ρ on X × Y invariant by T × S whose marginals on X and Y are μ and ν. The definition for flows is similar. An important notion in Ratner's theory is that of finite extension joinings (FEJ).
Definition 1.1 An ergodic flow $(T_t)_{t\in\mathbb{R}}$ is said to have the FEJ-property, an acronym for finite extension joining, if for every ergodic flow $(S_t)_{t\in\mathbb{R}}$ acting on (Y, C, ν) and every ergodic joining ρ of $(T_t)_{t\in\mathbb{R}}$ and $(S_t)_{t\in\mathbb{R}}$ different from the product measure μ × ν, ρ yields a flow which is a finite extension of $(S_t)_{t\in\mathbb{R}}$.
It was shown in [31] that a mixing flow with FEJ-property is mixing of all orders. Moreover, it was proved in [27] that a flow with R-property has the FEJ-property. It follows that mixing flows with the R-property are mixing of all orders. Since the R-property for horocycle flows stemmed from polynomial shear along the orbits, and since Kochergin flows displayed a similar polynomial shear along the orbits, the idea that special flows over rotations may enjoy the R-property, and thus be multiple mixing, was then suggested by J-P. Thouvenot in the 1990s (see p. 2 in [9]).
However, whether natural classes of special flows (not necessarily mixing) over irrational rotations may have the R-property remained open until Fraczek and Lemańczyk [9,10] showed that a generalized R-property holds in some classes of special flows with roof functions of bounded variation (which, by [16], are not mixing). More precisely, they have introduced a weaker notion than the R-property, called weak Ratner or WR-property that however still implies the FEJ-property (see Definition 2.1 and the comment after it) .
Unfortunately, in the mixing examples of special flows under piecewise convex functions with singularities such as Arnold and Kochergin flows, the shear may occur very abruptly as orbits approach the singularity and this may prevent them from having the weak Ratner property. Indeed, we believe that these flows do not have the WR-property (it was observed by the first author that Arnold and Kochergin flows do not have a natural strong Ratner property as described in the next paragraph). This is corroborated by the following result that shows that Kochergin flows, in the context of bounded type frequency in the base (that is a priori favorable to controlling the shear), do not have the WR-property.
Theorem 1 Let α ∈ R be irrational of bounded type and $f(x) = x^{\gamma} + r$, −1 < γ < 0, r > 0. Then the special flow $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ defined above the circle rotation $R_\alpha$ and under the ceiling function f does not have the WR-property.
Here the circle is T = R/Z. We recall that an irrational α is said to be of bounded type, or α ∈ DC(0), if the partial quotients in the continued fraction of α are bounded, i.e. if $\alpha = [a_0; a_1, a_2, \ldots]$ and there exists K > 0 such that $a_i < K$ for all $i \geq 1$. We refer to Sect. 3 for the exact definition of special flows.
Theorem 1 has another consequence. It is known that every horocycle flow $(h_t)_{t\in\mathbb{R}}$ is loosely Bernoulli [29]; therefore, for every irrational α, there exists a positive function $f \in L^1(\mathbb{T})$ such that $(h_t)_{t\in\mathbb{R}}$ is measurably isomorphic to the special flow $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ [25]. It follows from [16] and the fact that $(h_t)_{t\in\mathbb{R}}$ is mixing that f is of unbounded variation. Moreover, by [24], f can be made $C^1$ except for one point. Since the R-property implies the WR-property and the R-property is an isomorphism invariant, no special flow as in Theorem 1 is isomorphic to a horocycle flow. Actually, this line of thought can be extended to show that horocycle flows are never isomorphic to special flows above an irrational rotation and under a roof function that is convex and $C^2$ except at one point. For the latter result, one needs to introduce the concept of strong Ratner property, which is also an isomorphism invariant, that specifies the occurrence of slow divergence of nearby orbits to the first time when the orbits do split apart. This is the natural property that Ratner actually obtains for horocycle flows, and it is relatively easy to show that the Kochergin flows do not have it. What is more complicated in the proof of the absence of the general R-property for special flows is to make sure that the slow divergence does not occur much later in the future, after the nearby points have split and then come back together (see the proof of Theorem 1).
To bypass Theorem 1 and still use controlled divergence of orbits to show multiple mixing, our approach will be to further weaken the WR-property. Namely, we introduce the SWR-property, which stands for switchable weak Ratner property, that assumes that a pair of nearby points displays the WR-Property either under forward iteration in time or under backward iteration, and this depending on the pair of points. We show that the SWR-property is sufficient to guarantee the same FEJ consequences as the Ratner or the weak Ratner property. Consequently, a mixing flow enjoying the SWR-property is mixing of all orders.
The main idea in showing that Arnold and Kochergin special flows may have the SWR-property is the following. For these flows, the main contribution to the shear between orbits is due to the visits of the flow lines to the neighborhood of the singularities. With the representation of these flows as special flows above irrational rotations, the shear is translated into the divergence between the Birkhoff sums of the roof functions for nearby points, and this divergence is mainly due to the visits under the base rotation to the neighborhoods of the points where the roof function has its singularities. If the base rotation angle α is of bounded type, two nearby points will accumulate sufficient stretch while staying sufficiently far from the singularity either when they are iterated forward or when they are iterated backward. In the case of ceiling functions with only logarithmic singularities we are also able to exploit the progressive contribution to the shear of these visits to the singularities to obtain multiple mixing for a full measure set of numbers α.
We now introduce some notations related to the ceiling functions that will be considered and their singularities, after which we will be able to state our exact results on the SWR-property and multiple mixing.
Notice that in this definition h may only reflect a domination on the singularities of f since the coefficients $A_i$, $B_i$ may be equal to zero at some or at all i's. Definition 1.2 will allow us to state our results with some flexibility on the singularities but the reader should keep in mind that the results will target functions that are essentially of the form $A \ln(a - x)$ or $A(a - x)^{\gamma}$, γ ∈ (−1, 0), when x is in a left neighborhood of a singularity a, and of a similar form on the right side of a singularity. In the case of Arnold flows all the singularities will be assumed to be essentially of logarithmic type (see the more general definition in Sect. 1.1) while in the case of Kochergin flows we will be dealing with functions having at least one power like singularity while the other singularities are supposed to be of equal or weaker type (see Sect. 1.2).
Furthermore, our results can deal with functions having several singularities but require a non resonance condition between these singularities and the rotation frequency α in the base of the special flow, that we now state.
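Our standing assumption is that α ∈ R \ Q. We then let $(q_s)$ be the sequence of denominators of the best rational approximations of α. Namely, $(q_s)$ is the unique increasing sequence such that $q_0 = 1$ and $\|q_s\alpha\| < \|k\alpha\|$ for any $k < q_{s+1}$, $k \neq q_s$, where $\|\cdot\|$ denotes the distance to the nearest integer. We recall (see e.g. [15]) that
$$\frac{1}{2q_{s+1}} \leq \|q_s\alpha\| \leq \frac{1}{q_{s+1}}. \qquad (2)$$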
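As a numerical illustration of the denominators $(q_s)$, the following minimal sketch builds them from a list of partial quotients and checks the classical bounds $1/(2q_{s+1}) \leq \|q_s\alpha\| \leq 1/q_{s+1}$ in exact rational arithmetic. The helper names (`convergents`, `dist_to_nearest_integer`, `satisfies_classical_bounds`) are hypothetical, and α is replaced by a long rational truncation of [0; 2, 2, 2, ...], which is an assumption made only so that the check is exact.

```python
from fractions import Fraction
import math


def convergents(partial_quotients):
    """Numerators and denominators (p_s, q_s) of [0; a_1, a_2, ...]."""
    p_prev, p = 1, 0  # p_{-1}, p_0
    q_prev, q = 0, 1  # q_{-1}, q_0 = 1
    ps, qs = [p], [q]
    for a in partial_quotients:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        ps.append(p)
        qs.append(q)
    return ps, qs


def dist_to_nearest_integer(x):
    """Distance from the rational number x to the nearest integer."""
    frac = x - math.floor(x)
    return min(frac, 1 - frac)


def satisfies_classical_bounds(alpha, qs, s):
    """Check 1/(2 q_{s+1}) <= ||q_s alpha|| <= 1/q_{s+1}."""
    norm = dist_to_nearest_integer(qs[s] * alpha)
    return Fraction(1, 2 * qs[s + 1]) <= norm <= Fraction(1, qs[s + 1])


# Assumption: alpha is approximated by the 40th convergent of [0; 2, 2, ...].
ps, qs = convergents([2] * 40)
alpha = Fraction(ps[-1], qs[-1])
checks = [satisfies_classical_bounds(alpha, qs, s) for s in range(30)]
print(qs[:6], all(checks))
```

Exact fractions are used instead of floating point because $\|q_s\alpha\|$ decays like $1/q_{s+1}$ and would quickly fall below double precision.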
Note that if there is only one singularity, that is k = 1, then by (2) it is always badly approximable by α. The following shows that for $k \geq 2$ the set of singularities that are badly approximable by α is a thick set in $[0, 1]^k$.
Lemma 1.5 [34] Let α ∈ R \ Q. For any k ∈ N, the set $E \subset [0, 1]^k$ of k-tuples $(a_1, \ldots, a_k)$ that are badly approximable by α is a product of sets of full Hausdorff dimension in [0, 1].
But it was proven in [34] (see also [5]) that the set B(α) is a winning set in the sense of Schmidt (see [5,34] and references therein). A winning set is of full Hausdorff dimension. Moreover, for a winning set $B \subset \mathbb{R}$ we have that for any $x_1, \ldots, x_n$ the set $\bigcap_{s=1}^{n} (x_s + B)$ is winning. So, if $a_1, \ldots, a_l$ are such that $a_i - a_j \in B(\alpha)$ for any $i, j \in \{1, \ldots, l\}$, $i \neq j$, then the set of $a \in [0, 1]$ such that $a \in \bigcap_{s=1}^{l} (a_s + B(\alpha))$ is winning, which means that $a_1, \ldots, a_l, a_{l+1}$ are badly approximable by α for a winning set of $a_{l+1}$. The statement of the lemma then follows by induction, since a single $a_1$ is always badly approximable by α.

Logarithmic like singularities
In the case of logarithmic like singularities, the following theorem holds.
Theorem 2 Let α ∈ R\Q and $f \in C^2(\mathbb{T} \setminus \{a_1, \ldots, a_k\})$ with the singularities $\{a_1, \ldots, a_k\}$ of type h and badly approximable by α, with some associated
What is $x_s$ in Theorem 2? When we compute the shear in the Birkhoff sums of the ceiling function f, we will naturally be able to control only those points that do not go too close to the singularities under the iteration by $R_\alpha$ on the base. Thus, a controllable point will be a point that stays at distance at least $x_s$ from the singularities during $O(q_{s+1})$ iterates in the future or in the past. We choose $x_s$ so that the contribution to the shear of a single visit to a singularity is negligible with respect to the accumulated shear (this, either in the future or in the past). To fix ideas, suppose the singularities are reduced to just one one-sided singularity at the origin and observe that if f at the origin is exactly log then the choice $x_s = 1/(q_s \log^{7/8} q_s)$ would satisfy 1. Then if $q_{s+1} < q_s \log^{7/8} q_s$ we see that for any x ∈ T, either up to $q_{s+1}/2$ in the future or up to $-q_{s+1}/2$ in the past, the orbit of x by $R_\alpha$ does not enter the $x_s$ neighborhood of the origin, and this will show that the progressive accumulation in the Birkhoff sums of the derivative of the ceiling function above the orbit of x in the region of time between $O(q_s)$ and $O(q_{s+1})$ in the future or $O(-q_{s+1})$ and $O(-q_s)$ in the past dominates the value of the derivative at the closest entry to the neighborhood of the origin (this is the aim of condition 1). The latter is the crucial fact that we need to show SWR. In the opposite case where for example $q_{s+1} \gg q_s \log q_s$ we have to discard the points that enter the $x_s$ neighborhood of the origin between time $-q_s$ and $q_s$ and then show that the remaining points stay away from the $x_s$ neighborhood of the origin for $O(q_{s+1})$ iterations (either in the past or in the future) and conclude as before. This is where 2. is necessary to show that the measure of the discarded points is arbitrarily small.
We refer to the beginning of Sect. 4.1 for an outline of the proof of Theorem 2 in which the different roles of conditions 1., 2. and 3. are all explained in detail.
We now restate Theorem 2 in the particular case of exactly logarithmic singularities, that is when h(x) = − log(x), for x ∈ [0, 1). To be able to choose x s such that 1., 2. and 3. are satisfied we need some arithmetic restrictions on α, that we now introduce.
For α ∈ R \ Q, let $K_\alpha := \{n \in \mathbb{N} : q_{n+1} < q_n \log^{7/8}(q_n)\}$. We then define in view of 1. and 2. of Theorem 2
Indeed, for α ∈ E, 1. and 2. are satisfied with $x_s := 1/(q_s \log^{7/8} q_s)$.
Recall first that a number α ∈ R \ Q is said to be Diophantine if there exist $\tau \geq 0$ and C(α) > 0 such that for any $(p, q) \in \mathbb{Z} \times \mathbb{N}^*$ we have that $|\alpha - p/q| \geq C(\alpha)/q^{2+\tau}$. We call D the set of Diophantine numbers.
To have 3., it suffices to assume that α is Diophantine, since an equivalent definition of α ∈ R \ Q being Diophantine is that its sequence of denominators $(q_n)$ satisfies $q_{n+1} < r_\alpha q_n^{1+\tau}$ for all n ∈ N (see (2)). Hence we have the following
Proof We take for $x_s$ the sequence $1/(q_s \log^{7/8} q_s)$ and easily check the hypotheses of Theorem 2. Therefore $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ has the SWR-property.
Corollary 1.6 covers a set of full Lebesgue measure of rotation angles α. Indeed, it is known that the set of Diophantine numbers D has full Lebesgue measure, and we will prove in Appendix B the following result. Denote by λ the Haar measure on T.
Proposition 1.7 It holds that λ(E) = 1.
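To get a feeling for the set $K_\alpha$ of indices n with $q_{n+1} < q_n \log^{7/8}(q_n)$, the following minimal sketch lists the indices that belong to it for a given finite list of partial quotients. The function names and the two sample lists of partial quotients are illustrative assumptions; a finite list can of course only probe finitely many indices.

```python
import math


def denominators(partial_quotients):
    """Denominators q_0 = 1, q_1, q_2, ... of the convergents of [0; a_1, a_2, ...]."""
    q_prev, q = 0, 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs


def in_K(qs, n):
    """True when q_{n+1} < q_n * (log q_n)^(7/8)."""
    return qs[n + 1] < qs[n] * math.log(qs[n]) ** 0.875


def indices_in_K(qs):
    """All indices n for which q_{n+1} is available and n belongs to K_alpha."""
    return [n for n in range(len(qs) - 1) if in_K(qs, n)]


# Bounded type example: all partial quotients equal to 2.
bounded_type = denominators([2] * 10)
# Same beginning, but with one large partial quotient a_6 = 1000.
one_big_jump = denominators([2, 2, 2, 2, 2, 1000, 2, 2])

print(indices_in_K(bounded_type))
print(indices_in_K(one_big_jump))
```

In the bounded type example every sufficiently large index lies in $K_\alpha$, while a single large partial quotient removes the corresponding index; this matches the role of $K_\alpha$ as the set of scales where $q_{n+1}$ is not much larger than $q_n$.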
We now recall the following results on mixing of special flows with ceiling functions having logarithmic singularities.
Theorem 3 Let f be as in Corollary 1.6. Then
(a) ([18]) If $\sum_j A_j = \sum_j B_j$ then $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ is not mixing for any α ∈ R \ Q.
(b) ([19,20,32]) If $\sum_j A_j \neq \sum_j B_j$ then $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ is mixing for almost every α ∈ R \ Q.
(c) ([19,20]) If $A_j - B_j \neq 0$ have the same sign for all j then $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ is mixing for each α ∈ R \ Q.
In Theorem 5 below, we show that the SWR property for (T t α, f ) t∈R implies the FEJ-property. As an immediate consequence of this and of Theorem 3, Corollary 1.6 and the fact that mixing flows with the FEJ-property are multiple mixing [31] we get the following.
. . . , $a_k\})$ has singularities $\{a_1, \ldots, a_k\}$ of type h and badly approximable by α. If $A_j - B_j \neq 0$ have the same sign for all j, then $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ is multiple mixing for each α ∈ E ∩ D. If $\sum_j A_j \neq \sum_j B_j$ then $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ is multiple mixing for almost every α ∈ E ∩ D.
Then $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ has the SWR-property.
The most interesting case, when h has power singularities, is discussed in the corollary below.
Corollary 1.9 Let α ∈ DC(0). Let $f \in C^2(\mathbb{T} \setminus \{a_1, \ldots, a_k\})$ with all the singularities $\{a_1, \ldots, a_k\}$ of power-like type $x^{\gamma_i}$ from the left and $x^{\delta_i}$ from the right, $-1 < \gamma_i, \delta_i < 0$. Then, if the points $\{a_1, \ldots, a_k\}$ are badly approximable by α, we have that $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ has the SWR-property and is mixing of all orders.
Proof of Corollary 1.9 It is easy to check that (4) in Theorem 4 is satisfied. This gives the SWR-property. Mixing of $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ was established in [17]. Multiple mixing then follows from Theorem 5 and the FEJ-property.
Remark 1.10 A stronger version of Theorem 4 actually holds: if α ∈ DC(0) and $f \in C^2(\mathbb{T} \setminus \{a_1, \ldots, a_k\})$ with all the singularities $\{a_1, \ldots, a_k\}$ of at most power-like type ($x^{\gamma_i}$ from the left and $x^{\delta_i}$ from the right, $-1 < \gamma_i, \delta_i < 0$) and if $\gamma = \min_{1 \leq i \leq k} \{\gamma_i, \delta_i\}$, then it is sufficient to have the singularities of maximal type badly approximable by α to guarantee the SWR-property, namely if the points in $E = \{a_i : \min\{\gamma_i, \delta_i\} = \gamma\} \subset \{a_1, \ldots, a_k\}$ are badly approximable by α, then $(T_t^{\alpha,f})_{t\in\mathbb{R}}$ has the SWR-property and is mixing of all orders. The proof in this case is similar to the proof of Theorem 4 and we omit it to avoid overloading the paper with unnecessary technicalities.

Plan of the paper
In Sect. 2 we introduce the SWR-property and describe its consequences for joinings. In Sect. 3 we give a criterion involving the Birkhoff sums of the ceiling function that guarantees that a special flow above an isometry has the SWR-property. The treatment of these sections is similar to [9,10]. In Sect. 4 we study the Birkhoff sums of logarithmic like and power like functions and prove Theorems 2 and 4. Appendix A is devoted to the proof of Theorem 1 on the absence of the WR-property for a subclass of Kochergin flows. Finally, Appendix B is devoted to the proof that the set of frequencies for which Theorem 2 holds has full Lebesgue measure.

The SWR-property
Let (X, B, μ) be a probability standard Borel space. We additionally assume that X is a complete metric space with a metric d. Let $(T_t)_{t\in\mathbb{R}}$ be an ergodic flow acting on (X, B, μ). The R-property used by Ratner in the context of horocycle flows is a property of slow divergence between the orbits of nearby points that can essentially be stated as follows: for any ε > 0 there exists κ > 0 such that if y and y′ are sufficiently close to each other and if they are not on the same orbit then there exists M(y, y′) such that $d(T_t y, T_{t+\iota} y') < \varepsilon$ (with
It is not difficult to see that the R-property implies the FEJ property. Indeed, if a joining λ is not a finite extension joining then there exist points x, y, y′ such that (x, y) and (x, y′) are typical for λ while y and y′ are arbitrarily close and not in the same orbit, and by the R-property and the Birkhoff ergodic theorem one obtains an extra invariance of λ, namely by $\mathrm{Id} \times T_\iota$, that implies that λ is the product measure.
Actually, it is useful to relax the R-property by asking that the controlled divergence happens for x and y outside an exceptional set of points of measure less than ε (and for most of the times in [M, (1 + κ)M]). The proof of the FEJ property remains the same in nature but becomes a bit more technical involving some standard measure theoretical arguments (see for example [33]).
In [10], Definition 4, a slightly weaker version of the R-property is given, that is called WR or Weak Ratner property, that allows the drift to vary in some fixed compact set away from zero and infinity. There again, the FEJ consequence as well as its proof follow in practically the same way as for the R-property, with some extra standard measure theoretical arguments, and under a "continuity" assumption on orbits (see below). However, as shown in [10] the WR property is more versatile than the R-property and is adapted to nonlinear situations, where the R-property is unlikely to hold, such as in our context of reparametrizations of linear flows with singularities.
Observe now that if in the proof of the FEJ property, we used that $d(T_t y, T_{t+\iota} y') < \varepsilon$ during a large interval of negative times t instead of positive times, then exactly the same conclusion of extra invariance of λ by $\mathrm{Id} \times T_\iota$ would still hold. The crucial observation here is that it suffices to check one of the two slow divergences, in the future or in the past, and this possibly depending on the pair of points that is considered. This motivates the introduction of what we call the Switchable Ratner, or SWR, property that we now formally define.
If the set of $t_0 > 0$ such that the flow $(T_t)_{t\in\mathbb{R}}$ has the switchable $R(t_0, P)$-property is uncountable, the flow is said to have the SWR-property.
For the sake of completeness, we can formally compare the SWR-property with the definition of the WR-property [10]. To have WR-property, we fix Consequently, SWR-property is weaker than WR-property (and as Theorem 1 shows, it is strictly weaker).
As we just mentioned the proof of the FEJ implication from the SWR property is a direct adaptation of the proof of the same implication in the case of the R property or the WR property. For completeness, we will present a detailed proof of this fact that is stated in Theorem 5 below and that occupies the rest of the section.
As a standing assumption in all the sequel, we will add one more natural condition on the flow $(T_t)_{t\in\mathbb{R}}$ which can be viewed as "continuity" on orbits. The flow $(T_t)_{t\in\mathbb{R}}$ is called almost continuous [10] if for every ε > 0 there exists a set X(ε) with μ(X(ε)) > 1 − ε such that for every ε′ > 0 there exists δ > 0 such that for every
Theorem 5 Let $(T_t)_{t\in\mathbb{R}}$ be a weakly mixing almost continuous flow acting on a probability standard Borel space (X, B, μ). Assume that $(T_t)_{t\in\mathbb{R}}$ satisfies the SWR-property. Let $(S_t)_{t\in\mathbb{R}}$ be an ergodic flow acting on a probability standard Borel space (Y, C, ν) and let $\rho \in J((T_t)_{t\in\mathbb{R}}, (S_t)_{t\in\mathbb{R}})$ be an ergodic joining. Then either ρ is equal to μ ⊗ ν or is a finite extension of the measure ν.
For the definition and properties of joinings, we refer the reader to [33] or [12]. In the proof of Theorem 5, we will need some lemmas from [10]. But first we state a useful fact that is a simple consequence of the Birkhoff ergodic theorem.

Lemma 2.2 Let T, S : (X, B, μ) → (X, B, μ) be two ergodic automorphisms and let A ∈ B.
For any ε, δ, κ > 0 there exist N = N(ε, δ, κ) and a measurable set Z = Z(ε, δ, κ) with μ(Z) > 1 − δ such that for any $M, L \geq N$ with $L/M \geq \kappa$ and any x ∈ Z we have
The following is a consequence of Lemma 5.4. in [10], that is itself based on the Birkhoff ergodic theorem.

Lemma 2.3 Let $(T_t)_{t\in\mathbb{R}}$ be a weakly mixing almost continuous flow acting on (X, B, μ), and $(S_t)_{t\in\mathbb{R}}$ be another ergodic flow acting on
Proof The proof is a simple consequence of the following lemma:
Lemma 2.4 (Lemma 5.4. in [10]) Let $(T_t)_{t\in\mathbb{R}}$ be an ergodic almost continuous flow acting on (X, B, μ), and $(S_t)_{t\in\mathbb{R}}$ be another ergodic flow acting on (Y, C, ν). Let $\rho \in J((T_t)_{t\in\mathbb{R}}, (S_t)_{t\in\mathbb{R}})$ be such that ρ is ergodic for the automorphism $T_{t_0} \times S_{t_0}$ for some $t_0 > 0$. Let P ⊂ R be non-empty and compact. Let A ∈ B be such that μ(∂A) = 0 and B ∈ C. Then, for every ε, δ, κ > 0 there exist a natural number N = N(ε, δ, κ) and a set Z = Z(ε, δ, κ) ⊂ B ⊗ C with ρ(Z) > 1 − δ such that for any $M, L \geq N$ with $L/M \geq \kappa$ and any p ∈ P, we have
One uses the above lemma first for the flows $(T_t)_{t\in\mathbb{R}}$ and $(S_t)_{t\in\mathbb{R}}$ and ergodic
Then, for flows $(T_{-t})_{t\in\mathbb{R}}$ and $(S_{-t})_{t\in\mathbb{R}}$ and the same ergodic joining ρ to get, for ε, δ
The next lemma is used in the proof of Theorem 3 in [27].
In what follows, we only consider the case where (X, d) is a σ-compact metric space. Let

Remark 2.7
Since (X, d) is a Polish space, by Lemma 2.6 and regularity of μ, there exists a dense family
Proof of Theorem 5. Let $\rho \in J((T_t)_{t\in\mathbb{R}}, (S_t)_{t\in\mathbb{R}})$ be an ergodic joining with $\rho \neq \mu \times \nu$. Assume that $(T_t)_{t\in\mathbb{R}}$ has the switchable $R(t_0, P)$-property and ρ is ergodic for $T_{t_0} \times S_{t_0}$ (then ρ is ergodic for $T_{-t_0} \times S_{-t_0}$). Such $t_0 > 0$ always exists because an ergodic flow can have at most countably many non-ergodic time automorphisms and, by assumption, the property $R(t_0, P)$ is satisfied for uncountably many $t_0 > 0$. For simplicity of notation, we assume $t_0 = 1$.
As in Lemma 5.4. in [10], we conclude that k : R → R is a continuous function and for any t ∈ R, k(t) > 0. Indeed, it follows by the fact that if for
($(T_t)_{t\in\mathbb{R}}$ is assumed to be weakly mixing, hence every time r of the flow is ergodic). The set P ⊂ R\{0} is compact, therefore there exists ε > 0 such that k(p) > ε for any p ∈ P. It follows by the definition of the function k that there exists a
By Lemma 2.6 and the fact that for 0
It follows by the fact that ρ is a joining that
for $1 \leq i, j \leq R$ and every t ∈ R. By the switchable R(1, P)-property, let κ := κ(ε). By Lemma 2.2 applied to ε/8
Next, by Lemma 2.3 applied to ε/8, 1/8, κ > 0 and the sets $B_i \times C_j$, $1 \leq i, j \leq R$, there exist $N_2 \in \mathbb{N}$ and a set $U_2 \subset B \otimes C$ with $\rho(U_2) > 7/8$ such that for every $L, M \geq N_2$ with $L/M \geq \kappa$ and any p ∈ P, we have
It follows that if we set $N_0 := \max(N_1, N_2)$ and $U_0 := U_1 \cap U_2$, then $\rho(U_0) >$
Assume that the first inequality is satisfied. We will use Eqs. (8) and (10) (in case the second one is satisfied, we use Eqs. (9) and (11)). Let $1 \leq i_p, j_p \leq R$ be the numbers which satisfy |ρ(
δ 0 and an application of Lemma 2.5 completes the proof.

SWR-property for special flows
In this section, we will prove a sufficient condition for the SWR-property in the case of special flows over an ergodic isometry. We start by recalling the definition of special flows. Let T be an automorphism of (X, B, μ). Let $f \in L^1(X, \mu)$ be such that f > 0. The special flow $(T_t^f)_{t\in\mathbb{R}}$ defined above T and under the ceiling function f is given by
$$X \times \mathbb{R}/\sim \; \to \; X \times \mathbb{R}/\sim, \qquad (x, s) \mapsto (x, s + t),$$
where ∼ is the identification
$$(x, s + f(x)) \sim (Tx, s).$$
Equivalently the flow $(T_t^f)_{t\in\mathbb{R}}$ is defined for $t + s \geq 0$ (with a similar definition for negative times) by
$$T_t^f(x, s) = (T^n x, \; s + t - f^{(n)}(x)),$$
where n is the unique integer such that
$$f^{(n)}(x) \leq s + t < f^{(n+1)}(x)$$
and
$$f^{(n)}(x) = f(x) + f(Tx) + \cdots + f(T^{n-1}x) \text{ for } n > 0, \qquad f^{(0)}(x) = 0.$$
If T preserves a unique probability measure μ then the special flow will preserve a unique probability measure that is the normalized product measure of μ on the base and the Lebesgue measure on the fibers. If X is a metric space with a metric d, so is
The following general lemma is a direct consequence of the Birkhoff ergodic theorem.

Lemma 3.1 Let T be an ergodic automorphism
for every x ∈ A. Remark 3.2 Assume that additionally f is positive and bounded away from zero. Fix , κ > 0(κ < | X f dμ| < 1/2). It follows that there are constants We now state the main result of this section. It is similar to Lemma 6 of [9]. If Apply Remark 3.2 with the constants /4, κ to f and T, T −1 , respectively to obtain constants D 1 , D 2 > 0 such that for x ∈ A, μ(A) > 1 − /2 (the set A is the intersection of two relevant sets), we have We will consider the second case (the proof in the first case goes along the same lines). Let us define By (18)

It follows by the properties of
Take
A similar reasoning shows that
Therefore, by the definition of the special flow, we have
(20), at least (1 − ε)L and for any such k we get that
This gives us the switchable R(γ, P)-property.
Note that since the flow $(T_t^f)_{t\in\mathbb{R}}$ is ergodic, the set of η ∈ R such that $T_\eta^f$ is not ergodic is at most countable and therefore, as a direct consequence of Proposition 3.3, we get that $(T_t^f)_{t\in\mathbb{R}}$ enjoys the SWR-property.
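To make the definition of special flows used in this section concrete, here is a minimal sketch of the forward-time flow over a circle rotation: starting from a point (x, s) with 0 ≤ s < f(x), one moves up the fiber at unit speed and applies the base map each time the roof is reached. The names `special_flow`, `birkhoff_sum`, `rotation` and `roof`, as well as the smooth roof 1 + x and the angle √2 − 1, are assumptions chosen for illustration only; the roofs studied in this paper have singularities and are not covered by this toy example.

```python
import math


def birkhoff_sum(f, T, x, n):
    """f^{(n)}(x) = f(x) + f(Tx) + ... + f(T^{n-1} x) for n >= 0."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total


def special_flow(f, T, x, s, t):
    """Flow (x, s), with 0 <= s < f(x), forward by time t >= 0.

    Returns (T^n x, s + t - f^{(n)}(x), n), where n counts the returns
    to the base. Assumes f is bounded away from zero so the loop ends.
    """
    n = 0
    s += t
    while s >= f(x):
        s -= f(x)
        x = T(x)
        n += 1
    return x, s, n


ALPHA = math.sqrt(2) - 1  # illustrative rotation angle


def rotation(x):
    """Circle rotation by ALPHA on R/Z."""
    return (x + ALPHA) % 1.0


def roof(x):
    """Illustrative roof function, bounded between 1 and 2."""
    return 1.0 + x


point, height, returns = special_flow(roof, rotation, 0.1, 0.5, 8.9)
print(point, height, returns)
```

Flowing first for time 3.7 and then for time 5.2 should land, up to rounding, on the same point as flowing once for time 8.9, which is the flow property that the quotient construction encodes.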

SWR-property for smooth special flows with singularities
In this section we will use Proposition 3.3 to prove SWR-property for special flows given by the assumptions in Theorem 2 and Theorem 4. In all the sequel we assume {a 1 , . . . , a k } are badly approximable by α with a constant C > 1 (see Definition 1.3). We start with an easy combinatorial fact about the visits of an orbit by the rotation R α to the neighborhood of the singularities.
For finite sets A, B ⊂ T, we use the notation $\rho(A, B) = \min_{a \in A,\, b \in B} \|a - b\|$.
We will often use the Denjoy-Koksma inequality to control the growth of the Birkhoff sums. For a reference, see for example [4].

Proposition 4.2 (Denjoy-Koksma inequality) Let f : T → R be a function of bounded variation. Then
$$\left| f^{(q_n)}(x) - q_n \int_{\mathbb{T}} f \, d\lambda \right| \leq \operatorname{Var} f$$
for every x ∈ T and n ∈ N.
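The Denjoy-Koksma inequality can be observed numerically. The following minimal sketch compares, for a rotation by √2 − 1 = [0; 2, 2, 2, ...], the Birkhoff sums at times $q_n$ of the tent function |x − 1/2| (total variation 1 on the circle, integral 1/4) with $q_n$ times the integral. The function names, the test function and the sample points are assumptions made for illustration, and floating point arithmetic is used, so the check is only approximate.

```python
import math


def denominators(partial_quotients):
    """Denominators q_0 = 1, q_1, q_2, ... of the convergents of [0; a_1, a_2, ...]."""
    q_prev, q = 0, 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs


def tent(x):
    """Test function on the circle: variation 1, integral 1/4."""
    return abs(x - 0.5)


TENT_VARIATION = 1.0
TENT_INTEGRAL = 0.25


def birkhoff_sum(f, alpha, x, n):
    """f(x) + f(x + alpha) + ... + f(x + (n - 1) alpha), arguments taken mod 1."""
    return sum(f((x + i * alpha) % 1.0) for i in range(n))


def denjoy_koksma_gap(alpha, x, q):
    """|f^{(q)}(x) - q * integral of f| for the tent function."""
    return abs(birkhoff_sum(tent, alpha, x, q) - q * TENT_INTEGRAL)


ALPHA = math.sqrt(2) - 1
QS = denominators([2] * 9)  # 1, 2, 5, 12, ..., 2378
SAMPLE_POINTS = [0.0, 0.1, 0.37, 0.5, 0.9]

worst_gap = max(denjoy_koksma_gap(ALPHA, x, q) for q in QS for x in SAMPLE_POINTS)
print(worst_gap <= TENT_VARIATION)
```

The point of the inequality is that the bound does not grow with $q_n$, in contrast with the linear growth of the sums themselves.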
The following lemma is a simple consequence of the Denjoy-Koksma inequality. It will be very useful in separating the contribution to the shear of the visits to the neighborhood to the singularities from the rest of the orbit.
Thenh ∈ BV (T) and we use the Denjoy-Koksma inequality to obtain (2), the set {x + r α} for each x ∈ T.
Proof By assumptions, there exists a constant z 0 > 0 such that for every i = 1, . . . , k and for every x Moreover, since f ∈ C 1 (T \ {a 1 , . . . , a k }); it follows that there exists a constant R > 0 such that for every C 0 } satisfies the assertion of the lemma.

Outline of the proof.
We first give an outline of the proof in which we suppose, for simplicity, that the ceiling function has just one singularity that is exactly logarithmic. The SWR-property (in the same way as the original Ratner property) consists of two parts. First, for two "nearby" points, we need to show that their orbits drift apart by a controlled amount at some time M (M may be positive or negative in the case of the SWR-property). Second, we need to make sure that their orbits keep essentially the same drift during an interval of time comparable to M. For special flows over rotations (or IETs) we gave in Proposition 3.3 a characterization of the SWR-property based on the divergence, for nearby points, between the corresponding Birkhoff sums of the ceiling function f. The latter is controlled by the Birkhoff sums of the derivative of the ceiling function, which were investigated in several other works (see e.g. [8,17,19,20,32,35]). In Proposition 4.6 Part a, we prove the first part of the SWR-property. Precisely, we want to see a macroscopic yet controlled drift between the Birkhoff sums of two points with 1/(q_s ln q_s) ≥ d(x, y) ≥ 1/(q_{s+1} ln q_{s+1}) for every s sufficiently large. We now explain the proof, showing also how the argument simplifies if K_α = {s ∈ N : q_{s+1} < q_s (ln q_s)^{7/8}} contains all the integers after some n_0, in particular if α is of constant type.
a. There exists c > 0 such that for any point In case q_{s+1} ≤ C q_s, this would give the macroscopic yet controlled drift between the points x, y such that 1/(q_{s+1} ln q_{s+1}) ≤ d(x, y) ≤ 1/(q_s ln q_s), either in the future at time q_s or in the past at time −q_s. So, if α is of constant type, we would be done with the proof of Proposition 4.6 Part a.
c. In the case q_{s+1} ≫ q_s but s ∈ K_α, we may have to consider the Birkhoff sums beyond q_s to see the drift between the orbits of x and y such that 1/(q_{s+1} ln q_{s+1}) ≤ d(x, y) ≤ 1/(q_s ln q_s). Since s ∈ K_α, we have that 1/(q_s ln q_s) ≤ c/q_{s+1}; hence a. applied to s + 1 implies that either up to q_{s+1}/2 in the future or up to −q_{s+1}/2 in the past the orbit of x by R_α does not enter the 1/(q_s ln q_s)-neighborhood of the origin (see Lemma 4.7, case m ∈ K_α). Using this, the estimate of b. and the Denjoy-Koksma inequality, we get by the cocycle identity that f'^{(ιkq_s)}(x) behaves like k q_s ln q_s for s sufficiently large, k ∈ [1, O(q_{s+1}/q_s)] and ι = 1 or −1 (see Lemma 4.9). As a consequence, there exists a time of the form n_0 q_s, n_0 ∈ [1, O(q_{s+1}/q_s)], such that f^{(ιn_0 q_s)}(x) − f^{(ιn_0 q_s)}(y) is in some compact set P away from 0, which finishes the proof of Proposition 4.6 Part a in the case K_α contains all sufficiently large integers.
d. In the case s ∉ K_α, we define x_s := 1/(q_s (ln q_s)^{7/8}) and use our arithmetic condition α ∈ E to define a set Z of almost full measure (see the definition of Z in Sect. 4.1.2) such that R_α^i x does not enter the x_s-neighborhood of the origin (see Definition (25)) for every i ∈ {−q_s, ..., 0, ..., q_s − 1} and every s ∉ K_α sufficiently large. This actually implies that either up to q_{s+1}/(4C) in the future or up to −q_{s+1}/(4C) in the past (C is a constant coming from Definition 1.3; in case there is only one singularity, C = 1) the orbit of x by R_α does not enter the x_s-neighborhood of the origin (see Lemma 4.7, case m ∉ K_α).
From there, the proof of Proposition 4.6 Part a is similar to case c. above, except that condition 3 of Theorem 2, namely h(1/(2q_s))/h(1/(2q_{s+1})) > m_0, is used to show that a stretch of order n_0 q_s ln q_s, where n_0 can be taken to be as large as O(q_{s+1}/q_s), is sufficient to produce a drift between the points x, y such that d(x, y) is comparable to 1/(q_{s+1} ln q_{s+1}). This is where the Diophantine condition 3. of Theorem 2 is crucial.
Observe that d. is the only part where we used that in the definition of SWR, it is allowed to discard a small measure set of pairs (x, y) for which the property will not be checked.
Note however that in all the cases above, it is crucial to use the possibility to control the drift in the future or in the past depending on the pair of points.
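The dichotomy between s ∈ K_α and s ∉ K_α used in the outline can be made concrete by listing, for given partial quotients of α, the indices s satisfying q_{s+1} < q_s (ln q_s)^{7/8}. The sketch below is only an illustration; the function names and the sample partial quotients are our own assumptions.

```python
import math

def convergent_denominators(partial_quotients):
    """Denominators q_0 = 1, q_n = a_n*q_{n-1} + q_{n-2} of alpha = [0; a_1, a_2, ...]."""
    q_prev, q = 0, 1
    denominators = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        denominators.append(q)
    return denominators

def in_K_alpha(q_s, q_next):
    """True when q_{s+1} < q_s * (ln q_s)^(7/8), the inequality defining K_alpha."""
    return q_next < q_s * math.log(q_s) ** (7 / 8)

def K_alpha_indices(partial_quotients):
    """Indices s (among those computable from the given data) that belong to K_alpha."""
    q = convergent_denominators(partial_quotients)
    return [s for s in range(len(q) - 1) if in_K_alpha(q[s], q[s + 1])]

if __name__ == "__main__":
    # Constant type (all partial quotients equal to 1): every large s is in K_alpha.
    print(K_alpha_indices([1] * 12))
    # One large partial quotient produces an index s outside K_alpha.
    print(K_alpha_indices([1, 1, 1, 1, 1, 1, 1000, 1, 1]))
```

For bounded partial quotients the ratio q_{s+1}/q_s stays bounded while (ln q_s)^{7/8} grows, so only finitely many indices are excluded; a very large partial quotient a_{s+1} is what pushes s out of K_α.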
The second part (keeping the drift) is proved in Proposition 4.6 Part b. We need to consider the points R_α^{n_0 q_s} x and R_α^{n_0 q_s} y and apply arguments similar to those of Part a to bound the drift during time κ n_0 q_s. The main ingredient is Lemma 4.10, which allows us to bound the drift between the Birkhoff sums in the future (or in the past) up to a time comparable to q_{m+1} for points that stay away from the singularities in the future (or in the past) during this time. We then have to "situate" κ n_0 q_s relative to the denominators of α and check that the conditions of Lemma 4.10 are satisfied by R_α^{n_0 q_s} x and R_α^{n_0 q_s} y. Of course, if s is such that q_{s+1} ≫ q_s (for example if s ∉ K_α) and if 1/(q_{s+1} ln q_{s+1}) ≤ d(x, y) ≤ 1/(q_s ln q_s), then the same argument as in Part a would allow us to keep the drift under control for an additional time κ n_0 q_s. But in the other cases, where we have in particular to interpolate between the constant type and non-constant type behavior, our proof gets a bit technical and treats different cases separately.

Notations and standing assumptions
In all the proofs of Theorem 2, Theorem 4 and Theorem 1, we will use T for the irrational rotation R_α. We may assume WLOG that
In the sequel, we will assume s ≥ s_0(ε, N) = s_0, where s_0 is a sufficiently large integer, in particular κ q_{s_0} > N.
We now summarize the consequences of the hypotheses 1., 2., 3. of Theorem 2 that will be useful to us in the sequel. If s ≥ s_0 we have
We also note that h(1/(2q_s)) > 8C. Set v_s := x_s/(4C) and define
and Z := ⋂_{s ≥ s_0, s ∉ K_α} W_s. Observe that λ(Z) ≥ 1 − ε (λ(W_s) ≥ 1 − 16 k v_s q_s).
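The measure estimate for Z can be read as a union bound: each excluded index s ∉ K_α removes a set of measure at most 16 k v_s q_s, with v_s = x_s/(4C) and x_s = 1/(q_s (ln q_s)^{7/8}), so the total loss is at most the sum of these terms. The following sketch only evaluates that sum; the function names and the sample denominators are hypothetical, and the union-bound reading is our interpretation of the estimate above.

```python
import math

def x_s(q_s):
    """x_s = 1 / (q_s * (ln q_s)^(7/8)); requires q_s >= 2."""
    return 1.0 / (q_s * math.log(q_s) ** (7 / 8))

def v_s(q_s, C):
    """v_s = x_s / (4C)."""
    return x_s(q_s) / (4 * C)

def excluded_measure_bound(bad_denominators, k, C):
    """Sum of 16*k*v_s*q_s over the denominators q_s with s >= s_0, s not in K_alpha.

    Each term equals 4k / (C * (ln q_s)^(7/8)), so the bound is small only
    when the excluded denominators are sparse and large.
    """
    return sum(16 * k * v_s(q, C) * q for q in bad_denominators)

if __name__ == "__main__":
    # Hypothetical denominators q_s for indices s outside K_alpha.
    sample = [10 ** 50, 10 ** 200, 10 ** 300]
    print(excluded_measure_bound(sample, k=1, C=2.0))
```

Since each term decays only like a power of ln q_s, the sum is finite or small only under an arithmetic restriction on how often s ∉ K_α occurs, which is the role played by the condition α ∈ E.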
Consider x, y ∈ Z with 0 < ‖x − y‖ < δ. We will assume WLOG that x < y (we consider the trigonometric order on T).

Controlling the drift
The following proposition implies Theorem 2 due to Proposition 3.3.
We will assume that q_{s+1} > 2q_s. If not, then in (26), In other words, in this case we will see the drift between x and y before time q_s.

Proposition 4.6 Consider x, y ∈ Z as in (26). Part a There exists n_0 ∈ {1, ..., max(q_{s+1}/(8Cq_s), 1)} satisfying
and such that the following holds
Part b Let X = T^{n_0 q_s} x and Y = T^{n_0 q_s} y if (27) holds, and X = T^{−(n_0 q_s + 1)} x and Y = T^{−(n_0 q_s + 1)} y if (28) holds. For n = 1, ..., [κ n_0 q_s] + 1 we have
The rest of this section is devoted to the proof of Proposition 4.6. But before this we show how it implies Proposition 4.5 and thus Theorem 2.

Proof of Proposition 4.6 Part a
For m ∈ N, we will often use the following non-resonance conditions of a pair of points (x, y) with the singularities {a_1, ..., a_k} [compare with (25)].
Lemma 4.7 Let x, y ∈ T be as in (26). Then for every m such that s_0 ≤ m ≤ s, if we have at least one of the following (30) is satisfied 2. if m ∈ K_α and q_{m+1} ≥ 2q_m, then we have at least one of (31) or (32).
Proof Observe first that since, by (26), ‖x − y‖ ≤ v_s ≤ v_m, it suffices to prove (31) or (32) with just x instead of [x, y] on the LHS and 2v_m instead of v_m on the RHS.
In the following lemma we control the drift between the Birkhoff sums up to q_s or −q_s between nearby points that do not go too close to the singularities. Recall that we have assumed that
then
then
Proof We show that (33) implies (34), the second part of the Lemma being similar. By (33) and (26)
It is enough to show that there exists d > 0 such that for s ≥ s_0
For s ∈ N, define
It follows that f̄_s ∈ BV(T) and
We have
(if s ∈ N is sufficiently large). It follows by the assumptions on f and h and (30) that if s_0(ε, N) is sufficiently large, then for s ≥ s_0 we have, for every i = 1, ..., k:
and similarly
On the other hand, by l'Hôpital's rule
(by x_s/(4C) < 1/(2q_s)). Now, using (38)-(44), we get
which allows us to conclude since
if we assume WLOG that ε is sufficiently small (recall that x < y).
We now concatenate the inequalities of Lemma 4.8 if (31) or (32) are satisfied.
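Concatenating estimates over successive blocks of length q_s rests on the cocycle identity f^{(m+n)}(x) = f^{(m)}(x) + f^{(n)}(T^m x), which in particular gives f^{(k q)}(x) = f^{(q)}(x) + f^{(q)}(T^q x) + ... + f^{(q)}(T^{(k−1)q} x). The sketch below checks both identities numerically. The convention f^{(−n)}(x) = −(f(T^{−1}x) + ... + f(T^{−n}x)) for negative times is the usual one and is assumed here; the helper names and the test function are illustrative only.

```python
import math

def rotate(x, n, alpha):
    """T^n x for the rotation T x = x + alpha (mod 1); n may be negative."""
    return (x + n * alpha) % 1.0

def birkhoff_sum(f, x, n, alpha):
    """f^{(n)}(x) for any integer n.

    For n >= 0 this is f(x) + ... + f(T^{n-1} x); for n < 0 we use the
    convention f^{(n)}(x) = -(f(T^{-1} x) + ... + f(T^{n} x)).
    """
    if n >= 0:
        return sum(f(rotate(x, i, alpha)) for i in range(n))
    return -sum(f(rotate(x, -i, alpha)) for i in range(1, -n + 1))

def concatenated_sum(f, x, k, q, alpha):
    """Sum of k consecutive blocks f^{(q)}(T^{jq} x), j = 0, ..., k-1."""
    return sum(birkhoff_sum(f, rotate(x, j * q, alpha), q, alpha) for j in range(k))

if __name__ == "__main__":
    alpha = (math.sqrt(5) - 1) / 2
    f = lambda t: 2.0 + math.cos(2 * math.pi * t)   # smooth illustrative roof
    x, k, q = 0.3, 4, 13
    print(birkhoff_sum(f, x, k * q, alpha), concatenated_sum(f, x, k, q, alpha))
```

In this form, a bound valid for each block of length q_s (for points staying away from the singularities) adds up to a bound for times of the form k q_s.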
We are now ready to finish the proof of Part a. of Proposition 4.6. If s ∉ K_α, then by the fact that x, y ∈ Z ⊂ W_s, it follows that 1. in Lemma 4.7 is satisfied with m = s. If s ∈ K_α, then 2. in Lemma 4.7 is satisfied with m = s. Therefore we can use Lemma 4.7 for x, y and m = s. Now, by Lemma 4.9, if (31) holds we have (27), and if (32) holds we have (28). Part a. of Proposition 4.6 is settled; we now turn to Part b.
To prove Proposition 4.6 Part b., observe first that if s_0 is sufficiently large, and up to possibly changing κ to κ' = κ/(8C), one of two possibilities holds:
1. There exists s_0 ≤ m ≤ s, m ∈ K_α, such that κ n_0 q_s < q_m ≤ 8Cκ n_0 q_s, or
2. There exist s_0 ≤ m ≤ s and l ≥ 1 such that l q_m ≤ κ n_0 q_s < (l + 1) q_m ≤ q_{m+1}/(8C).
Case 1. κ n_0 q_s < q_m ≤ 8Cκ n_0 q_s with s_0 ≤ m ≤ s, m ∈ K_α. Lemma 4.7 implies that either (31) or (32) holds for T^{n_0 q_s} x, T^{n_0 q_s} y, m. Therefore, (48) or (49) holds for m and l = 0. We then apply Lemma 4.10 to T^{n_0 q_s} x, T^{n_0 q_s} y, m with l = 0, and according to whether we have (50) or (51) we will get A. or B. of Proposition 4.6 Part b. Indeed, suppose (50) holds. Then, since κ n_0 q_s < q_m, for n = 1, ..., [κ n_0 q_s] + 1, we have due to (47)

Case 2.
There exist s_0 ≤ m ≤ s and l ≥ 1 such that l q_m ≤ κ n_0 q_s < (l + 1) q_m ≤ q_{m+1}/(8C). We will first prove that T^{n_0 q_s} x, T^{n_0 q_s} y, m, l satisfy the hypothesis of Lemma 4.10. If m ∈ K_α, then Lemma 4.7 implies that either (31) or (32) holds for T^{n_0 q_s} x, T^{n_0 q_s} y, m. Therefore, since l ≤ q_{m+1}/(8Cq_m) − 1, either (48) or (49) holds for T^{n_0 q_s} x, T^{n_0 q_s} y, m, l. If m ∉ K_α, then we consider two cases:
I. m = s. In this case n_0 > 1/κ and therefore q_{s+1} > 32Cq_s. Since l ≤ [κ n_0] + 1, n_0 ≤ q_{s+1}/(8Cq_s) and κ ≤ 1, we get that
Similarly,
Note that by Lemma 4.7 x, y satisfy (31) or (32). Therefore the assumptions of Lemma 4.10 are satisfied for T^{n_0 q_s} x, T^{n_0 q_s} y, s, l.
II. m < s. Since l ≤ q_{m+1}/(8Cq_m) − 1, it is enough to show that T^{n_0 q_s} x, T^{n_0 q_s} y, m satisfy (31) or (32). Due to Lemma 4.7, we just have to check (30) for T^{n_0 q_s} x, T^{n_0 q_s} y, m: Since m ∉ K_α and m < s, we have
Moreover, by (26), ‖T^{n_0 q_s} x − T^{n_0 q_s} y‖ < v_m/10. Therefore, it is enough to show that for j ∈ {0, ..., q_m − 1} we have
For this aim, let i_0 and r_1 be such that ρ({T^{n_0 q_s} x + jα}
It follows by (3) that for i_0 ≠ j ∈ {0, ..., q_m − 1},
Next, by the fact that m ∉ K_α and x ∈ B_m (m ≥ s_0), we get that ‖x + i_0 α − a_{r_1}‖ ≥ 4v_m, and therefore (recall that n_0 ≤ q_{s+1}/(8Cq_s))
and (54) is thus proved. So in Case 2. at least one of (48) or (49) is satisfied for T^{n_0 q_s} x, T^{n_0 q_s} y, m, l.
Therefore, we can apply Lemma 4.10 to T^{n_0 q_s} x, T^{n_0 q_s} y, m, l (recall that l q_m ≤ κ n_0 q_s < (l + 1) q_m). Now, as in Case 1., if (50) holds we get A., and if (51) holds we get B. Indeed, assume WLOG that T^{n_0 q_s} x, T^{n_0 q_s} y, m, l satisfy (50) (the proof in the other case is analogous). Using (47) and the fact that [κ n_0 q_s] + 1 ≤ (l + 1) q_m ≤ 2κ n_0 q_s, we get for n = 1, ...,
So A. in Proposition 4.6 Part b. holds. The proof of Proposition 4.6 is thus completed and Theorem 2 follows.

Outline of the proof
The general scheme of the proof is similar to that of the proof of Theorem 2 (see the outline of the proof of the latter theorem in Subsection 4.1). Assume for simplicity that f has just one right-sided power singularity at 0 of type x^{−γ}. In this outline we will actually see that the constant type condition is necessary and sufficient for the proof of Theorem 4 that we give.
Indeed, the following facts are easy to check for an interval
a. If for some c ∈ (0, 1), R_α^i I is disjoint from [−c/q_n, c/q_n] for every i = 0, ..., l ≤ q_n, then |f'^{(l)}(θ)| ≤ C q_n^{1+γ} for any θ ∈ I, for some C that depends on c (with a similar statement for negative iterates).
b. If for some c ∈ (0, 1), for any θ ∈ I for some C that depends on c (with a similar statement for negative iterates).
c. If α is of constant type then there exist c' > c > 0 such that one of the following holds if n is sufficiently large: 1. there exists i_0 ≥ 0 such that
d. If α is not of constant type and if q_{n+1} ≫ q_n and y − x = ε_n/q_n^{1+γ} while |x − p/q_n| ≤ ε_n^2/q_n with ε_n → 0, then as long as for i ∈ [0, l] (or
Now, if α is of constant type, and if we assume that c.1 holds, then a. and b. imply that either q_n^{1+γ}] for every θ ∈ I. Since q_{n+1}/q_n is bounded, this implies a controlled macroscopic drift between the orbits of x and y (this is the content of Proposition 4.12 Part a). As in the proof of Theorem 2, we then need to use the same type of arguments to show that the drift remains almost constant during a small additional proportion of time (we do this in the future of i_0 + 1 or in the past of i_0; this is the content of Proposition 4.12 Part b and relies on Sublemma 4.16). The case c.2 is treated similarly.
In Sublemma 4.14 we essentially prove a. and in Sublemma 4.15 we essentially prove b.
We now explain why the constant type condition is necessary in our proof. Indeed, if α is not of constant type, d. gives an example of pairs x, y for which the drift between the forward orbits jumps from ε_n to 1/ε_n, and the same happens for backward orbits, which contradicts the SWR-property for this pair. Furthermore, if ε_n is taken to converge very slowly to 0, such pairs can be produced with x ∈ Z for any Z having positive measure. Observe that this does not imply that the SWR-property would not reappear much later in time, but this is very unlikely, as demonstrated by the absence of the WR-property in the particular case of Theorem 1.
Observe finally that the same type of pairs (x, y) described in d. show that it is necessary to use the SWR property instead of the WR property. Indeed, only one of the alternatives c.1 or c.2 holds for such pairs and we are obliged, if we want to see a controlled drift, to iterate in the future or in the past.

Notations and standing assumptions
Recall that in the proof of Theorem 4 T means R_α. We may assume WLOG that
Recall H > 0 coming from Lemma 4.4, D_1, D_2 > 0, the constants in the hypothesis (4) in Theorem 4 and define
where c is such that for every s ∈ N, q_{s+1} ≤ c q_s.
Fix ε ≪ 1 and N ∈ N. We will assume that ε <
Let s_0 ∈ N be such that q_{s_0−4} ≥ N/κ, and h(1/(2q_s)) > 6C for s ≥ s_0, and for
Define δ := . We will show that the SWR-property holds for all pairs of points x, y ∈ T with ‖x − y‖ < δ.

Controlling the drift
The following proposition implies Theorem 4 due to Proposition 3.3.
We can assume WLOG that x < y. Let s := s(x, y) be unique such that
As in the preceding section, Proposition 4.11 follows from
Proposition 4.12 Consider x, y ∈ T as in (58). Part a. There exists i_0 ∈ {0, ..., q_{s−2} − 1} such that
The rest of Sect. 4.2 is devoted to the proof of Proposition 4.12.
Consider the orbit x − q_{s−2}α, ..., x, ..., x + (q_{s−2} − 1)α (the length of this orbit is smaller than q_s). It follows by (3) that there exists at most one
Hence at least one of the following two holds:
The following Lemma directly implies Proposition 4.12.
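The "at most one close visit" argument can be illustrated numerically. We assume here that (3) is the standard spacing estimate ‖jα‖ > 1/(2q_n) for 0 < j < q_n, so that any orbit segment of length at most q_s has pairwise distances larger than 1/(2q_s), and hence at most one of its points lies within 1/(4q_s) of a given point. The radius 1/(4q_s), the golden-mean rotation and the helper names below are illustrative assumptions, not the constants used in the proof.

```python
import math

def circle_norm(t):
    """Distance from t to the nearest integer."""
    t = t % 1.0
    return min(t, 1.0 - t)

def min_gap(alpha, q):
    """min ||j*alpha|| over 0 < j < q: the smallest gap in an orbit segment of length q (q >= 2)."""
    return min(circle_norm(j * alpha) for j in range(1, q))

def close_visits(alpha, x, a, radius, start, stop):
    """Indices i in [start, stop) with ||x + i*alpha - a|| < radius."""
    return [i for i in range(start, stop) if circle_norm(x + i * alpha - a) < radius]

# Golden-mean rotation: its denominators are the Fibonacci numbers.
ALPHA = (math.sqrt(5) - 1) / 2
FIB = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]

if __name__ == "__main__":
    s = 10
    q_s, q_s_minus_2 = FIB[s], FIB[s - 2]
    # Orbit segment x - q_{s-2}*alpha, ..., x + (q_{s-2} - 1)*alpha, singularity at a = 0.
    hits = close_visits(ALPHA, 0.37, 0.0, 1 / (4 * q_s), -q_s_minus_2, q_s_minus_2)
    print(q_s, min_gap(ALPHA, q_s), hits)
```

Since at most one index in the two-sided segment comes close to the singularity, either the forward half or the backward half of the segment avoids it entirely, which is the source of the future/past alternative.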

Proof of Lemma 4.13
We will suppose (62) holds, the proof of the other case being analogous. We will need some lemmas.

Sublemma 4.15
There Indeed , is at least 1 q s−4 -dense. We have assumed that . We claim that 2c .

Indeed, the LHS of this inequality is equal to
. By (57), monotonicity of h , (4) twice (for s and s + 1) and (58) and the claim follows. Therefore, one of the numbers . As a consequence of the above lemmas, we obtain that at least one of the numbers belongs to the set P, and (59) is proved. The next result will give the proof of (61).

Sublemma 4.16
The following hold: for all 0 ≤ n ≤ κ(i_0 + 1), (69)
Proof First we show (69). Select (the unique) m ∈ N such that q_m ≥ κ(i_0 + 1) ≥ q_{m−1} (note that q_m ≤ q_s). By (3) applied to T^{i_0}(x), (67) and the fact that q_m ≤ q_s it follows that (72)
By (71) and using the same arguments that precede (64) we obtain (cf. (65)) for n = 0, ..., κ(i_0 + 1)
Then for i ∈ {1, ..., k}, again by repeating the arguments that lead to (66), we obtain
Using this and (73) we get
which yields the first case of (69). To handle the second case we use (72) and proceed as before to obtain first |f^{(−n)}(
and then estimating above by
We conclude exactly in the same way as in the first case.
We proceed to the proof of Lemma 4.13 in the case (62) is satisfied. If
We will hereafter assume that (T_t^{α,f})_{t∈R} has the R(t_0, P) property (see Definition 2.1) and obtain a contradiction. Thus, assume there exist a set Z ⊂ X^f with λ^f(Z) > 1 − ε and 0 < δ < ε such that for every (x, s), (y, s') ∈ Z with d^f((x, s), (y, s')
Consider
It follows that λ^f(V) > 1 − 4ε. The contradiction will come from the following two Propositions, the first one of which is a consequence of (75) and (76).
Remark 5.4 The points x ∈ W_0 go too close to the singularity under iteration by R_α, so that points of the form (x, s), (x + δ_0, s) split far apart before they get separated by a distance in P (Lemma 5.11 below). In other words, these points do not have the 'natural' WR-property that consists of a controlled drift starting from the first time the points split. To make sure these points cannot display the WR-property in the future, δ_0 is chosen in such a way, 1 M 1−γ , and Lemma 5.5 then precludes the WR-property (see Lemma 5.10 below).
Before we prove these propositions we will see how they imply Theorem 1.
Proof of Proposition 5.1
Lemma 5.5 Let x, y ∈ T and let I be an integer interval such that for every n ∈ I, |f^{(n)}(x) − f^{(n)}(y)| < η (where η is a sufficiently small number). Then
Proof We assume that x < y. Let s ∈ N be unique such that
Then, by the cocycle identity and the fact that a ∈ I, for n ∈ Z, we have
Let k ∈ N be unique such that
We will show that there exists n_0 ∈ [0, q_{k+1}] such that (82)
This, by (80), gives |f^{(n_0+a)}(x) − f^{(n_0+a)}(y)| > η and therefore n_0 + a ∉ I.
The following lemma translates (75) into a property on the Birkhoff sums above R α of the ceiling function f .
Proof Assume WLOG that x < y. Let n ∈ [M, M + L] and r_n be unique such that f^{(r_n)}(x) ≤ n + s < f^{(r_n+1)}(x). We will show that
a contradiction. Secondly, by the fact that (x, s) ∈ V (hence s < 1 2 ) and r_M ≥ N_0, using Lemma 3.1 to r_M and the definition of N (N 1 2 κ 2 ), we get (M ≥ N)
Analogously we prove that
We set m_r = m_r(n) := m_n − r_n ∈ Z to get ‖x − y − m_r α‖ < ε and |f^{(r_n)}(x) − f^{(r_n+m_r)}(y) − p| < 2ε. It follows that the number of different r_n ∈ [M_0, M_0 + L_0] is at least 2a(1 − ε)L_0 and hence (84) follows.

Proof of Proposition 5.1 Denote by
Moreover we assume that for every i = 1, ..., l, I_i is maximal in the sense
We will show that there exists i = 1, ..., l such that
This will obviously finish the proof of (77) with M = a_i, L = |I_i|, and m = m_i ∈ Z. Let us show (87). If l ≤ 2 there is nothing to prove. Assume l ≥ 3. Notice that U is the set of n's such that (x, s), (y, s') ∈ V are p, n-close. The next lemma implies that between any two disjoint integer intervals I_j, I_{j+1} ⊂ U, on which (x, s), (y, s') are p, n-close, there will be an integer interval J_j much longer than I_j, such that for any n ∈ J_j, (x, s), (y, s') are not p, n-close.
Sublemma 5.7 will give (87). Indeed, by the definition of J_i and I_i, it follows that for i, j = 2, ..., l − 1 with j ≠ i − 1, i, i + 1
Hence, Σ_{i=2}^{l−1} |J_i| ≤ 3L_0, and
Therefore, by the fact that |U| > aL_0, we have |I_1 ∪ I_l| > 2aL_0/3 and consequently |I_w| ≥ aL_0/3 for at least one of w = 1 or w = l. Hence to obtain (87) we just need to prove Sublemma 5.7.
Proof of Sublemma 5.7 Let v ∈ N be unique such that
Consider
Hence, for n ∈ I_i, |f^{(n−a_i)}(T^{a_i} x) − f^{(n−a_i)}(T^{a_i+m_i} y)| < 4ε. It follows now by Lemma 5.5 applied to η = 4ε, the points T^{a_i} x, T^{a_i+m_i} y and s = v that
. It follows by (3) with k = 1 and for the point T^{a_i} x, similarly to the proof of Theorem 4, that there exists at most one
Assume t_0 < 0. Then we consider L i . Moreover, we may assume that γ 1−γ > 2cC, and therefore, using (88) we obtain 1

Moreover, by Lemma 4.3 (the RHS of the inequality) to
(if necessary, to get the last inequality, we consider a bigger C). We set J_i = [b_i + 1, a_i + q_{v−2}] (b_i ≤ a_i + 2(4ε)^{1+γ} q_{v+1}, by (90)). It follows that for n ∈ J_i, we have by the cocycle identity
Now, by (90)
for some N l 1 depending on w, to be specified later,
Before we prove the above lemmas, let us first show how they imply Proposition 5.3.
Proof Fix any M ≥ N_2, any p ∈ P and any k ≠ 0 such that ‖x − y − kα‖ < ε.
We have thus proved (78) in Proposition 5.3. Let us now complete the proof by proving Sublemmas 5.8 and 5.9.
Proof of Sublemma 5.8 Fix v ∈ N. To simplify the notation, we will write δ_0 instead of δ_0^v. Given u ∈ N, set
Let t ∈ N be unique such that
Let c_1 = 4c; then q_{t+1} ≥ 4c q_{t−c_1} (since t depends on v, which is sufficiently large, t − c_1 > v − 4 by (107)).
We will show below that this set is not empty. Now, let δ_0 be any number in this
Suppose that for some k ≥ t − c_1 we have a closed interval E_k ⊂ [1/q_{t+1}, 2/q_{t+1}] ∩ ⋂_{i=t−c_1}^{k} B_i such that |E_k| ≥ 1/(c^{c_1+2} q_k). It follows that
Let E_{k+1} ⊂ E_k be the longest closed subinterval (in E_k) such that
It follows that E_{k+1} ⊂ E_k ⊂ [1/q_{t+1}, 2/q_{t+1}], and by (110),
To do this note that
Indeed, |E_k ∩ {iα}_{i=−q_{k+1}}^{q_{k+1}−1}| ≤ 4|E_k| q_{k+1} and around each point of the form iα, i = −q_{k+1}, ..., q_{k+1} − 1, we discard an interval of length 1/((k + 1)^2 q_{k+1}) (see (106), for u = k + 1). We use the induction assumption and the fact that k + 1 ≥ t − c_1 ≥ v − 4 (and v is sufficiently large) to obtain
Hence (109) is proved.
The proof of Sublemma 5.8 is thus finished.
By the fact that i 0 < q w−l and (99) it follows that x + jα − 0 > 1 This finishes the proof of (100). The proof of Sublemma 5.9 is complete.
This finishes the proof of Theorem 1.
This finishes the proof.