1 Introduction

Our starting point is the famous Kronecker–Weyl equidistribution theorem, the classical uniformity result concerning the irrational rotation sequence.

This says that the sequence \(\{q\alpha \}\), \(q=0,1,2,3,\ldots \), where \(\alpha \) is irrational and \(\{z\}\) denotes the fractional part of z, is uniformly distributed in the unit interval [0, 1), so that for any subinterval \([a,b)\subset [0,1)\), we have

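$$\begin{aligned} \lim _{N\rightarrow \infty }\frac{1}{N}\,\bigl |\{0\leqslant q\leqslant N-1:\{q\alpha \}\in [a,b)\}\bigr | =b-a. \end{aligned}$$
(1.1)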

It is easy to show that (1.1) holds for all \([a,b)\subset [0,1)\) if and only if it holds for all \([0,b)\subset [0,1)\). Furthermore, we can consider the more general sequence \(\{\tau \,{+}\,q\alpha \}\), \(q=0,1,2,3,\ldots \), with an arbitrary starting point \(\tau \in [0,1)\). Then for any \(\tau \in [0,1)\) and \(b\in [0,1)\), we have

$$\begin{aligned} \lim _{N\rightarrow \infty }\frac{1}{N}\,\bigl |\{0\leqslant q\leqslant N-1:\{\tau +q\alpha \}\in [0,b)\}\bigr | =b. \end{aligned}$$

This sequence is called the irrational rotation sequence because if we take a circle with circumference 1 and radius \(1/2\pi \), then the unit interval can be represented by this circle, and moving from one term to the next corresponds to an anticlockwise rotation by an angle \(2\pi \alpha \), as shown in Fig. 1.

Fig. 1. The irrational rotation sequence
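The equidistribution of the rotation sequence is easy to probe numerically. The following Python sketch is an illustration only: the function name `visit_fraction` and the sample values of \(\alpha \), \(\tau \) and \(b\) are our own choices, not taken from the text, and floating-point arithmetic merely approximates the fractional parts.

```python
import math

def visit_fraction(alpha, tau, b, n_terms):
    """Fraction of q in 0..n_terms-1 with {tau + q*alpha} in [0, b)."""
    hits = 0
    for q in range(n_terms):
        x = tau + q * alpha
        if x - math.floor(x) < b:   # fractional part lies in [0, b)
            hits += 1
    return hits / n_terms

alpha = math.sqrt(2) - 1            # a sample irrational rotation number
for n in (100, 10_000, 100_000):
    # the printed fractions should approach b = 0.3 as n grows
    print(n, visit_fraction(alpha, 0.25, 0.3, n))
```

For rational \(\alpha \) the same function typically returns a limit different from b, which is one way to see why irrationality is needed.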

The uniformity result concerning the irrational rotation sequence is the first equidistribution type result, proved independently by Bohl, Sierpiński and Weyl around 1910. It was soon followed by the multidimensional version and by the continuous version concerning the torus line, both due to Weyl. Birkhoff’s ergodic theorem, proved about 20 years later, shows that in general every ergodic measure-preserving transformation is a rich source of such results: it provides half-infinite orbits that exhibit equidistribution relative to the invariant measure.

An interesting problem studied about 50 years ago by Veech [15] is the following parity, or mod 2, version of the classical equidistribution theorem. Take two copies of the circle with circumference 1 and radius \(1/2\pi \), and mark off a segment [0, b) of length b in the anticlockwise direction on each circle. Let \(J_1=J_1(b)\) denote this segment on the first circle and let \(J_2=J_2(b)\) denote this segment on the second circle. We now take an irrational number \(\alpha \), and consider the discrete dynamical system illustrated in Fig. 2.

Fig. 2. The parity version of the classical equidistribution theorem

Start with an arbitrary point \(s_0\) on the first circle \(C_1\). Rotating in the anticlockwise direction by an angle \(2\pi \alpha \), we arrive at a point \(s_1\). If \(s_1\) does not lie on \(J_1\), then we leave it where it is. If \(s_1\) lies on \(J_1\), then we move it to the corresponding point on the second circle \(C_2\).

In general, suppose that the point \(s_i\) lies on the circle \(C_j\), where \(j=1,2\). Rotating in the anticlockwise direction by an angle \(2\pi \alpha \), we arrive at a point \(s_{i+1}\). If \(s_{i+1}\) does not lie on \(J_j\), then we leave it where it is. If \(s_{i+1}\) lies on \(J_j\), then we move it to the corresponding point on the other circle \(C_k\), where \(\{j,k\}=\{1,2\}\).

Clearly the sequence \(s_0,s_1,s_2,s_3,\ldots \) keeps alternating between the two circles. The problem is to describe the distribution of this half-infinite orbit on the union of the two circles, and find cases that exhibit equidistribution.
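The rule defining the 2-circle system can be written out as a short simulation. This is a minimal sketch, assuming positions are stored as numbers in [0, 1) and circles are labelled 1 and 2; the name `veech_orbit` and the sample parameters are hypothetical, and repeated floating-point addition accumulates small rounding errors over long orbits.

```python
import math

def veech_orbit(alpha, b, s0, n_steps):
    """Return [(position, circle)] for s_0..s_n, with circle in {1, 2}.

    Each step rotates by alpha (mod 1); the point changes circle
    exactly when its new position lies in the marked segment [0, b).
    """
    pos, circle = s0 % 1.0, 1          # s_0 starts on the first circle
    orbit = [(pos, circle)]
    for _ in range(n_steps):
        pos = (pos + alpha) % 1.0      # anticlockwise rotation by 2*pi*alpha
        if pos < b:                    # landed on J_j: move to the other circle
            circle = 3 - circle
        orbit.append((pos, circle))
    return orbit

orbit = veech_orbit(math.sqrt(2) - 1, 0.3, 0.1, 10)
for i, (pos, circle) in enumerate(orbit):
    print(f"s_{i} = {pos:.4f} on C_{circle}")
```

The positions alone form an ordinary rotation sequence; the interesting part of the problem is the circle label attached to each term.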

There are at least two different ways of visualizing the Veech discrete 2-circle system as a continuous flat dynamical system. This is motivated by the observation that the problem of torus lines with irrational slopes in the unit square as well as the problem of point billiards with initial irrational slopes on a square table are basically equivalent continuous representations of the problem concerning discrete irrational rotation sequences. More precisely, the 1-dimensional irrational rotation sequence arises from these two continuous 2-dimensional flat dynamical systems with irrational slopes via discretization, the general method of converting the problem of describing the distribution of a continuous orbit to the discrete problem of studying where the orbit hits the boundary.

We first discuss a simple continuous system which gives arguably the best way to visualize the Veech discrete 2-circle system. In this simple continuous model, we replace the 2-circle underlying set by a flat surface, and replace the discrete orbit by a geodesic, or generalized torus line. This flat surface, which we call the 2-square-b surface, is constructed by joining two unit squares side by side and adding an extra vertical barrier, a wall of length \(1-b\) between them, as shown in Fig. 3. The vertical complement of the barrier, indicated by the line in light gray, is a b-size gate, or b-gate, in the middle which makes it possible to travel from one square to the other. To make this a surface, we identify pairs of boundary edges with the same label via perpendicular translation. Note that the two sides of the vertical barrier in the middle are different edges. Note also that the 2-square-b surface actually has two b-gates. Apart from the obvious one in the middle, there is a second b-gate on the far right vertical edge \(v_1\) which is identified with the far left vertical edge \(v_1\). This is indeed a b-gate, since a geodesic that reaches the far right vertical edge \(v_1\) continues from the corresponding point on the far left vertical edge \(v_1\), and in doing so, passes from the right square to the left square.

Fig. 3. 2-square-b surface

Since the 2-square-b surface is a flat translation surface, geodesics on this surface are 1-direction generalized torus lines. If the slope of a geodesic is \(\alpha \), then we call it an \(\alpha \)-geodesic or \(\alpha \)-line.

Let us now clarify the connection between the Veech discrete 2-circle system in Fig. 2 and the 1-direction geodesic flow with slope \(\alpha \) on the 2-square-b surface in Fig. 4.

Fig. 4. A geodesic with slope \(\alpha \) on the 2-square-b surface

First of all, we can represent the two circles in Fig. 2 by two circles in the vertical direction in Fig. 4. We can first view the far left edges \(v_1\) and \(v_2\) of the 2-square-b surface as forming a circle, due to the identification of the point (0, 0) at the bottom with the point (0, 1) at the top. Thus we visualize the left vertical edge of the left square of the 2-square-b surface as the left circle in Fig. 2.

We can next view the middle edge \(v_3\) and the b-gate below it of the 2-square-b surface as forming a circle, due to the identification of the point (1, 0) at the bottom with the point (1, 1) at the top. Thus we visualize the left vertical edge of the right square of the 2-square-b surface as the right circle in Fig. 2.

Indeed, we can go back and forth between Figs. 2 and 4.

Consider the point \(s_0\) on the left circle in Fig. 2. Based on the representation of the two circles just discussed, we find \(s_0\) on the left vertical edge of the left square of the 2-square-b surface as shown in Fig. 4, and \(s_0\) is the initial point of the geodesic segment \(\textbf{1}\). The point \(s_1\) is obtained from \(s_0\) by rotating in the anticlockwise direction by an angle \(2\pi \alpha \), and we see from Fig. 2 that it does not lie on \(J_1\), so it stays on the left circle. Now the point \(s_1\) is related to the terminal point of the geodesic segment \(\textbf{1}\). As shown in Fig. 4, this terminal point lies on the edge \(v_2\) in the middle, but in view of the identification of the edges \(v_2\), we can place the point \(s_1\) on the left vertical edge of the left square of the 2-square-b surface that corresponds to the left circle.

As shown in Fig. 4, \(s_1\) is the initial point of the geodesic segment \(\textbf{2}\). The point \(s_2\) is obtained from \(s_1\) by rotating in the anticlockwise direction by an angle \(2\pi \alpha \), and we see from Fig. 2 that it lies on \(J_1\), so it moves to the corresponding point on the right circle. Now the point \(s_2\) is related to the terminal point of the geodesic segment \(\textbf{2}\). As shown in Fig. 4, this terminal point lies on the b-gate in the middle, on the left vertical edge of the right square of the 2-square-b surface that corresponds to the right circle.

As shown in Fig. 4, \(s_2\) is the initial point of the geodesic segment \(\textbf{3}\). The point \(s_3\) is obtained from \(s_2\) by rotating in the anticlockwise direction by an angle \(2\pi \alpha \), and we see from Fig. 2 that it does not lie on \(J_2\), so it stays on the right circle. Now the point \(s_3\) is related to the terminal point of the geodesic segment \(\textbf{3}\). As shown in Fig. 4, this terminal point lies on the edge \(v_3\) on the right, but in view of the identification of the edges \(v_3\), we can place the point \(s_3\) on the left vertical edge of the right square of the 2-square-b surface that corresponds to the right circle.

And so on.

There is a fundamental difference between torus line flow on a square and geodesic flow on the 2-square-b surface. Torus line flow in a square, or in any cube of higher dimensions, exhibits remarkable stability and predictability, where two particles moving on two parallel torus lines and close to each other with the same speed remain close forever. Thus such dynamical systems are said to be integrable.

How about the analogous question for geodesic flow on the 2-square-b surface? Here, there are singular points, and two particles moving with the same speed on two parallel geodesic segments close to each other do not remain close forever after they pass through opposite sides of a split singularity, as shown in Fig. 5.

Fig. 5. Singular points on the 2-square-b surface

Thus this dynamical system is said to be non-integrable.

We next discuss the second model, a billiard system due to Masur. Billiards have the advantage that they represent a more-or-less legitimate mechanical system, one step closer to physics. The billiard table in this second model is the underlying double-square of the 2-square-b surface. For convenience, we take a copy scaled by half, as shown in the picture on the left in Fig. 6.

The billiard flow is a 4-direction flow. The well-known trick of unfolding, first introduced by König and Szücs [6] in 1913, converts the 4-direction billiard flow on the table in the picture on the left in Fig. 6 to a 1-direction linear flow on the corresponding 4-copy flat surface, obtained by a reflection across the right vertical side, followed by a reflection of the whole image across the top horizontal side, as shown in the picture on the right in Fig. 6. Here the left and right vertical edges are identified, the top and bottom horizontal edges are identified, and the two sides of the two walls are appropriately identified as shown.

Fig. 6. The underlying double-square of the 2-square-b surface and a torus with two vertical walls

In particular, the right side of the left wall and the left side of the right wall, both indicated by \(+\), are identified, while the left side of the left wall and the right side of the right wall, both indicated by \(-\), are identified. Now a torus has genus 1. However, with the two walls, the surface in the picture on the right in Fig. 6 has genus 2. Moreover, 1-direction geodesic flow on this surface is a 4-fold covering of billiard flow on the table in the picture on the left in Fig. 6.

Let us now clarify the connection between the Veech discrete 2-circle system in Fig. 2 and the 1-direction geodesic flow with slope \(\alpha \) on the torus with two vertical walls in Fig. 7.

Fig. 7. A geodesic with slope \(\alpha \) on the torus with two vertical walls in the middle

First of all, we can represent the two circles in Fig. 2 by two circles in the vertical direction in Fig. 7. We can first view the right side of the left wall and its vertical extension to the points (1/2, 0) and (1/2, 1) as forming a circle, with the extension forming the b-gate, due to the identification of the point (1/2, 0) at the bottom with the point (1/2, 1) at the top. Thus we visualize this as the left circle in Fig. 2.

We can next view the right side of the right wall and its vertical extension to the points (3/2, 0) and (3/2, 1) as forming a circle, with the extension forming the b-gate, due to the identification of the point (3/2, 0) at the bottom with the point (3/2, 1) at the top. Thus we visualize this as the right circle in Fig. 2.

Indeed, we can go back and forth between Figs. 2 and 7.

Consider the point \(s_0\) on the left circle in Fig. 2. Based on the representation of the two circles just discussed, we find \(s_0\) on the right side of the left wall in Fig. 7, and \(s_0\) is the initial point of the geodesic segment \(\textbf{1}\). The point \(s_1\) is obtained from \(s_0\) by rotating in the anticlockwise direction by an angle \(2\pi \alpha \), and we see from Fig. 2 that it does not lie on \(J_1\), so it stays on the left circle. Now the point \(s_1\) is related to the terminal point of the geodesic segment \(\textbf{1}\). As shown in Fig. 7, this terminal point lies on the left side of the right wall, but in view of the identification of the left side of the right wall with the right side of the left wall, we can place the point \(s_1\) at the corresponding position on the right side of the left wall. This corresponds to the left circle.

As shown in Fig. 7, \(s_1\) is the initial point of the geodesic segment \(\textbf{2}\). The point \(s_2\) is obtained from \(s_1\) by rotating in the anticlockwise direction by an angle \(2\pi \alpha \), and we see from Fig. 2 that it lies on \(J_1\), so it moves to the corresponding point on the right circle. Now the point \(s_2\) is related to the terminal point of the geodesic segment \(\textbf{2}\). As shown in Fig. 7, this terminal point lies on the extension of the right side of the right wall that forms the b-gate. This corresponds to the right circle.

As shown in Fig. 7, \(s_2\) is the initial point of the geodesic segment \(\textbf{3}\). The point \(s_3\) is obtained from \(s_2\) by rotating in the anticlockwise direction by an angle \(2\pi \alpha \), and we see from Fig. 2 that it does not lie on \(J_2\), so it stays on the right circle. Now the point \(s_3\) is related to the terminal point of the geodesic segment \(\textbf{3}\). As shown in Fig. 7, this terminal point lies on the left side of the left wall, but in view of the identification of the left side of the left wall with the right side of the right wall, we can place the point \(s_3\) at the corresponding position on the right side of the right wall. This corresponds to the right circle.

And so on.

For the rest of this paper, we shall represent the Veech discrete 2-circle system as 1-direction geodesic flow on the 2-square-b surface.

Since we are not interested in periodic orbits, we shall always assume that the slope \(\alpha \) is irrational. The most natural question is then the following.

Question 1

Let \(\alpha \) be an irrational number. When can we guarantee that every half-infinite \(\alpha \)-geodesic on the 2-square-b surface is uniformly distributed?

An infinite discrete or continuous orbit is uniformly distributed if, given a nice test set A, the asymptotic proportion of time the orbit visits A is equal to the relative area of A. A classical result of Weyl [18] then says that it does not make any difference in the definition of uniformity of an infinite discrete or continuous orbit in the 2-dimensional case whether we choose the family of nice test sets to be (i) the class of all triangles, or (ii) the very different class of all circles, or (iii) the much larger class of all Jordan measurable sets which contains both (i) and (ii).

We recall that Jordan measurable means that the 2-dimensional Riemann integral of the characteristic function of the set is well defined.

In this paper uniformly distributed and equidistributed have the same meaning.

We can assume that the irrational number \(\alpha \) satisfies \(0<\alpha <1\). To see this, let \(n\in {\mathbb {Z}}\) be an arbitrary non-zero integer. Starting from the same point on a vertical edge of the 2-square-b surface, it is clear that the \(\alpha \)-geodesic and the corresponding \((\alpha \,{+}\,n)\)-geodesic intersect the three vertical edges of the 2-square-b surface at the same points. Thus if the \(\alpha \)-geodesic is equidistributed on the 2-square-b surface, then the corresponding \((\alpha \,{+}\,n)\)-geodesic is also equidistributed on the 2-square-b surface, and vice versa.

We shall formulate the main results of this long paper in Sect. 3. Before that, we discuss in Sect. 2 the interesting special case when \(b=\{m\alpha \}\) for some non-zero integer \(m\in {\mathbb {Z}}\). This special case, not considered by Veech [15], is in part simple and in part difficult.

Recall that geodesic flow on the 2-square-b surface is non-integrable, in view of the singularities in the orbit space, making it difficult to predict the long-term behavior of any given half-infinite geodesic. Assuming that a particle moves on the geodesic with constant speed, it is often difficult to predict which square contains the particle at any given time instant t, when t is large. On the other hand, there are only two candidates for the location of the particle, with one in each square, since the \(\alpha \)-flow on the 2-square-b surface modulo 1 reduces to a torus line in the unit square, giving rise to a well-predictable integrable system, namely, a straight line on the plane modulo 1. So the difficult question is which one of these two candidates is the true location of the particle.

The special case when \(b=\{m\alpha \}\) for some non-zero integer \(m\in {\mathbb {Z}}\) is simple, in the sense that there is a particularly simple and efficient algorithm that answers the question of which square. Indeed, this question is equivalent to the following number-theoretic parity type problem. Consider the infinite irrational rotation sequence

$$\begin{aligned} s_q=\{\tau +q\alpha \}, \quad q=0,1,2,3,\ldots , \end{aligned}$$

with arbitrary starting point \(\tau \geqslant 0\). For every \(N\in {\mathbb {N}}\), let \(\Psi (\alpha ;\tau ;b;N)\) denote the number of integers q satisfying \(0\leqslant q\leqslant N-1\) such that \(0\leqslant s_q<b\). It is easy to see that the parity of \(\Psi (\alpha ;\tau ;b;N)\) answers the question of which square contains the particle. This follows from discretization of the \(\alpha \)-geodesic and studying the consecutive intersection points on the vertical edges of the 2-square-b surface. Note that an \(\alpha \)-geodesic moves from one square to the other if and only if it crosses one of the two b-gates, and any two consecutive gate crossings always happen with different b-gates.
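The link between the parity of \(\Psi \) and the square containing the particle can be made concrete in a few lines of Python. This is a sketch under a convention that is our assumption and is not spelled out in the text: the geodesic starts on the far left vertical edge and first runs through the left square, so the starting point itself, if it lies in a gate, contributes to \(\Psi \) but not to the number of actual square changes. The names `psi` and `square_of_segment` are hypothetical.

```python
import math

def frac(x):
    """Fractional part of x."""
    return x - math.floor(x)

def psi(alpha, tau, b, n_terms):
    """Psi(alpha; tau; b; N): number of q in 0..N-1 with 0 <= s_q < b."""
    return sum(1 for q in range(n_terms) if frac(tau + q * alpha) < b)

def square_of_segment(alpha, tau, b, q):
    """Square (0 = left, 1 = right) containing the geodesic segment that
    runs from the hitting point s_q to s_{q+1}, assuming the geodesic
    starts on the far left edge and first travels through the left square.
    """
    square = 0
    for i in range(1, q + 1):
        if frac(tau + i * alpha) < b:   # s_i lies in a b-gate: change square
            square = 1 - square
    return square

alpha, tau, b = math.sqrt(2) - 1, 0.6, 0.3
for q in range(6):
    # here s_0 = 0.6 is not in a gate, so the two columns should agree
    print(q, psi(alpha, tau, b, q + 1) % 2, square_of_segment(alpha, tau, b, q))
```

When \(s_0<b\), the two quantities differ by the fixed offset 1 modulo 2 under this convention, so the parity of \(\Psi \) still determines the square.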

We first consider the special case \(0<b=\alpha <1\) and \(\tau =0\). For \(N\geqslant 2\), we have

$$\begin{aligned} \Psi (\alpha ;0;\alpha ;N)=\lceil (N-1)\alpha \rceil , \end{aligned}$$
(1.2)

where \(\lceil \beta \rceil \) denotes the upper integral part of a real number \(\beta \). To see this, consider the numbers

$$\begin{aligned} \tau +q\alpha , \quad 0\leqslant q\leqslant N-1. \end{aligned}$$
(1.3)

Since \(\tau =0\), they clearly fall into the interval \([0,(N-1)\alpha ]\). Now for every integer n satisfying \(0\leqslant n<(N-1)\alpha \), there is a unique number in (1.3) such that \(\tau +q\alpha \in [n,n+\alpha )\), so that \(0\leqslant s_q<b\). On the other hand, \(0\leqslant s_q<b\) if and only if \(\tau +q\alpha \in [n,n+\alpha )\) for some integer n satisfying \(0\leqslant n<(N-1)\alpha \).

A somewhat similar argument shows that for every integer \(N\geqslant 2\), we have

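$$\begin{aligned} \Psi (\alpha ;\tau ;\alpha ;N)= \begin{cases} \lfloor \{\tau \}+(N-1)\alpha \rfloor +1, &{} \hbox {if } 0\leqslant \{\tau \}<b, \\ \lfloor \{\tau \}+(N-1)\alpha \rfloor , &{} \hbox {if } b\leqslant \{\tau \}<1, \end{cases} \end{aligned}$$
(1.4)

where \(b=\alpha \) and \(\lfloor \beta \rfloor \) denotes the lower integral part of a real number \(\beta \). Note that in the first case \(0\leqslant \{\tau \}<b\), the first term \(s_0<b\), whereas in the second case \(b\leqslant \{\tau \}<1\), the first term \(s_0\geqslant b\).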
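The counting argument for \(b=\alpha \) can be checked against a direct count. The sketch below compares brute-force values of \(\Psi \) with the closed form \(\lfloor \{\tau \}+(N-1)\alpha \rfloor \), increased by 1 when \(\{\tau \}<\alpha \). This closed form is our reading of the argument and should be treated as an assumption; for \(\tau =0\) and \(N\geqslant 2\) it equals \(\lceil (N-1)\alpha \rceil \). The function names and sample values are hypothetical.

```python
import math

def frac(x):
    """Fractional part of x."""
    return x - math.floor(x)

def psi_brute(alpha, tau, n_terms):
    """Count q in 0..N-1 with {tau + q*alpha} < alpha (the case b = alpha)."""
    return sum(1 for q in range(n_terms) if frac(tau + q * alpha) < alpha)

def psi_closed(alpha, tau, n_terms):
    """floor({tau} + (N-1)*alpha), plus 1 when {tau} < alpha."""
    t = frac(tau)
    return math.floor(t + (n_terms - 1) * alpha) + (1 if t < alpha else 0)

alpha = math.sqrt(2) - 1
for tau in (0.0, 0.2, 0.7):
    agree = all(psi_brute(alpha, tau, n) == psi_closed(alpha, tau, n)
                for n in range(2, 300))
    print(tau, agree)
```

The comparison relies on floating-point fractional parts, so it is a sanity check for moderate N rather than a proof.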

We next consider the special case \(0<b=\{2\alpha \}<1\) and \(\tau =0\). Here we apply (1.4) to each of the two subsequences

$$\begin{aligned} s_0,s_2,s_4,s_6,\ldots \quad \hbox {and}\quad s_1,s_3,s_5,s_7,\ldots , \end{aligned}$$

with the same gap \(b=\{2\alpha \}\).

Suppose first that \(0<\alpha <1/2\), so that \(b=\{2\alpha \}=2\alpha \). Then

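$$\begin{aligned} \Psi (\alpha ;0;2\alpha ;N) =\left\lfloor \left( \left\lceil \tfrac{N}{2}\right\rceil -1\right) 2\alpha \right\rfloor +\left\lfloor \alpha +\left( \left\lfloor \tfrac{N}{2}\right\rfloor -1\right) 2\alpha \right\rfloor +2, \quad N\geqslant 2. \end{aligned}$$
(1.5)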

Note here that \(s_1=\alpha <2\alpha =b\).

Suppose next that \(1/2<\alpha <1\), so that \(b=\{2\alpha \}=2\alpha -1\). Then

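$$\begin{aligned} \Psi (\alpha ;0;2\alpha -1;N) =\left\lfloor \left( \left\lceil \tfrac{N}{2}\right\rceil -1\right) (2\alpha -1)\right\rfloor +\left\lfloor \alpha +\left( \left\lfloor \tfrac{N}{2}\right\rfloor -1\right) (2\alpha -1)\right\rfloor +1, \quad N\geqslant 2. \end{aligned}$$
(1.6)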

Note here that \(s_1=\alpha >2\alpha -1=b\).

For the special case \(0<b=\{3\alpha \}<1\), we apply (1.4) separately to each of the three subsequences

$$\begin{aligned} s_0,s_3,s_6,s_9,\ldots , \quad s_1,s_4,s_7,s_{10},\ldots , \quad s_2,s_5,s_8,s_{11},\ldots , \end{aligned}$$

with the same gap \(b=\{3\alpha \}\).

And so on. In general, for any \(b=\{m\alpha \}\), where \(m\in {\mathbb {Z}}\) is non-zero, we obtain an analogous explicit formula for \(\Psi (\alpha ;\tau ;b;N)\), and this gets more complicated as m increases. Nevertheless, it is not difficult to determine from such an explicit formula the parity of \(\Psi (\alpha ;\tau ;b;N)\), and this parity tells us which square contains the particle. This explains why, on the one hand, we say that this special case when \(b=\{m\alpha \}\) for some non-zero integer \(m\in {\mathbb {Z}}\) is simple. More precisely, we may call it a non-integrable dynamical system with very low algorithmic complexity.
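The splitting into m subsequences can be turned into a short routine for general \(b=\{m\alpha \}\). The sketch below treats each subsequence \(s_r,s_{r+m},s_{r+2m},\ldots \) as a rotation sequence with gap b and applies to it the count \(\lfloor t+(M-1)b\rfloor \), plus 1 when \(t<b\), where t is its first term and M its number of terms. This per-subsequence count is our assumption about the form of the explicit formula, and the function names are hypothetical.

```python
import math

def frac(x):
    """Fractional part of x."""
    return x - math.floor(x)

def psi_brute(alpha, tau, b, n_terms):
    """Direct count of q in 0..N-1 with {tau + q*alpha} < b."""
    return sum(1 for q in range(n_terms) if frac(tau + q * alpha) < b)

def psi_by_subsequences(alpha, tau, m, n_terms):
    """Psi for b = {m*alpha}, m >= 1, summed over the residue classes of q mod m."""
    b = frac(m * alpha)
    total = 0
    for r in range(min(m, n_terms)):
        start = frac(tau + r * alpha)          # first term s_r of the subsequence
        count = (n_terms - r + m - 1) // m     # number of q = r, r+m, ... below N
        total += math.floor(start + (count - 1) * b) + (1 if start < b else 0)
    return total

alpha = math.sqrt(2) - 1
for m in (1, 2, 3, 4):
    b = frac(m * alpha)
    print(m, psi_brute(alpha, 0.3, b, 100), psi_by_subsequences(alpha, 0.3, m, 100))
```

The closed form needs about m floor evaluations, independent of N, which is one way to read the phrase "very low algorithmic complexity".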

On the other hand, this special case is still quite difficult. For instance, even in the totally innocent looking special case \(0<b=\{2\alpha \}\) with \(1/2<\alpha <1\), it is not easy at all to determine whether a half-infinite \(\alpha \)-geodesic is equidistributed on the whole 2-square-b surface.

We conclude Sect. 1 with a simple technical observation. To study the special case \(b=\{m\alpha \}\) for some non-zero integer \(m\in {\mathbb {Z}}\), it suffices to consider only positive integers m. This follows on combining the trivial identity \(b=\{m\alpha \}=\{-m(1-\alpha )\}\) with the following result.

Lemma 1.1

A half-infinite \(\alpha \)-geodesic on the 2-square-b surface with starting point (0, x) lying on the far left vertical edge of the surface is equidistributed on the surface if and only if the half-infinite \((1-\alpha )\)-geodesic with starting point \((0,\{b+1-x\})\) lying on the same far left vertical edge of the surface is equidistributed on the surface.

Proof

The proof follows from combining three simple transformations.

The first simple transformation is illustrated in Fig. 8. It maps an \(\alpha \)-geodesic with starting point (0, x) on the far left vertical edge of the 2-square-b surface to an \((\alpha \,{-}\,1)\)-geodesic with the same starting point (0, x) on the far left vertical edge of the 2-square-b surface. It is clear that they hit the same point \((1,\{x\,{+}\,\alpha \})\) on the middle vertical line of the 2-square-b surface. The first geodesic is equidistributed on the 2-square-b surface if and only if the second geodesic is equidistributed on the 2-square-b surface.

Fig. 8. \(\alpha \)-geodesic and \((\alpha -1)\)-geodesic

The second simple transformation is illustrated in Fig. 9. It maps an \((\alpha \,{-}\,1)\)-geodesic with starting point (0, x) on the far left vertical edge of the 2-square-b surface to a \((1\,{-}\,\alpha )\)-geodesic with starting point \((0,1-x)\) on the far left vertical edge of the 2-square-b surface reflected across the horizontal line \(y=1/2\). It is clear that the first geodesic hits the point \((1,\{x\,{+}\,\alpha \})\) on the middle vertical line of the 2-square-b surface, whereas the second geodesic hits the point \((1,1-\{x\,{+}\,\alpha \})\) on the middle vertical line of the 2-square-b surface reflected across the horizontal line \(y=1/2\). The first geodesic is equidistributed on the 2-square-b surface if and only if the second geodesic is equidistributed on the 2-square-b surface reflected across the horizontal line \(y=1/2\).

Fig. 9. \((\alpha -1)\)-geodesic and \((1-\alpha )\)-geodesic

The third simple transformation is illustrated in Fig. 10, which also shows that the 2-square-b surface can be recovered from the 2-square-b surface reflected across the horizontal line \(y=1/2\) by a vertical translation by b modulo 1. It now maps a \((1\,{-}\,\alpha )\)-geodesic with starting point \((0,1-x)\) on the far left vertical edge of the 2-square-b surface reflected across the horizontal line \(y=1/2\) to a \((1\,{-}\,\alpha )\)-geodesic with starting point \((0,\{b+1-x\})\) on the far left vertical edge of the 2-square-b surface. It is clear that the two geodesics hit corresponding points on the middle vertical line of their respective 2-square-b surfaces. The first geodesic is equidistributed on the 2-square-b surface reflected across the horizontal line \(y=1/2\) if and only if the second geodesic is equidistributed on the 2-square-b surface. \(\square \)

Fig. 10. Vertical translation by b modulo 1

Remark 1.2

Strictly speaking, the 2-square-b surface reflected across the horizontal line \(y=1/2\) followed by a vertical translation by b modulo 1 leads to another copy of the 2-square-b surface if and only if the gates are open intervals or closed intervals. However, for formulas such as (1.2) and (1.4)–(1.6) to hold precisely, the gates and barriers need to be intervals that are closed at the bottom end and open at the top end. In any case, an \(\alpha \)-geodesic can hit any singularity of the 2-square-b surface at most once, so equidistribution is not affected by altering the openness or closedness of the gates.

2 Some interesting special cases, and polygonal invariant sets

We briefly consider the special case \(b=\{m\alpha \}\) for some non-zero integer \(m\in {\mathbb {Z}}\). As explained in Sect. 1, we may assume that m is positive and \(0<\alpha <1\).

Case \(\varvec{m}=\textbf{1}\). In the special case \(0<b=\alpha <1\), we can show equidistribution for any half-infinite \(\alpha \)-geodesic with irrational \(\alpha \). Here is a relatively simple proof. The idea is summarized in Fig. 11.

Fig. 11. The case \(0<b=\alpha <1\)

Consider the sequence \(\tau +q\alpha \), \(q=0,1,2,3,\ldots \) Without loss of generality, we can assume that \(0\leqslant \tau <1\). Consider a geodesic on the 2-square-b surface with slope \(\alpha \), starting at a point on the left vertical edge at height \(\tau \), and hitting the vertical edges of the 2-square-b surface successively at height \(\{\tau \,{+}\,q\alpha \}\), \(q=0,1,2,3,\ldots \) For every such integer q, consider the following assertion:

P(q): The condition that \(\tau +q\alpha \bmod {2}\) lies in [0, 1) corresponds to a hitting point on the 2-square-b surface on the left vertical edge, or on the left side of the middle vertical edge above the gate, or on the right vertical edge at the gate, while the condition that \(\tau +q\alpha \bmod {2}\) lies in [1, 2) corresponds to a hitting point on the 2-square-b surface on the right vertical edge above the gate, or on the right side of the middle vertical edge above the gate, or on the middle vertical edge at the gate.

It is clear that P(0) holds by definition. Assume now that P(k) holds for some integer k.

Suppose first that \(\tau +k\alpha \bmod {2}\) is in [0, 1). In view of vertical edge identification, we may assume without loss of generality that the corresponding hitting point of \(\tau +k\alpha \) lies on the left vertical edge. We have one of the following two possibilities:

  (i) If \(\tau +(k+1)\alpha \bmod {2}\) is in [0, 1), then since \(0<\alpha <1\), we must have

    $$\begin{aligned} \{\tau +(k+1)\alpha \}=\{\tau +k\alpha \}+\alpha . \end{aligned}$$
    (2.1)

    It follows that \(\{\tau +(k+1)\alpha \}\geqslant \alpha \), so that the corresponding hitting point of \(\tau +(k+1)\alpha \) is on the left side of the middle vertical edge above the gate, and so \(P(k+1)\) holds.

  (ii) If \(\tau +(k+1)\alpha \bmod {2}\) is in [1, 2), then since \(0<\alpha <1\), we must have

    $$\begin{aligned} \{\tau +(k+1)\alpha \}=\{\tau +k\alpha \}+\alpha -1. \end{aligned}$$
    (2.2)

    It follows that \(\{\tau +(k+1)\alpha \}<\alpha \), so that the corresponding hitting point of \(\tau +(k+1)\alpha \) is on the middle vertical edge at the gate, and so \(P(k+1)\) holds.

Suppose next that \(\tau +k\alpha \bmod {2}\) is in [1, 2). In view of vertical edge identification, we may assume without loss of generality that the corresponding hitting point of \(\tau +k\alpha \) lies on the right side of the middle vertical edge above the gate, or on the middle vertical edge at the gate. We again have two possibilities:

  (i) If \(\tau +(k+1)\alpha \bmod {2}\) is in [0, 1), then since \(0<\alpha <1\), we must have (2.2). It follows that \(\{\tau +(k+1)\alpha \}<\alpha \), so that the corresponding hitting point of \(\tau +(k+1)\alpha \) is on the right vertical edge at the gate, and so \(P(k+1)\) holds.

  (ii) If \(\tau +(k+1)\alpha \bmod {2}\) is in [1, 2), then since \(0<\alpha <1\), we must have (2.1). It follows that \(\{\tau +(k+1)\alpha \}\geqslant \alpha \), so that the corresponding hitting point of \(\tau +(k+1)\alpha \) is on the right vertical edge above the gate, and so \(P(k+1)\) holds.

Thus the statement P(q) holds for every \(q=0,1,2,3,\ldots \)

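Finally, note that since \(\alpha /2\) is irrational, the sequence \(\tau +q\alpha \bmod {2}\), \(q=0,1,2,3,\ldots \), is uniformly distributed in the double interval [0, 2). Together with P(q), this yields the desired equidistribution.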
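In computational terms, assertion P(q) says that for \(b=\alpha \) and \(0\leqslant \tau <1\) the square entered at the q-th hitting point is given by the parity of \(\lfloor \tau +q\alpha \rfloor \). The following sketch compares this prediction with a direct simulation of the gate crossings. The function name, the labelling 0 = left and 1 = right, and the sample values are our assumptions.

```python
import math

def squares_for_b_equal_alpha(alpha, tau, n_hits):
    """Simulate the gate crossings for b = alpha and return, for each
    q = 0..n_hits-1, the square (0 = left, 1 = right) that the geodesic
    enters at its q-th hitting point; assumes 0 <= tau < 1 and a start
    on the far left vertical edge."""
    square, out = 0, [0]
    for q in range(1, n_hits):
        x = tau + q * alpha
        if x - math.floor(x) < alpha:    # hitting point lies in a gate
            square = 1 - square
        out.append(square)
    return out

alpha, tau = math.sqrt(2) - 1, 0.3
simulated = squares_for_b_equal_alpha(alpha, tau, 1000)
predicted = [math.floor(tau + q * alpha) % 2 for q in range(1000)]
print(simulated == predicted)
```

Since the prediction depends only on \(\tau +q\alpha \) modulo 2, the question of which square reduces in this case to a rotation on a circle of circumference 2.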

Next come some surprises.

Case \(\varvec{m}=\textbf{2}\). A pleasant first surprise comes from the special case \(b=\{2\alpha \}\) with \(0<\alpha <1/2\), so that \(0<b=2\alpha <1\). Figure 12 summarizes a very quick proof that any \(\alpha \)-geodesic on the 2-square-b surface is neither dense nor equidistributed.

Fig. 12. The case \(b=\{2\alpha \}\) with \(0<\alpha <1/2\)

Note that any \(\alpha \)-geodesic on the 2-square-b surface modulo 1 reduces to a torus line of slope \(\alpha \) in the unit square, and we know that this projected torus line is uniformly distributed as long as \(\alpha \) is irrational. On the other hand, Fig. 12 shows two invariant subsets of the 2-square-b surface under geodesic flow of slope \(\alpha \). It is easy to see that any \(\alpha \)-geodesic that passes through the shaded part of the 2-square-b surface remains forever in the shaded part and never reaches the white part, and vice versa, so it is not dense on the 2-square-b surface.

It is easy to see that an \(\alpha \)-geodesic in the shaded part has visit density \(\alpha \) on the left square of the surface and \(1-\alpha \) on the right square of the surface. Likewise, an \(\alpha \)-geodesic in the white part has visit density \(\alpha \) on the right square of the surface and \(1-\alpha \) on the left square of the surface. Since \(\alpha \ne 1/2\), this means that there cannot possibly be equidistribution.

Note also from Fig. 12 that the square-crossings, i.e., instances of passing from one square to the other, occur in pairs along any \(\alpha \)-geodesic.
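The failure of equidistribution for \(b=2\alpha \) with \(0<\alpha <1/2\) shows up clearly in a simulation. The sketch below estimates the proportion of time an \(\alpha \)-geodesic spends in the left square by following the hitting points \(\{\tau +q\alpha \}\) and changing square at every gate crossing. The function name, the choice of starting heights, and the convention that the geodesic starts in the left square are our assumptions.

```python
import math

def left_square_density(alpha, b, tau, n_hits):
    """Fraction of the first n_hits square-crossing segments of the
    geodesic that lie in the left square, for a start at height tau
    on the far left vertical edge."""
    square, in_left = 0, 0
    for q in range(n_hits):
        if q > 0:
            x = tau + q * alpha
            if x - math.floor(x) < b:    # gate crossing at the q-th hit
                square = 1 - square
        in_left += (square == 0)
    return in_left / n_hits

alpha = math.sqrt(2) - 1                 # sample irrational in (0, 1/2)
b = 2 * alpha
for tau in (0.1, 0.7):
    # expected to be close to alpha for tau = 0.1 and to 1 - alpha for tau = 0.7,
    # and in neither case close to 1/2
    print(tau, left_square_density(alpha, b, tau, 100_000))
```

Equidistribution on the whole surface would force both values to be close to 1/2, which is not what the two starting heights produce.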

Remark 2.1

It is easy to see that a similar argument works for the special case \(b=\{m\alpha \}\) for any even positive integer m with \(0<\alpha <1/m\): any \(\alpha \)-geodesic on the 2-square-b surface is then neither dense nor equidistributed.

A second surprise is that the case \(b=\{2\alpha \}\) with \(1/2<\alpha <1\) turns out to be completely different from when \(0<\alpha <1/2\). In this case, every half-infinite \(\alpha \)-geodesic on the 2-square-b surface with irrational \(\alpha \) is equidistributed. We do not have a quick proof of this result. It follows instead from the general Theorem 2.5 which we shall state later in this section. This general result has a fairly non-trivial proof.

Case \(\varvec{m}=\textbf{3}\). The special case \(b=\{3\alpha \}\) gives rise to equidistribution for every half-infinite \(\alpha \)-geodesic on the 2-square-b surface with irrational \(\alpha \). Again, we do not know a quick proof, and refer the reader to Theorem 2.5.

Next come more surprises.

Case \(\varvec{m}=\textbf{4}\). Let us first consider the special case \(b=\{4\alpha \}\) with

$$\begin{aligned} 0<\alpha<\tfrac{1}{4} \quad \hbox {or}\quad \tfrac{1}{2}<\alpha<\tfrac{2}{3} \quad \hbox {or}\quad \tfrac{2}{3}<\alpha <\tfrac{3}{4}. \end{aligned}$$

Figures 13, 14 and 15 summarize very quick proofs that any \(\alpha \)-geodesic on the 2-square-b surface is neither dense nor equidistributed.

In Fig. 13, an \(\alpha \)-geodesic in the shaded part has visit density \(2\alpha \) on the left square of the surface and \(1-2\alpha \) on the right square of the surface.

Fig. 13. The case \(b=\{4\alpha \}\) with \(0<\alpha <1/4\)

In Fig. 14, an \(\alpha \)-geodesic in the shaded part has visit density

$$\begin{aligned} (\{3\alpha \}-\alpha )+\{2\alpha \} =(3\alpha -1-\alpha )+(2\alpha -1) =4\alpha -2 \end{aligned}$$

on the left square of the surface and

$$\begin{aligned} (1-\{3\alpha \})&+(\alpha -\{4\alpha \})+\{2\alpha \} \\&=(1-3\alpha +1)+(\alpha -4\alpha +2)+(2\alpha -1) =3-4\alpha \end{aligned}$$

on the right square of the surface.

Fig. 14. The case \(b=\{4\alpha \}\) with \(1/2<\alpha <2/3\)

In Fig. 15, an \(\alpha \)-geodesic in the shaded part has visit density

$$\begin{aligned} (\alpha -\{2\alpha \})+\{3\alpha \} =(\alpha -2\alpha +1)+(3\alpha -2) =2\alpha -1 \end{aligned}$$

on the left square of the surface and

$$\begin{aligned} (1-\{4\alpha \})&+(\alpha -\{2\alpha \})+\{3\alpha \} \\&=(1-4\alpha +2)+(\alpha -2\alpha +1)+(3\alpha -2) =2-2\alpha \end{aligned}$$

on the right square of the surface.

Fig. 15. The case \(b=\{4\alpha \}\) with \(2/3<\alpha <3/4\)
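The visit densities read off from Figs. 14 and 15 can be checked numerically. The following is a minimal sketch, not part of the argument: it evaluates the fractional-part expressions above at one sample slope in each range and compares them with the closed forms \(4\alpha -2\), \(3-4\alpha \), \(2\alpha -1\) and \(2-2\alpha \). The function names and the sample slopes are illustrative choices.

```python
import math


def frac(x):
    """Fractional part {x} of a real number x."""
    return x - math.floor(x)


def densities_fig14(alpha):
    """(left, right) visit densities for b = {4*alpha}, 1/2 < alpha < 2/3."""
    left = (frac(3 * alpha) - alpha) + frac(2 * alpha)
    right = (1 - frac(3 * alpha)) + (alpha - frac(4 * alpha)) + frac(2 * alpha)
    return left, right


def densities_fig15(alpha):
    """(left, right) visit densities for b = {4*alpha}, 2/3 < alpha < 3/4."""
    left = (alpha - frac(2 * alpha)) + frac(3 * alpha)
    right = (1 - frac(4 * alpha)) + (alpha - frac(2 * alpha)) + frac(3 * alpha)
    return left, right


golden = (math.sqrt(5) - 1) / 2   # sample slope in (1/2, 2/3)
root_half = math.sqrt(2) / 2      # sample slope in (2/3, 3/4)

# Compare with the closed forms (4a - 2, 3 - 4a) and (2a - 1, 2 - 2a).
print(densities_fig14(golden), (4 * golden - 2, 3 - 4 * golden))
print(densities_fig15(root_half), (2 * root_half - 1, 2 - 2 * root_half))
```

In each case the two densities add up to 1, and neither is equal to 1/2 since \(\alpha \) is irrational.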

However, for the special case \(b=\{4\alpha \}\) with \(1/4<\alpha <1/2\) or \(3/4<\alpha <1\), every half-infinite \(\alpha \)-geodesic on the 2-square-b surface with irrational \(\alpha \) is equidistributed. Again, we do not know a quick proof, and refer the reader to Theorem 2.5.

At first sight this case study may seem hopelessly complicated and mysterious. However, there is a simple underlying rule that explains everything. We call this the Double Even Criterion.

If the Double Even Criterion fails, then every half-infinite \(\alpha \)-geodesic on the 2-square-b surface with irrational \(\alpha \) is equidistributed. This case forms the hard part of the case \(n=2\) of Theorem 2.5.

On the other hand, if the Double Even Criterion holds, then there is a reasonably simple algorithm to construct two non-trivial \(\alpha \)-flow invariant subsets of the 2-square-b surface. Clearly density and equidistribution for any \(\alpha \)-geodesic on the 2-square-b surface are impossible.

Let \(b=\{m\alpha \}\), where \(m\geqslant 2\) is an integer and \(\alpha \) is an irrational number satisfying \(0<\alpha <1\). We take the parameter \(\Upsilon (m;\alpha )\) to denote the total number of integers q such that \(1\leqslant q\leqslant m\) and \(\{q\alpha \}<\alpha \). For example, as clearly shown in Figs. 13, 14 and 15, we have

$$\begin{aligned} \Upsilon (4;\alpha )={\left\{ \begin{array}{ll} \,0,&{}\hbox {if}\;\; 0<\alpha<1/4,\\ \,2,&{}\hbox {if}\;\; 1/2<\alpha<2/3,\\ \,2,&{}\hbox {if}\;\; 2/3<\alpha <3/4. \end{array}\right. } \end{aligned}$$
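The parameter \(\Upsilon (m;\alpha )\) is straightforward to compute from its definition. The sketch below uses floating point arithmetic, which is an assumption that is harmless for generic slopes but could misjudge a comparison if some \(\{q\alpha \}\) were extremely close to \(\alpha \). The function name `upsilon` and the sample slopes are illustrative.

```python
import math


def upsilon(m, alpha):
    """Count the integers q with 1 <= q <= m and {q*alpha} < alpha."""
    return sum(1 for q in range(1, m + 1) if (q * alpha) % 1.0 < alpha)


# One sample slope in each of the three ranges displayed above.
samples = {
    "0 < alpha < 1/4": math.sqrt(2) / 10,
    "1/2 < alpha < 2/3": (math.sqrt(5) - 1) / 2,
    "2/3 < alpha < 3/4": math.sqrt(2) / 2,
}
for label, alpha in samples.items():
    print(label, upsilon(4, alpha))
```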

Double Even Criterion

The integer \(m\geqslant 2\) and the parameter \(\Upsilon (m;\alpha )\) are both even.

The Double Even Criterion is a special case of a more general criterion which applies to the n-square-b surface for any integer \(n\geqslant 2\), the natural generalization of the 2-square-b surface to a surface consisting of a horizontal row of n consecutive unit squares with \(n-1\) b-size gates between the squares, and with appropriate edge identification. The 3-square-b surface is shown in Fig. 16.

Fig. 16. The 3-square-b surface

GCD Criterion

For the n-square-b surface with \(b=\{m\alpha \}\), the greatest common divisor d of the three integers n, m and \(\Upsilon (m;\alpha )\) satisfies \(d>1\).
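For concrete values of n, m and \(\alpha \), the GCD Criterion can be tested directly. The following sketch is illustrative only: the helper names are our own, and floating point arithmetic is assumed to be accurate enough for the sample slopes. Note that the greatest common divisor of n, m and 0 is the greatest common divisor of n and m, which covers the case \(\Upsilon (m;\alpha )=0\).

```python
import math
from functools import reduce


def upsilon(m, alpha):
    """Count the integers q with 1 <= q <= m and {q*alpha} < alpha."""
    return sum(1 for q in range(1, m + 1) if (q * alpha) % 1.0 < alpha)


def gcd_criterion_divisor(n, m, alpha):
    """Greatest common divisor d of n, m and Upsilon(m; alpha)."""
    return reduce(math.gcd, (n, m, upsilon(m, alpha)))


def gcd_criterion_holds(n, m, alpha):
    """True if d > 1, i.e. if the GCD Criterion holds."""
    return gcd_criterion_divisor(n, m, alpha) > 1


# The case n = 2 is the Double Even Criterion.
print(gcd_criterion_holds(2, 4, math.sqrt(2) / 10))   # 0 < alpha < 1/4
print(gcd_criterion_holds(2, 4, math.sqrt(2) - 1))    # 1/4 < alpha < 1/2
```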

If the GCD Criterion fails, then every half-infinite \(\alpha \)-geodesic on the n-square-b surface with irrational \(\alpha \) is equidistributed. This case forms the hard part of Theorem 2.5.

On the other hand, if the GCD Criterion holds with greatest common divisor \(d>1\), then there is a reasonably simple algorithm to construct d non-trivial \(\alpha \)-flow invariant subsets of the n-square-b surface. Clearly density and equidistribution for any \(\alpha \)-geodesic on the n-square-b surface are impossible. This case is relatively short, and we discuss it now.

Suppose that the GCD Criterion holds. We now show how we can construct d non-trivial \(\alpha \)-flow invariant subsets of the n-square-b surface.

Consider the finite sequence

$$\begin{aligned} 0,\;\{\alpha \},\;\{2\alpha \},\;\ldots ,\;\{m\alpha \}=b \end{aligned}$$

of \(m+1\) terms, and arrange it in increasing order

$$\begin{aligned} 0=b_0<b_1<\cdots<b_\Upsilon<b_{\Upsilon +1}=\alpha<b_{\Upsilon +2}<\cdots <b_m, \end{aligned}$$
(2.3)

where the index \(\Upsilon \) denotes the parameter \(\Upsilon =\Upsilon (m;\alpha )\), and b is one of the elements in (2.3), so that \(b=b_\nu \) for some \(\nu =1,\ldots ,m\).

If we remove the term \(b_\nu \) from the sequence (2.3), then we obtain a subsequence

$$\begin{aligned} 0=b'_0<b'_1<\cdots <b'_{m-1} \end{aligned}$$
(2.4)

of m terms, where for every integer \(j=0,\ldots ,m-1\),

$$\begin{aligned} b'_j={\left\{ \begin{array}{ll} \,b_j,&{}\hbox {if}\;\; 0\leqslant j<\nu ,\\ \,b_{j+1},&{}\hbox {if}\; \;\nu \leqslant j\leqslant m-1. \end{array}\right. } \end{aligned}$$

Note that the elements of the subsequence (2.4) are in one-to-one correspondence with the collection of division points

$$\begin{aligned} \{q\alpha \}, \quad q=0,1,\ldots ,m-1. \end{aligned}$$
(2.5)

This subsequence also leads to a partition of the unit interval [0, 1) into m intervals

$$\begin{aligned} I_j=[b'_j,b'_{j+1}), \quad j=0,1,\ldots ,m-1, \end{aligned}$$
(2.6)

with the convention that \(b'_m=1\).
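The construction of (2.3), (2.4) and (2.6) can be sketched as follows. This is an illustration under the assumption that floating point values of \(\{q\alpha \}\) are distinct and correctly ordered, which holds for the sample slope used; the function name `gate_partition` is our own.

```python
import math


def gate_partition(m, alpha):
    """Return (points, nu, reduced, intervals) for b = {m*alpha}.

    points    : the sequence (2.3), i.e. {q*alpha} for q = 0..m, sorted
    nu        : the index with b = points[nu]
    reduced   : the subsequence (2.4), i.e. points with b removed
    intervals : the partition (2.6) of [0, 1), listed from the bottom
    """
    points = sorted((q * alpha) % 1.0 for q in range(m + 1))
    b = (m * alpha) % 1.0
    nu = points.index(b)
    reduced = points[:nu] + points[nu + 1:]
    ends = reduced + [1.0]          # convention b'_m = 1
    intervals = [(ends[j], ends[j + 1]) for j in range(m)]
    return points, nu, reduced, intervals


golden = (math.sqrt(5) - 1) / 2
points, nu, reduced, intervals = gate_partition(4, golden)
print(nu)
for lower, upper in intervals:
    print(f"[{lower:.4f}, {upper:.4f})")
```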

Since d divides m, we can color the intervals (2.6) from top to bottom with distinct colors \({\mathfrak {c}}_1,\ldots ,{\mathfrak {c}}_d\), repeated periodically m/d times.

We now proceed to d-color the n-square-b surface as follows.

Double Periodic Coloring Algorithm

Suppose that the integer d divides both n and m. Let \({\mathfrak {c}}_1,\ldots ,{\mathfrak {c}}_d\) denote d distinct colors.

  (1)

    Suppose that \(\ell =1,\ldots ,d\). Identify the left vertical edge of the \(\ell \)-th square face of the n-square-b surface with the interval [0, 1), consisting of the m intervals (2.6). We color these intervals from top to bottom by the colors

    $$\begin{aligned} {\mathfrak {c}}_\ell ,\ldots ,{\mathfrak {c}}_d,{\mathfrak {c}}_1,\ldots ,{\mathfrak {c}}_{\ell -1}, \end{aligned}$$

    repeated periodically m/d times. This clearly gives rise to a periodic d-coloring of this edge. Using the \(\alpha \)-flow, we can extend this periodic d-coloring to a d-coloring \(C(\ell )\) of the \(\ell \)-th square face of the n-square-b surface.

  (2)

    We then d-color the other square faces of the n-square-b surface by repeating the d-colorings \(C(1),\ldots ,C(d)\) of the first d square faces periodically n/d times.
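The two steps of the algorithm can be summarized by a single formula: on the \(\ell \)-th square face, the k-th interval from the top, counted from \(k=0\), receives the color \({\mathfrak {c}}_j\) with \(j\equiv \ell +k\) modulo d. The sketch below tabulates the color indices. The layout, with one row per interval from top to bottom and one column per square face from left to right, is a choice made here for illustration, as is the function name.

```python
def double_periodic_coloring(n, m, d):
    """Color indices 1..d for the n-square-b surface.

    Entry [k][l] is the index of the color on the k-th interval from the
    top (k = 0..m-1) of the left vertical edge of square face l + 1.
    """
    if n % d != 0 or m % d != 0:
        raise ValueError("d must divide both n and m")
    return [[(l + k) % d + 1 for l in range(n)] for k in range(m)]


for row in double_periodic_coloring(n=6, m=6, d=3):
    print(" ".join(f"c{j}" for j in row))
```

The top left d-by-d block of this table repeats both vertically and horizontally.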

Remark 2.2

The array

(2.7)

shows the coloring on each subinterval of the left vertical edge of each square face of the n-square-b surface. The sub-array on the top left repeats throughout the whole array, with periodicity of the coloring vertically and horizontally. This explains the terminology Double Periodic Coloring Algorithm.

It becomes particularly interesting if the GCD Criterion holds, so that the integer d also divides the parameter \(\Upsilon (m;\alpha )\). Note that this is the case for \(n=2\) in each of Figs. 12–15, and in each case, we are able to give 2 non-trivial \(\alpha \)-flow invariant subsets of the 2-square-b surface. The next lemma is a far-reaching generalization of this observation.

Lemma 2.3

Suppose that an integer d divides both n and m. Then the Double Periodic Coloring Algorithm gives rise to d non-trivial \(\alpha \)-flow invariant subsets of the n-square-b surface if and only if d also divides \(\Upsilon (m;\alpha )\).

Proof

The d-coloring C(1) from the Double Periodic Coloring Algorithm also gives the periodic d-coloring \(C_0\) of the far left vertical edge of the n-square-b surface, viewed as the unit torus [0, 1), with m division points given by (2.4) or (2.5). In particular, the color pattern from the top is \({\mathfrak {c}}_1,\ldots ,{\mathfrak {c}}_d\), with periodic repetition until it reaches the bottom.

Let \(C^*\) denote a new d-coloring of the unit torus, obtained from \(C_0\) by translating each point in [0, 1) by \(\alpha \) modulo 1. Noting (2.4) and (2.5), it is clear that the division points of \(C^*\) are given by

$$\begin{aligned} \{q\alpha \}, \quad q=1,\ldots ,m. \end{aligned}$$
(2.8)

Thus the division points of \(C_0\) and \(C^*\) are essentially the same, apart from 0 being replaced by \(b=\{m\alpha \}\).

Let \(C^{**}\) denote another new d-coloring of the unit torus, obtained from \(C_0\) by keeping the colors in the interval \([b,1)=[\{m\alpha \},1)\) and replacing any color \({\mathfrak {c}}_j\) in the interval \([0,b)=[0,\{m\alpha \})\) by the next color \({\mathfrak {c}}_{j+1}\) along the chain \({\mathfrak {c}}_1,\ldots ,{\mathfrak {c}}_d\) modulo d. Note that in \(C^{**}\), the two sides of 0 now have the same color, so 0 is no longer a division point. On the other hand, note that \(b=\{m\alpha \}\) is not a division point of \(C_0\). However, switching from \(C_0\) to \(C^{**}\), we switch the color below b and keep the color above b, so \(b=\{m\alpha \}\) is clearly a division point of \(C^{**}\).

It follows that \(C^*\) and \(C^{**}\) are two d-colorings of the unit torus with precisely the same division points (2.8).

We shall first show that the two d-colorings \(C^*\) and \(C^{**}\) are equal if and only if d divides \(\Upsilon (m;\alpha )\). In view of the vertical periodicity of the d-colorings, to show that \(C^*\) and \(C^{**}\) are equal, it clearly suffices to check the equality of colors in just one interval. We distinguish two cases.

Case 1. Suppose that \(b=\{m\alpha \}<\alpha \). Since \(b_{\Upsilon +1}=\alpha \) and \(b=b_\nu <\alpha \), it follows that \(1\leqslant \nu \leqslant \Upsilon \). Recall that \(b=b_\nu \) is not a division point of \(C_0\). Hence

$$\begin{aligned} 0=b_0<b_1<\cdots<b_{\nu -1}<b_{\nu +1}<\cdots <b_{\Upsilon +1}=\alpha \end{aligned}$$

are successive division points of \(C_0\). Hence the intervals \([b_0,b_1)\) and \([b_{\Upsilon +1},b_{\Upsilon +2})\) have the same color \({\mathfrak {c}}_d\) in \(C_0\) if and only if d divides \(\Upsilon \). Next, note that the interval \([b_{\Upsilon +1},b_{\Upsilon +2})=[\alpha ,b_{\Upsilon +2})\) is obtained from the interval \([b_0,b_1)=[0,b_1)\) by translation by \(\alpha \) modulo 1. It follows that \([b_{\Upsilon +1},b_{\Upsilon +2})\) has the same color \({\mathfrak {c}}_d\) in \(C^*\) as \([b_0,b_1)\) has in \(C_0\). On the other hand, the interval \([b_{\Upsilon +1},b_{\Upsilon +2})=[\alpha ,b_{\Upsilon +2})\) is not in the interval [0, b), and so it has the same color in \(C^{**}\) as in \(C_0\). It now follows that the interval \([b_{\Upsilon +1},b_{\Upsilon +2})=[\alpha ,b_{\Upsilon +2})\) has the same color \({\mathfrak {c}}_d\) in \(C^*\) as in \(C^{**}\) if and only if d divides \(\Upsilon (m;\alpha )\).

Case 2. Suppose that \(b=\{m\alpha \}>\alpha \). Since \(b_{\Upsilon +1}=\alpha \) and \(b=b_\nu >\alpha \), it follows that \(\nu >\Upsilon +1\). Hence

$$\begin{aligned} 0=b_0<b_1<\cdots <b_{\Upsilon +1}=\alpha \end{aligned}$$

are successive division points of \(C_0\). Hence the intervals \([b_0,b_1)\) and \([b_{\Upsilon +1},b_{\Upsilon +2})\) have different colors \({\mathfrak {c}}_d\) and \({\mathfrak {c}}_{d-1}\) respectively in \(C_0\) if and only if d divides \(\Upsilon \). As in Case 1, \([b_{\Upsilon +1},b_{\Upsilon +2})\) has the same color \({\mathfrak {c}}_d\) in \(C^*\) as \([b_0,b_1)\) has in \(C_0\). On the other hand, the interval \([b_{\Upsilon +1},b_{\Upsilon +2})=[\alpha ,b_{\Upsilon +2})\) is in the interval [0, b), and so its color in \(C^{**}\) is the next color \({\mathfrak {c}}_d\) along the chain \({\mathfrak {c}}_1,\ldots ,{\mathfrak {c}}_d\) from its color \({\mathfrak {c}}_{d-1}\) in \(C_0\). It now follows that the interval \([b_{\Upsilon +1},b_{\Upsilon +2})=[\alpha ,b_{\Upsilon +2})\) has the same color \({\mathfrak {c}}_d\) in \(C^*\) as in \(C^{**}\) if and only if d divides \(\Upsilon (m;\alpha )\).

Finally, note that the equality of \(C^*\) and \(C^{**}\) and periodicity represent precisely the division of the n-square-b surface into d monochromatic sets that represent d non-trivial \(\alpha \)-flow invariant subsets of the n-square-b surface. Indeed, \(C^{**}\) exhibits the key difference between the intervals [0, b) and [b, 1), that an \(\alpha \)-geodesic can freely cross the b-gate and is obstructed above it. \(\square \)
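The key step of the proof, that \(C^*\) and \(C^{**}\) agree if and only if d divides \(\Upsilon (m;\alpha )\), can be tested numerically for small parameters. The following sketch is a sanity check rather than a proof. It assumes floating point arithmetic is accurate enough for the sample slopes, and it compares the two colorings at one sample point in each cell cut out by the points \(\{q\alpha \}\), \(q=0,1,\ldots ,m\). All function names are illustrative.

```python
import bisect
import math


def upsilon(m, alpha):
    """Count the integers q with 1 <= q <= m and {q*alpha} < alpha."""
    return sum(1 for q in range(1, m + 1) if (q * alpha) % 1.0 < alpha)


def colorings_agree(m, d, alpha):
    """True if C* and C** coincide at every sample point."""
    if m % d != 0:
        raise ValueError("d must divide m")
    b = (m * alpha) % 1.0
    # Division points (2.5) of C_0, sorted; the first one is 0.
    cuts = sorted((q * alpha) % 1.0 for q in range(m))

    def c0(x):
        j = bisect.bisect_right(cuts, x) - 1   # x lies in I_j, from the bottom
        return (m - 1 - j) % d + 1             # colors c_1, ..., c_d from the top

    def c_star(x):
        # C_0 translated by alpha modulo 1.
        return c0((x - alpha) % 1.0)

    def c_star_star(x):
        # Keep the color on [b, 1), move to the next color on [0, b).
        return c0(x) if x >= b else c0(x) % d + 1

    grid = sorted(cuts + [b, 1.0])
    midpoints = [(lo + hi) / 2 for lo, hi in zip(grid, grid[1:])]
    return all(c_star(x) == c_star_star(x) for x in midpoints)


golden = (math.sqrt(5) - 1) / 2
for m, d, alpha in [(4, 2, golden), (4, 2, math.sqrt(2) - 1), (3, 3, golden)]:
    print(m, d, upsilon(m, alpha), colorings_agree(m, d, alpha))
```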

Remark 2.4

Lemma 2.3 basically says that from the viewpoint of equidistribution on the n-square-b surface, the GCD Criterion can be considered an obstacle. Note, however, that any \(\alpha \)-geodesic with irrational \(\alpha \) in any monochromatic subset of the n-square-b surface is equidistributed in that subset. We only need to recall that any \(\alpha \)-geodesic on the n-square-b surface modulo 1 reduces to a torus line of slope \(\alpha \) in the unit square. Since \(\alpha \) is irrational, this projected torus line is uniformly distributed in the unit square.

If the GCD Criterion holds, then we can always compute the corresponding visit densities, analogous to the cases illustrated in Figs. 12–15. It is not difficult to see that each visit density is necessarily of the form \(u\alpha +v\), where \(u,v\in {\mathbb {Z}}\). Since this is strictly between 0 and 1, it follows that \(u\ne 0\), and since \(\alpha \) is irrational, the visit density can never be equal to 1/n. Thus any half-infinite \(\alpha \)-geodesic on the n-square-b surface is always unevenly distributed between the squares.

Lemma 2.3 clearly establishes one half of the following result.

Theorem 2.5

Suppose that \(b=\{m\alpha \}\), where \(\alpha \) is irrational and m is a positive integer. Then any \(\alpha \)-geodesic on the n-square-b surface is equidistributed on the surface if and only if the GCD Criterion fails.

An interesting consequence of Theorem 2.5 is the following. If \(b=\{m\alpha \}\), where \(\alpha \) is irrational and m is a positive integer, and an \(\alpha \)-geodesic on the n-square-b surface is dense on the surface, then the geodesic exhibits the stronger property of equidistribution.

We shall prove the remainder of Theorem 2.5 later; see Sect. 4 and the end of Sect. 6.

3 More on the 2-square-b surface and beyond

We now consider the general case of the n-square-b surface when \(b\ne \{m\alpha \}\) for any \(m\in {\mathbb {Z}}\). Here the answer is rather tricky.

For the original case \(n=2\), the paper of Veech [15] contains a study of the following special case of Question 1 where the test sets are simply the two squares of the 2-square-b surface.

Question 2

Let \(\alpha \) be an irrational number. When can we guarantee that every half-infinite \(\alpha \)-geodesic on the 2-square-b surface is evenly distributed between the two constituent squares?

In other words, assuming that a particle moves along the \(\alpha \)-geodesic with unit speed, under what condition can we guarantee that for every starting point, the left square is visited half the time? More precisely, we want the asymptotic visit-density of this particle to the left square of the 2-square-b surface to exist, and to be equal to 1/2.
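The visit-density in Question 2 can be explored by simulation. The sketch below rests on a modelling assumption that is ours: the geodesic is replaced by its successive hits on vertical edges, so that the height advances by \(\alpha \) modulo 1 at each step and the particle changes square exactly when the new height lies in the gate [0, b). Under unit speed every crossing of a square takes the same time, so the fraction of steps spent in the left square approximates its visit-density. The function name, the starting height and the number of steps are illustrative.

```python
import math


def left_visit_frequency(alpha, b, steps=200_000, x0=0.1):
    """Fraction of steps spent in the left square (discrete model)."""
    x, square, left_hits = x0, 0, 0     # square 0 is the left square
    for _ in range(steps):
        x = (x + alpha) % 1.0           # next hit on a vertical edge
        if x < b:                       # the hit lies in the gate [0, b)
            square ^= 1                 # pass to the other square
        left_hits += (square == 0)
    return left_hits / steps


small = math.sqrt(2) - 1                # slope in (0, 1/2)
golden = (math.sqrt(5) - 1) / 2         # slope in (1/2, 1)
print(left_visit_frequency(small, (2 * small) % 1.0))
print(left_visit_frequency(golden, (2 * golden) % 1.0))
```

For \(b=\{2\alpha \}\) the first frequency should be close to \(\alpha \) or \(1-\alpha \), and the second close to 1/2, in line with the case study in Sect. 2.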

Veech [15] has the following positive answer to Question 2.

Theorem A

Suppose that the slope \(\alpha \) is badly approximable. Suppose further that the gate-size \(b\ne \{m\alpha \}\) for any \(m\in {\mathbb {Z}}\). Then every half-infinite \(\alpha \)-geodesic on the 2-square-b surface is evenly distributed between the two constituent squares.

We recall that badly approximable numbers are characterized by the property that the continued fraction digits have a common upper bound. A well-known subclass of badly approximable numbers is the set of all quadratic irrationals, i.e., real algebraic numbers of degree 2, which are characterized by the property that the continued fraction expansions are eventually periodic.
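For quadratic irrationals of the form \(\sqrt{N}\), the continued fraction digits can be computed exactly with integer arithmetic, which makes the eventual periodicity, and hence the boundedness of the digits, visible. The sketch below uses the classical recurrence for square roots; the function name is our own. Subtracting the integer part gives a slope in (0, 1) with the same digits \(a_1,a_2,a_3,\ldots \).

```python
import math


def sqrt_continued_fraction(N, count):
    """First `count` continued fraction digits [a_0; a_1, a_2, ...] of sqrt(N)."""
    a0 = math.isqrt(N)
    if a0 * a0 == N:
        raise ValueError("N must not be a perfect square")
    digits = [a0]
    m, d, a = 0, 1, a0
    while len(digits) < count:
        # Invariant: the current complete quotient is (sqrt(N) + m) / d.
        m = d * a - m
        d = (N - m * m) // d
        a = (a0 + m) // d
        digits.append(a)
    return digits


print(sqrt_continued_fraction(2, 8))
print(sqrt_continued_fraction(7, 12))
```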

Given a badly approximable slope \(\alpha \), the condition \(b\ne \{m\alpha \}\) for any \(m\in {\mathbb {Z}}\) in Theorem A excludes a countable set of values of b. For these excluded values of b, we now have a complete understanding of the situation. As explained in the remarks after the proof of Lemma 2.3, what happens depends on the Double Even Criterion. Suppose that the Double Even Criterion fails. Then it follows as a consequence of Theorem 2.5 that any half-infinite \(\alpha \)-geodesic is evenly distributed between the two constituent squares. On the other hand, suppose that the Double Even Criterion holds. Then each constituent square has a well-defined visit-density, depending on the starting point of the \(\alpha \)-geodesic, which is never equal to 1/2, so the half-infinite \(\alpha \)-geodesic is never evenly distributed between the two constituent squares.

If the slope \(\alpha \) is not badly approximable, then Veech [15] has the following very interesting negative result.

Theorem B

Suppose that the irrational slope \(\alpha \) is not badly approximable. Then there exists an explicit construction of an uncountable set of values b with strong violation of uniformity in the sense that for some half-infinite \(\alpha \)-geodesics on such a 2-square-b surface, the visit-densities of the constituent squares do not even exist.

So far, we have considered a fixed irrational slope \(\alpha \) and asked the question of what values of b lead to half-infinite \(\alpha \)-geodesics on the 2-square-b surface that are evenly distributed between the two constituent squares.

Suppose instead that we consider a fixed gate size b. Then it is reasonable to ask what irrational slopes \(\alpha \) give rise to half-infinite \(\alpha \)-geodesics on the 2-square-b surface that are evenly distributed between the two constituent squares.

Veech [15] has the following result which shows that 2-square-b surfaces with rational values of b are exceptional.

Theorem C

Suppose that the number b is rational. Then for any irrational slope \(\alpha \), every half-infinite \(\alpha \)-geodesic on the 2-square-b surface gives rise to equal visit-densities of the two constituent squares.

We also have the following negative result of Masur and Smillie on the 2-square-b surface; see [8, Theorem 3.2] or [7, Theorem 2].

Theorem D

Suppose that the number b is irrational. Then there exist uncountably many slopes \(\alpha \) such that for almost every starting point, a half-infinite \(\alpha \)-geodesic on the 2-square-b surface is not uniformly distributed.

Note that the uncountable set of bad slopes \(\alpha \) in Theorem D can be extended to a set of positive Hausdorff dimension, but not to a set of positive Lebesgue measure. This follows from a well-known general result of Kerckhoff, Masur, and Smillie [5] concerning geodesic flow on any rational polygonal surface. This important general theorem, which works for almost every slope, unfortunately does not say anything about any explicit slope, which is our main interest. For more about non-integrable flat dynamical systems, the reader is referred to the survey papers [8, 19].

Theorems A–D are very satisfactory results that give us a very good understanding of the distribution of half-infinite \(\alpha \)-geodesics on the 2-square-b surface. We can view this as the mod 2 case. However, the corresponding mod n version, concerning the n-square-b surface, remains open for any integer \(n\geqslant 3\).

Veech [15] has asked the question of whether or not his method can be extended to prove the mod n versions of Theorem A for \(n\geqslant 3\). Here we can establish such a result, but we do not use Veech’s method which is quite complicated. In fact, we can prove the following stronger result that answers the mod n analog of Question 1.

Theorem 3.1

Suppose that \(n\geqslant 2\) and the slope \(\alpha \) is badly approximable. Suppose also that the gate-size \(b\ne \{m\alpha \}\) for any \(m\in {\mathbb {Z}}\). Then every half-infinite \(\alpha \)-geodesic on the n-square-b surface is uniformly distributed.

Furthermore, we can establish a far-reaching generalization of Theorem 3.1. We consider the larger class of flat finite polysquare, or square tiled, translation surfaces with b-rational gates.

A finite polysquare, or square tiled, region is a connected, but not necessarily simply-connected, polygon P on the plane which is tiled with unit squares, assumed to be closed, that we call the atomic squares of P, and which satisfies the following conditions:

  (i)

    Any two atomic squares in P either are disjoint, or intersect at a single point, or have a common edge.

  (ii)

    Any two atomic squares in P are joined by a chain of atomic squares where any two neighbors in the chain have a common edge.

To turn a given finite polysquare region P into a flat finite polysquare translation surface, we need identification of pairs of horizontal edges as well as identification of pairs of vertical edges. In Fig. 17, we show examples of the identification of horizontal edges on the two leftmost columns of atomic squares as well as examples of the identification of vertical edges on the two topmost rows of atomic squares.

Note that the finite polysquare surface may have holes, and we also allow whole barriers which are horizontal or vertical walls that consist of one or more boundary edges of atomic squares. For example, the finite polysquare surface in Fig. 17 has 32 atomic squares, 2 holes as well as 3 horizontal walls and 4 vertical walls.

Fig. 17. A flat finite polysquare translation surface

Geodesic flow on a flat finite polysquare translation surface is always 1-direction linear flow.

Remark 3.2

Geodesic flow on a general finite polysquare surface may sometimes be a 4-direction flow. Consider, for example, geodesic flow on the cube surface. It is well known that this 4-direction geodesic flow on the cube surface can be converted to a 1-direction geodesic flow by using a 4-copy construction, where we take four rotated copies of the cross-shaped net of the cube surface, and glue together corresponding edges in the different copies to obtain a flat finite polysquare translation surface. Indeed, an analog of this 4-copy construction works for any finite polysquare surface with 4-direction geodesic flow.

Meanwhile, it can also be shown that any 4-direction billiard orbit in a finite polysquare region is equivalent to 1-direction geodesic flow in a corresponding flat finite polysquare translation surface. This follows as a consequence of the concept of unfolding, first demonstrated on the unit square by König and Szücs [6] in 1913.

It is therefore sufficient to study 1-direction geodesic flow on flat finite polysquare translation surfaces.

The 2-dimensional continuous Kronecker–Weyl equidistribution theorem for the torus line in a square leads to an interesting uniform-periodic dichotomy, in the sense that every torus line with irrational slope is uniformly distributed, whereas every torus line with rational slope is periodic.

We have the following remarkable extension of this classical result by Gutkin and Veech about 70 years later; see [3, 16, 17].

Theorem E

On any flat finite polysquare translation surface, every half-infinite 1-direction geodesic with irrational slope is uniformly distributed, whereas every half-infinite 1-direction geodesic with rational slope is periodic.

Note that we consider here only half-infinite 1-direction geodesics, as we need to exclude any geodesic that hits a singularity of the polysquare surface after which there is no well defined unique continuation.

If the gate size b is irrational, then the 2-square-b surface is not a polysquare surface, so Theorem E does not apply. Furthermore, as Theorem B shows, for any irrational slope which is not badly approximable, there is clearly no uniform-periodic dichotomy. There is an uncountable set of values b for which even the simplest test sets, namely the two constituent squares of the 2-square-b surface, violate uniformity. On the other hand, any half-infinite 1-direction geodesic with irrational slope on any 2-square-b surface cannot be periodic.

As in Theorem A, we study uniformity in the case of badly approximable slopes. Theorem 3.1 is such a result. Next we formulate a far-reaching generalization of it, to the class of flat finite polysquare translation surfaces with b-rational gates.

An example of such a surface is the (Lb)-surface, an L-shaped 4-square surface with three b-size gates and one b/2-size gate, as shown in Fig. 18.

Fig. 18. The (Lb)-surface

Here the left vertical edge of the bottom middle atomic square has two division points b and \(1-b/2=\{-b/2\}\) which determine the left bottom b-gate between 0 and b, as well as the top b/2-gate between \(1-b/2\) and 1, separated by the fractional vertical barrier between b and \(1-b/2\). On the other hand, the left vertical edge of the bottom right atomic square has two division points b and \(1-b=\{-b\}\) which determine the left bottom b-gate between 0 and b, as well as the top b-gate between \(1-b\) and 1, separated by the fractional vertical barrier between b and \(1-b\).

We now extend the class of flat finite polysquare translation surfaces to the larger class of flat finite polysquare-b-rational translation surfaces by following and then extending the pattern of the (Lb)-surface. For any vertical side of an atomic square, we may place any number of b-rational division points located at distance \(\{rb\}\) from the bottom of the edge, where \(0<b<1\) is fixed and r is a non-zero rational number. These division points, often called the division numbers, determine vertical gates separated by fractional vertical barriers, where every gate and barrier is a subinterval of the vertical edge, with endpoints which are b-rational division points. To obtain a translation surface, we identify pairs of horizontal edges and pairs of vertical edges in an appropriate manner. Then geodesic flow is 1-direction linear flow.

The flat finite polysquare-b-rational translation surface in Fig. 19 is modified from the flat finite polysquare translation surface in Fig. 17 in this way. We have not included the edge identifications.

Fig. 19. A flat finite polysquare-b-rational translation surface

In Sects. 5–7, we shall prove the following generalization of Theorem 3.1.

Theorem 3.3

Suppose that we have a flat finite polysquare-b-rational translation surface, where b is irrational, with division numbers \(\{r_ib\}\), \(i=1,\ldots ,R\), where each \(r_i\) is a non-zero rational number. Let \(\alpha \) be a badly approximable number such that \(\{r_ib\}\ne \{m\alpha \}\) for any \(i=1,\ldots ,R\) and any \(m\in {\mathbb {Z}}\). Then every half-infinite \(\alpha \)-geodesic on this surface is uniformly distributed.

Remark 3.4

The study of geodesic flow on a flat finite polysquare-b-rational translation surface is related to a suitable generalization of the Veech 2-circle problem. Here the number of circles corresponds to the number of vertical streets of the underlying finite polysquare surface, and the circumference of a circle is the length of the vertical street that corresponds to it. This remains the case if the division numbers are replaced by a finite set of real numbers, at least one of which is irrational, resulting in surfaces that can be more general than polysquare-b-rational translation surfaces. Unfortunately, we are not able to extend Theorem 3.3 to this more general setting, as we are not able to establish a suitable generalization of the separation lemma as given by Lemma 5.2.

We can show that billiard in any finite polysquare-b-rational region is equivalent to a 1-direction geodesic flow on a corresponding flat finite polysquare-b-rational translation surface. This follows from a generalization of the concept of unfolding, pioneered by König and Szücs [6] in 1913, to show that billiard in the unit square is equivalent to 1-direction geodesic flow in the square torus. Indeed, as mentioned earlier, it can be shown that billiard in any finite polysquare region is equivalent to a 1-direction geodesic flow on a corresponding flat finite polysquare translation surface.

Thus we have immediately the following result concerning billiards.

Theorem 3.5

Let P be a finite polysquare-b-rational translation region, where b is irrational, and with division numbers \(\{r_ib\}\), \(i=1,\ldots ,R\), where each \(r_i\) is a non-zero rational number. Let \(\alpha \) be a badly approximable number such that \(\{r_ib\}\ne \{m\alpha \}\) for any \(i=1,\ldots ,R\) and any \(m\in {\mathbb {Z}}\). Then every half-infinite billiard orbit in P with initial slope \(\alpha \) is uniformly distributed.

Next we return to the 2-square-b surface and the somewhat negative Theorem B of Veech. If the irrational slope \(\alpha \) is not badly approximable, then there exists an uncountable set of values of b such that the visit-densities of the constituent squares do not even exist. It is perhaps natural to call such gate-sizes b bad. This raises the question of finding a quantitative description of this phenomenon, namely that extreme violation of uniformity can be exhibited by a concrete geodesic.

We shall give such a quantitative result which demonstrates serious violations of uniformity. For appropriate pairs of the parameters \(\alpha \) and b, we shall construct a half-infinite \(\alpha \)-geodesic on the 2-square-b surface which demonstrates extra-large one-sidedness exhibited in an alternating way. Such a geodesic also violates any form of quasi-periodicity. Using a completely different method from those that give Theorems B and D, we shall prove in Sects. 8 and 9 the following result.

For any 2-square-b surface, we denote by \({{\,\textrm{LS}\,}}(b)\) the left constituent square of the surface, and by \({{\,\textrm{RS}\,}}(b)\) the right constituent square of the surface.

Theorem 3.6

Suppose that \(\varepsilon >0\) is arbitrarily small but fixed, and that \(\alpha \in (0,1)\) is any irrational number with continued fraction

$$\begin{aligned} \alpha =\frac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots }}}=[a_1,a_2,a_3,\ldots ], \end{aligned}$$

where the digits \(a_1,a_2,a_3,\ldots \) satisfy the condition

(3.1)

There exists an explicitly given gate-size \(\beta _0=\beta _0(\alpha )\) such that the \(\alpha \)-geodesic, starting from some explicitly given point on the 2-square-\(\beta _0\) surface, satisfies the following simultaneously, where C is any positive integer satisfying \(C<200/\varepsilon \):

  (i)

    There exists an infinite sequence \(T^*_n\), \(n=1,2,3,\ldots \), of positive real numbers satisfying \(T^*_{n+1}>2T^*_n\) such that for every integer \(n=1,2,3,\ldots \) and for every integer \(b=0,1,\ldots ,C\) apart from \(b=1\),

    (3.2)

    with an overwhelming bias for the left constituent square of the surface, as well as

    (3.3)

    with an overwhelming bias for the right constituent square of the surface.

  (ii)

    There exists an infinite sequence \(T^{**}_n\), \(n=1,2,3,\ldots \), of positive real numbers satisfying \(T^{**}_{n+1}>2T^{**}_n\) such that for every integer \(n=1,2,3,\ldots \) and for every integer \(b=0,1,\ldots ,C\) apart from \(b=2\),

    (3.4)

    with an overwhelming bias for the left constituent square of the surface, as well as

    (3.5)

    with an overwhelming bias for the right constituent square of the surface.

On the other hand, for any large but fixed positive integer n, there exists another explicitly given gate-size \(\beta _1=\beta _1(\alpha ,n)\) such that \(\vert \beta _1-\beta _0\vert <\varepsilon \) and the \(\alpha \)-geodesic, starting from some explicitly given point on the 2-square-\(\beta _1\) surface, satisfies the following simultaneously:

  (iii)

    There exists a finite sequence \(W_1,\ldots ,W_n\) of positive real numbers satisfying \(W_{i+1}>2W_i\) whenever \(i<n\) such that for every integer \(i=1,\ldots ,n\),

    (3.6)

    with an overwhelming bias for the left constituent square of the surface, as well as

    (3.7)

    with an overwhelming bias for the right constituent square of the surface.

  2. (iv)

    There exists a positive threshold \(W^\star \) such that for every positive real number \(W>W^\star \),

    (3.8)

    with a significant bias for the left constituent square of the surface.

Removing the vertical barrier on the 2-square-\(\beta _0\) surface or the 2-square-\(\beta _1\) surface leads to a polysquare surface which is an integrable rectangle surface. It can then be shown that, by imposing suitable slow growth conditions on the continued fraction digits of \(\alpha \) without violating (3.1), we obtain essentially best possible time-quantitative uniformity, with polylogarithmic error term, for any geodesic with slope \(\alpha \) on this integrable surface. Thus the barrier is the root cause of the sharply different uniformity properties of the two geodesics with the same slope. We omit the details.

4 Interval exchange transformation and ergodicity

A common tool in the proofs of Theorems 2.5 and 3.3 is the concept of an interval exchange transformation which represents a natural discretization of the linear flow of slope \(\alpha \) on the flat translation surface. As a first step, we need to exhibit ergodicity of this transformation, and this step is summarized by Lemmas 4.1 and 5.1.

We discuss this standard technique here, and also illustrate a second key idea, which is an application of the so-called 3-distance theorem, as given in Lemma 4.2, an idea used earlier in related work by Boshernitzan [1, Theorem 7.2 \((r=2)\)].

Theorem 3.3 concerns flat finite polysquare-b-rational translation surfaces which often can be far more complicated than the n-square-b surface in Theorem 2.5. Thus to illustrate the idea of an interval exchange transformation, we shall use instead the special case of the (Lb)-surface shown in Fig. 18, as this special case already well captures the whole difficulty of the situation in general. We shall further assume that \(0<b<\alpha <1/2\), where \(\alpha \) is a given irrational slope.

Before we introduce the interval exchange transformation, we first consider the effect of the \(\alpha \)-flow. For convenience, we shall assume that all the gates and barriers are closed at the bottom end and open at the top end.

Let \(w_1,w_2,w_3,w_4\) denote the left vertical edges of the four atomic squares that make up the (Lb)-surface, as shown in Fig. 20.

Fig. 20
figure 20

The vertical edges \(w_1,w_2,w_3,w_4\) and \(\alpha \)-flow

For the vertical edge \(w_1\), we denote by \(w_1(0)\) and \(w_1(1)\) the bottom endpoint and top endpoint of \(w_1\) respectively, and denote by \(w_1(x)\), where \(0<x<1\), the point on \(w_1\) which is a distance x from \(w_1(0)\). Furthermore, for any set \(S\subset [0,1]\), we let

$$\begin{aligned} w_1S=\{w_1(x):x\in S\}, \end{aligned}$$

so that \(w_1[0,1]=w_1\).

We now repeat this for the other three vertical edges \(w_2,w_3,w_4\).

Using Figs. 18 and 20, we see that the \(\alpha \)-flow maps the interval \(w_4[0,1-\alpha )\) to the interval \(w_4[\alpha ,1)\). We denote this by

$$\begin{aligned} w_4[0,1-\alpha ) \mapsto w_4[\alpha ,1). \end{aligned}$$

Careful analysis now shows that the effect of the \(\alpha \)-flow is summarized by a collection of increasing bijective linear mappings

$$\begin{aligned} w_1\bigl [0,1-\alpha -\tfrac{b}{2}\bigr )&\mapsto w_1\bigl [\alpha ,1-\tfrac{b}{2}\bigr ), \end{aligned}$$
(4.1)
$$\begin{aligned} w_1\bigl [1-\alpha -\tfrac{b}{2},1-\alpha \bigr )&\mapsto w_2\bigl [1-\tfrac{b}{2},1\bigr ), \end{aligned}$$
(4.2)
$$\begin{aligned} w_1[1-\alpha ,1)&\mapsto w_4[0,\alpha ), \end{aligned}$$
(4.3)
$$\begin{aligned} w_2[0,1-\alpha -b)&\mapsto w_2[\alpha ,1-b), \end{aligned}$$
(4.4)
$$\begin{aligned} w_2[1-\alpha -b,1-\alpha )&\mapsto w_3[1-b,1), \end{aligned}$$
(4.5)
$$\begin{aligned} w_2[1-\alpha ,1-\alpha +b)&\mapsto w_3[0,b), \end{aligned}$$
(4.6)
$$\begin{aligned} w_2[1-\alpha +b,1)&\mapsto w_2[b,\alpha ), \end{aligned}$$
(4.7)
$$\begin{aligned} w_3[0,1-\alpha -b)&\mapsto w_3[\alpha ,1-b), \end{aligned}$$
(4.8)
$$\begin{aligned} w_3\bigl [1-\alpha -b,1-\alpha -\tfrac{b}{2}\bigr )&\mapsto w_2\bigl [1-b,1-\tfrac{b}{2}\bigr ), \end{aligned}$$
(4.9)
$$\begin{aligned} w_3\bigl [1-\alpha -\tfrac{b}{2},1-\alpha \bigr )&\mapsto w_1\bigl [1-\tfrac{b}{2},1\bigr ), \end{aligned}$$
(4.10)
$$\begin{aligned} w_3[1-\alpha ,1-\alpha +b)&\mapsto w_1[0,b), \end{aligned}$$
(4.11)
$$\begin{aligned} w_3[1-\alpha +b,1)&\mapsto w_3[b,\alpha ), \end{aligned}$$
(4.12)
$$\begin{aligned} w_4[0,1-\alpha )&\mapsto w_4[\alpha ,1), \end{aligned}$$
(4.13)
$$\begin{aligned} w_4[1-\alpha ,1-\alpha +b)&\mapsto w_2[0,b), \end{aligned}$$
(4.14)
$$\begin{aligned} w_4[1-\alpha +b,1)&\mapsto w_1[b,\alpha ). \end{aligned}$$
(4.15)

We next identify the edges \(w_1,w_2,w_3,w_4\) with the intervals [0, 1), [1, 2), [2, 3), [3, 4) respectively. Using this identification and (4.1)–(4.15), the effect of the \(\alpha \)-flow can then be described by a piecewise linear map \(T:[0,4)\rightarrow [0,4)\), where

$$\begin{aligned} T\bigl (\bigl [0,1-\alpha -\tfrac{b}{2}\bigr )\bigr )&= \bigl [\alpha ,1-\tfrac{b}{2}\bigr ), \end{aligned}$$
(4.16)
$$\begin{aligned} T\bigl (\bigl [1-\alpha -\tfrac{b}{2},1-\alpha \bigr )\bigr )&= \bigl [2-\tfrac{b}{2},2\bigr ), \end{aligned}$$
(4.17)
$$\begin{aligned} T([1-\alpha ,1))&= [3,3+\alpha ), \end{aligned}$$
(4.18)
$$\begin{aligned} T([1,2-\alpha -b))&= [1+\alpha ,2-b), \end{aligned}$$
(4.19)
$$\begin{aligned} T([2-\alpha -b,2-\alpha ))&= [3-b,3), \end{aligned}$$
(4.20)
$$\begin{aligned} T([2-\alpha ,2-\alpha +b))&= [2,2+b), \end{aligned}$$
(4.21)
$$\begin{aligned} T([2-\alpha +b,2))&= [1+b,1+\alpha ), \end{aligned}$$
(4.22)
$$\begin{aligned} T([2,3-\alpha -b))&= [2+\alpha ,3-b), \end{aligned}$$
(4.23)
$$\begin{aligned} T\bigl (\bigl [3-\alpha -b,3-\alpha -\tfrac{b}{2}\bigr )\bigr )&= \bigl [2-b,2-\tfrac{b}{2}\bigr ), \end{aligned}$$
(4.24)
$$\begin{aligned} T\bigl (\bigl [3-\alpha -\tfrac{b}{2},3-\alpha \bigr )\bigr )&= \bigl [1-\tfrac{b}{2},1\bigr ), \end{aligned}$$
(4.25)
$$\begin{aligned} T\bigl (\bigl [3-\alpha ,3-\alpha +b\bigr )\bigr )&= [0,b), \end{aligned}$$
(4.26)
$$\begin{aligned} T([3-\alpha +b,3))&= [2+b,2+\alpha ), \end{aligned}$$
(4.27)
$$\begin{aligned} T([3,4-\alpha ))&= [3+\alpha ,4), \end{aligned}$$
(4.28)
$$\begin{aligned} T([4-\alpha ,4-\alpha +b))&= [1,1+b), \end{aligned}$$
(4.29)
$$\begin{aligned} T([4-\alpha +b,4))&= [b,\alpha ), \end{aligned}$$
(4.30)

and each of (4.16)–(4.30) represents an increasing bijective linear map. This map T is known as the interval exchange transformation of the \(\alpha \)-flow on the (Lb)-surface. It is clear that T preserves Lebesgue measure.

A quick inspection of (4.16)–(4.30) shows that T has many points of discontinuity. However, if we take them modulo 1, then their values are given by

$$\begin{aligned} 0, \quad 1-\alpha -b, \quad 1-\alpha -\tfrac{b}{2}, \quad 1-\alpha , \quad 1-\alpha +b. \end{aligned}$$

We refer to these five numbers as the singularities of T modulo 1, or simply the singularities. These are precisely the division numbers shifted by \(-\alpha \) modulo 1, together with 0 and \(1-\alpha \).
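The fifteen pieces (4.16)–(4.30) can be checked numerically. The following Python sketch encodes them for the sample values \(\alpha =\sqrt{2}-1\) and \(b=0.3\), which are illustrative choices satisfying \(0<b<\alpha <1/2\); the function names are ours and do not come from the text. It lets one confirm that the source intervals and their images each tile [0, 4), that T reduces modulo 1 to rotation by \(\alpha \), and that the breakpoints reduce modulo 1 to the five singularities listed above.

```python
import math

def lb_iet_pieces(alpha, b):
    """Pieces (source start, source end, image start) of T, read off (4.16)-(4.30)."""
    assert 0 < b < alpha < 0.5, "the text assumes 0 < b < alpha < 1/2"
    a, h = alpha, b / 2
    return [
        (0, 1 - a - h, a),              # (4.16)
        (1 - a - h, 1 - a, 2 - h),      # (4.17)
        (1 - a, 1, 3),                  # (4.18)
        (1, 2 - a - b, 1 + a),          # (4.19)
        (2 - a - b, 2 - a, 3 - b),      # (4.20)
        (2 - a, 2 - a + b, 2),          # (4.21)
        (2 - a + b, 2, 1 + b),          # (4.22)
        (2, 3 - a - b, 2 + a),          # (4.23)
        (3 - a - b, 3 - a - h, 2 - b),  # (4.24)
        (3 - a - h, 3 - a, 1 - h),      # (4.25)
        (3 - a, 3 - a + b, 0),          # (4.26)
        (3 - a + b, 3, 2 + b),          # (4.27)
        (3, 4 - a, 3 + a),              # (4.28)
        (4 - a, 4 - a + b, 1),          # (4.29)
        (4 - a + b, 4, b),              # (4.30)
    ]

def apply_iet(pieces, x):
    """Apply T to a point x in [0, 4); each piece is a translation."""
    for start, end, image_start in pieces:
        if start <= x < end:
            return image_start + (x - start)
    raise ValueError("x must lie in [0, 4)")

def circle_dist(u, v):
    """Distance between u and v on the unit circle [0, 1)."""
    d = abs(u - v) % 1.0
    return min(d, 1.0 - d)

def singularities_mod_1(pieces, digits=9):
    """Left endpoints of the source intervals, reduced modulo 1 and de-duplicated."""
    return sorted({round(start % 1.0, digits) for start, _, _ in pieces})

# Sample parameters (assumptions made for illustration only).
ALPHA, B = math.sqrt(2) - 1, 0.3
PIECES = lb_iet_pieces(ALPHA, B)
```

Since every piece is a translation, preservation of Lebesgue measure amounts to the images tiling [0, 4) without overlap, which is what the tiling check tests.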

Suppose now that we are given a flat finite polysquare-b-rational translation surface, with division numbers \(\{r_ib\}\), \(i=1,\ldots ,R\), where each \(r_i\) is rational and non-zero. Let T denote the interval exchange transformation of the \(\alpha \)-flow on this surface. Suppose that s denotes the number of atomic squares in the underlying polysquare region. Then \(T:[0,s)\rightarrow [0,s)\) is a piecewise linear bijective map that preserves Lebesgue measure, and the singularities of T modulo 1 are

$$\begin{aligned} 0, \;\; 1-\alpha \;\;\hbox {and}\;\; \{r_ib-\alpha \}, \quad i=1,\ldots ,R. \end{aligned}$$
(4.31)

For the remainder of this section, we concentrate on Theorem 2.5 concerning the n-square-b surface in the special case \(b=\{m\alpha \}\) for some integer \(m\geqslant 2\). Our goal here is to establish equidistribution when the GCD Criterion fails. We assume that \(0<\alpha <1\).

The interval exchange transformation is a piecewise linear map \(T:[0,n)\rightarrow [0,n)\). It has 3 singularities 0, \(1-\alpha \) and \(\{(m-1)\alpha \}\) modulo 1. The inverse transformation \(T^{-1}\) has 3 singularities 0, \(\alpha \) and \(\{m\alpha \}\) modulo 1.

The simplest special case is \(n=m=2\) with \(b=\{2\alpha \}\) where \(1/2<\alpha <1\). It is easy to check that the Double Even Criterion, i.e., the GCD Criterion for \(n=2\), fails.

To bring us one step closer to a complete proof of Theorem 2.5, we have the following result on ergodicity.

Lemma 4.1

Consider \(\alpha \)-flow on the n-square-b surface with \(b=\{m\alpha \}\) for some integer \(m\geqslant 2\). Suppose that the GCD Criterion fails. Then the interval exchange transformation \(T=T_{\alpha ;m}:[0,n)\rightarrow [0,n)\) is ergodic.

Proof

We shall prove this by contradiction. Assume on the contrary that T is not ergodic. Then there exists a T-invariant measurable subset \(S_0\subset [0,n)\) such that \(0<{{\,\textrm{meas}\,}}(S_0)<n\), where \({{\,\textrm{meas}\,}}\) denotes 1-dimensional Lebesgue measure. Since T reduces modulo 1 to irrational rotation on the unit interval with the same \(\alpha \), it follows that T modulo 1 is ergodic, and so \(S_0\) modulo 1 is the unit interval [0, 1), implying that \({{\,\textrm{meas}\,}}(S_0)\) is an integer strictly between 0 and n.

The irrational slope \(\alpha \in (0,1)\) has an infinite continued fraction expansion

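$$\begin{aligned} \alpha =\frac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots }}}=[a_1,a_2,a_3,\ldots ], \end{aligned}$$
(4.32)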

where \(a_i\geqslant 1\), \(i=1,2,3,\ldots \), are integers. The rational numbers

$$\begin{aligned} \frac{p_k}{q_k}=\frac{p_k(\alpha )}{q_k(\alpha )}=[a_1,\ldots ,a_k], \quad k=1,2,3,\ldots , \end{aligned}$$
(4.33)

where \(p_k\in {\mathbb {Z}}\) and \(q_k\in {\mathbb {N}}\) are coprime, are the k-convergents of \(\alpha \). It is well known that they give rise to the best rational approximations of the irrational number \(\alpha \), and we have

(4.34)

with \(p_0=0\) and \(q_0=1\).

Let \(\Vert y\Vert \) denote the distance of a real number y from the nearest integer. We shall make use of the fact that for an irrational number \(\alpha \), the sequence

$$\begin{aligned} \min _{1\leqslant k\leqslant n}\Vert k\alpha \Vert , \quad n=1,2,3,\ldots , \end{aligned}$$

is well described by the continued fraction expansion of \(\alpha \).

For every \(k=0,1,2,3,\ldots \), we have

$$\begin{aligned} \Vert q\alpha \Vert\geqslant & {} \Vert q_k\alpha \Vert , \quad 1\leqslant q<q_{k+1},\\ \Vert q_{k+1}\alpha \Vert< & {} \Vert q_k\alpha \Vert ,\nonumber \end{aligned}$$
(4.35)

as well as

(4.36)

Indeed, the sequences \(p_k\) and \(q_k\), \(k=0,1,2,3,\ldots \), are given by the initial values

$$\begin{aligned} p_0=0, \quad p_1=1, \quad q_0=1, \quad q_1=a_1, \end{aligned}$$

and the recurrence relations

$$\begin{aligned} p_{k+1}=a_{k+1}p_k+p_{k-1}, \quad q_{k+1}=a_{k+1}q_k+q_{k-1}, \quad \; k\geqslant 1. \end{aligned}$$
(4.37)

We also have

$$\begin{aligned} p_{k-1}q_k-q_{k-1}p_k=(-1)^k, \quad k\geqslant 1. \end{aligned}$$

On the other hand, using (4.34) and (4.37), it is easy to show that

$$\begin{aligned} \Vert q_{k+1}\alpha \Vert +a_{k+1}\Vert q_k\alpha \Vert =\Vert q_{k-1}\alpha \Vert . \end{aligned}$$
(4.38)
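The recurrences (4.37), the determinant identity and the relations (4.35) and (4.38) are easy to test numerically. The sketch below uses the sample slope \(\alpha =\sqrt{2}-1=[2,2,2,\ldots ]\), an illustrative choice with \(\alpha <1/2\); the helper names are ours. Floating-point arithmetic limits the check to the first few convergents.

```python
import math

def convergents(digits):
    """Return lists p, q with p[k]/q[k] = [a_1, ..., a_k], built from (4.37).

    digits[0] is a_1; the initial values p[0] = 0, q[0] = 1 follow the text.
    """
    p, q = [0, 1], [1, digits[0]]
    for a in digits[1:]:              # a plays the role of a_{k+1}
        p.append(a * p[-1] + p[-2])
        q.append(a * q[-1] + q[-2])
    return p, q

def dist_to_int(y):
    """The quantity ||y||: distance from y to the nearest integer."""
    return abs(y - round(y))

def identity_4_38_error(alpha, digits, k):
    """Absolute error in ||q_{k+1} a|| + a_{k+1} ||q_k a|| = ||q_{k-1} a|| for k >= 1."""
    _, q = convergents(digits)
    lhs = dist_to_int(q[k + 1] * alpha) + digits[k] * dist_to_int(q[k] * alpha)
    return abs(lhs - dist_to_int(q[k - 1] * alpha))

# Sample slope (an assumption for illustration): sqrt(2) - 1 has all digits equal to 2.
ALPHA = math.sqrt(2) - 1
DIGITS = [2] * 10
P, Q = convergents(DIGITS)
```

For this slope the denominators are 1, 2, 5, 12, 29, and so on, and `identity_4_38_error(ALPHA, DIGITS, k)` stays at the level of rounding error for the values of k covered by `DIGITS`.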

The following result is known as the 3-distance theorem. This surprising geometric fact, formulated as a conjecture by Steinhaus, has many proofs, by Sós [10, 11], Świerczkowski [14], Surányi [13], Halton [4] and Slater [9], with others published more recently.

Lemma 4.2

Consider the \(N+1\) numbers \(0,\alpha ,2\alpha ,3\alpha ,\ldots ,N\alpha \) modulo 1 in the unit torus/circle [0, 1), leading to an \((N\,{+}\,1)\)-partition. This partition exhibits at most three different distances between neighboring points. Furthermore, every positive integer N can be expressed uniquely in the form

$$\begin{aligned} N=\mu q_k+q_{k-1}+r, \quad \hbox {with}\;\;1\leqslant \mu \leqslant a_{k+1}\;\;\text {and}\;\; 0\leqslant r<q_k, \end{aligned}$$

in terms of the continued fraction (4.32) of \(\alpha \) and its convergents (4.33), with the convention that \(q_0=1\) and \(q_{-1}=0\). Then

  1. (i)

    the distance \(\Vert q_k\alpha \Vert \) shows up precisely \(N+1-q_k\) times;

  2. (ii)

    the distance \(\Vert q_{k-1}\alpha \Vert -\mu \Vert q_k\alpha \Vert \) shows up precisely \(r+1\) times; and

  3. (iii)

    the distance \(\Vert q_{k-1}\alpha \Vert -(\mu -1)\Vert q_k\alpha \Vert \) shows up precisely \(q_k-r-1\) times.
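Lemma 4.2 can be illustrated numerically. The Python sketch below, again with the sample slope \(\alpha =\sqrt{2}-1=[2,2,2,\ldots ]\) and with helper names of our own choosing, computes the decomposition \(N=\mu q_k+q_{k-1}+r\), lists the three predicted distances with their multiplicities, and compares them with the gaps actually observed. For simplicity it only treats the case \(k\geqslant 1\), which requires \(N>a_1\).

```python
import math

def convergent_denominators(digits):
    """q[0] = 1, q[1] = a_1, q[k+1] = a_{k+1} q[k] + q[k-1]; digits[0] is a_1."""
    q = [1, digits[0]]
    for a in digits[1:]:
        q.append(a * q[-1] + q[-2])
    return q

def dist_to_int(y):
    return abs(y - round(y))

def decompose(N, digits):
    """Find (k, mu, r) with N = mu*q_k + q_{k-1} + r, 1 <= mu <= a_{k+1}, 0 <= r < q_k."""
    q = convergent_denominators(digits)
    for k in range(1, len(digits)):
        mu, r = divmod(N - q[k - 1], q[k])
        if 1 <= mu <= digits[k]:      # digits[k] is a_{k+1}
            return k, mu, r
    raise ValueError("N is outside the range covered by the supplied digits")

def predicted_gaps(N, alpha, digits):
    """(distance, multiplicity) pairs from parts (i)-(iii) of the lemma."""
    q = convergent_denominators(digits)
    k, mu, r = decompose(N, digits)
    dk = dist_to_int(q[k] * alpha)
    dk1 = dist_to_int(q[k - 1] * alpha)
    return [(dk, N + 1 - q[k]),
            (dk1 - mu * dk, r + 1),
            (dk1 - (mu - 1) * dk, q[k] - r - 1)]

def observed_gaps(N, alpha):
    """Gaps between neighboring points of {j*alpha}, 0 <= j <= N, on the circle."""
    pts = sorted((j * alpha) % 1.0 for j in range(N + 1))
    return [v - u for u, v in zip(pts, pts[1:])] + [1.0 - pts[-1] + pts[0]]

def count_close(gaps, value, tol=1e-9):
    return sum(1 for g in gaps if abs(g - value) < tol)

# Sample slope and digits (assumptions for illustration).
ALPHA, DIGITS = math.sqrt(2) - 1, [2] * 8
```

For example, with N = 10 one finds k = 2, \(\mu =1\) and r = 3, so the three distances should occur 6, 4 and 1 times respectively, and `count_close(observed_gaps(10, ALPHA), d)` can be compared against each predicted multiplicity.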

Given an integer \(k\geqslant 1\), consider the partition of the unit torus/circle [0, 1) with \(q_{k+1}=q_{k+1}(\alpha )\) division points \(\{q\alpha \}\), \(-1\leqslant q\leqslant q_{k+1}-2\). Note that the choices \(q=-1,0\) in \(\{q\alpha \}\) represent two of the singularities of the interval exchange transformation T restricted to the interval [0, 1).

A consequence of the special choice \(N=q_{k+1}-1\) is that the 3-distance theorem simplifies to a 2-distance theorem. This in turn leads to some very useful information concerning the distances between neighboring points of the \(q_{k+1}\)-partition of the unit torus/circle [0, 1). Indeed, using the second recurrence relation in (4.37), we have

$$\begin{aligned} N=q_{k+1}-1=a_{k+1}q_k+q_{k-1}-1=\mu q_k+q_{k-1}+r, \end{aligned}$$

with \(\mu =a_{k+1}-1\) and \(r=q_k-1\). Since \(q_k-r-1=0\), it follows from the 3-distance theorem that there are only two distances

$$\begin{aligned} \Vert q_k\alpha \Vert \quad \hbox {and}\quad \Vert q_{k-1}\alpha \Vert -(a_{k+1}-1)\Vert q_k\alpha \Vert =\Vert q_{k+1}\alpha \Vert +\Vert q_k\alpha \Vert , \end{aligned}$$
(4.39)

in view of (4.38).

It follows immediately from (4.35) that one of the neighbors of 0 in the partition is \(\{q_k\alpha \}\), which clearly has distance \(\Vert q_k\alpha \Vert \) from 0 in the unit torus/circle. Since \(\alpha \) is irrational, the other neighbor of 0 in the partition must have distance \(\Vert q_{k+1}\alpha \Vert +\Vert q_k\alpha \Vert \) from 0 in the unit torus/circle. Simple calculation then shows that it is \(\{(q_{k+1}-q_k)\alpha \}\). Thus the two neighbors

$$\begin{aligned} \{q_k\alpha \} \quad \hbox {and}\quad \{(q_{k+1}-q_k)\alpha \} \end{aligned}$$

of 0 in the partition exhibit the two gaps in (4.39) in some order. Similarly, the two neighbors

$$\begin{aligned} \{(q_k-1)\alpha \} \quad \hbox {and}\quad \{(q_{k+1}-q_k-1)\alpha \} \end{aligned}$$

of \(1-\alpha =\{-\alpha \}\) in the partition exhibit the same two gaps in (4.39) in the same order. Furthermore, for every integer \(q=1,\ldots ,m-1\), the two neighbors

$$\begin{aligned} \{(q_k+q)\alpha \} \quad \hbox {and}\quad \{(q_{k+1}-q_k+q)\alpha \} \end{aligned}$$

of \(\{q\alpha \}\) in the partition also exhibit the same two gaps in (4.39) in the same order.
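The 2-distance phenomenon and the location of the neighbors of 0 can be observed directly. The following Python sketch, using the sample slope \(\alpha =\sqrt{2}-1=[2,2,2,\ldots ]\) and helper names of our own, builds the partition with division points \(\{j\alpha \}\), \(-1\leqslant j\leqslant q_{k+1}-2\), finds which indices j give the two neighbors of 0, and lists the gaps so that they can be compared with the two values in (4.39).

```python
import math

def convergent_denominators(digits):
    """q[0] = 1, q[1] = a_1, q[k+1] = a_{k+1} q[k] + q[k-1]; digits[0] is a_1."""
    q = [1, digits[0]]
    for a in digits[1:]:
        q.append(a * q[-1] + q[-2])
    return q

def dist_to_int(y):
    return abs(y - round(y))

def partition_points(alpha, k, q):
    """Sorted pairs ({j*alpha}, j) for -1 <= j <= q[k+1] - 2."""
    return sorted(((j * alpha) % 1.0, j) for j in range(-1, q[k + 1] - 1))

def neighbors_of_zero(alpha, k, q):
    """Indices j of the two division points adjacent to 0 on the circle."""
    idx = [j for _, j in partition_points(alpha, k, q)]
    i = idx.index(0)
    return {idx[i - 1], idx[(i + 1) % len(idx)]}

def gaps(alpha, k, q):
    """All gaps between neighboring division points, including the wrap-around gap."""
    pts = [x for x, _ in partition_points(alpha, k, q)]
    return [v - u for u, v in zip(pts, pts[1:])] + [1.0 - pts[-1] + pts[0]]

# Sample slope and digits (assumptions for illustration).
ALPHA, DIGITS = math.sqrt(2) - 1, [2] * 8
Q = convergent_denominators(DIGITS)
```

For k = 2 the denominators are \(q_2=5\) and \(q_3=12\), and `neighbors_of_zero(ALPHA, 2, Q)` returns the indices 5 and 7, that is, \(q_k\) and \(q_{k+1}-q_k\).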

The union of the left and right neighborhoods of 0 in the partition has the form

(4.40)

Indeed, the union of the left and right neighborhoods of \(\{q\alpha \}\), \(q=-1,0,1,\ldots ,m-1\), in the partition has the form

(4.41)

with the two gaps in the same order, where

(4.42)

but we have not specified which one is which. We refer to B(q), \(q=-1,0,m-1\), as the buffer zones of the singularities \(\{-\alpha \}\), 0, \(\{(m-1)\alpha \}\) respectively of T.

Now suppose that \(q_{k+1}\) is much greater than m.

We consider the short special intervals

(4.43)

Note that these short special intervals have three crucial properties:

  1. (i)

    They completely cover the \(m+1\) long special intervals determined by the \(m+1\) division points \(\{q\alpha \}\), \(q=-1,0,1,\ldots ,m-1\), of the torus/circle [0, 1).

  2. (ii)

    They avoid all the division points \(\{q\alpha \}\), \(q=-1,0,1,\ldots ,m-1\), in view of (4.40)–(4.43). In particular, they avoid the singularities of T.

  3. (iii)

    Any two short special intervals contained inside the same long special interval in (i) and arising from neighboring partition points exhibit substantial overlapping. More precisely, if \(q',q''\) are two integers such that \(m\leqslant q',q''\leqslant q_{k+1}-2\), and \(\{q'\alpha \}\) and \(\{q''\alpha \}\) are neighboring points in the partition, and both points are in the same long special interval in (i), then

    (4.44)

    Note the trivial upper bound

    (4.45)

    Then (4.44) and (4.45) together justify the term substantial overlapping.

Since T acts on the interval [0, n), for every interval \(J_k(q)\), \(m\leqslant q\leqslant q_{k+1}-2\), given by (4.43), we define its n-copy extension \(J_k(q;n)\) by

$$\begin{aligned} J_k(q;n)=J_k(q)\cup (1+J_k(q))\cup \cdots \cup (n-1+J_k(q))\subset [0,n), \end{aligned}$$

a union of \(J_k(q)\) with \(n-1\) of its translates.

Lemma 4.3

Let \(\varepsilon <1/100\) be positive and fixed. Provided that the positive integer k is sufficiently large, there exists an integer \(q^*\) such that \(m\leqslant q^*\leqslant q_{k+1}-2\) and for each \(\ell =0,1,\ldots ,n-1\), we have either

(4.46)

or

(4.47)

Remark 4.4

Lemma 4.3 resembles Lebesgue’s Density Theorem. It is rather tempting to say that the latter almost implies the former, or at least makes the former quite plausible. Nevertheless, our formal proof below does not make use of Lebesgue’s Density Theorem, just the definition of Lebesgue measure.

Proof of Lemma 4.3

Since \(S_0\) is Lebesgue measurable, given any \(\eta >0\), there exists a finite set of disjoint intervals \(I_h\), \(1\leqslant h\leqslant H=H(S_0;\eta )\), such that the union

(4.48)

gives an \(\eta \)-approximation of \(S_0\), in the sense that the symmetric difference satisfies the condition

(4.49)

We will specify a suitable value of \(\eta =\eta (\varepsilon )>0\) later.

A short special interval \(\ell +J_k(q)\), where \(\ell =0,1,\ldots ,n-1\) and \(m\leqslant q\leqslant q_{k+1}-2\), is said to be V-nice if it is either completely contained in V, or it is disjoint from V.

Since V given by (4.48) is a finite union of disjoint intervals, it is clear that there exists an integer-valued threshold \(k=k(S_0;V;\eta )\) such that the union of the V-nice short special intervals \(\ell +J_k(q)\), with \(\ell =0,1,\ldots ,n-1\) and \(m\leqslant q\leqslant q_{k+1}-2\), has measure at least \(n(1-\eta )\).

On the other hand, let denote the set of short special intervals \(\ell +J_k(q)\), where \(\ell =0,1,\ldots ,n-1\) and \(m\leqslant q\leqslant q_{k+1}-2\), that are bad in the sense that

(4.50)

Then it follows from (4.49) and (4.50) that

where the factor 1/3 arises from the observation that an interval \(J_k(q)\) intersects at most two other such intervals, namely its left and right neighbors, and denotes the cardinality of the set . Combining this with (4.36), we deduce that

(4.51)

Since \(\eta >0\) can be arbitrarily small, we choose \(\eta =\varepsilon ^2/6\). Then (4.51) simplifies to . Since \(\varepsilon \) is small, the bad short special intervals in form a small minority of the short special intervals under consideration.

Thus the overwhelming majority of the short special intervals under consideration are V-nice and violate (4.50). A routine application of the Pigeonhole Principle now implies the existence of an integer \(q^*\) such that \(m\leqslant q^*\leqslant q_{k+1}-2\) and each interval \(\ell +J_k(q^*)\), \(\ell =0,1,\ldots ,n-1\), is V-nice and violates (4.50). For such an interval \(\ell +J_k(q^*)\), it follows from (4.42) and (4.43) that

Since is a disjoint union, it follows that

(4.52)

Suppose first of all that \(\ell +J_k(q^*)\) is completely contained in V. Then

(4.53)

while

(4.54)

The assertion (4.46) now follows on combining (4.52)–(4.54).

Suppose next that \(\ell +J_k(q^*)\) is disjoint from V. Then

(4.55)

while

(4.56)

The assertion (4.47) now follows on combining (4.52), (4.55) and (4.56). \(\square \)

In view of Lemma 4.3, we can define an ordered n-tuple

$$\begin{aligned} \Theta (k,q^*)=(\theta _0(k,q^*),\theta _1(k,q^*),\ldots ,\theta _{n-1}(k,q^*)), \end{aligned}$$

where, for \(\ell =0,1,\ldots ,n-1\),

$$\begin{aligned} \theta _\ell (k,q^*)={\left\{ \begin{array}{ll} \,1,&{}\hbox {if } \ell +J_k(q^*) \hbox { satisfies (4.46)},\\ \,0,&{}\hbox {if } \ell +J_k(q^*) \hbox { satisfies (4.47)}. \end{array}\right. } \end{aligned}$$

We are now in a position to complete the proof of Lemma 4.1.

The first key step in our argument is the extension of the local set \(J_k(q^*;n)\) globally via a T-power argument.

Consider an arbitrary set \(J_k(q;n)\) such that \(m\leqslant q\leqslant q_{k+1}-2\) and \(q\ne q^*\). Then

$$\begin{aligned} J_k(q;n)=T^{q-q^*}J_k(q^*;n). \end{aligned}$$

Note that \(S_0\subset [0,n)\) is T-invariant, and that the three singularities

modulo 1 never split the intervals in the process of iterated applications of the transformation T. It follows that \(J_k(q;n)\cap S_0\) defines an ordered n-tuple \(\Theta (k,q)\) which is either equal to \(\Theta (k,q^*)\) or has the entries permuted.

The second key step in our argument concerns taking advantage of property (iii) earlier concerning substantial overlappings of the intervals \(J_k(q)\).

Recall that the division points

$$\begin{aligned} \{q\alpha \}, \quad q=-1,0,1,\ldots ,m-1, \end{aligned}$$

of the torus/circle [0, 1) give rise to \(m+1\) long special intervals in the torus/circle [0, 1). They lead naturally to \(n(m+1)\) division points and \(n(m+1)\) long special intervals in [0, n). Now the sets \(J_k(q;n)\), \(m\leqslant q\leqslant q_{k+1}-2\), lead to \(n(q_{k+1}-m-1)\) intervals which give rise to \(n(m+1)\) collections of substantially overlapping intervals in [0, n). These \(n(m+1)\) collections cover the \(n(m+1)\) disjoint long special intervals. Due to the substantial overlappings, neighboring short special intervals in the same collection must have identical ordered n-tuples \(\Theta (k,q)\).

It follows that the short special intervals within any given long special interval must either all satisfy (4.46) or all satisfy (4.47). This means that the given long special interval is \(\varepsilon \)-almost entirely in \(S_0\), in the sense that

(4.57)

or is \(\varepsilon \)-almost disjoint from \(S_0\), in the sense that

Let the set \(S_0^*\subset [0,n)\) be defined as follows, apart from the \(n(m+1)\) division points that give rise to the long special intervals. For every long special interval , we set

Then each of the long special intervals in [0, n) is either entirely contained in \(S_0^*\) or disjoint from \(S_0^*\). It then remains to prove that if the GCD Criterion fails, then such a set \(S_0^*\) cannot exist. Our argument is to show that the existence of such a set \(S_0^*\) would give rise to a multi-coloring of the n-square-b surface, sufficiently restricted as to allow us to derive the necessary contradiction.

The \(n(m+1)\) division points in [0, n) that give rise to the \(n(m+1)\) long special intervals are

Here, for each \(\ell =0,1,\ldots ,n-1\), we view the left vertical edge of the \((\ell +1)\)-th square face of the n-square-b surface as the interval \([\ell ,\ell +1)\). We can then 2-color these n intervals to distinguish points in \(S_0^*\) from points not in \(S_0^*\). This gives rise to a 2-coloring of the set [0, n).

For ease of description, let us denote the bottom left vertex of the 1-st square face and the top right vertex of the n-th square face of the n-square-b surface by (0, 0) and (n, 1) respectively. Using the \(\alpha \)-flow, the \(n(m+1)\) division points now lead to \(n(m+1)\) line segments, linking pairs of points

(4.58)

lying on the \((\ell \,{+}\,1)\)-th square face of the n-square-b surface.

We shall show later that the line segments linking the points

$$\begin{aligned} (\ell ,\{-\alpha \}) \;\;\hbox {and}\;\; (\ell +1,1), \quad \ell =0,1,\ldots ,n-1, \end{aligned}$$
(4.59)

do not come into the argument.

Note first of all that \(\{m\alpha \}\) is not a division point of the vertical edges of the n-square-b surface. As in Sect. 2, write \(b=b_\nu =\{m\alpha \}\). Using (2.3), we see that \(\{m\alpha \}\) is in the interior of the interval \([b_{\nu -1},b_{\nu +1})\), of the form (2.6).

Suppose that , so that precisely \(\tau \) of the intervals

$$\begin{aligned} \ell +[b_{\nu -1},b_{\nu +1}), \quad \ell =0,1,\ldots ,n-1, \end{aligned}$$
(4.60)

belong to \(S_0^*\). Assign the color B or W to an interval in (4.60) according to whether it is contained in \(S_0^*\) or is disjoint from \(S_0^*\). This gives rise to a 2-coloring sequence of n terms, corresponding to the n intervals (4.60) on the left vertical edges of the square faces of the n-square-b surface and made up of \(\tau \) copies of B and \((n-\tau )\) copies of W. We determine a shortest subsequence of consecutive terms of this 2-coloring sequence of length \(d>1\) such that the 2-coloring sequence modulo n is the d-term subsequence repeated n/d times. In particular, the number \(d>1\) must divide n. In view of cyclic periodicity, we may restrict our attention to the d square faces corresponding to this d-term subsequence.

For these d square faces under consideration, we color the interval \([b_{\nu -1},b_{\nu +1})\) on the left vertical edges according to the 2-coloring subsequence of length d, and then use the \(\alpha \)-flow to spread this 2-coloring of the intervals to the relevant square faces. Figure 21 below, which is not to scale, illustrates our observations thus far in the case \(d=4\), where the 2-coloring subsequence WBBB has length 4.

Fig. 21
figure 21

A partial 2-coloring on four consecutive square faces of the n-square-b surface where 4 divides n

We next consider the line segments linking the pairs of points

(4.61)

on the square faces under consideration. Since is one of the division points, it follows from (2.3) that there exists a unique \(\mu =0,1,\ldots ,m\) such that \(\mu \ne \nu \) and . Let \(b_{\mu -1}\) and \(b_{\mu +1}\) be the closest division points to \(b_\mu \) from below and above respectively. We next investigate the coloring of the intervals \([b_{\mu -1},b_\mu )\) and \([b_\mu ,b_{\mu +1})\) on the left vertical edges of the square faces. The T-invariance of \(S_0\), and hence \(S_0^*\), clearly dictates that the interval \([b_\mu ,b_{\mu +1})\) must have the same coloring as the interval \([b_{\nu -1},b_{\nu +1})\) on the left vertical edge of the same square face, whereas the interval \([b_{\mu -1},b_\mu )\) must have the same coloring as the interval \([b_{\nu -1},b_{\nu +1})\) on the left vertical edge of the square face immediately to the right. Figure 22 continues with our example, and we see that the 2-coloring sequence of the intervals \([b_\mu ,b_{\mu +1})\) in the four square faces remain WBBB, whereas the 2-coloring sequence of the intervals \([b_{\mu -1},b_\mu )\) in the four square faces becomes BBBW, representing a shift by 1 to the left of the original 2-coloring sequence. Furthermore, the color pattern sequence across the line segments (4.61) is

$$\begin{aligned} WB,\;BB,\;BB,\;BW, \end{aligned}$$
(4.62)

where, for instance, WB denotes W above and B below.

Fig. 22
figure 22

Extending the 2-coloring on four consecutive square faces of the n-square-b surface where 4 divides n

Let \(b_{\mu -2}\) denote the closest division point to \(b_{\mu -1}\) from below, and let \(b_{\nu -2}\) denote the closest division point to \(b_{\nu -1}\) from below. Note that \(b_{\nu -2}=\{b_{\mu -2}+\alpha \}\). We next consider the line segments linking the pairs of points \((\ell ,b_{\mu -2})\) and \((\ell +1,b_{\nu -2})\) on the square faces under consideration. As these line segments can be obtained from those in (4.61) under the inverse transformation \(T^{-1}\), the color patterns across them must be preserved. Cyclic periodicity and the maximality of d then dictate that they must appear in the same order modulo d. The unique solution to this problem is a shift by 1 to the left of the earlier color pattern sequence across the line segments. Figure 23 illustrates this observation in our continuing example.

Fig. 23
figure 23

Cyclic periodicity at work on four consecutive square faces of the n-square-b surface where 4 divides n

Note that the new color pattern sequence across the line segments is

$$\begin{aligned} BB,\;BB,\;BW,\;WB, \end{aligned}$$
(4.63)

a shift to the left by 1 of the sequence (4.62).

Let \(b_{\mu -3}\) denote the closest division point to \(b_{\mu -2}\) from below, and let \(b_{\nu -3}\) denote the closest division point to \(b_{\nu -2}\) from below. Note that \(b_{\nu -3}=\{b_{\mu -3}+\alpha \}\). We next consider the line segments linking the pairs of points \((\ell ,b_{\mu -3})\) and \((\ell +1,b_{\nu -3})\) on the square faces under consideration. Again, the color patterns across them must be preserved. Cyclic periodicity and the maximality of d then dictate that they must again appear in the same order modulo d. The unique solution to this problem is a shift by 1 to the left of the earlier color pattern sequence across the line segments. Figure 24 illustrates this observation in our continuing example.

Fig. 24
figure 24

Cyclic periodicity at work on four consecutive square faces of the n-square-b surface where 4 divides n

Note that the new color pattern sequence across the line segments is

$$\begin{aligned} BB,\;BW,\;WB,\;BB \end{aligned}$$

a shift to the left by 1 of the sequence (4.63).
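The bookkeeping in the continuing example is purely combinatorial and can be replayed in a few lines. The Python sketch below, with function names of our own, starts from the 2-coloring subsequence WBBB, forms the color patterns across the line segments by pairing it with its shift by 1 to the left, and then applies two further left shifts. It also includes a helper that finds the shortest period of a cyclic coloring sequence; note that the text additionally requires \(d>1\), which this helper does not enforce.

```python
def shift_left(seq, steps=1):
    """Cyclic shift of a list to the left by the given number of steps."""
    steps %= len(seq)
    return seq[steps:] + seq[:steps]

def shortest_period(seq):
    """Smallest d dividing len(seq) such that seq is its first d terms repeated."""
    n = len(seq)
    for d in range(1, n + 1):
        if n % d == 0 and seq[:d] * (n // d) == seq:
            return d

def patterns_across(upper, lower):
    """Color patterns such as 'WB', meaning W above and B below a line segment."""
    return [u + l for u, l in zip(upper, lower)]

upper = list("WBBB")                     # colors of the intervals [b_mu, b_{mu+1})
lower = shift_left(upper)                # colors of the intervals [b_{mu-1}, b_mu)
first = patterns_across(upper, lower)    # the sequence (4.62)
second = shift_left(first)               # the sequence (4.63)
third = shift_left(second)               # the sequence displayed after (4.63)
```

Here `lower` is BBBW, and the three pattern sequences reproduce (4.62), (4.63) and the final display in turn.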

The reader will by now observe that this 2-coloring can be replaced by a d-coloring, and all the properties we have described so far will be preserved. For instance, for our continuing example, Fig. 24 can be replaced by Fig. 25.

Fig. 25
figure 25

A partial 4-coloring on four consecutive square faces of the n-square-b surface where 4 divides n

Proceeding in the same way will allow us eventually to d-color the entire d square faces under consideration.

We have claimed earlier that the line segments linking the points (4.59) do not come into the argument. It can easily be shown that the color pattern across each of these line segments is monochromatic. Thus there are only m division points on the left vertical edge of each square face. The cyclic periodicity described earlier now shows that we have a double periodic d-coloring pattern on the d square faces analogous to (2.7) and which can be obtained by the Double Periodic Coloring Algorithm. It follows that d must divide m. Furthermore, each color represents a T-invariant subset of the n-square-b surface. It now follows from Lemma 2.3 that d also divides \(\Upsilon (m;\alpha )\). Since \(d\geqslant 2\), it follows that the GCD Criterion is satisfied, a contradiction. Hence such a non-trivial T-invariant subset \(S_0\) of [0, n) does not exist, and it follows that T is ergodic. \(\square \)

Lemma 4.1 tells us that if the GCD Criterion fails, then T is ergodic. Birkhoff’s ergodic theorem then gives equidistribution of the half-infinite \(\alpha \)-geodesic on the n-square-b surface for almost every starting point. This time-qualitative result arises from our 2-distance method. Later in Sect. 6, we shall extend this result to half-infinite geodesics with any starting point.

5 Starting the proof of Theorem 3.3: proving ergodicity

Let be an arbitrary flat finite polysquare-b-rational translation surface with s atomic squares and with division numbers \(\{r_ib\}\), \(i=1,\ldots ,R\), where b is irrational and each \(r_i\) is rational and non-zero. Let

denote the interval exchange transformation of the \(\alpha \)-flow on this surface. Then T maps the interval [0, s) to itself.

Since reflecting a polysquare-b-rational translation surface across a horizontal or vertical line gives rise to another polysquare-b-rational translation surface, we can assume, without loss of generality, that the slope satisfies \(0<\alpha <\infty \).

To bring us one step closer to a complete proof of Theorem 3.3, we have the following analog of Lemma 4.1 on ergodicity.

Lemma 5.1

Consider \(\alpha \)-flow on a polysquare-b-rational translation surface with s atomic squares and division numbers \(\{r_ib\}\), \(i=1,\ldots ,R\), where b is irrational and each \(r_i\) is rational and non-zero. Assume that

(5.1)

Then the interval exchange transformation \(T:[0,s)\rightarrow [0,s)\) is ergodic.

Proof

We shall prove this by contradiction. Assume on the contrary that T is not ergodic. Then there exists a T-invariant measurable subset \(S_0\subset [0,s)\) such that \(0<{{\,\textrm{meas}\,}}(S_0)<s\). Since T reduces modulo 1 to irrational rotation on the unit interval with the same \(\alpha \), it follows that T modulo 1 is ergodic, and so \(S_0\) modulo 1 is the unit interval [0, 1), implying that \({{\,\textrm{meas}\,}}(S_0)\) is an integer strictly between 0 and s.

The irrational slope \(\alpha \in (0,\infty )\) has an infinite continued fraction expansion

$$\begin{aligned} \alpha =[a_0;a_1,a_2,a_3,\ldots ], \end{aligned}$$
(5.2)

where \(a_0\geqslant 0\), \(a_i\geqslant 1\), \(i=1,2,3,\ldots \), are integers. The rational numbers

$$\begin{aligned} \frac{p_k}{q_k}=\frac{p_k(\alpha )}{q_k(\alpha )}=[a_0;a_1,\ldots ,a_k], \quad k=0,1,2,3,\ldots , \end{aligned}$$

where \(p_k\in {\mathbb {Z}}\) and \(q_k\in {\mathbb {N}}\) are coprime, are the k-convergents of \(\alpha \). Since \(\alpha \) is badly approximable, there exists an integer A such that

$$\begin{aligned} a_0,a_1,a_2,a_3,\ldots \leqslant A. \end{aligned}$$
(5.3)
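As a numerical illustration, the convergents can be generated from the digits by the standard recurrences \(p_k=a_kp_{k-1}+p_{k-2}\) and \(q_k=a_kq_{k-1}+q_{k-2}\). The following is a minimal sketch; the function name `convergents` and the sample slope \(\sqrt{2}=[1;2,2,2,\ldots ]\), for which (5.3) holds with \(A=2\), are assumptions made purely for illustration.

```python
from math import gcd

def convergents(digits):
    """Return the pairs (p_k, q_k) for the continued fraction [a_0; a_1, ..., a_n]."""
    p_prev, q_prev = 1, 0          # (p_{-1}, q_{-1})
    p, q = digits[0], 1            # (p_0, q_0)
    result = [(p, q)]
    for a in digits[1:]:
        # Standard recurrences p_k = a_k p_{k-1} + p_{k-2}, q_k = a_k q_{k-1} + q_{k-2}.
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

# Illustrative choice (an assumption): sqrt(2) = [1; 2, 2, 2, ...], digits bounded by A = 2.
A = 2
digits = [1] + [2] * 6
conv = convergents(digits)
all_coprime = all(gcd(p, q) == 1 for p, q in conv)
```

Here `conv` begins with the pairs (1, 1), (3, 2), (7, 5), (17, 12), and each pair is coprime, as stated above.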

As in the proof of Lemma 4.1, we again use the 3-distance theorem as stated in Lemma 4.2. We also work with the same partition of the unit torus/circle [0, 1) with \(q_{k+1}=q_{k+1}(\alpha )\) division points \(\{q\alpha \}\), \(-1\leqslant q\leqslant q_{k+1}-2\). Note that the choices \(q=-1,0\) in \(\{q\alpha \}\) represent two of the singularities of the interval exchange transformation T restricted to the interval [0, 1). Here \(k\geqslant 1\) is an integer chosen to be sufficiently large.

Recall that for the special choice \(n=q_{k+1}-1\), the 3-distance theorem simplifies to a 2-distance theorem, and that there are only two distances

$$\begin{aligned} \Vert q_k\alpha \Vert \quad \hbox {and}\quad \Vert q_{k+1}\alpha \Vert +\Vert q_k\alpha \Vert \end{aligned}$$

between adjacent points of this partition. Thus the union of the left and right neighborhoods of 0 in the partition has the form

(5.4)

while the union of the left and right neighborhoods of \(\{-\alpha \}\) in the partition has the form

(5.5)

with the two gaps in the same order, where

(5.6)

but we have not specified which one is which.
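The 2-distance phenomenon recalled above can be checked numerically. The sketch below is an illustration only: the slope \(\sqrt{2}\), the denominators \(q_k=12\) and \(q_{k+1}=29\), and the helper names are assumptions, and floating-point arithmetic is used with a small tolerance.

```python
from math import sqrt, isclose

def dist_to_nearest_int(x):
    """The quantity ||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))

def gap_lengths(alpha, n_points, start=-1):
    """Gaps between adjacent points {q*alpha}, start <= q < start + n_points, on the circle."""
    pts = sorted((q * alpha) % 1.0 for q in range(start, start + n_points))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])      # wrap-around gap
    return gaps

alpha = sqrt(2)                 # illustrative badly approximable slope
q_k, q_k1 = 12, 29              # consecutive convergent denominators of sqrt(2)
gaps = gap_lengths(alpha, q_k1) # division points {q*alpha}, -1 <= q <= q_{k+1} - 2

short = dist_to_nearest_int(q_k * alpha)
long_ = short + dist_to_nearest_int(q_k1 * alpha)
n_short = sum(isclose(g, short, abs_tol=1e-9) for g in gaps)
n_long = sum(isclose(g, long_, abs_tol=1e-9) for g in gaps)
```

In this example every gap has one of the two lengths \(\Vert q_k\alpha \Vert \) and \(\Vert q_{k+1}\alpha \Vert +\Vert q_k\alpha \Vert \), with \(q_{k+1}-q_k=17\) short gaps and \(q_k=12\) long gaps.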

We consider the short special intervals

(5.7)

Note that these short special intervals have three crucial properties:

  1. (i)

    They completely cover the two long special intervals determined by the two division points 0 and \(\{-\alpha \}\) of the torus/circle [0, 1).

  2. (ii)

    They avoid the singularities 0 and \(\{-\alpha \}\) of T, in view of (5.4)–(5.7).

  3. (iii)

    Any two short special intervals inside the same long special interval in (i) arising from neighboring partition points exhibit substantial overlapping.

Recall from (4.31) that the singularities of T modulo 1 are 0 and \(\{-\alpha \}\), together with \(\{r_ib-\alpha \}\), \(i=1,\ldots ,R\). These latter singularities require extra care, and we deviate from the proof of Lemma 4.1. Our new argument depends on a crucial but rather complicated technical lemma. To formulate this, we first need some notation and definitions.

For each \(i=1,\ldots ,R\), we write the non-zero rational number \(r_i\) in the form

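$$\begin{aligned} r_i=\frac{u_i}{v_i}, \quad \hbox {where } u_i \hbox { and } v_i \hbox { are non-zero integers}. \end{aligned}$$
(5.8)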

Furthermore, we write

$$\begin{aligned} U=\max _{1\leqslant i\leqslant R}\vert u_i\vert \quad \hbox {and}\quad V=\max _{1\leqslant i\leqslant R}\vert v_i\vert . \end{aligned}$$
(5.9)

On the other hand, let the integer \(k\geqslant 1\) be given. For every \(i=1,\ldots ,R\), let

$$\begin{aligned} h_k(i;+)=h_k(\alpha ;r_ib;+) \quad \hbox {and}\quad h_k(i;-)=h_k(\alpha ;r_ib;-) \end{aligned}$$

denote the two integers satisfying

$$\begin{aligned} -1\leqslant h_k(i;+),h_k(i;-)\leqslant q_{k+1}-2 \end{aligned}$$
(5.10)

such that \(\{h_k(i;+)\alpha \}\) and \(\{h_k(i;-)\alpha \}\) are the two neighbors of the singularity \(\{r_ib-\alpha \}\) in the partition of the unit torus/circle [0, 1). Clearly, for \(\sigma =\pm \), we have

(5.11)

It then follows from (5.11) that for every \(i=1,\ldots ,R\) and \(\sigma =\pm \),

$$\begin{aligned} h_k(i;\sigma )=h_k(\alpha ;r_ib;\sigma )\rightarrow \infty \quad \hbox {as } k\rightarrow \infty , \end{aligned}$$

for otherwise there exists an integer value m such that \(h_k(i;\sigma )=m\) for infinitely many distinct values of k, and so the corresponding limit in (5.11) must have the value \(\Vert m\alpha -\{r_ib-\alpha \}\Vert =0\), which contradicts the hypothesis (5.1).

We have the following separation lemma.

Lemma 5.2

Let \(\alpha \in (0,\infty )\) be badly approximable, with continued fraction (5.2) and digits satisfying (5.3), where A is a fixed positive integer. Write

(5.12)

where U and V are given by (5.8) and (5.9). Then there exists an infinite set of positive integers k such that for every k in this set, the following hold:

  1. (i)

    For every \(i=1,\ldots ,R\) and \(\sigma =\pm \), we have

    $$\begin{aligned} \delta q_{k+1}<h_k(i;\sigma )<(1-\delta )q_{k+1}. \end{aligned}$$
    (5.13)
  2. (ii)

    For every \(i_1,i_2=1,\ldots ,R\) and \(\sigma _1,\sigma _2=\pm \) such that \((i_1,\sigma _1)\ne (i_2,\sigma _2)\), we have

    $$\begin{aligned} \vert h_k(i_1;\sigma _1)-h_k(i_2;\sigma _2)\vert >\delta q_{k+1}. \end{aligned}$$
    (5.14)

The underlying idea of the proof of Lemma 5.2 is quite simple. Unfortunately, the details are rather complicated and involve a case study. We thus postpone the proof to Sect. 7.

Since T acts on the interval [0, s), for every interval \(J_k(q)\), \(1\leqslant q\leqslant q_{k+1}-2\), given by (5.7), we define its s-copy extension \(J_k(q;s)\) by

$$\begin{aligned} J_k(q;s)=J_k(q)\cup (1+J_k(q))\cup \cdots \cup ((s-1)+J_k(q))\subset [0,s), \end{aligned}$$

a union of \(J_k(q)\) with \(s-1\) of its translates.

We have a more complicated variant of Lemma 4.3.

Lemma 5.3

Let \(\delta \) be given by (5.12), and let

Provided that the positive integer k is sufficiently large, for any subset

such that the cardinality , there exists an integer such that for each \(\ell =0,1,\ldots ,s-1\), we have either

(5.15)

or

(5.16)

Proof

Since \(S_0\) is Lebesgue measurable, given any \(\eta >0\), there exists a finite set of disjoint intervals \(I_h\), \(1\leqslant h\leqslant H=H(S_0;\eta )\), such that the union

gives an \(\eta \)-approximation of \(S_0\). Let denote the set of bad short special intervals \(\ell +J_k(q)\), where \(\ell =0,1,\ldots ,s-1\) and \(1\leqslant q\leqslant q_{k+1}-2\), in the sense that

Mimicking the proof of Lemma 4.3 and choosing \(\eta =\varepsilon ^2/6\), we deduce that

(5.17)

Suppose on the contrary that the conclusion of Lemma 5.3 fails. Again mimicking the proof of Lemma 4.3, we can show that there exists a subset

with cardinality , such that for every integer , there exists \(\ell =\ell (q)\) satisfying \(0\leqslant \ell \leqslant s-1\) such that

Thus , contradicting (5.17). \(\square \)

In view of Lemma 5.3, we can define an ordered s-tuple

$$\begin{aligned} \Theta (k,q^*)=(\theta _0(k,q^*),\theta _1(k,q^*),\ldots ,\theta _{s-1}(k,q^*)), \end{aligned}$$
(5.18)

where, for \(\ell =0,1,\ldots ,s-1\),

$$\begin{aligned} \theta _\ell (k,q^*)={\left\{ \begin{array}{ll} \,1,&{}\hbox {if } \ell +J_k(q^*) \text { satisfies (5.15)},\\ \,0,&{}\hbox {if } \ell +J_k(q^*) \text { satisfies (5.16)}. \end{array}\right. } \end{aligned}$$
(5.19)

The \(R+2\) singularities 0, \(\{-\alpha \}\) and \(\{r_ib-\alpha \}\), \(i=1,\ldots ,R\), of T modulo 1 divide the unit torus/circle [0, 1) into \(R+2\) disjoint long critical intervals.

The next lemma follows from combining Lemmas 5.2 and 5.3.

Lemma 5.4

If the integer k in the infinite set given by Lemma 5.2 is sufficiently large, then for every short special interval \(J_k(q)\), \(1\leqslant q\leqslant q_{k+1}-2\), that is fully contained inside one of the \(R+2\) long critical intervals, the intersection \(J_k(q;s)\cap S_0\) defines an ordered s-tuple \(\Theta (k,q)\) that is either equal to \(\Theta (k,q^*)\) or has the entries permuted.

Proof

Suppose that \(J_k(q)\) is fully contained inside one of the \(R+2\) long critical intervals with endpoints \(z_1<z_2\) which are two adjacent singularities of T modulo 1, where the singularity 0 is understood as the endpoint 1 when it plays the role of \(z_2\). Suppose first that \(z_1,z_2\not \in \{0,\{-\alpha \}\}\). Then \(z_1=\{r_{i_1}b-\alpha \}\) and \(z_2=\{r_{i_2}b-\alpha \}\) for some \(i_1\ne i_2\) satisfying \(1\leqslant i_1,i_2\leqslant R\), and there exist \(\sigma _1,\sigma _2\in \{\pm \}\) such that

It follows from (5.14) that the finite sequence of consecutive integers with integer endpoints \(h_k(i_1;\sigma _1)\) and \(h_k(i_2;\sigma _2)\) has at least \(\delta q_{k+1}\) terms and contains the integer q. If \(z_1\) or \(z_2\) belongs to \(\{0,\{-\alpha \}\}\), then a modification of the argument, using (5.13) as well as (5.14), will also lead to a sequence of consecutive integers with at least \(\delta q_{k+1}\) terms and which contains the integer q. It then follows from Lemma 5.3 that this finite sequence of consecutive integers also contains an integer \(q^*\) such that an ordered s-tuple \(\Theta (k,q^*)\) of the form (5.18) exists and satisfies (5.19) for every \(\ell =0,1,\ldots ,s-1\). Note next that

$$\begin{aligned} J_k(q;s)=T^{q-q^*}J_k(q^*;s). \end{aligned}$$

Note that \(S_0\subset [0,s)\) is T-invariant, and the \(R+2\) singularities never split the intervals in the process of iterated applications of the transformation T. The desired conclusion follows immediately. \(\square \)

Our next step is to take advantage of property (iii) earlier concerning substantial overlappings of the intervals \(J_k(q)\).

Recall that \(R+2\) division points 0, \(\{-\alpha \}\) and \(\{r_ib-\alpha \}\), \(i=1,\ldots ,R\), of the torus/circle [0, 1) give rise to \(R+2\) long critical intervals in the torus/circle [0, 1). They lead naturally to \(s(R+2)\) division points and \(s(R+2)\) long critical intervals in [0, s).

Consider the \(s(q_{k+1}-2)\) short special intervals

$$\begin{aligned} \ell +J_k(q), \quad \ell =0,1,\ldots ,s-1, \;\; 1\leqslant q\leqslant q_{k+1}-2. \end{aligned}$$

For large values of k, the overwhelming majority of these short special intervals are fully contained in some long critical interval in [0, s), and give rise to \(s(R+2)\) collections of substantially overlapping intervals in [0, s). These \(s(R+2)\) collections essentially cover the \(s(R+2)\) disjoint long critical intervals; more precisely, each long critical interval has an extremely short subinterval at each end which may not be covered. Due to the substantial overlappings, neighboring short special intervals in the same collection must have identical ordered s-tuples \(\Theta (k,q)\).

It follows that the short special intervals fully contained within any given long critical interval must either all satisfy (5.15) or all satisfy (5.16). This means that the given long critical interval is essentially \(\varepsilon \)-almost entirely in \(S_0\), or is essentially \(\varepsilon \)-almost disjoint from \(S_0\).

Let the set \(S_0^*\subset [0,s)\) be defined as follows, apart from the \(s(R+2)\) division points that give rise to the long critical intervals. For every long critical interval , we set

Then each of the long critical intervals in [0, s) is either entirely contained in \(S_0^*\) or disjoint from \(S_0^*\). It then remains to prove that such a set \(S_0^*\) cannot exist.

The set \(S_0^*\) and its complement \([0,s)\setminus S_0^*\) lead naturally to a 2-coloring of the interval [0, s), which in turn leads to a 2-coloring of the s vertical edges of the underlying finite polysquare-b-rational translation surface. We can then use the \(\alpha \)-flow to extend this 2-coloring to the whole surface.

It is clear that the surface cannot be monochromatic, for otherwise \({{\,\textrm{meas}\,}}(S_0^*)\) must be equal to 0 or s, implying that \({{\,\textrm{meas}\,}}(S_0)\) is close to 0 or s, contradicting our earlier conclusion that \(1\leqslant {{\,\textrm{meas}\,}}(S_0)\leqslant s-1\). Thus the 2-coloring must have a color switch across some division point.

Suppose first of all that there is a color switch across some division point \(\{-\alpha \}\), as illustrated in the picture on the left in Fig. 26. Applying the reverse \(\alpha \)-flow takes \(\{-\alpha \}\) to the point \(\{-2\alpha \}\) on some vertical edge of the surface. As \(S_0\) is T-invariant, there must be a color switch across \(\{-2\alpha \}\), a contradiction since this is not a division point.

Fig. 26
figure 26

Contradicting a 2-coloring

Suppose next that there is a color switch across some division point 0, as illustrated in the picture on the right in Fig. 26. Applying the \(\alpha \)-flow takes 0 to the point \(\{\alpha \}\) on some vertical edge of the surface. As \(S_0\) is T-invariant, there must be a color switch across \(\{\alpha \}\), a contradiction since this is not a division point.

Suppose finally that the 2-coloring has a color switch across a division point \(\{r_{i_0}b-\alpha \}\), \(1\leqslant i_0\leqslant R\), on some vertical edge of the surface. The \(\alpha \)-flow moves this point to a new point on some vertical edge of the surface. As \(S_0\) is T-invariant, there must be a color switch across this new point, so this new point must be a division point. Repeating this argument sufficiently many times, and noting that there are only finitely many division points, we see that this process must visit some division point twice, meaning that there exist some positive integers \(n_1<n_2\) such that

$$\begin{aligned} \{r_{i_0}b-\alpha +n_1\alpha \}=\{r_{i_0}b-\alpha +n_2\alpha \}. \end{aligned}$$

But this implies that \((n_2-n_1)\alpha \) is an integer, contradicting the assumption that \(\alpha \) is irrational.

This completes the proof of Lemma 5.1. \(\square \)

6 Extending ergodicity to unique ergodicity

Lemmas 4.1 and 5.1 establish ergodicity of some Lebesgue measure preserving transformation T. This means that we can apply Birkhoff’s well-known pointwise ergodic theorem concerning measure-preserving systems \((X,{\mathcal {A}},\mu ,T)\). The triple \((X,{\mathcal {A}},\mu )\) is a measure space, where X is the underlying space, \({\mathcal {A}}\) is a \(\sigma \)-algebra of sets in X, while \(\mu \) is a non-negative \(\sigma \)-additive measure on X with \(\mu (X)<\infty \), and \(T:X\rightarrow X\) is a measure-preserving transformation, so that \(T^{-1}A\in {\mathcal {A}}\) and \(\mu (T^{-1}A)=\mu (A)\) for every \(A\in {\mathcal {A}}\).

Let \(L_1(X,{\mathcal {A}},\mu )\) denote the space of measurable and integrable functions in the measure space \((X,{\mathcal {A}},\mu )\). Then Birkhoff’s pointwise ergodic theorem says that for every function \(f\in L_1(X,{\mathcal {A}},\mu )\), the limit

$$\begin{aligned} \lim _{m\rightarrow \infty }\frac{1}{m}\sum _{j=0}^{m-1}f(T^jx)=f^*(x) \end{aligned}$$
(6.1)

exists for \(\mu \)-almost every \(x\in X\), where \(f^*\) is a T-invariant measurable and integrable function satisfying the condition

$$\begin{aligned} \int _Xf\,\textrm{d}\mu =\int _Xf^*\,\textrm{d}\mu . \end{aligned}$$

A particularly important special case is when T is ergodic, when every measurable T-invariant set is trivial in the precise sense that \(\mu (A)=0\) or \(\mu (A)=\mu (X)\). This is equivalent to the assertion that every measurable T-invariant function is constant \(\mu \)-almost everywhere.

If T is ergodic, then (6.1) simplifies to

$$\begin{aligned} \lim _{m\rightarrow \infty }\frac{1}{m}\sum _{j=0}^{m-1}f(T^jx)=\frac{1}{\mu (X)}\int _Xf\,\textrm{d}\mu , \end{aligned}$$
(6.2)

and the right-hand side of (6.1) is the same constant for \(\mu \)-almost every \(x\in X\).

The remarkable intuitive interpretation of (6.2) is that the time average on the left-hand side is equal to the space average on the right-hand side.
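The equality of time and space averages can be observed numerically in the simplest case of the irrational rotation on [0, 1), where \(\mu (X)=1\). The following is a minimal sketch; the names `birkhoff_average`, `rotation` and `indicator`, the slope \(\sqrt{2}-1\) and the test interval [0.2, 0.5) are assumptions chosen for illustration, and a finite average only approximates the limit.

```python
from math import sqrt

def birkhoff_average(f, T, x, m):
    """Time average (1/m) * sum_{j=0}^{m-1} f(T^j x)."""
    total = 0.0
    for _ in range(m):
        total += f(x)
        x = T(x)
    return total / m

alpha = sqrt(2) - 1                                   # illustrative irrational rotation number

def rotation(x):
    """The alpha-shift on the unit torus/circle [0, 1)."""
    return (x + alpha) % 1.0

def indicator(x):
    """Characteristic function of the interval [0.2, 0.5)."""
    return 1.0 if 0.2 <= x < 0.5 else 0.0

time_average = birkhoff_average(indicator, rotation, 0.0, 10_000)
space_average = 0.5 - 0.2                             # Lebesgue measure of [0.2, 0.5)
```

For this choice the two averages agree to within a small error that shrinks as the number of iterations grows, although the theorem itself gives no rate.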

Unfortunately, Birkhoff’s theorem does not say anything about the speed of convergence in (6.1) or (6.2).

In the special case of Lemma 5.1, we have \(X=[0,s)\), formed from the s vertical edges of the given finite polysquare-b-rational translation surface, the measure \(\mu \) is 1-dimensional Lebesgue measure and \(T=T_\alpha \) is the interval exchange transformation corresponding to the \(\alpha \)-flow on this surface. Combining Lemma 5.1 with (6.2), we immediately obtain that for almost every starting point, a half-infinite \(\alpha \)-geodesic is uniformly distributed on the surface, a weaker version of Theorem 3.3. Similarly, combining Lemma 4.1 with (6.2), we obtain a corresponding weaker version of Theorem 2.5.

To prove Theorem 3.3, we need to extend almost every starting point to every non-pathological starting point that gives rise to a half-infinite \(\alpha \)-geodesic. In other words, we need to establish the following result.

Lemma 6.1

Under the hypotheses of Lemma 5.1, consider the measure-preserving interval exchange transformation \(T=T_\alpha :X\rightarrow X\) of the polysquare-b-rational surface, where \(X=[0,s)\). Then for every subinterval \(J\subset X\) and for every non-pathological starting point \(x\in X\), we have

$$\begin{aligned} \lim _{m\rightarrow \infty }\frac{1}{m}\sum _{j=0}^{m-1}\chi _J(T^jx) =\frac{\lambda (J)}{s}, \end{aligned}$$

where \(\chi _J\) denotes the characteristic function of J and \(\lambda \) denotes 1-dimensional Lebesgue measure.

Using the standard extension argument, this discrete result can be converted to the continuous version concerning the uniformity of \(\alpha \)-geodesics on the surface for every non-pathological starting point.

Proof

The proof is by contradiction, and consists of two parts. The first part simply follows Furstenberg’s argument, and the basic idea is to reformulate the problem of unique ergodicity in terms of T-invariant Borel measures, and to apply non-trivial results from functional analysis. The key idea of the second part is then an application of Birkhoff’s ergodic theorem to a new measure that is different from \(\lambda \).

The first part of the argument is summarized in the following lemma.

Lemma 6.2

Suppose that there exist a non-pathological starting point \(y_0\in X\) and an interval \(J_0\subset X\) for which uniformity fails, so that the infinite sequence

$$\begin{aligned} \frac{1}{m}\sum _{j=0}^{m-1}\chi _{J_0}(T^jy_0), \quad m=1,2,3,\ldots , \end{aligned}$$
(6.3)

where \(\chi _{J_0}\) is the characteristic function of \(J_0\), does not converge to \(\lambda (J_0)/s\). Then there exists an ergodic measure-preserving system \((X,{\mathcal {B}},\nu ,T)\), where \({\mathcal {B}}\) is the Borel \(\sigma \)-algebra on X, and \(\nu \) is a new T-invariant Borel probability measure, such that

$$\begin{aligned} \nu (J_0)\ne \frac{\lambda (J_0)}{s}. \end{aligned}$$
(6.4)

Proof

In view of the assumption, there exists an infinite subsequence

$$\begin{aligned} 0\leqslant h_0< h_1<h_2<h_3<\cdots \end{aligned}$$
(6.5)

of the non-negative integers such that the limit

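$$\begin{aligned} \lim _{\ell \rightarrow \infty }\frac{1}{h_\ell }\sum _{j=0}^{h_\ell -1}\chi _{J_0}(T^jy_0) \end{aligned}$$
(6.6)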

exists, but is not equal to \(\lambda (J_0)/s\).

Claim

There exist another infinite subsequence

$$\begin{aligned} 1\leqslant d_1<d_2<d_3<\cdots \end{aligned}$$

of the positive integers and an infinite sequence of corresponding starting points \(y(m)\in X\), \(m=1,2,3,\ldots \), such that the limit

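$$\begin{aligned} \lim _{m\rightarrow \infty }\frac{1}{q_{d_m}}\sum _{j=0}^{q_{d_m}-1}\chi _{J_0}(T^jy(m)) \end{aligned}$$
(6.7)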

exists, but is not equal to \(\lambda (J_0)/s\). Here, for every integer \(m=1,2,3,\ldots \), the number \(q_{d_m}\) represents the denominator of the \(d_m\)-th convergent of \(\alpha \).

We shall justify this Claim at the end of our proof of Lemma 6.2.

We now repeat and adapt some ideas in [2, Sections 3.2–3.3]. For every integer \(m\ge 1\), we introduce the normalized counting measure \(\nu _m\), defined for every Borel set \(B\subset X\) by

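$$\begin{aligned} \nu _m(B)=\frac{1}{q_{d_m}}\sum _{j=0}^{q_{d_m}-1}\chi _B(T^jy(m)), \end{aligned}$$
(6.8)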

where \(\chi _B\) is the characteristic function of B.

Now we make use of a general theorem in functional analysis which says that the space of Borel probability measures on any compact set is compact in the so-called weak-star topology. The latter means that

$$\begin{aligned} \mu _m\rightarrow \mu \quad \hbox {if and only if}\quad \int f\,\textrm{d}\mu _m\rightarrow \int f\,\textrm{d}\mu , \end{aligned}$$

where f runs over all continuous functions on the compact space.

This compactness theorem is a non-trivial result. The standard proof is based on the Riesz Representation Theorem.

Consider the set of all Borel probability measures \(\mu \) on X. By the general theorem, this set is compact. Consider next the subset of those Borel probability measures \(\mu \) on X that are T-invariant and such that \(\mu \ne \lambda /s\). It is obvious that this subset is closed and therefore compact.

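The compactness of the set of Borel probability measures on X implies that there is a subsequence \(\nu _{m_i}\) of the sequence \(\nu _m\) defined by (6.8) such that \(\nu _{m_i}\rightarrow \nu _{\infty }\) as \(i\rightarrow \infty \), where \(\nu _{\infty }\) is a Borel probability measure on X. It easily follows from (6.8) that \(\nu _{\infty }\) is T-invariant. Indeed, writing \(y_1(m)=Ty(m)\), we have

$$\begin{aligned} \nu _m(T^{-1}B)=\frac{1}{q_{d_m}}\sum _{j=0}^{q_{d_m}-1}\chi _B(T^jy_1(m)) =\nu _m(B)+\frac{\chi _B(T^{q_{d_m}}y(m))-\chi _B(y(m))}{q_{d_m}} \end{aligned}$$

and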

$$\begin{aligned} \biggl \vert \frac{\chi _B(T^{q_{d_m}}y(m))-\chi _B(y(m))}{q_{d_m}}\biggr \vert \leqslant \frac{1}{q_{d_m}}\rightarrow 0 \quad \hbox {as}\;\; m\rightarrow \infty . \end{aligned}$$
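The bound above says that shifting a finite orbit segment by one step changes the number of visits to B by at most one, since only the first and last points of the segment are exchanged. A minimal sketch of this telescoping effect follows, using the \(\alpha \)-shift on [0, 1) as a stand-in for T; the function name `orbit_count`, the starting point, the orbit length 29 and the interval [0.25, 0.6) are all assumptions made for illustration.

```python
from math import sqrt

def orbit_count(alpha, y, length, a, b, shift=0):
    """Number of j in [0, length) with T^(j+shift) y in [a, b), where T is the alpha-shift."""
    return sum(1 for j in range(length)
               if a <= (y + (j + shift) * alpha) % 1.0 < b)

alpha, y = sqrt(2), 0.1        # illustrative slope and starting point
q = 29                         # a convergent denominator of sqrt(2), playing the role of q_{d_m}
a, b = 0.25, 0.6               # the set B = [a, b)

count_B = orbit_count(alpha, y, q, a, b)               # q * nu_m(B)
count_TinvB = orbit_count(alpha, y, q, a, b, shift=1)  # q * nu_m(T^{-1}B)
defect = abs(count_TinvB - count_B) / q                # |nu_m(T^{-1}B) - nu_m(B)|
```

The quantity `defect` is at most 1/q, so it vanishes in the limit along any sequence of orbit lengths tending to infinity.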

Moreover, the limit measure \(\nu _{\infty }\) clearly satisfies the requirement (6.4), implying that \(\nu _{\infty }\) belongs to the set of T-invariant Borel probability measures \(\mu \ne \lambda /s\) considered above, and so this set is a non-empty compact set.

To find an appropriate measure \(\nu \) which guarantees that the corresponding measure-preserving system is ergodic, we use the almost trivial fact that this set is convex. The well-known Krein–Milman theorem in functional analysis implies that this non-empty compact convex set is spanned by its extremal points. It is a well-known general result in ergodic theory that the extremal points are precisely the ergodic T-invariant measures; see [2, Proposition 3.4]. Thus we can choose our measure \(\nu \) to be such an extremal point, and this completes the deduction of the lemma. It remains to establish the Claim.

To establish the Claim, we use the \(\alpha \)-representation, or Ostrowski representation, of an arbitrary integer \(N\geqslant 1\).

It is well known that every integer \(N\geqslant 1\) has a unique representation in the form

$$\begin{aligned} N=\sum _{k=0}^{n}b_kq_k, \end{aligned}$$

where the integer coefficients \(b_0,\ldots ,b_n\) satisfy the conditions

$$\begin{aligned} 0\leqslant b_0<a_1, \quad 0<b_n\leqslant a_{n+1}, \quad 0\leqslant b_k\leqslant a_{k+1}, \quad \; k=1,\ldots ,n-1, \end{aligned}$$
(6.9)

as well as the restrictions

$$\begin{aligned} b_{k-1}=0 \quad \hbox {if}\quad b_k=a_{k+1}, \;\; k=1,\ldots ,n, \end{aligned}$$
(6.10)

where \(a_1,a_2,a_3,\ldots \) are digits of the continued fraction (5.2) of \(\alpha \), and \(q_0,q_1,q_2,\ldots \) are the denominators of the successive convergents of \(\alpha \). Furthermore, the value of the integer n is determined by the inequalities \(q_n\leqslant N<q_{n+1}\).
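The Ostrowski coefficients can be computed greedily, starting from the largest denominator \(q_n\leqslant N\) and taking quotients and remainders downwards. The sketch below assumes this greedy procedure and uses hypothetical function names; the sample digits are those of \(\sqrt{2}=[1;2,2,2,\ldots ]\), chosen only for illustration, and the digit list must be long enough that its last denominator exceeds N.

```python
def denominators(digits):
    """Denominators q_0, q_1, ... of the convergents of [a_0; a_1, a_2, ...]."""
    qs = [1, digits[1]]                       # q_0 = 1, q_1 = a_1
    for a in digits[2:]:
        qs.append(a * qs[-1] + qs[-2])        # q_k = a_k q_{k-1} + q_{k-2}
    return qs

def ostrowski(N, digits):
    """Greedy coefficients [b_0, ..., b_n] with N = sum of b_k * q_k."""
    qs = denominators(digits)
    if not 1 <= N < qs[-1]:
        raise ValueError("need 1 <= N < last denominator; supply more digits")
    n = max(k for k, q in enumerate(qs) if q <= N)    # q_n <= N < q_{n+1}
    coeffs = [0] * (n + 1)
    for k in range(n, -1, -1):
        coeffs[k], N = divmod(N, qs[k])       # b_k is the quotient, the remainder is passed down
    return coeffs

# Illustrative digits (an assumption): sqrt(2), with denominators 1, 2, 5, 12, 29, 70, ...
digits = [1] + [2] * 10
example = ostrowski(100, digits)              # 100 = 1*1 + 1*29 + 1*70
```

For instance, with the denominators 1, 2, 5, 12, 29, 70 of \(\sqrt{2}\), the integer 100 has coefficients \(b_0=1\), \(b_4=1\), \(b_5=1\) and all others zero.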

We now do likewise for the sequence of integers \(h_\ell \), \(\ell =0,1,2,3,\ldots \), in (6.5), so that we have the Ostrowski representations

$$\begin{aligned} h_\ell =\sum _{k=0}^{n_\ell }\,b_{k,\ell }q_k, \quad \ell =0,1,2,3,\ldots , \end{aligned}$$

where the coefficients \(b_{k,\ell }\) satisfy conditions analogous to (6.9) and (6.10).

Next, observe that

$$\begin{aligned} \{0,1,\ldots ,h_\ell -1\} =\bigcup _{k_0=0}^{n_\ell }\,\biggl \{\,\sum _{k=0}^{k_0-1}b_{k,\ell }q_k,\ldots ,\sum _{k=0}^{k_0}b_{k,\ell }q_k-1\,\biggr \}, \end{aligned}$$
(6.11)

where each set in the union is a collection of consecutive integers, and the last set, with \(k_0=n_\ell \), ends at \(h_\ell -1\). Write

$$\begin{aligned} N_\ell (k_0)=\sum _{k=0}^{k_0-1}b_{k,\ell }q_k, \quad k_0=0,1,\ldots ,n_\ell . \end{aligned}$$
(6.12)

Then for each \(k_0=0,1,\ldots ,n_\ell \), we have

(6.13)

It now follows from (6.11)–(6.13) that

(6.14)

Write \(y_\ell (k_0,b)=T^{N_\ell (k_0)+bq_{k_0}}y_0\). Replacing the dummy variables \(k_0\) and \(j_0\) by k and j respectively in (6.14), we then conclude that

(6.15)

Motivated by (6.15), write

and

Noting that the limit (6.6) exists and is not equal to \(\lambda (J_0)/s\), it is clear that \(\varepsilon (k)\) does not tend to zero as \(k\rightarrow \infty \). Hence there exists an infinite sequence

$$\begin{aligned} 1\leqslant d_1<d_2<d_3<\cdots \end{aligned}$$

of integers and a real number \(\varepsilon _0>0\) such that \(\varepsilon (d_k)\geqslant \varepsilon _0\) for all \(k\geqslant 1\). This clearly implies that the limit (6.7) exists, but is not equal to \(\lambda (J_0)/s\). This completes the proof of the Claim and also of Lemma 6.2. \(\square \)

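Since the measure-preserving system given by Lemma 6.2 is ergodic, it follows from Birkhoff’s ergodic theorem that for every Borel set \(B\subset X\) and for \(\nu \)-almost every \(y\in X\), we have

$$\begin{aligned} \lim _{m\rightarrow \infty }\frac{1}{m}\sum _{j=0}^{m-1}\chi _B(T^jy)=\nu (B). \end{aligned}$$
(6.16)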

Let W be an arbitrarily large but fixed positive integer. We claim that there exists a non-empty open interval \(Q=Q(W)\subset X\) such that

$$\begin{aligned} \frac{\nu (Q)}{\lambda (Q)}>W. \end{aligned}$$
(6.17)

Thus the measure \(\nu \) can be arbitrarily more dense in some subintervals \(Q\subset X\) than the Lebesgue measure \(\lambda \).

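To prove (6.17), we choose \(B=J_0\) in (6.16), and consider the set

$$\begin{aligned} Y=\biggl \{y\in X:\lim _{m\rightarrow \infty }\frac{1}{m}\sum _{j=0}^{m-1}\chi _{J_0}(T^jy)=\nu (J_0)\biggr \}. \end{aligned}$$
(6.18)

We already know that for \(\lambda \)-almost every \(y\in X\), we have

$$\begin{aligned} \lim _{m\rightarrow \infty }\frac{1}{m}\sum _{j=0}^{m-1}\chi _{J_0}(T^jy)=\frac{\lambda (J_0)}{s}. \end{aligned}$$
(6.19)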

Combining (6.4), (6.16), (6.18) and (6.19), we conclude that

$$\begin{aligned} \nu (Y)=1 \quad \hbox {and}\quad \lambda (Y)=0. \end{aligned}$$
(6.20)

Let \(\delta >0\) be arbitrarily small but fixed. Since \(\lambda (Y)=0\), there exists an infinite sequence \(R_i\), \(i\geqslant 1\), of open intervals such that

$$\begin{aligned} \sum _{i=1}^{\infty }\lambda (R_i)<\delta \quad \hbox {and}\quad Y\subset \bigcup _{i=1}^{\infty }R_i. \end{aligned}$$
(6.21)

By (6.20) and (6.21), we have

$$\begin{aligned} \sum _{i=1}^{\infty }\nu (R_i)\geqslant 1. \end{aligned}$$
(6.22)

It follows from (6.21) and (6.22) that there exists an integer \(i_0\geqslant 1\) such that

$$\begin{aligned} \frac{\lambda (R_{i_0})}{\nu (R_{i_0})}<\delta . \end{aligned}$$
(6.23)

Choosing \(\delta =1/W\) in (6.23), inequality (6.17) follows with the choice \(Q=R_{i_0}\).

Next we derive a contradiction from (6.8) and (6.17). In (6.8) the orbits

$$\begin{aligned} \{T^jy(m):0\leqslant j<q_{d_m}\}, \quad m=1,2,3,\ldots , \end{aligned}$$
(6.24)

of \(\nu _m\) have sizes equal to the denominator of a convergent of \(\alpha \). We shall show that they are uniformly not crowded. Then \(\nu \), being a limit point of the set of counting measures \(\nu _m\), cannot be arbitrarily more dense than \(\lambda \). This then contradicts (6.17).

To prove the sets in (6.24) are uniformly not crowded, we recall that T modulo 1 is the \(\alpha \)-shift in the unit torus/circle [0, 1). We also recall from (4.36) that

which implies

(6.25)

Since

$$\begin{aligned} \biggl \{\frac{jp_{d_m}}{q_{d_m}}\biggr \}, \quad 0\leqslant j<q_{d_m}, \end{aligned}$$
(6.26)

gives an equipartition of the unit torus/circle [0, 1), the points in (6.26) exhibit a best possible form of quantitative uniformity. Combining this fact with (6.25), we deduce that for every subinterval \(J\subset X=[0,s)\) with \(\lambda (J)\geqslant s/q_{d_m}\), we have

(6.27)

where denotes the number of elements of in the interval J of length \(\lambda (J)\), and the factor s comes from the fact that there are s atomic squares in . The bound (6.27) proves that the set in (6.24) is uniformly not crowded. This completes the proof of Lemma 6.1. \(\square \)
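The two ingredients of this last step can be checked on a small example: the points \(\{jp/q\}\), \(0\leqslant j<q\), form an exact equipartition when p and q are coprime, and the rotation orbit stays within \(1/q_{k+1}\) of them because \(\vert \alpha -p_k/q_k\vert <1/(q_kq_{k+1})\). The sketch below is an illustration under assumptions: the slope \(\sqrt{2}\), the convergent 41/29 and the next denominator 70 are sample values, and the drift is computed in floating point.

```python
from fractions import Fraction
from math import sqrt

# Illustrative data (an assumption): the convergent 41/29 of sqrt(2), next denominator 70.
p, q, q_next = 41, 29, 70
alpha = sqrt(2)

# The points {j p / q}, 0 <= j < q, computed exactly and sorted.
points = sorted(Fraction(j * p % q, q) for j in range(q))
is_equipartition = points == [Fraction(i, q) for i in range(q)]

# Largest deviation |j*alpha - j*p/q| over the orbit segment 0 <= j < q.
max_drift = max(abs(j * alpha - Fraction(j * p, q)) for j in range(q))
drift_is_small = max_drift < 1 / q_next
```

Since each orbit point lies within \(1/q_{k+1}\) of a point of the equipartition, a short interval cannot contain many more orbit points than equipartition points, which is the sense in which the orbits are not crowded.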

The slope in Theorem 3.3 is badly approximable. However, in this section this special property of \(\alpha \) is never used, only that it is irrational. Thus we can routinely apply the same extension argument to Lemma 4.1 for the 2-square-b surface, and obtain unique ergodicity for every irrational slope, completing the proof of Theorem 2.5.

Remark 6.3

The expert reader may well be wondering why we have not established Theorem 3.3 by using Boshernitzan’s Criterion for unique ergodicity of an interval exchange transformation as given in [16]. Unfortunately, we are not able to see how we may prove Theorem 2.5 via this method, and have already developed a different technique in Sect. 4. It therefore seems natural to adapt this other technique in Sect. 5 in the case of Theorem 3.3.

7 Proof of the separation lemma

The proof of Lemma 5.2 depends on the following very simple property of the given badly approximable number \(\alpha \).

Lemma 7.1

Let \(\alpha \in (0,\infty )\) be badly approximable, with continued fraction (5.2) and digits satisfying (5.3), where A is a fixed positive integer. Then for every integer \(n\geqslant 1\), we have

Proof

For every integer \(n\geqslant 1\), we can find an integer \(k\geqslant 0\) such that \(q_k\leqslant n<q_{k+1}\), where \(q_k=q_k(\alpha )\) denotes the denominator of the k-th convergent of \(\alpha \). Using well-known Diophantine approximation properties of continued fractions, we have

as required. \(\square \)
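A lower bound of this kind can be observed numerically. The sketch below checks the standard estimate \(n\Vert n\alpha \Vert >1/(A+2)\), which follows from \(\Vert n\alpha \Vert \geqslant \Vert q_k\alpha \Vert >1/(q_k+q_{k+1})\) and \(q_{k+1}\leqslant (A+1)q_k\); this particular constant is an assumption made for illustration and need not coincide with the constant in Lemma 7.1. The slope \(\sqrt{2}\), with \(A=2\), is again a sample choice.

```python
from math import sqrt

def dist_to_nearest_int(x):
    """The quantity ||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))

alpha, A = sqrt(2), 2          # sqrt(2) = [1; 2, 2, 2, ...], digits bounded by A = 2

# Smallest value of n * ||n alpha|| over 1 <= n <= 10000 (floating point).
worst = min(n * dist_to_nearest_int(n * alpha) for n in range(1, 10_001))
bound_holds = worst > 1 / (A + 2)
```

In this range the minimum of \(n\Vert n\alpha \Vert \) stays above 1/4, consistent with \(\sqrt{2}\) being badly approximable.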

Proof of Lemma 5.2

Suppose that the integer \(k_0\geqslant 1\) violates (5.13) or (5.14). Then at least one of the conditions (V1)–(V3) holds:

  1. (V1)

    There exist \(i=1,\ldots ,R\) and \(\sigma =\pm \) such that

    $$\begin{aligned} h_{k_0}(i;\sigma )\leqslant \delta q_{k_0+1}. \end{aligned}$$
    (7.1)

    In this case, it follows from (5.11) that

    (7.2)

    Writing \(r_i=u_i/v_i\) and multiplying by \(v_i\) leads to the inequality

    Thus there exists a real number \(x_1>1\) such that

    (7.3)
  2. (V2)

    There exist \(i=1,\ldots ,R\) and \(\sigma =\pm \) such that

    $$\begin{aligned} h_{k_0}(i;\sigma )\geqslant (1-\delta )q_{k_0+1}. \end{aligned}$$
    (7.4)

    In this case, write

    $$\begin{aligned} h_{k_0}^-(i;\sigma )=q_{k_0+1}-h_{k_0}(i;\sigma ). \end{aligned}$$
    (7.5)

    This is equivalent to

    and it follows from the triangle inequality and (4.36) that

    and then from (5.11) that

    (7.6)

    Writing \(r_i=u_i/v_i\) and multiplying by \(v_i\) leads to the inequality

    Thus there exists a real number \(x_2>1\) such that

    Note from (5.10) with \(k=k_0\), (7.4) and (7.5) that

    $$\begin{aligned} 2\leqslant h_{k_0}^-(i;\sigma )\leqslant \delta q_{k_0+1}. \end{aligned}$$
  3. (V3)

    There exist \(i_1,i_2=1,\ldots ,R\) and \(\sigma _1,\sigma _2=\pm \) such that \((i_1,\sigma _1)\ne (i_2,\sigma _2)\) and

    $$\begin{aligned} \vert h_{k_0}(i_1;\sigma _1)-h_{k_0}(i_2;\sigma _2)\vert \leqslant \delta q_{k_0+1}. \end{aligned}$$

    For notational simplicity, we assume that

    $$\begin{aligned} h_{k_0}(i_1;\sigma _1)>h_{k_0}(i_2;\sigma _2) \quad \hbox {and}\quad \{r_{i_1}b-\alpha \}>\{r_{i_2}b-\alpha \}. \end{aligned}$$

    The argument for the other possibilities requires only minor modification. Then

    and it follows from the triangle inequality and (5.11) that

    Write

    $$\begin{aligned} h_{k_0}(i_1,i_2;\sigma _1,\sigma _2)=h_{k_0}(i_1;\sigma _1)-h_{k_0}(i_2;\sigma _2). \end{aligned}$$

    Then it clearly follows that

    (7.7)

    Writing \(r_{i_1}=u_{i_1}/v_{i_1}\), \(r_{i_2}=u_{i_2}/v_{i_2}\) and multiplying by \(v_{i_1}v_{i_2}\) leads to the inequality

    Thus there exists a real number \(x_3>1\) such that

    Note in particular that

    $$\begin{aligned} 1\leqslant h_{k_0}(i_1,i_2;\sigma _1,\sigma _2)\leqslant \delta q_{k_0+1}. \end{aligned}$$

Next let \(k>k_0\) be any integer which violates (5.13) or (5.14). We distinguish various cases.

Case 1A. Suppose that \(k_0\) satisfies (V1) and k satisfies the k-analog of (V1). Then there exist \(i^*=1,\ldots ,R\) and \(\sigma ^*=\pm \) such that

$$\begin{aligned} h_k(i^*;\sigma ^*)\leqslant \delta q_{k+1}. \end{aligned}$$
(7.8)

Writing \(r_{i^*}=u_{i^*}/v_{i^*}\) and multiplying the analog of (7.2) by \(u_iv_{i^*}\) leads to the inequality

(7.9)

On the other hand, multiplying (7.3) by \(u_{i^*}\), we obtain

(7.10)

We shall show that the integer

$$\begin{aligned} d_{11}=v_iu_{i^*}(h_{k_0}(i;\sigma )+1)-u_iv_{i^*}(h_k(i^*;\sigma ^*)+1)\ne 0 \end{aligned}$$
(7.11)

if the condition

$$\begin{aligned} \frac{UV}{q_{k+1}}\leqslant \frac{1}{x_1q_{k_0+1}} \end{aligned}$$
(7.12)

holds. Indeed, combining (5.9), (7.9), (7.10) and (7.12), we see that

$$\begin{aligned} \bigl \Vert u_iv_{i^*}(h_k&(i^*;\sigma ^*)+1)\alpha - u_ iu_{i^*}b\bigr \Vert <\frac{2UV}{q_{k+1}} \leqslant \frac{2}{x_1q_{k_0+1}} \\&\leqslant \frac{2\vert v_iu_{i^*}\vert }{x_1q_{k_0+1}} =\bigl \Vert v_iu_{i^*}(h_{k_0}(i;\sigma )+1)\alpha - u_ iu_{i^*}b\bigr \Vert . \end{aligned}$$

This clearly implies that \(d_{11}\ne 0\). Since \(r_i\) and \(r_{i^*}\) are non-zero, so are \(u_i\) and \(u_{i^*}\).

Case 1B. Suppose that \(k_0\) satisfies (V1) and k satisfies the k-analog of (V2). Then there exist \(i^*=1,\ldots ,R\) and \(\sigma ^*=\pm \) such that

$$\begin{aligned} h_k^-(i^*;\sigma ^*)=q_{k+1}-h_k(i^*;\sigma ^*)\leqslant \delta q_{k+1}. \end{aligned}$$

Writing \(r_{i^*}=u_{i^*}/v_{i^*}\) and multiplying the analog of (7.6) by \(u_iv_{i^*}\) leads to the inequality

(7.13)

On the other hand, multiplying (7.3) by \(u_{i^*}\), we obtain (7.10). We shall show that the integer

$$\begin{aligned} d_{12}=v_iu_{i^*}(h_{k_0}(i;\sigma )+1)-u_iv_{i^*}({}-h_k^-(i^*;\sigma ^*)+1)\ne 0 \end{aligned}$$

if the condition

$$\begin{aligned} \frac{3UV}{2q_{k+1}}\leqslant \frac{1}{x_1q_{k_0+1}} \end{aligned}$$
(7.14)

holds. Indeed, combining (5.9), (7.10), (7.13) and (7.14), we see that

This clearly implies that \(d_{12}\ne 0\).

Case 1C. Suppose that \(k_0\) satisfies (V1) and k satisfies the k-analog of (V3). Then there exist \(i_1^*,i_2^*=1,\ldots ,R\) and \(\sigma _1^*,\sigma _2^*=\pm \) such that \((i_1^*,\sigma _1^*)\ne (i_2^*,\sigma _2^*)\) and

$$\begin{aligned} \vert h_k(i_1^*;\sigma _1^*)-h_k(i_2^*;\sigma _2^*)\vert \leqslant \delta q_{k+1}. \end{aligned}$$

For notational simplicity, we assume that

$$\begin{aligned} h_k(i_1^*;\sigma _1^*)>h_k(i_2^*;\sigma _2^*) \quad \hbox {and}\quad \{r_{i_1^*}b-\alpha \}>\{r_{i_2^*}b-\alpha \}. \end{aligned}$$

The argument for the other possibilities requires only minor modification.

Writing \(r_{i_1^*}=u_{i_1^*}/v_{i_1^*}\), \(r_{i_2^*}=u_{i_2^*}/v_{i_2^*}\) and multiplying the analog of (7.7) by \(u_iv_{i_1^*}v_{i_2^*}\) leads to the inequality

(7.15)

On the other hand, multiplying (7.3) by \(u_{i_1^*}v_{i_2^*}-v_{i_1^*}u_{i_2^*}\), we obtain

(7.16)

We shall show that the integer

$$\begin{aligned} d_{13}=v_i(u_{i_1^*}v_{i_2^*}-v_{i_1^*}u_{i_2^*})(h_{k_0}(i;\sigma )+1)-u_iv_{i_1^*}v_{i_2^*}h_k(i_1^*,i_2^*;\sigma _1^*,\sigma _2^*)\ne 0 \end{aligned}$$

if the condition

$$\begin{aligned} \frac{2UV^2}{q_{k+1}}\leqslant \frac{1}{x_1q_{k_0+1}} \end{aligned}$$
(7.17)

holds. Indeed, combining (5.9) and (7.15)–(7.17), we see that

This clearly implies that \(d_{13}\ne 0\).

Let us compare the requirements (7.12), (7.14) and (7.17). Clearly the last one is the strongest requirement. Let \(k_1\) be the smallest integer such that the inequality

$$\begin{aligned} \frac{2UV^2}{q_{k_1+1}}\leqslant \frac{1}{x_1q_{k_0+1}} \end{aligned}$$

holds, so that in particular, we have

$$\begin{aligned} q_{k_1+1}\geqslant 2UV^2x_1q_{k_0+1}. \end{aligned}$$

Note that \(q_{k_1+1}\) is the denominator of a convergent of the continued fraction of the badly approximable number \(\alpha \), and clearly \(k_1>k_0\). Recall from (5.3) that the continued fraction digits are bounded by an integer A. Using the recurrence relations (4.37), we see that for every real number \(X\geqslant 1\), there exists a denominator \(q_k\) between X and \((1+A)X\). The minimality property of \(k_1\) then ensures that

(7.18)
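The claim that some denominator \(q_k\) lies between X and \((1+A)X\) can be checked numerically. The following is only an illustrative sketch: the periodic digit sequence, the bound \(A=4\) and the helper names `denominators` and `first_at_least` are our own assumptions, not taken from the text. It builds the denominators from the recurrence \(q_{k+1}=a_{k+1}q_k+q_{k-1}\) with \(q_0=1\) and \(q_1=a_1\), and confirms that the first denominator at or above X does not exceed \((1+A)X\).

```python
from itertools import cycle, islice

def denominators(digits):
    """Convergent denominators q_0, q_1, ... for digits a_1, a_2, ..."""
    q_prev, q = 1, digits[0]            # q_0 = 1, q_1 = a_1
    out = [q_prev, q]
    for a in digits[1:]:
        q_prev, q = q, a * q + q_prev   # q_{k+1} = a_{k+1} q_k + q_{k-1}
        out.append(q)
    return out

def first_at_least(qs, x):
    """Smallest denominator in the list qs that is >= x."""
    return next(q for q in qs if q >= x)

A = 4                                            # assumed bound on the digits
digits = list(islice(cycle([1, 4, 2, 3]), 40))   # hypothetical digits, all <= A
qs = denominators(digits)

# Each pair (X, q) should satisfy X <= q <= (1 + A) X.
pairs = [(x, first_at_least(qs, x)) for x in (1, 2.5, 10, 137, 10**6)]
for x, q in pairs:
    print(x, q, x <= q <= (1 + A) * x)
```

The reason is that if \(q_k\) is the first denominator with \(q_k\geqslant X\), then \(q_k\leqslant (a_k+1)q_{k-1}<(1+A)X\).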

We shall show that this \(k_1>k_0\) belongs to , and prove this by contradiction.

Suppose on the contrary that . Then one of the cases 1A, 1B and 1C holds with \(k=k_1\).

Suppose that Case 1A holds with \(k=k_1\). Starting with (7.9) with \(k=k_1\), (7.10) and (7.11), applying the triangle inequality, and then using (7.18), we obtain

(7.19)

Let \(n=\vert d_{11}\vert \) with \(k=k_1\). Then \(n\geqslant 1\). Using (7.1) and (7.8) with \(k=k_1\), we have

$$\begin{aligned} \begin{aligned} n =\bigl \vert v_iu_{i^*}(h_{k_0}(i;\sigma )+1)&- u_iv_{i^*}(h_{k_1}(i^*;\sigma ^*)+1)\bigr \vert \\&\leqslant 2\delta UVq_{k_0+1}+2\delta UVq_{k_1+1} <4\delta UVq_{k_1+1}.\qquad \quad \end{aligned} \end{aligned}$$
(7.20)

Applying Lemma 7.1 and using (7.20), we deduce that

(7.21)

Combining (7.19) and (7.21), and noting that \(\Vert n\alpha \Vert =\Vert d_{11}\alpha \Vert \), we conclude that

(7.22)

Clearly (7.22) contradicts the definition of \(\delta \) as given by (5.12). It then follows that Case 1A does not hold with \(k=k_1\).

Essentially similar arguments show that Case 1B and Case 1C also do not hold with \(k=k_1\).

It follows that if \(k_0\) satisfies (V1), then there exists \(k_1>k_0\) such that . Similar arguments then show that if \(k_0\) satisfies (V2) or (V3), then there exists \(k_1>k_0\) such that . This clearly implies that the set is infinite, and completes the proof of Lemma 5.2. \(\square \)

8 Proving time-quantitative anti-uniformity

Our goal in this section is to establish quite serious violations of uniformity. More precisely, we establish Theorem 3.6 which concerns half-infinite \(\alpha \)-geodesics that start from explicitly given points on the 2-square-b surface.

The proof is based on a rather complicated parity formula for certain counting numbers of the irrational rotation sequence. To describe this, we need the concept of continued fractions as well as the concept of \(\alpha \)-representations, or Ostrowski representations, both introduced earlier in this paper.

The rudiments of continued fractions are given in the proof of Lemma 4.1, so we give here only a brief summary of what we need.

The irrational slope \(\alpha \in (0,1)\) has an infinite continued fraction expansion

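$$\begin{aligned} \alpha =[a_1,a_2,a_3,\ldots ]=\frac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots }}}, \end{aligned}$$
(8.1)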

where \(a_i\geqslant 1\), \(i=1,2,3,\ldots \), are integers. The rational numbers

$$\begin{aligned} \frac{p_k}{q_k}=\frac{p_k(\alpha )}{q_k(\alpha )}=[a_1,\ldots ,a_k], \quad k=1,2,3,\ldots , \end{aligned}$$

where \(p_k\in {\mathbb {Z}}\) and \(q_k\in {\mathbb {N}}\) are coprime, are the k-convergents of \(\alpha \). Write also

$$\begin{aligned} \eta _k=q_k\alpha -p_k, \quad k=1,2,3,\ldots \end{aligned}$$

We have the recurrence relations

$$\begin{aligned} p_{k+1}=a_{k+1}p_k+p_{k-1}, \ q_{k+1}=a_{k+1}q_k+q_{k-1}, \ \eta _{k+1}=a_{k+1}\eta _k+\eta _{k-1}, \ k\geqslant 1, \end{aligned}$$
(8.2)

with initial conditions

$$\begin{aligned} p_0=0, \quad q_0=1, \quad \eta _0=\alpha , \quad p_1=1, \quad q_1=a_1, \quad \eta _1=a_1\alpha -1. \end{aligned}$$
(8.3)

It is well known that the k-convergents satisfy (4.34) and give rise to the best rational approximations of \(\alpha \), and

$$\begin{aligned} \eta _k=(-1)^k\vert \eta _k\vert {\left\{ \begin{array}{ll} \,>0,&{}\hbox {if } k \hbox { is even},\\ \,<0,&{}\hbox {if } k \hbox { is odd}. \end{array}\right. } \end{aligned}$$
(8.4)
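As a numerical illustration of (8.2)–(8.4), the sketch below builds \(p_k\), \(q_k\) and \(\eta _k\) for a sample slope. The choice \(\alpha =\sqrt{10}-3=[6,6,6,\ldots ]\) and the function name `convergents` are assumptions made only for this example. Floating-point arithmetic limits us to small k, since \(\eta _k\) decays geometrically while \(q_k\) grows.

```python
import math

def convergents(a, alpha):
    """Lists p, q, eta with indices 0..len(a), from digits a = [a_1, a_2, ...]."""
    p = [0, 1]                              # p_0 = 0, p_1 = 1
    q = [1, a[0]]                           # q_0 = 1, q_1 = a_1
    for k in range(1, len(a)):
        # a[k] is the digit a_{k+1}, so this is the recurrence (8.2)
        p.append(a[k] * p[k] + p[k - 1])
        q.append(a[k] * q[k] + q[k - 1])
    eta = [qk * alpha - pk for pk, qk in zip(p, q)]   # eta_k = q_k alpha - p_k
    return p, q, eta

alpha = math.sqrt(10) - 3                   # hypothetical slope, alpha = [6, 6, 6, ...]
a = [6] * 8
p, q, eta = convergents(a, alpha)

# Sign pattern (8.4): eta_k > 0 for even k and eta_k < 0 for odd k.
signs_ok = all((eta[k] > 0) == (k % 2 == 0) for k in range(len(eta)))

# The eta_k satisfy the same recurrence as p_k and q_k, up to rounding error.
recurrence_ok = all(
    abs(eta[k + 1] - (a[k] * eta[k] + eta[k - 1])) < 1e-8
    for k in range(1, len(a))
)
print(q, signs_ok, recurrence_ok)
```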

We also have the crucial Diophantine approximation property

(8.5)

where \(\Vert y\Vert \) denotes the distance of a real number y from the nearest integer.

The concept of \(\alpha \)-representations, or Ostrowski representations, is first introduced in Sect. 6. Every integer \(N\geqslant 1\) has a unique representation in the form

$$\begin{aligned} N=\sum _{i=0}^kb_iq_i, \end{aligned}$$
(8.6)

where the integer coefficients \(b_0,\ldots ,b_k\) satisfy the conditions

$$\begin{aligned} 0\leqslant b_0<a_1, \quad 0<b_k\leqslant a_{k+1}, \quad 0\leqslant b_i\leqslant a_{i+1}, \quad \; i=1,\ldots ,k-1, \end{aligned}$$
(8.7)

as well as the restrictions

$$\begin{aligned} b_{i-1}=0 \quad \;\hbox {if}\quad b_i=a_{i+1}, \quad i=1,\ldots ,k, \end{aligned}$$
(8.8)

where \(a_1,a_2,a_3,\ldots \) are digits of the continued fraction (8.1) of \(\alpha \), and \(q_0,q_1,q_2,\ldots \) are the denominators of the successive convergents of \(\alpha \). Furthermore, the value of the integer k is determined by the inequalities \(q_k\leqslant N<q_{k+1}\). We also say that a sequence \(b_0,b_1,\ldots ,b_k\) that satisfies (8.6)–(8.8) is \(\alpha \)-legitimate.
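The \(\alpha \)-representation can be computed greedily from the top digit down. The sketch below is a minimal illustration, assuming a finite list of hypothetical digits \(a_1,\ldots ,a_5\) and our own function names `ostrowski` and `is_legitimate`. It checks, for every N below the last denominator, that the greedy digits reproduce N and satisfy the conditions (8.7) and (8.8).

```python
def denominators(a):
    """q_0, ..., q_len(a) from digits a = [a_1, a_2, ...]."""
    q = [1, a[0]]
    for k in range(1, len(a)):
        q.append(a[k] * q[k] + q[k - 1])
    return q

def ostrowski(n, a):
    """Greedy digits b_0, ..., b_k of an integer 1 <= n < q_len(a)."""
    q = denominators(a)
    k = max(i for i in range(len(q)) if q[i] <= n)   # q_k <= n < q_{k+1}
    b = [0] * (k + 1)
    for i in range(k, -1, -1):
        b[i], n = divmod(n, q[i])                    # take as many q_i as fit
    return b

def is_legitimate(b, a):
    """Check the conditions (8.7) and (8.8); a[i] is the digit a_{i+1}."""
    k = len(b) - 1
    ok = 0 <= b[0] < a[0] and b[k] > 0
    ok = ok and all(0 <= b[i] <= a[i] for i in range(1, k + 1))
    ok = ok and all(b[i - 1] == 0 for i in range(1, k + 1) if b[i] == a[i])
    return ok

a = [2, 3, 1, 4, 2]                                  # hypothetical digits a_1..a_5
q = denominators(a)                                  # [1, 2, 7, 9, 43, 95]
all_ok = all(
    is_legitimate(ostrowski(n, a), a)
    and sum(bi * qi for bi, qi in zip(ostrowski(n, a), q)) == n
    for n in range(1, q[-1])
)
print(ostrowski(100, [6, 6, 6, 6]), all_ok)
```

With constant digits 6, the denominators are 1, 6, 37, 228, and the greedy digits of 100 are \(b_0=2\), \(b_1=4\), \(b_2=2\), since \(100=2\cdot 37+4\cdot 6+2\).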

We need one more concept that is not so well known. If \(\alpha \) is irrational, then any real number \(\beta \in (-\alpha ,1-\alpha )\) can be written in the form

$$\begin{aligned} \beta =\sum _{i=0}^\infty c_i\eta _i, \end{aligned}$$
(8.9)

where the integers \(c_0,c_1,c_2,\ldots \) satisfy the conditions

$$\begin{aligned} 0\leqslant c_0<a_1, \quad 0\leqslant c_i\leqslant a_{i+1}, \quad \; i=1,2,3,\ldots , \end{aligned}$$
(8.10)

and

$$\begin{aligned} c_{i-1}=0 \quad \;\hbox {if}\quad c_i=a_{i+1}, \quad i=1,2,3,\ldots , \end{aligned}$$
(8.11)

where \(a_1,a_2,a_3,\ldots \) are digits of the continued fraction (8.1) of \(\alpha \). Furthermore, if we exclude the case

$$\begin{aligned} c_{u_0+2i}=a_{u_0+2i+1} \quad \hbox {for some } u_0 \hbox { and all } i\geqslant 0, \end{aligned}$$
(8.12)

then the representation (8.9) under the conditions (8.10) and (8.11) is unique. We call this the \(\alpha \)-expansion of the real number \(\beta \in (-\alpha ,1-\alpha )\).

The \(\alpha \)-expansion of the real number \(\beta \in (-\alpha ,1-\alpha )\) follows from the density of \(n\alpha \) mod 1 and the \(\alpha \)-representations of positive integers. Indeed, since \(n\alpha \) mod 1 is dense in the unit interval, for any real number \(\beta \in (-\alpha ,1-\alpha )\), there is an infinite sequence of positive integers \(1\leqslant n_1<n_2<\cdots<n_r<\cdots \) such that

$$\begin{aligned} \lim _{r\rightarrow \infty }n_r\alpha =\beta \bmod {1}. \end{aligned}$$
(8.13)

For each integer \(n_r\) of this sequence, consider the \(\alpha \)-representation

$$\begin{aligned} n_r=\sum _{i=0}^{k(r)}b_i(n_r)q_i, \end{aligned}$$

where the digits \(b_0(n_r),\ldots ,b_{k(r)}(n_r)\) satisfy conditions analogous to (8.7) and (8.8). Using (8.2)–(8.4) and (8.7), we have the upper bound

(8.14)

and the lower bound

(8.15)

Note next that

(8.16)

Since \(\beta \in (-\alpha ,1-\alpha )\), it now follows on combining (8.13)–(8.16) that

Combining this with a standard compactness argument, we obtain the existence of an \(\alpha \)-expansion (8.9) with the coefficients satisfying (8.10) and (8.11). Indeed, since \(0\leqslant b_0(n_r)<a_1\), there exists an infinite set \(R_0\) such that \(b_0(n_r)\) for every \(r\in R_0\) has the same value \(c_0\), say. Since \(0\leqslant b_1(n_r)\leqslant a_2\), there exists an infinite subset \(R_1\subset R_0\) such that \(b_1(n_r)\) for every \(r\in R_1\) has the same value \(c_1\), say. And so on. Compactness defines the infinite sequence of coefficients \(c_i\). On the other hand, the convergence of the series on the right-hand side of (8.9) is clear from the bound (8.5) and the exponential growth of the sequence \(q_{k+1}\) as shown by (8.2) which gives the estimate \(q_{k+1}\geqslant q_k+q_{k-1}\geqslant 2q_{k-1}\).

The fact that excluding the case (8.12) guarantees uniqueness of the \(\alpha \)-expansion is left to the reader as an exercise.

We shall be concerned with real numbers \(\beta \) satisfying \(0<\beta <1-\alpha \). It is well known that for any \(\beta \) with \(\alpha \)-expansion (8.9), we have \(0<\beta <1-\alpha \) if and only if

(8.17)

Let \(\alpha \in (0,1)\) be a fixed irrational number, and let \(N\geqslant 1\) be an integer. For any real number \(\beta \) satisfying \(0<\beta <1-\alpha \), let

$$\begin{aligned} \Phi (\alpha ;\beta ;N)=\vert \{q=0,\ldots ,N-1:\{q\alpha \}\in [0,\beta )\}\vert . \end{aligned}$$
(8.18)
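For orientation, the counting number (8.18) can be evaluated by brute force for moderate N. The sketch below is illustrative only: the slope \(\alpha =\sqrt{10}-3\), the value \(\beta =0.3\) and the function name `phi` are assumptions made for this example, and floating-point fractional parts are adequate only for moderate N.

```python
import math

def phi(alpha, beta, n):
    """Number of q = 0, ..., n-1 with {q * alpha} in [0, beta), as in (8.18)."""
    return sum(1 for q in range(n) if (q * alpha) % 1.0 < beta)

alpha = math.sqrt(10) - 3       # hypothetical slope, alpha = [6, 6, 6, ...]
beta = 0.3                      # hypothetical length with 0 < beta < 1 - alpha

sizes = (1, 10, 100, 1000, 10000)
counts = [phi(alpha, beta, n) for n in sizes]
for n, c in zip(sizes, counts):
    print(n, c, c / n)          # the ratio c / n approaches beta
```

The term \(q=0\) always contributes, so the count is 1 when \(N=1\), and the ratio \(\Phi (\alpha ;\beta ;N)/N\) approaches \(\beta \) by the Kronecker–Weyl equidistribution theorem.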

The next result gives a fairly complicated parity formula for the difference of two counting numbers \(\Phi (\alpha ;\beta ';N)\) and \(\Phi (\alpha ;\beta '';N)\) in terms of the continued fraction of \(\alpha \), the \(\alpha \)-representation of N and the \(\alpha \)-expansions of \(\beta '\) and \(\beta ''\) under some very special circumstances.

Lemma 8.1

Suppose that \(\alpha \in (0,1)\) is a fixed irrational number, and that the integer N satisfies \(1\leqslant N<q_{k+1}\). Suppose further that

$$\begin{aligned} N=\sum _{i=0}^kb_iq_i \end{aligned}$$
(8.19)

denotes the \(\alpha \)-representation of an integer \(N\geqslant 1\), and that

$$\begin{aligned} \beta '=\sum _{i=0}^\infty \, c'_i\eta _i \quad \hbox {and}\quad \beta ''=\sum _{i=0}^\infty \, c''_i\eta _i \end{aligned}$$
(8.20)

denote respectively the \(\alpha \)-expansions of two real numbers \(\beta '\) and \(\beta ''\), where the digits \(c'_i\) and \(c''_i\) are all even, and where \(c'_i\) and \(c''_i\) are non-zero whenever i is even. For every integer \(j=0,\ldots ,k\), let

$$\begin{aligned} N_j=N_j(N)=\sum _{i=0}^jb_iq_i \end{aligned}$$
(8.21)

denote an integer defined in terms of the \(\alpha \)-representation of N, and let

$$\begin{aligned} C'_j=C'_j(N)=\sum _{i=0}^j\, c'_iq_i \quad \hbox {and}\quad C''_j=C''_j(N)=\sum _{i=0}^j\, c''_iq_i \end{aligned}$$
(8.22)

denote respectively integers defined in terms of the \(\alpha \)-expansions of \(\beta '\) and \(\beta ''\). For every integer \(\ell =1,\ldots ,k\), let

$$\begin{aligned} \Delta '_\ell =\Delta '_\ell (N) ={\left\{ \begin{array}{ll} \,1,&{}\hbox {if } \ell \hbox { is even and } C'_{\ell -1}<N_{\ell -1}\leqslant N_\ell<C'_\ell ,\\ \,-1,&{}\hbox {if } \ell \hbox { is odd and } N_{\ell -1}\leqslant C'_{\ell -1}\leqslant C'_\ell <N_\ell ,\\ \,0,&{}\hbox {otherwise}, \end{array}\right. } \end{aligned}$$
(8.23)

and

$$\begin{aligned} \Delta ''_\ell =\Delta ''_\ell (N) ={\left\{ \begin{array}{ll} \,1,&{}\hbox {if } \ell \hbox { is even and } C''_{\ell -1}<N_{\ell -1}\leqslant N_\ell<C''_\ell ,\\ \,-1,&{}\hbox {if } \ell \hbox { is odd and } N_{\ell -1}\leqslant C''_{\ell -1}\leqslant C''_\ell <N_\ell ,\\ \,0,&{}\hbox {otherwise}. \end{array}\right. } \end{aligned}$$
(8.24)

Then, provided that the coefficients \(b_1,\ldots ,b_k\) in (8.19) and (8.21) satisfy

$$\begin{aligned} b_i<a_{i+1}, \quad i=1,\ldots ,k, \end{aligned}$$
(8.25)

we have

(8.26)

We shall prove Lemma 8.1 in Sect. 9. There the reader will see that a parity formula for a single counting number \(\Phi (\alpha ;\beta ;N)\) contains a translation term which we are not able to handle. However, in the difference of two counting numbers, this term appears twice, and the two occurrences cancel each other modulo 2.

Remark 8.2

The counting number \(\Phi (\alpha ;\beta ;N)\) is related to the discrepancy function

$$\begin{aligned} D(\alpha ;\beta ;N)=\vert \{q=0,\ldots ,N-1:\{q\alpha \}\in [0,\beta )\}\vert -N\beta \end{aligned}$$

for which there is an explicit formula due to Sós [12]. In fact, one can derive our parity formula using the ideas of Sós. However, it would be most unkind to ask the reader to work out the details. Instead, we include in the next section a detailed proof by closely following the method of Sós.

The conditions in (8.23) and (8.24) may look elegant, but as they stand, they are not of much use. For applications later, we need a less elegant but more convenient form. We summarize it below. The proof is almost trivial.

Lemma 8.3

Suppose that for every integer \(j=0,\ldots ,k\), the integer \(C_j\) is defined in terms of (8.9) in precisely the same way as the integers \(C'_j\) and \(C''_j\) are defined by (8.22) in terms of the \(\alpha \)-expansions (8.20) of \(\beta '\) and \(\beta ''\) respectively.

  1. (i)

    The condition \(C_{\ell -1}<N_{\ell -1}\leqslant N_\ell <C_\ell \) is equivalent to

    $$\begin{aligned} b_\ell <c_\ell , \end{aligned}$$
    (8.27)

    together with the existence of an integer \(m<\ell \) such that

    $$\begin{aligned} c_m<b_m \quad \hbox {and}\quad c_i=b_i, \quad m<i<\ell . \end{aligned}$$
    (8.28)
  2. (ii)

    The condition \(N_{\ell -1}\leqslant C_{\ell -1}\leqslant C_\ell <N_\ell \) is equivalent to

    $$\begin{aligned} c_\ell <b_\ell , \end{aligned}$$
    (8.29)

    together with either

    $$\begin{aligned} b_i=c_i, \quad i<\ell , \end{aligned}$$
    (8.30)

    or the existence of an integer \(m<\ell \) such that

    $$\begin{aligned} b_m<c_m \quad \hbox {and}\quad b_i=c_i, \quad m<i<\ell . \end{aligned}$$
    (8.31)
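Lemma 8.3 says that comparing the partial sums amounts to comparing digit sequences lexicographically from the top. This can be confirmed by exhaustive search in a small case. The sketch below is an illustration under our own assumptions: digits \(a_1=\cdots =a_4=3\), sequences of length 4, and hypothetical helper names. Both \(b_i\) and \(c_i\) range over all sequences satisfying the digit conditions, with leading zeros allowed.

```python
from itertools import product

def legit_sequences(a, length):
    """All (x_0, ..., x_{length-1}) with 0 <= x_0 < a_1, 0 <= x_i <= a_{i+1},
    and x_{i-1} = 0 whenever x_i = a_{i+1}; here a[i] is the digit a_{i+1}."""
    ranges = [range(a[0])] + [range(a[i] + 1) for i in range(1, length)]
    for x in product(*ranges):
        if all(x[i - 1] == 0 for i in range(1, length) if x[i] == a[i]):
            yield x

def partial(x, q, j):
    """Partial sum x_0 q_0 + ... + x_j q_j."""
    return sum(x[i] * q[i] for i in range(j + 1))

def lex_below(x, y, l):
    """True if some m < l has x_m < y_m and x_i = y_i for m < i < l."""
    return any(x[m] < y[m] and x[m + 1:l] == y[m + 1:l] for m in range(l))

a = [3, 3, 3, 3]                 # hypothetical digits a_1..a_4
q = [1, 3, 10, 33]               # q_0..q_3 for these digits
seqs = list(legit_sequences(a, 4))

mismatches = 0
for b, c in product(seqs, repeat=2):
    for l in range(1, 4):
        N0, N1 = partial(b, q, l - 1), partial(b, q, l)
        C0, C1 = partial(c, q, l - 1), partial(c, q, l)
        # Part (i): (8.27) together with (8.28)
        lhs_i = C0 < N0 <= N1 < C1
        rhs_i = b[l] < c[l] and lex_below(c, b, l)
        # Part (ii): (8.29) together with (8.30) or (8.31)
        lhs_ii = N0 <= C0 <= C1 < N1
        rhs_ii = c[l] < b[l] and (b[:l] == c[:l] or lex_below(b, c, l))
        mismatches += (lhs_i != rhs_i) + (lhs_ii != rhs_ii)
print(len(seqs), mismatches)
```

The underlying fact is that digits satisfying these conditions always give a partial sum \(x_0q_0+\cdots +x_jq_j<q_{j+1}\), so numerical order and lexicographic order from the top digit coincide.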

To prove Theorem 3.6, the simple basic idea is to use discretization to convert the continuous problem of the distribution of an \(\alpha \)-geodesic on the 2-square-b surface to the discrete problem of the distribution of the sequence of points at which the \(\alpha \)-geodesic hits a vertical edge of the surface. The latter gives the irrational rotation sequence \(\{q\alpha \}\), \(q\geqslant 0\). The question of left or right square clearly leads to a parity problem, where left or right is converted to even or odd.

Let an irrational number \(\alpha \in (0,1)\) be given and fixed. Let \(N\geqslant 1\) be an arbitrary integer, and consider the unique \(\alpha \)-representation of N as given by (8.6)–(8.8).

Next, we use the unique \(\alpha \)-expansion of real numbers, and define the length \(\beta '\) of a first gate in terms of the \(\alpha \)-expansion

$$\begin{aligned} \beta '=\sum _{i=0}^\infty \, c'_i\eta _i, \quad c'_i={\left\{ \begin{array}{ll} \,2,&{}\hbox {if } i \hbox { is even},\\ \,0\hbox { or }2,&{}\hbox {if } i \hbox { is odd}, \end{array}\right. } \end{aligned}$$
(8.32)

and the length \(\beta ''\) of a second gate in terms of the \(\alpha \)-expansion

$$\begin{aligned} \beta ''=\sum _{i=0}^\infty \, c''_i\eta _i, \quad c''_i={\left\{ \begin{array}{ll} \,4,&{}\hbox {if } i \hbox { is even},\\ \,0,&{}\hbox {if } i \hbox { is odd}. \end{array}\right. } \end{aligned}$$
(8.33)

Clearly the condition (8.17) is satisfied by both \(\beta '\) and \(\beta ''\), so \(0<\beta '<1-\alpha \) and \(0<\beta ''<1-\alpha \). In fact, we have

(8.34)
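To get a feeling for the sizes of the two gate lengths, one can sum the series (8.32) and (8.33) numerically. The sketch below is illustrative only. It uses the slope \(\alpha =\sqrt{10}-3=[6,6,6,\ldots ]\), chosen because the digits 2 and 4 are then admissible; we do not claim that this slope satisfies (3.1). The truncation index K, the working precision and all names are our own assumptions.

```python
from decimal import Decimal, getcontext

getcontext().prec = 100                  # generous working precision
alpha = Decimal(10).sqrt() - 3           # hypothetical slope, alpha = [6, 6, 6, ...]
K = 30                                   # truncate the series after eta_K

p, q = [0, 1], [1, 6]                    # initial conditions (8.3) with a_1 = 6
for k in range(1, K):
    p.append(6 * p[k] + p[k - 1])        # recurrence (8.2) with a_{k+1} = 6
    q.append(6 * q[k] + q[k - 1])
eta = [q[k] * alpha - p[k] for k in range(K + 1)]

def truncated(digit):
    """Partial sum of digit(i) * eta_i over i = 0, ..., K."""
    return sum(digit(i) * eta[i] for i in range(K + 1))

# (8.32): even digits equal 2, odd digits 0 or 2. Since eta_i < 0 for odd i,
# odd digits 2 give the smallest beta' and odd digits 0 give the largest.
beta1_low = truncated(lambda i: 2)
beta1_high = truncated(lambda i: 2 if i % 2 == 0 else 0)
# (8.33): even digits equal 4, odd digits 0.
beta2 = truncated(lambda i: 4 if i % 2 == 0 else 0)

print(float(beta1_low), float(beta1_high), float(beta2), float(1 - alpha))
```

For this particular slope we have \(\eta _i=\alpha (-\alpha )^i\), so the series are geometric and \(\beta ''=4\alpha /(1-\alpha ^2)=2/3\), while \(\beta '\) lies between \(2\alpha /(1+\alpha )\) and \(1/3\).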

Lemma 8.4

Suppose that the irrational number \(\alpha \in (0,1)\) satisfies (3.1), and that the lengths \(\beta '\) and \(\beta ''\) of the gates satisfy (8.32) and (8.33). For every integer \(k\geqslant 0\), consider the set

Then we have the lower bound

(8.35)

provided that \(\varepsilon >0\) is sufficiently small.

Proof

The condition (3.1) clearly guarantees that \(a_i\geqslant 6\), \(i\geqslant 1\), so that our choices of \(\beta '\) and \(\beta ''\) in (8.32) and (8.33) are valid.

For each element \(N\) of this set, we can write

$$\begin{aligned} N=\sum _{i=0}^kb_iq_i, \end{aligned}$$

where, for each \(i=0,\ldots ,k\), the coefficient \(b_i=b_i(N)\) satisfies (8.7) and (8.8), and with the convention that \(b_0(0)=\cdots =b_k(0)=0\). In this way, we see that the set is in one-to-one correspondence with the collection of \(\alpha \)-legitimate sequences \(b_0,b_1,\ldots ,b_k\) together with the trivial sequence \(0,\ldots ,0\).

Suppose that for an integer , we have

$$\begin{aligned} 5\leqslant b_\ell <a_{\ell +1}, \quad \ell =1,\ldots ,k. \end{aligned}$$
(8.36)

In view of (8.32), the first sum in (8.26) modulo 2 is equal to

(8.37)

In view of (8.33), the second sum in (8.26) modulo 2 is equal to

(8.38)

For the third sum in (8.26), note that (8.27) does not hold with \(c'_\ell =2\), so it follows from (8.23) and Lemma 8.3 (i) that \(\Delta '_\ell =0\) for even \(\ell \geqslant 2\). Also (8.30) and (8.31) do not hold with \(c_i=0\) or \(c_i=2\), so it follows from (8.23) and Lemma 8.3 (ii) that \(\Delta '_\ell =0\) for odd \(\ell \geqslant 3\). Thus the third sum in (8.26) is equal to

$$\begin{aligned} \sum _{\ell =1}^k\Delta '_\ell =\Delta '_1. \end{aligned}$$
(8.39)

For \(\ell =1\), it is clear that (8.29) holds. For \(b_0=0,1\), it is clear that (8.31) holds. For \(b_0=2\), it is clear that (8.30) holds. For \(b_0\geqslant 3\), it is clear that neither (8.30) nor (8.31) holds. Thus it follows from (8.23) and Lemma 8.3 (ii) that

$$\begin{aligned} \Delta '_1={\left\{ \begin{array}{ll} \,-1,&{}\hbox {if}\;\; b_0=0,1,2,\\ \,0,&{}\hbox {if}\;\; b_0\geqslant 3. \end{array}\right. } \end{aligned}$$
(8.40)

For the fourth sum in (8.26), note that (8.27) does not hold with \(c''_\ell =4\), so it follows from (8.24) and Lemma 8.3 (i) that \(\Delta ''_\ell =0\) for even \(\ell \geqslant 2\). Also (8.30) and (8.31) do not hold with \(c_i=0\) or \(c_i=4\), so it follows from (8.24) and Lemma 8.3 (ii) that \(\Delta ''_\ell =0\) for odd \(\ell \geqslant 3\). Thus the fourth sum in (8.26) is equal to

$$\begin{aligned} \sum _{\ell =1}^k\Delta ''_\ell =\Delta ''_1. \end{aligned}$$
(8.41)

For \(\ell =1\), it is clear that (8.29) holds. For \(b_0=0,1,2,3\), it is clear that (8.31) holds. For \(b_0=4\), it is clear that (8.30) holds. For \(b_0\geqslant 5\), it is clear that neither (8.30) nor (8.31) holds. Thus it follows from (8.24) and Lemma 8.3 (ii) that

$$\begin{aligned} \Delta ''_1={\left\{ \begin{array}{ll} \,-1,&{}\hbox {if}\;\; b_0=0,1,2,3,4,\\ \,0,&{}\hbox {if}\;\; b_0\geqslant 5. \end{array}\right. } \end{aligned}$$
(8.42)
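The case analysis leading to (8.39)–(8.42) can be verified directly from the definitions (8.23) and (8.24) in a small example. The sketch below is an illustration under our own assumptions: constant digits \(a_i=8\), \(k=4\), and hypothetical function names. It runs over all \(b_0\), all \(b_1,\ldots ,b_4\) satisfying (8.36), and all admissible choices of the odd digits \(c'_1,c'_3\).

```python
from itertools import accumulate, product

def delta(l, N, C):
    """Delta_l as in (8.23) and (8.24), from lists of partial sums N_j and C_j."""
    if l % 2 == 0 and C[l - 1] < N[l - 1] <= N[l] < C[l]:
        return 1
    if l % 2 == 1 and N[l - 1] <= C[l - 1] <= C[l] < N[l]:
        return -1
    return 0

a = [8] * 5                              # hypothetical digits a_1..a_5, all >= 6
q = [1, a[0]]
for i in range(1, 4):
    q.append(a[i] * q[i] + q[i - 1])     # q_0..q_4 = 1, 8, 65, 528, 4289

def partial_sums(x):
    """The list of partial sums x_0 q_0 + ... + x_j q_j for j = 0, ..., 4."""
    return list(accumulate(xi * qi for xi, qi in zip(x, q)))

C2 = partial_sums([4, 0, 4, 0, 4])       # C''_j from the digits in (8.33)
results = []
for b0, c1, c3 in product(range(a[0]), (0, 2), (0, 2)):
    C1 = partial_sums([2, c1, 2, c3, 2]) # C'_j from the digits in (8.32)
    for tail in product(range(5, 8), repeat=4):   # (8.36): 5 <= b_l < 8
        N = partial_sums([b0, *tail])
        d1 = [delta(l, N, C1) for l in range(1, 5)]
        d2 = [delta(l, N, C2) for l in range(1, 5)]
        results.append((b0, d1, d2))
print(len(results))
```

In every case only the term with \(\ell =1\) can be non-zero, with \(\Delta '_1=-1\) precisely when \(b_0\leqslant 2\) and \(\Delta ''_1=-1\) precisely when \(b_0\leqslant 4\).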

Hence if (8.36) holds, then it follows from (8.26) and (8.37)–(8.42) that

$$\begin{aligned} \textrm{parity}(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N)) ={\left\{ \begin{array}{ll} \,1,&{}\hbox {if}\;\; b_0=4,\\ \,0,&{}\hbox {if}\;\; b_0\ne 4. \end{array}\right. } \end{aligned}$$
(8.43)

We have the trivial bounds

(8.44)

Using condition (3.1), we have

(8.45)

where is the exponential function. Thus (8.44) and (8.45) give

(8.46)

We wish to find a lower bound for the cardinality of the set

Observe from (8.43) that \(\textrm{parity}(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N))=0\) except possibly when

$$\begin{aligned} b_0(N)=4, \end{aligned}$$

or (8.36) fails, so that

$$\begin{aligned} b_\ell (N)\in J_\ell =\{0,1,2,3,4,a_{\ell +1}\} \quad \hbox {for some}\;\; \ell =1,\ldots ,k. \end{aligned}$$

Accordingly, we need to find a lower bound for the cardinality of the set

as well as upper bounds for the cardinality of each of the sets

and combine these with the inequality

(8.47)

Indeed, we have the trivial lower bound

Combining this with (3.1) and (8.46), we obtain

(8.48)

Also, for each \(\ell =1,\ldots ,k\) and \(j\in J_\ell \), we have the trivial upper bound

Combining this with (3.1) and (8.46), we obtain

(8.49)

Finally, combining (8.47)–(8.49), we obtain the desired lower bound

provided that \(\varepsilon >0\) is sufficiently small. \(\square \)

Next, for every \(b=1,\ldots ,a_{k+2}-1\), we consider the sets

where every element \(N\) can be written in the form

$$\begin{aligned} N=bq_{k+1}+\sum _{i=0}^kb_iq_i, \end{aligned}$$

and the coefficients \(b_i=b_i(N)=b_i(N-bq_{k+1})\) satisfy (8.7) and (8.8). For every element , the analog of (8.26) is

(8.50)

where \(b_{k+1}=b\).

Lemma 8.5

Under the hypotheses of Lemma 8.4, suppose further that k is odd. Then for every \(b=1,\ldots ,a_{k+2}-1\) apart from \(b=2\), we have the lower bound

(8.51)

provided that \(\varepsilon >0\) is sufficiently small. Furthermore, we have the lower bound

(8.52)

provided that \(\varepsilon >0\) is sufficiently small.

Proof

Clearly \(k+1\) is even, so it follows from (8.32) and (8.33) that \(c'_{k+1}=2\) and \(c''_{k+1}=4\). Then the first sum in (8.50), compared modulo 2 to the corresponding sum in (8.26), has an extra term

(8.53)

The second sum in (8.50), compared modulo 2 to the corresponding sum in (8.26), has an extra term

(8.54)

The third sum in (8.50), compared to the corresponding sum in (8.26), has an extra term \(\Delta '_{k+1}\). If \(b=1\), then (8.27) and (8.28) hold with \(\ell =k+1\) and \(m=k\), so it follows from (8.23) and Lemma 8.3 (i) that \(\Delta '_{k+1}=1\). If \(b\ne 1\), then (8.27) does not hold with \(\ell =k+1\), so it follows from (8.23) and Lemma 8.3 (i) that \(\Delta '_{k+1}=0\). Thus

$$\begin{aligned} \Delta '_{k+1}={\left\{ \begin{array}{ll} \,1,&{}\hbox {if}\;\; b=1,\\ \,0,&{}\hbox {if}\;\; b\ne 1. \end{array}\right. } \end{aligned}$$
(8.55)

The fourth sum in (8.50), compared to the corresponding sum in (8.26), has an extra term \(\Delta ''_{k+1}\). If \(b=1,2,3\), then (8.27) and (8.28) hold with \(\ell =k+1\) and \(m=k\), so it follows from (8.24) and Lemma 8.3 (i) that \(\Delta ''_{k+1}=1\). If \(b\ne 1,2,3\), then (8.27) does not hold with \(\ell =k+1\), so it follows from (8.24) and Lemma 8.3 (i) that \(\Delta ''_{k+1}=0\). Thus

$$\begin{aligned} \Delta ''_{k+1}={\left\{ \begin{array}{ll} \,1,&{}\hbox {if}\;\; b=1,2,3,\\ \,0,&{}\hbox {if}\;\; b\ne 1,2,3. \end{array}\right. } \end{aligned}$$
(8.56)

We now combine (8.53)–(8.56). If \(b\ne 2\), then compared to (8.26), there is no change in parity, so that if (8.36) holds, then

$$\begin{aligned} \textrm{parity}(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N))={\left\{ \begin{array}{ll} \,1,&{}\hbox {if}\;\; b_0=4,\\ \,0,&{}\hbox {if}\;\; b_0\ne 4, \end{array}\right. } \end{aligned}$$

resulting in a lower bound (8.51), provided that \(\varepsilon >0\) is sufficiently small. If \(b=2\), then compared to (8.26), there is a change in parity, so that if (8.36) holds, then

$$\begin{aligned} \textrm{parity}(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N))={\left\{ \begin{array}{ll} \,0,&{}\hbox {if}\;\; b_0=4,\\ \,1,&{}\hbox {if}\;\; b_0\ne 4, \end{array}\right. } \end{aligned}$$

resulting in a lower bound (8.52), provided that \(\varepsilon >0\) is sufficiently small. \(\square \)

Lemma 8.6

Under the hypotheses of Lemma 8.4, suppose further that k is even.

If \(c'_{k+1}=0\), then for every \(b=1,\ldots ,a_{k+2}-1\), we have the lower bound

(8.57)

provided that \(\varepsilon >0\) is sufficiently small.

If \(c'_{k+1}=2\), then for every \(b=1,\ldots ,a_{k+2}-1\) apart from \(b=1\), we have the lower bound

(8.58)

provided that \(\varepsilon >0\) is sufficiently small. Furthermore, we have the lower bound

(8.59)

provided that \(\varepsilon >0\) is sufficiently small.

Proof

Clearly \(k+1\) is odd, and it follows from (8.33) that \(c''_{k+1}=0\). The second sum in (8.50), compared to the corresponding sum in (8.26), has an extra term

(8.60)

The fourth sum in (8.50), compared to the corresponding sum in (8.26), has an extra term \(\Delta ''_{k+1}\). But then (8.30) and (8.31) do not hold with \(\ell =k+1\), so it follows from (8.24) and Lemma 8.3 (ii) that

$$\begin{aligned} \Delta ''_{k+1}=0. \end{aligned}$$
(8.61)

Suppose that \(c'_{k+1}=0\). The first sum in (8.50), compared to the corresponding sum in (8.26), has an extra term

(8.62)

The third sum in (8.50), compared to the corresponding sum in (8.26), has an extra term \(\Delta '_{k+1}\). Then (8.30) and (8.31) do not hold with \(\ell =k+1\), so it follows from (8.23) and Lemma 8.3 (ii) that

$$\begin{aligned} \Delta '_{k+1}=0. \end{aligned}$$
(8.63)

We now combine (8.60)–(8.63). Compared to (8.26), there is no change in parity, so that if (8.36) holds, then

$$\begin{aligned} \textrm{parity}(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N))={\left\{ \begin{array}{ll} \,1,&{}\hbox {if}\;\; b_0=4,\\ \,0,&{}\hbox {if}\;\; b_0\ne 4, \end{array}\right. } \end{aligned}$$

resulting in a lower bound (8.57), provided that \(\varepsilon >0\) is sufficiently small.

Suppose that \(c'_{k+1}=2\). Then the first sum in (8.50), compared modulo 2 to the corresponding sum in (8.26), has an extra term

(8.64)

The third sum in (8.50), compared to the corresponding sum in (8.26), has an extra term \(\Delta '_{k+1}\). Then (8.30) and (8.31) do not hold with \(\ell =k+1\), so it follows from (8.23) and Lemma 8.3 (ii) that

$$\begin{aligned} \Delta '_{k+1}=0. \end{aligned}$$
(8.65)

We now combine (8.60), (8.61), (8.64) and (8.65). If \(b\ne 1\), then compared to (8.26), there is no change in parity, so that if (8.36) holds, then

$$\begin{aligned} \textrm{parity}(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N))={\left\{ \begin{array}{ll} \,1,&{}\hbox {if}\;\; b_0=4,\\ \,0,&{}\hbox {if}\;\; b_0\ne 4, \end{array}\right. } \end{aligned}$$

resulting in a lower bound (8.58), provided that \(\varepsilon >0\) is sufficiently small. If \(b=1\), then compared to (8.26), there is a change in parity, so that if (8.36) holds, then

$$\begin{aligned} \textrm{parity}(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N))={\left\{ \begin{array}{ll} \,0,&{}\hbox {if}\;\; b_0=4,\\ \,1,&{}\hbox {if}\;\; b_0\ne 4, \end{array}\right. } \end{aligned}$$

resulting in a lower bound (8.59), provided that \(\varepsilon >0\) is sufficiently small. \(\square \)

Proof of Theorem 3.6

In view of the inequalities (8.34), note that the difference

$$\begin{aligned} \Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N) \end{aligned}$$
(8.66)

corresponds to a surface with a gate , as shown in the picture on the left in Fig. 27, and an \(\alpha \)-geodesic that starts from the bottom of the left vertical edge of the surface. With the horizontal edge identifications shown, it is not difficult to see that this is equivalent to the 2-square-\((\beta ''-\beta ')\) surface shown in the picture on the right in Fig. 27, and an \(\alpha \)-geodesic that starts at a point on the left vertical edge at a distance \(\beta '\) from the top left vertex.

Fig. 27 Two equivalent surfaces

Note also that when the integer parameter N in (8.66) increases by 1, this corresponds to the \(\alpha \)-geodesic travelling from one vertical edge of the 2-square-b surface to the next vertical edge, and the length of this geodesic segment is clearly \((1+\alpha ^2)^{1/2}\).

The length \(\beta _0=\beta ''_0-\beta '_0\) of the gate, given in terms of the \(\alpha \)-expansions

$$\begin{aligned} \beta '_0=\sum _{i=0}^\infty \,c'_i\eta _i, \quad c'_i=2, \end{aligned}$$

and

$$\begin{aligned} \beta ''_0=\sum _{i=0}^\infty \, c''_i\eta _i, \quad c''_i={\left\{ \begin{array}{ll} \,4,&{}\hbox {if } i \hbox { is even},\\ \,0,&{}\hbox {if } i \hbox { is odd}, \end{array}\right. } \end{aligned}$$

satisfies (8.32) and (8.33), so that Lemmas 8.4–8.6 are valid.

  1. (i)

    For every positive integer n, let

    $$\begin{aligned} T^*_n=(1+\alpha ^2)^{1/2}q_{2n+1}. \end{aligned}$$

    Then (3.2) follows from (8.35) for \(b=0\) and from (8.58) for \(b=2,\ldots ,C\). Meanwhile, (3.3) follows from (8.59).

  2. (ii)

    For every positive integer n, let

    $$\begin{aligned} T^{**}_n=(1+\alpha ^2)^{1/2}q_{2n}. \end{aligned}$$

    Then (3.4) follows from (8.35) for \(b=0\) and from (8.51) for \(b=1\) or \(b=3,\ldots ,C\). Meanwhile, (3.5) follows from (8.52).

    The length \(\beta _1=\beta ''_1-\beta '_1\) of the gate, given in terms of the \(\alpha \)-expansions

    $$\begin{aligned} \beta '_1=\sum _{i=0}^\infty \, c'_i\eta _i, \quad c'_i={\left\{ \begin{array}{ll} \,2,&{}\hbox {if } i \hbox { is even},\\ \,2,&{}\hbox {if } i \hbox { is odd and } i<2n+2,\\ \,0,&{}\hbox {if } i \hbox { is odd and }i>2n+2, \end{array}\right. } \end{aligned}$$

    and

    $$\begin{aligned} \beta ''_1=\sum _{i=0}^\infty \, c''_i\eta _i, \quad c''_i={\left\{ \begin{array}{ll} \,4,&{}\hbox {if } i \hbox { is even},\\ \,0,&{}\hbox {if } i \hbox { is odd}, \end{array}\right. } \end{aligned}$$

    satisfies (8.32) and (8.33), so that Lemmas 8.4–8.6 are valid.

  3. (iii)

    For every integer \(i=1,\ldots ,n\), let

    $$\begin{aligned} W_i=(1+\alpha ^2)^{1/2}q_{2i+1}, \end{aligned}$$

    as in Part (i). Then (3.6) follows from (8.35), and (3.7) follows from (8.59).

  4. (iv)

    Let . For any integer \(Q>Q^\star \), there clearly exists a unique integer \(k\geqslant 2n+2\) such that \(q_{k+1}\leqslant Q<q_{k+2}\). Furthermore, either

    (8.67)

    or Q is almost as large as \(q_{k+2}\), in the sense that

    $$\begin{aligned} Q\in [a_{k+2}q_{k+1},q_{k+2}). \end{aligned}$$
    (8.68)

    Suppose first that (8.67) holds. Then

    so that

    (8.69)

    If k is even, then it follows from (8.35), (8.57) and (8.69) that

    (8.70)

    If k is odd and , then the middle sum on the right hand side of (8.69) is empty, and it follows from (8.35) and (8.51) that

    $$\begin{aligned} \begin{aligned}&\bigl \vert \bigl \{N\in [0,Q):\textrm{parity}(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N))=0\bigr \}\bigr \vert \\&\quad >q_{k+1}(1-\varepsilon )+(Q-q_{k+1}-q_{k+1}\varepsilon ) =Q-2q_{k+1}\varepsilon \geqslant Q(1-2\varepsilon ). \end{aligned} \end{aligned}$$
    (8.71)

    If k is odd and , then we ignore the last term on the right-hand side of (8.69), and it follows from (8.35) and (8.51) that

    $$\begin{aligned} \begin{aligned} \bigl \vert \bigl \{N\in [0,Q):\textrm{parity}&(\Phi (\alpha ;\beta '';N)- \Phi (\alpha ;\beta ';N))=0\bigr \}\bigr \vert \\&>2q_{k+1}(1-\varepsilon )>\frac{2}{3}\,Q(1-\varepsilon ) >Q\biggl (\frac{2}{3}-\varepsilon \biggr ). \end{aligned} \end{aligned}$$
    (8.72)

    If k is odd and , then we ignore the term corresponding to \(b=2\) in the middle sum on the right-hand side of (8.69), and it follows from (8.35) and (8.51) that

    (8.73)

    Suppose next that (8.68) holds. Then analogous to (8.69), we have

    (8.74)

    where we have ignored the term corresponding to \(b=2\) and the last term. Combining (8.35) and (8.74) with (8.51) if k is odd and with (8.57) if k is even, we have

    (8.75)

Since \(\varepsilon >0\) is arbitrary, we see that (3.8) follows immediately from (8.70)–(8.73) and (8.75) if we take \(W^\star =(1+\alpha ^2)^{1/2}Q^\star \). We leave the deduction of the inequality \(\vert \beta _1-\beta _0\vert <\varepsilon \) to the reader. \(\square \)

9 Establishing the parity formula

Throughout this section, we assume that the integers \(c_0,c_1,c_2,\ldots \) are even, and that \(c_i\) is non-zero whenever i is even.

Proof of Lemma 8.1

Recall the definition of \(\Phi (\alpha ;\beta ;N)\) as given by (8.18). We need to find a description of the condition \(0\leqslant q\leqslant N-1\) in terms of the \(\alpha \)-representations of q and N, as well as a description of the condition \(\{q\alpha \}\in [0,\beta )\) in terms of the \(\alpha \)-representation of q and the \(\alpha \)-expansion of \(\beta \). For these, we recall some elementary facts in the theory of continued fractions.

Fact 1

Suppose that

$$\begin{aligned} q=\sum _{i=0}^k\,x_iq_i \quad \hbox {and}\quad N=\sum _{i=0}^k\,b_iq_i \end{aligned}$$

are the \(\alpha \)-representations of q and N respectively, with the convention that when \(q=0\), we have \(x_0=\cdots =x_k=0\). Then \(0\leqslant q\leqslant N-1\) if and only if there exists some integer \(m=0,\ldots ,k\) such that

$$\begin{aligned} x_m<b_m, \quad \; x_i=b_i, \quad i=m+1,\ldots ,k. \end{aligned}$$
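Fact 1 states that the usual order on integers agrees with the lexicographic order on \(\alpha \)-representations, read from the top digit. The sketch below checks this exhaustively in a small case. It is an illustration only: the digits \(a_1,\ldots ,a_4\) are hypothetical, the names `digits` and `fact1` are our own, and representations are padded with zeros to the common length \(k+1\) with \(k=3\).

```python
def denominators(a):
    """q_0, ..., q_len(a) from digits a = [a_1, a_2, ...]."""
    q = [1, a[0]]
    for k in range(1, len(a)):
        q.append(a[k] * q[k] + q[k - 1])
    return q

def digits(n, q, k):
    """Greedy digits x_0, ..., x_k of 0 <= n < q_{k+1}, padded with zeros."""
    x = [0] * (k + 1)
    for i in range(k, -1, -1):
        x[i], n = divmod(n, q[i])
    return x

def fact1(x, b):
    """True if some m has x_m < b_m and x_i = b_i for all i > m."""
    return any(x[m] < b[m] and x[m + 1:] == b[m + 1:] for m in range(len(b)))

a = [3, 2, 4, 3]                  # hypothetical digits a_1..a_4
q = denominators(a)               # [1, 3, 7, 31, 100]
k = 3

# Compare "0 <= r <= N - 1" with the digit criterion for all pairs below q_4.
agree = all(
    (r <= N - 1) == fact1(digits(r, q, k), digits(N, q, k))
    for N in range(1, q[k + 1])
    for r in range(q[k + 1])
)
print(agree)
```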

Fact 2

We have

$$\begin{aligned} \{q\alpha \} =\biggl \{\sum _{i=0}^k\,x_iq_i\alpha \biggr \} =\biggl \{\sum _{i=0}^k\,x_i(q_i\alpha -p_i)\biggr \} =\biggl \{\sum _{i=0}^k\,x_i\eta _i\biggr \} =\sum _{i=0}^k\,x_i\eta _i \end{aligned}$$

if and only if

(9.1)

Suppose further that

$$\begin{aligned} \beta =\sum _{i=0}^k\,c_i\eta _i \end{aligned}$$

is the \(\alpha \)-expansion of \(\beta \), and that \(0<\beta <1-\alpha \). Then \(\{q\alpha \}\in [0,\beta )\) if and only if there exists an integer \(\ell \) such that

and (9.1) holds.

For , combining Facts 1 and 2, we have

$$\begin{aligned} \Phi (\alpha ;\beta ;N)=\sum _{m=0}^k\sum _{\ell =0}^k\Phi _{\ell ,m}(\alpha ;\beta ;N), \end{aligned}$$
(9.2)

where \(\Phi _{\ell ,m}(\alpha ;\beta ;N)\) denotes the number of integer sequences \((x_0,\ldots ,x_k)\) such that (9.1) holds,

(9.3)

and the remaining terms satisfy

$$\begin{aligned} 0\leqslant x_i\leqslant a_{i+1}, \quad x_{i-1}=0\;\hbox { if }\; x_i=a_{i+1}, \quad i=\ell +1,\ldots ,m-1. \end{aligned}$$
(9.4)

To study some of these terms \(\Phi _{\ell ,m}(\alpha ;\beta ;N)\), we need a technical lemma.

Lemma 9.1

Let denote the number of integer sequences \((y_h,y_{h+1},\ldots ,y_r)\) such that \(y_h=s\),

$$\begin{aligned} 0\leqslant y_i\leqslant a_{i+1}, \quad i=h+1,\ldots ,r, \end{aligned}$$

and

$$\begin{aligned} y_{i-1}=0 \quad \hbox {if}\quad y_i=a_{i+1}, \quad i=h+1,\ldots ,r. \end{aligned}$$

Then

(9.5)

and

(9.6)

Proof

For \(s\geqslant 1\), we shall prove (9.5) by induction on r, starting with \(r=h\). Clearly , and

Note next that we have the recurrence relation

To see this, note that for each of the \(a_{j+1}\) choices of \(y_j\) satisfying \(0\leqslant y_j<a_{j+1}\), there are choices for the integer sequence \((y_h,y_{h+1},\ldots ,y_{j-1})\), and for \(y_j=a_{j+1}\), we must have \(y_{j-1}=0\) and so there are choices for the integer sequence \((y_h,y_{h+1},\ldots ,y_{j-2})\). Using the induction hypothesis for the right-hand side, we have

The identity (9.5) follows from the Principle of Induction. Finally, note that

Applying (9.5) to the terms on the right now leads to the identity (9.6). \(\square \)
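The counting argument behind the recurrence relation in the proof can be tested by brute force. The sketch below is an illustration only. It writes `count_sequences(a, h, r, s)` (a hypothetical name) for the number of sequences \((y_h,\ldots ,y_r)\) described in Lemma 9.1, and checks that for \(s\geqslant 1\) and \(j\geqslant h+2\) the count at level j equals \(a_{j+1}\) times the count at level \(j-1\) plus the count at level \(j-2\), as the proof describes. The sample partial quotients are arbitrary.

```python
from itertools import product


def count_sequences(a, h, r, s):
    """Brute-force count of sequences (y_h, ..., y_r) with y_h = s,
    0 <= y_i <= a[i + 1] and y_{i-1} = 0 whenever y_i = a[i + 1],
    for i = h + 1, ..., r.  Here a[i] stands for a_i (a[0] is unused).
    """
    ranges = [range(a[i + 1] + 1) for i in range(h + 1, r + 1)]
    total = 0
    for tail in product(*ranges):
        y = (s,) + tail  # y[j] plays the role of y_{h+j}
        if all(y[j - 1] == 0
               for j in range(1, len(y))
               if y[j] == a[h + j + 1]):
            total += 1
    return total


# Hypothetical sample partial quotients a_1, ..., a_7 (index 0 is a placeholder).
a = [None, 3, 2, 1, 4, 2, 1, 3]
h = 0

# Check the recurrence count(j) = a_{j+1} * count(j-1) + count(j-2)
# for every starting value s >= 1 allowed by the sample and every j >= h + 2.
recurrence_holds = all(
    count_sequences(a, h, j, s)
    == a[j + 1] * count_sequences(a, h, j - 1, s)
    + count_sequences(a, h, j - 2, s)
    for s in (1, 2, 3)
    for j in range(h + 2, len(a) - 1)
)
```

The two base cases of the induction also show up directly: for \(s\geqslant 1\) the count is 1 when \(r=h\), and \(a_{h+2}\) when \(r=h+1\), since the choice \(y_{h+1}=a_{h+2}\) would force \(y_h=0\).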

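To evaluate \(\Phi (\alpha ;\beta ;N)\), we split the terms \(\Phi _{\ell ,m}(\alpha ;\beta ;N)\) in (9.2) into five cases, according to the relative size of \(\ell \) and m and the parity of \(\ell \).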

Case 1. Suppose that \(m>\ell \), with \(\ell \) even, so \(c_\ell \geqslant 1\). The conditions (9.3) become

$$\begin{aligned}&x_0=c_0,\quad \ldots , \quad x_{\ell -1}=c_{\ell -1}, \\&x_\ell \in \{0,1,\ldots ,c_\ell -1\}, \\&x_m\in \{0,1,\ldots ,b_m-1\}, \\&x_{m+1}=b_{m+1}, \quad \ldots , \quad x_k=b_k, \end{aligned}$$

and the remaining terms satisfy (9.4). Let \(\Phi ^*_{\ell ,m}(\alpha ;\beta ;N)\) denote the number of integer sequences \((x_0,\ldots ,x_k)\) such that (9.3) and (9.4) hold. The restriction (8.25) gives \(x_{m+1}\ne a_{m+2}\), so

$$\begin{aligned} \Phi ^*_{\ell ,m}(\alpha ;\beta ;N)=b_m\Omega _{m-1}, \end{aligned}$$

where \(\Omega _{m-1}\) is the number of those sequences \((x_\ell ,x_{\ell +1},\ldots ,x_{m-1})\) such that

$$\begin{aligned} x_\ell \in \{0,1,\ldots ,c_\ell -1\}, \end{aligned}$$

and (9.4) holds. Using Lemma 9.1, we have

Since \(c_\ell \) is even, it follows that

$$\begin{aligned} \Phi ^*_{\ell ,m}(\alpha ;\beta ;N)=b_m(p_mq_{\ell +1}-q_mp_{\ell +1})\bmod {2}. \end{aligned}$$

Suppose that \(\ell \ne 0\). The assumption that \(c_0\) is non-zero and the requirement \(x_0=c_0\) then guarantee that \(x_0\) is non-zero, so that (9.1) is clearly satisfied, and so \(\Phi _{\ell ,m}(\alpha ;\beta ;N)=\Phi ^*_{\ell ,m}(\alpha ;\beta ;N)\). On the other hand, when \(\ell =0\), we do not have \(x_0=c_0\) but \(x_0\in \{0,1,\ldots ,c_0-1\}\), so that \(x_0<c_0\), and this does not guarantee that (9.1) holds. To obtain \(\Phi _{0,m}(\alpha ;\beta ;N)\), we then have to deduct from \(\Phi ^*_{0,m}(\alpha ;\beta ;N)\) the count of those sequences \((x_0,\ldots ,x_k)\) that do not satisfy (9.1). We shall not give a precise value of this count. Instead, it suffices to show that this count is independent of the sequence \(c_0,c_1,c_2,c_3,\ldots \). The details differ slightly, depending on whether m is odd or even.

Suppose first of all that m is odd. Then those sequences \((x_0,\ldots ,x_k)\) that need to be excluded from the count are the following: either \(m\geqslant 3\) and there exists an even integer \(s=0,2,\ldots ,m-3\) such that

$$\begin{aligned}&x_0=x_1=\cdots =x_s=0, \\&x_{s+1}\in \{1,\ldots ,a_{s+2}\}, \\&x_{s+2}\in \{0,\ldots ,a_{s+3}-1\}, \\&x_{s+3},\ldots ,x_{m-1} \text { satisfy (9.4)}, \\&x_m<b_m, \\&x_{m+1}=b_{m+1},\quad \ldots ,\quad x_k=b_k, \end{aligned}$$

or, if \(b_{m+1}\ne a_{m+1}\),

$$\begin{aligned}&x_0=x_1=\cdots =x_{m-1}=0, \\&x_m\in \{1,\ldots ,b_m-1\}, \\&x_{m+1}=b_{m+1},\quad \ldots ,\quad x_k=b_k, \end{aligned}$$

or

This count is clearly independent of the sequence \(c_0,c_1,c_2,c_3,\ldots \)

Suppose next that m is even. Then those sequences \((x_0,\ldots ,x_k)\) that need to be excluded from the count are the following: either \(m\geqslant 4\) and there exists an even integer \(s=0,2,\ldots ,m-4\) such that

$$\begin{aligned}&x_0=x_1=\cdots =x_s=0, \\&x_{s+1}\in \{1,\ldots ,a_{s+2}\}, \\&x_{s+2}\in \{0,\ldots ,a_{s+3}-1\}, \\&x_{s+3},\ldots ,x_{m-1} \hbox { satisfy (9.4)}, \\&x_m<b_m, \\&x_{m+1}=b_{m+1},\quad \ldots ,\quad x_k=b_k, \end{aligned}$$

or \(m\geqslant 2\) and

$$\begin{aligned}&x_0=x_1=\cdots =x_{m-2}=0, \\&x_{m-1}\in \{1,\ldots ,a_m\}, \\&x_m<b_m, \\&x_{m+1}=b_{m+1},\quad \ldots ,\quad x_k=b_k, \end{aligned}$$

or \(m\geqslant 2\) and

This count is also independent of the sequence \(c_0,c_1,c_2,c_3,\ldots .\)

In summary, corresponding to these values of \(\ell \) and m, we can write

(9.7)

where is independent of the sequence \(c_0,c_1,c_2,c_3,\ldots \)

Case 2. Suppose that \(m>\ell \), with \(\ell \) odd, so \(c_{\ell -1}\geqslant 1\). The conditions (9.3) become

$$\begin{aligned}&x_0=c_0,\quad \ldots , \quad x_{\ell -1}=c_{\ell -1}, \\&x_\ell \in \{c_\ell +1,\ldots ,a_{\ell +1}-1\}, \\&x_m\in \{0,1,\ldots ,b_m-1\}, \\&x_{m+1}=b_{m+1}, \quad \ldots , \quad x_k=b_k, \end{aligned}$$

and the remaining terms satisfy (9.4). The restriction (8.25) gives \(x_{m+1}\ne a_{m+2}\), so

$$\begin{aligned} \Phi _{\ell ,m}(\alpha ;\beta ;N)=b_m\Omega _{m-1}, \end{aligned}$$

where \(\Omega _{m-1}\) is the number of those sequences \((x_\ell ,x_{\ell +1},\ldots ,x_{m-1})\) such that

$$\begin{aligned} x_\ell \in \{c_\ell +1,\ldots ,a_{\ell +1}-1\}, \end{aligned}$$

and (9.4) holds. Using Lemma 9.1, we have

Since \(c_\ell \) is even, it follows that

$$\begin{aligned} \Phi _{\ell ,m}(\alpha ;\beta ;N)=b_m(a_{\ell +1}-1)(p_mq_\ell -q_mp_\ell )\bmod {2}. \end{aligned}$$

Corresponding to these values of \(\ell \) and m, we can write, for instance

(9.8)

Case 3. Suppose that \(m=\ell \), with \(\ell \) even. The conditions (9.3) and (9.4) become

The restriction (8.25) gives \(x_{\ell +1}\ne a_{\ell +2}\), and so .

As in Case 1, for \(\ell =0\), we need to deduct from \(\Phi ^*_{0,m}(\alpha ;\beta ;N)\) the count of those sequences \((x_0,\ldots ,x_k)\) that do not satisfy (9.1), i.e., those that satisfy

As before, this count is independent of the sequence \(c_0,c_1,c_2,c_3,\ldots \)

In summary, corresponding to these values of \(\ell \) and m, we can write

(9.9)

where is independent of the sequence \(c_0,c_1,c_2,c_3,\ldots \)

Case 4. Suppose that \(m=\ell \), with \(\ell \) odd. The conditions (9.3) and (9.4) become

$$\begin{aligned}&x_0=c_0,\quad \ldots , \quad x_{\ell -1}=c_{\ell -1}, \\&x_\ell \in \{c_\ell +1,\ldots ,a_{\ell +1}-1\}, \\&x_\ell \in \{0,1,\ldots ,b_\ell -1\}, \\&x_{\ell +1}=b_{\ell +1}, \quad \ldots , \quad x_k=b_k. \end{aligned}$$

The restriction (8.25) gives \(x_{\ell +1}\ne a_{\ell +2}\), and it is not difficult to see that

Corresponding to these values of \(\ell \) and m, we can write

(9.10)

Case 5. Suppose that \(\ell >m\). Note here that the condition (9.3) is very restrictive and the condition (9.4) is void. It is easy to see that

$$\begin{aligned} \Phi _{\ell ,m}(\alpha ;\beta ;N)=\delta (\ell ,m), \end{aligned}$$

where

$$\begin{aligned} \delta (\ell ,m)={\left\{ \begin{array}{ll} \,1,&{}\hbox {if } c_m<b_m, c_i\!=\!b_i, m<i<\ell , \hbox { and } {{\,\textrm{sign}\,}}(c_\ell -b_\ell )\!=\!(-1)^\ell ,\quad \\ \,0,&{}\hbox {otherwise}. \end{array}\right. }\nonumber \\ \end{aligned}$$
(9.11)
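The indicator (9.11) is straightforward to evaluate from the digit sequences. The following sketch transcribes the definition directly. The function names and the sample digits are hypothetical, and the lists `c` and `b` are simply assumed to hold the digits \(c_0,\ldots ,c_k\) and \(b_0,\ldots ,b_k\).

```python
def sign(n):
    """Return 1, 0 or -1 according to the sign of the integer n."""
    return (n > 0) - (n < 0)


def delta(c, b, ell, m):
    """Indicator (9.11) for ell > m: equals 1 when c_m < b_m,
    c_i = b_i for m < i < ell, and sign(c_ell - b_ell) = (-1)**ell;
    equals 0 otherwise.
    """
    return int(
        c[m] < b[m]
        and all(c[i] == b[i] for i in range(m + 1, ell))
        and sign(c[ell] - b[ell]) == (-1) ** ell
    )


# Hypothetical sample digits, chosen only to exercise each condition.
c = [2, 2, 4]
even_hit = delta(c, [3, 2, 3], 2, 0)   # c_2 > b_2 and ell = 2 is even
even_miss = delta(c, [3, 2, 5], 2, 0)  # c_2 < b_2 fails for even ell
odd_hit = delta(c, [3, 4, 4], 1, 0)    # c_1 < b_1 and ell = 1 is odd
```

Note that when \(c_\ell =b_\ell \) the sign is 0, which never equals \((-1)^\ell \), so the indicator vanishes in that case.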

Corresponding to these values of \(\ell \) and m, we can write

(9.12)

Combining (9.2), (9.7)–(9.10) and (9.12), we see that

(9.13)

We shall show that

(9.14)

and that

(9.15)

We then show that

(9.16)

and that

(9.17)

Then combining (9.13)–(9.17), we deduce that

where the translation term

proves to be a nuisance.

Taking two numbers \(\beta '\) and \(\beta ''\) satisfying , with \(\alpha \)-expansion digits \(c'_i\) and \(c''_i\) respectively and taking the difference \(\Phi (\alpha ;\beta '';N)-\Phi (\alpha ;\beta ';N)\), we remove this translation term, and the deduction of the parity formula (8.26) is essentially complete, apart from the deduction of the congruences (9.14)–(9.17).

To establish (9.14), note simply from (9.7) and (9.8) that

To establish (9.15), note that using the recurrence relations (8.2), we can write

(9.18)

where

(9.19)

and

(9.20)

The congruence (9.15) now follows on combining (9.18)–(9.20).

To establish (9.16), note that

so that

and so

Hence

(9.21)

The congruence (9.16) now follows on combining (9.9), (9.10) and (9.21).

Finally, to establish (9.17), we note from (8.23), (8.24) and Lemma 8.3 that \(\Delta _\ell =1\) if and only if there exists an integer \(m<\ell \) such that

$$\begin{aligned} \ell \hbox { is even, (8.27) holds and (8.28) holds}, \end{aligned}$$
(9.22)

and \(\Delta _\ell =-1\) if and only if

$$\begin{aligned} \ell \hbox { is odd, (8.29) holds and (8.30) holds}, \end{aligned}$$
(9.23)

or there exists an integer \(m<\ell \) such that

$$\begin{aligned} \ell \hbox { is odd, (8.29) holds and (8.31) holds}. \end{aligned}$$
(9.24)

Naturally for a given \(\ell \), the integer m for which (9.22) or (9.24) is valid is uniquely determined. Suppose first that \(\ell \geqslant 1\) is even. Then \(\Delta _\ell =1\) if and only if there exists an integer \(m<\ell \) such that (9.22) holds. For the integer m in question, it follows from (9.11) that \(\delta (\ell ,m)=1\). Thus

$$\begin{aligned} \sum _{\begin{array}{c} {\ell =1}\\ {\ell \ \text {even}} \end{array}}^k\Delta _\ell =\mathop {\sum _{m=0}^k\sum _{\ell =0}^k}_{\begin{array}{c} {\ell >m}\\ {\ell \ \text {even}} \end{array}}\,\delta (\ell ,m). \end{aligned}$$
(9.25)

Suppose next that \(\ell \geqslant 1\) is odd. Then \(\Delta _\ell =-1\) if and only if (9.23) holds or there exists an integer \(m<\ell \) such that (9.24) holds. In the latter case, for the integer m in question, it follows from (9.11) that \(\delta (\ell ,m)=0\). Thus

$$\begin{aligned} \mathop {\sum _{m=0}^k\sum _{\ell =0}^k}_{\begin{array}{c} {\ell >m}\\ {\ell \ \text {odd}} \end{array}}\,\delta (\ell ,m)=0, \end{aligned}$$
(9.26)

and

(9.27)

The congruence (9.17) now follows on combining (9.12) and (9.25)–(9.27).

This completes the deduction of the parity formula (8.26). \(\square \)