Zero-Dimensional Principal Extensions

We prove that every topological dynamical system (X,T) has a zero-dimensional principal extension, i.e. a zero-dimensional extension (Y,S) such that for every S-invariant measure ν on Y the conditional entropy h(ν|X) is zero. This reduces the discussion of many entropy-related properties to the zero-dimensional case which gives access to the various useful tools of symbolic dynamics.

entropy or the entropies of the invariant measures, and, in fact, many other invariants. In case we find such a system (Y, S), we say that we have found a zero-dimensional extension. It is an elementary exercise to show that every system has a zero-dimensional extension. On the other hand, we are naturally interested in minimizing the "distance" between the original system (X, T) and its extension (Y, S). There are at least two levels at which this minimization can be performed:
1. We may be interested in minimizing, for each invariant measure μ on X, the increase of entropy as we lift that measure to an invariant measure ν on Y.
2. We may require that for each μ there exists a unique lift ν and that (Y, ν, S) is measure-theoretically isomorphic to (X, μ, T).
The increase of entropy specified in 1. is measured by the conditional dynamical entropy h(ν|X). In case μ has finite dynamical entropy, this conditional entropy simply equals the difference h(ν) − h(μ), but the conditional notation is universal. The best one can get at level 1 is a principal extension, i.e., one such that h(ν|X) = 0 for every invariant measure ν on Y. An extension which satisfies Postulate 2 will be called here an isomorphic extension. Clearly, an isomorphic extension is automatically principal.
For invertible systems (i.e., such that T is a homeomorphism) with finite topological entropy that satisfy the additional assumption of having a nonperiodic minimal factor, the existence of zero-dimensional isomorphic (hence principal) extensions has already been established as a consequence of the deep results in mean dimension theory developed by E. Lindenstrauss and B. Weiss ([5] and [4]). Every such system satisfies the so-called "small boundary property", which makes it rather easy to construct its zero-dimensional isomorphic extension. If the assumption about the existence of a minimal factor is dropped, the above theory still allows one to easily build a zero-dimensional principal extension: it is elementary to see that the direct product with any system of zero topological entropy is a principal extension and that the composition of two principal extensions is a principal extension. Thus, an arbitrary (invertible) system of finite topological entropy is first extended to its direct product with some infinite minimal system of zero topological entropy (for example an irrational rotation of the circle, or an odometer), and since such a product already has a minimal nonperiodic factor, it can be isomorphically extended to a zero-dimensional system. The resulting extension of the original system is no longer isomorphic, but it is an isomorphic extension of the intermediate product system. Lindenstrauss provided examples showing that both assumptions (finite entropy and minimal factor) are essential for the small boundary property [4]. Thus, for systems with infinite entropy, even those which admit a minimal nonperiodic factor, we cannot hope to prove the existence of an isomorphic zero-dimensional extension.
We can, however, still hope to prove the existence of a principal zero-dimensional extension by a different method, one that does not appeal to mean dimension or the small boundary property. This is exactly what is done in this paper: we prove that every topological dynamical system (X, T) has a principal zero-dimensional extension. The theorem presented herein has a twofold advantage over the same result obtained via mean dimension theory: it makes no restrictions on the entropy of the dynamical system, and it is achieved by more elementary methods.
Historically, the term principal was probably first used by F. Ledrappier in [3]. The construction of a principal extension for systems of finite topological entropy via the mean dimension theory was heavily exploited in [1] in the theory of symbolic extensions. The detailed description of the passage from the small boundary property to the principal extension can be found in [2] (in earlier papers it is considered more or less obvious and left to the reader).

Basic Notions
Dynamical Systems. Throughout this work a dynamical system will be a triple $(X, T, S)$, where $X$ is a compact metric space with metric $d$, $T$ is a continuous map on $X$ (invertible or not) and $S \in \{\mathbb{Z}, \mathbb{Z}_+ \cup \{0\}\}$ is the index set, depending on whether we consider both the negative and positive iterates of $T$ (i.e. the action on $X$ of $\mathbb{Z}$) or only the nonnegative ones (i.e. the action on $X$ of $\mathbb{Z}_+ \cup \{0\}$). Our final result applies in both cases, but a few details of the proof differ, so we need to be able to make the distinction. For brevity, where the index set or the transformation is obvious or irrelevant, we will omit it.
Factors, Extensions and Conjugacies. A dynamical system $(X, T, S)$ is a factor of the dynamical system $(Y, S, S)$ if there exists a continuous map $\pi$ from $Y$ onto $X$ such that $T \circ \pi = \pi \circ S$. In this situation we also say that $Y$ is an extension of $X$. If the map $\pi$ is a homeomorphism, we say that $X$ and $Y$ are conjugate.
Odometers. Let $\{p_k\}_{k \geq 1}$ be a sequence of positive integers such that $p_{k+1}$ is a multiple of $p_k$. An odometer $G$ to base $(p_k)$ is defined as the inverse limit of the sequence $\mathbb{Z}_{p_k}$, i.e. as the subset of the product $\prod_{k=1}^{\infty} \mathbb{Z}_{p_k}$ (with the product topology) consisting of sequences $\{g_k\}$ such that $g_k = g_{k+1} \bmod p_k$. With coordinate-wise addition $G$ is a compact, metrizable, zero-dimensional group, and with the action $T(g_1, g_2, \dots) = (g_1 + 1, g_2 + 1, \dots)$ it becomes a zero-dimensional dynamical system. Define $G_k$ as the set of points $g \in G$ such that $g_j = 0$ for $j \leq k$. Observe that $G$ is a disjoint union $G_k \cup T G_k \cup \dots \cup T^{p_k - 1} G_k$ and that $G_{k+1} \subset G_k$. The family $\{G_k\}$ is a base of the topology at zero.
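The structure of an odometer can be explored on a finite truncation. The following is a minimal sketch, not part of the original argument: it keeps only the first three coordinates for a hypothetical base (2, 6, 24), represents a point as a tuple of residues, and checks the inverse-limit condition and the returns to $G_2$ along the orbit of zero. The function names are illustrative assumptions.

```python
def odometer_point(n, base):
    """Truncated odometer point (n mod p_1, ..., n mod p_K) for an integer n."""
    return tuple(n % p for p in base)

def step(g, base):
    """The action T: add 1 in every coordinate, modulo the respective p_k."""
    return tuple((x + 1) % p for x, p in zip(g, base))

def is_consistent(g, base):
    """Inverse-limit condition g_k = g_{k+1} mod p_k for all available k."""
    return all(g[k] == g[k + 1] % base[k] for k in range(len(base) - 1))

def in_G(g, k):
    """Membership in G_k: the first k coordinates are zero."""
    return all(x == 0 for x in g[:k])

base = (2, 6, 24)  # hypothetical base; each p_{k+1} is a multiple of p_k
g = odometer_point(0, base)
orbit = []
for _ in range(base[-1]):
    orbit.append(g)
    g = step(g, base)

# Times at which the orbit of zero visits G_2 (expected: multiples of p_2 = 6).
visits_G2 = [n for n, h in enumerate(orbit) if in_G(h, 2)]
```

In this truncation the orbit of zero returns to $G_2$ exactly every $p_2$ steps, which mirrors the decomposition of $G$ into the $p_k$ disjoint translates of $G_k$.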

Zero-Dimensional Dynamical Systems
A dynamical system $(X, T)$ is called zero-dimensional if the space $X$ is zero-dimensional, i.e. if it has a base consisting of sets which are both closed and open. A particularly important class of such systems are symbolic systems over an uncountable alphabet, which we will call array systems and which are constructed as follows. Let $\Lambda_k$ be a finite set with the discrete topology and let $Y_c = \prod_{k=1}^{\infty} \Lambda_k^{S}$. The points of $Y_c$ can be thought of as arrays (infinite in at least two directions: downward and forward, and perhaps also backward) $\{y_{k,n}\}$, where $y_{k,n} \in \Lambda_k$. With the action of the horizontal shift $S$ (i.e. $(Sy)_{k,n} = y_{k,n+1}$), $Y_c$ becomes a zero-dimensional dynamical system. An array system is any closed, shift-invariant subset $Y$ of such $Y_c$.

Our main objects of interest will be marked array systems, i.e. ones for which there exist a descending sequence of clopen sets $F_k$ that are unions of cylinders associated to the symbols in $\Lambda_k$ (which means that we can determine whether an array $y$ belongs to $F_k$ by viewing the symbol $y_{k,0}$), and a sequence $\{p_k\}$ such that every point of $Y$ visits $F_k$ periodically with period $p_k$ (this implies that $p_{k+1}$ is a multiple of $p_k$). This is equivalent to requiring that $Y$ factors onto an odometer $G$ to the base $(p_k)$ via a map depending only on column zero. The points of such marked systems can be viewed as arrays with vertical lines inserted at regular intervals: if there is a marker at some position, then there is one in the same column in every preceding row (see Fig. 1).
By a $k$-rectangle we will mean a finite matrix $C = (C_{j,n})$, $j = 1, \dots, k$; $n = 0, \dots, p_k - 1$, that has markers in all rows in column 0. In the array representation such as the one in Fig. 1, a $k$-rectangle is a rectangle occurring in rows 1 through $k$ of such an array, between two vertical lines. Every $k$-rectangle $C$ can be identified with a cylinder in $Y$ in the standard way: $y \in C$ iff $y_{j,n} = C_{j,n}$ for $j = 1, \dots, k$; $n = 0, \dots, p_k - 1$. We will denote the set of all $k$-rectangles by $\mathcal{C}_k$.

The extreme points of $M_T(X)$ are ergodic measures, i.e. the ones for which any invariant set has measure either 0 or 1. We will denote the set of ergodic measures by $M^e_T(X)$. For any $x \in X$ let $\delta_x$ denote the point mass at $x$ and let

$$M_n(\delta_x) = \frac{1}{n} \sum_{i=0}^{n-1} \delta_{T^i x}.$$

We will later need the following two facts, both of which are fairly obvious:

Fact 2.2 Let U be an open subset of M(X) containing M T (X).
There exists an N such that for any n > N and any x ∈ X the measure M n (δ x ) is in U .
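Fact 2.2 can be observed numerically in a uniquely ergodic example. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes $M_n(\delta_x)$ to be the average of the point masses along the first $n$ points of the orbit of $x$, uses a rotation of the circle by the golden-mean angle (an arbitrary choice), and evaluates the empirical measure on one arc only. The helper names are hypothetical.

```python
import math

def empirical_measure(T, x, n, indicator):
    """M_n(delta_x)(A): fraction of the first n orbit points of x that lie in A."""
    hits = 0
    for _ in range(n):
        hits += indicator(x)
        x = T(x)
    return hits / n

ALPHA = (math.sqrt(5) - 1) / 2  # golden-mean rotation angle (illustrative choice)

def rotation(x):
    """Rotation of the circle [0, 1) by ALPHA."""
    return (x + ALPHA) % 1.0

def in_arc(x):
    """Indicator of the arc [0.2, 0.5), which has Lebesgue measure 0.3."""
    return 0.2 <= x < 0.5

# Largest deviation from the invariant measure over several starting points.
worst = max(abs(empirical_measure(rotation, x0, 5000, in_arc) - 0.3)
            for x0 in (0.0, 0.1, 0.37, 0.9))
```

Here the only invariant measure is the Lebesgue measure, so the deviation `worst` is small for every starting point, in line with the uniformity in $x$ asserted by Fact 2.2.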
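We will be particularly interested in invariant measures on marked array systems. Let $Y$ be a marked array system and let $d^*$ be a metric on $M(Y)$ compatible with the weak*-topology. It is a standard fact in zero-dimensional dynamics that we can assess the proximity of measures by investigating $k$-rectangles alone:

Fact 2.3 For every $\varepsilon > 0$ there exist $k \geq 1$ and $\delta > 0$ such that if $\mu, \nu \in M_S(Y)$ satisfy $|\mu(C) - \nu(C)| < \delta$ for every $k$-rectangle $C$, then $d^*(\mu, \nu) < \varepsilon$.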

Entropy
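We recall the basic definitions and facts of the entropy theory of dynamical systems. Let $(X, T)$ be a dynamical system and let $\mu \in M_T(X)$. For any finite partition $\mathcal{A}$ of $X$ into measurable sets we define the entropy of the partition as

$$H(\mu, \mathcal{A}) = -\sum_{A \in \mathcal{A}} \mu(A) \log \mu(A).$$

Writing $\mathcal{A}^n = \bigvee_{i=0}^{n-1} T^{-i}\mathcal{A}$, the sequence $H_n = \frac{1}{n} H(\mu, \mathcal{A}^n)$ is known to converge to its infimum, which allows one to define

$$h(\mu, \mathcal{A}) = \lim_{n \to \infty} \frac{1}{n} H(\mu, \mathcal{A}^n) = \inf_{n} \frac{1}{n} H(\mu, \mathcal{A}^n).$$

Finally, the entropy of a measure is given as

$$h(\mu) = \sup_{\mathcal{A}} h(\mu, \mathcal{A}),$$

where the supremum is taken over all finite partitions of $X$.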
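If $\mathcal{A}$ and $\mathcal{B}$ are two finite partitions of $X$, then we can define the conditional entropy of $\mathcal{A}$ with respect to $\mathcal{B}$ as

$$H(\mu, \mathcal{A}|\mathcal{B}) = \sum_{B \in \mathcal{B}} \mu(B)\, H(\mu_B, \mathcal{A}),$$

where $\mu_B$ is the conditional measure given by $\mu_B(A) = \mu(A \cap B)/\mu(B)$ for all Borel sets $A$. Then we can proceed to define

$$H_n(\mu, \mathcal{A}|\mathcal{B}) = \frac{1}{n} H(\mu, \mathcal{A}^n|\mathcal{B}^n).$$

Again, the sequence $H_n$ is known to converge to its infimum, which allows one to define

$$h(\mu, \mathcal{A}|\mathcal{B}) = \lim_{n \to \infty} H_n(\mu, \mathcal{A}|\mathcal{B}).$$

Suppose now that we have a system $(Y, S)$ that is an extension of $(X, T)$ by a map $\pi$. Let $\nu$ be an invariant measure on $Y$. Any partition $\mathcal{B}$ of $X$ can be lifted to a partition $\pi^{-1}(\mathcal{B})$ of $Y$. For a finite partition $\mathcal{A}$ of $Y$ we set

$$h(\nu, \mathcal{A}|X) = \inf_{\mathcal{B}} h(\nu, \mathcal{A}|\pi^{-1}(\mathcal{B})),$$

where the infimum is taken over all finite partitions of $X$. Finally, define

$$h(\nu|X) = \sup_{\mathcal{A}} h(\nu, \mathcal{A}|X),$$

where once again the supremum is taken over all finite partitions of $Y$. If the image $\pi\nu$ of $\nu$ by the factor map $\pi$ has finite entropy, then it is not difficult to see that

$$h(\nu|X) = h(\nu) - h(\pi\nu).$$

We will make use of the following fact.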
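The static quantities above are easy to compute for explicit partitions. The following sketch assumes the usual definitions (the entropy of a partition is $-\sum \mu(A)\log\mu(A)$, and the conditional entropy is the average over the cells $B$ of the entropy with respect to the conditional measure on $B$); the joint distribution is hypothetical data chosen only for illustration. It checks numerically the standard identity $H(\mathcal{A}|\mathcal{B}) = H(\mathcal{A} \vee \mathcal{B}) - H(\mathcal{B})$.

```python
import math

def H(probs):
    """Entropy -sum p log p of a finite partition, given the measures of its cells."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def H_cond(joint):
    """H(A|B) = sum over B of mu(B) * H(mu_B, A), where joint[a][b] = mu(A_a & B_b)."""
    total = 0.0
    for b in range(len(joint[0])):
        mu_b = sum(row[b] for row in joint)
        if mu_b > 0:  # cells of measure zero contribute nothing
            total += mu_b * H([row[b] / mu_b for row in joint])
    return total

# Hypothetical measures of the cells A_a & B_b of the join of two partitions.
joint = [[0.30, 0.10],
         [0.20, 0.40]]

h_join = H([p for row in joint for p in row])               # H(A v B)
h_B = H([sum(row[b] for row in joint) for b in range(2)])   # H(B)
h_A_given_B = H_cond(joint)
```

When $\mathcal{A}$ is measurable with respect to $\mathcal{B}$ (for instance `joint = [[0.5, 0.0], [0.0, 0.5]]`), `H_cond` returns zero, which is the static analogue of the condition $h(\nu|X) = 0$ defining a principal extension.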

Fact 2.4
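Let $Y$ be an array system and let $X$ be a factor of $Y$. Let $\mathcal{R}_k$ be the partition defined by cylinders of height $k$ and length 1. Then for every $\nu \in M_S(Y)$ we have $h(\nu|X) = \lim_{k \to \infty} h(\nu, \mathcal{R}_k|X)$.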
To see that it is so, it suffices to observe two facts. Firstly, that the family $\{\mathcal{R}_k\}$ together with its images under the iterates of $S$ generates the Borel $\sigma$-algebra on $Y$. Secondly, if $j < k$ then $\mathcal{R}_k$ refines $\mathcal{R}_j$.

If $(X, T)$ has finite topological entropy, then by the variational principle $\pi\nu$ has finite entropy for each $\nu$ in $M_S(Y)$, so the extension is principal if and only if $h(\nu) = h(\pi\nu)$ for each $\nu$. In particular, $(Y, S)$ has the same topological entropy as $(X, T)$.

Continuity of the Entropy Functions
In the main proof we will consider entropy as a function of the measure, and we will need several basic facts about the continuity of this function, which we state without proof. First of all:

Since $\mu(\operatorname{Int}(A)) \leq \mu(A) \leq \mu(\overline{A})$ and the three are equal if the boundary of $A$ has measure 0, we have the following:

Using the fact that the limit defining $h(\mu, \mathcal{A}|\mathcal{B})$ is also the infimum, we easily arrive at the following:

Fact 2.8 For any finite partitions
Finally: To observe that, note that h(μ, A|X) is the infimum of any sequence h(μ, A|B n ), provided that the diameter of the largest set in B n tends to 0. Since we can construct partitions into sets of arbitrarily small diameter that all have boundaries whose measure μ is 0, Fact 2.9 now follows.

The Main Result
Theorem 3.1 Every dynamical system has a zero-dimensional principal extension.
The proof of Theorem 3.1 will occupy the remainder of this section.

Preliminary Reshaping
Let $(X_0, T, S)$ be the dynamical system for which we shall be constructing the zero-dimensional principal extension. First of all, observe that without loss of generality we can assume that $T$ is invertible. Indeed, if $T$ is surjective, we can simply replace the system by its natural extension (which is principal). If $T$ is not surjective, we replace $X_0$ with a larger space $\bar{X}_0$, equipped with a metric $\bar{d}$ built from the initial metric $d$ on $X_0$, which can be seen as infinitely many copies of $X_0$ arranged in a sequence and shrinking to a single point $\infty$. The action $\bar{T}$ on $\bar{X}_0$ is defined so that $\bar{T}$ is surjective on $\bar{X}_0$ and, with the exception of the measure concentrated on the fixed point $\infty$, all $\bar{T}$-invariant measures are supported by the set $X_0 \times \{1\}$, so they are the same as the original measures on $X_0$. The system $(\bar{X}_0, \bar{T}, S)$ has a natural extension. If we now construct a principal zero-dimensional extension of $(\bar{X}_0, \bar{T}, S)$ (via the natural extension), then the system on the preimage of $X_0 \times \{1\}$ will be a principal zero-dimensional extension of $(X_0, T, S)$. Therefore, from now on we will assume $T$ to be invertible.

Now, let $G$ be an odometer to some base $(p_k)$. What we will actually construct is a zero-dimensional principal extension $(Y, S)$ of the system $X = X_0 \times G$ with the product action, also denoted by $T$. Since $X$ is a principal extension of $X_0$ (we leave the easy proof to the reader), $Y$ will be a principal extension of $X_0$ as well.
Let $I$ denote the one-dimensional torus, i.e. the interval $[0, 1]$ with the endpoints identified, and let $\lambda$ be the Lebesgue measure on $I$. Any function $f : X \to [0, 1]$ induces a partition $\mathcal{A}_f$ of $X \times I$ into two sets: the sets of points below and above the graph of $f$. For a family $\mathcal{F}$ of functions we denote by $\mathcal{A}_{\mathcal{F}}$ the partition $\bigvee_{f \in \mathcal{F}} \mathcal{A}_f$. Two useful observations are that

Constructing the Zero-Dimensional Principal Extension
Let $d$ denote the metric on $X$. Fix a sequence of positive numbers $\varepsilon_k$ decreasing to 0. Let $\{\mathcal{F}_k\}$ be a sequence of families of continuous functions from $X$ into $[0, 1]$ such that $\mathcal{F}_{k-1} \subset \mathcal{F}_k$. Let $\eta_k$ be the diameter of the largest set in $\mathcal{A}_{\mathcal{F}_k}$ (in the product metric on $X \times I$). We will require that $2\eta_{k+1} < \eta_k$ (which obviously implies that the $\eta_k$ tend to 0) and that $\eta_k$ be so small that, for a given $k$, $d(x_1, x_2) < \eta_k$ in $X$ implies that $x_1$ and $x_2$ belong to the same element of the family of clopen sets $\{T^i(X_0 \times G_k)\}$, $i \in \{0, 1, \dots, p_k - 1\}$. This ensures that every cell of $\mathcal{A}_{\mathcal{F}_k}$ is completely contained in one of the sets $T^i(X_0 \times G_k) \times I$.
Let $\pi^{(1)}$ denote the projection of $X \times I$ onto $X$. Consider the space of all formal arrays $y = (y_{k,n})$ ($k > 0$, $n \in S$) such that $y_{k,n} \in \mathcal{A}_{\mathcal{F}_k}$ (we treat each finite partition $\mathcal{A}_{\mathcal{F}_k}$ as the alphabet in row $k$). For an array $y$ define the sets $K_{k,n}(y) = \{x \in X : d(x, \pi^{(1)}(y_{k,n})) \leq \eta_k\}$. (Thus $K_{k,n}(y)$ is the $\eta_k$-neighborhood of the projection onto $X$ of the cell of $\mathcal{A}_{\mathcal{F}_k}$ appearing as a symbol in $y$ at the position $(k, n)$.) An array $y$ will be said to satisfy the column condition if for each $n$ the sequence $K_{k,n}(y)$ is descending (as $k$ increases). Since the diameter of $K_{k,n}$ tends to 0 with $k$, the column condition implies that the intersection $\bigcap_{k=1}^{\infty} K_{k,n}(y)$ is a single point in $X$, which we will denote by $x_n(y)$. Note that $x_n(y)$ is within $\eta_k$ of each set $\pi^{(1)}(y_{k,n})$, a fact that will be useful later.
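The mechanism behind the column condition can be illustrated in a toy setting. In the sketch below, which is only an analogy and not the construction of the paper, the space is the real line, the cells of level $k$ are the triadic intervals of length $3^{-k}$ (so that $\eta_k = 3^{-k}$ satisfies $2\eta_{k+1} < \eta_k$), and the sets $K_k$ are the closed $\eta_k$-neighborhoods of the cells containing a fixed point. All names are hypothetical. Exact rational arithmetic avoids rounding issues at the cell boundaries.

```python
from fractions import Fraction

def cell(x, k):
    """Triadic interval of level k containing x (a stand-in for the projected cell)."""
    eta = Fraction(1, 3 ** k)
    lo = (x // eta) * eta  # floor division of Fractions returns an integer
    return lo, lo + eta

def K(x, k):
    """Closed eta_k-neighborhood of the level-k cell, with eta_k = 3**(-k)."""
    eta = Fraction(1, 3 ** k)
    lo, hi = cell(x, k)
    return lo - eta, hi + eta

x = Fraction(5, 7)
sets = [K(x, k) for k in range(1, 12)]

# The neighborhoods are descending, all contain x, and their widths shrink to 0,
# so their intersection over all k is the single point x.
nested = all(a2 >= a1 and b2 <= b1 for (a1, b1), (a2, b2) in zip(sets, sets[1:]))
contains_x = all(a <= x <= b for a, b in sets)
widths = [b - a for a, b in sets]
```

The widths equal $3\eta_k$, so a descending column determines exactly one point, just as the column condition determines $x_n(y)$.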
Let $Y_c$ be the space of all arrays $y$ satisfying the column condition with the additional requirement that $x_{n+1}(y) = T x_n(y)$. Our choice of $\eta_k$ ensures that the elements of $Y_c$ have the following two properties:
• In row $k$ the symbols representing subsets of $X_0 \times G_k$ occur every $p_k$ positions.
• If $y_{k,n}$ is a symbol representing a subset of $X_0 \times G_k$, then $y_{k-1,n}$ represents a subset of $X_0 \times G_{k-1}$.

It is easy to see that $Y_c$ is a marked array system (the markers in row $k$ are the symbols representing subsets of $X_0 \times G_k$) and that with the action of the horizontal shift $S$ it forms a continuous extension of $(X, T)$, where the factor map is $\pi_X(y) = x_0(y)$ (by which we mean $x_n(y)$ for $n = 0$). We will construct $Y$ as a subsystem of $Y_c$ which will in some sense be the limit of an auxiliary sequence of (disjoint) mutually conjugate subsystems $Y_k$. We will define $Y_k$ inductively, by constructing the maps $\Phi_k : Y_{k-1} \to Y_k$ (these maps will in fact be block codes defined on rectangles of some order). The main goal will be to ensure that for any $k$ and all $j > k$ the set $M_S(Y_j)$ is contained within an open set $U_k \subset M_S(Y_c)$, where the sequence $\{U_k\}$ (which we will also define inductively) satisfies the following properties:

U2. For any $k > 0$ and any measure $\nu \in U_k$ we have $h_\nu(\mathcal{R}_k|X) < \varepsilon_k$, where $\mathcal{R}_k$ is the partition defined by cylinders of height $k$ and length 1.
To begin with, let $Y_0$ be the closure of the set of array-names of points in $X \times I$ under the action of $T \times \mathrm{id}$ with respect to the partitions $\mathcal{A}_{\mathcal{F}_k}$. In other words, $Y_0$ is the closure of the set of all points $y \in Y_c$ such that for some pair $(x, t) \in X \times I$ and for any $k$ and $n$ we have $(T^n x, t) \in y_{k,n}$. By a standard argument, $Y_0$ is an extension of $X \times I$ (we will denote the corresponding map by $\pi_0$) as well as of $X$ itself, and the following diagram commutes:

(1) [Commutative diagram: $Y_0 \xrightarrow{\pi_0} X \times I \xrightarrow{\pi^{(1)}} X$, with $\pi_X : Y_0 \to X$ as the composition.]

Let the set $U_0$ be all of $M_S(Y_c)$ (all our requirements on the properties of $U_k$ only apply to the case $k > 0$).
There are two important observations to be made here. Firstly, the only points in $X \times I$ that have multiple preimages under $\pi_0$ are the ones whose orbits enter the graph of a function from some $\mathcal{F}_k$ (we are using the fact that the graphs of continuous functions are closed). The product measure $\mu \times \lambda$ of the graph of any function is 0; therefore, if $\nu$ is a measure on $Y_0$ that factors onto a measure $\mu \times \lambda$ on $X \times I$, then the set of points in $X \times I$ with multiple preimages by $\pi_0$ has measure 0, and thus the measure-theoretic systems $(Y_0, S, \nu)$ and $(X \times I, T \times \mathrm{id}, \mu \times \lambda)$ are isomorphic. Secondly, any $k$-rectangle in $Y_0$ is associated with a unique cell of $\mathcal{A}^{p_k}_{\mathcal{F}_k}$, the closure of which is the image (by $\pi_0$) of this rectangle. Such cells (which are exactly the cells of $\mathcal{A}^{p_k}_{\mathcal{F}_k}$ that are contained in $(X_0 \times G_k) \times I$) will be called fundamental $k$-cells.
We will now proceed to create the systems $Y_k$, requiring them to have the following properties:

(Y1) $M_S(Y_k) \subset U_k$.
(Y2) $Y_k = \Phi_k(Y_{k-1})$, where $\Phi_k$ is a conjugacy, and there exists an increasing sequence $j_k$ such that $\Phi_k$ preserves rows from $j_k$ onwards.
Observe that the property (Y2) ensures that the diagram obtained from (1) by replacing $Y_0$ with $Y_k$ commutes and that for $j \geq j_k$ we still have the one-to-one correspondence between the $j$-rectangles of $Y_k$ and the cells of $\mathcal{A}^{p_j}_{\mathcal{F}_j}$, since this correspondence depends only on the contents of row $j$. Throughout, $\pi_k$ will denote the factor map of $Y_k$ onto $X \times I$ defined by composing the factorization $\pi_0$ of $Y_0$ with the conjugacy between $Y_0$ and $Y_k$.

Now, suppose we have defined the system $Y_{k-1}$ and the set $U_{k-1}$. Our task is to create the set $U_k$ satisfying the requirements (U1)-(U2) and a system $Y_k$ such that $M_S(Y_k) \subset U_k$. Let $M_{k-1}$ be the set of all measures on $Y_{k-1}$ that factor by $\pi_{k-1}$ onto measures of the form $\mu \times \lambda$ on $X \times I$. As stated above, if $\nu \in M_{k-1}$ and $\nu$ factors onto $\mu \times \lambda$, then $(Y_{k-1}, \nu)$ and $(X \times I, \mu \times \lambda)$ are measure-theoretically isomorphic. It follows that for any $\nu$ in $M_{k-1}$ we have $h(\nu|X) = h(\mu \times \lambda|X) = 0$, so in particular $h_\nu(\mathcal{R}_k|X) = 0$. As we have noted earlier, $h_\nu(\mathcal{R}_k|X)$ is upper semicontinuous at $\nu$, provided $\nu(\partial R) = 0$ for every $R \in \mathcal{R}_k$. This is the case for any $\nu$, since cylinders have no boundaries at all. Therefore every measure in $M_{k-1}$ has a neighborhood where $h_\nu(\mathcal{R}_k|X) < \varepsilon_k$. The set $M_{k-1}$ is compact, so by choosing a finite number of such neighborhoods covering $M_{k-1}$ we can simply assume that there exists some neighborhood $V_k$ of $M_{k-1}$ such that for any measure $\nu \in V_k$ we have $h_\nu(\mathcal{R}_k|X) < \varepsilon_k$.
It is clear that the set $U_k = U_{k-1} \cap V_k$ has the properties (U1)-(U2). We must now construct the system $Y_k$ whose invariant measures are all in $U_k$, also making sure to satisfy the requirement (Y2). In other words, we must ensure that every invariant measure on $Y_k$ is close (in the space $M_S(Y_c)$) to some measure on $Y_{k-1}$ that factors onto a product measure on $X \times I$. To this end we will employ the following lemma:

Lemma 3.2 For any measure $\mu$ on $X$ and any neighborhood $U$ of $\mu \times \lambda$ in $M(X \times I)$ there exist a neighborhood $U_\mu$ of $\mu$ in $M(X)$, an irrational rotation $R_\mu$ of the one-dimensional torus and a number $N_\mu$ such that for any $(x, t) \in X \times I$ and any $n > N_\mu$ the condition $M_n(\delta_x) \in U_\mu$ implies that $M_n(\delta_{(x,t)}) \in U$, where the averaging in the product is with respect to the map $T \times R_\mu$.
Proof First note that for any measure $\mu$ in $M_T(X)$ there exists an irrational rotation $R_\mu$ disjoint from $\mu$ (i.e. the only $(T \times R_\mu)$-invariant measure on $X \times I$ with marginals $\mu$ and $\lambda$ is $\mu \times \lambda$). Indeed, if $\mu$ is an ergodic measure and $e^{2\pi i \alpha}$ is not its eigenvalue, then the rotation of the circle by $\alpha$ is disjoint from $\mu$. Since an ergodic measure has at most countably many eigenvalues, for any ergodic $\mu$ there exist at most countably many rotations that are not disjoint from $\mu$. If $\mu$ is not ergodic, denote its ergodic decomposition by $\xi$ ($\xi$ is a measure on the set $M^e_T(X)$ of ergodic measures on $X$) and consider the product $M^e_T(X) \times I$ with the measure $\xi \times \lambda$. The set $\{(\nu, \alpha) : e^{2\pi i \alpha} \text{ is an eigenvalue of } \nu\}$ is a measurable subset of the product and has measure 0 (because all its vertical sections are countable), so almost every horizontal section of this set has measure 0. Therefore there exists an $\alpha$ such that the measures for which $e^{2\pi i \alpha}$ is an eigenvalue have zero mass in the ergodic decomposition of $\mu$. Setting $R_\mu$ to be the rotation by $\alpha$, we obtain a rotation disjoint from $\mu$.
Suppose the statement of the lemma is not true. Then there exists a sequence of measures $M_n(\delta_{(x_n,t_n)})$ such that the $M_n(\delta_{x_n})$ converge to $\mu$, yet the $M_n(\delta_{(x_n,t_n)})$ all lie outside $U$ (remember that the averaging in $X \times I$ is with respect to $T \times R_\mu$). Choose the limit $\nu$ of some subsequence of $M_n(\delta_{(x_n,t_n)})$. It is a $(T \times R_\mu)$-invariant measure which is outside $U$ and whose marginals are $\mu$ (being the limit of $M_n(\delta_{x_n})$) and $\lambda$ (being the only $R_\mu$-invariant measure on $I$). But the only $(T \times R_\mu)$-invariant measure with marginals $\mu$ and $\lambda$ is $\mu \times \lambda$, which is in $U$, a contradiction.
As the open set $U_k$ contains the closed set $M_{k-1}$, there exists some number $\varepsilon$ such that the $\varepsilon$-neighborhood of $M_{k-1}$ is contained in $U_k$. From Fact 2.3 we obtain numbers $\delta$ and $j$ such that if two measures differ by no more than $2\delta$ on all $j$-rectangles, then the distance between them is less than $\varepsilon$. We can assume that $j > j_{k-1}$ (increasing $j$ if necessary). For any measure $\mu$ on $X$ the partition $\mathcal{A}^{p_j}_{\mathcal{F}_j}$ has boundaries of measure $\mu \times \lambda$ equal to 0. Therefore there exists a neighborhood $U$ of $\mu \times \lambda$ in $M(X \times I)$ such that if $\nu \in U$ then $|\nu(A) - (\mu \times \lambda)(A)| < \delta$ for every $A \in \mathcal{A}^{p_j}_{\mathcal{F}_j}$. Applying Lemma 3.2 to $U$, we obtain for every $\mu \in M_T(X)$ an open set $U_\mu$ around $\mu$ in $M(X)$. Out of these we select a finite family $W$ of measures such that the union of $U_\mu$ for $\mu \in W$ covers $M_T(X)$ in $M(X)$. The union of this cover is an open set in $M(X)$. There exists a $j_k > j$ for which $p_{j_k}$ is so large that every measure of the form $M_{p_{j_k}}(\delta_x)$ is in $U_\mu$ for some $\mu \in W$ (we are using Fact 2.2). We can also assume that $p_{j_k}$ is larger than the number $N_\mu$ of Lemma 3.2 for all $\mu \in W$.
Now, let $C$ be any $j_k$-rectangle in $Y_{k-1}$. Choose a pair $(x_C, t_C)$ from the fundamental $j_k$-cell whose closure is $\pi_{k-1}(C)$, and a measure $\mu_C \in W$ such that $M_{p_{j_k}}(\delta_{x_C})$ belongs to $U_{\mu_C}$. Recall that the fundamental $j$-cells partition the set $(X_0 \times G_j) \times I$, which is invariant under the map $(T \times R_{\mu_C})^{p_j}$. Thus $(x_C, t_C)$ has a name under the action of $(T \times R_{\mu_C})^{p_j}$ on $(X_0 \times G_j) \times I$ with respect to the partition into the fundamental $j$-cells. Take the initial block of length $q = p_{j_k}/p_j$ of this name. It is an ordered list of $q$ fundamental $j$-cells, each associated with a unique $j$-rectangle in $Y_{k-1}$, so we have a sequence of $j$-rectangles $D_1, \dots, D_q$. Observe that the number of times at which a $j$-rectangle $D$ occurs in this list equals the number of times at which $(x_C, t_C)$ visits $\pi_{k-1}(D)$ under the action of $(T \times R_{\mu_C})^{p_j}$. This number equals $p_{j_k} M_{p_{j_k}}(\delta_{(x_C,t_C)})(\pi_{k-1}(D))$. By Lemma 3.2,

$$\bigl| M_{p_{j_k}}(\delta_{(x_C,t_C)})(\pi_{k-1}(D)) - (\mu_C \times \lambda)(\pi_{k-1}(D)) \bigr| < \delta,$$

therefore the number of occurrences of $D$ equals $p_{j_k}(\mu_C \times \lambda)(\pi_{k-1}(D)) \pm p_{j_k}\delta$, a fact that we will use later.

We now define the image of $C$ under $\Phi_k$ as follows: in rows 1 through $j$ it has the ordered list of $j$-rectangles $D_1, \dots, D_q$ described above, and in rows $j + 1$ through $j_k$ it retains the contents of $C$. Observe that $\Phi_k(C)$ satisfies the column condition (the column condition concerns pairs of symbols in one column, so it makes sense for rectangles as well as arrays).
To verify this, we only need to check that every set corresponding to a symbol in row $j + 1$ of $\Phi_k(C)$ is contained within $\eta_j$ of the corresponding set above it in row $j$; any other pair of symbols (one above another) appears already in $Y_{k-1}$, so it satisfies the column condition by the inductive assumption. For each $n < p_{j_k}$, the symbols $\Phi_k(C)_{j,n}$ and $\Phi_k(C)_{j+1,n}$, treated as sets in the product space $X \times I$, contain the image of $(x_C, t_C)$ under the composition of $n$ transformations, each being a product of $T$ with some rotation or the identity. Thus, their projections onto $X$ both contain $T^n(x_C)$. The set $\pi^{(1)}(\Phi_k(C)_{j+1,n})$ has diameter smaller than $\eta_{j+1}$, therefore any point from its $\eta_{j+1}$-neighborhood must be within $2\eta_{j+1}$ of $T^n(x_C)$. Since $2\eta_{j+1} < \eta_j$, this means that the $\eta_{j+1}$-neighborhood of $\pi^{(1)}(\Phi_k(C)_{j+1,n})$ is entirely contained in the $\eta_j$-neighborhood of $\pi^{(1)}(\Phi_k(C)_{j,n})$, which is precisely the column condition.
The idea of the construction of the code $\Phi_k$ is shown in Fig. 2. The bases of the three large rectangles are the sets $X_0 \times G_j, T(X_0 \times G_j), \dots, T^{p_j - 1}(X_0 \times G_j)$. For simplicity, we imagine the transformation $T$ as the rigid translation between these sets, except on the last one, which is mapped somehow to $X_0 \times G_j$. The large rectangles are Cartesian products with $I$. The family $\mathcal{F}_j$ consists, in this example, of the characteristic functions of $T^i(X_0 \times G_j)$ ($i = 0, 1, 2$) and of two more functions. The partition $\mathcal{A}_{\mathcal{F}_j}$ of $X \times I$ is labeled $\{0, 1, \dots, 9\}$. The resulting fundamental $j$-cells are labeled 047, 048, ..., 359 (these are our $j$-rectangles). The fundamental $j_k$-cell in $X \times I$ corresponding to the selected $j_k$-rectangle $C$ is shown in gray (with pieces of the enclosing functions from $\mathcal{F}^{p_{j_k}}_{j_k}$) and the point $(x_C, t_C)$ is inside. The $j$th row of $C$ is obtained by reading the labels of the fundamental $j$-cells along the trajectory of $(x_C, t_C)$ for $q$ iterates of $(T \times \mathrm{id})^{p_j}$ (the black dots). In this example it starts with |057|257|258|258|057| ... . The code $\Phi_k$ changes this row (and the ones above) by following $(x_C, t_C)$ under the action of $(T \times R_{\mu_C})^{p_j}$ (the gray dots). In this example the $j$th row of $\Phi_k(C)$ begins with |057|357|359|258|157| ... . Notice that the projection of the $n$th symbol in both names contains the point $T^n(x_C)$.
For any point $y \in Y_{k-1}$ define its image, $\Phi_k(y)$, by replacing its every $j_k$-rectangle $C$ with $\Phi_k(C)$. The column condition and the fact that $\Phi_k$ preserves rows from $j$ onwards ensure that $\Phi_k(y)$ is still in $Y_c$ and that the condition (Y2) is satisfied.
We will now verify the condition (Y1), namely that all invariant measures on $Y_k$ are in $U_k$. Let $\nu$ be such a measure. By Fact 2.1, for any $n$ the closure of the convex hull of the measures $\{M_n(\delta_y) : y \in Y_k\}$ contains $\nu$. In other words, there exist points $y_1, \dots, y_N \in Y_k$ and coefficients $\alpha_1, \dots, \alpha_N \in (0, 1)$ such that $\sum_{i=1}^{N} \alpha_i = 1$ and

$$\Bigl| \nu(D) - \sum_{i=1}^{N} \alpha_i M_n(\delta_{y_i})(D) \Bigr| < \delta \qquad (3.1)$$

for all $j$-rectangles $D$.
For any $y$ and any set $D$ the numbers $M_n(\delta_y)(D)$ and $M_n(\delta_{Sy})(D)$ differ by at most $\frac{1}{n}$ (thus they are arbitrarily close for large $n$) and this proximity is preserved by convex combinations. Therefore, if we choose $n$ large enough, we can simply assume (replacing $y_i$ with its image by at most $p_{j_k}$ applications of $S$) that $\pi_X(y_i) \in X_0 \times G_{j_k}$ for all $i$, i.e. that each $y_i$ has a $j_k$-rectangle starting at coordinate 0. Furthermore, we shall pick an $n$ that is a multiple of $p_{j_k}$. For a given $i \in \{1, \dots, N\}$ and for any $j$-rectangle $D$ the number $M_n(\delta_{y_i})(D)$ is $\frac{1}{n}$ times the number of occurrences of $D$ in the block $B$ consisting of the first $n$ coordinates of $y_i$ in rows 1 through $j$. This block $B$ forms the first $j$ rows of a concatenation of $j_k$-rectangles $\Phi_k(C_1), \dots, \Phi_k(C_Q)$, where $C_1, \dots, C_Q$ are $j_k$-rectangles in $Y_{k-1}$. We know that the number of occurrences of $D$ in any $\Phi_k(C)$ is $p_{j_k}(\mu_C \times \lambda)(\pi_{k-1}(D)) \pm p_{j_k}\delta$. Therefore the total number of occurrences of $D$ in $B$ is

$$N_B(D) = \sum_{q=1}^{Q} \bigl( p_{j_k}(\mu_{C_q} \times \lambda)(\pi_{k-1}(D)) \pm p_{j_k}\delta \bigr).$$

Therefore, if we set $\mu_i = \frac{1}{Q} \sum_{q=1}^{Q} \mu_{C_q}$ (note that obviously the $C_q$ depend on $i$, but $Q$ does not), we can write $N_B(D) = Q p_{j_k} (\mu_i \times \lambda)(\pi_{k-1}(D)) \pm Q p_{j_k} \delta$.
But $Q p_{j_k} = n$, since it is simply the length of $B$. So, the above statement is equivalent to $\bigl| N_B(D) - n(\mu_i \times \lambda)(\pi_{k-1}(D)) \bigr| < n\delta$.
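Dividing this by $n$ we see that

$$\bigl| M_n(\delta_{y_i})(D) - (\mu_i \times \lambda)(\pi_{k-1}(D)) \bigr| < \delta.$$

Let $\mu = \sum_{i=1}^{N} \alpha_i \mu_i$ and let $\nu'$ be the (unique) measure in $M_{k-1}$ that factors onto $\mu \times \lambda$ on $X \times I$. As remarked above, since $j > j_{k-1}$, any point $y \in Y_{k-1}$ belongs to the $j$-rectangle $D$ if and only if $\pi_{k-1}(y)$ belongs to the set $\pi_{k-1}(D)$, which is the closure of a fundamental $j$-cell. But this means that $\nu'(D) = (\mu \times \lambda)(\pi_{k-1}(D))$, and thus

$$\nu'(D) = (\mu \times \lambda)(\pi_{k-1}(D)) = \sum_{i=1}^{N} \alpha_i (\mu_i \times \lambda)(\pi_{k-1}(D)).$$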
Combining this with Eq. (3.1), we conclude that $|\nu(D) - \nu'(D)| < 2\delta$ for all $j$-rectangles $D$. By our choice of $j$ and $\delta$, $d^*(\nu, \nu') < \varepsilon$, which means that $\nu \in U_k$, as requested in (Y1).

Now, having obtained the sequence of the $Y_k$'s, we define $Y$ as the set of all points $y$ such that $y = \lim_k y_k$ with $y_k \in Y_k$. It is easy to see that this is a closed subsystem of $Y_c$ and an extension of $X$ via the map $\pi_X$. The important observation is that any invariant measure $\nu$ on $Y$ is in every $U_k$. This follows from the same argument as the one used to show that the invariant measures on $Y_k$ are in $U_k$, since this argument depended only on the properties of $j_k$-rectangles and $Y$ contains no $j_k$-rectangles that did not occur in $Y_k$.

To show that $Y$ is a principal extension of $X$ we need to show that the conditional entropy of $Y$ with respect to $X$ is 0 for every measure $\nu \in M_S(Y)$. For any $k > 0$ and for any $k' > k$ we have $h_\nu(\mathcal{R}_k|X) \leq h_\nu(\mathcal{R}_{k'}|X)$, since $\mathcal{R}_{k'}$ refines $\mathcal{R}_k$. On the other hand, since $\nu$ is in the set $U_{k'}$, we know that $h_\nu(\mathcal{R}_{k'}|X) < \varepsilon_{k'}$. It follows that for any $k' > k$ we have $h_\nu(\mathcal{R}_k|X) < \varepsilon_{k'}$, and thus $h_\nu(\mathcal{R}_k|X) = 0$. Thus we conclude that $h_\nu(Y|X) = 0$.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.