Average frequencies of digits in infinite IFS’s and applications to continued fractions and Lüroth expansions

The detailed investigation of the distribution of frequencies of digits of points belonging to attractors K of infinite iterated function systems (IIFS’s) is a fundamental and important problem in the study of attractors of IIFS’s. This paper studies the Baire category of different families of sets of points belonging to attractors of IIFS’s characterised by the behaviour of the frequencies of their digits. All our results are of the following form: a typical (in the sense of Baire) point x ∈ K has the following property: the average frequencies of digits of x have maximal oscillation. We consider general types of average frequencies, namely, average frequencies associated with general averaging systems. These averages include, for example, all higher order Hölder and Cesaro averages, and Riesz averages. Surprisingly, for all averaging systems (regardless of how powerful they are) we prove that a typical (in the sense of Baire) point x ∈ K has the following property: the average frequencies of digits of x have maximal oscillation. This substantially extends previous results and provides a powerful topological manifestation of the fact that “points of divergence” are highly visible. Several applications are given, e.g. to continued fraction digits and Lüroth expansion digits.


Introduction
Self-similar sets are arguably the most important examples of fractal sets. This is due to their simple description and the fact that they exhibit many of the properties one expects from fractals. Self-similar sets also form a natural and fundamental framework for studying frequencies of digits associated with a variety of classical and important digit expansions from number theory. Indeed, a unique feature of self-similar sets is that points belonging to a self-similar set can be assigned a sequence of digits (the precise definitions will be given in Sect. 2). For example, for judiciously chosen self-similar sets the assigned digits of a real number x in the unit interval [0, 1] coincide with the usual N-ary digits of x or the continued fraction expansion digits of x, and the assigned digits therefore form a natural generalisation of the classical notions of digits associated with traditional number theoretical expansions.
The detailed investigation of the distribution of digits is a fundamental and important problem in the study of self-similar sets. For example, detailed knowledge of the distribution of digits is crucial for our understanding of dimensions of self-similar fractals. Indeed, lower bounds for the Hausdorff and packing dimensions of self-similar sets are invariably obtained by using the ergodic theorem to examine the limiting distribution of almost all frequencies of digits (more precisely, to find the local dimension of judiciously constructed ergodic measures); these results are then translated into lower bounds for the fractal dimension using the "distribution of mass" principle. The detailed study of the distribution of digits also links to fundamental questions in metric number theory. In this direction, the detailed study of almost sure frequencies of digits of different expansions is a fundamental area in metric number theory with deep connections to, for example, Diophantine approximations, see [14] and [5]. Divergence points, i.e. points for which the frequencies of digits, or more general ergodic averages, do not converge, have also been studied. Until very recently, divergence points were considered of little interest in dynamical systems and geometric measure theory, and according to folklore, sets of divergence points carried no essential information about the underlying structure. However, recently Barreira & Schmeling [3] have shown that many natural and important sets of divergence points associated with self-similar constructions are highly "visible" or "observable", namely, they have full Hausdorff dimension. Sets of divergence points have been investigated further in, for example, [26], showing that these sets have a surprisingly rich and intricate fractal structure.
A common feature in most studies of convergence and divergence properties of frequencies of digits (and, in particular, in the work cited above), is that they focus exclusively on the frequencies of digits of almost all points chosen with respect to judiciously constructed ergodic measures.
However, there is a further fundamental notion of "generic", namely, the notion of typical defined using Baire category. Recall that if X is a metric space and P is a property that the elements of X may have, then we say that a typical element x in X has property P if the set E = {x ∈ X | x does not have property P} is of the first Baire category (also sometimes called meagre), see Oxtoby [27] for more details. While properties of distributions of digits of almost all points have been investigated intensively, the study of distributions of digits of typical points is sporadic and focuses on specific and concrete examples. For example, some basic questions of the distribution of the usual N-ary digits of typical numbers in the unit interval have been investigated (see, for example, [1,12,25,31] and the references therein), and some basic questions of the distribution of digits associated with other types of expansions, including the continued fraction expansion and the Lüroth expansion, of typical numbers in the unit interval have been investigated (see, for example, [2,17–19,29,30] and the references therein). However, somewhat surprisingly, a general and comprehensive study of the distributions of digits of typical points seems to be missing, and the main purpose of this paper is to provide a detailed and comprehensive investigation of this problem in a general and abstract setting. More precisely, we focus on points belonging to self-similar sets generated by general Infinite Iterated Function Systems using the powerful (and classical) notion of Averaging Systems. We will now explain this in more detail.
Infinite Iterated Function Systems. Loosely speaking, self-similar sets come in two varieties, namely, those associated with a (finite) Iterated Function System (IFS), and those associated with an Infinite Iterated Function System (IIFS). The theory of IFS's goes back to Moran's paper from 1946 [22] and found its present elegant and mature formulation in Hutchinson's seminal paper [11] from 1981. The more general notion of an IIFS is more recent and was introduced and pioneered by Mauldin & Urbański in the late 1990's [20,21]. For a positive integer N, the N-ary digits of a real number coincide with the digits associated with a particular (finite) IFS. However, the digits of other important expansions, including, for example, the continued fraction expansion and the Lüroth expansion, do not coincide with the digits associated with any (finite) IFS. Instead, the continued fraction digits and the Lüroth expansion digits coincide with the digits associated with two judiciously constructed IIFS's. For this reason, we investigate digits of points belonging to arbitrary IIFS's satisfying the so-called Strong Open Set Condition (and 2 further very mild technical assumptions). We emphasize that we do not assume that the IIFS's we study consist of conformal maps. In fact, the maps may be highly non-conformal. We also stress that this setup includes a large number of classical expansions, including, for example, the continued fraction expansion and the Lüroth expansion. Indeed, as an application of our main results we provide a detailed study of the frequencies of the continued fraction expansion digits and the Lüroth expansion digits of typical real numbers; this is done in Sects. 5.1 and 5.2, respectively. Short studies of the distribution of digits of typical points belonging to self-similar sets associated with IFS's and IIFS's are present in [2,17,18].
However, while the results in [2,17,18] are formulated in the setting of IFS's or IIFS's, the main focus in [2,17,18] is on N -ary expansion of real numbers and on the continued fraction expansions of irrational numbers.
Averaging systems. From a Baire category viewpoint, frequencies often do not exist and, in fact, diverge in the most dramatic way possible. To explore this further we analyse very general types of averages of frequencies using the classical notion of an Averaging System. This notion is very general and includes many classical averages, including, for example, all higher order Hölder averages and all higher order Cesaro averages.
Surprisingly, for all averaging systems (regardless of how powerful they are) we prove that a typical (in the sense of Baire) point x has the following property: the average frequencies of digits of x have maximal oscillation. For example, it follows from this that a typical (in the sense of Baire) point x has the following property: all higher order Hölder averages and all higher order Cesaro averages of digits of x have maximal oscillation. This substantially extends previous results and provides a powerful topological manifestation of the fact that "points of divergence" are highly visible.
The structure of the paper is the following. We state the main results in Sect. 2. As an application of our main results we provide a detailed study of higher order Hölder averages and higher order Cesaro averages of frequencies of digits of typical points in Sect. 3, and as a further application, we study the logarithmic Riesz averages of frequencies of digits of typical points in Sect. 4. Finally, specialising further, frequencies of the continued fraction expansion digits and the Lüroth expansion digits of typical real numbers are investigated in Sects. 5.1 and 5.2, respectively. The proofs are presented in Sects. 5-8.

Infinite iterated function systems
In this section we define the notion of an Infinite Iterated Function System and the associated self-similar set.

Definition Infinite Iterated Function System An Infinite Iterated Function System (IIFS) is a list $(X, (S_i)_{i\in I})$, where $I$ is a set with $1 < |I| \le \aleph_0$ (here $|I|$ denotes the cardinality of $I$) and (1) $X$ is a compact subset of $\mathbb{R}^d$ with $\overline{X^{\circ}} = X$; (2) for each $i \in I$, the map $S_i : X \to X$ is an injective contraction and there is a constant $0 < s < 1$ such that $|S_i(x) - S_i(y)| \le s|x - y|$ for all $i \in I$ and all $x, y \in X$. If the set $I$ is finite, then the list $(X, (S_i)_{i\in I})$ is called an Iterated Function System (IFS).

Next, we define the self-similar set (also sometimes called the limit set) associated with an IIFS $(X, (S_i)_{i\in I})$. We first introduce the following notation. For a positive integer $n$, let
$$I^n = \{\, i_1 \ldots i_n \mid i_j \in I \text{ for all } j \,\},$$
i.e. $I^n$ is the family of all strings $\mathbf{i} = i_1 \ldots i_n$ of length $n$ with entries $i_j \in I$. Also, let $I^* = \bigcup_n I^n$, i.e. $I^*$ is the family of all finite strings $\mathbf{i} = i_1 \ldots i_n$ with entries $i_j \in I$, and let $I^{\mathbb{N}}$ be the family of all infinite strings $\mathbf{i} = i_1 i_2 \ldots$ with entries $i_j \in I$; here and below $\mathbb{N}$ denotes the set of all strictly positive integers, i.e. $\mathbb{N} = \{1, 2, \ldots\}$. For $\mathbf{i} = i_1 i_2 \ldots \in I^{\mathbb{N}}$ and $n \in \mathbb{N}$, write $\mathbf{i}|n = i_1 \ldots i_n$ for the truncation of $\mathbf{i}$ at the $n$'th place. Finally, for $\mathbf{i} = i_1 \ldots i_n \in I^*$, we write
$$S_{\mathbf{i}} = S_{i_1} \circ \cdots \circ S_{i_n}.$$
Since clearly $\operatorname{diam}(S_{\mathbf{i}|n}(X)) \le s^n \operatorname{diam}(X)$ for $\mathbf{i} \in I^{\mathbb{N}}$, it follows that $(S_{\mathbf{i}|n}(X))_n$ is a decreasing sequence of non-empty compact sets whose diameters tend to 0, and the intersection $\bigcap_n S_{\mathbf{i}|n}(X)$ is therefore a singleton. In particular, we can define the map $\pi : I^{\mathbb{N}} \to X$ by
$$\{\pi(\mathbf{i})\} = \bigcap_n S_{\mathbf{i}|n}(X). \qquad (2.1)$$
The self-similar set $K$ associated with the IIFS $(X, (S_i)_{i\in I})$ is now defined by
$$K = \pi(I^{\mathbb{N}}), \qquad (2.2)$$
see, for example, [20,21]. It is not difficult to see that the limit set $K$ satisfies the following "self-similar" identity
$$K = \bigcup_{i \in I} S_i(K). \qquad (2.3)$$
If the set $I$ is finite, then it is well-known (see, for example [11] or the textbooks [7,8]) that $K$ is the unique non-empty and compact subset of $\mathbb{R}^d$ that satisfies the "self-similar" identity (2.3); however, if $I$ is infinite, then the set $K$ may not be compact.
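The coding map π can be approximated by composing finitely many of the maps. The following is a minimal sketch, assuming the finite N-ary IFS S_i(x) = (x + i)/N on X = [0, 1] (the helper names `make_nary_ifs` and `coding_point` are hypothetical and not from the paper); since diam(S_{i|n}(X)) ≤ s^n diam(X), evaluating S_{i|n} at any base point of X approximates π(i) to within N^{-n}.

```python
from fractions import Fraction

def make_nary_ifs(N):
    """Return the maps S_0, ..., S_{N-1} with S_i(x) = (x + i) / N on X = [0, 1]."""
    return [lambda x, i=i: (x + i) / N for i in range(N)]

def coding_point(maps, digits, x0=Fraction(0)):
    """Evaluate S_{i_1} o ... o S_{i_n} at x0 for the finite digit string `digits`."""
    x = x0
    # The innermost map corresponds to the last digit, so apply digits in reverse.
    for d in reversed(digits):
        x = maps[d](x)
    return x

binary_maps = make_nary_ifs(2)
approx = coding_point(binary_maps, [1, 0, 1, 1])   # binary string 0.1011
print(approx)
```

With exact rational arithmetic the value for the string 1011 is 11/16, which is the number whose binary expansion begins 0.1011.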
We will often assume that the IIFS (X , (S i ) i∈I ) satisfies the following separation condition; loosely speaking, this condition says that overlaps S i (K ) ∩ S j (K ) are small for i ≠ j.
The reader is referred to Falconer's excellent textbook [8] for a discussion of the dimension theory of self-similar sets associated with IFS's and to Mauldin & Urbański's text [21] for a detailed discussion of the dimension theory of self-similar sets associated with IIFS's.

Frequencies of digits
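If $x \in K$ and $x = \pi(\mathbf{i})$ where $\mathbf{i} = i_1 i_2 \ldots \in I^{\mathbb{N}}$, then we refer to the $i_j$'s as digits of $x$; we note that a point $x \in K$ may have several strings of digits and a string of digits for $x$ is therefore not, in general, unique. To investigate the divergence and convergence properties of the frequencies of digits of $x$ we introduce the following notation. Let $\mathbf{i} = i_1 i_2 \ldots \in I^{\mathbb{N}}$ be an infinite string. For a positive integer $n$ and a finite string $\mathbf{l} = l_1 \ldots l_k \in I^*$, we write
$$\pi_n(\mathbf{i}; \mathbf{l}) = \frac{\bigl|\{\, 1 \le j \le n \mid i_j \ldots i_{j+k-1} = \mathbf{l} \,\}\bigr|}{n} \qquad (2.4)$$
for the frequency of the string $\mathbf{l}$ among the first $n$ digits of the string $\mathbf{i}$. We also let
$$\boldsymbol{\pi}(\mathbf{i}; \mathbf{l}) = \bigl(\pi_n(\mathbf{i}; \mathbf{l})\bigr)_n \qquad (2.5)$$
denote the sequence of frequencies $\pi_n(\mathbf{i}; \mathbf{l})$. Finally, we define the lower and upper frequencies of the string $\mathbf{l}$ among the digits of the string $\mathbf{i}$ by
$$\underline{\pi}(\mathbf{i}; \mathbf{l}) = \liminf_n \pi_n(\mathbf{i}; \mathbf{l}) \quad\text{and}\quad \overline{\pi}(\mathbf{i}; \mathbf{l}) = \limsup_n \pi_n(\mathbf{i}; \mathbf{l}).$$
In addition, write
$$\boldsymbol{\pi}_{k,n}(\mathbf{i}) = \bigl(\pi_n(\mathbf{i}; \mathbf{l})\bigr)_{\mathbf{l} \in I^k}, \qquad (2.6)$$
i.e. $\boldsymbol{\pi}_{k,n}(\mathbf{i})$ denotes the vector of frequencies $\pi_n(\mathbf{i}; \mathbf{l})$ of all strings $\mathbf{l}$ of length equal to $k$.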
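The frequency vector of all strings of length k can be computed directly from a finite prefix of the digit string. The sketch below assumes that the frequency counts the starting positions j = 1, ..., n at which the string occurs and divides by n (this normalisation is an assumption, chosen so that the frequencies of all strings of length k sum to 1); the function name `block_frequencies` is hypothetical.

```python
from collections import Counter

def block_frequencies(digits, k, n):
    """Frequencies of the length-k strings that start at positions 1..n of `digits`."""
    if len(digits) < n + k - 1:
        raise ValueError("need at least n + k - 1 digits")
    # Window j (0-based) covers digits[j], ..., digits[j + k - 1].
    counts = Counter(tuple(digits[j:j + k]) for j in range(n))
    # Strings that never occur have frequency 0 and are simply omitted.
    return {block: c / n for block, c in counts.items()}

digits = [0, 1, 1, 0, 1, 1, 0, 1, 1, 0]
freqs = block_frequencies(digits, k=2, n=6)
print(freqs)
```

For this periodic string the three blocks 01, 11 and 10 each occur at two of the six starting positions, so each has frequency 1/3.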

Typical frequencies of digits
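We define the subset $\Delta_k$ of $\ell^1(I^k)$ by
$$\Delta_k = \Bigl\{\, (p_{\mathbf{l}})_{\mathbf{l} \in I^k} \Bigm| p_{\mathbf{l}} \ge 0,\ \sum_{\mathbf{l} \in I^k} p_{\mathbf{l}} = 1 \,\Bigr\},$$
i.e. $\Delta_k$ denotes the simplex of probability vectors indexed by strings $\mathbf{i} = i_1 \ldots i_k \in I^k$ of length $k$. We will always equip $\Delta_k$ with the 1-norm $\|\cdot\|_1$. The vector $\boldsymbol{\pi}_{k,n}(\mathbf{i})$ clearly belongs to $\Delta_k$. We will now quantify the divergence of the frequencies of the digits of the point $x = \pi(\mathbf{i})$ by considering the extent to which the sequence $(\boldsymbol{\pi}_{k,n}(\mathbf{i}))_n$ fills up the simplex $\Delta_k$. Of course, in general, it is not true that the sequence $(\boldsymbol{\pi}_{k,n}(\mathbf{i}))_n$ fills up a substantial part of $\Delta_k$ for any $\mathbf{i}$. For example, let $I = \mathbb{N}$ and consider strings of length 3. By considering all possible ways a string of length 2, such as $37 \in \mathbb{N}^2$ (i.e. 37 represents the string of length 2 whose first digit equals 3 and whose second digit equals 7), can arise it is easily seen that
$$\Bigl|\, \sum_{l \in \mathbb{N}} \pi_n(\mathbf{i}; l37) - \sum_{l \in \mathbb{N}} \pi_n(\mathbf{i}; 37l) \,\Bigr| \le \frac{1}{n}$$
for all $\mathbf{i} \in \mathbb{N}^{\mathbb{N}}$. This implies that for each $\mathbf{i} \in \mathbb{N}^{\mathbb{N}}$, all but finitely many points in the sequence $(\boldsymbol{\pi}_{3,n}(\mathbf{i}))_n$ will be very close to the subsimplex
$$\Bigl\{\, (p_{\mathbf{l}})_{\mathbf{l} \in \mathbb{N}^3} \in \Delta_3 \Bigm| \sum_{l \in \mathbb{N}} p_{l37} = \sum_{l \in \mathbb{N}} p_{37l} \,\Bigr\}. \qquad (2.7)$$
Hence, in general the sequence $(\boldsymbol{\pi}_{k,n}(\mathbf{i}))_n$ will not fill up a significant part of the simplex $\Delta_k$, and the full simplex $\Delta_k$ is not the "correct" object to consider. Rather we need to consider the subsimplex defined by slicing $\Delta_k$ by various planes corresponding to the subsimplex in (2.7), namely the subsimplex $\Delta^{\mathrm{shift}}_k$ of shift invariant probability vectors defined by
$$\Delta^{\mathrm{shift}}_k = \Bigl\{\, (p_{\mathbf{l}})_{\mathbf{l} \in I^k} \in \Delta_k \Bigm| \sum_{i \in I} p_{i\mathbf{l}} = \sum_{i \in I} p_{\mathbf{l}i} \text{ for all } \mathbf{l} \in I^{k-1} \,\Bigr\}. \qquad (2.8)$$
The next result shows that the subsimplex $\Delta^{\mathrm{shift}}_k$ of shift invariant probability vectors is, indeed, the correct subsimplex to consider. To make this statement precise, we introduce the following notation, namely, if $(x_n)_n$ is a sequence in a metric space $X$, we will write $\operatorname{acc}_n x_n$ for the set of accumulation points of $(x_n)_n$, i.e.
$$\operatorname{acc}_n x_n = \{\, x \in X \mid x \text{ is an accumulation point of } (x_n)_n \,\}.$$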
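The 1/n bound for the string 37 can be checked numerically. The sketch below assumes that frequencies count starting positions 1, ..., n and divide by n (an assumption about the normalisation); the function name `prefix_suffix_defect` and the random test string are hypothetical illustrations, not part of the paper.

```python
import random

def prefix_suffix_defect(digits, word, n):
    """|sum_l freq_n(l word) - sum_l freq_n(word l)| with windows starting at 1..n."""
    m = len(word)
    w = list(word)
    # A string "word l" starts at position j exactly when `word` occurs at j.
    as_prefix = sum(digits[j:j + m] == w for j in range(n))
    # A string "l word" starts at position j exactly when `word` occurs at j + 1.
    as_suffix = sum(digits[j + 1:j + 1 + m] == w for j in range(n))
    # The two counts differ only through the occurrences at the two end positions.
    return abs(as_suffix - as_prefix) / n

rng = random.Random(0)  # fixed seed so the run is reproducible
digits = [rng.randint(1, 9) for _ in range(5000)]
for n in (10, 100, 1000):
    print(n, prefix_suffix_defect(digits, (3, 7), n))
```

The two counts range over the occurrences of the word in positions 1..n and 2..n+1 respectively, so they differ by at most one occurrence, which is where the bound 1/n comes from.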
We can now state the result that says that the subsimplex $\Delta^{\mathrm{shift}}_k$ is the correct subsimplex to consider; in Theorem A and in all subsequent parts of the paper we will always equip $\Delta^{\mathrm{shift}}_k$ with the 1-norm and all accumulation points are with respect to the 1-norm.
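In view of Theorem A it is natural to quantify the divergence of the sequence $\boldsymbol{\pi}_{k,n}(\mathbf{i}) = (\pi_n(\mathbf{i}; \mathbf{l}))_{\mathbf{l} \in I^k}$ by considering the extent to which it fills up the subsimplex $\Delta^{\mathrm{shift}}_k$. We therefore introduce the following notation. For $\mathbf{i} \in I^{\mathbb{N}}$, we say that the point $\pi(\mathbf{i})$ is extremely non-$k$-normal if the set of accumulation points of the sequence $(\boldsymbol{\pi}_{k,n}(\mathbf{i}))_n$ equals $\Delta^{\mathrm{shift}}_k$, and we will denote the set of extremely non-$k$-normal points by $E_k$, i.e.
$$E_k = \bigl\{\, \pi(\mathbf{i}) \bigm| \mathbf{i} \in I^{\mathbb{N}},\ \operatorname{acc}_n \boldsymbol{\pi}_{k,n}(\mathbf{i}) = \Delta^{\mathrm{shift}}_k \,\bigr\}.$$
We will say that a point is extremely non-normal if it is extremely non-$k$-normal for all $k$ and we let $E$ denote the set of extremely non-normal points, i.e.
$$E = \bigcap_k E_k.$$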
Hence, the points in E are those whose frequencies of digits diverge in the worst possible way. Work by [2,17,18,23], generalising earlier results [12,25,29–31], shows somewhat surprisingly that, in some cases, the set E of extremely non-normal points is extremely big from a topological viewpoint; these results are summarised in Theorem B below.
Theorem B [2,17,18,23] Let (X , (S i ) i∈I ) be an IIFS and assume that one of the following two conditions (a) or (b) is satisfied:

Typical average frequencies of digits
If a sequence diverges, forming the sequence of averages may "smoothen out" the irregularities of the original sequence. We will now use this idea to study the irregular behaviour of the sequence of frequencies $\boldsymbol{\pi}(\mathbf{i}; \mathbf{l}) = (\pi_n(\mathbf{i}; \mathbf{l}))_n$. Indeed, Theorem B shows that, in some cases, the sequence of frequencies $\boldsymbol{\pi}(\mathbf{i}; \mathbf{l}) = (\pi_n(\mathbf{i}; \mathbf{l}))_n$ diverges (in the worst possible way) for a typical point $\pi(\mathbf{i})$ in $K$, and it is therefore natural to form the sequence of average frequencies and investigate the convergence of this sequence. In fact, below we form very general types of averages, namely, averages formed using the classical notion of an averaging (or summability) system, and show (perhaps surprisingly) that all of these averages diverge in the worst possible way. We start by recalling the definition of an averaging (or summability) system; the reader is referred to Hardy's classical text [9] for a systematic treatment of averaging systems.
Definition Averaging system An averaging system is a sequence = ( n ) n∈N of sequences n = (π n,i ) i∈N of real numbers satisfying the following conditions: (1) π n,i ≥ 0 for all n and i; (2) the set {i | π n,i > 0} is finite for all n; (3) 1 n i π n,i → 0; (4) The Consistency Condition: if (x n ) n is a sequence of positive real numbers such that there is a real number x with x n → x, then i π n,i x i → x. Remark We immediately note that the usual Cesaro average of a sequence can be obtained by considering the averaging system defined by = ( n ) n where n = ( 1 n , . . . , 1 n , 0, 0 . . .) (where the sequence n contains n terms of the form 1 n ); indeed, with for this choice of , we clearly have A ,n (x) = 1 n (x 1 +· · ·+x n ) for any sequence x = (x n ) n . More general averages, including, for example, all higher order Cesaro and Hölder averages and the logarithmic Riesz averages can also be obtained from suitable averaging systems and will be studied in detail in Sects. 3 and 4, respectively.
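The two simplest averaging systems, the Cesaro system and the trivial (Kronecker delta) system, can be written down as rows of weights. The sketch below assumes that the n'th average of a sequence x is the weighted sum of the x_i with weights π_{n,i}, as suggested by the Consistency Condition and the Cesaro remark; the function names `cesaro_row`, `delta_row` and `average` are hypothetical.

```python
def cesaro_row(n):
    """Row n of the Cesaro system: n weights equal to 1/n (trailing zeros omitted)."""
    return [1.0 / n] * n

def delta_row(n):
    """Row n of the trivial system: weight 1 at index n and 0 elsewhere."""
    return [0.0] * (n - 1) + [1.0]

def average(row, x):
    """Weighted sum of x_1, x_2, ... over the finitely many nonzero weights in `row`."""
    return sum(w * xi for w, xi in zip(row, x))

# A convergent test sequence: x_i = 2 + 1/i, with limit 2.
x = [2.0 + 1.0 / i for i in range(1, 10001)]
print(average(cesaro_row(10000), x))   # close to the limit 2
print(average(delta_row(10000), x))    # the 10000'th term itself
```

Both rows are non-negative with finitely many nonzero entries, and for the convergent sequence above both averages approach the same limit, in line with the Consistency Condition.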
Applying averaging systems to the sequence π π π(i; l) = π n (i; l) n of frequencies leads to the definition of average frequencies.
Definition Average frequencies Let = ( n ) n be an averaging system with n = (π n,i ) n∈N . Let i ∈ I N . For a finite string l ∈ I k , we define the lower and upper -average frequency of the string l among the digits of the string i by π (i; l) = A ( π π π(i; l) ) and π (i; l) = A ( π π π(i; l) ) .

Remark
We note that the usual frequencies π n (i; l) are average frequencies; indeed, if we define the averaging system by = ( n ) n where n = (δ n,i ) i (where δ n,i denotes the Kronecker delta), then clearly π ,n (i; l) = π n (i; l) . (2.9) In analogy with the vector π π π k,n (i; l) = π n (i; l) l∈I k of frequencies of all strings l ∈ I k of length k, we define the vector π π π ,k,n (i) of all -average frequencies of strings l ∈ I k of length k by π π π ,k,n (i) = π ,n (i; l) l∈I k . (2.10) Analogously to Theorem A, the next result, i.e. Theorem 2.1 below, says that the subsimplex shift k is also the correct simplex to consider when investigating -average frequencies, namely, all accumulation points of the sequence ( π π π ,k,n (i) ) n belong to shift k . We note that the proof of Theorem 2.1 is similar to the proof of [23, Theorem 0]; however, we have decided to include it for completeness.
By considering all possible ways a string l ∈ N k−1 of length k − 1 can arise as a substring of the string i it follows that for all positive integers n, and so for all n. Since 1 n i π n,i → 0, inequality (2.13) implies that | l∈I p ll − l∈I p ll | ≤ 2 lim inf n π π π ,k,n (i) − p 1 , and we therefore conclude from (2.11) that | l∈I p ll − l∈I p ll | = 0, i.e. l∈I p ll = l∈I p ll for all l ∈ N k−1 , whence p = ( p l ) l∈I k ∈ shift k .
In view of Theorem 2.1 it is natural to attempt to quantify the divergence of the sequence π π π ,k,n (i) = π ,n (i; l) l∈I k by considering the extent to which it fills up the subsimplex shift k . We therefore introduce the following notation. For i ∈ I N , we say that the point π(i) is extremely non-k--normal if the set of accumulation points of the sequence ( π π π ,k,n (i) ) n equals shift k , and we will denote the set of extremely non-k--normal points by E ,k , i.e.
We will say that a point is extremely non--normal if it is extremely non-k--normal for all k and we denote the set of extremely non--normal by E , i.e.
Hence, the points in E are those whose -average frequencies of digits diverge in the worst possible way. Our first main result, Theorem 2.3 below, states (perhaps surprisingly) that the set E is extremely big from a topological viewpoint. In the statement of Theorem 2.3 we assume that the function π is closed, and it is clearly desirable to have simple conditions guaranteeing that this assumption is satisfied. Before stating Theorem 2.3, we give two such conditions. This is the content of the next proposition; in particular, this proposition shows that the map π is closed in the cases considered by Baek, Olsen and Madritsch in Theorem B, and Theorem 2.3 can therefore be applied to the cases considered previously in [2,17,18,23].

Proposition 2.2 Let (X , (S i ) i∈I ) be an IIFS and assume that one of the following two conditions (a) or (b) is satisfied:
(a) The set I is finite and the list (X , (S i ) i∈I ) satisfies the SOSC; is compact, and since π : I N → K is continuous, we therefore conclude that π maps closed sets to compact sets and hence to closed sets.
On the other hand, if the list (X , (S i ) i∈I ) is the continued fraction IIFS, then it is well-known that the map π : I N → K is a homeomorphism and hence closed.
We can now state Theorem 2.3.

Theorem 2.3
Let (X , (S i ) i∈I ) be an IIFS. Assume that (X , (S i ) i∈I ) satisfies the SOSC and that the map π is closed. Then the set E is comeagre in K .
The proof of Theorem 2.3 is given in Sects. 6 and 7. Several applications of Theorem 2.3 will be presented in Sects. 3-5: in Sect. 3 we use Theorem 2.3 to investigate higher order Hölder and Cesaro averages, in Sect. 4 we use Theorem 2.3 to investigate logarithmic Riesz averages, and in Sect. 5 we apply Theorem 2.3 to study the distribution of continued fraction digits and Lüroth expansion digits.
We also find the packing dimension of the set E . Indeed, the next result shows that if the assumptions in Theorem 2.3 and an additional mild condition on the maps S i are satisfied, then the packing dimension of E is maximal, i.e. equal to the packing dimension of the ambient set K . This result may be viewed as a further manifestation of the fact that the set E is "big". Below we denote the packing dimension by dim P (F); the reader is referred to [8] for the definition of the packing dimension.
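Theorem 2.4 Let (X , (S i ) i∈I ) be an IIFS. Assume that (X , (S i ) i∈I ) satisfies the SOSC and that the map π is closed. Assume further that the maps S i are bi-Lipschitz for all i. Then dim P (E) = dim P (K ).
The proof of Theorem 2.4 is given in Sect. 8.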
We have not considered the problem of computing the Hausdorff dimension of the set E . However, it follows from [24,25] that the Hausdorff dimension of the set of numbers whose N-adic expansion satisfies a similar condition of extreme non-normality equals 0, and it follows from [16] that the Hausdorff dimension of the set of numbers whose continued fraction expansion satisfies a similar condition of extreme non-normality equals 0. We therefore make the following conjecture. Below we denote the Hausdorff dimension by dim H ; the reader is referred to [8] for the definition of the Hausdorff dimension.

Conjecture 2.5
Let (X , (S i ) i∈I ) be an IIFS. Assume that (X , (S i ) i∈I ) satisfies the SOSC and that the map π is closed. Assume further that the maps S i are bi-Lipschitz for all i. Then In fact, for each positive integer k, we have

Points of maximal oscillation
Our results can be applied to the study of points of maximal oscillation. Recently such points have received considerable interest in the literature and it has been proved that in many cases typical points have maximal oscillation. A point π(i) is said to have maximal oscillation if the frequencies π n (i; i) of all digits i oscillate as much as possible as n → ∞, more precisely, a point π(i) is said to have maximal oscillation if π (i; i) = 0 and π(i; i) = 1 for all digits i; we let M denote the set of points of maximal oscillation, i.e. M = π i ∈ I N π(i; i) = 0 and π(i; i) = 1 for all i ∈ I . (2.14) Šalat [30] proved that if (X , (S i ) i∈I ) is the continued fraction IIFS from Sect. 5.1 (and, hence, the digits i are the continued fraction digits of x = π(i)), then M is comeagre, and Madritsch [18] proved analogous results for the frequencies of strings of continued fraction digits. We will now show that Theorem 2.3 provides substantial extensions of these and many other results. It is natural to consider points for which the average of the frequencies of strings of digits have maximal oscillation. For this reason we introduce the following notation. Fix k ∈ N and l ∈ I k . Define the projection P k,l : R I k → R onto the l'th coordinate by We now define the set of points of maximal -k-oscillation by M ,k = π i ∈ I N π (i; l) = u k,l and π (i; l) = v k,l for all l ∈ I k , and we define the set of points of maximal -oscillation by Remark We note that the notion of points of maximal -k-oscillation extends the classical notion of points of maximal oscillation in (2.14); indeed, if we let denote the trivial averaging system defined by However, for k > 1, it may happen that v k,l is strictly less than 1. For example, if we let I = {0, 1}, k = 2 and l = 10, then v k,l < 1 .
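Indeed, otherwise $\sup P_{k,\mathbf{l}}(\Delta^{\mathrm{shift}}_k) = v_{k,\mathbf{l}} = 1$, and since $P_{k,\mathbf{l}}(\Delta^{\mathrm{shift}}_k)$ is compact (because $\Delta^{\mathrm{shift}}_k$ is a closed and bounded subset of $\mathbb{R}^{I^k} = \mathbb{R}^{\{0,1\}^2} = \mathbb{R}^4$ (and hence compact) and $P_{k,\mathbf{l}}$ is continuous), we now conclude that there is a probability vector $p = (p_{\mathbf{i}})_{\mathbf{i} \in I^k} \in \Delta^{\mathrm{shift}}_k$ with $P_{k,\mathbf{l}}(p) = 1$. However, this clearly implies that $p_{\mathbf{l}} = 1$ and $p_{\mathbf{i}} = 0$ for $\mathbf{i} \ne \mathbf{l}$, and so $\sum_{i=0,1} p_{i0} = 1$ and $\sum_{i=0,1} p_{0i} = 0$, whence $p \notin \Delta^{\mathrm{shift}}_k$, contradicting the fact that $p \in \Delta^{\mathrm{shift}}_k$.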
The next result says that typical points have maximal oscillation.
satisfies the SOSC and that the map π is closed. Then the set M of points of maximal -oscillation is comeagre. In particular, the set M (in (2.14)) of points of maximal oscillation is comeagre.
Since E ,k is comeagre for k ∈ N (by Theorem 2.3) and M ,k = ∩ l∈I k M ,k,l , it clearly suffices to show that E ,k ⊆ M ,k,l for all k ∈ N and all l ∈ I k . To prove this, we fix k ∈ N and l ∈ I k , and let i ∈ E ,k . It follows from the definitions of u k,l and v k,l that we can choose sequences k , we can choose a strictly increasing sequence (n j ) j of positive integers with π π π ,k,n 2 j (i) − u j 1 → 0 and π π π ,k,n 2 j+1 (i) − v j 1 → 0. This clearly implies that π ,n 2 j (i; l) − P k,l (u j ) → 0 and π ,n 2 j+1 (i; l) − P k,l (v j ) → 0, and since P k,i (u j ) → u k,l and P k,i (v j ) → v k,l , we therefore conclude that π ,n 2 j (i; l) → u k,l and π ,n 2 j+1 (i; l) → v k,l , whence i ∈ M ,k,l .

Comparing with normal points
Our main results show that the limiting frequencies of digits of a typical point in K fail to exist in a very spectacular way. This is often in very sharp contrast to the behaviour of the limiting frequencies of digits of almost all points in K with respect to judiciously chosen ergodic measures, and we believe that it is instructive to provide some context by briefly explaining this. For i ∈ I , we define the cylinder [i] generated by i by In analogy with the usual definition of normal numbers (see, for example, [14])), we now make the following definition.
Definition Normal Let μ be a probability measure on I N . Write We now define the set of μ-normal points in K by Remark It is clear that the above definition of normality extends the classical notion of normal numbers. Indeed, fix a positive integer N .
denote the unique non-terminating N -adic expansion of x. With these definitions, it is not difficult to see that i.e. N μ equals the set of numbers that are normal to base N .
As remarked above, our main results show that the limiting frequencies of digits of a typical point in K fail to exist in a very spectacular way. We also suggested that this often is in very sharp contrast to the behaviour of the limiting frequencies of digits of almost all points in K with respect to judiciously chosen ergodic measures. We can now make this statement precise. Namely, in many important and natural cases μ-almost all points in K are μ-normal. Indeed, if μ is a shift-invariant and ergodic probability measure on I N and we writeμ = μ • π −1 for the image measure of μ on K , then it follows immediately from the ergodic theorem that μ(N μ ) = 1, and since π −1 (K \ N μ ) = π −1 (K \ π(N μ )) ⊆ I N \ N μ , we therefore conclude that i.e.μ-almost all points in K are μ-normal. Hence, often the limiting frequencies of digits of almost all points in K exist and equal the "correct" average value.
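The almost-sure behaviour described above is easy to observe in simulation. The sketch below assumes the simplest shift-invariant ergodic measure, a Bernoulli measure under which the digits are independent and uniformly distributed on a four-letter alphabet; the alphabet size, sample length and the function name `digit_frequencies` are hypothetical choices made for illustration.

```python
import random
from collections import Counter

def digit_frequencies(digits, alphabet):
    """Relative frequency of each symbol of `alphabet` among `digits`."""
    counts = Counter(digits)
    return {a: counts[a] / len(digits) for a in alphabet}

rng = random.Random(0)            # fixed seed so the run is reproducible
alphabet = range(4)
sample = [rng.randrange(4) for _ in range(100_000)]
freqs = digit_frequencies(sample, alphabet)
print(freqs)
```

Each frequency lands close to 1/4, the measure of the corresponding cylinder. A simulation of this kind only ever sees measure-typical strings, which is why it gives no hint of the Baire-typical behaviour studied in this paper.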

Hölder and Cesaro averages of frequencies of digits in IIFS's
Two of the most important and commonly used averages are the Hölder averages and the Cesaro averages. We will now define these averages and apply them to the sequence π π π(i; l) = π n (i; l) n of frequencies of the string l in string i. We start by recalling the definitions of the Hölder and Cesaro averages. For a sequence x = (x n ) n of real numbers and an integer m with m ≥ 1, we define the sequence (H m n (x)) n inductively as follows The Cesaro averages are defined as follows. For a sequence x = (x n ) n of real numbers and an integer m with m ≥ 1, we define the sequence (C m n (x)) n inductively as follows The lower and upper m'th order Cesaro averages of x are now defined by It is well-known that that the higher order Hölder and Cesaro averages form a double infinite hierarchy between the lower limit of the sequence x and the upper limit of the sequence x in (at least) countably infinite many levels, namely, we have the following inequalities lim inf   [13], and so C m (x) = A m C (x) and C m (x) = A m C (x).
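Using Hölder and Cesaro averages we can now introduce average Hölder and Cesaro frequencies by applying the definitions of the Hölder and Cesaro averages to the sequence $\boldsymbol\pi(\mathbf i;\mathbf l) = (\pi_n(\mathbf i;\mathbf l))_n$ of frequencies of the string $\mathbf l$ in the string $\mathbf i$. This is the content of the next definition.
Definition (Higher order Hölder and Cesaro average frequencies of digits) Let $\mathbf i \in I^{\mathbb N}$. For a finite string $\mathbf l \in I^k$, we define the lower and upper m'th order Hölder average frequency of the string $\mathbf l$ among the digits of the string $\mathbf i$ by
$$\underline\pi^m_H(\mathbf i;\mathbf l) = \underline H^m(\boldsymbol\pi(\mathbf i;\mathbf l)) = \liminf_n H^m_n(\boldsymbol\pi(\mathbf i;\mathbf l)), \qquad \overline\pi^m_H(\mathbf i;\mathbf l) = \overline H^m(\boldsymbol\pi(\mathbf i;\mathbf l)) = \limsup_n H^m_n(\boldsymbol\pi(\mathbf i;\mathbf l)).$$
Similarly, we define the lower and upper m'th order Cesaro average frequency of the string $\mathbf l$ among the digits of the string $\mathbf i$ by
$$\underline\pi^m_C(\mathbf i;\mathbf l) = \underline C^m(\boldsymbol\pi(\mathbf i;\mathbf l)) = \liminf_n C^m_n(\boldsymbol\pi(\mathbf i;\mathbf l)), \qquad \overline\pi^m_C(\mathbf i;\mathbf l) = \overline C^m(\boldsymbol\pi(\mathbf i;\mathbf l)) = \limsup_n C^m_n(\boldsymbol\pi(\mathbf i;\mathbf l)).$$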
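To make the definition concrete, the following Python sketch computes the frequency sequence of a single digit and its iterated running means. It assumes that the m'th order Hölder average is the m-fold iterated arithmetic mean, i.e. $H^0_n(x) = x_n$ and $H^m_n(x) = \frac1n\sum_{i\le n} H^{m-1}_i(x)$; the function names and the block-structured test sequence are illustrative choices and are not taken from the paper.

```python
from itertools import accumulate

def digit_frequencies(digits, target):
    """pi_n for n = 1..len(digits): share of `target` among the first n digits."""
    hits = accumulate(1 if d == target else 0 for d in digits)
    return [h / n for n, h in enumerate(hits, start=1)]

def holder_average(x, m):
    """m-fold iterated running mean (assumed form of H^m_n); m = 0 returns x."""
    for _ in range(m):
        x = [s / n for n, s in enumerate(accumulate(x), start=1)]
    return list(x)

def block_sequence(num_blocks):
    """Hypothetical input: alternating blocks of 0s and 1s of lengths 1, 2, 4, ..."""
    digits = []
    for b in range(num_blocks):
        digits.extend([b % 2] * 2 ** b)
    return digits

digits = block_sequence(14)
freq = digit_frequencies(digits, 1)
tail = slice(len(freq) // 8, None)      # ignore the initial transient
for m in range(3):
    h = holder_average(freq, m)[tail]
    print(m, round(min(h), 3), round(max(h), 3))
```

Printing the minimum and maximum over a tail of the sequence gives a finite-horizon impression of how much oscillation survives each additional level of averaging; it is only a numerical illustration and says nothing about the actual lower and upper limits.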
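The higher order average Hölder and Cesaro frequencies form a double infinite hierarchy between the lower frequency and the upper frequency in (at least) countably infinitely many levels, namely, we have (using (3.1))
$$\liminf_n \pi_n(\mathbf i;\mathbf l) \le \underline\pi^1_H(\mathbf i;\mathbf l) \le \underline\pi^2_H(\mathbf i;\mathbf l) \le \cdots \le \overline\pi^2_H(\mathbf i;\mathbf l) \le \overline\pi^1_H(\mathbf i;\mathbf l) \le \limsup_n \pi_n(\mathbf i;\mathbf l),$$
$$\liminf_n \pi_n(\mathbf i;\mathbf l) \le \underline\pi^1_C(\mathbf i;\mathbf l) \le \underline\pi^2_C(\mathbf i;\mathbf l) \le \cdots \le \overline\pi^2_C(\mathbf i;\mathbf l) \le \overline\pi^1_C(\mathbf i;\mathbf l) \le \limsup_n \pi_n(\mathbf i;\mathbf l). \tag{3.2}$$
As an application of Theorem 2.3, we will now show that the behaviour of the digits of a typical point in K is so irregular that not even the hierarchies in (3.2) formed by taking Hölder and Cesaro averages of all orders are sufficiently powerful to "smoothen out" the behaviour of the frequencies of the digits. Let
$$\boldsymbol\pi^m_{H,k,n}(\mathbf i) = \big( H^m_n(\boldsymbol\pi(\mathbf i;\mathbf l)) \big)_{\mathbf l\in I^k}, \tag{3.3}$$
i.e. $\boldsymbol\pi^m_{H,k,n}(\mathbf i)$ denotes the vector of m'th order Hölder-average frequencies $H^m_n(\boldsymbol\pi(\mathbf i;\mathbf l))$ of all strings $\mathbf l \in I^k$ of length k. Similarly, we let
$$\boldsymbol\pi^m_{C,k,n}(\mathbf i) = \big( C^m_n(\boldsymbol\pi(\mathbf i;\mathbf l)) \big)_{\mathbf l\in I^k},$$
i.e. $\boldsymbol\pi^m_{C,k,n}(\mathbf i)$ denotes the vector of m'th order Cesaro-average frequencies $C^m_n(\boldsymbol\pi(\mathbf i;\mathbf l))$ of all strings $\mathbf l \in I^k$ of length k.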
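For $\mathbf i \in I^{\mathbb N}$, we say that the point $\pi(\mathbf i)$ is extremely non-Hölder-normal if the set of accumulation points of the sequence $(\boldsymbol\pi^m_{H,k,n}(\mathbf i))_n$ (with respect to $\|\cdot\|_1$) equals $\Delta^{\mathrm{shift}}_k$ for all m and all k, and we will denote the set of extremely non-Hölder-normal points by $E_H$, i.e.
$$E_H = \big\{ \pi(\mathbf i) \,\big|\, \mathbf i \in I^{\mathbb N},\ \operatorname{acc}_n \boldsymbol\pi^m_{H,k,n}(\mathbf i) = \Delta^{\mathrm{shift}}_k \text{ for all } m \text{ and all } k \big\}.$$
Similarly, we define the set of extremely non-Cesaro-normal points, denoted by $E_C$, by
$$E_C = \big\{ \pi(\mathbf i) \,\big|\, \mathbf i \in I^{\mathbb N},\ \operatorname{acc}_n \boldsymbol\pi^m_{C,k,n}(\mathbf i) = \Delta^{\mathrm{shift}}_k \text{ for all } m \text{ and all } k \big\}.$$
Hence, the points in $E_H$ and $E_C$ are so far away from being normal that all higher order Hölder and Cesaro averages of frequencies of strings of all lengths diverge in the worst possible way, i.e. the points in $E_H$ and $E_C$ are so irregular that not even the hierarchies in (3.2) formed by taking Hölder and Cesaro averages of all orders are sufficiently powerful to "smoothen out" the behaviour of the frequencies of their digits. The theorem below shows (perhaps surprisingly) that the sets $E_H$ and $E_C$ are extremely big from a topological viewpoint.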

Theorem 3.1 Let (X , (S i ) i∈I ) be an IIFS. Assume that (X , (S i ) i∈I ) satisfies the SOSC and that the map π is closed. Then the sets E H and E C are comeager in K .
Proof Since H m n (x) = A m H,n (x) and C m n (x) = A m C,n (x) for all real valued sequences x and an easy calculation shows that 1 n i π m H,n,i → 0 as n → ∞ and 1 n i π m C,n,i → 0 as n → ∞, the statement in Theorem 3.1 follows immediately from Theorem 2.3.

Logarithmic Riesz averages
A further important averaging method, one that is even more powerful than all higher order Cesaro averages, is the logarithmic Riesz average. This method was introduced by Marcel Riesz [28] in the early 1900's and is of particular importance in the study of convergence of Dirichlet series, see, for example, Hardy and Riesz's classical text on Dirichlet series [10]. We now recall the definition of the logarithmic Riesz average and apply it to the sequence $\boldsymbol\pi(\mathbf i;\mathbf l) = (\pi_n(\mathbf i;\mathbf l))_n$ of frequencies. For a sequence $x = (x_n)_n$ of real numbers, define the sequence $(R_n(x))_n$ by
$$R_n(x) = \frac{1}{\log(n+1)} \sum_{i=1}^{n} \frac{x_i}{i}.$$
The lower and upper logarithmic Riesz averages of x are now defined by
$$\underline R(x) = \liminf_n R_n(x), \qquad \overline R(x) = \limsup_n R_n(x).$$
The logarithmic Riesz averages are more powerful than all higher order Cesaro averages, namely, we have
$$\underline C^m(x) \le \underline R(x) \le \overline R(x) \le \overline C^m(x) \tag{4.1}$$
for all m ≥ 0; while the inequalities in (4.1) undoubtedly are well-known, we have been unable to find a proof in the literature, and for the convenience of the reader we have decided to provide a short and direct proof of (4.1) in "Appendix A". It is also clear that the logarithmic Riesz averages are averaging systems in the sense of the definition in Sect. 2.4. Indeed, if, for a positive integer n, we define the averaging system $\Pi_R = (\Pi_{R,n})_n$ by $\Pi_{R,n} = (\pi_{R,n,i})_i$ where $\pi_{R,n,i} = \frac{1}{i\log(n+1)}$ for $i \le n$ and $\pi_{R,n,i} = 0$ for $i > n$, then $R_n(x) = A_{R,n}(x)$ for all sequences x. Using logarithmic Riesz averages we can now introduce logarithmic Riesz average frequencies by applying the definitions of the logarithmic Riesz averages to the sequence $\boldsymbol\pi(\mathbf i;\mathbf l) = (\pi_n(\mathbf i;\mathbf l))_n$ of frequencies of the string $\mathbf l$ in the string $\mathbf i$. This is the content of the next definition.
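Definition (Logarithmic Riesz average frequencies of digits) Let $\mathbf i \in I^{\mathbb N}$. For a finite string $\mathbf l \in I^k$, we define the lower and upper logarithmic Riesz average frequency of the string $\mathbf l$ among the digits of the string $\mathbf i$ by
$$\underline\pi_R(\mathbf i;\mathbf l) = \underline R(\boldsymbol\pi(\mathbf i;\mathbf l)) = \liminf_n R_n(\boldsymbol\pi(\mathbf i;\mathbf l)), \qquad \overline\pi_R(\mathbf i;\mathbf l) = \overline R(\boldsymbol\pi(\mathbf i;\mathbf l)) = \limsup_n R_n(\boldsymbol\pi(\mathbf i;\mathbf l)).$$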
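As a numerical illustration of this definition, the sketch below evaluates a logarithmic Riesz average next to the ordinary running mean. It assumes the form $R_n(x) = \frac{1}{\log(n+1)}\sum_{i\le n} x_i/i$, which is the standard logarithmic mean with the normalisation $\log(n+1)$; the test sequence and all function names are hypothetical.

```python
import math
from itertools import accumulate

def riesz_log_average(x):
    """R_n(x) = (1 / log(n + 1)) * sum_{i <= n} x_i / i, for n = 1..len(x) (assumed form)."""
    out, partial = [], 0.0
    for n, value in enumerate(x, start=1):
        partial += value / n
        out.append(partial / math.log(n + 1))
    return out

def running_mean(x):
    """First order average (1 / n) * sum_{i <= n} x_i, for comparison."""
    return [s / n for n, s in enumerate(accumulate(x), start=1)]

# Hypothetical 0/1 sequence that is constant on the dyadic blocks [2^j, 2^(j+1)).
N = 1 << 16
x = [n.bit_length() % 2 for n in range(1, N + 1)]

r, c = riesz_log_average(x), running_mean(x)
tail = slice(N // 16, None)             # ignore the initial transient
print("running mean range:", min(c[tail]), max(c[tail]))
print("log Riesz range:   ", min(r[tail]), max(r[tail]))
```

Comparing the two printed ranges gives a rough finite-horizon sense of how much more strongly the logarithmic weights damp slowly varying oscillations than the uniform weights do.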
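The logarithmic Riesz average frequencies are more powerful than all higher order average Cesaro frequencies, namely, we have (using (4.1))
$$\underline\pi^m_C(\mathbf i;\mathbf l) \le \underline\pi_R(\mathbf i;\mathbf l) \le \overline\pi_R(\mathbf i;\mathbf l) \le \overline\pi^m_C(\mathbf i;\mathbf l)$$
for all m.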
As an application of Theorem 2.3, we will now show that the behaviour of the digits of a typical point in K is so irregular that not even taking logarithmic Riesz averages is sufficiently powerful to "smoothen out" the behaviour of the frequencies of the digits. Let π π π R,k,n (i) = R n ( π π π(i; l) ) l∈I k , i.e. π π π R,k,n (i) denotes the vector of the logarithmic Riesz average frequencies R n ( π π π(i; l) ) of all strings l ∈ I k of length k. For i ∈ I N , we say that the point π(i) is extremely non-logarithmic-Riesz-normal if the set of accumulation points of the sequence ( π π π R,k,n (i) ) n (with respect to · 1 ) equals shift k for all k, and we will denote the set of extremely non-logarithmic-Riesz-normal points by E R , i.e.
Hence, the points in E R are so far away from being normal that even forming the powerful logarithmic Riesz average is not sufficiently powerful to "smoothen out" the behaviour of the frequencies of their digits. The theorem below shows that the set E R is extremely big from a topological viewpoint.

Theorem 4.1 Let (X , (S i ) i∈I ) be an IIFS. Assume that (X , (S i ) i∈I ) satisfies the SOSC and that the map π is closed. Then the set E R is comeager in K .
Proof Since R n (x) = A R,n (x) for all real valued sequences x and an easy calculation shows that 1 n i π R,n,i → 0 as n → ∞, the statement in Theorem 4.1 follows immediately from Theorem 2.3.

Continued fraction expansion
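Let P denote the set of irrational numbers in the closed unit interval, i.e. $P = [0,1]\setminus\mathbb Q$. For each $x \in P$ there is a unique sequence $\mathbf a(x) = (a_n(x))_n$ of positive integers, the continued fraction digits of x, such that
$$x = \cfrac{1}{a_1(x) + \cfrac{1}{a_2(x) + \cdots}}. \tag{5.1}$$
Now let $X = [0,1]$ and $I = \mathbb N$, and define $S_i : X \to X$ by
$$S_i(x) = \frac{1}{i + x}. \tag{5.2}$$
For this choice of X and $S_i$ it is well-known that $\pi(\mathbb N^{\mathbb N}) = P$ and that π is a homeomorphism between $\mathbb N^{\mathbb N}$ and P. It is also well-known that $\pi^{-1}(x) = \mathbf a(x)$ for all $x \in P$; it is, of course, this identity that allows us to apply Theorem 2.3 to the IIFS in (5.2) in order to obtain results about the continued fraction digits of a typical number $x \in P$.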
Recall that if n is a positive integer and $i \in \mathbb N$ is a digit, then $\pi_n(\mathbf i; i) = \frac{|\{1\le j\le n \mid i_j = i\}|}{n}$ denotes the frequency of the digit i among the first n digits of the string $\mathbf i = i_1 i_2 \ldots$. In particular, $\pi_n(\mathbf a(x); i)$ denotes the frequency of the digit i among the first n continued fraction digits of x. A classical result due to Lévy [15] says that for Lebesgue almost all $x \in P$ we have
$$\pi_n(\mathbf a(x); i) \to \frac{1}{\log 2}\,\log\frac{(i+1)^2}{i(i+2)} \tag{5.3}$$
for all $i \in \mathbb N$; the reader is referred to the textbook [4, p 45] for a contemporary proof of this based on the ergodic theorem. In analogy with normal numbers, we will say that a number $x \in P$ is continued fraction normal (c-f-normal) if it satisfies (5.3). Hence, using this terminology, Lévy's result says that Lebesgue almost all $x \in P$ are c-f-normal. As an application of Theorem 2.3 to the IIFS in (5.2), we will now show that the behaviour of the continued fraction digits of a typical point x in P is so irregular that no averaging system (regardless of how powerful it is) is powerful enough to "smoothen out" the behaviour of the frequencies of the digits; more precisely, for all averaging systems Π (regardless of how powerful they are), the set of accumulation points of the sequence $(\boldsymbol\pi_{\Pi,k,n}(\mathbf a(x)))_n$ of Π-average frequencies of a typical number x is as big as possible. This is the content of the next theorem.

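Theorem 5.1 Let Π be an averaging system and write
$$E^{\mathrm{cf}}_\Pi = \bigcap_k \big\{ x \in P \,\big|\, \operatorname{acc}_n \boldsymbol\pi_{\Pi,k,n}(\mathbf a(x)) = \Delta^{\mathrm{shift}}_k \big\}.$$
Then the set $E^{\mathrm{cf}}_\Pi$ is comeagre in [0, 1] and $\dim_P(E^{\mathrm{cf}}_\Pi) = 1$.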
Proof This follows immediately by applying Theorem 2.3 and Theorem 2.4 to the IIFS in (5.2) noticing that in this case π −1 (x) = a a a(x) for all x ∈ P.
Theorem 5.1 is a substantial extension of previously obtained results. For example, Šalat [30] proved that for a typical $x \in P$, the oscillation of the frequencies $\pi_n(\mathbf a(x); i)$ is maximal for all digits i, i.e. for a typical $x \in P$, we have
$$\liminf_n \pi_n(\mathbf a(x); i) = 0 \quad\text{and}\quad \limsup_n \pi_n(\mathbf a(x); i) = 1 \quad\text{for all } i. \tag{5.4}$$
However, this result follows immediately from Theorem 5.1; indeed, if $\Pi = (\Pi_n)_n$ denotes the trivial averaging system with $\Pi_n = (\delta_{n,i})_i$ and $M^{\mathrm{cf}}$ denotes the set of points x satisfying (5.4), then clearly $E^{\mathrm{cf}}_\Pi \subseteq M^{\mathrm{cf}}$ (for k = 1 every unit vector is an accumulation point of the frequency vectors of a point in $E^{\mathrm{cf}}_\Pi$), and since Theorem 5.1 shows that $E^{\mathrm{cf}}_\Pi$ is comeagre, we conclude that $M^{\mathrm{cf}}$ is comeagre. Also, Madritsch [18] proved that for a typical $x \in P$, the set of accumulation points of the m'th order Hölder-average of the frequencies of all strings of continued fraction digits of length k is maximal, i.e. for a typical $x \in P$, we have $\operatorname{acc}_n \boldsymbol\pi^m_{H,k,n}(\mathbf a(x)) = \Delta^{\mathrm{shift}}_k$ for all k and all m (recall that, for an infinite string $\mathbf i \in \mathbb N^{\mathbb N}$, the vector $\boldsymbol\pi^m_{H,k,n}(\mathbf i) = (H^m_n(\boldsymbol\pi(\mathbf i;\mathbf l)))_{\mathbf l\in\mathbb N^k}$ of m'th order Hölder-average frequencies $H^m_n(\boldsymbol\pi(\mathbf i;\mathbf l))$ of all strings $\mathbf l \in \mathbb N^k$ of length k is defined in (3.3)). However, this result follows immediately by applying Theorem 5.1 to the averaging systems $\Pi^m_H$ (with $m \in \mathbb N$) in Sect. 3. Due to the importance and prevalence of the Hölder averages, we have decided to state this result formally in Corollary 5.2 below.

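Corollary 5.2 Write
$$E^{\mathrm{cf}}_H = \bigcap_{k,m} \big\{ x \in P \,\big|\, \operatorname{acc}_n \boldsymbol\pi^m_{H,k,n}(\mathbf a(x)) = \Delta^{\mathrm{shift}}_k \big\},$$
i.e. $E^{\mathrm{cf}}_H$ is the set of points for which the set of accumulation points of the m'th order Hölder-average of the frequencies of all strings of continued fraction digits of length k is maximal for all k and all m. Then the set $E^{\mathrm{cf}}_H$ is comeagre in [0, 1] and $\dim_P(E^{\mathrm{cf}}_H) = 1$.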
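The objects used in this section can be explored numerically. The sketch below extracts continued fraction digits with exact rational arithmetic and evaluates the limiting frequencies $\frac{1}{\log 2}\log\frac{(i+1)^2}{i(i+2)}$ from Lévy's theorem. It works with rational inputs, whose expansions are finite, so it is only a stand-in for the irrational points in P; the function names are illustrative.

```python
from fractions import Fraction
import math

def cf_digits(x, max_digits=50):
    """Continued fraction digits a_1, a_2, ... of a rational x in (0, 1)."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        inv = 1 / x
        a = inv.numerator // inv.denominator    # a = floor(1 / x)
        digits.append(a)
        x = inv - a                             # Gauss map x -> 1/x - floor(1/x)
    return digits

def digit_frequency(digits, i):
    """pi_n(digits; i) with n = len(digits)."""
    return digits.count(i) / len(digits)

def levy_frequency(i):
    """Almost sure limiting frequency of the digit i (Gauss-Kuzmin law)."""
    return math.log2((i + 1) ** 2 / (i * (i + 2)))

print(cf_digits(Fraction(93, 415)))             # [4, 2, 6, 7]
print([round(levy_frequency(i), 4) for i in (1, 2, 3)])
```

A typical point in the sense of Baire behaves very differently from what `levy_frequency` describes: by Theorem 5.1 its empirical frequencies accumulate on every admissible vector rather than converging to these values.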

Lüroth expansion
As in the previous example, let P denote the irrational numbers in the closed unit interval, i.e. $P = [0,1]\setminus\mathbb Q$. For each $x \in P$, it is known that there are unique positive integers $b_1(x), b_2(x), \ldots$ with $b_n(x) \ge 2$ such that
$$x = \frac{1}{b_1(x)} + \sum_{n\ge 2} \frac{1}{b_1(x)(b_1(x)-1)\cdots b_{n-1}(x)(b_{n-1}(x)-1)\, b_n(x)}; \tag{5.5}$$
the sequence $\mathbf b(x) = (b_n(x))_n$ is the Lüroth expansion of x. Now let $X = [0,1]$ and $I = \mathbb N\setminus\{1\}$, and define $S_i : X \to X$ by
$$S_i(x) = \frac{x}{i(i-1)} + \frac{1}{i}. \tag{5.6}$$
It is well-known that $\pi((\mathbb N\setminus\{1\})^{\mathbb N}) = P$ and that π is a homeomorphism between $(\mathbb N\setminus\{1\})^{\mathbb N}$ and P. It is also well-known that $\pi^{-1}(x) = \mathbf b(x)$ for all $x \in P$; it is, of course, this identity that allows us to apply Theorem 2.3 to the IIFS in (5.6) in order to obtain results about the Lüroth expansion digits of a typical number $x \in P$. As in the previous example, recall that if n is a positive integer and i is a digit, then $\pi_n(\mathbf i; i) = \frac{|\{1\le j\le n \mid i_j = i\}|}{n}$ denotes the frequency of the digit i among the first n digits of the string $\mathbf i = i_1 i_2 \ldots$. In particular, $\pi_n(\mathbf b(x); i)$ denotes the frequency of the digit i among the first n Lüroth expansion digits of x. Since the Lebesgue measure is T-invariant and ergodic, where $T : [0,1) \to [0,1)$ is defined by $T(x) = n(n+1)x - n$ for $x \in [\frac{1}{n+1}, \frac{1}{n})$ and $T(0) = 0$, it follows from the ergodic theorem that for Lebesgue almost all $x \in P$ we have
$$\pi_n(\mathbf b(x); i) \to \frac{1}{i(i-1)} \tag{5.7}$$
for all digits $i \in \mathbb N\setminus\{1\}$. In analogy with normal numbers, we will say that a number $x \in P$ is Lüroth normal if it satisfies (5.7). Hence, using this terminology, (5.7) says that Lebesgue almost all $x \in P$ are Lüroth normal. As an application of Theorem 2.3 to the IIFS in (5.6), we will now show that the behaviour of the Lüroth expansion digits of a typical point x in P is so irregular that no averaging system (regardless of how powerful it is) is powerful enough to "smoothen out" the behaviour of the frequencies of the digits; more precisely, for all averaging systems Π (regardless of how powerful they are), the set of accumulation points of the sequence $(\boldsymbol\pi_{\Pi,k,n}(\mathbf b(x)))_n$ of Π-average frequencies of a typical number x is as big as possible. This is the content of the next theorem.

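Theorem 5.3 Let Π be an averaging system and write
$$E^{\mathrm L}_\Pi = \bigcap_k \big\{ x \in P \,\big|\, \operatorname{acc}_n \boldsymbol\pi_{\Pi,k,n}(\mathbf b(x)) = \Delta^{\mathrm{shift}}_k \big\}.$$
Then the set $E^{\mathrm L}_\Pi$ is comeagre in [0, 1] and $\dim_P(E^{\mathrm L}_\Pi) = 1$.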

Proof The proof of this result is similar to the proof of Theorem 5.1.
Theorem 5.3 is a substantial extension of previously obtained results. For example, Madritsch [18] proved that for a typical $x \in P$, the set of accumulation points of the m'th order Hölder-average of the frequencies of all strings of Lüroth expansion digits of length k is maximal, i.e. for a typical $x \in P$, we have $\operatorname{acc}_n \boldsymbol\pi^m_{H,k,n}(\mathbf b(x)) = \Delta^{\mathrm{shift}}_k$ for all k and all m. However, this result follows immediately by applying Theorem 5.3 to the averaging systems $\Pi^m_H$ (with $m \in \mathbb N$) in Sect. 3.
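For experimentation, Lüroth digits can be computed in the same way as continued fraction digits. The sketch below assumes the classical Lüroth series, in which the digit d ≥ 2 of x is determined by $\frac1d \le x < \frac{1}{d-1}$ and the map is $T(x) = d(d-1)x - (d-1)$, so that almost surely the digit i has limiting frequency $\frac{1}{i(i-1)}$. It uses rational inputs (whose orbit may reach 0) purely for exact arithmetic, and the function names are illustrative.

```python
from fractions import Fraction
import math

def luroth_digits(x, max_digits=50):
    """Lüroth digits (each >= 2) of a rational x in (0, 1); stops if the orbit hits 0."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        d = math.ceil(1 / x)                 # 1/d <= x < 1/(d - 1)
        digits.append(d)
        x = d * (d - 1) * x - (d - 1)        # Lüroth map T
    return digits

def luroth_value(digits):
    """Inverse direction: apply S_d(y) = y / (d (d - 1)) + 1 / d from the last digit back."""
    value = Fraction(0)
    for d in reversed(digits):
        value = Fraction(1, d) + value / (d * (d - 1))
    return value

def luroth_frequency(i):
    """Almost sure limiting frequency of the digit i >= 2."""
    return 1 / (i * (i - 1))

digits = luroth_digits(Fraction(3, 7))
print(digits, luroth_value(digits))          # [3, 2, 7] 3/7
```

The round trip through `luroth_value` is a quick consistency check that the digit rule and the contractions $S_d$ used here are inverse to each other.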

Proof of Theorem 2.3: Part 1
The purpose of this and the next section is to prove Theorem 2.3. The structure of the proof of Theorem 2.3 is as follows. For an averaging system Π, we first define a subset $E^\Pi_k$ of $I^{\mathbb N}$ such that $E_{\Pi,k} = \pi(E^\Pi_k)$ (this is done in (6.1) below). The proof of Theorem 2.3 is now divided into the following two parts. Part 1: we prove that the set $E^\Pi_k$ is comeagre in $I^{\mathbb N}$; this is done in this section (Sect. 6) and is the content of Proposition 6.6. Part 2: we use the comeagreness of $E^\Pi_k$ in $I^{\mathbb N}$ to prove that $E_{\Pi,k} = \pi(E^\Pi_k)$ is comeagre in $K = \pi(I^{\mathbb N})$; this is done in the next section (Sect. 7). We first recall and introduce various notation which will be used in this and the remaining sections of the paper. For $\mathbf i = i_1 i_2 \ldots \in I^{\mathbb N}$ and a positive integer n, recall that we write $\mathbf i|n = i_1 \ldots i_n$ for the truncation of $\mathbf i$ at the n'th place. Also, for $\mathbf l = l_1 \ldots l_n \in I^*$ and $\mathbf i = i_1 \ldots i_m \in I^*$, we write $\mathbf l\mathbf i = l_1 \ldots l_n i_1 \ldots i_m$ for the concatenation of $\mathbf l$ and $\mathbf i$. Similarly, for $\mathbf l = l_1 \ldots l_n \in I^*$ and $\mathbf i = i_1 i_2 \ldots \in I^{\mathbb N}$, we write $\mathbf l\mathbf i = l_1 \ldots l_n i_1 i_2 \ldots$ for the concatenation of $\mathbf l$ and $\mathbf i$. Also, if $\mathbf i \in I^n$ is a finite string with length equal to n, we write $|\mathbf i| = n$ for the length of $\mathbf i$. Finally, for $\mathbf l = l_1 \ldots l_n \in I^*$, we write $[\mathbf l]$ for the cylinder generated by $\mathbf l$, i.e.
$$[\mathbf l] = \big\{ \mathbf i \in I^{\mathbb N} \,\big|\, \mathbf i|n = \mathbf l \big\}.$$
Let $\Pi = (\Pi_n)_n$ be an averaging system. For a positive integer k, define the set $E^\Pi_k$ by
$$E^\Pi_k = \big\{ \mathbf i \in I^{\mathbb N} \,\big|\, \operatorname{acc}_n \boldsymbol\pi_{\Pi,k,n}(\mathbf i) = \Delta^{\mathrm{shift}}_k \big\}, \tag{6.1}$$
and note that $E_{\Pi,k} = \pi(E^\Pi_k)$, i.e. the set $E^\Pi_k$ is the "symbolic" counterpart of the set $E_{\Pi,k}$. We now turn towards the proof of the main result in this section, namely, Proposition 6.6 saying that $E^\Pi_k$ is comeagre in $I^{\mathbb N}$. We first note that since $I^{\mathbb N}$ is a Baire space it suffices to find a countable family of open and dense subsets of $I^{\mathbb N}$ whose intersection is contained in $E^\Pi_k$. Below we construct a family of subsets of $I^{\mathbb N}$ with these properties. It is clear that $\Delta^{\mathrm{shift,fin}}_k$ is a dense and separable subset of $\Delta^{\mathrm{shift}}_k$, and we can therefore choose a countable dense subset $Q_k$ of $\Delta^{\mathrm{shift,fin}}_k$. For $q \in Q_k$, $\varepsilon \in \mathbb Q^+$ and $m \in \mathbb N$, let
$$G_{q,\varepsilon,m} = \big\{ \mathbf i \in I^{\mathbb N} \,\big|\, \|\boldsymbol\pi_{\Pi,k,n}(\mathbf i) - q\|_1 < \varepsilon \text{ for some } n \ge m \big\}.$$
The family (G q,ε,m ) q∈Q k ,ε∈Q + ,m∈N is clearly a countable collection of subsets of I N . In order to show that E k is comeagre it therefore suffices to show that: (1) the intersection of the sets G q,ε,m is contained in E k , (2) the sets G q,ε,m are open in I N , and (3) the sets G q,ε,m are dense in I N ; these statements are the contents of Proposition 6.1, Proposition 6.2 and Proposition 6.5 below.
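The sets $G_{q,\varepsilon,m}$ can be illustrated on finite prefixes. The sketch below computes the vector of frequencies of all strings of length k among the first n positions and tests the defining inequality for the values of n that a given prefix allows. It uses the trivial averaging system (plain frequencies, no further averaging) and sparse dictionaries for the vectors; both choices, as well as the function names, are assumptions made for illustration.

```python
from collections import Counter

def block_frequencies(digits, k, n):
    """Sparse vector (pi_n(i; l))_l over strings l of length k.

    Position j (1 <= j <= n) contributes the string i_j ... i_{j+k-1},
    so the first n + k - 1 digits are needed."""
    if len(digits) < n + k - 1:
        raise ValueError("need at least n + k - 1 digits")
    counts = Counter(tuple(digits[j:j + k]) for j in range(n))
    return {l: c / n for l, c in counts.items()}

def l1_distance(p, q):
    """1-norm distance between two sparse vectors."""
    return sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in set(p) | set(q))

def in_G(digits, k, q, eps, m):
    """True if ||pi_{k,n}(i) - q||_1 < eps for some n >= m that the prefix allows."""
    n_max = len(digits) - k + 1
    return any(l1_distance(block_frequencies(digits, k, n), q) < eps
               for n in range(m, n_max + 1))

digits = [0, 1] * 10
q = {(0, 1): 0.5, (1, 0): 0.5}
print(in_G(digits, 2, q, 0.1, 4))
```

Because membership in $G_{q,\varepsilon,m}$ only asks for one suitable n ≥ m, a positive answer on a finite prefix is already conclusive, which is the intuition behind the openness statement proved below; a negative answer on a prefix is not.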
Proof Let Q k denote the closure of Q k in k with respect to the 1-norm. We clearly have q∈Q k ε∈Q + m∈N G q,ε,m = i ∈ I N acc n π π π ,k,n (i) = Q k . (6.2) However, since Q k is dense in shift k , we conclude that shift k ⊆ Q k , and it therefore follows from (6.2) and the fact that acc n π π π ,k,n (i) ⊆ shift k for all i ∈ I N (by Theorem 2.1), that q∈Q k ε∈Q + m∈N G q,ε,m ⊆ i ∈ I N acc n π π π ,k,n (i) = shift This completes the proof of Proposition 6.1.

Proposition 6.2 The set G q,ε,m is open in I N .
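Proof Let $\mathbf i \in G_{q,\varepsilon,m}$. We must now show that there is $\mathbf u \in I^*$ such that $\mathbf i \in [\mathbf u]$ and $[\mathbf u] \subseteq G_{q,\varepsilon,m}$. Since $\mathbf i \in G_{q,\varepsilon,m}$, there is an integer n with n ≥ m such that $\|\boldsymbol\pi_{\Pi,k,n}(\mathbf i) - q\|_1 < \varepsilon$. Write $s_n = \max\{ i \in \mathbb N \mid \pi_{n,i} \ne 0 \}$ and note that it follows from the definition of an averaging system that $s_n < \infty$. Let $\mathbf u = \mathbf i|(s_n + k - 1) \in I^*$. We will prove that $\mathbf i \in [\mathbf u]$ and $[\mathbf u] \subseteq G_{q,\varepsilon,m}$. Indeed, it is clear that $\mathbf i \in [\mathbf u]$. Next, we prove that
$$[\mathbf u] \subseteq G_{q,\varepsilon,m}. \tag{6.3}$$
To prove (6.3), let $\mathbf j \in [\mathbf u]$. Since $\mathbf i, \mathbf j \in [\mathbf u]$, the strings $\mathbf i$ and $\mathbf j$ agree in their first $s_n + k - 1$ digits, and we conclude that $\pi_i(\mathbf i;\mathbf l) = \pi_i(\mathbf j;\mathbf l)$ for all integers i with $1 \le i \le s_n$ and all $\mathbf l \in I^k$. This clearly implies that
$$\boldsymbol\pi_{\Pi,k,n}(\mathbf j) = \Big( \sum_{i=1}^{s_n} \pi_{n,i}\,\pi_i(\mathbf j;\mathbf l) \Big)_{\mathbf l\in I^k} = \Big( \sum_{i=1}^{s_n} \pi_{n,i}\,\pi_i(\mathbf i;\mathbf l) \Big)_{\mathbf l\in I^k} = \boldsymbol\pi_{\Pi,k,n}(\mathbf i),$$
whence $\|\boldsymbol\pi_{\Pi,k,n}(\mathbf j) - q\|_1 = \|\boldsymbol\pi_{\Pi,k,n}(\mathbf i) - q\|_1 < \varepsilon$, and so $\mathbf j \in G_{q,\varepsilon,m}$ because n ≥ m.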
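The key step of this proof, namely that the n'th averaged frequency vector depends only on the first $s_n + k - 1$ digits, can be checked directly. The sketch below evaluates $\sum_i \pi_{n,i}\,\pi_i(\mathbf i;\mathbf l)$ for one fixed n with finitely many non-zero weights; the weights, the digit strings and the function name are hypothetical.

```python
from collections import Counter

def averaged_block_vector(digits, k, weights):
    """Sparse vector (sum_i weights[i] * pi_i(digits; l))_l over strings l of length k.

    `weights` maps i -> pi_{n,i} for one fixed n; all listed weights are assumed
    non-zero, so s_n is the largest key."""
    s_n = max(weights)
    if len(digits) < s_n + k - 1:
        raise ValueError("need at least s_n + k - 1 digits")
    out = Counter()
    for i, w in sorted(weights.items()):
        counts = Counter(tuple(digits[j:j + k]) for j in range(i))
        for l, c in counts.items():
            out[l] += w * c / i
    return dict(out)

weights = {1: 0.2, 3: 0.3, 5: 0.5}       # hypothetical pi_{n,i}, so s_n = 5
k = 2
prefix = [3, 1, 4, 1, 5, 9]              # s_n + k - 1 = 6 common digits
seq_i = prefix + [2, 6, 5, 3]
seq_j = prefix + [7, 7, 7, 7]

vec_i = averaged_block_vector(seq_i, k, weights)
vec_j = averaged_block_vector(seq_j, k, weights)
print(vec_i == vec_j)
```

Since the two strings share the cylinder determined by `prefix`, the two vectors coincide, which is exactly why the whole cylinder $[\mathbf u]$ lies in $G_{q,\varepsilon,m}$ as soon as one of its points does.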
Finally, we must prove that G q,ε,m is dense in I N . In order to prove this we first prove the following auxiliary result, namely, if q ∈ shift,fin k , then there is a finite subset I q of I and i ∈ I N q such that π π π ,k,n (i) − q 1 → 0; this is the statement of Proposition 6.4. However, in order to prove Proposition 6.4 we need Theorem 6.
(2) If p ∈ shift N ,k , then Next, we can state and prove Proposition 6.4.

Proposition 6.4
Let q ∈ shift,fin k . Then there is a finite subset I q of I and i ∈ I N q such that π π π ,k,n (i) − q 1 → 0 .
Proof Write q = (q l ) l∈I N . Since there are only finitely many l ∈ I k such that q l > 0, we can find a finite subset I q = {i q,1 , . . . , i q,N } of I such that q l = 0 for l ∈ I k \ I k q . For l = l 1 . . . l k ∈ {0, 1, . . . , N − 1} k , writel =l 1 . . .l k ∈ I k q wherel j = i q,l j +1 , and letq = (q˜l) l∈{0,1,...,N −1} k . Since q = (q l ) l∈I k ∈ shift k and q l = 0 for l ∈ I k \ I k q , we conclude thatq ∈ shift N ,k , and it therefore follows immediately from Theorem 6.  +1 . We now claim that π π π n (i; k) − q 1 → 0 . (6.4) Indeed, since lim n N ,k,n (x) −q 1 = 0, we deduce that N ,k,n (x; l) → q˜l for all l ∈ {0, 1, . . . , N − 1} k . It follows immediately from this that π n (i; l) → q l for all l ∈ I k q , and the Consistency Condition therefore implies that i π n,i π i (i; l) → q l for all l ∈ I k q , whence Since I q is finite, we deduce from (6.5) that Also, since i ∈ I N q and q l = 0 for all l ∈ I k \ I k q , we conclude that π n (i; l) = q l = 0 for all l ∈ I k \ I k q and all n, and so ∀l ∈ I k \ I k q : i π n,i π i (i; l) − q l = 0 for all n, whence l∈I k \I k q i π n,i π i (i; l) − q l = 0 for all n, (6.7) Finally, it follows from (6.6) and (6.7) that π π π ,k,n (i)−q 1 = l∈I k | i π n,i π i (i; l)− q l | → 0. This completes the proof of (6.4).

Proposition 6.5
The set G q,ε,m is dense in I N .
Proof It follows from Proposition 6.4 that there is a finite subset I q of I and i ∈ I N q such that π π π ,k,n (i) − q 1 → 0. Now put U = ∪ u∈I * [ui]. We claim that U is dense in I N and that U ⊆ G q,ε,m . Indeed, it is clear that U is dense in I N . Next, we prove that U ⊆ G q,ε,m . In order to prove this, we let u ∈ I * . We must now prove that ui ∈ G q,ε,m . To prove this, we first prove the following two claims.
In particular, this implies that for all positive integers n, we have ∀l ∈ M k : π n (i; l) − π n (ui; l) ≤ π n (i; l) − π n+|u| (ui; l) + π n+|u| (ui; l) − π n (ui; l) ≤ (1 + π n+|u| (ui; l)) |u| n + (1 + π n+|u| (ui; l)) |u| n ≤ 4 |u| n (since π n+|u| (ui; l) ≤ 1), and so |π n (i; l) − π n (ui; l)| → 0. It follows from this and the Consistency Condition that Finally, since the set M k is finite, we therefore conclude that l∈M k i π n,i π i (i; l) − π i (ui; l) → 0 . (6.9) Next, we analyse the sum i∈I k \M k i π n,i π i (i; l) − π i (ui; l) . Since u ∈ M * and i ∈ I N q ⊆ M N , the strings u and i only contain letters from M, and it therefore follows that if l ∈ I k \ M k , then l cannot be a substring of u or i. In particular, we conclude that for all positive integers n, we have π n (i; l) = π n (ui; l) = 0 for all l ∈ I k \ M k , and so ∀l ∈ I k \ M k : i π n,i π i (i; l) − π i (ui; l) = 0 for all n.
This clearly implies that i∈I k \M k i π n,i π i (i; l) − π i (ui; l) = 0 for all n. (6.10) The statement in Claim 1 now follows immediately from combining (6.8), (6.9) and (6.10). This completes the proof of Claim 1.
We can now prove the main result in this section, namely, that the set E k is comeagre in I N .

Proposition 6.6 The set $E^\Pi_k$ is comeagre in $I^{\mathbb N}$.
Proof It follows immediately from Proposition 6.1, Proposition 6.2 and Proposition 6.5 that the family $(G_{q,\varepsilon,m})_{q\in Q_k,\,\varepsilon\in\mathbb Q^+,\,m\in\mathbb N}$ is a countable collection of open and dense sets whose intersection is contained in $E^\Pi_k$. We conclude immediately from this that $E^\Pi_k$ is comeagre.

Proof of Theorem 2.3: Part 2
The main purpose of this section is to prove Theorem 2.3 using Proposition 6.6. We first prove the following three auxiliary results. The first lemma, i.e. Lemma 7.1, is well-known if the index set I is finite, see, for example [8]; however, we have been unable to find a proof in the general case, and we have therefore decided to include the simple and brief proof. Recall, that K = π(I N ) denotes the self-similar set associated with the IIFS (X , (S i ) i∈I ). Proof Let i ∈ I N . We will now prove that π(i) ∈ V . The continuity of S i implies that It follows from this that (S i|n (V )) n is a decreasing sequence of non-empty compact sets with diam(S i|n (V )) → 0, and the intersection ∩ n S i|n (V ) is therefore a singleton. However, since V ⊆ X , we conclude that ∩ n S i|n (V ) ⊆ ∩ n S i|n (X ) = {π(i)}. It follows immediately from this that {π(i)} = ∩ n S i|n (V ), and so π(i) ∈ ∩ n S i|n (V ) ⊆ V . Also, since x ∈ π(H ), we can find m ∈ I * with |m| = |l| and s ∈ I N such that ms ∈ H and x = π(ms) .
Since |m| = |l| and m = l, we conclude that S l (V ) ∩ S m (V ) = ∅. Next, recall that the maps S i for i ∈ I are assumed to be injective, and since the maps S i are also continuous (because they are Lipschitz), we conclude from the Invariance of However, we also clearly have (since, by assumption, S k (K ) ⊆ V ) and (since K ⊆ V by Lemma 7.1) 3) The desired contradiction now follows from (7.1), (7.2) and (7.3).

Proposition 7.3
Let (X , (S i ) i∈I ) be an IIFS. Assume that (X , (S i ) i∈I ) satisfies the SOSC and that the map π is closed. If M ⊆ I N is meagre in I N , then π(M) is meagre in K = π(I N ).
Proof Since M ⊆ I N is meagre in I N , we can find countably many subsets N 1 , N 2 In particular, this shows that ∩ m S w|m (K ) ⊆ ∩ m S w|m (X ) = {π(w)} ⊆ V , and since V is open and lim m diam(S w|m (K )) = 0, we therefore conclude that there is a positive integer m such that S w|m (K ) ⊆ V . Putting k = w|m ∈ I * , we therefore deduce that It now follows from (7.6), (7.7) and Lemma 7.2 that However, this clearly contradicts the fact that π([ijk]) ⊆ π([i]) ⊆ π(N n ). This completes the proof of Claim 1. Claim 2. Let n be a positive integer. Then π(N n ) is nowhere dense in K , i.e. π(N n ) • =

∅.
Proof of Claim 2. Assume, in order to reach a contradiction, that π(N n ) • = ∅. This implies that π −1 (π(N n ) • ) = ∅. It follows from this and the fact that is open that we can choose a string i ∈ I * such that However, since the map π is assumed to be closed, we deduce that π(N n ) is closed, i.e. π(N n ) = π(N n ). Since also π(N n ) ⊆ π(N n ), we therefore conclude that The desired contradiction now follows by comparing (7.5) and (7.10). This completes the proof of Claim 2.
We can now prove that π(M) is meagre in K. Indeed, it follows from (7.4) that $\pi(M) \subseteq \bigcup_n \pi(N_n)$, and since Claim 2 shows that $\pi(N_n)$ is nowhere dense in K for all positive integers n, we conclude that π(M) is meagre in K.
We can now prove Theorem 2.3.

Proof of Theorem 2.3
It follows from Proposition 6.6 that the set I N \ E k is meagre in I N and Proposition 7.3 therefore shows that the set π(

Proof of Theorem 2.4
The purpose of this section is to prove Theorem 2.4. We first prove several auxiliary results. Recall that dim P denotes the packing dimension.
Proof For F ⊆ R d , let dim B (F) denote the upper box-dimension of F; the reader is referred to [8] for the definition of the upper box-dimension dim B . Also, recall that if F ⊆ R d , then see, for example, [8]. Since dim P (E) < dim P (X ) it therefore follows that we can find countably many subsets Since E ⊆ X and E ⊆ ∪ n E n , we immediately conclude that Next, we prove the following statement: the set X ∩ E n is nowhere dense in X for all n.
We will now prove (8.2). Assume in order to reach a contradiction that the statement in (8.2) is not true. We can therefore find an integer n such that X ∩ E n is not nowhere dense in X . This implies that there is an open subset U of R d such that X ∩U = ∅ and X ∩ E n ∩ U is dense in X ∩ U . In particular, we deduce that X ∩ U ⊆ X ∩ E n ∩ U , whence in the first line of (8.3) we have used the fact that if F ⊆ R d , then dim B (F) = dim B (F), and in the fourth line of (8.3) we have used the fact that if F ⊆ R d , then dim B (F) ≥ dim P (F). However, since sup m dim B (E m ) < dim P (X ), we also conclude that The desired contradiction now follows from (8.3) and (8.4). This completes the proof of (8.2). Finally, it follows immediately from (8.1) and (8.2) that E is meagre in X . Proof It is clear that dim P (K ∩ U ) ≤ dim P (K ). Next, we prove the reverse inequality. Because K ∩ U = ∅, we can choose a string i ∈ I N with π(i) ∈ U . In particular, we have ∩ n S i|n (K ) ⊆ ∩ n S i|n (X ) = {π(i)} ⊆ U , and since U is open and lim n diam(S i|n (K )) = 0, it therefore follows that we can find a positive integer n with S i|n (K ) ⊆ U , whence S i|n (K ) ⊆ K ∩ U . Also, since S i|n is bi-Lipschitz, we conclude that dim P (S i|n (K )) = dim P (K ), and it follows from this and the inclusion S i|n (K ) ⊆ K ∩ U that dim P (K ) = dim P (S i|n (K )) ≤ dim P (K ∩ U ).

Lemma 8.3
Let (X , (S i ) i∈I ) be an IIFS. Assume that the maps S i are bi-Lipschitz for all i. Let E ⊆ K with dim P (E) < dim P (K ). Then E is meagre in K .
Proof This follows immediately from Lemmas 8.1 and 8.2.

Proposition 8.4
Let X and Y be metric spaces. Let f : X → Y be a function and assume that: (1) f is surjective; (2) f is continuous; (3) f is closed; (4) if U is a non-empty and open subset of X, then $f(X \setminus U) \ne Y$.
If X is not meagre, then Y is not meagre.
Proof Assume, in order to reach a contradiction, that the space Y is meagre. We can therefore choose countably many nowhere dense subsets $N_1, N_2, \ldots$ of Y such that $Y = \bigcup_n N_n$. Since $Y = \bigcup_n N_n$ and $X = f^{-1}(Y)$, we immediately conclude that
$$X = \bigcup_n f^{-1}(N_n). \tag{8.5}$$
Next, we prove the following claim:
$$\text{the set } f^{-1}(N_n) \text{ is nowhere dense in } X \text{ for all } n. \tag{8.6}$$
In order to prove this, we let $x \in X$ and r > 0. We must now show that there is $x_0 \in X$ and $r_0 > 0$ such that $B(x_0, r_0) \subseteq B(x, r) \setminus f^{-1}(N_n)$. Since the set $X \setminus B(x, r)$ is closed and the function f : X → Y is closed, we conclude that the image $f(X \setminus B(x, r))$ is closed and the complement $Y \setminus f(X \setminus B(x, r))$ is therefore open. Also, since B(x, r) is open and non-empty, it follows from assumption (4) on f that $f(X \setminus B(x, r)) \ne Y$. Consequently, $Y \setminus f(X \setminus B(x, r))$ is an open and non-empty subset of Y. In particular, it follows from this and the fact that $N_n$ is nowhere dense that there is a point $y_0 \in Y$ and a positive real number $s_0 > 0$ such that
$$B(y_0, s_0) \subseteq Y \setminus \big( f(X \setminus B(x, r)) \cup N_n \big). \tag{8.7}$$
By the surjectivity of f, we can find a point $x_0 \in X$ with $f(x_0) = y_0$, and since f is continuous we therefore conclude that there is a positive real number $r_0 > 0$ such that $B(x_0, r_0) \subseteq f^{-1}(B(y_0, s_0))$. We will now prove that
$$B(x_0, r_0) \subseteq B(x, r) \setminus f^{-1}(N_n). \tag{8.8}$$
Indeed, if $z \in B(x_0, r_0)$, then $z \in B(x_0, r_0) \subseteq f^{-1}(B(y_0, s_0))$, and it therefore follows from (8.7) that $f(z) \in B(y_0, s_0) \subseteq Y \setminus ( f(X \setminus B(x, r)) \cup N_n )$, and so
$$f(z) \notin f(X \setminus B(x, r)), \tag{8.9}$$
$$f(z) \notin N_n. \tag{8.10}$$
We conclude from (8.9) that $z \notin X \setminus B(x, r)$, whence
$$z \in B(x, r), \tag{8.11}$$
and we conclude from (8.10) that
$$z \notin f^{-1}(N_n). \tag{8.12}$$
It now follows from (8.11) and (8.12) that $z \in B(x, r) \setminus f^{-1}(N_n)$. This completes the proof of (8.8) and hence the proof of (8.6). Finally, it follows immediately from (8.5) and (8.6) that X is meagre, contradicting the fact that X was assumed not to be meagre.

Lemma 8.5
Let $(X, (S_i)_{i\in I})$ be an IIFS. Assume that $(X, (S_i)_{i\in I})$ satisfies the SOSC and that the map π is closed. If U is a non-empty and open subset of $I^{\mathbb N}$, then $\pi(I^{\mathbb N} \setminus U) \ne K$.
Proof Since K satisfies the SOSC there is a non-empty, open and bounded set V with $K \cap V \ne \emptyset$ such that $S_i(V) \subseteq V$ for all i and $S_i(V) \cap S_j(V) = \emptyset$ for all i, j with $i \ne j$. Because $K \cap V \ne \emptyset$, there is $\mathbf w \in I^{\mathbb N}$ such that $\pi(\mathbf w) \in V$. In particular, this shows that $\bigcap_m S_{\mathbf w|m}(K) \subseteq \bigcap_m S_{\mathbf w|m}(X) = \{\pi(\mathbf w)\} \subseteq V$, and since V is open and $\lim_m \operatorname{diam}(S_{\mathbf w|m}(K)) = 0$, we therefore conclude that there is a positive integer m such that $S_{\mathbf w|m}(K) \subseteq V$. Putting $\mathbf k = \mathbf w|m \in I^*$, we therefore deduce that This completes the proof of (8.14).

Lemma 8.6
Let (X , (S i ) i∈I ) be an IIFS. Assume that (X , (S i ) i∈I ) satisfies the SOSC and that the map π is closed. Then K is not meagre in K .
Proof Note that the map $\pi : I^{\mathbb N} \to K$ is surjective, continuous and closed, and that it follows from Lemma 8.5 that if U is a non-empty and open subset of $I^{\mathbb N}$, then $\pi(I^{\mathbb N} \setminus U) \ne K$. Since $I^{\mathbb N}$ is not meagre (for example, because the space $I^{\mathbb N}$ is complete), we therefore conclude from Proposition 8.4 that K is not meagre.
We can now prove Theorem 2.4.

Proof of Theorem 2.4
Assume, in order to reach a contradiction, that $\dim_P(E_\Pi) < \dim_P(K)$. It follows from this and Lemma 8.3 that $E_\Pi$ is meagre in K, and since it also follows from Theorem 2.3 that $K \setminus E_\Pi$ is meagre in K, we conclude that $K = E_\Pi \cup (K \setminus E_\Pi)$ is meagre in K. However, this contradicts the fact that K is not meagre by Lemma 8.6.