1 Introduction and Main Results

The study of potential functions \(\psi \) over an expanding dynamical system \((X,T)\) and the corresponding equilibrium measures has a long and rich history; for a few classical references relevant to this work compare [7, 19, 21]. If the potential function \(\psi \) is sufficiently regular, the full strength of the thermodynamic formalism is applicable. Using standard results in multifractal analysis, this yields a detailed description of both the Birkhoff averages of the potential function and of the local dimensions of the equilibrium measure. More precisely, one considers

$$\begin{aligned} b_{\psi }(x) = \lim _{n \rightarrow \infty } \frac{1}{n} S_n \psi (x), \quad \text{ with } \quad S_n \psi (x) = \sum _{m=0}^{n-1} \psi (T^m x), \end{aligned}$$

and the corresponding dimension spectrum, which is given by the Hausdorff dimension of the corresponding level sets,

$$\begin{aligned} f_{\psi }(\beta ) = \dim _H \{x \in X: b_{\psi }(x) = \beta \}. \end{aligned}$$

If \(\psi \) is Hölder continuous (and the dynamical system is sufficiently nice), the dimension spectrum \(f_{\psi }\) is known to be given by a concave real analytic function, supported on a finite interval, outside of which the level sets are empty [19]. In such a situation, the local dimension

$$\begin{aligned} d_{\mu }(x) = \lim _{r\rightarrow 0} \frac{\log \mu (B_r(x))}{\log (r)} \end{aligned}$$

of the unique equilibrium measure \(\mu \) coincides with the Birkhoff average \(b_{\psi }(x)\) up to a constant (whenever any of the limits exists). A multifractal analysis of \(d_{\mu }\) is therefore obtained along the same lines.
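
For readers who prefer to experiment, the following minimal Python sketch (an illustration of ours, not part of the formal development) approximates a Birkhoff average for the doubling map \(T :x \mapsto 2x \mod 1\). Orbits are realized on binary digit strings, on which T acts as the shift, to avoid the loss of precision of naive floating-point iteration; the potential and the sample point are arbitrary illustrative choices.

```python
import math, random

def birkhoff_average(digits, psi, n, prec=60):
    """Approximate (1/n) S_n psi(x) for the doubling map, where x is
    encoded by its binary digits and T acts as the shift."""
    total = 0.0
    for m in range(n):
        tail = digits[m:m + prec]                     # digits of T^m x
        x = sum(d / 2.0 ** (i + 1) for i, d in enumerate(tail))
        total += psi(x)
    return total / n

random.seed(0)
digits = [random.randint(0, 1) for _ in range(10_000)]  # a Lebesgue-typical x
psi = lambda x: math.cos(2 * math.pi * x)               # smooth test potential
# for Lebesgue-a.e. x the average tends to the integral of psi, here 0:
print(birkhoff_average(digits, psi, 5000))
```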

Over the last decades, similar results have been established under less restrictive regularity assumptions. At the same time, the study of singular (or unbounded) potentials has gained increased attention. In the presence of a singularity, the dimension spectra can be positive on a half-line and the points with infinite Birkhoff averages (or infinite local dimensions of the equilibrium measure) may have full Hausdorff dimension. In this case, a more complete understanding can be obtained by renormalizing the Birkhoff sums (or the measure decay on shrinking balls) with a more quickly increasing function. This was studied for the specific case of the Saint-Petersburg potential in [15] and in the context of continued fraction expansions; see for example [10, 16].

In this note, we contribute to the study of singular potentials and their equilibrium measures via a case study of the Thue–Morse (TM) measure. This measure was one of the first examples of a singular continuous measure, exhibited by Mahler almost a century ago [18]. To this day, it is of interest in number theory and the study of substitution dynamical systems and continues to be the object of active research—compare the review [20] for a collection of recent results and open questions. It can be written as an infinite Riesz product on the torus \(\mathbb {T}\) (identified with the unit interval) via

$$\begin{aligned} \mu _{{\text {TM}}} = \prod _{m=0}^{\infty } \bigl ( 1 - \cos (2\pi 2^m x) \bigr ), \end{aligned}$$

to be understood as a weak limit of absolutely continuous probability measures. The TM-measure falls into the class of g-measures [14], most recently renamed “Doeblin measures” in [4], giving credit to the pioneering role of Doeblin and Fortet [9]. This class of measures played an important role in fueling the development of the thermodynamic formalism, largely due to the contributions by Walters [22, 23] and Ledrappier [17]. The term “g-measure” is related to the observation that \(\mu _{{\text {TM}}}\) can be constructed by tracing a (normalized) function \(\widetilde{g}\), in this case given by

$$\begin{aligned} \widetilde{g} :\mathbb {T}\rightarrow [0,1], \quad \widetilde{g}(x) = \frac{1}{2} (1 - \cos (2 \pi x)), \end{aligned}$$

along the doubling map \(T :x \mapsto 2x \mod 1\); see Sect. 2 for details and a formal definition of the term g-measure in our setting.
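
Numerically, the weak limit above can be probed through its partial products. The short sketch below (ours, assuming numpy) checks that each partial Riesz product is a probability density and hints at the very fast decay of mass near the origin that is quantified later; the grid size is chosen large enough that the mean over the grid recovers the constant Fourier coefficient of the trigonometric polynomial exactly (up to rounding).

```python
import numpy as np

def riesz_density(M, N=1 << 16):
    """Density of the M-th partial Riesz product on N grid points in [0,1)."""
    x = np.arange(N) / N
    rho = np.ones(N)
    for m in range(M):
        rho *= 1.0 - np.cos(2.0 * np.pi * 2**m * x)
    return x, rho

x, rho = riesz_density(8)
print(rho.mean())                    # ~1.0: each partial product is a density
# approximate mass of the cylinder [0, 2^-10): far below its length 2^-10
print(rho[x < 2.0 ** -10].sum() / len(x))
```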

The doubling map \((\mathbb {T},T)\) is closely related to the full shift \((\mathbb {X},\sigma )\), with \(\mathbb {X} = \{0,1\}^{{\mathbb {N}}}\) and \(\sigma (x)_n = x_{n+1}\) via the (inverse) binary representation \(\pi _2 :(x_n)_{n \in {{\mathbb {N}}}} \mapsto \sum _{n=1}^{\infty } x_n 2^{-n}\), which semi-conjugates the action of \(\sigma \) and T. The map \(\pi _2\) is 2-to-1 on the set \(\mathcal D\) of sequences that are eventually constant (preimages of dyadic rationals), and 1-to-1 everywhere else. Since the dyadic rationals are countable and hence a nullset of \(\mu _{{\text {TM}}}\), we can uniquely lift \(\mu _{{\text {TM}}}\) to a measure \(\mu \) on \(\mathbb {X}\) satisfying

$$\begin{aligned} \mu _{{\text {TM}}} = \mu \circ \pi _2^{-1}. \end{aligned}$$

We adopt a standard choice for the metric on \(\mathbb {X}\), given by \(d(x,y) = 2^{-k+1}\) whenever k is the smallest integer with \(x_k \ne y_k\). We also employ, for every finite word \(w \in \{0,1\}^n\) and \(n \in {{\mathbb {N}}}\), the cylinder set notation \( [w] = \{x \in \mathbb {X}: x_{1} \cdots x_n = w_1 \cdots w_n \}. \) The choice to work with \((\mathbb {X},\sigma )\) instead of \((\mathbb {T},T)\) is purely conventional and mostly made for the sake of a simpler exposition. All of the results presented in this section hold just the same over the torus, and the proofs work in the same way with a few minor adaptations.

The close relation between \(\mu \) and \(\widetilde{g}\) alluded to earlier persists in a thermodynamic description of \(\mu \). Indeed, due to a classical result by Ledrappier [17], \(\mu \) can alternatively be characterized as the unique equilibrium measure of the potential function

$$\begin{aligned} \psi :\mathbb {X} \rightarrow [-\infty ,\infty ), \quad x \mapsto \log \widetilde{g}(\pi _2(x)), \end{aligned}$$

which has a singularity at the preimages of the origin, \(x = 0^{\infty }\) and \(x=1^{\infty }\). A multifractal analysis for the Birkhoff averages \(b_{\psi }\) and the local dimensions \(d_{\mu }\) was performed in [1, 11]. There it was shown in particular that the level sets

$$\begin{aligned} \bigl \{ x \in \mathbb {X}: d_{\mu }(x) = \alpha \bigr \}, \quad \bigl \{ x \in \mathbb {X}: b_{\psi }(x) = - \log (2) \alpha \bigr \} \end{aligned}$$

have full Hausdorff dimension as soon as \(\alpha \geqslant 2\). This supports the idea that a superpolynomial scaling of the TM measure (and a superlinear growth of the Birkhoff sums) is in some sense typical. We pursue this idea in the following.

Since the ball of radius \(2^{-n}\) around \(x \in \mathbb {X} \) is given by \(C_n(x):=[x_1\cdots x_n]\), we may also write the local dimension of the measure \(\mu \) as

$$\begin{aligned} d_{\mu }(x) = \lim _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{ - n \log 2}, \end{aligned}$$

provided that the limit exists. The equilibrium state can be expected to avoid the singularities at the preimages of the origin (which are also fixed points of the dynamics). It is therefore reasonable to expect the fastest possible decay rate for \(\mu \) at these positions. Given \(\pi _2(x) = 0\), it was already observed in [12] (for more refined estimates see also [2, 3]) that

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{- n^2 \log 2} = 1. \end{aligned}$$

The same conclusion holds in fact for \(x\in \mathcal D\), the preimages of dyadic rationals [13] (and no other points, as we will see below). However, this is a countable set of vanishing Hausdorff dimension. It seems natural to inquire if sets of non-trivial Hausdorff dimension occur if \(n^2\) is replaced by a different scaling function.

When it comes to the Birkhoff sums, choosing \(x \in \mathcal D\) immediately gives \(S_n \psi (x) = - \infty \) for large enough n, so we will not get a finite result for any scaling function. However, as long as \(x \notin \mathcal D\), we will obtain

$$\begin{aligned} \liminf _{n \rightarrow \infty } \frac{-S_n \psi (x)}{n^2 \log 2} \leqslant 1, \end{aligned}$$

and in this sense the fastest possible scaling for \(S_n \psi \) is also given by \(n^2\). We may interpolate between the linear and quadratic scaling via the scaling function \(n^{\gamma }\) for some \(\gamma \in (1,2)\). It turns out that the points with such an intermediate scaling have full Hausdorff dimension.

Theorem 1.1

For each \(\gamma \in (1,2)\) and \(\alpha \geqslant 0\), the level sets

$$\begin{aligned} \biggl \{ x \in \mathbb {X}: \lim _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{-n^{\gamma } \log 2} = \alpha \biggr \}, \quad \biggl \{ x \in \mathbb {X}: \lim _{n \rightarrow \infty } \frac{-S_n \psi (x)}{n^{\gamma } \log 2} = \alpha \biggr \} \end{aligned}$$

have Hausdorff dimension 1.

In this sense, \(n^2\) is the critical scaling, at least for phenomena that can be distinguished via Hausdorff dimension. We will therefore focus on accumulation points for this particular scaling in the following.

Although the relation between \(S_n \psi (x)\) and \(\mu (C_n(x))\) is not as simple as in the Hölder continuous case, their asymptotic behavior is still closely related. In fact, both expressions can be controlled via an appropriate recoding of \(x \in \mathbb {X}\). As long as \(x \notin \mathcal D\), its binary representation can be uniquely written in an alternating form as \( x = a^{n_1} b^{n_2} a^{n_3} b^{n_4} \ldots , \) where \(a,b \in \{0,1\}\) with \(a\ne b\) and \(n_i \in {{\mathbb {N}}}\) for all \(i \in {{\mathbb {N}}}\). With this notation, the alternation coding is a map \(\tau :\mathbb {X}{\setminus } \mathcal D \rightarrow {{\mathbb {N}}}^{{{\mathbb {N}}}}\), given by

$$\begin{aligned} \tau :a^{n_1} b^{n_2} a^{n_3} b^{n_4} \ldots \mapsto n_1 n_2 n_3 n_4\ldots . \end{aligned}$$

Given \(x \in \mathbb {X} {\setminus } \mathcal D\) with \(\tau (x) = (n_i)_{i \in {{\mathbb {N}}}}\), we define

$$\begin{aligned} F_m(x) = \frac{1}{N_m(x)^2} \sum _{i=1}^m n_i^2, \quad N_m(x) = \sum _{i=1}^m n_i, \end{aligned}$$

for all \(m \in {{\mathbb {N}}}\). For notational convenience, we also set \(\overline{F}(x) = \limsup _{m \rightarrow \infty } F_m(x)\) and \(\underline{F}(x) = \liminf _{m \rightarrow \infty } F_m(x)\).
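
Both the alternation coding and the sequence \((F_m)_{m \in {{\mathbb {N}}}}\) are straightforward to compute. The following sketch (ours, for illustration only) extracts \(\tau (x)\) from an initial segment of the binary expansion and evaluates \(F_m\); for the sample point with \(n_i = i\) one gets \(N_m \sim m^2/2\) and \(\sum _{i \leqslant m} n_i^2 \sim m^3/3\), so that \(F_m \sim 4/(3m) \rightarrow 0\).

```python
def alternation_coding(digits):
    """tau(x): run lengths of the maximal constant blocks of a 0-1 string
    (the final, possibly truncated, run is discarded)."""
    runs, count = [], 1
    for a, b in zip(digits, digits[1:]):
        if a == b:
            count += 1
        else:
            runs.append(count)
            count = 1
    return runs

def F_sequence(runs):
    """F_m = (sum_{i<=m} n_i^2) / N_m^2 for m = 1, ..., len(runs)."""
    Fs, squares, N = [], 0, 0
    for n_i in runs:
        squares += n_i * n_i
        N += n_i
        Fs.append(squares / N**2)
    return Fs

# sample point with blocks of increasing length 1, 2, 3, ...:
digits = [d for i in range(1, 200) for d in [i % 2] * i]
runs = alternation_coding(digits)
print(runs[:6])               # [1, 2, 3, 4, 5, 6]
print(F_sequence(runs)[-1])   # ~4/(3m) with m = 198: small
```

The role of this sequence of functions is clarified by the following result.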

Proposition 1.2

Given \(x \in \mathbb {X} {\setminus } \mathcal D\), let \(\underline{F}(x) = \alpha \) and \(\overline{F}(x) = \beta \). Then,

$$\begin{aligned} \liminf _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{- n^2 \log 2} = \frac{\alpha }{1+\alpha }, \quad \limsup _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{- n^2 \log 2} = \beta , \end{aligned}$$

and

$$\begin{aligned} \liminf _{n \rightarrow \infty } \frac{- S_n \psi (x)}{n^2 \log 2} = \alpha , \quad \limsup _{n \rightarrow \infty } \frac{- S_n \psi (x)}{n^2 \log 2} = \frac{\beta }{1-\beta }. \end{aligned}$$

This has the following remarkable consequence.

Corollary 1.3

Whenever the sequence \(\log \mu (C_n(x))/n^2\) has a non-trivial accumulation point (\(\ne 0\)), the accumulation points form in fact an interval of strictly positive length. The same conclusion holds for the sequence \(S_n \psi (x)/n^2\).

Also, we immediately obtain a gap result for dyadic vs non-dyadic points.

Corollary 1.4

If \(x \in \mathcal D\), then \(S_n \psi (x) = -\infty \) for large enough n, and

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{n^2} = - \log 2. \end{aligned}$$

In contrast, if \(x \in \mathbb {X} \setminus \mathcal D\), then

$$\begin{aligned} \limsup _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{n^2} \geqslant - \frac{1}{2} \log 2, \quad \limsup _{n \rightarrow \infty } \frac{S_n \psi (x)}{n^2} \geqslant -1. \end{aligned}$$

Due to the pointwise relation in Proposition 1.2, it suffices to focus on the accumulation points of \((F_m)_{m \in {{\mathbb {N}}}}\). These can be analysed via the joint (dimension) spectrum of \(\underline{F}\) and \(\overline{F}\), given by

$$\begin{aligned} (\alpha ,\beta ) \mapsto \dim _H\{ x: \underline{F}(x) = \alpha , \overline{F}(x) = \beta \}, \end{aligned}$$

for \((\alpha ,\beta ) \in {{\mathbb {R}}}^2\). More generally, we calculate the Hausdorff dimension of

$$\begin{aligned} \{ (\underline{F}, \overline{F}) \in S \}: = \{ x \in \mathbb {X} \setminus \mathcal D: (\underline{F}(x), \overline{F}(x)) \in S \}, \end{aligned}$$

for every subset \(S \subset {{\mathbb {R}}}^2\). Since all accumulation points of \((F_m)_{m \in {{\mathbb {N}}}}\) are in [0, 1], the pair \((\underline{F},\overline{F})\) is certainly contained in

$$\begin{aligned} \Delta := \{ (\alpha ,\beta ) \in [0,1]^2: \alpha \leqslant \beta \}. \end{aligned}$$

It therefore suffices to consider sets \(S \subset \Delta \). We show that the joint spectrum is given by a function \(f :\Delta \rightarrow [0,1]\), defined on \(\Delta {\setminus } \{(0,0) \}\) as

$$\begin{aligned} f(\alpha ,\beta ):= \frac{\sqrt{\alpha \beta + \beta - \alpha } - \beta }{\sqrt{\alpha \beta + \beta - \alpha } + \sqrt{\alpha \beta }}, \end{aligned}$$
(1)

see Fig. 1 for an illustration. A continuous extension of f to \(\Delta \) is not possible, since f can take arbitrary values in [0, 1] as we approach the origin from different directions. We define \(f(0,0):=1\), which is the most appropriate choice for our application below.

Fig. 1: The function \(f:\Delta \rightarrow [0,1]\)
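
The following short computation (ours) evaluates f and illustrates both the boundary values listed in Proposition 1.6 below and the failure of continuity at the origin: along the curves \(\alpha = \beta - v\beta ^2\), which approach (0, 0) tangentially to the diagonal, the value of f tends to \((\sqrt{1+v}-1)/(\sqrt{1+v}+1)\), which sweeps out [0, 1) as \(v \geqslant 0\) varies; the choice of these particular curves is ours, for illustration.

```python
import math

def f(alpha, beta):
    """The joint spectrum f(alpha, beta) of Eq. (1), on Delta minus the origin."""
    root = math.sqrt(alpha * beta + beta - alpha)
    return (root - beta) / (root + math.sqrt(alpha * beta))

print(f(0.0, 0.25))                 # 1 - sqrt(beta) = 0.5
print(f(0.3, 0.3), f(0.3, 1.0))     # 0.0 on both boundary pieces
# approach the origin along alpha = beta - v * beta^2:
for v in (0, 3, 8, 24):             # limits 0, 1/3, 1/2, 2/3
    beta = 1e-8
    print(v, f(beta - v * beta**2, beta))
```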

Theorem 1.5

Let \(S \subset \Delta \). Then,

$$\begin{aligned} \dim _H \{(\underline{F},\overline{F}) \in S \} = \sup \{ f(\alpha ,\beta ): (\alpha ,\beta ) \in S\}. \end{aligned}$$

In particular, \(\dim _H\{\underline{F} = \alpha ,\overline{F} = \beta \} = f(\alpha ,\beta )\) for all \((\alpha ,\beta ) \in \Delta \).

Because of its central role, we detail some properties of the function f below (without proof), which may be verified using standard tools from analysis. We describe the values of f on the boundary of \(\Delta \) in the first two items and proceed to monotonicity properties thereafter.

Proposition 1.6

The function \(f :\Delta \rightarrow [0,1]\) has the following properties.

(1) \(f(\beta ,\beta ) = 0 = f(\alpha ,1)\) for all \(\beta \in (0,1]\) and \(\alpha \in [0,1]\).

(2) \(f(0,\beta ) = 1 - \sqrt{\beta }\) for all \(\beta \in [0,1]\).

(3) \(f(\alpha ,\beta ) > 0\) for all \((\alpha ,\beta )\) in the interior of \(\Delta \).

(4) The map \(\alpha \mapsto f(\alpha ,\beta )\) is decreasing in \(\alpha \) for all \(\beta \).

(5) For every \(\alpha \in (0,1)\), there is a value \(\alpha ^*\) with \(\alpha< \alpha ^* < 1\) such that \(\beta \mapsto f(\alpha ,\beta )\) is strictly increasing on \((\alpha ,\alpha ^*)\), takes its maximum at \(\beta = \alpha ^*\) and is strictly decreasing on \((\alpha ^*,1)\).

The last property in Proposition 1.6 is especially remarkable, as it shows that, for a fixed value \(\underline{F} \in (0,1)\), most points (in the sense of Hausdorff dimension) achieve a value of \(\overline{F}\) that lies strictly between \(\underline{F}\) and 1. Due to Proposition 1.2, Theorem 1.5 can also be interpreted in terms of the sequences \(\log \mu (C_n(x))/n^2\) and \(S_n \psi (x)/n^2\).

Corollary 1.7

We have

$$\begin{aligned} f\Bigl (\frac{\alpha }{1-\alpha }, \beta \Bigr )&= \dim _H \left\{ x \in \mathbb {X}: \liminf _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{- n^2 \log 2} = \alpha , \quad \limsup _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{- n^2 \log 2} = \beta \right\} , \\ f \Bigl (\alpha , \frac{\beta }{1+\beta } \Bigr )&= \dim _H \left\{ x \in \mathbb {X}: \liminf _{n \rightarrow \infty } \frac{- S_n \psi (x)}{n^2 \log 2} = \alpha , \quad \limsup _{n \rightarrow \infty } \frac{- S_n \psi (x)}{n^2 \log 2} = \beta \right\} , \end{aligned}$$

if the argument \(\big (\frac{\alpha }{1-\alpha },\beta \big )\) (respectively \(\big (\alpha , \frac{\beta }{1+\beta }\big ))\) is in \(\Delta \). Otherwise, the level set is empty.

In particular, the non-triviality of the joint spectrum of the \(\limsup \) and the \(\liminf \) persists. Let us also point out that the condition \(\big (\frac{\alpha }{1-\alpha },\beta \big ) \in \Delta \) requires \(0\leqslant \alpha \leqslant 1/2\), and \(\big (\alpha , \frac{\beta }{1+\beta }\big ) \in \Delta \) allows for arbitrarily large values of \(\beta \in \mathbb {R}_+\). We single out two more consequences for the reader’s convenience.

Corollary 1.8

Given \(\beta \in [0,1]\), we have

$$\begin{aligned} \dim _H \biggl \{ x \in \mathbb {X}: \limsup _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{-n^2 \log 2} = \beta \biggr \} = 1 - \sqrt{\beta }. \end{aligned}$$

Corollary 1.9

The set of points \(x \in \mathbb {X}\) with \(\liminf _{n \rightarrow \infty } S_n \psi (x)/n^2 = - r\) has positive Hausdorff dimension if \(r \in [0,\infty )\) and vanishing Hausdorff dimension if \(r= \infty \).

2 Estimates for Birkhoff Sums and Measure Decay

We begin with a few preliminaries on notation and basic concepts. Given two (real-valued) sequences \((f_m)_{m \in {{\mathbb {N}}}}\) and \((g_m)_{m \in {{\mathbb {N}}}}\), we write \(f_m \sim g_m\) if \(f_m/g_m \rightarrow 1\) as \(m \rightarrow \infty \). Similarly, \(f_m = o(g_m)\) if \(f_m/g_m \rightarrow 0\) and \(f_m = O(g_m)\) if \(f_m/g_m\) is bounded as \(m \rightarrow \infty \).

Every Borel probability measure \(\nu \) on \(\mathbb {X}\) may also be regarded as a linear functional on the space of continuous functions \(C(\mathbb {X})\). This motivates the notation \(\nu (f):= \int f \,\textrm{d}\nu \) for \(f \in C(\mathbb {X})\), which we sometimes extend to \(\nu \)-integrable functions f.

Following [14, 17], a g-function over \((\mathbb {X},\sigma )\) is a Borel measurable function \(g :\mathbb {X} \rightarrow [0,1]\) satisfying \(\sum _{y \in \sigma ^{-1}x} g(y) = 1\) for all \(x \in \mathbb {X}\). There is a corresponding transfer operator

$$\begin{aligned} {\mathcal L}_{g} :C(\mathbb {X}) \rightarrow C(\mathbb {X}), \quad ({\mathcal L}_{g} f)(x) = \sum _{y \in \sigma ^{-1}x} g(y) f(y). \end{aligned}$$

We call \(\nu \) a g-measure with respect to g if it is invariant under the dual of \({\mathcal L}_g\), that is, \(\nu ({\mathcal L}_g f) = \nu (f)\) for all \(f \in C(\mathbb {X})\). It is straightforward to check that \(g = \widetilde{g} \circ \pi _2\), with \(\widetilde{g}(x) = (1 - \cos (2\pi x))/2\), is indeed a g-function with g-measure \(\mu \); compare [2] for the corresponding statement about \(\widetilde{g}\) and \(\mu _{{\text {TM}}}\) over the doubling map. In fact \(\mu _{{\text {TM}}}\) is known to be the unique g-measure with respect to \(\widetilde{g}\). We refer to [4, 6, 8, 14] and the references therein for more on the (non-)uniqueness of g-measures.
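
As a quick sanity check (ours, using the torus picture via \(\pi _2\)), the normalization \(\sum _{y \in \sigma ^{-1}x} g(y) = 1\) can be tested numerically: the two preimages of \(u = \pi _2(x)\) under the doubling map are u/2 and (u+1)/2, and \(\widetilde{g}(u/2) + \widetilde{g}((u+1)/2) = \sin ^2(\pi u/2) + \cos ^2(\pi u/2) = 1\).

```python
import math

def g_tilde(u):
    """g~(u) = (1 - cos(2 pi u)) / 2 = sin(pi u)^2 on the torus."""
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * u))

def transfer(h, u):
    """(L_g h)(u), realized on the torus via the two preimages of u."""
    return g_tilde(u / 2) * h(u / 2) + g_tilde((u + 1) / 2) * h((u + 1) / 2)

for u in (0.1, 0.37, 0.5, 0.93):
    print(g_tilde(u / 2) + g_tilde((u + 1) / 2))   # 1.0: g is a g-function
print(transfer(lambda u: 1.0, 0.37))               # L_g fixes constants
```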

Since \(g = \exp \circ \psi \), the invariance of \(\mu \) under \({\mathcal L}_{g}\) builds a natural bridge to the potential function. This can be used to obtain the following replacement for the Gibbs property in the Hölder continuous case.

Lemma 2.1

For any two words \(w \in \{0,1 \}^n\) and \(v \in \{0,1\}^m\), we have

$$\begin{aligned} \mu ([wv]) = \int _{[v]} g_n (wx) \,\textrm{d}\mu (x), \end{aligned}$$

where

$$\begin{aligned} g_n(x) = \prod _{k = 0}^{n-1} g(\sigma ^k x). \end{aligned}$$

In particular,

$$\begin{aligned} \inf _{x \in [wv]} S_n \psi (x) + \log (\mu [v]) \leqslant \log (\mu [wv]) \leqslant \log (\mu [w]) \leqslant \sup _{x \in [w]} S_n \psi (x). \end{aligned}$$

Proof

Writing \(\mathbbm {1}_{[wv]}\) for the characteristic function of [wv] and using the invariance of \(\mu \) under the transfer operator, we get

$$\begin{aligned} \mu ([wv]) = \mu (\mathbbm {1}_{[wv]}) = \mu ({\mathcal L}_{g}^n \mathbbm {1}_{[wv]}), \end{aligned}$$

and obtain via a straightforward calculation

$$\begin{aligned} {\mathcal L}_{g}^n \mathbbm {1}_{[wv]}:x \mapsto \sum _{w' \in \{0,1 \}^n } g_n(w' x) \mathbbm {1}_{[wv]}(w' x) = g_n(wx) \mathbbm {1}_{[v]}(x). \end{aligned}$$

This yields the first assertion. The inequalities follow by estimating the integrand via its infimum (or supremum) and taking the logarithm. \(\square \)

We continue by recording a basic estimate for the potential function. The proof is straightforward and left to the interested reader.

Lemma 2.2

For every \(x \in \mathbb {T}\), let |x| be the smallest Euclidean distance to an endpoint of the unit interval. Then, we have

$$\begin{aligned} 2 \log (2 |x|) \leqslant \log \widetilde{g}(x) \leqslant 2 \log (\pi |x|). \end{aligned}$$
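
For the interested reader, the key observation is that \(\widetilde{g}(x) = \sin ^2(\pi x)\), so the lemma reduces to the elementary bounds \(2t \leqslant \sin (\pi t) \leqslant \pi t\) for \(t \in [0,1/2]\). A quick numerical confirmation (ours):

```python
import math, random

def log_g(t):
    """log g~(t) for t in (0, 1/2]; note g~(t) = sin(pi t)^2, so |t| = t here."""
    return 2.0 * math.log(math.sin(math.pi * t))

random.seed(1)
for _ in range(10_000):
    t = random.uniform(1e-9, 0.5)
    assert 2 * math.log(2 * t) <= log_g(t) <= 2 * math.log(math.pi * t)
print("Lemma 2.2 verified on 10000 random samples")
```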

We use these bounds to obtain an estimate for \(S_n \psi (x)\) for arbitrary \(n \in {{\mathbb {N}}}\) and \(x \in \mathbb {X} {\setminus } \mathcal D\). Recall the notation \(\tau (x) = (n_m)_{m \in {{\mathbb {N}}}}\) for the alternation coding of x, and \(N_m = \sum _{i=1}^{m} n_i\) for \(m \in {{\mathbb {N}}}\). We stress that \(n_m=n_m(x)\) and \(N_m = N_m(x)\) depend in fact on x, but we suppress this in our notation if there is no risk of confusion. The same holds for the quantity \(r_m = r_m(x)\) defined below.

Lemma 2.3

Let \(x \in \mathbb {X}\setminus \mathcal D\) with \(\tau (x) = (n_m)_{m \in {{\mathbb {N}}}}\). Assume \(N_m \leqslant n < N_{m+1}\) for some \(m \in {{\mathbb {N}}}\) and \(r_{m+1} = N_{m+1} - n > 0\). Then,

$$\begin{aligned} - \log 2 \biggl ( n + \sum _{i=1}^{m+1} n_i^2 - r_{m+1}^2\biggr ) \leqslant S_n \psi (x) \leqslant - \log 2 \biggl ( n + \sum _{i=1}^{m+1} n_i^2 - r_{m+1}^2 \biggr ) + 2 n \log \pi . \end{aligned}$$

Proof

First, note that if \(y \in [0^k 1]\) for some \(k \in {{\mathbb {N}}}\), then \( 2^{-(k+1)} \leqslant |\pi _2(y)| \leqslant 2^{-k}, \) which by Lemma 2.2 implies that

$$\begin{aligned} - 2k \log 2 \leqslant \psi (y) \leqslant - 2k \log 2 + 2 \log \pi . \end{aligned}$$

Let \(k' = k-r\) for some \(0\leqslant r < k\). Since for \(0 \leqslant \ell < k\) the point \(\sigma ^\ell y\) is contained in \([0^{k-\ell }1]\), we can estimate

$$\begin{aligned} S_{k'} \psi (y) = \sum _{\ell =0}^{k-r-1} \psi (\sigma ^\ell y) \geqslant - 2 \log 2 \sum _{\ell =0}^{k-r-1} (k-\ell ) = -(k^2 - r^2 + k') \log 2. \end{aligned}$$
(2)

In the special case \(k'= k\), this yields

$$\begin{aligned} S_k \psi (y) \geqslant -(k^2 + k) \log 2. \end{aligned}$$
(3)

By symmetry, the same bounds hold if \(y \in [1^k 0]\). For simplicity let us assume that

$$\begin{aligned} x = 0^{n_1} 1^{n_2} \cdots 1^{n_m} 0^{n_{m+1}} \cdots . \end{aligned}$$

All other cases work analogously. Since \(n+r_{m+1} = N_{m+1} = N_m + n_{m+1}\), we have in particular that \(n-N_m = n_{m+1} - r_{m+1}\). Using this, we can split up the Birkhoff sum as

$$\begin{aligned} S_n \psi (x)&=S_{n_1} \psi (0^{n_1} 1\cdots ) + \cdots + S_{n_m}\psi (1^{n_m} 0\cdots ) + S_{n_{m+1} - r_{m+1}} \psi (0^{n_{m+1}} 1 \cdots ) \\ {}&\geqslant - \log 2 \biggl ( n + \sum _{i=1}^m n_i^2 + (n_{m+1}^2 - r_{m+1}^2) \biggr ), \end{aligned}$$

using (2) and (3) in the last step. This shows the lower bound. The upper bound follows along the same lines. \(\square \)

Although \(\mu (C_n(x))\) is closely related to \(S_n \psi (x)\) via Lemma 2.1, we emphasize that, in contrast to \(S_n \psi (x)\), the expression \(\mu (C_n(x))\) depends only on the first n positions of x. To account for this fact, we extend the action of the alternation coding \(\tau \) to finite words via

$$\begin{aligned} \tau :a^{n_1} b^{n_2} \cdots a^{n_m} \mapsto n_1 \cdots n_m, \end{aligned}$$

for \(a\ne b\) (and m odd), and accordingly if the word ends in \(b^{n_m}\) (if m is even).

Lemma 2.4

Let \(w \in \{0,1\}^n\) with \(\tau (w) = n_1 \cdots n_m \in {{\mathbb {N}}}^m\). Then,

$$\begin{aligned} - \biggl ( n + 1 + \sum _{i=1}^m n_i^2 \biggr ) \log 2 \leqslant \log \mu ([w]) \leqslant - \biggl (n + \sum _{i = 1}^m n_i^2 \biggr ) \log 2 + 2 n \log \pi . \end{aligned}$$

Proof

Again, it suffices to consider the case that w is of the form

$$\begin{aligned} w = 0^{n_1} 1^{n_2} \cdots 0^{n_{m-1}} 1^{n_m}. \end{aligned}$$

From Lemma 2.1 (and using \(\mu [0] = 1/2\) by symmetry considerations), we obtain

$$\begin{aligned} \inf _{x \in [w0]} S_n \psi (x) - \log 2 \leqslant \log \mu [w0] \leqslant \log \mu [w] \leqslant \sup _{x \in [w]} S_n \psi (x). \end{aligned}$$
(4)

For the lower bound, let \(x \in [w0]\) and note that its alternation coding \(\tau (x) = (n_i(x))_{i \in {{\mathbb {N}}}}\) satisfies \(n_i(x) = n_i\) for all \(1\leqslant i \leqslant m\). Applying Lemma 2.3 with \(n=N_m(x)\) and \(r_{m+1}(x) = n_{m+1}(x)\) immediately gives the desired estimate. For the upper bound, assume that \(x \in [w]\) and note that in this case, \(\tau (x) = (n_i(x))_{i \in {{\mathbb {N}}}}\) is of the form

$$\begin{aligned} \tau (x) = n_1 \cdots n_{m-1} n_m(x) \cdots , \end{aligned}$$

with \(n_m(x) \geqslant n_m\) and \(N_{m-1}(x) < n \leqslant N_m(x)\). If \(n = N_m(x)\), we have \(n_m = n_m(x)\) and may argue as for the lower bound. We hence assume \(N_{m-1}(x)< n < N_m(x)\) in the following. Then, \(r_m(x) = N_{m}(x) - n\) is equal to \(n_m(x) - n_m\) by construction. From this, we easily conclude that \(n_m^2 \leqslant n_m(x)^2 - r_m(x)^2\). Combining this estimate with the upper bound provided by Lemma 2.3 yields

$$\begin{aligned} S_n \psi (x) \leqslant - \biggl (n + \sum _{j = 1}^m n_j^2 \biggr ) \log 2 + 2n \log \pi . \end{aligned}$$

Since \(x \in [w]\) was arbitrary, this concludes the proof via (4). \(\square \)

We summarize our findings in terms of the function sequence \((f_m)_{m \in {{\mathbb {N}}}}\), with

$$\begin{aligned} f_m(x) = \sum _{i=1}^m n_i^2, \end{aligned}$$

for all \(x \in \mathbb {X} {\setminus } \mathcal D\) with \(\tau (x) = (n_i)_{i \in {{\mathbb {N}}}}\), and \(m \in {{\mathbb {N}}}\). For an illustration of the following proposition we refer to Fig. 2.

Proposition 2.5

Let \(x \in \mathbb {X}{\setminus } \mathcal D\) with \(\tau (x) = (n_m)_{m \in {{\mathbb {N}}}}\). Assume \(N_m \leqslant n < N_{m+1}\) for some \(m \in {{\mathbb {N}}}\), with \(r_{m+1} = N_{m+1} - n\) and \(s_{m+1} = n - N_m\). Then,

$$\begin{aligned} S_n \psi (x)&= - (f_{m+1}(x) - r^2_{m+1})\log 2 + O(n), \\ \log \mu (C_n(x))&= - (f_{m}(x) + s^2_{m+1})\log 2 + O(n). \end{aligned}$$
Fig. 2: Estimates (up to O(n)) for \(-\log \mu (C_n(x))/\log 2\) (solid) and for \(-S_n \psi (x)/\log 2\) (dashed), given in Proposition 2.5

Remark 2.6

It is worth noticing that both \(N_m(x)\) and \(f_m(x)\) are themselves Birkhoff sums over \(({{\mathbb {N}}}^{{\mathbb {N}}},\sigma )\). More precisely, \(N_m(x) = S_m \varphi (\tau (x))\), with \(\varphi :n_1 n_2 \ldots \mapsto n_1\) and \(f_m(x) = S_m \varphi ^2(\tau (x))\), where \(\varphi ^2 :n_1 n_2 \ldots \mapsto n_1^2\). Hence, we are in fact concerned with locally constant, unbounded observables over the full shift with a countable alphabet.

3 Intermediate Scaling

In this section we investigate the scaling function \(n \mapsto n^{\gamma }\) for \(\gamma \in (1,2)\) and prove that this scaling is typical for \(S_n \psi (x)\) and \(\log \mu (C_n(x))\) in the sense of full Hausdorff dimension. As a first step, we show that we may restrict our attention to the limiting behavior of \(f_m\) as \(m \rightarrow \infty \).

Lemma 3.1

Assume that \(x \in \mathbb {X} {\setminus } \mathcal D\) and \(\lim _{m\rightarrow \infty }N_m(x)^{-\gamma } f_m(x) = \alpha > 0\). Then,

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{\log \mu (C_n(x))}{n^{\gamma }} = \lim _{n \rightarrow \infty } \frac{S_n \psi (x)}{n^{\gamma }} = - \alpha \log 2. \end{aligned}$$

Proof

As usual, let \(\tau (x) = (n_i)_{i \in {{\mathbb {N}}}}\) and \(N_m = N_m(x)\). First, we will show that the convergence of \(N_m^{-\gamma } f_m(x)\) implies that both \(n_m/N_m\) and \(n_m^2/N_m^{\gamma }\) converge to 0. Indeed, whenever \(n_m/N_m> \delta > 0\), we get \(f_m(x) \geqslant \delta ^2 N_m^2\), which can happen only for finitely many values of m. This implies also \(\lim _{m \rightarrow \infty } N_m/N_{m+1} =1\). Finally, note that

$$\begin{aligned} \frac{f_{m}(x)}{N_{m}^{\gamma }} = \frac{f_{m-1}(x)}{N_m^{\gamma }} + \frac{n_{m}^2}{N_m^{\gamma }}. \end{aligned}$$
(5)

Then, if \(n_m^2/ N_m^{\gamma }> \delta >0\) for infinitely many m, applying the \(\limsup \) to both sides of (5) yields \(\alpha \geqslant \alpha + \delta \), a contradiction. These observations offer enough control over the points \(N_m \leqslant n < N_{m+1}\) to obtain the desired convergence from Proposition 2.5 (and the fact that \(0 \leqslant r_m,s_m \leqslant n_m\) in the corresponding notation). \(\square \)

In order to establish lower bounds for the Hausdorff dimension of level sets, we will make use of the following simple consequence of the mass distribution principle. Recall that we define the upper density of a subset \(M \subset {{\mathbb {N}}}\) via

$$\begin{aligned} \overline{D}(M) = \limsup _{n \rightarrow \infty } \frac{1}{n} \# (M \cap [1,n]). \end{aligned}$$

Lemma 3.2

For \(M \subset {{\mathbb {N}}}\) and \(w :M \rightarrow \{0,1\}\) let

$$\begin{aligned} A = A(w) = \{x \in \mathbb {X} \,: \, x_m = w_m \text{ for } \text{ all } m \in M \}. \end{aligned}$$

Then, \(\dim _H A \geqslant 1 - \overline{D}(M)\).

Proof

We define a Bernoulli-like measure \(\nu \) on A by “ignoring the determined positions". More precisely, for every \(n \in {{\mathbb {N}}}\) let \(P_n = \{1,\ldots ,n\} {\setminus } M\) be the free positions and set \(c_n = \# P_n\). Clearly, there are \(2^{c_n}\) choices for \(v \in \{0,1\}^n\) such that [v] intersects A and we set

$$\begin{aligned} \nu [v] = {\left\{ \begin{array}{ll} 2^{-c_n} &{} \text{ if } [v] \cap A \ne \varnothing , \\ 0 &{}\text{ otherwise }. \end{array}\right. } \end{aligned}$$

It is straightforward to check that this definition is consistent and there is a unique measure \(\nu \) with this property by the Kolmogorov extension theorem. We obtain for every \(x \in A\) and \(n \in {{\mathbb {N}}}\) that \(\nu (C_n(x)) = 2^{-c_n}\) and therefore the lower local dimension of \(\nu \) at x is given by

$$\begin{aligned} \underline{d}_{\nu }(x) = \liminf _{n \rightarrow \infty } \frac{\log \nu (C_n(x))}{-n \log 2} = \liminf _{n \rightarrow \infty } \frac{c_n}{n} = 1 - \overline{D}(M). \end{aligned}$$

The claim hence follows via the (non-uniform) mass distribution principle. \(\square \)

With the help of Lemma 3.2, we will show that for every \(\alpha > 0\), the situation in Lemma 3.1 is typical in the sense of full Hausdorff dimension.

Proposition 3.3

For every \(\gamma \in (1,2)\) and \(\alpha > 0\), we have

$$\begin{aligned} \dim _H \{ x \in \mathbb {X} \setminus \mathcal D: f_m(x) \sim \alpha N_m^{\gamma } \} = 1. \end{aligned}$$

Proof

We construct a subset with Hausdorff dimension arbitrarily close to 1. The dimension estimate will be provided by Lemma 3.2. Hence, we want to find a subset \(M \subset {{\mathbb {N}}}\) of arbitrarily small upper density, such that fixing x on M in an appropriate way ensures that \(f_m(x) \sim \alpha N_m^{\gamma }\). The general strategy is the following: We choose a sequence \((\theta _k)_{k \in {{\mathbb {N}}}}\) of positive real numbers such that \(\theta _{k} - \theta _{k-1} \rightarrow \infty \) but \(\theta _{k-1}/\theta _k \rightarrow 1\) as \(k \rightarrow \infty \). To ensure \(f_m(x) \sim \alpha N_m^{\gamma }\), we fix \(x \in \mathbb {X}\) to be constant on an interval of some appropriate length \(c_k\) in \([\theta _{k},\theta _{k+1}]\), and to have bounded alternation blocks outside of these intervals. Using that \(c_k\) grows slower than \(\theta _{k+1} - \theta _k\), this will fix x on a set of positions with arbitrarily small density. The details follow.

For definiteness, we fix some large number \(r = r(\gamma )\) (the exact value will be determined later) and set \(\theta _k = k^r\). For \(r > 1\) this satisfies \(\theta _{k} - \theta _{k-1} \rightarrow \infty \) and \(\theta _{k-1}/\theta _k \rightarrow 1\) for \(k \rightarrow \infty \), as required. An appropriate choice of \(c_k\) turns out to be

$$\begin{aligned} c_k = \sqrt{r \gamma \alpha } k^{\delta }, \quad \delta = \frac{r \gamma - 1}{2}, \end{aligned}$$
(6)

where \(\gamma \in (1,2)\) by assumption. For \(c_k\) to grow slower than \(\theta _k - \theta _{k-1}\), we require \(\delta < r-1\). Since

$$\begin{aligned} \frac{\delta }{r-1} = \frac{r \gamma - 1}{2 r - 2} \xrightarrow {r \rightarrow \infty } \frac{\gamma }{2} < 1, \end{aligned}$$

this holds true for large enough r and we take some \(r = r(\gamma ) > 2\) with this property. Hence, we can choose \(k_0 \in {{\mathbb {N}}}\) such that \(c_k < \theta _{k} - \theta _{k-1}\) for all \(k \geqslant k_0\). We specify a set of positions via

$$\begin{aligned} M_1 = \bigcup _{k \geqslant k_0} \{ n \in {{\mathbb {N}}}: \theta _{k} - c_k \leqslant n \leqslant \theta _{k} \}, \end{aligned}$$

and define

$$\begin{aligned} Q = \{ x \in \mathbb {X}\setminus \mathcal D: x_n = 0 \; \text{ for } \text{ all } \; n \in M_1\}. \end{aligned}$$

To avoid long repetitions of a single letter outside of \(M_1\), we further fix a large cutoff-value \(\Lambda \in {{\mathbb {N}}}\) and set

$$\begin{aligned} R_{\Lambda } = \{ x \in \mathbb {X}\setminus \mathcal D: x_n x_{n+1} = 10 \; \text{ for } \text{ all } \; n \in \Lambda {{\mathbb {N}}}\setminus M_1 \}. \end{aligned}$$

Finally, we combine both conditions by setting

$$\begin{aligned} A_{\Lambda } = Q \cap R_{\Lambda }. \end{aligned}$$

Given \(x \in A_{\Lambda }\), we want to show that \(f_m(x) \sim \alpha N_m^{\gamma }\). The definition of Q implies that x is constant on \([\theta _{k} - c_k, \theta _k]\) for each \(k \in {{\mathbb {N}}}\). If \(\tau (x) = (n_i)_{i \in {{\mathbb {N}}}}\) is the alternation coding of x, this implies that for every k there is a unique index \(i_k\) such that \(N_{i_k-1} \leqslant \lceil \theta _{k} - c_k \rceil \leqslant \lfloor \theta _{k} \rfloor \leqslant N_{i_k}\), and in particular \(n_{i_k} \geqslant c_k - 2\). Since \(R_{\Lambda }\) restricts the length of blocks outside of \(M_1\), we find that \(n_{i_k}\) can in fact not be much larger and hence

$$\begin{aligned} n_{i_k} = c_k + O(1), \end{aligned}$$

where the implied constant depends on \(\Lambda \). For all other indices i we have that \(n_i \leqslant \Lambda \) is bounded by a constant. Hence, for \(i_k \leqslant m < i_{k+1}\) we obtain

$$\begin{aligned} f_m(x) = \sum _{i=1}^m n_i^2 = \sum _{\ell =1}^k n_{i_\ell }^2 + O(m) \sim \sum _{\ell =1}^k c_\ell ^2. \end{aligned}$$

With the specific choice of \(c_k\) in (6), we obtain

$$\begin{aligned} \sum _{\ell =1}^k c_\ell ^2 = \alpha \sum _{\ell =1}^k r \gamma \ell ^{r \gamma -1} \sim \alpha k^{r\gamma }, \end{aligned}$$

using an integral estimate in the last equation. Since \(\theta _k \sim \theta _{k+1}\) and by the monotonicity of \(N_m\), we also observe that \(N_m \sim \theta _k = k^{r}\) for \(i_k \leqslant m < i_{k+1}\), and therefore

$$\begin{aligned} f_m(x) \sim \alpha k^{r \gamma } \sim \alpha N_m^{\gamma }, \end{aligned}$$

as required. That is, \(A_{\Lambda } \subset \{ f_m \sim \alpha N_m^{\gamma } \}\) for every \(\Lambda \in {{\mathbb {N}}}\) and it suffices to find an appropriate lower bound for the Hausdorff dimension of \(A_\Lambda \). Since the positions in \(M_1\) are accumulated to the left of the values \(\theta _k\), we obtain

$$\begin{aligned} \overline{D}(M_1) = \limsup _{k \rightarrow \infty } \frac{1}{\theta _k} \sum _{\ell = 1}^{k} c_{\ell } = \limsup _{k \rightarrow \infty } \frac{1}{k^{r}} \sqrt{r \gamma \alpha } \frac{k^{\delta +1}}{\delta +1} = 0, \end{aligned}$$

using that \(\delta + 1 < r\) in the last step. Because the points in \(A_{\Lambda }\) are fixed on the positions given by \(M_1 \cup \Lambda {{\mathbb {N}}}\cup (\Lambda {{\mathbb {N}}}+ 1)\), we obtain by Lemma 3.2,

$$\begin{aligned} \dim _H A_{\Lambda } \geqslant 1 - \overline{D}\bigl (M_1 \cup \Lambda {{\mathbb {N}}}\cup (\Lambda {{\mathbb {N}}}+ 1) \bigr ) = 1 - \frac{2}{\Lambda }. \end{aligned}$$

Since this is a lower bound for \(\dim _H\{f_m \sim \alpha N_m^{\gamma } \}\) and \(\Lambda \in {{\mathbb {N}}}\) was arbitrary, the claim follows. \(\square \)
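
The construction can be tested numerically. In the following sketch (ours, simplified: all blocks outside the long ones have length 1, and the marker positions enforced by \(R_{\Lambda }\) are ignored), the contributions of the unit blocks are aggregated in bulk, and the quotient \(f_m(x)/N_m^{\gamma }\) is evaluated at \(N_m = \theta _K\); it approaches \(\alpha \) as K grows, the deviation for moderate K reflecting the error of the integral estimate.

```python
import math

def F_at_theta(alpha, gamma, r=6, K=400):
    """f_m / N_m^gamma at N_m = theta_K for the point of Proposition 3.3:
    unit blocks everywhere, except one block of length ~c_k ending at
    each theta_k = k^r (contributions of unit blocks counted in bulk)."""
    delta = (r * gamma - 1) / 2
    f = N = 0.0
    for k in range(2, K + 1):
        theta = float(k) ** r
        c = math.sqrt(r * gamma * alpha) * k**delta
        f += (theta - N - c) * 1          # unit blocks contribute 1^2 each
        f += c * c                        # the single long block
        N = theta
    return f / N**gamma

print(F_at_theta(0.7, 1.5))   # ~0.71, tending to alpha = 0.7 as K grows
```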

Proof of Theorem 1.1

For \(\alpha > 0\), the desired relation follows by combining Proposition 3.3 with Lemma 3.1. For \(\alpha = 0\), simply recall that both \(S_n \psi (x)\) and \( \log \mu (C_n(x))\) scale linearly with n for a set of full Hausdorff dimension [1]. \(\square \)

4 Spreading of Accumulation Points

We specialize to the scaling function \(n \mapsto n^2\) for the remainder of this article. We continue with the standing assumption that \(x \in \mathbb {X} \setminus \mathcal D\). By Proposition 2.5, the accumulation points for \(-\log \mu (C_n(x))/(n^2 \log 2)\) are the same as those of

$$\begin{aligned} \xi _n^{\mu }(x):= \frac{1}{n^2} \biggl (\sum _{i=1}^m n_i^2 + (n-N_m)^2 \biggr ), \text{ if } N_m \leqslant n < N_{m+1}. \end{aligned}$$

Similarly, the accumulation points of \(-S_n \psi (x)/(n^2 \log 2)\) coincide with those of

$$\begin{aligned} \xi _n^{\psi }(x):= \frac{1}{n^2} \biggl (\sum _{i=1}^{m} n_i^2 - (N_m-n)^2 \biggr ), \text{ if } N_{m-1} \leqslant n < N_{m}. \end{aligned}$$

Recall the notation \(F_m(x) = N_m^{-2} f_m(x)\), together with \(\underline{F}(x) = \liminf _{m \rightarrow \infty } F_m(x)\) and \(\overline{F}(x) = \limsup _{m \rightarrow \infty } F_m(x)\). The strict convexity of the function \(s \mapsto s^2\) causes the sequence \(\xi _n^{\mu }(x)\) to take its minimum on \([N_m,N_{m+1}]\) at some intermediate point, provided that \(n_{m+1}\) is sufficiently large; compare Fig. 2. This gives rise to a drop of the \(\liminf \), as compared to \(\underline{F}(x)\).
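
Before turning to the proofs, the two sequences are easy to evaluate numerically; the sketch below (ours) uses the point with alternation coding \(n_i = 2^i\), for which \(F_m \rightarrow 1/3\), so \(\alpha = \beta = 1/3\). Lemmas 4.1 and 4.2 below then predict extreme values 1/4 and 1/3 for \(\xi _n^{\mu }\), and 1/3 and 1/2 for \(\xi _n^{\psi }\).

```python
def xi_sequences(runs):
    """xi_n^mu and xi_n^psi for a point whose alternation coding starts
    with `runs`, following the two formulas displayed above."""
    N, f = [0], [0]
    for n_i in runs:
        N.append(N[-1] + n_i)
        f.append(f[-1] + n_i * n_i)
    xi_mu, xi_psi = {}, {}
    m = 0
    for n in range(N[1], N[-1]):
        while N[m + 1] <= n:
            m += 1                        # now N[m] <= n < N[m+1]
        xi_mu[n] = (f[m] + (n - N[m]) ** 2) / n**2
        xi_psi[n] = (f[m + 1] - (N[m + 1] - n) ** 2) / n**2
    return xi_mu, xi_psi

runs = [2**i for i in range(18)]          # F_m -> 1/3, alpha = beta = 1/3
xi_mu, xi_psi = xi_sequences(runs)
tail = [n for n in xi_mu if n > 2**12]    # discard the transient part
print(min(xi_mu[n] for n in tail), max(xi_mu[n] for n in tail))    # ~1/4, ~1/3
print(min(xi_psi[n] for n in tail), max(xi_psi[n] for n in tail))  # ~1/3, ~1/2
```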

Lemma 4.1

Given \(\underline{F}(x) = \alpha \) and \(\overline{F}(x) = \beta \), we have

$$\begin{aligned} \liminf _{n \rightarrow \infty } \xi _n^{\mu }(x) = \frac{\alpha }{1 + \alpha }, \quad \limsup _{n \rightarrow \infty } \xi _n^{\mu }(x) = \beta . \end{aligned}$$

Proof

We start with the assertion about the \(\liminf \). Let \(m \in {{\mathbb {N}}}\) and assume that \(n = (1+ c) N_m\) (not necessarily \(n < N_{m+1}\)) for some \(c\geqslant 0\). We obtain

$$\begin{aligned} n^2 \xi _n^{\mu }(x) \leqslant \sum _{i=1}^{m} n_i^2 + (c N_m)^2 = N_m^2 (F_m(x) + c^2), \end{aligned}$$

with equality if and only if \(n \leqslant N_{m+1}\). Hence,

$$\begin{aligned} \xi _n^{\mu }(x) \leqslant \frac{F_m(x) + c^2}{(1+c)^2}, \end{aligned}$$
(7)

again with equality precisely if \(n \leqslant N_{m+1}\). For \(r > 0\), the function

$$\begin{aligned} \phi _{r} :c \mapsto \frac{r + c^2}{(1+c)^2} \end{aligned}$$

is strictly decreasing on [0, r), takes a minimum at \(c = r\) and is increasing for \(c \geqslant r\). This yields for \(N_m \leqslant n \leqslant N_{m+1}\),

$$\begin{aligned} \xi _n^{\mu }(x) \geqslant \min _{c > 0} \phi _{F^{}_m(x)}(c) = \frac{F_m(x)}{1 + F_m(x)}, \end{aligned}$$

and in particular,

$$\begin{aligned} \liminf _{n\rightarrow \infty } \xi _n^{\mu }(x) \geqslant \liminf _{m \rightarrow \infty } \frac{F_m(x)}{1 + F_m(x)} = \frac{\alpha }{1 + \alpha }. \end{aligned}$$

On the other hand, let

$$\begin{aligned} c_m = \frac{\lfloor F_m(x) N_m \rfloor }{N_m}, \end{aligned}$$

and note that if \(F_{m_k}(x)\) converges to \(\alpha \), then so does \(c_{m_k}\) as \(k \rightarrow \infty \). In particular, we find for \(r_k = N_{m_k} (1 + c_{m_k})\) that

$$\begin{aligned} \xi ^{\mu }_{r_k}(x) \leqslant \frac{F_{m_k}(x) + c_{m_k}^2}{(1 + c_{m_k})^2} \xrightarrow {k \rightarrow \infty } \frac{\alpha }{1 + \alpha }, \end{aligned}$$

and the claim on the \(\liminf \) follows.

For \(n = (1+ c)N_m\) let I be the interval of values c such that \(N_m \leqslant n \leqslant N_{m+1}\). Due to the monotonicity properties of \(c \mapsto \phi _{F^{}_m(x)}(c)\), its maximum on I is obtained on a boundary point. By (7), we hence conclude that \(\xi _n^{\mu }(x) \leqslant F_m(x)\) or \(\xi _n^{\mu }(x) \leqslant F_{m+1}(x)\), with equality if \(n = N_m\) or \(n = N_{m+1}\), respectively. This implies the assertion about the \(\limsup \). \(\square \)

Lemma 4.2

Given \(\underline{F}(x) = \alpha \) and \(\overline{F}(x) = \beta \), we have

$$\begin{aligned} \liminf _{n \rightarrow \infty } \xi _n^{\psi }(x) = \alpha , \quad \limsup _{n \rightarrow \infty } \xi _n^{\psi }(x) = \frac{\beta }{1-\beta }. \end{aligned}$$

Proof

This is similar to the proof of Lemma 4.1. With \(m \in {{\mathbb {N}}}\) and \(n = (1-c)N_m\) for some \(0 \leqslant c < 1\), we get

$$\begin{aligned} n^2 \xi _n^{\psi }(x) \geqslant \sum _{i=1}^m n_i^2 - (cN_m)^2 = N_m^2 (F_m(x) - c^2), \end{aligned}$$

with equality if and only if \(n \geqslant N_{m-1}\). Therefore,

$$\begin{aligned} \xi _n^{\psi }(x) \geqslant \frac{F_m(x) - c^2}{(1-c)^2} =: \bar{\phi }_{F_m(x)}(c), \end{aligned}$$

again with equality if and only if \(n \geqslant N_{m-1}\). For \(0< r < 1\), the function \(\bar{\phi }_{r}(c)\) is strictly increasing on [0, r), takes its maximum at \(c = r\) and is decreasing for \(c \in (r,1)\). Noting that \(\bar{\phi }_{r}(r) = r/(1-r)\), the rest follows precisely as in the proof of Lemma 4.1. \(\square \)
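
The monotonicity claims for \(\phi _r\) and \(\bar{\phi }_r\) used in the two proofs above follow from the derivatives \(\phi _r'(c) = 2(c-r)/(1+c)^3\) and \(\bar{\phi }_r'(c) = 2(r-c)/(1-c)^3\); a short symbolic confirmation (ours, assuming sympy):

```python
import sympy as sp

c, r = sp.symbols('c r', positive=True)

phi = (r + c**2) / (1 + c)**2          # from the proof of Lemma 4.1
print(sp.simplify(sp.diff(phi, c) - 2 * (c - r) / (1 + c)**3))      # 0
print(sp.simplify(phi.subs(c, r) - r / (1 + r)))                    # 0

bar_phi = (r - c**2) / (1 - c)**2      # from the proof of Lemma 4.2
print(sp.simplify(sp.diff(bar_phi, c) - 2 * (r - c) / (1 - c)**3))  # 0
print(sp.simplify(bar_phi.subs(c, r) - r / (1 - r)))                # 0
```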

Proof of Proposition 1.2

The corresponding statements for the accumulation points of \((\xi ^{\mu }_n(x))_{n \in {{\mathbb {N}}}}\) and \((\xi ^{\psi }_n(x))_{n \in {{\mathbb {N}}}}\) are given in Lemmas 4.1 and 4.2. Combining this with Proposition 2.5 gives the desired relations for the Birkhoff sums and the measure decay. \(\square \)

5 Lower Bounds

We want to establish necessary and sufficient criteria for x to satisfy \(\underline{F}(x) = \alpha \) and \(\overline{F}(x) = \beta \). We show that this requires a certain number of large blocks in the alternation coding \(\tau (x) =(n_i)_{i \in {{\mathbb {N}}}}\). To be more precise, let us start with a certain large cutoff-value \(\Lambda \in {{\mathbb {N}}}\) and let

$$\begin{aligned} I_{\Lambda } = \{ i \in {{\mathbb {N}}}: n_i \geqslant \Lambda \}. \end{aligned}$$

It will be convenient to ignore all contributions of \(n_i\) to \(F_n(x)\) as long as \(n_i < \Lambda \). This is achieved by setting

$$\begin{aligned} F^{\Lambda }_m(x) = \frac{1}{N_m^2} \sum _{i \in I_\Lambda \cap [1,m]} n_i^2. \end{aligned}$$

Lemma 5.1

We have \(|F_m(x) - F^\Lambda _m(x)| = O(N_m^{-1})\). Hence, \((F^\Lambda _m(x))_{m \in {{\mathbb {N}}}}\) and \((F_m(x))_{m \in {{\mathbb {N}}}}\) have the same set of accumulation points.

Proof

This follows by

$$\begin{aligned} |F_m(x) - F^\Lambda _m(x)| = \frac{1}{N_m^2} \sum _{i \in [1,m]\setminus I_\Lambda } n_i^2 < \frac{1}{N_m^2} m \Lambda ^2 \leqslant \frac{\Lambda ^2}{N_m}, \end{aligned}$$

which gives the desired estimate. \(\square \)

In principle, it is possible that \(F_m^{\Lambda }(x) = 0\) for all \(m \in {{\mathbb {N}}}\). However, this can only happen if \(\overline{F}(x) = 0\), a case that we will treat separately. In the following, we always assume that m is large enough to ensure \(F_m^\Lambda (x) > 0\).

If \(n_{j+1} < \Lambda \), we interpolate \(F_r^\Lambda (x)\) continuously between \(r = j\) and \(r = j+1\) by setting \(N_r = N_j + (r-j) n_{j+1}\) and

$$\begin{aligned} F^\Lambda _r(x) = \frac{1}{N_r^2} \sum _{i \in I_\Lambda \cap [1,j]} n_i^2, \end{aligned}$$

for all \(r \in {{\mathbb {R}}}\) such that \(j< r < j+1\). For a lower bound on the Hausdorff dimension of \(\{ \underline{F} = \alpha , \overline{F} = \beta \}\), we wish to provide a mechanism that produces an abundance of points with this property. More precisely, we exhibit a subset of \(\{ \underline{F} = \alpha , \overline{F} = \beta \}\) that permits a lower estimate for its Hausdorff dimension via Lemma 3.2. The main idea is the following: Given \(r \in {{\mathbb {R}}}\) with \(F_r^\Lambda (x) = \beta \) we introduce blocks of length smaller than \(\Lambda \) until we hit the level \(F_{k-1}^{\Lambda }(x) = \alpha \) for some \(k \in {{\mathbb {N}}}\). Since these blocks can be chosen arbitrarily we interpret them as degrees of freedom or “undetermined positions”.

Fig. 3: Example for the alternation block decomposition of x, given that \(F^\Lambda _r(x) = F^\Lambda _k(x) = \beta \) and \(F^\Lambda _{k-1}(x) = \alpha \). All blocks between \(N_r\) and \(N_{k-1}\) have length below \(\Lambda \)

We then add a single large block of size \(n_k\) (the “determined positions”) that raises the level back to \(F_k^{\Lambda }(x) = \beta \); compare Fig. 3 for an illustration. The relative amount \(f(\alpha ,\beta )\) of undetermined positions turns out to be independent of the starting position r. Repeating this procedure, the lower density of undetermined positions equals \(f(\alpha ,\beta )\) over the whole sequence. This will yield the same value as a lower bound on the Hausdorff dimension of \(\{ \underline{F} = \alpha , \overline{F} = \beta \}\). In Sect. 6, we will prove that this strategy is indeed optimal, establishing \(f(\alpha ,\beta )\) also as an upper bound for the Hausdorff dimension.

Lemma 5.2

Let \(j,k \in {{\mathbb {N}}}\) with \(j<k\) and assume that \(n_i < \Lambda \) for all \(j< i < k\) and \(n_{k} \geqslant \Lambda \). Suppose that there is \(j \leqslant r < j+1\) such that \(F^\Lambda _r(x) = F^\Lambda _k(x) =:\beta \) and set \(\alpha : = F^\Lambda _{k-1}(x)\). Then,

$$\begin{aligned} \frac{N_{k-1} - N_r}{N_k - N_r} = f(\alpha ,\beta ), \end{aligned}$$

with \(f(\alpha ,\beta )\) as defined in (1).

Proof

Let \(N_{k-1} = (1+s) N_r\) and \(N_k = (1+s+t) N_r\). Since \([j+1,k-1] \cap I_\Lambda = \varnothing \), we have

$$\begin{aligned} N_{k-1}^2 F_{k-1}^\Lambda (x) = N_r^2 F_r^\Lambda (x), \end{aligned}$$

which translates to

$$\begin{aligned} \alpha (1+s)^2 = \beta . \end{aligned}$$

Solving for s, we obtain

$$\begin{aligned} s = \frac{\sqrt{\beta } - \sqrt{\alpha }}{\sqrt{\alpha }}. \end{aligned}$$

On the other hand, we have \(n_k = t N_r\) by definition, yielding

$$\begin{aligned} N_k^2 F_k^\Lambda (x) = n_{k}^2 + \sum _{i \in I_\Lambda \cap [1,k-1]} n_i^2 = t^2 N_r^2 + N_r^2 F_r^\Lambda (x). \end{aligned}$$

That is,

$$\begin{aligned} t^2 + \beta = \beta (1+s + t)^2 = \beta \biggl ( \frac{\sqrt{\beta }}{\sqrt{\alpha }} + t \biggr )^2, \end{aligned}$$

which gives after a few steps of calculation,

$$\begin{aligned} t = \frac{\sqrt{\beta }}{\sqrt{\alpha }} \frac{1}{1-\beta } \Bigl ( \beta + \sqrt{\alpha \beta + \beta - \alpha } \Bigr ), \end{aligned}$$

as the unique positive solution. Finally, this implies

$$\begin{aligned} \frac{N_{k-1} - N_r}{N_k - N_r}&= \frac{s}{s+t} = \frac{1}{1 + s/ t} = \biggl ( 1 + \frac{\sqrt{\beta }}{\sqrt{\beta } - \sqrt{\alpha }} \frac{1}{1 - \beta } \Bigl ( \beta + \sqrt{\alpha \beta + \beta - \alpha } \Bigr ) \biggr )^{-1}. \end{aligned}$$

A few formal manipulations show that this is precisely the expression given by \(f(\alpha ,\beta )\). \(\square \)
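
The “few steps of calculation” can be double-checked symbolically. The following sketch (ours, assuming sympy) verifies, at an exact sample point, that the stated t solves the quadratic relation and that \(s/(s+t)\) agrees with \(f(\alpha ,\beta )\):

```python
import sympy as sp

a, b = sp.symbols('alpha beta', positive=True)
root = sp.sqrt(a * b + b - a)
f = (root - b) / (root + sp.sqrt(a * b))               # Eq. (1)
s = (sp.sqrt(b) - sp.sqrt(a)) / sp.sqrt(a)
t = (sp.sqrt(b) / sp.sqrt(a)) * (b + root) / (1 - b)

vals = {a: sp.Rational(1, 5), b: sp.Rational(2, 3)}    # sample 0 < a < b < 1
print(sp.N((t**2 + b - b * (1 + s + t)**2).subs(vals), 30))  # 0 (to precision)
print(sp.N((s / (s + t) - f).subs(vals), 30))                # 0 (to precision)
```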

In order to show that the strategy sketched before Lemma 5.2 is in a certain sense optimal, we move away from the assumption that there are only negligible blocks between \(N_r\) and \(N_{k-1}\). In this more general setting, we find the following analogue of Lemma 5.2 which will be useful in Sect. 6.

Lemma 5.3

Suppose \(n_{k} \geqslant \Lambda \) for some \(k \in {{\mathbb {N}}}\) and let \(F^\Lambda _{k-1}(x) = \alpha < F^\Lambda _{k}(x) = \beta \). Then,

$$\begin{aligned} \frac{N_{k-1} - \sqrt{\alpha /\beta } N_{k-1}}{N_k - \sqrt{\alpha /\beta } N_{k-1}} = f(\alpha ,\beta ). \end{aligned}$$

Sketch of proof

The proof of Lemma 5.2 carries over verbatim if we replace \(N_r\) by the term \(\sqrt{\alpha /\beta } N_{k-1}\) and use the identification \(F^\Lambda _r(x) = F^\Lambda _k(x)\). \(\square \)

We can now provide a lower estimate for the dimension of the set \(\{ \underline{F} = \alpha , \overline{F} = \beta \}\). As in Sect. 3, we will make use of Lemma 3.2, by fixing the values of the sequence \(x = (x_n)_{n \in {{\mathbb {N}}}}\) on an appropriate subset of \({{\mathbb {N}}}\).

Proposition 5.4

Let \((\alpha ,\beta ) \in \Delta \). Then, \(\dim _H \{ \underline{F} = \alpha , \overline{F} = \beta \} \geqslant f(\alpha ,\beta )\).

Proof

For \(\beta =0\), note that whenever the size of alternation blocks in \(\tau (x)\) is uniformly bounded, it follows that \(\overline{F}(x) = 0\). Since the union of all such elements x has full Hausdorff dimension, the claim holds in this particular case. Likewise, the claim is trivial if \(\beta = 1\) or \(\alpha = \beta \ne 0\) because this implies \(f(\alpha ,\beta ) = 0\). We can hence assume \(\alpha< \beta < 1\) in the following. For simplicity, we further restrict to the case that \(\alpha > 0\). The case \(\alpha = 0\) can be treated by replacing \(\alpha \) with a sequence \(\alpha _k \rightarrow 0\) in the argument below.

We follow the ideas outlined before Lemma 5.2, using some of the notation introduced in its proof. For \(\Lambda \in {{\mathbb {N}}}\), we specify a set of positions, given by

$$\begin{aligned} M_\Lambda := \bigcup _{k \geqslant 0}\{n \in {{\mathbb {N}}}: \ (1+s)\theta _k \le n \le \theta _{k+1} \}, \end{aligned}$$

where \(\theta _k = (1+s+t)^k \theta _0\) for all \(k\in {{\mathbb {N}}}_0\), \(s = \frac{\sqrt{\beta } - \sqrt{\alpha }}{\sqrt{\alpha }}\), \(t=\frac{\sqrt{\beta }}{\sqrt{\alpha }} \frac{1}{1-\beta }(\beta + \sqrt{\alpha \beta + \beta -\alpha })\) and \(\theta _0 \in {{\mathbb {N}}}\) is a value with \(t\theta _0 > \Lambda + 2\). We recall from the proof of Lemma 5.2 that

$$\begin{aligned} t^2 + \beta = \beta (1+s + t)^2, \quad \frac{s}{s+t} = f(\alpha ,\beta ). \end{aligned}$$
(8)

The set \(M_\Lambda \) collects those positions where the binary expansion of x is required to form a (large) constant block. We hence define

$$\begin{aligned} Q_\Lambda := \{x \in \mathbb {X} \,: \, x_n = 0 \; \text{ for } \text{ all } n \in M_\Lambda \}. \end{aligned}$$

To avoid contributions that come from the complement of \(M_\Lambda \), we introduce the set

$$\begin{aligned} R_\Lambda := \{ x \in \mathbb {X}\,:\, x_1 =0, \, x_{n} x_{n+1} = 10 \; \text{ for } \text{ all } n \in \Lambda {{\mathbb {N}}}\setminus M_\Lambda \}. \end{aligned}$$

Combining both conditions, it is natural to define

$$\begin{aligned} A_{\Lambda }:= Q_\Lambda \cap R_\Lambda . \end{aligned}$$

First, we will show that \(A_{\Lambda } \subset \{ \underline{F} = \alpha , \overline{F} = \beta \}\), so that it suffices to bound the Hausdorff dimension of \(A_{\Lambda }\) from below. Let \(x\in A_{\Lambda }\) with alternation coding \(\tau (x) = (n_i)_{i \in {{\mathbb {N}}}}\). Since the expansion of x is constant on \([(1+s)\theta _k, \theta _{k+1}]\), there exists a corresponding index \(i_k\) such that \( N_{i_k-1} \leqslant \lceil (1+s)\theta _k \rceil \leqslant \lfloor \theta _{k+1} \rfloor \leqslant N_{i_k}. \) In particular,

$$\begin{aligned} n_{i_k} \geqslant \theta _{k+1} - (1+s) \theta _k - 2 = t \theta _k - 2, \end{aligned}$$

and by the assumption on \(\theta _0\) this also implies \(n_{i_k} > \Lambda \). On the other hand, the restriction via \(R_{\Lambda }\) ensures that \(n_{i_k}\) cannot be much larger. More precisely, we have \(n_{i_k} \leqslant t \theta _k + 2\Lambda \) and hence

$$\begin{aligned} n_{i_k} = t \theta _k + O(1), \end{aligned}$$

for every \(k \in {{\mathbb {N}}}\). Note that for all other indices \(i \in {{\mathbb {N}}}\) the defining condition for \(R_\Lambda \) also enforces \(n_i < \Lambda \). Hence, we have \(I_\Lambda = \{i_k: k \in {{\mathbb {N}}}_0 \}\) and obtain

$$\begin{aligned} F_j^{\Lambda }(x) = \frac{1}{N_j^2} \sum _{i_k \leqslant j} t^2 \theta _k^2 + o(1). \end{aligned}$$
(9)

Clearly, this sequence attains its \(\limsup \) along the subsequence with \(j = i_k\) and \(k \in {{\mathbb {N}}}\). Since \(N_{i_k} \sim \theta _{k+1}\), we obtain

$$\begin{aligned} \overline{F}(x) = \limsup _{k \rightarrow \infty } \frac{1}{\theta _{k+1}^2} \sum _{j=0}^{k} t^2 \theta _j^2 = t^2 \sum _{j=1}^\infty \frac{1}{(1+s+t)^{2j}} = \frac{t^2}{(1+s+t)^2 - 1} = \beta , \end{aligned}$$

using the first identity from (8) in the last step. On the other hand, (9) implies that the \(\liminf \) for \(F_j(x)\) is obtained along the subsequence with \(j = i_k -1\) and \(k \in {{\mathbb {N}}}\). Since \(N_{i_k - 1} \sim (1+s) \theta _k\), we get by a similar calculation as before

$$\begin{aligned} \underline{F}(x) = \liminf _{k \rightarrow \infty } \frac{1}{(1+s)^2 \theta _k^2} \sum _{j=0}^{k-1} t^2 \theta _j^2 = \frac{\beta }{(1+s)^2} = \alpha , \end{aligned}$$

using the definition of s in the last step. This completes the proof for the statement that \(A_{\Lambda } \subset \{ \underline{F} = \alpha , \overline{F} = \beta \}\).

In view of Lemma 3.2, one has to compute the upper density of \(M_\Lambda \) in order to obtain a lower bound for the Hausdorff dimension of \(A_{\Lambda }\). Since the elements of \(M_\Lambda \) are accumulated to the left of the positions \(\theta _k\), we have

$$\begin{aligned} \overline{D}(M_\Lambda )&= \limsup _{k \rightarrow \infty } \frac{1}{\theta _k} \# (M_\Lambda \cap [1,\theta _k]) = \limsup _{k \rightarrow \infty } \frac{1}{\theta _k} \sum _{j=0}^{k-1} t \theta _j = t \sum _{j=1}^{\infty } \frac{1}{(1 + s +t)^j} \\ {}&= \frac{t}{s + t} = 1- f(\alpha ,\beta ), \end{aligned}$$

where we have used the second identity from (8) in the last step. Since the points in \(A_{\Lambda }\) are determined precisely for the positions in \(M_\Lambda \cup \Lambda {{\mathbb {N}}}\cup (\Lambda {{\mathbb {N}}}+ 1)\), we get by Lemma 3.2,

$$\begin{aligned} \dim _H A_{\Lambda } \geqslant 1 - \overline{D} \bigl (M_\Lambda \cup \Lambda {{\mathbb {N}}}\cup (\Lambda {{\mathbb {N}}}+ 1) \bigr ) \geqslant f(\alpha ,\beta ) - \frac{2}{\Lambda } \xrightarrow {\Lambda \rightarrow \infty } f(\alpha ,\beta ). \end{aligned}$$

Since \(\dim _H\{ \underline{F} = \alpha , \overline{F} = \beta \} \geqslant \dim _H A_{\Lambda }\), the proof is complete. \(\square \)
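
The geometric mechanism can again be tested numerically. The sketch below (ours, simplified: unit blocks are aggregated in bulk and the \(\Lambda \)-marker conditions from \(R_\Lambda \) are ignored) builds the quantities \(f_m\) and \(N_m\) along the construction and evaluates \(F_m\) just before and just after each long block; the two subsequences approach \(\alpha \) and \(\beta \), respectively.

```python
import math

def extreme_F(alpha, beta, K=40):
    """liminf / limsup of F_m for the point of Proposition 5.4: unit blocks,
    plus one block of length t*theta_k on [(1+s)theta_k, theta_{k+1}],
    where theta_k = (1+s+t)^k (unit blocks counted in bulk)."""
    root = math.sqrt(alpha * beta + beta - alpha)
    s = (math.sqrt(beta) - math.sqrt(alpha)) / math.sqrt(alpha)
    t = math.sqrt(beta / alpha) * (beta + root) / (1 - beta)
    f = N = 0.0
    for k in range(K):
        theta = (1 + s + t) ** k
        f += (1 + s) * theta - N      # unit blocks fill up to (1+s) theta_k
        N = (1 + s) * theta
        low = f / N**2                # F just before the long block
        f += (t * theta) ** 2         # the single long block
        N = (1 + s + t) * theta       # = theta_{k+1}
        high = f / N**2               # F right after the long block
    return low, high

print(extreme_F(0.2, 0.5))   # ~(0.2, 0.5) = (alpha, beta)
```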

Corollary 5.5

Let \(S \subset \Delta \). Then, \(\dim _H \{ (\underline{F},\overline{F}) \in S \} \geqslant \sup _S f(\alpha ,\beta )\).

Proof

For \((\alpha , \beta ) \in S\), we have \(\{ (\underline{F},\overline{F}) \in S \} \supset \{ \underline{F} = \alpha , \overline{F}=\beta \}\) and we hence obtain \(\dim _H\{ (\underline{F},\overline{F}) \in S \} \geqslant f(\alpha ,\beta )\), due to Proposition 5.4. Taking the supremum over S yields the assertion. \(\square \)

6 Upper Bounds

We proceed by establishing an upper bound for the Hausdorff dimension of the set \(\{ \underline{F} = \alpha , \overline{F} = \beta \}\). This is somewhat more involved than proving the lower bound because we now have to account for all mechanisms that lead to this particular range of accumulation points.

Let us fix \(x \in \mathbb {X} {\setminus } \mathcal D\) with \(\tau (x) = (n_i)_{i \in {{\mathbb {N}}}}\) and \(\Lambda \in {{\mathbb {N}}}\) as in the last section. For every \(k \in {{\mathbb {N}}}\), we define

$$\begin{aligned} \ell _k = \ell _k(x,\Lambda ) = \frac{\sum _{i \in I_\Lambda \cap [1,k]} n_i}{N_k}. \end{aligned}$$

This corresponds to the relative density of positions of x (in the region \([1,N_k]\)) that are occupied by large blocks. Naturally, if we enlarge \([1,N_k]\) by an interval that does not contain elements of \(I_\Lambda \), this density decays. It will be useful to cancel this effect in an appropriate way. To this end, we define a sequence \((\varrho _k)_{k \in {{\mathbb {N}}}}\), implicitly dependent on \((x,\Lambda )\), via

$$\begin{aligned} \varrho _k = \frac{\ell _k}{\sqrt{F^\Lambda _k(x)}}, \end{aligned}$$

which we may interpret as a renormalized block density. Indeed, one easily verifies that whenever \(n_i < \Lambda \) for all \(j < i \leqslant k\), it follows that \(\varrho _j = \varrho _k\).

In the following, let

$$\begin{aligned} \eta (\alpha ,\beta ):= \frac{1}{\sqrt{\beta }} (1 - f(\alpha ,\beta )). \end{aligned}$$

In the situation of Lemma 5.2, this may be interpreted as the relative size of the single large block \(n_k\) in the region between \(N_r\) and \(N_k\), normalized by \(\sqrt{\beta }\). The similarity of this interpretation with the definition of \(\varrho _k\) provides some intuition for the following result.

Lemma 6.1

Whenever \(n_k \geqslant \Lambda \) and \(F^{\Lambda }_{k-1}(x) = \alpha < F^{\Lambda }_{k}(x) = \beta \), we can write \(\varrho _k\) as the convex combination,

$$\begin{aligned} \varrho _k = p_k \varrho _{k-1} + (1-p_k) \eta (\alpha ,\beta ), \end{aligned}$$

where

$$\begin{aligned} p_k = \sqrt{\frac{\alpha }{\beta }} \frac{N_{k-1}}{N_k}. \end{aligned}$$

In particular, \(p_k \leqslant N_{k-1}/N_k\).

Proof

First, we write \(\ell _k\) as a convex combination via

$$\begin{aligned} \ell _k = \frac{1}{N_k} \biggl ( \sum _{i \in I_\Lambda \cap [1,k-1]} n_i + n_k \biggr ) = \frac{N_{k-1}}{N_k} \ell _{k-1} + \frac{n_k}{N_k}. \end{aligned}$$

Dividing this relation by \(\sqrt{\beta }\) yields

$$\begin{aligned} \varrho _k = \sqrt{\frac{\alpha }{\beta }} \frac{N_{k-1}}{N_k} \varrho _{k-1} + \frac{1}{\sqrt{\beta }} \frac{n_k}{N_k}. \end{aligned}$$

Using \(n_k = N_k - N_{k-1}\), the last summand may be rewritten as

$$\begin{aligned} \frac{1}{\sqrt{\beta }} \frac{n_k}{N_k} = \frac{1}{\sqrt{\beta }} (1-p_k) \frac{N_k - N_{k-1}}{N_k - \sqrt{\alpha /\beta } N_{k-1}} = (1-p_k) \frac{1}{\sqrt{\beta }} (1 - f(\alpha ,\beta )), \end{aligned}$$

using Lemma 5.3 in the last step. By the definition of \(\eta (\alpha ,\beta )\), this is precisely the claimed expression. \(\square \)
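
For a concrete check of this identity, the following minimal numerical sketch (Python; not part of the proof) treats \(\alpha < \beta \), the ratio \(t = N_{k-1}/N_k\) and \(\ell _{k-1}\) as free sample values and imposes the Lemma 5.3 relation in exactly the form used in the display above. All variable names are ours.

```python
# Numerical sanity check of Lemma 6.1 -- a sketch, not part of the proof.
# Assumptions: t plays the role of N_{k-1}/N_k and ell_prev that of l_{k-1};
# f(alpha, beta) is imposed via the Lemma 5.3 relation used above, namely
#   1 - f(alpha, beta) = (N_k - N_{k-1}) / (N_k - sqrt(alpha/beta) N_{k-1}).
import math
import random

random.seed(1)
for _ in range(10_000):
    alpha = random.uniform(0.01, 0.95)
    beta = random.uniform(alpha + 0.01, 1.0)
    t = random.uniform(0.01, 0.99)             # t = N_{k-1} / N_k
    ell_prev = random.uniform(0.0, 1.0)        # l_{k-1}
    s = math.sqrt(alpha / beta)
    f = 1 - (1 - t) / (1 - s * t)              # Lemma 5.3, solved for f
    eta = (1 - f) / math.sqrt(beta)            # definition of eta(alpha, beta)
    p = s * t                                  # p_k = sqrt(alpha/beta) N_{k-1}/N_k
    ell = t * ell_prev + (1 - t)               # l_k, using n_k / N_k = 1 - t
    rho_prev = ell_prev / math.sqrt(alpha)     # rho_{k-1} = l_{k-1}/sqrt(F_{k-1})
    rho = ell / math.sqrt(beta)                # rho_k = l_k / sqrt(F_k)
    assert math.isclose(rho, p * rho_prev + (1 - p) * eta, rel_tol=1e-9)
    assert p <= t                              # p_k <= N_{k-1} / N_k
print("Lemma 6.1: convex-combination identity confirmed on random samples")
```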

Lemma 6.2

The function \(\eta :\Delta \setminus \{(0,0)\} \rightarrow [1,\infty )\), with

$$\begin{aligned} \eta (\alpha ,\beta ) = \frac{1}{\sqrt{\beta }}(1 - f(\alpha ,\beta )) \end{aligned}$$

is continuous on its domain. It is increasing in \(\alpha \) and decreasing in \(\beta \). In particular,

$$\begin{aligned} \inf \{ \eta (\gamma , \delta ): \alpha \leqslant \gamma \leqslant \delta \leqslant \beta \} = \eta (\alpha ,\beta ), \end{aligned}$$

for all \((\alpha ,\beta ) \in \Delta \setminus \{(0,0)\}\).

Sketch of proof

A short calculation shows that

$$\begin{aligned} \eta (\alpha ,\beta ) = \frac{\sqrt{\beta } + \sqrt{\alpha }}{\sqrt{\alpha \beta + \beta - \alpha } + \sqrt{\alpha \beta }}, \end{aligned}$$

which can be checked to have the required properties. \(\square \)
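
Since the proof is only sketched, the following short numerical check (Python) samples the closed-form expression on a grid and confirms the stated monotonicity. It is an illustration under the formula above, not a substitute for the calculation.

```python
# Numerical check of the monotonicity in Lemma 6.2 -- a sketch, assuming the
# closed-form expression for eta derived in the sketch of proof above.
import math

def eta(a, b):
    # eta(alpha, beta) = (sqrt(b) + sqrt(a)) / (sqrt(a*b + b - a) + sqrt(a*b))
    return (math.sqrt(b) + math.sqrt(a)) / (
        math.sqrt(a * b + b - a) + math.sqrt(a * b)
    )

n = 200
grid = [i / n for i in range(1, n + 1)]                 # avoids (0, 0)
for a in grid:
    for b in grid:
        if a > b:
            continue                                    # domain: alpha <= beta
        if a + 1 / n <= b:
            assert eta(a + 1 / n, b) >= eta(a, b) - 1e-12   # increasing in alpha
        if b + 1 / n <= 1:
            assert eta(a, b + 1 / n) <= eta(a, b) + 1e-12   # decreasing in beta
        assert eta(a, b) >= 1 - 1e-12       # minimum 1, attained along alpha = 0
print("monotonicity of eta confirmed on the grid")
```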

Proposition 6.3

Assume that \(\underline{F}(x) = \alpha \) and \(\overline{F}(x) = \beta \) for some \((\alpha ,\beta ) \in \Delta {\setminus } \{(0,0)\}\). Then,

$$\begin{aligned} \liminf _{k \rightarrow \infty } \varrho _k \geqslant \eta (\alpha ,\beta ). \end{aligned}$$

Proof

By Lemma 5.1, we have \(\liminf _{m \rightarrow \infty } F_m^\Lambda (x) = \alpha \) and \(\limsup _{m \rightarrow \infty } F_m^\Lambda (x) = \beta \). Given \(\varepsilon > 0\), let us define \(\beta _{\varepsilon } = \min \{1,\beta + \varepsilon \}\) and \(\alpha _{\varepsilon } = \max \{0,\alpha - \varepsilon \}\). By assumption, we have \(F_k^\Lambda (x) \in (\alpha _{\varepsilon },\beta _{\varepsilon })\) for all large enough \(k \in {{\mathbb {N}}}\). For such k and \(\gamma = F^\Lambda _{k-1}(x)\), \(\delta = F^\Lambda _k(x)\), we distinguish three cases:

(1) If \(n_k < \Lambda \), we have \(\varrho _k = \varrho _{k-1}\).

(2) If \(n_k \geqslant \Lambda \) but \(\gamma \geqslant \delta \), we get \(\varrho _k \geqslant \varrho _{k-1}\), since \(\ell _k \geqslant \ell _{k-1}\) and \(\delta \leqslant \gamma \).

(3) If \(n_k \geqslant \Lambda \) and \(\gamma < \delta \), we have \(\varrho _k = p_k \varrho _{k-1} + (1-p_k)\eta (\gamma ,\delta )\) and \(p_k \leqslant N_{k-1}/N_k\), by Lemma 6.1.

Due to Lemma 6.2, we have \(\eta (\gamma ,\delta ) \geqslant \eta (\alpha _{\varepsilon },\beta _{\varepsilon })\) whenever \(\gamma < \delta \). Going through the three cases, we thereby find that \(\varrho _{k-1} \geqslant \eta (\alpha _{\varepsilon },\beta _{\varepsilon })\) implies \(\varrho _k \geqslant \eta (\alpha _{\varepsilon },\beta _{\varepsilon })\). By the continuity of \(\eta \), the claim follows as soon as \(\varrho _{k-1} \geqslant \eta (\alpha _{\varepsilon },\beta _{\varepsilon })\) for some k. Let us therefore assume that there is some \(k_0 \in {{\mathbb {N}}}\) with \(\varrho _k <\eta (\alpha _{\varepsilon },\beta _{\varepsilon })\) for all \(k \geqslant k_0\). By assumption, the sequence \((F_k^{\Lambda }(x))_{k \in {{\mathbb {N}}}}\) has several accumulation points, and hence the third case needs to occur infinitely often. In each such case, note that

$$\begin{aligned} \varrho _k \geqslant p_k \varrho _{k-1} + (1- p_k)\eta (\alpha _{\varepsilon },\beta _{\varepsilon }) \end{aligned}$$

and thereby

$$\begin{aligned} \varrho _k -\eta (\alpha _{\varepsilon },\beta _{\varepsilon }) \geqslant p_k(\varrho _{k-1} -\eta (\alpha _{\varepsilon },\beta _{\varepsilon })). \end{aligned}$$

Since we have assumed that \(\varrho _k\) remains below \(\eta (\alpha _{\varepsilon },\beta _{\varepsilon })\), this means that

$$\begin{aligned} |\varrho _k - \eta (\alpha _{\varepsilon },\beta _{\varepsilon })| \leqslant p_k |\varrho _{k-1} - \eta (\alpha _{\varepsilon },\beta _{\varepsilon }) |. \end{aligned}$$

Note that \(\gamma = F_{k-1}^\Lambda (x) < F_k^\Lambda (x) = \delta \) forces \(p_k \leqslant N_{k-1}/N_k\) to be bounded above by some constant \(c(\delta ) < 1\); compare the proof of Lemma 4.1. Restricting to those k such that \(\delta> \beta /2 > 0\), we can further assume that there is a uniform \(p < 1\) with \(c(\delta ) \leqslant p\) and hence

$$\begin{aligned} |\varrho _k - \eta (\alpha _{\varepsilon },\beta _{\varepsilon })| \leqslant p |\varrho _{k-1} -\eta (\alpha _{\varepsilon },\beta _{\varepsilon }) |. \end{aligned}$$
(10)

Since \(\varrho _k\) is non-decreasing in each of the three cases (as long as it remains below \(\eta (\alpha _{\varepsilon },\beta _{\varepsilon })\)), the distance of \(\varrho _k\) to \(\eta (\alpha _{\varepsilon },\beta _{\varepsilon })\) is non-increasing, and it decays exponentially along a subsequence due to (10). It thereby follows that \(\lim _{k \rightarrow \infty } \varrho _k = \eta (\alpha _{\varepsilon },\beta _{\varepsilon })\). Hence, we have in every case

$$\begin{aligned} \liminf _{k \rightarrow \infty } \varrho _k \geqslant \eta (\alpha _{\varepsilon },\beta _{\varepsilon }) \xrightarrow {\varepsilon \rightarrow 0} \eta (\alpha ,\beta ), \end{aligned}$$

which finishes the proof. \(\square \)
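
The convergence mechanism in the last part of the proof can be illustrated with toy data: the distance to \(\eta (\alpha _{\varepsilon },\beta _{\varepsilon })\) never increases and contracts by a uniform factor \(p < 1\) infinitely often, so it tends to zero. The following sketch (Python) uses invented values for \(p\) and for the times at which the third case occurs.

```python
# Toy illustration of the contraction argument around (10); the value of p
# and the times at which case (3) occurs are invented for this illustration.
import random

random.seed(2)
p = 0.7              # uniform bound on p_k along the relevant subsequence
d = 1.0              # models the distance |rho_{k_0} - eta(alpha_eps, beta_eps)|
for k in range(1, 101):
    if k % 5 == 0:   # case (3) occurs infinitely often (here: every fifth k)
        d *= random.uniform(0.0, p)   # contraction as in (10)
    # in cases (1) and (2) the distance does not increase, so d is unchanged
print(f"distance after 100 steps: {d:.3e}")   # exponentially small
```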

For every x, let the upper density of large blocks be given by

$$\begin{aligned} D_\Lambda (x):= \limsup _{m \rightarrow \infty } \frac{1}{N_m} \sum _{i \in [1,m] \cap I_\Lambda } n_i = \limsup _{m \rightarrow \infty } \ell _m(x,\Lambda ). \end{aligned}$$
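
For illustration, the next sketch (Python) computes the finite-stage densities \(\ell _m(x,\Lambda )\) for a toy alternation coding; the coding values are invented, `Lam` stands for \(\Lambda \), and \(D_\Lambda (x)\) is the \(\limsup \) of the printed sequence for the full (infinite) coding.

```python
# Finite-stage block densities l_m for a toy alternation coding -- a sketch;
# the coding below is an arbitrary example, not derived from an actual x.
from itertools import accumulate

def block_density(coding, Lam):
    """Return l_m = (sum of the large blocks among n_1..n_m) / N_m for each m."""
    N = list(accumulate(coding))                        # N_m = n_1 + ... + n_m
    large = accumulate(n if n >= Lam else 0 for n in coding)
    return [s / Nm for s, Nm in zip(large, N)]

coding = [1, 2, 1, 50, 2, 1, 100, 3, 1, 200]            # toy coding (n_i)
print([round(l, 3) for l in block_density(coding, Lam=10)])
```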

From Proposition 6.3 we can infer the following structural property.

Proposition 6.4

Let \(S \subset \Delta \). Then, for every \(x \in \bigl \{ (\underline{F},\overline{F}) \in S \bigr \}\) and for every \(\Lambda \in {{\mathbb {N}}}\),

$$\begin{aligned} D_\Lambda (x) \geqslant 1 - \sup \{f(\alpha ,\beta ): (\alpha ,\beta ) \in S \}. \end{aligned}$$

Proof

Since \(f(0,0) = 1\) by convention, the lower bound is trivial if \((0,0) \in S\). We can hence restrict to the case \(S \subset \Delta {\setminus } \{(0,0)\}\). Let x be such that \(\underline{F}(x) = \alpha \) and \(\overline{F}(x) = \beta \) with \((\alpha ,\beta ) \in S\). Take an increasing subsequence \((k_m)_{m \in {{\mathbb {N}}}}\) such that \(\lim _{m \rightarrow \infty } F_{k_m}^\Lambda (x) = \beta \). Then, by Proposition 6.3,

$$\begin{aligned} D_\Lambda (x) \geqslant \liminf _{m \rightarrow \infty } \sqrt{F_{k_m}^\Lambda (x)} \varrho _{k_m}(x) \geqslant \sqrt{\beta } \, \eta (\alpha ,\beta ) = 1 - f(\alpha ,\beta ) \geqslant 1 - \sup _S f(\alpha ,\beta ). \end{aligned}$$

Since \(\Lambda \in {{\mathbb {N}}}\) was arbitrary, this is the desired statement. \(\square \)

Before we proceed, let us recall a standard estimate due to Billingsley [5].

Lemma 6.5

(Billingsley) Let \(\nu \) be a probability measure on \(\mathbb {X}\) and \(c > 0\). Then,

$$\begin{aligned} \dim _H \{x \in \mathbb {X}: \underline{d}_{\nu }(x) \leqslant c \} \leqslant c. \end{aligned}$$

This can be used to estimate the Hausdorff dimension of the sets on which the density function \(D_\Lambda \) is uniformly bounded from below.

Lemma 6.6

For \(0 \leqslant c \leqslant 1\), let B(c) be the set

$$\begin{aligned} B(c) = \{ x \in \mathbb {X}\setminus \mathcal D: D_\Lambda (x) \geqslant 1 - c \text{ for } \text{ all } \Lambda \in {{\mathbb {N}}}\}. \end{aligned}$$

Then, \(\dim _H B(c) \leqslant c\).

Proof

We fix large integers \(m,k \in {{\mathbb {N}}}\) and set \(\Lambda = km\). Let \(p = 1/3\) and define a \(\sigma ^m\)-invariant (Bernoulli) measure \(\nu \) on cylinders of length m via

$$\begin{aligned} \nu ([w]) = {\left\{ \begin{array}{ll} p &{} \text{ if } w \in \{0^m,1^m\}, \\ p \frac{1}{2^m - 2} &{} \text{ if } w \in \{0,1 \}^m \setminus \{ 0^m,1^m\}. \end{array}\right. } \end{aligned}$$

This is extended to a product measure via the relation

$$\begin{aligned} \nu ([w_1 \cdots w_n]) = \prod _{i=1}^n \nu ([w_i]), \end{aligned}$$

whenever each \(w_i \in \{0,1\}^m\). Given \(\varepsilon > 0\), for \(x \in B(c)\) with alternation coding \((n_i)_{i \in {{\mathbb {N}}}}\) there are arbitrarily large \(j \in {{\mathbb {N}}}\) such that

$$\begin{aligned} \frac{1}{N_{j}} \sum _{i \in [1,j] \cap I_\Lambda } n_i \geqslant 1-c-\varepsilon . \end{aligned}$$
(11)

Decompose \(x^j = x_1 \cdots x_{N_j}\) into blocks of length m, yielding

$$\begin{aligned} x^j = w_1 \ldots w_{r_j} \widetilde{w}, \end{aligned}$$

where \(w_i \in \{0,1 \}^m\) and \(1\leqslant |\widetilde{w}| \leqslant m\). Then, due to the product definition of \(\nu \),

$$\begin{aligned} \log \nu (C_{N_j}(x)) = \sum _{r=1}^{r_j} \log \nu ([w_r]) + O(1). \end{aligned}$$

Let \(r_j^* \leqslant r_j\) be the number of indices r with \(w_r \in \{0^m,1^m\}\). Then,

$$\begin{aligned} \log \nu (C_{N_j}(x))&= r_j^* \log (p) + (r_j-r_j^*) \log (p/(2^m - 2)) + O(1) \\ {}&= r_j \log p + (r_j^{*} - r_j) \log (2^m - 2) + O(1). \end{aligned}$$

Note that for every \(i \in I_\Lambda \), the number \(k_i\) of words \(w_r\) that are completely contained in the corresponding block of length \(n_i\) satisfies

$$\begin{aligned} k_i \geqslant \left\lfloor \frac{n_i}{m} \right\rfloor - 2. \end{aligned}$$

Since \(n_i \geqslant \Lambda = m k\), we have \(n_i/m \geqslant k\), and we can hence choose k large enough (\(k \geqslant 3/\varepsilon \) suffices) to ensure

$$\begin{aligned} k_i \geqslant \frac{n_i}{m} (1-\varepsilon ). \end{aligned}$$

Hence, using (11), the number \(r_j^*\) is bounded below via

$$\begin{aligned} r_j^* \geqslant (1-\varepsilon ) \frac{1}{m} \sum _{i \in [1,j] \cap I_\Lambda } n_i \geqslant (1-\varepsilon ) \frac{N_j}{m}(1 - c - \varepsilon ). \end{aligned}$$

As a result we get

$$\begin{aligned} \frac{\log \nu (C_{N_j}(x))}{N_j}&\geqslant \frac{r_j}{N_j} \log p - \Bigl ( \frac{r_j}{N_j} - \frac{1}{m}(1-\varepsilon )(1-c-\varepsilon ) \Bigr )\log (2^m - 2) + o(1) \\ {}&\xrightarrow {j \rightarrow \infty } \frac{\log p}{m} - \frac{1}{m}\bigl (1 - (1-\varepsilon )(1-c-\varepsilon )\bigr ) \log (2^m - 2). \end{aligned}$$

Since \(\varepsilon > 0\) was arbitrary, it follows that

$$\begin{aligned} \underline{d}_{\nu }(x) \leqslant \liminf _{j \rightarrow \infty } \frac{\log \nu (C_{N_j}(x))}{- N_j \log 2} \leqslant \frac{c \log (2^m - 2)}{m \log 2} - \frac{\log p}{m \log 2} =: c_m. \end{aligned}$$

Since this holds for all points in B(c) it follows by Lemma 6.5 that

$$\begin{aligned} \dim _H B(c) \leqslant c_m \xrightarrow {m \rightarrow \infty } c, \end{aligned}$$

which indeed implies that \(\dim _H B(c) \leqslant c\). \(\square \)
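
As a numerical companion to this proof (a sketch; the values of \(m\) and \(c\) below are arbitrary samples), one can check that \(\nu \) indeed defines a probability measure on length-\(m\) cylinders and that \(c_m \rightarrow c\):

```python
# Numerical companion to Lemma 6.6 -- a sketch; m and c are sample values.
import math

p = 1 / 3   # the parameter p from the proof

def nu_total_mass(m):
    """Total nu-mass of all length-m cylinders: 2p + (2^m - 2) * p/(2^m - 2)."""
    return 2 * p + (2**m - 2) * (p / (2**m - 2))

def c_m(c, m):
    """The bound c_m = c*log(2^m - 2)/(m*log 2) - log(p)/(m*log 2)."""
    return (c * math.log(2**m - 2) - math.log(p)) / (m * math.log(2))

assert abs(nu_total_mass(5) - 1.0) < 1e-12      # nu is a probability measure
for m in (5, 10, 50, 200):
    print(m, round(c_m(0.4, m), 4))             # tends to c = 0.4 as m grows
```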

Corollary 6.7

For \(S \subset \Delta \), we have \(\dim _H \bigl \{ (\underline{F},\overline{F}) \in S \bigr \} \leqslant \sup _S f(\alpha ,\beta )\).

Proof

Let \(c = \sup _S f(\alpha ,\beta )\). Due to Proposition 6.4, we have \(D_\Lambda (x) \geqslant 1 -c\) for all \(x \in \bigl \{ (\underline{F},\overline{F}) \in S \bigr \}\) and \(\Lambda \in {{\mathbb {N}}}\). That is, \(\bigl \{ (\underline{F},\overline{F}) \in S \bigr \} \subset B(c)\) in the notation of Lemma 6.6, implying that \(\dim _H \bigl \{ (\underline{F},\overline{F}) \in S \bigr \} \leqslant \dim _H B(c) \leqslant c \). \(\square \)

Proof of Theorem 1.5

The lower bound in Theorem 1.5 is given in Corollary 5.5 and the upper bound is provided by Corollary 6.7. \(\square \)