1 Introduction

For subsets \({{\mathcal {B}}}\subseteq {\mathbb {N}}\setminus \{1\}\) denote by \({{\mathcal {M}}}_{{\mathcal {B}}}:=\bigcup _{b\in {{\mathcal {B}}}}b{\mathbb {Z}}\) the set of all multiples of \({{\mathcal {B}}}\). The density \(d({{\mathcal {M}}}_{{\mathcal {B}}}):=\lim _{N\rightarrow \infty }N^{-1}{\text {card}}({{{\mathcal {M}}}_{{\mathcal {B}}}}\cap [1:N])\) exists in many cases, in particular if \({{\mathcal {B}}}\) is thin, i.e. if \(\sum _{b\in {{\mathcal {B}}}}1/b<\infty \), but Besicovitch [1] provided examples where it does not exist. A bit later, Davenport and Erdős [3] proved that the logarithmic density\(^{1}\) of \({{\mathcal {M}}}_{{\mathcal {B}}}\) always exists and coincides with the lower density \({\underline{d}}({{\mathcal {M}}}_{{\mathcal {B}}})\). For more background on this see [9, Sec. 2.5].
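
For a finite set \({{\mathcal {B}}}\) the counting quantity \(N^{-1}{\text {card}}({{\mathcal {M}}}_{{\mathcal {B}}}\cap [1,N])\) can be evaluated directly. The following Python sketch is meant only as an illustration of the definition; the function names `multiples_count` and `empirical_density` are ours and are not taken from the cited literature.

```python
from fractions import Fraction

def multiples_count(B, N):
    """Number of n in [1, N] that are divisible by at least one b in B."""
    return sum(1 for n in range(1, N + 1) if any(n % b == 0 for b in B))

def empirical_density(B, N):
    """Exact value of card(M_B cap [1, N]) / N as a fraction."""
    return Fraction(multiples_count(B, N), N)
```

For a finite \({{\mathcal {B}}}\) the set \({{\mathcal {M}}}_{{\mathcal {B}}}\) is periodic, so the empirical value at \(N={\text {lcm}}({{\mathcal {B}}})\) already equals the density; the interesting phenomena discussed here only occur for infinite \({{\mathcal {B}}}\).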

Recurrence properties of the sequence \(\eta =1_{{{\mathcal {F}}}_{{\mathcal {B}}}}:=1_{{\mathbb {Z}}\setminus {{\mathcal {M}}}_{{\mathcal {B}}}}\in \{0,1\}^{\mathbb {Z}}\) can be studied using dynamical systems theory, see in particular [9]. To that end denote by S the left shift on \(\{0,1\}^{\mathbb {Z}}\) and restrict this homeomorphism to the closure \(X_\eta \) of \(\{S^n\eta :n\in {\mathbb {Z}}\}\) in \(\{0,1\}^{\mathbb {Z}}\). The sequence \(\eta \) is called quasi-generic for an invariant probability \(\mu \) on \(X_\eta \), if \(N^{-1}\sum _{n=1}^N\delta _{S^n\eta }\) converges weakly to \(\mu \) along some subsequence. There is a distinguished invariant measure \(\mu _\eta \) on \(X_\eta \) (its Mirsky measure) for which \(\eta \) is quasi-generic. Indeed, \(N^{-1}\sum _{n=1}^N\delta _{S^n\eta }\) converges weakly to \(\mu _\eta \) along each subsequence \((N_i)_i\) along which the lower density \({\underline{d}}({{\mathcal {M}}}_{{\mathcal {B}}})\) is attained. Hence \(\eta \) is generic for \(\mu _\eta \) if and only if \(d({{\mathcal {M}}}_{{\mathcal {B}}})\) exists. It follows that for the sets \({{\mathcal {B}}}\) constructed by Besicovitch [1], the point \(\eta \) is quasi-generic for at least one further invariant measure on \(X_\eta \).

In [9] and [12] properties of the topological and measure theoretic dynamical systems \((X_\eta ,S)\) and \((X_\eta ,S,\mu _\eta )\) were characterized in (elementary) number theoretic terms. Combining some of these results, it turns out that Besicovitch’s examples always lead to proximal systems \((X_\eta ,S)\), namely to systems where each closed invariant subset contains the fixed point \(0^{\mathbb {Z}}\), and that such systems have positive topological entropy and host a huge collection of ergodic invariant measures, among them a unique measure of maximal entropy, see Remark 2 in Sect. 4. But there are also plenty of sets \({{\mathcal {B}}}\) for which the system \((X_\eta ,S)\) itself is minimal. In these cases, \(\eta \) is a Toeplitz sequence (see Remark 1),\(^{2}\) and it is hitherto unknown whether there are examples of this type where \(X_\eta \) can host more invariant measures than just the Mirsky measure.\(^{3}\)

Following Hall [11], we call \({{\mathcal {B}}}\) a Besicovitch set, if the density \(d({{\mathcal {M}}}_{{\mathcal {B}}})\) exists. In this note we modify Besicovitch’s construction and prove the following result:\(^{4}\)

Theorem 1

There are sets \({{\mathcal {B}}}\subseteq {\mathbb {N}}\setminus \{1\}\) with the following properties:

  1. i)

    The sequence \(\eta =1_{{{\mathcal {F}}}_{{\mathcal {B}}}}\) is an irregular Toeplitz sequence.

  2. ii)

    The set of shift invariant measures on \(X_\eta \) contains at least one measure of positive entropy.

  3. iii)

    Depending on the details of the construction one can make sure that

    1. (a)

      \({{\mathcal {B}}}\) is not a Besicovitch set and \(\eta \) is quasi-generic for some measure of positive entropy, or

    2. (b)

      \({{\mathcal {B}}}\) is a Besicovitch set, so \(\eta \) is generic for the Mirsky measure, but there is also some measure of positive entropy.

Remark 1

Recall (e.g. from [5, 6]) that a sequence \(\omega \in \{0,1\}^{\mathbb {Z}}\) is a Toeplitz sequence, if for each \(n\in {\mathbb {Z}}\) there exists a positive integer p such that \(\omega _n=\omega _{n+kp}\) for all \(k\in {\mathbb {Z}}\). Given \(p\in {\mathbb {N}}\), denote \(\mathsf{Per}_p(\omega )=\{n\in {\mathbb {Z}}: \omega _n=\omega _{n+kp}\text { for all }k\in {\mathbb {Z}}\}\) and \(\mathsf{Aper}(\omega )={\mathbb {Z}}\setminus \bigcup _{p\in {\mathbb {N}}}\mathsf{Per}_p(\omega )\). With this notation, \(\omega \) is a Toeplitz sequence if and only if \(\mathsf{Aper}(\omega )=\emptyset \). A Toeplitz sequence \(\omega \) is regular, if \(\sup _{p\in {\mathbb {N}}}d(\mathsf{Per}_p(\omega ))=1\) [6, Thm. 2.8], otherwise it is irregular. The reader may have in mind that \(\bigcup _{p=1}^N\mathsf{Per}_p(\omega )\subseteq \mathsf{Per}_{{\text {lcm}}(1,\dots ,N)}(\omega )\).
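
The sets \(\mathsf{Per}_p(\omega )\) can be explored numerically on a finite window. The sketch below is an illustration under strong simplifications: it uses a toy finite set \({{\mathcal {B}}}=\{4,6\}\) (so that \(\eta \) is periodic), and the helper `in_per_p_on_window` (a hypothetical name) only tests \(\omega _n=\omega _{n+kp}\) for \(|k|\leqslant K\), which is a necessary but not a sufficient condition for \(n\in \mathsf{Per}_p(\omega )\).

```python
def eta(n, B):
    """Value eta_n of the indicator of the B-free integers."""
    return 0 if any(n % b == 0 for b in B) else 1

def in_per_p_on_window(n, p, B, K):
    """Check eta_n == eta_{n+kp} for |k| <= K (finite proxy for n in Per_p)."""
    return all(eta(n + k * p, B) == eta(n, B) for k in range(-K, K + 1))

B = [4, 6]  # toy finite set; eta then has period lcm(4, 6) = 12
# Odd positions never meet a multiple of 4 or 6 along steps of size 2.
odd_ok = all(in_per_p_on_window(n, 2, B, 20) for n in range(1, 24, 2))
# Position 2 is not 2-periodic: eta_2 = 1 but eta_4 = 0.
two_ok = in_per_p_on_window(2, 2, B, 20)
# Every position is 12-periodic.
all_ok = all(in_per_p_on_window(n, 12, B, 20) for n in range(12))
```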

If \(\omega \) is a regular Toeplitz sequence, then \((X_\omega ,S)\) is uniquely ergodic and has entropy zero [6, Thm. 2.5]. For irregular Toeplitz sequences \(\omega \) a wide range of different dynamical properties of \((X_\omega ,S)\) is possible, see e.g. [2, 18, 6, Ex. 5.1, 6.1] for specific examples and [4] for the general fact that any topological dynamical system with infinitely many rational continuous eigenvalues is isomorphic (in a very strong Borel sense) to some Toeplitz system. For irregular \({{\mathcal {B}}}\)-free Toeplitz sequences \(\eta \) at least one of these dynamical possibilities is excluded, namely to have a uniquely ergodic system \((X_\eta ,S)\) of positive entropy (cf. [2]), because the Mirsky measure always exists and has entropy zero.

Other examples of irregular \({{\mathcal {B}}}\)-free Toeplitz sequences were provided in [12, Ex. 4.2]. It is not immediately clear from that construction, however, whether those examples may/must possess at least two invariant measures, or whether they even may/must have positive entropy. Here we modify and tune that construction in such a way that we end up with an irregular Toeplitz sequence \(\eta \) for which \((X_\eta ,S)\) is uniquely ergodic. I am indebted to Stanisław Kasjan, who provided a more systematic description of the construction from [12, Ex. 4.2], which was instrumental in proving the following theorem.

Theorem 2

There are sets \({{\mathcal {B}}}\subseteq {\mathbb {N}}\setminus \{1\}\) with the following properties:

  1. i)

    The sequence \(\eta =1_{{{\mathcal {F}}}_{{\mathcal {B}}}}\) is an irregular Toeplitz sequence.

  2. ii)

    \(X_\eta \) is uniquely ergodic, in particular of entropy zero.

Before we turn to the proof of Theorem 1 in Sect. 5, we prove Theorem 2 in Sect. 2, provide a useful “prime number characterization” of those sets \({{\mathcal {B}}}\) for which \(\eta \) is a Toeplitz sequence in Sect. 3, and prove a simplified version of Theorem 1 (existence of at least two invariant measures for which \(\eta \) is quasi-generic) in Proposition 2 of Sect. 4. The proof of one lemma, for which we rely on properties of Kolmogorov complexity, is deferred to Sect. 6.

2 Proof of Theorem 2

The starting point of the construction is a sequence \((P_k)_{k\in {\mathbb {N}}}\) of finite sets of prime numbers satisfying

  1. (I)

    \(P_1=\{2\}\),

  2. (II)

    \(\min P_{k+1}>k\,2^2\, Q_1\dots Q_k\) where \(Q_j:=\prod _{q\in P_j}q\) (in particular \(Q_{k+1}\geqslant 2^2\,3^{k}\)), and

  3. (III)

    \(d({{\mathcal {M}}}_{P_{{k+1}}})\geqslant 1-2^{-({k+3})}\) for \(k\geqslant 1\).\(^{5}\)

It is convenient to denote the elements of \(P_k\) by \(q_1^{(k)},\dots ,q_{t_k}^{(k)}\), so \({\text {card}}(P_k)=t_k\) and \(Q_k=\prod _{s=1}^{t_k}q_s^{(k)}\). Observe that \(t_1=1\) and \(q_1^{(1)}=2\).

Once the numbers \(t_1,t_2,\dots \) are fixed we can choose the second basic ingredient of the construction, namely we fix\(^{6}\)

  1. (IV)

    for each \(k\in {\mathbb {N}}\), a partition of \({\mathbb {N}}\) into pairwise disjoint infinite sets \(R_{(i_1,\dots ,i_k)}\) where \(i_\ell \in \{1,\dots ,t_\ell \}\) for all \(\ell =1,\dots ,k\), and such that \(R_{(1)}=R_{(t_1)}={\mathbb {N}}\) (recall that \(t_1=1\)) and

    $$\begin{aligned} R_{(i_1,\dots ,i_k)}=\bigcup _{s=1}^{t_{k+1}}R_{(i_1,\dots ,i_k,s)}. \end{aligned}$$

After these preliminaries we define positive integers

$$\begin{aligned} c_{k+1}^{(k)}=q_{i_1}^{(1)} q_{i_2}^{(2)}\cdots q_{i_k}^{(k)}\quad \text {for }k\in {\mathbb {N}}\text { such that } k+1\in R_{(i_1,\dots ,i_k)}. \end{aligned}$$

As \(\bigcup _{(i_1,\dots ,i_k)}R_{(i_1,\dots ,i_k)}={\mathbb {N}}\) in view of (IV), this defines the numbers \(c_{k+1}^{(k)}\) for all \(k\in {\mathbb {N}}\). Finally let

$$\begin{aligned} b_1=(q_1^{(1)})^3=2^3,\text { and, for }k\geqslant 1 ,\; b_{k+1}=c_{k+1}^{(k)}Q_{k+1}, \end{aligned}$$

and denote \({{\mathcal {B}}}=\{b_1,b_2,\dots \}\). It follows that

$$\begin{aligned} \ell _k:={\text {lcm}}(b_1,\dots ,b_k)=2^2\prod _{j=1}^k Q_j. \end{aligned}$$
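
The bookkeeping behind the numbers \(b_k\) and the formula for \(\ell _k\) can be checked on a toy example. The sketch below is only an illustration: the prime sets in `P` are far too small to satisfy (II) and (III), and the list `choices` is a hypothetical encoding of the partition from (IV), where `choices[k-1]` holds the (0-based) indices \((i_1,\dots ,i_k)\) with \(k+1\in R_{(i_1,\dots ,i_k)}\).

```python
from math import lcm, prod

def b_sequence(P, choices):
    """Return [b_1, ..., b_{len(choices)+1}] for prime sets P and index tuples."""
    Q = [prod(Pk) for Pk in P]
    bs = [P[0][0] ** 3]                          # b_1 = (q_1^{(1)})^3
    for k, idx in enumerate(choices, start=1):
        # c_{k+1}^{(k)} = q_{i_1}^{(1)} * ... * q_{i_k}^{(k)}
        c = prod(P[level][i] for level, i in enumerate(idx))
        bs.append(c * Q[k])                      # b_{k+1} = c_{k+1}^{(k)} * Q_{k+1}
    return bs

P = [[2], [3, 5], [7, 11]]      # toy P_1, P_2, P_3 (assumption, violates (II), (III))
choices = [(0,), (0, 1)]        # toy index tuples for b_2 and b_3
bs = b_sequence(P, choices)
Q = [prod(Pk) for Pk in P]
ells = [lcm(*bs[:k]) for k in range(1, len(bs) + 1)]
```

In this toy case `ells[k-1]` agrees with \(2^2\prod _{j=1}^kQ_j\) for \(k=1,2,3\), because \(b_1=2^3\) contributes the factor \(2^2\cdot Q_1\) and each \(Q_j\) with \(j\geqslant 2\) enters through \(b_j\).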

Lemma 1

The sequence \(\eta =1_{{{\mathcal {F}}}_{{\mathcal {B}}}}\) is an irregular Toeplitz sequence and \({{\mathcal {B}}}\) is thin, i.e. \(\sum _{b\in {{\mathcal {B}}}}1/b<\infty \).

Proof


Let \(S_k=\{b_1,\dots ,b_k\}\) and define \({{\mathcal {A}}}_{S_k}:=\{\gcd (\ell _k,b):b\in {{\mathcal {B}}}\}\). Observe that

$$\begin{aligned} {{\mathcal {A}}}_{S_k}&=S_k\cup \{\gcd (\ell _k,b_{k+j}):j\geqslant 1\}\\&=S_k\cup \{q_{i_1}^{(1)}\cdots q_{i_k}^{(k)}:\exists j\geqslant 1\text { s.t. }k+j\in R_{(i_1,\dots ,i_k)}\}, \end{aligned}$$

in particular \(\limsup _{k\rightarrow \infty }({{\mathcal {A}}}_{S_k}\setminus S_k)=\emptyset \), so that \(\eta \) is a Toeplitz sequence by [12, Thm. B]. As each set \(R_{(i_1,\dots ,i_k)}\) is infinite, we have indeed

$$\begin{aligned} {{\mathcal {A}}}_{S_k}=S_k\cup P_1\cdot P_2\cdots P_k. \end{aligned}$$

In order to prove that \(\eta \) is irregular it suffices to show that

$$\begin{aligned} \inf _{k\in {\mathbb {N}}}{\underline{d}}\left( {{\mathcal {M}}}_{{{\mathcal {A}}}_{S_k}}\setminus {{\mathcal {M}}}_{{\mathcal {B}}}\right) >0, \end{aligned}$$

see [12, Lem. 4.3] together with [6, Thm. 2.5]. Observe first that \({\overline{d}}({{\mathcal {M}}}_{{\mathcal {B}}})\leqslant \sum _{k=1}^\infty 1/b_k\leqslant 2^{-3}+\sum _{k=1}^\infty 1/Q_{k+1}\leqslant \frac{1}{4}\), because \(b_{k+1}\geqslant Q_{k+1}\geqslant 2^2\,3^{k}\) by (II). This shows in particular that \({{\mathcal {B}}}\) is thin, so that the density \(d({{\mathcal {M}}}_{{\mathcal {B}}})\) exists.

Next, observing (III) and the fact that the \(P_k\) are pairwise disjoint sets of prime numbers,

$$\begin{aligned} \begin{aligned} d({{\mathcal {M}}}_{{{\mathcal {A}}}_{S_k}})&\geqslant d({{\mathcal {M}}}_{P_1\cdot P_2\cdots P_k}) = d({{\mathcal {M}}}_{P_1}\cap {{\mathcal {M}}}_{P_2}\cap \dots \cap {{\mathcal {M}}}_{P_k})\\&=d({{\mathcal {M}}}_{P_1})\cdots d({{\mathcal {M}}}_{P_k})\\&\geqslant \frac{1}{2}\cdot \prod _{j=2}^k(1-2^{-(j+2)}) \geqslant \frac{1}{2}\left( 1-2^{-3}\right) . \end{aligned} \end{aligned}$$

Hence the term in (3) is bounded below by \(\frac{1}{2}-\frac{1}{16}-\frac{1}{4}=\frac{3}{16}\). This shows that \(\eta \) is irregular. \(\square \)
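
The two arithmetic facts used in this proof, namely that densities of sets of multiples of disjoint prime sets multiply and that the final bound equals \(\frac{3}{16}\), can be verified with exact rational arithmetic. The following sketch uses toy prime sets and hypothetical helper names; it relies on the standard identity \(d({{\mathcal {M}}}_P)=1-\prod _{p\in P}(1-1/p)\) for a finite set P of primes.

```python
from fractions import Fraction

def density_of_multiples(P):
    """d(M_P) = 1 - prod_{p in P} (1 - 1/p) for a finite set P of distinct primes."""
    free = Fraction(1)
    for p in P:
        free *= Fraction(p - 1, p)
    return 1 - free

def count_common_multiples(P1, P2, N):
    """Number of n in [1, N] lying both in M_{P1} and in M_{P2}."""
    return sum(1 for n in range(1, N + 1)
               if any(n % p == 0 for p in P1) and any(n % q == 0 for q in P2))

# Lower bound from the proof: (1/2)(1 - 2^{-3}) - 1/4.
lower = Fraction(1, 2) * (1 - Fraction(1, 8)) - Fraction(1, 4)
```

For \(P_1=\{2\}\) and the toy set \(\{3,5\}\), one full period \([1,30]\) contains exactly 7 common multiples, in agreement with \(\frac{1}{2}\cdot \frac{7}{15}=\frac{7}{30}\).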

Lemma 2

If \(k\ell _k\leqslant L<(k+1)\ell _{k+1}\), then

$$\begin{aligned} \sup _{a\in {\mathbb {Z}}}{\text {card}}\left( {{\mathcal {M}}}_{{\mathcal {B}}}\cap [a,a+L)\right) \leqslant L\cdot \left( d({{\mathcal {M}}}_{S_k})+\frac{4}{k}+\frac{1}{b_{k+1}}\right) . \end{aligned}$$

Proof


Abbreviate \(I:=[a,a+L)\). Then

$$\begin{aligned} {\text {card}}\left( {{\mathcal {M}}}_{{\mathcal {B}}}\cap [a,a+L)\right)&\leqslant {\text {card}}\left( {{\mathcal {M}}}_{S_k}\cap I\right) +{\text {card}}\left( b_{k+1}{\mathbb {Z}}\cap I\right) \\&\quad +{\text {card}}\left( {{\mathcal {M}}}_{{{\mathcal {B}}}\setminus S_{k+1}}\cap I\right) , \end{aligned}$$

and, as \({{\mathcal {M}}}_{S_k}\) is periodic with period \(\ell _k\) and as \(1\leqslant L/(k\ell _k)\leqslant L/k\),

$$\begin{aligned} {\text {card}}\left( {{\mathcal {M}}}_{S_k}\cap I\right)&\leqslant L\,d({{\mathcal {M}}}_{S_k})+2\ell _k \leqslant L\cdot \left( d({{\mathcal {M}}}_{S_k})+2/k\right) ,\\ {\text {card}}\left( b_{k+1}{\mathbb {Z}}\cap I\right)&\leqslant [L/b_{k+1}]+1 \leqslant L\cdot \left( 1/b_{k+1}+{1/k}\right) ,\qquad \text { and}\\ {\text {card}}\left( {{\mathcal {M}}}_{{{\mathcal {B}}}\setminus S_{k+1}}\cap I\right)&\leqslant 1{\leqslant L/k}, \end{aligned}$$

where the last estimate is based on the following observation: If \(mb_{k+r},nb_{k+s}\in I=[a,a+L)\) for some \(2\leqslant r<s\) and \(m,n\in {\mathbb {Z}}\), then \(0<|mb_{k+r}-nb_{k+s}|<L\) so that \(\gcd (b_{k+r},b_{k+s})<L\). However,

$$\begin{aligned} \gcd (b_{k+r},b_{k+s}) = \gcd \left( c_{k+r}^{(k+r-1)}Q_{k+r},q_{i_1}^{(1)}q_{i_2}^{(2)}\dots q_{i_{k+s-1}}^{(k+s-1)}Q_{k+s}\right) \geqslant q_{i_{k+r}}^{(k+r)} \end{aligned}$$

where \(k+s\in R_{(i_1,\dots ,i_{k+r})}\), so that, also in view of (II) and (1),

$$\begin{aligned}&\gcd (b_{k+r},b_{k+s}) \\&\quad \geqslant q_{i_{k+r}}^{(k+r)} \geqslant \min P_{k+r}> (k+r-1)\,2^2\,Q_1\cdots Q_{k+r-1} \geqslant (k+1)\ell _{k+1}>L. \end{aligned}$$

\(\square \)
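
The first estimate in the proof above, \({\text {card}}({{\mathcal {M}}}_{S_k}\cap I)\leqslant L\,d({{\mathcal {M}}}_{S_k})+2\ell _k\), uses only the periodicity of the set of multiples of a finite set. The sketch below checks this on a toy set \(S=\{4,6\}\), which is not one of the sets \(S_k\) of the construction; the function names are hypothetical.

```python
from fractions import Fraction
from math import lcm

def density(S):
    """Exact density of M_S, computed over one period lcm(S)."""
    ell = lcm(*S)
    hits = sum(1 for n in range(1, ell + 1) if any(n % b == 0 for b in S))
    return Fraction(hits, ell)

def max_window_count(S, L):
    """sup over a of card(M_S cap [a, a+L)); by periodicity a < lcm(S) suffices."""
    ell = lcm(*S)
    def count(a):
        return sum(1 for n in range(a, a + L) if any(n % b == 0 for b in S))
    return max(count(a) for a in range(ell))

S = [4, 6]
ell = lcm(*S)
d = density(S)
bound_holds = all(max_window_count(S, L) <= L * d + 2 * ell for L in range(1, 41))
```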

Lemma 2 implies

$$\begin{aligned} \limsup _{L\rightarrow \infty }\;\sup _{a\in {\mathbb {Z}}}\frac{1}{L}{\text {card}}\left( {{\mathcal {M}}}_{{\mathcal {B}}}\cap [a,a+L)\right) \leqslant d({{\mathcal {M}}}_{{\mathcal {B}}}). \end{aligned}$$

It follows that

$$\begin{aligned} \liminf _{L\rightarrow \infty }\;\inf _{x\in X_\eta }\frac{1}{L}\sum _{k=0}^{L-1}x_k \geqslant 1-d({{\mathcal {M}}}_{{\mathcal {B}}}). \end{aligned}$$

In particular, \(\mu \{x\in X_\eta :x_0=1\}\geqslant 1-d({{\mathcal {M}}}_{{\mathcal {B}}})\) for each invariant measure \(\mu \) on \(X_\eta \). But in view of [14, Thm. 4] (which owes much to Moody [16]) and the correspondence between the “sets of multiples” and “the cut and project” points of view on \({{\mathcal {B}}}\)-free numbers (see [12], in particular Lemma 4.1), the Mirsky measure is the only invariant measure on \(X_\eta \) which satisfies this inequality. Hence \((X_\eta ,S)\) is uniquely ergodic.

3 Another characterization of the case when \(\eta \) is a Toeplitz sequence

A set \({{\mathcal {B}}}\subseteq {\mathbb {N}}\) is primitive, if no number from \({{\mathcal {B}}}\) divides another one. If \({{\mathcal {B}}}\) is not primitive, there is always a unique primitive subset \({{\mathcal {B}}}'\subseteq {{\mathcal {B}}}\) such that \({{\mathcal {M}}}_{{\mathcal {B}}}={{\mathcal {M}}}_{{{\mathcal {B}}}'}\).

For \(k\in {\mathbb {N}}\) let \({{\mathcal {B}}}/k:=\{\frac{b}{k}: b\in {{\mathcal {B}}}, k\mid b\}\). Observe that \({{\mathcal {B}}}/k=\{1\}\) if and only if \(k\in {{\mathcal {B}}}\), whenever \({{\mathcal {B}}}\) is primitive.
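
For finite sets both notions are easy to compute. The following sketch (with hypothetical function names) extracts the primitive subset \({{\mathcal {B}}}'\) and forms \({{\mathcal {B}}}/k\); it is only meant to make the definitions concrete.

```python
def primitive_subset(B):
    """Elements of B that are not divisible by any other element of B."""
    B = sorted(set(B))
    return [b for b in B if not any(b % a == 0 for a in B if a < b)]

def quotient(B, k):
    """The set B/k = {b/k : b in B, k divides b}, as a sorted list."""
    return sorted(b // k for b in B if b % k == 0)
```

For the primitive toy set \(\{4,6,10\}\) one finds `quotient([4, 6, 10], 2) == [2, 3, 5]`, while `quotient([4, 6, 10], 4) == [1]` reflects that \(4\in {{\mathcal {B}}}\).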

Lemma 3

Suppose that \({{\mathcal {B}}}\) is primitive and let \(k\in {\mathbb {N}}\setminus {{\mathcal {B}}}\). Then \({{\mathcal {B}}}/k\) contains no infinite pairwise coprime subset if and only if there is a finite set of primes \(P_k\) such that \({{\mathcal {B}}}/k\subseteq {{\mathcal {M}}}_{P_k}\).

Proof


\({{\mathcal {B}}}/k\) contains an infinite pairwise coprime subset if and only if \({{\mathcal {B}}}/k\not \subseteq {{\mathcal {M}}}_C\) for all finite sets \(C\subseteq {\mathbb {N}}\setminus \{1\}\) [9, Thm. 3.7],\(^{7}\) and the latter is equivalent to \({{\mathcal {B}}}/k\not \subseteq {{\mathcal {M}}}_P\) for all finite sets P of primes. The claim of the lemma is just the equivalence of the negations of these assertions. \(\square \)

Proposition 1

Suppose that \({{\mathcal {B}}}\) is primitive. The sequence \(\eta =1_{{{\mathcal {F}}}_{{\mathcal {B}}}}\) is a Toeplitz sequence if and only if for every \(k\in {\mathbb {N}}\setminus {{\mathcal {B}}}\) there is a finite set \(P_k\) of primes such that \({{\mathcal {B}}}/k\subseteq {{\mathcal {M}}}_{P_k}\).

Proof


\(\eta \) is a Toeplitz sequence if and only if there are no \(k\in {\mathbb {N}}\) and no infinite pairwise coprime set \({{\mathcal {A}}}\subseteq {\mathbb {N}}\setminus \{1\}\) such that \(k{{\mathcal {A}}}\subseteq {{\mathcal {B}}}\) [12, Thm. B]. As \({{\mathcal {B}}}\) is primitive, there can never be \(k\in {{\mathcal {B}}}\) and an infinite pairwise coprime set \({{\mathcal {A}}}\subseteq {\mathbb {N}}\setminus \{1\}\) such that \(k{{\mathcal {A}}}\subseteq {{\mathcal {B}}}\). Hence \(\eta \) is a Toeplitz sequence if and only if there are no \(k\in {\mathbb {N}}\setminus {{\mathcal {B}}}\) and no infinite pairwise coprime set \({{\mathcal {A}}}\subseteq {\mathbb {N}}\setminus \{1\}\) such that \(k{{\mathcal {A}}}\subseteq {{\mathcal {B}}}\). But \(k{{\mathcal {A}}}\subseteq {{\mathcal {B}}}\) is equivalent to \({{\mathcal {A}}}\subseteq {{\mathcal {B}}}/k\), so that an application of Lemma 3 finishes the proof. \(\square \)

4 Non-uniquely ergodic \({{\mathcal {B}}}\)-free Toeplitz sequences

For each \(\varepsilon >0\), Besicovitch [1] provided an example of a primitive set \(G\subseteq {\mathbb {N}}\setminus \{1\}\) such that the lower asymptotic density \({\underline{d}}({{\mathcal {M}}}_G)<\varepsilon \), while the upper asymptotic density of this set is \({\overline{d}}({{\mathcal {M}}}_G)>\frac{1}{2}\).\(^{8}\)

Remark 2

The set G in Besicovitch’s example contains arbitrarily long intervals [T, 2T), see (0.33) in Hall’s presentation of the proof [11]. Since, by the Bertrand postulate (proved by Tchebichef [17, pp. 371–382]) each such interval contains at least one prime number, the set G contains infinitely many prime numbers. Its “tautification” is a taut set \(G'\subseteq {\mathbb {N}}{\setminus }\{1\}\) for which \({{\mathcal {M}}}_G\subseteq {{\mathcal {M}}}_{G'}\) [9, Thm. 4.5] and \({\varvec{\delta }}({{\mathcal {M}}}_{G'})={\varvec{\delta }}({{\mathcal {M}}}_G)\) [9, Proof of Lemma 4.11].\(^{9}\) The inclusion \({{\mathcal {M}}}_G\subseteq {{\mathcal {M}}}_{G'}\) implies that also \(G'\) contains infinitely many prime numbers, and so the corresponding subshift \(X_{\eta '}=X_{1_{{{\mathcal {F}}}_{G'}}}\) is proximal [9, Thm. B]. As \(G'\) is taut, the subshift \(X_{\eta '}\) is hereditary [13, Thm. 3]. Since \({\varvec{\delta }}({{\mathcal {M}}}_{G'})={\varvec{\delta }}({{\mathcal {M}}}_G)<1\), \(X_{\eta '}\) and \(X_\eta \) have positive topological entropy equal to \(1-{\varvec{\delta }}({{\mathcal {M}}}_G)\) [9, Prop. K and Cor. 1.7].

Our goal is to modify Besicovitch’s construction in several respects by defining primitive sets \({{\mathcal {B}}}\subseteq {\mathbb {N}}\setminus \{1\}\) such that

  • \({{\mathcal {M}}}_{{\mathcal {B}}}\subseteq {{\mathcal {M}}}_G\), so \({\underline{d}}({{\mathcal {M}}}_{{\mathcal {B}}})\leqslant {\underline{d}}({{\mathcal {M}}}_G)<\varepsilon \),

  • \(\eta =1_{{{\mathcal {F}}}_{{\mathcal {B}}}}\in \{0,1\}^{\mathbb {Z}}\) is a Toeplitz sequence,

  • there is at least one invariant measure of positive entropy for which \(\eta \) is not quasi-generic, and

  • depending on details of the construction, \(\eta \) is generic for the Mirsky measure, or it is quasi-generic for some measure of positive entropy (and, of course, for the Mirsky measure).

We start by recalling the essentials of Besicovitch’s construction, following more or less the outline in [11, second part of Thm. 0.1]: Take positive numbers \(\varepsilon ,\varepsilon _i\) \((i=1,2,\dots )\) such that

$$\begin{aligned} \varepsilon<\frac{1}{4}, \quad \sum _{i=1}^\infty \varepsilon _i<\frac{\varepsilon }{2}. \end{aligned}$$

Denote \(E_T:={{\mathcal {M}}}_{[T,2T)}\) and write e(T) for the asymptotic density of \(E_T\). As \(E_T\) is periodic, there are numbers \(\lambda (T)\) such that the mean density of the set \(E_T\) on any interval of more than \(\lambda (T)\) consecutive integers is \(<2e(T)\).

Define integers \(1=T_0<T_1<T_2<T_3<\dots \) so that

$$\begin{aligned}&e(T_1)<\varepsilon _{1}\nonumber \\ T_2>\lambda (T_1),\quad&e(T_2)<\varepsilon _{2}\nonumber \\ T_3>\lambda (T_2),\quad&e(T_3)<\varepsilon _{3}\nonumber \\&\vdots \end{aligned}$$

These inductive choices are possible, because of Erdős’ result [10] that \(\lim _{T\rightarrow \infty }e(T)=0\).\(^{10}\) Observe that, given \(T_1,\dots ,T_k\), the integer \(T_{k+1}\) can be chosen arbitrarily large. We will make use of this freedom of choice in the sequel.\(^{11}\)

Besicovitch’s set G is then defined as

$$\begin{aligned} G=\bigcup _{k=1}^\infty \left[ T_k,2T_{k}\right) \setminus (E_{T_1}\cup \dots \cup E_{T_{k-1}}), \end{aligned}$$

and obviously \([T_k,2T_{k})\subseteq \bigcup _{j=1}^\infty E_{T_j}={{\mathcal {M}}}_G\) for all k. As Besicovitch observed,

$$\begin{aligned} {\overline{d}}({{\mathcal {M}}}_G)&\geqslant \limsup _{k\rightarrow \infty }(2T_k)^{-1}{\text {card}}({{\mathcal {M}}}_G\cap [1,2T_k)) \\&\geqslant \limsup _{k\rightarrow \infty }(2T_k)^{-1}{\text {card}}[T_k,2T_k) = \frac{1}{2}, \end{aligned}$$


$$\begin{aligned} \begin{aligned} {\underline{d}}({{\mathcal {M}}}_G)&\leqslant \liminf _{k\rightarrow \infty }T_k^{-1}{\text {card}}({{\mathcal {M}}}_G\cap [1,T_k)) \\&= \liminf _{k\rightarrow \infty }T_k^{-1}{\text {card}}((E_{T_1}\cup \dots \cup E_{T_{k-1}})\cap [1,T_k))\\&\leqslant \liminf _{k\rightarrow \infty }\sum _{j=1}^{k-1}T_k^{-1}{\text {card}}(E_{T_j}\cap [1,T_k)) \leqslant \liminf _{k\rightarrow \infty }\sum _{j=1}^{k-1}2e(T_j)< \sum _{j=1}^\infty 2\varepsilon _j <\varepsilon . \end{aligned} \end{aligned}$$
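
The combinatorial part of Besicovitch's construction can be reproduced on a very small scale. The sketch below builds G from a toy list of integers \(T_k\); these toy values do not satisfy the constraints (4), so no statement about densities can be read off. The sketch only illustrates how G is assembled and that \([T_k,2T_k)\subseteq {{\mathcal {M}}}_G\). The function names are hypothetical.

```python
def in_E(n, T):
    """True if n lies in E_T, the set of multiples of [T, 2T)."""
    return any(n % m == 0 for m in range(T, 2 * T))

def besicovitch_G(Ts):
    """G = union over k of [T_k, 2T_k) minus (E_{T_1} cup ... cup E_{T_{k-1}})."""
    G = []
    for k, T in enumerate(Ts):
        G.extend(n for n in range(T, 2 * T)
                 if not any(in_E(n, S) for S in Ts[:k]))
    return G

Ts = [2, 5, 12]                 # toy values, chosen only for illustration
G = besicovitch_G(Ts)
# Every integer of [T_k, 2T_k) is a multiple of some element of G.
covered = all(any(n % g == 0 for g in G) for T in Ts for n in range(T, 2 * T))
```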

We now proceed to introduce additional constraints to the choice of the indices \(T_k\) and to construct a set \({{\mathcal {B}}}\subseteq {\mathbb {N}}\setminus \{1\}\) with the following properties:

  1. (I)

    For every \(j\in {\mathbb {N}}\setminus {{\mathcal {B}}}\) there is a finite set \(P_j\) of primes such that \({{\mathcal {B}}}/j\subseteq {{\mathcal {M}}}_{P_j}\) (see also Sect. 3),

  2. (II)

    \({\underline{d}}({{\mathcal {M}}}_{{\mathcal {B}}})<\varepsilon \), and

  3. (III)

    \({\overline{d}}({{\mathcal {M}}}_{{\mathcal {B}}})\geqslant \frac{1}{2}-2\varepsilon \).

To this end assume that integers \(1=T_0<T_1<\dots <T_{k}\), positive integers \(L_1,\dots ,L_k\), and finite sets \(P_1,\dots ,P_{T_k-1}\) of prime numbers are chosen such that (setting \(T_{-1}=1\))

  1. (A)

    the following strengthening of Besicovitch’s constraints (4) is satisfied for \(i=1,\dots ,k\):

    $$\begin{aligned} T_i\geqslant L_i>\lambda (T_{i-1}),\quad e(T_i)<\varepsilon _i, \end{aligned}$$
  2. (B)

    \({\text {card}}(j\cdot {{\mathcal {F}}}_{P_j}\cap [T,2T))\leqslant 2d(j\cdot {{\mathcal {F}}}_{P_j})\cdot T\) for all \(j\in [1,T_{k-1})\) and \(T\geqslant T_k\),

  3. (C)

    \({\{p\in {\mathbb {N}}: p\text { prime, p divides some }r\in [1,2T_k)\}}\subseteq P_j\) for all \(j\in [T_{k-1},T_k)\), and

  4. (D)

    \(d(j\cdot {{\mathcal {F}}}_{P_j})<\varepsilon \cdot 2^{-(j+1)}\) for all \(j\in [1,T_k)\).

Observe first that conditions (A) – (D) are empty and hence trivially satisfied for \(k=0\).

Now we choose \(T_{k+1}\geqslant L_{k+1}>\max \{T_k,\lambda (T_k)\}\) and sets \(P_j\) \((T_k\leqslant j<T_{k+1})\) inductively in such a way that (A) – (D) hold for \(k+1\) instead of k: First we make sure that \(T_{k+1}\) is large enough to satisfy (A) and (B) for \(k+1\). (For property (B) note that the sets \(j\cdot {{\mathcal {F}}}_{P_j}\) are periodic.) Then we choose the additional \(P_j\) big enough such that also (C) and (D) are satisfied for \(k+1\).

For the next step of the construction we fix, for all \(k\in {\mathbb {N}}\), sets \(J_k\subseteq [T_k,T_k+L_k)\) (with additional properties to be specified below), and define

$$\begin{aligned} {{\mathcal {F}}}_{P_j}^*:=\,&{{\mathcal {F}}}_{P_j}\setminus \{1\}\\ F:= \,&\bigcup _{j=1}^\infty j\cdot {{\mathcal {F}}}_{P_j}^*\\ E_j':=\,&{{\mathcal {M}}}_{J_j\setminus F}\quad (j\in {\mathbb {N}})\\ {{\mathcal {B}}}_n:=\,&\bigcup _{k=1}^n\left( J_k\setminus F\right) \setminus \bigcup _{j=1}^{k-1}E_j'\quad (n\in {\mathbb {N}})\\ {{\mathcal {B}}}:=\,&\bigcup _{n=1}^\infty {{\mathcal {B}}}_n. \end{aligned}$$
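
The recursive definition of \({{\mathcal {B}}}_n\) can be traced on toy data. In the sketch below, the sets \(J_k\) and the set `F` are small hypothetical stand-ins (in particular `F` is not computed from prime sets \(P_j\)), and all sets of multiples are truncated to \([1,N]\). The code only illustrates the order of the set operations.

```python
def multiples_of(A, N):
    """M_A cap [1, N] for a finite set A of positive integers."""
    return {n for n in range(1, N + 1) if any(n % a == 0 for a in A)}

def build_B(J, F, N):
    """Return (B_n, [E'_1, ..., E'_n]) with all E'_j truncated to [1, N]."""
    B, E, covered = set(), [], set()
    for Jk in J:
        gens = Jk - F                 # J_k \ F
        B |= gens - covered           # remove elements of earlier E'_j
        Ek = multiples_of(gens, N)    # E'_k = M_{J_k \ F}
        E.append(Ek)
        covered |= Ek
    return B, E

J = [{2, 3}, {5, 6, 7, 8, 9}]   # toy J_1 in [2, 4), J_2 in [5, 10)
F = {3, 7}                      # toy stand-in for F
N = 40
B, E = build_B(J, F, N)
```

On this toy input the resulting set is primitive and \({{\mathcal {M}}}_{{{\mathcal {B}}}_2}\cap [1,N]\) coincides with \((E_1'\cup E_2')\cap [1,N]\).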

Lemma 4

  1. a)

    \({{\mathcal {B}}}\) is primitive by construction.

  2. b)

    \({{\mathcal {B}}}\cap F=\emptyset \) by construction.

  3. c)

    \({{\mathcal {B}}}/j\subseteq {{\mathcal {M}}}_{P_j}\) for every \(j\in {\mathbb {N}}\setminus {{\mathcal {B}}}\).

  4. d)

    \(\eta =1_{{{\mathcal {F}}}_{{\mathcal {B}}}}\) is a Toeplitz sequence.

Proof


a) and b): Obvious.

c) Let \(b\in {{\mathcal {B}}}/j\). Then \(jb\in {{\mathcal {B}}}\), whence \(jb\not \in F\) by assertion b). In particular, \(jb\not \in j\cdot {{\mathcal {F}}}_{P_j}^*\), i.e. \(b\not \in {{\mathcal {F}}}_{P_j}^*\). Hence \(b=1\) or \(b\in {{\mathcal {M}}}_{P_j}\). But \(b\ne 1\) since \(j\not \in {{\mathcal {B}}}\).

d) \(\eta \) is a Toeplitz sequence by Proposition 1 and assertions a) and c). \(\square \)

Lemma 5

  1. a)

    \({{\mathcal {M}}}_{{{\mathcal {B}}}_n}=\bigcup _{j=1}^n E_j'\).

  2. b)

    \({{\mathcal {M}}}_{{\mathcal {B}}}\subseteq {{\mathcal {M}}}_G\), where G is defined in (5).

  3. c)

    \({{\mathcal {M}}}_{{\mathcal {B}}}\cap J_k\supseteq J_k\setminus F\) for all k.

  4. d)

    \({\underline{d}}({{\mathcal {M}}}_{{\mathcal {B}}})<\varepsilon \).

Proof


a) For \(n=1\) we have \({{\mathcal {B}}}_1=J_1\setminus F\), whence \({{\mathcal {M}}}_{{{\mathcal {B}}}_1}=E_1'\). It follows inductively that

$$\begin{aligned} \begin{aligned} {{\mathcal {M}}}_{{{\mathcal {B}}}_{n+1}} = ~&{{\mathcal {M}}}_{{{\mathcal {B}}}_n}\cup {{\mathcal {M}}}_{\left( J_{n+1}\setminus F\right) \setminus {{\mathcal {M}}}_{{{\mathcal {B}}}_n}} = {{\mathcal {M}}}_{{{\mathcal {M}}}_{{{\mathcal {B}}}_n}\cup \left( \left( J_{n+1}\setminus F\right) \setminus {{\mathcal {M}}}_{{{\mathcal {B}}}_n}\right) }\\ =~&{{\mathcal {M}}}_{{{\mathcal {M}}}_{{{\mathcal {B}}}_n}\cup \left( J_{n+1}\setminus F\right) } = {{\mathcal {M}}}_{{{\mathcal {B}}}_n}\cup E_{n+1}' =\bigcup _{j=1}^{n+1}E_j'. \end{aligned} \end{aligned}$$

b) In view of assertion a), \({{\mathcal {M}}}_{{\mathcal {B}}}=\bigcup _{j=1}^\infty E_j'\subseteq \bigcup _{j=1}^\infty E_{T_j}={{\mathcal {M}}}_G\), see (5).

c) \({{\mathcal {M}}}_{{\mathcal {B}}}\cap J_k=\bigcup _{j=1}^\infty E_j'\cap J_k\supseteq \bigcup _{j=1}^\infty \left( J_j\setminus F\right) \cap J_k=J_k\setminus F\).

d) follows from b) and (6). \(\square \)

We will use the following two estimates:

Lemma 6

For all \(k\in {\mathbb {N}}\),

  1. a)

    \({\text {card}}(F\cap [T_k,T_k+L_k))\leqslant \varepsilon L_k\),

  2. b)

    \({\text {card}}\left( \bigcup _{j=1}^{k-1}E_j'\setminus (J_k\setminus F)\cap [T_k,T_k+L_k)\right) \leqslant 2\varepsilon L_k\).

Proof



$$\begin{aligned} \begin{aligned}&{\text {card}}(F\cap [T_k,T_k+L_k)) \leqslant \sum _{j=1}^\infty {\text {card}}([T_k,T_k+L_k)\cap j\cdot {{\mathcal {F}}}_{P_j}^*)\\&\quad = \sum _{j=1}^{T_{k-1}-1}{\text {card}}([T_k,T_k+L_k)\cap j\cdot {{\mathcal {F}}}_{P_j}^*) + \sum _{j=T_{k-1}}^{T_k-1}{\text {card}}([T_k,T_k+L_k)\cap j\cdot {{\mathcal {F}}}_{P_j}^*)\\&\qquad + \sum _{j=T_k}^\infty {\text {card}}([T_k,T_k+L_k)\cap j\cdot {{\mathcal {F}}}_{P_j}^*)\\&\quad \leqslant \sum _{j=1}^{T_{k-1}-1}2d(j\cdot {{\mathcal {F}}}_{P_j}^*)L_k +\sum _{j=T_{k-1}}^{T_k-1}0 +\sum _{j=T_k}^{\infty }0< \sum _{j=1}^{T_{k-1}-1}\varepsilon 2^{-j}L_k<\varepsilon L_k. \end{aligned} \end{aligned}$$

Here the first “0-sum” is due to property (C), and for the second “0-sum” one only needs to observe that \(1\not \in {{\mathcal {F}}}_{P_j}^*\). The final estimate uses property (D).


$$\begin{aligned}&{\text {card}}\left( \bigcup _{j=1}^{k-1}E_j'\setminus (J_k\setminus F)\cap [T_k,T_k+L_k)\right) \\&\quad \leqslant \sum _{j=1}^{k-1}{\text {card}}\left( E_j'\cap [T_k,T_k+L_k)\right) \leqslant \sum _{j=1}^{k-1}2e(T_j)L_k \leqslant 2\varepsilon L_k. \end{aligned}$$

\(\square \)

Proposition 2

There are primitive sets \({{\mathcal {B}}}\) with the following properties:

  1. i)

    The sequence \(\eta =1_{{{\mathcal {F}}}_{{\mathcal {B}}}}\) is a Toeplitz sequence.

  2. ii)

    \({{\mathcal {B}}}\) is not a Besicovitch set.

  3. iii)

    The sequence \(\eta \) is quasi-generic for at least two measures.

Proof


Let \(J_k=[T_k,2T_k)\) for all k, so that \(L_k=T_k\). Lemma 4d) shows that \(\eta \) is a Toeplitz sequence, and Lemma 5c) and Lemma 6a) imply

$$\begin{aligned} {\text {card}}({{\mathcal {M}}}_{{\mathcal {B}}}\cap [1,2T_k))&\geqslant {\text {card}}([T_k,2T_k)\setminus F) \geqslant T_k-{\text {card}}(F\cap [T_k,2T_k))\\&\geqslant (1-\varepsilon )T_k \end{aligned}$$

for every k, so that in particular \({\overline{d}}({{\mathcal {M}}}_{{\mathcal {B}}})>\frac{1}{2}-\varepsilon \). Combined with Lemma 5d) this shows that \({{\mathcal {B}}}\) is not a Besicovitch set and that \(\eta \) is not generic for any measure, so it is quasi-generic for at least two measures. \(\square \)

5 Positive entropy

For the proof of Proposition 2 we made the straightforward choice \(J_k=[T_k,2T_k)\). In order to control the entropy of the measures we construct, we will have to make more subtle choices for the sets \(J_k\subseteq [T_k,2T_k)\), and in order to include also measures, for which \(\eta \) is not quasi-generic, we replace the intervals \([T_k,2T_k)\) by more flexible intervals \([T_k,T_k+L_k)\). The choice of the sets \(J_k\) is based on the following lemma, which might be folklore among specialists, but which I could not locate in the literature. So I provide a proof based on properties of Kolmogorov complexity in Sect. 6.

Denote by \(\Phi :[0,1]\rightarrow [0,\log 2]\), \(\Phi (t)=-t\log _2(t)-(1-t)\log _2(1-t)\), the binary entropy function and, for a word \(w\in \{0,1\}^L\), by \(H_n(w)\) the entropy of the empirical distribution of blocks of length \(n\) in the sample \((w_{[j+1,j+n]})_{j=0,\dots ,L-n}\). (These are all sub-words of \(w\) of length \(n\).)

For \(A\subseteq \{1,\dots ,L\}\) let \(d_L(A):={\text {card}}(A)/L\).
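
The quantities \(\Phi \), \(H_n(w)\) and \(d_L(A)\) are straightforward to compute for concrete words. The following sketch uses logarithms to base 2, as in the displayed formula for \(\Phi \); the function names are hypothetical, and words are represented as Python lists of 0s and 1s.

```python
from collections import Counter
from math import log2

def binary_entropy(t):
    """Phi(t) = -t log2 t - (1 - t) log2(1 - t), with Phi(0) = Phi(1) = 0."""
    if t in (0, 1):
        return 0.0
    return -t * log2(t) - (1 - t) * log2(1 - t)

def block_entropy(w, n):
    """Entropy of the empirical distribution of all sub-words of w of length n."""
    blocks = [tuple(w[j:j + n]) for j in range(len(w) - n + 1)]
    total = len(blocks)
    counts = Counter(blocks)
    return -sum(c / total * log2(c / total) for c in counts.values())

def d_L(A, L):
    """Relative cardinality card(A) / L of a subset A of {1, ..., L}."""
    return len(A) / L
```

For the alternating word of length 16, for instance, the blocks of length 1 are equidistributed, so `block_entropy([0, 1] * 8, 1)` equals 1, whereas a constant word has block entropy 0 for every n.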

Lemma 7

Let \(\varepsilon \in (0,\frac{1}{2})\). There is a constant \(L_\varepsilon >0\) such that for all \(L\geqslant L_\varepsilon \) and \(\gamma \in (0,1/2-\varepsilon )\) there is a word \(w_{L,\gamma }\in \{0,1\}^L\) with the following properties: For each \(n>0\) and each \(\kappa >0\) there is \(\ell _{n,\kappa }>0\) such that, for all sets \(A,B\subseteq \{1,\dots ,L\}\) with \(d_L(A),d_L(B)<\varepsilon \), \(\Phi (d_L(A)),\Phi (d_L(B))<\frac{1}{4}\kappa \) and \(w_{L,\gamma }\cdot 1_{A^c}\cdot 1_B=0\),

$$\begin{aligned}&(\gamma -\varepsilon ) L\leqslant \sum _{i=1}^L(w_{L,\gamma }\cdot 1_{A^c}+1_B)_i\leqslant (\gamma +2\varepsilon )L\quad \text {and} \end{aligned}$$
$$\begin{aligned}&\frac{1}{n}H_n(w_{L,\gamma }\cdot 1_{A^c}+1_B) \geqslant \Phi (\gamma )-\kappa \quad \text {if }L\geqslant \ell _{n,\kappa }. \end{aligned}$$

We now describe how to choose the \(J_k\) in order to get a measure of positive entropy for which \(\eta \) is quasi-generic. So let \(\varepsilon \in (0,\frac{1}{4})\) and choose \(\gamma \in (\varepsilon ,\frac{1}{2}-\varepsilon )\). Fix also some number \(\kappa \in (0,\varepsilon )\). For each \(n\in {\mathbb {N}}\) and all indices k such that \(L_k\geqslant \ell _{n,\kappa }\) we choose a word \(w_k=w_{L_k,\gamma }\in \{0,1\}^{L_k}\) as in Lemma 7.

For any \(w\in \{0,1\}^L\) denote \(J(w):=\{i\in [1,L]:w_i=1\}\). Define the sets \(J_k\) for our construction,

$$\begin{aligned} J_k:=J(w_k)+T_k-1\subseteq [T_k,T_k+L_k). \end{aligned}$$
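
The passage between words and sets, and the perturbed words \(w\cdot 1_{A^c}+1_B\) from Lemma 7, can be made concrete as follows. This is a minimal sketch with hypothetical function names; words are lists indexed from 1 as in the text, and `perturb` assumes \(B\cap (J(w)\setminus A)=\emptyset \), so that the result is again a 0-1 word.

```python
def J(w):
    """J(w) = set of positions i in [1, L] with w_i = 1."""
    return {i for i, wi in enumerate(w, start=1) if wi == 1}

def shifted(w, T):
    """J(w) + T - 1, a subset of [T, T + L)."""
    return {i + T - 1 for i in J(w)}

def perturb(w, A, B):
    """Coordinatewise w * 1_{A^c} + 1_B (assumes B disjoint from J(w) minus A)."""
    return [(0 if i in A else wi) + (1 if i in B else 0)
            for i, wi in enumerate(w, start=1)]

w = [1, 0, 1, 1, 0]
v = perturb(w, {1}, {2})        # delete the 1 at position 1, add a 1 at position 2
```

Under the stated disjointness assumption one has \(J(w\cdot 1_{A^c}+1_B)=(J(w)\setminus A)\cup B\), which is the identity used later when \({{\mathcal {M}}}_{{\mathcal {B}}}\cap [T_k,T_k+L_k)\) is rewritten in terms of \(w_k\).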

Fix a subsequence \((T_{k_i})_i\) for which

  • the sequence \(\left( \frac{1}{T_{k_i}}\sum _{j=0}^{T_{k_i}-1}\delta _{S^j\eta }\right) _i\) converges weakly to some invariant measure \(\nu _1\), and

  • the sequence \(\left( \frac{1}{L_{k_i}}\sum _{j=T_{k_i}}^{T_{k_i}+L_{k_i}-1}\delta _{S^j\eta }\right) _i\) converges weakly to some invariant measure \(\nu _2\).

It follows that


  • if \(L_k=T_k\), the sequence \(\left( \frac{1}{T_{k_i}+L_{k_i}}\sum _{j=0}^{T_{k_i}+L_{k_i}-1}\delta _{S^j\eta }\right) _i\) converges weakly to the invariant measure \(\nu =\frac{1}{2}(\nu _1+\nu _2)\), and

  • if \(L_k/T_k\rightarrow 0\), the sequence \(\left( \frac{1}{T_{k_i}+L_{k_i}}\sum _{j=0}^{T_{k_i}+L_{k_i}-1}\delta _{S^j\eta }\right) _i\) converges weakly to the invariant measure \(\nu _1\).

Without loss of generality we may assume that \((T_{k_i})_i\) is the full sequence \((T_k)_k\) – this just eases the notation.

Lemma 8

We have the following lower bound for the Kolmogorov-Sinai entropy of \((X_\eta ,S,\nu _2)\):

$$\begin{aligned} h_{\nu _2}(S)\geqslant \Phi (\gamma )-4\Phi (2\varepsilon ). \end{aligned}$$


Proof

For each \(k\in {\mathbb {N}}\),

$$\begin{aligned} \begin{aligned} {{\mathcal {M}}}_{{\mathcal {B}}}\cap [T_k,T_k+L_k)&= {{\mathcal {M}}}_{{{\mathcal {B}}}_k}\cap [T_k,T_k+L_k)\\&= \bigcup _{j=1}^{k-1}E_j'\cap [T_k,T_k+L_k)\cup \left( (J_k\setminus F)\cap [T_k,T_k+L_k)\right) \\&= J_k\setminus \left( F\cap [T_k,T_k+L_k)\right) \cup \bigcup _{j=1}^{k-1}\left( E_j'\setminus (J_k\setminus F)\right) \cap [T_k,T_k+L_k)\\&=: (J(w_k)+T_k-1)\setminus (A_k+T_k-1)\cup (B_k+T_k-1)\\&= J(w_k\cdot 1_{A_k^c}+1_{B_k}) \end{aligned}\nonumber \\ \end{aligned}$$

with sets \(A_k,B_k\subseteq [1,L_k]\) such that \(B_k\cap (J(w_k)\setminus A_k)=\emptyset \), to which we want to apply Lemma 7 – with \(2\varepsilon \) instead of \(\varepsilon \) and \(\kappa =4\Phi (2\varepsilon )\). To this end observe that Lemma 6 implies

$$\begin{aligned} d_{L_k}(A_k)= & {} L_k^{-1}{\text {card}}(F\cap [T_k,T_k+L_k)) \leqslant \varepsilon \quad \text {and}\\ d_{L_k}(B_k)= & {} L_k^{-1}{\text {card}}\left( \bigcup _{j=1}^{k-1}E_j'\setminus (J_k\setminus F)\cap [T_k,T_k+L_k)\right) \leqslant 2\varepsilon , \end{aligned}$$

in particular also \(\Phi (d_{L_k}(A_k)),\Phi (d_{L_k}(B_k))<\Phi (2\varepsilon )=\frac{\kappa }{4}\). Hence, Lemma 7 shows that for each \(n\in {\mathbb {N}}\) there is \(k_n>0\) such that, for all \(k\geqslant k_n\),

$$\begin{aligned}&(\gamma -2\varepsilon ) L_k\leqslant \sum _{i=1}^{L_k}(w_k\cdot 1_{A_k^c}+1_{B_k})_i\leqslant (\gamma +4\varepsilon )L_k\quad \text {and} \end{aligned}$$
$$\begin{aligned}&\frac{1}{n}H_n(w_k\cdot 1_{A_k^c}+1_{B_k}) \geqslant \Phi (\gamma )-\kappa . \end{aligned}$$

So fix \(n\in {\mathbb {N}}\). For each cylinder set [u] determined by \(u\in \{0,1\}^n\) we have

$$\begin{aligned} \nu _2([u])= & {} \lim _{k\rightarrow \infty }\frac{1}{L_k}\sum _{\ell =T_k}^{T_k+L_k-n}1_{[u]}(S^\ell \eta )\\= & {} \lim _{k\rightarrow \infty }\frac{1}{L_k}{\text {card}}\{\ell \in [T_k,T_k+L_k-n]:\eta _{[\ell ,\ell +n-1]}=u\}. \end{aligned}$$

It follows from (10) and (12) that \(H_n(\nu _2)\), the entropy of \(\nu _2\) on blocks of length n, can be estimated by

$$\begin{aligned} \frac{1}{n}H_n(\nu _2) \geqslant \Phi (\gamma )-\kappa = \Phi (\gamma )-4\Phi (2\varepsilon ), \end{aligned}$$

so that \(h_{\nu _2}(S)\geqslant \Phi (\gamma )-4\Phi (2\varepsilon )\). \(\square \)

Proof of Theorem 1

(ii) By Lemma 8, \(h_{\nu _2}(S)\geqslant \Phi (\gamma )-4\Phi (2\varepsilon )\) is strictly positive if \(\gamma >\Phi ^{-1}(4\Phi (2\varepsilon ))\), which can easily be achieved for small enough \(\varepsilon >0\).

(i) \(\eta \) is a Toeplitz sequence by Lemma 4d). It is irregular, because \(X_\eta \) is not uniquely ergodic by assertion (ii).

(iii) (a) Choose \(T_k=L_k\). Then \(\eta \) is quasi-generic for the invariant measure \(\nu =\frac{1}{2}(\nu _1+\nu _2)\), and \(h_\nu (S)\geqslant \frac{1}{2}h_{\nu _2}(S)>0\) as above.

(b) Choose \(T_k=k^2L_k\). Then \(\nu _2\) is a measure of positive entropy as before, but the set \({{\mathcal {B}}}\) is Besicovitch, because it is thin:

$$\begin{aligned} \sum _{b\in {{\mathcal {B}}}}\frac{1}{b}\leqslant & {} \sum _{k=1}^\infty \sum _{j=T_k}^{T_k+L_k-1}\frac{1}{j} \leqslant 1+\sum _{k=1}^\infty \sum _{j=T_k+1}^{T_k+L_k}\frac{1}{j} \leqslant 1+\sum _{k=1}^\infty \log \frac{T_k+L_k}{T_k} \leqslant 1\\&+\sum _{k=1}^\infty \log \left( 1+\frac{1}{k^2}\right) <\infty . \end{aligned}$$
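
The convergence of the last series follows from the elementary bound \(\log (1+x)\leqslant x\) for \(x\geqslant 0\). Spelled out:

$$\begin{aligned} \sum _{k=1}^\infty \log \left( 1+\frac{1}{k^2}\right) \leqslant \sum _{k=1}^\infty \frac{1}{k^2}=\frac{\pi ^2}{6}<\infty . \end{aligned}$$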

\(\square \)
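
The positivity condition in part (ii) can be checked numerically. The following is a minimal sketch under assumptions made here: the function names and the sample values \(\varepsilon =0.001\), \(\gamma =0.3\) are hypothetical and only illustrate how small \(\varepsilon \) has to be relative to \(\gamma \).

```python
import math


def binary_entropy(x):
    """Binary entropy Phi(x) in bits, with Phi(0) = Phi(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def entropy_lower_bound(gamma, eps):
    """Right-hand side Phi(gamma) - 4 * Phi(2 * eps) of Lemma 8."""
    return binary_entropy(gamma) - 4.0 * binary_entropy(2.0 * eps)


def gamma_threshold(eps, tol=1e-12):
    """Bisection for Phi^{-1}(4 * Phi(2 * eps)) on [0, 1/2], where Phi is increasing."""
    target = 4.0 * binary_entropy(2.0 * eps)
    if target >= 1.0:
        raise ValueError("eps too large: 4 * Phi(2 * eps) >= 1")
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if binary_entropy(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi


eps = 0.001                               # hypothetical sample value
threshold = gamma_threshold(eps)          # smallest gamma giving a positive bound
bound = entropy_lower_bound(0.3, eps)     # lower bound for gamma = 0.3
```

Any \(\gamma \) above `threshold` that is still admissible in the construction yields a strictly positive lower bound.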

6 On Kolmogorov complexity and entropy

Very loosely speaking, the Kolmogorov complexity C(w) of a word \(w\in \{0,1\}^*\) is the length of the shortest binary code that can serve as a program for a universal Turing machine to print the word w on its output tape and then to stop. Of course this definition depends on the choice of the particular Turing machine, but it can be shown that for any two different universal Turing machines there exists a constant such that the difference of complexities defined with respect to these two machines does not exceed this constant for any word w of any length. The monograph [15] provides a precise and detailed introduction to Kolmogorov complexity and other variants of algorithmic complexity and their relation to entropy and coding, and we will refer to notation and results from this book throughout this section.
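
Kolmogorov complexity itself is not computable, but the length of a compressed encoding gives a crude, machine-dependent upper-bound proxy that conveys the idea. The following sketch is only an illustration under that assumption: the use of `zlib` and the helper name `compressed_length` are choices made here, not notions from [15] or from the arguments below.

```python
import random
import zlib


def compressed_length(data: bytes) -> int:
    """Length in bytes of a zlib encoding of data (a rough stand-in for C(w))."""
    return len(zlib.compress(data, 9))


n = 4096
periodic = bytes([0b01010101]) * n                     # highly regular word
rng = random.Random(0)                                 # fixed seed for reproducibility
noisy = bytes(rng.getrandbits(8) for _ in range(n))    # pseudo-random word

short = compressed_length(periodic)   # small: the pattern has a short description
long_ = compressed_length(noisy)      # close to n: no visible structure to exploit
```

The regular word compresses to a few dozen bytes while the pseudo-random one does not shrink, which mirrors the informal meaning of low and high complexity.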

A general pitfall when dealing with algorithmic complexity is that (in)equalities which one might expect at first sight hold only up to a constant or even a logarithmic error term (logarithmic in the word length). One reason is that the transitions between consecutive words on the same input tape of the Turing machine must be recognizable; another is that sometimes the word length must be provided as additional information to the Turing machine to make the intended algorithm work. This can be dealt with properly by introducing variants of Kolmogorov complexity like the prefix complexity K(w) in [15, Sec. 3.1]. It should not come as a surprise that C(w) and K(w) differ only by a term that is logarithmic in the word length. As logarithmic terms do not influence our arguments, we will provide only “naive” proofs whenever complexity is involved.

Recall that \(\Phi \) denotes the binary entropy function. The following lemma is folklore:

Lemma 9

Let \(\varepsilon >0\). There is a constant \(L_\varepsilon \) such that for each \(L\geqslant L_\varepsilon \) and each \(\gamma \in (0,1/2-\varepsilon )\) there is some \(w_{L,\gamma }\in \{0,1\}^L\) such that

$$\begin{aligned} \gamma L\leqslant \sum _{i=1}^L(w_{L,\gamma })_i\leqslant (\gamma +\varepsilon )L \quad \text {and}\quad C(w_{L,\gamma })\geqslant \Phi (\gamma ) L. \end{aligned}$$


Proof

For large enough L (“large” depending only on \(\varepsilon \)) we can fix \(k\in {\mathbb {N}}\) such that \(\gamma +\varepsilon /2<k/L<\gamma +\varepsilon \). Hence there are at least \(\binom{L}{k}\) words \(w\in \{0,1\}^L\) with \((\gamma +\varepsilon /2) L\leqslant \sum _{i=1}^Lw_i\leqslant (\gamma +\varepsilon )L\). At least one of these words has complexity \(C(w)\geqslant \log _2\binom{L}{k}-1\) [15, Thm. 2.2.1], and one can estimate that this is bounded from below by \(\Phi (\gamma )L\) when \(L\geqslant L_\varepsilon \) for some suitable \(L_\varepsilon \) (see Footnote 12). \(\square \)
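
The final estimate of this proof can be checked for concrete numbers. The following is a minimal sketch, assuming the hypothetical sample values \(\gamma =0.2\), \(\varepsilon =0.1\), \(L=1000\) and \(k=260\); the function names are chosen here for illustration.

```python
import math


def binary_entropy(x):
    """Binary entropy Phi(x) in bits, with Phi(0) = Phi(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def counting_bound(L, k):
    """log2 of binom(L, k), minus 1: the complexity lower bound used in the proof."""
    return math.log2(math.comb(L, k)) - 1.0


# Hypothetical sample parameters.
gamma, eps, L = 0.2, 0.1, 1000
k = 260                                  # gamma + eps/2 < k/L < gamma + eps

lhs = counting_bound(L, k)               # guaranteed complexity of some word
rhs = binary_entropy(gamma) * L          # target bound Phi(gamma) * L
```

Here `lhs` exceeds `rhs` by a wide margin because \(\Phi (k/L)\) is noticeably larger than \(\Phi (\gamma )\), which absorbs the logarithmic correction in \(\log _2\binom{L}{k}\).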

We fix some notation.

  • For \(0<n<L\) let \(m:=[(L-(n-1))/n]\) so that \(L':=(m+1)n-1\leqslant L<(m+2)n-1\).

  • Let \(0<n<L\) and \(w\in \{0,1\}^L\).

    (1) For \(s\in \{0,\dots ,n-1\}\) denote by \(H_n^s(w)\) the entropy of the empirical distribution of blocks of length n in the sample \((w_{[jn+s+1,jn+s+n]})_{j=0,\dots ,m-1}\). (These are the non-overlapping sub-words of w with length n starting at position s, except possibly for the last one.)

    (2) Recall that \(H_n(w)\) denotes the entropy of the empirical distribution of blocks of length n in the sample \((w_{[j+1,j+n]})_{j=0,\dots ,L-n}\).

Lemma 10

\(H_n(w)\geqslant \frac{1}{n}\sum _{s=0}^{n-1}H_n^s(w)-q_n(L)\), where \(\lim _{L\rightarrow \infty }q_n(L)=0\) for each fixed n.


Proof

Denote by \(w'\) the restriction of the word w to the indices \([1,L']\). Then the collection of length-n subwords of \(w'\) is the disjoint union of the samples from item (1), so that \(H_n(w')\geqslant \frac{1}{n}\sum _{s=0}^{n-1}H_n^s(w)\), because the entropy function (as a function on probability vectors) is concave. So it remains to estimate the difference \(H_n(w')-H_n(w)\). As \(L-L'<n\), a crude estimate can use the fact that the relative frequencies of any block \(u\in \{0,1\}^n\) in w and \(w'\) can differ by at most \((n-1)/L'<n/((m+1)n)=1/(m+1)\). Hence, the contribution of each single block to the entropy can change by at most \(\varphi (1/(m+1))\), where we use that the function \(\varphi (x)=-x\log _2x\) is concave and increasing on the interval \([0,e^{-1}]\). It follows that \(|H_n(w')-H_n(w)|\leqslant 2^n\varphi (1/(m+1))\leqslant 2^n\varphi (\frac{n}{L-(n-1)})=:q_n(L)\) and \(q_n(L)\rightarrow 0\) as \(L\rightarrow \infty \). \(\square \)
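
The quantities \(H_n(w)\), \(H_n^s(w)\) and \(q_n(L)\) are easy to compute for a concrete word. The following is a minimal sketch for experimenting with Lemma 10: the function names, the seed and the sample word (independent bits with density 0.3) are hypothetical choices made here. Python slices are 0-based, so the block \(w_{[j+1,j+n]}\) corresponds to `w[j:j + n]`.

```python
import math
import random
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits) of the empirical distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def H_sliding(w, n):
    """H_n(w): all overlapping blocks w[j+1 .. j+n], j = 0, ..., L-n."""
    return entropy(Counter(tuple(w[j:j + n]) for j in range(len(w) - n + 1)))


def H_offset(w, n, s):
    """H_n^s(w): the m non-overlapping blocks w[jn+s+1 .. jn+s+n], j = 0, ..., m-1."""
    m = (len(w) - (n - 1)) // n
    return entropy(Counter(tuple(w[j * n + s:j * n + s + n]) for j in range(m)))


def q(n, L):
    """Error term q_n(L) = 2^n * phi(n / (L - (n-1))) with phi(x) = -x log2 x."""
    x = n / (L - (n - 1))
    return 2 ** n * (-x * math.log2(x))


rng = random.Random(1)                               # fixed seed, hypothetical sample
L, n = 1000, 3
w = [1 if rng.random() < 0.3 else 0 for _ in range(L)]

m = (L - (n - 1)) // n
w_prime = w[:(m + 1) * n - 1]                        # restriction of w to [1, L']
avg = sum(H_offset(w, n, s) for s in range(n)) / n   # (1/n) * sum over s of H_n^s(w)
```

With these definitions, `H_sliding(w_prime, n) >= avg` reflects the concavity step, and `H_sliding(w, n) >= avg - q(n, L)` is the statement of the lemma for this particular word.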

Sketch of a proof of Lemma 7 using Kolmogorov complexity

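Let \(w=w_{L,\gamma }\) be the word from Lemma 9. The two inequalities in the first display of Lemma 7 are then obvious: removing the positions in \(A\) deletes fewer than \(\varepsilon L\) ones, and adding \(1_B\) creates fewer than \(\varepsilon L\) new ones. We turn to the lower bound for the entropy. It is intuitively clear that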

$$\begin{aligned} C(w)\leqslant C(w\cdot 1_{A^c}+1_B)+C(1_B)+C(w\cdot 1_A)+O(\log L). \end{aligned}$$

Just observe that \((w\cdot 1_{A^c}+1_B)-1_B+w\cdot 1_A=w\). It follows from [15, Thm. 2.8.1] that, for each \(n>0\), there is a sequence \(\epsilon _n(1)>\epsilon _n(2)>\dots \searrow 0\) such that, for all \(s\in \{0,\dots ,n-1\}\),

$$\begin{aligned} \begin{aligned} C(w\cdot 1_{A^c}+1_B)&\leqslant m(H_n^s(w\cdot 1_{A^c}+1_B)+\epsilon _n(m))+2n \\ C(1_B)&\leqslant L(H_1(1_B)+\epsilon _1(L))\\ C(w\cdot 1_{A})&\leqslant L(H_1(w\cdot 1_A)+\epsilon _1(L)). \end{aligned} \end{aligned}$$

But \(H_1(1_B)\leqslant \Phi (d_L(B)){<\frac{\kappa }{4}}\), \(H_1(w\cdot 1_A)\leqslant \Phi (d_L(A)){<\frac{\kappa }{4}}\) by assumption, and \(\frac{1}{n}\sum _{s=0}^{n-1}H_n^s(w\cdot 1_{A^c}+1_B)\leqslant H_n(w\cdot 1_{A^c}+1_B)+q_n(L)\) by Lemma 10, so that

$$\begin{aligned} C(w)\leqslant L\left( \frac{1}{n}H_n(w\cdot 1_{A^c}+1_B)+\frac{m}{L}\epsilon _n(m)+{\frac{m}{L}}q_n(L)+{\frac{\kappa }{2}}+{2}\epsilon _1(L){+\frac{2n}{L}}\right) . \end{aligned}$$

By Lemma 9,

$$\begin{aligned} C(w)\geqslant L\Phi (\gamma )-1. \end{aligned}$$

Combining the last two displays, dividing by \(L\), and using \(m/L\leqslant 1/n\), we obtain


$$\begin{aligned} \begin{aligned} \frac{1}{n}H_n(w\cdot 1_{A^c}+1_B)&\geqslant \Phi (\gamma )-\frac{1}{L}-\frac{q_n(L)}{ n}-\frac{\epsilon _n(L/(n+1))}{n}-{\frac{\kappa }{2}}-{2}\epsilon _1(L){-\frac{2n}{L}}\\&= \Phi (\gamma )-\kappa /2-\rho _n(L), \end{aligned} \end{aligned}$$

where \(\rho _n(L):={\frac{2n+1}{L}+\frac{q_n(L)+\epsilon _n(L/(n+1))}{n}+2\epsilon _1(L)}\searrow 0\) as \(L\rightarrow \infty \) for each fixed n. To finish the proof, choose \(\ell _{n,\kappa }\geqslant L_\varepsilon \) so large that \(\rho _n(L)\leqslant \frac{\kappa }{2}\) for \(L\geqslant \ell _{n,\kappa }\). \(\square \)