1 Introduction

Given \(r\in \mathbb {N}\), a set \(X \subseteq [0,1]\) is \(\times r\)-invariant if it is closed and \(T_r X \subseteq X\), where \(T_r:[0,1]\rightarrow [0,1]\) is the map \(x \mapsto rx \pmod 1\). In the late 1960s, Furstenberg conjectured (Footnote 1) that if r and s are multiplicatively independent positive integers (that is, \(\log r / \log s\) is irrational) and X and Y are \(\times r\)- and \(\times s\)-invariant, respectively, then

$$\begin{aligned} \dim _{\text {H}}\big (X+Y \big )&= \min \big (1, \dim _{\text {H}}X + \dim _{\text {H}}Y \big ), \text { and} \end{aligned}$$
(1.1)
$$\begin{aligned} \dim _{\text {H}}\big (X \cap Y \big )&\le \max \big (0, \dim _{\text {H}}X + \dim _{\text {H}}Y - 1 \big ). \end{aligned}$$
(1.2)

The sumset conjecture (1.1) was resolved by Hochman and Shmerkin [12], who proved a more general result concerning the dimension of sums of invariant measures. It also follows from more recent results of Shmerkin [23] and Wu [24], who independently resolved a generalization of the intersection conjecture (1.2). We give a more detailed account of this recent history later in the introduction.

The purpose of this article is to give a new, combinatorial proof of Furstenberg’s sumset conjecture (1.1). Denoting the unlimited \(\gamma \)-Hausdorff content by \(\mathcal {H}_{>0}^\gamma \) (see Definition 2.2), our main theorem is as follows.

Theorem A

Let r and s be multiplicatively independent positive integers, and let \(X, Y \subseteq [0,1]\) be \(\times r\)- and \(\times s\)-invariant sets, respectively. Define \(\overline{\gamma }= \min \big ( \dim _{\text {H}}X + \dim _{\text {H}}Y, 1 \big )\). For all compact \(I \subseteq \mathbb {R}{\setminus } \{0\}\) and all \(\gamma < \overline{\gamma }\),

$$\begin{aligned} \inf _{\lambda , \eta \in I}\ \mathcal {H}_{>0}^{\gamma } \big ( \lambda X + \eta Y \big ) > 0. \end{aligned}$$
(1.3)

Beyond implying (1.1), Theorem A gives finer quantitative information on the size of the sumset \(\lambda X + \eta Y\) in terms of the unlimited \(\gamma \)-Hausdorff content uniformly over the parameters \(\lambda \) and \(\eta \). The uniformity in the result, which does not appear to follow from [12], has found use in recent applications concerning digit problems; see, for example, [11] and [3]. See Remark 5.1 below for some further discussion on this uniformity.

Our proof of (1.1) differs from other proofs in the literature in that it completely avoids the machinery of CP-processes and local entropy averages. Instead, it features an elementary, combinatorial approach that builds on the work of Peres and Shmerkin in [22]. Important ingredients in the proof include a quantitative discrete Marstrand theorem (Theorem 3.2) and a subtree regularity theorem (Theorem 4.7), both of which may be of independent interest.

1.1 History and Context

In a highly influential work in geometric measure theory, Marstrand [18] related the Hausdorff dimension of a Borel set \(E \subseteq \mathbb {R}^2\), \(\dim _{\text {H}}E\), to the Hausdorff dimension of its images under orthogonal projections and its intersections with lines. More specifically, he showed that for almost every line \(L \subseteq \mathbb {R}^2\), \(\dim _{\text {H}}( \pi _L E) = \min \big (1,\dim _{\text {H}}E \big )\), where \(\pi _L\) is the orthogonal projection \(\mathbb {R}^2 \rightarrow L\), and that for almost every line L intersecting E, \(\dim _{\text {H}}( E \cap L) = \max \big (0,\dim _{\text {H}}E - 1 \big )\) (Footnote 2).

Images of a Cartesian product \(X \times Y\) under orthogonal projections are, up to affine transformations which preserve dimension, sumsets of the form \(\lambda X + \eta Y\), while intersections of \(X \times Y\) with lines are affinely equivalent to sets of the form \(\lambda X \cap (\eta Y + \sigma )\). Thus, Marstrand’s theorems in the case \(E=X\times Y\) imply the following.

Theorem 1.1

([18, Theorems II and III]) Let X and Y be Borel subsets of [0, 1]. For Lebesgue-a.e. \(\lambda , \eta , \sigma \in \mathbb {R}\),

$$\begin{aligned} \dim _{\text {H}}\big ( \lambda X + \eta Y \big )&= \min {}\big (\dim _{\text {H}}(X \times Y) , \ 1 \big ), \text { and} \end{aligned}$$
(1.4)
$$\begin{aligned} \dim _{\text {H}}\big ( \lambda X \cap (\eta Y + \sigma ) \big )&\le \max {}\big (0, \ \dim _{\text {H}}(X \times Y) - 1 \big ). \end{aligned}$$
(1.5)

Improving (1.4) and (1.5) by replacing the Lebesgue-typical projection or intersection of \(X\times Y\) with a concrete projection or intersection is not possible in general [13] but can be done in special cases when the sets X and Y are structured. Furstenberg’s conjectures (1.1) and (1.2) can be contextualized as such: when r and s are multiplicatively independent and \( X \times Y\) is the product of a \(\times r\)- and a \(\times s\)-invariant set, results for the Lebesgue-typical projection and intersection should hold for the orthogonal projection to, and the intersection with, the line \(x=y\). These conjectures join a host of results and conjectures by Furstenberg and others that aim to capture the independence between base-r and base-s structure when r and s are multiplicatively independent.

Conjectures (1.1) and (1.2) were recently resolved, both proven in more general forms. In the following theorem, we have combined special cases of the results by Hochman and Shmerkin [12], Shmerkin [23], and Wu [24] that are most relevant to this work. Note that \(\overline{\dim }_{\text {M}}\hspace{.1em}\) denotes the upper Minkowski dimension (see Definition 2.1).

Theorem 1.2

([12] and [23, 24]) Let r and s be multiplicatively independent positive integers, and let \(X, Y \subseteq [0,1]\) be \(\times r\)- and \(\times s\)-invariant sets, respectively. For all \(\lambda , \eta \in \mathbb {R}{\setminus } \{0\}\) and all \(\sigma \in \mathbb {R}\),

$$\begin{aligned} \dim _{\text {H}}\big ( \lambda X + \eta Y \big )&= \min {}\big (\dim _{\text {H}}X + \dim _{\text {H}}Y, \ 1 \big ), \text { and} \end{aligned}$$
(1.6)
$$\begin{aligned} \overline{\dim }_{\text {M}}\hspace{.1em}\big ( \lambda X \cap (\eta Y + \sigma ) \big )&\le \max {} \big (0, \ \dim _{\text {H}}X + \dim _{\text {H}}Y - 1 \big ). \end{aligned}$$
(1.7)

A number of partial results preceded those in Theorem 1.2, both for multiplicatively invariant sets and for attractors of iterated function systems (IFSs). Carlos Moreira [20] considered sumsets of attractors of IFSs with certain irrationality and non-linearity conditions. Peres and Shmerkin [22] proved (1.6) for attractors of IFSs with rationally independent contraction ratios; this resolved (1.6) in the special case that X and Y are restricted digit Cantor sets with respect to multiplicatively independent bases. (This work of Peres and Shmerkin is particularly relevant to the arguments in this paper, as we explain in detail in Sect. 5.1.)

Hochman and Shmerkin [12] developed Furstenberg’s CP processes [9] and introduced local entropy averages to prove (1.6) both for invariant sets and measures and for attractors of IFSs satisfying some general minimality conditions. Wu [24] combined the CP process machinery with Sinai’s factor theorem from ergodic theory to resolve (1.7) for invariant sets and attractors of regular, self-similar IFSs. Shmerkin [23] resolved (1.7) utilizing tools primarily from additive combinatorics, proving an inverse theorem for the decay of \(L^q\) norms of certain self-similar measures of dynamical origin. Yu [25] and Austin [1] gave dynamical proofs of (1.2), simplifying some aspects of earlier proofs.

The sumset and intersection theorems are closely related: fibers of orthogonal projections are precisely those lines with which intersections are considered. It is not surprising, then, that the intersection theorem can be used to deduce the sumset theorem. For example, if for arbitrary sets \(X, Y \subseteq [0,1]\) we know that for all \(\gamma > \max \big ( 0, \dim _{\text {H}}X + \dim _{\text {H}}Y - 1 \big )\), there exists \(\delta _0 > 0\) such that for all \(0< \delta < \delta _0\) and all balls B of diameter \(\delta \),

$$\begin{aligned} \mathcal {N}\big ( X \cap (Y + B), \delta \big ) \le \delta ^{-\gamma }, \end{aligned}$$

then we can deduce that \(\dim _{\text {H}}(X+Y) = \min \big (1, \dim _{\text {H}}(X \times Y) \big )\). This type of uniformity is made explicit in Shmerkin [23] and Yu [25] and may be implicit in the other proofs of the intersection conjecture. It is possible to deduce Theorem A from Shmerkin’s main result in [23]; we explain the details in the course of another argument in [11]. Despite the fact that every proof of the intersection conjecture can be counted as a proof of the sumset conjecture, we believe our approach still has merit: it is the most elementary proof to date; it exposes uniformity important in certain number-theoretic applications; and it features tools which may be of independent interest.

Theorem A has a geometric formulation in terms of orthogonal projections; while we will not make particular use of the theorem in this form, it is worth formulating for its historical connection to the topic. Let \(\pi _\theta : \mathbb {R}^2 \rightarrow \mathbb {R}^2\) be the orthogonal projection onto the line that contains the origin and forms an angle \(\theta \) with the positive x-axis. The proof of the equivalence between Theorem A and Theorem B is standard and not needed in this work, so it is omitted.

Theorem B

Let r and s be multiplicatively independent positive integers, and let \(X, Y \subseteq [0,1]\) be \(\times r\)- and \(\times s\)-invariant sets, respectively. Define \(\overline{\gamma }= \min \big ( \dim _{\text {H}}X + \dim _{\text {H}}Y, 1 \big )\). For all compact \(I \subseteq (0,\pi ) {\setminus } \{\pi / 2\}\) and all \(\gamma < \overline{\gamma }\), \(\inf _{\theta \in I} \mathcal {H}_{>0}^{\gamma } \big ( \pi _\theta (X \times Y) \big ) > 0.\)
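For orientation, here is a brief sketch of how Theorem B follows from Theorem A; it is only an illustration of the standard identification and is not used in the sequel. For \(\theta \in (0,\pi ) {\setminus } \{\pi /2\}\) and \((x,y) \in \mathbb {R}^2\),

$$\begin{aligned} \pi _\theta (x,y) = \big ( x \cos \theta + y \sin \theta \big ) \, (\cos \theta , \sin \theta ), \end{aligned}$$

so \(\pi _\theta (X \times Y)\) is the image of the sumset \((\cos \theta ) X + (\sin \theta ) Y\) under the isometry \(t \mapsto t (\cos \theta , \sin \theta )\) from \(\mathbb {R}\) onto the line of angle \(\theta \). One checks that this isometry preserves the unlimited \(\gamma \)-Hausdorff content, so that

$$\begin{aligned} \mathcal {H}_{>0}^{\gamma } \big ( \pi _\theta (X \times Y) \big ) = \mathcal {H}_{>0}^{\gamma } \big ( (\cos \theta ) X + (\sin \theta ) Y \big ). \end{aligned}$$

If \(I \subseteq (0,\pi ) {\setminus } \{\pi / 2\}\) is compact, then the values \(\cos \theta \) and \(\sin \theta \), \(\theta \in I\), all lie in a compact subset of \(\mathbb {R}{\setminus } \{0\}\), and Theorem A applied to that compact set gives the conclusion of Theorem B.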

1.2 Overview of the Paper

The paper is organized as follows. In Sect. 2, we organize the terminology, notation, and basic facts we need from discrete and continuous fractal geometry, including some properties of \(\times r\)-invariant subsets of [0, 1] and an equidistribution lemma. Section 3 contains a proof of Theorem 3.2, our discrete Marstrand projection theorem. Section 4 features notation and terminology for trees and the subtree regularity theorem, Theorem 4.7. Finally, we prove Theorem A in Sect. 5.

1.3 Acknowledgements

The authors extend their thanks to the referees for their valuable feedback. The effectiveness of the main result was pointed out to us by the referees and led to Remark 5.1. The third author is supported by the National Science Foundation under grant number DMS 1901453.

2 Preliminary Definitions and Results

The positive and non-negative integers are denoted by \(\mathbb {N}\) and \(\mathbb {N}_0\), respectively. For \(x \in \mathbb {R}\), denote the fractional part by \(\{x\}\) and the integer part (or floor) by \(\lfloor x \rfloor \). The Lebesgue measure on the real line is denoted by \(\text {Leb}\). Throughout the paper, \(\mathbb {R}^d\) is equipped with the Euclidean norm which we denote by \(| \cdot |\). Given two positive-valued functions f and g, we write \(f \ll _{a_1,\ldots ,a_k} g\) or \(g \gg _{a_1,\ldots ,a_k} f\) if there exists a constant \(K > 0\), depending only on the quantities \(a_1, \ldots , a_k\), for which \(f(x) \le K g(x)\) for all x in the domain common to both f and g. We write \(f \asymp _{a_1,\ldots ,a_k} g\) if both \(f \ll _{a_1,\ldots ,a_k} g\) and \(f \gg _{a_1,\ldots ,a_k} g.\)

2.1 Continuous and Discrete Fractal Geometry

In this section, we lay out the notation, tools, and results we need from continuous and discrete fractal geometry. A good general reference for the standard material in this section is [19, Ch. 4]. In the definitions that follow, \(\rho , \gamma , c > 0\), \(d \in \mathbb {N}\), and \(X \subseteq \mathbb {R}^d\) is non-empty.

Definition 2.1

  • The set X is \(\rho \)-separated if for all distinct \(x_1, x_2 \in X\), \(|x_1 - x_2| \ge \rho \).

  • The metric entropy of X at scale \(\rho \) is

    $$\begin{aligned} \mathcal {N}(X,{\rho }) = \sup \big \{ |X_0| \ \big | \ X_0 \subseteq X \text { is } \rho \text {-separated} \big \}. \end{aligned}$$
  • The lower Minkowski dimension of X is

    $$\begin{aligned} \underline{\dim }_{\text {M}}\hspace{.1em}X = \liminf _{\delta \rightarrow 0^+} \frac{\log \mathcal {N}(X, \delta )}{\log \delta ^{-1}}. \end{aligned}$$
    (2.1)

    The upper Minkowski dimension, \(\overline{\dim }_{\text {M}}\hspace{.1em}X\), is defined analogously with a limit supremum in place of the limit infimum. If \(\underline{\dim }_{\text {M}}\hspace{.1em}X = \overline{\dim }_{\text {M}}\hspace{.1em}X\), then this value is the Minkowski dimension of X, \(\dim _{\text {M}}X\).

It is a well-known fact which we will use without further mention that if \(\rho > 1\), then \(\underline{\dim }_{\text {M}}\hspace{.1em}X = \liminf _{N \rightarrow \infty } \log \mathcal {N}(X, \rho ^{-N}) \big / \log \rho ^N\) and \(\overline{\dim }_{\text {M}}\hspace{.1em}X = \limsup _{N \rightarrow \infty } \log \mathcal {N}(X, \rho ^{-N}) \big / \log \rho ^N\).
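To make Definition 2.1 concrete, the following Python sketch computes the metric entropy of a finite set of reals and the resulting dimension estimates along the scales \(3^{-N}\). The example set (points whose base-3 digits lie in \(\{0,1\}\), truncated to M digits) and the helper names `digit_set_numerators` and `metric_entropy` are assumptions chosen for illustration; they do not appear elsewhere in the paper.

```python
import math
from itertools import product


def digit_set_numerators(r, digits, n):
    """Sorted numerators i of the points i / r**n whose n base-r digits lie in `digits`."""
    return sorted(
        sum(d * r ** (n - 1 - k) for k, d in enumerate(word))
        for word in product(digits, repeat=n)
    )


def metric_entropy(sorted_points, rho):
    """Largest cardinality of a rho-separated subset of a sorted finite set of reals.

    On the line, scanning left to right and keeping every point at distance
    >= rho from the last kept point yields a rho-separated subset of maximal size.
    """
    count, last = 0, None
    for x in sorted_points:
        if last is None or x - last >= rho:
            count += 1
            last = x
    return count


# Hypothetical example: base r = 3, digits {0, 1}, resolution 3**-M.
# Coordinates are integers in units of 3**-M, so no rounding occurs.
r, M = 3, 8
points = digit_set_numerators(r, (0, 1), M)
for N in range(1, M + 1):
    rho = r ** (M - N)  # the scale 3**-N expressed in units of 3**-M
    entropy = metric_entropy(points, rho)
    print(N, entropy, round(math.log(entropy) / (N * math.log(r)), 4))
```

In this example the greedy scan keeps exactly the left endpoint of each occupied triadic cell, so the ratio printed at every scale equals \(\log 2 / \log 3\).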

Definition 2.2

  • The unlimited \(\gamma \)-Hausdorff content (Footnote 3) of X is

    $$\begin{aligned} \mathcal {H}_{>0}^\gamma (X) = \inf \left\{ \sum _{i \in I} \delta _i^\gamma \Bigg | \ X \subseteq \bigcup _{i \in I} B_i, \ B_i \text { open ball of diameter } \delta _i \right\} . \end{aligned}$$

    Note that when X is compact, the index set I may be taken to be finite.

  • The Hausdorff dimension of X is

    $$\begin{aligned} \dim _{\text {H}}X&= \sup \{ \gamma \in \mathbb {R}\ | \ \mathcal {H}_{>0}^\gamma (X)> 0 \}\\&=\inf \{ \gamma \in \mathbb {R}\ | \ \mathcal {H}_{>0}^\gamma (X) = 0 \}. \end{aligned}$$

In the following definition, we introduce two notions meant to capture the dimensionality of discrete sets.

Definition 2.3

  • (cf. [15, Definition 1.2]) The set X is a \((\rho ,\gamma )_c\)-set if it is \(\rho \)-separated and for all \(\delta \ge \rho \) and all open balls B of diameter \(\delta \),

    $$\begin{aligned} \big | X \cap B \big | \le c \left( \frac{\delta }{\rho } \right) ^\gamma . \end{aligned}$$
    (2.2)
  • The discrete Hausdorff content of X at scale \(\rho \) and dimension \(\gamma \) is

    $$\begin{aligned} \mathcal {H}_{\ge \rho }^\gamma (X) = \inf \left\{ \sum _{i\in I} \delta _i^\gamma \ \Bigg | \ X \subseteq \bigcup _{i\in I} B_i, \ B_i \text { open ball of diameter } \delta _i \ge \rho \right\} . \end{aligned}$$

    Note that when X is compact, the index set I may be taken to be finite.

In the definition of a \((\rho ,\gamma )_c\)-set, we think of \(\rho \) as being positive and close to 0, \(\gamma \in [0,d]\) as the “dimension” of the set, and \(c > 0\) as an uninteresting parameter that exists only to make our arguments explicit. The inequality in (2.2) guarantees that the points of a \((\rho ,\gamma )_c\)-set cannot be too concentrated in any ball. It follows from that inequality that the maximum cardinality of a \((\rho ,\gamma )_c\)-set in \([0,1]^d\) is on the order of \(\rho ^{-\gamma }\). A \((\rho ,\gamma )_c\)-set with cardinality \(\gg \rho ^{-\gamma }\) can be thought of as a discrete approximation to a set with Hausdorff dimension \(\gamma \); this is made more precise in Remark 2.5 below and is realized in Lemma 2.13. In fact, if the discrete approximations of a set \(X \subseteq \mathbb {R}^d\) at all scales \(\rho > 0\) are \((\rho ,\gamma )_c\)-sets, then the Assouad dimension (cf. [7, Section 2.1]) of the set X is at most \(\gamma \). More precisely, the Assouad dimension of X is the infimum of the set of \(\gamma \)’s for which there exists \(c > 0\) such that for all \(\rho > 0\), the set X rounded to the lattice \(\rho \mathbb {Z}^d\) is a \((\rho ,\gamma )_c\)-set.
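The condition in Definition 2.3 can be tested mechanically for a finite subset of the line. The Python sketch below does so under the following observation: every open ball meets a sorted set in a run of consecutive points, and a run of span d fits in open balls of every diameter larger than d, so it suffices to compare the size of each run with \(c \, (\max (d,\rho ) / \rho )^\gamma \). The function name `is_rho_gamma_c_set`, the tolerance `tol`, and the example set are assumptions made for illustration.

```python
from itertools import product


def is_rho_gamma_c_set(points, rho, gamma, c, tol=1e-9):
    """Check whether a finite set of reals is a (rho, gamma)_c-set (Definition 2.3, d = 1)."""
    xs = sorted(points)
    # rho-separation: consecutive points must be at distance >= rho
    if any(b - a < rho for a, b in zip(xs, xs[1:])):
        return False
    # Non-concentration: test every run xs[i..j] of consecutive points
    for i in range(len(xs)):
        for j in range(i, len(xs)):
            span = xs[j] - xs[i]
            bound = c * (max(span, rho) / rho) ** gamma
            if (j - i + 1) > bound + tol:
                return False
    return True


# Hypothetical example: numerators of the points i / 3**5 with base-3 digits in {0, 1},
# measured in units of 3**-5 so that rho = 1.
pts = [sum(d * 3 ** k for k, d in enumerate(w)) for w in product((0, 1), repeat=5)]
print(is_rho_gamma_c_set(pts, 1, 0.7, 2 * 3 ** 0.7))  # gamma above log 2 / log 3
print(is_rho_gamma_c_set(pts, 1, 0.3, 1.0))           # gamma too small for this set
```

The constant \(2 \cdot 3^{0.7}\) in the first call is the one produced by the proof of Lemma 2.13 with \(c_0 = 1\), so the first check succeeds, while the second fails because the set has \(2^5\) points within a span of 121 units.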

The discrete Hausdorff content at scale \(\rho \) is a “\(\rho \)-resolution” analogue of the unlimited Hausdorff content. The discrete Hausdorff contents of two sets that look the same at scale \(\rho \) are approximately equal. The following lemma provides a connection between the discrete and the continuous regimes that will be useful in the proof of Theorem A.

Lemma 2.4

Let \(X \subseteq \mathbb {R}^d\) be compact. For all \(\gamma \ge 0\),

$$\begin{aligned} \lim _{\rho \rightarrow 0^+} \mathcal {H}_{\ge \rho }^\gamma (X) = \mathcal {H}_{>0}^\gamma (X). \end{aligned}$$
(2.3)

Consequently, if \(\lim _{\rho \rightarrow 0} \mathcal {H}_{\ge \rho }^\gamma (X) > 0\), then \(\dim _{\text {H}}X \ge \gamma \).

Proof

Let \(\gamma \ge 0\). The limit in (2.3) exists because the function \(\rho \mapsto \mathcal {H}_{\ge \rho }^\gamma (X)\) is non-increasing as \(\rho \) tends to \(0^+\) and is bounded from below by \(\mathcal {H}_{>0}^\gamma (X)\). Equality in the limit follows from the fact that X is compact, allowing for the index set in the definition of \(\mathcal {H}_{>0}^\gamma (X)\) to be taken to be finite. If \(\lim _{\rho \rightarrow 0} \mathcal {H}_{\ge \rho }^\gamma (X) > 0\), then \(\mathcal {H}_{>0}^\gamma (X) > 0\), and it follows from the definition of the Hausdorff dimension that \(\dim _{\text {H}}X \ge \gamma \). \(\square \)
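For a finite subset of the line, the discrete Hausdorff content of Definition 2.3 can be computed exactly, which gives a way to observe the monotonicity in \(\rho \) used in the proof above. The sketch below is an illustration under stated assumptions and plays no role in the paper's argument: it relies on the fact that, on the line, an optimal cover may be taken to split the sorted points into consecutive blocks, where a block of span d costs \(\max (d,\rho )^\gamma \) (as an infimum over open intervals). The function name `discrete_hausdorff_content` and the sample points are hypothetical.

```python
def discrete_hausdorff_content(points, rho, gamma):
    """Infimum of sum(delta_i ** gamma) over covers of a finite set of reals
    by open intervals of diameter delta_i >= rho, via dynamic programming."""
    xs = sorted(set(points))
    n = len(xs)
    # best[k] is the optimal cost of covering the first k points
    best = [0.0] + [float("inf")] * n
    for k in range(1, n + 1):
        for i in range(k):
            # cover xs[i..k-1] by one interval; its diameter can be taken
            # arbitrarily close to max(span, rho)
            span = xs[k - 1] - xs[i]
            best[k] = min(best[k], best[i] + max(span, rho) ** gamma)
    return best[n]


sample = [0, 1, 3, 4, 9, 10]  # hypothetical finite set
for rho in (4, 2, 1, 0.5, 0.1):
    print(rho, discrete_hausdorff_content(sample, rho, 0.6))
```

The printed values do not increase as \(\rho \) decreases, in line with the first step of the proof; for a finite set and \(\gamma > 0\) they tend to 0, which is the unlimited \(\gamma \)-Hausdorff content of a finite set.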

Remark 2.5

It would be natural to define the metric entropy at scale \(\rho \) and dimension \(\gamma \) of the set X as

$$\begin{aligned} \mathcal {N}( X,{(\rho ,\gamma )_c}) = \sup \big \{ |X_0| \ \big | \ X_0 \subseteq X \text { is a } (\rho ,\gamma )_c\text {-set} \big \}. \end{aligned}$$

Using a max flow, min cut argument similar to the one in [2, Ch. 3], it can be shown that for X compact,

$$\begin{aligned} \frac{\mathcal {N}\big (X,{(\rho ,\gamma )_c} \big )}{\rho ^{-\gamma }} \asymp _{c,d} \mathcal {H}_{\ge \rho }^\gamma (X). \end{aligned}$$
(2.4)

Thus, \((\rho ,\gamma )_c\)-sets of cardinality \(\gg \rho ^{-\gamma }\) can be thought of as discrete fractal sets of dimension \(\gamma \). We will not need (2.4); the interested reader can consult [6, Prop. A1] for some details.

The following is a discrete version of the well-known mass distribution principle, cf. [2, Lemma 1.2.8].

Lemma 2.6

Let \(\mu \) be a Borel probability measure on \(\mathbb {R}^d\), and let \(\rho , \kappa > 0\). If for all balls B of diameter \(\delta \ge \rho \), \(\mu (B) \le \kappa \delta ^\gamma \), then the support \({ \text {supp}}\,\mu \) of \(\mu \) satisfies \(\mathcal {H}_{\ge \rho }^\gamma ({ \text {supp}}\,\mu ) \ge \kappa ^{-1}\).

Proof

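Let \(\epsilon > 0\), and let \(\{B_i\}_{i \in I}\) be a cover of \({ \text {supp}}\,\mu \) by balls \(B_i\) of diameter \(\delta _i \ge \rho \) with \(\sum _{i \in I} \delta _i^\gamma \le \mathcal {H}_{\ge \rho }^\gamma ({ \text {supp}}\,\mu ) + \epsilon \). Since \(\mu \) is a probability measure and the balls \(B_i\) cover its support, \(1 = \mu ({ \text {supp}}\,\mu ) \le \sum _{i \in I} \mu (B_i) \le \kappa \sum _{i \in I} \delta _i^\gamma \le \kappa \big ( \mathcal {H}_{\ge \rho }^\gamma ({ \text {supp}}\,\mu ) + \epsilon \big )\). The conclusion follows because \(\epsilon > 0\) was arbitrary. \(\square \)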

Denote by \([X]_\delta \) the closed \(\delta \)-neighborhood of X:

$$\begin{aligned}{}[X]_\delta {:}{=}\big \{z\in [0,1] \ \big | \ \exists x\in X~\text {with}~ |z - x| \le \delta \big \}. \end{aligned}$$

Lemma 2.7

Let \(a\ge 1\) and \(\rho > 0\). If \(X, Y \subseteq \mathbb {R}\) are compact and \(X\subseteq [Y]_{a\rho }\), then

$$\begin{aligned} \mathcal {H}_{\ge \rho }^\gamma (X) \, \ll _a\, \mathcal {H}_{\ge \rho }^\gamma (Y). \end{aligned}$$

Proof

Let \(\{B_i\}_{i\in I}\) be a collection of open balls covering Y, where \(B_i\) has diameter \(r_i\ge \rho \) and \(\sum _{i\in I}r_i^\gamma <2\mathcal {H}_{\ge \rho }^\gamma (Y)\). Since \(X\subseteq [Y]_{a\rho }\), it follows that \(X\subseteq \bigcup _{i\in I} [B_i]_{a\rho }\), and \([B_i]_{a\rho }\) is a ball of diameter \(r_i+2a\rho \le (2a+1)r_i\). Therefore \(\mathcal {H}_{\ge \rho }^\gamma (X)\le \sum _{i\in I}((2a+1)r_i)^\gamma = (2a+1)^\gamma \sum _{i\in I} r_i^\gamma \le 2(2a+1)^\gamma \mathcal {H}_{\ge \rho }^\gamma (Y)\); for \(\gamma \le 1\), the range relevant on the real line, the constant \(2(2a+1)^\gamma \) is at most \(2(2a+1)\). \(\square \)

2.2 Multiplicatively Invariant Subsets of the Reals and Their Finite Approximations

In this section, we record some basic facts about multiplicatively invariant subsets of [0, 1] and their discrete approximations.

Definition 2.8

Let \(r \in \mathbb {N}\) and \(X \subseteq [0,1]\).

  • The map \(T_r: [0,1] \rightarrow [0,1]\) is defined by \(T_r x = \{rx\}\), where \(\{ \cdot \}\) denotes the fractional part of a real number.

  • The set X is \(\times r\)-invariant if it is closed and \(T_r X \subseteq X\).

The Hausdorff and Minkowski dimensions of a multiplicatively invariant set coincide. As a consequence of this regularity, the Hausdorff dimension of products of such sets is also well-behaved. We record these facts here for later use.

Theorem 2.9

([8, Proposition III.1]) If \(X \subseteq [0,1]\) is \(\times r\)-invariant, then \(\dim _{\text {H}}X = \dim _{\text {M}}X\).

Lemma 2.10

If \(X, Y \subseteq [0,1]\) are \(\times r, \times s\)-invariant, respectively, then \(\dim _{\text {H}}(X \times Y) = \dim _{\text {H}}X + \dim _{\text {H}}Y\).

Proof

This follows immediately from [19, Corollary 8.11] and the fact that \(\dim _{\text {H}}X = \overline{\dim }_{\text {M}}\hspace{.1em}X\). \(\square \)

Since we will work almost exclusively with finite approximations to multiplicatively invariant sets, we establish some useful notation.

Definition 2.11

Let \(X \subseteq [0,1]\) be \(\times r\)-invariant. For \(n \in \mathbb {N}_0\), the set \(X_n\) denotes the set X rounded down to the lattice \(r^{-n}\mathbb {Z}\). That is, the point \(i / r^n\) is an element of \(X_n\) if and only if \(X \cap [i/r^n, (i+1) / r^n)\) is non-empty.

The next results show that finite approximations to a multiplicatively invariant set are multiplicatively invariant and are discrete models of fractal sets as captured by Definition 2.3.

Lemma 2.12

Let \(X \subseteq [0,1]\) be \(\times r\)-invariant. For all \(n \in \mathbb {N}\), \(T_r X_n \subseteq X_{n-1}\).

Proof

Let \(n \in \mathbb {N}\), and let \(i/r^n \in X_n\) with \(i \in \{0,\ldots , r^n - 1\}\). Write \(i = i_0 + d_{n-1} r^{n-1}\) with \(i_0 \in \{0, \ldots , r^{n-1}-1\}\) and \(d_{n-1} \in \{0, \ldots , r-1\}\). Note that \(T_r (i / r^n) = i_0 / r^{n-1}\) and \(T_r ((i+1) / r^n) = (i_0+1) / r^{n-1}\). We must show that \(i_0 / r^{n-1} \in X_{n-1}\).

Since \(i/r^n \in X_n\), there exists \(x \in X \cap [i/r^n, (i+1)/r^n)\). Since \(T_r x \in X\), \(T_r x \in X \cap [i_0/r^{n-1}, (i_0+1) / r^{n-1})\). It follows by the definition of \(X_{n-1}\) that \(i_0/r^{n-1} \in X_{n-1}\), as was to be shown. \(\square \)
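Lemma 2.12 is easy to test on restricted digit sets. The Python sketch below builds the approximations \(X_n\) for the set of points having a base-r expansion with all digits in a prescribed set and checks the inclusion \(T_r X_n \subseteq X_{n-1}\) with integer arithmetic. It assumes that the digit \(r-1\) is excluded, so that \(X_n\) consists exactly of the points \(i/r^n\) whose n base-r digits are allowed; the function names and the choice \(r = 3\), digits \(\{0,1\}\) are hypothetical.

```python
from itertools import product


def approximation(r, digits, n):
    """Numerators i with i / r**n in X_n, where X is the set of points of [0, 1]
    having a base-r expansion with all digits in `digits`.

    Assumes r - 1 is not in `digits`, so that X meets [i / r**n, (i + 1) / r**n)
    exactly when the n base-r digits of i all lie in `digits`.
    """
    return {
        sum(d * r ** (n - 1 - k) for k, d in enumerate(word))
        for word in product(digits, repeat=n)
    }


def apply_T(r, n, numerators):
    """Image of {i / r**n} under T_r, returned as numerators over r**(n - 1)."""
    return {i % r ** (n - 1) for i in numerators}


r, digits = 3, (0, 1)
for n in range(1, 8):
    image = apply_T(r, n, approximation(r, digits, n))
    print(n, image <= approximation(r, digits, n - 1))
```

Working with numerators keeps the computation exact: \(T_r(i/r^n) = (i \bmod r^{n-1}) / r^{n-1}\), which is the identity \(T_r(i/r^n) = i_0 / r^{n-1}\) from the proof.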

Lemma 2.13

Let \(r \ge 2\), and let \(X \subseteq [0,1]\) be a \(\times r\)-invariant set. For all \(\gamma > \dim _{\text {H}}X\), there exists \(c > 0\) such that for all sufficiently large \(N \in \mathbb {N}\), the set \(X_N\) is a \((r^{-N},\gamma )_c\)-set.

Proof

Let \(\gamma > \dim _{\text {H}}X\). Because \(\gamma > \overline{\dim }_{\text {M}}\hspace{.1em}X\) (cf. Theorem 2.9), there exists \(c_0 > 0\) such that for all \(N \in \mathbb {N}\),

$$\begin{aligned} |X_N| \le c_0 r^{N \gamma }. \end{aligned}$$
(2.5)

Using the fact that X is \(\times r\)-invariant, that \(T_r^n\) is injective on half-open intervals of length \(r^{-n}\), Lemma 2.12, and the bound in (2.5), for all \(0 \le n \le N\) and for all \(i \in \{0,\ldots , r^n-1\}\),

$$\begin{aligned} \left| X_N \cap \left[ \frac{i}{r^n}, \frac{i+1}{r^n}\right) \right| \le \big | T_r^n X_N \big | \le \big |X_{N-n} \big | \le c_0 r^{(N-n)\gamma }. \end{aligned}$$
(2.6)

Put \(c = 2r^{\gamma }c_0\). To show that \(X_N\) is a \((r^{-N},\gamma )_c\)-set, let \(B \subseteq \mathbb {R}\) be a ball of diameter \(\delta \ge r^{-N}\). Put \(n = \lfloor - \log _r \delta \rfloor \) so that \(r^{-(n+1)} < \delta \le r^{-n}\), and note that a union of two intervals of length \(r^{-n}\) of the form above suffices to cover B. Therefore,

$$\begin{aligned} \big |X_N \cap B \big | \le 2c_0r^{(N-n)\gamma } \le c \left( \frac{\delta }{r^{-N}} \right) ^{\gamma }, \end{aligned}$$

as was to be shown. \(\square \)

Lemma 2.14

Let \(r \ge 2\), and let \(X \subseteq [0,1]\) be non-empty and \(\times r\)-invariant. For all \(\gamma >\dim _{\text {H}}X\) and all sufficiently large \(N \in \mathbb {N}\),

$$\begin{aligned} r^{N\dim _{\text {H}}X}\le |X_N|\le r^{N\gamma }. \end{aligned}$$

Proof

Let \(\gamma > \dim _{\text {H}}X\). Because \(\gamma > \overline{\dim }_{\text {M}}\hspace{.1em}X\) (cf. Theorem 2.9), we have that \(|X_N| \le r^{N\gamma }\) for all but finitely many \(N\in \mathbb {N}\). It remains to show the lower bound.

Let \(M,N\in \mathbb {N}\). Since \(\big [\frac{i}{r^{N}}, \frac{i+1}{r^{N}}\big )\), \(i=0,1,\ldots ,r^N-1\), forms a partition of [0, 1), we have

$$\begin{aligned} \big |X_{N+M} \big |=\sum _{i=0}^{r^N-1} \left| X_{N+M} \cap \left[ \frac{i}{r^{N}}, \frac{i+1}{r^{N}}\right) \right| . \end{aligned}$$

Note that \(X_{N+M} \cap \big [\frac{i}{r^{N}}, \frac{i+1}{r^{N}}\big )\) is non-empty if and only if \(X \cap \big [\frac{i}{r^{N}}, \frac{i+1}{r^{N}}\big )\) is non-empty, which happens exactly when \({i}/{r^N}\in X_N\). Hence

$$\begin{aligned} \big |X_{N+M} \big |=\sum _{{i}/{r^N}\in X_N} \left| X_{N+M} \cap \left[ \frac{i}{r^{N}}, \frac{i+1}{r^{N}}\right) \right| . \end{aligned}$$
(2.7)

It follows from (2.6) that \(\big |X_{N+M} \cap \big [\frac{i}{r^{N}}, \frac{i+1}{r^{N}}\big ) \big | \le |X_M|\), which combined with (2.7) shows that \(|X_{N+M}|\le |X_N||X_M|\). In view of this sub-multiplicative property (equivalently, the sub-additivity of \(\log |X_N|\)), it follows from Fekete’s Lemma that the sequence \(|X_N|^{1/N}\) converges to its infimum, i.e.,

$$\begin{aligned} \lim _{N\rightarrow \infty }|X_N|^{1/N}=\inf _{N\in \mathbb {N}}|X_N|^{1/N}. \end{aligned}$$

It follows from \(\dim _{\text {H}}X = \dim _{\text {M}}X\) that \(r^{\dim _{\text {H}}X} = \lim _{N\rightarrow \infty }|X_N|^{1/N}\). Therefore, \(r^{\dim _{\text {H}}X} = \inf _{N\in \mathbb {N}}|X_N|^{1/N}\), and hence \(r^{N\dim _{\text {H}}X}\le |X_N|\) for all \(N\in \mathbb {N}\), as desired. \(\square \)
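The sub-multiplicativity and the resulting lower bound can be observed numerically. The Python sketch below uses a hypothetical example that is not taken from the paper: the \(\times 3\)-invariant set of points whose ternary digits lie in \(\{0,1\}\) with no two consecutive 1s. For this set, \(|X_n|\) is the number of allowed words of length n, and \(|X_N|^{1/N}\) tends to the golden ratio, so \(\dim _{\text {H}}X = \log \big ( (1+\sqrt{5})/2 \big ) / \log 3\).

```python
import math
from itertools import product


def count_cells(n):
    """|X_n| for the example: words of length n over {0, 1} with no two consecutive 1s."""
    return sum(
        1
        for word in product((0, 1), repeat=n)
        if all(not (a == 1 and b == 1) for a, b in zip(word, word[1:]))
    )


sizes = {n: count_cells(n) for n in range(1, 15)}

# Sub-multiplicativity |X_{N+M}| <= |X_N| |X_M| on the computed range
submultiplicative = all(
    sizes[N + M] <= sizes[N] * sizes[M]
    for N in range(1, 8)
    for M in range(1, 8)
    if N + M in sizes
)

# Lower bound of Lemma 2.14: r**(N dim_H X) = growth**N <= |X_N|
growth = (1 + math.sqrt(5)) / 2
lower_bound_holds = all(growth ** n <= sizes[n] for n in sizes)

print(submultiplicative, lower_bound_holds)
print([round(sizes[n] ** (1 / n), 4) for n in (1, 2, 4, 8, 14)])
```

The last line prints a decreasing sequence of values of \(|X_N|^{1/N}\) approaching the golden ratio from above, as Fekete's Lemma predicts.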

The following notation, borrowed from [22], allows us to easily compare powers of r and powers of s. This is useful when considering the finite approximations to the Cartesian product of a \(\times r\)- and a \(\times s\)-invariant set.

Definition 2.15

For \(n \in \mathbb {N}_0\), we set \(n'=\lfloor n\log r/\log s\rfloor \) to be the greatest integer so that \(s^{n'} \le r^{n}\). (The bases r and s do not appear in this notation but should always be clear from context.)

Recall from Definition 2.11 that \(X_N\) is the set X rounded to the lattice \(r^{-N}\mathbb {Z}\). Extending this notation to Y, the set \(Y_{N}\) is the set Y rounded to the lattice \(s^{-N} \mathbb {Z}\). Since \(r^{-N}\) is approximately equal to \(s^{-N'}\) (where \(N'\) is as defined in Definition 2.15), the set \(Y_{N'}\) is the discrete approximation to Y that is on a scale closest to the scale of \(X_N\). Therefore, the sets \(X_N\) and \(Y_{N'}\) will always be considered in the same context, as opposed to the sets \(X_N\) and \(Y_N\).
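The index \(n'\) of Definition 2.15 is straightforward to compute. The Python sketch below does so with exact integer arithmetic rather than floating-point logarithms and prints the ratio between the scales \(s^{-N'}\) and \(r^{-N}\); the function name `prime_index` and the bases \(r = 2\), \(s = 3\) are assumptions chosen for illustration.

```python
def prime_index(n, r, s):
    """Greatest integer k >= 0 with s**k <= r**n, i.e. n' in Definition 2.15."""
    target, k = r ** n, 0
    while s ** (k + 1) <= target:
        k += 1
    return k


r, s = 2, 3  # hypothetical multiplicatively independent bases
for N in (1, 5, 10, 20):
    Np = prime_index(N, r, s)
    # s**Np <= r**N < s**(Np + 1), so the ratio below lies in [1, s)
    print(N, Np, r ** N / s ** Np)
```

Comparing integer powers avoids the rounding issues that can affect `floor(n * log(r) / log(s))` when the quotient is close to an integer.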

Corollary 2.16

Let \(2 \le r < s\), and let \(X, Y \subseteq [0,1]\) be non-empty \(\times r\)- and \(\times s\)-invariant sets, respectively. For all \(\xi > \dim _{\text {H}}X+\dim _{\text {H}}Y\), there exist \(c_1, c_2 > 0\) and \(M_0\in \mathbb {N}\) such that for all \(N \ge M_0\), the sets \(X_N \times Y_{N'}\) and \(X_N \times Y_{N'+1}\) are \((c_1r^{-N},\xi )_{c_2}\)-sets satisfying \(r^{N (\dim _{\text {H}}X+\dim _{\text {H}}Y)} \le |X_N \times Y_{N'}| \le r^{N \xi }\) and \(r^{N (\dim _{\text {H}}X+\dim _{\text {H}}Y)} \le |X_N \times Y_{N'+1}| \le r^{N \xi }\).

Proof

Let \(\xi > \dim _{\text {H}}X+\dim _{\text {H}}Y\). Let \(g > \dim _{\text {H}}X \) and \(h > \dim _{\text {H}}Y \) be such that

$$\begin{aligned} \dim _{\text {H}}(X \times Y)< g + h < \xi . \end{aligned}$$

Applying Lemma 2.13 and Lemma 2.14, there exist \(c, d > 0\) such that for sufficiently large \(N \in \mathbb {N}\), the set \(X_N\) is a \((r^{-N},g)_c\)-set satisfying \(r^{N \dim _{\text {H}}X} \le |X_N| \le r^{N g}\) and \(Y_{N'}\) is a \((s^{-N'},h)_{d}\)-set satisfying \(s^{N' \dim _{\text {H}}Y} \le |Y_{N'}| \le s^{N' h}\). Since \(r^{N (\dim _{\text {H}}X+\dim _{\text {H}}Y)}= r^{N\dim _{\text {H}}X}r^{N\dim _{\text {H}}Y}\ge r^{N\dim _{\text {H}}X}s^{N'\dim _{\text {H}}Y}\), \(|X_N\times Y_{N'}|\le |X_N\times Y_{N'+1}|\) and \(g+h<\xi \), it follows that for sufficiently large \(N \in \mathbb {N}\), one has \(r^{N (\dim _{\text {H}}X+\dim _{\text {H}}Y)} \le |X_N \times Y_{N'}| \le r^{N \xi }\) and \(r^{N (\dim _{\text {H}}X+\dim _{\text {H}}Y)} \le |X_N \times Y_{N'+1}| \le r^{N \xi }\).

Set \(c_1 = s^{-1}\) and \(c_2 = s^{g} c d\). Since \(s^{N'}< r^N < s^{N'+1}\), the sets \(X_N \times Y_{N'}\) and \(X_N \times Y_{N'+1}\) are \(c_1 r^{-N}\)-separated. Since \(X_N\) is a \((r^{-N},g)_c\)-set, it is a \((c_1 r^{-N},g)_{s^{g} c}\)-set (Footnote 4). Let \(B \subseteq \mathbb {R}^2\) be a ball of diameter \(\delta \ge c_1 r^{-N}\). Note that

$$\begin{aligned} \big |(X_N \times Y_{N'}) \cap B \big |&\le s^{g} c \left( \frac{\delta }{c_1 r^{-N}} \right) ^{g} d \left( \frac{\delta }{s^{-N'}} \right) ^{h} \\&\le s^{g} c \left( \frac{\delta }{c_1 r^{-N}} \right) ^{g} d c_1^{h} \left( \frac{\delta }{c_1 r^{-N}} \right) ^{h} \le c_2 \left( \frac{\delta }{c_1 r^{-N}} \right) ^{\xi }, \end{aligned}$$

which shows that the set \(X_N \times Y_{N'}\) is a \((c_1r^{-N},\xi )_{c_2}\)-set. By a similar calculation,

$$\begin{aligned} \big |(X_N \times Y_{N'+1}) \cap B \big |&\le s^{g} c \left( \frac{\delta }{c_1 r^{-N}} \right) ^{g} d \left( \frac{\delta }{s^{-(N'+1)}} \right) ^{h} \\&\le s^{g} c \left( \frac{\delta }{c_1 r^{-N}} \right) ^{g} d \left( \frac{\delta }{c_1 r^{-N}} \right) ^{h} \le c_2 \left( \frac{\delta }{c_1 r^{-N}} \right) ^{\xi }, \end{aligned}$$

which shows that the set \(X_N \times Y_{N'+1}\) is a \((c_1r^{-N},\xi )_{c_2}\)-set. \(\square \)

2.3 A Quantitative Equidistribution Lemma

The main result in this short section, Lemma 2.18, gives a lower bound on the number of visits of an equidistributed sequence to a set as a function only of the measure and topological complexity of the set’s complement. This result is certainly not new; we state it explicitly here for convenience in a way that highlights the uniformity in the quantifiers.

For \(U \in \mathbb {N}\), denote by \({\mathcal {I}}_U\) the collection of those subsets of [0, 1) that are a union of no more than U disjoint intervals of the form [a, b).

Lemma 2.17

For any uniformly distributed sequence \((x_n)_{n \in \mathbb {N}_0} \subseteq [0,1)\), \(U \in \mathbb {N}\), and \(\epsilon > 0\), there exists \(N_0 \in \mathbb {N}\) such that for all \(N \ge N_0\) and all \(B \in {\mathcal {I}}_U\),

$$\begin{aligned} \frac{1}{N} \big | \{ 0 \le n \le N-1 \ | \ x_n \in B \} \big | \le \text {Leb}(B) + \epsilon . \end{aligned}$$

Proof

Let \((x_n)_{n \in \mathbb {N}_0} \subseteq [0,1)\) be uniformly distributed, \(U \in \mathbb {N}\), and \(\epsilon > 0\). The discrepancy of \((x_n)_{n=0}^{N-1}\) (cf. [14, Ch. 2, Def. 1.1]) is

$$\begin{aligned} D_N = \sup _{I} \left| \frac{\big | \{0 \le n \le N-1 \ | \ x_n \in I \} \big |}{N} - \text {Leb}(I)\right| , \end{aligned}$$

where the supremum is taken over all half-open intervals I in [0, 1). Because \((x_n)_n\) is uniformly distributed, \(D_N \rightarrow 0\) as \(N \rightarrow \infty \) (cf. [14, Ch. 2, Thm. 1]). By the definition of discrepancy, for any half-open interval \(I \subseteq [0,1)\),

$$\begin{aligned} \frac{1}{N} \big | \{ 0 \le n \le N-1 \ | \ x_n \in I \} \big | \le \text {Leb}(I) + D_N. \end{aligned}$$

It follows that for every \(B \in {\mathcal {I}}_U\),

$$\begin{aligned} \frac{1}{N} \big | \{ 0 \le n \le N-1 \ | \ x_n \in B \} \big | \le \text {Leb}(B) + U D_N. \end{aligned}$$

Let \(N_0 \in \mathbb {N}\) be large enough so that for all \(N \ge N_0\), \(U D_N \le \epsilon \). The conclusion follows. \(\square \)
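The statement of Lemma 2.17 can be observed on a concrete sequence. The Python sketch below computes the frequency of visits of the sequence of fractional parts \(\{n\alpha \}\) to a union of half-open intervals; the choice \(\alpha = \sqrt{2}\), the set B, and the function name `visit_frequency` are assumptions made for illustration, and the computation uses floating-point arithmetic, so it is a numerical check rather than a proof.

```python
import math


def visit_frequency(alpha, N, intervals):
    """Fraction of n in {0, ..., N-1} with frac(n * alpha) in a union of
    half-open intervals [a, b) given as (a, b) pairs."""
    hits = 0
    for n in range(N):
        x = (n * alpha) % 1.0
        if any(a <= x < b for a, b in intervals):
            hits += 1
    return hits / N


alpha = math.sqrt(2)            # hypothetical irrational rotation number
B = [(0.1, 0.25), (0.6, 0.8)]   # a member of I_U with U = 2 and Leb(B) = 0.35
for N in (10, 100, 1000, 10000):
    print(N, visit_frequency(alpha, N, B))
```

As N grows, the printed frequencies approach \(\text {Leb}(B) = 0.35\); the lemma asserts more, namely that the excess over \(\text {Leb}(B)\) is eventually at most \(\epsilon \) simultaneously for all sets in \({\mathcal {I}}_U\).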

Lemma 2.18

Let \(\beta > 0\). For any uniformly distributed sequence \((x_n)_{n \in \mathbb {N}_0} \subseteq [0,\beta )\) with respect to the Lebesgue measure, \(U \in \mathbb {N}\), and \(\epsilon > 0\), there exists \(N_0 \in \mathbb {N}\) such that for all \(N \ge N_0\) and all \(J \subseteq [0,\beta )\) whose complement is covered by a union of no more than U many disjoint, half-open intervals of total Lebesgue measure less than \(\epsilon \beta / 2\),

$$\begin{aligned} \frac{1}{N} \big | \{ 0 \le n \le N-1 \ | \ x_n \in J \} \big | \ge 1-\epsilon . \end{aligned}$$

Proof

Let \((x_n)_{n \in \mathbb {N}_0} \subseteq [0,\beta )\) be uniformly distributed, \(U \in \mathbb {N}\), and \(\epsilon > 0\). Let \(N_0\) be from Lemma 2.17 with \((x_n / \beta )_{n \in \mathbb {N}_0}\), U, and \(\epsilon / 2\).

Let \(N \ge N_0\) and \(J \subseteq [0,\beta )\). Put \(B = [0,\beta ) {\setminus } J\), and note that by assumption, \(B / \beta \in {\mathcal {I}}_U\) and \(\text {Leb}(B / \beta ) < \epsilon / 2\). It follows from Lemma 2.17 that

$$\begin{aligned} \frac{1}{N} \big | \{ 0 \le n \le N-1 \ | \ x_n / \beta \in B / \beta \} \big | < \epsilon . \end{aligned}$$

Therefore,

$$\begin{aligned} \frac{1}{N} \big | \{ 0 \le n \le N-1 \ | \ x_n \in J \} \big | \ge 1- \epsilon , \end{aligned}$$

as was to be shown. \(\square \)

3 A Discrete Marstrand Projection Theorem

In this section, we prove a discrete analogue of Marstrand’s projection theorem from geometric measure theory. The theorem – stated for sumsets in the introduction as Theorem 1.1 – says that for every Borel set \(A \subseteq [0,1]^2\), for Lebesgue-a.e. \(\theta \in [0,\pi )\), \(\dim _{\text {H}}\pi _\theta A = \min (1,\dim _{\text {H}}A)\), where \(\pi _\theta : \mathbb {R}^2 \rightarrow \mathbb {R}^2\) is the orthogonal projection onto \(\ell _\theta \), the line that contains the origin and forms an angle \(\theta \) with the positive x-axis. Marstrand’s theorem and its relatives have enjoyed much recent attention: we refer the interested reader to the survey [5] and to the end of this section where we put Theorem 3.2 into more context.

The key idea behind Marstrand’s theorem is that of “geometric transversality” and is captured in the following lemma. The proof follows from a simple geometric argument and is left to the reader. An immediate consequence of the lemma is that there are not many projections which map two distant points close together.

Lemma 3.1

For all nonzero \(x \in \mathbb {R}^2\) and all \(\rho > 0\), the set of angles \(\theta \in [0,\pi )\) for which \(|\pi _\theta x| \le \rho \) is contained in at most two balls of diameter \(\ll \rho |x|^{-1}\).
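As a sanity check on Lemma 3.1, one can estimate the set of bad angles numerically. The sketch below assumes the identity \(|\pi _\theta x| = |x|\,|\cos (\theta - \arg x)|\), so that the bad set is a single arc modulo \(\pi \) of half-width \(\arcsin (\rho /|x|)\) centered at \(\arg x + \pi /2\); the function name and the sample values of x and \(\rho \) are illustrative.

```python
import math

def bad_angle_measure(x, rho, steps=200_000):
    """Riemann-sum estimate of Leb{theta in [0, pi) : |pi_theta x| <= rho}."""
    hits = 0
    for k in range(steps):
        theta = math.pi * (k + 0.5) / steps
        # |pi_theta x| is the absolute inner product of x with the unit vector at angle theta.
        if abs(x[0] * math.cos(theta) + x[1] * math.sin(theta)) <= rho:
            hits += 1
    return math.pi * hits / steps

x = (0.6, 0.3)   # illustrative nonzero vector
rho = 0.01       # illustrative scale
norm = math.hypot(*x)

estimate = bad_angle_measure(x, rho)
exact = 2 * math.asin(rho / norm)  # total length of the bad arc (mod pi)

print(estimate, exact, math.pi * rho / norm)
```

Since \(\arcsin u \le \pi u / 2\) on [0, 1], the arc has length at most \(\pi \rho / |x|\), consistent with the \(\ll \rho |x|^{-1}\) bound; on \([0,\pi )\) the arc may wrap around and split into two pieces.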

The results in this section add to a number of other discrete Marstrand-type theorems in the recent literature: [17, Lemma 5.2], [16, Prop. 3.2], [10, Lemma 3.8], [22, Prop. 7], [21, Prop. 4.10] to name a few. Let us highlight some distinguishing features of Lemma 3.1 and Theorem 3.2 that play an important role in this work. Analogues of Lemma 3.1 more commonly found in the literature, such as the one in [19, Lemma 3.11], bound the measure of the set of projections which map x close to 0. The result in Lemma 3.1 uses coverings to capture topological information on the set of projections. This information is carried into Theorem 3.2 and is important in the application to Theorem A. Another useful feature of Theorem 3.2 is the allowance of a subset \(A'\) in (3.1); this will allow us to treat sets in Theorem A that exhibit multiplicative invariance without necessarily being self-similar.

3.1 A Discrete Projection Theorem

Our discrete analogue of Marstrand’s theorem, Theorem 3.2, reaches a conclusion similar to that of Marstrand’s by quantifying the size of the set E of exceptional directions, those directions in which the image of the set A is small. On a first reading, it is safe to think of \(\gamma < 1\), \(n \approx \rho ^{-\gamma }\), \(\delta = 1\), and \(m \approx \rho ^{-(\gamma - \epsilon )}\). In this case, the set A is a discrete analogue of a set of Hausdorff dimension \(\gamma \) and the set E is the set of exceptional directions in which the set A loses at least a proportion \(\rho ^{\epsilon }\) of its points.

Theorem 3.2

Let \(\gamma , \rho , c > 0\). Put \(\overline{\gamma } = \min (\gamma ,1)\). If \(A \subseteq [0,1]^2\) is a \((\rho ,\gamma )_c\)-set with \(n {:}{=}|A| > -\log c\), then for all \(\delta > 0\) and all \(0 \le m \le \delta ^2 n \big / 4\), the set

$$\begin{aligned} E = \big \{ \theta \in [0,\pi ) \ \big | \ \exists A' \subseteq A, \ |A'| \ge \delta n, \ \mathcal {N}(\pi _\theta {A'},\rho ) \le m \big \} \end{aligned}$$
(3.1)

satisfies

$$\begin{aligned} \mathcal {N}( E, \rho ) \ll _{\gamma ,c} \rho ^{-1} \frac{m}{\delta ^2 n} {\left\{ \begin{array}{ll} n^{1- \overline{\gamma }/\gamma } &{} \text { if } \gamma \ne 1 \\ \log n &{} \text { if } \gamma = 1 \end{array}\right. }. \end{aligned}$$

Proof

Let \(A \subseteq [0,1]^2\) be a \((\rho ,\gamma )_c\)-set of cardinality \(n > - \log c\). Let \(\delta > 0\), and let \(0 \le m \le \delta ^2 n \big / 4\).

Define \(S(\theta ) = \big \{ (a_1,a_2) \in A^2 \ \big | \ |\pi _\theta (a_1-a_2) | < \rho \big \}\). Let \(E'\) be a maximal \(\rho \)-separated subset of E; thus, \(|E'| = \mathcal {N}(E,\rho )\). The goal is to bound \(\sum _{\theta \in E'} \big | S(\theta ) \big |\) from above and below to get the desired bound on \(|E'|\).

Let \(\theta \in E'\) and \(A'\) be the subset of A corresponding to \(\theta \). Since the set \(\pi _{\theta }{A'}\) lies on a line and \(\mathcal {N}(\pi _{\theta }{A'},\rho ) \le m\), there exists a collection \(\{B\}_{B \in {\mathcal {B}}}\) of no more than \(2\,m\) closed balls B of diameter \(\rho \) whose union covers \(\pi _{\theta }{A'}\). By Cauchy-Schwarz,

$$\begin{aligned} (\delta n)^2 \le |A'|^2&\le \left( \sum _{B \in {\mathcal {B}}} \big | \{ a_0 \in A' \ | \ \pi _\theta {a_0} \in B \} \big | \right) ^2 \\&\le \big | {\mathcal {B}} \big | \sum _{B \in {\mathcal {B}}} \big | \{ a_0 \in A' \ | \ \pi _\theta {a_0} \in B \} \big |^2 \\&\le 2m \sum _{B \in {\mathcal {B}}} \big | \{ a \in A \ | \ \pi _\theta a \in B \} \big |^2\\&= 2m \sum _{B \in {\mathcal {B}}} \big | \{ (a_1,a_2) \in A^2 \ | \ \pi _\theta {a_1}, \pi _\theta {a_2} \in B \} \big |\\&\le 2m \big | S(\theta ) \big |. \end{aligned}$$

It follows that

$$\begin{aligned} \frac{\delta ^2n^2}{2m}|E'| \le \sum _{\theta \in E'} \big | S(\theta ) \big |. \end{aligned}$$
(3.2)

Now we use Lemma 3.1 to bound the right hand side of (3.2) from above: for \(a_1,a_2 \in [0,1]^2\), the set

$$\begin{aligned} \Theta (a_1,a_2) = \big \{ \theta \in [0,\pi ) \ \big | \ \big | \pi _{\theta }{(a_1-a_2)} \big | < \rho \big \} \end{aligned}$$

is contained in at most two balls of diameter \(\ll \rho / |a_1 - a_2|\). Therefore, \(\mathcal {N}(\Theta (a_1,a_2),\rho ) \ll 1 / |a_1 - a_2| \), and using the fact that \(E'\) is \(\rho \)-separated, we see that

$$\begin{aligned} \sum _{\theta \in E'}1_{S(\theta )}(a_1,a_2) = \sum _{\theta \in E'}1_{\Theta (a_1,a_2)}(\theta )\le K \frac{1}{|a_1 - a_2|} \end{aligned}$$

for some constant K depending on the result in Lemma 3.1. It follows that

$$\begin{aligned} \sum _{\theta \in E'} \big | S(\theta ) \big |&= \sum _{\theta \in E'} \sum _{a_1,a_2 \in A} 1_{S(\theta )}(a_1,a_2)\\&= n|E'| + \sum _{\begin{array}{c} a_1, a_2 \in A \\ a_1 \ne a_2 \end{array}} \sum _{\theta \in E'}1_{\Theta (a_1,a_2)}(\theta )\\&\le n|E'| + K \sum _{\begin{array}{c} a_1, a_2 \in A \\ a_1 \ne a_2 \end{array}} |a_1-a_2|^{-1}, \end{aligned}$$

and so we are left to bound the second term from above.

For \(\ell \in \mathbb {N}_0\), let \(H_\ell = \{x \in \mathbb {R}^2 \ | \ |x| \in [\rho e^\ell , \rho e^{\ell +1}) \}\). Breaking up the sum \(\sum |a_1-a_2|^{-1}\) by fixing \(a_1\) and partitioning the \(a_2\)’s by shells, and using the fact that A is \(\rho \)-separated, we see

$$\begin{aligned} \sum _{\begin{array}{c} a_1, a_2 \in A \\ a_1 \ne a_2 \end{array}} |a_1-a_2|^{-1}&= \sum _{a_1 \in A} \sum _{\ell = 0}^{\infty } \sum _{a_2 \in A \cap (a_1 + H_\ell )} |a_1-a_2|^{-1}\\&\le \rho ^{-1} \sum _{a_1 \in A} \sum _{\ell = 0}^{\infty } e^{-\ell } \big |A \cap (a_1 + H_\ell )\big |. \end{aligned}$$

Since A is a \((\rho ,\gamma )_c\)-set, for all \(\ell \ge 0\), \(\big |A \cap (a_1 + H_\ell )\big | \le c \left( 2\rho e^{\ell +1} \big / \rho \right) ^\gamma \). On the other hand, \(\sum _{\ell =0}^\infty \big |A \cap (a_1 + H_\ell )\big |=|A|-1\). It follows then from the fact that \(\ell \mapsto e^{-\ell }\) is decreasing that \(\sum _{\ell = 0}^{\infty } e^{-\ell } \big |A \cap (a_1 + H_\ell )\big |\le \sum _{\ell = 0}^{\ell _0}2^\gamma ce^{\ell (\gamma -1)+\gamma }\), where \(\ell _0 = \lceil \log ((n/c)^{1/\gamma }) \rceil \) is the smallest value such that the set A could be contained in a ball of diameter \(\rho e^{\ell _0}\) about \(a_1\). Therefore,

$$\begin{aligned} \rho ^{-1} \sum _{a_1 \in A} \sum _{\ell = 0}^{\infty } e^{-\ell } \big |A \cap (a_1 + H_\ell )\big |&\ll _{\gamma ,c} \rho ^{-1} \sum _{a_1 \in A} \sum _{\ell = 0}^{\ell _0} \big (e^{\gamma - 1}\big )^\ell \\&\ll _{\gamma ,c} \rho ^{-1} n {\left\{ \begin{array}{ll} n^{1- \overline{\gamma }/\gamma } &{} \text { if } \gamma \ne 1 \\ \log n &{} \text { if } \gamma = 1 \end{array}\right. }. \end{aligned}$$

Combining the upper and lower bounds on \(\sum _{\theta \in E'} \big | S(\theta ) \big |\), we see that there exists a constant K depending on the result in Lemma 3.1, \(\gamma \), and c such that

$$\begin{aligned} \frac{\delta ^2n^2}{2m}|E'| \le n|E'| + K \rho ^{-1}n {\left\{ \begin{array}{ll} n^{1- \overline{\gamma }/\gamma } &{} \text { if } \gamma \ne 1 \\ \log n &{} \text { if } \gamma = 1 \end{array}\right. }. \end{aligned}$$

Dividing both sides by n and using the fact that \(m \le \delta ^2n / 4\), we see that

$$\begin{aligned} \frac{\delta ^2n}{4m}|E'| \le \left( \frac{\delta ^2n}{2m} - 1 \right) |E'| \le K \rho ^{-1} {\left\{ \begin{array}{ll} n^{1- \overline{\gamma }/\gamma } &{} \text { if } \gamma \ne 1 \\ \log n &{} \text { if } \gamma = 1 \end{array}\right. }, \end{aligned}$$

which rearranges to the desired conclusion. \(\square \)
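The Cauchy-Schwarz step behind the lower bound (3.2) can be illustrated on a random point set. This is a sketch under stated assumptions: half-open bins \([k\rho , (k+1)\rho )\) play the role of the covering balls in \({\mathcal {B}}\), the set \(A'\) is taken to be all of A, and the random set A is not required to be a \((\rho ,\gamma )_c\)-set. The function name `projection_counts` is illustrative.

```python
import math
import random

def projection_counts(A, theta, rho):
    """Return (number of occupied rho-bins of pi_theta A, |S(theta)|)."""
    # Signed position of each projected point along the line l_theta.
    proj = [a[0] * math.cos(theta) + a[1] * math.sin(theta) for a in A]
    bins = {math.floor(p / rho) for p in proj}
    # S(theta): ordered pairs (including a_1 = a_2) whose projections are rho-close.
    s_theta = sum(1 for p in proj for q in proj if abs(p - q) < rho)
    return len(bins), s_theta

rng = random.Random(0)
A = [(rng.random(), rng.random()) for _ in range(300)]
rho = 0.02
theta = 1.0

num_bins, s_theta = projection_counts(A, theta, rho)

# Cauchy-Schwarz: |A|^2 <= (#bins) * sum of squared bin counts <= (#bins) * |S(theta)|.
print(len(A) ** 2, "<=", num_bins * s_theta)
```

When the projection occupies few bins, the inequality forces \(|S(\theta )|\) to be large, which is what the proof exploits for directions in E.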

3.2 A Corollary for Oblique Projections

The proof of Theorem A will feature oblique projections instead of orthogonal ones. The following corollary concerns oblique projections and is stated in a way that will make it immediately applicable in the proof of Theorem A.

Denote by \(\Pi _{t}: \mathbb {R}^2 \rightarrow \mathbb {R}\) the oblique projection \(\Pi _{t}(x,y) = x + t y\). Let \(\varphi : (0, \pi / 2) \rightarrow \mathbb {R}\) be the diffeomorphism \(\varphi (\theta ) = \log \tan \theta \). Note that \(\Pi _{e^{\varphi (\theta )}}\) is the oblique projection that is the “continuation” of the orthogonal projection \(\pi _\theta \), meaning that the points (x, y), \((\Pi _{e^{\varphi (\theta )}}(x,y),0)\), and \(\pi _\theta (x,y)\) are collinear.
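The collinearity claim is easy to verify numerically, using \(e^{\varphi (\theta )} = \tan \theta \). The sketch below uses illustrative helper names and sample points; it checks that the cross product of the two difference vectors vanishes up to rounding.

```python
import math

def orthogonal_projection(theta, p):
    """Orthogonal projection of p onto the line through 0 at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    t = p[0] * c + p[1] * s
    return (t * c, t * s)

def oblique_projection(t, p):
    """Pi_t(x, y) = x + t * y."""
    return p[0] + t * p[1]

def collinearity_defect(theta, p):
    """Cross product of (q - p) and (w - p); zero iff p, q, w are collinear."""
    q = (oblique_projection(math.tan(theta), p), 0.0)  # e^{phi(theta)} = tan(theta)
    w = orthogonal_projection(theta, p)
    return (q[0] - p[0]) * (w[1] - p[1]) - (q[1] - p[1]) * (w[0] - p[0])

samples = [(0.3, (0.2, 0.9)), (0.8, (0.7, 0.1)), (1.2, (0.5, 0.5))]
defects = [collinearity_defect(theta, p) for theta, p in samples]
print(defects)
```

Geometrically, all three points lie on the line through (x, y) perpendicular to \(\ell _\theta \), which meets the x-axis at \(x + y \tan \theta \).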

Corollary 3.3

Let \(0< \gamma _1< \gamma _2< \gamma _3< \gamma _4\) be such that \(\gamma _1< 1\) and

$$\begin{aligned} 2 (\gamma _4- \gamma _2) < \gamma _3- \gamma _1. \end{aligned}$$
(3.3)

For all compact \(I \subseteq \mathbb {R}\), all \(\epsilon , c_1, c_2, c_3 > 0\), all sufficiently small \(\rho > 0\) (depending on all previous quantities), and all \((c_1 \rho ,\gamma _4)_{c_2}\)-sets \(A \subseteq [0,1]^2\) with \(|A| \ge \rho ^{-\gamma _3}\), there exists \(T \subseteq I\) with the following properties:

  1. (I)

    the set \(I {\setminus } T\) can be covered by a disjoint union of not more than \(\epsilon \rho ^{-1} / 2\)-many half-open intervals of length \(\rho \), a cover of total Lebesgue measure less than \(\epsilon \).

  2. (II)

    for all \(t \in T\) and all \(A' \subseteq A\) with \(|A'| \ge \rho ^{-\gamma _2}\), there exists a subset \(A'_t \subseteq A'\) with \(|A'_t| \ge \rho ^{-\gamma _1}\) such that the points of \(\Pi _{e^t} A'_t\) are distinct and \(c_3 \rho \)-separated.

Proof

Let \(I \subseteq \mathbb {R}\) be compact and \(\epsilon , c_1, c_2, c_3 > 0\). Let \(\sigma \in \big (\gamma _4- \gamma _2, (\gamma _3-\gamma _1)/2 \big )\). Let \(\rho > 0\) be sufficiently small (to be specified later, but depending only on the quantities introduced thus far). Let \(A \subseteq [0,1]^2\) be a \((c_1 \rho ,\gamma _4)_{c_2}\)-set with \(|A| \ge \rho ^{-\gamma _3}\). Put \(\overline{\gamma _4} = \min (1,\gamma _4)\), \(n = |A|\), \(\delta = \rho ^\sigma \), and \(m = 2c_3 \rho ^{-\gamma _1}\). Note that since A is a \((c_1 \rho ,\gamma _4)_{c_2}\)-set contained in a ball of diameter \(\sqrt{2}\), \(n \le 2 c_2 (c_1 \rho )^{-\gamma _4}\).

We want to apply Theorem 3.2 with \(\gamma _4\) as \(\gamma \), \(c_1 \rho \) as \(\rho \), \(c_2\) as c, and with A, n, \(\delta \), and m as they are. We see that the inequality \(n > - \log c_2\) holds for \(\rho \) sufficiently small, as does \(m \le \delta ^2 n / 4\) since \(\sigma < (\gamma _3-\gamma _1)/2\). Since the conditions of Theorem 3.2 hold, the set \(E \subseteq [0,\pi )\) defined in (3.1) satisfies

$$\begin{aligned} \begin{aligned}\mathcal {N}(E,\rho )&\ll _{\gamma _4, c_2} \rho ^{-1} \frac{m}{\delta ^2 n} n^{1-\overline{\gamma _4} / \gamma _4} \log n \\&\ll _{\gamma _4, c_1, c_2,c_3} \rho ^{-1} \frac{\rho ^{-\gamma _1}}{\rho ^{2\sigma } \rho ^{-\gamma _3\overline{\gamma _4} / \gamma _4}} \log \left( \rho ^{-\gamma _4} \right) . \end{aligned} \end{aligned}$$
(3.4)

Let \(J = \varphi ^{-1}(I)\), and put \(T = I {\setminus } { \left. \hspace{0.0pt}\varphi \right| _{J}}(E)\). Since the map \({ \left. \hspace{0.0pt}\varphi \right| _{J}}\) is bi-Lipschitz,

$$\begin{aligned} \mathcal {N}({ \left. \hspace{0.0pt}\varphi \right| _{J}}(E), \rho ) \asymp _I \mathcal {N}(E,\rho ). \end{aligned}$$

Combining this with (3.4) and the fact that \(\sigma < (\gamma _3-\gamma _1)/2\), we have that for sufficiently small \(\rho \), \(\mathcal {N}(I {\setminus } T,\rho ) \le \epsilon \rho ^{-1} / 6\). It follows that the set \(I {\setminus } T\) can be covered by a disjoint union of not more than \(\epsilon \rho ^{-1} / 2\)-many half-open intervals of length \(\rho \), a cover of total measure less than \(\epsilon \). This establishes (I).

To prove (II), let \(t \in T\), and let \(A' \subseteq A\) with \(|A'| \ge \rho ^{-\gamma _2}\). Since \(n \le 2 c_2 (c_1 \rho )^{-\gamma _4}\) and \(\sigma > \gamma _4- \gamma _2\), for sufficiently small \(\rho \), \(\rho ^{-\gamma _2} \ge \delta n\). It follows that \(|A'| \ge \delta n\). Because \(\theta {:}{=}\varphi ^{-1}(t) \not \in E\), \(\mathcal {N}(\pi _\theta {A'},\rho ) \ge m\). It follows that \(\mathcal {N}(\pi _\theta {A'},c_3\rho ) \ge \rho ^{-\gamma _1}\). By choosing one point of \(A'\) in each fiber over a maximal \(c_3\rho \)-separated subset of the projection, we see that there exists a subset \(A'_t \subseteq A'\) of cardinality at least \(\rho ^{-\gamma _1}\) such that the orthogonal projections of the points in \(A'_t\) onto \(\ell _\theta \) are distinct and \(c_3\rho \)-separated. Since the oblique projection \(\Pi _{e^t}\) increases distances between points that lie on \(\ell _\theta \), the images of points of \(A'_t\) under \(\Pi _{e^t}\) are \(c_3\rho \)-separated. \(\square \)

4 Trees and a Subtree Regularity Theorem

Trees are combinatorial objects that are convenient for describing fractal sets. We will be concerned solely with finite trees throughout this work. After giving the main definitions, we motivate their importance by explaining how they will be used in the proof of Theorem A. We then prove the main result of this section.

4.1 Preliminary Definitions

The following definitions describe the familiar notion of a rooted tree, a graph with no cycles whose vertices can be arranged on levels and whose edges only connect vertices on adjacent levels.

Definition 4.1

  • A tree of height \(N \in \mathbb {N}_0\) is a finite set of nodes \(\Gamma \) together with a partition \(\Gamma =\Gamma _0\cup \cdots \cup \Gamma _N\) with \(|\Gamma _0|=1\) and a parent function \(P: \Gamma {\setminus } \Gamma _0 \rightarrow \Gamma {\setminus } \Gamma _N\) such that for every \(n\in \{1,\dots ,N\}\), \(P(\Gamma _n)=\Gamma _{n-1}\).

  • The nodes in \(\Gamma _n\) have height n. The single node with height 0 is the root and the nodes with height N are called leaves.

  • The node Q is the parent of each of its children, nodes in the set \(C_\Gamma (Q) {:}{=}P^{-1}(Q)\).

  • If Q is a node of height n, the induced tree based at Q is the tree \(\Gamma _Q {:}{=}\cup _{i=0}^{N-n} C_\Gamma ^{i}(Q)\) of height \(N-n\) with root Q and the same parent function as \(\Gamma \), restricted to the set \(\Gamma _Q\).

  • A subtree of \(\Gamma \) is a tree \(\Gamma ' \subseteq \Gamma \) of the same height as \(\Gamma \) with parent function \({ \left. \hspace{0.0pt}P \right| _{\Gamma ' \setminus \Gamma _0'}}\). (A subtree is uniquely determined by its non-empty set of leaves \(\Gamma _N' \subseteq \Gamma _N\).)

Continuing with terminology inspired by genealogy trees, the ancestors of a node Q are those nodes that lie between Q and the root. For the reasons described below in Remark 4.4, it will be important to count the number of ancestors of Q that have many children. To this end, we introduce the following terminology and notation.

Definition 4.2

Let \(\Gamma \) be a tree, \(c > 0\), and \(\omega \in [0,1]\).

  • The ancestry of \(Q \in \Gamma _n\) is the set

    $$\begin{aligned} \mathcal {A}_\Gamma (Q) {:}{=}\{P^k(Q) \ | \ 1 \le k \le n\}. \end{aligned}$$

    Note that \(|\mathcal {A}_\Gamma (Q)|\) is equal to the height of Q.

  • The node Q is c-fertile if \(|C_\Gamma (Q)| \ge c\). The set of c-fertile ancestors of Q is denoted

    $$\begin{aligned} \mathcal {F}_{\Gamma ,c}(Q) {:}{=}\{A \in \mathcal {A}_\Gamma (Q) \ | \ A \text { is } c\text {-fertile} \}. \end{aligned}$$

    A node Q has \((c,\omega )\)-fertile ancestry if \(|\mathcal {F}_{\Gamma ,c}(Q)| \ge \omega |\mathcal {A}_\Gamma (Q)|\).

The following definitions allow us to capture the dimension of a finite tree by giving costs to the nodes and measuring the cost of the least expensive cut.

Definition 4.3

Let \(\Gamma \) be a tree, \(r \in \mathbb {N}\), \(r \ge 2\), and \(\gamma > 0\).

  • A cut of \(\Gamma \) is a subset \(\mathcal {C}\subseteq \Gamma \) such that for every leaf L of \(\Gamma \), \(\big (\{L\} \cup \mathcal {A}_\Gamma (L) \big ) \cap \mathcal {C}\ne \emptyset \).

  • The \(\gamma \)-Hausdorff content of \(\Gamma \) with base r is

    $$\begin{aligned} \mathcal {H}_r^\gamma (\Gamma ) {:}{=}\min \left\{ \sum _{Q \in {\mathcal {C}}} r^{- \text {height}(Q) \gamma } \ \Bigg | \ \mathcal {C}\text { is a cut of}\,\, \Gamma \right\} . \end{aligned}$$
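For small trees, the content in Definition 4.3 can be computed by a simple recursion. The sketch below is illustrative and makes two assumptions: a tree is encoded as nested Python lists (a node is the list of its children, a leaf is the empty list), and all leaves sit at the same height, as Definition 4.1 requires. It uses the observation that a minimal cut either consists of the root alone, at cost \(r^{0} = 1\), or is a union of cuts of the induced trees based at the children of the root, each of whose nodes lies one level deeper.

```python
def hausdorff_content(tree, r, gamma):
    """gamma-Hausdorff content with base r of a tree given as nested lists.

    Heights are measured from the root of `tree`, so calling this on a child
    computes the content of the induced tree based at that child.
    """
    if not tree:
        # A tree of height 0: the only cut is the root itself, of cost 1.
        return 1.0
    # Cutting below the root: every node is one level deeper than in the
    # induced tree containing it, which costs an extra factor r^(-gamma).
    below = r ** (-gamma) * sum(hausdorff_content(child, r, gamma) for child in tree)
    return min(1.0, below)

chain = [[[[]]]]                # height 3, every non-leaf node has one child
binary = [[[], []], [[], []]]   # full binary tree of height 2

print(hausdorff_content(chain, 2, 1.0))   # cheapest cut is the single leaf
print(hausdorff_content(binary, 2, 1.0))  # root and leaf level cost the same
print(hausdorff_content(binary, 2, 2.0))  # cheapest cut is the leaf level
```

The recursion visits each node once, so it runs in time linear in the size of the tree.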

The main result in this section, Theorem 4.7, says, roughly speaking, that any tall enough tree with Hausdorff content bounded from below and with a uniform upper bound on the number of children of any node has a subtree in which most nodes have fertile ancestry. Before making this statement precise and beginning with the details of the proof, let us make two observations about the concept of fertile ancestry that will help explain why it will be useful later on in the proof of Theorem A.

Remark 4.4

  1. (I)

    The property of having fertile ancestry is preserved under a type of tree thinning process that we will employ in the proof of Theorem A. More specifically, suppose that \(\Gamma \) is a tree in which every node has either one child or at least c many children and in which every node has \((c, \omega )\)-fertile ancestry. Suppose further that for every node Q, there exists a subset \({\tilde{C}}(Q) \subseteq C_\Gamma (Q)\) of the children of Q with \(|{\tilde{C}}(Q)| \ge \min \big (\tilde{c},|C_\Gamma (Q)| \big )\). These subsets naturally give rise to a subtree \({\tilde{\Gamma }}\) obtained by thinning the tree \(\Gamma \): the subtree \({\tilde{\Gamma }}\) is uniquely defined by the property that if Q is a node of \({\tilde{\Gamma }}\), then \(C_{{\tilde{\Gamma }}}(Q) = {\tilde{C}}(Q)\). It is not hard to see that every node in \(\tilde{\Gamma }\) has \(({{\tilde{c}}}, \omega )\)-fertile ancestry, regardless of how the subsets of children \({\tilde{C}}(Q)\) were chosen.

  2. (II)

    A tree in which every node has fertile ancestry necessarily has large Hausdorff content. This is a simple consequence of the mass distribution principle (or the max flow-min cut theorem) for trees, the real analogue of which is stated in Lemma 2.6. More specifically, let \(\Gamma \) be a tree, and consider a “flow” through \(\Gamma \) of magnitude 1 starting at the root that splits equally amongst children. The value of the flow at any node Q with fertile ancestry can be bounded from above using the fact that many times, much of the flow is split amongst a large set of children before reaching Q. If all nodes of \(\Gamma \) have fertile ancestry, then the flow is not concentrated too highly at any node. According to the mass distribution principle, the Hausdorff content of a tree that supports such a flow is high.

4.2 A Subtree Regularity Theorem

We now proceed with the main results in this subsection. In the next two results, fix \(r \ge 2\) and \(0< \gamma _2< \gamma _3< \gamma _4\) such that setting

$$\begin{aligned} A&{:}{=}\,\, \gamma _4- \gamma _3+ \log _r 2,\\ B&{:}{=}\,\, \gamma _3- \gamma _2- \log _r 2, \end{aligned}$$

ensures the quantity B is positive. The following lemma describes the fundamental dichotomy behind Theorem 4.7.

Lemma 4.5

If \(\Gamma \) is a tree with the property that

$$\begin{aligned} \text {every node in the tree has at most }r^{\gamma _4}\text { many children,} \end{aligned}$$
(4.1)

then at least one of the following holds:

  1. (I)

    there are at least \(r^{\gamma _2}\) many children Q of the root, each of which satisfies

    $$\begin{aligned} \mathcal {H}_r^{\gamma _3}(\Gamma _Q) \ge \mathcal {H}_r^{\gamma _3}(\Gamma ) r^{-A}; \end{aligned}$$
  2. (II)

    there is at least one child Q of the root satisfying

    $$\begin{aligned} \mathcal {H}_r^{\gamma _3}(\Gamma _Q) \ge \mathcal {H}_r^{\gamma _3}(\Gamma ) r^{B}. \end{aligned}$$

Proof

Let \(\Gamma \) be a tree satisfying (4.1). Let \(Q_1\), \(Q_2\), ..., \(Q_I\) be the children of the root of \(\Gamma \), ordered so that \(\mathcal {H}_r^{\gamma _3}(\Gamma _{Q_i}) \ge \mathcal {H}_r^{\gamma _3}(\Gamma _{Q_{i+1}})\). If neither (I) nor (II) holds, then \(\mathcal {H}_r^{\gamma _3}(\Gamma _{Q_1}) < \mathcal {H}_r^{\gamma _3}(\Gamma ) r^{B}\) and \(\mathcal {H}_r^{\gamma _3}(\Gamma _{Q_{\lceil r^{\gamma _2} \rceil }}) < \mathcal {H}_r^{\gamma _3}(\Gamma ) r^{-A}\). It follows by the ordering of the \(Q_i\)’s and the definition of the Hausdorff content and induced trees that

$$\begin{aligned} \mathcal {H}_r^{\gamma _3}(\Gamma )&\le r^{-\gamma _3} \sum _{i=1}^I \mathcal {H}_r^{\gamma _3}(\Gamma _{Q_i}) \\&= \sum _{i=1}^{\lfloor r^{\gamma _2} \rfloor } r^{-\gamma _3} \mathcal {H}_r^{\gamma _3}(\Gamma _{Q_i}) + \sum _{i=\lceil r^{\gamma _2} \rceil }^{I} r^{-\gamma _3} \mathcal {H}_r^{\gamma _3}(\Gamma _{Q_i})\\&< r^{\gamma _2} r^{-\gamma _3} \mathcal {H}_r^{\gamma _3}(\Gamma ) r^{B} + r^{\gamma _4} r^{-\gamma _3} \mathcal {H}_r^{\gamma _3}(\Gamma ) r^{-A} = \mathcal {H}_r^{\gamma _3}(\Gamma ), \end{aligned}$$

a contradiction. \(\square \)
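The final equality in the display above reflects how A and B were chosen: each of the two terms contributes exactly half of \(\mathcal {H}_r^{\gamma _3}(\Gamma )\). A quick numerical check, with illustrative parameter values and an illustrative function name:

```python
import math

def dichotomy_terms(r, g2, g3, g4):
    """The two coefficients of H(Gamma) in the last line of the proof of Lemma 4.5."""
    A = g4 - g3 + math.log(2, r)
    B = g3 - g2 - math.log(2, r)
    first = r ** (g2 - g3 + B)    # from the (fewer than r^{g2}) largest children
    second = r ** (g4 - g3 - A)   # from the (at most r^{g4}) remaining children
    return first, second

first, second = dichotomy_terms(r=16, g2=0.5, g3=1.0, g4=1.25)
print(first, second, first + second)
```

Both exponents reduce to \(-\log _r 2\), so each coefficient equals 1/2 for any admissible choice of parameters.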

Lemma 4.6

Every finite tree \(\Gamma \) that satisfies (4.1) has a subtree \(\Gamma '\) with the property that for all nodes Q in \(\Gamma '\),

$$\begin{aligned} \big |\mathcal {F}_{\Gamma ',r^{\gamma _2}}(Q) \big | \ge \frac{|\mathcal {A}_{\Gamma '}(Q)|B + \log _r \mathcal {H}_r^{\gamma _3}(\Gamma )}{A+B}. \end{aligned}$$
(4.2)

Proof

We will prove the lemma by induction on the height N of the tree \(\Gamma \). To verify the base case, let \(\Gamma \) be the tree of height \(N=0\): a single node with no children. Taking \(\Gamma ' = \Gamma \), the inequality (4.2) for this single node follows from the fact that \(\log _r \mathcal {H}_r^{\gamma _3}(\Gamma ) = 0\).

Suppose that \(N \in \mathbb {N}\) is such that the lemma holds for all trees of height \(N-1\). Let \(\Gamma \) be a tree of height N that satisfies (4.1). By Lemma 4.5, at least one of Case (I) or Case (II) holds.

Suppose Case (I) of Lemma 4.5 holds. Let Q be any one of the \(r^{\gamma _2}\)-many children guaranteed by Case (I). By the induction hypothesis, there exists a subtree \(\Gamma _Q'\) of \(\Gamma _Q\) in which every node satisfies (4.2) with \(\Gamma _Q\) in place of \(\Gamma \) and \(\Gamma _Q'\) in place of \(\Gamma '\). Define the subtree \(\Gamma '\) of \(\Gamma \) to be the root node of \(\Gamma \) with the collection of at least \(r^{\gamma _2}\) many children Q, each of those children followed by its subtree \(\Gamma _Q'\).

We will now verify that (4.2) holds for all nodes of \(\Gamma '\). Let Q be any node of \(\Gamma '\). If Q is the root node of \(\Gamma '\), then (4.2) holds because \(\log _r \mathcal {H}_r^{\gamma _3}(\Gamma ) \le 0\). (Indeed, that \(\mathcal {H}_r^{\gamma _3}(\Gamma ) \le 1\) follows by considering the cut \(\mathcal {C}{:}{=}\{Q\}\) of \(\Gamma \).) If Q is a non-root node of \(\Gamma '\), then it belongs to one of the subtrees \(\Gamma _S'\) for some child S of the root of \(\Gamma '\). By property (4.2) for the subtree \(\Gamma _S'\), we see

$$\begin{aligned} |\mathcal {F}_{\Gamma ',r^{\gamma _2}}(Q)| - 1&= |\mathcal {F}_{\Gamma '_S,r^{\gamma _2}}(Q)| \\&\ge \frac{|{\mathcal {A}}_{\Gamma '_S}(Q)|B + \log _r \mathcal {H}_r^{\gamma _3}(\Gamma _S)}{A+B} \\&\ge \frac{(|{\mathcal {A}}_{\Gamma '}(Q)|-1)B + \log _r \mathcal {H}_r^{\gamma _3}(\Gamma ) - A}{A+B}. \end{aligned}$$

This simplifies to the inequality in (4.2), verifying the inductive step if Case (I) of Lemma 4.5 holds.

Suppose Case (II) of Lemma 4.5 holds. Let Q be the child guaranteed by Case (II). By the induction hypothesis, there exists a subtree \(\Gamma _Q'\) of \(\Gamma _Q\) in which every node satisfies (4.2) with \(\Gamma _Q\) in place of \(\Gamma \) and \(\Gamma _Q'\) in place of \(\Gamma '\). Define the subtree \(\Gamma '\) of \(\Gamma \) to be the root of \(\Gamma \) with only the child Q followed by its subtree \(\Gamma _Q'\).

We will now verify that (4.2) holds for all nodes of \(\Gamma '\). Let Q be any node of \(\Gamma '\). If Q is the root node of \(\Gamma '\), then (4.2) holds because \(\log _r \mathcal {H}_r^{\gamma _3}(\Gamma ) \le 0\). If Q is a non-root node of \(\Gamma '\), then by property (4.2) for the subtree containing Q, we see

$$\begin{aligned} |\mathcal {F}_{\Gamma ',r^{\gamma _2}}(Q)| \ge \frac{(|{\mathcal {A}}_{\Gamma '}(Q)|-1)B + \log _r \mathcal {H}_r^{\gamma _3}(\Gamma )+B}{A+B}. \end{aligned}$$

This simplifies to the inequality in (4.2), verifying the inductive step if Case (II) of Lemma 4.5 holds. The proof of the inductive step is complete, and the lemma follows. \(\square \)
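The induction in the proof is constructive, and it may help to see it run. The following sketch makes several assumptions that are not in the text: trees are nested lists (a node is the list of its children, a leaf is the empty list), the content is computed by the recursion \(\mathcal {H}(\Gamma ) = \min \big (1, r^{-\gamma } \sum _Q \mathcal {H}(\Gamma _Q)\big )\) over the children Q of the root, and the test trees are random with 1, 2, or 6 children per node. All function names and parameter values are illustrative.

```python
import math
import random

def hausdorff_content(tree, r, gamma):
    """Content of a nested-list tree: cut at the root, or cut every induced tree."""
    if not tree:
        return 1.0
    return min(1.0, r ** (-gamma) * sum(hausdorff_content(c, r, gamma) for c in tree))

def regular_subtree(tree, r, g2, g3, g4):
    """Subtree built by following the two cases of Lemma 4.5 at every node."""
    if not tree:
        return []
    A = g4 - g3 + math.log(2, r)
    h = hausdorff_content(tree, r, g3)
    scored = [(hausdorff_content(c, r, g3), c) for c in tree]
    good = [c for v, c in scored if v >= h * r ** (-A)]
    if len(good) >= r ** g2:
        kept = good                                     # Case (I): many decent children
    else:
        kept = [max(scored, key=lambda p: p[0])[1]]     # Case (II): one excellent child
    return [regular_subtree(c, r, g2, g3, g4) for c in kept]

def satisfies_4_2(sub, r, g2, g3, g4, h_root):
    """Check inequality (4.2) at every node of the subtree `sub`."""
    A = g4 - g3 + math.log(2, r)
    B = g3 - g2 - math.log(2, r)
    def walk(node, depth, fertile):
        bound = (depth * B + math.log(h_root, r)) / (A + B)
        if fertile < bound - 1e-9:                      # tolerance for rounding
            return False
        below = fertile + (1 if len(node) >= r ** g2 else 0)
        return all(walk(c, depth + 1, below) for c in node)
    return walk(sub, 0, 0)

def random_tree(height, rng, choices=(1, 2, 6)):
    """Random tree with all leaves at the given height (assumed test data)."""
    if height == 0:
        return []
    return [random_tree(height - 1, rng, choices) for _ in range(rng.choice(choices))]

r, g2, g3, g4 = 16, 0.5, 1.0, 1.25    # B = 0.25 > 0 and r^{g4} = 32 >= 6 children
tree = random_tree(5, random.Random(0))
sub = regular_subtree(tree, r, g2, g3, g4)
print(satisfies_4_2(sub, r, g2, g3, g4, hausdorff_content(tree, r, g3)))
```

The sketch recomputes contents at every level for clarity; caching them would make the construction linear in the size of the tree.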

Theorem 4.7

For all \(0< \epsilon < 1\), for all \(0< \gamma _2< \gamma _3< \gamma _4< \gamma _3+ \epsilon (\gamma _3- \gamma _2)\), for all sufficiently large \(r \in \mathbb {N}\), and for all \(V > 0\), there exists \(N_0 \in \mathbb {N}\) for which the following holds. For all \(N \ge N_0\) and for all trees \(\Gamma \) of height N with \(\mathcal {H}_r^{\gamma _3}(\Gamma ) \ge V\) that satisfy (4.1), there exists a subtree \(\Gamma '\) of \(\Gamma \) such that all nodes \(Q \in \Gamma '\) with height at least \(N_0\) have \((r^{\gamma _2},1-\epsilon )\)-fertile ancestry in \(\Gamma '\).

Proof

Let \(0< \epsilon < 1\) and \(0< \gamma _2< \gamma _3< \gamma _4< \gamma _3+ \epsilon (\gamma _3- \gamma _2)\). Let \(r \in \mathbb {N}\) be sufficiently large so that \(\gamma _3- \gamma _2- \log _{r}2 > (1-\epsilon )(\gamma _4- \gamma _2)\). Define \(A = \gamma _4- \gamma _3+ \log _r 2\) and \(B = \gamma _3- \gamma _2- \log _r 2\), and note by the inequality in the previous sentence, \(B / (A+B) > (1-\epsilon )\). Let \(V > 0\). Choose \(N_0 \in \mathbb {N}\) such that

$$\begin{aligned} \frac{N_0B + \log _r V}{N_0(A+B)} > 1-\epsilon , \end{aligned}$$
(4.3)

and note that for all \(N \ge N_0\), the inequality in (4.3) holds with \(N_0\) replaced by N.

Let \(N \ge N_0\), and let \(\Gamma \) be a tree of height N with \(\mathcal {H}_r^{\gamma _3}(\Gamma ) \ge V\) that satisfies (4.1). By Lemma 4.6, there exists a subtree \(\Gamma '\) of \(\Gamma \) such that for all nodes Q of \(\Gamma '\), the inequality in (4.2) holds.

Let Q be a node of \(\Gamma '\) with height at least \(N_0\). By (4.2) and (4.3), we see that

$$\begin{aligned} \frac{|\mathcal {F}_{\Gamma ',r^{\gamma _2}}(Q)|}{|{\mathcal {A}}_{\Gamma '}(Q)|} \ge \frac{|{\mathcal {A}}_{\Gamma '}(Q)|B + \log _r V}{|{\mathcal {A}}_{\Gamma '}(Q)|(A+B)} > 1-\epsilon . \end{aligned}$$

It follows that Q has \((r^{\gamma _2},1-\epsilon )\)-fertile ancestry in \(\Gamma '\), as was to be shown. \(\square \)
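To get a feel for the size of \(N_0\), one can search for the smallest integer satisfying (4.3). This is a minimal sketch with illustrative parameter values; the function name is an assumption, and the search terminates because the left-hand side of (4.3) tends to \(B/(A+B) > 1-\epsilon \) as \(N_0 \rightarrow \infty \).

```python
import math

def smallest_n0(eps, g2, g3, g4, r, V):
    """Smallest N_0 in the natural numbers satisfying inequality (4.3)."""
    A = g4 - g3 + math.log(2, r)
    B = g3 - g2 - math.log(2, r)
    if not B / (A + B) > 1 - eps:
        raise ValueError("r is too small for these parameters")
    log_V = math.log(V, r)
    n0 = 1
    while not (n0 * B + log_V) / (n0 * (A + B)) > 1 - eps:
        n0 += 1
    return n0

# Illustrative values: eps = 1/2, gamma_4 < gamma_3 + eps * (gamma_3 - gamma_2) = 1.25.
n0 = smallest_n0(eps=0.5, g2=0.5, g3=1.0, g4=1.2, r=2 ** 20, V=0.01)
print(n0)
```

With these values, \(r = 16\) would be rejected because \(B/(A+B) \approx 0.36\), which shows why the theorem requires r to be sufficiently large.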

5 Proof of the Sumsets Theorem

In this section, we prove Theorem A, the main theorem in this work. We restate it here for the reader’s convenience.

Theorem A

Let r and s be multiplicatively independent positive integers, and let \(X, Y \subseteq [0,1]\) be \(\times r\)- and \(\times s\)-invariant sets, respectively. Define \(\overline{\gamma }= \min \big ( \dim _{\text {H}}X + \dim _{\text {H}}Y, 1 \big )\). For all compact \(I \subseteq \mathbb {R}{\setminus } \{0\}\) and all \(\gamma < \overline{\gamma }\),

$$\begin{aligned} \inf _{\lambda , \eta \in I}\ \mathcal {H}_{>0}^{\gamma } \big ( \lambda X + \eta Y \big ) > 0. \end{aligned}$$
(5.1)

Several auxiliary results go into the proof: the discrete version of Marstrand’s projection theorem in Sect. 3, the subtree regularity theorem for finite trees in Sect. 4, and the quantitative equidistribution result in Sect. 2.3. We outline the proof of Theorem A in Sect. 5.1 before presenting the full details in Sects. 5.2 and 5.3.

Remark 5.1

It is natural to ask about the value of the infimum \(C {:}{=}\inf _{\lambda , \eta \in I}\ \mathcal {H}_{>0}^{\gamma } \big ( \lambda X + \eta Y \big )\) that appears in (5.1), or, more precisely, how it depends on X and Y. The value of C must depend on r, s, \(\gamma \), \(\overline{\gamma }\), and I, but also on X and Y, at least to the extent that it accounts for the Hausdorff content of \(X\times Y\). It follows from the proof of Theorem A below that this is essentially the only sense in which C depends on X and Y.

More precisely, there exist \(\gamma _3,\gamma _4>0\) (depending only on \(\gamma \) and \(\dim _{\text {H}}(X\times Y)\)) with \(\gamma _3< \dim _{\text {H}}(X\times Y) < \gamma _4\) such that taking \(M_0\in \mathbb {N}\) and \(c_1,c_2>0\) as given by Corollary 2.16 when applied with \(\gamma _4\) as \(\xi \), the quantity C depends only on \(M_0\), \(c_1\), \(c_2\), r, s, I, \(\gamma \), \(\dim _{\text {H}}(X\times Y)\), and \(\mathcal {H}_{>0}^{\gamma _3}(X\times Y)\), but otherwise not on X and Y.Footnote 5

5.1 Outline of the Proof of Theorem A

Before beginning with the details of the proof of Theorem A, we explain the main ideas behind it. To understand the argument, it helps to begin by assuming that the set \(X \times Y\) is self-similar in the sense that for every \(n \in \mathbb {N}_0\), it is a union of approximately \(r^{n (\dim _{\text {H}}X + \dim _{\text {H}}Y)}\) many translates of the set \(r^{-n} X \times s^{-n'} Y\). (Recall that \(n' = \lfloor n \log r / \log s \rfloor \) so that \(s^{-n'}\approx r^{-n}\).) This is the case, for example, if X and Y are both restricted digit Cantor sets. In this case, Peres and Shmerkin [22] proved that for all \(\lambda , \eta \in \mathbb {R}{\setminus } \{0\}\), \(\dim _{\text {H}}(\lambda X + \eta Y) = \overline{\gamma }\). Our argument follows along the same lines as theirs.

Recall that \(\Pi _{t}: \mathbb {R}^2 \rightarrow \mathbb {R}\) is the oblique projection \(\Pi _{t}(x,y) = x + t y\). A quick calculation shows that

$$\begin{aligned} \Pi _{e^t} (r^{-n} X \times s^{-n'} Y) = r^{-n} \Pi _{e^t r^n / s^{n'}} (X \times Y), \end{aligned}$$

which implies that the images of the translates of \(r^{-n} X \times s^{-n'} Y\) under the map \(\Pi _{e^t}\) are affinely equivalent to the image of the full set \(X \times Y\) under the map \(\Pi _{e^t r^n / s^{n'}}\). It follows that the set \(\Pi _{e^t} (X \times Y)\) contains affine images of the sets \(\Pi _{e^t r^n / s^{n'}} (X \times Y)\) and hence that

$$\begin{aligned} \dim _{\text {H}}\Pi _{e^t} (X \times Y) \ge \sup _{n \in \mathbb {N}_0} \dim _{\text {H}}\Pi _{e^t r^n / s^{n'}} (X \times Y). \end{aligned}$$

Thus, to bound \(\dim _{\text {H}}\Pi _{e^t} (X \times Y)\) from below, it suffices to show that there is some \(n \in \mathbb {N}_0\) for which \(e^t r^n / s^{n'}\) is a “good angle” for \(X \times Y\), in the sense that \(\dim _{\text {H}}\Pi _{e^t r^n / s^{n'}} (X \times Y) > \overline{\gamma }- \epsilon \). It follows from Marstrand’s theorem that the set of such “good angles” for \(X \times Y\) (indeed, for any set) has full measure in \(\mathbb {R}\), and it will be shown that the sequence \(n \mapsto \log (e^t r^n / s^{n'})\) has image in \([t,t+\log s)\) and is the orbit of t under the irrational rotation \(x \mapsto x + \log r \pmod {\log s}\) translated by t. When combined, these facts fall just short of allowing us to conclude the existence of \(n \in \mathbb {N}_0\) for which \(e^t r^n / s^{n'}\) is a good angle: it is possible that the image of an equidistributed sequence misses a set of full measure.
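The two facts used above, the scaling identity for \(\Pi _{e^t}\) and the location of the sequence \(n \mapsto \log (e^t r^n / s^{n'})\), can be checked numerically. The following is a minimal sketch only; the sample values \(r=2\), \(s=3\), \(t=0.3\), the test point \((x,y)\), and the helper names `n_prime`, `log_angle`, and `proj` are assumptions made for illustration and do not appear in the argument itself.

```python
import math

def n_prime(n, r, s):
    """Largest k with s**k <= r**n, i.e. floor(n log r / log s), computed exactly."""
    k, power, target = 0, s, r ** n
    while power <= target:
        k += 1
        power *= s
    return k

def log_angle(t, n, r, s):
    """log(e^t r^n / s^{n'}) for the hypothetical sample parameters."""
    return t + n * math.log(r) - n_prime(n, r, s) * math.log(s)

def proj(theta, x, y):
    """Oblique projection Pi_theta(x, y) = x + theta * y."""
    return x + theta * y

r, s, t = 2, 3, 0.3  # illustrative values only

# Each term of the sequence should lie in [t, t + log s).
offsets = [log_angle(t, n, r, s) - t for n in range(40)]
in_window = all(-1e-12 <= off < math.log(s) for off in offsets)

# Scaling identity at one sample point:
# Pi_{e^t}(r^{-n} x, s^{-n'} y) = r^{-n} Pi_{e^t r^n / s^{n'}}(x, y).
n, x, y = 7, 0.25, 0.6
theta = math.exp(t)
lhs = proj(theta, x / r**n, y / s**n_prime(n, r, s))
rhs = proj(theta * r**n / s**n_prime(n, r, s), x, y) / r**n
```

Here `n_prime` avoids floating-point floors by comparing integer powers, which is a design choice of this sketch rather than something the proof requires.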

To make use of the above outline, one needs to gain some topological information on the set of good angles from Marstrand’s theorem. This can be accomplished by moving the argument to a discrete setting. Discretizing introduces a number of technical nuisances, but the core of the argument remains the same. Recall that \(X_n\) and \(Y_{n'}\) are the sets X and Y rounded to the lattices \(r^{-n} \mathbb {Z}\) and \(s^{-n'} \mathbb {Z}\), respectively. The discrete analogue of Marstrand’s theorem in Theorem 3.2 tells us that the complement of the set of “good angles” for a finite set such as \(X_n \times Y_{n'}\) can be covered by a disjoint union of few half-open intervals. This topological information combines with the equidistribution of the irrational rotation described above to allow us to find many \(n \in \mathbb {N}_0\) for which \(e^t r^n / s^{n'}\) is a good angle for \(X_n \times Y_{n'}\).

The argument described thus far is essentially due to Peres and Shmerkin in [22] and allows them to conclude that for all \(t \in \mathbb {R}{\setminus } \{0\}\), \(\dim _{\text {H}}\Pi _{e^t} (X \times Y) = \overline{\gamma }\). We will now describe the two primary modifications we make to this argument in the course of the proof of Theorem A.

The first modification allows us to show that the discrete Hausdorff content of \(\Pi _{e^t} (X \times Y)\) at all small scales is uniform in t. Ultimately, this uniformity stems from the fact that the irrational rotation described above is uniquely ergodic: changing t in the argument above changes only the point whose orbit we consider. Exposing the uniformity in the argument after this is then mainly a matter of taking care with the quantifiers in the auxiliary results.

The second modification allows us to handle sets X and Y which are only assumed to be \(\times r\)- and \(\times s\)-invariant. Such sets need not be self-similar, but they do exhibit some “near self similarity” in the following sense. Consider the discrete set \(X_m\) for some large \(m \in \mathbb {N}\). Because X is \(\times r\)-invariant, the set \(X_{(n+1)m} \cap \big [i / r^{nm}, (i + 1)/ r^{nm} \big )\), when dilated by \(r^{mn}\) and considered modulo 1, is a subset of \(X_m\). While this set is generally not equal to \(X_m\), it is, by an averaging argument, very often of cardinality greater than \(r^{-\epsilon } |X_m|\). This is profitably re-interpreted in the language of trees: in the tree with levels \(X_{nm} \times Y_{(nm)'}\), \(n \in \mathbb {N}_0\), many nodes have nearly the maximum allowed number of children. The tree thinning result in Theorem 4.7 exploits this abundance by finding a sufficiently “regular” subtree on which we focus our attention. Then, we invoke our discrete analogue of Marstrand’s theorem – which provides information on the set of angles that are good not only for the original set \(X_m \times Y_{m'}\), but also for large subsets of it – to further thin the subtree. Following the reasoning given in Remark 4.4, the resulting subtree has fertile ancestry and hence has large Hausdorff content. By the construction of the subtree, its image under \(\Pi _{e^t}\) is large, and this yields the lower bound on the Hausdorff dimension in the conclusion of the theorem.

5.2 Proof of Theorem A

In this section and the next, let r, s, X, Y, and \(\overline{\gamma }\) be given as in the statement of Theorem A. The proof of Theorem A begins with a number of reductions, the last of which in Claim 5.2 is a statement about the existence of measures on the images of the discrete product sets under oblique projections. We prove Claim 5.2 in the next subsection.

By Lemma 2.10, \(\dim _{\text {H}}( X \times Y) = \dim _{\text {H}}X + \dim _{\text {H}}Y\). Note that if \(\dim _{\text {H}}X = 0\), then the conclusion is clear by considering, for any \(x \in X\), images of the set \(\{x\} \times Y\). The same is true if \(\dim _{\text {H}}Y=0\). Thus, we will proceed under the assumption that \(\dim _{\text {H}}X, \dim _{\text {H}}Y > 0\). Note that the set \(1-X\) is \(\times r\)-invariant and that \(-\lambda X + \eta Y\) is a translate of the set \(\lambda (1-X) + \eta Y\). The analogous statement holds for Y. Combining these facts, it is easy to see that it suffices to prove Theorem A in the case that \(I \subseteq (0,\infty )\).

The next step is to formulate a statement sufficient to prove Theorem A in terms of oblique projections of discrete sets. Recall that \(n' = \lfloor n \log r / \log s\rfloor \) and that \(X_n\), \(Y_{n'}\) are the sets X and Y rounded to the lattices \(r^{-n} \mathbb {Z}\) and \(s^{-n'} \mathbb {Z}\), respectively. For \(n \in \mathbb {N}_0\), define

$$\begin{aligned} \mathcal {Q}_n = X_n \times Y_{n'} \qquad \text { and } \qquad \widetilde{\mathcal {Q}}_n = X_n \times Y_{n'+1}. \end{aligned}$$
(5.2)
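For intuition, the sets in (5.2) can be written out explicitly when X and Y are restricted digit Cantor sets. The sketch below assumes that rounding to the lattice \(r^{-n}\mathbb {Z}\) means truncating the base-r expansion after n digits (the rounding convention is fixed earlier in the paper and may differ in detail); the bases, the digit sets, and the function names are hypothetical choices made for this example.

```python
from fractions import Fraction
from itertools import product

def n_prime(n, r, s):
    """Largest k with s**k <= r**n, i.e. floor(n log r / log s), computed exactly."""
    k, power, target = 0, s, r ** n
    while power <= target:
        k += 1
        power *= s
    return k

def digit_level(base, digits, n):
    """Points sum_{k=1}^n d_k * base**(-k) with every d_k in `digits` (assumed model of X_n)."""
    return {
        sum(Fraction(d, base ** (k + 1)) for k, d in enumerate(word))
        for word in product(digits, repeat=n)
    }

def Q(n, r, s, D, E, tilde=False):
    """Q_n = X_n x Y_{n'}; with tilde=True, the variant X_n x Y_{n'+1}."""
    level_y = n_prime(n, r, s) + (1 if tilde else 0)
    return set(product(digit_level(r, D, n), digit_level(s, E, level_y)))

# Illustrative choice: base 3 with digits {0, 2}, base 4 with digits {0, 3}.
q4 = Q(4, 3, 4, (0, 2), (0, 3))
q4_tilde = Q(4, 3, 4, (0, 2), (0, 3), tilde=True)
```

With these choices \(4' = 3\), so `q4` has \(2^4 \cdot 2^3\) points and `q4_tilde` has \(2^4 \cdot 2^4\) points.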

Claim 5.2

For all compact \(I \subseteq \mathbb {R}\) and all \(0< \gamma < \overline{\gamma }\), there exist \(m, N_0 \in \mathbb {N}\) such that for all \(N \ge N_0\) and all \(t \in I\), there exists a probability measure \(\mu \) supported on the finite set \(\Pi _{e^{t}} \mathcal {Q}_{Nm}\) with the property that for all balls \(B \subseteq \mathbb {R}\) of diameter \(\delta \ge r^{-Nm}\), \(\mu (B) \le r^{N_0m} \delta ^{\gamma }\).

To deduce Theorem A from Claim 5.2, let \(I \subseteq (0, \infty )\) be compact and \(0< \gamma < \overline{\gamma }\). Apply Claim 5.2 with \({{\tilde{I}}} {:}{=}\big \{\log (\eta /\lambda ) \ \big | \ \eta ,\lambda \in I \big \}\) as I and \(\gamma \) as it is. Let \(m, N_0 \in \mathbb {N}\) be as guaranteed by Claim 5.2.

Note that by Lemma 2.4 and the fact that the function \(\rho \mapsto \mathcal {H}_{\ge \rho }^{\gamma } \big ( \lambda X + \eta Y \big )\) is non-increasing (as \(\rho \) decreases),

$$\begin{aligned} \inf _{\lambda , \eta \in I}\ \mathcal {H}_{>0}^{\gamma } \big ( \lambda X + \eta Y \big ) = \lim _{\rho \rightarrow 0} \inf _{\lambda , \eta \in I}\ \mathcal {H}_{\ge \rho }^{\gamma } \big ( \lambda X + \eta Y \big ). \end{aligned}$$

The limit in the final expression exists because \(\inf _{\lambda , \eta \in I}\ \mathcal {H}_{\ge \rho }^{\gamma } \big ( \lambda X + \eta Y \big )\) is non-increasing and is bounded from below by zero.

Therefore, to show that (5.1) holds, it suffices to prove that

$$\begin{aligned} \lim _{N \rightarrow \infty }\ \inf _{\lambda , \eta \in I}\ \mathcal {H}_{\ge r^{-Nm}}^{\gamma } \big ( \lambda X + \eta Y \big ) > 0. \end{aligned}$$
(5.3)

It follows from the fact that

$$\begin{aligned} d_H(\lambda X_{Nm} + \eta Y_{(Nm)'}, \lambda X + \eta Y) \ll _{I, r, s} r^{-Nm}, \end{aligned}$$

where \(d_H\) is the Hausdorff metric, and Lemma 2.7 that for all \(\lambda , \eta \in I\),

$$\begin{aligned} \begin{aligned} \mathcal {H}_{\ge r^{-Nm}}^{\gamma } \big ( \lambda X + \eta Y \big )&\asymp _{I,r,s} \mathcal {H}_{\ge r^{-Nm}}^{\gamma } \big (\lambda X_{Nm} + \eta Y_{(Nm)'} \big )\\&\asymp _{I,r,s} \mathcal {H}_{\ge r^{-Nm}}^{\gamma } \big ( X_{Nm} + e^{\log (\eta / \lambda )} Y_{(Nm)'} \big ).\end{aligned} \end{aligned}$$
(5.4)

Therefore, to show (5.3), it suffices to prove that

$$\begin{aligned} \lim _{N \rightarrow \infty }\ \inf _{t \in {{\tilde{I}}}}\ \mathcal {H}_{\ge r^{-Nm}}^{\gamma } \big ( \Pi _{e^{t}} \mathcal {Q}_{Nm} \big ) > 0. \end{aligned}$$
(5.5)

Combining the conclusion of Claim 5.2 with Lemma 2.6, we see that for all \(N \ge N_0\) and \(t \in {{\tilde{I}}}\), \(\mathcal {H}_{\ge r^{-Nm}}^{\gamma } \big (\Pi _{e^{t}} \mathcal {Q}_{Nm} \big ) \ge r^{-N_0m}\). This shows that the limit in (5.5) is positive and completes the deduction of Theorem A from Claim 5.2.

5.3 Proof of Claim 5.2

Choosing the parameter m and scale \(\rho \).    Recall that r, s, X, Y, and \(\overline{\gamma }\) are as given in the statement of Theorem A. Without loss of generality, we can assume that \(r < s\). Put \(\beta = \log s\), let \(0< \gamma < \overline{\gamma }\), and define \(\epsilon {:}{=}\overline{\gamma }- \gamma \) and \(\gamma _0{:}{=}\gamma \).

We claim that there exist \(\gamma _1\), \(\gamma _2\), \(\gamma _3\), and \(\gamma _4\) such that

  1. (I)

    \(0< \gamma _0/ (1-\epsilon /2)< \gamma _1< \gamma _2< \gamma _3< \dim _{\text {H}}X + \dim _{\text {H}}Y < \gamma _4\);

  2. (II)

    \(\gamma _4< \gamma _3+ \epsilon (\gamma _3- \gamma _2) / 6\);

  3. (III)

    \(\gamma _1< 1\);

  4. (IV)

    \(2(\gamma _4- \gamma _2) < \gamma _3- \gamma _1\) (this is the inequality in (3.3)).

To see why, note that if we put \(\gamma _1= \gamma _0/ (1-\epsilon /2)\), \(\gamma _4= \gamma _3= \dim _{\text {H}}X + \dim _{\text {H}}Y\), and \(\gamma _2= \gamma _1/ 3 + 2 \gamma _3/ 3\), then the inequalities in (I) hold with “<” replaced by “\(\le \)”, while the inequalities in (II), (III), and (IV) hold as written. It follows that \(\gamma _1\) and \(\gamma _4\) can be increased and \(\gamma _3\) can be decreased (with the corresponding change to \(\gamma _2= \gamma _1/ 3 + 2 \gamma _3/ 3\)) so that all of the strict inequalities hold.
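The perturbation argument can be made concrete. The sketch below is one possible way to carry it out, not the choice made in the paper: it starts from the boundary values described above, moves \(\gamma _1\) and \(\gamma _4\) up and \(\gamma _3\) down by a common amount `tau` (a hypothetical parameter), and halves `tau` until (I) through (IV) all hold. The input values in the example are arbitrary.

```python
def choose_gammas(d, gamma):
    """Return (g0, g1, g2, g3, g4, eps) satisfying (I)-(IV).

    d     : stands for dim_H X + dim_H Y (assumed > 0)
    gamma : target exponent with 0 < gamma < min(d, 1)
    """
    gbar = min(d, 1.0)
    eps = gbar - gamma
    g0 = gamma
    base1 = g0 / (1 - eps / 2)   # boundary value of gamma_1
    tau = (d - base1) / 4        # initial perturbation, halved until valid
    while True:
        g1 = base1 + tau
        g3 = d - tau
        g4 = d + tau
        g2 = g1 / 3 + 2 * g3 / 3
        cond_I = 0 < base1 < g1 < g2 < g3 < d < g4
        cond_II = g4 < g3 + eps * (g3 - g2) / 6
        cond_III = g1 < 1
        cond_IV = 2 * (g4 - g2) < g3 - g1
        if cond_I and cond_II and cond_III and cond_IV:
            return g0, g1, g2, g3, g4, eps
        tau /= 2

# Illustrative inputs only.
g0, g1, g2, g3, g4, eps = choose_gammas(0.9, 0.7)
```

Condition (II) is typically the binding one, since it forces \(\gamma _4 - \gamma _3\) to be small relative to \(\epsilon (\gamma _3 - \gamma _2)\).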

Let \(c_1\), \(c_2\), and \(M_0\) be the constants guaranteed by Corollary 2.16, when applied with \(\gamma _4\) as \(\xi \). Let \(I \subseteq \mathbb {R}\) be compact, and define \(I_\beta = I + [0,\beta ]\). Let \(P > 0\) be a Lipschitz constant for all of the maps \(\Pi _{e^t}\), \(t \in I_\beta \), and let \(c_3=4Ps^{-1}+1\). Choose \(m\in \mathbb {N}\) large enough so that we can apply

  • Theorem 4.7 with \(\epsilon / 6\) as \(\epsilon \) and \(r^m\) as r;

  • Corollary 3.3 with \(I_\beta \) as I, \(\epsilon \beta /12\) as \(\epsilon \) and \(r^{-m}\) as \(\rho \);

  • Corollary 2.16 with m as N (i.e., \(m \ge M_0\)).

Put \(\rho = r^{-m}\).

A uniformly distributed sequence.    Let \(\alpha =\log \big (r^m / s^{m'}\big )\) and let \(R:[0,\beta )\rightarrow [0,\beta )\) be the transformation \(R:x \mapsto x + \alpha \pmod \beta \). As \(\beta =\log s\) and \(m'=\lfloor m\log r/\log s\rfloor \), we have

$$\begin{aligned} \alpha /\beta =m\log r/\log s-m'=\big \{m\log r/\log s \big \}. \end{aligned}$$
(5.6)

Since \(\log r/\log s\) is irrational, we conclude that \(\alpha /\beta \) is irrational, whereby the sequence \((R^n(0))_{n \in \mathbb {N}_0}\) is uniformly distributed on \([0,\beta )\).

Claim 5.3

For all \(n \in \mathbb {N}_0\),

  1. (V)

    \(R^n(0)+(nm)'\log s= nm\log r \);

  2. (VI)
    $$\begin{aligned} \big ((n+1)m\big )'={\left\{ \begin{array}{ll} (nm)'+m'&{}\text { if }R^n(0)+\alpha <\beta \\ (nm)'+m'+1&{}\text { if }R^n(0)+\alpha >\beta \end{array}\right. }. \end{aligned}$$

Proof

Since for all \(n\in \mathbb {N}\), \(R^n(0)=n\alpha \pmod {\beta }\), using (5.6), we can write \(R^n(0)/\beta =\big \{n\alpha /\beta \big \}=\big \{n\{m\log r/\beta \}\big \}=\big \{nm\log r/\beta \big \}\). Recalling that \((nm)'=\lfloor nm\log r/\beta \rfloor \), this establishes (V).

Next, note that for any real numbers x and y,

$$\begin{aligned} \lfloor x+y\rfloor ={\left\{ \begin{array}{ll} \lfloor x\rfloor +\lfloor y\rfloor &{}\text { if }\{x\}+\{y\}<1 \\ \lfloor x\rfloor +\lfloor y\rfloor +1&{}\text { if }\{x\}+\{y\}\ge 1 \end{array}\right. }. \end{aligned}$$

The equality in (VI) follows from this by substituting \(x=nm\log r/\beta \) and \(y=m\log r/\beta \) and using \(R^n(0)/\beta =\big \{nm\log r/\beta \big \}\) and (5.6). \(\square \)
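Both parts of Claim 5.3 lend themselves to a quick numerical sanity check. The sketch below assumes the sample parameters \(r=2\), \(s=3\), \(m=5\) (chosen only for illustration), iterates the rotation R in floating point, and compares against floors computed with exact integer arithmetic; the function names are hypothetical.

```python
import math

def n_prime(n, r, s):
    """Largest k with s**k <= r**n, i.e. floor(n log r / log s), computed exactly."""
    k, power, target = 0, s, r ** n
    while power <= target:
        k += 1
        power *= s
    return k

def check_claim_5_3(r, s, m, n_max):
    """Check (V) and (VI) for n = 0, ..., n_max - 1; return True if all checks pass."""
    beta = math.log(s)
    m_p = n_prime(m, r, s)
    alpha = m * math.log(r) - m_p * beta   # alpha = log(r^m / s^{m'})
    x = 0.0                                # x holds R^n(0)
    for n in range(n_max):
        nm_p = n_prime(n * m, r, s)
        # (V): R^n(0) + (nm)' log s = nm log r
        if not math.isclose(x + nm_p * beta, n * m * math.log(r), abs_tol=1e-9):
            return False
        # (VI): the floor picks up an extra 1 exactly when the rotation wraps around
        carry = 0 if x + alpha < beta else 1
        if n_prime((n + 1) * m, r, s) != nm_p + m_p + carry:
            return False
        x = (x + alpha) % beta
    return True

ok = check_claim_5_3(2, 3, 5, 100)
```

Because \(\alpha /\beta \) is irrational, the boundary case \(R^n(0)+\alpha =\beta \) never occurs; the floating-point comparison is reliable here only because, for these small parameters, the orbit stays well away from the boundary.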

Choosing the parameter \(N_0\).    From Corollary 2.16, the sets \(\mathcal {Q}_{m}\) and \(\widetilde{\mathcal {Q}}_{m}\) (defined in (5.2)) are \((c_1\rho ,\gamma _4)_{c_2}\)-sets and satisfy

$$\begin{aligned} \rho ^{-\gamma _3} \le |\mathcal {Q}_{m}|, |\widetilde{\mathcal {Q}}_{m}| \le \rho ^{-\gamma _4}. \end{aligned}$$
(5.7)

Let \(T_1\) (resp. \(T_2\)) be the subset of \(I_\beta \) obtained from applying Corollary 3.3 with \(I_\beta \) as I, \(\epsilon \beta / 12\) as \(\epsilon \) and \(\mathcal {Q}_m\) (resp. \(\widetilde{\mathcal {Q}}_m\)) as A. Put \(T = T_1 \cap T_2\). It follows from Corollary 3.3 that \(I_\beta \setminus T\) is covered by a disjoint union of not more than \(U {:}{=}\lceil 2 \epsilon \beta \rho ^{-1} / 24 \rceil \) many half-open intervals of Lebesgue measure less than \(\epsilon \beta / 6\).

Let \(N_0 \in \mathbb {N}\) be the larger of

  • the \(N_0\) from Theorem 4.7 with \(\epsilon / 6\) as \(\epsilon \), \(r^m\) as r, and \(2^{-\gamma _3} \mathcal {H}^{\gamma _3}_{>0}(X \times Y)\) as V;

  • the \(N_0\) from Lemma 2.18 with \((R^n(0))_{n \in \mathbb {N}_0}\) as \((x_n)_{n \in \mathbb {N}_0}\) and \(\epsilon / 3\) as \(\epsilon \).

 

Fixing the parameters N and t.    To prove Claim 5.2, we will show that for all \(N \ge N_0\) and all \(t \in I\) there exists a probability measure \(\mu \) supported on the set \(\Pi _{e^{t}} \mathcal {Q}_{Nm}\) with the property that for all balls \(B \subseteq \mathbb {R}\) of diameter \(\delta \ge \rho ^N\), \(\mu (B) \le \rho ^{-N_0} \delta ^{\gamma }\). Let \(N \ge N_0\) and \(t \in I\). From this point on, all new quantities and objects can depend on N and t.

Constructing the tree \(\Gamma \).    Let \(\Gamma \) be the tree (see Definition 4.1) of height N with node set at height \(n \in \{0, 1, \ldots , N\}\) equal to \(\mathcal {Q}_{nm}\). Associating the point \((i/ {r^{mn}}, j/s^{(mn)'}) \in \mathcal {Q}_{nm}\) with the rectangle

$$\begin{aligned} \left[ \frac{i}{r^{mn}},\frac{i+1}{r^{mn}}\right) \times \left[ \frac{j}{s^{(mn)'}},\frac{j+1}{s^{(mn)'}}\right) , \end{aligned}$$

parentage in the tree \(\Gamma \) is determined by containment amongst associated rectangles. Denote by \(C_\Gamma (Q)\) the children of the node Q in \(\Gamma \). Denote by \(\odot : \mathbb {R}^2 \times \mathbb {R}^2 \rightarrow \mathbb {R}^2\) the binary operation of pointwise multiplication.

Claim 5.4

Let \(n<N\) and \(Q\in \mathcal {Q}_{nm}\).

  1. (VII)

    If \(R^n(0) + \alpha < \beta \), then \(C_\Gamma (Q) \subseteq Q + (r^{-nm}, s^{-(nm)'}) \odot \mathcal {Q}_{m}\).

  2. (VIII)

    If \(R^n(0) + \alpha > \beta \), then \(C_\Gamma (Q) \subseteq Q + (r^{-nm}, s^{-(nm)'}) \odot \widetilde{\mathcal {Q}}_{m}\).

  3. (IX)

    \(\mathcal {H}_{r^m}^{\gamma _3}(\Gamma ) \ge 2^{-\gamma _3}\mathcal {H}^{\gamma _3}_{>0}(X \times Y)\).

Proof

We first prove parts (VII) and (VIII). By Lemma 2.12, \(r X_n \subseteq X_{n-1} \pmod 1\) and \(s Y_{n'} \subseteq Y_{n'-1} \pmod 1\). By (VI), if \(R^n(0) + \alpha < \beta \), then \(\big ((n+1)m \big )'=(nm)' + m'\), and hence \((r^{nm}, s^{(nm)'}) \odot \mathcal {Q}_{(n+1)m} \subseteq \mathcal {Q}_{m} \pmod 1\), and in particular \((r^{nm}, s^{(nm)'}) \odot C_\Gamma (Q) \subseteq \mathcal {Q}_{m} \pmod 1\). If \(R^n(0) + \alpha > \beta \), then \(\big ((n+1)m \big )'=(nm)' + m'+1\), and hence \((r^{nm}, s^{(nm)'}) \odot \mathcal {Q}_{(n+1)m} \subseteq \widetilde{\mathcal {Q}}_{m} \pmod 1\), and in particular \((r^{nm}, s^{(nm)'}) \odot C_\Gamma (Q) \subseteq \widetilde{\mathcal {Q}}_{m} \pmod 1\).

Write \(Q = (i / r^{nm}, j / s^{(nm)'})\) and let \(Q' \in C_\Gamma (Q)\). Because \(Q'\) is a child of Q, we can write \(Q' = Q + (i_0/r^{(n+1)m},j_0/s^{((n+1)m)'})\) where \(0 \le i_0 < r^m\) and \(0 \le j_0 < s^{m'}\). It follows that \((r^{nm}, s^{(nm)'}) \odot (C_\Gamma (Q)-Q) \subseteq \mathcal {Q}_{m}\) (in the first case \(R^n(0) + \alpha < \beta \)) or \((r^{nm}, s^{(nm)'}) \odot (C_\Gamma (Q)-Q) \subseteq \widetilde{\mathcal {Q}}_{m}\) (in the second case \(R^n(0) + \alpha > \beta \)), where the containment now is understood without reducing modulo 1.

To prove (IX), take a cut \(\{Q_1, \ldots , Q_\ell \} \subseteq \Gamma \) of \(\Gamma \) with node \(Q_i\) at height \(n_i\). Then, by construction of \(\Gamma \), there exists a cover \(X \times Y \subseteq \cup _{i=1}^\ell B_i\) where ball \(B_i\) has diameter at most \(2 \rho ^{n_i}\). Since the cut was arbitrary, it follows that \(\mathcal {H}_{r^m}^{\gamma _3}(\Gamma ) \ge 2^{-\gamma _3}\mathcal {H}^{\gamma _3}_{>0}(X \times Y)\). \(\square \)

Constructing the tree \(\Gamma '\).    Combining (5.7) with (VII) and (VIII), it follows that \(|C_\Gamma (Q)|\le r^{m\gamma _4}\) for every non-leaf node Q of \(\Gamma \). The tree \(\Gamma \) has now been shown to satisfy all the hypotheses of Theorem 4.7 (with \(\epsilon / 6\) as \(\epsilon \), \(r^m\) as r, and \(2^{-\gamma _3} \mathcal {H}^{\gamma _3}_{>0}(X \times Y)\) as V), thus there exists a subtree \(\Gamma '\) of \(\Gamma \) with the property that every node with height at least \(N_0\) has \((r^{m \gamma _2},1-\epsilon / 6)\)-fertile ancestry in \(\Gamma '\).

Constructing the tree \(\Gamma ''\).    Now we will use Corollary 3.3, the corollary to the discrete version of Marstrand’s theorem, to further thin out the tree \(\Gamma '\); an outline for this step was described in Remark 4.4 (I). For each non-leaf node \(Q \in \Gamma '\), we will define a subset \(C_{\Gamma '}^m(Q)\) of \(C_{\Gamma '}(Q)\). Define \(J = (T - t) \cap [0, \beta )\). Since \(I_\beta \setminus T\) is covered by at most U many half-open intervals of measure less than \(\epsilon \beta / 6\), the same is true for the set \([0,\beta ) \setminus J\). Define \(\mathcal {J}= \{ 0 \le n \le N-1 \ | \ R^n(0) \in J\}\). Note that for all \(n \ge N_0\), by Lemma 2.18, \(|\mathcal {J}\cap \{0, \ldots , n-1\}| \ge (1-\epsilon / 3)n\).

Let Q be a non-leaf node of \(\Gamma '\), and let \(n \in \{0, \ldots , N-1\}\) be the height of Q. Consider the following cases:

  1. (X)

    \(n \not \in \mathcal {J}\) or \(|C_{\Gamma '}(Q)| < \rho ^{-\gamma _2}\). Select a single child \(Q'\) of Q and put \(C_{\Gamma '}^m(Q) = \{Q'\}\).

  2. (XI)

    \(n \in \mathcal {J}\), \(|C_{\Gamma '}(Q)| \ge \rho ^{-\gamma _2}\), and \(R^n(0) + \alpha < \beta \). By Theorem 4.7 and (VII), the set \(A' {:}{=}(r^{nm}, s^{(nm)'}) \odot (C_{\Gamma '}(Q) - Q)\) is a subset of \(\mathcal {Q}_{m}\) of cardinality at least \(\rho ^{-\gamma _2}\). Since \(n \in \mathcal {J}\), we have that \(t+ R^n(0) \in T\). Applying Corollary 3.3 (II) with \(t+ R^n(0) \) in the role of t, there exists a subset \(A'_t \subseteq A'\) with \(|A'_t| \ge \rho ^{-\gamma _1}\) and such that the points of \(\Pi _{e^{t+ R^n(0)}} A'_t\) are distinct and \(c_3 \rho \)-separated. Define \(C_{\Gamma '}^m(Q) = Q + (r^{-nm}, s^{-(nm)'}) \odot A'_t\) so that \((r^{nm}, s^{(nm)'}) \odot (C_{\Gamma '}^m(Q) - Q) = A'_t\).

  3. (XII)

    \(n \in \mathcal {J}\), \(|C_{\Gamma '}(Q)| \ge \rho ^{-\gamma _2}\), and \(R^n(0) + \alpha > \beta \). We do exactly as in (XI) with \({\mathcal {Q}}_m\) replaced by \(\widetilde{\mathcal {Q}}_{m}\) and using (VIII) to get the set \(C_{\Gamma '}^m(Q)\).

Let \(\Gamma ''\) be the subtree of \(\Gamma '\) with the property that if Q is a non-leaf node of \(\Gamma ''\), then \(C_{\Gamma ''}(Q) = C_{\Gamma '}^m(Q)\). We claim that

$$\begin{aligned} \text {every node of }\Gamma ''\text { with height at least }N_0\text { has } (r^{m\gamma _1}, 1-\epsilon /2)\text {-fertile ancestry}. \end{aligned}$$
(5.8)

Indeed, let Q be a node of \(\Gamma ''\) with height \(n \ge N_0\). The ancestry of Q in \(\Gamma '\) is \((r^{m\gamma _2}, 1-\epsilon /6)\)-fertile. Each \(r^{m\gamma _2}\)-fertile ancestor of Q in \(\Gamma '\) with height in the set \(\mathcal {J}\) is an \(r^{m\gamma _1}\)-fertile ancestor of Q in \(\Gamma ''\). Since \(\big |\mathcal {J}\cap \{0, \ldots , n-1\} \big | \ge (1-\epsilon /3)n\), there are at least \((1-\epsilon /6)n + (1-\epsilon /3)n - n = (1-\epsilon /2)n\) many \(r^{m\gamma _2}\)-fertile ancestors of Q in \(\Gamma '\) with height in the set \(\mathcal {J}\). It follows that Q has \((r^{m\gamma _1}, 1-\epsilon /2)\)-fertile ancestry in \(\Gamma ''\).

Claim 5.5

If \(L_1\) and \(L_2\) are two distinct leaves of \(\Gamma ''\) and n is maximal such that \(L_1\) and \(L_2\) have a common ancestor at height n, then \(|\Pi _{e^t} L_1 - \Pi _{e^t} L_2| \ge \rho ^{n+1}\).

Proof

Let Q be the common ancestor of \(L_1\) and \(L_2\) in \(\Gamma ''\) of height n. Note that by the definition of \(\Gamma ''\) and maximality of n, it must be that Q has more than one child and hence that \(n \in \mathcal {J}\). Let \(Q_1\) and \(Q_2\) be the children of Q in \(\Gamma ''\) that are ancestors of \(L_1\) and \(L_2\), respectively. Note that \(Q_1 \ne Q_2\) but that \(Q_i\) may be equal to \(L_i\).

We will show first that \(\Pi _{e^t} Q_1\) and \(\Pi _{e^t} Q_2\) are \(c_3\rho ^{n+1}\)-separated. Write \(Q = (p,q)\) and \(Q_i = (p_i,q_i)\). Suppose that \(R^{n}(0) + \alpha < \beta \). It follows from (V) that

$$\begin{aligned} \begin{aligned}\Pi _{e^t} Q_i&= r^{-nm} \left( r^{nm} \Pi _{e^t} (Q_i - Q)\right) + \Pi _{e^t} Q \\&=\rho ^{n} \left( r^{nm} (p_i - p) + e^{t + R^n(0)} s^{(nm)'} (q_i - q) \right) + \Pi _{e^t} Q \\&= \rho ^{n} \left( \Pi _{e^{t + R^n(0)}} \big ((r^{nm},s^{(nm)'}) \odot (Q_i - Q) \big ) \right) + \Pi _{e^t} Q. \end{aligned} \end{aligned}$$
(5.9)

By (XI), the points of \(\Pi _{e^{t + R^n(0)}} \big ((r^{nm},s^{(nm)'}) \odot (Q_i - Q) \big )\), \(i=1,2\), are \(c_3\rho \)-separated. It follows then from (5.9) that the points of \(\Pi _{e^t} Q_i\), \(i=1,2\), are \(c_3\rho ^{n + 1}\)-separated. A similar argument works to reach the same conclusion if \(R^n(0) + \alpha > \beta \) using (XII).

By the definition of the \(\mathcal {Q}_{nm}\) sets, \(|Q_i - L_i| \le 2\,s^{-1}\rho ^{n+1}\). By the triangle inequality and the fact that \(c_3 = 4Ps^{-1}+1\),

$$\begin{aligned} \big |\Pi _{e^t} L_1 - \Pi _{e^t} L_2\big |&\ge \big |\Pi _{e^t} Q_1 - \Pi _{e^t} Q_2\big | - \big |\Pi _{e^t} (Q_1 - L_1)\big | - \big |\Pi _{e^t} (Q_2 - L_2)\big |\\&\ge (4Ps^{-1}+1)\rho ^{n+1} - 4Ps^{-1}\rho ^{n+1} \ge \rho ^{n+1}. \end{aligned}$$

It follows that \(\big |\Pi _{e^t} L_1 - \Pi _{e^t} L_2\big | \ge \rho ^{n+1}\), as was to be shown. \(\square \)

Constructing the measure \(\mu \).    The proof of Claim 5.2 will be concluded by demonstrating that 1) the fertile ancestry property of \(\Gamma ''\) in (5.8) guarantees that \(\Gamma ''\) supports a “measure” which is not too concentrated on any node (an outline for this step was described in Remark 4.4 (II)); and 2) by Claim 5.5, the projection of this measure is not too concentrated on any ball.

Let \(\nu : \Gamma '' \rightarrow [0,1]\) be the unique function that takes 1 on the root of \(\Gamma ''\) and has the properties that for all non-leaf nodes Q of \(\Gamma ''\), \(\nu \) is constant on \(C_{\Gamma ''}(Q)\) and \(\nu (Q) = \sum _{C \in C_{\Gamma ''}(Q)} \nu (C)\). (Colloquially, a mass of 1 begins at the root of \(\Gamma ''\) and spreads down the tree by splitting equally amongst the children of each node.) Let \(\nu _N\) be the function \(\nu \) restricted to \(\Gamma _N''\), the set of leaves of \(\Gamma ''\). By the defining properties of \(\nu \), the function \(\nu _N\) is a probability measure on \(\Gamma ''_N\).
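The equal-splitting rule that defines \(\nu \) is easy to trace on a small example. The sketch below uses a hypothetical toy tree given as a dictionary of children (not the tree \(\Gamma ''\) itself) and exact rational arithmetic, so that the total mass on the leaves can be compared with 1 exactly.

```python
from fractions import Fraction

def equal_split_mass(children, root):
    """Assign mass 1 to the root and split each node's mass equally among its children."""
    nu = {root: Fraction(1)}
    stack = [root]
    while stack:
        q = stack.pop()
        kids = children.get(q, [])
        for c in kids:
            nu[c] = nu[q] / len(kids)
            stack.append(c)
    return nu

# Toy tree of height 2; node names are placeholders.
children = {
    "root": ["a", "b"],
    "a": ["a1", "a2", "a3"],
    "b": ["b1"],
}
nu = equal_split_mass(children, "root")
leaves = [q for q in nu if q not in children]
leaf_total = sum(nu[q] for q in leaves)
```

In this toy tree the node `a` has three children and `b` has one, so each leaf under `a` receives mass 1/6 and `b1` receives 1/2: a node with many children along its ancestry carries little mass, which is the mechanism behind the bound on \(\nu (Q)\) for nodes with fertile ancestry.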

Since \(\Gamma ''_N \subseteq \mathcal {Q}_{Nm}\), the measure \(\mu = \Pi _{e^t} \nu _N\), the push-forward of \(\nu _N\) through the map \(\Pi _{e^t}\), is a probability measure supported on the set \(\Pi _{e^{t}} \mathcal {Q}_{Nm}\). We will conclude the proof of Claim 5.2 by verifying that for all balls \(B \subseteq \mathbb {R}\) of diameter \(\delta \ge \rho ^N\), \(\mu (B) \le \rho ^{-N_0} \delta ^{\gamma _0}\). (Recall that \(\gamma _0= \gamma \).)

Let \(B \subseteq \mathbb {R}\) be an interval of length \(\delta \ge \rho ^N\). Put \(n = \lfloor \log _\rho \delta \rfloor + 1\) and note that \(\rho ^{n} < \delta \le \rho ^{n-1}\). It follows from Claim 5.5 that there exists a node Q of \(\Gamma ''\) with height at least n with the property that if L is a leaf of \(\Gamma ''\) with \(\Pi _{e^t} L \in B\), then Q is an ancestor of L. This implies that \(\mu (B) \le \nu (Q)\), and so it suffices to show that

$$\begin{aligned} \nu (Q) \le \rho ^{-N_0} \delta ^{\gamma _0}. \end{aligned}$$
(5.10)

If \(n \le N_0\), then \(\rho ^{-N_0} \delta ^{\gamma _0} > 1\) and (5.10) holds trivially. If \(n > N_0\), then by the definition of \(\nu \) and the fact that Q has \((r^{m\gamma _1},1-\epsilon /2)\)-fertile ancestry (cf. (5.8)),

$$\begin{aligned} \nu (Q) \le \frac{1}{r^{m\gamma _1(1-\epsilon /2)n}} = \rho ^{\gamma _1(1-\epsilon /2)n} \le \rho ^{-N_0} \delta ^{\gamma _0}, \end{aligned}$$

since \((1-\epsilon /2) \gamma _1> \gamma _0\). This verifies (5.10), completing the proof of Claim 5.2 and hence of Theorem A.