A combinatorial proof of a sumset conjecture of Furstenberg

We give a new proof of a sumset conjecture of Furstenberg that was first proved by Hochman and Shmerkin in 2012: if $\log r/\log s$ is irrational and $X$ and $Y$ are $\times r$- and $\times s$-invariant subsets of $[0, 1]$, respectively, then $\dim_{\text{H}}(X + Y) = \min(1, \dim_{\text{H}} X + \dim_{\text{H}} Y)$. Our main result yields information on the size of the sumset $\lambda X + \eta Y$ uniformly across a compact set of parameters at fixed scales. The proof is combinatorial and avoids the machinery of local entropy averages and CP-processes, relying instead on a quantitative, discrete Marstrand projection theorem and a subtree regularity theorem that may be of independent interest.

Furstenberg conjectured that if $r$ and $s$ are multiplicatively independent positive integers (that is, $\log r/\log s$ is irrational) and $X$ and $Y$ are $\times r$- and $\times s$-invariant, respectively, then $\dim_{\text{H}}(X + Y) = \min(1, \dim_{\text{H}} X + \dim_{\text{H}} Y)$ (1.1) and $\dim_{\text{H}}(X \cap Y) \leq \max(0, \dim_{\text{H}} X + \dim_{\text{H}} Y - 1)$. (1.2)
The sumset conjecture (1.1) was resolved by Hochman and Shmerkin [HS], who proved a more general result concerning the dimension of sums of invariant measures. It also follows from more recent results of Shmerkin [Shm] and Wu [Wu], who independently resolved a generalization of the intersection conjecture (1.2). We give a more detailed account of this recent history later in the introduction. The purpose of this article is to give a new, combinatorial proof of Furstenberg's sumset conjecture (1.1). Denoting the unlimited $\gamma$-Hausdorff content by $\mathcal{H}^{\gamma}_{>0}$ (see Definition 2.2), our main theorem is as follows.
Theorem A. Let $r$ and $s$ be multiplicatively independent positive integers, and let $X, Y \subseteq [0, 1]$ be $\times r$- and $\times s$-invariant sets, respectively. Define $\overline{\gamma} = \min(\dim_{\text{H}} X + \dim_{\text{H}} Y, 1)$. For all compact $I \subseteq \mathbb{R} \setminus \{0\}$ and all $\gamma < \overline{\gamma}$, $\inf_{\lambda, \eta \in I} \mathcal{H}^{\gamma}_{>0}(\lambda X + \eta Y) > 0$. (1.3)
Beyond implying (1.1), Theorem A gives finer quantitative information on the size of the sumset $\lambda X + \eta Y$ in terms of the unlimited $\gamma$-Hausdorff content, uniformly over the parameters $\lambda$ and $\eta$. The uniformity in the result, which does not appear to follow from [HS], has found use in recent applications concerning digit problems; see, for example, [GMR] and [BY]. See Remark 5.1 below for some further discussion of this uniformity.
Our proof of (1.1) differs from other proofs in the literature in that it completely avoids the machinery of CP-processes and local entropy averages. Instead, it features an elementary, combinatorial approach that builds on the work of Peres and Shmerkin in [PS]. Important ingredients in the proof include a quantitative discrete Marstrand theorem (Theorem 3.2) and a subtree regularity theorem (Theorem 4.7), both of which may be of independent interest.

History and context
In a highly influential work in geometric measure theory, Marstrand [Mar] related the Hausdorff dimension $\dim_{\text{H}} E$ of a Borel set $E \subseteq \mathbb{R}^2$ to the Hausdorff dimension of its images under orthogonal projections and its intersections with lines. More specifically, he showed that for almost every line $L \subseteq \mathbb{R}^2$, $\dim_{\text{H}}(\pi_L E) = \min(1, \dim_{\text{H}} E)$, where $\pi_L$ is the orthogonal projection $\mathbb{R}^2 \to L$, and that for almost every line $L$ intersecting $E$, $\dim_{\text{H}}(E \cap L) = \max(0, \dim_{\text{H}} E - 1)$. Images of a Cartesian product $X \times Y$ under orthogonal projections are, up to affine transformations which preserve dimension, sumsets of the form $\lambda X + \eta Y$, while intersections of $X \times Y$ with lines are affinely equivalent to sets of the form $\lambda X \cap (\eta Y + \sigma)$. Thus, Marstrand's theorems in the case $E = X \times Y$ imply the following.
Theorem 1.1 ([Mar, Theorems II and III]). Let $X$ and $Y$ be Borel subsets of $[0,1]$. For Lebesgue-a.e. $\lambda, \eta, \sigma \in \mathbb{R}$, $\dim_{\text{H}}(\lambda X + \eta Y) = \min(\dim_{\text{H}}(X \times Y), 1)$ (1.4) and $\dim_{\text{H}}(\lambda X \cap (\eta Y + \sigma)) \leq \max(0, \dim_{\text{H}}(X \times Y) - 1)$. (1.5)
Improving (1.4) and (1.5) by replacing the Lebesgue-typical projection or intersection of $X \times Y$ with a concrete projection or intersection is not possible in general [KM] but can be done in special cases when the sets $X$ and $Y$ are structured. Furstenberg's conjectures (1.1) and (1.2) can be contextualized as such: when $r$ and $s$ are multiplicatively independent and $X \times Y$ is the product of a $\times r$- and a $\times s$-invariant set, results for the Lebesgue-typical projection and intersection should hold for the orthogonal projection to, and the intersection with, the line $x = y$. These conjectures join a host of results and conjectures by Furstenberg and others that aim to capture the independence between base-$r$ and base-$s$ structure when $r$ and $s$ are multiplicatively independent.
Conjectures (1.1) and (1.2) were recently resolved, both proven in more general forms. In the following theorem, we have combined special cases of the results by Hochman and Shmerkin [HS], Shmerkin [Shm], and Wu [Wu] that are most relevant to this work. Note that $\overline{\dim}_{\text{M}}$ denotes the upper Minkowski dimension (see Definition 2.1).
Theorem 1.2 ([HS] and [Shm, Wu]). Let $r$ and $s$ be multiplicatively independent positive integers, and let $X, Y \subseteq [0, 1]$ be $\times r$- and $\times s$-invariant sets, respectively. For all $\lambda, \eta \in \mathbb{R} \setminus \{0\}$ and all $\sigma \in \mathbb{R}$, $\dim_{\text{H}}(\lambda X + \eta Y) = \min(\dim_{\text{H}} X + \dim_{\text{H}} Y, 1)$ (1.6) and $\overline{\dim}_{\text{M}}(\lambda X \cap (\eta Y + \sigma)) \leq \max(0, \dim_{\text{H}} X + \dim_{\text{H}} Y - 1)$. (1.7)
A number of partial results preceded those in Theorem 1.2, both for multiplicatively invariant sets and for attractors of iterated function systems (IFSs). Carlos Moreira [Mor] considered sumsets of attractors of IFSs with certain irrationality and non-linearity conditions. Peres and Shmerkin [PS] proved (1.6) for attractors of IFSs with rationally independent contraction ratios; this resolved (1.6) in the special case that $X$ and $Y$ are restricted digit Cantor sets with respect to multiplicatively independent bases. (This work of Peres and Shmerkin is particularly relevant to the arguments in this paper, as we explain in detail in Section 5.1.) Hochman and Shmerkin [HS] developed Furstenberg's CP-processes [Fur2] and introduced local entropy averages to prove (1.6) both for invariant sets and measures and for attractors of IFSs satisfying some general minimality conditions. Wu [Wu] combined the CP-process machinery with Sinai's factor theorem from ergodic theory to resolve (1.7) for invariant sets and attractors of regular, self-similar IFSs. Shmerkin [Shm] resolved (1.7) utilizing tools primarily from additive combinatorics, proving an inverse theorem for the decay of $L^q$ norms of certain self-similar measures of dynamical origin. Yu [Yu] and Austin [Aus] gave dynamical proofs of (1.2), simplifying some aspects of earlier proofs.
The sumset and intersection theorems are closely related: fibers of orthogonal projections are precisely those lines with which intersections are considered. It is not surprising, then, that the intersection theorem can be used to deduce the sumset theorem. For example, if for arbitrary sets $X, Y \subseteq [0, 1]$ we know that for all $\gamma > \max(0, \dim_{\text{H}} X + \dim_{\text{H}} Y - 1)$, there exists $\delta_0 > 0$ such that for all $0 < \delta < \delta_0$ and for all balls $B$ of diameter $\delta$, […] This type of uniformity is made explicit in Shmerkin [Shm] and Yu [Yu] and may be implicit in the other proofs of the intersection conjecture. It is possible to deduce Theorem A from Shmerkin's main result in [Shm]; we explain the details in the course of another argument in [GMR]. Despite the fact that every proof of the intersection conjecture can be counted as a proof of the sumset conjecture, we believe our approach still has merit: it is the most elementary proof to date; it exposes uniformity important in certain number-theoretic applications; and it features tools which may be of independent interest.
Theorem A has a geometric formulation in terms of orthogonal projections; while we will not make particular use of the theorem in this form, it is worth formulating for its historical connection to the topic. Let $\pi_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ be the orthogonal projection onto the line that contains the origin and forms an angle $\theta$ with the positive $x$-axis. […] The proof of the equivalence between Theorem A and Theorem B is standard and not needed in this work, so it is omitted.

Overview of the paper
The paper is organized as follows. In Section 2, we organize the terminology, notation, and basic facts we need from discrete and continuous fractal geometry, including some properties of $\times r$-invariant subsets of $[0, 1]$ and an equidistribution lemma. Section 3 contains a proof of Theorem 3.2, our discrete Marstrand projection theorem. Section 4 features notation and terminology for trees and the subtree regularity theorem, Theorem 4.7. Finally, we prove Theorem A in Section 5.

Continuous and discrete fractal geometry
In this section, we lay out the notation, tools, and results we need from continuous and discrete fractal geometry. A good general reference for the standard material in this section is [Mat, Ch. 4]. In the definitions that follow, $\rho, \gamma, c > 0$, $d \in \mathbb{N}$, and $X \subseteq \mathbb{R}^d$ is non-empty.
• The metric entropy of $X$ at scale $\rho$ is […] The upper Minkowski dimension, $\overline{\dim}_{\text{M}} X$, is defined analogously with a limit supremum in place of the limit infimum.
It is a well-known fact, which we will use without further mention, that if […] Note that when $X$ is compact, the index set $I$ may be taken to be finite.
In the following definition, we introduce two notions meant to capture the dimensionality of discrete sets.
Definition 2.3. […] (2.2) • The discrete Hausdorff content of $X$ at scale $\rho$ and dimension $\gamma$ is […] Note that when $X$ is compact, the index set $I$ may be taken to be finite.
In the definition of a $(\rho, \gamma)_c$-set, we think of $\rho$ as being positive and close to 0, $\gamma \in [0, d]$ as the "dimension" of the set, and $c > 0$ as an uninteresting parameter that exists only to make our arguments explicit. The inequality in (2.2) guarantees that the points of a $(\rho, \gamma)_c$-set cannot be too concentrated in any ball. It follows from that inequality that the maximum cardinality of a $(\rho, \gamma)_c$-set in $[0, 1]^d$ is on the order of $\rho^{-\gamma}$. A $(\rho, \gamma)_c$-set with cardinality $\gg \rho^{-\gamma}$ can be thought of as a discrete approximation to a set with Hausdorff dimension $\gamma$; this is made more precise in Remark 2.5 below and is realized in Lemma 2.13. In fact, if the discrete approximations of a set $X \subseteq \mathbb{R}^d$ at all scales $\rho > 0$ are $(\rho, \gamma)_c$-sets, then the Assouad dimension (cf. [Fra, Section 2.1]) of the set $X$ is at most $\gamma$. More precisely, the Assouad dimension of $X$ is the infimum of the set of $\gamma$'s for which there exists $c > 0$ such that for all $\rho > 0$, the set $X$ rounded to the lattice $\rho\mathbb{Z}^d$ is a $(\rho, \gamma)_c$-set.
The discrete Hausdorff content at scale ρ is a "ρ-resolution" analogue of the unlimited Hausdorff content.The discrete Hausdorff contents of two sets that look the same at scale ρ are approximately equal.The following lemma provides a connection between the discrete and the continuous regimes that will be useful in the proof of Theorem A.
Lemma 2.4. Let $X \subseteq \mathbb{R}^d$ be compact. For all $\gamma \geq 0$, $\lim_{\rho \to 0^+} \mathcal{H}^{\gamma}_{\rho}(X) = \mathcal{H}^{\gamma}_{>0}(X)$. (2.3)
Proof. Let $\gamma \geq 0$. The limit in (2.3) exists because the function $\rho \mapsto \mathcal{H}^{\gamma}_{\rho}(X)$ is non-increasing as $\rho$ tends to $0^+$ and is bounded from below by $\mathcal{H}^{\gamma}_{>0}(X)$. Equality in the limit follows from the fact that $X$ is compact, allowing for the index set in the definition of $\mathcal{H}^{\gamma}_{>0}(X)$ to be taken to be finite. If $\lim_{\rho \to 0^+} \mathcal{H}^{\gamma}_{\rho}(X) > 0$, then $\mathcal{H}^{\gamma}_{>0}(X) > 0$, and it follows from the definition of the Hausdorff dimension that $\dim_{\text{H}} X \geq \gamma$.
Remark 2.5. It would be natural to define the metric entropy at scale $\rho$ and dimension $\gamma$ of the set $X$ as […] Using a max-flow min-cut argument similar to the one in [BP, Ch. 3], it can be shown that for $X$ compact, […] (2.4) Thus, $(\rho, \gamma)_c$-sets of cardinality $\gg \rho^{-\gamma}$ can be thought of as discrete fractal sets of dimension $\gamma$. We will not need (2.4); the interested reader can consult [FO, Prop. A1] for some details.
The following is a discrete version of the well-known mass distribution principle; cf. [BP, Lemma 1.2.8].
Proof. Let $\varepsilon > 0$, and let $\{B_i\}_{i \in I}$ be a cover of $\operatorname{supp} \mu$ with ball $B_i$ of diameter $\delta_i \geq \rho$ and with […] Then the conclusion follows because $\varepsilon > 0$ was arbitrary.
Denote by $[X]_\delta$ the closed $\delta$-neighborhood of $X$: […] Proof. Let $\{B_i\}_{i \in I}$ be a collection of open balls covering $Y$, where $B_i$ has diameter $r_i \geq \rho$ and $\sum_{i \in I} r_i^{\gamma}$ […]

Multiplicatively invariant subsets of the reals and their finite approximations
In this section, we record some basic facts about multiplicatively invariant subsets of [0, 1] and their discrete approximations.
The Hausdorff and Minkowski dimensions of a multiplicatively invariant set coincide.As a consequence of this regularity, the Hausdorff dimension of products of such sets is also well-behaved.We record these facts here for later use.
Proof. This follows immediately from [Mat, Corollary 8.11] and the fact that $\dim_{\text{H}} X = \dim_{\text{M}} X$.
Since we will work almost exclusively with finite approximations to multiplicatively invariant sets, we establish some useful notation.
The next results show that finite approximations to a multiplicatively invariant set are multiplicatively invariant and are discrete models of fractal sets as captured by Definition 2.3.
[…] It follows by the definition of $X_{n-1}$ that $i_0/r^{n-1} \in X_{n-1}$, as was to be shown.
Lemma 2.13. Let $r \geq 2$, and let $X \subseteq [0, 1]$ be a $\times r$-invariant set. For all $\gamma > \dim_{\text{H}} X$, there exists $c > 0$ such that for all sufficiently large $N \in \mathbb{N}$, the set $X_N$ is a $(r^{-N}, \gamma)_c$-set.
Proof. Let $\gamma > \dim_{\text{H}} X$. Because $\gamma > \dim_{\text{M}} X$ (cf. Theorem 2.9), there exists $c_0 > 0$ such that for all $N \in \mathbb{N}$, […] (2.5) Using the fact that $X$ is $\times r$-invariant, that $T_r^n$ is injective on half-open intervals of length $r^{-n}$, Lemma 2.12, and the bound in (2.5), for all $0 \leq n \leq N$ and for all $i \in \{0, \ldots, r^n - 1\}$, […], and note that a union of two intervals of length $r^{-n}$ of the form above suffices to cover $B$. Therefore, […] as was to be shown.
Lemma 2.14. Let $r \geq 2$, and let $X \subseteq [0, 1]$ be non-empty and $\times r$-invariant. For all $\gamma > \dim_{\text{H}} X$ and all sufficiently large $N \in \mathbb{N}$, […] (cf. Theorem 2.9), we have that $|X_N| \leq r^{N\gamma}$ for all but finitely many $N \in \mathbb{N}$. It remains to show the lower bound.
Let $M, N \in \mathbb{N}$. Since the intervals $[i/r^N, (i+1)/r^N)$, $i = 0, 1, \ldots, r^N - 1$, form a partition of $[0, 1)$, we have […] is non-empty, which happens exactly when $i/r^N \in X_N$. Hence […] In view of this sub-additive property, it follows from Fekete's Lemma that the sequence $|X_N|^{1/N}$ converges to its infimum, i.e., $\lim$ […]
The following notation, borrowed from [PS], allows us to easily compare powers of $r$ and powers of $s$. This is useful when considering the finite approximations to the Cartesian product of a $\times r$- and a $\times s$-invariant set.
Definition 2.15. For $n \in \mathbb{N}_0$, we set $n' = \lfloor n \log r/\log s \rfloor$ to be the greatest integer so that $s^{n'} \leq r^n$. (The bases $r$ and $s$ do not appear in this notation but should always be clear from context.)
Recall from Definition 2.11 that $X_N$ is the set $X$ rounded to the lattice $r^{-N}\mathbb{Z}$. Extending this notation to $Y$, the set $Y_N$ is the set $Y$ rounded to the lattice $s^{-N}\mathbb{Z}$. Since $r^{-N}$ is approximately equal to $s^{-N'}$ (where $N'$ is as defined in Definition 2.15), the set $Y_{N'}$ is the discrete approximation to $Y$ that is on a scale closest to the scale of $X_N$. Therefore, the sets $X_N$ and $Y_{N'}$ will always be considered in the same context, as opposed to the sets $X_N$ and $Y_N$.
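As a small numerical illustration of Definition 2.15, the following sketch computes $n'$ in exact integer arithmetic and prints the ratio $r^n/s^{n'}$, which by the definition lies in $[1, s)$. The function name `n_prime` and the sample bases 2 and 3 are choices made here for illustration and do not come from the text.

```python
import math


def n_prime(n: int, r: int, s: int) -> int:
    """Greatest integer k with s**k <= r**n (the quantity n' of Definition 2.15)."""
    target = r ** n
    # Floating-point first guess for floor(n log r / log s).
    k = int(n * math.log(r) / math.log(s))
    # Correct the guess with exact integer comparisons.
    while s ** k > target:
        k -= 1
    while s ** (k + 1) <= target:
        k += 1
    return k


# Illustrative bases only: r = 2 and s = 3 are multiplicatively independent.
r, s = 2, 3
for n in range(1, 8):
    k = n_prime(n, r, s)
    ratio = r ** n / s ** k  # comparable scales: 1 <= ratio < s
    print(n, k, ratio)
```

The exact integer correction avoids relying on floating-point rounding when $n \log r/\log s$ is close to an integer.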
Corollary 2.16. Let $2 \leq r < s$, and let $X, Y \subseteq [0, 1]$ be non-empty $\times r$- and $\times s$-invariant sets, respectively. For all $\xi > \dim_{\text{H}} X + \dim_{\text{H}} Y$, there exist $c_1, c_2 > 0$ and $M_0 \in \mathbb{N}$ such that for all $N \geq M_0$, the sets […] Applying Lemma 2.13 and Lemma 2.14, there exist $c, d > 0$ such that for sufficiently large $N \in \mathbb{N}$, the set […] -set. This is a quick exercise left to the reader. […] which shows that the set […]

A quantitative equidistribution lemma
The main result in this short section, Lemma 2.18, gives a lower bound on the number of visits of an equidistributed sequence to a set as a function only of the measure and topological complexity of the set's complement.This result is certainly not new; we state it explicitly here for convenience in a way that highlights the uniformity in the quantifiers.
For $U \in \mathbb{N}$, denote by $\mathcal{I}_U$ the collection of those subsets of $[0, 1)$ that are a union of no more than $U$ disjoint intervals of the form $[a, b)$.
Lemma 2.17. For any uniformly distributed sequence $(x_n)_{n \in \mathbb{N}_0} \subseteq [0, 1)$, $U \in \mathbb{N}$, and $\varepsilon > 0$, there exists $N_0 \in \mathbb{N}$ such that for all $N \geq N_0$ and all […] where the supremum is taken over all half-open intervals […] It follows that for every […] Let $N_0 \in \mathbb{N}$ be large enough so that for all $N \geq N_0$, $U D_N \leq \varepsilon$. The conclusion follows.
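The phenomenon behind this lemma can be observed numerically. The sketch below is not the lemma itself: it takes one particular equidistributed sequence (an irrational rotation with a hypothetical rotation number) and one hypothetical union of $U = 3$ half-open intervals, and compares the visit frequency with the Lebesgue measure of the union. All names and numerical values are assumptions made for illustration.

```python
import math


def visit_frequency(alpha, intervals, N):
    """Fraction of n in {0, ..., N-1} for which (n * alpha mod 1) lies in the
    union of the half-open intervals [a, b) listed in `intervals`."""
    hits = 0
    for n in range(N):
        x = (n * alpha) % 1.0
        if any(a <= x < b for a, b in intervals):
            hits += 1
    return hits / N


# Hypothetical data: an irrational rotation number and three disjoint intervals.
alpha = math.sqrt(2) - 1
intervals = [(0.1, 0.25), (0.5, 0.6), (0.8, 0.95)]
measure = sum(b - a for a, b in intervals)

for N in (100, 1000, 10000):
    # The error shrinks as N grows, reflecting equidistribution.
    print(N, abs(visit_frequency(alpha, intervals, N) - measure))
```

The point of the lemma is that the threshold $N_0$ can be chosen depending only on $U$ and $\varepsilon$, not on the particular union of intervals; the sketch only checks a single union.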

A discrete Marstrand projection theorem
In this section, we prove a discrete analogue of Marstrand's projection theorem from geometric measure theory. The theorem, stated for sumsets in the introduction as Theorem 1.1, says that for every Borel set […] where $\pi_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ is the orthogonal projection onto $\ell_\theta$, the line that contains the origin and forms an angle $\theta$ with the positive $x$-axis. Marstrand's theorem and its relatives have enjoyed much recent attention: we refer the interested reader to the survey [FFJ] and to the end of this section, where we put Theorem 3.2 into more context.
The key idea behind Marstrand's theorem is that of "geometric transversality" and is captured in the following lemma.The proof follows from a simple geometric argument and is left to the reader.An immediate consequence of the lemma is that there are not many projections which map two distant points close together.
The results in this section add to a number of other discrete Marstrand-type theorems in the recent literature: [LM2, Lemma 5.2], [LM1, Prop. 3.2], [Gla, Lemma 3.8], [PS, Prop. 7], and [Orp, Prop. 4.10], to name a few. Let us highlight some distinguishing features of Lemma 3.1 and Theorem 3.2 that play an important role in this work. Analogues of Lemma 3.1 more commonly found in the literature, such as the one in [Mat, Lemma 3.11], bound the measure of the set of projections which map $x$ close to 0. The result in Lemma 3.1 uses coverings to capture topological information on the set of projections. This information is carried into Theorem 3.2 and is important in the application to Theorem A. Another useful feature of Theorem 3.2 is the allowance of a subset $A'$ in (3.1); this will allow us to treat sets in Theorem A that exhibit multiplicative invariance without necessarily being self-similar.

A discrete projection theorem
Our discrete analogue of Marstrand's theorem, Theorem 3.2, reaches a conclusion similar to that of Marstrand's by quantifying the size of the set $E$ of exceptional directions, those directions in which the image of the set $A$ is small. On a first reading, it is safe to think of $\gamma < 1$, $n \approx \rho^{-\gamma}$, $\delta = 1$, and $m \approx \rho^{-(\gamma - \varepsilon)}$. In this case, the set $A$ is a discrete analogue of a set of Hausdorff dimension $\gamma$ and the set $E$ is the set of exceptional directions in which the set $A$ loses at least a proportion $\rho^{\varepsilon}$ of its points.
[…] then for all $\delta > 0$ and all $0 \leq m \leq \delta^2 n/4$, the set […] The goal is to bound $\sum_{\theta \in E'} S(\theta)$ from above and below to get the desired bound on $|E'|$.
Let $\theta \in E'$ and $A'$ be the subset of $A$ corresponding to $\theta$. Since the set $\pi_\theta A'$ lies on a line and $N(\pi_\theta A', \rho) \leq m$, there exists a collection $\{B\}_{B \in \mathcal{B}}$ of no more than $2m$ closed balls $B$ of diameter $\rho$ whose union covers $\pi_\theta A'$. By Cauchy-Schwarz, […]

It follows that
Now we use Lemma 3.1 to bound the right hand side of (3.2) from above: for $a_1, a_2 \in [0, 1]^2$, the set […] and using the fact that $E'$ is $\rho$-separated, we see that […] for some constant $K$ depending on the result in Lemma 3.1. It follows that […] and so we are left to bound the second term from above. For $\ell \in \mathbb{N}_0$, let […] Breaking up the sum of $|a_1 - a_2|^{-1}$ by fixing $a_1$ and partitioning the $a_2$'s by shells, and using the fact that $A$ is $\rho$-separated, we see […], where $\ell_0 = \lceil \log((n/c)^{1/\gamma}) \rceil$ is the smallest value such that the set $A$ could be contained in a ball of diameter $\rho e^{\ell_0}$ about $a_1$. Therefore, […] Combining the upper and lower bounds on $\sum_{\theta \in E'} S(\theta)$, we see that there exists a constant $K$ depending on the result in Lemma 3.1, $\gamma$, and $c$ such that […] Dividing both sides by $n$ and using the fact that $m \leq \delta^2 n/4$, we see that […] which rearranges to the desired conclusion.

A corollary for oblique projections
The proof of Theorem A will feature oblique projections instead of orthogonal ones. The following corollary concerns oblique projections and is stated in a way that will make it immediately applicable in the proof of Theorem A. Denote by $\Pi_t : \mathbb{R}^2 \to \mathbb{R}$ the oblique projection $\Pi_t(x, y) = x + ty$. Let $\varphi : (0, \pi/2) \to \mathbb{R}$ be the diffeomorphism $\varphi(\theta) = \log \tan \theta$. Note that $\Pi_{e^{\varphi(\theta)}}$ is the oblique projection that is the "continuation" of the orthogonal projection $\pi_\theta$, meaning that the points $(x, y)$, $(\Pi_{e^{\varphi(\theta)}}(x, y), 0)$, and $\pi_\theta(x, y)$ are collinear.
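The collinearity claim can be checked numerically. The following sketch evaluates the three points for one hypothetical angle and one hypothetical point of the plane and tests whether the cross product of the two difference vectors vanishes up to rounding. The function names and sample values are assumptions made here for illustration.

```python
import math


def orthogonal_projection(theta, x, y):
    """Orthogonal projection of (x, y) onto the line through the origin
    that forms an angle theta with the positive x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    t = x * c + y * s
    return (t * c, t * s)


def oblique_projection(t, x, y):
    """The oblique projection Pi_t(x, y) = x + t * y."""
    return x + t * y


def phi(theta):
    """The diffeomorphism phi(theta) = log tan(theta) on (0, pi/2)."""
    return math.log(math.tan(theta))


# Hypothetical sample values.
theta, (x, y) = 0.7, (0.3, 0.9)
p = orthogonal_projection(theta, x, y)
q = (oblique_projection(math.exp(phi(theta)), x, y), 0.0)

# (x, y), q and p are collinear exactly when this cross product vanishes.
cross = (p[0] - x) * (q[1] - y) - (p[1] - y) * (q[0] - x)
print(abs(cross) < 1e-12)
```

The check works because both $\pi_\theta(x, y) - (x, y)$ and $(x + y\tan\theta, 0) - (x, y)$ are multiples of the vector $(-\sin\theta, \cos\theta)$, which is perpendicular to $\ell_\theta$.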
For all compact $I \subseteq \mathbb{R}$, all $\varepsilon, c_1, c_2, c_3 > 0$, all sufficiently small $\rho > 0$ (depending on all previous quantities), and all $(c_1\rho, \gamma_4)_{c_2}$-sets $A \subseteq [0, 1]^2$ with $|A| \geq \rho^{-\gamma_3}$, there exists $T \subseteq I$ with the following properties: (I) the set $I \setminus T$ can be covered by a disjoint union of not more than $\varepsilon\rho^{-1}/2$-many half-open intervals of length $\rho$, a cover of total Lebesgue measure less than $\varepsilon$; (II) for all $t \in T$ and all […] are distinct and $c_3\rho$-separated.
We want to apply Theorem 3.2 with $\gamma_4$ as $\gamma$, $c_1\rho$ as $\rho$, $c_2$ as $c$, and with $A$, $n$, $\delta$, and $m$ as they are. We see that the inequality $n > -\log c_2$ holds for $\rho$ sufficiently small, as does $m \leq \delta^2 n/4$ since $\sigma < (\gamma_3 - \gamma_1)/2$. Since the conditions of Theorem 3.2 hold, the set $E \subseteq [0, \pi)$ defined in (3.1) satisfies […] (3.4) Let $J = \varphi^{-1}(I)$, and put $T = I \setminus \varphi|_J(E)$. Since the map $\varphi|_J$ is bi-Lipschitz, […] Combining this with (3.4) and the fact that $\sigma < (\gamma_3 - \gamma_1)/2$, we have that for sufficiently small $\rho$, $N(I \setminus T, \rho) \leq \varepsilon\rho^{-1}/6$. It follows that the set $I \setminus T$ can be covered by a disjoint union of not more than $\varepsilon\rho^{-1}/2$-many half-open intervals of length $\rho$, a cover of total measure less than $\varepsilon$. This establishes (I).
To prove (II), let $t \in T$, and let […]. By choosing points in $A'$ in each fiber of a maximally $\rho$-separated set of the projection, we see that there exists a subset $A'_t \subseteq A'$ of cardinality at least $\rho^{-\gamma_1}$ such that the orthogonal projections of the points in $A'_t$ onto $\ell_\theta$ are distinct and $c_3\rho$-separated. Since the oblique projection $\Pi_{e^t}$ increases distances between points that lie on $\ell_\theta$, the images of points of $A'_t$ under $\Pi_{e^t}$ are $c_3\rho$-separated.

Trees and a subtree regularity theorem
Trees are combinatorial objects that are convenient for describing fractal sets.We will be concerned solely with finite trees throughout this work.After giving the main definitions, we motivate their importance by explaining how they will be used in the proof of Theorem A.
We then prove the main result of this section.

Preliminary definitions
The following definitions describe the familiar notion of a rooted tree, a graph with no cycles whose vertices can be arranged on levels and whose edges only connect vertices on adjacent levels.
• The nodes in $\Gamma_n$ have height $n$. The single node with height 0 is the root, and the nodes with height $N$ are called leaves.
• The node $Q$ is the parent of each of its children, nodes in the set […] with root $Q$ and the same parent function as $\Gamma$, restricted to the set $\Gamma_Q$.
• A subtree of $\Gamma$ is a tree $\Gamma' \subseteq \Gamma$ of the same height as $\Gamma$ with parent function […] (A subtree is uniquely determined by its non-empty set of leaves $\Gamma'_N \subseteq \Gamma_N$.)
Continuing with terminology inspired by genealogy trees, the ancestors of a node $Q$ are those nodes that lie between $Q$ and the root. For the reasons described below in Remark 4.4, it will be important to count the number of ancestors of $Q$ that have many children. To this end, we introduce the following terminology and notation.
Definition 4.2. Let $\Gamma$ be a tree, $c > 0$, and $\omega \in [0, 1]$.
• The ancestry of $Q \in \Gamma_n$ is the set […]
The following definitions allow us to capture the dimension of a finite tree by giving costs to the nodes and measuring the cost of the least expensive cut.
Definition 4.3. Let $\Gamma$ be a tree, $r \in \mathbb{N}$, $r \geq 2$, and $\gamma > 0$. • […]
The main result in this section, Theorem 4.7, says, roughly speaking, that any tall enough tree with Hausdorff content bounded from below and with a uniform upper bound on the number of children of any node has a subtree in which most nodes have fertile ancestry. Before making this statement precise and beginning with the details of the proof, let us make two observations about the concept of fertile ancestry that will help explain why it will be useful later on in the proof of Theorem A.
(I) The property of having fertile ancestry is preserved under a type of tree thinning process that we will employ in the proof of Theorem A. More specifically, suppose that $\Gamma$ is a tree in which every node has either one child or at least $c$ many children and in which every node has $(c, \omega)$-fertile ancestry. Suppose further that for every node $Q$, there exists […] These subsets naturally give rise to a subtree $\tilde{\Gamma}$ obtained by thinning the tree $\Gamma$: the subtree $\tilde{\Gamma}$ is uniquely defined by the property that if $Q$ is a node of $\tilde{\Gamma}$, then $C_{\tilde{\Gamma}}(Q) = \tilde{C}(Q)$. It is not hard to see that every node in $\tilde{\Gamma}$ has $(c, \omega)$-fertile ancestry, regardless of how the subsets of children $\tilde{C}(Q)$ were chosen.
(II) A tree in which every node has fertile ancestry necessarily has large Hausdorff content. This is a simple consequence of the mass distribution principle (or the max-flow min-cut theorem) for trees, the real analogue of which is stated in Lemma 2.6. More specifically, let $\Gamma$ be a tree, and consider a "flow" through $\Gamma$ of magnitude 1 starting at the root that splits equally amongst children. The value of the flow at any node $Q$ with fertile ancestry can be bounded from above using the fact that many times, much of the flow is split amongst a large set of children before reaching $Q$. If all nodes of $\Gamma$ have fertile ancestry, then the flow is not concentrated too highly at any node. According to the mass distribution principle, the Hausdorff content of a tree that supports such a flow is high.
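The flow described in observation (II) is easy to compute on a concrete tree. The sketch below stores a tree as a dictionary from each node to its list of children and pushes a unit flow from the root, splitting it equally at every node. The representation, the function name `equal_split_flow`, and the small example tree are hypothetical choices made for illustration and are not the paper's notation.

```python
def equal_split_flow(children, root):
    """Return a dict mapping each node to the value of the unit flow that
    starts at `root` and splits equally amongst the children of every node."""
    flow = {root: 1.0}
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        for kid in kids:
            # Each child receives an equal share of its parent's flow.
            flow[kid] = flow[node] / len(kids)
            stack.append(kid)
    return flow


# Hypothetical tree of height 2: the root has two children,
# "a" has three children and "b" has a single child.
children = {"root": ["a", "b"], "a": ["a1", "a2", "a3"], "b": ["b1"]}
flow = equal_split_flow(children, "root")
print(flow)
```

In this example the leaf below the node with a single child carries more flow than the leaves below the node with three children, which is the effect that fertile ancestry is meant to control: the more often an ancestor has many children, the smaller the flow that reaches the node.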

A subtree regularity theorem
We now proceed with the main results in this subsection. In the next two results, fix $r \geq 2$ and $0 < \gamma_2 < \gamma_3 < \gamma_4$ such that setting […] ensures the quantity $B$ is positive. The following lemma describes the fundamental dichotomy behind Theorem 4.7.
Lemma 4.5. If $\Gamma$ is a tree with the property that every node in the tree has at most $r^{\gamma_4}$ many children, then at least one of the following holds: (I) there are at least $r^{\gamma_2}$ many children $Q$ of the root, each of which satisfies […] (II) there is at least one child $Q$ of the root satisfying […]
Proof. Let $\Gamma$ be a tree satisfying (4.1). Let $Q_1, Q_2, \ldots, Q_I$ be the children of the root of $\Gamma$, ordered so that […] It follows by the ordering of the $Q_i$'s and the definition of the Hausdorff content and induced trees that […]
Lemma 4.6. Every finite tree $\Gamma$ that satisfies (4.1) has a subtree $\Gamma'$ with the property that for all nodes $Q$ in $\Gamma'$, […] (4.2)
Proof. We will prove the lemma by induction on the height $N$ of the tree $\Gamma$. To verify the base case, let $\Gamma$ be the tree of height $N = 0$: a single node with no children. Taking $\Gamma' = \Gamma$, the inequality (4.2) for this single node follows from the fact that $\log_r \mathcal{H}^{\gamma_3}_r(\Gamma) = 0$. Suppose that $N \in \mathbb{N}$ is such that the lemma holds for all trees of height $N - 1$. Let $\Gamma$ be a tree of height $N$ that satisfies (4.1). By Lemma 4.5, at least one of Case (I) or Case (II) holds.
Suppose Case (I) of Lemma 4.5 holds. Let $Q$ be any one of the $r^{\gamma_2}$-many children guaranteed by Case (I). By the induction hypothesis, there exists a subtree $\Gamma'_Q$ of $\Gamma_Q$ in which every node satisfies (4.2) with $\Gamma_Q$ in place of $\Gamma$ and $\Gamma'_Q$ in place of $\Gamma'$. Define the subtree $\Gamma'$ of $\Gamma$ to be the root node of $\Gamma$ with the collection of at least $r^{\gamma_2}$ many children $Q$, each of those children followed by its subtree $\Gamma'_Q$. We will now verify that (4.2) holds for all nodes of $\Gamma'$. Let $Q$ be any node of $\Gamma'$. If $Q$ is the root node of $\Gamma'$, then (4.2) holds because $\log_r \mathcal{H}^{\gamma_3}_r(\Gamma) \leq 0$. (Indeed, that $\mathcal{H}^{\gamma_3}_r(\Gamma) \leq 1$ follows by considering the cut $C := \{Q\}$ of $\Gamma$.) If $Q$ is a non-root node of $\Gamma'$, then it belongs to one of the subtrees $\Gamma'_S$ for some child $S$ of the root of $\Gamma'$. By property (4.2) for the subtree $\Gamma'_S$, we see […] This simplifies to the inequality in (4.2), verifying the inductive step if Case (I) of Lemma 4.5 holds.
Suppose Case (II) of Lemma 4.5 holds. Let $Q$ be the child guaranteed by Case (II). By the induction hypothesis, there exists a subtree $\Gamma'_Q$ of $\Gamma_Q$ in which every node satisfies (4.2) with $\Gamma_Q$ in place of $\Gamma$ and $\Gamma'_Q$ in place of $\Gamma'$. Define the subtree $\Gamma'$ of $\Gamma$ to be the root of $\Gamma$ with only the child $Q$ followed by its subtree $\Gamma'_Q$. We will now verify that (4.2) holds for all nodes of $\Gamma'$. Let $Q$ be any node of $\Gamma'$. If $Q$ is the root node of $\Gamma'$, then (4.2) holds because $\log_r \mathcal{H}^{\gamma_3}_r(\Gamma) \leq 0$. If $Q$ is a non-root node of $\Gamma'$, then by property (4.2) for the subtree containing $Q$, we see […] This simplifies to the inequality in (4.2), verifying the inductive step if Case (II) of Lemma 4.5 holds. The proof of the inductive step is complete, and the lemma follows.
Theorem 4.7. For all $0 < \varepsilon < 1$, for all $0 < \gamma_2 < \gamma_3 < \gamma_4 < \gamma_3 + \varepsilon(\gamma_3 - \gamma_2)$, for all sufficiently large $r \in \mathbb{N}$, and for all $V > 0$, there exists $N_0 \in \mathbb{N}$ for which the following holds. For all $N \geq N_0$ and for all trees $\Gamma$ of height $N$ with $\mathcal{H}^{\gamma_3}_r(\Gamma) \geq V$ that satisfy (4.1), there exists a subtree $\Gamma'$ of $\Gamma$ such that all nodes $Q \in \Gamma'$ with height at least $N_0$ have $(r^{\gamma_2}, 1 - \varepsilon)$-fertile ancestry in $\Gamma'$.
[…] and note by the inequality in the previous sentence, B/( […] and note that for all $N \geq N_0$, the inequality in (4.3) holds with $N_0$ replaced by $N$. Let $N \geq N_0$, and let $\Gamma$ be a tree of height $N$ with $\mathcal{H}^{\gamma_3}_r(\Gamma) \geq V$ that satisfies (4.1). By Lemma 4.6, there exists a subtree $\Gamma'$ of $\Gamma$ such that for all nodes $Q$ of $\Gamma'$, the inequality in (4.2) holds.
Let $Q$ be a node of $\Gamma'$ with height at least $N_0$. By (4.2) and (4.3), we see that […] It follows that $Q$ has $(r^{\gamma_2}, 1 - \varepsilon)$-fertile ancestry in $\Gamma'$, as was to be shown.

Proof of the sumsets theorem
In this section, we prove Theorem A, the main theorem in this work. We restate it here for the reader's convenience.
Theorem A. Let $r$ and $s$ be multiplicatively independent positive integers, and let $X, Y \subseteq [0, 1]$ be $\times r$- and $\times s$-invariant sets, respectively. Define $\overline{\gamma} = \min(\dim_{\text{H}} X + \dim_{\text{H}} Y, 1)$. For all compact $I \subseteq \mathbb{R} \setminus \{0\}$ and all $\gamma < \overline{\gamma}$, $\inf_{\lambda, \eta \in I} \mathcal{H}^{\gamma}_{>0}(\lambda X + \eta Y) > 0$. (5.1)
Several auxiliary results go into the proof: the discrete version of Marstrand's projection theorem in Section 3, the subtree regularity theorem for finite trees in Section 4, and the quantitative equidistribution result in Section 2.3. We outline the proof of Theorem A in Section 5.1 before presenting the full details in Sections 5.2 and 5.3.
Remark 5.1. It is natural to ask about the value of the infimum $C := \inf_{\lambda, \eta \in I} \mathcal{H}^{\gamma}_{>0}(\lambda X + \eta Y)$ that appears in (5.1), or, more precisely, how it depends on $X$ and $Y$. The value of $C$ must depend on $r$, $s$, $\gamma$, $\overline{\gamma}$, and $I$, but also on $X$ and $Y$, at least to the extent that it accounts for the Hausdorff content of $X \times Y$. It follows from the proof of Theorem A below that this is essentially the only sense in which $C$ depends on $X$ and $Y$.

Outline of the proof of Theorem A
Before beginning with the details of the proof of Theorem A, we explain the main ideas behind it. To understand the argument, it helps to begin by assuming that the set $X \times Y$ is self-similar in the sense that for every $n \in \mathbb{N}_0$, it is a union of approximately $r^{n(\dim_{\text{H}} X + \dim_{\text{H}} Y)}$ many translates of the set $r^{-n}X \times s^{-n'}Y$. (Recall that $n' = \lfloor n \log r/\log s \rfloor$ so that $s^{-n'} \approx r^{-n}$.) This is the case, for example, if $X$ and $Y$ are both restricted digit Cantor sets. In this case, Peres and Shmerkin [PS] proved that for all $\lambda, \eta \in \mathbb{R} \setminus \{0\}$, $\dim_{\text{H}}(\lambda X + \eta Y) = \overline{\gamma}$. Our argument follows along the same lines as theirs.
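Recall that $\Pi_t : \mathbb{R}^2 \to \mathbb{R}$ is the oblique projection $\Pi_t(x, y) = x + ty$. A quick calculation shows that
$$\Pi_{e^t}\big(r^{-n} x,\ s^{-n'} y\big) = r^{-n}\, \Pi_{e^t r^n / s^{n'}}(x, y),$$
which implies that the images of the translates of $r^{-n} X \times s^{-n'} Y$ under the map $\Pi_{e^t}$ are affinely equivalent to the image of the full set $X \times Y$ under the map $\Pi_{e^t r^n / s^{n'}}$. It follows that the set $\Pi_{e^t}(X \times Y)$ contains affine images of the sets $\Pi_{e^t r^n / s^{n'}}(X \times Y)$ and hence that
$$\dim_{\text{H}} \Pi_{e^t}(X \times Y) \geq \sup_{n \in \mathbb{N}_0} \dim_{\text{H}} \Pi_{e^t r^n / s^{n'}}(X \times Y).$$
Thus, to bound $\dim_{\text{H}} \Pi_{e^t}(X \times Y)$ from below, it suffices to show that there is some $n \in \mathbb{N}_0$ for which $e^t r^n / s^{n'}$ is a "good angle" for $X \times Y$, in the sense that $\dim_{\text{H}} \Pi_{e^t r^n / s^{n'}}(X \times Y) > \overline{\gamma} - \varepsilon$.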
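The rescaling behind this affine equivalence is easy to test numerically. The sketch below, with hypothetical function names and an arbitrary sample point, evaluates both $\Pi_{e^t}(r^{-n}x, s^{-n'}y)$ and $r^{-n}\,\Pi_{e^t r^n/s^{n'}}(x, y)$ so that they can be compared. It is only a sanity check of the algebra, not part of the argument.

```python
import math


def n_prime(n, r, s):
    """Return floor(n * log r / log s), i.e. the largest k with s**k <= r**n."""
    k, power, target = 0, s, r ** n
    while power <= target:
        k += 1
        power *= s
    return k


def oblique_projection(slope, point):
    """The map Pi_slope(x, y) = x + slope * y."""
    x, y = point
    return x + slope * y


def both_sides(t, n, r, s, point):
    """Return the two sides of the rescaling identity for one point (x, y)."""
    k = n_prime(n, r, s)
    x, y = point
    # Left side: project the rescaled point (r^-n x, s^-n' y) with slope e^t.
    lhs = oblique_projection(math.exp(t), (x / r ** n, y / s ** k))
    # Right side: project (x, y) with slope e^t r^n / s^n', then scale by r^-n.
    rhs = oblique_projection(math.exp(t) * r ** n / s ** k, (x, y)) / r ** n
    return lhs, rhs


# Illustrative parameters (assumptions): r = 2, s = 3, t = 0.7, n = 10.
LHS, RHS = both_sides(0.7, 10, 2, 3, (0.3, 0.8))
```

Up to floating-point rounding, `LHS` and `RHS` agree for every choice of `t`, `n`, and `point`.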
It follows from Marstrand's theorem that the set of such "good angles" for $X \times Y$ (indeed, for any set) has full measure in $\mathbb{R}$, and it will be shown that the sequence $n \mapsto \log(e^t r^n / s^{n'})$ has image in $[t, t + \log s)$ and is the orbit of $t$ under the irrational rotation $x \mapsto x + \log r \pmod{\log s}$ translated by $t$. When combined, these facts fall just short of allowing us to conclude the existence of $n \in \mathbb{N}_0$ for which $e^t r^n / s^{n'}$ is a good angle: it is possible that the image of an equidistributed sequence misses a set of full measure.
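To see the rotation concretely, note that $\log(e^t r^n / s^{n'}) = t + n \log r - n' \log s$, and $n \log r - n' \log s$ is $n \log r$ reduced modulo $\log s$. The following minimal sketch, with hypothetical function names and the illustrative choice $r = 2$, $s = 3$, computes this offset in both ways.

```python
import math


def n_prime(n, r, s):
    """Return floor(n * log r / log s), i.e. the largest k with s**k <= r**n."""
    k, power, target = 0, s, r ** n
    while power <= target:
        k += 1
        power *= s
    return k


def log_slope_offset(n, r, s):
    """log(r^n / s^n') = n log r - n' log s, the amount added to t."""
    return n * math.log(r) - n_prime(n, r, s) * math.log(s)


def rotation_orbit(n, r, s):
    """n-th point of the orbit of 0 under x -> x + log r (mod log s)."""
    return math.fmod(n * math.log(r), math.log(s))


# Illustrative choice (an assumption): r = 2 and s = 3.
R, S = 2, 3
offsets = [log_slope_offset(n, R, S) for n in range(100)]
orbit = [rotation_orbit(n, R, S) for n in range(100)]
```

The two lists agree up to rounding, and every entry lies in $[0, \log s)$, so adding $t$ places the sequence in $[t, t + \log s)$ as stated.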
To make use of the above outline, one needs to gain some topological information on the set of good angles from Marstrand's theorem. This can be accomplished by moving the argument to a discrete setting. Discretizing introduces a number of technical nuisances, but the core of the argument remains the same. Recall that $X_n$ and $Y_{n'}$ are the sets $X$ and $Y$ rounded to the lattices $r^{-n}\mathbb{Z}$ and $s^{-n'}\mathbb{Z}$, respectively. The discrete analogue of Marstrand's theorem in Theorem 3.2 tells us that the complement of the set of "good angles" for a finite set such as $X_n \times Y_{n'}$ can be covered by a disjoint union of few half-open intervals. This topological information combines with the equidistribution of the irrational rotation described above to allow us to find many $n \in \mathbb{N}_0$ for which $e^t r^n / s^{n'}$ is a good angle for $X_n \times Y_{n'}$.
[...] Claim 5.2 with $\tilde{I} := \{\log(\eta/\lambda) : \eta, \lambda \in I\}$ as $I$ and $\gamma$ as it is. Let $m, N_0 \in \mathbb{N}$ be as guaranteed by Claim 5.2.
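Note that by Lemma 2.4 and the fact that the function $\rho \mapsto \mathcal{H}^{\gamma}_{\rho}\big(\lambda X + \eta Y\big)$ is non-increasing (as $\rho$ decreases), $\inf_{\lambda, \eta \in I}$ [...]. The limit in the final expression exists because $\inf_{\lambda, \eta \in I} \mathcal{H}^{\gamma}_{\rho}\big(\lambda X + \eta Y\big)$ is non-increasing and is bounded from below by zero.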
Therefore, to show that (5.1) holds, it suffices to prove that $\lim$ [...] where $d_{\text{H}}$ is the Hausdorff metric, and Lemma 2.7 that for all $\lambda, \eta \in I$, [...] (5.4). Therefore, to show (5.3), it suffices to prove that $\lim$ [...] (5.5). Combining the conclusion of Claim 5.2 with Lemma 2.6, we see that for all $N \geq N_0$ and $t \in \tilde{I}$,
$$\mathcal{H}^{\gamma}_{r^{-Nm}}\big(\Pi_{e^t} Q_{Nm}\big) \geq r^{-N_0 m}.$$
This shows that the limit in (5.5) is positive and completes the deduction of Theorem A from Claim 5.2.

Proof of Claim 5.2
Choosing the parameter $m$ and scale $\rho$. Recall that $r$, $s$, $X$, $Y$, and $\overline{\gamma}$ are as given in the statement of Theorem A. Without loss of generality, we can assume that $r < s$. Put $\beta = \log s$, let $0 < \gamma < \overline{\gamma}$, and define $\varepsilon := \overline{\gamma} - \gamma$ and $\gamma_0 := \gamma$.
Let $c_1$, $c_2$, and $M_0$ be the constants guaranteed by Corollary 2.16, when applied with $\gamma_4$ as $\xi$. Let $I \subseteq (0, \infty)$ be compact, and define $I_\beta = I + [0, \beta]$. Let $P > 0$ be a Lipschitz constant for all of the maps $\Pi_{e^t}$, $t \in I_\beta$, and let $c_3 = 4Ps^{-1} + 1$. Choose $m \in \mathbb{N}$ large enough so that we can apply [...]
[...] $\Pi_{e^{t + R^n(0)}} A'_t$ are distinct and $c_3 \rho$-separated. Define $C^m_{\Gamma'}(Q) = Q + (r^{-nm}, s^{-(nm)'}) \odot A'_t$ so that $(r^{nm}, s^{(nm)'})$ [...] and $R^n(0) + \alpha > \beta$. We do exactly as in (XI) with $Q_m$ replaced by $Q_m$ and using (VIII) to get the set $C^m_{\Gamma'}(Q)$.
Let $\Gamma''$ be the subtree of $\Gamma'$ with the property that if $Q$ is a non-leaf node of $\Gamma''$, then $C_{\Gamma''}(Q) = C^m_{\Gamma'}(Q)$. We claim that every node of $\Gamma''$ with height at least $N_0$ has $(r^{m\gamma_1}, 1 - \varepsilon/2)$-fertile ancestry.
Claim 5.5. If $L_1$ and $L_2$ are two distinct leaves of $\Gamma''$ and $n$ is maximal such that $L_1$ and $L_2$ have a common ancestor at height $n$, then $|\Pi_{e^t} L_1 - \Pi_{e^t} L_2| \geq \rho^{n+1}$.
Proof. Let $Q$ be the common ancestor of $L_1$ and $L_2$ in $\Gamma''$ of height $n$. Note that by the definition of $\Gamma''$ and maximality of $n$, it must be that $Q$ has more than one child and hence that $n \in J$. Let $Q_1$ and $Q_2$ be the children of $Q$ in $\Gamma''$ that are ancestors of $L_1$ and $L_2$, respectively. Note that $Q_1 \neq Q_2$ but that $Q_i$ may be equal to $L_i$.
We will show first that $\Pi_{e^t} Q_1$ and $\Pi_{e^t} Q_2$ are $c_3 \rho^{n+1}$-separated. Write $Q = (p, q)$ and $Q_i = (p_i, q_i)$. Suppose that $R^n(0) + \alpha < \beta$. It follows from (V) that
$$\begin{aligned} \Pi_{e^t} Q_i &= r^{-nm}\big(r^{nm}\, \Pi_{e^t}(Q_i - Q)\big) + \Pi_{e^t} Q \\ &= \rho^{n}\big(r^{nm}(p_i - p) + e^{t + R^n(0)} s^{(nm)'}(q_i - q)\big) + \Pi_{e^t} Q \\ &= \rho^{n}\, \Pi_{e^{t + R^n(0)}}\big((r^{nm}, s^{(nm)'}) \odot (Q_i - Q)\big) + \Pi_{e^t} Q. \end{aligned}$$
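By the definition of the $Q_{nm}$ sets, $|Q_i - L_i| \leq 2s^{-1}\rho^{n+1}$. By the triangle inequality and the fact that $c_3 = 4Ps^{-1} + 1$,
$$|\Pi_{e^t} L_1 - \Pi_{e^t} L_2| \geq |\Pi_{e^t} Q_1 - \Pi_{e^t} Q_2| - \sum_{i=1}^{2} |\Pi_{e^t} Q_i - \Pi_{e^t} L_i| \geq c_3 \rho^{n+1} - 4Ps^{-1}\rho^{n+1} = \rho^{n+1}.$$
It follows that $|\Pi_{e^t} L_1 - \Pi_{e^t} L_2| \geq \rho^{n+1}$, as was to be shown.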
Constructing the measure $\mu$. The proof of Claim 5.2 will be concluded by demonstrating that 1) the fertile ancestry property of $\Gamma''$ in (5.8) guarantees that $\Gamma''$ supports a "measure" which is not too concentrated on any node (an outline for this step was described in Remark 4.4 (II)); and 2) by Claim 5.5, the projection of this measure is not too concentrated on any ball.
Let $\nu : \Gamma'' \to [0, 1]$ be the unique function that takes 1 on the root of $\Gamma''$ and has the properties that for all non-leaf nodes $Q$ of $\Gamma''$, $\nu$ is constant on $C_{\Gamma''}(Q)$ and $\nu(Q) = \sum_{C \in C_{\Gamma''}(Q)} \nu(C)$. (Colloquially, a mass of 1 begins at the root of $\Gamma''$ and spreads down the [...] separated and for all $\delta \geq \rho$ and all open balls $B$ of diameter $\delta$,