1 Introduction and Statement of Results

1.1 Introduction.

Given \(A\subset {\mathbb {R}}^d\), its distance set is \(\Delta (A)=\{|x-y|:x,y\in A\}\). Falconer [Fal85] pioneered the study of the relationship between the Hausdorff dimensions of A and \(\Delta (A)\). He proved that if \(d\ge 2\) and \(A\subset {\mathbb {R}}^d\) is a Borel (or even analytic) set then \({{\,\mathrm{dim_H}\,}}(\Delta (A)) \ge \min ({{\,\mathrm{dim_H}\,}}(A)-\tfrac{1}{2}(d-1),1)\), where \({{\,\mathrm{dim_H}\,}}\) stands for Hausdorff dimension. Falconer also constructed compact sets \(A\subset {\mathbb {R}}^d\) (based on lattices) of any Hausdorff dimension such that \({{\,\mathrm{dim_H}\,}}(\Delta (A)) \le \min (2{{\,\mathrm{dim_H}\,}}(A)/d,1)\). Although it is not explicitly stated in [Fal85], the conjecture that these lattice constructions are extremal, in the sense that one should have \({{\,\mathrm{dim_H}\,}}(\Delta (A))=1\) if \({{\,\mathrm{dim_H}\,}}(A)\ge d/2\), has become known as the Falconer distance set problem.

Falconer’s problem is a continuous version of the celebrated distinct distances problem of P. Erdős [Erd46], asserting (in the plane) that if \(|A|=N\), \(A\subset {\mathbb {R}}^2\), then \(|\Delta (A)|\ge c N/\sqrt{\log N}\). Guth and Katz [GK15] (building on work of Elekes and Sharir [ES11]) famously solved this problem, up to logarithmic factors, by showing that \(|\Delta (A)| \ge c N/\log N\). However, the approach of Guth and Katz and, indeed, all previous methods developed to tackle Erdős’ problem, do not appear to be able to yield progress on Falconer’s problem.

From now on, we focus on the case \(d=2\), which is the first non-trivial case, the best understood, and the focus of this article. Wolff [Wol99], based on a method of Mattila [Mat87] and extending ideas of J. Bourgain [Bou94], proved that if \(A\subset {\mathbb {R}}^2\) is a Borel set with \({{\,\mathrm{dim_H}\,}}(A)\ge 4/3\), then \({{\,\mathrm{dim_H}\,}}(\Delta (A))=1\). In fact, he proved that \({{\,\mathrm{dim_H}\,}}(A)>4/3\) ensures that \(\Delta (A)\) has positive length, and established the more general dimension formula

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta (A)) \ge \min \left( \frac{3}{2}{{\,\mathrm{dim_H}\,}}(A)-1,1\right) , \end{aligned}$$
(1.1)

whenever \({{\,\mathrm{dim_H}\,}}(A)>1\). The method developed by Mattila and Wolff is strongly Fourier-analytic, depending on difficult estimates for the decay of circular averages of the Fourier transform of measures.

Later Bourgain [Bou03], crucially relying on earlier work of Katz and Tao [KT01], proved that if \(A\subset {\mathbb {R}}^2\) satisfies \({{\,\mathrm{dim_H}\,}}(A)\ge 1\), then

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta (A)) > \frac{1}{2} + \delta , \end{aligned}$$
(1.2)

where \(\delta >0\) is a universal constant. Although non-explicit, it is clear from the proof that the value of \(\delta \) one would get is extremely small. The method of Katz–Tao and Bourgain is based on additive combinatorics, and it seems difficult for this type of argument to yield reasonable values of \(\delta \).

A related problem concerns the dimensions of pinned distance sets

$$\begin{aligned} \Delta _y(A) = \{ |x-y|:x\in A\}. \end{aligned}$$

Peres and Schlag [PS00, Theorem 8.3] proved that if \(A\subset {\mathbb {R}}^2\) is a Borel set with \({{\,\mathrm{dim_H}\,}}(A)=s\), then for all \(0<t\le \min (s,1)\),

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}\{ y\in {\mathbb {R}}^2: {{\,\mathrm{dim_H}\,}}(\Delta _y(A)) < t \} \le 2+t-\max (s,1). \end{aligned}$$
(1.3)

Recently, Iosevich and Liu [IL17] proved that (1.3) remains true with \(3+3t-3s\) in the right-hand side. This is an improvement in some parts of the parameter region. Both results imply that if \({{\,\mathrm{dim_H}\,}}(A)>3/2\), then there is \(y\in A\) such that \({{\,\mathrm{dim_H}\,}}(\Delta _y A)=1\), and it is unknown whether \(3/2\) can be replaced by a smaller number. We remark that the results of both [PS00] and [IL17] extend to higher dimensions.

These were the best known results towards Falconer’s problem in the plane for general sets prior to this article. For some special classes of sets, better results are known. In particular, the second author proved in [Shm17] that if \(A\subset {\mathbb {R}}^2\) is a Borel set of equal Hausdorff and packing dimension, and this value is \(>1\), then \({{\,\mathrm{dim_H}\,}}(\Delta _y(A))=1\) for all y outside of a set of exceptions of Hausdorff dimension at most 1, and in particular for many \(y\in A\). This verifies Falconer’s conjecture for this class of sets, outside of the endpoint. We remark that Orponen [Orp17b] and the second author [Shm17b] had previously proved weaker results of the same kind. See also [Mat87, IL16] for other results on the distance sets of special classes of sets.

1.2 Main results.

In this article we prove new lower bounds on the dimensions of (pinned) distance sets, which in particular greatly improve the best previously known estimates when \({{\,\mathrm{dim_H}\,}}(A)=1+\delta \), \(\delta >0\) small.

Theorem 1.1

If A is a Borel subset of \({\mathbb {R}}^2\) with \({{\,\mathrm{dim_H}\,}}A=s\), then

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}\left\{ y\in {\mathbb {R}}^2: {{\,\mathrm{dim_H}\,}}(\Delta _y(A)) < \min \left( \frac{2}{3}s,1\right) \right\} \le \max (1,2-s). \end{aligned}$$
(1.4)

In particular, if \(s>1\), then one can find many \(y\in A\) such that

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta _y(A)) \ge \min \left( \frac{2}{3}s,1\right) . \end{aligned}$$

We remark that we get better bounds for the dimension of the full distance set, see Theorem 1.4 below.

The last claim in Theorem 1.1 improves the previously known bounds for the dimensions of pinned distance sets \(\Delta _y(A)\) with \(y\in A\) for all \(s\in (1,3/2]\). The bound (1.4) also improves upon (1.3) (and the variant of Iosevich and Liu) in large regions of parameter space, and in particular for \(t=\min (\tfrac{2}{3}s,1)\) and all \(s\in (3/5,5/3)\).

Theorem 1.1 is a special case of a more general result that takes into account the Hausdorff and also the packing dimension of A. We refer to [Fal14, §3.5] for the definition and main properties of packing dimension \({{\,\mathrm{dim_P}\,}}\), and simply note that it satisfies \({{\,\mathrm{dim_H}\,}}(A)\le {{\,\mathrm{dim_P}\,}}(A)\le {{\,\mathrm{{\overline{\dim }}_B}\,}}(A)\), where \({{\,\mathrm{{\overline{\dim }}_B}\,}}\) denotes the upper box-counting (or Minkowski) dimension. For our method, the worst case is that in which A has maximal packing dimension 2, and we get better bounds for the distance set under the assumption that the packing dimension is smaller:

Theorem 1.2

Let

$$\begin{aligned} \chi (s,u) = \min \left( \frac{s(2+u-2s)}{2+2u-3s},1\right) , \end{aligned}$$

with the convention \(\chi (2,2)=1\). Given \(0< s\le u \le 2\), the following holds: if A is a Borel subset of \({\mathbb {R}}^2\) with \({{\,\mathrm{dim_H}\,}}A\ge s\) and \({{\,\mathrm{dim_P}\,}}A \le u\), then

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}\left\{ y\in {\mathbb {R}}^2: {{\,\mathrm{dim_H}\,}}(\Delta _y(A)) < \chi (s,u) \right\} \le \max (1,2-s). \end{aligned}$$

In particular, if \(s>1\) then there are many \(y\in A\) such that

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta _y(A)) \ge \chi (s,u), \end{aligned}$$

and hence if \({{\,\mathrm{dim_H}\,}}(A)>1\) and \({{\,\mathrm{dim_P}\,}}A\le 2{{\,\mathrm{dim_H}\,}}A-1\), then \({{\,\mathrm{dim_H}\,}}(\Delta _y(A))=1\) for many \(y\in A\).

Note that Theorem 1.1 follows immediately by taking \(u=2\): a simple calculation shows that \(\chi (s,2)=\min (2s/3,1)\).

We remark that, taking \(u=s\), this theorem recovers the main result of [Shm17] mentioned above, namely that if \({{\,\mathrm{dim_H}\,}}(A)={{\,\mathrm{dim_P}\,}}(A)>1\), then \({{\,\mathrm{dim_H}\,}}(\Delta _y A)=1\) for many \(y\in A\). On the other hand, it was known from (1.3) that if \({{\,\mathrm{dim_H}\,}}(A)>3/2\) then there is \(y\in A\) such that \({{\,\mathrm{dim_H}\,}}(\Delta _y A)=1\). The last claim in Theorem 1.2 can be seen as interpolating between these two situations, and hence provides a new, more general, geometric condition under which Falconer’s conjecture is known to hold.
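The following small sketch (ours, using sympy; it is not part of the paper's argument, only a check of the closed form of \(\chi \)) verifies the three special cases just discussed:

```python
import sympy as sp

s, u = sp.symbols('s u', positive=True)

# chi(s, u) from Theorem 1.2, before taking the minimum with 1
chi = s * (2 + u - 2 * s) / (2 + 2 * u - 3 * s)

print(sp.simplify(chi.subs(u, 2)))          # 2*s/3: recovers Theorem 1.1
print(sp.simplify(chi.subs(u, s)))          # s: so min(s,1) = 1 once s >= 1
print(sp.simplify(chi.subs(u, 2 * s - 1)))  # 1: the condition dim_P <= 2 dim_H - 1
```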

When \({{\,\mathrm{dim_H}\,}}(A)>1\), we are able to get much better lower bounds for the packing dimension of the pinned distance sets:

Theorem 1.3

Let A be a Borel subset of \({\mathbb {R}}^2\) with \(s={{\,\mathrm{dim_H}\,}}(A)\in (1,3/2)\). Then

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}\left\{ y\in {\mathbb {R}}^2: {{\,\mathrm{dim_P}\,}}(\Delta _y(A)) < \frac{1+s+\sqrt{3s(2-s)}}{4} \right\} \le 1. \end{aligned}$$

In particular, there is \(y\in A\) such that

$$\begin{aligned} {{\,\mathrm{dim_P}\,}}(\Delta _y(A)) \ge \frac{1+s+\sqrt{3s(2-s)}}{4} > \frac{2+\sqrt{3}}{4} = 0.9330127\ldots . \end{aligned}$$

We recall that since upper box-counting dimension is at least as large as packing dimension, the above theorem also holds for upper box-counting dimension. Even though Falconer’s conjecture is about the Hausdorff dimension of the distance set, this result presents further evidence towards its validity.

Fig. 1

The three solid graphs show, from top to bottom: (1) the lower bound given by Theorem 1.3 for \({{\,\mathrm{dim_P}\,}}(\Delta _y(A))\), valid for y outside of a one-dimensional set; (2) the lower bound for \({{\,\mathrm{dim_H}\,}}(\Delta (A))\) given by Theorem 1.4; (3) the lower bound given by Theorem 1.1 for \({{\,\mathrm{dim_H}\,}}(\Delta _y(A))\), again valid for y outside of a one-dimensional set. The dashed line is Wolff’s lower bound for \({{\,\mathrm{dim_H}\,}}(\Delta (A))\) (previously the best known bound, outside of a tiny interval to the right of 1). In all cases the variable is \({{\,\mathrm{dim_H}\,}}(A)\).
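The following sketch (ours; it assumes numpy and matplotlib and is only a numerical illustration, not part of the proofs) reproduces the comparison of Figure 1 and locates the crossing point \(1.21931\ldots \) mentioned after Theorem 1.4:

```python
import numpy as np
import matplotlib.pyplot as plt

s = np.linspace(1.0, 1.5, 501)[1:]  # dim_H(A), just above 1

thm11 = np.minimum(2 * s / 3, 1)                  # Theorem 1.1 (pinned, Hausdorff)
thm13 = (1 + s + np.sqrt(3 * s * (2 - s))) / 4    # Theorem 1.3 (pinned, packing)
thm14 = s * (147 - 170 * s + 60 * s**2) / (18 * (12 - 14 * s + 5 * s**2))
wolff = np.minimum(1.5 * s - 1, 1)                # Wolff's bound (1.1)

mask = s < 4 / 3                                  # Theorem 1.4 is stated on (1, 4/3)
plt.plot(s, thm13, label='Thm 1.3')
plt.plot(s[mask], thm14[mask], label='Thm 1.4')
plt.plot(s, thm11, label='Thm 1.1')
plt.plot(s, wolff, '--', label='Wolff (1.1)')
plt.xlabel('dim_H(A)'); plt.legend(); plt.show()

# crossing of the Theorem 1.4 bound with Wolff's bound, reported as 1.21931...
f = lambda t: t * (147 - 170*t + 60*t**2) / (18 * (12 - 14*t + 5*t**2)) - (1.5*t - 1)
lo, hi = 1.05, 1.3                                # bisection; f(lo) > 0 > f(hi)
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
print(lo)
```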

Finally, as anticipated above, we get a better bound for the dimension of the full distance set when \({{\,\mathrm{dim_H}\,}}(A)\) is slightly larger than 1:

Theorem 1.4

If \(A\subset {\mathbb {R}}^2\) is a Borel set with \({{\,\mathrm{dim_H}\,}}(A)=s\in (1,4/3)\), then

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta (A))\ge \frac{s(147-170s+60s^2)}{18(12-14s+5s^2)} \ge \frac{37}{54}=0.6851851\ldots . \end{aligned}$$

A calculation shows that this indeed improves upon Wolff’s bound (1.1) for the dimension of the full distance set for \(s\in (1,1.21931\ldots )\) (and upon Bourgain’s bound (1.2) for all \(s>1\)). We remark that this theorem is obtained by combining the idea of the proof of Theorem 1.2 with a known effective variant of Wolff’s bound (1.1). Although achieving this combination takes quite a bit of work, Theorem 1.2 should perhaps be considered the most basic result, since its proof is shorter and already contains most of the main ideas, and the improvement given by Theorem 1.4 is relatively modest. Note also that already applying Theorem 1.1 to the full distance set improves upon (1.1) for \(s\in (1,6/5)\). See Figure 1 for a comparison of the lower bounds from Theorems 1.1, 1.3 and 1.4 and Wolff’s lower bound.

After this paper was made public, Liu [Liu18] posted a preprint extending Wolff’s result to pinned distance sets. In particular, he shows that if \(A\subset {\mathbb {R}}^2\) is a Borel set with \({{\,\mathrm{dim_H}\,}}(A)>4/3\), then \(\Delta _x(A)\) has positive Lebesgue measure for some \(x\in A\) (with bounds on the dimension of the exceptional set). This is stronger than our Theorem 1.1 for \(s>4/3\) (other than the exceptional set being larger).

1.3 Strategy of proof.

Our approach is completely different from those of Wolff, Bourgain, Peres and Schlag, and Iosevich and Liu. Rather, it can be seen as a continuation of the ideas successively developed in [Orp17b, Shm17b, Shm17] to attack the distance set problems for sets with certain regularity. Thus, one of the main points of this paper is extending the strategy of these papers so that it can be applied to general sets.

At the core of our method is a lower box-counting estimate for pinned distance sets \(\Delta _y A\) in terms of a multi-scale decomposition of A or, rather, of a Frostman measure \(\mu \) supported on A. See Section 4 for precise statements. A key aspect of these estimates is that they recover a global lower box-counting bound for \(\Delta _y A\) from local, discretized and linearized estimates for the pinned distance measures \(\Delta _y \mu \).

The general philosophy of obtaining lower bounds for the dimension of projected sets and measures, in terms of multi-scale averages of local projections is behind a large number of results in fractal geometry in the last few years, see e.g. [Hoc12, Hoc14] and references there. The insight that this approach can be used also to study distance sets is due to Orponen [Orp12, Orp17b].

Up until the paper [Shm17], the scales in the multi-scale decomposition behind all the variants of the method described above were of the form \(2^{-N j}\) for some fixed N. One of the innovations of [Shm17] was to modify the method so that it could handle also scales of the form \(2^{-(1+\varepsilon )^j}\) (the point being that \((1+\varepsilon )^j\) is exponential in j, rather than linear). Although this was flexible enough to handle sets of equal Hausdorff and packing dimension (as opposed to Ahlfors-regular sets as in [Orp17b, Shm17b]), it was still too restrictive for dealing with general sets.

One of the main innovations of this paper is that we are able to work with scales \(2^{-M_j}\) where the \(M_j\) only need to satisfy \(\tau M_j \le M_{j+1}-M_j\le M_j+T\) (where \(\tau >0, T\in {\mathbb {N}}\) are fixed parameters). This provides a major degree of flexibility. In particular, a crucial point is that we are able to pick the sequence \((M_j)\) depending on the set A (or the Frostman measure \(\mu \)), while in all previous works the scales in the multi-scale decomposition were basically fixed. See Proposition 4.4. This leads us to the combinatorial problem of optimizing the choice of \((M_j)\) for each measure \(\mu \). We solve this problem completely, up to negligible error terms, in Section 5.

In fact, we deduce the combinatorial statements we need from several statements about the variation of Lipschitz functions, which might be of independent interest. More precisely, given a 1-Lipschitz function \(f:[0,a]\rightarrow {\mathbb {R}}\) satisfying certain additional assumptions, we seek to minimize

$$\begin{aligned} \sum _{n=1}^\infty \left( f(a_n)-\min _{[a_n,a_{n-1}]} f\right) , \end{aligned}$$

where \((a_n)_{n=0}^\infty \) is a strictly decreasing sequence tending to 0 with \(a=a_0\) and \(a_{n} \le 2 a_{n+1}\). Conversely, we also study the structure of functions f for which these sums are (for some sequence \((a_i)\)) close to the minimum possible value. We underline that this part of the method is completely new as the combinatorial problem does not arise for fixed multi-scale decompositions.

Another obstacle to dealing with arbitrary sets and measures is that energies of measures (which play a key role throughout) do not have a nice multi-scale decomposition in general. We deal with this by decomposing a general measure supported on \([0,1)^2\) as a superposition of measures with a regular Cantor structure, plus a small error term: see Corollary 3.5. This step is an adaptation of some ideas of Bourgain that we learned from [Bou10]. After dealing with some technical difficulties, this reduces our study to those regular measures for which a suitable multi-scale expression of the energy does exist, see Lemma 3.3.

The strategy just discussed is behind the proofs of Theorems 1.2, 1.3 and 1.4. However (as briefly indicated above), the proof of Theorem 1.4 is based on merging these ideas with a more quantitative version of Wolff’s result that if \({{\,\mathrm{dim_H}\,}}(A)\ge 4/3\) then \({{\,\mathrm{dim_H}\,}}(\Delta (A))=1\), see Theorem 6.4 below. The fact that one can improve upon Theorem 1.1 (for the full distance set) is based on the observation that for some sets \(A\subset {\mathbb {R}}^2\) of Hausdorff dimension \(s>1\) for which the method of the proof of Theorem 1.1 cannot give anything better than \({{\,\mathrm{dim_H}\,}}(\Delta (A))\ge 2s/3\), the quantitative version of Wolff’s theorem can give a much better bound. The fact that these two methods are based on totally different techniques and also have different “enemies” that one must overcome suggests that neither of them (even in combination, as we do here) provides a definitive line of attack on Falconer’s problem.

1.4 Sets of directions, and the case of dimension 1.

Although Theorem 1.1 does provide new information on the pinned distance sets \(\Delta _y A\) when \({{\,\mathrm{dim_H}\,}}A=1\), it gives no information whatsoever on \({{\,\mathrm{dim_H}\,}}(\Delta (A))\) in this case. There are some well-known “enemies” that one must handle in order to improve upon the easy bound \({{\,\mathrm{dim_H}\,}}(\Delta (A))\ge 1/2\) when \({{\,\mathrm{dim_H}\,}}A=1\). One is that the corresponding fact is false over the complex numbers: \({\mathbb {R}}^2\) is a subset of \({\mathbb {C}}^2\) of half the dimension of the ambient space for which the (squared) distance set

$$\begin{aligned} \Delta ^2({\mathbb {R}}^2) = \{ (x_1-y_1)^2 + (x_2-y_2)^2 : (x_1,x_2), (y_1,y_2)\in {\mathbb {R}}^2\} \end{aligned}$$

also has half the dimension of the ambient space. Hence any improvement over \(1/2\) in the real case must take into account the order structure of \({\mathbb {R}}\). The other obstacle is a well-known counterexample to a naive discretization of the problem: see [KT01, Eq. (2) and Figure 1]. These enemies do not arise when \({{\,\mathrm{dim_H}\,}}(A)>1\). Despite these conceptual differences, we underline that, with the exception of the work of Katz and Tao [KT01] underpinning Bourgain’s bound (1.2), none of the other methods developed so far makes any distinction between the cases \({{\,\mathrm{dim_H}\,}}(A)=1\) and \({{\,\mathrm{dim_H}\,}}(A)=1+\delta \).

From the point of view of our strategy, the key significance of the assumption \({{\,\mathrm{dim_H}\,}}(A)>1\) is that in this case the set of directions determined by pairs of points in A has positive Lebesgue measure. In fact, we need a far more quantitative “pinned” version of this fact, which is due to Orponen [Orp17], improving upon a related result by Mattila and Orponen [MO16] (see Proposition 3.11 below). However, the direction set may well fail to have positive measure when \({{\,\mathrm{dim_H}\,}}A=1\); this clearly happens when A is contained in a line. Since \({{\,\mathrm{dim_H}\,}}(\Delta _y A) = {{\,\mathrm{dim_H}\,}}(A)\) trivially when A is contained in a line, this does not rule out an extension of our approach to the case \({{\,\mathrm{dim_H}\,}}(A)=1\). However, this would require some variant of Proposition 3.11 when both \(s\) and \(u\) are slightly less than 1, under a suitable hypothesis of non-concentration on lines, and this appears to be very hard. In [Orp17, Corollary 1.8], Orponen also proved that the direction set of a planar set of Hausdorff dimension 1 which is not contained in a line has Hausdorff dimension \(\ge 1/2\), but this is very far from positive measure, let alone from anything resembling Proposition 3.11.

To understand why directions arise naturally, we recall that our whole approach is based on bounding the size of pinned distance sets in terms of a multi-scale average of local linearized pinned distance measures. The derivative of the distance function \(x\mapsto |x-y|\) is precisely the direction spanned by x and y. Thus we are led to study orthogonal projections of certain measures localized around x, where the angle is given by the direction determined by x and y. The fact that these directions are “well distributed” in a suitable sense can then be used in conjunction with a finitary version of Marstrand’s projection theorem (see Lemma 3.6) and several applications of Fubini to conclude that one can choose y such that for “many” x the direction determined by x and y is good in the sense that the \(L^2\) norm of the projection is controlled by the 1-energy of the measure being projected.

1.5 Structure of the paper.

In Section 2 we introduce notation to be used in the rest of the paper. Section 3 contains some preliminary definitions and results that will be repeatedly used in the later proofs. In Section 4 we establish a lower bound for the box-counting numbers of pinned distance sets that will be at the heart of the proofs of all main theorems. Section 5 contains a number of optimization results about Lipschitz functions on the line, as well as corollaries of these results for discrete \([-1,1]\)-sequences; these corollaries play a key role in the proofs of the main theorems. Theorems 1.2, 1.3 and 1.4 are proved in Section 6. We conclude with some remarks on the sharpness of our results in Section 7.

We remark that Sections 5.2 and 5.3 are not needed for the proof of Theorem 1.2 (the results from Section 5.2 are required only in the proof of Theorem 1.3, and Section 5.3 is needed only for the proof of Theorem 1.4).

We also wish to thank T. Orponen for many useful discussions at the early stage of this project, and an anonymous referee for several suggestions that improved the paper, in particular for suggesting a simplification of the statement and proof of Proposition 3.12.

2 Notation

We use Landau’s \(O(\cdot )\) notation: given \(X>0\), O(X) denotes a positive quantity bounded above by CX for some constant \(C>0\). If C is allowed to depend on some other parameters, these are denoted by subscripts. We sometimes write \(X\lesssim Y\) in place of \(X=O(Y)\) and likewise with subscripts. We write \(X\gtrsim Y\), \(X\approx Y\) to denote \(Y\lesssim X\), \(X\lesssim Y\lesssim X\) respectively.

Throughout the rest of the paper, we work with three parameters that we assume fixed: a large integer T and small positive numbers \(\varepsilon ,\tau \). We briefly indicate their meaning:

(1) We will decompose sets and measures in base \(2^T\). In particular, we will work with sets and measures that have a regular tree (or Cantor) structure when represented in this base: see Definition 3.2.

(2) The parameter \(\tau \) will be used to define sets of bad projections: see Definition 3.8. The condition \(\tau >0\) is required to ensure that these sets have small measure. It also keeps some error terms negligible, see Proposition 4.4.

(3) Finally, \(\varepsilon \) will denote a generic small parameter; it can play different roles at different places.

We will use the notation \(o_{T,\varepsilon ,\tau }(1)=o_{T\rightarrow \infty ,\varepsilon \rightarrow 0^+,\tau \rightarrow 0^+}(1)\) to denote any function \(f(T,\varepsilon ,\tau )\) such that

$$\begin{aligned} f(T,\varepsilon ,\tau )\ge 0 \quad \text {and}\quad \lim _{T\rightarrow \infty ,\varepsilon \rightarrow 0^+,\tau \rightarrow 0^+} f(T,\varepsilon ,\tau )=0. \end{aligned}$$

If a particular instance of o(1) is independent of some of the variables, we drop these variables from the notation. Different instances of the o(1) notation may refer to different functions of \(T,\varepsilon ,\tau \), and they may depend on each other, so long as they can always be made arbitrarily small.

Note that e.g. \(O_\varepsilon (1)\) denotes any (finite) function of \(\varepsilon \), while \(o_\varepsilon (1)\) denotes a function of \(\varepsilon \) that tends to 0 as \(\varepsilon \rightarrow 0^+\).

We will often work at a scale \(2^{-T\ell }\); it is useful to think that \(\ell \rightarrow \infty \) while \(T,\varepsilon ,\tau \) remain fixed.

The family of Borel probability measures on a metric space X is denoted by \({\mathcal {P}}(X)\). If \(\mu (A)>0\), then \(\mu _A\) denotes the normalized restriction \(\mu (A)^{-1}\mu |_A\). If \(f:X\rightarrow Y\) is a Borel map, then by \(f\mu \) we denote the push-forward measure, i.e. \(f\mu (A)= \mu (f^{-1}A)\).

We let \({\mathcal {D}}_j\) be the half-open \(2^{-jT}\)-dyadic cubes in \({\mathbb {R}}^d\) (where d is understood from context), and let \({\mathcal {D}}_j(x)\) be the only cube in \({\mathcal {D}}_j\) containing \(x\in {\mathbb {R}}^d\). Given a measure \(\mu \in {\mathcal {P}}({\mathbb {R}}^d)\), we also let \({\mathcal {D}}_j(\mu )\) be the cubes in \({\mathcal {D}}_j\) with positive \(\mu \)-measure. Note that these families depend on T. Given \(A\subset {\mathbb {R}}^d\), we also denote by \({\mathcal {N}}(A,j)\) the number of cubes in \({\mathcal {D}}_j\) that intersect A.

A \(2^{-m}\)-measure is a measure in \({\mathcal {P}}([0,1)^d)\) such that its restriction to any \(2^{-m}\)-dyadic cube Q is a multiple of Lebesgue measure on Q, i.e. a measure defined down to resolution \(2^{-m}\). Likewise, a \(2^{-m}\)-set is a union of \(2^{-m}\)-dyadic cubes. If \(\mu \in {\mathcal {P}}({\mathbb {R}}^d)\) is an arbitrary measure, then we denote

$$\begin{aligned} R_\ell (\mu ) = \sum _{Q\in {\mathcal {D}}_\ell } \mu (Q) \text {Leb}_Q, \end{aligned}$$

that is \(R_\ell (\mu )\) is the \(2^{-T\ell }\)-measure that agrees with \(\mu \) on all dyadic cubes of side length \(2^{-T\ell }\). We also define the corresponding analog for sets: given \(A\subset {\mathbb {R}}^d\), \(R_\ell (A)\) denotes the union of all cubes in \({\mathcal {D}}_\ell \) that intersect A.

Due to our use of dyadic cubes, it will often be convenient to deal with supports in the dyadic metric, i.e. given \(\mu \in {\mathcal {P}}([0,1)^d)\) we let

$$\begin{aligned} {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu ) = \{ x: \mu ({\mathcal {D}}_j(x))>0 \text { for all } j\in {\mathbb {N}}\}. \end{aligned}$$

Note that \(\mu ({{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu ))=1\) and that \({{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\subset {{\,\mathrm{supp}\,}}(\mu )\).

If a measure \(\mu \in {\mathcal {P}}({\mathbb {R}}^d)\) has a density in \(L^p\), then its density is sometimes also denoted by \(\mu \), and in particular \(\Vert \mu \Vert _p\) stands for the \(L^p\) norm of its density.

We make some further definitions. Let \(\mu \in {\mathcal {P}}([0,1)^d)\). If Q is a dyadic cube and \(\mu (Q)>0\), then we denote \(\mu ^Q = \text {Hom}_Q\mu _Q\), where \(\text {Hom}_Q\) is the homothety renormalizing Q to \([0,1)^d\). If \(M<N\) are integers, then for \(x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\) we define

$$\begin{aligned} \mu (x;M \rightarrow N) = R_{N-M} \mu ^{{\mathcal {D}}_M(x)}. \end{aligned}$$

In other words, \(\mu (x;M\rightarrow N)\) is the conditional measure on \({\mathcal {D}}_M(x)\), rescaled back to the unit cube, and then stopped at resolution \(2^{-(N-M)T}\). Likewise, for \(Q\in {\mathcal {D}}_M\) with \(\mu (Q)>0\) we define

$$\begin{aligned} \mu (Q;N) = R_{N-M} \mu ^Q. \end{aligned}$$

Note that \(\mu (x;M\rightarrow N)\) and \(\mu (Q;N)\) are \(2^{-(N-M)T}\)-measures.
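The following minimal sketch (ours; all function names are our own) records this bookkeeping concretely, representing a \(2^{-T\ell }\)-measure on \([0,1)^2\) as an array of dyadic cell masses:

```python
import numpy as np

T = 2  # cubes in D_j have side length 2^{-jT}

def R(mu, ell):
    """R_ell: replace mu (an n x n array of cell masses) by the measure that
    agrees with it on D_ell cubes and is uniform inside each of them."""
    n = mu.shape[0]
    m = 2 ** (T * ell)          # coarse cells per side
    k = n // m                  # fine cells per coarse cell per side
    coarse = mu.reshape(m, k, m, k).sum(axis=(1, 3))
    return np.kron(coarse, np.full((k, k), 1.0 / k ** 2))

def conditional(mu, i, j, M, N):
    """mu(Q; N) for Q the (i, j)-th cube of D_M: normalized restriction to Q,
    rescaled to the unit square (a reindexing of the array), then R_{N-M}."""
    n = mu.shape[0]
    k = n // 2 ** (T * M)
    block = mu[i * k:(i + 1) * k, j * k:(j + 1) * k]
    return R(block / block.sum(), N - M)

rng = np.random.default_rng(0)
mu = rng.random((2 ** (3 * T),) * 2); mu /= mu.sum()   # a 2^{-3T}-measure
nu = conditional(mu, 0, 0, M=1, N=2)                   # a 2^{-(N-M)T}-measure
print(nu.shape, round(nu.sum(), 10))                   # (16, 16) 1.0
```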

Logarithms are always to base 2.

3 Preliminary Results

3.1 Regular measures and energy.

In this section we define some important notions and prove some preliminary results.

Recall that the s-energy of \(\mu \in {\mathcal {P}}({\mathbb {R}}^d)\) is

$$\begin{aligned} {\mathcal {E}}_s(\mu ) = \iint \frac{d\mu (x)d\mu (y)}{|x-y|^s}. \end{aligned}$$

Lemma 3.1

For any Borel probability measure \(\mu \) on \([0,1]^d\), if \(s>0\) then

$$\begin{aligned} {\mathcal {E}}_s(\mu ) \approx _{s,d,T} \sum _{j=1}^\infty 2^{s T j} \sum _{Q\in {\mathcal {D}}_j} \mu (Q)^2. \end{aligned}$$

If \(\mu \) is a \(2^{-{T\ell }}\)-measure and \(0<s<d\), then the sum runs up to \(\ell \) (in particular, the s-energy is finite).

Proof

First of all, by [PP95, Theorem 3.1], we can replace \({\mathcal {E}}_s(\mu )\) by the s-energy on the \(2^T\)-ary tree, i.e. by

$$\begin{aligned} \iint 2^{s T|x\wedge y|} \,d\mu (x)d\mu (y), \end{aligned}$$

where \(|x\wedge y| = \max \{ j: y\in {\mathcal {D}}_j(x)\}\) (both energies are comparable up to a \(O_{T,d}(1)\) factor). The formula for \({\mathcal {E}}_s(\mu )\) now follows from a standard calculation, see e.g. [Shm17, Lemma 3.1] for the case \(T=1\) (the proof of the general case is identical).

Finally, the case in which \(\mu \) is a \(2^{-{T\ell }}\)-measure follows again from another simple calculation, see e.g. [Shm17, Lemma 3.2] for the case \(T=1\). \(\square \)
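As an illustration of the multi-scale formula in Lemma 3.1, here is a short sketch (ours) that evaluates the right-hand side for a \(2^{-T\ell }\)-measure given by its cell masses; the example checks it against Lebesgue measure, for which the \(j\)-th term is \(2^{(s-2)Tj}\):

```python
import numpy as np

def dyadic_energy(mu, s, T=1):
    """sum_{j=1}^{ell} 2^{sTj} sum_{Q in D_j} mu(Q)^2 for a 2^{-T*ell}-measure
    stored as an n x n array of cell masses, n = 2^{T*ell} (Lemma 3.1)."""
    n = mu.shape[0]
    ell = int(round(np.log2(n))) // T
    total = 0.0
    for j in range(1, ell + 1):
        m = 2 ** (T * j)
        k = n // m
        masses = mu.reshape(m, k, m, k).sum(axis=(1, 3))
        total += 2 ** (s * T * j) * np.sum(masses ** 2)
    return total

# Lebesgue measure on [0,1)^2: sum_Q mu(Q)^2 = 2^{-2Tj}, so the j-th term is
# 2^{(s-2)Tj} and the sum stays bounded as ell grows precisely when s < 2.
T, ell = 1, 8
leb = np.full((2 ** (T * ell),) * 2, 4.0 ** (-T * ell))
print(dyadic_energy(leb, s=1.0, T=T))   # ~ sum_{j>=1} 2^{-j} = 1
```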

One of the key steps in the proof of the main theorems is to decompose an arbitrary \(2^{-T\ell }\)-measure in terms of measures which have a uniform tree structure when represented in base \(2^T\). This notion (which is inspired by some constructions of Bourgain [Bou10]) is made precise in the next definition.

Definition 3.2

Given a sequence \(\sigma =(\sigma _1,\ldots ,\sigma _{\ell })\in [-1,d-1]^\ell \), we say that \(\mu \in {\mathcal {P}}([0,1)^d)\) is \(\sigma \)-regular if it is a \(2^{-{T\ell }}\)-measure, and for any \(Q\in {\mathcal {D}}_{j}(\mu )\), \(1\le j\le \ell \), we have

$$\begin{aligned} \mu (Q) \le 2^{-T(\sigma _j+1)} \mu (\widehat{Q}) \le 2\mu (Q), \end{aligned}$$

where \(\widehat{Q}\) is the only cube in \({\mathcal {D}}_{j-1}\) containing Q.

The expression \(2^{-T(\sigma _j+1)}\) in the definition may appear strange, but it turns out to be a convenient normalization. The key point in this definition is that a measure is \(\sigma \)-regular if, at each level j, all cubes of positive mass have roughly the same mass, and the sequence \((\sigma _j)\) quantifies this common mass.

Lemma 3.3

If \(\nu \in {\mathcal {P}}([0,1)^d)\) is \(\sigma \)-regular for some \(\sigma \in {\mathbb {R}}^\ell \) and \(s\in (0,d)\), then

$$\begin{aligned} \left| \log {\mathcal {E}}_s(\nu )-T \max _{j=0}^{\ell } \sum _{i=1}^j \left( (s-1)-\sigma _i \right) \right| \le O(\ell ) + O_{d,s,T}(1). \end{aligned}$$

Proof

We use crude bounds which are enough for our purposes. From the definition it is clear that if \(Q\in {\mathcal {D}}_j(\nu )\) then

$$\begin{aligned} 2^{-\ell } 2^{-T(\sigma _1+1)} \cdots 2^{-T(\sigma _j+1)} \le 2^{-j} 2^{-T(\sigma _1+1)} \cdots 2^{-T(\sigma _j+1)} \le \nu (Q) \le 2^{-T(\sigma _1+1)} \cdots 2^{-T(\sigma _j+1)}. \end{aligned}$$

This implies, in particular, that

$$\begin{aligned} 2^{T(\sigma _1+1)} \cdots 2^{T(\sigma _j+1)} \le |{\mathcal {D}}_j(\nu )|\le 2^\ell 2^{T(\sigma _1+1)} \cdots 2^{T(\sigma _j+1)} . \end{aligned}$$
(3.1)

From the two displayed equations and Lemma 3.1 it follows that

$$\begin{aligned} 2^{-2\ell } \sum _{j=1}^{\ell } 2^{-T(\sigma _1+\cdots +\sigma _j+j)} \cdot 2^{s T j} \lesssim _{d,s,T} {\mathcal {E}}_s(\nu ) \lesssim _{d,s,T} 2^\ell \sum _{j=1}^{\ell } 2^{-T(\sigma _1+\cdots +\sigma _j+j)}\cdot 2^{sTj}. \end{aligned}$$

Write \({\mathcal {M}}_s(\sigma ) := T \max _{j=0}^{\ell } \sum _{i=1}^j \left( (s-1)-\sigma _i\right) \). Bounding \(\sum _{j=1}^\ell \) by \(\ell \) times the maximal term in the right-hand side, we deduce that

$$\begin{aligned} {\mathcal {M}}_s(\sigma ) -2\ell -O_{d,s,T}(1) \le \log {\mathcal {E}}_s(\nu ) \le {\mathcal {M}}_s(\sigma ) + \ell +\log \ell +O_{d,s,T}(1). \end{aligned}$$

This yields the claim. \(\square \)

Heuristically, the previous lemma says that for \(\log {\mathcal {E}}_s(\nu )\) to be small, it must hold that

$$\begin{aligned} \sum _{i=1}^j \sigma _i \ge (s-1)j, \quad j=0,\ldots ,\ell . \end{aligned}$$

Recalling the connection of \(\sigma _i\) to branching numbers, this means that the average branching number over any initial set of scales has to be sufficiently large, in a manner depending on s.

The following is a variant of Bourgain’s regularization argument (see e.g. [Bou10, Section 2] for a clean example). Recall that \({{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\) denotes the dyadic support of \(\mu \).

Lemma 3.4

Let \(\mu \) be a \(2^{-{T\ell }}\)-measure on \([0,1)^d\) for some \(\ell \ge 1\). There exists a \(2^{-{T\ell }}\)-set X, contained in \({{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\) and satisfying \(\mu (X) \ge (2Td+2)^{-\ell }\), such that \(\mu _X\) is \(\sigma \)-regular for some sequence \(\sigma \in [-1,d-1]^\ell \).

Proof

Recall that \(\widehat{Q}\) is the only cube in \({\mathcal {D}}_{j-1}\) containing \(Q\in {\mathcal {D}}_j\). For each \(k\in [0,Td]\cap {\mathbb {Z}}\), let

$$\begin{aligned} X_{\ell }^{(k)} = \bigcup \{ Q\in {\mathcal {D}}_{\ell }: \mu (Q) \le 2^{-k} \mu (\widehat{Q}) < 2\mu (Q) \}, \end{aligned}$$

and set

$$\begin{aligned} X_{\ell }^{(>Td)} = \bigcup \{ Q\in {\mathcal {D}}_{\ell }: \mu (Q) \le 2^{-(Td+1)} \mu (\widehat{Q}) \}. \end{aligned}$$

Note that

$$\begin{aligned} \mu \left( X_{\ell }^{(>Td)}\right) \le 2^{-(Td+1)} 2^{Td} = \frac{1}{2}, \end{aligned}$$

and that \({{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\) is the union of the \(X_\ell ^{(k)}\) together with \(X_{\ell }^{(>Td)}\). Pick the smallest \(k=k(\ell )\in [0,Td]\) which maximizes \(\mu (X_{\ell }^{(k)})\) and set \(\sigma _\ell = k/T-1\in [-1,d-1]\). Then

$$\begin{aligned} \mu \left( X_{\ell }^{(k)} \right) \ge \frac{1}{2(Td+1)} . \end{aligned}$$

Set \(X_{\ell }:= X_{\ell }^{(k)}\) and \(\mu _{\ell }=\mu _{X_{\ell }}\).

Now continue inductively, replacing \(\ell \) by \(\ell -1\) and \(\mu \) by \(\mu _{\ell }\), until we eventually get a set \(X_1\) and a sequence \((\sigma _1,\ldots ,\sigma _\ell )\in [-1,d-1]^\ell \). Note that for \(Q\in {\mathcal {D}}_j(\mu _i)\) the value of \(\mu _i(Q)/\mu _i(\widehat{Q})\) remains constant for \(i\le j\) and, in particular, for \(i=1\). Hence \(X=X_1\) has the desired properties. \(\square \)
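The proof of Lemma 3.4 is effectively an algorithm: at each level, classify the cubes of positive mass by the dyadic size of \(\mu (Q)/\mu (\widehat{Q})\), keep the most popular class, and recurse one level up. The following sketch (ours; a discrete stand-in for the actual measure-theoretic statement) implements this greedy step:

```python
import numpy as np

def blocksum(a, m):
    """Masses of the m x m dyadic cubes, from an n x n array of cell masses."""
    k = a.shape[0] // m
    return a.reshape(m, k, m, k).sum(axis=(1, 3))

def regularize(mu, T, d=2):
    """Greedy step of Lemma 3.4 for a probability array mu of side n = 2^{T*ell}.
    Returns a boolean array X (union of kept finest cubes) and sigma such that
    the normalized restriction of mu to X is sigma-regular."""
    n = mu.shape[0]
    ell = int(round(np.log2(n))) // T
    cur = mu.copy()
    sigma = [0.0] * (ell + 1)
    for j in range(ell, 0, -1):
        mj = blocksum(cur, 2 ** (T * j))
        par = np.kron(blocksum(cur, 2 ** (T * (j - 1))), np.ones((2 ** T,) * 2))
        with np.errstate(divide='ignore', invalid='ignore'):
            cls = np.floor(-np.log2(mj / par))  # k with mu(Q) <= 2^{-k} mu(Q^) < 2 mu(Q)
        best = max(range(T * d + 1), key=lambda k: mj[cls == k].sum())
        keep = (cls == best).astype(float)
        cur *= np.kron(keep, np.ones((n // 2 ** (T * j),) * 2))
        cur /= cur.sum()
        sigma[j] = best / T - 1                 # sigma_j in [-1, d-1]
    return cur > 0, sigma[1:]

T = 1
rng = np.random.default_rng(0)
mu = rng.random((2 ** (5 * T),) * 2); mu /= mu.sum()
X, sigma = regularize(mu, T)
print(mu[X].sum(), sigma)   # retained mass is >= (2Td+2)^{-ell} by the lemma
```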

The set X given by the lemma will have far too little measure for our purposes: later we will need \(\mu _X(A)\) to be large (in particular nonzero) for certain sets A of mass roughly \(\ell ^{-2}\). By iterating the construction, we are able to get a moderately long sequence of sets \(X_i\) such that \(\mu ({\mathbb {R}}^d\setminus \cup _i X_i)\ll \ell ^{-2}\); by pigeonholing we will then be able to select some \(X_i\) with \(\mu _{X_i}(A)\) suitably large.

Corollary 3.5

Fix \(\ell \ge 1\), write \(m=T\ell \), and let \(\mu \) be a \(2^{-m}\)-measure on \([0,1)^d\). There exists a family of pairwise disjoint \(2^{-m}\)-sets \(X_1,\ldots , X_N\) with \(X_i\subset {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\), and such that:

(i) \(\mu \left( \bigcup _{i=1}^N X_i\right) \ge 1-2^{-\varepsilon m}\). In particular, if \(\mu (A)> 2^{-\varepsilon m}\), then there exists i such that \(\mu _{X_i}(A)\ge \mu (A)-2^{-\varepsilon m}\).

(ii) \(\mu (X_i) \ge 2^{-(\varepsilon +(1/T)\log (2d T+2)) m} \ge 2^{-o_{T,\varepsilon }(1) m}\).

(iii) Each \(\mu _{X_i}\) is \(\sigma (i)\)-regular for some \(\sigma (i)\in [-1,d-1]^\ell \).

Moreover, the family \((X_i)_{i=1}^N\) may be constructed so that it is determined by \(d, T,\varepsilon ,\ell \) and \(\mu \) (even though there may be other families satisfying the above properties).

Proof

Let \(X_1\) be the set given by Lemma 3.4, and put \(B_1=[0,1]^d\setminus X_1\). Continue inductively: once \(X_j,B_j\) are defined, let \(X_{j+1}\) be the set given by Lemma 3.4 applied to \(\mu _{B_j}\), and set \(B_{j+1}=B_j\setminus X_{j+1}\). Then (setting \(B_0=[0,1)^d\))

$$\begin{aligned} \mu (B_j) \ge 2^{-\varepsilon m} \quad \Longrightarrow \quad \mu (X_{j+1}) \ge 2^{-\varepsilon m} (2d T+2)^{-\ell }. \end{aligned}$$
(3.2)

Let N be the smallest integer such that \(\mu (B_N) \le 2^{-\varepsilon m}\); such N exists thanks to (3.2).

It is clear that in this construction the family \(X_1,\ldots ,X_N\) is determined by \(d,T,\varepsilon ,\ell ,\mu \), since the set X constructed in the proof of Lemma 3.4 is determined by \(d,T,\ell ,\mu \).

The first part of claim (i) is immediate. Then note that

$$\begin{aligned} \mu (A)-2^{-\varepsilon m} \le \sum _{i=1}^N \mu (X_i \cap A) = \sum _{i=1}^N \mu (X_i) \mu _{X_i}(A), \end{aligned}$$

so there must be i such that \(\mu _{X_i}(A) \ge \mu (A)-2^{-\varepsilon m}\), as claimed.

Finally, (ii) is immediate from (3.2) and the definition of N, and (iii) is clear since the sets \(X_i\) were provided by Lemma 3.4. \(\square \)
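Iterating the greedy step as in the proof gives the decomposition of Corollary 3.5. A short sketch (ours, reusing the `regularize` and `blocksum` functions from the sketch after Lemma 3.4):

```python
import numpy as np
# assumes regularize() from the sketch after Lemma 3.4

def decompose(mu, T, eps):
    """Corollary 3.5 (sketch): peel off sigma-regular pieces X_1, X_2, ... until
    the leftover set B_N has mass at most 2^{-eps*m}, where m = T*ell."""
    m = int(round(np.log2(mu.shape[0])))        # m = T*ell since side = 2^{T*ell}
    pieces, rest = [], mu / mu.sum()
    while rest.sum() > 2.0 ** (-eps * m):
        X, sigma = regularize(rest / rest.sum(), T)
        pieces.append((X, sigma, rest[X].sum()))
        rest = rest * ~X                        # B_{j+1} = B_j \ X_{j+1}
    return pieces, rest.sum()

rng = np.random.default_rng(1)
mu = rng.random((2 ** 4,) * 2); mu /= mu.sum()
pieces, leftover = decompose(mu, T=1, eps=0.2)
print(len(pieces), leftover)   # N pieces, plus the mass of the discarded set
```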

3.2 Sets of bad projections.

In this subsection, we introduce sets of “bad” multi-scale projections for a measure \(\mu \) around a point x. The simple fact that these sets can be taken to have small measure (independently of \(\mu \) and x) will play a crucial role later. Although a similar notion was introduced in [Shm17], the sets of bad projections we use here are far more flexible and also more involved, depending on the decomposition into regular measures provided by Corollary 3.5.

Given \(\theta \in S^1\), we denote the orthogonal projection \(x\mapsto x\cdot \theta \) by \(\Pi _\theta \). Normalized Lebesgue measure on \(S^1\) will be denoted by \(|\cdot |\). We recall the following consequence of the energy version of Marstrand’s projection theorem.

Lemma 3.6

Let \(\mu \in {\mathcal {P}}([0,1)^2)\) have finite 1-energy. Then, for any \(R>0\),

$$\begin{aligned} |\{ \theta \in S^1: \Vert \Pi _\theta \mu \Vert _2^2 \ge R {\mathcal {E}}_1(\mu ) \}| \lesssim R^{-1}. \end{aligned}$$

Proof

This is a direct consequence of Markov’s inequality and the inequality

$$\begin{aligned} \int _{S^1} \Vert \Pi _\theta \mu \Vert _2^2\,d\theta \lesssim {\mathcal {E}}_1(\mu ), \end{aligned}$$

see e.g. [Mat04, Equation 1.7]. \(\square \)
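As a rough numerical illustration of this averaged projection bound (entirely ours; binning projected cell centers is only a proxy for the true density of \(\Pi _\theta \mu \)):

```python
import numpy as np

def proj_L2_sq(mu, theta):
    """Crude proxy for ||Pi_theta mu||_2^2: project the centers of the cells of
    an n x n mass array, bin at scale 1/n, and integrate the squared density."""
    n = mu.shape[0]
    c = (np.arange(n) + 0.5) / n
    x, y = np.meshgrid(c, c, indexing='ij')
    t = x * np.cos(theta) + y * np.sin(theta)          # x . theta in (-1, sqrt(2))
    binmass = np.bincount(np.floor((t + 1) * n).astype(int).ravel(),
                          weights=mu.ravel())
    return n * np.sum(binmass ** 2)   # sum of density^2 times bin width 1/n

rng = np.random.default_rng(2)
mu = rng.random((64, 64)); mu /= mu.sum()
vals = [proj_L2_sq(mu, th) for th in np.linspace(0, np.pi, 180, endpoint=False)]
# The average over theta is O(E_1(mu)); by Markov's inequality only a small
# fraction of directions can be much larger than the mean.
print(np.mean(vals), np.max(vals))
```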

We restate [Shm17, Lemma 3.7] using our notation, for later reference.

Lemma 3.7

For any \(\nu \in {\mathcal {P}}({\mathbb {R}}^2)\), \(k\in {\mathbb {N}}\) and \(\theta \in S^1\),

$$\begin{aligned} \Vert R_k \Pi _\theta \nu \Vert _2^2 \approx \Vert \Pi _\theta R_k \nu \Vert _2^2. \end{aligned}$$

Next, we define the various sets of “bad projections”.

Definition 3.8

Given \(\mu \in {\mathcal {P}}([0,1)^2)\), \(x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\) and non-negative integers \(j,k, j_0, \ell \), we let

$$\begin{aligned} {{\,\mathrm{\mathbf {Bad}}\,}}(\mu ,x,j, k)&= \left\{ \theta \in S^1: \Vert \Pi _\theta \mu (x;j\rightarrow j+k) \Vert _2^2 \ge 2^{\varepsilon T k} {\mathcal {E}}_1(\mu (x;j\rightarrow j+k)) \right\} , \\ {{\,\mathrm{\mathbf {Bad}}\,}}_{j_0\rightarrow \ell }(\mu ,x)&= \bigcup \big \{ {{\,\mathrm{\mathbf {Bad}}\,}}(\mu ,x,j,k) : k \ge \tau j, \,\, j_0\le j \le j+k\le \ell \big \}. \end{aligned}$$

We underline that the definition of \({{\,\mathrm{\mathbf {Bad}}\,}}_{j_0\rightarrow \ell }(\mu ,x)\) depends on the parameters \(T, \varepsilon \) and \(\tau \). Note that, since \(\mu (x;j\rightarrow j+k)\) has a bounded density by definition, both quantities in the definition of \({{\,\mathrm{\mathbf {Bad}}\,}}(\mu ,x,j, k)\) are finite.

Our next goal is to combine Lemma 3.6 with the decomposition given by Corollary 3.5. Starting with a \(2^{-T\ell }\)-measure \(\mu \in {\mathcal {P}}([0,1)^2)\) and \(x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\), we define

$$\begin{aligned} {{\,\mathrm{\mathbf {Bad}}\,}}'_{j_0\rightarrow \ell }(\mu ,x) = {\left\{ \begin{array}{ll} {{\,\mathrm{\mathbf {Bad}}\,}}_{j_0\rightarrow \ell }(\mu _{X_j},x) &{} \text { if }x\in X_j\\ \varnothing &{} \text { if } x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\setminus \bigcup _i X_i \end{array}\right. }, \end{aligned}$$
(3.3)

where \((X_i)_{i=1}^N\) are the sets given by Corollary 3.5. Note that \({{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu _{X_j})=X_j\).

Lemma 3.9

There exists a further constant \(\varepsilon '=\varepsilon '(T,\varepsilon ,\tau )>0\) such that, for any \(2^{-T\ell }\)-measure \(\mu \in {\mathcal {P}}([0,1)^2)\),

$$\begin{aligned} |{{\,\mathrm{\mathbf {Bad}}\,}}'_{\varepsilon \ell \rightarrow \ell }(\mu ,x)| \lesssim _{T,\varepsilon ,\tau } 2^{-\varepsilon ' \ell } \quad \text {for all }x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu ). \end{aligned}$$

Proof

According to the definitions and Lemma 3.6, for any \(\nu \in {\mathcal {P}}([0,1)^2)\) and \(x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\nu )\),

$$\begin{aligned} \left| {{\,\mathrm{\mathbf {Bad}}\,}}_{j_0\rightarrow \ell }(\nu ,x)\right| \lesssim \sum _{j=j_0}^\infty \sum _{k=\lfloor \tau j\rfloor }^\infty 2^{-\varepsilon T k} \lesssim _{T,\varepsilon ,\tau } 2^{-\varepsilon T \tau j_0} . \end{aligned}$$

The point here is that the bound does not depend on \(\nu \) or x. Hence the claim follows with \(\varepsilon ' = \varepsilon ^2 T\tau \).\(\square \)

Finally, if \(\mu \in {\mathcal {P}}([0,1)^2)\) and \(x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\), we let

$$\begin{aligned} {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x) =\bigcup _{\ell =\ell _0}^\infty {{\,\mathrm{\mathbf {Bad}}\,}}'_{\varepsilon \ell \rightarrow \ell }(R_\ell \mu ,x). \end{aligned}$$
(3.4)

We record the following immediate consequence of Lemma 3.9 for later use.

Lemma 3.10

$$\begin{aligned} |{{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x)| \lesssim _{T,\varepsilon ,\tau } 2^{-\varepsilon ' \ell _0}, \end{aligned}$$

for all \(x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\), where \(\varepsilon '=\varepsilon '(T,\varepsilon ,\tau )>0\) is the constant from Lemma 3.9.

3.3 Radial projections.

The following result was recently established by Orponen [Orp17]. We state it only in the plane. We denote the radial projection with center y by \(P_y\), i.e. \(P_y(x)=(y-x)/|y-x|\in S^1\) is the (oriented) direction determined by x and y.

Proposition 3.11

Let \(\mu ,\nu \in {\mathcal {P}}([0,1)^2)\) be measures with disjoint supports, such that \({\mathcal {E}}_s(\mu )<\infty \), \({\mathcal {E}}_u(\nu )<\infty \) for some \(u>1\), \(2-u<s<1\). Then there is \(p=p(s,u)>1\) such that \(P_x\nu \) is absolutely continuous with a density in \(L^p(S^1)\) for \(\mu \) almost all x. Moreover,

$$\begin{aligned} \int \Vert P_x\nu \Vert _p^p \,d\mu (x) <\infty . \end{aligned}$$

Proof

This is stated in [Orp17, Equation (3.5)], except that Orponen deals with weighted measures \(\mu _y=|x-y|^{-1}d\mu \) instead of \(\mu \) (note that the roles of \(\mu \) and \(\nu \) are interchanged in [Orp17]). Since the weight \(|x-y|^{-1}\) is bounded away from 0 and \(\infty \) by the assumption that the supports of \(\mu \) and \(\nu \) are bounded and disjoint, the claim also holds for \(\mu \). \(\square \)

We point out that Proposition 3.11 uses the Fourier transform, and is the only point in the proofs of Theorems 1.2 and 1.3 that does (on the other hand, the proof of Theorem 1.4 relies heavily on the strongly Fourier-analytic approach of Mattila–Wolff).

Proposition 3.11 has the following key consequence. A similar statement was obtained in [Shm17] using a slightly more involved argument.

Proposition 3.12

Let \(\mu ,\nu \in {\mathcal {P}}([0,1)^2)\) have disjoint supports and satisfy \({\mathcal {E}}_s(\mu ), {\mathcal {E}}_u(\nu )<\infty \) for some \(s\in (0,2), u>\max (1,2-s)\). Then there exists \(\kappa =\kappa (\mu ,\nu )>0\) such that the following holds:

Suppose that \(\Theta \subset [0,1)^2\times S^1\) is a Borel set such that

$$\begin{aligned} (\mu \times {\mathcal {H}}^1)(\Theta ) \le \kappa . \end{aligned}$$

Then

$$\begin{aligned} (\mu \times \nu )\{ (x,y): P_y(x) \not \in \Theta _x \} > \frac{2}{3}, \end{aligned}$$

where \(\Theta _x = \{ \theta \in S^1: (x,\theta )\in \Theta \}\).

Proof

Since \({\mathcal {E}}_s(\mu )<\infty \) implies that \({\mathcal {E}}_{s'}(\mu )<\infty \) for all \(s'<s\), we may assume that \(s<1\). By Proposition 3.11, there is \(p>1\) such that

$$\begin{aligned} \int \Vert P_x\nu \Vert _p^p \,d\mu (x) =: C <\infty . \end{aligned}$$

As in the statement, \(\Theta _x = \{ \theta \in S^1: (x,\theta )\in \Theta \}\); write also \(-\Theta _x=\{ -\theta :\theta \in \Theta _x\}\). Using Fubini and Hölder, each twice, we estimate

$$\begin{aligned} (\mu \times \nu )\{(x,y): P_y(x)\in \Theta _x\}&= \int P_x\nu (-\Theta _x)\,d\mu (x)\\&\le \int {\mathcal {H}}^1(\Theta _x)^{1/p'} \Vert P_x\nu \Vert _p \,d\mu (x)\\&\le \left( \int {\mathcal {H}}^1(\Theta _x) d\mu (x)\right) ^{1/p'} \left( \int \Vert P_x\nu \Vert _p^p d\mu (x)\right) ^{1/p}\\&\le \kappa ^{1/p'} C^{1/p}. \end{aligned}$$

The claim follows by choosing \(\kappa \) so that \(\kappa ^{1/p'} C^{1/p}\le 1/3\). \(\square \)

4 Box-Counting Estimates for Pinned Distance Sets

In this section we derive a lower bound on box-counting numbers of pinned distance sets that will be crucial in the proofs of Theorems 1.2, 1.3 and 1.4. Our estimate will be in terms of a multiscale decomposition where, unlike previous works in the literature, we are allowed to choose the sequence of scales (depending on the set or measure for which we are seeking estimates). This additional flexibility will ultimately allow us to improve upon the easy bounds on the dimensions of distance sets.

To begin, we recall some basic facts about entropy. If \(\nu \in {\mathcal {P}}({\mathbb {R}}^d\)) and \({\mathcal {A}}\) is a finite partition of \({\mathbb {R}}^d\) (or of a set of full \(\nu \)-measure), then the entropy of \(\nu \) with respect to \({\mathcal {A}}\) is given by

$$\begin{aligned} H(\nu ,{\mathcal {A}}) = -\sum _{A\in {\mathcal {A}}} \nu (A) \log (\nu (A)), \end{aligned}$$

with the usual convention \(0\cdot \log 0=0\). It follows from the concavity of the logarithm that one always has

$$\begin{aligned} H(\nu ,{\mathcal {A}}) \le \log |{\mathcal {A}}|. \end{aligned}$$

Hence, a lower bound for \(H(\nu ,{\mathcal {D}}_j)\) provides a lower bound for \({\mathcal {N}}(A,j)\) if A is a Borel set of full measure (recall that \({\mathcal {N}}(A,j)\) denotes the number of elements in \({\mathcal {D}}_j\) that intersect A). We will apply this when \(\nu \) is supported on a pinned distance set. Although box-counting numbers in principle give bounds only for box dimension, together with standard mass pigeonholing arguments we will be able to get bounds also for Hausdorff and packing dimension.
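A minimal sketch (ours) of how an entropy lower bound turns into a box-counting bound, for a measure on \([0,1)\) given by samples:

```python
import numpy as np

def entropy(masses):
    """H(nu, A) = -sum_A nu(A) log2 nu(A), with the convention 0 log 0 = 0."""
    p = masses[masses > 0]
    return -np.sum(p * np.log2(p))

# nu = empirical distribution of (stand-in) pinned distances; at scale 2^{-jT},
# H(nu, D_j) <= log2(#intervals of positive mass) <= log2 N(supp(nu), j).
rng = np.random.default_rng(3)
dists = rng.random(10_000) ** 2          # toy stand-in for |x - y|, x ~ mu
jT = 6
masses = np.bincount((dists * 2 ** jT).astype(int), minlength=2 ** jT) / dists.size
print(entropy(masses), np.log2(np.count_nonzero(masses)))
```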

The following proposition is the key device that will allow us to bound from below the entropy of pinned distance measures (and hence also the box-counting numbers of pinned distance sets). Roughly speaking, we bound the entropy of the projection of a measure \(\mu \) under the pinned distance map by an average over both scales and space (the latter, weighted by \(\mu \)) of a quantity involving the \(L^2\) norms of projected local pinned distance measures. We emphasize that this method to bound the dimension of (linear or nonlinear) projections from below goes back in various forms to [Hoc12, Hoc14, Orp17b], although the use of projected \(L^2\) norms (rather than projected entropies) was first used in [Shm17].

Before stating the proposition we introduce some definitions. Given \(L\in {\mathbb {N}}\), a good partition of (0, L] is an integer sequence \(0=N_0<\cdots <N_q=L\) such that \(N_{j+1}-N_j\le N_j+1\). We write \(\Delta _y(x)=|x-y|\) for the pinned distance map, and \(\theta (x,y)=P_y(x)=(y-x)/|y-x|\).

Proposition 4.1

Let \(\mu \in {\mathcal {P}}([0,1)^d)\), let \(y\in {\mathbb {R}}^d\) be at distance \(\ge \varepsilon \) from \({{\,\mathrm{supp}\,}}(\mu )\), and fix a good partition \((N_i)_{i=0}^q\) of \((0,\ell ]\). Then

$$\begin{aligned} T\ell - H(\Delta _y\mu ,{\mathcal {D}}_\ell ) \le O_{T,\varepsilon }(q) + \sum _{i=0}^{q-1} \sum _{Q\in {\mathcal {D}}_{N_i}} \mu (Q) \log \Vert \Pi _{\theta (x_Q,y)}\mu (Q;N_{i+1}) \Vert _2^2, \end{aligned}$$
(4.1)

where \(x_Q\) is an arbitrary point in Q.

Proof

Write \(D_i=N_{i+1}-N_i\). Note that our \({\mathcal {D}}_i\) correspond to \({\mathcal {D}}_{T i}\) and our \(T N_i\) to \(m_i\) in [Shm17]. Recall also that \(\mu ^Q\) denotes the magnification of \(\mu _Q\) to the unit cube. It is shown in [Shm17, Proposition 3.8 and Remark 3.10] that

$$\begin{aligned} H(\Delta _y\mu ,{\mathcal {D}}_\ell ) \ge -O_{T,\varepsilon }(q) +\sum _{i=0}^{q-1} \sum _{Q\in {\mathcal {D}}_{N_i}} \mu (Q) H\left( \Pi _{\theta (y,x_Q)} \mu ^Q ,{\mathcal {D}}_{D_i}\right) . \end{aligned}$$
(4.2)

Applying Lemma 3.7 to \(\nu = \mu ^Q\) for some \(Q\in {\mathcal {D}}_{N_i}\) and \(k=D_i\), we get that

$$\begin{aligned} \Vert R_{D_i} \Pi _{\theta (y,x_Q)} \mu ^Q \Vert _2^2 \approx \Vert \Pi _{\theta (y,x_Q)} \mu (Q;N_{i+1}) \Vert _2^2. \end{aligned}$$
(4.3)

On the other hand, a simple convexity argument (see [Shm17, Lemma 3.6]) yields that, for any \(\nu \in {\mathcal {P}}({\mathbb {R}})\) and \(k\in {\mathbb {N}}\),

$$\begin{aligned} H(\nu ,{\mathcal {D}}_k) \ge Tk - \log \Vert R_k \nu \Vert _2^2. \end{aligned}$$

Applying this with \(k=D_i\) and \(\nu = \Pi _{\theta (y,x_Q)} \mu ^Q\), and recalling (4.3), we deduce that

$$\begin{aligned} H\left( \Pi _{\theta (y,x_Q)} \mu ^Q ,{\mathcal {D}}_{D_i}\right) \ge T D_i - \log \Vert \Pi _{\theta (y,x_Q)} \mu (Q;N_{i+1}) \Vert _2^2 - O(1). \end{aligned}$$

Using this bound in each term in the right-hand side of (4.2), and absorbing the sum of the \(q\) terms of size O(1) into \(O_{T,\varepsilon }(q)\), we get the claim. \(\square \)

We remark that the assumption that \(N_{j+1}-N_j\le N_j+1\) in the definition of good partition (which will play a crucial role later) arises from the linearization of the distance function, and cannot be substantially weakened. The key advantage of having \(L^2\) norms instead of entropies in this proposition is that the estimate one gets is robust under passing to subsets of moderately large measure:

Proposition 4.2

With the assumptions and notation from Proposition 4.1, let us write \({\mathcal {F}}(\mu )\) for the right-hand side of (4.1) (we assume y and the partition \((N_i)\) are fixed). If \(\mu \in {\mathcal {P}}([0,1)^2)\) and \(\nu =\mu _A\), where A is Borel and \(\mu (A)>0\), then

$$\begin{aligned} {\mathcal {F}}(\nu ) \le O_{T,\varepsilon }(q) + 2q\log \left( \tfrac{T\ell }{\mu (A)}\right) + \sum _{i=0}^{q-1} \sum _{Q\in {\mathcal {D}}_{N_i}} \nu (Q) \log \Vert \Pi _{\theta (y,x_Q)}\mu (Q;N_{i+1}) \Vert _2^2. \end{aligned}$$

Proof

We start with the trivial observation that if \(\rho ,\rho '\in {\mathcal {P}}({\mathbb {R}}^d)\) have an \(L^2\) density and \(\rho '(S)\le K\rho (S)\) for all Borel sets S, then the same bound transfers over to the densities for a.e. point, and so \(\Vert \rho '\Vert _2^2 \le K^2 \Vert \rho \Vert _2^2\).

Let \(\zeta =1/(T\ell )\in (0,1)\). Fix \(i\in \{0,\ldots ,q-1\}\), and note that

$$\begin{aligned} \sum \{ \nu (Q): Q\in {\mathcal {D}}_{N_i}, \nu (Q)< \zeta \mu (Q)\} < \zeta . \end{aligned}$$
(4.4)

Suppose \(\nu (Q) = \mu (A\cap Q)/\mu (A) \ge \zeta \mu (Q)>0\) for a given \(Q\in {\mathcal {D}}_{N_i}\). Then

$$\begin{aligned} \nu _Q(S) = \frac{\mu (A\cap Q\cap S)}{\mu (A\cap Q)} \le \frac{\mu (Q\cap S)}{\zeta \mu (A)\mu (Q)} = \frac{1}{\zeta \mu (A)} \mu _Q(S) \end{aligned}$$

for any Borel set \(S\subset [0,1)^2\). This domination is preserved under push-forwards and the action of \(R_{D_i}\) (where as before \(D_i = N_{i+1}-N_i\)), so in light of our initial observation we get

$$\begin{aligned} \Vert \Pi _{\theta (y,x_Q)} \nu (Q;N_{i+1}) \Vert _2^2 \le \frac{1}{(\zeta \mu (A))^2} \Vert \Pi _{\theta (y,x_Q)} \mu (Q;N_{i+1}) \Vert _2^2 , \end{aligned}$$

always assuming that \(\nu (Q)\ge \zeta \mu (Q)>0\) and \(Q\in {\mathcal {D}}_{N_i}\). Also, since the measure \(\Pi _{\theta (y,x_Q)} \mu (Q;N_{i+1})\) is supported on an interval of length \(\sqrt{2}\), it follows from Cauchy–Schwarz that

$$\begin{aligned} \Vert \Pi _{\theta (y,x_Q)} \mu (Q;N_{i+1}) \Vert _2^2 \ge 2^{-1/2}. \end{aligned}$$
(4.5)

On the other hand, for any \(2^{-T D}\)-measure \(\rho \) on \({\mathbb {R}}\) one has \(\Vert \rho \Vert _2^2 \le 2^{T D}\). In light of Lemma 3.7, this implies that

$$\begin{aligned} \Vert \Pi _{\theta (y,x_Q)}\nu (Q;N_{i+1}) \Vert _2^2 \lesssim 2^{T D_i}. \end{aligned}$$
(4.6)

Splitting (for each i) the sum \(\sum _{Q\in {\mathcal {D}}_{N_i}}\) in Proposition 4.1 into the cubes with \(\nu (Q) \ge \zeta \mu (Q)\) and \(\nu (Q)< \zeta \mu (Q)\), and recalling (4.4), we arrive at the estimate

$$\begin{aligned} {\mathcal {F}}(\nu ) \le {}&O_{T,\varepsilon }(q) + \zeta T\ell - 2q\log (\zeta \mu (A)) \\&+ \sum _{i=0}^{q-1} \sum _{\begin{array}{c} Q\in {\mathcal {D}}_{N_i} \\ \nu (Q)\ge \zeta \mu (Q) \end{array}} \nu (Q) \log \Vert \Pi _{\theta (y,x_Q)}\mu (Q;N_{i+1}) \Vert _2^2, \end{aligned}$$

where we merged the sum of the (\(\log \) of the) implicit constants in (4.6) into \(O_{T,\varepsilon }(q)\). Recalling that \(\zeta =1/(T\ell )\) and using (4.5) we get the desired result. \(\square \)

Our next goal is to get a simpler lower bound in the context of Proposition 4.2 when \(\mu \) is \(\sigma \)-regular (recall Definition 3.2), and \(\nu \) is the restriction of \(\mu \) to the set of points which are not bad in the sense of Section 3.2. Combining the results of Section 3.2 and Section 3.3, we will later be able to deal with general measures via a reduction to this special case.

We require some additional definitions:

Definition 4.3

We say that \(0=N_0<N_1<\cdots <N_q=L\) is a \(\tau \)-good partition of (0, L] if

$$\begin{aligned} \tau N_j \le N_{j+1}-N_j \le N_j+1 \end{aligned}$$
(4.7)

for every \(0\le j< q\). In other words \((N_j)\) is a good partition and additionally \(N_{j+1}\ge (1+\tau ) N_j\).

Given a finite sequence \((\sigma _1,\ldots ,\sigma _L)\in {\mathbb {R}}^L\), let

$$\begin{aligned} {\mathcal {S}}(\sigma ) = -\min _{j=0}^L \left( \sigma _1+\cdots +\sigma _j \right) \ge 0. \end{aligned}$$

For any good partition \({\mathcal {P}}=(N_j)_{j=0}^q\) of (0, L] and any \(\sigma \in {\mathbb {R}}^L\), we denote

$$\begin{aligned} {\mathbf {M}}(\sigma ,{\mathcal {P}}) = \sum _{j=0}^{q-1} {\mathcal {S}}(\sigma |(N_j,N_{j+1}]), \end{aligned}$$

where \(\sigma |I\) denotes the restriction of the sequence \(\sigma \) to the interval I.

Finally, given \(\sigma \in {\mathbb {R}}^L\) and \(\tau \in (0,1)\), we let

$$\begin{aligned} {\mathbf {M}}_\tau (\sigma ) = \min \{ {\mathbf {M}}(\sigma ,{\mathcal {P}}) : {\mathcal {P}} \text { is a }\tau \text {-good partition of } (0,L]\}. \end{aligned}$$

Recall that \(o_{T,\varepsilon }(1)\) denotes a function of T and \(\varepsilon \) which tends to 0 as \(T\rightarrow \infty ,\varepsilon \rightarrow 0^+\).

Proposition 4.4

Suppose that \(\rho \in {\mathcal {P}}([0,1)^2)\) is a \((\sigma _1,\ldots ,\sigma _\ell )\)-regular measure. Assume that there are a Borel set \(A\subset [0,1)^2\), a point \(y\in {\mathbb {R}}^2\) and a number \(\beta \in (0,1)\) such that \(\rho (A)>0\), \(\mathrm {dist}(y,{{\,\mathrm{supp}\,}}(\rho ))\ge \varepsilon \), and for all \(x\in A\cap {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\rho )\) there is \(\widetilde{x} \in {\mathcal {D}}_\ell (x)\) such that

$$\begin{aligned} \theta (\widetilde{x},y) \notin {{\,\mathrm{\mathbf {Bad}}\,}}_{\beta \ell \rightarrow \ell }(\rho ,\widetilde{x}). \end{aligned}$$

Then

$$\begin{aligned} \frac{\log {\mathcal {N}}(\Delta _y A,\ell )}{T\ell } \ge 1 - \frac{{\mathbf {M}}_\tau (\sigma )}{\ell } - {{\,\mathrm{Error}\,}}, \end{aligned}$$

where

$$\begin{aligned} {{\,\mathrm{Error}\,}}= 2\beta + o_{T,\varepsilon }(1) + O_{T,\varepsilon ,\tau }\left( \frac{\log ^2\ell }{\ell }\right) + \frac{O_\tau (\log \ell )\log (1/\rho (A))}{\ell }. \end{aligned}$$

Proof

Let \({\mathcal {P}}=(N_i)_{i=0}^q\) be a \(\tau \)-good partition of \((0,\ell ]\). We have to show that

$$\begin{aligned} \frac{\log {\mathcal {N}}(\Delta _y A,\ell )}{T\ell } \ge 1 - \frac{{\mathbf {M}}(\sigma ,{\mathcal {P}})}{\ell } - {{\,\mathrm{Error}\,}}. \end{aligned}$$

Fix \(i_0\) as the smallest value of i such that \(N_i\ge \beta \ell \), and note that \(N_{i_0}< 2\beta \ell +1\).

Let us rewrite the inequality from Proposition 4.2 applied to \(\rho \) and \(\rho _A\) in the form

$$\begin{aligned} {\mathcal {F}}(\rho _A) \le E + \Sigma _{\text {I}} + \Sigma _{\text {II}}, \end{aligned}$$

where

$$\begin{aligned} E&= O_{T,\varepsilon }(q) + 2q\log \left( \tfrac{T\ell }{\rho (A)}\right) ,\\ \Sigma _{\text {I}}&= \sum _{i=0}^{i_0-1} \sum _{Q\in {\mathcal {D}}_{N_i}: \rho (A\cap Q)>0} \rho _A(Q) \log \Vert \Pi _{\theta (y,x_Q)}\rho (Q;N_{i+1}) \Vert _2^2,\\ \Sigma _{\text {II}}&= \sum _{i=i_0}^{q-1} \sum _{Q\in {\mathcal {D}}_{N_i}: \rho (A\cap Q)>0} \rho _A(Q) \log \Vert \Pi _{\theta (y,x_Q)}\rho (Q;N_{i+1}) \Vert _2^2, \end{aligned}$$

where \(x_Q\) are arbitrary points in Q. By assumption, we may choose these points so that

$$\begin{aligned} \theta (x_Q,y)\notin {{\,\mathrm{\mathbf {Bad}}\,}}_{\beta \ell \rightarrow \ell }(\rho ,x_Q). \end{aligned}$$
(4.8)

Using that \((1+\tau )^q\le \ell \), we bound

$$\begin{aligned} E \le O_{T,\varepsilon ,\tau }(\log ^2\ell )+ O_\tau (\log \ell ) \log (1/\rho (A)). \end{aligned}$$
(4.9)

Write \(D_i=N_{i+1}-N_i\). To estimate \(\Sigma _{\text {I}}\), we use the trivial bound \(\Vert R_{D_i}(\cdot )\Vert _2^2 \le 2^{D_i T}\) together with Lemma 3.7 and the bounds \(N_{i_0}<2\beta \ell +1\), \((1+\tau )^q \le \ell \), so that

$$\begin{aligned} \begin{aligned} \Sigma _{\text {I}}&\le \sum _{i=0}^{i_0-1} \sum _{Q\in {\mathcal {D}}_{N_i}} \rho _A(Q) (D_i T + O(1)) \\&\le N_{i_0} T + O(i_0) \\&\le 2\beta T\ell + O_{T,\tau }(\log \ell ). \end{aligned} \end{aligned}$$
(4.10)

Now, to estimate the main term \(\Sigma _{\text {II}}\), we need to go back to Definition 3.8. By (4.8), and using that \({\mathcal {P}}\) is a \(\tau \)-good partition of \((0,\ell ]\), we have \(\theta (x_Q,y)\notin {{\,\mathrm{\mathbf {Bad}}\,}}(\rho ,x_Q,N_i,D_i)\) for \(i_0 \le i < q\). We deduce that

$$\begin{aligned} \log \Vert \Pi _{\theta (x_Q,y)}\rho (Q;N_{i+1}) \Vert _2^2 \le \varepsilon T D_i +\log {\mathcal {E}}_1(\rho (Q;N_{i+1})), \end{aligned}$$

for \(i_0 \le i < q\). On the other hand, by the assumption that \(\rho \) is \((\sigma _1,\ldots ,\sigma _\ell )\)-regular, and since \({\mathcal {P}}\) is a good partition of \((0,\ell ]\), the measure \(\rho (Q;N_{i+1})\) is \((\sigma _{N_i+1},\ldots ,\sigma _{N_{i+1}})\)-regular. Hence, using Lemma 3.3, we obtain

$$\begin{aligned} \log {\mathcal {E}}_1(\rho (Q;N_{i+1})) \le O(D_i) + O_T(1) + T {\mathcal {S}}(\sigma |(N_i,N_{i+1}]). \end{aligned}$$

Combining the last two displayed formulas, we deduce that

$$\begin{aligned} \log \Vert \Pi _{\theta (x_Q,y)} \rho (Q;N_{i+1}) \Vert _2^2 \le o_{T,\varepsilon }(1) T D_i+ O_T(1) + T {\mathcal {S}}(\sigma |(N_i,N_{i+1}]). \end{aligned}$$

Adding up from \(i=i_0\) to \(q-1\) and again using \(q= O_{\tau }(\log \ell )\), we get

$$\begin{aligned} \Sigma _{\text {II}} \le o_{T,\varepsilon }(1) T \ell + O_{T,\tau }(\log \ell )+ T {\mathbf {M}}(\sigma ,{\mathcal {P}}). \end{aligned}$$
(4.11)

Combining (4.9), (4.10) and (4.11), we conclude that

$$\begin{aligned} \frac{1}{T\ell }{\mathcal {F}}(\rho _A) \le \frac{1}{\ell }{\mathbf {M}}(\sigma ,{\mathcal {P}}) + {{\,\mathrm{Error}\,}}, \end{aligned}$$

where \({{\,\mathrm{Error}\,}}\) is as in the statement. Recall that \({\mathcal {F}}(\mu )\) denotes the right-hand side of (4.1) in Proposition 4.1. Now Proposition 4.1 guarantees that

$$\begin{aligned} \frac{1}{T\ell } H(\Delta _y\rho _A,{\mathcal {D}}_\ell ) \ge 1 - \frac{1}{\ell }{\mathbf {M}}(\sigma ,{\mathcal {P}}) - {{\,\mathrm{Error}\,}}. \end{aligned}$$

Since \(H(\mu ,{\mathcal {A}}) \le \log |{\mathcal {A}}|\) for any finite Borel partition \({\mathcal {A}}\) of a set of full \(\mu \)-measure, applying this to the family of intervals of \({\mathcal {D}}_\ell \) that meet \(\Delta _y A\) (there are \({\mathcal {N}}(\Delta _y A,\ell )\) of them, and they carry the full mass of \(\Delta _y\rho _A\)) finishes the proof. \(\square \)

Note that in this proposition, the sequence \(\sigma \) depends on the measure \(\rho \) and the bound is in terms of \({\mathbf {M}}_\tau (\sigma )\) (we will be able to make the error term arbitrarily small). Thus we are led to the combinatorial problem of minimizing \({\mathbf {M}}(\sigma ,(N_i))\) over all \(\tau \)-good partitions for a given \(\sigma \in [-1,1]^\ell \). This problem will be tackled in the next section: see Proposition 5.23, and also Proposition 5.24 for the case in which we are allowed to restrict \(\sigma \) to (0, L] for some large L.

5 Finding Good Scale Decompositions: Combinatorial Estimates

5.1 An optimization problem for Lipschitz functions.

We begin by defining suitable analogs of the concepts from Definition 4.3 for Lipschitz functions, instead of \([-1,1]\)-sequences.

Definition 5.1

A sequence \((a_n)_{n=0}^\infty \) is a partition of the interval [0, a] if \(a=a_0>a_1>\cdots >0\) and \(a_n\rightarrow 0\); it is a good partition if we also have \(a_{k-1} / a_{k} \le 2\) for every \(k\ge 1\).

A sequence \((a_n)_{n=0}^{\infty }\) is a \(\tau \)-good partition for a given \(0< \tau <1\) if it is a good partition and we also have \(a_{k-1} / a_{k} \ge 1+\tau \) for every \(k\ge 1\).

Let \(f:[0,a]\rightarrow {\mathbb {R}}\) be continuous and \((a_n)\) be a partition of [0, a]. By the total drop of f according to \((a_n)\) we mean

$$\begin{aligned} {\mathbf {T}}(f,(a_n))=\sum _{n=1}^\infty f(a_n)-\min _{[a_n,a_{n-1}]} f, \end{aligned}$$

and we also introduce the notation

$$\begin{aligned} {\mathbf {T}}(f)= & {} \inf \{{\mathbf {T}}(f,(a_n))\ :\ (a_n) \text{ is } \text{ a } \text{ good } \text{ partition } \text{ of } [0,a]\},\\ {\mathbf {T}}_\tau (f)= & {} \inf \{{\mathbf {T}}(f,(a_n))\ :\ (a_n) \text{ is } \text{ a } \tau \text{-good } \text{ partition } \text{ of } [0,a]\}. \end{aligned}$$

We call the interval \([a_{n},a_{n-1}]\) increasing if \(\min _{[a_n,a_{n-1}]} f=f(a_n)\) and decreasing if \(\min _{[a_n,a_{n-1}]} f=f(a_{n-1})\). (Note that f need not be increasing or decreasing on \([a_n,a_{n-1}]\).)

In this section we investigate the following question: given a 1-Lipschitz function \(f:[0,a]\rightarrow {\mathbb {R}}\) satisfying certain bounds, how large can \({\mathbf {T}}(f)\) and \({\mathbf {T}}_\tau (f)\) be?

First we study \({\mathbf {T}}(f)\). Later we show (see Corollary 5.20) that for small \(\tau \) the quantities \({\mathbf {T}}(f)\) and \({\mathbf {T}}_\tau (f)\) are close. Finally, from the bounds on \({\mathbf {T}}_\tau (f)\) we deduce corresponding bounds on \({\mathbf {M}}_\tau (\sigma )\): see for example Proposition 5.23. Hence this problem is closely related to that of minimizing the dimension loss when estimating the dimension of the pinned distance set via Proposition 4.4. Dealing first with Lipschitz functions rather than \([-1,1]\)-sequences allows us to avoid certain technicalities and make the arguments more transparent.
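Before turning to the proofs, the following minimal numerical sketch (ours, not part of the paper; the function name total_drop, the truncation depth and the sampling resolution are ad hoc choices) illustrates Definition 5.1 for \(a=1\) and the tent function \(f(x)=\min (x,1-x)\): the halving partition \(a_n=2^{-n}\) has total drop 1/2, while the partition \(1,2/3,1/3,1/6,1/12,\ldots \) achieves 1/3, which (as will follow from Proposition 5.2 and Remark 5.16) is the optimal value \({\mathbf {T}}(f)\).

import numpy as np

def total_drop(f, partition, samples=1000):
    # T(f,(a_n)) from Definition 5.1, truncated to a finite decreasing partition:
    # the sum over n of f(a_n) - min of f over [a_n, a_{n-1}].
    drop = 0.0
    for lo, hi in zip(partition[1:], partition[:-1]):
        xs = np.linspace(lo, hi, samples)
        drop += f(lo) - min(f(x) for x in xs)
    return drop

f = lambda x: min(x, 1 - x)                                  # tent function (a = 1, D = 0)
halving = [2.0 ** (-n) for n in range(30)]                   # good partition, all ratios 2
tuned = [1.0, 2 / 3] + [2.0 ** (-n) / 3 for n in range(28)]  # ratios 3/2, 2, 2, ...
print(total_drop(f, halving))  # 0.5
print(total_drop(f, tuned))    # 0.333... = 1/3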

The basic result is the following.

Proposition 5.2

Let \(a>0\), \(-1\le D < C \le 1\) be given parameters such that \(C\ge 2D\). Let \(f:[0,a]\rightarrow {\mathbb {R}}\) be a 1-Lipschitz function such that \(Dx\le f(x)\le Cx\) for every \(x\in [0,a]\). Then

$$\begin{aligned} {\mathbf {T}}(f)\le \frac{(a-f(a))(C-2D)}{1+2C-3D} \le a\cdot \frac{(1-D)(C-2D)}{1+2C-3D}. \end{aligned}$$
(5.1)

Proof

Since \(f(a)\ge Da\) and \(a>0\), the second inequality of (5.1) is clear, so it is enough to prove the first inequality.

Let

$$\begin{aligned} h=\frac{C-2D}{1+C-D} \quad \text {and} \quad \rho =\frac{C-2D}{1+2C-3D}. \end{aligned}$$

Note that

$$\begin{aligned} h=\frac{\rho }{1-\rho } \quad \text {and} \quad \rho =\frac{h}{h+1} \end{aligned}$$
(5.2)

and \(h,\rho \ge 0\): the numerators are nonnegative since we assumed \(C\ge 2D\), while the denominators are positive because \(C\ge D\) gives \(1+C-D\ge 1\) and \(2C\ge 3D\) (the latter follows from \(C\ge 2D\) when \(D\ge 0\) and from \(C\ge D\) when \(D<0\)).
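For the reader's convenience, the first identity in (5.2) is the direct computation

$$\begin{aligned} \frac{\rho }{1-\rho } = \frac{C-2D}{(1+2C-3D)-(C-2D)} = \frac{C-2D}{1+C-D} = h, \end{aligned}$$

and the second follows by solving the first for \(\rho \).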

We will construct a good partition \((a_n)\) with the following two extra properties:

(*) every interval \([a_n,a_{n-1}]\) (\(n=1,2,\ldots \)) is either increasing or decreasing (recall Definition 5.1), and

(**) if \([a_k,a_{k-1}],\ldots ,[a_{l+1},a_l]\) (\( k \ge l+1 \ge 1\)) is a maximal block of consecutive decreasing intervals, then

$$\begin{aligned} \frac{f(a_k)-f(a_l)}{a_l-a_k} \le h=\frac{C-2D}{1+C-D}. \end{aligned}$$

First we show that this is enough to prove our claim. Let \(a=a'_0>a'_1>\ldots \) be the endpoints of the maximal blocks of consecutive intervals of the same type (increasing or decreasing). It easily follows from the definitions and telescoping that \({\mathbf {T}}(f,(a_n))={\mathbf {T}}(f,(a'_k))\). Hence to obtain (5.1) it is enough to prove

$$\begin{aligned} {\mathbf {T}}(f,(a'_k)) \le \rho \cdot (a-f(a)). \end{aligned}$$
(5.3)

We claim that

$$\begin{aligned} f(a'_k)-\min _{[a'_k,a'_{k-1}]} f \le \rho ((a'_{k-1}-f(a'_{k-1})) - (a'_k-f(a'_k))) \qquad (k=1,2,\ldots ).\nonumber \\ \end{aligned}$$
(5.4)

Indeed, by construction, the interval \([a'_{k},a'_{k-1}]\) is either increasing or decreasing. If it is increasing then

$$\begin{aligned} f(a'_k)-\min _{[a'_k,a'_{k-1}]} f = 0 \le \rho ((a'_{k-1}-f(a'_{k-1})) - (a'_k-f(a'_k))) \end{aligned}$$

since f is 1-Lipschitz and \(a'_k<a'_{k-1}\).

If \([a'_{k},a'_{k-1}]\) is decreasing then, using first (**) and the fact that \(\rho <1\), and then (5.2), we get

$$\begin{aligned} f(a'_k)-\min _{[a'_k,a'_{k-1}]} f&= f(a'_{k})-f(a'_{k-1}) \\&\le \rho (f(a'_{k})-f(a'_{k-1})) + (1-\rho )h(a'_{k-1}-a'_{k})\\&= \rho (f(a'_{k})-f(a'_{k-1})) + \rho (a'_{k-1}-a'_{k}), \end{aligned}$$

Since the last expression equals \(\rho ((a'_{k-1}-f(a'_{k-1})) - (a'_k-f(a'_k)))\), this completes the proof of (5.4).

By adding up (5.4) for \(k=1,2,\ldots \) and using that \(a'_0=a\), \(a'_k\rightarrow 0\) and \(f(a'_k)\rightarrow 0\) we get (5.3), which implies (5.1).

Therefore it is enough to construct a good partition \((a_n)\) with properties (*) and (**). Let \(a_0=a\) and suppose that \(a_0>\cdots>a_{n}>0\) are already constructed with properties (*) and (**) (up to n).

We distinguish three cases.

Case 1. \(\min _{[a_n/2,a_n]} f < f(a_n)\).

In this case let \(a_{n+1}\in [a_n/2,a_n]\) be the smallest number such that \(f(a_{n+1})=\min _{[a_n/2,a_n]}f\). Then \([a_{n+1},a_n]\) is an increasing interval and so (*) and (**) still hold and we can continue the procedure.

Case 2. \(\min _{[a_n/2,a_n]} f = f(a_n)\) and \(f(a_n/2)-f(a_n)\le h\cdot (a_n - a_n/2)\).

In this case let \(a_{n+1}=a_n/2\), and again (*), (**) hold for the extended sequence and we can continue the procedure.

Case 3. \(\min _{[a_n/2,a_n]} f = f(a_n)\) and \(f(a_n/2)-f(a_n)> h\cdot (a_n - a_n/2)\).

First we claim that \(h\ge -D\). Indeed, since \(-1\le D\le C\) we have

$$\begin{aligned} 0\le (C-D)(D+1)=-D+CD-D^2+C, \end{aligned}$$

which implies that

$$\begin{aligned} -D(1+C-D) \le C-2D , \end{aligned}$$

and this implies \(h\ge -D\).

Since \(h\ge -D\) and \(f(x)\ge Dx\) we have \(f(a_n)\ge Da_n\ge -ha_n\) and so

$$\begin{aligned} f(0)-f(a_n)=-f(a_n)\le h a_n = h (a_n-0). \end{aligned}$$

This and the assumption \(f(a_n/2)-f(a_n)> h\cdot (a_n - a_n/2)\) imply that there exists a largest \(b\in [0,a_n/2)\) such that

$$\begin{aligned} f(b)-f(a_n)=h(a_n-b). \end{aligned}$$
(5.5)

Now our goal is to find a sequence \(b=b_0<b_1<\cdots <b_M=a_n\) with \(M\ge 2\) such that

$$\begin{aligned} \min _{[b_{i-1},b_i]}f=f(b_i),\ b_i/b_{i-1}\le 2\ (i=1,\ldots , M), \quad b_i/b_{i-2}\ge 2\ (i=2,\ldots ,M).\qquad \end{aligned}$$
(5.6)

The sequence \((b_i)\) is constructed by induction. Let \(b_0=b\). Suppose that \(m\ge 0\), \(b=b_0<\cdots<b_m< a_n\) are already constructed and (5.6) holds for \(M=m\). If \(b_m\ge a_n/2\) then we can take \(b_{m+1}=a_n\) and \(M=m+1\). Then the construction is completed and (5.6) holds.

Now consider the case \(b_m< a_n/2\). Let \(b_{m+1}\in [b_m,2b_m]\) be maximal such that \(f(b_{m+1})=\min _{[b_m,2b_m]}f\). Our goal is to show that \(b_{m+1}>b_m\). For this it is enough to show that \(f(2b_m)\le f(b_m)\).

Using that b is the largest number in \([0,a_n/2)\) for which (5.5) holds, \(b_m\ge b\) and \(f(a_n/2)-f(a_n)> h\cdot (a_n - a_n/2)\), we get (otherwise, by continuity, (5.5) would have a solution in \((b_m,a_n/2)\), contradicting the maximality of b)

$$\begin{aligned} f(b_m)-f(a_n)\ge h(a_n-b_m). \end{aligned}$$
(5.7)

Hence to get \(f(2b_m)\le f(b_m)\) it is enough to show that

$$\begin{aligned} f(2b_m)-f(a_n)\le h(a_n-b_m). \end{aligned}$$
(5.8)

Using (5.7) and \(Dx\le f(x)\le Cx\) we get

$$\begin{aligned} h(a_n-b_m)\le f(b_m)-f(a_n)\le Cb_m-Da_n, \end{aligned}$$

which implies that

$$\begin{aligned} (D+h)a_n\le (C+h)b_m. \end{aligned}$$

Direct calculation shows that \(D+h=(C-D)(1-h)\) and \(C+h=(C-D)(2-h)\). Thus the last inequality and \(D<C\) imply that

$$\begin{aligned} (1-h)a_n\le (2-h)b_m. \end{aligned}$$

Hence, using also that f is 1-Lipschitz and \(b_m<a_n/2\), we obtain

$$\begin{aligned} h(a_n-b_m)\ge a_n - 2b_m \ge f(2b_m)-f(a_n). \end{aligned}$$

This completes the proof of (5.8) and so also the proof of \(b_{m+1}>b_m\). It is easy to see that (5.6) holds for \(M=m+1\). Note also that the property \(b_i/b_{i-2}\ge 2\) implies that the construction of the sequence \((b_i)\) is completed after finitely many steps.

Now, to finish Case 3 we take \(a_{n+j}=b_{M-j}\) for \(j=1,\ldots ,M\). Then (*) and (**) hold (up to \(n+M\)) and so the procedure can be continued.

This way we obtain a sequence \(a=a_0>a_1>\cdots >0\) that forms a good partition with (*) and (**), provided \(a_n\rightarrow 0\). Therefore it remains to prove that \(a_n\rightarrow 0\).

Since \(a_{n+1}=a_{n}/2\) when Case 2 is applied and \(a_{n+M}=b_0=b\le a_n/2\) in Case 3, we are done if Case 2 or Case 3 is applied infinitely many times. It is easy to see that if both \(a_{n+1}\) and \(a_{n+2}\) were obtained from Case 1, then we have \(a_{n}/a_{n+2}\ge 2\). Thus \(a_n\rightarrow 0\), which completes the proof. \(\square \)
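To illustrate the three cases (a worked example of ours, not part of the original argument), take \(a=1\), \(D=0\), \(C=1\) and the tent function \(f(x)=\min (x,1-x)\), so that \(h=1/2\). At \(a_0=1\) we have \(\min _{[1/2,1]} f=f(1)\) and \(f(1/2)-f(1)=1/2>h\cdot \tfrac{1}{2}\), so Case 3 applies: (5.5) gives \(b=1/3\), the sequence \(b_0=1/3<b_1=2/3<b_2=1\) satisfies (5.6), and we obtain \(a_1=2/3\), \(a_2=1/3\). From \(a_2\) on, Case 1 applies at every step and the partition continues by halving. The resulting total drop is

$$\begin{aligned} {\mathbf {T}}(f,(a_n)) = f(a_1)-\min _{[a_1,a_0]} f = \tfrac{1}{3}-0 = \tfrac{1}{3}, \end{aligned}$$

all other terms vanishing, which matches the bound (5.1), here equal to \(\frac{(1-f(1))(1-0)}{1+2-0}=\frac{1}{3}\).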

5.2 Small drop on initial segments.

The results in this subsection are required in the proof of Theorem 1.3. We aim to minimize \({\mathbf {T}}(f|[0,u])/u\), where \(u>0\) is a new parameter that we are allowed to choose, subject to not being too small. The analysis will be strongly based on the study of hard points, which we now define:

Definition 5.3

If \(f:[0,a]\rightarrow {\mathbb {R}}\) is a function, we say that \(p\in [0,a]\) is a hard point of f if \(\min _{[p/2,p]} f=f(p)\).

We will say that a function f defined on an interval I is piecewise linear if I can be decomposed into finitely many intervals such that f is linear on each of them.

Lemma 5.4

Let \(f:[0,a]\rightarrow {\mathbb {R}}\) be a 1-Lipschitz function, which is piecewise linear on every closed subinterval of (0, a]. Then:

  1. (i)

    The set of hard points of f can be written as a (possibly empty) finite or infinite union of closed (possibly degenerate) intervals \(H = \cup _j [u_j,v_j]\) such that \(v_1\ge u_1>v_2\ge u_2>\ldots \) and every closed subinterval of (0, a] intersects only finitely many \([u_j,v_j]\).

  2. (ii)

    We have

    $$\begin{aligned} {\mathbf {T}}(f)=\sum _{j} f(u_j)-f(v_j), \end{aligned}$$
    (5.9)

    where the empty sum is meant to be zero.

Proof

The first statement is easy, using that f is piecewise linear.

First we prove \(\ge \) in (5.9). Let \((a_n)\) be a good partition of [0, a] and let \(a=a'_0>a'_1>\ldots \) be an ordered enumeration of the set \(\{a_n\} \cup \{u_j\} \cup \{v_j\}\). It is easy to check that \((a'_n)\) is also a good partition of [0, a], and that by inserting a hard point of f into a good partition \((a_n)\), the value of \({\mathbf {T}}(f,(a_n))\) is not changed. Thus \({\mathbf {T}}(f,(a'_n))={\mathbf {T}}(f,(a_n))\). Now every \([u_j,v_j]\) is of the form \([u_j,v_j]=\cup _{n=n_j}^{m_j} [a'_{n},a'_{n-1}]\). Since f must be nonincreasing on any interval \([u_j,v_j]\) we obtain

$$\begin{aligned} f(u_j)-f(v_j)= \sum _{n=n_j}^{m_j} f(a'_n)-\min _{[a'_n,a'_{n-1}]} f \end{aligned}$$

for every j. Adding up, and using that \(f(a'_n)-\min _{[a'_n,a'_{n-1}]} f\ge 0\) and \({\mathbf {T}}(f,(a'_n))={\mathbf {T}}(f,(a_n))\) we get the claim.

To prove the other inequality we construct by induction a good partition of [0, a] such that \({\mathbf {T}}(f,(a_n))\le \sum _{j} f(u_j)-f(v_j)\). Let \(a_0=a\). Suppose that \(a_0,\ldots ,a_n\) are already defined.

Case 1. If \(a_n\in (u_j,v_j]\) for some j then choose \(k\ge 1\) and \(a_n>a_{n+1}>\cdots >a_{n+k}=u_j\) so that \(a_{n+i-1}/a_{n+i}\le 2\) for \(i=1,\ldots ,k\).

Case 2. Otherwise let \(a_{n+1}\in [a_n/2,a_n]\) be the smallest number for which \(f(a_{n+1})=\min _{[a_n/2,a_n]}f\). We claim that \(a_{n+1}<a_n\). If \(a_n\not \in H\) then this is clear from the definition. Since the only points of H that are not handled in the previous case are the left endpoints of the intervals \([u_j,v_j]\) we can suppose that \(a_n=u_j\) for some j. By the piecewise linearity of f, there exists \(w\in (u_j/2, u_j)\) such that f is linear on \([w,u_j]\) and \(w>v_{j+1}\). Since \(u_j\) is a hard point, f cannot be increasing on \([w,u_j]\). If f is constant on \([w,u_j]\) then \(a_{n+1}\le w<u_j=a_n\), so we are done. So we can suppose that f is decreasing on \([w,u_j]\). Since \(w>v_{j+1}\), every \(x\in [w,u_j)\) is not hard, so there exists an \(x'\in [x/2,x)\) such that \(f(x')<f(x)\). Since f is decreasing on \([w,u_j]\), \(x'<w\). By the continuity of f, this implies that there exists \(x_0\in [u_j/2,w]\) such that \(f(x_0)\le f(u_j)\). Thus indeed \(a_{n+1}<u_j=a_n\).

Note that if Case 2 was applied to obtain both \(a_{n+1}\) and \(a_{n+2}\) then \(a_n/a_{n+2}\ge 2\). This implies that \(a_n\rightarrow 0\), so \((a_n)\) is a good partition of [0, a]. It remains to show that \({\mathbf {T}}(f,(a_n))\le \sum _j f(u_j)-f(v_j)\).

If \(a_n\) was obtained in Case 1 then \([a_n,a_{n-1}]\) is a subinterval of some \([u_j,v_j]\) and \(f(a_n)-\min _{[a_n,a_{n-1}]} f=f(a_n)-f(a_{n-1})\). If \(a_n\) was obtained in Case 2 then \(f(a_n)-\min _{[a_n,a_{n-1}]} f=0\). Note also that f is nonincreasing on each \([u_j,v_j]\) since all points of \([u_j,v_j]\) are hard points of f. These show that indeed \({\mathbf {T}}(f,(a_n))\le \sum _j f(u_j)-f(v_j)\), which completes the proof. \(\square \)

The next proposition (or rather, the discrete corollary given in Proposition 5.24 below) will be crucial to get estimates on the packing dimension of the pinned distance sets.

Proposition 5.5

Let \(a>0\) and \(0\le D < 1/2\) be given parameters. Let \(f:[0,a]\rightarrow {\mathbb {R}}\) be a 1-Lipschitz function, which is piecewise linear on every closed subinterval of (0, a], and suppose that \(f(0)=0\) and \(Dx\le f(x)\) for every \(x\in [0,a]\). Let

$$\begin{aligned} \Phi (D)=\frac{2-D-\sqrt{3-3D^2}}{4}. \end{aligned}$$

Then for every \(\delta \in (0,1/2)\) there exists \(u\in [3a\Phi (D)2^{-1/\delta },a]\) such that

$$\begin{aligned} {\mathbf {T}}(f|[0,u]) < u\cdot (\Phi (D) + \delta (2-4\log \delta )). \end{aligned}$$
(5.10)
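(For orientation: \(\Phi (0)=\frac{2-\sqrt{3}}{4}=0.0669\ldots \) and \(\Phi (1/2)=0\), so for small D and \(\delta \) the proposition produces an initial segment [0, u] on which the total drop is at most about \(6.7\%\) of u.)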

Proof

Let \(H\subset [0,a]\) be the set of hard points of f. If \(H=\emptyset \) then by Lemma 5.4, \({\mathbf {T}}(f)=0\), so \(u=a\) is clearly a good choice in this case. So suppose that H is nonempty.

First we briefly explain the idea of the proof in this nontrivial case. For simplicity, suppose that \(a=1\) and \(D=0\), which is the most interesting case anyway. Assume that the maximum of \(f(x)/x\) on \(H\cap (0,1]\) exists and is attained at u, and let B be this maximum. Since u is a hard point, \(f(x)\ge f(u)=Bu\) on \([u/2,u]\), and a calculation using that f is 1-Lipschitz shows that

$$\begin{aligned} f(x)> Bx \quad \text {if} \quad u'<x<u, \quad \text {where }u'=\frac{1/2-B}{1-B}u. \end{aligned}$$
(5.11)

Let \(F(x)=\min (f(x),2B x)\). Then it is not hard to show (see below for details) that every \(p\in H\cap [0,u]\) is also a hard point of F and that \(F=f\) on \(H\cap [0,u]\). By Lemma 5.4 this implies that \({\mathbf {T}}(f|[0,u])={\mathbf {T}}(F|[0,u])\), so we can study F|[0, u] instead of f|[0, u]. Let v be the largest number in [0, u) such that \(F(v)=Bv\). It follows from (5.11) that also \(F(x)> Bx\) if \(u'<x<u\), so we must have \(v\le u'\), and hence

$$\begin{aligned} v-F(v)=v(1-B)\le u(1/2-B) \end{aligned}$$

and \(F(x)>Bx\) on \((v,u)\). Since \(F(x)\le 2Bx\), for any hard point y of F we must have \(F(y)\le By\), and this implies that F has no hard point in \((v,u)\). By Lemma 5.4 this implies that \({\mathbf {T}}(F|[0,u])={\mathbf {T}}(F|[0,v])\). Again using that \(F(x)\le 2Bx\), we can apply Proposition 5.2 on [0, v] to obtain

$$\begin{aligned} {\mathbf {T}}(f|[0,u])={\mathbf {T}}(F|[0,u])={\mathbf {T}}(F|[0,v])\le \frac{(v-F(v))(2B-0)}{1+2\cdot 2B - 3\cdot 0} \le \frac{u(1/2-B)2B}{1+4B}. \end{aligned}$$

Calculus shows that \(\frac{(1/2-B)2B}{1+4B}\le \frac{2-\sqrt{3}}{4}=\Phi (0)\) for \(B\in [0,1]\), so we obtain \({\mathbf {T}}(f|[0,u])\le u\Phi (0)\).
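The optimization behind this step is elementary; we record it for completeness. Writing \(g(B)=\frac{(1/2-B)2B}{1+4B}=\frac{B(1-2B)}{1+4B}\), we have

$$\begin{aligned} g'(B)=\frac{1-4B-8B^2}{(1+4B)^2}, \end{aligned}$$

which vanishes at \(B=\frac{\sqrt{3}-1}{4}\); there \(1+4B=\sqrt{3}\) and \(g(B)=\frac{2\sqrt{3}-3}{4\sqrt{3}}=\frac{2-\sqrt{3}}{4}\).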

Unfortunately, \(f(x)/x\) may not have a maximum on \(H\cap (0,1]\) and, even if it does, we might get a u which is too small. To avoid these problems we replace \(f(x)/x\) by \(f(x)/x +\delta \log x\). Then we can show that u exists, is not too small, and still satisfies the claim of the proposition.

We now continue with the actual proof. Note that H is a closed set, and let \(h=\max H\). By Lemma 5.4, \({\mathbf {T}}(f)={\mathbf {T}}(f|[0,h])\).

If \(h<3a\Phi (D)\) then, applying Proposition 5.2 on [0, h] with \(C=1\), we get

$$\begin{aligned} {\mathbf {T}}(f)={\mathbf {T}}(f|[0,h])\le \frac{(h-f(h))(1-2D)}{3-3D} \le \frac{h}{3} <a\Phi (D), \end{aligned}$$

so \(u=a\) is a good choice in this case.

Therefore in the rest of the proof we can suppose that \(h\ge 3a\Phi (D)\). Let

$$\begin{aligned} \phi (x)=\frac{f(x)}{x}+\delta \log x. \end{aligned}$$

(Recall that in this paper \(\log \) denotes \(\log _2\).) Since f is nonnegative and 1-Lipschitz, \(0\le f(x)/x\le 1\) on (0, a], so for any \(x\in (0,2^{-1/\delta }h)\) we have

$$\begin{aligned} \phi (x)=\frac{f(x)}{x}+\delta \log x < 1 +\delta \log (2^{-1/\delta }h) \le \delta \log h \le \frac{f(h)}{h}+\delta \log h = \phi (h).\nonumber \\ \end{aligned}$$
(5.12)

Now we claim that

$$\begin{aligned} \left( \exists u\in H\cap [2^{-1/\delta }h, h]\right) \ (\forall x\in H\cap [\delta u, u])\ \phi (x)\le \phi (u). \end{aligned}$$
(5.13)

To prove this we define a sequence \(u_0>u_1>\ldots \in H\) inductively. Let \(u_0=h\). Suppose that \(u_n\in H\) is already defined. Let \(v\in H\cap [\delta u_n, u_n]\) be the largest number such that \(\phi (v)=\max _{H\cap [\delta u_n, u_n]}\phi \). If \(v=u_n\) then let \(N=n\) and the procedure is terminated.

Otherwise letting \(u_{n+1}=v\) we have \(u_{n+1}<u_n\), so the procedure can be continued. Note that it follows from the construction that \(\phi (h)=\phi (u_0)\le \ldots \le \phi (u_n)\) and \(u_{n+2}<\delta u_n\) (\(n=0,1,\ldots \)). Thus (5.12) implies that the procedure must be terminated in finitely many steps and (5.13) holds for \(u=u_N\).

Let u be chosen according to (5.13). Then, using that \(h\ge 3a\Phi (D)\), we have \(u\ge 2^{-1/\delta }h\ge 2^{-1/\delta }\cdot 3a\Phi (D)\), so the requirement \(u\in [3a\Phi (D)2^{-1/\delta },a]\) is satisfied. Thus it remains to prove (5.10).

Let

$$\begin{aligned} B=\frac{f(u)}{u}-\delta \log \delta . \end{aligned}$$

Since u is chosen according to (5.13), we have

$$\begin{aligned} (\forall x\in H\cap [\delta u,u]) \quad f(x)\le x\left( \frac{f(u)}{u}+\delta \log \frac{u}{x}\right) \le x\left( \frac{f(u)}{u}-\delta \log \delta \right) = Bx.\nonumber \\ \end{aligned}$$
(5.14)

Let \(F(x)=\min (f(x),2Bx)\) (\(x\in [0,u]\)).

Now we claim that every \(p\in H\cap [\delta u,u]\) is also a hard point of F. Suppose, on the contrary, that \(p\in H\cap [\delta u,u]\) is not a hard point of F. Then there exists a \(q\in [p/2,p]\) such that \(F(q)<F(p)\). By (5.14) we have \(f(p)\le Bp\le 2Bp\), so by definition \(F(p)=f(p)\), and consequently we have

$$\begin{aligned} F(q)<F(p)=f(p)\le Bp\le 2Bq, \end{aligned}$$

which implies that \(f(q)=F(q)\). Thus \(f(q)<f(p)\), so p cannot be a hard point of f, which is a contradiction.

Note that, by Lemma 5.4 and since \(F(p)=f(p)\) for any hard point of F, the above claim and the trivial estimate \({\mathbf {T}}(f|[0,\delta u])\le \delta u\) imply

$$\begin{aligned} {\mathbf {T}}(f|[0,v])\le {\mathbf {T}}(F|[0,v])+\delta u \qquad \text {for any } v\in [0,u]. \end{aligned}$$
(5.15)

First we consider the case when \(f(u)/u<-\delta \log \delta \).

Then \(B<-2\delta \log \delta \), and so

$$\begin{aligned} 0\le F(x)\le 2Bx < (-4 \delta \log \delta ) x \quad \text {on }[0,u]. \end{aligned}$$

If \(-4\delta \log \delta >1\) then, since \(\Phi (D)\ge 0\) for \(D\le 1/2\), the right-hand side of (5.10) is larger than u. Since clearly \({\mathbf {T}}(g)\le u\) for any 1-Lipschitz function \(g:[0,u]\rightarrow {\mathbb {R}}\), we are done if \(-4\delta \log \delta >1\). So we may suppose that \(-4\delta \log \delta \le 1\). By Proposition 5.2 applied to F, with \(a=u, C=-4\delta \log \delta \) and \(D=0\), we obtain

$$\begin{aligned} {\mathbf {T}}(F)\le u\cdot \frac{-4 \delta \log \delta }{1-8 \delta \log \delta } < -4 u\delta \log \delta . \end{aligned}$$

By (5.15) (applied to \(v=u\)) this implies that

$$\begin{aligned} {\mathbf {T}}(f|[0,u])\le -4 u\delta \log \delta +\delta u. \end{aligned}$$
(5.16)

Since \(D\le 1/2\), we have \(\Phi (D)\ge 0\), so (5.16) implies (5.10), which completes the proof in the case when \(f(u)/u<-\delta \log \delta \).

So in the rest of the proof we may assume that

$$\begin{aligned} f(u)/u\ge -\delta \log \delta . \end{aligned}$$
(5.17)

Since \(\delta <1/2\) gives \(-\log \delta >1\), (5.17) also implies that \(f(u)/u>\delta \). Putting this together with the fact that u was chosen according to (5.13), and with the inequality \(\log y\le y-1\), we get that if \(x\in H\cap [\delta u, u)\), then

$$\begin{aligned} f(x)\le x\frac{f(u)}{u}+x\delta \log \frac{u}{x}< x\frac{f(u)}{u}+x\frac{f(u)}{u}\left( \frac{u}{x}-1\right) = f(u). \end{aligned}$$
(5.18)

Since u is a hard point, \(f(x)\ge f(u)\) on \([u/2,u]\), and so (5.18) implies that \(H\cap [u/2,u)=\emptyset \).

Again because u is a hard point, \(f(u/2)\ge f(u)\). Using this, \(\delta <1/2\) and the fact that f is 1-Lipschitz, we get

$$\begin{aligned} B=\frac{f(u)}{u}-\delta \log \delta < \frac{f(u/2)}{u}+\frac{1}{2}\le 1. \end{aligned}$$

Using again that f is 1-Lipschitz and \(f(u/2)\ge f(u)\), we get

$$\begin{aligned} f(x)\ge f\left( \frac{u}{2}\right) -\left( \frac{u}{2}-x\right) \ge f(u)-\left( \frac{u}{2}-x\right) \quad (x\in [0,u/2]). \end{aligned}$$

Thus

$$\begin{aligned} f(x)\ge f(u)-\left( \frac{u}{2}-x\right) >Bx \quad \text {if}\quad \frac{\frac{u}{2}-f(u)}{1-B} < x \le \frac{u}{2}. \end{aligned}$$
(5.19)

Let \(v_0= \frac{u/2-f(u)}{1-B}\). Note that \(f(x)>Bx\) also holds on the closed interval \([v_0,u/2]\) unless \(f(v_0)=Bv_0\). The definition \(B=\frac{f(u)}{u}-\delta \log \delta \) and the assumption (5.17) imply that \(B\le 2f(u)/u\), hence \(v_0\le u/2\). Let \(v=\max \{x\in [0,u/2]\ :\ f(x)=Bx\}\) (the maximum over a nonempty compact set). By (5.19) we have \(v\le v_0\) and \(f(x)>Bx\) on \((v,u/2]\). By (5.14), this implies that \(H \cap [\delta u,u]\cap (v, u/2)=\emptyset \). Since above we obtained \(H\cap [u/2,u)=\emptyset \) we get \(H\cap (v,u)\subset [0,\delta u]\). Hence, using Lemma 5.4 and the trivial estimate \({\mathbf {T}}(f|[0,\delta u])\le \delta u\), we get

$$\begin{aligned} {\mathbf {T}}(f|[0,u]) \le {\mathbf {T}}(f|[0,v]) +\delta u. \end{aligned}$$
(5.20)

Since \(v\le v_0=\frac{u/2-f(u)}{1-B}\) and \(f(v)=Bv\),

$$\begin{aligned} 2(v-f(v))=2(1-B)v\le u-2f(u) = u\left( 1-2\frac{f(u)}{u}\right) = u(1-2B-2\delta \log \delta ). \end{aligned}$$

Let \(C=\min (2B,1)\). We have just seen that

$$\begin{aligned} 2(v-f(v)) \le u(1-C-2\delta \log \delta ). \end{aligned}$$

Note also that \(D\le f(u)/u=B+\delta \log \delta <B\), and so \(D\le C/2\) since we assumed that \(D\le 1/2\). Then \(Dx\le F(x)\le Cx\) on \([0,v]\subset [0,u]\), so we can apply Proposition 5.2 to get

$$\begin{aligned} {\mathbf {T}}(F|[0,v])\le \frac{(v-f(v))(C-2D)}{1+2C-3D}\le \frac{u(1-C-2\delta \log \delta )(C/2-D)}{1+2C-3D}. \end{aligned}$$

Note that \(\frac{C/2-D}{1+2C-3D}< 1\). Using calculus, we get that \(\frac{(1-C)(C/2-D)}{1+2C-3D}\le \Phi (D)\) for \(C\in [2D,1]\). Therefore

$$\begin{aligned} {\mathbf {T}}(F|[0,v])< u(\Phi (D)-2\delta \log \delta ). \end{aligned}$$

Combining the above inequality with (5.15) and (5.20), we get (5.10). \(\square \)

5.3 Stability results.

The results of this subsection are only needed for the proof of Theorem 1.4. Moreover, to get the bound \({{\,\mathrm{dim_H}\,}}(\Delta (A))\ge 37/54\) whenever \({{\,\mathrm{dim_H}\,}}(A)>1\), one only needs to consider the case \(D=0\) below. While there is no conceptual difference between the cases \(D=0\) and \(D>0\), the calculations are easier in the former case, so the reader may want to assume that \(D=0\) on a first reading.

In the \(C=1\) special case of Proposition 5.2, we get that if \(D\in [-1,1/2]\) and \(f:[0,1]\rightarrow {\mathbb {R}}\) is a 1-Lipschitz function such that \(f(0)=0\) and \(f(x)\ge Dx\) on [0, 1], then \({\mathbf {T}}(f)\le (1-2D)/3\). As we will see in Section 7, and as is not hard to check, this estimate is sharp: if

$$\begin{aligned} f(x) = \left\{ \begin{array}{lll} x &{} \text { if } &{} x\in [0,(D+1)/2] \\ 1+D-x &{} \text { if } &{} x\in [(D+1)/2,1] \end{array} \right. , \end{aligned}$$

then \({\mathbf {T}}(f)=(1-2D)/3\). In this subsection we prove a quantitative stability result (Proposition 5.15) for \(D\in [0,1/3]\), stating that if \({\mathbf {T}}(f)\) is close to \((1-2D)/3\) then f(x) must be close to the above function when x is not too far from 0 or from 1.
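In particular (a direct verification of ours), for this f and \(D\in [0,1/3]\) the good partition \(a_0=1\), \(a_1=\frac{2(1+D)}{3}\), \(a_2=\frac{1+D}{3}\), \(a_{n+1}=a_n/2\) for \(n\ge 2\) has total drop

$$\begin{aligned} {\mathbf {T}}(f,(a_n)) = f(a_1)-\min _{[a_1,a_0]} f = \frac{1+D}{3}-D = \frac{1-2D}{3}, \end{aligned}$$

all other terms being zero; thus a good partition attains the value \((1-2D)/3\), and the content of the sharpness claim is that no good partition does better.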

The general plan to get this result is the following. Let \(b=\min _{[1/2,1]} f\) and choose \(a\in [1/2,1]\) such that \(f(a)=b\). It is easy to see that \({\mathbf {T}}(f)={\mathbf {T}}(f|[0,a])\), so it is enough to study f|[0, a] instead of f. We need to get an upper estimate on \({\mathbf {T}}(f)\) when f is not close enough to the function defined in the previous paragraph. This upper estimate will be obtained by finding a point \(p\in [0,a]\) such that, in the good partition in the definition of \({\mathbf {T}}(f)\), the points \(a_n\) in [p, a] can be chosen so that \(\min _{[a_n,a_{n-1}]}f = f(a_{n-1})\); for these indices the sum of the terms \(f(a_n)-\min _{[a_n,a_{n-1}]}f\) telescopes to \(f(p)-f(a)\), which is the smallest possible value. Combining this with a near optimal good partition for f|[0, p] guaranteed by Proposition 5.2, we get a near optimal upper bound for \({\mathbf {T}}(f)\) for all f with such a special point p and value f(p). These points p will be called simple points, and after proving the above described near optimal upper estimate (Lemma 5.7), most of the proof will be about hunting for a simple point for which the resulting estimate for \({\mathbf {T}}(f)\) is the upper estimate we claim.

First we collect some assumptions and define precisely the above mentioned notion of simple points.

Definition 5.6

Suppose that

$$\begin{aligned} \begin{aligned} \ a\in (0,1],\ b\in {\mathbb {R}},\ D\in [0,1/2), \ f:[0,a]\rightarrow {\mathbb {R}}\text { is 1-Lipschitz, }\\ f(0)=0,\ f(x)\ge Dx \ (x\in [0,a]) \ \text { and } \min _{[a/2,a]} f=f(a)=b. \end{aligned} \end{aligned}$$
(5.21)

A point \(p\in [0,a]\) is called simple if there exists a finite sequence \(p=p_0<p_1<\cdots <p_k=a\) such that

$$\begin{aligned} \frac{p_i}{p_{i-1}} \le 2 \quad \text { and } \quad f(p_i)=\min _{[p_{i-1},p_i]} f \qquad (i=1,\ldots ,k). \end{aligned}$$
(5.22)

Lemma 5.7

If (5.21) holds and \(p\in [0,a]\) is a simple point then

$$\begin{aligned} {\mathbf {T}}(f)\le \alpha p+ (1- \alpha ) f(p) -b, \text { where } \alpha = \frac{1-2D}{3(1-D)}. \end{aligned}$$

Proof

Applying Proposition 5.2 to f|[0, p] with \(C=1\) we get \({\mathbf {T}}(f|[0,p])\le \alpha (p-f(p))\). Hence for any \(\delta >0\) there exists a good partition \((a_n)\) of [0, p] such that

$$\begin{aligned} {\mathbf {T}}(f|[0,p],(a_n))\le \alpha (p-f(p)) + \delta . \end{aligned}$$

Since p is simple there exists a finite sequence \(p=p_0<p_1<\cdots <p_k=a\) such that (5.22) holds.

For \(n\le k\) let \(a'_n=p_{k-n}\) and for \(n>k\) let \(a'_n=a_{n-k}\). Then \(( a'_n)\) is a good partition of [0, a] and

$$\begin{aligned} {\mathbf {T}}(f,( a'_n))= & {} {\mathbf {T}}(f|[0,p],( a_n)) +\sum _{i=1}^k f(p_{i-1})-f(p_i) \\\le & {} \alpha (p-f(p)) + \delta + f(p) - f(a) = \alpha p+ (1- \alpha ) f(p)-b +\delta , \end{aligned}$$

which completes the proof. \(\square \)

Lemma 5.8

Suppose that (5.21) holds and let \(p\in [0,a]\). If

$$\begin{aligned} \text { for every } z\in [p,a/2) \text { there exists } y \in (z,2z] \text { such that } f(y)\le f(z) \end{aligned}$$
(5.23)

then p is simple.

Proof

Let \(p_0=p\). Suppose that \(n\ge 0\) and \(p_0<\cdots <p_n\) are defined such that (5.22) holds for \(k=n\). If \(p_n\ge a/2\) then let \(p_{n+1}=a\) and we are done. Otherwise, let \(p_{n+1}\in [p_n,2p_n]\) be the largest number such that \(f(p_{n+1})=\min _{[p_n,p_{n+1}]} f\). By (5.23) we also have \(p_{n+1}>p_n\). It remains to check that the procedure terminates, which follows from the simple observation that \(p_{n+2}\ge \min (2 p_n,a)\) by definition. \(\square \)

Lemma 5.9

Suppose that (5.21) holds. If \(p\in [a/2,a]\), or if \(p\in [0,a/2]\) and \(f(p)\ge -2p+a+b\), then p is a simple point.

Proof

The case \(p\in [a/2,a]\) is clear, so suppose that \(p\in [0,a/2]\) and \(f(p)\ge -2p+a+b\). Then the 1-Lipschitz property of f implies that for any \(x\in [p,a]\) we also have \(f(x)\ge -2x+a+b\). Since f is 1-Lipschitz and \(f(a)=b\) we have \(f(y)\le -y+a+b\) for any \(y\in [0,a]\). Thus \(f(x)\ge -2x+a+b\ge f(2x)\) for any \(x\in [p,a/2]\), so Lemma 5.8 completes the proof. \(\square \)

Lemma 5.10

Condition (5.21) implies that \(1-a+2b-2D\ge 0\).

Proof

Note that \(b=f(a)\ge Da\), so

$$\begin{aligned} 1-a+2b-2D \ge 1 -a + 2aD - 2D = (1-a)(1-2D) \ge 0. \end{aligned}$$

\(\square \)

Lemma 5.11

If (5.21) holds and \({\mathbf {T}}(f)> \frac{1-2D}{3}-\delta \) for some \(\delta \in (0,a/3)\) then

$$\begin{aligned} f(x) > x - 3\delta (1-D) \qquad \text { on } [0,t_0], \text { where } t_0=\frac{a+b}{3} +\delta (1-D). \end{aligned}$$

Proof

First note that \(\delta <a/3\) implies that \(t_0<a\). Since \(f(0)<-2\cdot 0+a+b\) and \(f(a)\ge -2\cdot a + a +b\) there exists a \(t\in (0,a]\) such that \(f(t)=-2t+a+b\). By Lemma 5.9, t is a simple point, so writing \(\alpha = \frac{1-2D}{3(1-D)}\) and using Lemma 5.7, we get

$$\begin{aligned} {\mathbf {T}}(f)&\le \alpha t + (1-\alpha ) f(t) - b \\&= \alpha t + (1-\alpha ) (-2t+a+b) - b \\&= \frac{-t}{1-D}+\frac{2-D}{3(1-D)}(a+b)-b. \end{aligned}$$

Combining this with the assumption \({\mathbf {T}}(f)> \frac{1-2D}{3} - \delta \) and multiplying through by \(3(1-D)\), we get

$$\begin{aligned} 3t < (2-D)(a+b)-3(1-D)b-(1-2D)(1-D)+3\delta (1-D), \end{aligned}$$

which can be rewritten as

$$\begin{aligned} 3t < a+b+ (1-D)(3\delta - (1-a+2b-2D)). \end{aligned}$$

By Lemma 5.10, this implies \(t < \frac{a+b}{3} +\delta (1-D)=t_0\). Using this and the 1-Lipschitz property of f, we obtain

$$\begin{aligned} f(t_0) \ge f(t)-(t_0-t) >f(t)-2(t_0-t) =a+b-2t_0=t_0-3\delta (1-D). \end{aligned}$$

Using again that f is 1-Lipschitz, this gives the claim. \(\square \)

Lemma 5.12

Suppose that (5.21) holds, \(0\le p\le \frac{a+b-v}{2} < u \le a\), \(f(u)=v\) and \(f(x)\ge v\) on \([p,\frac{a+b-v}{2}]\). If \(v\ge u/2\) or \(f(p)=-2p+u+v\), then p is simple.

Proof

It is useful to note that by the 1-Lipschitz property of f, the assumptions \(u\le a\), \(f(u)=v\) and \(f(a)=b\) imply that \(u+v\le a+b\), and so \(\frac{u}{2}\le \frac{a+b-v}{2}\).

By Lemma 5.8 it is enough to check (5.23). So let \(z\in [p,a/2)\). We distinguish three cases.

First suppose that \(z\ge \frac{a+b-v}{2}\). Then, using that \(f(\frac{a+b-v}{2})\ge v\), f is 1-Lipschitz, \(2z<a\) and \(f(a)=b\), we get

$$\begin{aligned} f(z)\ge v-\left( z-\frac{a+b-v}{2}\right) \ge v-2\left( z-\frac{a+b-v}{2}\right) = -2z+a+b\ge f(2z). \end{aligned}$$

Therefore (5.23) holds in this case.

Now suppose that \(z\in [\frac{u}{2}, \frac{a+b-v}{2}]\). Since we consider only \(z\in [p,a/2)\) we also have \(z\in [p, \frac{a+b-v}{2}]\). Then \(f(z)\ge v\), \(u\in (z,2z]\) and \(f(u)=v\le f(z)\), so (5.23) holds in this case as well.

Finally, suppose that \(z\in [p,\frac{u}{2})\). Then \(v \le f(z)\le z < u/2\), hence we cannot have \(v \ge u/2\), so we must have \(f(p)=-2p+u+v\). Using that f is 1-Lipschitz and \(z\ge p\), this implies \(f(z)\ge -2z+u+v\). Since f is 1-Lipschitz and \(f(u)=v\) we have \(f(x)\le u+v-x\) on [0, u]. Thus \(f(2z)\le u+v-2z\le f(z)\), which completes the proof. \(\square \)

Lemma 5.13

If (5.21) holds and \({\mathbf {T}}(f)> \frac{1-2D}{3}-\delta \) for some \(\delta \in (0,a/3)\) then

$$\begin{aligned} f(x)> \frac{a+b}{3}-2\delta (1-D) \quad \text { on } [t_0,2t_0-6\delta (1-D)], \text { where } t_0=\frac{a+b}{3} +\delta (1-D). \end{aligned}$$

Proof

Let \(v=\frac{a+b}{3}-2\delta (1-D)\). If \(v<0\) then the claim is clear, so we can suppose that \(v\ge 0\). By Lemma 5.11, \(f(t_0)>v\). Thus if the claim is false then there exists a \(u\in (t_0,2t_0-6\delta (1-D)]\) such that \(f(u)=v\).

By (5.21), we have \(b\le \frac{a}{2}\), which implies

$$\begin{aligned} 2t_0-6\delta (1-D)=\frac{2(a+b)}{3}-4\delta (1-D)<a. \end{aligned}$$

Since \(f(0)\le v<f(t_0)\) we also have a largest \(p\in [0,t_0)\) such that \(f(p)=v\). Then \(f(x)\ge v\) on \([p,t_0]\). Since \(\frac{a+b-v}{2}=t_0\) and \(u/2\le t_0-3\delta (1-D)=v\), all the assumptions of Lemma 5.12 hold, so we get that p is simple.

Then by Lemma 5.7 we have \({\mathbf {T}}(f)\le \alpha p + (1-\alpha ) v - b\), where \(\alpha = \frac{1-2D}{3(1-D)}\). Since \(p<t_0=v+3\delta (1-D)\), this implies that \({\mathbf {T}}(f)\le v+(1-2D)\delta -b\). Combining this with the assumption \({\mathbf {T}}(f)> \frac{1-2D}{3}-\delta \) we get

$$\begin{aligned} v> b+\frac{1-2D}{3}-2\delta (1-D). \end{aligned}$$

Note that Lemma 5.10 implies that \(b+\frac{1-2D}{3}\ge \frac{a+b}{3}\), so we obtain \( v> \frac{a+b}{3}-2\delta (1-D)\), which is a contradiction. \(\square \)

From the last lemma and the Lipschitz property of f one can easily derive a good lower estimate also on \([2t_0-6\delta (1-D),a]\). However, the next lemma will lead to an even better (and, as we will see later, sharp) estimate on the right part of [0, a].

Lemma 5.14

Suppose that (5.21) holds,

$$\begin{aligned} {\mathbf {T}}(f) > \frac{1-2D}{3}-\delta ,\ \delta \in (0,a/3),\ u \in (a/2, a], u \ge 2v + 6\delta (1-D),\ \text { and } f(u)=v. \end{aligned}$$

Then

$$\begin{aligned} u+v> 1+D -3\delta \frac{1+D}{1-2D}. \end{aligned}$$

Proof

Since \(f(0)<-2\cdot 0+u+v\) and \(f(a)\ge -2\cdot a + u+v\) there exists a \(p\in (0,a]\) such that \(f(p)=-2p+u+v\). First we prove that p is a simple point. To get this, by Lemma 5.12, it is enough to check that \(p\le \frac{a+b-v}{2}<u\) and \(f(x) \ge v\) on \([p,\frac{a+b-v}{2}]\).

Since \(u\in (a/2,a]\), \(v=f(u)\) and \(b=\min _{[a/2,a]} f\), we have \(b\le v\), so \(\frac{a+b-v}{2}\le a/2<u\).

Note (as in Lemma 5.12) that \(u+v\le a+b\). By Lemma 5.11, we have \(f(t_0)>t_0-3\delta (1-D)\), where \(t_0=\frac{a+b}{3} +\delta (1-D)\). Then

$$\begin{aligned} 2t_0+f(t_0)>3t_0-3\delta (1-D)=a+b\ge u+v=2p+f(p). \end{aligned}$$
(5.24)

Since f is 1-Lipschitz this implies that \(p< t_0\). Using this, \(u+v\le a+b\) and finally the assumption \(u\ge 2v+6\delta (1-D)\), we get

$$\begin{aligned} p< t_0&= \frac{a+b}{3}+\delta (1-D)\\&=\frac{a+b}{2}-\frac{a+b}{6}+\delta (1-D)\\&\le \frac{a+b}{2}-\frac{u+v}{6}+\delta (1-D)\le \frac{a+b-v}{2}. \end{aligned}$$

On \([0,t_0]\) we have \(f(x)-x> -3\delta (1-D)\) by Lemma 5.11, and on [p, a] we have \(2x+f(x)\ge 2p + f(p) = u+v\) by the 1-Lipschitz property of f. Taking the linear combination of these inequalities with weights 2/3 and 1/3, we get

$$\begin{aligned} f(x)> \frac{u+v}{3} - 2\delta (1-D) \quad \text {on } [p,t_0]. \end{aligned}$$

By the assumption \(u \ge 2v + 6\delta (1-D)\), this gives \(f(x)\ge v\) on \([p,t_0]\).

Using that f is 1-Lipschitz and then (5.24), we get that on \([t_0,a]\) we have \(2x+f(x)\ge 2t_0 +f(t_0)>a+b\), which implies that \(f(x)\ge v\) also on \([t_0,\frac{a+b-v}{2}]\).

Therefore, by Lemma 5.12, p is indeed a simple point. Now Lemma 5.7 gives

$$\begin{aligned} {\mathbf {T}}(f)&\le \alpha p + (1-\alpha )f(p)-b\\&=\alpha p + (1-\alpha )(-2p+u+v)-b\\&=(3\alpha - 2) p +(1-\alpha )(u+v)-b. \end{aligned}$$

Recalling that \(\alpha =\frac{1-2D}{3(1-D)}\), it is easy to check that \(D<1\) implies that \(3\alpha -2 <0\). The 1-Lipschitz property of f and \(f(0)=0\) imply that \(0\le p-f(p)=3p-(u+v)\), so \(p\ge \frac{u+v}{3}\). Using these facts, the last displayed equation yields

$$\begin{aligned} {\mathbf {T}}(f)\le \left( \frac{3\alpha -2}{3}+1-\alpha \right) (u+v)-b =\frac{u+v}{3}-b. \end{aligned}$$

Combining this with the assumption \({\mathbf {T}}(f)>\frac{1-2D}{3}-\delta \) we get \(u+v> 1-2D-3\delta +3b\). Note that \(u+v\le a+b\) and \(b=f(a)\ge Da\) imply that \(b \ge \frac{D}{D+1}(u+v)\). Combining these facts, we conclude that

$$\begin{aligned} u+v> 1-2D-3\delta +3b \ge 1-2D-3\delta + \frac{3D}{D+1}(u+v), \end{aligned}$$

which implies (using also that \(D<1/2\)) the claim. \(\square \)

The following proposition provides a global quantitative estimate for functions \(f:[0,1]\rightarrow {\mathbb {R}}\) for which \({\mathbf {T}}(f)\) is close to the maximum possible value.

Proposition 5.15

Fix \(D\in [0,1/3]\), \(\delta \in (0,1/21]\) and let \(f:[0,1]\rightarrow {\mathbb {R}}\) be a 1-Lipschitz function such that \(f(0)=0\), \(f(x)\ge Dx\) on [0, 1] and \({\mathbf {T}}(f) > \frac{1-2D}{3}-\delta \). Let

$$\begin{aligned} t_1= \frac{1+D}{3} - \delta \left( \frac{1+D}{1-2D}-(1-D)\right) . \end{aligned}$$
(5.25)

Then

$$\begin{aligned} x-3\delta (1-D) \ <&f(x) \ \le x \qquad \text { on } [0,t_1], \end{aligned}$$
(5.26)
$$\begin{aligned} t_1-3\delta (1-D) <&f(x) \qquad \text { on } [t_1,2t_1-6\delta (1-D)] \qquad \text {and} \end{aligned}$$
(5.27)
$$\begin{aligned} 3t_1-x-3\delta (1-D)<&f(x) \ < \ 1+D-x + 3\delta \frac{1-D}{1-2D} \qquad \text { on } [2t_1,1]. \end{aligned}$$
(5.28)

Proof

Let \(b=\min _{[1/2,1]} f\) and choose \(a\in [1/2,1]\) such that \(f(a)=b\). Then it is easy to see that \({\mathbf {T}}(f)={\mathbf {T}}(f|[0,a])\). So combining the assumption \({\mathbf {T}}(f) > \frac{1-2D}{3}-\delta \) and Proposition 5.2 for f|[0, a] and \(C=1\), and then using \(b=f(a)\ge Da\), we get

$$\begin{aligned} \frac{1-2D}{3}-\delta < {\mathbf {T}}(f) ={\mathbf {T}}(f|[0,a]) \le \frac{(a-b)(1-2D)}{3(1-D)} \le a\frac{1-2D}{3}. \end{aligned}$$
(5.29)

This implies

$$\begin{aligned} a> 1-\frac{3\delta }{1-2D} \end{aligned}$$
(5.30)

and so

$$\begin{aligned} a+b\ge a+Da > 1 + D - 3\delta \frac{1+D}{1-2D}. \end{aligned}$$
(5.31)

By (5.29),

$$\begin{aligned} a-b> 1-D -3\delta \frac{1-D}{1-2D}. \end{aligned}$$

Since f is 1-Lipschitz this implies that

$$\begin{aligned} f(1)\le b+1-a< D+3\delta \frac{1-D}{1-2D}, \end{aligned}$$

which (using again that f is 1-Lipschitz) yields the upper estimate of (5.28) on [0, 1].

By definition we have \(\min _{[1/2,a]}f=f(a)=b\), but in order to apply our lemmas to f|[0, a] we have to show \(\min _{[a/2,a]}f=f(a)\). Suppose then that \(\min _{[a/2,a]}f<f(a)\). Then there exists an \(a'\in [a/2,1/2)\) such that \(\min _{[a/2,a]}f=f(a')\). Using Proposition 5.2 applied to \(f|[0,a']\) and \(C=1\) we get

$$\begin{aligned} \frac{1-2D}{3}-\delta < {\mathbf {T}}(f) ={\mathbf {T}}(f|[0,a])={\mathbf {T}}(f|[0,a']) \le a' \frac{1-2D}{3} \le \frac{1}{2}\cdot \frac{1-2D}{3}, \end{aligned}$$

which is impossible, since we assumed \(D\le 1/3\) and \(\delta \le 1/21\).

Therefore (5.21) holds for f|[0, a]. Note that (5.31) implies that \(t_0> t_1\), where \(t_0=\frac{a+b}{3}+\delta (1-D)\) (as in Lemma 5.13).

By (5.30), \(D\le 1/3\) and \(\delta \le 1/21\), we get \(a>4/7\), so \(\delta <a/3\) holds. Then applying Lemmas 5.11 and 5.13 to f|[0, a] and using that f is 1-Lipschitz we get the lower estimate of (5.26) and (5.27). The upper estimate of (5.26) is clear.

It remains to prove the lower estimate of (5.28). Lemma 5.14 (for f|[0, a]) gives that every point of the graph of \(f|(a/2, a]\) must lie above either the line \(y=1+D-x-3\delta \frac{1+D}{1-2D}\) or the line \(y=\frac{x}{2}-3\delta (1-D)\). These two lines intersect at \((2t_1,t_1-3\delta (1-D))\). On the other hand, a calculation using \(\delta \le 1/21\), \(D\le 1/3\), \(a\le 1\) and (5.30) shows that \(a/2\le 1/2< 2t_1 \le 1-\frac{3\delta }{1-2D}< a\). We deduce that \(f(2t_1)> t_1-3\delta (1-D)\). Using the 1-Lipschitz property of f, this gives the lower estimate of (5.28). \(\square \)

Remark 5.16

Let \(D\in [0,1/3]\), and let \(f:[0,1]\rightarrow {\mathbb {R}}\) be a 1-Lipschitz function such that \(f(0)=0\), \(f(x)\ge Dx\) on [0, 1] and \({\mathbf {T}}(f)\ge \frac{1-2D}{3}\). Letting \(\delta \rightarrow 0^+\) in Proposition 5.15, we get that \(f(x)=x\) on \([0,\frac{1+D}{3}]\), \(f(x)\ge \frac{1+D}{3}\) on \([\frac{1+D}{3}, \frac{2(1+D)}{3}]\), and \(f(x)=1+D-x\) on \([\frac{2(1+D)}{3},1]\). It is easy to see that, conversely, \({\mathbf {T}}(f)=\frac{1-2D}{3}\) for any such f. Recall that by the \(C=1\) special case of Proposition 5.2, we have \({\mathbf {T}}(f)\le \frac{1-2D}{3}\) for any 1-Lipschitz function \(f:[0,1]\rightarrow {\mathbb {R}}\) such that \(f(0)=0\), \(f(x)\ge Dx\) on [0, 1]. Therefore the above observation gives a characterization of those functions for which we have equality in Proposition 5.2 when \(C=1\).

The following corollary can be seen as a version of Proposition 5.15 that is closer to the kind of estimates we will need in the proof of Theorem 1.4.

Corollary 5.17

Let \(D\in [0,1/3)\) and

$$\begin{aligned} \Lambda (D)=\frac{(1+D)(37-50D+60D^2)}{18(3-4D+5D^2)} \ge \Lambda (0)=\frac{37}{54}=0.6851851\ldots . \end{aligned}$$

Then there exist \(\eta >0\) and \(\xi \in (2/3,1]\) (depending continuously on D) such that \(\Lambda (D)=\xi (1-2\eta )\) and the following holds.

If \(f:[0,1]\rightarrow {\mathbb {R}}\) is a 1-Lipschitz function such that \(f(0)=0\) and \(f(x)\ge Dx\) on [0, 1] then

$$\begin{aligned} {\mathbf {T}}(f) > 1-\Lambda (D) \quad \Longrightarrow \quad f(x) \ge \frac{x}{3}-\eta \xi \text { on } [0,\xi ]. \end{aligned}$$

Proof

Let

$$\begin{aligned} \delta =\frac{(1+D)(1-2D)}{18(3-4D+5D^2)} \in \left( 0,\frac{1}{36}\right) . \end{aligned}$$

Then \(1-\Lambda (D)=\frac{1-2D}{3}-\delta \). Let \(t_1\) be the number given by Proposition 5.15. By the hypothesis \({\mathbf {T}}(f)> 1-\Lambda (D)\), Proposition 5.15 implies that (5.26), (5.27) and (5.28) hold.

Let

$$\begin{aligned} \xi&=\frac{3}{4}\left( \delta (1-3D)+1+D-3\delta \frac{1+D}{1-2D}\right) = \frac{(1+D)(13-20D+24D^2)}{6(3-4D+5D^2)}\in (2/3,1),\\ \eta&=\frac{\delta (1-3D)}{\xi }= \frac{(1-2D)(1-3D)}{3(13-20D+24D^2)}>0. \end{aligned}$$
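(At \(D=0\) these values read \(\delta =\frac{1}{54}\), \(\xi =\frac{13}{18}\) and \(\eta =\frac{1}{39}\), and indeed \(\xi (1-2\eta )=\frac{13}{18}\cdot \frac{37}{39}=\frac{37}{54}=\Lambda (0)\).)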

Then the three lines \(y=x-3\delta (1-D)\), \(y=x/3-\eta \xi \) and \(y=Dx\) meet at \((3\delta ,3D\delta )\). Thus, since \(D<1/3\), on \([0,3\delta ]\) we have \(f(x)\ge Dx\ge x/3-\eta \xi \) and, using (5.26), on \([3\delta ,t_1]\) we have \(f(x) \ge x-3\delta (1-D) \ge x/3-\eta \xi \).

One can also check that the lines \(y= x/3-\eta \xi \) and \(y=3t_1-x-3\delta (1-D)\) intersect at \(x=\xi \). Thus, by (5.28), on \([2t_1,\xi ]\) we also have \(f(x) > 3t_1-x-3\delta (1-D) \ge x/3-\eta \xi \).

It remains to check \(f(x)\ge x/3-\eta \xi \) on \([t_1,2t_1]\). By (5.28), \(f(2t_1)> t_1-3\delta (1-D)\). Hence (5.27) and the 1-Lipschitz property of f imply that on \([t_1,2t_1]\) we have \(f(x)> g(x)-3\delta (1-D)\), where

$$\begin{aligned} g(x)= {\left\{ \begin{array}{ll} t_1 &{} \text { on } [t_1,2t_1-6\delta (1-D)] \\ 3t_1 -6\delta (1-D) -x &{} \text { on } [2t_1-6\delta (1-D),2t_1-3\delta (1-D)] \\ x-t_1 &{} \text { on } [2t_1-3\delta (1-D),2t_1] \end{array}\right. }. \end{aligned}$$

Now we claim that

$$\begin{aligned} g(x_0)-3\delta (1-D) \ge x_0/3-\eta \xi , \qquad \text {where } x_0=2t_1-3\delta (1-D). \end{aligned}$$
(5.32)

Indeed, using the definition of g and \(x_0\) and the equation \(\eta \xi =(1-3D)\delta \), we obtain that the left-hand side of (5.32) is \(t_1 - 6\delta (1-D)\), the right-hand side is \(2t_1/3 -\delta (2-4D)\), so it is enough to prove that \(t_1/3 \ge \delta (4-2D)\). It is straightforward to check that this last inequality follows from the definition (5.25) of \(t_1\), \(D\in [0,1/3)\) and \(\delta \in (0,1/36)\).

Note that the function \(g(x)-3\delta (1-D)\) has slope 0 or \(-1\) on \([t_1,x_0]\) and slope 1 on \([x_0,2t_1]\), while \(x/3-\eta \xi \) has slope 1/3. Thus (5.32) implies that \(g(x)-3\delta (1-D) \ge x/3-\eta \xi \) on \([t_1,2t_1]\), which completes the proof. \(\square \)

5.4 Total drop for \(\tau \)-good partitions.

In this subsection we show that for small \(\tau \), allowing only \(\tau \)-good partitions (recall Definition 5.1) does not change too much the smallest possible total drop, see Corollary 5.20 below. We begin with a lemma that will allow us to obtain \(\tau \)-good partitions from partitions that satisfy a weaker property, with a controlled change in the total drop.

Lemma 5.18

Let \(f:[0,a]\rightarrow {\mathbb {R}}\) be a 1-Lipschitz function and \((a_n)\) be a good partition of [0, a]. Suppose that \(\tau >0\), \(K>1\) is an integer, \((1+\tau )^K<2\) and \(a_{n-K}/a_n\ge 2\) for every \(n\ge K\). Then

$$\begin{aligned} {\mathbf {T}}_\tau (f) < {\mathbf {T}}(f,(a_n)) + 6K(K-1) \tau a . \end{aligned}$$

Proof

Fix \(i\in {\mathbb {N}}_0\) and consider the numbers \(\beta _j=a_{iK+j-1}/a_{iK+j}\) (\(j=1,\ldots ,K\)). Then \(\beta _1\cdots \beta _K=a_{iK}/a_{iK+K}\ge 2\) and for every j we have \(1<\beta _j\le 2\). The goal is to make every \(\beta _j\) at least \(1+\tau \) so that each of them remains at most 2, the product \(\beta _1\cdots \beta _K\) stays fixed, and the numbers \(a_{iK+j}\) are changed by only a small amount.

So let \(\beta '_j=1+\tau \) if \(\beta _j\le 1+\tau \), and to get the remaining \(\beta '_j\)’s decrease some of the corresponding \(\beta _j\)’s (and choose \(\beta '_j=\beta _j\) for the rest), so that still \(\beta '_j\ge 1+\tau \) and \(\beta '_1\cdots \beta '_K=\beta _1\cdots \beta _K\); this is possible since \((1+\tau )^K<2\). Then let \(a'_{iK}=a_{iK}\) and for each \(j=1,\ldots , K\) let \(a'_{iK+j}=a_{iK}/(\beta '_1\cdots \beta '_j)\). Note that \(a'_{iK+K}=a_{iK+K}\), \(1+\tau \le a'_{iK+j-1}/a'_{iK+j} \le 2\) for every \(j=1,\ldots ,K\) and each \(a_{iK+j}\) was multiplied by a factor between \((1+\tau )^{-K}\) and \((1+\tau )^{K}\) to get \(a'_{iK+j}\). This implies that for every \(j=1,\ldots ,K-1\),

$$\begin{aligned} |a'_{iK+j}-a_{iK+j}|\le a_{iK} ((1+\tau )^{K}-1) \le 2(a_{iK}-a_{(i+1)K})((1+\tau )^{K}-1).\qquad \end{aligned}$$
(5.33)

Note that \((a'_n)_{n=0}^{\infty }\) obtained by applying this procedure for every \(i\in {\mathbb {N}}_0\) is a \(\tau \)-good partition of [0, a].
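For instance (our numbers, purely illustrative): if \(K=3\), \(\tau =0.05\) and \((\beta _1,\beta _2,\beta _3)=(1.01,1.30,1.60)\), then setting \(\beta '_1=1.05\) and compensating within \(\beta _2\) gives \((\beta '_1,\beta '_2,\beta '_3)=(1.05,\ 1.30\cdot \tfrac{1.01}{1.05},\ 1.60)\approx (1.05,1.2505,1.60)\): every ratio lies in \([1+\tau ,2]\) and the product \(\beta '_1\beta '_2\beta '_3=\beta _1\beta _2\beta _3\approx 2.10\) is unchanged, as required.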

Let \(\tau _0=2^{1/K}-1\). Since \(\frac{(1+x)^K-1}{x}\) is increasing on \((0,\infty )\) (being a polynomial with positive coefficients) and \((1+\tau )^K< 2\), we have

$$\begin{aligned} \frac{(1+\tau )^K-1}{\tau }< \frac{(1+\tau _0)^K-1}{\tau _0}= \frac{1}{2^{1/K}-1} \le \frac{K}{\ln 2}<\frac{3K}{2}, \end{aligned}$$

where we used the inequality \(e^t-1 \ge t\). Thus

$$\begin{aligned} (1+\tau )^K-1< \frac{3K\tau }{2}. \end{aligned}$$

Combining this with (5.33) and \(a'_{iK}=a_{iK}\), then adding up, we get

$$\begin{aligned} \sum _{n=0}^{\infty }|a'_n-a_n|< & {} \left( \sum _{i=0}^{\infty }(K-1)\cdot 2(a_{iK}-a_{(i+1)K}) \frac{3K\tau }{2}\right) \\= & {} (K-1)(a_0-\lim _{n\rightarrow \infty }a_n)\cdot 3K\tau = 3K(K-1)\tau a. \end{aligned}$$

Since f is 1-Lipschitz, changing one \(a_n\) by \(\eta \) can change \({\mathbf {T}}(f,(a_n))\) by at most \(2\eta \), so the above inequality implies

$$\begin{aligned} {\mathbf {T}}(f,(a'_n))< {\mathbf {T}}(f,(a_n)) + 6K(K-1) \tau a, \end{aligned}$$

which completes the proof of the lemma. \(\square \)

The next lemma shows that we can replace an arbitrary good partition by one satisfying the assumptions of Lemma 5.18, without increasing the total drop.

Lemma 5.19

For any \(\delta >0\) and 1-Lipschitz function \(f:[0,a]\rightarrow {\mathbb {R}}\) there exists a good partition \((a'_n)\) such that \(a'_{n-3}/a'_n > 2\) for every \(n\ge 3\) and \({\mathbf {T}}(f,(a'_n))\le {\mathbf {T}}(f)+\delta \).

Proof

It is enough to show that for any good partition \((a_n)\) there exists a good partition \((a'_n)\) such that \(a'_{n-3}/a'_n > 2\) for every \(n\ge 3\) and \({\mathbf {T}}(f,(a'_n))\le {\mathbf {T}}(f,(a_n))\).

First we claim that we can suppose that every interval \([a_n,a_{n-1}]\) is increasing or decreasing (recall Definition 5.1). Indeed, for each \(n\ge 1\) if on the interval \([a_n,a_{n-1}]\) the minimum of f is taken at \(p\in (a_n,a_{n-1})\) then inserting p to the partition (in between \(a_{n-1}\) and \(a_n\)) we get a new good partition such that \([a_n,p]\) is decreasing and \([p,a_{n-1}]\) is increasing and it is easy to see that \({\mathbf {T}}(f,(a_n))\) is not changed.

Suppose that \(a_{n-2}/a_n<2\)\((n\ge 2)\). If \([a_n,a_{n-1}]\) and \([a_{n-1},a_{n-2}]\) are both increasing or both decreasing, then by merging these intervals we get an interval of the same type, and \({\mathbf {T}}(f,(a_n))\) remains unchanged. If \([a_n,a_{n-1}]\) is increasing and \([a_{n-1},a_{n-2}]\) is decreasing then after merging the two intervals the minimum of f on \([a_n,a_{n-2}]\) is still achieved at one of the endpoints of the interval, and \({\mathbf {T}}(f,(a_n))\) does not increase.

Applying the above merging procedure inductively (starting with \(n=2\)) whenever possible, we get a good partition \((a'_n)\) such that whenever \(a'_{n-2}/a'_n<2\)\((n\ge 2)\) then \([a'_n,a'_{n-1}]\) is decreasing and \([a'_{n-1},a'_{n-2}]\) is increasing. Since this cannot happen for both n and \(n-1\) we get that \(a'_{n-2}/a'_n\ge 2\) or \(a'_{n-3}/a'_{n-1}\ge 2\) for any \(n\ge 3\), which clearly implies that \(a'_{n-3}/a'_n> 2\). \(\square \)

Corollary 5.20

For any 1-Lipschitz function \(f:[0,a]\rightarrow {\mathbb {R}}\) and any \(0<\tau <1\),

$$\begin{aligned} {\mathbf {T}}_\tau (f)\le {\mathbf {T}}(f) + 36 \tau a. \end{aligned}$$

Proof

Note that for any 1-Lipschitz function \(f:[0,a]\rightarrow {\mathbb {R}}\) and any partition \((a_n)\) of [0, a], by definition, we have \(0\le {\mathbf {T}}(f,(a_n))\le a\). Thus the claim holds trivially if \(\tau \ge \root 3 \of {2} - 1\). Otherwise \((1+\tau )^3<2\), so we can apply Lemma 5.19, and then Lemma 5.18 with \(K=3\), for which \(6K(K-1)\tau a=36\tau a\). \(\square \)

5.5 Discretizing the estimates.

Recall from Definition 4.3 the notion of \(\tau \)-good partition of an integer interval \((0,\ell ]\), and the notation \({\mathbf {M}}(\sigma ,(N_i))\). Sometimes we refer to these as integer partitions for emphasis. Note that the requirement (4.7) for a \(\tau \)-good integer partition slightly differs from the requirement \(1+\tau \le a_{k-1}/a_k \le 2\) for a \(\tau \)-good partition (see Definition 5.1), which is equivalent to \(\tau a_k \le a_{k-1}-a_k\le a_k\). These two notions are connected by the following lemma.

Lemma 5.21

Assume that \(L\le \ell \) are positive integers. Let \(f:[0,L/\ell ]\rightarrow {\mathbb {R}}\) be a 1-Lipschitz function and let \((a_n)\) be a \((2\tau )\)-good partition of \([0,L/\ell ]\). Then there exists a \(\tau \)-good integer partition \(0=N_0<\cdots <N_q=L\) of (0, L] such that

$$\begin{aligned} \sum _{j=0}^{q-1} f(N_j/\ell )-\min _{[N_j/\ell ,N_{j+1}/\ell ]} f \le {\mathbf {T}}(f,(a_n)) +O_{\tau }(\log \ell /\ell ). \end{aligned}$$

Proof

Let \(N_0<\cdots <N_q\) be the values taken by the sequence \(\lfloor \ell a_n \rfloor \). Since \(a_n\rightarrow 0\) we get \(N_0=0\). Thus \(0=N_0<\cdots <N_q=L\) is an integer partition of (0, L].

Using that \((a_n)\) is a good partition we get

$$\begin{aligned} \lfloor \ell a_n \rfloor \le \ell a_n \le 2 \ell a_{n+1} < 2\lfloor \ell a_{n+1} \rfloor + 2, \end{aligned}$$

hence \(\lfloor \ell a_n \rfloor - \lfloor \ell a_{n+1} \rfloor \le \lfloor \ell a_{n+1} \rfloor + 1\). Thus to prove that \((N_j)\) is a \(\tau \)-good integer partition of (0, L] it is enough to show that \(\lfloor \ell a_n \rfloor - \lfloor \ell a_{n+1} \rfloor \ge \tau \lfloor \ell a_{n+1} \rfloor \) if \(\lfloor \ell a_n \rfloor > \lfloor \ell a_{n+1} \rfloor \). This is clear if \(\tau \lfloor \ell a_{n+1} \rfloor \le 1\). Otherwise, using also that \((a_n)\) is a \((2\tau )\)-good partition, we get

$$\begin{aligned} \lfloor \ell a_n \rfloor - \lfloor \ell a_{n+1} \rfloor> \ell a_n - \ell a_{n+1} - 1 \ge 2\tau \ell a_{n+1} - 1 \ge 2\tau \lfloor \ell a_{n+1} \rfloor -1 > \tau \lfloor \ell a_{n+1} \rfloor . \end{aligned}$$

Let \(K=\max \{ n: \ell a_n \ge 1\}\). Since \((a_n)\) is \((2\tau )\)-good and \(\ell a_0=L\), \((1+2\tau )^{K}\le L\le \ell \), and so \(K\le \log \ell /\log (1+2\tau )\). Let \(a'_n=\lfloor \ell a_n \rfloor / \ell \). Since f is 1-Lipschitz and \(|a'_n - a_n|<1/ \ell \), we deduce that

$$\begin{aligned} \sum _{n=1}^{K+1} f(a'_n)-\min _{[a'_n,a'_{n-1}]}f\le & {} \sum _{n=1}^{K+1} \left( f(a_n)-\min _{[a_n,a_{n-1}]}f + 2/\ell \right) \\\le & {} {\mathbf {T}}(f,(a_n)) + 2(K+1)/\ell . \end{aligned}$$

By definition \(\ell a_{K+1}<1\), hence \(a'_{K+1}=0\). Thus

$$\begin{aligned} \sum _{j=0}^{q-1} f(N_j/\ell )- \min _{[N_j/\ell ,N_{j+1}/\ell ]} f= & {} \sum _{n=1}^{K+1} f(a'_n)- \min _{[a'_n,a'_{n-1}]}f \\\le & {} {\mathbf {T}}(f,(a_n)) +O_{\tau }(\log \ell /\ell ), \end{aligned}$$

which completes the proof. \(\square \)

The following lemma will help us translate the results for Lipschitz functions to results for \([-1,1]\)-sequences.

Lemma 5.22

Let \(\gamma ,\Gamma \in [-1,1]\), \(\tau \in (0,1/2)\), \(\zeta \in (0,1)\) and let \(\sigma \in [-1,1]^\ell \) satisfy

$$\begin{aligned} \gamma j - \zeta \ell \le \sigma _1 +\cdots + \sigma _j \le \Gamma j + \zeta \ell \quad (1\le j\le \ell ). \end{aligned}$$

Then there exists a piecewise linear 1-Lipschitz function \(f:[0,1]\rightarrow {\mathbb {R}}\) such that

  1. (i)

    \(f(j/\ell )=\frac{1}{\ell }(\sigma _1+\cdots +\sigma _j)\) if \(\sqrt{\zeta }\ell \le j \le \ell \),

  2. (ii)

    \((\gamma -\sqrt{\zeta }) x \le f(x) \le (\Gamma +\sqrt{\zeta }) x\) on [0, 1] and

  3. (iii)

    for any integer \(0<L\le \ell \),

    $$\begin{aligned} \frac{1}{\ell } {\mathbf {M}}_\tau (\sigma |(0,L]) \le {\mathbf {T}}(f|[0,L/\ell ]) +2\sqrt{\zeta }+ 144\tau + O_{\tau }(\log \ell /\ell ). \end{aligned}$$

Proof

Let \(f_1:[0,1]\rightarrow {\mathbb {R}}\) be the piecewise linear function which is linear on each interval \([j/\ell ,(j+1)/\ell ]\), and at the points \(j/\ell \) takes the values

$$\begin{aligned} f_1(j/\ell ) = \frac{1}{\ell }(\sigma _1+\cdots +\sigma _j) \qquad (j=0,1,\ldots ,\ell ). \end{aligned}$$

Since \(\sigma _i\in [-1,1]\), this is a 1-Lipschitz function. Moreover, it follows from the assumption on \(\sigma \) that

$$\begin{aligned} \gamma x - \zeta \le f_1(x) \le \Gamma x + \zeta \quad (x\in [0,1]), \end{aligned}$$

and so, using that \(\zeta =\sqrt{\zeta }\cdot \sqrt{\zeta }\le \sqrt{\zeta }\,x\) whenever \(x\ge \sqrt{\zeta }\),

$$\begin{aligned} (\gamma - \sqrt{\zeta })x \le f_1(x) \le (\Gamma + \sqrt{\zeta })x \quad \text { for } x\in [\sqrt{\zeta },1]. \end{aligned}$$

Let f agree with \(f_1\) on \([\sqrt{\zeta },1]\), set \(f(0)=0\), and let f be linear on \([0,\sqrt{\zeta }]\). Then \(f:[0,1]\rightarrow {\mathbb {R}}\) is also a piecewise linear 1-Lipschitz function and (i) and (ii) hold.

Therefore it remains to prove (iii). Let \(0<L\le \ell \) be an integer. By Corollary 5.20 we have \({\mathbf {T}}_{2\tau }(f)\le {\mathbf {T}}(f)+72\tau \). Thus it is enough to show that for any \(\delta >0\) and \((2\tau )\)-good partition \((a_n)\) of \([0,L/\ell ]\) there exists a \(\tau \)-good integer partition \({\mathcal {P}}\) of \(\sigma |(0,L]\) such that

$$\begin{aligned} \frac{1}{\ell } {\mathbf {M}}(\sigma |(0,L], {\mathcal {P}}) \le {\mathbf {T}}(f|[0,L/\ell ], (a_n)) + 2\sqrt{\zeta }+ 72\tau + O_{\tau }(\log \ell / \ell ) +\delta . \end{aligned}$$
(5.34)

Let N be the largest index such that \(a_N\ge \sqrt{\zeta }\). Then \(a_N\le 2a_{N+1} < 2\sqrt{\zeta }\). By applying Corollary 5.20 and Proposition 5.2 to \(f_1|[0,a_N]\) with \(D=-1\) and \(C=1\), we get

$$\begin{aligned} {\mathbf {T}}_{2\tau }(f_1|[0,a_N]) \le {\mathbf {T}}(f_1|[0,a_N]) + 72\tau \le \frac{(a_N-f(a_N))\cdot 3}{6} + 72\tau \le 2\sqrt{\zeta }+ 72\tau . \end{aligned}$$

Hence for any \(\delta >0\) there exists a \((2\tau )\)-good partition \((b_n)\) of \([0,a_N]\) such that \({\mathbf {T}}(f_1|[0,a_N],(b_n))\le 2\sqrt{\zeta }+ 72\tau + \delta \).

Let \(a'_n=a_n\) if \(n\le N\) and \(a'_n=b_{n-N}\) otherwise. Then \((a'_n)\) is a \((2\tau )\)-good partition of \([0,L/\ell ]\) such that

$$\begin{aligned} {\mathbf {T}}(f_1|[0,L/\ell ],(a'_n))\le {\mathbf {T}}(f|[0,L/\ell ], (a_n)) +2\sqrt{\zeta }+ 72\tau + \delta . \end{aligned}$$
(5.35)

Applying Lemma 5.21 for \(f_1|[0,L/\ell ]\) and the \((2\tau )\)-good partition \((a'_n)\) of \([0,L/\ell ]\), we get a \(\tau \)-good integer partition \(0=N_0<\cdots <N_q=L\) of (0, L] such that

$$\begin{aligned} \sum _{j=0}^{q-1} f_1(N_j/\ell )-\min _{[N_j/\ell ,N_{j+1}/\ell ]} f_1 \le {\mathbf {T}}(f_1|[0,L/\ell ],(a_n')) +O_{\tau }(\log \ell /\ell ). \end{aligned}$$

Noting that the left-hand side of the above expression is exactly \(\frac{1}{\ell }{\mathbf {M}}(\sigma ,(N_j))\), and the right-hand side is at most the right-hand side of (5.34) by (5.35), the proof is complete. \(\square \)

The next proposition is a version of Proposition 5.2 for sequences, and will play a central role in the Proof of Theorem 1.2.

Proposition 5.23

For any \(\gamma ,\Gamma \in [-1,1]\), \(\tau \in (0,1/2)\), \(\zeta \in (0,1)\) such that \(\gamma \le \Gamma \) and \(2\gamma \le \Gamma \), the following holds.

Let \(\sigma \in [-1,1]^\ell \) satisfy

$$\begin{aligned} \gamma j - \zeta \ell \le \sigma _1 +\cdots + \sigma _j \le \Gamma j + \zeta \ell \quad (1\le j\le \ell ). \end{aligned}$$

Then

$$\begin{aligned} \frac{1}{\ell } {\mathbf {M}}_\tau (\sigma ) \le \frac{(1-\gamma )(\Gamma -2\gamma )}{1+2\Gamma -3\gamma } +14\sqrt{\zeta } + 144\tau + O_{\tau }(\log \ell /\ell ). \end{aligned}$$

Proof

Let f be the function provided by Lemma 5.22. Applying Proposition 5.2 to f with \(D=\max (-1,\gamma -\sqrt{\zeta })\), \(C=\min (1,\Gamma +\sqrt{\zeta })\), and using that \(D\in [\gamma -\sqrt{\zeta },\gamma ]\) and \(C\in [\Gamma ,\Gamma +\sqrt{\zeta }]\), then using that \(1+2\Gamma -3\gamma \ge 1\), \(\gamma ,\Gamma \in [-1,1]\) and \(\zeta \in (0,1)\), we obtain

$$\begin{aligned} {\mathbf {T}}(f)&\le \frac{(1-\gamma +\sqrt{\zeta }) (\Gamma -2\gamma +3\sqrt{\zeta })}{1+2\Gamma -3\gamma }\\&\le \frac{(1-\gamma )(\Gamma -2\gamma )}{1+2\Gamma -3\gamma } +12\sqrt{\zeta }. \end{aligned}$$
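The second inequality above is obtained by expanding the numerator and using crude bounds; we record the computation, as it explains the constant 12:

$$\begin{aligned} (1-\gamma +\sqrt{\zeta })(\Gamma -2\gamma +3\sqrt{\zeta }) = (1-\gamma )(\Gamma -2\gamma ) + 3\sqrt{\zeta }(1-\gamma ) + \sqrt{\zeta }(\Gamma -2\gamma ) + 3\zeta \le (1-\gamma )(\Gamma -2\gamma ) + 12\sqrt{\zeta }, \end{aligned}$$

since \(1-\gamma \le 2\), \(\Gamma -2\gamma \le 3\) and \(3\zeta \le 3\sqrt{\zeta }\); dividing by \(1+2\Gamma -3\gamma \ge 1\) gives the stated bound.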

By applying (iii) of Lemma 5.22 for \(L=\ell \) we get the desired inequality. \(\square \)

The following proposition will be used (only) in the Proof of Theorem 1.3; it is essentially a consequence of Proposition 5.5.

Proposition 5.24

For any \(\gamma , \tau \in (0,1/2)\), \(\zeta \in (0,\gamma ^2]\), \(\delta >0\) there is \(\eta =\eta (\delta ,\gamma )>0\) such that the following holds for any positive integer \(\ell \).

Let \(\sigma \in [-1,1]^\ell \) satisfy

$$\begin{aligned} \gamma j - \zeta \ell \le \sigma _1 +\cdots + \sigma _j \quad (1\le j\le \ell ). \end{aligned}$$

Then there exists an integer \(L\in [\eta \ell ,\ell ]\) such that

$$\begin{aligned} \frac{1}{L} {\mathbf {M}}_\tau (\sigma |(0,L]) \le \Phi (\gamma ) +\frac{1}{\eta }\left( O(\sqrt{\zeta }) + O(\tau ) + O_\tau (\log \ell / \ell ) \right) +\delta , \end{aligned}$$

where

$$\begin{aligned} \Phi (x)=\frac{2-x-\sqrt{3-3x^2}}{4}. \end{aligned}$$

Proof

Let f be the function provided by Lemma 5.22 for \(\Gamma =1\). Choose \(\widetilde{\delta }\in (0,1/2)\) such that \(\widetilde{\delta }(2-4\log \widetilde{\delta })<\delta \) and let \(\eta =3\Phi (\gamma )2^{-1/\widetilde{\delta }}>0\).

Let \(D=\gamma -\sqrt{\zeta }\ge 0\). Note that \(\Phi \) is decreasing on [0, 1/2] (indeed \(\Phi '(x)=\frac{1}{4}\big (3x/\sqrt{3-3x^2}-1\big )\le 0\) there), so \(3\Phi (D)2^{-1/\widetilde{\delta }}\ge 3\Phi (\gamma )2^{-1/\widetilde{\delta }} = \eta \). Using this and applying Proposition 5.5, we obtain a \(u\in [\eta ,1]\) such that

$$\begin{aligned} \frac{1}{u} {\mathbf {T}}(f|[0,u])< & {} \Phi (D)+\delta = \frac{2-D-\sqrt{3-3D^2}}{4} + \delta \\\le & {} \frac{2-(\gamma -\sqrt{\zeta })-\sqrt{3-3\gamma ^2}}{4} +\delta = \Phi (\gamma )+\sqrt{\zeta }/4 +\delta . \end{aligned}$$

Let L be the smallest integer such that \(u\le L/\ell \). Then clearly \(L\in [\eta \ell , \ell ]\).

It is easy to see that for any 1-Lipschitz function \(g:[0,a]\rightarrow {\mathbb {R}}\) and any \(0<u_1<u_2\le a\) we have \({\mathbf {T}}(g|[0,u_2])\le {\mathbf {T}}(g|[0,u_1]) + (u_2-u_1)\). Thus

$$\begin{aligned} \frac{1}{u}{\mathbf {T}}(f|[0,L/\ell ]) \le \frac{1}{u}({\mathbf {T}}(f|[0,u])+1/\ell ) \le \Phi (\gamma )+\sqrt{\zeta }/4 +\delta +\frac{1}{\ell u}. \end{aligned}$$

Combining this with (iii) of Lemma 5.22, we conclude

$$\begin{aligned} \frac{1}{L} {\mathbf {M}}_\tau (\sigma |(0,L])&\le \frac{1}{u} \cdot \frac{1}{\ell } {\mathbf {M}}_\tau (\sigma |(0,L]) \\&\le \frac{1}{u}\left( {\mathbf {T}}(f|[0,L/\ell ]) + 2\sqrt{\zeta }+ 144\tau + O_\tau (\log \ell / \ell ) \right) \\&\le \Phi (\gamma )+\sqrt{\zeta }/4 +\delta +\frac{1}{\ell \eta } +\frac{1}{\eta }\left( 2\sqrt{\zeta }+ 144\tau + O_\tau (\log \ell / \ell ) \right) \\&\le \Phi (\gamma ) +\frac{1}{\eta }\left( O(\sqrt{\zeta }) + O(\tau ) + O_\tau (\log \ell / \ell ) \right) +\delta . \end{aligned}$$

\(\square \)

Finally, we get a version for sequences and integer partitions of Corollary 5.17, which will be applied to prove Theorem 1.4.

Proposition 5.25

Let \(\tau \in (0,1/2)\), \(\gamma \in (0,1/3)\), \(\zeta \in (0,\gamma ^2)\) and

$$\begin{aligned} \Lambda (x) = \frac{(1+x)(37-50x+60x^2)}{18(3-4x+5x^2)} \qquad (x\in [0,1/3]). \end{aligned}$$

Then there exist \(\eta >0,\xi \in (2/3,1]\) (depending on \(\gamma -\sqrt{\zeta }\)) such that

$$\begin{aligned} \xi (1-2\eta ) = \Lambda (\gamma -\sqrt{\zeta })\ge \Lambda (\gamma )-\sqrt{\zeta }\end{aligned}$$

and the following holds:

For any sequence \((\sigma _1,\ldots ,\sigma _\ell ) \in [-1,1]^\ell \) such that

$$\begin{aligned} \sigma _1+ \cdots +\sigma _j \ge \gamma j - \zeta \ell \qquad (j=1,\ldots ,\ell ), \end{aligned}$$

one of the following alternatives is satisfied:

  1. (i)
    $$\begin{aligned} \frac{1}{\xi \ell } \max _{j=1}^{\xi \ell } \sum _{i=1}^j \left( 1/3-\sigma _i\right) \le \eta + 2\sqrt{\zeta }. \end{aligned}$$
  2. (ii)
    $$\begin{aligned} \frac{1}{\ell } {\mathbf {M}}_\tau (\sigma ) \le 1- \Lambda (\gamma ) +3\sqrt{\zeta }+ 144\tau + O_{\tau }(\log \ell /\ell ). \end{aligned}$$

Proof

We begin by noting that \(\Lambda (\gamma -\sqrt{\zeta })\ge \Lambda (\gamma )-\sqrt{\zeta }\) since \(\Lambda '(x)\le 1\) on [0, 1/3].

Let \(\eta >0,\xi \in (2/3,1]\) be the numbers given in Corollary 5.17 for \(D=\gamma -\sqrt{\zeta }\in (0,1/3)\). Suppose that (i) is false, so there is a \(j\le \xi \ell \) such that

$$\begin{aligned} \frac{1}{\xi \ell } \sum _{i=1}^j \left( 1/3-\sigma _i\right) > \eta + 2\sqrt{\zeta }, \end{aligned}$$

and therefore

$$\begin{aligned} \frac{1}{\ell }(\sigma _1+\cdots +\sigma _j) < \frac{j}{3\ell } - \eta \xi - 2 \xi \sqrt{\zeta }. \end{aligned}$$

Note that the left-hand side is at least \(-j/\ell \), so rearranging gives \(j/\ell > \frac{3}{4}(\eta +2\sqrt{\zeta })\xi \); since \(\eta >0\) and \(\xi \ge 2/3\), the right-hand side is greater than \(\sqrt{\zeta }\), and therefore \(j/\ell > \sqrt{\zeta }\).

Let f be the function provided by Lemma 5.22 for \(\Gamma =1\), and let \(x_0=j/\ell \). Then \(x_0\in [\sqrt{\zeta },\xi ]\) and \(f:[0,1]\rightarrow {\mathbb {R}}\) is a 1-Lipschitz function such that \(f(0)=0\), \(f(x)\ge (\gamma -\sqrt{\zeta })x\) on [0, 1], \(f(x_0)<x_0/3 - \eta \xi - 2\xi \sqrt{\zeta }\le x_0/3-\eta \xi \), and

$$\begin{aligned} \frac{1}{\ell } {\mathbf {M}}_\tau (\sigma ) \le {\mathbf {T}}(f) +2\sqrt{\zeta }+ 144\tau + O_{\tau }(\log \ell /\ell ). \end{aligned}$$
(5.36)

Applying Corollary 5.17 to f, we get

$$\begin{aligned} {\mathbf {T}}(f) \le 1-\Lambda (\gamma -\sqrt{\zeta }) \le 1- \Lambda (\gamma )+\sqrt{\zeta }. \end{aligned}$$

Combining this with (5.36) we obtain (ii), which completes the proof. \(\square \)

6 Proofs of Main Theorems

6.1 Proof of Theorem 1.2.

In this section we prove Theorem 1.2. Write

$$\begin{aligned} \psi (s,u)= \frac{s(2+u-2s)}{2+2u-3s}, \end{aligned}$$

and recall that \(\chi (s,u)=\min (\psi (s,u),1)\) for \(0\le s\le u\le 2\) and \(s<2\), and moreover \(\chi (s,u)=1\) if and only if \(u\le 2s-1\) (which forces \(s\ge 1\)).
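For the reader's convenience, we record the computation behind this equivalence:

$$\begin{aligned} \psi (s,u)-1 = \frac{s(2+u-2s)-(2+2u-3s)}{2+2u-3s} = \frac{(s-2)(u-2s+1)}{2+2u-3s}. \end{aligned}$$

In the range \(0\le s\le u\le 2\), \(s<2\) we have \(2+2u-3s\ge 2-s>0\) and \(s-2<0\), so \(\psi (s,u)\ge 1\) exactly when \(u\le 2s-1\); together with \(u\ge s\), the latter forces \(s\ge 1\).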

The next proposition encapsulates some preliminary reductions towards the Proof of Theorem 1.2. We first explain how to deduce the theorem from the proposition; the rest of the section is then devoted to the proof of the proposition.

Proposition 6.1

For every \(0<s\le u\le 2\) with \(u>2s-1\), the following holds.

Let \(\mu \in {\mathcal {P}}([0,1)^2)\) satisfy \({\mathcal {E}}_s(\mu )<\infty \) and \({{\,\mathrm{{\overline{\dim }}_B}\,}}({{\,\mathrm{supp}\,}}(\mu ))\le u\). If \(B\subset [0,1)^2\) is a compact set disjoint from \({{\,\mathrm{supp}\,}}(\mu )\) with \({{\,\mathrm{dim_H}\,}}(B)>\max (1,2-s)\), then

$$\begin{aligned} \sup _{y\in B} {{\,\mathrm{dim_H}\,}}(\Delta _y ({{\,\mathrm{supp}\,}}\mu )) \ge \psi (s,u). \end{aligned}$$

Proof of Theorem 1.2 (assuming Proposition 6.1)

We proceed by contradiction. Assume, then, that there exists a Borel set \(A\subset {\mathbb {R}}^2\) such that \(0< s\le {{\,\mathrm{dim_H}\,}}(A)\le {{\,\mathrm{dim_P}\,}}(A)\le u\le 2\) and

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}\{ y\in {\mathbb {R}}^2 : {{\,\mathrm{dim_H}\,}}(\Delta _y A) < \chi (s,u) \} > \max (1,2-s). \end{aligned}$$

By countable stability of Hausdorff dimension, there are \(\eta >0\) and a set \(B\subset {\mathbb {R}}^2\) with \({{\,\mathrm{dim_H}\,}}(B)>\max (1,2-s)\) such that

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta _y A) < \chi (s,u) -\eta \quad \text {for all }y\in B. \end{aligned}$$
(6.1)

Since \({{\,\mathrm{dim_H}\,}}(\Delta _y A)\) does not increase if we replace A by any subset, every Borel set of dimension \(s>0\) contains compact subsets of positive \(s'\)-dimensional Hausdorff measure for all \(0<s'<s\), and \(\chi (s,u)\) is continuous, at the price of replacing \(\eta \) by \(\eta /2\) we may assume that in (6.1) the set A is compact and of positive s-dimensional Hausdorff measure. In turn, a routine verification shows that if A is compact, then the set

$$\begin{aligned} \{ y: {{\,\mathrm{dim_H}\,}}(\Delta _y A) < \chi (s,u)-\eta /2 \} \end{aligned}$$

is Borel. Hence in (6.1) we may also assume that B is Borel.

Recall that \(\chi (s,u)= 1\) if and only if \(u\le 2s-1\) (and \(s\ge 1\) in this case). Hence, if \(u\le 2s-1\), then we can pick \(0< s'\le u'\le 2\) such that \(u'\ge u\), \(1\le s'\le s\), \(u'>2s'-1\), and \(\psi (s',u')>1-\eta /2\). This shows that in (6.1), we may further assume that \(u>2s-1\) and replace \(\chi (s,u)\) by \(\psi (s,u)\) (with \(\eta /2\) in place of \(\eta \)).

Let \(\mu \in {\mathcal {P}}({\mathbb {R}}^2)\) be an s-Frostman measure on A, i.e. \(\mu \) is a Radon measure supported on A and \(\mu (B(x,r)) \le C r^s\) for all \(x\in {\mathbb {R}}^2\), \(r>0\), where C is independent of x (recall that we assumed that A has positive s-dimensional Hausdorff measure). By assumption, \({{\,\mathrm{dim_P}\,}}({{\,\mathrm{supp}\,}}(\mu )) \le u\). Using that packing dimension is equal to the modified upper box counting dimension (see e.g. [Fal14, Proposition 3.8]), and that \({{\,\mathrm{{\overline{\dim }}_B}\,}}(A_0)={{\,\mathrm{{\overline{\dim }}_B}\,}}({\overline{A}}_0)\), we see that for every \(\delta >0\) there is a compact set \(A_0\subset A\) of positive \(\mu \)-measure such that \({{\,\mathrm{{\overline{\dim }}_B}\,}}(A_0)\le \min (u+\delta ,2)\).

We can then find disjoint compact subsets \(B'\subset B, A'\subset A_0\) such that still \(\mu (A')>0\), \({{\,\mathrm{dim_H}\,}}(B')>\max (1,2-s)\). Then (provided \(\delta \) was taken small enough in terms of \(s,u,\eta \))

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta _y A') < \psi (s-\delta ,\min (u+\delta ,2)) - \eta /2 \quad \text {for all }y\in B'. \end{aligned}$$

This inequality is preserved under (joint) scaling and translation of \(\mu , A',B'\), so it holds in particular for some compact \(A', B'\subset [0,1)^2\). Since \(\mu _{A'}(B(x,r)) \le C' r^s\) for some constant \(C'>0\), we can check that \({\mathcal {E}}_{s-\delta }(\mu _{A'})<\infty \). Since \({{\,\mathrm{supp}\,}}(\mu _{A'})\subset A'\), this contradicts Proposition 6.1 applied to \(\mu _{A'}\) and \(B'\), with \(s-\delta \), \(\min (u+\delta ,2)\) in place of su (provided \(\delta \) was taken small enough in terms of \(s,u,{{\,\mathrm{dim_H}\,}}(B')\)). \(\square \)

In order to bound the Hausdorff dimension of \(\Delta _y A\) from below, we will use the following standard criterion; although it is well known, we include the short proof for completeness.

Lemma 6.2

Let \(F\subset {\mathbb {R}}^d\) be a Borel set and let \(\rho \in {\mathcal {P}}({\mathbb {R}}^d)\) give full mass to F. Suppose that there are \(M_0\in {\mathbb {N}}_{\ge 2}\) and \(s>0\) such that for any \(M\ge M_0\) and any Borel subset \(F'\subset F\) with \(\rho (F')> M^{-2}\), the number \({\mathcal {N}}(F',M)\) of cubes in \({\mathcal {D}}_M\) hitting \(F'\) is at least \(2^{s T M}\). Then \({\mathcal {H}}^s(F)\gtrsim _{T,d} 1\) and in particular \({{\,\mathrm{dim_H}\,}}(F)\ge s\).

Proof

Let \(\{ B(x_i,r_i)\}\) be a cover of F where \(r_i \le 2^{-T M_0}\) for all i. Our goal is to estimate \(\sum _i r_i^s\) from below.

Write \(F_M\) for the union of all the \(B(x_i,r_i)\) for which \(2^{- T(M+1)} \le r_i \le 2^{-TM}\). Pigeonholing, there is \(M\ge M_0\) such that \(\rho (F_M)> M^{-2}\). By assumption, one needs at least \(2^{sTM}\) cubes in \({\mathcal {D}}_M\) to cover \(F_M\). It follows that the number of balls making up \(F_M\) is \(\gtrsim _{d,T} 2^{sTM}\), so that \(\sum _i r_i^s \gtrsim _{d,T} 2^{s T M} 2^{-s T M}=1\). This gives the claim. \(\square \)

We now begin the Proof of Proposition 6.1. Since \(\mu , s,u\) are fixed, any (possibly implicit) constants appearing in the proof may depend on them. Let \(\nu \) be a measure supported on B with finite \(u'\)-energy where \(u'>\max (1,2-s)\). Let \(\kappa =\kappa (\mu ,\nu )>0\) be the number given by Proposition 3.12. We will show that (under the assumptions of the proposition) there exists \(y\in B\) (possibly depending on \(T,\varepsilon ,\tau \)) such that

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta _y ({{\,\mathrm{supp}\,}}(\mu ))) > \psi (s,u) - o_{T,\varepsilon ,\tau }(1). \end{aligned}$$
(6.2)

Recall that \(o_{T,\varepsilon ,\tau }(1)\) stands for a function of \(T,\varepsilon ,\tau \) which tends to 0 as \(T\rightarrow \infty \) and \(\varepsilon ,\tau \rightarrow 0^+\). We will henceforth assume that \(T,\varepsilon ,\tau \) are given, and that the integer \(\ell _0\) is chosen large enough in terms of \(T,\varepsilon ,\tau \) so that all the claimed inequalities hold. As a first instance of this, apply Lemma 3.10 to get that

$$\begin{aligned} |{{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x)|&\le \kappa \text { for all } x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu ) \end{aligned}$$

provided \(\ell _0\) was taken large enough (in terms of \(T,\varepsilon ,\tau \)).

One can easily check that, given \(\nu \in {\mathcal {P}}([0,1)^2)\) and \(j,k\in {\mathbb {N}}\), the set \(\{ (x,\theta ): \theta \in {{\,\mathrm{\mathbf {Bad}}\,}}(\nu ,x,j,k)\}\) is Borel (recall Definition 3.8). It follows that the set

$$\begin{aligned} \Theta = \{ (x,\theta ): x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu ), \theta \in {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x) \} \end{aligned}$$
(6.3)

is Borel. Hence, applying Proposition 3.12, and using Fubini and the fact that \(\mu \) is a Radon measure, we obtain a compact set \(A_1\subset {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\subset {{\,\mathrm{supp}\,}}(\mu )\) with \(\mu (A_1)>2/3\) and a point \(y\in {{\,\mathrm{supp}\,}}(\nu )\subset B\) such that

$$\begin{aligned} P_y(x) \notin {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x) \text { for all } x\in A_1. \end{aligned}$$
(6.4)

Making \(\varepsilon \) smaller (in terms of \(\mathrm {dist}(B,{{\,\mathrm{supp}\,}}(\mu ))\) only) and \(\ell _0\) larger, we may assume that

$$\begin{aligned} \mathrm {dist}(B,A_1) \ge \varepsilon + \sqrt{2} \cdot 2^{-\ell _0}. \end{aligned}$$
(6.5)

We will show that, in fact, \({{\,\mathrm{dim_H}\,}}(\Delta _y(A_1))\ge \psi (s,u)-o_{T,\varepsilon ,\tau }(1)\), which clearly implies (6.2). To do this, our aim is to apply Lemma 6.2 with \(F=\Delta _y(A_1)\), \(\rho =\Delta _y(\mu _{A_1})\). Note that if \(\rho (F')\ge \ell ^{-2}\), then \(A_2=\Delta _y^{-1}(F')\) satisfies \(\mu _{A_1}(A_2)=\rho (F')\ge \ell ^{-2}\). Hence, in order to complete the Proof of Proposition 6.1, it is enough to establish the following.

Claim

If the Borel set \(A_2\subset [0,1)^2\) satisfies \(\mu _{A_1}(A_2) \ge \ell ^{-2}\) with \(\ell \ge \ell _0\), where \(\ell _0\) is taken sufficiently large in terms of \(T,\varepsilon ,\tau \), then

$$\begin{aligned} \log {\mathcal {N}}(\Delta _y A_2,\ell ) \ge (\psi (s,u)-o_{T,\varepsilon ,\tau }(1))T\ell . \end{aligned}$$
(6.6)

Fix, then, \(A_2\) as above. Since the set \(\Delta _y(R_\ell A_2)\) is contained in the \((\sqrt{2}\cdot 2^{-T\ell })\)-neighborhood of \(\Delta _y A_2\), the numbers \(\log {\mathcal {N}}(\Delta _y A_2,\ell )\) and \(\log {\mathcal {N}}(\Delta _y R_\ell A_2,\ell )\) differ by at most a constant. Hence we can, and do, assume that \(A_2=R_\ell A_2\) from now on. Moreover, we may assume that \(A_2\subset R_\ell (A_1)\), since whenever \(A_2=R_\ell A_2\) and \(\mu _{A_1}(A_2) \ge \ell ^{-2}\), the same holds for \(A_2 \cap R_\ell (A_1)\).

Consider the sets given by Corollary 3.5 applied to \(R_{\ell }\mu \). Applying the corollary with \(A=A_2\), and using that

$$\begin{aligned} 2^{-\varepsilon T\ell }\ll \tfrac{2}{3}\ell ^{-2} \le \mu (A_1)\mu _{A_1}(A_2) \le \mu (A_2)=R_\ell \mu (A_2) \end{aligned}$$

for large enough \(\ell \), we can find a further \(2^{-T\ell }\)-set X such that, setting \(\rho =(R_\ell \mu )_X\),

  1. (i)

    \(\rho (A_2) \ge \ell ^{-2}/2\).

  2. (ii)

    \(R_\ell \mu (X) \ge 2^{-o_{T,\varepsilon }(1) T\ell }\) and therefore, using that \({\mathcal {E}}_s(R_\ell \mu ) \lesssim _T {\mathcal {E}}_s(\mu )\) by Lemma 3.1,

    $$\begin{aligned} {\mathcal {E}}_s(\rho ) \le (R_\ell \mu (X))^{-2}{\mathcal {E}}_s(R_\ell \mu ) \lesssim _T 2^{o_{T,\varepsilon }(1) T\ell } {\mathcal {E}}_s(\mu ) \lesssim 2^{o_{T,\varepsilon }(1) T\ell }. \end{aligned}$$
  3. (iii)

    \(\rho \) is \(\sigma \)-regular for some sequence \(\sigma =(\sigma _1,\ldots ,\sigma _\ell )\), \(\sigma _j\in [-1,1]\).

  4. (iv)

    X is contained in \(R_\ell {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\).

By Lemma 3.3 and (ii), (iii) above, and assuming that \(\ell _0\) was taken large enough in terms of T, we have

$$\begin{aligned} \sum _{i=1}^j \sigma _i \ge (s-1)j - \ell o_{T,\varepsilon }(1) \qquad (j=1,\ldots ,\ell ). \end{aligned}$$
(6.7)

On the other hand, we have assumed that \({{\,\mathrm{{\overline{\dim }}_B}\,}}({{\,\mathrm{supp}\,}}(\mu ))\le u\), so that \({\mathcal {N}}({{\,\mathrm{supp}\,}}(\mu ),j) \le O_\varepsilon (1) 2^{(u+\varepsilon )T j}\) for all \(j\in {\mathbb {N}}\). By (iv) above, this also holds for X in place of \({{\,\mathrm{supp}\,}}(\mu )\) if \(j\le \ell \). Meanwhile, using that \(\rho \) is \(\sigma \)-regular as in (3.1), we get

$$\begin{aligned} {\mathcal {N}}(X,j) = |{\mathcal {D}}_j(\rho )| \ge 2^{T(\sigma _1+1)} \cdots 2^{T(\sigma _j+1)} \quad (1\le j\le \ell ). \end{aligned}$$

Combining these estimates, we deduce that \(2^{T(\sigma _1+\ldots +\sigma _j+j)} \le O_\varepsilon (1) 2^{(u+\varepsilon )T j}\), and hence

$$\begin{aligned} \sum _{i=1}^j \sigma _i \le \frac{O_\varepsilon (1)}{T} + (u-1+\varepsilon )j \le (u-1)j+\ell o_{T,\varepsilon } (1) \qquad (j=1,\ldots ,\ell ), \end{aligned}$$
(6.8)

provided \(\ell _0\) was taken large enough in terms of \(\varepsilon \).

Combining (6.7) and (6.8), we see that the assumptions of Proposition 5.23 are satisfied with \(\gamma =s-1\), \(\Gamma =u-1\), and \(\zeta =o_{T,\varepsilon }(1)\). After another short calculation (recorded below), and starting with \(\ell _0\) large enough in terms of \(\tau \), we deduce that

$$\begin{aligned} \frac{1}{\ell } {\mathbf {M}}_\tau (\sigma ) \le 1 - \psi (s,u) + o_{T,\varepsilon ,\tau }(1). \end{aligned}$$
(6.9)
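The short calculation is the following identity: substituting \(\gamma =s-1\) and \(\Gamma =u-1\) into the bound of Proposition 5.23,

$$\begin{aligned} \frac{(1-\gamma )(\Gamma -2\gamma )}{1+2\Gamma -3\gamma } = \frac{(2-s)(u+1-2s)}{2+2u-3s} = \frac{(2+2u-3s)-s(2+u-2s)}{2+2u-3s} = 1-\psi (s,u), \end{aligned}$$

while the error terms \(14\sqrt{\zeta }+144\tau +O_\tau (\log \ell /\ell )\) are absorbed into \(o_{T,\varepsilon ,\tau }(1)\) once \(\ell _0\) (and hence \(\ell \)) is large enough.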

Recall from (6.4) that if \(x\in A_1\), then \(\theta (x,y)\notin {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x)\). Hence, according to the definition of the sets \({{\,\mathrm{\mathbf {Bad}}\,}}'_{\ell _0\rightarrow \ell }(R_\ell \mu ,x)\) and \({{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x)\) in (3.3) and (3.4) respectively, we have \(\theta (x,y)\notin {{\,\mathrm{\mathbf {Bad}}\,}}'_{\varepsilon \ell \rightarrow \ell }(R_\ell \mu ,x)={{\,\mathrm{\mathbf {Bad}}\,}}_{\varepsilon \ell \rightarrow \ell }(\rho ,x)\) for all \(x\in A_1\cap X\). Since we have assumed that \(A_2\subset R_\ell (A_1)\), the hypotheses of Proposition 4.4 are met by \(\rho \) and \(A_2\), with \(\beta =\varepsilon \) (the separation assumption follows from (6.5)). Recalling (i), we see that if \(\ell _0\) was taken even larger in terms of \(T,\varepsilon ,\tau \) we can make the error term in Proposition 4.4 equal to \(o_{T,\varepsilon ,\tau }(1)\). In light of (6.9), Proposition 4.4 gives exactly (6.6).

This completes the proof of the claim and, with it, of Proposition 6.1 and Theorem 1.2.

6.2 Proof of Theorem 1.3.

In this section we prove Theorem 1.3. The proof goes along the same lines as the Proof of Theorem 1.2, except that we rely on Proposition 5.24 instead of Proposition 5.23 to choose the scales in the multi-scale decomposition. The need to deal with two different scales \(2^{-TL}\) and \(2^{-T\ell }\) also creates some additional challenges. Write

$$\begin{aligned} \psi (s) = \frac{1+s+\sqrt{3s(2-s)}}{4}. \end{aligned}$$

(This should not be confused with the function \(\psi (s,u)\) from Section 6.1.) The next proposition contains the core of Theorem 1.3.

Proposition 6.3

For every \(1<s<3/2\), the following holds.

Let \(\mu \in {\mathcal {P}}([0,1)^2)\) satisfy \({\mathcal {E}}_s(\mu )<\infty \). If \(B\subset [0,1)^2\) is a compact set disjoint from \({{\,\mathrm{supp}\,}}(\mu )\) with \({{\,\mathrm{dim_H}\,}}(B)>1\), then

$$\begin{aligned} \sup _{y\in B} {{\,\mathrm{{\overline{\dim }}_B}\,}}(\Delta _y ({{\,\mathrm{supp}\,}}\mu )) \ge \psi (s). \end{aligned}$$

Proof of Theorem 1.3 (assuming Proposition 6.3)

Reasoning as in the deduction of Theorem 1.2 from Proposition 6.1, we get that if U is a Borel subset of \({\mathbb {R}}^2\) with \({{\,\mathrm{dim_H}\,}}(U)\ge t\in (1,3/2)\), then

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}\{ y\in {\mathbb {R}}^2: {{\,\mathrm{{\overline{\dim }}_B}\,}}(\Delta _y(U)) < \psi (t) \} \le 1. \end{aligned}$$
(6.10)

The reason we need to go via box dimension is that the map \(y\mapsto {{\,\mathrm{{\overline{\dim }}_B}\,}}\Delta _y(U)\) is Borel if U is compact, while it is unclear whether the map \(y\mapsto {{\,\mathrm{dim_P}\,}}\Delta _y(U)\) is Borel, since it was proved in [MM97] that packing dimension is not a Borel function of the set if one considers the Hausdorff metric on the compact subsets of \({\mathbb {R}}\).

Now suppose the claim of Theorem 1.3 does not hold. Then we can find a Borel set \(A\subset {\mathbb {R}}^2\) with \({{\,\mathrm{dim_H}\,}}(A)=s\in (1,3/2)\) and \(\eta >0\) such that

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}\{ y\in {\mathbb {R}}^2: {{\,\mathrm{dim_P}\,}}(\Delta _y(A)) < \psi (s)-\eta \} > 1. \end{aligned}$$

Let \(\nu \) be a Frostman measure on A of exponent \(t\in (1,s)\), with t sufficiently close to s that \(\psi (t)\ge \psi (s)-\eta \), and note that

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}\{ y\in {\mathbb {R}}^2: {{\,\mathrm{dim_P}\,}}(\Delta _y({{\,\mathrm{supp}\,}}(\nu ))) < \psi (t) \} > 1. \end{aligned}$$
(6.11)

Fix a countable basis \((U_i)\) of open sets of \({{\,\mathrm{supp}\,}}(\nu )\) (in the relative topology). Note that \({{\,\mathrm{dim_H}\,}}(U_i) \ge t\) for all i since \(\nu \) is a Frostman measure. Hence, from (6.10) we get that \({{\,\mathrm{dim_H}\,}}(E)\le 1\), where

$$\begin{aligned} E = \{ y \in {\mathbb {R}}^2: {{\,\mathrm{{\overline{\dim }}_B}\,}}(\Delta _y(U_i)) < \psi (t) \text { for some } i\}. \end{aligned}$$

Fix \(y\in {\mathbb {R}}^2\setminus E\). Let \((F_j)\) be a countable cover of \(\Delta _y({{\,\mathrm{supp}\,}}(\nu ))\). By Baire’s Theorem, some \(\Delta _y^{-1}({\overline{F}}_j)\) has nonempty interior in \({{\,\mathrm{supp}\,}}(\nu )\), and hence contains some \(U_i\). By the definition of E,

$$\begin{aligned} {{\,\mathrm{{\overline{\dim }}_B}\,}}(F_j) = {{\,\mathrm{{\overline{\dim }}_B}\,}}({\overline{F}}_j) \ge {{\,\mathrm{{\overline{\dim }}_B}\,}}(\Delta _y(U_i)) \ge \psi (t). \end{aligned}$$

By the characterization of packing dimension as modified upper box counting dimension [Fal14, Proposition 3.8], we conclude that \({{\,\mathrm{dim_P}\,}}(\Delta _y({{\,\mathrm{supp}\,}}(\nu ))) \ge \psi (t)\) whenever \(y\in {\mathbb {R}}^2\setminus E\). Since \({{\,\mathrm{dim_H}\,}}(E)\le 1\), this contradicts (6.11), finishing the proof. \(\square \)

We now start the Proof of Proposition 6.3. Let \(\nu \) be a measure supported on B with finite u-energy for some \(u>1\), and let \(\kappa =\kappa (\mu ,\nu )>0\) be the number given by Proposition 3.12. Apply Lemma 3.10 to obtain the bound

$$\begin{aligned} |{{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x)|&\le \kappa \text { for all } x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu ) \end{aligned}$$

provided \(\ell _0\) was taken large enough in terms of \(T,\varepsilon ,\tau \). Recall that the set \(\Theta \) in Equation (6.3) is Borel. Applying Proposition 3.12 to \(\Theta \) and Fubini, we obtain a compact set \(A\subset {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu )\) with \(\mu (A)> 2/3\) and a point \(y\in B\) such that

$$\begin{aligned} P_y(x) \notin {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu ,x) \text { for all } x\in A. \end{aligned}$$
(6.12)

Fix a number \(\delta >0\). We will show that

$$\begin{aligned} {{\,\mathrm{{\overline{\dim }}_B}\,}}(\Delta _y A) \ge \psi (s)- {{\,\mathrm{Error}\,}}_{T,\varepsilon ,\tau }(\delta ), \end{aligned}$$

where \({{\,\mathrm{Error}\,}}_{T,\varepsilon ,\tau }(\delta )\) can be made arbitrarily small by first taking \(\delta \) small enough, and then taking T large enough and \(\varepsilon ,\tau \) small enough, all in terms of \(\delta \). This error term may also depend on s.

Fix a large integer \(\ell \gg \ell _0\). We claim that it is enough to find a scale \(L\in [\ell _0,\ell ]\), tending to infinity with \(\ell \), such that

$$\begin{aligned} \frac{\log {\mathcal {N}}(\Delta _y(R_L A),L)}{TL} \ge \psi (s) - {{\,\mathrm{Error}\,}}_{T,\varepsilon ,\tau }(\delta ), \end{aligned}$$
(6.13)

where the error term has the property detailed above. Indeed, since \(\Delta _y(R_L A)\) is contained in the \(O(2^{-TL})\)-neighborhood of \(\Delta _y(A)\), this implies the corresponding lower bound for \({{\,\mathrm{{\overline{\dim }}_B}\,}}(\Delta _y(A))\).

Apply Corollary 3.5 to \(R_{\ell }\mu \). Taking \(\ell \) large enough that \(2^{-\varepsilon T\ell }\ll 2/3<\mu (A)\le \mu (R_\ell A)\), there is a \(2^{-T\ell }\)-set X such that, setting \(\rho =(R_\ell \mu )_{X}\),

  1. (i)

    \(\rho (R_\ell A) \ge 1/2\).

  2. (ii)

    \(R_\ell \mu (X) \ge 2^{-o_{T,\varepsilon }(1) T\ell }\) whence, as we saw in the Proof of Proposition 6.1,

    $$\begin{aligned} {\mathcal {E}}_s(\rho ) \lesssim _T 2^{o_{T,\varepsilon }(1) T\ell }. \end{aligned}$$
  3. (iii)

    \(\rho \) is \(\sigma \)-regular for some sequence \(\sigma =(\sigma _1,\ldots ,\sigma _\ell )\), \(\sigma _j\in [-1,1]\).

Note that, provided \(\ell _0\) was taken large enough, (6.7) still holds, since it only depends on (ii) and (iii). We are then in the setting of Proposition 5.24 with \(\gamma =s-1\), \(\zeta =o_{T,\varepsilon }(1)\). Let \(\eta =\eta (\delta ,s-1)>0\) be the number given by the proposition; we underline that, since \(\delta \) is chosen before \(T,\varepsilon ,\tau \), the number \(\eta \) is also independent of \(T,\varepsilon ,\tau \) (it is useful to keep in mind that \(\eta \) does depend on \(\delta \)). A short calculation, spelled out after (6.14), shows that \(1-\psi (s)=\Phi (s-1)\). From now on we assume that \(\ell \) is taken large enough (in terms of \(\delta \) and s) that \(\eta \ell \ge \ell _0\). Then, applying Proposition 5.24 and making \(\ell \) even larger, we get an integer \(L \in [\eta \ell ,\ell ]\subset [\ell _0,\ell ]\) such that

$$\begin{aligned} \frac{1}{L} {\mathbf {M}}_\tau (\sigma |(0,L]) \le 1-\psi (s) + \eta ^{-1} o_{T,\varepsilon ,\tau }(1) +\delta . \end{aligned}$$
(6.14)
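The short calculation promised above amounts to:

$$\begin{aligned} \Phi (s-1) = \frac{2-(s-1)-\sqrt{3-3(s-1)^2}}{4} = \frac{3-s-\sqrt{3s(2-s)}}{4} = 1-\psi (s), \end{aligned}$$

where we used \(3-3(s-1)^2=3\big (1-(s-1)^2\big )=3s(2-s)\).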

Note that \(R_L\rho \) is \((\sigma _1,\ldots ,\sigma _L)\)-regular. Also, if \(x\in A\cap X\), then

$$\begin{aligned} \theta (x,y)&\notin {{\,\mathrm{\mathbf {Bad}}\,}}'_{\varepsilon \ell \rightarrow \ell }(R_\ell \mu ,x)&(\text {by }(6.12) \text { and } (3.4))\\&= {{\,\mathrm{\mathbf {Bad}}\,}}_{\varepsilon \ell \rightarrow \ell }(\rho ,x)&(\text {by } (3.3), \text { since } X \text { came from Corollary 3.5}) \\&\supset {{\,\mathrm{\mathbf {Bad}}\,}}_{(\varepsilon /\eta ) L\rightarrow L}(\rho ,x)&(\text {by Definition}~3.8, \varepsilon \ell \le (\varepsilon /\eta )L \text { and } L\le \ell )\\&= {{\,\mathrm{\mathbf {Bad}}\,}}_{(\varepsilon /\eta ) L\rightarrow L}(R_L\rho ,x)&(\text {by Definition}~3.8). \end{aligned}$$

Note that \(x\mapsto {{\,\mathrm{\mathbf {Bad}}\,}}_{(\varepsilon /\eta ) L\rightarrow L}(R_L \rho ,x)\) is constant on each square of \({\mathcal {D}}_L(R_L\rho )\). Hence, for each \(x\in R_L(A\cap X)\) there is \(\widetilde{x}\in A\cap X\subset R_L(A\cap X)\) such that \(\theta (\widetilde{x},y)\notin {{\,\mathrm{\mathbf {Bad}}\,}}_{(\varepsilon /\eta )L\rightarrow L}(R_L\rho ,\widetilde{x})\). Assume \(\varepsilon <\eta \). If \(\varepsilon \) was taken small enough and \(\ell _0\) large enough that \(\mathrm {dist}(B,{{\,\mathrm{supp}\,}}(\mu ))\ge \varepsilon + \sqrt{2}\cdot 2^{-\ell _0}\), then all the hypotheses of Proposition 4.4 are satisfied for \(R_L\rho \), \(R_L(A\cap X)\) and L in place of \(\rho , A\) and \(\ell \), with \(\beta =\varepsilon /\eta \). Using (i) above (which implies \(R_L\rho (R_L A) \ge 1/2\)) and the bound \(L\ge \eta \ell \), the error term in Proposition 4.4 can be bounded by

$$\begin{aligned} \frac{2\varepsilon }{\eta } + o_{T,\varepsilon }(1) + O_{T,\varepsilon ,\tau }\left( \frac{\log ^2(\eta \ell )}{\eta \ell }\right) . \end{aligned}$$

Making \(\ell \) large enough in terms of \(T,\varepsilon ,\tau ,\delta \) and s, this error term can be made \(o_{T,\varepsilon }(1)+2\varepsilon \eta ^{-1}\). Hence Proposition 4.4 together with (6.14) ensure that (6.13) holds, with the error behaving as claimed.

Since \(L\ge \eta \ell \) and \(\ell \) is arbitrarily large, L is also arbitrarily large. Hence we have shown that (6.13) holds for arbitrarily large L and, as explained above, this completes the Proof of Proposition 6.3 and, with it, of Theorem 1.3.

6.3 Proof of Theorem 1.4.

In this section we prove Theorem 1.4. Throughout this section, we let \(\Delta :{\mathbb {R}}^4\rightarrow {\mathbb {R}}\), \((x,y)\mapsto |x-y|\). We start by recalling a more quantitative version of the Mattila-Wolff bound (1.1).

Theorem 6.4

Suppose \(\mu _1,\mu _2\in {\mathcal {P}}([0,1)^2)\) have \(\varepsilon \)-separated supports. If \({\mathcal {E}}_{4/3}(\mu _1)<\infty \) and \({\mathcal {E}}_{4/3+\varepsilon }(\mu _2)<\infty \), then \(\Delta (\mu _1\times \mu _2)\) has an \(L^2\) density, and

$$\begin{aligned} \Vert \Delta (\mu _1\times \mu _2)\Vert _2^2 \lesssim _\varepsilon {\mathcal {E}}_{4/3}(\mu _1) {\mathcal {E}}_{4/3+\varepsilon }(\mu _2). \end{aligned}$$

Proof

Given \(\mu \in {\mathcal {P}}([0,1)^2)\), let

$$\begin{aligned} \varvec{\sigma }(\mu ,r)&= \int _{S^1} |{\widehat{\mu }}(\theta r)|^2 \,d\theta ,\\ \varvec{\sigma }_\alpha (\mu )&= \sup \{ r^\alpha \varvec{\sigma }(\mu ,r) : r> 0 \}. \end{aligned}$$

Mattila [Mat87, Corollary 4.9] proved that

$$\begin{aligned} \Vert \Delta (\mu _1\times \mu _2)\Vert _2^2 \lesssim _\varepsilon {\mathcal {E}}_\alpha (\mu _1) \varvec{\sigma }_{2-\alpha }(\mu _2). \end{aligned}$$

We remark that in [Mat87] this is proved for a weighted version of the distance measure (see [Mat87, Eq. (4.1)]), but the weight \(u^{-1/2}\) lies in the interval \([(\sqrt{2})^{-1/2}, \varepsilon ^{-1/2}]\) by our assumption that the supports of \(\mu _1,\mu _2\) are \(\varepsilon \)-separated and contained in \([0,1)^2\). Later Wolff [Wol99, Theorem 1] proved that for any \(\mu \in {\mathcal {P}}([0,1)^2)\),

$$\begin{aligned} \varvec{\sigma }_{\beta /2}(\mu ) \lesssim _{\beta ,\varepsilon } {\mathcal {E}}_{\beta +\varepsilon }(\mu ), \end{aligned}$$

and this is sharp up to the \(\varepsilon \) when \(\beta \in (1,2)\). See also [Mat15, Chapters 15 and 16] for an exposition of these arguments. Combining these estimates with \(\alpha =\beta =4/3\) (so that \(2-\alpha =\beta /2=2/3\)) yields the claim. \(\square \)

In the proof we will also require the following well-known lemma, whose proof we include for completeness.

Lemma 6.5

Let \(f\in L^2({\mathbb {R}})\) satisfy \(\int f dx=1\). Then \({\mathcal {N}}({{\,\mathrm{supp}\,}}(f),L) \ge 2^{T L}/\Vert f\Vert _2^2\) for all \(L\in {\mathbb {N}}\).

Proof

Using Cauchy-Schwarz and Jensen’s inequality, we estimate

$$\begin{aligned} 1&= \left( \sum _{I\in {\mathcal {D}}_L} \int _I f \right) ^2 \\&\le {\mathcal {N}}({{\,\mathrm{supp}\,}}(f),L) \sum _{I\in {\mathcal {D}}_L} \left( \int _I f\right) ^2 \\&\le {\mathcal {N}}({{\,\mathrm{supp}\,}}(f),L) \sum _{I\in {\mathcal {D}}_L} 2^{-T L} \int _I f^2\\&= 2^{-T L} {\mathcal {N}}({{\,\mathrm{supp}\,}}(f),L) \Vert f\Vert _2^2. \end{aligned}$$

\(\square \)

Proof of Theorem 1.4

As usual fix \(T\gg 1, \varepsilon ,\tau \ll 1\). Let \(\Lambda (x)\) be the function defined in Proposition 5.25. A calculation shows that

$$\begin{aligned} \Lambda (s-1) =\frac{s(147-170s+60s^2)}{18(12-14s+5s^2)}. \end{aligned}$$
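For the reader's convenience, the substitution \(x=s-1\) in the formula for \(\Lambda \) from Proposition 5.25 runs as follows:

$$\begin{aligned} 1+(s-1)&=s,\\ 37-50(s-1)+60(s-1)^2&=147-170s+60s^2,\\ 3-4(s-1)+5(s-1)^2&=12-14s+5s^2. \end{aligned}$$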

As \(\Lambda \) is continuous, it is enough to show that if \(A\subset {\mathbb {R}}^2\) is a Borel set with \({{\,\mathrm{dim_H}\,}}(A)>s\), then \({{\,\mathrm{dim_H}\,}}(\Delta (A\times A))\ge \Lambda (s-1)\). It is enough to consider the case in which A is bounded. After translating and rescaling A, we may further assume that \(A\subset [0,1)^2\).

Let \(\mu _1,\mu _2\in {\mathcal {P}}([0,1)^2)\) be measures supported on A such that \({\mathcal {E}}_{s}(\mu _1),{\mathcal {E}}_{s}(\mu _2)<\infty \), and their supports are \((2\varepsilon )\)-separated (making \(\varepsilon \) smaller if needed). Any implicit constants arising in the proof may depend on \(\mu _1\), \(\mu _2\) and s.

Let \(\kappa _1,\kappa _2>0\) be the numbers given by Proposition 3.12 applied to \(\mu _1,\mu _2\) and \(\mu _2,\mu _1\) in place of \(\mu ,\nu \) respectively, and set \(\kappa =\min (\kappa _1,\kappa _2)\).

Pick \(\ell _0\) large enough in terms of \(T,\varepsilon ,\tau \) that, invoking Lemma 3.10,

$$\begin{aligned} |{{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu _i,x)| \le \kappa \text { for all } x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu _i),\quad i=1,2. \end{aligned}$$

Let

$$\begin{aligned} \Theta _i = \{ (x,\theta ): x\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu _i), \theta \in {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu _i,x)\}, \quad i=1,2. \end{aligned}$$

Applying Proposition 3.12 first with \(\mu _1,\mu _2\) and \(\Theta _1\) in place of \(\mu ,\nu ,\Theta \) and then with \(\mu _2,\mu _1\) and \(\Theta _2\) in place of \(\mu ,\nu ,\Theta \), we get that there exists a compact set \(G\subset {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu _1)\times {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu _2)\) such that \((\mu _1\times \mu _2)(G)>1/3\) and

$$\begin{aligned} \theta (x,y)\not \in {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu _1,x)\,\text { and } \,\theta (y,x)\not \in {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu _2,y)\quad \text {for all }(x,y)\in G. \end{aligned}$$
(6.15)

We write \(\mu =\mu _1\times \mu _2\) from now on. Denote \(s_0=\Lambda (s-1)\). Our goal is to show that

$$\begin{aligned} {{\,\mathrm{dim_H}\,}}(\Delta (G)) \ge s_0. \end{aligned}$$

Since \(\Delta (G)\subset \Delta (A\times A)\), this will establish the theorem. In turn, since a Borel set \(F'\subset {\mathbb {R}}\) satisfies \((\Delta \mu _G)(F')> \ell ^{-2}\) if and only if \(B=\Delta ^{-1}(F')\) satisfies \(\mu _G(B)> \ell ^{-2}\), according to Lemma 6.2, in order to complete the proof it is enough to prove the following claim. \(\square \)

Claim

The following holds if \(\ell \) is large enough in terms of \(\mu ,T,\varepsilon ,\tau \): if B is a Borel subset of \([0,1)^2\times [0,1)^2\) such that \(\mu _G(B) > \ell ^{-2}\), then

$$\begin{aligned} \log {\mathcal {N}}(\Delta (B),\ell ) \ge T\ell (s_0-o_{T,\varepsilon ,\tau }(1)). \end{aligned}$$
(6.16)

We start the proof of the claim. Firstly, replacing B by a compact subset of almost the same measure we may assume that B is compact. We may assume also that \(B\subset G\). Note that \(\mu (B) = \mu _G(B)\mu (G)\ge \ell ^{-2}/3\).

Let \((X_k^{(i)})_{k=1}^{N_i}\) be the \(2^{-T\ell }\)-sets given by Corollary 3.5 applied to \(R_\ell (\mu _i)\). Note that we have a disjoint union

$$\begin{aligned} {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(R_\ell \mu _i)= \left( \bigcup _{k=1}^{N_i} X_k^{(i)} \right) \cup \widetilde{X}^{(i)},\quad \text {where }\mu _i(\widetilde{X}^{(i)}) =R_\ell \mu _i(\widetilde{X}^{(i)}) \le 2^{-\varepsilon T\ell }. \end{aligned}$$
(6.17)

Write \(\rho _k^{(i)} = (R_\ell \mu _i)_{X_k^{(i)}}\). Note that \(\rho _k^{(i)}\) is \(\sigma _k^{(i)}\)-regular for some \(\sigma _k^{(i)} \in [-1,1]^\ell \); in particular, it is a \(2^{-T\ell }\)-measure. Also, by Lemma 3.1 and Corollary 3.5(ii), and using our assumption that \({\mathcal {E}}_{s}(\mu _i)<\infty \),

$$\begin{aligned} {\mathcal {E}}_{s}(\rho _k^{(i)}) \le \left( R_\ell \mu _i(X_k^{(i)})\right) ^{-2} {\mathcal {E}}_{s}(R_\ell \mu _i) \lesssim _T 2^{o_{T,\varepsilon }(1)T\ell }. \end{aligned}$$

Hence, using Lemma 3.3 and increasing the value of \(\ell _0\) again, any \(\sigma =\sigma _k^{(i)}\) satisfies

$$\begin{aligned} \sigma _1+\cdots +\sigma _j \ge (s-1)j - \zeta \ell \qquad (j=1,\ldots ,\ell ), \end{aligned}$$
(6.18)

where \(\zeta =o_{T,\varepsilon }(1)\). By starting with appropriate \(T,\varepsilon \), we may assume that \(\zeta <(s-1)^2\).

If \(\rho \) is \(\sigma \)-regular, we write

$$\begin{aligned} {\mathbf {D}}(\rho ) = 1 - \frac{1}{\ell } {\mathbf {M}}_\tau (\sigma ). \end{aligned}$$

Let \(F_i\subset {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(R_\ell \mu _i)\) be the union of the sets \(X_k^{(i)}\) over all k such that \({\mathbf {D}}(\rho _k^{(i)})\ge s_0-\delta \), where

$$\begin{aligned} \delta = 3\sqrt{\zeta } + 145\tau . \end{aligned}$$
(6.19)

Note that, since the \(X_k^{(i)}\) are \(2^{-T\ell }\)-sets, so is each \(F_i\).

Consider two (not mutually exclusive) cases:

  1. (a)

    Either \(\mu _{B}(F_1\times {\mathbb {R}}^2)\ge 1/3\) or \(\mu _{B}({\mathbb {R}}^2 \times F_2)\ge 1/3\) (or both).

  2. (b)

    \(\mu _{B}(({\mathbb {R}}^2\setminus F_1)\times ({\mathbb {R}}^2\setminus F_2))\ge 1/3\).

Roughly speaking, in the first case we will argue as in the Proof of Theorem 1.2, while in case (b) we will appeal to Proposition 5.25.

Assume then that (a) holds. Without loss of generality, suppose \(\mu _{B}(F_1\times {\mathbb {R}}^2)\ge 1/3\). Instead of showing that (6.16) holds directly for B, we will show that it holds for the set

$$\begin{aligned} B' = \bigcup _{y\in [0,1)} (R_\ell B_y \times \{y\}) \end{aligned}$$

where, for the rest of this section, given \(A\subset {\mathbb {R}}^2\times {\mathbb {R}}^2\) we denote its “horizontal” sections by \(A_y = \{ x: (x,y)\in A\}\) (for \(y\in {\mathbb {R}}^2\)). In other words, to form \(B'\) we make each horizontal fiber of B into a union of squares in \({\mathcal {D}}_\ell \). One can check that \(B'\) is Borel (in fact, \(\sigma \)-compact). Since \(B\subset B'\subset R_\ell B\), the numbers \({\mathcal {N}}(\Delta (B'),\ell )\) and \({\mathcal {N}}(\Delta (B),\ell )\) differ by at most a multiplicative constant so that proving (6.16) for \(B'\) implies it also for B. Since we are assuming that \(B\subset G\), we have that \(B'\subset G'\), where \(G'\) is defined analogously to \(B'\).

Using Fubini, that \(F_1 = R_\ell F_1\), our definition of \(B'\), the assumption \(\mu _{B}(F_1\times {\mathbb {R}}^2)\ge 1/3\), and the fact that \(\mu (B)\ge \ell ^{-2}/3\), we get

$$\begin{aligned} (R_\ell \mu _1 \times \mu _2) ((F_1\times {\mathbb {R}}^2) \cap B')&= \int R_\ell \mu _1 (F_1 \cap B'_y) \,d\mu _2 (y) \\&= \int \mu _1 (F_1 \cap R_\ell B_y) \,d\mu _2 (y) \\&= \mu ( (F_1 \times {\mathbb {R}}^2) \cap B' ) \\&\ge \mu _B (F_1 \times {\mathbb {R}}^2) \mu (B) \ge \ell ^{-2} / 9. \end{aligned}$$

Applying (6.17) with \(i=1\), we can decompose

$$\begin{aligned} R_\ell \mu _1 = (R_\ell \mu _1)|_{\widetilde{X}^{(1)}} + \sum _{k=1}^{N_1} \mu _1(X_k^{(1)}) \rho _k^{(1)}. \end{aligned}$$

Hence, using that \(R_\ell \mu _1({\widetilde{X}}^{(1)}) \le 2^{-\varepsilon T\ell }\) and taking \(\ell \) large enough, there exists k such that

$$\begin{aligned} (\rho _k^{(1)}\times \mu _2)( (F_1\times {\mathbb {R}}^2) \cap B') \ge \ell ^{-2}/9 - 2^{-\varepsilon T\ell } \ge \ell ^{-2}/10. \end{aligned}$$

By the definition of \(F_1\), we must have \({\mathbf {D}}(\rho _k^{(1)})\ge s_0-\delta \), and \({{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\rho _k^{(1)})\subset F_1\). By Fubini, we can find \(y\in {{\,\mathrm{supp}\,}}_{{\mathsf {d}}}(\mu _2)\) such that

$$\begin{aligned} \rho _k^{(1)} (B'_y) \ge \ell ^{-2}/10. \end{aligned}$$
(6.20)

Since \(B'\subset G'\subset R_\ell G\), we know that if \(x\in B'_y\) then there exists \(\widetilde{x}\in {\mathcal {D}}_\ell (x)\) such that \((\widetilde{x},y)\in G\). By (6.15), this implies that \(\theta (\widetilde{x},y)\notin {{\,\mathrm{\mathbf {Bad}}\,}}''_{\ell _0}(\mu _1,\widetilde{x})\). Recalling the definitions (3.3), (3.4), we have shown that the hypotheses of Proposition 4.4 hold for \(\rho _k^{(1)}\) and \(B'_y\), with \(\beta =\varepsilon \) (the separation between y and \({{\,\mathrm{supp}\,}}(\rho _k^{(1)})\) follows from the fact that the supports of \(\mu _1\) and \(\mu _2\) are \((2\varepsilon )\)-separated, making \(\ell \) larger again). Recalling (6.20), we see that the error term in Proposition 4.4 can be made \(\le o_{T,\varepsilon ,\tau }(1)\) by making \(\ell \) even larger. Applying the proposition, and recalling that \({\mathbf {D}}(\rho _k^{(1)}) \ge s_0-\delta \), where \(\delta =o_{T,\varepsilon ,\tau }(1)\) was defined in (6.19), we conclude that (for this fixed value of y)

$$\begin{aligned} \log {\mathcal {N}}(\Delta _y(B'_y),\ell ) \ge T\ell (s_0-o_{T,\varepsilon ,\tau }(1)), \end{aligned}$$

and hence the same lower bound holds for \(\log {\mathcal {N}}(\Delta (B'),\ell )\). This concludes the proof of the claim in case (a).

We now consider case (b). Since \({\mathcal {N}}(\Delta (B),\ell )\) and \({\mathcal {N}}(\Delta (R_\ell B),\ell )\) differ by at most a multiplicative constant, it is enough to prove (6.16) for \(R_\ell B\) in place of B. It follows from the assumption of case (b), the decomposition (6.17) for both \(\mu _1, \mu _2\), and the definitions of the sets \(F_i\) that

$$\begin{aligned} \frac{1}{3}\mu (B)&\le \mu \left( \big (({\mathbb {R}}^2\setminus F_1)\times ({\mathbb {R}}^2\setminus F_2)\big ) \cap B\right) \\&\le 2\cdot 2^{-\varepsilon T\ell } + \sum _{(k_1,k_2):{\mathbf {D}}(\rho _{k_i}^{(i)})<s_0-\delta } R_\ell \mu _1(X_{k_1}^{(1)}) R_\ell \mu _2(X_{k_2}^{(2)}) \left( \rho _{k_1}^{(1)}\times \rho _{k_2}^{(2)} \right) (R_\ell B) . \end{aligned}$$

Hence, using that \(\mu (B)\ge \ell ^{-2}/3\), we can find \(k_1,k_2\) such that \({\mathbf {D}}(\rho _{k_i}^{(i)})<s_0-\delta \) for \(i=1,2\), and

$$\begin{aligned} \left( \rho _{k_1}^{(1)}\times \rho _{k_2}^{(2)} \right) (R_\ell B) \ge \frac{1}{3}\mu (B) - 2\cdot 2^{-\varepsilon T\ell } \ge \frac{1}{10}\ell ^{-2}, \end{aligned}$$
(6.21)

if \(\ell \) is large enough in terms of \(\varepsilon \).

Write \(\rho _{k_i}^{(i)} = \rho '_i\) for simplicity. In light of (6.18) and our earlier assumption \(\zeta <(s-1)^2\), the hypothesis of Proposition 5.25 holds for the sequences \(\sigma \) arising from both \(\rho '_1\) and \(\rho '_2\), with \(\gamma =s-1\). Let \(\eta >0, \xi \in (2/3,1]\) be the numbers given in the proposition (they depend on \(s-1-\sqrt{\zeta }\)). Since \({\mathbf {D}}(\rho '_i)<\Lambda (s-1)-\delta \), where \(\delta \) was defined in (6.19), if we take \(\ell \) sufficiently large, then the alternative (i) in Proposition 5.25 must hold.

Let \(\rho ''_i=R_{\lfloor \xi \ell \rfloor }(\rho '_i)\). Note that if \(\rho '_i\) is \((\sigma _1,\ldots ,\sigma _{\ell })\)-regular, then \(\rho ''_i\) is \((\sigma _1,\ldots ,\sigma _{\lfloor \xi \ell \rfloor })\)-regular. Using Lemma 3.3 (with \(\lfloor \xi \ell \rfloor \) in place of \(\ell \)), recalling that \(\zeta =o_{T,\varepsilon ,\tau }(1)\) and that the alternative (i) in Proposition 5.25 holds, and making \(\ell \) larger if needed, we get

$$\begin{aligned} \log {\mathcal {E}}_{4/3}(\rho ''_i) \le \xi T\ell (\eta +o_{T,\varepsilon ,\tau }(1)) \quad (i=1,2). \end{aligned}$$

On the other hand, we see from Lemma 3.1 that

$$\begin{aligned} {\mathcal {E}}_{4/3+\varepsilon }(\rho ''_i) \lesssim _{T,\varepsilon } 2^{\varepsilon \xi T\ell } {\mathcal {E}}_{4/3}(\rho ''_i). \end{aligned}$$

We apply Theorem 6.4 (together with the last two displayed equations) to get

$$\begin{aligned} \Vert \Delta (\rho ''_1\times \rho ''_2)\Vert _2^2&\lesssim _{T,\varepsilon } 2^{\varepsilon \xi T\ell } {\mathcal {E}}_{4/3}(\rho ''_1) {\mathcal {E}}_{4/3}(\rho ''_2)\\&\le 2^{o_{T,\varepsilon ,\tau }(1) \xi T\ell } 2^{2 \eta \xi T\ell }. \end{aligned}$$

It follows from (6.21) that \((\rho ''_1\times \rho ''_2)(R_{\lfloor \xi \ell \rfloor }B)\ge \ell ^{-2}/10\). We deduce that, for \(\ell \) large enough,

$$\begin{aligned} \log \left\| \Delta \big ( \left( \rho ''_1 \times \rho ''_2\right) _{R_{\lfloor \xi \ell \rfloor }B} \big )\right\| _2^2 \le \xi T\ell (2\eta +o_{T,\varepsilon ,\tau }(1)). \end{aligned}$$

Applying Lemma 6.5 to \(f=\Delta \big ( \left( \rho ''_1 \times \rho ''_2\right) _{R_{\lfloor \xi \ell \rfloor }B} \big )\) and \(L=\lfloor \xi \ell \rfloor \), we conclude that

$$\begin{aligned} \log {\mathcal {N}}(\Delta (B),\ell )&\ge \log {\mathcal {N}}(\Delta (B),\lfloor \xi \ell \rfloor ) \\&\gtrsim \log {\mathcal {N}}(\Delta (R_{\lfloor \xi \ell \rfloor } B),\lfloor \xi \ell \rfloor ) \ge \xi T\ell (1-2\eta -o_{T,\varepsilon ,\tau }(1)) \end{aligned}$$

for \(\ell \) sufficiently large. Since \(\xi (1-2\eta )\ge s_0-\sqrt{\zeta }\) by Proposition 5.25 and \(\zeta =o_{T,\varepsilon ,\tau }(1)\), this concludes the proof of case (b) of the claim, which completes the Proof of Theorem 1.4. \(\square \)

7 Sharpness of the Results

It is natural to ask what parts of our approach are sharp and which are not. In this section we show that the results of Section 5 are sharp, up to error terms. Hence, if the main results are not sharp (which seems likely), this is not due to the estimates for \({\mathbf {M}}_\tau (\sigma )\), but rather to the fact that Proposition 4.4 (which connects the value of \({\mathbf {M}}_\tau (\sigma )\) to the size of distance sets) is itself not sharp.

We begin by showing that Proposition 5.2 is sharp for all parameter values (and even the value of f(a) can be chosen as an arbitrary \(b\in [Da,Ca]\)). This is illustrated by the following functions.

First consider the case when \(C=1\). Let \(x_0=y_0=0\), \(x_1=y_1=y_2=\frac{(1+D)(a-b)}{3(1-D)}\), \(x_2=2x_1\), \(x_3=\frac{a-b}{1-D}\), \(y_3=Dx_3\), \(x_4=a\) and \(y_4=b\); let \(f(x_i)=y_i\) \((i=0,\ldots ,4)\) and let f be linear on every interval \([x_{i-1},x_i]\). (See Figure 2 for \(D=0\), \(a=1\) and \(b=0\), and note that in the most important \(b=Da\) case \(x_3=x_4\), so the graph consists of only three linear segments.) It is clear that \(f(a)=b\) and \( Dx\le f(x) \le Cx\) on [0, a]. It is easy to check that f is 1-Lipschitz. The fact that the first inequality of (5.1) holds with equality (and for \(b=aD\) also the second one) follows from the observation that the set of hard points of f is \([x_2,x_3]\) (recall Definition 5.3), Lemma 5.4, and a straightforward calculation.

Fig. 2

These are graphs of functions that witness the sharpness of Propositions 5.2 and 5.5 for \(a=1\), \(f(a)=D\) and various parameters C and D. In the top graph, the larger function is for Proposition 5.2 with \(D=0\), \(C=1\), and the smaller function is for Proposition 5.2 with \(D=0\), \(C=(\sqrt{3}-1)/2\) and also for Proposition 5.5 with \(D=0\). The graph on the bottom shows the function that witnesses the sharpness of Proposition 5.2 with \(D=1/7\) and \(C=4/7\) and of Proposition 5.5 with \(D=1/7\), together with the dashed lines \(y=x/7\) and \(y=4x/7\).

Now we consider the case when \(C<1\). Let \(q=\frac{(1+D)(1-C)}{(1-D)(2+C)}\). For \(k=0,1,\ldots \) let

$$\begin{aligned} x_{3k}=\frac{a-b}{1-C}q^k, \ \quad x_{3k+1}=\frac{a-b}{1-D}q^k, \ \quad x_{3k+2}=2x_{3k+3}, \ \end{aligned}$$

\(y_{3k}=Cx_{3k}\), \(y_{3k+1}=Dx_{3k+1}\) and \(y_{3k+2}=y_{3k+3}\). Let \(f(0)=0\), \(f(a)=b\), \(f(x_j)=y_j\) (\(j=1,2,\ldots \)) and let f be linear on \([x_1,a]\) and on each \([x_{j+1},x_j]\). (See Figure 2 for \(a=1\), \(b=D=0\), \(C=(\sqrt{3}-1)/2\), and for \(a=1\), \(b=D=1/7\), \(C=4/7\), and note again that because of \(b=Da\), the segment \([x_1,a]\) is degenerate in both cases.) Again, it is clear that \(f(a)=b\) and \(Dx\le f(x) \le Cx\) on [0, a], and it is easy to check that f is 1-Lipschitz. Now, observing that the set of hard points is \(\cup _{k=1}^\infty [x_{3k+2},x_{3k+1}]\), Lemma 5.4 and another straightforward calculation show that indeed \({\mathbf {T}}(f)=\frac{(a-f(a))(C-2D)}{1+2C-3D}\).

Proposition 5.5 is also sharp up to the error term: for any given \(D\in [0,1/2)\), we construct an f that satisfies the conditions and such that \({\mathbf {T}}(f|[0,u])\ge u\Phi (D)\) for every \(u\in (0,a]\).

Recall that at the end of the proof of Proposition 5.5 we claimed that \(\frac{(1-C)(C/2-D)}{1+2C-3D}\le \Phi (D)\) on [2D, 1]. The function \(\Phi (D)\) was of course chosen so that this is sharp; in fact, for \(C=\frac{3D-1+\sqrt{3-3D^2}}{2}\in (2D,1)\) we have equality. Let C be chosen this way, and let f be the function we obtained above when we showed the sharpness of Proposition 5.2 for these values of C and D. (See Figure 2 for \(a=1\), \(b=D=0\), and for \(a=1\), \(b=D=1/7\).) As was already mentioned above, the set of hard points of f is \(\cup _{k=1}^\infty [x_{3k+2},x_{3k+1}]\). By Lemma 5.4, this implies that if we want to minimize \({\mathbf {T}}(f|[0,u]) / u\), then u must be of the form \(u=x_{3k+2}\). Lemma 5.4 and a simple calculation show that for every k we get

$$\begin{aligned} \frac{{\mathbf {T}}(f|[0,x_{3k+2}])}{x_{3k+2}}= \frac{(1-C)(C/2-D)}{1+2C-3D}=\Phi (D), \end{aligned}$$

which establishes the claimed sharpness.
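The simple calculation can be organized as follows. Write \(S=\sqrt{3-3D^2}\), so that \(C=(3D-1+S)/2\) and

$$\begin{aligned} 1-C=\frac{3-3D-S}{2},\qquad \frac{C}{2}-D=\frac{S-1-D}{4},\qquad 1+2C-3D=S. \end{aligned}$$

Hence, using \(S^2=3-3D^2\) to simplify the product,

$$\begin{aligned} \frac{(1-C)(C/2-D)}{1+2C-3D} = \frac{(3-3D-S)(S-1-D)}{8S} = \frac{4S-2DS-2S^2}{8S} = \frac{2-D-S}{4} = \Phi (D). \end{aligned}$$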

Now we show that the lower estimates in (5.26) and (5.28) are sharp in Proposition 5.15: for any \(D\in [0,1/3]\) and \(\delta \in [0,1/12]\) we construct 1-Lipschitz functions \(f_1\) and \(f_2\) on [0, 1] such that \(f_i(x)\ge Dx\) on [0, 1], \(f_i(0)=0\), \({\mathbf {T}}(f_i)=\frac{1-2D}{3}-\delta \) (\(i=1,2\)), \(f_1(x)=x-3\delta (1-D)\) on \([3\delta ,t_1]\) and \(f_2(x)=3t_1-x-3\delta (1-D)\) on \([2t_1,1-\frac{3\delta }{1-2D}]\), where \(t_1\) is given by (5.25).

Let

$$\begin{aligned} f_1(x) = {\left\{ \begin{array}{ll} \min (x,3\delta (1+D)-x) &{} \text { on } [0,3\delta ] \\ \min (x-3\delta (1-D),1+D-x) &{} \text { on } [3\delta ,1] \end{array}\right. }. \end{aligned}$$

Then \(f_1(x)\ge Dx\) on [0, 1], and \(f_1(x)=x-3\delta (1-D)\) on \([3\delta ,\frac{1+D}{2}+\frac{3\delta (1-D)}{2}]\supset [3\delta ,t_1]\). One can check that the set of hard points of \(f_1\) is \([2\delta (1+D),3\delta ]\cup [\tfrac{2}{3}(1+D+3\delta (1-D)),1]\), where \(\delta \in [0, 1/12]\) ensures that both intervals have nonnegative length, and so Lemma 5.4 gives that \({\mathbf {T}}(f_1)=\frac{1-2D}{3}-\delta \) (see the computation below).
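Here Lemma 5.4 again reduces the claim to computing the total length of the two intervals of hard points; explicitly,

$$\begin{aligned} \big (3\delta -2\delta (1+D)\big ) + \Big (1-\tfrac{2}{3}\big (1+D+3\delta (1-D)\big )\Big ) = \delta (1-2D) + \frac{1-2D}{3} - 2\delta (1-D) = \frac{1-2D}{3}-\delta . \end{aligned}$$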

Now, let

$$\begin{aligned} f_2(x) = {\left\{ \begin{array}{ll} \min (x,3t_1-x-3\delta (1-D)) &{} \text { on } [0,1-\frac{3\delta }{1-2D}] \\ Dx &{} \text { on } [1-\frac{3\delta }{1-2D},1] \end{array}\right. }. \end{aligned}$$

Then \(f_2(x)\ge Dx\) on [0, 1], and

$$\begin{aligned} f_2(x)= & {} 3t_1-x-3\delta (1-D) \text { on } \left[ \frac{3}{2}(t_1-\delta (1-D)),1-\frac{3\delta }{1-2D}\right] \\&\supset \left[ 2t_1,1-\frac{3\delta }{1-2D}\right] . \end{aligned}$$

After checking that the set of hard points of \(f_2\) is \([2t_1-2\delta (1-D),1-\frac{3\delta }{1-2D}]\), Lemma 5.4 yields \({\mathbf {T}}(f_2)=\frac{1-2D}{3}-\delta \).

We claim that Corollary 5.17 is sharp in the following sense: If \(D\in [0,0.26]\), \(\eta >0\), \(\xi \in (0,1]\), \(\Lambda _1=\xi (1-2\eta )\) and for every 1-Lipschitz function \(f:[0,1]\rightarrow {\mathbb {R}}\) such that \(f(0)=0\) and \(f(x)\ge Dx\) on [0, 1] we have

$$\begin{aligned} {\mathbf {T}}(f)\ge 1-\Lambda _1\Longrightarrow f(x) > \frac{x}{3}-\eta \xi \text { on } [0,\xi ], \end{aligned}$$
(7.1)

then \(\Lambda _1 < \Lambda (D)\).

Indeed, if \(\eta > 1/3-D\) then \(\Lambda _1=\xi (1-2\eta )< 1-2(1/3-D) = 1/3+2D\), which is less than \(\Lambda (D)\) when \(D\in [0,0.26]\). So we can suppose that \(\eta \le 1/3-D\). Let \(x_1=\xi (4/3-\eta )/(1+D)\), \(x_2=\min (x_1,1)\) and

$$\begin{aligned} f_3(x) = {\left\{ \begin{array}{ll} \min (x,-x+ x_2(1+D)) &{} \text { on } [0, x_2] \\ Dx &{} \text { on } [ x_2,1] \end{array}\right. }. \end{aligned}$$

Then \(f_3\) is 1-Lipschitz, \(f_3(0)=0\), \(f_3(x)\ge Dx\) on [0, 1] and \(f_3(x)=-x+ x_2(1+D)\) on \([x_0, x_2]\), where

$$\begin{aligned} x_0=x_2\frac{1+D}{2}\le x_1\frac{1+D}{2}= \xi (2/3-\eta /2)<\xi . \end{aligned}$$

It is easy to see that the set of hard points of \(f_3\) is \([\frac{2}{3} x_2(1+D),x_2]\) and so by Lemma 5.4 we have \({\mathbf {T}}(f_3)=x_2(1-2D)/3.\) The assumptions \(\eta \le 1/3-D\) and \(\xi \le 1\) imply that \(\xi \le x_2\), hence \(\xi \in [x_0,x_2]\), and we have

$$\begin{aligned} f_3(\xi )=-\xi +x_2(1+D)\le -\xi +x_1(1+D)=\xi /3-\eta \xi . \end{aligned}$$

Thus by our assumption (7.1) we must have \(x_2(1-2D)/3={\mathbf {T}}(f_3) < 1-\Lambda _1\). If \(x_1\ge 1\) then \(x_2=1\), so we obtain \(\Lambda _1 < 2(1+D)/3\). It is easy to check that \(\Lambda (x)\ge 2(1+x)/3\) on [0, 1/2], so in this case we obtained \(\Lambda _1 < \Lambda (D)\) as we claimed. So we can suppose that \(x_1<1\) and so \(x_2=x_1\). Then \(x_2(1-2D)/3 < 1-\Lambda _1\) gives

$$\begin{aligned} \xi (4/3-\eta )\frac{1-2D}{3(1+D)} < 1-\xi (1-2\eta ). \end{aligned}$$
(7.2)

Let

$$\begin{aligned} \delta =\Lambda _1-\frac{2}{3}(1+D)=\xi (1-2\eta )-\frac{2}{3}(1+D). \end{aligned}$$

We can clearly suppose that \(\Lambda _1\ge \Lambda (D)\). Since \(\Lambda (D)\ge \frac{2}{3}(1+D)\), we obtain \(\delta \ge 0\). We also have \(\delta \le 1/12\) since this is clear if \(\xi \le 3/4\) and follows from (7.2) and the assumption \(\eta \le 1/3-D\) if \(\xi >3/4\). For this value of \(\delta \) let \(f_1\) be the 1-Lipschitz function defined above (to show the sharpness of (5.26) of Proposition 5.15). Then \(f_1(0)=0\), \(f_1(x)\ge Dx\) on [0, 1] and \({\mathbf {T}}(f_1)=\frac{1-2D}{3}-\delta =1-\Lambda _1\), so by (7.1) we must have \(f_1(x)>\frac{x}{3}-\eta \xi \) on \([0,\xi ]\). Since \(\xi \le 1\) and \(\eta >0\) we have \(3\delta =3\xi (1-2\eta )-2(1+D)\le 3\xi -2\le \xi \). Thus we get \(f_1(3\delta )>\delta -\eta \xi \). Since \(f_1(3\delta )=3\delta D\) this gives \(\eta \xi > \delta (1-3D)\).

From the definition of \(\delta \) we get \(\xi =\delta + 2\eta \xi + \frac{2}{3}(1+D)\). Considering \(\delta \), \(\xi \) and \(\eta \xi \) as variables and D as a parameter, substituting the above expression into (7.2), and then using that \(\eta \xi > \delta (1-3D)\), after some calculations (spelled out below) one gets \(\delta < \frac{(1+D)(1-2D)}{18(3-4D+5D^2)}\), which yields \(\Lambda _1=\delta +\frac{2}{3}(1+D)<\Lambda (D)\), as we claimed.
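We sketch these calculations for the interested reader. Substituting \(\xi =\delta + 2\eta \xi + \frac{2}{3}(1+D)\) into the left-hand side of (7.2) gives \(\xi (4/3-\eta )=\frac{4}{3}\delta +\frac{5}{3}\eta \xi +\frac{8}{9}(1+D)\), while \(1-\Lambda _1=\frac{1-2D}{3}-\delta \). Since \(1-3D>0\) in the range \(D\in [0,0.26]\), the bound \(\eta \xi > \delta (1-3D)\) turns (7.2) into

$$\begin{aligned} \left( \frac{4}{3}\delta + \frac{5}{3}\delta (1-3D) + \frac{8}{9}(1+D)\right) \frac{1-2D}{3(1+D)} < \frac{1-2D}{3}-\delta . \end{aligned}$$

Multiplying by \(3(1+D)/(1-2D)>0\) and collecting the \(\delta \) terms,

$$\begin{aligned} \delta \left( 3-5D+\frac{3(1+D)}{1-2D}\right) < \frac{1+D}{9}, \qquad \text {where}\quad 3-5D+\frac{3(1+D)}{1-2D} = \frac{2(3-4D+5D^2)}{1-2D}, \end{aligned}$$

which gives exactly \(\delta < \frac{(1+D)(1-2D)}{18(3-4D+5D^2)}\).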

We remark that if \(\eta =1/3-D, \xi =1, \Lambda _1=\xi (1-2\eta )=1/3+2D\) then (7.1) holds, since then \(f(x)\ge Dx\) on [0, 1] already implies that \(f(x)\ge x/3 - \eta \xi \) on \([0,\xi ]\). So in order to make Corollary 5.17 sharp for every \(D\in [0,1/3)\), the function \(\Lambda (D)\) has to be replaced by \(\max (\Lambda (D),1/3+2D)\), which is equal to \(\Lambda (D)\) if and only if \(D\le 0.2609\ldots \). However, this version would not improve any of our distance set estimates.

Finally, we claim that Propositions 5.23, 5.24 and 5.25 are also sharp, up to the error terms. Indeed, given a 1-Lipschitz function \(f:[0,1]\rightarrow {\mathbb {R}}\) and \(\ell \in {\mathbb {N}}\), let \(\sigma =\sigma _{f,\ell }\in [-1,1]^\ell \) be the sequence

$$\begin{aligned} \sigma _i=\ell \left( f(i/\ell )-f((i-1)/\ell )\right) . \end{aligned}$$

It is not hard to show that for any positive integer \(L\le \ell \) and good integer partition \({\mathcal {P}}\) of (0, L] there exists a good partition \((a_n)\) of \([0,L/\ell ]\) such that \({\mathbf {T}}(f|[0,L/\ell ],(a_n))\le \frac{1}{\ell }{\mathbf {M}}(\sigma |(0,L],{\mathcal {P}})+O(\log \ell /\ell )\), thus

$$\begin{aligned} {\mathbf {T}}(f|[0,L/\ell ])\le \frac{1}{\ell }{\mathbf {M}}_\tau (\sigma |(0,L]) +O(\log \ell /\ell ). \end{aligned}$$

Thus, starting with the functions defined in this section that witness the sharpness of Propositions 5.2 and 5.5 and Corollary 5.17, we obtain sequences that show the sharpness of Propositions 5.23, 5.24 and 5.25, up to the error terms.