
Explicit Matrices with the Restricted Isometry Property: Breaking the Square-Root Bottleneck

  • Dustin G. Mixon
Chapter
Part of the Applied and Numerical Harmonic Analysis book series (ANHA)

Abstract

Matrices with the restricted isometry property (RIP) are of particular interest in compressed sensing. To date, the best known RIP matrices are constructed using random processes, while explicit constructions are notorious for performing at the “square-root bottleneck,” i.e., they only accept sparsity levels on the order of the square root of the number of measurements. The only known explicit matrix which surpasses this bottleneck was constructed by Bourgain, Dilworth, Ford, Konyagin, and Kutzarova in Bourgain et al. (Duke Math. J. 159:145–185, 2011). This chapter provides three contributions to advance the groundbreaking work of Bourgain et al.: (i) we develop an intuition for their matrix construction and underlying proof techniques; (ii) we prove a generalized version of their main result; and (iii) we apply this more general result to maximize the extent to which their matrix construction surpasses the square-root bottleneck.

13.1 Introduction

A matrix Φ is said to satisfy the (K, δ)-restricted isometry property (RIP) if
$$\displaystyle{ (1-\delta )\|x\|^{2} \leq \|\varPhi x\|^{2} \leq (1+\delta )\|x\|^{2} }$$
for every K-sparse vector x. RIP matrices are useful when compressively sensing signals which are sparse in some known orthonormal basis. Indeed, if there is a unitary sparsity matrix Ψ such that every signal of interest x has the property that Ψ x is K-sparse, then any such x can be stably reconstructed from measurements of the form y = Ax by minimizing \(\|\varPsi x\|_{1}\) subject to the measurements, provided A Ψ−1 satisfies (K, δ)-RIP with δ < 1∕3 [9]. For sensing regimes in which measurements are costly, it is desirable to minimize the number of measurements necessary for signal reconstruction; this corresponds to the number of rows M in the M × N sensing matrix A. One can apply the theory of Gelfand widths to show that stable reconstruction by L1-minimization requires K = O(M∕log(NM)) [4], and random matrices show that this bound is essentially tight; indeed, M × N matrices with iid subgaussian entries satisfy (2K, δ)-RIP with high probability provided M = Ω δ (Klog(NK)) [13].

Unfortunately, random matrices are not always RIP, though the failure rate vanishes asymptotically. In applications, you might wish to verify that your randomly drawn matrix actually satisfies RIP before designing your sensing platform around that matrix, but unfortunately, this is NP-hard in general [2]. As such, one is forced to blindly assume that the randomly drawn matrix is RIP, and admittedly, this is a reasonable assumption considering the failure rate. Still, this is dissatisfying from a theoretical perspective, and it motivates the construction of explicit RIP matrices:

Definition 1.

For any z > 0, let ExRIP[z] denote the following statement:

There exists an explicit family of M × N matrices with arbitrarily large aspect ratio N∕M which are (K, δ)-RIP with K = Ω(Mz−ε) for all ε > 0 and δ < 1∕3.

Since there exist (non-explicit) matrices satisfying z = 1 above, the goal is to prove ExRIP[1]. The most common way to demonstrate that an explicit matrix Φ satisfies RIP is to leverage the pairwise incoherence between the columns of Φ. Indeed, it is straightforward to prove ExRIP[1∕2] by taking Φ to have near-optimally incoherent unit-norm columns and appealing to interpolation of operators or Gershgorin’s circle theorem (e.g., see [1, 11, 12]). The emergence of this “square-root bottleneck” compelled Tao to pose the explicit construction of RIP matrices as an open problem [18]. Since then, only one construction has managed to break the bottleneck: In [8], Bourgain, Dilworth, Ford, Konyagin, and Kutzarova prove ExRIP[1∕2 + ε0] for some undisclosed ε0 > 0. This constant has since been estimated as ε0 ≈ 5.5169 × 10−28 [15].

Instead of estimating δ in terms of coherence, Bourgain et al. leverage additive combinatorics to construct Φ and to demonstrate certain cancellations in the Gram matrix Φ∗Φ. Today (three years later), this is the only known explicit construction which breaks the square-root bottleneck, thereby leading to two natural questions:
  • What are the proof techniques that Bourgain et al. applied?

  • Can we optimize the analysis to increase ε0?

These questions were investigated recently in a series of blog posts [15, 16, 17], on which this chapter is based. In the next section, we provide some preliminaries—we first cover the techniques used in [8] to demonstrate RIP, and then we discuss some basic additive combinatorics to motivate the matrix construction. Section 13.3 then describes the construction of Φ, namely a subcollection of the chirps studied in [10], and discusses one method of selecting chirps (i.e., the method of Bourgain et al.). Section 13.4 provides the main result, namely the BDFKK restricted isometry machine, which says that a “good” selection of chirps will result in an RIP matrix construction which breaks the square-root bottleneck. This is a generalization of the main result in [8], as it offers more general sufficient conditions for good chirp selection, but the proof is similar. After generalizing the sufficient conditions, we optimize over these conditions to increase the largest known ε0 for which ExRIP[1∕2 +ε0] holds:
$$\displaystyle{ \epsilon _{0} \approx 4.4466 \times 10^{-24}. }$$
Of course, any improvement to the chirp selection method will further increase this constant, and hopefully, the BDFKK restricted isometry machine and overall intuition provided in this chapter will foster such progress.1 Section 13.5 contains the proofs of certain technical lemmas that are used to prove the main result.

13.2 Preliminaries

The goal of this section is to provide some intuition for the main ideas in [8]. We first explain the overall proof technique for demonstrating RIP (this is the vehicle for breaking the square-root bottleneck), and then we introduce some basic ideas from additive combinatorics. Throughout, \(\mathbb{Z}/n\mathbb{Z}\) denotes the cyclic group of n elements, Sn denotes the nth Cartesian power of a set S (i.e., the set of n-tuples with entries from S), \(\mathbb{F}_{p}\) denotes the field of p elements (in this context, p will always be prime), and \(\mathbb{F}_{p}^{{\ast}}\) is the multiplicative subgroup of \(\mathbb{F}_{p}\).

13.2.1 The Big-Picture Techniques

Before explaining how Bourgain et al. broke the square-root bottleneck, let’s briefly discuss the more common, coherence-based technique to demonstrate RIP. Let \(\varPhi _{\mathcal{K}}\) denote the submatrix of Φ whose columns are indexed by \(\mathcal{K}\subseteq \{ 1,\ldots,N\}\). Then (K, δ)-RIP equivalently states that, for every \(\mathcal{K}\) of size K, the eigenvalues of \(\varPhi _{\mathcal{K}}^{{\ast}}\varPhi _{\mathcal{K}}\) lie in [1 −δ, 1 +δ]. As such, we can prove that a matrix is RIP by approximating eigenvalues. To this end, if we assume the columns of Φ have unit norm, and if we let μ denote the largest off-diagonal entry of Φ∗Φ in absolute value (this is the worst-case coherence of the columns of Φ), then the Gershgorin circle theorem implies that Φ is (K, (K − 1)μ)-RIP. Unfortunately, the coherence can’t be too small, due to the Welch bound [20]:
$$\displaystyle{ \mu \geq \sqrt{ \frac{N - M} {M(N - 1)}}, }$$
which is Ω(M−1∕2) provided N ≥ cM for some c > 1. Thus, to get (K − 1)μ = δ < 1∕2, we require K < 1∕(2μ) + 1 = O(M1∕2). This is much smaller than the random RIP constructions which instead take K = O(M1−ε) for all ε > 0, thereby revealing the shortcoming of the Gershgorin technique.
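These quantities are easy to compute for a concrete matrix. A minimal sketch, assuming NumPy, that evaluates the worst-case coherence μ, the Welch bound, and the resulting Gershgorin sparsity ceiling K < 1∕(2μ) + 1:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 16, 64
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)        # unit-norm columns

G = Phi.T @ Phi
mu = np.abs(G - np.eye(N)).max()          # worst-case coherence
welch = np.sqrt((N - M) / (M * (N - 1)))  # Welch bound: mu >= welch always

# Gershgorin: Phi is (K, (K-1)*mu)-RIP, so (K-1)*mu < 1/2 forces K = O(1/mu) = O(sqrt(M))
K_max = int(1 / (2 * mu)) + 1
print(mu, welch, K_max)
```

Random matrices are far from Welch-optimal, so the ceiling printed here is pessimistic; but even a perfectly incoherent Φ would only push K_max to the order of √M, which is the bottleneck in question.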

Now let’s discuss the alternative techniques that Bourgain et al. use. The main idea is to convert the RIP statement, which concerns all K-sparse vectors simultaneously, into a statement about finitely many vectors:

Definition 2 (flat RIP).

We say Φ = [φ1 ⋯ φN] satisfies (K, θ)-flat RIP if for every disjoint I, J ⊆ {1,…,N} of size ≤ K,
$$\displaystyle{ \bigg\vert \bigg\langle \sum _{i\in I}\varphi _{i},\sum _{j\in J}\varphi _{j}\bigg\rangle \bigg\vert \leq \theta \sqrt{\vert I\vert \vert J\vert }. }$$

Lemma 1 (essentially Lemma 3 in [8], cf. Theorem 13 in [3]).

If Φ has (K,θ)-flat RIP and unit-norm columns, then Φ has (K,150θlog K)-RIP.

Unlike the coherence argument, flat RIP doesn’t lead to much loss in K. In particular, [3] shows that random matrices satisfy (K, θ)-flat RIP with θ = O(δ∕logK) when \(M =\varOmega ((K/\delta ^{2})\log ^{2}K\log N)\). As such, it makes sense that flat RIP would be a vehicle to break the square-root bottleneck. However, in practice, it’s difficult to control both the left- and right-hand sides of the flat RIP inequality—it would be much easier if we only had to worry about getting cancellations, and not getting different levels of cancellation for different-sized subsets. This leads to the following:

Definition 3 (weak flat RIP).

We say \(\varPhi = [\varphi _{1}\cdots \varphi _{N}]\) satisfies (K, θ′)-weak flat RIP if for every disjoint I, J ⊆ {1,…,N} of size ≤ K,
$$\displaystyle{ \bigg\vert \bigg\langle \sum _{i\in I}\varphi _{i},\sum _{j\in J}\varphi _{j}\bigg\rangle \bigg\vert \leq \theta 'K. }$$

Lemma 2 (essentially Lemma 1 in [8]).

If Φ has (K,θ′)-weak flat RIP and worst-case coherence μ ≤ 1∕K, then Φ has \((K,\sqrt{\theta '})\) -flat RIP.

Proof.

By the triangle inequality, we have
$$\displaystyle{ \bigg\vert \bigg\langle \sum _{i\in I}\varphi _{i},\sum _{j\in J}\varphi _{j}\bigg\rangle \bigg\vert \leq \sum _{i\in I}\sum _{j\in J}\vert \langle \varphi _{i},\varphi _{j}\rangle \vert \leq \vert I\vert \vert J\vert \mu \leq \vert I\vert \vert J\vert /K. }$$
Since Φ also has weak flat RIP, we then have
$$\displaystyle{ \bigg\vert \bigg\langle \sum _{i\in I}\varphi _{i},\sum _{j\in J}\varphi _{j}\bigg\rangle \bigg\vert \leq \min \{\theta 'K,\vert I\vert \vert J\vert /K\} \leq \sqrt{\theta '\vert I\vert \vert J\vert }. }$$
 □ 
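For tiny matrices, the best flat RIP constant θ of Definition 2 can be found by exhausting disjoint support pairs, and the triangle-inequality bound appearing in the proof above (which forces θ ≤ Kμ) can be confirmed numerically. A sketch assuming NumPy:

```python
import itertools
import numpy as np

def flat_rip_constant(Phi, K):
    """Smallest theta such that Phi has (K, theta)-flat RIP,
    by exhausting disjoint supports I, J of size <= K."""
    N = Phi.shape[1]
    theta = 0.0
    for kI in range(1, K + 1):
        for I in itertools.combinations(range(N), kI):
            rest = [j for j in range(N) if j not in I]
            for kJ in range(1, K + 1):
                for J in itertools.combinations(rest, kJ):
                    ip = abs(Phi[:, I].sum(axis=1) @ Phi[:, J].sum(axis=1))
                    theta = max(theta, ip / np.sqrt(kI * kJ))
    return theta

rng = np.random.default_rng(2)
Phi = rng.standard_normal((8, 10))
Phi /= np.linalg.norm(Phi, axis=0)        # unit-norm columns
K = 2
theta = flat_rip_constant(Phi, K)
mu = np.abs(Phi.T @ Phi - np.eye(10)).max()
print(theta, K * mu)  # triangle inequality gives theta <= K * mu
```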

Unfortunately, this coherence requirement puts K back in the square-root bottleneck, since μ ≤ 1∕K is equivalent to K ≤ 1∕μ = O(M1∕2). To rectify this, Bourgain et al. use a trick in which a modest K with tiny δ can be converted to a large K with modest δ:

Lemma 3 (buried in Lemma 3 in [8], cf. Theorem 1 in [14]).

If Φ has (K,δ)-RIP, then Φ has (sK,2sδ)-RIP for all s ≥ 1.

In [14], this trick is used to get RIP results for larger K when testing RIP for smaller K. For the explicit RIP matrix problem, we are stuck with proving how small δ is when K is on the order of M1∕2. Note that this trick will inherently exhibit some loss in K. Assuming the best possible scaling for all N, K, and δ is M = Θ((K∕δ2)log(N∕K)), then if N = poly(M), you can get (M1∕2, δ)-RIP only if δ = Ω((log1∕2M)∕M1∕4). In this best-case scenario, you would want to pick s = M1∕4−ε for some ε > 0 and apply Lemma 3 to get K = O(M3∕4−ε). In some sense, this is another manifestation of the square-root bottleneck, but it would still be a huge achievement to saturate this bound.

13.2.2 A Brief Introduction to Additive Combinatorics

In this subsection, we briefly detail some key ideas from additive combinatorics; the reader who is already familiar with the subject may proceed to the next section, whereas the reader who wants to learn more is encouraged to see [19] for a more complete introduction. Given an additive group G and finite sets A, B ⊆ G, we can define the sumset
$$\displaystyle{ A + B:=\{ a + b: a \in A,\ b \in B\}, }$$
the difference set
$$\displaystyle{ A - B:=\{ a - b: a \in A,\ b \in B\}, }$$
and the additive energy
$$\displaystyle{ E(A,B):= \#\big\{(a_{1},a_{2},b_{1},b_{2}) \in A^{2} \times B^{2}: a_{ 1} + b_{1} = a_{2} + b_{2}\big\}. }$$
These definitions are useful in quantifying the additive structure of a set. In particular, consider the following:

Lemma 4.

A nonempty subset A of some additive group G satisfies the following inequalities:
  (i) |A + A| ≥ |A|

  (ii) |A − A| ≥ |A|

  (iii) \(E(A,A) \leq \vert A\vert ^{3}\)

with equality precisely when A is a translate of some subgroup of G.

Proof.

For (i), pick a ∈ A. Then | A + A | ≥ | A + a | = | A | . Considering
$$\displaystyle{ A + A =\bigcup _{a\in A}(A + a), }$$
we have equality in (i) precisely when A + A = A + a for every a ∈ A. Equivalently, given a0 ∈ A, for every a ∈ A, addition by a − a0 permutes the members of A + a0. This is further equivalent to the following: Given a0 ∈ A, for every a ∈ A, addition by a − a0 permutes the members of A − a0. It certainly suffices for H := A − a0 to be a group, and it is a simple exercise to verify that this is also necessary.

The proof for (ii) is similar.

For (iii), we note that
$$\displaystyle{ E(A,A) = \#\big\{(a,b,c) \in A^{3}: a + b - c \in A\big\} \leq \vert A\vert ^{3}, }$$
with equality precisely when A has the property that a + b − c ∈ A for every a, b, c ∈ A. Again, it clearly suffices for A − a0 to be a group, and necessity is a simple exercise. □ 
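The equality cases of Lemma 4 are easy to observe computationally. The following sketch compares a subgroup of \(\mathbb{Z}/12\mathbb{Z}\), which meets all three bounds with equality, against an arithmetic progression, which has small (but not minimal) sumset and difference set:

```python
def sumset(A, B, n):
    return {(a + b) % n for a in A for b in B}

def diffset(A, B, n):
    return {(a - b) % n for a in A for b in B}

def energy(A, B, n):
    # E(A,B) counts quadruples with a1 + b1 = a2 + b2 in Z/nZ
    return sum(1 for a1 in A for a2 in A for b1 in B for b2 in B
               if (a1 + b1) % n == (a2 + b2) % n)

n = 12
H = {0, 3, 6, 9}          # subgroup of Z/12Z: maximal additive structure
P = {0, 1, 2, 3}          # arithmetic progression: lots of structure, but not maximal
print(len(sumset(H, H, n)), len(diffset(H, H, n)), energy(H, H, n))  # 4 4 64
print(len(sumset(P, P, n)), len(diffset(P, P, n)), energy(P, P, n))  # 7 7 44
```

The subgroup attains |H + H| = |H| = 4 and E(H,H) = |H|³ = 64 exactly, while the progression stays within small constant factors of those extremes.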

The notion of additive structure is somewhat intuitive. You should think of a translate of a subgroup as having maximal additive structure. When the bounds (i), (ii) and (iii) are close to being achieved by A (e.g., A is an arithmetic progression), you should think of A as having a lot of additive structure. Interestingly, while there are different measures of additive structure (e.g., | AA | and E(A, A)), they often exhibit certain relationships (perhaps not surprisingly). The following is an example of such a relationship which is used throughout the paper by Bourgain et al. [8]:

Lemma 5 (Corollary 1 in [8]).

If \(E(A,A) \geq \vert A\vert ^{3}/K\) , then there exists a set A′ ⊆ A such that \(\vert A'\vert \geq \vert A\vert /(20K)\) and \(\vert A' - A'\vert \leq 10^{7}K^{9}\vert A\vert\) .

In words, a set with a lot of additive energy necessarily has a large subset with a small difference set. This is proved using a version of the Balog–Szemerédi–Gowers lemma [5].

If translates of subgroups have maximal additive structure, then which sets have minimal additive structure? It turns out that random sets tend to (nearly) have this property, and one way to detect low additive structure is Fourier bias:
$$\displaystyle{ \|A\|_{u}:=\max _{\begin{array}{c}\theta \in G \\ \theta \neq 0 \end{array}}\vert \widehat{1_{A}}(\theta )\vert, }$$
where the Fourier transform (\(\hat{\cdot }\)) used here has a 1∕ | G | factor in front (it is not unitary). For example, if \(G = \mathbb{Z}/n\mathbb{Z}\), we take
$$\displaystyle{ \hat{f}(\xi ):= \frac{1} {\vert G\vert }\sum _{x\in G}f(x)e^{-2\pi ix\xi /n}. }$$
Interestingly, \(\|A\|_{u}\) captures how far E(A, A) is from its minimal value \(\vert A\vert ^{4}/\vert G\vert\):

Lemma 6.

For any subset A of a finite additive group G, we have
$$\displaystyle{ \|A\|_{u}^{4} \leq \frac{1} {\vert G\vert ^{3}}\bigg(E(A,A) -\frac{\vert A\vert ^{4}} {\vert G\vert }\bigg) \leq \frac{\vert A\vert } {\vert G\vert }\|A\|_{u}^{2}. }$$

In the next section, we will appeal to Lemma 6 to motivate the matrix construction used by Bourgain et al. [8]. We will also use some of the techniques in the proof of Lemma 6 to prove a key lemma (namely, Lemma 7):

Proof (Proof of Lemma 6).

We will assume \(G = \mathbb{Z}/n\mathbb{Z}\), but the proof generalizes. Denote \(e_{n}(x):= e^{2\pi ix/n}\) and \(\lambda _{x}:= \#\{(a,a') \in A^{2}: a - a' = x\}\). For the left-hand inequality, we consider
$$\displaystyle{ \sum _{\theta \in G}\vert \widehat{1_{A}}(\theta )\vert ^{4} =\sum _{\theta \in G}\bigg\vert \frac{1} {\vert G\vert }\sum _{a\in A}e_{n}(-\theta a)\bigg\vert ^{4} = \frac{1} {\vert G\vert ^{4}}\sum _{\theta \in G}\bigg\vert \sum _{x\in G}\lambda _{x}e_{n}(-\theta x)\bigg\vert ^{2}, }$$
where the last step is by expanding \(\vert w\vert ^{2} = w\overline{w}\). Then Parseval’s identity simplifies this to \(\frac{1} {\vert G\vert ^{3}} \|\lambda \|_{2}^{2} = \frac{1} {\vert G\vert ^{3}} E(A,A)\). We use this to bound \(\|A\|_{u}^{4}\):
$$\displaystyle{ \|A\|_{u}^{4} =\max _{ \begin{array}{c}\theta \in G\\ \theta \neq 0\end{array}}\vert \widehat{1_{A}}(\theta )\vert ^{4} \leq \sum _{ \begin{array}{c}\theta \in G\\ \theta \neq 0\end{array}}\vert \widehat{1_{A}}(\theta )\vert ^{4} = \frac{1} {\vert G\vert ^{3}}E(A,A) -\frac{\vert A\vert ^{4}} {\vert G\vert ^{4}}. }$$
For the right-hand inequality, we apply Parseval’s identity:
$$\displaystyle{ E(A,A) =\sum _{x\in G}\lambda _{x}^{2} = \frac{1} {\vert G\vert }\sum _{\theta \in G}\bigg\vert \sum _{x\in G}\lambda _{x}e_{n}(-\theta x)\bigg\vert ^{2} = \frac{\vert A\vert ^{4}} {\vert G\vert } + \frac{1} {\vert G\vert }\sum _{\begin{array}{c}\theta \in G \\ \theta \neq 0 \end{array}}\bigg\vert \sum _{x\in G}\lambda _{x}e_{n}(-\theta x)\bigg\vert ^{2} }$$
From here, we apply the expansion \(\vert w\vert ^{2} = w\overline{w}\)
$$\displaystyle{ \bigg\vert \sum _{a\in A}e_{n}(-\theta a)\bigg\vert ^{2} =\sum _{ x\in G}\lambda _{x}e_{n}(-\theta x) }$$
to continue:
$$\displaystyle{ \sum _{\begin{array}{c}\theta \in G \\ \theta \neq 0 \end{array}}\bigg\vert \sum _{x\in G}\lambda _{x}e_{n}(-\theta x)\bigg\vert ^{2} =\sum _{ \begin{array}{c}\theta \in G\\ \theta \neq 0\end{array}}\bigg\vert \sum _{a\in A}e_{n}(-\theta a)\bigg\vert ^{4} \leq \sum _{ \begin{array}{c}\theta \in G\\ \theta \neq 0\end{array}}\big(\vert G\vert \|A\|_{u}\big)^{2}\bigg\vert \sum _{ a\in A}e_{n}(-\theta a)\bigg\vert ^{2}. }$$
Applying Parseval’s identity then gives
$$\displaystyle{ E(A,A) \leq \frac{\vert A\vert ^{4}} {\vert G\vert } +\big (\vert G\vert \|A\|_{u}\big)^{2} \cdot \frac{1} {\vert G\vert }\sum _{\theta \in G}\bigg\vert \sum _{a\in A}e_{n}(-\theta a)\bigg\vert ^{2} = \frac{\vert A\vert ^{4}} {\vert G\vert } + \vert G\vert ^{2}\|A\|_{ u}^{2}\vert A\vert, }$$
which is a rearrangement of the right-hand inequality. □ 
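Lemma 6 can also be sanity-checked numerically: compute \(\|A\|_{u}\) from the 1∕|G|-normalized Fourier transform of the indicator of A and test both inequalities. A sketch for a few subsets of \(\mathbb{Z}/16\mathbb{Z}\):

```python
import cmath

def fourier_bias(A, n):
    """||A||_u with the 1/|G|-normalized Fourier transform, max over theta != 0."""
    return max(abs(sum(cmath.exp(-2j * cmath.pi * a * t / n) for a in A)) / n
               for t in range(1, n))

def energy(A, n):
    # E(A,A) = #{(a1, a2, b1, b2) in A^4 : a1 + b1 = a2 + b2 mod n}
    return sum(1 for a1 in A for a2 in A for b1 in A for b2 in A
               if (a1 + b1) % n == (a2 + b2) % n)

n = 16
checks = []
for A in [{0, 1, 3, 7}, {0, 4, 8, 12}, {0, 1, 2, 3, 5, 8}]:
    u = fourier_bias(A, n)
    mid = (energy(A, n) - len(A) ** 4 / n) / n ** 3
    # Lemma 6: ||A||_u^4 <= (E(A,A) - |A|^4/|G|)/|G|^3 <= (|A|/|G|) ||A||_u^2
    checks.append(u ** 4 <= mid + 1e-12 <= (len(A) / n) * u ** 2 + 1e-12)
print(checks)
```

Note that {0, 4, 8, 12} is a subgroup, so its energy attains |A|³ and both inequalities are far from tight in opposite directions for structured versus generic sets.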

13.3 The Matrix Construction

This section combines ideas from the previous section to introduce the matrix construction used by Bourgain et al. [8] to break the square-root bottleneck. The main idea is to construct a Gram matrix Φ∗Φ whose entries exhibit cancellations for weak flat RIP (see Definition 3). By Lemma 6, we can control cancellations of complex exponentials
$$\displaystyle{ \bigg\vert \sum _{a\in A}e_{n}(\theta a)\bigg\vert \leq n\|A\|_{u},\qquad \theta \neq 0 }$$
in terms of the additive energy of the index set \(A \subseteq \mathbb{Z}/n\mathbb{Z}\); recall that \(e_{n}(x):= e^{2\pi ix/n}\). This motivates us to pursue a Gram matrix whose entries are complex exponentials. To this end, consider the following vector:
$$\displaystyle{ u_{a,b}:= \frac{1} {\sqrt{p}}\Big(e_{p}(ax^{2} + \mathit{bx})\Big)_{ x\in \mathbb{F}_{p}}, }$$
where p is prime and \(\mathbb{F}_{p}\) denotes the field of size p. Such vectors are called chirps, and they are used in a variety of applications including radar. Here, we are mostly interested in the form of their inner products. If a1 = a2, then \(\langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle =\delta _{b_{1},b_{2}}\) by the geometric sum formula. Otherwise, the inner product is more interesting:
$$\displaystyle{ \langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle = \frac{1} {p}\sum _{x\in \mathbb{F}_{p}}e_{p}\Big((a_{1} - a_{2})x^{2} + (b_{ 1} - b_{2})x\Big). }$$
Since a1 − a2 ≠ 0, we can complete the square in the exponent, and changing variables to y := x + (b1 − b2)∕(2(a1 − a2)) gives
$$\displaystyle{ \langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle = \frac{1} {p}e_{p}\bigg(-\frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg)\sum _{y\in \mathbb{F}_{p}}e_{p}\Big((a_{1} - a_{2})y^{2}\Big). }$$
Finally, this can be simplified using a quadratic Gauss sum formula:
$$\displaystyle{ \langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle = \frac{\sigma _{p}} {\sqrt{p}}\bigg(\frac{a_{1} - a_{2}} {p} \bigg)e_{p}\bigg(-\frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg), }$$
where σ p is 1 or i (depending on whether p is 1 or \(3\bmod 4\)) and \((\frac{a_{1}-a_{2}} {p} )\) is a Legendre symbol, taking value ± 1 depending on whether a1 − a2 is a perfect square \(\bmod p\). Modulo these factors, the above inner product is a complex exponential, and since we want these in our Gram matrix Φ∗Φ, we will take Φ to have columns of the form ua,b—in fact, the columns will be \(\{u_{a,b}\}_{(a,b)\in \mathcal{A}\times \mathcal{B}}\) for some well-designed sets \(\mathcal{A},\mathcal{B}\subseteq \mathbb{F}_{p}\).
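This closed form is easy to verify numerically. The sketch below builds the chirps for p = 13 (so p ≡ 1 mod 4 and σ p  = 1), evaluates the Legendre symbol by Euler's criterion, and compares both sides of the formula for a few arbitrarily chosen index pairs:

```python
import cmath

p = 13                      # p = 1 mod 4, so sigma_p = 1
e = lambda t: cmath.exp(2j * cmath.pi * t / p)   # e_p(t), periodic mod p

def chirp(a, b):
    return [e(a * x * x + b * x) / p ** 0.5 for x in range(p)]

def legendre(d):            # Euler's criterion, valid for d != 0 mod p
    return 1 if pow(d % p, (p - 1) // 2, p) == 1 else -1

def inner(u, v):            # <u, v> = sum_x u_x conj(v_x)
    return sum(ux * vx.conjugate() for ux, vx in zip(u, v))

errs = []
for (a1, b1, a2, b2) in [(1, 0, 2, 5), (3, 7, 10, 1), (5, 2, 6, 11)]:
    lhs = inner(chirp(a1, b1), chirp(a2, b2))
    d = (a1 - a2) % p       # nonzero for each pair above
    # exponent -(b1-b2)^2 / (4(a1-a2)) mod p, via Fermat inverse of 4d
    expo = (-(b1 - b2) ** 2 * pow(4 * d, p - 2, p)) % p
    rhs = legendre(d) / p ** 0.5 * e(expo)
    errs.append(abs(lhs - rhs))
print(errs)  # all near zero
```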
For weak flat RIP, we want to bound the following quantity for every \(\varOmega _{1},\varOmega _{2} \subseteq \mathcal{A}\times \mathcal{B}\) with \(\vert \varOmega _{1}\vert,\vert \varOmega _{2}\vert \leq \sqrt{p}\):
$$\displaystyle{ \bigg\vert \bigg\langle \sum _{(a_{1},b_{1})\in \varOmega _{1}}u_{a_{1},b_{1}},\sum _{(a_{2},b_{2})\in \varOmega _{2}}u_{a_{2},b_{2}}\bigg\rangle \bigg\vert. }$$
For i = 1, 2, define
$$\displaystyle{ A_{i}:=\{ a \in \mathcal{A}: \exists b \in \mathcal{B}\mbox{ s.t. }(a,b) \in \varOmega _{i}\}\quad \mbox{ and}\quad \varOmega _{i}(a):=\{ b \in \mathcal{B}: (a,b) \in \varOmega _{i}\}. }$$
These provide an alternate expression for the quantity of interest:
$$\displaystyle\begin{array}{rcl} \bigg\vert \sum _{(a_{1},b_{1})\in \varOmega _{1}}\sum _{(a_{2},b_{2})\in \varOmega _{2}}\langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle \bigg\vert & =& \bigg\vert \sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\end{array}}\sum _{ \begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle \bigg\vert, {}\\ & \leq & \sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\end{array}}\bigg\vert \sum _{\begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle \bigg\vert {}\\ & =& \frac{1} {\sqrt{p}}\sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\end{array}}\bigg\vert \sum _{\begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}e_{p}\bigg(-\frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg)\bigg\vert. {}\\ \end{array}$$
Pleasingly, it now suffices to bound a sum of complex exponentials, which we feel equipped to do using additive combinatorics. The following lemma does precisely this (it can be viewed as an analog of Lemma 6).

Lemma 7 (Lemma 9 in [8]).

For every \(\theta \in \mathbb{F}_{p}^{{\ast}}\) and \(B_{1},B_{2} \subseteq \mathbb{F}_{p}\) , we have
$$\displaystyle{ \bigg\vert \sum _{\begin{array}{c}b_{1}\in B_{1} \\ b_{2}\in B_{2}\end{array}}e_{p}\Big(\theta (b_{1} - b_{2})^{2}\Big)\bigg\vert \leq \vert B_{ 1}\vert ^{1/2}E(B_{ 1},B_{1})^{1/8}\vert B_{ 2}\vert ^{1/2}E(B_{ 2},B_{2})^{1/8}p^{1/8}. }$$

Proof.

First, Cauchy–Schwarz gives
$$\displaystyle\begin{array}{rcl} \bigg\vert \sum _{\begin{array}{c}b_{1}\in B_{1} \\ b_{2}\in B_{2}\end{array}}e_{p}\Big(\theta (b_{1} - b_{2})^{2}\Big)\bigg\vert ^{2}& =& \bigg\vert \sum _{ b_{1}\in B_{1}}1 \cdot \sum _{b_{2}\in B_{2}}e_{p}\Big(\theta (b_{1} - b_{2})^{2}\Big)\bigg\vert ^{2} {}\\ & \leq & \vert B_{1}\vert \sum _{b_{1}\in B_{1}}\bigg\vert \sum _{b_{2}\in B_{2}}e_{p}\Big(\theta (b_{1} - b_{2})^{2}\Big)\bigg\vert ^{2}. {}\\ \end{array}$$
Expanding \(\vert w\vert ^{2} = w\overline{w}\) and rearranging then gives an alternate expression for the right-hand side:
$$\displaystyle{ \vert B_{1}\vert \sum _{b_{2},b_{2}'\in B_{2}}e_{p}\Big(\theta (b_{2}^{2} - (b_{ 2}')^{2})\Big)\overline{\sum _{ b_{1}\in B_{1}}e_{p}\Big(\theta (2b_{1}(b_{2} - b_{2}'))\Big)}. }$$
Applying Cauchy–Schwarz again, we then have
$$\displaystyle{ \bigg\vert \sum _{\begin{array}{c}b_{1}\in B_{1} \\ b_{2}\in B_{2}\end{array}}e_{p}\Big(\theta (b_{1} - b_{2})^{2}\Big)\bigg\vert ^{4} \leq \vert B_{ 1}\vert ^{2}\vert B_{ 2}\vert ^{2}\sum _{ b_{2},b_{2}'\in B_{2}}\bigg\vert \sum _{b_{1}\in B_{1}}e_{p}\Big(\theta (2b_{1}(b_{2} - b_{2}'))\Big)\bigg\vert ^{2}, }$$
and expanding \(\vert w\vert ^{2} = w\overline{w}\) this time gives
$$\displaystyle{ \vert B_{1}\vert ^{2}\vert B_{ 2}\vert ^{2}\sum _{ \begin{array}{c}b_{1},b_{1}'\in B_{1} \\ b_{2},b_{2}'\in B_{2}\end{array}}e_{p}\Big(2\theta (b_{1} - b_{1}')(b_{2} - b_{2}')\Big). }$$
At this point, it is convenient to change variables, namely x = b1 − b1′ and y = b2 − b2′:
$$\displaystyle{ \bigg\vert \sum _{\begin{array}{c}b_{1}\in B_{1} \\ b_{2}\in B_{2}\end{array}}e_{p}\Big(\theta (b_{1} - b_{2})^{2}\Big)\bigg\vert ^{4} \leq \vert B_{ 1}\vert ^{2}\vert B_{ 2}\vert ^{2}\sum _{ x,y\in \mathbb{F}_{p}}\lambda _{x}\mu _{y}e_{p}(2\theta xy), }$$
(13.1)
where \(\lambda _{x}:= \#\{(b_{1},b_{1}') \in B_{1}^{2}: b_{1} - b_{1}' = x\}\) and similarly for μ y in terms of B2. We now apply Cauchy–Schwarz again to bound the sum in (13.1):
$$\displaystyle{ \bigg\vert \sum _{x\in \mathbb{F}_{p}}\lambda _{x}\sum _{y\in \mathbb{F}_{p}}\mu _{y}e_{p}(2\theta xy)\bigg\vert ^{2} \leq \|\lambda \|_{ 2}^{2}\sum _{ x\in \mathbb{F}_{p}}\bigg\vert \sum _{y\in \mathbb{F}_{p}}\mu _{y}e_{p}(2\theta xy)\bigg\vert ^{2}, }$$
and changing variables x′: = −2θ x (this change is invertible since θ ≠ 0), we see that the right-hand side is a sum of squares of the Fourier coefficients of μ. As such, Parseval’s identity gives the following simplification:
$$\displaystyle{ \bigg\vert \sum _{x,y\in \mathbb{F}_{p}}\lambda _{x}\mu _{y}e_{p}(2\theta xy)\bigg\vert ^{2} \leq p\|\lambda \|_{ 2}^{2}\|\mu \|_{ 2}^{2} = \mathit{pE}(B_{ 1},B_{1})E(B_{2},B_{2}). }$$
Applying this bound to (13.1) gives the result. □ 
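Since Lemma 7 is stated for all subsets, it can be spot-checked numerically against random B1, B2; the following sketch does so in \(\mathbb{F}_{31}\):

```python
import cmath
import random

p = 31
e = lambda t: cmath.exp(2j * cmath.pi * t / p)   # e_p(t)

def energy(B):
    # additive energy E(B,B) in Z/pZ
    return sum(1 for a1 in B for a2 in B for b1 in B for b2 in B
               if (a1 + b1 - a2 - b2) % p == 0)

random.seed(3)
ok = []
for _ in range(20):
    theta = random.randrange(1, p)               # theta != 0
    B1 = random.sample(range(p), 5)
    B2 = random.sample(range(p), 4)
    lhs = abs(sum(e(theta * (b1 - b2) ** 2) for b1 in B1 for b2 in B2))
    # |B1|^{1/2} E(B1,B1)^{1/8} |B2|^{1/2} E(B2,B2)^{1/8} p^{1/8}
    rhs = (len(B1) * len(B2)) ** 0.5 * (energy(B1) * energy(B2)) ** 0.125 * p ** 0.125
    ok.append(lhs <= rhs + 1e-9)
print(ok)
```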

13.3.1 How to Construct \(\mathcal{B}\)

Lemma 7 enables us to prove weak-flat-RIP-type cancellations in cases where \(\varOmega _{1}(a_{1}),\varOmega _{2}(a_{2}) \subseteq \mathcal{B}\) both lack additive structure. Indeed, the method of [8] is to do precisely this, and the remaining cases (where either Ω1(a1) or Ω2(a2) has more additive structure) will find cancellations by accounting for the dilation weights 1∕(a1a2). Overall, we will be very close to proving that Φ is RIP if most subsets of \(\mathcal{B}\) lack additive structure. To this end, Bourgain et al. [8] actually prove something much stronger: They design \(\mathcal{B}\) in such a way that all sufficiently large subsets have low additive structure. The following theorem is the first step in the design:

Theorem 1 (Theorem 5 in [8]).

Fix \(r,M \in \mathbb{N}\) , M ≥ 2, and define the cube \(\mathcal{C}:=\{ 0,\ldots,M - 1\}^{r} \subseteq \mathbb{Z}^{r}\) . Let τ denote the solution to the equation
$$\displaystyle{ \Big( \frac{1} {M}\Big)^{2\tau } +\Big (\frac{M - 1} {M} \Big)^{\tau } = 1. }$$
Then for any subsets \(A,B \subseteq \mathcal{C}\) , we have
$$\displaystyle{ \vert A + B\vert \geq \big (\vert A\vert \vert B\vert \big)^{\tau }. }$$
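The exponent τ is defined only implicitly, but the left-hand side of its defining equation is strictly decreasing in τ, so τ is easy to compute by bisection; doing so confirms that τ > 1∕2 for each small M, which is the feature of Theorem 1 exploited below. A sketch:

```python
def tau(M, iters=100):
    """Solve (1/M)^(2t) + ((M-1)/M)^t = 1 for t by bisection."""
    f = lambda t: (1 / M) ** (2 * t) + ((M - 1) / M) ** t - 1
    lo, hi = 0.0, 10.0          # f(lo) > 0 > f(hi), and f is decreasing in t
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

taus = {M: tau(M) for M in (2, 3, 5, 10, 100)}
print(taus)  # tau(2) ~ 0.694, and tau decreases toward 1/2 as M grows
```

For M = 2 the equation reduces to u² + u = 1 with u = 2^{−τ}, so τ = −log₂((√5 − 1)∕2) ≈ 0.694, matching the bisection output.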
As a consequence of this theorem (taking A = B), we have \(\vert A + A\vert \geq \vert A\vert ^{2\tau }\) for every \(A \subseteq \mathcal{C}\), and since τ > 1∕2, this means that large subsets A have | A + A | ≫ | A | , indicating low additive structure. However, \(\mathcal{C}\) is a subset of the group \(\mathbb{Z}^{r}\), whereas we need to construct a subset \(\mathcal{B}\) of \(\mathbb{F}_{p}\). The trick here is to pick \(\mathcal{B}\) so that it inherits the additive structure of \(\mathcal{C}\):
$$\displaystyle{ \mathcal{B}:=\bigg\{\sum _{ j=1}^{r}x_{ j}(2M)^{j-1}: x_{ 1},\ldots,x_{r} \in \{ 0,\ldots,M - 1\}\bigg\}. }$$
(13.2)
Indeed, the 2M-ary expansion of \(b_{1},b_{2} \in \mathcal{B}\) reveals the corresponding \(c_{1},c_{2} \in \mathcal{C}\). Also, adding b1 and b2 incurs no carries, so the expansion of b1 + b2 corresponds to c1 + c2 (even when \(c_{1} + c_{2}\not\in \mathcal{C}\)). This type of mapping \(\mathcal{C}\rightarrow \mathbb{F}_{p}\) is called a Freiman isomorphism, and it’s easy to see that Freiman isomorphic sets have the same sized sumsets, difference sets, and additive energy.
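This Freiman isomorphism can be verified computationally: build \(\mathcal{C}\) and its image under the 2M-ary encoding (over \(\mathbb{Z}\), which is harmless as long as p exceeds every sum involved), and confirm that sumset sizes and additive energies agree. A sketch for M = 3, r = 2:

```python
from itertools import product

M, r = 3, 2
base = 2 * M
C = list(product(range(M), repeat=r))                          # cube in Z^r
B = [sum(x * base ** j for j, x in enumerate(c)) for c in C]   # 2M-ary image in Z

def add_t(u, v):
    # componentwise addition in Z^r
    return tuple(a + b for a, b in zip(u, v))

# digits of b1 + b2 stay below 2M, so no carries occur and addition matches C
sum_C = {add_t(u, v) for u in C for v in C}
sum_B = {u + v for u in B for v in B}
energy_C = sum(1 for a in C for b in C for c in C for d in C if add_t(a, b) == add_t(c, d))
energy_B = sum(1 for a in B for b in B for c in B for d in B if a + b == c + d)
print(len(sum_C) == len(sum_B), energy_C == energy_B)  # True True
```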

We already know that large subsets of \(\mathcal{C}\) (and \(\mathcal{B}\)) exhibit low additive structure, but the above theorem only gives this in terms of the sumset, whereas Lemma 7 requires low additive structure in terms of additive energy. As such, we will first convert the above theorem into a statement about difference sets, and then apply Lemma 5 to further convert it in terms of additive energy:

Corollary 1 (essentially Corollary 3 in [8]).

Fix r, M and τ according to Theorem 1 , take \(\mathcal{B}\) as defined in (13.2) , and pick s and t such that (2τ − 1)s ≥ t. Then every subset \(B \subseteq \mathcal{B}\) such that \(\vert B\vert> p^{s}\) satisfies \(\vert B - B\vert> p^{t}\vert B\vert\) .

Proof.

First note that − B is a translate of some other set \(B' \subseteq \mathcal{B}\). Explicitly, if \(b_{0} =\sum _{j=1}^{r}(M - 1)(2M)^{j-1}\), then we can take B′ := b0 − B. As such, Theorem 1 gives
$$\displaystyle{ \vert B - B\vert = \vert B + B'\vert \geq \vert B\vert ^{2\tau } = \vert B\vert ^{2\tau -1}\vert B\vert> p^{(2\tau -1)s}\vert B\vert \geq p^{t}\vert B\vert. }$$
 □ 

Corollary 2 (essentially Corollary 4 in [8]).

Fix r, M and τ according to Theorem 1 , take \(\mathcal{B}\) as defined in (13.2) , and pick γ and ℓ such that (2τ − 1)(ℓ −γ) ≥ 10γ. Then for every ε > 0, there exists P such that for every p ≥ P, every subset \(S \subseteq \mathcal{B}\) with \(\vert S\vert> p^{\ell }\) satisfies \(E(S,S) <p^{-\gamma +\epsilon }\vert S\vert ^{3}\) .

Proof.

Suppose to the contrary that there exists ε > 0 such that there are arbitrarily large p for which there is a subset \(S \subseteq \mathcal{B}\) with \(\vert S\vert> p^{\ell }\) and \(E(S,S) \geq p^{-\gamma +\epsilon }\vert S\vert ^{3}\). Writing \(E(S,S) = \vert S\vert ^{3}/K\), then \(K \leq p^{\gamma -\epsilon }\). By Lemma 5, there exists B ⊆ S such that, for sufficiently large p,
$$\displaystyle{ \vert B\vert \geq \vert S\vert /(20K)> \frac{1} {20}p^{\ell-\gamma +\epsilon }> p^{\ell-\gamma }, }$$
and
$$\displaystyle{ \vert B - B\vert \leq 10^{7}K^{9}\vert S\vert \leq 10^{7}K^{9}(20K\vert B\vert ) \leq 10^{7} \cdot 20 \cdot p^{10(\gamma -\epsilon )}\vert B\vert <p^{10\gamma }\vert B\vert. }$$
However, this contradicts the previous corollary with s = ℓ −γ and t = 10γ. □ 

Notice that we could weaken our requirements on γ and ℓ if we had a version of Lemma 5 with a smaller exponent on K. This exponent comes from a version of the Balog–Szemerédi–Gowers lemma (Lemma 6 in [8]), which follows from the proof of Lemma 2.2 in [5]. (Specifically, take A = B, and you need to change \(A\stackrel{E}{-}B\) to \(A\stackrel{E}{+}B\), but this change doesn’t affect the proof.) Bourgain et al. indicate that it would be desirable to prove a better version of this lemma, but it is unclear how easy that would be.

13.3.2 How to Construct \(\mathcal{A}\)

The previous subsection showed how to construct \(\mathcal{B}\) so as to ensure that all sufficiently large subsets have low additive structure. By Lemma 7, this in turn ensures that Φ exhibits weak-flat-RIP-type cancellations for most \(\varOmega _{1}(a_{1}),\varOmega _{2}(a_{2}) \subseteq \mathcal{B}\). For the remaining cases, Φ must exhibit weak-flat-RIP-type cancellations by somehow leveraging properties of \(\mathcal{A}\).

The next section gives the main result, which requires a subset \(\mathcal{A} = \mathcal{A}(p)\) of \(\mathbb{F}_{p}\) for which there exists an even number m as well as an α > 0 (both independent of p) such that the following two properties hold:
  (i) \(\varOmega (p^{\alpha }) \leq \vert \mathcal{A}(p)\vert \leq p^{\alpha }\).

  (ii) For each \(a \in \mathcal{A}\), elements \(a_{1},\ldots,a_{2m} \in \mathcal{A}\setminus \{a\}\) satisfy
    $$\displaystyle{ \sum _{j=1}^{m} \frac{1} {a - a_{j}} =\sum _{ j=m+1}^{2m} \frac{1} {a - a_{j}} }$$
    (13.3)
    only if (a1,…,a m ) and (am+1,…,a2m) are permutations of each other. Here, division (and addition) is taken in the field \(\mathbb{F}_{p}\).

Unfortunately, these requirements on \(\mathcal{A}\) lead to very little intuition compared to our current understanding of \(\mathcal{B}\). Regardless, we will continue by considering how Bourgain et al. constructs \(\mathcal{A}\). The following lemma describes their construction and makes a slight improvement to the value of α chosen in [8]:

Lemma 8.

Take \(L:= \lfloor p^{1/(2m(4m-1))}\rfloor\) and \(U:= \lfloor L^{4m-1}\rfloor\) . Then
$$\displaystyle{ \mathcal{A}:=\{ x^{2} + \mathit{Ux}: 1 \leq x \leq L\} }$$
satisfies (i) and (ii) above if we take
$$\displaystyle{ \alpha = \frac{1} {2m(4m - 1)} }$$
and p is a sufficiently large prime.

The original proof of this result is sketched after the statement of Lemma 2 in [8]. Unfortunately, that proof contains a technical error: the authors conclude that a prime p does not divide some integer \(D_{1}D_{2}V\) since V ≠ 0, p does not divide \(D_{1}\), and \(\vert D_{2}V\vert <p\); but the conclusion is invalid since \(D_{2}V\) is not necessarily an integer. The following alternative proof removes this difficulty:

Proof (Proof of Lemma 8).

One may quickly verify (i). For (ii), we first note that multiplication by \(\prod _{i=1}^{2m}(a - a_{i})\) gives that (13.3) is equivalent to
$$\displaystyle{ S:=\sum _{ j=1}^{2m}\lambda _{ j}\prod _{\begin{array}{c}i=1 \\ i\neq j \end{array}}^{2m}(a - a_{ i}) \equiv 0\mod p, }$$
where \(\lambda _{j} = 1\) for \(j \in \{ 1,\ldots,m\}\) and \(\lambda _{j} = -1\) for \(j \in \{ m + 1,\ldots,2m\}\). Here, we are treating a and the \(a_{i}\)’s as integers in \(\{0,\ldots,p - 1\}\), and so S is an integer. Furthermore,
$$\displaystyle{ \vert S\vert \leq m\big(2(L^{2} + \mathit{UL})\big)^{2m-1} \leq m\big(2(p^{1/(m(4m-1))} + p^{4m/(2m(4m-1))})\big)^{2m-1} <p }$$
for sufficiently large p. As such, we have \(S = 0\) (not just \(0\bmod p\)).
With this, it suffices to show that for any \(n \in \{ 1,\ldots,2m\}\), any distinct \(x,x_{1},\ldots,x_{n} \in \{ 1,\ldots,L\}\), and any nonzero integers \(\lambda _{1},\ldots,\lambda _{n}\) such that \(\vert \lambda _{1}\vert + \cdots + \vert \lambda _{n}\vert \leq 2m\),
$$\displaystyle{ W =\sum _{ j=1}^{n}\lambda _{ j}\prod _{\begin{array}{c}i=1 \\ i\neq j \end{array}}^{n}(x - x_{ i})(x + x_{i} + U) }$$
(13.4)
is nonzero (just as before, everything is an integer here). Indeed, taking \(a = x^{2} + \mathit{Ux}\) and \(a_{j} = x_{j}^{2} + \mathit{Ux}_{j}\), we see that S has the form of W, and so if W is always nonzero, then so is S, which by the above reduction implies (ii).
To prove the claim, note that x + x1 + U is a factor of the jth term of (13.4) for every j ≥ 2. As such, W is congruent to the first term modulo x + x1 + U:
$$\displaystyle{ W \equiv \lambda _{1}\prod _{i=2}^{n}(x - x_{ i})(x + x_{i} + U)\mod (x + x_{1} + U). }$$
Each factor of the form x + x i + U can be further simplified:
$$\displaystyle{ x + x_{i} + U = (x_{i} - x_{1}) + (x + x_{1} + U) \equiv x_{i} - x_{1}\mod (x + x_{1} + U), }$$
and so
$$\displaystyle{ W \equiv W_{1}\mod (x + x_{1} + U), }$$
where
$$\displaystyle{ W_{1}:=\lambda _{1}\prod _{i=2}^{n}(x - x_{ i})(x_{i} - x_{1}). }$$
Note that W1 is nonzero since x and the x i ’s are distinct by assumption. Also,
$$\displaystyle{ \vert W_{1}\vert \leq 2m(L^{2})^{2m-1} \leq U <x + x_{ 1} + U }$$
for sufficiently large p. As such, x + x1 + U does not divide W1, and so W is nonzero, as desired. □ 
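The first reduction in this proof — multiplying (13.3) through by \(\prod _{i=1}^{2m}(a - a_{i})\) — can be sanity-checked numerically: since that product is nonzero mod p whenever the \(a_{i}\)’s avoid a, (13.3) holds exactly when S vanishes mod p. Below is a small exhaustive check of this equivalence (our own illustration, with the toy parameters m = 2, p = 7, a = 0):

```python
from itertools import product

p, m, a = 7, 2, 0  # small toy parameters for an exhaustive check

def inv(t):
    """Modular inverse in F_p."""
    return pow(t % p, p - 2, p)

def S_int(a, tup):
    """The integer S from the proof: sum_j lambda_j * prod_{i != j} (a - a_i),
    with lambda_j = +1 for the first m indices and -1 for the last m."""
    total = 0
    for j in range(len(tup)):
        lam = 1 if j < m else -1
        prod = 1
        for i, ai in enumerate(tup):
            if i != j:
                prod *= a - ai
        total += lam * prod
    return total

# (13.3) holds iff S == 0 (mod p), for every choice of a_1, ..., a_4 != a
for tup in product(range(1, p), repeat=2 * m):
    lhs = sum(inv(a - t) for t in tup[:m]) % p
    rhs = sum(inv(a - t) for t in tup[m:]) % p
    assert (lhs == rhs) == (S_int(a, tup) % p == 0)
print("reduction verified for p = 7, m = 2")
```

The point of the proof, of course, is the stronger claim that for the particular set \(\mathcal{A}\) of Lemma 8, the integer S is small enough that vanishing mod p forces S = 0 exactly.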

13.4 The Main Result

We are now ready to state the main result of this chapter, which is a generalization of the main result in [8]. Later in this section, we will maximize ε0 such that this result implies \(\mathrm{ExRIP}[1/2 +\epsilon _{0}]\) with the matrix construction from [8].

Theorem 2 (The BDFKK restricted isometry machine).

For every prime p, define subsets \(\mathcal{A} = \mathcal{A}(p)\) and \(\mathcal{B} = \mathcal{B}(p)\) of \(\mathbb{F}_{p}\) . Suppose there exist constants \(m \in 2\mathbb{N}\) , ℓ,γ > 0 independent of p such that the following conditions apply:
  1. (a)
    For every sufficiently large p, and for every \(a \in \mathcal{A}\) and \(a_{1},\ldots,a_{2m} \in \mathcal{A}\setminus \{a\}\) ,
    $$\displaystyle{ \sum _{j=1}^{m} \frac{1} {a - a_{j}} =\sum _{ j=m+1}^{2m} \frac{1} {a - a_{j}} }$$
    only if (a 1 ,…,a m ) and (a m+1 ,…,a 2m ) are permutations of each other. Here, division (and addition) is taken in the field \(\mathbb{F}_{p}\) .
     
  2. (b)

    For every ε > 0, there exists P = P(ε) such that for every p ≥ P, every subset \(S \subseteq \mathcal{B}(p)\) with \(\vert S\vert \geq p^{\ell}\) satisfies \(E(S,S) \leq p^{-\gamma +\epsilon }\vert S\vert ^{3}\).

     
Pick α such that
$$\displaystyle{ \varOmega (p^{\alpha }) \leq \vert \mathcal{A}(p)\vert \leq p^{\alpha },\qquad \vert \mathcal{B}(p)\vert \geq \varOmega (p^{1-\alpha +\epsilon '}) }$$
(13.5)
for some ε′ > 0 and every sufficiently large p. Pick ε1 > 0 for which there exist α1,α2,ε,x,y > 0 such that
$$\displaystyle\begin{array}{rcl} \epsilon _{1}+\epsilon & <\alpha _{1} -\alpha -(4/3)x-\epsilon,&{}\end{array}$$
(13.6)
$$\displaystyle\begin{array}{rcl} \ell& \leq & 1/2 + (4/3)x -\alpha _{1} +\epsilon /2,{}\end{array}$$
(13.7)
$$\displaystyle\begin{array}{rcl} \epsilon _{1}+\epsilon & <\gamma /4 - y/4-\epsilon,&{}\end{array}$$
(13.8)
$$\displaystyle\begin{array}{rcl} \alpha _{2}& \geq 9x+\epsilon,&{}\end{array}$$
(13.9)
$$\displaystyle\begin{array}{rcl} c_{0}y/8 - (\alpha _{1}/4 + 9\alpha _{2}/8)/m& \leq x/8 -\alpha /4,&{}\end{array}$$
(13.10)
$$\displaystyle\begin{array}{rcl} \epsilon _{1}+\epsilon & <c_{0}y/8 - (\alpha _{1}/4 + 9\alpha _{2}/8)/m,&{}\end{array}$$
(13.11)
$$\displaystyle\begin{array}{rcl} my& \leq \min \{ 1/2 -\alpha _{1},1/2 -\alpha _{2}\},&{}\end{array}$$
(13.12)
$$\displaystyle\begin{array}{rcl} 3\alpha _{2} - 2\alpha _{1}& \leq (2 - c_{0})my.&{}\end{array}$$
(13.13)
Here, c0 = 1∕10430 is a constant from Proposition 2 in [6]. Then for sufficiently large p, the \(p \times \vert \mathcal{A}(p)\vert \vert \mathcal{B}(p)\vert\) matrix with columns
$$\displaystyle{ u_{a,b}:= \frac{1} {\sqrt{p}}\Big(e^{2\pi i(ax^{2}+\mathit{bx})/p }\Big)_{x\in \mathbb{F}_{p}}\qquad a \in \mathcal{A},b \in \mathcal{B} }$$
satisfies \((p^{1/2+\epsilon _{1}/2-\epsilon ''},\delta )\)-RIP for any ε″ > 0 and \(\delta <\sqrt{2} - 1\), thereby implying ExRIP [1∕2 + ε1∕2].
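The columns \(u_{a,b}\) form the chirp frame: for \(a_{1}\neq a_{2}\), the inner product \(\langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle\) is 1∕p times a complete quadratic Gauss sum, which has modulus \(\sqrt{p}\), so any two columns with distinct chirp rates have coherence exactly \(1/\sqrt{p}\). A quick numerical check of this (pure Python; the prime and column indices are arbitrary choices of ours):

```python
import cmath
import math

def chirp_column(a, b, p):
    """u_{a,b} = (1/sqrt(p)) * (e^{2 pi i (a x^2 + b x)/p})_{x in F_p}."""
    return [cmath.exp(2j * math.pi * ((a * x * x + b * x) % p) / p) / math.sqrt(p)
            for x in range(p)]

def inner(u, v):
    """Standard complex inner product <u, v>."""
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))

p = 11
u = chirp_column(2, 3, p)
v = chirp_column(5, 8, p)
assert abs(abs(inner(u, u)) - 1) < 1e-9                  # columns are unit norm
assert abs(abs(inner(u, v)) - 1 / math.sqrt(p)) < 1e-9   # coherence 1/sqrt(p) when a1 != a2
```

The RIP content of Theorem 2 is precisely about beating the \(K = O(\sqrt{p})\) sparsity levels that this pairwise coherence alone would guarantee.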
Let’s briefly discuss the structure of the proof of this result. As indicated in Section 13.2, the method is to prove flat-RIP-type cancellations, namely that
$$\displaystyle{ S(A_{1},A_{2}):=\sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\end{array}}\sum _{ \begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\bigg(\frac{a_{1} - a_{2}} {p} \bigg)e_{p}\bigg( \frac{(b_{1} - b_{2})^{2}} {2(a_{1} - a_{2})}\bigg) }$$
(13.14)
has size \(\leq p^{1-\epsilon _{1}-\epsilon }\) whenever Ω1 and Ω2 are disjoint with size \(\leq \sqrt{p}\). (Actually, we get to assume that these subsets and the Ω i (a i )’s satisfy certain size constraints since we have an extra −ε in the power of p; having this will imply the general case without the ε, as made clear in the proof of Theorem 2.) This bound is proved by considering a few different cases. First, when the Ω i (a i )’s are small, (13.14) is small by a triangle inequality. Next, when the Ω i (a i )’s are large, then we can apply a triangle inequality over each A i and appeal to hypothesis (b) in Theorem 2 and Lemma 7. However, this will only give sufficient cancellation when the A i ’s are small. In the remaining case, Bourgain et al. prove sufficient cancellation by invoking Lemma 10 in [8], which concerns the following quantity:
$$\displaystyle{ T_{a_{1}}(A_{2},B):=\sum _{\begin{array}{c}b_{1}\in B \\ a_{2}\in A_{2},\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\bigg(\frac{a_{1} - a_{2}} {p} \bigg)e_{p}\bigg( \frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg). }$$
(13.15)
Specifically, Lemma 10 in [8] gives that \(\vert T_{a_{1}}(A_{2},B)\vert\) is small whenever B has sufficient additive structure. In the proof of the main result, they take a maximal subset B0 ⊆ Ω1(a1) such that \(\vert T_{a_{1}}(A_{2},B_{0})\vert\) is small, and then they use this lemma to show that \(B_{1}:=\varOmega _{1}(a_{1})\setminus B_{0}\) necessarily has little additive structure. By Lemma 7, this in turn forces \(\vert T_{a_{1}}(A_{2},B_{1})\vert\) to be small, and so \(\vert T_{a_{1}}(A_{2},\varOmega _{1}(a_{1}))\vert\) (and furthermore \(\vert S(A_{1},A_{2})\vert\)) are also small due to a triangle inequality. The reader is encouraged to find more details in the proofs found in Section 13.5.

What follows is a generalized version of the statement of Lemma 10 in [8], which we then use in the hypothesis of a generalized version of Lemma 2 in [8]:

Definition 4.

Let \(\mathrm{L10} =\mathrm{ L10}[\alpha _{1},\alpha _{2},k_{0},k_{1},k_{2},m,y]\) denote the following statement about subsets \(\mathcal{A} = \mathcal{A}(p)\) and \(\mathcal{B} = \mathcal{B}(p)\) of \(\mathbb{F}_{p}\) for \(p\) prime:

For every ε > 0, there exists P > 0 such that for every prime p ≥ P the following holds:

Take \(\varOmega _{1},\varOmega _{2} \subseteq \mathcal{A}\times \mathcal{B}\) such that
$$\displaystyle{ \vert A_{2}\vert \geq p^{y}, }$$
(13.16)
and for which there exist powers of two M1, M2 such that
$$\displaystyle{ \frac{M_{i}} {2} \leq \vert \varOmega _{i}(a_{i})\vert <M_{i} }$$
(13.17)
and
$$\displaystyle{ \vert A_{i}\vert M_{i} \leq 2\sqrt{p} }$$
(13.18)
for i = 1, 2 and for every a i  ∈ A i . Then for every \(B \subseteq \mathbb{F}_{p}\) such that
$$\displaystyle{ p^{1/2-\alpha _{1} } \leq \vert B\vert \leq p^{1/2} }$$
(13.19)
and
$$\displaystyle{ \vert B - B\vert \leq p^{\alpha _{2}}\vert B\vert, }$$
(13.20)
we have that (13.15) satisfies
$$\displaystyle{ \vert T_{a_{1}}(A_{2},B)\vert \leq \vert B\vert p^{1/2-\epsilon _{2}+\epsilon } }$$
(13.21)
with \(\epsilon _{2} = k_{0}y - (k_{1}\alpha _{1} + k_{2}\alpha _{2})/m\) for every \(a_{1} \in A_{1}\).

Lemma 9 (generalized version of Lemma 2 in [8]).

Take \(\mathcal{A}\) arbitrarily and \(\mathcal{B}\) satisfying the hypothesis (b) in Theorem 2, pick α such that \(\vert \mathcal{A}(p)\vert \leq p^{\alpha }\) for every sufficiently large p, and pick ε,ε1,x > 0 such that L10 holds with (13.6)–(13.9) and
$$\displaystyle{ \epsilon _{1}+\epsilon <\epsilon _{2} \leq x/8 -\alpha /4. }$$
(13.22)
Then the following holds for every sufficiently large p:

Take \(\varOmega _{1},\varOmega _{2} \subseteq \mathcal{A}\times \mathcal{B}\) for which there exist powers of two M1,M2 such that (13.17) and (13.18) hold for i = 1,2 and every ai ∈ Ai. Then (13.14) satisfies \(\vert S(A_{1},A_{2})\vert \leq p^{1-\epsilon _{1}-\epsilon }\).

The following result gives sufficient conditions for L10, and thus Lemma 9 above:

Lemma 10 (generalized version of Lemma 10 in [8]).

Suppose \(\mathcal{A}\) satisfies hypothesis (a) in Theorem 2 . Then L10 is true with k0 = c0∕8, k1 = 1∕4 and k2 = 9∕8 provided (13.12) and (13.13) are satisfied.

These lemmas are proved in Section 13.5. With these in hand, we are ready to prove the main result:

Proof (Proof of Theorem 2).

By Lemma 10, we have that L10 is true with
$$\displaystyle{ \epsilon _{2} = c_{0}y/8 - (\alpha _{1}/4 + 9\alpha _{2}/8)/m. }$$
As such, (13.10) and (13.11) together imply (13.22), and so the conclusion of Lemma 9 holds. We will use this conclusion to show that the matrix identified in Theorem 2 satisfies \((p^{1/2},p^{-\epsilon _{1}})\)-weak flat RIP. Indeed, this will imply \((p^{1/2},p^{-\epsilon _{1}/2})\)-flat RIP by Lemma 2, \((p^{1/2},75p^{-\epsilon _{1}/2}\log p)\)-RIP by Lemma 1, and \((p^{1/2+\epsilon _{1}/2-\epsilon ''},75p^{-\epsilon ''}\log p)\)-RIP for any ε″ > 0 by Lemma 3 (taking \(s = p^{\epsilon _{1}/2-\epsilon ''}\)). Since \(75p^{-\epsilon ''}\log p <\sqrt{2} - 1\) for sufficiently large p, this will prove the result.
To demonstrate \((p^{1/2},p^{-\epsilon _{1}})\)-weak flat RIP, pick disjoint \(\varOmega _{1},\varOmega _{2} \subseteq \mathcal{A}\times \mathcal{B}\) of size \(\leq p^{1/2}\). We need to show
$$\displaystyle{ \bigg\vert \bigg\langle \sum _{(a_{1},b_{1})\in \varOmega _{1}}u_{a_{1},b_{1}},\sum _{(a_{2},b_{2})\in \varOmega _{2}}u_{a_{2},b_{2}}\bigg\rangle \bigg\vert \leq p^{1/2-\epsilon _{1} }. }$$
Recall that
$$\displaystyle\begin{array}{rcl} & & \sum _{(a_{1},b_{1})\in \varOmega _{1}}\sum _{(a_{2},b_{2})\in \varOmega _{2}}\langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle {}\\ & & \qquad \qquad =\sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\end{array}}\sum _{ \begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\langle u_{a_{1},b_{1}},u_{a_{2},b_{2}}\rangle {}\\ & & \qquad \qquad =\sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\setminus A_{1}\end{array}}\sum _{ \begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}} \frac{\sigma _{p}} {\sqrt{p}}\bigg(\frac{a_{1} - a_{2}} {p} \bigg)e_{p}\bigg(-\frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg). {}\\ \end{array}$$
As such, we may assume that A1 and A2 are disjoint without loss of generality, and it suffices to show that
$$\displaystyle{ \vert S(A_{1},A_{2})\vert =\bigg \vert \sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\end{array}}\sum _{ \begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\bigg(\frac{a_{1} - a_{2}} {p} \bigg)e_{p}\bigg(-\frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg)\bigg\vert \leq p^{1-\epsilon _{1} }. }$$
To estimate this sum, it is convenient to partition A i according to the size of Ω i (a i ). For each k, define the set
$$\displaystyle{ A_{i}^{(k)}:=\{ a_{ i} \in A_{i}: 2^{k-1} \leq \vert \varOmega _{ i}(a_{i})\vert <2^{k}\}. }$$
Then we have
$$\displaystyle{ \vert A_{i}^{(k)}\vert 2^{k-1} \leq \sum _{ a_{i}\in A_{i}^{(k)}}\vert \varOmega _{i}(a_{i})\vert = \vert \{(a_{i},b_{i}) \in \varOmega _{i}: a_{i} \in A_{i}^{(k)}\}\vert \leq \vert \varOmega _{ i}\vert \leq p^{1/2}. }$$
As such, taking \(M_{i} = 2^{k}\) gives that \(A_{i} \leftarrow A_{i}^{(k)}\) satisfies (13.17) and (13.18), which enables us to apply the conclusion of Lemma 9. Indeed, the triangle inequality and Lemma 9 together give
$$\displaystyle{ \vert S(A_{1},A_{2})\vert \leq \sum _{k_{1}=1}^{\lceil \frac{1} {2} \log _{2}p\rceil }\sum _{k_{ 2}=1}^{\lceil \frac{1} {2} \log _{2}p\rceil }\vert S(A_{1}^{(k_{1})},A_{2}^{(k_{2})})\vert \leq p^{1-\epsilon _{1}-\epsilon }\log ^{2}p \leq p^{1-\epsilon _{1}}, }$$
where the last step takes p sufficiently large, i.e., such that \(p^{\epsilon } \geq \log ^{2}p\). □ 
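The dyadic pigeonholing used in this proof — partitioning each \(A_{i}\) according to the dyadic size class of the fibers \(\varOmega _{i}(a_{i})\) — is easy to mimic in code. A small sketch (the data, names, and bucket encoding are our own illustration):

```python
from collections import defaultdict

def dyadic_buckets(omega):
    """Given omega as a set of pairs (a, b), group the fibers
    Omega(a) = {b : (a, b) in omega} into dyadic classes
    A^{(k)} = {a : 2^{k-1} <= |Omega(a)| < 2^k}."""
    fibers = defaultdict(set)
    for a, b in omega:
        fibers[a].add(b)
    buckets = defaultdict(list)
    for a, fib in fibers.items():
        k = len(fib).bit_length()  # gives 2^{k-1} <= |fib| < 2^k
        buckets[k].append(a)
    return fibers, buckets

omega = {(1, 1), (1, 2), (1, 3), (2, 7), (3, 4), (3, 5)}
fibers, buckets = dyadic_buckets(omega)
for k, As in buckets.items():
    # the counting bound from the proof: |A^{(k)}| * 2^{k-1} <= |omega|
    assert len(As) * 2 ** (k - 1) <= len(omega)
```

Since there are only \(O(\log p)\) nonempty classes, bounding each pair of classes separately costs only a \(\log ^{2}p\) factor, which the spare \(p^{\epsilon }\) absorbs.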
To summarize Theorem 2, we may conclude \(\mathrm{ExRIP}[1/2 +\epsilon _{1}/2]\) if we can find
  1. (i)

    \(m \in 2\mathbb{N}\) satisfying hypothesis (a),

     
  2. (ii)

    ℓ, γ > 0 satisfying hypothesis (b),

     
  3. (iii)

    α satisfying (13.5), and

     
  4. (iv)

    α1, α2, ε, x, y > 0 satisfying (13.6)–(13.13).

     
Since we want to conclude ExRIP[z] for the largest possible z, we are inclined to maximize ε1 subject to (i)–(iv) above. To find m, ℓ, γ, α which satisfy (i)–(iii), we must leverage a particular construction of \(\mathcal{A}\) and \(\mathcal{B}\), and so we turn to Lemma 8 and Corollary 2. Indeed, for any given m, Lemma 8 constructs \(\mathcal{A}\) satisfying hypothesis (a) such that
$$\displaystyle{ \alpha = 1/(2m(4m - 1)) }$$
(13.23)
satisfies the first part of (13.5). Next, if we take \(\beta:=\alpha -\epsilon '\) and define \(r:= \lfloor \beta \log _{2}p\rfloor\) and \(M:= 2^{1/\beta -1}\), then (13.2) constructs \(\mathcal{B}\) which, by Corollary 2, satisfies hypothesis (b) provided
$$\displaystyle{ (2\tau - 1)(\ell-\gamma ) \geq 10\gamma, }$$
(13.24)
where τ is the solution to
$$\displaystyle{ \Big( \frac{1} {M}\Big)^{2\tau } +\Big (\frac{M - 1} {M} \Big)^{\tau } = 1. }$$
For this construction, \(\vert \mathcal{B}\vert = M^{r} \geq \varOmega (p^{1-\beta })\), thereby satisfying the second part of (13.5).
It remains to maximize ε1 for which there exist m, ε′, ℓ, γ, α1, α2, ε, x, y satisfying (13.6)–(13.13), (13.23), and (13.24). Note that m and ε′ determine α and τ, and the remaining constraints (13.6)–(13.13), (13.24) which define the feasible tuples (ε1, ℓ, γ, α1, α2, ε, x, y) are linear inequalities. As such, taking the closure of this feasibility region and running a linear program will produce the supremum of ε1 subject to the remaining constraints. This supremum increases monotonically as ε′ → 0, and so we only need to consider the limiting case where ε′ = 0. Running the linear program for various values of m reveals what appears to be a largest supremum of \(\epsilon _{1} \approx 8.8933 \times 10^{-24}\) at m = 53,000,000 (see Fig. 13.1). Dividing by 2 then gives
$$\displaystyle{ \epsilon _{0} \approx 4.4466 \times 10^{-24}. }$$
While this optimization makes a substantial improvement (this is over 8,000 times larger than the original due to Bourgain et al. in [8]), the constant is still tiny!
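The feasibility region described above can be encoded directly. The sketch below is our own encoding (a real search would hand these linear constraints, in closed form, to an LP solver): it checks whether a candidate tuple satisfies (13.6)–(13.13) and (13.24) for a given m, with α set by (13.23) and ε′ = 0, and with τ supplied by the caller from the displayed equation for M.

```python
def feasible(m, c0, tau, eps1, ell, gamma, a1, a2, eps, x, y):
    """Check the constraints (13.6)-(13.13) and (13.24) for a candidate tuple
    (eps1, ell, gamma, alpha_1, alpha_2, eps, x, y), where alpha is
    determined by (13.23) and eps' = 0 (the limiting case)."""
    alpha = 1 / (2 * m * (4 * m - 1))                                 # (13.23)
    return all([
        eps1 + eps < a1 - alpha - (4 / 3) * x - eps,                  # (13.6)
        ell <= 1 / 2 + (4 / 3) * x - a1 + eps / 2,                    # (13.7)
        eps1 + eps < gamma / 4 - y / 4 - eps,                         # (13.8)
        a2 >= 9 * x + eps,                                            # (13.9)
        c0 * y / 8 - (a1 / 4 + 9 * a2 / 8) / m <= x / 8 - alpha / 4,  # (13.10)
        eps1 + eps < c0 * y / 8 - (a1 / 4 + 9 * a2 / 8) / m,          # (13.11)
        m * y <= min(1 / 2 - a1, 1 / 2 - a2),                         # (13.12)
        3 * a2 - 2 * a1 <= (2 - c0) * m * y,                          # (13.13)
        (2 * tau - 1) * (ell - gamma) >= 10 * gamma,                  # (13.24)
    ])

# eps1 = 1 is far outside the region: (13.11) alone fails, since with
# c0 = 1/10430 and my <= 1/2 its right-hand side is at most c0/(16m).
assert not feasible(m=2, c0=1 / 10430, tau=0.6, eps1=1.0, ell=0.1,
                    gamma=0.01, a1=0.1, a2=0.1, eps=0.01, x=0.01, y=0.1)
```

Constraint (13.11) makes the tiny optimum unsurprising: ε1 is capped by \(c_{0}y/8\) with \(c_{0} = 1/10430\) and \(y \leq 1/(2m)\), so large m (needed to shrink α) directly suppresses ε1.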
Fig. 13.1

The supremum of ε1 as a function of m. Taking ε′ = 0, we run a linear program to maximize ε1 subject to the closure of the constraints (13.6)–(13.13), (13.24) for various values of m. A locally maximal supremum of \(\epsilon _{1} \approx 8.8933 \times 10^{-24}\) appears around m = 53,000,000.

For this particular construction of \(\mathcal{A}\) and \(\mathcal{B}\), the remaining bottlenecks may lie at the very foundations of additive combinatorics. For example, it is currently known that the constant c0 from Proposition 2 in [6] satisfies 1∕10430 ≤ c0 < 1. If c0 = 1∕2 (say), then taking m = 10,000 leads to ε0 being on the order of \(10^{-12}\). Another source of improvement is Lemma 5 (Corollary 1 in [8]), which is proved using a version of the Balog–Szemerédi–Gowers lemma [5]. Specifically, the power of K in Lemma 5 is precisely the coefficient of x in (13.9), as well as the coefficient of γ (less 1) in the right-hand side of (13.24); as such, decreasing this exponent would in turn enlarge the feasibility region. An alternative construction for \(\mathcal{A}\) is proposed in [7], and it would be interesting to optimize this construction as well, though the bottlenecks involving c0 and the Balog–Szemerédi–Gowers lemma are also present in this alternative.

13.5 Proofs of Technical Lemmas

This section contains the proofs of the technical lemmas (Lemmas 9 and 10) which were used to prove the main result (Theorem 2).

13.5.1 Proof of Lemma 9

First note that \(\vert A_{1}\vert M_{1} <p^{1/2+(4/3)x+\alpha -\alpha _{1}+\epsilon }\) implies that
$$\displaystyle{ \vert A_{1}\vert \vert \varOmega _{1}(a_{1})\vert <p^{1/2+(4/3)x+\alpha -\alpha _{1}+\epsilon } }$$
by (13.17), and by (13.18), we also have
$$\displaystyle{ \vert A_{2}\vert \vert \varOmega _{2}(a_{2})\vert <2p^{1/2}. }$$
As such, the triangle inequality gives that
$$\displaystyle{ \vert S(A_{1},A_{2})\vert \leq \vert A_{1}\vert \vert A_{2}\vert \vert \varOmega _{1}(a_{1})\vert \vert \varOmega _{2}(a_{2})\vert \leq 2p^{1+(4/3)x+\alpha -\alpha _{1}+\epsilon } \leq p^{1-\epsilon _{1}-\epsilon }, }$$
where the last step uses (13.6). Thus, we can assume \(\vert A_{1}\vert M_{1} \geq p^{1/2+(4/3)x+\alpha -\alpha _{1}+\epsilon }\), and so the assumption \(\vert \mathcal{A}\vert \leq p^{\alpha }\) gives
$$\displaystyle{ M_{1} \geq \frac{1} {\vert A_{1}\vert }p^{1/2+(4/3)x+\alpha -\alpha _{1}+\epsilon } \geq p^{1/2+(4/3)x-\alpha _{1}+\epsilon }. }$$
(13.25)
Applying (13.17) and (13.25) then gives
$$\displaystyle{ \vert \varOmega _{1}(a_{1})\vert \geq \frac{M_{1}} {2} \geq \frac{1} {2}p^{1/2+(4/3)x-\alpha _{1}+\epsilon }> p^{1/2+(4/3)x-\alpha _{1}+\epsilon /2} \geq p^{\ell}, }$$
where the last step uses (13.7). Note that we can redo all of the preceding analysis by interchanging indices 1 and 2. As such, we also have \(\vert \varOmega _{2}(a_{2})\vert> p^{\ell}\). (This will enable us to use hypothesis (b) in Theorem 2.) At this point, we bound
$$\displaystyle{ \bigg\vert \sum _{\begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}e_{p}\bigg( \frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg)\bigg\vert }$$
using Lemma 7:
$$\displaystyle{ \leq \vert \varOmega _{1}(a_{1})\vert ^{1/2}E(\varOmega _{ 1}(a_{1}),\varOmega _{1}(a_{1}))^{1/8}\vert \varOmega _{ 2}(a_{2})\vert ^{1/2}E(\varOmega _{ 2}(a_{2}),\varOmega _{2}(a_{2}))^{1/8}p^{1/8}. }$$
Next, since \(\vert \varOmega _{1}(a_{1})\vert,\vert \varOmega _{2}(a_{2})\vert> p^{\ell}\), hypothesis (b) in Theorem 2 with ε ← 4ε gives
$$\displaystyle{ \leq \vert \varOmega _{1}(a_{1})\vert ^{7/8}\vert \varOmega _{ 2}(a_{2})\vert ^{7/8}p^{1/8-\gamma /4+\epsilon }. }$$
At this point, the triangle inequality gives
$$\displaystyle\begin{array}{rcl} \vert S(A_{1},A_{2})\vert & \leq & \sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\end{array}}\bigg\vert \sum _{\begin{array}{c}b_{1}\in \varOmega _{1}(a_{1}) \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}e_{p}\bigg( \frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg)\bigg\vert {}\\ &\leq & \sum _{\begin{array}{c}a_{1}\in A_{1} \\ a_{2}\in A_{2}\end{array}}\vert \varOmega _{1}(a_{1})\vert ^{7/8}\vert \varOmega _{ 2}(a_{2})\vert ^{7/8}p^{1/8-\gamma /4+\epsilon } {}\\ \end{array}$$
which can be further bounded using (13.17) and (13.18):
$$\displaystyle{ \leq 2^{7/4}\vert A_{ 1}\vert ^{1/8}\vert A_{ 2}\vert ^{1/8}p^{1-\gamma /4+\epsilon } }$$
Thus, if \(\vert A_{1}\vert,\vert A_{2}\vert <p^{y}\), then
$$\displaystyle{ \vert S(A_{1},A_{2})\vert \leq 2^{7/4}p^{1+y/4-\gamma /4+\epsilon } \leq p^{1-\epsilon _{1}-\epsilon }, }$$
where the last step uses (13.8). As such, we may assume that either \(\vert A_{1}\vert\) or \(\vert A_{2}\vert\) is \(\geq p^{y}\). Without loss of generality, we assume \(\vert A_{2}\vert \geq p^{y}\). (Considering (13.16), this will enable us to use L10.)
At this point, take B0 ⊆ Ω1(a1) to be a maximal subset satisfying (13.21) for B ← B0, and denote B1: = Ω1(a1)∖ B0. Then the triangle inequality gives
$$\displaystyle{ \vert T_{a_{1}}(A_{2},B_{1})\vert \leq \sum _{a_{2}\in A_{2}}\bigg\vert \sum _{\begin{array}{c}b_{1}\in B_{1} \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}e_{p}\bigg( \frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg)\bigg\vert, }$$
and then Lemma 7 gives
$$\displaystyle{ \leq \sum _{a_{2}\in A_{2}}\vert B_{1}\vert ^{1/2}E(B_{ 1},B_{1})^{1/8}\vert \varOmega _{ 2}(a_{2})\vert ^{1/2}E(\varOmega _{ 2}(a_{2}),\varOmega _{2}(a_{2}))^{1/8}p^{1/8}. }$$
This can be bounded further by applying E(Ω2(a2), Ω2(a2)) ≤ | Ω2(a2) | 3, (13.17), (13.18) and the assumption \(\vert \mathcal{A}\vert \leq p^{\alpha }\):
$$\displaystyle{ \leq 2^{7/8}\vert B_{ 1}\vert ^{1/2}E(B_{ 1},B_{1})^{1/8}p^{\alpha /8+9/16}. }$$
(13.26)
At this point, we claim that \(E(B_{1},B_{1}) \leq p^{-x}M_{1}^{3}\). To see this, suppose otherwise. Then \(\vert B_{1}\vert ^{3} \geq E(B_{1},B_{1})> p^{-x}M_{1}^{3}\), implying
$$\displaystyle{ \vert B_{1}\vert> p^{-x/3}M_{ 1}, }$$
(13.27)
and by (13.17), we also have
$$\displaystyle{ E(B_{1},B_{1})> p^{-x}M_{ 1}^{3}> p^{-x}\vert \varOmega _{ 1}(a_{1})\vert ^{3} \geq p^{-x}\vert B_{ 1}\vert ^{3}. }$$
Thus, Lemma 5 with \(K = p^{x}\) produces a subset \(B_{1}' \subseteq B_{1}\) such that
$$\displaystyle{ \vert B_{1}'\vert \geq \frac{\vert B_{1}\vert } {20p^{x}}> \frac{M_{1}} {20p^{(4/3)x}} \geq \frac{1} {20}p^{1/2-\alpha _{1}+\epsilon } \geq p^{1/2-\alpha _{1} } }$$
where the second and third inequalities follow from (13.27) and (13.25), respectively, and
$$\displaystyle{ \vert B_{1}' - B_{1}'\vert \leq 10^{7}p^{9x}\vert B_{ 1}\vert \leq p^{9x+\epsilon }\vert B_{ 1}\vert \leq p^{\alpha _{2} }\vert B_{1}\vert, }$$
where the last step follows from (13.9). As such, \(B_{1}'\) satisfies (13.19) and (13.20), implying that \(B \leftarrow B_{1}'\) satisfies (13.21) by L10. By the triangle inequality, \(B \leftarrow B_{0} \cup B_{1}'\) also satisfies (13.21), contradicting the maximality of \(B_{0}\).
We conclude that \(E(B_{1},B_{1}) \leq p^{-x}M_{1}^{3}\), and continuing (13.26) gives
$$\displaystyle{ \vert T_{a_{1}}(A_{2},B_{1})\vert \leq 2^{7/8}\vert B_{ 1}\vert ^{1/2}M_{ 1}^{3/8}p^{9/16+\alpha /8-x/8}. }$$
Now we apply (13.21) to B ← B0 and combine with this to get
$$\displaystyle\begin{array}{rcl} \vert T_{a_{1}}(A_{2},\varOmega _{1}(a_{1}))\vert & \leq & \vert T_{a_{1}}(A_{2},B_{0})\vert + \vert T_{a_{1}}(A_{2},B_{1})\vert {}\\ &\leq & \vert B_{0}\vert p^{1/2-\epsilon _{1} } + 2^{7/8}\vert B_{ 1}\vert ^{1/2}M_{ 1}^{3/8}p^{9/16+\alpha /8-x/8}. {}\\ \end{array}$$
Applying | B0 | , | B1 | ≤ | Ω1(a1) | ≤ M1 by (13.17) then gives
$$\displaystyle{ \vert T_{a_{1}}(A_{2},\varOmega _{1}(a_{1}))\vert \leq M_{1}p^{1/2-\epsilon _{1} } + 2^{7/8}M_{ 1}^{7/8}p^{9/16+\alpha /8-x/8}. }$$
Now we apply the triangle inequality to get
$$\displaystyle\begin{array}{rcl} \vert S(A_{1},A_{2})\vert & \leq & \sum _{a_{1}\in A_{1}}\vert T_{a_{1}}(A_{2},\varOmega _{1}(a_{1}))\vert {}\\ &\leq & \vert A_{1}\vert \Big(M_{1}p^{1/2-\epsilon _{1} } + 2^{7/8}M_{ 1}^{7/8}p^{9/16+\alpha /8-x/8}\Big), {}\\ \end{array}$$
and applying (13.18) and the assumption \(\vert \mathcal{A}\vert \leq p^{\alpha }\) then gives
$$\displaystyle{ \leq 2p^{1-\epsilon _{2} } + 2^{7/4}p^{1+\alpha /4-x/8} \leq 2p^{1-\epsilon _{2} } + 2^{7/4}p^{1-\epsilon _{2} } \leq p^{1-\epsilon _{1}-\epsilon }, }$$
where the last steps use (13.22). This completes the proof.

13.5.2 Proof of Lemma 10

We start by following the proof of Lemma 10 in [8]. First, Cauchy–Schwarz along with (13.17) and (13.18) gives
$$\displaystyle\begin{array}{rcl} \vert T_{a_{1}}(A_{2},B)\vert ^{2}& =& \bigg\vert \sum _{ \begin{array}{c}a_{2}\in A_{2} \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\bigg(\frac{a_{1} - a_{2}} {p} \bigg) \cdot \sum _{b_{1}\in B}e_{p}\bigg( \frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg)\bigg\vert ^{2} {}\\ & \leq & 2\sqrt{p}\sum _{\begin{array}{c}a_{2}\in A_{2} \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\bigg\vert \sum _{b_{1}\in B}e_{p}\bigg( \frac{(b_{1} - b_{2})^{2}} {4(a_{1} - a_{2})}\bigg)\bigg\vert ^{2}. {}\\ \end{array}$$
Expanding \(\vert w\vert ^{2} = w\overline{w}\) and applying the triangle inequality then gives
$$\displaystyle{ = 2\sqrt{p}\bigg\vert \sum _{\begin{array}{c}a_{2}\in A_{2} \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}\sum _{b_{1},b\in B}e_{p}\bigg( \frac{b_{1}^{2} - b^{2}} {4(a_{1} - a_{2})} - \frac{b_{2}(b_{1} - b)} {2(a_{1} - a_{2})}\bigg)\bigg\vert \leq 2\sqrt{p}\sum _{b_{1},b\in B}\vert F(b,b_{1})\vert, }$$
where
$$\displaystyle{ F(b,b_{1}):=\sum _{\begin{array}{c}a_{2}\in A_{2} \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}e_{p}\bigg( \frac{b_{1}^{2} - b^{2}} {4(a_{1} - a_{2})} - \frac{b_{2}(b_{1} - b)} {2(a_{1} - a_{2})}\bigg). }$$
Next, Hölder’s inequality \(\|F \cdot 1\|_{1} \leq \| F\|_{m}\|1\|_{m/(m-1)}\) gives
$$\displaystyle{ \vert T_{a_{1}}(A_{2},B)\vert ^{2} \leq 2\sqrt{p}\vert B\vert ^{2-2/m}\bigg(\sum _{ b_{1},b\in B}\vert F(b,b_{1})\vert ^{m}\bigg)^{1/m}. }$$
(13.28)
To bound this, we use a change of variables x: = b1 + b ∈ B + B and y: = b1b ∈ BB and sum over more terms:
$$\displaystyle{ \sum _{b_{1},b\in B}\vert F(b,b_{1})\vert ^{m} \leq \sum _{ \begin{array}{c}x\in B+B\\ y\in B-B\end{array}}\bigg\vert \sum _{\begin{array}{c}a_{2}\in A_{2} \\ b_{2}\in \varOmega _{2}(a_{2})\end{array}}e_{p}\bigg( \frac{\mathit{xy}} {4(a_{1} - a_{2})} - \frac{b_{2}y} {2(a_{1} - a_{2})}\bigg)\bigg\vert ^{m}. }$$
Expanding \(\vert w\vert ^{m} = w^{m/2}\overline{w}^{m/2}\) and applying the triangle inequality then gives
$$\displaystyle\begin{array}{rcl} & =& \bigg\vert \sum _{\begin{array}{c}x\in B+B \\ y\in B-B\end{array}}\sum _{ \begin{array}{c}a_{2}^{(i)}\in A_{ 2} \\ b_{2}^{(i)}\in \varOmega _{ 2}(a_{2}^{(i)}) \\ 1\leq i\leq m \end{array}}\!\!\!\!\!e_{p}\bigg(\sum _{i=1}^{m/2}\Big[ \tfrac{\mathit{xy}} {4(a_{1}-a_{2}^{(i)})} - \tfrac{b_{2}^{(i)}y} {2(a_{1}-a_{2}^{(i)})} - \tfrac{\mathit{xy}} {4(a_{1}-a_{2}^{(i+m/2)})} + \tfrac{b_{2}^{(i+m/2)}y} {2(a_{1}-a_{2}^{(i+m/2)})}\Big]\bigg)\bigg\vert {}\\ & \leq & \sum _{y\in B-B}\sum _{ \begin{array}{c}a_{2}^{(i)}\in A_{ 2} \\ b_{2}^{(i)}\in \varOmega _{ 2}(a_{2}^{(i)}) \\ 1\leq i\leq m \end{array}}\bigg\vert \sum _{x\in B+B}e_{p}\bigg(\frac{\mathit{xy}} {4} \sum _{i=1}^{m/2}\bigg[ \frac{1} {a_{1} - a_{2}^{(i)}} - \frac{1} {a_{1} - a_{2}^{(i+m/2)}}\bigg]\bigg)\bigg\vert. {}\\ \end{array}$$
Next, we apply (13.17) to bound the number of m-tuples of \(b_{2}^{(i)}\)’s for each m-tuple of \(a_{2}^{(i)}\)’s (there are fewer than \(M_{2}^{m}\) of the former). Combining this with the bound above, we know there are complex numbers \(\epsilon _{y,\xi }\) of modulus ≤ 1 such that
$$\displaystyle{ \sum _{b_{1},b\in B}\vert F(b,b_{1})\vert ^{m} \leq M_{ 2}^{m}\sum _{ y\in B-B}\sum _{\xi \in \mathbb{F}_{p}}\lambda (\xi )\epsilon _{y,\xi }\sum _{x\in B+B}e_{p}(xy\xi /4), }$$
(13.29)
where
$$\displaystyle{ \lambda (\xi ):=\bigg \vert \bigg\{a^{(1)},\ldots,a^{(m)} \in A_{ 2}:\sum _{ i=1}^{m/2}\bigg[ \frac{1} {a_{1} - a^{(i)}} - \frac{1} {a_{1} - a^{(i+m/2)}}\bigg] =\xi \bigg\}\bigg \vert. }$$
To bound the ξ = 0 term in (13.29), pick \(a^{(1)},\ldots,a^{(m)} \in A_{2}\) such that
$$\displaystyle{ \sum _{i=1}^{m/2}\bigg[ \frac{1} {a_{1} - a^{(i)}} - \frac{1} {a_{1} - a^{(i+m/2)}}\bigg] = 0. }$$
(13.30)
Then
$$\displaystyle{ \sum _{i=1}^{m/2} \frac{1} {a_{1} - a^{(i)}} + \frac{m} {2} \cdot \frac{1} {a_{1} - a^{(1)}} =\sum _{ i=m/2+1}^{m} \frac{1} {a_{1} - a^{(i)}} + \frac{m} {2} \cdot \frac{1} {a_{1} - a^{(1)}}, }$$
and so by hypothesis (a) in Theorem 2, we have that \((a^{(1)},\ldots,a^{(m/2)},a^{(1)},\ldots,a^{(1)})\) is a permutation of \((a^{(m/2+1)},\ldots,a^{(m)},a^{(1)},\ldots,a^{(1)})\), which in turn implies that \((a^{(1)},\ldots,a^{(m/2)})\) and \((a^{(m/2+1)},\ldots,a^{(m)})\) are permutations of each other. Thus, all possible solutions to (13.30) are determined by \((a^{(1)},\ldots,a^{(m/2)})\). There are \(\vert A_{2}\vert ^{m/2}\) choices for this (m∕2)-tuple, and for each choice, there are (m∕2)! available permutations for \((a^{(m/2+1)},\ldots,a^{(m)})\). As such,
$$\displaystyle{ \lambda (0) = (m/2)!\vert A_{2}\vert ^{m/2}, }$$
(13.31)
which we will use later to bound the ξ = 0 term. In the meantime, we bound the remainder of (13.29). To this end, it is convenient to define the following functions:
$$\displaystyle{ \zeta '(z):=\sum _{ \begin{array}{c}y\in B-B \\ \xi \in \mathbb{F}_{p}^{{\ast}} \\ y\xi =z \end{array}}\epsilon _{y,\xi }\lambda (\xi ),\qquad \zeta (z):=\sum _{ \begin{array}{c}y\in B-B \\ \xi \in \mathbb{F}_{p}^{{\ast}} \\ y\xi =z \end{array}}\lambda (\xi ). }$$
Note that | ζ′(z) | ≤ ζ(z) by the triangle inequality. We use the triangle inequality and Hölder’s inequality to bound the ξ ≠ 0 terms in (13.29):
$$\displaystyle\begin{array}{rcl} & & \bigg\vert \sum _{y\in B-B}\sum _{\xi \in \mathbb{F}_{p}^{{\ast}}}\lambda (\xi )\epsilon _{y,\xi }\sum _{x\in B+B}e_{p}(xy\xi /4)\bigg\vert \\ & &\qquad \qquad =\bigg \vert \sum _{\begin{array}{c}x\in B+B \\ z\in \mathbb{F}_{p} \end{array}}\zeta '(z)e_{p}(xz/4)\bigg\vert \\ & &\qquad \qquad \leq \sum _{x\in \mathbb{F}_{p}}\bigg\vert 1_{B+B}(x) \cdot \sum _{z\in \mathbb{F}_{p}}\zeta '(z)e_{p}(xz/4)\bigg\vert \\ & &\qquad \qquad \leq \vert B + B\vert ^{3/4}\bigg(\sum _{ x\in \mathbb{F}_{p}}\bigg\vert \sum _{z\in \mathbb{F}_{p}}\zeta '(z)e_{p}(xz/4)\bigg\vert ^{4}\bigg)^{1/4}.{}\end{array}$$
(13.32)
To proceed, note that
$$\displaystyle\begin{array}{rcl} \bigg(\sum _{z\in \mathbb{F}_{p}}\zeta '(z)e_{p}(xz/4)\bigg)^{2}& =& \sum _{ z,z''\in \mathbb{F}_{p}}\zeta '(z)\zeta '(z'')e_{p}(x(z + z'')/4) {}\\ & =& \sum _{z'\in \mathbb{F}_{p}}(\zeta ' {\ast}\zeta ')(z')e_{p}(xz'/4), {}\\ \end{array}$$
where the last step follows from a change of variables z′ = z + z″. With this and Parseval’s identity, we continue (13.32):
$$\displaystyle\begin{array}{rcl} & =& \vert B + B\vert ^{3/4}\bigg(\sum _{ x\in \mathbb{F}_{p}}\bigg\vert \sum _{z'\in \mathbb{F}_{p}}(\zeta ' {\ast}\zeta ')(z')e_{p}(xz'/4)\bigg\vert ^{2}\bigg)^{1/4} \\ & =& \vert B + B\vert ^{3/4}\|\zeta ' {\ast}\zeta '\|_{ 2}^{1/2}p^{1/4} \\ & \leq & \vert B + B\vert ^{3/4}\|\zeta {\ast}\zeta \|_{ 2}^{1/2}p^{1/4}, {}\end{array}$$
(13.33)
where the last step follows from the fact that | (ζ′ ∗ζ′)(z) | ≤ (ζζ)(z), which can be verified using the triangle inequality. Since \(\zeta (z) =\sum _{\xi \in \mathbb{F}_{p}^{{\ast}}}1_{B-B}(z/\xi )\lambda (\xi )\), the triangle inequality gives
$$\displaystyle\begin{array}{rcl} \|\zeta {\ast}\zeta \|_{2}& =& \bigg\|\bigg(\sum _{\xi \in \mathbb{F}_{p}^{{\ast}}}\lambda (\xi )1_{\xi (B-B)}\bigg) {\ast}\bigg (\sum _{\xi '\in \mathbb{F}_{p}^{{\ast}}}\lambda (\xi ')1_{\xi '(B-B)}\bigg)\bigg\|_{2} \\ & \leq & \sum _{\xi,\xi '\in \mathbb{F}_{p}^{{\ast}}}\lambda (\xi )\lambda (\xi ')\|1_{\xi (B-B)} {\ast} 1_{\xi '(B-B)}\|_{2} \\ & =& \sum _{\xi,\xi '\in \mathbb{F}_{p}^{{\ast}}}\lambda (\xi )\lambda (\xi ')\|1_{B-B} {\ast} 1_{(\xi '/\xi )(B-B)}\|_{2}, {}\end{array}$$
(13.34)
where the last step follows from the (easily derived) fact that \(1_{B-B} {\ast} 1_{(\xi '/\xi )(B-B)}\) is a dilation of \(1_{\xi (B-B)} {\ast} 1_{\xi '(B-B)}\).
To bound (13.34), we will appeal to Corollary 2 in [8], which says that for any \(A \subseteq \mathbb{F}_{p}\) and probability measure λ over \(\mathbb{F}_{p}\),
$$\displaystyle{ \sum _{b\in \mathbb{F}_{p}^{{\ast}}}\lambda (b)\|1_{A} {\ast} 1_{\mathit{bA}}\|_{2} \ll (\|\lambda \|_{2} + \vert A\vert ^{-1/2} + \vert A\vert ^{1/2}p^{-1/2})^{c_{0} }\vert A\vert ^{3/2}, }$$
(13.35)
where ≪ is Vinogradov notation; f ≪ g means f = O(g). As such, we need to construct a probability measure and understand its 2-norm. To this end, define
$$\displaystyle{ \lambda _{1}(\xi ):= \frac{\lambda (\xi )} {\|\lambda \|_{1}} = \frac{\lambda (\xi )} {\vert A_{2}\vert ^{m}}. }$$
(13.36)
The sum \(\sum _{\xi \in \mathbb{F}_{p}}\lambda (\xi )^{2}\) is precisely the number of solutions to
$$\displaystyle{ \frac{1} {a_{1} - a^{(1)}} + \cdots + \frac{1} {a_{1} - a^{(m)}} - \frac{1} {a_{1} - a^{(m+1)}} -\cdots - \frac{1} {a_{1} - a^{(2m)}} = 0, }$$
which, by hypothesis (a) in Theorem 2, has only trivial solutions. As such, we have
$$\displaystyle{ \|\lambda \|_{2}^{2} = m!\vert A_{ 2}\vert ^{m}. }$$
(13.37)
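To see where the count m!|A₂|^m comes from, note that a trivial solution is one in which the m-tuple (a^(m+1), …, a^(2m)) is a permutation of (a^(1), …, a^(m)), and each m-tuple with distinct entries admits exactly m! such permutations. A toy count (the small set standing in for A₂ and the value of m are arbitrary, and this reading of "trivial" is an assumption of the sketch):

```python
from itertools import product
from math import factorial

A = [2, 3, 5, 7, 11]   # toy stand-in for A_2
m = 3

# Count ordered pairs of m-tuples where the second permutes the first.
trivial = sum(sorted(x) == sorted(y)
              for x in product(A, repeat=m) for y in product(A, repeat=m))

# Tuples with distinct entries each contribute exactly m! permutations, so the
# total is m!|A|^m up to lower-order corrections from repeated-entry tuples.
assert factorial(m) * 5 * 4 * 3 <= trivial <= factorial(m) * len(A) ** m
```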
Next, define λ1′(b) to be λ1(ξ′∕b) whenever b ≠ 0 and λ1(0) otherwise. Then λ1′ is a probability measure with the same 2-norm as λ1, and it allows us to apply (13.35) directly:
$$\displaystyle\begin{array}{rcl} & & \sum _{\xi \in \mathbb{F}_{p}^{{\ast}}}\lambda _{1}(\xi )\|1_{B-B} {\ast} 1_{(\xi '/\xi )(B-B)}\|_{2} \\ & & \qquad \qquad =\sum _{b\in \mathbb{F}_{p}^{{\ast}}}\lambda _{1}'(b)\|1_{B-B} {\ast} 1_{b(B-B)}\|_{2} \\ & & \qquad \qquad \ll (\|\lambda _{1}\|_{2} + \vert B - B\vert ^{-1/2} + \vert B - B\vert ^{1/2}p^{-1/2})^{c_{0} }\vert B - B\vert ^{3/2}.{}\end{array}$$
(13.38)
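The change of variables behind the first equality in (13.38) can be checked numerically: substituting b = ξ′∕ξ turns the ξ-sum weighted by λ1 into a b-sum weighted by λ1′. A toy sketch (the prime, the weights, ξ′, and the set standing in for B − B are all arbitrary):

```python
import random

p = 61
random.seed(3)

# A random probability measure lambda1 on F_p^*.
w = [random.random() for _ in range(p - 1)]
tot = sum(w)
lam1 = {x: w[x - 1] / tot for x in range(1, p)}
xi2 = 5                                       # a fixed xi'

def norm2(S, T):
    """2-norm of 1_S * 1_T over Z_p."""
    vals = [sum(((z - s) % p) in T for s in S) for z in range(p)]
    return sum(v * v for v in vals) ** 0.5

D = set(random.sample(range(p), 12))          # stands in for B - B

# Sum over xi of lambda1(xi) ||1_D * 1_{(xi'/xi) D}||_2 ...
lhs = sum(lam1[xi] * norm2(D, {((xi2 * pow(xi, -1, p)) * d) % p for d in D})
          for xi in range(1, p))
# ... equals the sum over b of lambda1'(b) ||1_D * 1_{b D}||_2.
lam1p = {b: lam1[(xi2 * pow(b, -1, p)) % p] for b in range(1, p)}
rhs = sum(lam1p[b] * norm2(D, {(b * d) % p for d in D}) for b in range(1, p))
assert abs(lhs - rhs) < 1e-8
```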
At this point, our proof deviates from the proof of Lemma 10 in [8]. By (13.36), (13.37), and (13.16), we have
$$\displaystyle{ \|\lambda _{1}\|_{2} = \vert A_{2}\vert ^{-m}\|\lambda \|_{2} \leq \sqrt{m!}\vert A_{2}\vert ^{-m/2} \leq \sqrt{m!}p^{-my/2}. }$$
Next, (13.19) and (13.20) together give
$$\displaystyle{ \vert B - B\vert \geq \vert B\vert \geq p^{1/2-\alpha _{1} } }$$
and
$$\displaystyle{ \vert B - B\vert \leq p^{\alpha _{2}}\vert B\vert \leq p^{1/2+\alpha _{2}}. }$$
Thus,
$$\displaystyle\begin{array}{rcl} \|\lambda _{1}\|_{2} + \vert B - B\vert ^{-1/2} + \vert B - B\vert ^{1/2}p^{-1/2}& \leq & \sqrt{m!}p^{-my/2} + p^{\alpha _{1}/2-1/4} + p^{\alpha _{2}/2-1/4} \\ & \leq & p^{-my/2+4m\epsilon }, {}\end{array}$$
(13.39)
where the last step follows from (13.12). So, by (13.34), (13.36), (13.38), and (13.39), we have
$$\displaystyle\begin{array}{rcl} \|\zeta {\ast}\zeta \|_{2}& \leq & \vert A_{2}\vert ^{2m}\sum _{ \xi '\in \mathbb{F}_{p}^{{\ast}}}\lambda _{1}(\xi ')\sum _{\xi \in \mathbb{F}_{p}^{{\ast}}}\lambda _{1}(\xi )\|1_{B-B} {\ast} 1_{(\xi '/\xi )(B-B)}\|_{2} \\ & \ll & \vert A_{2}\vert ^{2m}(\|\lambda _{ 1}\|_{2} + \vert B - B\vert ^{-1/2} + \vert B - B\vert ^{1/2}p^{-1/2})^{c_{0} }\vert B - B\vert ^{3/2} \\ & \leq & \vert A_{2}\vert ^{2m}p^{-(c_{0}/2)\mathit{my}+4c_{0}m\epsilon }\vert B - B\vert ^{3/2}, {}\end{array}$$
(13.40)
and subsequent application of (13.29), (13.31), (13.33), and (13.40) gives
$$\displaystyle\begin{array}{rcl} \sum _{b_{1},b\in B}\vert F(b,b_{1})\vert ^{m}& \leq & (\tfrac{m} {2} )!(M_{2}\vert A_{2}\vert )^{m}\vert A_{ 2}\vert ^{-m/2}\vert B - B\vert \vert B + B\vert \\ & +& O(M_{2}^{m}\vert A_{ 2}\vert ^{m}\vert B - B\vert ^{3/4}\vert B + B\vert ^{3/4}p^{-(c_{0}/4)\mathit{my}+2c_{0}m\epsilon }p^{1/4}).{}\end{array}$$
(13.41)
By Lemma 4 in [8] (which states that \(\vert A + A\vert \leq \vert A - A\vert ^{2}/\vert A\vert \)), condition (13.20) implies
$$\displaystyle{ \vert B + B\vert \leq \frac{\vert B - B\vert ^{2}} {\vert B\vert } \leq p^{2\alpha _{2} }\vert B\vert. }$$
We now use this with (13.18), (13.16), and (13.20) to bound (13.41):
$$\displaystyle{ \ll (\tfrac{m} {2} )!(2\sqrt{p})^{m}p^{-my/2}p^{3\alpha _{2} }\vert B\vert ^{2} + (2\sqrt{p})^{m}p^{(9/4)\alpha _{2} }\vert B\vert ^{3/2}p^{-(c_{0}/4)\mathit{my}+2c_{0}m\epsilon }p^{1/4} }$$
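The sumset bound |B + B| ≤ |B − B|²∕|B| used in the last two steps (Lemma 4 of [8], a consequence of Ruzsa-type triangle inequalities) is easy to sanity-check numerically; a sketch with an arbitrary random set of integers:

```python
import random

random.seed(2)
A = random.sample(range(1000), 30)   # arbitrary 30-element set of integers

sumset = {a + b for a in A for b in A}
diffset = {a - b for a in A for b in A}

# Lemma 4 of [8]: |A + A| <= |A - A|^2 / |A|
assert len(sumset) <= len(diffset) ** 2 / len(A)
```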
Next, the left-hand inequality of (13.19) gives that \(p^{1/4} \leq \vert B\vert ^{1/2}p^{\alpha _{1}/2}\), leading to the following bound:
$$\displaystyle{ \ll \vert B\vert ^{2}p^{m/2-my/2+3\alpha _{2} } + \vert B\vert ^{2}p^{m/2+\alpha _{1}/2+(9/4)\alpha _{2}-(c_{0}/4)\mathit{my}+2c_{0}m\epsilon }. }$$
Overall, we have
$$\displaystyle{ \sum _{b_{1},b\in B}\vert F(b,b_{1})\vert ^{m} \leq 2^{-m}\vert B\vert ^{2}p^{m/2+\alpha _{1}/2+(9/4)\alpha _{2}-(c_{0}/4)\mathit{my}+2m\epsilon }, }$$
since c0 < 1 and 3α2 − 2α1 ≤ (2 − c0)my (i.e., (13.13)). Thus, (13.28) gives
$$\displaystyle\begin{array}{rcl} \vert T(A_{2},B)\vert ^{2}& \leq & \sqrt{p}\vert B\vert ^{2-2/m}(\vert B\vert ^{2}p^{m/2+\alpha _{1}/2+(9/4)\alpha _{2}-(c_{0}/4)\mathit{my}+2m\epsilon })^{1/m} {}\\ & =& \vert B\vert ^{2}p^{1-(c_{0}y/4-\alpha _{1}/(2m)-(9\alpha _{2})/(4m))+2\epsilon }. {}\\ \end{array}$$
Finally, taking square roots produces the result.
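The exponent bookkeeping in the final display can be double-checked with exact rational arithmetic: the exponent of p in \(\sqrt{p}\,(p^{m/2+\alpha _{1}/2+(9/4)\alpha _{2}-(c_{0}/4)my+2m\epsilon })^{1/m}\) equals the one claimed. A sketch with arbitrary hypothetical parameter values (the identity is purely algebraic):

```python
from fractions import Fraction as F

# Arbitrary rational parameter choices, purely for the algebraic check.
m, y, a1, a2, c0, eps = F(4), F(1, 10), F(1, 20), F(1, 30), F(1, 2), F(1, 100)

# Exponent of p in the bound on the m-th moment sum.
E = m / 2 + a1 / 2 + F(9, 4) * a2 - (c0 * m * y) / 4 + 2 * m * eps

lhs = F(1, 2) + E / m                        # from sqrt(p) * p^(E/m)
rhs = 1 - (c0 * y / 4 - a1 / (2 * m) - F(9, 4) * a2 / m) + 2 * eps
assert lhs == rhs
```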

Footnotes

  1. Since writing the original draft of this chapter, K. Ford informed the author that an alternative to the chirp selection method is provided in [7]. We leave the impact on ε0 for future work.

Acknowledgements

The author thanks the anonymous referees for their helpful suggestions. This work was supported by NSF Grant No. DMS-1321779. The views expressed in this chapter are those of the author and do not reflect the official policy or position of the United States Air Force, Department of Defense, or the U.S. Government.

References

  1. Applebaum, L., Howard, S.D., Searle, S., Calderbank, R.: Chirp sensing codes: deterministic compressed sensing measurements for fast recovery. Appl. Comput. Harmon. Anal. 26, 283–290 (2009)
  2. Bandeira, A.S., Dobriban, E., Mixon, D.G., Sawin, W.F.: Certifying the restricted isometry property is hard. IEEE Trans. Inf. Theory 59, 3448–3450 (2013)
  3. Bandeira, A.S., Fickus, M., Mixon, D.G., Wong, P.: The road to deterministic matrices with the restricted isometry property. J. Fourier Anal. Appl. 19, 1123–1149 (2013)
  4. Baraniuk, R., Davenport, M., DeVore, R., Wakin, M.: A simple proof of the restricted isometry property for random matrices. Constr. Approx. 28, 253–263 (2008)
  5. Bourgain, J., Garaev, M.Z.: On a variant of sum-product estimates and explicit exponential sum bounds in prime fields. Math. Proc. Camb. Philos. Soc. 146, 1–21 (2009)
  6. Bourgain, J., Glibichuk, A.: Exponential sum estimate over subgroup in an arbitrary finite field. http://www.math.ias.edu/files/avi/Bourgain_Glibichuk.pdf (2011)
  7. Bourgain, J., Dilworth, S.J., Ford, K., Konyagin, S.V., Kutzarova, D.: Breaking the \(k^{2}\) barrier for explicit RIP matrices. In: STOC 2011, pp. 637–644 (2011)
  8. Bourgain, J., Dilworth, S.J., Ford, K., Konyagin, S., Kutzarova, D.: Explicit constructions of RIP matrices and related problems. Duke Math. J. 159, 145–185 (2011)
  9. Cai, T.T., Zhang, A.: Sharp RIP bound for sparse signal and low-rank matrix recovery. Appl. Comput. Harmon. Anal. 35, 74–93 (2013)
  10. Casazza, P.G., Fickus, M.: Fourier transforms of finite chirps. EURASIP J. Appl. Signal Process. 2006, 7 pp. (2006)
  11. DeVore, R.A.: Deterministic constructions of compressed sensing matrices. J. Complexity 23, 918–925 (2007)
  12. Fickus, M., Mixon, D.G., Tremain, J.C.: Steiner equiangular tight frames. Linear Algebra Appl. 436, 1014–1027 (2012)
  13. Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Springer, Berlin (2013)
  14. Koiran, P., Zouzias, A.: Hidden cliques and the certification of the restricted isometry property. arXiv:1211.0665 (2012)
  15. Mixon, D.G.: Deterministic RIP matrices: breaking the square-root bottleneck. Short, Fat Matrices (weblog). http://www.dustingmixon.wordpress.com/2013/12/02/deterministic-rip-matrices-breaking-the-square-root-bottleneck/ (2013)
  16. Mixon, D.G.: Deterministic RIP matrices: breaking the square-root bottleneck, II. Short, Fat Matrices (weblog). http://www.dustingmixon.wordpress.com/2013/12/11/deterministic-rip-matrices-breaking-the-square-root-bottleneck-ii/ (2013)
  17. Mixon, D.G.: Deterministic RIP matrices: breaking the square-root bottleneck, III. Short, Fat Matrices (weblog). http://www.dustingmixon.wordpress.com/2014/01/14/deterministic-rip-matrices-breaking-the-square-root-bottleneck-iii/ (2014)
  18. Tao, T.: Open question: deterministic UUP matrices. What's New (weblog). http://www.terrytao.wordpress.com/2007/07/02/open-question-deterministic-uup-matrices/ (2007)
  19. Tao, T., Vu, V.H.: Additive Combinatorics. Cambridge University Press, Cambridge (2006)
  20. Welch, L.R.: Lower bounds on the maximum cross correlation of signals. IEEE Trans. Inf. Theory 20, 397–399 (1974)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Air Force Institute of Technology, Wright-Patterson Air Force Base, USA
