Evasive sets, covering by subspaces, and point-hyperplane incidences

Given positive integers $k\leq d$ and a finite field $\mathbb{F}$, a set $S\subset\mathbb{F}^{d}$ is $(k,c)$-subspace evasive if every $k$-dimensional affine subspace contains at most $c$ elements of $S$. By a simple averaging argument, the maximum size of a $(k,c)$-subspace evasive set is at most $c |\mathbb{F}|^{d-k}$. When $k$ and $d$ are fixed, and $c$ is sufficiently large, the matching lower bound $\Omega(|\mathbb{F}|^{d-k})$ was proved by Dvir and Lovett. We provide an alternative proof of this result using the random algebraic method. We also prove sharp upper bounds on the size of $(k,c)$-subspace evasive sets in the case where $d$ is large, extending results of Ben-Aroya and Shinkar. The existence of optimal evasive sets has several interesting consequences in combinatorial geometry. We show that the minimum number of $k$-dimensional linear hyperplanes needed to cover the grid $[n]^{d}\subset \mathbb{R}^{d}$ is $\Omega_{d}\big(n^{\frac{d(d-k)}{d-1}}\big)$, which matches the upper bound proved by Balko, Cibulka, and Valtr, and settles a problem proposed by Brass, Moser, and Pach. Furthermore, we improve the best known lower bound on the maximum number of incidences between points and hyperplanes in $\mathbb{R}^{d}$ assuming their incidence graph avoids the complete bipartite graph $K_{c,c}$ for some large constant $c=c(d)$.


Introduction
Given a finite field $\mathbb{F}$, a set of points $S \subset \mathbb{F}^d$ is $(k,c)$-subspace evasive if no $k$-dimensional affine subspace contains more than $c$ elements of $S$. This notion was first investigated in an influential work of Pudlák and Rödl [18], who observed that explicit constructions of evasive sets in $\mathbb{F}_2^n$ can be transformed into explicit constructions of bipartite Ramsey graphs. In particular, they showed that a $(d/2, c)$-evasive set $S \subset \mathbb{F}_2^d$ can be used to construct a bipartite graph with vertex classes of size $|S|$ containing no complete or empty bipartite graph with parts of size more than $c$. Evasive sets also have applications in coding theory, in the context of list-decoding, and in combinatorial geometry, where they can be used to obtain incidence bounds.

Evasive sets and coding theory
Error-correcting codes are used for controlling errors in data transmission over noisy or unreliable communication channels, and they have been extensively studied over the last 70 years in information theory, computer science, and telecommunication. An $[m,r,t]$-code over the field $\mathbb{F}$ is a linear subspace $L < \mathbb{F}^m$ of dimension $r$ such that the Hamming distance between any two distinct elements of $L$ is at least $t$, or equivalently, $L$ contains no nonzero vector with fewer than $t$ nonzero coordinates. In practice, an $[m,r,t]$-code can be used to send $r$ ($\mathbb{F}$-ary) bits of data using $m$ bits, and is capable of correcting $\lfloor (t-1)/2 \rfloor$ faulty bits. In other words, the Hamming balls of radius $\lfloor (t-1)/2 \rfloor$ centered at the codewords of $L$ are disjoint, which gives the celebrated Hamming bound $m \le O_t\big(|\mathbb{F}|^{\frac{m-r}{\lfloor (t-1)/2 \rfloor}-1}\big)$ (see, e.g., [22]).
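To make the sphere-packing picture concrete, here is a minimal sketch (our own toy example, not from the paper) checking the disjointness of the Hamming balls for the classical $[7,4,3]$ binary Hamming code; the generator matrix and parameter choices below are ours.

```python
from itertools import product

# Generator matrix of the classical [7,4,3] binary Hamming code (toy example).
G = [(1,0,0,0,0,1,1),
     (0,1,0,0,1,0,1),
     (0,0,1,0,1,1,0),
     (0,0,0,1,1,1,1)]

def encode(msg):
    # Multiply the message row vector by G over F_2.
    return tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))

codewords = {encode(msg) for msg in product((0, 1), repeat=4)}

# Minimum distance = minimum weight of a nonzero codeword (by linearity).
min_dist = min(sum(c) for c in codewords if any(c))

def ball(c):
    # Hamming ball of radius (t-1)//2 = 1 around codeword c.
    pts = {c}
    for i in range(7):
        flipped = list(c)
        flipped[i] ^= 1
        pts.add(tuple(flipped))
    return pts

covered = set()
for c in codewords:
    b = ball(c)
    assert covered.isdisjoint(b)  # balls of radius 1 are pairwise disjoint
    covered |= b

print(min_dist, len(codewords), len(covered))  # 3 16 128
```

Here the $2^4$ codewords with disjoint radius-1 balls exactly fill $\mathbb{F}_2^7$, the extreme case of the packing inequality behind the Hamming bound.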
A matrix $M \in \mathbb{F}^{(m-r)\times m}$ whose kernel is $L$ is a parity-check matrix of $L$. It is easy to show that $L$ is an $[m,r,t]$-code if and only if any $t-1$ columns of $M$ are linearly independent. Therefore, the problem of constructing $[m,r,t]$-codes is equivalent to the construction of a set $S$ of $m$ vectors in $\mathbb{F}^{m-r}$, forming the columns of $M$, such that no $(t-2)$-dimensional subspace contains $t-1$ elements of $S$. Writing $d = m-r$ and $k = t-2$, the Hamming bound shows that if $S \subset \mathbb{F}^d$ is such that no $k$-dimensional linear subspace contains $k+1$ elements of $S$, then
$$|S| \le O_k\big(|\mathbb{F}|^{\frac{d}{\lfloor (k+1)/2 \rfloor}-1}\big). \qquad (1)$$
A list-decoding problem deals with the case when we receive a message with more than $(t-1)/2$ faulty bits. In this case we might not be able to uniquely determine the original message, but we can sometimes output a small list of possibilities. An error-correcting code $L \subset \mathbb{F}^m$ is $(\rho, c)$ list-decodable if the Hamming ball of radius $\rho m$ around every element of $L$ contains at most $c$ elements of $L$. In 2011, Guruswami [14] discovered an important connection between evasive sets and list-decodable codes, showing in [14] that subspace evasive sets give rise to list-decodable codes. Furthermore, Guruswami [14] observed that a random set $S$ of size $|\mathbb{F}|^{d-k-\delta}$ is $(k, O(kd/\delta))$-subspace evasive with high probability, and so such a set can be used to construct list-decodable codes of near optimal capacity. In this setting, one thinks of $k$ as being fixed, while $d$ (and possibly $|\mathbb{F}|$) are large. Taking $\delta = \varepsilon d$ in the above result implies that a random set of $|\mathbb{F}|^{d(1-\varepsilon)}$ points is $(k,c)$-subspace evasive with $c = O(k/\varepsilon)$. Notably, $c$ does not depend on $|\mathbb{F}|$ or $d$. This simple probabilistic argument vastly outperforms every known explicit construction, so the main focus here is to find deterministic $(k,c)$-subspace evasive sets $S$ of size $|\mathbb{F}|^{d(1-\varepsilon)}$ with $c$ as small as possible; see, e.g., [4, 10]. On the other hand, Ben-Aroya and Shinkar [4] proved that when $\varepsilon^{-1} \le k^{O(1)}$, the bound $c = O(k/\varepsilon)$ cannot be improved, therefore the probabilistic construction is optimal. We extend this result, showing that it remains true for every $\varepsilon^{-1} = 2^{O(k)}$ as well.
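The probabilistic argument above can be illustrated at toy scale (the parameters $p = 7$, $d = 3$, $k = 1$ and the sample size are our choices, not the paper's): a small random subset of $\mathbb{F}_7^3$ meets every affine line in only a few points.

```python
import random
from itertools import product

p, d = 7, 3  # toy parameters (ours); k = 1, so the subspaces are affine lines
points = list(product(range(p), repeat=d))

# One representative per projective direction in F_p^d.
reps, seen = [], set()
for v in points[1:]:  # skip the zero vector
    if v not in seen:
        reps.append(v)
        for t in range(1, p):
            seen.add(tuple(t * x % p for x in v))

# All affine lines: shift each direction by every base point, dedupe.
lines = set()
for v in reps:
    for b in points:
        lines.add(frozenset(tuple((b[i] + t * v[i]) % p for i in range(d))
                            for t in range(p)))

random.seed(0)
S = set(random.sample(points, 40))  # roughly p^(d-1-delta) random points
max_on_line = max(len(S & line) for line in lines)
print(len(lines), max_on_line)
```

With $57$ directions and $p^2$ lines per direction there are $2793$ lines in total, and the maximum number of sampled points on a common line stays far below $|S|$, in line with $c = O(k/\varepsilon)$ being independent of $|\mathbb{F}|$ and $d$.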
Theorem 1.1. Let $\mathbb{F}$ be a field, let $k$ be a positive integer, and let $0 < \varepsilon < 1/20$. If $d$ is sufficiently large with respect to $k$, then the following holds.
Observe that the latter is a natural barrier, as in the case $\mathbb{F} = \mathbb{F}_2$, a $k$-dimensional affine subspace cannot contain more than $2^k$ points. Therefore, if one wants to extend Theorem 1.1 beyond $\varepsilon^{-1} \ge 2^k$, the field $\mathbb{F}$ also has to play some role. This setting, already for $k = 1$, seems extremely difficult. Bounding the size of a set in $\mathbb{F}_3^d$ containing no three points on a line is equivalent to the famous cap set problem, for which the upper bound $2.756^d$ was recently proved by Ellenberg and Gijswijt [11], following the breakthrough of Croot, Lev, and Pach [9]. However, no similar results are known for four points on a line.
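For intuition on the cap set problem, a brute-force check at toy scale (the dimension $d = 2$ is our choice) confirms that the largest subset of $\mathbb{F}_3^2$ with no three points on a line has size 4.

```python
from itertools import combinations, product

p, d = 3, 2  # tiny instance of the cap set problem (our parameters)
points = list(product(range(p), repeat=d))

def on_a_line(a, b, c):
    # Three distinct points of F_3^d are collinear iff a + b + c = 0
    # coordinatewise (the standard "SET" criterion in characteristic 3).
    return all((a[i] + b[i] + c[i]) % p == 0 for i in range(d))

def is_cap(S):
    return not any(on_a_line(*t) for t in combinations(S, 3))

# Exhaustive search over all subsets of the 9-point plane.
max_cap = max(len(S) for r in range(len(points) + 1)
              for S in combinations(points, r) if is_cap(S))
print(max_cap)  # 4
```

Already in two dimensions the cap density is $4/9 < 1$, and the Ellenberg-Gijswijt bound shows that the density decays exponentially in $d$.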
It appears that the upper bound on evasiveness behaves very differently in the regime when $c$ is close to $k$. In this case, we show that the Hamming bound mentioned above can be used to estimate the size of $(k, k+C)$-subspace evasive sets, where $C < k/2$. Interestingly, the method of proof for this range of parameters is fundamentally different from that of Theorem 1.1. While the proof of that theorem is mostly combinatorial, relying on a generalization of the Erdős Box theorem [12], the proof of the following result (like the proof of the Hamming bound) is based on coding theory.
Using the standard probabilistic argument, one can easily show that the bound in this theorem is optimal up to a factor of 2 in the exponent. To provide a different perspective, and for the convenience of the reader, we give a short alternative proof of this theorem. Note the striking difference between the bound of Theorem 1.3 and the lower bounds in the case where $c$ is small. This leads to the natural question of how $c(d,k)$ depends on the parameters $d$ and $k$. The proof of Dvir and Lovett [10] gives $c(d,k) = d^k$ (if $|\mathbb{F}|$ is sufficiently large), which is likely to be far from optimal, while our proof gives even worse bounds. On the other hand, applying Theorem 1.1 with $\varepsilon = \max\{\frac{k}{d}, \frac{1}{2^{k-1}}\}$, we get the lower bound $c(d,k) = \Omega(\min\{d, 2^k\})$. This might raise the question of whether $c(d,k)$ can be bounded by a function of $k$ alone. However, this is already false for $k = 1$. Indeed, if $d$ is sufficiently large with respect to $k$ and $\mathbb{F}$, then the density Hales-Jewett theorem [13] implies that any subset $S \subset \mathbb{F}^d$ of size at least $\frac{1}{|\mathbb{F}|^k}|\mathbb{F}|^d = |\mathbb{F}|^{d-k}$ contains a combinatorial line, which in turn is also a complete 1-dimensional affine subspace.

Covering by subspaces
Theorem 1.3 has a number of interesting applications in combinatorial geometry. The following problem first appeared in a paper of Brass and Knauer [5] in connection to point-hyperplane incidences, which we discuss in more detail in the next subsection. Given positive integers $n, k, d, c$ with $k \le d$, determine the maximum number of lattice points in the grid $[n]^d = \{1, \ldots, n\}^d$ with no $k$-dimensional linear or affine subspace containing more than $c$ of them (over $\mathbb{R}$). Let $\ell(d,k,n,c)$ denote this maximum in the linear case, and $a(d,k,n,c)$ in the affine case. Here, we are interested in the behavior of $\ell(d,k,n,c)$ and $a(d,k,n,c)$ as a function of $n$, while we think of $k, d, c$ as fixed. Clearly, we have $a(d,k,n,c) \le cn^{d-k}$, as we can cover $[n]^d$ by $n^{d-k}$ affine hyperplanes of dimension $k$. On the other hand, a probabilistic argument of Brass and Knauer [5] shows that for every $\varepsilon > 0$ there exists $c$ such that $a(d,k,n,c) \ge n^{d-k-\varepsilon}$. Determining $\ell(d,k,n,c)$ seems to be more difficult. Brass and Knauer [5] proposed a conjecture about its order of magnitude. However, this was refuted by Lefmann [17]. In general, Balko, Cibulka, and Valtr [2] showed that $g(d,k,n) = n^{\frac{d(d-k)}{d-1}+o(1)}$, where the lower bound comes from proving $\ell(d,k,n,c) \ge n^{d(d-k)/(d-1)-\varepsilon}$ for some $\varepsilon = \varepsilon_{d,k}(c)$ tending to 0 as $c$ tends to infinity. If $k = 1$, it was shown by Konyagin and Sudakov [15] that the $o(1)$ and $\varepsilon$ terms can be removed, closing the gap in this case. Here, we close the gap for all values of $k$ and $d$. We will give a very short alternative proof of Corollary 1.6 as well, which does not rely on Theorem 1.5. Finally, let us remark that $c = c(d,k)$ denotes the same function in Theorems 1.3, 1.4 and 1.5.
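The trivial cover behind the bound $a(d,k,n,c) \le cn^{d-k}$ can be sketched as follows (a minimal illustration with our own toy parameters $n = 3$, $d = 3$, $k = 1$): fixing the last $d-k$ coordinates yields $n^{d-k}$ axis-parallel $k$-flats that together cover the grid.

```python
from itertools import product

n, d, k = 3, 3, 1  # toy parameters (ours)
grid = set(product(range(1, n + 1), repeat=d))

# One k-dimensional affine subspace ("coordinate k-flat") per choice of the
# last d-k coordinates; the first k coordinates range over all of [n].
flats = [{head + tail for head in product(range(1, n + 1), repeat=k)}
         for tail in product(range(1, n + 1), repeat=d - k)]

covered = set().union(*flats)
print(len(flats), covered == grid)  # 9 True
```

Since each of the $n^{d-k}$ flats meets an evasive set in at most $c$ points, the size bound follows immediately.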

Point-hyperplane incidences
One of the fundamental results in combinatorial geometry is the Szemerédi-Trotter theorem [21], which states that the maximum number of incidences between $n$ points and $m$ lines is $O((mn)^{2/3} + m + n)$, and this bound is the best possible. Extending this result to higher dimensions is a notorious open problem. Given a set of points $P$ and a set of hyperplanes $H$ in $\mathbb{R}^d$, let $I(P,H)$ denote the number of incidences between $P$ and $H$, that is, the number of pairs $(p, H) \in P \times H$ such that $p \in H$. Note that in $\mathbb{R}^3$, by taking $n$ points on a single line and $m$ planes containing this line, we obtain a collection of $n$ points and $m$ planes with $mn$ incidences. Therefore, in order to avoid this triviality, we forbid a complete bipartite graph $K_{c,c}$ in the incidence graph of the configuration. That is, if $P$ is a set of $n$ points and $H$ is a set of $m$ hyperplanes in $\mathbb{R}^d$, we are interested in the maximum of $I(P,H)$ as a function of $m$ and $n$, assuming there are no $c$ hyperplanes containing the same $c$ points. Let $f(d,n,m,c)$ denote this maximum.
It follows from works of Chazelle [8], Brass and Knauer [5], and Apfelbaum and Sharir [1] that $f(d,n,m,c) = O_{d,c}\big((mn)^{1-1/(d+1)} + m + n\big)$. However, this bound is only known to be sharp in case $d = 2$. Brass and Knauer [5] observed that large sets of lattice points satisfying the conditions of Theorems 1.4 and 1.5 can be used to provide lower bounds for $f(d,n,m,c)$: for every pair of integers $m$ and $n$, and real number $\varepsilon > 0$, they showed that there exists $c$ such that $f(d,n,m,c)$ admits a polynomial lower bound in terms of $m$ and $n$. By improving the known lower bounds on $\ell(d,k,n,c)$, Balko, Cibulka, and Valtr [2] improved the lower bounds on $f(d,n,m,c)$ as well for $d \ge 4$. By using Theorems 1.4 and 1.5, we further improve their result, and as these theorems are optimal (up to the value of $c$), we reach the full potential of the approach outlined by Brass and Knauer [5].
Theorem 1.7. For every positive integer $d$ there exists $c$ such that the following holds. For all positive integers $m$ and $n$, there exist a set $P$ of $n$ points and a set $H$ of $m$ hyperplanes in $\mathbb{R}^d$ such that the incidence graph of $P$ and $H$ is $K_{c,c}$-free, and $I(P,H)$ is large.

In certain asymmetric settings, i.e., when $n$ is much larger than $m$, better bounds are known; see [19].
The rest of this paper is organized as follows.In Section 2, we prove Theorems 1.1 and 1.4.Then, in Section 3, we prove Theorem 1.3.In Section 4, we prove Theorem 1.5, and give an alternative proof of Corollary 1.6.Finally, in Section 5, we give a proof sketch of Theorem 1.7.

Lower bounds for evasiveness
In this section, we prove Theorems 1.1 and 1.4. In order to prove Theorem 1.1, we consider a variant of the Erdős Box theorem [12]. This theorem is a generalization of the Kővári-Sós-Turán theorem [16], providing upper bounds on the maximum number of edges of an $r$-partite $r$-uniform hypergraph with parts of size $n$ containing no copy of the complete $r$-partite $r$-uniform hypergraph $K_{s_1,\ldots,s_r}$. As we require a version of the Box theorem in which the parts of the host hypergraph have different sizes (which is not a standard setting), we present a short proof of the result that we need. With a slight abuse of notation, given an $r$-uniform $r$-partite hypergraph $H$ with vertex classes $V_1, \ldots, V_r$, we view edges of $H$ both as $r$-element subsets of the vertex set and as elements of the Cartesian product $V_1 \times \cdots \times V_r$. We also denote by $X^{(s)}$ the collection of all $s$-element subsets of the set $X$.
Lemma 2.1. Let $r$ and $s_1, \ldots, s_r \ge 2$ be positive integers, and let $H$ be an $r$-partite $r$-uniform hypergraph.

Proof. We prove this by induction on $r$. In the case $r = 1$, $H$ has at least $2s_1$ edges, so the statement is true. Let us assume that $r \ge 2$.
Let $U = V_2 \times \cdots \times V_r$, and let $t$ be the number of edges of $H$. For each $f \in U$, let $d(f)$ denote the number of edges of $H$ containing $f$. Also, for every set of vertices $S \subset V_1$, let $N(S)$ denote the set of tuples $f \in U$ such that $\{v\} \cup f$ is an edge of $H$ for every $v \in S$. Then we have the following equality: By the convexity of the function $x \mapsto \binom{x}{s_1}$, and recalling that $\sum_{f \in U} d(f) = t$, we can write the following inequality:

The last inequality holds by the assumed condition on $t/|U|$.
Therefore, by the pigeonhole principle, there exists $S_1 \in V_1^{(s_1)}$ such that $N(S_1)$ is large. Let $H'$ be the $(r-1)$-partite $(r-1)$-uniform hypergraph with vertex classes $V_2, \ldots, V_r$ and set of edges $E(H') = N(S_1)$. Then we can apply our induction hypothesis to conclude that there exist $S_2 \subset V_2, \ldots, S_r \subset V_r$ such that $|S_i| = s_i$ for $i = 2, \ldots, r$, and the conclusion of the lemma holds.

Proof of Theorem 1.1. Let us introduce some parameters. Let $r = \lfloor \log_2(1/\varepsilon) \rfloor - 1$; then we may assume that $r \le k$, otherwise the statement of the theorem is vacuous. Choose parameters $t_1, \ldots, t_{r-1}$ and set $T = t_1 + \cdots + t_{r-1}$, assuming $d$ is sufficiently large with respect to $r$. Furthermore, for $i = 1, \ldots, r-1$, let $V_i = \mathbb{F}^{t_i}$, and let $V_r = \mathbb{F}^{d-T}$. We would like to apply Lemma 2.1 with $s_1 = \cdots = s_{r-1} = 2$ and $s_r = k - r + 2$ to the hypergraph $H$ to find suitable sets $S_1, \ldots, S_r$. However, in order to do this, we need to verify that $H$ satisfies the conditions of the lemma. First of all, for $i \in [r-1]$, we have the required bound, where the second inequality holds assuming $d$ is sufficiently large with respect to $r$. Furthermore, note that $\frac{1}{8\varepsilon} < s_1 \cdots s_{r-1} = 2^{r-1} \le \frac{1}{4\varepsilon}$. Therefore, we can write the required chain of inequalities, where the last inequality holds by assuming $d$ is sufficiently large with respect to $k$. Thus, the conditions of Lemma 2.1 are satisfied, so we can find the sets $S_1, \ldots, S_r$, and let $S_r = \{w_0, \ldots, w_{k-r+1}\}$. Given $w \in V_i$ for some $i \in [r]$, let $w' \in \mathbb{F}^d$ denote the vector which agrees with $w$ on the coordinates corresponding to $V_i$, and vanishes on all other coordinates. Then $W$ is contained in a $k$-dimensional affine subspace.

Finally, let us present the proof of Theorem 1.2.
Proof of Theorem 1.2. For the convenience of the reader, we first recall the proof of the Hamming bound, that is, (1). Let $S \subset \mathbb{F}^d$ be a set of vectors such that no $k$-dimensional linear subspace contains $k+1$ elements. Without loss of generality, assume that $S$ spans $\mathbb{F}^d$. Let $M \in \mathbb{F}^{d \times |S|}$ be the matrix whose columns are the elements of $S$. Then $L = \ker(M) < \mathbb{F}^{|S|}$ does not contain a nonzero vector with at most $k+1$ nonzero coordinates, which implies that $L$ is an $[|S|, |S|-d, k+2]$-code. Hence, the Hamming balls of radius $r = \lfloor \frac{k+1}{2} \rfloor$ around the elements of $L$ are disjoint. The size of such a ball is at least $\binom{|S|}{r}(|\mathbb{F}|-1)^r$. Indeed, select $W_1, \ldots, W_{C+1}$ one-by-one, at each step deleting the selected set from $S$. But then
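One way to finish the counting (our reconstruction of the standard sphere-packing step, with $r = \lfloor (k+1)/2 \rfloor$) is the following:

```latex
|L| \cdot \binom{|S|}{r}(|\mathbb{F}|-1)^r \le |\mathbb{F}|^{|S|},
\qquad |L| = |\mathbb{F}|^{|S|-d},
\quad\text{so}\quad
\Big(\frac{|S|}{r}\Big)^r (|\mathbb{F}|-1)^r
  \le \binom{|S|}{r}(|\mathbb{F}|-1)^r \le |\mathbb{F}|^{d},
```

which rearranges to $|S| \le r\,|\mathbb{F}|^{d/r}/(|\mathbb{F}|-1) = O_k\big(|\mathbb{F}|^{d/\lfloor (k+1)/2 \rfloor - 1}\big)$, matching the Hamming bound stated in the introduction.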

Optimal constructions of evasive sets
In this section, we give an alternative proof of Theorem 1.3.Our proof is based on the random algebraic method pioneered by Bukh, and uses the ideas from his paper [7].With slight abuse of notation, let us exchange k with d − k for our (and the reader's) future convenience, so we prove the following equivalent formulation of Theorem 1.3.
which is the set of possible exponents of the monomials of the polynomials in $Q_D$. Let $q_1, \ldots, q_d$ be random elements of $Q_D$ chosen independently from the uniform distribution, and set $q = (q_1, \ldots, q_d)$. Our goal is to show that the set $H = \{q(z) : z \in \mathbb{F}^k\}$ has the property that no $(d-k)$-dimensional affine subspace of $\mathbb{F}^d$ contains more than $c$ elements of $H$ with high probability, if $c$ is sufficiently large with respect to $k$ and $d$.
We prepare the proof of this with a number of claims.First, let us state three simple observations that we will use repeatedly.
(i) If $M \in \mathbb{F}^{k \times d}$ has rank $k$, and $v \in \mathbb{F}^d$ is chosen randomly from the uniform distribution, then $Mv$ is uniformly distributed in $\mathbb{F}^k$.

The coefficient of $x^\alpha$ in $\langle q, v_i \rangle$ is $(Mc_\alpha)(i)$. As $Mc_\alpha$ is uniformly distributed in $\mathbb{F}^k$ and $(c_\alpha)_{\alpha \in \Lambda_D}$ are independent, this proves the claim.
Proof. This follows as the constant term of $q_i$ is uniformly distributed in $\mathbb{F}$.

Proof. First, suppose that the first coordinates of the vectors $z_1, \ldots, z_s$ are pairwise distinct. For $\alpha \in \{0, 1, \ldots, s-1\}$ and $i \in [d]$, let $c_{i,\alpha}$ be the coefficient of $x_1^\alpha$ in $q_i$. Also, let $c_i = (c_{i,0}, \ldots, c_{i,s-1})$ and $y_i = (1, z_i(1), z_i(1)^2, \ldots, z_i(1)^{s-1})$. Then $y_1, \ldots, y_s$ are linearly independent, using that $z_1(1), \ldots, z_s(1)$ are pairwise distinct, and so the Vandermonde determinant is nonzero. Let $M \in \mathbb{F}^{s \times s}$ be the matrix whose rows are $y_1, \ldots, y_s$. Then $M$ has rank $s$, so $Mc_i$ is uniformly distributed in $\mathbb{F}^s$. As $c_1, \ldots, c_d$ are independent, we get that the variables $X_{i,j} = (Mc_i)(j)$ are independent and uniformly distributed. We can write $q_i(z_j) = X_{i,j} + Y_{i,j}$, where $X_{i,j}$ and $Y_{i',j'}$ are independent (since these variables depend on disjoint sets of random coefficients), hence $(q_i(z_j))_{i \in [d], j \in [s]}$ are independent as well (see (iii)). Now consider the general case. We show that there exists an invertible matrix $M \in \mathbb{F}^{k \times k}$ such that $Mz_1, \ldots, Mz_s$ have pairwise distinct first coordinates. As $M$ is a change of basis, the polynomial $q_i'$ defined as $q_i'(x) = q_i(M^{-1}x)$ is also uniformly distributed in $Q_D$, so then we are done by the previous argument. Choose $M$ randomly from the uniform distribution on all invertible matrices. Then for $1 \le i < j \le s$, we have $\mathbb{P}\big((Mz_i)(1) = (Mz_j)(1)\big) = (|\mathbb{F}|^{k-1}-1)/(|\mathbb{F}|^k-1) < 1/|\mathbb{F}|$, as $M(z_i - z_j)$ is uniformly distributed on $\mathbb{F}^k \setminus \{0\}$. Hence, by the union bound, the probability that there exist $1 \le i < j \le s$ such that $(Mz_i)(1) = (Mz_j)(1)$ is at most $\binom{s}{2}/|\mathbb{F}| < 1$, implying the existence of the desired matrix $M$.
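The Vandermonde step can be checked concretely (our own toy parameters $p = 7$, $s = 3$; the evaluation points are an arbitrary choice): with distinct nodes, the coefficient-to-evaluation map is a bijection, so uniform coefficients give uniform, independent evaluations.

```python
from itertools import product

p, s = 7, 3         # toy field size and number of evaluation points (ours)
zs = [1, 2, 4]      # distinct nodes in F_p

def evaluate(coeffs, z):
    # c_0 + c_1 z + ... + c_{s-1} z^{s-1} over F_p
    return sum(c * pow(z, i, p) for i, c in enumerate(coeffs)) % p

# The coefficient -> evaluation map is multiplication by a Vandermonde
# matrix with distinct nodes, hence invertible over F_p.
images = {tuple(evaluate(c, z) for z in zs)
          for c in product(range(p), repeat=s)}
print(len(images) == p ** s)  # True: a bijection on F_p^s
```

Since the map is a bijection of $\mathbb{F}_p^s$, pushing forward the uniform distribution on coefficient vectors yields the uniform distribution on evaluation vectors, which is exactly the independence used in the proof.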
Let $V$ be a $(d-k)$-dimensional affine subspace of $\mathbb{F}^d$ and let $z \in \mathbb{F}^k$. Let $I(z,V)$ be the indicator random variable of the event $\{q(z) \in V\}$. Then there exist $k$ linearly independent vectors $v_1, \ldots, v_k \in \mathbb{F}^d$ and $b \in \mathbb{F}^k$ such that $I(z,V) = 1$ if and only if $\langle q(z), v_i \rangle = b(i)$ for every $i \in [k]$. Let $N(V) = \sum_{z \in \mathbb{F}^k} I(z,V)$. Let $e_1, \ldots, e_d \in \mathbb{F}^d$ be the unit basis, that is, $e_i(j) = 1$ if $i = j$, and $e_i(j) = 0$ otherwise. Let $E$ be the $(d-k)$-dimensional linear subspace with normal vectors $e_1, \ldots, e_k$. By Claims 3.2 and 3.3, $N(V)$ has the same distribution as $N(E)$, so for simplicity, write $N = N(E)$ and $I(z) = I(z,E)$. Also, observe that $I(z)$ is the indicator random variable of the event $q_1(z) = \cdots = q_k(z) = 0$. Therefore, by Claims 3.3 and 3.4, we have that $\mathbb{P}(I(z) = 1) = 1/|\mathbb{F}|^k$, and if $s \le \min\{D, |\mathbb{F}|^{1/2}\}$ and $z_1, \ldots, z_s \in \mathbb{F}^k$ are distinct, then $I(z_1), \ldots, I(z_s)$ are independent.

Incidences
As the proof of Theorem 1.7 is essentially identical to the proofs of [5] and [2], let us only give a very brief outline of it.
Proof sketch of Theorem 1.7.

Theorem 1.3.
([10]) For every pair of positive integers $k, d$ satisfying $k \le d$, there exists a positive integer $c = c(d,k)$ such that the following holds. For every finite field $\mathbb{F}$, there exists a $(k,c)$-subspace evasive set of size $|\mathbb{F}|^{d-k}/3$ in $\mathbb{F}^d$.

Theorem 1.4.
was only known in the two special cases $k = 1$ and $k = d-1$. A straightforward application of Theorem 1.3 lets us close the gap between the lower and upper bound for every $k < d$ and sufficiently large $c$. For every pair of positive integers $k, d$ satisfying $k \le d$, there exists a positive integer $c = c(d,k)$ such that the following holds. For every positive integer $n$ there exists a set $S \subset [n]^d$ of size at least $(n/2)^{d-k}$ such that no $k$-dimensional affine hyperplane contains more than $c$ elements of $S$. Indeed, let $p$ be any prime between $n/2$ and $n$, which exists by Bertrand's postulate. Let $c = c(d,k)$ be the constant guaranteed by Theorem 1.3, and let $S_0 \subset \mathbb{F}_p^d$ be a set of $p^{d-k} \ge (n/2)^{d-k}$ vectors such that no $k$-dimensional affine subspace contains more than $c$ elements of $S_0$. Setting $S$ to be the set of lattice points in $[p]^d$ that are congruent to the elements of $S_0$ modulo $p$ gives the desired set.
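The mod-$p$ lifting step can be illustrated on a toy example (the parabola below is our own stand-in for the evasive set $S_0$, with $k = 1$, $d = 2$): the parabola $\{(x, x^2)\} \subset \mathbb{F}_p^2$ meets every affine line in at most 2 points, and its lift to $[p]^2$ inherits this over $\mathbb{R}$, since three lifted points collinear over $\mathbb{R}$ would reduce to three distinct collinear points modulo $p$.

```python
from itertools import combinations

p = 11  # any prime; small enough for a brute-force check

def rep(v):
    # Representative of v modulo p inside {1, ..., p}.
    return (v - 1) % p + 1

# Lift of the (1,2)-evasive parabola {(x, x^2)} from F_p^2 to the grid [p]^2.
S = [(x, rep(x * x)) for x in range(1, p + 1)]

def collinear(a, b, c):
    # Collinearity over the reals, checked in exact integer arithmetic.
    return (b[0] - a[0]) * (c[1] - a[1]) == (c[0] - a[0]) * (b[1] - a[1])

bad = sum(collinear(a, b, c) for a, b, c in combinations(S, 3))
print(len(S), bad)  # 11 0: no real line contains 3 points of the lift
```

The same congruence argument is what makes the reduction from Theorem 1.3 to Theorem 1.4 work for all $k$ and $d$.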
for most values of $k$ and $d$, as he showed that $\ell(d,k,n,k) = O_d(n^{d/\lfloor k/2 \rfloor})$ (akin to the Hamming bound mentioned in the previous subsection). Similarly to the affine case, bounding $\ell(d,k,n,c)$ is closely related to the problem of bounding $g(d,k,n)$, which is the minimum number of $k$-dimensional linear hyperplanes in a covering of $[n]^d$. Indeed, we trivially have $\ell(d,k,n,c) \le c \cdot g(d,k,n)$. The problem of estimating $g(d,k,n)$ was proposed by Brass, Moser, and Pach [6] (Problem 6 in Chapter 10.2). Bárány, Harcos, Pach, and Tardos [3] resolved the $k = d-1$ case of both problems by showing that

Theorem 1.5.
For every pair of positive integers $k, d$ satisfying $k \le d$, there exist a positive integer $c = c(d,k)$ and a real number $C = C(d,k) > 0$ such that the following holds. For every positive integer $n$ there exists a set $S \subset [n]^d$ of size at least $Cn^{d(d-k)/(d-1)}$ such that no $k$-dimensional linear hyperplane contains more than $c$ elements of $S$.

Corollary 1.6. Let $k, d$ be positive integers satisfying $k < d$. Then there exists $C > 0$ such that the following holds for every positive integer $n$: the number of $k$-dimensional hyperplanes in any covering of $[n]^d$ is at least $Cn^{d(d-k)/(d-1)}$.
so $S_1, \ldots, S_r$ satisfy the required properties. Now we are ready to prove Theorem 1.1.
spans a linear subspace of dimension at most $k$, and $|W| = k + C + 1$, showing that $S$ is not $(k, k+C)$-evasive.

Theorem 3.1.
For every pair of positive integers $k, d$ satisfying $k \le d$, there exists a positive integer $c = c(d, d-k)$ such that the following holds. For every finite field $\mathbb{F}$, there exists a $(d-k, c)$-subspace evasive set of size $|\mathbb{F}|^k$ in $\mathbb{F}^d$. Let $D = (d+1)k + 1$, $p = |\mathbb{F}|$, and let $Q_D < \mathbb{F}[x_1, \ldots, x_k]$ denote the space of polynomials of (total) degree at most $D$ in $k$ variables. Write

Let $k = \lfloor d/2 \rfloor - 1$, $n_0 \approx n^{1/(d-k)}$ and $m_0 \approx (m/n_0)^{(d-1)/(dk+2d-1)}$. Let $P \subset [n_0]^d$ be a maximal set of lattice points such that no $k$-dimensional affine subspace contains more than $c_1 = c(d,k)$ points of $P$; then $|P| \approx n_0^{d-k} \approx n$ by Theorem 1.4. Also, let $N \subset [m_0]^d$ be a maximal set of lattice points such that no $(d-k-1)$-dimensional linear subspace contains more than $c_2 = c(d, d-k-1)$ points of $N$; then $|N| \approx m_0^{d(k+1)/(d-1)}$ by Theorem 1.5. Let $H$ be the set of all hyperplanes whose normal vector is in $N$ and which contain at least one point of $P$. Then $|H| \lesssim m_0 n_0 |N| \approx m$, as the scalar product $\langle x, y \rangle$ for any $x \in P$ and $y \in N$ is contained in $[d m_0 n_0]$. Furthermore, the incidence graph of $(P, H)$ is $K_{c_1,c_2}$-free, as the intersection of any $c_2 + 1$ elements of $H$ is an at most $k$-dimensional affine subspace. Finally, $I(P,H) = |P||N|$, as for each $y \in N$, the hyperplanes in $H$ with normal vector $y$ form a partition of $P$. Plugging in our bounds on $|P|$ and $|N|$ gives the desired result. See [2] for the precise calculations, which give almost the same bounds.
Motivated by applications in combinatorial geometry, another interesting setting is to consider large $(k,c)$-subspace evasive sets in $\mathbb{F}^d$, where we think of $k$ and $d$ as fixed, while $|\mathbb{F}|$ is arbitrarily large. Clearly, a simple averaging argument shows that a $(k,c)$-subspace evasive set in $\mathbb{F}^d$ can have size at most $c|\mathbb{F}|^{d-k}$. As mentioned above, the probabilistic argument shows that a random set of $|\mathbb{F}|^{d-k-\delta}$ points is $(k,c)$-subspace evasive for $\delta = \Theta(kd/c)$. Note, however, that a random set of $\Omega_d(|\mathbb{F}|^{d-k})$ points does intersect many $k$-dimensional affine subspaces in $\Omega_d(\log |\mathbb{F}|)$ elements with high probability. Dvir and Lovett [10] (see Theorem 2.4 together with Claim 3.5) showed that this can be improved by giving an explicit algebraic construction of a $(k,c)$-subspace evasive set of size $\Omega(|\mathbb{F}|^{d-k})$, where $c$ depends only on $d$ and $k$.
(ii) If $X_1, \ldots, X_d$ are uniformly distributed random variables in $\mathbb{F}$, then $X_1, \ldots, X_d$ are independent if and only if $(X_1, \ldots, X_d)$ is uniformly distributed in $\mathbb{F}^d$.

(iii) If $X_1, \ldots, X_d$ are independent, uniformly distributed random variables on $\mathbb{F}$, and $Y_1, \ldots, Y_d$ are random variables on $\mathbb{F}$ such that $X_i$ and $Y_j$ are independent for any $i, j \in [d]$, then $X_1 + Y_1, \ldots, X_d + Y_d$ are independent and uniformly distributed.

Claim 3.2. Let $v_1, \ldots, v_k \in \mathbb{F}^d$ be linearly independent vectors. Then the polynomials $\langle q, v_1 \rangle, \ldots, \langle q, v_k \rangle$ are independent and uniformly distributed in $Q_D$.

Proof. Let $M \in \mathbb{F}^{k \times d}$ be the matrix whose rows are $v_1, \ldots, v_k$. For $\alpha \in \Lambda_D$ and $i \in [d]$, let $c_{i,\alpha}$ be the coefficient of the monomial $x^\alpha = x_1^{\alpha_1} \cdots x_k^{\alpha_k}$ in $q_i$, and let $c_\alpha = (c_{1,\alpha}, \ldots, c_{d,\alpha})$. Observe that the $d \cdot |\Lambda_D|$ random variables $(c_{i,\alpha})_{i \in [d], \alpha \in \Lambda_D}$ are independent and uniformly distributed in $\mathbb{F}$.