1 Introduction

In his classical 1921 paper, Polya [5] proved his famous theorem on random walks on \({\mathbf {Z}}^d\). Consider a particle at the origin at time 0; at each tick of the clock it moves to a neighbouring lattice point chosen uniformly at random. One is interested in the chance that it returns to the origin at time n. Let \(e_1, \ldots , e_d\) be the basis unit vectors in \({\mathbf {Z}}^d\) and let \(\xi _j\) be i.i.d. symmetric Bernoulli random variables (taking values \(\pm 1\) with probability 1/2). Let

$$\begin{aligned} S_n := \sum _{j=1} ^n \xi _j f_j \end{aligned}$$

where \(f_j \) is chosen uniformly from \(E := \{e_1, \ldots , e_d \}\). One would like to estimate

$$\begin{aligned} {\mathbf {P}}( S_n = 0) . \end{aligned}$$
(1)

Polya proved the following.

Theorem 1.1

For any \(d \ge 1, {\mathbf {P}}(S_n =0) = \Theta ( n^{-d/2} ). \)

(In general, we can consider an arbitrary starting point. For the sake of presentation, we delay the discussion of this case until the end of the section.) A random walk is said to be recurrent if it returns to its initial position with probability one. A random walk which is not recurrent is called transient. Theorem 1.1 implies the following.

Corollary 1.2

The simple random walk on \({\mathbf {Z}}^d\) is recurrent in dimensions \(d=1,2\) and transient in dimension \(d \ge 3\).
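As a numerical sanity check of Theorem 1.1 (not part of Polya's argument), the following Python sketch computes \({\mathbf {P}}(S_n=0)\) exactly for small n by dynamic programming over lattice positions. The function name `return_probability` is ours and purely illustrative. Since the walk can only return to the origin at even times, only even n are used; the theorem suggests that the last printed column, \(n^{d/2}\, {\mathbf {P}}(S_n=0)\), should stay bounded away from 0 and infinity.

```python
from collections import defaultdict
from fractions import Fraction


def return_probability(d: int, n: int) -> Fraction:
    """Exact P(S_n = 0) for the simple random walk on Z^d started at the origin."""
    # The 2d possible steps +-e_1, ..., +-e_d, each taken with probability 1/(2d).
    steps = []
    for axis in range(d):
        for sign in (1, -1):
            step = [0] * d
            step[axis] = sign
            steps.append(tuple(step))

    origin = (0,) * d
    counts = {origin: 1}  # counts[p] = number of paths of the current length ending at p
    for _ in range(n):
        nxt = defaultdict(int)
        for pos, ways in counts.items():
            for step in steps:
                nxt[tuple(p + s for p, s in zip(pos, step))] += ways
        counts = nxt
    return Fraction(counts.get(origin, 0), (2 * d) ** n)


# Columns: d, n, P(S_n = 0), n^{d/2} * P(S_n = 0).
for d in (1, 2, 3):
    for n in (4, 8, 16):
        p = return_probability(d, n)
        print(d, n, float(p), float(p) * n ** (d / 2))
```

The brute-force recursion is only practical for small n and d; it is meant to make the scaling \(n^{-d/2}\) visible, not to replace the proof.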

The goal of this note is to show that a random walk with steps of the same length all pointing in different directions behaves quite differently. As a matter of fact, dimension 2 turns out to be the worst. The intuitive reason behind this is that a circle can have very few integral points, their number being given by the number of representations of an integer as a sum of two squares. In contrast, a sphere of the same radius in three-dimensional Euclidean space or in higher dimensions can have many integral points, as each of its cross-sections is a \((d-2)\)-sphere.

Given a fixed list of vectors, we consider the walk where, at each tick of the clock, the particle randomly moves either in the direction of the new vector in the list or in the opposite one. The new and critical assumption here is that the vectors are not to be repeated.

Mathematically, we consider the random walk

$$\begin{aligned} S_{n, V}=\eta _1 v_1 + \eta _2 v_2 + \cdots + \eta _n v_n \end{aligned}$$

where \(V := \{ v_1, v_2, \ldots , v_n \} \) is a given set of n different unit vectors in \(\mathbb {R}^d\), and \(\eta _i\) are again i.i.d. symmetric Bernoulli random variables. We say that V is effectively d-dimensional if there is no affine hyperplane that contains more than 0.99 n vectors in V (here 0.99 can be replaced by any constant \(0< 1-\epsilon <1\); for ease of notation we take \(\epsilon =0.01\) throughout the paper).

Notation: Here and later, asymptotic notations, such as \(o(), O(), \omega ()\), and so forth, are used under the assumption that d is fixed, and \(n\rightarrow \infty \).

Theorem 1.3

Consider a set V of n different unit vectors which is effectively d-dimensional. Then

  • For \(d \ge 4,\) \( {\mathbf {P}}(S_{n, V} =0) \le n^{-\frac{d}{2}-\frac{d}{d-2}+ \ o(1)}. \)

  • For \(d=3, {\mathbf {P}}(S_{n, V}=0 ) \le n^{-4 +o(1)} . \)

  • For \(d=2, {\mathbf {P}}(S_{n, V} =0) \le n^{-\omega (1) }.\)

The assumption “effectively d-dimensional” is necessary; otherwise one can take V in a lower dimensional subspace and have a better bound. The bound for \(d \ge 4\) is sharp, as we can construct a set V such that \( {\mathbf {P}}(S_{n, V} = 0) \ge n^{-\frac{d}{2}-\frac{d}{d-2}+ \ o(1)}\). We conjecture that this expression is also the sharp bound for \(d=3\). This conjecture would follow from a new conjecture concerning incidences in \({\mathbf {R}}^3\), which is of independent interest (see Sect. 4 for details).

The real surprise is the case \(d=2\), where no matter how one chooses the set V, the returning probability is super polynomially small. Deciding the order of the exponent is an interesting problem. We can construct a set V which provides \({\mathbf {P}}( S_{n, V} =0) \ge n^{- C \log \log n}\) for some constant \(C >0\).

Theorem 1.3 holds under a weaker assumption. We can allow the vectors in V to take different lengths and also have some multiplicities. We say that V is (L, M)-typical if the vectors of V have lengths in a set \({\mathcal {L}}\) of size L, and each vector has multiplicity at most M. Furthermore, we can allow the target to be any point \(x \in {\mathbf {R}}^d\). (This corresponds to a walk starting at \(-x\) and ending at the origin.)

Theorem 1.4

Let V be an (L, M)-typical set which is effectively d-dimensional, where both \(L, M =n^{o(1) }\). Then the upper bounds in Theorem 1.3 hold for the probability \({\mathbf {P}}( S_{n, V } = x )\), for any \(x \in {\mathbf {R}}^d\).

In the next section, we present our main lemmas. The proofs of the theorems follow in Sect. 3. We conclude with an open problem in incidence geometry which would imply the sharp bound in the case \(d=3\).

2 The main lemmas

In this section, we describe our main tools. Let G be an abelian group (throughout the paper we will consider \(G={\mathbf {R}}^d\)). A generalized arithmetic progression (GAP) in G is a set of the form

$$\begin{aligned} Q(a_0, a_1,\ldots , a_r, N_1,\ldots ,N_r)=\{a_0+ x_1 a_1+ \cdots + x_r a_r| \, x_i=0,1,\ldots , N_i \} . \end{aligned}$$

We refer to r as the rank, and call \(a_0,a_1, \ldots , a_r \in G\) the generators, and \(N_1, \ldots , N_r\) the dimensions of Q. It’s generally useful to think of a generalized arithmetic progression as the image of the discrete r-dimensional box \( [0,N_1]\times \cdots \times [0,N_r]\) under the map

$$\begin{aligned} \Phi : [0,N_1]\times \cdots \times [0,N_r]\,&\longrightarrow \ G \\ ( x_1,\ldots , x_r)&\longrightarrow a_0 + a_1 x_1 + \cdots + a_r x_r. \end{aligned}$$

We say that Q is proper if the above map is injective. We say that Q is symmetric if it can also be written in the form

$$\begin{aligned} Q=\left\{ n_1 b_1 + n_2 b_2 + \cdots + n_r b_r | -M_i\le n_i \le M_i, i=1,\ldots , r \right\} \end{aligned}$$

for some \(b_1,\ldots , b_r \in \mathbb {R}^d\) and \(M_1,\ldots , M_r \in \mathbb {N}\).
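To make the definitions concrete, here is a minimal Python sketch that enumerates a GAP and tests properness. It assumes the generators are integer vectors (so that equality can be tested exactly), and the helper names `gap_elements` and `is_proper` are ours, not notation from [4].

```python
from itertools import product


def gap_elements(a0, generators, dims):
    """List a0 + x_1 a_1 + ... + x_r a_r for all 0 <= x_i <= N_i (one entry per x)."""
    elements = []
    for xs in product(*(range(N + 1) for N in dims)):
        point = list(a0)
        for x, a in zip(xs, generators):
            for k in range(len(point)):
                point[k] += x * a[k]
        elements.append(tuple(point))
    return elements


def is_proper(a0, generators, dims):
    """Q is proper iff the map from the box to the group is injective."""
    elements = gap_elements(a0, generators, dims)
    return len(set(elements)) == len(elements)


# Rank 2 in Z: generators 1 and 10 with N_1 = N_2 = 3 give 16 distinct sums.
print(is_proper((0,), [(1,), (10,)], [3, 3]))
# Generators 1 and 2 collide (for example 2*1 = 1*2), so this GAP is not proper.
print(is_proper((0,), [(1,), (2,)], [3, 3]))
```

In the second example the box has 16 points but its image is only \(\{0, 1, \ldots , 9\}\), which is exactly the failure of injectivity that properness rules out.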

Let V be a set of n vectors in \({\mathbf {R}}^d\). Define the concentration probability

$$\begin{aligned} \rho (V)= \sup _{x \in \mathbb {R}^d} {\mathbf {P}}(S_{n, V} =x). \end{aligned}$$
(2)

We are going to use the following result in [4, Theorem 2.1], which asserts that a set of vectors with high concentration probability must necessarily be, up to a few elements, a subset of a generalized arithmetic progression of small cardinality.

Theorem 2.1

(Optimal inverse Littlewood–Offord theorem) Let \(\epsilon < 1\) and C be positive constants. Assume that \(\rho (V) \ge n^{-C}\). Then, there exists a proper symmetric GAP Q in \({\mathbf {R}}^d\), of some rank \(r=O_{C, \epsilon }(1)\), that contains at least \((1-\epsilon )n\) elements of V, such that

$$\begin{aligned} |Q|=O_{C, \epsilon }\left( \rho (V)^{-1}n^{-\frac{r}{2}}\right) . \end{aligned}$$

Our next tool is a result of Chang [1]. For a set X and a number m, both in the complex plane, denote by \(r_2 (m ; X)\) the number of ways to write m as a product of two elements of X.

Theorem 2.2

For any fixed r there is some constant \(C_r>0\) such that the following holds. Let Q be a GAP of complex numbers of rank r and dimensions \(N_1, \ldots , N_r\). Let \(N=\max _i N_i\). Then for all \(m \in \mathbb {C} \),

$$\begin{aligned} r_2 (m ; Q) \le N^{\frac{C_r}{\log \log N}}. \end{aligned}$$

We use this theorem to prove the following corollary.

Corollary 2.3

Let Q be a GAP in \(\mathbb {R}^2\) with constant rank r. Let \(S \subseteq \mathbb {R}^2 \) be an arbitrary circle. Then

$$\begin{aligned} |Q\cap S| \le |Q|^{o(1)}. \end{aligned}$$

Proof

(Proof of Corollary 2.3) For any \(x\in \mathbb {C}\) denote its complex conjugate by \(\bar{x}\). Let S be a circle of radius R. As we can shift Q, we can assume, without loss of generality, that S is centered at 0. Let \(Q=\{a_0+ x_1 a_1+ \cdots + x_r a_r| x_i=0,\ldots ,N_i, \forall i \} \subseteq \mathbb {C}\). Consider

$$\begin{aligned} P= & {} \{x_0 a_0 + x_1 a_1+ \cdots + x_r a_r + y_0\bar{a}_0 + y_1 \bar{a}_1+ \cdots + y_r \bar{a}_r| \,x_0, y_0\\\in & {} \{ 0,1\}; |x_i|, |y_i| \le N_i \}. \end{aligned}$$

By the above theorem,

$$\begin{aligned} r_2 (R^2 ; P) \le N^{\frac{C_{2r+2}}{\log \log N}} \le |Q|^{\frac{C_{2r+2}}{\log \log |Q|/r}}=|Q|^{o(1)} . \end{aligned}$$

On the other hand, \( x\in S \) if and only if \( x\cdot \bar{x}= R^2\). As P contains all elements of Q and their conjugates, it follows that

$$\begin{aligned} |Q\cap S| \le r_2 (R^2 ; P)= |Q|^{o(1)} . \end{aligned}$$

\(\square \)
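The corollary can be illustrated numerically in the simplest case. The sketch below takes Q to be the integer box \(\{-N, \ldots , N\}^2\) (a proper symmetric GAP of rank 2 with generators (1, 0) and (0, 1)) and S the circle \(x^2+y^2=m\) centered at the origin; both choices, as well as the function name `circle_points_in_box`, are our own assumptions made for illustration.

```python
from math import isqrt, log


def circle_points_in_box(N: int, m: int) -> int:
    """Count the points (x, y) of the box {-N..N}^2 with x^2 + y^2 = m."""
    count = 0
    for x in range(-N, N + 1):
        for y in range(-N, N + 1):
            if x * x + y * y == m:
                count += 1
    return count


# Columns: m, |Q|, |Q ∩ S|, and the exponent log|Q ∩ S| / log|Q|.
for m in (25, 1105, 4225):
    N = isqrt(m)  # the box just contains the circle
    size_q = (2 * N + 1) ** 2
    hits = circle_points_in_box(N, m)
    print(m, size_q, hits, log(hits) / log(size_q))
```

Even for these values of m, chosen because they have many representations as a sum of two squares, the circle meets only a few dozen of the thousands of points of Q.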

3 Proof of Theorems 1.3 and 1.4

We first prove Theorem 1.3.

3.1 Upper bound for \(d \ge 4\)

Consider a set V and assume, for a contradiction, that \({\mathbf {P}}( S_{n, V } =0 ) \ge n^{-d/2 - d/(d-2) + \delta } \) for some constant \(\delta >0 \). By Theorem 2.1, there is a proper symmetric GAP Q of constant rank r, which contains at least 0.99 n elements of V, and

$$\begin{aligned} |Q| = O( n^{d/2 +d/(d-2) -r/2 -\delta } ). \end{aligned}$$
(3)

In what follows, we derive a lower bound that contradicts (3). Let \(Q :=\{ n_1 a_1 + n_2 a_2 + \cdots + n_r a_r|\ |n_i| \le N_i \} , Q' := \{ n_3 a_3 + n_4 a_4 + \cdots + n_r a_r|\ |n_i| \le N_i \} \) and \(Q^{''}:= \{ n_1 a_1 +n_2 a_2|\ |n_i | \le N_i \}\). We can assume, without loss of generality, that \(N_1, N_2\) are the two largest dimensions, which implies that \(|Q'| \le |Q|^{(r-2) /r} \).

By Corollary 2.3 and the hypothesis that the vectors in V have unit length, we conclude that for any \(x \in Q', |(x+Q^{''} ) \cap V | \le |Q^{''}|^{o(1)} \le |Q|^{o(1)}\). Since \(V \cap Q = \cup _{x \in Q'} (x+ Q^{''} ) \cap V\), it follows that \(0.99n \le |Q|^{o(1)} |Q|^{(r-2) /r } \), or equivalently, \(|Q| \ge n^{ r/(r-2) -o(1) }\). This inequality relies on the hypothesis that the vectors of V are all distinct. Together with (3), we have

$$\begin{aligned} n^{d/2 +d/(d-2) -r/2 -\delta } \ge n ^{r/(r-2) -o(1) }. \end{aligned}$$
(4)

On the other hand, V is effectively d-dimensional, so \(r \ge d\). Writing \(f(r)= r/2 + r/(r-2) \), inequality (4) reads \(f(d) - \delta \ge f(r) - o(1)\). For \(r \ge 4\), the function f is strictly monotone increasing, so \(f(r) \ge f(d)\) whenever \(r \ge d \ge 4\). This implies that the above inequality cannot hold for sufficiently large n, a contradiction.

Notice that the proof does not depend on the value 0.99 given in the definition of an effectively d-dimensional set of vectors; it goes through in exactly the same way for any other fixed constant \(0<1-\epsilon <1\).

3.2 Upper bound for \(d=3 \)

One can repeat the above argument, but we can no longer use the fact that \(f(r)= r/2 + r/(r-2)\) is monotone. As a matter of fact, \(f(3) =9/2\) is larger than both \(f(4) =4\) and \(f(5)= 25/6\). As \(f(r) \ge 9/2\) for all \(r \ge 6\), the worst value one can take is \(f(4)=4\), which results in the upper bound \(n^{-4 +o(1) }\).
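The values of the exponent function \(f(r) = r/2 + r/(r-2)\) used in Sects. 3.1 and 3.2 can be tabulated exactly. The following short Python sketch does so with rational arithmetic; it is only a bookkeeping aid for the case analysis over the rank r.

```python
from fractions import Fraction


def f(r: int) -> Fraction:
    """Exponent f(r) = r/2 + r/(r-2), defined for integer ranks r >= 3."""
    return Fraction(r, 2) + Fraction(r, r - 2)


for r in range(3, 9):
    print(r, f(r))

# The smallest value over all ranks r >= 3 and the rank where it is attained.
best_rank = min(range(3, 100), key=f)
print(best_rank, f(best_rank))
```

The table shows that f drops from \(f(3)=9/2\) to its minimum \(f(4)=4\) and increases from there on, which is why rank 4 is the limiting case for \(d=3\).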

3.3 Upper bound for \(d =2\)

Consider a set V and assume, for a contradiction, that \( {\mathbf {P}}(S_{n,V}=0) \ge n^{-C}\), for some constant C. By Theorem 2.1, there is a proper symmetric GAP Q, of some constant rank \(r=O_{C, \epsilon }(1)\), that contains at least \((1-\epsilon )n\) elements of V, and with

$$\begin{aligned} |Q|=O_{C, \epsilon }(\rho (V)^{-1}n^{-\frac{r}{2}})\,. \end{aligned}$$

However, by Corollary 2.3, such a Q can only contain \(|Q|^{o(1) }\le n^{o(1)}\) points from the unit circle, which, in turn, contains V. This contradicts the fact that Q contains at least \((1-\epsilon )n\) elements of V.

3.4 Lower bounds

Let us start with the case \(d \ge 3\). We construct a set V such that

$$\begin{aligned} {\mathbf {P}}( S_{n, V} = 0) \ge n^{-d/2 - d/(d-2) -o(1) } . \end{aligned}$$

By classical results on Waring’s problem [7], the number of ways to write an integer N as a sum of d squares is at least \(n := N^{ (d-2)/2 +o(1) } \), for any fixed \(d \ge 4\) and all sufficiently large N. This means the sphere of radius \(R := N^{1/2} \) (centered at the origin) contains at least n lattice vectors. Let V be the set of these vectors (we can normalize them to have unit length). An application of Chebyshev’s inequality shows that with probability at least \(1/2\), \(S_{n,V}\) belongs to the ball B of radius \(10 n^{1/2} R\) centered at the origin. Since \(R = n^{1/(d-2) + o(1)}\), the volume of B is \(O(n^{d/2} R^d) = n^{d/2 + d/(d-2) + o(1)}\). By pigeonhole, and the simple fact that the number of lattice points in the ball is comparable to its volume, we conclude that there is a lattice point \(x\in B\) such that

$$\begin{aligned} {\mathbf {P}}( S_{n, V} =x) \ge \frac{1}{2} (\mathrm{volume } \,\, B ) ^{-1} \ge C n^{-d/2 - d/(d-2) -o(1) } \end{aligned}$$

for some positive constant \(C = C(d) \).
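The counting input of this construction can be checked by brute force for small N. The Python sketch below lists the lattice vectors on the sphere \(|v|^2 = N\) in \({\mathbf {Z}}^d\) and, for \(d=4\), compares their number with Jacobi's four-square formula \(r_4(N) = 8 \sum _{k \mid N,\ 4 \not \mid k} k\). The formula is a standard fact that we bring in as an independent check; it is not used in the text above, and the function names are ours.

```python
from itertools import product
from math import isqrt


def sphere_lattice_points(N: int, d: int):
    """All integer vectors v in Z^d with |v|^2 = N, found by exhaustive search."""
    R = isqrt(N)
    return [v for v in product(range(-R, R + 1), repeat=d)
            if sum(c * c for c in v) == N]


def jacobi_r4(N: int) -> int:
    """Jacobi's formula: 8 times the sum of the divisors of N not divisible by 4."""
    return 8 * sum(k for k in range(1, N + 1) if N % k == 0 and k % 4 != 0)


# Columns: N, number n of lattice vectors on the sphere in Z^4, Jacobi's count.
for N in (5, 15, 45):
    V = sphere_lattice_points(N, 4)
    print(N, len(V), jacobi_r4(N))
```

For odd N the count in dimension 4 is at least 8N, in line with the growth \(N^{(d-2)/2}\) used above; the exhaustive search is of course only feasible for small N.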

Let us show that the supremum of \( {\mathbf {P}}(S_{n,V} = x )\) is attained at \(x=0\), for any set V symmetric with respect to the origin. We use Gauss’ identity

$$\begin{aligned} {\mathbf {I}}_{Y =0} = C_d \int _{ S^{d-1} } e ( Y \cdot t ) dt, \end{aligned}$$

where Y is a vector in \({\mathbf {R}}^d, {\mathbf {I}}\) is the indicator function, \(C_d\) is a positive constant depending on \(d, e (x) =\exp ( 2\pi i x ) \) and \(S^{d-1} \) is the unit sphere in \({\mathbf {R}}^d\). By this identity, we have

$$\begin{aligned} {\mathbf {P}}(S_{n, V} =x)= & {} {\mathbf {E}}{\mathbf {I}}_{S_{n, V} -x =0 } = {\mathbf {E}}C_d \int _{S^{d-1} } e ( (S_{n,V } -x ) \cdot t) dt\\= & {} C_d \int _{S^{d-1} } e (-x \cdot t) {\mathbf {E}}e (S_{n, V } \cdot t ) dt. \end{aligned}$$

As \(S_{n, V} = \sum _{i=1}^n \eta _i v_i\) where the \(\eta _i\) are independent, it follows that

$$\begin{aligned} {\mathbf {E}}e (S_{n, V } \cdot t ) =\prod _{i=1}^n {\mathbf {E}}e ( \eta _i v_i \cdot t ) = \prod _{i=1}^n \cos ( v_i \cdot t) . \end{aligned}$$

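Since the set V is symmetric with respect to the origin, the factors pair up: v and \(-v\) contribute the same cosine, so the product is a product of squares and hence non-negative. Therefore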

$$\begin{aligned} \prod _{i=1}^n \cos ( v_i \cdot t) = \prod _{i=1}^n | \cos (v_i \cdot t ) | . \end{aligned}$$

Thus, by the triangle inequality

$$\begin{aligned} {\mathbf {P}}( S_{n,V} =x ) \le C_d \int _{S^{d-1} } \prod _{i=1}^n | \cos (v_i \cdot t ) | dt = {\mathbf {P}}( S_{n, V} =0 ) \end{aligned}$$

for any \(x \in {\mathbf {R}}^d\). This and the inequality \( {\mathbf {P}}( S_{n, V} = x) \ge n^{-d/2 - d/(d-2) -o(1) } \) imply the desired lower bound.
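The fact that the supremum is attained at the origin can be observed on a small symmetric example. The Python sketch below computes the exact distribution of \(S_{n,V}\) by dynamic programming, taking V to be the 12 lattice points on the circle \(x^2+y^2=25\). This choice of V and the function name `walk_distribution` are ours; the vectors have length 5 rather than 1, which only rescales the walk and does not change the probabilities.

```python
from collections import defaultdict


def walk_distribution(vectors):
    """counts[x] = number of sign patterns (eta_1..eta_n) with sum eta_i v_i = x."""
    dim = len(vectors[0])
    counts = {(0,) * dim: 1}
    for v in vectors:
        nxt = defaultdict(int)
        for pos, ways in counts.items():
            plus = tuple(p + c for p, c in zip(pos, v))
            minus = tuple(p - c for p, c in zip(pos, v))
            nxt[plus] += ways   # eta_i = +1
            nxt[minus] += ways  # eta_i = -1
        counts = nxt
    return dict(counts)


# Symmetric set: the lattice points on the circle of radius 5 centered at the origin.
V = [(x, y) for x in range(-5, 6) for y in range(-5, 6) if x * x + y * y == 25]
counts = walk_distribution(V)
total = 2 ** len(V)

# Columns: n = |V|, P(S = 0), sup_x P(S = x).
print(len(V), counts[(0, 0)] / total, max(counts.values()) / total)
```

On this example the two printed probabilities coincide, as the argument above predicts for any set symmetric with respect to the origin.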

Let us now turn to the case \(d=2\). Classical results in number theory show that there are infinitely many R such that the circle centered at the origin of radius R contains at least \(R^{1/ \log \log R} \) integral points [3, Theorem 4.2.2]. Fix one such R, and let V be the set of these points. By an argument similar to the case \(d\ge 3\),

$$\begin{aligned} \rho (V) = \Omega ( R^{-2 +o(1) } ) \ge |V| ^{- C \log \log |V|} \end{aligned}$$

for a properly chosen constant C.

3.5 Proof of Theorem 1.4

Assuming for a moment that V consists of different vectors of unit length, the proof for an arbitrary target x is the same, since Theorem 2.1 is stated in terms of \(\rho (V) :=\sup _x {\mathbf {P}}( S_{n,V} =x )\). For the general case, by the pigeonhole principle, there are at least \(n/(LM)\) different vectors in V with the same length t. Let \(V'\) be the set of these vectors and repeat the proof for this set, conditioning on the rest of the walk. By the condition on L and M, \(|V'| = n^{1- o(1) }\), and this only influences the o(1) terms in the bounds.

4 New problems in incidence geometry

We conjecture that in the case \(d=3\) the upper bound \(n^{-d/2 - d/(d-2) +o(1)}= n^{-9/2+o(1)}\) also holds. This would follow from the following conjectures, which are of independent interest.

Conjecture 4.1

Let V be a set of n unit vectors in the Euclidean space, with at most \(n^{o(1)}\) of its endpoints on any plane. Then \(|V+V+V| \ge n^{5/2 -o(1) } \).

Conjecture 4.2

Let P be a set of p points and B be a set of \(n^2\) unit spheres in \({\mathbf {R}}^3\). Again assume that no plane contains more than \(n^{o(1)}\) points of P. Assume as well that each sphere in B contains at least n points from P. Then \(p \ge n^{5/2-o(1)} \).

As a matter of fact, we feel that one can replace both exponents 5/2 by 3 (which would be clearly optimal).

Notice that the second statement implies the first. By congruence of triangles, the endpoints of all pairs of vectors with a prescribed sum lie in the same plane. Under the hypothesis above, that means the size of \(V+V\) is at least \(n^{2-o(1)}\). Since each element of the triple sumset of V lies on a unit sphere centered at one of those \(n^{2-o(1)}\) points, the conclusion follows.

It is also easy to see that Conjecture 4.1 implies the desired upper bound in the unproved case \(d=3\). In fact, the argument for the \(d \ge 4\) case carries over to \(d=3\) if the generalized arithmetic progression Q containing all but a few elements of V has rank at least 6. However, if Q has rank 4 we need its size to be at least \( n^{5/2-o(1)}\) (which also suffices for rank 5).

Denote by \(V'\) the set of \((1-\epsilon )n\) elements of V contained in that GAP Q. The elements of \(V'\) on any given hyperplane lie in the intersection of a circle with the projection of Q onto the plane. This projection is a GAP of rank and size no greater than those of Q. By Corollary 2.3, we conclude that \(V'\) has at most \(n^{o(1)}\) elements on any hyperplane. Furthermore, we have that \(V'+V'+V' \subseteq Q+Q+Q\). Assuming Conjecture 4.1, this implies that \(|Q+Q+Q| \ge n^{5/2 -o(1) } \) and so, since Q is a generalized arithmetic progression of constant rank, its size itself is at least \(n^{5/2-o(1)}\).