1 Introduction

The endpoint Stein–Tomas inequality states that

$$\begin{aligned} \int _{{\mathbb {S}}^{d-1}} |{{\widehat{f}}}(\omega )|^2\,d \sigma (\omega ) \lesssim \Vert f\Vert _{L^{2\frac{d+1}{d+3}}({\mathbb {R}}^d)}^2 \end{aligned}$$
(1)

where \({\mathbb {S}}^{d-1}:=\{x\in {\mathbb {R}}^d: x\cdot x=1\}\) denotes the unit sphere equipped with the usual surface measure \(\sigma\), and \({{\widehat{f}}}\) is the Fourier transform of \(f:{\mathbb {R}}^d\rightarrow {\mathbb {C}}\). Thus (1) quantifies the possibility of restricting the Fourier transform of certain \(L^p\) functions on \({\mathbb {R}}^d\) to a specific curved hypersurface of \({\mathbb {R}}^d\), at least in the \(L^2\)-sense. It links the geometric notion of curvature to the analytic notion of Fourier decay, and is the starting point for the fertile research ground that goes by the name of Fourier restriction theory.

The genesis of inequality (1) can be briefly described as follows.

Stein was the first to notice the possibility of an \(L^p\rightarrow L^2({\mathbb {S}}^{d-1})\) restriction inequality and proved it for \(1\le p<{4d}/{(3d+1)}\); Fefferman refers to it in [8, p. 28] as a “remarkable observation”. For \(d=2\), this was extended by Fefferman and Stein [8, p. 33] to the nearly optimal range \(1\le p<6/5\), whereas Carleson–Sjölin [5, p. 290] established the optimal range \(1\le p\le 4/3\) when \(d=3\). In arbitrary dimensions \(d\ge 2\), Tomas [36] then obtained the full range of exponents \(1\le p< 2(d+1)/(d+3)\) up to, but not including, the endpoint.

Tomas’ proof is quite short: Setting \({\widetilde{f}}(x):=\overline{f(-x)}\), using Fubini’s theorem and Hölder’s inequality,

$$\begin{aligned}\int _{{\mathbb {S}}^{d-1}} |{\widehat{f}}(\omega )|^2\,d \sigma (\omega ) =\int _{{\mathbb {R}}^d} f*{\widetilde{f}}(x){\widehat{\sigma }}(x)\,d x =\int _{{\mathbb {R}}^d} f(x) f*{\widehat{\sigma }}(x)\,d x \le \Vert f\Vert _p \Vert f*{\widehat{\sigma }}\Vert _{p'},\end{aligned}$$

and so it suffices to show that the operator given by convolution with \({\widehat{\sigma }}\) is bounded from \(L^p\) to \(L^{p'}.\) The key is to realize that the curvature of \({\mathbb {S}}^{d-1}\) forces \({{\widehat{\sigma }}}\) to decay. Letting \(T_k(x):=(K(2^{-k}x)-K(2^{-k+1}x)){\widehat{\sigma }}(x)\), where \(K\in {\mathcal {S}}({\mathbb {R}}^d)\) is a fixed radial Schwartz function which is identically equal to 1 on a small neighborhood of the origin, it is enough to show that \(\Vert f*T_k\Vert _{p'} \lesssim 2^{-\varepsilon k}\Vert f\Vert _p\) for some \(\varepsilon =\varepsilon (p,d)>0\). This in turn follows from interpolating the estimates \(\Vert f*T_k\Vert _\infty \le 2^{(1-d)k/2} \Vert f\Vert _1\) and \(\Vert f*T_k\Vert _2 \le 2^k \Vert f\Vert _2\). QED.
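The interpolation step can be made explicit. The following is only a sketch of the exponent count via the Riesz–Thorin theorem, with implicit constants ignored; the interpolation parameter \(\theta\) is auxiliary notation introduced here for illustration. Interpolating between the \(L^1\rightarrow L^\infty\) bound (\(\theta =0\)) and the \(L^2\rightarrow L^2\) bound (\(\theta =1\)) yields

$$\begin{aligned} \Vert f*T_k\Vert _{p'} \lesssim 2^{k\left[ (1-\theta )\frac{1-d}{2}+\theta \right] }\Vert f\Vert _p, \qquad \frac{1}{p'}=\frac{\theta }{2}, \qquad \text {so that}\qquad \varepsilon (p,d)=\frac{d-1}{2}-\frac{d+1}{p'}. \end{aligned}$$

Thus \(\varepsilon (p,d)>0\) precisely when \(p'>2(d+1)/(d-1)\), that is, when \(p<2(d+1)/(d+3)\). At the endpoint one has \(\varepsilon =0\), the dyadic pieces no longer sum, and this is why the endpoint requires a different argument.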

The proof of the theorem in [36] ends with the following remark: “Professor E. M. Stein has extended the range of this result to include \(p=2(d+1)/(d+3)\). His proof uses complex interpolation of the operators given by convolution with the functions \(B_\sigma (x)=J_\sigma (2\pi |x|)/|x|^\sigma\)”. Tomas also notes that the sharpness of these results follows from “a very simple homogeneity argument”, which became known as Knapp’s example; see [34, p. 707].

In the following 50 years, the endpoint Stein–Tomas inequality (1) became a centerpiece in modern analysis. A late 2023 search in Google Scholar reveals that [36] has been cited well over 600 times in a variety of works on harmonic analysis, partial differential equations (PDE), geometric measure theory (GMT), additive combinatorics, theoretical computer science and analytic number theory (ANT). The list keeps growing.

1.1 Outline

In Sect. 2, we shall briefly discuss three classical applications of (1) to PDE, GMT and ANT. In Sects. 3, 4 and 5, we describe three recent improvements from the last decade – namely: sharp, variational and symmetric refinements of (1) – and mention some related open problems.

1.2 Notation

The Fourier transform is normalized as follows: \({\widehat{f}}(\xi )= \int _{{\mathbb {R}}^d} f(x) e^{-ix\cdot \xi } \,d x.\)

2 Selected applications

The endpoint Stein–Tomas inequality (1) can be equivalently restated in dual form,

$$\begin{aligned} \Vert \widehat{f\sigma }\Vert _{L^{2\frac{d+1}{d-1}}({\mathbb {R}}^d)} \lesssim \Vert f\Vert _{L^2({\mathbb {S}}^{d-1})}, \end{aligned}$$
(2)

where \(\widehat{f\sigma }(x)=\int _{{\mathbb {S}}^{d-1}} f(\omega ) e^{-ix\cdot \omega }\,d \sigma (\omega )\) stands for the adjoint restriction (or extension) operator to \({\mathbb {S}}^{d-1}\). This formulation naturally arises in applications, as will now become apparent.
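The equivalence of (1) and (2) can be sketched in one line. Here \(f\) is taken to be a Schwartz function on \({\mathbb {R}}^d\) and \(g\in L^2({\mathbb {S}}^{d-1})\); the letter \(g\) is introduced only for this illustration. By Fubini's theorem,

$$\begin{aligned} \int _{{\mathbb {S}}^{d-1}} {\widehat{f}}(\omega )\overline{g(\omega )}\,d \sigma (\omega ) =\int _{{\mathbb {R}}^d} f(x)\,\overline{\widehat{g\sigma }(-x)}\,d x, \qquad \text {hence}\qquad \Big | \int _{{\mathbb {S}}^{d-1}} {\widehat{f}}\,\overline{g}\,d \sigma \Big | \le \Vert f\Vert _{p}\Vert \widehat{g\sigma }\Vert _{p'}. \end{aligned}$$

With \(p=2(d+1)/(d+3)\) and \(p'=2(d+1)/(d-1)\), inequality (2) bounds the right-hand side by a multiple of \(\Vert f\Vert _p\Vert g\Vert _{L^2({\mathbb {S}}^{d-1})}\), and taking the supremum over \(\Vert g\Vert _{L^2({\mathbb {S}}^{d-1})}\le 1\) gives (1). The converse direction follows by the same identity with the roles of \(f\) and \(g\) exchanged.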

2.1 Strichartz inequalities

The Stein–Tomas argument goes through almost unchanged if the unit sphere \({\mathbb {S}}^{d-1}\) is replaced by a compact hypersurface of nonvanishing Gaussian curvature. In 1977, Strichartz [34] noted that the argument could be further adapted to the case of (unbounded) quadratic surfaces \(S=\{x\in {\mathbb {R}}^d: R(x)=r\}\), where R is a polynomial of degree two with real coefficients and r is a real constant. He split the analysis into three cases:

  Case 1. \(S=\{x_d=Q'(x',x')\}\), where \(x=(x',x_d)\in {\mathbb {R}}^{d-1}\times {\mathbb {R}}\) and \(Q'(x',x')=x_1^2+\ldots +x_k^2-x_{k+1}^2-\ldots -x_{d-1}^2\) for some \(k\in \{0,1,\ldots ,d-1\}\).

  Case 2. \(S=\{Q(x,x)=0\}\), where \(Q(x,x)=x_1^2+\ldots +x_k^2-x_{k+1}^2-\ldots -x_d^2\) for some \(k\in \{1,2,\ldots ,d-1\}.\)

  Case 3. \(S=\{Q(x,x)=1\}\), where Q is as in Case 2 for some \(k\in \{1,2,\ldots ,d-1\}\).

He then considered the analytic family \(G_z(x):=\gamma (z)(R(x)-r)_+^z\), where \(\gamma (z)\) is an appropriately chosen analytic function and, in order to apply Stein’s complex interpolation theorem, proceeded to compute the inverse Fourier transform of \(G_z\).

Once the last variable \(x_d\) is interpreted as (Fourier) time, the extension inequalities corresponding to \(\{x_d=x_1^2+\ldots +x_{d-1}^2\}\) (Case 1), \(\{x_d^2=x_1^2+\ldots +x_{d-1}^2\}\) (Case 2) and \(\{x_d^2=x_1^2+\ldots +x_{d-1}^2+1\}\) (Case 3) give rise to the so-called Strichartz inequalities for the Schrödinger, wave and Klein–Gordon equations, respectively. These had a profound impact on the subsequent development of the theory of dispersive PDE; see [18, 35] and a very recent alternative approach in [30].
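For orientation, here is how Case 1 translates into the familiar space-time estimate. This is a sketch under the following assumptions, none of which is notation from the text above: \(n:=d-1\) denotes the number of space variables, \(t:=x_d\) plays the role of time, \(u_0\) is the initial datum, and sign and normalization conventions for the Schrödinger flow \(e^{it\Delta }\) are suppressed. The dual exponent from (2) then reads

$$\begin{aligned} \frac{2(d+1)}{d-1}=\frac{2(n+2)}{n}=2+\frac{4}{n}, \qquad \Vert e^{it\Delta }u_0\Vert _{L^{2+\frac{4}{n}}_{t,x}({\mathbb {R}}\times {\mathbb {R}}^n)} \lesssim \Vert u_0\Vert _{L^2({\mathbb {R}}^n)}. \end{aligned}$$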

2.2 Salem sets

Since curvature plays such a key role, there can be no one-dimensional Stein–Tomas type restriction theorem, right? Wrong! In 2000, Mockenhaupt [28] observed that any positive measure \(\mu\) with compact support \(E\subset {\mathbb {R}}^d\) for which there exist

  (i) \(\alpha >0\) such that \(\mu (B_r(x))\lesssim r^\alpha\) for every ball \(B_r(x)\) of radius r centered at x, and

  (ii) \(\beta >0\) such that \(|{\widehat{\mu }}(\xi )|\lesssim |\xi |^{-\beta /2},\)

satisfies the Fourier restriction inequality

$$\begin{aligned} \int _E |{{\widehat{f}}}(\omega )|^2\,d \mu (\omega ) \lesssim \Vert f\Vert _{L^p({\mathbb {R}}^d)}^2, \end{aligned}$$
(3)

as long as \(1\le p< p_\circ (\alpha ,\beta ):= (2(2d-2\alpha +\beta ))(4(d-\alpha )+\beta )^{-1}\). Mitsis [27] gave an independent proof and Bak–Seeger [1] later extended inequality (3) to the endpoint \(p=p_\circ (\alpha ,\beta )\). The usual surface measure on the sphere \({\mathbb {S}}^{d-1}\subset {\mathbb {R}}^d\) satisfies assumptions (i)–(ii) with \(\alpha =\beta =d-1\), and so (3) generalizes the endpoint Stein–Tomas inequality (1). On the other hand, if \(d=1\) and \(\alpha \in (0,1)\), there exists a positive measure \(\nu\) supported on a compact set \(F\subset {\mathbb {R}}\) of Hausdorff and Fourier dimensions both equal to \(\alpha\) (such a set is called a Salem set), and the restriction inequality

$$\begin{aligned}\int _ F |{{\widehat{f}}}(\omega )|^2\,d \nu (\omega ) \lesssim \Vert f\Vert _{L^p({\mathbb {R}})}^2\end{aligned}$$

then holds for every \(1\le p\le 2(2-\alpha )(4-3\alpha )^{-1}\). The sharpness of these results was harder to verify since Knapp’s example does not generalize easily to the fractal setting. Still, it is now known [16, 17] that the range of exponents cannot in general be improved; see also the very recent [13].
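The two specializations of the exponent \(p_\circ (\alpha ,\beta )\) mentioned above are easy to check mechanically. The following is a minimal sketch in exact rational arithmetic; the function names `p_circ` and `stein_tomas_endpoint` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def p_circ(d, alpha, beta):
    """Exponent p_o(alpha, beta) = 2(2d - 2*alpha + beta) / (4(d - alpha) + beta)."""
    d, alpha, beta = Fraction(d), Fraction(alpha), Fraction(beta)
    return 2 * (2 * d - 2 * alpha + beta) / (4 * (d - alpha) + beta)


def stein_tomas_endpoint(d):
    """Endpoint exponent 2(d+1)/(d+3) appearing in inequality (1)."""
    return Fraction(2 * (d + 1), d + 3)


# Sphere case: alpha = beta = d - 1 recovers the Stein-Tomas endpoint.
for d in range(2, 8):
    assert p_circ(d, d - 1, d - 1) == stein_tomas_endpoint(d)

# Salem set case: d = 1 and alpha = beta give 2(2 - alpha)/(4 - 3*alpha).
alpha = Fraction(1, 2)
assert p_circ(1, alpha, alpha) == 2 * (2 - alpha) / (4 - 3 * alpha)
print(p_circ(1, alpha, alpha))  # 6/5
```

Using `Fraction` rather than floating point keeps the comparisons exact, so the assertions test the algebraic identities themselves rather than a numerical approximation.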

Mockenhaupt [28, p. 1579] remarked that his fractal restriction theorem was partially motivated by the number theoretical observation (due to Bourgain [3]) that Montgomery’s conjectured bounds on finite Dirichlet sums imply the Kakeya conjecture from GMT. As we shall see in the next section, the links between Stein–Tomas and ANT do not stop here.

2.3 Roth’s theorem in the primes

Let \(r_3(N)\) denote the cardinality of the largest subset \(A\subseteq \{1,2,\ldots ,N\}\) containing no nontrivial three-term arithmetic progression (3AP). In 1953, Roth showed that \(r_3(N)=o(N)\), an influential result whose quantitative form reads as follows:

$$\begin{aligned}r_3(N)\lesssim N/\log \log N.\end{aligned}$$

This upper bound was refined several times throughout the 20th century, but never in a way that implies the existence of infinitely many 3APs in the set of prime numbers \({\mathcal {P}}\). However, it is a classical result of Van der Corput that \({\mathcal {P}}\) does contain infinitely many 3APs.

In 2005, Green [14] went farther and proved that every positive upper density subset of \({\mathcal {P}}\) contains a 3AP. This simultaneously generalizes the results of Roth and Van der Corput. Green’s proof relies on the fact that the primes enjoy the so-called Hardy–Littlewood majorant property. In turn, this is deduced from a more general result of Bourgain [4] which may be regarded as a restriction theorem for the primes that we now describe.

Let \(m\le \log N\) be a positive integer and let \(b\in [0,m-1]\) be coprime to m. Define the set

$$\begin{aligned}\Lambda _{b,m,N}:=\{n\le N: nm+b\text { is prime}\}.\end{aligned}$$

Based on the expected cardinality of \(\Lambda _{b,m,N}\), we define a function \(\lambda _{b,m,N}\) supported on \(X:=\Lambda _{b,m,N}\) via \(\lambda _{b,m,N}(n)=\phi (m)\log (nm+b)/mN\) if \(n\in \Lambda _{b,m,N}\) and zero otherwise. Finally, define the map \({\mathcal {E}}:C^0(X)\rightarrow C^0({\mathbb {T}})\) via \({\mathcal {E}}:f\mapsto (f\lambda _{b,m,N})^\wedge\), where the hat denotes the Fourier transform on \({\mathbb {Z}}\). Then the extension-type inequality

$$\begin{aligned} \Vert \mathcal Ef\Vert _p\lesssim N^{-1/p} \Vert f\Vert _2 \end{aligned}$$
(4)

holds for all \(p>2\). The proof of (4) closely follows the Tomas argument which we summarized in the Introduction. It suffices to prove \(\Vert {\mathcal {E}}{\mathcal {E}}^*\Vert _{p'\rightarrow p}\lesssim N^{-2/p}\), which in turn is accomplished via a dyadic decomposition \(\lambda _{b,m,N}=\sum _{k=1}^K\psi _k+\psi _{K+1}\) and the estimates

$$\begin{aligned} \Vert f*{\widehat{\psi }}_k\Vert _\infty \lesssim 2^{-(1-\varepsilon )k} \Vert f\Vert _1\text { and }\Vert f*{\widehat{\psi }}_k\Vert _2 \lesssim 2^{\varepsilon k} N^{-1}\Vert f\Vert _2. \end{aligned}$$
(5)

The proof of the first estimate in (5) is rather long and technical, but afterwards Roth’s theorem in the primes is obtained in just a few strokes. It remains to date one of the most striking instances of the Stein–Tomas argument.
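The normalization of the weight \(\lambda _{b,m,N}\) can be explored numerically. The sketch below builds \(\Lambda _{b,m,N}\) with a simple sieve and sums \(\lambda _{b,m,N}\) over it; by the prime number theorem in arithmetic progressions, the total mass should be close to 1 for large N. All function names are hypothetical, and the code assumes that n ranges over \(1\le n\le N\), which the definition above leaves implicit.

```python
from math import gcd, log


def prime_table(limit):
    """Sieve of Eratosthenes: table[q] == 1 exactly when q is prime."""
    table = bytearray([1]) * (limit + 1)
    table[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if table[p]:
            multiples = range(p * p, limit + 1, p)
            table[p * p :: p] = bytearray(len(multiples))
    return table


def euler_phi(m):
    """Euler's totient function, computed naively (m is small here)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def total_mass(b, m, N):
    """Sum of lambda_{b,m,N}(n) over 1 <= n <= N (assumed range of n)."""
    if gcd(b, m) != 1:
        raise ValueError("b must be coprime to m")
    is_prime = prime_table(N * m + b)
    phi = euler_phi(m)
    return sum(
        phi * log(n * m + b) / (m * N)
        for n in range(1, N + 1)
        if is_prime[n * m + b]
    )


print(total_mass(1, 2, 10_000))
```

The choice m = 2, b = 1, N = 10000 respects the constraint \(m\le \log N\) and simply weights the odd primes up to 2N + 1 by their logarithms.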

3 Sharp Stein–Tomas

In three-dimensional space, the sharp endpoint Stein–Tomas inequality reads as follows:

$$\begin{aligned} \int _{{\mathbb {S}}^{2}} |{{\widehat{f}}}(\omega )|^2\,d \sigma (\omega ) \le 4\pi ^2 \Vert f\Vert _{L^{\frac{4}{3}}({\mathbb {R}}^3)}^2. \end{aligned}$$
(6)

Here, \(4\pi ^2\) is the best (i.e., smallest) possible constant since equality in (6) holds if \(f(x)=\textup{sinc} ^3(|x|)\). By a standard duality argument, recall (2), this is equivalent to the statement that the inequality

$$\begin{aligned} \Vert \widehat{f\sigma }\Vert _{L^4({\mathbb {R}}^3)} \le 2\pi \Vert f\Vert _{L^2({\mathbb {S}}^2)} \end{aligned}$$
(7)

is sharp and equality holds if f is a constant function on \({\mathbb {S}}^2\).
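As a consistency check on the constant in (7), one can evaluate both sides for \(f\equiv 1\). The computation below is a sketch which assumes that \(\sigma\) is the surface measure on \({\mathbb {S}}^2\) with total mass \(4\pi\), and which uses the classical formulas \({\widehat{\sigma }}(x)=4\pi \sin |x|/|x|\) and \(\int _0^\infty r^{-2}\sin ^4 r\,d r=\pi /4\):

$$\begin{aligned} \Vert {\widehat{\sigma }}\Vert _{L^4({\mathbb {R}}^3)}^4 =(4\pi )^4\int _{{\mathbb {R}}^3}\frac{\sin ^4|x|}{|x|^4}\,d x =(4\pi )^5\int _0^\infty \frac{\sin ^4 r}{r^2}\,d r =(4\pi )^5\cdot \frac{\pi }{4} =(2\pi )^4(4\pi )^2 =(2\pi )^4\Vert \textbf{1}\Vert _{L^2({\mathbb {S}}^2)}^4. \end{aligned}$$

Taking fourth roots gives \(\Vert {\widehat{\sigma }}\Vert _{L^4({\mathbb {R}}^3)}=2\pi \Vert \textbf{1}\Vert _{L^2({\mathbb {S}}^2)}\), in agreement with equality in (7) for constant functions.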

It is not a priori clear that maximizers for (6), or equivalently for (7), exist. Christ–Shao [7] addressed this question and proved that nonnegative maximizing sequences for (7) are precompact in \(L^2({\mathbb {S}}^2)\), and therefore maximizers exist. Plancherel’s identity recasts (7) as a convolution estimate: \(\Vert f\sigma *f\sigma \Vert _{2}\lesssim \Vert f\Vert _2^2\). Since the latter is a positive inequality, the search for maximizers can be confined to the class of even, nonnegative, smooth functions. Christ–Shao’s analysis proceeds via a compactness argument, which identifies concentration at a pair of antipodal points as the most essential obstruction to the existence of maximizers; ruling out this scenario yields their result; see Fig. 1.
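The passage from (7) to the convolution estimate can be sketched as follows, using the normalization of the Fourier transform from §1.2, under which products on the Fourier side correspond to convolutions and \(\Vert {\widehat{g}}\Vert _{L^2({\mathbb {R}}^3)}^2=(2\pi )^3\Vert g\Vert _{L^2({\mathbb {R}}^3)}^2\):

$$\begin{aligned} \Vert \widehat{f\sigma }\Vert _{L^4({\mathbb {R}}^3)}^4 =\Vert (\widehat{f\sigma })^2\Vert _{L^2({\mathbb {R}}^3)}^2 =\Vert (f\sigma *f\sigma )^\wedge \Vert _{L^2({\mathbb {R}}^3)}^2 =(2\pi )^3\Vert f\sigma *f\sigma \Vert _{L^2({\mathbb {R}}^3)}^2. \end{aligned}$$

In particular, under these conventions the sharp form of (7) corresponds to \(\Vert f\sigma *f\sigma \Vert _{2}\le (2\pi )^{1/2}\Vert f\Vert _2^2\).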

Fig. 1

Left: The sumset \({\mathcal {C}}_\phi ^\star +{\mathcal {C}}_\phi ^\star\), where \({\mathcal {C}}_\phi ^\star =-{\mathcal {C}}_\phi \cup {\mathcal {C}}_\phi\) and \({\mathcal {C}}_\phi \subset {\mathbb {S}}^{2}\) is the (red) cap centered at the north pole with half-angle \(\phi =\frac{\pi }{4}\). Right: The Stein–Tomas functional \(\Phi (f)= \Vert f\sigma *f\sigma \Vert _2 \Vert f\Vert ^{-2}_2\) for \(f=\textbf{1}_{{\mathcal {C}}_\phi }\) and \(0<\phi <\frac{\pi }{2}\) (blue). It holds that \(2\Phi (\textbf{1}_{{\mathcal {C}}^\star _\phi })=3\Phi (\textbf{1}_{{\mathcal {C}}_\phi })\) if and only if \(\cos \phi \ge \frac{1}{3}\). For larger values of \(\phi\), the strict inequality \(2\Phi (\textbf{1}_{{\mathcal {C}}^\star _\phi })>3\Phi (\textbf{1}_{{\mathcal {C}}_\phi })\) holds; for such \(\phi\), it is thus energetically more advantageous to distribute the mass in an antipodally symmetric way

The sharp inequality (7) was obtained by Foschi [9] via a clever proof that proceeds in three steps:

  (a) a “magic” identity for four unit vectors which sum to zero;

  (b) an ingenious application of the Cauchy–Schwarz inequality;

  (c) the spectral analysis of a specific quadratic form.

Alternatives to steps (a) and (c) now exist. Step (a) can be replaced by integration by parts together with the realization that \(u:=\widehat{f\sigma }\) satisfies the Helmholtz equation \(u+\Delta u=0\); see [6, 32]. Step (c) can be replaced by a more elementary reflection method [31] or by a homogeneity argument on a specific family of tempered distributions [32]. An alternative to step (b) is not known to date, even though it could be an important ingredient towards generalizations to other dimensions and exponents.
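The Helmholtz equation for \(u=\widehat{f\sigma }\) follows by differentiating under the integral sign, since each plane wave with unit frequency is an eigenfunction of the Laplacian. As a brief sketch, for \(\omega \in {\mathbb {S}}^{2}\) and, say, \(f\in L^1({\mathbb {S}}^2)\):

$$\begin{aligned} \Delta _x e^{-ix\cdot \omega }=-|\omega |^2e^{-ix\cdot \omega }=-e^{-ix\cdot \omega }, \qquad \text {hence}\qquad \Delta u(x)=-\int _{{\mathbb {S}}^{2}} f(\omega )e^{-ix\cdot \omega }\,d \sigma (\omega )=-u(x). \end{aligned}$$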

For recent surveys on sharp restriction theory and related sharp Strichartz inequalities (recall §2.1), see [10, 32].

4 Variational Stein–Tomas

In three-dimensional space, the variational endpoint Stein–Tomas inequality

$$\begin{aligned} \Big \Vert \big \Vert ({\widehat{f}}*\chi _{\varepsilon })(\omega ) \big \Vert _{V ^{\varrho }_{\varepsilon }} \Big \Vert _{{L}^2_{\omega }({\mathbb {S}}^2,\sigma )} \lesssim _{\chi ,\varrho } \Vert f\Vert _{{L}^{\frac{4}{3}}({\mathbb {R}}^3)} \end{aligned}$$
(8)

holds as long as \(\varrho \in (4/3,\infty )\). Here, the \(\varrho\)-variation norm of a function \(a:(0,\infty )\rightarrow {\mathbb {C}}\) is defined as

$$\begin{aligned} \Vert a\Vert _{V ^{\varrho }}:= \sup _{\begin{array}{c} m\in {\mathbb {N}}\\ 0<\varepsilon _0<\varepsilon _1<\cdots <\varepsilon _m \end{array}} \Big ( |a(\varepsilon _0)|^{\varrho } + \sum _{j=1}^{m} |a(\varepsilon _{j})-a(\varepsilon _{j-1})|^{\varrho } \Big )^{\frac{1}{\varrho }}. \end{aligned}$$

Moreover, \(\chi\) is a fixed complex-valued Schwartz function such that \(\int _{{\mathbb {R}}^3}\chi (x)\,d x=1\), and \(\chi _{\varepsilon }(x):=\varepsilon ^{-3}\chi (\varepsilon ^{-1}x)\) for \(x\in {\mathbb {R}}^3\) and \(\varepsilon >0\). Inequality (8) implies the maximal endpoint Stein–Tomas inequality,

$$\begin{aligned} \Big \Vert \sup _{\varepsilon >0} \big | ({\widehat{f}}*\chi _{\varepsilon })(\omega ) \big | \Big \Vert _{{L}^2_{\omega }({\mathbb {S}}^2,\sigma )} \lesssim _{\chi } \Vert f\Vert _{{L}^{\frac{4}{3}}({\mathbb {R}}^3)}, \end{aligned}$$
(9)

which in turn implies the classical version given by (1) when \(d=3\).
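The implication from (8) to (9) rests on the pointwise bound \(|a(\varepsilon )|\le \Vert a\Vert _{V^{\varrho }}\), which holds because the term \(|a(\varepsilon _0)|^{\varrho }\) alone is already dominated by the supremum. For a function sampled at finitely many scales, the \(\varrho\)-variation can be computed by dynamic programming over increasing chains of sample indices. The following is a minimal sketch for such a discretized setting only; the function name `variation_norm` is hypothetical, and single-point chains are allowed.

```python
def variation_norm(samples, rho):
    """rho-variation of a finite sequence a(eps_0), a(eps_1), ... (eps increasing).

    best[j] is the largest value of
        |a_{i_0}|^rho + sum_l |a_{i_l} - a_{i_{l-1}}|^rho
    over increasing index chains i_0 < i_1 < ... that end at index j.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    best = []
    for j, a_j in enumerate(samples):
        # Either the chain starts at j, or it extends a chain ending at some i < j.
        candidates = [abs(a_j) ** rho]
        candidates += [best[i] + abs(a_j - samples[i]) ** rho for i in range(j)]
        best.append(max(candidates))
    return max(best) ** (1.0 / rho)


a = [1.0, 3.0, 2.0]
v2 = variation_norm(a, 2)
# The supremum of |a| is dominated by the variation norm.
assert max(abs(x) for x in a) <= v2
print(v2)
```

The quadratic running time is harmless for illustration; since `abs` is used throughout, complex-valued samples are handled as well.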

The variational restriction inequality (8) was established for \(\varrho >2\) in [21] via a combination of Gaussian domination techniques and variational estimates for convolution-type operators, and later extended to \(\varrho >4/3\) in [20]. The maximal estimate (9) together with obvious convergence properties in the dense Schwartz class imply that the limit

$$\begin{aligned}\lim _{\varepsilon \rightarrow 0^+} ({{\widehat{f}}}*\chi _\varepsilon )(\omega )\end{aligned}$$

exists for each \(f\in L^{4/3}({\mathbb {R}}^3)\) and \(\sigma\)-almost every \(\omega \in {\mathbb {S}}^2\). This was key for Ramos [33] (see also [11]) to conclude that, for each \(f\in L^p({\mathbb {R}}^3)\) and \(1\le p\le 4/3\), almost every point on the sphere is a Lebesgue point of \({{\widehat{f}}}\). Prior to [33], this had been confirmed by Vitturi [37] in the smaller range \(1\le p\le 8/7\) via an adaptation of the original argument of Müller–Ricci–Wright [29]. In addition to quantifying the mere convergence, estimate (8) establishes convergence in the whole \(L^p\) space without the need for previous convergence results on a dense subspace.

We conclude this section by describing how maximal and variational restriction led to recent further understanding of the fine properties of the Fourier restriction operator. It is well-known that the Fourier transform of a function in \(L^1({\mathbb {R}}^d)\) is uniformly continuous, and therefore the set of its Lebesgue points is the whole \({\mathbb {R}}^d\). Bilz [2] recently constructed a function which belongs to \(L^p({\mathbb {R}}^d)\), for every \(p\in (1,\infty ]\), but whose Fourier transform has no Lebesgue points in some compact set of full Hausdorff dimension d. Together with the aforementioned extension by Kovač [20], this implies the existence of large sets without Fourier restriction theorems. More precisely, [2, Cor. 2] establishes the existence of a compact subset \(E\subset {\mathbb {R}}^d\) with full Hausdorff dimension and the following property: for any Borel measure \(\mu\) on \({\mathbb {R}}^d\) with \(\mu (E)>0\) and any \((p,q)\in (1,2]\times [1,\infty ]\),

$$\begin{aligned}\sup _{f\in {\mathcal {S}}({\mathbb {R}}^d)} \frac{\Vert {{\widehat{f}}}\Vert _{L^q(\mu )}}{\Vert f\Vert _{L^p({\mathbb {R}}^d)}} = \infty .\end{aligned}$$

It is surprising that this had not been observed before, and raises the question of whether further relations besides the Mockenhaupt–Mitsis–Bak–Seeger theorem from §2.2 hold between the Fourier dimension and the range of restriction exponents.

5 Symmetric Stein–Tomas

In three-dimensional space, no symmetric version of the Stein–Tomas inequality (in the sense described below) is available. So we move up in dimension by one: the restriction inequality

$$\begin{aligned} \int _{{\mathbb {S}}^{3}} |{{\widehat{f}}}(\omega )|^2\,d \sigma (\omega ) \lesssim \Vert f\Vert _{L^p({\mathbb {R}}^4)}^2 \end{aligned}$$
(10)

holds for every function \(f:{\mathbb {R}}^4\rightarrow {\mathbb {C}}\) such that \(f=f\circ M\) for every \(M\in O(2)\times O(2)\), as long as \(1\le p\le 3/2\). Since \(3/2>10/7\), we see that the endpoint Stein–Tomas inequality on \({\mathbb {S}}^3\) can be improved in the presence of \(O(2)\times O(2)\)-symmetry.

Contrary to the results described in Sects. 3 and 4, the proof of inequality (10) does not rely on the specific dimension of the ambient space and finds its natural habitat in general \(d\ge 4\): Given \(2\le k\le d-2\) and letting \(m=\min \{d-k,k\}\), the inequality

$$\begin{aligned} \int _{{\mathbb {S}}^{d-1}} |{{\widehat{f}}}(\omega )|^2\,d \sigma (\omega ) \lesssim \Vert f\Vert _{L^p({\mathbb {R}}^d)}^2, \end{aligned}$$
(11)

holds for every \(f:{\mathbb {R}}^d\rightarrow {\mathbb {C}}\) such that \(f=f\circ M\) for every \(M\in O(d-k)\times O(k)\), as long as \(1\le p\le 2(d+m)(d+m+2)^{-1}\). This was proved in [26] and has non-trivial consequences for:

  • mapping properties of the Helmholtz resolvent ([26]);

  • unconditional existence of endpoint Stein–Tomas maximizers ([26]);

  • symmetry breaking for ground states of biharmonic NLS ([25]).

The non-symmetric versions had been respectively addressed in influential work of Kenig–Ruiz–Sogge [19], Frank–Lieb–Sabin [12] and Lenzmann–Weth [23] (see also [22]). Since these applications were very recently surveyed in [24], we refer the interested reader to that work, and conclude with the following question: What happens with general subgroups of the orthogonal group?
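To get a feeling for the size of the improvement in (11), the symmetric exponent \(2(d+m)/(d+m+2)\) can be tabulated against the Stein–Tomas endpoint \(2(d+1)/(d+3)\). The following is a small sketch in exact arithmetic; the function names are hypothetical and serve only this comparison.

```python
from fractions import Fraction


def stein_tomas_endpoint(d):
    """Endpoint exponent 2(d+1)/(d+3) from inequality (1)."""
    return Fraction(2 * (d + 1), d + 3)


def symmetric_endpoint(d, k):
    """Exponent 2(d+m)/(d+m+2) from (11), where m = min(d-k, k) and 2 <= k <= d-2."""
    if not 2 <= k <= d - 2:
        raise ValueError("need 2 <= k <= d - 2")
    m = min(d - k, k)
    return Fraction(2 * (d + m), d + m + 2)


# The case d = 4, k = 2 recovers the comparison 3/2 > 10/7 from (10).
assert symmetric_endpoint(4, 2) == Fraction(3, 2)
assert stein_tomas_endpoint(4) == Fraction(10, 7)

for d in range(4, 9):
    for k in range(2, d - 1):
        print(d, k, stein_tomas_endpoint(d), symmetric_endpoint(d, k))
```

Since \(t\mapsto 2(d+t)/(d+t+2)\) is increasing and \(m\ge 2\), the symmetric exponent exceeds the Stein–Tomas endpoint for every admissible pair (d, k), and the gain is largest when k is closest to d/2.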