1 Introduction

Consider a \(C^{\infty }\) non-degenerate curve \(\gamma : I := [-1,1] \rightarrow {\mathbb {R}}^d\). In other words,

$$\begin{aligned} \det \begin{pmatrix} \gamma ^{(1)}(s)&\cdots&\gamma ^{(d)}(s) \end{pmatrix} \ne 0 \quad \hbox { for all}\ s \in I. \end{aligned}$$
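For example, the moment curve \(\gamma (s):= \big (s, \tfrac{s^2}{2!}, \ldots , \tfrac{s^d}{d!}\big )\) is non-degenerate: the matrix in question is lower triangular with unit diagonal, so that

$$\begin{aligned} \det \begin{pmatrix} \gamma ^{(1)}(s)&\cdots&\gamma ^{(d)}(s) \end{pmatrix} = 1 \qquad \hbox { for all}\ s \in I. \end{aligned}$$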

The curve \(\gamma \) defines a one-parameter family of directions in \({\mathbb {R}}^{d+1}\). For \( 0< \delta < 1\) and \(s \in I\), consider a \(\delta \)-tube in \({\mathbb {R}}^{d+1}\) in the direction of \((\gamma (s) \ 1)^{\top }\), defined as

$$\begin{aligned} T_{\delta }(s):= \{(y,t) \in {\mathbb {R}}^{d} \times I: |y - t\gamma (s) |\le \delta \} \end{aligned}$$

and the corresponding averaging operator

$$\begin{aligned} \mathcal{{A}_{\delta }^{\gamma }}g(x,s)&: = \frac{1}{|T_{\delta }(s)|}\int _{{T}_{\delta }(s)} g(x - y,t) \mathrm{{d}}y\mathrm{{d}}t, \qquad \text { for } x \in {\mathbb {R}}^d \end{aligned}$$
(1)

whenever \(g \in L^1_{\textrm{loc}}({\mathbb {R}}^{d+1})\). Our goal is to investigate the \(L^p\) boundedness properties of the Nikodym maximal function

$$\begin{aligned} \mathcal{{N}_{\delta }^{\gamma }}g(x)&: = \sup _{s \in I} | \mathcal{{A}_{\delta }^{\gamma }}g(x,s) |. \end{aligned}$$
(2)
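Here the normalising factor in (1) is explicit: the cross-section of \(T_{\delta }(s)\) at each fixed \(t \in I\) is a ball of radius \(\delta \) in \({\mathbb {R}}^{d}\), so that

$$\begin{aligned} |T_{\delta }(s)| = 2\omega _{d} \, \delta ^{d}, \end{aligned}$$

where \(\omega _{d}\) denotes the volume of the unit ball in \({\mathbb {R}}^{d}\).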

The main result is as follows.

Theorem 1

Let \(\gamma : I \rightarrow {\mathbb {R}}^d\) be a non-degenerate curve. There exists \(C_{d,\gamma } > 0\) such that

$$\begin{aligned} \Vert \mathcal{{N}_{\delta }^{\gamma }}\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d})} \le C_{d,\gamma } (\log \delta ^{-1})^{d/2} \qquad \hbox { for all}\ 0< \delta < 1. \end{aligned}$$

By interpolating with the trivial bound at \(L^{\infty }\), we estimate the \(L^p\) operator norm for the maximal function as \(O((\log \delta ^{-1})^{{d}/{p}})\) for \(2 \le p \le \infty \). This is sharp in the sense that the \(L^p\) operator norm has polynomial blowup in \(\delta ^{-1}\) for \(1 \le p < 2\) (see Sect. 5). The result is new for \(d \ge 4\). The theorem also slightly strengthens the known estimates for \(d = 2\) and \(d = 3\) (see [7, Lemma 1.4] and [1, Proposition 5.5], respectively) by improving the dependence on \(\delta ^{-1}\).
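Explicitly, \(\mathcal{{N}_{\delta }^{\gamma }}\) is sublinear and satisfies the trivial bound \(\Vert \mathcal{{N}_{\delta }^{\gamma }}g\Vert _{L^{\infty }({\mathbb {R}}^{d})} \le \Vert g\Vert _{L^{\infty }({\mathbb {R}}^{d+1})}\), so interpolation with Theorem 1 gives

$$\begin{aligned} \Vert \mathcal{{N}_{\delta }^{\gamma }}\Vert _{L^p({\mathbb {R}}^{d+1}) \rightarrow L^p({\mathbb {R}}^{d})} \lesssim \Vert \mathcal{{N}_{\delta }^{\gamma }}\Vert _{L^2 \rightarrow L^2}^{{2}/{p}} \, \Vert \mathcal{{N}_{\delta }^{\gamma }}\Vert _{L^{\infty } \rightarrow L^{\infty }}^{1 - {2}/{p}} \lesssim _{d,\gamma } (\log \delta ^{-1})^{{d}/{p}} \qquad \text {for } 2 \le p \le \infty . \end{aligned}$$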

The operator \(\mathcal{{N}_{\delta }^{\gamma }}\) is a variant of the classical Nikodym maximal function considered in [2]. The main difference lies in the dimensional setup of the problem: by the above definition, \(\mathcal{{N}_{\delta }^{\gamma }}\) maps functions on \({\mathbb {R}}^{d+1}\) to functions on \({\mathbb {R}}^d\), whereas the classical operator considered in [2] is a mapping between functions on the same Euclidean space.

Maximal functions of the form (2) naturally arise in the study of local smoothing problems for averaging operators associated to \(\gamma \), as first observed in Mockenhaupt-Seeger-Sogge [7]. In [7, Lemma 1.4] estimates for \(\mathcal{{N}_{\delta }^{\gamma }}\) were obtained for \(d = 2\). The \(d=3\) case was later considered in [1, Proposition 5.5], in relation to the problem of bounding the helical maximal function. The averages \(\mathcal{{A}_{\delta }^{\gamma }}\) are also closely related to the restricted X-ray transforms considered in [4, 5, 8].

The proof scheme is outlined as follows: First, oscillations are introduced into the problem, followed by a fractional Sobolev embedding to dominate the maximal function by a Fourier integral operator (see Proposition 2). This allows us to fully access orthogonality in the subsequent decomposition. While the application of Sobolev embedding in this context is standard, the use of the fractional variant introduced here constitutes a novel element compared to the previous works [1, 7].

Next, the desired Fourier integral estimates are established through an induction scheme based on a parameter N, measuring the degree of non-degeneracy of the curve. This induction method conceals the intricacies of the root analysis detailed in [1], marking a significant departure from the previous cases of \(d = 2\) and \(d = 3\). The motivation for this induction approach stems from [6], where a (more complex) induction argument is employed to investigate the local smoothing problem associated to averages along curves in \({\mathbb {R}}^d\).

The base case for induction is essentially straightforward. In the induction step, the operator is divided into two parts: one where the induction hypothesis with parameter \(N - 1\) can be applied, and the other where the support of the resulting symbol ensures the existence of precisely one root \(s = \sigma (\xi )\) for the map \(s \mapsto \langle \gamma ^{(N-1)}(s), \xi \rangle \). This root generates a degenerate cone \(\Gamma \), and now a decomposition is performed with respect to the distance to \(\Gamma \). The most singular component is directly bounded in \(L^2\). To effectively bound the remaining segments, a further decomposition on the curve is performed, followed by rescaling arguments, and a final verification that the symbols we end up with are amenable to another application of the induction hypothesis.

1.1 Outline of the Paper

This paper is structured as follows:

  • In Sect. 3, we reduce the proof of Theorem 1 to Proposition 3 via fractional Sobolev embedding.

  • In Sect. 4, we present the inductive proof of Proposition 3.

  • In Sect. 5, we discuss the sharpness of Theorem 1.

  • In Sect. 6, we state an anisotropic extension of Theorem 1 and briefly discuss its proof.

2 Notational Conventions

For a set \(E \subseteq {\mathbb {R}}^n\), we denote its characteristic function by \(\chi _{E}\). Given \(f \in L^1({\mathbb {R}}^n)\) we let either \({\hat{f}}\) or \({\mathcal {F}}(f)\) denote its Fourier transform and \({\check{f}}\) or \({\mathcal {F}}^{-1}(f)\) denote its inverse Fourier transform, which are normalised as follows:

$$\begin{aligned} {\hat{f}}(\xi ):= \int _{{\mathbb {R}}^n} e^{-i x \cdot \xi } f(x)\,\textrm{d} x, \qquad {\check{f}}(\xi ):= \int _{{\mathbb {R}}^n} e^{i x \cdot \xi } f(x)\,\textrm{d} x. \end{aligned}$$

For \(m \in L^{\infty }({\mathbb {R}}^n) \), we denote by \(m(\tfrac{1}{i}\partial _{x})\) the Fourier multiplier operator defined by its action on \(g \in \mathcal{{S}}({\mathbb {R}}^{n})\) as

$$\begin{aligned}\mathcal{{F}}(m(\tfrac{1}{i}\partial _{x})g)(\xi ):= m(\xi )\mathcal{{F}}(g)(\xi ) \qquad \hbox { for}\ \xi \in {\mathbb {R}}^n. \end{aligned}$$

Finally, given two non-negative real numbers \(A\), \(B\) and a list of parameters \(M_1, \ldots , M_n\), the notation \(A \lesssim _{M_1, \ldots , M_n} B\) or \(A = O_{M_1, \ldots , M_n}(B)\) signifies that \(A \le C B\) for some constant \(C = C_{M_1, \ldots , M_n} > 0\) depending only on the parameters \(M_1, \ldots , M_n\). In addition, \(A \sim _{M_1, \ldots , M_n} B\) is used to signify that both \(A \lesssim _{M_1, \ldots , M_n} B\) and \(B \lesssim _{M_1, \ldots , M_n} A\) hold.

3 Initial Reductions and Sobolev Embedding

3.1 Initial Reductions

Let \(I:= [-1,1]\) and \(\gamma :I \rightarrow {\mathbb {R}}^d\) be a non-degenerate curve, as in Sect. 1. We begin by replacing the classical averaging operators by Fourier integral operators. Given \({a \in L^{\infty }( {\mathbb {R}}^{d} \times I \times I)}\), consider

$$\begin{aligned} \mathcal{{A}}[a,\gamma ] g(x,s)&: = \int _{I}\int _{{\mathbb {R}}^d} e^{i \langle x - t\gamma (s), \xi \rangle } a(\xi ,s,t)\mathcal{{F}}_{x}(g)(\xi ,t) \mathrm{{d}}\xi \mathrm{{d}}t\quad \text {for }g \in \mathcal{{S}}({\mathbb {R}}^{d+1}), \end{aligned}$$
(3)

where \(\mathcal{{F}}_{x}(g)(\xi ,t)\) denotes \(\mathcal{{F}}_{x}(g(\,\cdot ,t))(\xi )\), the Fourier transform of g in x only. Define the associated maximal operator

$$\begin{aligned} {\mathcal {N}}[a,{\gamma }]g(x)&: = \sup _{s \in I} |\mathcal{{A}}[a,\gamma ]g(x,s)|. \end{aligned}$$

Choose a function \(\psi \in C_{c}^{\infty }({\mathbb {R}})\) with \(\mathrm{{supp}} \ \psi \subseteq [-1,1]\) such that its inverse Fourier transform \({\check{\psi }}\) is non-negative and \({\check{\psi }}(y) \gtrsim 1\) whenever \(|y|\le 1\). Let \({\tilde{\chi }}_{I}\) be a non-negative smooth function that satisfies \({\tilde{\chi }}_{I}(x) = 1\) for all \(x \in I\) and \({\tilde{\chi }}_{I}(x) = 0\) when \(x \notin [-2,2]\). Define

$$\begin{aligned} a_{\delta }(\xi ,s,t) := \psi (\delta |\xi |) {\tilde{\chi }}_{I}(s){\tilde{\chi }}_{I}(t). \end{aligned}$$
(4)

Let \(K_{\delta }\) denote the kernel of the averaging operator \(\mathcal{{A}_{\delta }^{\gamma }}\) defined in (1). In particular,

$$\begin{aligned} K_{\delta }(x,s,t):= \frac{1}{|T_{\delta }(s)|} \chi _{T_{\delta }(s)}(x,t). \end{aligned}$$

By the integral formula for the inverse Fourier transform and a change of variables,

$$\begin{aligned} K_{\delta }(x,s,t) \lesssim _{d} \int _{{\mathbb {R}}^d} e^{i\langle x - t\gamma (s), \xi \rangle } a_{\delta }(\xi ,s,t) \mathrm{{d}}\xi . \end{aligned}$$

Thus, the pointwise estimate

$$\begin{aligned} |\mathcal{{A}_{\delta }^{\gamma }}g (x,s)| \lesssim _{d} |{\mathcal {A}}[a_{\delta },\gamma ]g(x,s)| \end{aligned}$$

holds. It is therefore enough to bound the operator \({\mathcal {N}}[a_{\delta },\gamma ]\).

We now perform an endpoint Sobolev embedding to replace the \(L^{\infty }_s\) norm in the maximal function with an \(L^2_{s}\) norm. Here we write

$$\begin{aligned} {\mathfrak {D}}_{s}\mathcal{{A}}[a,\gamma ]:= \Big (1+ \sqrt{-\partial _{s}^{2}}\Big )^{1/2} \mathcal{{A}}[a,\gamma ], \end{aligned}$$

where a and \(\gamma \) are as above and \(\big (1+ \sqrt{-\partial _{s}^{2}}\big )^{1/2}\) is the fractional differential operator in s with multiplier \((1+|\sigma |)^{1/2}\).

Proposition 2

For a nondegenerate curve \(\gamma : I \rightarrow {\mathbb {R}}^d\), \(0< \delta < 1\) and \(a_{\delta }\) as defined in (4), we have

$$\begin{aligned} \Vert {\mathcal {N}}[a_{\delta },\gamma ]g\Vert _{L^2({\mathbb {R}}^{d})} \lesssim _{} |\log \delta |^{1/2}\Vert {\mathfrak {D}}_s\mathcal{{A}}[a_{\delta },\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})} + \Vert g\Vert _{L^2({\mathbb {R}}^{d+1})}\end{aligned}$$

for all \(g \in \mathcal{{S}}({\mathbb {R}}^{d+1})\).

Proof

Let \({\tilde{\chi }}: {\mathbb {R}} \rightarrow [0,1]\) satisfy \({\tilde{\chi }}(\sigma ) = 1\) for all \(\sigma \in (-C\delta ^{-1}, C\delta ^{-1})\) and \({\tilde{\chi }}(\sigma ) = 0\) when \(\sigma \notin (-2C\delta ^{-1},2C\delta ^{-1})\). The constant C is chosen large enough to satisfy the requirements of the forthcoming argument. Defining

$$\begin{aligned} \mathcal{{A}}_{\text {main}}[a_{\delta },\gamma ] := {\tilde{\chi }}\big (\tfrac{1}{i} \partial _s\big ) \circ \mathcal{{A}}[a_{\delta },\gamma ] \qquad \text {and} \qquad \mathcal{{A}}_{\text {err}}[a_{\delta },\gamma ] := \mathcal{{A}}[a_{\delta },\gamma ] - \mathcal{{A}}_{\text {main}}[a_{\delta },\gamma ], \end{aligned}$$

where the multiplier operator \({\tilde{\chi }}\big (\tfrac{1}{i} \partial _s\big )\) is defined in Sect. 2, it suffices to prove

$$\begin{aligned} \Vert \mathcal{{A}}_{\text {main}}[a_{\delta },\gamma ]g\Vert _{L^{2}_{x}L^{\infty }_{s}({\mathbb {R}}^{d} \times I)}&\lesssim |\log \delta |^{1/2}\Vert {\mathfrak {D}}_s\mathcal{{A}}[a_\delta ,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1} )}, \end{aligned}$$
(5)
$$\begin{aligned} \Vert \mathcal{{A}}_{\text {err}}[a_{\delta },\gamma ]g\Vert _{L^{2}_{x}L^{\infty }_{s}({\mathbb {R}}^{d} \times I)}&\lesssim \Vert g\Vert _{L^2({\mathbb {R}}^{d+1})} \end{aligned}$$
(6)

for all \(g \in \mathcal{{S}}({\mathbb {R}}^{d+1})\).

To prove (5), fix \(g \in \mathcal{{S}}({\mathbb {R}}^{d+1})\) and write

$$\begin{aligned} \mathcal{{A}}_{\text {main}}[a_{\delta },\gamma ] g(x,s) = {\tilde{\chi }}_1\big (\tfrac{1}{i} \partial _s\big ) \circ {\mathfrak {D}}_s\mathcal{{A}}[a_{\delta },\gamma ]g(x,s) \qquad \text {for }(x,s) \in {\mathbb {R}}^d \times I, \end{aligned}$$

where \({\tilde{\chi }}_1(\sigma ):= (1+|\sigma |)^{-1/2}{\tilde{\chi }}(\sigma )\). Temporarily fix \(x \in {\mathbb {R}}^d\). The above expression can be written as a convolution in the s variable between \(\mathcal{{F}}_{s}^{-1}({\tilde{\chi }}_1)\) and \({\mathfrak {D}}_s\mathcal{{A}}[a_{\delta },\gamma ]g(x, \, \cdot \,).\) Using Young’s inequality, Plancherel’s theorem and the fact that the \(L^2\) norm of \({\tilde{\chi }}_1\) is \(O(|\log \delta |^{1/2})\), we obtain

$$\begin{aligned} \Vert \mathcal{{A}}_{\text {main}}[a_{\delta },\gamma ] g(x,\,\cdot \,)\Vert _{L^{\infty }_s(I)} \lesssim _{} |\log \delta |^{1/2}\Vert {\mathfrak {D}}_s\mathcal{{A}}[a_{\delta },\gamma ]g(x,\,\cdot \,)\Vert _{L^2_s({\mathbb {R}})}. \end{aligned}$$

Combining Fubini’s theorem with the above estimate for each \(x \in {\mathbb {R}}^d\), we obtain (5).
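The norm computation used above is the source of the logarithmic factor: since \({\tilde{\chi }}_1\) is supported in \([-2C\delta ^{-1}, 2C\delta ^{-1}]\),

$$\begin{aligned} \Vert {\tilde{\chi }}_1\Vert _{L^2({\mathbb {R}})}^2 \le \int _{|\sigma | \le 2C\delta ^{-1}} \frac{\mathrm{{d}}\sigma }{1+ |\sigma |} \lesssim _{C} \log \delta ^{-1}. \end{aligned}$$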

To prove (6), write

$$\begin{aligned} \mathcal{{A}}_{\text {err}}[a_{\delta },\gamma ] g&= \Big (1+ \sqrt{-\partial ^{2}_{s}}\Big )^{-1} \circ \Big (1+ \sqrt{-\partial ^{2}_{s}}\Big ) \circ (1 - {\tilde{\chi }})\big (\tfrac{1}{i}\partial _{s}\big )\circ \mathcal{{A}}[a_{\delta },\gamma ]g \\&= {\tilde{\chi }}_2\big (\tfrac{1}{i} \partial _s\big ) \circ {\tilde{\chi }}_3\big (\tfrac{1}{i} \partial _s\big ) \circ \mathcal{{A}}[a_{\delta },\gamma ]g \end{aligned}$$

where

$$\begin{aligned}{\tilde{\chi }}_{2}(\sigma ):= (1+ |\sigma |)^{-1}(1-{\tilde{\chi }}(\sigma ))^{1/2} \quad \text {and} \quad {\tilde{\chi }}_{3}(\sigma ):= (1+ |\sigma |)(1-{\tilde{\chi }}(\sigma ))^{1/2} \end{aligned}$$

for \(\sigma \in {\mathbb {R}}\). Note that \({\tilde{\chi }}_{2}\) has \(L^2\) norm bounded uniformly in \(\delta \), since \(|{\tilde{\chi }}_{2}(\sigma )| \le (1+ |\sigma |)^{-1}\). Thus, an application of Young’s convolution inequality gives

$$\begin{aligned} \Vert \mathcal{{A}}_{\text {err}}[a_{\delta },\gamma ] g(x,\cdot )\Vert _{L^{\infty }_{s}(I)} \lesssim _{} \Vert {\tilde{\chi }}_3\big (\tfrac{1}{i} \partial _s\big ) \circ \mathcal{{A}}[a_{\delta },\gamma ]g(x,\cdot )\Vert _{L^2_s({\mathbb {R}})} \qquad \text { for } x \in {\mathbb {R}}^d.\end{aligned}$$

Integrating in x using Fubini’s theorem,

$$\begin{aligned} \Vert \mathcal{{A}}_{\text {err}}[a_{\delta },\gamma ] g\Vert _{L^{2}_{x}L^{\infty }_{s}({\mathbb {R}}^{d} \times I)} \lesssim _{} \Vert {\tilde{\chi }}_3\big (\tfrac{1}{i} \partial _s\big ) \circ \mathcal{{A}}[a_{\delta },\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$

By Plancherel’s theorem, the quantity on the right can be estimated from above by the \(L^2\) norm of the function \({\mathcal {B}}_{\text {err}}[a_{\delta }, \gamma , \tilde{\chi _{3}}]g\), where

$$\begin{aligned} {\mathcal {B}}_{\text {err}}[a_{\delta }, \gamma , \tilde{\chi _{3}}]g(\xi ,\sigma )&:= \int _{I} b_{\delta }(\xi , \sigma , t) \mathcal{{F}}_{x}(g)(\xi ,t) \mathrm{{d}}t\end{aligned}$$

for

$$\begin{aligned} b_{\delta }(\xi ,\sigma , t)&:= {\tilde{\chi }}_3(\sigma ) \int _{I} e^{-i( \sigma s + t\langle \gamma (s), \xi \rangle )} a_{\delta }(\xi ,s,t) \mathrm{{d}}s. \end{aligned}$$
(7)

By Minkowski’s integral inequality, Plancherel’s theorem and Cauchy–Schwarz in the t variable,

$$\begin{aligned} \Vert {\mathcal {B}}_{\text {err}}[a_{\delta }, \gamma , \tilde{\chi _{3}}]g\Vert _{L^2({\mathbb {R}}^{d+1})} \lesssim \Vert b_{\delta }\Vert _{L^{\infty }_{\xi ,t}L^2_{\sigma }({\mathbb {R}}^{d} \times I \times {\mathbb {R}})} \Vert {g}\Vert _{L^2({\mathbb {R}}^{d} \times I)}. \end{aligned}$$

Thus, the proof of (6) boils down to the estimate \(\Vert b_{\delta }(\xi , \, \cdot \,, t)\Vert _{{L^2_{\sigma }({\mathbb {R}})}} \lesssim 1\) uniformly in \((\xi ,t) \in {\mathbb {R}}^{d} \times I\). Since \(|\xi | \lesssim \delta ^{-1}\) on \(\mathrm{{supp}} \ a_{\delta }\) and C is large,

$$\begin{aligned} | \sigma + t \langle \gamma '(s), \xi \rangle | \sim |\sigma | \quad \text { whenever} \quad (\xi ,s,t) \in \mathrm{{supp}} \ a_{\delta },\ \sigma \in \ \mathrm{{supp}} \ {\tilde{\chi }}_3. \end{aligned}$$

Noting the easy estimate \(| \partial _{s}^{\beta }a_{\delta }(\xi ,s,t) | \lesssim _{\beta } 1\) for all \(\beta \in {\mathbb {N}}\), we apply integration by parts to estimate the oscillatory integral in (7). In particular,

$$\begin{aligned} b_{\delta }(\xi ,\sigma , t) = O_{N,\gamma }((1+ |\sigma |)^{-N}) \qquad \text {for } (\xi ,t) \in {\mathbb {R}}^{d} \times I \text { and } N \ge 1. \end{aligned}$$

It is evident that the required \(L^2\) estimate for \(b_{\delta }\) follows from this rapid decay, completing the proof of (6). \(\square \)

Proposition 2 reduces the analysis to estimating the operator \({\mathfrak {D}}_s\mathcal{{A}}[a_{\delta },\gamma ]\). We begin by dyadically decomposing the frequency space. Suppose \(\eta , \beta \in C_{c}^{\infty }({\mathbb {R}})\) are the classical Littlewood–Paley functions such that

$$\begin{aligned} \mathrm{{supp}} \ \eta \subseteq \{ r \in {\mathbb {R}}: | r | \le 2 \}, \qquad \mathrm{{supp}} \ \beta \subseteq \{ r \in {\mathbb {R}}: {1}/{2} \le | r | \le 2 \} \end{aligned}$$
(8)

and

$$\begin{aligned} \eta (r) + \sum _{\lambda \in 2^{{\mathbb {N}}}} \beta (r/\lambda ) = 1 \qquad \text {for all } r \in {\mathbb {R}}. \end{aligned}$$
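Such a pair can be constructed in the standard way; one possible choice (any \(\varphi \) with the stated properties works) is to take \(\varphi \in C_{c}^{\infty }({\mathbb {R}})\) with \(\varphi (r) = 1\) for \(|r| \le 1\) and \(\mathrm{{supp}} \ \varphi \subseteq [-2,2]\), and set \(\eta := \varphi \) and \(\beta (r):= \varphi (r) - \varphi (2r)\). Then (8) holds and the partition of unity follows by telescoping:

$$\begin{aligned} \eta (r) + \sum _{\lambda \in 2^{{\mathbb {N}}}, \, \lambda \le 2^{M}} \beta (r/\lambda ) = \varphi (r/2^{M}) \rightarrow 1 \qquad \text {as } M \rightarrow \infty . \end{aligned}$$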

For \(\lambda \in \{0\} \cup 2^{{\mathbb {N}}}\), introduce the dyadic symbols

$$\begin{aligned} a^{\lambda }_{\delta }({\xi ,s,t}):= {\left\{ \begin{array}{ll} a_{\delta }(\xi ,s,t)\eta (|\xi |) &{}\text {if} \ \lambda = 0, \\ a_{\delta }(\xi ,s,t)\beta (|\xi |/\lambda ) &{} \text {if} \ \lambda \in 2^{{\mathbb {N}}}. \end{array}\right. } \end{aligned}$$

Theorem 1 is a consequence of the following result.

Proposition 3

Let \(\lambda \in \{0\} \cup 2^{{\mathbb {N}}}\) and \(0< \delta < 1\). Then,

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a^{\lambda }_{\delta },\gamma ]\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d+1})} \lesssim _{d,\gamma } (\log (2 + \lambda ) )^{(d-1)/2}. \end{aligned}$$

Proof

(Proposition 3\(\implies \) Theorem 1) Let \({\tilde{\eta }}, {\tilde{\beta }} \in C_c^{\infty }({\mathbb {R}})\) be two non-negative functions such that \({\tilde{\eta }}(r) = 1\) for \(r \in \mathrm{{supp}} \ \eta \), \( {\tilde{\beta }}(r) = 1\) for \(r \in \mathrm{{supp}} \ \beta \) and

$$\begin{aligned} {\tilde{\eta }}(r) + \sum _{\lambda \in 2^{{\mathbb {N}}}} {\tilde{\beta }}(r/\lambda )\lesssim 1 \qquad \text {for all } r \in {\mathbb {R}}. \end{aligned}$$

For \(g \in \mathcal{{S}}({\mathbb {R}}^{d+1})\), define

$$\begin{aligned} g^{\lambda } := {\left\{ \begin{array}{ll} {\tilde{\eta }}\left( |\tfrac{1}{i}\partial _{x}|\right) g &{} \hbox { if}\ \lambda = 0,\\ {\tilde{\beta }}\left( | \tfrac{1}{i}\partial _{x}/{\lambda } |\right) g &{} \hbox { if}\ \lambda \ge 1. \end{array}\right. } \end{aligned}$$

It is clear from the definitions that \({\mathfrak {D}}_s\mathcal{{A}}[a_{\delta }^{\lambda },\gamma ]g = {\mathfrak {D}}_s\mathcal{{A}}[a^{\lambda }_{\delta },\gamma ]g^{\lambda }\). By Plancherel’s theorem and the support properties of the \(a^{\lambda }_{\delta }\), we have

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a_{\delta },\gamma ]g \Vert _{L^2({\mathbb {R}}^{d+1})}^2 \lesssim _{d} \sum _{\lambda \in \{0\} \cup 2^{{\mathbb {N}}}} \left\| {{\mathfrak {D}}_s\mathcal{{A}}[a^{\lambda }_{\delta },\gamma ]g^{\lambda } }\right\| ^2_{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$

Applying Proposition 3 for each \(\lambda \) and observing that \(a_{\delta }^{\lambda } = 0\) when \(\delta ^{-1} \lesssim _{d} \lambda \), we obtain

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a_{\delta },\gamma ]g \Vert ^2_{L^2({\mathbb {R}}^{d+1})}&\lesssim _{d,\gamma } \sum _{\begin{array}{c} \lambda \in \{0\} \cup 2^{{\mathbb {N}}}: \ \lambda \lesssim \delta ^{-1} \end{array}} (\log (2+ \lambda ))^{d-1} \Vert g^{\lambda }\Vert ^2_{L^2({\mathbb {R}}^{d+1})} \\&\lesssim _{d,\gamma } (\log \delta ^{-1})^{d-1} \Vert g\Vert ^2_{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$

Combining the above inequality with Proposition 2, we deduce Theorem 1. \(\square \)

The multiplier associated to \({\mathfrak {D}}_s\mathcal{{A}}[a^{0}_{\delta },\gamma ]\) is a bounded function and so the \(\lambda = 0\) case of Proposition 3 is immediate. More interesting cases arise when \(\lambda \in 2^{{\mathbb {N}}}\).

4 The Proof of Proposition 3

4.1 Setting Up the Induction Scheme

Fix \(\lambda \in 2^{{\mathbb {N}}}\). We begin with a few basic definitions.

Definition 4

Let \(1 \le L \le d\) and \(B \ge 1\). Define \({\mathfrak {S}}(B,L)\) to be the collection of all curves \(\gamma : I \rightarrow {\mathbb {R}}^d\) such that for all \(s \in I\), we have

$$\begin{aligned} \Vert \gamma \Vert _{C^{2d}(I)} \le B \qquad \text {and} \qquad \left| \det \begin{pmatrix} \gamma ^{(1)}(s)&\cdots&\gamma ^{(L)}(s) \end{pmatrix}\right| \ge {B}^{-1}, \end{aligned}$$
(9)

where the determinant is interpreted as the square root of the sum of squares of its \(L \times L\) minors.
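Equivalently, writing \(M_{L}(s)\) for the \(d \times L\) matrix with columns \(\gamma ^{(1)}(s), \ldots , \gamma ^{(L)}(s)\), the Cauchy–Binet formula identifies this generalised determinant as

$$\begin{aligned} \left| \det \begin{pmatrix} \gamma ^{(1)}(s)&\cdots&\gamma ^{(L)}(s) \end{pmatrix}\right| = \det \big ( M_{L}(s)^{\top } M_{L}(s) \big )^{1/2}, \end{aligned}$$

which reduces to the absolute value of the usual determinant when \(L = d\).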

Definition 5

Let \(1 \le L \le d\) and \(\gamma \in {\mathfrak {S}}(B,L)\). A symbol \(a \in C^{3d}( {\mathbb {R}}^d \times I \times I)\) is said to be of type \((\lambda , A,L)\) with respect to \(\gamma \) if the following hold:

  (i) There exists a constant \(C = C(A,B) > 1\), independent of \(\lambda \), such that

    $$\begin{aligned} \mathrm{{supp}}_{\xi } \ a \subseteq \{ \xi \in {\mathbb {R}}^d: C\lambda \le |\xi | \le 2C\lambda \}. \end{aligned}$$

  (ii) \(|\partial _{s}^{\beta }a(\xi ,s,t)| \lesssim _{\beta ,A} 1 \) for \( 0 \le \beta \le 3d\) and \((\xi ,s,t) \in \mathrm{{supp}} \ a\).

  (iii) The inner product estimate

    $$\begin{aligned} A^{-1}|\xi | \le \sum _{i = 1}^L | \langle \gamma ^{(i)}(s),\xi \rangle | \le A |\xi | \qquad \text { holds for all }(\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a. \end{aligned}$$
    (10)

Proposition 3 is a consequence of the following result.

Proposition 6

Fix \(1 \le L \le d\), \(\gamma \in {\mathfrak {S}}(B,L)\) and let a be a symbol of type \((\lambda ,A,L)\) with respect to \(\gamma \). Then,

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a,\gamma ]\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d+1})} \lesssim _{A,B,d} (\log \lambda )^{(L-1)/2}. \end{aligned}$$

In view of (10), it is clear that Proposition 3 corresponds to the case \(L = d\) of Proposition 6.

Proposition 6 is proved by inducting on L. Given an arbitrary symbol \(a \in C^{3d}({\mathbb {R}}^d \times I \times I)\) and a smooth curve \(\gamma \), we present here a general argument which will be used repeatedly throughout the induction process in order to obtain favourable norm bounds for the Fourier integral operator \({\mathfrak {D}}_s\mathcal{{A}}[a,\gamma ]\). For \(g \in \mathcal{{S}}({\mathbb {R}}^{d+1})\), we aim for the estimate

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})} \lesssim _{A,B,d} \Vert g\Vert _{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$
(11)

By applying Plancherel’s theorem and the Cauchy–Schwarz inequality,

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}^2&=\int _{{\mathbb {R}}}\int _{{\mathbb {R}}^d} (1+ |\sigma |) | \mathcal{{F}}_{x,s}(\mathcal{{A}}[a,\gamma ]g)(\xi ,\sigma ) |^2 \mathrm{{d}}\xi \textrm{d}\sigma \\&\lesssim \Vert \mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}\left\| {\big (1+ \sqrt{-\partial _{s}^{2}}\big )\mathcal{{A}}[a,\gamma ]g}\right\| _{L^2({\mathbb {R}}^{d+1})} \\&\le \Vert \mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}^2 \\&\quad +\left\| {\sqrt{-\partial _{s}^{2}}\mathcal{{A}}[a,\gamma ]g}\right\| _{L^2({\mathbb {R}}^{d+1})} \Vert \mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$

Since the Hilbert transform is bounded on \(L^2\),

$$\begin{aligned} \left\| {\sqrt{-\partial _{s}^{2}}\mathcal{{A}}[a,\gamma ]g}\right\| _{L^2({\mathbb {R}}^{d+1})} \lesssim \Vert \partial _{s}\mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$

Thus, to prove (11), it suffices to show that there exists \(\Lambda > 1\) such that

$$\begin{aligned}\Vert \partial _{s}^{\iota }\mathcal{{A}}[a,\gamma ]\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d+1})} \lesssim _{A,B,d} \Lambda ^{(2\iota - 1)/2} \quad \text {for } \iota = 0,1. \end{aligned}$$
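To see why these bounds suffice, substitute them into the chain of inequalities above: together with the Hilbert transform bound, they give

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}^2 \lesssim _{A,B,d} \big ( \Lambda ^{-1} + \Lambda ^{1/2}\Lambda ^{-1/2} \big ) \Vert g\Vert _{L^2({\mathbb {R}}^{d+1})}^2 \lesssim \Vert g\Vert _{L^2({\mathbb {R}}^{d+1})}^2, \end{aligned}$$

so the powers of \(\Lambda \) cancel exactly.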

Applying Plancherel’s theorem and the Cauchy–Schwarz inequality,

$$\begin{aligned} \Vert \mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}^2&\sim _{d} \int _{I}\int _{{\mathbb {R}}^d} {\mathcal {B}}[a]\mathcal{{F}}_{x}(g)(\xi ,t) \overline{\mathcal{{F}}_{x}(g)(\xi ,t)} \mathrm{{d}}\xi \mathrm{{d}}t\\&\le \int _{{\mathbb {R}}^d}\Vert {\mathcal {B}}[a]\mathcal{{F}}_{x}(g) (\xi ,\cdot )\Vert _{L^2({\mathbb {R}})} \Vert \mathcal{{F}}_{x}(g)(\xi ,\cdot )\Vert _{L^2({\mathbb {R}})} \mathrm{{d}}\xi , \end{aligned}$$

where \({\mathcal {B}}[a]\) is the operator that integrates (in the \(t'\) variable) functions against the kernel

$$\begin{aligned} K[a](\xi ,t',t) := \int _{I} e^{i \langle (t -t')\gamma (s), \xi \rangle } a(\xi ,s,t') \overline{a(\xi ,s,t)} \mathrm{{d}}s. \end{aligned}$$
(12)

At this point, note that \(\partial _{s}\mathcal{{A}}[a,\gamma ]g\) can be expressed as \(\mathcal{{A}}[{\mathfrak {d}}_sa,\gamma ]g\), with a symbol

$$\begin{aligned} {\mathfrak {d}}_sa(\xi ,s,t):= -it\langle \gamma '(s), \xi \rangle a(\xi ,s,t) + \partial _{s}a(\xi ,s,t) \quad \text { for }(\xi ,s,t) \in {\mathbb {R}}^{d} \times I \times I. \end{aligned}$$

Applying Schur’s test, we see that (11) is a consequence of the estimates

$$\begin{aligned} \sup _{(\xi ,t') \in \mathrm{{supp}}_{\xi } \ a \times I} \Vert K[{\mathfrak {d}}_s^{\iota }a](\xi ,t',\cdot )\Vert _{L^1_t(I)} \lesssim _{A,B,d} \Lambda ^{2\iota - 1} \quad \text { for } \iota = 0,1, \end{aligned}$$
(13)

completing the discussion.
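Here, Schur’s test is applied for each fixed \(\xi \): since (12) gives \(K[a](\xi ,t',t) = \overline{K[a](\xi ,t,t')}\), the two Schur conditions coincide and

$$\begin{aligned} \Vert {\mathcal {B}}[a]F(\xi , \, \cdot \,)\Vert _{L^2_{t}(I)} \le \Big ( \sup _{t' \in I} \Vert K[a](\xi ,t', \, \cdot \,)\Vert _{L^1_{t}(I)} \Big )\, \Vert F(\xi , \, \cdot \,)\Vert _{L^2_{t}(I)}. \end{aligned}$$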

The first application of this reduction is the following lemma.

Lemma 7

(Base case) Proposition 6 holds when \(L = 1\).

Proof

Choose a curve \(\gamma \) and a symbol a that satisfies the assumptions of Proposition 6 with \(L = 1\). In particular, a is of type \((\lambda , A,1)\) with respect to \(\gamma \) and as a consequence,

$$\begin{aligned} |\langle \gamma '(s), \xi \rangle | \sim _{A} \lambda \qquad \text {holds for } (\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a. \end{aligned}$$

Following the previous discussion, we wish to obtain good decay estimates for the function \(K[{\mathfrak {d}}_s^{\iota }a]\) with \(\iota = 0,1\). Integrating by parts in (12) and using Definition 5(ii), we have

$$\begin{aligned}| K[{\mathfrak {d}}_s^{\iota }a](\xi ,t',t) | \lesssim _{A,B,N} \lambda ^{2\iota }(1 + |t - t'|\lambda )^{-N} \qquad \text {for } \iota = 0,1 \text { and } N \ge 1.\end{aligned}$$
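For instance, taking \(N = 2\) in the above display,

$$\begin{aligned} \Vert K[{\mathfrak {d}}_s^{\iota }a](\xi ,t', \, \cdot \,)\Vert _{L^1_t(I)} \lesssim _{A,B} \int _{{\mathbb {R}}} \lambda ^{2\iota }(1 + |t - t'|\lambda )^{-2} \mathrm{{d}}t= 2\lambda ^{2\iota - 1}. \end{aligned}$$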

Clearly, these decay estimates imply the required bounds (13) with \(\Lambda = \lambda \). Consequently, we obtain (11) with the implicit constant depending only on A, B and d. \(\square \)

Lemma 7 addresses the base case of Proposition 6. It remains to establish the inductive step.

Proposition 8

Suppose the statement of Proposition 6 is true for \(L = N-1\). Then it is also true for \(L = N\).

Proposition 6, and therefore Theorem 1, follow from Proposition 8 and Lemma 7. For the remainder of the section we present the proof of Proposition 8, which is broken into steps.

4.2 Initial Decomposition

To begin the proof of Proposition 8, let \(\gamma \) and a be chosen to satisfy the assumptions of Proposition 6 with \(L = N\). We apply a natural division of the symbol a. Let \(H: {\mathbb {R}}^{d+1} \rightarrow {\mathbb {R}}\) be defined as the product

$$\begin{aligned} H(\xi ,s) := \prod _{i = 1}^{N-1} \eta (A' \lambda ^{-1} \langle \gamma ^{(i)}(s), \xi \rangle ) \end{aligned}$$

where \(A'\) is a large constant which will be chosen depending only on A, B and N. Here \(\eta \) is as defined in (8). Note that

$$\begin{aligned} |\partial _{s}^{\beta }H(\xi ,s) | \lesssim _{\beta ,A,B} 1 \qquad \text { for }(\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a \text { and }\beta \in {\mathbb {N}} \cup \{0\}.\end{aligned}$$

Furthermore, (10) holds for the pair \((\gamma , a(1-H))\) with A replaced with \(A'\) and \(L = N-1\). Thus, \(a(1-H)\) is a symbol of type \((\lambda ,A',N - 1)\) with respect to \(\gamma \). Applying the induction hypothesis, we deduce the desired estimate for the part of the operator corresponding to the symbol \(a(1-H)\).

Since (10) holds with \(L = N\) in \(\mathrm{{supp}} \ a\) by assumption, the inequalities

$$\begin{aligned} (10A)^{-1}|\xi | \le |\langle \gamma ^{(N)}(s), \xi \rangle | \le A|\xi |, \end{aligned}$$
(14)
$$\begin{aligned} \sum _{i = 1}^{N-1} | \langle \gamma ^{(i)}(s),\xi \rangle | \le 10^{-10}A^{-1}|\xi | \end{aligned}$$
(15)

also hold for all \((\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ aH\), provided \(A'\) is chosen large enough depending on N and A. Henceforth, for simplicity, we write a in place of aH and therefore work with the stronger assumptions (14) and (15) on the support of a. An application of the implicit function theorem now shows that for any \(\xi \in \mathrm{{supp}}_{\xi } \ a\), there exists a unique \(\sigma (\xi ) \in I\) with

$$\begin{aligned} \langle \gamma ^{(N-1)}\circ \sigma (\xi ), \xi \rangle = 0. \end{aligned}$$
(16)

The strategy now involves a decomposition of the symbol away from the most degenerate regions in \({\mathbb {R}}^{d+1}\). Set

$$\begin{aligned} G(\xi ,s):= \sum _{i = 1}^{N-1} | \varepsilon _{0}^{-1}\lambda ^{-1}\langle \gamma ^{(i)}\circ \sigma (\xi ), \xi \rangle |^{{2}/{(N-i)}} + \varepsilon _{0}^{-2}|s - \sigma (\xi )|^{2},\end{aligned}$$

where the constant \(\varepsilon _{0}= \varepsilon _{0}(A,B)\) will be chosen small enough to satisfy the forthcoming requirements of the proof. The function G should be interpreted as measuring the distance of \((\xi ,s)\) from the co-dimension N surface

$$\begin{aligned}\Gamma := \{(\xi ,s) \in {\mathbb {R}}^d \times I: \langle \gamma ^{(i)}\circ \sigma (\xi ), \xi \rangle = 0 \text { for } 1 \le i \le N-1 \text { and } |s - \sigma (\xi )| = 0 \}.\end{aligned}$$

We now decompose the \((\xi ,s)\)-space dyadically away from \(\Gamma \). Suppose \(\eta _1, \beta _1 \in C_{c}^{\infty }({\mathbb {R}})\) are chosen such that

$$\begin{aligned} \mathrm{{supp}} \ \eta _1 \subseteq \{ r \in {\mathbb {R}}: | r | \le 4 \}, \qquad \mathrm{{supp}} \ \beta _1 \subseteq \{ r \in {\mathbb {R}}: {1}/{4} \le | r | \le 4 \} \end{aligned}$$
(17)

and

$$\begin{aligned} \eta _1(r) + \sum _{n \in {\mathbb {N}}} \beta _1(2^{-2n}r) = 1 \qquad \text {for all } r \in {\mathbb {R}}. \end{aligned}$$

Set

$$\begin{aligned} a^{n}({\xi ,s,t}) := a({\xi ,s,t}) \cdot {\left\{ \begin{array}{ll} \eta _1( \varepsilon _{1}^{2}\lambda ^{2/N} G(\xi ,s)) &{} \text {if } n = 0, \\ \beta _1(\varepsilon _{1}^{2}2^{-2n}\lambda ^{2/N} G(\xi ,s)) &{}\text {if } n \ge 1, \end{array}\right. } \end{aligned}$$
(18)

where \(\varepsilon _{1}\) will be chosen small enough (depending on \(\varepsilon _{0}\)) to satisfy the forthcoming requirements of the proof. Observe that \(a = a^0 + \sum _{n \in {\mathbb {N}}} a^n\) and this automatically induces a similar decomposition for the Fourier integral operator \({\mathfrak {D}}_s\mathcal{{A}}[a,\gamma ]\).

Since

$$\begin{aligned} |G(\xi ,s)|= O_{B,d}(\varepsilon _{0}^{-2}) \qquad \text { for all } (\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a, \end{aligned}$$
(19)

the symbols \(a^n\) are trivially zero except for \(O_{A,B}(\log \lambda )\) many values of n. Thus, by Plancherel’s theorem,

$$\begin{aligned} \left\| {{\mathfrak {D}}_s\mathcal{{A}}[\sum _{n \ge 0}a^n,\gamma ]g}\right\| _{L^2({\mathbb {R}}^{d+1})}^2&= \sum _{n \ge 0} \Vert {\mathfrak {D}}_s\mathcal{{A}}[ a^{n},\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}^2 \nonumber \\&\lesssim _{A,B} |\log \lambda |\max _{n \ge 0} \Vert {\mathfrak {D}}_s\mathcal{{A}}[ a^{n},\gamma ]g\Vert ^2_{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$
(20)

In light of the above, it remains to bound the fractional operator \({{\mathfrak {D}}_s\mathcal{{A}}[a^n,\gamma ]}\) for different values of n. The case of \(n = 0\) is dealt with by the following lemma.

Lemma 9

$$\begin{aligned} \left\| {{\mathfrak {D}}_s\mathcal{{A}}[ a^{0},\gamma ]}\right\| _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d+1})} \lesssim _{A,B,d} 1. \end{aligned}$$
(21)

The next lemma addresses all other values of n.

Lemma 10

For any \(n \ge 1\), we have

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a^{n},\gamma ]\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d+1})}&\lesssim _{A,B,d} (\log \lambda )^{{(N-2)}/{2}}. \end{aligned}$$
(22)

Assuming Lemmas 9 and 10 for now, we plug (21), (22) into (20) and obtain

$$\begin{aligned}\Vert {\mathfrak {D}}_s\mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})} \lesssim _{A,B,d} (\log \lambda )^{(N-1)/2}\Vert g\Vert _{L^2({\mathbb {R}}^{d+1})}.\end{aligned}$$

This concludes the proof of Proposition 8.

The rest of the section is dedicated to the proofs of the two key lemmas (Lemmas 9 and 10).

4.3 Proof of Lemma 9

To prove Lemma 9, we do not appeal to the induction hypothesis but directly estimate the operator.

Proof of Lemma 9

In view of the discussion around (11) and (13), it suffices to show

$$\begin{aligned} |K[{\mathfrak {d}}_{s}^{\iota }a^{0}](\xi ,t',t)|\lesssim _{A,B,d} \lambda ^{(2\iota - 1)/{N}} \qquad \text { for }\iota = 0,1\text { and } (\xi ,t',t) \in {\mathbb {R}}^d \times I \times I. \end{aligned}$$
(23)

Indeed, (23) implies (13) with \(a = a^{0}\) and \(\Lambda = \lambda ^{1/N}\), which in turn gives (21) as discussed above.

The estimate (23) for \(\iota = 0\) is immediate from (12), as \(\mathrm{{supp}}_{s} \ a^{0}(\xi ,\cdot ,\cdot )\) is contained in an interval of length \(O_{A,B}(\lambda ^{-1/N})\) for any fixed \(\xi \in {\mathbb {R}}^d\). By (12) again, the case \(\iota = 1\) becomes evident once we verify the estimates

$$\begin{aligned} |\langle \gamma '(s), \xi \rangle | + |\partial _{s}(a^{0})(\xi ,s,t)|\lesssim _{A,B,d} \lambda ^{{1}/{N}} \qquad \text { for }(\xi ,s,t) \in \mathrm{{supp}} \ a^{0}. \end{aligned}$$

It is easy to see that \(|\partial _{s} (a^{0})(\xi ,s,t)|\lesssim _{A,B} \lambda ^{1/N}\). To estimate the remaining term, note that for any \(1 \le i \le N\), we have

$$\begin{aligned} |\langle \gamma ^{(i)}\circ \sigma (\xi ), \xi \rangle |\lesssim _{A,B,N} \lambda \lambda ^{(i- N)/N} \qquad \text {and} \qquad |s - \sigma (\xi )|\lesssim _{A,B} \lambda ^{-1/N} \end{aligned}$$

for \((\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a^{0}\); these bounds follow from the definition (18) of \(a^{0}\), which forces \(G(\xi ,s) \lesssim \varepsilon _{1}^{-2}\lambda ^{-2/N}\) on its support, together with (9). Using Taylor’s theorem,

$$\begin{aligned} |\langle \gamma ^{(1)}(s), \xi \rangle |&\le \sum _{j = 1}^{N-1} |\langle \gamma ^{(j)}\circ \sigma (\xi ), \xi \rangle |\frac{|s - \sigma (\xi )|^{j - 1}}{(j - 1)!} + B|\xi |\frac{|s- \sigma (\xi )|^{N - 1}}{(N-1)!} \\&\lesssim _{A,B,d} \lambda ^{1/N} \end{aligned}$$

for \((\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a^{0}\), as required. Thus, we obtain (23) and consequently (21). \(\square \)

4.4 Further Decomposition

In order to prove Lemma 10 we must introduce a further decomposition of the symbol. Let \(\zeta \in C_{c}^{\infty }({\mathbb {R}})\) be chosen such that \(\mathrm{{supp}} \ \zeta \subseteq ~[-1,1]\) and \(\sum _{\nu \in {\mathbb {Z}}} \zeta (\,\cdot \, - \nu ) = 1\). For \(n \in {\mathbb {N}}\) and \(\nu \in {\mathbb {Z}}\), consider the symbol

$$\begin{aligned} a^{n,\nu }({\xi ,s,t}) := a^{n}({\xi ,s,t})\zeta (2^{-n}\lambda ^{1/N}(s - s_{n,\nu })) \end{aligned}$$
(24)

where \(s_{n,\nu }:= 2^n\lambda ^{-{1}/{N}}\nu \). Observe that the original symbol is recovered as the sum

$$\begin{aligned} a = \sum _{n= 0}^{C\log (\lambda )} a^{n} = a^{0} + \sum _{n= 1}^{C\log (\lambda )}\sum _{\nu \in {\mathbb {Z}}} a^{n,\nu } , \end{aligned}$$
(25)

where C is a constant that depends only on A, B. The following lemma records a basic property of the localised symbols, which is useful later in the proof.

Lemma 11

Let \(n \ge 1, \nu \in {\mathbb {Z}}\) and \(\rho := 2^n\lambda ^{-{1}/{N}}\). For any \((\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a^{n,\nu }\), we have

$$\begin{aligned} \sum _{i = 1}^{N-1} \rho ^{i - N}| \langle \gamma ^{(i)}(s), \xi \rangle | \sim _{A,B,d} |\xi | \sim \lambda . \end{aligned}$$
(26)

Proof

The upper bound in (26) is easier to prove than the lower bound and follows from a similar argument. Consequently, we will focus only on the lower bound.

Fix \(n \ge 1\) and \(\nu \in {\mathbb {Z}}\). Recall from the definitions that

$$\begin{aligned} \varepsilon _{1}^{-2}/4 \le \sum _{i = 1}^{N-1} |\varepsilon _{0}^{-1}\lambda ^{-1}\rho ^{i - N}\langle \gamma ^{(i)}\circ \sigma (\xi ), \xi \rangle |^{2/(N-i)} + | \varepsilon _{0}^{-1}\rho ^{-1}(s - \sigma (\xi )) |^2 \le 4\varepsilon _{1}^{-2} \end{aligned}$$
(27)

for all \((\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a^{n,\nu }\). Fixing \(\xi \), we now consider two cases depending on which terms of the above sum dominate.

Case 1: Suppose \((\varepsilon _{0}\varepsilon _{1}^{-1}\rho )/4 \le |s - \sigma (\xi )|\). By the Mean Value Theorem, there exists \(s_{*} \in I\) between s and \(\sigma (\xi )\) such that

$$\begin{aligned} \langle \gamma ^{(N-1)}(s), \xi \rangle - \langle \gamma ^{(N-1)}\circ \sigma (\xi ), \xi \rangle = \langle \gamma ^{(N)}(s_{*}), \xi \rangle (s-\sigma (\xi )).\end{aligned}$$

Combining this with (14) and (16), we deduce that \(|\langle \gamma ^{(N-1)}(s), \xi \rangle |\gtrsim _{A} \lambda (\varepsilon _{0}\varepsilon _{1}^{-1}\rho )\). This gives the lower bound in (26).

Case 2: Suppose Case 1 fails. Using (27), we can find \(1 \le i_0 \le N-2\) such that

$$\begin{aligned} c_N \varepsilon _{0}\lambda (\varepsilon _{1}^{-1}\rho )^{N - i_0}\le | \langle \gamma ^{(i_0)}\circ \sigma (\xi ), \xi \rangle | \le 2^N\varepsilon _{0}\lambda (\varepsilon _{1}^{-1}\rho )^{N - i_0}, \end{aligned}$$
(28)

with \(c_N:= (4N)^{-N}\), whilst \(|s - \sigma (\xi )| \le \varepsilon _{0}\varepsilon _{1}^{-1}\rho \) and

$$\begin{aligned} |\langle \gamma ^{(i)}\circ \sigma (\xi ), \xi \rangle | \le 2^N\varepsilon _{0}\lambda (\varepsilon _{1}^{-1}\rho )^{N - i} \qquad \text { for all }i_0 < i \le N-1. \end{aligned}$$

By Taylor’s theorem,

$$\begin{aligned} |\langle \gamma ^{(i_0)}(s), \xi \rangle -&\langle \gamma ^{(i_0)} \circ \sigma (\xi ), \xi \rangle |\nonumber \\&\le \sum _{i = i_0 + 1}^{N - 1} 2^N\varepsilon _{0}^{1+i - i_0}\lambda (\varepsilon _{1}^{-1}\rho )^{N - i} (\varepsilon _{1}^{-1}\rho )^{i - i_0} + B\lambda (\varepsilon _{0}\varepsilon _{1}^{-1}\rho )^{N-i_0} \nonumber \\&\le (c_N \varepsilon _{0}/2) \lambda (\varepsilon _{1}^{-1}\rho )^{N - i_0}, \end{aligned}$$
(29)

provided the constant \(\varepsilon _{0}\) is chosen small enough depending on B and N. Combining (28) and (29), we deduce that

$$\begin{aligned} |\langle \gamma ^{(i_0)}(s), \xi \rangle |\sim _{\varepsilon _0} \lambda (\varepsilon _{1}^{-1}\rho )^{N - i_0} \qquad \text {for all }s \in \mathrm{{supp}}_{s} \ a^{n,\nu }, \end{aligned}$$

which implies the lower bound in (26). \(\square \)

In view of (25), we restrict our attention to \({\mathfrak {D}}_s\mathcal{{A}}[a^{n,\nu },\gamma ]\) for fixed \(n \in {\mathbb {N}}\) and \(\nu \in {\mathbb {Z}}\). Before proceeding to its analysis, we make the following elementary observation about the size of \(\rho := 2^{n}\lambda ^{-1/N}\). From the definition (17) of \(\beta _1\), note that

$$\begin{aligned} \varepsilon _{1}^{-2}/4 \le \rho ^{-2}G(\xi ,s) \quad \text {for} \quad (\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a^{n}. \end{aligned}$$

Combining this with (19), we deduce that \(\rho = O_{B,d}( \varepsilon _{1}\varepsilon _{0}^{-1}).\) Thus, by choosing \(\varepsilon _{1}\) small enough (depending on \(\varepsilon _{0},B,d\)), we can assume that

$$\begin{aligned} \rho \le B^{-2d}. \end{aligned}$$
(30)

In the following subsections, the norm bounds for the operator \({\mathfrak {D}}_s\mathcal{{A}}[a^{n,\nu },\gamma ]\) are obtained using the induction hypothesis via a method of rescaling.

4.5 Rescaling for the Curve

In this subsection, we introduce the rescaling map in a generic setup and describe its basic properties which will play a crucial role in the proof of Lemma 10.

For \(\gamma \in {\mathfrak {S}}(B,N)\) and \(s_{\circ } \in I\), let

$$\begin{aligned}V_{s_{\circ }}^N:= \textrm{span}\{\gamma ^{(1)}(s_{\circ }),\ldots , \gamma ^{(N)}(s_{\circ })\}.\end{aligned}$$

Using (9), note that \(\dim V_{s_{\circ }}^N = N\). For \(0< \rho < 1,\) define a linear operator \(T_{s_{\circ }, \rho }^N\) such that

$$\begin{aligned} T_{s_{\circ }, \rho }^N \left( \gamma ^{(i)}(s_{\circ })\right) := \rho ^{i} \gamma ^{(i)}(s_{\circ }) \qquad \text {for }1 \le i \le N \end{aligned}$$
(31)

and

$$\begin{aligned} T_{s_{\circ }, \rho }^N v = \rho ^{N} v \qquad \text { for }v \in (V_{s_{\circ }}^N)^{\perp }. \end{aligned}$$

It is clear that \(T_{s_{\circ }, \rho }^N\) is a well-defined map such that

$$\begin{aligned} \Vert (T_{s_{\circ }, \rho }^N)^{-1}\Vert \lesssim _{B} \rho ^{-N}. \end{aligned}$$
(32)
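This estimate can be seen as follows: by (9), the vectors \(\gamma ^{(1)}(s_{\circ }), \ldots , \gamma ^{(N)}(s_{\circ })\) form a well-conditioned basis of \(V_{s_{\circ }}^N\), so any unit vector \(v \in {\mathbb {R}}^d\) can be written as \(v = \sum _{i=1}^{N} c_i \gamma ^{(i)}(s_{\circ }) + v^{\perp }\) with \(v^{\perp } \in (V_{s_{\circ }}^N)^{\perp }\) and \(|c_i| \lesssim _{B} 1\). Since \(0< \rho < 1\),

$$\begin{aligned} \big | \big (T_{s_{\circ }, \rho }^N\big )^{-1} v \big | \le \sum _{i=1}^{N} |c_i| \, \rho ^{-i} |\gamma ^{(i)}(s_{\circ })| + \rho ^{-N} |v^{\perp }| \lesssim _{B} \rho ^{-N}. \end{aligned}$$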

Supposing \([s_{\circ } - \rho , s_{\circ } + \rho ] \subseteq I\), we define the rescaled curve

$$\begin{aligned} \gamma _{s_{\circ },\rho }^N(s):= \big (T_{s_{\circ }, \rho }^N\big )^{-1} (\gamma (s_{\circ } +\rho s) - \gamma (s_{\circ })). \end{aligned}$$

For simplicity, we introduce the notation

$$\begin{aligned} T := T^{N}_{s_{\circ },\rho }, \quad T^* := \big (T^{N}_{s_{\circ },\rho }\big )^{-\top }\quad \text {and} \quad {\tilde{\gamma }} := \gamma _{s_{\circ },\rho }^N. \end{aligned}$$
(33)

The following lemma verifies nondegeneracy assumptions for the rescaled curve.

Lemma 12

For \(0 < \rho \le B^{-2d}\) and \(\gamma \in {\mathfrak {S}}(B,N)\), the rescaled curve \({\tilde{\gamma }}\) as in (33) lies in \({\mathfrak {S}}(B_1,N-1)\) where \(B_1\) depends only on B and N.

A key feature of Lemma 12 is that the parameter \(B_1\) is independent of \(\rho \).

Proof of Lemma 12

We begin by verifying the first part of (9) for the curve \({\widetilde{\gamma }}\). From the definition, we see that \({\tilde{\gamma }}^{(i)}(s) = \rho ^{i} T^{-1} \gamma ^{(i)}(s_{\circ } + \rho s)\) for any \(i \in {\mathbb {N}}\). Combining this identity with (9) and (32), we deduce that

$$\begin{aligned} \left\| {{\tilde{\gamma }}^{(i)}}\right\| _{L^{\infty }(I)} = O_{B}( \rho ) \qquad \text {whenever }N+1 \le i \le 2d. \end{aligned}$$
(34)

Let \(1 \le i \le N\). By Taylor’s theorem, (31) and (32), we have

$$\begin{aligned} {\tilde{\gamma }}^{(i)}(s)&= \rho ^{i} \sum _{j = i}^{N} T^{-1} \gamma ^{(j)}(s_{\circ })\frac{(\rho s)^{j-i}}{(j-i)!} + O_{B}(\Vert T^{-1}\Vert \rho ^{N+1}) \nonumber \\&= \sum _{j = i}^{N} \gamma ^{(j)}(s_{\circ })\frac{s^{j-i}}{(j-i)!} + O_{B}(\rho ). \end{aligned}$$
(35)

Combining (35) with (9), we obtain uniform size estimates for \({\tilde{\gamma }}^{(i)}(s)\) when \(1 \le i \le N\). Together with (34), this implies

$$\begin{aligned} \Vert {\tilde{\gamma }}\Vert _{C^{2d}(I)} \lesssim _{B} 1. \end{aligned}$$
(36)

It remains to verify the second part in (9) for the curve \({\tilde{\gamma }}\) and \(L = N-1\). In view of (36), it suffices to obtain a lower bound for the determinant of the \(d \times N\) matrix whose column vectors are formed by \(({\tilde{\gamma }}^{(i)}(s))_{1\le i \le N}\) for \(s \in I\). Observe that using the multilinearity of the determinant and elementary column operations, (35) gives

$$\begin{aligned} \left| \det \begin{pmatrix} {\tilde{\gamma }}^{(1)}(s)&\cdots&{\tilde{\gamma }}^{(N)}(s) \end{pmatrix}\right| = \left| \det \begin{pmatrix} {\gamma }^{(1)}(s_{\circ })&\cdots&{\gamma }^{(N)}(s_{\circ }) \end{pmatrix}\right| + O_B(\rho ). \end{aligned}$$

By the hypothesis of the lemma, \(\rho \) is small enough so that the above identity combined with (9) gives the estimate

$$\begin{aligned} \left| {\det \begin{pmatrix} {\tilde{\gamma }}^{(1)}(s)&\cdots&{\tilde{\gamma }}^{(N)}(s) \end{pmatrix}}\right| \ge (2B)^{-1}. \end{aligned}$$

Now, an application of (36) (in particular, \(|{\tilde{\gamma }}^{(N)}(s)|\lesssim _{B} 1\)) completes the proof of (9) for \(\gamma = {\tilde{\gamma }}\), \(L = N-1\) and B replaced with a new constant \(B_1\). \(\square \)

The rescaling map \(T^{N}_{s_{\circ },\rho }\) can be used to introduce a rescaling for the operators we are interested in. This is done in the next subsection.

4.6 Rescaling for the Operator

To introduce the operator rescaling, we begin by considering a Schwartz function \(u: {\mathbb {R}} \rightarrow {\mathbb {R}}\). Let \(s_{\circ } \in I\) and \(0< \rho < 1\). Direct computations give

$$\begin{aligned} \Big [\Big (1+ \sqrt{-\partial _{s}^{2}}\Big )^{1/2}u\Big ](s_{\circ } + \rho s) = \rho ^{-1/2}\Big [\Big (\rho + \sqrt{-\partial _{s}^{2}}\Big )^{1/2} {\tilde{u}}\Big ](s), \end{aligned}$$

where \({\tilde{u}}(s):= u(s_{\circ } + \rho s)\). Thus,

$$\begin{aligned} \left\| {\Big (1+ \sqrt{-\partial _{s}^{2}}\Big )^{1/2}u}\right\| _{L^2({\mathbb {R}})}^2&\sim \int _{{\mathbb {R}}}|(\rho + |\sigma |)^{1/2}\mathcal{{F}}_{s}({\tilde{u}})(\sigma )|^{2} \textrm{d}\sigma \nonumber \\&\le \int _{{\mathbb {R}}}|(1 + |\sigma |)^{1/2}\mathcal{{F}}_{s}({\tilde{u}})(\sigma )|^{2} \textrm{d}\sigma \nonumber \\&= \left\| {\Big (1+ \sqrt{-\partial _{s}^{2}}\Big )^{1/2} {\tilde{u}}}\right\| _{L^2({\mathbb {R}})}^{2}. \end{aligned}$$
(37)
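The pointwise identity preceding (37) can be verified on the Fourier side: with the normalisation of Sect. 2, one has \(\mathcal{{F}}_{s}({\tilde{u}})(\tau ) = \rho ^{-1} e^{i s_{\circ } \tau /\rho }\, \mathcal{{F}}_{s}(u)(\tau /\rho )\), so substituting \(\sigma = \tau /\rho \) in the Fourier inversion formula for \(\big (1+ \sqrt{-\partial _{s}^{2}}\big )^{1/2}u\) and using

$$\begin{aligned} (1+ |\tau /\rho |)^{1/2} = \rho ^{-1/2} (\rho + |\tau |)^{1/2} \end{aligned}$$

yields the claim.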

For an arbitrary symbol \(a \in C^{2d}({\mathbb {R}}^d \times I \times I)\) and \(\gamma \in {\mathfrak {S}}(B,N)\), recall the definition of \(\mathcal{{A}}[a,\gamma ]\) from (3). Temporarily fixing \(x \in {\mathbb {R}}^d\), set

$$\begin{aligned} u(s) = \mathcal{{A}}[a,\gamma ]g (x,s) \qquad \text {and} \qquad \tilde{\mathcal{{A}}}[a,\gamma ]g(x,s) := \mathcal{{A}}[a,\gamma ]g(x,s_{\circ } + \rho s). \end{aligned}$$
(38)

By combining (37) for each \(x \in {\mathbb {R}}^d\) with Fubini’s theorem,

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a,\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})} \lesssim \left\| {\Big (1+ \sqrt{-\partial _{s}^{2}}\Big )^{1/2}\tilde{\mathcal{{A}}}[a,\gamma ]g}\right\| _{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$
(39)

We claim that for \((x,s) \in {\mathbb {R}}^{d+1}\), the identity

$$\begin{aligned} \tilde{\mathcal{{A}}}[a,\gamma ]g(x,s) = |\det {T^*}|^{1/2}\mathcal{{A}}[{\tilde{a}},{\widetilde{\gamma }}]{\tilde{g}}(T^{-1}x, s) \end{aligned}$$
(40)

holds with T, \({\tilde{\gamma }}\) as in (33), symbol

$$\begin{aligned} {\tilde{a}}(\xi ,s,t) := a(T^*\xi , s_{\circ } +\rho s, t) \end{aligned}$$

and input function \({\tilde{g}}\) defined by

$$\begin{aligned} \mathcal{{F}}_{x}({\tilde{g}})(\xi ,t) := |\det {T^*}|^{1/2} e^{it \langle T^{-1}\gamma (s_{\circ }), \xi \rangle }\mathcal{{F}}_{x}(g)(T^*\xi ,t). \end{aligned}$$

Verifying (40) is just a matter of unwinding the definitions. First, we expand \(\tilde{\mathcal{{A}}}[a,\gamma ]g(x,s)\) using (38) as the oscillatory integral

$$\begin{aligned} \int _{{\mathbb {R}}^{d} \times I} e^{i \langle x - t(\gamma (s_{\circ } +\rho s) - \gamma (s_{\circ })), \xi \rangle } a(\xi ,s_{\circ } +\rho s,t) e^{i t\langle \gamma (s_{\circ }), \xi \rangle } \mathcal{{F}}_{x}(g)(\xi ,t) \mathrm{{d}}\xi \mathrm{{d}}t. \end{aligned}$$

Applying the change of variables \(\xi \mapsto T^* \xi \), the above expression can be written as

$$\begin{aligned}&|\det {T^*}|^{1/2} \int _{{\mathbb {R}}^d \times {\mathbb {R}}} e^{i \langle T^{-1}x - t{\widetilde{\gamma }}(s), \xi \rangle } a(T^*\xi ,s_{\circ } +\rho s,t) \mathcal{{F}}_{x}({\tilde{g}})(\xi ,t) \mathrm{{d}}\xi \mathrm{{d}}t\nonumber \\&\qquad = |\det {T^*}|^{1/2}(\mathcal{{A}}[{\tilde{a}},{\widetilde{\gamma }}]{\tilde{g}})(T^{-1}x, s), \end{aligned}$$
(41)

proving the claim (40).

Fix \(n \in {\mathbb {N}}\), \(\nu \in {\mathbb {Z}}\) and recall the definitions of \(a^{n,\nu }\) and \(s_{n,\nu }\) from Sect. 4.4. Consider the rescaling map T as defined in Sect. 4.5 for

$$\begin{aligned}s_{\circ } = s_{n,\nu } \qquad \text {and} \qquad \rho = 2^n \lambda ^{-1/N}.\end{aligned}$$

Furthermore, consider the operator rescaling as in (40) for \(a = a^{n,\nu }\). In this setup, we record some of the basic properties of how \(T^*\) (as defined in (33)) interacts with \({\tilde{a}}\).

Lemma 13

The rescaling map \(T^*\) satisfies the estimate

$$\begin{aligned} \rho ^{-N}|\xi | \lesssim _{A,B} |T^* \,\xi |\lesssim _{B}\rho ^{-N} |\xi | \qquad \text {for all} \quad \xi \in \mathrm{{supp}}_{\xi } \ {\tilde{a}}. \end{aligned}$$
(42)

Proof

Fix \(1 \le i \le N\). From the definition of T, we have

$$\begin{aligned} \langle \gamma ^{(i)}(s_{n,\nu }), \xi \rangle = \rho ^{i}\langle T^{-1} \gamma ^{(i)}(s_{n,\nu }), \xi \rangle = \rho ^{i}\langle \gamma ^{(i)}(s_{n,\nu }), T^* \xi \rangle . \end{aligned}$$
(43)

Fix \(\xi \in \mathrm{{supp}}_{\xi } \ {\tilde{a}}\) so that, by the definition of the rescaled symbol, \(T^* \, \xi \in \mathrm{{supp}}_{\xi } \ a^{n,\nu }\). Using Lemma 11 when \(i \le N-1\) and the Cauchy–Schwarz inequality (or (14)) when \(i = N\), we obtain

$$\begin{aligned}|\langle \gamma ^{(i)}(s_{n,\nu }), T^* \xi \rangle |\lesssim _{A,B} \rho ^{N-i} |T^* \xi |\quad \hbox { for}\ 1 \le i \le N.\end{aligned}$$

Combining this with (43), we deduce that

$$\begin{aligned} |\langle \gamma ^{(i)}(s_{n,\nu }), \xi \rangle |\lesssim _{A,B} \rho ^{N}|T^* \, \xi |. \end{aligned}$$
(44)

On the other hand, if \( v \in (V^{N}_{s_{n,\nu }})^{\perp }\) is a unit vector, one can argue as in (43) to have

$$\begin{aligned} |\langle v, \xi \rangle |= |\rho ^N\langle v, T^* \xi \rangle |\le |T^* \xi | \end{aligned}$$
(45)

where the fact \(\rho < 1\) has been used. Combining (44), (45) and (9), we obtain the lower bound in (42). The upper bound follows from (32). \(\square \)

The following lemma now verifies how rescaling improves the type condition of the symbol.

Lemma 14

The rescaled symbol \({\tilde{a}}\) is of type \((\rho ^{N}\lambda , A_1, N-1)\) with respect to \({\widetilde{\gamma }}\), where \(A_1\) depends only on AB and N.

Proof

By Lemma 13, it is clear that

$$\begin{aligned} \mathrm{{supp}}_{\xi } \ {\tilde{a}} \subseteq \{\xi \in {\mathbb {R}}^d: |\xi | \sim _{A,B} \rho ^{N}\lambda \}.\end{aligned}$$

Since \(0< \rho < 1\) by (30), the estimates \(|\partial _{s}^{\beta }{\tilde{a}}(\xi ,s,t)|\lesssim _{\beta } 1\) for \((\xi ,s,t) \in \mathrm{{supp}} \ {\tilde{a}}\) follow from the corresponding derivative estimates for a. Thus, it remains to verify that (10) holds for the rescaled setup for \(L = N-1\), \(\gamma = {\widetilde{\gamma }}\) and \(a = {\tilde{a}}\). Explicitly, we wish to show

$$\begin{aligned} \sum _{i = 1}^{N-1} |\langle {\widetilde{\gamma }}^{(i)}(s), \xi \rangle |\sim _{A,B} |\xi | \quad \text {for all } (\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ {\tilde{a}}.\end{aligned}$$

To this end, we recall from Lemma 11 that

$$\begin{aligned} \sum _{i = 1}^{N-1} \rho ^{i - N}| \langle \gamma ^{(i)}(s), \xi \rangle | \sim _{A,B} \lambda \quad \text { for all }(\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ a^{n,\nu }. \end{aligned}$$
(46)

On the other hand, unwinding the definitions gives

$$\begin{aligned} \langle {\widetilde{\gamma }}^{(i)}(s),\xi \rangle = \rho ^{i} \langle T^{-1}\gamma ^{(i)}(s_{n,\nu } + \rho s),\xi \rangle = \rho ^{i} \langle \gamma ^{(i)}(s_{n,\nu } + \rho s), T^*\,\xi \rangle . \end{aligned}$$

Thus, by (46) and Lemma 13,

$$\begin{aligned} \sum _{i = 1}^{N-1} | \langle {\widetilde{\gamma }}^{(i)}(s),\xi \rangle | \sim _{A,B} \rho ^{N}|T^*\,\xi |\sim |\xi | \quad \text {for all } (\xi ,s) \in \mathrm{{supp}}_{\xi ,s} \ {\tilde{a}}, \end{aligned}$$

which is the required estimate. \(\square \)

4.7 Proof of Lemma 10

With all the available tools, the operator \({\mathfrak {D}}_s\mathcal{{A}}[a^{n},\gamma ]\) can be estimated easily for \(n \ge 1\).

Proof of Lemma 10

Fix \(n \ge 1\). Temporarily fix \(\nu \in {\mathbb {Z}}\). In view of (39) and (40) for \(a = a^{n,\nu }\), we have

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a^{n,\nu },\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})} \lesssim \Vert {\mathfrak {D}}_s\mathcal{{A}}[{\tilde{a}},{\widetilde{\gamma }}]{\tilde{g}}\Vert _{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$
(47)

Suppose \({\tilde{\zeta }} \in C_c^{\infty }({\mathbb {R}})\) is chosen such that \(\mathrm{{supp}} \ {\tilde{\zeta }} \subseteq [-4,4]\), \({\tilde{\zeta }}(r) = 1\) when \(| r | \le 3\) and

$$\begin{aligned} \sum _{\nu \in {\mathbb {Z}}} {\tilde{\zeta }}(\, \cdot \, - \nu ) \lesssim 1. \end{aligned}$$

In view of the support properties of \(a^{n,\nu }\) (in particular, (18) and (24)), we have

$$\begin{aligned} {\tilde{\zeta }}(\varepsilon _{0}^{-1}\varepsilon _{1}\rho ^{-1}(\sigma (T^{*}\xi ) - s_{n,\nu })) = 1 \qquad \text {for }\xi \in \mathrm{{supp}}_{\xi } \ {\tilde{a}}. \end{aligned}$$

Consequently, recalling the integral expression (41), it is clear that one can replace \({\tilde{g}}\) with \({\tilde{g}}^{n,\nu }\) in (47) where

$$\begin{aligned} {\tilde{g}}^{n,\nu }:= {\tilde{\zeta }}\left( \varepsilon _{0}^{-1}\varepsilon _{1}\rho ^{-1}(\sigma \circ T^{*}(\tfrac{1}{i}\partial _{x}) - s_{n,\nu })\right) {\tilde{g}}. \end{aligned}$$

Now, Lemmas 12 and 14 ensure that the rescaled pair \(({\tilde{a}}, {\widetilde{\gamma }})\) satisfies the assumptions of Proposition 6 with \(L = N-1\) (note that (30) ensures that \(\rho \) is of the right size, so Lemma 12 applies). Thus, the induction hypothesis applies and we obtain

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[{\tilde{a}},{\widetilde{\gamma }}]{\tilde{g}}^{n,\nu }\Vert _{L^2({\mathbb {R}}^{d+1})}&\lesssim _{A,B,d} (\log \rho ^{N}\lambda )^{{(N-2)}/{2}} \Vert {\tilde{g}}^{n,\nu }\Vert _{L^2({\mathbb {R}}^{d+1})} \nonumber \\&\lesssim _{A,B,d} (\log \lambda )^{{(N-2)}/{2}} \Vert {\tilde{g}}^{n,\nu }\Vert _{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$
(48)

Thus, the proof of Lemma 10 reduces to summing the above estimates in \(\nu \) without further loss in \(\lambda \). Using (25), Plancherel’s theorem and the support properties of symbols \(a^{n,\nu }\), we combine (48) for different values of \(\nu \) to deduce that

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a^{n},\gamma ]g\Vert _{L^{2}({\mathbb {R}}^{d+1})}^{2}&\lesssim _{d} \sum _{\nu \in {\mathbb {Z}}} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a^{n,\nu },\gamma ]g\Vert _{L^{2}({\mathbb {R}}^{d+1})}^{2} \nonumber \\&\lesssim _{A,B,d} (\log \lambda )^{{N-2}} \sum _{\nu \in {\mathbb {Z}}} \Vert {\tilde{g}}^{n,\nu }\Vert _{L^{2}({\mathbb {R}}^{d+1})}^{2}. \end{aligned}$$

After a change of variable, it is evident that \(\Vert {\tilde{g}}^{n,\nu }\Vert _{L^2({\mathbb {R}}^{d+1})} = \Vert g^{n,\nu }\Vert _{L^2({\mathbb {R}}^{d+1})}\), where

$$\begin{aligned} g^{n,\nu }:= {\tilde{\zeta }}\left( \varepsilon _{0}^{-1}\varepsilon _{1}\rho ^{-1}(\sigma (\tfrac{1}{i}\partial _{x}) - s_{n,\nu })\right) g. \end{aligned}$$

Thus, by another application of Plancherel’s theorem,

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a^{n},\gamma ]g\Vert _{L^2({\mathbb {R}}^{d+1})}^{2}&\lesssim _{A,B,d} (\log \lambda )^{{N-2}} \sum _{\nu \in {\mathbb {Z}}}\Vert g^{n,\nu }\Vert _{L^2({\mathbb {R}}^{d+1})}^2 \\&\lesssim _{A,B,d} (\log \lambda )^{{N-2}} \Vert g\Vert _{L^2({\mathbb {R}}^{d+1})}^2 \end{aligned}$$

concluding the proof. \(\square \)

In the next section, we discuss the sharpness of the main theorem.

5 Sharpness of Theorem 1

By testing the maximal function against standard examples, we discuss the sharpness of Theorem 1 in two directions: the range of p and the dependence of the operator norm on \(\log \delta ^{-1}\).

5.1 Sharpness of the Range of p

Fix \(p \in [1,\infty )\) and assume that given any \(\epsilon > 0\), we have

$$\begin{aligned} \Vert \mathcal{{N}_{\delta }^{\gamma }}\Vert _{L^p({\mathbb {R}}^{d+1}) \rightarrow L^p({\mathbb {R}}^{d})} \lesssim _{\epsilon } \delta ^{-\epsilon } \qquad \text {for all }0< \delta < 1. \end{aligned}$$
(49)

Temporarily fix \(\epsilon \) and \(\delta \). Set \(g_{\delta }:= \chi _{B({\varvec{v}},\delta )}\), where \({\varvec{v}}:= (0,-1) \in {\mathbb {R}}^{d} \times {\mathbb {R}}\). It is easy to show that the \(\delta \)-neighbourhood of the curve \(-\gamma \) is a subset of the super-level set

$$\begin{aligned} \{ x \in {\mathbb {R}}^d: |\mathcal{{N}_{\delta }^{\gamma }}g_{\delta }(x)|\gtrsim \delta \}. \end{aligned}$$

Applying Chebyshev’s inequality and using (49), we have

$$\begin{aligned} \delta \delta ^{(d-1)/p} \lesssim _{\epsilon } \delta ^{(d+1)/p - \epsilon }. \end{aligned}$$

Letting \(\delta \rightarrow 0\), we see that the exponents must satisfy \(1 + (d-1)/p \ge (d+1)/p - \epsilon \), that is, \(p \ge 2/(1+\epsilon )\). Letting \(\epsilon \rightarrow 0\), we conclude that \(p \ge 2\). Thus, the \(L^p\) operator norm of \(\mathcal{{N}_{\delta }^{\gamma }}\) has polynomial blowup in \(\delta ^{-1}\) for \(p \in [1,2)\).

5.2 Sharpness of the Operator Norm

Fix \(0< \delta < 1\). Consider two vectors \({\varvec{w}}:= (x,0), \ {\varvec{z}}:= (y,0) \in {\mathbb {R}}^{d+1}\). It follows from the definition that

$$\begin{aligned} ({\varvec{w}} + T_{\delta }(r)) \cap ({\varvec{z}} + T_{\delta }(s)) \ne \emptyset \end{aligned}$$

if and only if there exists a \(t \in [-1,1]\) such that

$$\begin{aligned} (x - y) + t(\gamma (r) - \gamma (s)) = O(\delta ). \end{aligned}$$
(50)

Assuming \(|\gamma (s)|\sim 1\) for all \(s \in [-1,1]\), it is also not hard to see that

$$\begin{aligned} \textrm{Vol}_{{\mathbb {R}}^{d+1}}\big (({\varvec{w}} + T_{10\delta }(r)) \cap ({\varvec{z}} + T_{\delta }(s))\big ) \sim \frac{\delta ^{d+1}}{\delta + |\gamma (r) - \gamma (s)|} \end{aligned}$$
(51)

whenever (50) holds.
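The estimate (51) is an instance of the familiar two-tube intersection bound: the tubes have cross-sections of measure \(\sim \delta ^{d}\), and their directions \((\gamma (r) \ 1)^{\top }\) and \((\gamma (s) \ 1)^{\top }\) make an angle \(\sim |\gamma (r) - \gamma (s)|\), so the tubes can only overlap along a segment of length \(\sim \min (1, \delta /|\gamma (r) - \gamma (s)|)\).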

Fixing \((x,r) \in {\mathbb {R}}^d \times I\), set \(f_{\delta }:= \chi _{ {\varvec{w}} + T_{10\delta }(r)}\) and note that \(\Vert f_{\delta }\Vert _{L^2({\mathbb {R}}^{d+1})} \sim \delta ^{d/2}\). For each \(0 \le k \le \lfloor \log (\delta ^{-1}) \rfloor \), define

$$\begin{aligned} A_{k}:= \{ y \in {\mathbb {R}}^d: |\mathcal{{N}_{\delta }^{\gamma }}f_{\delta }(y)|\sim 2^{-k} \}. \end{aligned}$$

We claim that

$$\begin{aligned} |A_k|\gtrsim 2^{2k}\delta ^{d}. \end{aligned}$$

Indeed, in view of (51), \(A_{k}\) contains all points \(y \in {\mathbb {R}}^d\) for which there exists \(s \in [-1,1]\) such that (50) holds and \(|\gamma (s) - \gamma (r)|\sim 2^{k}\delta \); for such \(y\), the corresponding averages are of size \(\sim \delta /(\delta + |\gamma (s) - \gamma (r)|) \sim 2^{-k}\). The condition \(|\gamma (s) - \gamma (r)|\sim 2^{k}\delta \) ensures that the admissible directions \(\gamma (s)\) belong to a portion of the curve contained in a ball of radius \(\sim 2^{k}\delta \). Moreover, for a fixed direction \(\gamma (s)\), any \(y \in {\mathbb {R}}^d\) lying in the \(\delta \)-neighbourhood of the tube \(x + \{ t(\gamma (r) - \gamma (s)): t \in [-1,1] \}\) satisfies (50). Therefore, \(A_k\) contains the \(\delta \)-neighbourhood of a two-dimensional cone in \({\mathbb {R}}^d\) of diameter \( \sim 2^k\delta \), a set of measure \(\gtrsim (2^{k}\delta )^{2}\delta ^{d-2} = 2^{2k}\delta ^{d}\), justifying our claim. Thus,

$$\begin{aligned} (\log \delta ^{-1}) \delta ^{d} \lesssim \sum _{k = 0}^{\lfloor \log (\delta ^{-1}) \rfloor } 2^{-2k}|A_k|\le \Vert \mathcal{{N}_{\delta }^{\gamma }}\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d})}^2 \Vert f_{\delta }\Vert ^2_{L^2({\mathbb {R}}^{d+1})}. \end{aligned}$$
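Here the first inequality uses the claim: each of the \(\sim \log \delta ^{-1}\) values of \(k\) contributes \(2^{-2k}|A_k|\gtrsim \delta ^{d}\). The second follows since the sets \(A_k\) are pairwise disjoint, so that \(\sum _{k} 2^{-2k}|A_k|\sim \sum _{k} \int _{A_k} |\mathcal{{N}_{\delta }^{\gamma }}f_{\delta }|^2 \le \Vert \mathcal{{N}_{\delta }^{\gamma }}f_{\delta }\Vert _{L^2({\mathbb {R}}^{d})}^{2}\).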

Consequently, recalling that \(\Vert f_{\delta }\Vert _{L^2({\mathbb {R}}^{d+1})}^{2} \sim \delta ^{d}\), we see that

$$\begin{aligned} \Vert \mathcal{{N}_{\delta }^{\gamma }}\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d})} \gtrsim (\log \delta ^{-1})^{1/2}. \end{aligned}$$

In view of the above, one may conjecture that \((\log \delta ^{-1})^{1/2}\) is the sharp \(L^2\) operator norm of \(\mathcal{{N}_{\delta }^{\gamma }}\); in other words, it is possible that the exponent \(d/2\) in Theorem 1 is not optimal.

In the next section, we discuss a generalisation of Theorem 1.

6 Further Extensions

As observed in [1], it is a stronger version of Theorem 1, one dealing with families of anisotropic tubes, that is required in applications to certain geometric maximal estimates (such as the bounds for the helical maximal function). In this section, we state this anisotropic extension of Theorem 1, with a brief discussion of how the argument presented in this article can be adapted to the more general setup.

We begin by introducing the anisotropic tubes using the Frenet frame coordinate system. For \(s \in I\), let \(\{e_1(s),\ldots ,e_d(s)\}\) denote the Frenet frame basis vectors, obtained by applying the Gram–Schmidt process to \(\{\gamma ^{(1)}(s), \ldots , \gamma ^{(d)}(s)\}\). For \({\textbf{r}}= (r_1,\ldots ,r_d) \in (0,1)^{d}\), we consider a tube in \({\mathbb {R}}^{d+1}\) in the direction of \((\gamma (s) \ 1)^{\top }\), defined as

$$\begin{aligned} {T}_{{\textbf{r}}}(s) := \big \{(y,t) \in {\mathbb {R}}^{d} \times I : |\langle y - t\gamma (s), e_j(s) \rangle |\le r_j \text { for } \ 1 \le j \le d \big \}. \end{aligned}$$
(52)
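As a consistency check, note that for the isotropic choice \({\textbf{r}}= (\delta ,\ldots ,\delta )\), the orthonormality of the Frenet frame gives \(T_{\delta }(s) \subseteq T_{{\textbf{r}}}(s) \subseteq T_{\sqrt{d}\delta }(s)\), so the tubes (52) are comparable to the isotropic tubes \(T_{\delta }(s)\) considered earlier.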

As before, we can introduce the corresponding averaging and maximal operators as

$$\begin{aligned} {\mathcal {A}}_{{\textbf{r}}}^{\gamma } g(x,s)&: = \frac{1}{|T_{{\textbf{r}}}(s)|}\int _{{T}_{{\textbf{r}}}(s)} g(x - y,t) \mathrm{{d}}y\mathrm{{d}}t\qquad \text { for } (x,s) \in {\mathbb {R}}^d \times I \end{aligned}$$

and

$$\begin{aligned} {\mathcal {N}}_{{\textbf{r}}}^{\gamma }g(x)&: = \sup _{s \in I} |{\mathcal {A}}_{{\textbf{r}}}^{\gamma }g(x,s)|\qquad \text { for } x \in {\mathbb {R}}^d \end{aligned}$$

whenever \(g \in L^1_{\textrm{loc}}({\mathbb {R}}^{d+1})\).

By modifying the argument presented in Sects. 3 and 4, the \(L^p\) boundedness problem for \({\mathcal {N}}_{{\textbf{r}}}^{\gamma }\) can be resolved under mild hypotheses on \({\textbf{r}}\). Our result, established in [3], is as follows.

Theorem 15

Let \({\textbf{r}}:= (r_1,\ldots ,r_d) \in (0,1)^{d}\) be chosen such that

$$\begin{aligned} r_{d} \le \cdots \le r_{1} \le r_2^{1/2} \qquad \text {and} \qquad r_{j} \le r_i^{{(k-j)}/{(k-i)}} r_k^{{(j-i)}/{(k-i)}} \end{aligned}$$
(53)

hold for all \(1 \le i \le j \le k \le d\) with \(i < k\). Then there exists \(C_{d,\gamma } > 1\) such that

$$\begin{aligned} \Vert {\mathcal {N}}_{{\textbf{r}}}^{\gamma }\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d})} \le C_{d,\gamma } (\log r_d^{-1})^{d/2}. \end{aligned}$$

Two cases of particular interest where Theorem 15 applies are \({\textbf{r}}= {\textbf{r}}_{\text {iso}}:= (\delta ,\ldots , \delta )\) and \({\textbf{r}}= (\delta , \delta ^{2},\ldots , \delta ^{d})\) for \(0< \delta < 1\). In both cases, \({\textbf{r}}\) satisfies (53). Applying Theorem 15 in the first case, we recover Theorem 1 as a consequence.
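Indeed, for the second choice the log-convexity condition in (53) holds with equality: writing \(r_j = \delta ^{j}\), for \(1 \le i< j < k \le d\) we have

$$\begin{aligned} r_i^{{(k-j)}/{(k-i)}} r_k^{{(j-i)}/{(k-i)}} = \delta ^{{(i(k-j) + k(j-i))}/{(k-i)}} = \delta ^{j} = r_j, \end{aligned}$$

while \(r_1 = \delta = r_2^{1/2}\). The isotropic case is immediate.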

Discussion of the proof of Theorem 15

In the following discussion, we highlight the major changes from the arguments presented in this article. A detailed proof can be found in [3].

From (4), we recall the definition

$$\begin{aligned} a_{\delta }(\xi ,s,t) := \psi (\delta |\xi |) {\tilde{\chi }}_{I}(s){\tilde{\chi }}_{I}(t) \qquad \text { for } (\xi ,s,t) \in {\mathbb {R}}^{d} \times I \times I. \end{aligned}$$

In view of (52), the anisotropic version of \(a_{\delta }\) is defined by the formula

$$\begin{aligned} a_{{\textbf{r}}}(\xi ,s,t) := \prod _{j = 1}^{d} \psi ( \langle \xi , e_j(s) \rangle r_j) {\tilde{\chi }}_{I}(s){\tilde{\chi }}_{I}(t) \qquad \text { for } (\xi ,s,t) \in {\mathbb {R}}^{d} \times I \times I, \end{aligned}$$

whenever \({\textbf{r}}\in (0,1)^{d}\).

By arguing along the lines of Sect. 3, we can reduce the proof of Theorem 15 to establishing operator norm estimates for the Fourier integral operator \({\mathfrak {D}}_s\mathcal{{A}}[a_{{\textbf{r}}},\gamma ]\). In particular, it suffices to show that

$$\begin{aligned} \Vert {\mathfrak {D}}_s\mathcal{{A}}[a_{{\textbf{r}}},\gamma ]\Vert _{L^2({\mathbb {R}}^{d+1}) \rightarrow L^2({\mathbb {R}}^{d+1})} \lesssim _{d,\gamma } (\log r_d^{-1})^{d-1}. \end{aligned}$$
(54)

In Sect. 3, we obtained an equivalent version of (54) for \({\textbf{r}}= {\textbf{r}}_{\text {iso}}\) by first dyadically decomposing the operator and then applying Proposition 3 to each part. The proposition, in turn, was proved using an induction argument (in particular, via Proposition 6). In the same way, we can reduce the proof of (54) to a variant of Proposition 6. The modifications required in the proposition to adapt it to the anisotropic setup do not alter the core argument of the proof. The key steps, namely the decompositions described in Sects. 4.2 and 4.4 and the rescaling described in Sects. 4.5 and 4.6, remain intact. The differences mainly concern the description of the class of symbols under consideration.

Recall how the derivative bounds

$$\begin{aligned} |\partial _{s}^{\beta }a_{\delta }(\xi ,s,t)|\lesssim _{\beta ,\gamma ,d} 1 \qquad \text { for }\beta \in {\mathbb {N}}\text { and }(\xi ,s,t) \in \mathrm{{supp}} \ a_{\delta } \end{aligned}$$
(55)

were explicitly used to estimate parts of the operator directly in \(L^2\) at many stages in the proof of Proposition 6 (in particular, see the proofs of Lemmas 7 and 9). Consequently, the operator norm of \({\mathfrak {D}}_s\mathcal{{A}}[a_{\delta },\gamma ]\) depends on the upper bound in (55). Since rescaling of symbols preserves (55), these estimates could be carried unchanged throughout the induction process (see Definition 5). The situation differs in the anisotropic setup for two reasons.

Firstly, in contrast to (55), the best attainable \(L^{\infty }\) bounds for the derivatives of the anisotropic symbol are

$$\begin{aligned} \left\| {\partial _{s}^{\beta }a_{{\textbf{r}}}}\right\| _{L^{\infty }({\mathbb {R}}^d \times I \times I)} \lesssim _{\beta ,\gamma ,d} \max _{1 \le j_1,\ldots ,j_\beta ,k_1,\ldots , k_{\beta } \le d} \ \prod _{i = 1}^{\beta } r_{j_i} r_{k_i}^{-1} \quad \text { for }\beta \in {\mathbb {N}}. \end{aligned}$$

Note that the expression on the right depends on \({\textbf{r}}\) and can be extremely large. However, after applying the decompositions described in Sects. 4.2 and 4.4, improved \(L^{\infty }\) bounds can be attained for the derivatives of each part of the symbol. In view of this, rather than assuming uniform control over the \(C^{3d}\) norm of the symbol, the anisotropic variant of Proposition 6 includes pointwise bounds for the derivatives of the symbol, expressed in a form that is sensitive to the various decompositions in the argument.
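To indicate where such factors arise: each \(s\)-derivative falling on the \(j\)-th factor of \(a_{{\textbf{r}}}\) produces a term containing \(\langle \xi , e_j'(s) \rangle r_j\), and by the Frenet–Serret formulas \(e_j'(s)\) is a linear combination of \(e_{j-1}(s)\) and \(e_{j+1}(s)\); since \(|\langle \xi , e_k(s) \rangle |\lesssim r_k^{-1}\) on the support of \(a_{{\textbf{r}}}\), each derivative contributes factors of size \(r_{j} r_{j\mp 1}^{-1}\), which are dominated by the maximum above.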

Secondly, the action of the rescaling map on the symbol \(a_{{\textbf{r}}}\) significantly alters its derivative estimates. Thus, the properties listed in Definition 5, which describe the rescaling-invariant class of symbols containing \(a_{\delta }\), must be modified to accommodate all symbols encountered at the various stages of the argument in the anisotropic setup.

Apart from the modifications to the symbol class mentioned above, we also require additional control over the coefficients \(r_i\) in order to establish acceptable bounds at the stages of direct \(L^2\) estimation. The mild conditions (53) are introduced for this purpose. The author does not know whether these conditions are necessary for the maximal estimate, but they fit well into the induction argument. Combining (53) with the modified description of the symbols, we prove the anisotropic variant of Proposition 6, completing the proof of Theorem 15. \(\square \)