1 Introduction and statement of the results

Let N be a dyadic integer, let \(1<\alpha \le 1+\frac{1}{1000}\), and let

$$\begin{aligned} \mu _N(x)=\sum _{N\le m\le 2N}\phi \Big (\frac{m}{N}\Big )\,\frac{\delta _0(x-[m^\alpha ])}{m} \end{aligned}$$
(1)

where \(\delta _0\) denotes the unit point mass at zero, and \(\phi \in C_c^\infty (1,2)\). For a fixed \(0<\theta <1\) and \({\check{\mu }}_N(x)=\mu _N(-x)\) we define

$$\begin{aligned} H_Mf(x)=\sum _{M^\theta \le N\le M}(\mu _N-{\check{\mu }}_N)*f(x). \end{aligned}$$
(2)

The operators \(H_M\) can be viewed as a discrete analog of rough singular integral operators, see e.g. [3]. They have appeared in the \(\ell ^{1,\infty }\) invertibility problem for discrete singular integral operators considered by the authors in [14, 15]. In particular, it has been proved in [15] that \(H_M\) is of weak type (1,1), uniformly in M.

In the current paper we consider the following maximal variant of \(H_M\):

$$\begin{aligned} H_M^*f(x)=\max _{B\le M}\Big |\sum _{M^\theta \le N\le B}(\mu _N-{\check{\mu }}_N)*f(x)\Big |. \end{aligned}$$
(3)

It is by now classical that the operators \(H_M\) and \(H_M^*\) are bounded on \(\ell ^p\) for \(p>1\). However, their behavior on \(\ell ^1\) still seems to be an open question. In this direction we prove the following partial result.

Theorem 1

The maximal Hilbert operators \(H_M^*\) defined by (3) are of weak type (1,1), uniformly in M.

One of the motivations for investigating this question is the natural difference between the analysis of singular integral operators in the discrete and the continuous settings. In the discrete case, the analytic properties of singular integral operators and ergodic averages along sequences of integers are a delicate consequence of arithmetic properties of these sequences. This subject has received considerable attention, and we include some of the references. We refer in particular to [21] for a nice recap of the background. The proof of Theorem 1 is a refined version of the argument from [22] and also employs ideas from [3, 18]. In this direction, we refer to our previous work [14, 15, 22] for a more complete list of references and motivation.
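To fix the objects in mind, the measure (1) and the maximal truncation (3) can be simulated for small parameters. The following Python sketch is entirely ours: all names are hypothetical, and a generic smooth bump stands in for \(\phi \); it ignores, of course, all questions of uniformity in M, which are the content of Theorem 1.

```python
import math
from collections import defaultdict

def bump(t):
    # a generic smooth bump supported in (1, 2), standing in for phi
    return math.exp(-1.0 / ((t - 1.0) * (2.0 - t))) if 1.0 < t < 2.0 else 0.0

def mu_weights(N, alpha=1.001):
    # the measure mu_N of (1): mass phi(m/N)/m placed at the point [m^alpha]
    w = defaultdict(float)
    for m in range(N, 2 * N + 1):
        w[int(m ** alpha)] += bump(m / N) / m
    return w

def H_star(f, M, theta=0.9, alpha=1.001):
    # the maximal truncation (3): max over B <= M of the partial sums of
    # (mu_N - reflected mu_N) * f over dyadic N with M^theta <= N <= B
    scales = [2 ** k for k in range(M.bit_length()) if M ** theta <= 2 ** k <= M]
    out, best = defaultdict(float), defaultdict(float)
    for N in scales:                   # increasing B adds one scale at a time
        for y, c in mu_weights(N, alpha).items():
            for x, fx in f.items():
                out[x + y] += c * fx   # contribution of mu_N * f
                out[x - y] -= c * fx   # contribution of -(check mu_N) * f
        for x in list(out):
            best[x] = max(best[x], abs(out[x]))
    return best
```

With \(f=\delta _0\) the running sums are odd across the origin, so the toy maximal function is symmetric.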

2 Definitions

Let \(A,\,N\) be positive dyadic integers, \(\lambda >0\). For a set \({\mathcal {A}}\) we will denote its indicator function by \(1_{\mathcal A}(x)\), and its cardinality by \(|{\mathcal {A}}|\). For a function f we denote \(f^{\lambda N}(x)=f(x)\cdot 1_{\{|f|<\lambda N\}}(x)\). We put \(f^{\lambda N}_\infty (x)=f(x)-f^{\lambda N}(x)\) and \(f^{\frac{\lambda N}{A}}=f1_{\{|f|\sim \frac{\lambda N}{A}\}}\). This last notation may be misleading, but we will try to avoid any ambiguity in what follows.
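The level-set truncations above can be made concrete. In the following Python sketch (our names, and one fixed convention for the \(\sim \) in \(f^{\frac{\lambda N}{A}}\): the shell A collects the points with \(\lambda N/(2A)<|f|\le \lambda N/A\)), the shells over dyadic \(A\ge 1\) reassemble exactly the bounded part \(f^{\lambda N}\):

```python
def split_levels(f, lam, N):
    # f: dict x -> f(x). Returns the cut f^{lam N} (below), the tail
    # f^{lam N}_infty (above), and the dyadic shells f^{lam N / A}.
    below = {x: v for x, v in f.items() if abs(v) < lam * N}
    above = {x: v for x, v in f.items() if abs(v) >= lam * N}
    shells = {}
    for x, v in below.items():
        if v == 0:
            continue
        A = 1
        while abs(v) <= lam * N / (2 * A):  # find lam*N/(2A) < |v| <= lam*N/A
            A *= 2
        shells.setdefault(A, {})[x] = v
    return below, above, shells
```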

Let \(\{Q\}\) be any collection of disjoint intervals in \({\mathbb {Z}}\). We define the conditional expectation operator

$$\begin{aligned} E_{\{Q\}}f(x)=\sum _Q \frac{1_Q(x)}{|Q|}\sum _{y\in Q} f(y) \end{aligned}$$
(4)

In particular, we will consider the family of dyadic intervals J of equal length \(|J|\approx M^{\theta -\epsilon }\). Its expectation operator (4) will be denoted by \(E_{\{J\}}\). We will write \(E_{{\mathcal {A}}}f\) if the collection contains only one interval \({\mathcal {A}}\).

We will use the following variant of the Calderón-Zygmund decomposition

Lemma 1

Let \(f\ge 0\), \(f\in \ell ^1\), and \(\lambda >0\). Then there exists a family of disjoint dyadic intervals \(Q\subset {\mathbb {Z}}\) such that

$$\begin{aligned} \lambda \le \frac{1}{|Q|}\sum _{x\in Q} f(x)\le 2\lambda , \end{aligned}$$
(5)

and for any \(x\notin (\bigcup _Q Q)\) we have \(f(x)\le \lambda \).

In what follows we will call the above intervals Q Calderón-Zygmund cubes. We note that for the family of Calderón-Zygmund cubes we have \(\Vert E_{\{Q\}}f\Vert _{\ell ^\infty }\le 2\lambda \). If the family \(\{Q\}\) is fixed we will abbreviate \(E_{\{Q\}}f(x)\) to \(Ef(x)\).
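Lemma 1 is the usual dyadic stopping-time construction. On a finite dyadic block it can be sketched in a few lines of Python (a sketch under our assumption that the global average is below \(\lambda \), which for \(f\in \ell ^1\) always holds on a large enough block; the function name is ours):

```python
def cz_cubes(f, lam):
    # Dyadic Calderon-Zygmund cubes for nonnegative f on {0,...,2^n - 1}.
    # Selected intervals satisfy lam <= average <= 2*lam as in (5), and
    # f <= lam pointwise off their union.
    n = len(f)
    assert n > 0 and n & (n - 1) == 0 and sum(f) / n < lam
    cubes = []
    def split(a, b):
        # the parent average is < lam, so a selected child has average <= 2*lam
        avg = sum(f[a:b]) / (b - a)
        if avg >= lam:
            cubes.append((a, b))
        elif b - a > 1:
            m = (a + b) // 2
            split(a, m)
            split(m, b)
    split(0, n)
    return cubes
```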

We will estimate \(|\{x:|H_M^*f_0(x)|>\lambda \}|\), where we assume \(f_0\ge 0\) and \(\Vert f_0\Vert _{\ell ^1}=1\). C will denote a constant which may vary from one occurrence to another. For \(\phi \in C_c^\infty \) and a number s we denote \(\phi _s(x)=\phi (\frac{x}{s})\). If I is an interval with center \(x_I\), we denote \(\phi _I(x)=\phi _{|I|}(x-x_I)\).

3 Lemmas

We begin with the following

Lemma 2

Fix dyadic integers \(N_1, N_2\), an interval J and \(\phi \in C_c^\infty \). The interval J is of the size specified above, but not necessarily dyadic. Let

$$\begin{aligned} K_{N_1,N_2}(x_1,x_2)=\sum _{y\in J}\phi _J(y)\mu _{N_1}(x_1-y)\mu _{N_2}(x_2-y) \end{aligned}$$
(6)

Then

  1.

    \(K_{N_1,N_2}(x_1,x_2)\) is supported on the set

    $$\begin{aligned} \{(x_1,x_2): |x_1-x_J|\le CN_1 ^\alpha \wedge |x_2-x_J|\le CN_2^\alpha \}. \end{aligned}$$
  2.

    If \(N_2\ne N_1\) we have

    $$\begin{aligned} |K_{N_1,N_2}(x_1,x_2)|\le \frac{C|J|}{(N_1N_2)^{\alpha }} \end{aligned}$$
    (7)

    and for some \(\delta >0\)

    $$\begin{aligned} |K_{N_1,N_2}(x_1+h,x_2)-K_{N_1,N_2}(x_1,x_2)|\le \frac{C|J|}{(N_1N_2)^{\alpha }}\Big (\frac{|h|}{N_1^{\alpha }}\Big )^\delta \end{aligned}$$
    (8)
    $$\begin{aligned} |K_{N_1,N_2}(x_1,x_2+h)-K_{N_1,N_2}(x_1,x_2)|\le \frac{C|J|}{(N_1N_2)^{\alpha }}\Big (\frac{|h|}{N_2^{\alpha }}\Big )^\delta \end{aligned}$$
    (9)
  3.

    If \(N_2= N_1=N\) then there exists a kernel \(Err_{N,J}(x_1,x_2)\) supported on the set

    $$\begin{aligned} \{(x_1,x_2):|x_1-x_2|\le CN^{1-\epsilon }\} \end{aligned}$$

    satisfying

    $$\begin{aligned} \sum _J|Err_{N,J}(x_1,x_2)|\le \frac{C}{N^{\alpha }} \end{aligned}$$
    (10)

    such that

    $$\begin{aligned} K_{N,N}(x_1,x_2)={\tilde{K}}_{N,N}(x_1,x_2) +Err_{N,J}(x_1,x_2)+C_{N,J}\delta _0(x_1-x_2) \end{aligned}$$
    (11)

    where

    $$\begin{aligned} |C_{N,J}|\le \frac{C|J|}{N^{1+\alpha }} \end{aligned}$$

    and the kernel \({\tilde{K}}_{N,N}(x_1,x_2)\) satisfies (7), (8), (9).

  4.

    Let Q be an interval of length \(|Q|\ge M^{\alpha -1+\epsilon }\), for some \(\epsilon >0\), and let

    $$\begin{aligned} K_N(x)=\frac{1}{|Q|}\mu _N*1_Q(x). \end{aligned}$$

    Then, for some \(\delta >0\)

    $$\begin{aligned} \sum _x |K_N(x+h)-K_N(x)|^2\le \frac{C}{N^{\alpha }}\Big (\frac{|h|}{N^{\alpha }}\Big )^\delta \end{aligned}$$
    (12)

Proof

Statement 1 is obvious. Statements 2 and 3 follow by inspection of the proofs of Lemma 3.1 in [22] and the Lemma in the appendix of [15]. We briefly sketch the necessary modifications of the argument. Since the statement is translation invariant we can assume that J is centered at 0, with length \(\sim M^{\theta -\epsilon }\). We can assume \(N_1\ge N_2\). Let

$$\begin{aligned} Q=N_1^{1-\delta _1},\quad R=Q^{\delta _2},\quad H=Q^{\delta _3} \end{aligned}$$

where \({\delta _1}, {\delta _2}, {\delta _3}\) are small positive constants, which will be specified later. We define

$$\begin{aligned} Z_{jQ}=\frac{1}{\alpha \big |x_1-x_2-(jQ)^\alpha \big |^\frac{\alpha -1}{\alpha }}. \end{aligned}$$

One can easily check that under the conditions of the lemma the denominator never vanishes. In fact one can check that if

$$\begin{aligned} \alpha -\delta _1<\alpha \theta \end{aligned}$$
(13)

then the denominator of \(Z_{jQ}\) is comparable to \(N_2^\alpha \). In what follows we will fix \(\delta _1=\frac{1}{100}\). That, together with the choice of \(\theta \) close to 1, will ensure (13). \(\theta \) can be assumed to be arbitrarily close to 1 without loss of generality (see the note in the Appendix of [15]). We define 1-periodic functions \({\tilde{\Psi }}_a\) and \(\Psi _a\) by the conditions:

$$\begin{aligned} 0\le \Psi _{jQ}\le 1, \quad \Psi _{jQ}(t)=1 \text { for } 1-Z_{jQ}\le t \le 1 \end{aligned}$$
(14)
$$\begin{aligned} \text {supp}( \Psi _{jQ}(t))\subset \Big \{t: 1-Z_{jQ}\Big (1+\frac{1}{R}\Big )\le t \le 1+Z_{jQ}\frac{1}{R}\Big \} +{\mathbb {Z}}, \end{aligned}$$
(15)

where \({\mathbb {Z}}\) is the set of integers,

$$\begin{aligned} \text { } \Psi _{jQ}(t)\in C^\infty \text { and } |\partial ^k \Psi _{jQ}(t)|\le C(RZ_{jQ}^{-1})^k \text { for }k\le 4 \end{aligned}$$
(16)
$$\begin{aligned} 0\le {\widetilde{\Psi }}_{H}\le 1,\text { } {\widetilde{\Psi }}_{H}\in C^\infty , \text { } |\partial ^k{\widetilde{\Psi }}_{H}|\le CH^k \text { for } k\le 4 \end{aligned}$$
(17)
$$\begin{aligned} \text { supp}{\widetilde{\Psi }}_{H}\subset \Big [0,\frac{2}{H}\Big ]+{\mathbb {Z}}\text { and }\sum _{0\le P <H}{\widetilde{\Psi }}_{H}\Big (\Theta -\frac{P}{H}\Big )=1 \end{aligned}$$
(18)

Now, we briefly adopt the following convention: The symbol \(a\le _N b\) means \(b-a=O(\frac{1}{N^{\alpha +\eta }})+C\) where \(\eta >0\) is some constant depending on \(\alpha \) and independent of N, and \(C>0\). Similarly as in [22] we have

$$\begin{aligned}&K_{N_1,N_2}(x_1,x_2)\le _{N_1}\\&\le _{N_1}\sum _{0\le P <H}\sum _{j}\sum _{s=0}^{Q-1}{\widetilde{\Psi }}_{H}\Big ((jQ+s)^\alpha -\frac{P}{H}\Big )\nonumber \\&\quad \times \Psi _{jQ}\Big (\Big (x_1-x_2-(jQ+s)^\alpha -\frac{P}{H}\Big )^{\frac{1}{\alpha }}\Big )\nonumber \\&\quad \times \phi _{N_1^\alpha }((jQ+s)^\alpha )\phi _{N_2^\alpha }(x_1-(jQ+s)^\alpha -x_2)\phi _J(x_1-(jQ+s)^\alpha )\nonumber \end{aligned}$$
(19)

Now we replace \(\Psi _{jQ}\) by its Fourier series. By the definition, we have

$$\begin{aligned} c_\alpha \Big (1-\frac{1}{R}\Big ){(x_1-x_2-(jQ)^\alpha )^{-\frac{\alpha -1}{\alpha }}}&\le \widehat{\Psi }_{jQ}(0)\\&\le c_\alpha \Big (1+\frac{1}{R}\Big ){(x_1-x_2-(jQ)^\alpha )^{-\frac{\alpha -1}{\alpha }}}. \end{aligned}$$

Since \(\sum _{0\le P <H}{\widetilde{\Psi }}_{H}(\Theta -\frac{P}{H})=1\), the right hand side expression corresponding to \({\widehat{\Psi }}_{jQ}(0)\) becomes

$$\begin{aligned}&D(x_1,x_2)\\&\quad =\sum _{j,s}{\widehat{\Psi }}_{jQ}(0) \phi _{N_1^\alpha }((jQ+s)^\alpha )\phi _{N_2^\alpha }(x_1-(jQ+s)^\alpha -x_2)\\&\quad \quad \times \phi _J(x_1-(jQ+s)^\alpha )\\&\quad \le \Big (1+\frac{1}{R}\Big )\sum _{m}\frac{c_\alpha }{(x_1-x_2-(jQ)^\alpha )^{\frac{\alpha -1}{\alpha }}} \phi _{N_1^\alpha }(m^\alpha )\phi _{N_2^\alpha }(x_1-m^\alpha -x_2)\\&\quad \quad \times \phi _J(x_1-m^\alpha )\\&\quad \le \Big (1+\frac{1}{R}\Big )\int _{R_+}\frac{c_\alpha }{(x_1-x_2- t^{\alpha })^{\frac{\alpha -1}{\alpha }}} \phi _{N_1^\alpha }(t^\alpha )\phi _{N_2^\alpha }(x_1-t^\alpha -x_2)\\&\quad \quad \times \phi _J(x_1-t^\alpha )dt\\&\quad =\Big (1+\frac{1}{R}\Big )|J|\int _{R_+}\frac{c_\alpha }{ (x_2-|J|u)^{\frac{\alpha -1}{\alpha }}} \phi _{N_1^\alpha }(x_1-|J|u)\phi _{N_2^\alpha }(|J|u-x_2)\\&\quad \quad \times \phi (u)\frac{\alpha du}{(x_1-|J|u)^{\frac{\alpha -1}{\alpha }}}\\&\quad =\Big (1+\frac{1}{R}\Big )|J|F(x_1,x_2) \end{aligned}$$

By similar arguments, one can obtain the lower estimate

$$\begin{aligned} D(x_1,x_2)\ge \Big (1-\frac{1}{R}\Big )|J|F(x_1,x_2) \end{aligned}$$

Now the function \(F(x_1,x_2)\) can easily be shown to satisfy the estimate \(|\partial _{x_1}F(x_1,x_2)|\le \frac{C}{N_1^{2\alpha -1}N_2^{\alpha -1}}\). This immediately yields (7), (8), (11).

In order to complete the argument we need to estimate the summands corresponding to coefficients \({\widehat{\Psi }}_{jQ}(k)\), \(k\ne 0\).

For j, P fixed, we are left with the estimates for

$$\begin{aligned}&\sum _{s=0}^{Q-1} {\widetilde{\Psi }}_{H}\Big ((jQ+s)^\alpha -\frac{P}{H}\Big )\\&\quad \times \Big (\Psi _{jQ}\Big (\Big (x_1-x_2-(jQ+s)^\alpha -\frac{P}{H}\Big )^{\frac{1}{\alpha }}\Big )-{\widehat{\Psi }}_{jQ}(0)\Big ) \\&\quad \times \phi _{N_1^\alpha }((jQ+s)^\alpha )\phi _{N_2^\alpha }(x_1-(jQ+s)^\alpha -x_2)\phi _J(x_1-(jQ+s)^\alpha ) \end{aligned}$$

Let \(N_1\ne N_2\) or \(|x_1-x_2|\ge CN_1\), and \(|J|\ge Q\). Then the above sum is exactly of the form considered in [22]. Let \(\delta _1=\frac{1}{100}\). Using the van der Corput lemma we get an estimate with an additional \(Q^{-\eta }\) factor, where \(\eta >0\) is independent of \(\alpha \). We fix \(\delta _2,\delta _3\) sufficiently small, independent of \(\alpha \). Then summing with respect to j, P completes the estimate. We note that choosing \(\theta \) sufficiently close to 1 we can ensure the required bound \(|J|\ge Q\). We omit further details, and refer the reader to [22].

The estimate (10) follows from

$$\begin{aligned} |\mu _{N}*{\check{\mu }}_{N}(x)|\le \frac{C}{N^\alpha } \text { for }x\ne 0, \end{aligned}$$

proved in [22].

The Hölder estimate (12) has been proved in [14], Lemma 3.5. The proof there is carried out for a smooth function \(\varphi \) in place of the characteristic function, but it carries over (with a small modification in part III, see [14]). \(\square \)

We fix small \(\epsilon >0\), \(1-\epsilon<\theta <1\), the function \(f_0\ge 0\), \(\lambda >0\) and the set of Calderón-Zygmund cubes \(\{Q\}\) associated with \(f_0\) by Lemma 1. By the \(\ell ^2\) boundedness of the maximal rough Hilbert transform \(H_M^*\) and Lemma 1 we can assume that \(f_0=0\) away from \(\bigcup _Q Q\). Now we modify \(f_0\) putting

$$\begin{aligned} f_0=0 \text { on each Calder}\acute{o}\text {n-Zygmund cube with }|Q|\le M^{\theta -2\epsilon }. \end{aligned}$$
(20)

We will denote the new function again by \(f_0\). In the remark at the end of the paper we explain why this procedure does not bring any loss of generality.

We perform further reductions. First we have

$$\begin{aligned}&\max _{B}|\sum _{N\le B}(\mu _N-{\check{\mu }}_N)*f_0|\le \max _{B}|\sum _{N\le B}(\mu _N-{\check{\mu }}_N)*(f_{0,\infty }^{\lambda N})| \\&\qquad +\max _{B}|\sum _{N\le B}(\mu _N-{\check{\mu }}_N)*(Ef_{0,\infty }^{\lambda N})|\\&\qquad +\max _{B}|\sum _{N\le B}(\mu _N-{\check{\mu }}_N)*(f_0^{\lambda N}-Ef_0^{\lambda N})|\\&\qquad +\max _{B}|\sum _{N\le B}(\mu _N-{\check{\mu }}_N)*Ef_0|\\&\quad =I+II+III+IV \end{aligned}$$

Now, I has been estimated in [15, 22] using support properties. The estimate for IV follows since \(|Ef_0|\le C\lambda \) and the maximal Hilbert transform \(H_M^*\) is bounded on \(\ell ^2\). The term II requires some care. First we note that \(G_M=\sum _{M^\theta \le N \le M}(\mu _N-{\check{\mu }}_N)*(Ef_{0,\infty }^{\lambda N})\in \ell ^2\) and \(\Vert G_M\Vert ^2_{\ell ^2}\le C\lambda \Vert f_0\Vert _{\ell ^1}\). This has been proved in [15], page 22, with \(f_{0,\infty }^{\lambda N}\) replaced by \(f_{0}^{\lambda N}\). The proof of our statement is exactly the same, and we do not present any details.

Next, the key observation is the following property of the Calderón-Zygmund cubes in our setting

Lemma 3

  1.

    Let Q be a Calderón-Zygmund cube which contains a point of the support of \(f_{0,\infty }^{\lambda N}\). Then \(|Q|\ge CN\).

  2.

    Let Q be a Calderón-Zygmund cube which contains a point of the support of \(f_0^{\frac{\lambda N}{A}}\). Then \(|Q|\ge CNA^{-1}\).

Proof

The lemma follows immediately from the upper inequality in (5). \(\square \)

From (20) and (12) we infer that the function \(\mu _N*1_Q\) is \(\ell ^2\)-Hölder regular. Consequently we can estimate II in a manner similar to the maximal Calderón-Zygmund operator. First observe that by (12) the following estimates hold

$$\begin{aligned} \sum _{N\le B}|\phi _{B^\alpha }*(\mu _N-{\check{\mu }}_N)(x)|&\le C\phi _{4B^\alpha }(x), \\ \Big |\big (\mu _N-\phi _{B^\alpha }*\mu _N\big )*\frac{1_Q}{|Q|}(x)\Big |&\le C\Big (\frac{B}{N}\Big )^\delta \phi _{4N^\alpha }(x)\quad \text { for } B\le N, \end{aligned}$$

and consequently

$$\begin{aligned} \Big |\sum _{ N \ge B}(\mu _N-{\check{\mu }}_N)*(Ef_{0,\infty }^{\lambda N})(x)-\phi _{B^\alpha }*G_M(x)\Big |\le C M(Ef_0)(x) +C( M((Ef_0)^2))^\frac{1}{2}(x) \end{aligned}$$

where M denotes the classical Hardy-Littlewood maximal function. By the weak type (1,1) of M we obtain \(|\{x:II(x)>c\lambda \}|\le \frac{C}{\lambda }\). We leave the details (which are standard arguments in the Calderón-Zygmund theory) to the reader.

So the main term is III. We further decompose \(f_0\) by writing

$$\begin{aligned} f_0^{\lambda N}=\sum _{A\text {-dyadic}} f_0^{\frac{\lambda N}{A}} \end{aligned}$$

and obtain the following estimate

$$\begin{aligned} |III|\le \max _{B}|\sum _{N\le B}\sum _{1\le A\le N}(\mu _N-{\check{\mu }}_N)*(f_0^{\frac{\lambda N}{A}}-Ef_0^{\frac{\lambda N}{A}})| \end{aligned}$$
(21)

For \(i=1,2\) we write

$$\begin{aligned} b^{A,N}_{s,i}(x)=\sum _{Q\in {\mathcal {D}}_{A,N,s}^i} (1_{Q}(x)f_0^{\frac{\lambda N}{A}}(x)-E_{Q}f_0^{\frac{\lambda N}{A}}(x)), \end{aligned}$$
(22)
$$\begin{aligned} f^{A,N}_{s,i}(x)=\sum _{Q\in {\mathcal {D}}_{A,N,s}^i} 1_{Q}(x)f_0^{\frac{\lambda N}{A}}(x), \end{aligned}$$
(23)

where the families of cubes \( {\mathcal {D}}_{A,N,s}^i\), \(i=1,2\) are defined as follows

$$\begin{aligned} {\mathcal {D}}_{A,N,s}^1&=\{Q: |Q|\sim (2^{-s}N)^\alpha ,\,2^s> A\}\nonumber \\ {\mathcal {D}}_{A,N,s}^2&=\{Q: |Q|\sim (2^{-s}N)^\alpha ,\,2^s\le A\}. \end{aligned}$$
(24)

Furthermore, for \(i=1,2\) we let

$$\begin{aligned} f^{A,N}_{i}(x)=\sum _{s\ge 0} f^{A,N}_{s,i}(x), \end{aligned}$$
(25)

and for \(i=2\)

$$\begin{aligned} f^N_{s,2}(x)=\sum _{A:\,A\ge 2^{s}}f^{A,N}_{s,2}(x). \end{aligned}$$
(26)

From the above definitions, for fixed A, N we have

$$\begin{aligned} \bigcup _{s\ge 0}( {\mathcal {D}}_{A,N,s}^1\cup \mathcal D_{A,N,s}^2)=\{Q: |Q|\le N^\alpha \} \end{aligned}$$
(27)

We note that \( \text {supp} \mu _N*1_{Q}\subset Q^{**}\) if \(|Q|\ge N^\alpha \), and by (5) we have \(\sum |Q^{**}|\le \frac{C}{\lambda }\). Hence it will suffice to estimate

$$\begin{aligned} H_i^*(x)=\max _{B}|\sum _{N\le B}\sum _{A}\sum _{s\ge 0}(\mu _N-{\check{\mu }}_N)*b^{A,N}_{s,i}(x)| \end{aligned}$$
(28)

Lemma 4

Let \(f^{A,N}_{i}\) be defined in (25) and let \(\beta _N\ge 0\) be a sequence of numbers. We define a sequence of integers \(N_j\), \(1\le j\le j_{max}^A\) by

$$\begin{aligned} N_j=\max \bigg \{2^k:\sum _{N\le 2^k}\beta _N\le j\,\lambda _0\bigg \} \end{aligned}$$
(29)

whenever the maximum exists. Assuming

$$\begin{aligned} \beta _N\le \lambda _0 \end{aligned}$$

we see that the \(N_j\) form a strictly increasing sequence of dyadic integers. Let \(x\in J\) be such that

$$\begin{aligned} \max _{B}|\sum _{N\le B } \mu _N*f^{A,N}_{i}-\sum _{N\le B} \beta _N|\ge 4\lambda _0 \end{aligned}$$
(30)

Then

$$\begin{aligned} \max _{j}|\sum _{N\le N_j } \mu _N*f^{A,N}_{i}-\sum _{N\le N_j} \beta _N|\ge \lambda _0 \end{aligned}$$
(31)

The same is true for the functions \(f^N_{s,2}\) in place of \(f^{A,N}_{i}\). We denote by \(j_{max}^s\) the largest value of j in this case.

Proof

Fix B maximising the expression in (30) and the unique j such that \(N_j<B\le N_{j+1}\). Assume that

$$\begin{aligned} |\sum _{N\le N_j } \mu _N*f^{A,N}_{i}-\sum _{N\le N_j} \beta _N|\le \lambda _0 \end{aligned}$$
(32)

Then

$$\begin{aligned} |\sum _{N_j<N\le B } \mu _N*f^{A,N}_{i}-\sum _{N_j<N\le B} \beta _N|\ge 3\lambda _0 \end{aligned}$$
(33)

and by (29) we must have

$$\begin{aligned} \sum _{N_j<N\le B } \mu _N*f^{A,N}_{i}-\sum _{N_j<N\le B} \beta _N\ge 3\lambda _0 \end{aligned}$$
(34)

Since \(f^{A,N}_{i}\ge 0\), again by (29)

$$\begin{aligned} \sum _{N_j<N\le N_{j+1} } \mu _N*f^{A,N}_{i}-\sum _{N_j<N\le N_{j+1}} \beta _N&\ge \sum _{N_j<N\le B } \mu _N*f^{A,N}_{i}-\sum _{N_j<N\le B} \beta _N-{\lambda _0}\\&\ge 2\lambda _0. \end{aligned}$$

Applying (32) we get

$$\begin{aligned} |\sum _{N\le N_{j+1} } \mu _N*f^{A,N}_{i}-\sum _{N\le N_{j+1} } \beta _N|\ge \lambda _0 \end{aligned}$$
(35)

The proof for functions \(f^N_{s,2}\) is similar. \(\square \)
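The stopping times (29) admit a direct greedy computation. A small Python sketch (our names), which also exhibits the strict monotonicity claimed in the lemma:

```python
def stopping_scales(beta, lam0):
    # beta[k] is beta_N for the dyadic scale N = 2**k; requires beta[k] <= lam0.
    # Returns the N_j of (29): the largest 2**k whose prefix sum of beta
    # stays below j*lam0, for j = 1, ..., j_max.
    assert all(0.0 <= b <= lam0 for b in beta)
    prefix, s = [], 0.0
    for b in beta:
        s += b
        prefix.append(s)
    scales, j = [], 1
    while True:
        k = max(i for i in range(len(beta)) if prefix[i] <= j * lam0)
        scales.append(2 ** k)
        if k == len(beta) - 1:     # j has reached j_max
            return scales
        j += 1                     # prefix steps by <= lam0, so k must increase
```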

Remark. Under the assumptions of the above lemma we can split the set of dyadic naturals into two collections

$$\begin{aligned} {\mathcal {C}}_1=\{N: \beta _N\le \lambda _0\}\quad \text {and}\quad {\mathcal {C}}_2=\{N: \beta _N>\lambda _0\}, \end{aligned}$$

such that

$$\begin{aligned} \max _{j}\Big |\sum _{N\le N_j, N\in {\mathcal {C}}_1 } \mu _N*f^{A,N}_{i}-\sum _{N\le N_j, N\in {\mathcal {C}}_1} \beta _N\Big |\ge \frac{1}{2} \lambda _0 \end{aligned}$$
(36)

or

$$\begin{aligned} \max _{j}|\sum _{N\le N_j, N\in {\mathcal {C}}_2 } \mu _N*f^{A,N}_{i}-\sum _{N\le N_j, N\in {\mathcal {C}}_2} \beta _N|\ge \frac{1}{2} \lambda _0 \end{aligned}$$
(37)

and the cardinality \(|{\mathcal {C}}_2|\le j_{max}^A\).

Lemma 5

Fix an interval J, A and i. Let \(b^{A,N}_{s,i}\) be defined in (22), \(x\in J\) and

$$\begin{aligned} \max _{B}|\sum _{N\le B }\sum _{s} \mu _N*b^{A,N}_{s,i}(x)|\ge 4\lambda _0 \end{aligned}$$
(38)

Then there exists a sequence of nonnegative numbers \(\beta _N\), independent of \(x\in J\), such that (30) holds, or \(ER(x)\ge \lambda _0\) (the error function ER is defined below). The sequence \(\beta _N\) depends on J.

If \(\{N_j\}\) is the sequence (29) constructed from the \(\beta _N\) defined in (40), and (30) holds, then

$$\begin{aligned} \max _{j}|\sum _{N\le N_j }\sum _{s} \mu _N*b^{A,N}_{s,i}(x)|\ge \frac{\lambda _0}{8} \end{aligned}$$
(39)

or \(ER(x)\ge \lambda _0\).

Proof

By the definition we have \(\sum _s \mu _N*b^{A,N}_{s,i}(x)= \mu _N*f^{A,N}_{i}(x)- F_{A,N}(x)\), where \( F_{A,N}(x)= \mu _N*Ef^{A,N}_{i}(x)\). Then we put

$$\begin{aligned} \beta _N=E_{\{J\}} F_{A,N}(x) \end{aligned}$$
(40)

where \(E_{\{J\}}\) is the conditional expectation operator defined by (4), and \(x\in J\). By (20) we have \(|Q|\ge N^{1-3\epsilon }\). We will show that the error function

$$\begin{aligned} ER(x)=\sum _{N}| F_{A,N}(x)-E_{\{J\}} F_{A,N}(x)| \end{aligned}$$
(41)

satisfies

$$\begin{aligned} \Vert ER\Vert _{\ell ^1}\le CM^{-\delta _e}\Vert f_0\Vert _{\ell ^1}. \end{aligned}$$
(42)

where \(\delta _e\) is some small constant depending on the \(\epsilon \) in the definition of J. Observe

$$\begin{aligned} F_{A,N}(x)&= \mu _N*\Big (\sum _{Q}\frac{1_Q}{|Q|}\sum _{y\in Q}f_i^{A,N}(y)\Big )(x)\\&=\sum _Q\frac{C_{N,Q}}{|Q|}\ \mu _N*1_Q(x)\\&=\sum _Q\rho _{N,Q}(x)\cdot C_{N,Q}, \end{aligned}$$

where

$$\begin{aligned} \rho _{N,Q}=\mu _N*\frac{1_Q}{|Q|},\quad C_{N,Q}=\sum _{x\in Q}f_i^{A,N}(x). \end{aligned}$$

Consequently, for \(x\in J\)

$$\begin{aligned} ER(x)&\le \sum _N\sum _QC_{N,Q}\Big |E_{\{J\}}\rho _{N,Q}(x)- \rho _{N,Q}(x)\Big |\\&\le \sum _N\sum _Q\,C_{N,Q}\,\frac{1_J(x)}{|J|}\sum _{h\in J}\big |\rho _{N,Q}(h)-\rho _{N,Q}(x)\big |\\&\le \sum _N\sum _Q\,C_{N,Q}\,\frac{1_J(x)}{|J|}\sum _{h\in J_0^*}\big |\rho _{N,Q}(h+x)-\rho _{N,Q}(x)\big |, \end{aligned}$$

where \(J_0^*\) is the cube centered at 0, with size double that of J. Thus (recall, that \(|J|\sim M^{\theta -\epsilon }\))

$$\begin{aligned} \Vert ER\Vert _{\ell ^1}&\le \sum _N\sum _Q\,\frac{C_{N,Q}}{|J|}\sum _J\sum _{x\in J}\sum _{h\in J_0^*}\big |\rho _{N,Q}(h+x)-\rho _{N,Q}(x)\big |\\&\le \frac{1}{|J|}\sum _N\sum _Q C_{N,Q}\sum _{h\in J_0^*}\sum _{x}\big |\rho _{N,Q}(h+x)-\rho _{N,Q}(x)\big |\\&\le C \frac{1}{|J|}\sum _N\sum _Q C_{N,Q}\sum _{h\in J_0^*}\Big (\frac{|h|}{N^\alpha }\Big )^\delta \\&\le C \Big (\frac{M^{\theta -\epsilon }}{M^{\theta \alpha }}\Big )^\delta \sum _N\sum _Q C_{N,Q}\\&\le C M^{-\epsilon \delta }\,\Vert f_0\Vert _{\ell ^1}. \end{aligned}$$

We have used an \(\ell ^1\) Hölder estimate for \(\rho _{N,Q}\). It follows from the \(\ell ^2\) estimate (12) by the Cauchy-Schwarz inequality. We have also used

$$\begin{aligned} \sum _{N,Q} C_{N,Q}\le \Vert f_0\Vert _{\ell ^1}. \end{aligned}$$

Finally, for the case \(i=2\) we define, similarly to \(F_{A,N}\),

$$\begin{aligned} F^N_{s,2}(x)=\mu _N*Ef^N_{s,2}(x), \end{aligned}$$

where \(f^N_{s,2}\) was defined in (26). The resulting \(\beta _N=E_{\{J\}} F^N_{s,2}(x)\) and the error function ER can be treated in the same way. \(\square \)

Lemma 6

Let

$$\begin{aligned} {\mathcal {A}}_A=\{x\in {\mathbb {Z}}: \sum _{N}E_{\{J\}} F_{A,N}(x)\ge \lambda A^2\}, \\ {\mathcal {A}}^s=\{x\in {\mathbb {Z}}: \sum _{N}E_{\{J\}} F^{N}_{s,2}(x)\ge \lambda 2^{s\epsilon }\}. \end{aligned}$$

Then we have \(|{\mathcal {A}}_A|\le \frac{1}{\lambda A^2}\), \(|\mathcal A^s|\le \frac{1}{\lambda 2^{s\epsilon }}\), moreover for each J, \(J\subset {\mathcal {A}}_A\) or \(J\cap {\mathcal {A}}_A=\emptyset \) and \(J\subset {\mathcal {A}}^s\) or \(J\cap {\mathcal {A}}^s=\emptyset \).

Proof

Immediate from the Chebyshev inequality. The second part follows since \(E_{\{J\}} F_{A,N}\) is constant on each J. \(\square \)

By Lemma 6 above we can consider only the intervals J with \(j_{max}^A\le CA^3\) and \(j_{max}^s\le 2^{3s\epsilon }\) for every A, s. This will allow us to apply the classical Rademacher-Menschov estimate of the maximal function by a sum of a small number of square functions.

Lemma 7

Let \(\{a_i\}_{1\le i \le D}\) be a sequence of numbers. Then

$$\begin{aligned} \max _{1\le i\le D}|\sum _{1\le j \le i}a_j|^2\le C\sum _{k\le \log D} \sum _{s\le D}|\sum _{2^k s\le j \le 2^k (s+1)}a_j|^2 \end{aligned}$$
(43)

Proof

This is a well-known fact, see [12], Lemma 1. \(\square \)
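Lemma 7 can be checked numerically: every prefix of length i splits into at most \(\log _2 D+1\) aligned dyadic blocks (the binary expansion of i), and Cauchy-Schwarz then gives (43) with \(C\sim \log D\). A Python sketch of both sides of (43) (our names; D a power of two for simplicity):

```python
import math
import random

def rm_lhs(a):
    # left side of (43): the maximal squared partial sum
    best = s = 0.0
    for x in a:
        s += x
        best = max(best, s * s)
    return best

def rm_rhs(a):
    # right side of (43): squared block sums over all aligned dyadic blocks
    D = len(a)
    total = 0.0
    for k in range(int(math.log2(D)) + 1):
        step = 2 ** k
        for s in range(-(-D // step)):        # ceil(D / step) blocks at scale k
            total += sum(a[step * s: step * (s + 1)]) ** 2
    return total
```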

For fixed N we denote by \({\mathcal {J}}_N\) the family of dyadic intervals I of size \(8N^\alpha \le |I| < 16N^\alpha \). Then, for N fixed, \(\cup _{I\in {\mathcal {J}}_N}I={\mathbb {Z}}\). Moreover, any two \(I_1\in {\mathcal {J}}_{N_1}, I_2\in {\mathcal {J}}_{N_2}\) either have empty intersection, or one is a subset of the other. For a given interval I we denote by \(I^*\) an interval concentric with I, with a larger diameter. The exact ratio of the diameters depends on the constants appearing in Lemma 2, and will be obvious from the context.

Lemma 8

Let \(b^{A,N}_{s,i}\) be defined as in (22). Fix J and i. Let \(I_1\in {\mathcal {J}}_{N_1},I_2\in {\mathcal {J}}_{N_2}\), \(J\subset I_{1}\cap I_{2}\). Then for any fixed increasing sequence of integers \(\{S_j\}\) we have

$$\begin{aligned}&\sum _{y\in J}\sum _{j}|\sum _s\sum _{S_j< N\le S_{j+1}} \mu _N*b^{A,N}_{s,i}(y)|^2\phi _J(y)\nonumber \\&\quad \le \sum _{s_1,s_2} 2^{-\delta (s_1+s_2)} \sum _{N_2\le N_1}\frac{|J|}{( N_1N_2)^{\alpha }} \Vert b^{A,N_1}_{s_1,i}\Vert _{\ell ^1(I_1^*)}\Vert b^{A,N_2}_{s_2,i}\Vert _{\ell ^1(I_2^*)}+\\&\qquad +\sum _{N_1,s}\frac{|J|}{ N_1^{\alpha +1}}\sum _{x\in I_1^*}|b^{A,N_1}_{s,i}(x)|^2+\nonumber \\&\qquad +\sum _{N_1,s_1,s_2}\langle \,|Err_{N_1,J}\,b^{A,N_1}_{s_1,i}|,| b^{A,N_1}_{s_2,i}|\,\rangle \nonumber \\&\quad =D_I(J)+D_{II}(J)+D_{III}(J)\nonumber \end{aligned}$$
(44)

The terms \(D_{II}(J),D_{III}(J)\) appear only if \(N_1=N_2\). Moreover the RHS of (44) does not depend on the particular choice of the sequence \(S_j\).

Proof

We expand the square as a double sum

$$\begin{aligned} \begin{aligned}&\sum _{y\in J}\sum _{j}|\sum _s\sum _{S_j< N\le S_{j+1}} \mu _N*b^{A,N}_{s,i}(y)|^2\phi _J(y)\\&\qquad =\sum _{s_1,s_2}\sum _j \sum _{S_j<N_1, N_2\le S_{j+1}} \langle K_{N_1,N_2}^J b^{A,N_1}_{s_1,i}, b^{A,N_2}_{s_2,i}\rangle \end{aligned} \end{aligned}$$
(45)

and to each summand we apply the regularity estimates (7), (8) for \(K_{N_1,N_2}^J\); see Lemma 2.

Let \(s_1\ge s_2\). If \(N_1\ne N_2\) then we apply (8) and the standard cancellation argument (we omit the details). This leads to

$$\begin{aligned} \Vert K_{N_1,N_2}^J b^{A,N_1}_{s_1,i}\Vert _{\ell ^\infty }\le 2^{-s_1\delta }|J|(N_1N_2)^{-\alpha }\Vert b^{A,N_1}_{s_1,i}\Vert _{\ell ^1(I_1^*)} \end{aligned}$$
(46)

If \(N_1= N_2\), we obtain the same estimate for \({\tilde{K}}\) instead of K.

If \(s_1< s_2\) we apply the above argument to the kernels conjugate to \(K, {\tilde{K}}\), acting on \(b^{A,N_2}_{s_2,i}\). The estimate of the first summand in (44) follows.

If \(N_1=N_2\), \(s_1\ne s_2\), then the supports of the functions \( b^{A,N_1}_{s_1,i}\), \( b^{A,N_2}_{s_2,i}\) are disjoint, so the \(\delta _0\) term in (11) contributes nothing; for \(s_1=s_2\) it produces the term \(D_{II}(J)\) in (44), and the lemma follows. \(\square \)

4 Proof of Theorem 1

We now return to the proof of Theorem 1. Recall that we have reduced the proof to estimating the two operators \(H_i^*\) given by (28), \(i=1,2\). We proceed with the two cases. Case \(i=1\). Recall the definition (22):

$$\begin{aligned} b^{A,N}_{s,1}(x)=\sum _{Q\in {\mathcal {D}}_{A,N,s}^1} (1_{Q}(x)f_0^{\frac{\lambda N}{A}}(x)-E_{Q}f_0^{\frac{\lambda N}{A}}(x)) \end{aligned}$$

For a dyadic interval J (recall that we only consider J’s such that \(|J|\sim M^{\theta -\epsilon }\)) we need to estimate

$$\begin{aligned}&\Big |\Big \{x\in J:\max _B\Big |\sum _{N\le B}\sum _{s\ge 0}(\mu _N-{\check{\mu }}_N)*b^{A,N}_{s,1}(x)\Big |>\lambda A^{-\epsilon }\Big \}\Big | \\&\quad \le \Big |\Big \{x\in J:\max _B\Big |\sum _{N\le B}\sum _{s\ge 0}\mu _N*b^{A,N}_{s,1}(x)\Big |>\frac{1}{2}\lambda A^{-\epsilon }\Big \}\Big | \\&\quad \quad +\Big |\Big \{x\in J:\max _B\Big |\sum _{N\le B}\sum _{s\ge 0}{\check{\mu }}_N*b^{A,N}_{s,1}(x)\Big |>\frac{1}{2}\lambda A^{-\epsilon }\Big \}\Big |. \end{aligned}$$

We call the first summand \(L_J\). It is enough to estimate \(L_J\), since the second summand can be estimated analogously. We apply Lemma 5 (with \(\lambda _0=c\lambda A^{-\epsilon }\)), and thus there exist a sequence \(\{N_j\}_{j\le cA^3}\), depending on J but independent of \(x\in J\), and collections \({\mathcal {C}}_1, \mathcal C_2\) such that for some \(v\in \{1,2\}\) (\(v=v(A,J)\))

$$\begin{aligned} L_J&\le \Big |\Big \{x\in J:\max _{j\le cA^3}\Big |\sum _{N\le N_j, N\in C_v}\sum _{s\ge 0}\mu _N*b^{A,N}_{s,1}(x)\Big |>\lambda A^{-\epsilon }\Big \}\Big |\\&\quad +|\{x\in J:ER(x)>c\lambda A^{-\epsilon }\}|+|{\mathcal {A}}_A\cap J|, \end{aligned}$$

(regarding the range of j’s see Lemma 6 and the remark that follows). Now apply Lemma 7 to obtain a \(k\le c\log A\) such that if we put \(S_j=N_{2^kj}\) we have

$$\begin{aligned} \sum _JL_J&\le \sum _J\Big |\Big \{x\in J:\sum _{j}\Big |\sum _{S_j<N\le S_{j+1}}\sum _{s\ge 0}\mu _N*b^{A,N}_{s,1}(x)\Big |^2>\lambda ^2 A^{-3\epsilon }\Big \}\Big |\\&\quad +|\{x:ER(x)>c\lambda A^{-\epsilon }\}|+|{\mathcal {A}}_A|. \end{aligned}$$

By (42) (and Chebyshev’s inequality) and by Lemmas 5 and 6, the last two summands admit the estimates

$$\begin{aligned} |\{x:ER(x)>c\lambda A^{-\epsilon }\}|\le c\,\frac{\Vert f_0\Vert _{\ell ^1}A^{\epsilon }}{\lambda M^{\delta _e}} \le c\,\frac{1}{\lambda A^{\epsilon '}},\\ |{\mathcal {A}}_A|\le c\,\frac{1}{\lambda A^2}, \end{aligned}$$

since \(\epsilon \) can be chosen small enough and, as was pointed out before, only \(A\le N\) are relevant. We turn to the first summand. Applying Lemma 8 and Chebyshev’s inequality we get

$$\begin{aligned}&\sum _J\Big |\Big \{x\in J:\sum _{j}\Big |\sum _{\genfrac{}{}{0.0pt}{}{S_j<N\le S_{j+1}}{N\in {\mathcal {C}}_v}}\sum _{s\ge 0}\mu _N*b^{A,N}_{s,1}(x)\Big |^2>\lambda ^2 A^{-3\epsilon }\Big \}\Big |\nonumber \\&\quad \le \sum _{\genfrac{}{}{0.0pt}{}{s_i:2^{s_i}\ge A^{}}{i=1,2}}\sum _J\sum _{N_1\ge N_2}2^{-\delta (s_1+s_2)} \frac{A^{3\epsilon }|J|}{\lambda ^2 (N_1N_2)^{\alpha }}\nonumber \\&\quad \quad \times \Vert b^{A,N_1}_{s_1,1}\Vert _{\ell ^1(I_{N_1}^*(J))}\Vert b^{A,N_2}_{s_2,1}\Vert _{\ell ^1(I_{N_2}^*(J))}+\nonumber \\&\qquad +\sum _J\sum _{N}\sum _{s:2^s\ge A^{}}\frac{A^{3\epsilon }|J|}{\lambda ^2 N^{\alpha +1}}\Vert b^{A,N}_{s,1}\Vert _{\ell ^2(I_{N}^*(J))}^2\\&\qquad +\sum _J\sum _{\genfrac{}{}{0.0pt}{}{s_i:2^{s_i}\ge A^{}}{i=1,2}}\sum _{N}\frac{A^{3\epsilon }}{\lambda ^2}\big |\langle Err_{N,J}b^{A,N}_{s_1,1},b^{A,N}_{s_2,1}\rangle \big |\nonumber \\&\quad =I+II+III,\nonumber \end{aligned}$$
(47)

where \(I_{N}(J)\) is the unique dyadic interval from the family \({\mathcal {J}}_N\) which contains J. Again, we start with the first component I. Note that the sum of |J| over those J which share the same \(I_{N_2}(J)\) equals \(|I_{N_2}(J)|\sim N_2^\alpha \). So,

$$\begin{aligned} I\le&\sum _{\genfrac{}{}{0.0pt}{}{s_i:2^{s_i}\ge A}{i=1,2}}2^{-\delta (s_1+s_2)}\sum _{N_1\ge N_2}\sum _{I_{N_1}\subset {\mathcal {J}}_{N_1}} \sum _{\genfrac{}{}{0.0pt}{}{I_{N_2}\subset {\mathcal {J}}_{N_2}}{I_{N_2}\subset I_{N_1}}}\frac{A^{3\epsilon }}{\lambda ^2 N_1^{\alpha }}\\&\times \Vert b^{A,N_1}_{s_1,1}\Vert _{\ell ^1(I_{N_1}^*)}\Vert b^{A,N_2}_{s_2,1}\Vert _{\ell ^1(I_{N_2}^*)}. \end{aligned}$$

Further, for fixed \(N_1\) and \(I_{N_1}\in {\mathcal {J}}_{N_1}\) observe

$$\begin{aligned}&\sum _{N_2\le N_1}\sum _{\genfrac{}{}{0.0pt}{}{I\subset {\mathcal {J}}_{N_2}}{I\subset I_{N_1}}}\Vert b^{A,N_2}_{s_2,1}\Vert _{\ell ^1(I^*)}\nonumber \\&\quad =\sum _{N_2\le N_1}\sum _{\genfrac{}{}{0.0pt}{}{I\subset \mathcal J_{N_2}}{I\subset I_{N_1}}} \sum _{x\in I^*}\Big |\sum _{Q\in \mathcal D_{A,N_2,s_2}^1} \Big (1_Qf_0^{\frac{\lambda N_2}{A}}(x)-E_{Q}f_0^{\frac{\lambda N_2}{A}}(x)\Big )\Big |\nonumber \\&\quad \le 2\sum _{N_2\le N_1}\sum _{\genfrac{}{}{0.0pt}{}{I\subset \mathcal J_{N_2}}{I\subset I_{N_1}}} \sum _{x\in I^*}\sum _{Q\in \mathcal D_{A,N_2,s_2}^1} 1_Qf_0^{\frac{\lambda N_2}{A}}(x)\\&\quad \le c\sum _{\genfrac{}{}{0.0pt}{}{Q\in {\mathcal {D}}_{A,N_2,s_2}^1}{Q\subset I_{N_1}^*}}\sum _{x\in Q}1_Qf_0^{\frac{\lambda N_2}{A}}(x).\nonumber \end{aligned}$$
(48)

We have used the fact that the scales of the cubes in \({\mathcal {D}}_{A,N_2,s_2}^1\) are all smaller than \(N_2^\alpha \), and also that for fixed \(A,s_2\) the cubes in \({\mathcal {D}}_{A,N_2,s_2}^1\) are disjoint. Consequently, again using disjointness and (5),

$$\begin{aligned}&\sum _{N_2\le N_1}\sum _{\substack{Q\in {\mathcal {D}}_{A,N_2,s_2}^1\\ Q\subset I_{N_1}^*}}\sum _{x\in Q}1_Qf_0^{\frac{\lambda N_2}{A}}(x)\\&\quad \le C\sum _{Q\subset I_{N_1}^*}\sum _{x\in Q}\Big (\sum _{N_2\le N_1} 1_{\{Q\in {\mathcal {D}}_{A,N_2,s_2}^1\}}\,f_0^{\frac{\lambda N_2}{A}}(x)\Big )\nonumber \\&\quad \le C\sum _{Q\subset I_{N_1}^*}\sum _{x\in Q}\Big (\sum _{N_2\le N_1}f_0^{\frac{\lambda N_2}{A}}(x)\Big )\nonumber \\&\quad \le C\sum _{Q\subset I_{N_1}^*}\sum _{x\in Q}f_0(x)\nonumber \\&\quad \le C\sum _{Q\subset I^*_{N_1}}\lambda |Q|\nonumber \\&\quad \le C\lambda |I_{N_1}|\nonumber \\&\quad \le C\lambda N_1^\alpha .\nonumber \end{aligned}$$
(49)

Thus,

$$\begin{aligned} I&\le c\sum _{\substack{s_i:2^{s_i}\ge A\\ i=1,2}}2^{-\delta (s_1+s_2)}\sum _{N_1}\sum _{I\in {\mathcal {J}}_{N_1}} \frac{A^{3\epsilon }}{\lambda }\Vert b^{A,N_1}_{s_1,1}\Vert _{\ell ^1(I^*)}\\&\le c\sum _{N_1}\sum _{I\in {\mathcal {J}}_{N_1}}\frac{A^{3\epsilon }}{\lambda A^{\delta }}\Vert f_0^{\frac{\lambda N_1}{A}}\Vert _{\ell ^1(I^*)}\\&\le \frac{c}{\lambda A^{\frac{\delta }{2}}}, \end{aligned}$$

using disjointness of the supports of \(f_0^{\frac{\lambda N_1}{A}}\).
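For the reader's convenience, the elementary geometric summation absorbed in the last two displays can be recorded explicitly; we assume here, as the final exponent suggests, that \(\epsilon \) is small compared to \(\delta \) (say \(3\epsilon \le \frac{\delta }{2}\)):

$$\begin{aligned} \sum _{s:\,2^{s}\ge A}2^{-\delta s}\le c_\delta A^{-\delta }, \qquad \text {so that}\qquad \sum _{\substack{s_1,s_2\\ 2^{s_i}\ge A}}2^{-\delta (s_1+s_2)}\le c_\delta ^2\,A^{-2\delta }, \end{aligned}$$

and hence \(A^{3\epsilon }\cdot A^{-\delta }\le A^{-\frac{\delta }{2}}\), which is the gain recorded in the final line.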

We now turn to II in (47) and proceed similarly:

$$\begin{aligned} II&=\sum _J\sum _{N}\sum _{s:2^s\ge A}\frac{A^{3\epsilon }|J|}{\lambda ^2 N^{\alpha +1}}\Vert b^{A,N}_{s,1}\Vert _{\ell ^2(I_{N}^*(J))}^2\\&\le c\sum _J\sum _{N}\sum _{s:2^s\ge A}\frac{A^{3\epsilon }|J|}{\lambda ^2 N^{\alpha +1}}\Vert f^{A,N}_{s,1}\Vert _{\ell ^2(I_{N}^*(J))}^2\\&= c\sum _J\sum _{N}\sum _{s:2^s\ge A}\frac{A^{3\epsilon }|J|}{\lambda ^2 N^{\alpha +1}}\sum _{\substack{Q\subset I_{N}^*(J)\\ |Q|\sim (2^{-s}N)^\alpha }}\Vert (f^{A,N}_{s,1})^2\Vert _{\ell ^1(Q)}\\&\le c\sum _J\sum _{N}\frac{A^{3\epsilon }|J|}{\lambda ^2 N^{\alpha +1}}\sum _{Q\subset I_{N}^*(J)}\Vert (f^{A,N}_{s,1})^2\Vert _{\ell ^1(Q)}\\&\le c\sum _{N}\sum _{I_N}\frac{A^{3\epsilon }}{\lambda ^2 N^{\alpha +1}}\sum _{Q\subset I_N^*}\Vert (f^{A,N}_{s,1})^2\Vert _{\ell ^1(Q)}\sum _{J\subset I_N^*}|J|\\&\le c\sum _{N}\frac{A^{3\epsilon }}{\lambda ^2 N}\sum _{Q}\Vert (f^{A,N}_{s,1})^2\Vert _{\ell ^1(Q)}\\&\le c\sum _{N}\frac{A^{3\epsilon }}{\lambda A}\sum _{Q}\Vert f^{A,N}_{s,1}\Vert _{\ell ^1(Q)}\\&=c\frac{A^{3\epsilon }}{\lambda A}\sum _{N}\Vert f^{A,N}_{s,1}\Vert _{\ell ^1}\\&\le c\frac{\Vert f_0\Vert _{\ell ^1}}{\lambda A^{1-3\epsilon }}\\&=c\frac{1}{\lambda A^{1-3\epsilon }}. \end{aligned}$$
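In the passage from \(\Vert (f^{A,N}_{s,1})^2\Vert _{\ell ^1(Q)}\) to \(\Vert f^{A,N}_{s,1}\Vert _{\ell ^1(Q)}\) only the pointwise size of the truncation is used; assuming, as the construction of the decomposition suggests, the pointwise bound \(|f^{A,N}_{s,1}|\le \frac{c\lambda N}{A}\), one has the sketch

$$\begin{aligned} \Vert (f^{A,N}_{s,1})^2\Vert _{\ell ^1(Q)}=\sum _{x\in Q}|f^{A,N}_{s,1}(x)|^2\le \frac{c\lambda N}{A}\sum _{x\in Q}|f^{A,N}_{s,1}(x)|=\frac{c\lambda N}{A}\,\Vert f^{A,N}_{s,1}\Vert _{\ell ^1(Q)}, \end{aligned}$$

which turns the factor \(\frac{A^{3\epsilon }}{\lambda ^2 N}\) into \(\frac{A^{3\epsilon }}{\lambda A}\).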

We finally consider the last summand in (47). We use the following two properties of the kernel \(Err_{N,J}\) (Lemma 2): its support lies within \(c N^\alpha \) of the center of J in both variables and in the strip \(|x-y|\le CM\); moreover, for any x and y,

$$\begin{aligned} \sum _J|Err_{N,J}(x,y)|\le \frac{c}{N^\alpha }. \end{aligned}$$

This, combined with (5), leads to

$$\begin{aligned} \sum _J\sum _{y}|Err_{N,J}(x,y)||b^{A,N}_{s,1}(y)|\le \frac{c\lambda (N+(2^{-s}N)^\alpha )}{N^\alpha }. \end{aligned}$$

We have

$$\begin{aligned} III&=\sum _J\sum _{\substack{s_i:2^{s_i}\ge A\\ i=1,2}}\sum _{N}\frac{A^{3\epsilon }}{\lambda ^2}\big |\langle Err_{N,J}b^{A,N}_{s_1,1},b^{A,N}_{s_2,1}\rangle \big |\\&\le C\,\frac{A^{3\epsilon }}{\lambda ^2}\sum _{\substack{s_i:2^{s_i}\ge A\\ i=1,2}}\sum _{N}\sum _J\sum _{x,y\in I^*_N(J)} |Err_{N,J}(x,y)|\,|b^{A,N}_{s_1,1}(x)|\,|b^{A,N}_{s_2,1}(y)|\\&\le C\,\frac{A^{3\epsilon }}{\lambda ^2}\sum _{\substack{s_i:2^{s_i}\ge A,\,s_1\ge s_2\\ i=1,2}}\sum _{N} \sum _{I\in {\mathcal {J}}_N}\frac{\lambda \,(N+(2^{-{s_1}}N)^\alpha )}{N^\alpha } \Vert b^{A,N}_{s_2,1}\Vert _{\ell ^1(I^*)}\\&\le \frac{C\Vert f_0\Vert _{\ell ^1}}{\lambda }. \end{aligned}$$
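The final line uses only an elementary summation of the geometric factors; a sketch of the bookkeeping, valid since \(\alpha >1\) and \(2^{s_1}\ge A\), is

$$\begin{aligned} \frac{N+(2^{-s_1}N)^\alpha }{N^\alpha }=N^{1-\alpha }+2^{-\alpha s_1}, \qquad \sum _{s_1:\,2^{s_1}\ge A}2^{-\alpha s_1}\le c_\alpha A^{-\alpha }, \end{aligned}$$

while \(\sum _{I\in {\mathcal {J}}_N}\Vert b^{A,N}_{s_2,1}\Vert _{\ell ^1(I^*)}\le c\Vert f_0\Vert _{\ell ^1}\), assuming, as elsewhere in the argument, that the enlarged intervals \(I^*\) have bounded overlap.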

Case \(i=2\). Recall (24)

$$\begin{aligned} {\mathcal {D}}_{A,N,s}^2=\{Q: |Q|\sim (2^{-s}N)^\alpha ,\,2^s\le A\}, \end{aligned}$$

and (22)

$$\begin{aligned} b^{A,N}_{s,2}(x)=\sum _{Q\in {\mathcal {D}}_{A,N,s}^2} \Big (1_{Q}(x)f^{\frac{\lambda N}{A}}(x)-E_{Q}f^{\frac{\lambda N}{A}}(x)\Big ). \end{aligned}$$

We have

$$\begin{aligned}&\max _B\Big |\sum _s\sum _{A:\,A\ge 2^s}\sum _{N\le B}\mu _N*b_{s,2}^{A,N}(x)\Big |^2\\&\quad \le c_\epsilon \sum _s2^{\epsilon s}\max _B\Big |\sum _{A:\,A\ge 2^{s}}\sum _{N\le B}\mu _N*b_{s,2}^{A,N}(x)\Big |^2. \end{aligned}$$

Hence, for some s, we must have

$$\begin{aligned} \max _B\Big |\sum _{N\le B}\mu _N*b_{s,2}^{N}(x)\Big |^2>2^{-2\epsilon s}\lambda _0^2, \end{aligned}$$

where

$$\begin{aligned} b_{s,2}^N=\sum _{A:\,A\ge 2^{s}}b_{s,2}^{A,N}. \end{aligned}$$
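The pigeonholing leading from the Cauchy–Schwarz display above to the existence of a single scale s is standard; spelled out, with \(a_s\) denoting the inner sum,

$$\begin{aligned} \Big |\sum _s a_s\Big |^2\le \Big (\sum _s 2^{-\epsilon s}\Big )\Big (\sum _s 2^{\epsilon s}|a_s|^2\Big )\le c_\epsilon \sum _s 2^{\epsilon s}|a_s|^2, \end{aligned}$$

so if \(|a_s|^2\le 2^{-2\epsilon s}\lambda _0^2\) held for every s, the left-hand side would be at most \(c_\epsilon \lambda _0^2\sum _s 2^{-\epsilon s}\le c'_\epsilon \lambda _0^2\); after a harmless rescaling of \(\lambda _0\) this produces the scale s above.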

Now apply Lemmas 4, 5 and 6 exactly as in the case \(i=1\). The Theorem follows.

Remark. If \(|Q|\le M^{\theta -2\epsilon }\) then \(A\ge M^{\epsilon }\) by Lemma 3. By Lemma 8 or by [22] we have

$$\begin{aligned} \sum _N\sum _{I_N} |\mu _N*b^{A,N}_{Q}(x)|^2\le \frac{C\lambda }{M^{\frac{\delta \epsilon }{2}}}. \end{aligned}$$
(50)

This yields the desired maximal estimate since we have at most \(C\log M\) summands. We omit the details.
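The absorption of the logarithmic loss is elementary; a sketch, assuming M is large enough in terms of \(\delta \) and \(\epsilon \), is

$$\begin{aligned} C\log M\cdot \frac{C\lambda }{M^{\frac{\delta \epsilon }{2}}}\le \frac{C'\lambda }{M^{\frac{\delta \epsilon }{4}}}, \qquad \text {since}\quad \log M\le c_{\delta ,\epsilon }\,M^{\frac{\delta \epsilon }{4}}. \end{aligned}$$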