1 Introduction

In this paper we consider several theoretical aspects regarding N-term approximation in a Banach space \(({{\mathbb {X}}},\Vert \cdot \Vert )\), over a field \({{\mathbb {K}}}={{\mathbb {R}}}\) or \({{\mathbb {C}}}\).

A fundamental question in this topic is, given a dictionary \({{\mathcal {D}}}=\{{\varphi }_i\}_{i\in {{\mathcal {I}}}}\) in \({{\mathbb {X}}}\), and the corresponding set of N-sparse vectors

$$\begin{aligned} \Sigma _N=\Sigma _N({{\mathcal {D}}}):=\left\{ \sum _{j=1}^Nc_j{\varphi }_{i_j}\mid c_j\in {{\mathbb {K}}}, \;{\varphi }_{i_j}\in {{\mathcal {D}}}\right\} , \end{aligned}$$

to find constructive procedures (algorithms) \({\mathscr {A}}_N:{{\mathbb {X}}}\rightarrow \Sigma _N\) such that, for all \(f\in {{\mathbb {X}}}\), the quantity \(\Vert f-{\mathscr {A}}_N(f)\Vert \) is as close as possible to the best error of N-term approximation, defined by

$$\begin{aligned} \sigma _N(f,{{\mathcal {D}}}):=\mathop \textrm{dist}(f,\Sigma _N)=\inf \Big \{\Vert f-g\Vert \mid g\in \Sigma _N({{\mathcal {D}}})\Big \}. \end{aligned}$$

Once an algorithm \({\mathscr {A}}_N\) is fixed, one can quantify the above statement by considering the associated Lebesgue-type inequality, which amounts to finding the smallest value of \(\phi (N)\) so that

$$\begin{aligned} \Vert f-{\mathscr {A}}_{\phi (N)}(f)\Vert \,\le \,C\,\sigma _N(f),\quad \forall \;f\in {{\mathbb {X}}}, \end{aligned}$$
(1.1)

with C a fixed universal constant (if it exists). Observe, in particular, that (1.1) guarantees exact recovery of all N-sparse signals after \(\phi (N)\) iterations, that is

$$\begin{aligned} {\mathscr {A}}_{\phi (N)}(f)=f,\quad \forall \;f\in \Sigma _N({{\mathcal {D}}}). \end{aligned}$$

Ideally, one would like to find algorithms \({\mathscr {A}}_N\) so that (1.1) holds with \(\phi (N)=N\) (and \(C=1\)). But this is hardly possible in many situations (a notable exception being when \({{\mathcal {D}}}\) is an orthonormal basis in a Hilbert space). For instance, in the classical case when \({{\mathcal {D}}}\) is the trigonometric system in \(L^p({{\mathbb {T}}})\), \(p\not =2\), it is still a relevant open question to find one such (constructive) algorithm.

In this paper we shall be interested in the Weak Chebyshev Greedy Algorithm (WCGA), which was introduced by Temlyakov in [12] as a generalization to Banach spaces of the celebrated Orthogonal Matching Pursuit (OMP) from Hilbert spaces. We refer to [13, 15, 16], and references therein, for background on this topic.

Lebesgue-type inequalities for the WCGA were proved in [7, 14]; see also [16, Chapter 8] for a historical overview. One of the features of the WCGA is that it has good approximation properties for the trigonometric system in \(L^p\). Indeed, it was shown in [14, (4.3)] that, if \(p>2\), then Lebesgue inequalities hold with only \(\phi (N)=O(N\log N)\) iterations. This seems to be the best known result with a constructive algorithm in that setting. Likewise, for the univariate Haar system in \(L^p\), if \(1<p\le 2\), then \(\phi (N)=O(N)\) iterations suffice; see [14, (4.7)].

The above results are special cases of a deep theorem proved by Temlyakov in [14, Theorem 2.8], which we describe in detail below. In that theorem, the number of iterations \(\phi (N)\) is estimated in terms of some intrinsic properties of the pair \(({{\mathbb {X}}},{{\mathcal {D}}})\), namely, the power type of the modulus of smoothness of \(({{\mathbb {X}}},\Vert \cdot \Vert )\), and the power function associated with the so-called property \({\texttt {A3}}\) of \({{\mathcal {D}}}\); see (1.16) below.

Our main result in this paper, Theorem 1.12, will be a generalization of Temlyakov’s theorem, which allows us to cover situations in which the modulus of smoothness and the \({\texttt {A3}}\) parameters are not necessarily power functions. This is actually needed in some special cases, such as when \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\), for which additional log factors appear naturally. Our next results, Theorems 1.17 and 5.20, will be applications of Theorem 1.12 to this setting, for two special dictionaries, the Haar and the trigonometric system.

We next give a more detailed description of these results.

1.1 Statements of Results

We assume that \(({{\mathbb {X}}},\Vert \cdot \Vert )\) is a uniformly smooth Banach space, meaning that its modulus of smoothness

$$\begin{aligned} \rho _{{\mathbb {X}}}(t):=\sup _{\Vert f\Vert =\Vert g\Vert =1} \,\tfrac{1}{2}\,\Big (\Vert f+tg\Vert +\Vert f-tg\Vert -2\Vert f\Vert \Big ),\quad t\in {{\mathbb {R}}}, \end{aligned}$$

satisfies \(\rho _{{\mathbb {X}}}(t)=o(t)\) as \(t\rightarrow 0\). Given \(f\in {{\mathbb {X}}}\) with \(f\not =0\), we let \(F_f\in {{\mathbb {X}}}^*\) be the associated norming functional, that is, the (unique) element in \({{\mathbb {X}}}^*\) such that

$$\begin{aligned} \Vert F_f\Vert _{{{\mathbb {X}}}^*}=1,{\quad \text{ and }\quad }F_f(f)=\Vert f\Vert . \end{aligned}$$

Uniqueness follows from the smoothness of the norm \(\Vert \cdot \Vert \).

We say that \({{\mathcal {D}}}=\{{\varphi }_i\}_{i\in {{\mathcal {I}}}}\) is a dictionary in \({{\mathbb {X}}}\), if it consists of non-null vectors whose closed linear span is \({{\mathbb {X}}}\), that is

$$\begin{aligned} \big [{\varphi }_i\big ]_{i\in {{\mathcal {I}}}}={{\mathbb {X}}}. \end{aligned}$$

We do not assume the dictionary elements to be normalized, although as a consequence of later properties \({{\mathcal {D}}}\) will be semi-normalized, that is

$$\begin{aligned} \mathfrak {c_0}\le \Vert {\varphi }_i\Vert \le \mathfrak {c_1}, \quad \forall \,{\varphi }_i\in {{\mathcal {D}}}, \end{aligned}$$

for some constants \(\mathfrak {c_1}\ge \mathfrak {c_0}>0\); see Sect. 2.1 below.

Definition 1.2

(Weak Chebyshev Greedy Algorithm (WCGA)) Given a fixed \(\tau \in (0,1]\), a \(\tau \)-WCGA associated with \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) is any collection of mappings

$$\begin{aligned} {\mathscr {G}}_N:{{\mathbb {X}}}\rightarrow \Sigma _N({{\mathcal {D}}}),\quad N=1,2,\ldots \end{aligned}$$

with the following properties:

Given \(f\in {{\mathbb {X}}}\setminus \{0\}\), we let \(f_0:=f\) and define inductively vectors \({\varphi }_{i_1},\ldots , {\varphi }_{i_n}\) in \({{\mathcal {D}}}\) and \(f_1,\ldots , f_n\in {{\mathbb {X}}}\) by the following procedure: at step \(n+1\) we pick any \({\varphi _{i_{n+1}}}\in {{\mathcal {D}}}\) such that

$$\begin{aligned} |F_{f_n}({\varphi _{i_{n+1}}})|\ge \,\tau \,\sup _{{\varphi }\in {{\mathcal {D}}}}|F_{f_n}({\varphi })|, \end{aligned}$$
(1.3)

and let \({\mathscr {G}}_{n+1}(f)\) be any element in \([{\varphi }_{i_1},\ldots ,{\varphi }_{i_{n+1}}]\) such that

$$\begin{aligned} \Vert f-{\mathscr {G}}_{n+1}(f)\Vert =\mathop \textrm{dist}\big (f,[{\varphi }_{i_1},\ldots ,{\varphi }_{i_{n+1}}]\big ). \end{aligned}$$

Then we set \(f_{n+1}=f-{\mathscr {G}}_{n+1}(f)\), and iterate the process (indefinitely, or until the remainder \(f_{n+1}=0\)).

If at some stage we have \(f_n=0\), then we just let \({\mathscr {G}}_{n+k}(f)={\mathscr {G}}_n(f)=f\) for all \(k\ge 1\).
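
To fix ideas, the following is a minimal numerical sketch of Definition 1.2 in the Euclidean case \({{\mathbb {X}}}={{\mathbb {R}}}^d\), with a finite dictionary stored as the columns of a matrix (the names and parameters below are illustrative and not taken from the paper). In that setting the norming functional of the residual \(f_n\) is \(F_{f_n}(g)=\langle g,f_n\rangle /\Vert f_n\Vert \), the Chebyshev step is a least-squares projection, and for \(\tau =1\) the scheme reduces to Orthogonal Matching Pursuit.

```python
import numpy as np

def wcga(f, dictionary, n_iter, tau=1.0):
    """Illustrative tau-WCGA in the Euclidean case (columns of `dictionary` = elements of D)."""
    residual = f.copy()
    support = []
    approx = np.zeros_like(f)
    for _ in range(n_iter):
        if np.linalg.norm(residual) == 0:
            break
        # Greedy step (1.3): |F_{f_n}(phi_i)| = |<phi_i, f_n>| / ||f_n||; the common
        # factor 1/||f_n|| does not affect the tau-threshold comparison.
        scores = np.abs(dictionary.T @ residual)
        candidates = np.flatnonzero(scores >= tau * scores.max())
        support.append(int(candidates[0]))   # "any" admissible choice; take the first
        # Chebyshev step: best approximation from span{phi_i : i in support},
        # which in the Euclidean norm is a least-squares projection.
        cols = dictionary[:, support]
        coeffs, *_ = np.linalg.lstsq(cols, f, rcond=None)
        approx = cols @ coeffs
        residual = f - approx
    return approx, support

# Toy usage: recover a 3-sparse vector under a random normalized dictionary.
rng = np.random.default_rng(0)
dictionary = rng.standard_normal((40, 100))
dictionary /= np.linalg.norm(dictionary, axis=0)
f = 2.0 * dictionary[:, 3] - 1.5 * dictionary[:, 17] + 0.5 * dictionary[:, 42]
approx, support = wcga(f, dictionary, n_iter=6)
print(sorted(support), np.linalg.norm(f - approx))
```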

Remark 1.4

Note that such algorithms can always be constructed when \(\tau <1\), and for some dictionaries also when \(\tau =1\) (namely, when the sup in (1.3) is attained within \({{\mathcal {D}}}\)).

We next define the three key properties that are needed to prove Lebesgue-type inequalities for WCGA. The first one is a generalization of a property given in [2, Definition 1.13].

Definition 1.5

Let Q(t) be a positive increasing function for \(t\in (0,\infty )\), with \(Q(0)=0\). We say that \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) satisfies property \({\texttt {D}}(Q)\) if

$$\begin{aligned} \mathop \textrm{dist}(f,[{\varphi }])\le \Vert f\Vert \,\Big (1-Q(|F_f({\varphi })|)\Big ),\quad \forall \,{\varphi }\in {{\mathcal {D}}},\,f\in {{\mathbb {X}}}\setminus \{0\}. \end{aligned}$$
(1.6)
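
For orientation, in a Hilbert space with a normalized dictionary one has \(F_f(g)=\langle g,f\rangle /\Vert f\Vert \) and \(\mathop \textrm{dist}(f,[{\varphi }])^2=\Vert f\Vert ^2-|\langle f,{\varphi }\rangle |^2\), so that

$$\begin{aligned} \mathop \textrm{dist}(f,[{\varphi }])=\Vert f\Vert \,\sqrt{1-|F_f({\varphi })|^2}\,\le \,\Vert f\Vert \,\Big (1-\tfrac{1}{2}\,|F_f({\varphi })|^2\Big ), \end{aligned}$$

that is, property \({\texttt {D}}(Q)\) holds in this case with \(Q(t)=t^2/2\).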

The next definition coincides with property \({\texttt {A2}}\) from [7, 14].

Definition 1.7

Let \(N<D\) be positive integers and \(k_N>0\). We say that \(\Sigma _N({{\mathcal {D}}})\in {\texttt {A2}}(k_N, D)\) if

$$\begin{aligned} \Vert \sum _{j\in A}a_j{\varphi }_j\Vert \le k_N\,\Vert \sum _{j\in B}a_j{\varphi }_j\Vert , \quad \forall \,a_j\in {{\mathbb {K}}}, \;\forall \,A\subset B\mid |A|\le N,\;|B|< D\nonumber \\ \end{aligned}$$
(1.8)

If the above holds for all \(D<\infty \), we just write \(\Sigma _N({{\mathcal {D}}})\in {\texttt {A2}}(k_N)\).

Our third definition is a slight generalization of property \({\texttt {A3}}\) from [14].

Definition 1.9

Let \(N<D\) be positive integers and let \(\{H(k)\}_{k=1}^\infty \) be an increasing sequence of positive numbers. We say \(\Sigma _N({{\mathcal {D}}})\in {\texttt {A3}}(H,D)\) if

$$\begin{aligned} \sum _{j\in A}|a_j|\le H(|A|)\,\Vert \sum _{j\in B}a_j{\varphi }_j\Vert , \quad \forall \,a_j\in {{\mathbb {K}}}, \;\forall \,A\subset B\mid |A|\le N,\;|B|< D.\nonumber \\ \end{aligned}$$
(1.10)

If the above holds for all \(D<\infty \), we just write \(\Sigma _N({{\mathcal {D}}})\in {\texttt {A3}}(H)\).

Finally, we recall that a positive sequence \(\{G(k)\}_{k=1}^\infty \) is called 1-quasi-convex if

$$\begin{aligned} \frac{G(k)}{k}\,\le \, \frac{G(k+1)}{k+1},\quad \forall \, k\in {{\mathbb {N}}}. \end{aligned}$$

As an example, if G(t) is a positive convex function in \((0,\infty )\) with \(G(0^+)=0\), then \(\{G(k)\}_{k=1}^\infty \) is 1-quasi-convex. This is the case, for instance, for the functions

$$\begin{aligned} G(t)=t^p\,\big (\log (c+t)\big )^{\alpha }, \end{aligned}$$
(1.11)

if \(p=1\) and \({\alpha }\ge 0\), or if \(p>1\) and \({\alpha }\in {{\mathbb {R}}}\) (for a sufficiently large \(c\ge e\)).
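
Indeed, for the functions in (1.11) one can also verify 1-quasi-convexity directly: writing \(G(t)/t=t^{p-1}\big (\log (c+t)\big )^{\alpha }\), a computation gives

$$\begin{aligned} \frac{d}{dt}\Big (\frac{G(t)}{t}\Big )\,=\,t^{p-2}\,\big (\log (c+t)\big )^{{\alpha }-1}\,\Big [(p-1)\log (c+t)+\frac{{\alpha }\,t}{c+t}\Big ]\,\ge \,0,\quad t>0, \end{aligned}$$

since the bracket is nonnegative when \(p=1\) and \({\alpha }\ge 0\), or when \(p>1\) and \((p-1)\log c\ge |{\alpha }|\); hence \(G(k)/k\) is non-decreasing in k.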

The precise statement of our main result is now the following.

Theorem 1.12

Let \(({{\mathbb {X}}},\Vert \cdot \Vert )\) be a Banach space, \({{\mathcal {D}}}\) a dictionary, \(\tau \in (0,1]\) and \({\mathscr {G}}_n:{{\mathbb {X}}}\rightarrow \Sigma _n\) a \(\tau \)-WCGA. Let \(D>N\ge 1\) be fixed. Let \(k_N>0\) and let Q(t), H(n) be positive and increasing functions such that the following properties hold

  1. (i)

    \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) satisfies \({\texttt {D}}(Q)\)

  2. (ii)

    \(\Sigma _N\) satisfies property \({\texttt {A2}}(k_N,D)\).

  3. (iii)

    \(\Sigma _N\) satisfies property \({\texttt {A3}}(H,D)\).

Let \({\lambda }_1>1\). Assume further that the sequence

$$\begin{aligned} G(n)= \left[ Q\left( \frac{c(\tau )}{H(n)}\right) \right] ^{-1}, \quad \text{ with }\quad c(\tau )\,=\,\tfrac{\tau }{2}\left( 1-\tfrac{1}{\sqrt{{\lambda }_1}}\right) \end{aligned}$$
(1.13)

is 1-quasi-convex. If we let

$$\begin{aligned} \phi (N):=\,8\,\log \Big [\frac{8(1+{\lambda }_1)k_N}{\sqrt{{\lambda }_1}-1}\Big ]\,G(2N), \end{aligned}$$
(1.14)

then it holds

$$\begin{aligned} \Big \Vert x-{\mathscr {G}}_{\phi (N)}(x)\Big \Vert \le {\lambda }_1\,\Vert x-\Phi \Vert ,\quad \forall \,x\in {{\mathbb {X}}}, \;\Phi \in \Sigma _N, \end{aligned}$$
(1.15)

provided that \(N+\phi (N)< D\).

We now make some comments about this theorem.

  1. (a)

    The result obtained by Temlyakov in [14, Theorem 2.8] corresponds to the case when \({\lambda }_1\) is a (possibly large) universal constant, and

    $$\begin{aligned} H(N)=V_N\,N^r {\quad \text{ and }\quad }Q(t)=c\,t^{q'}, \end{aligned}$$

    where \(q>1\) is the power type of the modulus of smoothness, i.e. \(\rho _{{\mathbb {X}}}(t)=O(t^q)\). In that case, the required number of iterations becomes

    $$\begin{aligned} \phi (N)\,=\, C_1\,(V_N/\tau )^{q'}\,\log (1+k_N)\,N^{rq'}, \end{aligned}$$
    (1.16)

    for some \(C_1>0\), provided that \(rq'\ge 1\); a worked substitution is sketched after this list. Our contribution additionally gives an explicit form for the constants as the parameter \({\lambda }_1\) approaches 1.

  2. (b)

    As we show in Proposition 2.3 below, if \({{\mathcal {D}}}\) is normalized, then condition \({\texttt {D}}(Q)\) always holds with

    $$\begin{aligned} Q(t)=2{\delta }_{{{\mathbb {X}}}^*}(t/2), \end{aligned}$$

    where \({\delta }_{{{\mathbb {X}}}^*}(t)\) is the modulus of convexity of the dual space \({{\mathbb {X}}}^*\). This is also a new result. In many practical cases the asymptotic behavior of \({\delta }_{{{\mathbb {X}}}^*}(t)\) is well-known, so one can use property \({\texttt {D}}(Q)\) with no need to compute \(\rho _{{\mathbb {X}}}(t)\).

  3. (c)

    As was discussed in [2, Remark 2.10], in some special cases it is possible to prove that \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) satisfies property \({\texttt {D}}(Q)\) with a function Q(t) which is considerably better than \({\delta }_{{{\mathbb {X}}}^*}(t)\) (for t near 0). For instance, if \({{\mathbb {X}}}=\ell ^p\) and \({{\mathcal {D}}}\) is the canonical basis, then one can take \(Q(t)=c_pt^{p'}\), which gives better results than \({\delta }_{{{\mathbb {X}}}^*}(t)=O(t^{\max \{p',2\}})\) when \(p>2\). Other examples (with power type) were given in [2, Proposition 4.12 and Lemma 5.7].

  4. (d)

    The assumption that G(n) in (1.13) is 1-quasi-convex is only made for convenience. Alternatively, one could replace G(n) by any convex majorant (hence, 1-quasi-convex). In practice, quasi-convexity is easily verified after substituting the functions Q(t) and H(n) into (1.13); see the example in (1.11).

  5. (e)

    As in [14], the conclusion (1.15) in the previous theorem also holds when the assumptions \({\texttt {A2}}\) and \({\texttt {A3}}\) are required only on the individual sparse element \(\Phi =\sum _{j\in T}x_j{\varphi }_j\), with \(|T|\le N\) (and not necessarily in all \(\Phi \in \Sigma _N\)). Namely, in this case the requirement would be that (1.8) and (1.10) must hold for all \(A\subset T\) and all scalars \(a_j\in {{\mathbb {K}}}\) such that \(a_j=x_j\), \(j\in A\).
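
To illustrate item (a) above, substituting \(H(N)=V_N\,N^r\) and \(Q(t)=c\,t^{q'}\) into (1.13)–(1.14) gives, for a fixed \({\lambda }_1\) and with implicit constants depending on \({\lambda }_1\), c, r and \(q'\),

$$\begin{aligned} \phi (N)\,=\,8\,\log \Big [\frac{8(1+{\lambda }_1)k_N}{\sqrt{{\lambda }_1}-1}\Big ]\,\Big [Q\Big (\frac{c(\tau )}{H(2N)}\Big )\Big ]^{-1}\,=\,\frac{8}{c}\,\log \Big [\frac{8(1+{\lambda }_1)k_N}{\sqrt{{\lambda }_1}-1}\Big ]\,\Big (\frac{V_{2N}\,(2N)^{r}}{c(\tau )}\Big )^{q'}\,\lesssim \,(V_{2N}/\tau )^{q'}\,\log (1+k_N)\,N^{rq'}, \end{aligned}$$

which recovers (1.16) whenever \(V_{2N}\lesssim V_N\).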

Our second result is an application of Theorem 1.12 to the case when \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\); see Sect. 4 below for the precise definition. We stress that, when \(1<p\le 2\), the number of iterations derived from the above theorem, namely

$$\begin{aligned} \phi (N)=O\Big (N\,\big (\log (e+N)\big )^{p'|{\alpha }|}\Big ), \end{aligned}$$

is actually (asymptotically) optimal for all \({\alpha }\in {{\mathbb {R}}}\).

Theorem 1.17

Let \(1<p<\infty \) and \({\alpha }\in {{\mathbb {R}}}\), and let \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\) be as in Sect. 4. Let \({{\mathcal {D}}}\) be the (normalized) Haar basis in \({{\mathbb {X}}}\). Then

  1. (a)

    there exists a constant \(C>1\) such that the WCGA satisfies

    $$\begin{aligned} \Big \Vert f-{\mathscr {G}}_{\phi (N)}(f)\Big \Vert \le \,2\,\sigma _N(f),\quad \forall \,f\in {{\mathbb {X}}},\;N\in {{\mathbb {N}}}, \end{aligned}$$
    (1.18)

    where

    $$\begin{aligned} \phi (N)=\left\{ \begin{array}{ll} C\,N^\frac{2}{p'}\,\big (\log (e+N)\big )^{2{\alpha }_+} &{} \text{ when }\ p>2\\ C\,N\,\big (\log (e+N)\big )^{p'|{\alpha }|} &{} \text{ when }\ 1<p\le 2. \end{array}\right. \end{aligned}$$
    (1.19)
  2. (b)

    if for some sequence \(\psi (N)\) the WCGA satisfies

    $$\begin{aligned} \Big \Vert f-{\mathscr {G}}_{\psi (N)}(f)\Big \Vert \le \,C\,\sigma _N(f),\quad \forall \,f\in {{\mathbb {X}}},\;N\in {{\mathbb {N}}}, \end{aligned}$$
    (1.20)

    then necessarily \(\psi (N)\ge c'\, N\,\big (\log (e+N)\big )^{|{\alpha }|p'}\), for some \(c'>0\).

Remark 1.21

We remark that, when \(p>2\), it is an open question already for \(L^p\) spaces (case \({\alpha }=0\)) whether \(\phi (N)\approx N^{2/p'}\) iterations are necessary to ensure (1.18); see [16, Open Problem 8.3, p. 448].

Finally, in Sect. 5 we give a similar application in the case that \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\) and \({{\mathcal {D}}}=\{e^{inx}\}_{n\in {{\mathbb {Z}}}}\) is the trigonometric system. See Theorem 5.20 below for details.

2 Preliminaries

2.1 About Seminormalization of \({{\mathcal {D}}}\)

We claim that the two properties \({\texttt {D}}(Q)\) and \({\texttt {A3}}(H,D)\) imply that the dictionary \({{\mathcal {D}}}\) must be semi-normalized. Indeed, if \(\Sigma _1\) satisfies \({\texttt {A3}}(H,D)\) then

$$\begin{aligned} \Vert {\varphi }\Vert \ge 1/H(1), \quad \forall \, {\varphi }\in {{\mathcal {D}}}. \end{aligned}$$
(2.1)

On the other hand, \({\texttt {D}}(Q)\) implies that \(Q\big (|F_f({\varphi })|\big )\le 1\) for all \({\varphi }\in {{\mathcal {D}}}\) and \(f\in {{\mathbb {X}}}{\setminus }\{0\}\). Setting \(f={\varphi }\) and using \(F_{\varphi }({\varphi })=\Vert {\varphi }\Vert \), this gives

$$\begin{aligned} \Vert {\varphi }\Vert \le Q^{-1}(1), \quad \forall \, {\varphi }\in {{\mathcal {D}}}. \end{aligned}$$
(2.2)

Conversely, suppose that \({{\mathcal {D}}}=\{{\varphi }_j\}\) is a dictionary satisfying any of the properties \({\texttt {D}}(Q)\), \({\texttt {A2}}(k_N,D)\) or \({\texttt {A3}}(H,D)\), and let \({\tilde{\varphi }}_j={{\lambda }_j}{\varphi }_j\) for scalars \({\lambda }_j\) such that

$$\begin{aligned} 0<\mathfrak {c_0}\le |{\lambda }_j|\le \mathfrak {c_1}, \quad \forall \,j. \end{aligned}$$

It is then easily seen that the new dictionary \({\widetilde{{{\mathcal {D}}}}}=\{{\tilde{\varphi }}_j\}\) satisfies the corresponding properties with new parameters, namely

$$\begin{aligned} {\texttt {D}}\big (Q(\cdot /\mathfrak {c_1})\big ), \quad {\texttt {A2}}(k_N,D) {\quad \text{ or }\quad }{\texttt {A3}}(H/\mathfrak {c_0},D). \end{aligned}$$

We also remark that if \({\mathscr {G}}_N\) is a \(\tau \)-WCGA for \({{\mathcal {D}}}\), then it is also a \((\tau \mathfrak {c_0}/\mathfrak {c_1})\)-WCGA for \(\widetilde{{{\mathcal {D}}}}\).

2.2 About Condition \({\texttt {D}}(Q)\)

We give a practical criterion which ensures that condition \({\texttt {D}}(Q)\) holds. Let \(({{\mathbb {X}}},\Vert \cdot \Vert )\) be a Banach space with modulus of smoothness

$$\begin{aligned} \rho _{{{\mathbb {X}}}}(t)=\sup _{\Vert x\Vert =\Vert y\Vert =1}\tfrac{1}{2}\Big (\Vert x+ty\Vert +\Vert x-ty\Vert -2\Vert x\Vert \Big ), \quad t\in {{\mathbb {R}}}. \end{aligned}$$

We denote by \({\delta }(s)={\delta }_{{{\mathbb {X}}}}(s)\) its modulus of convexity, that is

$$\begin{aligned} {\delta }_{{{\mathbb {X}}}}(s)=\inf _{{\begin{array}{c} {\Vert x\Vert =\Vert y\Vert =1}\\ {\Vert x-y\Vert =s} \end{array}}}\Big (\frac{\Vert x\Vert +\Vert y\Vert }{2}-\Big \Vert \frac{x+y}{2}\Big \Vert \Big ), \quad s\in [0,2]. \end{aligned}$$

Next, we consider the following related function, introduced by Figiel [3],

$$\begin{aligned} {{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s):=\sup _{t\ge 0}\Big (\tfrac{1}{2}st-\rho _{{{\mathbb {X}}}}(t)\Big ), \quad s\ge 0. \end{aligned}$$

Assume for simplicity that \({{\mathbb {X}}}\) is uniformly smooth, that is \(\rho _{{\mathbb {X}}}(t)=o(t)\) when \(t\rightarrow 0\) (so in particular, \({{\mathbb {X}}}\) is reflexive). Then, it is easily seen that \(Q(s)={{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s)\) is a convex increasing function with \(Q(0)=0\). Moreover, it is shown in [3, Proposition 1] (see also [6, Proposition 1.e.6]) that \({{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s)\) is “equivalent” to \({\delta }_{{{\mathbb {X}}}^*}(s)\) (for small s), in the sense that

$$\begin{aligned} {\delta }_{{{\mathbb {X}}}^*}(s/2) \le {{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s)\le {\delta }_{{{\mathbb {X}}}^*}(s), \quad s\in [0,2]. \end{aligned}$$

Also, \({{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s)\) is the greatest convex minorant of \({\delta }_{{{\mathbb {X}}}^*}(s)\). In particular, \({{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}={\delta }_{{{\mathbb {X}}}^*}\) when the latter is a convex function. In many examples of Banach spaces \({{\mathbb {X}}}\), the behavior of the function \({\delta }_{{{\mathbb {X}}}^*}(s)\) is well known (sometimes quite explicitly). For instance, if \({{\mathbb {X}}}=L^p\), \(1<p<\infty \), then

$$\begin{aligned} {\delta }_{L^{p'}}(s) =c_q \,s^{q}+ o(s^q),\quad \text{ with }~q=\max \{2,p'\}; \end{aligned}$$

see [6, p.63]. Our main result in this section is the following.

Proposition 2.3

If \(({{\mathbb {X}}},\Vert \cdot \Vert )\) is uniformly smooth, then every normalized dictionary \({{\mathcal {D}}}\) in \({{\mathbb {X}}}\) satisfies property \({\texttt {D}}(Q)\) with \(Q(s)=2{{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s)\).

Proof

It suffices to prove (1.6) for \(f=x\in {{\mathbb {X}}}\) with \(\Vert x\Vert =1\). Let \(F_x\in {{\mathbb {X}}}^*\) be the norming functional of x, and given \({\varphi }\in {{\mathcal {D}}}\), let \(\nu =\overline{\text{ sign }}\,F_x({\varphi })\). Then, for every \(t\ge 0\), using [2, Proposition 2.1], we have

$$\begin{aligned} \mathop \textrm{dist}(x,[{\varphi }])\le \Vert x-\nu t{\varphi }\Vert \le \Vert x\Vert -t\,|F_x({\varphi })|\,+\,2\,\rho _{{{\mathbb {X}}}}(t). \end{aligned}$$

Taking the infimum over all \(t\ge 0\) we obtain

$$\begin{aligned} \mathop \textrm{dist}(x,[{\varphi }])\le 1-2\sup _{t\ge 0}\Big (\tfrac{1}{2}t\,|F_x({\varphi })|\,-\,\rho _{{{\mathbb {X}}}}(t)\Big )=1-2{{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(|F_x({\varphi })|). \end{aligned}$$

\(\square \)

Remark 2.4

If the dictionary is not normalized, but we assume that \(0<\Vert {\varphi }\Vert \le \mathfrak {c_1}\), for all \({\varphi }\in {{\mathcal {D}}}\), then the previous result gives

$$\begin{aligned} \mathop \textrm{dist}(x,[{\varphi }])\le 1-2\,{{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(|F_x({\varphi }/\Vert {\varphi }\Vert )|)\le 1-2\,{{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(|F_x({\varphi })|/\mathfrak {c_1}). \end{aligned}$$

So property \({\texttt {D}}(Q)\) holds with \(Q(s)=2\,{{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s/\mathfrak {c_1})\), which is also a function equivalent to \({\delta }_{{{\mathbb {X}}}^*}(s)\).
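
To illustrate, if the modulus of smoothness has power type q, say \(\rho _{{\mathbb {X}}}(t)\le \gamma \,t^q\) for all \(t\ge 0\) and some \(q\in (1,2]\), \(\gamma >0\), then, since \({{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s)\ge \sup _{t\ge 0}(\tfrac{1}{2}st-\gamma t^q)\) and this last supremum is attained at \(t=(s/(2q\gamma ))^{1/(q-1)}\), a direct computation gives

$$\begin{aligned} {{\widetilde{{\delta }}}}_{{{\mathbb {X}}}^*}(s)\,\ge \,\sup _{t\ge 0}\Big (\tfrac{1}{2}\,st-\gamma \,t^{q}\Big )\,=\,(q-1)\,\gamma \,\Big (\frac{s}{2q\gamma }\Big )^{q'},\quad \text{ with }~q'=\tfrac{q}{q-1}, \end{aligned}$$

so in this situation property \({\texttt {D}}(Q)\) holds with a power function \(Q(s)\approx s^{q'}\), in agreement with the power-type case discussed in comment (a) after Theorem 1.12.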

2.3 About Condition \({\texttt {A3}}(H)\)

In practice, it is quite common that \(({{\mathbb {X}}},{{\mathcal {D}}})\) satisfies properties \({\texttt {A2}}\) or \({\texttt {A3}}\) with depth \(D=\infty \). Our first observation is that this implies that \({{\mathcal {D}}}\) has a biorthogonal dual system.

Lemma 2.5

Let \({{\mathcal {D}}}=\{{\varphi }_j\}_{j=1}^\infty \) be a dictionary in \({{\mathbb {X}}}\). Assume that one of the following properties holds

  1. (i)

    There exists \(k_1>0\) such that \(\Sigma _1({{\mathcal {D}}})\in {\texttt {A2}}(k_1;D)\), for all \(D<\infty \)

  2. (ii)

    There exists \(H(1)>0\) such that \(\Sigma _1({{\mathcal {D}}})\in {\texttt {A3}}(H;D)\), for all \(D<\infty \).

Then, there exists \(\{{\varphi }^*_j\}_{j=1}^\infty \) in \({{\mathbb {X}}}^*\) such that \(\{{\varphi }_j,{\varphi }^*_j\}_{j=1}^\infty \) is a biorthogonal system, i.e.,

$$\begin{aligned} {\varphi }^*_j({\varphi }_i)=0,\;\; \text{ if }\ j\not =i,{\quad \text{ and }\quad }{\varphi }^*_j({\varphi }_j)=1. \end{aligned}$$

Proof

This is a consequence of [11, Theorem 6.1, page 54]. Indeed, if (i) holds then

$$\begin{aligned} \big \Vert \sum _{j=1}^na_j{\varphi }_j\big \Vert \le \sum _{j=1}^n\Vert a_j{\varphi }_j\Vert \le n\,k_1\,\big \Vert \sum _{i=1}^{n+m}a_i{\varphi }_i\big \Vert , \end{aligned}$$

which implies biorthogonality by [11, Theorem 6.1, “\(8^\textrm{o}\Rightarrow 2^\textrm{o}\)”]. Similarly, if (ii) holds then

$$\begin{aligned} \sum _{j=1}^n\frac{|a_j|}{2^j\,H(1)}\,\le \, \sum _{j=1}^n2^{-j}\,\big \Vert \sum _{i=1}^{n}a_i{\varphi }_i\big \Vert \,\le \, \big \Vert \sum _{i=1}^{n}a_i{\varphi }_i\big \Vert , \end{aligned}$$

which implies biorthogonality by [11, Theorem 6.1, “\(3^\textrm{o}\Rightarrow 2^\textrm{o}\)”]. \(\square \)

So, in this situation, the dictionary \({{\mathcal {D}}}\) generates a dual system \({{\mathcal {D}}}^*\). Then, a variation of [2, Lemma 2.17] gives the following.

Lemma 2.6

Let \({{\mathcal {D}}}=\{{\varphi }_j\}_{j=1}^\infty \) be a dictionary, with dual system \({{\mathcal {D}}}^*=\{{\varphi }^*_j\}_{j\ge 1}\). Then, \(\Sigma _N\in {\texttt {A3}}(H,D)\), for all \(N<D<\infty \), if we choose

$$\begin{aligned} H(n)=\sup _{|A|\le n, |{\varepsilon }_j|=1}\,\big \Vert \sum _{j\in A} {\varepsilon }_j{\varphi }^*_j\big \Vert _{{{\mathbb {X}}}^*}. \end{aligned}$$
(2.7)

Proof

Take sets \(A\subset B\), with \(|A|\le N\), and scalars \(a_j\in {{\mathbb {K}}}\). Let \({\varepsilon }_j=\mathop {\overline{\textrm{sign}}}a_j\), and denote

$$\begin{aligned} {\mathbbm {1}}^*_{{\varvec{{\varepsilon }}}A}:=\sum _{j\in A} {\varepsilon }_j{\varphi }^*_j\in {{\mathbb {X}}}^*. \end{aligned}$$

Then

$$\begin{aligned} \sum _{n\in A}|a_n|={\mathbbm {1}}^*_{{\varvec{{\varepsilon }}}A}\Big (\textstyle \sum _{n\in B} a_n{\varphi }_n\Big )\le \Vert {\mathbbm {1}}^*_{{\varvec{{\varepsilon }}}A}\Vert _{{{\mathbb {X}}}^*} \,\big \Vert \sum _{n\in B} a_n{\varphi }_n\big \Vert \le H(N)\,\big \Vert \sum _{n\in B} a_n{\varphi }_n\big \Vert . \end{aligned}$$

\(\square \)

In practice, the sequence H(n) in (2.7) is equivalent to the fundamental function of \({{\mathcal {D}}}^*\) in \({{\mathbb {X}}}^*\), which in many examples has an explicit expression.
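
For instance, if \({{\mathcal {D}}}\) is the canonical basis of \(\ell ^p\), \(1<p<\infty \), then \({{\mathcal {D}}}^*\) is the canonical basis of \(\ell ^{p'}\), and (2.7) gives

$$\begin{aligned} H(n)=\sup _{|A|\le n,\, |{\varepsilon }_j|=1}\,\Big (\sum _{j\in A}|{\varepsilon }_j|^{p'}\Big )^{1/p'}=\,n^{1/p'}. \end{aligned}$$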

2.4 Quasi-convex Sequences

Given a positive sequence \(w=\{w(j)\}_{j=1}^\infty \) we define its associated summing sequence \(\widetilde{w}\) by

$$\begin{aligned} \widetilde{w}(n):=\sum _{j=1}^n\frac{w(j)}{j},\quad n\ge 1. \end{aligned}$$
(2.8)

Lemma 2.9

If \(w=\{w(j)\}_{j=1}^\infty \) is non-decreasing then for all \(N\in {{\mathbb {N}}}\)

$$\begin{aligned} \sum _{j:\, 1\le 2^j\le N}w(2^j) < 2\widetilde{w}(N). \end{aligned}$$

Proof

Let \({\Delta }_j=\{n\in {{\mathbb {N}}}\mid 2^j\le n<2^{j+1}\}\), which has cardinality \(2^j\), \(j=0,1,\ldots \) Then, if J is the largest integer with \(2^J\le N\) we have

$$\begin{aligned} \sum _{j=0}^J w(2^j)\le \sum _{j=0}^J\frac{\sum _{n\in {\Delta }_j}w(n)}{2^j}< 2\sum _{j=0}^J\sum _{n\in {\Delta }_j}\frac{w(n)}{n}=2\widetilde{w}(2^J)\le 2\widetilde{w}(N). \end{aligned}$$

\(\square \)

Lemma 2.10

If \(w=\{w(j)\}_{j=1}^\infty \) is 1-quasi-convex then

  1. (a)

    \(\widetilde{w}(N)\le w(N)\) for all \(N\in {{\mathbb {N}}}\)

  2. (b)

    \(\widetilde{w}\) is superadditive, that is,

$$\begin{aligned} \widetilde{w}(M+N)\ge \widetilde{w}(M)+\widetilde{w}(N),\quad \forall \,M,N\in {{\mathbb {N}}}. \end{aligned}$$

Proof

The assertion a) follows from the definition of 1-quasi-convex, since

$$\begin{aligned} \widetilde{w}(N)=\sum _{n=1}^N\frac{w(n)}{n}\le \sum _{n=1}^N\frac{w(N)}{N} =w(N). \end{aligned}$$

The assertion b) follows similarly from

$$\begin{aligned} \widetilde{w}(M+N)= & {} \sum _{n=1}^N\frac{w(n)}{n}+\sum _{n=N+1}^{N+M}\frac{w(n)}{n}\\= & {} \widetilde{w}(N)+\sum _{j=1}^{M}\frac{w(j+N)}{j+N} \\\ge & {} \widetilde{w}(N)+\sum _{j=1}^{M}\frac{w(j)}{j}=\widetilde{w}(N)+\widetilde{w}(M). \end{aligned}$$

\(\square \)
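
As a side illustration (not needed for the proofs), the following minimal Python check verifies 1-quasi-convexity and the conclusions of Lemma 2.10 numerically, on a finite range, for the hypothetical model weight \(w(n)=n\,(\log (e+n))^{a}\) with \(a\ge 0\) (cf. (1.11)); the range and the value of a are arbitrary choices.

```python
import numpy as np

# Model weight w(n) = n * (log(e + n))**a with a >= 0 (hypothetical example, cf. (1.11)).
a = 1.5
n = np.arange(1, 2001)
w = n * np.log(np.e + n) ** a

# 1-quasi-convexity: w(k)/k must be non-decreasing in k.
assert np.all(np.diff(w / n) >= 0)

# Summing sequence from (2.8): w_tilde(N) = sum_{j <= N} w(j)/j.
w_tilde = np.cumsum(w / n)

# Lemma 2.10(a): w_tilde(N) <= w(N).
assert np.all(w_tilde <= w)

# Lemma 2.10(b): superadditivity, checked on a sample of values of M.
for M in (1, 7, 50, 300):
    N = np.arange(1, len(n) - M + 1)
    assert np.all(w_tilde[M + N - 1] >= w_tilde[M - 1] + w_tilde[N - 1])

print("all checks passed")
```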

3 The Proof of Theorem 1.12

In this section we give the proof of Theorem 1.12. We shall follow the main steps in the original proof of Temlyakov, see [14, Theorem 2.8] or [16, Theorem 8.7.18], adapted to the new properties \({\texttt {D}}(Q)\) and \({\texttt {A3}}(H,D)\). For completeness, we give self-contained arguments of all the steps, although the main changes will mostly appear in steps 1 and 4.

3.1 Step 1: The Iteration Theorem

The following result is a generalization of [2, Theorem 3.1], so we follow the notation presented there. Namely, if \(f\in {{\mathbb {X}}}\setminus \{0\}\), then we write

$$\begin{aligned} f_n:=f-{\mathscr {G}}_n(f){\quad \text{ and }\quad }{\Gamma }_n:=\mathop \textrm{supp}{\mathscr {G}}_n(f), \end{aligned}$$

for the remainder and the supporting set of the n-th WCGA applied to f; see Definition 1.2. Also, if \(\Phi =\sum _{j\in T}a_j{\varphi }_j\in \Sigma _N\) and \(A\subset T\), then we denote

$$\begin{aligned} \Phi _A:=\sum _{j\in A}a_j{\varphi }_j,{\quad \text{ and }\quad }T_n:=T\setminus {\Gamma }_n. \end{aligned}$$

We shall also make frequent use of [2, Lemma 2.12], which asserts that

$$\begin{aligned} F_{f_n}(g)=0,\quad \forall \;g\in [{\varphi }_j]_{j\in {\Gamma }_n}. \end{aligned}$$
(3.1)

Theorem 3.2

Let \(D>N\ge 1\). Assume that

  1. (i)

    \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) satisfies \({\texttt {D}}(Q)\)

  2. (ii)

    \(\Sigma _N\) satisfies property \({\texttt {A3}}(H,D)\)

Then, for every \(f\in {{\mathbb {X}}}\setminus \{0\}\), \(\Phi =\sum _{j\in T}a_j{\varphi }_j\in \Sigma _N\), and \({\lambda }>1\), and for all integers \(m,M\ge 0\) such that \(N+m+M<D\) the following holds

$$\begin{aligned} \Vert f_{m+M}\Vert \le e^{-M/G(|A|)}\,\Vert f_m\Vert \,+\,{\lambda }\,\Big (\Vert f-\Phi \Vert +\Vert \Phi _B\Vert \Big ), \end{aligned}$$
(3.3)

for all sets \(A\subset T_k\) (with \(A\not =\emptyset \)), \(B=T_k\setminus A\) and all \(k\in [0,m)\), and where

$$\begin{aligned} G(n)=\frac{1}{Q(c(\tau )/H(n))}, \quad \text{ with }\quad c(\tau )\,=\,\tfrac{\tau }{2}\,(1-\tfrac{1}{{\lambda }}). \end{aligned}$$
(3.4)

Proof

Given a fixed \(n\in [m,m+M)\), condition \({\texttt {D}}(Q)\) implies

$$\begin{aligned} \Vert f_{n+1}\Vert \le \mathop \textrm{dist}(f_n,[{\varphi }_{i_{n+1}}])\le \Vert f_n\Vert \,\Big (1-Q(|F_{f_n}({\varphi }_{i_{n+1}})|)\Big ). \end{aligned}$$
(3.5)

By definition of the WCGA and (3.1), for each (non-empty) \(A\subset T\) we have

$$\begin{aligned} \tau \,|F_{f_n}(\Phi _A)|=\tau \,|F_{f_n}(\Phi _{A\cap T_n})|\le \Big (\sum _{A\cap T_n}|a_i|\Big )\,|F_{f_n}({\varphi }_{i_{n+1}})|. \end{aligned}$$

Now, the assumption \(\Sigma _N\in {\texttt {A3}}(H,D)\) implies that

$$\begin{aligned} \sum _{A\cap T_n}|a_i|\le H(|A|)\,\Vert \Phi -{\mathscr {G}}_n(f)\Vert \le H(|A|)\,\big (\Vert \Phi -f\Vert +\Vert f_n\Vert \big ). \end{aligned}$$

In order to apply \({\texttt {A3}}\) we have used that \(|A\cap T_n|\le |T|\le N\) and \(|T\cup {\Gamma }_n|\le N+n\le N+m+M<D\). Thus, inserting these estimates into (3.5) we obtain

$$\begin{aligned} \Vert f_{n+1}\Vert \le \Vert f_n\Vert \,\Big (1-Q\Big (\frac{\tau \,|F_{f_n}(\Phi _A)|}{H(|A|)(\Vert \Phi -f\Vert +\Vert f_n\Vert )}\Big )\Big ), \end{aligned}$$

which is valid for all sets \(A\subset T\).

Fix now an integer \(k\in [0,m]\) and a set \(A\subset T_k\), and let \(B=T_k\setminus A\). Since \({\Gamma }_k\subset {\Gamma }_n\) we can use (3.1) to obtain

$$\begin{aligned} |F_{f_n}(\Phi _A)|= & {} |F_{f_n}(\Phi _A+\Phi _{{\Gamma }_k\cap T}-{\mathscr {G}}_n(f))| = |F_{f_n}(\Phi _T-\Phi _B-f+f_n)| \\\ge & {} \Vert f_n\Vert -\Vert f-\Phi \Vert -\Vert \Phi _B\Vert . \end{aligned}$$

So, we conclude that

$$\begin{aligned} \Vert f_{n+1}\Vert\le & {} \Vert f_n\Vert \,\Big (1-Q\Big (\frac{\tau \,(\Vert f_n\Vert -\Vert f-\Phi \Vert -\Vert \Phi _B\Vert )_+}{H(|A|)(\Vert \Phi -f\Vert +\Vert f_n\Vert )}\Big )\Big ).\\ \end{aligned}$$

Using in the denominator that \(\Vert \Phi -f\Vert \le \Vert f_n\Vert \) (when the numerator is not zero), this further simplifies into

$$\begin{aligned} \Vert f_{n+1}\Vert\le & {} \Vert f_n\Vert \,\Big (1-Q\Big (\frac{\tau \, (\Vert f_n\Vert -\Vert f-\Phi \Vert -\Vert \Phi _B\Vert )_+}{2\,H(|A|)\,\Vert f_n\Vert }\Big )\Big )\\= & {} \Vert f_n\Vert \,\Big (1-Q\Big (\frac{\tau \,(1-u)_+}{2H(|A|)}\Big )\Big ), \end{aligned}$$

where we have let \(u=(\Vert f-\Phi \Vert +\Vert \Phi _B\Vert )/\Vert f_n\Vert \). Now, call

$$\begin{aligned} \beta =Q\Bigg (\frac{(1-\tfrac{1}{{\lambda }})\tau }{2H(|A|)}\Bigg )=\frac{1}{G(|A|)}. \end{aligned}$$

Observe from (2.1) and (2.2) that \(\beta <Q(1/H(1))\le 1\). Now, if \(u\le 1/{\lambda }\) then we have

$$\begin{aligned} \Vert f_{n+1}\Vert \le \,(1-\beta )\, \Vert f_n\Vert . \end{aligned}$$
(3.6)

On the other hand, if \(u\ge 1/{\lambda }\), by definition of u we have

$$\begin{aligned} \Vert f_n\Vert \le {\lambda }\,\big (\Vert f-\Phi \Vert +\Vert \Phi _B\Vert \big ), \end{aligned}$$

and therefore,

$$\begin{aligned} \Vert f_{n+1}\Vert\le & {} \Vert f_n\Vert \,= \,(1-\beta )\, \Vert f_n\Vert \,+\,\beta \,\Vert f_n\Vert \nonumber \\\le & {} \,(1-\beta )\, \Vert f_n\Vert \,+\,\beta \,{\lambda }\,\big (\Vert f-\Phi \Vert +\Vert \Phi _B\Vert \big ). \end{aligned}$$
(3.7)

So, combining (3.6) and (3.7), and calling \(v=\Vert f-\Phi \Vert +\Vert \Phi _B\Vert \) we obtain

$$\begin{aligned} \Vert f_{n+1}\Vert -{\lambda }\,v\,\le \, (1-\beta )\,(\Vert f_n\Vert -{\lambda }\,v). \end{aligned}$$

Since \(\Vert f_{n+1}\Vert \le \Vert f_n\Vert \) (and \(\beta <1\)) this implies

$$\begin{aligned} \big (\Vert f_{n+1}\Vert -{\lambda }\,v\big )_+\,\le \, (1-\beta )\,(\Vert f_n\Vert -{\lambda }\,v)_+. \end{aligned}$$

We can now iterate for all \(n\in [m,m+M)\) to obtain

$$\begin{aligned} \Vert f_{m+M}\Vert -{\lambda }\,v\,\le \, (1-\beta )^M\,(\Vert f_m\Vert -{\lambda }\,v)_+\,\le \,(1-\beta )^M\,\Vert f_m\Vert . \end{aligned}$$

Finally, using the value of v and \(1-\beta \le e^{-\beta }\) we obtain

$$\begin{aligned} \Vert f_{m+M}\Vert \,\le \, e^{-M\beta }\,\Vert f_m\Vert \,+{\lambda }\,(\Vert f-\Phi \Vert +\Vert \Phi _B\Vert ). \end{aligned}$$

This corresponds exactly to (3.3). \(\square \)

3.2 Step 2: Selection of Sets \(A_j\)

In the next step, we shall follow [16, pp. 435–437], and iteratively apply Theorem 3.2, with a suitably chosen selection of sets \(A_j\), in order to obtain the following result. We have adapted the proof to include the new conditions \({\texttt {D}}(Q)\) and \({\texttt {A3}}(H, D)\), and have made the values of the constants more precise.

Theorem 3.8

Let \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) and \(1\le N<D\) be such that

  1. (i)

    \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) satisfies \({\texttt {D}}(Q)\)

  2. (ii)

    \(\Sigma _N\) satisfies property \({\texttt {A2}}(k_N,D)\).

  3. (iii)

    \(\Sigma _N\) satisfies property \({\texttt {A3}}(H,D)\).

Given \({\lambda }>1\) and \({\delta }>0\), there exists \(\beta _0=\beta _0({\lambda }, {\delta }, k_N)>0\) such that, if \(\beta \ge \beta _0\) and \(\Phi =\sum _{j\in T}a_j{\varphi }_j\in \Sigma _N{\setminus }\{0\}\), then there exist positive integers \(L, m_L\in {{\mathbb {N}}}\) such that

$$\begin{aligned} 2^{L-2} < |T|{\quad \text{ and }\quad }m_L\le \beta \,\sum _{j=1}^L G(2^{j-1}), \end{aligned}$$
(3.9)

(with G(n) defined in (3.4)), and so that for all \(x\in {{\mathbb {X}}}\setminus \{0\}\) it holds

$$\begin{aligned} \text{ either } \quad \Vert x_{m_L}\Vert \le (1+{\delta })\,{\lambda }\,\Vert x-\Phi \Vert ,\quad \text{ or }\quad |T\cap {\Gamma }_{m_L}|> 2^{L-2}, \end{aligned}$$

provided that \(N+m_L<D\). Moreover, we can set

$$\begin{aligned} \beta _0=2\,\log \Big (\frac{8k_N(1+(1+{\delta }){\lambda })}{{\delta }}\Big ). \end{aligned}$$

Proof

Let \(n\ge 0\) be such that \(2^{n-1}<|T|\le 2^n\). Now, for each \(j=1,2,\ldots , n+1\), choose \(A_j\subset T\) such that

$$\begin{aligned} \Vert \Phi -\Phi _{A_j}\Vert =\min _{{\begin{array}{c} {A\subset T}\\ {|A|\le 2^{j-1}} \end{array}}}\Vert \Phi -\Phi _A\Vert . \end{aligned}$$

Then, define \(B_j=T\setminus A_j\). Picking the sets \(A_j\) with smallest cardinality we may assume that \(|A_1|\le |A_2|\le \cdots \le |A_{n+1}|\) (although these sets may not be nested). As special cases we define

$$\begin{aligned} (A_{n+1},B_{n+1}):=(T,\emptyset ){\quad \text{ and }\quad }(A_0,B_0):=(\emptyset , T). \end{aligned}$$

This construction implies the following

$$\begin{aligned} \text{ Key } \text{ Fact: }\quad \text{ if }\ T'\subset T \ \text{ and } \ \Vert \Phi -\Phi _{T'}\Vert <\Vert \Phi _{B_j}\Vert , \ \text{ then } \ |T'|>2^{j-1}. \end{aligned}$$
(3.10)

Indeed, if \(|T'|\le 2^{j-1}\), then \(T'\) would be admissible in the minimum defining \(A_j\), and hence \(\Vert \Phi -\Phi _{T'}\Vert \ge \Vert \Phi -\Phi _{A_j}\Vert =\Vert \Phi _{B_j}\Vert \). This fact will be a crucial argument later to conclude the proof of the theorem.

Let \(\beta >0\) be a large number to be determined later, and define

$$\begin{aligned} \eta =e^{-\beta /2}{\quad \text{ and }\quad }b=\frac{1}{2\eta }. \end{aligned}$$

For the moment assume that \(\beta \) is large enough so that \(\eta <1/2\). With that choice of \(\beta \) we have \(b>1\). Now, pick the first positive integer \(L=L(b, \Phi )\in {{\mathbb {N}}}\) such that

$$\begin{aligned} \Vert \Phi _{B_{j-1}}\Vert <b\,\Vert \Phi _{B_j}\Vert , \;j=1,2,\ldots , L-1,{\quad \text{ and }\quad }\Vert \Phi _{B_{L-1}}\Vert \ge b\, \Vert \Phi _{B_L}\Vert . \nonumber \\ \end{aligned}$$
(3.11)

Note that we could have \(L=1\) if the first condition never holds, i.e. whenever \(\Vert \Phi \Vert =\Vert \Phi _{B_0}\Vert \ge b\,\Vert \Phi _{B_1}\Vert \). At the other extreme, we always have

$$\begin{aligned} \Vert \Phi _{B_n}\Vert \ge \,b\, \Vert \Phi _{B_{n+1}}\Vert =0, \end{aligned}$$

which implies that \(1\le L\le n+1\). Thus,

$$\begin{aligned} 2^{L-2}\le 2^{n-1}<|T|, \end{aligned}$$

which is the first assertion in (3.9). Observe also that \(A_L\not =\emptyset \), since otherwise we would have \(A_j=\emptyset \), for all \(j\le L\), and hence \(\Phi _{B_L}=\Phi _{B_{L-1}}=\Phi \), which would contradict the right hand side of (3.11).

We now apply iteratively Theorem 3.2. Consider the numbers \(m_0=0\) and

$$\begin{aligned} m_j=m_{j-1}+ \lfloor \beta \,G(|A_j|)\rfloor , \quad j=1,\ldots , L. \end{aligned}$$

Actually, to avoid trivial cases, we should restrict to \(j=j_0,\ldots , L\), where \(j_0\) is the first integer such that \(|A_{j_0}|\not =0\) (and let \(m_j=0\) for \(j<j_0\)). We also assume that \(\beta \) is large enough so that

$$\begin{aligned} \beta \,G(1)\ge 1. \end{aligned}$$
(3.12)

Observe that

$$\begin{aligned} m_L=\sum _{j=j_0}^L \lfloor \beta \,G(|A_j|)\rfloor \le \beta \,\sum _{j=1}^L G(2^{j-1}), \end{aligned}$$

since G is increasing. This is the second inequality in (3.9).

Now, for each \(j=j_0,\ldots , L\) we apply Theorem 3.2 with \(k=0\), \(m=m_{j-1}\), \(M=\lfloor \beta \,G(|A_j|)\rfloor \) and \(A=A_j\) to obtain

$$\begin{aligned} \Vert x_{m_j}\Vert\le & {} e^{-\frac{\lfloor \beta \,G(|A_j|)\rfloor }{G(|A_j|)}}\,\Vert x_{m_{j-1}}\Vert +{\lambda }\,\Big (\Vert \Phi -x\Vert +\Vert \Phi _{B_j}\Vert \Big )\\\le & {} \eta \,\Vert x_{m_{j-1}}\Vert +{\lambda }\,\Big (\Vert \Phi -x\Vert +\Vert \Phi _{B_j}\Vert \Big ), \end{aligned}$$

using in the last line that \(\lfloor a\rfloor \ge a/2\) if \(a\ge 1\). Observe that the above inequalities hold trivially for \(1\le j<j_0\) (if there is any such j) since

$$\begin{aligned} \Vert x_{m_j}\Vert =\Vert x_0\Vert =\Vert x\Vert \le \Vert x-\Phi \Vert +\Vert \Phi \Vert =\Vert x-\Phi \Vert +\Vert \Phi _{B_j}\Vert , \end{aligned}$$

and in this case \(B_j=T\). Therefore, we can iterate the inequalities to obtain

$$\begin{aligned} \Vert x_{m_L}\Vert \le \eta ^L\Vert x_{m_0}\Vert +{\lambda }\sum _{j=1}^L\eta ^{L-j}\,\Big (\Vert \Phi -x\Vert +\Vert \Phi _{B_j}\Vert \Big ). \end{aligned}$$

For the first summand we also have

$$\begin{aligned} \Vert x_{m_0}\Vert =\Vert x\Vert \le \Vert x-\Phi \Vert +\Vert \Phi \Vert =\Vert x-\Phi \Vert +\Vert \Phi _{B_0}\Vert . \end{aligned}$$

We now use the crucial assumption (3.11), that is,

$$\begin{aligned} \Vert \Phi _{B_j}\Vert \le b^{L-1-j}\,\Vert \Phi _{B_{L-1}}\Vert , \quad j=0,\ldots , L-1,{\quad \text{ and }\quad }\Vert \Phi _{B_L}\Vert \le b^{-1}\,\Vert \Phi _{B_{L-1}}\Vert , \end{aligned}$$

which inserted into the above expression gives

$$\begin{aligned} \Vert x_{m_L}\Vert\le & {} {\lambda }\sum _{j=0}^L\eta ^{L-j}\,\Vert \Phi -x\Vert \,+\,{\lambda }\,b^{-1}\,\sum _{j=0}^{L}(\eta b)^{L-j}\,\Vert \Phi _{B_{L-1}}\Vert \nonumber \\\le & {} \frac{{\lambda }}{1-\eta }\,\Vert \Phi -x\Vert \,+\,\frac{{\lambda }\,b^{-1}}{1-\eta b}\,\Vert \Phi _{B_{L-1}}\Vert \nonumber \\= & {} \frac{{\lambda }}{1-\eta }\,\Vert \Phi -x\Vert \,+\,4{\lambda }\,\eta \,\Vert \Phi _{B_{L-1}}\Vert , \end{aligned}$$
(3.13)

using in the last step the choice of \(b=1/(2\eta )\).

On the other hand, we can use that \(\Sigma _N\) satisfies property \({\texttt {A2}}(k_N, D)\) to obtain the following estimate

$$\begin{aligned} \Vert \Phi -\Phi _{T\cap {\Gamma }_{m_L}}\Vert\le & {} k_N\,\Vert \Phi -{\mathscr {G}}_{m_L}(x)\Vert \,\le \,k_N\,\big (\Vert \Phi -x\Vert +\Vert x_{m_L}\Vert \big )\nonumber \\\le & {} k_N\,\Big [(1+\tfrac{{\lambda }}{1-\eta })\Vert \Phi -x\Vert \,+\,4{\lambda }\,\eta \,\Vert \Phi _{B_{L-1}}\Vert \Big ]. \end{aligned}$$
(3.14)

At this point we wish to use the Key Fact in (3.10). So we distinguish two cases.

Case 1: \(\Vert \Phi -x\Vert < A\, \Vert \Phi _{B_{L-1}}\Vert \), for some \(A>0\) to be determined. Then

$$\begin{aligned} \Vert \Phi -\Phi _{T\cap {\Gamma }_{m_L}}\Vert <\, k_N\,\Big [(1+\tfrac{{\lambda }}{1-\eta })\,A\,+\,4{\lambda }\,\eta \,\Big ]\,\Vert \Phi _{B_{L-1}}\Vert . \end{aligned}$$

In this case, we wish to select A and \(\eta \) (and hence \(\beta \)) so that

$$\begin{aligned} k_N\,\Big [(1+\tfrac{{\lambda }}{1-\eta })\,A\,+\,4{\lambda }\,\eta \,\Big ]\le 1, \end{aligned}$$
(3.15)

which by the Key Fact would imply that

$$\begin{aligned} |T\cap {\Gamma }_{m_L}|>2^{L-2}. \end{aligned}$$

Case 2: \(\Vert \Phi _{B_{L-1}}\Vert \le A^{-1}\,\Vert \Phi -x\Vert \). In this case, using (3.13) we have

$$\begin{aligned} \Vert x_{m_L}\Vert \le {\lambda }\,\Big (\tfrac{1}{1-\eta }\,+\,4\eta \,A^{-1}\Big )\,\Vert \Phi -x\Vert . \end{aligned}$$

So, we wish to select A and \(\eta \) (hence \(\beta \)) such that

$$\begin{aligned} \tfrac{1}{1-\eta }\,+\,4\eta \,A^{-1} \le \,1+{\delta }. \end{aligned}$$
(3.16)

Overall, we have reduced the theorem to finding numbers A and \(\eta \) so that (3.15) and (3.16) hold. Writing \(A=\eta \,B\), this amounts to finding B and \(\eta \) so that

$$\begin{aligned} \frac{1}{1-\eta }+\frac{4}{B}\le \,1+{\delta }{\quad \text{ and }\quad }k_N\,\eta \,\big [(1+\tfrac{{\lambda }}{1-\eta })\,B\,+\,4{\lambda }\big ]\le 1. \end{aligned}$$

This is clearly possible if B is chosen sufficiently large and \(\eta \) sufficiently small. In order to make an explicit choice, we let \(B=8/{\delta }\), so we need to select \(\eta \) so that

$$\begin{aligned} \frac{1}{1-\eta }\le \,1+{\delta }/2{\quad \text{ and }\quad }k_N\,\eta \,\big [(1+\tfrac{{\lambda }}{1-\eta })\,8{\delta }^{-1}\,+\,4{\lambda }\big ]\le 1 \end{aligned}$$

If we impose the first condition, the second one will hold provided

$$\begin{aligned} k_N\,\eta \,\big [(1+(1+{\delta }/2){\lambda })\,8{\delta }^{-1}\,+\,4{\lambda }\big ]= 8\,k_N\,\eta \,\big [{\lambda }+\tfrac{1+{\lambda }}{{\delta }}\big ]\le 1. \end{aligned}$$

That is, we can choose

$$\begin{aligned} \eta \le \min \Big \{(8k_N\,[{\lambda }+\tfrac{1+{\lambda }}{{\delta }}])^{-1}, \,\frac{{\delta }}{2+{\delta }}\Big \}=(8k_N\,[{\lambda }+\tfrac{1+{\lambda }}{{\delta }}])^{-1}, \end{aligned}$$

with the last equality following easily from \(k_N\ge 1\) and \({\lambda }\ge 1\). So, simplifying a bit we can choose

$$\begin{aligned} \eta =\frac{{\delta }}{8k_N(1+(1+{\delta }){\lambda })}, \end{aligned}$$

and using that \(\eta =e^{-\beta /2}\), we find the expression

$$\begin{aligned} \beta =2\,\log (1/\eta ) = 2\,\log \Big [\frac{8k_N(1+(1+{\delta }){\lambda })}{{\delta }}\Big ]. \end{aligned}$$

We finally observe that (3.12) is also satisfied, as in fact we have \(G(1)\ge 1\). This is a simple consequence of

$$\begin{aligned} \frac{1}{G(1)}=Q\Big (\frac{(1-\tfrac{1}{{\lambda }})\tau }{2H(1)}\Big )\le Q\Big (\frac{1}{H(1)}\Big )\le Q(\Vert {\varphi }\Vert )=Q(|F_{\varphi }({\varphi })|)\le 1, \end{aligned}$$

with the second inequality due to (2.1) (for any \({\varphi }\in {{\mathcal {D}}}\)), and the last one due to \({\texttt {D}}(Q)\). \(\square \)

Remark 3.17

In order to ensure that \(\Vert x_{m_L}\Vert \le 2\Vert x-\Phi \Vert \) we must choose \({\lambda }\) and \({\delta }\) so that \((1+{\delta }){\lambda }=2\). For instance, \({\lambda }=\sqrt{2}\) and \({\delta }=\sqrt{2}-1\) will give the value

$$\begin{aligned} \beta = 2\,\log \left[ \frac{24k_N}{\sqrt{2}-1}\right] =2\,\log \big (24(1+\sqrt{2})k_N\big ). \end{aligned}$$

3.3 Step 3

The next step is a slight generalization of the previous Theorem 3.8 (which would be the special case \(k=0\)).

Theorem 3.18

Let \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) and \(1\le N<D\) be such that

  1. (i)

    \(({{\mathbb {X}}},\Vert \cdot \Vert ,{{\mathcal {D}}})\) satisfies \({\texttt {D}}(Q)\)

  2. (ii)

    \(\Sigma _N\) satisfies property \({\texttt {A2}}(k_N,D)\).

  3. (iii)

    \(\Sigma _N\) satisfies property \({\texttt {A3}}(H,D)\).

Given \({\lambda }>1\) and \({\delta }>0\), let \(\beta _0=\beta _0({\lambda }, {\delta }, k_N)>0\) be as in Theorem 3.8, and let \(\beta \ge \beta _0\). If \(x\in {{\mathbb {X}}}\) and \(\Phi =\sum _{j\in T}a_j{\varphi }_j\in \Sigma _N\) are not null, and if \(k\in {{\mathbb {N}}}_0\) is such that

$$\begin{aligned} T_k:=T\setminus {\Gamma }_k\not =\emptyset ,\quad \text{(where }~{\Gamma }_k=\mathop \textrm{supp}{\mathscr {G}}_k(x)), \end{aligned}$$

then there exist integers \(L\in {{\mathbb {N}}}\) and \(m_L\ge k+1\) such that

$$\begin{aligned} 2^{L-2} < |T_k|{\quad \text{ and }\quad }m_L-k\le \beta \,\sum _{j=1}^L G(2^{j-1}), \end{aligned}$$
(3.19)

and so that

$$\begin{aligned} \text{ either } \quad \Vert x_{m_L}\Vert \le (1+{\delta })\,{\lambda }\,\Vert x-\Phi \Vert ,\quad \text{ or }\quad |T_k\cap {\Gamma }_{m_L}|> 2^{L-2}, \end{aligned}$$

provided that \(N+m_L<D\).

Proof

Apply the construction in the first part of Theorem 3.8 to the vector \(\Phi _{T_k}\) (instead of \(\Phi \)). So for \(\eta \) and b fixed as above, this gives an integer \(L\in {{\mathbb {N}}}\) such that \(2^{L-2}<|T_k|\) and sets \(A_j\subset T_k\) and \(B_j=T_k{\setminus } A_j\) such that the inequalities in (3.11) hold.

At this point we let \(m_0=k\) and consider

$$\begin{aligned} m_j=m_{j-1}+ \lfloor \beta G(|A_j|)\rfloor , \quad j=j_0,\ldots , L, \end{aligned}$$

where \(j_0\) is the first integer such that \(|A_{j_0}|\not =0\). Otherwise we let \(m_j=m_0=k\) when \(1\le j<j_0\). As before, this choice (and the size of the sets \(A_j\)) gives the second assertion in (3.19).

Now, if \(j_0\le j\le L\) we apply Theorem 3.2 with \(m=m_{j-1}\), \(M=\lfloor \beta G(|A_j|)\rfloor \) and \(A=A_j\) to obtain

$$\begin{aligned} \Vert x_{m_j}\Vert\le & {} e^{-\frac{\lfloor \beta G(|A_j|)\rfloor }{G(|A_j|)}}\,\Vert x_{m_{j-1}}\Vert +{\lambda }\,\Big (\Vert \Phi -x\Vert +\Vert \Phi _{B_j}\Vert \Big ) \nonumber \\\le & {} \eta \,\Vert x_{m_{j-1}}\Vert +{\lambda }\,\Big (\Vert \Phi -x\Vert +\Vert \Phi _{B_j}\Vert \Big ). \end{aligned}$$

When \(0\le j<j_0\), we have instead

$$\begin{aligned} \Vert x_{m_j}\Vert =\Vert x_k\Vert =\mathop \textrm{dist}(x,[{\varphi }_i]_{i\in {\Gamma }_k})\le & {} \Vert x-\Phi _{T\cap {\Gamma }_k}\Vert =\Vert x-\Phi +\Phi _{T_k}\Vert \\\le & {} \Vert x-\Phi \Vert +\Vert \Phi _{B_j}\Vert , \end{aligned}$$

since \(A_j=\emptyset \) and hence \(B_j=T_k\). Thus, we can proceed exactly as we did in (3.13) to obtain the same conclusion, namely

$$\begin{aligned} \Vert x_{m_L}\Vert \le \frac{{\lambda }}{1-\eta }\,\Vert \Phi -x\Vert \,+\,4{\lambda }\,\eta \,\Vert \Phi _{B_{L-1}}\Vert . \end{aligned}$$

On the other hand, using property \({\texttt {A2}}(k_N, D)\) we obtain

$$\begin{aligned} \Vert \Phi _{T_k}-\Phi _{T_k\cap {\Gamma }_{m_L}}\Vert\le & {} k_N\,\Vert \Phi _{T_k}-{\mathscr {G}}_{m_L}(x)+\Phi _{T\cap {\Gamma }_k}\Vert \, = \, k_N\,\Vert \Phi -x+x_{m_L}\Vert \\\le & {} k_N\,\big (\Vert \Phi -x\Vert +\Vert x_{m_L}\Vert \big )\\\le & {} k_N\,\Big [(1+\tfrac{{\lambda }}{1-\eta })\Vert \Phi -x\Vert \,+\,4{\lambda }\,\eta \,\Vert \Phi _{B_{L-1}}\Vert \Big ]. \end{aligned}$$

which is the analogue of (3.14) in the previous theorem.

At this point one considers the same two cases as in the lines following (3.14). Namely

Case 1: \(\Vert \Phi -x\Vert < A\, \Vert \Phi _{B_{L-1}}\Vert \), with the same \(A>0\) as in Theorem 3.8. This implies

$$\begin{aligned} \Vert \Phi _{T_k}-\Phi _{T_k\cap {\Gamma }_{m_L}}\Vert <\, k_N\,\Big [(1+\tfrac{{\lambda }}{1-\eta })\,A\,+\,4{\lambda }\,\eta \,\Big ]\,\Vert \Phi _{B_{L-1}}\Vert \,\le \,\Vert \Phi _{B_{L-1}}\Vert , \end{aligned}$$

so by the construction of the sets \((A_j,B_j)\) and the Key Fact one obtains

$$\begin{aligned} |T_k\cap {\Gamma }_{m_L}|>2^{L-2}. \end{aligned}$$

Case 2: \(\Vert \Phi _{B_{L-1}}\Vert \le A^{-1}\,\Vert \Phi -x\Vert \). In this case, the same reasoning as in Theorem 3.8 gives

$$\begin{aligned} \Vert x_{m_L}\Vert \le {\lambda }\,\Big (\tfrac{1}{1-\eta }\,+\,4\eta \,A^{-1}\Big )\,\Vert \Phi -x\Vert \,\le \,(1+{\delta })\,{\lambda }\,\Vert \Phi -x\Vert . \end{aligned}$$

This completes the proof of Theorem 3.18. \(\square \)

3.4 Step 4: Conclusion of the Proof of Theorem 1.12

This part of the proof requires substantial modifications compared to [14, 16], so we present it in detail.

Write \({\lambda }_1=(1+{\delta }){\lambda }\), say with

$$\begin{aligned} {\lambda }=\sqrt{{\lambda }_1}>1{\quad \text{ and }\quad }{\delta }=\sqrt{{\lambda }_1}-1>0. \end{aligned}$$

The iterative process discussed in the previous subsections produces a positive constant \(\beta = 2\,\log \Big [\frac{8k_N(1+{\lambda }_1)}{\sqrt{{\lambda }_1}-1}\Big ]\), and the following sequences of numbers

  • there exist positive integers \(L_1\) and \(m_{L_1}\) such that

    $$\begin{aligned} m_{L_1}\le \beta \sum _{j=1}^{L_1} G(2^{j-1}){\quad \text{ and }\quad }2^{L_1-2}\le |T|, \end{aligned}$$

    with the property that

    $$\begin{aligned} \text{ either }\quad \Vert x_{m_{L_1}}\Vert \le {\lambda }_1\,\Vert x-\Phi \Vert \quad \text{ or }\quad |T\cap {\Gamma }_{m_{L_1}}|\ge 2^{L_1-2}. \end{aligned}$$

    In the first case one stops; if not one iterates and applies Theorem 3.18 with \(k=m_{L_1}\), which implies

  • there exist positive integers \(L_2\) and \(m_{L_2}>m_{L_1}\) such that

    $$\begin{aligned} m_{L_2}-m_{L_1}\le \beta \sum _{j=1}^{L_2} G(2^{j-1}){\quad \text{ and }\quad }2^{L_2-2}\le |T_{m_{L_1}}|, \end{aligned}$$
    (3.20)

    with the property that

    $$\begin{aligned} \text{ either }\quad \Vert x_{m_{L_2}}\Vert \le {\lambda }_1\,\Vert x-\Phi \Vert \quad \text{ or }\quad |T_{m_{L_1}}\cap {\Gamma }_{m_{L_2}}|\ge 2^{L_2-2}. \end{aligned}$$

    Again, in the first case one stops; if not, one applies Theorem 3.18 iteratively, with values of \(k=m_{L_i}\), \(i=2,\ldots , s-1\), until some step s, where one can ensure that

  • there are positive integers \(L_s\) and \(m_{L_s}>m_{L_{s-1}}\) such that

    $$\begin{aligned} m_{L_s}-m_{L_{s-1}}\le \beta \sum _{j=1}^{L_s} G(2^{j-1}){\quad \text{ and }\quad }2^{L_s-2}\le |T_{m_{L_{s-1}}}|, \end{aligned}$$
    (3.21)

    where

    $$\begin{aligned} \left\{ \begin{array}{l} \text{ either }\quad \Vert x_{m_{L_s}}\Vert \le {\lambda }_1\,\Vert x-\Phi \Vert ,\quad \\ \text{ or } \quad |T_{m_{L_{s-1}}}\cap {\Gamma }_{m_{L_s}}|\ge 2^{L_s-2} \quad \text{ and }\quad m_{L_s}\ge 2\beta \,{{\widetilde{G}}}(2N). \end{array}\right. \end{aligned}$$
    (3.22)

Here G(n) denotes the sequence defined in (3.4), and the notation \({{\widetilde{G}}}(n)\) stands for the associated summing sequence as in (2.8).

In the first case of (3.22) one stops; if not, we shall show that the greedy algorithm actually covers the whole set T, that is

$$\begin{aligned} |T\cap {\Gamma }_{m_{L_s}}|\ge N. \end{aligned}$$
(3.23)

Since \(|T|\le N\), this would imply that \(T\subset {\Gamma }_{m_{L_s}}\), hence \(\Vert x_{m_{L_s}}\Vert =\mathop \textrm{dist}\big (x,[{\varphi }_j]_{j\in {\Gamma }_{m_{L_s}}}\big )\le \Vert x-\Phi \Vert \), and so we would also stop.

Let us prove (3.23). Here we shall use the assumption that the sequence G(n) in (3.4) is increasing and 1-quasi-convex. Observe that

$$\begin{aligned} |T\cap {\Gamma }_{m_{L_s}}|= & {} |T\cap {\Gamma }_{m_{L_1}}|+|T_{m_{L_1}}\cap {\Gamma }_{m_{L_2}}|+\cdots +|T_{m_{L_{s-1}}}\cap {\Gamma }_{m_{L_s}}|\nonumber \\\ge & {} 2^{L_1-2}+2^{L_2-2}+\cdots +2^{L_s-2}. \end{aligned}$$
(3.24)

Now, by Lemma 2.9 and the inductive assumptions, see (3.20), for each \(i=1,\ldots ,s\), we have

$$\begin{aligned} 2\beta \,{{\widetilde{G}}}(2^{L_i-1}) > \beta \sum _{j=0}^{L_i-1}G(2^j)=\beta \sum _{j=1}^{L_i}G(2^{j-1})\ge m_{L_i}-m_{L_{i-1}}, \end{aligned}$$
(3.25)

with the notation \(m_{L_0}=0\). Thus, applying the (non-decreasing) function \(2\beta {{\widetilde{G}}}(2\cdot )\) to both sides of (3.24) and using part b) of Lemma 2.10 we obtain

$$\begin{aligned} 2\beta \,{{\widetilde{G}}}\big (2|T\cap {\Gamma }_{m_{L_s}}|\big )\ge & {} 2\beta \,{{\widetilde{G}}}(2^{L_1-1}+2^{L_2-1}+\cdots +2^{L_s-1})\\\ge & {} 2\beta \,\Big ({{\widetilde{G}}}(2^{L_1-1})+{{\widetilde{G}}}(2^{L_2-1})+\cdots +{{\widetilde{G}}}(2^{L_s-1})\Big )\\> & {} m_{L_s}\,\ge \, 2\beta {{\widetilde{G}}}(2N), \end{aligned}$$

using in the last line (3.25) and the second assertion in (3.22). Since \({{\widetilde{G}}}\) is increasing this implies

$$\begin{aligned} |T\cap {\Gamma }_{m_{L_s}}|>N, \end{aligned}$$

which proves (3.23).

Thus, the process will indeed end after \(m_{L_s}\) iterations. We now estimate this number using the remaining conditions in (3.21). Since the last inequality in (3.22) occurs for the first time at step s, we must have

$$\begin{aligned} m_{L_{s-1}}<2\beta {{\widetilde{G}}}(2N). \end{aligned}$$

Thus,

$$\begin{aligned} m_{L_s}= & {} m_{L_{s-1}}\,+\,\big (m_{L_s}-m_{L_{s-1}}\big )\\\le & {} 2\beta {{\widetilde{G}}}(2N)\,+\,\big (m_{L_s}-m_{L_{s-1}}\big )\\\le & {} 2\beta {{\widetilde{G}}}(2N)\,+\,\beta \,\sum _{j=1}^{L_s} G(2^{j-1})\\\le & {} 2\beta {{\widetilde{G}}}(2N)\,+\,2\beta \,{{\widetilde{G}}}(2^{L_s-1})\,\le \,4\beta \,{{\widetilde{G}}}(2N). \end{aligned}$$

Therefore, using also part a) of Lemma 2.10, we see that (1.15) will be true with

$$\begin{aligned} \phi (N)=m_{L_s}\le 4\beta \,G(2N)\,=\,8\,\log \Big [\frac{8k_N(1+{\lambda }_1)}{\sqrt{{\lambda }_1}-1}\Big ]\,G(2N), \end{aligned}$$

as asserted in (1.14). \(\square \)

4 An Application: WCGA in \(L^p(\log \,L)^{\alpha }\) Spaces

4.1 Property \({\texttt {D}}(Q)\) in \(L^p(\log \,L)^{\alpha }\)

In this section we shall apply Theorem 1.12 in the case when

$$\begin{aligned} {{\mathbb {X}}}=L^p(\log \,L)^{\alpha },\quad 1<p<\infty , {\alpha }\in {{\mathbb {R}}}. \end{aligned}$$

Following [1, Definition IV.6.11], this is the set of all measurable \(f:{{\mathbb {R}}}^d\rightarrow {{\mathbb {C}}}\) such that

$$\begin{aligned} \Big \Vert f(x)\,\big |\log (2+|f(x)|)|^{\alpha }\Big \Vert _{L^p({{\mathbb {R}}}^d)}<\infty . \end{aligned}$$

These classes satisfy the elementary inclusions

$$\begin{aligned} L^p(\log L)^{|{\alpha }|}\subset L^p\subset L^p(\log L)^{-|{\alpha }|}. \end{aligned}$$

We shall regard \({{\mathbb {X}}}\) as an Orlicz space \(L^\Phi \) associated with the function

$$\begin{aligned} \Phi (t)=\int _0^t s^{p-1}\,\big (\log (c+s)\big )^{{\alpha }p}\, ds, \quad t\ge 0, \end{aligned}$$

which for a sufficiently large \(c>1\) is a (smooth) Young function. The corresponding (Luxemburg) norm is then defined by

$$\begin{aligned} \Vert f\Vert _{L^\Phi }=\inf \Big \{{\lambda }\mid \int _{{{\mathbb {R}}}^d}\Phi (|f(x)|/{\lambda })\,dx\le 1\Big \}. \end{aligned}$$
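
As a concrete illustration of the Luxemburg norm, the following minimal Python sketch computes \(\Vert f\Vert _{L^\Phi }\) by bisection in \({\lambda }\), using that the modular \(\int \Phi (|f|/{\lambda })\,dx\) is non-increasing in \({\lambda }\). The Young function is taken in the equivalent form of (4.1), the discretization is a uniform grid, and the values of p and alpha are illustrative choices, not values from the paper.

```python
import numpy as np

def luxemburg_norm(f, dx, p=1.5, alpha=0.5, tol=1e-10):
    """Luxemburg norm of a sampled function f for Phi(t) = t**p * log(e + t)**(alpha*p)."""
    def modular(lam):
        u = np.abs(f) / lam
        return np.sum(u**p * np.log(np.e + u) ** (alpha * p)) * dx

    lo, hi = 1e-12, 1.0
    while modular(hi) > 1.0:       # enlarge hi until the modular drops below 1
        hi *= 2.0
    while hi - lo > tol * hi:      # bisection: the modular is decreasing in lam
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if modular(mid) > 1.0 else (lo, mid)
    return hi

# Example: f(x) = exp(-x^2) sampled on [-10, 10].
x = np.linspace(-10.0, 10.0, 20001)
print(luxemburg_norm(np.exp(-x**2), dx=x[1] - x[0]))
```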

Let \(\Psi \) be the complementary function of \(\Phi (t)\). Then it is known that \((L^\Phi )^*=L^\Psi \) (isometrically, when the latter space is endowed with the Orlicz norm); see [1, Corollary IV.8.15]. In these examples it is not difficult to check that

$$\begin{aligned} \Phi (t)\,\approx \, t^p\,\big (\log (e+t)\big )^{{\alpha }p}, \quad t\ge 0, \end{aligned}$$
(4.1)

and

$$\begin{aligned} \Psi (t)\,\approx \,t^{p'}\,\big (\log (e+t)\big )^{-{\alpha }p'}, \quad t\ge 0; \end{aligned}$$
(4.2)

see e.g. [5, Theorem I.7.2].

We recall how the norming functional \(F_f\) of a (normalized) element \(f\in L^\Phi \) is defined; see [5, Theorem 18.5]. Let \(P(t)=\Phi '(t)\), \(t\ge 0\), and for \(z\in {{\mathbb {C}}}\setminus \{0\}\) let \(P(z)=\overline{{\text {sign}}}(z)P(|z|)\). Then \(F_f\) is explicitly given by

$$\begin{aligned} F_f(g)=\frac{\displaystyle \int _{{{\mathbb {R}}}^d} P(f(x))\,g(x)\,dx}{\displaystyle \int _{{{\mathbb {R}}}^d} P(f(x))\,f(x)\,dx}. \end{aligned}$$

In our case of interest we will have \(P(z)={\bar{z}}\,|z|^{p-2}\,\big (\log (c+|z|)\big )^{{\alpha }p}\), \(z\in {{\mathbb {C}}}\), and hence

$$\begin{aligned} F_f(g)=\tfrac{1}{A(f)}\int _{{{\mathbb {R}}}^d}|f(x)|^{p-2}\,\overline{f(x)}\,g(x)\,\big (\log (c+|f(x)|)\big )^{{\alpha }p}\,dx, \end{aligned}$$
(4.3)

with \(A(f)=\int _{{{\mathbb {R}}}^d}|f|^{p}\,(\log (c+|f|))^{{\alpha }p}\,dx\).

The moduli of smoothness and convexity for Orlicz spaces \(L^\Phi \) have been studied in [3, 8]. According to [8, Theorem 1], there exists a Young function \(\bar{\Phi }\), equivalent to \(\Phi \), such that

$$\begin{aligned} \rho _{L^{\bar{\Phi }}}(t)\lesssim \sup _{u\in [t,1],\,v>0}\frac{t^2\,\Phi (uv)}{u^2\,\Phi (v)} {\quad \text{ and }\quad }{\delta }_{(L^{\bar{\Phi }})^*}(s)\gtrsim \inf _{u\in [s,1],\,v>0}\frac{s^2\,\Psi (uv)}{u^2\,\Psi (v)} \end{aligned}$$
(4.4)

under suitable doubling conditions in \(\Phi (t)\) and \(\Psi (t)\) (which always hold in the cases considered in (4.1) and (4.2)). Moreover, our specific examples satisfy the regularity conditions stated in [3, Proposition 19], so one may actually take \(\bar{\Phi }=\Phi \).

Therefore, inserting into (4.4) the expressions for \(\Phi \) and \(\Psi \) from (4.1) and (4.2), and performing some straightforward computations, one obtains the following result. Here we use the standard notation \({\alpha }={\alpha }_+-{\alpha }_-\), where

$$\begin{aligned} {\alpha }_+=\max \{{\alpha },0\}{\quad \text{ and }\quad }{\alpha }_-=\max \{-{\alpha },0\}. \end{aligned}$$

Proposition 4.5

Let \(1<p<\infty \) and \({\alpha }\in {{\mathbb {R}}}\), and let \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\) as above. Then, for the Luxemburg norm associated with \(\Phi (t)\) in \({{\mathbb {X}}}\), the following estimates hold:

  • if \(p>2\) then

    $$\begin{aligned} \rho _{{{\mathbb {X}}}}(t)\,\lesssim \, t^2{\quad \text{ and }\quad }{\delta }_{{{\mathbb {X}}}^*}(s)\,\gtrsim \,s^2, \end{aligned}$$
  • if \(1<p\le 2\) then

    $$\begin{aligned} \rho _{{{\mathbb {X}}}}(t)\,\lesssim \, t^p\,(\log (e+\tfrac{1}{t}))^{p\,{\alpha }_-}{\quad \text{ and }\quad }{\delta }_{{{\mathbb {X}}}^*}(s)\,\gtrsim \, \frac{s^{p'}}{(\log (e+\frac{1}{s}))^{p'\,{\alpha }_-}}. \end{aligned}$$
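
For the reader's convenience, we sketch how the first estimate arises when \(1<p\le 2\). Substituting (4.1) into (4.4) and taking first the supremum in v (whose worst case occurs at \(v\approx 1/u\) when \({\alpha }<0\)), one gets

$$\begin{aligned} \sup _{u\in [t,1],\,v>0}\frac{t^2\,\Phi (uv)}{u^2\,\Phi (v)}\,\approx \,\sup _{u\in [t,1]}\,t^2\,u^{p-2}\,\big (\log (e+\tfrac{1}{u})\big )^{p\,{\alpha }_-}\,\approx \,t^p\,\big (\log (e+\tfrac{1}{t})\big )^{p\,{\alpha }_-}, \end{aligned}$$

the last supremum being attained at \(u\approx t\) since \(p\le 2\). The estimate for \({\delta }_{{{\mathbb {X}}}^*}\) follows in the same way from (4.2), and the case \(p>2\) is analogous (there the supremum in u is attained at \(u\approx 1\), giving the quadratic bounds).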

In view of the discussion in Sect. 2.2, we then obtain the following.

Corollary 4.6

Let \(1<p<\infty \) and \({\alpha }\in {{\mathbb {R}}}\), and let \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\) as above. Then, for every normalized dictionary \({{\mathcal {D}}}\) in \({{\mathbb {X}}}\), property \({\texttt {D}}(Q)\) holds with

$$\begin{aligned} Q(s):=\left\{ \begin{array}{ll} \frac{c_0\,s^{p'}}{(\log (e+\frac{1}{s}))^{p'\,{\alpha }_-}} &{} \text{ if } \ 1<p\le 2\\ c_0\,s^2 &{} \text{ if }\, {p>2,} \end{array}\right. \end{aligned}$$
(4.7)

for a suitably small constant \(c_0>0\).

4.2 The Haar System in \(L^p(\log \,L)^{\alpha }\)

Next we consider the dictionary \({{\mathcal {D}}}=\{\psi _j\}_{j=1}^\infty \) in \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\) given by the normalized Haar basis in \({{\mathbb {R}}}^d\) (or any sufficiently smooth wavelet basis). These bases are unconditional, so in this case property \({\texttt {A2}}(k_N,D)\) will hold with \(k_N=O(1)\) (and any \(D<\infty \)). It remains to verify property \({\texttt {A3}}(H,D)\). As mentioned in Lemma 2.6, this property holds with

$$\begin{aligned} H(N)=\sup _{|A|\le N, |{\varepsilon }_j|=1}\,\big \Vert \sum _{j\in A} {\varepsilon }_j\psi ^*_j\big \Vert _{{{\mathbb {X}}}^*} \end{aligned}$$

(also for all \(D<\infty \)). Here \(\{\psi ^*_j\}\) is the dual dictionary, which is again the Haar basis, this time normalized in \({{\mathbb {X}}}^*\). Since the basis is unconditional, the parameter H(N) is equivalent to the upper democracy function of the dual space \({{\mathbb {X}}}^*\), that is

$$\begin{aligned} H(N)\approx h_{{{\mathbb {X}}}^*}(N):=\sup _{|A|\le N} \big \Vert \sum _{j\in A} \psi ^*_j\big \Vert _{{{\mathbb {X}}}^*}. \end{aligned}$$

Democracy functions for the Orlicz classes \(L^\Phi \) were studied in [4], where it was proved that, if the Boyd indices of \(\Phi \) are nontrivial, then

$$\begin{aligned} h_{L^\Phi }(N)\approx \sup _{s>0}\frac{{\varphi }(Ns)}{{\varphi }(s)}, \end{aligned}$$

where \({\varphi }(t):=1/\Phi ^{-1}(1/t)\) is the fundamental function of \(L^\Phi \).

In our case of interest, where \(\Phi (t)\) satisfies (4.1), we have

$$\begin{aligned} {\varphi }(t)\approx t^{1/p}\,\big (\log (e+\tfrac{1}{t})\big )^{{\alpha }}{\quad \text{ and }\quad }h_{L^\Phi }(N)\approx N^{1/p}\,\big (\log (e+N)\big )^{{\alpha }_-}; \end{aligned}$$

see [4, Proposition 3.4]. Thus, for the dual space \({{\mathbb {X}}}^*=L^\Psi \) with \(\Psi \) as in (4.2) we have

$$\begin{aligned} h_{L^\Psi }(N)\approx N^{1/p'}\,\big (\log (e+N)\big )^{{\alpha }_+}. \end{aligned}$$
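
For instance, a sketch of this last equivalence: the fundamental function of \(L^\Psi \) is \({\varphi }_\Psi (t)=1/\Psi ^{-1}(1/t)\approx t^{1/p'}\,\big (\log (e+\tfrac{1}{t})\big )^{-{\alpha }}\), so that

$$\begin{aligned} h_{L^\Psi }(N)\,\approx \,\sup _{s>0}\frac{{\varphi }_\Psi (Ns)}{{\varphi }_\Psi (s)}\,\approx \,\sup _{s>0}\,N^{1/p'}\,\Big (\frac{\log (e+\frac{1}{s})}{\log (e+\frac{1}{Ns})}\Big )^{{\alpha }}\,\approx \,N^{1/p'}\,\big (\log (e+N)\big )^{{\alpha }_+}, \end{aligned}$$

the supremum being essentially attained at \(s\approx 1/N\) when \({\alpha }>0\), and as \(s\rightarrow \infty \) when \({\alpha }\le 0\).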

Overall we conclude that property \({\texttt {A3}}(H)\) holds with

$$\begin{aligned} H(N)\,\approx \, N^{1/p'}\,\big (\log (e+N)\big )^{{\alpha }_+}. \end{aligned}$$
(4.8)

Thus, combining (4.7) and (4.8), we see that, for \(p>2\) we have

$$\begin{aligned} G(N)=\frac{1}{Q(\frac{c(\tau )}{H(N)})} \,\approx \,H(N)^2\,\approx \,N^\frac{2}{p'}\,\big (\log (e+N)\big )^{2{\alpha }_+} \end{aligned}$$

while for \(1<p\le 2\) we have

$$\begin{aligned} G(N)=\frac{1}{Q(\frac{c(\tau )}{H(N)})} \,\approx \,H(N)^{p'}\,\Big (\log (e+H(N))\Big )^{p'{\alpha }_-}\, \approx \,N\,\big (\log (e+N)\big )^{p'|{\alpha }|}. \end{aligned}$$
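
Indeed, by (4.8) we have \(H(N)^{p'}\approx N\,\big (\log (e+N)\big )^{p'{\alpha }_+}\) and \(\log (e+H(N))\approx \log (e+N)\), so

$$\begin{aligned} H(N)^{p'}\,\Big (\log (e+H(N))\Big )^{p'{\alpha }_-}\,\approx \,N\,\big (\log (e+N)\big )^{p'({\alpha }_++{\alpha }_-)}\,=\,N\,\big (\log (e+N)\big )^{p'|{\alpha }|}. \end{aligned}$$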

4.3 Proof of Theorem 1.17.a

Collecting the values of the parameters G(N) obtained in the previous subsection, and inserting them into Theorem 1.12, we deduce the first assertions (1.18) and (1.19) in Theorem 1.17.

4.4 Proof of Theorem 1.17.b

We shall use the following result, whose proof can be found in [4, Lemma 3.1]. For simplicity of notation, we assume in this section that the underlying space \({{\mathbb {R}}}^d\) has dimension \(d=1\).

Lemma 4.9

Let \({{\mathbb {X}}}=L^\Phi ({{\mathbb {R}}})\) be an Orlicz space with non-trivial Boyd indices, and let \({{\mathcal {D}}}=\{h_I\}\) be the (normalized) Haar basis in \({{\mathbb {X}}}\). If A is a finite collection of disjoint dyadic intervals, all with the same size s, then

$$\begin{aligned} \big \Vert \sum _{I\in A} h_I\big \Vert _{L^\Phi } \,\approx \, \frac{{\varphi }(|A|s)}{{\varphi }(s)}, \end{aligned}$$

where \({\varphi }(t)\) is the fundamental function of \({{\mathbb {X}}}\).

We now show the lower bound for the function \(\psi (N)\) stated in (1.20).

Proof of (1.20)

Write \({{\mathcal {D}}}=\{h_I\}\) where \(h_I\) is the (normalized) Haar function supported in I, and I runs over all dyadic intervals in \({{\mathbb {R}}}\). Pick any two collections A and B, of pairwise disjoint dyadic intervals with cardinalities \(|A|=N\) and \(|B|=M\), such that

$$\begin{aligned} |I|=1\quad \text{ if }\,\,I\in A,{\quad \text{ and }\quad }|I|=1/M\quad \text{ if }\,\,I\in B. \end{aligned}$$

For instance, we could take

$$\begin{aligned} A=\{[n,n+1)\mid n=1,\ldots , N\}{\quad \text{ and }\quad }B=\{J\subset [0,1)\mid |J|=2^{-m}\}, \end{aligned}$$

with \(M=2^m\). For \(b>0\) to be determined, consider the function

$$\begin{aligned} f=f_1+f_2=\sum _{I\in A}h_I+b\sum _{I\in B}h_I. \end{aligned}$$

Using Lemma 4.9 and \({\varphi }(t)\,\approx \,t^{1/p}\,\big (\log (e+\frac{1}{t})\big )^{\alpha }\), observe that

$$\begin{aligned} \Vert f_1\Vert \approx {\varphi }(N)\approx N^{1/p},{\quad \text{ and }\quad }\Vert f_2\Vert \approx \frac{b}{{\varphi }(1/M)}\approx \frac{b\, M^\frac{1}{p}}{(\log (e+M))^{\alpha }}. \end{aligned}$$
(4.10)

Also, since \(\Vert h_I\Vert _{L^\Phi }=1\) we have

$$\begin{aligned} |h_I(x)|=\frac{1}{{\varphi }(|I|)}{{\textbf {1}}}_I(x). \end{aligned}$$
(4.11)
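
Indeed, \(|h_I|\) is a constant multiple of \({{\textbf {1}}}_I\), say \(|h_I|=c\,{{\textbf {1}}}_I\), and the normalization gives \(1=\Vert h_I\Vert _{L^\Phi }=c\,\Vert {{\textbf {1}}}_I\Vert _{L^\Phi }=c\,{\varphi }(|I|)\), using the standard identity \(\Vert {{\textbf {1}}}_E\Vert _{L^\Phi }=1/\Phi ^{-1}(1/|E|)={\varphi }(|E|)\); hence \(c=1/{\varphi }(|I|)\).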

In particular, if \(I\in B\) we have

$$\begin{aligned} |f(x)|=b\,|h_I(x)|=\frac{b}{{\varphi }(1/M)}\approx \Vert f_2\Vert \lesssim \Vert f\Vert ,\quad x\in I, \end{aligned}$$
(4.12)

and similarly, if \(I\in A\) we have

$$\begin{aligned} |f(x)|=1\lesssim \Vert f_1\Vert \lesssim \Vert f\Vert , \quad x\in I. \end{aligned}$$
(4.13)

Using the formula for the norming functional in (4.3) we see that

$$\begin{aligned} |F_f(h_I)|=\,\frac{1}{C(f)}\,\times \,\left\{ \begin{array}{ll}\displaystyle \int |h_I|^p\,\Big (\log (c+|f(x)|/\Vert f\Vert )\Big )^{{\alpha }p}\,dx, &{} \quad I\in A,\\ \displaystyle {b^{p-1}}\int |h_I|^p\,\Big (\log (c+|f(x)|/\Vert f\Vert )\Big )^{{\alpha }p}\,dx, &{} \quad I\in B,\\ 0, &{} \quad I\not \in A\cup B, \end{array}\right. \end{aligned}$$

for some \(C(f)>0\). In view of (4.12) and (4.13), the logarithmic factors inside the integrals are approximately constant, so they can be disregarded. Also, (4.11) implies

$$\begin{aligned} \int |h_I|^p\,dx \,=\, \frac{|I|}{{\varphi }(|I|)^p}\,\approx \,\frac{1}{\big (\log (e+|I|^{-1})\big )^{{\alpha }p}}, \end{aligned}$$

so we have

$$\begin{aligned} |F_f(h_I)|\approx \tfrac{1}{C(f)},\quad I\in A,{\quad \text{ and }\quad }|F_f(h_I)|\approx \tfrac{1}{C(f)}\,\frac{b^{p-1}}{\big (\log (e+M)\big )^{{\alpha }p}},\quad I\in B. \end{aligned}$$
(4.14)

Thus, the above quantities are approximately the same provided we choose

$$\begin{aligned} b\,=\,c_1\,\big (\log (e+M)\big )^{{\alpha }p'}. \end{aligned}$$
(4.15)
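
Indeed, equating the two quantities in (4.14) amounts to \(b^{p-1}\approx \big (\log (e+M)\big )^{{\alpha }p}\), that is, \(b\approx \big (\log (e+M)\big )^{{\alpha }p/(p-1)}=\big (\log (e+M)\big )^{{\alpha }p'}\), since \(p/(p-1)=p'\).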

Therefore, if \(c_1>0\) is chosen properly, the WCGA, \({\mathscr {G}}_n(f)\), can be formed either by selecting consecutive elements I from A (if \(n\le N\)), or by selecting consecutive elements I from B (if \(n\le M/2\)). To verify these assertions one should note that the equivalences in (4.14) also remain true when f is replaced by the remainder \(f-{\mathscr {G}}_n(f)\).

Suppose now that the Lebesgue-type inequality considered in (1.20) holds. If \({\alpha }\ge 0\), we let \(N=\psi (M)\), and in view of the previous comment we can select \(c_1\) such that \({\mathscr {G}}_N(f)=f_1\). Then

$$\begin{aligned} \Vert f_2\Vert =\Vert f-{\mathscr {G}}_{\psi (M)}(f)\Vert \le C\,\sigma _M(f)\le \,C\, \Vert f_1\Vert , \end{aligned}$$

which in view of (4.15) and (4.10) implies

$$\begin{aligned} c_1^p\,M\,(\log (e+M))^{{\alpha }p'}\,=\,\frac{b^p\, M}{(\log (e+M))^{{\alpha }p}}\,\approx \,\Vert f_2\Vert ^p\lesssim \Vert f_1\Vert ^p\approx N=\psi (M). \end{aligned}$$
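
For the first equality above, note that \(b^p\,\big (\log (e+M)\big )^{-{\alpha }p}=c_1^p\,\big (\log (e+M)\big )^{{\alpha }p p'-{\alpha }p}\) and \({\alpha }p p'-{\alpha }p={\alpha }p(p'-1)={\alpha }p'\).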

This proves the assertion in the Theorem when \({\alpha }\ge 0\).

If \({\alpha }\le 0\), then we take \(M=2\psi (N)\), and select \(c_1\) such that \({\mathscr {G}}_{M/2}(f)=b\sum _{I\in B'}h_I\), for some \(B'\subset B\) with \(|B'|=M/2\). Then

$$\begin{aligned} \Vert f_1\Vert \lesssim \Vert f_1+b\sum _{I\in B\setminus B'}h_I\Vert =\Vert f-{\mathscr {G}}_{\psi (N)}(f)\Vert \le C\,\sigma _N(f)\le \,C\,\Vert f_2\Vert , \end{aligned}$$

which this time implies

$$\begin{aligned} N\approx \Vert f_1\Vert ^p\lesssim \Vert f_2\Vert ^p\approx \,M\,(\log (e+M))^{{\alpha }p'}. \end{aligned}$$

Solving for M this gives

$$\begin{aligned} \psi (N)=M/2\,\gtrsim \,\frac{N}{(\log (e+N))^{{\alpha }p'}}\,=\,N\,\big (\log (e+N)\big )^{|{\alpha }|\,p'}. \end{aligned}$$

This establishes (1.20), and therefore completes the proof of Theorem 1.17. \(\square \)

5 WCGA for Trigonometric System in \(L^p(\log \,L)^{\alpha }\)

In this section we give a second application of Theorem 1.12, this time to the trigonometric system in the torus \({{\mathbb {T}}}\equiv [-\pi ,\pi )\), that is,

$$\begin{aligned} {{\mathcal {D}}}={{\mathcal {T}}}:=\{e^{inx}\}_{n\in {{\mathbb {Z}}}}. \end{aligned}$$

So, from now on, all functions \(f\in L^p(\log \,L)^{\alpha }\) are understood to be defined on \({{\mathbb {T}}}\). As before, we regard \(L^p(\log \,L)^{\alpha }\) as an Orlicz space \(L^\Phi \) in the same sense as in Sect. 4. Since [8] also covers this setting, the estimates for the moduli of convexity and smoothness in Proposition 4.5 remain true, and so does the estimate (4.7) for the function Q(s) in Corollary 4.6.

We still have to compute the parameters \(k_N\) and H(N). To do so, we shall make use of the following interpolation lemma.

Lemma 5.1

Consider the Young function \({\bar{\Phi }}(t)=t^p\,\big (\log (c+t)\big )^{{\alpha }p}\), for some \(c\ge e\). Assume that

$$\begin{aligned} 2<p<\infty \;\;\text{ and }\;\;{\alpha }\in {{\mathbb {R}}},\quad \text{ or }\quad p=2\;\;\text{ and }\;\;{\alpha }\ge 0. \end{aligned}$$
(5.2)

Then,

$$\begin{aligned} \Vert f\Vert _{L^{\bar{\Phi }}}\le \Vert f\Vert _\infty ^{1-\frac{2}{p}}\,\Big (\log \big (c+\big [\tfrac{\Vert f\Vert _\infty }{\Vert f\Vert _2}\big ]^\frac{2}{p}\big )\Big )^{\alpha }\,\Vert f\Vert _2^{\frac{2}{p}}, \quad \quad \forall \,f\in L^\infty ({{\mathbb {T}}}). \end{aligned}$$

Proof

We may assume that \(\Vert f\Vert _2=1\). Define the functions

$$\begin{aligned} a(t)=t^\frac{2}{p}\,\big (\log (c+t^\frac{2}{p})\big )^{-{\alpha }}{\quad \text{ and }\quad }b(t)=\frac{t}{a(t)}= t^{1-\frac{2}{p}}\,\big (\log (c+t^\frac{2}{p})\big )^{{\alpha }}, \quad t>0. \end{aligned}$$

By the lattice property of the Luxemburg norm in \(L^{\bar{\Phi }}\) we have

$$\begin{aligned} \Vert f\Vert _{L^{\bar{\Phi }}} = \big \Vert a(f)b(f)\big \Vert _{L^{\bar{\Phi }}}\le \,\big \Vert b(f)\big \Vert _{L^\infty }\,\big \Vert a(f)\big \Vert _{L^{\bar{\Phi }}}\, = \, b\big (\Vert f\Vert _\infty \big )\,\big \Vert a(f)\big \Vert _{L^{\bar{\Phi }}}, \end{aligned}$$

using that b(t) is increasing under the conditions in (5.2). So, it suffices to show that

$$\begin{aligned} \int _{{\mathbb {T}}}{\bar{\Phi }}\big (a(|f(x)|)\big )\,dx\,\le \,1, \end{aligned}$$

as this will imply that \(\big \Vert a(f)\big \Vert _{L^{\bar{\Phi }}}\le 1\). Write

$$\begin{aligned} \int {\bar{\Phi }}\big (a(|f|)\big )\,dx= \int a(|f|)^p\,\Big [\log \big (c+a(|f|)\big )\Big ]^{{\alpha }p}\,dx. \end{aligned}$$

Observe that, regardless of the sign of \({\alpha }\in {{\mathbb {R}}}\), we always have

$$\begin{aligned} \Big [\log \big (c+a(|f|)\big )\Big ]^{{\alpha }p}=\Bigg [\log \Big (c+\frac{|f|^\frac{2}{p}}{[\log (c+|f|^{2/p})]^{{\alpha }}}\Big )\Bigg ]^{{\alpha }p}\le \Big [\log \Big (c+|f|^\frac{2}{p}\Big )\Big ]^{{\alpha }p}. \end{aligned}$$

Thus,

$$\begin{aligned} \int {\bar{\Phi }}\big (a(|f|)\big )\,dx\,\le \, \int a(|f|)^p\,\Big [\log \big (c+|f|^\frac{2}{p}\big )\Big ]^{{\alpha }p}\,dx\,=\,\int |f|^2=1. \end{aligned}$$

\(\square \)

Remark 5.3

Observe that, when the indices p and \({\alpha }\) satisfy (5.2), then it holds

$$\begin{aligned} L^p(\log L)^{\alpha }\hookrightarrow L^2({{\mathbb {T}}}). \end{aligned}$$
(5.4)

This is easily proved using that \(t^2\lesssim \Phi (t)\) for \(t\ge 1\), since

$$\begin{aligned} \int _{{\mathbb {T}}}|f|^2=\int _{\{|f|<1\}}|f|^2+\int _{\{|f|\ge 1\}}|f|^2\le 2\pi +c'\int \Phi (|f|)\,dx<\infty . \end{aligned}$$

Likewise, by duality, one proves that \(L^{2}({{\mathbb {T}}})\hookrightarrow L^p(\log L)^{\alpha }\) when

$$\begin{aligned} 1<p<2\;\;\text{ and }\;\;{\alpha }\in {{\mathbb {R}}},\quad \text{ or }\quad p=2\;\;\text{ and }\;\;{\alpha }\le 0. \end{aligned}$$
(5.5)
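
A sketch of the duality argument: the pair \((p,{\alpha })\) satisfies (5.5) exactly when \((p',-{\alpha })\) satisfies (5.2), so (5.4) applied to \(L^{p'}(\log L)^{-{\alpha }}\) gives \(L^{p'}(\log L)^{-{\alpha }}\hookrightarrow L^2({{\mathbb {T}}})\), and dualizing this embedding (recall that \(\big (L^{p'}(\log L)^{-{\alpha }}\big )^*=L^{p}(\log L)^{{\alpha }}\), up to equivalent norms) yields \(L^2({{\mathbb {T}}})\hookrightarrow L^p(\log L)^{\alpha }\).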

5.1 Property \({\texttt {A3}}\) for \({{\mathcal {T}}}\) in \(L^p(\log L)^{\alpha }\)

Lemma 5.6

Let \(1<p<\infty \) and \({\alpha }\in {{\mathbb {R}}}\). Then, for all \(|{\varepsilon }_n|\le 1\) and all \(A\subset {{\mathbb {Z}}}\) with \(|A|\le N\) it holds

$$\begin{aligned} \big \Vert \sum _{n\in A}{\varepsilon }_n e^{inx}\big \Vert _{L^p(\log L)^{\alpha }}\lesssim \max \Big \{N^{1/2}, \;N^{1-\frac{1}{p}}\,\big (\log (e+N)\big )^{\alpha }\Big \}. \end{aligned}$$
(5.7)

Proof

When p and \({\alpha }\) satisfy (5.5), the right hand side of (5.7) is \(\approx N^{1/2}\), so the assertion follows from the inclusion \(L^{2}\hookrightarrow L^p(\log L)^{\alpha }\). On the other hand, if p and \({\alpha }\) satisfy (5.2), then by Lemma 5.1 we have

$$\begin{aligned} \Vert f\Vert _{L^p(\log L)^{\alpha }}\,\lesssim \, b\Big (\tfrac{\Vert f\Vert _\infty }{\Vert f\Vert _2}\Big )\,\Vert f\Vert _2, \end{aligned}$$
(5.8)

where \(b(t)=t^{1-\frac{2}{p}}\,\big (\log (c+t^\frac{2}{p})\big )^{{\alpha }}\). Applying this to \(f=\sum _{n\in A}{\varepsilon }_n e^{inx}\), and using that b(t) is increasing, together with the elementary bounds (via the Cauchy–Schwarz inequality and \(|{\varepsilon }_n|\le 1\))

$$\begin{aligned} {\Vert f\Vert _\infty }/{\Vert f\Vert _2}\,\lesssim \,\sqrt{N}{\quad \text{ and }\quad }\Vert f\Vert _2\,\lesssim \,\sqrt{N}, \end{aligned}$$

one easily obtains (5.7). \(\square \)

Remark 5.9

The upper bounds in (5.7) cannot be improved, even when all signs \({\varepsilon }_n=1\). Indeed, if one considers the Dirichlet kernel \(D_N(x)=\sum _{|n|\le N} e^{inx}\), then we have

$$\begin{aligned} \Vert D_N\Vert _{L^p(\log L)^{\alpha }}\,\approx \, N^{1-\frac{1}{p}}\,\big (\log (e+N)\big )^{\alpha }; \end{aligned}$$
(5.10)

see e.g. [9, Lemma 3.1]. On the other hand, if A is a lacunary set (say, \(A=\{2^j\}_{j=1}^N\)), then

$$\begin{aligned} \big \Vert \sum _{n\in A} e^{inx}\big \Vert _{L^p(\log L)^{\alpha }}\,\approx \, \sqrt{N}. \end{aligned}$$

Indeed, this is easily obtained from a similar result for all the \(L^q\) spaces, \(0<q<\infty \), and the inclusions \(L^{p+{\varepsilon }}\hookrightarrow L^p(\log L)^{\alpha }\hookrightarrow L^{p-{\varepsilon }}\).

Corollary 5.11

Let \(1<p<\infty \) and \({\alpha }\in {{\mathbb {R}}}\). Let \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\) and \({{\mathcal {D}}}=\{e^{inx}\}_{n\in {{\mathbb {Z}}}}\) in \({{\mathbb {T}}}\). Then, property \({\texttt {A3}}(H)\) holds with

$$\begin{aligned} H(N)\,\approx \, \max \Big \{N^{1/2}, \;N^{\frac{1}{p}}\,\big (\log (e+N)\big )^{-{\alpha }}\Big \}. \end{aligned}$$

Proof

Apply Lemmas 2.6 and 5.6, and the duality relation \({{\mathbb {X}}}^*=L^{p'}(\log L)^{-{\alpha }}\). \(\square \)
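
In a bit more detail, a sketch: the dual system of \({{\mathcal {T}}}\) is, up to normalization, again the trigonometric system, so by Lemma 2.6 it suffices to bound \(\Vert \sum _{n\in A}{\varepsilon }_n e^{inx}\Vert _{{{\mathbb {X}}}^*}\) for \(|A|\le N\). Applying Lemma 5.6 in \({{\mathbb {X}}}^*=L^{p'}(\log L)^{-{\alpha }}\), i.e. with \((p,{\alpha })\) replaced by \((p',-{\alpha })\), gives the bound \(\max \big \{N^{1/2},\,N^{1-\frac{1}{p'}}\,(\log (e+N))^{-{\alpha }}\big \}=\max \big \{N^{1/2},\,N^{\frac{1}{p}}\,(\log (e+N))^{-{\alpha }}\big \}\), as stated.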

5.2 Property \({\texttt {A2}}\) for \({{\mathcal {T}}}\) in \(L^p(\log L)^{\alpha }\)

Given a finite set \(A\subset {{\mathbb {Z}}}\), we denote

$$\begin{aligned} S_A(g)=\sum _{n\in A}{{\hat{g}}}(n)e^{inx}, \end{aligned}$$

where \({{\hat{g}}}(n)\), \(n\in {{\mathbb {Z}}}\), are the Fourier coefficients of \(g\in L^1({{\mathbb {T}}})\). As noticed in [2, Lemma 2.15], property \({\texttt {A2}}(k_N)\) holds trivially when we let

$$\begin{aligned} k_N=\sup _{|A|\le N}\Vert S_A\Vert _{L^p(\log L)^{\alpha }\rightarrow L^p(\log L)^{\alpha }}. \end{aligned}$$
(5.12)

In this section we compute this last expression.

Lemma 5.13

Let \(2<p<\infty \) and \({\alpha }\in {{\mathbb {R}}}\), or \(p=2\) and \({\alpha }\ge 0\). Then,

$$\begin{aligned} \sup _{|A|\le N}\big \Vert S_A\big \Vert _{L^p(\log L)^{\alpha }\rightarrow L^p(\log L)^{\alpha }}\,\lesssim \;N^{\frac{1}{2}-\frac{1}{p}}\,\big (\log (e+N)\big )^{\alpha }. \end{aligned}$$
(5.14)

Proof

Let \(g\in L^p(\log L)^{\alpha }\). Using the inequality in (5.8) from the previous section, applied to \(f=S_A(g)\) we see that

$$\begin{aligned} \Vert S_A(g)\Vert _{L^p(\log L)^{\alpha }}\,\lesssim \, b\Big (\tfrac{\Vert S_A(g)\Vert _\infty }{\Vert S_A(g)\Vert _2}\Big )\,\Vert S_A(g)\Vert _2. \end{aligned}$$

Now,

$$\begin{aligned} \Vert S_A(g)\Vert _\infty \le |A|^{1/2}\,\big (\sum _{n\in A}|{{\hat{g}}}(n)|^2\big )^{1/2}\le \sqrt{N}\,\Vert S_A(g)\Vert _2, \end{aligned}$$

so using that b(t) is increasing we obtain

$$\begin{aligned} \Vert S_A(g)\Vert _{L^p(\log L)^{\alpha }}\,\lesssim \, b\big (\sqrt{N}\big )\,\Vert S_A(g)\Vert _2. \end{aligned}$$

On the other hand, the inclusion in (5.4) gives

$$\begin{aligned} \Vert S_A(g)\Vert _2\le \Vert g\Vert _2\lesssim \Vert g\Vert _{L^p(\log L)^{\alpha }}. \end{aligned}$$

Thus, we obtain

$$\begin{aligned} \Vert S_A\Vert \lesssim b(\sqrt{N}) \approx \,N^{\frac{1}{2}-\frac{1}{p}}\,\big (\log (e+N)\big )^{\alpha }. \end{aligned}$$

\(\square \)

Since \(S_A^*=S_A\), by duality one obtains the following complementary result.

Lemma 5.15

Let \(1<p<2\) and \({\alpha }\in {{\mathbb {R}}}\), or \(p=2\) and \({\alpha }\le 0\). Then,

$$\begin{aligned} \sup _{|A|\le N}\Vert S_A\Vert _{L^p(\log L)^{\alpha }\rightarrow L^p(\log L)^{\alpha }}\lesssim \;N^{\frac{1}{p}-\frac{1}{2}}\,\big (\log (e+N)\big )^{-{\alpha }}. \end{aligned}$$
(5.16)

Remark 5.17

The estimate in (5.16) is best possible (and by duality, also (5.14)). One can prove this by noticing that there exist choices of signs \(\pm 1\) such that

$$\begin{aligned} \big \Vert \sum _{|n|\le N}\pm e^{inx}\big \Vert _{L^p(\log L)^{\alpha }}\gtrsim \sqrt{N}. \end{aligned}$$
(5.18)

This last assertion can be easily obtained from a similar property of the \(L^q\)-spaces, and the inclusions at the end of Remark 5.9. From (5.18), there will be a set \(A\subset [-N,N]\), corresponding either to the positive or to the negative signs, such that

$$\begin{aligned} \big \Vert \sum _{n\in A}e^{inx}\big \Vert _{L^p(\log L)^{\alpha }}\gtrsim \tfrac{1}{2}\,\sqrt{N}. \end{aligned}$$

Thus, omitting the subindices \(L^p(\log L)^{\alpha }\) from the norms, we have

$$\begin{aligned} \Vert S_A\Vert \ge \Vert S_A(D_N)\Vert /\Vert D_N\Vert = \big \Vert \sum _{n\in A}e^{inx}\big \Vert /\Vert D_N\Vert \gtrsim N^{\frac{1}{p}-\frac{1}{2}}\,\big (\log (e+N)\big )^{-{\alpha }}, \end{aligned}$$

using (5.10) in the last step.

Corollary 5.19

Let \(1<p<\infty \) and \({\alpha }\in {{\mathbb {R}}}\). Let \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\) and \({{\mathcal {D}}}=\{e^{inx}\}_{n\in {{\mathbb {Z}}}}\) in \({{\mathbb {T}}}\). Then, property \({\texttt {A2}}(k_N)\) holds with

$$\begin{aligned} k_N\,\approx \, \left\{ \begin{array}{ll} N^{\frac{1}{2}-\frac{1}{p}}\,\big (\log (e+N)\big )^{\alpha }&{} \text{ if } p>2, \text{ or } p=2 \text{ and } {\alpha }\ge 0,\\ N^{\frac{1}{p}-\frac{1}{2}}\,\big (\log (e+N)\big )^{-{\alpha }} &{} \text{ if } 1<p<2, \text{ or } p=2 \text{ and } {\alpha }\le 0. \end{array}\right. \end{aligned}$$

Proof

Apply Lemmas 5.13 and 5.15 to the expression in (5.12). \(\square \)

5.3 WCGA for \({{\mathcal {T}}}\) in \(L^p(\log L)^{\alpha }\)

Combining the estimates from the previous subsections, we obtain the following.

Theorem 5.20

Let \(1<p<\infty \) and \({\alpha }\in {{\mathbb {R}}}\). Let \({{\mathbb {X}}}=L^p(\log L)^{\alpha }\) and \({{\mathcal {D}}}=\{e^{inx}\}_{n\in {{\mathbb {Z}}}}\) in \({{\mathbb {T}}}\). Then, there exists a constant \(C>1\) such that the WCGA satisfies

$$\begin{aligned} \Big \Vert f-{\mathscr {G}}_{\phi (N)}(f)\Big \Vert _{L^p(\log L)^{\alpha }}\le \,2\,\sigma _N(f),\quad \forall \,f\in L^p(\log L)^{\alpha },\;N\ge 2, \end{aligned}$$

where

$$\begin{aligned} \phi (N)\,=\,C\,\left\{ \begin{array}{ll} N\,\log N &{} \text{ when } p>2,\\ N\,\log \log N &{} \text{ when } p=2 \text{ and } {\alpha }>0,\\ N &{} \text{ when } p=2 \text{ and } {\alpha }=0,\\ N\,(\log N)^{4{\alpha }_-}\,\log \log N &{} \text{ when } p=2 \text{ and } {\alpha }<0,\\ N^{p'-1}\,\frac{(\log N)^{p'{\alpha }_-}}{(\log N)^{p'{\alpha }}}\,\log N &{} \text{ when } 1<p<2. \end{array}\right. \end{aligned}$$

Proof

Combine Theorem 1.12, with the estimates for Q(t), H(N) and \(k_N\) in Corollaries 4.6, 5.11 and 5.19. \(\square \)

Remark 5.21

The necessity of the log factors and the powers in the above expression of \(\phi (N)\) is not known, even in the case \({\alpha }=0\) (except, of course, if \({{\mathbb {X}}}=L^2\)). See [16, Open Question 8.2].