Hoeffding decomposition in $H^1$ spaces

A well-known result of Bourgain and Kwapie\'n states that the projection $P_{\leq m}$ onto the subspace of the Hilbert space $L^2\left(\Omega^\infty\right)$ spanned by functions depending on at most $m$ variables is bounded in $L^p$ with norm $\leq c_p^m$ for $1<p<\infty$. We are concerned with two kinds of endpoint estimates. We prove that $P_{\leq m}$ is bounded on the space $H^1\left(\mathbb{D}^\infty\right)$ of functions in $L^1\left(\mathbb{T}^\infty\right)$ analytic in each variable. We also prove that $P_{\leq 2}$ is bounded on the martingale Hardy space associated with a natural double-indexed filtration and, more generally, we exhibit a multiple indexed martingale Hardy space which contains $H^1\left(\mathbb{D}^\infty\right)$ as a subspace and on which $P_{\leq m}$ is bounded.

The number $|A|$ is called the multiplicity of $w_A$. Analogous problems for Walsh functions of finite multiplicity have been resolved independently by Bonami [2] and Kiener [15]. Namely, the inequality 1 2 holds true, and the orthogonal projection $P_m$ onto $\mathrm{span}\,(w_A : |A| \leq m)$ is bounded if and only if $1 < p < \infty$. Some lower estimates for $P_m$ are also known, see [14] or [22] for a detailed discussion on this subject.
In [5], Bourgain generalized these results to a setting in which $\left(\mathbb{Z}_2, \{\emptyset, \{0\}, \{1\}, \{0, 1\}\}, \frac{1}{2}\#\right)$ is replaced with an arbitrary probability space $(\Omega, \mathcal{F}, \mu)$. To be more precise, let $(\Omega^\infty, \mathcal{F}^{\otimes\infty}, \mu^{\otimes\infty})$ be the infinite product space. Any $f \in L^2(\Omega^\infty, \mathcal{F}^{\otimes\infty}, \mu^{\otimes\infty})$ can be decomposed in a unique way into a series (1.4) $f(x) =$ are mutually orthogonal orthogonal projections. In the case of $\Omega = \mathbb{Z}_2$, the image of $P_A$ is just the one-dimensional space spanned by $w_A$, so the above definition of $P_m$ coincides with the projection onto Walsh functions of multiplicity $m$. In [5] Bourgain proved that for $1 \leq p < \infty$, which is a direct generalization of (1.3). Moreover, he proved that $P_m$ is bounded on $L^p$ if and only if $1 < p < \infty$, with norm smaller than $c_p^m$ where c p p 5 2 logp andp = $p \vee \frac{p}{p-1}$.

It turns out that the projections $P_m$ have a well-established probabilistic interpretation. In [16], Kwapień connected them to the notion of Hoeffding decomposition, which originated from Hoeffding's work [13]. More precisely, elements of the image of $P_m$ are what is called generalized canonical $U$-statistics and the decomposition $f = \sum_m P_m f$ plays a crucial role in the proofs of many theorems concerning $U$-statistics. For more information, we refer the reader to [18]. Kwapień provided a shorter proof of Bourgain's result about boundedness of $P_m$, with a better constant c p p logp .

Let us describe the main results of this paper, which give certain endpoint estimates for $P_m$. One of them (Theorem 4.5 in the text) is obtained by restricting the domain of $P_m$. For the exact definition of $H^1_{\mathrm{all}}$, see Section 2.
Theorem A. $P_m$ is bounded on the subspace $H^1_{\mathrm{all}}(\mathbb{T}^\infty)$ of $L^1(\mathbb{T}^\infty)$ consisting of functions analytic in each variable.

We also find a norm stronger than $L^1$ and weaker than $L^p$ ($p > 1$), in which $P_m$ is bounded. The detailed construction is described in Section 5.2.
Theorem B. For any $m \in \mathbb{N}$, there is a partition of the family of finite subsets of $\mathbb{N}$ into $\dot{\bigcup}_{i\in I} A_i$ such that the norm is between $L^1$ and all $L^p$ ($p > 1$) and $P_m$ is bounded in this norm.
It is worth noting that Theorem A translates directly to the space $H^1$ of Dirichlet series, i.e. the closure of polynomials of the form The Bohr lift, dating back to [1], is the map where $a_k = b_n$ for $n$ having the prime number factorization $n = \prod_j p_j^{k_j}$. It is an isometry between $H^1_{\mathrm{all}}(\mathbb{T}^\infty)$ and the space $H^1$ of Dirichlet series. Thus, our result is equivalent to the fact that the projection from $H^1$ onto (1.10) $\mathrm{span}\left\{n^{-s} : n \text{ has at most } m \text{ prime factors}\right\} \subset H^1$ is bounded. For a more detailed exposition of Dirichlet series and their relation to polydisc Hardy spaces, see [23].

The paper is organized as follows. In Section 2 we introduce necessary notation and definitions. In Section 3, we provide a new simple proof of the historic $L^p$ boundedness result. The proof of the estimate $\|P_m\| \leq (e\|P_1\|)^m$ is done by means of a combinatorial identity expressing $P_m$ in terms of tensor products of $P_1$. In Section 4, we show that the same argument carries over with little modification, showing boundedness of $P_m$ on $H^1_{\mathrm{all}}(\mathbb{T}^\infty)$. In Section 5.1, we define, purely in terms of square functions and not referring to analyticity, a multiple indexed martingale Hardy space $H^1[T^m]$ of functions on $\Omega^\infty$ that admits a bounded action of $P_m$. It turns out that if $\Omega = \mathbb{T}$, there is a subspace The arguments rely heavily on the $L^1$ square function theorem for Hardy martingales and the decoupling inequality of Zinn. We present two proofs of the latter in Section 6.
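As a small illustration of the Bohr lift mentioned above, and assuming the usual enumeration of the primes $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, $p_4 = 7, \ldots$ (the text does not fix the enumeration explicitly), the correspondence between the terms $n^{-s}$ and monomials on $\mathbb{T}^\infty$ can be sketched as follows. The variable names $z_j = e^{it_j}$ are introduced here only for the example, and "prime factors" is read as distinct prime factors.

```latex
\[
\begin{aligned}
12^{-s} = (2^{2}\cdot 3)^{-s} &\longmapsto z_1^{2} z_2, \\
30^{-s} = (2\cdot 3\cdot 5)^{-s} &\longmapsto z_1 z_2 z_3, \\
49^{-s} = (7^{2})^{-s} &\longmapsto z_4^{2}.
\end{aligned}
\]
```

Under this reading, $12^{-s}$ corresponds to a monomial in two variables, $30^{-s}$ to one in three variables and $49^{-s}$ to one in a single variable, so a bound on the number of variables of a monomial becomes a bound on the number of distinct primes dividing $n$.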

Acknowledgements
The results of this paper are taken from my doctoral thesis [25]. I am grateful to my advisors, Fedor Nazarov and Michał Wojciechowski, for their mentorship and support, especially their unending willingness to discuss my research.

Preliminaries
Probability spaces and conditional expectations. Throughout the text, $(\Omega, \mathcal{F}, \mu)$ will be a probability space. We will equip sets of the form $\Omega^I$, where $I$ is an at most countable index set, with the product measure $\mu^{\otimes I}$ defined on $\mathcal{F}^{\otimes I}$. In case we are only concerned with the cardinality of $I$, we will write $\Omega^n$, where $n$ is a natural number or $\infty$. By the natural filtration on $\Omega^{\mathbb{N}}$ we mean the filtration $(\mathcal{F}_n : n = 0, 1, \ldots)$, where $\mathcal{F}_k$ is generated by the coordinate projection $\omega \mapsto (\omega_1, \ldots, \omega_k)$, and we denote $\mathbb{E}_k = \mathbb{E}(\cdot \mid \mathcal{F}_k)$. In general, for a subset $A$ of the index set, $\mathcal{F}_A$ will be the sigma algebra generated by the coordinate projection $\omega \mapsto (\omega_i)_{i\in A}$ and $\mathbb{E}_A = \mathbb{E}(\cdot \mid \mathcal{F}_A)$. In more explicit terms, measurability with respect to $\mathcal{F}_A$ is equivalent to being dependent only on variables with indices belonging to $A$ and the conditional expectation operator $\mathbb{E}_A$ integrates away the dependence on all other variables, so that the formulas are satisfied (with the convention that sequences indexed by $A$ and $\mathbb{N} \setminus A$ are merged in a natural way into a sequence indexed by $\mathbb{N}$). It will often be convenient to identify a function $f$ defined on $\Omega^A$ with an $\mathcal{F}_A$-measurable function $\Omega^I \ni \omega \mapsto f\left((\omega_i)_{i\in A}\right)$. In order to save space, we will often write $dx$ instead of $d\mu(x)$ whenever the measure is implied by context.
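To make the description of the conditional expectation operators concrete, here is a minimal sketch for the index set $\{1,2,3\}$ and $A = \{1,3\}$, using the convention $dy$ for $d\mu(y)$. The particular choice of $A$ and the dummy variables $y, y_1, y_2, y_3$ are illustrative only.

```latex
\[
\mathbb{E}_{\{1,3\}} f(x_1, x_2, x_3) = \int_{\Omega} f(x_1, y, x_3)\, dy ,
\qquad
\mathbb{E}_{\emptyset} f = \int_{\Omega^{3}} f(y_1, y_2, y_3)\, dy_1\, dy_2\, dy_3 = \mathbb{E} f .
\]
```

The first expression no longer depends on $x_2$, which is the sense in which the operator integrates away the variables with indices outside $A$.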
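Tensor products. Let $1 \leq p < \infty$. For $f_k \in L^p(\Omega_k)$, we will denote by $\bigotimes_{k=1}^n f_k$ the function on $\prod_k \Omega_k$ satisfying $\left(\bigotimes_{k=1}^n f_k\right)(x_1, \ldots, x_n) = \prod_{k=1}^n f_k(x_k)$. Because of separation of variables, we have $\left\|\bigotimes_{k=1}^n f_k\right\|_{L^p\left(\prod_k \Omega_k\right)} = \prod_{k=1}^n \|f_k\|_{L^p(\Omega_k)}$. This way we actually define an injection of the algebraic tensor product $\bigotimes_k L^p(\Omega_k)$ into $L^p\left(\prod_k \Omega_k\right)$, the image of which is dense.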
Let $X_k$ be subspaces (by a subspace we always mean a closed linear subspace) of $L^p(\Omega_k)$. By $\bigotimes_k X_k$ we will denote the subspace of $L^p\left(\prod_k \Omega_k\right)$ spanned by functions of the form $\bigotimes_k f_k$, where $f_k \in X_k$, and the norm is inherited from $L^p\left(\prod_k \Omega_k\right)$ (care has to be taken, as $\bigotimes_k X_k$ is not determined solely by $X_k$ as Banach spaces, but rather by the particular way they are embedded in $L^p(\Omega_k)$). If $T_k : X_k \to L^p(\Omega_k)$ are bounded operators, then we can define an operator and easily check that the property

Fourier transform. Let $\mathbb{T}$ be the interval $[0, 2\pi)$ equipped with addition modulo $2\pi$ and normalized Lebesgue measure $d\mu = \frac{dx}{2\pi}$. We will be exclusively dealing with Fourier transforms of functions on $\mathbb{T}$ or some power of $\mathbb{T}$. Since the group dual to $\mathbb{T}$ is $\mathbb{Z}$, the dual group to the product $\mathbb{T}^{\mathbb{N}}$ is the direct sum $\mathbb{Z}^{\oplus\mathbb{N}}$ (i.e., integer-valued sequences that are eventually 0), on which we define the Fourier transform by

Hardy spaces of martingales and analytic functions. By $\mathbb{D}$ we denote the unit disk in the complex plane. We can identify $\mathbb{T}$ with the unit circle by the map $t \mapsto e^{it}$. For $N \in \mathbb{N}$, the space $H^1(\mathbb{D}^N)$ is defined as the space of functions analytic in the polydisc $\mathbb{D}^N$ such that the norm . . , r n e itN dt (2π) N is finite. It is well-known [12] that such a function has an a.e. radial limit $f(t_1, \ldots, t_N) = \lim_{r\to 1} F\left(re^{it_1}, \ldots, re^{it_N}\right)$ on the distinguished boundary $\mathbb{T}^N$ and $F$ can be recovered from $f$ by convolution with a Poisson kernel. This sets a one-to-one correspondence between $H^1(\mathbb{D}^N)$ and the space We can also define $H^1_{\mathrm{all}}(\mathbb{T}^{\mathbb{N}})$ in the same manner as in (2.8), but care has to be taken, since these functions can no longer be extended analytically to $\mathbb{D}^{\mathbb{N}}$ in general (hence the shorthand $H^1(\mathbb{D}^{\mathbb{N}})$, which we will sometimes use, is an abuse of notation). Later we will use two more $H^1$ spaces, namely $H^1_{\mathrm{last}}(\mathbb{T}^{\mathbb{N}})$ (also called Hardy martingales) and $H^1_{m\,\mathrm{last}}(\mathbb{T}^{\mathbb{N}})$, which we will define as follows.
In the space $H^1_{m\,\mathrm{last}}(\mathbb{T}^{\mathbb{N}})$ we allow characters of the form $e^{i\langle n,t\rangle}$, where $|\mathrm{supp}\, n| < m$ and $n_j \geq 0$ for all $j$.
Now we recall the definition of a martingale Hardy space and some related inequalities. A standard reference in this matter is [11]. Let $(\mathcal{F}_n)_{n=0}^\infty$ be an arbitrary filtration on a probability space $(\Omega, \mathcal{F}, \mu)$, where $\mathcal{F}$ is generated by the $\mathcal{F}_n$. We denote $\mathbb{E}_k = \mathbb{E}(\cdot \mid \mathcal{F}_k)$, $\Delta_0 = \mathbb{E}_0$, $\Delta_k = \mathbb{E}_k - \mathbb{E}_{k-1}$ for $k \geq 1$, and define the square function and maximal function of $f$ respectively by This allows us to define the martingale Hardy space.
Definition 2.1. $H^1[(\mathcal{F}_n)]$ is a function space on $\Omega$ with the norm

We will make use of the following three classical martingale inequalities.
Theorem 2.2 (Burkholder, Gundy [8] for $1 < p < \infty$; Davis [9] for $p = 1$). For $1 \leq p < \infty$,

Theorem 2.4 (Stein [4]). For $1 < p < \infty$ and an arbitrary sequence

Definition 2.5. A martingale atom is a function of the form

] be of mean 0. Then there are atoms $a_1, a_2, \ldots$ and scalars $c_1, c_2, \ldots$ such that $f = \sum_n c_n a_n$ and where the duality is given by $\langle f, g\rangle = \lim_{n\to\infty} \mathbb{E}\left(\mathbb{E}_n f \, \mathbb{E}_n g\right)$.
Vector-valued inequalities. For a Banach space $B$, by $L^p(S, B)$ we denote the Bochner space of strongly measurable $B$-valued random variables equipped with the norm $\|f\|_{L^p(S,B)} = \left(\int_S \|f\|_B^p\right)^{1/p}$. For an operator $T$ between subspaces of $L^p(S_1)$ and $L^p(S_2)$ and a linear operator $F : B_1 \to B_2$ we can define $T \otimes F$ on the algebraic tensor product by $(T \otimes F)(f \otimes b) = Tf \otimes Fb$, but this construction does not necessarily produce a bounded operator on the closure. The main tool for obtaining vector-valued extensions of inequalities will be the following lemma, which for $I_1, I_2$ being singletons is due to Marcinkiewicz and Zygmund [20] (in this case $\lesssim \|T\|$ can be replaced with $\leq \|T\|$).
Let also $r_j$ for $j \in J$ be Rademacher variables. Then, applying the $\ell^2(I_2)$-valued Khintchine inequality,

Hoeffding decomposition. Now we define the main object of our interest. In order to avoid technicalities with convergence in strong operator topology, we will work in a finite product of $\Omega$ (all the results extend automatically to $\Omega^\infty$ by density). We will see in a moment that any function $f \in L^1(\Omega^n)$ can be decomposed in a unique way as $f = \sum_{m=0}^{n} \sum_{1 \leq i_1 < \ldots < i_m \leq n} P_{i_1,\ldots,i_m} f$, where $P_{i_1,\ldots,i_m} f(x_1, \ldots, x_n)$ depends only on $x_{i_1}, \ldots, x_{i_m}$ and is of mean 0 with respect to each of $x_{i_1}, \ldots, x_{i_m}$ (equivalently, $P_A f$ is $\mathcal{F}_A$-measurable and is orthogonal to all $\mathcal{F}_B$-measurable functions for $B \subsetneq A$). This decomposition has been studied in [5], [16]. In particular, $P_{i_1,\ldots,i_m}$ are pairwise orthogonal orthogonal projections. Let $P_m = \sum_{1 \leq i_1 < \ldots < i_m \leq n} P_{i_1,\ldots,i_m}$ and $U_m$ be the range of $P_m$. It is known [5], [16] that $P_m$ is bounded on $L^p(\Omega^n)$, $1 < p < \infty$, with norm independent of $n$, but this is not true for $L^1(\Omega^n)$.
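For orientation, the decomposition described above can be written out by hand for $n = 2$. The display below is a sketch that uses the conditional expectations $\mathbb{E}_{\{1\}}$ and $\mathbb{E}_{\{2\}}$ from the notation of this section; it is an illustration and is not taken from the cited references.

```latex
\[
\begin{aligned}
P_{\emptyset} f &= \mathbb{E} f, \\
P_{1} f &= \mathbb{E}_{\{1\}} f - \mathbb{E} f, \qquad
P_{2} f = \mathbb{E}_{\{2\}} f - \mathbb{E} f, \\
P_{1,2} f &= f - \mathbb{E}_{\{1\}} f - \mathbb{E}_{\{2\}} f + \mathbb{E} f .
\end{aligned}
\]
```

The four terms sum to $f$. Integrating $P_{1,2} f$ in $x_1$ gives $\mathbb{E}_{\{2\}} f - \mathbb{E} f - \mathbb{E}_{\{2\}} f + \mathbb{E} f = 0$, and similarly in $x_2$, so each piece has mean 0 with respect to every variable it depends on.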
One of the possible ways to prove the existence of the above decomposition in $L^2(\Omega^n)$ is as follows. First we define the subspace $U_{\leq m}$ spanned by functions depending on at most $m$ variables, for each $m \geq 0$. The sequence of subspaces $U_{\leq 0}, U_{\leq 1}, \ldots, U_{\leq n}$ is increasing, so by putting $U_m = U_{\leq m} \ominus U_{\leq m-1}$ we obtain a decomposition into an orthogonal direct sum of $U_m$. We will denote the orthogonal projection onto $U_m$ by $P_m$. A more explicit formula for $P_m$ can be obtained. For $A \subseteq [1, n]$, let $P_A = \bigotimes_{i \in A} (\mathrm{id} - \mathbb{E}) \otimes \bigotimes_{i \notin A} \mathbb{E}$, where $\mathrm{id}$ and $\mathbb{E}$ are understood to act on $L^2(\Omega)$, and let $U_A$ be the range of the projection $P_A$. It is easy to see that and, since the subspaces $U_B$ are mutually orthogonal, and consequently

Decoupling inequalities. We are going to present a special case of a theorem of J. Zinn [27], which will be one of the most important tools.
We will provide two new proofs of the above in Section 6. Below, we state two corollaries obtained by iterating Zinn's inequality.
for $a < b$ and 0 otherwise. Then, by Theorem 2.9 applied to the functions F b ℓ 2 , Analogously, by setting $y$ as fixed, and applying Theorem 2.9 with reversed order of variables (which we can do, because we are dealing with finite sums), where $y^{(1)}, \ldots, y^{(m)}$ are variables in $\Omega^{\mathbb{N}}$.
Proof. Let us fix $k \in \{1, \ldots, m\}$ and for each $j \in \mathbb{N}$ define a function $\phi_j$ on $\Omega^{[1,j]} \times \left(\Omega^{\mathbb{N}}\right)^{m-k}$ by the formula Then, for fixed $y^{(>k)} = \left(y^{(k+1)}, \ldots, y^{(m)}\right) \in \left(\Omega^{\mathbb{N}}\right)^{m-k}$, Here, $i_k$ plays the role of $j$ and (2.48) is an application of Theorem 2.9 to functions $|\phi_j|^2$. Integrating the resulting inequality with respect to $y^{(>k)}$, we get

Theorem 3.1 ([5], [16]). $P_m$ is bounded on $L^p(\Omega^{\mathbb{N}})$ for $1 < p < \infty$, with norm at most $c_p^m$.

We will present a proof that yields $\|P_1 : L^p\| < \infty$ and $c_p = e\,\|P_1 : L^p\|$.
Proof. Without loss of generality, we may assume that we are working in $L^p\left(\Omega^{[1,N]}, \mathcal{F}^{\otimes N}\right)$. Indeed, by (2.33) and (2.37), $P_m$ preserves $L^2\left(\Omega^{\mathbb{N}}, \mathcal{F}_{[1,N]}\right)$, which can be canonically identified with $L^2\left(\Omega^{[1,N]}, \mathcal{F}^{\otimes N}\right)$. Since the sequence $\left(L^2\left(\Omega^{\mathbb{N}}, \mathcal{F}_{[1,N]}\right) : N \in \mathbb{N}\right)$ is increasing and its union is dense in $L^p\left(\Omega^{\mathbb{N}}, \mathcal{F}^{\otimes\mathbb{N}}\right)$, all we need to prove is The $L^p$ boundedness of $P_1$ is essentially a known result [6], but we provide a proof for the sake of completeness. Let $(\mathcal{F}_k : k \in [0, N])$ be the natural filtration and $(\mathcal{F}^*_k)_{k=0}^N$ be the natural reversed filtration, i.e. $\mathcal{F}^*_k = \mathcal{F}_{[k,N]}$. By (2.32) and (2.37) we see that By mutual orthogonality of the $P_A$'s We will now proceed by induction. Suppose that (3.1) is satisfied with $m - 1$ in the place of $m$. Let $N = mn$ and define an operator $Q_m$ acting on $L^p\left(\Omega^{[1,N]}\right)$ by Utilising (2.37) we get

We provide a short proof of a fact taken from [6] that Theorem 3.1 cannot be extended to $p = 1$ or $\infty$, which motivates the next section.

Proposition 3.2.
If $\Omega$ is not a single atom, then $P_m$ for $m \geq 1$ is not bounded on $L^1(\Omega^\infty)$ or $L^\infty(\Omega^\infty)$.
Proof. It is enough to consider $L^1(\Omega^\infty)$, because the $P_m$'s are self-adjoint. Let $f \in L^2(\Omega)$ be such that $\mathbb{E} f = 1$, $f \geq 0$ and $\mu(\mathrm{supp}\, f) < 1$. Then $\mathbb{E}|f - 1|^2 > 0$. For $F_n = f^{\otimes n} \in L^2(\Omega^n)$ we have which is not dominated by $\|F_n\|_{L^1(\Omega^n)} = \|f\|^n_{L^1(\Omega)} = 1$. To prove the unboundedness of $P_m$ for $m > 1$, we simply notice that

The projection $P_m$ can be described even more explicitly in the case $\Omega = \mathbb{T}$. Indeed, if $n \in \mathbb{Z}^{\oplus\mathbb{N}}$ is supported on the set $A$, then In particular, $P_m$ preserves the space In order to adapt the proof of Theorem 3.1 to the $H^1(\mathbb{D}^{\mathbb{N}})$ case, we will need a replacement for the argument proving that $P_1$ is bounded. The role of the combination of Burkholder-Gundy and Doob inequalities will be played by the following theorem, which can be found in [3].
is the natural filtration on $\mathbb{T}^{\mathbb{N}}$. For later use, we note the Hilbert space valued extension.
is the natural filtration on $\mathbb{T}^{\mathbb{N}}$.

Proof. Theorem 4.1 gives a map (4.6) $T : H^1_{\mathrm{last}}(\mathbb{T}^{\mathbb{N}}) \to L^1(\mathbb{T}^{\mathbb{N}}, \ell^2)$, which is an isomorphism onto the subspace of $L^1(\mathbb{T}^{\mathbb{N}}, \ell^2)$ consisting of functions $f$ such that $f_k$ is a $k$-th martingale difference and is analytic in the $k$-th variable, defined by . Thus, applying Lemma 2.8 with $I_1$ being a singleton, $I_2 = \mathbb{N}$, $T$ as above (and then the same for $T^{-1}$) we get

The role of the Stein martingale inequality will be played by the following simple observation.

Corollary 4.3.
For any sequence $(f_n : n \in \mathbb{N})$ adapted to the natural filtration on $\Omega^{\mathbb{N}}$,

Proof. Let $\tilde{f}_n$ be a sequence of functions on $\Omega^{\mathbb{N}} \times \Omega^{\mathbb{N}}$ defined by (4.10) $\tilde{f}_n(x, y) = f_n(x_1, \ldots, x_{n-1}, y_n)$.
Applying Theorem 2.9 and conditional expectation with respect to the second of two sets of variables, By conditioning with respect to the first set of variables, we obtain the inequality due to Lepingle [19]. .

Proof. We proceed as in the proof of Theorem 3.1. First, we reduce the problem to the $\Omega^{[1,N]}$ realm. Then we notice that which by Corollary 4.3 yields

Proof. The case $m = 0$ is trivial, $m = 1$ follows directly from Theorem 4.1 and Theorem 4.4. The induction step is identical to the proof of Theorem 3.1, up to changing $L^p(\Omega^I)$ to $H^1(\mathbb{D}^I)$. Alternatively, we can prove the same in a single step. Set (4.22) $Q_{m,n} = \frac{1}{\binom{nm}{n,\ldots,n}} \sum_{\substack{A_1 \cup \ldots \cup A_m = [1,nm] \\ A_i\text{'s disjoint} \\ |A_1| = \ldots = |A_m| = n}}$ It is easily seen that for each set $B$ of cardinality $m$, $P_B$ appears $m! \binom{(n-1)m}{n-1,\ldots,n-1}$ times in the sum. Therefore we get (4.28)

It has to be noted that our proofs of Theorems 3.1 and 4.5 extend naturally to a vector-valued case, respectively UMD and AUMD valued. Indeed, Bourgain's proof of Theorem 2.4, as presented in [22], extends to the UMD valued version, while Theorem 4.1 is just the statement that a one-dimensional space has the AUMD property. In both cases, the induction follows without change. There is also a second direction in which we can generalize. Namely, by looking carefully at the proof of Theorem 4.1, one can see that the only place in which analyticity plays a role is the $H^1 = H^2 \cdot H^2$ theorem, which is true for $H^1$ on any compact and connected group with ordered dual [24], which means that we can replace $\mathbb{T}$ with any such group.
Given that Kwapień's constant $c_p$ in Theorem 3.1 has the best known asymptotics as a function of $p$ for $m = 1$, one can ask about the dependence of $\|P_m : L^p(\Omega^\infty)\|$ and P m : where $L^p_0$ and $H^1_0$ stand for functions of mean 0, are true. Also, (4.31)

Proof. Let $f \in L^p(\Omega^n)$ be of mean 0. Then $f^{\otimes m} \in L^p(\Omega^{mn})$ and (4.32) $\left(P_m : L^p(\Omega^{mn})\right) f^{\otimes m} = \left(\left(P_1 : L^p(\Omega^n)\right)(f)\right)^{\otimes m}$.
Indeed, we have $f = \sum_{|A| \geq 1} P_A f$ because of $\mathbb{E} f = 0$, hence The only way to get a summand in $U_m$ is to have $|A_i| = 1$ for all $i$ and the sum of such summands is the right hand side of (4.32). Taking an $f$ which is close to attaining the norm of $P_1$ on a respective space proves (4.29) and (4.30).
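The counting argument above can be checked directly for $m = 2$. The following sketch assumes $\mathbb{E} f = 0$ and writes $A_1 \sqcup A_2$ for the subset of $[1, 2n]$ obtained by placing $A_2$ in the second block of $n$ variables; this notation is introduced only for the illustration.

```latex
\[
f \otimes f = \sum_{|A_1| \geq 1}\ \sum_{|A_2| \geq 1} P_{A_1} f \otimes P_{A_2} f,
\qquad
P_{A_1} f \otimes P_{A_2} f \in U_{A_1 \sqcup A_2},
\quad |A_1 \sqcup A_2| = |A_1| + |A_2| \geq 2 .
\]
\[
P_2 (f \otimes f)
= \sum_{|A_1| = |A_2| = 1} P_{A_1} f \otimes P_{A_2} f
= \Bigl(\sum_{|A_1| = 1} P_{A_1} f\Bigr) \otimes \Bigl(\sum_{|A_2| = 1} P_{A_2} f\Bigr)
= P_1 f \otimes P_1 f .
\]
```

Only the pairs with $|A_1| = |A_2| = 1$ land in $U_2$, since every other pair has $|A_1| + |A_2| > 2$.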
In order to see (4.31), assume for the sake of contradiction that $P_1$ is a contraction on $H^1_0(\mathbb{D}^2)$. We will test it on functions of the form $F(z) + w + azw$, where $F \in H^1_0(\mathbb{D})$ and $a$ is a scalar. It is easy to see that (4.34) $\mathbb{E}\,|\alpha + \beta w| = \mathbb{E}\,\bigl||\alpha| + |\beta| w\bigr|$ for $\alpha, \beta \in \mathbb{C}$. Hence, from the inequality Since any nonnegative function can be approximated by the modulus of an $H^1_0(\mathbb{D})$ function, (4.35) is true for any nonnegative $F$. In particular, the left hand side attains a local minimum at $a = 0$, so by $|u + v| = |u| + \mathrm{Re}\,\frac{\bar{u} v}{|u|} + o(v)$ we infer that for $r \geq 0$. This is a continuous function, whose values lie on some curve $\gamma$ connecting 0 and 1 (because $\varphi(0) = 1$ and $\lim_{r\to\infty} \varphi(r) = 0$). The condition (4.37) can be rewritten as (4.39) $\mathrm{Re}\,\mathbb{E}\, z \varphi(F(z)) = 0$.
Since $F$ was allowed to be any positive function, $\varphi(F)$ can be any function with values in $\gamma$, making (4.39) obviously false.
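The identity (4.34) used in this proof follows from rotation invariance. A short sketch, assuming that $w$ is uniformly distributed on the unit circle and writing $\alpha = |\alpha| e^{ia}$, $\beta = |\beta| e^{ib}$ with real $a, b$ (names chosen here for illustration):

```latex
\[
\mathbb{E}\,|\alpha + \beta w|
= \mathbb{E}\,\bigl| e^{ia} \bigl( |\alpha| + |\beta| e^{i(b-a)} w \bigr) \bigr|
= \mathbb{E}\,\bigl| |\alpha| + |\beta| e^{i(b-a)} w \bigr|
= \mathbb{E}\,\bigl| |\alpha| + |\beta| w \bigr| .
\]
```

The last equality holds because $e^{i(b-a)} w$ has the same distribution as $w$.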

Martingale Hardy spaces
5.1. Double indexed martingales. Above we noticed that the boundedness of $P_1$ on $H^1(\mathbb{D}^{\mathbb{N}})$ follows from the boundedness of $P_1$ on a bigger space $H^1[(\mathcal{F}_n)]$. It is tempting to find an abstract martingale inequality responsible for the boundedness of $P_m$ on $H^1(\mathbb{D}^{\mathbb{N}})$. We can do this for $m = 2$. By the natural double-indexed filtration on $\Omega^{\mathbb{N}}$ we will mean the family $\left(\mathcal{F}_{[a,b]} : a \leq b\right)$ (note that the inclusion order in the first index is reversed). Let $\Delta_n = \mathbb{E}_n - \mathbb{E}_{n-1}$ be the martingale differences with respect to $(\mathcal{F}_n)$ and $\Delta^*_n = \mathbb{E}^*_n - \mathbb{E}^*_{n+1}$ be the martingale differences with respect to $(\mathcal{F}^*_n)$, where $\mathcal{F}^*_n = \mathcal{F}_{[n,\infty)}$. We define the martingale differences with respect to $\mathcal{F}_{[a,b]}$ by and an $H^1$ norm for this filtration by The definition of double martingale differences coincides with what is considered in [26].
Corollary 5.1. For $f \in H^1(\mathbb{D}^{\mathbb{N}})$, there is an equivalence of norms where $\left(\mathcal{F}_{[a,b]}\right)_{a \leq b}$ is the natural double-indexed filtration on $\mathbb{T}^{\mathbb{N}}$.
Proof. For any $\pm 1$-valued sequence $(\varepsilon_n : n \in \mathbb{N})$, we define operators $S_\varepsilon$ and $S^*_\varepsilon$ by By Theorem 4.1, $S_\varepsilon$ is an isomorphism from $H^1(\mathbb{D}^{\mathbb{N}})$ to itself, uniformly in $\varepsilon$. By reversing the order of variables, the same can be said about $S^*_\varepsilon$. Thus for any $\varepsilon, \varepsilon'$, By averaging the last quantity over all choices of $\varepsilon, \varepsilon'$ and applying the Khintchine-Kahane inequality twice, we get the desired inequalities.
Theorem 5.2. $P_2$ is bounded on $H^1\left[\left(\mathcal{F}_{[a,b]}\right)_{a \leq b}\right]$, for any $\Omega$.
Proof. As usual, we reduce the problem to the $\Omega^{[1,N]}$ version. By (3.2), for $a < b$. We can assume that $P_0 f = P_1 f = 0$ (i.e. $\mathbb{E} f = 0$ and $\Delta_{[a,a]} f = 0$ for all $a$), because $U_{\leq 1}$, being the image of $\mathbb{E} + \sum_a \Delta_{[a,a]}$, is trivially complemented in the underlying norm and $P_2$ is 0 on $U_{\leq 1}$. By applying Corollary 2.10, as desired.

5.2. Multiple indexed martingales. We will make an attempt at generalizing the above for multiple indexed martingales. Suppose there is a family $(T_i, \partial T_i)_{i \in I}$ of pairs of finite subsets of some set $X$ (finite or not) indexed by some set $I$, such that $\partial T_i \subseteq T_i$ ($\partial T_i$ is not a boundary in a topological sense; we use this notation for resemblance with the case where $T_i$ are intervals and $\partial T_i$ are their endpoints). We would like to define operators $\Delta_i$ on $L^2(\Omega^X)$ by the formula where $T'_i$ stands for the complement of $T_i$ in $X$. This is supposed to mimic the standard martingale differences when $X = \mathbb{N}$, is guaranteed by (5.19) for any $A \subset X$, there exists unique $i \in I$ such that $\partial T_i \subseteq A \subseteq T_i$. Indeed, and each $P_A$ appears in the above sum exactly once if and only if the condition (5.19) is satisfied. For a family $(T_i, \partial T_i)_{i \in I}$ we may define a norm by the formula and ask the following:
• Is it true that for f ∈ f H 1 (D N ) ?
• If yes, is there any interesting example of a set $\mathbb{N}^{\oplus\infty} \subset \Gamma \subset \mathbb{Z}^{\oplus\infty}$ such that (5.26) is true for $f \in L^1(\mathbb{T}^\infty)$ with $\mathrm{supp}\, \hat{f} \subset \Gamma$?
• For which, if any, $m$ is $P_m$ bounded on $H^1\left[(T_i, \partial T_i)_{i \in I}\right]$?
We are able to answer them in the case when For a finite set $B \subset \mathbb{N}$, the unique $A \in I$ such that $\partial T_A \subseteq B \subseteq T_A$, which we will denote by $\partial B$, is where $\mathbb{T}$ is used as $\Omega$. Moreover, for $m' \in \mathbb{N}$ and nontrivial $\Omega$, the following are equivalent.
A , ∂ (m) B to indicate the value of $m$ we are currently using. For brevity we will denote T (m) . For $m' = m$, we notice that and the desired inequality follows from i1 , . . . , y (m) im The implication (ii) $\Longrightarrow$ (iii) follows from (5.30), which we will prove by induction with respect to $m$. For $m = 1$ this is just Theorem 4.1. Suppose it is true for some $m$ and let $f \in H^1_{m+1\,\mathrm{last}}(\mathbb{T}^{\mathbb{N}})$. In particular, $f \in H^1_{m\,\mathrm{last}}(\mathbb{T}^{\mathbb{N}})$. By (5.30), which is now the induction hypothesis, and (5.31), i1 , . . . , y (m) im Integrating the resulting equivalence with respect to $y^{(1,\ldots,m)}$ and plugging into (5.38), we verify that $\|f\|_{L^1(\mathbb{T}^{\mathbb{N}})} \simeq \|f\|_{H^1[T^{m+1}]}$, which finishes the proof of (5.30).
In order to see that (iii) $\Longrightarrow$ (i), let us take $m' > m$. For any $g \in L^1(\mathbb{T}^n)$, the function $G \in L^1(\mathbb{T}^{\mathbb{N}})$ defined by It is worth noting that by repeating the above proof of the equivalence between H

Appendix
We present two proofs of Theorem 2.9 different from the original one by Zinn. Let us recall the non-linear telescoping lemma due to Bourgain and Müller.
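Proof of Theorem 2.9. In order to prove the inequality in (2.38), we merely perform a slight modification of the proof of the Lepingle inequality presented in [3]. Let us denote $\Omega^{\mathbb{N}} \times \Omega^{\mathbb{N}} \ni (x, y) \mapsto f_n(x_1, \ldots, x_{n-1}, y_n)$ by $\tilde{f}_n$. By tensoring $(f_n)$ against the Rademacher sequence, we may assume that it is a martingale difference sequence. Then the left hand side equals $\|F\|_{H^1}$, where $F = \sum_{n=1}^\infty f_n$ and $f_n = \Delta_n F$. By Theorem 2.6 it is enough to check the boundedness of the right hand side in the case when $F$ is an atom, because we have an a priori bound for finite sums. Let $F = u - \mathbb{E}_{j-1} u$, where $u$ satisfies (2.17). Then By $A \in \mathcal{F}_j$, the support of $\mathbb{E}_k u$ for $k \geq j$ is contained in $A$ as well, because
$$\begin{aligned} \mathbb{E}\left(|\mathbb{E}_k u| \cdot \mathbf{1}_{\Omega^\infty \setminus A}\right) &\leq \mathbb{E}\left(\mathbb{E}_k |u| \cdot \mathbf{1}_{\Omega^\infty \setminus A}\right) && (6.9) \\ &= \mathbb{E}\left(|u| \cdot \mathbb{E}_k \mathbf{1}_{\Omega^\infty \setminus A}\right) && (6.10) \\ &= 0. && (6.11) \end{aligned}$$
Thus for $k > j$ we have $\mathrm{supp}\, f_k \subset \mathrm{supp}\, \mathbb{E}_k u \cup \mathrm{supp}\, \mathbb{E}_{k-1} u \subset A$. Consequently (6.12) $\mathrm{supp}\, \tilde{f}_k \subset A \times \Omega^\infty$ for $k > j$,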