UMD-Extensions of Calderón–Zygmund Operators with Mild Kernel Regularity

Armed with new methods, we revisit a result of Figiel concerning the UMD-extensions of linear Calderón–Zygmund operators with mild kernel regularity. We then extend our new proof to the multilinear setting, improving recent UMD-valued estimates of multilinear singular integrals.


Introduction
The UMD (unconditional martingale differences) property of a Banach space $X$ is a well-known necessary and sufficient condition for the boundedness of various singular integral operators (SIOs) $Tf(x) = \int_{\mathbb{R}^d} K(x, y) f(y)\,dy$ on $L^p(\mathbb{R}^d; X) = L^p(X)$. A Banach space $X$ has the UMD property if $X$-valued martingale difference sequences converge unconditionally in $L^p$ for some $p \in (1, \infty)$.
By Burkholder [2] and Bourgain [1], $X$ is a UMD space if and only if a particular singular integral operator, the Hilbert transform $Hf(x) = \mathrm{p.v.}\int_{\mathbb{R}} \frac{f(y)}{x-y}\,dy$, admits an $L^p(X)$-bounded extension. This theory quickly advanced up to the vector-valued T1 theorem of Figiel [12]. It is important to understand that the fundamental result that all scalar-valued $L^2$ bounded SIOs can be extended to act boundedly on $L^p(X)$, $p \in (1, \infty)$, goes through this key theorem.
The deep work of Figiel also contains estimates for the required kernel regularity in terms of some key characteristics of the UMD space $X$. This concerns the required regularity of the continuity moduli $\omega$ appearing in the various kernel estimates, such as
$$|K(x, y) - K(x', y)| \le \omega\Big(\frac{|x - x'|}{|x - y|}\Big) \frac{1}{|x - y|^d}, \qquad |x - x'| \le |x - y|/2.$$
Recently, Grau de la Herrán and Hytönen [16] proved that the modified Dini condition
$$\|\omega\|_{\mathrm{Dini}_\alpha} := \int_0^1 \omega(t) \Big(1 + \log\frac{1}{t}\Big)^\alpha \frac{dt}{t}$$
with $\alpha = 1/2$ is sufficient to prove a scalar-valued T1 theorem even with an underlying measure $\mu$ that can be non-doubling. This matches the best known [6,12] sufficient condition for the classical homogeneous T1 theorem [5]. The exponent $\alpha = 1/2$ has a fundamental feeling in all of the existing arguments; it seems very difficult to achieve a T1 theorem with a weaker assumption.
It turns out that for completely general UMD spaces the threshold $\alpha = 1/2$ needs to be replaced by a more complicated expression, while in the simpler function lattice case $\alpha = 1/2$ suffices by a much more elementary argument. We revisit these fundamental linear results of Figiel using new optimized dyadic methods and obtain a modern proof of the following theorem. In our terminology, a Calderón-Zygmund operator (CZO) is an SIO that satisfies the T1 assumptions (equivalently, $T : L^2 \to L^2$ boundedly).

Theorem 1.1 Let $T$ be a linear $\omega$-CZO and $X$ be a UMD space with type $r \in (1, 2]$ and cotype $q \in [2, \infty)$. If $\omega \in \mathrm{Dini}_{1/\min(r, q')}$, where $q' = q/(q-1)$ is the dual exponent of $q$, we have $\|Tf\|_{L^p(X)} \lesssim \|f\|_{L^p(X)}$, $p \in (1, \infty)$.
See the main text for the exact definitions. If $X$ is a Hilbert space, then $r = q = 2$ and we get the usual $\alpha = 1/2$. Again, this theory is relevant for UMD spaces that go beyond the function lattices, such as non-commutative $L^p$ spaces.
A major part of our arguments has to do with the extension of this result to the multilinear setting. A basic model of an n-linear SIO $T$ in $\mathbb{R}^d$ is obtained by setting
$$T(f_1, \dots, f_n)(x) = U(f_1 \otimes \cdots \otimes f_n)(x, \dots, x), \qquad x \in \mathbb{R}^d,$$
where $U$ is a linear singular integral operator in $\mathbb{R}^{nd}$. See e.g. Grafakos-Torres [15] for the basic theory. Multilinear SIOs appear in applications ranging from partial differential equations to complex function theory and ergodic theory. For example, $L^p$ estimates for the homogeneous fractional derivative $D^\alpha f = \mathcal{F}^{-1}(|\xi|^\alpha \hat f(\xi))$ of a product of two or more functions (the fractional Leibniz rules) are used in the area of dispersive equations. Such estimates descend from the multilinear Hörmander-Mihlin multiplier theorem of Coifman and Meyer [3]; see e.g. Kato and Ponce [23] and Grafakos and Oh [14].
The multilinear analogue of Theorem 1.1 extending the recent work [9] goes as follows.
Then for all exponents $1 < p_1, \dots, p_n \le \infty$ and $1/r = \sum_{j=1}^n 1/p_j > 0$ we have

Until recently, vector-valued extensions of multilinear SIOs had mostly been studied in the framework of $\ell^p$ spaces and function lattices, rather than general UMD spaces; see e.g. [4,13,24,25,28]. Taking the work [7] much further, the paper [9] finally established $L^p$ bounds for the extensions of n-linear SIOs with the usual Hölder modulus of continuity to tuples of UMD spaces tied by a natural product structure, such as the composition of operators in the Schatten-von Neumann subclass of the algebra of bounded operators on a Hilbert space. In [10] the bilinear case of [9] was applied to prove UMD-extensions for modulation invariant singular integrals, such as the bilinear Hilbert transform. See also [8] for the related operator-valued theory. With new and refined methods, we are able to prove the above Figiel type result, Theorem 1.2, in the multilinear setting.

The proofs of T1 theorems display a fundamental structural decomposition of SIOs into their cancellative parts and so-called paraproducts. It is this structure that is extremely important for obtaining further estimates beyond the initial scalar-valued $L^p$ boundedness. The original dyadic representation theorem of Hytönen [18,19] (extending an earlier special case of Petermichl [29]) provides a decomposition of the cancellative part of an SIO into so-called dyadic shifts. In [16] a new type of representation theorem appears. The key difference from the original representation theorems [18,19] is that the cancellative part is decomposed in terms of different operators, which package multiple dyadic shifts into one and offer more efficient bounds when it comes to kernel regularity. Aiming to prove Theorem 1.2, we develop these tools in the multilinear setting and present a useful, explicit and clear exposition of the new dyadic model operators that appear.

Notation
Throughout this paper $A \lesssim B$ means that $A \le CB$ with some constant $C$ that we deem unimportant to track at that point.
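Given a dyadic grid $\mathcal{D}$, $I \in \mathcal{D}$ and $k \in \mathbb{Z}$, $k \ge 0$, we use the following notation:
(1) $\ell(I)$ is the side length of $I$.
(2) $I^{(k)} \in \mathcal{D}$ is the $k$th parent of $I$, i.e., $I \subset I^{(k)}$ and $\ell(I^{(k)}) = 2^k \ell(I)$.
(3) $\mathrm{ch}(I)$ is the collection of the children of $I$, i.e., $\mathrm{ch}(I) = \{J \in \mathcal{D} : J^{(1)} = I\}$.
(4) $E_I f = \langle f \rangle_I 1_I$ is the averaging operator, where $\langle f \rangle_I = \frac{1}{|I|}\int_I f$.
(5) $\Delta_I f$ is the martingale difference $\Delta_I f = \sum_{J \in \mathrm{ch}(I)} E_J f - E_I f$.

For an interval $J \subset \mathbb{R}$ we denote by $J_l$ and $J_r$ the left and right halves of $J$, respectively. We define $h_J^0 = |J|^{-1/2} 1_J$ and $h_J^1 = |J|^{-1/2}(1_{J_l} - 1_{J_r})$. Let now $I = I_1 \times \cdots \times I_d \subset \mathbb{R}^d$ be a cube, and define the Haar function $h_I^\eta$, $\eta = (\eta_1, \dots, \eta_d) \in \{0, 1\}^d$, by setting $h_I^\eta = h_{I_1}^{\eta_1} \otimes \cdots \otimes h_{I_d}^{\eta_d}$. If $\eta \ne 0$ the Haar function is cancellative: $\int h_I^\eta = 0$. We exploit notation by suppressing the presence of $\eta$, and write $h_I$ for some $h_I^\eta$, $\eta \ne 0$. Notice that for $I \in \mathcal{D}$ we have $\Delta_I f = \langle f, h_I \rangle h_I$ (where the finite $\eta$ summation is suppressed), $\langle f, h_I \rangle := \int f h_I$.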
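The Haar identities can be checked numerically in dimension $d = 1$. The following is only an illustrative sketch under stated assumptions: functions on $[0, 1)$ are sampled on a uniform grid, the dyadic interval $I = [j 2^{-m}, (j+1) 2^{-m})$ is encoded by the pair `(m, j)`, and all helper names (`indicator`, `haar`, `average_op`, `martingale_difference`) as well as the resolution `N` are hypothetical choices made for this example.

```python
N = 6                 # sampled functions are constant on cells of length 2**-N
CELLS = 2 ** N
DX = 1.0 / CELLS


def indicator(m, j):
    """Samples of the indicator of I = [j 2^-m, (j+1) 2^-m), for 0 <= m <= N."""
    width = CELLS >> m            # number of grid cells inside I
    return [1.0 if j * width <= c < (j + 1) * width else 0.0
            for c in range(CELLS)]


def haar(m, j):
    """Cancellative Haar function h_I = |I|^(-1/2) (1_{I_l} - 1_{I_r}), m < N."""
    left = indicator(m + 1, 2 * j)
    right = indicator(m + 1, 2 * j + 1)
    scale = 2.0 ** (m / 2)        # |I|^(-1/2) with |I| = 2^-m
    return [scale * (a - b) for a, b in zip(left, right)]


def integral(f):
    """Integral over [0, 1) of a sampled step function."""
    return sum(f) * DX


def pairing(f, g):
    """The pairing <f, g> = int f g."""
    return sum(a * b for a, b in zip(f, g)) * DX


def average_op(f, m, j):
    """E_I f = <f>_I 1_I."""
    ind = indicator(m, j)
    avg = pairing(f, ind) * 2.0 ** m   # divide by |I| = 2^-m
    return [avg * v for v in ind]


def martingale_difference(f, m, j):
    """Delta_I f = E_{I_l} f + E_{I_r} f - E_I f."""
    e_left = average_op(f, m + 1, 2 * j)
    e_right = average_op(f, m + 1, 2 * j + 1)
    e_parent = average_op(f, m, j)
    return [a + b - c for a, b, c in zip(e_left, e_right, e_parent)]
```

For a sampled function `f` and `h = haar(m, j)`, comparing `martingale_difference(f, m, j)` with `pairing(f, h)` times `h` entry by entry illustrates the identity $\Delta_I f = \langle f, h_I \rangle h_I$ in one dimension, where no $\eta$ summation is needed.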

Singular Integrals
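Let $\omega$ be a modulus of continuity: an increasing and subadditive function with $\omega(0) = 0$. A relevant quantity is the modified Dini condition
$$\|\omega\|_{\mathrm{Dini}_\alpha} := \int_0^1 \omega(t) \Big(1 + \log\frac{1}{t}\Big)^\alpha \frac{dt}{t}, \qquad \alpha \ge 0. \tag{2.1}$$
In practice, the quantity (2.1) arises as follows:
$$\sum_{k=1}^\infty \omega(2^{-k}) k^\alpha = \sum_{k=1}^\infty \frac{1}{\log 2} \int_{2^{-k}}^{2^{-k+1}} \omega(2^{-k}) k^\alpha \frac{dt}{t} \lesssim \int_0^1 \omega(t) \Big(1 + \log\frac{1}{t}\Big)^\alpha \frac{dt}{t}. \tag{2.2}$$
For many standard arguments $\alpha = 0$ is enough. For the T1 type arguments we will, at the minimum, always need $\alpha = 1/2$. When we do UMD-extensions beyond function lattices, we will need a somewhat higher $\alpha$ depending on the so-called type and cotype constants of the underlying UMD space $X$.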
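The way the Dini quantity controls sums over dyadic scales can be illustrated numerically. The sketch below is not part of the argument: it approximates $\int_0^1 \omega(t)(1 + \log\frac{1}{t})^\alpha \frac{dt}{t}$ after the substitution $t = e^{-s}$ and compares it with the dyadic sum $\sum_{k \ge 1} \omega(2^{-k}) k^\alpha$. The function names, the truncation `s_max`, the number of quadrature steps and the test moduli $\omega(t) = t^\gamma$ are assumptions made for illustration. The constant $(\log 2)^{-(1+\alpha)}$ used in the comparison comes from $\int_{2^{-k}}^{2^{-k+1}} \frac{dt}{t} = \log 2$ and $k \log 2 \le 1 + \log\frac{1}{t}$ on $[2^{-k}, 2^{-k+1}]$.

```python
import math


def dini_norm(omega, alpha, s_max=60.0, steps=60_000):
    """Approximate int_0^1 omega(t) (1 + log(1/t))^alpha dt/t.

    With t = exp(-s) this equals int_0^inf omega(exp(-s)) (1 + s)^alpha ds;
    the integral is truncated at s_max and computed with the midpoint rule.
    """
    h = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += omega(math.exp(-s)) * (1.0 + s) ** alpha
    return total * h


def dyadic_sum(omega, alpha, terms=200):
    """Truncated dyadic sum: sum over k = 1..terms of omega(2^-k) k^alpha."""
    return sum(omega(2.0 ** -k) * k ** alpha for k in range(1, terms + 1))


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        lhs = dyadic_sum(math.sqrt, alpha)
        rhs = math.log(2) ** -(1 + alpha) * dini_norm(math.sqrt, alpha)
        print(f"alpha={alpha}: dyadic sum {lhs:.4f} <= bound {rhs:.4f}")
```

For power moduli $\omega(t) = t^\gamma$ the quantity is finite for every $\alpha \ge 0$, so the distinction between different values of $\alpha$ only matters for moduli with logarithmic decay.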

A function
is called an n-linear ω-Calderón-Zygmund kernel if it holds that and for all j ∈ {1, . . . , n + 1} it holds that

Definition 2.3
An n-linear operator $T$ defined on a suitable class of functions (e.g. on the linear combinations of cubes) is an n-linear $\omega$-SIO with an associated kernel $K$, if we have

Definition 2.4
We say that $T$ is an n-linear $\omega$-CZO if the following conditions hold:

Model Operators
Let $i = (i_1, \dots, i_{n+1})$, $i_j \in \{0, 1, \dots\}$, and let $\mathcal{D}$ be a dyadic lattice in $\mathbb{R}^d$. An operator $S_i$ is called an n-linear dyadic shift if it has the form where Here $a_{K,(I_j)} = a_{K,I_1,\dots,I_{n+1}}$ is a scalar satisfying the normalization and there exist two indices $j_0, j_1 \in \{1, \dots, n+1\}$, $j_0 \ne j_1$, so that $\tilde h_{I_{j_0}} = h_{I_{j_0}}$, $\tilde h_{I_{j_1}} = h_{I_{j_1}}$ and for the remaining indices $j \notin \{j_0, j_1\}$ we have $\tilde h_{I_j} \in \{h^0_{I_j}, h_{I_j}\}$.

(2.6) or $B_K$ has one of the other symmetric forms, where the role of $f_{n+1}$ is replaced by some other $f_j$. The coefficients satisfy the same (but now $|I_1| = \dots = |I_{n+1}|$) normalization

An n-linear dyadic paraproduct $\pi = \pi_{\mathcal{D}}$ also has $n+1$ possible forms, but there is no complexity associated to them. One of the forms is where the coefficients satisfy the usual BMO condition In the remaining $n$ alternative forms the cancellative Haar function $h_I$ is in a different position.
When we represent a CZO we will have modified dyadic shifts Q k , standard dyadic shifts of the very special form S k,...,k and paraproducts π . Dyadic shifts S k,...,k are simply easier versions of the operators Q k . Paraproducts do not involve a complexity parameter and are thus inherently not even relevant for the kernel regularity considerations (we just need their boundedness).

Remark 2.8
At least in the linear situation, we can easily unify the study of shifts $S_{k,\dots,k}$ and modified shifts $Q_k$. This viewpoint could work in the multilinear generality also (with some tensor product formalism), but we did not pursue it. We can understand a modified linear shift to have the more general form $Q_k$, $k = 0, 1, \dots$, where (2) $|H_{I,J}| \le |I|^{-1/2}$ and (3) $\int H_{I,J} = 0$.

The Threshold α = 1/2
We quickly explain the role of the regularity threshold α = 1/2, which appears naturally in the fundamental scalar-valued theory.

Lemma 2.11
Let p ∈ (1, ∞). There holds that Proof If f i ∈ L p then a basic vector-valued square function estimate says that (2.12) Let K ∈ D. We have that Thus, (2.12) gives that

Proposition 2.13
Suppose that $Q_k$ is an n-linear modified shift. Let $1 < p_j < \infty$ with $\sum_{j=1}^{n+1} 1/p_j = 1$. Then we have (2.14)

Proof We may assume that $Q_k$ has the form (2.6). Notice that if $I^{(k)} = K$ then we have Using this we have for $I_1^{(k)} = \cdots = I_{n+1}^{(k)} = K$ We see that the last terms of these expansions cancel out in the difference It remains to estimate the other terms one by one. We pick the concrete (but completely representative) term It remains to use Hölder's inequality, maximal function and square function estimates and Lemma 2.11. We remark that the estimate for $f_{n+1}$ is, indeed, just the usual square function estimate, since We now pick the corresponding term

Notice that
We are thus left with This is the same upper bound as in the first case, and is thus handled in the same way.

Remark 2.17
Proposition 2.13 considers only the Banach range boundedness of $Q_k$. We can, in any case, upgrade the boundedness to the full range with standard methods when we consider CZOs.
. . , f n ), and thus by (2.2) and Proposition 2.13 we will always need Dini 1/2 . The above proof readily generalises to so-called UMD function lattices. In Sect. 3 we tackle the much deeper case of general UMD spaces.

Modified Shifts are Sums of Standard Shifts
The standard linear shifts satisfy the complexity free bound Similar estimates hold in the multilinear generality-for example, the following complexity free bilinear estimate is true We need roughly nk shifts to represent an n-linear modified shift Q k as a sum of standard shifts, see Lemma 2.18 below. Therefore, these estimates lose to estimates like (2.14), and would lead to Dini 1 .
The following Lemma 2.18 is of philosophical importance. It should not be resorted to when more efficient estimates can be obtained by the direct study of the operators Q k . If one can accept some loss of regularity, then it can be practical. We present it for completeness and for the big picture.
Then for some C 1 we have where in the first sum there are m − 1 zeroes and in the second sum m zeroes in the complexity of the shift.

Remark 2.19
In the proof below we decompose various martingale differences using Haar functions, which strictly speaking leads to the fact that there is an implicit dimensional summation in the above decomposition.

Proof of Lemma 2.18
The underlying decomposition is, in part, more sophisticated than the one in the beginning of the proof of Proposition 2.13. See the multilinear collapse (2.22) and (2.24). This feels necessary for this result-moreover, we will later use this decomposition strategy when we do general UMD-valued estimates. Write We then write We start working with the first term. Notice that Using this we can write (2.20) Consider now, for m ∈ {1, . . . , n}, the following part of the modified shift (2.21) Next, write Notice the normalization estimate We also have two cancellative Haar functions, so for every i ∈ {0, . . . , k −1}, the inner sum in A m is a standard n-linear shift of complexity (0, . . . , 0, i, k, . . . , k), where the i is in the mth slot: We now turn to the part of the modified shift associated with (2.23) Consider now, for m ∈ {1, . . . , n}, the following part of the modified shift Notice the normalization estimate Therefore, for some constant C 1 we get that where there are m zeroes in S 0,...,0,1,...,1,i .

The Optimized Representation Theorem
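Let $\mathcal{D}_0$ denote the standard dyadic grid in $\mathbb{R}^d$ and let $\sigma = (\sigma_k)_{k \in \mathbb{Z}}$, $\sigma_k \in \{0, 1\}^d$. We define the new dyadic grid
$$\mathcal{D}_\sigma = \Big\{ I + \sum_{k:\, 2^{-k} < \ell(I)} \sigma_k 2^{-k} : I \in \mathcal{D}_0 \Big\} = \{ I + \sigma : I \in \mathcal{D}_0 \},$$
where we simply have defined
$$I + \sigma := I + \sum_{k:\, 2^{-k} < \ell(I)} \sigma_k 2^{-k}.$$
It is straightforward that $\mathcal{D}_\sigma$ inherits the key nestedness property of $\mathcal{D}_0$: if $I, J \in \mathcal{D}_\sigma$, then $I \cap J \in \{I, J, \emptyset\}$. The natural product probability measure on $(\{0, 1\}^d)^{\mathbb{Z}}$ gives us the notion of random dyadic grids $\sigma \mapsto \mathcal{D}_\sigma$ over which we take the expectation $E_\sigma$ below.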
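The nestedness of a shifted grid can be verified on a small example. The following sketch makes several simplifying assumptions that are not in the text: it works in dimension one, it uses the usual convention $I + \sigma = I + \sum_{k:\, 2^{-k} < \ell(I)} \sigma_k 2^{-k}$ for the shift, it only considers intervals of length $2^{-m}$ with $m \ge 0$, and it truncates $\sigma$ to finitely many scales (the dictionary `sigma`). The function names are hypothetical.

```python
from fractions import Fraction
import random


def shift(level, sigma):
    """Sum of sigma_k 2^-k over the stored scales k with 2^-k < 2^-level."""
    return sum(Fraction(bit, 2 ** k) for k, bit in sigma.items() if k > level)


def shifted_interval(level, j, sigma):
    """The interval [j 2^-level, (j+1) 2^-level) translated by shift(level, sigma).

    Intervals are half-open and stored as (left endpoint, right endpoint).
    """
    length = Fraction(1, 2 ** level)
    left = j * length + shift(level, sigma)
    return (left, left + length)


def nested_or_disjoint(a, b):
    """True if the intersection of a and b is a, b, or empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    if lo >= hi:
        return True
    return (lo, hi) == a or (lo, hi) == b


if __name__ == "__main__":
    rng = random.Random(0)
    sigma = {k: rng.randint(0, 1) for k in range(1, 9)}   # truncated sigma
    grid = [shifted_interval(m, j, sigma)
            for m in range(0, 5) for j in range(-2, 2 ** m + 2)]
    print(all(nested_or_disjoint(a, b) for a in grid for b in grid))
```

Exact rational arithmetic is used so that the endpoint comparisons are not affected by rounding. The nestedness works because the shift of a parent differs from the shift of its children by either zero or exactly one child length.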

Remark 2.25
The assumption ω ∈ Dini 1/2 in the theorem below is only needed to have a converging series. The regularity is not explicitly used in the proof of the representation. It is required due to the estimates of the model operators briefly discussed above.

Theorem 2.26 Suppose that T is an n-linear ω-CZO,
where ω ∈ Dini 1/2 . Then we have where V k,u,σ is always either a standard n-linear shift S k,...,k , a modified n-linear shift Q k or an n-linear paraproduct (this requires k = 0) in the grid D σ . Moreover, we have Proof We begin with the decomposition We deal with the remainder term R σ later, and now focus on dealing with one of the main terms The main terms are symmetric, and we choose to handle σ := n+1,σ . After collapsing the sums T (E I 1 f 1 , . . . , E I n f n ), I n+1 f n+1 .
Further, we write We define the abbreviation ϕ I 1 ,...,I n+1 := T h 0 If we now sum over I 1 , . . . , I n+1 we may express σ in the form ϕ I 1 ,...,I n+1 where we recognize that the second term 2 σ is a paraproduct. Thus, we only need to continue working with 1 σ . Since ϕ I ,...,I = 0, we have that As in [16] we say that I is k-good for k ≥ 2 -and denote this by (2.28) Notice that for all I ∈ D 0 we have Thus, by the independence of the position of I and the k-goodness of I we have where I ∈D σ,good (k) ϕ I +m 1 (I ),...,I +m n (I ),I and C is large enough.
Next, the key implication of the $k$-goodness is that if $|m| \le 2^{k-2}$ and $I \in \mathcal{D}_{\sigma,\mathrm{good}}(k)$. Indeed, notice that e.g. $c_I + m\ell(I) \in [I + m\ell(I)] \cap K$ (so that $[I + m\ell(I)] \cap K \ne \emptyset$, which is enough) as Therefore, to conclude that $Q_k$ is a modified n-linear shift it only remains to prove the normalization Suppose first that $k \sim 1$. Recall that $(m_1, \dots, m_n) \ne 0$ and assume for example that $m_1 \ne 0$. We have using the size estimate of the kernel that where in the second step we repeatedly used estimates of the form Notice that this is the right upper bound (2.31) in the case $k \sim 1$. Suppose then that $k$ is large enough so that we can use the continuity assumption of the kernel. In this case we have that if $x_{n+1} \in I$ and $x_1 \in I + m_1\ell(I), \dots, x_n \in I + m_n\ell(I)$, then $\sum_{m=1}^n |x_{n+1} - x_m| \sim 2^k \ell(I) = \ell(K)$. Thus, there holds that (2.34) We have proved (2.31). This ends our treatment of $E_\sigma \Sigma_\sigma$. We now only need to deal with the remainder term $E_\sigma R_\sigma$. Write where each $(I_1, \dots, I_{n+1}) \in \mathcal{I}_\sigma$ satisfies that if $j \in \{1, \dots, n+1\}$ is such that $\ell(I_j) \le \ell(I_i)$ for all $i \in \{1, \dots, n+1\}$, then $\ell(I_j) = \ell(I_{i_0})$ for at least one $i_0 \in \{1, \dots, n+1\} \setminus \{j\}$. The reason the remainder is simpler than the main terms is that we can split this summation so that there are always at least two sums which we do not need to collapse, which means that we will readily have two cancellative Haar functions. To give the idea, it makes sense to explain the bilinear case $n = 2$. In this case we can, in a natural way, decompose , which (after collapsing the relevant sums) gives that These are all handled similarly (the point is that there are at least two martingale differences remaining in all of them) so we look for example at We can represent these terms as sums of standard bilinear shifts of the form $S_{k,k,k}$. The first term is handled exactly like $\Sigma^1_\sigma$ above. The second term is readily a zero complexity shift.
To prove the estimate for the coefficient we write $\langle T(h_I, h^0_I), h_I \rangle$ as the sum of The general n-linear remainder term $R_\sigma$ is analogous and only yields standard n-linear shifts $S_{k,\dots,k}$. We are done.

Preliminaries of Banach Space Theory
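An extensive treatment of Banach space theory is given in the books [20,21].

Suppose $X$ is a Banach space. We denote the underlying norm by $|\cdot|_X$. Below $(\varepsilon_i)$ denotes a sequence of independent random signs and $\mathbb{E}$ the corresponding expectation. The Kahane-Khintchine inequality says that for all $x_1, \dots, x_N \in X$ and $p, q \in (0, \infty)$ there holds
$$\Big(\mathbb{E}\Big|\sum_{i=1}^N \varepsilon_i x_i\Big|_X^p\Big)^{1/p} \sim \Big(\mathbb{E}\Big|\sum_{i=1}^N \varepsilon_i x_i\Big|_X^q\Big)^{1/q}.$$
Definitions related to Banach spaces often involve such random sums, and the definition may involve some fixed choice of the exponent, but the choice is irrelevant by the Kahane-Khintchine inequality. The Kahane contraction principle says that if $(a_i)_{i=1}^N$ is a sequence of scalars, $x_1, \dots, x_N \in X$ and $p \in (0, \infty]$, then
$$\Big(\mathbb{E}\Big|\sum_{i=1}^N \varepsilon_i a_i x_i\Big|_X^p\Big)^{1/p} \lesssim \max_i |a_i| \Big(\mathbb{E}\Big|\sum_{i=1}^N \varepsilon_i x_i\Big|_X^p\Big)^{1/p},$$
with the usual interpretation for $p = \infty$.
1. The space $X$ has type $r$ if there exists a finite constant $\tau \ge 0$ such that for all finite sequences $x_1, \dots, x_N \in X$ we have
$$\Big(\mathbb{E}\Big|\sum_{i=1}^N \varepsilon_i x_i\Big|_X^r\Big)^{1/r} \le \tau \Big(\sum_{i=1}^N |x_i|_X^r\Big)^{1/r}.$$
2. The space $X$ has cotype $q$ if there exists a finite constant $c \ge 0$ such that for all finite sequences $x_1, \dots, x_N \in X$ we have
$$\Big(\sum_{i=1}^N |x_i|_X^q\Big)^{1/q} \le c \Big(\mathbb{E}\Big|\sum_{i=1}^N \varepsilon_i x_i\Big|_X^q\Big)^{1/q}.$$
For $q = \infty$ the usual modification is used.
The least admissible constants are denoted by $\tau_{r,X}$ and $c_{q,X}$; they are the type $r$ constant and cotype $q$ constant of $X$.
In [21, Sect. 7] the reader can find the basic theory of types and cotypes. We only need a few basic facts, however. If $X$ has type $r$ (cotype $q$), then it also has type $u$ for all $u \in [1, r]$ (cotype $v$ for all $v \in [q, \infty]$), and we have $\tau_{u,X} \le \tau_{r,X}$ ($c_{v,X} \le c_{q,X}$). It is also trivial that always $\tau_{1,X} = c_{\infty,X} = 1$. We say that $X$ has non-trivial type if $X$ has type $r$ for some $r \in (1, 2]$ and finite cotype if it has cotype $q$ for some $q \in [2, \infty)$.
For the types and cotypes of $L^p$ spaces we have the following: if $X$ has type $r$, then $L^p(X)$ has type $\min(r, p)$, and if $X$ has cotype $q$, then $L^p(X)$ has cotype $\max(q, p)$.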
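Random sums with independent signs can be computed exactly for a handful of vectors, which gives a feeling for type and cotype. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the vectors have integer coordinates, and the second moment $\mathbb{E}|\sum_i \varepsilon_i x_i|^2$ is obtained by enumerating all sign patterns. In the Euclidean norm the second moment equals $\sum_i |x_i|^2$ exactly, which reflects type 2 and cotype 2 with constant one. In the $\ell^1$ norm on $\mathbb{R}^N$ the unit vectors give second moment $N^2$ against $\sum_i |x_i|^2 = N$, so a type 2 constant for these vectors must be at least $\sqrt{N}$.

```python
from fractions import Fraction
from itertools import product


def euclidean_sq(v):
    """Squared Euclidean norm."""
    return sum(x * x for x in v)


def l1_sq(v):
    """Squared l^1 norm."""
    return sum(abs(x) for x in v) ** 2


def rademacher_second_moment(vectors, norm_sq):
    """Exact value of E |sum_i eps_i x_i|^2 over all 2^N sign patterns."""
    dim = len(vectors[0])
    total = 0
    for signs in product((-1, 1), repeat=len(vectors)):
        combo = [sum(e * v[c] for e, v in zip(signs, vectors))
                 for c in range(dim)]
        total += norm_sq(combo)
    return Fraction(total, 2 ** len(vectors))


if __name__ == "__main__":
    vectors = [(1, 2), (3, -1), (0, 4), (2, 2)]
    print(rademacher_second_moment(vectors, euclidean_sq),
          sum(euclidean_sq(v) for v in vectors))
    basis = [tuple(1 if i == j else 0 for j in range(4)) for i in range(4)]
    print(rademacher_second_moment(basis, l1_sq),
          sum(l1_sq(v) for v in basis))
```

The enumeration has $2^N$ terms, so this is meant for small $N$ only.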
The UMD property is a necessary and sufficient condition for the boundedness of various singular integral operators on L p (R d ; X ) = L p (X ), see [20, Sect. 5.2.c and the Notes to Sect. 5.2].

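Definition 3.3 A Banach space $X$ is said to be a UMD space, where UMD stands for unconditional martingale differences, if for all $p \in (1, \infty)$, all $X$-valued $L^p$-martingale difference sequences $(d_i)_{i=1}^N$ and all signs $\epsilon_i \in \{-1, 1\}$ we have
$$\Big\|\sum_{i=1}^N \epsilon_i d_i\Big\|_{L^p(X)} \lesssim \Big\|\sum_{i=1}^N d_i\Big\|_{L^p(X)}. \tag{3.4}$$
The $L^p(X)$-norm is with respect to the measure space where the martingale differences are defined.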
A standard property of UMD spaces is that if ( This UMD-valued version of Stein's inequality is by Bourgain, for a proof see e.g. Theorem 4.2.23 in the book [20]. We now introduce some definitions related to the so called decoupling estimate. For K ∈ D denote by Y K the measure space (K , Leb(K ), ν K ). Here Leb(K ) is the collection of Lebesgue measurable subsets of K and ν K = dx K /|K |, where dx K is the d-dimensional Lebesgue measure restricted to K . We then define the product probability space If y ∈ Y and K ∈ D, we denote by y K the coordinate related to Y K .
In our upcoming estimates, it will be important to separate scales using the following subgrids. (3.5) The following proposition concerning decoupling is a special case of Theorem 3.1 in [17]. It is a result that can be stated in the generality of suitable filtrations, but we prefer to only state the following dyadic version.
´f K = 0 and (3) f K is constant on those K ∈ D k,l for which K K .
Then we havê where the implicit constant is independent of k, l.
Hilbert spaces are the only Banach spaces with both type 2 and cotype 2. Below we will prove estimates for modified shifts like where the UMD space $X$ has type $r$ and cotype $q$. Therefore, in the Hilbert space case, and thus in the scalar-valued case, these estimates recover the best possible regularity $\alpha = 1/2$. The presented estimates are efficient in completely general UMD spaces. However, in UMD function lattices it is more efficient to mimic the scalar-valued theory (Proposition 2.13) and use square function estimates instead.

The Linear Case
We feel that it is too difficult to jump directly into the multilinear estimates, as they are quite involved. Thus, we first study the linear case. We show that the framework of modified dyadic shifts gives a modern and convenient proof of the results of Figiel [11,12] concerning UMD-extensions of CZOs with mild kernel regularity.
Before moving to the main $X$-valued estimate for $Q_k$, we state the following result for paraproducts. We have
$$\|\pi f\|_{L^p(X)} \lesssim \|f\|_{L^p(X)} \tag{3.8}$$
whenever $p \in (1, \infty)$ and $X$ is UMD. We understand that this is usually attributed to Bourgain; in any case, a simple proof can now be found in [17].

Remark 3.9
The estimate in Proposition 3.10 below is best used for p = 2, since then e.g. min(r , p) = r , if r ∈ (1, 2] is an exponent such that X has type r . Indeed, it is efficient to only move the p = 2 estimate for the CZO T , and then interpolate to get the L p boundedness under a modified Dini type assumption that is independent of p. On the Q k level improving an L 2 estimate into an L p estimate with good dependency on the complexity does not seem so simple. Interpolation would introduce some additional complexity dependency, since the weak (1, 1) inequality of Q k is not complexity free.
Proposition 3.10 Let p ∈ (1, ∞) and X be a UMD space. If Q k is a modified shift of the form (2.10), then where r ∈ (1,2] is an exponent such that X has type r . If Q k is a modified shift of the form (2.9), we have where q ∈ [2, ∞) is an exponent such that X has cotype q.
Proof We assume that $Q_k$ has the form (2.10); the other result follows by duality. This uses that if the UMD space $X$ has cotype $q$, then the dual space $X^*$ has type $q' = q/(q-1)$; see [21, Proposition 7.4.10].
If K ∈ D we define Recall the lattices D k,l from (3.5) and write Q k f = k l=0 K ∈D k,l B K f . By using the UMD property of X and the Kahane-Khintchine inequality we have for all s ∈ (0, ∞) that We use this with the choice s := min(r , p), since L p (X ) has type s. Using this we have To end the proof, it remains to show that uniformly on l.
We have that $\langle f, H_{I,J} \rangle = \langle P_{K,k} f, H_{I,J} \rangle$ and so Accordingly, this splits the estimate of (3.11) into two parts. We consider first the part related to $\langle P_{K,k} f, 1_I H_{I,J} \rangle$. By the UMD property and the Kahane-Khintchine inequality we have Notice then that for we have Using this we have by Hölder's inequality (recalling that $\nu$ is a probability measure) that Notice now that $|a_K(x, y)| \le 1_K(x)$. Thus, the Kahane contraction principle implies that for fixed $x$ and $y$ there holds that From here the estimate is easily concluded by Here we first changed the indexing of the random signs (using that for a fixed $x$ and every $K$ there is at most one $J$ as in the sum for which $1_J(x) \ne 0$) and then applied the UMD property. This finishes the proof.
Proof Apply Theorem 2.26 to simple vector-valued functions. By Proposition 3.10 and the X -valued boundedness of the paraproducts (3.8) we conclude that As the weak type (1, 1) follows from this even with just the assumption ω ∈ Dini 0 , we can conclude the proof by the standard interpolation and duality method.

The Multilinear Case
Let $(X_1, \dots, X_{n+1})$ be UMD spaces and $Y_{n+1}^* = X_{n+1}$. Assume that there is an n-linear mapping $X_1 \times \cdots \times X_n \to Y_{n+1}$, which we denote with the product notation $(x_1, \dots, x_n) \mapsto \prod_{j=1}^n x_j$, so that With just this setup it makes sense to extend an n-linear SIO (or some other suitable n-linear operator) $T$ using the formula
$$T(f_1, \dots, f_n)(x) = \sum_{j_1, \dots, j_n} T(f_{1,j_1}, \dots, f_{n,j_n})(x) \prod_{k=1}^n e_{k,j_k}, \qquad x \in \mathbb{R}^d,$$
where $f_k = \sum_{j_k} e_{k,j_k} f_{k,j_k}$ with scalar-valued functions $f_{k,j_k}$ and $e_{k,j_k} \in X_k$. In the bilinear case $n = 2$ the existence of such a product is the only assumption that we will need. The bilinear case is somewhat harder than the linear case, but the $n \ge 3$ case is by far the most subtle. Indeed, for $n \ge 3$ we will need a more complicated setting for the tuple of spaces $(X_1, \dots, X_{n+1})$. The idea is to model the Hölder type structure typical of concrete examples of Banach n-tuples, such as that of non-commutative $L^p$ spaces with the exponents $p$ satisfying the natural Hölder relation. We will borrow this setting from [9]. In [9] it is shown in detail how natural tuples of non-commutative $L^p$ spaces fit to this abstract framework. While we borrow the setting, the proof is significantly different. First, we have to deal with the more complicated modified shifts. Second, even in the standard shift case the proof in [9] is, by its very design, extremely costly on its complexity dependency. To circumvent this we need a new strategy.
For $m \in \mathbb{N}$ we write $\mathcal{J}_m := \{1, \dots, m\}$ and denote the set of permutations of $\mathcal{J} \subset \mathcal{J}_m$ by $\Sigma(\mathcal{J})$. We write $\Sigma(m) = \Sigma(\mathcal{J}_m)$.
Next, we fix an associative algebra A over C, and denote the associative operation A × A → A by (e, f ) → e f . We assume that there exists a subspace L 1 of A and a linear functional τ : L 1 → C, which we refer to as trace. Given an m-tuple (X 1 , . . . , X m ) of Banach subspaces X j ⊂ A, we construct the seminorm For a Banach subspace X ⊂ A and y ∈ Y (X ) we can define the mapping y ∈ X * by the formula y (x) := τ (yx), since by the definition yx ∈ L 1 and |τ (yx)| ≤ |y| Y (X ) |x| X . We say that a Banach subspace X of A is admissible if the following holds.
For each x ∈ X , y ∈ Y (X ), we also have x y ∈ L 1 and τ (x y) = τ (yx). (3.18) If X is admissible, then the map y → y is an isometric bijection from Y (X ) onto X * , and we identify Y (X ) with X * . The following is [9, Lemma 3.10].

Lemma 3.19
Let X be admissible and reflexive (for instance, X is admissible and UMD). If Y (X ) is also admissible, then Y (Y (X )) = X as sets and |x| Y (Y (X )) = |x| X for all x ∈ X.
Definition 3.20 [UMD Hölder pair] Let X 1 , X 2 be admissible spaces. We say that {X 1 , X 2 } is a UMD Hölder pair if X 1 is a UMD space and X 2 = Y (X 1 ). P1. For all j 0 ∈ J m there holds is an admissible Banach space with the norm (3.16) and is a UMD Hölder (k + 1)-tuple.
The following is a key consequence of the definition. Let m ≥ 3 and {X 1 , . . . , X m } be a UMD Hölder m-tuple. Then according to P2 the pair {X j 0 , Y (X j 0 )} is a UMD Hölder pair, which by Definition 3.20 implies that X j 0 and Y (X j 0 ) are UMD spaces. The inductive nature of the definition then ensures that each Y (X j 1 , . . . , X j k ) appearing in (3.22) is a UMD space.

Proposition 3.25 Suppose that Q k is an n-linear modified shift and f j
Proof We will assume that Q k is of the form (2.6)-the other cases follow by duality using the property (3.23). We follow the ideas of the decomposition from the proof of Lemma 2.18: we will estimate the terms A m and U m , m ∈ {1, . . . , n}, from there separately.
First, we estimate the part A m defined in (2.21). We have that and b K ,(I j ) = b K ,I 1 ,...,I n+1 = |I n+1 | n/2 a K ,(I j ) .
Recalling that X * n+1 is identified with Y (X n+1 ), we have that L p n+1 (Z 1,...,n ) has type s := min( p n+1 , s n+1 ). Thus, there holds that In the first step we used the UMD property of Z 1,...,n and the Kahane-Khintchine inequality. We see that to prove the claim it suffices to show the uniform bound We turn to prove (3.28). To avoid confusion with the various Y spaces, we denote the decoupling space by (W , ν) in this proof. The decoupling estimate (3.7) gives that Next, we fix the point w ∈ W and consider the term In the last step we were able to replace 1 I n /|I n | 1/2 with h I n because of the random signs, after which we removed the signs using UMD. Recalling the size of the coefficients b m,K ,(I j ) from (3.26) we see that since there is only one I n+1 such that h I n+1 (w K ) ≠ 0. It is seen that after applying decoupling (3.34) is like (3.31) but the degree of linearity is one less. Therefore, iterating this we see that (3. Similarly as with the operators A m , we use the fact that L p n+1 (Z 1,...,n ) has type s = min( p n+1 , s n+1 ) to reduce to controlling the term uniformly on l. For every I n+1 such that I (k) n+1 = K there holds that Therefore, using the UMD property and the Kahane contraction principle the L p n+1 (Z 1,...,n )-norm of (3.36) is dominated by If all the averages were on the "level i + 1" we could estimate this directly. Since there are the averages on the "level i" we need to further split this. There holds that Then, both of these are expanded in the same way related to f m+2 and so on. This gives that Here the summation is over functions ϕ : {m+1, . . . , n} → {0, 1} and for a cube I ∈ D we defined D 0 I = E I and D 1 I = Δ I . We also used the fact that Finally, we take one ϕ and estimate the related term. It can be written as 1/q m+1,...,n . (3.39) The term in (3.38) can be estimated by a repeated use of Stein's inequality. The term in (3.39) is like the term in (3.37).
If there is only one martingale difference, then we can again estimate directly with Stein's inequality. If there are at least two martingale differences, then one can split into two as we did when we arrived at (3.38) and (3.39). This process is continued until one ends up with terms that contain only one martingale difference, and such terms we can estimate.
The proof of Proposition 3.25 is finished.
With the essentially same proof as for the terms A m above we also have the following result.  ( p 1 , . . . , p n+1 , s 1 , . . . , s n+1 ) and X j has cotype s j .

Remark 3.41
Ordinary shifts obey a complexity free bound in the scalar-valued setting. However, we do not know how to achieve this with general UMD spaces. It seems that a somewhat better dependency could be obtained, but it would not have any practical use.
The following is [9,Theorem 5.3]. This is a significantly simpler argument than the shift proof and consists of repeated use of Stein's inequality until one is reduced to the linear case (3.8).
Finally, we are ready to state our main result concerning the UMD extensions of n-linear ω-CZOs.

Theorem 3.43 Suppose that T is an n-linear ω-CZO.
Suppose $\omega \in \mathrm{Dini}_\alpha$, where $\alpha = \frac{1}{\min((n+1)/n,\, s_1', \dots, s_{n+1}')}$ and $X_j$ has cotype $s_j$. Then for all exponents $1 < p_1, \dots, p_n \le \infty$ and $1/q_{n+1} = \sum_{j=1}^n 1/p_j > 0$ we have
$$\|T(f_1, \dots, f_n)\|_{L^{q_{n+1}}(X_{n+1}^*)} \lesssim \prod_{j=1}^n \|f_j\|_{L^{p_j}(X_j)}.$$

Proof The important part is to establish the boundedness with a single tuple of exponents. We may e.g. conclude from the boundedness of the model operators and Theorem 2.26 that if we choose $\alpha$ as in the statement of the theorem. It is completely standard how to improve this to cover the full range: we can e.g. prove the end point estimate $T : L^1(X_1) \times \cdots \times L^1(X_n) \to L^{1/n,\infty}(X_{n+1}^*)$, see [26], and then use interpolation or good-$\lambda$ methods. See e.g. [15,27]. For such arguments the spaces $X_j$ no longer play any role (the scalar-valued proofs can readily be mimicked).

Remark 3.44
The exponent $(n+1)/n$ in the definition of $\alpha$ is slightly annoying, since now the exponent $\alpha = 1/2$ valid in the scalar-valued case $X_1 = \cdots = X_{n+1} = \mathbb{C}$ does not follow from this result, even though then $s_1' = \cdots = s_{n+1}' = 2$. Of course, it is much simpler to prove scalar-valued estimates directly with other methods anyway (see Sect. 2.1).
Notice that it is also clear that $\mathrm{Dini}_{1/2}$ suffices in suitable tuples $(X_1, \dots, X_{n+1})$ of UMD function lattices. See e.g. [22] for an account of the well-known square function and maximal function estimates valid in lattices. In lattices the simple approach of Sect. 2.1 is much better, as then in addition to the factor $(n+1)/n$ we often have $s_j' < 2$.
In interesting and non-trivial situations the presence of $(n+1)/n$ is not an additional restriction. Suppose each space $X_j$ is a non-commutative $L^p$ space $L^{p_j}(M)$ and $\sum_{j=1}^{n+1} 1/p_j = 1$, $1 < p_j < \infty$. Then the cotype of $X_j$ is $s_j = \max(2, p_j) \ge p_j$ so that
$$1 = \sum_{j=1}^{n+1} \frac{1}{p_j} \ge \sum_{j=1}^{n+1} \frac{1}{s_j},$$
and so there has to be an index $j$ with $1/s_j \le 1/(n+1)$, that is, with $s_j' \le (n+1)/n$ anyway. Thus $\min((n+1)/n, s_1', \dots, s_{n+1}') = \min(s_1', \dots, s_{n+1}')$.
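The arithmetic behind the regularity exponent can be made concrete with a small helper. The sketch below assumes that the exponent is $\alpha = 1/\min((n+1)/n, s_1', \dots, s_{n+1}')$ with the dual exponents $s_j' = s_j/(s_j - 1)$ of the cotypes $s_j \in [2, \infty)$; this reading, the function names and the sample tuples are assumptions made for illustration, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def dual_exponent(s):
    """Dual exponent s' = s / (s - 1) of a finite exponent s > 1."""
    s = Fraction(s)
    if s <= 1:
        raise ValueError("the exponent must be larger than 1")
    return s / (s - 1)


def dini_exponent(n, cotypes):
    """alpha = 1 / min((n + 1)/n, s_1', ..., s_{n+1}') for cotypes s_1..s_{n+1}."""
    if len(cotypes) != n + 1:
        raise ValueError("expected n + 1 cotype exponents")
    candidates = [Fraction(n + 1, n)] + [dual_exponent(s) for s in cotypes]
    return 1 / min(candidates)


if __name__ == "__main__":
    # Scalar-valued bilinear case: all cotypes equal 2, yet alpha = 2/3.
    print(dini_exponent(2, [2, 2, 2]))
    # Cotypes s_j = max(2, p_j) for (p_1, p_2, p_3) = (2, 4, 4).
    print(dini_exponent(2, [2, 4, 4]))
```

In the first sample tuple the factor $(n+1)/n = 3/2$ is what prevents the value $1/2$. In the second it plays no role, because some $s_j' = 4/3$ is already smaller than $3/2$.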