1 Introduction

The UMD property is crucial in harmonic and stochastic analysis in Banach spaces; see, e.g., [17, 18]. A Banach space X is said to satisfy the UMD property if there exists a constant \( c_{(1)}\ge 1\) such that for every X-valued martingale difference sequence \((d_n)_{n=1}^N\) one has

$$\begin{aligned} \left\| \sum _{n=1}^N d_n \right\| _{{\mathcal {L}}^2({{\mathbb {P}}};X)} \le c_{(1)} \left\| \sum _{n=1}^N \theta _n d_n \right\| _{{\mathcal {L}}^2({{\mathbb {P}}};X)} \end{aligned}$$
(1)

for all signs \(\theta _n \in \{ -1,1 \}\), i.e., one has Unconditional Martingale Differences. It follows from Maurey [27] that, in order to verify the UMD property of a Banach space X, it suffices to consider X-valued Haar or dyadic martingales (dyadic martingales are also known as Paley–Walsh martingales).

On the other hand, McConnell [28, Theorem 2.2] (see also Hitczenko [15]) proved that the UMD property is equivalent to the existence of a \(c_{(2)}\ge 1\) such that

$$\begin{aligned} \left\| \sum _{n=1}^N d_n \right\| _{{\mathcal {L}}^2({{\mathbb {P}}};X)} \le c_{(2)} \left\| \sum _{n=1}^N e_n \right\| _{{\mathcal {L}}^2({{\mathbb {P}}};X)} \end{aligned}$$
(2)

for all \(N\in {{\mathbb {N}}}\) and all X-valued \(({\mathcal {F}}_n)_{n=1}^{N}\)-martingale difference sequences \((d_n)_{n=1}^{N}\) and \((e_n)_{n=1}^{N}\) such that \({\mathcal {L}}(d_n\,|\,{\mathcal {F}}_{n-1}) = {\mathcal {L}}(e_n\,|\,{\mathcal {F}}_{n-1})\), i.e., \((e_n)_{n=1}^{N}\) and \((d_n)_{n=1}^{N}\) are tangent. Imposing additional assumptions on either \((e_n)_{n=1}^{N}\) or \((d_n)_{n=1}^{N}\) in (2) results in an (a priori) weaker Banach space property: e.g., requiring that \((e_n)_{n=1}^{N}\) in (2) be the decoupled tangent sequence of \((d_n)_{n=1}^{N}\) (see Definition 2.5) results in the lower decoupling property for tangent martingales. The Banach space \(L^1\) satisfies the lower decoupling property for tangent martingales (see Cox and Veraar [9, Example 4.7]), but fails to have the UMD property (see, e.g., [17, Example 4.2.20]).

The notion of decoupled tangent sequences was introduced by Kwapień and Woyczyński [23, 24]. The decoupled tangent sequence \((e_n)_{n=1}^N\) of a sequence \((d_n)_{n\in {{\mathbb {N}}}}\) (adapted to a filtration \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}}\)) is unique in distribution and replaces parts of the dependence structure of \((d_n)_{n=1}^N\) by a sequence of conditionally independent random variables. Although the definition of a decoupled tangent sequence is not constructive, there are canonical representations of such sequences, see Kwapień and Woyczyński [24] and Montgomery-Smith [29].

Inequalities (1) and (2) describe two-sided decoupling properties due to the symmetry between the left- and right-hand side. The lower decoupling property for tangent martingales however is an example of a one-sided decoupling property. An a priori different one-sided decoupling property is obtained by considering (1) where \((\theta _n)_{n=1}^{N}\) is replaced by a Rademacher sequence that is independent of \((d_n)_{n=1}^{N}\). This one-sided decoupling property was first studied explicitly in [13] and is also satisfied by \(L^1\). The goal of this article is to gain insight into the relation between these different kinds of one-sided decoupling properties. First, however, let us discuss some instances in the literature where decoupling inequalities play a crucial role.

The proofs by Burkholder [4] and Bourgain [2] of the equivalence of the UMD property of a Banach space X and the continuity of the X-valued Hilbert transform use that X has the UMD property if and only if it has both a lower and an upper decoupling property. For certain applications, only a single one-sided decoupling property is needed. For example, the lower (resp. upper) decoupling property for martingales combined with the type (resp. cotype) property implies martingale type (resp. martingale cotype) and hence, by Pisier [31], an equivalent re-norming of the Banach space with a norm having a certain modulus of continuity (resp. convexity). A classical case of decoupling, studied in its own right, concerns randomly stopped sums of independent random variables, see for example the results of Klass [21, 22].

Another application of decoupling is stochastic integration. Indeed, only the lower decoupling property is needed to obtain sufficient conditions for the existence of stochastic integrals. This can be inferred from [12, the proof of Theorem 2], where stochastic integration of UMD Banach space-valued processes with respect to a Brownian motion is considered. One-sided decoupling is used more explicitly in [24, Section 6], where the existence of decoupled tangent processes for left quasi-continuous processes in the Skorokhod space is studied. In [34] and [9, Section 5], decoupling inequalities were used to give sufficient conditions for the existence of stochastic integrals of Banach space-valued processes with respect to a cylindrical Brownian motion. Very recently, Kallenberg [20] proved the existence of decoupled tangent semi-martingales and two-sided decoupling inequalities, and considered applications to multiple stochastic integrals. Moreover, quasi-Banach spaces fail to satisfy the UMD property, but may satisfy decoupling inequalities, see [7, Section 5.1] and, e.g., [9, Example 4.7]. Section 5 contains our contribution to this topic, see also Theorem 1.7 and Remark 5.6. Before discussing this contribution, let us turn to the open problem that motivated our research:

Open Problem 1.1

If a Banach space X has the lower decoupling property for tangent dyadic martingales, does it also have the lower decoupling property for general tangent martingales?

We were not able to answer this question in full generality. However, our main result (Theorem 1.4) provides a reduction of this problem to simple Haar-type series and gives a partial answer (see Corollary 1.6 for a special case). The proof of Theorem 1.4 is inspired by the aforementioned work of Maurey [27]. Open Problem 1.1 can be split into two subproblems; accordingly, our proof of Theorem 1.4 consists of two parts, completely solving subproblem (A) and partially solving subproblem (B):

  1. (A)

    If a Banach space X has a lower (upper) decoupling property for X-valued sequences of random variables adapted to a (in a certain sense) natural minimal filtration \(({\mathcal {F}}_n)_{n=1}^{\infty }\) and with conditional distributions in a set of measures \({\mathcal {P}}\), does X also have a lower (upper) decoupling property for X-valued sequences adapted to any filtration \(({\mathcal {F}}_n)_{n=1}^{\infty }\) and with conditional distributions in \({\mathcal {P}}\)?

  2. (B)

    Given that X has a lower (resp. upper) decoupling property for X-valued sequences with conditional distributions in a certain \({\mathcal {P}}\), does X also have a lower (resp. upper) decoupling property for general X-valued sequences?

Problem (A) is of fundamental importance in stochastic integration theory as, given the driving process, the underlying filtration determines the set of integrands we may use. We now describe the content of the article in more detail:

Section 3: Theorem 3.1 provides a factorization of a random variable along regular conditional probabilities. With this result, we contribute to the results of Montgomery-Smith [29] (see also Kallenberg [19, Lemma 3.22]). This factorization is the key to approximating our adapted processes in terms of Haar-like series.

Section 4: this section is devoted to our main results, Theorem 4.3 and its corollary, Theorem 1.4. It contains the key ingredients for the proofs and some examples.

To formulate Theorem 1.4, we recall some definitions. For a separable Banach space X, we denote by \({\mathcal {B}}(X)\) the \(\sigma \)-algebra generated by the norm-open sets. By \({\mathcal {P}}(X)\), we denote the set of all probability measures on \((X,{\mathcal {B}}(X))\); for \(p\in (0,\infty )\), we let

$$\begin{aligned} {\mathcal {P}}_p(X) := \left\{ \mu \in {\mathcal {P}}(X) :\int _{X} \Vert x \Vert _X^p \,\mathrm{d}\mu (x) < \infty \right\} , \end{aligned}$$

and \(\delta _x\in {\mathcal {P}}(X)\) stands for the Dirac measure at \(x\in X\). The next definition concerns a set of admissible adapted processes characterized by an assumption on the regular versions of the—in a sense—predictable projections:

Definition 1.2

Let X be a separable Banach space, \(p\in (0,\infty )\), \(\emptyset \not = {\mathcal {P}}\subseteq {\mathcal {P}}_p(X)\), and \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},({\mathcal {F}}_n)_{n=0}^\infty )\) be a stochastic basis. We denote by \({{\mathcal {A}}}_p(\Omega ,({\mathcal {F}}_n)_{n= 0}^\infty ;X,{\mathcal {P}})\) the set of \(({\mathcal {F}}_n)_{n=1}^\infty \)-adapted sequences \((d_n)_{n=1}^\infty \) in \({\mathcal {L}}^p({{\mathbb {P}}};X)\) with the property that for all \(n\ge 1\) there exists an \(\Omega _{n-1} \in {\mathcal {F}}\) satisfying \({{\mathbb {P}}}(\Omega _{n-1})=1\) and \(\kappa _{n-1}[\omega ,\cdot ]\in {\mathcal {P}}\) for all \(\omega \in \Omega _{n-1}\), where \(\kappa _{n-1}\) is a regular conditional probability kernel for \({\mathcal {L}}(d_{n}\,|\,{\mathcal {F}}_{n-1})\).

The concept of regular conditional probability kernels is recalled in Sect. 2.2. Next, we introduce an extension of a given set of probability measures that is natural in our context:

Definition 1.3

For a separable Banach space X, \(p\in (0,\infty )\) and \(\emptyset \ne {\mathcal {P}}\subseteq {\mathcal {P}}_p(X)\) we let

$$\begin{aligned} \begin{aligned} {\mathcal {P}}_{p\text {-ext}} := \Big \{ \mu \in {\mathcal {P}}_p(X) :&\forall j\ge 1\, \exists \, K_j\ge 1 \text { and } \mu _{j,1},\ldots ,\mu _{j,K_j} \in {\mathcal {P}}\\&\text {such that } \mu _{j,1} * \cdots * \mu _{j,K_j} {\mathop {\rightarrow }\limits ^{w^*}} \mu \text { as } j\rightarrow \infty \\&\text {and } \left( \mu _{j,1} * \cdots * \mu _{j,K_j} \right) _{j\in {{\mathbb {N}}}} \text { is uniformly } {\mathcal {L}}^p\text {-integrable} \Big \}. \end{aligned} \end{aligned}$$

The convergence of the convolutions \(\mu _{j,1} * \cdots * \mu _{j,K_j}\) toward \(\mu \) in Definition 1.3 is equivalent to convergence in the p-Wasserstein distance if \(p\in [1,\infty )\) (cf. [6, Theorem 5.5, p. 358]).

Theorem 1.4

Let XYZ be Banach spaces, where X is separable, let \(S:X\rightarrow Y\) and \(T:X\rightarrow Z\) be linear and bounded, \(p\in (0,\infty )\), \(\Delta \) an index set, let \(\Psi :[0,\infty ) \rightarrow [0,\infty )\) be upper semi-continuous, and let \(\Psi _\lambda :[0,\infty ) \rightarrow [0,\infty )\), \(\lambda \in \Delta \), be a family of lower semi-continuous functions such that

$$\begin{aligned} \sup _{\xi \in (0,\infty )} ( 1 + | \xi | )^{-p} \Psi (\xi )<\infty \quad \mathrm{and} \quad \sup _{\xi \in (0,\infty )} ( 1 + | \xi | )^{-p} \Psi _\lambda (\xi ) < \infty \end{aligned}$$
(3)

for all \(\lambda \in \Delta \). Then, for a set \({\mathcal {P}}\subseteq {\mathcal {P}}_p(X)\) with \(\delta _0\in {\mathcal {P}}\), the following assertions are equivalent:

  1. (i)

    For every stochastic basis \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},{{\mathbb {F}}}=({\mathcal {F}}_n)_{n=0}^{\infty })\) and every finitely supported \((d_n)_{n=1}^\infty \in {{\mathcal {A}}}_p(\Omega ,{{\mathbb {F}}};X,{\mathcal {P}}_{p\text {-ext}})\), it holds that

    $$\begin{aligned} \sup _{\lambda \in \Delta } {{\mathbb {E}}}\Psi _{\lambda } \left( \left\| \sum _{n=1}^{\infty } S d_n \right\| _Y \right) \le {{\mathbb {E}}}\Psi \left( \left\| \sum _{n=1}^{\infty } T e_n \right\| _Z \right) , \end{aligned}$$
    (4)

    whenever \((e_n)_{n\in {{\mathbb {N}}}}\) is an \({{\mathbb {F}}}\)-decoupled tangent sequence of \((d_n)_{n\in {{\mathbb {N}}}}\).

  2. (ii)

    For every sequence of independent random variables \((\varphi _n)_{n=1}^N \subset {\mathcal {L}}^p({{\mathbb {P}}};X)\), \(N\ge 1\), satisfying \({\mathcal {L}}(\varphi _n)\in {\mathcal {P}},\) and every \(A_0\in \{\emptyset ,\Omega \}\), \(A_n\in \sigma (\varphi _1,\ldots ,\varphi _n)\), \(n\in \{1,\ldots ,N\}\), it holds that

    $$\begin{aligned} \sup _{\lambda \in \Delta } {{\mathbb {E}}}\Psi _{\lambda } \left( \left\| \sum _{n=1}^N 1_{A_{n-1}} S \varphi _n \right\| _Y \right) \le {{\mathbb {E}}}\Psi \left( \left\| \sum _{n=1}^N 1_{A_{n-1}} T \varphi '_n \right\| _Z \right) , \end{aligned}$$
    (5)

    where \((\varphi '_n)_{n=1}^N\) is an independent copy of \((\varphi _n)_{n=1}^N\).

Some remarks concerning Theorem 1.4 are in place:

  1. (1)

    The condition that \(\delta _0\in {\mathcal {P}}\) ensures that finitely supported sequences fit into our setting; it is used at several points in the proof.

  2. (2)

    The condition that X is separable is mainly to simplify our presentation: after all, we can apply Theorem 1.4 whenever \((d_n)_{n\in {{\mathbb {N}}}}\) is a sequence of random variables taking values in a separable subspace X of some non-separable space \({\tilde{X}}\).

  3. (3)

    The table below provides some typical choices for \(\Delta \), \(\Psi \), and \((\Psi _{\lambda })_{\lambda \in \Delta }\) given \(p\in (0,\infty )\) (here \(C\in (0,\infty )\) and f, g are \({{\mathbb {R}}}\)-valued random variables):

    | \(\Delta \) | \(\Psi _\lambda (\xi )\) | \(\Psi (\xi )\) | \(\root p \of {\sup _{\lambda \in \Delta }{{\mathbb {E}}}\Psi _{\lambda }(f)} \le \root p \of {{{\mathbb {E}}}\Psi (g)}\) |
    |---|---|---|---|
    | \(\mathrm{card}(\Delta )=1\) | \(\xi ^p\) | \(C^p \xi ^p\) | \(\Vert f\Vert _{{\mathcal {L}}^p({{\mathbb {P}}})} \le C \Vert g\Vert _{{\mathcal {L}}^p({{\mathbb {P}}})}\) |
    | \(\mathrm{card}(\Delta )=1\) | \(1_{\{\xi > \mu \}},\ \mu \ge 0\) | \(C^p \xi ^p\) | \(\root p \of {{{\mathbb {P}}}(f > \mu )} \le C \Vert g\Vert _{{\mathcal {L}}^p({{\mathbb {P}}})}\) |
    | \((0,\infty )\) | \(\lambda ^p 1_{\{\xi > \lambda \}}\) | \(C^p \xi ^p\) | \(\Vert f \Vert _{{\mathcal {L}}^{p,\infty }({{\mathbb {P}}})} \le C \Vert g\Vert _{{\mathcal {L}}^p({{\mathbb {P}}})}\) |

  4. (4)

    For relevant choices of \({\mathcal {P}}\), see Examples 4.5–4.8; Corollary 1.6 uses Example 4.8.

  5. (5)

    Theorem 1.4 remains valid if one exchanges \((d_n)_{n=1}^N\) with \((e_n)_{n=1}^N\) in (4) and \((\varphi _n)_{n=1}^N\) with \((\varphi '_n)_{n=1}^N\) in (5), respectively.

Section 5: we use Theorem 1.4 to obtain relevant upper bounds for stochastic integrals, see Theorem 1.7. In order to formulate that theorem, we need the following definition:

Definition 1.5

If \(p\in (0,\infty )\) and X is a separable Banach space, then we let \(D_p(X) := \inf c\), where the infimum is taken over all \(c \in [0,\infty ]\) such that

$$\begin{aligned} \left\| \sum _{n=1}^N r_n v_{n-1} \right\| _{{\mathcal {L}}^{p}({{\mathbb {P}}};X)} \le c \left\| \sum _{n=1}^N r_n' v_{n-1} \right\| _{{\mathcal {L}}^{p}({{\mathbb {P}}};X)} \end{aligned}$$
(6)

for all \(N\ge 2\), \(v_0\in X\) and \(v_n:=h_n(r_1,\ldots ,r_n)\) with \(h_n:\{-1,1\}^{n} \rightarrow X\), \(n\in \{1,\ldots ,N-1\}\), where the \((r_n)_{n=1}^N\) are independent and take the values \(-1\) and 1 with probability 1/2, and \((r_n')_{n=1}^N\) is an independent copy of \((r_n)_{n=1}^N\).
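For intuition, the two sides of (6) can be compared numerically in the scalar case \(X={{\mathbb {R}}}\). The following Monte Carlo sketch uses the illustrative choice \(h_n(r_1,\ldots ,r_n)=r_1\cdots r_n\) (any other predictable choice would serve equally well); the sample size and parameters are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, p = 200_000, 6, 2.0  # samples, martingale length, exponent

r = rng.choice([-1.0, 1.0], size=(M, N))   # Rademacher sequence (r_n)
rp = rng.choice([-1.0, 1.0], size=(M, N))  # independent copy (r'_n)

# predictable coefficients: v_0 = 1 and v_n = r_1 * ... * r_n
# (an illustrative choice of h_n, not prescribed by the definition)
v = np.ones((M, N))
v[:, 1:] = np.cumprod(r[:, :-1], axis=1)

lhs = np.mean(np.abs((r * v).sum(axis=1)) ** p) ** (1 / p)
rhs = np.mean(np.abs((rp * v).sum(axis=1)) ** p) ** (1 / p)
print(lhs, rhs)
```

For \(p=2\), both sides equal \(\sqrt{N}\) by orthogonality of the martingale differences, consistent with \(D_2({{\mathbb {R}}})<\infty \) in (6).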

The process \((\sum _{k=1}^n r_k v_{k-1})_{n=1}^N\) in Definition 1.5 is a dyadic martingale. Theorem 1.4 with \(X=Y=Z\), \(S=T={\text {Id}}\), \(\Delta =\{\lambda \}\), and \(\Psi (\xi )=\Psi _{\lambda }(\xi )=\xi ^p\) implies:

Corollary 1.6

Let X be a separable Banach space, let \(p\in (0,\infty )\) be such that \(D_p(X)<\infty \), and let \({\mathcal {P}}:=\{\frac{1}{2}(\delta _{-x}+\delta _{x}) :x\in X\}\). Then,

$$\begin{aligned} \left\| \sum _{n=1}^N d_n \right\| _{{\mathcal {L}}^{p}({{\mathbb {P}}};X)} \le D_p(X) \left\| \sum _{n=1}^N e_n \right\| _{{\mathcal {L}}^{p}({{\mathbb {P}}};X)} \end{aligned}$$
(7)

for all \(N\ge 1\), every stochastic basis \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},({\mathcal {F}}_n)_{n=0}^N)\), every strongly p-integrable and \(({\mathcal {F}}_n)_{n=1}^N\)-adapted sequence of random variables \((d_n)_{n=1}^N\) such that, on a set of measure one, the \({\mathcal {F}}_{n-1}\)-conditional laws of all \(d_n\) belong to \({\mathcal {P}}_{p\text {-ext}}\), and every decoupled tangent sequence \((e_n)_{n=1}^N\) of \((d_n)_{n=1}^N\).

Cox and Geiss [8, Section 5] contains a characterization of \({\mathcal {P}}_{p\text {-ext}}\) when \({\mathcal {P}}:=\{\frac{1}{2}(\delta _{-x}+\delta _{x}) :x\in {{\mathbb {R}}}\}\). Corollary 1.6 combined with the Central Limit Theorem yields Theorem 1.7; for details, see the proof of Part (ii) of Theorem 5.2. Theorem 1.7 extends both [9, Theorem 5.4] and [12, Theorem 2]: see Remark 5.6 for details.
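The role of the Central Limit Theorem can be made concrete for \({\mathcal {P}}=\{\frac{1}{2}(\delta _{-x}+\delta _{x}) :x\in {{\mathbb {R}}}\}\): the K-fold convolution of \(\frac{1}{2}(\delta _{-1/\sqrt{K}}+\delta _{1/\sqrt{K}})\) converges weakly to the standard Gaussian law with convergent p-th moments, so by Lemma 2.1 the approximating family is uniformly \({\mathcal {L}}^p\)-integrable and the Gaussian law belongs to \({\mathcal {P}}_{p\text {-ext}}\). A numerical sketch (the sample size and the choices \(K=64\), \(p=3\) are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, p = 400_000, 64, 3.0

# samples from mu_K, the K-fold convolution of
# (1/2)(delta_{-1/sqrt(K)} + delta_{+1/sqrt(K)})
conv = rng.choice([-1.0, 1.0], size=(M, K)).sum(axis=1) / np.sqrt(K)
gauss = rng.standard_normal(M)

m_conv = np.mean(np.abs(conv) ** p)    # approximates int |x|^p d mu_K
m_gauss = np.mean(np.abs(gauss) ** p)  # approximates int |x|^p d N(0,1)
print(m_conv, m_gauss)
```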

Theorem 1.7

For a separable Banach space X and \(p,q\in (0,\infty )\), the following assertions are equivalent:

  1. (i)

    \(D_p(X) < \infty \).

  2. (ii)

    For every stochastic basis \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},{{\mathbb {F}}}=({\mathcal {F}}_t)_{t\in [0,\infty )})\), every \({{\mathbb {F}}}\)-Brownian motion \(W=(W(t))_{t\in [0,\infty )}\), and every simple \({{\mathbb {F}}}\)-predictable X-valued process \((H(t))_{t\in [0,\infty )}\) it holds that

    $$\begin{aligned} \left\| \int _0^\infty H(t) \mathrm{d}W(t) \right\| _{{\mathcal {L}}^{q}({{\mathbb {P}}};X)} \le K_{p,2} D_p(X) \left\| S(H) \right\| _{{\mathcal {L}}^{q}({{\mathbb {P}}})} \end{aligned}$$

    with the square function \(S(H)(\omega ):= \Vert f\mapsto \int _0^\infty f(t) H(t,\omega ) \,\mathrm{d}t \Vert _{\gamma ({\mathcal {L}}^2(0,\infty );X)}\) and \(K_{p,2}\) the constant in the \({\mathcal {L}}^p\)-to-\({\mathcal {L}}^2\) Kahane–Khintchine inequality (see Sect. 5.1 for details on the \(\gamma \)-radonifying norm of \(\gamma ({\mathcal {L}}^2(0,\infty );X)\)).

2 Preliminaries

2.1 Some General Notation

We let \({{\mathbb {N}}}= \{1, 2,\ldots \}\) and \({{\mathbb {N}}}_0 = \{0,1,2,\ldots \}\). For a vector space V and \(B\subseteq V\), we set \( -B := \{ x\in V :-x\in B \}.\) Given a non-empty set \(\Omega \), we let \(2^{\Omega }\) denote the system of all subsets of \(\Omega \) and use \(A \triangle B := (A{\setminus } B)\cup (B{\setminus } A)\) for \(A,B\in 2^{\Omega }\). A system \((A_i)_{i\in I}\) of pairwise disjoint subsets of \(\Omega \), where I is an arbitrary index set and \(A_i = \emptyset \) is allowed, is called a partition of \(\Omega \) if \(\bigcup _{i\in I} A_i = \Omega \). If (M, d) is a metric space, we define \(d:M \times 2^M \rightarrow [0,\infty ]\) by setting \(d(x,A) := \inf \{ d(x,y) :y\in A \}\) for all \((x,A)\in M\times 2^{M}\). If V is a Banach space and (M, d) a metric space, then C(M; V) is the space of continuous maps from M to V, and \(C_b(M;V)\) the subspace of bounded continuous maps from M to V.

Banach space-valued random variables: For a Banach space X, we let \({\mathcal {B}}(X)\) denote the Borel \(\sigma \)-algebra generated by the norm-open sets. For \(x\in X\) and \(\varepsilon >0\), we set \(B_{x,\varepsilon } := \{ y \in X :\Vert x - y \Vert _X < \varepsilon \}\). For \(B\in {\mathcal {B}}(X)\), we let \({\bar{B}}\) denote the norm-closure of B, \(B^{o}\) its interior, and \(\partial B:={\bar{B}}{\setminus } B^{o}\) its boundary. Given a probability space \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) and a measurable space \((S,\Sigma )\), an \({\mathcal {F}}/\Sigma \)-measurable mapping \(\xi :\Omega \rightarrow S\) is called an S-valued random variable. For a random variable \(\xi :\Omega \rightarrow S\), the law of \(\xi \) is the measure \({\mathcal {L}}(\xi )\) defined by \({\mathcal {L}}(\xi )(A) := {{\mathbb {P}}}(\xi \in A)\) for \(A\in \Sigma \).

Lebesgue spaces: For X a separable Banach space and \((S,\Sigma )\) a measurable space, we define \({\mathcal {L}}^0((S,\Sigma );X)\) to be the space of \(\Sigma /{\mathcal {B}}(X)\)-measurable mappings from S to X. If \((S,\Sigma )\) is equipped with a \(\sigma \)-finite measure \(\mu \) and \(p\in (0,\infty )\), then we define

$$\begin{aligned} {\mathcal {L}}^p((S,\Sigma ,\mu );X) := \left\{ \xi \in {\mathcal {L}}^0((S,\Sigma ,\mu );X) :\Vert \xi \Vert _{{\mathcal {L}}^p((S,\Sigma ,\mu );X)}^p := \int _{S} \Vert \xi \Vert _X^p \,\mathrm{d}\mu < \infty \right\} . \end{aligned}$$

If there is no risk of confusion, we write for example \({\mathcal {L}}^p(\mu ;X)\) or \({\mathcal {L}}^p(\Sigma ;X)\) as shorthand notation for \({\mathcal {L}}^p((S,\Sigma ,\mu );X)\), and we set \({\mathcal {L}}^p((S,\Sigma ,\mu )):= {\mathcal {L}}^p((S,\Sigma ,\mu );{{\mathbb {R}}})\).

Probability measures on Banach spaces:

  1. (1)

    Given an index set \(I\not = \emptyset \), a family \((\mu _i)_{i\in I} \subseteq {\mathcal {P}}_p(X)\) (\({\mathcal {P}}_p(X)\) was introduced in Sect. 1) is uniformly \({\mathcal {L}}^p\)-integrable if

    $$\begin{aligned} \lim _{K\rightarrow \infty } \sup _{i\in I} \int _{\{ \Vert x\Vert _X \ge K\}}\ \Vert x\Vert _X^p \mathrm{d}\mu _i (x) = 0. \end{aligned}$$

    Accordingly, a family of X-valued random variables \((\xi _i)_{i\in I}\) is uniformly \({\mathcal {L}}^p\)-integrable if \(({\mathcal {L}}(\xi _i))_{i\in I}\) is uniformly \({\mathcal {L}}^p\)-integrable.

  2. (2)

    For \(\mu \in {\mathcal {P}}(X)\) and \(\mu _n \in {\mathcal {P}}(X)\), \(n\in {{\mathbb {N}}}\), we write \(\mu _n {\mathop {\rightarrow }\limits ^{w^*}} \mu \) as \(n\rightarrow \infty \) if \(\mu _n\) converges weakly to \(\mu \), i.e., if \(\lim _{n\rightarrow \infty } \int _X f(x) \,\mathrm{d}\mu _n (x) = \int _X f(x) \,\mathrm{d}\mu (x)\) for all \(f \in C_b(X;{{\mathbb {R}}})\). Moreover, for a sequence of X-valued random variables \((\xi _n)_{n\in {{\mathbb {N}}}}\) and an X-valued random variable \(\xi \) (possibly defined on different probability spaces) we write \(\xi _n {\mathop {\rightarrow }\limits ^{w^*}} \xi \) as \(n\rightarrow \infty \) provided that \({\mathcal {L}}(\xi _n) {\mathop {\rightarrow }\limits ^{w^*}} {\mathcal {L}}(\xi )\) as \(n\rightarrow \infty \).

We shall frequently use the following well-known result, which relates uniform \({\mathcal {L}}^p\)-integrability to the convergence of moments:

Lemma 2.1

Let \(p\in (0,\infty )\), let X be a separable Banach space, let \(\mu \in {\mathcal {P}}_p(X)\), and let \((\mu _n)_{n\in {{\mathbb {N}}}}\) be a sequence in \({\mathcal {P}}_p(X)\) such that \(\mu _n{\mathop {\rightarrow }\limits ^{w^*}} \mu \). Then, the following are equivalent:

  1. (i)

    \(\int _{X} \Vert x \Vert ^{p}\,\mathrm{d}\mu _n \rightarrow \int _{X} \Vert x \Vert ^{p}\,\mathrm{d}\mu \).

  2. (ii)

    \((\mu _n)_{n\in {{\mathbb {N}}}}\) is uniformly \({\mathcal {L}}^p\)-integrable.

Proof

Apply, e.g., [19, Lemma 4.11 (in (5) \(\limsup \) can be replaced by \(\sup \))] to the random variables \(\xi ,\xi _1,\xi _2,\ldots \) where \(\xi =\Vert \zeta \Vert _X^p\) and \(\xi _n = \Vert \zeta _n \Vert _X^p\), and where \({\mathcal {L}}(\zeta ) = \mu \) and \({\mathcal {L}}(\zeta _n)=\mu _n\). \(\square \)
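To see that the uniform \({\mathcal {L}}^p\)-integrability in Lemma 2.1 cannot be dropped, consider the illustrative family \(\mu _n = (1-\frac{1}{n})\delta _0 + \frac{1}{n}\delta _n\) on \({{\mathbb {R}}}\) with \(p=1\): one has \(\mu _n {\mathop {\rightarrow }\limits ^{w^*}} \delta _0\), yet the first moments do not converge to 0. A short exact computation:

```python
from fractions import Fraction

# mu_n = (1 - 1/n) * delta_0 + (1/n) * delta_n on R, with p = 1:
# mu_n -> delta_0 weakly, but the first moments stay at 1, not 0,
# so by Lemma 2.1 the family is not uniformly L^1-integrable.
def moment(n):            # int |x| d mu_n = (1/n) * n
    return Fraction(1, n) * n

def tail(n, K):           # int_{|x| >= K} |x| d mu_n
    return Fraction(1, n) * n if n >= K else Fraction(0)

moments = [moment(n) for n in range(1, 100)]
sup_tail_100 = max(tail(n, K=100) for n in range(1, 10_000))
print(moments[0], sup_tail_100)  # 1 1
```

The moments are identically 1, while the K-tail integrals do not vanish uniformly in n, in line with the equivalence in Lemma 2.1.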

Stochastic basis: We use the notion of a stochastic basis \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},{{\mathbb {F}}})\), which is a probability space \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) equipped with a filtration \({{\mathbb {F}}}=({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}_0}\), \({\mathcal {F}}_0 \subseteq {\mathcal {F}}_1 \subseteq \cdots \subseteq {\mathcal {F}}\), and where we set \({\mathcal {F}}_\infty := \sigma \left( \bigcup _{n\in {{\mathbb {N}}}_0} {\mathcal {F}}_n\right) \). For measurable spaces \((\Omega ,{\mathcal {F}})\) and \((S,{\mathcal {S}})\), and \(\xi =(\xi _n)_{n\in {{\mathbb {N}}}}\) a sequence of S-valued random variables on \((\Omega ,{\mathcal {F}})\), we let \({{\mathbb {F}}}^{\xi }=({\mathcal {F}}^{\xi }_n)_{n\in {{\mathbb {N}}}_0}\) denote the natural filtration generated by \(\xi \), i.e., \({\mathcal {F}}_0^\xi := \{\emptyset ,\Omega \}\) and \({\mathcal {F}}_n^\xi := \sigma (\xi _1,\ldots ,\xi _n)\) for \(n \in {{\mathbb {N}}}\), and \({\mathcal {F}}_\infty ^\xi := \sigma (\xi _n : n\in {{\mathbb {N}}})\).

2.2 Stochastic Kernels

We provide some details on regular versions of conditional probabilities that we shall use later.

Definition 2.2

Let X be a separable Banach space and \((S,\Sigma )\) a measurable space. A mapping \(\kappa :S \times {\mathcal {B}}(X)\rightarrow [0,1]\) is a \(\Sigma /{\mathcal {B}}(X)\)-measurable kernel if and only if the following two conditions hold:

  1. (i)

    For all \(\omega \in S\), it holds that \(\kappa [\omega ,\cdot ]\in {\mathcal {P}}(X)\).

  2. (ii)

    For all \(B\in {\mathcal {B}}(X)\), the map \(\omega \mapsto \kappa [\omega ,B]\) is \(\Sigma /{\mathcal {B}}({{\mathbb {R}}})\)-measurable.

Remark 2.3

Let the space \((S,\Sigma )\) be equipped with a probability measure \({{\mathbb {P}}}\) and let \(\Pi \subseteq {\mathcal {B}}(X)\) be a countable \(\pi \)-system that generates \({\mathcal {B}}(X)\). For two kernels \(\kappa , \kappa ' :S \rightarrow {\mathcal {P}}(X)\), the following assertions are equivalent:

  1. (i)

    \(\kappa [\omega ,B] = \kappa '[\omega ,B]\) for \({{\mathbb {P}}}\)-almost all \(\omega \in S\), for all \(B\in \Pi \).

  2. (ii)

    \(\kappa [\omega ,\cdot ] = \kappa '[\omega ,\cdot ]\) for \({{\mathbb {P}}}\)-almost all \(\omega \in S\).

We need the existence of kernels describing conditional probabilities:

Theorem 2.4

[19, Theorem 6.3] Let X be a separable Banach space, \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) a probability space, \({\mathcal {G}}\subseteq {\mathcal {F}}\) a sub-\(\sigma \)-algebra, and let \(\xi :\Omega \rightarrow X\) be a random variable. Then, there is a \({\mathcal {G}}/{\mathcal {B}}({\mathcal {P}}(X))\)-measurable kernel \(\kappa :\Omega \rightarrow {\mathcal {P}}(X)\) satisfying

$$\begin{aligned} \kappa [\cdot ,B] = {{\mathbb {P}}}(\xi \in B\,|\,{\mathcal {G}})\quad \text{ a.s. } \end{aligned}$$

for all \(B\in {\mathcal {B}}(X)\). If \(\kappa ':\Omega \rightarrow {\mathcal {P}}(X)\) is another kernel with this property, then \(\kappa '=\kappa \) a.s.

We refer to \(\kappa \) as a regular conditional probability kernel for \({\mathcal {L}}(\xi \,|\,{\mathcal {G}})\).

2.3 Decoupling

We briefly recall the concept of decoupled tangent sequences as introduced by Kwapień and Woyczyński [24]. For more details, we refer to [10, 25] and the references therein.

Definition 2.5

Let X be a separable Banach space, let \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}_0})\) be a stochastic basis, and let \((d_n)_{n\in {{\mathbb {N}}}}\) be an \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}}\)-adapted sequence of X-valued random variables on \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\). A sequence of X-valued and \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}}\)-adapted random variables \((e_n)_{n\in {{\mathbb {N}}}}\) on \( (\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) is called an \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}_0}\)-decoupled tangent sequence of \((d_n)_{n\in {{\mathbb {N}}}}\) provided there exists a \(\sigma \)-algebra \({\mathcal {H}}\subseteq {\mathcal {F}}\) satisfying \(\sigma ((d_n)_{n\in {{\mathbb {N}}}}) \subseteq {\mathcal {H}}\) such that the following two conditions are satisfied:

  1. (i)

    Tangency: For all \(n\in {{\mathbb {N}}}\) and all \(B\in {\mathcal {B}}(X)\), one has

    $$\begin{aligned} {{\mathbb {P}}}(d_n \in B\,|\,{\mathcal {F}}_{n-1}) = {{\mathbb {P}}}(e_n \in B\,|\,{\mathcal {F}}_{n-1}) = {{\mathbb {P}}}(e_n\in B\,|\,{\mathcal {H}})\quad \text{ a.s. } \end{aligned}$$
  2. (ii)

    Conditional independence: For all \(N\in {{\mathbb {N}}}\) and \(B_1,\ldots ,B_N\in {\mathcal {B}}(X)\) one has

    $$\begin{aligned} {{\mathbb {P}}}(e_1\in B_1,\ldots , e_N\in B_N\,|\,{\mathcal {H}}) = {{\mathbb {P}}}(e_1\in B_1\,|\,{\mathcal {H}}) \cdots {{\mathbb {P}}}(e_N\in B_N\,|\,{\mathcal {H}})\quad \text{ a.s. } \end{aligned}$$

A construction of a decoupled tangent sequence is presented in [25, Section 4.3].

Example 2.6

Let \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}_0})\) be a stochastic basis, \((\varphi _n)_{n\in {{\mathbb {N}}}}\) and \((\varphi _n')_{n\in {{\mathbb {N}}}}\) two independent and identically distributed sequences of independent, \({{\mathbb {R}}}\)-valued random variables such that \(\varphi _n\) and \(\varphi _n'\) are \({\mathcal {F}}_n\)-measurable and independent of \({\mathcal {F}}_{n-1}\) for all \(n\in {{\mathbb {N}}}\), and let \((v_n)_{n\in {{\mathbb {N}}}_0}\) be an \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}_0}\)-adapted sequence of X-valued random variables independent of \((\varphi _n')_{n\in {{\mathbb {N}}}}\). Then, \(( \varphi '_n v_{n-1} )_{n\in {{\mathbb {N}}}}\) is an \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}_0}\)-decoupled tangent sequence of \(( \varphi _n v_{n-1} )_{n\in {{\mathbb {N}}}}\), where one may take

$$\begin{aligned} {\mathcal {H}}:= \sigma ( (\varphi _n)_{n\in {{\mathbb {N}}}}, (v_n)_{n\in {{\mathbb {N}}}_0}). \end{aligned}$$

Similarly, \((\varphi _n)_{n\in {{\mathbb {N}}}}\) and \((\varphi _n')_{n\in {{\mathbb {N}}}}\) could be X-valued random variables and \((v_n)_{n\in {{\mathbb {N}}}_0}\) \({{\mathbb {R}}}\)-valued.
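The tangency in Example 2.6 can be checked empirically: given \({\mathcal {F}}_{n-1}\), the terms \(\varphi _n v_{n-1}\) and \(\varphi '_n v_{n-1}\) have the same conditional law. The following sketch compares conditional second moments for \(n=2\), with standard Gaussian \(\varphi _n\) and the illustrative choice \(v_1 = |\varphi _1|\):

```python
import numpy as np

rng = np.random.default_rng(3)
M = 400_000

phi1 = rng.standard_normal(M)
phi2 = rng.standard_normal(M)   # phi_2
phi2p = rng.standard_normal(M)  # independent copy phi'_2

v1 = np.abs(phi1)               # v_1 = h_1(phi_1), an illustrative choice
d2 = phi2 * v1                  # term of the original sequence
e2 = phi2p * v1                 # term of the decoupled tangent sequence

# tangency: the conditional laws given F_1 = sigma(phi_1) agree; compare
# conditional second moments on the F_1-measurable event {v_1 > 1}
ev = v1 > 1.0
m_d = np.mean(d2[ev] ** 2)
m_e = np.mean(e2[ev] ** 2)
print(m_d, m_e)  # both approximate E[v_1^2 | v_1 > 1]
```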

3 A Factorization for Regular Conditional Probabilities

Theorem 3.1 extends [19, Lemma 3.22] and can be viewed as a strong version of Montgomery-Smith’s distributional result [29, Theorem 2.1]. Theorem 3.1 is used to prove Theorem 4.3, where it yields a refined argument for the existence of a decoupled tangent sequence. In this sense, it also contributes to [24] (cf. [10, Proposition 6.1.5]).

Theorem 3.1

Let \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) be a probability space, \({\mathcal {G}}\subseteq {\mathcal {F}}\) be a \(\sigma \)-algebra, let \(d\in {\mathcal {L}}^0({\mathcal {F}};{{\mathbb {R}}})\) satisfy \(d(\Omega )\subseteq [0,1)\), and let \(\kappa :\Omega \times {\mathcal {B}}([0,1)) \rightarrow [0,1]\) be a regular conditional probability kernel for \({\mathcal {L}}(d\,|\,{\mathcal {G}})\). Let \(({\bar{\Omega }},{\bar{{\mathcal {F}}}},{\bar{{{\mathbb {P}}}}}):= (\Omega \times (0,1], {\mathcal {F}}\otimes {\mathcal {B}}((0,1]),{{\mathbb {P}}}\otimes \lambda )\), where \(\lambda \) is the Lebesgue measure on \({\mathcal {B}}((0,1])\). Set \([0,0):=\emptyset \) and define \(H:{\bar{\Omega }}\rightarrow [0,1]\), \(d^{0}:\Omega \times [0,1] \rightarrow [0,1]\) by

$$\begin{aligned} H(\omega ,s)&:= \kappa [ \omega , [0,d(\omega ))] + s \kappa [ \omega , \{ d(\omega )\}], \end{aligned}$$
(8)
$$\begin{aligned} d^{0}(\omega ,h)&:= \inf \{ x\in [0,1] :\kappa [ \omega , [0,x]] \ge h \}. \end{aligned}$$
(9)

Then,

  1. (i)

    H is \({\bar{{\mathcal {F}}}}/{\mathcal {B}}([0,1])\)-measurable, independent of \({\mathcal {G}}\otimes \{\emptyset ,(0,1]\}\), and uniformly [0, 1] distributed,

  2. (ii)

    \(d^0\) is \({\mathcal {G}}\otimes {\mathcal {B}}([0,1])/{\mathcal {B}}([0,1])\)-measurable, and

  3. (iii)

    there is an \({\mathcal {N}}\in {\mathcal {F}}\) with \({{\mathbb {P}}}({\mathcal {N}})=0\) such that \(d^{0}(\omega ,H(\omega ,s))= d(\omega )\) for all \((\omega ,s)\in (\Omega {\setminus }{\mathcal {N}})\times (0,1]\).

Before we prove this theorem, let us comment on Item (i). There are two extreme cases. The first one is \({\mathcal {G}}= \{\emptyset ,\Omega \}\): in this case, \(x\mapsto \kappa [\omega ,[0,x]]\) is the distribution function of the law of d, and it is classical that H is then uniformly distributed on [0, 1]. The other extreme case is \({\mathcal {G}}={\mathcal {F}}\), where we can take \(\kappa [\omega ,B]:=1_{\{d(\omega )\in B\}}\), which implies \(H(\omega ,s)=s\). Our result interpolates between these two extreme cases.
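In the extreme case \({\mathcal {G}}=\{\emptyset ,\Omega \}\), formulas (8) and (9) reduce to the classical quantile transform, and assertions (i) and (iii) can be verified numerically. The following sketch uses an illustrative discrete law on [0, 1); the weights are dyadic so that the arithmetic is exact in floating point:

```python
import numpy as np

rng = np.random.default_rng(2)

# trivial G, so kappa[omega, .] is simply the law of d;
# atoms and weights below are arbitrary illustrative choices
atoms = np.array([0.1, 0.4, 0.7])    # values of d in [0, 1)
probs = np.array([0.25, 0.5, 0.25])  # their probabilities (exact in binary)
cdf = np.cumsum(probs)

def H(d_val, s):
    # H(omega, s) = kappa[[0, d(omega))] + s * kappa[{d(omega)}], cf. (8)
    i = int(np.searchsorted(atoms, d_val))
    return (cdf[i] - probs[i]) + s * probs[i]

def d0(h):
    # d^0(omega, h) = inf{ x : kappa[[0, x]] >= h }, cf. (9)
    return atoms[int(np.searchsorted(cdf, h))]

d = rng.choice(atoms, p=probs, size=100_000)
s = 1.0 - rng.uniform(0.0, 1.0, size=100_000)  # s in (0, 1]
h = np.array([H(dv, sv) for dv, sv in zip(d, s)])
recovered = np.array([d0(hv) for hv in h])
print(np.all(recovered == d), abs(h.mean() - 0.5))
```

In accordance with (iii), \(d^{0}(\cdot ,H)\) recovers d exactly, and, in accordance with (i), the sample mean of H is close to 1/2.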

Proof of Theorem 3.1

  1. (i)

    For all \(n\in {{\mathbb {N}}}\) and \(\ell \in \{1,\ldots ,2^n\}\), let \(A_{n,\ell } := [(\ell -1)2^{-n},\ell 2^{-n})\). Define \(H_n :{\bar{\Omega }}\rightarrow [0,1]\) by

    $$\begin{aligned} H_n(\omega ,s) := \sum _{\ell =1}^{2^n} 1_{\{d\in A_{n,\ell }\}} (\omega ) \big ( \kappa [ \omega , [0,(\ell -1) 2^{-n})] + s \kappa [ \omega , A_{n,\ell }] \big ), \end{aligned}$$

    so that for all \((\omega ,s)\in {\bar{\Omega }}\) it holds that

    $$\begin{aligned} | H_n(\omega ,s) - H(\omega ,s) | \le \sum _{\ell =1}^{2^n} 1_{\{d\in A_{n,\ell }\}} (\omega ) (1+s) \kappa [ \omega , A_{n,\ell } {\setminus }\{d(\omega )\}] \rightarrow 0\quad \text {as}\quad n\rightarrow \infty . \end{aligned}$$

    The \(H_n\) are \({\bar{{\mathcal {F}}}}/{\mathcal {B}}([0,1])\)-measurable, so H is measurable as their point-wise limit (the measurability of H can be seen directly as well). Let \(n\in {{\mathbb {N}}}\), \(G\in {\mathcal {G}}\) and \(B\in {\mathcal {B}}([0,1])\). Because \(b \int _{0}^{1} 1_{\{a+sb\in B\}} \,\mathrm{d}s = \lambda (B\cap [a,a+b])\) for \(a,b\in [0,1]\) with \(a+b\le 1\) (where \(\lambda \) denotes the Lebesgue measure), and because conditioning on \({\mathcal {G}}\) replaces the indicator \(1_{\{d\in A_{n,\ell }\}}\) by the \({\mathcal {G}}\)-measurable factor \(\kappa [\cdot ,A_{n,\ell }]\) (recall that \(G\in {\mathcal {G}}\)), we get

    $$\begin{aligned} \bar{{\mathbb {P}}}((G\times (0,1]) \cap \{ H_n \in B \} )&= \sum _{\ell =1}^{2^n} \int _{G} \lambda \big ( B\cap \big [\kappa [\cdot , [0, (\ell -1)2^{-n})], \kappa [\cdot , [0, \ell 2^{-n})] \big ]\big ) \,\mathrm{d}{{\mathbb {P}}}\\&= {{\mathbb {P}}}(G)\cdot \lambda (B). \end{aligned}$$

    This proves that \(H_n\) is uniformly [0, 1] distributed and independent of \({\mathcal {G}}\otimes \{\emptyset ,(0,1]\}\) for all \(n\in {{\mathbb {N}}}\). This completes the proof of (i), as H is the point-wise limit of \((H_n)_{n\in {{\mathbb {N}}}}\) (two \({{\mathbb {R}}}\)-valued random variables \(\xi _1,\xi _2\) are independent if and only if for all \(f,g\in C_{b}({{\mathbb {R}}})\) it holds that \({{\mathbb {E}}}[f(\xi _1)g(\xi _2)] = {{\mathbb {E}}}[f(\xi _1)]\,{{\mathbb {E}}}[g(\xi _2)]\)).

  2. (ii)

    For all \(x\in [0,1]\), note that

    $$\begin{aligned} \{ d^{0} \le x \} = \{ (\omega ,h)\in \Omega \times [0,1] :\kappa [\omega , [0,x]] -h\ge 0 \} \in {\mathcal {G}}\otimes {\mathcal {B}}([0,1]). \end{aligned}$$
    (10)
  3. (iii)

    It follows from (10) and the definition of H that we have, for all \(x\in [0,1]\), that

    $$\begin{aligned}&\{ (\omega ,s)\in {\bar{\Omega }} :d^{0}(\omega ,H(\omega ,s)) \le x \} \\&= \{ (\omega ,s)\in {\bar{\Omega }}:\kappa [\omega , [0,x]] \ge \kappa [\omega ,[0,d(\omega ))] + s \kappa [\omega ,\{d(\omega )\}] \} \end{aligned}$$

    can be written as \(B_x \times (0,1]\) for some unique \(B_x \in {\mathcal {F}}\), and that

    $$\begin{aligned} B_x\times (0,1] \supseteq \{ (\omega ,s)\in {\bar{\Omega }}:d(\omega ) \le x \} =:C_x \times (0,1]. \end{aligned}$$

    On the other hand, since \(d^{0}\) is \({\mathcal {G}}\otimes {\mathcal {B}}([0,1])\)-measurable and, by (i), the image measure of the map \((\omega ,s)\mapsto (\omega ,H(\omega ,s))\) from \({\overline{\Omega }}\) into \(\Omega \times [0,1]\) coincides with \({{\mathbb {P}}}\otimes \lambda \) on \({\mathcal {G}}\otimes {\mathcal {B}}([0,1])\), we obtain, for all \(x\in [0,1]\), that

    $$\begin{aligned} {{\mathbb {P}}}(B_x) = {\overline{{{\mathbb {P}}}}}(B_x \times (0,1])&= \int _{\Omega }\int _{0}^{1} 1_{\{d^{0}(\omega ,h)\le x \}} \,\mathrm{d}h \,\mathrm{d}{{\mathbb {P}}}(\omega )\\&= \int _{\Omega }\int _{0}^{1} 1_{\{ \kappa [\omega , [0,x]] \ge h \}} \,\mathrm{d}h \,\mathrm{d}{{\mathbb {P}}}(\omega ) = {{\mathbb {E}}}\kappa [\cdot ,[0,x]] = {{\mathbb {P}}}(C_x). \end{aligned}$$

    It follows that \({{\mathbb {P}}}(B_x{\setminus } C_x)=0\) for all \(x\in [0,1]\). Let \({\mathcal {N}}:= \cup _{q\in {{\mathbb {Q}}}\cap [0,1)} (B_q{\setminus } C_q)\) so that \({{\mathbb {P}}}({\mathcal {N}})=0\). Then, observing that \(B_x = \cap _{q\in {{\mathbb {Q}}}\cap [x,1)} B_q\) (this follows from \(B_x\times (0,1] = \{ d^{0}(\cdot ,H(\cdot ,\cdot )) \le x \} = \cap _{q\in {{\mathbb {Q}}}\cap [x,1)} \{ d^{0}(\cdot ,H(\cdot ,\cdot )) \le q \} = (\cap _{q\in {{\mathbb {Q}}}\cap [x,1)} B_q) \times (0,1]\) and the uniqueness of the sets \(B_r\), \(r\in [0,1]\)) and \(C_x = \cap _{q\in {{\mathbb {Q}}}\cap [x,1)} C_q\) for all \(x\in [0,1)\), we have for all \((\omega ,s)\in (\Omega {\setminus } {\mathcal {N}})\times (0,1]\) that \(d^{0}(\omega ,H(\omega ,s)) = d(\omega )\). \(\square \)

Corollary 3.2

Let \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) be a probability space, \({\mathcal {G}}\subseteq {\mathcal {F}}\) a \(\sigma \)-algebra, X a separable Banach space, \(d\in {\mathcal {L}}^0({\mathcal {F}};X)\). Let \(({\bar{\Omega }},{\bar{{\mathcal {F}}}},{\bar{{{\mathbb {P}}}}}):= (\Omega \times (0,1], {\mathcal {F}}\otimes {\mathcal {B}}((0,1]),{{\mathbb {P}}}\otimes \lambda )\), where \(\lambda \) is the Lebesgue measure on \({\mathcal {B}}((0,1])\). Then, there exist random variables \(H:{\bar{\Omega }}\rightarrow [0,1]\), \(d^{0}:\Omega \times [0,1] \rightarrow X\) such that

  1. (i)

    H is uniformly [0, 1] distributed and independent of \({\mathcal {G}}\otimes \{\emptyset , (0,1]\}\),

  2. (ii)

    \(d^0\) is \({\mathcal {G}}\otimes {\mathcal {B}}([0,1])/{\mathcal {B}}(X)\)-measurable, and

  3. (iii)

    there is an \({\mathcal {N}}\in {\mathcal {F}}\) with \({{\mathbb {P}}}({\mathcal {N}})=0\) such that \(d(\omega ) = d^{0}(\omega , H(\omega ,s))\) for all \((\omega ,s)\in (\Omega {\setminus }{\mathcal {N}})\times (0,1]\).

Proof

This is an immediate consequence of Theorem 3.1 and the fact that X is Borel-isomorphic to [0, 1), see, e.g., [11, Theorem 13.1.1]. \(\square \)

4 A Reduction of General Decoupling to Haar-Type Series

Before we turn to our main Theorem 4.3, we discuss some properties of the extension of \({\mathcal {P}}\) to \({\mathcal {P}}_{p\text {-ext}}\) (see Definition 1.3). For this, we need

Lemma 4.1

Assume a metric space \((M,d)\) and a continuous map \(*:M\times M \rightarrow M\) with \((x*y)*z=x*(y*z)\) for all \(x,y,z\in M\). Let \(\emptyset \not = {\mathcal {P}}\subseteq M\) and

$$\begin{aligned} {\overline{{\mathcal {P}}}}^* := \mathrm{cl}_d(\{ x_1 * \cdots * x_L : x_1,\ldots ,x_L\in {\mathcal {P}}, L\in {{\mathbb {N}}}\}) \end{aligned}$$

where the closure on the right side is taken with respect to d. Then, one has \(\overline{({\overline{{\mathcal {P}}}}^*)}^*= {\overline{{\mathcal {P}}}}^*\) and \({\overline{{\mathcal {P}}}}^*\) is the smallest d-closed set \({\mathcal {Q}}\) with \({\mathcal {Q}}\supseteq {\mathcal {P}}\) and \(\mu * \nu \in {\mathcal {Q}}\) for all \(\mu ,\nu \in {\mathcal {Q}}\).

Proof

The equality \(\overline{({\overline{{\mathcal {P}}}}^*)}^*= {\overline{{\mathcal {P}}}}^*\) follows from the continuity of \(*\) and a standard diagonalization procedure. This also implies that \(\mu * \nu \in {\overline{{\mathcal {P}}}}^*\) for all \(\mu ,\nu \in {\overline{{\mathcal {P}}}}^*\). Now let us assume a set \({\mathcal {Q}}\) as in the assertion. Then, \(x_1 * \cdots * x_L\in {\mathcal {Q}}\) for all \(x_1,\ldots ,x_L\in {\mathcal {P}}\). As \({\mathcal {Q}}\) is closed we deduce \( {\overline{{\mathcal {P}}}}^*\subseteq {\mathcal {Q}}\). \(\square \)
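In a finite set carrying the discrete metric the topological closure is trivial, and Lemma 4.1 reduces to the idempotence and minimality of generating a subsemigroup. A toy sketch (our own choice of M and \(*\), namely addition modulo 12, which is continuous and associative):

```python
def span(P, star):
    """Smallest set containing P and closed under the associative map star,
    i.e. {x1 * ... * xL : xi in P}; in a discrete space this is d-closed."""
    out = set(P)
    while True:
        new = {star(x, y) for x in out for y in out} - out
        if not new:
            return out
        out |= new

star = lambda x, y: (x + y) % 12   # associative, continuous for the discrete metric
P = {3}

Pbar = span(P, star)
assert Pbar == {0, 3, 6, 9}

# Idempotence: closing the closure yields nothing new (Lemma 4.1).
assert span(Pbar, star) == Pbar

# Minimality: any star-closed superset Q of P contains Pbar.
Q = set(range(12))
assert all(star(x, y) in Q for x in Q for y in Q)
assert Pbar <= Q
```

For general metric spaces, the diagonalization step in the proof supplies exactly what this finite example hides: limits of products of limits are again limits of products.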

Lemma 4.2 reveals some basic properties of \({\mathcal {P}}_{p\text {-ext}}\). To this end, for \(p\in (0,\infty )\) we introduce on \({\mathcal {P}}_p(X)\subseteq {\mathcal {P}}(X)\) the metric

$$\begin{aligned} d_p(\mu ,\nu ) := d_0(\mu ,\nu ) + \left| \int _X \Vert x \Vert ^p \,\mathrm{d}\mu (x) - \int _X \Vert x \Vert ^p \,\mathrm{d}\nu (x) \right| \end{aligned}$$
(11)

where \(d_0\) is a fixed metric on \({\mathcal {P}}(X)\) that metrizes the \(w^*\)-convergence, see for example [30, Theorem II.6.2].

Lemma 4.2

Let X be a separable Banach space, \(p\in (0,\infty )\), and let \({\mathcal {P}}\subseteq {\mathcal {P}}_p(X)\) be non-empty. Then,

  1. (i)

    \(({\mathcal {P}}_{p\text {-ext}})_{p\text {-ext}} = {\mathcal {P}}_{p\text {-ext}}\) and

  2. (ii)

    \({\mathcal {P}}_{p\text {-ext}}\) is the smallest \(d_p\)-closed set \({\mathcal {Q}}\) with \({\mathcal {Q}}\supseteq {\mathcal {P}}\) and \(\mu * \nu \in {\mathcal {Q}}\) for all \(\mu ,\nu \in {\mathcal {Q}}\).

Proof

We will verify that the convolution is continuous with respect to \(d_p\); the assertion then follows from Lemma 4.1. To verify this, we let \(\mu ,\nu ,\mu _n,\nu _n\in {\mathcal {P}}_p(X)\), \(n\in {{\mathbb {N}}}\), be such that \(\lim _{n\rightarrow \infty } d_p(\mu ,\mu _n) = \lim _{n\rightarrow \infty } d_p(\nu ,\nu _n) = 0\). It is known that then \(\mu _n*\nu _n {\mathop {\rightarrow }\limits ^{w^*}} \mu * \nu \) as well (one can use [19, Theorem 4.30]). Because for \(K>0\) we have, with \(h(x,y):= \max \{\Vert x\Vert _X, \Vert y \Vert _X\}\),

$$\begin{aligned}&\int _{\{\Vert x+y\Vert _X\ge K\}}\Vert x+y\Vert _X^p \mathrm{d}\mu _n(x) \mathrm{d}\nu _n(y) \le 2^p \int _{\{ h(x,y) \ge K/2 \}} h^p(x,y) \mathrm{d}\mu _n(x)\mathrm{d}\nu _n(y) \\&\quad \le 2^p \int _{\{\Vert x\Vert _X \ge K/2 \}} \Vert x\Vert _X^p \mathrm{d}\mu _n(x) + 2^p \int _{\{\Vert y\Vert _X \ge K/2 \}} \Vert y\Vert _X^p \mathrm{d}\nu _n(y), \end{aligned}$$

cf. [1, p. 217], and since, by Lemma 2.1, \((\mu _n)_{n\in {{\mathbb {N}}}}\) and \((\nu _n)_{n\in {{\mathbb {N}}}}\) are uniformly \({\mathcal {L}}^p\)-integrable, we get that \((\mu _n * \nu _n)_{n\in {{\mathbb {N}}}}\) is uniformly \({\mathcal {L}}^p\)-integrable; thus, again by Lemma 2.1, we obtain the convergence of the p-th moments. \(\square \)

Now we formulate the main result of this section. See Definition 1.2 for the definition of \({{\mathcal {A}}}_p(\Omega ,{{\mathbb {F}}};X,{\mathcal {P}}_{p\text {-ext}})\).

Theorem 4.3

Let X be a separable Banach space and let \(\Phi _\lambda \in C(X \times X;{{\mathbb {R}}})\), \(\lambda \in \Delta \), for an arbitrary non-empty index set \(\Delta \). Suppose that there exist a \(p\in (0,\infty )\) and constants \(C_\lambda \in (0,\infty )\), \(\lambda \in \Delta \), such that

$$\begin{aligned} | \Phi _\lambda (x,y) | \le C_\lambda ( 1 + \Vert x \Vert _X^p + \Vert y \Vert _X^p) \end{aligned}$$

for all \( (x,y) \in X\times X\), and let \({\mathcal {P}}\subseteq {\mathcal {P}}_p(X)\) with \(\delta _0\in {\mathcal {P}}\). Then, the following assertions are equivalent:

  1. (i)

    For every stochastic basis \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},{{\mathbb {F}}})\) with \({{\mathbb {F}}}=({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}_0}\) and every finitely supported \((d_n)_{n\in {{\mathbb {N}}}} \in {{\mathcal {A}}}_p(\Omega ,{{\mathbb {F}}};X,{\mathcal {P}}_{p\text {-ext}})\) it holds that

    $$\begin{aligned} \sup _{\lambda \in \Delta }{{\mathbb {E}}}\Phi _\lambda \left( \sum _{n=1}^\infty d_n,\sum _{n=1}^\infty e_n \right) \le 0, \end{aligned}$$
    (12)

    provided that \((e_n)_{n\in {{\mathbb {N}}}}\) is an \({{\mathbb {F}}}\)-decoupled tangent sequence of \((d_n)_{n\in {{\mathbb {N}}}}\).

  2. (ii)

    For every probability space \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\), every finitely supported sequence of independent random variables \(\varphi = (\varphi _n)_{n\in {{\mathbb {N}}}}\) in \( {\mathcal {L}}^p({{\mathbb {P}}};X)\) satisfying \({\mathcal {L}}(\varphi _n)\in {\mathcal {P}}\) for all \(n\in {{\mathbb {N}}}\), and every \( A_n \in {\mathcal {F}}_{n}^{\varphi }\), \(n\in {{\mathbb {N}}}_0\), it holds that

    $$\begin{aligned} \sup _{\lambda \in \Delta } {{\mathbb {E}}}\Phi _\lambda \left( \sum _{n=1}^\infty \varphi _n 1_{A_{n-1}}, \sum _{n=1}^\infty \varphi '_n 1_{A_{n-1}} \right) \le 0, \end{aligned}$$
    (13)

    where \((\varphi '_n)_{n\in {{\mathbb {N}}}}\) is an independent copy of \((\varphi _n)_{n\in {{\mathbb {N}}}}\).

Proof

Proof of (i) \(\Rightarrow \) (ii). In (ii), we have \((1_{A_{n-1}} \varphi _n )_{n\in {{\mathbb {N}}}} \in {{\mathcal {A}}}_p(\Omega ,{{\mathbb {F}}}^{\varphi ,\varphi '};X,{\mathcal {P}})\) with \({{\mathbb {F}}}^{\varphi ,\varphi '}\!=({\mathcal {F}}^{\varphi ,\varphi '}_n)_{n\in {{\mathbb {N}}}_0}\) where \({\mathcal {F}}_0^{\varphi ,\varphi '}:= \{\emptyset ,\Omega \}\) and \({\mathcal {F}}_n^{\varphi ,\varphi '}:=\sigma (\varphi _1,\varphi '_1,\ldots ,\varphi _n,\varphi '_n)\) for \(n\in {{\mathbb {N}}}\). Therefore, the implication (i) \(\Rightarrow \) (ii) follows by Example 2.6. \(\square \)

The implication (ii) \(\Rightarrow \) (i) will be proved in Appendix A. Theorem 4.3 allows us to prove Theorem 1.4 from Sect. 1:

Proof of Theorem 1.4

The statement for general \(\Delta \) follows from the case \(\Delta =\{\lambda _0\}\) so that we may assume this case and let \({\underline{\Psi }} := \Psi _{\lambda _0}\) and \({\overline{\Psi }}:=\Psi \). By the lower and upper semi-continuity, we can find continuous \({\underline{\Psi }}^\ell ,{\overline{\Psi }}^\ell :[0,\infty )\rightarrow [0,\infty )\), \(\ell \in {{\mathbb {N}}}\), such that \({\underline{\Psi }}^\ell (\xi ) \uparrow {\underline{\Psi }}(\xi )\) and \(C(1+| \xi |^p) \ge {\overline{\Psi }}^\ell (\xi ) \downarrow {\overline{\Psi }}(\xi )\) for all \(\xi \in [0,\infty )\). Next, we set \(\Phi _\ell (x,y):= {\underline{\Psi }}^\ell (\Vert Sx\Vert _Y) - {\overline{\Psi }}^\ell (\Vert Ty\Vert _Z)\), \(\ell \in {{\mathbb {N}}}\). Then, the monotone convergence theorem implies that for all \(\xi ,\eta \in {\mathcal {L}}^p(X)\) the conditions \(\sup _{\ell \in {{\mathbb {N}}}}{{\mathbb {E}}}\Phi _\ell (\xi ,\eta ) \le 0\) and \({{\mathbb {E}}}\left[ {\underline{\Psi }}(\Vert S\xi \Vert _Y) - {\overline{\Psi }}(\Vert T\eta \Vert _Z)\right] \le 0\) are equivalent. \(\square \)
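The approximating sequences \({\underline{\Psi }}^\ell \uparrow {\underline{\Psi }}\) can, for instance, be produced by the standard Lipschitz inf-convolution \(\Psi ^\ell (\xi ) = \inf _y (\Psi (y) + \ell |\xi -y|)\). A numerical sketch on a grid (our own illustration of this classical device, not the construction used in the paper), with the lower semi-continuous jump function \(\Psi = 1_{(1,\infty )}\):

```python
# Inf-convolution: for lower semi-continuous Psi,
#   Psi_l(x) = inf_y ( Psi(y) + l * |x - y| )
# is l-Lipschitz (hence continuous) and increases pointwise to Psi.
grid = [k / 100 for k in range(301)]           # [0, 3] with step 0.01

def psi(x):                                    # lsc jump function 1_{(1, inf)}
    return 1.0 if x > 1 else 0.0

def psi_l(x, l):
    return min(psi(y) + l * abs(x - y) for y in grid)

for x in grid:
    vals = [psi_l(x, l) for l in (1, 2, 4, 8)]
    # monotone increase in l, never exceeding psi
    assert all(a <= b + 1e-12 for a, b in zip(vals, vals[1:]))
    assert vals[-1] <= psi(x) + 1e-12

# pointwise convergence at points away from the jump at x = 1
assert abs(psi_l(0.5, 64) - psi(0.5)) < 1e-9
assert abs(psi_l(2.0, 64) - psi(2.0)) < 1e-9
```

The decreasing approximation of the upper semi-continuous \({\overline{\Psi }}\) works symmetrically via a sup-convolution, with the growth bound \(C(1+|\xi |^p)\) preserved.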

Let us list some common choices of \({\mathcal {P}}\) in the setting of decoupling inequalities. To do so, we exploit the following lemma:

Lemma 4.4

Let \(C,p\in (0,\infty )\), let X be a separable Banach space, let \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) be a probability space, and let \( \Phi \in C(X;{{\mathbb {R}}}) \) be such that

$$\begin{aligned} | \Phi (x) | \le C(1+\Vert x \Vert _X^p ) \end{aligned}$$
(14)

for all \(x\in X\). Assume \(\xi ,\xi _n \in {\mathcal {L}}^p({{\mathbb {P}}};X)\), \( n \in {{\mathbb {N}}}\), such that \(\xi _n {\mathop {\rightarrow }\limits ^{w^*}} \xi \) as \(n\rightarrow \infty \) and that \((\xi _n)_{n\in {{\mathbb {N}}}}\) is uniformly \({\mathcal {L}}^p\)-integrable. Then,

$$\begin{aligned} \lim _{n\rightarrow \infty } {{\mathbb {E}}}\Phi ( \xi _n) = {{\mathbb {E}}}\Phi ( \xi ). \end{aligned}$$
(15)

Proof

It follows from the uniform \({\mathcal {L}}^p\)-integrability of \((\xi _n)_{n\in {{\mathbb {N}}}}\) and estimate (14) that \((\Phi (\xi _n))_{n\in {{\mathbb {N}}}} \) is uniformly \({\mathcal {L}}^1\)-integrable. Moreover, note that \(\Phi (\xi _n) {\mathop {\rightarrow }\limits ^{w^*}} \Phi (\xi )\) as \(n\rightarrow \infty \), so that we may apply Lemma 2.1 for \(p=1\). \(\square \)

Note that if \(\xi _n \rightarrow \xi \) in \({\mathcal {L}}^p({{\mathbb {P}}};X)\), \(\xi _n,\xi \in {\mathcal {L}}^p({{\mathbb {P}}};X)\), then the assumptions on \((\xi _n)_{n\in {{\mathbb {N}}}}\) and \(\xi \) in Lemma 4.4 are satisfied (see [19, Lemma 4.7]).

Example 4.5

(Adapted processes) If \(p\in (0,\infty )\) and \({\mathcal {P}}={\mathcal {P}}_p(X)\), then \({\mathcal {P}}_{p\text {-ext}}={\mathcal {P}}\) by Lemma 4.4 and the space \({{\mathcal {A}}}_p(\Omega ,{{\mathbb {F}}};X,{\mathcal {P}})\) consists of all \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}}\)-adapted processes \((d_n)_{n\in {{\mathbb {N}}}}\) in \({\mathcal {L}}^p({{\mathbb {P}}};X)\).

Example 4.6

(\({\mathcal {L}}^p\)-martingales) If \(p\in [1,\infty )\) and \({\mathcal {P}}\) consists of all mean zero measures in \({\mathcal {P}}_p(X)\), then \({\mathcal {P}}_{p\text {-ext}} = {\mathcal {P}}\) by Lemma 4.4 (one can test with \(\Phi (x):= \langle x,a \rangle \), where \(a\in X'\) and \(X'\) is the norm-dual) and \({{\mathcal {A}}}_p(\Omega ,{{\mathbb {F}}};X,{\mathcal {P}})\) consists of all \({\mathcal {L}}^p\)-integrable \({{\mathbb {F}}}\)-martingale difference sequences.

Example 4.7

(Conditionally symmetric adapted processes) Suppose \(p\in (0,\infty )\) and \({\mathcal {P}}\) consists of all symmetric measures in \({\mathcal {P}}_p(X)\). As a measure \(\mu \in {\mathcal {P}}(X)\) is symmetric if and only if for all \(f\in C_b(X;{{\mathbb {R}}})\) it holds that \(\int _X f(x) \,\mathrm{d}\mu (x) = \int _X f(-x) \,\mathrm{d}\mu (x)\), it follows that \({\mathcal {P}}_{p\text {-ext}}={\mathcal {P}}\). Moreover, the set \({{\mathcal {A}}}_p(\Omega ,{{\mathbb {F}}};X,{\mathcal {P}})\) consists of all X-valued \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}}\)-adapted sequences of random variables \((d_n)_{n\in {{\mathbb {N}}}}\) such that \(d_n \in {\mathcal {L}}^p({{\mathbb {P}}};X)\) and \(d_n\) is \({\mathcal {F}}_{n-1}\)-conditionally symmetric for all \(n\in {{\mathbb {N}}}\), i.e., for all \(n\in {{\mathbb {N}}}\) and all \(B \in {\mathcal {B}}(X)\) it holds that \( {{\mathbb {P}}}( d_n \in B \,|\, {\mathcal {F}}_{n-1} ) = {{\mathbb {P}}}( d_n \in -B \,|\, {\mathcal {F}}_{n-1} ) \) a.s.

Example 4.8

(One-dimensional laws) If \(p\in (0,\infty )\), \(\emptyset \not = {\mathcal {P}}_0 \subseteq {\mathcal {P}}_p({{\mathbb {R}}})\), and

$$\begin{aligned} {\mathcal {P}}={\mathcal {P}}({\mathcal {P}}_0,X) := \left\{ \mu \in {\mathcal {P}}_p(X) :\exists \mu _0 \in {\mathcal {P}}_0,\, x\in X :\mu (\cdot ) = \mu _0\big (\{ r \in {{\mathbb {R}}}:rx \in \cdot \} \big ) \right\} , \end{aligned}$$

then an X-valued random variable \(\varphi \) satisfies \({\mathcal {L}}(\varphi )\in {\mathcal {P}}\) if and only if there exist an \(x\in X\) and an \({{\mathbb {R}}}\)-valued random variable \(\varphi _0\) such that \(\varphi = x \varphi _0\) and \({\mathcal {L}}(\varphi _0) \in {\mathcal {P}}_0\). Moreover, \({{\mathcal {A}}}_p(\Omega ,{{\mathbb {F}}};X,{\mathcal {P}})\) contains all sequences of the form \((\varphi _n v_{n-1} )_{n\in {{\mathbb {N}}}}\) where \((\varphi _n)_{n\in {{\mathbb {N}}}}\) is an \(({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}}\)-adapted sequence of \({{\mathbb {R}}}\)-valued random variables such that \(\varphi _n\) is independent of \({\mathcal {F}}_{n-1}\) and \({\mathcal {L}}(\varphi _n) \in {\mathcal {P}}_{0}\), and \(v_{n-1} \in {\mathcal {L}}^p({\mathcal {F}}_{n-1};X)\) for all \(n\in {{\mathbb {N}}}\). Finally, it holds that \( {\mathcal {P}}(({\mathcal {P}}_0)_{p\text {-ext}},X) \subseteq {\mathcal {P}}_{p\text {-ext}}\).

5 Decoupling for Dyadic Martingales and Stochastic Integration

In this section, we consider the case of decoupling of dyadic martingales and combine our main result, i.e., Theorem 4.3, with a standard extrapolation argument to obtain a decoupling result that is useful for the theory of stochastic integration of vector-valued stochastic processes, see Theorem 5.2.

5.1 Stochastic Integrals and \(\gamma \)-Radonifying Operators

Let X be a separable Banach space, let \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},({\mathcal {F}}_t)_{t\in [0,\infty )})\) be a stochastic basis, and let \(W=(W_t)_{t\ge 0}\) be an \(({\mathcal {F}}_t)_{t\in [0,\infty )}\)-Brownian motion, i.e., a centered \({{\mathbb {R}}}\)-valued Gaussian process such that for all \(0\le s \le t < \infty \) it holds that \(W_t\) is \({\mathcal {F}}_t\)-measurable, \(W_t - W_s\) is independent of \({\mathcal {F}}_s\), and \({{\mathbb {E}}}W_s W_t = s\). We say that \(H:[0,\infty ) \times \Omega \rightarrow X\) is a simple predictable stochastic process if there exist \(0=t_0<\cdots< t_N< \infty \) and random variables \(v_n\in {\mathcal {L}}^{\infty }({\mathcal {F}}_{t_n};X)\), \(n\in \{0,\ldots ,N-1\}\), such that for all \(t\in [0,\infty )\) it holds that

$$\begin{aligned} H(t,\omega ) = \sum _{n=1}^N \mathrm{1}_{(t_{n-1},t_n]}(t) v_{n-1}(\omega ). \end{aligned}$$

For \(H:[0,\infty ) \times \Omega \rightarrow X\) an X-valued simple predictable process, we define the stochastic integral \(\int _0^\infty H(s) \mathrm{d}W(s)\) in the usual way and we define

$$\begin{aligned} u_H : L^2((0,\infty ))\times \Omega \rightarrow X \quad \text { by } \quad u_H (f)(\omega ) := \int _0^\infty f(t) H(t,\omega ) \mathrm{d}t. \end{aligned}$$

Note that for all \(\omega \in \Omega \) we obtain a finite rank operator \(u_H(\omega ):L^2((0,\infty ))\rightarrow X\). Given a finite rank operator \(T:L^2((0,\infty ))\rightarrow X\), one can define the \(\gamma \)-radonifying norm \(\left\| \cdot \right\| _{\gamma (L^2((0,\infty ));X)}\) by

$$\begin{aligned} \Vert T \Vert _{\gamma (L^2((0,\infty ));X)} := \left\| \sum _{n=1}^\infty \gamma _n T e_n \right\| _{{\mathcal {L}}^2({{\mathbb {P}}}';X)}, \end{aligned}$$

where \((e_n)_{n\in {{\mathbb {N}}}}\) is an orthonormal basis of \(L^2((0,\infty ))\) and \((\gamma _n)_{n\in {{\mathbb {N}}}}\) is a sequence of independent standard Gaussian random variables on some probability space \((\Omega ',{\mathcal {F}}',{{\mathbb {P}}}')\). The \(\gamma \)-radonifying norm is independent of the chosen orthonormal basis. For more information about the \(\gamma \)-radonifying norm, see, for example, [32, Chapter 3] or the survey article [35]. For the relevance of \(\gamma \)-radonifying norms to the definition of vector-valued stochastic integrals, see the definition of and results on \(W_p(X)\) in Definition 5.1 and Theorem 5.2, or see [36] for more details.
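When X is itself a Hilbert space, independence of the \(\gamma _n\) and \({{\mathbb {E}}}\gamma _n^2=1\) give \(\Vert T\Vert _{\gamma }^2 = \sum _n \Vert Te_n\Vert _X^2\), the Hilbert–Schmidt norm, which in particular makes the basis independence visible. A finite-dimensional sketch (our own toy operator \(T:{{\mathbb {R}}}^3\rightarrow {{\mathbb {R}}}^2\) standing in for a finite rank operator on \(L^2((0,\infty ))\)):

```python
import math

# For X a Hilbert space: ||T||_gamma^2 = E || sum_n gamma_n T e_n ||^2
#                                      = sum_n ||T e_n||^2   (Frobenius norm),
# independent of the chosen orthonormal basis (e_n).

T = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]                     # a rank-2 map R^3 -> R^2

def apply(T, v):
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def gamma_norm_sq(T, basis):
    return sum(sum(c * c for c in apply(T, e)) for e in basis)

std = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
s = 1 / math.sqrt(2)
rot = [[s, s, 0], [s, -s, 0], [0, 0, 1]]  # another orthonormal basis of R^3

frob_sq = sum(T[i][j] ** 2 for i in range(2) for j in range(3))  # = 15
assert abs(gamma_norm_sq(T, std) - frob_sq) < 1e-12
assert abs(gamma_norm_sq(T, rot) - frob_sq) < 1e-12
```

For general Banach spaces X the Gaussian sum cannot be reduced to a square sum, and the \(\gamma \)-norm genuinely depends on the geometry of X; the computation above only illustrates the Hilbert-space case.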

5.2 Decoupling Constants

In order to state our result (Theorem 5.2), we first recall that a random variable \(f\in {\mathcal {L}}^0((\Omega ,{\mathcal {F}},{{\mathbb {P}}});X)\) is conditionally symmetric given a sub-\(\sigma \)-algebra \({\mathcal {G}}\) if \({{\mathbb {P}}}(\{f\in B\} \cap G) = {{\mathbb {P}}}(\{f\in - B\} \cap G)\) for all \(B\in {\mathcal {B}}(X)\) and \(G\in {\mathcal {G}}\). In addition to the constant \(D_p(X)\) from Definition 1.5, we introduce two more constants:

Definition 5.1

Assume a separable Banach space X and \(p\in (0,\infty )\).

\(\underline{W_p(X):}\) Let \(W_p(X)\in [0,\infty ]\) be the infimum over all \(c\in [0,\infty ]\) such that for every stochastic basis \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},({\mathcal {F}}_t)_{t\in [0,\infty )})\), every \(({\mathcal {F}}_t)_{t\in [0,\infty )}\)-Brownian motion W, and every \(({\mathcal {F}}_t)_{t\in [0,\infty )}\)-simple predictable process \(H:[0,\infty )\times \Omega \rightarrow X\) one has that

$$\begin{aligned} \left\| \int _0^\infty H(s) \mathrm{d}W(s) \right\| _{{\mathcal {L}}^{p}({{\mathbb {P}}};X)} \le c \left\| \Vert u_H \Vert _{\gamma (L^2((0,\infty ));X)} \right\| _{{\mathcal {L}}^{p}({{\mathbb {P}}})} . \end{aligned}$$

\(\underline{\mathrm{UMD}_p^{-,s}(X):}\) Let \(\mathrm{UMD}_p^{-,s}(X)\in [0,\infty ]\) be the infimum over all \(c\in [0,\infty ]\) such that for every stochastic basis \((\Omega ,{\mathcal {F}},{{\mathbb {P}}},({\mathcal {F}}_n)_{n\in {{\mathbb {N}}}})\) and every finitely supported sequence of X-valued random variables \((d_n)_{n=1}^{\infty }\) such that \(d_n \in {\mathcal {L}}^p({\mathcal {F}}_n;X)\) and \(d_n\) is \({\mathcal {F}}_{n-1}\)-conditionally symmetric for all \(n\in {{\mathbb {N}}}\) it holds that

$$\begin{aligned} \left\| \sum _{n=1}^{\infty } d_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} \le c \left\| \sum _{n=1}^{\infty } r_n d_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}}\otimes {{\mathbb {P}}}_{{{\mathbb {D}}}};X)}. \end{aligned}$$
(16)
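For \(X={{\mathbb {R}}}\) and \(p=2\), orthogonality of martingale differences makes (16) an equality with \(c=1\): both sides equal \(\sum _n {{\mathbb {E}}}\,d_n^2\). The following exhaustive sketch (a toy dyadic martingale of our own choosing, with \(d_1=r_1\) and \(d_2=r_2(1+r_1)\)) checks this by enumerating all sign patterns:

```python
from itertools import product

# Dyadic martingale differences: d_1 = r_1, d_2 = r_2 * v_1(r_1) with
# v_1(r_1) = 1 + r_1. The right-hand side of (16) randomizes with an
# independent Rademacher sequence (s_1, s_2) playing the role of (r_n').

def second_moment(values):
    return sum(v * v for v in values) / len(values)

lhs = second_moment([r1 + r2 * (1 + r1)
                     for r1, r2 in product((-1, 1), repeat=2)])

rhs = second_moment([s1 * r1 + s2 * r2 * (1 + r1)
                     for r1, r2, s1, s2 in product((-1, 1), repeat=4)])

# Both sides equal E d_1^2 + E d_2^2 = 1 + E (1 + r_1)^2 = 3.
assert lhs == rhs == 3.0
```

The content of Theorem 5.2 is of course about Banach-space-valued differences and general \(p\in (0,\infty )\), where no such orthogonality is available and the constant \(\mathrm{UMD}_p^{-,s}(X)\) carries the geometry of X.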

Theorem 5.2

Let X be a separable Banach space and \(p\in (0,\infty )\).

  1. (i)

    If \(D_p(X)<\infty \), then \(D_q(X)<\infty \) for all \(q\in (0,\infty )\).

  2. (ii)

    If \(K_{p,2}\) is the constant in the \({\mathcal {L}}^p\)-to-\({\mathcal {L}}^2\) Kahane–Khintchine inequality, then

    $$\begin{aligned} W_p(X) \le K_{p,2} D_p(X). \end{aligned}$$

    Conversely, if \(W_p(X) <\infty \), then \(D_p(X)<\infty \).

  3. (iii)

    \( D_p(X) = \mathrm{UMD}_p^{-,s}(X)\).

For the proof, we use two lemmas. For the formulation of the first one, we introduce, for \(\nu \in {\mathcal {P}}({{\mathbb {R}}})\) and a separable Banach space X, the notation (see also Example 4.8):

$$\begin{aligned} {\mathcal {P}}(\nu ,X) := \big \{ \mu \in {\mathcal {P}}(X) :\exists x\in X :\mu (\cdot ) = \nu \big (\{r\in {{\mathbb {R}}}: rx \in \cdot \} \big ) \big \}. \end{aligned}$$

Lemma 5.3

Let X be a separable Banach space, \(p\in [2,\infty )\), let \(\mu \in {\mathcal {P}}_p({{\mathbb {R}}})\) satisfy \(\int _{{\mathbb {R}}}r \mathrm{d}\mu (r)=0\), \(\sigma ^2:=\int _{{{\mathbb {R}}}}|r|^2\,\mathrm{d}\mu (r)\in (0,\infty )\), and let \(\gamma \in {\mathcal {P}}({{\mathbb {R}}})\) be the standard Gaussian law. Then, \({\mathcal {P}}(\gamma ,X) \subseteq ({\mathcal {P}}(\mu ,X))_{p\text {-ext}}\).

Proof

Let \((\xi _n)_{n\in {{\mathbb {N}}}}\) be a sequence of independent, \(\mu \)-distributed random variables, and let \(\mu _n := {\mathcal {L}}((\sigma \sqrt{n})^{-1}\sum _{k=1}^{n} \xi _k)\). Observe that \({\mathcal {L}}((\sigma \sqrt{n})^{-1}\xi _1) \in {\mathcal {P}}(\mu ,{{\mathbb {R}}})\). Moreover, it follows from, e.g., [3, Theorem 5] that \(\mu _n {\mathop {\rightarrow }\limits ^{w^*}} \gamma \) and that \(\int _{{{\mathbb {R}}}} |r|^p \,\mathrm{d}\mu _n(r) \rightarrow \int _{{{\mathbb {R}}}} |r|^p \,\mathrm{d}\gamma (r)\). It thus follows from Lemma 2.1 that \(\gamma \in ({\mathcal {P}}(\mu ,{{\mathbb {R}}}))_{p\text {-ext}}\) and hence \({\mathcal {P}}(\gamma , X) \subseteq ({\mathcal {P}}(\mu ,X))_{p\text {-ext}}\). \(\square \)
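For \(\mu =\frac{1}{2}(\delta _{-1}+\delta _1)\) (so \(\sigma =1\)) and \(p=4\), the moment convergence used in the proof is available in closed form: \({{\mathbb {E}}}\,(S_n/\sqrt{n})^4 = 3-2/n \rightarrow 3 = \int _{{{\mathbb {R}}}} |r|^4\,\mathrm{d}\gamma (r)\). An exact enumeration checking this (our own verification, not needed for the proof):

```python
from math import comb

def fourth_moment(n):
    """Exact E (S_n / sqrt(n))^4 for S_n a sum of n independent Rademacher
    variables, enumerated via the binomial law of the number of +1's."""
    total = sum(comb(n, k) * (2 * k - n) ** 4 for k in range(n + 1))
    return total / (2 ** n * n ** 2)

# closed form 3 - 2/n, increasing to the Gaussian fourth moment 3
for n in (1, 2, 5, 10, 50):
    assert abs(fourth_moment(n) - (3 - 2 / n)) < 1e-9
assert fourth_moment(1) == 1.0
```

In the application in the proof of Theorem 5.2, this is exactly the mechanism by which Brownian increments enter \(({\mathcal {P}}(\mu ,X))_{p\text {-ext}}\) starting from Rademacher-type differences.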

Lemma 5.4

Let \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) be a probability space, let X be a separable Banach space, let \(p\in (0,\infty )\), let \({\mathcal {G}}\subseteq {\mathcal {F}}\) be a \(\sigma \)-algebra, and let \(f\in {\mathcal {L}}^p({\mathcal {F}};X)\) be \({\mathcal {G}}\)-conditionally symmetric. Then, there exists a sequence of \({\mathcal {G}}\)-conditionally symmetric \({\mathcal {F}}\)-simple functions \((f_n)_{n\in {{\mathbb {N}}}}\) such that \(\lim _{n\rightarrow \infty } \Vert f - f_n \Vert _{{\mathcal {L}}^p({{\mathbb {P}}};X)} = 0\).

Proof

Let \((g_n)_{n\in {{\mathbb {N}}}}\) be a sequence of \(\sigma (f)\)-simple functions such that \(\lim _{n\rightarrow \infty } \Vert f - g_n \Vert _{{\mathcal {L}}^p({{\mathbb {P}}};X)} = 0\). For \(n\in {{\mathbb {N}}}\), let \(m_n\in {{\mathbb {N}}}\) and \(B_{n,k} \in \sigma (f)\), \(x_{n,k} \in X\), \(k\in \{1,\ldots ,m_n\}\), be such that \( g_n = \sum _{k=1}^{m_n} x_{n,k} 1_{\{ f \in B_{n,k}\}}.\) Define, for \(n\in {{\mathbb {N}}}\), \( f_n = \frac{1}{2} \sum _{k=1}^{m_n} x_{n,k} ( 1_{ \{f \in B_{n,k}\}} - 1_{ \{- f \in B_{n,k}\}} ) \) and observe that \(f_n\) is \({\mathcal {G}}\)-conditionally symmetric because f is \({\mathcal {G}}\)-conditionally symmetric. Moreover, the conditional symmetry of f implies that \({\mathcal {L}}(f) = {\mathcal {L}}(-f)\), whence

$$\begin{aligned}&\left\| f - f_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} \\&\quad = \left\| \tfrac{1}{2} \left( f - \sum _{k=1}^{m_n} x_{n,k}1_{ \{f \in B_{n,k}\}} \right) - \tfrac{1}{2} \left( -f - \sum _{k=1}^{m_n} x_{n,k}1_{\{ - f \in B_{n,k}\}} \right) \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} \\&\quad \le 2^{\left( \frac{1}{p}-1\right) ^+} \left\| f - \sum _{k=1}^{m_n} x_{n,k}1_{ \{f \in B_{n,k}\}} \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} = 2^{\left( \frac{1}{p}-1\right) ^+} \Vert f - g_n\Vert _{{\mathcal {L}}^p({{\mathbb {P}}};X)}. \end{aligned}$$

\(\square \)

Proof of Theorem 5.2

Part (i) follows from [14]; for the convenience of the reader, a proof can be found in [8, Proposition B.1].

Part (ii): first, we check \(W_p(X) \le K_{p,2} D_p(X)\). For \(0=t_0< \cdots< t_N < \infty \), Lemma 5.3 applied to \(\mu := \frac{1}{2}\left( \delta _{-1} + \delta _1 \right) \) and Theorem 1.4 give (see also Corollary 1.6)

$$\begin{aligned} \left\| \sum _{n=1}^N (W_{t_n}-W_{t_{n-1}}) v_{n-1} \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} \le D_p(X) \left\| \sum _{n=1}^N (W'_{t_n}-W'_{t_{n-1}}) v_{n-1} \right\| _{{\mathcal {L}}^p({{\mathbb {P}}}\otimes {{\mathbb {P}}}';X)} \end{aligned}$$

for all \({\mathcal {L}}^p\)-integrable and \({\mathcal {F}}_{t_{n-1}}\)-measurable random variables \(v_{n-1}: \Omega \rightarrow X\) where \((W'_t)_{t\ge 0}\) is a Brownian motion defined on an auxiliary basis \((\Omega ',{\mathcal {F}}',{{\mathbb {P}}}',({\mathcal {F}}_t')_{t\in [0,\infty )})\). Exploiting the Kahane–Khintchine inequality gives that

$$\begin{aligned}&\left\| \sum _{n=1}^N (W'_{t_n}-W'_{t_{n-1}}) v_{n-1} \right\| _{{\mathcal {L}}^p({{\mathbb {P}}}\otimes {{\mathbb {P}}}';X)} \\&\quad \le K_{p,2} \left( \int _{\Omega } \left( \int _{\Omega '} \left\| \sum _{n=1}^N (W'_{t_n}(\omega ')-W'_{t_{n-1}}(\omega ')) v_{n-1}(\omega ) \right\| _X^2 \mathrm{d}{{\mathbb {P}}}'(\omega ') \right) ^\frac{p}{2} \mathrm{d}{{\mathbb {P}}}(\omega ) \right) ^\frac{1}{p}. \end{aligned}$$

For \(H := \sum _{n=1}^{N} 1_{(t_{n-1},t_n]}v_{n-1}\), the result follows by the known relation

$$\begin{aligned} \left( \int _{\Omega '} \left\| \sum _{n=1}^N (W'_{t_n}(\omega ')-W'_{t_{n-1}}(\omega ')) v_{n-1}(\omega ) \right\| _X^2 \mathrm{d}{{\mathbb {P}}}'(\omega ') \right) ^\frac{1}{2} = \Vert u_H(\omega ) \Vert _{\gamma (L^2((0,\infty ));X)}. \end{aligned}$$

Conversely, let us assume that \(W_p(X)<\infty \). Now we use [37, Lemma 2.5] to deduce that X has finite cotype. (In [37, Lemma 2.5] it is assumed that \(p\in [1,\infty )\), however in the 5th line of the proof of this lemma it is shown that \(\Vert \int _0^\infty \phi (t) \mathrm{d}W(t) \Vert = N\) a.s., which implies the desired conclusion for \(p\in (0,\infty )\), see also [8, Lemma 6.1].) Thus, the proof of [37, Theorem 2.2] guarantees that \(D_p(X)<\infty \). Here, we exploit that [26, Proposition 9.14] works (in their notation) with \(r\in (0,1)\) as well: one starts on the left-hand side with \({\mathcal {L}}^r\), estimates this by \({\mathcal {L}}^1\), applies [26, Proposition 9.14], and uses [26, Proposition 4.7] (Khintchine’s inequality for a vector-valued Rademacher series) to change \({\mathcal {L}}^1\) back to \({\mathcal {L}}^r\) on the right-hand side.

Part (iii) is divided into several steps:

  • Proof of \(D_p(X) \le \mathrm{UMD}_p^{-,s}(X)\): this inequality follows from the following two observations: firstly, dyadic martingales are conditionally symmetric, and secondly if \((d_n)_{n=1}^{\infty }=(r_n v_{n-1})_{n=1}^{\infty }\) is a dyadic martingale and \((r_n')_{n=1}^{\infty }\) is a Rademacher sequence independent of \((r_n)_{n=1}^{\infty }\), then \((r_n' r_n v_{n-1})_{n=1}^{\infty }\) and \((r_n'v_{n-1})_{n=1}^{\infty }\) are equal in distribution.

  • Proof of \(\mathrm{UMD}_p^{-,s}(X) \le D_p(X)\): using Lemma 5.4, we approximate each \(d_n\) in \({\mathcal {L}}^p({\mathcal {F}}_n;X)\) so that we may assume that the \(d_n\) take finitely many values only. Let

    $$\begin{aligned} \varepsilon _0 := \inf \{ \Vert d_n(\omega ) \Vert _X : n=1,\ldots ,N, \, \omega \in \Omega , \, d_n(\omega )\not = 0 \} > 0 \end{aligned}$$

    where \(\inf \emptyset := 1\). Take an \(x\in X\) with \(0<\Vert x\Vert _X<\varepsilon _0\) and let \(r=(r_n)_{n=1}^N\) be a Rademacher sequence on a probability space \((\Omega _{{{\mathbb {D}}}}, {\mathcal {F}}_{{{\mathbb {D}}}}, {{\mathbb {P}}}_{{{\mathbb {D}}}})\). If we define \({\tilde{d}}_n:\Omega \times \Omega _{{{\mathbb {D}}}}\rightarrow X\) by \({{\tilde{d}}}_n (\omega ,\omega _{{{\mathbb {D}}}}):= d_n(\omega ) + r_n(\omega _{{{\mathbb {D}}}}) x\), then \({{\tilde{d}}}_n(\omega ,\omega _{{{\mathbb {D}}}}) \not = 0\) for all \((\omega ,\omega _{{{\mathbb {D}}}})\in \Omega \times \Omega _{{{\mathbb {D}}}}\) and \({{\tilde{d}}}_n\) is conditionally symmetric given the \(\sigma \)-algebra \({\mathcal {F}}_{n-1}\otimes {\mathcal {F}}_{{{\mathbb {D}}},n-1}^{r}\), where \(({\mathcal {F}}_{{{\mathbb {D}}},n}^{r})_{n=0}^N\) is the natural filtration of \((r_n)_{n=1}^N\). Because we may let \(\Vert x\Vert \downarrow 0\), it suffices to verify the statement for \(({{\tilde{d}}}_n)_{n=1}^N\) or, in other words, we may assume without loss of generality that for all \(n\in {{\mathbb {N}}}\) the range of \(d_n\) is a finite set that does not contain 0.

Note that by removing all (i.e., at most finitely many) atoms of measure zero in the \(\sigma \)-algebra \({\mathcal {F}}_N^d\) and ‘updating’ the definition of \((d_n)_{n=1}^{N}\) accordingly, we may assume that the filtration \(({\mathcal {F}}_n^d)_{n=1}^N\) has the property that \({\mathcal {F}}_n^d\) is generated by finitely many atoms of positive measure.

Bearing in mind that for all \(n\in \{1,\ldots ,N\}\) the random variable \(d_n\) takes only finitely many values, each nonzero, and each with positive probability, one may check that for every atom \(A\in {\mathcal {F}}_{n-1}^d\), \(n\in \{1,\ldots ,N\}\), there exist disjoint sets \(A^{+}, A^{-} \in {\mathcal {F}}_{n}^d\) such that \(A = A^+ \cup A^{-}\), \({{\mathbb {P}}}(A^+)={{\mathbb {P}}}(A^-)\), and \({\mathcal {L}}(d_{n} \,|\, A^+ ) = {\mathcal {L}}(-d_{n} \,|\, A^{-})\). Now we introduce a Rademacher sequence \((\rho _n)_{n=1}^N\), \(\rho _n:\Omega \rightarrow \{-1,1 \}\), defined as follows: for each atom A of \({\mathcal {F}}_{n-1}^d\) we set \(\rho _n|_{A^+} \equiv 1\), and \(\rho _n|_{A^-}\equiv -1\), where \(A^+\) and \(A^{-}\) form a partition of A as described above. Moreover, we let \(v_n := \rho _n d_n\) so that \(d_n = \rho _n v_n\). By construction, \(\rho _n\) is independent of \({\mathcal {F}}_{n-1}^d\vee \sigma (v_n)\). It follows from the definition of \(D_p(X)\) and Theorem 1.4 (see also Example 2.6) that

$$\begin{aligned}&\left\| \sum _{n=1}^{N} d_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} = \left\| \sum _{n=1}^{N} \rho _n v_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} \le D_p(X) \left\| \sum _{n=1}^{N} r'_n v_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}}\otimes {{\mathbb {P}}}_{{{\mathbb {D}}}};X)} \\&\quad = D_p(X) \left\| \sum _{n=1}^{N} r'_n \rho _n v_n \right\| _{{{\mathcal {L}}}^p({{\mathbb {P}}}\otimes {{\mathbb {P}}}_{{{\mathbb {D}}}};X)} = D_p(X) \left\| \sum _{n=1}^{N} r'_n d_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}}\otimes {{\mathbb {P}}}_{{{\mathbb {D}}}};X)}. \end{aligned}$$

\(\square \)

We conclude with some remarks regarding Theorem 5.2.

Remark 5.5

  (1)

    Let \((h_n)_{n\in {{\mathbb {N}}}}\) be the Haar system for \({\mathcal {L}}^2((0,1])\) with \({{\,\mathrm{ess\,sup}\,}}(|h_n|)=1\), and let \(H_p(X)\in [0,\infty ]\) be the infimum over all \(c\in [0,\infty ]\) such that for all finitely supported sequences \((x_n)_{n\in {{\mathbb {N}}}}\) in X one has that

    $$\begin{aligned} \left\| \sum _{n=1}^\infty h_n x_n \right\| _{{\mathcal {L}}^{p}((0,1];X)} \le c \left\| \sum _{n=1}^\infty r_n h_n x_n \right\| _{{\mathcal {L}}^{p}((0,1]\times {{\mathbb {D}}};X)}. \end{aligned}$$

    Let \(|H|_p(X)\in [0,\infty ]\) be defined as above but with \(h_n\) replaced by \(|h_n|\). Then, it is straightforward to see that \( D_p(X) = H_p(X) = |H|_p(X)\).

  (2)

    Garling [13] introduced the constant \(\mathrm{UMD}_p^-(X)\), which is defined like the constant \(\mathrm{UMD}_p^{-,s}(X)\) in Definition 5.1 but without the conditional symmetry assumption. In general, the constants \(\mathrm{UMD}_p^-(X)\) and \(\mathrm{UMD}_p^{-,s}(X)\) behave differently: it follows from Hitczenko [16, Theorem 1.1] that \(\sup _{p\in [2,\infty )} D_p({{\mathbb {R}}}) < \infty \) and thus, by Theorem 5.2, \(\sup _{p\in [2,\infty )} \mathrm{UMD}_p^{-,s}({{\mathbb {R}}}) < \infty \). On the other hand, as outlined in [9, p. 348], one has \(\mathrm{UMD}_p^-({{\mathbb {R}}}) \succeq \sqrt{p}\) as \(p\rightarrow \infty \), obtained by combining Burkholder’s result [5, Theorem 3.1] on the optimal constant in the square function inequality with the behavior of the constant in the Khintchine inequality for Rademacher variables.
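    For the reader’s convenience, the Khintchine inequality alluded to reads, for scalars \((a_n)_{n=1}^N\),

    $$\begin{aligned} \left\| \sum _{n=1}^N a_n r_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}}_{{{\mathbb {D}}}})} \le C_p \left( \sum _{n=1}^N a_n^2 \right) ^{1/2}, \end{aligned}$$

    where the optimal constant satisfies \(C_p \asymp \sqrt{p}\) as \(p\rightarrow \infty \); this growth, together with Burkholder’s sharp square function inequality, produces the lower bound \(\mathrm{UMD}_p^-({{\mathbb {R}}}) \succeq \sqrt{p}\) stated above.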

Remark 5.6

Part (ii) of Theorem 5.2 is an extension of Garling’s [12, Theorem 2]: whereas Garling requires the integrands to be adapted to the filtration generated by the Brownian motion, we can allow for an arbitrary filtration. In the development of stochastic integration theory in Banach spaces (as presented in, e.g., [9, 34]), the restrictive assumption on the filtration in [12] was a known issue. In those articles, the problem was circumvented in two ways:

  (a)

    In [34, Lemma 3.4], a decoupling argument due to Montgomery-Smith [29] is used to prove \(W_p(X) \le \beta _p(X)\) for \(p\in (1,\infty )\), where \(\beta _p(X)\) is the \({\mathcal {L}}^p\)-UMD constant of X. This approach does not cover \(p\in (0,1]\), and the UMD property seems too strong an assumption, as \(W_p(L^1)<\infty \) for \(p\in (0,\infty )\) (see also [9]).

  (b)

    In [9, Theorem 5.4], it is observed that \(W_p(X)<\infty \) if \(D_p^\mathrm{gen}(X)<\infty \), where \(D_p^\mathrm{gen}(X)\) is the infimum over all \(c\in [0,\infty ]\) such that

    $$\begin{aligned} \left\| \sum _{n=1}^{\infty } d_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} \le c \left\| \sum _{n=1}^{\infty } e_n \right\| _{{\mathcal {L}}^p({{\mathbb {P}}};X)} \end{aligned}$$

    whenever \((e_n)_{n\in {{\mathbb {N}}}}\) is an \({{\mathbb {F}}}\)-decoupled tangent sequence of a finitely supported \({\mathcal {L}}^p\)-integrable X-valued \({{\mathbb {F}}}\)-adapted sequence of random variables \((d_n)_{n\in {{\mathbb {N}}}}\).
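    For orientation, a standard example of a decoupled tangent sequence (cf. the canonical representations of Kwapień and Woyczyński [24] and Montgomery-Smith [29]): if \((d_n)_{n\in {{\mathbb {N}}}}\) is a Paley–Walsh sequence, say \(d_n = r_n \varphi _n(r_1,\ldots ,r_{n-1})\) for a Rademacher sequence \((r_n)_{n\in {{\mathbb {N}}}}\) and Borel functions \(\varphi _n\), then a decoupled tangent sequence is obtained by replacing only the leading Rademacher variable by an independent copy:

    $$\begin{aligned} e_n = r'_n \varphi _n(r_1,\ldots ,r_{n-1}), \qquad n\in {{\mathbb {N}}}, \end{aligned}$$

    where \((r'_n)_{n\in {{\mathbb {N}}}}\) is a Rademacher sequence independent of \((r_n)_{n\in {{\mathbb {N}}}}\): with respect to the filtration generated by both sequences, \(d_n\) and \(e_n\) have the same conditional law given the past, and \((e_n)_{n\in {{\mathbb {N}}}}\) is conditionally independent given \(\sigma ((r_n)_{n\in {{\mathbb {N}}}})\).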

The approach in [9] leads us to wonder: is it true that \(D_p(X)<\infty \) implies \(D^\mathrm{gen}_p(X)<\infty \)? (See Open Problem 1.1.) Although we could not fully answer this question, Theorem 5.2 resolves the issue regarding the filtration in [12] and thereby provides a direct approach to vector-valued stochastic integration.