1 Introduction

The subject of exchangeability is prevalent in probability theory (see e.g. [2], Chapters 7–9 in [11], or [3, 4] and [5] for recent overviews and results), and the goal of this paper is to study another notion of exchangeability that is motivated by spin glass models and, in particular, by the work of Mézard and Parisi on diluted models [12].

We begin by considering an array \((X_\alpha )_{\alpha \in \mathbb{N }^r}\) of random variables \(X_\alpha \) indexed by \(\alpha \in \mathbb{N }^r\) for some integer \(r\ge 1\), whose distribution is invariant under certain rearrangements of the indices. We will think of \(\mathbb{N }^r\) as the set of leaves of a rooted tree (see Fig. 1) with the vertex set

$$\begin{aligned} {\fancyscript{A}}(r) = \mathbb{N }^0 \cup \mathbb{N }\cup \mathbb{N }^2 \cup \cdots \cup \mathbb{N }^r, \end{aligned}$$
(1)

where \(\mathbb{N }^0 = \{\emptyset \}\), the vertex \(\emptyset \) is the root of the tree, and each vertex \(\alpha =(n_1,\ldots ,n_p)\in \mathbb{N }^{p}\) for \(p\le r-1\) has children

$$\begin{aligned} \alpha n : = (n_1,\ldots ,n_p,n) \in \mathbb{N }^{p+1} \end{aligned}$$

for all \(n\in \mathbb{N }\). Each vertex \(\alpha \) is connected to the root \(\emptyset \) by the path

$$\begin{aligned} \emptyset \rightarrow n_1 \rightarrow (n_1,n_2) \rightarrow \cdots \rightarrow (n_1,\ldots ,n_p) = \alpha . \end{aligned}$$
Fig. 1 Index set \(\mathbb{N }^r\) as the leaves of the infinitary tree \({\fancyscript{A}}(r)\)

We will denote the set of vertices in this path by

$$\begin{aligned} p(\alpha ) = \bigl \{ \emptyset , n_1, (n_1,n_2),\ldots ,(n_1,\ldots ,n_p) \bigr \}. \end{aligned}$$
(2)

We will consider rearrangements of \(\mathbb{N }^r\) that preserve the structure of the tree \({\fancyscript{A}}(r)\), in the sense that they preserve the parent-child relationship. More specifically, we define by

$$\begin{aligned} \alpha \wedge \beta := |p(\alpha ) \cap p(\beta ) | \end{aligned}$$
(3)

the number of common vertices in the paths from the root \(\emptyset \) to the vertices \(\alpha \) and \(\beta \), and consider the following group of maps on \(\mathbb{N }^r\),

$$\begin{aligned} H_r = \bigl \{ \pi : \mathbb{N }^r\rightarrow \mathbb{N }^r \,\bigr |\, \pi \hbox { is a bijection}, \pi (\alpha )\wedge \pi (\beta ) = \alpha \wedge \beta \hbox { for all } \alpha ,\beta \in \mathbb{N }^r \bigr \}. \end{aligned}$$
(4)
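As a concrete aid, the path (2) and the overlap (3) are simple prefix computations on finite index tuples. The following Python sketch (function names are ours, chosen for illustration) makes this explicit.

```python
def path(alpha):
    # p(alpha) from (2): all prefixes of alpha, from the root () down to alpha itself
    return [alpha[:k] for k in range(len(alpha) + 1)]

def wedge(alpha, beta):
    # alpha ∧ beta from (3): number of vertices shared by the two root paths
    return len(set(path(alpha)) & set(path(beta)))

# the leaves (1,2,3) and (1,2,5) of A(3) share the vertices (), (1,), (1,2)
assert wedge((1, 2, 3), (1, 2, 5)) == 3
assert wedge((1, 2, 3), (1, 2, 3)) == 4
```

A bijection of \(\mathbb{N }^r\) belongs to \(H_r\) exactly when it leaves every such `wedge` value unchanged.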

Any such map can be extended to the entire tree \({\fancyscript{A}}(r)\) in a natural way: let \(\pi (\emptyset ) := \emptyset \) and

$$\begin{aligned} \hbox {if } \pi ((n_1,\ldots ,n_r)) = (m_1,\ldots ,m_r) \,\hbox { then let }\, \pi ((n_1,\ldots ,n_p)) := (m_1,\ldots ,m_p). \end{aligned}$$
(5)

Because of the condition \(\pi (\alpha )\wedge \pi (\beta ) = \alpha \wedge \beta \) in (4), this definition does not depend on the coordinates \(n_{p+1},\ldots ,n_r\), so the extension is well-defined. It is clear that the extension preserves the parent-child relationship. For each \(\alpha \in {\fancyscript{A}}(r){\setminus } \mathbb{N }^r\), it follows that \(\pi (\alpha n) = \pi (\alpha ) \pi _\alpha (n)\) for some bijection \(\pi _\alpha : \mathbb{N }\rightarrow \mathbb{N }\). In other words, the condition \(\pi (\alpha )\wedge \pi (\beta ) = \alpha \wedge \beta \) means that we can visualize the map \(\pi \) as a recursive procedure, in which children \(\alpha n\) of the vertex \(\alpha \in \mathbb{N }^p\) are rearranged among themselves for each \(\alpha \). Note that \(H_1\) is simply the group of all permutations of \(\mathbb{N }\).
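To make this recursive picture concrete, here is a hedged Python sketch (names are ours, not the paper's): a map \(\pi \in H_r\) is encoded by a family of bijections \(\pi _\alpha \) of \(\mathbb{N }\), one per vertex, and the extension (5) falls out of the encoding.

```python
def make_pi(sigma_at):
    # Encode a map pi in H_r by one bijection of N per vertex: the n-th child of
    # alpha is sent to the sigma_at(alpha)(n)-th child of pi(alpha), so
    # pi(alpha n) = pi(alpha) sigma_at(alpha)(n).  The image of a prefix
    # (n_1,...,n_p) depends only on that prefix, so the extension (5) is
    # automatically well-defined.
    def pi(alpha):
        image = ()
        for k, n in enumerate(alpha):
            image += (sigma_at(alpha[:k])(n),)
        return image
    return pi

swap12 = lambda n: {1: 2, 2: 1}.get(n, n)      # transpose children 1 and 2
pi = make_pi(lambda prefix: swap12 if prefix == () else (lambda n: n))

assert pi((1, 3)) == (2, 3) and pi((2, 7)) == (1, 7)
# images of common prefixes agree, as required by the extension (5)
assert pi((1, 3))[:1] == pi((1, 9))[:1]
```

The particular choice of per-vertex bijections above (a single transposition at the root) is an arbitrary illustration of one element of \(H_2\).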

We will say that an array of random variables \((X_\alpha )_{\alpha \in \mathbb{N }^r}\) taking values in a standard Borel space \(A\) (i.e. Borel-isomorphic to a Borel subset of a Polish space) is hierarchically exchangeable, or \(H\)-exchangeable, if

$$\begin{aligned} \bigl (X_{\pi (\alpha )} \bigr )_{\alpha \in \mathbb{N }^r} \stackrel{d}{=} \bigl (X_\alpha \bigr )_{\alpha \in \mathbb{N }^r} \end{aligned}$$
(6)

for all \(\pi \in H_r\). Throughout the paper, we will view any array of random variables as a random element in the product space, so equality in distribution is always understood as equality of the finite-dimensional distributions. Because of this, one can replace the condition in (4) that \(\pi \) is a bijection by the condition that \(\pi \) is simply an injection, since any injection restricted to finitely many elements can obviously be extended to a bijection preserving the property \(\pi (\alpha )\wedge \pi (\beta ) = \alpha \wedge \beta \).

The case of \(r=1\) corresponds to the classical notion of an exchangeable sequence, and in the general case of \(r\ge 1\) we will prove the following analogue of de Finetti’s classical theorem. One natural example of an \(H\)-exchangeable array is given by (recall the notation in (2))

$$\begin{aligned} X_\alpha = \sigma \bigl ((v_{\beta })_{\beta \in p(\alpha )} \bigr ), \end{aligned}$$
(7)

where \(\sigma : [0,1]^{r+1} \rightarrow A\) is a measurable function, and \(v_\alpha \) for \(\alpha \in {\fancyscript{A}}(r)\) are i.i.d. random variables with the uniform distribution on \([0,1]\). This array is hierarchically exchangeable because, by the definition of \(\pi \), the random variables \(v_{\pi (\alpha )}\) for \(\alpha \in {\fancyscript{A}}(r)\) are also i.i.d. and uniform on \([0,1]\), while \(p(\pi (\alpha ))= \pi (p(\alpha ))\) and hence \(X_{\pi (\alpha )} = \sigma ((v_{\pi (\beta )})_{\beta \in p(\alpha )})\). We will show the following.

Theorem 1

Any hierarchically exchangeable array \((X_\alpha )_{\alpha \in \mathbb{N }^r}\) can be generated in distribution as in (7) for some measurable function \(\sigma \).
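The forward direction of the representation (7) is easy to simulate. Below is a hedged Python sketch (the choice of \(\sigma \), here a plain sum, and of the finite set of leaves are arbitrary illustrations); the uniforms \(v_\beta \) are created lazily, one per tree vertex, so leaves sharing a prefix automatically share weights.

```python
import random

def sample_representation(sigma, leaves, r, seed=0):
    # X_alpha = sigma((v_beta)_{beta in p(alpha)}) as in (7), with i.i.d.
    # Uniform[0,1] variables v_beta attached lazily to the tree vertices
    rng = random.Random(seed)
    v = {}
    def weight(beta):
        if beta not in v:
            v[beta] = rng.random()
        return v[beta]
    return {a: sigma([weight(a[:k]) for k in range(r + 1)]) for a in leaves}

# sigma = sum is an arbitrary illustration of a measurable function on [0,1]^3
X = sample_representation(sum, [(1, 1), (1, 2), (2, 1)], r=2)
```

Here the leaves \((1,1)\) and \((1,2)\) share the contributions of \(v_\emptyset \) and \(v_{(1)}\); Theorem 1 asserts that this hierarchical dependence structure is fully general.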

This result is not very difficult to prove, and one can give several different arguments. We will describe an approach that will be a natural first step toward the general case of processes indexed by several trees or, more specifically, by product sets of the form \(\mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }\) for any integers \(r_1,\ldots ,r_\ell \ge 1\). Recalling the definition (4), let us denote

$$\begin{aligned} H_{r_1,\ldots ,r_\ell } = H_{r_1}\times \cdots \times H_{r_\ell }, \end{aligned}$$
(8)

and for any \(\pi =(\pi _1,\ldots ,\pi _\ell )\in H_{r_1,\ldots ,r_\ell }\) and any \(\alpha = (\alpha _1,\ldots ,\alpha _\ell ) \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }\), let us denote

$$\begin{aligned} \pi (\alpha ) = \bigl (\pi _1(\alpha _1),\ldots ,\pi _\ell (\alpha _\ell ) \bigr ). \end{aligned}$$

We will say that an array of random variables \(X_{\alpha }\) indexed by \({\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}\) and taking values in a standard Borel space \(A\) is hierarchically exchangeable, or \(H\)-exchangeable, if

$$\begin{aligned} \bigl (X_{\pi (\alpha )} \bigr )_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }} \stackrel{d}{=} \bigl (X_\alpha \bigr )_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }} \end{aligned}$$
(9)

for all \(\pi \in H_{r_1,\ldots ,r_\ell }\). Let us denote

$$\begin{aligned} {\fancyscript{A}}(r_1,\ldots ,r_\ell ) = {\fancyscript{A}}(r_1)\times \cdots \times {\fancyscript{A}}(r_\ell ) \end{aligned}$$

and, for \(\alpha = (\alpha _1,\ldots ,\alpha _\ell ) \in {\fancyscript{A}}(r_1,\ldots ,r_\ell ) \), denote

$$\begin{aligned} p(\alpha ) := p(\alpha _1)\times \cdots \times p(\alpha _\ell ). \end{aligned}$$

Then, again, a natural class of \(H\)-exchangeable arrays consists of those of the form

$$\begin{aligned} X_\alpha = \sigma \bigl ((v_{\beta })_{\beta \in p(\alpha )} \bigr ), \end{aligned}$$
(10)

for some measurable function \(\sigma :[0,1]^{(r_1+1) \cdots (r_\ell +1)}\rightarrow A\) and a family of i.i.d. random variables \(v_\beta \) indexed by \({\beta \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )}\) with the uniform distribution on \([0,1]\). Note that the number of coordinates of \(\sigma \) is the cardinality of \(p(\alpha ) = p(\alpha _1)\times \cdots \times p(\alpha _\ell )\), namely the product \((r_1+1)\cdots (r_\ell +1)\).

Theorem 2

Any hierarchically exchangeable array \((X_{\alpha })_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}\) can be generated in distribution as in (10) for some measurable function \(\sigma \).

This general result was motivated by the following special case, in which the array \((X_{\alpha , i})\) is indexed by \(\alpha \in \mathbb{N }^r\) and \(i\in \mathbb{N }\). The condition (9) now becomes

$$\begin{aligned} \bigl (X_{\pi (\alpha ), \rho (i)} \bigr )_{\alpha \in \mathbb{N }^r, i\in \mathbb{N }} \stackrel{d}{=} \bigl (X_{\alpha ,i} \bigr )_{\alpha \in \mathbb{N }^r, i\in \mathbb{N }} \end{aligned}$$
(11)

for all \(\pi \in H_r\) and all bijections \(\rho :\mathbb{N }\rightarrow \mathbb{N }\), and Theorem 2 implies that any such array can be generated in distribution as

$$\begin{aligned} X_{\alpha ,i} = \sigma \bigl ((v_{\beta })_{\beta \in p(\alpha )}, (v_{\beta }^i)_{\beta \in p(\alpha )} \bigr ), \end{aligned}$$
(12)

where \(\sigma : [0,1]^{2(r+1)} \rightarrow \mathbb{R }\) is a measurable function and all \(v_\alpha \) and \(v_\alpha ^i\) for \(\alpha \in {\fancyscript{A}}(r)\) and \(i\in \mathbb{N }\) are i.i.d. random variables with the uniform distribution on \([0,1]\). This can be viewed as a hierarchical version of the classical Aldous-Hoover representation [1, 2, 8, 9], which corresponds to the case \(r=1\). One application of this representation can be found in [14], where it is explained how (12) is related to the predictions about the structure of the Gibbs measure in diluted spin glass models that originate in the work of Mézard and Parisi [12]. The main result in [14] proves precisely the hierarchical exchangeability (11) for the random variables \(X_{\alpha ,i}\) that represent the magnetization of the \(i\)th spin inside the pure state \(\alpha \), and the tree structure as above stems from the ultrametric organization of the pure states in the Parisi ansatz, which was recently proved in [13]. Finally, although this is not directly related to the results presented in this paper, an interested reader can find a study of another notion of exchangeability on (infinite infinitary) trees in Section III.13 in [2].
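The representation (12) can likewise be sketched in a few lines of Python (a hedged illustration: \(\sigma \), the finite index set, and all function names are our arbitrary choices). The weights \(v_\beta \) are shared by all spins \(i\), while the weights \(v_\beta ^i\) are refreshed per spin.

```python
import random

def sample_spin_array(sigma, pairs, r, seed=0):
    # X_{alpha,i} = sigma((v_beta)_{beta in p(alpha)}, (v^i_beta)_{beta in p(alpha)})
    # as in (12); all v_beta and v^i_beta are i.i.d. Uniform[0,1], created lazily
    rng = random.Random(seed)
    cache = {}
    def weight(key):
        if key not in cache:
            cache[key] = rng.random()
        return cache[key]
    out = {}
    for alpha, i in pairs:
        prefixes = [alpha[:k] for k in range(r + 1)]
        out[alpha, i] = sigma([weight(b) for b in prefixes],        # shared along p(alpha)
                              [weight((b, i)) for b in prefixes])   # specific to spin i
    return out

pairs = [((1, 1), 1), ((1, 1), 2), ((1, 2), 1)]
X = sample_spin_array(lambda v, vi: sum(v) + sum(vi), pairs, r=2)
```

Spins \(i=1,2\) inside the same pure state \(\alpha =(1,1)\) share all the weights \(v_\beta \), which is the hierarchical analogue of the row/column structure in the Aldous-Hoover representation.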

2 The case of one tree

It is well known that any standard Borel space is Borel-isomorphic to a Borel subset of \([0,1]\) (see e.g. Section 13.1 in [6]), which means that it is enough to prove Theorems 1 and 2 with random variables \(X_\alpha \) taking values in \([0,1]\), which we will assume from now on. All the arrays that we will deal with will take values in the product space of countably many copies of \([0,1]\), which is a compact space. For simplicity of notation, we will continue to denote all such spaces by \(A\). We will denote by \(\Pr \,A\) the space of probability measures on \(A\) equipped with the topology of weak convergence, which is also a compact space. If a sequence \((X_n)_n\) of \(A\)-valued random variables is such that the empirical distributions

$$\begin{aligned} \frac{1}{N}\sum _{n=1}^N\delta _{X_n} \end{aligned}$$

converge almost surely to some \((\Pr \,A)\)-valued random variable, then we will call this limit the empirical measure of \((X_n)_n\) and denote it by \({\fancyscript{E}}((X_n)_n)\). Our key tool will be the following strong version of de Finetti’s theorem (see Proposition 1.4, Corollary 1.5 and Corollary 1.6 from [11]).

Theorem 3

(de Finetti-Hewitt-Savage theorem) Suppose \((X_n)_n\) is an exchangeable sequence of \(A\)-valued random variables. Then the empirical measure \({\fancyscript{E}}((X_n)_n)\) exists almost surely and has the following properties:

  (i)

    \({\fancyscript{E}}((X_n)_n)\) is almost surely a function of \((X_n)_n\);

  (ii)

    given \({\fancyscript{E}}((X_n)_n)\), the random variables \(X_n\) are i.i.d. with the distribution \({\fancyscript{E}}((X_n)_n)\);

  (iii)

    if \(Z\) is any other random variable on the same probability space such that

    $$\begin{aligned} (Z,X_1,X_2,\ldots ) \stackrel{d}{=} (Z,X_{\pi (1)},X_{\pi (2)},\ldots ) \,\, \text{ for all } \,\, \pi \in H_1 \end{aligned}$$
    (13)

    then the sequence \((X_n)_n\) is conditionally independent from \(Z\) given \({\fancyscript{E}}((X_n)_n)\).
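The forward direction of the theorem is easy to visualize numerically. The following hedged Python sketch draws a random directing measure (here, for concreteness, a Bernoulli measure with a uniformly random parameter \(p\)), then a conditionally i.i.d. sequence from it, and checks that the empirical frequencies recover \(p\).

```python
import random

rng = random.Random(1)
p = rng.random()                        # random directing measure: Bernoulli(p)
xs = [1 if rng.random() < p else 0 for _ in range(100000)]   # i.i.d. given p
empirical_mean = sum(xs) / len(xs)      # mean of the empirical measure of (X_n)_n
assert abs(empirical_mean - p) < 0.01   # empirical measure concentrates near Bernoulli(p)
```

The resulting sequence is exchangeable but not i.i.d. unconditionally: its law is a mixture over \(p\), and the empirical measure identifies the mixing component, as in properties (i) and (ii).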

Proof of Theorem 1

The proof will be by induction on \(r\ge 1\). For each \(\alpha \in \mathbb{N }^{r-1}\), by Theorem 3, the empirical measures

$$\begin{aligned} X_\alpha := {\fancyscript{E}}\bigl ((X_{\alpha n})_{n}\bigr ) \in \Pr \,A \end{aligned}$$
(14)

exist almost surely, because hierarchical exchangeability (6) implies that \((X_{\alpha n})_n\) is exchangeable in the index \(n\) for each fixed \(\alpha \). Moreover, hierarchical exchangeability together with Theorem 3 imply the following:

  (a)

    Given \(X_\alpha \) for a fixed \(\alpha \in \mathbb{N }^{r-1}\), the random variables \(X_{\alpha n},\,n\in \mathbb{N }\), are i.i.d. with the distribution \(X_\alpha \).

  (b)

    The random variables \((X_{\alpha n})_{\alpha \in \mathbb{N }^{r-1}, n\in \mathbb{N }}\) are conditionally independent given \((X_\alpha )_{\alpha \in \mathbb{N }^{r-1}}\). This holds because, for a chosen \(\alpha \), the joint distribution of all the random variables is invariant if one permutes the sequence \((X_{\alpha n})_{n \in \mathbb{N }}\) while leaving all \((X_{\alpha ' n})_{\alpha '\ne \alpha ,\,n\in \mathbb{N }}\) fixed, and so (iii) of Theorem 3 gives that the former are conditionally independent from the latter given \(X_\alpha \).

  (c)

    The empirical measures \((X_\alpha )_{\alpha \in \mathbb{N }^{r-1}}\) are hierarchically exchangeable,

    $$\begin{aligned} (X_{\pi (\alpha )})_{\alpha \in \mathbb{N }^{r-1}} \stackrel{d}{=} (X_\alpha )_{\alpha \in \mathbb{N }^{r-1}} \,\, \text{ for all } \,\, \pi \in H_{r-1}. \end{aligned}$$

By the induction hypothesis, property (c) yields a representation

$$\begin{aligned} (X_\beta )_{\beta \in \mathbb{N }^{r-1}} \stackrel{d}{=} \big (\sigma _1((v_\gamma )_{\gamma \in p(\beta )})\big )_{\beta \in \mathbb{N }^{r-1}}. \end{aligned}$$
(15)

By the properties (a) and (b) and the fact that \(A\) is a Borel space, there exists a measurable function \(\sigma _2:\Pr \,A\times [0,1]\rightarrow A\) such that, conditionally on \((X_\alpha )_{\alpha \in \mathbb{N }^{r-1}}\),

$$\begin{aligned} \bigl (X_{\alpha n}\bigr )_{\alpha \in \mathbb{N }^{r-1}, n\in \mathbb{N }} \stackrel{d}{=} \bigl (\sigma _2(X_\alpha ,v_{\alpha n}) \bigr )_{\alpha \in \mathbb{N }^{r-1}, n\in \mathbb{N }}, \end{aligned}$$
(16)

where \(v_{\alpha n}\) for \(\alpha n \in \mathbb{N }^r\) are i.i.d. random variables uniform on \([0,1]\), independent from everything else. In other words, we simply realize independent random variables \(X_{\alpha n}\) from the distribution \(X_\alpha \) as functions of independent uniform random variables \(v_{\alpha n}\). (See, for instance, Lemma 7.8 in [11] for a rather stronger result guaranteeing that this can be done.) Combining (15) and (16) implies

$$\begin{aligned} (X_\alpha )_{\alpha \in \mathbb{N }^r} \stackrel{d}{=} \big (\sigma ((v_\beta )_{\beta \in p(\alpha )})\big )_{\alpha \in \mathbb{N }^r} \end{aligned}$$

with \(\sigma (x_0,x_1,\ldots ,x_r) := \sigma _2(\sigma _1(x_0,x_1,\ldots ,x_{r-1}),x_r)\), which finishes the proof. \(\square \)

3 The case of several trees

Theorem 2 will be proved by induction on \((r_1,\ldots ,r_\ell )\). Of course, the case \(\ell = 1\) is already proved in the previous section. However, in order to close the induction, it will actually be convenient to focus on a more general result, describing \(H\)-exchangeable couplings between processes and \(I\)-fields, defined as follows. We will call an array of random variables \((u_\alpha )_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )}\) taking values in some compact spaces an \(I\)-field if all \(u_\alpha \) are independent and the distribution of \(u_\alpha \) depends only on the “distance of \(\alpha \) from the root”, namely, all \(u_\alpha \) have the same distribution for \(\alpha \in \mathbb{N }^{p_1}\times \cdots \times \mathbb{N }^{p_\ell }\) for any given \((p_1,\ldots ,p_\ell )\). We will consider a pair of processes

$$\begin{aligned} (u_\alpha )_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )},(X_\alpha )_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}, \end{aligned}$$
(17)

where \((u_\alpha )\) is an \(I\)-field, not necessarily independent of \((X_\alpha )\). We will assume that they are jointly hierarchically exchangeable in the sense that

$$\begin{aligned} \bigl ((u_{\pi (\alpha )})_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )}, (X_{\pi (\alpha )})_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}\bigr ) \stackrel{d}{=} \bigl ((u_{\alpha })_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )}, (X_{\alpha })_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}\bigr ) \end{aligned}$$
(18)

for all bijections \(\pi \in H_{r_1,\ldots , r_\ell }\) in (8) extended in a natural way to the entire set \( {\fancyscript{A}}(r_1,\ldots ,r_\ell )\), i.e. each coordinate \(\pi _i\in H_{r_i}\) is extended from \(\mathbb{N }^{r_i}\) to \({\fancyscript{A}}(r_i)\) as in (5). For convenience of notation, given an array \(Y_\alpha \) indexed by \(\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )\) and a subset \(S\subseteq {\fancyscript{A}}(r_1,\ldots ,r_\ell )\), we will denote \(Y_S = (Y_\alpha )_{\alpha \in S}\). For example, \(Y_{p(\alpha )} = (Y_\beta )_{\beta \in p(\alpha )}\). The following proposition is a generalization of Theorem 2.
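The notion of an \(I\)-field can be sketched concretely; the following hedged Python fragment does so for a single tree (\(\ell = 1\)), where the "distance from the root" of a vertex is just its depth (function names and the particular level distributions are arbitrary illustrations).

```python
import random

def sample_I_field(level_sampler, vertices, seed=0):
    # An I-field: independent u_alpha whose distribution depends only on the
    # depth of alpha in the tree, i.e. on len(alpha) for a single tree
    rng = random.Random(seed)
    return {alpha: level_sampler[len(alpha)](rng) for alpha in vertices}

# depth -> sampler for that level (arbitrary illustrative choices)
levels = {0: lambda r: r.random(),          # the root
          1: lambda r: r.gauss(0.0, 1.0),   # first level
          2: lambda r: r.uniform(-1, 1)}    # leaves of A(2)
verts = [(), (1,), (2,), (1, 1), (2, 3)]
u = sample_I_field(levels, verts)
```

All vertices at the same depth receive i.i.d. values, and different depths may use different distributions, which is exactly the flexibility the induction below exploits.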

Proposition 1

If (18) holds then there exists a measurable function \(\tau \) such that, conditionally on the \(I\)-field \((u_\alpha )_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )}\),

$$\begin{aligned} (X_\alpha )_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }} \stackrel{d}{=} \bigl (\tau (u_{p(\alpha )},v_{p(\alpha )}) \bigr )_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}, \end{aligned}$$
(19)

where \((v_\alpha )_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )}\) are i.i.d. random variables uniform on \([0,1]\), independent of \((u_\alpha )_\alpha \).

Formally, this equality of distribution conditionally on \((u_\alpha )_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )}\) means the following equality of distribution for larger families of random variables:

$$\begin{aligned}&\!\!\! \Bigl ((u_\alpha )_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )},\ (X_\alpha )_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}\Bigr ) \\&\,\,\,\stackrel{d}{=} \Bigl ((u_\alpha )_{\alpha \in {\fancyscript{A}}(r_1,\ldots ,r_\ell )},\ \bigl (\tau (u_{p(\alpha )},v_{p(\alpha )}) \bigr )_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}\Bigr ). \end{aligned}$$

We will generally avoid writing this out in full for the sake of lighter notation.

Of course, (19) implies Theorem 2 by considering an \(I\)-field \((u_\alpha )\) independent of the process \((X_\alpha )\). Proposition 1 will be proved by induction on \((r_1,\ldots ,r_\ell )\) and, in the induction step, we will need to describe a conditional distribution of one array given another. We will be able to replace this second array with an \(I\)-field, and the independence built into the definition of \(I\)-fields will be well-suited for the induction argument. The induction argument does not work so well when the \(I\)-field in Proposition 1 is replaced by a general \(H\)-exchangeable array \((Y_\alpha )\). However, such a generalization, described in Theorem 4 below, will follow once we have Proposition 1.

To describe the induction, it will be convenient to write members of \({\fancyscript{A}}(r_1,\ldots ,r_\ell )\) in the form \((\omega ,\alpha )\), where \(\omega \in {\fancyscript{A}}(r_1,\ldots ,r_{\ell -1})\) and \(\alpha \in {\fancyscript{A}}(r_\ell )\), and also abbreviate

$$\begin{aligned} {\fancyscript{A}}= {\fancyscript{A}}(r_1,\ldots ,r_{\ell -1}) \, \text{ and } \, {\fancyscript{L}}= \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_{\ell -1}}. \end{aligned}$$

We therefore write the pair of processes (17) as \((u_{\omega ,\alpha })_{\omega \in {\fancyscript{A}},\alpha \in {\fancyscript{A}}(r_\ell )},(X_{\omega ,\alpha })_{\omega \in {\fancyscript{L}},\alpha \in \mathbb{N }^{r_\ell }}\). To close the induction we will make three separate appeals to simpler cases of Proposition 1, and we subdivide the proof into stages accordingly.

3.1 Using the case of one tree

For the first stage, it will also be convenient to introduce the notation, for each \(\alpha \in {\mathbb{N }}^{r_\ell }\),

$$\begin{aligned} {\widetilde{X}}_\alpha = ({\widetilde{X}}_\alpha ^1, {\widetilde{X}}_\alpha ^2) = \bigl ((u_{\omega ,\alpha })_{\omega \in {\fancyscript{A}}},(X_{\omega ,\alpha })_{\omega \in {\fancyscript{L}}}\bigr ), \end{aligned}$$
(20)

which is an element of another compact space, say \({\widetilde{A}} = {\widetilde{A}}_1 \times {\widetilde{A}}_2\), where \({\widetilde{X}}_\alpha ^j\) take values in \({\widetilde{A}}_j\) for \(j=1,2\). If we denote the subarray

$$\begin{aligned} U^- = (u_{\omega ,\alpha })_{\omega \in {\fancyscript{A}}, \alpha \in {\fancyscript{A}}(r_\ell -1)} \end{aligned}$$
(21)

of our \(I\)-field consisting of the coordinates that do not appear in (20), then in these terms our goal is to describe the joint distribution of \(({{\widetilde{X}}}_\alpha )_{\alpha \in {\mathbb{N }}^{r_\ell }}\) and \(U^-\).

First of all, notice that hierarchical exchangeability in (18) implies that the process \(({\widetilde{X}}_\alpha )_{\alpha \in \mathbb{N }^{r_\ell }}\) is \(H\)-exchangeable. Hence, similarly to the proof of Theorem 1, for each \(\alpha \in \mathbb{N }^{r_\ell -1}\), the empirical measure

$$\begin{aligned} {\widetilde{X}}_\alpha := {\fancyscript{E}}\bigl (({\widetilde{X}}_{\alpha n})_{n} \bigr ) \in \Pr \,{\widetilde{A}} \end{aligned}$$
(22)

exists almost surely and, by Theorem 3, we get:

  (a)

    given \({\widetilde{X}}_\alpha \) for \(\alpha \in \mathbb{N }^{r_\ell -1}\), the random variables \({\widetilde{X}}_{\alpha n}\) are i.i.d. with the distribution \({\widetilde{X}}_\alpha \);

  (b)

    given \(({\widetilde{X}}_\alpha )_{\alpha \in \mathbb{N }^{r_\ell -1}}\), the random variables \(({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in \mathbb{N }}\) are conditionally independent. Note also that the permutation of the index \(n\) for a fixed \(\alpha \) does not affect the subarray (21). Therefore, part (iii) of Theorem 3 also implies that

  (c)

    given \(({\widetilde{X}}_\alpha )_{\alpha \in \mathbb{N }^{r_\ell -1}}\), the array \(({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in \mathbb{N }}\) is independent of \(U^-\). Another important observation is that, by the definition of \(I\)-field, for any \(\alpha \in \mathbb{N }^{r_\ell -1}\), the random variables \({\widetilde{X}}_{\alpha n}^1 = (u_{\omega ,\alpha n})_{\omega \in {\fancyscript{A}}}\) in (20) are i.i.d. for \(n\in \mathbb{N }\) with some fixed distribution on \({\widetilde{A}}_1\) and, therefore, the marginal of the empirical measure \({\widetilde{X}}_\alpha \) in (22) on \({\widetilde{A}}_1\) is this fixed nonrandom measure. Together with the property (a) this implies:

  (d)

    the random variables \({\widetilde{X}}_{\alpha n}^1\) for \(n\in \mathbb{N }\) are independent of the empirical measure \({\widetilde{X}}_\alpha \).

Let us now consider an infinite subset \(I\subseteq \mathbb{N }\) such that \(I^c=\mathbb{N }\setminus I\) is also infinite. Even though our goal is to describe the joint distribution of \(({{\widetilde{X}}}_{\alpha })_{\alpha \in \mathbb{N }^{r_\ell }}\) and \(U^-\), because of the hierarchical exchangeability it is, obviously, sufficient to describe the joint distribution of

$$\begin{aligned} ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \,\,\hbox { and }\,\, U^-. \end{aligned}$$

This will be done in several steps, and we begin with the following lemma. We will suppose, without loss of generality, that \(1\in I\). We will write \(\mathbb{P }(Y\in \cdot \,|\,Y')\) for the conditional distribution of \(Y\) given \(Y'\).

Lemma 1

  (A)

    The following equality holds:

    $$\begin{aligned}&\mathbb{P }\Bigl ( ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \in \ \cdot \ \Big |\ ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I^c}\Bigr ) \nonumber \\&\quad = \bigotimes \nolimits _{\alpha \in \mathbb{N }^{r_\ell -1}} \mathbb{P }\Bigl ( {\widetilde{X}}_{\alpha 1} \in \ \cdot \ \Big |\ ({\widetilde{X}}_{\alpha n})_{ n\in I^c}\Bigr )^{\otimes I}. \end{aligned}$$
    (23)
  (B)

    Conditionally on \(({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I^c}\), the arrays \(({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I}\) and \(U^-\) are independent.

  (C)

    The arrays \(({\widetilde{X}}_{\alpha n}^1)_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I}\) and \(({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I^c}\) are independent.

Proof

First of all, by property (a), the empirical measure (22) satisfies

$$\begin{aligned} {\widetilde{X}}_\alpha = {\fancyscript{E}}\bigl (({\widetilde{X}}_{\alpha n})_{n\in I^c} \bigr ), \end{aligned}$$
(24)

which means that \({\widetilde{X}}_\alpha \) is almost surely a function of \(({\widetilde{X}}_{\alpha n})_{n\in I^c}\). Therefore,

$$\begin{aligned}&\mathbb{P }\Bigl ( ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \in \ \cdot \ \Big |\ ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I^c},\ U^-\Bigr ) \\&\quad = \mathbb{P }\Bigl ( ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \in \ \cdot \ \Big |\ ({\widetilde{X}}_{\alpha })_{\alpha \in \mathbb{N }^{r_\ell -1}}, ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I^c},\ U^-\Bigr ). \end{aligned}$$

Using the properties (b) and (c), this conditional distribution is equal to

$$\begin{aligned} \mathbb{P }\Bigl ( ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \in \ \cdot \ \Big |\ ({\widetilde{X}}_\alpha )_{\alpha \in \mathbb{N }^{r_\ell -1}}\Bigr ). \end{aligned}$$
(25)

The same computation obviously also works without \(U^-\), and therefore

$$\begin{aligned}&\mathbb{P }\Bigl ( ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \in \ \cdot \ \Big |\ ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I^c},\ U^-\Bigr ) \\&\quad = \mathbb{P }\Bigl ( ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \in \ \cdot \ \Big |\ ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I^c}\Bigr ). \end{aligned}$$

This proves (B). Next, using the properties (a) and (b), we can rewrite (25) as (recall that \(1\in I)\)

$$\begin{aligned} \bigotimes \nolimits _{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \,\, \mathbb{P }\Bigl ({\widetilde{X}}_{\alpha n}\in \ \cdot \ \Big |\ {\widetilde{X}}_\alpha \Bigr ) = \bigotimes \nolimits _{\alpha \in \mathbb{N }^{r_\ell -1}} \mathbb{P }\Bigl ({\widetilde{X}}_{\alpha 1}\in \ \cdot \ \Big |\ {\widetilde{X}}_\alpha \Bigr )^{\otimes I}, \end{aligned}$$

which proves that

$$\begin{aligned} \mathbb{P }\Bigl ( ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I} \in \ \cdot \ \Big |\ ({\widetilde{X}}_{\alpha n})_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I^c}\Bigr ) = \bigotimes \nolimits _{\alpha \in \mathbb{N }^{r_\ell -1}} \mathbb{P }\Bigl ({\widetilde{X}}_{\alpha 1}\in \ \cdot \ \Big |\ {\widetilde{X}}_\alpha \Bigr )^{\otimes I}. \end{aligned}$$
(26)

Using (24) and property (a), for any fixed \(\alpha \in \mathbb{N }^{r_\ell -1}\),

$$\begin{aligned} \mathbb{P }\Bigl ( {\widetilde{X}}_{\alpha 1} \in \ \cdot \ \Big |\ ({\widetilde{X}}_{\alpha n})_{ n\in I^c}\Bigr ) = \mathbb{P }\Bigl ( {\widetilde{X}}_{\alpha 1} \in \ \cdot \ \Big |\ {\widetilde{X}}_\alpha , ({\widetilde{X}}_{\alpha n})_{ n\in I^c}\Bigr ) = \mathbb{P }\Bigl ( {\widetilde{X}}_{\alpha 1} \in \ \cdot \ \Big |\ {\widetilde{X}}_\alpha \Bigr ). \end{aligned}$$

Combining the last two equations proves (A). The last claim follows from (26) and property (d) above. \(\square \)

3.2 Using the case of \({\fancyscript{A}}\times {\fancyscript{A}}(r_\ell -1)\)

Now that we have utilized the exchangeability with respect to the permutations of the index \(n\), we will change the focus and make the dependence of all random variables on the index \(\omega \in {\fancyscript{A}}\) explicit. For each \(\alpha \in \mathbb{N }^{r_\ell -1}\), let us denote

$$\begin{aligned} X^{+}_{\omega ,\alpha }&:= (X_{\omega ,\alpha n})_{n\in I^c} \quad \text{ for } \,\, \omega \in {\fancyscript{L}}, \end{aligned}$$
(27)
$$\begin{aligned} U^+_{\omega ,\alpha }&:= (u_{\omega ,\alpha n})_{n\in I^c} \quad \text{ for }\,\, \omega \in {\fancyscript{A}}, \end{aligned}$$
(28)
$$\begin{aligned} (U^+_\alpha , X^{+}_\alpha )&:= \bigl ((U^+_{\omega ,\alpha })_{\omega \in {\fancyscript{A}}},(X^{+}_{\omega ,\alpha })_{\omega \in {\fancyscript{L}}}\bigr ), \end{aligned}$$
(29)
$$\begin{aligned} (u_{\alpha n},X_{\alpha n})&:= \bigl ((u_{\omega ,\alpha n})_{\omega \in {\fancyscript{A}}},(X_{\omega ,\alpha n})_{\omega \in {\fancyscript{L}}}\bigr )\quad \text{ for } \, n\in I, \end{aligned}$$
(30)

and let us also denote

$$\begin{aligned} (U^+, X^{+})&:= (U^+_\alpha , X^{+}_\alpha )_{\alpha \in \mathbb{N }^{r_\ell -1}}, \end{aligned}$$
(31)
$$\begin{aligned} (u,X)&:= \bigl ((u_{\alpha n},X_{\alpha n})\bigr )_{\alpha \in \mathbb{N }^{r_\ell -1}, n\in I}. \end{aligned}$$
(32)

With this notation, we can rewrite (23) as

$$\begin{aligned} \mathbb{P }\Bigl ( (u,X) \in \ \cdot \ \Big |\ (U^+, X^{+}) \Bigr ) = \bigotimes \nolimits _{\alpha \in \mathbb{N }^{r_\ell -1}} \mathbb{P }\Bigl ( (u_{\alpha 1},X_{\alpha 1}) \in \ \cdot \ \Big |\ (U^+_\alpha , X^{+}_\alpha ) \Bigr )^{\otimes I}. \end{aligned}$$
(33)

We can also rewrite claims (B) and (C) in Lemma 1 as follows:

\((\hbox {B}^\prime )\) conditionally on \((U^+, X^{+})\), the arrays \((u,X)\) and \(U^-\) are independent;

\((\hbox {C}^\prime )\) the arrays \(u\) and \((U^+, X^{+})\) are independent.

We will now make our first appeal to the inductive hypothesis of Proposition 1 to describe the joint distribution of \((U^+, X^{+})\) and \(U^-\). Notice that \(U_{\omega ,\alpha }^+\) in (28) and some of the coordinates \(u_{\omega ,\alpha }\) in \(U^-\) in (21) are indexed by \(\omega \in {\fancyscript{A}}\) and \(\alpha \in \mathbb{N }^{r_\ell -1}\), so we will combine them and introduce a new array \(U=(U_{\omega , \alpha })_{\omega \in {\fancyscript{A}}, \alpha \in {\fancyscript{A}}(r_\ell -1)}\) such that

$$\begin{aligned} U_{\omega ,\alpha }&:= (u_{\omega ,\alpha }, U_{\omega ,\alpha }^+) \, \text{ for } \, \omega \in {\fancyscript{A}}, \alpha \in \mathbb{N }^{r_\ell -1}, \\ U_{\omega ,\alpha }&:= u_{\omega ,\alpha } \, \text{ for } \, \omega \in {\fancyscript{A}}, \alpha \in {\fancyscript{A}}(r_\ell -1)\setminus \mathbb{N }^{r_\ell -1}. \end{aligned}$$
(34)

Slightly abusing notation, this definition can be written as \(U = (U^-, U^+)\) and it is obvious that \(U\) is again an \(I\)-field. Let us also observe right away that, by property \((\hbox {B}^\prime )\),

$$\begin{aligned} \mathbb{P }\Bigl ( (u,X) \in \ \cdot \ \Big |\ (U^+, X^{+}) \Bigr ) = \mathbb{P }\Bigl ( (u,X) \in \ \cdot \ \Big |\ (U, X^{+}) \Bigr ). \end{aligned}$$
(35)

The following gives a description of the joint distribution of \((U^+, X^{+})\) and \(U^-\).

Lemma 2

Conditionally on the \(I\)-field \(U= (U^-, U^+)\),

$$\begin{aligned} X^{+} = \bigl (X^{+}_{\omega ,\alpha }\bigr )_{\omega \in {\fancyscript{L}},\alpha \in \mathbb{N }^{r_\ell -1}} \stackrel{d}{=} \Bigl (\xi \bigl (v_{p(\omega ,\alpha )}, U_{p(\omega ,\alpha )} \bigr ) \Bigr )_{\omega \in {\fancyscript{L}}, \alpha \in \mathbb{N }^{r_\ell -1}} \end{aligned}$$
(36)

for some measurable function \(\xi \) of its coordinates, where \(v_\beta \) are i.i.d. uniform random variables on \([0,1]\) indexed by \(\beta \in {\fancyscript{A}}\times {\fancyscript{A}}(r_\ell -1)\).

Proof

This is a consequence of the fact that \(U\) is an \(I\)-field, and the pair \(U\) and \((X^{+}_{\omega ,\alpha })_{\omega \in {\fancyscript{L}},\alpha \in \mathbb{N }^{r_\ell -1}}\) is, clearly, a hierarchically exchangeable coupling satisfying (18) with \(r_\ell \) replaced by \(r_{\ell }-1\). By the induction hypothesis, the claim follows. \(\square \)

Let us denote the array of random variables \(v\) on the right hand side of (36) by

$$\begin{aligned} V := \bigl (v_{\omega ,\alpha }\bigr )_{\omega \in {\fancyscript{A}}, \alpha \in {\fancyscript{A}}(r_\ell -1)}. \end{aligned}$$

Let us denote by \({\Xi }\) the full map on the right hand side of Eq. (36), which can then be written as

$$\begin{aligned} X^{+}\stackrel{d}{=}{\Xi }(V,U). \end{aligned}$$

Since all our random variables take values in standard Borel (or even compact) spaces, we can consider the regular conditional probability

$$\begin{aligned} \mathbb{P }\bigl (\ \cdot \ \big |\ x \bigr ) = \mathbb{P }\Bigl ( V \in \ \cdot \ \Big |\ \bigl (U, \Xi (V,U)\bigr ) = x \Bigr ). \end{aligned}$$
(37)

It is a standard fact in this case that if \(\mu \) is the law of \((U, X^{+})\) then, for \(\mu \)-almost all \(x\),

$$\begin{aligned} \mathbb{P }\Bigl ( \bigl \{V \ \big | \ (U, {\Xi }(V,U)) = x \bigr \} \ \Big |\ x \Bigr )=1. \end{aligned}$$
(38)

Now, using this conditional probability, let us couple the arrays \((u,X)\) in (32) and \(V\) conditionally independently given \((U,X^{+})\),

$$\begin{aligned}&\mathbb{P }\Bigl ( (u,X), V \in \ \cdot \ \Big |\ (U,X^{+})=x \Bigr ) \nonumber \\&\quad = \mathbb{P }\Bigl ( (u,X) \in \ \cdot \ \Big |\ (U,X^{+})=x \Bigr ) \times \mathbb{P }\Bigl ( V \in \ \cdot \ \Big |\ (U,X^{+})=x \Bigr ). \end{aligned}$$
(39)

This is a standard construction in probability, as well as in ergodic theory, where it is called a ‘relatively independent joining’: see, for instance, the third example in Section 6.1 of Glasner [7]. The triple

$$\begin{aligned} (u,X), V\quad \text{ and }\quad (U,X^{+}) \end{aligned}$$

is still hierarchically exchangeable, since this is true separately of both conditional distributions on the right hand side of (39) (for a much more detailed explanation see Lemma 2.3 in [10]). Having done this, we may henceforth regard all of these processes as defined on the same background probability space.
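This construction can be stated in generic notation (the symbols \(A, B, C\) here are generic placeholders, not objects from the paper): the relatively independent joining of two couplings \((A,C)\) and \((B,C)\) over their common factor \(C\) is the law of \((A,B,C)\) determined by

$$\begin{aligned} \mathbb{P }\bigl ( A \in da,\ B \in db,\ C \in dc \bigr ) = \mathbb{P }\bigl ( A \in da \ \big |\ C = c \bigr )\, \mathbb{P }\bigl ( B \in db \ \big |\ C = c \bigr )\, \mathbb{P }\bigl ( C \in dc \bigr ), \end{aligned}$$

so that \(A\) and \(B\) are conditionally independent given \(C\), while each pair \((A,C)\) and \((B,C)\) keeps its original joint distribution. In (39) this is applied with \(A = (u,X),\,B = V\) and \(C = (U,X^{+})\).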

Lemma 3

With the joint distribution constructed above,

$$\begin{aligned} \mathbb{P }\Bigl ((u,X)\in \ \cdot \ \Big |\ (U^+,X^{+}) \Bigr ) = \mathbb{P }\Bigl ((u,X)\in \ \cdot \ \Big |\ (V,U)\Bigr ). \end{aligned}$$
(40)

Notice that this implies that the property \((\hbox {C}^\prime )\) above can now be written as:

\((\hbox {C}^{{\prime }{\prime }})\) :

the arrays \(u\) and \((V, U)\) are independent.

Proof of Lemma 3

By (38), \(X^{+} = {\Xi }(V,U)\) with probability one, so \(X^{+}\) is almost surely a function of \(V\) and \(U\). Therefore,

$$\begin{aligned} \mathbb{P }\Bigl ((u,X)\in \ \cdot \ \Big |\ (V,U)\Bigr ) = \mathbb{P }\Bigl ((u,X)\in \ \cdot \ \Big |\ X^{+}, (V,U)\Bigr ). \end{aligned}$$

By the construction (39), \((u,X)\) and \(V\) are conditionally independent given \((U,X^{+})\), so this conditional distribution is equal to \(\mathbb{P }\bigl ((u,X)\in \ \cdot \ \big |\ (U,X^{+}) \bigr )\), and (35) finishes the proof. \(\square \)
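In summary, the proof amounts to the chain of equalities

$$\begin{aligned} \mathbb{P }\bigl ((u,X)\in \ \cdot \ \big |\ (V,U)\bigr ) = \mathbb{P }\bigl ((u,X)\in \ \cdot \ \big |\ X^{+}, (V,U)\bigr ) = \mathbb{P }\bigl ((u,X)\in \ \cdot \ \big |\ (U,X^{+})\bigr ) = \mathbb{P }\bigl ((u,X)\in \ \cdot \ \big |\ (U^{+},X^{+})\bigr ), \end{aligned}$$

where the three steps are justified by (38), by the conditional independence in (39), and by (35), respectively.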

Thus, we have replaced the conditioning on \((U^+,X^{+})\) on the left hand side of (33) with conditioning on \((V, U)\), and now we will do a similar substitution in each factor on the right hand side of (33). Recall the notation \(U_\alpha ^+\) and \(X_\alpha ^+\) in (29) and, for each \(\alpha \in \mathbb{N }^{r_\ell -1}\), let us denote

$$\begin{aligned} V_\alpha := \bigl (v_{p(\omega ,\alpha )}\bigr )_{\omega \in {\fancyscript{L}}} \quad \text{ and } \quad U_\alpha := \bigl (U_{p(\omega ,\alpha )} \bigr )_{\omega \in {\fancyscript{L}}}. \end{aligned}$$
(41)

Notice that one factor on the right hand side of (33) is \(\mathbb{P }\bigl ((u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \bigr | \ (U^+_\alpha ,X^{+}_\alpha ) \bigr )\) and we will now show the following.

Lemma 4

For each \(\alpha \in \mathbb{N }^{r_\ell -1}\), we have

$$\begin{aligned} \mathbb{P }\Bigl ((u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \Bigr | \ (U^+_\alpha , X^{+}_\alpha ) \Bigr ) = \mathbb{P }\Bigl ((u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \Bigr | \ (V_\alpha , U_\alpha ) \Bigr ). \end{aligned}$$
(42)

Proof

First of all, Eq. (33) implies that

$$\begin{aligned} \mathbb{P }\Bigl ((u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \Bigr | \ (U^+_\alpha , X^{+}_\alpha ) \Bigr ) = \mathbb{P }\Bigl ((u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \Bigr | \ (U^+, X^{+}) \Bigr ), \end{aligned}$$

which can be seen by considering the probabilities of cylindrical sets that depend only on \((u_{\alpha 1}, X_{\alpha 1})\). Using (40), we get

$$\begin{aligned} \mathbb{P }\Bigl ((u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \Bigr | \ (U^+_\alpha ,X^{+}_\alpha ) \Bigr ) = \mathbb{P }\Bigl ((u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \Bigr | \ (V, U) \Bigr ). \end{aligned}$$
(43)

We saw in the proof of Lemma 3 that \(X^{+} = {\Xi }(V,U)\) with probability one and, therefore,

$$\begin{aligned} X^{+}_\alpha = \bigl (X^+_{\omega ,\alpha }\bigr )_{\omega \in {\fancyscript{L}}} = \Bigl (\xi \bigl (v_{p(\omega ,\alpha )}, U_{p(\omega ,\alpha )} \bigr ) \Bigr )_{\omega \in {\fancyscript{L}}}. \end{aligned}$$

Using this and the fact that, by (34), \(U_\alpha ^+\) is also a function of \(U_\alpha \), we obtain the following inclusion of \(\sigma \)-algebras,

$$\begin{aligned} \sigma (U_\alpha ^+, X^{+}_\alpha )\subseteq \sigma (V_\alpha , U_\alpha ) \subseteq \sigma (V, U). \end{aligned}$$

Since, by (43), the conditional distributions given the two extreme \(\sigma \)-algebras coincide, they also coincide with the conditional distribution given the middle \(\sigma \)-algebra, and this finishes the proof. \(\square \)
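The sandwich argument used here is the following standard fact, stated in generic notation: if \(\mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \mathcal{F}_3\) and a random element \(Z\) satisfies \(\mathbb{P }(Z \in \ \cdot \ | \ \mathcal{F}_1) = \mathbb{P }(Z \in \ \cdot \ | \ \mathcal{F}_3)\) almost surely, then the tower property gives

$$\begin{aligned} \mathbb{P }\bigl (Z \in \ \cdot \ \big |\ \mathcal{F}_2\bigr ) = \mathbb{E }\Bigl [\, \mathbb{P }\bigl (Z \in \ \cdot \ \big |\ \mathcal{F}_3\bigr ) \,\Big |\ \mathcal{F}_2 \Bigr ] = \mathbb{E }\Bigl [\, \mathbb{P }\bigl (Z \in \ \cdot \ \big |\ \mathcal{F}_1\bigr ) \,\Big |\ \mathcal{F}_2 \Bigr ] = \mathbb{P }\bigl (Z \in \ \cdot \ \big |\ \mathcal{F}_1\bigr ), \end{aligned}$$

since the last conditional distribution is \(\mathcal{F}_1\)-measurable and hence \(\mathcal{F}_2\)-measurable.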

The preceding two lemmas allow us to rewrite (33) as

$$\begin{aligned} \mathbb{P }\Bigl ( (u, X)\in \ \cdot \ \Bigr | \ (V, U)\Bigr ) = \bigotimes \nolimits _{\alpha \in \mathbb{N }^{r_\ell -1}} \mathbb{P }\Bigl ( (u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \Bigr | \ (V_\alpha , U_\alpha ) \Bigr )^{\otimes I}. \end{aligned}$$
(44)

In other words, conditionally on \((V, U)\), the random variables \((u_{\alpha n}, X_{\alpha n})\) are independent over all \(\alpha \in \mathbb{N }^{r_\ell -1}\) and \(n\in I\) and, for a fixed \(\alpha \), have the same distribution,

$$\begin{aligned} \mathbb{P }\Bigl ( (u_{\alpha 1}, X_{\alpha 1})\in \ \cdot \ \Bigr | \ (V_\alpha , U_\alpha ) \Bigr ). \end{aligned}$$

By the property \((\hbox {C}^{{\prime }{\prime }})\) above, \(u_{\alpha 1}\) is independent of \((V_\alpha , U_\alpha )\), so our main concern now is to describe the conditional distribution of \(X_{\alpha 1}\) given \(u_{\alpha 1},\,V_\alpha \) and \(U_\alpha \).

3.3 Using the case of \(\ell -1\) trees

Lastly, we will use the induction hypothesis in Proposition 1 to describe the joint distribution of the processes \(X_{\alpha 1}, u_{\alpha 1}, V_\alpha \) and \(U_\alpha \) for a fixed \(\alpha \in \mathbb{N }^{r_\ell -1}\), so these are indexed by \({\fancyscript{A}}= {\fancyscript{A}}(r_1,\ldots ,r_{\ell -1})\). The process \(X_{\alpha 1}\) consists of the random variables \(X_{\omega ,\alpha 1}\) indexed by \(\omega \in {\fancyscript{L}}\). We will view the triple \((u_{\alpha 1}, V_\alpha ,U_\alpha )\) as a new \(I\)-field that consists of the random variables

$$\begin{aligned} T_\omega ^\alpha := \Bigl (u_{\omega , \alpha 1}, \bigl (v_{(\omega ,\beta )}\bigr )_{\beta \in p(\alpha )}, \bigl (U_{(\omega ,\beta )} \bigr )_{\beta \in p(\alpha )} \Bigr ) \end{aligned}$$
(45)

indexed by \(\omega \in {\fancyscript{A}}\). Here, we relabeled the random variables by collecting all the coordinates of \(U_\alpha \) and \(V_\alpha \) that depend on a fixed \(\omega \in {\fancyscript{A}}\). By the property \((\hbox {C}^{{\prime }{\prime }})\) above, the array \(T^\alpha := (T^\alpha _\omega )_{\omega \in {\fancyscript{A}}}\) is again an \(I\)-field, and it is clear that it forms a hierarchically exchangeable coupling with the array \(X_{\alpha 1}\). The induction hypothesis in Proposition 1, now used with \(r_\ell =0\), implies the following.

Lemma 5

There exists a measurable function \(\tau \) such that, conditionally on \(T^\alpha \),

$$\begin{aligned} \bigl (X_{\omega ,\alpha 1}\bigr )_{\omega \in {\fancyscript{L}}} \stackrel{d}{=} \bigl (\tau (w_{p(\omega )},T^\alpha _{p(\omega )})\bigr )_{\omega \in {\fancyscript{L}}}, \end{aligned}$$
(46)

where \(w\) is an array of i.i.d. random variables uniform on \([0,1]\) indexed by \(\omega \in {\fancyscript{A}}\), independent of everything else.

This allows us to finish the proof of Proposition 1. First of all, let us notice that we can write

$$\begin{aligned} T^\alpha _{p(\omega )} = \bigl (u_{p(\omega )\times \{\alpha 1\}}, v_{p(\omega ,\alpha )}, U_{p(\omega ,\alpha )} \bigr ). \end{aligned}$$

Combining Lemma 5 with (44), we proved that, conditionally on the arrays \(u, V\) and \(U\), we can generate the random variables \(X_{\omega , \alpha n}\) for \(\omega \in {\fancyscript{L}}, \alpha \in \mathbb{N }^{r_\ell -1}, n\in I\) in distribution by

$$\begin{aligned} X_{\omega , \alpha n} = \tau \bigl (v_{p(\omega )\times \{\alpha n\}},u_{p(\omega )\times \{\alpha n\}}, v_{p(\omega ,\alpha )}, U_{p(\omega ,\alpha )} \bigr ), \end{aligned}$$
(47)

where, for each \(\alpha \in \mathbb{N }^{r_\ell -1}\) and \(n\in I\), we used the random variables \(v_{p(\omega )\times \{\alpha n\}}\) in place of an independent copy of \(w_{p(\omega )}\) in (46). Note first that

$$\begin{aligned} \bigl (v_{p(\omega )\times \{\alpha n\}}, v_{p(\omega ,\alpha )} \bigr ) = v_{p(\omega , \alpha n)}. \end{aligned}$$

If we recall the definition of the process \(U\) in (34), we see that for \(\alpha \in \mathbb{N }^{r_\ell -1},\,U_{p(\omega ,\alpha )}\) consists of two parts, \(u_{p(\omega ,\alpha )}\) and \(U^+_{p(\omega )\times \{\alpha \}}\), and the first one can be combined with \(u_{p(\omega )\times \{\alpha n\}}\) to give

$$\begin{aligned} \bigl (u_{p(\omega )\times \{\alpha n\}}, u_{p(\omega ,\alpha )} \bigr ) = u_{p(\omega ,\alpha n)}. \end{aligned}$$

Then, (47) can be rewritten as (slightly abusing notation)

$$\begin{aligned} X_{\omega , \alpha n} = \tau \bigl (u_{p(\omega , \alpha n)}, v_{p(\omega , \alpha n)}, U^+_{p(\omega )\times \{\alpha \}} \bigr ). \end{aligned}$$
(48)

Finally, note that we consider the random variables \(X_{\omega , \alpha n}\) with the index \(n\in I\), while all the random variables \(U^+_{\omega , \alpha }\) in (28) were defined in terms of the random variables \(u_{\omega ,\alpha n}\) with the index \(n\in I^c\), so now they are not viewed as a part of our \(I\)-field \((u,U^-)\). Therefore, by redefining the function \(\tau \), we can absorb the randomness of \(U^+_{p(\omega )\times \{\alpha \}}\) into \(v_{p(\omega , \alpha n)}\) to get

$$\begin{aligned} X_{\omega , \alpha n} = \tau \bigl (u_{p(\omega , \alpha n)}, v_{p(\omega , \alpha n)} \bigr ). \end{aligned}$$
(49)

This completes the induction step in Proposition 1, and finishes the proof of Theorem 2. \(\square \)

One can also now formulate a conditional version of Theorem 2 as follows. Examples of hierarchically exchangeable pairs of processes can be constructed in the form

$$\begin{aligned} (Y_\alpha ,X_\alpha )=\bigl (\sigma _1(u_{p(\alpha )}), \sigma _2(u_{p(\alpha )},v_{p(\alpha )})\bigr ), \end{aligned}$$
(50)

for two measurable functions \(\sigma _1, \sigma _2\) and independent \(I\)-fields \(u\) and \(v\) of uniform random variables on \([0,1]\).
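As a toy illustration of the construction (50) (not part of the proof: the particular functions `sigma1`, `sigma2` and the per-vertex pseudo-random field below are hypothetical stand-ins for the measurable functions and the \(I\)-fields \(u, v\)), one can generate such a pair of processes on a computer, with both coordinates reading the uniforms along the path \(p(\alpha )\):

```python
import random

def path(alpha):
    """All vertices on the path from the root () to alpha, i.e.
    p(alpha) = {(), (n1,), (n1, n2), ..., alpha}."""
    return [alpha[:k] for k in range(len(alpha) + 1)]

def uniform_field(name):
    """A field of uniforms indexed by tree vertices: each vertex gets one
    deterministic pseudo-random value in [0, 1), playing the role of an
    i.i.d. uniform attached to that vertex."""
    cache = {}
    def get(vertex):
        if vertex not in cache:
            cache[vertex] = random.Random(f"{name}:{vertex}").random()
        return cache[vertex]
    return get

u = uniform_field("u")  # the field shared by Y and X
v = uniform_field("v")  # the extra field used only by X

# Toy stand-ins for the measurable functions sigma_1, sigma_2 in (50).
def sigma1(us):
    return sum(us) / len(us)

def sigma2(us, vs):
    return max(us) * min(vs)

def sample_pair(alpha):
    """Generate (Y_alpha, X_alpha) as in (50) from the uniforms along p(alpha)."""
    us = [u(b) for b in path(alpha)]
    vs = [v(b) for b in path(alpha)]
    return sigma1(us), sigma2(us, vs)
```

Indices sharing a common ancestor automatically share the uniforms along the common part of their paths, which is the source of the hierarchical dependence; relabeling the children of any fixed vertex leaves the joint distribution unchanged, since the per-vertex values play the role of i.i.d. uniforms.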

Theorem 4

Any hierarchically exchangeable array of pairs \((Y_\alpha ,X_\alpha )_{\alpha \in \mathbb{N }^{r_1}\times \cdots \times \mathbb{N }^{r_\ell }}\) can be generated in distribution as in (50) for some measurable functions \(\sigma _1\) and \(\sigma _2\).

Proof

This follows by first applying Theorem 2 to represent

$$\begin{aligned} (Y_\alpha )_{\alpha } \stackrel{d}{=} \bigl (\sigma _1(u_{p(\alpha )}) \bigr )_\alpha , \end{aligned}$$

then forming the coupling of the processes \(X\) and \(u\) conditionally independently over \(Y\), and then applying Proposition 1 to represent the joint distribution of \((u,X)\). \(\square \)