1 Introduction

Condensation phenomena in stochastic particle systems (SPS) continue to be a topic of major research interest. They can be caused by spatial inhomogeneities (see e.g. [1, 2] and references therein) or attractive particle interaction in spatially homogeneous systems, which is the focus of this paper. If the total density of particles exceeds a critical value, the system phase separates into a homogeneous bulk and a condensed phase, with a finite fraction of the total mass concentrating in a vanishing volume fraction. First introduced in [3], zero-range processes and related models provided a first example of condensation in homogeneous SPS [4,5,6]. On the level of stationary distributions condensation is characterized by heavy-tail behaviour of stationary weights as first noted in [7, 8], which has been used to study the phenomenon in the context of equivalence of ensembles and large deviations [9,10,11].

The inclusion process was introduced in [12] as a discrete dual to a model of heat conduction, and was later studied as an interesting model of stochastic transport in its own right [13,14,15]. It is a natural bosonic counterpart to the exclusion process where particles are subject to an attractive inclusion interaction in addition to independent diffusive motion. It can also be interpreted as a multi-species version of the Moran model of population genetics [16], where the inclusion interaction corresponds to selection, and diffusion to mutation dynamics. The inclusion process is part of a larger class of models introduced in [17] that exhibit factorized stationary distributions, which has recently been extended [18]. Condensation in the inclusion process was first studied in [19] for inhomogeneous systems. Condensation in homogeneous systems only occurs if the diffusion strength vanishes with the system size. While such scaling of system parameters can lead to non-equivalence of ensembles and discontinuous behaviour, as established for a toy zero-range model in [20, 21], this is not the case for the inclusion process, and small diffusion or mutation rates are in fact very natural in many applications. The dynamics on various time scales have been established on a rigorous level in [22, 23], restricted to finite lattices in the limit of diverging particle density. In the thermodynamic limit with a finite limiting density there are only heuristic results so far, covering the dynamics of condensation in the inclusion process [24] and extensions with stronger particle interactions and instantaneous condensation [25, 26].

In particular, the stationary behaviour of the inclusion process in the thermodynamic limit has not been characterized so far, which is the main aim of this paper. We establish the equivalence of ensembles, and show that for vanishing diffusion strength the inclusion process exhibits condensation for any positive particle density. While the bulk of the system is empty, the condensed phase can exhibit an interesting hierarchical structure following the Poisson–Dirichlet distribution. The latter was originally introduced in the context of population genetics [27, 28], and has later been identified as the generic stationary distribution of split-merge dynamics [29, 30], which is related to its appearance in cycle length distributions of random permutations [31,32,33]. It has further been observed (though not identified) more recently in systems of interacting diffusions [34, 35], but to our knowledge is a novelty in the context of condensation in SPS. In general, the condensed phase in SPS with stationary product distributions concentrates on a single lattice site [7, 8, 10, 36]. A spread over multiple sites has only been observed in versions of zero-range processes which include an effective (soft) cut-off for site occupation numbers [37, 38], or in models with pair-factorized stationary states [39, 40] where it occurs naturally due to spatial correlations. Poisson–Dirichlet statistics arise when the diffusion parameter in the inclusion process scales with the inverse system size, and we also establish complete condensation for smaller diffusion where all particles concentrate on a single site, and a universal exponential law for intermediate scales.

Our main results on the structure of the condensed phase are derived using size-biased sampling of occupation numbers, which is related in a natural way to the Poisson–Dirichlet distribution as reviewed in Sect. 3.2. While this point of view is standard in population genetics (see e.g. [41]), this approach also provides a strong tool to study the condensed phase in SPS where it has not been used so far. After introducing the basic notation and concepts in Sect. 2, we derive our main results on condensation and the typical structure of the condensed phase for the inclusion process in Sect. 3. Our results are rigorous and derivations are presented in a general, transferable way, and we show simulation data for illustration. We include results on large deviations of the condensed phase in Sect. 4, and conclude with a discussion of the main points and relations to other models in Sect. 5. In Appendix 1 we show that under a general definition of condensation the system phase separates into a homogeneous bulk and a condensed phase, and that condensation implies divergence of higher moments. In Appendix 2 we comment on Monte-Carlo dynamics to generate stationary samples, and on differences between one-dimensional and mean-field geometries.

2 Mathematical Setting

2.1 Condensation in Homogeneous Particle Systems

We study stochastic particle systems (SPS) on a finite set of spatial locations/sites \(\Lambda \) of size \(|\Lambda |=L\), which can for example be a regular lattice with periodic or closed boundaries. The system has a fixed, finite number N of particles. We denote configurations by \(\eta =(\eta _x :x\in \Lambda )\), \(\eta _x \in {\mathbb {N}}_0\), and the state space \(E_{L,N} =\big \{\eta :\sum _{x\in \Lambda } \eta _x =N\big \}\) is the set of all configurations. The dynamics should be irreducible on \(E_{L,N}\), so that the process has a unique (canonical) stationary distribution \(\pi _{L,N}\). We assume that \(\pi _{L,N}\) is spatially homogeneous, i.e. the single-site marginals \(\pi _{L,N} [\eta _x \in .]\) do not depend on the site x, and in particular this implies that the density (the expected number of particles per site) is given as

$$\begin{aligned} \langle \eta _x \rangle _{L,N} {:}{=} \sum _{n=1}^N n\,\pi _{L,N} [\eta _x =n ] =N/L. \end{aligned}$$

We are interested in large-scale condensation phenomena of the system in the thermodynamic limit \(L,N\rightarrow \infty \) such that the density converges as \(N/L\rightarrow \rho \ge 0\), which in the following we often denote by \(\lim _{N/L\rightarrow \rho }\) to simplify notation. We assume that in this limit finite marginals of \(\pi _{L,N}\) converge, and we denote the limiting single site marginal as a distribution on \({\mathbb {N}}_0\) by

$$\begin{aligned} \nu _\rho {:}{=}\lim _{N/L\rightarrow \rho } \pi _{L,N} [\eta _x \in .]. \end{aligned}$$

This convergence of distribution functions is equivalent to weak convergence, i.e.

$$\begin{aligned} \big \langle f(\eta _x )\big \rangle _{L,N} \rightarrow \langle f\rangle _\rho \quad \text{ as } L,N\rightarrow \infty ,\quad N/L\rightarrow \rho \end{aligned}$$

for all \(x\in \Lambda \) and bounded, continuous test functions \(f\in C_b ({\mathbb {N}}_0 )\). By (1) the first moment converges, \(\langle \eta _x\rangle _{L,N}\rightarrow \rho \), in the thermodynamic limit, and by Fatou’s Lemma the first moment of the limiting distribution satisfies

$$\begin{aligned} \rho _{b} {:}{=}\langle \eta _x \rangle _\rho \le \rho . \end{aligned}$$

This is usually called the background or bulk density (indicated by the subscript) as is explained below. Strict inequality above is possible since \(f(\eta _x )=\eta _x\) is an unbounded function on \({\mathbb {N}}_0\), and implies that locally the system loses mass in the limit, providing the following standard definition of condensation.

Definition 1

A system with canonical distributions \(\pi _{L,N}\) exhibits condensation in the thermodynamic limit \(N/L\rightarrow \rho \) with background density \(\rho _{b}\) as in (4), if \(\nu _\rho \) exists as defined in (2) and \(\rho _b <\rho \). A system with \(\rho _b =0\) is said to exhibit complete condensation if

$$\begin{aligned} \pi _{L,N} \Big [\max _{x\in \Lambda } \eta _x =N\Big ] \rightarrow 1\quad \text{ as } L,N\rightarrow \infty ,\ N/L\rightarrow \rho , \end{aligned}$$

i.e. typically all particles in the system concentrate on a single lattice site.

If \(\nu _\rho \) exists for all \(\rho \ge 0\), the system is said to exhibit a condensation transition with critical density \(\rho _c \ge 0\), if

$$\begin{aligned} \rho _b \left\{ \begin{array}{cl} =\rho &{}, \text{ for } \text{ all } \rho<\rho _c\\ <\rho &{}, \text{ for } \text{ all } \rho >\rho _c\end{array}\right. . \end{aligned}$$

Condensation in the above setting has been established in various SPS, including zero-range processes and related models (see e.g. [42, 43] and references therein). It has been shown on a case-by-case basis that \(\rho _b\) is monotone increasing with \(\rho \) and there exists a unique critical density \(\rho _c \in [0,\infty ]\) in the sense of (6). One sufficient general condition is monotonicity of the dynamics for the underlying particle system. But in principle more complicated behaviour such as non-monotonicity of \(\rho _b\) cannot be ruled out, even though we are not aware of any generic examples in the thermodynamic limit. For condensation on finite lattices possible non-monotonicity of \(\rho _b\) has been established and discussed e.g. in [44, 45] and references therein.

As is discussed in more detail in Appendix 1, the interpretation of \(\rho _b <\rho \) is that the system phase separates into a homogeneous bulk phase and a condensed phase. The latter concentrates on a vanishing volume fraction but contains a non-zero fraction \(\rho -\rho _b >0\) of the total mass in the system, and is usually simply called the condensate. Depending on the specific example and the nature of \(\pi _{L,N}\) the condensate may cover only a single lattice site (see e.g. [10, 36]) or a sub-extensive volume [39, 40]. In most cases the bulk density \(\rho _b =\rho _c\) is equal to the critical one, but there are also models with \(\rho _b <\rho _c\), such as zero-range toy models with size-dependent rates [20, 21] which introduce an effective long-range interaction and lead to non-equivalence of ensembles. Complete condensation has been established for particular zero-range processes in [7, 46] and for inclusion processes in a fixed volume in [23].

As we show in Appendix 1 in Proposition 6, condensation as defined above implies in particular divergence of higher moments \(\langle \eta _x^a \rangle _{L,N}\) with \(a>1\). This has been used in some papers as a definition of condensation often using \(a=2\) [47, 48]. The converse does not hold, since moments of limiting distributions \(\nu _\rho \) with heavy tails can diverge also in the absence of phase separation, so we stick to Definition 1 to characterize condensation. For condensing systems, divergence of higher moments is due to the contribution of diverging occupation numbers in the condensed phase which is not described by the limiting distribution \(\nu _\rho \).

2.2 Models with Stationary Product Measures

From now on we focus on stochastic particle systems which are defined by a generator of the form

$$\begin{aligned} {\mathcal {L}}f(\eta )=\sum _{x,y\in \Lambda } p(x,y) u(\eta _x ,\eta _y )\big ( f(\eta ^{xy} )-f(\eta )\big ), \end{aligned}$$

for continuous test functions \(f\in C(E_{L,N} )\). This defines a continuous-time Markov process on the state space \(E_{L,N}\) jumping from configurations \(\eta \) to \(\eta ^{xy}\) where one particle moves from site x to y. The spatial dependence of the rates is given by a multiplicative factor \(p(x,y)\), which we take to be an irreducible transition kernel for a single particle on \(\Lambda \). The interaction between particles is determined by the function u which depends only on the occupation numbers of departure and target site of a jump event. To ensure irreducibility of the process on \(E_{L,N}\) we assume

$$\begin{aligned} u(m,n)\ge 0\quad \text{ and }\quad u(m,n)=0 \text{ if } \text{ and } \text{ only } \text{ if } m=0. \end{aligned}$$

To ensure spatial homogeneity at stationarity we assume

$$\begin{aligned} \sum _{x\in \Lambda } p(x,y)=\sum _{x\in \Lambda } p(y,x)\quad \text{ for } \text{ all } y\in \Lambda , \end{aligned}$$

which is a slight generalization of translation invariance on regular lattices. This type of model was first introduced in the seminal paper [17]. It is well known (see also [2, 18]) that such models exhibit stationary product measures if and only if

$$\begin{aligned} \frac{u(n+1,m)}{u(m+1,n)} = \frac{u(n+1,0)}{u(1,n)}\frac{u(1,m)}{u(m+1,0)} \quad \text {for all}\; n,m\ge 0, \end{aligned}$$

and either \(p(\cdot ,\cdot )\) is symmetric, or

$$\begin{aligned} u(n,m) - u(m,n) = u(n,0)-u(m,0) \quad \text {for all }\ n,m\ge 0. \end{aligned}$$

In this case, normalizing the weights \(\displaystyle w(n) = \prod _{k=1}^n\frac{u(1,k-1)}{u(k,0)}\) leads to product distributions

$$\begin{aligned} \nu _\phi ^L [d\eta ]=\Bigg (\frac{1}{z(\phi )}\Bigg )^L \prod _{x\in \Lambda } w(\eta _x ) \phi ^{\eta _x} \, d\eta \quad \text{ with }\quad z(\phi )=\sum _{n\in {\mathbb {N}}_0 } w(n)\,\phi ^n, \end{aligned}$$

which are stationary for all \(\phi \ge 0\) such that the normalizing partition function \(z(\phi )<\infty \). Note that these ‘grand-canonical’ distributions are supported on the extended state space \(E_L =\big \{ \eta :\eta _x \ge 0\big \}\) without fixing the total number of particles. The expected number of particles per site is given as a monotone increasing function of \(\phi \) as

$$\begin{aligned} R(\phi ){:}{=}\langle \eta _x \rangle _\phi =\phi \,\partial _\phi \log z(\phi ). \end{aligned}$$

For such processes we have explicit representations of the canonical distributions as conditional grand-canonical distributions

$$\begin{aligned} \pi _{L,N} =\nu ^L_\phi \Bigg [\, \cdot \,\Big |\, \sum _{x\in \Lambda } \eta _x =N\Bigg ] , \end{aligned}$$

which in fact do not depend on the choice of \(\phi >0\). This leads to the useful form

$$\begin{aligned} \pi _{L,N} [d\eta ]=\frac{1}{Z_{L,N}}\prod _{x\in \Lambda } w(\eta _x ) \,\delta \Bigg (\sum _x \eta _x ,N\Bigg )\, d\eta \quad \text{ with }\quad Z_{L,N} =\sum _{\eta \in E_{L,N}} \prod _{x\in \Lambda } w(\eta _x ) \end{aligned}$$

with canonical partition function \(Z_{L,N}\). This implies in particular that for \(\rho <\rho _c\) the limits (2) of single-site marginals are given by the marginal \(\nu _\phi ^1\) with \(\phi \ge 0\) such that \(R(\phi )=\rho \).

For models of the above type, the condensation transition as given in Definition 1 is equivalent to existence of \(\phi _c <\infty \) such that \(z(\phi )=\infty \) for all \(\phi >\phi _c\), and \(R(\phi )\rightarrow \rho ^* <\infty \) as \(\phi \rightarrow \phi _c\) (see e.g. [2] for a detailed discussion). Examples of this type studied so far include zero-range processes with \(u(m,n)=u(m)\) and decreasing rates u(m) [5, 8, 36], where \(\rho ^* =\rho _c =\rho _b\). If the rates can depend on the system size, the transition can also be discontinuous with \(\rho _b<\rho _c <\rho ^*\) where grand-canonical distributions with densities in the range \((\rho _c ,\rho ^*)\) are metastable [20, 21]. More recently, condensation has also been studied for inclusion processes [19] and explosive condensation models [25, 26, 43] with rates of the form

$$\begin{aligned} u(m,n)=m^\gamma \big ( d +n^\gamma \big ),\quad \gamma \ge 1,\ d>0\ . \end{aligned}$$

If \(\gamma >2\) the system exhibits a condensation transition for all \(d>0\) with \(\rho _c >0\). For inclusion processes we have \(\gamma =1\), and this case is covered in more detail in Sect. 3.1. In all generic systems with stationary product measures studied so far, we have

$$\begin{aligned} \frac{1}{L}\max _{x\in \Lambda } \eta _x \rightarrow \rho -\rho _b \quad \text{ as } L,N\rightarrow \infty ,\quad N/L\rightarrow \rho , \end{aligned}$$

and the condensed phase concentrates on a single lattice site. In Sect. 3 we will see for the inclusion process that the condensed phase can extend over more than one site and have an interesting hierarchical structure, which has not been observed for condensing particle systems so far.

2.3 Size-Biased Sampling

Since the condensed phase concentrates on a vanishing volume fraction, the limiting marginal probabilities for a fixed number k of occupation numbers converge to the distribution of the bulk in a condensed system. As explained above, for models with stationary product measures this is usually given by the maximal product measure with critical density \(\rho _c =R(\phi _c )\) and we have (cf. [10])

$$\begin{aligned} \pi _{L,N} [\eta _{x_1} =n_1 ,\ldots ,\eta _{x_k} =n_k ] \rightarrow \prod _{i=1}^k \nu _{\phi _c} (n_i ), \end{aligned}$$

for all \(x_1 ,\ldots ,x_k \in \Lambda \) and \(n_1 ,\ldots ,n_k \ge 0\). This asymptotic equivalence of canonical and grand canonical ensembles (distributions) has been established for a large class of models [2, 9], and implies weak convergence w.r.t. local, bounded test functions as in (3).

Since it contains a non-zero fraction of all particles, the distribution of the condensed phase can be accessed via size-biased permutations of particle configurations. This can be interpreted as picking a particle uniformly at random and sampling the occupation number \(\eta _x\) at its location x. The larger \(\eta _x\), the more likely it is to pick site x in this way. Formally, this can be defined recursively (see e.g. [41, Sect. 2.4]).

Definition 2

For given \(\eta \in E_{L,N}\) pick a random permutation \(\sigma :\Lambda \rightarrow \Lambda \) of the lattice indices as

$$\begin{aligned} \sigma (1)&=x\quad \text{ with } \text{ probability }\quad \frac{\eta _x}{N}\ ,\quad x\in \Lambda \ ;\\ \sigma (2)&=x\quad \text{ with } \text{ probability }\quad \frac{\eta _x}{N-\eta _{\sigma (1)}},\quad x\in \Lambda \setminus \{\sigma (1)\}\ ;\\&\,\,\ldots \quad \text{ and } \text{ so } \text{ on }. \end{aligned}$$

Then we call \({\tilde{\eta }} =\big ({\tilde{\eta }}_1 ,\ldots ,{\tilde{\eta }}_L \big ) {:}{=}\big (\eta _{\sigma (1)},\ldots ,\eta _{\sigma (L)}\big )\) a size-biased permutation of \(\eta \).
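The recursive construction in Definition 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `size_biased_permutation` is ours, and the uniform ordering of empty sites is an assumption, since Definition 2 does not assign them positive probability.

```python
import random

def size_biased_permutation(eta, rng=random):
    """Return a size-biased permutation of the occupation numbers eta (Definition 2)."""
    remaining = list(range(len(eta)))
    order = []
    # Sites are drawn without replacement, each with probability proportional
    # to its occupation number among the sites not yet chosen.
    while remaining:
        weights = [eta[x] for x in remaining]
        if sum(weights) == 0:
            # Only empty sites are left; their order is not fixed by Definition 2,
            # so we shuffle them uniformly (an assumption of this sketch).
            rng.shuffle(remaining)
            order.extend(remaining)
            break
        idx = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        order.append(remaining.pop(idx))
    return [eta[x] for x in order]
```

Occupied sites always precede empty ones in the output, and sites with large occupation numbers tend to appear first.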

For models with canonical distributions of the form (12), the distribution of the first size-biased marginal is given by

$$\begin{aligned} \pi _{L,N}[{\tilde{\eta }}_1 =n] =\frac{L}{N} n\pi _{L,N}[\eta _1 =n]= \frac{L}{N}nw(n)\frac{Z_{L-1,N-n}}{Z_{L,N}}, \end{aligned}$$

where the stationary weight w(n) is re-weighted proportional to n and re-normalized. Here and in the following we use the convention \(Z_{L,k} =0\) for all \(k<0\), so we can omit indicator functions of the form \(\mathbb {1}_{n\le N}\) to simplify notation. Note that the first identity in (14) with the re-weighted marginal probability holds in general, but the second one only because \(\pi _{L,N}\) is a conditional product measure of the form (12). For a two-site size-biased marginal we then have

$$\begin{aligned} \pi _{L,N}\big [{\tilde{\eta }}_1=n_1, {\tilde{\eta }}_2=n_2\big ]&=\pi _{L,N}\big [{\tilde{\eta }}_2=n_2 \big | {\tilde{\eta }}_1=n_1 \big ]\,\pi _{L,N}[{\tilde{\eta }}_1=n_1]\\&=\frac{L-1}{N-n_1}\frac{Z_{L-2,N-n_1-n_2}}{Z_{L-1,N-n_1}}n_2w(n_2) \frac{L}{N}\frac{Z_{L-1,N-n_1}}{Z_{L,N}}n_1w(n_1)\\&=\frac{L(L-1)}{N(N-n_1)}n_1n_2w(n_1)w(n_2)\frac{Z_{L-2,N-n_1-n_2}}{Z_{L,N}}. \end{aligned}$$

Generalizing to the k-site case we get

$$\begin{aligned}&\pi _{L,N}\big [{\tilde{\eta }}_1=n_1, {\tilde{\eta }}_2=n_2,\ldots ,{\tilde{\eta }}_k=n_k \big ]\nonumber \\&\quad =\frac{L(L-1)\cdots (L-k+1)}{N(N-n_1)\cdots (N-\sum _{i=1}^{k-1} n_i)}\prod _{i=1}^k (n_iw(n_i))\frac{Z_{L-k,N-\sum _{i=1}^k n_i}}{Z_{L,N}}, \end{aligned}$$

which includes \(k=L\) to get the full distribution of \({\tilde{\eta }}\) with \(Z_{0,n}=1\) for all \(n\in \{ 0,\ldots ,N\}\). Note that due to size-biased re-ordering, the distribution of \({\tilde{\eta }}\) and its marginals is of course not spatially homogeneous.

To our knowledge, essentially all previous studies of condensation in homogeneous particle systems focus instead on the (decreasing) order statistics

$$\begin{aligned} {\hat{\eta }} =\big (\eta _{(1)} ,\ldots ,\eta _{(L)}\big )\quad \text{ where }\quad \eta _{(1)} \ge \eta _{(2)}\ge \ldots \ge \eta _{(L)}, \end{aligned}$$

and in particular the maximum occupation number \(\eta _{(1)}\) (see e.g. [10, 21, 49, 50]). We will see below how this is related to size-biased sampling, and that the latter is very suitable to study condensation in systems with \(\rho _b =0\) such as the inclusion process and related models. A size-biased sampling approach can also be useful in models with \(\rho _b >0\) to study the dynamics of the condensed phase and phase separation as recently shown in [51].

2.4 The Poisson–Dirichlet and GEM Distribution

The Poisson–Dirichlet distribution has been introduced by Kingman in the context of population genetics [27, 28] and has since occurred in a variety of applications, such as split-merge dynamics [29, 30] and random permutations [31,32,33]. It is a one-parameter family of probability measures defined on the set of ordered partitions of the unit interval

$$\begin{aligned} \nabla {:}{=}\Bigg \{(v_1,v_2,\ldots )\in [0,1]^\infty : v_1\ge v_2 \ge \cdots \ge 0, \sum _{j=1}^{\infty } v_j=1\Bigg \}. \end{aligned}$$

It can be characterized for instance as a scaling limit of Dirichlet random variables which form a finite partition of [0, 1], or via scale invariant Poisson processes (see Chap. 2 in [41] for details). One of the most accessible characterizations in terms of practical use is related to the GEM distribution, named in [52] after Griffiths [53, 54], Engen [55] and McCloskey [56], which is defined as follows. Let \(U_1, U_2, \ldots \) be i.i.d. Beta(\(1,\alpha \)) random variables with \(\alpha >0\), which take values on [0, 1] with PDF \(\alpha (1-x)^{\alpha -1}\), and the uniform distribution as a special case for \(\alpha =1\). On the set of (unordered) partitions

$$\begin{aligned} \Delta {:}{=}\Bigg \{(v_1,v_2,\ldots )\in [0,1]^\infty : \sum _{j=1}^{\infty } v_j=1\Bigg \} \end{aligned}$$

define a random element \(V{:}{=}(V_1,V_2,\ldots )\in \Delta \) recursively via

$$\begin{aligned} V_1=U_1, V_2=(1-U_1)U_2, V_3=(1-U_1)(1-U_2)U_3, \ldots , \end{aligned}$$

which corresponds intuitively to breaking off a fraction \(U_1\) from the unit interval and continuing this process recursively with the remaining interval of length \(1-U_1\). The law of V on \(\Delta \) is called the Griffiths-Engen-McCloskey distribution GEM(\(\varvec{\alpha }\)), and the corresponding order statistics \({\hat{V}}\) on \(\nabla \) has Poisson–Dirichlet distribution PD(\(\varvec{\alpha }\)). Alternatively, given a PD(\(\alpha \)) distributed partition V on \(\nabla \), its size-biased permutation \({\tilde{V}}\) has GEM(\(\alpha \)) distribution on \(\Delta \) (see e.g. [41] for details).

Note that the construction (17) leads to a hierarchical structure of a GEM(\(\alpha \)) partition V, and the parameter \(\alpha >0\) controls the expected size of the components. The expectation of Beta\((1,\alpha )\)-distributed random variables \(U_i\) is \(\frac{1}{1+\alpha }\), so for small \(\alpha \) the size of the first component \(V_1\) is larger and the hierarchy stronger. For larger \(\alpha \) the expected sizes of the components are more similar, but always show a strict order since

$$\begin{aligned} \Big \langle 1-\sum _{k=1}^n V_k\Big \rangle _{\mathrm {GEM}(\alpha )}=\Big \langle \prod _{k=1}^n(1-U_k)\Big \rangle _{\mathrm {GEM}(\alpha )} =\left( \frac{\alpha }{1+\alpha }\right) ^n \rightarrow 0\quad \text{ as } n\rightarrow \infty . \end{aligned}$$

This shows that in fact \(V\in \Delta \) and that the expected component sizes of \(V_k\) vanish as \(k\rightarrow \infty \), and is also a useful relation to numerically test for GEM distributions (see Sect. 3.4).
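As an illustration of the stick-breaking construction (17) and of the numerical test suggested by the relation above, the following minimal Python sketch samples the first n components of a GEM(\(\alpha \)) partition and estimates the expected unassigned mass by Monte Carlo. The function names, sample size and seed are our own choices and not part of the original construction.

```python
import random

def sample_gem(alpha, n, rng=random):
    """First n components of a GEM(alpha) partition via stick-breaking (17)."""
    v, rest = [], 1.0
    for _ in range(n):
        u = rng.betavariate(1.0, alpha)  # U_k ~ Beta(1, alpha)
        v.append(rest * u)               # V_k = U_k * prod_{j<k} (1 - U_j)
        rest *= 1.0 - u                  # mass not yet assigned after k steps
    return v, rest

def mean_leftover(alpha, n, samples=20000, seed=1):
    """Monte Carlo estimate of <1 - sum_{k<=n} V_k>, to compare with (alpha/(1+alpha))**n."""
    rng = random.Random(seed)
    return sum(sample_gem(alpha, n, rng)[1] for _ in range(samples)) / samples
```

For example, `mean_leftover(1.0, 3)` should be close to \((1/2)^3 =0.125\) up to statistical fluctuations.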

Carrying over the product topology from \([0,1]^\infty \), weak convergence of probability distributions on \(\Delta \) and \(\nabla \) is equivalent to convergence in distribution of finite marginals \((V_1 ,\ldots ,V_k)\) of partitions. By Theorem 2 in [57], convergence in distribution of a sequence of size-biased partitions \({\tilde{V}}^i \rightarrow V\) on \(\Delta \) implies convergence in distribution of the corresponding ordered partitions \({\hat{V}}^i \rightarrow {\hat{V}}\), and V is a size-biased permutation of \({\hat{V}}\). In Sect. 3.2 we will use this fact, together with the interpretation of rescaled particle configurations \(\frac{1}{N}\eta \in \Delta \) as finite partitions of the unit interval, to derive our main results. Note that in a condensing system with \(\rho _b <\rho \) (4) the partitions \(\frac{1}{N}\eta \) in the thermodynamic limit only converge on the extended space

$$\begin{aligned} {\overline{\Delta }} {:}{=}\Bigg \{(v_1,v_2,\ldots )\in [0,1]^\infty : \sum _{j=1}^{\infty } v_j \le 1\Bigg \}, \end{aligned}$$

which allows for the loss of mass due to phase separation (see Proposition 6 in Appendix 1). On the other hand, size-biased permutations capture the condensed phase and the full mass of the system, and \(\frac{1}{N}{\tilde{\eta }}\) converge on \(\Delta \), as we will establish in the next Section.

3 Condensation in the Inclusion Process

The inclusion process is a stochastic particle system of type (7) with rates

$$\begin{aligned} u(m,n)=m(d+n)\quad \text{ with } \text{ parameter } d>0, \end{aligned}$$

which was first introduced in [12] in the context of energy/mass transport. Another important interpretation of this model is as a multi-species version of the Moran model of population genetics, which describes the selection-mutation dynamics of a population of N individuals which can take L different types [58]. Here the parameter d describes the mutation rate, which is small compared to the reproduction rate of the system and is often taken to depend on the system size \(d=d_L >0\) and vanish as \(L\rightarrow \infty \). Results in [23] show that for fixed L as \(N\rightarrow \infty \), complete condensation occurs if \(d=d_N \ll 1/\log N\). The thermodynamic limit has not been studied so far, and in this section we will establish a complete picture covering all densities \(\rho >0\) and possible scaling regimes of the parameter d.
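To make the dynamics concrete, a single jump of the inclusion process can be simulated with a standard Gillespie step. The sketch below is a minimal illustration and not the simulation method used for the data in this paper: it assumes a complete-graph geometry with \(p(x,y)=1\) for all \(x\ne y\), at least one particle and two sites, and the function name `inclusion_step` is hypothetical.

```python
import random

def inclusion_step(eta, d, rng=random):
    """One Gillespie jump of the inclusion process on the complete graph.

    Assumes p(x, y) = 1 for all x != y (a choice made for this sketch) and
    rates u(m, n) = m * (d + n). Modifies eta in place and returns the
    exponential waiting time before the jump.
    """
    L = len(eta)
    # All ordered pairs (x, y) with an occupied departure site x.
    pairs = [(x, y) for x in range(L) for y in range(L) if x != y and eta[x] > 0]
    rates = [eta[x] * (d + eta[y]) for x, y in pairs]
    total = sum(rates)
    dt = rng.expovariate(total)                       # waiting time ~ Exp(total rate)
    x, y = rng.choices(pairs, weights=rates, k=1)[0]  # jump chosen proportional to its rate
    eta[x] -= 1
    eta[y] += 1
    return dt
```

Each step costs of order \(L^2\) operations, which is acceptable for illustration but would be replaced by a more efficient bookkeeping of rates in larger simulations.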

The inclusion process satisfies conditions (8) and (9) and has stationary product measures of the form (10) with weights

$$\begin{aligned} w(n)=\frac{\Gamma (n+d)}{n!\Gamma (d)}\simeq d\, n^{d-1}\quad \text{ as } n\rightarrow \infty , \end{aligned}$$

and with normalization \(z(\phi )=(1-\phi )^{-d}\). So \(\phi _c =1\) and

$$\begin{aligned} R(\phi )=d\frac{\phi }{1-\phi } \rightarrow \infty \quad \text{ as } \phi \rightarrow 1\quad \text{ for } \text{ all } d>0. \end{aligned}$$

This also leads to an explicit formula for the canonical distributions

$$\begin{aligned} \pi _{L,N} [d\eta ]=\frac{1}{Z_{L,N}}\prod _{x\in \Lambda } \frac{\Gamma (\eta _x +d)}{\eta _x !\Gamma (d)}\, d\eta \quad \text{ with }\quad Z_{L,N}=\frac{\Gamma (N+dL)}{N! \Gamma (dL)}, \end{aligned}$$

which can be identified as a Dirichlet multinomial distribution (cf. [41, Chap. 1]). These have been studied in detail in the context of urn models and have interesting structural properties and symmetries, but in the following we only make use of the asymptotic form of the partition function so that our results can be more easily translated to other systems. Our main results in the thermodynamic limit \(N,L\rightarrow \infty \), \(N/L\rightarrow \rho \ge 0\) are derived in the next subsections, and can be summarized as follows:

  1. \(d>0\) constant or \(d_L \rightarrow d>0\): we have asymptotic equivalence of canonical measures and stationary product distributions (10) with \(\phi \in [0,1)\) such that \(R(\phi )=\rho \) (11), and there is no condensation.

  2. \(d\rightarrow 0\): the inclusion process exhibits a condensation transition with \(\rho _c =0\) as follows:

    (a) \(d\rightarrow 0\) and \(d L\log L\rightarrow 0\): complete condensation.

    (b) \(d\rightarrow 0\) and \(d L\rightarrow \alpha \in (0,\infty )\): the condensed phase exhibits a hierarchical structure on the scale N given by the PD(\(\alpha \)) distribution.

    (c) \(d\rightarrow 0\) and \(d L\rightarrow \infty \): the condensed phase consists of order dL sites with independent occupation numbers of order \(\rho /d\) and exponential distribution.
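Since the canonical distribution (22) is a Dirichlet multinomial distribution, stationary samples can be generated directly by a Pólya urn scheme, which is a standard representation of this family. The following Python sketch is one hedged way to do this. The function name `sample_canonical` is ours, and this is not necessarily the Monte-Carlo procedure discussed in Appendix 2.

```python
import random

def sample_canonical(L, N, d, rng=random):
    """Draw eta from the Dirichlet multinomial distribution (22) via a Polya urn."""
    eta = [0] * L
    for k in range(N):
        # Given k particles already placed, the next particle joins site x
        # with probability (d + eta_x) / (d * L + k).
        weights = [d + n for n in eta]
        x = rng.choices(range(L), weights=weights, k=1)[0]
        eta[x] += 1
    return eta
```

Samples with \(d=\alpha /L\) for fixed \(\alpha >0\) can then be passed through a size-biased permutation and rescaled by N to explore the regimes listed above numerically. For very small d essentially all particles end up on the site chosen by the first particle.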

We will make use of the asymptotic behaviour of w(n) (20) and the partition function \(Z_{L,N}\), which can be derived by standard Stirling approximations from (22). Particularly useful in the following is the asymptotic behaviour of the ratio

$$\begin{aligned} \frac{\Gamma (L+a)}{\Gamma (L+b)} = L^{a-b} \big ( 1+o(1)\big )\quad \text{ as } L\rightarrow \infty , \end{aligned}$$

which holds for all sequences \(a=a_L\) and \(b=b_L\) such that \(a^2 ,b^2 \ll L\). Recall also that \(\Gamma (d)= \frac{1}{d}\big (1 +o(1)\big )\) as \(d\rightarrow 0\).

3.1 Equivalence of Ensembles and Condensation

We assume \(d>0\) constant or \(d_L \rightarrow d>0\). In this case (21) implies that there exist grand-canonical distributions for any density \(\rho \ge 0\), by choosing

$$\begin{aligned} \phi =\Phi (\rho ){:}{=}\frac{\rho }{d+\rho }\in [0,1) \end{aligned}$$

such that \(R(\phi )=\rho \). In this case the equivalence of ensembles can be established most naturally in terms of the specific relative entropy between canonical and grand-canonical distributions (see e.g. [2, 9])

$$\begin{aligned} \frac{1}{L}H(\pi _{L,N} ,\nu _\phi ^L )&=\frac{1}{L}\sum _{\eta \in E_{L,N}} \pi _{L,N} [\eta ]\log \frac{\pi _{L,N} [\eta ]}{\nu _\phi ^L [\eta ]}\nonumber \\&=\log z(\phi )-\frac{N}{L}\log \phi -\frac{1}{L}\log Z_{L,N}\ . \end{aligned}$$

Computing the leading order terms of \(Z_{L,N}\) from (22) with standard Stirling formula we get

$$\begin{aligned} \frac{1}{L}\log Z_{L,N} \rightarrow \rho \log \Big (1+\frac{d}{\rho }\Big ) +d\log \Big (1+\frac{\rho }{d}\Big ) , \end{aligned}$$

so choosing \(\phi =\Phi (\rho )\) as in (24) we see that (25) vanishes in the thermodynamic limit since \(\log z(\phi )=-d\log (1-\phi )\). Convergence in specific relative entropy implies convergence of finite marginals [2], i.e. for any fixed \(k>0\) and \(n_1 ,\ldots ,n_k \ge 0\)

$$\begin{aligned} \pi _{L,N} \big [ \eta _1 =n_1 ,\ldots ,\eta _k =n_k\big ]\rightarrow \frac{1}{z(\phi )^k}\prod _{i=1}^k w( n_i ) \phi (\rho )^{n_i}\quad \text{ as } N/L\rightarrow \rho . \end{aligned}$$

The latter limit could also be computed directly in analogy to other results below, but the route via the equivalence of ensembles is more robust since only the logarithm of the partition function has to be controlled to leading order.
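The vanishing of the specific relative entropy (25) for fixed \(d>0\) can also be checked numerically from the explicit partition function in (22). The sketch below is only an illustration with parameter values of our own choosing. It evaluates (25) with \(\phi =\rho /(d+\rho )\) and \(\rho =N/L\) using the log-gamma function from the Python standard library.

```python
import math

def log_Z(L, N, d):
    """Logarithm of Z_{L,N} = Gamma(N + dL) / (N! Gamma(dL))."""
    return math.lgamma(N + d * L) - math.lgamma(N + 1) - math.lgamma(d * L)

def specific_relative_entropy(L, N, d):
    """(1/L) H(pi_{L,N}, nu_phi^L) as in (25), with phi = rho / (d + rho) and rho = N / L."""
    rho = N / L
    phi = rho / (d + rho)
    log_z = -d * math.log(1.0 - phi)   # z(phi) = (1 - phi)^(-d)
    return log_z - rho * math.log(phi) - log_Z(L, N, d) / L
```

Evaluating `specific_relative_entropy(L, 2 * L, 1.5)` for increasing L gives positive values that decrease towards zero, in line with a correction of order \((\log L)/L\) expected from a local central limit theorem.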

An alternative representation of the specific relative entropy is given by (see e.g. [9])

$$\begin{aligned} \frac{1}{L}H(\pi _{L,N} ,\nu _\phi ^L )&=-\frac{1}{L}\log \nu _{\phi }^L\Big [\sum _{x \in \Lambda }\eta _x = N\Big ]. \end{aligned}$$

Since the second moment of the single-site marginal \(\nu _\phi \) is finite when \(\phi (\rho )=\rho /(\rho +d)<1\), one can show that this vanishes in the thermodynamic limit even without computing the asymptotics of \(Z_{L,N}\), by applying a local central limit theorem to the right hand side (see for example [59, 60]).

In the case \(d\rightarrow 0\), (21) implies that there are no grand-canonical distributions for any positive density and therefore we expect a condensation transition, following the discussion after (12). We summarize this in the following result proved by a direct computation.

Proposition 1

Provided that \(d\rightarrow 0\) as \(L\rightarrow \infty \), the inclusion process exhibits a condensation transition as given in Definition 1 with \(\rho _c =\rho _b =0\), i.e. we have for all fixed \(n\ge 0\) and \(\rho \ge 0\)

$$\begin{aligned} \pi _{L,N} [\eta _1 =n] \rightarrow \delta _{0,n}\quad \text{ as } L,N\rightarrow \infty ,\ N/L\rightarrow \rho >0. \end{aligned}$$


We have for any \(n\ge 0\) fixed

$$\begin{aligned} \pi _{L,N} \big [ \eta _1 =n \big ] = w(n)\frac{Z_{L-1,N-n}}{Z_{L,N}}\simeq w(n), \end{aligned}$$

since with the scaling (27) given below for the partition function in the case \(dL\rightarrow \alpha \in [0,\infty )\) we have

$$\begin{aligned} \frac{Z_{L-1,N-n}}{Z_{L,N}} \simeq \Big ( 1-\frac{n}{N}\Big )^{dL} \rightarrow 1. \end{aligned}$$

The same holds with (33) in the case \(dL\rightarrow \infty \). From (20) we have \(w(0)=1\) and \(w(n)=O(d)\) for any \(n\ge 1\), leading to \(\pi _{L,N} [\eta _1 =n] \rightarrow \delta _{0,n}\), independently of \(\rho \). With Definition 1 this implies condensation with \(\rho _c =\rho _b =0\). \(\square \)

So locally the system appears empty in the limit, and a further investigation of the condensed phase will be given below in terms of size-biased samples. Note that in the proof we only use the asymptotic behaviour of ratios of partition functions and the fact that \(w(n)=O(d)\) for all \(n>0\).
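The single-site marginal used in the proof can also be evaluated exactly for finite systems. The sketch below is a hedged illustration, assuming the closed forms \(w(n)=\Gamma (n+d)/(n!\,\Gamma (d))\) and \(Z_{L,N}=\Gamma (N+dL)/(N!\,\Gamma (dL))\); the function names are hypothetical.

```python
import math


def log_w(n: int, d: float) -> float:
    """Logarithm of the stationary weight w(n) = Gamma(n + d) / (n! Gamma(d))."""
    return math.lgamma(n + d) - math.lgamma(n + 1) - math.lgamma(d)


def log_Z(L: int, N: int, d: float) -> float:
    """Logarithm of Z_{L,N} = Gamma(N + dL) / (N! Gamma(dL))."""
    return math.lgamma(N + d * L) - math.lgamma(N + 1) - math.lgamma(d * L)


def site_marginal(n: int, L: int, N: int, d: float) -> float:
    """pi_{L,N}[eta_1 = n] = w(n) Z_{L-1,N-n} / Z_{L,N}."""
    return math.exp(log_w(n, d) + log_Z(L - 1, N - n, d) - log_Z(L, N, d))
```

With \(d=1/L\), \(L=10^4\) and \(N=2L\), the value `site_marginal(0, L, N, d)` is already close to 1, while `site_marginal(1, L, N, d)` is of order \(d\).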

3.2 GEM Scaling Limit and Complete Condensation

We study the distribution of the condensed phase by computing size-biased marginals in the case \(dL\rightarrow \alpha \ge 0\). Using (23), the leading order behaviour of the partition function is given by

$$\begin{aligned} Z_{L,N} =\frac{\Gamma (N+dL)}{N!\Gamma (dL)}\simeq \left\{ \begin{array}{cl} d N^{dL} /\rho &{} \text{ if } dL\rightarrow 0,\\ N^{dL -1}/\Gamma (\alpha )&{} \text{ if } dL\rightarrow \alpha >0. \end{array}\right. \end{aligned}$$

Recall from Sect. 2.4 that \(\frac{1}{N} (\eta _1 ,\ldots ,\eta _L )\) is a (finite) partition of the unit interval.

Theorem 1

In the thermodynamic limit \(L,N\rightarrow \infty \) such that \(N/L \rightarrow \rho \) with \(dL\rightarrow \alpha >0\), the rescaled order statistics of \(\eta \) (16) converge in distribution to the Poisson–Dirichlet distribution, i.e.

$$\begin{aligned} \frac{1}{N}{\hat{\eta }} =\frac{1}{N} \big (\eta _{(1)} ,\ldots ,\eta _{(L)}\big ){\mathop {\longrightarrow }\limits ^{D}} \mathrm {PD}(\alpha )\ . \end{aligned}$$

Equivalently, size-biased samples converge as   \(\frac{1}{N}{\tilde{\eta }} {\mathop {\longrightarrow }\limits ^{D}}\mathrm {GEM}(\alpha )\).


Following the discussion in Sect. 2.4 it suffices to show that for all \(k\ge 1\), \(x_1 ,\ldots ,x_k \in [0,1]\) we have

$$\begin{aligned} N(N-n_1)\cdots \Bigg (N{-}\sum _{i=1}^{k-1} n_i\Bigg )\pi _{L,N}[{\tilde{\eta }}_1{=}n_1, {\tilde{\eta }}_2{=}n_2,\ldots ,{\tilde{\eta }}_k{=}n_k]\rightarrow \alpha ^k \prod _{i=1}^k (1-x_i)^{\alpha -1}, \end{aligned}$$

provided that \(\frac{n_1}{N}\rightarrow x_1\in [0,1] , \frac{n_2}{N}\rightarrow (1-x_1)x_2, \cdots , \frac{n_k}{N}\rightarrow (1-x_1)(1-x_2)\cdots (1-x_{k-1})x_k\). With the characterization in (17) this establishes convergence in distribution of size-biased permutations to GEM(\(\alpha \)), which is equivalent to (28).

Using (15), the scaling \(w(n)\simeq dn^{d-1}\) as \(n\rightarrow \infty \) from (20), the partition function (27) and (23), we get

$$\begin{aligned}&\pi _{L,N}[{\tilde{\eta }}_1=n_1, {\tilde{\eta }}_2=n_2,\ldots ,{\tilde{\eta }}_k=n_k]\nonumber \\&\quad =\frac{L(L-1)\cdots (L-k+1)}{N(N-n_1)\cdots (N-\sum _{i=1}^{k-1} n_i)}\frac{Z_{L-k,N-\sum _{i=1}^{k} n_i}}{Z_{L,N}}\prod _{i=1}^k (n_iw(n_i))\nonumber \\&\quad \simeq \frac{L(L-1)\cdots (L-k+1)d^k}{N(N-n_1)\cdots (N-\sum _{i=1}^{k-1} n_i)} \Bigg (\frac{N-\sum _{i=1}^{k} n_i}{N}\Bigg )^{dL -1} \Bigg (N-\sum _{i=1}^k n_i\Bigg )^{-dk} \prod _{i=1}^k n_i^d. \end{aligned}$$

Since \(d=O(1/L)\) we have \(n_i^d \rightarrow 1\) and also \(\big (N-\sum _{i=1}^k n_i\big )^{-dk} \rightarrow 1\). Furthermore, with the choice of \(n_i\) we have

$$\begin{aligned} 1-\frac{1}{N}\sum _{i=1}^k n_i \simeq (1-x_1 )\cdots (1-x_k ) \end{aligned}$$

which implies (29). \(\square \)
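The limiting GEM(\(\alpha \)) law can be sampled directly by stick-breaking, which matches the parametrization \(n_1/N\rightarrow x_1\), \(n_2/N\rightarrow (1-x_1)x_2\), and so on, used in the proof with \(x_i\) drawn from \(\mathrm {Beta}(1,\alpha )\). The following is a minimal sketch; the function name `sample_gem` is an assumption introduced for illustration.

```python
import random
from typing import List


def sample_gem(alpha: float, k: int, rng: random.Random) -> List[float]:
    """First k components of a GEM(alpha) sequence by stick-breaking."""
    remaining = 1.0  # mass not yet assigned, prod_{j<i} (1 - U_j)
    pieces = []
    for _ in range(k):
        u = rng.betavariate(1.0, alpha)  # U_i ~ Beta(1, alpha)
        pieces.append(remaining * u)     # i-th piece U_i * prod_{j<i} (1 - U_j)
        remaining *= 1.0 - u
    return pieces
```

The first component has mean \(1/(1+\alpha )\), and the pieces sum to 1 as \(k\rightarrow \infty \).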

For \(\alpha \rightarrow 0\) the above limiting distribution PD(\(\alpha \)) degenerates, with the mass fraction of the maximal occupation number tending to 1. Under a mild additional assumption \(dL\ll 1/\log L\) on the scaling, this statement can be significantly strengthened to ensure complete condensation in analogy with results in [23] for fixed L as \(N\rightarrow \infty \).

Proposition 2

In the thermodynamic limit \(L,N\rightarrow \infty \) such that \(N/L \rightarrow \rho \) with \(dL\log L\rightarrow 0\), we have complete condensation in the sense of (5), i.e. \(\pi _{L,N} \big [\max _{x\in \Lambda } \eta _x=N\big ]\rightarrow 1\).


It suffices to show for the first size-biased marginal that

$$\begin{aligned} \pi _{L,N} [{\tilde{\eta }}_1 =N-n] \rightarrow \delta _{n,0}\quad \text{ for } \text{ all } n\ge 0, \end{aligned}$$

which implies the same for the maximal occupation number. Using again (14), (20) and (27) we have for all \(n\ge 0\)

$$\begin{aligned} \pi _{L,N} [{\tilde{\eta }}_1 =N-n]&= \frac{L}{N}(N-n)w(N-n)\frac{Z_{L-1,n}}{Z_{L,N}} \simeq \frac{d}{\rho }(N-n)^{d} \frac{Z_{L-1,n}}{d N^{dL}/\rho }\\&= \Big ( 1-\frac{n}{N}\Big )^{d} N^{-d(L-1)} Z_{L-1,n}. \end{aligned}$$

The first term tends to 1 for all \(n\ge 0\) and the second scales like

$$\begin{aligned} N^{-d(L-1)} =e^{-d(L-1)\log N}\rightarrow 1\quad \text{ since } dL\ll 1/\log L\ . \end{aligned}$$

Then \(Z_{L-1 ,0}=1\) and \(Z_{L-1 ,n}\simeq dL /n\rightarrow 0\) for \(n\ge 1\), which implies (31). \(\square \)
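For finite systems the probability of complete condensation is available in closed form, since exactly \(L\) configurations carry all \(N\) particles on a single site, each with weight \(w(N)\). The sketch below evaluates \(L\,w(N)/Z_{L,N}\) under this observation; the function name and the example parameters are illustrative assumptions.

```python
import math


def prob_single_site(L: int, N: int, d: float) -> float:
    """pi_{L,N}[max_x eta_x = N] = L w(N) / Z_{L,N}."""
    log_w_N = math.lgamma(N + d) - math.lgamma(N + 1) - math.lgamma(d)
    log_Z = math.lgamma(N + d * L) - math.lgamma(N + 1) - math.lgamma(d * L)
    return L * math.exp(log_w_N - log_Z)
```

For \(L=1000\), \(N=2000\) the result is close to 1 when \(d=L^{-2}\), whereas for \(d=1/L\) it is small, consistent with the hierarchical regime of Theorem 1.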

3.3 Intermediate Scales

Assuming that \(d\rightarrow 0\) with \(dL\rightarrow \infty \) we cannot easily apply (23) for asymptotic estimates, and after a slightly more involved Stirling approximation the leading order of the partition function (12) is

$$\begin{aligned} Z_{L,N}\simeq \frac{e^{-1}}{\sqrt{2\pi dL}}\Big (\frac{N}{dL}\Big )^{dL-1} \Big ( 1+\frac{dL}{N}\Big )^{N+dL}\ . \end{aligned}$$

While in principle this scaling together with that of the weights (20) fully determines the asymptotics of size-biased distributions, it turns out to be more useful to exploit particular cancellations when estimating ratios of partition functions to prove our main result below. The above scaling implies for all fixed \(n\ge 0\) that

$$\begin{aligned} \frac{Z_{L-1,N-n}}{Z_{L,N}}\simeq \big (1-n/N\big )^{dL} \big (1+1/L\big )^{dL} \big (1+dL/N\big )^{-n} \rightarrow 1, \end{aligned}$$

which we have used to prove Proposition 1.

Theorem 2

In the thermodynamic limit \(L,N\rightarrow \infty \) such that \(N/L \rightarrow \rho \), \(d\rightarrow 0\) and \(dL\rightarrow \infty \), we have for any \(\rho > 0\) and fixed \(k\in {\mathbb {N}}\)

$$\begin{aligned} d({\tilde{\eta }}_1 ,\ldots ,{\tilde{\eta }}_k ){\mathop {\longrightarrow }\limits ^{D}}\text{ i.i.d. } \text{ Exp }(1/\rho ). \end{aligned}$$

i.e. marginals of rescaled size-biased samples \({\tilde{\eta }}\) converge in distribution to independent exponential random variables with mean \(\rho \).


To establish convergence of the joint density we have to show for all \(n_1 ,\ldots ,n_k\) such that \(n_i d\rightarrow x_i >0\)

$$\begin{aligned} \frac{1}{d^k} \pi _{L,N} \big [ {\tilde{\eta }}_1 =n_1,\ldots ,{\tilde{\eta }}_k =n_k ]\rightarrow \frac{1}{\rho ^k} \exp \Bigg (-\sum _{i=1}^k x_i\Big /\rho \Bigg ). \end{aligned}$$

In an analogous computation to (30), we get

$$\begin{aligned}&\frac{1}{d^k}\pi _{L,N} [{\tilde{\eta }}_1=n_1, {\tilde{\eta }}_2=n_2,\ldots ,{\tilde{\eta }}_k=n_k]\nonumber \\&\quad =\frac{1}{d^k}\frac{L(L-1)\cdots (L-k+1)}{N(N-n_1)\cdots (N-\sum _{i=1}^{k-1}n_i)}\frac{Z_{L-k,N-\sum _i n_i}}{Z_{L,N}}\prod _{i=1}^k (n_iw(n_i))\nonumber \\&\quad \simeq \Big (\frac{1}{\rho }\Big )^k \underbrace{\prod _{i=1}^k \Big (\frac{x_i}{d}\Big )^d }_{{:}{=}A}\ \underbrace{\frac{\Gamma (N-\sum _i n_i +d(L-k))}{(N-\sum _i n_i )!}}_{{:}{=}B}\underbrace{\frac{N!}{\Gamma (N+dL)}}_{{:}{=}C}\underbrace{\frac{\Gamma (dL)}{\Gamma (d(L-k))}}_{{:}{=}D}, \end{aligned}$$

where we used the asymptotic behaviour of the stationary weights (20), and arranged the contributions of the ratio of partition functions in a convenient way. Since \(d\rightarrow 0\) we have \(A\rightarrow 1\) and \(D\simeq (dL)^{dk}\) using (23). The latter does not apply to the other two terms since \(dL\rightarrow \infty \), and a more careful (but straightforward) analysis leads to

$$\begin{aligned} C\simeq N^{1-dL} \Big (1+\frac{d}{\rho }\Big )^{N+dL} \Big (1-\frac{1}{N}\Big )^N e^{dL-1} \end{aligned}$$

and analogously, using \(\frac{\Gamma (N-\sum _i n_i +d(L-k))}{\Gamma (N-\sum _i n_i+dL)}\simeq \bigg (N-\sum _i n_i +dL\bigg )^{-kd}\simeq N^{-kd}\),

$$\begin{aligned} B\simeq & {} N^{-kd}\bigg (N-\sum _i n_i \bigg )^{dL-1} \bigg (\Big (1+\frac{d}{\rho }\Big ) \Bigg (1+\frac{d\sum _i n_i}{\rho N}\Bigg )\bigg )^{\sum _i n_i -N-dL} \\&\times \Bigg (1-\frac{1}{N-\sum _i n_i}\Bigg )^{\sum _i n_i -N} e^{1-dL}. \end{aligned}$$

Therefore we get

$$\begin{aligned} BCD\simeq (d/\rho )^{dk} \Big (1+\frac{d}{\rho }\Big )^{\sum _i x_i /d} \Bigg (1-\frac{\sum _i x_i}{\rho dL}\Bigg )^{dL} \Bigg ( 1+\frac{\sum _i x_i}{\rho N}\Bigg )^{-N}\rightarrow e^{-\sum _i x_i /\rho }, \end{aligned}$$

and inserting into (35) implies (34). \(\square \)
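The case \(k=1\) of this result can be checked against the exact finite-size expression \(\pi _{L,N}[{\tilde{\eta }}_1 =n]=\frac{L}{N}\,n\,w(n)\,Z_{L-1,N-n}/Z_{L,N}\). The sketch below is a hedged numerical illustration, assuming the closed forms of \(w(n)\) and \(Z_{L,N}\) in terms of Gamma functions; the function names and the choice \(d=1/\sqrt{L}\) in the usage note are assumptions.

```python
import math


def log_w(n: int, d: float) -> float:
    """Logarithm of w(n) = Gamma(n + d) / (n! Gamma(d))."""
    return math.lgamma(n + d) - math.lgamma(n + 1) - math.lgamma(d)


def log_Z(L: int, N: int, d: float) -> float:
    """Logarithm of Z_{L,N} = Gamma(N + dL) / (N! Gamma(dL))."""
    return math.lgamma(N + d * L) - math.lgamma(N + 1) - math.lgamma(d * L)


def size_biased_first(n: int, L: int, N: int, d: float) -> float:
    """pi_{L,N}[tilde eta_1 = n] = (L/N) n w(n) Z_{L-1,N-n} / Z_{L,N}."""
    if n == 0:
        return 0.0  # a size-biased pick never lands on an empty site
    log_ratio = log_w(n, d) + log_Z(L - 1, N - n, d) - log_Z(L, N, d)
    return (L / N) * n * math.exp(log_ratio)


def rescaled_density(x: float, L: int, N: int, d: float) -> float:
    """(1/d) pi_{L,N}[tilde eta_1 = n] at the integer n closest to x / d."""
    return size_biased_first(round(x / d), L, N, d) / d
```

For \(L=N=10^6\) and \(d=10^{-3}\), the value `rescaled_density(x, L, N, d)` lies within a few percent of \(e^{-x}\) for \(x\) of order 1.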

So the condensed phase for any intermediate scale with \(dL\rightarrow \infty \) has a non-hierarchical structure, locally consisting of independent clusters of average size \(\rho /d\). This general behaviour across a large range of scaling regimes is quite remarkable. However, since \(dN\rightarrow \infty \), the rescaled size-biased samples \(d{\tilde{\eta }}\) do not form a partition of a compact interval (as in the previous case of \(dL\rightarrow \alpha \)). So our result on convergence of finite marginals does not imply weak convergence of the full sequence \(d{\tilde{\eta }}\), and we only get a local characterization of the condensed phase. Since the total mass of the condensed phase is \(N\), and \(k\) in the above result can be chosen arbitrarily large, this implies that the volume fraction covered by the condensed phase scales at least as \(d\) to leading order.

Note also that the limiting exponential distribution of a rescaled cluster in the condensed phase is not itself the size-biased distribution of a random variable, since this would have density

$$\begin{aligned} \frac{\rho }{x} \frac{1}{\rho }e^{-x/\rho } =\frac{1}{x}e^{-x/\rho }\ . \end{aligned}$$

This cannot be normalized due to divergence at \(x=0\), and suggests that the condensed phase does not simply consist of \(O(1/d)\) clusters with i.i.d. occupation numbers. If, conditional on the volume covered by the condensed phase, one could probe a cluster size without size bias, it would vanish on the scale \(1/d\). This suggests that the volume fraction covered by the condensed phase could indeed be larger than \(d\), with many clusters on smaller scales that do not contribute to the total mass to leading order. Details of this behaviour most likely depend on the particular scaling of \(d\), and are very hard to access analytically or even to observe numerically.

3.4 Simulation Results

We illustrate our main results with Monte Carlo simulations of the inclusion process at stationarity. Recall that with (7) and (19) the generator describing the dynamics is given by

$$\begin{aligned} {\mathcal {L}}f(\eta )=\sum _{x,y\in \Lambda } p(x,y) \eta _x (d+\eta _y )\big (f(\eta ^{xy})-f(\eta )\big ). \end{aligned}$$

We initialize the system by distributing N particles independently, uniformly at random on the lattice. The stationary distributions \(\pi _{L,N}\) (22) are conditional product measures for all translation invariant or symmetric choices of \(p(x,y)\). On the complete graph with \(p(x,y)\equiv \frac{1}{L-1}\) one can implement a simple rejection-based algorithm to simulate the dynamics, which we summarize in Appendix 2 and call CG dynamics in the following. We also implemented the standard Gillespie algorithm [61] to simulate totally asymmetric dynamics on a one-dimensional lattice with periodic boundary conditions, i.e. \(p(x,y)=\delta _{y,x+1\mathrm {mod}L}\), which we call TA dynamics.

In both geometries, the number of empty sites grows in time and the particles concentrate in clusters, which exchange particles. Smaller clusters disappear and the average cluster size increases, driving a coarsening process. This leads to stationary distributions where either a balance between cluster aggregation and break-up is reached, which is the case for \(d\rightarrow 0\) and \(dL\rightarrow \alpha \in (0,\infty ]\), or the system saturates with a single cluster remaining for \(dL\rightarrow 0\). While for CG dynamics clusters can directly exchange particles, for TA dynamics the clusters are isolated and the coarsening process is limited by particle transport, which has been studied in [24]. Still, once stationarity is reached (see Appendix 2 for more details on this), both dynamics provide samples from the same stationary distributions \(\pi _{L,N}\) which do not have any spatial correlations. Two typical stationary configurations for CG and TA dynamics are illustrated in Fig. 1.
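To make the complete-graph dynamics concrete, the following sketch simulates the jump chain of the generator with \(p(x,y)=\frac{1}{L-1}\) by a direct Gillespie-type selection of the jump \(x\rightarrow y\). This is not the rejection-based algorithm of Appendix 2, which is not reproduced here; the function names `cg_step` and `simulate_cg` and all parameter values are assumptions, and the \(O(L)\) cost per jump makes it suitable for small systems only.

```python
import random
from typing import List, Tuple


def cg_step(eta: List[int], d: float, rng: random.Random) -> float:
    """Perform one jump on the complete graph and return the waiting time.

    A particle jumps from x to y != x at rate eta_x (d + eta_y) / (L - 1).
    """
    L, N = len(eta), sum(eta)
    # Exit rate of site x: eta_x * sum_{y != x} (d + eta_y) / (L - 1).
    exit_rates = [n * (d * (L - 1) + N - n) / (L - 1) for n in eta]
    total = sum(exit_rates)
    x = rng.choices(range(L), weights=exit_rates)[0]
    # Target site y != x, chosen with probability proportional to d + eta_y.
    targets = [d + n if y != x else 0.0 for y, n in enumerate(eta)]
    y = rng.choices(range(L), weights=targets)[0]
    eta[x] -= 1
    eta[y] += 1
    return rng.expovariate(total)  # holding time of the configuration just left


def simulate_cg(L: int, N: int, d: float, steps: int,
                seed: int = 0) -> Tuple[List[int], float]:
    """Run `steps` jumps from an independent uniform initial placement."""
    rng = random.Random(seed)
    eta = [0] * L
    for _ in range(N):
        eta[rng.randrange(L)] += 1
    t = 0.0
    for _ in range(steps):
        t += cg_step(eta, d, rng)
    return eta, t
```

A call such as `simulate_cg(20, 40, 0.05, 2000)` returns the final configuration and the elapsed time; the particle number is conserved by construction.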

Fig. 1

Typical stationary configurations for the inclusion process with \(N=2048\) particles on a lattice of size \(L=1024\) for TA dynamics with \(dL=1\) (left) and CG dynamics with \(dL=10\) (right)

Fig. 2

Sample averages of \(R_k\) (37) against k from CG dynamics for the inclusion process are compared to the expected limiting behaviour (38) (black lines) for \(dL=0.5\) and 1 (left) and \(dL=10\) (right). Data are given in coloured symbols with error bars and averaged over 100 realizations \(\eta \) and a further 5 size-biased re-samples \({\tilde{\eta }}\) for each. Grey lines on the right show 100 individual \(R_k ({\tilde{\eta }} )\) for \(L=2048\); the smallest possible non-zero value of \(R_k\) here is \(1/4096\)

Since the complete condensation regime \(dL\rightarrow 0\) has been studied numerically before [24], we focus on the hierarchical results in Theorem 1 with \(dL\rightarrow \alpha \in (0,\infty )\), and comment on intermediate scales with \(dL\rightarrow \infty \) from Theorem 2 later. There are no particularly useful results for marginals of Poisson Dirichlet random variables, so we compare size-biased samples of stationary configurations \({\tilde{\eta }}\) to the GEM(\(\alpha \)) distribution. For each \(k\ge 1\), we define

$$\begin{aligned} R_k ({\tilde{\eta }}){:}{=}1-\frac{1}{N}\sum _{i=1}^k {\tilde{\eta }}_i , \end{aligned}$$

the mass fraction remaining on all sites with index \(>k\) in the size-biased sample \({\tilde{\eta }}\). With the representation (17) of the GEM distribution, Theorem 1 implies that for each \(k\ge 1\) the random variable \(R_k\) converges in distribution to a product of i.i.d. random variables \(1-U_i\), where \(U_i \sim \mathrm {Beta}(1,\alpha )\). With (18) this implies that

$$\begin{aligned} \langle R_k \rangle _{L,N} \rightarrow \Big (\frac{\alpha }{1+\alpha }\Big )^k \quad \text{ as } L,N\rightarrow \infty ,\ N/L\rightarrow \rho ,\ dL\rightarrow \alpha , \end{aligned}$$

which is illustrated in Fig. 2 for various values of \(\alpha \) and \(\rho \). We see good agreement for small values of k, but in addition to statistical errors there are large systematic finite-size effects (illustrated for \(\alpha =10\) in Fig. 2 right). These are related to the small number of non-zero occupation numbers \(\# (\eta )\) in typical stationary configurations, leading to a systematic underestimation of \(\langle R_k \rangle _{L,N}\). This can be derived from Ewens' sampling formula (see e.g. [41], Theorem 2.8), where \(\# (\eta )\) corresponds to the number of different types in a finite sample of size N from a Poisson–Dirichlet population, and can be shown to scale as

$$\begin{aligned} \# (\eta )\simeq \alpha \log N\quad \text{ as } L,N\rightarrow \infty ,\ N/L\rightarrow \rho ,\ dL\rightarrow \alpha . \end{aligned}$$

This logarithmic scaling can be seen in Fig. 2 (right). Convergence of \(\# (\eta )/\log N\) to \(\alpha \) is very slow on the scale \(1/\sqrt{\log N}\) (see [41], Theorem 2.11), so this is not a good estimator for \(\alpha \), and the comparison based on (38) in Fig. 2 is more useful.
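The comparison of \(\langle R_k \rangle _{L,N}\) with \((\alpha /(1+\alpha ))^k\) can be reproduced on a small scale without simulating the dynamics. The sketch below swaps in a different technique from the CG and TA dynamics used for the figures: it draws exact samples from \(\pi _{L,N}\) with a Pólya-urn construction, which relies on the weights \(w(n)=\Gamma (n+d)/(n!\,\Gamma (d))\) defining a symmetric Dirichlet-multinomial distribution. All function names, the seed and the system sizes are assumptions chosen for illustration.

```python
import random
from typing import List


def sample_stationary(L: int, N: int, d: float, rng: random.Random) -> List[int]:
    """Exact sample from pi_{L,N} via a Polya urn.

    Particle n+1 is placed on site x with probability (d + eta_x) / (dL + n).
    """
    site_of: List[int] = []  # site of each particle placed so far
    for n in range(N):
        if rng.random() * (d * L + n) < d * L:
            site_of.append(rng.randrange(L))           # uniformly chosen site
        else:
            site_of.append(site_of[rng.randrange(n)])  # join an earlier particle
    eta = [0] * L
    for x in site_of:
        eta[x] += 1
    return eta


def size_biased(eta: List[int], rng: random.Random) -> List[int]:
    """Non-zero occupation numbers in size-biased order.

    Sites are listed by first appearance when particles are drawn in
    uniformly random order, which picks sites proportionally to their mass.
    """
    particles = [x for x, n in enumerate(eta) for _ in range(n)]
    rng.shuffle(particles)
    seen, order = set(), []
    for x in particles:
        if x not in seen:
            seen.add(x)
            order.append(eta[x])
    return order


def mean_R(k_max: int, L: int, N: int, d: float,
           samples: int, seed: int = 0) -> List[float]:
    """Sample averages of R_1, ..., R_{k_max} under pi_{L,N}."""
    rng = random.Random(seed)
    totals = [0.0] * k_max
    for _ in range(samples):
        tilde = size_biased(sample_stationary(L, N, d, rng), rng)
        remaining = N
        for k in range(k_max):
            if k < len(tilde):
                remaining -= tilde[k]
            totals[k] += remaining / N
    return [t / samples for t in totals]
```

With \(L=200\), \(N=400\) and \(d=1/L\) (so \(\alpha =1\)), `mean_R(3, 200, 400, 1 / 200, 2000)` returns values close to \(1/2\), \(1/4\) and \(1/8\), up to statistical and finite-size errors.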

Fig. 3

Empirical tail distributions of \(d{\tilde{\eta }}_i\) for \(i=1,2,3\) from 100 samples \(\eta \) and 5 size-biased re-samples \({\tilde{\eta }}\) are shown as coloured step functions, and compared to the theoretical prediction \(e^{-u/\rho }\) from Theorem 2 for the intermediate regime (full black lines). On the left we fix \(\rho =1\) and agreement with \(e^{-u}\) improves with increasing d. We also include the size-biased grand canonical prediction (39) for the regime of constant \(d>0\) (dashed black line), which agrees well with the discrete data for \(d=0.5\). On the right we fix \(d=32/L=1/\sqrt{L}\), with good agreement with theory for densities \(\rho =0.5,\, 1\) and 2

For small values of d and finite system size L there is a data cross-over to the condensed regime, with very few occupied sites. This is very hard to access numerically, but theoretically, a single condensate site is fully consistent with the limit \(\alpha \rightarrow 0\) in (38). For large values of d there is a data cross-over to the intermediate regime \(d\rightarrow 0\) with \(dL\rightarrow \infty \), which is covered by Theorem 2. This cross-over is illustrated in Fig. 3 (left), where we plot the empirical tail distribution of \(d{\tilde{\eta }}_i\) for \(i=1,2,3\) based on 5 size-biased re-samples \({\tilde{\eta }}\) of 100 independent samples of \(\eta \) from \(\pi _{L,N}\) using CG dynamics. We pick small values for i in order to use the same procedure for all values of d including \(1/L\). For larger d, larger values for i lead to the same behaviour, and tests reveal that the samples \({\tilde{\eta }}_i\) are indeed uncorrelated. For fixed density \(\rho =1\) we see that agreement with the exponential tail \(e^{-u/\rho }\) predicted by Theorem 2 improves with increasing d up to \(d=32/L=1/\sqrt{L}\). In Fig. 3 (right) for this value of d we see good agreement with the predicted tail for several densities \(\rho \).

If we increase d further the system crosses over to the behaviour for constant \(d>0\), where we have equivalence of ensembles to grand canonical measures \(\nu _\phi \) as explained in Sect. 3.1. Rescaled size-biased variables \(d{\tilde{\eta }}_i\) will then take discrete values in \(d{\mathbb {N}}\) given by the size-biased version of \(\nu _\phi ^1\) (10), i.e.

$$\begin{aligned} \pi _{L,N} \big [ d{\tilde{\eta }}_i =dn\big ] \rightarrow \frac{n}{\rho }\nu _{\phi (\rho )}^1 [\eta _x =n]=\frac{nw(n)}{\rho z(\phi (\rho ))} \phi (\rho )^n \end{aligned}$$

as \(L,N\rightarrow \infty \), \(N/L\rightarrow \rho \) and \(d>0\) fixed. Here \(\phi (\rho )=\rho /(d+\rho )<1\) is given in (24) and \(z(\phi )=(1-\phi )^{-d}\). This is illustrated for \(d=512/L=0.5\) in Fig. 3 (left), where we compare the empirical tail with the tail of the size-biased distribution (39) and see very good agreement. Note that for \(d\rightarrow 0\), we have from the right-hand side of (39) that

$$\begin{aligned} \frac{1}{d} \frac{n}{\rho }\nu _{\phi (\rho )}^1 [\eta _x =n]\rightarrow \frac{1}{\rho } e^{-u/\rho } \quad \text{ if } nd\rightarrow u, \end{aligned}$$

since \(nw(n)/d\rightarrow 1\), \(z(\phi (\rho ))\rightarrow 1\) and \(\phi (\rho )^n \rightarrow e^{-u/\rho }\). So the size-biased grand-canonical distributions scale consistently with the result in Theorem 2.

4 Large Deviations

In Sect. 3 we derived the typical stationary behaviour in the condensed phase, and will now study the statistics of large deviations of the maximum occupation number. The most interesting case of complete condensation is covered in Sect. 4.3; for completeness, and to introduce the main concepts of large deviations, we first cover the non-condensing and intermediate regimes. Note that in the hierarchical regime with \(dL\rightarrow \alpha \in (0,\infty )\), the typical size of the maximum is of order L and it can take any value on that scale with non-vanishing probability.

4.1 Non-condensing Regime

We first treat the case \(d \rightarrow d > 0\) as \(L \rightarrow \infty \) for which we have equivalence of ensembles. We find that the probability of observing maximum site occupations of order L decays exponentially in L, as would be the case under the grand-canonical measures \(\nu _\phi \) (10) where the site occupations are i.i.d. with finite mean and variance. We characterise this decay in terms of the large deviation rate function \(I_{\rho }(m)\), which is informally defined as

$$\begin{aligned} \pi _{L,N}[\eta _{(1)} = M] \sim e^{-L I_{\rho }(m)},\quad \text {for}\quad L,N,M\rightarrow \infty \ \text { and }\ N/L \rightarrow \rho ,\ M/L \rightarrow m. \end{aligned}$$

This is made precise in the following result which characterizes the local large deviations, and provides an explicit form for the rate function. The results in this section imply large deviation principles in the usual sense, see for example [59, 62] and references therein for details.

Proposition 3

If \(d \rightarrow d>0\) and \(m \in [0,\rho )\), then in the thermodynamic limit

$$\begin{aligned} \frac{1}{L}\log \pi _{L,N} \big [\eta _{(1)}=M]\rightarrow -I_\rho (m)\quad \text {as} \quad N/L\rightarrow \rho ,\quad M/L \rightarrow m \in [0,\rho ), \end{aligned}$$

where

$$\begin{aligned} I_\rho (m) = (\rho -m) \log \frac{\rho -m}{\rho -m+d} -\rho \log \frac{\rho }{\rho +d} - d \log \frac{\rho -m+d}{\rho +d}. \end{aligned}$$


The proof follows a standard tilting argument which we only sketch here; more details can be found in [59]. First note that for grand-canonical measures (10) with \(\phi ,\phi ' \in [0,1)\)

$$\begin{aligned} \nu _{\phi }^L\Big [\sum _x \eta _x = N\Big ] =\nu _{\phi '}^L\Big [\sum _x \eta _x = N\Big ] \Big (\frac{\phi }{\phi '}\Big )^N \Big (\frac{z(\phi ')}{z(\phi )}\Big )^L , \end{aligned}$$

and recall that \(\nu _\phi ^1 [\eta _1 =n]=w(n)\phi ^n /z(\phi )\) with weights w(n) given in (20) and normalization \(z(\phi )=(1-\phi )^{-d}\) for all \(\phi \in [0,1 )\). Since

$$\begin{aligned} \pi _{L,N}[\eta _{(1)} = M] = \nu _{\phi }^L\Big [\eta _{(1)} =M \big |\sum _x \eta _x = N\Big ], \end{aligned}$$

and \((\eta _x :x\in \Lambda )\) are i.i.d. under \(\nu _\phi ^L\), we have

$$\begin{aligned} \frac{1}{L}\log \pi _{L,N}&[\eta _{(1)} = M] = \frac{1}{L} \log \nu _{\phi }^L\Big [\eta _{(1)} = M;\ \sum _x \eta _x = N\Big ] - \frac{1}{L}\log \nu _{\phi }^L\Big [\sum _x \eta _x = N\Big ] \\&= \frac{1}{L} \log \nu _\phi ^{L-1}\Big [\sum _{x}\eta _x = N-M ;\ \eta _{(1)} \le M\Big ] + \frac{1}{L}\log \nu _\phi [\eta _1 = M]\\&\qquad - \frac{1}{L}\log \nu _{\phi }^L\Big [\sum _x \eta _x = N\Big ]. \end{aligned}$$

Since the grand canonical single site marginals \(\nu _\phi \) have finite exponential moments for each \(\phi \in [0,1 )\), we may choose a sequence of \(\phi \) such that the expected number of particles per site under \(\nu _{\phi }[\,\cdot \, ;\, \eta _1 < M]\) is \((N-M)/(L-1)\). Further, since \(M/L\rightarrow m\), this implies \(\phi \rightarrow \Phi (\rho -m)\) in the thermodynamic limit, with \(\Phi \) given in (24) as the inverse of \(R(\phi )\) (21). Since \(\nu _{\phi }\) has second moment which converges to \(\langle \eta _x^2\rangle _{\Phi (\rho -m)} < \infty \), we may then apply a standard local limit theorem for triangular arrays (see e.g. [60]) to show that with this choice of \(\phi \) the first term on the second line vanishes. The same is true for the term in the third line choosing \(\phi =\Phi (\rho )= \rho /(\rho +d)\) by equivalence of ensembles proved in Sect. 3.1, and we can conclude using (42) and taking limits. \(\square \)

4.2 Intermediate Scales

For the intermediate scale, \(d \rightarrow 0\) with \(dL \rightarrow \infty \), we cannot directly apply a local limit theorem for triangular arrays as in the previous case, since with (21) there are no grand-canonical measures with positive densities. Here we will make use of Stirling’s approximation of the partition function (32) and truncation arguments to derive the large deviations behaviour of the maximum \(\eta _{(1)}\). In this regime the probability of observing a maximum site occupation of order L has asymptotic decay rate dL.

Proposition 4

If \(d \rightarrow 0\) and \(dL \gg \log L\), then in the thermodynamic limit we have

$$\begin{aligned} -\frac{1}{d L}\log \pi _{L,N} \big [ \eta _{(1)}=M\big ]\rightarrow I_\rho (m){:}{=}\log \left( \frac{\rho }{\rho -m}\right) , \end{aligned}$$

as \(N/L\rightarrow \rho \) and \(M/L\rightarrow m\in [0,\rho )\).

Note that this rate function is consistent with the limit \(d\rightarrow 0\) of \(I_\rho (m)/d\) in (41), but the case \(d=0\) is not covered by Proposition 3 and needs a separate proof.


We first extract the contribution due to the maximum site occupation by observing that

$$\begin{aligned} \frac{w(M)Z_{L-1,N-M}^{(M)}}{Z_{L,N}}\le \pi _{L,N} \big [\eta _{(1)}=M\big ] \le \frac{L w(M)Z_{L-1,N-M}^{(M)}}{Z_{L,N}}, \end{aligned}$$

where   \(\displaystyle Z_{L,N}^{(M)} = \sum _{\eta \in E_{L,N}}\prod _{x \in \Lambda } w(\eta _x)\mathbb {1}\{\eta _x \le M\}\)   is a truncated canonical partition function.

This immediately implies the upper bound

$$\begin{aligned} \pi _{L,N} \big [\max _{x\in \Lambda }\eta _x=M\big ] \le \frac{Lw(M) Z_{L-1,N-M} }{Z_{L,N}}. \end{aligned}$$

We can bound from above the total weight of configurations violating the truncation by

$$\begin{aligned} Z_{L-1,N-M} -Z_{L-1,N-M}^{(M)}\le (L-1)(N-M)w(M) Z_{L-2,N-2M}, \end{aligned}$$

where we use that the weights \(w(n)\) (20) are decreasing in \(n\) and that the partition function \(Z_{L,N}\) (12) is increasing in \(N\), which holds since \(dL > 1\) for L sufficiently large. This leads to a lower bound on \(Z_{L-1,N-M}^{(M)}\) in (44) and we get

$$\begin{aligned} \pi _{L,N} \big [\eta _{(1)} =M\big ] \ge \frac{w(M)Z_{L-1,N-M}}{Z_{L,N}}\left( 1 - (L-1)(N-M)w(M)\frac{ Z_{L-2,N-2M}}{Z_{L-1,N-M}}\right) . \end{aligned}$$

By applying (32) together with (20) we find that

$$\begin{aligned} (L-1)(N-M)w(M)\frac{ Z_{L-2,N-2M}}{Z_{L-1,N-M}} \rightarrow 0, \end{aligned}$$

in the thermodynamic limit if \(M/L \rightarrow m > 0\). We conclude by taking logarithms, and again applying (32) together with (20). \(\square \)
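The consistency between the rate functions of Propositions 3 and 4 can be checked numerically. The sketch below implements both expressions and is meant only as an illustration; the function names are assumptions, and `math.log1p` is used to avoid cancellation errors for small \(d\).

```python
import math


def rate_fixed_d(m: float, rho: float, d: float) -> float:
    """Rate function of Proposition 3 for fixed d > 0 and 0 <= m < rho."""
    r = rho - m
    return (-r * math.log1p(d / r)            # (rho-m) log((rho-m)/(rho-m+d))
            + rho * math.log1p(d / rho)       # -rho log(rho/(rho+d))
            - d * math.log((r + d) / (rho + d)))


def rate_intermediate(m: float, rho: float) -> float:
    """Rate function of Proposition 4 (speed dL): log(rho / (rho - m))."""
    return math.log(rho / (rho - m))
```

For small \(d\), the ratio `rate_fixed_d(m, rho, d) / d` approaches `rate_intermediate(m, rho)`, and both rate functions vanish at \(m=0\).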

We illustrate the rate function for this and the following case of complete condensation in Fig. 4 and compare to exact numerics obtained for finite system size. The latter are generated using the right-hand side of (44) and the recursive structure of the canonical partition functions

$$\begin{aligned} Z_{L,N} =\sum _{n=0}^N Z_{k,n} Z_{L-k,N-n} \quad \text{ for } \text{ all } k=1,\ldots L-1. \end{aligned}$$

The same relation holds for truncated partition functions (see [59] for details). With initial condition \(Z_{1,n} =w(n)\), \(n=0,\ldots N\) and choosing \(k=L/2\) this can be used effectively in an iteration to reach large system sizes.
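A minimal sketch of this iteration is given below. It uses the recursion with \(k=L/2\) in the form of repeated doubling (binary exponentiation of the convolution), which is one possible way to organize the computation and not necessarily the implementation used for Fig. 4. As a variation on the right-hand side of (44), it obtains the exact distribution of the maximum from differences of truncated partition functions. All function names are assumptions, and the plain floating-point arithmetic is only adequate for moderate system sizes.

```python
from typing import List, Optional


def weights(N: int, d: float, cutoff: Optional[int] = None) -> List[float]:
    """w(0..N) from w(n+1) = w(n) (n + d) / (n + 1); zero above `cutoff`."""
    w = [1.0]
    for n in range(N):
        w.append(w[-1] * (n + d) / (n + 1))
    if cutoff is not None:
        w = [wn if n <= cutoff else 0.0 for n, wn in enumerate(w)]
    return w


def convolve(a: List[float], b: List[float], N: int) -> List[float]:
    """c[n] = sum_{j=0}^{n} a[j] b[n-j] for n = 0..N."""
    return [sum(a[j] * b[n - j] for j in range(n + 1)) for n in range(N + 1)]


def partition_functions(L: int, N: int, d: float,
                        cutoff: Optional[int] = None) -> List[float]:
    """Z_{L,n} for n = 0..N, truncated at `cutoff` if given."""
    result = [1.0] + [0.0] * N      # Z_{0,n} = delta_{n,0}
    power = weights(N, d, cutoff)   # Z_{1,n} = w(n)
    k = L
    while k > 0:
        if k & 1:
            result = convolve(result, power, N)
        k >>= 1
        if k:
            power = convolve(power, power, N)  # Z_{2j} = Z_j * Z_j
    return result


def max_distribution(L: int, N: int, d: float) -> List[float]:
    """pi_{L,N}[eta_(1) = M] for M = 0..N from truncated partition functions."""
    Z = partition_functions(L, N, d)[N]
    cdf = [partition_functions(L, N, d, cutoff=M)[N] / Z for M in range(N + 1)]
    return [cdf[0]] + [cdf[M] - cdf[M - 1] for M in range(1, N + 1)]
```

The untruncated output can be validated against the closed form \(Z_{L,N}=\Gamma (N+dL)/(N!\,\Gamma (dL))\), and a finite-size rate function follows as \(-\frac{1}{dL}\log \) of the entries of `max_distribution`.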

Fig. 4

The large deviation rate functions of the maximum site occupation, \(I_\rho (m)\). Theoretical results are given by full black lines and numerics (45) for finite L by dashed coloured lines. Left: The intermediate case as given in (43), with numerics for \(d=1/\sqrt{L}\). According to Theorem 2 the maximum typically contains of order \(1/d =\sqrt{L}\) particles, so the location of the minima of \(I_\rho (m)\) vanishes with \(1/\sqrt{L}\) and there are significant finite size effects close to the origin. Right: The complete condensation case as given in (50), with numerics for \(d=L^{-2}\)

4.3 Complete Condensation

In the case \(d L \ll 1/\log L\) we have complete condensation as stated in Proposition 2. We characterise the large deviations of the maximum on the scale L, which turn out to be dominated by the probability of observing the smallest number of occupied sites required to realise a given size of the maximum. To derive this result, it is easier to first understand probabilities of size-biased configurations in analogy to Theorem 1.

Proposition 5

In the thermodynamic limit \(N,L\rightarrow \infty \) such that \(N/L \rightarrow \rho \), with \(dL \log L\rightarrow 0\) we have

$$\begin{aligned} \frac{1}{d^k}\pi _{L,N}[{\tilde{\eta }}_1=n_1,\ldots ,{\tilde{\eta }}_k=n_k] \rightarrow \rho ^{-k}\prod _{i=1}^{k}(1-x_i)^{i-k-1}, \end{aligned}$$

provided that \(\frac{n_1}{N}\rightarrow x_1 , \frac{n_2}{N}\rightarrow (1-x_1)x_2, \cdots \frac{n_k}{N}\rightarrow (1-x_1)(1-x_2)\cdots (1-x_{k-1})x_k\) with \(x_1,x_2,\ldots ,x_k \in (0,1)\). Furthermore, in the same limit

$$\begin{aligned} \pi _{L,N}[{\tilde{\eta }}_1=n_1,\ldots ,{\tilde{\eta }}_k=n_k] \simeq \pi _{L,N}\left[ {\tilde{\eta }}_1=n_1,\ldots ,{\tilde{\eta }}_k=n_k,{\tilde{\eta }}_{k+1}=N-\sum _{i=1}^k n_i\right] .\nonumber \\ \end{aligned}$$


In analogy to (30) in the proof of Theorem 1 we get

$$\begin{aligned} \frac{1}{d^k}\pi _{L,N}[{\tilde{\eta }}_1=n_1,\ldots ,{\tilde{\eta }}_k=n_k] \simeq \rho ^{-k}\prod _{i=1}^{k-1}(1-x_i)^{i-k}\frac{Z_{L-k,N-\sum _{i=1}^k n_i}}{Z_{L,N}}, \end{aligned}$$

where we also used

$$\begin{aligned} N(N-n_1)\cdots (N-\sum _{i=1}^{k-1} n_i)\simeq L^k \rho ^k \prod _{i=1}^{k-1} (1-x_i )^{k-i}. \end{aligned}$$

The remaining mass, \(N-\sum _{i=1}^k n_i\), is of order L since \(x_1,x_2,\ldots ,x_k \in (0,1)\). Therefore, applying (27) to the ratio of partition functions we find

$$\begin{aligned} \frac{Z_{L-k,N-\sum _{i=1}^k n_i}}{Z_{L,N}} \rightarrow \frac{1}{(1-x_1)(1-x_2)\ldots (1-x_k)}, \end{aligned}$$

where we used \(N^{dL}\), \((N-\sum _{i=1}^k n_i)^{dL} \rightarrow 1\), since \(dL \log L \rightarrow 0\). This completes the proof of (46).

Finally, for (47), we let \(n_{k+1}=N-\sum _{i=1}^kn_i\). Then using (15) and the fact that \(Z_{L,0}=1\) for all \(L\ge 1\) we have

$$\begin{aligned} \frac{\pi _{L,N}\left[ {\tilde{\eta }}_1=n_1,\ldots ,{\tilde{\eta }}_k =n_k,{\tilde{\eta }}_{k+1}=n_{k+1}\right] }{\pi _{L,N}[{\tilde{\eta }}_1 =n_1,\ldots ,{\tilde{\eta }}_k=n_k]}&= \frac{(L-k)n_{k+1}w(n_{k+1})Z_{L-k-1,0}}{n_{k+1} Z_{L-k,n_{k+1}}}\\&=\pi _{L-k,n_{k+1}}[\eta _{(1)} = n_{k+1}], \end{aligned}$$

which tends to one by Proposition 2. \(\square \)

Corollary 1

In the thermodynamic limit \(N,L\rightarrow \infty \) such that \(N/L \rightarrow \rho \), \(M/N \rightarrow x\in (0,1)\) with \(dL \log L\rightarrow 0\) we have

$$\begin{aligned} \frac{1}{d^{\lceil 1/x \rceil -1}N^{\lceil 1/x \rceil -2}}\pi _{L,N} \big [\eta _{(1)}=M\big ]\rightarrow \frac{C(x)}{\rho ^{\lceil 1/x\rceil -1}}, \end{aligned}$$

where \(0<C(x)<\infty \) is an x dependent constant.


The result follows rather directly from the previous proposition, and we sketch the main calculations required. First fix \(M \in [N/2,N)\cap {\mathbb {N}}\), then conditioned on the event \(\{\eta _{(1)}{=} M\}\) the configuration must contain at least two non-empty sites. Observe that \(\{\eta _{(1)}{=} M\}\) is given by the disjoint union

$$\begin{aligned} \{\eta _{(1)} = M\} = \{\eta _{(1)}=M,\,{\tilde{\eta }}_3 > 0\} \cup \{{\tilde{\eta }}_{1} = M,{\tilde{\eta }}_{2}=N-M\}\cup \{{\tilde{\eta }}_{1} = N-M,{\tilde{\eta }}_{2}=M\}. \end{aligned}$$

From (47) in Proposition 5 we see that \(\pi _{L,N} \big [\eta _{(1)}=M\,;\,{\tilde{\eta }}_3 > 0 \big ]\) decays to zero faster than d. Applying (46) to the probability of the remaining two events we find

$$\begin{aligned} \frac{1}{d} \pi _{L,N} \big [\eta _{(1)}=M\big ] \rightarrow \frac{1}{\rho x (1-x)}\quad \text{ so }\quad C(x)=\frac{1}{x(1-x)} \hbox { for } x\in [1/2,1). \end{aligned}$$

More generally, fix \(k \in {\mathbb {N}}\) and \(M \in [N/(k+1),N/k)\), let \(n_1 = M\), then we can again decompose as a disjoint union as follows

$$\begin{aligned} \{\eta _{(1)} = M\}&= \{\eta _{(1)}=n_1,\,{\tilde{\eta }}_{k+2} > 0 \big \}\\&\bigcup _{\sigma \in S_{k+1}}\bigcup _{\begin{array}{c} n_2,\ldots ,n_{k}\,:\,\\ n_1\ge n_2\ge \ldots \ge n_{k+1} \end{array}} \{ {\tilde{\eta }}_{1}=n_{\sigma (1)},{\tilde{\eta }}_{2}=n_{\sigma (2)}, \ldots ,{\tilde{\eta }}_{k+1}=n_{\sigma (k+1)}\} \end{aligned}$$

where \(S_{k+1}\) is the set of permutations of \(\{1,2,\ldots ,k+1\}\) and \(n_{k+1} =N-\sum _{i=1}^{k}n_{i}\). For \(n_1\ge n_2\ge \ldots \ge n_{k+1}\) to hold we must have \((k+1-i)n_{i+1} \ge N-\sum _{j=1}^{i} n_j\) for each \(i \in \{1,\ldots ,k-1\}\). Again by (47), the probability of the event \(\big \{\eta _{(1)}=n_1,\,{\tilde{\eta }}_{k+2} > 0 \big \}\) decays faster than \(d^{k}L^{k-1}\). Applying (46) yields

$$\begin{aligned} \frac{1}{d^k}\pi _{L,N}[\eta _{(1)} = M]\simeq N^{k-1} \frac{1}{\rho ^k}\underbrace{\sum _{\sigma \in S_{k+1}} \int ^x_{\frac{1-x}{k}}{\dots }\int _{\frac{\prod _{i=1}^{k-1}(1-x_i)}{2}}^{x_{k-1}} \prod _{i=1}^{k}(1-x_{\sigma (i)})^{i-k-1} dx_k{\ldots } dx_{2}}_{{:}{=}C(x)}, \end{aligned}$$

and (49) follows. \(\square \)

If we take \(d = L^{-\gamma }\) with \(\gamma > 1\) and write \(m=\rho x\) for the limit of \(M/L\), so that \(\rho /m = 1/x\), then we may summarize Corollary 1 in terms of a large deviation rate function (with speed \(\log L\)) as follows:

$$\begin{aligned} -\frac{1}{\log L} \log \pi _{L,N}[\eta _{(1)} = M] \rightarrow I_\rho (m) = (\lceil \rho /m \rceil - 1) \gamma - (\lceil \rho /m \rceil - 2). \end{aligned}$$

This is illustrated in Fig. 4 (right) for \(\gamma = 2\).
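For numerical work the rate function can be evaluated directly. The following Python sketch is illustrative only: the function name `rate_function` is hypothetical, and we assume that \(m\) denotes the limit of \(M/L\), so that \(\rho /m = 1/x\) and \(0<m<\rho \).

```python
import math


def rate_function(m: float, rho: float, gamma: float) -> float:
    """Evaluate I_rho(m) = (ceil(rho/m) - 1) * gamma - (ceil(rho/m) - 2).

    Assumptions (not taken from the source): m is the limiting value of M/L
    with 0 < m < rho, and d = L**(-gamma) with gamma > 1.
    """
    if not (0.0 < m < rho):
        raise ValueError("need 0 < m < rho")
    if gamma <= 1.0:
        raise ValueError("need gamma > 1")
    k = math.ceil(rho / m)  # number of non-empty sites in the dominant event
    return (k - 1) * gamma - (k - 2)


if __name__ == "__main__":
    for m in (0.75, 0.5, 0.4, 0.3):
        print(m, rate_function(m, rho=1.0, gamma=2.0))
```

With \(\rho =1\) and \(\gamma =2\) the function is a step function in \(m\), taking the values 2, 3 and 4 on \([1/2,1)\), \([1/3,1/2)\) and \([1/4,1/3)\), respectively, so each additional non-empty site costs \(\gamma -1\).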

5 Discussion

5.1 Summary

We have established a complete picture for condensation in the inclusion process in the thermodynamic limit, and characterized the condensed phase in several regimes using size-biased sampling of configurations. Our results cover the full scaling regime of the diffusion parameter d, only excluding some narrow bands of size \(\log L/L\) for complete condensation and large deviations. A particularly interesting regime is the hierarchical structure discussed in Sect. 3.2 related to the GEM and the Poisson–Dirichlet distribution. This is well established in the context of population genetics [41], where the full structure of Dirichlet multinomials has been exploited to derive very detailed results for Moran models, which can be interpreted as inclusion processes. We derived our results using only the most general properties of inclusion processes so that our approach can be easily transferred to other systems, and we give more details in the next subsection.

The Poisson–Dirichlet distribution has been identified as the unique stationary distribution of split-merge dynamics of clusters [29, 30], where split and merge rates are proportional to cluster sizes. Our results show that the inclusion process can be seen as a generic ‘monomer exchange’ version of such dynamics, where now only single particles are exchanged but with the same proportionality of rates in the inclusion interaction. It would be very interesting to investigate this connection in detail in the context of Poisson–Dirichlet diffusions in analogy to [63]. The crucial prerequisite to see Poisson–Dirichlet statistics in particle systems such as the inclusion process is the asymptotic behaviour of the stationary weights (20),

$$\begin{aligned} w(0)=1,\quad w(n)=O(d) \text{ for } \text{ all } n\ge 1 \text{ as } L\rightarrow \infty ,\quad \text{ and }\quad w(n)/d\simeq n^{-1} \text{ as } n\rightarrow \infty . \end{aligned}$$

The fact that \(w(n)\) vanishes proportionally to d as \(L\rightarrow \infty \) for all \(n>0\) leads to \(\rho _b =\rho _c =0\) and condensation with an empty bulk. The structure of the condensed phase is determined by the \(1/n\) decay of stationary weights for large occupation numbers. This is quite robust, as is discussed in the next subsection. There we summarize some previous results and connections to other particle systems with Poisson–Dirichlet statistics.

5.2 Other Particle Systems with Poisson–Dirichlet Statistics

The model studied in [34] consists of N particles moving diffusively on a one-dimensional torus of length L, subject to a logarithmic attractive potential and short-range hard-core exclusion. The weak attraction leads to the formation of large gaps between groups of particles, and the distances \(y=(y_1 ,\ldots ,y_N )\) between particles have a stationary distribution of the form (12) with weights \(w(y)=y^{-\beta }\), where \(\beta <1\) corresponds to a dimensionless inverse temperature controlling the strength of the noise. So the rescaled distances \(\frac{1}{L} y\) provide a partition of the unit interval and follow a Dirichlet(\(1-\beta ,\ldots ,1-\beta \)) distribution. Of particular interest in [34] is the temperature scaling \(\beta =\frac{N-b}{N-1}\nearrow 1\) as \(N\rightarrow \infty \) with \(b>1\), where Theorem 2.1 in [41] directly applies so that the order statistics

$$\begin{aligned} \frac{1}{L}{\hat{y}} {\mathop {\longrightarrow }\limits ^{D}} \mathrm {PD}(b-1)\quad \text{ as } N,L\rightarrow \infty ,\quad L/N\rightarrow \rho , \end{aligned}$$

converge in distribution to a Poisson–Dirichlet partition of [0, 1]. Indeed, the corresponding Beta(\(1,b-1\)) distribution of the first size-biased marginal \({\tilde{y}}_1\) as in (17) is established independently in [34] without mentioning the connection to the Poisson–Dirichlet distribution. Note that in this model gaps between particles correspond to cluster sizes, and the average cluster size is therefore \(L/N\). A related paper with a hierarchical clustering phenomenon for interacting diffusions on a ring is [35], and to our knowledge these continuous models are the only particle systems where a connection to Poisson–Dirichlet statistics has been recognized so far. The Brownian energy process introduced in [12, 13] as a dual model to the inclusion process exhibits stationary product measures with chi-squared marginals, and conditioning on the total sum of occupation numbers leads to the same canonical distributions as the model in [34].
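The parameter \(b-1\) of the Poisson–Dirichlet limit stated above for the model of [34] can be traced to the scaling of \(\beta \). As a short sketch, we use the standard fact that the order statistics of a symmetric Dirichlet(\(a_N,\ldots ,a_N\)) partition into N parts converge to PD(\(\theta \)) whenever \(Na_N\rightarrow \theta >0\); the symbols \(a_N\) and \(\theta \) are introduced here for illustration only. With \(a_N=1-\beta \) we have

$$\begin{aligned} a_N =1-\frac{N-b}{N-1}=\frac{b-1}{N-1},\qquad N a_N=\frac{N(b-1)}{N-1}\rightarrow b-1\quad \text{ as } N\rightarrow \infty , \end{aligned}$$

so the total Dirichlet parameter converges to \(\theta =b-1\), which is positive precisely because \(b>1\).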

To test the robustness of our results against small changes in the stationary weights w(n), it is useful to consider zero-range processes. For any given w(n) it is well known that a process with the jump rate for a cluster of size n to lose a particle given by

$$\begin{aligned} u(n) =\frac{w(n-1)}{w(n)}\quad \text{ for } n\ge 1 \end{aligned}$$

exhibits stationary product measures of the form (10) (see e.g. [3, 17] and references therein). Using the weights (20) for the inclusion process this leads to jump rates

$$\begin{aligned} u(n)=\frac{n}{d+n-1}\quad \text{ for } \text{ all } n\ge 1, \end{aligned}$$

so that \(u(1)=1/d\) diverges in a scaling limit with \(d\rightarrow 0\). All other rates are bounded and converge as

$$\begin{aligned} u(n)\rightarrow \frac{n}{n-1} \quad \text{ as } L\rightarrow \infty \text{ for } \text{ all } n\ge 2. \end{aligned}$$
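The closed form \(u(n)=n/(d+n-1)\) can be checked numerically against the definition \(u(n)=w(n-1)/w(n)\). The sketch below assumes that the stationary weights (20) have the form \(w(n)=\Gamma (n+d)/(n!\,\Gamma (d))\), which is not restated in this section; all function names are hypothetical.

```python
import math


def log_weight(n: int, d: float) -> float:
    """log w(n) for the ASSUMED weights w(n) = Gamma(n + d) / (n! * Gamma(d))."""
    return math.lgamma(n + d) - math.lgamma(n + 1) - math.lgamma(d)


def zrp_rate(n: int, d: float) -> float:
    """Zero-range jump rate u(n) = w(n - 1) / w(n), computed via log-weights."""
    return math.exp(log_weight(n - 1, d) - log_weight(n, d))


def closed_form_rate(n: int, d: float) -> float:
    """Closed form u(n) = n / (d + n - 1) quoted in the text."""
    return n / (d + n - 1)


if __name__ == "__main__":
    d = 0.01
    for n in (1, 2, 3, 10):
        # u(1) = 1/d is large for small d; u(n) is close to n/(n-1) for n >= 2.
        print(n, zrp_rate(n, d), closed_form_rate(n, d))
```

Working with logarithms of the weights avoids overflow of the Gamma function for large \(n\); for small \(d\) the output shows \(u(1)=1/d\) separating from the remaining rates, which stay close to \(n/(n-1)\).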

A zero-range process with rates (51) has exactly the same stationary distributions (12) as the inclusion process and all our results apply. Condensation in zero-range processes has been a major research area in recent years (see e.g. [2, 5, 9]), where decreasing rates \(u(n)\simeq 1+b/n\) lead to stationary weights of order \(n^{-b}\), so that \(\phi _c =1\) and the critical density is given by (see discussion in Sect. 2.2)

$$\begin{aligned} \rho _c =R(1) =\frac{1}{z(1)}\sum _{n=1}^\infty nw(n)<\infty \quad \text{ for } b>2. \end{aligned}$$

In such models, condensation is driven by strong enough on-site attraction between particles. The rates (51) have asymptotic behaviour

$$\begin{aligned} u(n)\simeq \frac{n}{n-1} \simeq 1+\frac{1}{n}\quad \text{ as } n\rightarrow \infty \end{aligned}$$

and the attraction between particles is not strong enough. Instead, cluster coarsening and condensation are driven by divergence of \(u(1)=1/d\), which ensures that \(\rho _b =0\) in the bulk of the system and the remaining mass concentrates on a number of lattice sites decreasing in time.
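To make the comparison with the rates \(u(n)\simeq 1+b/n\) explicit, the limiting rates for \(n\ge 2\) can be expanded exactly as

$$\begin{aligned} \frac{n}{n-1}=1+\frac{1}{n-1}=1+\frac{1}{n}+\frac{1}{n(n-1)}, \end{aligned}$$

which corresponds to \(b=1\) up to a correction of order \(n^{-2}\). The associated stationary weights are then of order \(n^{-1}\), in line with the asymptotics \(w(n)/d\simeq n^{-1}\) of the weights (20), and the sum \(\sum _n n\,w(n)\) diverges, so the condition \(b>2\) for a finite critical density is not met.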

We have checked numerically that the particular form of the rates (51) is in fact not important, and choices of the form \(u(n)=n/(n-1)\) or \(u(n)=1+1/n\) for \(n\ge 2\) lead to the expected Poisson–Dirichlet statistics at stationarity for \(u(1)=1/d\simeq L/\alpha \) with \(\alpha >0\). This can be checked analytically on a case-by-case basis, but it is known that in general the asymptotic behaviour of the partition function and condensation behaviour may depend sensitively on perturbations of the rates (see e.g. [46, 64, 65]), so we are currently not able to prove a general result analogous to Theorem 1 based only on asymptotics of stationary weights or jump rates.