1 Introduction

Droplet models offer helpful guidance for understanding nucleation and condensation phenomena in classical statistical physics. They are known as Fisher droplet models or as the Frenkel-Band theory of association equilibrium, see [1,2,3] and the references therein. They treat a gas of molecules as an ideal mixture of droplets of different sizes, each coming with a partition function over internal degrees of freedom, or with some approximate formula for such internal partition functions. Condensation is understood as the formation of a large droplet of macroscopic size, and explicit computations are possible under the simplifying assumption that the mixture is ideal.

Rigorous results for droplet models that take into account excluded volume effects are sparse. Fisher proved that the phase transition for ideal droplet models subsists for a class of one-dimensional models [1, 4]; the one-dimensional model serves as a counter-example to the strict convexity of the pressure as a function of interaction potentials when the class of potentials is chosen too large [5], compare [6, Chapter V.2]. For particles in \({\mathbb {R}}^d\) with attractive interactions, errors in the ideal mixture approximation are bounded in [7, 8]; however, the bounds do not allow for a proof of phase transitions.

The present article proposes a toy model for which excluded volume effects and phase transitions can be understood rigorously, and that might pave the way for an application of renormalization techniques. To motivate the model it is helpful to describe first another model that we are not yet able to treat and that connects to a joint program started in [9] and pursued in [10, 11]. Consider a mixture of hard spheres in \({\mathbb {R}}^3\). Spheres are assumed to have integer volume \(k\in {\mathbb {N}}_0\) and are thought of as droplets made up of k particles. Distinct spheres cannot overlap, and a sphere of volume k comes with an energy \(E_k\) that satisfies \(E_k = k e_\infty +o(k)\) as \(k\rightarrow \infty \) with finite bulk energy \(e_\infty \). In order to control the distribution of sphere types it is natural to work in a multi-canonical ensemble, fixing the number \(N_k\) of k-spheres as well as the total volume \(\sum _k k N_k\) covered by spheres (a substitute for the total number of particles). In the thermodynamic limit \(N_k/V\rightarrow \rho _k\), \(\sum _k k N_k /V \rightarrow \rho \), this results in an associated Helmholtz free energy per unit volume, which at low density should be of the form

$$\begin{aligned} f\bigl (\beta ,(\rho _j)_{j\in {\mathbb {N}}}, \rho \bigr ) = \sum _{j=1}^\infty \rho _j E_j + \rho _\infty e_\infty + \beta ^{-1}\sum _{j=1}^\infty \rho _j (\log \rho _j -1) + \text {correction terms} \end{aligned}$$

where \(\rho _\infty : = \rho - \sum _{k=0}^\infty k \rho _k\) accounts for the possible loss of mass to very large spheres. The correction terms should capture excluded volume effects and one might hope for a convergent power series expansion in the variables \(\rho _j\) and \(\rho _\infty \). The question arises if the free energy of a given packing fraction, defined by minimizing over all compatible distributions on sphere sizes

$$\begin{aligned} f(\beta ,\rho ):= \min \Biggl \{ f\bigl (\beta ,(\rho _j)_{j\in {\mathbb {N}}}, \rho \bigr )\, \Bigg |\, \sum _{j=1}^\infty j \rho _j \le \rho \Biggr \}, \end{aligned}$$

is strictly convex or has affine pieces. For the ideal mixture the question is easily answered: If

$$\begin{aligned} p_c^{\mathrm {ideal}}(\beta ):=\sum _{j=1}^\infty \exp ( - \beta [E_j - j e_\infty ]),\quad \rho _c^{\mathrm {ideal}}(\beta ):= \sum _{j=1}^\infty j \exp ( - \beta [E_j - j e_\infty ]) \end{aligned}$$

are both finite, then the free energy is strictly convex in \(\rho < \rho _c^{\mathrm {ideal}}(\beta )\) and affine with slope \(e_\infty \) in \(\rho >\rho _c^{\mathrm {ideal}}(\beta )\); moreover, in the latter domain the unique minimizer in the variational formula is \(\rho _j = \exp ( - \beta [E_j - j e_\infty ])=:\rho _{j}^{\mathrm {ideal}}(\beta )\) and it satisfies \(\rho _\infty = \rho - \sum _{j=1}^\infty j \rho _j>0\). At low temperature, because \(\rho _c^{\mathrm {ideal}}(\beta )\rightarrow 0\) as \(\beta \rightarrow \infty \), one may hope that the excluded volume effects do not destroy the existence of a first-order phase transition and that correction terms might be expressed in terms of convergent power series in the sphere size distributions \(\rho _{j}^{\mathrm {ideal}}(\beta )\), compare Sect. 6.
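As a rough numerical illustration of the two series defining \(p_c^{\mathrm {ideal}}(\beta )\) and \(\rho _c^{\mathrm {ideal}}(\beta )\), the following Python sketch evaluates truncated sums. The surface-type excess energy \(E_j - j e_\infty = a j^{2/3}\), the constant a, and the truncation `jmax` are assumptions made only for this example; they are not taken from the article.

```python
import math

def ideal_critical_values(beta, a=1.0, jmax=2000):
    """Truncated series for p_c^ideal(beta) and rho_c^ideal(beta).

    Assumption (hypothetical, for illustration only):
    E_j - j * e_inf = a * j**(2/3), a surface-type excess energy.
    """
    p_c = 0.0
    rho_c = 0.0
    for j in range(1, jmax + 1):
        weight = math.exp(-beta * a * j ** (2.0 / 3.0))
        p_c += weight        # term of p_c^ideal
        rho_c += j * weight  # term of rho_c^ideal
    return p_c, rho_c

for beta in (1.0, 2.0, 4.0, 8.0):
    p_c, rho_c = ideal_critical_values(beta)
    print(f"beta={beta:4.1f}  p_c={p_c:.6f}  rho_c={rho_c:.6f}")
```

Under this assumed energy both series are finite, and the printed values of \(\rho _c^{\mathrm {ideal}}(\beta )\) shrink as \(\beta \) grows, in line with the low-temperature heuristic above.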

Unfortunately, currently available convergence criteria for multi-species virial expansions [9, 11] impose exponential decay \(\rho _j \le \exp ( - \, \mathrm {const}\, j)\), which excludes the ideal equilibrium densities \(\exp ( - \beta [E_j - je_\infty ])\). Therefore the naive argument sketched above stays somewhat speculative. The purpose of the present article is to provide an example where the argument nonetheless does work. The price we pay is a drastic simplification of the mixture of hard spheres. It is our impression, however, that the model is a valuable addition to rigorous results in dimension one [1, 12]; moreover, the simplification is a very natural starting point in the context of renormalization group theory [13, 14].

In fact the present work was motivated by the study of a two-scale mixture of hard spheres in \({\mathbb {R}}^d\) [10]. Integrating out the small spheres gives rise to an effective model for large spheres with new effective multi-body interactions and an effective activity, which leads to improved domains of convergence in Mayer expansions. The results from [10] leave open whether similar improvements can be reached in multi-scale systems, integrating out objects one by one. The present article should serve as a useful companion when trying to implement such a program.

Our model consists of non-overlapping hypercubes in \({\mathbb {Z}}^d\) belonging to some admissible set \({\mathbb {B}}\). The model is a special case of a polymer system [15]. The set \({\mathbb {B}}\) of admissible cubes is such that if two cubes overlap, then necessarily one cube is contained in the other. Concretely, \({\mathbb {B}} = \cup _{j=0}^\infty {\mathbb {B}}_j\) where the set \({\mathbb {B}}_j\) of j-blocks contains the representative cube \(B_j = \{1,\ldots , 2^j\}^d\) and all its shifts by vectors \(2^j\varvec{k}\), \(\varvec{k} \in {\mathbb {Z}}^d\). Such geometries are often called hierarchical in the context of renormalization group theory [13, 14]. We consider both the grand-canonical ensemble and the multi-canonical ensemble. In the grand-canonical ensemble, described in detail in Sect. 2, j-blocks have activity \(z_j\). In the multi-canonical ensemble we work with density variables \(\rho _j\) and the overall packing fraction \(\sigma \), see Sect. 4.

In Sect. 3 we work in the grand-canonical ensemble and prove explicit formulas for the pressure and block densities as functions of the activities \(z_j\) (Theorems 3.1 and 3.2). The formulas are similar to formulas for an ideal mixture; the only difference is that the activity \(z_j\) is replaced with an effective activity \({{\widehat{z}}}_j\). The effective activity \({{\widehat{z}}}_j\) takes into account the volume excluded for blocks of type \(k< j\) in the presence of a j-block; it is exponentially smaller than the original activity, \({\widehat{z}}_j \le z_j \exp ( - \mathrm {const} |B_j|)\). This feature is shared by two-scale binary mixtures or colloids [10]. In addition, we prove an explicit inversion formula for the activities as functions of the densities and prove an equation of state for the pressure that is a variant of the van der Waals equation of state (Theorem 3.3). The equations are similar to equations for discrete systems of non-overlapping rods on a line [12].

In Sect. 4 we work in the multi-canonical ensemble and prove an explicit formula for the entropy as a function of block densities \(\rho _j\) and the overall packing fraction (Theorem 4.1). The entropy is the sum of the entropy of an ideal mixture plus a power series correction. The power series is absolutely convergent whenever the packing fraction is strictly smaller than 1 (Proposition 4.2)—there is no need for exponential decay \(\rho _j \le \exp (- \mathrm {const} |B_j|)\). We check that the pressure is a Legendre transform of the entropy and compute the maximizers in the resulting variational formula for the pressure (Proposition 4.3).

In Sect. 5 we investigate a parameter-dependent model with activities \(z_j(\mu ) = \exp ( \mu |B_j| - E_j)\) for some given sequence of energies \((E_j)_{j\in {\mathbb {N}}_0}\) and chemical potential \(\mu \in {\mathbb {R}}\), and we investigate possible phase transitions as \(\mu \) is varied. We prove a sufficient condition for the absence of phase transitions (Theorem 5.3). For constant energies \(E_j\equiv \lambda \) with \(\lambda \) sufficiently large, the mixture of cubes has a continuous phase transition (Theorem 5.5). The proof uses a parameter-dependent fixed point iteration, and we sketch some possible connections with Mandelbrot’s fractal percolation model [16, 17]. A necessary and sufficient condition for the existence of first-order phase transitions is given in Theorem 5.6.

2 The Model

2.1 Lattice Animals: Polymer Partition Function

Fix \(d\in {\mathbb {N}}\) and let \({\mathbb {X}}\) be the collection of finite non-empty subsets of \({\mathbb {Z}}^d\). Elements X of \({\mathbb {X}}\) are called lattice animals or polymers. For \(\Lambda \subset {\mathbb {Z}}^d\) a bounded non-empty set, let

$$\begin{aligned} {\mathbb {X}}_\Lambda = \{X\in {\mathbb {X}}\mid X\subset \Lambda \}. \end{aligned}$$

We are interested in probability measures on finite collections of lattice animals in \(\Lambda \) and define

$$\begin{aligned} \Omega _\Lambda := \Bigl \{ \omega = \{X_1,\ldots , X_r\}\, \Big |\ r\in {\mathbb {N}}_0,\ X_1,\ldots , X_r\subset \Lambda ,\ \forall i \ne j:\ X_i \ne X_j\Bigr \}. \end{aligned}$$

The empty configuration is explicitly allowed, i.e., \(\varnothing \in \Omega _\Lambda \). Note the one-to-one correspondence

$$\begin{aligned} \Omega _\Lambda \rightarrow \{0,1\}^{{\mathbb {X}}_\Lambda },\quad \omega \mapsto \bigl ( n_X(\omega )\bigr )_{X\in {\mathbb {X}}_\Lambda } \end{aligned}$$

given by

$$\begin{aligned} n_X(\omega ):= {\left\{ \begin{array}{ll} 1, &{}\quad X\in \omega ,\\ 0, &{}\quad X\notin \omega . \end{array}\right. } \end{aligned}$$

Assume we are given a map \(z:{\mathbb {X}}\rightarrow {\mathbb {R}}_+\), called activity. For \(\Lambda \subset {\mathbb {Z}}^d\) a bounded non-empty set, define the polymer partition function

$$\begin{aligned} \Xi _\Lambda := 1 + \sum _{r=1}^\infty \frac{1}{r!}\ \sum _{(X_1,\ldots , X_r) \in {\mathbb {X}}_\Lambda ^r} \Biggl ( \prod _{i=1}^r z(X_i) \Biggr ) \mathbb {1}_{\{\forall i \ne j:\ X_i \cap X_j = \varnothing \}} \end{aligned}$$

and the grand-canonical Gibbs measure, a probability measure \({\mathbb {P}}_\Lambda \) on \(\Omega _\Lambda \) given by

$$\begin{aligned} {\mathbb {P}}_\Lambda \bigl ( \omega = \{X_1,\ldots , X_r\} \bigr ) := \frac{1}{\Xi _\Lambda } \mathbb {1}_{\{\forall i \ne j:\ X_i \cap X_j = \varnothing \}} \prod _{i=1}^r z(X_i), \quad {\mathbb {P}}_\Lambda \bigl ( \omega = \varnothing \bigr ) := \frac{1}{\Xi _\Lambda }. \end{aligned}$$

The probabilistically minded reader may think of \({\mathbb {P}}_\Lambda \) as independent Bernoulli variables \(n_X(\omega )\) with parameters \(z(X)/(1+z(X))\) conditioned on non-overlap of the polymers X.

In order to pass to the limit \(\Lambda \nearrow {\mathbb {Z}}^d\) we impose conditions on the activity.

Definition 2.1

For \(z:{\mathbb {X}}\rightarrow {\mathbb {R}}_+\) and \(\theta \in {\mathbb {R}}\), let

$$\begin{aligned} ||z||_\theta := \sup _{x\in {\mathbb {Z}}^d} \sum _{X\ni x} \frac{1}{|X|} z(X)\, {\mathrm {e}}^{- \theta |X|}. \end{aligned}$$

The activity \(z(\cdot )\) is stable if \(||z||_ \theta <\infty \) for some \(\theta \in {\mathbb {R}}\).

The definition is adapted from Gruber and Kunz [15, Eq. (23)], who call the activity stable if instead \(||z||_0<\infty \) but also observe some scaling invariance of the model [15, Eq. (22)]; see the proof of Lemma 2.2 below. Our definition incorporates possible rescalings into the definition of stability and allows for \(\theta >0\) and activities that are exponentially large in the polymer size |X|. Stability ensures a uniform bound on the finite-volume pressure.

Lemma 2.2

Suppose that the activity \(z(\cdot )\) is stable. Then for all \(\theta \in {\mathbb {R}}\) with \(||z||_\theta <\infty \) and for all \(\Lambda \subset {\mathbb {Z}}^d\), we have

$$\begin{aligned} \frac{1}{|\Lambda |} \log \Xi _\Lambda \le \theta + {\mathrm {e}}^{-\theta } + ||z||_\theta < \infty \end{aligned}$$

Proof

We follow [15, Lemma 1]. Define \(\Phi _\theta (X) = z(X) \exp ( - \theta | X|)\) if \(|X|\ge 2\) and \(\Phi _\theta (\{x\}) = (1+ z(\{x\}) )\exp ( - \theta )\). Then \(\Xi _\Lambda \) is a sum over set partitions \(\{X_1,\ldots , X_r\}\) of \(\Lambda \). For example, if \(d=1\) and \(\Lambda =\{0,1\}\), then

$$\begin{aligned} \Xi _{\{0,1\}}&= 1+ z(\{0\}) + z(\{1\}) + z(\{0\}) z(\{1\}) + z(\{0,1\})\\&= \bigl (1+ z(\{0\})\bigr )\bigl (1+ z(\{1\})\bigr ) + z(\{0,1\}) \\&= \Phi _0(\{0\}) \Phi _0(\{1\}) + \Phi _0(\{0,1\}). \end{aligned}$$

More generally,

$$\begin{aligned} \Xi _\Lambda&= \sum _{\{X_1,\ldots , X_r\}} \Phi _0(X_1)\cdots \Phi _0(X_r) = {\mathrm {e}}^{|\Lambda |\theta } \sum _{\{X_1,\ldots , X_r\}} \Phi _\theta (X_1)\cdots \Phi _\theta (X_r) \\&= {\mathrm {e}}^{|\Lambda | \theta }\sum _{\{X_1,\ldots , X_r\}} \prod _{i=1}^r \Biggl ( \sum _{x_i\in X_i} \frac{\Phi _\theta (X_i)}{|X_i|}\Biggr ) \\&\le {\mathrm {e}}^{|\Lambda | \theta }\Biggl ( 1+ \sum _{r=1}^\infty \frac{1}{r!}\sum _{(x_1,\ldots , x_r)\in \Lambda ^r} \prod _{i=1}^r \Biggl ( \sum _{X_i \ni x_i} \frac{\Phi _\theta (X_i)}{|X_i|} \Biggr )\Biggr )\\&= {\mathrm {e}}^{|\Lambda | \theta }\exp \Biggl ( \sum _{x\in \Lambda } \sum _{X \ni x} \frac{\Phi _\theta (X)}{|X|} \Biggr ). \end{aligned}$$

It follows that

$$\begin{aligned} \frac{1}{|\Lambda |} \log \Xi _\Lambda \le (\theta + {\mathrm {e}}^{-\theta })+ \frac{1}{|\Lambda |} \sum _{x\in \Lambda } \sum _{\begin{array}{c} X \in {\mathbb {X}}_\Lambda :\\ x\in X \end{array}} \frac{1}{|X|} z(X) {\mathrm {e}}^{-\theta |X|} \le \theta + {\mathrm {e}}^{-\theta } + ||z||_\theta < \infty . \end{aligned}$$

\(\square \)
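The bound of Lemma 2.2 can be checked numerically on a small example. The Python sketch below is an illustration under assumptions that are not part of the article: \(d=1\), the box is \(\Lambda =\{0,\ldots ,4\}\), and the activity is supported on intervals of length at most 3 with hypothetical weights `w`. For such an activity every site lies in exactly \(\ell \) intervals of length \(\ell \), so the factor \(1/|X|\) cancels and \(||z||_\theta = \sum _\ell w_\ell \, {\mathrm {e}}^{-\theta \ell }\).

```python
import itertools
import math

def intervals(L):
    """All non-empty intervals inside {0, ..., L-1}, as frozensets of sites."""
    return [frozenset(range(a, b)) for a in range(L) for b in range(a + 1, L + 1)]

def partition_function(L, w):
    """Brute-force Xi_Lambda: sum over sets of pairwise disjoint polymers."""
    polymers = intervals(L)
    total = 0.0
    for r in range(len(polymers) + 1):
        for omega in itertools.combinations(polymers, r):
            covered = frozenset().union(*omega)
            if len(covered) == sum(len(X) for X in omega):  # pairwise disjoint
                total += math.prod(w(len(X)) for X in omega)
    return total

def stability_norm(w, theta, max_len):
    """||z||_theta for an activity supported on intervals of length <= max_len."""
    return sum(w(l) * math.exp(-theta * l) for l in range(1, max_len + 1))

def w(length):
    """Hypothetical interval activities; zero for lengths above 3."""
    return {1: 0.7, 2: 1.5, 3: 4.0}.get(length, 0.0)

L = 5
log_xi_per_site = math.log(partition_function(L, w)) / L
for theta in (0.0, 0.5, 1.0, 2.0):
    bound = theta + math.exp(-theta) + stability_norm(w, theta, 3)
    print(f"theta={theta:3.1f}  log(Xi)/|Lambda|={log_xi_per_site:.4f}  bound={bound:.4f}")
```

The enumeration follows the definition of \(\Xi _\Lambda \) directly, so it is only practical for very small boxes.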

2.2 Hierarchical Cubes

Now we specialize to activity maps \(z(\cdot )\) supported on a collection \({\mathbb {B}}\subset {\mathbb {X}}\) of cubes with the property that if \(A,B \in {\mathbb {B}}\) have non-empty intersection, then necessarily \(A\subset B\) or \(B\subset A\). A set \(B \subset {\mathbb {Z}}^d\) is called a j-block if

$$\begin{aligned} B= \{ k_1 2^{j} +1,\ldots , (k_1 +1) 2^j\}\times \cdots \times \{ k_d 2^{j} +1,\ldots , (k_d +1) 2^j\} \end{aligned}$$

for some \({\varvec{k}} =(k_1,\ldots , k_d) \in {\mathbb {Z}}^d\). Let \({\mathbb {B}}_j\) be the set of j-blocks. The blocks \(B\in {\mathbb {B}}_j\) form a tiling of \({\mathbb {Z}}^d\) consisting of the tile

$$\begin{aligned} B_j:= \{1,\ldots , 2^j\}^d \end{aligned}$$

and non-overlapping shifts of \(B_j\). Let \((z_j)_{j\in {\mathbb {N}}_0}\) be a sequence of non-negative numbers. We are interested in activity maps of the form

$$\begin{aligned} z(X) = {\left\{ \begin{array}{ll} z_j, &{}\quad \text {if }\ X= B\in {\mathbb {B}}_j,\\ 0, &{}\quad \text {if } X\in {\mathbb {X}}\setminus \bigcup _{j=0}^\infty {\mathbb {B}}_j. \end{array}\right. } \end{aligned}$$
(2.1)

Thus \(z_0\) is the activity of a monomer \(\{x\}\) and \(z_1\) the activity of a cube with sidelength 2. Define

$$\begin{aligned} \theta ^*:= \limsup _{j\rightarrow \infty }\frac{1}{|B_j|}\log z_j. \end{aligned}$$

Lemma 2.3

The activity (2.1) is stable if and only if \(\theta ^*<\infty \).

Proof

For every given block type \(j\in {\mathbb {N}}_0\), every point \(x\in {\mathbb {Z}}^d\) belongs to exactly one j-block, therefore

$$\begin{aligned} ||z||_\theta = \sum _{j=0}^\infty \frac{1}{|B_j|} z_j {\mathrm {e}}^{-\theta |B_j|}. \end{aligned}$$

If \(||z||_\theta < \infty \) for some \(\theta \in {\mathbb {R}}\), then \(z_j \le ||z||_\theta |B_j| \exp ( \theta |B_j|)\) hence \(\theta ^* \le \theta <\infty \). Conversely, if \(\theta ^*<\infty \), then for every \(\theta >\theta ^*\) we have \(z_j \exp ( - |B_j|\theta ) \le \exp (- |B_j| (\theta - \theta ^* +o(1)))\) which goes to zero exponentially fast as \(j\rightarrow \infty \), therefore \(||z||_\theta <\infty \) and the activity is stable. \(\square \)

2.3 Ideal Mixture: Bernoulli Variables

To help interpret subsequent formulas we recall the expression of the partition function for an ideal mixture of cubes, where cubes of different type may overlap. For \(\Lambda \in {\mathbb {B}}\), set

$$\begin{aligned} \Xi _\Lambda ^{\mathrm {Ber}} := \sum _{\omega \in \Omega _\Lambda } \prod _{X\in \omega } z(X) \end{aligned}$$

with \(\prod _{X\in \varnothing } z(X) =1\), and let \({\mathbb {P}}_\Lambda ^{\mathrm {Ber}}\) be the associated probability measure on \(\Omega _\Lambda \). It is straightforward to check that under \({\mathbb {P}}_\Lambda ^{\mathrm {Ber}}\), the occupation numbers \(n_X(\omega )\), \(X\subset \Lambda \), are independent Bernoulli variables with

$$\begin{aligned} {\mathbb {P}}_\Lambda ^{\mathrm {Ber}} \bigl ( n_X(\omega ) =1\bigr ) = {\mathbb {P}}_\Lambda ^{\mathrm {Ber}}(\omega \ni X) = \frac{z(X)}{1+z(X)}. \end{aligned}$$

For the activities (2.1) and \(\Lambda = \Lambda _n\in {\mathbb {B}}_n\), the finite-volume pressure of the ideal mixture is

$$\begin{aligned} \frac{1}{|\Lambda |} \log \Xi _\Lambda ^{\mathrm {Ber}}= \frac{1}{|\Lambda |} \sum _{\begin{array}{c} B\in {\mathbb {B}}:\\ B\subset \Lambda \end{array}} \log (1+ z(B)) =\sum _{j=0}^n \frac{1}{|B_j|} \log (1+ z_j). \end{aligned}$$

The infinite-volume pressure for the ideal mixture is therefore

$$\begin{aligned} p^{\mathrm {Ber}} := \lim _{\Lambda \nearrow {\mathbb {Z}}^d} \frac{1}{|\Lambda |} \log \Xi _\Lambda ^{\mathrm {Ber}} = \sum _{j=0}^\infty \frac{1}{|B_j|} \log (1+ z_j). \end{aligned}$$
(2.2)

The factor \(1/|B_j|\) reflects the lack of full translational invariance of the model: only translates by multiples of \(2^j\) map a j-block to another admissible j-block. The factor \(1/|B_j|\) also appears in the relation between the expected number of j-blocks and the probability that a given j-block is present: if \(B_j\subset \Lambda \) then

$$\begin{aligned} {\mathbb {E}}_\Lambda ^{\mathrm {Ber}} \bigl [\text {number of } j\text {-blocks in } \omega \bigr ] = \sum _{\begin{array}{c} B\in {\mathbb {B}}_j:\\ B\subset \Lambda \end{array}} {\mathbb {E}}_\Lambda ^{\mathrm {Ber}}\bigl [ n_B(\omega ) \bigr ]= \frac{|\Lambda |}{|B_j|}\, {\mathbb {P}}^{\mathrm {Ber}}_\Lambda \bigl ( n_{B_j}(\omega ) =1\bigr ). \end{aligned}$$
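The product structure behind the finite-volume pressure of the ideal mixture can be verified by direct enumeration. The following Python sketch assumes \(d=1\), a box \(\Lambda _3\), and hypothetical activities `z`; it sums over all subsets of blocks, overlapping or not, and compares with \(\sum _{j=0}^n |B_j|^{-1} \log (1+z_j)\).

```python
import itertools
import math

def blocks_in_box(n):
    """Hierarchical blocks inside the 1D box Lambda_n, labelled (type j, index)."""
    return [(j, i) for j in range(n + 1) for i in range(2 ** (n - j))]

def xi_ber_brute_force(n, z):
    """Sum of the product of activities over all subsets of blocks (overlaps allowed)."""
    blocks = blocks_in_box(n)
    total = 0.0
    for r in range(len(blocks) + 1):
        for omega in itertools.combinations(blocks, r):
            total += math.prod(z[j] for j, _ in omega)
    return total

def pressure_ber(n, z, d=1):
    """Finite-volume pressure of the ideal mixture in Lambda_n."""
    return sum(math.log1p(z[j]) / 2 ** (j * d) for j in range(n + 1))

z = [0.5, 2.0, 10.0, 100.0]  # hypothetical activities z_0, ..., z_3
n = 3
brute = math.log(xi_ber_brute_force(n, z)) / 2 ** n
print(brute, pressure_ber(n, z))
```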

Remark 2.4

(Ideal gas and Poisson variables) The word “ideal mixture” often refers to a model where not only the hard-core interaction between different types of blocks is dropped, but also the self-interaction of j-blocks is discarded—i.e., not only is the mixture ideal but in addition each component on its own is an ideal gas. The configuration space of such a system is \({\mathbb {N}}_0^{{\mathbb {B}}}\) and the occupation numbers become Poisson variables with parameters \(z_j\) instead of Bernoulli variables. We have chosen the superscript “Ber” in order to avoid ambiguities associated with the word “ideal.”

3 Pressure: Grand-Canonical Ensemble

In the following \((\Lambda _n)_{n\in {\mathbb {N}}_0}\) represents a growing sequence of cubes \(\Lambda _n \in {\mathbb {B}}_n\) with \(\Lambda _n\nearrow {\mathbb {Z}}^d\). The pressure in finite volume and infinite volume is

$$\begin{aligned} p_n:= \frac{1}{|\Lambda _n|}\log \Xi _{\Lambda _n},\quad p:= \lim _{n\rightarrow \infty } p_n. \end{aligned}$$

We assume throughout the article that the activity is stable, i.e., \(\theta ^* = \limsup _{j\rightarrow \infty }\frac{1}{|B_j|}\log z_j <\infty \).

Theorem 3.1

The limit defining the pressure exists and satisfies \(\theta ^*\le p<\infty \). It is expressed in terms of the effective activities

$$\begin{aligned} {\widehat{z}}_0:= z_0,\quad {\widehat{z}}_j:= z_j {\mathrm {e}}^{- |B_j| p_{j-1}}\quad (j\ge 1) \end{aligned}$$

as

$$\begin{aligned} p= \sum _{j=0}^\infty \frac{1}{|B_j|}\log (1+{\widehat{z}}_j). \end{aligned}$$

Consequently the pressure for a system of non-overlapping cubes is given by a formula similar to the pressure (2.2) for the ideal mixture, the only difference is that the activities \(z_j\) are replaced by the effective activities \({\widehat{z}}_j\). The effective activity is similar to the renormalized activity for binary mixtures from [10].

Proof

It is straightforward to check the recurrence relation

$$\begin{aligned} \Xi _{\Lambda _n} = z_n + \bigl ( \Xi _{\Lambda _{n-1}}\bigr )^{2^d} \qquad (n\ge 1). \end{aligned}$$
(3.1)

By definition of \({\widehat{z}}_j\) and \(p_j\) the recurrence relation can be rewritten as

$$\begin{aligned} \Xi _{\Lambda _n} = ( 1+ {\widehat{z}}_n ) \bigl ( \Xi _{\Lambda _{n-1}}\bigr )^{2^d} \end{aligned}$$

which gives \(p_n = p_{n-1} + \frac{1}{|\Lambda _n|} \log (1+ {\widehat{z}}_n)\). Combining with \(p_0 = \log (1+ z_0 ) = \log (1+{\widehat{z}}_0)\) we find

$$\begin{aligned} p_n = \sum _{j=0}^n \frac{1}{|B_j|} \log (1+ {\widehat{z}}_j) \end{aligned}$$
(3.2)

and the existence in \({\mathbb {R}}_+\cup \{\infty \}\) of the limit defining p, and its representation as an infinite series, follow. The stability of the activity guarantees that the pressure is finite, see Lemma 2.2. The inequality \(p\ge \theta ^*\) follows from \(\Xi _{\Lambda _n}\ge z_n\). \(\square \)
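The two ways of computing \(p_n\) in the proof, the recurrence (3.1) and the series (3.2) in terms of effective activities, can be compared numerically. The Python sketch below does this for hypothetical activities `z` in dimensions \(d=1,2\); the function names are illustrative only.

```python
import math

def xi_by_recurrence(z, d):
    """Partition functions Xi_{Lambda_n} from Xi_n = z_n + Xi_{n-1}^(2^d)."""
    xi = [1.0 + z[0]]  # Lambda_0 is a single site: empty or one monomer
    for n in range(1, len(z)):
        xi.append(z[n] + xi[-1] ** (2 ** d))
    return xi

def effective_activities(z, d):
    """Effective activities zhat_j and finite-volume pressures p_j."""
    zhat, p = [], []
    p_prev = 0.0  # convention p_{-1} = 0, so that zhat_0 = z_0
    for j, zj in enumerate(z):
        vol = 2 ** (j * d)  # |B_j|
        zh = zj * math.exp(-vol * p_prev)
        p_prev += math.log1p(zh) / vol
        zhat.append(zh)
        p.append(p_prev)
    return zhat, p

z = [0.5, 2.0, 10.0, 100.0]  # hypothetical activities z_0, ..., z_3
for d in (1, 2):
    xi = xi_by_recurrence(z, d)
    zhat, p = effective_activities(z, d)
    for n in range(len(z)):
        direct = math.log(xi[n]) / 2 ** (n * d)
        print(f"d={d} n={n}  recurrence={direct:.12f}  series={p[n]:.12f}  zhat_n={zhat[n]:.3e}")
```

The direct recurrence overflows quickly for larger n or d, which is one practical reason to work with the effective activities instead.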

Next we investigate the density of j-blocks and the packing fraction. The probability that a cube \(B\subset \Lambda \) belongs to \(\omega \) is

$$\begin{aligned} \rho _\Lambda (B):= {\mathbb {P}}_\Lambda ( \omega \ni B ) = {\mathbb {E}}_\Lambda \bigl [n_B\bigr ]. \end{aligned}$$

It depends on the type of the block only, accordingly we write \(\rho _\Lambda (B) = \rho _{j,\Lambda }\) if \(B\in {\mathbb {B}}_j\). The expected number of j-blocks per unit volume is

$$\begin{aligned} \nu _{j,\Lambda }:= \frac{1}{|\Lambda |}\sum _{\begin{array}{c} B\in {\mathbb {B}}_j:\\ B\subset \Lambda \end{array}} \rho _\Lambda (B) = \frac{\rho _{j,\Lambda }}{|B_j|}. \end{aligned}$$
(3.3)

To simplify language we refer to both \(\nu _{j,\Lambda }\) and \(\rho _{j,\Lambda }\) as the density of j-blocks, though they are strictly speaking two different objects. The packing fraction is the fraction of volume covered by cubes

$$\begin{aligned} \sigma _\Lambda := \frac{1}{|\Lambda |}\, {\mathbb {E}}_\Lambda \left[ \biggl | \bigcup _{B\in \omega }B\biggr |\right] = \sum _{j} |B_j| \nu _{j,\Lambda } = \sum _j \rho _{j,\Lambda }. \end{aligned}$$

Below we show that the limits

$$\begin{aligned} \rho _j :=\lim _{n\rightarrow \infty } \rho _{j,\Lambda _n},\quad \sigma := \lim _{n\rightarrow \infty }\sigma _{\Lambda _n} \end{aligned}$$
(3.4)

exist. Notice \(\sigma \le 1\) and \(\sum _{j=0}^\infty \rho _j \le \sigma \).

Theorem 3.2

The limits (3.4) exist and satisfy the following.

(a) If \(\sum _{j=0}^\infty {\widehat{z}}_j < \infty \), then

$$\begin{aligned} \rho _j = \frac{{\widehat{z}}_j}{1+{\widehat{z}}_j}\prod _{k=j+1}^\infty \frac{1}{1+ {\widehat{z}}_k} >0,\quad \sigma = \sum _{j=0}^\infty \rho _j = 1 - \prod _{k=0}^\infty \frac{1}{1+{\widehat{z}}_k} <1. \end{aligned}$$

(b) If \(\sum _{j=0}^\infty {\widehat{z}}_j = \infty \), then \(\rho _j =0\) for all \(j\in {\mathbb {N}}_0\) and \(\sigma =1\), moreover \(p = \theta ^* \).

Case (b) corresponds to a close-packing regime where the box \(\Lambda _n\) is filled with large blocks. Case (a) corresponds to a gas of small cubes that fill only a fraction of the volume. See Sect. 5 for examples.

Proof

We show first that for all \(n\in {\mathbb {N}}_0\) and \(j=0,\ldots , n\), we have

$$\begin{aligned} \rho _{j,\Lambda _n} = \frac{{\widehat{z}}_j}{1+{\widehat{z}}_j} \frac{1}{1+{\widehat{z}}_{j+1}}\cdots \frac{1}{1+{\widehat{z}}_n}, \quad \sigma _{\Lambda _n} = 1- \prod _{j=0}^n \frac{1}{1+{\widehat{z}}_j}. \end{aligned}$$
(3.5)

The proof of the first part of (3.5) is by induction over \(n\ge j\) at fixed \(j\in {\mathbb {N}}_0\). If \(n=j\), then

$$\begin{aligned} \rho _{j,\Lambda _j} = {\mathbb {P}}_{\Lambda _j}(\omega = \{B_j\}) = \frac{z_j}{\Xi _{\Lambda _j}} = \frac{z_j}{(1+{\widehat{z}}_j) \Xi _{\Lambda _{j-1}}^{2^d}} = \frac{{\widehat{z}}_j}{1+{\widehat{z}}_j}. \end{aligned}$$

For the induction step, write \(\Lambda _n\) as a disjoint union of \(2^d\) cubes \(\Lambda _{n-1}^{(k)} \in {\mathbb {B}}_{n-1}\). Let

$$\begin{aligned} \omega _k:= \big \{ B \in \omega \mid B\subset \Lambda _{n-1}^{(k)}\big \} \end{aligned}$$

so that \(\omega =\omega _1\cup \cdots \cup \omega _{2^d}\), unless \(\omega = \{\Lambda _n\}\) contains an n-block. Conditional on \(\Lambda _n\notin \omega \), the projections \(\omega _1,\ldots ,\omega _{2^d}\) are independent, their distribution is given by the Gibbs measures \({\mathbb {P}}_{\Lambda _{n-1}^{(k)}}\), \(k=1,\ldots , 2^d\). Thus fixing a j-block \(B\subset \Lambda _{n}\), and assuming without loss of generality \(B\subset \Lambda _{n-1}^{(1)}\), we get

$$\begin{aligned} {\mathbb {P}}_{\Lambda _n}(B\in \omega )&= {\mathbb {P}}_{\Lambda _n}(B \in \omega _1 \mid \Lambda _n\notin \omega ) \times {\mathbb {P}}_{\Lambda _n}(\Lambda _n \notin \omega ) \\&= {\mathbb {P}}_{\Lambda _{n-1}^{(1)}} (B\in \omega _1) \times \frac{1}{1+{\widehat{z}}_n} = \Biggl ( \frac{{\widehat{z}}_j}{1+{\widehat{z}}_j} \prod _{k=j+1}^{n-1}\frac{1}{1+{\widehat{z}}_k}\Biggr ) \frac{1}{1+{\widehat{z}}_n} \end{aligned}$$

which is precisely the first part of (3.5). Thus the induction step is complete. For the second part of (3.5), set \(x_j= {\widehat{z}}_j / (1+ {\widehat{z}}_j)\) and \(y_j = 1 - x_j\). Then

$$\begin{aligned} 1 = \prod _{j=0}^n (x_j + y_j) = x_n + y_n \prod _{j=0}^{n-1} (x_j + y_j) = x_n + y_n x_{n-1}+\cdots + y_n \cdots y_1 x_0 + y_n\cdots y_0 \end{aligned}$$

hence

$$\begin{aligned} 1- \prod _{j=0}^n y_j = \sum _{j=0}^n x_j y_{j+1}\cdots y_n \end{aligned}$$

which is the second part of (3.5).

If \(\sum _{j=0}^\infty {\widehat{z}}_j<\infty \), then the infinite product \(\prod _{j=0}^\infty (1+ {\widehat{z}}_j)^{-1}\) is strictly positive (because the logarithm is finite). We pass to the limit in (3.5) and obtain part (a) of the theorem.

If \(\sum _{j=0}^\infty {\widehat{z}}_j =\infty \), then \(\sum _{j=0}^\infty \log (1+{\widehat{z}}_j) =\infty \) and \(\lim _{n\rightarrow \infty }\prod _{j=0}^n (1+{\widehat{z}}_j)^{-1} =0\). Passing to the limit in (3.5) we see that \(\rho _j=0\) for all \(j\in {\mathbb {N}}_0\) and \(\sigma =1\). It remains to check that \(p = \theta ^*\). We already know by Theorem 3.1 that \(p\ge \theta ^*\). Suppose by contradiction that \(p>\theta ^*\). In view of \(p = \sum _{j} \frac{1}{|B_j|} \log (1+ {\widehat{z}}_j)< \infty \) we have \({\widehat{z}}_j \le \exp ( |B_j| p)\). If \(p> \theta ^*\), then we would deduce that

$$\begin{aligned} \sum _{j=0}^\infty {\widehat{z}}_j = \sum _{j=0}^\infty z_j {\mathrm {e}}^{- |B_j| (p+o(1))} \le \sum _{j=0}^\infty {\mathrm {e}}^{- |B_j|( p - \theta ^*+o(1))} <\infty , \end{aligned}$$

contradicting the assumption \(\sum {\widehat{z}}_j =\infty \). Thus \(p \le \theta ^*\) and \(p = \theta ^*\). \(\square \)
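The finite-volume formulas (3.5) can be compared with a direct enumeration of all non-overlapping block configurations. The Python sketch below assumes \(d=1\), the box \(\Lambda _3\), hypothetical activities `z`, and sites labelled from 0 rather than 1; none of these choices come from the article.

```python
import math

def configurations(n, start, z):
    """All non-overlapping block configurations inside the 1D n-block
    {start, ..., start + 2**n - 1}, as (weight, blocks) pairs.
    A block is recorded as (type j, leftmost site)."""
    if n == 0:
        return [(1.0, []), (z[0], [(0, start)])]
    half = 2 ** (n - 1)
    left = configurations(n - 1, start, z)
    right = configurations(n - 1, start + half, z)
    confs = [(z[n], [(n, start)])]  # the box itself is occupied by one n-block
    for wl, bl in left:
        for wr, br in right:
            confs.append((wl * wr, bl + br))
    return confs

def brute_force(n, z):
    """Densities rho_{j,Lambda_n} and packing fraction from the Gibbs measure."""
    confs = configurations(n, 0, z)
    xi = sum(w for w, _ in confs)
    # probability that the j-block with leftmost site 0 is present
    rho = [sum(w for w, b in confs if (j, 0) in b) / xi for j in range(n + 1)]
    covered = sum(w * sum(2 ** j for j, _ in b) for w, b in confs)
    return rho, covered / (xi * 2 ** n)

def closed_form(n, z):
    """The same quantities from (3.5), using effective activities (d = 1)."""
    zhat, p_prev = [], 0.0
    for j in range(n + 1):
        zh = z[j] * math.exp(-(2 ** j) * p_prev)
        p_prev += math.log1p(zh) / 2 ** j
        zhat.append(zh)
    rho = []
    for j in range(n + 1):
        tail = math.prod(1.0 / (1.0 + zhat[k]) for k in range(j + 1, n + 1))
        rho.append(zhat[j] / (1.0 + zhat[j]) * tail)
    sigma = 1.0 - math.prod(1.0 / (1.0 + zh) for zh in zhat)
    return rho, sigma

z = [0.5, 2.0, 10.0, 100.0]  # hypothetical activities z_0, ..., z_3
rho_bf, sigma_bf = brute_force(3, z)
rho_cf, sigma_cf = closed_form(3, z)
print(rho_bf, sigma_bf)
print(rho_cf, sigma_cf)
```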

Next we turn to the equation of state and the inversion of the density-activity relation in the gas phase.

Theorem 3.3

Assume \(\sum _{j=0}^\infty {\widehat{z}}_j < \infty \). Then

$$\begin{aligned} p = \sum _{j=0}^\infty \frac{1}{|B_j|} \log \left( 1 + \frac{\rho _j}{1- \sum _{k=j}^\infty \rho _k}\right) \end{aligned}$$
(3.6)

and for all \(j\in {\mathbb {N}}_0\)

$$\begin{aligned} z_j = \frac{\rho _j \exp ( |B_j| p_{j-1})}{1 - \sum _{k=j}^\infty \rho _k}, \quad p_{j-1} = \sum _{k=0}^{j-1} \frac{1}{|B_k|} \log \left( 1+ \frac{\rho _k}{1 - \sum _{\ell =k}^\infty \rho _\ell } \right) \end{aligned}$$

with the convention \(p_{-1}=0\).

The equations are strikingly similar to the formulas for a one-dimensional system of non-overlapping rods [12, Theorem 2.12]. The equation of state (3.6) is a variant of the van der Waals equation of state.

Proof

We show first that for all \(n\in {\mathbb {N}}\) and \(j\in \{0,\ldots , n\}\),

$$\begin{aligned} {\widehat{z}}_j = \frac{\rho _{j,\Lambda _n}}{ 1 - \sum _{k=j}^n \rho _{k,\Lambda _n}},\qquad \alpha _{j,\Lambda _n}:= \prod _{k=j}^n \frac{1}{1+{\widehat{z}}_k}= 1- \sum _{k=j}^n \rho _{k,\Lambda _n}. \end{aligned}$$
(3.7)

The proof is by finite backward induction over \(j\le n\) at fixed n. For \(j=n\), we have \(\rho _{n,\Lambda _n} = {\widehat{z}}_n / (1+{\widehat{z}}_n)\) by (3.5) hence \({\widehat{z}}_n = \rho _{n,\Lambda _n} / (1- \rho _{n,\Lambda _n})\). Furthermore, \((1+ {\widehat{z}}_n)^{-1} = 1- \rho _{n,\Lambda _n}\). For the induction step, note

$$\begin{aligned} \rho _{j,\Lambda _n} = \frac{{\widehat{z}}_j}{1+ {\widehat{z}}_j} \prod _{k=j+1}^n \frac{1}{1+ {\widehat{z}}_k} = \frac{{\widehat{z}}_j}{1+ {\widehat{z}}_j} \, \alpha _{j+1,\Lambda _n}. \end{aligned}$$

It follows that

$$\begin{aligned} {\widehat{z}}_j = \frac{\rho _{j,\Lambda _n}}{\alpha _{j+1,\Lambda _n}- \rho _{j,\Lambda _n}} = \frac{\rho _{j,\Lambda _n}}{1- \sum _{k=j}^n \rho _{k,\Lambda _n}} \end{aligned}$$

and

$$\begin{aligned} \alpha _{j,\Lambda _n} = \frac{1}{1+{\widehat{z}}_j} \, \alpha _{j+1,\Lambda _n}= \Bigl ( 1- \frac{\rho _{j,\Lambda _n}}{\alpha _{j+1,\Lambda _n}}\Bigr ) \alpha _{j+1,\Lambda _n} = 1 - \sum _{k=j}^n \rho _{k,\Lambda _n}. \end{aligned}$$

The induction step is complete.

If \(\sum _{j=0}^\infty {\widehat{z}}_j<\infty \), then we may pass to the limit \(n\rightarrow \infty \) in (3.7) with the help of Theorem 3.2(a) and find

$$\begin{aligned} {\widehat{z}}_j = \frac{\rho _j}{1 - \sum _{k=j}^\infty \rho _k}. \end{aligned}$$

Theorem 3.1 and Eq. (3.2) in the proof of the theorem yield the formulas for p and \(p_n\), the expression for \(z_j\) follows as well. \(\square \)
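The inversion can be illustrated in finite volume, where (3.5) and (3.7) are exact. The Python sketch below maps hypothetical activities to the densities \(\rho _{j,\Lambda _n}\) and back; the function names and the sample values are assumptions of this example.

```python
import math

def densities_from_activities(z, d):
    """Finite-volume densities rho_{j,Lambda_n}, n = len(z) - 1, via (3.5)."""
    zhat, p_prev = [], 0.0
    for j, zj in enumerate(z):
        vol = 2 ** (j * d)  # |B_j|
        zh = zj * math.exp(-vol * p_prev)
        p_prev += math.log1p(zh) / vol
        zhat.append(zh)
    rho, tail = [0.0] * len(z), 1.0
    for j in reversed(range(len(z))):
        rho[j] = zhat[j] / (1.0 + zhat[j]) * tail
        tail /= 1.0 + zhat[j]  # product of 1/(1+zhat_k) over k >= j
    return rho

def activities_from_densities(rho, d):
    """Inverse map: zhat_j from (3.7), then z_j = zhat_j * exp(|B_j| p_{j-1})."""
    n = len(rho)
    zhat, tail_sum = [0.0] * n, 0.0
    for j in reversed(range(n)):
        tail_sum += rho[j]  # sum of rho_k over k >= j
        zhat[j] = rho[j] / (1.0 - tail_sum)
    z, p_prev = [], 0.0  # convention p_{-1} = 0
    for j in range(n):
        vol = 2 ** (j * d)
        z.append(zhat[j] * math.exp(vol * p_prev))
        p_prev += math.log1p(zhat[j]) / vol
    return z

z = [0.5, 2.0, 10.0, 100.0]  # hypothetical activities z_0, ..., z_3
for d in (1, 2):
    rho = densities_from_activities(z, d)
    print(d, rho, activities_from_densities(rho, d))
```

The subtraction \(1-\sum _{k\ge j}\rho _k\) loses precision when the packing fraction is close to 1, so the sketch is only meant for parameters well inside the gas phase.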

4 Entropy: Multi-Canonical Ensemble

4.1 Explicit Formula: Effective Densities

Here we compute the entropy in a multi-canonical ensemble, fixing the number of j-blocks for each j. For \(\omega \in \Omega \), let \(N_j(\omega )\) be the number of j-blocks in \(\omega \). For \(n\in {\mathbb {N}}\), \(\Lambda _n\in {\mathbb {B}}_n\), and \(N_0^{(n)},\ldots , N_n^{(n)} \in {\mathbb {N}}_0\), let

$$\begin{aligned} S_{\Lambda _n} \big (N_0^{(n)},\ldots , N_n^{(n)}\big ) = \log \biggl |\big \{\omega \in \Omega _{\Lambda _n} \mid \forall j:\, N_j(\omega ) = N_j^{(n)} \big \}\biggr |. \end{aligned}$$

Set

$$\begin{aligned} s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma \bigr ):= \lim _{n\rightarrow \infty } \frac{1}{|\Lambda _n|} S_{\Lambda _n} \big (N_0^{(n)},\ldots , N_n^{(n)}\big ) \end{aligned}$$
(4.1)

where the limit is taken along sequences such that \(\sum _{j=0}^n |B_j|\, N_j^{(n)}\le |\Lambda _n|\) and

$$\begin{aligned} \frac{1}{|\Lambda _n|} \sum _{j=0}^n |B_j|\, N_j^{(n)}\rightarrow \sigma ,\quad \forall j\in {\mathbb {N}}_0:\, \frac{N_j^{(n)}}{|\Lambda _n|}\rightarrow \frac{\rho _j}{|B_j|}. \end{aligned}$$
(4.2)

Notice that if (4.2) holds true, then necessarily

$$\begin{aligned} \sum _{j=0}^\infty \rho _j = \sum _{j=0}^\infty \lim _{n\rightarrow \infty }\frac{|B_j|\,N_j^{(n)}}{|\Lambda _n|} \le \lim _{n\rightarrow \infty } \sum _{j=0}^\infty \frac{|B_j|\,N_j^{(n)}}{|\Lambda _n|} = \sigma . \end{aligned}$$

In the sequel it is convenient to introduce, given \((\rho _j)_{j\in {\mathbb {N}}_0}\) and \(\sigma \ge \sum _{k=0}^\infty \rho _k\), the variables

$$\begin{aligned} \sigma _\infty := \sigma - \sum _{k=0}^\infty \rho _k,\quad \sigma _j := \sigma - \sum _{k=0}^{j-1} \rho _k = \sigma _\infty + \sum _{k=j}^\infty \rho _k. \end{aligned}$$
(4.3)

The variable \(\sigma _\infty \) represents, roughly, the fraction of volume covered by blocks that grow with n, while \(\sigma _j\) is the fraction of volume covered by blocks of type \(k\ge j\). Note that if \(\sigma = \sigma _\infty + \sum _{j=0}^\infty \rho _j \le 1\), then \(\rho _j \le 1 - \sigma _{j+1}\) for all \(j\in {\mathbb {N}}_0\).

Theorem 4.1

Let \(\varvec{\rho }\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\) and \(\sigma \ge 0\) with \(\sum _{j=0}^\infty \rho _j \le \sigma \le 1\). Then the limit (4.1) exists and is given by

$$\begin{aligned} s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma \bigr ) = - \sum _{j=0}^\infty \frac{1}{|B_j|} \Bigl ( \rho _j \log \frac{\rho _j}{1- \sigma _{j+1}} + (1- \sigma _j) \log \frac{1- \sigma _j}{1 - \sigma _{j+1}} \Bigr ) \end{aligned}$$

with the convention \(0 \log \frac{0}{0} =0\). Moreover

$$\begin{aligned} 0 \le s(\varvec{\rho },\sigma ) \le \sum _{j=0}^\infty \frac{1-\sigma _{j+1}}{|B_j|}\log 2 < \infty . \end{aligned}$$

An equivalent expression in terms of effective densities \({\widehat{\rho }}_j\) is given in Eq. (4.4) below. Notice that the entropy vanishes if \(\rho _j = 0\) for all \(j\in {\mathbb {N}}_0\): only small blocks (i.e., blocks whose size does not scale with the volume) contribute to the entropy.

Proof

Configurations can be constructed by placing first the biggest blocks (if present), i.e., the n-blocks, then the blocks of type \(n-1\), etc. The entropy equals

$$\begin{aligned} S_{\Lambda _n} \big (N_0^{(n)},\ldots , N_n^{(n)}\big ) = \sum _{j=0}^n \log \left( {\begin{array}{c} (|\Lambda _n| - \sum _{k=j+1}^n |B_k|\, N_k^{(n)})/|B_j|\\ N_j^{(n)}\end{array}}\right) . \end{aligned}$$

Indeed, having chosen the blocks of \(\omega \) of type \(k\ge j+1\), there are \((|\Lambda _n| - N_n^{(n)} |B_n| - \cdots - N_{j+1}^{(n)}|B_{j+1}| )/|B_j|\) available j-blocks to choose from for the placement of the \(N_j^{(n)}\) blocks of type j.

Set \(\rho _j^{(n)} := N_j^{(n)}|B_j| / |\Lambda _n|\) and \(\sigma _j^{(n)}:= \sum _{k=j}^n \rho _k^{(n)}\). Clearly \(\rho _j^{(n)}\rightarrow \rho _j\) and \(\sigma _j^{(n)}\rightarrow \sigma _j\) for all \(j\in {\mathbb {N}}_0\). Stirling’s formula and the resulting approximation \(\log \left( {\begin{array}{c}m\\ k\end{array}}\right) = - k \log \frac{k}{m} - (m - k) \log (1- \frac{k}{m} ) + O(\log k ) + O(\log (m-k)) + O(\log m)\) yield

$$\begin{aligned} \frac{1}{|\Lambda _n|} S_{\Lambda _n} \big (N_0^{(n)},\ldots , N_n^{(n)}\big ) = - \sum _{j=0}^n \frac{1}{|B_j|} \Biggl ( \rho _j^{(n)} \log \frac{\rho _j^{(n)}}{1- \sigma _{j+1}^{(n)}} + (1- \sigma _j^{(n)}) \log \frac{1- \sigma _j^{(n)}}{1 - \sigma _{j+1}^{(n)}} \Biggr ) + o(1). \end{aligned}$$

Summation and limits can be exchanged because each summand is bounded in absolute value by \(\frac{1- \sigma _{j+1}}{|B_j|} (\log 2)\) [see Eq. (4.4) below] and \(\sum _{j} \frac{1}{|B_j|}<\infty \). The theorem follows. \(\square \)
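
The counting argument can be checked numerically. The following Python sketch is not part of the proof. It assumes, for illustration only, dimension \(d=1\), block volumes \(|B_j| = 2^{dj}\) and \(\Lambda _n = B_n\); all function names are hypothetical. It evaluates the product of binomial coefficients with log-gamma functions and compares the entropy per site with the limiting formula of Theorem 4.1.

```python
import math

def block_volume(j, d=1):
    # Assumption (not stated in this section): |B_j| = 2**(d*j).
    return 2 ** (d * j)

def log_binom(m, k):
    # Logarithm of the binomial coefficient "m choose k" via log-gamma.
    return math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)

def exact_entropy(counts, n, d=1):
    """S_{Lambda_n}(N_0, ..., N_n) for Lambda_n = B_n, placing the biggest blocks first."""
    volume = block_volume(n, d)
    covered = 0                       # volume used by blocks of type k > j
    total = 0.0
    for j in range(n, -1, -1):
        slots = (volume - covered) // block_volume(j, d)
        total += log_binom(slots, counts[j])
        covered += counts[j] * block_volume(j, d)
    return total

def limit_entropy(rho, sigma_inf=0.0, d=1):
    """Formula of Theorem 4.1 when only finitely many rho_j are nonzero."""
    s, tail = 0.0, sigma_inf          # tail plays the role of sigma_{j+1}
    for j in range(len(rho) - 1, -1, -1):
        sigma_j = tail + rho[j]
        term = 0.0
        if rho[j] > 0:
            term += rho[j] * math.log(rho[j] / (1 - tail))
        if sigma_j < 1:
            term += (1 - sigma_j) * math.log((1 - sigma_j) / (1 - tail))
        s -= term / block_volume(j, d)
        tail = sigma_j
    return s

n, rho = 30, [0.2, 0.1, 0.3]          # target densities for j = 0, 1, 2; sigma_inf = 0
counts = [0] * (n + 1)
for j, r in enumerate(rho):
    counts[j] = int(r * block_volume(n) / block_volume(j))
print(exact_entropy(counts, n) / block_volume(n), limit_entropy(rho))
```

Under these assumptions the two printed numbers should agree up to the \(o(1)\) correction coming from Stirling's formula.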

The proof of Theorem 4.1 suggests working with effective densities. Set

$$\begin{aligned} {\widehat{\rho }}_j := \frac{\rho _j}{1- \sigma _{j+1}} = \frac{\rho _j}{ 1- \sum _{k=j+1}^\infty \rho _k - \sigma _\infty } \end{aligned}$$

with \(\sigma _j\) and \(\sigma _\infty \) defined in (4.3). Thus \({\widehat{\rho }}_j\) takes into account the volume excluded by cubes of type \(k\ge j+1\). The entropy becomes

$$\begin{aligned} s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma \bigr ) = - \sum _{j=0}^\infty \frac{1- \sigma _{j+1}}{|B_j|} \Bigl ( {\widehat{\rho }}_j \log {\widehat{\rho }}_j +(1-{\widehat{\rho }}_j )\log (1-{\widehat{\rho }}_j)\Bigr ). \end{aligned}$$
(4.4)

The entropy for the ideal mixture, where cubes may overlap, is instead given by

$$\begin{aligned} s^{\mathrm {Ber}}\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma \bigr ) = - \sum _{j=0}^\infty \frac{1}{|B_j|} \Bigl ( \rho _j \log \rho _j +(1-\rho _j )\log (1- \rho _j)\Bigr ). \end{aligned}$$
(4.5)
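
As a sanity check, the sketch below evaluates the entropy in the form of Theorem 4.1, in the effective-density form (4.4), and in the Bernoulli form (4.5), for a finite family of densities. It is an illustration only: the choice \(|B_j| = 2^{dj}\) with \(d=1\) and all helper names are assumptions, not part of the text.

```python
import math

def tails(rho, sigma_inf):
    """Return [sigma_0, ..., sigma_{J+1}] with sigma_j = sigma_inf + sum_{k >= j} rho_k."""
    out = [sigma_inf]
    for r in reversed(rho):
        out.append(out[-1] + r)
    return out[::-1]

def xlog(x, y=1.0):
    # x * log(x / y) with the convention 0 * log(0 / y) = 0.
    return 0.0 if x == 0 else x * math.log(x / y)

def entropy_thm41(rho, sigma_inf, d=1):
    sig = tails(rho, sigma_inf)
    return -sum((xlog(rho[j], 1 - sig[j + 1]) + xlog(1 - sig[j], 1 - sig[j + 1]))
                / 2 ** (d * j) for j in range(len(rho)))

def entropy_effective(rho, sigma_inf, d=1):
    # Form (4.4); requires sigma_{j+1} < 1 for all j.
    sig = tails(rho, sigma_inf)
    total = 0.0
    for j in range(len(rho)):
        free = 1 - sig[j + 1]                 # volume fraction not excluded by larger blocks
        r_hat = rho[j] / free                 # effective density
        total -= free / 2 ** (d * j) * (xlog(r_hat) + xlog(1 - r_hat))
    return total

def entropy_bernoulli(rho, d=1):
    # Ideal-mixture form (4.5).
    return -sum((xlog(r) + xlog(1 - r)) / 2 ** (d * j) for j, r in enumerate(rho))

rho, sigma_inf = [0.2, 0.1, 0.3], 0.15
print(entropy_thm41(rho, sigma_inf), entropy_effective(rho, sigma_inf), entropy_bernoulli(rho))
```

The first two values should coincide up to rounding, while the third differs by the excluded-volume corrections.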

The expressions for the entropy are again very similar to each other, just as for the pressure. The similarity in equations can be pushed a bit further. In the multi-canonical ensemble we define the chemical potential of j-blocks by

$$\begin{aligned} \mu _j \bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma _\infty \bigr ):=- |B_j| \frac{\partial }{\partial \rho _j}s\Bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma _\infty + \sum _{j=0}^\infty \rho _j \Bigr ). \end{aligned}$$
(4.6)

The chemical potential can be thought of as a derivative with respect to \(\nu _j = \rho _j / |B_j|\), which is the expected number of j-blocks per unit volume [remember (3.3)]. The derivative is taken at constant \(\sigma _\infty \) rather than constant \(\sigma \). We also define

$$\begin{aligned} \mu _\infty \bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma _\infty \bigr ):=- \frac{\partial }{\partial \sigma _\infty } s\Bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma _\infty + \sum _{j=0}^\infty \rho _j \Bigr ). \end{aligned}$$
(4.7)

Explicit computations yield

$$\begin{aligned} \mu _j = \log \frac{{\widehat{\rho }}_j}{1- {\widehat{\rho }}_j} - |B_j| \sum _{k=0}^{j-1}\frac{1}{|B_k|} \log (1- {\widehat{\rho }}_k) , \qquad \mu _\infty = - \sum _{j=0}^\infty \frac{1}{|B_j|} \log (1- {\widehat{\rho }}_j).\nonumber \\ \end{aligned}$$
(4.8)

For the Bernoulli mixture, in contrast,

$$\begin{aligned} \mu _j^{\mathrm {Ber}} = \log \frac{ \rho _j}{1- \rho _j},\qquad \mu _\infty ^{\mathrm {Ber}} = 0. \end{aligned}$$

The chemical potentials coincide up to error terms of order \(O(\sum _j \rho _j) + O(\sigma _\infty ) = O(\sigma )\).
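
The explicit formulas (4.8) can be compared with finite-difference derivatives of the entropy. The sketch below does this for finitely many species, assuming \(|B_j| = 2^{dj}\) with \(d=1\) (an illustrative choice) and strictly positive densities with \(\sigma <1\); function names are hypothetical.

```python
import math

def vol(j, d=1):
    # Assumption: |B_j| = 2**(d*j).
    return 2 ** (d * j)

def entropy(rho, sigma_inf, d=1):
    """Theorem 4.1 entropy as a function of (rho, sigma_inf); needs rho_j > 0, sigma < 1."""
    s, tail = 0.0, sigma_inf                      # tail = sigma_{j+1}
    for j in range(len(rho) - 1, -1, -1):
        sigma_j = tail + rho[j]
        s -= (rho[j] * math.log(rho[j] / (1 - tail))
              + (1 - sigma_j) * math.log((1 - sigma_j) / (1 - tail))) / vol(j, d)
        tail = sigma_j
    return s

def mu_closed_form(rho, sigma_inf, d=1):
    """Right-hand sides of (4.8): returns ([mu_0, ..., mu_J], mu_inf)."""
    tail, rho_hat = sigma_inf, [0.0] * len(rho)
    for j in range(len(rho) - 1, -1, -1):
        rho_hat[j] = rho[j] / (1 - tail)
        tail += rho[j]
    logs = [math.log(1 - r) / vol(j, d) for j, r in enumerate(rho_hat)]
    mus = [math.log(r / (1 - r)) - vol(j, d) * sum(logs[:j])
           for j, r in enumerate(rho_hat)]
    return mus, -sum(logs)

def mu_finite_difference(rho, sigma_inf, h=1e-6, d=1):
    """Central differences for (4.6) and (4.7), at constant sigma_inf resp. constant rho."""
    mus = []
    for j in range(len(rho)):
        up, down = list(rho), list(rho)
        up[j] += h
        down[j] -= h
        mus.append(-vol(j, d) * (entropy(up, sigma_inf, d) - entropy(down, sigma_inf, d)) / (2 * h))
    d_inf = (entropy(rho, sigma_inf + h, d) - entropy(rho, sigma_inf - h, d)) / (2 * h)
    return mus, -d_inf

rho, sigma_inf = [0.2, 0.1, 0.3], 0.15
print(mu_closed_form(rho, sigma_inf))
print(mu_finite_difference(rho, sigma_inf))
```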

4.2 Analyticity: Multi-Species Virial Expansion

Before we turn to a variational representation of the pressure, we collect a few analytic properties of the entropy that are of intrinsic interest. Consider the complex Banach space \(\ell ^1({\mathbb {N}}_0)\times {\mathbb {C}}\) with norm \(||(\varvec{\rho },\sigma _\infty )|| = \sum _{j=0}^\infty |\rho _j| + |\sigma _\infty |\) and the open unit ball \(B(0,1)= \{ (\varvec{\rho },\sigma _\infty ):\ ||(\varvec{\rho },\sigma _\infty )|| <1 \}\). Define \(\sigma _j=\sigma _\infty + \sum _{k=j}^\infty \rho _k\) and

$$\begin{aligned} \Phi \bigl ( \varvec{\rho },\sigma _\infty ) :=\sum _{m=2}^\infty \frac{1}{m(m-1)} \sum _{j=0}^\infty \frac{1}{|B_j|} \bigl ( \sigma _j^{m} - \sigma _{j+1}^m\bigr ). \end{aligned}$$
(4.9)

Proposition 4.2

  1. (a)

    The map \(\Phi \) is holomorphic in the open unit ball and the Taylor series (4.9) converges uniformly in every open ball B(0, r) of radius \(r<1\).

  2. (b)

    The entropy satisfies

    $$\begin{aligned} s(\varvec{\rho },\sigma _\infty ) = - \sum _{j=0}^\infty \frac{1}{|B_j|} \rho _j(\log \rho _j - 1) - \Phi (\varvec{\rho },\sigma _\infty ) \end{aligned}$$

    for all \((\varvec{\rho },\sigma _\infty ) \in {\mathbb {R}}_+^{{\mathbb {N}}_0}\times {\mathbb {R}}_+\) with \(\sum _{j=0}^\infty \rho _j + \sigma _\infty < 1\).

A short overview and list of references on holomorphic functions in Banach spaces is provided in [11, Appendix B].

Proof

We compute, using \(\sigma _j = \rho _j + \sigma _{j+1}\),

$$\begin{aligned}&\rho _j\log \frac{\rho _j}{1-\sigma _{j+1}} + (1- \sigma _j)\log \frac{1- \sigma _j}{1-\sigma _{j+1}} \\&\quad = \rho _j \log \rho _j + (1- \sigma _j ) \log (1- \sigma _j) - (1- \sigma _{j+1}) \log (1- \sigma _{j+1})\\&\quad = \rho _j\bigl ( \log \rho _j-1\bigr ) + (1- \sigma _j ) \Bigl (\log (1- \sigma _j) -1\Bigr )- (1- \sigma _{j+1}) \Bigl (\log (1- \sigma _{j+1})-1\Bigr ). \end{aligned}$$

Because of

$$\begin{aligned} (1- x)\Bigl ( \log (1-x) - 1\Bigr ) = -1 - \int _0^x \log (1- y) {\mathrm {d}}y = - 1+ \sum _{m=2}^\infty \frac{x^m}{m(m-1)} \qquad (|x|<1), \end{aligned}$$

we deduce that the j-th summand in the formula for the entropy from Theorem 4.1 is given by

$$\begin{aligned} - \frac{1}{|B_j|} \rho _j(\log \rho _j-1) - \frac{1}{|B_j|} \sum _{m=2}^\infty \frac{1}{m(m-1)} (\sigma _j^{m}- \sigma _{j+1}^m). \end{aligned}$$
(4.10)

In order to split the series over j into two contributions corresponding to the two terms in the preceding sum, we need to check that the two sums are absolutely convergent. For the first term, we note that \(\sup _{x\in [0,1]} |x(\log x - 1)|=1\) hence

$$\begin{aligned} \sum _{j=0}^\infty \frac{1}{|B_j|} \bigl | \rho _j (\log \rho _j -1 )\bigr | \le \sum _{j=0}^\infty \frac{1}{|B_j|}<\infty . \end{aligned}$$

For the convergence of \(\Phi \), corresponding to the second term in (4.10), set

$$\begin{aligned} P_m(\varvec{\rho },\sigma _\infty ):= \frac{1}{m(m-1)} \sum _{j=0}^\infty \frac{1}{|B_j|} \bigl ( \sigma _j^{m} - \sigma _{j+1}^m\bigr ). \end{aligned}$$

Because of

$$\begin{aligned} \bigl |\sigma _j^{m} - \sigma _{j+1}^m \bigr |= \Biggl | \rho _j \sum _{k=0}^{m-1} \sigma _j^k \sigma _{j+1}^{m-1-k} \Biggr | \le m |\rho _j|\, ||(\varvec{\rho },\sigma _\infty )||^{m-1} \end{aligned}$$

and \(|B_j|\ge 1\), we have

$$\begin{aligned} \bigl |P_m(\varvec{\rho },\sigma _\infty )\bigr | \le \frac{1}{m-1}\Bigl ( \sum _{j=0}^\infty \frac{1}{|B_j|} |\rho _j|\Bigr ) ||(\varvec{\rho },\sigma _\infty )||^{m-1}\le ||(\varvec{\rho },\sigma _\infty )||^{m}< \infty . \end{aligned}$$

It follows that \(P_m\) is absolutely convergent in B(0, 1) and defines a continuous m-homogeneous polynomial with norm

$$\begin{aligned} ||P_m|| =\sup _{||(\varvec{\rho },\sigma _\infty )||\le 1} |P_m(\varvec{\rho },\sigma _\infty )| \le 1, \end{aligned}$$

moreover \(\Phi (\varvec{\rho },\sigma _\infty ) = \sum _{m=2}^\infty P_m(\varvec{\rho },\sigma _\infty )\) converges uniformly in \(||(\varvec{\rho },\sigma _\infty )||\le r\), for every \(r\in (0,1)\). This proves the analyticity in the open unit ball. The formula for the entropy follows from (4.10). \(\square \)
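
Proposition 4.2(b) lends itself to a quick numerical check. The sketch below truncates the series (4.9) at a finite order and compares the resulting expansion with the closed formula of Theorem 4.1. The block volumes \(|B_j| = 2^{dj}\) with \(d=1\), the truncation order, and all names are assumptions made for illustration.

```python
import math

def vol(j, d=1):
    # Assumption: |B_j| = 2**(d*j).
    return 2 ** (d * j)

def sigma_tails(rho, sigma_inf):
    """[sigma_0, ..., sigma_{J+1}] with sigma_j = sigma_inf + sum_{k >= j} rho_k."""
    out = [sigma_inf]
    for r in reversed(rho):
        out.append(out[-1] + r)
    return out[::-1]

def entropy_closed(rho, sigma_inf, d=1):
    """Theorem 4.1, for rho_j > 0 and sigma < 1."""
    sig = sigma_tails(rho, sigma_inf)
    return -sum((rho[j] * math.log(rho[j] / (1 - sig[j + 1]))
                 + (1 - sig[j]) * math.log((1 - sig[j]) / (1 - sig[j + 1]))) / vol(j, d)
                for j in range(len(rho)))

def phi(rho, sigma_inf, m_max=400, d=1):
    """Series (4.9), truncated at order m_max (terms with j > J vanish)."""
    sig = sigma_tails(rho, sigma_inf)
    total = 0.0
    for m in range(2, m_max + 1):
        inner = sum((sig[j] ** m - sig[j + 1] ** m) / vol(j, d) for j in range(len(rho)))
        total += inner / (m * (m - 1))
    return total

def entropy_virial(rho, sigma_inf, d=1):
    """Ideal-gas part minus Phi, as in Proposition 4.2(b)."""
    ideal = -sum(r * (math.log(r) - 1) / vol(j, d) for j, r in enumerate(rho))
    return ideal - phi(rho, sigma_inf, d=d)

rho, sigma_inf = [0.2, 0.1, 0.3], 0.15
print(entropy_closed(rho, sigma_inf), entropy_virial(rho, sigma_inf))
```

Since \(\sigma = 0.75\) in this example, the neglected tail of the series is far below double precision.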

4.3 Variational Representation for the Pressure

Proposition 4.3

Assume that \(\lim _{j\rightarrow \infty } \frac{1}{|B_j|} \log z_j=\theta ^*\). Then the pressure has the variational representation

$$\begin{aligned} p \bigl ( (z_j)_{j\in {\mathbb {N}}_0}\bigr ) = \sup \Biggl \{ \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j +\Bigl ( \sigma - \sum _{j=0}^\infty \rho _j\Bigr ) \theta ^* + s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma ) \, \Bigr |\, \sum _{j=0}^\infty \rho _j \le \sigma \le 1\Biggr \}. \end{aligned}$$

In addition:

  1. (a)

    If \(\sum _{j=0}^\infty {\widehat{z}}_j<\infty \) and \(p((z_j)_{j\in {\mathbb {N}}_0})>\theta ^*\), then the tuple \((\varvec{\rho }(\varvec{z}) ,\sigma (\varvec{z}))\) given in Theorem 3.2(a) is the unique maximizer. It satisfies \(\sigma _\infty =0\) and \(\sigma <1\).

  2. (b)

    If \(\sum _{j=0}^\infty {\widehat{z}}_j<\infty \) and \(p((z_j)_{j\in {\mathbb {N}}_0})= \theta ^*\), then the set of maximizers is given by the convex combinations of \((\varvec{\rho }(\varvec{z}),\sigma (\varvec{z}))\) from Theorem 3.2(a) and \((\varvec{0}, 1)\).

  3. (c)

    If \(\sum _{j=0}^\infty {\widehat{z}}_j=\infty \), then \(p((z_j)_{j\in {\mathbb {N}}_0})= \theta ^*\) and the unique maximizer is the tuple \((\varvec{0}, 1)\).

We leave as an open problem whether the proposition extends to activities with \(\liminf _{j\rightarrow \infty } \frac{1}{|B_j|} \log z_j<\limsup _{j\rightarrow \infty } \frac{1}{|B_j|} \log z_j=\theta ^*\). The cases (a), (b), and (c) correspond to a gas phase, coexistence region, and condensed phase, respectively.

Proof of the variational formula in Proposition 4.3

Let \((\rho _j)_{j\in {\mathbb {N}}_0}\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\) and \(\sigma \in [0,1]\) with \(\sum _{j=0}^\infty \rho _j \le \sigma \). Then there exist sequences \(N_j^{(n)}\) of integers satisfying (4.2). Clearly

$$\begin{aligned} \log \Xi _{\Lambda _n} \ge \sum _{j=0}^n N_j^{(n)} \log z_j + S_{\Lambda _n}\biggl (N_0^{(n)},\ldots ,N_n^{(n)}\biggr ). \end{aligned}$$
(4.11)

The second term, divided by \(|\Lambda _n|\), converges to \(s((\rho _j)_{j\in {\mathbb {N}}_0},\sigma )\) by Theorem 4.1. For the first term, we set \(z'_j:=z_j \exp ( - |B_j|\theta ^*)\) and we write for \(n\ge k\)

$$\begin{aligned}&\Biggl |\sum _{j=0}^n \frac{N_j^{(n)}}{|\Lambda _n|} \log z_j - \sum _{j=0}^\infty \frac{\rho _j}{|B_j|} \log z_j - \Biggl ( \sigma - \sum _{j=0}^\infty \rho _j\Biggr ) \theta ^*\Biggr | \\&\quad \le \Biggl |\sum _{j=0}^n \frac{N_j^{(n)}}{|\Lambda _n|} \log z'_j - \sum _{j=0}^\infty \frac{\rho _j}{|B_j|} \log z'_j - \Biggl ( \sigma - \sum _{j=0}^n \frac{N_j^{(n)}|B_j|}{|\Lambda _n|} \Biggr ) \theta ^*\Biggr | \\&\quad \le \sum _{j=0}^k \Bigl | \frac{N_j^{(n)}}{|\Lambda _n|} - \frac{\rho _j}{|B_j|} \Bigr | |\log z'_j| + 2 \max _{j\ge k+1}\Bigl | \frac{1}{|B_j|}\log z'_j\Bigr | + |\theta ^*|\,\Biggl |\sigma - \sum _{j=0}^n \frac{N_j^{(n)}|B_j|}{|\Lambda _n|}\Biggr |. \end{aligned}$$

Taking first the limit \(n\rightarrow \infty \) and then \(k\rightarrow \infty \), we see that overall the expression goes to zero. Turning back to (4.11) we get

$$\begin{aligned} \liminf _{n\rightarrow \infty } p_{\Lambda _n}\ge \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j +\Bigl ( \sigma - \sum _{j=0}^\infty \rho _j\Bigr ) \theta ^* + s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma ). \end{aligned}$$

This holds true for all \((\rho _j)_{j\in {\mathbb {N}}_0}\) and \(\sigma \in [0,1]\) with \(\sum _{j=0}^\infty \rho _j \le \sigma \), accordingly the limit inferior of the pressure is bounded from below by the supremum in the proposition.

For the upper bound, let \({\mathcal {I}}_n \subset {\mathbb {N}}_0^{n+1}\) be the set of vectors \(\big (N_0^{(n)},\ldots ,N_n^{(n)}\big )\) with \(\sum _{j=0}^n |B_j|\, N_j^{(n)}\le |\Lambda _n|\). Every such vector can be identified with an integer partition of some integer not exceeding \(|\Lambda _n|\) (with \(N_j^{(n)}\) parts of size \(|B_j|\)), therefore by the Hardy-Ramanujan formula

$$\begin{aligned} |{\mathcal {I}}_n| \le \exp \Bigl ( o\bigl (|\Lambda _n|\bigr )\Bigr ). \end{aligned}$$
(4.12)

Clearly

$$\begin{aligned} \Xi _{\Lambda _n}\le |{\mathcal {I}}_n|\, \max _{(N_0^{(n)},\ldots ,N_n^{(n)})\in {\mathcal {I}}_n}\exp \Biggl (\sum _{j=0}^n N_j^{(n)} \log z_j + S_{\Lambda _n}\bigl (N_0^{(n)},\ldots ,N_n^{(n)}\bigr )\Biggr ). \end{aligned}$$
(4.13)

Consider the sequence of maximizers of the right-hand side. By compactness, every subsequence admits in turn a subsequence that satisfies (4.2) for some \((\rho _j)_{j\in {\mathbb {N}}_0}\) and \(\sigma \in [0,1]\) with \(\sum _{j=0}^\infty \rho _j\le \sigma \). The proof of the upper bound for the limit superior of the pressure is easily completed by combining Eqs. (4.12), (4.13), and arguments similar to the proof of the lower bound. This proves the variational representation of the pressure. \(\square \)
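
The subexponential bound (4.12) can be illustrated by direct enumeration. The sketch below counts the admissible occupation vectors with a standard dynamic program over part sizes, assuming \(|B_j| = 2^{dj}\) and \(\Lambda _n = B_n\) with \(d=1\) (illustrative choices; the function name is hypothetical), and prints \(\log |{\mathcal {I}}_n| / |\Lambda _n|\) for a few values of n.

```python
import math

def count_block_vectors(n, d=1):
    """Number of (N_0, ..., N_n) with sum_j |B_j| N_j <= |Lambda_n|.

    Assumption: |B_j| = 2**(d*j) and Lambda_n = B_n.
    """
    volume = 2 ** (d * n)
    ways = [1] + [0] * volume          # ways[v] = number of vectors covering exactly v sites
    for j in range(n + 1):
        size = 2 ** (d * j)
        for v in range(size, volume + 1):
            ways[v] += ways[v - size]  # allow one more block of type j
    return sum(ways)

for n in (2, 4, 6, 8, 10):
    print(n, math.log(count_block_vectors(n)) / 2 ** n)
```

The printed ratios should decrease with n, consistent with \(|{\mathcal {I}}_n| = \exp (o(|\Lambda _n|))\).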

The proof of items (a) and (b) in Proposition 4.3 builds on several lemmas. First we show that for \(\sigma _\infty =0\), the expression to be maximized is a combination of relative entropies of measures on \(\{0,1\}\), corresponding to absence or presence of a cube.

Lemma 4.4

For every \((\rho _j)_{j\in {\mathbb {N}}_0}\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\) and \( \sigma \in [0,1]\) with \(\sum _{j=0}^\infty \rho _j=\sigma \) (equivalently, \(\sigma _\infty =0\)), we have

$$\begin{aligned}&p\bigl ((z_j)_{j\in {\mathbb {N}}_0}\bigr ) - \Biggl (\sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j + s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma ) \Biggr ) \nonumber \\&\quad = - \sum _{j=0}^\infty \frac{1-\sigma _{j+1}}{|B_j|} \Biggl ( {\widehat{\rho }}_j \log \frac{{\widehat{z}}_j/(1+{\widehat{z}}_j)}{{\widehat{\rho }}_j}+ (1-{\widehat{\rho }}_j )\log \frac{1/(1+{\widehat{z}}_j)}{1-{\widehat{\rho }}_j}\Biggr ). \end{aligned}$$
(4.14)

Proof

We compute

$$\begin{aligned} \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j&= \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\Bigl ( \log {\widehat{z}}_j + |B_j|\sum _{k=0}^{j-1} \frac{1}{|B_k|}\log (1+{\widehat{z}}_k)\Bigr )\\&= \sum _{j=0}^\infty \frac{\rho _j}{|B_j|} \log {\widehat{z}}_j +\sum _{k=0}^\infty \frac{1}{|B_k|}\log (1+{\widehat{z}}_k) \sum _{j=k+1}^\infty \rho _j \\&=\sum _{j=0}^\infty \frac{1-\sigma _{j+1}}{|B_j|} {\widehat{\rho }}_j \log {\widehat{z}}_j+ \sum _{k=0}^\infty \frac{\sigma _{k+1}}{|B_k|}\log (1+{\widehat{z}}_k). \end{aligned}$$

In going from the second to the third line we have used the equality \(\sum _{j=k+1}^\infty \rho _j = \sigma _{k+1}\), which is valid because of \(\sigma _\infty =0\). It follows that

$$\begin{aligned} p\bigl ((z_j)_{j\in {\mathbb {N}}_0}\bigr ) - \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j = \sum _{j=0}^\infty \frac{1-\sigma _{j+1}}{|B_j|} \Bigl ( \log (1+{\widehat{z}}_j) -{\widehat{\rho }}_j \log {\widehat{z}}_j\Bigr ). \end{aligned}$$

We combine with the formula for the entropy from Theorem 4.1 and obtain (4.14). \(\square \)

The term in parentheses on the right-hand side of (4.14), together with the minus sign, is nothing else but the relative entropy of the Bernoulli measure with parameter \({\widehat{\rho }}_j\) with respect to the Bernoulli measure with parameter \({\widehat{z}}_j/(1+{\widehat{z}}_j)\). It is non-negative and vanishes if and only if \({\widehat{\rho }}_j ={\widehat{z}}_j/(1+{\widehat{z}}_j)\). The next lemma relates this identity to Theorem 3.2.
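
The relative-entropy structure can be verified numerically in a truncated setting. The sketch below takes finitely many species with \(\sigma _\infty = 0\) and compares the gap between the pressure and the functional to be maximized with the weighted sum of Bernoulli relative entropies. It relies on assumptions that are not restated in this section: \(|B_j| = 2^{dj}\) with \(d=1\), the relation between \(z_j\) and \({\widehat{z}}_j\) from the first display of the proof, and the pressure formula \(p = \sum _j |B_j|^{-1}\log (1+{\widehat{z}}_j)\) that this computation uses implicitly. Names are hypothetical.

```python
import math

def vol(j, d=1):
    # Assumption: |B_j| = 2**(d*j).
    return 2 ** (d * j)

def gap_and_relative_entropy(rho, z_hat, d=1):
    """Return (p - [sum rho_j log z_j / |B_j| + s], weighted relative entropy).

    Requires sigma_inf = 0 and 0 < rho_hat_j < 1 for all j.
    """
    J = len(rho)
    sig = [0.0] * (J + 1)                       # sig[j] = sigma_j, sig[J] = 0
    for j in range(J - 1, -1, -1):
        sig[j] = sig[j + 1] + rho[j]
    log1p_z = [math.log1p(z) / vol(j, d) for j, z in enumerate(z_hat)]
    gap, rel = sum(log1p_z), 0.0                # gap starts at the assumed pressure
    for j in range(J):
        free = 1 - sig[j + 1]
        r_hat = rho[j] / free                   # effective density
        q = z_hat[j] / (1 + z_hat[j])           # reference Bernoulli parameter
        log_z = math.log(z_hat[j]) + vol(j, d) * sum(log1p_z[:j])
        entropy_term = -free / vol(j, d) * (r_hat * math.log(r_hat)
                                            + (1 - r_hat) * math.log(1 - r_hat))
        gap -= rho[j] / vol(j, d) * log_z + entropy_term
        rel += free / vol(j, d) * (r_hat * math.log(r_hat / q)
                                   + (1 - r_hat) * math.log((1 - r_hat) / (1 - q)))
    return gap, rel

print(gap_and_relative_entropy([0.2, 0.1, 0.3], [0.5, 1.0, 2.0]))
```

Both returned numbers should agree, and both should vanish when \({\widehat{\rho }}_j = {\widehat{z}}_j/(1+{\widehat{z}}_j)\) for every j.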

Lemma 4.5

Let \((\rho _j)_{j\in {\mathbb {N}}_0}\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\) and \(\sigma := \sum _{j=0}^\infty \rho _j\). Pick \(m\in {\mathbb {N}}_0\) and assume \(\sigma _{m+1}=\sum _{j=m+1}^\infty \rho _j< 1\). Then the following two statements are equivalent:

  1. (i)

    \({\widehat{\rho }}_j ={\widehat{z}}_j/(1+{\widehat{z}}_j)\) for all \(j\ge m\).

  2. (ii)

    \(\rho _j = {\widehat{z}}_j \prod _{k=j}^\infty (1+{\widehat{z}}_k)^{-1}\) for all \(j\ge m\).

Let us stress that the lemma works both for \(\sum _j {\widehat{z}}_j < \infty \) and \(\sum _j {\widehat{z}}_j = \infty \). In the latter case the infinite products vanish and we find \(\rho _j =0\) for all \(j\ge m\).

Proof

We note

$$\begin{aligned} 1-\sigma _j= 1- \sigma _{j+1}-\rho _j = (1- \sigma _{j+1})(1- {\widehat{\rho }}_j) \end{aligned}$$

hence \(1- \sigma _j=(1- \sigma _\ell ) \prod _{k=j}^{\ell -1}(1-{\widehat{\rho }}_k)\) for all \(\ell \ge j \ge m\). Because of \(\sum _{j=0}^\infty \rho _j =\sigma \) we have \(\sigma _\infty =0\) and \(\lim _{\ell \rightarrow \infty }\sigma _\ell =0\), hence

$$\begin{aligned} 1- \sigma _j = \prod _{k=j}^\infty (1-{\widehat{\rho }}_k). \end{aligned}$$

If (i) holds true, then for all \(j\ge m\)

$$\begin{aligned} \rho _j = (1-\sigma _{j+1}) - (1-\sigma _j) = {\widehat{\rho }}_j \prod _{k=j+1}^\infty (1- {\widehat{\rho }}_k). \end{aligned}$$

The implication (i) \(\Rightarrow \) (ii) follows. Conversely, if (ii) holds, let \(Y_j\) be independent Bernoulli variables with \({\mathbb {P}}(Y_j=0) =1/(1+{\widehat{z}}_j)\). Then

$$\begin{aligned} \rho _j={\mathbb {P}}(Y_j=1,\,\forall k\ge j+1: Y_k=0) \end{aligned}$$

and

$$\begin{aligned} 1 - \sigma _r = 1-{\mathbb {P}}(\exists j\ge r:\, Y_j=1)={\mathbb {P}}(\forall j \ge r:\, Y_j = 0)=\prod _{j=r}^\infty \frac{1}{1+{\widehat{z}}_j} \end{aligned}$$

and (i) follows. \(\square \)
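
The equivalence in Lemma 4.5 is easy to test for finitely many species. The following sketch (with hypothetical function names, and with \({\widehat{z}}_k = 0\) beyond the listed indices) builds the densities from (ii), recomputes the effective densities, and checks (i) together with the product formula for \(1-\sigma _0\).

```python
import math

def densities_from_effective_activities(z_hat):
    """rho_j = z_hat_j * prod_{k >= j} 1/(1 + z_hat_k), with z_hat_k = 0 beyond the list."""
    rho, prod = [], 1.0                    # prod = prod_{k > j} 1/(1 + z_hat_k)
    for z in reversed(z_hat):
        rho.append(z / (1 + z) * prod)
        prod /= 1 + z
    return rho[::-1]

def effective_densities(rho, sigma_inf=0.0):
    """rho_hat_j = rho_j / (1 - sigma_{j+1})."""
    out, tail = [], sigma_inf              # tail = sigma_{j+1}
    for r in reversed(rho):
        out.append(r / (1 - tail))
        tail += r
    return out[::-1]

z_hat = [0.5, 1.0, 2.0, 0.25]
rho = densities_from_effective_activities(z_hat)
rho_hat = effective_densities(rho)
vacant = 1 - sum(rho)                                   # 1 - sigma_0
product = math.prod(1 / (1 + z) for z in z_hat)
print(rho_hat, [z / (1 + z) for z in z_hat])
print(vacant, product)
```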

The previous two lemmas deal with the gas phase (\(\sigma _\infty =0\)) only. The next lemma allows for \(\sigma _\infty \ge 0\) and is particularly relevant for the coexistence region. Let us briefly motivate a new set of variables. Suppose that \(\sigma _\infty \in (0,1)\). Then we may think of the system as a mixture of a condensed phase, occupying the volume fraction \(\sigma _\infty \), and a gas phase in the remaining volume fraction \(1-\sigma _\infty \). The natural density variables for the gas phase should be defined relative to the volume occupied by the gas and not the total volume. Therefore we introduce the new variables

$$\begin{aligned} \rho '_j:= \frac{\rho _j}{1- \sigma _\infty }, \quad \sigma ':=\sum _{j=0}^\infty \rho '_j,\quad \sigma '_j := \sum _{k=j}^\infty \rho '_k. \end{aligned}$$
(4.15)

Lemma 4.6

Let \(((\rho _j)_{j\in {\mathbb {N}}_0}, \sigma )\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\times [0,1]\) with \(\sum _{j=0}^\infty \rho _j \le \sigma \) and \(\sigma _\infty \in (0,1)\). Then

$$\begin{aligned}&\sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j +\Biggl ( \sigma - \sum _{j=0}^\infty \rho _j\Biggr ) \theta ^* + s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma ) \\&\quad = (1- \sigma _\infty ) \Biggl ( \sum _{j=0}^\infty \frac{\rho '_j}{|B_j|}\log z_j + s\bigl ( (\rho '_j)_{j\in {\mathbb {N}}_0}, \sigma ')\Biggr ) +\sigma _\infty \theta ^*. \end{aligned}$$

Put differently, the grand potential in the coexistence region is a convex combination of the grand potential \(\theta ^*\) in the condensed phase and the grand potential of the gas phase.

Proof

The lemma follows from Theorem 4.1 and explicit computations. Clearly

$$\begin{aligned} \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j +\Bigl ( \sigma - \sum _{j=0}^\infty \rho _j\Bigr ) \theta ^* = (1- \sigma _\infty ) \sum _{j=0}^\infty \frac{\rho '_j}{|B_j|}\log z_j + \sigma _\infty \theta ^*, \end{aligned}$$

so it remains to check that

$$\begin{aligned} s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma \bigr ) = (1- \sigma _\infty ) s\bigl ( (\rho '_j)_{j\in {\mathbb {N}}_0},\sigma '\bigr ). \end{aligned}$$
(4.16)

As a preliminary observation we note \(\sigma ' = (\sigma -\sigma _\infty ) / (1- \sigma _\infty ) \le 1\). In view of

$$\begin{aligned} 1 - \sigma _{j+1} = 1 - \sum _{k=j+1}^\infty \rho _k - \sigma _\infty = (1- \sigma _\infty )(1- \sigma '_{j+1}), \end{aligned}$$

we also have \(\rho '_j \le 1 - \sigma '_{j+1}\), moreover

$$\begin{aligned} s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0},\sigma \bigr )&= - (1- \sigma _\infty ) \sum _{j=0}^\infty \frac{1}{|B_j|} \Biggl ( \rho '_j \log \frac{\rho '_j}{1- \sigma '_{j+1}} + (1- \sigma '_j) \log \frac{1- \sigma '_j}{1 - \sigma '_{j+1}} \Biggr ) \\&= (1- \sigma _\infty ) s\bigl ( (\rho '_j)_{j\in {\mathbb {N}}_0},\sigma '\bigr ). \end{aligned}$$

\(\square \)
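
The scaling identity (4.16) can also be confirmed numerically. The sketch below evaluates both sides for a finite family of densities, assuming \(|B_j| = 2^{dj}\) with \(d=1\) (an illustrative choice) and strictly positive \(\rho _j\) with \(\sigma <1\); the function name is hypothetical.

```python
import math

def entropy(rho, sigma, d=1):
    """Theorem 4.1 entropy s(rho, sigma); needs rho_j > 0 and sigma < 1.

    Assumption: |B_j| = 2**(d*j).
    """
    tail = sigma - sum(rho)                 # starts at sigma_inf, then sigma_{j+1}
    s = 0.0
    for j in range(len(rho) - 1, -1, -1):
        sigma_j = tail + rho[j]
        s -= (rho[j] * math.log(rho[j] / (1 - tail))
              + (1 - sigma_j) * math.log((1 - sigma_j) / (1 - tail))) / 2 ** (d * j)
        tail = sigma_j
    return s

rho, sigma_inf = [0.2, 0.1, 0.3], 0.15
sigma = sigma_inf + sum(rho)
rho_prime = [r / (1 - sigma_inf) for r in rho]   # gas-phase densities (4.15)
sigma_prime = sum(rho_prime)
lhs = entropy(rho, sigma)
rhs = (1 - sigma_inf) * entropy(rho_prime, sigma_prime)
print(lhs, rhs)
```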

Proof of Proposition 4.3(a)–(c)

Assume \(\sum _{j=0}^\infty {\widehat{z}}_j <\infty \) and \(p((z_j)_{j\in {\mathbb {N}}_0})>\theta ^*\). To prove part (a), we proceed in two steps: First we show that a tuple \(((\rho _j)_{j\in {\mathbb {N}}_0}, \sigma )\) with \(\sigma _\infty =0\), i.e., \(\sum _{j=0}^\infty \rho _j = \sigma \), is a maximizer if and only if it is given by the expressions from Theorem 3.2(a). Second, we show that every maximizer necessarily satisfies \(\sigma _\infty =0\).

For Step 1, we use Lemma 4.4. A tuple with \(\sigma _\infty =0\) is a maximizer if and only if the right-hand side of (4.14) vanishes. But on the right-hand side of (4.14), the term in parentheses, together with the minus sign, is nothing else but the relative entropy of two Bernoulli measures with parameters \({\widehat{\rho }}_j\) and \({\widehat{z}}_j/(1+{\widehat{z}}_j)\). As a consequence the overall sum vanishes—i.e., the tuple \((\rho _j)_{j\in {\mathbb {N}}_0}\), \(\sigma =\sum _{j=0}^\infty \rho _j\) is a maximizer—if and only if, for every \(j\in {\mathbb {N}}_0\), we have \(\sigma _{j+1}=1\) or \({\widehat{\rho }}_j ={\widehat{z}}_j/(1+ {\widehat{z}}_j)\).

Suppose by contradiction that there is a maximizer with \(\sigma _{r+1}=1\) for some \(r\in {\mathbb {N}}_0\), and \(\sigma =\sum _{j=0}^\infty \rho _j\). The sequence \((\sigma _j)\) is monotone decreasing, therefore if the set of such r’s is unbounded, then \(\sigma _j=1\) for all \(j\in {\mathbb {N}}_0\). It follows that \(\rho _j =\sigma _j - \sigma _{j+1}=0\) for all j and \(\sigma _{j+1}=\sum _{k=j+1}^\infty \rho _k=0\), a contradiction. Thus the set of r’s with \(\sigma _{r+1}=1\) is bounded, let m be its maximal element. Then \(\sigma _{m+1}=\sum _{k=m+1}^\infty \rho _k=1\) hence \(\rho _0=\cdots =\rho _{m}=0\). In addition, \(\sigma _{j+1}<1\) and \({\widehat{\rho }}_j ={\widehat{z}}_j/(1+ {\widehat{z}}_j)\) for all \(j\ge m+1\). It follows that for all \(j\ge m+1\), the density \(\rho _j\) is given by the formula from Theorem 3.2(a), see Lemma 4.5. In particular, \(\sigma _{m+1}=\sum _{j=m+1}^\infty \rho _j\) is bounded by the packing fraction from Theorem 3.2(a), which is strictly smaller than 1. Thus \(\sigma <1\), in contradiction with \(\sigma = \sigma _{m+1}=1\).

Consequently \(\sigma _{j+1}<1\) and \({\widehat{\rho }}_j ={\widehat{z}}_j/(1+ {\widehat{z}}_j)\) for all \(j\in {\mathbb {N}}_0\). Lemma 4.5 shows that the maximizer is given by the formulas from Theorem 3.2(a). In particular, \(\sigma <1\) and \(\sigma _\infty =0\).

For Step 2, we use Lemma 4.6. Let \(((\rho _j)_{j\in {\mathbb {N}}_0},\sigma )\) be such that \(\sigma _\infty >0\). By Lemma 4.6 and the preceding considerations applied to \(((\rho '_j)_{j\in {\mathbb {N}}_0},\sigma ')\), we can bound

$$\begin{aligned} \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j +\Bigl ( \sigma - \sum _{j=0}^\infty \rho _j\Bigr ) \theta ^* + s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma ) \le (1- \sigma _\infty ) p\bigl ( (z_j)_{j\in {\mathbb {N}}_0}\bigr ) + \sigma _\infty \theta ^*\nonumber \\ \end{aligned}$$
(4.17)

which is strictly smaller than \(p\bigl ( (z_j)_{j\in {\mathbb {N}}_0}\bigr )\) because of the assumption \(\theta ^*< p\bigl ( (z_j)_{j\in {\mathbb {N}}_0}\bigr )\). Therefore the tuple is not a maximizer. This concludes Step 2 and the proof of part (a) of the proposition.

For (b) and (c), assume \(p((z_j)_{j\in {\mathbb {N}}_0})=\theta ^*\). Then \((\varvec{\rho }, \sigma ) = (\varvec{0}, 1)\) is a maximizer. Suppose that there exists another maximizer \((\varvec{\rho }, \sigma )\). Then necessarily \(\sigma _\infty < 1\) and we may define primed variables \((\varvec{\rho '},\sigma ')\) and \(\sigma '_j\) as in Eq. (4.15). The variational representation for the pressure, the equality \(p((z_j)_{j\in {\mathbb {N}}_0})=\theta ^*\), and Lemma 4.6 yield

$$\begin{aligned} 0= & {} \theta ^* - \Bigl ( \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\log z_j +\Bigl ( \sigma - \sum _{j=0}^\infty \rho _j\Bigr ) \theta ^* + s\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma )\Bigr ) \\= & {} (1- \sigma _\infty ) \Biggl \{ \theta ^* - \Bigl (\sum _{j=0}^\infty \frac{\rho '_j}{|B_j|}\log z_j + s\bigl ( (\rho '_j)_{j\in {\mathbb {N}}_0}, \sigma ')\Bigr ) \Biggr \}\ge 0 \end{aligned}$$

hence

$$\begin{aligned} \theta ^* - \Bigl (\sum _{j=0}^\infty \frac{\rho '_j}{|B_j|}\log z_j + s\bigl ( (\rho '_j)_{j\in {\mathbb {N}}_0}, \sigma ')\Bigr ) =0. \end{aligned}$$
(4.18)

Since \(p((z_j)_{j\in {\mathbb {N}}_0})=\theta ^*\), the left-hand side can be expressed as a combination of relative entropies of Bernoulli variables as in Lemma 4.4.

Assume first \(\sum _{j=0}^\infty {\widehat{z}}_j <\infty \). Adapting the arguments of the proof of part (a) we deduce

$$\begin{aligned} \rho '_j = \rho _j(\varvec{z})= \frac{{\widehat{z}}_j}{1+{\widehat{z}}_j}\prod _{k=j+1}^\infty \frac{1}{1+{\widehat{z}}_k}\quad (j\in {\mathbb {N}}_0). \end{aligned}$$

Then \(\rho _j = (1- \sigma _\infty ) \rho '_j\) and

$$\begin{aligned} \sigma = \sum _{j=0}^\infty \rho _j + \sigma _\infty = (1- \sigma _\infty ) \sigma ' + \sigma _\infty \end{aligned}$$

by definition of \(\rho '_j\) and \(\sigma '\). It follows that the additional maximizer \((\varvec{\rho },\sigma )\) is a convex combination of \((\varvec{\rho }(\varvec{z}),\sigma (\varvec{z}))\) and \((\varvec{0}, 1)\). Conversely, every such convex combination is indeed a maximizer. This proves part (b) of Proposition 4.3.

If on the other hand \(\sum _{j=0}^\infty {\widehat{z}}_j =\infty \), then we check that \(\rho '_j = 0\) hence \(\rho _j =0\) for all j. To that end we revisit the arguments from the proof of part (a). We start from (4.18) and deduce as in part (a) that \(\sigma '_{j+1}=1\) or \({\widehat{\rho }}'_j ={\widehat{z}}_j/(1+ {\widehat{z}}_j)\) for all \(j\in {\mathbb {N}}_0\). We distinguish several cases.

If \(\sigma '_{j+1}=1\) for all \(j\in {\mathbb {N}}_0\), then \(\rho '_j =0\) for all \(j\in {\mathbb {N}}_0\) and \(\sigma '=0\), contradicting \(\sigma '_{j+1}=1\).

If \(\sigma '_{j+1}\ne 1\) for some j, then the set \(\{r\in {\mathbb {N}}_0 \mid \sigma '_{r+1}=1\}\) is bounded. Suppose by contradiction that it is non-empty and let m be its maximum. Then \(\sigma '_{m+1}=\sum _{k=m+1}^\infty \rho '_k=1\) hence \(\rho '_0=\cdots =\rho '_{m}=0\). In addition, \(\sigma '_{j+1}<1\) and \({\widehat{\rho }}'_j ={\widehat{z}}_j/(1+ {\widehat{z}}_j)\) for all \(j\ge m+1\). Lemma 4.5 yields \(\rho '_j =0\) for all \(j \ge m+1\). It follows that \(\sigma '_{m+1}=0\), in contradiction with the identity \(\sigma '_{m+1}=1\) that holds true by definition of m.

The only case left is \(\sigma '_{j+1}<1\) for all \(j \in {\mathbb {N}}_0\). In this case Lemma 4.5 again yields \(\rho '_j =0\) for all \(j \in {\mathbb {N}}_0\) hence \(\sigma '=0\).

Consequently \(\rho _j = (1-\sigma _\infty ) \rho '_j =0\) for all \(j \in {\mathbb {N}}_0\) and \(\sigma = \sigma _\infty \). The grand potential of such a configuration is \(\sigma _\infty \theta ^*\), which is equal to \(\theta ^*\) if and only if \(\sigma _\infty =1\). As a consequence, \((\varvec{0},1)\) is the unique maximizer of the grand potential. This proves part (c). \(\square \)

5 Phase Transition

5.1 Generalities: Parameter-Dependent Activity

Let \((E_j)_{j\in {\mathbb {N}}_0}\) be a sequence in \({\mathbb {R}}\cup \{\infty \}\) such that \(E_j/|B_j|\) has a limit in \({\mathbb {R}}\cup \{\infty \}\), i.e.,

$$\begin{aligned} e_\infty := \lim _{j\rightarrow \infty } \frac{E_j}{|B_j|} > - \infty , \end{aligned}$$

and \(E_j <\infty \) for at least one \(j\in {\mathbb {N}}_0\). Think of \(E_j\) as the energy of a block, which could be a bulk contribution plus a boundary term, e.g., \(E_j = e_\infty |B_j| + \mathrm {const} |\partial B_j|\). For later purpose we also define

$$\begin{aligned} E(B) = E_j \quad (B\in {\mathbb {B}}_j). \end{aligned}$$

We specialize to parameter-dependent activities of the form

$$\begin{aligned} z_j(\mu ) = \exp \bigl (|B_j| \mu - E_j\bigr ) \qquad (\mu \in {\mathbb {R}}). \end{aligned}$$

The activity is stable with

$$\begin{aligned} \theta ^*(\mu ) = \lim _{j\rightarrow \infty }\frac{1}{|B_j|} \log z_j(\mu ) = \mu - e_\infty . \end{aligned}$$
(5.1)

We write \(p(\mu )\), \({\widehat{z}}_j(\mu )\), \(\rho _j(\mu )\) for the pressure, effective activities, and density variables of the \(\mu \)-dependent model. For \((\rho _j)_{j\in {\mathbb {N}}_0}\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\) and \(\sigma _\infty \ge 0\) with \(\sum _{j=0}^\infty \rho _j + \sigma _\infty \le 1\), define the free energy of a block size distribution

$$\begin{aligned} f\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma _\infty \bigr ) := \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\, E_j + \sigma _\infty e_\infty - s\Bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma _\infty + \sum _{j=0}^\infty \rho _j \Bigr ), \end{aligned}$$
(5.2)

and the free energy at given packing fraction \(\sigma \in [0,1]\)

$$\begin{aligned} \varphi (\sigma ) = \inf \Biggl \{ f\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma _\infty \bigr ) \, \Big |\, \sum _{j=0}^\infty \rho _j + \sigma _\infty = \sigma \Biggr \}. \end{aligned}$$

The maps \(p(\mu )\), \(\varphi (\sigma )\), and \( f\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma _\infty \bigr )\) are convex, moreover by Proposition 4.3,

$$\begin{aligned} p(\mu )&= \sup _{\sigma \in [0,1]}\bigl ( \mu \sigma - \varphi (\sigma )\bigr ) \nonumber \\&= \sup \Biggl \{ \sum _{j=0}^\infty \mu \rho _j + \mu \sigma _\infty - f\bigl ( (\rho _j)_{j\in {\mathbb {N}}_0}, \sigma _\infty \bigr ) \, \Big |\, \sum _{j=0}^\infty \rho _j + \sigma _\infty \le 1\Biggr \}. \end{aligned}$$
(5.3)

The test configuration \(\rho _j \equiv 0\) and \(\sigma _\infty =1\) yields \(p(\mu ) \ge \mu - e_\infty \) for all \(\mu \in {\mathbb {R}}\), in agreement with the already known bound \(p(\mu ) \ge \theta ^*(\mu ) = \mu -e_\infty \). Define

$$\begin{aligned} \mu _c := \inf \Bigl \{\mu \in {\mathbb {R}}\,\mid \, p(\mu ) = \mu - e_\infty \Bigr \},\qquad \sigma _c := \lim _{\mu \nearrow \mu _c} \frac{{\mathrm {d}}p}{{\mathrm {d}}\mu }(\mu ). \end{aligned}$$

By convexity, the pressure p is differentiable almost everywhere with increasing derivative, therefore \(\sigma _c\) is well-defined.

Notice \(\mu _c \le \infty \) and \(\sigma _c \le 1\). We say that the mixture of cubes undergoes a phase transition if \(\mu _c<\infty \). The phase transition is continuous if \(\sigma _c= 1\) and it is of first order if \(\sigma _c <1\), see Proposition 5.2 below.

Lemma 5.1

The following holds true:

  1. (a)

    For each \(j\in {\mathbb {N}}_0\), the map \(\mu \mapsto {\widehat{z}}_j(\mu )\) is monotone increasing.

  2. (b)

    The system undergoes a phase transition if and only if \(\sum _{j=0}^\infty {\widehat{z}}_j(\mu ) =\infty \) for some \(\mu \in {\mathbb {R}}\), and we have

    $$\begin{aligned} \mu _c = \inf \Bigl \{\mu \in {\mathbb {R}}\, \Big |\, \sum _{j\in {\mathbb {N}}_0}{\widehat{z}}_j(\mu ) =\infty \Bigr \} > e_\infty . \end{aligned}$$
  3. (c)

    If \(\mu _c<\infty \), the phase transition is of first order if and only if \(\sum _j {\widehat{z}}_j (\mu _c) <\infty \), with

    $$\begin{aligned} \sigma _c = 1- \prod _{j=0}^\infty \frac{1}{1+ {\widehat{z}}_j(\mu _c)}. \end{aligned}$$

Proof

(a) The rescaling from the proof of Lemma 2.2 allows us to shift the \(\mu \)-dependence from the activities \(z_j\) to the vacuum activity, which becomes \({\mathrm {e}}^{-\mu }\) instead of 1. More precisely, recalling \(E(B)= E_j\) for \(B\in {\mathbb {B}}_j\), we get

$$\begin{aligned} \Xi _{\Lambda }(\mu )&= \sum _{\{X_1,\ldots , X_n\}} \prod _{i=1}^n {\mathrm {e}}^{|X_i| \mu - E(X_i)} = \sum _{\{X_1,\ldots , X_n\}} {\mathrm {e}}^{\mu |\cup _i X_i| - \sum _i E(X_i)} \\&= {\mathrm {e}}^{\mu |\Lambda |} \sum _{\{X_1,\ldots , X_n\}} {\mathrm {e}}^{- \mu |\Lambda \setminus \cup _i X_i|} {\mathrm {e}}^{- \sum _i E(X_i)} \end{aligned}$$

where the sum runs over collections of pairwise disjoint cubes. Notice that \({\mathrm {e}}^{-\mu }\) appears to the power \(|\Lambda \setminus \cup _i X_i|\) which is the number of vacant lattice sites. We apply the equality to \(\Lambda = B_{n-1}\) and find

$$\begin{aligned} {\widehat{z}}_n(\mu ) =\frac{z_n(\mu )}{\Xi _{B_{n-1}}(\mu )^{2^d}} = {\mathrm {e}}^{- E(B_n)} \times \Biggl ( \sum _{\{X_1,\ldots , X_n\}} {\mathrm {e}}^{- \mu |\Lambda \setminus \cup _i X_i|} {\mathrm {e}}^{- \sum _i E(X_i)}\Biggr )^{-2^d} \end{aligned}$$
(5.4)

because \(\exp (\mu |B_n|) = \exp (2^d \mu |B_{n-1}|)\) cancels in the ratio defining \({\widehat{z}}_n(\mu )\). The monotonicity in \(\mu \) follows.

(b) Suppose that the set \(I:=\{\mu \in {\mathbb {R}}\mid \sum _{j=0}^\infty {\widehat{z}}_j (\mu ) = \infty \}\) is non-empty. Then because of the monotonicity proven in (a), the set I is an open or half-open interval \((\mu ^*,\infty )\) or \([\mu ^*,\infty )\) with \(\mu ^*\in {\mathbb {R}}\cup \{-\infty \}\). For \(\mu \in I\) we have \(p(\mu ) = \theta ^*(\mu ) = \mu - e_\infty \) by Theorem 3.1 and (5.1), therefore \(\mu _c \le \mu ^*< \infty \) and the system undergoes a phase transition.

It remains to check \(\mu _c = \mu ^*\) or equivalently, \(p(\mu )>\mu -e_\infty \) for all \(\mu < \mu ^*\). First we show that \(\mu ^* >e_\infty \), which proves in particular \(\mu ^*>- \infty \). As noted above, \(p(\mu ) = \mu - e_\infty \) for all \(\mu >\mu ^*\). But \(p(\cdot )\) is continuous because it is convex and finite, therefore the equality \(p(\mu ) = \mu - e_\infty \) extends to all \(\mu \ge \mu ^*\). On the other hand, the non-degeneracy condition \(\inf _j E_j <\infty \) is enough to guarantee \(p(\mu ) >0\) for all \(\mu \in {\mathbb {R}}\). Therefore \(\mu ^* - e_\infty = p(\mu ^*) >0 \) and \(\mu ^*>e_\infty \).

Next we show that \(p(\mu )\) is continuously differentiable in \((-\infty ,\mu ^*)\) with derivative \(\sigma (\mu ) \in (0,1)\), where

$$\begin{aligned} \sigma (\mu ) = 1 - \prod _{j=0}^\infty \frac{1}{1+ {\widehat{z}}_j(\mu )}, \end{aligned}$$
(5.5)

see Theorem 3.2(a). First we check that \(\sigma (\mu )\) is continuous in \((- \infty , \mu ^*)\). Every effective activity \({\widehat{z}}_j(\mu )\) is a rational function of \({\mathrm {e}}^{-\mu }\) hence continuous, see (5.4). To deduce the continuity of \(\sigma (\mu )\) we invoke dominated convergence for the series \(\sum _j \log (1+ {\widehat{z}}_j(\mu ))\). Fix \(\mu '<\mu ^*\). The monotonicity of \({\widehat{z}}_j(\mu )\) and the definition of \(\mu ^*\) yield \({\widehat{z}}_j(\mu ) \le {\widehat{z}}_j(\mu ')\) for all \(\mu \in (- \infty ,\mu ')\), with \(\sum _{j=0}^\infty \log (1+ {\widehat{z}}_j(\mu '))<\infty \). Therefore dominated convergence shows \(\lim _{\varepsilon \rightarrow 0} \sigma (\mu +\varepsilon ) = \sigma (\mu )\), for all \(\mu<\mu '< \mu ^*\). Thus \(\sigma (\mu )\) is continuous.

The differentiability of \(p(\mu )\) follows from standard arguments. We have \(p(\mu )= \lim _{n\rightarrow \infty } p_{\Lambda _n}(\mu )\) and \(p'_{\Lambda _n}(\mu ) = \sigma _{\Lambda _n}(\mu ) \rightarrow \sigma (\mu ) \in (0,1)\) by Theorem 3.2(a). For \(\mu \in (- \infty , \mu ^*)\) and \(h\in {\mathbb {R}}\) small enough so that \(\mu \pm h< \mu ^*\), we may pass to the limit \(n\rightarrow \infty \) in

$$\begin{aligned} p_{\Lambda _n}(\mu +h) - p_{\Lambda _n}(\mu ) = \int _{\mu }^{\mu +h} \sigma _{\Lambda _n} (t) {\mathrm {d}}t \end{aligned}$$

and find

$$\begin{aligned} p(\mu +h) - p(\mu ) = \int _\mu ^{\mu +h} \sigma (t) {\mathrm {d}}t \end{aligned}$$

hence \(p'(\mu )= \sigma (\mu )\).

The differentiability together with the inequality \(\sigma (\mu ) \in (0,1)\) allow us to conclude the proof of (b): write

$$\begin{aligned} p(\mu ^*) - p(\mu ) = \int _{\mu }^{\mu ^*} \sigma (u) {\mathrm {d}}u < \mu ^*- \mu \end{aligned}$$

and

$$\begin{aligned} p(\mu )> p(\mu ^*) - \mu ^* + \mu = - e_\infty + \mu . \end{aligned}$$

This holds true for all \(\mu <\mu ^*\), therefore \(\mu _c \ge \mu ^*\) and altogether \(\mu _c = \mu ^*> e_\infty \).

(c) As noted above, we have \(p'(\mu ) = \sigma (\mu )\) for all \(\mu \in (-\infty ,\mu ^*) = (-\infty , \mu _c)\). Proceeding as in (b) but using monotone convergence for the series \(\sum _j \log (1+ {\widehat{z}}_j(\mu ))\) instead of dominated convergence, we obtain

$$\begin{aligned} \sigma _c = \lim _{\mu \nearrow \mu _c} p'(\mu )= \lim _{\mu \nearrow \mu _c} \sigma (\mu ) = \sigma (\mu _c). \end{aligned}$$

In particular, \(\sigma _c<1\) if and only if \(\sigma (\mu _c) <1\), which in turn is equivalent to \(\sum _{j=0}^\infty {\widehat{z}}_j (\mu _c)<\infty \). \(\square \)
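The objects in Lemma 5.1 can be explored numerically. The following is a minimal sketch, not part of the original argument. It assumes \(z_j(\mu ) = \exp (\mu |B_j| - E_j)\) with \(|B_j| = 2^{dj}\), the recurrence \(\Xi _{\Lambda _j} = z_j + (\Xi _{\Lambda _{j-1}})^{2^d}\) recalled in Sect. 5.2, and the convention \({\widehat{z}}_0 = z_0\). The function names are hypothetical, and the product in (5.5) is truncated to finitely many levels.

```python
import math

def log1pexp(y):
    """Numerically stable log(1 + exp(y))."""
    return max(y, 0.0) + math.log1p(math.exp(-abs(y)))

def log_effective_activities(mu, energies, d):
    """Return [log z^_0, ..., log z^_N] for block energies E_0..E_N.

    Assumptions: z_j = exp(mu*|B_j| - E_j), |B_j| = 2**(d*j),
    Xi_j = z_j + Xi_{j-1}**(2**d), and z^_0 = z_0.
    """
    m = 2 ** d
    log_xi_pow = 0.0  # log of Xi_{j-1}^{2^d}; equals 0 below level 0
    logs = []
    for j, energy in enumerate(energies):
        log_z = mu * 2 ** (d * j) - energy
        logs.append(log_z - log_xi_pow)  # z^_j = z_j / Xi_{j-1}^{2^d}
        # log Xi_j = log(z_j + Xi_{j-1}^{2^d}), evaluated in log space
        log_xi = log_xi_pow + log1pexp(log_z - log_xi_pow)
        log_xi_pow = m * log_xi
    return logs

def packing_fraction(mu, energies, d):
    """Truncated version of (5.5): 1 - prod_j 1 / (1 + z^_j(mu))."""
    logs = log_effective_activities(mu, energies, d)
    return 1.0 - math.exp(-sum(log1pexp(l) for l in logs))
```

For instance, `packing_fraction(0.0, [0.0] * 6, 1)` evaluates the truncated product for six levels in dimension one; increasing `mu` should increase both the effective activities and the packing fraction, in line with part (a).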

In the proof of Lemma 5.1 we have proven a number of statements that can be formulated without any reference to the effective activities.

Proposition 5.2

The critical chemical potential satisfies \(\mu _c> e_\infty > - \infty \). In addition:

  1. (a)

    In \((-\infty , \mu _c)\) the pressure \(p(\mu )\) is strictly convex and continuously differentiable with packing fraction \(p'(\mu ) = \sigma (\mu ) \in (0,\sigma _c)\) and it satisfies \(p(\mu ) > \mu - e_\infty \).

  2. (b)

    If \(\mu _c < \infty \), then \(p(\mu ) = \mu - e_\infty \) for all \(\mu \ge \mu _c\) and the packing fraction is \(\sigma (\mu )=1\).

Proof

All statements except the strict convexity in \((- \infty , \mu _c)\) have been shown in the proof of Lemma 5.1. The strict convexity follows from the strict monotonicity of \(\sigma (\mu )\): Let \(\mu _1< \mu _2 < \mu _c\). Then

$$\begin{aligned} \sum _{j=0}^\infty \log (1+ {\widehat{z}}_j(\mu _1)) \le \sum _{j=0}^\infty \log (1+ {\widehat{z}}_j(\mu _2)) < \infty \end{aligned}$$

and, because of the monotonicity from Lemma 5.1(a),

$$\begin{aligned} \sum _{j=0}^\infty \Bigl ( \log (1+ {\widehat{z}}_j(\mu _2)) - \log (1+ {\widehat{z}}_j(\mu _1)) \Bigr ) \ge \log (1+ {\widehat{z}}_k(\mu _2)) - \log (1+ {\widehat{z}}_k(\mu _1))\nonumber \\ \end{aligned}$$
(5.6)

for all \(k\in {\mathbb {N}}_0\). Eq. (5.4) shows that if \(E_k<\infty \)—which is the case for at least one \(k\in {\mathbb {N}}_0\)—then \({\widehat{z}}_k(\mu )\) is strictly increasing in \(\mu \). Therefore the difference (5.6) is strictly positive and Eq. (5.5) yields \(\sigma (\mu _1) <\sigma (\mu _2)\). \(\square \)

5.2 Fixed Point Iteration: Absence of Phase Transition

The recurrence relation \(\Xi _{\Lambda _{ n+1}} = z_{n+1} + (\Xi _{\Lambda _n})^{2^d}\) encountered in the proof of Theorem 3.1 leads to a recurrence relation for the inverse probability of finding one large block. Indeed,

$$\begin{aligned} \frac{\Xi _{\Lambda _n}}{z_n} = 1+ \frac{z_{n-1}^{2^d}}{z_n} \Bigl ( \frac{\Xi _{\Lambda _{n-1}}}{z_{n-1}}\Bigr )^{2^d}. \end{aligned}$$

Thus if we set

$$\begin{aligned} v_n(\mu ):= \frac{\Xi _{\Lambda _n}(\mu )}{z_{n}(\mu )}= \frac{1}{{\mathbb {P}}^\mu _{\Lambda _n}(\omega =\{\Lambda _n\})} \end{aligned}$$

and

$$\begin{aligned} \varepsilon _n:= \frac{(z_{n-1}(\mu ) )^{2^d}}{z_n(\mu )} = \exp (E_n - 2^d E_{n-1})\qquad (n\in {\mathbb {N}}), \end{aligned}$$
(5.7)

then

$$\begin{aligned} v_{n} (\mu )= 1 + \varepsilon _n \bigl (v_{n-1}(\mu )\bigr )^{2^d}\qquad (n\in {\mathbb {N}}) \end{aligned}$$
(5.8)

and

$$\begin{aligned} v_0(\mu ) = 1+ \frac{1}{z_0(\mu )} = 1+ {\mathrm {e}}^{-\mu } {\mathrm {e}}^{E_0}. \end{aligned}$$

Notice that the \(\mu \)-dependence drops out from the ratio \(z_{n-1}(\mu )^{2^d} / z_n(\mu )\) so that \(\varepsilon _n\) in (5.8) does not depend on \(\mu \). Thus the sequence \((v_n(\mu ))_{n\in {\mathbb {N}}_0}\) is computed recursively and the only explicit \(\mu \)-dependence is through the initial condition \(v_0(\mu )\).

For energies \((E_n)_{n\in {\mathbb {N}}}\) leading to constant ratios \(\varepsilon _n \equiv \varepsilon \), the iteration defining \(v_n(\mu )\) is a fixed point iteration that is straightforward to analyze. Set

$$\begin{aligned} f_\varepsilon (x):= 1 + \varepsilon x^{2^d}, \quad c_d:= \sup _{x\ge 1}\frac{x-1}{x^{2^d}}. \end{aligned}$$
(5.9)

Notice \(c_d\in (0,1)\). The following case distinction is relevant for this section and the following:

  1. (1)

    If \(\varepsilon > c_d\), then \(f_\varepsilon (x)>x\) for all \(x\ge 0\).

  2. (2)

    If \(\varepsilon < c_d\), then the equation \(x = f_\varepsilon (x)\) has exactly two solutions \(x_-< x_+\) in \((0,\infty )\). They satisfy \(1 \le x_- < x_+\). The smaller fixed point is attractive (\(f'_\varepsilon (x_-)\in (0,1)\)), the larger fixed point is repulsive (\(f'_\varepsilon (x_+)>1\)).

  3. (3)

    If \(\varepsilon = c_d\), then \(f_\varepsilon \) has exactly one fixed point. The fixed point satisfies \(f'_\varepsilon (x) =1\).
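The case distinction above can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the closed form \(c_d = (m-1)^{m-1}/m^m\) with \(m = 2^d\), which follows from elementary calculus (the supremum in (5.9) is attained at \(x = m/(m-1)\)) but is not stated in the text, and it locates the two fixed points of case (2) by bisection. The function names are hypothetical.

```python
import math

def c_d(d):
    """sup_{x >= 1} (x - 1) / x**m with m = 2**d, attained at x = m/(m-1)."""
    m = 2 ** d
    return (m - 1) ** (m - 1) / m ** m

def fixed_points(eps, d):
    """Return (x_minus, x_plus) solving x = 1 + eps * x**(2**d), for 0 < eps < c_d."""
    m = 2 ** d
    if not 0.0 < eps < c_d(d):
        raise ValueError("two fixed points exist only for 0 < eps < c_d")

    def h(x):
        return 1.0 + eps * x ** m - x  # h > 0 means f_eps(x) > x

    def bisect(lo, hi):
        # h(lo) and h(hi) have opposite signs on entry
        sign_lo = h(lo) > 0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if (h(mid) > 0) == sign_lo:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    x_star = m / (m - 1)  # here h(x_star) < 0 because eps < c_d
    hi = 2.0 * x_star
    while h(hi) <= 0:     # h is eventually positive, so this terminates
        hi *= 2.0
    return bisect(1.0, x_star), bisect(x_star, hi)
```

In dimension one the fixed point equation is quadratic, so the output can be compared with \(x_\pm = (1\pm \sqrt{1-4\varepsilon })/(2\varepsilon )\); the derivative \(f'_\varepsilon (x) = 2^d \varepsilon x^{2^d-1}\) evaluated at the two roots should fall in \((0,1)\) and above 1, respectively.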

Theorem 5.3

Suppose

$$\begin{aligned} \liminf _{j\rightarrow \infty } \varepsilon _j = \liminf _{j\rightarrow \infty } \exp ( E_j - 2^d E_{j-1}) > c_d. \end{aligned}$$

Then \(\mu _c =\infty \).

Because of \(c_d< 1\), the theorem applies in particular to the reference measure for which \(E_j\equiv 0\) and we find that there are no entropy-driven phase transitions.

Corollary 5.4

If \(E_j\equiv 0\), then \(\mu _c = \infty \).

Proof of Theorem 5.3

Fix \(\mu \in {\mathbb {R}}\) and suppress the \(\mu \)-dependence from the notation. By the assumption of the theorem there exists \(n_0\in {\mathbb {N}}\) and \(\varepsilon >c_d\) such that \(\varepsilon _n > \varepsilon \) for all \(n\ge n_0\). Then \(v_{n_0+k} \ge f_\varepsilon ^k (v_{n_0})\) for all \(k\in {\mathbb {N}}_0\). A close look at the fixed point iteration \(x_{k+1} = f_\varepsilon (x_k)\), based on the case distinction sketched above, shows that \(f_\varepsilon ^k(x_0)\) goes to infinity for all \(x_0\ge 0\). Consequently \(v_n\rightarrow \infty \) as \(n\rightarrow \infty \). We check that the divergence is in fact exponentially fast. For \(n \ge n_0\) we have \(v_n =1+ \varepsilon _n v^{2^d}_{n-1}\ge \varepsilon v_{n-1}^{2^d}\) hence for all \(\delta >0\),

$$\begin{aligned} \delta v_n \ge \delta ^{1- 2^d} \varepsilon \times (\delta v_{n-1})^{2^d}. \end{aligned}$$

Let \(\delta >0\) be the solution of \(\delta ^{1- 2^d} \varepsilon =1\), then

$$\begin{aligned} \frac{1}{|B_n|} \log (\delta v_n) \ge \frac{1}{|B_{n-1}|} \log (\delta v_{n-1}) \end{aligned}$$

for all \(n\ge n_0\). Pick \(k\ge n_0\) with \(\delta v_{k}> 1\), which exists because of \(v_n\rightarrow \infty \). Then for all \(n\ge k\) we have

$$\begin{aligned} \delta v_n \ge (\delta v_k)^{|B_n|/|B_k|}. \end{aligned}$$

In particular \(v_n\rightarrow \infty \) exponentially fast. To conclude, we turn back to the pressure, bring the \(\mu \)-dependence back into the notation, and note

$$\begin{aligned} p(\mu ) - (\mu - e_\infty ) = \liminf _{n\rightarrow \infty } \frac{1}{|B_n|} \log \frac{\Xi _{\Lambda _n}(\mu )}{z_n(\mu )} = \liminf _{n\rightarrow \infty }\frac{1}{|B_n|}\log v_n(\mu ) >0. \end{aligned}$$

Thus \(p(\mu )> \mu - e_\infty \). This holds true for every \(\mu \in {\mathbb {R}}\), therefore \(\mu _c =\infty \). \(\square \)
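The quantity \(|B_n|^{-1}\log v_n(\mu )\) appearing at the end of the proof can be computed from the recursion (5.8). The following sketch is illustrative only; it works in log space to avoid overflow, takes the initial value \(v_0 = 1 + \exp (E_0 - \mu )\), and uses the hypothetical function name `pressure_gap_sequence`.

```python
import math

def log1pexp(y):
    """Numerically stable log(1 + exp(y))."""
    return max(y, 0.0) + math.log1p(math.exp(-abs(y)))

def pressure_gap_sequence(mu, energies, d):
    """Return [log(v_n) / |B_n| for n = 0..N] using v_n = 1 + eps_n * v_{n-1}**(2**d).

    Assumptions: |B_n| = 2**(d*n), eps_n = exp(E_n - 2**d * E_{n-1}),
    and v_0 = 1 + exp(E_0 - mu).
    """
    m = 2 ** d
    log_v = log1pexp(energies[0] - mu)
    out = [log_v]
    for n in range(1, len(energies)):
        log_eps = energies[n] - m * energies[n - 1]
        log_v = log1pexp(log_eps + m * log_v)  # log(1 + eps_n * v_{n-1}^m)
        out.append(log_v / 2 ** (d * n))
    return out
```

For the reference measure \(E_j\equiv 0\) in dimension one, `pressure_gap_sequence(0.0, [0.0] * 10, 1)` produces a non-decreasing sequence that stabilizes at a strictly positive value, consistent with Corollary 5.4.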

5.3 Continuous Phase Transition: Scaling Limit

Here we consider a model where each block has the same energy. Thus we assume that for some \(\lambda \in {\mathbb {R}}\),

$$\begin{aligned} \forall j \in {\mathbb {N}}_0:\quad E_j = \lambda . \end{aligned}$$

The total energy \(\sum _{B\in \omega } E(B)\) is then simply \(\lambda \) times the number of blocks in a configuration, and the Boltzmann factor is \({\mathrm {e}}^{-\lambda }\) raised to the power of the number of blocks, a feature somewhat reminiscent of random cluster models [18, Chapter 6].

The constant sequence \(E_j\equiv \lambda \) has \(e_\infty = \lim _{j\rightarrow \infty }E_j/|B_j| =0\). The ratio \(\varepsilon _n\) from Eq. (5.7) is constant and equal to

$$\begin{aligned} \varepsilon (\lambda ):= {\mathrm {e}}^{- (2^d - 1)\lambda }. \end{aligned}$$

We can therefore analyze the system with the fixed point iteration from the previous section. Set

$$\begin{aligned} \lambda _d:= - \frac{\log c_d}{2^d-1} \end{aligned}$$

and notice \(\lambda _d>0\). If \(\varepsilon (\lambda )>c_d\) i.e. \(\lambda < \lambda _d\), then Theorem 5.3 tells us that \(\mu _c=\infty \) and the system has no phase transition.

If \(\varepsilon (\lambda ) <c_d\) i.e. \(\lambda > - (2^d-1)^{-1} \log c_d\), then by case (2) below (5.9), the function \(f_{\varepsilon (\lambda )}(x)\) has two fixed points \(0< x_-(\lambda )< x_+(\lambda )\).

Theorem 5.5

Assume \(\lambda > \lambda _d= - (2^d -1)^{-1}\log c_d\) and let \(x_+(\lambda )>1\) be the repulsive fixed point of the map \({\mathbb {R}}_+\ni x\mapsto 1+ \varepsilon (\lambda ) x^{2^d}\). Then the system undergoes a phase transition at

$$\begin{aligned} \mu _c(\lambda ) = \lambda - \log \bigl ( x_+(\lambda ) - 1\bigr ) \end{aligned}$$

and the phase transition is continuous.

Proof

To lighten notation we suppress the \(\lambda \)-dependence. Set \(\mu ^*:= \lambda - \log (x_+ - 1)\) and note

$$\begin{aligned} v_0(\mu ^*) = 1+ \exp (- \mu ^*+\lambda ) =x_+(\lambda ). \end{aligned}$$

Our task is to show \(\mu _c = \mu ^*\). To that aim we return to the fixed point iteration for the inverse probability of finding a large block and the case distinction below (5.9):

  1. (1)

    If \(\mu >\mu ^*\), then \(v_0(\mu )<x_+(\lambda )\) and \(v_0(\mu )\) belongs to the domain of attraction of the fixed point \(x_-(\lambda )\) and \(v_n(\mu )\rightarrow x_-(\lambda )\) as \(n\rightarrow \infty \).

  2. (2)

    If \(\mu = \mu ^*\), then \(v_0(\mu ) = x_+(\lambda )\) and \(v_n(\mu ) = x_+(\lambda )\) for all \(n\in {\mathbb {N}}_0\).

  3. (3)

    If \(\mu < \mu ^*\), then \(v_0(\mu ) > x_+(\lambda )\) and \(v_n(\mu ) \rightarrow \infty \).

In the cases (1) and (2) we have

$$\begin{aligned} p(\mu ) - \mu = \lim _{n\rightarrow \infty } \frac{1}{|B_n|}\log \frac{\Xi _{\Lambda _n}(\mu )}{z_n(\mu )} = \lim _{n\rightarrow \infty } \frac{1}{|B_n|}\log v_n(\mu ) =0. \end{aligned}$$

Thus \(p(\mu ) = \mu \) for all \(\mu \ge \mu ^*\). Proceeding as in the proof of Theorem 5.3, one shows that the divergence in case (3) is exponentially fast and concludes \(p(\mu )>\mu \). Thus \(p(\mu ) = \mu \) if and only if \(\mu \ge \mu ^*\), consequently \(\mu _c = \mu ^* <\infty \). In particular, the system undergoes a phase transition.

The effective activity at \(\mu = \mu _c\) is given by

$$\begin{aligned} {\widehat{z}}_j(\mu _c) = \exp \Bigl (- \lambda + |B_j|\bigl (\mu _c - p_{j-1}(\mu _c)\bigr )\Bigr ). \end{aligned}$$

Because of \(\mu _c = p(\mu _c) \ge p_{j-1}(\mu _c)\), it follows that \({\widehat{z}}_j(\mu _c) \ge \exp ( - \lambda )\) and \(\sum _{j=0}^\infty {\widehat{z}}_j(\mu _c) =\infty \). We deduce from Lemma 5.1(c) that the phase transition is continuous. \(\square \)
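The formula for \(\mu _c(\lambda )\) and the three cases in the proof can be illustrated numerically. The sketch below rests on assumptions: it uses the closed form \(c_d = (m-1)^{m-1}/m^m\), \(m=2^d\) (elementary, but not stated in the text), finds \(x_+(\lambda )\) by bisection, and reports divergence of the iteration as `math.inf`. Function names are hypothetical.

```python
import math

def critical_mu(lam, d):
    """Return (mu_c, x_plus) with mu_c = lam - log(x_plus - 1), assuming lam > lambda_d."""
    m = 2 ** d
    eps = math.exp(-(m - 1) * lam)
    if eps >= (m - 1) ** (m - 1) / m ** m:
        raise ValueError("need lambda > lambda_d")

    def h(x):
        return 1.0 + eps * x ** m - x

    lo = m / (m - 1)   # h(lo) < 0 because eps < c_d
    hi = 2.0 * lo
    while h(hi) <= 0:
        hi *= 2.0
    for _ in range(200):  # bisection for the larger root x_plus
        mid = 0.5 * (lo + hi)
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    x_plus = 0.5 * (lo + hi)
    return lam - math.log(x_plus - 1.0), x_plus

def iterate_v(mu, lam, d, steps):
    """Iterate v_n = 1 + eps * v_{n-1}**(2**d) from v_0 = 1 + exp(lam - mu)."""
    m = 2 ** d
    eps = math.exp(-(m - 1) * lam)
    v = 1.0 + math.exp(lam - mu)
    for _ in range(steps):
        if m * math.log(v) > 700.0:  # next power would overflow: treat as divergence
            return math.inf
        v = 1.0 + eps * v ** m
    return v
```

Starting slightly above `mu_c` the iterates should settle at the attractive fixed point \(x_-(\lambda )\), while starting slightly below they blow up, matching cases (1) and (3) of the proof. Exactly at `mu_c` the iteration sits on the repulsive fixed point and is numerically unstable, so that case is best left to the analytic argument.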

The mixture of hierarchical cubes is closely related to Mandelbrot’s percolation process [16, 17]. Let us define a sequence of random subsets of the unit cube by rescaling \(\Lambda _n=\{1,\ldots , 2^n\}^d\). Let \({\mathcal {K}}\) be the collection of compact subsets of \([0,1]^d\), equipped with the Hausdorff distance and Borel \(\sigma \)-algebra \({\mathcal {B}}_{\mathcal {K}}\). Let us first map a block \(B\in {\mathbb {B}}\), \(B\subset {\mathbb {Z}}^d\), to its continuum counterpart \(B'\subset {\mathbb {R}}^d\) given by

$$\begin{aligned} B' = \bigcup _{\varvec{k} \in B} \bigl [ k_1-1,k_1]\times \cdots \times \bigl [ k_d-1,k_d]. \end{aligned}$$

Thus \(B'\) is the cube in \({\mathbb {R}}^d\) obtained as the union of unit cubes with upper right corners \(\varvec{k} \in B\subset {\mathbb {Z}}^d\). If \(B\subset \Lambda _n\) then \(B'\subset [0, 2^n]^d\). For \(n\in {\mathbb {N}}_0\), define the random variable \(K_n:(\Omega _{\Lambda _n},{\mathcal {P}}(\Omega _{\Lambda _n}), {\mathbb {P}}_{\Lambda _n})\rightarrow ({\mathcal {K}},{\mathcal {B}}_{\mathcal {K}})\) by

$$\begin{aligned} K_n(\omega ):= \bigcup _{B\in \omega } \frac{1}{2^n} B'. \end{aligned}$$

Further let \(F_n(\omega )\) be the closure of \([0,1]^d \setminus K_n(\omega )\). The random set \(K_n(\omega )\) is constructed as a union of cubes of sidelengths \(1, \frac{1}{2},\ldots , \frac{1}{2^n}\), roughly as follows.

  • With probability \(1/v_n(\mu )\) the random set is equal to the whole unit cube, \(K_n(\omega ) = [0,1]^d\).

  • With probability \(1 - 1/v_n(\mu )\), the random set is strictly smaller than the whole unit cube. In that case we decide independently for each of the \(2^d\) subcubes (\([0,\frac{1}{2}]^d\) and its translates) whether to add or not add it to \(K_n(\omega )\); a subcube is added with probability \(1/v_{n-1}(\mu )\). This results in a set \(A_{n,1}(\omega )\) that is a union of cubes of sidelength 1/2. Then, for each subcube that has not been added, we repeat the construction for each of the \(2^d\) subsubcubes, to be added with probability \(1/v_{n-2}(\mu ) \). We iterate until we have reached the smallest cubes of sidelength \(2^{-n}\), associated with the probability \(1/v_0(\mu )\).
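The recursive construction described in the two items above can be sketched as a sampler. This is an illustration under assumptions: the list `inv_prob` plays the role of \((v_k(\mu ))_{k=0,\ldots ,n}\) and must be supplied by the caller, lattice coordinates are zero-based here (unlike \(\Lambda _n=\{1,\ldots ,2^n\}^d\)), and the function name is hypothetical.

```python
import itertools
import random

def sample_blocks(n, inv_prob, d, rng, origin=None):
    """Sample a block configuration in a cube of side 2**n.

    inv_prob[k] stands for v_k(mu) >= 1, so a cube of level k is added as a
    whole with probability 1 / inv_prob[k]. Returns a list of (level, origin)
    pairs, where origin is the lower corner of the block in lattice units.
    """
    if origin is None:
        origin = (0,) * d
    if rng.random() < 1.0 / inv_prob[n]:
        return [(n, origin)]   # the whole cube is one block
    if n == 0:
        return []              # a single site that stays vacant
    half = 2 ** (n - 1)
    blocks = []
    # otherwise treat each of the 2**d subcubes independently
    for corner in itertools.product((0, 1), repeat=d):
        sub = tuple(o + c * half for o, c in zip(origin, corner))
        blocks.extend(sample_blocks(n - 1, inv_prob, d, rng, sub))
    return blocks
```

A call such as `sample_blocks(4, [2.0] * 5, 2, random.Random(1))` returns pairwise disjoint blocks inside a \(16\times 16\) square; rescaling each block by \(2^{-n}\) gives a realization of \(K_n(\omega )\).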

If the sequence \(v_n(\mu )\) is n-independent, let us write \(q\equiv 1/v_n(\mu )\), \(p= 1- q\), and suppress the \(\mu \)-dependence. Then we may think of \(K_n\) as a growing family of subsets of \([0,1]^d\) and accordingly of \(F_n(\omega )\) as a decreasing family, and set \(F(\omega ) = \cap _{n\in {\mathbb {N}}_0} F_n(\omega )\); we owe to S. Winter the remark that \(F(\omega )\) should correspond to a special instance of Mandelbrot’s percolation process [16, 17].

Revisiting the case distinctions on the asymptotic behavior of \((v_n(\mu ))_{n\in {\mathbb {N}}_0}\) we may expect the following behavior, under the assumption \(\lambda >\lambda _d\) and after restoration of the \(\mu \)-dependence in the notation:

  1. (1)

    If \(\mu = \mu _c(\lambda )\) then as \(n\rightarrow \infty \) the distribution of \(K_n^\mu \) should converge in some suitable sense to a process where at each scale, a block is added with probability \(1/x_+(\lambda )\), with \(x_+(\lambda )\) the repulsive fixed point of \(x\mapsto 1 + \varepsilon (\lambda ) x^{2^d}\).

  2. (2)

    If \(\mu >\mu _c(\lambda )\) the distribution of \(K_n^\mu \) should converge in some suitable sense to a process where at each scale, a block is added with probability \(1/x_-(\lambda )\), with \(x_-(\lambda )\) the attractive fixed point of \(x\mapsto 1 + \varepsilon (\lambda ) x^{2^d}\).

A rigorous statement and proof (or disproof) of these statements are beyond the scope of this article.

5.4 First-Order Phase Transition

Finally we provide necessary and sufficient conditions for the existence of a first-order phase transition. The mathematical proofs carried out in this section are complemented by a heuristic discussion in Sect. 6.

Theorem 5.6

Set \(u_j:= \exp ( |B_j| e_\infty - E_j)\). The following two conditions are equivalent:

  1. (i)

    There exists a family of non-negative weights \((a_k)_{k\in {\mathbb {N}}_0}\) such that \(\sum _{j=0}^\infty u_j \exp (a_j)<\infty \) and

    $$\begin{aligned} \sum _{k=j}^\infty \frac{|B_j|}{|B_k|} \log \bigl ( 1+ u_k\, {\mathrm {e}}^{a_k}\bigr ) \le a_j \end{aligned}$$
    (5.10)

    for all \(j\in {\mathbb {N}}_0\).

  2. (ii)

    The mixture of cubes has a first-order phase transition.

Corollary 5.7

  1. (a)

    If there is a first-order phase transition, then necessarily \(E_j \ge |B_j| e_\infty \) (i.e., \(u_j\le 1\)) for all \(j\in {\mathbb {N}}_0\) and \(\sum _{j=0}^\infty u_j< \infty \).

  2. (b)

    The condition \(\sum _{j=0}^\infty u_j \le 1/{\mathrm {e}}\) is sufficient for the existence of a first-order phase transition.

Example 5.8

Let \(E_j = J ( - |B_j| + |\partial B_j|)\) with \(J>0\) some coupling constant and \(|\partial B_j| = 2d\, 2^{j(d-1)}\) the area of the boundary of a cube of sidelength \(2^j\) in \({\mathbb {R}}^d\). Then if \(d\ge 2\) and J is sufficiently large, the mixture of cubes has a first-order phase transition.
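The sufficient condition of Corollary 5.7(b) is easy to test numerically for this example. The sketch below is illustrative: it uses \(e_\infty = \lim _j E_j/|B_j| = -J\), so that \(u_j = \exp (-J|\partial B_j|)\), and truncates the series after a fixed number of levels (an assumption that is harmless for \(d\ge 2\), where the terms decay doubly exponentially). Function names are hypothetical.

```python
import math

def u_example(j, J, d):
    """u_j = exp(|B_j| e_inf - E_j) for E_j = J(-|B_j| + |dB_j|), i.e. exp(-J |dB_j|)."""
    boundary = 2 * d * 2 ** (j * (d - 1))
    return math.exp(-J * boundary)

def sufficient_condition(J, d, levels=60):
    """Return (truncated sum of u_j, whether it is <= 1/e)."""
    total = sum(u_example(j, J, d) for j in range(levels))
    return total, total <= math.exp(-1.0)
```

For \(d=2\), `sufficient_condition(1.0, 2)` reports that the criterion holds, whereas `sufficient_condition(0.3, 2)` reports that it fails. A failure only means that Corollary 5.7(b) does not apply; it does not rule out a first-order transition.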

Proof of Corollary 5.7

(a) If there is a first-order phase transition, then by condition (i) in Theorem 5.6 we must have \(\sum _{j=0}^\infty u_j \le \sum _{j=0}^\infty u_j\exp (a_j)<\infty \), moreover \(\log (1+ u_j \exp (a_j)) \le a_j\) hence \(u_j \le 1- \exp ( - a_j)\le 1\).

(b) Choose \(a_k\equiv 1\). Because of \(\log (1+x) \le x\) and \(|B_j|\le |B_k|\) whenever \(j\le k\) we have

$$\begin{aligned} \sum _{k=j}^\infty \frac{|B_j|}{|B_k|} \log \Bigl (1+ u_k{\mathrm {e}}^{a_k}\Bigr ) \le \sum _{k=0}^\infty u_k {\mathrm {e}}^{a_k} = \Bigl (\sum _{k=0}^\infty u_k\Bigr ) {\mathrm {e}}\le 1 = a_j. \end{aligned}$$

Thus condition (i) in Theorem 5.6 is satisfied and the mixture has a first-order phase transition. \(\square \)

Proof of the implication \((ii)\Rightarrow (i)\) in Theorem 5.6

Suppose that the mixture of cubes has a first-order phase transition. Then

$$\begin{aligned} \mu _c - e_\infty =p(\mu _c) = \sum _{j=0}^\infty \frac{1}{|B_j|}\log (1+ {\widehat{z}}_j(\mu _c)) \end{aligned}$$

hence

$$\begin{aligned} {\widehat{z}}_j(\mu _c)&= \exp \Biggl ( |B_j|\mu _c - E_j\Biggr ) \exp \Biggl ( - |B_j|\,p_{j-1}(\mu _c)\Biggr ) \\&= \exp \biggl ( |B_j| e_\infty - E_j\biggr ) \,\exp \Biggl (|B_j| \sum _{k=j}^\infty \frac{1}{|B_k|}\log (1+ {\widehat{z}}_k(\mu _c))\Biggr ) \end{aligned}$$

for all \(j \in {\mathbb {N}}_0\). Equivalently, \(\zeta _j := {\widehat{z}}_j(\mu _c)\) and \(u_j := \exp ( |B_j| e_\infty - E_j)\) satisfy

$$\begin{aligned} \zeta _j = u_j \exp \Biggl (|B_j| \sum _{k=j}^\infty \frac{1}{|B_k|}\log (1+ \zeta _k)\Biggr ) \quad (j\in {\mathbb {N}}_0). \end{aligned}$$
(5.11)

Define \(a_j: =\log (\zeta _j/u_j)\), then \(a_j\ge 0\) and the inequality (5.10) holds true and is actually an equality. Moreover

$$\begin{aligned} \sum _{j=0}^\infty u_j {\mathrm {e}}^{a_j} = \sum _{j=0}^\infty \zeta _j = \sum _{j=0}^\infty {\widehat{z}}_j(\mu _c) <\infty \end{aligned}$$

because the phase transition is of first order, see Lemma 5.1(c). \(\square \)

The strategy for the proof of the implication \((i) \Rightarrow (ii)\) in Theorem 5.6 is as follows. First we show that if condition (i) holds true, then the fixed point equation (5.11) has at least one solution \((\zeta _j)\), see Lemma 5.9. Then we turn to the computation of the free energy \(\varphi (\sigma )\), which is given by a constrained minimization; we show that every solution of the fixed point problem (5.11) is associated with a critical point of the Lagrange functional \(L(\varvec{\rho },\sigma _\infty ,\mu )\) and deduce that the free energy is affine on some interval \([\sigma ^*,1]\).

Lemma 5.9

If the inequality (5.10) holds true for some family of non-negative weights \((a_k)_{k\in {\mathbb {N}}_0}\), then the fixed point problem (5.11) has at least one solution \(\varvec{\zeta }\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\) that satisfies \(\zeta _j \le u_j \exp ( a_j)\) for all \(j\in {\mathbb {N}}_0\).

Proof

We adapt the treatment of tree fixed points by Faris [19, Section 3.1] and reformulate our problem as a fixed point problem in a partially ordered set for a monotone increasing map. Let \({\mathcal {L}}\) be the space of bounded non-negative sequences \(\varvec{\zeta } = (\zeta _j)_{j\in {\mathbb {N}}_0}\). For \(\varvec{\zeta }\in {\mathcal {L}}\), define

$$\begin{aligned} F_j(\varvec{\zeta }) := u_j \exp \Biggl ( \sum _{k=j}^\infty \frac{ |B_j|}{|B_k|}\log \bigl (1 + \zeta _k\bigr )\Biggr )\quad (j\in {\mathbb {N}}_0). \end{aligned}$$

Further set \(\varvec{F}(\varvec{\zeta }):= (F_j(\zeta ))_{j\in {\mathbb {N}}_0}\). If \((u_j)_{j\in {\mathbb {N}}_0}\) is bounded, then \(\varvec{F}(\varvec{\zeta })\) is bounded as well; thus \(\varvec{F}\) maps \({\mathcal {L}}\) to \({\mathcal {L}}\). We equip \({\mathcal {L}}\) with the partial order of pointwise inequality, i.e., \(\varvec{x} \le \varvec{y}\) if and only if \(x_j \le y_j\) for all \(j \in {\mathbb {N}}_0\), and note that \(\varvec{F}\) is increasing with respect to that partial order.

The vector \(\varvec{w}\) defined by \(w_k:= u_k\exp (a_k)\) satisfies \(F_k(\varvec{w}) \le w_k\) for all \(k\in {\mathbb {N}}_0\). Define a sequence \(({\varvec{\zeta }}^{(n)})_{n\in {\mathbb {N}}_0}\) iteratively by \(\zeta _j^{(0)}\equiv 0\) and \(\zeta _j^{(n+1)} = F_j(\varvec{\zeta }^{(n)})\). Notice \(\zeta _j^{(1)} = u_j\).

We check by induction over n that \(\zeta _j^{(n)}\le \zeta _{j}^{(n+1)}\le u_j \exp (a_j)=w_j\) for all \(j \in {\mathbb {N}}_0\) and \(n\in {\mathbb {N}}_0\). For \(n=0\), the inequality reads \(0 \le u_j \le w_j\) which is clearly true. The induction step works because of the monotonicity of \(\varvec{F}\) and because of \(\varvec{F}(\varvec{w}) \le \varvec{w}\).

It follows that the limit \(\zeta _j:= \lim _{n\rightarrow \infty }\zeta _j^{(n)}\) exists for all \(j\in {\mathbb {N}}_0\) and satisfies \(\zeta _j \le w_j\), moreover \(\varvec{\zeta }= \varvec{F}(\varvec{\zeta })\) because \(F_j(\varvec{\zeta }^{(n)})\rightarrow F_j(\varvec{\zeta })\) by monotone convergence. \(\square \)
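The monotone iteration used in the proof can be run numerically. The following is a minimal sketch under assumptions: the hierarchy is truncated to finitely many levels (equivalently, \(u_k=0\) beyond the last level), \(|B_j| = 2^{dj}\), and the function names are hypothetical.

```python
import math

def apply_F(zeta, u, d):
    """One application of F_j(zeta) = u_j * exp(sum_{k>=j} |B_j|/|B_k| * log(1 + zeta_k))."""
    n_levels = len(u)
    tail = 0.0
    sums = [0.0] * n_levels
    # accumulate S_j = sum_{k >= j} log(1 + zeta_k) / |B_k| from the top level down
    for k in reversed(range(n_levels)):
        tail += math.log1p(zeta[k]) / 2 ** (d * k)
        sums[k] = tail
    return [u[j] * math.exp(2 ** (d * j) * sums[j]) for j in range(n_levels)]

def iterate_fixed_point(u, d, sweeps=200):
    """Iterate zeta^(n+1) = F(zeta^(n)) starting from zeta^(0) = 0."""
    zeta = [0.0] * len(u)
    for _ in range(sweeps):
        zeta = apply_F(zeta, u, d)
    return zeta
```

With the weights of Example 5.8 for \(d=2\) and \(J=1\), the choice \(a_j\equiv 1\) satisfies (5.10), so the iterates should stay between \(u_j\) and \({\mathrm {e}}\,u_j\) and converge to a fixed point of the truncated problem.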

The solution of Lemma 5.9 is in fact a critical point of the Lagrange function for the computation of the free energy \(\varphi (\sigma )\). Let

$$\begin{aligned} L_\sigma (\varvec{\rho },\sigma _\infty ;\mu ):= f\bigl ( \varvec{\rho }, \sigma _\infty \bigr ) - \mu \Biggl ( \sum _{j=0}^\infty \rho _j + \sigma _\infty - \sigma \Biggr ). \end{aligned}$$
(5.12)

Given \((\zeta _j)_{j\in {\mathbb {N}}_0}\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\) a summable sequence, set

$$\begin{aligned} \mu ^* := e_\infty + \sum _{k=0}^\infty \frac{1}{|B_k|} \log (1+ \zeta _k),\quad \rho ^*_j = \frac{\zeta _j}{1+\zeta _j}\prod _{k=j+1}^\infty \frac{1}{1+\zeta _k},\quad \sigma ^*:= 1- \prod _{j=0}^\infty \frac{1}{1+\zeta _j}.\nonumber \\ \end{aligned}$$
(5.13)

Note \(\sigma ^*=\sum _{j=0}^\infty \rho ^*_j \in (0,1)\). Fix \(\sigma \in [\sigma ^*,1)\) and define

$$\begin{aligned} \sigma _\infty := \frac{\sigma - \sigma ^*}{1 -\sigma ^*},\quad \rho _j := (1-\sigma _\infty ) \rho ^*_j. \end{aligned}$$
(5.14)

Thus \((\varvec{\rho },\sigma _\infty ) \) is a convex combination

$$\begin{aligned} (\varvec{\rho },\sigma _\infty ) = (1-\sigma _\infty )\, (\varvec{\rho }^*, 0) +\sigma _\infty (\varvec{0},1) \end{aligned}$$
(5.15)

and the packing fraction \(\sigma \) enters only via the weight \(\sigma _\infty \) in the convex combination.
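The definitions (5.13) and (5.14) can be evaluated directly for a finite sequence \((\zeta _j)\). The sketch below is illustrative only; it assumes \(|B_k| = 2^{dk}\), treats \(\zeta _k = 0\) beyond the last supplied level, and uses hypothetical function names.

```python
import math

def critical_quantities(zeta, e_inf, d):
    """Return (mu_star, rho_star, sigma_star) as in (5.13) for a finite list zeta."""
    mu_star = e_inf + sum(math.log1p(z) / 2 ** (d * k) for k, z in enumerate(zeta))
    rho_star = []
    tail_prod = 1.0  # prod_{k > j} 1 / (1 + zeta_k), built from the top level down
    for z in reversed(zeta):
        rho_star.append(z / (1.0 + z) * tail_prod)
        tail_prod /= 1.0 + z
    rho_star.reverse()
    sigma_star = 1.0 - tail_prod  # 1 - prod_j 1 / (1 + zeta_j)
    return mu_star, rho_star, sigma_star

def mixture(sigma, rho_star, sigma_star):
    """Return (rho, sigma_inf) as in (5.14) for sigma in [sigma_star, 1)."""
    sigma_inf = (sigma - sigma_star) / (1.0 - sigma_star)
    return [(1.0 - sigma_inf) * r for r in rho_star], sigma_inf
```

The identity \(\sigma ^* = \sum _j \rho ^*_j\) follows from a telescoping product and can be confirmed on any input, as can the constraint \(\sum _j \rho _j + \sigma _\infty = \sigma \) for the convex combination (5.15).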

Lemma 5.10

Suppose that the system (5.11) admits a solution \(\varvec{\zeta }\in {\mathbb {R}}_+^{{\mathbb {N}}_0}\) that satisfies \(\sum _{j=0}^\infty \zeta _j <\infty \) and define \(\mu ^*, \sigma ^*,\varvec{\rho ^*}\) as in (5.13). Assume \(\sigma \in [\sigma ^*,1)\) and define \((\varvec{\rho },\sigma _\infty )\) by (5.14). Then all partial derivatives of L at \((\varvec{\rho },\sigma _\infty , \mu ^*)\) exist and are equal to zero, and \((\varvec{\rho },\sigma _\infty , \mu ^*)\) is a minimizer of the Lagrange functional L.

Proof

Remember

$$\begin{aligned} {\widehat{\rho }}_j = \frac{\rho _j}{1 - \sum _{k \ge j+1} \rho _k- \sigma _\infty } = \frac{\rho '_j}{1- \sum _{k\ge j+1} \rho '_k},\quad \rho '_j = \frac{\rho _j}{1-\sigma _\infty }. \end{aligned}$$

Lemma 4.5 applied to \(m=0\) and \((\rho '_j)\) and \((\zeta _j)\) yields

$$\begin{aligned} {\widehat{\rho }}_j = \frac{\zeta _j}{1+ \zeta _j}<1 \qquad (j\in {\mathbb {N}}_0). \end{aligned}$$
(5.16)

The convergence of the series \(\sum _j \zeta _j\) implies \(\zeta _j \rightarrow 0\) as \(j\rightarrow \infty \). The free energy is given by a linear term minus the entropy, and the partial derivatives of the entropy have been computed in Eqs. (4.6) and (4.7). The existence of the partial derivatives follows from Proposition 4.2 and \(\rho _j>0\) for all j. We obtain

$$\begin{aligned} \frac{\partial L_\sigma }{\partial \rho _j}(\varvec{\rho },\sigma _\infty ,\mu ^*)&= \frac{1}{|B_j|}\Biggl ( E_j + \log \frac{{\widehat{\rho }}_j}{1-{\widehat{\rho }}_j} - \sum _{k=0}^{j-1} \frac{|B_j|}{|B_k|} \log (1- {\widehat{\rho }}_k) - \mu ^* |B_j|\Biggr ) \end{aligned}$$
(5.17)
$$\begin{aligned} \frac{\partial L_\sigma }{\partial \sigma _\infty }(\varvec{\rho },\sigma _\infty ,\mu ^*)&= e_\infty - \sum _{j=0}^\infty \frac{1}{|B_j|} \log (1- {\widehat{\rho }}_j) - \mu ^* . \end{aligned}$$
(5.18)

Eq. (5.16) yields \(\log (1+ \zeta _j) = - \log (1- {\widehat{\rho }}_j)\). The vanishing of (5.18) then follows from the definition of \(\mu ^*\) in (5.13), and the vanishing of (5.17) follows from the fixed point equation (5.11) combined with (5.13). Finally we note

$$\begin{aligned} \frac{\partial L_\sigma }{\partial \mu }(\varvec{\rho },\sigma _\infty ,\mu ^*) = \sigma - \sum _{j=0}^\infty \rho _j - \sigma _\infty = \sigma - \bigl ( (1- \sigma _\infty ) \sigma ^* + \sigma _\infty \bigr ) = 0 \end{aligned}$$

by definition of \(\sigma _\infty \).

By convexity, the critical point is a minimizer in every finite-dimensional affine subspace obtained by changing only finitely many components of \((\varvec{\rho },\sigma _\infty ,\mu ^*)\). The union of these subspaces is dense, and the Lagrange functional is continuous in the domain \(||(\varvec{\rho },\sigma _\infty )||\le 1\); the lemma follows. \(\square \)

Lemma 5.11

For \(\sigma \in [\sigma ^*,1)\) the vector \((\varvec{\rho }, \sigma _\infty )\) defined in (5.14) is a minimizer of the free energy \(f(\varvec{\rho },\sigma _\infty )\) under the constraint \(\sum _{j=0}^\infty \rho _j + \sigma _\infty = \sigma \), and the minimum \(\varphi (\sigma )\) is an affine function of \(\sigma \) with slope \(\mu ^*\),

$$\begin{aligned} \varphi (\sigma ) = \varphi (\sigma ^*) + \mu ^*(\sigma - \sigma ^*) \qquad (\sigma ^*\le \sigma <1). \end{aligned}$$

Proof

The vector \((\varvec{\rho },\sigma _\infty )\) is a minimizer because of Lemma 5.10. By (5.15) and Lemma 4.6, the free energy is

$$\begin{aligned} \varphi (\sigma )&= f( \varvec{\rho },\sigma _\infty ) = (1-\sigma _\infty ) f(\varvec{\rho ^*},0) + \sigma _\infty f(\varvec{0},1) = (1-\sigma _\infty ) \varphi (\sigma ^*) + \sigma _\infty e_\infty . \end{aligned}$$

Since \(\sigma _\infty \) is an affine function of \(\sigma \) by (5.14) it follows that \(\varphi (\sigma )\) is an affine function of \(\sigma \) as well. Lemma 5.10 yields

$$\begin{aligned} \frac{\partial f}{\partial \rho _j}(\varvec{\rho ^*},0) = \frac{\partial f}{\partial \sigma _\infty }(\varvec{\rho ^*},0) = \mu ^*. \end{aligned}$$

Therefore

$$\begin{aligned} \varphi '(\sigma )&= \sum _{j=0}^\infty \frac{\partial f}{\partial \rho _j}(\varvec{\rho ^*},0) \frac{\partial \rho _j}{\partial \sigma } + \frac{\partial f}{\partial \sigma _\infty }(\varvec{\rho ^*},0) \frac{\partial \sigma _\infty }{\partial \sigma } \\&= \sum _{j=0}^\infty \mu ^* \Bigl ( - \frac{\rho ^*_j}{1-\sigma ^*} \Bigr )+ \frac{\mu ^*}{1-\sigma ^*} = \mu ^*. \end{aligned}$$

\(\square \)
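
As a consistency check, the slope can also be read off directly from the affine formula for \(\varphi \). The following sketch assumes that (5.14) takes the form \(\rho _j = (1-\sigma _\infty )\rho _j^*\), \(\sigma _\infty = (\sigma -\sigma ^*)/(1-\sigma ^*)\), which is what the constraint \(\sum _{j=0}^\infty \rho _j + \sigma _\infty = \sigma \) together with \(\sum _{j=0}^\infty \rho _j^* = \sigma ^*\) suggests:

$$\begin{aligned} \varphi (\sigma )&= \varphi (\sigma ^*) + \frac{\sigma -\sigma ^*}{1-\sigma ^*} \bigl ( e_\infty - \varphi (\sigma ^*)\bigr ), \\ \varphi '(\sigma )&= \frac{e_\infty - \varphi (\sigma ^*)}{1-\sigma ^*}. \end{aligned}$$

Comparing with \(\varphi '(\sigma ) = \mu ^*\) gives \(e_\infty = \varphi (\sigma ^*) + \mu ^*(1-\sigma ^*)\), or equivalently \(\mu ^* - e_\infty = \mu ^*\sigma ^* - \varphi (\sigma ^*)\).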

Lemma 5.12

We have \(\mu ^*=\mu _c\), \(\sigma ^*=\sigma _c\), and \(\zeta _j = {\widehat{z}}_j(\mu _c)\) for all \(j\in {\mathbb {N}}_0\).

Remark 5.13

It follows that the solution \(\varvec{\zeta }\) of the fixed point problem (5.11) is in fact unique.

Proof

It follows from Lemma 5.11 and elementary considerations on Legendre transforms that \(p(\mu ) = \sup _{\sigma \in [0,1]} (\mu \sigma - \varphi (\sigma )) = \mu - e_\infty \) for \(\mu \ge \mu ^*\), which yields \(\mu _c \le \mu ^*\).
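
The Legendre transform computation can be spelled out as follows. The sketch assumes that \(\varphi \) is convex on \([0,1]\) with \(\varphi (1) = e_\infty \), so that the affine function from Lemma 5.11 is a supporting line of \(\varphi \). For \(\mu \ge \mu ^*\) and \(\sigma \in [0,1]\),

$$\begin{aligned} \mu \sigma - \varphi (\sigma )&\le \mu \sigma - \varphi (\sigma ^*) - \mu ^*(\sigma -\sigma ^*) \\&= (\mu -\mu ^*)\sigma + \mu ^*\sigma ^* - \varphi (\sigma ^*) \\&\le (\mu -\mu ^*) + \mu ^*\sigma ^* - \varphi (\sigma ^*) = \mu - \varphi (1) = \mu - e_\infty , \end{aligned}$$

with equality throughout at \(\sigma = 1\).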

Moreover, for \(\mu >\mu ^*\) the unique maximizer of \(\sigma \mapsto \mu \sigma - \varphi (\sigma )\) is \(\sigma =1\), while for \(\mu = \mu ^*\) every \(\sigma \in [\sigma ^*,1]\) is a maximizer. In particular, \(p(\mu ^*) = \sigma ^* \mu ^* - \varphi (\sigma ^*)\) and the constrained minimizer \((\varvec{\rho ^*},0)\) of \(f(\varvec{\rho },\sigma _\infty )\) is a maximizer at \(\mu = \mu ^*\) in the variational formula (5.3) for the pressure. It follows from Proposition 4.3 that \(\sum _{j=0}^\infty {\widehat{z}}_j(\mu ^*)<\infty \) (otherwise the unique maximizer would be \((\varvec{0},1)\), in contradiction with \((\varvec{\rho ^*},0)\) being a maximizer); hence, by Lemma 5.1, we must have \(\mu ^*\le \mu _c\).

Thus we have shown \(\mu _c = \mu ^*<\infty \). Proposition 4.3 and the previous considerations on the variational formula for the pressure \(p(\mu ^*) = p(\mu _c)\) also yield

$$\begin{aligned} \widehat{\rho _j^*} = \frac{{\widehat{z}}_j(\mu _c)}{1+{\widehat{z}}_j(\mu _c)} = \frac{\zeta _j}{1+\zeta _j} \end{aligned}$$

hence \(\zeta _j = {\widehat{z}}_j(\mu _c)\) for all \(j\in {\mathbb {N}}_0\). Finally \(\sigma _c = \sum _{j=0}^\infty \rho _j^* =\sigma ^*\). \(\square \)

Proof of the implication \((i)\Rightarrow (ii)\) in Theorem 5.6

Suppose that condition (i) is satisfied. Then by Lemma 5.9 the fixed point equation (5.11) has a solution and we may define \(\mu ^*\in {\mathbb {R}}\), \(\sigma ^*\in (0,1)\), and \(\rho ^*_j\) as in (5.13). Lemma 5.12 shows that the system has a phase transition at \(\mu _c = \mu ^*\) with \(\sigma _c = \sigma ^*<1\), hence the transition is of first order. \(\square \)

6 Discussion

A concluding heuristic discussion of the parameter-dependent model from Sect. 5 makes the connection to the motivating considerations on the mixture of hard spheres in the introduction more apparent. By Proposition 4.2, the free energy (5.2) of the parameter-dependent model is

$$\begin{aligned} f(\varvec{\rho },\sigma _\infty ) = \sum _{j=0}^\infty \rho _j \frac{E_j}{|B_j|} + \sigma _\infty e_\infty + \sum _{j=0}^\infty \frac{\rho _j}{|B_j|} \bigl (\log \rho _j - 1\bigr ) + \Phi (\varvec{\rho },\sigma _\infty ) \end{aligned}$$

with \(\Phi (\varvec{\rho },\sigma _\infty )\) the absolutely convergent power series from Eq. (4.9). The leading order in the power series is quadratic,

$$\begin{aligned} \Phi (\varvec{\rho },\sigma _\infty ) = \frac{1}{2} \sum _{j=0}^\infty \frac{\rho _j}{|B_j|}\Bigl ( \rho _j + 2\sum _{k=j+1}^\infty \rho _k + 2\sigma _\infty \Bigr ) + \text {higher order terms} \end{aligned}$$

and the power series vanishes when \(\rho _j \equiv 0\). Every configuration is a convex combination of a gas configuration and a condensed configuration

$$\begin{aligned} (\varvec{\rho },\sigma _\infty ) = (1- \sigma _\infty ) \, (\varvec{\rho '},0) + \sigma _\infty \, (\varvec{0},1) \end{aligned}$$

and by Lemma 4.6 the free energy is

$$\begin{aligned} f(\varvec{\rho },\sigma _\infty ) = (1- \sigma _\infty ) f(\varvec{\rho '},0) + \sigma _\infty e_\infty , \end{aligned}$$

which implies

$$\begin{aligned} \Phi (\varvec{\rho },\sigma _\infty ) = - \sum _{j=0}^\infty \frac{\rho _j}{|B_j|} \log (1- \sigma _\infty ) + (1-\sigma _\infty ) \Phi (\varvec{\rho '},0). \end{aligned}$$
(6.1)
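
The step from Lemma 4.6 to (6.1) only uses the scaling behavior of the entropy term. A short sketch, assuming that the ideal-gas part of the free energy is \(S(\varvec{\rho }) = \sum _{j=0}^\infty \frac{\rho _j}{|B_j|} (\log \rho _j - 1)\) (the notation \(S\) is introduced here for illustration only, and the weight \(1/|B_j|\) is what the stationarity equations of this section require), and writing \(\rho _j = (1-\sigma _\infty )\rho '_j\):

$$\begin{aligned} S(\varvec{\rho })&= (1-\sigma _\infty ) \sum _{j=0}^\infty \frac{\rho '_j}{|B_j|} \bigl ( \log \rho '_j - 1\bigr ) + \sum _{j=0}^\infty \frac{\rho _j}{|B_j|} \log (1-\sigma _\infty ) \\&= (1-\sigma _\infty ) S(\varvec{\rho '}) + \sum _{j=0}^\infty \frac{\rho _j}{|B_j|} \log (1-\sigma _\infty ). \end{aligned}$$

The energy terms are linear in \(\varvec{\rho }\) and cancel between the two sides of the identity from Lemma 4.6. What remains is \(\Phi (\varvec{\rho },\sigma _\infty ) = (1-\sigma _\infty ) \Phi (\varvec{\rho '},0) + (1-\sigma _\infty ) S(\varvec{\rho '}) - S(\varvec{\rho })\), which is (6.1).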

When minimizing the free energy at prescribed packing fraction \(\sigma _\infty + \sum _{j=0}^\infty \rho _j = \sigma \), two scenarios are possible: in the gas phase the minimizer has \(\sigma _\infty =0\), while in the coexistence region the minimizer has \(\sigma _\infty \in (0,1)\). Accordingly, in the gas phase the minimizer solves

$$\begin{aligned} \frac{E_j}{|B_j|} + \frac{1}{|B_j|} \log \rho _j + \frac{\partial \Phi }{\partial \rho _j}(\varvec{\rho },0) = \mu \qquad (j \in {\mathbb {N}}_0) \end{aligned}$$

with \(\mu \in {\mathbb {R}}\) some Lagrange parameter determined by

$$\begin{aligned} \sum _{j=0}^\infty \rho _j = \sum _{j=0}^\infty \exp \Bigl ( \mu |B_j| - E_j - |B_j| \frac{\partial \Phi }{\partial \rho _j}(\varvec{\rho },0) \Bigr ) =\sigma . \end{aligned}$$

In the coexistence region the equations are instead

$$\begin{aligned} \frac{E_j}{|B_j|} + \frac{1}{|B_j|} \log \rho _j + \frac{\partial \Phi }{\partial \rho _j}(\varvec{\rho },\sigma _\infty )&= \mu \qquad (j \in {\mathbb {N}}_0),\\ e_\infty + \frac{\partial \Phi }{\partial \sigma _\infty } (\varvec{\rho },\sigma _\infty )&= \mu ,\\ \sigma _\infty + \sum _{j=0}^\infty \rho _j&= \sigma . \end{aligned}$$

The second equation allows us to eliminate the Lagrange multiplier \(\mu \) from the first equation; we obtain

$$\begin{aligned} \rho _j \exp \Biggl (|B_j|\Bigl ( \frac{\partial \Phi }{\partial \rho _j}(\varvec{\rho },\sigma _\infty ) - \frac{\partial \Phi }{\partial \sigma _\infty }(\varvec{\rho },\sigma _\infty ) \Bigr ) \Biggr ) = \exp \bigl ( |B_j| e_\infty - E_j\bigr ) \qquad (j\in {\mathbb {N}}_0). \end{aligned}$$
(6.2)

Equation (6.1) allows us to reformulate these equations in terms of the primed variables \(\rho '_j = \rho _j /(1-\sigma _\infty )\). Indeed,

$$\begin{aligned} \frac{\partial \Phi }{\partial \rho _j}(\varvec{\rho },\sigma _\infty )&= - \frac{1}{|B_j|} \log (1- \sigma _\infty ) + \frac{\partial \Phi }{\partial \rho _j}(\varvec{ \rho '},0),\\ \frac{\partial \Phi }{\partial \sigma _\infty }(\varvec{\rho },\sigma _\infty )&= \sum _{j=0}^\infty \frac{\rho '_j}{|B_j|} - \Phi (\varvec{\rho '},0) + \sum _{j=0}^\infty \rho '_j \frac{\partial \Phi }{\partial \rho _j}(\varvec{ \rho '},0), \end{aligned}$$

and (6.2) is of the form

$$\begin{aligned} \rho '_j \exp \bigl ( F_j(\varvec{ \rho '})\bigr ) = u_j \qquad (j\in {\mathbb {N}}_0) \end{aligned}$$
(6.3)

with \(u_j = \exp ( |B_j| e_\infty - E_j)\) and \(F_j(\varvec{\rho }')\) a power series that is absolutely convergent in \(||\varvec{\rho '}|| = \sum _{j=0}^\infty |\rho '_j|<1\) and satisfies \(F_j(\varvec{\rho '}) =O( ||\varvec{\rho '}||)\). The fixed point equation (6.3) is similar to (5.11). In the absence of the correction term \(F_j\) the solution would be \(\rho '_j = u_j\). For sufficiently small values of \(u_j\) the solution should be a power series in the variables \(u_j\). Rigorous statements can be derived with the inversion theorems from [9, 11], complementing Lemma 5.9 on the solvability of Eq. (5.11).
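
To illustrate how an equation of the form (6.3) behaves for small \(u_j\), here is a minimal numerical sketch. It is not the construction from [9, 11]: the correction term `toy_F` is a hypothetical stand-in (simply the norm \(||\varvec{\rho '}||\), chosen only because it is \(O(||\varvec{\rho '}||)\)), the system is truncated to eight droplet types, and the values of `u` are arbitrary small numbers. The iteration \(\rho ' \leftarrow u_j \exp (-F_j(\varvec{\rho '}))\) starts from the uncorrected solution \(\rho '_j = u_j\).

```python
import math


def toy_F(j, rho):
    # Hypothetical correction term F_j(rho'): here just the l1 norm of rho',
    # independent of j. The true F_j of the model is NOT reproduced here.
    return sum(abs(r) for r in rho)


def solve_fixed_point(u, F, tol=1e-12, max_iter=10_000):
    """Solve rho_j * exp(F(j, rho)) = u_j by iterating rho_j <- u_j * exp(-F(j, rho))."""
    rho = list(u)  # start from the solution without correction term
    for _ in range(max_iter):
        new = [u_j * math.exp(-F(j, rho)) for j, u_j in enumerate(u)]
        if max(abs(a - b) for a, b in zip(new, rho)) < tol:
            return new
        rho = new
    raise RuntimeError("fixed-point iteration did not converge")


# Arbitrary small activities for a truncated system (assumption for the demo).
u = [0.05 * 2.0 ** (-j) for j in range(8)]
rho = solve_fixed_point(u, toy_F)

# Largest residual of the equation rho_j * exp(F_j(rho)) = u_j.
residual = max(
    abs(r * math.exp(toy_F(j, rho)) - u_j) for j, (r, u_j) in enumerate(zip(rho, u))
)
```

For this toy choice the map is a contraction when \(\sum _j u_j\) is small, so the iteration converges quickly and the solution stays close to \(\rho '_j = u_j\); for larger \(u_j\) convergence of the plain iteration is not guaranteed.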