1 Introduction

Berlin and Kac proposed [1] in 1952 a spherical model as a modification of the Ising model of a ferromagnet. In their model, discrete spin variables are replaced by continuum variables, i.e., by real numbers, while keeping a constraint that the total length of the continuum vector equals that of the discrete spin vector. This forces the continuum spin vectors to remain on the surface of a fixed high-dimensional sphere, hence the name “spherical model”. Their motivation was to find simple models where phase transitions could be studied fairly explicitly, in particular, in the physically relevant case of three dimensions.

Although the partition function of the spherical model cannot be explicitly solved for fixed finite lattices, it has an integral representation which allows studying the properties of its infinite volume limit when restricted to nearest neighbour interactions. The limiting partition function is sufficiently explicit that standard thermal equilibrium properties of the model can be derived from it and, as shown in [1], the spherical model in three dimensions has a phase transition corresponding to spontaneous magnetisation. The reference also contains estimates for the second and fourth moments of the field, implying that the fluctuations at small temperatures, when there is spontaneous magnetisation, cannot be Gaussian.

On a technical level, the spontaneous magnetisation found in [1] is analogous to Bose–Einstein condensation in quantum statistical mechanics. For instance, Yan and Wannier [2] extend the analysis in [1] to compute also the single site distribution (one-point function) in the infinite volume limit. They find that in the subcritical case the distribution is Gaussian, whereas in the supercritical case it is not Gaussian but instead corresponds to a random variable which is a sum of a random constant and a Gaussian variable. The appearance of the constant is analogous to the effect of condensation for an ideal Bose gas.

To elucidate the connection further, let us begin with more detailed definitions. The spherical model in d dimensions is defined as the random field of “continuous spin” \(s_x\in {\mathbb R}\), \(x\in \varLambda \), where \(\varLambda \subset {\mathbb R}^d\) is a finite lattice of points. The main purpose of using a lattice to label the spins is to define the interaction energy of a spin configuration: one assumes that there is given a coupling function \(J_{x,y}\), \(x,y\in \varLambda \), such that the energy of a configuration s is given by

$$\begin{aligned} E_\varLambda [s] := \sum _{x,y\in \varLambda } J_{x,y} s_x^* s_y, \end{aligned}$$

where \(s_x^*\) denotes the complex conjugate, added here for later use. Often one takes \(J_{x,y} = v(x-y)\) for a function v which decays sufficiently rapidly with increasing \(|x-y|\). For instance, the rectangular nearest neighbour case with Dirichlet boundary conditions would have \(\varLambda \subset {\mathbb Z}^d\) and \(v(x)=0\) for \(|x|_\infty \ge 2\), where \(|x|_\infty := \max _i |x_i|\). We will use both \(|x|_\infty \) and the Euclidean norm on \({\mathbb R}^d\), |x|, frequently in the following.

Denoting the lattice size by \(V=|\varLambda |<\infty \), the probability measure for the spin field s at inverse temperature \(\beta >0\) is given by

$$\begin{aligned} \mu _{\text {BK},\beta }[\mathrm{d}s] = \frac{1}{Z_{\text {BK},\varLambda ,\beta }} \mathrm{e}^{-\beta E_\varLambda [s]}\, \delta \!\left( \sum _{x\in \varLambda }s_x^2 - V\right) \prod _{x\in \varLambda }\! \mathrm{d}s_x. \end{aligned}$$
(1.1)

The first factor is the standard canonical Gibbs weight for the given temperature and energy function. The second “factor” is a \(\delta \)-function constraint which enforces the assumption that the length of the spin vector divided by the number of particles is equal to one. We will use such \(\delta \)-functions liberally in the following, and the discussion about their mathematical definition and properties is given in Appendix A. In particular, it follows that under the above measure \(\sum _{x\in \varLambda }s_x^2=V\) almost surely. Here \(Z_{\text {BK},\varLambda ,\beta }>0\) is a constant which normalizes the positive measure into a probability measure, and it is also equal to the earlier mentioned finite volume partition function of the spherical model.

Here, we generalize the above spherical model slightly by complexifying the spin field \(s_x\) and allowing for arbitrary spin-densities \(\rho >0\). Explicitly, we consider here complex fields \(\phi _x\in {\mathbb C}\), \(x\in \varLambda \), whose values are distributed according to the measure

$$\begin{aligned} \mu _{\rho ,\beta }[\mathrm{d}\phi ] = \frac{1}{Z_{\rho ,\beta }} \mathrm{e}^{-\beta E_\varLambda [\phi ]}\, \delta \!\left( \sum _{x\in \varLambda }|\phi _x|^2 -\rho V\right) \prod _{x\in \varLambda }\! \left[ \mathrm{d}\phi ^*_x \mathrm{d}\phi _x \right] , \end{aligned}$$
(1.2)

where \(\mathrm{d}\phi ^*_x \mathrm{d}\phi _x := \mathrm{d}\bigl (\mathrm{Re\,}\phi _x\bigr ) \mathrm{d}\bigl (\mathrm{Im\,}\phi _x\bigr )\). The measure (1.2) is a “classical field” version of the ideal gas of bosonic particles in the canonical ensemble where the total particle number is fixed to \(\rho V\) but energy is allowed to fluctuate according to the canonical Gibbs ensemble. In fact, it follows from our main result that the mechanism behind the spherical model phase transition is identical to that found for Bose–Einstein condensation of non-interacting bosons: if \(d\ge 3\), we show that for all sufficiently large densities \(\rho \) it is possible to separate a finite number of Fourier modes from the field, called the condensate, and these will carry all of the excess mass above criticality. The fluctuations of the remaining degrees of freedom, the normal fluid, are shown to become Gaussian and independent from the condensate fluctuations in the large volume limit. The connection between the spherical model, its grand canonical, Gaussian version, and Bose–Einstein condensation has been explored in [3] which contains also further references on the topic.

An important consequence of the analysis here is to observe that the condensate cannot always be composed out of a unique Fourier mode. In fact, the number of relevant modes and their fluctuations might even depend on the precise shape and size of \(\varLambda \). For spin interactions, and even more so for dispersion relations arising from tight binding approximation or for phonons in solid state physics, it would be important to be able to consider fairly general interaction potentials. A number of example lattice interactions are discussed in Sect. 4. One of these is given by a dispersion relation which has a unique global minimum but its restrictions to periodic rectangular lattices with L particles on each side have a unique condensate mode for odd L but \(2^d\) condensate modes for even L. This is in sharp contrast to the standard ideal Bose gas example [4, Theorem 5.2.30] where \(L\rightarrow \infty \) limiting behaviour is unique and all excess mass condenses into the (unique) ground state, corresponding to the Fourier mode with wave number zero.

Our main result, Theorem 1, provides explicit bounds which may be used to estimate the accuracy of any proposed splitting of the Fourier modes into condensate and normal fluid modes. One of the main goals of the present contribution has been to find methods which would be able to identify the condensate modes properly for general, finite range lattice interactions. This has resulted in the bounds given in Theorem 1; as we discuss in Sect. 4, these bounds are indeed sufficiently refined to distinguish the condensate modes correctly not only in the above odd and even L cases, but also in all other examples considered in Sect. 4.

Bose–Einstein condensation has been much more extensively studied in the literature than the spherical model. Although the analysis is complicated by the replacement of the complex field \(\phi _x\) by non-commutative bosonic creation and annihilation operators on the Fock space, the findings are not dissimilar from the above observations. For example, in [5] the properties of the condensate in the so-called imperfect Bose gas are shown to depend on which lattices are used to approach the infinite volume limit, by varying the anisotropy of the lattices. Even more extreme examples for the ideal Bose gas are given in [6]. Multi-state condensation has also been shown to occur in similar models in [7] and its introduction contains a summary of other earlier findings. In contrast, if one adds a one-particle energy gap, single-state condensation occurs for bosons interacting via superstable two-body potentials [8]. The role the explicit gap plays in the result is discussed in the paper but, since the gap is not allowed to depend on the system size, it is not possible to draw conclusions about the minimal gap size needed. Indeed, our results indicate that this dependence could be fairly complex in general.

A second motivation to study the measure (1.2) comes from statistical mechanics of discrete wave equations. Considering \((2^{\frac{1}{2}}\mathrm{Re\,}\phi _x, 2^{\frac{1}{2}}\mathrm{Im\,}\phi _x)\) to form a pair of canonical variables for each x, one may use the function \(E_\varLambda [\phi ]\) to define Hamiltonian evolution under which it is conserved and may be identified physically as the total energy. Requiring the symmetry condition \(J_{y,x}^* = J_{x,y}\) from the coupling, the evolution equations are equivalent to

$$\begin{aligned} \partial _t \phi _x = -\mathrm{i}\sum _{y\in \varLambda } J_{x,y}\phi _y. \end{aligned}$$

In particular, if \(J_{x,y}=\alpha (x-y;L)\) where \(\alpha \) is L-periodic, this corresponds to a discrete wave equation with periodic boundary conditions and with a dispersion relation \(\omega \) which is given by the Fourier transform of \(\alpha \). In addition, one may check by differentiation that the \(\ell _2\)-norm is conserved by the time-evolution, i.e., that \(\sum _{x\in \varLambda } |\phi _x|^2\) is also a conserved quantity. By Liouville’s theorem, the Lebesgue measure is invariant under the Hamiltonian evolution and thus the measure (1.2) yields a family of stationary measures for the discrete wave equation corresponding to the Hamiltonian \(E_\varLambda [\phi ]\). Therefore, our result can also be viewed as a proof of “Bose–Einstein” condensation for the equilibrium measures of these discrete wave equations.
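The two conservation laws can be checked numerically. The following is a minimal sketch, assuming a generic Hermitian coupling matrix `J` (a random one is used in the check, which is not an example from the text); the helper names `evolve` and `energy` are introduced here for illustration only. The evolution \(\partial _t \phi = -\mathrm{i}J\phi \) is solved exactly through the eigendecomposition of `J`.

```python
import numpy as np

def evolve(J, phi0, t):
    """Solve d(phi)/dt = -i J phi exactly for a Hermitian matrix J."""
    lam, U = np.linalg.eigh(J)                 # J = U diag(lam) U^dagger
    return U @ (np.exp(-1j * lam * t) * (U.conj().T @ phi0))

def energy(J, phi):
    """E[phi] = sum_{x,y} J_{x,y} conj(phi_x) phi_y (real for Hermitian J)."""
    return np.vdot(phi, J @ phi).real

def conservation_errors(n=6, t=0.7, seed=0):
    """Return the drift in l2-norm squared and in energy after time t."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    J = A + A.conj().T                         # hypothetical Hermitian coupling
    phi0 = rng.normal(size=n) + 1j * rng.normal(size=n)
    phi_t = evolve(J, phi0, t)
    d_norm = np.sum(np.abs(phi_t) ** 2) - np.sum(np.abs(phi0) ** 2)
    d_energy = energy(J, phi_t) - energy(J, phi0)
    return d_norm, d_energy
```

Both returned differences should vanish up to floating-point rounding, reflecting that the flow is unitary and commutes with `J`.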

To mention one additional motivation for the measures in (1.2), let us point out that they can also be obtained as a weak coupling limit of fixed density, i.e., “canonical”, equilibrium measures of the discrete nonlinear Schrödinger equation. In [9], we study the discrete nonlinear Schrödinger evolution with random initial data distributed according to a grand canonical ensemble, aiming at rigorous control of the related kinetic theory. However, the assumptions used in [9] require that the weak coupling measure in the thermodynamic limit becomes Gaussian, hence excluding a range of densities which correspond to the supercritical case studied here. The above results could provide the first step towards understanding kinetic theory for weakly nonlinear waves in presence of a condensate.

The main technique for controlling the error arising from the separation of the condensate degrees of freedom is very different from the previous estimates in [1, 2]. Instead of trying to represent the \(\delta \)-function in terms of oscillatory integrals, we think of it as a constraint defining a positive measure, and aim at minimizing the effect of the separation with a flexible choice of which modes are included in the condensate. It turns out that there are cases in which the condensate degrees of freedom have somewhat irregular fluctuations but the main achievement here is to show that it is possible to make the separation in such a manner that the number of condensate modes always remains bounded and the rest of the modes become independent Gaussian random variables. After the approximate measure has been chosen, we check that it is close to the original one by constructing a coupling between the two measures, borrowing ideas from [10]. This controls the Wasserstein distance between the measures, and together with their translation invariance, we conclude that there is a power \(p'>0\) such that all finite moments of the field \(\phi _x\) are \(O(L^{-p'})\) close to each other as \(L\rightarrow \infty \).

Couplings and Wasserstein metric are basic tools for optimal transport problems [11]. They have also been used for studies of condensation phenomena in stochastic particle systems, although in models such as zero-range processes the condensation occurs at isolated lattice sites instead of Fourier modes as in the cases discussed above. We refer to [12] and references therein for an up-to-date discussion and examples related to the topic.

In the following sections, we first define the complexified spherical model and describe the main results in more detail in Sect. 2. The fixed finite lattice case for supercritical densities is discussed in Theorem 1 while the conclusions for the case where a given dispersion relation is studied in the infinite volume limit are given in Corollary 1. These results give bounds for the Wasserstein distance between the spherical model measure and the approximation where the condensate and normal fluid modes have been separated. The bounds typically diverge, but in Sect. 3 we explain how they nevertheless imply that the approximation errors of finite moments vanish in the infinite volume limit. Various scenarios for the formation of the condensate for a number of example continuum dispersion relations are discussed in Sect. 4.

In the technical part, we first prove Theorem 1 in Sect. 5, together with the statement in item 3 of Proposition 1, whose proof uses a number of components from the proof of the theorem. The main estimates that allow us to control the infinite volume limit of fixed dispersion relations are given in Sect. 6, in particular, completing the missing proof of Lemma 1. In the two Appendices, we first clarify the precise mathematical interpretation of the \(\delta \)-function constraints and then recall the definition and basic properties of the Wasserstein distance.

2 Separation of Condensate in the Spherical Model

2.1 Notations and definition of the spherical model measure

We begin with the probability measure for a finite complex field \(\phi _x\), \(x\in \varLambda \), defined by the complexified spherical model of Berlin and Kac given in (1.2). For simplicity, we only consider d-dimensional periodic lattices of fixed side length L, which we parameterize as follows

$$\begin{aligned}&\varLambda _L := \Bigl \{-\frac{L-1}{2},\ldots ,\frac{L-1}{2}\Bigr \}^d, \qquad \text {if }L\text { is odd},\end{aligned}$$
(2.1)
$$\begin{aligned}&\varLambda _L := \Bigl \{-\frac{L}{2}+1,\ldots ,\frac{L}{2}\Bigr \}^d, \qquad \text {if }L\text { is even}. \end{aligned}$$
(2.2)

Then always \(V:=|\varLambda _L|=L^d\) and \(\varLambda _L\subset \varLambda _{L'}\) if \(L\le L'\). Also, if L is odd, \(x\in {\mathbb Z}^d\) belongs to \(\varLambda _L\) if and only if \(|x|_\infty < \frac{L}{2}\). If L is even, \(\varLambda _L\) contains those \(x\in {\mathbb Z}^d\) for which \(|x|_\infty \le \frac{L}{2}\) and \(x_i\ne -\frac{L}{2}\) for all i.
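The parameterization (2.1)–(2.2) and the stated membership criteria can be illustrated with a short sketch. The function names `lattice_1d`, `lattice` and `in_lattice` are introduced here for illustration and do not appear in the text.

```python
from itertools import product

def lattice_1d(L):
    """Coordinate values of Lambda_L along one axis, following (2.1)-(2.2)."""
    if L % 2 == 1:
        half = (L - 1) // 2
        return list(range(-half, half + 1))
    return list(range(-(L // 2) + 1, L // 2 + 1))

def lattice(L, d):
    """All points of Lambda_L in d dimensions, as tuples of integers."""
    return list(product(lattice_1d(L), repeat=d))

def in_lattice(x, L):
    """Membership test using the sup-norm characterization given in the text."""
    sup = max(abs(xi) for xi in x)
    if L % 2 == 1:
        return 2 * sup < L                      # |x|_inf < L/2
    return 2 * sup <= L and all(2 * xi != -L for xi in x)
```

For example, `lattice_1d(4)` gives `[-1, 0, 1, 2]` and `lattice_1d(5)` gives `[-2, -1, 0, 1, 2]`, so that \(\varLambda _4\subset \varLambda _5\) coordinate-wise.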

We further simplify the discussion by restricting to energy functions satisfying periodic boundary conditions. Without loss of generality, we also absorb the inverse temperature into the definition, and thus assume that

$$\begin{aligned} \beta E_\varLambda [\phi ] = H_L[\phi ] := \sum _{x,y\in \varLambda _L} \phi _x^* \alpha (x-y;L) \phi _y, \end{aligned}$$

where \(\alpha :\varLambda _L\rightarrow {\mathbb C}\) determines the interaction energies. Here, and in the following, we use periodic arithmetic on \(\varLambda _L\), setting \(x'\pm x := (x'{\pm }x) \bmod \varLambda _L\) and \(-x:= (-x) \bmod \varLambda _L\), for \(x',x\in \varLambda _L\).
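As an illustration of the periodic arithmetic, the reduction “\(\bmod \varLambda _L\)” can be sketched as follows for a single coordinate; the helper name `wrap` is hypothetical and the reduction is applied coordinate-wise in d dimensions.

```python
def wrap(z, L):
    """Reduce an integer z modulo L to its representative in Lambda_L (one axis)."""
    lo = -((L - 1) // 2) if L % 2 == 1 else -(L // 2) + 1   # smallest coordinate
    return (z - lo) % L + lo

def add(x, y, L):
    """Periodic sum x + y on Lambda_L for points given as tuples."""
    return tuple(wrap(xi + yi, L) for xi, yi in zip(x, y))

def neg(x, L):
    """Periodic negation -x on Lambda_L."""
    return tuple(wrap(-xi, L) for xi in x)
```

Note that for even L the point with coordinate \(\frac{L}{2}\) is its own negative, since \(-\frac{L}{2}\) is reduced back to \(\frac{L}{2}\).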

The above definition implies that the energies remain invariant under periodic translations of the field configuration, i.e., \(H_L[\phi ']=H_L[\phi ]\) if \(y\in \varLambda _L\) and \(\phi '_x:=\phi _{x+y}\), \(x\in \varLambda _L\). In fact, we can now “diagonalize” the interaction by using discrete Fourier transform. We define the Fourier transform on \(\varLambda =\varLambda _L\) by first setting as the dual lattice \(\varLambda ^*(L) := \varLambda _L/L\subset {]}{-}\frac{1}{2},\frac{1}{2}]^d\) and then denoting the Fourier transform of a function \(f:\varLambda \rightarrow \mathbb {C}\) by \(\widehat{f} : \varLambda ^* \rightarrow \mathbb {C}\), where

$$\begin{aligned} \widehat{f}(k) = \sum _{x\in \varLambda } f(x) \text {e}^{-\text {i} 2\pi k \cdot x}, \qquad k\in \varLambda ^*. \end{aligned}$$
(2.3)

The inverse transform is given by

$$\begin{aligned} \widetilde{g}(x) = \frac{1}{V} \sum _{k\in \varLambda ^*} g(k) \text {e}^{\text {i} 2\pi k \cdot x} =: \int _{\varLambda ^*}\!\mathrm{d}k\, g(k) \text {e}^{\text {i} 2\pi k \cdot x}, \qquad x\in \varLambda . \end{aligned}$$
(2.4)

It is straightforward to check that both transforms are pointwise invertible for all f and g, \(f(x)=\widetilde{(\widehat{f})}(x)\) for \(x\in \varLambda \) and \(g(k) = \widehat{(\widetilde{g})}(k)\) for \(k\in \varLambda ^*\).
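The conventions (2.3)–(2.4) differ from some library defaults in the choice of index set, so a direct implementation can be useful for checking. The following sketch evaluates the sums explicitly with NumPy; the function names are illustrative, and the cost is \(O(V^2)\), so it is meant for small lattices only.

```python
import numpy as np
from itertools import product

def sites(L, d):
    """Points of Lambda_L as an array of shape (L**d, d)."""
    lo = -((L - 1) // 2) if L % 2 == 1 else -(L // 2) + 1
    return np.array(list(product(range(lo, lo + L), repeat=d)), dtype=float)

def fourier(f, L, d):
    """hat f(k) = sum_x f(x) exp(-i 2 pi k.x), with k in Lambda^* = Lambda_L / L."""
    x = sites(L, d)
    k = x / L
    return np.exp(-2j * np.pi * (k @ x.T)) @ f

def inverse_fourier(g, L, d):
    """tilde g(x) = (1/V) sum_k g(k) exp(i 2 pi k.x)."""
    x = sites(L, d)
    k = x / L
    return (np.exp(2j * np.pi * (x @ k.T)) @ g) / len(x)
```

With this normalization, applying `inverse_fourier` after `fourier` returns the original function, and \(\sum _x |f(x)|^2 = \frac{1}{V}\sum _k |\widehat{f}(k)|^2\).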

The standard convolution results hold for the discrete Fourier transform, and thus we have

$$\begin{aligned} H_L[\phi ] = \int _{\varLambda ^*}\!\mathrm{d}k\, \omega (k) |\varPhi _k|^2 =: H[\varPhi ], \end{aligned}$$

where \(\varPhi = \widehat{\phi }:\varLambda ^*\rightarrow {\mathbb C}\) and \(\omega = \widehat{\alpha }\). In this formulation, it is now obvious that if we wish to satisfy the physical requirement of the energy \(H_L\) being real for all field configurations, it is necessary that \(\omega (k)\in {\mathbb R}\) for all \(k\in \varLambda ^*\). In addition, by the inversion formula

$$\begin{aligned} \alpha (x;L) := \int _{\varLambda ^*}\!\mathrm{d}k\, \omega (k) \text {e}^{\text {i} 2\pi k \cdot x}. \end{aligned}$$
(2.5)

Therefore, it is possible to simplify the study of the infinite volume limit \(L\rightarrow \infty \) by considering a “target” function \(\omega :{\mathbb T}^d\rightarrow {\mathbb R}\), parameterizing the torus using \(]{-}\frac{1}{2},\frac{1}{2}]^d\), and defining \(\alpha \) using the formula (2.5). For reasons explained in the Introduction, we call such functions \(\omega \) dispersion relations. In the following, some of the results concern the limiting behaviour as \(L\rightarrow \infty \) for some given dispersion relation \(\omega \) on the torus, while others assume that L is fixed and \(\omega (k)\), \(k\in \varLambda ^*\), are some given real numbers.
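The diagonalization can be verified numerically in one dimension. The sketch below assumes, purely for illustration, the nearest neighbour dispersion relation \(\omega (k)=2-2\cos (2\pi k)\); it builds \(\alpha \) from (2.5), evaluates \(H_L[\phi ]\) as a double sum over sites, and compares the result with the Fourier-space expression.

```python
import numpy as np

def check_diagonalization(L=6, seed=0):
    """Return (H_L[phi], H[Phi]) for a random field on a 1D periodic lattice."""
    rng = np.random.default_rng(seed)
    lo = -((L - 1) // 2) if L % 2 == 1 else -(L // 2) + 1
    x = np.arange(lo, lo + L)
    k = x / L                                   # dual lattice Lambda^*
    omega = 2.0 - 2.0 * np.cos(2 * np.pi * k)   # assumed example dispersion
    # alpha(x - y; L) from (2.5); the phase is L-periodic, so no reduction is needed
    z = x[:, None] - x[None, :]
    alpha = np.exp(2j * np.pi * z[..., None] * k).dot(omega) / L
    phi = rng.normal(size=L) + 1j * rng.normal(size=L)
    H_x = np.vdot(phi, alpha @ phi)             # sum_{x,y} conj(phi_x) alpha phi_y
    Phi = np.exp(-2j * np.pi * np.outer(k, x)) @ phi
    H_k = np.sum(omega * np.abs(Phi) ** 2) / L  # (1/V) sum_k omega(k) |Phi_k|^2
    return H_x, H_k
```

The two returned values should agree up to rounding, and the imaginary part of the site-space sum should vanish because \(\omega \) is real.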

We also denote

$$\begin{aligned} N_L[\phi ] = \sum _{x\in \varLambda _L} |\phi _x|^2, \end{aligned}$$

and thus arrive at the following expression for the spherical model measure

$$\begin{aligned} \mu _{\rho ,\beta }[\mathrm{d}\phi ] = \frac{1}{Z_{\rho ,\beta }} \mathrm{e}^{-H_L[\phi ]}\, \delta \!\left( N_L[\phi ] -\rho V\right) \prod _{x\in \varLambda }\! \left[ \mathrm{d}\phi ^*_x \mathrm{d}\phi _x \right] . \end{aligned}$$
(2.6)

By the discrete Plancherel theorem, here \(N_L[\phi ]=\Vert \phi \Vert ^2=\Vert \varPhi \Vert ^2=: N[\varPhi ]\), and we observed earlier that \(H_L[\phi ]=H[\varPhi ]\). Since the Fourier transform introduces an invertible linear transformation of the field, we may conclude that the spherical model measure has a particularly simple form for the Fourier components \(\varPhi _k = \widehat{\phi }_k\) of the field,

$$\begin{aligned} \mu _0[\mathrm{d}\varPhi ] := \frac{1}{Z_{\rho }} \mathrm{e}^{-H[\varPhi ]} \delta (N[\varPhi ]- \rho V) \prod _{k\in \varLambda ^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \end{aligned}$$
(2.7)

where \(\mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k := \mathrm{d}\bigl (\mathrm{Re\,}\varPhi _k\bigr ) \mathrm{d}\bigl (\mathrm{Im\,}\varPhi _k\bigr )\), \(Z_\rho \) normalizes the integral to one, and

$$\begin{aligned} H[\varPhi ] := \int _{\varLambda ^*}\!\mathrm{d}k\, \omega (k) |\varPhi _k|^2,\qquad N[\varPhi ] := \int _{\varLambda ^*}\!\mathrm{d}k\, |\varPhi _k|^2. \end{aligned}$$

As the norm in which to measure the Wasserstein distance, we choose the \(\ell _2\)-metric on the x-space. By the Plancherel theorem for discrete Fourier transform, this means using the following norm for the field \(\varPhi _k\),

$$\begin{aligned} \Vert \varPhi \Vert ^2 := \int _{\varLambda ^*}\!\mathrm{d}k\, |\varPhi _k|^2, \end{aligned}$$

and \(N[\varPhi ]=\Vert \varPhi \Vert ^2\). We also need spherical coordinates in these variables. We denote the radial distance coordinate by \(|\varPhi |\), and it is then related to the above norm by

$$\begin{aligned} |\varPhi |^2 := \sum _{k\in \varLambda ^*} |\varPhi _k|^2 = |\varLambda |\, \Vert \varPhi \Vert ^2. \end{aligned}$$

2.2 Factorized supercritical measures

Our goal is to study the spherical model for parameter values which lead to generation of a condensate. Since this is a physical, macroscopic notion, we first need to quantify mathematically what it could mean for finite lattice systems such as the spherical model measure introduced in the previous subsection. After this, we will separately consider the large L behaviour of systems whose energy eigenvalues \(\omega (k)\), \(k\in \varLambda ^*\), arise from a continuum dispersion relation \(\omega :{\mathbb T}^d\rightarrow {\mathbb R}\) as explained earlier.

To quantify condensates and supercriticality, it will be necessary to identify a sufficiently large energy gap separating the modes which belong to the condensate from the rest. To this end, we divide the wave numbers in \(\varLambda ^*\) into a condensate wave number set \(\varLambda _0^*\) and a normal fluid wave number set \(\varLambda ^*_+ =\varLambda ^*{\setminus } \varLambda _0^*\) in such a manner that the energies occurring in these sets are separated by a non-empty interval. An important parameter of the split turns out to be the proportional size of the gap, after normalizing the lowest energy to zero; the following item collects the related definitions and terminology.

Definition 1

Consider \(\varLambda ^*\) for some fixed L and suppose \(\omega (k)\in {\mathbb R}\), \(k\in \varLambda ^*\), are given. Define \(\omega _0:=\min _{k\in \varLambda ^*} \omega (k)\) and \(e_k:=\omega (k)-\omega _0\ge 0\), \(k\in \varLambda ^*\). A split of \(\varLambda ^*\) is a pair \((\varLambda _0^*,\varLambda _+^*)\) of nonempty disjoint subsets of \(\varLambda ^*\) whose union covers the whole \(\varLambda ^*\). Given \(0\le a<b\) and a split \((\varLambda _0^*,\varLambda _+^*)\), we say that the split is separated by the energy interval [a, b] if \(e_k \le a\) for all \(k\in \varLambda ^*_0\) and \(e_k \ge b\) for all \(k\in \varLambda ^*_+\). In this case, the relative energy gap of the split is defined as \(\delta ^{-1}\) where

$$\begin{aligned} \delta := \frac{\max _{k\in \varLambda ^*_0}e_k}{\min _{k\in \varLambda ^*_+}e_k}\le \frac{a}{b}<1. \end{aligned}$$

We denote the number of elements in the two subsets of the split by \(V_0:=|\varLambda _0^*|\) and \(V_+:=|\varLambda _+^*|\).

Since \(V=|\varLambda ^*|\), for such a split we clearly have \(0<V_0,V_+<V\) and \(V=V_0+V_+\). Also, every global lattice minimum, i.e., every point \(k\in \varLambda ^*\) at which \(\omega (k)=\omega _0\), belongs to \(\varLambda ^*_0\). Hence, \(\varLambda ^*_0\) contains all k for which \(e_k=0\), and thus \(e_k>0\) for all \(k\in \varLambda ^*_+\).
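One simple way to produce a split in the sense of Definition 1 is to threshold the normalized energies. The following sketch does this for a given array of values \(\omega (k)\); the function name `split_by_threshold` and the thresholding strategy are assumptions made for illustration, not the selection procedure used in the proofs.

```python
import numpy as np

def split_by_threshold(omega, a):
    """Split the modes into condensate (e_k <= a) and normal fluid (e_k > a).

    Returns a boolean mask marking the condensate modes, the lower edge b of
    the normal fluid energies, and delta = max_cond(e_k) / min_normal(e_k).
    """
    omega = np.asarray(omega, dtype=float)
    if a < 0:
        raise ValueError("a must be non-negative")
    e = omega - omega.min()                  # e_k = omega(k) - omega_0 >= 0
    cond = e <= a                            # never empty: the minimum has e_k = 0
    if cond.all():
        raise ValueError("normal fluid set would be empty; decrease a")
    b = e[~cond].min()
    delta = e[cond].max() / b                # delta < 1 since b > a
    return cond, b, delta
```

The relative energy gap of the resulting split is then `1 / delta`, which is infinite when all condensate modes are exact minima.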

Given such a split, we call the field \(\varPhi ^+\) composed out of modes with \(k\in \varLambda ^*_+\) the normal fluid while the field \(\varPhi ^0\) resulting from the remaining modes is called the condensate. The goal is to quantify under which assumptions the condensate field can be composed out of a small fraction of the modes, \(\frac{V_0}{V}\ll 1\), so that they nevertheless carry a substantial fraction of the total mass \(\rho V\). Analogously to the Bose–Einstein condensation, one could then expect the normal fluid to fluctuate according to the critical thermal, grand canonical ensemble. Indeed, under the assumptions made in the main theorem we can prove that the normal fluid \(\varPhi ^+\) follows very accurately Gaussian statistics given by the following distribution

$$\begin{aligned}&\mu _+[\mathrm{d}\varPhi ] := \frac{1}{Z_+} \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-L^{-d} \sum _{k\in \varLambda _+^*} (\omega (k)-\omega _0) |\varPhi _k|^2}. \end{aligned}$$
(2.8)

This measure is well-defined since \(\omega (k)>\omega _0\) for all \(k\in \varLambda ^*_+\). The expectation of norm density, \(\langle \Vert \varPhi _+\Vert ^2/V\rangle \), under such a measure is equal to

$$\begin{aligned} \rho _{\mathrm {c}}(L) := \int _{\varLambda _+^*}\!\!\mathrm{d}k\, \frac{1}{\omega (k)-\omega _0}. \end{aligned}$$
(2.9)

The standard deviation of the norm density is proportional to \(1/\sqrt{V}=L^{-\frac{d}{2}}\), and thus for large L the normal fluid under this measure cannot carry much more than \(\rho _{\mathrm {c}}\) of the density \(\rho \) fixed by the condition \(N[\varPhi ]=\rho V\). Since \(N[\varPhi ]= \Vert \varPhi _+\Vert ^2+ \Vert \varPhi _0\Vert ^2\), as soon as \(\rho >\rho _{\mathrm {c}}\) the extra norm density \(\rho -\rho _{\mathrm {c}}\) will be contained in the condensate modes.
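The critical density (2.9) is a finite sum and can be evaluated directly. Under (2.8), each mode \(\varPhi _k\), \(k\in \varLambda ^*_+\), is an independent complex Gaussian with \(\langle |\varPhi _k|^2\rangle = V/e_k\), which gives the stated expectation. The sketch below computes \(\rho _{\mathrm {c}}(L)\) and samples the norm density; the function names are illustrative, and any particular dispersion relation passed in is an assumption of the caller.

```python
import numpy as np
from itertools import product

def dual_lattice(L, d):
    """Points of Lambda^* = Lambda_L / L as an array of shape (L**d, d)."""
    lo = -((L - 1) // 2) if L % 2 == 1 else -(L // 2) + 1
    axis = np.arange(lo, lo + L) / L
    return np.array(list(product(axis, repeat=d)))

def critical_density(e, cond):
    """rho_c(L) = (1/V) sum over normal fluid modes of 1/e_k, cf. (2.9)."""
    return np.sum(1.0 / e[~cond]) / e.size

def sample_normal_fluid_density(e, cond, n_samples, rng):
    """Sample ||Phi_+||^2 / V under the Gaussian measure (2.8)."""
    V = e.size
    e_plus = e[~cond]
    sigma = np.sqrt(V / (2.0 * e_plus))      # std of Re and Im of Phi_k
    re = rng.normal(size=(n_samples, e_plus.size)) * sigma
    im = rng.normal(size=(n_samples, e_plus.size)) * sigma
    norm_sq = np.sum(re ** 2 + im ** 2, axis=1) / V   # ||Phi_+||^2
    return norm_sq / V
```

As a usage example, for the nearest neighbour dispersion `e = np.sum(2 - 2*np.cos(2*np.pi*k), axis=1)` with `k = dual_lattice(L, 3)` and the single condensate mode `cond = e < 1e-12`, the sample mean of the norm density should be close to `critical_density(e, cond)`.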

Based on the above analogy, we say that the spherical model is supercritical if \(\rho >\rho _{\mathrm {c}}\) for a split which has sufficiently large relative energy gap and only a few condensate modes (the precise conditions are given in Theorem 1). The above formal discussion will then turn out to give the correct picture for fairly general energy functions \(\omega (k)\). In fact, the separation between the two sets of modes is so strong that even the fluctuations of condensate and of the normal fluid will become statistically independent. However, if the condensate is degenerate, the fluctuations of the condensate can be nontrivial.

In the main result we will compare the spherical model measure \(\mu _0\) to the probability measure \(\mu _1\) defined by

$$\begin{aligned}&\mu _1[\mathrm{d}\varPhi ] := \frac{1}{Z_1} \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_+[\varPhi ]} \nonumber \\&\quad \times \prod _{k\in \varLambda _0^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_0[\varPhi ]\left( 1-\frac{\rho _{\mathrm {c}}}{\varDelta }\right) } \prod _{k\in \varLambda ^*_+} \left( 1-\frac{E_0[\varPhi ]L^{-d}}{e_{k} \varDelta } \right) ^{-1} \delta (\rho _0[\varPhi ] - \varDelta ), \end{aligned}$$
(2.10)

where \(\varDelta :=\rho -\rho _{\mathrm {c}}>0\), \(Z_1\) is a constant normalizing the integral to one and, using \(e_k:=\omega (k)-\omega _0\), we define

$$\begin{aligned} \rho _0[\varPhi ]:= \frac{1}{V}\int _{\varLambda _0^*}\!\mathrm{d}k\, |\varPhi _k|^2,\quad E_+[\varPhi ] := \int _{\varLambda _+^*}\!\mathrm{d}k\, e_k |\varPhi _k|^2,\quad E_0[\varPhi ] := \int _{\varLambda _0^*}\!\mathrm{d}k\, e_k |\varPhi _k|^2. \end{aligned}$$
(2.11)

Clearly, \(\mu _1\) is a product of \(\mu _+\) and a measure for the condensate modes, and the total norm density is split between the normal fluid and condensate in the manner described above.

The structure of the condensate fluctuations under \(\mu _1\) may indeed be fairly complicated. However, there are certain situations where they can be replaced by simpler uniform distribution of the excess mass over the condensate modes, i.e., by using the measure

$$\begin{aligned}&\mu '_1[\mathrm{d}\varPhi ] := \frac{1}{Z'_1} \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_+[\varPhi ]} \prod _{k\in \varLambda _0^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \delta (\rho _0[\varPhi ] - \varDelta ) \end{aligned}$$
(2.12)

instead of \(\mu _1\) above. Some sufficient conditions for using the simpler measure are discussed later in Proposition 1. As we show there, using \(\mu '_1\) is allowed at least if a single mode condensate can be used, i.e., if \(V_0=1\). We call both \(\mu _1\) and \(\mu '_1\) factorized supercritical measures.

2.3 Main results

Our main result is to state conditions under which \(\mu _0\) and \(\mu _1\) are so close to each other that the expectations of all local observables will agree with each other, up to some error which is proportional to a negative power of L, hence vanishing when \(L\rightarrow \infty \). The precise conditions are contained in the following theorem, which implies a bound for the Wasserstein distance between \(\mu _0\) and \(\mu _1\). The proof of Theorem 1 is given in Sect. 5.

The Wasserstein distance estimate is sufficiently strong that local expectations of the original field, \(\phi = \tilde{\varPhi }\), generated by these two measures agree up to errors which vanish as \(L\rightarrow \infty \). Namely, if \(I\subset \varLambda \) is finite in the sense that \(|I|/V\ll 1\) and \(\phi ^I:=\prod _{x\in I}\phi _x\), then the bound given in Theorem 1 implies the existence of \(p'>0\) such that

$$\begin{aligned} \langle \phi ^I\rangle _{\mu _0} = \langle \phi ^I\rangle _{\mu _1} + O(L^{-p'}). \end{aligned}$$

The proof of this statement will rely on translation invariance of the random field generated by the measures \(\mu _0\) and \(\mu _1\) and it is given later as Theorem 2 in Sect. 3. Therefore, if a split with sufficiently large gap can be found, then the spherical model is well approximated by a critical Gaussian field and a few independent condensate Fourier modes, as determined by \(\mu _1\).

Theorem 1

Consider a fixed L and some given \(\omega (k)\in {\mathbb R}\), \(k\in \varLambda ^*\). Suppose \((\varLambda _0^*,\varLambda ^*_+)\) is a split of \(\varLambda ^*\) which is separated by the energy interval \([a_L,b_L]\), \(0\le a_L<b_L\), and has a relative energy gap \(\delta _L^{-1}\), as specified in Definition 1. We recall also the definitions of the total system size V, the number of the condensate modes \(V_0\), and the critical norm density \(\rho _{\mathrm {c}}(L)\) in (2.9).

Define the measure \(\mu _0\) by (2.7) and suppose that it is supercritical in the sense that \(\rho >\rho _{\mathrm {c}}\). Denote \(\varDelta :=\rho -\rho _{\mathrm {c}}(L)\), and assume that the gap and lattice size are large enough so that

$$\begin{aligned} \delta _L \le \frac{1}{2}, \qquad \varepsilon _L := \max \left( 2 \delta _L,\frac{1}{V^2\rho _{\mathrm {c}}^2 } \sum _{k\in \varLambda ^*_+} \frac{1}{e_k^2}\right) \le \frac{\varDelta ^2}{2^5 V_0^2\rho ^2}. \end{aligned}$$
(2.13)

Define the measure \(\mu _1\) by (2.10).

Then there exists a constant \(C_2>0\) such that the 2-Wasserstein distance between \(\mu _0\) and \(\mu _1\) satisfies

$$\begin{aligned} W_2(\mu _0,\mu _1) \le C_2 L^{\frac{d}{2}} \varepsilon _L^{\frac{1}{4}}. \end{aligned}$$
(2.14)

In particular, the inequality holds with the choice \(C_2= 2^4(\rho /\varDelta )^{V_0/2} \sqrt{ (\rho +\varDelta ) V_0}\).
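For a concrete split, the quantities entering Theorem 1 are elementary to evaluate. The following sketch computes \(\rho _{\mathrm {c}}\), \(\varDelta \), \(\delta _L\), \(\varepsilon _L\), checks the condition (2.13), and evaluates the right-hand side of (2.14) with the stated choice of \(C_2\). The function name and the returned dictionary keys are introduced here for illustration.

```python
import numpy as np

def theorem1_quantities(e, cond, rho):
    """Evaluate the parameters of Theorem 1 for energies e_k and a condensate mask."""
    e = np.asarray(e, dtype=float)
    V = e.size
    V0 = int(cond.sum())
    e_plus = e[~cond]
    rho_c = np.sum(1.0 / e_plus) / V                        # (2.9)
    Delta = rho - rho_c
    delta = e[cond].max() / e_plus.min()
    eps = max(2.0 * delta, np.sum(1.0 / e_plus ** 2) / (V ** 2 * rho_c ** 2))
    holds = bool(Delta > 0 and delta <= 0.5
                 and eps <= Delta ** 2 / (2 ** 5 * V0 ** 2 * rho ** 2))
    bound = None
    if Delta > 0:
        C2 = 2 ** 4 * (rho / Delta) ** (V0 / 2) * np.sqrt((rho + Delta) * V0)
        bound = C2 * np.sqrt(V) * eps ** 0.25               # L^{d/2} = sqrt(V)
    return {"rho_c": rho_c, "Delta": Delta, "delta": delta,
            "eps": eps, "condition_holds": holds, "W2_bound": bound}
```

Since the bound carries the factor \(L^{\frac{d}{2}}=\sqrt{V}\), it is informative only when \(\varepsilon _L\) decays sufficiently fast with the lattice size.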

As shown later in Lemma 1, for energies arising from many common continuum dispersion relations a sequence of splits can be found for which \(\varepsilon _L\rightarrow 0\) as \(L\rightarrow \infty \) while \(V_0\) and \(\rho _{\mathrm {c}}(L)\) remain bounded, implying \(C_2=O(1)\) if \(\rho > \sup _L \rho _{\mathrm {c}}(L)\). However, the speed of convergence of \(\varepsilon _L\) is usually not sufficient for the bound of the Wasserstein distance \(W_2(\mu _0,\mu _1)\) to go to zero, so we cannot state any convergence result in the above (unscaled) \(L^2\)-norm. Nevertheless, as we show in Sect. 3, for errors in local correlation functions the bound can be improved by a factor of \(L^{-\frac{d}{2}}\) which shows that these errors vanish in the limit of large lattices. The precise statement is given in Theorem 2, and as discussed in Sect. 3, the main simplification from the replacement of \(\mu _0\) by \(\mu _1\) is given by the vastly simpler fluctuation properties of the normal fluid under the measure \(\mu _1\).

There are a few special cases for which also the condensate fluctuations have simple structure, summarized in Proposition 1. In the statements below, we say for instance that “\(\varPhi =\varPhi ^+ + L^d \sqrt{\varDelta } X\) in distribution, where X is a random variable independent of \(\varPhi ^+\) and uniformly distributed on the unit sphere \(S^{2 V_0-1}\)”. There it is implicitly assumed that the first term refers to normal fluid components and the second to the condensate components using the standard isomorphism between \({\mathbb C}^{\varLambda ^*_0}\) and \({\mathbb R}^{2 V_0}\): for \(k\in \varLambda ^*_+\), we then have \(\varPhi _k=\varPhi ^+_k\), and for \(k\in \varLambda ^*_0\), we have \(\varPhi _k=L^d \sqrt{\varDelta } (X_{2 p(k) -1} + \mathrm{i}X_{2 p(k)})\) where \(p:\varLambda ^*_0\rightarrow \{1,2,\ldots ,V_0\}\) is any bijection, i.e., some enumeration of \(\varLambda ^*_0\). (Since the uniform measure on the unit sphere \(S^{d-1}\) is invariant under permutation of the d coordinate labels, the distribution does not depend on the choice of the enumeration p.)

Proposition 1

Suppose that all the assumptions and definitions in Theorem 1 hold, in particular, we recall Definition 1. Let \(\varPhi ^+\) denote the Gaussian lattice field distributed according to the measure \(\mu _+\) defined in (2.8).

  1.

    If \(V_0=1\), then \(\varPhi =\varPhi ^+ + L^d\sqrt{\varDelta } \mathrm{e}^{\mathrm{i}\theta }\) in distribution, where \(\theta \) is a random variable independent of \(\varPhi ^+\) and uniformly distributed on the interval \([0,2\pi ]\).

  2.

    If \(\omega (k)\) is a constant for \(k\in \varLambda _0^*\), then in distribution \(\varPhi =\varPhi ^+ + L^d \sqrt{\varDelta } X\), where X is a random variable independent of \(\varPhi ^+\) and uniformly distributed on the unit sphere \(S^{2 V_0-1}\).

  3.

    If there is a non-negative \(\tilde{\varepsilon }\le 1\) such that \(e_k\le \frac{1}{2 \rho }L^{-d}\tilde{\varepsilon }\) for \(k\in \varLambda _0^*\), then

    $$\begin{aligned} W_2(\mu _0,\mu '_1) \le L^{\frac{d}{2}} 2^4 \sqrt{(\rho + \varDelta )V_0} \left( (\rho /\varDelta )^{V_0/2} \varepsilon _L^{\frac{1}{4}}+ \tilde{\varepsilon }^{\frac{1}{2}}\right) \end{aligned}$$
    (2.15)

    for the measure \(\mu '_1\) defined in (2.12). Under the measure \(\mu '_1\) we have \(\varPhi =\varPhi ^+ + L^d\sqrt{\varDelta } X\) in distribution, where X is a random variable independent of \(\varPhi ^+\) and uniformly distributed on the unit sphere \(S^{2 V_0-1}\).

Proof

The assumptions in the first two items imply that \(E_0[\varPhi ]=0\) (note that by definition of the split, we necessarily have \(\omega (k)=\omega _0\) for some, and hence for all, \(k\in \varLambda ^*_0\)). Thus the weight related to \(k\in \varLambda _0^*\) is equal to one. Since \(\rho _0[\varPhi ] = V^{-2} |\varPhi ^0|^2\), where \(|\varPhi ^0|\) denotes the Euclidean norm in \({\mathbb C}^{V_0}\cong {\mathbb R}^{2 V_0}\), the random variable \(X:= (L^{-d}\varDelta ^{-\frac{1}{2}}\mathrm{Re\,}\varPhi _k^0,L^{-d}\varDelta ^{-\frac{1}{2}}\mathrm{Im\,}\varPhi _k^0)_{k\in \varLambda ^*_0}\) is uniformly distributed on \(S^{2 V_0-1}\): for any continuous bounded function \(f:{\mathbb R}^{2 V_0}\rightarrow {\mathbb C}\) we have in spherical coordinates

$$\begin{aligned}&\int \prod _{k\in \varLambda _0^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \delta (\rho _0[\varPhi ] - \varDelta ) f(\varPhi ) = \int _{{\mathbb R}^{2 V_0}}\! \mathrm{d}^{2 V_0\!}X\, \delta (\varDelta (|X|^2-1)) f(V \sqrt{\varDelta } X)\\&\quad = \frac{1}{\varDelta } \int _{S^{2 V_0-1}}\!\mathrm{d}\varOmega \int _0^\infty \! \mathrm{d}r\, r^{2 V_0-1} \delta (r^2-1) f(V \sqrt{\varDelta } r\varOmega )\\&\quad = \frac{1}{2\varDelta } \int _{S^{2 V_0-1}}\!\mathrm{d}\varOmega \int _0^\infty \! \mathrm{d}s\, s^{V_0-1} \delta (s-1) f(V \sqrt{\varDelta } \sqrt{s}\varOmega )\\&\quad = \frac{1}{2\varDelta } \int _{S^{2 V_0-1}}\!\mathrm{d}\varOmega \,f(V \sqrt{\varDelta }\varOmega ) \end{aligned}$$

and the normalization condition fixes the overall constant correctly.

If \(V_0=1\), X is uniformly distributed on the unit circle and thus equals \(\mathrm{e}^{\mathrm{i}\theta }\) in distribution. The proof of the last item uses techniques from the proof of the main Theorem, and it can be found at the end of Sect. 5. \(\quad \square \)
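The representation of the condensate in Proposition 1 is easy to simulate. The sketch below is only an illustration under our own assumptions: the function names, the enumeration of the condensate modes, and the sample values are hypothetical, and the uniform point on \(S^{2 V_0-1}\) is generated by the standard device of normalizing a vector of independent standard Gaussian variables.

```python
import math
import random

def sample_condensate(v0, L, d, Delta, rng):
    """Condensate components Phi_k = L^d sqrt(Delta) (X_{2p-1} + i X_{2p})."""
    # A normalized standard Gaussian vector is uniform on the unit sphere S^{2 V_0 - 1}.
    g = [rng.gauss(0.0, 1.0) for _ in range(2 * v0)]
    norm = math.sqrt(sum(t * t for t in g))
    x = [t / norm for t in g]
    scale = L**d * math.sqrt(Delta)
    # Consecutive pairs of coordinates form the real and imaginary parts.
    return [scale * complex(x[2 * p], x[2 * p + 1]) for p in range(v0)]

def condensate_density(phi0, L, d):
    """rho_0[Phi] = V^{-2} |Phi^0|^2 with V = L^d."""
    V = L**d
    return sum(abs(z) ** 2 for z in phi0) / V**2

rng = random.Random(0)  # fixed seed, for reproducibility only
phi0 = sample_condensate(v0=3, L=4, d=3, Delta=0.5, rng=rng)
rho0 = condensate_density(phi0, L=4, d=3)
```

By construction, every sample has condensate density \(\rho _0[\varPhi ]=\varDelta \) up to rounding, which is the constraint imposed by the \(\delta \)-function in the computation above.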

To study infinite volume limits, we assume that the weights \(\omega (k)\) are given by an L-independent dispersion relation, satisfying the following conditions.

Assumption 1

Suppose \(d\ge 3\) and consider a function \(\omega :{\mathbb T}^d\rightarrow {\mathbb R}\) which is \(C^2\) and has only finitely many non-degenerate minima. More precisely, we assume that both of the following statements hold:

  1.

    The periodic extension of \(\omega \) into a function \({\mathbb R}^d\rightarrow {\mathbb R}\) is twice continuously differentiable.

  2.

    By the first assumption and compactness of \({\mathbb T}^d\), \(\omega \) attains a minimum value \(\omega _{\text {min}}\in {\mathbb R}\). We assume that the collection of all global minima, \(T_0:=\{k\in {\mathbb T}^d \,|\, \omega (k)=\omega _{\text {min}}\}\), is finite and that the Hessian matrix \(D^2 \omega (k_0)\) is invertible for all \(k_0\in T_0\).

Note that these assumptions are invariant if \(\omega \) is multiplied by any positive constant, and thus they remain invariant under changes of the implicit inverse temperature factor \(\beta \).

It turns out that in the presence of a condensate, the distribution around the degrees of freedom with minimum energy may vary with the lattice size L without converging towards any limiting behaviour as \(L\rightarrow \infty \). For example, in Sect. 4.3 we present an example with a different number of condensate modes for odd and even L. We also illustrate via explicit examples in Sect. 4 why the split can have a nontrivial dependence on the lattice size L.

The following Lemma shows that for dispersion relations satisfying Assumption 1 a split with the desired properties can be found.

Lemma 1

Suppose that \(d\ge 3\) and \(\omega \) satisfies Assumption 1. For each L, define \(\omega _0\) and \(e_k\), \(k\in \varLambda ^*\), as in Definition 1. Choose \(\kappa \) such that \(0<\kappa <\frac{d}{2}\), if \(d\ge 4\), and \(0<\kappa < 1\), if \(d=3\). Then there are constants \(L_0,M_0\in {\mathbb N}_+\) and \(c_0,c_2>0\), depending only on d, the function \(\omega \), and the choice of \(\kappa \), such that for all \(L\ge L_0\) we can find a split \((\varLambda _0^*,\varLambda ^*_+)\) of \(\varLambda ^*\) with the following properties:

  1.

    \(M_0\) can be chosen independently of \(\kappa \), \(|\varLambda _0^*|\le M_0\), and for every \(k\in \varLambda _0^*\),

    $$\begin{aligned} 0\le \omega (k)-\omega _{\text {min}}< c_0 L^{-2}. \end{aligned}$$
    (2.16)
  2.

    The split is separated by an energy interval \([a_L,b_L]\) and has a relative energy gap \(\delta _L^{-1}\), where \(b_L\ge \frac{1}{2}c_0 L^{-d+\kappa }\) and

    $$\begin{aligned} \delta _L\le L^{-\frac{d-2-\kappa }{M_0}}\le 1. \end{aligned}$$
    (2.17)
  3.

    We have

    $$\begin{aligned} \frac{1}{V^2} \sum _{k\in \varLambda ^*_+} \frac{1}{e_k^2} \le c_2 L^{-2\kappa }, \end{aligned}$$
    (2.18)

    the following positive integral is finite,

    $$\begin{aligned} \rho _\infty := \int _{{\mathbb T}^d}\!\mathrm{d}k\, \frac{1}{\omega (k)-\omega _{\text {min}}} < \infty , \end{aligned}$$
    (2.19)

    and, as \(L\rightarrow \infty \),

    $$\begin{aligned} \rho _{\mathrm {c}}(L) = \rho _\infty + O(L^{-\min (\kappa ,2)}). \end{aligned}$$
    (2.20)

In particular, \(\max _{k\in \varLambda ^*_0} \omega (k)\rightarrow \omega _{\text {min}}\), \(\rho _{\mathrm {c}}(L)\rightarrow \rho _\infty \), and \(\delta _L \rightarrow 0\), as \(L\rightarrow \infty \).

The proof of the Lemma is postponed to Sect. 6, and it contains ways to construct some constants for which the Lemma holds. However, these constructions are not always optimal since they need to take into account extreme cases such as very anisotropic dispersion relations. Hence, if optimal decay estimates are desired, it is better to optimise the values case by case instead of using, e.g., the worst case estimate in (6.7) for \(M_0\).

As a straightforward application, we obtain the following consequences for systems where the infinite lattice dispersion relation is kept fixed and L is taken large.

Corollary 1

Suppose that \(d\ge 3\) and \(\omega \) satisfies Assumption 1, and take some cutoff parameters for the minimum distance from criticality, \(\varDelta _0>0\), and for a maximal density, \(\bar{\rho }> \rho _\infty + \varDelta _0\), where \(\rho _\infty \) is defined by (2.19).

Then there are \(L'\), \(M_0\), and \(C'>0\) such that for any \(L\ge L'\) we can find a split \((\varLambda _0^*,\varLambda ^*_+)\) of \(\varLambda ^*\) satisfying all properties stated in Lemma 1 and for which the Wasserstein distance between the measures \(\mu _0\) and \(\mu _1\) defined in Theorem 1 satisfies

$$\begin{aligned} W_2(\mu _0,\mu _1) \le C' L^{\frac{d}{2}-\frac{d/2-1}{2 M_0+1}}, \end{aligned}$$
(2.21)

for all densities \(\rho \) on the interval

$$\begin{aligned} \sup _{L\ge L'} \rho _{\mathrm {c}}(L) + \varDelta _0\le \rho \le \bar{\rho }. \end{aligned}$$
(2.22)

Proof

Since the assumptions of Lemma 1 are satisfied, \(\rho _{\mathrm {c}}(L)\rightarrow \rho _\infty \), as \(L\rightarrow \infty \), and thus there is \(L_0'\) such that \(\sup _{L\ge L'_0} \rho _{\mathrm {c}}(L) + \varDelta _0<\bar{\rho }\). Therefore, if \(L'\ge L_0'\), there are densities \(\rho \) for which (2.22) holds.

In addition, we can conclude from the Lemma that there is \(M_0\ge 1\) such that for any appropriately chosen \(\kappa \), the split \((\varLambda _0^*,\varLambda ^*_+)\) of \(\varLambda ^*\) obtained from the Lemma satisfies \(\delta _L\le L^{-\frac{d-2-\kappa }{M_0}}\) and \(\varepsilon _L=O(\delta _L+L^{-2\kappa })\). Thus both go to zero as \(L\rightarrow \infty \). Now if \(L'\ge \max (L_0, L_0')\), \(L\ge L'\), and \(\rho \) satisfies (2.22), we have \(\varDelta _0\le \varDelta \le \bar{\rho }\) and \(\frac{\varDelta ^2}{2^5 V_0^2\rho ^2}\ge \frac{\varDelta _0^2}{2^5 M_0^2\bar{\rho }^2}>0\), uniformly in L. Therefore, we may find \(L'\ge \max (L_0, L_0')\) such that both inequalities in (2.13) hold for all \(L\ge L'\) and all \(\rho \) satisfying (2.22).

Thus we may use the conclusions of the main Theorem for these values of parameters, and the constant \(C'=C_2\) may be adjusted to work for all allowed values of \(\kappa \), L, and \(\rho \). Since also \(M_0\) is independent of \(\kappa \), we can maximize the decay of \(\varepsilon _L\) by setting \(\kappa =\frac{d-2}{2 M_0+1}<\frac{d}{2}\) which satisfies \(\kappa <1\) for \(d=3\). This results in the bound stated in the Corollary. \(\quad \square \)
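The choice of \(\kappa \) in the proof balances the two decay rates entering \(\varepsilon _L\), namely \(\delta _L\le L^{-\frac{d-2-\kappa }{M_0}}\) from (2.17) and the bound \(c_2 L^{-2\kappa }\) from (2.18). The following short sketch checks this arithmetic exactly with rational numbers. The function names and the sample values of d and \(M_0\) are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def optimal_kappa(d, m0):
    """kappa = (d - 2) / (2 M_0 + 1), the value used in the proof of Corollary 1."""
    return Fraction(d - 2, 2 * m0 + 1)

def decay_exponents(d, m0, kappa):
    """Exponents (a, b) with delta_L <= L^{-a} by (2.17) and (2.18) <= c_2 L^{-b}."""
    return Fraction(d - 2, m0) - kappa / m0, 2 * kappa

d, m0 = 3, 2  # hypothetical sample values
kappa = optimal_kappa(d, m0)
a, b = decay_exponents(d, m0, kappa)
# eps_L = O(L^{-min(a, b)}), so eps_L^{1/4} = O(L^{-p'}) with:
p_prime = min(a, b) / 4
```

For this choice the two exponents coincide, and `p_prime` agrees with the exponent \(\frac{d/2-1}{2 M_0+1}\) appearing in (2.21).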

3 Local Correlation Estimates from Wasserstein Bounds

In the main result, a bound is derived for the Wasserstein distance between two measures \(\mu _0\) and \(\mu _1\) which are both gauge invariant in the following sense: the random fields \((\varPhi _k)_{k\in \varLambda ^*}\) and \((\mathrm{e}^{\mathrm{i}\varphi _k} \varPhi _k)_{k\in \varLambda ^*}\) have the same distribution for any choice of the constant phase shifts \(\varphi _k\in {\mathbb R}\), \(k\in \varLambda ^*\). This is a consequence of the geometric identification between \({\mathbb C}\) and \({\mathbb R}^2\) which implies that a multiplication \(\varPhi _k \rightarrow \mathrm{e}^{\mathrm{i}\varphi _k}\varPhi _k\) corresponds to a rotation by an angle \(\varphi _k\) and thus it leaves the Lebesgue measure \(\mathrm{d}(\mathrm{Re\,}\varPhi _k)\, \mathrm{d}(\mathrm{Im\,}\varPhi _k)\) invariant. The weight functions only depend on \(|\varPhi _k|^2\) and thus also they are left invariant.

However, in applications, one is usually mainly interested in the corresponding fields \(\phi _x\), \(x\in \varLambda _L\), obtained by inverse Fourier transform from \(\varPhi _k\): we consider the collection of

$$\begin{aligned} \phi _x = \int _{\varLambda ^*}\!\mathrm{d}k\, \varPhi _k \mathrm{e}^{\mathrm{i}2\pi k \cdot x}, \end{aligned}$$
(3.1)

for \(x\in \varLambda _L\). The above gauge invariance of the Fourier components is reflected in translation invariance of the field \(\phi _x\). Namely, for any \(y\in \varLambda _L\), we have

$$\begin{aligned} \phi _{x+y} = \int _{\varLambda ^*}\!\mathrm{d}k\, \mathrm{e}^{\mathrm{i}2\pi k \cdot x} \mathrm{e}^{\mathrm{i}2\pi k \cdot y} \varPhi _k, \quad x\in \varLambda _L, \end{aligned}$$

and thus the field \((\phi _{x+y})_{x\in \varLambda }\) has the same distribution as the field \((\phi _{x})_{x\in \varLambda }\).
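The mechanism can be checked numerically in a small example. The sketch below assumes \(d=1\) and labels both the sites and the dual lattice points by \(0,1,\ldots ,L-1\), which is a convenience of ours and not the centered convention of the text; the function name and sample values are likewise hypothetical. It verifies that multiplying each Fourier component by the phase \(\mathrm{e}^{\mathrm{i}2\pi k \cdot y}\) translates the field by y.

```python
import cmath
import random

def inverse_transform(Phi, L):
    """phi_x = (1/L) sum_k Phi_k exp(i 2 pi k x) with k = n/L, in d = 1."""
    return [
        sum(Phi[n] * cmath.exp(2j * cmath.pi * n * x / L) for n in range(L)) / L
        for x in range(L)
    ]

L, y = 8, 3  # hypothetical sample values
rng = random.Random(1)
Phi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(L)]
phi = inverse_transform(Phi, L)

# Phase shifts varphi_k = 2 pi k y form a particular gauge transformation.
Phi_shifted = [Phi[n] * cmath.exp(2j * cmath.pi * n * y / L) for n in range(L)]
phi_shifted = inverse_transform(Phi_shifted, L)

# The transformed field equals the translated field (with periodic boundary conditions).
max_err = max(abs(phi_shifted[x] - phi[(x + y) % L]) for x in range(L))
```

Since a gauge invariant measure does not change under such phase shifts, the translated field has the same distribution as the original one.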

This translation invariance is sufficient to lift the earlier usually divergent Wasserstein bounds to vanishing error estimates for moments of the field \(\phi _x\). To see this, consider a sequence I of length \(n\ge 1\) of pairs \((x_i,\tau _i)_{i=1}^n\), where \(x_i\in \varLambda _L\) and \(\tau _i\in \{-1,1\}\). We use the index \(\tau \) to determine complex conjugation: we set \(\phi _{x,1}=\phi _x\) and \(\phi _{x,-1}=\phi _x^*\), and use the shorthand notation \({\phi }^I := \prod _{\alpha \in I} {\phi }_\alpha := \prod _{i=1}^n {\phi }_{x_i,\tau _i}\) for the monomial corresponding to the above sequence I. Thanks to translation invariance, the error bound for the expectation of such local observables improves on the Wasserstein distance by a factor \(L^{-\frac{d}{2}}\), as stated in the following Lemma.

Lemma 2

Suppose \(\mu \) and \(\mu '\) are gauge invariant measures for the field of Fourier components \(\varPhi (k)\), \(k\in \varLambda ^*\). Given \(x\in \varLambda \), define \(A_1(x):=1\) and for \(n>1\) set

$$\begin{aligned} A_n(x) := \max \left( \langle |\phi _{x}|^{2(n-1)}\rangle ^{(2(n-1))^{-1}}_{\mu },\langle |\phi _{x}|^{2(n-1)}\rangle ^{(2(n-1))^{-1}}_{\mu '}\right) . \end{aligned}$$
(3.2)

Consider the random field \(\phi = \tilde{\varPhi }\) and suppose \(n\ge 1\) is such that \(A_n(x)<\infty \) for some \(x\in \varLambda \).

Then \(A_n(x)\) does not depend on the choice of x and for any sequence I of length n as above, we have an estimate

$$\begin{aligned} \left| \langle \phi ^I\rangle _{\mu } - \langle \phi ^I\rangle _{\mu '} \right| \le A_n^{n-1} n W_2(\mu ,\mu ')L^{-d/2}. \end{aligned}$$
(3.3)

Proof

Under either of the measures \(\mu \) and \(\mu '\) the field \(\phi _x\) is translation invariant, \(\langle \phi ^I\rangle =\langle \phi ^{I+y}\rangle \) for any \(y\in \varLambda _L\), where \(I+y:=((x_i+y,\tau _i))_{i=1}^n\). Therefore, for any coupling \(\gamma \) between \(\mu \) and \(\mu '\) the difference of their moments satisfies

$$\begin{aligned} X&:= \langle \phi ^I\rangle _{\mu } - \langle \phi ^I\rangle _{\mu '} = \frac{1}{V} \sum _{y\in \varLambda _L}\langle \phi ^{I+y}\rangle _{\mu } -\frac{1}{V} \sum _{y\in \varLambda _L} \langle \phi ^{I+y}\rangle _{\mu '} \nonumber \\&= \frac{1}{V} \sum _{y\in \varLambda _L} \langle \phi ^{I+y} - (\phi ')^{I+y}\rangle _{\gamma }. \end{aligned}$$
(3.4)

In particular, if \(n=1\), by using the Cauchy–Schwarz inequality we obtain

$$\begin{aligned} |X|&\le \frac{1}{V} \sum _{y\in \varLambda } \left\langle |\phi _{x_1+y}-\phi '_{x_1+y}|^2\right\rangle _{\gamma }^{\frac{1}{2}} \le \frac{1}{V} \sqrt{V} \langle \Vert \phi -\phi '\Vert _2^2\rangle _\gamma ^{\frac{1}{2}}\\&= L^{-\frac{d}{2}} A_n^{n-1} n \langle \Vert \varPhi -\varPhi '\Vert ^2\rangle _\gamma ^{\frac{1}{2}}. \end{aligned}$$

Since the left hand side does not depend on the coupling \(\gamma \), taking an infimum yields the bound in (3.3); cf. the definition of the Wasserstein distance in Appendix B.

Consider then the case \(n>1\). The difference of products in (3.4) can be “telescoped” as follows

$$\begin{aligned} \prod _{i=1}^n \phi _i = \prod _{i=1}^n \phi _i' +\sum _{i=1}^n (\phi _i-\phi _i') \prod _{j=1}^{i-1} \phi _j \prod _{j=i+1}^{n} \phi '_j, \end{aligned}$$

yielding an estimate

$$\begin{aligned} \left| \phi ^{I+y} - (\phi ')^{I+y} \right| \le \sum _{i=1}^n |\phi _{x_i+y}-\phi '_{x_i+y}| \prod _{j=1}^{i-1} |\phi _{x_j+y}| \prod _{j=i+1}^{n} |\phi '_{x_j+y}|. \end{aligned}$$

Note that the absolute values on the right hand side cancel the effect of any possible complex conjugations on the left hand side. Taking an expectation over \(\gamma \) and then using the Cauchy–Schwarz inequality and the natural order in I to simplify the notations, we obtain

$$\begin{aligned}&\left\langle \left| \phi ^{I+y} - (\phi ')^{I+y} \right| \right\rangle _{\gamma } \le \sum _{x\in I} \left\langle |\phi _{x+y}-\phi '_{x+y}| \prod _{x'<x} |\phi _{x'+y}| \prod _{x'>x} |\phi '_{x'+y}| \right\rangle _{\gamma }\\&\quad \le \sum _{x\in I} \left\langle |\phi _{x+y}-\phi '_{x+y}|^2\right\rangle _{\gamma }^{\frac{1}{2}} \left\langle \prod _{x'<x} |\phi _{x'+y}|^2 \prod _{x'>x} |\phi '_{x'+y}|^2 \right\rangle _{\gamma }^{\frac{1}{2}}\\&\quad \le \sum _{x\in I} \left\langle |\phi _{x+y}-\phi '_{x+y}|^2\right\rangle _{\gamma }^{\frac{1}{2}} \prod _{x'<x} \left\langle |\phi _{x'+y}|^{q'}\right\rangle _{\gamma }^{\frac{1}{q'}} \prod _{x'>x} \left\langle |\phi '_{x'+y}|^{q'}\right\rangle _{\gamma }^{\frac{1}{q'}}, \end{aligned}$$

where in the last step we have used the generalized Hölder’s inequality with exponent \(q'=2 (n-1)\) for which indeed \(\sum _{x'\in I; x'\ne x} \frac{1}{q'}= \frac{1}{2}\) for all \(x\in I\).

We may now conclude that the error X is bounded by

$$\begin{aligned} |X|\le \frac{1}{V} \sum _{y\in \varLambda }\sum _{x\in I} \left\langle |\phi _{x+y}-\phi '_{x+y}|^2\right\rangle _{\gamma }^{\frac{1}{2}} \prod _{x'<x} \left\langle |\phi _{x'+y}|^{q'}\right\rangle _{\gamma }^{\frac{1}{q'}} \prod _{x'>x} \left\langle |\phi '_{x'+y}|^{q'}\right\rangle _{\gamma }^{\frac{1}{q'}}. \end{aligned}$$

Here, only the first factor depends on \(\gamma \), since all the other factors may be computed using the fixed marginal measures \(\mu \) and \(\mu '\). Using the translation invariance of the marginal measures we obtain

$$\begin{aligned} |X|\le \frac{1}{V}\sum _{x\in I} \prod _{x'<x} \left\langle |\phi _{x'}|^{q'}\right\rangle _{\mu }^{\frac{1}{q'}} \prod _{x'>x} \left\langle |\phi _{x'}|^{q'}\right\rangle _{\mu '}^{\frac{1}{q'}} \sum _{y\in \varLambda } \left\langle |\phi _{x+y}-\phi '_{x+y}|^2\right\rangle _{\gamma }^{\frac{1}{2}}. \end{aligned}$$

We next use the assumption that \(A_n<\infty \) for the moments given in (3.2). By translation invariance, \(A_n\) is independent of the choice of \(x\in \varLambda \), and thus by applying the Schwarz inequality to the sum over y, we obtain

$$\begin{aligned}&|X|\le \frac{1}{V}A_n^{n-1}\sum _{x\in I} \sqrt{V} \left[ \sum _{y\in \varLambda } \langle |\phi _{x+y}-\phi '_{x+y}|^2\rangle _\gamma \right] ^{\frac{1}{2}}\\&\quad = \frac{1}{\sqrt{V}}A_n^{n-1} n \langle \Vert \phi -\phi '\Vert _2^2\rangle _\gamma ^{\frac{1}{2}} = L^{-\frac{d}{2}} A_n^{n-1} n \langle \Vert \varPhi -\varPhi '\Vert ^2\rangle _\gamma ^{\frac{1}{2}}. \end{aligned}$$

Since the left hand side does not depend on the coupling \(\gamma \), taking an infimum yields the bound in (3.3), as before. This concludes the proof of the Lemma. \(\quad \square \)
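The telescoping identity used in the proof is purely algebraic and can be verified directly. The following sketch does so for random complex numbers; the helper names and the sample length are hypothetical choices made for illustration.

```python
import random

def product(values):
    """Product of a sequence of numbers (1 for the empty sequence)."""
    result = 1
    for v in values:
        result *= v
    return result

def telescoped_difference(phi, phi_prime):
    """sum_i (phi_i - phi'_i) prod_{j<i} phi_j prod_{j>i} phi'_j."""
    n = len(phi)
    return sum(
        (phi[i] - phi_prime[i]) * product(phi[:i]) * product(phi_prime[i + 1:])
        for i in range(n)
    )

rng = random.Random(2)
n = 5  # hypothetical sample length
phi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
phi_prime = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

lhs = product(phi) - product(phi_prime)
rhs = telescoped_difference(phi, phi_prime)
```

The two quantities `lhs` and `rhs` agree up to rounding errors. Each of the n terms in the sum contains exactly one difference factor, which is the origin of the factor n in (3.3).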

In the bound (3.3) a factor \(L^{\frac{d}{2}}\) gets cancelled from the Wasserstein distance. Combined with the earlier results, the bound thus goes to zero if n is not allowed to increase when taking \(L\rightarrow \infty \), as long as the constants \(A_n\) remain bounded in the limit. As proven in Lemma 3 at the end of the section, this holds for the measures considered here. Hence, we may conclude that (3.3) combined with the Wasserstein estimates stated in the main results in Sect. 2 implies that

$$\begin{aligned} \langle \phi ^I\rangle _{\mu } = \langle \phi ^I\rangle _{\mu '} + O(L^{-p'}), \end{aligned}$$

as \(L\rightarrow \infty \) if \(\mu \) is a supercritical spherical model measure and \(\mu '\) is a compatible factorized supercritical measure. Summarizing all assumptions in one place, we obtain the following result as an immediate corollary of Corollary 1, Lemma 2, and Lemma 3.

Theorem 2

Suppose that \(d\ge 3\) and \(\omega \) satisfies Assumption 1, and consider any supercritical \(\rho \) as in Corollary 1. Fix a maximum order \(n\ge 1\) of the local moment. Then there are \(C',p',L'>0\) for which the following holds: if \(L\ge L'\), we may find a split \((\varLambda _0^*,\varLambda ^*_+)\) of \(\varLambda ^*\) and define the corresponding factorized supercritical measure \(\mu _1\) by (2.10) so that

$$\begin{aligned} \left| \langle \phi ^I\rangle _{\mu _0} - \langle \phi ^I\rangle _{\mu _1} \right| \le C' L^{-p'}, \end{aligned}$$
(3.5)

for any sequence I from \(\varLambda _L\times \{\pm 1\}\) of length at most n.

Using the constants occurring in Corollary 1, we may use \(p'=\frac{d/2-1}{2 M_0+1}\) in (3.5). However, as discussed before the Corollary, this value might not always be optimal, i.e., the result could hold also for larger values of \(p'\).

For applications of the approximation result, perhaps the most important consequence is the simplification of the structure of fluctuations. Namely, apart from the few condensate degrees of freedom, the field becomes Gaussian and translation invariant. In fact, as we will show next, its infinite volume statistics are given by the critical lattice field \(\psi _x\), \(x\in {\mathbb Z}^d\), which is the Gaussian field with zero mean and a covariance determined by \({\mathbb E}[\psi _x \psi _y] = 0\) and

$$\begin{aligned} {\mathbb E}[\psi _x \psi _y^*] = \int _{{\mathbb T}^d} \!\mathrm{d}k\, \frac{1}{\omega (k)-\omega _{\text {min}}} \mathrm{e}^{\mathrm{i}2\pi k\cdot (x-y)}, \end{aligned}$$
(3.6)

for all \(x,y\in {\mathbb Z}^d\).

More precisely, for all of the factorized supercritical measures in Sect. 2.2, the field \(\phi _x\) can be written as a sum of two independent random fields of which the normal fluid component \(\phi ^+\) is defined by \(\phi ^+_x = \int _{\varLambda ^*_+}\!\mathrm{d}k\, \varPhi ^+_k \mathrm{e}^{\mathrm{i}2\pi k \cdot x}\) where \(\varPhi ^+\) is distributed according to the measure \(\mu _+\) in (2.8). Therefore, for any compactly supported test function \(J:{\mathbb Z}^d\rightarrow {\mathbb C}\), we can define the random variable

$$\begin{aligned} \langle J,\phi ^+ \rangle := \sum _{x\in {\mathbb Z}^d} J(x)^* \phi ^+_x, \end{aligned}$$

as soon as L is large enough so that \(\varLambda _L\) contains the support of J. Then \( \langle J,\phi ^+ \rangle \) has mean zero and a variance for which \(\langle \langle J,\phi ^+ \rangle ^2\rangle =0\) and

$$\begin{aligned} \langle | \langle J,\phi ^+ \rangle |^2\rangle = \int _{\varLambda ^*_+}\!\mathrm{d}k'\,\int _{\varLambda ^*_+}\!\mathrm{d}k\, {\mathbb E}_{\mu _+}\!\! \left[ \varPhi _{k'}^* \varPhi _k\right] \widehat{J}(k') \widehat{J}(k)^* = \int _{\varLambda ^*_+}\!\mathrm{d}k\, \frac{1}{e_k}\left| \widehat{J}(k)\right| ^2, \end{aligned}$$

where

$$\begin{aligned} \widehat{J}(k) := \sum _{x\in {\mathbb Z}^d} \mathrm{e}^{-\mathrm{i}2\pi k\cdot x} J(x). \end{aligned}$$

The function \(\widehat{J}:{\mathbb T}^d\rightarrow {\mathbb C}\) is continuous, hence also bounded. We assume that the split \((\varLambda _0^*,\varLambda ^*_+)\) for all L has the properties listed in Lemma 1. Then it is possible to partition \({\mathbb T}^d\) into boxes of side length \(\frac{1}{L}\) so that \(\frac{1}{e_k}\) is bounded in the corresponding box by a constant times \(\frac{1}{\omega -\omega _{\text {min}}}\), apart possibly from a finite number of boxes. Due to the lower bound for \(e_k\) valid for all \(k\in \varLambda ^*_+\), we may ignore the exceptional boxes, and for the remaining ones use the dominated convergence theorem to conclude that for any fixed J

$$\begin{aligned} \lim _{L\rightarrow \infty } \langle | \langle J,\phi ^+ \rangle |^2\rangle = \int _{{\mathbb T}^d} \!\mathrm{d}k\, \frac{1}{\omega (k)-\omega _{\text {min}}} \left| \widehat{J}(k)\right| ^2. \end{aligned}$$

Details of this construction, as well as explicit estimates in L for the size of the error, can be found in the proof of (2.20) given at the end of Sect. 6.

Then an application of the polarization identity proves that for any two test functions \(J_1\) and \(J_2\) with a compact support we have

$$\begin{aligned} \lim _{L\rightarrow \infty } \langle \langle J_1,\phi ^+ \rangle ^* \langle J_2,\phi ^+ \rangle \rangle = \int _{{\mathbb T}^d} \!\mathrm{d}k\, \frac{1}{\omega (k)-\omega _{\text {min}}} \widehat{J}_1(k) \widehat{J}_2(k)^*. \end{aligned}$$

Restricted to single site test functions, we may thus conclude that (3.6) is indeed the limit of any pointwise covariances. Since both the finite volume and the limit field are Gaussian, these results also immediately imply the convergence of all finite moments.

We conclude the section by showing that both the original and factorized fields have uniformly bounded moments.

Lemma 3

Suppose that \(d\ge 3\) and \(\omega \) satisfies Assumption 1. Consider some supercritical \(\rho \), some \(L\ge L_0\) and any split \((\varLambda _0^*,\varLambda ^*_+)\) of \(\varLambda ^*\) satisfying all properties stated in Lemma 1. Let \(\mu \) be either \(\mu _0\) or one of the measures \(\mu _1\) or \(\mu _1'\) defined for this split in Theorem 1 and Proposition 1.

Then to each \(m\ge 0\) there is an L-independent constant \(c_m\) such that

$$\begin{aligned} \langle |\phi _{x}|^{2 m}\rangle _{\mu } \le c_m, \end{aligned}$$

for the random variable \(\phi _x\) defined by (3.1) for any \(x\in \varLambda _L\).

Proof

If \(m=0\), defining \(c_0=1\) obviously suffices since \(\mu \) is a probability measure. Assume thus \(m>0\).

Split \(\phi _x\) into a condensate and normal fluid component as follows

$$\begin{aligned} \phi ^0_x := \int _{\varLambda ^*_0}\!\mathrm{d}k\, \varPhi _k \mathrm{e}^{\mathrm{i}2\pi k \cdot x} \quad \text {and}\quad \phi ^+_x := \int _{\varLambda ^*_+}\!\mathrm{d}k\, \varPhi _k \mathrm{e}^{\mathrm{i}2\pi k \cdot x}. \end{aligned}$$

Then \(\phi _x = \phi ^0_x+ \phi ^+_x\), and the condensate component may be bounded by using the upper bound \(M_0\) from Lemma 1,

$$\begin{aligned} |\phi ^0_x |\le \int _{\varLambda ^*_0}\!\mathrm{d}k\, |\varPhi _k| \le \sqrt{V_0/V} \Vert \varPhi ^0\Vert \le \sqrt{M_0 \rho _0[\varPhi ]}. \end{aligned}$$

Under the measure \(\mu _0\), \(\rho _0[\varPhi ]\le \rho \) almost surely, and under either of the measures \(\mu _1\) or \(\mu _1'\) we have \(\rho _0[\varPhi ]=\varDelta \le \rho \) almost surely. Therefore, in all of the three cases the condensate field is almost surely uniformly bounded in L, \(|\phi ^0_x |\le \sqrt{M_0 \rho }\).

We then employ Hölder’s inequality for the dual pair \((2m,2m/(2m-1))\) to bound the moment

$$\begin{aligned} \langle |\phi _{x}|^{2 m}\rangle _{\mu } \le \langle (|\phi ^+_{x}|+|\phi ^0_{x}|)^{2 m}\rangle _{\mu } \le 2^{2m-1} \left( \langle |\phi ^+_{x}|^{2 m}\rangle _{\mu }+\langle |\phi ^0_{x}|^{2 m}\rangle _{\mu }\right) . \end{aligned}$$

The condensate term on the right hand side is now bounded by \((M_0 \rho )^{m}\), so it only remains to estimate the normal fluid term.

Let us begin with the case where \(\mu \) is \(\mu _1\) or \(\mu _1'\). Since \(\phi ^+_x\) only depends on \(\varPhi ^+\), the product structure of these two measures implies that

$$\begin{aligned} \langle |\phi ^+_{x}|^{2 m}\rangle _{\mu }&=\langle |\phi ^+_{x}|^{2 m}\rangle _{\mu _+} \nonumber \\&= \int _{(\varLambda _+^*)^m} \!\mathrm{d}k \int _{(\varLambda _+^*)^m} \!\mathrm{d}k' \mathrm{e}^{\mathrm{i}2\pi x\cdot \sum _{i=1}^m(k_i-k'_i)}\left\langle \prod _{i=1}^m (\varPhi _{k_i} \varPhi ^*_{k'_i})\right\rangle _{\!\!\mu _+}. \end{aligned}$$

The remaining expectation is over independent, mean zero, Gaussian complex random variables. By the Wick rule and gauge invariance, the expectation is zero unless there is a permutation \(\pi \) of \(\{1,2,\ldots ,m\}\) such that \(k'_i = k_{\pi (i)}\) for all i. Therefore,

$$\begin{aligned} \left\langle \prod _{i=1}^m (\varPhi _{k_i} \varPhi ^*_{k'_i})\right\rangle _{\mu _+} = \sum _{\pi \in S_m} \prod _{i=1}^m {\mathbb {1}}_{\{k'_i=k_{\pi (i)}\}} \prod _{i=1}^m\frac{V}{e_{k_i}}. \end{aligned}$$

For any nonzero term in the sum, we have \(\sum _{i=1}^m k'_i=\sum _{i=1}^m k_i\), implying \( \mathrm{e}^{\mathrm{i}2\pi x\cdot \sum _{i=1}^m(k_i-k'_i)}=1\). Summing over \(k'\) thus yields

$$\begin{aligned} \langle |\phi ^+_{x}|^{2 m}\rangle _{\mu } = \int _{(\varLambda _+^*)^m} \!\mathrm{d}k\sum _{\pi \in S_m} \prod _{i=1}^m\frac{1}{e_{k_i}} = m! \rho _{\mathrm {c}}(L)^m\le m! \rho ^m. \end{aligned}$$

Therefore, for these two measures, we may use \(c_m=2^{2m-1} (m!+M_0^m) \rho ^m\).

It remains to consider the normal fluid contribution for \(\mu =\mu _0\). As above, we have

$$\begin{aligned} \langle |\phi ^+_{x}|^{2 m}\rangle _{\mu } = \int _{(\varLambda _+^*)^m} \!\mathrm{d}k \int _{(\varLambda _+^*)^m} \!\mathrm{d}k' \mathrm{e}^{\mathrm{i}2\pi x\cdot \sum _{i=1}^m(k_i-k'_i)}\left\langle \prod _{i=1}^m (\varPhi _{k_i} \varPhi ^*_{k'_i})\right\rangle _{\mu _0}, \end{aligned}$$

and by gauge invariance of \(\mu _0\), the remaining expectation is zero unless for each \(k\in \varLambda _+^*\) there are the same number of \(\varPhi _k\) and \(\varPhi _k^*\) terms in the product, in which case the product yields a positive number. Thus for the nonzero terms also here we can find a permutation \(\pi \) of \(\{1,2,\ldots ,m\}\) such that \(k'_i = k_{\pi (i)}\) for all i. Therefore,

$$\begin{aligned} 0\le \left\langle \prod _{i=1}^m (\varPhi _{k_i} \varPhi ^*_{k'_i})\right\rangle _{\mu _0} \le \sum _{\pi \in S_m} \prod _{i=1}^m {\mathbb {1}}_{\{k'_i=k_{\pi (i)}\}} \left\langle \prod _{i=1}^m |\varPhi _{k_i}|^2\right\rangle _{\mu _0}. \end{aligned}$$

Continuing as above, and observing that \(\rho _+[\varPhi ] := \frac{1}{V}\int _{\varLambda _+^*} \!\mathrm{d}k |\varPhi _k|^2\le N[\varPhi ]/V\) is bounded by \(\rho \) almost surely under \(\mu _0\), we find an upper bound

$$\begin{aligned} \langle |\phi ^+_{x}|^{2 m}\rangle _{\mu } \le \int _{(\varLambda _+^*)^m} \!\mathrm{d}k\sum _{\pi \in S_m} V^{-m} \left\langle \prod _{i=1}^m |\varPhi _{k_i}|^2\right\rangle _{\mu _0} = m! \langle \rho _+[\varPhi ]^m\rangle _{\mu _0}\le m! \rho ^m. \end{aligned}$$

Therefore, also for \(\mu =\mu _0\), we may use \(c_m=2^{2m-1} (m!+M_0^m) \rho ^m\). Let us point out that, by Lemma 1, \(\rho _{\mathrm {c}}(L)\) is bounded in L and thus it is not a contradiction to assume that \(\rho \) is fixed and supercritical for all \(L\ge L_0\). \(\quad \square \)
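For concreteness, the constant obtained in the proof and the elementary inequality \((a+b)^{2m}\le 2^{2m-1}(a^{2m}+b^{2m})\), \(a,b\ge 0\), behind its first step can be sketched as follows. The function names and the random test points are hypothetical and only meant as an illustration.

```python
import math
import random

def c_m(m, m0, rho):
    """c_m = 2^(2m-1) (m! + M_0^m) rho^m for m > 0, and c_0 = 1."""
    if m == 0:
        return 1.0
    return 2 ** (2 * m - 1) * (math.factorial(m) + m0**m) * rho**m

def convexity_gap(a, b, m):
    """2^(2m-1) (a^(2m) + b^(2m)) - (a + b)^(2m), nonnegative for a, b >= 0."""
    return 2 ** (2 * m - 1) * (a ** (2 * m) + b ** (2 * m)) - (a + b) ** (2 * m)

rng = random.Random(3)
# Evaluate the gap at random points of the unit square for m = 1, ..., 5.
gaps = [
    convexity_gap(rng.random(), rng.random(), m)
    for m in range(1, 6)
    for _ in range(100)
]
```

All sampled gaps are nonnegative up to rounding. The factorial in `c_m` grows faster than exponentially in m, so the bound is useful only for a fixed maximal order of the moments, as assumed in Theorem 2.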

4 Example Lattice Dispersion Relations

As an application, we consider explicitly a number of dispersion relations \(\omega :{\mathbb T}^d\rightarrow {\mathbb R}\), all of which are continuous (periodic) functions. Let us first recall that, once we define \(\phi _x\) by (3.1), the energy and norm satisfy

$$\begin{aligned} H[\varPhi ] = \sum _{x,y\in \varLambda _L} \phi _x^* \alpha (x-y;L) \phi _y \quad \text {and}\quad N[\varPhi ] = \sum _{x\in \varLambda _L} |\phi _x|^2, \end{aligned}$$

where

$$\begin{aligned} \alpha (x;L) := \int _{\varLambda ^*}\!\mathrm{d}k\, \omega (k) \text {e}^{\text {i} 2\pi k \cdot x}. \end{aligned}$$

Taking \(L\rightarrow \infty \) thus shows that \(\alpha (x;L)\rightarrow \alpha (x)=\int _{{\mathbb T}^d}\!\mathrm{d}k\, \omega (k) \mathrm{e}^{\mathrm{i}2\pi k \cdot x}\) for each \(x\in {\mathbb Z}^d\). Here \(\alpha (x)\) are the Fourier coefficients of \(\omega \) and they are \(\ell _2\)-summable since \(\omega \in L^2({\mathbb T}^d)\). In particular, \(\alpha (x)\rightarrow 0\) as \(|x|\rightarrow \infty \). Furthermore, if \(\omega \) is a restriction of an analytic function, we may conclude that its Fourier coefficients \(\alpha (x)\) decay exponentially as \(|x|\rightarrow \infty \), and all such functions correspond to “short-range” interactions for the field \(\phi _x\).

4.1 Nearest neighbour interactions

In the original Berlin–Kac paper, nearest neighbour interactions were considered, which for a rectangular lattice would correspond to using a dispersion relation

$$\begin{aligned} \omega (k) = a + b \sum _{i=1}^d \sin ^2(\pi k_i), \end{aligned}$$

where \(a\in {\mathbb R}\) and \(b>0\). (Since \(2 \sin ^2 (\pi y)=1-\cos (2\pi y)=1-\frac{1}{2}(\mathrm{e}^{\mathrm{i}2\pi y}+\mathrm{e}^{-\mathrm{i}2\pi y})\), it is straightforward to check that then \(|\alpha (x;L)|=0\) if \(|x|_\infty >1\), i.e., for points which are not nearest neighbour on a rectangular lattice.)
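
The vanishing of \(\alpha (x;L)\) away from the nearest neighbours can be checked numerically. The following is a minimal sketch which evaluates the finite lattice sum directly; the centred parametrisation of \(\varLambda _L\) and all function names are assumptions made for illustration, and \(L\ge 3\) is assumed so that \(+e_i\) and \(-e_i\) are distinct modulo L.

```python
import cmath
import math
from itertools import product


def centered_lattice(L):
    """Integers n with -L/2 < n <= L/2 (an assumed convention for Lambda_L)."""
    return range(-((L - 1) // 2), L // 2 + 1)


def omega_nn(k, a, b):
    """Nearest neighbour dispersion relation a + b * sum_i sin^2(pi k_i)."""
    return a + b * sum(math.sin(math.pi * ki) ** 2 for ki in k)


def alpha_nn(x, L, a, b):
    """alpha(x;L) = L^{-d} * sum over k = n/L of omega(k) exp(i 2 pi k.x)."""
    d = len(x)
    total = 0j
    for n in product(centered_lattice(L), repeat=d):
        k = [ni / L for ni in n]
        phase = 2 * math.pi * sum(ki * xi for ki, xi in zip(k, x))
        total += omega_nn(k, a, b) * cmath.exp(1j * phase)
    return total / L ** d
```

Expanding \(\sin ^2\) as in the parenthetical remark above, one expects \(\alpha (0;L)=a+\frac{b d}{2}\) and \(\alpha (\pm e_i;L)=-\frac{b}{4}\) for the unit vectors \(e_i\), with all other coefficients vanishing up to rounding errors.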

Clearly, \(\omega \) is twice continuously differentiable and \(k=0\) is the unique minimum point on \({\mathbb T}^d\) and \(\omega _{\text {min}}=\omega (0)=a\). Also, \(D^2\omega (0)=2\pi ^2 b \,1\) is proportional to the unit matrix 1 and strictly positive. Thus \(\omega \) satisfies Assumption 1 with \(T_0=\{0\}\).

For fixed \(L\ge 2\), let us parameterize the dual lattice \(\varLambda ^*\) by \(k=\frac{n}{L}\) where \(n\in \varLambda _L\), in particular, \(|n|_\infty \le \frac{L}{2}\). Since \(0\in \varLambda ^*\), we have \(\omega _0=\omega _{\text {min}}=a\), and thus the excess energies satisfy

$$\begin{aligned} e_k = b \sum _{i=1}^d \sin ^2\!\left( \frac{\pi n_i}{L}\right) \ge \frac{4 b}{L^2} \sum _{i=1}^d n_i^2. \end{aligned}$$

Therefore, defining \(\varLambda _0^*=\{0\}\) and \(\varLambda _+^*=\varLambda ^*{\setminus }\{0\}\) results in a split of \(\varLambda ^*\) which is separated by the energy interval \([0,4 b L^{-2}]\) and thus has \(\delta _L=0\). We also have

$$\begin{aligned} \rho _{\mathrm {c}}(L) = \int _{\varLambda ^*_+}\!\mathrm{d}k\, \frac{1}{e_k} \le \frac{L^{2-d}}{4 b} \sum _{1\le |n|_\infty \le \frac{L}{2}} |n|^{-2} = O(1), \end{aligned}$$

and

$$\begin{aligned} \frac{1}{V}\int _{\varLambda ^*_+}\!\mathrm{d}k\, \frac{1}{e_k^2} \le L^{4-2 d} (4b)^{-2} \sum _{1\le |n|_\infty \le \frac{L}{2}} |n|^{-4}. \end{aligned}$$

By a Riemann sum approximation (see Sect. 6 for details) we find that the right hand side is \(O(L^{-2})\) for \(d=3\), \(O( L^{-4} \ln L)\) for \(d=4\), and \(O(L^{-d})\) for \(d\ge 5\). Hence, also \(\varepsilon _L\) satisfies these bounds, and we may apply Theorem 1 for all large enough L.
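
The scaling of the lattice sum bound can be probed numerically for small lattices. The sketch below evaluates \(L^{4-2 d} (4b)^{-2} \sum |n|^{-4}\) by brute force; the centred lattice convention and the function names are assumptions for illustration, and the computation is only a sanity check, not a substitute for the estimates in Sect. 6.

```python
from itertools import product


def centered_lattice(L):
    """Integers n with -L/2 < n <= L/2 (an assumed convention for Lambda_L)."""
    return range(-((L - 1) // 2), L // 2 + 1)


def epsilon_bound(L, d, b=1.0):
    """Evaluate L^{4-2d} (4b)^{-2} * sum over nonzero n of |n|^{-4}."""
    s = 0.0
    for n in product(centered_lattice(L), repeat=d):
        r2 = sum(ni * ni for ni in n)  # squared Euclidean norm |n|^2
        if r2 > 0:
            s += 1.0 / r2 ** 2
    return L ** (4 - 2 * d) * s / (4 * b) ** 2


def rescaled_bounds(d, sizes, power):
    """Bound times L**power; it stays bounded if the bound is O(L^-power)."""
    return [epsilon_bound(L, d) * L ** power for L in sizes]
```

For \(d=3\), calling `rescaled_bounds(3, [4, 8, 16], 2)` gives an increasing but bounded sequence, in line with the \(O(L^{-2})\) behaviour stated above.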

We conclude that \(W_2(\mu _0,\mu _1)\le C_2 L^{\frac{d}{2}-p'}\) with \(p'=\frac{d}{4}\) for \(d\ge 5\), any \(p'<1\) for \(d=4\), and \(p'=\frac{1}{2}\) for \(d=3\). Since \(V_0=1\) and \(k=0\) is the unique condensate Fourier mode, we can then apply Proposition 1 and Lemma 2 to conclude that for any finite moment, i.e., for index sets I whose length is less than some arbitrary cut-off, we can approximate

$$\begin{aligned} \langle \phi ^I\rangle _{\mu _0} = \langle \psi ^I\rangle + O(L^{-p'}), \end{aligned}$$

where \(\psi _x=\phi ^+_x + \phi ^0_x\) and \(\phi ^0_x = \sqrt{\rho -\rho _{\mathrm {c}}(L)} \mathrm{e}^{\mathrm{i}\theta }\) is a constant field with a random phase. As shown in Sect. 3, \(\phi ^+_x\) behaves like the critical Gaussian field.

4.2 Acoustic phonon type interactions

Although not covered by Assumption 1, we can also apply Theorem 1 directly by explicit estimates to the following dispersion relation which would appear in the theory of acoustic phonons:

$$\begin{aligned} \omega (k) = \left( \sum _{i=1}^d \sin ^2(\pi k_i)\right) ^{\frac{1}{2}}. \end{aligned}$$

By the computations in the previous subsection, \(k=0\) is again the unique minimum also on finite lattices, and the excess energies satisfy

$$\begin{aligned} e_k \ge 2 L^{-1} |n|, \quad k = \frac{n}{L},\ n\in \varLambda _L. \end{aligned}$$

Hence, for all \(d\ge 2\), we have \(\rho _{\mathrm {c}}(L)=O(1)\) and \(\varepsilon _L = O(L^{-d})\) for \(d\ge 3\), and \(O( L^{-d} \ln L)\) for \(d=2\). Thus the approximation result given at the end of Sect. 4.1 holds also in this case, only with smaller errors and including also the case \(d=2\).
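
As a sanity check, the lower bound for \(e_k\) and the resulting bound \(\rho _{\mathrm {c}}(L)\le \frac{1}{2} L^{1-d}\sum |n|^{-1}\) can be verified by brute force on a small lattice. The sketch assumes a centred parametrisation of \(\varLambda _L\); the function names are hypothetical.

```python
import math
from itertools import product


def centered_lattice(L):
    """Integers n with -L/2 < n <= L/2 (an assumed convention for Lambda_L)."""
    return range(-((L - 1) // 2), L // 2 + 1)


def omega_acoustic(k):
    """Acoustic phonon type dispersion: sqrt(sum_i sin^2(pi k_i))."""
    return math.sqrt(sum(math.sin(math.pi * ki) ** 2 for ki in k))


def check_acoustic(L, d):
    """Return (e_k >= 2|n|/L for all n != 0, rho_c(L), upper bound for rho_c(L))."""
    holds = True
    rho_c = 0.0
    bound = 0.0
    for n in product(centered_lattice(L), repeat=d):
        norm = math.sqrt(sum(ni * ni for ni in n))
        if norm == 0:
            continue  # k = 0 is the condensate mode
        e_k = omega_acoustic([ni / L for ni in n])
        # small relative slack guards against rounding in the equality case
        holds = holds and e_k >= 2 * norm / L * (1 - 1e-12)
        rho_c += 1.0 / e_k
        bound += L / (2 * norm)
    V = L ** d
    return holds, rho_c / V, bound / V
```

The inequality is saturated when a single component equals \(\frac{L}{2}\), which is why the comparison uses a small relative tolerance.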

4.3 Dispersion relation with several minima

Let

$$\begin{aligned} \omega (k) = \sum _{i=1}^d \sin ^2(2 \pi k_i), \end{aligned}$$

which has \(2^d\) global minima at points with \(k_i\in \{0,\frac{1}{2}\}\) for \(i=1,2,\ldots ,d\). All of these are non-degenerate and thus \(\omega \) satisfies Assumption 1. Also, 0 is a minimum and thus for all L the minimum value is reached, \(\omega _{\text {min}}=0=\omega _0\).

Suppose first that L is odd, say \(L=2 m+1\) with \(m\in {\mathbb N}_+\). Then if \(k_0\in T_0\) is not zero, it has some component i such that \(k_i=\frac{1}{2}\). For such i and any \(n\in {\mathbb Z}^d\), we have \(n_i - L k_i=n_i - m - \frac{1}{2}\ne 0\). Hence, \(T_0\cap \varLambda ^* = \{0\}\). In addition, if \(1\le n_i\le m\), we have

$$\begin{aligned} \sin \!\left( \frac{2 \pi n_i}{L}\right) = 2 \sin \!\left( \frac{\pi n_i}{L}\right) \cos \!\left( \frac{\pi n_i}{L}\right) \ge \frac{\min (2 n_i,L-2 n_i)}{L} \ge \frac{1}{L}. \end{aligned}$$

Therefore, \(e_k\ge L^{-2}\) for all \(k\ne 0\), and one may modify the earlier estimates to prove that the split with \(\varLambda _0^*=\{0\}\) has \(\varepsilon _L=O(L^{-4 p'})\) with \(p'\) chosen as for the nearest neighbour interactions. Thus for odd L one finds a single-component condensate, even though \(|T_0|=2^d\).

If L is even, say \(L=2m\) with \(m\in {\mathbb N}_+\), we have \(\frac{1}{2}=\frac{m}{L}\), and thus \(T_0\subset \varLambda _L/L\). Defining \(\varLambda _0^*=T_0\) results in a split for which \(\rho _{\mathrm {c}}(L)=O(1)\) and \(\varepsilon _L=O(L^{-4 p'})\) as above but now the condensate is \(2^d\)-fold degenerate. In addition, \(e_k=0\) for each \(k\in \varLambda _0^*\), so it is not possible to decrease \(\varLambda _0^*\) without reducing the gap size to zero. We can also apply item 2 of Proposition 1 and conclude that in the condensate the Fourier modes \(k\in T_0\) are distributed uniformly on a sphere and hence the condensate field \(\phi ^0_x\) has strong oscillations in x.

In summary, the odd and even lattice sizes behave differently, and it does not really make sense to talk about the \(L\rightarrow \infty \) limit of the measure \(\mu _0\), at least not without first removing the condensate modes. This result becomes more transparent if one computes the coupling function \(\alpha (x;L)\): these correspond to next-to-nearest neighbour couplings where \(\alpha (x)=0\) unless \(x=0\) or \(|x|_\infty =2\). Considering each of the d directions separately, one observes that if L is even, the odd and even sites become disconnected, and thus the system decouples into \(2^d\) independent nearest neighbour systems. On the other hand, if L is odd, odd and even sites are coupled by “going around the circle once”. In fact, this system corresponds to a single nearest neighbour lattice where the particle labels have been permuted. Since the estimates in Theorem 1 are sufficiently strong to distinguish between the two cases, we find that they provide a reliable, relatively simple method of isolating the condensate modes also in this somewhat pathological setup.
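
The difference between odd and even L can be seen directly by listing the lattice modes at which \(\omega \) vanishes. The following sketch does this by brute force; the centred lattice convention, the numerical tolerance, and the function names are assumptions for illustration.

```python
import math
from itertools import product


def centered_lattice(L):
    """Integers n with -L/2 < n <= L/2 (an assumed convention for Lambda_L)."""
    return range(-((L - 1) // 2), L // 2 + 1)


def omega_multi(k):
    """Dispersion relation sum_i sin^2(2 pi k_i) with 2^d minima on the torus."""
    return sum(math.sin(2 * math.pi * ki) ** 2 for ki in k)


def condensate_modes(L, d, tol=1e-12):
    """Lattice vectors n for which omega(n/L) is numerically zero."""
    return [n for n in product(centered_lattice(L), repeat=d)
            if omega_multi([ni / L for ni in n]) < tol]


def min_excess_energy(L, d, tol=1e-12):
    """Smallest omega(n/L) among the modes which are not numerically zero."""
    values = (omega_multi([ni / L for ni in n])
              for n in product(centered_lattice(L), repeat=d))
    return min(w for w in values if w >= tol)
```

For \(d=2\), this finds the single zero mode \(n=0\) when \(L=5\) and \(2^d=4\) zero modes when \(L=6\); for \(L=5\) the smallest non-zero excess energy also respects the bound \(e_k\ge L^{-2}\).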

4.4 Dispersion relations with varying condensate energy

As a straightforward generalization of the above dispersion relations, one can have any point \(\zeta \in {\mathbb T}^d\) as the global minimum, for instance using

$$\begin{aligned} \omega (k;\zeta ) = \sum _{i=1}^d \sin ^2(\pi (k_i-\zeta _i)). \end{aligned}$$

Even though the minimum point is unique on the torus, if \(\zeta \ne 0\), it does not need to belong to \(\varLambda ^*\) and then there might be several minimum points in \(\varLambda ^*\).

Consider for instance an odd \(L=2m+1\) and \(\zeta _i=\frac{1}{2}\) for all i. Then \(\frac{n_i}{L}-\frac{1}{2}= -\frac{1+2(m-n_i)}{2 L}\) and \(\frac{n_i}{L}+\frac{1}{2}= \frac{1+2(m+n_i)}{2 L}\), and thus in this case \(\omega _0=d \sin ^2\!\frac{\pi }{2 L}\) and it is reached whenever \(n_i=\pm m\) for all i. Thus the unique continuum minimum \(\zeta \) corresponds to \(2^d\) minimum points in \(\varLambda ^*\). In fact, in this case one should choose \(\varLambda _0^*\) to consist of these \(2^d\) points, since then for \(k\in \varLambda ^*_+\) the excess energies \(e_k\) increase like \(|n|^2/L^2\) where |n| denotes the number of “lattice steps” from k to the set \(\varLambda ^*_0\), leading to similar estimates as in the nearest neighbour case.

If L is even for this dispersion relation, \(\zeta \in \varLambda ^*\), \(\omega _0=0\), \(\varLambda _0^*=\{\zeta \}\), and the behaviour is identical to the nearest neighbour case.
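
The two cases can be illustrated by locating the lattice minimisers of \(\omega (k;\zeta )\) numerically. The sketch below is a brute force search under an assumed centred lattice convention; the tolerance used to detect ties and the function names are hypothetical.

```python
import math
from itertools import product


def centered_lattice(L):
    """Integers n with -L/2 < n <= L/2 (an assumed convention for Lambda_L)."""
    return range(-((L - 1) // 2), L // 2 + 1)


def omega_shifted(k, zeta):
    """Shifted dispersion relation sum_i sin^2(pi (k_i - zeta_i))."""
    return sum(math.sin(math.pi * (ki - zi)) ** 2 for ki, zi in zip(k, zeta))


def lattice_minimisers(L, zeta, tol=1e-12):
    """Return (omega_0, sorted list of n whose omega(n/L) is within tol of omega_0)."""
    d = len(zeta)
    values = {n: omega_shifted([ni / L for ni in n], zeta)
              for n in product(centered_lattice(L), repeat=d)}
    w0 = min(values.values())
    return w0, sorted(n for n, w in values.items() if w <= w0 + tol)
```

With \(\zeta =(\frac{1}{2},\frac{1}{2})\) and \(L=7\), so that \(m=3\), the search returns the four points with \(n_i=\pm 3\) and \(\omega _0=2\sin ^2\!\frac{\pi }{14}\), whereas for \(L=6\) it returns the single point \(n=(3,3)\) with \(\omega _0=0\).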

Considering irrational minimum points \(\zeta \) can lead to much more complicated situations. For example, suppose r is an irrational number between 0 and \(\frac{1}{2}\) which has a binary representation \(b_j\in \{0,1\}\), \(j\in {\mathbb N}\), i.e., suppose that \(r = \sum _{j=2}^\infty b_j 2^{-j}\) where the sequence \((b_j)\) does not converge to zero or one. Set \(\zeta _1=r\) and \(\zeta _i=0\), for \(i\ge 2\), and consider the following dispersion relation obtained as a product of two previous ones,

$$\begin{aligned} \omega (k) := \omega (k;0) \omega (k;\zeta ), \end{aligned}$$

with global minima at 0 and \(\zeta \). Then for each L, \(0\in \varLambda ^*\), \(\omega _0=0\), and this value can only be reached at \(k=0\) on \(\varLambda ^*\). However, for values of \(k=n/L\), with \(n_i=0\) for \(i\ge 2\), we have

$$\begin{aligned} \omega (k) \le \sin ^2 \frac{\pi (n_1-L r)}{L}. \end{aligned}$$

Along the subsequence \(L=2^N\), \(N\in {\mathbb N}\), here \(n_1-L r = n_1-\sum _{\ell =0}^{N-2} b_{N-\ell } 2^\ell -\sum _{j=1}^{\infty } b_{N+j} 2^{-j}\). We can choose \(n_1=\sum _{\ell =0}^{N-2} b_{N-\ell } 2^\ell \le \sum _{\ell =0}^{N-2} 2^\ell = 2^{N-1}-1<\frac{L}{2}\), and for this value

$$\begin{aligned} e_k = \omega (k) \le \pi ^2 \left( \sum _{j=1}^\infty b_{N+j} 2^{-j-N}\right) ^2. \end{aligned}$$

Hence, by considering a binary sequence with ever less frequent ones and sufficiently large N, the bound can be made proportional to \(L^{-p}\) for any \(p\ge 2\). Depending on how small the term is, the above point \(k=n/L\) might or might not belong to the condensate modes \(\varLambda _0^*\). In particular, there are instances for which \(e_k>0\) but \(e_k \le \frac{1}{2\rho } L^{-2 d}\), and thus item 3 of Proposition 1 can be applied without increasing the magnitude of the error. Hence, the system behaves like a uniformly distributed two-component condensate even though \(e_k\), \(k\in \varLambda ^*_0\), is not identically zero.
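
The construction can be made concrete with a sparse binary sequence. The sketch below truncates the expansion to finitely many ones, so the resulting r is rational and only mimics an irrational r up to the truncation; this truncation, the restriction to the first component of k, and all function names are assumptions for illustration. Exact rational arithmetic is used for \(L r\) to avoid rounding in the tail.

```python
import math
from fractions import Fraction


def r_from_positions(positions):
    """r = sum of 2^{-j} over the given positions j >= 2 (finite truncation)."""
    return sum(Fraction(1, 2 ** j) for j in positions)


def near_minimum_mode(r, N):
    """For L = 2^N return (L, n_1, tail) with L*r = n_1 + tail and 0 <= tail < 1."""
    L = 2 ** N
    n1 = math.floor(L * r)   # equals sum_l b_{N-l} 2^l
    tail = L * r - n1        # equals sum_j b_{N+j} 2^{-j}
    return L, n1, tail


def excess_energy_and_bound(r, N):
    """Return omega(k) at k = (n_1/L, 0, ..., 0) and the bound pi^2 (tail/L)^2."""
    L, n1, tail = near_minimum_mode(r, N)
    k1 = n1 / L
    # product omega(k;0) * omega(k;zeta) restricted to the first component
    omega = (math.sin(math.pi * k1) ** 2
             * math.sin(math.pi * (k1 - float(r))) ** 2)
    bound = math.pi ** 2 * float(tail / L) ** 2
    return omega, bound
```

For ones at positions \(j=2,4,8,16,32\) and \(N=16\), the tail equals \(2^{-16}=L^{-1}\), so the bound is \(\pi ^2 L^{-4}\), which illustrates how sparse sequences produce excess energies much smaller than \(L^{-2}\).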

4.5 Anisotropic dispersion relations

Another generalization of the above condensate cases is to consider anisotropic dispersion relations. For instance, in addition to shifting the global minimum to \(\zeta \in {\mathbb T}^d\) we may take any finite collection of points \(M^{(\ell )}\in {\mathbb Z}^d\), \(\ell =1,2,\ldots ,N\), choose some weights \(b_\ell >0\) for them, and define

$$\begin{aligned} \omega (k) = \sum _{\ell =1}^N b_\ell \sin ^2(\pi (k-\zeta )\cdot M^{(\ell )}). \end{aligned}$$

If there is a sufficient variety of points in the collection, for instance, if all unit vectors are included, there is only one global minimum for this dispersion relation, located at \(k=\zeta \). The Hessian at this point is equal to

$$\begin{aligned} 2\pi ^2 \sum _{\ell =1}^N b_\ell M^{(\ell )} \otimes M^{(\ell )}, \end{aligned}$$

so that the second derivative into a direction \(v\in S^{d-1}\) at \(k=\zeta \) is given by

$$\begin{aligned} 2\pi ^2 \sum _{\ell =1}^N b_\ell \left| v\cdot M^{(\ell )}\right| ^2. \end{aligned}$$

Hence, essentially arbitrary asymmetries between different directions may be generated near the minimum point by varying \(M^{(\ell )}\) and \(b_\ell \).
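
The formula for the directional second derivative can be compared against a central finite difference. The following sketch does this for a user-supplied collection of points and weights; the step size and the function names are assumptions chosen for illustration.

```python
import math


def omega_aniso(k, zeta, weights, points):
    """Anisotropic dispersion: sum_l b_l sin^2(pi (k - zeta) . M^(l))."""
    total = 0.0
    for b, M in zip(weights, points):
        dot = sum((ki - zi) * Mi for ki, zi, Mi in zip(k, zeta, M))
        total += b * math.sin(math.pi * dot) ** 2
    return total


def second_derivative_formula(v, weights, points):
    """2 pi^2 sum_l b_l |v . M^(l)|^2, the closed form at k = zeta."""
    return 2 * math.pi ** 2 * sum(
        b * sum(vi * Mi for vi, Mi in zip(v, M)) ** 2
        for b, M in zip(weights, points))


def second_derivative_numeric(v, zeta, weights, points, h=1e-4):
    """Central finite difference of omega along the direction v at k = zeta."""
    plus = [zi + h * vi for zi, vi in zip(zeta, v)]
    minus = [zi - h * vi for zi, vi in zip(zeta, v)]
    centre = omega_aniso(zeta, zeta, weights, points)
    return (omega_aniso(plus, zeta, weights, points) - 2 * centre
            + omega_aniso(minus, zeta, weights, points)) / h ** 2
```

Comparing the two functions for several unit vectors v gives a quick impression of how strongly a given choice of points and weights distorts the level sets near \(\zeta \).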

In the proof of Lemma 1 given in Sect. 6, the uniform upper bound for the number of degrees of freedom included in the condensate, \(M_0\), depends on the dimension but also on the ratio between the maximal and minimal eigenvalue of the Hessian of \(\omega \) at its minima, i.e., on the maximal anisotropy at these points. The value appearing in the proof typically overestimates the true number of degrees of freedom needed. Let us conclude with two examples which highlight the problems which arise when trying to improve on such general uniform bounds.

For simplicity, let us consider anisotropy in the first two components only. To borrow results from the previous computations, assume that \(L=2m+1\) is odd and take \(\zeta =(\frac{1}{2},0,0,\ldots ,0)\). We reparameterize the first component using \(m_1:= m-L |k_1|\in {\mathbb Z}\) and the sign \(\sigma _1\) of \(k_1\). Then \(0\le m_1\le m\) and \(k_1=\sigma _1 |k_1| = \sigma _1 ( m-m_1)/L\), implying also that \(|\sin (\pi (k_1-\frac{1}{2}))|= |\sin (\pi (2 m_1+1)/(2L))|\ge (2 m_1+1)/L\).

We first consider the nearest neighbour case where the first component has unit weight but the rest have a much smaller weight \(1/B\), where \(B\gg 1\). Then for \(k\in \varLambda ^*\), and denoting \(n_i = L k_i\), for \(i=2,3,\ldots ,d\), we find an approximation

$$\begin{aligned} \omega (k) \approx \pi ^2 \frac{(m_1+1/2)^2+n^2/B}{L^2}, \end{aligned}$$

valid for \(m_1/L, |n|/L\ll 1\). Thus the minimum value is reached at the two points where \(m_1=0\) and \(n=0\). However, if \(m_1=0\), we then also have \(e_k\approx \pi ^2\frac{n^2}{B L^2}\) whenever \(|n|/L\ll 1\). Suppose that we wish to include in the condensate \(\varLambda _0^*\) at least all k with \(e_k\le L^{\frac{1}{2}-d}\) (corresponding roughly to the choice \(\kappa =\frac{1}{2}\) in Lemma 1). Since for some finite L it can happen that \(B\ge L^{d-1}\), the number of condensate modes can temporarily be very large. This effect can be traced back to the flatness of constant level surfaces of \(\omega \) caused by the strong anisotropy.

In the second example, we take also the first direction to have a small weight \(1/B\) but add one more point to the collection: set \(b_{d+1}=1\) and \(M^{(d+1)}=(M_1,M_2,0,\ldots ,0)\) where \(M_1,M_2\in {\mathbb N}\) are such that \(M_2\) is odd and \(M_1\) is even. Suppose also that L is large enough, satisfying \(L\gg M_1,M_2\). Then for \(m_1/L, |n|/L\ll 1\) we have

$$\begin{aligned} \omega (k) \approx \pi ^2 \frac{(m_1+1/2)^2+n^2}{B L^2} + \pi ^2 \frac{K^2}{L^2}, \end{aligned}$$

where, using the assumption that \(M_1\) is an integer,

$$\begin{aligned}&K = \left( L \left[ k_1-\frac{1}{2}\right] M_1 + L k_2 M_2\right) \bmod L \nonumber \\&\quad = \left( -\sigma _1 \left[ m_1+\frac{1}{2}\right] M_1 + n_2 M_2\right) \bmod L. \end{aligned}$$

Since \(M_1\) is even, \(M_2\) is odd, and both are positive, we may set \(n_2=\sigma _1\frac{M_1}{2}\) and choose \(m_1\) so that \(2 m_1+1=M_2\). Setting also \(n_i=0\) for \(i\ge 3\), we obtain two points in \(\varLambda _L^*\) for which \(K=0\) and

$$\begin{aligned} \omega (k) \approx \pi ^2 \frac{M_1^2+M_2^2}{4 B L^2}. \end{aligned}$$

However, for any point for which \(K\ne 0\), for instance, if \(m_1=0=n_2\), we have

$$\begin{aligned} \omega (k)\ge \frac{4}{L^2}. \end{aligned}$$

Therefore, if the system is sufficiently anisotropic, e.g., \(B\ge M_1^2+M_2^2\), it can happen that the minimum point is not the nearest lattice point to the minimum on \({\mathbb T}^d\), but it could be found many lattice steps away from it. In contrast to the first example, this effect does not disappear when \(L\rightarrow \infty \), but will persist for all sufficiently large odd L in the present case.
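
The second example can be explored by a brute force search for the lattice minimisers in \(d=2\). The sketch below uses \(\zeta =(\frac{1}{2},0)\), weights \(1/B\) on the two unit vectors, and unit weight on the point \((M_1,M_2)\); the specific parameter values mentioned afterwards, the centred lattice convention, and the function names are assumptions for illustration.

```python
import math
from itertools import product


def centered_lattice(L):
    """Integers n with -L/2 < n <= L/2 (an assumed convention for Lambda_L)."""
    return range(-((L - 1) // 2), L // 2 + 1)


def omega_second_example(k, B, M1, M2):
    """d = 2, zeta = (1/2, 0): weight 1/B on unit vectors, weight 1 on (M1, M2)."""
    x1 = k[0] - 0.5
    x2 = k[1]
    weak = (math.sin(math.pi * x1) ** 2 + math.sin(math.pi * x2) ** 2) / B
    strong = math.sin(math.pi * (x1 * M1 + x2 * M2)) ** 2
    return weak + strong


def lattice_minimisers(L, B, M1, M2, rel_tol=1e-9):
    """Return (omega_0, sorted list of n with omega(n/L) within rel_tol of omega_0)."""
    values = {n: omega_second_example((n[0] / L, n[1] / L), B, M1, M2)
              for n in product(centered_lattice(L), repeat=2)}
    w0 = min(values.values())
    return w0, sorted(n for n, w in values.items() if w <= w0 * (1 + rel_tol))
```

For \(L=101\), \(B=100\), \(M_1=2\), and \(M_2=3\), the minimisers are \(n=\pm (49,1)\), that is, \(m_1=1\) and \(n_2=\sigma _1\), rather than the lattice points \(n=(\pm 50,0)\) closest to \(\zeta \), and the minimum value is close to \(\pi ^2 \frac{M_1^2+M_2^2}{4 B L^2}\).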

5 Proof of the Main Result, Theorem 1

Proof of Theorem 1

Consider a fixed L and a split \((\varLambda _0^*,\varLambda ^*_+)\) of \(\varLambda ^*\) which is separated by the energy interval \([a,b]\), \(0\le a<b\), and has a relative energy gap \(\delta ^{-1}\). We aim at separation in the degrees of freedom related to these two sets.

We begin by simplifying the representation of the Berlin–Kac measure \(\mu _0\). Starting from the simplified form, we then construct a change of variables which will bring it closer to the measure \(\mu _1\). We first shift the position of the \(\delta \)-constraint to match that in \(\mu _1\). This will introduce a shift in the normal fluid energies which we will need to repair back to the critical ones by a second change of variables. Even after these changes, the measures will differ by a weight function which, however, is close to one with high probability. This property is checked quantitatively in a technical Lemma 4, resulting in the estimates in Corollary 2. To make the final comparison, we use the change of variables to construct a coupling between \(\mu _0\) and \(\mu _1\) which, together with Corollary 2, will result in the stated bound on their Wasserstein distance.

To begin, let us collect the field values for \(k\in \varLambda _+^*\) into a vector \(\varPhi ^+\), corresponding to the normal fluid, and those for \(k\in \varLambda _0^*\) into a vector \(\varPhi ^0\), corresponding to the condensate. We denote

$$\begin{aligned} V_0 := |\varLambda _0^*|, \qquad V_+ := |\varLambda _+^*|, \end{aligned}$$

for which \(V_0,V_+>0\), and \(V=V_0+V_+\). Define also

$$\begin{aligned}&N_+[\varPhi ] := \int _{\varLambda _+^*}\!\mathrm{d}k\, |\varPhi _k|^2,\quad N_0[\varPhi ] := \int _{\varLambda _0^*}\!\mathrm{d}k\, |\varPhi _k|^2, \nonumber \\&\rho _+[\varPhi ] := \frac{N_+[\varPhi ]}{V},\quad \rho _0[\varPhi ] := \frac{N_0[\varPhi ]}{V}. \end{aligned}$$

Since \(N_+[\varPhi ]+N_0[\varPhi ] = N[\varPhi ]\), we have now

$$\begin{aligned} H[\varPhi ] = \int _{\varLambda ^*}\!\mathrm{d}k\, \omega (k) |\varPhi _k|^2 = \int _{\varLambda ^*}\!\mathrm{d}k\, e_k |\varPhi _k|^2 + \omega _0 N[\varPhi ]. \end{aligned}$$

Denote

$$\begin{aligned} E_+[\varPhi ] := \int _{\varLambda _+^*}\!\mathrm{d}k\, e_k |\varPhi _k|^2,\qquad E_0[\varPhi ] := \int _{\varLambda _0^*}\!\mathrm{d}k\, e_k |\varPhi _k|^2, \end{aligned}$$

and we may conclude that in the integrand, in which almost surely \(N[\varPhi ] = \rho V\), we have

$$\begin{aligned} H[\varPhi ] = E_+[\varPhi ] + E_0[\varPhi ] + \omega _0 \rho V. \end{aligned}$$

Therefore, we may rewrite

$$\begin{aligned} \mu _0[\mathrm{d}\varPhi ] = \frac{1}{Z_{0}} \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_+[\varPhi ]} \prod _{k\in \varLambda _0^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_0[\varPhi ]} \delta (\rho _0[\varPhi ] + \rho _+[\varPhi ] - \rho ), \end{aligned}$$

where the new normalization constant is related to the one given in (2.7) by \(Z_0 = V \mathrm{e}^{\omega _0 \rho V} Z_\rho \).

Let \(\rho _{\mathrm {c}}>0\) denote the critical density, measured as an expectation of \(\rho _+\) over the probability measure (2.8), i.e., over

$$\begin{aligned} \mu _+[\mathrm{d}\varPhi ] := \frac{1}{Z_{+}} \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_+[\varPhi ]}. \end{aligned}$$

By assumption, \(e_k \ge b>0\) for each \(k\in \varLambda _+^*\), and thus this is a well-defined Gaussian measure under which \(\mathrm{Re\,}\varPhi _k\), \(\mathrm{Im\,}\varPhi _k\), \(k\in \varLambda _+^*\), form a collection of jointly independent random variables, with a zero mean and a variance \(\frac{V}{2 e_k}\). Therefore,

$$\begin{aligned} \langle \rho _+\rangle _{\mu _+}&= \frac{1}{Z_{+}} \int \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_+[\varPhi ]} \frac{1}{V^2} \sum _{k\in \varLambda ^*_+} |\varPhi _k|^2 \nonumber \\&= \frac{1}{V^2} \sum _{k\in \varLambda ^*_+} \frac{V}{e_k} = \int _{\varLambda _+^*}\!\!\mathrm{d}k\, \frac{1}{e_k}=\rho _{\mathrm {c}}(L), \end{aligned}$$

as defined in (2.9).
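
The identity \(\langle \rho _+\rangle _{\mu _+}=\rho _{\mathrm {c}}(L)\) can be illustrated by sampling the Gaussian measure directly. The sketch below uses the one-dimensional nearest neighbour excess energies purely as a small test case; this choice, the sample size, the seed, and the function names are assumptions for illustration.

```python
import math
import random


def excess_energies_1d(L, b=1.0):
    """e_k = b sin^2(pi n / L) for the nonzero modes of a one-dimensional lattice."""
    return [b * math.sin(math.pi * n / L) ** 2
            for n in range(-((L - 1) // 2), L // 2 + 1) if n != 0]


def rho_c(energies, V):
    """Critical density (1/V) sum_k 1/e_k."""
    return sum(1.0 / e for e in energies) / V


def sample_rho_plus(energies, V, rng):
    """One sample of rho_+ = V^{-2} sum_k |Phi_k|^2 under mu_+."""
    total = 0.0
    for e in energies:
        sigma = math.sqrt(V / (2 * e))  # std of Re Phi_k and Im Phi_k
        re = rng.gauss(0.0, sigma)
        im = rng.gauss(0.0, sigma)
        total += re * re + im * im
    return total / V ** 2


def monte_carlo_mean(energies, V, samples, seed=0):
    """Empirical mean of rho_+ over independent samples."""
    rng = random.Random(seed)
    return sum(sample_rho_plus(energies, V, rng) for _ in range(samples)) / samples
```

With a few thousand samples the empirical mean agrees with `rho_c` to within the expected statistical error.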

Set then \(\varDelta :=\rho -\rho _{\mathrm {c}}\), which is strictly positive by assumption. Then we define the target measure \(\mu _1\) as a product between \(\mu _+\) and a suitably chosen condensate measure: we set

$$\begin{aligned}&\mu _1[\mathrm{d}\varPhi ] := \frac{1}{Z_1} \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_+[\varPhi ]} \nonumber \\&\qquad \times \prod _{k\in \varLambda _0^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_0[\varPhi ]-\tilde{\varepsilon }[\varPhi ] V \rho _{\mathrm {c}}} \prod _{k\in \varLambda ^*_+} \left( 1-\frac{\tilde{\varepsilon }[\varPhi ]}{e_{k}} \right) ^{-1} \delta (\rho _0[\varPhi ] - \varDelta ), \end{aligned}$$
(5.1)

where \(\tilde{\varepsilon }\) depends only on the condensate components \(\varPhi ^0\),

$$\begin{aligned} \tilde{\varepsilon }[\varPhi ] := \frac{E_0[\varPhi ]}{N_0[\varPhi ]} \le \max _{k\in \varLambda ^*_0}e_k \le a. \end{aligned}$$

Thus, for any \(k\in \varLambda ^*_+\),

$$\begin{aligned} \frac{\tilde{\varepsilon }[\varPhi ]}{e_{k}} \le \delta <1, \end{aligned}$$

which implies that the weight in (5.1) is a strictly positive function. Since here \(\tilde{\varepsilon }[\varPhi ]=E_0[\varPhi ]/(V \varDelta )\) almost surely, this measure indeed coincides with the definition given in (2.10).

To construct a suitable coupling between the measures \(\mu _0\) and \(\mu _1\), we rely on a change of variables and the diagonal concentration trick which we learned from Saksman and Webb, from the proof of Lemma B.1 in [10, Appendix B]. The trick is to construct an explicit coupling between two probability measures by concentrating as much of their common mass as possible in the diagonal of the coupling (\(\varPhi '=\varPhi \)) and distributing any remaining mass as a product on the off-diagonal (\(\varPhi '\ne \varPhi \)). Although this coupling is seldom optimal, it can provide a good estimate of the Wasserstein distance of the two measures in case most of the mass can be concentrated in the diagonal; note that the diagonal mass does not contribute to the value of the integral defining the Wasserstein distance in (B.1).
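
The diagonal concentration trick is easiest to see for two probability vectors on a common finite set. The following sketch is a discrete analogue only and is not the coupling constructed later in the proof; the function names are hypothetical.

```python
def diagonal_coupling(p, q):
    """Coupling of probability vectors p and q on the same finite set.

    The common mass min(p_i, q_i) is placed on the diagonal and the remaining
    mass is distributed as a normalised product of the two residuals.
    """
    n = len(p)
    common = [min(pi, qi) for pi, qi in zip(p, q)]
    rest = 1.0 - sum(common)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        gamma[i][i] = common[i]
    if rest > 0:
        for i in range(n):
            for j in range(n):
                # the product vanishes for i == j, so the diagonal is untouched
                gamma[i][j] += (p[i] - common[i]) * (q[j] - common[j]) / rest
    return gamma


def transport_cost(gamma, points):
    """Sum of gamma[i][j] |x_i - x_j|^2, an upper bound for the squared W_2 distance."""
    n = len(points)
    return sum(gamma[i][j] * (points[i] - points[j]) ** 2
               for i in range(n) for j in range(n))
```

Only the off-diagonal part contributes to `transport_cost`, so the bound is small whenever most of the mass of the two vectors is shared.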

In our application of the trick, we first need to change to variables in which the two measures share enough common mass. To find new variables better adapted to compare the measures \(\mu _0\) and \(\mu _1\), let us start from the measure \(\mu _0\) and denote its integration variable by \(\varPsi \). The goal is to find a change of variables \(\varPsi =G[\varPhi ]\) which would yield a measure close to \(\mu _1\): we try to construct G so that for any observable f we would have \(\int \! \mu _0[\mathrm{d}\varPsi ] \, f(\varPsi ) = \int \! \mu _1[\mathrm{d}\varPhi ]\, g[\varPhi ] f(G[\varPhi ])\) for some function g which is close to one with high \(\mu _1\)-probability. Some preliminary estimates and definitions will be needed to find the right choice, and we postpone the precise construction of the coupling until Eq. (5.11).

First, let us recall that \(\varDelta = \rho -\rho _{\mathrm {c}}> 0\) and define

$$\begin{aligned} \alpha [\varPsi ] := {\left\{ \begin{array}{ll} \frac{\rho _+[\varPsi ]-\rho _{\mathrm {c}}}{\rho -\rho _{\mathrm {c}}}, &{} \text {if }\rho _+[\varPsi ] < \rho , \\ 0, &{} \text {if }\rho _+[\varPsi ] \ge \rho . \end{array}\right. } \end{aligned}$$

Note that \(\alpha [\varPsi ]\) depends only on \(\varPsi ^+\), and \(-\frac{\rho _{\mathrm {c}}}{\varDelta }\le \alpha [\varPsi ]<1\). Consider the expectation of some continuous function \(f(\varPsi ^+,\varPsi ^0)\) with a compact support under the original measure \(\mu _0[\mathrm{d}\varPsi ]\). The mass constraint function can be written as

$$\begin{aligned} \rho _0[\varPsi ] + \rho _+[\varPsi ] - \rho = \rho _0[\varPsi ] - (1-\alpha [\varPsi ]) \varDelta , \end{aligned}$$

whenever \(\rho _+[\varPsi ] < \rho \). On the other hand, the set with \(\rho _+[\varPsi ] = \rho \) has measure zero, and if \(\rho _+[\varPsi ] > \rho \), the mass constraint cannot be satisfied for any \(\varPsi ^0\). Hence, the collection of \(\varPsi \) with \(\rho _+[\varPsi ]\ge \rho \) has zero measure with respect to \(\mu _0\). Since \(\alpha [\varPsi ]<1\) depends only on \(\varPsi ^+\), it is straightforward to make a change of variables \(\varPsi _k = \sqrt{1-\alpha [\varPsi ]} \varPhi _k\) for \(k\in \varLambda ^*_0\). Then \(\rho _0[\varPsi ] = (1-\alpha [\varPsi ]) \rho _0[\varPhi ]\) and

$$\begin{aligned} \delta ( \rho _0[\varPsi ] + \rho _+ - \rho ) = \delta ((1-\alpha [\varPsi ])(\rho _0[\varPhi ]- \varDelta )) = \frac{1}{1-\alpha [\varPsi ]} \delta (\rho _0[\varPhi ]- \varDelta ). \end{aligned}$$

A more detailed discussion of the validity of this formula can be found in Appendix A. In particular, we are allowed to apply the formal rule for \(\delta \)-functions to take out the factor \((1-\alpha [\varPsi ])\) here since the \(\delta \)-function can be integrated out using \(\varPsi ^0\) while keeping \(\varPsi ^+\), and hence also \(\alpha [\varPsi ]\), fixed.

In the above change of variables, \(E_0[\varPsi ] = (1-\alpha [\varPsi ])E_0[\varPhi ]\), and therefore we obtain

$$\begin{aligned}&\langle f\rangle _{\mu _0} = \frac{1}{Z_{0}} \int \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPsi ^*_k \mathrm{d}\varPsi _k\right] \mathrm{e}^{-E_+[\varPsi ]} {\mathbb {1}}_{\{\rho _+[\varPsi ] < \rho \}}(1-\alpha [\varPsi ])^{V_0-1} \\&\quad \times \int \prod _{k\in \varLambda _0^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-(1-\alpha [\varPsi ])E_0[\varPhi ]} \delta (\rho _0[\varPhi ]- \varDelta ) f(\varPsi ^+,\sqrt{1-\alpha [\varPsi ]} \varPhi ^0). \end{aligned}$$

We then use Fubini’s theorem to change the order of \(\varPsi \) and \(\varPhi \) integrals. Then we can simplify the integral by making a change of variables for \(\varPsi ^+\) using a fixed \(E_0=E_0[\varPhi ]\) and assuming \(\rho _0=\varDelta \). In particular, for \(\rho _+[\varPsi ] < \rho \), we have

$$\begin{aligned} E_0 \alpha [\varPsi ] = \frac{E_0}{\varDelta }(\rho _+[\varPsi ]-\rho _{\mathrm {c}}) = \frac{E_0}{\rho _0}(\rho _+[\varPsi ]-\rho _{\mathrm {c}}) = \tilde{\varepsilon } V (\rho _+[\varPsi ]-\rho _{\mathrm {c}}) = \tilde{\varepsilon } N_+[\varPsi ]-\tilde{\varepsilon } V \rho _{\mathrm {c}}. \end{aligned}$$

Therefore,

$$\begin{aligned} \mathrm{e}^{-E_+[\varPsi ]+ E_0 \alpha [\varPsi ]} = \mathrm{e}^{-\tilde{\varepsilon } V \rho _{\mathrm {c}}} \exp \biggl (-\frac{1}{V}\sum _{k\in \varLambda ^*_+} (e_k-\tilde{\varepsilon })|\varPsi _k|^2\biggr ). \end{aligned}$$

We now make a second change of variables to correct for the shift of energies here: \(\varPhi _k = \sqrt{1-\tilde{\varepsilon }/e_k}\varPsi _k\) for \(k\in \varLambda ^*_+\). As pointed out above, here \(\tilde{\varepsilon }/e_k<1\) and we can resolve the change of variables as easily as in the first case. We find that

$$\begin{aligned} \langle f\rangle _{\mu _0}&= \frac{1}{Z_{0}} \int \prod _{k\in \varLambda _0^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_0[\varPhi ]} \delta (\rho _0[\varPhi ]- \varDelta ) \mathrm{e}^{-\tilde{\varepsilon } V \rho _{\mathrm {c}}} \prod _{k\in \varLambda ^*_+} \left( 1-\frac{\tilde{\varepsilon }}{e_{k}} \right) ^{-1}\\&\quad \times \int \prod _{k\in \varLambda _+^*} \left[ \mathrm{d}\varPhi ^*_k \mathrm{d}\varPhi _k\right] \mathrm{e}^{-E_+[\varPhi ]} {\mathbb {1}}_{\{\rho _+ < \rho \}} (1-\alpha )^{V_0-1}\\&\quad \times f((1-\tilde{\varepsilon }/e_k)^{-1/2}\varPhi ^+_k,\sqrt{1-\alpha } \varPhi ^0), \end{aligned}$$

where \(\tilde{\varepsilon }=\tilde{\varepsilon }[\varPhi ]\), and we need to substitute in the integrand

$$\begin{aligned} \text {``}\rho _+\text {''} = \frac{1}{V^2} \sum _{k\in \varLambda ^*_+} \frac{1}{1-\frac{\tilde{\varepsilon }}{e_{k}} } |\varPhi _k|^2,\quad \text {``}\alpha \text {''} = \frac{\rho _+-\rho _{\mathrm {c}}}{\rho -\rho _{\mathrm {c}}}, \end{aligned}$$

which are functions of both \(\varPhi ^+\) and \(\varPhi ^0\).
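
The exponent identity \(-E_+[\varPsi ]+E_0\alpha [\varPsi ] = -\tilde{\varepsilon } V \rho _{\mathrm {c}} - \frac{1}{V}\sum _{k\in \varLambda ^*_+}(e_k-\tilde{\varepsilon })|\varPsi _k|^2\) used before the second change of variables can be checked numerically on a toy configuration. In the sketch below, V is taken to be the total number of modes, the condensate amplitudes are rescaled so that \(\rho _0=\varDelta \), and the input values and the function name are assumptions for illustration.

```python
def exponent_identity(e_plus, e_zero, psi_plus_sq, phi_zero_sq, rho):
    """Return (lhs, rhs) of -E_+ + E_0 alpha = -eps V rho_c - (1/V) sum (e_k - eps)|Psi_k|^2.

    psi_plus_sq and phi_zero_sq hold the values |Psi_k|^2 and |Phi_k|^2.
    """
    V = len(e_plus) + len(e_zero)
    rho_c = sum(1.0 / e for e in e_plus) / V
    delta = rho - rho_c                       # Delta, assumed strictly positive
    # rescale the condensate so that rho_0 = sum |Phi_k|^2 / V^2 equals Delta
    scale = delta / (sum(phi_zero_sq) / V ** 2)
    phi_zero_sq = [scale * s for s in phi_zero_sq]
    E0 = sum(e * s for e, s in zip(e_zero, phi_zero_sq)) / V
    N0 = sum(phi_zero_sq) / V
    eps = E0 / N0                             # tilde epsilon
    E_plus = sum(e * s for e, s in zip(e_plus, psi_plus_sq)) / V
    rho_plus = sum(psi_plus_sq) / V ** 2
    alpha = (rho_plus - rho_c) / delta
    lhs = -E_plus + E0 * alpha
    rhs = -eps * V * rho_c - sum((e - eps) * s
                                 for e, s in zip(e_plus, psi_plus_sq)) / V
    return lhs, rhs
```

The two returned values agree up to rounding for any admissible input, which reflects that the identity is purely algebraic once \(\rho _0=\varDelta \) is imposed.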

To summarize the result, let us define the functions

$$\begin{aligned} \rho '[\varPhi ] := \frac{1}{V^2} \sum _{k\in \varLambda ^*_+} \frac{e_k}{e_k- \tilde{\varepsilon }[\varPhi ]} |\varPhi _k|^2, \qquad \alpha '[\varPhi ] := \frac{\rho '[\varPhi ]-\rho _{\mathrm {c}}}{\rho -\rho _{\mathrm {c}}}, \end{aligned}$$

and, using these, the weight function

$$\begin{aligned} g[\varPhi ] := \frac{Z_1}{Z_{0}} {\mathbb {1}}_{\{\rho '[\varPhi ] < \rho \}} (1-\alpha '[\varPhi ])^{V_0-1} \end{aligned}$$
(5.2)

and the change of variables

$$\begin{aligned} G(\varPhi )_k := {\left\{ \begin{array}{ll} \left( 1-\frac{\tilde{\varepsilon }[\varPhi ]}{e_{k}}\right) ^{-\frac{1}{2}} \varPhi _k,&{} \text {for }k\in \varLambda ^*_+,\\ \left( 1-\alpha '[\varPhi ]\right) ^{\frac{1}{2}} \varPhi _k,&{} \text {for }k\in \varLambda ^*_0. \end{array}\right. } \end{aligned}$$
(5.3)

Then the above computation shows that

$$\begin{aligned}&\langle f\rangle _{\mu _0} = \int \mu _1[ \mathrm{d}\varPhi ] \, g[\varPhi ] f(G[\varPhi ]). \end{aligned}$$
(5.4)

Since \(0\le g[\varPhi ] \le \frac{Z_1}{Z_{0}} \left( \frac{\rho }{\varDelta }\right) ^{V_0-1}\), we can then use the dominated convergence theorem to conclude that in fact (5.4) holds for all bounded continuous functions f.
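
The effect of the map G in (5.3) on the densities can be checked on a toy configuration: if \(\rho _0[\varPhi ]=\varDelta \) and \(\rho '[\varPhi ]<\rho \), then \(\rho _+[G(\varPhi )]=\rho '[\varPhi ]\) and \(\rho _0[G(\varPhi )]+\rho _+[G(\varPhi )]=\rho \). The sketch below takes V to be the total number of modes; the input values and function names are assumptions for illustration.

```python
import math


def apply_G(phi_plus, phi_zero, e_plus, e_zero, rho):
    """Apply the change of variables (5.3); returns (G_plus, G_zero, rho_prime)."""
    V = len(phi_plus) + len(phi_zero)
    rho_c = sum(1.0 / e for e in e_plus) / V
    delta = rho - rho_c
    E0 = sum(e * abs(z) ** 2 for e, z in zip(e_zero, phi_zero)) / V
    N0 = sum(abs(z) ** 2 for z in phi_zero) / V
    eps = E0 / N0                              # tilde epsilon
    rho_prime = sum(e / (e - eps) * abs(z) ** 2
                    for e, z in zip(e_plus, phi_plus)) / V ** 2
    if rho_prime >= rho:
        raise ValueError("rho' >= rho: the weight g vanishes here")
    alpha_prime = (rho_prime - rho_c) / delta
    g_plus = [z / math.sqrt(1.0 - eps / e) for e, z in zip(e_plus, phi_plus)]
    g_zero = [math.sqrt(1.0 - alpha_prime) * z for z in phi_zero]
    return g_plus, g_zero, rho_prime


def density(modes, V):
    """V^{-2} sum_k |Phi_k|^2 over the given modes."""
    return sum(abs(z) ** 2 for z in modes) / V ** 2
```

The caller is responsible for supplying condensate amplitudes with \(\rho _0[\varPhi ]=\varDelta \); the function does not enforce this constraint.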

Note that due to the change of variables implied by G there is a shift in the position of the \(\delta \)-weight. Therefore, the formula does not imply that \(\mu _0\) or \(\mu _1\) would be absolutely continuous with respect to each other (in fact, they are not: the collection of \(\varPhi \) with \(\rho _+[\varPhi ]>\rho \) has zero measure with respect to \(\mu _0\) but its measure is non-zero with respect to \(\mu _1\); conversely, the collection of \(\varPhi \) with \(\rho _0[\varPhi ]\le \frac{\varDelta }{2}\) has zero measure with respect to \(\mu _1\) but non-zero measure with respect to \(\mu _0\)). However, as we will prove next in Lemma 4, the weight g is close to one with high \(\mu _1\)-probability, and although there can be regions where it deviates significantly from one, g remains always uniformly bounded. These estimates will provide sufficient control for using the diagonal coupling trick at the end of the section, in (5.11).

Lemma 4

Using the above definitions, we have

$$\begin{aligned}&\displaystyle - \frac{V_0-1}{1-\delta } \left( \frac{\rho }{\varDelta }\right) ^{V_0-1}\sqrt{\tilde{\delta }} \le 1-\frac{Z_0}{Z_{1}} \le \frac{V_0}{1-\delta } \frac{\rho }{\varDelta } \sqrt{\tilde{\delta }}, \end{aligned}$$
(5.5)
$$\begin{aligned}&\displaystyle \langle |1-g|^2\rangle _{\mu _1} \le \frac{1}{(1-\delta )^2} \left[ \left( \frac{\rho }{\varDelta }\right) ^{2} + 4 V_0^2 \left( \frac{\rho }{\varDelta }\right) ^{2 V_0} \left( \frac{Z_1}{Z_{0}}\right) ^2\right] \tilde{\delta }, \end{aligned}$$
(5.6)
$$\begin{aligned}&\displaystyle \langle (\alpha ')^2\rangle _{\mu _1} \le \frac{\rho ^2}{\varDelta ^2 (1-\delta )^2} \tilde{\delta }, \end{aligned}$$
(5.7)

where

$$\begin{aligned} \tilde{\delta } := 2 \delta + \frac{1}{V^2\rho _{\mathrm {c}}^2} \sum _{k\in \varLambda ^*_+} \frac{1}{e_k^2}. \end{aligned}$$
(5.8)

Proof

Using \(f=1\) in (5.4), we find that \(\langle g\rangle _{\mu _1}= 1\), and thus

$$\begin{aligned} \frac{Z_0}{Z_{1}} = \langle {\mathbb {1}}_{\{\rho ' < \rho \}} (1-\alpha ')^{V_0-1}\rangle _{\mu _1}, \end{aligned}$$

where \(-\frac{\rho _c}{\varDelta }\le \alpha '<1\), and hence \(0<1-\alpha '\le 1+\frac{\rho _c}{\varDelta } = \frac{\rho }{\varDelta }\). Therefore,

$$\begin{aligned} 1-\frac{Z_0}{Z_{1}} = \langle {\mathbb {1}}_{\{\rho ' \ge \rho \}} \rangle _{\mu _1} + \langle {\mathbb {1}}_{\{\rho ' < \rho \}} \left[ 1-(1-\alpha ')^{V_0-1}\right] \rangle _{\mu _1}, \end{aligned}$$

which implies that

$$\begin{aligned}&- \langle {\mathbb {1}}_{\{\rho '< \rho ,\, \alpha '< 0\}} \left[ (1-\alpha ')^{V_0-1}-1\right] \rangle _{\mu _1} \le 1-\frac{Z_0}{Z_{1}}\\&\quad \le \langle {\mathbb {1}}_{\{\rho ' \ge \rho \}} \rangle _{\mu _1} + \langle {\mathbb {1}}_{\{\rho ' < \rho ,\, \alpha ' > 0\}} \left[ 1-(1-\alpha ')^{V_0-1}\right] \rangle _{\mu _1}. \end{aligned}$$

On the left hand side, the integrand is zero unless \(-\frac{\rho _c}{\varDelta }\le \alpha '< 0\). Thus either \(V_0=1\) and the term is always zero, or we may bound in the integrand \((1-\alpha ')^{V_0-1}-1\le |\alpha '| (V_0-1) (\frac{\rho }{\varDelta })^{V_0-2}\). Therefore, we can always bound the expectation from above by \( (V_0-1) (\frac{\rho }{\varDelta })^{V_0-2} \langle |\alpha '| \rangle _{\mu _1}\). On the right hand side, we have \(0\le 1-(1-\alpha ')^{V_0-1}\le |\alpha '| (V_0-1)\) for \(\alpha '>0\), and for \(\rho '\ge \rho \), it holds that \(\alpha '\ge 1\). Therefore,

$$\begin{aligned} \langle {\mathbb {1}}_{\{\rho ' \ge \rho \}} \rangle _{\mu _1} + \langle {\mathbb {1}}_{\{\rho ' < \rho ,\, \alpha ' > 0\}} \left[ 1-(1-\alpha ')^{V_0-1}\right] \rangle _{\mu _1} \le V_0 \langle |\alpha '|\rangle _{\mu _1}. \end{aligned}$$

We have obtained the bounds

$$\begin{aligned} - (V_0-1) \left( \frac{\rho }{\varDelta }\right) ^{V_0-2} \langle |\alpha '| \rangle _{\mu _1} \le 1-\frac{Z_0}{Z_{1}} \le V_0 \langle |\alpha '|\rangle _{\mu _1} , \end{aligned}$$

which imply also that

$$\begin{aligned}&\left| 1-\frac{Z_0}{Z_{1}}\right| ^2 \le \max \left( V_0^2, (V_0-1)^2 \left( \frac{\rho }{\varDelta }\right) ^{2 V_0-4}\right) \langle |\alpha '| \rangle ^2_{\mu _1} \nonumber \\&\quad \le V_0^2 \left( \frac{\rho }{\varDelta }\right) ^{2 (V_0-2)_+}\langle |\alpha '|^2\rangle _{\mu _1}, \end{aligned}$$

where \((r)_+:= r {\mathbb {1}}_{\{r>0\}}\). We may use this result and similar techniques to derive an upper bound for

$$\begin{aligned}&\langle |1-g|^2\rangle _{\mu _1} = \langle {\mathbb {1}}_{\{\rho ' \ge \rho \}} \rangle _{\mu _1} + \langle {\mathbb {1}}_{\{\rho '< \rho \}} |1-(1-\alpha ')^{V_0-1} Z_1/Z_0|^2\rangle _{\mu _1}\\&\quad \le \langle |\alpha '|^2{\mathbb {1}}_{\{\rho ' \ge \rho \}} \rangle _{\mu _1} +2 \left( \frac{Z_1}{Z_{0}}\right) ^2 \left( \left| \frac{Z_0}{Z_{1}}-1\right| ^2 + \langle {\mathbb {1}}_{\{\rho ' < \rho \}} |1-(1-\alpha ')^{V_0-1}|^2\rangle _{\mu _1}\right) \\&\quad \le \langle |\alpha '|^2\rangle _{\mu _1} \left[ 1 + 4 V_0^2 \left( \frac{\rho }{\varDelta }\right) ^{2(V_0-2)_+} \left( \frac{Z_1}{Z_{0}}\right) ^2\right] . \end{aligned}$$

It remains to estimate

$$\begin{aligned} \varDelta ^2 \langle (\alpha ')^2\rangle _{\mu _1} = \langle (\rho '-\rho _{\mathrm {c}})^2\rangle _{\mu _1}, \end{aligned}$$

where

$$\begin{aligned}&\rho ' -\rho _{\mathrm {c}}= \frac{1}{V^2} \sum _{k\in \varLambda ^*_+} \frac{e_k}{e_k- \tilde{\varepsilon }[\varPhi ]} |\varPhi _k|^2 - \frac{1}{V} \sum _{k\in \varLambda ^*_+} \frac{1}{e_k}. \end{aligned}$$

Since \(\frac{\tilde{\varepsilon }[\varPhi ]}{e_{k}} \le \delta \), here

$$\begin{aligned}&\langle (\rho '-\rho _{\mathrm {c}})^2\rangle _{\mu _1} = \frac{1}{V^4} \sum _{k,k'\in \varLambda ^*_+} \left\langle \frac{1}{1- \tilde{\varepsilon }/e_k} \frac{1}{1- \tilde{\varepsilon }/e_{k'}} |\varPhi _k|^2 |\varPhi _{k'}|^2\right\rangle \\&\qquad - 2 \frac{1}{V^3} \sum _{k,k'\in \varLambda ^*_+} \frac{1}{e_{k'}} \left\langle \frac{1}{1- \tilde{\varepsilon }/e_k}|\varPhi _k|^2\right\rangle + \frac{1}{V^2}\sum _{k,k'\in \varLambda ^*_+} \frac{1}{e_{k'} e_k}\\&\quad \le \frac{1}{(1-\delta )^2}\frac{1}{V^4} \sum _{k,k'\in \varLambda ^*_+} \left\langle |\varPhi _k|^2 |\varPhi _{k'}|^2\right\rangle \\&\qquad - 2 \frac{1}{V^3} \sum _{k,k'\in \varLambda ^*_+} \frac{1}{e_{k'}} \left\langle |\varPhi _k|^2\right\rangle + \frac{1}{V^2}\sum _{k,k'\in \varLambda ^*_+} \frac{1}{e_{k'} e_k}. \end{aligned}$$

The remaining Gaussian expectations can be computed explicitly, yielding for \(k\ne k'\)

$$\begin{aligned} \left\langle |\varPhi _k|^2\right\rangle = \frac{V}{e_k},\quad \left\langle |\varPhi _k|^2 |\varPhi _{k'}|^2\right\rangle = \frac{V^2}{ e_k e_{k'}},\quad \left\langle |\varPhi _k|^4\right\rangle = 2\frac{V^2}{e_k^2}. \end{aligned}$$
(5.9)
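The moment identities in (5.9) can be sanity-checked numerically. The following Monte Carlo sketch assumes, purely for illustration, that each \(\varPhi _k\) is a centred complex Gaussian whose real and imaginary parts are independent with variance \(V/(2 e_k)\), so that \(\langle |\varPhi _k|^2\rangle = V/e_k\), and that distinct modes are independent. The function names, the sample size, and the parameter values are hypothetical and not taken from the text.

```python
import math
import random

def sample_abs2(V, e_k, rng):
    """Draw |Phi_k|^2 for a centred complex Gaussian with <|Phi_k|^2> = V / e_k.

    Assumption: real and imaginary parts are independent N(0, V / (2 e_k)).
    """
    sigma = math.sqrt(V / (2.0 * e_k))
    re = rng.gauss(0.0, sigma)
    im = rng.gauss(0.0, sigma)
    return re * re + im * im

def estimate_moments(V, e_k, e_kp, n_samples=200_000, seed=0):
    """Estimate <|Phi_k|^2>, <|Phi_k|^2 |Phi_k'|^2> (k != k') and <|Phi_k|^4>."""
    rng = random.Random(seed)
    m2 = m22 = m4 = 0.0
    for _ in range(n_samples):
        a = sample_abs2(V, e_k, rng)
        b = sample_abs2(V, e_kp, rng)  # independent mode k' != k
        m2 += a
        m22 += a * b
        m4 += a * a
    return m2 / n_samples, m22 / n_samples, m4 / n_samples
```

For instance, `estimate_moments(8.0, 2.0, 4.0)` should return values close to \(V/e_k = 4\), \(V^2/(e_k e_{k'}) = 8\), and \(2 V^2/e_k^2 = 32\), up to sampling error.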

Therefore,

$$\begin{aligned}&\varDelta ^2 \langle (\alpha ')^2\rangle _{\mu _1}\\&\quad \le \frac{1}{(1-\delta )^2}\frac{1}{V^2} \sum _{k,k'\in \varLambda ^*_+} \frac{1}{ e_k e_{k'}} + \frac{1}{(1-\delta )^2}\frac{1}{V^2} \sum _{k\in \varLambda ^*_+} \frac{1}{e_k^2} - \frac{1}{V^2}\sum _{k,k'\in \varLambda ^*_+} \frac{1}{e_{k'} e_k}\\&\quad \le \frac{2\delta }{(1-\delta )^2} \rho _{\mathrm {c}}^2 + \frac{1}{(1-\delta )^2}\frac{1}{V^2} \sum _{k\in \varLambda ^*_+} \frac{1}{e_k^2} \le \frac{\rho ^2}{(1-\delta )^2} \tilde{\delta }, \end{aligned}$$

using the definition in (5.8) and the assumption \(\rho >\rho _{\mathrm {c}}\). Together with the earlier estimates this completes the proof of the Lemma. \(\quad \square \)

Corollary 2

If \(\delta \le \frac{1}{2}\) and \(\tilde{\delta }\le \frac{\varDelta ^2}{2^4 V_0^2\rho ^2}\), then \(Z_1\le 2 Z_0\), \(\langle (\alpha ')^2\rangle _{\mu _1} \le 4 \rho ^2 \varDelta ^{-2}\tilde{\delta }\), and

$$\begin{aligned} 0\le g[\varPhi ]\le 2 \left( \frac{\rho }{\varDelta }\right) ^{V_0-1}{\mathbb {1}}_{\{\rho '[\varPhi ] < \rho \}}, \quad \langle |1-g|^2\rangle _{\mu _1} \le 4 \left( \frac{\rho }{\varDelta }\right) ^{2 V_0} \left( 1+2^4 V_0^2\right) \tilde{\delta }. \end{aligned}$$
(5.10)

The assumptions made in the Theorem indeed guarantee that \(\delta \le \frac{1}{2}\) and \(\tilde{\delta }\le \frac{\varDelta ^2}{2^4 V_0^2\rho ^2}\), since \(\tilde{\delta }\le 2 \varepsilon \). Hence, we may continue the proof of the Theorem assuming that all of the conclusions in Corollary 2 are valid.
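The passage from (5.6) to the second bound in (5.10) is elementary arithmetic, which the following sketch checks on a small parameter grid. It evaluates both right-hand sides, using \(\delta \le \frac{1}{2}\), \(Z_1\le 2 Z_0\), and \(\rho \ge \varDelta \). The function names and the argument `z_ratio` (a stand-in for \(Z_1/Z_0\)) are hypothetical.

```python
def lemma_rhs(rho, Delta, V0, delta, z_ratio, delta_tilde):
    """Right-hand side of (5.6); z_ratio stands for Z_1 / Z_0."""
    x = rho / Delta
    bracket = x ** 2 + 4 * V0 ** 2 * x ** (2 * V0) * z_ratio ** 2
    return bracket * delta_tilde / (1 - delta) ** 2

def corollary_rhs(rho, Delta, V0, delta_tilde):
    """Right-hand side of the second bound in (5.10)."""
    x = rho / Delta
    return 4 * x ** (2 * V0) * (1 + 2 ** 4 * V0 ** 2) * delta_tilde

def corollary_dominates(rho, Delta, V0, delta, z_ratio, delta_tilde=1e-3):
    """True if the bound (5.10) is at least as large as the bound (5.6)."""
    # Small relative slack guards against floating-point rounding in the
    # extremal case delta = 1/2, z_ratio = 2, V0 = 1, where both sides agree.
    return lemma_rhs(rho, Delta, V0, delta, z_ratio, delta_tilde) <= \
        corollary_rhs(rho, Delta, V0, delta_tilde) * (1 + 1e-9)
```

The two bounds coincide when \(\delta = \frac{1}{2}\), \(Z_1 = 2 Z_0\), and \(V_0=1\), so the constants in (5.10) cannot be lowered by this argument alone.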

The above representation allows us to construct a coupling \(\gamma \) between \(\mu _0\) and \(\mu _1\) by combining the change of variables G with the diagonal concentration trick mentioned earlier. Together with the estimates in Corollary 2 this will prove the bound stated for the Wasserstein distance between \(\mu _0\) and \(\mu _1\) in the Theorem. Explicitly, we define a positive Borel measure \(\gamma \) by its action on bounded continuous functions \(F(\varPhi ,\varPsi )\), as follows:

$$\begin{aligned}&\langle F\rangle _{\gamma } := \int \mu _1[\mathrm{d}\varPhi ] \min (1,g[\varPhi ]) F(\varPhi ,G[\varPhi ]) \nonumber \\&\quad + \int \mu _1[\mathrm{d}\varPhi ] \int \mu _1[\mathrm{d}\varPsi ] \frac{1}{Z'} (1-g(\varPhi ))_+ (g(\varPsi )-1)_+ F(\varPhi ,G[\varPsi ]). \end{aligned}$$
(5.11)

Here \((r)_+:= r {\mathbb {1}}_{\{r>0\}}\) and the normalization factor \(Z'\) is given by

$$\begin{aligned} Z' := \langle (1-g)_+\rangle _{\mu _1} = \langle (g-1)_+\rangle _{\mu _1} = \frac{1}{2}\langle |g-1|\rangle _{\mu _1}, \end{aligned}$$

where the second equality follows from the identity \(g=1+(g-1)_+-(1-g)_+\) and the earlier made observation that \(\langle g\rangle _{\mu _1}=1\) by (5.4). The final equality is then a consequence of the identity \(|g-1|=(g-1)_+ +(1-g)_+\). If f is bounded and continuous and \(F(\varPhi ,\varPsi )=f(\varPhi )\), a straightforward computation shows that \(\langle F\rangle _\gamma = \langle f\rangle _{\mu _1}\). If \(F(\varPhi ,\varPsi )=f(\varPsi )\), a similar computation using the representation in (5.4) proves that \(\langle F\rangle _\gamma = \langle f\rangle _{\mu _0}\). Therefore, \(\gamma \) is indeed a coupling between \(\mu _0\) and \(\mu _1\).
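The marginal computation for \(\gamma \) can be illustrated on a finite state space. The sketch below is a discrete analogue of (5.11), under the assumption that the representation (5.4) reads \(\langle f\rangle _{\mu _0} = \langle g\, f\circ G\rangle _{\mu _1}\). The dictionaries `mu1`, `g`, `G` and all function names are hypothetical, and exact rational arithmetic is used so that the marginals can be compared without rounding.

```python
from fractions import Fraction as Fr

def pos(r):
    """Positive part (r)_+."""
    return r if r > 0 else Fr(0)

def build_coupling(mu1, g, G):
    """Discrete analogue of (5.11).

    mu1: dict state -> probability; g: dict state -> density with <g>_{mu1} = 1;
    G: dict state -> state (the change of variables).
    Returns a dict (x, y) -> weight of gamma.
    """
    z_prime = sum(m * pos(1 - g[x]) for x, m in mu1.items())
    gamma = {}

    def add(x, y, w):
        if w:
            gamma[(x, y)] = gamma.get((x, y), Fr(0)) + w

    # First term: mass min(1, g) transported along the map G.
    for x, m in mu1.items():
        add(x, G[x], m * min(Fr(1), g[x]))
    # Second term: leftover mass (1 - g)_+ paired with excess mass (g - 1)_+.
    if z_prime > 0:
        for x, mx in mu1.items():
            for y, my in mu1.items():
                add(x, G[y], mx * my * pos(1 - g[x]) * pos(g[y] - 1) / z_prime)
    return gamma

def marginals(gamma):
    """Return the first and second marginals of gamma as dicts."""
    first, second = {}, {}
    for (x, y), w in gamma.items():
        first[x] = first.get(x, Fr(0)) + w
        second[y] = second.get(y, Fr(0)) + w
    return first, second

def pushforward(mu1, g, G):
    """mu0 under the assumed representation <f>_{mu0} = <g f(G)>_{mu1}."""
    mu0 = {}
    for x, m in mu1.items():
        mu0[G[x]] = mu0.get(G[x], Fr(0)) + m * g[x]
    return mu0
```

With `mu1 = {0: 1/2, 1: 1/4, 2: 1/4}`, `g = {0: 1/2, 1: 2, 2: 1}` and a cyclic map `G`, the first marginal of the resulting `gamma` reproduces `mu1` and the second reproduces `pushforward(mu1, g, G)`, mirroring the two computations described above.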

Using this coupling, we can now conclude that

$$\begin{aligned}&W_p(\mu _1,\mu _0)^p \le \int \mu _1[\mathrm{d}\varPhi ] \min (1,g[\varPhi ]) \Vert \varPhi -G[\varPhi ]\Vert ^p \nonumber \\&\quad + \int \mu _1[\mathrm{d}\varPhi ] \int \mu _1[\mathrm{d}\varPsi ] \frac{1}{Z'} (1-g(\varPhi ))_+ (g(\varPsi )-1)_+ \Vert \varPhi -G[\varPsi ]\Vert ^p. \end{aligned}$$

In particular, in the case \(p=2\), we can simplify the computations by first using the upper bound \(\Vert \varPhi -G[\varPsi ]\Vert ^2\le 2 \Vert \varPhi -\varPsi \Vert ^2+ 2\Vert \varPsi -G[\varPsi ]\Vert ^2\), which shows that

$$\begin{aligned}&W_2(\mu _1,\mu _0)^2 \le 2 \langle g[\varPhi ]\,\Vert \varPhi -G[\varPhi ]\Vert ^2\rangle _{\mu _1} \nonumber \\&\qquad + 2 \int \mu _1[\mathrm{d}\varPhi ] \int \mu _1[\mathrm{d}\varPsi ] \frac{1}{Z'} (1-g(\varPhi ))_+ (g(\varPsi )-1)_+ \Vert \varPhi -\varPsi \Vert ^2. \end{aligned}$$

Let us begin with the second term on the right hand side. The integrand is zero unless \(g(\varPsi )>1\). In particular, then we must have \(\rho '[\varPsi ]<\rho \), implying that \(\Vert \varPsi ^+\Vert ^2 = V \rho _+[\varPsi ]\le V \rho '[\varPsi ]<V \rho \). On the other hand, under the measure \(\mu _1\), it holds almost surely that \(\Vert \varPsi ^0\Vert ^2=V \varDelta \). Therefore, almost surely in the above integrand

$$\begin{aligned} \Vert \varPhi -\varPsi \Vert ^2 \le 2 (\Vert \varPhi \Vert ^2+\Vert \varPsi \Vert ^2) \le 2 (\Vert \varPhi \Vert ^2+V \varDelta + V\rho ). \end{aligned}$$

Taking into account the definition of \(Z'\), we find an estimate

$$\begin{aligned}&\int \mu _1[\mathrm{d}\varPhi ] \int \mu _1[\mathrm{d}\varPsi ] \frac{1}{Z'} (1-g(\varPhi ))_+ (g(\varPsi )-1)_+ \Vert \varPhi -\varPsi \Vert ^2 \\&\quad \le 2 \int \mu _1[\mathrm{d}\varPhi ] (\Vert \varPhi \Vert ^2+V \varDelta + V\rho )(1-g(\varPhi ))_+\\&\quad \le 2 \left[ \int \mu _1[\mathrm{d}\varPhi ] \Vert \varPhi \Vert ^2 |1-g(\varPhi )| + V (\rho +\varDelta ) \int \mu _1[\mathrm{d}\varPhi ] |1-g(\varPhi )| \right] \\&\quad \le 2 \left( \langle \Vert \varPhi \Vert ^4\rangle _{\mu _1}^{\frac{1}{2}}+V(\rho +\varDelta )\right) \langle (1-g)^2\rangle _{\mu _1}^{\frac{1}{2}}. \end{aligned}$$

Using the definitions, we find that \(\Vert \varPhi \Vert ^2=\Vert \varPhi ^+\Vert ^2+\Vert \varPhi ^0\Vert ^2\). Therefore,

$$\begin{aligned}&\langle \Vert \varPhi \Vert ^4\rangle _{\mu _1} = \langle (\Vert \varPhi ^+\Vert ^2+\Vert \varPhi ^0\Vert ^2)^2\rangle _{\mu _1} \le 2\left( \langle \Vert \varPhi ^+\Vert ^4\rangle _{\mu _1} +V^2 \varDelta ^2\right) \end{aligned}$$

and using the expectations computed in (5.9)

$$\begin{aligned} \langle \Vert \varPhi ^+\Vert ^4\rangle _{\mu _1}&=\int _{\varLambda _+^*}\!\mathrm{d}k_1\int _{\varLambda _+^*}\!\mathrm{d}k_2\, \langle |\varPhi ^+(k_1)|^2 |\varPhi ^+(k_2)|^2 \rangle \\&= V^{-2} \sum _{k,k'\in \varLambda _+^*,\,k'\ne k} \frac{V^2}{e_k e_{k'}} + V^{-2}\sum _{k\in \varLambda _+^*} 2 \frac{V^2}{e_k^2} \le 2 V^2 \rho _{\mathrm {c}}^2. \end{aligned}$$

By assumption, this term is bounded by \(2 V^2 \rho ^2\), and we may conclude that

$$\begin{aligned} \langle \Vert \varPhi \Vert ^4\rangle _{\mu _1} \le 2\left( 2 V^2 \rho ^2+V^2 \varDelta ^2\right) \le 2^2 V^2 (\rho +\varDelta )^2. \end{aligned}$$

Therefore,

$$\begin{aligned}&W_2(\mu _1,\mu _0)^2 \le 2 \langle g[\varPhi ]\Vert \varPhi -G[\varPhi ]\Vert ^2\rangle _{\mu _1} + 12 (\rho +\varDelta ) L^d \langle (1-g)^2\rangle _{\mu _1}^{\frac{1}{2}}. \end{aligned}$$

By Corollary 2, \(\langle (1-g)^2\rangle _{\mu _1}^{\frac{1}{2}}\le 2^3 V_0 (\rho /\varDelta )^{V_0} \sqrt{2 \tilde{\delta }}\), and thus the second term is bounded by a constant \(3\cdot 2^6 (\rho +\varDelta ) V_0 (\rho /\varDelta )^{V_0}\) times \(L^d \sqrt{\varepsilon }\). In addition, using the definition (5.3) and Corollary 2, we find for the first term

$$\begin{aligned}&2 \langle g[\varPhi ]\Vert \varPhi -G[\varPhi ]\Vert ^2\rangle _{\mu _1}\\&\le 4 \left( \frac{\rho }{\varDelta }\right) ^{V_0-1}\int _{\varLambda _+^*}\!\mathrm{d}k\, \left\langle {\mathbb {1}}_{\{\rho '[\varPhi ]< \rho \}} \Bigl [1-\Bigl (1-\frac{\tilde{\varepsilon }[\varPhi ]}{e_{k}}\Bigr )^{-\frac{1}{2}}\Bigr ]^2 |\varPhi _k|^2\right\rangle _{\!\!\mu _1}\\&\quad + 4 \left( \frac{\rho }{\varDelta }\right) ^{V_0-1} \int _{\varLambda _0^*}\!\mathrm{d}k\, \left\langle {\mathbb {1}}_{\{\rho '[\varPhi ] < \rho \}} \Bigl [1-\Bigl (1-\alpha '[\varPhi ]\Bigr )^{\frac{1}{2}}\Bigr ]^2 |\varPhi _k|^2\right\rangle _{\!\mu _1}. \end{aligned}$$

Here, whenever \(\rho '[\varPhi ] < \rho \) and \(k\in \varLambda _+^*\), we may use the definition of the relative energy gap and the identity \(1-1/\sqrt{c}=(c-1)/(c+\sqrt{c})\), valid for all \(c> 0\), to estimate

$$\begin{aligned} \Bigl [1-\Bigl (1-\frac{\tilde{\varepsilon }}{e_{k}}\Bigr )^{-\frac{1}{2}}\Bigr ]^2\le \frac{\tilde{\varepsilon }^2}{e_{k}^2} \frac{1}{1-\frac{\tilde{\varepsilon }}{e_{k}}} \le \frac{\delta ^2}{1-\frac{\tilde{\varepsilon }}{e_{k}}}. \end{aligned}$$

Therefore,

$$\begin{aligned}&\int _{\varLambda _+^*}\!\mathrm {d}k\, \left\langle {\mathbb {1}}_{\{\rho '[\varPhi ]< \rho \}} \Bigl [1-\Bigl (1-\frac{\tilde{\varepsilon }[\varPhi ]}{e_{k}}\Bigr )^{-\frac{1}{2}}\Bigr ]^2 |\varPhi _k|^2\right\rangle _{\!\!\mu _1}\\ {}&\quad \le \delta ^2 \left\langle {\mathbb {1}}_{\{\rho '[\varPhi ]< \rho \}} \int _{\varLambda _+^*}\!\mathrm {d}k\, \frac{e_{k}}{e_{k}-\tilde{\varepsilon }[\varPhi ]} |\varPhi _k|^2\right\rangle _{\!\!\mu _1} = \delta ^2 V \left\langle {\mathbb {1}}_{\{\rho '[\varPhi ] < \rho \}} \rho '[\varPhi ]\right\rangle _{\!\mu _1}\\ {}&\quad \le \rho \delta ^2 L^d. \end{aligned}$$

Similarly, we have \(1-\sqrt{c}=(1-c)/(1+\sqrt{c})\) for all \(c\ge 0\), and thus

$$\begin{aligned} \Bigl [1-\Bigl (1-\alpha '\Bigr )^{\frac{1}{2}}\Bigr ]^2 \le |\alpha '|^2. \end{aligned}$$
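Both elementary inequalities used here follow from the stated identities, since \((c+\sqrt{c})^2\ge c\) and \((1+\sqrt{c})^2\ge 1\). The following sketch checks them numerically on a grid; the function names and grid ranges are hypothetical.

```python
import math

def lhs_plus(c):
    """[1 - c**(-1/2)]**2 for c > 0, with c playing the role of 1 - eps/e_k."""
    return (1.0 - 1.0 / math.sqrt(c)) ** 2

def bound_plus(c):
    """(1 - c)**2 / c, obtained from 1 - 1/sqrt(c) = (c - 1)/(c + sqrt(c))."""
    return (1.0 - c) ** 2 / c

def lhs_zero(alpha):
    """[1 - sqrt(1 - alpha)]**2 for alpha < 1."""
    return (1.0 - math.sqrt(1.0 - alpha)) ** 2

def check_inequalities(tol=1e-12):
    """Check both bounds on a grid; returns True if no violation is found."""
    cs = [0.01 * i for i in range(1, 301)]            # c in (0, 3]
    alphas = [-3.0 + 0.01 * i for i in range(0, 400)]  # alpha in [-3, 1)
    ok_plus = all(lhs_plus(c) <= bound_plus(c) + tol for c in cs)
    ok_zero = all(lhs_zero(a) <= a * a + tol for a in alphas)
    return ok_plus and ok_zero
```

A grid check is of course no substitute for the algebraic argument; it only guards against sign or exponent slips when transcribing the bounds.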

Since the weight is the same for all components \(k\in \varLambda _0^*\), we find using Corollary 2

$$\begin{aligned}&\int _{\varLambda _0^*}\!\mathrm{d}k \left\langle {\mathbb {1}}_{\{\rho '[\varPhi ]< \rho \}} \Bigl [1-\Bigl (1-\alpha '[\varPhi ]\Bigr )^{\frac{1}{2}}\Bigr ]^2 |\varPhi _k|^2\right\rangle _{\!\!\mu _1}\\&\quad \le \left\langle {\mathbb {1}}_{\{\rho '[\varPhi ] < \rho \}}|\alpha '[\varPhi ]|^2 V \rho _0[\varPhi ]\right\rangle _{\!\mu _1} \le 4 \varDelta ^{-1} \rho ^2 L^d \tilde{\delta }. \end{aligned}$$

Therefore, since \(\delta \le \frac{1}{2}\) and \(\delta \le \frac{\varepsilon }{2}\), we can add up and simplify the above bounds to arrive at the estimate

$$\begin{aligned} 2 \langle g[\varPhi ]\Vert \varPhi -G[\varPhi ]\Vert ^2\rangle _{\mu _1} \le 2^5 (\rho +\varDelta ) (\rho /\varDelta )^{V_0} L^d \varepsilon . \end{aligned}$$

The assumptions about \(\varepsilon \) allow simplifying this slightly to make the weight comparable to that of the first term. Namely, since now \(\sqrt{\varepsilon }\le \varDelta /(4 \rho )\le 2^{-2}\), we have proven that

$$\begin{aligned}&W_2(\mu _1,\mu _0)^2 \le \left( 2^3 (\rho +\varDelta ) (\rho /\varDelta )^{V_0} + 3\cdot 2^6 (\rho +\varDelta ) V_0 (\rho /\varDelta )^{V_0}\right) L^d \sqrt{\varepsilon }\\&\quad \le 2^8 (\rho +\varDelta ) V_0 (\rho /\varDelta )^{V_0} L^d \sqrt{\varepsilon }. \end{aligned}$$

Taking the square root, we conclude that the claim in the Theorem follows from the assumptions for the measure \(\mu _1\) defined in (5.1) and the explicit form for the constant \(C_2\) stated in the Theorem. \(\quad \square \)

Proof of Proposition 1, item 3. If \(e_k=0\) for all \(k\in \varLambda ^*_0\), we are back to the case in item 2, and since then \(\mu '_1=\mu _1\), its conclusions imply also the conclusions of item 3 whenever \(0\le \tilde{\varepsilon }\le 1\).

Suppose thus that there is some \(k\in \varLambda ^*_0\) for which \(e_k>0\) and that there is \(\tilde{\varepsilon }\le 1\) for which \(e_k\le \frac{1}{2 \rho }L^{-d}\tilde{\varepsilon }\) for all \(k\in \varLambda _0^*\). Clearly, then \(\tilde{\varepsilon }>0\). Comparing the definitions of \(\mu _1\) and \(\mu '_1\), we have \(\mu _1[\mathrm{d}\varPhi ]=g_1(\varPhi )\mu '_1[\mathrm{d}\varPhi ]\) for

$$\begin{aligned} g_1(\varPhi ):= \frac{Z'_1}{Z_1} g_2(\varPhi ), \qquad g_2(\varPhi ) := \mathrm{e}^{-E_0[\varPhi ]\left( 1-\frac{\rho _{\mathrm {c}}}{\varDelta }\right) } \prod _{k\in \varLambda ^*_+} \left( 1-\frac{E_0[\varPhi ]L^{-d}}{e_{k} \varDelta } \right) ^{-1}. \end{aligned}$$

Here \(g_2\) depends only on \(\varPhi ^0\) and satisfies \(\langle g_2\rangle _{\mu _1'}=\frac{Z_1}{Z'_1}\).

As before, the assumptions are tailored to guarantee that \(g_1\) remains close to one, and then an explicit good coupling can be found between \(\mu _1\) and \(\mu '_1\). As the small parameter we use here

$$\begin{aligned} \delta ' := \rho V \max _{k\in \varLambda ^*_0} e_k\le \frac{1}{2}\tilde{\varepsilon }\le \frac{1}{2}. \end{aligned}$$

In particular, we now have almost surely under \(\mu _1'\)

$$\begin{aligned} 0\le E_0[\varPhi ]\le \max _{k\in \varLambda _0^*}e_k N_0[\varPhi ] = V \varDelta \frac{\delta '}{\rho V }= \delta '\frac{\varDelta }{\rho } \le \delta '. \end{aligned}$$

Since \(-\ln (1-c)\le 2 c\) for \(0\le c\le \frac{1}{2}\), we find using the earlier assumption \(\delta \le \frac{1}{2}\) that almost surely under \(\mu _1'\)

$$\begin{aligned} 0\le -\ln \left( 1-\frac{E_0[\varPhi ]L^{-d}}{e_{k} \varDelta } \right) \le 2\frac{E_0[\varPhi ]L^{-d}}{e_{k} \varDelta }\le 2 \delta '\frac{1}{\rho V e_{k}}, \end{aligned}$$

for all \(k\in \varLambda ^*_+\). Therefore,

$$\begin{aligned} 0\le \sum _{k\in \varLambda ^*_+}\ln \left( 1-\frac{E_0[\varPhi ]L^{-d}}{e_{k} \varDelta } \right) ^{-1}\le 2 \delta '\frac{\rho _{\mathrm {c}}}{\rho }\le 2 \delta '. \end{aligned}$$

Similarly, \(E_0[\varPhi ]\frac{\rho _{\mathrm {c}}}{\varDelta }\le \delta '\), and thus we have obtained almost sure bounds

$$\begin{aligned} \mathrm{e}^{-\delta '}\le g_2(\varPhi )\le \mathrm{e}^{3 \delta '}. \end{aligned}$$

Taking expectation over \(\mu '_1\) we find also that

$$\begin{aligned} \mathrm{e}^{-\delta '} \le \frac{Z_1}{Z'_1}\le \mathrm{e}^{3 \delta '}. \end{aligned}$$

Combining these two results shows that almost surely under \(\mu _1'\)

$$\begin{aligned} \mathrm{e}^{-4 \delta '}\le g_1(\varPhi )\le \mathrm{e}^{4 \delta '}. \end{aligned}$$

Since \(\delta '\le \frac{1}{2}\), this yields an almost sure bound

$$\begin{aligned} |1-g_1(\varPhi )|\le \mathrm{e}^{4 \delta '}|1-\mathrm{e}^{-4 \delta '}|\le 4 \mathrm{e}^{2} \delta '. \end{aligned}$$
(5.12)
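The bound (5.12) uses \(\mathrm{e}^{x}-1\le x\,\mathrm{e}^{x}\) for \(x=4\delta '\in [0,2]\). A minimal numerical check, with hypothetical function names, compares the largest deviation \(|1-g_1|\) compatible with \(\mathrm{e}^{-4 \delta '}\le g_1\le \mathrm{e}^{4 \delta '}\) against \(4 \mathrm{e}^{2} \delta '\):

```python
import math

def worst_deviation(delta_p):
    """Largest |1 - g1| compatible with exp(-4 d') <= g1 <= exp(4 d')."""
    return max(math.exp(4 * delta_p) - 1.0, 1.0 - math.exp(-4 * delta_p))

def bound_5_12(delta_p):
    """Right-hand side of (5.12)."""
    return 4.0 * math.exp(2.0) * delta_p

def check_5_12(n=100):
    """Check the bound for delta' on a grid in (0, 1/2]."""
    grid = [0.5 * i / n for i in range(1, n + 1)]
    return all(worst_deviation(d) <= bound_5_12(d) for d in grid)
```

At \(\delta '=\frac{1}{2}\) the deviation is at most \(\mathrm{e}^2-1\approx 6.39\), while the bound equals \(2\mathrm{e}^2\approx 14.78\), so the estimate is not tight but suffices for the argument.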

We define a measure \(\gamma _1\) by setting for bounded continuous functions \(F(\varPhi ,\varPsi )\)

$$\begin{aligned} \langle F\rangle _{\gamma _1} :=&\int \mu '_1[\mathrm {d}\varPhi ] \min (1,g_1[\varPhi ]) F(\varPhi ,\varPhi ) \nonumber \\ {}&+ \int \mu '_1[\mathrm {d}\varPhi ] \int \mu '_1[\mathrm {d}\varPsi ] \frac{1}{Z''} (1-g_1(\varPhi ))_+ (g_1(\varPsi )-1)_+ F(\varPhi ,\varPsi )\end{aligned}$$
(5.13)

where

$$\begin{aligned} Z'' := \langle (1-g_1)_+\rangle _{\mu '_1} = \langle (g_1-1)_+\rangle _{\mu '_1}. \end{aligned}$$

Note that, since \(E_0\) is not a constant function, \(g_1\) cannot be a constant function, and hence \(Z''>0\). As before, it is then straightforward to check that the first marginal equals \(\mu '_1\) and the second marginal equals \(\mu _1\).

Therefore, \(\gamma _1\) is a coupling between \(\mu _1\) and \(\mu '_1\), and we have

$$\begin{aligned} W_2(\mu _1,\mu '_1)^2 \le \int \mu '_1[\mathrm{d}\varPhi ] \int \mu '_1[\mathrm{d}\varPsi ] \frac{1}{Z''} (1-g_1(\varPhi ))_+ (g_1(\varPsi )-1)_+ \Vert \varPhi -\varPsi \Vert ^2. \end{aligned}$$

Again, we estimate \(\Vert \varPhi -\varPsi \Vert ^2 \le 2 (\Vert \varPhi \Vert ^2+ \Vert \varPsi \Vert ^2)\), and use the symmetry and definition of \(Z''\) to obtain a bound

$$\begin{aligned} W_2(\mu _1,\mu '_1)^2 \le 2 \langle \Vert \varPhi \Vert ^2 |1-g_1(\varPhi )|\rangle _{\mu _1'}. \end{aligned}$$

Combined with the almost sure bound in (5.12), we find that

$$\begin{aligned} W_2(\mu _1,\mu '_1)^2 \le 2^3 \mathrm{e}^{2} \delta '\langle \Vert \varPhi \Vert ^2\rangle _{\mu _1'}. \end{aligned}$$

Here, \(\langle \Vert \varPhi \Vert ^2\rangle _{\mu _1'} = \langle \Vert \varPhi ^+\Vert ^2+\Vert \varPhi ^0\Vert ^2\rangle _{\mu _1'} = V \rho _{\mathrm {c}}+ V \varDelta =V \rho \). Therefore,

$$\begin{aligned} W_2(\mu _1,\mu '_1)^2 \le 2^5 V \rho \tilde{\varepsilon }. \end{aligned}$$

Note that we obtained a better dependence on \(\tilde{\varepsilon }\) than on \(\varepsilon \) in the earlier estimate since we did not need to use the Schwarz inequality above. This was possible here since the weight \(g_1\) is almost surely close to one unlike the weight g which is close to one only with high probability.

Since the Wasserstein metric satisfies the triangle inequality, we can now combine the above bound with the one proved in Theorem 1, and conclude that

$$\begin{aligned} W_2(\mu _0,\mu '_1) \le W_2(\mu _0,\mu _1)+W_2(\mu _1,\mu '_1) \le L^{\frac{d}{2}} 2^4 \sqrt{V_0 (\rho +\varDelta )} \left( (\rho /\varDelta )^{\frac{V_0}{2}}\varepsilon _L^{\frac{1}{4}}+ \tilde{\varepsilon }^{\frac{1}{2}}\right) , \end{aligned}$$

as claimed in the Proposition. \(\quad \square \)

6 Proof of the Existence of the Energy Gap, Lemma 1

Here we suppose \(d\ge 3\) and consider a dispersion relation \(\omega \) which satisfies Assumption 1. For each L, define \(\omega _0\) and \(e_k\), \(k\in \varLambda ^*\), as in Definition 1. We choose \(\kappa \) such that \(0<\kappa <\frac{d}{2}\), if \(d\ge 4\), and \(0<\kappa < 1\), if \(d=3\), and fix its value for the rest of the proof. In principle, only the local behaviour of \(\omega \) around its global minima will matter, but the proof is complicated by the fact that the local behaviour in a neighbourhood of each minimum can be different and the values of \(e_k\) can become mixed between the minima.

The proof consists of several steps. The steps are not completely independent, and each step may use estimates and notation accumulated from the previous steps. Although the proof is not split into technical Lemmas, the steps highlight its structure, each having a specific goal, as listed below:

  1. Isolate sufficiently small neighbourhoods in \({\mathbb T}^d\) around each minimum of \(\omega \) so that second order Taylor series bounds its behaviour in the neighbourhood.

  2. Choose sufficiently large L so that the rectangular grid \(\varLambda ^*(L)\) has some points in each neighbourhood.

  3. Construct a condensate candidate set \(\varLambda ^*_1\) by isolating all small energies, with an energy difference from the lowest energy proportional to \(L^{-2}\). Show that the number of points in this set is bounded by some \(M_0\) which does not depend on \(\kappa \) nor on L.

  4. Use a “pigeon hole” argument to show that this set must contain a large enough relative energy gap. This will fix the condensate wave number set \(\varLambda ^*_0\), hence also \(\varLambda ^*_+\), and complete the proof of item 1 of the Lemma.

  5. Check that the relative energy gap of the construction satisfies item 2 of the Lemma.

  6. Use the previous estimates to find a constant \(c_2\) for the bound (2.18), separately for \(d=3\), \(d=4\), and \(d\ge 5\).

  7. Using an approximation with suitable Riemann sums, prove the estimates (2.19) and (2.20) for the continuum limit \(L\rightarrow \infty \).

(Step 1) Consider a point \(k_0\in T_0\) where \(\omega (k_0)=\omega _{\text {min}}\). Since \(k_0\) is a non-degenerate minimum of a twice continuously differentiable function \(\omega \), we have \(\nabla \omega (k_0)=0\) and the eigenvalues of \(D^2 \omega (k_0)\) are strictly positive. Let \(\lambda _-\) and \(\lambda _+\) denote the smallest and, respectively, the largest of these eigenvalues as \(k_0\) varies through the elements in \(T_0\). Then \(0<\lambda _-\le \lambda _+\). By continuity of \(D^2 \omega \) there is \(\delta >0\) such that \(\delta <\frac{1}{2}\), and whenever \(k_0\in T_0\), \(|k-k_0|<\delta \), and \(p\in {\mathbb R}^d\) we have

$$\begin{aligned} \frac{\lambda _-}{2}|p|^2< p\cdot (D^2 \omega (k)p) < 2 \lambda _+ |p|^2. \end{aligned}$$

As \(T_0\) is finite, we can also assume that the balls \(B(k_0,\delta )\) are disjoint, by choosing a smaller \(\delta \) if this is not true initially. The complement of their union, the set \({\mathbb T}^d {\setminus } \left( \cup _{k_0\in T_0} B(k_0,\delta )\right) \), is compact, and thus the continuous function \(\omega \) attains a minimum value \(\omega _2\) there. Then we must have \(\omega _2>\omega _{\text {min}}\), since otherwise the point k at which \(\omega (k)=\omega _2\) would belong to \(T_0\). Furthermore, by a Taylor expansion up to second order around \(k_0\), we find that if \(k_0\in T_0\) and \(|k-k_0|<\delta \), then

$$\begin{aligned} \frac{\lambda _-}{4} |k-k_0|^2 \le \omega (k)-\omega _{\text {min}}\le \lambda _+ |k-k_0|^2,\qquad |\nabla \omega (k)| \le 2 \lambda _+ |k-k_0|. \end{aligned}$$
(6.1)

(Step 2) We are going to define a cut-off size \(L_0\), and consider lattices with \(L\ge L_0\). We begin by assuming that \(L_0 \in {\mathbb N}_+\) satisfies

$$\begin{aligned} L_0>\frac{\sqrt{d}}{2 \delta },\quad L_0\ge \left[ \frac{c_0}{\omega _2-\omega _{\text {min}}}\right] ^{\frac{1}{2}}, \end{aligned}$$
(6.2)

where \(c_0\) is an L-independent constant depending on \(\omega \) via \(\lambda _+\),

$$\begin{aligned} c_0 := \frac{\lambda _+ d}{2}. \end{aligned}$$
(6.3)

For any such \(\varLambda ^*(L)\), let us first isolate the minimum value of \(\omega \) on these points, i.e., set as in the Lemma

$$\begin{aligned} \omega _0(L) := \min _{k\in \varLambda ^*} \omega (k). \end{aligned}$$

As shown by the examples in Sect. 4, \(\omega _0\) may then depend on L, and even if \(\omega \) has more than one minimum point on \({\mathbb T}^d\), the value of \(\omega _0\) could be unique.

Since \(\varLambda ^*\) forms a rectangular grid with side length \(\frac{1}{L}\) on \({\mathbb T}^d\), to any point \(k\in {\mathbb T}^d\) there is a point \(k'\in \varLambda ^*\) such that \(|k - k'|_\infty \le \frac{1}{2 L}\). Since \(|p|_\infty = \max _i |p_i|\ge d^{-\frac{1}{2}} |p|\), then \(|k - k'| \le \frac{\sqrt{d}}{2 L}\le \frac{\sqrt{d}}{2 L_0}< \delta \). Therefore, if \(k_0\in T_0\), there is \(k_0'\in \varLambda ^*\) for which \(|k'_0 - k_0|\le \frac{\sqrt{d}}{2 L} <\delta \), and thus \(\omega (k'_0)-\omega _{\text {min}}\le \lambda _+ |k'_0-k_0|^2 \le \frac{\lambda _+ d}{4} L^{-2}\). This implies that

$$\begin{aligned} 0\le \omega _0(L)-\omega _{\text {min}}\le \frac{c_0}{2} L^{-2}. \end{aligned}$$

In particular, \(\omega _0(L)\rightarrow \omega _{\text {min}}\) as \(L\rightarrow \infty \).
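The bound on \(\omega _0(L)-\omega _{\text {min}}\) can be illustrated with a concrete dispersion relation. The sketch below assumes, for illustration only, the shifted nearest neighbour form \(\omega (k)=\sum _i (1-\cos (2\pi (k_i-s_i)))\) on the unit torus, for which \(\omega _{\text {min}}=0\), \(T_0=\{s\}\), and \(\lambda _\pm =(2\pi )^2\), and it takes \(\varLambda ^*=\{n/L\}\) with \(n\in \{0,\ldots ,L-1\}^d\). These choices and all function names are hypothetical and not taken from the text.

```python
import itertools
import math

def omega(k, shift):
    """Hypothetical dispersion: sum_i (1 - cos(2 pi (k_i - shift_i))); minimum 0 at k = shift."""
    return sum(1.0 - math.cos(2.0 * math.pi * (ki - si)) for ki, si in zip(k, shift))

def omega0(L, shift):
    """Minimum of omega over the grid {n / L : n in {0, ..., L-1}^d}."""
    d = len(shift)
    return min(
        omega(tuple(n / L for n in ns), shift)
        for ns in itertools.product(range(L), repeat=d)
    )

def c0(d):
    """c_0 = lambda_+ d / 2 as in (6.3), with lambda_+ = (2 pi)^2 for this omega."""
    return (2.0 * math.pi) ** 2 * d / 2.0

def grid_minimum_within_bound(L, shift):
    """Check 0 <= omega_0(L) - omega_min <= (c_0 / 2) L^{-2}, where omega_min = 0."""
    value = omega0(L, shift)
    return 0.0 <= value <= 0.5 * c0(len(shift)) / L ** 2
```

Choosing a shift such as `(0.123, 0.456)`, which is not a grid point for small L, makes \(\omega _0(L)\) genuinely L-dependent while it still converges to \(\omega _{\text {min}}\) at the rate \(L^{-2}\).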

(Step 3) We recall that \(e_k = \omega (k)-\omega _0\) for \(k\in \varLambda ^*\), and consider the following set of k which have an energy close to the ground state:

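$$\begin{aligned} \varLambda ^*_1 := \left\{ k\in \varLambda ^*\,\left| \, e_k < \frac{c_0}{2} L^{-2}\right. \right\} . \end{aligned}$$
(6.4)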

Clearly, any minimum point has \(\omega (k)=\omega _0\) and thus it belongs to \(\varLambda ^*_1\). Hence, \(\varLambda ^*_1\) is not empty. In addition, the second inequality in (6.2) implies that if \(k\in \varLambda ^*_1\), then \(\omega (k)-\omega _{\text {min}}= e_k + \omega _0-\omega _{\text {min}}< c_0 L^{-2}\le \omega _2-\omega _{\text {min}}\). Therefore, to each \(k\in \varLambda ^*_1\), we can find a unique \(k_0\in T_0\) such that \(|k-k_0|<\delta \) and the inequalities (6.1) hold.

For each \(k_0\in T_0\), let us next consider the values in the subset

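$$\begin{aligned} \varLambda ^*(k_0;L) := \left\{ k\in \varLambda ^*\,\left| \, |k-k_0| < \delta \right. \right\} . \end{aligned}$$
(6.5)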

By the same reasoning as above, we can find \(n_0\in {\mathbb Z}^d\) for which \(|n_0-L k_0|_\infty \le \frac{1}{2}\). Therefore, it is possible to reparameterize the values in \(\varLambda ^*(k_0;L)\) defining \(m(k)=(L k -n_0)\bmod \varLambda _L\) for each \(k\in \varLambda ^*(k_0;L)\). Note that then for all \(k\in \varLambda ^*(k_0;L)\) we have \(L k=(n_0+m(k))\bmod \varLambda _L\) and \(L|k-k_0|_\infty = L \inf _{n\in {\mathbb Z}^d}|k-k_0-n|_\infty = |m(k)+n_0-L k_0|_\infty \ge |m(k)|_\infty - \frac{1}{2}\). On the other hand, if \(k\in \varLambda ^*(k_0;L)\cap \varLambda ^*_1\),

$$\begin{aligned} \frac{\lambda _-}{4} |k-k_0|_\infty ^2 \le \frac{\lambda _-}{4} |k-k_0|^2 \le \omega (k)-\omega _{\text {min}}< c_0 L^{-2}, \end{aligned}$$

and thus also

$$\begin{aligned} L|k-k_0|_\infty \le \sqrt{\frac{4 c_0}{\lambda _-}}. \end{aligned}$$

Therefore, then \(|m(k)|_\infty \le \frac{1}{2}+\sqrt{\frac{2 \lambda _+ d}{\lambda _-}}\). We define

$$\begin{aligned} M := \left\lfloor \frac{1}{2}+\sqrt{\frac{2 \lambda _+ d}{\lambda _-}} \right\rfloor , \end{aligned}$$
(6.6)

where \(\lfloor x\rfloor \) denotes the largest integer in \({\mathbb Z}\) less than or equal to \(x\in {\mathbb R}\). Then \(M\ge 0\), and there are at most \((2 M+1)^d\) values \(m\in {\mathbb Z}^d\) which can satisfy \(|m|_\infty \le M\). Even if the maximal number of points occur in \(\varLambda ^*(k_0;L)\cap \varLambda ^*_1\) at each \(k_0\in T_0\), we conclude that there are at most

$$\begin{aligned} M_0 := |T_0| (2 M+1)^d \end{aligned}$$
(6.7)

points in \(\varLambda ^*_1\).
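The constants in (6.6) and (6.7) are straightforward to evaluate. The following sketch computes M and \(M_0\) and confirms the lattice point count \((2M+1)^d\) by brute force; the function names and the sample eigenvalue bounds are hypothetical.

```python
import itertools
import math

def cutoff_M(lam_minus, lam_plus, d):
    """M from (6.6): floor(1/2 + sqrt(2 lambda_+ d / lambda_-))."""
    return math.floor(0.5 + math.sqrt(2.0 * lam_plus * d / lam_minus))

def bound_M0(n_minima, lam_minus, lam_plus, d):
    """M_0 from (6.7): |T_0| (2 M + 1)^d."""
    return n_minima * (2 * cutoff_M(lam_minus, lam_plus, d) + 1) ** d

def count_lattice_points(M, d):
    """Brute-force count of m in Z^d with |m|_infty <= M."""
    return sum(
        1
        for m in itertools.product(range(-M - 1, M + 2), repeat=d)
        if max(abs(x) for x in m) <= M
    )
```

For example, with \(\lambda _-=\lambda _+\), \(d=3\), and a single minimum, one finds \(M=\lfloor \frac{1}{2}+\sqrt{6}\rfloor =2\) and \(M_0=5^3=125\), independently of L and \(\kappa \).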

(Step 4) We next construct \(\varLambda ^*_0\) as a subset of \(\varLambda ^*_1\), and then also \(|\varLambda ^*_0|\le M_0\) and \(0\le \omega (k)-\omega _{\text {min}}< c_0 L^{-2}\) for all \(k\in \varLambda ^*_0\). Let us stress that \(M_0\) is indeed independent of L and \(\kappa \), as required in the Lemma. For simplicity, we now add one more requirement for \(L_0\): we assume that \(L_0^d\ge M_0+1\), so that if \(L\ge L_0\), the complement of \(\varLambda ^*_1\) cannot be empty.

To isolate those Fourier modes which behave as a condensate, recall that \(\kappa \) has been fixed to satisfy the requirements of the Lemma. Define \(b'_L= \frac{1}{2}c_0 L^{-d+\kappa }\) and \(r_L:= L^{-\frac{d-2-\kappa }{M_0}}\), to denote the two bounds appearing in item 2 of the Lemma. Then \(r_L\le 1\), since \(L\ge 1\) and the assumptions imply that \(\kappa < d-2\). We also have

$$\begin{aligned} L^2 b'_L = \frac{1}{2}c_0 L^{-d+\kappa +2}= \frac{1}{2}c_0 r_L^{M_0}\le \frac{1}{2}c_0. \end{aligned}$$

Therefore, if \(e_k<b'_L\), also \(e_k<\frac{c_0}{2}L^{-2}\), and thus \(k\in \varLambda ^*_1\). All of these values of k will be included in \(\varLambda ^*_0\) but to find a suitable gap, we might need to include also some values from the remainder set,

$$\begin{aligned} \varLambda ^*_2 := \left\{ k\in \varLambda ^*_1\,\left| \, e_k \ge b'_L\right. \right\} . \end{aligned}$$

If \(\varLambda ^*_2=\emptyset \), we can conclude that \(e_k<b'_L\) for each \(k\in \varLambda ^*_1\) and, if \(k'\in \varLambda ^* {\setminus } \varLambda ^*_1\), we have \(e_{k'} \ge \frac{c_0}{2} L^{-2} = r_L^{-M_0} b'_L \ge r_L^{-1} b'_L > r_L^{-1}e_k\). Therefore, we may then define \(\varLambda ^*_0=\varLambda ^*_1\) and the corresponding split is separated by \([a_L,b_L]\) and has an energy gap \(\delta _L^{-1}\), where \(\delta _L<r_L\), \(a_L:=b'_L\), \(b_L:= r_L^{-M_0} b'_L\ge a_L\).

Suppose thus that \(N_2:=|\varLambda ^*_2|>0\), and enumerate the elements \(k_i\in \varLambda ^*_2\), \(i=1,2,\ldots ,N_2\), so that \(o_i = e_{k_i}\) form an increasing sequence, \(o_{i+1}\ge o_i\) for all i. Define also \(o_{N_2+1} := \min _{k\in \varLambda ^* {\setminus } \varLambda ^*_1} e_k \ge \frac{c_0}{2} L^{-2}\) and \(o_0 := \max _{k\in \varLambda ^*_1 {\setminus } \varLambda ^*_2} e_k< b'_L\). Note that at least all minimum points belong to \(\varLambda ^*_1{\setminus }\varLambda ^*_2\) and our L is large enough so that \(\varLambda ^* {\setminus } \varLambda ^*_1\) cannot be empty. Clearly, also the new sequence of \(o_i\), \(i=0,1,\ldots ,N_2+1\), is increasing. Therefore, we can apply a pigeon hole argument to the relative energies: we have

$$\begin{aligned}&(N_2+1) \max _{i=0,1,\ldots , N_2} \ln \frac{o_{i+1}}{o_i} \ge \sum _{i=0}^{N_2} \ln \frac{o_{i+1}}{o_i} = \ln \left( \prod _{i=0}^{N_2} \frac{o_{i+1}}{o_i}\right) = \ln \left( \frac{o_{N_2+1}}{o_0}\right) \nonumber \\&\quad \ge \ln \left( \frac{c_0}{2 L^2 b'_L}\right) . \end{aligned}$$

The right hand side is equal to \(\ln r_L^{-M_0}=M_0 \ln r_L^{-1}\), and since \(N_2+1 \le |\varLambda ^*_1|\le M_0\), there is at least one \(i\in \{0,1,\ldots , N_2\}\) for which

$$\begin{aligned} \frac{o_{i+1}}{o_i} \ge r_L^{-1}. \end{aligned}$$

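Let j denote the smallest such i, and define

$$\begin{aligned} \varLambda ^*_0 := \left\{ k\in \varLambda ^*\,\left| \, e_k \le o_j\right. \right\} ,\qquad \varLambda ^*_+ := \varLambda ^* {\setminus } \varLambda ^*_0. \end{aligned}$$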

By construction, \(o_0\le o_j< \frac{c_0}{2} L^{-2}\) and thus \(\varLambda ^*_1{\setminus }\varLambda ^*_2\subset \varLambda ^*_0\subset \varLambda ^*_1\). Therefore, neither \(\varLambda ^*_0\) nor its complement \(\varLambda ^*_+\) can be empty, and \(|\varLambda ^*_0|\le M_0\). In addition, \(0\le \omega (k)-\omega _{\text {min}}< c_0 L^{-2}\) for all \(k\in \varLambda ^*_0\), and thus \((\varLambda _0^*,\varLambda ^*_+)\) forms a split of \(\varLambda ^*\) which satisfies item 1 of the Lemma.

(Step 5) In case \(j=0\), we have \(o_j=o_0<b'_L\). Otherwise, \(j\le N_2\le M_0-1\) and, by construction, we have \(o_{i+1}< r_L^{-1} o_i\) for all \(i<j\). Since \(j\le M_0-1\), we find

$$\begin{aligned} o_j \le r_L^{-j} o_{0} < r_L^{-(M_0-1)} b'_L = r_L \frac{c_0}{2} L^{-2}. \end{aligned}$$

Also by construction, if \(k'\in \varLambda ^*_+\), then \(k'\in \varLambda ^*_2\) or \(k'\in \varLambda ^*{\setminus } \varLambda ^*_1\), and in both cases \(e_{k'}\ge b'_L\). Thus we may define \(b_L := \min _{k\in \varLambda ^*_+} e_k\) for which \(b_L\ge b'_L\). In addition, for any \(k\in \varLambda ^*_0\) we have

$$\begin{aligned} e_k \le o_j\le r_L o_{j+1} \le r_L e_{k'}. \end{aligned}$$

Therefore, setting \(a_L:= o_j\), we find that this choice results in a split which is separated by \([a_L,b_L]\) and has an energy gap \(\delta _L^{-1}\), where \(\delta _L\le r_L\).
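The gap selection in Steps 4 and 5 amounts to a simple search over the ordered energies. The sketch below takes an increasing list \(o_0,\ldots ,o_{N_2+1}\) and a ratio \(r=r_L\), and returns the smallest index j with \(o_{j+1}/o_j\ge r^{-1}\); it then splits the energies at \(o_j\). The function names and the sample values are hypothetical, and exact rationals are used to avoid rounding at the threshold.

```python
from fractions import Fraction as Fr

def smallest_gap_index(o, r):
    """Return the smallest j with o[j+1] / o[j] >= 1 / r, or None if there is none.

    o: increasing list of positive energies o_0, ..., o_{N_2+1}; r: 0 < r <= 1.
    """
    for j in range(len(o) - 1):
        if r * o[j + 1] >= o[j]:
            return j
    return None

def split_by_gap(energies, o_j):
    """Split energies into the condensate part (e <= o_j) and the rest (e > o_j)."""
    condensate = [e for e in energies if e <= o_j]
    rest = [e for e in energies if e > o_j]
    return condensate, rest

def gap_is_guaranteed(o, r, M0):
    """Pigeon hole hypothesis: at most M0 consecutive ratios, total ratio >= r^{-M0}."""
    return len(o) - 1 <= M0 and o[-1] * r ** M0 >= o[0]
```

With `r = Fr(1, 2)`, `M0 = 3` and `o = [1, 3/2, 5/2, 8]`, the total ratio equals \(r^{-M_0}=8\), and the search returns \(j=2\). One then has \(o_j=\frac{5}{2}\le r^{-j} o_0 = 4\), in line with the estimate used in Step 5.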

(Step 6) We have now shown that the split \((\varLambda _0^*,\varLambda ^*_+)\) constructed above satisfies also item 2 of the Lemma, and thus only the bounds stated in item 3 remain to be proven. We only need to consider values of \(e_k\) for \(k\in \varLambda ^*_+\) for which we have proven a lower bound \(e_k\ge \frac{1}{2}c_0 L^{-d+\kappa }\). In addition, we may also further divide these values into the sets

$$\begin{aligned} F(k_0) := \varLambda ^*(k_0;L)\cap \varLambda ^*_+, \quad k_0\in T_0, \end{aligned}$$

and \(F' := \varLambda ^*_+ {\setminus } \left( \cup _{k_0\in T_0} F(k_0)\right) \). If \(k\in F'\), we have by construction a lower bound \(e_k\ge \omega _2-\omega _0\) which by (6.2) and item 1 of the Lemma is bounded from below by \(\omega _2-\omega _{\text {min}}-\frac{c_0}{2}L^{-2}\ge \frac{1}{2}\left( \omega _2-\omega _{\text {min}}\right) >0\) for all \(L\ge L_0\). Therefore,

$$\begin{aligned} \sum _{k\in F'} \frac{1}{e_k^2} \le \frac{4}{(\omega _2-\omega _{\text {min}})^2} V=O(L^d). \end{aligned}$$

Let us then consider a fixed \(k_0\in T_0\) and the values \(k\in F(k_0)\). As explained above, we may parameterize these using integers \(m(k)\in \varLambda _L\). If \(|m(k)|_\infty \ge 1\), we have then \(L|k-k_0|_\infty \ge |m(k)|_\infty - \frac{1}{2} \ge \frac{1}{2}|m(k)|_\infty \). On the other hand, then also

$$\begin{aligned} e_k = \omega (k)-\omega _{\text {min}}+\omega _{\text {min}}-\omega _0\ge \frac{\lambda _-}{4}|k-k_0|_\infty ^2-\frac{c_0}{2}L^{-2} \ge \left( \frac{\lambda _-}{2^4}|m(k)|_\infty ^2-\frac{c_0}{2}\right) L^{-2}. \end{aligned}$$

This implies that whenever \(|m(k)|_\infty ^2 \ge \frac{2^4 c_0}{\lambda _-}\), we have \(e_k \ge \frac{\lambda _-}{2^5}|m(k)|_\infty ^2 L^{-2}\). For the remaining values we use the bound in item 2 of the Lemma, and taking into account that \(|m(k)|_\infty \le \frac{L}{2}\), we may conclude that

$$\begin{aligned} \sum _{k\in F(k_0)} \frac{1}{e_k^2} \le \frac{4}{c_0^2} L^{2 d-2 \kappa } \left( 1+2\sqrt{\frac{2^4 c_0}{\lambda _-}}\, \right) ^d + \sum _{m\in {\mathbb Z}^d} {\mathbb {1}}_{\{1\le |m|_\infty \le L/2\}} L^{4} \frac{2^{10}}{\lambda _-^2} |m|_\infty ^{-4}. \end{aligned}$$

The remaining sum satisfies a bound

$$\begin{aligned} \sum _{m\in {\mathbb Z}^d} {\mathbb {1}}_{\{1\le |m|_\infty \le L/2\}} |m|_\infty ^{-4} \le \sum _{n=1}^{L} \frac{1}{n^4} 2 d (2 n+1)^{d-1} \le d 2^{2 d-1} \sum _{n=1}^{L} n^{d-5}. \end{aligned}$$
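The counting behind this bound is that the number of \(m\in {\mathbb Z}^d\) with \(|m|_\infty =n\) equals \((2n+1)^d-(2n-1)^d\le 2d(2n+1)^{d-1}\), and that \(2n+1\le 4n\) for \(n\ge 1\). A brute-force sketch for small d and L is given below; the function names are hypothetical and the chosen values of d and L are only for illustration.

```python
from itertools import product

def shell_count(n, d):
    """Number of m in Z^d whose sup-norm equals n (n >= 1)."""
    return (2 * n + 1) ** d - (2 * n - 1) ** d

def lattice_sum(L, d):
    """Brute-force sum of |m|_inf^(-4) over all m with 1 <= |m|_inf <= L/2."""
    half = L // 2
    total = 0.0
    for m in product(range(-half, half + 1), repeat=d):
        n = max(abs(c) for c in m)
        if n >= 1:
            total += 1.0 / n ** 4
    return total

def middle_bound(L, d):
    """Sum over n = 1..L of 2 d (2n+1)^(d-1) / n^4."""
    return sum(2 * d * (2 * n + 1) ** (d - 1) / n ** 4 for n in range(1, L + 1))

def right_bound(L, d):
    """d 2^(2d-1) times the sum over n = 1..L of n^(d-5)."""
    return d * 2 ** (2 * d - 1) * sum(float(n) ** (d - 5) for n in range(1, L + 1))

for d in (1, 2, 3):
    for L in (4, 8):
        print(d, L, lattice_sum(L, d), middle_bound(L, d), right_bound(L, d))
```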

If \(d\ge 5\), the terms in the sum over n form a non-decreasing sequence, and the sum is bounded by \(L^{d-4}\). If \(d\le 4\), the summands are the values at integer points of the decreasing function \(x^{-(5-d)}\). Thus, by a Riemann sum estimate, we may use the following bound for \(d=4\),

$$\begin{aligned} \sum _{n=1}^{L} n^{-1} \le 1 + \int _1^L\!\mathrm{d}s \, \frac{1}{s} = 1+\ln L, \end{aligned}$$

and for \(d=3\) we obtain

$$\begin{aligned} \sum _{n=1}^{L} n^{-2} \le 1 + \int _1^L\!\mathrm{d}s \,\frac{1}{s^2} = 1+1-\frac{1}{L} \le 2. \end{aligned}$$
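Both Riemann sum estimates are easy to confirm numerically. The sketch below uses a hypothetical helper `power_sum` and a few arbitrarily chosen values of L.

```python
import math

def power_sum(L, power):
    """Sum of n**(-power) for n = 1, ..., L."""
    return sum(1.0 / n ** power for n in range(1, L + 1))

for L in (2, 10, 100, 1000):
    # d = 4: harmonic sum against 1 + ln L.
    print(L, power_sum(L, 1), 1.0 + math.log(L))
    # d = 3: sum of inverse squares against 2 - 1/L.
    print(L, power_sum(L, 2), 2.0 - 1.0 / L)
```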

Collecting the above bounds together we find that there is a constant \(c>0\), which may vary with d but can be chosen independently of L, such that, if \(d=3\),

$$\begin{aligned} \frac{1}{V} \sum _{k\in \varLambda ^*_+} \frac{1}{e_k^2} \le c \left( L^{3-2\kappa }+L\right) , \end{aligned}$$

where \(3-2 \kappa >1\), if \(d=4\),

$$\begin{aligned} \frac{1}{V} \sum _{k\in \varLambda ^*_+} \frac{1}{e_k^2} \le c \left( L^{4-2\kappa }+\ln L+ 1\right) , \end{aligned}$$

where \(4-2 \kappa >0\), and if \(d\ge 5\),

$$\begin{aligned} \frac{1}{V} \sum _{k\in \varLambda ^*_+} \frac{1}{e_k^2} \le c \left( L^{d-2\kappa }+1\right) , \end{aligned}$$

where \(d-2\kappa >0\). In each of the three cases, the first term in the parentheses on the right hand side dominates the remaining terms as \(L\rightarrow \infty \). Therefore, we can always find a constant \(c_2\) so that the bound in (2.18) holds for the fixed choice of \(\kappa \).

(Step 7) For the final estimates (2.19) and (2.20), let us first recall the bounds (6.1) satisfied by \(\omega (k)-\omega _{\text {min}}\) in a \(\delta \)-neighbourhood of any of its zeroes. Using these bounds and spherical coordinates, one finds that the integral (2.19) defining \(\rho _\infty \) is finite for all \(d\ge 3\). Denote the integrand by \(f(k) := \frac{1}{\omega (k)-\omega _{\text {min}}}\) for \(k\in {\mathbb T}^d{\setminus } T_0\), and set \(f(k):=0\) otherwise (this choice is arbitrary). Suppose that \(L\ge L_0\), so that we may use all of the above results; in particular, we continue to use the split \((\varLambda _0^*,\varLambda ^*_+)\) defined above.

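Cover \({\mathbb T}^d\) with closed boxes with side length \(\frac{1}{L}\) and with \(k\in \varLambda ^*\) at the centre of each box, i.e., set for each \(k\in \varLambda ^*\)

$$\begin{aligned} D_k := \left\{ k'\in {\mathbb T}^d \,:\, |k'-k|_\infty \le \frac{1}{2 L} \right\} . \end{aligned}$$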

Clearly, then \(\int _{D_k}\!\mathrm{d}k'\, 1=L^{-d}\), and thus

$$\begin{aligned} \rho _{\mathrm {c}}(L) = L^{-d}\sum _{k\in \varLambda ^*_+} \frac{1}{e_k} = \sum _{k\in \varLambda ^*_+} \int _{D_k}\!\mathrm{d}k'\, \frac{1}{e_k}. \end{aligned}$$
(6.8)

On the other hand, the points of the torus which belong to more than one box form a set of measure zero, so we may write

$$\begin{aligned} \rho _\infty = \int _{{\mathbb T}^d}\!\mathrm{d}k'\, f(k') = \sum _{k\in \varLambda ^*} \int _{D_k}\!\mathrm{d}k'\, f(k'). \end{aligned}$$

Therefore,

$$\begin{aligned} \rho _\infty - \rho _{\mathrm {c}}(L) = \sum _{k\in \varLambda ^*_0} \int _{D_k}\!\mathrm{d}k'\, f(k') + \sum _{k\in \varLambda ^*_+} \int _{D_k}\!\mathrm{d}k'\, \left( f(k')-\frac{1}{e_k} \right) . \end{aligned}$$

We estimate the error in two parts: First, the sum over \(k\in \varLambda ^*_0\) and those \(k\in \varLambda ^*_+\) which are sufficiently close to some \(k_0\in T_0\) can be estimated similarly. For the remaining \(k\in \varLambda ^*_+\) we use differentiability of f and decay of the error with distance from the singular set \(T_0\).

We first recall the above split of \(\varLambda ^*_+\) into \(F'\) and \(F(k_0)\), and consider the sum over \(k\in F(k_0)\) for some fixed \(k_0\in T_0\). Computing directly from the definitions, we find that

$$\begin{aligned} f(k')-\frac{1}{e_k} = \left( \omega (k)-\omega (k')-\omega _0+\omega _{\text {min}}\right) f(k') \frac{1}{e_k}. \end{aligned}$$

Here \(k'\in D_k\), and thus \(|k'-k|_\infty \le \frac{1}{2 L}\). Hence, by convexity of \(D_k\),

$$\begin{aligned} |\omega (k)-\omega (k')|\le |k'-k| \sup _{\xi \in D_k} |\nabla \omega (\xi )|\le \frac{1}{L} \frac{\sqrt{d}}{2}\sup _{\xi \in D_k} |\nabla \omega (\xi )|. \end{aligned}$$
(6.9)

Using again the parameterization of k by m(k) for which \(L|k-k_0|_\infty \le |m(k)|_\infty + \frac{1}{2}\), by the second bound in (6.1) we may estimate for all \(\xi \in D_k\) and sufficiently large L

$$\begin{aligned} |\nabla \omega (\xi )|\le 2 \lambda _+ |\xi -k_0|\le 2 \lambda _+ \sqrt{d}\, |\xi -k_0|_\infty \le 2 \lambda _+ \sqrt{d} \left( \frac{1}{2 L}+\frac{1}{2 L}+\frac{|m(k)|_\infty }{L}\right) . \end{aligned}$$
(6.10)

Therefore, if k is close enough to \(k_0\) so that \(|m(k)|_\infty ^2 < \frac{2^4 c_0}{\lambda _-}+4\), we can conclude that there is a constant \(c'\), independent of L and k, such that for all \(k'\in D_k\)

$$\begin{aligned} |\omega (k)-\omega (k')-\omega _0+\omega _{\text {min}}| \le c' L^{-2}. \end{aligned}$$

Thus the contribution from such k satisfies

$$\begin{aligned} \int _{D_k}\!\mathrm{d}k'\, \left| f(k')-\frac{1}{e_k} \right| \le \frac{c'}{L^2 e_k} \int _{D_k}\!\mathrm{d}k'\, f(k') \le \frac{2 c'}{c_0} L^{d-2-\kappa } \int _{D_k}\!\mathrm{d}k'\, f(k'). \end{aligned}$$

In addition, then \(|k-k_0|\le \sqrt{d}|k-k_0|_\infty \le L^{-1} c''\), for an L-independent constant \(c''>0\). Therefore, the sum of the error terms over these k is bounded by \( \frac{2 c'}{c_0} L^{d-2-\kappa }\) times

$$\begin{aligned} \int _{|k'-k_0|\le c''/L}\!\mathrm{d}k'\, f(k') \le \frac{4}{\lambda _-} |S^{d-1}|\int _0^{c''/L}\!\mathrm{d}r \, r^{d-1-2} = \frac{4}{\lambda _-} |S^{d-1}| \frac{(c'')^{d-2}}{d-2} L^{2-d}. \end{aligned}$$

This proves that the error from these terms is \(O(L^{-\kappa })\) as \(L\rightarrow \infty \).

Since for each \(k\in \varLambda ^*_0\) we know that \(L |k-k_0|_\infty \le \sqrt{4 c_0/\lambda _-}\), an identical argument may be used to conclude that, as \(L\rightarrow \infty \),

$$\begin{aligned} \sum _{k\in \varLambda ^*_0} \int _{D_k}\!\mathrm{d}k'\, f(k') = O(L^{2-d})=O(L^{-\kappa }). \end{aligned}$$

Let us next estimate terms \(k\in F(k_0)\) with \(|m(k)|_\infty ^2 \ge \frac{2^4 c_0}{\lambda _-}+4\). By the earlier computations, we know that then \(e_k \ge \frac{\lambda _-}{2^5}|m(k)|_\infty ^2 L^{-2}\). On the other hand, since \(|m(k)|_\infty \ge 2\), we also have \(|m(k)|_\infty -1 \ge \frac{1}{2} |m(k)|_\infty \), and thus, if \(k'\in D_k\), we may estimate \(|k'-k_0|_\infty \ge |k-k_0|_\infty -|k'-k|_\infty \ge \frac{1}{L}\left( |m(k)|_\infty -1 \right) \ge \frac{1}{2 L}|m(k)|_\infty \). Thus by (6.1)

$$\begin{aligned} \frac{1}{f(k')} = \omega (k')-\omega _{\text {min}}\ge \frac{\lambda _-}{4} |k'-k_0|_\infty ^2 \ge \frac{\lambda _-}{2^4} L^{-2} |m(k)|_\infty ^{2}, \end{aligned}$$

so that both \(1/e_k\) and \(f(k')\) are bounded from above by a constant times \(L^{2} |m(k)|_\infty ^{-2}\).

It is now useful to expand the difference further and integrate the identity

$$\begin{aligned}&f(k')-\frac{1}{e_k} \nonumber \\&\quad = \left( \omega (k)-\omega (k')-\omega _0+\omega _{\text {min}}\right) \frac{1}{e_k^2} +\left( \omega (k)-\omega (k')-\omega _0+\omega _{\text {min}}\right) ^2 \frac{1}{e_k^2} f(k'). \end{aligned}$$

Since \(\int _{D_k}\!\mathrm{d}k'\, (k'_i-k_i)=0\) for any \(i=1,2,\ldots ,d\), we have

$$\begin{aligned} \int _{D_k}\!\mathrm{d}k'\,\left( \omega (k)-\omega (k')\right) {=} -\int _{D_k}\!\mathrm{d}k'\int _0^1\!\mathrm{d}\tau \, (1-\tau ) (k'-k)\cdot D^2\omega ((1-\tau ) k {+} \tau k')(k'-k), \end{aligned}$$

and, therefore,

$$\begin{aligned} \left| \int _{D_k}\!\mathrm{d}k'\,\left( \omega (k)-\omega (k')\right) \right| \le \frac{d}{4 L^2} \sup _{\xi \in D_k}\Vert D^2 \omega (\xi )\Vert \frac{1}{2} L^{-d}. \end{aligned}$$
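The second-order bound can be illustrated in one dimension. The sketch below assumes the example dispersion \(\omega (k)=1-\cos (2\pi k)\) on the one-dimensional torus, which is not taken from the text and for which \(\sup |\omega ''|=4\pi ^2\); the function names and the quadrature resolution are also illustrative choices.

```python
import math

def omega(k):
    # Assumed example dispersion on the 1D torus (an illustration only).
    return 1.0 - math.cos(2.0 * math.pi * k)

def box_integral_of_difference(k, L, steps=2000):
    """Midpoint-rule approximation of the integral of omega(k) - omega(k') over D_k."""
    h = 1.0 / L / steps
    left = k - 0.5 / L
    return sum((omega(k) - omega(left + (i + 0.5) * h)) * h for i in range(steps))

def hessian_bound(L, d=1, sup_hessian=4.0 * math.pi ** 2):
    """Right-hand side: d / (4 L^2) * sup |D^2 omega| * (1/2) * L^(-d)."""
    return d / (4.0 * L ** 2) * sup_hessian * 0.5 * float(L) ** (-d)

for L in (4, 8, 16):
    worst = max(abs(box_integral_of_difference(n / L, L)) for n in range(L))
    print(L, worst, hessian_bound(L))
```

In this example the left hand side decays as \(L^{-3}=L^{-2-d}\) with \(d=1\), in line with the first term on the right hand side of (6.11).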

Since \(\omega \) is twice continuously differentiable, together with (6.9) this shows that there is an L-independent constant \(C'>0\) such that

$$\begin{aligned}&\left| \int _{D_k}\!\mathrm{d}k'\,\left( f(k')-\frac{1}{e_k}\right) \right| \nonumber \\&\quad \le C' L^{-2-d} \frac{1}{e_k^2} + C' \frac{(1+L\sup _{\xi \in D_k} |\nabla \omega (\xi )|)^2}{ L^{4} e_k^2}\int _{D_k}\!\mathrm{d}k'\,f(k'). \end{aligned}$$
(6.11)

Therefore, denoting \(m=m(k)\), using (6.10) to estimate the derivative, and recalling the earlier upper bounds for \(1/e_k\) and \(f(k')\), we find that

$$\begin{aligned} \left| \int _{D_k}\!\mathrm{d}k'\,\left( f(k')-\frac{1}{e_k}\right) \right| \le C'' L^{2-d} |m|^{-4}, \end{aligned}$$

where the constant \(C''\) is independent of L. Estimating the sum over possible values of m as above, we thus find that the contribution from these terms is \(O(L^{-1})\) for \(d=3\), \(O(L^{-2}(1+\ln L))\) for \(d=4\), and \(O(L^{-2})\) for \(d\ge 5\). The first two cases are \(O(L^{-\kappa })\), and thus we have proven that

$$\begin{aligned} \sum _{k\in \varLambda ^*_0} \int _{D_k}\!\mathrm{d}k'\, f(k') + \sum _{k_0\in T_0} \sum _{k\in F(k_0)} \int _{D_k}\!\mathrm{d}k'\, \left( f(k')-\frac{1}{e_k} \right) = O(L^{-\min (\kappa ,2)}), \end{aligned}$$

as required by the Lemma.

It remains to estimate the contribution from the values with \(k\in F'\). Since then \(e_k\ge (\omega _2-\omega _{\text {min}})/2>0\) uniformly in k and L, we may simply use the uniform bound for the gradient in (6.11), and conclude that

$$\begin{aligned}&\sum _{k\in F'} \left| \int _{D_k}\!\mathrm{d}k'\,\left( f(k')-\frac{1}{e_k}\right) \right| \nonumber \\&\quad \le C' L^{-2-d} \sum _{k\in F'} \frac{1}{e_k^2} + C''' L^{-2}\sum _{k\in F'}\int _{D_k}\!\mathrm{d}k'\,f(k') = O(L^{-2}). \end{aligned}$$

Combining all of the above results, we have thus proven that

$$\begin{aligned} \rho _{\mathrm {c}}(L) = \rho _\infty + O(L^{-\min (\kappa ,2)}), \end{aligned}$$

which completes the proof of the Lemma.
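The convergence of \(\rho _{\mathrm {c}}(L)\) to \(\rho _\infty \) can be observed numerically in a simple special case. The sketch below makes several assumptions that are not part of the Lemma: it takes \(d=3\), the nearest neighbour type dispersion \(\omega (k)=\sum _i (1-\cos (2\pi k_i))\) with \(T_0=\{0\}\), sets \(\omega _0=\omega _{\text {min}}=0\), and uses \(\varLambda ^*_0=\{0\}\). For this particular \(\omega \), the infinite volume integral is one third of Watson's integral for the simple cubic lattice, approximately 0.5055.

```python
import math
from itertools import product

def rho_c(L, d=3):
    """L^(-d) times the sum of 1/omega(k) over nonzero k = n/L on the torus."""
    # Values of 1 - cos(2 pi n / L); periodicity lets us use n = 0, ..., L-1.
    one_minus_cos = [1.0 - math.cos(2.0 * math.pi * n / L) for n in range(L)]
    total = 0.0
    for n in product(range(L), repeat=d):
        if any(n):  # skip the single minimum point k = 0
            total += 1.0 / sum(one_minus_cos[c] for c in n)
    return total / L ** d

for L in (8, 16, 32):
    print(L, rho_c(L))
```

The printed values increase with L and approach the limiting integral from below, with an error that roughly halves when L is doubled, which is the behaviour expected for \(d=3\).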