1 Introduction

The purpose of the present article is to analyze the low-temperature behavior of a one-dimensional chain of atoms that interact via a Lennard–Jones type potential. The model is atomistic and formulated in terms of the Gibbs measures of classical statistical mechanics. Two limiting procedures are at play: the zero-temperature limit, in which the inverse temperature \(\beta \) goes to infinity, and the thermodynamic limit, in which the number of particles N and the system size go to infinity. The order of the limits matters. When the zero-temperature limit is taken before the \(N\rightarrow \infty \) limit, the analysis of Gibbs measures is replaced by energy minimization, leading to variational models of non-linear elasticity. We perform instead the zero-temperature limit after the thermodynamic limit. The zero-temperature limit for infinite systems is far from trivial, see [14, 15, 55] and the discussion in [7].

For the one-dimensional Lennard–Jones interaction, it is known that energy minimizers (ground states) converge to a periodic lattice [22] (“crystallization”). In contrast, for one-dimensional systems with pair potentials that decay faster than \(1/r^2\), it is well known that, at positive temperature, no matter how small, there is no crystallization [11]. Nevertheless, some quantities can be approximated well by their zero-temperature counterparts. For the bulk free energy this is to be expected; for other quantities, such as surface corrections, it is already more subtle. For the decay of correlations it is a priori not even clear what the zero-temperature counterpart should be; we propose a natural candidate, see equations (2.11) and (2.12).

At zero temperature, surface corrections and boundary layers have been studied, for example, in order to better understand variational models of fracture, see for example [12, 49] and the references therein. Fracture might be expected for elongated chains that are forced to stretch beyond their preferred length. At small positive temperature, large interparticle distances correspond to low pressure (stress) \(p=p_\beta \rightarrow 0\). We address this regime in a subsequent work and focus here on the elastic regime of positive pressure \(p>0\), though the case of small pressure \(p_\beta \rightarrow 0\) is discussed in some comments.

Our main results come in four parts. They are listed in sections 2.1–2.4 and proven in sections 3–7. At zero temperature, we extend the result on bulk periodicity from [22] to a more general class of potentials and positive pressure, see Theorem 2.1. We prove the existence of bounded surface corrections, and characterize them with the help of an energy functional \({\mathcal {E}}_\mathrm {surf}\) for semi-infinite chains (Theorem 2.2).

At positive temperature, we prove large deviations principles for the Gibbs measures \(\mu _\beta \) and \(\nu _\beta \) on \({\mathbb {R}}_+^{\mathbb {Z}}\) and \({\mathbb {R}}_+^{\mathbb {N}}\) (product topology) as \(\beta \rightarrow \infty \) at fixed \(p>0\) (Theorem 2.4). The speed is \(\beta \) and the respective rate functions are energy functionals \(\overline{{\mathcal {E}}}_\mathrm {bulk}\) and \(\overline{{\mathcal {E}}}_\mathrm {surf} - \min \overline{{\mathcal {E}}}_\mathrm {surf}\) whose minimizers are, respectively, the periodic bulk ground state and the zero-temperature boundary layer. The convergence of positive-temperature surface corrections to their zero-temperature counterpart is addressed in Theorem 2.5. These results are intimately related to path large deviations for Markov processes and Hamilton–Jacobi–Bellman equations [19], semi-classical analysis [24], and a more direct approach to low-temperature expansions [51]. We remark that our results are valid for long range interactions which in particular are not assumed to have superlinear growth at infinity. The large deviations principle is complemented by a result on Gaussian approximations for the bulk Gibbs measure and the Gibbs free energy, valid for finite interaction range m (Theorems 2.7 and 2.8).

Finally, we study the temperature-dependence of correlations and informally discuss how correlations connect with effective interactions of defects and the decay of boundary layers. Theorem 2.9 provides a priori estimates that hold for all \(\beta ,p>0\). In Theorem 2.11 we show that for finite m and small positive pressure p, the decay of correlations is exponential with a rate of decay that stays bounded as \(\beta \rightarrow \infty \); the associated Markov chain has a spectral gap bounded away from zero. This uniform estimate is proven with perturbation theory for the transfer operator. For infinite m, we provide instead a uniform estimate for restricted Gibbs measures (Proposition 2.10), which follows from the convexity of the energy (in a neighborhood of the periodic ground state) and techniques from the realm of Brascamp–Lieb inequalities [24]. At vanishing pressure \(p_\beta \rightarrow 0\) or fixed high pressure \(p>0\), the spectral gap might become exponentially small because of fracture or metastable wells [9] in non-convex energy landscapes.

Bringing statistical mechanics into atomistic models of crystals and elasticity has a rich tradition [6, 8, 39, 56]. Modern developments include the study of gradient Gibbs measures [20] with sophisticated tools such as renormalization groups and cluster expansions [1], random walk representations [13], and Witten Laplacians [24]; scaling limits and gradient Young–Gibbs measures [30, 41, 47]; the extension of approximation schemes, for example the quasi-continuum method, to positive temperature [10, 52]. In addition, there have been some inroads into the open problem of proving crystallization in the form of orientational order for two-dimensional models [2, 25].

To the best of our knowledge, all of the aforementioned mathematical literature, notably on gradient Gibbs measures, is limited to potentials with superlinear growth at infinity. This is in stark contrast with the decay to zero typically imposed in the statistical mechanics of point particles [44]. We work with potentials \(v(r)\rightarrow 0\); an additional linear term pr enters because we work in the constant pressure ensemble, which is the most convenient ensemble for one-dimensional systems [44, Section 5.6.6]. As a consequence, the by now classical combination of Bakry–Émery estimates and the Holley–Stroock perturbation principle, see [34] and the references therein, becomes potentially more delicate. We use instead estimates on energy penalties, some aspects of which might generalize to higher-dimensional models.

Another aspect that might generalize to higher dimensions concerns the large deviations principle. The existence of a large deviations principle for the Gibbs measure as \(\beta \rightarrow \infty \), proven using exponential tightness and a fixed point equation for the measure, amounts to the construction of an infinite volume energy functional that vanishes on ground states only. In higher dimensions, the role of the fixed point equation is taken by the DLR conditions named after Dobrushin, Lanford, and Ruelle [23], and the proof of a large deviations principle reduces to the investigation of a higher-dimensional analogue of a Bellman equation. The theory of the latter, for non-unique ground states, might mirror possible intricacies of the zero-temperature limit of Gibbs measures described in [55].

Finally we remark that the results of this work allow for a detailed analysis of typical atomic configurations at low temperature and low density. In [28] we will in particular prove that, when the density is strictly smaller than the density of the ground state lattice, a system with N particles fills space by alternating approximately crystalline domains (“clusters”) with empty domains (“cracks”). The number of domains is of the order of \(N \exp (-\beta e_\mathrm{surf}/2)\) with \(e_\mathrm{surf}\) the surface energy from Theorem 2.2 below.

2 Main Results

2.1 Zero Temperature

Let \(v:(0,\infty ) \rightarrow {\mathbb {R}}\) be a pair potential, \(m\in {\mathbb {N}}\cup \{\infty \}\) a truncation parameter and \(p\ge 0\) the pressure. At zero temperature we allow for \(p=0\), at positive temperature we impose \(p>0\). The Gibbs energy at zero temperature and pressure p for a system of N particles with positions \(x_1<\ldots <x_N\) and interparticle spacings \(z_j = x_{j+1}- x_j\), \(j=1,\ldots ,N-1\), is

$$\begin{aligned} {\mathcal {E}}_N(z_1,\ldots ,z_{N-1}) = \sum _{\genfrac{}{}{0.0pt}{}{1\le i <j\le N}{|i-j|\le m}} v(z_i+\cdots + z_{j-1}) + p \sum _{j=1}^{N-1} z_j. \end{aligned}$$

The parameter m restricts the range of the interaction: \(m=2\) corresponds to a next-nearest neighbor interaction. This section deals with the minimization problem

$$\begin{aligned} E_N = \inf _{z_1,\ldots ,z_{N-1}>0}\, {\mathcal {E}}_N(z_1,\ldots ,z_{N-1}) \end{aligned}$$

in the limit \(N\rightarrow \infty \). Throughout we assume that the following assumption holds (see also Fig. 1):

Assumption 1

The pair potential \(v : (0, \infty ) \rightarrow {\mathbb {R}}\cup \{+ \infty \}\) with (possibly vanishing) hard core radius \(r_\mathrm{hc} \ge 0\) is equal to \(+ \infty \) on \((0, r_\mathrm{hc}]\) and a \(C^2\) function on \((r_\mathrm{hc}, \infty )\). There exist \(r_\mathrm{hc}< z_{\min }< z_{\max } < 2 z_{\min }\) and \(\alpha _1, \alpha _2 > 0\), \(s > 2\) such that the following holds:

  (i) Shape of v: \(z_{\max }\) is the unique minimizer of v and satisfies \(v(z_{\max })<0\). v is decreasing on \((0,z_{\max })\) and increasing and non-positive on \((z_{\max },\infty )\).

  (ii) Growth of v: \(v(z) \ge - \alpha _1 z^{-s}\) for all \(z > 0\) and \(v(z) + v(z_{\max }) - 2 \alpha _1 \sum _{n=2}^\infty (n z)^{-s} > 0\) for all \(z < z_{\min }\).

  (iii) Shape of \(v''\): \(v''\) is decreasing on \([z_{\min }, z_{\max }]\) and increasing and non-positive on \([2 z_{\min }, \infty )\).

  (iv) Growth of \(v''\): \(v''(z) \ge -\alpha _2 z^{-s-2}\) for all \(z > r_\mathrm{hc}\) and \(v''(z_{\max }) + \sum _{n=2}^\infty n^2 v''(n z_{\min }) > 0\).

Fig. 1: A typical pair interaction potential with \(0 \le r_\mathrm{hc}< z_{\min }< z_{\max } < 2 z_{\min }\) as specified in Assumption 1

The assumption is satisfied, for example, by the Lennard–Jones potential \(v(r) = r^{-12} - r^{-6}\). As we will see, parts (i) and (ii) of the assumption guarantee that energy minimizers at \(p=0\) have interparticle spacings \(z_j\) in \((z_{\min },z_{\max })\), while parts (iii) and (iv) ensure that the distance of nearest neighbors lies within the convexity region of v and all other interactions occur in the concave region of v. \({\mathcal {E}}_N\) will then be uniformly strictly convex on \((z_{\min },z_{\max })^{N-1}\); moreover, the Hessian \(\mathrm {D}^2 {\mathcal {E}}_N\) will be diagonally dominant with positive diagonal entries and negative off-diagonal entries.
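For the Lennard–Jones case, part (i) of the assumption is easy to check numerically. The following minimal sketch does so in Python; the test grid and tolerance are illustrative choices, not values from the text.

```python
# Sketch: numerically check part (i) of Assumption 1 for the
# Lennard-Jones potential v(r) = r^-12 - r^-6.  The test grid and
# tolerance are illustrative choices, not values from the text.

def v(r):
    return r ** -12 - r ** -6

z_max = 2 ** (1 / 6)  # unique minimizer of v (exact for Lennard-Jones)

# v attains a negative minimum at z_max: v(z_max) = 1/4 - 1/2 = -1/4
assert abs(v(z_max) + 0.25) < 1e-12

# v is decreasing on (0, z_max), increasing and non-positive on (z_max, oo)
grid = [0.8 + 0.001 * i for i in range(4500)]  # covers (0.8, 5.3)
for r1, r2 in zip(grid, grid[1:]):
    if r2 <= z_max:
        assert v(r1) > v(r2)            # strictly decreasing before z_max
    elif r1 >= z_max:
        assert v(r1) < v(r2) <= 0.0     # increasing, non-positive after

print("Assumption 1(i) holds for Lennard-Jones on the test grid")
```

Parts (ii) and (iv) involve infinite sums and explicit constants \(\alpha _1, \alpha _2, s\); for Lennard–Jones one can take \(s=6\) and verify them by bounding the tails analytically.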

Assumption 2

The pressure p satisfies \(0\le p<p^*\) with \(p^*:= \frac{|v(z_{\max })|}{z_{\max }}\).

At positive temperature we shall assume in addition that \(p>0\) and \(r_\mathrm{hc}>0\), and for some results we need \(\lim _{r\searrow r_\mathrm {hc}} v(r) = \infty \). The next theorem is an adaptation of a result by Gardner and Radin [22]. It is proven in section 3.1.

Theorem 2.1

(Bulk properties) Let \(m\in {\mathbb {N}}\cup \{\infty \}\) and \(p\in [0,p^*)\) as in Assumption 2.

  (a) For every \(N\ge 2\), the map \({\mathcal {E}}_N:{\mathbb {R}}_+^{N-1}\rightarrow {\mathbb {R}}\) has a unique minimizer \((z_1^{\scriptscriptstyle {({N}})},\ldots ,z_{N-1}^{\scriptscriptstyle {({N}})})\). The minimizer has all its spacings \(z_j\) in \([z_{\min },z_{\max }]\).

  (b) As \(j,N\rightarrow \infty \) with \(N-j\rightarrow \infty \), we have \(z_j^{\scriptscriptstyle {({N}})}\rightarrow a\), where \(a \in (z_{\min }, z_{\max }]\) is the unique minimizer of \({\mathbb {R}}_+\ni r\mapsto p r+ \sum _{k=1}^m v(kr)\).

  (c) The limit \(e_0 = \lim _{N\rightarrow \infty } (E_N/N) < 0\) exists and is given by

    $$\begin{aligned} e_0= p a+ \sum _{k=1}^m v(ka) = \min _{r>0} \Bigl ( p r + \sum _{k=1}^m v(kr)\Bigr ). \end{aligned}$$
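For illustration, the characterization of a and \(e_0\) in parts (b) and (c) can be evaluated numerically by a simple grid search. In the sketch below, the Lennard–Jones potential, the pressure \(p=0.1\), and the range \(m=10\) are illustrative choices, not values from the text.

```python
# Sketch: compute the ground-state spacing a and the bulk energy e_0 of
# Theorem 2.1(b)-(c) by minimizing e(r) = p*r + sum_{k=1}^m v(k*r).
# Lennard-Jones v, pressure p = 0.1 and range m = 10 are illustrative
# choices, not values taken from the text.

def v(r):
    return r ** -12 - r ** -6

P, M = 0.1, 10

def e(r):
    return P * r + sum(v(k * r) for k in range(1, M + 1))

rs = [0.9 + 1e-4 * i for i in range(6000)]  # grid over (0.9, 1.5)
a = min(rs, key=e)                          # Theorem 2.1(b)
e0 = e(a)                                   # Theorem 2.1(c)

print(f"a  ~ {a:.4f}")   # below the bare LJ minimum 2^(1/6) ~ 1.1225
print(f"e0 ~ {e0:.4f}")  # negative
```

Note that \(a < z_{\max } = 2^{1/6}\): both the pressure term and the attractive tails \(v(kr)\), \(k\ge 2\), push the chain to a spacing below the minimum of the bare pair potential.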

Let \({\mathcal {D}}_0 \subset (r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\) be the space of sequences \((z_j)_{j\in {\mathbb {N}}}\) with at most finitely many elements different from a. Define

$$\begin{aligned} h(z_1,\ldots ,z_{m})&= pz_1 + \sum _{k=1}^m v(z_1+\cdots + z_k), \end{aligned}$$
(2.1)
$$\begin{aligned} {\mathcal {E}}_\mathrm{surf}\bigl ( (z_j)_{j\in {\mathbb {N}}}\bigr )&=\sum _{j=1}^\infty \bigl ( h(z_j,\ldots ,z_{j+m-1}) - e_0 \bigr ) \nonumber \\&= \sum _{j=1}^\infty \Bigl [ p \gamma _j + \sum _{k=1}^m \bigl ( v(ka+\gamma _j+ \cdots + \gamma _{j+k-1}) - v(ka)\bigr ) \Bigr ] \end{aligned}$$
(2.2)

for \((z_j)_{j\in {\mathbb {N}}}\in {\mathcal {D}}_0\), where we have set \(\gamma _j := z_j - a\) and used that \(e_0 = p a + \sum _{k=1}^m v(ka)\). When \(m=\infty \), \(h((z_j)_{j\in {\mathbb {N}}})\) is a function of the whole sequence. \({\mathcal {E}}_\mathrm {surf}\) is the Gibbs energy of a semi-infinite chain, with the additive constant chosen in such a way that at spacings \(z_j\equiv a\) the Gibbs energy is zero; \(h(z_1,z_2,\ldots )\) represents the interaction of the left-most particle with all the other particles. Let \({\mathcal {D}} = \{ (z_j)_{j\in {\mathbb {N}}}\in (r_\mathrm{hc}, \infty )^{{\mathbb {N}}} \mid \sum _{j=1}^\infty (z_j-a)^2 <\infty \}\) be the space of square summable strains.

Theorem 2.2

(Surface energy) Let \(m\in {\mathbb {N}}\cup \{\infty \}\) and \(p\in [0,p^*)\) as in Assumption 2. Equip \({\mathcal {D}}\) with the \(\ell ^2\)-metric. Then

  (a) \({\mathcal {E}}_\mathrm {surf}\) extends to a continuous functional on \({\mathcal {D}}\).

  (b) On \({\mathcal {D}} \cap [z_{\min },z_{\max }]^{\mathbb {N}}\) it is strictly convex.

  (c) \({\mathcal {E}}_\mathrm {surf}\) has a unique minimizer. The minimizer lies in \({\mathcal {D}}\cap [z_{\min },z_{\max }]^{\mathbb {N}}\).

  (d) The limit \(e_\mathrm {surf} =\lim _{N\rightarrow \infty }(E_N - N e_0)\) exists and is given by

    $$\begin{aligned} e_\mathrm {surf} = 2 \min _{\mathcal {D}} {\mathcal {E}}_\mathrm {surf} - pa - \sum _{k=1}^m k v(ka). \end{aligned}$$

The theorem is proven in section 3.2. Note that \(-pa-\sum _{k=1}^m k v(ka)\) is the surface energy of a clamped chain with all spacings equal to a, while \({\mathcal {E}}_\mathrm {surf}\) encodes the effect of boundary layers; \({\mathcal {E}}_\mathrm {surf}\) is multiplied by 2 because finite chains have two ends. We note that \(\min {\mathcal {E}}_\mathrm {surf}\) is exactly the boundary layer energy introduced by Braides and Cicalese [12], who dealt with the special case \(m=2\) of next-nearest neighbor interactions but more general potentials. For finite \(m\ge 2\), see [50, Theorem 4.2].
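The boundary layer selected by \({\mathcal {E}}_\mathrm {surf}\) can be approximated numerically by freezing the tail: set \(z_j=a\) for \(j>n\) and minimize over the first n spacings (for which the terms with \(j>n\) vanish). Below is a minimal sketch for \(m=2\); the Lennard–Jones potential, \(p=0.1\), the truncation \(n=8\), and the descent parameters are all illustrative assumptions, not values from the text.

```python
# Sketch: approximate the minimizer of E_surf (Theorem 2.2) for m = 2
# by fixing z_j = a for j > n and minimizing over z_1, ..., z_n with
# plain gradient descent on a numerical gradient.  Lennard-Jones v,
# p = 0.1, n = 8 and the descent parameters are illustrative choices.

def v(r):
    return r ** -12 - r ** -6

P, M = 0.1, 2

def e(r):  # Cauchy-Born energy density p*r + sum_{k<=m} v(k*r)
    return P * r + sum(v(k * r) for k in range(1, M + 1))

rs = [0.9 + 1e-5 * i for i in range(60000)]
a = min(rs, key=e)       # bulk ground-state spacing (grid search)
e0 = e(a)

def E_surf(z):           # truncated surface functional, tail frozen at a
    zz = list(z) + [a] * M
    return sum(P * zz[j] + v(zz[j]) + v(zz[j] + zz[j + 1]) - e0
               for j in range(len(z)))

n, h, lr = 8, 1e-6, 0.02
z = [a] * n              # start from the bulk configuration, E_surf = 0
for _ in range(2000):
    base = E_surf(z)
    grad = []
    for i in range(n):
        zp = z[:]
        zp[i] += h
        grad.append((E_surf(zp) - base) / h)
    z = [zi - lr * gi for zi, gi in zip(z, grad)]

print(f"boundary spacing z_1 ~ {z[0]:.5f}, bulk spacing a ~ {a:.5f}")
print(f"truncated min E_surf ~ {E_surf(z):.2e}")
```

In this example the outermost spacing relaxes to \(z_1 > a\): the left-most particle is missing its attractive second neighbor on the left, so the chain expands slightly at the boundary, and the minimal truncated energy is strictly negative.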

For later purposes we also define a bulk functional

$$\begin{aligned} \begin{aligned} {\mathcal {E}}_\mathrm {bulk} \bigl ( (z_j)_{j\in {\mathbb {Z}}}\bigr )&= \sum _{j=-\infty }^\infty \bigl ( h(z_j,\ldots ,z_{j+m-1}) - e_0\bigr ) \\&= \sum _{j=-\infty }^\infty \sum _{k=1}^m \bigl ( v(z_j+\cdots + z_{j+k-1}) - v(ka) + \delta _{1k} p (z_j - a) \bigr ); \end{aligned} \end{aligned}$$

it is defined, a priori, on the space \({\mathcal {D}}_0^+\) of positive bi-infinite sequences \((z_j)_{j\in {\mathbb {Z}}} \in (r_\mathrm{hc}, \infty )^{{\mathbb {Z}}}\) with at most finitely many elements \(z_j \ne a\). Denoting by \({\mathcal {D}}^+ = \{(z_j)_{j\in {\mathbb {Z}}} \in (r_\mathrm{hc}, \infty )^{{\mathbb {Z}}} \mid \sum _{j\in {\mathbb {Z}}}(z_j-a)^2<\infty \}\) the space of square summable strains, an analysis similar to the one for the surface functional yields the following result:

Proposition 2.3

(Limiting bulk properties) Let \(m\in {\mathbb {N}}\cup \{\infty \}\) and \(p \in [0,p^*)\) as in Assumption 2. Equip \({\mathcal {D}}^+\) with the \(\ell ^2\)-metric. Then

  (a) \({\mathcal {E}}_\mathrm {bulk}\) extends to a continuous functional on \({\mathcal {D}}^+\).

  (b) On \({\mathcal {D}}^+ \cap [z_{\min },z_{\max }]^{\mathbb {Z}}\) it is strictly convex.

  (c) The unique minimizer of \({\mathcal {E}}_\mathrm {bulk}\) is the constant sequence \((\ldots , a, a, \ldots )\). The minimum value is \({\mathcal {E}}_\mathrm {bulk}(\ldots , a, a, \ldots ) = 0\).

  (d) For every \((z_j)_{j \in {\mathbb {Z}}} \in {\mathcal {D}}^+\) one has

    $$\begin{aligned} {\mathcal {E}}_\mathrm {bulk}((z_j)_{j \in {\mathbb {Z}}}) =&{\mathcal {E}}_\mathrm {surf}(z_1, z_2, \ldots ) + {\mathcal {E}}_\mathrm {surf}(z_0, z_{-1}, \ldots ) \\&+ {\mathcal {W}}( \cdots z_{-1} z_0 \mid z_1 z_2 \ldots ), \end{aligned}$$

    where \({\mathcal {W}}( \cdots z_{-1} z_0 \mid z_1 z_2 \cdots ) := \sum _{\genfrac{}{}{0.0pt}{}{j \le 0, k \ge 1}{|k-j|\le m-1}} v(z_j+\cdots + z_k)\) is the total interaction between the left and the right half-infinite chain.

2.2 Small Positive Temperature

Next we analyze infinite volume Gibbs measures on \({\mathbb {R}}_+^{\mathbb {N}}\) and \({\mathbb {R}}_+^{\mathbb {Z}}\) in the limit \(\beta \rightarrow \infty \). We focus on fixed positive \(p\in (0,p^*)\), with \(p^* = |v(z_{\max })|/z_{\max }\) as in Assumption 2, but comment on vanishing \(p=p_\beta \rightarrow 0\) at the end of the section. Let \({\mathbb {Q}}_N^{\scriptscriptstyle {({\beta }})}\) be the probability measure on \({\mathbb {R}}_+^{N-1}\) defined by

$$\begin{aligned} {\mathbb {Q}}_N^{\scriptscriptstyle {({\beta }})} (A) = \frac{1}{Q_N(\beta )} \int _A {{\text {e}} }^{-\beta {\mathcal {E}}_N(z_1,\ldots ,z_{N-1})} \mathrm {d}z_1\cdots \mathrm {d}z_{N-1}, \end{aligned}$$

where

$$\begin{aligned} Q_N(\beta ) = \int _{{\mathbb {R}}_+^{N-1}} {{\text {e}} }^{-\beta {\mathcal {E}}_N(z_1,\ldots ,z_{N-1})} \mathrm {d}z_1\cdots \mathrm {d}z_{N-1}. \end{aligned}$$

Standard arguments (see section 4) show that there is a uniquely defined probability measure \(\nu _\beta \) on the product space \({\mathbb {R}}_+^{\mathbb {N}}\) such that for every \(k\in {\mathbb {N}}\) and every bounded continuous test function \(f \in C_b({\mathbb {R}}_+^k)\),

$$\begin{aligned} \lim _{N\rightarrow \infty } \int _{{\mathbb {R}}_+^{N-1}} f(z_1,\ldots ,z_k) \mathrm {d}{\mathbb {Q}}_N^{\scriptscriptstyle {({\beta }})}(z_1,\ldots ,z_{N-1}) = \int _{{\mathbb {R}}_+^{{\mathbb {N}}}} f(z_1,\ldots ,z_k) \mathrm {d}\nu _\beta ( (z_j)_{j\ge 1}).\nonumber \\ \end{aligned}$$
(2.3)

Similarly, there is a uniquely defined probability measure \(\mu _\beta \) on \({\mathbb {R}}_+^{{\mathbb {Z}}}\) such that for all local test functions f as above, and all sequences \(i_N\) with \(i_N\rightarrow \infty \) and \(N-i_N\rightarrow \infty \),

$$\begin{aligned} \lim _{N\rightarrow \infty } \int _{{\mathbb {R}}_+^{N-1}} f(z_{i_N+1},\ldots ,z_{i_N+k}) \mathrm {d}{\mathbb {Q}}_N^{\scriptscriptstyle {({\beta }})}(z_1,\ldots ,z_{N-1}) =\nonumber \\ \int _{{\mathbb {R}}_+^{{\mathbb {Z}}}} f(z_1,\ldots ,z_k) \mathrm {d}\mu _\beta ( (z_j)_{j\in \mathbb {Z}}). \end{aligned}$$
(2.4)

Moreover, the measure \(\mu _\beta \) is shift-invariant and mixing. The measure \(\mu _\beta \) describes the bulk behavior of the chain, while the measure \(\nu _\beta \) is the equilibrium measure for a semi-infinite chain and encodes the probability distribution of boundary layers.

Our first result is a large deviations principle for the equilibrium measure \(\nu _\beta \) as \(\beta \rightarrow \infty \). The rate function is a suitable extension of \({\mathcal {E}}_\mathrm {surf}\): define \(\overline{{\mathcal {E}}}_\mathrm {surf}:\ {\mathbb {R}}_+^{\mathbb {N}}\rightarrow {\mathbb {R}}\cup \{\infty \}\) by

$$\begin{aligned} \overline{{\mathcal {E}}}_\mathrm {surf}\bigl ( (z_j)_{j\in {\mathbb {N}}}\bigr ) = {\left\{ \begin{array}{ll} {\mathcal {E}}_\mathrm {surf}\bigl ( (z_j)_{j\in {\mathbb {N}}}\bigr ), &{}\quad (z_j)_{j\in {\mathbb {N}}} \in {\mathcal {D}},\\ \infty , &{}\quad \text {else}. \end{array}\right. } \end{aligned}$$
(2.5)

In the same way \({\mathcal {E}}_\mathrm {bulk}\) extends to a map \(\overline{{\mathcal {E}}}_\mathrm {bulk}\) from \({\mathbb {R}}_+^{\mathbb {Z}}\) to \({\mathbb {R}}\cup \{\infty \}\). Both \({\mathbb {R}}_+^{\mathbb {N}}\) and \({\mathbb {R}}_+^{\mathbb {Z}}\) are equipped with the product topology.

Theorem 2.4

Fix \(p\in (0,p^*)\) and \(m\in {\mathbb {N}}\cup \{\infty \}\). Assume that \(r_\mathrm {hc}>0\) and \(\lim _{r\searrow r_\mathrm {hc}} v(r) = \infty \). Then as \(\beta \rightarrow \infty \), the equilibrium measures \((\nu _\beta )_{\beta >0}\) and \((\mu _\beta )_{\beta >0}\) satisfy large deviations principles with speed \(\beta \) and respective rate functions \(\overline{{\mathcal {E}}}_{\mathrm {surf}} - \min {\mathcal {E}}_\mathrm {surf}\) and \(\overline{{\mathcal {E}}}_\mathrm {bulk}\). The rate functions are good, that is, lower semi-continuous with compact level sets.

The theorem is proven in section 5.3. The large deviations principle for \(\nu _\beta \) says that for every closed set \(A\subset {\mathbb {R}}_+^{\mathbb {N}}\) and every open set \(O\subset {\mathbb {R}}_+^{\mathbb {N}}\) (product topology)

$$\begin{aligned} \begin{aligned} \limsup _{\beta \rightarrow \infty } \frac{1}{\beta }\log \nu _\beta (A)&\le - \inf _{(z_j)\in A }\Bigl ( \overline{{\mathcal {E}}}_\mathrm {surf} \bigl ((z_j)\bigr )- \min _{{\mathbb {R}}_+^{\mathbb {N}}} {\mathcal {E}}_\mathrm {surf} \Bigr ) \\ \liminf _{\beta \rightarrow \infty } \frac{1}{\beta }\log \nu _\beta (O)&\ge - \inf _{(z_j)\in O }\Bigl ( \overline{{\mathcal {E}}}_\mathrm {surf} \bigl ((z_j)\bigr )- \min _{{\mathbb {R}}_+^{\mathbb {N}}} {\mathcal {E}}_\mathrm {surf} \Bigr ). \end{aligned} \end{aligned}$$
(2.6)

It is essential that we work in the product topology. Indeed, we shall later see that \(\nu _\beta \) is mixing; therefore, for every \(\varepsilon >0\), the measure \(\nu _\beta \) gives full mass to sequences \((z_j)_{j\in {\mathbb {N}}}\) that have infinitely many spacings with \(|z_j-a|>\varepsilon \). Thus for every ball \(O= \{ (z_j) \in {\mathbb {R}}_+^{\mathbb {N}}\mid \sum _{j=1}^\infty (z_j-a)^2<\delta \}\), we have \(\nu _\beta (O) = 0\), hence \(\beta ^{-1}\log \nu _\beta (O) = -\infty \), to be contrasted with the lower bound in equation (2.6).

Another consequence concerns the evaluation of the Gibbs energies of localized defects: suppose that because of some impurity, the energy is not \({\mathcal {E}}_N\) but \({\mathcal {E}}_N + {\mathcal {V}}\), where \({\mathcal {V}}\) is, say, continuous in the product topology, localized in the bulk, and bounded from below. Then by Varadhan’s lemma [17], as \(\beta \rightarrow \infty \), the effective Gibbs energy converges to the zero temperature energy of the defect,

$$\begin{aligned} -\frac{1}{\beta }\log \mu _\beta \bigl ( {{\text {e}} }^{-\beta {\mathcal {V}}}\bigr ) \rightarrow \inf _{{\mathcal {D}}^+} ({\mathcal {E}}_\mathrm {bulk} + {\mathcal {V}})\quad (\beta \rightarrow \infty ). \end{aligned}$$

Surface energies occur as a specific type of defect, when \({\mathcal {V}}\) cancels all interactions between two half-infinite chains [see Proposition 4.9(a)], which leads to the next theorem. Define

$$\begin{aligned} g(\beta ) = - \lim _{N\rightarrow \infty } \frac{1}{\beta N }\log Q_N(\beta ),\quad g_\mathrm {surf}(\beta ) = \lim _{N\rightarrow \infty } \Bigl ( -\frac{1}{\beta }\log Q_N(\beta ) - N g(\beta ) \Bigr ),\nonumber \\ \end{aligned}$$
(2.7)

the Gibbs free energy \(g(\beta )\) per particle in the bulk and the surface correction \(g_\mathrm {surf}(\beta )\).

Theorem 2.5

Fix \(p\in (0,p^*)\) and \(m\in {\mathbb {N}}\cup \{\infty \}\). The limits (2.7) exist. If, in addition, \(r_\mathrm {hc}>0\) and \(\lim _{r\searrow r_\mathrm {hc}} v(r) = \infty \), then the bulk and surface Gibbs energy approach their zero-temperature counterparts when \(\beta \rightarrow \infty \):

$$\begin{aligned} \lim _{\beta \rightarrow \infty } g(\beta )= e_0,\quad \lim _{\beta \rightarrow \infty } g_\mathrm {surf}(\beta ) =e_\mathrm {surf}. \end{aligned}$$

This proves that the thermodynamic limit and the zero-temperature limit can be exchanged, which is non-trivial (and in fact fails when the pressure goes to zero too fast, see below).

One last consequence of Theorem 2.4 concerns the distribution of spacings and the pressure-density (or stress-strain) relation. The Gibbs free energy and our partition functions correspond to an ensemble in which the overall length of the system is not fixed but instead fluctuates with a law that depends on the pressure: high pressures p favor compressed states. In the thermodynamic limit \(N\rightarrow \infty \), though, the average spacing between particles becomes a well-defined quantity, given by

$$\begin{aligned} \ell (\beta ) = \int _{{\mathbb {R}}_+^{\mathbb {Z}}} z_0 \mathrm {d}\mu _\beta ((z_j)_{j\in {\mathbb {Z}}}). \end{aligned}$$
(2.8)

By the contraction principle [17, Theorem 4.2.1], the distribution of \(z_0\) under \(\mu _\beta \) satisfies a large deviations principle with good rate function \(w(z) = \inf \{\overline{{\mathcal {E}}}_\mathrm {bulk}((z_j)_{j\in {\mathbb {Z}}})\mid (z_j)_{j\in {\mathbb {Z}}}\in {\mathbb {R}}_+^{\mathbb {Z}},\, z_0 = z\}\). The unique minimizer of w(z) is the ground state spacing a. Lemma 5.1 implies that the distribution of spacings has exponential tails

$$\begin{aligned} \mu _\beta \bigl ( \{(z_j)_{j\in {\mathbb {Z}}}\mid z_0\ge r\}\bigr ) \le C\exp (-\beta p r) \end{aligned}$$

for some \(\beta \)-independent constant C.

Corollary 2.6

Under the assumptions of Theorem 2.5, we have

$$\begin{aligned} \lim _{\beta \rightarrow \infty } \ell (\beta ) = a = \mathrm {argmin}_{r>0} \bigl (p r +\sum _{k=1}^{m} v(kr)\bigr ). \end{aligned}$$

In particular, for large \(\beta \) we have \(\ell (\beta )< a_0\), where \(a_0\) is the minimizer of the zero-stress Cauchy–Born energy density \(\sum _k v(kr)\). Conversely, spacings \(\ell (\beta )>a_0\) (elongated chains) imply vanishing pressure \(p=p_\beta \rightarrow 0\). This is most transparent for nearest neighbor interactions (\(m=1\), the Takahashi nearest neighbor gas [33, 53]), for which

$$\begin{aligned} g(\beta ) = - \frac{1}{\beta }\log \Bigl ( \int _0^\infty {{\text {e}} }^{-\beta [v(r)+ p_\beta r]} \mathrm {d}r\Bigr ),\quad \ell (\beta ) = \frac{\int _0^\infty r \exp (-\beta [v(r)+ p_\beta r]) \mathrm {d}r}{\int _0^\infty \exp (-\beta [v(r)+ p_\beta r]) \mathrm {d}r}.\nonumber \\ \end{aligned}$$
(2.9)
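The one-dimensional integrals in (2.9) are easy to evaluate numerically. The following sketch does so at fixed pressure; the Lennard–Jones potential, \(p=0.1\), \(\beta =50\), and the integration window are illustrative choices, not values from the text. Consistent with Corollary 2.6 for \(m=1\), the mean spacing \(\ell (\beta )\) lands close to the minimizer of \(v(r)+pr\).

```python
import math

# Sketch: evaluate the Takahashi nearest-neighbor formulas (2.9) by a
# Riemann sum at fixed pressure.  Lennard-Jones v, p = 0.1, beta = 50
# and the integration window are illustrative choices, not from the text.

def v(r):
    return r ** -12 - r ** -6

p, beta, dr = 0.1, 50.0, 1e-4
rs = [0.8 + dr * i for i in range(60000)]            # window (0.8, 6.8)
w = [math.exp(-beta * (v(r) + p * r)) for r in rs]   # Boltzmann weight

Z = sum(w) * dr                                      # partition integral
ell = sum(r * wi for r, wi in zip(rs, w)) * dr / Z   # mean spacing (2.9)
g = -math.log(Z) / beta                              # Gibbs free energy (2.9)

r_star = min(rs, key=lambda r: v(r) + p * r)         # zero-T minimizer a
print(f"ell(beta) ~ {ell:.4f}, argmin(v + p r) ~ {r_star:.4f}")
print(f"g(beta) ~ {g:.4f}")
```

At \(\beta =50\) the mean spacing sits slightly above the zero-temperature minimizer (an anharmonic, asymmetry-of-the-well effect of order \(1/\beta \)), and both discrepancies shrink as \(\beta \) grows.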

Comments on vanishing pressure. We add a superscript 0 to indicate that zero-temperature quantities are evaluated at \(p=0\). When \(p=p_\beta \rightarrow 0\) more slowly than any exponential, it is still true that \(g(\beta ) \rightarrow e_0^0\). When \(\beta p_\beta = \exp (-\beta \nu )\) with \(\nu >0\), one can show, using [26, 27], that

$$\begin{aligned} \lim _{\beta \rightarrow \infty } g(\beta ) = \min (e_0^0, - \nu ). \end{aligned}$$
(2.10)

At pressures vanishing faster than \(\exp (- \beta |e_0^0|)\), the most likely configurations have very large spacings (dilute gas phase, \(\ell (\beta )\rightarrow \infty \)) and the previous results no longer apply. For \(\liminf \frac{1}{\beta }\log (\beta p_\beta )>e_0^0\), we expect that large deviations principles with rate functions \(\overline{{\mathcal {E}}}_\mathrm {bulk}^0\) and \(\overline{{\mathcal {E}}}_\mathrm {surf}^0 - \min \overline{{\mathcal {E}}}_\mathrm {surf}^0\) still hold (in fact, our proofs still yield weak large deviations principles). However, the rate functions then have non-compact level sets and exponential tightness is lost. Moreover, large spacings may contribute to the average (2.8), and Corollary 2.6 need no longer hold, thus allowing for spacings \(\ell (\beta )\rightarrow \ell >a_0\).

2.3 Gaussian Approximation

Here we complement the large deviations result by a Gaussian approximation. This section deals with finite m and the bulk measure \(\mu _\beta \) only. Recall that \(d=m-1\). We will see that the Hessian of \({\mathcal {E}}_{\mathrm {bulk}}\) at \((\ldots ,a,a,\ldots )\) is associated with a positive-definite, bounded operator \({\mathcal {H}}\) on \(\ell ^2 ({\mathbb {Z}})\). It is represented by a doubly-infinite matrix \(({\mathcal {H}}_{ij})_{i,j\in {\mathbb {Z}}}\) that is diagonally dominant. Write \(({\mathcal {H}}^{-1})_{ij}\) for the matrix elements of the inverse operator and let \(\mu ^{\mathrm {Gauss}}\) be the uniquely defined measure on \({\mathbb {R}}^{\mathbb {Z}}\), equipped with the product topology and its associated Borel \(\sigma \)-algebra, such that

$$\begin{aligned} \int _{{\mathbb {R}}^{\mathbb {Z}}} s_i s_j \mathrm {d}\mu ^\mathrm {Gauss}\bigl ( (s_k)_{k\in {\mathbb {Z}}}\bigr ) = ({\mathcal {H}}^{-1})_{ij} \end{aligned}$$

for all \(i,j\in {\mathbb {Z}}\), and every finite-dimensional marginal of \(\mu ^\mathrm {Gauss}\) is a multi-dimensional Gaussian distribution. Equivalently, \(\mu ^\mathrm {Gauss}\) is the distribution of a Gaussian process \((N_j)_{j\in {\mathbb {Z}}}\) with mean zero and covariance \({\mathbb {E}}[N_i N_j] = ({\mathcal {H}}^{-1})_{ij}\). More concrete expressions for the probability density functions of the \(nd\)-dimensional marginals of \(\mu ^\mathrm {Gauss}\) are provided in Proposition 6.17 below.

In the following we identify the measure \(\mu _\beta \) on \({\mathbb {R}}_+^{\mathbb {Z}}\) with the measure \(\mathbb {1}_{{\mathbb {R}}_+^{\mathbb {Z}}} \mu _\beta \) on \({\mathbb {R}}^{\mathbb {Z}}\). We exclude the trivial case \(m=1\).

Theorem 2.7

Assume \(2 \le m<\infty \), \(p\in (0,p^*)\), and \(r_\mathrm {hc}>0\). Then for every \(n\in {\mathbb {N}}\), the n-dimensional marginals of \(\mu _\beta \) and \(\mu ^\mathrm {Gauss}\) have probability density functions \(\rho _n^{(\beta )}\) and \(\rho _n^\mathrm {Gauss}\), and

$$\begin{aligned}&\lim _{\beta \rightarrow \infty }\int _{{\mathbb {R}}^n} \Bigl | \beta ^{-n/2} \rho _n^{(\beta )} \bigl (a+ \beta ^{-1/2}s_1,\ldots , a+ \beta ^{-1/2}s_n \bigr )\\&\quad - \rho _n^\mathrm {Gauss}(s_1,\ldots ,s_n )\Bigr | \mathrm {d}s_1\ldots \mathrm {d}s_n = 0. \end{aligned}$$

It follows that the distribution of the spacings, suitably rescaled, converges locally to the Gaussian measure \(\mu ^\mathrm {Gauss}\): for every bounded function \(f:{\mathbb {R}}^{\mathbb {Z}}\rightarrow {\mathbb {R}}\) that depends on finitely many spacings \(z_j\) only (bounded cylinder functions), we have

$$\begin{aligned} \lim _{\beta \rightarrow \infty }\int _{{\mathbb {R}}^{\mathbb {Z}}} f\bigl ( \sqrt{\beta }(z_j-a)_{j\in {\mathbb {Z}}}\bigr ) \mathrm {d}\mu _\beta \bigl ((z_j)_{j\in {\mathbb {Z}}}\bigr ) = \int _{{\mathbb {R}}^{\mathbb {Z}}} f \mathrm {d}\mu ^\mathrm {Gauss}. \end{aligned}$$

For example, in the limit \(\beta \rightarrow \infty \), the distribution of a single spacing \(z_i\) is approximately normal, with mean a and variance \(\beta ^{-1} ({\mathcal {H}}^{-1})_{ii}\). We expect that Theorem 2.7 stays true for \(m=\infty \), but a proof or disproof is beyond the scope of this article.
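To make the variance formula concrete: differentiating \({\mathcal {E}}_\mathrm {bulk}\) twice at the constant sequence gives, in our reading of the definitions above, the entries \({\mathcal {H}}_{ij} = \sum _{k=|i-j|+1}^{m} (k-|i-j|)\, v''(ka)\). The sketch below builds a finite truncation of \({\mathcal {H}}\), checks the diagonal dominance and sign structure asserted in the text, and computes \(({\mathcal {H}}^{-1})_{ii}\); the Lennard–Jones potential, \(p=0.1\), \(m=3\), and the truncation size are illustrative assumptions.

```python
# Sketch: single-spacing variance beta^{-1} (H^{-1})_{ii} in the Gaussian
# approximation, via a finite truncation of the doubly-infinite Hessian.
# The entry formula H_ij = sum_{k>|i-j|} (k-|i-j|) v''(ka) is our reading
# of E_bulk at the constant sequence; Lennard-Jones v, p = 0.1, m = 3
# and the truncation size n are illustrative assumptions.

def v(r):   return r ** -12 - r ** -6
def vpp(r): return 156 * r ** -14 - 42 * r ** -8   # second derivative

p, m = 0.1, 3
rs = [0.9 + 1e-5 * i for i in range(60000)]
a = min(rs, key=lambda r: p * r + sum(v(k * r) for k in range(1, m + 1)))

n = 41   # truncation to indices -20..20
H = [[sum((k - abs(i - j)) * vpp(k * a)
          for k in range(abs(i - j) + 1, m + 1))
      for j in range(n)] for i in range(n)]

# positive diagonal, negative off-diagonal, diagonal dominance
assert H[0][0] > 0 and H[0][1] < 0
assert all(H[i][i] > sum(abs(H[i][j]) for j in range(n) if j != i)
           for i in range(n))

# solve H x = e_c by Gaussian elimination with pivoting; x[c] = (H^-1)_cc
c = n // 2
A = [row[:] + [1.0 if i == c else 0.0] for i, row in enumerate(H)]
for col in range(n):
    piv = max(range(col, n), key=lambda rr: abs(A[rr][col]))
    A[col], A[piv] = A[piv], A[col]
    for row in range(col + 1, n):
        f = A[row][col] / A[col][col]
        for cc in range(col, n + 1):
            A[row][cc] -= f * A[col][cc]
x = [0.0] * n
for row in range(n - 1, -1, -1):
    x[row] = (A[row][n] - sum(A[row][cc] * x[cc]
                              for cc in range(row + 1, n))) / A[row][row]

var_cc = x[c]   # (H^-1)_cc; the spacing variance is var_cc / beta
print(f"(H^-1)_cc ~ {var_cc:.5f}")
```

Since the off-diagonal entries are small compared with the diagonal, \(({\mathcal {H}}^{-1})_{ii}\) is close to \(1/{\mathcal {H}}_{ii}\), and the truncation error decays rapidly with the truncation size.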

The next theorem says that the Gibbs free energy is close to the Gibbs free energy of the approximate Gaussian model.

Theorem 2.8

Assume \(2 \le m<\infty \), \(p\in (0,p^*)\), and \(r_{\mathrm {hc}}>0\). The Gibbs free energy satisfies, as \(\beta \rightarrow \infty \),

$$\begin{aligned} g(\beta ) = e_0 - \frac{1}{\beta }\log \sqrt{\frac{2\pi }{\beta (\det C)^{1/d}}} +o(\beta ^{-1}), \end{aligned}$$

where \(d=m-1\) and C is a \(d\times d\) positive-definite matrix.

The matrix C is introduced in equation (6.18) (see also Lemma 6.7) and is a function of the Hessian of the energy.

Remark

(Gaussian approximation and semi-classical expansions) If v is smooth and \(p>0\) is fixed, the Gibbs energy should admit an asymptotic expansion of the form

$$\begin{aligned} g(\beta ) = e_0 - \frac{1}{\beta }\log \sqrt{\frac{2\pi }{\beta c}} + \sum _{j=1}^n a_j \beta ^{-j/2} + O(\beta ^{-(n+1)/2}) \quad (\beta \rightarrow \infty ) \end{aligned}$$

to arbitrarily high order n, for some \(c>0\) and coefficients \(a_j\in {\mathbb {R}}\). The first correction comes from a Gaussian approximation of the partition function (harmonic crystal), see section 6, with the constant c capturing the asymptotic behavior of the determinant of the Hessian around the energy minimum. Higher order corrections correspond to anharmonic effects. A similar expansion holds for \(g_\mathrm {surf}(\beta )\). Rigorous results for finite m are derived with semi-classical analysis [3, 24, 35], which builds on the analogy with the \(\hbar \rightarrow 0\) limit from quantum mechanics. For \(m=2\) and potentials with superlinear growth at infinity, independent results are given in [51].

2.4 Decay of Correlations

Suppose that two defects change the energy functional from \({\mathcal {E}}_\mathrm {bulk}\) to \({\mathcal {E}}_\mathrm {bulk} + {\mathcal {V}}_0 + {\mathcal {V}}_k\), where we assume for simplicity that \({\mathcal {V}}_0\) and \({\mathcal {V}}_k\) depend on \(z_0\) and \(z_k\) alone. For large k, we may expect that the Gibbs energies are approximately additive, that is

$$\begin{aligned} {\mathcal {I}}^{\scriptscriptstyle {({\beta }})}_\mathrm {eff} (k)= -\frac{1}{\beta } \log \mu _\beta ( {{\text {e}} }^{-\beta ({\mathcal {V}}_0+{\mathcal {V}}_k)}) + \frac{1}{\beta }\log \mu _\beta ( {{\text {e}} }^{-\beta {\mathcal {V}}_0}) + \frac{1}{\beta } \log \mu _\beta ( {{\text {e}} }^{-\beta {\mathcal {V}}_k})\nonumber \\ \end{aligned}$$
(2.11)

should be small when the defects are far apart. \({\mathcal {I}}_\mathrm {eff}^{\scriptscriptstyle {({\beta }})}(k)\) represents an effective interaction between the defects. In the study of systems with many defects it is important to understand how fast the effective interaction decreases at large distances. Some intuition is gained from the zero-temperature counterpart

$$\begin{aligned} {\mathcal {I}}^{\scriptscriptstyle {({\infty }})}_\mathrm {eff}(k) = \inf ({\mathcal {E}}_\mathrm {bulk} + {\mathcal {V}}_0 + {\mathcal {V}}_k) - \inf ({\mathcal {E}}_\mathrm {bulk} + {\mathcal {V}}_0) - \inf ({\mathcal {E}}_\mathrm {bulk} + {\mathcal {V}}_k),\nonumber \\ \end{aligned}$$
(2.12)

however in general the limits \(\beta ,k\rightarrow \infty \) cannot be interchanged and a full study of (2.11) for large k requires techniques beyond variational calculus.

A closely related problem is about the localization of changes induced by a defect: at zero temperature, if \((z_j)_{j\in {\mathbb {Z}}}\) is a minimizer of \({\mathcal {E}}_\mathrm {bulk} + {\mathcal {V}}_0\), how fast does \(z_k\) converge to the ground state spacing a as \(k\rightarrow \pm \infty \)? On a similar note, how fast does \(z_k\rightarrow a\) for a minimizer of the surface energy \({\mathcal {E}}_\mathrm {surf}\) (decay of boundary layers)? At positive temperature, the question is about the speed of convergence, for test functions \(f:{\mathbb {R}}_+^k \rightarrow {\mathbb {R}}\), in

$$\begin{aligned} \frac{\mu _\beta ({{\text {e}} }^{-\beta {\mathcal {V}}_0} f_{i})}{\mu _\beta ({{\text {e}} }^{-\beta {\mathcal {V}}_0})} \rightarrow \mu _\beta (f),\quad \nu _\beta (f_i)\rightarrow \mu _\beta (f) \end{aligned}$$

as \(i\rightarrow \infty \). Here \(f_i((z_j)_{j\in {\mathbb {Z}}}):= f(z_i,\ldots ,z_{i+k-1})\), so that \(f_{n+i} = f_i\circ \tau ^n\), where \(\tau \) denotes the left shift on \({\mathbb {R}}_+^{{\mathbb {Z}}}\). These questions naturally lead to the investigation of the decay of correlations. We start with a general result which holds for all \(\beta ,p>0\).

Theorem 2.9

Assume \(m\in {\mathbb {N}}\cup \{\infty \}\). There exist \(c,C>0\) such that for all \(\beta ,p>0\), \(k\in {\mathbb {N}}\), and bounded \(f,g:{\mathbb {R}}_+^k \rightarrow {\mathbb {R}}\),

$$\begin{aligned}&\bigl |\mu _\beta (f_0 g_n) - \mu _\beta (f_0) \mu _\beta (g_n)\bigr | \\&\quad \le \min _{\genfrac{}{}{0.0pt}{}{q\in {\mathbb {N}}:}{1 \le q \le n/k}} \Bigl ( (1- {{\text {e}} }^{-c\beta })^q + {{\text {e}} }^{c\beta } ({{\text {e}} }^{C \beta (q/n)^{s-2}} -1) \Bigr ) ||f||_\infty ||g||_\infty . \end{aligned}$$

When m is finite and \(k=m-1\), we have the stronger bound

$$\begin{aligned} \bigl |\mu _\beta (f_0 g_n) - \mu _\beta (f_0) \mu _\beta (g_n)\bigr | \le (1-{{\text {e}} }^{-c\beta })^{n/k} ||f||_\infty ||g||_\infty . \end{aligned}$$

The theorem is proven in section 4.2. When m is finite, it implies exponential decay of correlations as \(n\rightarrow \infty \); however, the rate \( -\log (1- {{\text {e}} }^{-c\beta })\) can be exponentially small for large \(\beta \). When m is infinite, Theorem 2.9 implies algebraic decay of correlations: for \(q=\lfloor n^{\varepsilon }\rfloor \) and sufficiently large n, \((1- {{\text {e}} }^{-c\beta })^q\) is negligible compared to \(\beta (q/n)^{s-2}\) and we find that as \(n\rightarrow \infty \),

$$\begin{aligned} \bigl |\mu _\beta (f_0 g_n) - \mu _\beta (f_0) \mu _\beta (g_n)\bigr | \le (1+ o(1)) \frac{C \beta \exp (c\beta )}{n^{(s-2)(1-\varepsilon )}} ||f||_\infty ||g||_\infty . \end{aligned}$$
(2.13)

Better bounds are available for restricted Gibbs measures. Let \({\tilde{\mu }}_\beta ^{\scriptscriptstyle {({N}})}\) be the measure \({\mathbb {Q}}_N^{\scriptscriptstyle {({\beta }})}\) conditioned on \([z_\mathrm {min},z_\mathrm {max}]^{N-1}\) and \({\tilde{\mu }}_\beta \) the probability measure on \([z_\mathrm {min},z_\mathrm {max}]^{\mathbb {Z}}\) obtained from the thermodynamic limit of \({\tilde{\mu }}_\beta ^{\scriptscriptstyle {({N}})}\).

Proposition 2.10

Let \(m \in {\mathbb {N}}\cup \{\infty \}\). There exists \(c>0\) such that for all \(\beta ,p>0\), smooth \(f,g:{\mathbb {R}}_+\rightarrow {\mathbb {R}}\), and \(i\ne j\),

$$\begin{aligned} \Bigl |{\tilde{\mu }}_\beta (f_i g_j) - {\tilde{\mu }}_\beta (f_i) {\tilde{\mu }}_\beta (g_j) \Bigr | \le \frac{c}{\beta |i-j|^{s}} \Bigl ( {\tilde{\mu }}_\beta ({f'_i}^2) {\tilde{\mu }}_\beta ({g'_j}^2) \Bigr )^{1/2}. \end{aligned}$$

Remark

When m is finite, the uniform algebraic decay for the restricted Gibbs measure is replaced with uniform exponential decay \(\exp (- \gamma |j-i|)\) with \(\beta \)-independent \(\gamma >0\).

The proposition is proven in section 7. It follows from the uniform convexity of the energy (Lemma 3.3) and known results from the realm of Brascamp–Lieb, Poincaré and Log–Sobolev inequalities. Proposition 2.10 differs from the estimate (2.13) in two ways: there is no exponentially large prefactor \(\exp (c\beta )\), and the rate of algebraic decay is \(1/n^s\) instead of \(1/n^{s-2}\). Exponentially large prefactors are absent because the energy landscape has no local minimum. The improved algebraic decay \(1/n^s\) arises, roughly, because the Gibbs measure is comparable to a Gaussian measure whose covariance is the inverse of the energy’s Hessian near the minimum, and instead of the tails of v(r), it is the tails of \(v''(r)\) that count.

We suspect that for large \(\beta \) and small pressure, these improvements should carry over to the full Gibbs measure \(\mu _\beta \), but we have proofs for interactions involving finitely many neighbors only.

Theorem 2.11

Assume \(2\le m<\infty \), \(p\in (0,p^*)\), and \(r_\mathrm {hc}>0\). There exist \(\gamma >0\) and, for every sufficiently large \(\beta \), a constant \(C(\beta )>0\) such that for all \(n\in {\mathbb {N}}\) and all \(f,g:{\mathbb {R}}_+^d \rightarrow {\mathbb {R}}\), we have

$$\begin{aligned} \bigl |\mu _\beta (f_{0} g_{n}) - \mu _\beta (f_{0}) \mu _\beta (g_{n})\bigr | \le C(\beta ) {{\text {e}} }^{-\gamma n} ||f_0||_\infty \, ||g_n||_\infty . \end{aligned}$$

If \(m=2\), we can pick \(C(\beta ) =1\).

The theorem is proven in section 6 with perturbation theory for compact integral operators in \(L^2({\mathbb {R}}^d)\). When \(m=2\), the relevant operators are self-adjoint and spectral norms and operator norms coincide, leading to improved statements. We conclude with a few comments.

Lagrangian vs. Eulerian point of view The theorems above formulate decay of correlations in terms of labelled spacings, which in the language of continuum mechanics is a Lagrangian viewpoint. On the other hand, in statistical mechanics of point particles it is more common to deal with unlabelled particles (Eulerian viewpoint) and correlations are between portions of space rather than labelled interparticle distances. The difference between the two approaches becomes quite clear for nearest neighbor interactions [\(m=1\), see equation (2.9)], for which the spacings are i.i.d. with probability density \(q_\beta (r)\) proportional to \(\exp ( - \beta [v(r)+ p_\beta r])\). Because of the independence of spacings, correlations in terms of spacings vanish, \(\mu _\beta (f_0 g_n) - \mu _\beta (f_0) \mu _\beta (g_n) =0\). On the other hand, the two-point function \(\rho _2(0,x)\) studied in statistical mechanics of particles is a sum over the number of particles contained in (0, x],

$$\begin{aligned} \rho _2(0,x) =\frac{1}{\ell (\beta )} \sum _{k=1}^\infty q_\beta ^{*k}(x) = \frac{q_\beta (x)}{\ell (\beta )} + \int _0^\infty q_\beta (x-y) \rho _2(0,y)\mathrm {d}y, \end{aligned}$$

with \(q_\beta ^{*k}\) the k-fold convolution of \(q_\beta \) with itself. It is a well-known fact from renewal theory [18, Chapter XI] that

$$\begin{aligned} \rho _2(0,x) - \frac{1}{\ell (\beta )^2} \rightarrow 0 \quad (x\rightarrow \infty ), \end{aligned}$$

but in general the difference is non-zero for finite x; in fact, by changing \(q_\beta \), the convergence as \(x\rightarrow \infty \) can be made arbitrarily slow, even though correlations of labelled interparticle spacings vanish identically. One should keep this difference in mind when browsing the literature.
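The renewal structure can be made concrete numerically. The sketch below is an illustration under assumed data: a truncated Gaussian spacing density with mean 1 and width \(\sigma\), for which the renewal density \(u(x)=\sum_k q^{*k}(x) = \ell\,\rho_2(0,x)\) is computed from the fixed-point iteration \(u = q + q*u\). A broad density equilibrates quickly, while a sharply peaked one is still far from its limit \(1/\ell\) at \(x=15\), illustrating the slow convergence just mentioned.

```python
import numpy as np

# Renewal-density sketch: u(x) = sum_k q^{*k}(x), so that rho_2(0,x) = u(x)/ell
# and, by the renewal theorem, u(x) -> 1/ell as x -> infinity.  The truncated
# Gaussian q below (mean 1, width sigma) is an assumption for illustration.
dx = 0.01
x = np.arange(0.0, 20.0, dx)

def renewal_density(sigma, n_iter=40):
    q = np.exp(-0.5 * ((x - 1.0) / sigma) ** 2)
    q /= q.sum() * dx                       # normalize on the grid
    ell = (x * q).sum() * dx                # mean spacing
    u = q.copy()
    for _ in range(n_iter):                 # fixed point of u = q + q*u
        u = q + np.convolve(q, u)[: x.size] * dx
    return u, ell

i15 = x.searchsorted(15.0)
u, ell = renewal_density(0.3)               # broad density: u(15)*ell near 1
u2, ell2 = renewal_density(0.05)            # peaked density: still far from 1
print(u[i15] * ell, u2[i15] * ell2)
```

With \(\sigma=0.05\) the peaks of the convolutions \(q^{*k}\) have barely begun to overlap at \(x=15\), so \(u(15)\ell\) is still about twice its limiting value.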

Path-large deviations, non-linear semi-groups, Bellman equation For \(m=2\), we may view \(\mu _\beta \) as the law of a stationary Markov chain with state space \({\mathbb {R}}_+\) and transition kernel \(P_\beta \) defined in equation (6.6). Theorem 2.4 is a path-large deviations result for the Markov chain. Path large deviations are often investigated with the help of non-linear semi-groups and Hamilton–Jacobi–Bellman equations [19]. In our context, a natural non-linear semi-group is

$$\begin{aligned} V_\beta ^nf:= - \frac{1}{\beta }\log \Bigl (P_\beta ^n {{\text {e}} }^{-\beta f} \Bigr ), \end{aligned}$$

and for sufficiently smooth f, we have a convergence of the form

$$\begin{aligned} \lim _{\beta \rightarrow \infty }V_\beta f (x) =- u(x)+ \inf _{y \in {\mathbb {R}}_+} \bigl ( p x + v(x) + v(x+y) - e_0 + u(y)+ f(y)\bigr ), \end{aligned}$$

where u solves

$$\begin{aligned} u(x) = \inf _{y\in {\mathbb {R}}_+} \bigl (p x + v(x) + v(x+y) - e_0 + u(y)\bigr ). \end{aligned}$$

Similar equations, motivated by quantum mechanics and geometric optics, appear in semi-classical analysis [24, Equation (5.4.4)]. Proposition 3.9 below provides an infinite-m ersatz and is instrumental in the proof of Theorem 2.4.
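For \(m=2\) the constant \(e_0\) in the equation for u is the minimal average cost per step, and a discretized version can be solved directly. The sketch below estimates \(e_0\) by min-plus value iteration on a grid of spacings; the 12-6 Lennard–Jones potential and \(p=0.1\) are assumptions for illustration. By Theorem 2.1 (crystallization), the result should match the energy per particle of the periodic chain.

```python
import numpy as np

# Zero-temperature Bellman equation for m = 2 (numerical sketch):
#   u(x) = inf_y [ p*x + v(x) + v(x+y) - e0 + u(y) ],
# with e0 the minimal average cost per step.  We estimate e0 by min-plus
# value iteration: D_k(y) = minimal cost of a k-step path ending at y, and
# D_k/k -> e0.  The 12-6 potential and p = 0.1 are chosen for illustration.
def v(r):
    return r**-12 - 2.0 * r**-6

p = 0.1
x = np.linspace(0.8, 1.3, 200)                    # grid of spacings
C = p * x[:, None] + v(x)[:, None] + v(x[:, None] + x[None, :])  # c(x_i, x_j)

k = 2000
D = np.zeros(len(x))
for _ in range(k):                                # min-plus product with C
    D = np.min(D[:, None] + C, axis=0)
e0 = D.min() / k

# Periodic (crystalline) ground state: e0 = min_r [ p*r + v(r) + v(2r) ].
e0_periodic = np.diag(C).min()
print(e0, e0_periodic)
```

The agreement of the two numbers reflects the fact that, for this convex-near-the-minimum potential, the optimal infinite path is the constant (periodic) one.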

Vanishing pressure When \(\beta p = \beta p_\beta \rightarrow 0\) faster than \(\exp (- \beta |e_0^0|)\) [see (2.10)], the Gibbs measure should no longer be comparable to a Gaussian. Instead, it should be close to the ideal gas measure, for which spacings are i.i.d. exponentially distributed with parameter \(\beta p_\beta \), and we may again expect uniform exponential decay of correlations (for finite m). When \(\beta p_\beta \rightarrow 0\) at a speed comparable to \(\exp (-\beta |e_0^0|)\), we should instead expect an exponentially small spectral gap: the Markov chain has two metastable wells, one corresponding to the optimal spacing a and another well at infinity. The exponentially small spectral gap is associated with the fracture of the chain of atoms, in the spirit of “fracture as a phase transition” [54].

3 Energy Estimates

In this section we analyze the variational problems arising at zero temperature. Throughout the section we assume that \(p\in [0,p^*)\) as in Assumption 2.

3.1 Bulk Periodicity

Lemma 3.1

Every minimizer of \({\mathcal {E}}_N:{\mathbb {R}}_+^{N-1} \rightarrow {\mathbb {R}}\) lies in \([z_{\min },z_{\max }]^{N-1}\).

Proof

Let \(z_1,\ldots ,z_{N-1}>0\). If \(z_j > z_{\max }\) for some j, define a new configuration by shrinking \(z_j\) to \(z_{\max }\), leaving all other spacings unchanged: \(z'_i = z_i\) for \(i \ne j\) and \(z'_j= z_{\max }\). Since \(z_\mathrm{max}\) is a strict minimizer of v and \(r \mapsto v(r)\) increases on \([z_{\max }, \infty )\), shrinking the bonds decreases \({\mathcal {E}}_N\) strictly and the original configuration could not have been a minimizer.

If some interparticle spacing is smaller than \(z_{\min }\), we remove a particle and reattach it to one end of the chain as follows. Assume \(b:= \min (z_1,\ldots ,z_{N-1})< z_{\min }\) and let \(j\in \{1,\ldots ,N-1\}\) with \(z_j=b\). Let \(x_1=0\) and \(x_i = z_1+\cdots + z_{i-1}\), \(i=2,\ldots ,N\), be the associated particle positions. Thus \(x_{j+1}-x_j =z_j = b\) and \(x_{i+1}- x_i \ge b\) for all i. The interaction of \(x_j\) with all other particles is

$$\begin{aligned} v(b) + \sum _{i=1}^{\min \{m-1,N-j-1\}} v(z_{j} + \ldots + z_{j+i}) + \sum _{i=1}^{\min \{m,j-1\}} v(z_{j-1} + \ldots + z_{j-i}). \end{aligned}$$

For finite m we note that, if \(v(z_{j-i} + \ldots + z_{j-i+m}) > 0\) for some \(i \in \{ 1, \ldots , \min \{m,j-1\} \}\), then \(v(z_{j-i} + \ldots + z_{j-i+m}) < v(z_{j-i} + \ldots + z_{j-1})\) by Assumption 1(i). Removing the particle \(x_j\) thus leads to a configuration of \(N-1\) atoms whose energy has decreased by at least

$$\begin{aligned} \Delta _1 = v(b) + v(z_{\max }) - 2 \alpha _1 \sum _{n=2}^m (n b)^{-s} \ge v(b) + v(z_{\max }) - 2 \alpha _1 \sum _{n=2}^\infty (n b)^{-s} > 0.\nonumber \\ \end{aligned}$$
(3.1)

The last inequality holds because of Assumption 1(ii) and \(b<z_{\min }\). We define a new configuration by attaching the removed particle to one end of the chain at a distance \(r = z_{\max }\). Since \(v(z_{\max }) + p z_{\max } < 0\) by Assumption 2, this decreases \({\mathcal {E}}_N\) further, so overall the new configuration has strictly smaller energy, and the original sequence of spacings cannot be a minimizer of \({\mathcal {E}}_N\). \(\square \)

At zero pressure, it is a well-known fact that the N-particle energy is subadditive, \(E_{N+M}\le E_N + E_M\). Indeed, placing an N-particle and an M-particle minimizer side by side at a large mutual distance yields, since \(v(r)\rightarrow 0\) as \(r\rightarrow \infty \), an \((N+M)\)-particle configuration with energy \(\le E_N+E_M\). Positive pressure penalizes large mutual distances between two consecutive blocks, so the construction has to be modified.

Lemma 3.2

Let \(m\in {\mathbb {N}}\cup \{\infty \}\) and \(p\in [0,p^*)\). Then \(E_{N+M-1}\le E_{N} + E_{M}\) for all \(N,M\in {\mathbb {N}}\), and the limit \(e_0= \lim E_N/N\) exists and satisfies \(E_N \ge (N -1)e_0\) for all \(N\in {\mathbb {N}}\).

Proof

Let \(z\in (r_\mathrm {hc},\infty )^{N-1}\) and \(w\in (r_\mathrm {hc},\infty )^{M-1}\) be minimizers of \({\mathcal {E}}_N\) and \({\mathcal {E}}_{M}\) respectively. Define \(y\in (r_\mathrm {hc},\infty )^{M+N-2}\) by concatenating z and w. By Lemma 3.1, all spacings are in \([z_{\min }, z_{\max }]\). Therefore interactions that involve bonds from both blocks are for spacings \(\ge 2 z_{\min } >z_{\max }\), hence negative, and

$$\begin{aligned} E_{N+M-1} \le {\mathcal {E}}_{N+M-1}(y) \le E_N + E_M. \end{aligned}$$

As a consequence, \(a_n:= E_{n+1}\) is subadditive. By Fekete’s subadditive lemma, the limit \(e_0= \lim a_n/n = \lim E_{n}/n\) exists and is equal to the infimum of \(a_n/n\), hence \( E_{N} \ge (N-1) e_0\). Notice that \(e_0>-\infty \) since

$$\begin{aligned} E_n \ge (n-1) \Big ( v(z_{\max }) + \sum _{j=2}^{\infty } v(j z_{\min }) \Big ) \ge (n-1) \Big ( v(z_{\max }) - \alpha _1 z_{\min }^{-s} \sum _{j=2}^{\infty } j^{-s} \Big ). \end{aligned}$$

(In the terminology of statistical mechanics, the energy is stable [44, Chapter 3.2].) \(\square \)
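The subadditivity \(E_{N+M-1}\le E_N+E_M\) can be observed numerically. In the sketch below (a 12-6 Lennard–Jones potential, \(m=3\) and \(p=0.1\), all assumptions for illustration), each \(E_N\) is computed by projected gradient descent on the box \([z_{\min},z_{\max}]\); by Lemmas 3.1 and 3.3 the minimization is convex there, so the method finds the unique minimizer of Theorem 2.1.

```python
import numpy as np

# Numerical check of subadditivity E_{N+M-1} <= E_N + E_M (Lemma 3.2) for a
# 12-6 Lennard-Jones chain with m = 3 interacting neighbors and p = 0.1
# (all choices for illustration).  E_N is minimized by projected gradient
# descent; by Lemma 3.3 the energy is uniformly convex on the box.
m, p = 3, 0.1
v  = lambda r: r**-12 - 2.0 * r**-6
vp = lambda r: -12.0 * r**-13 + 12.0 * r**-7

def E(z):
    total = p * z.sum()
    for k in range(1, m + 1):
        s = np.convolve(z, np.ones(k), "valid")   # sums z_j + ... + z_{j+k-1}
        total += v(s).sum()
    return total

def E_min(N, steps=500, lr=0.01):
    z = np.full(N - 1, 1.0)
    for _ in range(steps):
        g = np.full(N - 1, p)
        for k in range(1, m + 1):
            s = np.convolve(z, np.ones(k), "valid")
            g += np.convolve(vp(s), np.ones(k))    # spread v' over the k bonds
        z = np.clip(z - lr * g, 0.8, 1.3)          # project onto [z_min, z_max]
    return E(z)

E6, E7, E12 = E_min(6), E_min(7), E_min(12)
print(E12, E6 + E7)        # E_12 <= E_6 + E_7
```

The inequality is strict here because the cross-block interactions in the concatenated configuration are negative.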

The next lemma in particular shows that \({\mathcal {E}}_N\) is uniformly convex on \([z_{\min },z_{\max }]^{N-1}\). For later purposes, we state and prove this on a slightly larger set.

Lemma 3.3

There are constants \(\varepsilon ,\eta ,C>0\) such that for all \(m, N, N_1, N_2 \in {\mathbb {N}}\) with \(N_1 < N_2 \le N\), and \(z=(z_1,\ldots ,z_{N-1}) \in [z_{\min },\infty )^{N-1}\) with \(z_j \le z_{\max }+\varepsilon \) for \(N_1 \le j \le N_2-1\), the Hessian of \({\mathcal {E}}_N\) at z satisfies

$$\begin{aligned} \eta \sum _{j=N_1}^{N_2-1} \zeta _j^2 \le \sum _{i,j=N_1}^{N_2-1} \zeta _i \zeta _j \partial _i \partial _j {\mathcal {E}}_N(z) \le C \sum _{j=N_1}^{N_2-1} \zeta _j^2 \end{aligned}$$

for all \(\zeta \in {\mathbb {R}}^{N-1}\). Moreover, the submatrix \((\partial _i\partial _j {\mathcal {E}}_N(z))_{N_1 \le i,j \le N_2-1}\) of the Hessian has strictly positive diagonal entries \(\partial _i^2 {\mathcal {E}}_N(z) >0\) and non-positive off-diagonal entries \(\partial _i\partial _j {\mathcal {E}}_N(z) \le 0\). In particular, this matrix is monotone.

Note that the Hessian is independent of the pressure p.

Proof

Let \({\mathcal {L}}\) be the collection of discrete intervals \(\{i,\ldots ,j-1\}\subset \{1,\ldots ,N-1\}\) of length \(j-i \le m\). Then for all \(i,j\)

$$\begin{aligned} \partial _i \partial _j {\mathcal {E}}_N(z) = \sum _{L\in {\mathcal {L}}:\, \{i,j\}\subset L} v''\Big (\sum _{l\in L} z_l\Big ). \end{aligned}$$

For \(i\ne j\) and \(i,j\in L\) we have \(\sum _{l\in L} z_l \ge 2 z_{\min }\), hence \(v''(\sum _{l\in L} z_l) \le 0\); it follows that the off-diagonal entries of the Hessian are non-positive. Next we show that the row sums are bounded from below by some constant \(\eta >0\) if \(N_1 \le i \le N_2-1\):

$$\begin{aligned} \sum _{j = 1}^{N-1} \partial _i \partial _j {\mathcal {E}}_N(z)&= \partial _i^2 {\mathcal {E}}_N(z)+ \sum _{j: j \ne i} \partial _j \partial _i {\mathcal {E}}_N(z) \\&= v''(z_i) + \sum _{L\ni i,\#L\ge 2} v'' \Big ( \sum _{l\in L} z_l \Big ) + \sum _{j: j\ne i} \sum _{L\supset \{i,j\}} v'' \Big ( \sum _{l\in L} z_l \Big ) \\&\ge v''(z_i) + \sum _{n = 2}^m v''(n z_{\min }) \sum _{L\ni i,\#L=n} \Big (1 + \sum _{j\in L, j\ne i} 1\Big ) \\&\ge \min _{z_{\min } \le r \le z_{\max }+\varepsilon } v''(r) + \sum _{n=2}^\infty n^2 v''(n z_{\min }) =: \eta . \end{aligned}$$

Assumption 1 guarantees that \(\eta >0\) for \(\varepsilon > 0\) sufficiently small. Thus row sums are positive, off-diagonal matrix elements non-positive, and consequently diagonal elements positive. Moreover, with \(C = 2 \max \{ v''(r) \mid r \in [z_{\min }, z_{\max }+\varepsilon ] \}\) the diagonal elements are bounded from above by \(\frac{C}{2}\). The proof of the lemma is then completed with the help of standard arguments: for example, every eigenvalue of \((\partial _i\partial _j {\mathcal {E}}_N(z))_{N_1 \le i,j \le N_2-1}\) lies in a Gershgorin circle with center \(\partial _i^2 {\mathcal {E}}_N\) and radius \(\sum _{j\ne i} |\partial _i\partial _j {\mathcal {E}}_N|\). In particular, \((\partial _i\partial _j {\mathcal {E}}_N(z))_{N_1 \le i,j \le N_2-1}\) is an M-matrix and thus monotone. \(\square \)
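The structure established in Lemma 3.3 is easy to verify numerically for a concrete potential. In the following sketch (12-6 Lennard–Jones \(v''\) at a uniform configuration \(z_j=a=1\), with \(N=30\) and \(m=5\), all choices assumptions for illustration), the Hessian has non-positive off-diagonal entries and all eigenvalues above the smallest row sum, exactly as in the Gershgorin argument.

```python
import numpy as np

# Numerical illustration of Lemma 3.3: for a 12-6 Lennard-Jones v (chosen
# for illustration), the Hessian of E_N at the uniform configuration z_j = a
# has positive diagonal, non-positive off-diagonal entries, and eigenvalues
# bounded below by the smallest row sum (Gershgorin with negative off-diagonals).
N, m, a = 30, 5, 1.0
vpp = lambda r: 156.0 * r**-14 - 84.0 * r**-8   # v''

H = np.zeros((N - 1, N - 1))
for start in range(N - 1):            # intervals L = {start, ..., start+k-1}
    for k in range(1, m + 1):
        if start + k <= N - 1:
            H[start:start + k, start:start + k] += vpp(k * a)

assert np.all(H[~np.eye(N - 1, dtype=bool)] <= 0)    # off-diagonal <= 0
eigs = np.linalg.eigvalsh(H)
row_sums = H.sum(axis=1)
print(eigs.min(), row_sums.min())     # 0 < min row sum <= min eigenvalue
```
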

Proof of Theorem 2.1

(a) By Lemma 3.1 minimizers lie in the compact set \([z_{\min }, z_{\max }]^{N-1}\). On that set the Hessian of \({\mathcal {E}}_N\) is positive definite because of Lemma 3.3, so \({\mathcal {E}}_N\) is strictly convex and the minimizer is unique.

(b) The proof of the convergence \(z_j^{\scriptscriptstyle {({N}})}\rightarrow a\) as \(j,N\rightarrow \infty \) with \(N-j\rightarrow \infty \), where \(a \in [z_{\min }, z_{\max }]\) is the unique minimizer of \({\mathbb {R}}_+\ni r\mapsto p r+ \sum _{k=1}^m v(kr)\), is a straightforward adaptation, with the help of Lemma 3.3, of the corresponding proof in [22] and will be omitted. By Assumption 1(ii) we even have \(a > z_{\min }\). We remark that the proof in [22] also shows that \(\max \{ z_j^{\scriptscriptstyle {({N+1}})},\ z_{j+1}^{\scriptscriptstyle {({N+1}})} \} \le z_j^{\scriptscriptstyle {({N}})}\) for \(j = 1, \ldots , N-1\). This in turn implies that the convergence is in fact uniform away from a boundary layer of vanishing volume fraction.

(c) This observation in combination with Lemma 3.2 yields (c). Note that \(e_0 < 0\) since \(e_0 \le pz_{\max } + \sum _{k=1}^{\infty } v(k z_{\max }) \le p z_{\max } + v(z_{\max }) < 0\) by Assumptions 1 and 2. \(\square \)

Notice that also \(a < z_{\max }\) except for the exceptional cases in which only nearest neighbors interact, that is \(m = 1\) or \(v(z) = 0\) for \(z \ge 2 z_{\max }\), and the pressure vanishes.

3.2 Surface Energy

Proposition 3.4

Let \(m\in {\mathbb {N}}\cup \{\infty \}\) and \(p\ge 0\). Then

$$\begin{aligned} \lim _{N\rightarrow \infty }(E_N- Ne_0)= e_\mathrm{surf} = 2 \inf _{{\mathcal {D}}_0} {\mathcal {E}}_\mathrm {surf} - p a - \sum _{k=1}^m k v(ka). \end{aligned}$$

Proof

For simplicity we write down the proof for \(m=\infty \); the proof when \(m\in {\mathbb {N}}\) is completely analogous. Fix \(k\ge 2\) and \(\varepsilon >0\). Let \(n_1,n_2\in {\mathbb {N}}\) with \(n_2\ge k\) and \(N= n_1+n_2+1\). Let \(z=(z_{-n_1},\ldots , z_{n_2-1})\in [z_{\min },z_{\max }]^{n_1+n_2}\) be the spacings of the N-particle ground state, labelled by \(j=-n_1,\ldots ,n_2-1\) rather than \(1,\ldots ,N-1\). Choosing \(n_1\) and \(n_2\) large enough we may assume \(\sum _{j=0}^{k-1}|z_j-a|^2\le \varepsilon \). Since the Hessian has matrix norm uniformly bounded from above (Lemma 3.3), changing the spacings \(z_0,\ldots ,z_{k-1}\) to a increases the energy by at most \(C \varepsilon \), thus

$$\begin{aligned} E_N \ge {\mathcal {E}}_N(z_{-n_1},\ldots ,z_{-1},a,\ldots ,a, z_k,\ldots ,z_{n_2-1}) - C\varepsilon . \end{aligned}$$

We group the atoms into a left, middle and right block and decompose the energy of the modified configuration as \(A_N+ B_N+ C_N+ D_N\) where

$$\begin{aligned} \begin{aligned} A_N&= {\mathcal {E}}_{n_1+1} (z_{-n_1},\ldots ,z_{-1}) + {\mathcal {W}}(z_{-n_1},\ldots ,z_{-1};a,\ldots ,a),\\ B_N&= {\mathcal {E}}_{k+1} (a,\ldots ,a), \\ C_N&= {\mathcal {W}}(a,\ldots ,a; z_k,\ldots ,z_{n_2-1}) + {\mathcal {E}}_{n_2-k+1}(z_k,\ldots ,z_{n_2-1}),\\ D_N&= \sum _{i=-n_1}^{-1}\sum _{j=k}^{n_2} v(z_i+\cdots + z_{-1} + ka + z_k+\cdots + z_j), \end{aligned} \end{aligned}$$

where \({\mathcal {W}}\) gathers interactions that involve bonds from two consecutive blocks. The term \(D_N\) represents the interactions between the left and right blocks. It satisfies

$$\begin{aligned} 0 \ge D_N \ge \sum _{n=k}^\infty (n-k) v(n z_\mathrm{min}) \ge - \alpha _1 \sum _{n=k}^\infty \frac{n-k}{(n z_\mathrm{min})^s} \ge - \frac{\alpha _1}{z_\mathrm{min}^s}\, \sum _{n=k}^\infty \frac{1}{n^{s-1}}, \end{aligned}$$

which goes to zero as \(k\rightarrow \infty \). Next we subtract \(Ne_0\) from \({\mathcal {E}}_N\) and distribute it as \(Ne_0= n_1 e_0 + (k+1)e_0 +(n_2-k) e_0\) over the first three sums. The middle block contributes

$$\begin{aligned} \begin{aligned} B_N - (k+1) e_0&= \sum _{n=1}^{k} (k-n+1) v(na) + kpa - (k+1)pa - (k+1) \sum _{n=1}^\infty v(na) \\&= - pa - \sum _{n=1}^{k} n v(na) - (k+1) \sum _{n=k+1}^\infty v(na) \rightarrow - pa - \sum _{n=1}^\infty n v(na) \end{aligned} \end{aligned}$$

as \(k \rightarrow \infty \). For the first block, we notice that

$$\begin{aligned} A_N - n_1 e_0 \ge {\mathcal {E}}_\mathrm {surf} (z_{-n_1},\ldots ,z_{-1},a,a,\ldots ) \ge \inf _{{\mathcal {D}}_0}{\mathcal {E}}_\mathrm {surf} . \end{aligned}$$

Indeed, the only missing pieces are the negative interactions between the left block and the right tail of a semi-infinite chain. The contribution of the right block \(C_N\) is estimated in a similar way. We combine the estimates and let first \(n_1,n_2 \rightarrow \infty \), then \(k\rightarrow \infty \), and finally \(\varepsilon \rightarrow 0\) and find

$$\begin{aligned} \liminf _{N\rightarrow \infty } \bigl (E_N - N e_0\bigr ) \ge 2 \inf _{{\mathcal {D}}_0} {\mathcal {E}}_\mathrm {surf} - pa - \sum _{n=1}^\infty n v(na). \end{aligned}$$

For the upper bound, we take approximate minimizers of \({\mathcal {E}}_\mathrm {surf}\) and glue them together to an N-particle configuration by assigning them to the left and right boundaries, with spacings a in between. This yields an N-particle configuration with energy \({\mathcal {E}}_N(z) - Ne_0 \le 2 \inf _{{\mathcal {D}}_0} {\mathcal {E}}_\mathrm {surf} - pa - \sum _{n=1}^\infty n v(na) + O(\varepsilon )\), and the required upper bound follows. \(\square \)

Next we extend \({\mathcal {E}}_\mathrm {surf}\) to the space \({\mathcal {D}} \subset (r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\) of sequences with \(\sum _{j=1}^\infty (z_j-a)^2<\infty \).

Lemma 3.5

Let \(m\in {\mathbb {N}}\cup \{\infty \}\). Let \(\theta _j = \sum _{k=j+1}^m (k-j) v'(ka)\), \(j=1,\ldots ,m-1\). Then for all \((z_j)_{j\in {\mathbb {N}}} \in {\mathcal {D}}_0\), we have

$$\begin{aligned} {\mathcal {E}}_\mathrm {surf}((z_j)_{j\in {\mathbb {N}}})= & {} - \sum _{j=1}^{m-1} \theta _j (z_j - a) + \sum _{j=1}^\infty \sum _{k=1}^m \Bigl [ v\Bigl (\sum _{i=j}^{j+k-1} z_i\Bigr ) - v(ka) \nonumber \\&- v'(ka) \sum _{i=j}^{j+k-1} (z_i-a) \Bigr ]. \end{aligned}$$
(3.2)

The right-hand side is absolutely convergent for all \((z_j)_{j\in {\mathbb {N}}} \in {\mathcal {D}}\).

We remark that the double sum in (3.2) might be interpreted as a bulk energy of the semi-infinite chain in the sense that this term coincides with the expression for \({\mathcal {E}}_\mathrm {bulk}\) derived in (3.4) below when the summation is restricted to positive j there.

Proof

Recall from (2.2) that

$$\begin{aligned} {\mathcal {E}}_\mathrm {surf} ((z_j)_{j\in {\mathbb {N}}} ) = \sum _{j=1}^\infty \Bigl [ p \gamma _j + \sum _{k=1}^m \bigl ( v(ka+\gamma _j+ \cdots + \gamma _{j+k-1}) - v(ka)\bigr ) \Bigr ], \end{aligned}$$

where \(\gamma _j = z_j - a\). The equilibrium condition \(p + \sum _{k=1}^m k v'(ka)=0\) yields

$$\begin{aligned}&\sum _{j=1}^\infty \sum _{k=1}^m v'(ka) (\gamma _j+ \cdots + \gamma _{j+k-1}) \\&\quad = \sum _{i=1}^\infty \gamma _i \sum _{k=1}^m v'(ka) \#\{j \ge 1\mid j \le i \le j+k-1\} \\&\quad = \sum _{i=1}^\infty \gamma _i \sum _{k=1}^m v'(ka) \min (i,k) \\&\quad = - \sum _{i=1}^{\infty } p \gamma _i - \sum _{i=1}^{m-1} \gamma _i \sum _{k=i+1}^{m}(k-i) v'(ka) = - \sum _{i=1}^{\infty } p \gamma _i - \sum _{i=1}^{m-1} \theta _i \gamma _i \end{aligned}$$

and the alternate expression for \({\mathcal {E}}_\mathrm {surf}\) follows. Next consider \((\gamma _j) \in \ell ^2({\mathbb {N}})\) with \(\gamma _j > r_\mathrm{hc} -a\) for all \(j \in {\mathbb {N}}\). Under Assumption 1 the derivatives behave as \(v''(r) = O(r^{-s-2})\) and \(v'(r) = O(r^{-s-1})\) as \(r\rightarrow \infty \) with \(s>2\). It follows that \(\theta _j = \sum _{k=j+1}^m (k-j) v'(ka)\) decays like \(\int _{ja}^\infty r \times r^{-s-1} \mathrm {d}r = O(j^{-s+1})\), so that \(\sum _{j=1}^\infty \theta _j ^2 <\infty \). The Cauchy–Schwarz inequality then shows that

$$\begin{aligned} \sum _{j=1}^{m-1} \bigl |\theta _j \gamma _j \bigr | \le c \Bigl (\sum _{j=1}^\infty \gamma _j^2\Bigr )^{1/2} \end{aligned}$$

for some suitable m-independent constant c. In particular, when \(m=\infty \) the sum \(\sum _j \theta _j \gamma _j\) is absolutely convergent. In order to show that the double sum over k and j in equation (3.2) is absolutely convergent, we proceed with estimates analogous to Lemma 3.3. Assume first that all spacings \(z_j = \gamma _j + a\) are larger than \(z_\mathrm{min}\). Set \(\sup _{r\ge z_{\min }} |v''(r)|=c_1\) and note that, by Assumption 1(iii) for all \(k\ge 2\), \(\sup _{r\ge kz_{\min }} |v''(r)| \le |v''(k z_{\min })|\). Hence

$$\begin{aligned}&2 \sum _{j=1}^\infty \sum _{k=1}^m \bigl | v(ka+\gamma _j+ \cdots + \gamma _{j+k-1}) - v(ka) - v'(ka) (\gamma _j+ \cdots + \gamma _{j+k-1}) \bigr | \\&\quad \le c_1 \sum _{j=1}^\infty \gamma _j^2 + \sum _{j=1}^\infty \sum _{k=2}^m |v''(k z_{\min })| \, ( \gamma _j + \cdots + \gamma _{j+k-1}\bigr )^2 \\&\quad \le c_1 \sum _{j=1}^\infty \gamma _j^2 + \sum _{j=1}^\infty \sum _{k=2}^m k |v''(kz_{\min })| \, ( \gamma _j^2 + \cdots + \gamma _{j+k-1}^2\bigr ) \\&\quad \le \Bigl (c_1 + \sum _{k=1}^m k^2 |v''(kz_{\min })| \Bigr ) \sum _{j=1}^\infty \gamma _j^2. \end{aligned}$$

More generally, if \((\gamma _j)\in \ell ^2({\mathbb {N}}) \cap (r_\mathrm{hc}-a, \infty )^{{\mathbb {N}}}\), then \(\gamma _j \rightarrow 0\) and, because of \(a>z_{\min }\), there is an \(i \in {\mathbb {N}}\) such that \(z_j \ge z_{\min }\) for all \(j\ge i\). Let \(\varepsilon = \min \{ |z_j| \mid j=1,\ldots ,i\}\). Summands with \(j \ge i\) can be estimated as before. For \(j\le i\) and \(k \ge i+2\), we proceed as before as well, except that we replace \(v''(k z_{\min })\) by \(v''((k-i)z_{\min } + i \varepsilon )\). This leaves a finite sum over \(j \le i\), \(k \le i+1\), and overall the sum is absolutely convergent. \(\square \)

Lemma 3.6

The map \({\mathcal {D}} \rightarrow {\mathbb {R}}\), \((z_j) \mapsto {\mathcal {E}}_\mathrm {surf}\bigl ( (z_j)_{j\in {\mathbb {N}}}\bigr )\) defined by (3.2) is continuous.

Proof

Let \(z, z^{\scriptscriptstyle {({1}})}, z^{\scriptscriptstyle {({2}})}, \ldots \) be sequences in \({\mathcal {D}}\) such that \(z^{\scriptscriptstyle {({n}})} - z \rightarrow 0\) in \(\ell ^2({\mathbb {N}})\), and write \(\gamma ^{\scriptscriptstyle {({n}})}_j = z^{\scriptscriptstyle {({n}})}_j - a\). As \(\lim _{i\rightarrow \infty } \sum _{j \ge i} (\gamma ^{\scriptscriptstyle {({n}})}_j)^2 = 0\) uniformly in n, the estimates above show that for every \(\varepsilon >0\), we can find \(i \in {\mathbb {N}}\) such that the sum over \(\{(j,k) \mid j \ge i \text { or } k \ge i\}\) contributes to \({\mathcal {E}}_\mathrm {surf}(z^{\scriptscriptstyle {({n}})})\) and \({\mathcal {E}}_\mathrm {surf}(z)\) an amount bounded by \(\varepsilon \). In the remaining finite sum the continuity of v(r) allows us to pass to the limit. The proof is easily concluded with an \(\varepsilon /3\) argument. \(\square \)

Lemma 3.7

The restriction of \({\mathcal {E}}_\mathrm {surf}\) to \({\mathcal {D}}\cap [z_{\min },z_{\max }+\varepsilon ]^{\mathbb {N}}\) is strictly convex and satisfies

$$\begin{aligned} {\mathcal {E}}_\mathrm {surf}\bigl ( (z_j)_{j\in {\mathbb {N}}}\bigr ) \ge c_1 \sum _{j=1}^\infty (z_j-a)^2 -c_2 \end{aligned}$$

for suitable m-independent constants \(\varepsilon ,c_1,c_2>0\).

Proof

The proof of the convexity is similar to Lemma 3.3 and therefore omitted. For the coercivity, consider first \(m=\infty \). Let \(\gamma _j = z_j -a\), \(\gamma _j^{\scriptscriptstyle {({n}})} = \gamma _j \mathbb {1}_{\{j \le n\}}\) the truncated strain, and \(z_j^{\scriptscriptstyle {({n}})} = a+ \gamma _j^{\scriptscriptstyle {({n}})}\). Then

$$\begin{aligned} {\mathcal {E}}_\mathrm {surf}(z^{\scriptscriptstyle {({n}})})&= \sum _{j=1}^n \bigl (h(z_j^{\scriptscriptstyle {({n}})},z_{j+1}^{\scriptscriptstyle {({n}})},\ldots )-e_0) \\&= {\mathcal {E}}_{n+1}(z_1,\ldots ,z_n) - n e_0 + \sum _{j=1}^n \sum _{k=1}^\infty v(z_j+\cdots + z_n + k a), \end{aligned}$$

thus

$$\begin{aligned} {\mathcal {E}}_{n+1}(z_1,\ldots ,z_n) - n e_0\le {\mathcal {E}}_\mathrm {surf} (z^{\scriptscriptstyle {({n}})}) + C, \end{aligned}$$

where \(C = - \sum _{k,\ell =1}^\infty v(\ell z_{\min } + k a) <\infty \). Next we cut and paste \((z_1,\ldots ,z_n)\) into the middle of a large ground state chain: let \(k_1,k_2\in {\mathbb {N}}\) with \(k_2 \ge n+1\), \(N=k_2+k_1+1\) and \((z_{-k_1+1}^{\scriptscriptstyle {({N}})},\ldots ,z_{k_2}^{\scriptscriptstyle {({N}})})\) the spacings of the N-particle ground state. Let \(z'=(z_{-k_1+1}^{\scriptscriptstyle {({N}})},\ldots ,z_0^{\scriptscriptstyle {({N}})}, z_1,\ldots ,z_n, z_{n+1}^{\scriptscriptstyle {({N}})},\ldots ,z_{k_2}^{\scriptscriptstyle {({N}})})\). A Taylor expansion of \({\mathcal {E}}_N\) around the minimizer \(z^{\scriptscriptstyle {({N}})}\) together with Lemma 3.3 and Theorem 2.1 yields

$$\begin{aligned} {\mathcal {E}}_N(z') - {\mathcal {E}}_N(z^{\scriptscriptstyle {({N}})}) \ge \frac{\eta }{2} \sum _{j=1}^n (z_j - z_j^{\scriptscriptstyle {({N}})})^2 \rightarrow \frac{\eta }{2} \sum _{j=1}^n (z_j-a)^2 \quad (k_1,k_2\rightarrow \infty ). \end{aligned}$$
(3.3)

On the other hand, let \(C_1 = \sum _{\ell =2}^\infty \ell |v(\ell z_{\min })|\) be a bound for interactions between blocks and recall that \(E_k \ge k e_0\) by Lemma 3.2 and \(e_0 \le 0\). Then

$$\begin{aligned} {\mathcal {E}}_N(z') - {\mathcal {E}}_N(z^{\scriptscriptstyle {({N}})}) \le&2 C_1 + {\mathcal {E}}_{k_1+1}(z_{-k_1+1}^{\scriptscriptstyle {({N}})},\ldots ,z_0^{\scriptscriptstyle {({N}})}) + {\mathcal {E}}_{n+1}(z_1,\ldots ,z_n) \\&+ {\mathcal {E}}_{k_2-n+1}(z_{n+1}^{\scriptscriptstyle {({N}})},\ldots ,z_{k_2}^{\scriptscriptstyle {({N}})}) - E_N \\ \le&4 C_1 + {\mathcal {E}}_{n+1}(z_1,\ldots ,z_n) - {\mathcal {E}}_{n+1}(z_1^{\scriptscriptstyle {({N}})},\ldots ,z_n^{\scriptscriptstyle {({N}})}) \\ \le&4 C_1 + {\mathcal {E}}_{n+1}(z_1,\ldots ,z_n) - (n+1) e_0 \\ \le&4 C_1 - e_0 + C + {\mathcal {E}}_\mathrm {surf} (z^{\scriptscriptstyle {({n}})}) = C_2 + {\mathcal {E}}_\mathrm {surf} (z^{\scriptscriptstyle {({n}})}). \end{aligned}$$

We combine with equation (3.3) and let first \(k_1,k_2\rightarrow \infty \), then \(n\rightarrow \infty \), and conclude that \(\frac{\eta }{2}\sum _{j=1}^\infty \gamma _j^2 \le {\mathcal {E}}_\mathrm {surf}(z) + C_2\) with the help of Lemma 3.6. This proves the coercivity in the case \(m=\infty \). The proof for finite m is similar. \(\square \)

Lemma 3.8

The surface energy \({\mathcal {E}}_\mathrm {surf}\) has a unique minimizer in \({\mathcal {D}}\). The minimizer is in \({\mathcal {D}}\cap [z_{\min }, z_{\max }]^{\mathbb {N}}\).

Proof

We proceed as in section 3.1. Let \((z_j)_{j\in {\mathbb {N}}}\in {\mathcal {D}}\). If one of the \(z_j\)’s is larger than \(z_{\max }\), we can define a new configuration by shrinking this spacing to \(z_{\max }\), leaving all other spacings unchanged. This decreases \({\mathcal {E}}_\mathrm {surf}\). If one of the \(z_j\)’s is smaller than \(z_{\min }\), let b be the smallest among them, and \(j \in {\mathbb {N}}\) with \(b=z_j\). Then we can define a new configuration by removing a particle and possibly shrinking a bond, that is \((z_1,z_2,\ldots ) \mapsto (z_1,z_2,\ldots ,z_{j-1}, \min (z_{j} + z_{j+1}, z_{\max }), z_{j+2}, \ldots )\). Since \(e_0 \le 0\), just as in Lemma 3.1, we see that this decreases the energy. Repeating these steps if necessary, the initial configuration is mapped to a new one that has strictly lower energy and all spacings in \([z_{\min },z_{\max }]\).

The existence of a minimizer now follows from the coercivity proven in Lemma 3.7, the compactness of \([z_{\min },z_{\max }]^{\mathbb {N}}\cap {\mathcal {D}}\) with respect to the weak \(\ell ^2\)-convergence (shifted by \((a, a, \ldots )\)) and the weak lower semicontinuity of \({\mathcal {E}}_\mathrm {surf}\) on that set due to Lemmas 3.6 and 3.7. The minimizer is unique because of the strict convexity from Lemma 3.7. \(\square \)
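The boundary-layer structure of this minimizer can be observed numerically. The following sketch assumes, purely for illustration, the finite-range case \(m=2\) with a Lennard–Jones potential \(v(r)=r^{-12}-2r^{-6}\) and pressure \(p=0.1\), so that \(h(z_j,z_{j+1}) = pz_j + v(z_j) + v(z_j+z_{j+1})\); none of these specific choices is taken from the analysis above. It truncates to n free spacings (with \(z_j=a\) beyond) and minimizes by plain gradient descent:

```python
import numpy as np

# Illustrative choices (not from the text): Lennard-Jones potential,
# pressure p = 0.1, finite range m = 2, so that
# h(z_j, z_{j+1}) = p z_j + v(z_j) + v(z_j + z_{j+1}).
def v(r):  return r**-12 - 2 * r**-6
def vp(r): return -12 * r**-13 + 12 * r**-7   # v'

p = 0.1
rs = np.linspace(0.9, 1.1, 200001)
e = p * rs + v(rs) + v(2 * rs)                # energy per particle of the constant chain
a, e0 = rs[np.argmin(e)], e.min()             # ground-state spacing a and e_0

n = 30                                        # free spacings; z_j = a for j > n

def energy(z):                                # truncated surface energy
    zx = np.append(z, a)
    return np.sum(p * z + v(z) + v(zx[:-1] + zx[1:]) - e0)

def grad(z):
    zx = np.append(z, a)
    s = zx[:-1] + zx[1:]                      # next-nearest distances z_j + z_{j+1}
    g = p + vp(z) + vp(s)
    g[1:] += vp(s[:-1])                       # coupling to the left neighbour
    return g

z = np.full(n, a)                             # start from the constant chain
for _ in range(5000):                         # plain gradient descent
    z -= 5e-3 * grad(z)
z_opt, E_min = z, energy(z)
```

In this example the minimal energy is strictly negative and the first spacing relaxes to \(z_1 > a\) (a boundary layer), while the spacings return to a rapidly in the bulk, consistent with the minimizer lying in \([z_{\min }, z_{\max }]^{\mathbb {N}}\).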

Proof of Theorem 2.2

Clear from Lemmas 3.6, 3.7, 3.8 and Proposition 3.4. \(\square \)

Proof of Proposition 2.3

In complete analogy to Lemma 3.5 we obtain

$$\begin{aligned} {\mathcal {E}}_\mathrm {bulk}((z_j)_{j\in {\mathbb {Z}}}) = \sum _{j=-\infty }^\infty \sum _{k=1}^m \Bigl [ v\Bigl (\sum _{i=j}^{j+k-1} z_i\Bigr ) - v(ka) - v'(ka) \sum _{i=j}^{j+k-1} (z_i-a) \Bigr ]\nonumber \\ \end{aligned}$$
(3.4)

for all \((z_j)_{j\in {\mathbb {Z}}} \in {\mathcal {D}}^+_0\), and as in Lemma 3.6, we see that (3.4) defines a continuous map \({\mathcal {D}}^+ \rightarrow {\mathbb {R}}\). The proof of strict convexity, even on \([z_{\min }, z_{\max } + \varepsilon ]^{{\mathbb {Z}}} \cap {\mathcal {D}}^+\) for some \(\varepsilon > 0\), is again similar to Lemma 3.3. As in Lemma 3.8 we have that \({\mathcal {E}}_\mathrm {bulk}\) has a unique minimizer in \({\mathcal {D}}^+\), which lies in \({\mathcal {D}}^+\cap [z_{\min }, z_{\max }]^{\mathbb {Z}}\). Since \(a \in (z_{\min }, z_{\max }]\) and \(\partial _i {\mathcal {E}}_\mathrm {bulk}((z_j)_{j\in {\mathbb {Z}}}) = 0\) for every \(i \in {\mathbb {Z}}\) by (3.4), the minimizer of \({\mathcal {E}}_\mathrm {bulk}\) is \((\ldots , a, a, \ldots )\). Clearly, \({\mathcal {E}}_\mathrm {bulk}(\ldots , a, a, \ldots ) = 0\). Finally, the formula connecting \({\mathcal {E}}_\mathrm {bulk}\) and \({\mathcal {E}}_\mathrm {surf}\) is clear on \({\mathcal {D}}^+_0\) and follows on \({\mathcal {D}}^+\) by approximation. \(\square \)
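Formula (3.4) admits a quick numerical sanity check: for a perturbation of the constant chain supported on a single site, only the finitely many windows meeting that site contribute, the value at the constant chain is exactly zero, and the perturbed value is positive. The sketch below uses \(m=3\), a Lennard–Jones \(v(r)=r^{-12}-2r^{-6}\) and \(p=0.2\) as illustrative choices, not the assumptions of the text:

```python
import numpy as np

# Illustrative choices (not from the text): m = 3, Lennard-Jones v, p = 0.2.
def v(r):  return r**-12 - 2 * r**-6
def vp(r): return -12 * r**-13 + 12 * r**-7   # v'

m, p = 3, 0.2
rs = np.linspace(0.9, 1.1, 200001)
e = p * rs + sum(v(k * rs) for k in range(1, m + 1))
a = rs[np.argmin(e)]                          # ground-state spacing

def e_bulk(delta):
    """Evaluate (3.4) for z_j = a + delta at one site, z_j = a elsewhere.
    The k windows [j, j+k-1] of length k that contain the perturbed site
    each contribute the same term; all other windows vanish."""
    return sum(k * (v(k * a + delta) - v(k * a) - vp(k * a) * delta)
               for k in range(1, m + 1))

E0, Ep, Em = e_bulk(0.0), e_bulk(0.05), e_bulk(-0.05)
```

As expected, \(E0 = 0\) exactly, while both perturbed values are strictly positive.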

3.3 A Fixed Point Equation

In the following we assume that v has a hard core:

Assumption 3

\(r_\mathrm{hc} > 0\) and \(v(r) \rightarrow \infty \) as \(r \searrow r_\mathrm{hc}\).

We extend h, defined by (2.1) on \((r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\), to \({\mathbb {R}}_+^{{\mathbb {N}}}\) by setting

$$\begin{aligned} h(z) = \infty \text{ if } z_j \le r_\mathrm{hc} \text{ for some } j. \end{aligned}$$
(3.5)

Our main aim in this subsection is to obtain the following characterisation of \(\overline{{\mathcal {E}}}_\mathrm {surf}\), cf. (2.5).

Proposition 3.9

The unique lower semi-continuous solution (product topology) of the equation

$$\begin{aligned} I(z_1,z_2,\ldots ) = h(z_1,z_2,\ldots ) - e_0 + I(z_2,z_3,\ldots ) \end{aligned}$$
(3.6)

for I with \(\min I =0\) and \(I = \infty \) if \(z_j\le r_\mathrm {hc}\) for one of the \(z_j\)’s is given by \(I = \overline{{\mathcal {E}}}_\mathrm {surf} - \min {\mathcal {E}}_\mathrm {surf}\).

Together with Lemmas 5.3 and 5.4 this proposition will show that \(\overline{{\mathcal {E}}}_\mathrm {surf} - \min {\mathcal {E}}_\mathrm {surf}\) is the rate function for the large deviations of \((\nu _{\beta })\), cf. Theorem 2.4. Note that, by induction, (3.6) is equivalent to

$$\begin{aligned} I(z) = \sum _{j=1}^k \big ( h(z_j, z_{j+1}, \ldots ) - e_0 \big ) + I( z_{k+1}, z_{k+2}, \ldots ) \end{aligned}$$
(3.7)

for all \(k \in {\mathbb {N}}\) and \(z = (z_j)_{j\in {\mathbb {N}}} \in {\mathbb {R}}_+^{{\mathbb {N}}}\). (Observe that \(h(z) > - \infty \) for all \(z \in {\mathbb {R}}_+^{{\mathbb {N}}}\) by the decay assumption on v and \(r_\mathrm{hc} > 0\).)
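The identity (3.6) and its iterate (3.7) can be checked numerically for \(I={\mathcal {E}}_\mathrm {surf}\) on configurations with constant tail. The sketch below assumes, for illustration only, \(m=2\), \(v(r)=r^{-12}-2r^{-6}\), \(p=0.1\) and \(h(z_1,z_2,\ldots ) = pz_1+v(z_1)+v(z_1+z_2)\):

```python
import numpy as np

# Illustrative choices (not from the text): m = 2, Lennard-Jones v, p = 0.1,
# h(z_1, z_2, ...) = p z_1 + v(z_1) + v(z_1 + z_2).
def v(r): return r**-12 - 2 * r**-6

p = 0.1
rs = np.linspace(0.9, 1.1, 200001)
e = p * rs + v(rs) + v(2 * rs)
a, e0 = rs[np.argmin(e)], e.min()             # ground-state spacing a and e_0

def E_surf(z):
    """sum_j (h(z_j, z_{j+1}, ...) - e_0) for a configuration equal to a
    beyond the entries of z; the tail terms vanish since h(a, a, ...) = e_0."""
    zx = np.append(z, a)
    return np.sum(p * z + v(z) + v(zx[:-1] + zx[1:]) - e0)

z = a + 0.1 * 0.5 ** np.arange(1, 21)         # a strained configuration

# Check (3.6): I(z) = h(z_1, z_2, ...) - e_0 + I(z_2, z_3, ...).
lhs = E_surf(z)
rhs = (p * z[0] + v(z[0]) + v(z[0] + z[1]) - e0) + E_surf(z[1:])
```

The two sides agree to machine precision, as they must, since (3.6) is a telescoping identity for the partial sums.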

We begin with a technical auxiliary result.

Lemma 3.10

If \(z_1, z_2, \ldots >0\) and \({\bar{c}} < \infty \) are such that

$$\begin{aligned} \sup _{k \in {\mathbb {N}}} \sum _{i = 1}^k \big ( h(z_{i}, \ldots , z_{m+i-1}) - e_0 \big ) \le {\bar{c}}, \end{aligned}$$

then \(z = (z_j)_{j\in {\mathbb {N}}} \in {\mathcal {D}}\). Moreover, any \(z \in {\mathcal {D}}\) satisfies

$$\begin{aligned} \lim _{k\rightarrow \infty } \sum _{j= 1}^k \big ( h(z_{j}, \ldots , z_{j+m-1}) - e_0 \big ) = {\mathcal {E}}_\mathrm {surf}(z). \end{aligned}$$

Proof

Let \(\varepsilon _0< \min (a-z_{\min }, z_{\max }- a)\). The partial sum \(\sum _{j=1}^k h(z_j,\ldots ,z_{j+m-1})\) is equal to the energy \({\mathcal {E}}_{k+1}(z_1,\ldots ,z_{k})\) plus an interaction

$$\begin{aligned} \sum _{j = 1}^k \sum _{i = k+1}^{m+j-1} v( z_j + \ldots + z_i ) \end{aligned}$$

(the inner sum being 0 if \(m+j-1<k+1\)) which is bounded from below by

$$\begin{aligned} - \alpha _1 \sum _{j = 1}^k \sum _{i = k+1}^\infty \big ( (i - j + 1) r_\mathrm {hc} \big )^{-s}&\ge - C \sum _{j = 1}^k (k - j + 1)^{-s+1} \\&\ge - C \sum _{i = 1}^{\infty } i^{-s+1} =: - C_1 > - \infty . \end{aligned}$$

By adding \(n_1\) and \(n_2\) spacings a to the left and right respectively, we may view z as a block of spacings in an N-particle configuration where \(N=n_1+n_2 + k +1\). Let \({\hat{z}} = (a,\ldots ,a,z_1,\ldots ,z_k,a,\ldots ,a)\). The new configuration satisfies

$$\begin{aligned} {\mathcal {E}}_N({\hat{z}})&\le {\mathcal {E}}_{k+1}(z_1,\ldots ,z_k) + 2 C_1 + {\mathcal {E}}_{n_1+1}(a,\ldots ,a) + {\mathcal {E}}_{n_2+1}(a,\ldots ,a) \\&\le C + Ne_0 \end{aligned}$$

for some suitable constant C that depends on \(r_\mathrm {hc}\), \({\bar{c}}\) and v only. Let \(z^{\scriptscriptstyle {({N}})}\) be the N-particle ground state with spacings labelled by \(j=-n_1+1,\ldots ,k+n_2\) rather than \(1,\ldots ,N-1\). Since \({\mathcal {E}}_N(z^{\scriptscriptstyle {({N}})}) = E_N\ge N e_0\) by Lemma 3.2 and \(e_0 \le 0\), we get

$$\begin{aligned} {\mathcal {E}}_N({\hat{z}}) - {\mathcal {E}}_N(z^{\scriptscriptstyle {({N}})}) \le C. \end{aligned}$$

Suppose that all spacings \(z_j\) are in \([z_{\min },z_{\max }]\). We use a Taylor approximation around the minimizer \(z^{\scriptscriptstyle {({N}})}\), apply Lemma 3.3 and Theorem 2.1, and obtain

$$\begin{aligned} C\ge \frac{\eta }{2} \sum _{j=1}^k (z_j- z_j^{\scriptscriptstyle {({N}})})^2 \rightarrow \frac{\eta }{2} \sum _{j=1}^k (z_j- a)^2\qquad (n_1,n_2\rightarrow \infty ). \end{aligned}$$
(3.8)

Letting \(k\rightarrow \infty \) we obtain an upper bound for the \(\ell ^2\)-norm of \((z_j-a)_{j\in {\mathbb {N}}}\). If there are \(z_j\) with \(z_j < z_{\min }\) or \(z_j > z_{\max }\), we modify the configuration \(z_1, \ldots , z_{k}\) without increasing its energy as in the proof of Lemma 3.1 to obtain \(z'_1, \ldots , z'_{k}\). When we shrink bonds \(z_j >z_{\max }\) to \(z'_j = z_{\max }\), leaving all other spacings unchanged, both \(|z'_j - a|\) and \(|z_j - a|\) are strictly larger than \(\varepsilon _0\), so the truncated \(\ell ^2\)-norm \(\sum _{j=1}^k \min \bigl ( (z_j - a)^2, \varepsilon _0^2\bigr )\) is unaffected.

On the other hand suppose \(z_i = \min (z_j) <z_{\min }\). Then we remove the particle \(x_i\) and reattach it at distance \(z_{\max }\) to the left of the k-particle block. This effects the change

$$\begin{aligned} (z_{i-1}-a)^2 + (z_{i}-a)^2 \rightarrow (z_{\max } - a)^2 + ((z_{i-1}+z_{i})- a)^2 \end{aligned}$$

on the \(\ell ^2\)-norm. Both \(|z_{i}- a|\) and \(|z_{\max }-a|\) are larger than \(\varepsilon _0\), moreover,

$$\begin{aligned} \min ( (z_{i-1} - a)^2,\varepsilon _0^2 ) - \min ( (z_{i-1}+z_{i}- a)^2,\varepsilon _0^2 ) \le \varepsilon _0^2. \end{aligned}$$

Thus each such step decreases the truncated \(\ell ^2\)-norm by at most \(\varepsilon _0^2\). Let n be the number of times this step has to be performed. Iterating we arrive at a configuration \(z''_1,\ldots ,z''_k\in [z_{\min },z_{\max }]\) with

$$\begin{aligned} \sum _{j=1}^k \min ((z_j-a)^2,\varepsilon _0^2 ) \le n\varepsilon _0^2 + \sum _{j=1}^k \min ((z''_j-a)^2,\varepsilon _0^2) \end{aligned}$$

and \({\mathcal {E}}_{k+1}(z'') \le {\mathcal {E}}_{k+1}(z) - n \delta \) for some \(\delta >0\), cf. (3.1). Making \(\varepsilon _0\) smaller if necessary we may assume \(\varepsilon _0^2<\delta \). We combine with equation (3.8) for \({\hat{z}}''\) and \(C''= C-n \delta \) and obtain

$$\begin{aligned} \sum _{j=1}^k \min ((z_j-a)^2,\varepsilon _0^2) \le C- n \delta +n\varepsilon _0^2 \le C. \end{aligned}$$

We let \(k\rightarrow \infty \) and find that the truncated \(\ell ^2\)-norm of \((z_j)_{j\in {\mathbb {N}}}\) is finite. It follows in particular that there are only finitely many spacings \(|z_j - a|\ge \varepsilon _0\), and \((z_j-a)_{j\in {\mathbb {N}}}\) is square summable. This establishes the first assertion.

In order to show the convergence of the partial sums to \({\mathcal {E}}_\mathrm {surf}\), first observe that \({\mathcal {E}}_\mathrm {surf}\) satisfies (3.7) for \(I = {\mathcal {E}}_\mathrm {surf}\). This is clear for \(z \in {\mathcal {D}}_0\) and follows for general \(z \in {\mathcal {D}}\) by continuity. If \(z \in {\mathcal {D}}\), the sequence of shifts \(((z_j)_{j \ge k})_{k \in {\mathbb {N}}}\) converges to \((\ldots , a, a, \ldots )\) strongly and thus

$$\begin{aligned} \sum _{j=1}^k \big ( h(z_j, z_{j+1}, \ldots ) - e_0 \big )&= {\mathcal {E}}_\mathrm {surf}(z) - {\mathcal {E}}_\mathrm {surf}( z_{k+1}, z_{k+2}, \ldots ) \\&\rightarrow {\mathcal {E}}_\mathrm {surf}(z) - {\mathcal {E}}_\mathrm {surf}(\ldots , a, a, \ldots ) = {\mathcal {E}}_\mathrm {surf}(z) \end{aligned}$$

as \(k \rightarrow \infty \). \(\square \)

We have actually proven the following: for sufficiently small \(\varepsilon _0>0\), suitable \(c_1,c_2>0\), and all \((z_j)_{j\in {\mathbb {N}}} \in {\mathbb {R}}_+^{\mathbb {N}}\),

$$\begin{aligned} \overline{{\mathcal {E}}}_\mathrm {surf}\bigl ( (z_j)\bigr ) \ge c_1 \sum _{j=1}^\infty \min ((z_j-a)^2,\varepsilon _0^2) - c_2. \end{aligned}$$
(3.9)
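The convergence of the partial sums in Lemma 3.10 is also easy to observe numerically. The sketch below again assumes the illustrative choices \(m=2\), \(v(r)=r^{-12}-2r^{-6}\), \(p=0.1\), and a geometrically decaying strain:

```python
import numpy as np

# Illustrative choices (not from the text): m = 2, Lennard-Jones v, p = 0.1.
def v(r): return r**-12 - 2 * r**-6

p = 0.1
rs = np.linspace(0.9, 1.1, 200001)
e = p * rs + v(rs) + v(2 * rs)
a, e0 = rs[np.argmin(e)], e.min()             # ground-state spacing a and e_0

z = a + 0.2 * 0.5 ** np.arange(1, 402)        # square-summable strain z_j - a

def partial_sum(k):                           # sum_{j<=k} (h(z_j, z_{j+1}) - e_0)
    return np.sum(p * z[:k] + v(z[:k]) + v(z[:k] + z[1:k + 1]) - e0)

S200, S400 = partial_sum(200), partial_sum(400)
```

The partial sums stabilize quickly (here the terms decay geometrically), illustrating the limit \({\mathcal {E}}_\mathrm {surf}(z)\) of Lemma 3.10.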

Proof of Proposition 3.9

Let \(I= \overline{{\mathcal {E}}}_\mathrm {surf} - \min {\mathcal {E}}_\mathrm {surf}\). Observe that I satisfies (3.6). This is clear for \(z \in {\mathcal {D}}_0\) and for \(z \notin {\mathcal {D}}\). For the remaining z it follows from Lemma 3.6. We now show that I is lower semi-continuous with respect to pointwise convergence. Without loss we suppose that \(z^{\scriptscriptstyle {({n}})} \in {\mathcal {D}}\) converges to \(z \in [r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\) pointwise with \(I(z^{\scriptscriptstyle {({n}})}) \le {\bar{c}} < \infty \) for some constant \({\bar{c}} > 0\). Passing to a subsequence (not relabelled) we may furthermore assume that \(\liminf _{n \rightarrow \infty } I(z^{\scriptscriptstyle {({n}})}) = \lim _{n \rightarrow \infty } I(z^{\scriptscriptstyle {({n}})})\). Fix an \(\varepsilon > 0\) such that the estimate in Lemma 3.7 is satisfied. By (3.9)

$$\begin{aligned} \max _{n \in {\mathbb {N}}} \# \{ j \mid z^{\scriptscriptstyle {({n}})}_j \notin [z_{\min }, z_{\max }+\varepsilon ] \} \le C \end{aligned}$$

for some uniform constant \(C > 0\) since \(z_{\min } < a \le z_{\max }\). For given \(N \in {\mathbb {N}}\) we denote by \(j_n\) the first index \(j \ge N\), if it exists, with \(z^{\scriptscriptstyle {({n}})}_{j} \notin [z_{\min }, z_{\max }+\varepsilon ]\). Passing to a further subsequence (not relabelled) and choosing N sufficiently large we may achieve that either such indices do not exist or that \(j_n \rightarrow \infty \) as \(n \rightarrow \infty \). In both cases we get that \(z_j \in [z_{\min }, z_{\max }+\varepsilon ]\) for \(j \ge N\). In particular, \(z_j > r_\mathrm{hc}\) for \(j \ge N\).

In the second case we define new configurations \({\tilde{z}}^{\scriptscriptstyle {({n}})}\) by applying the procedure detailed in the proof of Lemma 3.8 to the tails \((z^{\scriptscriptstyle {({n}})}_j)_{j \ge N}\) shrinking the bonds \(z^{\scriptscriptstyle {({n}})}_j > z_{\max }+\varepsilon \), \(j \ge N\), and deleting particles \(x^{\scriptscriptstyle {({n}})}_{j+1}\) if \(z^{\scriptscriptstyle {({n}})}_j < z_{\min }\), \(j \ge N\), so that

$$\begin{aligned} {{\mathcal {E}}}_\mathrm {surf}(({\tilde{z}}^{\scriptscriptstyle {({n}})}_j)_{j \ge N}) \le {{\mathcal {E}}}_\mathrm {surf}((z^{\scriptscriptstyle {({n}})}_j)_{j \ge N}). \end{aligned}$$

In the first case we simply set \({\tilde{z}}^{\scriptscriptstyle {({n}})} = z^{\scriptscriptstyle {({n}})}\). Since \(j_n \rightarrow \infty \) in the second case, we still have \({\tilde{z}}^{\scriptscriptstyle {({n}})} \rightarrow z\) pointwise.

By (3.7) with \(k = N-1\) we have

$$\begin{aligned} I(z^{\scriptscriptstyle {({n}})}) \ge \sum _{j=1}^{N-1} \big ( h(z^{\scriptscriptstyle {({n}})}_j, z^{\scriptscriptstyle {({n}})}_{j+1}, \ldots ) - e_0 \big ) + I( {\tilde{z}}^{\scriptscriptstyle {({n}})}_{N}, {\tilde{z}}^{\scriptscriptstyle {({n}})}_{N+1}, \ldots ). \end{aligned}$$

From the decay properties of v and \(z^{\scriptscriptstyle {({n}})}_j \ge r_\mathrm{hc} > 0\) it is easy to see that, for any \(j \in {\mathbb {N}}\), \(h(z^{\scriptscriptstyle {({n}})}_j, z^{\scriptscriptstyle {({n}})}_{j+1}, \ldots )\) converges to \(h(z_j, z_{j+1}, \ldots )\). Since \(I(z^{\scriptscriptstyle {({n}})}) \le {\bar{c}}\) and \(I \ge 0\), from Assumption 3 we also get \(z_j > r_\mathrm{hc}\) for \(j = 1, \ldots , N-1\). So

$$\begin{aligned} \sum _{j=1}^{N-1} \big ( h(z^{\scriptscriptstyle {({n}})}_j, z^{\scriptscriptstyle {({n}})}_{j+1}, \ldots ) - e_0 \big ) \rightarrow \sum _{j=1}^{N-1} \big ( h(z_j, z_{j+1}, \ldots ) - e_0 \big ). \end{aligned}$$

In particular, \(I( ({\tilde{z}}^{\scriptscriptstyle {({n}})}_{j})_{j \ge N}) \le C\) and so Lemma 3.7 implies that \(z \in {\mathcal {D}}\) and \({\tilde{z}}^{\scriptscriptstyle {({n}})} - z \rightharpoonup 0\) in \(\ell ^2\) by coercivity and hence that

$$\begin{aligned} \liminf _{n \rightarrow \infty } I( ({\tilde{z}}^{\scriptscriptstyle {({n}})}_{j})_{j \ge N}) \ge I( (z_{j})_{j \ge N}) \end{aligned}$$

by convexity. Summarizing we obtain

$$\begin{aligned} \liminf _{n \rightarrow \infty } I(z^{\scriptscriptstyle {({n}})}) \ge \sum _{j=1}^{N-1} \big ( h(z_j, z_{j+1}, \ldots ) - e_0 \big ) + I( z_{N}, z_{N+1}, \ldots ) = I(z). \end{aligned}$$

Suppose, conversely, that a lower semi-continuous \(I : {\mathbb {R}}_+^{{\mathbb {N}}} \rightarrow {\mathbb {R}}\cup \{+\infty \}\) satisfies (3.6) with \(\min I = 0\) and \(I(z) = \infty \) if \(z_j \le r_\mathrm {hc}\) for some j. We first note that, since \(I \ge 0\), for any z with \(I(z) < \infty \) one has

$$\begin{aligned} \sup _{k \in {\mathbb {N}}} \sum _{j=1}^k \big ( h(z_j, z_{j+1}, \ldots ) - e_0 \big ) < \infty \end{aligned}$$

by (3.7) and so \(z \in {\mathcal {D}}\) by Lemma 3.10. It thus suffices to show that

$$\begin{aligned} I(z) = {\mathcal {E}}_\mathrm {surf}(z) + I(a, a, \ldots ) \end{aligned}$$
(3.10)

for all \(z \in {\mathcal {D}}\).

If \(z \in {\mathcal {D}}\), then \({\mathcal {E}}_\mathrm {surf}(z)\) is indeed finite by Lemma 3.5. We have \(\lim _{k\rightarrow \infty } \sum _{j= 1}^k \big ( h(z_{j}, \ldots , z_{j+m-1}) - e_0 \big ) = {\mathcal {E}}_\mathrm {surf}(z)\) by Lemma 3.10. Since the sequence of shifts \(( (z_j)_{j \ge k} )_{k \in {\mathbb {N}}}\) converges to \((a, a, \ldots )\) pointwise as \(k \rightarrow \infty \), taking the \(\liminf \) in (3.7) yields

$$\begin{aligned} I(z)&= \lim _{k \rightarrow \infty } \sum _{j=1}^k \big ( h(z_j, z_{j+1}, \ldots ) - e_0 \big ) + \liminf _{k \rightarrow \infty } I( z_{k+1}, z_{k+2}, \ldots )\\&\ge {\mathcal {E}}_\mathrm {surf}(z) + I(a, a, \ldots ). \end{aligned}$$

Note that, as \(I \not \equiv \infty \), this inequality also shows that \(I(a, a, \ldots ) < \infty \).

For the reverse inequality, by choosing k large enough in (3.7) we first see that (3.10) holds true for all \(z \in {\mathcal {D}}_0\). We denote by \(z^{(N)}\) the truncation with \(z^{(N)}_j = z_j\) for \(j \le N\) and \(z^{(N)}_j = a\) for \(j \ge N+1\). Since \(z^{(N)} \rightarrow z\) pointwise and \(z^{(N)} - z \rightarrow 0\) in \(\ell ^2\) as \(N \rightarrow \infty \), lower semi-continuity of I and strong continuity of \({\mathcal {E}}_\mathrm {surf}\) (see Lemma 3.6) give

$$\begin{aligned} I(z) \le \liminf _{N \rightarrow \infty } I(z^{(N)}) = \liminf _{N \rightarrow \infty } {\mathcal {E}}_\mathrm {surf}(z^{(N)}) + I(a, a, \ldots ) = {\mathcal {E}}_\mathrm {surf}(z) + I(a, a, \ldots ), \end{aligned}$$

where we have used that \(z^{(N)} \in {\mathcal {D}}_0\) for all N. \(\square \)

We now restrict to the case \(m < \infty \). Let \(d = m-1\). By (3.7) with \(k = d\) we have

$$\begin{aligned} \begin{aligned} {\mathcal {E}}_\mathrm{surf}((z_j)_{j \in {\mathbb {N}}})&= \sum _{j = 1}^d \big ( h(z_j, \ldots , z_{j+d}) - e_0 \big ) + {\mathcal {E}}_\mathrm{surf}(z_{d+1}, z_{d+2}, \ldots ) \\&= {\mathcal {E}}_{d+1}(z_1, \ldots , z_d) - d e_0 + W(z_1, \ldots , z_d; z_{d+1}, \ldots , z_{2d}) \\&\quad + {\mathcal {E}}_\mathrm{surf}((z_j)_{j \ge d+1}), \end{aligned} \end{aligned}$$
(3.11)

for any \((z_j)_{j \in {\mathbb {N}}} \in {\mathcal {D}}\), where

$$\begin{aligned} W(z_1, \ldots , z_{d}; z_{d+1}, \ldots , z_{2d}) = \sum _{1 \le i \le d < j \le 2d \atop j - i \le d} v(z_i + \ldots + z_j). \end{aligned}$$

Taking the infimum over \((z_j)_{j \in {\mathbb {N}}} \in {\mathcal {D}}_0\), with fixed \(z_1, \ldots , z_d\), setting

$$\begin{aligned} u(x)&= \inf \big \{ {\mathcal {E}}_\mathrm{surf}((z_j)_{j \in {\mathbb {N}}}) \mid (z_j)_{j \in {\mathbb {N}}} \in {\mathcal {D}}_0,~ (z_1, \ldots , z_d) = x \big \} \\&= \inf \big \{ {\mathcal {E}}_\mathrm{surf}((z_j)_{j \in {\mathbb {N}}}) \mid (z_j)_{j \in {\mathbb {N}}} \in {\mathcal {D}},~ (z_1, \ldots , z_d) = x \big \} \end{aligned}$$

(recall Lemma 3.6) and using (3.11) we obtain

$$\begin{aligned} u(x) = \inf _{y \in {\mathbb {R}}^d_+} \big ( {\mathcal {E}}_{d+1}(x) + W(x; y) - d e_0 + u(y) \big ). \end{aligned}$$
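This is a fixed point equation of Bellman type and can be solved on a grid by relative value iteration. The sketch below takes \(d=1\) (\(m=2\)) with the illustrative stand-ins \(v(r)=r^{-12}-2r^{-6}\), \(p=0.1\), \(E_2(x)=px+v(x)\) and \(W(x;y)=v(x+y)\); the subtraction of \(d\,e_0\) makes the additive eigenvalue of the iteration vanish, so the normalized iterates converge:

```python
import numpy as np

# Illustrative choices (not from the text): d = 1 (m = 2), Lennard-Jones v,
# p = 0.1, E_2(x) = p x + v(x), W(x; y) = v(x + y).
def v(r): return r**-12 - 2 * r**-6

p = 0.1
x = np.linspace(0.85, 1.3, 400)               # grid above the hard core
e = p * x + v(x) + v(2 * x)
e0, a = e.min(), x[np.argmin(e)]              # e_0 and ground-state spacing on the grid

g = p * x + v(x) - e0                         # E_2(x) - d e_0
V2 = v(x[:, None] + x[None, :])               # W(x; y)

u, shift = np.zeros_like(x), np.inf
for _ in range(5000):                         # relative value iteration
    Tu = g + np.min(V2 + u[None, :], axis=1)  # Bellman update
    shift = Tu.min()                          # -> additive eigenvalue (here 0)
    u_new = Tu - shift                        # renormalize so that min u = 0
    if np.max(np.abs(u_new - u)) < 1e-13:
        u = u_new
        break
    u = u_new
```

At convergence the shift vanishes (the \(d\,e_0\) term exactly compensates the optimal per-step cost) and the minimum of u is attained at the ground-state spacing a.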

In Chapter 6 we will need the following estimate:

Lemma 3.11

Set \(A_{\varepsilon } = [z_{\min }, z_{\max } + \varepsilon ]^d\) and \(B_{\varepsilon } = {\mathbb {R}}_+^d \setminus A_{\varepsilon }\). Then, for any \(\varepsilon > 0\) there exists a \(\delta > 0\) such that

$$\begin{aligned} \inf _{y \in B_{\varepsilon }} \big ( {\mathcal {E}}_{d+1}(x) + W(x; y) - d e_0 + u(y) \big ) \ge u(x) + \delta \end{aligned}$$

for all \(x \in A_{\varepsilon }\).

Proof

Suppose \((z_j)_{j \in {\mathbb {N}}} \in {\mathcal {D}}_0\) is such that \((z_1, \ldots , z_d) \in A_{\varepsilon }\), in particular, \(z_j \ge z_{\min }\) for \(j = 1, \ldots , d\). If \((z_{d+1}, z_{d+2}, \ldots ) \notin [z_{\min }, z_{\max }+\varepsilon ]^{{\mathbb {N}}}\) we construct a new configuration \((z'_j)_{j \in {\mathbb {N}}} \in {\mathcal {D}}_0\) without changing the first d spacings similarly as in the proofs of Lemmas 3.1 and 3.8.

If \(z_i > z_{\max } + \varepsilon \), we define \((z'_j)_{j \in {\mathbb {N}}}\) by setting \(z'_j = z_j\) for \(j \ne i\) and \(z'_i = z_{\max }\). Then

$$\begin{aligned} {\mathcal {E}}_\mathrm{surf}((z'_j)_{j \in {\mathbb {N}}}) \le {\mathcal {E}}_\mathrm{surf}((z_j)_{j \in {\mathbb {N}}}) + v(z_{\max }) - v(z_{\max } + \varepsilon ). \end{aligned}$$
(3.12)

Now assume \(b = \min \{z_{d+1}, z_{d+2}, \ldots \} < z_{\min }\). We choose an \(i \ge d+1\) with \(z_i = b\) and define \((z'_j)_{j \in {\mathbb {N}}}\) by setting \(z'_j = z_j\) for \(j < i\), \(z'_i = \min \{z_i+z_{i+1}, z_{\max }\}\) and \(z'_j = z_{j+1}\) for \(j > i\). As in Lemmas 3.1 and 3.8 (in particular using that \(e_0 \le 0\)), we see that

$$\begin{aligned} \begin{aligned} {\mathcal {E}}_\mathrm{surf}((z'_j)_{j \in {\mathbb {N}}})&\le {\mathcal {E}}_\mathrm{surf}((z_j)_{j \in {\mathbb {N}}}) - \Big ( v(b) + v(z_{\max }) - 2 \alpha _1 \sum _{n=2}^m (nb)^{-s} \Big ) \\&\le {\mathcal {E}}_\mathrm{surf}((z_j)_{j \in {\mathbb {N}}}) - 2 \alpha _1 \sum _{n=m+1}^{\infty } (nz_{\min })^{-s}. \end{aligned} \end{aligned}$$
(3.13)

The estimates (3.12) and (3.13) show that, for any \((z_j)_{j \in {\mathbb {N}}} \in {\mathcal {D}}_0\) with \((z_1, \ldots , z_d) \in A_{\varepsilon }\) and \((z_{d+1}, \ldots , z_{2d}) \in B_{\varepsilon }\) there is a \((z'_j)_{j \in {\mathbb {N}}} \in {\mathcal {D}}_0\) with \((z'_1, \ldots , z'_d) = (z_1, \ldots , z_d)\) such that

$$\begin{aligned} {\mathcal {E}}_\mathrm{surf}((z'_j)_{j \in {\mathbb {N}}})&\le {\mathcal {E}}_\mathrm{surf}((z_j)_{j \in {\mathbb {N}}}) - \delta , \end{aligned}$$

where \(\delta = \min \big \{ v(z_{\max } + \varepsilon ) - v(z_{\max }), \ 2 \alpha _1 \sum _{n=m+1}^{\infty } (nz_{\min })^{-s} \big \} > 0\). Using (3.11) we arrive at

$$\begin{aligned} u(z_1, \ldots , z_d) + \delta \le&{\mathcal {E}}_{d+1}(z_1, \ldots , z_d) - d e_0 + W(z_1, \ldots , z_d; z_{d+1}, \ldots , z_{2d}) \\&+ {\mathcal {E}}_\mathrm{surf}((z_j)_{j \ge d+1}). \end{aligned}$$

The claim now follows by taking the infimum over \((z_j)_{j \in {\mathbb {N}}}\) with fixed \((z_1, \ldots , z_d)\) conditioned on \((z_{d+1}, \ldots , z_{2d}) \in B_{\varepsilon }\). \(\square \)

A simpler proof gives the following estimate that will also be needed in Chapter 6.

Lemma 3.12

For any \(\varepsilon > 0\) there exists a \(\delta > 0\) such that \({\mathcal {E}}_\mathrm {bulk}(z) \ge \delta \) for all \(z \in {\mathcal {D}}^+ \setminus [z_{\min }, z_{\max } + \varepsilon ]^{{\mathbb {Z}}}\).

Proof

By continuity we may assume that \(z = (z_j)_{j \in {\mathbb {Z}}} \in {\mathcal {D}}^+_0 \setminus [z_{\min }, z_{\max } + \varepsilon ]^{{\mathbb {Z}}}\). If \(z_i > z_{\max } + \varepsilon \), we define \(z' = (z'_j)_{j \in {\mathbb {Z}}}\) by setting \(z'_j = z_j\) for \(j \ne i\) and \(z'_i = z_{\max }\). Then

$$\begin{aligned} 0 \le {\mathcal {E}}_\mathrm{bulk}(z') \le {\mathcal {E}}_\mathrm{bulk}(z) + v(z_{\max }) - v(z_{\max } + \varepsilon ). \end{aligned}$$

If \(b = \min \{z_j : j \in {\mathbb {Z}}\} < z_{\min }\), we choose the smallest i with \(z_i = b\) and define \(z' = (z'_j)_{j \in {\mathbb {Z}}}\) by setting \(z'_j = z_j\) for \(j < i\), \(z'_i = \min \{z_i+z_{i+1}, z_{\max }\}\) and \(z'_j = z_{j+1}\) for \(j > i\). As in (3.13) we get

$$\begin{aligned} 0 \le {\mathcal {E}}_\mathrm{bulk}(z') \le {\mathcal {E}}_\mathrm{bulk}(z) - 2 \alpha _1 \sum _{n=m+1}^{\infty } (nz_{\min })^{-s}. \end{aligned}$$

This concludes the proof. \(\square \)

4 Gibbs Measures for the Infinite and Semi-infinite Chains

Here we prove the existence of \(\nu _\beta \), \(\mu _\beta \), \(g(\beta )\), \(g_\mathrm {surf}(\beta )\) and check that \(\mu _\beta \) is shift-invariant and mixing, hence ergodic; the results and methods are fairly standard. In addition, we provide an a priori estimate on the decay of correlations with explicit analysis of the \(\beta \)-dependence (Theorem 4.4), which is, to the best of our knowledge, new. The results of this section require very little of the pair potential: we use only that v has a hard core and that \(v(r)= O(1/r^{s})\), for large r, with \(s>2\). The technical assumption of a hard core frees us from superstability estimates [32, 45]. The decay of the potential ensures that the infinite volume Gibbs measure is unique, see for example [23, Chapter 8.3] and [29, 37, 38].

We follow the classical treatment of one-dimensional systems with transfer operators. For compactly supported pair potentials with a hard core (or, in our case, when m is chosen finite), the transfer operators are integral operators in \(L^2({\mathbb {R}}_+^{m-1},\mathrm {d}x)\) [44, Chapter 5.6], see Section 6. For long-range interactions, the transfer operator (also known as Ruelle operator or Ruelle–Perron–Frobenius operator) acts instead from the left on functions of infinitely many variables, and from the right on measures [21, 43, 46]. The formalism of transfer operators continues to be developed in the context of dynamical systems and ergodic theory [4, 5].

For the decay of correlations, we adapt [40] to the present context of continuous unbounded spins and carefully track the \(\beta \)-dependence in the bounds. In section 5.3, transfer operators will also help us investigate the large deviations behavior of the Gibbs measures; notably the eigenvalue equation from Lemma 4.1 translates into a fixed point equation for the rate function (see Lemma 5.4).

The results of this section hold for all \(m\in {\mathbb {N}}\cup \{\infty \}\) and \(\beta ,p>0\); the additional condition \(p<p^*\) is not needed. Note that, unlike in the previous section, we assume that \(\beta < \infty \).

4.1 Transfer Operator

For \(j \in {\mathbb {Z}}\) and \(z_j,z_{j+1},\ldots > 0\) we abbreviate \(h_j = h(z_j,z_{j +1},\ldots )\), cf. (2.1) and (3.5). The transfer operator acts on functions as

$$\begin{aligned} {\mathcal {L}}_\beta f(z_1,z_2,\ldots ) = \int _0^\infty {{\text {e}} }^{-\beta h_0 } f(z_0,z_1,\ldots ) \mathrm {d}z_0. \end{aligned}$$

The dual action on measures is defined by \(({\mathcal {L}}_\beta ^*\nu )(f)= \nu ({\mathcal {L}}_\beta f)\) and is given by

$$\begin{aligned} {{\mathcal {L}}}_\beta ^* \nu (\mathrm {d}z_1\, \mathrm {d}z_2 \ldots ) = {{\text {e}} }^{-\beta h_1}\, \mathrm {d}z_1\, \nu (\mathrm {d}z_2\, \mathrm {d}z_3 \ldots ). \end{aligned}$$

Lemma 4.1

There exist \(\lambda _0(\beta )>0\) and a probability measure \(\nu _\beta \) on \({\mathbb {R}}_+^{\mathbb {N}}\) such that

$$\begin{aligned} {\mathcal {L}}_\beta ^* \nu _\beta = \lambda _0(\beta ) \nu _\beta . \end{aligned}$$

Moreover \(\nu _\beta ((r_\mathrm {hc},\infty )^{\mathbb {N}}) =1\) and the pair \((\nu _\beta ,\lambda _0(\beta ))\) is unique.

We will show in Proposition 4.9 that \(\nu _\beta \) is the measure satisfying (2.3). The non-compactness of \((r_\mathrm {hc},\infty )^{\mathbb {N}}\) forms an obstacle to the application of a Schauder–Tychonoff fixed point theorem for the map \(\nu \mapsto {\mathcal {L}}_\beta ^* \nu /\nu ({\mathcal {L}}_\beta {\mathbf {1}})\), see for example [43, Proposition 2]. It might be possible to remove the obstacle using tightness estimates, but we prefer to follow a different route and exploit the known uniqueness of infinite volume Gibbs measures [23, Chapter 8.3] instead.
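For finite m the transfer operator is an integral operator (cf. Section 6), and the leading eigenpair can then be approximated by discretization and power iteration; \({\mathcal {L}}_\beta \) and \({\mathcal {L}}_\beta ^*\) share the Perron eigenvalue \(\lambda _0(\beta )\). The sketch below treats \(m=2\) with the illustrative choices \(v(r)=r^{-12}-2r^{-6}\), \(p=0.1\), \(\beta =5\), and kernel \({{\text {e}} }^{-\beta h(z_0,z_1)}\), where \(h(z_0,z_1)=pz_0+v(z_0)+v(z_0+z_1)\):

```python
import numpy as np

# Illustrative choices (not from the text): m = 2, so the transfer operator
# has kernel exp(-beta h(z_0, z_1)), h(z_0, z_1) = p z_0 + v(z_0) + v(z_0 + z_1);
# v is Lennard-Jones, p = 0.1, beta = 5, and the grid stays above the hard core.
def v(r): return r**-12 - 2 * r**-6

p, beta = 0.1, 5.0
z, dz = np.linspace(0.85, 1.4, 400, retstep=True)
h = p * z[None, :] + v(z)[None, :] + v(z[:, None] + z[None, :])  # rows: z_1, cols: z_0
A = np.exp(-beta * h) * dz                    # quadrature discretization of L_beta

psi = np.ones_like(z)
for _ in range(2000):                         # power iteration for the Perron root
    psi = A @ psi
    psi /= np.linalg.norm(psi)

lam0 = psi @ (A @ psi) / (psi @ psi)          # estimate of lambda_0(beta)
residual = np.linalg.norm(A @ psi - lam0 * psi)
```

The computed eigenfunction is strictly positive, as the Perron–Frobenius picture predicts for this positive kernel.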

Proof

Let \(\nu \) be a probability measure on \({\mathbb {R}}_+^{\mathbb {N}}\), \(\lambda := \nu ({\mathcal {L}}_\beta {\mathbf {1}})\), and \({\tilde{\nu }}:= \frac{1}{\lambda }{\mathcal {L}}_\beta ^* \nu \). We show that if \(\nu \) is a Gibbs measure, then \({\tilde{\nu }}\) is a Gibbs measure as well. Let us first introduce the kernels needed to formulate that \(\nu \) is a Gibbs measure. By [23, Theorem 1.33] it is enough to look at one-point kernels. Pick \(k\in {\mathbb {N}}\). For \(z'_k>0\) and \(z= (z_j)_{j\in {\mathbb {N}}}\in {\mathbb {R}}_+^{\mathbb {N}}\), let

$$\begin{aligned} H_k(z'_k\mid z) = p z'_k + \sum _{J\subset {\mathbb {N}},\, J\ni k} v\Bigl ( z'_k + \sum _{j\in J\setminus \{k\}} z_j\Bigr ), \end{aligned}$$

where the sum runs over discrete intervals \(J=\{i,\ldots , \ell -1\}\subset {\mathbb {N}}\). Further define the kernel

$$\begin{aligned} \gamma _k\bigl (z,A\bigr ) = \frac{1}{N_k(z)} \int _{0}^\infty \mathbb {1}_A\bigl ( \ldots ,z_{k-1},z'_k, z_{k+1},\ldots ) {{\text {e}} }^{- \beta H_k(z'_k \mid z)} \mathrm {d}z'_k, \end{aligned}$$

where \(A\subset {\mathbb {R}}_+^{\mathbb {N}}\) and \(N_k(z) = \int _{0}^\infty {{\text {e}} }^{- \beta H_k(z'_k \mid z)} \mathrm {d}z'_k\). The kernel acts on functions and measures in the usual way, in particular \((\gamma _k \mathbb {1}_A)(z) = \gamma _k(z,A)\). Notice that \(\gamma _k^2 f = \gamma _k f\) for all f. Indeed \(\gamma _k f\) yields a function where \(z_k\)-dependence has been integrated out, and integrating it against the probability measure \(\gamma _k(z,\cdot )\) does not change its value. Replacing \({\mathbb {N}}\) with \({\mathbb {N}}_0\), we define in a completely analogous fashion conditional energies \(H_k^0\) and kernels \(\gamma _k^0\bigl ( (z_j)_{j\in {\mathbb {N}}_0}, B\bigr )\).

Suppose that \(\nu \) is a Gibbs measure, that is \(\nu \gamma _k = \nu \) for all \(k\in {\mathbb {N}}\). Let \(f:{\mathbb {R}}_+^{{\mathbb {N}}_0}\rightarrow {\mathbb {R}}_+\) be a measurable test function. Treat \({\tilde{\nu }} = \lambda ^{-1} {\mathcal {L}}_\beta ^*\nu \) as a measure on \({\mathbb {R}}_+^{{\mathbb {N}}_0}\). We check that \({\tilde{\nu }}(\gamma _k^0 f) = {\tilde{\nu }}(f)\) for all \(k\in {\mathbb {N}}_0\). For \(k\in {\mathbb {N}}\), this property is inherited from the Gibbsianness of \(\nu \): we have

$$\begin{aligned} {\tilde{\nu }}(f) = \frac{1}{\lambda }\int _0^\infty \nu \Bigl ( f(z_0,\cdot ) {{\text {e}} }^{-\beta h(z_0,\cdot )}\Bigr ) \mathrm {d}z_0 = \frac{1}{\lambda }\int _0^\infty \nu \gamma _k \Bigl ( f(z_0,\cdot ) {{\text {e}} }^{-\beta h(z_0,\cdot )}\Bigr ) \mathrm {d}z_0. \end{aligned}$$

Set \({{\tilde{f}}}:= \gamma _k^0 f\). Note that \({{\tilde{f}}} = \gamma _k^0 {{\tilde{f}}}\). Therefore

$$\begin{aligned} \gamma _k \Bigl ( f(z_0,\cdot ) {{\text {e}} }^{-\beta h(z_0,\cdot )}\Bigr ) (z)&= ( \gamma _k^0 f)(z_0,z) \times (\gamma _k {{\text {e}} }^{-\beta h(z_0,\cdot )})(z)\\&= \gamma _k \Bigl ( {{\tilde{f}}}(z_0,\cdot ) {{\text {e}} }^{-\beta h(z_0,\cdot )}\Bigr ) (z), \end{aligned}$$

hence \({\tilde{\nu }}(f) = {\tilde{\nu }}({{\tilde{f}}}) = {\tilde{\nu }}(\gamma _k^0 f)\). For \(k=0\), the required property follows from the definition of \({\tilde{\nu }}\). Notice \(H_0^0 =h_0\) and

$$\begin{aligned} (\gamma _0^0 f)\bigl ( (z_j)_{j\in {\mathbb {N}}_0}\bigr ) = \frac{\int _0^\infty f(z'_0, z_1,z_2,\ldots ) {{\text {e}} }^{-\beta h(z'_0,z_1,\ldots )} \mathrm {d}z'_0}{\int _0^\infty {{\text {e}} }^{-\beta h(z'_0,z_1,\ldots )} \mathrm {d}z'_0}. \end{aligned}$$

Let \({{\tilde{f}}}= \gamma _0^0 f\). Then

$$\begin{aligned} {\tilde{\nu }}(f)&=\frac{1}{\lambda } \nu \Bigl ( \int _0^\infty f(z_0,\cdot ) {{\text {e}} }^{-\beta h(z_0,\cdot )}\mathrm {d}z_0 \Bigr ) = \frac{1}{\lambda } \nu \Bigl ( \int _0^\infty {{\tilde{f}}}(z_0,\cdot ) {{\text {e}} }^{-\beta h(z_0,\cdot )}\mathrm {d}z_0 \Bigr )\\&= {\tilde{\nu }}({{\tilde{f}}}) = {\tilde{\nu }}(\gamma _0^0 f). \end{aligned}$$

The previous identities hold for all non-negative test functions f; consequently \({\tilde{\nu }} \gamma _k^0 = {\tilde{\nu }}\) for all \(k\in {\mathbb {N}}_0\) and \({\tilde{\nu }}\) is a Gibbs measure as well.

By [23, Theorem 8.39], the Gibbs measure \(\nu \) exists and is unique. Treating \(\nu \) and \({\tilde{\nu }}\) both as measures on \({\mathbb {R}}_+^{\mathbb {N}}\), we must therefore have \(\nu = {\tilde{\nu }}\), that is, the unique Gibbs measure is an eigenmeasure of \({\mathcal {L}}_\beta ^*\) and, in particular, there exists an eigenmeasure. Conversely, let \(\nu = \frac{1}{\lambda } {\mathcal {L}}_\beta ^*\nu \) be an eigenmeasure. Arguments similar to the investigation of \({\tilde{\nu }}\) given above, based on the iterated fixed point equation \(\nu = \frac{1}{\lambda ^k} {{\mathcal {L}}_\beta ^*}^k\nu \), show that \(\nu \gamma _j = \nu \) for all \(j=1,\ldots ,k\) and all k, hence for all j. Every eigenmeasure is a Gibbs measure. Since the latter is unique, the eigenmeasure is unique as well. Finally, since \(v(z_j) = \infty \) for \(z_j\le r_\mathrm {hc}\), the eigenmeasure \(\nu = \frac{1}{\lambda ^k}{{\mathcal {L}}_\beta ^*}^k \nu \) must satisfy \(\nu (\exists j\in \{1,\ldots ,k\}:\, z_j\le r_\mathrm {hc}) = 0\). This holds for all \(k\in {\mathbb {N}}\), hence \(\nu ( (r_\mathrm {hc},\infty )^{\mathbb {N}}) =1\). \(\square \)

Let \(\nu _\beta ^-\) be the probability measure on \({\mathbb {R}}_+^{\{\ldots , -1,0\}}\) obtained by flipping \(\nu _\beta ^+= \nu _\beta \), that is \(\nu _\beta ^-\) is the image of \(\nu _\beta ^+ =\nu _\beta \) under the map \((z_k)_{k\in {\mathbb {N}}} \mapsto (z_{1-\ell })_{\ell \le 0}\). The measures \(\nu _\beta ^\pm \) represent equilibrium measures for the left and right half-infinite chains. Let

$$\begin{aligned} {\mathcal {W}}_0 = {\mathcal {W}}( \cdots z_{-1} z_0 \mid z_1 z_2 \ldots ) := \sum _{\genfrac{}{}{0.0pt}{}{j \le 0, k \ge 1}{|k-j|\le m-1}} v(z_j+\cdots + z_k) \end{aligned}$$

be the total interaction between left and right half-infinite chains, cf. Proposition 2.3(d). We abbreviate the shifted versions as \({\mathcal {W}}_\ell = {\mathcal {W}}(\cdots z_\ell \mid z_{\ell +1} \cdots )\). Define \(\varphi _\beta (z_1,z_2,\ldots )\) by

$$\begin{aligned} \varphi _\beta (z_1,z_2,\ldots ) = \frac{\nu _\beta ^-(\exp ( - \beta {\mathcal {W}}_0))}{\nu _\beta ^- \otimes \nu _\beta ^+( \exp ( - \beta {\mathcal {W}}_0))}. \end{aligned}$$
(4.1)

Thus \(\varphi _\beta (z_1,z_2,\ldots )\) represents an averaged contribution to the Boltzmann weight from the left half-infinite chain.

Lemma 4.2

We have \({\mathcal {L}}_\beta \varphi _\beta = \lambda _0(\beta ) \varphi _\beta \) and \( \nu _\beta (\varphi _\beta ) = 1\).

Proof

The normalization is obvious. For the eigenvalue equation, let \(c_\beta = \nu _\beta ^- \otimes \nu _\beta ^+( \exp ( - \beta {\mathcal {W}}_0))\) and use the eigenvalue equation for \(\nu _\beta ^\pm \):

$$\begin{aligned}&\varphi _\beta (z_1,z_2,\ldots ) \\&\quad = \frac{1}{c_\beta } \int {{\text {e}} }^{- \beta {\mathcal {W}}(\cdots z_0 \mid z_1\cdots )} \mathrm {d}\nu _\beta ^-\bigl ( (z_j)_{j\le 0} \bigr ) \\&\quad = \frac{1}{c_\beta \lambda _0(\beta )} \int {{\text {e}} }^{- \beta {\mathcal {W}}(\cdots z_0 \mid z_1\cdots )} {{\text {e}} }^{- \beta (p z_0 + v(z_{0})+ v(z_0+z_{-1}) + \cdots )}\mathrm {d}z_0 \mathrm {d}\nu _\beta ^-\bigl ( (z_j)_{j\le -1} \bigr ) \\&\quad = \frac{1}{c_\beta \lambda _0(\beta )} \int {{\text {e}} }^{- \beta {\mathcal {W}}(\cdots z_{-1}\mid z_0 z_1\cdots )} {{\text {e}} }^{-\beta (pz_0 + v(z_0)+ v(z_0+z_1)+\cdots )} \mathrm {d}z_0 \mathrm {d}\nu _\beta ^-\bigl ( (z_j)_{j\le -1} \bigr ) \\&\quad = \frac{1}{\lambda _0(\beta )} \int {{\text {e}} }^{-\beta h_0} \varphi _\beta (z_0,z_1,\ldots ) \mathrm {d}z_0 \\&\quad = \frac{1}{\lambda _0(\beta )}({\mathcal {L}}_\beta \varphi _\beta )(z_1,z_2,\ldots ). \end{aligned}$$

See also [46, section 5.12]. \(\square \)

Define the operator

$$\begin{aligned} {\mathcal {S}}_\beta f:= \frac{1}{\lambda _0(\beta ) \varphi _\beta } {\mathcal {L}}_\beta (\varphi _\beta f) \end{aligned}$$

so that \({\mathcal {S}}_\beta {\mathbf {1}}= {\mathbf {1}}\) and \({\mathcal {S}}_\beta ^*(\varphi _\beta \nu _\beta ^+) = \varphi _\beta \nu _\beta ^+\). Let \(\mu _\beta \) be the probability measure on \({\mathbb {R}}_+^{\mathbb {Z}}\) given by

$$\begin{aligned} \frac{\mathrm {d}\mu _\beta }{\mathrm {d}\nu _\beta ^-\otimes \nu _\beta ^+} = \frac{1}{c_\beta } {{\text {e}} }^{-\beta {\mathcal {W}}_0}, \quad c_\beta =\nu _\beta ^-\otimes \nu _\beta ^+({{\text {e}} }^{- \beta {\mathcal {W}}_0}). \end{aligned}$$
(4.2)

We will show in Proposition 4.9 that \(\mu _\beta \) is the measure satisfying (2.4). Notice that for every bounded measurable function f that depends on right-chain variables \(z_1,z_2,\ldots \) only,

$$\begin{aligned} \mu _\beta (f) = \nu _\beta ^+(f \varphi _\beta ),\quad \nu _\beta ^+(f) = \frac{\mu _\beta ({{\text {e}} }^{\beta {\mathcal {W}}_0} f)}{\mu _\beta ({{\text {e}} }^{\beta {\mathcal {W}}_0})}. \end{aligned}$$
(4.3)

Let \(\tau :{\mathbb {R}}_+^{\mathbb {Z}}\rightarrow {\mathbb {R}}_+^{\mathbb {Z}}\) be the shift \((\tau z)_j= z_{j+1}\).

Lemma 4.3

  1. (a)

    \(\mu _\beta \) is shift-invariant.

  2. (b)

    For all \(f,g:{\mathbb {R}}_+^{\mathbb {N}}\rightarrow {\mathbb {R}}_+\) and all \(n\in {\mathbb {N}}\), we have \(\mu _\beta (f (g\circ \tau ^n)) = \mu _\beta (({\mathcal {S}}_\beta ^n f)g)\).

The proof is standard [46] and therefore omitted. The lemma can be rephrased as follows: let \((Z_n)_{n\in {\mathbb {Z}}}\) be a stochastic process with law \(\mu _\beta \), defined on some probability space \((\Omega ,{\mathcal {F}},{\mathbb {P}})\). Then \((Z_n)_{n\in {\mathbb {Z}}}\) is stationary, and

$$\begin{aligned} \bigl ( {\mathcal {S}}_\beta ^n f\bigr )(Z_{n+1},Z_{n+2},\ldots ) = {\mathbb {E}}\Bigl [ f(Z_1,Z_2,\ldots )\, \Big |\, Z_{n+1}, Z_{n+2},\ldots \Bigr ]\quad \text {a.s.} \end{aligned}$$

Our next task is to show that the process is not only stationary but in fact ergodic and to estimate the decay of correlations.

4.2 Ergodicity

Bounds on correlations are most conveniently expressed with the help of variations, semi-norms that quantify how much a function depends on faraway variables. Notice that \(\nu _{\beta }((r_\mathrm{hc}, \infty )^{{\mathbb {N}}}) = \mu _{\beta }((r_\mathrm{hc}, \infty )^{{\mathbb {Z}}}) = 1\). Let \(f:{\mathbb {R}}_+^{\mathbb {N}}\rightarrow {\mathbb {R}}\) be a function and \(n \in {\mathbb {N}}\). The nth variation of f on \((r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\) is

$$\begin{aligned} {{\,\mathrm{var}\,}}_n(f):= \sup \{|f(z) - f(z')|\, :\, z, z' \in (r_\mathrm{hc}, \infty )^{{\mathbb {N}}} \text { such that } z_1 = z'_1,\ldots , z_n= z'_n\}. \end{aligned}$$

When \(n=0\) the constraint on initial values is empty; \({{\,\mathrm{var}\,}}_0(f)\) is sometimes called the oscillation of f [23, Equation (8.2)]. The oscillation vanishes if and only if f is constant. Notice that \({{\,\mathrm{var}\,}}_k(h)\) decays algebraically: for \(k\in {\mathbb {N}}\), as \(v(r) = O(r^{-s})\),

$$\begin{aligned} {{\,\mathrm{var}\,}}_k(h) \le 2 \sup _{z} \Bigl |\sum _{j=k+1}^\infty v(z_1+\cdots + z_j)\Bigr | = O\Bigl (\frac{1}{k^{s-1}}\Bigr ). \end{aligned}$$

It follows that the variation is summable, \(\sum _{k=1}^\infty {{\,\mathrm{var}\,}}_k(h) <\infty \). Set

$$\begin{aligned} C_ q:= \sum _{k=q+1}^\infty {{\,\mathrm{var}\,}}_k (h) = O\Bigl ( \frac{1}{q^{s-2}}\Bigr ). \end{aligned}$$

Notice that for all \(q\in {\mathbb {N}}_0\), \(C_q\) is independent of \(\beta \) and p. In fact the pressure only enters the oscillation \({{\,\mathrm{var}\,}}_0 (h)\). By a slight abuse of notation we identify a function \(f:{\mathbb {R}}_+^{\mathbb {N}}\rightarrow {\mathbb {R}}\) with the function \(f_1: {\mathbb {R}}_+^{\mathbb {Z}}\rightarrow {\mathbb {R}}\), \((z_{j})_{j\in {\mathbb {Z}}}\mapsto f( (z_j)_{j\in {\mathbb {N}}})\) and write \(\mu _\beta (f)\) instead of \(\mu _\beta (f_1)\). The results of this subsection hold for all \(p>0\).
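As a purely numerical illustration (not used in the proofs), the decay rates \({{\,\mathrm{var}\,}}_k(h) = O(k^{-(s-1)})\) and \(C_q = O(q^{-(s-2)})\) can be tabulated directly. The sketch below assumes a pure power-law tail \(|v(r)|\le r^{-s}\) beyond the hard core \(z_j > r_\mathrm{hc}\); the values \(s=6\) and \(r_\mathrm{hc}=0.5\) are illustrative choices, not fixed by the text.

```python
# Model assumption: |v(r)| <= r**(-s) for r beyond the hard core, so that
#   var_k(h) <= 2 * sum_{j>k} (j*r_hc)**(-s) = O(k**(1-s)),
#   C_q = sum_{k>q} var_k(h)            = O(q**(2-s)).

s, r_hc, J = 6.0, 0.5, 20000

# tail[j] ~ sup of |v| on [(j+1)*r_hc, infinity) for the model tail
tail = [((j + 1) * r_hc) ** (-s) for j in range(J)]

# suffix[k] = sum_{j >= k+1} (j*r_hc)**(-s); then 2*suffix[k] bounds var_k(h)
suffix = [0.0] * (J + 1)
for k in range(J - 1, -1, -1):
    suffix[k] = suffix[k + 1] + tail[k]
var_k = [2.0 * suffix[k] for k in range(J)]

# C_q = sum_{k>q} var_k(h): one more suffix sum
Csuf = [0.0] * (J + 1)
for k in range(J - 1, -1, -1):
    Csuf[k] = Csuf[k + 1] + var_k[k]

# doubling k divides var_k by ~2**(s-1); doubling q divides C_q by ~2**(s-2)
print(var_k[40] / var_k[80], Csuf[41] / Csuf[81])
```

The two printed ratios come out close to \(2^{s-1}=32\) and \(2^{s-2}=16\), consistent with the exponents quoted above.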

Theorem 4.4

Let \(m\in {\mathbb {N}}\cup \{\infty \}\) and \(p>0\). The measure \(\mu _\beta \) is mixing with respect to shifts, that is \(\mu _\beta (f (g\circ \tau ^n)) \rightarrow \mu _\beta (f) \mu _\beta (g)\) as \(n\rightarrow \infty \), for all \(f,g\in L^1({\mathbb {R}}_+^{\mathbb {Z}},\mu _\beta )\). Moreover for \(\gamma (\beta ) = \exp ( - 3 \beta C_0)\) and all bounded \(f,g:{\mathbb {R}}_+^{\mathbb {N}}\rightarrow {\mathbb {R}}\), \(q,n\in {\mathbb {N}}\), \(N\ge qn\),

$$\begin{aligned} \bigl |\mu _\beta \bigl ( f (g\circ \tau ^N)\bigr ) - \mu _\beta (f) \mu _\beta (g)\bigr |\le & {} \Bigl ( (1-\gamma (\beta ))^q + \frac{1}{\gamma (\beta )}({{\text {e}} }^{3\beta C_n} - 1)\Bigr ) ||g||_\infty ||f||_\infty \\&+ \frac{1}{\gamma (\beta )} ||g||_\infty {{\,\mathrm{var}\,}}_n(f). \end{aligned}$$

We prove Theorem 4.4 with Pollicott’s method of conditional expectations [40]. For alternative approaches, see [48] and the references therein. The principal idea is the following: for \(n\in {\mathbb {N}}\), \(f \in L^1({\mathbb {R}}_+^{\mathbb {N}},\varphi _\beta \nu _\beta )\) let \(\Pi _n f\) be the projection

$$\begin{aligned} \bigl (\Pi _n f\bigr )(z_1,\ldots ,z_n) = \frac{\int _{{\mathbb {R}}_+^{{\mathbb {N}}}} \varphi _{\beta }(z_1, \ldots ) f(z_1, \ldots ) {{\text {e}} }^{-\beta (h_1+\ldots +h_n)} \nu _{\beta }(\mathrm {d}z_{n+1} \ldots )}{\int _{{\mathbb {R}}_+^{{\mathbb {N}}}} \varphi _{\beta }(z_1, \ldots ) {{\text {e}} }^{-\beta (h_1+\ldots +h_n)} \nu _{\beta }(\mathrm {d}z_{n+1} \ldots )} \end{aligned}$$

onto the subspace of functions that depend on the first n coordinates only, that is \({{\,\mathrm{var}\,}}_{n}(f) =0\). In terms of the stationary process \((Z_n)_{n\in {\mathbb {Z}}}\) with law \(\mu _\beta \),

$$\begin{aligned} \bigl (\Pi _n f\bigr )(Z_1,\ldots ,Z_n) = {\mathbb {E}}\bigl [ f( (Z_j)_{j\ge 1})\, \big |\, Z_1,\ldots ,Z_n\bigr ]\quad \text {a.s.} \end{aligned}$$

Notice that

$$\begin{aligned} ||\Pi _n f- f||_1\le ||\Pi _n f- f||_\infty \le {{\,\mathrm{var}\,}}_n(f) \end{aligned}$$
(4.4)

where \(||\cdot ||_1\) is the \(L^1({\mathbb {R}}_+^{\mathbb {N}}, \varphi _\beta \nu _\beta )\) norm. Let \(q,n\in {\mathbb {N}}\). Then

$$\begin{aligned} {\mathcal {S}}_\beta ^{qn} = \Bigl ( {\mathcal {S}}_\beta ^{qn} - ({\mathcal {S}}_\beta ^n\Pi _n)^q\Bigr ) + ({\mathcal {S}}_\beta ^n\Pi _n)^q. \end{aligned}$$

The difference enclosed in parentheses represents a truncation error; it is made small by choosing n large. On the subspace of mean-zero functions, the truncated operator \({\mathcal {S}}_\beta ^n\Pi _n\) satisfies a contraction property uniformly in n (Lemma 4.7), and \(({\mathcal {S}}_\beta ^n\Pi _n)^q\) goes to zero exponentially fast as \(q\rightarrow \infty \).

Lemma 4.5

We have \({{\,\mathrm{var}\,}}_q(\log \varphi _\beta ) \le \beta C_q\) for all \(q\in {\mathbb {N}}_0\) and \(\beta ,p>0\).

Proof

Let \(q\in {\mathbb {N}}_0\), \((z_j)_{j\in {\mathbb {Z}}}, (z'_j)_{j\in {\mathbb {Z}}} \in (r_\mathrm{hc}, \infty )^{{\mathbb {Z}}}\) such that \(z_j = z'_j\) for all \(j \le q\). Then

$$\begin{aligned} |{\mathcal {W}}_0(z) - {\mathcal {W}}_0(z')| =| \sum _{j=0}^{\infty } \bigl (h_{-j}(z) - h_{-j}(z')\bigr )| \le \sum _{j=0}^\infty {{\,\mathrm{var}\,}}_{q+1+j}(h) = C_q \end{aligned}$$

and \(\nu _{\beta }^-(\exp (- \beta {\mathcal {W}}_0)) \le \exp (\beta C_q) \nu _{\beta }^-(\exp (-\beta {\mathcal {W}}_0'))\). The claim then follows from the definition (4.1) of the invariant function. \(\square \)

Lemma 4.6

Let \(f:{\mathbb {R}}_+^{\mathbb {N}}\rightarrow {\mathbb {R}}\) be a bounded function. Then for all \(n,k\in {\mathbb {N}}_0\),

$$\begin{aligned} {{\,\mathrm{var}\,}}_{k} ({\mathcal {S}}_\beta ^n f) \le {{\,\mathrm{var}\,}}_{n+k} (f) + ||f||_\infty ({{\text {e}} }^{3 \beta C_k}-1). \end{aligned}$$

Proof

Let \(g =\sum _{j=1}^n h_j - \beta ^{-1}\log [\lambda _0(\beta )^{-n} \varphi _\beta ] + \beta ^{-1} \log \varphi _\beta \circ \tau ^n\) on \((r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\) and \(g \equiv \infty \) on \({\mathbb {R}}_+^{{\mathbb {N}}} \setminus (r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\) so that

$$\begin{aligned} {\mathcal {S}}_\beta ^n f (z_{n+1},z_{n+2},\ldots ) = \int _{{\mathbb {R}}_+^n} {{\text {e}} }^{ -\beta g (z_1,z_2,\ldots )} f(z_1,z_2,\ldots ) \mathrm {d}z_1\ldots \mathrm {d}z_n. \end{aligned}$$

Pick \(z, z' \in (r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\) so that \(z_j = z'_j\) for \(j=1,\ldots ,n+k\). Then

$$\begin{aligned} \bigl | {{\text {e}} }^{-\beta g(z)} f(z) -{{\text {e}} }^{-\beta g(z')} f(z') \bigr |&\le {{\text {e}} }^{-\beta g(z)} \bigl | f(z) - f(z')\bigr | + \bigl | f(z') \bigr | \bigl | {{\text {e}} }^{-\beta g(z)}- {{\text {e}} }^{-\beta g(z')}\bigr | \\&\le {{\text {e}} }^{-\beta g(z)}\Bigl ( {{\,\mathrm{var}\,}}_{n+k}(f) + ||f||_\infty \bigl ( {{\text {e}} }^{\beta {{\,\mathrm{var}\,}}_{n+k}(g)} -1 \bigr ) \Bigr ). \end{aligned}$$

We integrate out \(z_1,\ldots ,z_n\), observe \(\int \exp (-\beta g) \mathrm {d}z_1\cdots \mathrm {d}z_n = {\mathcal {S}}_\beta ^n {\mathbf {1}} = {\mathbf {1}}\), and deduce

$$\begin{aligned} {{\,\mathrm{var}\,}}_{k}({\mathcal {S}}_\beta ^n f) \le {{\,\mathrm{var}\,}}_{n+k}(f) + ||f||_\infty \bigl ({{\text {e}} }^{\beta {{\,\mathrm{var}\,}}_{n+k}(g)} -1 \bigr ). \end{aligned}$$

To conclude, we note

$$\begin{aligned} {{\,\mathrm{var}\,}}_{k+n}(g)&\le \sum _{j=0}^{n-1} {{\,\mathrm{var}\,}}_{n+k-j}(h) +\frac{1}{\beta } \bigl ( {{\,\mathrm{var}\,}}_{n+k}(\log \varphi )+ {{\,\mathrm{var}\,}}_{k}(\log \varphi ) \bigr ) \nonumber \\&\le C_k + C_{n+k} + C_k\le 3 C_k. \end{aligned}$$
(4.5)

\(\square \)

Lemma 4.7

Let \(f\in L^1({\mathbb {R}}_+^{\mathbb {N}},\varphi _\beta \nu _\beta )\) such that \(\nu _\beta (f\varphi _\beta ) =0\). Then for all \(n\ge 1\) and \(\gamma (\beta ) = \exp ( - 3 \beta C_0)\)

$$\begin{aligned} ||{\mathcal {S}}_\beta ^n \Pi _n f||_1 \le \bigl (1- \gamma (\beta ) \bigr ) ||f||_1. \end{aligned}$$

Proof

We adapt [43, Proposition 3]. Consider first a non-negative function f that depends on \(z_1,\ldots ,z_n\) only, that is \({{\,\mathrm{var}\,}}_n(f) =0\). Let \(k\ge 0\), let \(z,z'\) be such that \(z_j=z'_j\) for \(j=1,\ldots ,n\), and let \(g(z_1,z_2,\ldots )\) be as in the proof of Lemma 4.6. Then

$$\begin{aligned} ({\mathcal {S}}_\beta ^n f)(z_{n+1},z_{n+2},\ldots )&= \int {{\text {e}} }^{-\beta g(z_1,\ldots )} f(z_1,\ldots ,z_n) \mathrm {d}z_1\cdots \mathrm {d}z_n \\&\le {{\text {e}} }^{\beta {{\,\mathrm{var}\,}}_n(g)}\int {{\text {e}} }^{-\beta g(z'_1,\ldots )} f(z'_1,\ldots ,z'_n) \mathrm {d}z'_1\cdots \mathrm {d}z'_n \\&= {{\text {e}} }^{\beta {{\,\mathrm{var}\,}}_{n} (g)} ({\mathcal {S}}_\beta ^n f)(z'_{n+1},z'_{n+2},\ldots ). \end{aligned}$$

By Inequality (4.5) with \(k = 0\) we have \({{\,\mathrm{var}\,}}_{n}(g) \le 3 C_0\), uniformly in n. Thus \({\mathcal {S}}_\beta ^n f(z) \ge \exp (- 3\beta C_0) ({\mathcal {S}}_\beta ^n f)(z')\) for all \(z,z' \in (r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\). For non-negative f with \(f =\Pi _n f\) we have by Lemma 4.3

$$\begin{aligned} \inf {\mathcal {S}}_\beta ^n f \ge \gamma (\beta ) \sup {\mathcal {S}}_\beta ^n f \ge \gamma (\beta ) \mu _\beta ({\mathcal {S}}_\beta ^n f) = \gamma (\beta ) \mu _\beta (|f|). \end{aligned}$$

Next let f be such that \({{\,\mathrm{var}\,}}_n(f) =0\) and \(\mu _\beta (f) =0\). Then \(\mu _\beta (f_+)=\mu _\beta (f_-)\) and

$$\begin{aligned} |{\mathcal {S}}_\beta ^n f|&\le \bigl ( {\mathcal {S}}_\beta ^n f_+ - \gamma (\beta ) \mu _\beta (f_+)\bigr ) + \bigl ( {\mathcal {S}}_\beta ^n f_- - \gamma (\beta ) \mu _\beta (f_-)\bigr ) \\&= {\mathcal {S}}_\beta ^n (f_++f_-) - \gamma (\beta )\mu _\beta (f_++ f_-) = {\mathcal {S}}_\beta ^n|f| - \gamma (\beta ) \mu _\beta (|f|). \end{aligned}$$

We integrate against \(\mu _\beta \), use \(\mu _\beta ({\mathcal {S}}_\beta ^n|f|) = \mu _\beta (|f|) = ||f||_1\), and find \(||{\mathcal {S}}_\beta ^nf||_1 \le (1-\gamma (\beta ))||f||_1\). This holds for every local function f with \({{\,\mathrm{var}\,}}_n(f) =0\) and \(\mu _\beta (f)=0\). For general f, we may apply the bound to \(\Pi _n f\) and use \(\mu _\beta (\Pi _n f) = \mu _\beta (f) =0\) and \(\mu _\beta (|\Pi _n f|) \le \mu _\beta (|f|)\), and we are done. \(\square \)

Lemma 4.8

Let \(f \in L^1({\mathbb {R}}_+^{\mathbb {N}},\varphi _\beta \nu _\beta )\) be a bounded map with \(\nu _\beta (f\varphi _\beta ) =0\). Then for all \(q,n\in {\mathbb {N}}\),

$$\begin{aligned} ||{\mathcal {S}}_\beta ^{nq} f - ({\mathcal {S}}_\beta ^n\Pi _n)^q f||_1 \le \frac{1}{\gamma (\beta )} ({{\text {e}} }^{3\beta C_n} - 1)||f||_\infty + \frac{1}{\gamma (\beta )} {{\,\mathrm{var}\,}}_n(f). \end{aligned}$$

Proof

A telescope summation, the triangle inequality, and Lemma 4.7 yield

$$\begin{aligned} ||{\mathcal {S}}_\beta ^{nq} f - ({\mathcal {S}}_\beta ^n\Pi _n)^q f||_1&\le \sum _{k=0}^{q-1} || ({\mathcal {S}}_\beta ^n\Pi _n)^k \bigl ({\mathcal {S}}_\beta ^n\Pi _n - {\mathcal {S}}_\beta ^n\bigr ) ({\mathcal {S}}_\beta ^n)^{q-k-1} f||_1 \\&\le \sum _{k=0}^{q-1} (1-\gamma (\beta ))^k || \bigl ({\mathcal {S}}_\beta ^n\Pi _n - {\mathcal {S}}_\beta ^n\bigr ) ({\mathcal {S}}_\beta ^n)^{q-k-1} f||_1 \\&\le \sum _{k=0}^{q-1} (1-\gamma (\beta ))^k || \bigl (\Pi _n - \mathrm {id}\bigr ) ({\mathcal {S}}_\beta ^n)^{q-k-1} f||_1, \end{aligned}$$

where in the second step we use that \(\nu _{\beta }(({\mathcal {S}}_\beta ^n\Pi _n)^i \bigl ({\mathcal {S}}_\beta ^n\Pi _n - {\mathcal {S}}_\beta ^n\bigr ) ({\mathcal {S}}_\beta ^n)^{q-k-1} f \varphi _\beta ) = \nu _{\beta }(f \varphi _\beta ) = 0\) for \(i = 1, \ldots , k\) by Lemma 4.3 and the third step follows from \(|{\mathcal {S}}_\beta ^n\bigl (\Pi _n - \mathrm {id}\bigr ) ({\mathcal {S}}_\beta ^n)^{q-k-1} f| \le {\mathcal {S}}_\beta ^n|\bigl (\Pi _n - \mathrm {id}\bigr ) ({\mathcal {S}}_\beta ^n)^{q-k-1} f|\) and Lemma 4.3. By equation (4.4) and Lemma 4.6, this can be further estimated as

$$\begin{aligned}&\sum _{k=0}^{q-1}(1-\gamma (\beta ))^k {{\,\mathrm{var}\,}}_n( {\mathcal {S}}_\beta ^{n(q-k-1)} f) \\&\quad \le \sum _{k=0}^{q-1}(1-\gamma (\beta ))^k \Bigl ( ({{\text {e}} }^{3\beta C_n} - 1)||f||_\infty + {{\,\mathrm{var}\,}}_{n(q-k)}(f) \Bigr ) \\&\quad \le \frac{1}{\gamma (\beta )} ({{\text {e}} }^{3\beta C_n} - 1)||f||_\infty + \frac{1}{\gamma (\beta )} {{\,\mathrm{var}\,}}_n(f). \end{aligned}$$

\(\square \)

Proof of Theorem 4.4

Let \(f,g:{\mathbb {R}}_+^{\mathbb {N}}\rightarrow {\mathbb {R}}\) be bounded functions and \(q,n\in {\mathbb {N}}\), \(N \ge qn\). Using equation (4.2) and Lemmas 4.7 and 4.8, we get

$$\begin{aligned}&\bigl | \mu _\beta \bigl ( f (g\circ \tau ^{N}) \bigr ) -\mu _\beta (f)\mu _\beta (g)\bigr | = \bigl |\mu _\beta \bigl (({\mathcal {S}}_\beta ^{N} f)g \bigr ) -\mu _\beta (f) \mu _\beta (g)\bigr | \\&\quad \le \mu _\beta \bigl ( |g| \bigl |{\mathcal {S}}_\beta ^{N}(f-\mu _\beta ( f) {\mathbf {1}} )\bigr |\bigr ) \le ||g||_\infty \, ||{\mathcal {S}}_\beta ^{{N}} (f-\mu _\beta ( f) {\mathbf {1}} )||_1 \\&\quad \le ||g||_\infty \, ||{\mathcal {S}}_\beta ^{qn} (f-\mu _\beta ( f) {\mathbf {1}} )||_1 \\&\quad \le \Bigl ( (1-\gamma (\beta ))^q + \frac{1}{\gamma (\beta )}({{\text {e}} }^{3\beta C_n} - 1)\Bigr ) ||g||_\infty ||f-\mu _\beta ( f)||_\infty + \frac{1}{\gamma (\beta )} ||g||_\infty {{\,\mathrm{var}\,}}_n(f) \end{aligned}$$

since \(||{\mathcal {S}}_\beta ||_1\le 1\). The explicit estimate on the decay of correlations follows. That \(\mu _\beta \) is mixing then follows from standard approximation arguments. \(\square \)
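As a purely numerical illustration (our own toy discretization, not part of the proofs), the mechanism behind Theorem 4.4 can be probed for next-to-nearest-neighbour interaction (\(m=2\)), where the transfer operator acts on single-variable functions by \(({\mathcal {L}}_\beta f)(z_1) = \int e^{-\beta (p z_0 + v(z_0) + v(z_0+z_1))} f(z_0)\,\mathrm{d}z_0\). The potential, \(\beta\), p, and the grid below are illustrative choices; the leading eigenvalue approximates \(\lambda _0(\beta )\) and the gap between the two largest eigenvalue moduli reflects the exponential decay of correlations.

```python
import numpy as np

def v(r):
    # Lennard-Jones type pair potential (illustrative choice, minimum at r = 1)
    return r**-12 - 2.0 * r**-6

beta, p = 5.0, 1.0
grid = np.linspace(0.7, 3.0, 400)   # spacings; the lower cutoff mimics the hard core
dz = grid[1] - grid[0]

x = grid[None, :]                   # integration variable z_0
y = grid[:, None]                   # argument z_1 of L_beta f
A = np.exp(-beta * (p * x + v(x) + v(x + y))) * dz   # A @ f ~ L_beta f on the grid

moduli = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
lam0, lam1 = moduli[0], moduli[1]
print(lam0, lam1 / lam0)            # ~ lambda_0(beta) and the spectral ratio (< 1)
```

The positive kernel makes the leading eigenvalue simple and dominant (Perron), so the ratio printed in the second slot is strictly below 1, in line with the exponential mixing bound.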

Proof of Theorem 2.9

The estimate for infinite m is an immediate consequence of Theorem 4.4. For finite m and \(n=m-1\), the truncation error in Lemma 4.8 for a function \(f : {\mathbb {R}}_+^n \rightarrow {\mathbb {R}}\) actually vanishes since \({{\,\mathrm{var}\,}}_n(f) = 0\) and \(C_n = 0\). The bound simplifies accordingly. \(\square \)

4.3 Thermodynamic Limit

Proposition 4.9

Let \(m\in {\mathbb {N}}\cup \{\infty \}\) and \(p>0\).

  1. (a)

    The Gibbs free energy and its surface correction defined by the limits (2.7) exist and are given by

    $$\begin{aligned} g(\beta ) = - \frac{1}{\beta }\log \lambda _0(\beta ),\quad g_\mathrm{surf}(\beta )= - g(\beta )-\frac{ 1}{\beta }\log \mu _\beta ({{\text {e}} }^{\beta {\mathcal {W}}_0}). \end{aligned}$$
  2. (b)

    Equations (2.3) and (2.4) hold true.

Proof

We compute

$$\begin{aligned} \nu _\beta \bigl ({{\text {e}} }^{\beta {\mathcal {W}}(z_1\cdots z_n\mid z_{n+1}\cdots )} \bigr )= & {} \bigl (\frac{1}{\lambda _0(\beta )^n} {{\mathcal {L}}_\beta ^*}^n\nu _\beta \bigr ) \bigl ({{\text {e}} }^{\beta {\mathcal {W}}(z_1\cdots z_n\mid z_{n+1}\cdots ) } \bigr ) \nonumber \\= & {} \frac{1}{\lambda _0(\beta )^n} \int {{\text {e}} }^{\beta {\mathcal {W}}(z_1\cdots z_n\mid z_{n+1}\cdots )} {{\text {e}} }^{-\beta \sum _{j=1}^n h_j } \mathrm {d}z_1\cdots \mathrm {d}z_n \mathrm {d}\nu _\beta (z_{n+1} z_{n+2}\ldots ) \nonumber \\= & {} \frac{1}{\lambda _0(\beta )^n}\int {{\text {e}} }^{-\beta {\mathcal {E}}_{n+1}(z_1,\ldots ,z_n)} \mathrm {d}z_1\cdots \mathrm {d}z_n \mathrm {d}\nu _\beta (z_{n+1} z_{n+2}\ldots ) \nonumber \\= & {} \frac{1}{\lambda _0(\beta )^n} Q_{n+1}(\beta ). \end{aligned}$$
(4.6)

Let \({\mathcal {W}}_{0n} = \sum _{j\le 0}\sum _{k\ge n+1} v(z_j+\cdots +z_k)\). We note

$$\begin{aligned} {\mathcal {W}}(z_1\cdots z_n\mid z_{n+1}\cdots ) = {\mathcal {W}}_n- {\mathcal {W}}_{0n}, \end{aligned}$$

and with (4.3) deduce

$$\begin{aligned} \frac{1}{\lambda _0(\beta )^n}\, Q_{n+1}(\beta ) = \nu _\beta ({{\text {e}} }^{\beta {\mathcal {W}}(z_1\cdots z_n\mid z_{n+1}\cdots )}) = \frac{\mu _\beta (\exp (\beta [{\mathcal {W}}_0+{\mathcal {W}}_n- {\mathcal {W}}_{0n}]) )}{\mu _\beta (\exp ( \beta {\mathcal {W}}_0))}. \end{aligned}$$

Now \({\mathcal {W}}_{0n} = O(n^{-(s-2)})\rightarrow 0\) uniformly on \((r_\mathrm{hc}, \infty )^{\mathbb {Z}}\). By Theorem 4.4, \(\mu _\beta (\exp (\beta [{\mathcal {W}}_0+ {\mathcal {W}}_n])) = \mu _\beta (f (f\circ \tau ^n)) \rightarrow \mu _\beta (f)^2\) where \(f= \exp (\beta {\mathcal {W}}_0)\). Consequently as \(n\rightarrow \infty \)

$$\begin{aligned} \log Q_{n+1}(\beta ) = (n+1) \log \lambda _0(\beta ) - \log \lambda _0(\beta ) + \log \mu _\beta ({{\text {e}} }^{\beta {\mathcal {W}}_0}) + o(1), \end{aligned}$$

from which part (a) of the proposition follows. A computation analogous to equation (4.6) shows that for every local test function \(f\in C_b({\mathbb {R}}_+^k)\),

$$\begin{aligned} {\mathbb {Q}}_{n+1}^{\scriptscriptstyle {({\beta }})}(f) = \frac{\mu _\beta \bigl ( f \exp ( \beta [{\mathcal {W}}_0+ {\mathcal {W}}_n - {\mathcal {W}}_{0n}] )\bigr )}{\mu _\beta \bigl (\exp ( \beta [{\mathcal {W}}_0+ {\mathcal {W}}_n - {\mathcal {W}}_{0n}] )\bigr )}. \end{aligned}$$

Part (b) of the proposition then follows from Theorem 4.4. \(\square \)

5 Large Deviations as \(\beta \rightarrow \infty \)

Here we analyze the behavior of the bulk and surface Gibbs measures \(\mu _\beta \) and \(\nu _\beta \) and of the energies \(g(\beta )\) and \(g_\mathrm {surf}(\beta )\). The large deviations result for the surface measure \(\nu _\beta \) is a consequence of the eigenvalue equation from Lemma 4.1, exponential tightness, and the uniqueness of the solution to the fixed point equation in Proposition 3.9. Since the bulk measure is absolutely continuous with respect to the product measure of two independent half-infinite chains (Equation (4.2) and Proposition 4.9(b)), we may go from the surface to the bulk measure with the help of Varadhan’s integral lemma [17, Chapter 4.3]. The asymptotic behavior of \(g_\mathrm {surf}(\beta )\) is based on the representation from Proposition 4.9(a). Throughout we assume that the pressure p is a positive constant. This is a crucial ingredient in the proof of Lemma 5.1 as it prevents the chain from breaking into several pieces. As alluded to at the end of Section 2, if p vanishes one expects fracture due to occasional extremely large interparticle distances, cf. also [28].

5.1 A Tightness Estimate

The following estimate will help us prove that the infinite-volume measure \(\nu _\beta \) is exponentially tight (see the proof of Lemma 5.3) which enters the proof of Theorem 2.4.

Lemma 5.1

For all \(\beta ,p>0\), \(N\in {\mathbb {N}}\), \(k\in \{1,\ldots ,N-1\}\), and \(r \ge 0\), we have

$$\begin{aligned} {\mathbb {Q}}_N^{\scriptscriptstyle {({\beta }})}( \{z\in {\mathbb {R}}_+^{N-1} \mid z_k \ge z_{\max } + r\}) \le \exp ( - \beta p r ). \end{aligned}$$

Proof

Fix \(k\in \{1,\ldots ,N-1\}\) and \(r \ge 0\). For \(z= (z_1,\ldots ,z_{N-1})\in {\mathbb {R}}_+^{N-1}\) with \(z_k \ge z_{\max }+r\) we define a new configuration \(z'\) by setting \(z'_k = z_k - r\) and leaving all other spacings unchanged. This decreases the Gibbs energy by an amount at least

$$\begin{aligned} {\mathcal {E}}_N(z) - {\mathcal {E}}_N(z') \ge p z_k - p z'_k = p r. \end{aligned}$$

A change of variables thus yields

$$\begin{aligned} {\mathbb {Q}}_N^{\scriptscriptstyle {({\beta }})} ( \{z \mid z_k \ge z_{\max } + r \} )&= \frac{1}{Q_N(\beta )} \int _{{\mathbb {R}}_+^{N-1}} {{\text {e}} }^{-\beta {\mathcal {E}}_N(z)} {\mathbf {1}}_{ [z_{\max }+r, \infty ) } (z_k) \mathrm {d}z \\&\le \frac{1}{Q_N(\beta )} \int _{{\mathbb {R}}_+^{N-1}}{{\text {e}} }^{-\beta p r} {{\text {e}} }^{-\beta {\mathcal {E}}_N(z')} {\mathbf {1}}_{[z_{\max }, \infty )} (z'_k) \mathrm {d}z' \\&\le {{\text {e}} }^{-\beta p r}, \end{aligned}$$

and the proof of the lemma is easily concluded. \(\square \)
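As a sanity check (not needed for the proof), the bound of Lemma 5.1 can be tested numerically in the smallest nontrivial case \(N=2\), where \({\mathbb {Q}}_2^{\scriptscriptstyle {({\beta }})}\) is a one-dimensional distribution with density proportional to \(e^{-\beta (pz+v(z))}\). The potential and the values of \(\beta\), p, \(z_{\max }\) below are illustrative choices, not values from the text.

```python
import math

def v(r):
    # Lennard-Jones type pair potential (illustrative choice, minimum at r = 1)
    return r**-12 - 2.0 * r**-6

beta, p, z_max = 4.0, 0.8, 1.5      # z_max chosen beyond the minimum of v

def boltzmann_integral(a, b, n=20000):
    """Trapezoidal integral of exp(-beta*(p*z + v(z))) over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (math.exp(-beta * (p * a + v(a))) + math.exp(-beta * (p * b + v(b))))
    for i in range(1, n):
        z = a + i * h
        total += math.exp(-beta * (p * z + v(z)))
    return total * h

Q = boltzmann_integral(0.3, 60.0)   # ~ Q_2(beta); the hard core kills small z
for r in [0.5, 1.0, 2.0]:
    prob = boltzmann_integral(z_max + r, 60.0) / Q
    assert prob <= math.exp(-beta * p * r)   # the bound of Lemma 5.1
    print(r, prob, math.exp(-beta * p * r))
```

For \(N=2\) the inequality reduces, after the substitution \(z \mapsto z - r\), to \(\int _{z_{\max }+r}^\infty e^{-\beta (pz+v(z))}\mathrm{d}z \le e^{-\beta p r}\int _{z_{\max }}^\infty e^{-\beta (pz+v(z))}\mathrm{d}z\), which holds here because v is non-decreasing beyond its minimum.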

5.2 Gibbs Free Energy in the Bulk

Lemma 5.2

Let \(\beta \rightarrow \infty \) at fixed \(p > 0\). Then

$$\begin{aligned} g(\beta ) = -\frac{1}{\beta }\log \lambda _0(\beta ) = e_0+ O(\beta ^{-1} \log \beta ). \end{aligned}$$

Proof of Lemma 5.2

The relation between \(g(\beta )\) and \(\lambda _0(\beta )\) has been proven in Proposition 4.9. We proceed with an upper bound for \(Q_N(\beta )\) and \(\lambda _0(\beta )\). For \(z=(z_1,\ldots ,z_{N-1})\), define \(z'\) by \(z'_j= \min (z_{\max },z_j)\). Revisiting the proof of Lemma 3.1, we see that

$$\begin{aligned} {\mathcal {E}}_N(z)\ge {\mathcal {E}}_N(z') + \sum _{j=1}^{N-1} \max ( p(z_j - z_{\max }),0) \ge E_N + \sum _{j=1}^{N-1} p \max (z_j - z_{\max },0). \end{aligned}$$

It follows that

$$\begin{aligned} Q_N(\beta ) \le {{\text {e}} }^{-\beta E_N} \prod _{j=1}^{N-1} \bigl (z_{\max } + \int _{z_{\max }}^\infty {{\text {e}} }^{- \beta p (z_j - z_{\max }) }\mathrm {d}z_j \bigr ) \end{aligned}$$

and

$$\begin{aligned} \log \lambda _0(\beta ) \le - \beta e_0 + \log \Bigl (z_{\max } + \frac{1}{\beta p} \Bigr ), \end{aligned}$$

whence \(\beta ^{-1}\log \lambda _0(\beta ) \le - e_0+ O(\beta ^{-1})\). For a lower bound, we let \({\bar{z}} \in [z_{\min }, z_{\max }]^{N-1}\) be the minimizer of \({\mathcal {E}}_N\) and choose \(0< \varepsilon < a - z_{\min }\) so small that by Lemma 3.3

$$\begin{aligned} {\mathcal {E}}_N(z) \le E_N + C \sum _{j=1}^{N-1}(z_j - {\bar{z}}_j)^2 \end{aligned}$$

for every \(z \in \times _{j=1}^{N-1} [{\bar{z}}_j - \varepsilon , {\bar{z}}_j + \varepsilon ]\). We get

$$\begin{aligned} Q_N(\beta )&\ge {{\text {e}} }^{-\beta E_N} \prod _{j=1}^{N-1} \int _{{\bar{z}}_j-\varepsilon }^{{\bar{z}}_j+\varepsilon } {{\text {e}} }^{- C \beta (z_j - {\bar{z}}_j)^2} \mathrm {d}z_j = {{\text {e}} }^{-\beta E_N} \Bigl ( \int _{-\varepsilon }^\varepsilon {{\text {e}} }^{-C \beta s^2} \mathrm {d}s \Bigr )^{N-1}. \end{aligned}$$

This yields

$$\begin{aligned} \log \lambda _0(\beta )&\ge -\beta e_0 + \log \Bigl (\int _{-\varepsilon }^\varepsilon {{\text {e}} }^{- C \beta s^2}\mathrm {d}s\Bigr ) \\&= - \beta e_0 - \log \sqrt{ \frac{C \beta }{\pi }}+ \log \Bigl ( 1- \sqrt{\frac{2}{\pi }} \int _{\varepsilon \sqrt{2 C \beta }}^\infty {{\text {e}} }^{-x^2 /2} \mathrm {d}x\Bigr ) \end{aligned}$$

and \(\beta ^{-1}\log \lambda _0 (\beta ) \ge - e_0 + O(\beta ^{-1}\log \beta )\). \(\square \)
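As a purely numerical illustration of Lemma 5.2 (our own simplification, not part of the proof), consider the nearest-neighbour chain \(m=1\), where \(\lambda _0(\beta ) = \int _0^\infty e^{-\beta (pz+v(z))}\,\mathrm{d}z\) exactly and \(e_0 = \min _z (pz+v(z))\). The potential and p below are illustrative choices.

```python
import math

def v(r):
    # Lennard-Jones type pair potential (illustrative choice)
    return r**-12 - 2.0 * r**-6

p = 0.8
h = lambda z: p * z + v(z)

# e_0 = min_z h(z) by grid search; the minimiser sits slightly below z = 1
e0 = min(h(0.8 + i * 1e-5) for i in range(40000))

def g(beta, a=0.5, b=30.0, n=200000):
    """g(beta) = -log(lambda_0(beta))/beta, lambda_0 via a trapezoidal rule."""
    step = (b - a) / n
    total = 0.5 * (math.exp(-beta * h(a)) + math.exp(-beta * h(b)))
    for i in range(1, n):
        total += math.exp(-beta * h(a + i * step))
    return -math.log(total * step) / beta

errs = [g(beta) - e0 for beta in (50.0, 200.0)]
print(errs)   # positive, shrinking roughly like log(beta)/beta
```

The error is positive here because the Laplace integral carries the prefactor \(\sqrt{\pi /(C\beta )}<1\), matching the \(-\log \sqrt{C\beta /\pi }\) term in the lower bound above.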

5.3 Large Deviations Principles for \(\nu _\beta \) and \(\mu _\beta \)

Here we prove Theorem 2.4.

Lemma 5.3

Every sequence \(\beta _j\rightarrow \infty \) has a subsequence along which \((\nu _{\beta _j})_{j\in {\mathbb {N}}}\) satisfies a large deviations principle with speed \(\beta _j\) and some good rate function.

Remark

The following proof crucially depends on the pressure being bounded from below. If \(p= p_\beta \rightarrow 0\), we lose exponential tightness and only know that every sequence \((\nu _{\beta _j})\) has a subsequence along which it satisfies a weak large deviations principle [17, Lemma 4.1.23], which means that the upper bound in (2.6) is required to hold for compact sets rather than closed sets.

Proof

The lemma is a consequence of exponential tightness. Let \(n\in {\mathbb {N}}_0\). Define \(K_n= \times _{j=1}^\infty [0,z_{\max }+n+j]\). \(K_n\) is compact in the product topology. Passing to the limit \(N\rightarrow \infty \) in Lemma 5.1, we find

$$\begin{aligned} \nu _\beta ( \{z\in {\mathbb {R}}_+^{\mathbb {N}}\mid z_k\ge z_{\max }+ r\} ) \le {{\text {e}} }^{-\beta p r} \end{aligned}$$

for all \(k\in {\mathbb {N}}\) and \(r \ge 0\). Therefore

$$\begin{aligned} \nu _\beta (K_n^\mathrm {c})&\le \sum _{k=1}^\infty \nu _\beta ( \{ z\in {\mathbb {R}}_+^{\mathbb {N}}\mid z_k > z_{\max }+ k+ n \} ) \\&\le \sum _{k=1}^\infty {{\text {e}} }^{-\beta p(k+n)} = \frac{\exp (-\beta p(n+1))}{1- \exp ( -\beta p)}. \end{aligned}$$

It follows that the family of measures \((\nu _\beta )_{\beta \ge 1}\) is exponentially tight, that is for every \(M>0\), we can find a compact subset \(K\subset {\mathbb {R}}_+^{\mathbb {N}}\) such that \(\limsup _{\beta \rightarrow \infty } \frac{1}{\beta }\log \nu _\beta ( K^\mathrm {c}) \le - M\). \({\mathbb {R}}_+^{\mathbb {N}}\) endowed with the product topology is separable and metrizable and therefore has a countable base. Lemma 4.1.23 in [17] applies and yields the claim. \(\square \)

Lemma 5.4

Suppose that Assumption 3 holds true and assume that along some subsequence \((\beta _j)\) the measure \(\nu _{\beta _j}\) satisfies a large deviations principle with good rate function \(I(z_1,z_2,\ldots )\). Then I satisfies

$$\begin{aligned} I(z_1,z_2,\ldots ) = \bigl (h(z_1,z_2,\ldots )- e_0\bigr ) + I(z_2,z_3,\ldots ) \end{aligned}$$

on \({\mathbb {R}}_+^{\mathbb {N}}\). In particular, \(I((z_j)_{j\in {\mathbb {N}}}) = \infty \) if \(z_j \le r_\mathrm {hc}\) for some \(j\in {\mathbb {N}}\).

Proof

Write \(\beta \) instead of \(\beta _j\). We will see that the fixed point equation for I follows from the eigenvalue equation in Lemma 4.1 and the asymptotics of the principal eigenvalue provided in Lemma 5.2. According to these,

$$\begin{aligned} \mathrm {d}\nu _{\beta } (z_1 z_2\ldots ) = {{\text {e}} }^{-\beta [h_1 + \ldots + h_n - ne_0 + o(1)]} \mathrm {d}z_1 \ldots \mathrm {d}z_n \mathrm {d}\nu _{\beta }(z_{n+1}\ldots ) \end{aligned}$$
(5.1)

for any \(n \in {\mathbb {N}}\) where the o(1)-term comes from \(\log \lambda _0^{n}(\beta ) = - \beta [ne_0+o(1)]\) and is independent of \((z_j)_{j\in {\mathbb {N}}}\).

We first show that I can only be finite on \((r_\mathrm{hc}, \infty )^{\mathbb {N}}\). Fix \(n \in {\mathbb {N}}\) and for \(\varepsilon > 0\) consider the open set \(O_{\varepsilon } = \{z \in {\mathbb {R}}_+^{\mathbb {N}}\mid 0< z_n < r_\mathrm{hc}+\varepsilon \}\). A repeated application of Lemma 4.1 and Lemma 5.2 gives

$$\begin{aligned} \nu _\beta (O_{\varepsilon }) = \int _{O_{\varepsilon } \cap (r_\mathrm{hc}, \infty )^{{\mathbb {N}}}} {{\text {e}} }^{-\beta [h_1 + \ldots + h_n - n e_0 + o(1)]} \mathrm {d}z_1 \ldots \mathrm {d}z_n \mathrm {d}\nu _\beta (z_{n+1}\ldots ). \end{aligned}$$

Let \(-C\) be a lower bound for \(-e_0 + v(z_{\max }) + \sum _{k=2}^\infty v(z_1 + \cdots + z_k )\) on \((r_\mathrm {hc},\infty )^{\mathbb {N}}\). Then

$$\begin{aligned} \nu _\beta (O_{\varepsilon }) \le&\int _{(r_\mathrm{hc}, \infty )^{n-1}} {{\text {e}} }^{-\beta [p(z_1 + \ldots + z_{n-1}) - C(n-1) + o(1)]} \mathrm {d}z_1 \ldots \mathrm {d}z_{n-1} \\&\times \int _{(r_\mathrm{hc}, r_\mathrm{hc}+\varepsilon )} {{\text {e}} }^{-\beta [p z_n + v(z_n) - C]} \mathrm {d}z_n \end{aligned}$$

and

$$\begin{aligned} \log \nu _\beta (O_{\varepsilon })&\le \beta (C + o(1)) (n-1) + \log \varepsilon - \beta \inf _{s \in (r_\mathrm {hc},r_\mathrm {hc} + \varepsilon ]} (p s + v(s)). \end{aligned}$$

Hence

$$\begin{aligned} - \inf _{O_\varepsilon } I \le C(n-1) - \inf _{s \in (r_\mathrm {hc},r_\mathrm {hc} + \varepsilon ]} (p s + v(s)) =:- f(\varepsilon ). \end{aligned}$$

It follows that

$$\begin{aligned} \inf \{ I(z) \mid z_n \le r_\mathrm{hc} \} \ge \lim _{\varepsilon \rightarrow 0} f(\varepsilon ) = \infty . \end{aligned}$$

Since n was arbitrary we have shown that \(I \equiv \infty \) on \({\mathbb {R}}_+^{{\mathbb {N}}} \setminus (r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\). In particular, as \(\nu _\beta \) satisfies a large deviations principle on \({\mathbb {R}}_+^{\mathbb {N}}\) with rate function I, the same large deviations principle holds on \((r_\mathrm {hc},\infty )^{\mathbb {N}}\).

We now establish another (weak) large deviations principle on \((r_\mathrm {hc},\infty )^{\mathbb {N}}\). Let \(K\subset (r_\mathrm {hc},\infty )^{\mathbb {N}}\) be a (relatively) closed set and \([\alpha ,b]\subset (r_\mathrm {hc},\infty )\) a compact interval. Then (5.1) with \(n = 1\) yields

$$\begin{aligned} \nu _\beta ([\alpha ,b] \times K) = \int _\alpha ^b \Bigl ( \int _K {{\text {e}} }^{-\beta [h(z_1,z_2,\ldots ) - e_0+o(1)]} \mathrm {d}\nu _\beta (z_2,z_3,\ldots ) \Bigr )\mathrm {d}z_1. \end{aligned}$$

Write \(f_\beta (z_1;K)\) for the inner integral. As h is bounded from below and for every fixed \(z_1>r_\mathrm {hc}\), \((z_2,z_3,\ldots ) \mapsto h(z_1,z_2,\ldots )\) is continuous on \((r_\mathrm{hc}, \infty )^{{\mathbb {N}}}\) with respect to the product topology, we deduce from Varadhan’s lemma [17, Chapter 4.3] that

$$\begin{aligned} \limsup _{\beta \rightarrow \infty } \frac{1}{\beta }\log f_\beta (z_1;K) \le - \inf _{(z_j)_{j\ge 2}\in K} \bigl (h(z_1,z_2,\ldots ) - e_0+ I(z_2,z_3,\ldots )\bigr )\nonumber \\ \end{aligned}$$
(5.2)

for all \(z_1\in [\alpha ,b]\). Next we note that for all \((z_j)_{j\in {\mathbb {N}}}\in (r_\mathrm {hc},\infty )^{\mathbb {N}}\), \(z'_1>r_\mathrm {hc}\), and suitable \(C>0\),

$$\begin{aligned} |h(z_1,z_2,\ldots ) - h(z'_1,z_2,\ldots )| \le |v(z_1) - v(z'_1)| + C|z_1- z'_1|. \end{aligned}$$

For \(z_1,z'_1\) bounded away from \(r_{\mathrm {hc}}\) we may exploit that the derivative of v is bounded and drop the first term, making C larger if need be. Plugging these estimates into the definition of \(f_\beta (z_1;K)\), we find that for some \(C_\alpha >0\) and all \(\beta >0\),

$$\begin{aligned} \Bigl |\frac{1}{\beta }\log f_\beta (z_1;K) -\frac{1}{\beta }\log f_\beta (z_1';K)\Bigr |\le C_{ \alpha } |z_1-z'_1|\quad (z_1,z'_1>\alpha >r_\mathrm {hc}). \end{aligned}$$

It follows that the upper bound (5.2) is uniform on compact subsets of \((r_\mathrm {hc},\infty )\) and

$$\begin{aligned} \limsup _{\beta \rightarrow \infty }\frac{1}{\beta }\log \nu _\beta ([ \alpha ,b]\times K) \le - \inf _{z\in [\alpha ,b]\times K} \bigl ( h(z_1,z_2,\ldots ) -e_0 + I(z_2,z_3,\ldots )\bigr ).\nonumber \\ \end{aligned}$$
(5.3)

A similar argument shows that for all \(b>\alpha >r_\mathrm {hc}\) and all (relatively) open subsets \(O\subset (r_{\mathrm {hc}}, \infty )^{{\mathbb {N}}}\),

$$\begin{aligned} \liminf _{\beta \rightarrow \infty }\frac{1}{\beta }\log \nu _\beta ((\alpha ,b)\times O) \ge - \inf _{z\in (\alpha ,b) \times O} \bigl ( h(z_1,z_2,\ldots ) -e_0 + I(z_2,z_3,\ldots )\bigr ).\nonumber \\ \end{aligned}$$
(5.4)

Taking monotone limits, the latter inequality is seen to extend to \(\alpha =r_\mathrm {hc}\) and \(b= \infty \). It follows that \((\nu _\beta )\), as a family of probability measures on \((r_\mathrm {hc},\infty )^{\mathbb {N}}\), satisfies a weak large deviations principle with rate function \(J= h_1 - e_0+I(z_2,\ldots )\). (It is indeed sufficient to consider product sets. This is easy to see for the lower bound: If \(U \subset (r_\mathrm {hc},\infty )^{\mathbb {N}}\) is open, then for any \(\varepsilon > 0\) one finds \({\bar{z}}\in (\alpha ,b) \times O \subset U\) with \(h({\bar{z}}_1,{\bar{z}}_2,\ldots ) -e_0 + I({\bar{z}}_2,{\bar{z}}_3,\ldots ) - \varepsilon \le \inf _{z\in U} \bigl ( h(z_1,z_2,\ldots ) -e_0 + I(z_2,z_3,\ldots )\bigr )\), from which it follows that (5.4) holds for U instead of \((\alpha ,b) \times O\). The upper bound for a general compact \(V \subset (r_\mathrm {hc},\infty )^{\mathbb {N}}\) is obtained by covering, for given \(\varepsilon > 0\), \(V \subset \bigcup _{i = 1}^{N_{\varepsilon }} (\alpha _{x_i},b_{x_i}) \times B_{\delta (x_i)}(x_i)\), where for each \(x \in V\), \(b_x> \alpha _x > r_\mathrm {hc}\) and \(\delta (x) > 0\) are chosen such that \(h(x_1,x_2,\ldots ) -e_0 + I(x_2,x_3,\ldots ) - \varepsilon \le \inf _{z\in (\alpha _{x},b_{x}) \times B_{\delta (x)}(x)} \bigl ( h(z_1,z_2,\ldots ) -e_0 + I(z_2,z_3,\ldots )\bigr )\). This is possible since I is lower semicontinuous. With the help of (5.3) we can now deduce that (5.3) holds for V instead of \([\alpha ,b] \times K\).)

Since \((r_\mathrm {hc},\infty )^{\mathbb {N}}\) is a Polish space, the rate function in a weak large deviations principle is uniquely defined [17, Chapter 4.1], hence \(J=I\) on \((r_\mathrm {hc},\infty )^{\mathbb {N}}\). To finish the proof it remains to observe that also \(J=I\) on \({\mathbb {R}}_+^{\mathbb {N}}\setminus (r_\mathrm {hc},\infty )^{\mathbb {N}}\) because both I and h are equal to \(\infty \) on that set. \(\square \)

Proof of Theorem 2.4

The large deviations principle for \(\nu _\beta \) with good rate function \(\overline{{\mathcal {E}}}_{\mathrm {surf}} - \min {\mathcal {E}}_\mathrm {surf}\) is an immediate consequence of Lemmas 5.3 and 5.4 and Proposition 3.9. As a consequence, \(\nu ^-_{\beta } \otimes \nu ^+_{\beta }\) satisfies a large deviations principle with good rate function \((z_j)_{j \in {\mathbb {Z}}} \mapsto \overline{{\mathcal {E}}}_\mathrm {surf}(z_1,z_2,\ldots ) + \overline{{\mathcal {E}}}_\mathrm {surf}(z_0,z_{-1},\ldots ) - 2 \min {\mathcal {E}}_\mathrm {surf}\) on \({\mathbb {R}}_+^{{\mathbb {Z}}}\) and on \([r_\mathrm{hc}, \infty )^{{\mathbb {Z}}}\). The large deviations principle for \(\mu _\beta \) thus follows from equation (4.2), Lemmas 4.3.4 and 4.3.6 in [17], \(\min \overline{{\mathcal {E}}}_\mathrm {bulk} =0\) and

$$\begin{aligned} \overline{{\mathcal {E}}}_\mathrm {bulk}(z_1,z_2,\ldots ) = \overline{{\mathcal {E}}}_\mathrm {surf}(z_1,z_2,\ldots ) + \overline{{\mathcal {E}}}_\mathrm {surf}(z_0,z_{-1},\ldots ) + {\mathcal {W}}_0(\cdots z_0\mid z_1\cdots ) \end{aligned}$$

by Proposition 2.3, and the observation that \({\mathcal {W}}_0\) is continuous on \([r_\mathrm{hc}, \infty )^{{\mathbb {Z}}}\). \(\square \)

5.4 Surface Corrections to the Gibbs Free Energy

Proof of Theorem 2.5

The statements about \(g(\beta )\) have already been proven in Lemma 5.2. For \(g_\mathrm {surf}(\beta )\), we start from the formula in Proposition 4.9(a), to which we apply Lemma 5.2, Theorem 2.4 and Varadhan’s lemma. This yields

$$\begin{aligned} \lim _{\beta \rightarrow \infty } g_\mathrm{surf}(\beta ) = - e_0 + \inf \bigl ( {\mathcal {E}}_\mathrm{bulk} - {\mathcal {W}}_0\bigr ), \end{aligned}$$

but now, for \((z_j)\) with \(\sum _{j\in {\mathbb {Z}}}(z_j-a)^2<\infty \),

$$\begin{aligned} {\mathcal {E}}_\mathrm{bulk}- {\mathcal {W}}_0&= \sum _{j\in {\mathbb {Z}}} \sum _{k=1}^{m} \bigl ( v(z_j + \cdots + z_{j+k-1}) - v(ka) + \delta _{1k} p(z_j - a)\bigr ) \\&\quad - \sum _{j \le 0, \ell \ge 1 \atop |\ell - j| \le m-1} \bigl ( v(z_j+\cdots + z_\ell ) - v((\ell -j +1) a)\bigr ) - \sum _{k=1}^{m} (k-1)\, v(ka) \\&= {\mathcal {E}}_\mathrm{surf}(z_1,z_2,\ldots ) + {\mathcal {E}}_\mathrm{surf}(z_0,z_{-1},\ldots ) + e_\mathrm {clamp}+e_0, \end{aligned}$$

with \(e_\mathrm {clamp} := - pa - \sum \limits _{k=1}^\infty k \,v(ka)\), so

$$\begin{aligned} \inf ({\mathcal {E}}_\mathrm{bulk}-{\mathcal {W}}_0)-e_0 = 2 \inf {\mathcal {E}}_\mathrm{surf}+e_\mathrm {clamp} = e_\mathrm{surf}. \end{aligned}$$

\(\square \)

6 Gaussian Approximation

Here we prove Theorems 2.7 and 2.8 on the Gaussian approximation to the bulk measure \(\mu _\beta \) when m is finite. We start from a standard idea, namely perturbation theory for transfer operators [24]; however, we need to put some work into a good choice of transfer operator, as the standard symmetrized choice (6.2) does not work well. This aspect is explained in more detail in section 6.1. Throughout this section m satisfies \(2\le m <\infty \). Remember that \(d=m-1\).

6.1 Decomposition of the Energy. Choice of Transfer Operator

For finite m, the treatment with transfer operators from section 4.1 can be considerably simplified: instead of an operator that acts on functions of infinitely many variables, the transfer operator becomes an integral operator in \(L^2({\mathbb {R}}^d)\) (\(L^2\) space with respect to Lebesgue measure). There are several possible choices, each corresponding to an additive decomposition of the energy. Let \(V(z_1,\ldots ,z_d):= {\mathcal {E}}_m(z_1,\ldots ,z_d)\) and

$$\begin{aligned} W(z_1,\ldots ,z_d;z_{d+1},\ldots , z_{2d}) = \sum _{\begin{array}{c} 1\le i \le d<j \le 2d\\ |i-j|\le d \end{array}} v(z_i+\cdots + z_j). \end{aligned}$$

Let us block variables as \(x_j = (z_{dj+1},\ldots , z_{dj + d})\). Then for \((z_j)_{j \in {\mathbb {Z}}} \in {\mathcal {D}}_0^+\) we have

$$\begin{aligned} {\mathcal {E}}_\mathrm {bulk}((z_j)_{j \in {\mathbb {Z}}}) = \sum _{j \in {\mathbb {Z}}} \big ( V(x_j) + W(x_j, x_{j+1})- d e_0 \big ) \end{aligned}$$
(6.1)

with only finitely many non-zero summands. By Proposition 2.3 the sum extends to \({\mathcal {D}}^+\) by continuity. The transfer operator associated with the representation (6.1) is the integral operator with kernel \(\exp ( - \beta [V(x) + W(x;y)])\); it is clearly related to the d-th power of the transfer operator \({\mathcal {L}}_\beta \) from section 4.1. The analysis is simpler for a symmetrized operator with kernel

$$\begin{aligned} T_\beta (x,y) = \mathbb {1}_{(r_\mathrm {hc},\infty )^d}(x) \exp \Bigl ( - \beta \Bigl [\tfrac{1}{2} V(x) + W(x;y) + \tfrac{1}{2} V(y)\Bigr ] \Bigr )\mathbb {1}_{(r_\mathrm {hc},\infty )^d}(y),\nonumber \\ \end{aligned}$$
(6.2)

which has the advantage of being Hilbert–Schmidt: The pressure term present in V(x) and V(y) ensures that \(T_\beta (x,y)\) decays exponentially fast when \(|x|+|y|\rightarrow \infty \) so that \(\int _{{\mathbb {R}}^{2d}} T_\beta (x,y)^2 \mathrm {d}x \mathrm {d}y <\infty \). The transfer operator \(T_\beta \) corresponds to a rewriting of  (6.1):

$$\begin{aligned} {\mathcal {E}}_\mathrm {bulk}((z_j)_{j \in {\mathbb {Z}}}) = \sum _{j \in {\mathbb {Z}}} \big ( \tfrac{1}{2} V(x_j) + W(x_j, x_{j+1}) + \tfrac{1}{2} V(x_{j+1}) - d e_0 \big ). \end{aligned}$$

For the analysis of the limit \(\beta \rightarrow \infty \), we would like to have a transfer operator that concentrates in some sense around the optimal spacings so that we may approximate it with a Gaussian operator. When \(m\ge 3\), unfortunately, the function \((x,y)\mapsto \tfrac{1}{2} V(x) + W(x;y) + \tfrac{1}{2} V(y)\) need not have its minimum at \((x,y) = (\varvec{a}, \varvec{a})\), with \(\varvec{a} = (a,\ldots ,a)\in {\mathbb {R}}^d\). Therefore we introduce yet another variant of the transfer operator: we look for a function \({\widehat{H}}(x,y)\) such that

$$\begin{aligned} {\mathcal {E}}_\mathrm {bulk}((z_j)_{j \in {\mathbb {Z}}}) = \sum _{j\in {\mathbb {Z}}} {\widehat{H}}(x_j, x_{j+1}) \end{aligned}$$

and \({\widehat{H}}(x,y) \ge {\widehat{H}}(\varvec{a}, \varvec{a}) =0\), and work with the kernel

$$\begin{aligned} K_\beta (x,y) := \mathbb {1}_{(r_\mathrm {hc},\infty )^{d}}(x) \exp \Bigl ( - \beta {\widehat{H}}(x,y) \Bigr ) \mathbb {1}_{(r_\mathrm {hc},\infty )^{d}}(y). \end{aligned}$$

By a slight abuse of notation we use the same letter for the integral operator

$$\begin{aligned} (K_\beta f)(x) = \int _{{\mathbb {R}}^d} K_\beta (x,y) f(y) \mathrm {d}y \end{aligned}$$

in \(L^2({\mathbb {R}}^d)\). The function \({\widehat{H}}\) is defined as follows. Set

$$\begin{aligned} H(x,y)&: = \inf \left\{ {\mathcal {E}}_{\mathrm {bulk}}\bigl ( (z_j)_{j\in {\mathbb {Z}}}\bigr )\mid (z_j)_{j\in {\mathbb {Z}}} \in (r_\mathrm {hc},\infty )^{\mathbb {Z}}:\, (z_1,\ldots ,z_{2d}) = (x,y)\right\} , \\ w(x)&: = \inf \left\{ {\mathcal {E}}_{\mathrm {bulk}}\bigl ( (z_j)_{j\in {\mathbb {Z}}}\bigr )\mid (z_j)_{j\in {\mathbb {Z}}} \in (r_\mathrm {hc},\infty )^{\mathbb {Z}}:\, (z_1,\ldots ,z_{d}) = x\right\} \end{aligned}$$

and

$$\begin{aligned} {\widehat{H}}(x,y) := H(x,y) - \tfrac{1}{2} w(x) - \tfrac{1}{2} w(y). \end{aligned}$$

Remember that

$$\begin{aligned} u(x) = \inf \{ {\mathcal {E}}_{\mathrm {surf}}\bigl ((z_j)_{j\in {\mathbb {N}}}\bigr )\mid (z_j)_{j\in {\mathbb {N}}} \in (r_\mathrm {hc},\infty )^{\mathbb {N}}:\, (z_1,\ldots ,z_d) = x\}. \end{aligned}$$

Lemma 6.1

Assume \(2\le m<\infty \), \(p\in [0,p^*)\), and \(r_\mathrm {hc}>0\). Then

  1. (a)

    For all \(x,y\in (r_\mathrm {hc},\infty )^d\), we have \({\widehat{H}}(x,y) \ge {\widehat{H}}(\varvec{a}, \varvec{a}) =0\).

  2. (b)

    The function \(g(x):= \frac{1}{2}[u(x) - u(\sigma x)]\) is bounded, and we have

    $$\begin{aligned} {\widehat{H}}(x,y) = - g(x) + \Bigl ( \tfrac{1}{2} V(x) + W(x,y) + \tfrac{1}{2} V(y)- d e_0\Bigr ) + g(y). \end{aligned}$$
  3. (c)

    \({\widehat{H}}( x, y) = {\widehat{H}}(\sigma y,\sigma x)\) for all \(x,y\in (r_\mathrm {hc},\infty )^d\).

Proof

One easily checks that

$$\begin{aligned} w(x) = \inf _{y \in (r_\mathrm {hc},\infty )^d} H(x,y),\quad w(y) = \inf _{x \in (r_\mathrm {hc},\infty )^d} H(x,y), \end{aligned}$$

which yields

$$\begin{aligned} H(x,y) -\tfrac{1}{2} w(x) - \tfrac{1}{2} w(y) = \tfrac{1}{2}[ H(x,y) - w(x)] + \tfrac{1}{2} [H(x,y) - w(y)] \ge 0.\nonumber \\ \end{aligned}$$
(6.3)

For \(x = y = \varvec{a}\), we have \(H(\varvec{a},\varvec{a})= w(\varvec{a})\) hence \({\widehat{H}}(\varvec{a},\varvec{a}) =0\). This proves part (a) of the lemma. The symmetry in part (c) is immediate from the reversal symmetry of \({\mathcal {E}}_\mathrm {bulk}\). For (b), we note that

$$\begin{aligned} H(x,y) = u(\sigma x) + W(x,y) + u(y),\quad w(x) = u(\sigma x) + u(x) - V(x) + d e_0, \end{aligned}$$

from which the formula for \({\widehat{H}}\) follows. Because of

$$\begin{aligned} u(x) = \inf _y \bigl ( V(x) + W(x,y) - d e_0 + u(y)) \end{aligned}$$

and \(V(\sigma x) = V(x)\), together with \(C:= \sup _{(x,y)\in (r_\mathrm {hc},\infty )^{2d}} |W(x,y)-W(\sigma x, y)| <\infty \), we have

$$\begin{aligned} u(x) \le \inf _y \bigl ( V(\sigma x) + W(\sigma x, y) +C - d e_0 + u(y)\bigr ) = u(\sigma x) + C. \end{aligned}$$

The roles of x and \(\sigma x\) can be exchanged, hence \(u(x) - u(\sigma x)\) is bounded. \(\square \)

6.2 Some Properties of the Transfer Operator

Lemma 6.2

Assume \(2\le m<\infty \), \(p\in (0,p^*)\), and \(r_\mathrm {hc}>0\). Then

  1. (a)

    The kernels \(K_\beta \) and \(T_\beta \) are related as follows:

    $$\begin{aligned} K_\beta (x,y) = {{\text {e}} }^{\beta d e_0+ \tfrac{1}{2} \beta [u(x) - u(\sigma x) ]} T_\beta (x,y) {{\text {e}} }^{ -\tfrac{1}{2} \beta [ u(y) - u(\sigma y)]}. \end{aligned}$$
  2. (b)

    The operator \(K_\beta \) is a Hilbert–Schmidt operator in \(L^2({\mathbb {R}}^d)\), and the kernel has the symmetry \(K_\beta (x,y) = K_\beta (\sigma y, \sigma x)\).

The lemma follows from Lemma 6.1; the elementary proofs are omitted.

By the Krein–Rutman theorem [31, 16, Chapter 6], the operator norm \(||K_\beta ||=:\Lambda _0(\beta )\) is a simple eigenvalue of \(K_\beta \), the associated eigenfunction \(\phi _\beta \) can be chosen strictly positive on \((r_\mathrm {hc},\infty )^d\), and the other eigenvalues of \(K_\beta \) have absolute value strictly smaller than \(\Lambda _0(\beta )\), that is

$$\begin{aligned} \Lambda _1(\beta ) = \sup \{|\lambda |\, : \, \lambda \, \text {eigenvalue of } K_\beta ,\, \lambda \ne \Lambda _0(\beta )\}<\Lambda _0(\beta ). \end{aligned}$$

By Lemma 6.2(b), the function \(\phi _\beta \circ \sigma \) is a left eigenfunction of \(K_\beta \):

$$\begin{aligned} \int _{{\mathbb {R}}^d} \phi _\beta (\sigma x) K_\beta (x,y) \mathrm {d}x = \Lambda _0(\beta ) \phi _\beta (\sigma y). \end{aligned}$$

Let \(\Pi _\beta \) be the rank-one projection in \(L^2({\mathbb {R}}^d)\) given by

$$\begin{aligned} \Pi _\beta f :=\frac{ \langle f, \phi _\beta \circ \sigma \rangle }{\langle \phi _\beta , \phi _\beta \circ \sigma \rangle } \phi _\beta . \end{aligned}$$

Then \(K_\beta \Pi _\beta = \Lambda _0(\beta ) \Pi _\beta = \Pi _\beta K_\beta \) and an induction over \(n\in {\mathbb {N}}\) shows

$$\begin{aligned} \frac{1}{\Lambda _0(\beta )^n} K_\beta ^n - \Pi _\beta = \Bigl ( \frac{1}{\Lambda _0(\beta )} K_\beta - \Pi _\beta \Bigr )^n. \end{aligned}$$
(6.4)

Since \(\Lambda _1(\beta )\) is nothing else but the spectral radius of \( K_\beta -\Lambda _0(\beta ) \Pi _\beta \), it follows that

$$\begin{aligned} \limsup _{n\rightarrow \infty } || \Lambda _0(\beta )^{-n} K_\beta ^n - \Pi _\beta ||^{1/n} =\frac{\Lambda _1(\beta )}{\Lambda _0(\beta )}<1. \end{aligned}$$
(6.5)
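For \(m=2\) (hence \(d=1\)), the kernel is symmetric and the spectral picture behind (6.5) can be observed directly on a discretization. The following minimal numerical sketch is ours, not the paper's: the Lennard–Jones-type pair potential, pressure, inverse temperature, and quadrature grid are illustrative assumptions. It only demonstrates that the discretized operator has a simple, strictly dominant eigenvalue with a positive eigenfunction, as the Krein–Rutman theorem predicts.

```python
import numpy as np

# Illustrative choices (our assumptions, not the paper's): a Lennard-Jones-type
# pair potential with hard core r_hc, pressure p, inverse temperature beta.
r_hc, p, beta = 0.5, 1.0, 5.0
v = lambda r: r**-12 - 2.0 * r**-6
V = lambda z: v(z) + p * z          # stand-in for E_m(z) plus pressure term, m = 2

# Quadrature grid above the hard core, where the indicators in (6.2) equal 1.
x = np.linspace(r_hc + 0.4, 1.6, 200)
dx = x[1] - x[0]

# Symmetrized kernel (6.2) for d = 1, with W(x; y) = v(x + y).
X, Y = np.meshgrid(x, x, indexing="ij")
T = np.exp(-beta * (0.5 * V(X) + v(X + Y) + 0.5 * V(Y)))

# Discretized integral operator: a symmetric matrix with positive entries.
lam, vec = np.linalg.eigh(T * dx)
Lambda0 = lam[-1]                                # principal eigenvalue
Lambda1 = np.abs(lam[:-1]).max()                 # second-largest modulus
phi = vec[:, -1] * np.sign(vec[:, -1].sum())     # eigenfunction, sign fixed
```

The ratio `Lambda1 / Lambda0` is strictly below one, which is the discrete analogue of the spectral gap controlling the correlation decay in Lemma 6.3(c).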

The spectral properties of \(K_\beta \) are related to the Gibbs free energy and the Gibbs measure as follows.

Lemma 6.3

Assume \(2 \le m< \infty \), \(p\in (0,p^*)\), and \(r_\mathrm {hc}>0\). Then

  1. (a)

    The Gibbs free energy is given by \(g(\beta )=e_0-\frac{1}{\beta d}\log \Lambda _0(\beta )\).

  2. (b)

    The nd-dimensional marginals of the bulk Gibbs measure \(\mu _\beta \) have probability density function

    $$\begin{aligned} \frac{1}{c} \phi _\beta (\sigma x_1) \Biggl (\prod _{i=1}^{n-1} \frac{1}{\Lambda _0(\beta )} K_\beta (x_i,x_{i+1})\Biggr ) \phi _\beta (x_n) \end{aligned}$$

    with \(c=\langle \phi _\beta , \phi _\beta \circ \sigma \rangle \).

  3. (c)

    For all \(\varepsilon >0\) and all bounded \(f,g:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\), writing \(f_0\bigl ((z_j)_{j\in {\mathbb {Z}}}\bigr ):=f(z_{0},\ldots , z_{d-1})\) and \(g_n\bigl ((z_j)_{j\in {\mathbb {Z}}}\bigr ):= g(z_{nd},\ldots , z_{nd+d-1})\), we have

    $$\begin{aligned} \bigl |\mu _\beta (f_0g_n) - \mu _\beta (f_0)\mu _\beta (g_n)\bigr | \le C_\varepsilon (\beta ) \Bigl (\frac{\Lambda _1(\beta )}{\Lambda _0(\beta )}\Bigr )^{(1-\varepsilon )n} ||f||_\infty ||g||_\infty \end{aligned}$$

    with some constant \(C_\varepsilon (\beta )\) that does not depend on f, g, or n. If \(m=2\), we can pick \(\varepsilon = 0\) and \(C_0=1\).

Proof of Lemma 6.3

For \(N= nd +1\), the partition function \(Q_N(\beta )\) is given by

$$\begin{aligned}&Q_{nd+1}(\beta )= \langle {{\text {e}} }^{-\beta V/2}, T_\beta ^{n-1} {{\text {e}} }^{-\beta V/2}\rangle \\&\quad = {{\text {e}} }^{-(n-1)\beta d e_0}\langle {{\text {e}} }^{- \beta V/2 -\beta [u- u\circ \sigma ]/2 },K_\beta ^{n-1} {{\text {e}} }^{- \beta V/2 + \beta [u- u\circ \sigma ] /2}\rangle . \end{aligned}$$

For the second identity we have used Lemma 6.2(a). The function \(u-u\circ \sigma \) is bounded by Lemma 6.1(b) and \(\exp (- \beta V)\) is integrable because \(V(z_1,\ldots ,z_d)={\mathcal {E}}_m(z_1,\ldots ,z_d)\) grows linearly when \(|z_j|\rightarrow \infty \). Therefore \(F_\beta := \exp (- \beta V/2- \beta [u- u\circ \sigma ] /2)\) and \(F_\beta \circ \sigma \) are in \(L^2({\mathbb {R}}^d)\), and as \(n \rightarrow \infty \),

$$\begin{aligned} \langle F_\beta , K_\beta ^{n-1} F_\beta \circ \sigma \rangle = \Lambda _0(\beta )^{n-1}\langle F_\beta , \phi _\beta \rangle ^2 + O(\Lambda _1(\beta )^{n-1}). \end{aligned}$$

It follows that

$$\begin{aligned} g(\beta ) = - \lim _{n\rightarrow \infty }\frac{1}{\beta (nd+1)}\log Q_{nd+1}(\beta ) = e_0- \frac{1}{\beta d}\log \Lambda _0(\beta ), \end{aligned}$$

which proves part (a) of the lemma. The standard proof of part (b) is omitted (compare [24, Chapter 4]). For (c), we use the formula for the \((n+1)d\)-dimensional marginal provided by (b). Let us choose multiplicative constants in such a way that \(c=\langle \phi _\beta ,\phi _\beta \circ \sigma \rangle =1\). Then

$$\begin{aligned} \mu _\beta (f_0g_n) - \mu _\beta (f_0)\mu _\beta (g_n)&= \langle f(\phi _\beta \circ \sigma ),\frac{1}{\Lambda _0(\beta )^n}K_\beta ^n (g \phi _\beta )\rangle \\&\quad - \langle f ( \phi _\beta \circ \sigma ), \phi _\beta \rangle \langle \phi _\beta \circ \sigma , g \phi _\beta \rangle \\&=\langle f(\phi _\beta \circ \sigma ),\Bigl ( \frac{1}{\Lambda _0(\beta )^n}K_\beta ^n- \Pi _\beta \Bigr ) (g \phi _\beta )\rangle . \end{aligned}$$

Equation (6.4) yields

$$\begin{aligned} \bigl | \mu _\beta (f_0g_n) - \mu _\beta (f_0)\mu _\beta (g_n)\bigr | \le || \Bigl ( \frac{1}{\Lambda _0(\beta )} K_\beta - \Pi _\beta \Bigr )^n ||\, ||f(\phi _\beta \circ \sigma )||\, ||g \phi _\beta || \end{aligned}$$

where \(||\cdot ||\) refers to the \(L^2\)-norm for functions and the operator norm for the operator. We further bound \(|| g\phi _\beta ||\le ||g||_\infty ||\phi _\beta ||\) and \(||f (\phi _\beta \circ \sigma )||\le ||f||_\infty ||\phi _\beta ||\) and conclude with (6.5). If \(m=2\), the operators are symmetric, hence the operator norm is the same as the spectral radius and the estimates simplify accordingly. \(\square \)

Remark (Associated Markov chain)

Define the kernel

$$\begin{aligned} P_\beta (x,\mathrm {d}y):= \frac{1}{\Lambda _0(\beta ) \phi _\beta (x)} K_\beta (x,y) \phi _\beta (y) \mathrm {d}y \end{aligned}$$
(6.6)

on \((r_\mathrm {hc},\infty )^d\). Then \(P_\beta \) is a Markov kernel with invariant measure \(\rho _\beta (x)\mathrm {d}x\) where

$$\begin{aligned} \rho _\beta (x) =\frac{1}{c} \phi _\beta (\sigma x) \phi _\beta (x). \end{aligned}$$

If in the bulk Gibbs measure \(\mu _\beta \) we group spacings in blocks as \(x_n = (z_{dn},\ldots ,z_{dn+d-1})\), we obtain a probability measure on \(((r_\mathrm {hc},\infty )^d)^{\mathbb {Z}}\). This measure is exactly the distribution of the two-sided stationary Markov chain \((X_j)_{j\in {\mathbb {Z}}}\) with state space \({\mathbb {R}}^d\), transition kernel \(P_\beta \), and initial law \({\mathcal {L}}(X_0) = \rho _\beta (x)\mathrm {d}x\).
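To make the remark concrete for \(d=1\), where the block reversal \(\sigma \) is the identity and hence \(\rho _\beta \propto \phi _\beta ^2\), here is a discretized sketch with an illustrative Lennard–Jones-type potential and parameters of our own choosing (not the paper's). We work with the symmetrized kernel \(T_\beta \) instead of \(K_\beta \), which is harmless because the conjugation relating the two in Lemma 6.2(a) cancels in (6.6). The sketch checks that the discretized \(P_\beta \) is stochastic and leaves \(\rho _\beta \) invariant.

```python
import numpy as np

# Illustrative potential and parameters (our assumptions); m = 2, d = 1.
r_hc, p, beta = 0.5, 1.0, 5.0
v = lambda r: r**-12 - 2.0 * r**-6
V = lambda z: v(z) + p * z

x = np.linspace(r_hc + 0.4, 1.6, 200)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")
# Symmetrized kernel; the conjugation in Lemma 6.2(a) cancels in (6.6).
T = np.exp(-beta * (0.5 * V(X) + v(X + Y) + 0.5 * V(Y)))

lam, vec = np.linalg.eigh(T * dx)
Lambda0 = lam[-1]
phi = vec[:, -1] * np.sign(vec[:, -1].sum())

# Markov kernel (6.6) and its invariant density; for d = 1, sigma = id.
P = (T * dx) * phi[None, :] / (Lambda0 * phi[:, None])
rho = phi**2 / (phi**2).sum()
row_sums = P.sum(axis=1)
invariance_err = np.abs(rho @ P - rho).max()
```

By symmetry of the kernel, invariance of \(\rho _\beta \) holds exactly in this discretization, up to floating-point error.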

6.3 Gaussian Transfer Operator

Here we introduce the Gaussian counterpart to the transfer operator \(K_\beta \) and study its spectral properties. We start from the quadratic approximation to the bulk energy \({\mathcal {E}}_\mathrm {bulk}\). The differentiability of \({\mathcal {E}}_\mathrm {bulk}\) in a neighborhood of the constant sequence \(z_j\equiv a\) is checked in Lemma 6.11 below; for the definition of the Gaussian transfer operator we only need the infinite matrix of partial derivatives at \((\ldots , a,a,\ldots )\).

In the following we block variables as \(x_j = (z_{dj},\ldots , z_{dj + d-1})\) for \(z = (z_j)_{j \in {\mathbb {Z}}}\) and \(\xi _j = (\zeta _{dj},\ldots , \zeta _{dj + d-1})\) for \(\zeta = (\zeta _j)_{j \in {\mathbb {Z}}}\). Remember the decomposition (6.1). Set \(\varvec{a}=(a,\ldots ,a)\in {\mathbb {R}}^d\) and define the \(d\times d\) matrices

$$\begin{aligned} A := W_{yy}(\varvec{a},\varvec{a}) + V_{xx}(\varvec{a}) + W_{xx}(\varvec{a},\varvec{a}),\quad B:= - W_{xy}(\varvec{a},\varvec{a}). \end{aligned}$$
(6.7)

We note the following relations:

$$\begin{aligned} W_{yy}(\varvec{a},\varvec{a}) =\sigma W_{xx}(\varvec{a},\varvec{a})\sigma ,\quad B^T= \sigma B \sigma , \quad \sigma A \sigma = A. \end{aligned}$$
(6.8)

The Hessian \(\mathrm {D}^2 {\mathcal {E}}_{\mathrm {bulk}}\) at \((\ldots ,a,a,\ldots )\) is a doubly infinite, band-diagonal matrix with block form

$$\begin{aligned} \begin{pmatrix} \ddots &{}\ddots &{}\ddots &{}&{}&{} \\ &{} - B^T&{} A &{} - B&{} &{}\\ &{} &{} - B^T &{} A &{} - B &{} \\ &{}&{}&{} \ddots &{}\ddots &{} \ddots \end{pmatrix}. \end{aligned}$$
(6.9)

Note that Lemma 3.3 implies that \(\mathrm {D}^2 {\mathcal {E}}_{\mathrm {bulk}}(\ldots ,a,a,\ldots )\) is positive definite. We look for a quadratic form \({\mathcal {Q}}(x,y)\) on \({\mathbb {R}}^{2d}\) that is positive-definite and satisfies

$$\begin{aligned} {\mathcal {E}}_\mathrm {bulk}\bigl ( (z_j)_{j\in {\mathbb {Z}}}\bigr ) = \tfrac{1}{2} \sum _{j\in {\mathbb {Z}}} {\mathcal {Q}}(x_j-\varvec{a}, x_{j+1}-\varvec{a}) + o\Bigl ( \sum _{j \in {\mathbb {Z}}} |x_j-\varvec{a}|^2\Bigr ). \end{aligned}$$

One candidate choice could be

$$\begin{aligned} {\mathcal {Q}}(x,y) := \tfrac{1}{2} \langle x, Ax\rangle - 2 \langle x, By\rangle + \tfrac{1}{2} \langle y, A y\rangle \quad (x,y\in {\mathbb {R}}^d), \end{aligned}$$

but it is not easily related to \({\widehat{H}}(x,y)\). We make a different choice which mimics the definition of \({\widehat{H}}(x,y)\) and show later that this amounts to picking the Hessian of \({\widehat{H}}(x,y)\) (see Lemma 6.12 below).

We introduce the quadratic counterparts to the functions H(xy), w(x), and \({\widehat{H}}(x,y)\) from section 6.2. Remember the bulk Hessian from (6.9). Since it is positive-definite, there exist uniquely defined positive-definite matrices \(M\in {\mathbb {R}}^{2d\times 2d}\) and \(N\in {\mathbb {R}}^{d\times d}\) such that

$$\begin{aligned} \langle \begin{pmatrix} x\\ y\end{pmatrix}, M \begin{pmatrix} x\\ y\end{pmatrix} \rangle&= \inf \{ \langle z, \mathrm {D}^2{\mathcal {E}}_\mathrm {bulk}(a,a,\ldots ) z\rangle \mid z\in \ell ^2({\mathbb {Z}}),\, (z_1,\ldots , z_{2d}) =( x,y) \} \end{aligned}$$
(6.10)
$$\begin{aligned} \langle x, N x\rangle&= \inf \{ \langle z, \mathrm {D}^2{\mathcal {E}}_\mathrm {bulk}(a,a,\ldots ) z\rangle \mid z\in \ell ^2({\mathbb {Z}}),\, (z_1,\ldots , z_d) = x\} \end{aligned}$$
(6.11)

for all \(x,y\in {\mathbb {R}}^d\). The quadratic forms associated with M and N are the Gaussian counterparts to the functions H(xy) and w(x), respectively. Finally set

$$\begin{aligned} {\widehat{M}}:= M- \begin{pmatrix} \frac{1}{2} N&{} 0 \\ 0 &{} \frac{1}{2} N \end{pmatrix} \end{aligned}$$
(6.12)

and

$$\begin{aligned} \widehat{ {\mathcal {Q}}}(x,y):=\bigl \langle \begin{pmatrix} x\\ y \end{pmatrix}, {\widehat{M}} \begin{pmatrix} x\\ y \end{pmatrix}\bigr \rangle . \end{aligned}$$

We will see in the proof of Lemma 6.12 that M, N and \(\widehat{M}\) are the Hessians of H at \((\varvec{a}, \varvec{a})\), w at \(\varvec{a}\) and \(\widehat{H}\) at \((\varvec{a}, \varvec{a})\), respectively. The relation between \({\mathcal {Q}}\) and \(\widehat{{\mathcal {Q}}}(x,y)\) is clarified in Lemma 6.7 below. We are going to work with the kernel

$$\begin{aligned} G_\beta (x,y) := \exp \Bigl ( - \tfrac{1}{2} \beta \widehat{{\mathcal {Q}}}(x- \varvec{a},y- \varvec{a})\Bigr )\qquad (x,y\in {\mathbb {R}}^d) \end{aligned}$$

and the associated integral operator \((G_\beta f)(x)=\int _{{\mathbb {R}}^d} G_\beta (x,y) f(y)\mathrm {d}y\). In section 6.4 we show that \(G_\beta \) is a good approximation for \(K_\beta \); here we study the operator \(G_\beta \) on its own. Clearly it is enough to understand the integral operator G with kernel

$$\begin{aligned} G(x,y):= \exp (- \tfrac{1}{2} \widehat{{\mathcal {Q}}}(x,y)), \end{aligned}$$

since G and \(G_\beta \) are related by the change of variables \(x\mapsto \sqrt{\beta }(x- \varvec{a})\), see equation (6.21) below.

Lemma 6.4

Assume \(2 \le m< \infty \), \(p\in [0, p^*)\). Then the quadratic form \(\widehat{{\mathcal {Q}}}\) is positive-definite: \(\widehat{{\mathcal {Q}}}(x,y)\ge \varepsilon (|x|^2+ |y|^2)\) for some \(\varepsilon >0\) and all \((x,y)\in {\mathbb {R}}^{2d}\).

Proof

First we show that \({\widehat{M}}\) is positive semi-definite, by an argument similar to Lemma 6.2(a). Define

$$\begin{aligned} F(x,y):= \langle \begin{pmatrix} x\\ y\end{pmatrix}, M \begin{pmatrix} x\\ y\end{pmatrix}\rangle . \end{aligned}$$

Clearly

$$\begin{aligned} \langle x, N x\rangle = \inf _{y\in {\mathbb {R}}^d} F(x,y),\quad \langle y, N y\rangle = \inf _{x\in {\mathbb {R}}^d} F(x,y), \end{aligned}$$

hence

$$\begin{aligned} \langle \begin{pmatrix} x\\ y\end{pmatrix}, {\widehat{M}} \begin{pmatrix} x\\ y\end{pmatrix}\rangle = \frac{1}{2} \Bigl ( F(x,y) - \langle x, N x\rangle \Bigr ) + \frac{1}{2} \Bigl ( F(x,y) - \langle y, N y\rangle \Bigr ) \ge 0\nonumber \\ \end{aligned}$$
(6.13)

for all \((x,y)\in {\mathbb {R}}^d\times {\mathbb {R}}^d\) and \({\widehat{M}}\) is positive semi-definite. Next let \((x_0,y_0)\in {\mathbb {R}}^d\times {\mathbb {R}}^d\) be a zero of the quadratic form associated with \({\widehat{M}}\). Then by (6.13), the function \(y\mapsto F(x_0,y)\) must be minimal at \(y=y_0\), hence \(\nabla _y F(x_0,y_0) =0\). Similarly, the function \(x\mapsto F(x,y_0)\) must be minimal at \(x=x_0\), hence \(\nabla _x F(x_0,y_0) =0\). Thus \((x_0,y_0)\) is a critical point of F. But F is strictly convex because M is positive-definite, therefore the critical point \((x_0,y_0)\) is the global minimizer of F, which yields \((x_0,y_0)=0\). It follows that \({\widehat{M}}\) is positive-definite. \(\square \)

It follows from Lemma 6.4 that \(\int _{{\mathbb {R}}^{2d}} G(x,y)^2 \mathrm {d}x\mathrm {d}y <\infty \), hence G is Hilbert–Schmidt with strictly positive integral kernel and the Krein–Rutman theorem is applicable. So we may ask for its principal eigenvalue and eigenvector and its spectral gap. It is natural to look for a Gaussian eigenfunction.

Lemma 6.5

Let F be a positive-definite, symmetric \(d\times d\) matrix. Then the following two statements are equivalent:

  1. (i)

    \(\phi (x) :=\exp (- \tfrac{1}{2} \langle x, F x\rangle )\) is an eigenfunction of G.

  2. (ii)

    The function \(x\mapsto \langle x, F x\rangle \) satisfies the quadratic Bellman equation

    $$\begin{aligned} \langle x,F x\rangle = \inf _{y\in {\mathbb {R}}^d}\bigl ( \widehat{{\mathcal {Q}}}(x,y) + \langle y , Fy\rangle \bigr ). \end{aligned}$$
    (6.14)

Proof

The proof is by a straightforward completion of squares: write

$$\begin{aligned} {\widehat{M}}= \begin{pmatrix} {\widehat{M}}_1 &{} {\widehat{M}}_2\\ {\widehat{M}}_2^ T &{} {\widehat{M}}_3\end{pmatrix} \end{aligned}$$

with \(d\times d\) matrices \({\widehat{M}}_j\). The diagonal blocks \({\widehat{M}}_1\) and \({\widehat{M}}_3\) are positive-definite because \({\widehat{M}}\) is positive-definite, therefore \({\widehat{M}}_3+ F\) is positive-definite as well. Then

$$\begin{aligned} \widehat{{\mathcal {Q}}}(x,y) + \langle y, Fy\rangle =&\langle x, {\widehat{M}}_1 x\rangle + 2 \langle x, {\widehat{M}}_2 y \rangle +\langle y,( {\widehat{M}}_3+F) y\rangle \\ =&\langle x, {\widehat{M}}_1 x\rangle - \langle x, {\widehat{M}}_2( {\widehat{M}}_3+F)^{-1} {\widehat{M}}_2^T x\rangle \\&+ \langle y + ({\widehat{M}}_3+F)^{-1} {\widehat{M}}_2^T x,( {\widehat{M}}_3+F)(y + ({\widehat{M}}_3+F)^{-1} {\widehat{M}}_2^T x)\rangle . \end{aligned}$$

It follows that

$$\begin{aligned} \inf _{y\in {\mathbb {R}}^d} \bigl ( \widehat{{\mathcal {Q}}}(x,y) + \langle y, Fy\rangle \bigr ) = \langle x, ({\widehat{M}}_1 - {\widehat{M}}_2 ( {\widehat{M}}_3+ F)^{-1} {\widehat{M}}_2^T) x\rangle \end{aligned}$$

and

$$\begin{aligned} (G\phi )(x) = \sqrt{\frac{(2\pi )^d}{\det ({\widehat{M}}_3+F)}}\, \exp \Bigl ( - \frac{1}{2} \langle x, ({\widehat{M}}_1 - {\widehat{M}}_2 ({\widehat{M}}_3+F)^{-1} {\widehat{M}}_2^T) x\rangle \Bigr ).\nonumber \\ \end{aligned}$$
(6.15)

Therefore (i) and (ii) hold true if and only if F solves

$$\begin{aligned} F = {\widehat{M}}_1 - {\widehat{M}}_2 ({\widehat{M}}_3+F)^{-1} {\widehat{M}}_2^T. \end{aligned}$$

In particular, (i) and (ii) are equivalent. \(\square \)
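The last displayed equation of the proof is a Riccati-type fixed-point equation for F, and one way to solve it numerically is simply to iterate the map \(F \mapsto {\widehat{M}}_1 - {\widehat{M}}_2({\widehat{M}}_3+F)^{-1}{\widehat{M}}_2^T\). In the sketch below the blocks are arbitrary matrices of our own choosing (illustrative, \(d=2\), with positive-definite diagonal blocks); it demonstrates only that the iteration settles on a symmetric positive-definite fixed point.

```python
import numpy as np

# Hypothetical blocks of M_hat, chosen for illustration only (d = 2).
M1 = np.array([[2.0, 0.3], [0.3, 2.0]])   # upper-left block, positive-definite
M3 = np.array([[2.5, 0.2], [0.2, 2.5]])   # lower-right block, positive-definite
M2 = np.array([[0.5, 0.1], [0.0, 0.4]])   # off-diagonal block

# Iterate F -> M1 - M2 (M3 + F)^{-1} M2^T, starting from the identity.
F = np.eye(2)
for _ in range(200):
    F = M1 - M2 @ np.linalg.inv(M3 + F) @ M2.T

residual = np.linalg.norm(F - (M1 - M2 @ np.linalg.inv(M3 + F) @ M2.T))
min_eig = np.linalg.eigvalsh(F).min()
```

Since the subtracted term is positive semi-definite and bounded by \({\widehat{M}}_2{\widehat{M}}_3^{-1}{\widehat{M}}_2^T\), the iterates stay in a compact set of positive-definite matrices on which the map contracts, so the iteration converges quickly for this choice of data.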

In Lemma 6.7 below we check that M is of the form

$$\begin{aligned} M = \begin{pmatrix} \sigma C\sigma &{} - B \\ - B^T &{} C \end{pmatrix} \end{aligned}$$
(6.16)

for some positive-definite \(d\times d\) matrix C.

Lemma 6.6

The principal eigenvalue of G is \(\sqrt{(2\pi )^d /\det C}\) and the principal eigenfunction is \(\exp ( - \tfrac{1}{2} \langle x, \tfrac{1}{2} N x\rangle )\) (up to scalar multiples).

Proof

A close look at our definitions shows that \(F:= \frac{1}{2} N\) solves (6.14) (it is positive-definite because N is). Indeed, by the definitions of \(\widehat{{\mathcal {Q}}}\) and \({\widehat{M}}\), we have

$$\begin{aligned} \inf _{y\in {\mathbb {R}}^d}\bigl (\widehat{{\mathcal {Q}}}(x,y) +\langle y, \tfrac{1}{2}N y\rangle \bigr )&= - \langle x, \tfrac{1}{2} N x\rangle + \inf _{y\in {\mathbb {R}}^d} \langle \begin{pmatrix} x\\ y \end{pmatrix}, M \begin{pmatrix} x\\ y \end{pmatrix}\rangle = \langle x, \tfrac{1}{2} N x\rangle . \end{aligned}$$

Therefore, by Lemma 6.5, the function \(\phi (x) =\exp ( - \frac{1}{4} \langle x, N x\rangle )\) is an eigenfunction of G. The matrix \({\widehat{M}}_3 +F\) in (6.15) is equal to \((C- \tfrac{1}{2} N) + F=C\), and we find that the principal eigenvalue of G is \(\sqrt{(2\pi )^d /\det C}\). \(\square \)

In order to identify the block C in (6.16), we introduce the quadratic analogue to the function u(x). Let A and B be the \(d\times d\) matrices from (6.7) and \(A_1:= V_{xx}(\varvec{a}) + W_{xx}(\varvec{a}, \varvec{a})\). The infinite matrix \((\partial _i \partial _j {\mathcal {E}}_\mathrm {surf}(a,a,\ldots ))_{i,j\in {\mathbb {N}}}\) is block-tridiagonal, with block structure

$$\begin{aligned} \mathrm {D}^2{\mathcal {E}}_\mathrm {surf}(a,a,\ldots ) = \begin{pmatrix} A_1 &{} - B &{} 0 &{} \cdots &{}&{} \\ - B^T &{} A &{} - B &{} 0 &{} \cdots &{} \\ 0 &{} - B^T &{} A &{} - B &{} 0 &{} \\ \vdots &{} \ddots &{} \ddots &{} \ddots &{}\ddots &{} \ddots \end{pmatrix}. \end{aligned}$$

The matrix differs from the bulk Hessian (6.9) by the upper left corner \(A_1\): we have

$$\begin{aligned} A= A_1 + W_{yy}(\varvec{a}, \varvec{a}). \end{aligned}$$
(6.17)

By reasoning similar to that in Lemma 3.3, the Hessian of \({\mathcal {E}}_\mathrm {surf}\) is positive-definite. Therefore there is a unique positive-definite \(d\times d\)-matrix D such that

$$\begin{aligned} \langle x, D x\rangle = \inf \{ \langle z, \mathrm {D}^2{\mathcal {E}}_\mathrm {surf}(a,a,\ldots ) z\rangle \mid z\in \ell ^2({\mathbb {N}}),\, (z_1,\ldots , z_d) = x\} \end{aligned}$$

for all \(x\in {\mathbb {R}}^d\). (Arguments analogous to those in the proof of Lemma 6.12 show that D is the Hessian of u at \(\varvec{a}\).) Set

$$\begin{aligned} C:=D+ W_{yy}(\varvec{a}, \varvec{a}) \end{aligned}$$
(6.18)

and

$$\begin{aligned} J:= D+ W_{yy}(\varvec{a}, \varvec{a}) - \sigma D \sigma - W_{xx}(\varvec{a}, \varvec{a}) = C - \sigma C \sigma \end{aligned}$$

(remember the symmetries (6.8)).

Lemma 6.7

The matrix C solves

$$\begin{aligned} C = A - B C^{-1} B^T \end{aligned}$$

and equation (6.16) holds true. Moreover

$$\begin{aligned} \widehat{{\mathcal {Q}}}(x,y) = - \langle x, J x\rangle + {{\mathcal {Q}}}(x,y) + \langle y, J y\rangle . \end{aligned}$$

Proof

Clearly

$$\begin{aligned} \langle x, D x\rangle = \inf _{y\in {\mathbb {R}}^d} \bigl ( \langle x, A_1 x\rangle - \langle x, B y\rangle - \langle B^T x, y\rangle + \langle y, (W_{yy}(\varvec{a}, \varvec{a}) + D) y\rangle \bigr ) \end{aligned}$$

hence

$$\begin{aligned} D = A_1 - B (W_{yy}(\varvec{a}, \varvec{a})+ D)^{-1} B^T \end{aligned}$$
(6.19)

by a completion of squares similar to the proof of Lemma 6.5. We add \(W_{yy}(\varvec{a}, \varvec{a})\) to both sides, remember (6.17), and obtain the equation for C. It is easy to see that

$$\begin{aligned} M = \begin{pmatrix} \sigma D\sigma + W_{xx}(\varvec{a}, \varvec{a}) &{} - B \\ - B^T&{} W_{yy}(\varvec{a}, \varvec{a}) + D \end{pmatrix} = \begin{pmatrix} \sigma C\sigma &{} - B \\ - B^T &{} C \end{pmatrix} \end{aligned}$$

which proves (6.16). Furthermore,

$$\begin{aligned} \langle x, N x\rangle = \inf _{y\in {\mathbb {R}}^d} \langle \begin{pmatrix} x\\ y\end{pmatrix}, M \begin{pmatrix} x\\ y\end{pmatrix}\rangle , \quad \langle y, N y\rangle = \inf _{x\in {\mathbb {R}}^d} \langle \begin{pmatrix} x\\ y\end{pmatrix}, M \begin{pmatrix} x\\ y\end{pmatrix}\rangle , \end{aligned}$$

hence,

$$\begin{aligned} N =\sigma C\sigma - B C ^{-1} B^T, \quad N = C - B^T (\sigma C \sigma ) ^{-1} B. \end{aligned}$$

Let us check that the two expressions for N are indeed identical, and that \(\sigma N\sigma = N\). Combining with (6.17) and (6.19), the two expressions for N become

$$\begin{aligned} N= & {} \sigma D\sigma + W_{xx}(\varvec{a}, \varvec{a}) - \bigl (A - W_{yy}(\varvec{a}, \varvec{a}) - D\bigr ) = D + \sigma D \sigma + W_{xx}(\varvec{a}, \varvec{a}) \\&+ W_{yy}(\varvec{a}, \varvec{a}) - A \end{aligned}$$

and

$$\begin{aligned} N= & {} D + W_{yy}(\varvec{a}, \varvec{a}) - \sigma \bigl ( A - W_{yy}(\varvec{a}, \varvec{a}) - D\bigr ) \sigma = D + \sigma D \sigma + W_{xx}(\varvec{a}, \varvec{a}) \\&+ W_{yy}(\varvec{a}, \varvec{a}) - A. \end{aligned}$$

The two expressions are indeed equal, and from the final formula and (6.8) we read off that \(\sigma N \sigma = N\). In fact,

$$\begin{aligned} N = D + \sigma D \sigma - V_{xx}(\varvec{a}), \end{aligned}$$

which is the analogue of \(w(x) = u(x) + u(\sigma x) - V(x)\).

Now we compute \({\widehat{M}}\). The off-diagonal blocks of \({\widehat{M}}\) are the same as those of M. The upper left diagonal block is

$$\begin{aligned} M_1 - \tfrac{1}{2} N&= \sigma D \sigma + W_{xx}(\varvec{a}, \varvec{a}) - \tfrac{1}{2} \bigl ( D + \sigma D \sigma + W_{xx}(\varvec{a}, \varvec{a}) + W_{yy}(\varvec{a}, \varvec{a}) - A\bigr ) \\&= \tfrac{1}{2} A + \tfrac{1}{2} \bigl (\sigma D \sigma + W_{xx}(\varvec{a}, \varvec{a})\bigr ) - \tfrac{1}{2} \bigl ( D + W_{yy}(\varvec{a}, \varvec{a})\bigr ). \end{aligned}$$

A similar computation yields the lower right block. Altogether we find

$$\begin{aligned} {\widehat{M}} = \begin{pmatrix} \frac{1}{2} (A -J) &{} - B \\ - B^T &{} \frac{1}{2} (A+J) \end{pmatrix} \end{aligned}$$

and the lemma follows. \(\square \)
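The equation \(C = A - BC^{-1}B^T\) is an algebraic Riccati-type equation and can be solved numerically by the fixed-point iteration \(C_{k+1} = A - BC_k^{-1}B^T\). The following Python sketch is a numerical aside with arbitrary test matrices, not data from the model; A is chosen positive-definite and dominating B so that the iterates stay in the positive-definite cone.

```python
import numpy as np

# Solve C = A - B C^{-1} B^T by fixed-point iteration, for random test
# matrices: B arbitrary, A symmetric positive-definite with A >> B.
rng = np.random.default_rng(0)
d = 3
B = rng.standard_normal((d, d))
A = 4.0 * np.linalg.norm(B, 2) * np.eye(d) + np.diag(rng.uniform(1, 2, d))

C = A.copy()
for _ in range(500):
    C = A - B @ np.linalg.solve(C, B.T)   # C_{k+1} = A - B C_k^{-1} B^T

residual = np.linalg.norm(C - (A - B @ np.linalg.solve(C, B.T)))
```

Since \(A \succeq (4\Vert B\Vert +1)\,\mathrm {Id}\), each iterate satisfies \(C_k \succeq A - \tfrac{1}{4}\Vert B\Vert \,\mathrm {Id} \succ 0\) and the map is a contraction, so the iteration converges to a symmetric positive-definite solution.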

Finally we come back to the \(\beta \)-dependent operator \(G_\beta \).

Proposition 6.8

Assume \(2\le m <\infty \) and \(p\in [0,p^*)\). The principal eigenvalue of \(G_\beta \) is

$$\begin{aligned} \Lambda _0^\mathrm {Gauss}(\beta ) = \sqrt{\frac{(2\pi )^d}{\beta ^d\, \det C}} \end{aligned}$$

and the normalized, positive principal eigenfunction is

$$\begin{aligned} \phi _\beta ^\mathrm {Gauss}(x) = \Bigl (\frac{\beta ^d \det (\frac{1}{2} N)}{\pi ^d}\Bigr )^{1/4} \exp \Bigl ( - \tfrac{1}{2} \beta \langle x - \varvec{a}, \tfrac{1}{2} N\, (x-\varvec{a})\rangle \Bigr ). \end{aligned}$$

Proof

Let \(U_\beta : L^2({\mathbb {R}}^d)\rightarrow L^2({\mathbb {R}}^d)\) be the unitary operator given by

$$\begin{aligned} (U_\beta f)(x') = \beta ^{-d/4} f(\varvec{a} + \beta ^{-1/2} x'). \end{aligned}$$
(6.20)

We have

$$\begin{aligned} \bigl ( U_\beta G_\beta f\bigr ) (x')&= \beta ^{-d/4} (G_\beta f)(\varvec{a}+ \beta ^{-1/2} x')\\&= \beta ^{-d/4} \int _{{\mathbb {R}}^d} G_\beta (\varvec{a}+ \beta ^{-1/2} x', \varvec{a}+ \beta ^{-1/2} y') f(\varvec{a}+ \beta ^{-1/2} y') \beta ^{-d/2} \mathrm {d}y' \\&= \beta ^{- d/2} \int _{{\mathbb {R}}^d} G(x',y') (U_\beta f)(y') \mathrm {d}y' \end{aligned}$$

hence

$$\begin{aligned} G_\beta = \beta ^{-d/2} U_\beta ^* G U_\beta \end{aligned}$$
(6.21)

and the principal eigenvalue and eigenfunction of \(G_\beta \) are obtained from those of G in Lemma 6.6 by straightforward transformations. \(\square \)

Remark

When \(m=2\), all eigenvalues and eigenfunctions of G (and hence of \(G_\beta \)) can be computed explicitly; the eigenfunctions are expressed in terms of Hermite polynomials. See [24, section 5.2] on the harmonic Kac operator.
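The geometric structure of the spectrum in the harmonic case can be illustrated numerically. The Python sketch below (arbitrary test coefficients a, b with \(a>|b|\), unrelated to the specific model) discretizes a symmetric Gaussian kernel and checks that consecutive eigenvalue ratios are constant and equal to the value \(\rho \) predicted by Mehler's formula, which for this kernel solves \(2\rho /(1+\rho ^2) = |b|/a\).

```python
import numpy as np

# Discretize the symmetric Gaussian kernel
#   G(x, y) = exp(-(a x^2 + 2 b x y + a y^2) / 2),  a > |b|,
# and compute its spectrum by a Nystrom-type approximation.
a, b = 1.0, -0.5
x = np.linspace(-8.0, 8.0, 801)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")
K = np.exp(-0.5 * (a * X**2 + 2 * b * X * Y + a * Y**2))

lam = np.linalg.eigvalsh(h * K)[::-1]   # eigenvalues, descending
ratios = lam[1:4] / lam[:3]             # should all equal rho

# Mehler's formula predicts the ratio rho with 2 rho/(1+rho^2) = |b|/a = 1/2,
# i.e. rho = 2 - sqrt(3).
rho = 2 - np.sqrt(3)
```

With \(b<0\) all eigenvalues are positive; for \(b>0\) they alternate in sign with the same geometric modulus.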

6.4 Perturbation Theory

Remember the unitary operator \(U_\beta \) from (6.20) and the relation \(G_\beta = \beta ^{-d/2} U_\beta ^* G U_\beta \). The main technical result of this section is

Proposition 6.9

Assume \(2\le m <\infty \), \(p\in (0,p^*)\), and \(r_\mathrm {hc}>0\). We have \(||\beta ^{d/2} (K_\beta - G_\beta )|| = ||G - \beta ^{d/2} U_\beta K_\beta U_\beta ^* || \rightarrow 0\) as \(\beta \rightarrow \infty \).

Before we come to the proof of the proposition, we state a corollary on the principal eigenvalue and eigenfunction. Remember the quantities \(\Lambda _0(\beta )\), \(\Lambda _1(\beta )\), \(\phi _\beta \) defined before Lemma 6.3. We choose multiplicative constants so that \(||\phi _\beta || =1\). Let \(\lambda _j^\mathrm {Gauss}\), \(j\in {\mathbb {N}}_0\), be an enumeration of the eigenvalues of G with \(\lambda _0^\mathrm {Gauss} = ||G||\) and

$$\begin{aligned} \gamma ^\mathrm {Gauss} = \max _{j\ne 0}\frac{|\lambda _j^\mathrm {Gauss}|}{\lambda _0^{\mathrm {Gauss} }}. \end{aligned}$$

Corollary 6.10

Under the assumptions of Proposition 6.9: Let \(\Lambda _0^\mathrm {Gauss}(\beta )\) and \(\phi ^\mathrm {Gauss}_\beta (x)\) be as in Proposition 6.8. Then, as \(\beta \rightarrow \infty \),

$$\begin{aligned} \Lambda _0(\beta ) = \bigl (1+ o(1)\bigr )\Lambda _0^\mathrm {Gauss}(\beta ),\qquad \int _{{\mathbb {R}}^d} |\phi _\beta (x) - \phi ^\mathrm {Gauss}_\beta (x)|^2\mathrm {d}x\rightarrow 0, \end{aligned}$$

and

$$\begin{aligned} \lim _{\beta \rightarrow \infty } \frac{\Lambda _1(\beta )}{\Lambda _0(\beta )} = \gamma ^\mathrm {Gauss} < 1. \end{aligned}$$

The corollary follows from Proposition 6.9 and standard perturbation theory for compact operators [42]. The proof of Proposition 6.9 builds on several lemmas. First we show that \({\mathcal {E}}_\mathrm {bulk}\) is \(C^2\) in a neighborhood of its global minimizer.

Lemma 6.11

The mapping \({\mathcal {E}}_\mathrm {bulk}\) is \(C^2\) in some open neighborhood in \({\mathcal {D}}^+\) of the constant sequence \((\ldots ,a,a,\ldots )\).

Proof

Note that

$$\begin{aligned} V(z_1, \ldots , z_d) + W(z_1, \ldots , z_d, z_{d+1}, \ldots , z_{2d})- d e_0 = \sum _{i=1}^{d} h(z_{i}, \ldots , z_{d+i}) \end{aligned}$$

defines a \(C^2\) function in a neighborhood of \((a, \ldots , a) \in {\mathbb {R}}^{d} \times {\mathbb {R}}^d\) which vanishes for \((z_{1}, \ldots , z_{2d}) = (a, \ldots , a)\). Moreover, using that \((\ldots , a, a, \ldots )\) minimizes \({\mathcal {E}}_\mathrm {bulk}\) on \({\mathcal {D}}_0^+\) and so \(\partial _{x_j} {\mathcal {E}}_\mathrm {bulk}(\ldots , a, a, \ldots ) = 0\), we see that also

$$\begin{aligned} V_x(a, \ldots , a) + W_x(a, \ldots , a) + W_y(a, \ldots , a) = 0. \end{aligned}$$

For all \(z \in {\mathcal {D}}_0^+\) the derivative of \({\mathcal {E}}_\mathrm {bulk}\) at z is given by

$$\begin{aligned} \mathrm {D} {\mathcal {E}}_\mathrm {bulk}(z) \zeta = \sum _{j \in {\mathbb {Z}}} \big ( V_x(x_j) + W_x(x_j, x_{j+1}) + W_y(x_{j-1}, x_j) \big ) \zeta _j \end{aligned}$$

for all \(\zeta \in \ell ^2({\mathbb {Z}})\) with \(\zeta _j = 0\) for all but finitely many j. So

$$\begin{aligned} \mathrm {D} {\mathcal {E}}_\mathrm {bulk}(z) = \big ( V_x(x_j) + W_x(x_j, x_{j+1}) + W_y(x_{j-1}, x_j) \big )_{j \in {\mathbb {Z}}}. \end{aligned}$$
(6.22)

Since

$$\begin{aligned}&\sum _{j \in {\mathbb {Z}}} | V_x(x_j) + W_x(x_j, x_{j+1}) + W_y(x_{j-1}, x_j) - V_x(x'_j) \\&\qquad - W_x(x'_j, x'_{j+1}) - W_y(x'_{j-1}, x'_j) |^2 \\&\quad \le C \sum _{j \in {\mathbb {Z}}} | (x_{j-1}, x_j, x_{j+1}) - (x'_{j-1}, x'_j, x'_{j+1}) |^2 \le C || z - z' ||_{\ell ^2}^2 \end{aligned}$$

for \(z, z' \in {\mathcal {D}}^+\) in a neighborhood of \((\ldots , a, a, \ldots )\) with a uniform constant C, the right hand side of (6.22) extends to a uniformly continuous function there. Writing

$$\begin{aligned} {\mathcal {E}}_\mathrm {bulk}(z + \zeta ) = {\mathcal {E}}_\mathrm {bulk}(z) + \int _0^1 \mathrm {D} {\mathcal {E}}_\mathrm {bulk} (z + t \zeta ) \zeta \, \mathrm {d}t \end{aligned}$$

for \(z, z+\zeta \in {\mathcal {D}}_0^+\), a standard approximation argument shows that \({\mathcal {E}}_\mathrm {bulk}\) is indeed \(C^1\) in a neighborhood of \((\ldots , a, a, \ldots )\) also in \({\mathcal {D}}^+\), with \(\mathrm {D} {\mathcal {E}}_\mathrm {bulk}\) given by (6.22). In fact, \({\mathcal {E}}_\mathrm {bulk}\) is even \(C^2\) on a neighborhood of \((\ldots , a, a, \ldots )\) in \({\mathcal {D}}^+\) and

$$\begin{aligned} \begin{aligned} \mathrm {D}^2 {\mathcal {E}}_\mathrm {bulk}(z) \zeta =&\big ( ( V_{xx}(x_j) + W_{xx}(x_j, x_{j+1}) + W_{yy}(x_{j-1}, x_j) ) \zeta _j \\&+ W_{xy}(x_j, x_{j+1}) \zeta _{j+1} + W_{xy}(x_{j-1}, x_j) \zeta _{j-1} \big )_{j \in {\mathbb {Z}}}. \end{aligned} \end{aligned}$$
(6.23)

This follows similarly as above by extending the derivative of \(\mathrm {D} {\mathcal {E}}_\mathrm {bulk}\), where we now use that the mappings \({\mathbb {R}}^d \times {\mathbb {R}}^d \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}\), \((x,x',x'') \mapsto V_{xx}(x') + W_{xx}(x', x'') + W_{yy}(x, x')\) and \({\mathbb {R}}^d \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}\), \((x,x') \mapsto W_{xy}(x, x')\) are uniformly continuous in a neighborhood of \(x = x' = x'' = (a, \ldots , a)\) and so \(\mathrm {D}^2 {\mathcal {E}}_\mathrm {bulk}\) extends to a continuous mapping from a neighborhood of \((\ldots , a, a, \ldots )\) to \(L(\ell ^2({\mathbb {Z}}))\) (the space of bounded linear operators on \(\ell ^2({\mathbb {Z}})\)) given by (6.23). \(\square \)

Next we show that \({\widehat{M}}\) is in fact the Hessian of \({\widehat{H}}\).

Lemma 6.12

Assume \(2\le m <\infty \), \(p\in [0,p^*)\), and \(r_\mathrm {hc}>0\). We have \({\widehat{H}}(x,y) \ge {\widehat{H}}(\varvec{a},\varvec{a}) =0\) for all \(x,y \in {\mathbb {R}}_+^d\), moreover as \(x,y\rightarrow \varvec{a}\),

$$\begin{aligned} {\widehat{H}}(x,y) = \tfrac{1}{2} \widehat{{\mathcal {Q}}}(x-\varvec{a}, y-\varvec{a}) + o(|x-\varvec{a}|^2 + |y- \varvec{a}|^2). \end{aligned}$$

The lemma leaves open whether \((\varvec{a},\varvec{a})\) is the unique global minimizer of \({\widehat{H}}\).

Proof

The first part of the lemma has already been proven in Lemma 6.2(a). With \(M \in {\mathbb {R}}^{2d \times 2d}\), \(N \in {\mathbb {R}}^{d \times d}\) as in (6.10) and (6.11) we define \(\widehat{M}\) as in (6.12). It remains to show that \(\mathrm {D}^2 {\widehat{H}}(\varvec{a}, \varvec{a}) = \widehat{M}\). Since, for a suitable \(\varepsilon > 0\), \({\mathcal {E}}_{\mathrm {bulk}}\) is convex on \({\mathcal {D}}^+ \cap [z_{\min }, z_{\max } + \varepsilon ]^{{\mathbb {Z}}}\), see (the proof of) Proposition 2.3, Lemma 3.12 shows that there is a unique function on a neighborhood of \((\varvec{a}, \varvec{a})\) in \({\mathbb {R}}^d \times {\mathbb {R}}^d\) with values in \({\mathbb {R}}^{-{\mathbb {N}}} \times {\mathbb {R}}^{{\mathbb {N}}}\), \((x,y) \mapsto {\tilde{z}} = (z_-, z_+) = (z_-(x,y), z_+(x,y))\), such that

$$\begin{aligned} H(x,y) = {\mathcal {E}}_{\mathrm {bulk}} (z_-(x,y), x, y, z_+(x,y)). \end{aligned}$$

As \(\mathrm {D}^2 {\mathcal {E}}_{\mathrm {bulk}} (\ldots ,a,a,\ldots )\) is positive definite, the implicit function theorem shows that this mapping is \(C^1\) and satisfies

$$\begin{aligned} \mathrm {D}_{{\tilde{z}}}{\mathcal {E}}_{\mathrm {bulk}} (z_-, \cdot , \cdot , z_+) = 0 \end{aligned}$$

as well as

$$\begin{aligned} \mathrm {D}_{(x,y)} {\tilde{z}} = - \big ( \mathrm {D}_{{\tilde{z}}}^2 {\mathcal {E}}_{\mathrm {bulk}} (z_-, \cdot , \cdot , z_+) \big )^{-1} \mathrm {D}_{(x,y)} \mathrm {D}_{{\tilde{z}}}{\mathcal {E}}_{\mathrm {bulk}} (z_-, \cdot , \cdot , z_+). \end{aligned}$$

By the chain rule and the stationarity condition \(\mathrm {D}_{{\tilde{z}}}{\mathcal {E}}_{\mathrm {bulk}} = 0\),

$$\begin{aligned} \mathrm {D}_{(x,y)} H = \mathrm {D}_{(x,y)} {\mathcal {E}}_{\mathrm {bulk}} (z_-, \cdot , \cdot , z_+), \end{aligned}$$

so that H is indeed \(C^2\) near \((\varvec{a}, \varvec{a})\) and

$$\begin{aligned} \mathrm {D}_{(x,y)}^2 H = \big [ \mathrm {D}_{(x,y)}^2 {\mathcal {E}}_{\mathrm {bulk}} - \mathrm {D}_{(x,y){\tilde{z}}} {\mathcal {E}}_{\mathrm {bulk}} \big ( \mathrm {D}_{{\tilde{z}}}^2 {\mathcal {E}}_{\mathrm {bulk}} \big )^{-1} \mathrm {D}_{(x,y){\tilde{z}}}{\mathcal {E}}_{\mathrm {bulk}} \big ] (z_-, \cdot , \cdot , z_+). \end{aligned}$$

In particular, since \({\tilde{z}}(\varvec{a}, \varvec{a}) = (\ldots , a, a, \ldots )\),

$$\begin{aligned} \mathrm {D}^2 H(\varvec{a}, \varvec{a}) = \big [ \mathrm {D}_{(x,y)}^2 {\mathcal {E}}_{\mathrm {bulk}} - \mathrm {D}_{(x,y){\tilde{z}}} {\mathcal {E}}_{\mathrm {bulk}} \big ( \mathrm {D}_{{\tilde{z}}}^2 {\mathcal {E}}_{\mathrm {bulk}} \big )^{-1} \mathrm {D}_{(x,y){\tilde{z}}}{\mathcal {E}}_{\mathrm {bulk}} \big ] (\ldots , a, a, \ldots ). \end{aligned}$$

The same analysis applied to the quadratic approximation \(\ell ^2({\mathbb {Z}}) \rightarrow {\mathbb {R}}\), \(z \mapsto \frac{1}{2} \langle z, \mathrm {D}^2 {\mathcal {E}}_{\mathrm {bulk}}(\ldots ,a,a,\ldots ) z \rangle \) leads to

$$\begin{aligned} M = \big [ \mathrm {D}_{(x,y)}^2 {\mathcal {E}}_{\mathrm {bulk}} - \mathrm {D}_{(x,y){\tilde{z}}} {\mathcal {E}}_{\mathrm {bulk}} \big ( \mathrm {D}_{{\tilde{z}}}^2 {\mathcal {E}}_{\mathrm {bulk}} \big )^{-1} \mathrm {D}_{(x,y){\tilde{z}}}{\mathcal {E}}_{\mathrm {bulk}} \big ] (\ldots , a, a, \ldots ), \end{aligned}$$

too. So we have \(\mathrm {D}^2 H(\varvec{a}, \varvec{a}) = M\). A completely analogous reasoning gives \(\mathrm {D}^2 w(a, \ldots , a) = N\) and it follows that \(\mathrm {D}^2 \widehat{H}(\varvec{a}, \varvec{a}) = \widehat{M}\). \(\square \)

Lemma 6.13

Assume \(2\le m <\infty \). For some \(c_2>0\) and all \((z_1,\ldots ,z_{2d})\in (r_{\mathrm {hc}},\infty )^{2d}\),

$$\begin{aligned} {\widehat{H}}\bigl ((z_1,\ldots ,z_d),(z_{d+1},\ldots ,z_{2d})\bigr ) \ge \tfrac{1}{2} p \sum _{i=1}^{2d} z_i- c_2. \end{aligned}$$

Proof

Since the pair potential v is bounded from below, we have for some constant \(c>0\)

$$\begin{aligned} V(z_1,\ldots , z_d) \ge p \sum _{i=1}^d z_i - c, \quad \inf _{{\mathbb {R}}^{2d}} W(x;y) \ge - c. \end{aligned}$$

In combination with Lemma 6.1 this yields the claim. \(\square \)

In order to estimate \(||K_\beta - G_\beta ||\), we split the configuration space into a neighborhood \({\mathcal {A}}\supset B_\delta (\varvec{a})\) of \(\varvec{a}\) and its complement \({\mathcal {B}}= {\mathbb {R}}^d\setminus {\mathcal {A}}\), and treat the blocks separately. For \(U\subset {\mathbb {R}}^d\), we write \({{\mathbf {1}}}_U\) for the operator of multiplication by the indicator function \(\mathbb {1}_U\).

Lemma 6.14

Suppose that \({\mathcal {A}}\subset {\mathbb {R}}^d\) is compact, contains an open neighborhood of \(\varvec{a}\), and is such that \({\widehat{H}}(x,y)>0\) for all \((x,y) \in {\mathcal {A}}\times {\mathcal {A}}\setminus \{(\varvec{a},\varvec{a})\}\). Then

$$\begin{aligned} \lim _{\beta \rightarrow \infty } || {{\mathbf {1}}}_{{\mathcal {A}}}\, \beta ^{d/2} (K_\beta - G_\beta ) {{\mathbf {1}}}_{{\mathcal {A}}}|| =0. \end{aligned}$$

Proof

By Lemma 6.12, for every \(\varepsilon >0\), there is a \(\delta >0\) such that for all \(s,t\in {\mathbb {R}}^d\) with \(|s|\le \delta \) and \(|t|\le \delta \), we have

$$\begin{aligned} \tfrac{1}{2} (1-\varepsilon ) \widehat{{\mathcal {Q}}}(s,t) \le {\widehat{H}}(\varvec{a}+ s, \varvec{a} + t) \le \tfrac{1}{2} (1+\varepsilon ) \widehat{{\mathcal {Q}}}(s,t). \end{aligned}$$

Choosing \(\delta >0\) small enough we may assume without loss of generality that \(B_\delta (\varvec{a}) \subset {\mathcal {A}}\). We estimate

$$\begin{aligned}&\int _{B_\delta (\varvec{a})^2} \beta ^d |K_\beta (x,y) - G_\beta (x,y)|^2 \mathrm {d}x \mathrm {d}y \\&\quad \le \int _{B_\delta (0)^2} \beta ^d \bigl ({{\text {e}} }^{\beta \varepsilon \widehat{{\mathcal {Q}}}(s,t)} - 1\bigr )^2{{\text {e}} }^{-\beta \widehat{{\mathcal {Q}}}(s,t)} \mathrm {d}s \mathrm {d}t \\&\quad \le \int _{{\mathbb {R}}^d}\beta ^d \bigl ( {{\text {e}} }^{- \beta (1-2\varepsilon ) \widehat{{\mathcal {Q}}}(s,t)} - 2 {{\text {e}} }^{-\beta (1-\varepsilon ) \widehat{{\mathcal {Q}}}(s,t)} + {{\text {e}} }^{-\beta \widehat{{\mathcal {Q}}}(s,t)} \bigr )\mathrm {d}s\mathrm {d}t\\&\quad = \Bigl (\frac{1}{(1-2\varepsilon )^d} - \frac{2}{(1-\varepsilon )^d}+1\Bigr ) \frac{(2\pi )^d}{\sqrt{\det {\widehat{M}}}} \le k \varepsilon \end{aligned}$$

for some \(k>0\). On \({\mathcal {A}}^2\setminus B_\delta (\varvec{a})^2\), the function \({\widehat{H}}\) stays bounded away from 0, therefore

$$\begin{aligned} \int _{{\mathcal {A}}^2\setminus B_\delta (\varvec{a})^2} \beta ^d |K_\beta (x,y)|^2 \mathrm {d}x \mathrm {d}y \le {{\text {e}} }^{- c_\varepsilon \beta }. \end{aligned}$$

A similar estimate clearly holds true for \(G_\beta \) as well. Hence

$$\begin{aligned} \limsup _{\beta \rightarrow \infty } \int _{{\mathcal {A}}^2} \beta ^d |K_\beta (x,y) - G_\beta (x,y)|^2 \mathrm {d}x \mathrm {d}y \le k \varepsilon . \end{aligned}$$

This holds true for every \(\varepsilon >0\), so the left-hand side converges to zero. Since operator norms are bounded by Hilbert–Schmidt norms, the lemma follows. \(\square \)

Lemma 6.15

Assume that \({\mathcal {B}}\subset {\mathbb {R}}^d\) is such that \(\mathrm {dist}(\varvec{a}, {\mathcal {B}})>0\) and \({\mathcal {B}}\) is invariant under reversals, \(\sigma ({\mathcal {B}})={\mathcal {B}}\). Then \(||{{\mathbf {1}}}_{{\mathcal {B}}} K_\beta {{\mathbf {1}}}_{{\mathcal {B}}}|| = O({{\text {e}} }^{-\beta \delta })\rightarrow 0\) for some \(\delta >0\).

Proof

We may view \(K_\beta ^{\mathcal {B}} = {\mathbf {1}}_{{\mathcal {B}}} K_\beta {\mathbf {1}}_{{\mathcal {B}}}\) as an operator in \(L^2({\mathcal {B}},\mathrm {d}x)\). The Krein–Rutman theorem is applicable and shows that \(\lambda = ||K_\beta ^{\mathcal {B}}||\) is a simple eigenvalue and there exists an eigenfunction \(\psi \) that is strictly positive on \({\mathcal {B}}\cap (r_{\mathrm {hc}},\infty )^d\). Because of the symmetry \({\widehat{H}}(\sigma y,\sigma x) = {\widehat{H}}(x,y)\), the function \(\psi \circ \sigma \) is a left eigenfunction. Moreover for all \(f,g\in L^2({\mathcal {B}},\mathrm {d}x)\), we have

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{\lambda ^n} \langle f, (K_\beta ^{\mathcal {B}})^n g\rangle = \langle f, \psi \rangle \langle \psi \circ \sigma , g\rangle \end{aligned}$$

so for all strictly positive functions \(f,g\in L^2({\mathcal {B}},\mathrm {d}x)\),

$$\begin{aligned} \lambda = \lim _{n\rightarrow \infty }\Bigl ( \langle f, (K_\beta ^{{\mathcal {B}}})^n g\rangle \Bigr )^{1/n}. \end{aligned}$$

We choose \(f(y)= \exp (- \beta {\widehat{H}}(\varvec{a}, y))\) and \(g(x) =\exp (- \beta {\widehat{H}}(x,\varvec{a}))\). The scalar product becomes

$$\begin{aligned} \langle f, (K_\beta ^{\mathcal {B}})^n g\rangle = \int _{{{\mathcal {B}}} ^n} {{\text {e}} }^{ - \beta \sum _{i=0}^{n+1} {\widehat{H}}(x_i,x_{i+1}) } \mathrm {d}x_1\cdots \mathrm {d}x_{n+1} \end{aligned}$$

with \(x_0 =x_{n+2} = \varvec{a}\). By Lemma 6.1(b), remembering \(u(\varvec{a}) =0\), we have

$$\begin{aligned} \sum _{i=0}^{n+1} {\widehat{H}}(x_i,x_{i+1}) = - (n+2) d e_0 - V(\varvec{a}) + \sum _{i=0}^{n+1} V(x_i) + \sum _{i=1}^n W(x_i,x_{i+1}). \end{aligned}$$

Define \((z_1,\ldots ,z_{(n+1)d} ) = (x_1,\ldots ,x_{n+1})\) and for \(j\in {\mathbb {Z}}\setminus \{1,\ldots , (n+1)d\}\), \(z_j = a\). Then we recognize

$$\begin{aligned} \sum _{i=0}^{n+1} {\widehat{H}}(x_i,x_{i+1}) = {\mathcal {E}}_{\mathrm {bulk}}\bigl ( (z_j)_{j\in {\mathbb {Z}}}\bigr ) + \mathrm {const} \end{aligned}$$

where the constant depends on \(e_0\), d, and V(a) alone. As \(z_1,\ldots , z_{(n+1)d}\) stay bounded away from a, we obtain

$$\begin{aligned} \sum _{i=0}^{n+1} {\widehat{H}}(x_i,x_{i+1}) \ge \delta (n+1) d - c \end{aligned}$$

for some \(\delta ,c>0\) and all \(n\in {\mathbb {N}}\) and \(x_1,\ldots ,x_{n+1}\in {\mathcal {B}}\). It follows that \( ||K_\beta ^{{\mathcal {B}}}|| = \lambda \le {{\text {e}} }^{- \beta \delta }\). \(\square \)
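The Perron-type formula \(\lambda = \lim _{n\rightarrow \infty }\langle f, (K_\beta ^{{\mathcal {B}}})^n g\rangle ^{1/n}\) used above has an elementary finite-dimensional analogue, which the following Python sketch illustrates with a random entrywise-positive test matrix (illustrative data, unrelated to the model):

```python
import numpy as np

# For an entrywise-positive matrix K and positive vectors f, g,
# <f, K^n g>^{1/n} converges to the spectral radius lambda (Perron root).
rng = np.random.default_rng(2)
K = rng.uniform(0.1, 1.0, size=(5, 5))
f = rng.uniform(0.5, 1.0, size=5)
g = rng.uniform(0.5, 1.0, size=5)

lam = max(abs(np.linalg.eigvals(K)))

# Accumulate log <f, K^n g>, normalizing at each step to avoid overflow.
n = 4000
v = g.copy()
log_val = 0.0
for _ in range(n):
    v = K @ v
    s = np.linalg.norm(v)
    log_val += np.log(s)
    v = v / s
log_val += np.log(f @ v)

# The error in the exponent is O(1/n), so the tolerance below is modest.
estimate = np.exp(log_val / n)
```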

Lemma 6.16

Suppose that \({\mathcal {A}}\subset {\mathbb {R}}^d\) and \({\mathcal {B}} = {\mathbb {R}}^d\setminus {\mathcal {A}}\) are such that

$$\begin{aligned} V(x) + W(x,y) - d e_0 + u(y) \ge u(x)+ \delta \end{aligned}$$
(6.24)

for some \(\delta >0\) and all \(x\in {\mathcal {A}}\), \(y\in {\mathcal {B}}\). Assume also that \({\mathcal {A}}\) is invariant under reversals, \(\sigma ({\mathcal {A}}) = {\mathcal {A}}\). Then

$$\begin{aligned} \lim _{\beta \rightarrow \infty } \beta ^{d/2} \bigl ( ||{\mathbf {1}}_{{\mathcal {A}}} K_\beta {\mathbf {1}}_{{\mathcal {B}}}|| + ||{\mathbf {1}}_{{\mathcal {B}}} K_\beta {\mathbf {1}}_{{\mathcal {A}}}||\bigr ) =0. \end{aligned}$$

Proof

Revisiting the proof of Lemma 6.1, we see that

$$\begin{aligned} H(x,y) - w(x) = V(x) + W(x,y) - d e_0 + u(y) - u(x). \end{aligned}$$
(6.25)

Equations (6.25), (6.3) and (6.24) show that \({\widehat{H}}(x,y) \ge \delta /2\) for all \(x\in {\mathcal {A}}\) and \(y\in {\mathcal {B}}\). This estimate together with the growth estimate from Lemma 6.13 shows

$$\begin{aligned} \limsup _{\beta \rightarrow \infty }\frac{1}{\beta }\log \Bigl ( \int _{{\mathcal {A}}\times {\mathcal {B}}} |K_\beta (x,y)|^2\mathrm {d}x\mathrm {d}y\Bigr ) \le - \tfrac{1}{2} \delta < 0 \end{aligned}$$

hence \(||{\mathbf {1}}_{\mathcal {A}} K_\beta {{\mathbf {1}}}_{\mathcal {B}}||\rightarrow 0\). The estimate on \(||{\mathbf {1}}_\mathcal {B}K_\beta {{\mathbf {1}}}_{\mathcal {A}}||\) follows from the symmetry \(K_\beta (\sigma y,\sigma x) = K_\beta (x,y)\). \(\square \)

Proof of Proposition 6.9

Let \(\varepsilon >0\), \({\mathcal {A}}:= [z_{\min }, z_{\max }+\varepsilon ]^d\), and \({\mathcal {B}}= {\mathbb {R}}^d\setminus {\mathcal {A}}\). The sets \({\mathcal {A}}\) and \({\mathcal {B}}\) are clearly invariant under reversals; moreover \(z_{\min }< a \le z_{\max }\) by Theorem 2.1(b), so \(\varvec{a}\) is in the interior of \({\mathcal {A}}\) and bounded away from \({\mathcal {B}}\). Thus \({\mathcal {A}}\) and \({\mathcal {B}}\) satisfy the assumptions of Lemmas 6.14 and 6.15. By Lemma 3.11, they also satisfy condition (6.24) from Lemma 6.16. By the triangle inequality,

$$\begin{aligned} ||K_\beta - G_\beta ||\le ||{{\mathbf {1}}}_{\mathcal {A}} (K_\beta - G_\beta ){{\mathbf {1}}}_{\mathcal {A}}|| + ||K_\beta - {\mathbf {1}}_{\mathcal {A}} K_\beta {{\mathbf {1}}}_{\mathcal {A}}|| + ||G_\beta - {\mathbf {1}}_{\mathcal {A}} G_\beta {{\mathbf {1}}}_{\mathcal {A}}||. \end{aligned}$$

The first term on the right-hand side, multiplied by \(\beta ^{d/2}\), goes to zero by Lemma 6.14. For the second term, we estimate

$$\begin{aligned} ||K_\beta - {\mathbf {1}}_{\mathcal {A}} K_\beta {{\mathbf {1}}}_{\mathcal {A}}|| \le || {\mathbf {1}}_{\mathcal {B}} K_\beta {\mathbf {1}}_{\mathcal {B}}|| + \bigl (|| {\mathbf {1}}_{\mathcal {A}} K_\beta {\mathbf {1}}_{\mathcal {B}}||+ || {\mathbf {1}}_{\mathcal {B}} K_\beta {\mathbf {1}}_{\mathcal {A}}||\bigr ) \end{aligned}$$

and conclude from Lemmas 6.15 and 6.16 that \(\beta ^{d/2}||K_\beta - {\mathbf {1}}_{\mathcal {A}} K_\beta {{\mathbf {1}}}_{\mathcal {A}}||\rightarrow 0\). Bounding operator norms by Hilbert–Schmidt norms, it is straightforward to check that \(||\beta ^{d/2} (G_\beta - {{\mathbf {1}}}_{\mathcal {A}}G_\beta {\mathbf {1}}_{\mathcal {A}})||\rightarrow 0\) as well, and the proof is complete. \(\square \)

6.5 Proof of Theorems 2.7, 2.8 and 2.11

Proof of Theorem 2.8

Combining Lemma 6.3(a) and Corollary 6.10, we obtain

$$\begin{aligned} g(\beta ,p) = e_0 - \frac{1}{\beta }\log \sqrt{\frac{2\pi }{\beta (\det C)^{1/d}}} + o(\beta ^{-1}). \end{aligned}$$

\(\square \)

Proof of Theorem 2.11

The theorem is an immediate consequence of Lemma 6.3(c) and Corollary 6.10. \(\square \)

For the proof of Theorem 2.7, we first express the marginals of \(\mu ^\mathrm {Gauss}\) in terms of the matrices A and B from equation (6.7) and the matrix C from (6.18). We group variables in blocks \(x_j \in {\mathbb {R}}^d\) as usual and view \(\mu ^\mathrm {Gauss}\) as a measure on \(({\mathbb {R}}^d)^{\mathbb {Z}}\).

Proposition 6.17

Under the assumptions of Theorem 2.7, the distributions of \(x_0 = (z_0,\ldots , z_{d-1})\), \((x_0,x_1)\), and \((x_0,\ldots , x_n)\) (\(n\ge 2\)) under \(\mu ^\mathrm {Gauss}\) have probability density functions proportional to

  1. (a)

    \(\exp ( - \frac{1}{2} \beta \langle x_0, (\sigma C \sigma - BC^{-1} B^T) x_0\rangle )\),

  2. (b)

    \(\exp ( - \tfrac{1}{2} \beta [\langle \sigma x_0, C \sigma x_0\rangle - 2 \langle x_0, B x_1\rangle + \langle x_1, C x_1\rangle ])\),

  3. (c)

    \(\exp (- \tfrac{1}{2} \beta ( \langle \sigma x_0, (C - \tfrac{1}{2} A) \sigma x_0\rangle + \sum _{i=0}^{n-1} {\mathcal {Q}}(x_i,x_{i+1}) + \langle x_n, (C - \tfrac{1}{2} A) x_n\rangle ))\)

respectively.

Proof

We recall a standard fact on marginals of multivariate Gaussians and Schur complements. Suppose we are given a positive-definite \((n+k)\times (n+k)\)-matrix in block form

$$\begin{aligned} {\mathcal {H}} = \begin{pmatrix} {\mathcal {H}}_1 &{} {\mathcal {H}}_2 \\ {\mathcal {H}}_2^T &{} {\mathcal {H}}_3 \end{pmatrix} \end{aligned}$$

where \({\mathcal {H}}_1,{\mathcal {H}}_2,{\mathcal {H}}_3\) are \(n\times n\), \(n\times k\) and \(k\times k\) matrices, respectively. Think of \({\mathcal {H}}\) as the Hessian of the energy. Consider the Gaussian measure on \({\mathbb {R}}^{n+k}\) with covariance matrix \({\mathcal {H}}^{-1}\) and probability density function

$$\begin{aligned} \rho (x,y) = \sqrt{ \frac{\det {\mathcal {H}}}{(2\pi )^{(n+k) } }}\, \exp \Bigl ( - \frac{1}{2} \langle \begin{pmatrix} x\\ y\end{pmatrix} , {\mathcal {H}} \begin{pmatrix} x \\ y \end{pmatrix}\rangle \Bigr ) \qquad (x\in {\mathbb {R}}^{n}, y\in {\mathbb {R}}^k). \end{aligned}$$

Then for all \(x\in {\mathbb {R}}^n\),

$$\begin{aligned} \int _{{\mathbb {R}}^{k}} \rho (x, y) \mathrm {d}y = \sqrt{ \frac{\det {\mathcal {M}}}{(2\pi )^{n } }}\, \exp \Bigl ( - \frac{1}{2} \langle x, {\mathcal {M}} x\rangle \Bigr ) \end{aligned}$$
(6.26)

with \({\mathcal {M}} = {\mathcal {H}}_1 - {\mathcal {H}}_2 {\mathcal {H}}_3^{-1} {\mathcal {H}}_2^T\) the Schur complement of \({\mathcal {H}}_3\) in \({\mathcal {H}}\). The inverse \({\mathcal {M}}^{-1}\) is equal to the upper left block of \({\mathcal {H}}^{-1}\). Another characterization is provided by a completion of squares, similar to the proof of Lemma 6.5: we have

$$\begin{aligned} \langle x, {\mathcal {M}} x\rangle = \inf _{y\in {\mathbb {R}}^k} \langle \begin{pmatrix} x \\ y\end{pmatrix} , {\mathcal {H}} \begin{pmatrix} x \\ y\end{pmatrix}\rangle . \end{aligned}$$
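The Schur-complement identity for Gaussian marginals recalled here is easy to confirm numerically. The following Python sketch (a numerical aside with a random positive-definite test matrix) checks that the marginal precision equals both the Schur complement and the inverse of the corresponding block of \({\mathcal {H}}^{-1}\):

```python
import numpy as np

# Marginalizing a Gaussian with precision matrix H over the last k coordinates
# yields a Gaussian whose precision is the Schur complement H1 - H2 H3^{-1} H2^T,
# i.e. the inverse of the upper left n x n block of H^{-1}.
rng = np.random.default_rng(1)
n, k = 3, 4
S = rng.standard_normal((n + k, n + k))
H = S @ S.T + (n + k) * np.eye(n + k)   # positive-definite test matrix
H1, H2, H3 = H[:n, :n], H[:n, n:], H[n:, n:]

schur = H1 - H2 @ np.linalg.solve(H3, H2.T)
marginal_precision = np.linalg.inv(np.linalg.inv(H)[:n, :n])
```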

Now let \({\mathcal {H}} = ({\mathcal {H}}_{ij})_{i,j\in {\mathbb {Z}}}\) be the Hessian of \({\mathcal {E}}_\mathrm {bulk}\) at \((\ldots , a,a,\ldots )\). By definition of \(\mu ^\mathrm {Gauss}\), the distribution of \((z_0,\ldots ,z_{n-1})\) is Gaussian with mean zero and covariance matrix \(({\mathcal {H}}^{-1})_{0\le i,j\le n-1}\). Let \({\mathcal {M}}= ({\mathcal {M}}_{ij})_{0\le i,j\le n-1}\) be the \(n\times n\)-matrix defined by \({\mathcal {M}} ^{-1}= ({\mathcal {H}}^{-1})_{0\le i,j\le n-1}\). It is not difficult to check that the considerations above generalize to the infinite matrices at hand, hence for all \(z_0,\ldots , z_{n-1} \in {\mathbb {R}}\),

$$\begin{aligned} \sum _{i,j=0}^{n-1} {\mathcal {M}}_{ij} z_i z_j = \inf \Bigl \{ \sum _{i,j\in {\mathbb {Z}}} {\mathcal {H}}_{ij} z'_i z'_j \, \Big |\, (z'_j)_{j\in {\mathbb {Z}}}\in \ell ^2({\mathbb {Z}}):\ z'_0 = z_0,\ldots , z'_{n-1} = z_{n-1} \Bigr \}. \end{aligned}$$
(6.27)

Equation (6.27) provides a variational description of the covariance matrix \({\mathcal {M}}^{-1}\) of the n-dimensional marginal of \(\mu ^\mathrm {Gauss}\). For \(n= 2d = 2(m-1)\), with \(x_0=(z_0,\ldots ,z_{d-1})\) and \(x_1 = (z_d,\ldots , z_{2d-1})\), equation (6.27) shows \({\mathcal {M}} = M\), by the definition (6.10) of M. Combining with (6.16) we get

$$\begin{aligned} {\mathcal {M}} = \begin{pmatrix} \sigma C \sigma &{} - B \\ - B^T &{} C \end{pmatrix} = M. \end{aligned}$$

This proves part (b) of the proposition. The proof of (c) is similar. Part (a) follows from (b) and a relation similar to (6.26). \(\square \)

Proof of Theorem 2.7

It is enough to treat the nd-dimensional marginals with \(n\ge 2\). Let \(\phi _\beta \) be the principal eigenfunction of \(K_\beta \), with multiplicative constant chosen so that \(\langle \phi _\beta \circ \sigma , \phi _\beta \rangle =1\). Set \({\tilde{\phi }}_\beta (x):= (U_\beta \phi _\beta )(x) =\beta ^{-d/4} \phi _\beta (\varvec{a}+\beta ^{-1/2}x)\) and

$$\begin{aligned} {{\tilde{K}}}_\beta (x,y):= \frac{1}{\Lambda _0(\beta )} \bigl (U_\beta K_\beta U_\beta ^*\bigr )(x,y) = \frac{\beta ^{-d/2}}{\Lambda _0(\beta )} K_\beta (\varvec{a} +\beta ^{-1/2}x,\varvec{a} +\beta ^{-1/2}y). \end{aligned}$$

By Lemma 6.3, the probability density \(\rho _{nd}^{(\beta )}\) for \((x_1,\ldots ,x_n)\in {\mathbb {R}}^{nd}\) satisfies

$$\begin{aligned} {\tilde{\rho }}_{nd}^{(\beta )} (x_1,\ldots ,x_n)= & {} \beta ^{-nd/2} \rho _{nd}^{(\beta )} (\varvec{a} +\beta ^{-1/2} x_1,\ldots , \varvec{a} +\beta ^{-1/2} x_n) \\= & {} {\tilde{\phi }}_\beta (\sigma x_1) \Biggl ( \prod _{i=1}^{n-1} {{\tilde{K}}}_\beta (x_i,x_{i+1}) \Biggr ){\tilde{\phi }}_\beta (x_n). \end{aligned}$$

By Proposition 6.17, the analogous representation for the Gaussian density \(\rho _{nd}^\mathrm {Gauss}\) is

$$\begin{aligned} \rho _{nd}^\mathrm {Gauss}(x_1,\ldots ,x_n) = \phi ^\mathrm {Gauss}(\sigma x_1) \Biggl (\prod _{i=1}^{n-1} {{\tilde{G}}} (x_i,x_{i+1}) \Biggr )\phi ^\mathrm {Gauss}(x_n) \end{aligned}$$

with \({{\tilde{G}}}(x,y) = (\lambda _0^\mathrm {Gauss})^{-1} G(x,y)\) and \(\phi ^\mathrm {Gauss}(x) \propto \exp (-\frac{1}{2} \langle x, \frac{1}{2} N x \rangle )\) the principal eigenfunction of G, normalized so that \(\langle \phi ^\mathrm {Gauss}\circ \sigma , \phi ^\mathrm {Gauss}\rangle =1\). It follows that

$$\begin{aligned}&\int _{{\mathbb {R}}^{nd}}\bigl |{\tilde{\rho }}_{nd}^{(\beta )} (x_1,\ldots ,x_n) -\rho _{nd}^\mathrm {Gauss}(x_1,\ldots ,x_n) \bigr |\mathrm {d}x_1\ldots \mathrm {d}x_n \\&\quad \le \bigl | \langle {\tilde{\phi }}_\beta \circ \sigma - \phi ^\mathrm {Gauss}\circ \sigma , {{\tilde{K}}}_\beta ^{n-1} {\tilde{\phi }}_\beta \rangle \bigr | + \sum _{i=0}^{n-2} \bigl | \langle \phi ^\mathrm {Gauss}\circ \sigma , {\tilde{G}}^i ({{\tilde{K}}}_\beta - {{\tilde{G}}}) {{{\tilde{K}}}_\beta }^{n-2-i} {\tilde{\phi }}_\beta \rangle \bigr | \\&\qquad + \bigl |\langle \phi ^\mathrm {Gauss}\circ \sigma , {{\tilde{G}}}^{n-1} ({\tilde{\phi }}_\beta - \phi ^\mathrm {Gauss})\rangle \bigr |. \end{aligned}$$
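The inequality above rests on the standard telescoping identity for powers of two operators, which in schematic form reads

$$\begin{aligned} K^{n-1} - G^{n-1} = \sum _{i=0}^{n-2} G^{i}\,(K-G)\,K^{n-2-i}, \end{aligned}$$

applied with \(K = {{\tilde{K}}}_\beta \) and \(G = {{\tilde{G}}}\), combined with the triangle inequality; the identity is checked by noting that consecutive summands produce terms \(G^{i}K^{n-1-i}\) that cancel in pairs.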

Using \({{\tilde{K}}}_\beta {\tilde{\phi }}_\beta = {\tilde{\phi }}_\beta \) and \({{\tilde{G}}}^* (\phi ^\mathrm {Gauss}\circ \sigma ) = \phi ^\mathrm {Gauss}\circ \sigma \), we get

$$\begin{aligned} ||{\tilde{\rho }}_{nd}^{(\beta )}- \rho _{nd}^\mathrm {Gauss}||_{L^1} \le \bigl ( ||{\tilde{\phi }}_\beta ||_{L^2} + ||\phi ^\mathrm {Gauss}||_{L^2}\bigr ) ||{\tilde{\phi }}_\beta - \phi ^\mathrm {Gauss}||_{L^2} + ||{{\tilde{K}}}_\beta - {{\tilde{G}}}|| \end{aligned}$$

which goes to zero by Proposition 6.9 (see also Corollary 6.10). \(\square \)

7 A Brascamp–Lieb Type Covariance Estimate for \(m=\infty \)

Here we prove Proposition 2.10. Key to the proof is a matrix A that bounds the Hessian of \({\mathcal {E}}_N\) from below. For Gaussian measures with probability density proportional to \(\exp (- \frac{\beta }{2} \langle z, A z\rangle )\) and test functions \(f_i = z_i\), \(g_j = z_j\), the quantity to be estimated is precisely the covariance \(C_{ij} = ([\beta A]^{-1})_{ij}\). We follow [34], see also [36].

Proof of Proposition 2.10

Revisiting the proof of Lemma 3.3, we obtain bounds on the matrix elements of the Hessian. Let \(N\in {\mathbb {N}}\) and \(z\in [z_{\mathrm {min}}, z_\mathrm {max}]^{N-1}\). For \(1 \le i< j \le N-1\) we have

$$\begin{aligned} 0 \ge \partial _i \partial _j {\mathcal {E}}_N(z)&= \sum _{L\supset \{i,j\}} v''(\sum _{k\in L} z_k) \ge \sum _{n=j-i+1}^{N-1} v''(n z_\mathrm {min}) \#\{L \mid \#L =n,\, L \supset \{i,j\}\} \\&\ge \sum _{n=j-i+1}^\infty (n-j+i) v''(nz_\mathrm {min}) =: - \kappa _{j-i} \end{aligned}$$

with

$$\begin{aligned} 0 \le \kappa _{j-i} \le \sum _{n=j-i+1}^\infty \frac{\alpha _2 n}{ (nz_\mathrm {min})^{s+2}} \le \frac{\alpha _2}{s z_\mathrm {min}^{s+2} (j-i)^{s}}. \end{aligned}$$
(7.1)
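The last step in (7.1) is the integral-comparison bound \(\sum _{n=m+1}^\infty n^{-(s+1)} \le \frac{1}{s\, m^s}\). A quick numeric sanity check (pure Python, truncating the tail; the truncated partial sum can only undershoot the full tail, so the assertions are conservative):

```python
# Check sum_{n=m+1}^infty n^{-(s+1)} <= 1/(s * m^s), which follows by
# comparing the sum with the integral of x^{-(s+1)} over [m, infinity).
def tail_sum(s, m, terms=200000):
    # Truncated tail sum; a lower bound for the infinite tail.
    return sum(n ** -(s + 1) for n in range(m + 1, m + 1 + terms))

for s in (1, 2, 3):
    for m in (1, 2, 5, 10):
        assert tail_sum(s, m) <= 1.0 / (s * m ** s)
```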

For \(1\le i \le N-1\) we also have

$$\begin{aligned} \partial _i^2 {\mathcal {E}}_N(z) = \sum _{L \ni i} v''(\sum _{k\in L} z_k) \ge v''(z_\mathrm {max}) - \sum _{n=2}^{\infty } n \bigl | v''(n z_\mathrm {min}) \bigr | =:\rho >0 \end{aligned}$$

by Assumption 1(iv). Moreover

$$\begin{aligned} \eta := \rho - 2 \sum _{\ell =1}^\infty \kappa _\ell = v''(z_{\max }) - \sum _{n=2}^\infty n^2 |v''(n z_{\min })|>0 \end{aligned}$$

again by Assumption 1(iv). Let \(A_N\) be the \((N-1)\times (N-1)\)-matrix with diagonal entries \(\rho \) and off-diagonal entries \(-\kappa _{|j-i|}\); notice that \(\eta ,\kappa _{j-i}, \rho \) do not depend on N. Since \(\eta >0\), the matrix \(A_N\) is symmetric and positive-definite by strict diagonal dominance.

The previous estimates together with [34, Remark 2.6] show that the energy \({\mathcal {E}}_N\) satisfies the assumptions of [34, Theorem 2.3 and Proposition 3.5]. It follows that for all smooth \(f,g:{\mathbb {R}}_+ \rightarrow {\mathbb {R}}\),

$$\begin{aligned} \Bigl |{\tilde{\mu }}_\beta ^{\scriptscriptstyle {({N}})} (f_i g_j ) - {\tilde{\mu }}_\beta ^{\scriptscriptstyle {({N}})} (f_i) {\tilde{\mu }}_\beta ^{\scriptscriptstyle {({N}})} (g_j) \Bigr | \le \frac{1}{\beta } (A_N^{-1})_{ij} \Bigl ( {\tilde{\mu }}_\beta ^{\scriptscriptstyle {({N}})} \bigl ({f'_i}^2 \bigr ) {\tilde{\mu }}_\beta ^{\scriptscriptstyle {({N}})} \bigl ({g'_j}^2 \bigr ) \Bigr )^{1/2}. \end{aligned}$$

Let \(X_1,X_2,\ldots \) be i.i.d. random variables with law

$$\begin{aligned} {\mathbb {P}}(X_i = \ell ) = \frac{\kappa _{|\ell |}}{\rho - \eta },\quad \ell \in {\mathbb {Z}}\setminus \{0\}, \quad {\mathbb {P}}(X_i = 0) =0, \end{aligned}$$

and \(S_n = X_1 + \cdots + X_n\). We may decompose \(A_N\) as \(\rho \mathrm {Id}\) plus an off-diagonal matrix, write a Neumann series for the inverse, and find that for \(i<j\)

$$\begin{aligned} (A_N^{-1})_{ij} \le \frac{1}{\rho } \sum _{k=1}^\infty \bigl ( 1- \frac{\eta }{\rho }\bigr )^{k} {\mathbb {P}}(S_k = j-i). \end{aligned}$$
(7.2)
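The decay encoded in (7.2) can be illustrated numerically. The following sketch (the constants \(s, c, \rho \) and the size \(N\) are hypothetical, with \(\kappa _\ell = c/\ell ^s\) mimicking the bound (7.1)) checks that \((A_N^{-1})_{0j}\) decays at least like \(j^{-s}\):

```python
import numpy as np

# Toy Toeplitz matrix A with diagonal rho and off-diagonal entries
# -kappa_{|i-j|}, kappa_l = c / l^s; all constants are hypothetical.
s, c, rho, N = 2.0, 0.05, 1.0, 61

idx = np.arange(N - 1)
dist = np.abs(idx[:, None] - idx[None, :]).astype(float)
off = np.where(dist > 0, c / np.maximum(dist, 1.0) ** s, 0.0)
A = rho * np.eye(N - 1) - off

# eta = rho - 2 sum_l kappa_l > 0, as required for the Neumann-series
# argument (truncated sum; the true eta is slightly smaller but positive).
eta = rho - 2 * c * sum(1.0 / l ** s for l in range(1, 100000))
assert eta > 0

Ainv = np.linalg.inv(A)
# (A^{-1})_{0j} * j^s stays bounded, i.e. polynomial decay of order s.
decay = max(abs(Ainv[0, j]) * j ** s for j in range(1, N - 1))
assert decay < 10.0
```

To leading order in the Neumann series, \((A^{-1})_{0j} \approx \kappa _j / \rho ^2\), which is where the \(j^{-s}\) decay comes from.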

Clearly

$$\begin{aligned} {\mathbb {P}}(S_k = j-i) \le \sum _{r=1}^k {\mathbb {P}}(X_r \ge (j-i)/k,\, S_k = j-i ). \end{aligned}$$
(7.3)

By (7.1), we have \({\mathbb {P}}(X_r = \ell ) \le C/|\ell |^s\) for some constant \(C>0\). Following [34, Proposition 3.5] we may estimate, for each \(m\in {\mathbb {N}}\),

$$\begin{aligned} {\mathbb {P}}(X_2 \ge m,\, S_k = j-i )&\le \sum _{\ell =m}^\infty {\mathbb {P}}(X_2 = \ell ) {\mathbb {P}}(X_1 + X_3+\cdots + X_k = j-i-\ell ) \\&\le \sup _{\ell \ge m} {\mathbb {P}}(X_2 = \ell ) \le \frac{C}{m^s}. \end{aligned}$$

Similar estimates apply to other r. Combining with (7.3), we find that

$$\begin{aligned} {\mathbb {P}}(S_k = j-i) \le \frac{C\, k^{s+1}}{|j-i|^{s}}. \end{aligned}$$

It follows that

$$\begin{aligned} (A_N^{-1})_{ij}&\le \frac{C}{\rho |i-j|^s} \sum _{k=1}^{\infty } k^{s+1} \bigl (1-\frac{\eta }{\rho }\bigr )^{k}. \end{aligned}$$

Notice that the series converges since \(0< 1-\eta /\rho <1\). Plugging this bound into the covariance estimate above, the proposition follows by passing to the limit \(N\rightarrow \infty \). \(\square \)