1 Introduction

The interpolation method is a simple but powerful technique used to prove inequalities for Gaussian random vectors (see for example [20] and [21]). It is of great relevance in mathematical and theoretical physics, since it is an essential ingredient in the study of mean field spin glasses. In the breakthrough paper [19] it was used to prove the existence of the thermodynamic limit for the quenched density of free energy of the Sherrington–Kirkpatrick model. This was a longstanding problem and its solution was the turning point towards the proof of the Parisi formula [29].

Spin glasses are simple mathematical models for disordered systems whose rigorous analysis is a genuine challenge for mathematicians. We refer the mathematically interested reader to [24, 28] and the physically interested one to [23]. Among the many models, one of the most studied is the one introduced by Sherrington and Kirkpatrick in [26] as a solvable elementary model. The structure of the solution turned out to be much richer and more complex than expected and was built up in a series of papers by Parisi (see [23] for a detailed discussion). A rigorous proof of the solution conjectured by Parisi was missing for a long time, and the interpolation method played a key role in its proof. See [18] for a review.

Using the same idea as in [19], the authors of [12] proposed a general setting for the interpolation method in the framework of mean field spin glasses. Furthermore, they successfully applied this technique to prove the existence of the thermodynamic limit for the Generalized Random Energy Model (GREM, a family of models introduced in [15]) with a finite number of levels.

The interpolation method is by now a powerful technique with many applications in different contexts; see for example [1,2,3,4,5,6, 22], a list that is by far not exhaustive.

The “classical” hypothesis under which the interpolation method can be applied to the quenched free energy of mean field spin glasses consists of a collection of equalities and inequalities for the covariance matrix of the underlying multivariate Gaussian process. We show that less restrictive conditions actually suffice. More precisely, we show that the method works under conditions that involve just the \(L^2\) metric structure of the Gaussian random vectors. By the correspondence in [27, 17] this is always a Euclidean metric structure. A condition of this type is very natural, since the quenched free energy depends on the distribution of the Gaussian random vector only through its metric structure. This gives an interesting geometric flavor and interpretation, and at the end of the paper we discuss the models from a purely metric viewpoint. This generalized condition of validity was also obtained, through a tricky computation, in the framework of Sudakov–Fernique inequalities in [11]. Here we deduce the condition from a general argument that could in principle be applied also to comparison inequalities involving expected values of other functions of Gaussian vectors. As an application of the generalized condition, we consider a GREM model with infinitely many levels and deduce the existence of the thermodynamic limit for the quenched density of free energy. Indeed, in this case the usual condition of validity of the interpolation method used in [12, 19] fails. We can therefore deduce the existence of the thermodynamic limit directly, using the simple argument of the interpolation method. We refer to [9, 25] and [10] for the beautiful mathematics involved in the limit of such models.

The structure of the paper is the following.

In Sect. 2 we briefly recall the basics of the interpolation method together with the conditions used in [19] and [12]; we then discuss the Euclidean metric structure associated with any Gaussian random vector, and finally prove the generalized conditions.

In Sect. 3 we discuss two examples. The first one is the Sherrington–Kirkpatrick model. This is done simply to recall the basic mechanism and idea of application. The second example is a GREM model with infinite levels for which it is necessary to use the generalized conditions to prove the existence of the thermodynamic limit. In the final part of this section we discuss the models from a purely metric viewpoint introducing a class of models that have a natural metric structure and for which it is possible to show the existence of the thermodynamic limit.

The Appendix contains an elementary auxiliary lemma.

2 The Interpolation Method

2.1 The Interpolation Method

Let \(X=(X_1,\dots ,X_n)\) be an n-dimensional zero mean Gaussian random vector with covariance matrix C. The \(n\times n\) symmetric matrix C is non-negative definite and its elements are defined by \(C_{i,j}{:=}{\mathbb {E}}\left[ X_iX_j\right] \). When C is positive definite the distribution of X is absolutely continuous with respect to the Lebesgue measure on \({\mathbb {R}}^n\) and the density is

$$\begin{aligned} \phi _{C}\left( x\right) {:=}\frac{1}{\sqrt{\left( 2\pi \right) ^n\text {det}\left( C\right) }}\mathrm {e}^{-\frac{1}{2} (x,C^{-1}x)}, \end{aligned}$$
(2.1)

where \(\left( \,\cdot \, ,\,\cdot \,\right) \) denotes the Euclidean scalar product in \({\mathbb {R}}^n\). We restrict to the case of positive definite matrices since the other cases can be deduced by a limiting procedure. We have the Fourier transform representation

$$\begin{aligned} \phi _C\left( x\right) =\frac{1}{(2\pi )^n}\int _{{\mathbb {R}}^n}\mathrm {d}\lambda \, \mathrm {e}^{-i(\lambda ,x)}\mathrm {e}^{-\frac{1}{2} (\lambda ,C\lambda )}. \end{aligned}$$
(2.2)

We denote by \(\mathrm {Tr}\,(\,\cdot \,)\) the trace of a matrix and by \(\,\overline{\! \mathcal {C}}\) the set of non-negative definite symmetric matrices, endowed with the Hilbert-Schmidt scalar product

$$\begin{aligned} \left( A,B\right) {:=}\text {Tr}\left( AB\right) , \qquad A, B\in \,\overline{\! {\mathcal {C}}}. \end{aligned}$$
(2.3)

We denote by \({\mathcal {C}}\) the open subset of positive definite symmetric matrices.

Let \(\phi : {\mathcal {C}} \times {\mathbb {R}}^n\rightarrow {\mathbb {R}}^+\) be defined by (2.1). By (2.2) and a direct computation we have

$$\begin{aligned} \frac{\partial \phi _C\left( x\right) }{\partial C_{i,j}}=\frac{\partial \phi _C\left( x\right) }{\partial C_{j,i}}=\frac{\partial ^2\phi _C\left( x\right) }{\partial x_i\partial x_j}, \end{aligned}$$
(2.4)

and

$$\begin{aligned} \frac{\partial \phi _C\left( x\right) }{\partial C_{i,i}}=\frac{1}{2}\frac{\partial ^2\phi _C\left( x\right) }{\partial x_i^2}. \end{aligned}$$
(2.5)

Recall that in the above formulas C is a symmetric matrix, so that the variations in the computation of (2.4) are obtained by varying the matrix C symmetrically. More precisely, let \(E^{\{i,j\}}\), with \(i\ne j\), be the symmetric matrix such that \(E^{\{i,j\}}_{i,j}=E^{\{i,j\}}_{j,i}=1\), all the remaining elements being equal to zero. Given \(F:{\mathcal {C}}\rightarrow {\mathbb {R}}\) we define

$$\begin{aligned} \frac{\partial F\left( C\right) }{\partial C_{j,i}}=\frac{\partial F\left( C\right) }{\partial C_{i,j}}{:=}\lim _{\delta \rightarrow 0}\frac{F\left( C+\delta E^{\{i,j\}}\right) -F\left( C\right) }{\delta }. \end{aligned}$$
(2.6)

Consider now a \(C^2\) function \(f:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) with moderate growth at infinity, for example such that \(|f(x)|\le \mathrm {e}^{\lambda |x|}\) for a suitable constant \(\lambda \ge 0\). This technical condition is related to the validity of some integrations by parts below. We call \(\nabla ^2f\left( x\right) \) the Hessian matrix of f at x, that is the symmetric matrix with elements

$$\begin{aligned} \left( \nabla ^2f\right) _{i,j}\left( x\right) {:=}\frac{\partial ^2f\left( x\right) }{\partial x_i \partial x_j}. \end{aligned}$$

The following result is the interpolation method. For the reader's convenience we give the short proof.

Lemma 2.1

(Interpolation method) Consider two zero mean Gaussian random vectors X, Y having covariance matrices \(C^X\) and \(C^Y\) respectively, and a \(C^2\) function f with moderate growth. We have

$$\begin{aligned} {\mathbb {E}}\left[ f\left( Y\right) \right] -{\mathbb {E}} \left[ f\left( X\right) \right] =\frac{1}{2} \int _0^1 \mathrm {d}t\,{\mathbb {E}} \Big [\mathrm {Tr}\Big (\left( C^Y-C^X\right) \nabla ^2f\left( Z\left( t\right) \right) \Big )\Big ], \end{aligned}$$
(2.7)

where

$$\begin{aligned} Z(t)=\sqrt{t}X+\sqrt{(1-t)}Y, \end{aligned}$$
(2.8)

and the X and Y appearing in (2.8) are independent copies of the two random vectors.

Proof

When Z is an n-dimensional centered Gaussian random vector, \(\mathbb {E}\left[ f(Z)\right] \) depends only on the covariance matrix C of the vector Z. Fix a \(C^2\) function f and define the function \(F:\overline{{\mathcal {C}}}\rightarrow {\mathbb {R}}\) as

$$\begin{aligned} F\left( C\right) {:=}{\mathbb {E}}\left[ f\left( Z\right) \right] . \end{aligned}$$
(2.9)

With the help of formulas (2.4), (2.5), when \(C\in {\mathcal {C}}\) we can compute

$$\begin{aligned} \frac{\partial F\left( C\right) }{\partial C_{i,j}}= & {} \int _{{\mathbb {R}}^n}\mathrm {d}x\,\frac{\partial \phi _{C}\left( x\right) }{\partial C_{i,j}}f\left( x\right) =\int _{{\mathbb {R}}^n}\mathrm {d}x\,\frac{\partial ^2 \phi _{C}\left( x\right) }{\partial x_i\partial x_j}f\left( x\right) \end{aligned}$$
(2.10)
$$\begin{aligned}= & {} \int _{{\mathbb {R}}^n}\mathrm {d}x\, \phi _{C}\left( x\right) \frac{\partial ^2f\left( x\right) }{\partial x_i\partial x_j}={\mathbb {E}}\left[ \left( \nabla ^2 f\right) _{i,j}\left( Z\right) \right] , \end{aligned}$$
(2.11)

and

$$\begin{aligned} \frac{\partial F\left( C\right) }{\partial C_{i,i}}= & {} \int _{{\mathbb {R}}^n}\mathrm {d}x\,\frac{\partial \phi _{C}\left( x\right) }{\partial C_{i,i}}f\left( x\right) =\frac{1}{2}\int _{{\mathbb {R}}^n}\mathrm {d}x\,\frac{\partial ^2 \phi _{C}\left( x\right) }{\partial x_i^2}f\left( x\right) \nonumber \\= & {} \frac{1}{2}\int _{{\mathbb {R}}^n}\mathrm {d}x\, \phi _{C}\left( x\right) \frac{\partial ^2f\left( x\right) }{\partial x_i^2}=\frac{1}{2}\, {\mathbb {E}}\left[ \left( \nabla ^2f\right) _{i,i}\left( Z\right) \right] . \end{aligned}$$
(2.12)

Given a \(C^1\) parametric curve \(\left\{ C(t)\right\} _{t\in [0,1]}\) on \({\mathcal {C}}\) such that \(C(0)=C^X\) and \(C(1)=C^Y\), then we have

$$\begin{aligned} {\mathbb {E}}\left[ f\left( Y\right) \right] -{\mathbb {E}}\left[ f\left( X\right) \right] =\frac{1}{2}\int _0^1 \mathrm {d}t\,{\mathbb {E}}\left[ \mathrm {Tr}\left( \frac{\mathrm {d}C\left( t\right) }{\mathrm {d}t}\nabla ^2f\left( Z\left( t\right) \right) \right) \right] , \end{aligned}$$
(2.13)

where \(Z\left( t\right) \) is a centered Gaussian random vector having covariance \(C\left( t\right) \). The special case in which the curve interpolates linearly between \(C^X\) and \(C^Y\) gives (2.7) with Z(t) given by (2.8), after the change of variable \(t\rightarrow 1-t\). If one or both of the matrices \(C^X\) and \(C^Y\) are not strictly positive definite, it is possible to add \(\varepsilon \mathbb I\) to the matrices, perform the same computation as above and finally take the limit \(\varepsilon \rightarrow 0\). \(\square \)

The above formula is the core of the interpolation method. It is very useful for establishing inequalities between the two expected values on the left hand side of (2.7).
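As a quick numerical illustration (ours, not part of the original argument), the following Python sketch estimates both sides of (2.7) by Monte Carlo, for the choice of f in (2.14) below with unit weights; the covariances CX and CY are arbitrary randomly generated non-negative definite matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 3, 200_000

A = rng.standard_normal((n, n)); CX = A @ A.T   # hypothetical covariance C^X
B = rng.standard_normal((n, n)); CY = B @ B.T   # hypothetical covariance C^Y
D = CY - CX

def f(x):
    # the function (2.14) with all weights w_i = 1
    return np.log(np.exp(x).sum(axis=-1))

def trace_term(x):
    # Tr(D * Hessian f(x)), computed via (2.19)-(2.20); it reduces to
    # sum_i D_ii mu_i - sum_ij D_ij mu_i mu_j, cf. (2.39)
    mu = np.exp(x)
    mu = mu / mu.sum(axis=-1, keepdims=True)
    return (np.diag(D) * mu).sum(axis=-1) - np.einsum('...i,ij,...j->...', mu, D, mu)

X = rng.multivariate_normal(np.zeros(n), CX, size=M)
Y = rng.multivariate_normal(np.zeros(n), CY, size=M)
lhs = f(Y).mean() - f(X).mean()

ts = np.linspace(0.0, 1.0, 21)
means = np.array([trace_term(np.sqrt(t) * X + np.sqrt(1 - t) * Y).mean() for t in ts])
rhs = 0.5 * ((means[:-1] + means[1:]) / 2 * np.diff(ts)).sum()  # trapezoidal rule in t

print(lhs, rhs)  # the two estimates agree up to Monte Carlo and quadrature error
```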

The Guerra–Toninelli interpolation method is a simple but powerful technique developed in the study of mean field spin glasses (see [18, 19] and references therein), based on an abstract theorem about Gaussian random variables. It corresponds to the interpolation method of Lemma 2.1 with the special choice of the function

$$\begin{aligned} f(x)=\log \sum _{i=1}^nw_i \mathrm {e}^{x_i}, \end{aligned}$$
(2.14)

where \(w_i\in {\mathbb {R}}^+\) are some fixed positive weights.

In particular, Guerra and Toninelli obtained and used the following result (this is Theorem 2 in [18]) to prove the existence of the thermodynamic limit of the Sherrington–Kirkpatrick model. The same idea and the same theorem (Theorem 2.2 below) were used later in [12] to deduce the existence of the thermodynamic limit for a GREM model [15] with a finite number of levels.

Theorem 2.2

Let X, Y be two centered Gaussian random vectors and let f be given by (2.14). If

$$\begin{aligned}&C^X_{i,i}=C^Y_{i,i}, \qquad \forall \, i, \end{aligned}$$
(2.15)
$$\begin{aligned}&C^X_{i,j}\ge C^Y_{i,j}, \qquad \forall \, i\ne j, \end{aligned}$$
(2.16)

then we have

$$\begin{aligned} {\mathbb {E}}\left[ f\left( Y\right) \right] \ge {\mathbb {E}}\left[ f\left( X\right) \right] . \end{aligned}$$
(2.17)

We give the proof of Theorem 2.2, which is based on the interpolation formula (2.7).

Proof of Theorem 2.2

Define, for any \(i=1,\ldots ,n\),

$$\begin{aligned} \mu _i\left( x\right) {:=}\frac{w_i\mathrm {e}^{x_i}}{\sum _{j=1}^nw_j\mathrm {e}^{x_j}}. \end{aligned}$$
(2.18)

By a direct computation, when f is (2.14), we have

$$\begin{aligned}&\frac{\partial ^2f\left( x\right) }{\partial x_i^2}=\mu _i\left( x\right) -\mu _i^2\left( x\right) , \end{aligned}$$
(2.19)
$$\begin{aligned}&\frac{\partial ^2f\left( x\right) }{\partial x_i\partial x_j}=-\mu _i\left( x\right) \mu _j\left( x\right) . \end{aligned}$$
(2.20)

By the formulas (2.19), (2.20) and conditions (2.15), (2.16), we have that

$$\begin{aligned} (C^Y-C^X)_{i,j}\left( \nabla ^2f\right) _{i,j}\left( x\right) \ge 0, \ \ \ \ \ \forall \, x\in {\mathbb {R}}^n, \;\;\; \forall \,i,j, \end{aligned}$$
(2.21)

and the result follows by (2.7). \(\square \)

2.2 Covariances and Metrics

We start by recalling some simple but useful lemmas.

Lemma 2.3

The \(n\times n\) symmetric matrix C belongs to \(\,\overline{\! {\mathcal {C}}}\) if and only if there exist n vectors \(a^{(i)}\in {\mathbb {R}}^n\) such that

$$\begin{aligned} C_{i,j}=\big (a^{(i)},a^{(j)}\big ). \end{aligned}$$
(2.22)

This is a classic result and the matrix C is called the Gram matrix of the vectors \(\left( a^{(i)}\right) _{i=1,\dots ,n}\), see for example [7].

A finite metric space with n points is called Euclidean if there exists a collection of n points in \({\mathbb {R}}^k\) having the same relative interdistances. Of course we can always take \(k=n\). Not every metric space can be realized in this way. The simplest example is the minimal path metric on the vertices of the graph in Fig. 1, where the edges all have length 1.

Fig. 1
An example of a non-Euclidean metric space. The distance is the minimal path distance on the graph

Given a centered Gaussian random vector X, there is a naturally associated metric \(d_X\), the \(L^2\) distance between the component random variables:

$$\begin{aligned} d_X(i,j){:=}\sqrt{{\mathbb {E}}\left[ \left( X_i-X_j\right) ^2\right] }=\sqrt{C^X_{i,i}+C^X_{j,j}-2C^X_{i,j}}. \end{aligned}$$
(2.23)

We have the following result (see also [17, 27]).

Lemma 2.4

A finite metric space \(\left( \left\{ 1,\dots ,n\right\} , d\right) \) is Euclidean if and only if there exists a zero mean Gaussian random vector \(X=(X_1,\dots ,X_n)\) such that \(d=d_X\).

Proof

Let d be a Euclidean distance and let \(a^{(i)}\), \(i=1,\dots ,n\), be points in \({\mathbb {R}}^n\) that realize it, that is \(d(i,j)=\left| a^{(i)}-a^{(j)}\right| \), where \(|\cdot |\) denotes the Euclidean norm in \({\mathbb {R}}^n\). Such a collection of vectors exists by definition of a Euclidean metric space. Let A be the \(n\times n\) matrix defined by \(A_{i,j}{:=}a^{(i)}_j\). Let \(Z=(Z_1,\dots ,Z_n)\) be a vector of i.i.d. standard Gaussian random variables and consider the Gaussian vector \(X=AZ\), whose covariance \(C^X=AA^T\) coincides with the right hand side of (2.22). Using (2.23) we have

$$\begin{aligned} d_X\left( i,j\right) =\big |a^{(i)}-a^{(j)}\big |=d\left( i,j\right) . \end{aligned}$$
(2.24)

Conversely, let X be a zero mean Gaussian vector with covariance \(C^X\) and let A be an \(n\times n\) matrix such that \(C^X=AA^T\). Define n vectors in \({\mathbb {R}}^n\) by \(a^{(i)}_j{:=}A_{i,j}\); by (2.23) we have that \(d_X\) is given by the first equality in (2.24) and is therefore Euclidean.\(\square \)
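The construction in the proof is easy to implement. The following minimal sketch (our illustration; the points pts and all other names are arbitrary) recovers the Gram matrix (2.22) from a Euclidean distance matrix, factors it, and checks that the corresponding Gaussian vector reproduces the metric through (2.23).

```python
import numpy as np

rng = np.random.default_rng(1)
pts = rng.standard_normal((4, 4))          # 4 arbitrary points in R^4 define d
d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)

# Gram matrix of the vectors a^(i) - a^(1): C_ij = (d_1i^2 + d_1j^2 - d_ij^2)/2
C = (d[0][:, None] ** 2 + d[0][None, :] ** 2 - d ** 2) / 2

w, V = np.linalg.eigh(C)                   # C is non-negative definite
A = V @ np.diag(np.sqrt(np.clip(w, 0, None)))   # C = A A^T, so X = A Z realizes d

dX = np.sqrt(np.maximum(np.add.outer(np.diag(C), np.diag(C)) - 2 * C, 0))  # (2.23)
print(np.allclose(dX, d))                  # True: d_X = d
```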

Other simple but useful lemmas are the following. We give just the statements; the proofs can be found, for example, in [8].

Lemma 2.5

Let \(v^{(1)},\dots , v^{(n)}\) and \(w^{(1)},\dots , w^{(n)}\) be two collections of n vectors in \({\mathbb {R}}^n\). We have that

$$\begin{aligned} \big ( v^{(i)},v^{(j)}\big ) =\big ( w^{(i)},w^{(j)}\big ) \qquad \forall \,i,j\, \end{aligned}$$
(2.25)

if and only if there exists \(O\in O\left( n\right) \) such that \(w^{(i)}=Ov^{(i)}\) for any i.

Lemma 2.6

Let \(v^{(1)},\dots ,v^{(n)}\) and \(w^{(1)},\dots ,w^{(n)}\) be two collections of vectors in \({\mathbb {R}}^n\). We have that

$$\begin{aligned} \big |v^{(i)}-v^{(j)}\big |=\big |w^{(i)}-w^{(j)}\big |, \qquad \forall \,i,j, \end{aligned}$$
(2.26)

if and only if there exists \(O\in O\left( n\right) \) and a vector \(b\in {\mathbb {R}}^n\) such that

$$\begin{aligned} w^{(i)}=Ov^{(i)}+b, \qquad \forall i. \end{aligned}$$
(2.27)

The metric structure \(d_X\) associated with a Gaussian random vector X contains less information than the covariance \(C^X\), and there are random vectors having different covariances but the same metric structure. This type of invariance is best understood in terms of the vectors in \({\mathbb {R}}^n\), using the above lemmas that characterize invariance under rotations and translations. In particular, we can completely characterize the centered Gaussian random vectors that share the same metric structure.

Lemma 2.7

Given two n-dimensional centered Gaussian random vectors X and Y, we have \(d_X=d_Y\) if and only if there exists a centered Gaussian random variable W such that the random vector \(\left( X_i+ W\right) _{i=1,\dots ,n}\) has the same distribution as Y.

Proof

If Y has the same distribution as \(X+W{1}_n\), where \({1}_n=\left( 1,1,\ldots ,1\right) \) is the n-dimensional vector of all ones, then

$$\begin{aligned} d_Y(i,j)=\sqrt{\mathbb {E}\left[ \left( Y_i-Y_j\right) ^2\right] }=\sqrt{\mathbb {E}\left[ \left( X_i+W-X_j-W\right) ^2\right] }=d_X(i,j). \end{aligned}$$

Conversely, suppose that \(d_X=d_Y\). There exist two matrices \(A^X\) and \(A^Y\) such that \(A^XZ\) has the same distribution as X and \(A^YZ\) has the same distribution as Y, where Z is an n-dimensional vector of i.i.d. standard Gaussian random variables. We define two collections \(v^{(i)}, w^{(i)}\), \(i=1,\dots ,n\), of vectors in \({\mathbb {R}}^n\) by \(v^{(i)}_j{:=}A^X_{i,j}\) and \(w^{(i)}_j{:=}A^Y_{i,j}\). Since \(d_X=d_Y\) we have

$$\begin{aligned} \big |v^{(i)}-v^{(j)}\big |=\big |w^{(i)}-w^{(j)}\big |, \qquad \forall \, i,j, \end{aligned}$$
(2.28)

and by Lemma 2.6 there exist \(O\in O(n)\) and a vector \(b\in {\mathbb {R}}^n\) such that \(w^{(i)}=Ov^{(i)}+b\), \(i=1,\dots ,n\). In terms of the corresponding matrices this means that \(A^Y=A^XO^T+B\), where the matrix B is defined as \(B_{i,j}{:=}b_j\). We obtain therefore

$$\begin{aligned} Y=A^XO^TZ+BZ. \end{aligned}$$
(2.29)

The random vector \(A^XO^TZ\) is a centered Gaussian random vector with covariance \(A^XO^TO(A^X)^T=C^X\), so that it has the same law as X. The random vector BZ has all components equal; setting \(W=\sum _{j=1}^nb_jZ_j\) we finish the proof. \(\square \)
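The "if" direction of the lemma can be visualized with a two-line computation. In the sketch below (ours; for simplicity W is taken independent of X, a special case allowed by the lemma), adding a common centered Gaussian W to all components changes the covariance but leaves the metric (2.23) unchanged.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 3
A = rng.standard_normal((n, n))
CX = A @ A.T                               # covariance of X

s2 = 2.0                                   # variance of W, here independent of X
CY = CX + s2 * np.ones((n, n))             # covariance of Y = X + W 1_n

dX = np.sqrt(np.add.outer(np.diag(CX), np.diag(CX)) - 2 * CX)   # metric (2.23)
dY = np.sqrt(np.add.outer(np.diag(CY), np.diag(CY)) - 2 * CY)
print(np.allclose(dX, dY), np.allclose(CX, CY))                 # True False
```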

A direct consequence of the above result is the following. Define the function \(F:\,\overline{\! {\mathcal {C}}}\rightarrow {\mathbb {R}}\) by

$$\begin{aligned} F(C){:=}{\mathbb {E}}\left[ \log \sum _{i=1}^nw_i\mathrm {e}^{X_i}\right] , \end{aligned}$$
(2.30)

where X is a centered Gaussian random vector with covariance C.

Lemma 2.8

If \(C^X, C^Y \in \,\overline{\! {\mathcal {C}}}\) are such that \(d_X=d_Y\), then \(F(C^X)=F(C^Y)\).

Proof

Since \(d_X=d_Y\), by Lemma 2.7 we can write \(Y=X+W{1}_n\) in distribution, and therefore

$$\begin{aligned}&F(C^Y) ={\mathbb {E}}\left[ \log \sum _{i=1}^nw_i\mathrm {e}^{Y_i}\right] ={\mathbb {E}}\left[ \log \sum _{i=1}^nw_i\mathrm {e}^{X_i+W}\right] \\&={\mathbb {E}}\left[ W+\log \sum _{i=1}^nw_i\mathrm {e}^{X_i}\right] =F(C^X), \end{aligned}$$

where the last equality follows by the fact that W is centered. \(\square \)

This lemma simply says that we can regard the right hand side of (2.30) as a function \(\widetilde{F}(d)\), since it depends just on the metric structure of the random variables and not on their correlations.

We expect therefore to have a version of Theorem 2.2 with conditions written just in terms of the metrics. This is done in the next section.

2.3 A Generalized Condition

We show how to generalize Theorem 2.2, proving that (2.17) can be deduced under weaker hypotheses involving just the metric structures. The same inequality was obtained in [11] with a tricky computation. Here we show that it follows from a general argument that may be applied to different functions f.

Theorem 2.9

Let X, Y be two centered Gaussian random vectors and let f be given by (2.14). If

$$\begin{aligned} d_Y\left( i,j\right) \ge d_X\left( i,j\right) \qquad \forall \, i,j, \end{aligned}$$
(2.31)

then

$$\begin{aligned} {\mathbb {E}}\left[ f\left( Y\right) \right] \ge {\mathbb {E}}\left[ f\left( X\right) \right] . \end{aligned}$$
(2.32)

Note that if conditions (2.15) and (2.16) are satisfied then (2.31) holds, but it is easy to construct examples for which (2.31) holds but (2.15), (2.16) are violated.
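A concrete such example is easy to exhibit. In the sketch below (our illustration), the diagonals of CX and CY differ, so (2.15) fails, while (2.31) holds; a Monte Carlo estimate confirms the conclusion (2.32) of Theorem 2.9.

```python
import numpy as np

CX = np.array([[1.0, 0.0], [0.0, 1.0]])
CY = np.array([[2.0, 0.4], [0.4, 2.0]])    # diagonals differ: (2.15) fails

dX = np.sqrt(np.add.outer(np.diag(CX), np.diag(CX)) - 2 * CX)   # metric (2.23)
dY = np.sqrt(np.add.outer(np.diag(CY), np.diag(CY)) - 2 * CY)
assert (dY >= dX - 1e-12).all()            # condition (2.31) holds

rng = np.random.default_rng(2)
M = 400_000
f = lambda x: np.log(np.exp(x).sum(axis=-1))   # (2.14) with w_i = 1
EX = f(rng.multivariate_normal([0.0, 0.0], CX, size=M)).mean()
EY = f(rng.multivariate_normal([0.0, 0.0], CY, size=M)).mean()
print(EY >= EX)                            # True, as (2.32) predicts
```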

Observe that for any x we have that \(\mu \left( x\right) =\left( \mu _1\left( x\right) , \dots ,\mu _n\left( x\right) \right) \in {\mathcal {I}}^n\) (recall definition (2.18)) where

$$\begin{aligned} {\mathcal {I}}^n=\bigg \{\mu =\left( \mu _1,\dots ,\mu _n\right) :\,0\le \mu _i \le 1,\,\, \sum _{i=1}^n\mu _i=1\bigg \}. \end{aligned}$$

Namely, \({\mathcal {I}}^n\subset {\mathbb {R}}^n\) is an \((n-1)\)-dimensional simplex with extremal elements \(\mu ^{(1)},\dots ,\mu ^{(n)}\), where \( \mu ^{(l)}_i=\delta _{li}\).

We start with a preliminary lemma.

Lemma 2.10

Consider a symmetric matrix D and the function \(G:{\mathcal {I}}^n\rightarrow {\mathbb {R}}\) defined as

$$\begin{aligned} G\left( \mu \right) {:=}\sum _{i=1}^n\mu _iD_{ii}-\sum _{i=1}^n\sum _{j=1}^n\mu _i\mu _jD_{ij}. \end{aligned}$$
(2.33)

We have that

$$\begin{aligned} \inf _{\mu \in {\mathcal {I}}^n}G\left( \mu \right) \ge 0 \end{aligned}$$
(2.34)

if and only if

$$\begin{aligned} D_{ii}+D_{jj}-2D_{ij}\ge 0 \qquad \forall \, i,j\in \left\{ 1,\dots ,n\right\} . \end{aligned}$$
(2.35)

Proof

If condition (2.35) holds, then

$$\begin{aligned} G(\mu )\ge \sum _{i=1}^n\mu _iD_{ii}-\frac{1}{2} \sum _{i=1}^n\sum _{j=1}^n\mu _i\mu _j\left( D_{ii}+D_{jj}\right) =0. \end{aligned}$$

To obtain the last identity we used the fact that \(\mu \in {\mathcal {I}}^n\). Conversely, suppose that inequality (2.34) holds. Choose \(\mu \) such that \(\mu _l=\mu _m=\frac{1}{2}\) for some \(l\ne m\), all the other components being 0; then (2.33) becomes

$$\begin{aligned} \frac{1}{4}\left( D_{ll}+D_{mm}\right) -\frac{1}{2}D_{lm}\ge 0 \end{aligned}$$
(2.36)

where we used the symmetry of D. Considering all the pairs \(l,m\in \left\{ 1,\ldots ,n\right\} \) we get the result. \(\square \)

Proof of Theorem 2.9

By formula (2.7) we deduce the result once we show that

$$\begin{aligned} \inf _{x\in {\mathbb {R}}^n}\mathrm {Tr}\Big (D\, \nabla ^2f\left( x\right) \Big )\ge 0, \end{aligned}$$
(2.37)

where we called

$$\begin{aligned} D{:=}C^Y-C^X. \end{aligned}$$
(2.38)

Using (2.19) and (2.20) we obtain that the expression to be minimized in (2.37) is

$$\begin{aligned} \sum _{i=1}^nD_{i,i}\mu _i\left( x\right) -\sum _{i=1}^n\sum _{j=1}^nD_{i,j}\mu _i\left( x\right) \mu _j\left( x\right) . \end{aligned}$$
(2.39)

Therefore the infimum in (2.37) coincides with \(\inf _{\mu \in {\mathcal {I}}^n}G(\mu )\), and the result follows by Lemma 2.10, since (2.35) with the matrix D defined by (2.38) coincides with (2.31). \(\square \)

3 Examples

In this section we discuss two examples, obtaining the existence of the thermodynamic limit for the quenched free energy of two models. The first one is the Sherrington–Kirkpatrick model. The existence of the thermodynamic limit for this model was obtained, by the interpolation method, in the breakthrough paper [19], using Theorem 2.2. We review this result as a warm-up, to fix ideas and the basic constructions; we use however Theorem 2.9 and discuss the result just in terms of the metrics. Then we discuss a class of Generalized Random Energy Models [15] for which in general conditions (2.15), (2.16) fail while condition (2.31) holds. We refer to [9, 25] and [10] for the beautiful mathematics involved in the limit of such models. In the final part of the section we discuss some models from a purely metric viewpoint.

3.1 The Sherrington–Kirkpatrick Model

The Sherrington–Kirkpatrick model is a mean field spin glass model [18, 24, 26, 28]. Spin configurations are \(\sigma \in \{-1,1\}^N\) and the energy of the system is given by

$$\begin{aligned} H_N\left( \sigma \right) {:=}-\frac{1}{\sqrt{N}}\sum _{i,j=1}^NJ_{i,j}\sigma \left( i\right) \sigma \left( j\right) , \end{aligned}$$
(3.1)

where \(J_{i,j}\) are i.i.d. standard Gaussian random variables. Small variants of the model consider different sums in (3.1), but all the variants are equivalent modulo simple transformations. The spins are associated with the vertices of a complete graph and the interaction between each pair of spins is determined by the variables J. The partition function is defined as

$$\begin{aligned} Z_N\left( \beta \right) {:=}\sum _{\{\sigma \}}\mathrm {e}^{-\beta H_N(\sigma )}, \end{aligned}$$
(3.2)

where the parameter \(\beta \) is the inverse temperature and the quenched free energy per site is defined by

$$\begin{aligned} F_N\left( \beta \right) {:=}-\frac{1}{\beta N}\mathbb {E}\left[ \log Z_N\left( \beta \right) \right] =:\frac{1}{\beta N}\alpha _N\left( \beta \right) , \end{aligned}$$
(3.3)

where the last equality defines the symbol \(\alpha _N\left( \beta \right) \). The variables \(\left( -\beta H_N\left( \sigma \right) \right) _{\sigma \in \left\{ -1,1\right\} ^N}\) are a centered Gaussian random vector with covariance

$$\begin{aligned} \beta ^2{\mathbb {E}}\left[ H_N\left( \sigma \right) H_N\left( \sigma '\right) \right] =\frac{\beta ^2}{N} \sum _{i,j=1}^N\sigma \left( i\right) \sigma \left( j\right) \sigma '\left( i\right) \sigma '\left( j\right) =N\beta ^2q_N^2\left( \sigma ,\sigma '\right) , \end{aligned}$$
(3.4)

where

$$\begin{aligned} q_N\left( \sigma ,\sigma '\right) {:=}\frac{1}{N}\sum _{i=1}^N\sigma \left( i\right) \sigma '\left( i\right) , \end{aligned}$$
(3.5)

is the overlap between the configurations \(\sigma \) and \(\sigma '\). The corresponding Euclidean distance according to (2.23) is given by

$$\begin{aligned} d_N\left( \sigma ,\sigma '\right) = \beta \sqrt{8N\Big [d^H_N\left( 1-d^H_N\right) \Big ]}, \end{aligned}$$
(3.6)

where

$$\begin{aligned} d^H_N\left( \sigma ,\sigma '\right) {:=}\frac{1}{N}\sum _{i=1}^N\mathbb I\Big (\sigma \left( i\right) \ne \sigma '\left( i\right) \Big ) \end{aligned}$$
(3.7)

is the Hamming distance. Incidentally, the Hamming distance itself is an example of a non-Euclidean metric. Notice that of course \(d_N\left( \sigma ,\sigma \right) =0\), but we also have \(d_N\left( \sigma ,-\sigma \right) =0\), since \(H_N\left( \sigma \right) =H_N\left( -\sigma \right) \). The fact that the right hand side of (3.6) is a distance (indeed a pseudo distance) is not trivial, but follows directly since it is obtained by (2.23) (it is a function of a metric that is again a metric, see [13, 16]).
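The algebra relating (3.4), (3.5), (3.6) and (3.7) can be spot-checked numerically. The following short sketch (ours) draws a random pair of configurations and compares the distance computed from the covariance (3.4) via (2.23) with the Hamming-distance expression (3.6).

```python
import numpy as np

rng = np.random.default_rng(3)
N, beta = 50, 1.3                        # hypothetical system size and temperature
s1 = rng.choice([-1, 1], size=N)
s2 = rng.choice([-1, 1], size=N)

q = (s1 * s2).mean()                     # overlap (3.5)
dH = (s1 != s2).mean()                   # Hamming distance (3.7)

d_from_cov = np.sqrt(2 * beta**2 * N * (1 - q**2))   # (2.23) applied to (3.4)
d_from_dH = beta * np.sqrt(8 * N * dH * (1 - dH))    # formula (3.6)
print(np.isclose(d_from_cov, d_from_dH))             # True
```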

Let us split the system into two subsystems \(S_1, S_2\) with respectively \(N_1\) and \(N_2\) vertices with \(N_1+N_2=N\). We erase the interaction between spins that belong to different subsystems. We define the restricted Hamiltonians of the subsystems as

$$\begin{aligned} H_{N_k}(\sigma ){:=}-\frac{1}{\sqrt{N_k}}\sum _{i,j\in S_k}J_{i,j}\sigma \left( i\right) \sigma \left( j\right) , \qquad k=1,2, \end{aligned}$$
(3.8)

where we remark that the sum is restricted to the indices belonging to the subsystem labeled \(k=1,2\). Here and in the following we use the symbol \(\sigma \) both for the full configuration and for the configuration restricted to a subsystem: when a configuration appears in an expression labeled by a subsystem, we mean the configuration restricted to that subsystem. For example, \(d^H_{N_k}\left( \sigma ,\sigma '\right) \) and \(d_{N_k}\left( \sigma ,\sigma '\right) \) are respectively the Hamming distance (3.7) and the distance (3.6) when the configurations are restricted to the subsystem \(k=1,2\). Note that with this notation we have the key relationship

$$\begin{aligned} d^H_N\left( \sigma ,\sigma '\right) =\frac{N_1}{N}d^H_{N_1}\left( \sigma ,\sigma '\right) + \frac{N_2}{N}d^H_{N_2}\left( \sigma ,\sigma '\right) . \end{aligned}$$
(3.9)

Another important relationship is

$$\begin{aligned} \sum _{\{\sigma \}} \mathrm {e}^{-\beta \left( H_{N_1}(\sigma )+H_{N_2}(\sigma )\right) }=Z_{N_1}(\beta )Z_{N_2}(\beta ). \end{aligned}$$
(3.10)

We apply Theorem 2.9 with the vectors

$$\begin{aligned} \left\{ \begin{array}{l} Y=\big (-\beta H_N(\sigma )\big )_{\sigma \in \{-1,1\}^N},\\ X=\left( -\beta H_{N_1}(\sigma )-\beta H_{N_2}(\sigma )\right) _{\sigma \in \{-1,1\}^N}. \end{array} \right. \end{aligned}$$

The condition (2.31) becomes the super-Pythagorean relation

$$\begin{aligned} d_N\ge \sqrt{d_{N_1}^2+d_{N_2}^2}, \end{aligned}$$
(3.11)

that is equivalent to

$$\begin{aligned} \Big [d^H_N\left( 1-d^H_N\right) \Big ]\ge \frac{N_1}{N}\Big [d^H_{N_1}\left( 1-d^H_{N_1}\right) \Big ]+\frac{N_2}{N} \Big [d^H_{N_2}\left( 1-d^H_{N_2}\right) \Big ]. \end{aligned}$$
(3.12)

The above inequality is true by (3.9) and the concavity of the real function \(x \rightarrow x\left( 1-x\right) \). By Theorem 2.9 and (3.10) we deduce

$$\begin{aligned} \alpha _N\left( \beta \right) \le \alpha _{N_1}\left( \beta \right) +\alpha _{N_2}\left( \beta \right) , \end{aligned}$$
(3.13)

and by subadditivity and the classical Fekete lemma we deduce that the limit of the quenched free energy per site exists:

$$\begin{aligned} \lim _{N\rightarrow \infty }F_N\left( \beta \right) =\lim _{N\rightarrow \infty }\frac{1}{\beta N}\alpha _N\left( \beta \right) =\inf _N\frac{1}{\beta N}\alpha _N\left( \beta \right) . \end{aligned}$$
(3.14)
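For very small systems the subadditivity (3.13) can also be observed directly by Monte Carlo over the disorder. The sketch below (ours; system sizes and sample counts are arbitrary toy choices) enumerates all configurations and estimates \(\alpha _N(\beta )=-\mathbb {E}[\log Z_N(\beta )]\); the inequality holds exactly, so the printed comparison should be True up to Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(7)
beta = 1.5

def alpha(N, samples=2000):
    # alpha_N(beta) = -E[log Z_N(beta)], estimated over disorder samples,
    # with an exact sum over all 2^N configurations
    sig = np.array([[1 - 2 * ((s >> i) & 1) for i in range(N)] for s in range(2 ** N)])
    logZ = np.empty(samples)
    for t in range(samples):
        J = rng.standard_normal((N, N))
        H = -((sig @ J) * sig).sum(axis=1) / np.sqrt(N)   # Hamiltonian (3.1)
        logZ[t] = np.log(np.exp(-beta * H).sum())
    return -logZ.mean()

N1, N2 = 4, 5
print(alpha(N1 + N2) <= alpha(N1) + alpha(N2))            # (3.13), up to MC error
```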

3.2 The Generalized Random Energy Model

The Generalized Random Energy Model (GREM) is a spin glass model introduced by Derrida [15] as a generalization of the REM (Random Energy Model) with pair correlations between energies. The model has a hierarchical structure, as each spin configuration corresponds to a leaf of a given rooted tree.

We consider sequences of finite trees encoded by finite strings of non-negative integers. Let \(n\in \mathbb {N}\), let \(\underline{k}=\left( k_1,\ldots ,k_n\right) \) be a vector of non-negative integers and set \(|\underline{k}| {:=}k_1+\ldots +k_n\). The tree \(\mathcal {T}_{\underline{k}}\) is constructed as follows. The root (the unique node at level 0) is connected to \(2^{k_1}\) nodes that compose the first level. Each node of the first level is connected to \(2^{k_2}\) nodes of the second level; we have therefore \(2^{k_1+k_2}\) nodes at the second level, and so on. The n-th level consists of \(2^{k_1}2^{k_2}\cdots 2^{k_n}=2^{|\underline{k}|}\) leaves. If there exists a \(1\le j<n\) such that \(k_j=0\), we mean that the nodes of level j coincide with those of level \(j-1\). A spin configuration \(\sigma \in \left\{ -1,1\right\} ^{|\underline{k}|}\) is then attached to each leaf. The Hamiltonian is

$$\begin{aligned} H_{\underline{k}}\left( \sigma \right) =-\sqrt{|\underline{k}|}\left( \varepsilon _1^{(\sigma )}+\cdots +\varepsilon _{n}^{(\sigma )}\right) , \end{aligned}$$
(3.15)

where \(\varepsilon _i^{(\sigma )}\sim \mathcal {N}\left( 0,a_i\right) \) if \(k_i>0\) and \(\varepsilon _i^{(\sigma )}=0\) if \(k_i=0\). The \(a_i\)'s, \(i\in \mathbb N\), are positive numbers such that \(\sum _{i=1}^{+\infty } a_i=1\).

The random variables \(\varepsilon \) are attached to the edges of the tree. More precisely, attached to the edges that connect level \(i-1\) to level i there is a family of i.i.d. centered Gaussian random variables with variance \(a_i\), one for each edge. When we write \(\varepsilon ^{(\sigma )}_i\) we then mean the random variable associated with the unique edge connecting level \(i-1\) to level i that belongs to the unique path from the leaf associated with \(\sigma \) to the root. When \(k_i=0\) there are no edges from level \(i-1\) to level i and we therefore set \(\varepsilon ^{(\sigma )}_i=0\). Then \(\left( H_{\underline{k}}(\sigma )\right) _{\sigma \in \left\{ -1,1\right\} ^{|\underline{k}| }}\) is a centered Gaussian random vector indexed by the \(|\underline{k}|\)-dimensional hypercube \(\left\{ -1,1\right\} ^{|\underline{k}|}\).

We call \(l=l\left( \sigma ,\tau \right) \in \{0,1, \dots ,n-1\}\) the level of the hierarchy at which the two paths from the leaves \(\sigma \) and \(\tau \) of \(\mathcal {T}_{\underline{k}}\) to the root merge. The two configurations share the same energy variables, \(\varepsilon ^{(\sigma )}_i=\varepsilon ^{(\tau )}_i\), for any \(i\le l\), while \(\varepsilon ^{(\sigma )}_i\ne \varepsilon ^{(\tau )}_i\) whenever \(i> l\). When \(\varepsilon ^{(\sigma )}_i\) and \(\varepsilon ^{(\tau )}_i\) are different, they are independent. Furthermore, \(\varepsilon ^{(\sigma )}_i\) and \(\varepsilon ^{(\tau )}_j\) are always independent if \(j\ne i\). We define \(\widetilde{a}_i{:=}a_i\) when \(k_i>0\) and \(\widetilde{a}_i{:=}0\) when \(k_i=0\). We get

$$\begin{aligned} \mathbb {E}\left[ H_{\underline{k}}\left( \sigma \right) H_{\underline{k}}\left( \tau \right) \right] =\left| \underline{k}\right| \sum _{i=1}^{l}\widetilde{a}_i, \end{aligned}$$
(3.16)

where we point out that the right hand side above is zero when \(l=0\). The corresponding metric, according to (2.23), is given by

$$\begin{aligned} d_{\underline{k}}\left( \sigma ,\tau \right) =\sqrt{\mathbb {E}\left[ \left( H_{\underline{k}} \left( \sigma \right) -H_{\underline{k}}\left( \tau \right) \right) ^2\right] }=\sqrt{ 2 \left| \underline{k}\right| \sum _{i=l+1}^n\widetilde{a}_i}. \end{aligned}$$
(3.17)

The term inside the square root on the right hand side represents, up to a multiplicative factor, the minimal path length distance between the two leaves \(\sigma \) and \(\tau \) on the tree when each edge between level \(i-1\) and i has a length given by \(a_i\). Since the graph is a tree the path is unique and the metric (3.17) is an ultrametric. We introduce, for notational convenience, the normalized distance

$$\begin{aligned} {{s}_{\underline{k}}}\left( \sigma ,\tau \right) {:=}\sqrt{2\sum _{i=l+1}^n\widetilde{a}_i}, \end{aligned}$$
(3.18)

so that \(d_{\underline{k}}\left( \sigma ,\tau \right) =\sqrt{\left| {\underline{k}}\right| } s_{\underline{k}}\left( \sigma ,\tau \right) \) for any pair of configurations \(\sigma \) and \(\tau \).

Fig. 2
The paths \(\sigma \) and \(\tau \) are at distance \(s_{{\underline{k}}}\left( \sigma ,\tau \right) =\sqrt{2\left( a_2+a_3\right) }\)
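The tree construction is easy to put in code. The following sketch (ours; the vectors k and a are arbitrary toy values) indexes the leaves of \(\mathcal T_{\underline{k}}\) by their branch choices, so that the merging level l of two leaves is the length of their common prefix; it then builds the covariance (3.16), recovers the metric via (2.23), checks it against (3.17), and verifies that it is an ultrametric.

```python
import numpy as np
from itertools import product

k = [1, 2, 1]                 # toy tree T_k with |k| = 4
a = [0.5, 0.3, 0.2]           # edge lengths / variances, sum = 1
K = sum(k)

leaves = list(product(*[range(2 ** ki) for ki in k]))   # 2^|k| = 16 leaves

def merge_level(u, v):
    # level at which the paths from u and v to the root merge:
    # the length of the common prefix of branch choices
    l = 0
    while l < len(k) and u[l] == v[l]:
        l += 1
    return l

cov = np.array([[K * sum(a[:merge_level(u, v)]) for v in leaves]
                for u in leaves])                                     # (3.16)
d = np.sqrt(np.add.outer(np.diag(cov), np.diag(cov)) - 2 * cov)       # (2.23)

# spot check of (3.17) for one pair of leaves
u, v = leaves[0], leaves[-1]
assert np.isclose(d[0, -1], np.sqrt(2 * K * sum(a[merge_level(u, v):])))

# the metric is an ultrametric: d(u,w) <= max(d(u,v), d(v,w)) for all triples
m = len(leaves)
print(all(d[i, j] <= max(d[i, h], d[h, j]) + 1e-9
          for i in range(m) for j in range(m) for h in range(m)))     # True
```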

Both the correlations (3.16) and the metric (3.17) depend on the vector \(\underline{k}\) and on the assignment of configurations to leaves. We will discuss this shortly.

As for the Sherrington–Kirkpatrick model, given an inverse temperature \(\beta \), we introduce the disorder-dependent partition function

$$\begin{aligned} Z_{\underline{k}}\left( \beta \right) {:=}\sum _{\left\{ \sigma \right\} }\mathrm {e}^{-\beta H_{{\underline{k}}}\left( \sigma \right) } \end{aligned}$$
(3.19)

and the quenched average of the free energy per site

$$\begin{aligned} F_{\underline{k}}\left( \beta \right) {:=}-\frac{1}{\beta |\underline{k}|}\,\mathbb {E}\left[ \log Z_{\underline{k}}\left( \beta \right) \right] . \end{aligned}$$
(3.20)

We prove the existence of the thermodynamic limit of (3.20) under general assumptions, when a parameter N diverges and the vector \(\underline{k}=\underline{k}\left( N\right) \) grows in such a way that \(n=n\left( N\right) \) diverges as well. Contucci et al. [12] proved this fact when n is constant. This was obtained by applying the same strategy as the Guerra–Toninelli interpolation method [19]; in particular, they used the inequality in Theorem 2.2. When n is no longer bounded, the conditions of Theorem 2.2 fail, while the condition of Theorem 2.9 continues to hold. We now describe more precisely the growing mechanism of the model and prove the existence of the thermodynamic limit.

3.2.1 Growing and Labeling

We consider a sequence of growing trees labeled by a sequence of vectors \(\underline{k}\left( N\right) \). For each \(N\in \mathbb {N}\) we have the tree \(\mathcal {T}_{\underline{k}\left( N\right) }\) defined by the following hypothesis and rules.

  1. (H1)

    Let \(\left( \alpha _i\right) _{i=1}^{\infty }\) be a sequence of reals larger than 1 satisfying the constraint

    $$\begin{aligned} \sum _{i=1}^{\infty }\log \alpha _i=\log 2. \end{aligned}$$
    (3.21)

    The \(\alpha _i\)’s define the tree \(\mathcal {T}_{\underline{k}\left( N\right) }\) through

    $$\begin{aligned} k_i\left( N\right) {:=}\left\lfloor \frac{N\log \alpha _i}{\log 2}\right\rfloor , \qquad i\in \mathbb {N}, \end{aligned}$$
    (3.22)

    where \(\lfloor \, \cdot \,\rfloor \) denotes the integer part.

  2. (H2)

    The sequence \(\left( a_i\right) _{i=1}^{\infty }\) gives the lengths of the edges at the different levels and the variances of the associated random variables, and satisfies the condition \(\sum _{i=1}^{\infty }a_i=1\).

The exact values of the sums of the two series are not really important and could be replaced by mere summability conditions. Formula (3.22) comes from the requirement that the number of edges connecting a given node at level \(i-1\) to nodes at level i grows exponentially, like \(\alpha _i^N\).

Observe that by (3.21), for any fixed \(N>0\), only a finite number of components of \({\underline{k}\left( N\right) }\) are different from zero. We define

$$\begin{aligned} n{:=}n\left( N\right) {:=}\max \{i\,:\, k_i\left( N\right) >0\} \end{aligned}$$
(3.23)

and the finite vector \(\underline{k}\left( N\right) {:=}\left( k_1\left( N\right) ,\ldots ,k_n\left( N\right) \right) \). Then a spin configuration \(\sigma \in \left\{ -1,1\right\} ^{\left| \underline{k}\left( N\right) \right| }\) is assigned to each leaf. The assignment is actually arbitrary; indeed, the free energy of the system is obtained by summing over all the configurations, thus getting rid of any dependence on the underlying choice.

We assign a spin configuration to each leaf of the tree as follows. At fixed N, we attach to every edge one or more labels of type \(\left( m,s\right) \), where \(s=\pm 1\) and \(m\in \left\{ 1,\ldots , \left| \underline{k}(N)\right| \right\} \). Given a leaf, there exists a unique path toward the root. If this path crosses an edge having a label \(\left( m,s\right) \), then the configuration \(\sigma \) associated with the leaf is such that \(\sigma \left( m\right) =s\). We assign the labels in such a way that every path meets all the labels \(m=1, \dots , \left| \underline{k}\left( N\right) \right| \) and different leaves are associated with different configurations.

We embed the tree in the plane so that the root is at the top and the paths from the leaves to the root go upwards. Moreover, all the edges connecting a given node with the nodes at the successive level are ordered from left to right. Each edge connecting level \(i-1\) to level i carries exactly \(k_i\left( N\right) \) labels, corresponding to the values \(m=\sum _{j=1}^{i-1}k_j\left( N\right) +1, \sum _{j=1}^{i-1}k_j\left( N\right) +2, \dots , \sum _{j=1}^{i}k_j\left( N\right) \). The corresponding values of the parameter s are fixed as follows.

Fix a node at level \(i-1\). Number the edges connecting this node with the nodes at level i with integers going, from left to right, from 0 to \(2^{k_i\left( N\right) }-1\): the leftmost corresponds to 0 and the rightmost to \(2^{k_i\left( N\right) }-1\). Do this for each node. Write these integers in binary code, so that the leftmost edges are numbered with \(k_i\left( N\right) \) zeros and the rightmost with \(k_i(N)\) ones. In our setting, 0 corresponds to the − sign and 1 to the \(+\) sign. Then we associate the lowest value of m with the most significant digit and the highest value of m with the least significant one. See Fig. 3 for an example.

Fig. 3
Example of the assignment of labels for \(\underline{k}=\left( 1,2,1\right) \). The paths \(\sigma \) and \(\tau \) have spin configurations \(\sigma =\left( -1,-1,1,-1\right) \), \(\tau =\left( -1,1,1,1\right) \)

3.2.2 Splitting the System

Let \(N>0\) and consider a pair of integers \(N_1,N_2\) such that \(N_1+N_2=N\). We already know how to construct the trees \(\mathcal {T}_{\underline{k}\left( N\right) }\), \(\mathcal {T}_{\underline{k}\left( N_1\right) }\) and \(\mathcal {T}_{\underline{k}\left( N_2\right) }\). Their geometric structure is simply encoded by the finite vectors \(\underline{k}\left( N\right) \), \(\underline{k}\left( N_1\right) \), \(\underline{k}\left( N_2\right) \), and we recall that, by definition, we have

$$\begin{aligned} k_{i}\left( N_j\right) {:=}\left\lfloor \frac{N_j\log \alpha _i}{\log 2}\right\rfloor , \qquad j=1,2, \,\,\,\,\,i\in \mathbb N. \end{aligned}$$
(3.24)

Notice that

$$\begin{aligned} k_i\left( N_1\right) +k_i\left( N_2\right) \le k_i\left( N\right) \le k_i\left( N_1\right) +k_i\left( N_2\right) +1. \end{aligned}$$
(3.25)
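Both the definition (3.22)/(3.24) and the inequality (3.25), which is just the elementary bound \(\lfloor x\rfloor +\lfloor y\rfloor \le \lfloor x+y\rfloor \le \lfloor x\rfloor +\lfloor y\rfloor +1\), are immediate to check numerically, together with the convergence \(|\underline{k}(N)|/N\rightarrow 1\) of Lemma 3.3 below. Here is a short sketch (ours, with a hypothetical geometric choice of the \(\gamma _i\)'s).

```python
import numpy as np

# hypothetical geometric choice: gamma_i = log(alpha_i)/log 2 = 2^{-i},
# whose sum is essentially 1, as required by (3.21)
gammas = np.array([2.0 ** -i for i in range(1, 30)])

def k_vec(N):
    return np.floor(N * gammas).astype(int)               # formulas (3.22), (3.24)

N1, N2 = 37, 58
kN, k1, k2 = k_vec(N1 + N2), k_vec(N1), k_vec(N2)
print(((k1 + k2 <= kN) & (kN <= k1 + k2 + 1)).all())      # inequality (3.25): True

for N in (10, 100, 10_000):                               # Lemma 3.3: |k(N)|/N -> 1
    print(N, k_vec(N).sum() / N)
```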

We associate the labels to the edges and leaves of the full system \(\mathcal {T}_{\underline{k}\left( N\right) }\) as in the previous section. The labels of the two subsystems \(\mathcal {T}_{\underline{k}\left( N_1\right) }\) and \(\mathcal {T}_{\underline{k}\left( N_2\right) }\) are instead attributed in a slightly different way, in order to have different spins (different labels m) belonging to the two subsystems.

The labels m attributed to the edges from level \(i-1\) to level i in the full system coincide with the set \(\left\{ \sum _{j=1}^{i-1}k_j\left( N\right) +1,\sum _{j=1}^{i-1}k_j\left( N\right) +2, \dots , \sum _{j=1}^{i}k_j\left( N\right) \right\} \). When we split the system into the two subsystems, we assign to the edges that connect each node at level \(i-1\) to level i of the subsystem \(\mathcal {T}_{\underline{k}\left( N_1\right) }\) the labels \(\left\{ \sum _{j=1}^{i-1}k_j\left( N\right) +1, \dots , \sum _{j=1}^{i-1}k_j\left( N\right) +k_i\left( N_1\right) \right\} \), while we assign to the edges that connect each node at level \(i-1\) to level i of the subsystem \(\mathcal {T}_{\underline{k}\left( N_2\right) }\) the labels \(\left\{ \sum _{j=1}^{i-1}k_j\left( N\right) +k_i\left( N_1\right) +1,\dots ,\sum _{j=1}^{i-1}k_j\left( N\right) +k_i\left( N_1\right) +k_i\left( N_2\right) \right\} \). By (3.25) this is well defined. Once the labels m are split between the two subsystems, the assignment of the labels \(s=\pm \) follows the same rule as in the previous section. Since \(k_i\left( N_1\right) +k_i\left( N_2\right) \) may be strictly less than \(k_i\left( N\right) \), some of the labels m (i.e. some spins) may disappear in the splitting.

We now discuss the behavior of the distances. Consider two finite vectors \(\underline{k}\) and \(\underline{k}'\) such that \(k'_i\le k_i\) for any i. We assign the labels to \(\mathcal T_{\underline{k}}\) in the usual way, while we assign the labels to \(\mathcal T_{\underline{k}'}\) as follows: to the edges that connect each node at level \(i-1\) to level i of \(\mathcal {T}_{\underline{k}'}\) we arbitrarily assign \(k'_i\) of the \(k_i\) labels used in \(\mathcal T_{\underline{k}}\). The assignment of the labels \(s=\pm \) then follows the usual rule.

We call respectively \(d_{\underline{k}}\) and \(d_{\underline{k}'}\) the metrics defined by formula (3.17) for the two trees \(\mathcal T_{\underline{k}}\) and \(\mathcal T_{\underline{k}'}\), and \(s_{\underline{k}}\), \(s_{\underline{k}'}\) the corresponding normalized distances (see (3.18)). As before, given two spin configurations \(\sigma ,\tau \in \left\{ -1,1\right\} ^{\left| \underline{k}\right| }\), we call again \(\sigma ,\tau \in \left\{ -1,1\right\} ^{\left| \underline{k}'\right| }\) the same configurations restricted to the labels assigned to the edges in \(\mathcal T_{\underline{k}'}\). We have the following.

Lemma 3.1

Consider two finite vectors \(\underline{k}' \le \underline{k}\) and the corresponding trees \(\mathcal T_{\underline{k}}\) and \(\mathcal T_{\underline{k}'}\), with spin configurations associated with the leaves as above. Then we have

$$\begin{aligned} s_{\underline{k}'}\left( \sigma ,\tau \right) \le s_{\underline{k}}\left( \sigma ,\tau \right) , \qquad \forall \, \sigma , \tau . \end{aligned}$$
(3.26)

Proof

Consider the tree \(\mathcal T_{\underline{k}}\), two configurations \(\sigma , \tau \) associated with two leaves, and the corresponding geodesic path. Consider now a new finite vector \(\underline{k}'\) obtained from \(\underline{k}\) by decreasing a single component by one and keeping all the remaining ones, i.e. \(k'_i=k_i-1\) and \(k'_j=k_j\) for all \(j\ne i\). Suppose that the label m that is missing in \(\mathcal T_{\underline{k}'}\) is \(m^*\). The tree \(\mathcal T_{\underline{k}'}\), with the corresponding labeling, is obtained from \(\mathcal T_{\underline{k}}\) and the original labeling simply as follows. All the edges connecting nodes at level \(i-1\) to nodes at level i in \(\mathcal T_{\underline{k}}\) can be grouped into pairs having exactly the same labels apart from the one corresponding to \(m^*\): the two paired edges have labels \(\left( m^*,+\right) \) and \(\left( m^*,-\right) \) respectively. If we identify each pair of edges, and consequently also the subtrees starting from the identified nodes, we get a tree that coincides with \(\mathcal T_{\underline{k}'}\), with exactly the same assignment of labels. In particular, the leaves associated with \(\sigma \), \(\tau \) in the new tree are exactly the original ones after the identification. Finally, the geodesic path too remains the same after the identification (see e.g. Fig. 4).

Since the identification procedure can only shorten this path, we have the statement of the lemma when \(\underline{k}'\) is obtained from \(\underline{k}\) by decreasing one of its components by one. We finish the proof by observing that any \(\underline{k}'\le \underline{k}\) can be obtained from \(\underline{k}\) after a finite number of iterations of this type. \(\square \)

Fig. 4
After the coalescence of the branches with \(m^*=2\), the configurations \(\sigma \) and \(\tau \) in Figs. 3 and 4 are at distance \(s_{\underline{k}'}\left( \sigma ,\tau \right) =\sqrt{2 a_3}<s_{\underline{k}}\left( \sigma ,\tau \right) =\sqrt{2\left( a_2+a_3\right) }\)

Remark 3.2

Both \(\mathcal T_{\underline{k}\left( N_i\right) }\), \(i=1,2\), are obtained from \(\mathcal T_{\underline{k}\left( N\right) }\) as in the hypothesis of Lemma 3.1, and we therefore have

$$\begin{aligned} s_{\underline{k}\left( N\right) }\left( \sigma ,\tau \right) \ge \max \left\{ s_{\underline{k}\left( N_1\right) }\left( \sigma ,\tau \right) , s_{\underline{k}\left( N_2\right) }\left( \sigma ,\tau \right) \right\} , \qquad \forall \,\sigma , \tau . \end{aligned}$$
(3.27)

Since by (3.25) we have \(\frac{|\underline{k}(N_1)|+|\underline{k}(N_2)|}{|\underline{k}(N)|}\le 1\), we deduce

$$\begin{aligned} s^2_{\underline{k}\left( N\right) }\left( \sigma ,\tau \right) \ge \frac{\left| \underline{k}\left( N_1\right) \right| }{\left| \underline{k}\left( N\right) \right| }s^2_{\underline{k}\left( N_1\right) }\left( \sigma ,\tau \right) +\frac{\left| \underline{k}\left( N_2\right) \right| }{\left| \underline{k}\left( N\right) \right| }s^2_{\underline{k}\left( N_2\right) }\left( \sigma ,\tau \right) , \end{aligned}$$
(3.28)

that is equivalent to the super-Pythagorean condition

$$\begin{aligned} d_{\underline{k}\left( N\right) }\left( \sigma ,\tau \right) \ge \sqrt{d^2_{\underline{k}\left( N_1\right) }\left( \sigma ,\tau \right) +d^2_{\underline{k}\left( N_2\right) }\left( \sigma ,\tau \right) }. \end{aligned}$$
(3.29)

3.2.3 Thermodynamic Limit

We define the energy of our sequence of GREM models as \(H_N\left( \sigma \right) := H_{\underline{k}\left( N\right) }\left( \sigma \right) \) (recall definition (3.15)) and the corresponding partition function and density of free energy as in (3.19), (3.20); more precisely, \(Z_N\left( \beta \right) =\sum _{\left\{ \sigma \right\} } \mathrm {e}^{-\beta H_N\left( \sigma \right) }\) and

$$\begin{aligned} F_N(\beta )=-\frac{1}{\beta \left| \underline{k}\left( N\right) \right| }{\mathbb {E}}\left[ \log Z_N\left( \beta \right) \right] =:\frac{\alpha _N(\beta )}{\beta \left| \underline{k}\left( N\right) \right| }, \end{aligned}$$
(3.30)

where the last equality defines the symbol \(\alpha _N(\beta )\).

We need a preliminary lemma. Let us set \(\gamma _i{:=}\frac{\log \alpha _i}{\log 2}>0\) and observe that by definition \(\sum _{i=1}^{+\infty }\gamma _i=1\).

Lemma 3.3

We have

$$\begin{aligned} \lim _{N\rightarrow \infty }\frac{\left| \underline{k}(N)\right| }{N}=1 \end{aligned}$$
(3.31)

Proof

For any finite k we have

$$\begin{aligned} \frac{\sum _{i=1}^{k}\left( N\gamma _i-1\right) }{N}\le \frac{\left| \underline{k}\left( N\right) \right| }{N}\le \frac{\sum _{i=1}^{+\infty }N\gamma _i}{N}. \end{aligned}$$
(3.32)

The right hand side of the above expression is 1. The left hand side converges, when \(N\rightarrow \infty \), to \(\sum _{i=1}^k\gamma _i\). Taking then the limit \(k\rightarrow \infty \) we deduce the statement of the lemma. \(\square \)

We can now prove the existence of the limit of the quenched free energy per site of a GREM model with infinitely many levels.

Theorem 3.4

Under the hypotheses (H1) and (H2), the density of free energy (3.30) defined on \(\mathcal {T}_{\underline{k}\left( N\right) }\) has a limit when \(N\rightarrow \infty \), in the sense that the following limit exists and coincides with an infimum:

$$\begin{aligned} -\infty<\lim _{N\rightarrow \infty }-\frac{1}{\beta \left| \underline{k}\left( N\right) \right| }\,\mathbb {E}\left[ \log Z_{N}\left( \beta \right) \right] =\inf _N\frac{\alpha _N(\beta )}{\beta N} <\infty . \end{aligned}$$
(3.33)

Proof

We apply the interpolation method to the Gaussian random vectors \(H_{\underline{k}\left( N\right) }\left( \sigma \right) \) and \(H_{\underline{k}\left( N_1\right) }\left( \sigma \right) +H_{\underline{k}\left( N_2\right) }\left( \sigma \right) \), both labeled by the configurations \(\sigma \in \left\{ -1,1\right\} ^{\left| \underline{k}\left( N\right) \right| }\). The Gaussian random variables used to compute \(H_{\underline{k}\left( N\right) }, H_{\underline{k}\left( N_1\right) }\) and \(H_{\underline{k}\left( N_2\right) }\) are all mutually independent. Note that, since some spins are lost in the splitting, the second Gaussian random vector is degenerate.

We have the following identity

$$\begin{aligned} \sum _{\left\{ \sigma \right\} }\mathrm {e}^{-\beta H_{\underline{k}\left( N_1\right) }\left( \sigma \right) }\mathrm {e}^{-\beta H_{\underline{k}\left( N_2\right) }\left( \sigma \right) }=Z_{N_1}\left( \beta \right) Z_{N_2}\left( \beta \right) 2^{\left| \underline{k}\left( N\right) \right| -\left| \underline{k}\left( N_1\right) \right| -\left| \underline{k}\left( N_2\right) \right| }. \end{aligned}$$
(3.34)

The last term is due to the fact that some spins may be lost in the splitting.

By Remark 3.2, we can apply Theorem 2.9, getting

$$\begin{aligned} \alpha _N\left( \beta \right) \le \alpha _{N_1}\left( \beta \right) +\alpha _{N_2}\left( \beta \right) -\Big (\left| \underline{k}\left( N\right) \right| -\left| \underline{k}\left( N_1\right) \right| -\left| \underline{k}\left( N_2\right) \right| \Big )\log 2. \end{aligned}$$
(3.35)

Since the quantity \(\left( \left| \underline{k}\left( N\right) \right| -\left| \underline{k}\left( N_1\right) \right| -\left| \underline{k}\left( N_2\right) \right| \right) \log 2\) is non-negative, the sequence \(\alpha _N(\beta )\) is subadditive. By Fekete's lemma we deduce that the following limit exists:

$$\begin{aligned} \lim _{N\rightarrow \infty }\frac{\alpha _N(\beta )}{\beta N}=\inf _N\frac{\alpha _N(\beta )}{\beta N}. \end{aligned}$$
(3.36)

By Lemma 3.3 we have that

$$\begin{aligned} \lim _{N\rightarrow \infty }\frac{\alpha _N(\beta )}{\beta \left| \underline{k}\left( N\right) \right| }= \lim _{N\rightarrow \infty }\frac{\alpha _N(\beta )}{\beta N}, \end{aligned}$$
(3.37)

and we get the main statement of the Theorem.

It remains just to prove that the limit is strictly bigger than \(-\infty \).

This follows from the summability of the variances \(a_i\). Indeed, we prove that for any \(N>0\), \(-\beta F_{N}\left( \beta \right) \) is bounded from above uniformly in N. We have

$$\begin{aligned} -\beta F_{N}\left( \beta \right)= & {} \frac{1}{\left| \underline{k}\left( N\right) \right| }\,\mathbb {E}\left[ \log Z_{N}\left( \beta \right) \right] =\frac{1}{\left| \underline{k}\left( N\right) \right| }\,\mathbb {E}\left[ \log \sum _{\left\{ \sigma \right\} }\mathrm {e}^{\beta \sqrt{\left| \underline{k}\left( N\right) \right| }\,(\varepsilon ^{(\sigma )}_1+\varepsilon ^{(\sigma )}_2+\ldots +\varepsilon ^{(\sigma )}_{n\left( N\right) })}\right] \nonumber \\\le & {} \frac{1}{\left| \underline{k}\left( N\right) \right| }\,\log \sum _{\left\{ \sigma \right\} } \mathbb {E}\left[ \mathrm {e}^{\beta \sqrt{\left| \underline{k}\left( N\right) \right| }\,(\varepsilon ^{(\sigma )}_1+\varepsilon ^{(\sigma )}_2+\ldots +\varepsilon ^{(\sigma )}_{n\left( N\right) })} \right] , \end{aligned}$$
(3.38)

where we used Jensen's inequality. Since the \(\varepsilon _i^{(\sigma )}\) are independent, the expectation value in the last row is the product of the moment generating functions:

$$\begin{aligned} \mathbb {E}\left[ \mathrm {e}^{\beta \sqrt{\left| \underline{k}\left( N\right) \right| }\,\varepsilon _i^{(\sigma )}}\right] =\mathrm {e}^{\frac{\beta ^2\left| \underline{k}\left( N\right) \right| }{2}a_i}\qquad \forall \,i, \end{aligned}$$
(3.39)

hence

$$\begin{aligned} -\beta F_{N}\left( \beta \right) \le \frac{1}{\left| \underline{k}\left( N\right) \right| }\,\log \sum _{\left\{ \sigma \right\} } \mathrm {e}^{\frac{\beta ^2\left| \underline{k}\left( N\right) \right| }{2}\sum _{i=1}^{n\left( N\right) }a_i} \le \frac{1}{\left| \underline{k}\left( N\right) \right| }\,\log \sum _{\left\{ \sigma \right\} }\mathrm {e}^{\frac{\beta ^2\left| \underline{k}\left( N\right) \right| }{2}}=\log 2 + \frac{\beta ^2}{2}, \end{aligned}$$
(3.40)

where we used the fact that \(\sum _{i=1}^{\infty }a_i=1\). \(\square \)
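The annealed bound just derived can be observed on a small instance. The sketch below (ours; tree, variances and temperature are toy values) estimates the quenched quantity \(\frac{1}{|\underline{k}|}\mathbb {E}[\log Z]\) by Monte Carlo over the disorder and compares it with \(\log 2 + \beta ^2/2\).

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(4)
k, a, beta = [1, 2, 1], [0.5, 0.3, 0.2], 1.0     # toy tree, sum(a) = 1
K = sum(k)
leaves = list(product(*[range(2 ** ki) for ki in k]))

def log_Z():
    # one disorder realization: an independent N(0, a_i) per edge at level i;
    # the edge used by leaf u between levels i and i+1 is its prefix u[:i+1]
    eps = [{p: rng.normal(0.0, np.sqrt(a[i])) for p in {u[:i + 1] for u in leaves}}
           for i in range(len(k))]
    H = np.array([-np.sqrt(K) * sum(eps[i][u[:i + 1]] for i in range(len(k)))
                  for u in leaves])               # Hamiltonian (3.15)
    return np.log(np.exp(-beta * H).sum())

quenched = np.mean([log_Z() for _ in range(2000)]) / K   # (1/|k|) E[log Z_N]
print(quenched <= np.log(2) + beta ** 2 / 2)             # annealed bound (3.40): True
```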

As a remark, we show in Lemma 3.5 in the Appendix that the third term on the right hand side of (3.35) is negligible when N is large. This fact is irrelevant for the proof, but it is interesting in itself: for different models we could have a similar situation but with the wrong sign, and a bound of this type could allow one to apply the generalized subadditive lemmas of [14].

3.3 Geometric Remarks

Since by Lemma 2.8 the density of free energy depends just on the metric structure of the Gaussian random variables, it is interesting to analyze the metric structure corresponding to the different models. Moreover, natural and interesting models can be introduced starting directly from the metric description. Since all the metric spaces involved must be Euclidean, a relevant characteristic is the dimension of the space where the metric can be realized as a collection of points.

Another useful remark is that the super-Pythagorean relation (3.11) implies (3.13), which gives the convergence (3.14) of the density of free energy to the infimum, while a sub-Pythagorean relation (i.e. (3.11) with the opposite inequality) would imply convergence of the density of free energy to the supremum.

Let us start with the Sherrington–Kirkpatrick model. Since the energy is defined in terms of \(N^2\) i.i.d. Gaussian random variables, the metric of the model with N sites can be represented by \(2^N\) points embedded in \({\mathbb {R}}^{N^2}\). A natural representation of this metric is the following. Consider \(\sigma \in \left\{ -1,1\right\} ^N\) as a column vector and define \(\widehat{\sigma }:=\frac{\beta }{\sqrt{N}}\,\sigma \sigma ^T\), a non-negative definite \(N\times N\) matrix of rank one (a multiple of the orthogonal projector onto \(\sigma \)). By a direct computation, the metric induced by the Sherrington–Kirkpatrick model is given by

$$\begin{aligned} d_N\left( \sigma ,\eta \right) =\sqrt{\mathrm {Tr}\left( \left( \widehat{\sigma }-\widehat{\eta }\right) ^2\right) }, \end{aligned}$$

i.e. it is the Euclidean metric on the rank-one matrices induced by the Hilbert-Schmidt scalar product. The super-Pythagorean relation is strictly related to the fact that all these matrices belong to the cone of non-negative definite matrices.
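This representation is easy to verify numerically; the sketch below (ours) checks that the Hilbert–Schmidt distance between the rank-one matrices reproduces formula (3.6).

```python
import numpy as np

rng = np.random.default_rng(5)
N, beta = 40, 0.7
s1, s2 = rng.choice([-1, 1], size=(2, N))

P1 = beta / np.sqrt(N) * np.outer(s1, s1)     # sigma_hat
P2 = beta / np.sqrt(N) * np.outer(s2, s2)     # eta_hat
d_HS = np.sqrt(np.trace((P1 - P2) @ (P1 - P2)))

dH = (s1 != s2).mean()                        # Hamming distance (3.7)
d_SK = beta * np.sqrt(8 * N * dH * (1 - dH))  # formula (3.6)
print(np.isclose(d_HS, d_SK))                 # True
```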

The metric structure of the GREM is best described with the trees illustrated in Sect. 3.2.

An interesting class of models can be introduced by directly defining the metric. Let \(v^{+1}, v^{-1}\in {\mathbb {R}}^2\) be two vectors. To any \(\sigma \in \left\{ -1,1\right\} ^N\) we associate the vector \(v\left( \sigma \right) \in {\mathbb {R}}^{2^N}\) defined by

$$\begin{aligned} v\left( \sigma \right) {:=}\otimes _{i=1}^N v^{\sigma (i)}. \end{aligned}$$
(3.41)

We have

$$\begin{aligned} \Big (v\left( \sigma \right) , v\left( \eta \right) \Big )=\prod _{i=1}^N\left( v^{\sigma (i)}, v^{\eta (i)}\right) . \end{aligned}$$

If we let \(v^{\pm 1}=v^{\pm 1, N}\) depend on N in such a way that \(|v^{\pm 1,N}|^2=N^{\frac{1}{N}}\) and \(\left( v^{1,N}, v^{-1, N}\right) =N^{\frac{1}{N}}\alpha ^{\frac{1}{N}}\), with \(\alpha \in [0,1)\), we obtain that the Euclidean metric on the \(2^N\) points embedded into \({\mathbb {R}}^{2^N}\) (the points corresponding to the vectors \(v(\sigma )\)) is given by

$$\begin{aligned} d_N(\sigma ,\eta )=\sqrt{2N\left( 1-\alpha ^{d^H_N(\sigma ,\eta )}\right) }. \end{aligned}$$
(3.42)

This Euclidean metric (as before, it is a function of a metric that is again a metric [13, 16]) satisfies the super-Pythagorean relation (3.11), since the real function \(1-\alpha ^x\) is concave. We have therefore convergence of the free energy densities. A model corresponding to this metric can be fixed in such a way that \({\mathbb {E}}\left[ H_N\left( \sigma \right) H_N\left( \eta \right) \right] \sim \alpha ^{d^H_N\left( \sigma ,\eta \right) }\). The special case \(\alpha =0\) corresponds to the REM model, a special case of the GREM discussed in Sect. 3.2 with a single level and all the leaves directly connected to the root. In this case all the points are equally spaced.
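Formula (3.42) can also be checked directly from the tensor-product definition (3.41); the following sketch (ours; N and α are arbitrary) builds v(σ) via iterated Kronecker products.

```python
import numpy as np
from functools import reduce

N, alpha = 8, 0.6
r2 = N ** (1 / N)                              # |v^{+-1,N}|^2 = N^{1/N}
c = alpha ** (1 / N)                           # so that (v^{1,N}, v^{-1,N}) = N^{1/N} alpha^{1/N}
vp = np.sqrt(r2) * np.array([1.0, 0.0])
vm = np.sqrt(r2) * np.array([c, np.sqrt(1 - c ** 2)])

def v(sigma):
    # v(sigma) in R^{2^N}: the tensor product (3.41)
    return reduce(np.kron, [vp if s == 1 else vm for s in sigma])

rng = np.random.default_rng(6)
s1, s2 = rng.choice([-1, 1], size=(2, N))
dH = (s1 != s2).mean()
print(np.isclose(np.linalg.norm(v(s1) - v(s2)),
                 np.sqrt(2 * N * (1 - alpha ** dH))))    # formula (3.42): True
```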