Abstract
The author has recently introduced a mean-field spin glass model with infinite connectivity (Talagrand, Mean Field Models for Spin Glasses, Springer, Berlin, 2011) that has the striking property that one of the parameters naturally occurring in the replica-symmetric equations is a function. We introduce a version of this model where the interaction is “diluted”, and where at high temperature the computation of the typical value of the overlaps requires the use of a random measure as an auxiliary parameter.
1 Introduction
The “classical” mean field models for spin glasses, such as the Hopfield model, the Sherrington–Kirkpatrick model and the Perceptron model, have been well understood at high enough temperature for some time [4, 5]. The ground-breaking work of Panchenko has now brought a rather complete understanding of the Sherrington–Kirkpatrick model at any temperature [1, 2], as well as the hope that the diluted version of this model will soon also be understood [3]. While these various classical models differ greatly from each other, they share many similarities, in a way which is probably not fully understood. To clarify this point, a mathematician is tempted to invent mean-field Hamiltonians with a somewhat canonical mathematical structure that exhibit new features compared with those classical models, even at high temperature. Interestingly, this does not seem to be so easy. One such model, the \(V\)-statistic model, is proposed in [4], and we describe it now. Consider independent standard Gaussian r.v.s \((g_{ik})\), \(i,k\ge 1\), and for \({\varvec{\sigma }}=(\sigma _i)_{i \le N}\) let
Consider a symmetric function \(u:\mathbb {R}^2 \rightarrow \mathbb {R}\). Given an integer \(M\), in [4] we study the system with Hamiltonian
Here, it helps to think of \(M\) as being a proportion of \(N\). Thus, in (1.2) (and if \(u\) is of order \(1\)) there are about \(N^2\) terms, each of size \(N^{-1}\). (The reason why the notation \(H({\varvec{\sigma }})\) does not indicate the values of \(M\) and \(N\) is that we like to consider \(N,M\) large, but fixed once and for all.) The most striking feature of this model is that apparently the replica symmetric equations require the use of a function as a parameter, even though this is a model with full connectivity, i.e. every spin interacts with all the others. This should be compared with the situation of the Sherrington–Kirkpatrick model at high temperature, which is described by a single parameter, while it is the diluted version of this model which requires a function as a parameter.
This unexpected feature motivated the present investigation of a “diluted” version of (1.2), namely
Here the quantities \(\eta _{kk'}\) are independent r.v.s that are independent of the r.v.s \(g_{ik}\), and that satisfy
where \(\gamma >0\) is a parameter. The quantity (1.3) consists now of about \(N\) terms, each of size about \(1\).
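To fix ideas, the counting argument above ("about \(N\) terms, each of size about \(1\)") can be checked numerically. The following sketch is purely illustrative and rests on assumptions, since the displayed equations (1.1)–(1.4) are not reproduced here: we take \(S_k({\varvec{\sigma }})=N^{-1/2}\sum _{i\le N}g_{ik}\sigma _i\), we take the diluted Hamiltonian to be a sum of \(u(S_k,S_{k'})\) over the retained pairs \(k<k'\), and we take the \(\eta _{kk'}\) to be i.i.d. Bernoulli of mean \(\gamma /M\) (so that each index interacts with about \(\gamma \) others); the bounded function `u` is a toy stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_disorder(N, M, gamma):
    """Sample the Gaussian couplings g_ik and the dilution variables eta_kk'.
    Hypothetical assumption: eta_kk' i.i.d. Bernoulli(gamma / M)."""
    g = rng.standard_normal((N, M))
    eta = rng.random((M, M)) < gamma / M
    eta = np.triu(eta, k=1)          # keep each unordered pair once
    return g, eta

def hamiltonian(sigma, g, eta, u):
    """Assumed form of the diluted Hamiltonian (1.3):
    a sum of u(S_k, S_k') over the retained pairs (k, k'),
    with S_k = N^{-1/2} sum_i g_ik sigma_i."""
    N = len(sigma)
    S = g.T @ sigma / np.sqrt(N)     # S_k(sigma), k = 1..M
    ks, kps = np.nonzero(eta)
    return sum(u(S[k], S[kp]) for k, kp in zip(ks, kps))

# toy bounded symmetric interaction, playing the role of u
u = lambda x, y: 0.1 * np.cos(x) * np.cos(y)

N, M, gamma = 40, 20, 2.0
g, eta = sample_disorder(N, M, gamma)
sigma = rng.choice([-1, 1], size=N)
H = hamiltonian(sigma, g, eta, u)
# about gamma*(M-1)/2 retained pairs, each contributing a term of order 1
print(eta.sum(), H)
```

With \(M=\alpha N\) the expected number of retained pairs is \(\gamma (M-1)/2\), i.e. of order \(N\), in contrast with the \(\sim N^2\) terms of (1.2).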
The picture of what might happen for the system governed by Hamiltonian (1.3) is considerably clarified if one keeps in mind the following principle:
By this we mean that they suitably decorrelate (as will be detailed below). Of course this principle is purely heuristic and is based only on the observation of a few situations.
Let us denote by \(G\) the Gibbs measure associated with the Hamiltonian (1.3), and by \(\Sigma _N=\{-1,1\}^N\) the space of configurations. We expect that (for a given typical realization of the disorder) for \(k \ne k'\) the maps \({\varvec{\sigma }}\mapsto S_k({\varvec{\sigma }})\) and \({\varvec{\sigma }}\mapsto S_{k'}({\varvec{\sigma }})\) are nearly independent (when seen as r.v.s on the space \((\Sigma _N,G)\)). Consider the probability \(\mu _k\) on \(\mathbb {R}\) that is the image of \(G\) under the map \({\varvec{\sigma }}\mapsto S_k({\varvec{\sigma }})\). It is a random probability, since it might (and will) depend on the realization of the r.v.s in (1.3), so we can think of it as a r.v. valued in the space \(\mathcal {M}\) of probability measures on \(\mathbb {R}\). For \(k \ne k'\), we expect that the r.v.s \(\mu _k\) and \(\mu _{k'}\) are nearly probabilistically independent. Their common law is a probability measure \(\overline{\mu }\) on \(\mathcal {M}\), or in other words, a random probability measure. This is the basic reason why the description of the system governed by the Hamiltonian (1.3) must involve a random probability measure. Probably a physicist would say that \(\overline{\mu }\) is the “order parameter of the system”. It would be very naive to expect to understand anything at all about the Hamiltonian (1.4) without describing \(\overline{\mu }\). The difference with the case (1.2) is that, for the Hamiltonian (1.2), the probability \(\mu _k\) is (almost) independent of the disorder (and \(k\)), and can be described using a single auxiliary function.
As the reader will soon realize, the study of the Hamiltonian (1.4) is of extreme technical difficulty. Therefore it makes sense to assume the strongest regularity conditions on \(u\) under which this Hamiltonian remains of interest. We will assume that for a certain number \(D\),
whenever \(w\) is a partial derivative of \(u\) of order \(\ell \), \(0 \le \ell \le 4\). Our results will be of the following nature. Given a value of \(\gamma \) and of \(\alpha =M/N\), we can describe the system (with accuracy that increases with \(N\)) provided \(DK(\alpha ,\gamma ) \le 1\), where \(K(\alpha ,\gamma ) \) is a constant that depends only on \(\alpha \) and \(\gamma \) (and stays bounded when \(\alpha \) and \(\gamma \) stay bounded). It would not require much extra work to obtain an explicit dependence on \(\alpha \) and \(\gamma \), but as this dependence is unlikely to be sharp, there is not much point in doing it. To simplify notation, throughout the paper, we denote by \(K\) a constant that might depend on \(\alpha \) and \(\gamma \) (but certainly not on \(N\)). This constant need not be the same at each occurrence. Similarly, we denote by \(L\) a universal constant (i.e. a number) that need not be the same at each occurrence.
We denote by \(z\) and \(\xi \) two independent standard r.v.s, that of course are independent of the r.v.s \(g_{ik}\). Given a number \(0 \le q \le 1\), we consider
so that the dependence of \(\theta \) on \(q\) is implicit. We denote by \(\mathsf {E}_\xi \) expectation in \(\xi \) only (given all the other r.v.s).
In order to state our results, our first task is, given \(q\) and an integer \(k\ge 0\), to define an operator \(T^q_k\) from the set of probability measures on \(\mathcal {M}\) to itself. We proceed as follows. Given a probability measure \(\overline{\mu }\) on \(\mathcal {M}\), we consider an i.i.d. sequence \((\mu _i)\) of \(\mathcal {M}\)-valued r.v.s, each of law \(\overline{\mu }\). Consider then the random probability measure \(\nu \) on \(\mathbb {R}\) given by
whenever \(f\) is continuous with compact support. We denote by \(T^q_k(\overline{\mu })\) the law of \(\nu \). Let us define \(\alpha =M/N\) and
and let us point out that \(\beta \) does not denote an inverse temperature!
We define
and we observe that this is a probability measure.
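The operator \(T^q_k\) can be experimented with numerically. The sketch below is hedged: the displayed formula (1.8) is not reproduced in this extraction, so we take \(\nu \), hypothetically, to be the tilted measure suggested by the proof of Lemma 2.1 (where the factors \(B_i=\int \exp u(\theta ,x)\,d\mu _i(x)\) appear as the denominator), namely \(\nu (f)=\mathsf {E}_\xi [f(\theta )\prod _{i\le k}B_i]/\mathsf {E}_\xi [\prod _{i\le k}B_i]\) with \(\theta =z\sqrt{q}+\xi \sqrt{1-q}\); the law \(\overline{\mu }\) and the function `u` are toy choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_mu():
    """One draw from a hypothetical law mu_bar: a random two-atom measure."""
    a = rng.standard_normal(2)               # random atom locations
    w = rng.dirichlet([1.0, 1.0])            # random weights
    return a, w

def one_nu(q, k, u, n_xi=2000):
    """Draw one realization nu of T^q_k(mu_bar), returned through its action
    nu(f): Monte Carlo over xi, with z and the measures mu_i held fixed."""
    z = rng.standard_normal()
    mus = [sample_mu() for _ in range(k)]
    xi = rng.standard_normal(n_xi)
    theta = z * np.sqrt(q) + xi * np.sqrt(1 - q)
    prod_B = np.ones(n_xi)                   # prod_{i<=k} B_i at each xi-sample
    for a, w in mus:
        prod_B *= np.exp(u(theta[:, None], a[None, :])) @ w
    def nu(f):
        return np.mean(f(theta) * prod_B) / np.mean(prod_B)
    return nu

u = lambda x, y: 0.1 * np.cos(x - y)         # toy bounded interaction
nu = one_nu(q=0.3, k=3, u=u)
m = nu(lambda x: x)                          # first moment of this realization
print(m)
```

Averaging many such realizations over \(k\) Poisson of expectation \(\beta \) would then sample the mixture that defines \(T^q\).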
Proposition 1.1
If \(KD \le 1\), for each \(0 \le q \le 1\) there is a unique probability measure \(\overline{\mu }_q\) on \(\mathcal {M}\) such that
Let us consider independent random measures \(\mu _i\) (\(i \ge 1\)), the common law of which is the measure \(\overline{\mu }_q\) of (1.11). In other words, \((\mu _i)\) is an i.i.d. sequence of \(\mathcal {M}\)-valued r.v.s of common law \(\overline{\mu }_q\). Let us consider standard Gaussian r.v.s \(z,\xi ^1,\xi ^2,\) and for \(\ell =1,2\) consider
We will denote by \(\mathsf {E}_\xi \) expectation in the r.v.s \(\xi ^\ell \) (\(\ell =1,2\)) only. Throughout the paper we write
Given an integer \(a \ge 1\), let \(\varvec{x}=(x_i)_{i \le a}\), and
Consider the quantities
Proposition 1.2
(The replica-symmetric equations) If \(KD \le 1\), there is a unique couple \(q,r\) with \(0 \le q\le 1\), \(r \ge 0\), that satisfies equation (1.15) together with
Our main result is as follows.
Theorem 1.3
If \(KD \le 1\), we have
where \(q\) is the number constructed in Proposition 1.2.
In (1.17) we follow our usual notation: for two configurations \({\varvec{\sigma }}^1,{\varvec{\sigma }}^2\), we have
and the bracket \(\langle \cdot \rangle \) means that the configurations are averaged independently for the Gibbs measure. Once (1.17) is proved, the road to computing the limiting free energy is wide open. A clean way to do this is explained in detail in [4]; it requires extending Theorem 1.3 to suitable interpolating models. This is better left for another day.
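The overlap and the replica bracket just described can be computed exactly for very small systems, which is a useful sanity check when thinking about quantities such as \(\nu _t((R_{1,2}-q)^2)\) later in the paper. The sketch below assumes the standard form \(R_{1,2}=N^{-1}\sum _{i\le N}\sigma ^1_i\sigma ^2_i\) and uses a toy random-field Hamiltonian (not the Hamiltonian (1.3)) so that the \(2^N\) configurations can be enumerated.

```python
import itertools
import numpy as np

def gibbs_weights(H, N):
    """Exact Gibbs weights over all 2^N configurations of a toy Hamiltonian."""
    confs = np.array(list(itertools.product([-1, 1], repeat=N)))
    w = np.exp([-H(s) for s in confs])
    return confs, w / w.sum()

def bracket_sq_overlap(H, N, q):
    """Exact < (R_{1,2} - q)^2 >: two replicas averaged independently
    under the same Gibbs measure."""
    confs, p = gibbs_weights(H, N)
    R = confs @ confs.T / N          # R_{1,2} for every pair of configurations
    return float(p @ (R - q) ** 2 @ p)

rng = np.random.default_rng(2)
h = rng.standard_normal(6) * 0.2
H = lambda s: -float(h @ s)          # toy random external field
val = bracket_sq_overlap(H, N=6, q=0.0)
print(val)
```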
The present model was invented largely to illustrate the following rather unexpected point: despite the fact that condition (1.17) holds (as is expected for a mean field model with infinite connectivity under a high-temperature condition) the determination of \(q\) requires the elaborate process of Propositions 1.1 and 1.2.
In [4] the author tried to develop systematic methods to control mean-field spin glass models at high enough temperature. He then expected that the proof of Theorem 1.3 would be a nice exercise demonstrating the effectiveness of these methods. But in the end, the proof of Theorem 1.3 we eventually found is several orders of magnitude more difficult than expected (and the work required is considerably larger than the work required to control any of the other classical models at high temperature). The author sees no a priori reason why this should be the case. Certainly he would never have started this project had he known that it would turn out to be so difficult. This could not be guessed beforehand. Technical difficulties keep cropping up in a seemingly endless stream. They reveal themselves only when one has advanced far enough to see them, and the goal seems to constantly recede as one advances. In the end these difficulties are solved using essentially an elaboration of the methods of [4], but it took considerable effort to find the proper combination of technical ingredients. Even the proof of Proposition 1.2 is rather delicate. It is therefore impossible to write all the proofs in full detail. This would run well over 100 pages, and would result in an impenetrable paper. Our approach does involve new technical ingredients, which may be of further use, and it does not seem appropriate to bury these in a mass of details. Thus the author has chosen the only sensible approach he sees: trying to explain as best he can what the difficulties are, and what techniques are required to overcome them. Each technique will be developed in complete detail in a simple case. Obtaining the actual proof of Theorem 1.3 requires the use of these techniques in more complicated cases which are simply more cumbersome to write.
2 Fixed points
In this section we prove the relatively easy Proposition 1.1. As is often the case, the proof uses a fixed point argument.
Given two probability measures \(\mu \) and \(\mu '\) on \(\mathbb {R}\), we denote by \(\delta (\mu ,\mu ')\) the total variation distance
where the supremum is taken over all continuous functions \(f\) with \(\sup |f| \le 1\). Given two probability measures \(\overline{\mu },\overline{\mu }'\) on \(\mathcal {M}\), we denote by \(\Delta \) their distance for the transportation-cost distance associated to \(\delta \), that is
where \((\mu ,\mu ')\) is a pair of random measures such that \(\overline{\mu }\) is the law of \(\mu \) and \(\overline{\mu }'\) is the law of \(\mu '\). Proposition 1.1 is a consequence of the following estimate.
Lemma 2.1
If \(D \le 1\) and \(L\beta D\le 1\) we have
We remind the reader that here and below, \(L\) denotes a universal constant, not necessarily the same at each occurrence. Let us also mention that we will always assume \(D \le 1\), which entails no loss of generality.
Proof
By definition of \(\Delta \), we can find a pair \((\mu ,\mu ')\) of random measures such that \(\mu \) (resp. \(\mu '\)) has \(\overline{\mu }\) (resp. \(\overline{\mu }'\)) as distribution, and
We denote by \((\mu _i,\mu _i')\) independent copies of the couple \((\mu ,\mu ')\). Consider then the measures \(\nu ,\nu '\) given by (1.8). We will estimate \(\mathsf {E}\delta (\nu ,\nu ')\). The most important observation is that
where
Since \(|u| \le D\) (and \(D \le 1\)) we have \(|v(\theta ,x)| \le 3D\) and thus
The rest of the estimates are straightforward. We denote respectively by \(A\) and \(B\) the numerator and the denominator of the quantity (1.8), and we denote with a \('\) the corresponding quantities for \(\mu '\). When \(|f| \le 1\) we have
Thus
Writing \(B_i=\int \exp u(\theta ,x)d\mu _i(x)\), we have
since \(|B_j| \le \exp kD\). A similar inequality holds for \(|A-A'|\), and (2.7) implies that
and hence
Since \(T_k^q(\mu )\) is the law of \(\nu \) and similarly for \(\nu '\), taking expectation and using the definition of \(\Delta \) shows that
and it follows from (1.10) that
and this proves the result. \(\square \)
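For concreteness, the distances \(\delta \) and \(\Delta \) of this section can be computed explicitly for atomic measures. The sketch below is an illustration, not part of the proof: for measures supported on a common finite set of atoms, the supremum over \(|f|\le 1\) reduces to the \(\ell ^1\) distance between the weight vectors, and for laws given by finitely many equally likely measures, the transportation cost \(\Delta \) reduces to a small assignment problem (solved here by brute force over pairings).

```python
import numpy as np
from itertools import permutations

def delta(w, w2):
    """delta(mu, mu') = sup_{|f|<=1} |int f dmu - int f dmu'| for two
    measures with the same atoms: the total variation sum |w - w'|_1."""
    return float(np.abs(np.asarray(w) - np.asarray(w2)).sum())

mu  = np.array([0.2, 0.5, 0.3])      # weights on atoms (-1, 0, 1), say
mu2 = np.array([0.4, 0.4, 0.2])
d = delta(mu, mu2)                   # |0.2-0.4| + |0.5-0.4| + |0.3-0.2| = 0.4
print(d)

def Delta_exact(measures1, measures2):
    """Transportation cost between two uniform laws on n measures each:
    the best pairing minimizes the average delta over matched pairs."""
    n = len(measures1)
    return min(
        sum(delta(measures1[i], measures2[pi[i]]) for i in range(n)) / n
        for pi in permutations(range(n))
    )

M1 = [mu, mu2]
M2 = [mu2, mu]
print(Delta_exact(M1, M2))           # 0.0: the swapping coupling is exact
```

The second computation illustrates why \(\Delta \) can be much smaller than any naive pairing: here the two laws coincide, so the optimal coupling has zero cost.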
3 How to prove Proposition 1.2?
The map \(r \mapsto q=\mathsf {E}\mathrm {th}^2(z\sqrt{r})\) is Lipschitz since its derivative is bounded (as is apparent through integration by parts). The proof will consist in showing that the map \(q \mapsto r=r(q)\), where \(r\) is given by (1.15) is Lipschitz with constant \(KD\). Thus for \(KD \le 1\), the map \(q \mapsto \mathsf {E}\mathrm {th}^2(z\sqrt{r(q)})\) has a Lipschitz constant \(\le 1/2\), and since it maps \([0,1]\) into itself, it has a unique fixed point.
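The fixed-point mechanism just described is easy to exhibit numerically. In the sketch below the true map \(q\mapsto r(q)\) of (1.15), which involves \(\overline{\mu }_q\), is replaced by a hypothetical toy map `r_of_q`; the point is only that when the composed map on \([0,1]\) is a contraction, plain iteration converges to the unique fixed point \(q=\mathsf {E}\,\mathrm {th}^2(z\sqrt{r(q)})\).

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.standard_normal(200_000)     # fixed Monte Carlo nodes for E over z

def E_th2(r):
    """Monte Carlo approximation of E th^2(z sqrt(r))."""
    return float(np.mean(np.tanh(z * np.sqrt(r)) ** 2))

def r_of_q(q):
    """Hypothetical stand-in for the map q -> r of (1.15)."""
    return 0.3 + 0.2 * q

q = 0.5
for _ in range(50):
    q_new = E_th2(r_of_q(q))
    if abs(q_new - q) < 1e-10:       # contraction: the gap shrinks geometrically
        break
    q = q_new

# at the fixed point, q = E th^2(z sqrt(r(q))) holds (within the tolerance)
print(q, abs(q - E_th2(r_of_q(q))))
```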
In order to keep away from the secondary obstacles created by the complicated formula (1.15) we will prove a simpler statement.
Proposition 3.1
Consider a function \(f\) on \(\mathbb {R}\) such that \(|f^{(\ell )}| \le D\) for \(\ell =0,1,2\). Then the map
where \(\mu \) is a random measure of law \(\overline{\mu }_q\) (given by Proposition 1.1) is Lipschitz with constant \(\le KD\).
The main difficulty in the proof is that there do not seem to exist strong continuity properties of the map \(q \mapsto \overline{\mu }_q.\) The situation is somewhat similar to the following. If \(\nu \) denotes the random measure that puts a unit mass at the point \(z\sqrt{q}\), where \(z\) is standard Gaussian, the smoothness of the map \(q \mapsto \mathsf {E}\int f(x)d\nu (x)\) (\(=\mathsf {E}f(z\sqrt{q})\)) is revealed only through integration by parts.
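The example above can be checked numerically. Differentiating \(\mathsf {E}f(z\sqrt{q})\) in \(q\) brings out the seemingly dangerous factor \(z/(2\sqrt{q})\), but Gaussian integration by parts, \(\mathsf {E}\,zU(z)=\mathsf {E}\,U'(z)\), turns the derivative into the manifestly bounded \(\frac{1}{2}\mathsf {E}f''(z\sqrt{q})\). The sketch compares a finite difference with this integrated-by-parts form for a toy choice of \(f\).

```python
import numpy as np

rng = np.random.default_rng(4)
z = rng.standard_normal(2_000_000)

f = np.cos                               # toy smooth bounded f
fpp = lambda x: -np.cos(x)               # its second derivative

q, eps = 0.4, 1e-4
# direct finite difference of q -> E f(z sqrt(q)), same z-samples on both sides
finite_diff = (np.mean(f(z * np.sqrt(q + eps))) -
               np.mean(f(z * np.sqrt(q - eps)))) / (2 * eps)
# integration-by-parts form: (1/2) E f''(z sqrt(q))
ibp_form = 0.5 * np.mean(fpp(z * np.sqrt(q)))
print(finite_diff, ibp_form)             # the two agree up to Monte Carlo error
```

For \(f=\cos \) one even knows \(\mathsf {E}\cos (z\sqrt{q})=e^{-q/2}\), so both quantities should be near \(-\tfrac{1}{2}e^{-q/2}\).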
The defining property of \(\overline{\mu }_q\) is the relation
and this must be the basis of any proof. Consider an i.i.d. sequence \((\mu _i)\) of random measures of law \(\overline{\mu }_q\), and a Poisson r.v. \(k\) of expectation \(\beta \). (Of course, all these are independent, and this will not be mentioned any more.) Then (if \(\theta =z\sqrt{q}+\xi \sqrt{1-q}\)) it follows from (3.1) that
where the equality is in distribution. In the right hand side, some dependence on \(q\) is explicit through \(\theta \), and some is implicit through \(\mu _i\). It is natural to expect that if we iterate the process leading to (3.2) a large number of times, we will make most of the dependence on \(q\) explicit. To perform the next level of iteration, we consider independent standard Gaussian r.v.s \(\xi _i,z_i\), independent Poisson r.v.s \(k_i\) of expectation \(\beta \), and i.i.d. random measures \(\mu _{ij}\) distributed like \(\overline{\mu }_q\). We write \(\theta _i=z_i\sqrt{q}+\xi _i\sqrt{1-q}\), and we denote by \(\mathsf {E}_\xi \) expectation in \(\xi \) and \(\xi _i\). Then we see that
The problem we face with this expression is the fact that the quantities
become larger as the number of iterations increases, and one does not see how one could control them uniformly. To get around this basic difficulty, rather than (3.2) we write
where we recall the notation \(v(x,y)=\exp u(x,y)-1\). This expression makes it apparent that the influence of \(\mu _i\) is felt only through the integration of the small function \(v\). Denoting by \(\mathsf {E}_i\) expectation in \(\xi _i\) only, rather than (3.3) we write
where
This formula makes it clear that the influence of \(\mu _{ij}\) is felt only through the integration of the small function \(v(\theta _i,x)\), and is itself dampened by the small factor \(v(\theta ,\theta _i)\), a “dampening effect” of order \(D^2\). This dampening effect is of course why we can define \(\overline{\mu }_q\) in the first place. It is brought forward explicitly in the formula (3.4). The drawback is of course that (3.4) is considerably more complicated than (3.3), and it will require a significant effort to define precisely what it means to iterate the previous process. This will be done in Sect. 4.
Much of the complication is created by the (growing) stack of fractions obtained when iterating (3.4). One could hope of course that the difficulty would decrease if one could discover a suitable “induction hypothesis” to control the map \(q \mapsto \overline{\mu }_q\); our long efforts in that direction were fruitless.
4 Trees and operators
In this section we build the machinery needed to continue the process started in (3.4). This machinery will also be required in the proof of Theorem 1.3.
The data of the numbers \(\eta _{kk'}\) is equivalent to the data of a (standard) graph on \(\{1,\ldots ,M\}\). Given a number \(p\), we will consider the subgraph consisting of the points that can be reached from \(M\) in \(p\) steps on the graph. A fundamental fact (which is completely standard in the study of diluted models) is that for large \(M\) this graph looks like a Poisson random tree (where the number of offspring of an individual is Poisson of expectation \(\beta \)). So, it is natural to consider trees on \(\mathbb {N}^*\). For us a tree (rooted at \(M\)) is the following structure, defined by induction on \(p\), the depth of the tree. A tree of depth 1 is simply a subset \(I=I(M) \subset \mathbb {N}^* {\setminus }\{M\}\), the set of offspring (of the first generation) of \(M\). Having defined a tree of depth \(p\), and for \(\ell \le p\) the set \(J_\ell \) of offspring of \(M\) of the \(\ell ^{\mathrm{th}}\) generation, a tree of depth \(p+1\) consists of the further data, for each \(\tau \in J_p\), of the set \(I(\tau )\) of offspring of \(\tau \), and \(J_{p+1}=\bigcup _{\tau \in J_p}I(\tau )\). The sets \(I(\tau )\) are assumed to be pairwise disjoint and disjoint from the sets \(J_\ell \), \(\ell \le p\), so the sets \(J_\ell \), \(\ell \le p+1\), are all disjoint.
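The inductive construction above is easy to simulate, which may help fix ideas. The following sketch grows a Poisson random tree generation by generation down to depth \(p\): each individual independently has a Poisson(\(\beta \)) number of offspring, the lists `gens` play the role of the sets \(J_\ell \), and the dictionary `children` plays the role of the sets \(I(\tau )\) (labels are fresh integers, so the disjointness requirements hold automatically).

```python
import numpy as np

rng = np.random.default_rng(5)

def poisson_tree(beta, p, root=0):
    """Sample a Poisson(beta) random tree of depth p rooted at `root`."""
    next_label = root + 1
    children = {root: []}            # I(tau) for each node tau
    generations = [[root]]           # J_0, J_1, ..., J_p
    for _ in range(p):
        new_gen = []
        for tau in generations[-1]:
            k = rng.poisson(beta)    # card I(tau), Poisson of expectation beta
            kids = list(range(next_label, next_label + k))
            next_label += k
            children[tau] = kids
            for c in kids:
                children[c] = []
            new_gen.extend(kids)
        generations.append(new_gen)
    return children, generations

beta, p = 1.5, 4
children, gens = poisson_tree(beta, p)
sizes = [len(g) for g in gens]
# E card J_l = beta^l: the generations grow (or die out) geometrically
print(sizes)
```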
Consider numbers \(z_\tau ,q_\tau \ge 0\), \(\rho _\tau \ge 0\) for \(\tau \in \mathbb {N}^*\), and i.i.d. standard Gaussian r.v.s \(\xi _\tau \). We set \(\theta _\tau =z_\tau \sqrt{q_\tau }+\xi _\tau \sqrt{\rho _\tau }\). Consider the set \(\mathcal {D}\) of continuous functions \(g:\mathbb {R}\rightarrow [e^{-D}-1,e^D-1]\). Given the tree \(\mathcal {G}\) we will define by induction on the depth of \(\mathcal {G}\) a map \(W_\mathcal {G}:\mathcal {D}^{\mathbb {N}^*} \rightarrow \mathbb {R}\). This map depends on the quantities \(z_\tau ,q_\tau ,\rho _\tau \), but the dependence is kept implicit for the time being. This map has the property that for \(\varvec{h}=(h_\tau )_{\tau \in \mathbb {N}^*} \in \mathcal {D}^{\mathbb {N}^*}\), the quantity \(W_\mathcal {G}(\varvec{h})\) depends only on the functions \(h_\tau \) for \(\tau \in J_p\), where \(p\) is the depth of \(\mathcal {G}\). (In words, \(W_\mathcal {G}(\varvec{h})\) depends on the functions \(h_\tau \) of the last generation.) This will be obvious from the definitions, and is required for these definitions to make sense.
For \(p=1\), letting \(I=I(M)=J_1\) we set
where as usual \(\mathsf {E}_\xi \) denotes expectation in the r.v.s \(\xi _\tau \) only (here in \(\xi _M\) only). When \(I=\varnothing \), this formula means that \(W_\mathcal {G}(\varvec{h})=\mathsf {E}_\xi f(\theta _M)\), so in this case \(W_\mathcal {G}(\varvec{h})\) does not depend on \(\varvec{h}\).
Assuming now that the map \(W_\mathcal {G}\) has been defined for a tree of depth \(p\), we define it for a tree of depth \(p+1\). Let \(\mathcal {G}\) be such a tree. To \(\mathcal {G}\) we canonically associate a tree \(\mathcal {G}^*\) of depth \(p\) by “forgetting the last generation”. Considering functions \((h_\tau )_{\tau \in J_{p+1}}\), \(h_\tau \in \mathcal {D}\), we define functions \((h^*_\tau )_{\tau \in J_p}\) by
When \(I(\tau )=\varnothing \), this is simply \(\mathsf {E}_\xi v(x,\theta _\tau )\). We observe that \(h^*_\tau \in \mathcal {D}\). To see this we simply note e.g. that \(h_\tau ^*(x) \le \max _y v(x,y)\le e^D -1.\) We set \(\varvec{h}^*=(h^*_\tau )_{\tau \in J_p}\) and we define
which is well defined since \(W_{\mathcal {G}^*}(\varvec{h}^*)\) depends only on \(h_\tau ^*\) for \(\tau \in J_p\).
Our first task is to learn how to control the derivatives of the map \(W_\mathcal {G}\). This map may be seen as a function of the variables \(h_\tau \), \(\tau \in J_p\). We have \(h_\tau \in \mathcal {D}\subset \mathcal {C}\), the space of bounded continuous functions on \(\mathbb {R}\). We denote by \(D^1_{\tau ,\mathcal {G}}\) the derivative (\(=\)differential) of \(W_\mathcal {G}\) with respect to \(h_\tau \). It is a linear functional on \(\mathcal {C}\). Our goal is to control its size. For this it is convenient to introduce for \(\tau \in J_p\) the number \(b(\tau )\) defined as follows. There is a (unique) “genealogy” \(i_0=M,i_1,\ldots ,i_p=\tau \) of \(\tau \) with \(i_\ell \in I(i_{\ell -1})\) for \(1 \le \ell \le p\), and we set
Lemma 4.1
For each tree \(\mathcal {G}\) of depth \(p\), each \(\varvec{h}\in \mathcal {D}^{\mathbb {N}^*}\) and each \(\tau \in J_p\), we can find a signed measure \(\nu _{\mathcal {G},\tau }\) on \(\mathbb {R}\) such that
with
In (4.6) \(|\nu |\) denotes the total variation of \(\nu \). Of course \(\nu _{\mathcal {G},\tau }\) depends also on \(\varvec{h}\), the numbers \(z_\tau \), etc., but this dependence is kept implicit.
Proof
For \(p=1\), \(j \in I(M) =J_1\), we get from (4.1) by differentiation that
and, very crudely, since \(\vert f \vert \le D\),
For general \(p \), the result is proved by induction over \(p\) by differentiation of the recursion formula (4.3). This is cumbersome to write, but straightforward, using that \(e^{-D} \le 1+h_\tau (x) \le e^D\) and \(|v(x,y)| \le 2D\). \(\square \)
To understand the power of the estimate (4.6), we observe that from (4.6) we have
In the case where \(\mathcal {G}\) is a Poisson random tree, we see by induction on \(p\) that (provided \(KD \le 1\)) the expected value of the right-hand side decreases geometrically as \(p\) increases, which expresses that \(\mathsf {E}W_\mathcal {G}(\varvec{h})\) becomes essentially independent of its arguments.
Of course, since integration by parts will be required, we cannot get by using only first order derivatives. Thinking again of \(W_\mathcal {G}\) as a function of the variables \((h_\tau )\), we denote by \(D^2_{\tau _1,\tau _2,\mathcal {G}}\) its second order derivative with respect to the variables \(h_{\tau _1}\) and \(h_{\tau _2}\). It is a bilinear functional on \(\mathcal {C}\times \mathcal {C}\). We need to control its “size” in an appropriate sense. The crucial information is contained in the following.
Lemma 4.2
For each tree \(\mathcal {G}\), of depth \(p\), each \(\tau _1,\tau _2 \in J_p\) we can find a signed measure \(\nu _{\mathcal {G},\tau _1,\tau _2}\) on \(\mathbb {R}\times \mathbb {R}\) such that
This is even more cumbersome to write than Lemma 4.1, but equally straightforward.
5 Proof of Proposition 3.1
For \(\tau \in \mathbb {N}^*\) we consider independent random measures \(\mu _\tau \) distributed like \(\overline{\mu }_q\). Consider the random family of functions \(\varvec{h}=(h_\tau )_{\tau \in \mathbb {N}^*}\) given by
Assume that (with the notation of the previous section) we have \(q_\tau =q\), \(\rho _\tau =1-q\) for each \(\tau \), and that the numbers \(z_\tau \) are independent standard normal r.v.s (independent of the \(\mu _\tau \)). Iterating the relation \(\overline{\mu }_q=T^q(\overline{\mu }_q)\) \(p\) times should make it obvious that if \(\mu \) is a random measure distributed like \(\overline{\mu }_q\), we have
where \(\mathcal {G}_p\) is a Poisson random tree of parameter \(\beta \), i.e., for each \(\ell <p\) the cardinalities of the sets \(I(\tau )\), \(\tau \in J_\ell \), are probabilistically independent, and their laws are Poisson of expectation \(\beta \). Formula (5.1) gives a precise meaning to the expression “we iterate the method of (3.4)”.
We have shown in the previous section [through (4.5) and (4.6)] that as \(p\) grows, \(\mathsf {E}W_{\mathcal {G}_p}(\varvec{h})\) becomes essentially independent of its arguments (as will be detailed below). Thus, denoting by \(\mathbf 0 \) the family of functions all of whose components are the zero function, we have
and the limit is uniform in \(q\). Our goal is to prove that the left-hand side of (5.2) is a \(KD\)-Lipschitz function of \(q\). The point of (5.2) is that it suffices to prove that \(\mathsf {E}W_{\mathcal {G}_p}(\mathbf 0 )\) is a \(KD\)-Lipschitz function of \(q\), and all the dependence on \(q\) has been made explicit in this expression. Going back to the case of general values of \(q_\tau \) and \(\rho _\tau \), we see that it suffices to prove that for each \(p\) we have
where \(K\) is of course independent of \(p\), and the summation is over all \(\tau \in \mathbb {N}^*\).
Fixing \(\ell \le p\), and conditionally on \(\mathcal {G}_p\), let us first study how \(W_{\mathcal {G}_p}(\mathbf 0 )\) depends on \(\rho _\tau \), where \(\tau \in J_\ell \) (the set of offspring of \(M\) of the \(\ell {\mathrm {th}}\) generation). Let us denote by \(\mathcal {G}'\) the restriction of \(\mathcal {G}_p\) to the first \(\ell \) generations. Then the operator \(W_{\mathcal {G}'}\) does not depend on \(\rho _\tau \), as is obvious from the induction formula (4.3). Moreover, none of the stages of the recursion beyond the \(\ell \)th stage ever involves the quantity \(\rho _\tau \). In other words, we have
where \(\varvec{h}=(h_{\tau '})_{\tau ' \in J_\ell }\) and where
for functions \(h_i\), \(i \in J_{\ell +1}\) that do not depend on \(\rho _\tau \). Thus we have
because among the functions (5.4), only the one for \(\tau =\tau '\) depends on \(\rho _{\tau }\). Let us observe that (5.4) makes it obvious using (1.6) that
Also, since the functions \(h_i\) of (5.4) are themselves defined through formulas similar to (5.4), they satisfy
We compute \(\partial h_\tau / \partial \rho _\tau \) using (5.4) for \(\tau =\tau '\), and keeping in mind that
The dangerous factor \(1/\sqrt{\rho _\tau }\) is removed through integration by parts with respect to \(\xi _\tau \), using (5.7) [that is, using the formula \(\mathsf {E}\xi _\tau U(\xi _\tau )=\mathsf {E}U'(\xi _\tau )\) for reasonably behaved functions \(U\)], which yields very crudely a bound
where \(a(\tau )=\mathrm {card}I(\tau )\). Combining with (5.5) and Lemma 4.1 we get
so that if \(\mathsf {E}_0\) denotes expectation given \(\mathcal {G}'\), and since \(\mathsf {E}a(\tau )^2 \exp 4a(\tau )D \le K\), we have
and
By recursion over \(\ell \), we get that \(\mathsf {E}\sum _{\tau \in J_\ell }\exp LDb(\tau ) \le K^{\ell -1}\) (since \(\mathsf {E}k\exp Dk \le K\) where \(k\) is a Poisson r.v. with \(\mathsf {E}k=\beta \)) and thus
and, by summation over \(\ell \) and since \(KD \le 1/2\),
which proves “half” of (5.3). To prove the other half, exactly as in (5.5), we have, for \(\tau \in J_\ell \)
where \(h_\tau \) is given by (5.4), and
so that
The dangerous factor \(1/\sqrt{q_\tau }\) will be removed by taking expectation in \(z_\tau \) (which we denote \(\mathsf {E}_\tau \)) and using the integration by parts formula \(\mathsf {E}_\tau (z_\tau U(z_\tau ))=\mathsf {E}U'(z_\tau )\) for a well behaved function \(U\), so that (abusing notation)
since we must not forget that \(D^1_{\tau ,\mathcal {G}'}\) depends on \(z_\tau \) through its argument \(h_\tau \). Using (5.4) and (5.7) we get the bounds
and using Lemma 4.2 we conclude easily as before, completing the proof of (5.3) and of Proposition 1.2. \(\square \)
6 Starting the cavity method
In mean-field models with infinite connectivity, the idea of the cavity method is to guess the behavior of the last spin under the Hamiltonian, and to construct an interpolating Hamiltonian that witnesses this behavior. For \(0\le t\le 1\), we consider the Hamiltonian
where \(r\) is as in Proposition 1.2 and where
Thus, for \(t=1\), \(H_t\) is the Hamiltonian (1.3), while if \(t=0\), the Hamiltonian \(H_t\) acts on the last spin simply through a random external field.
We denote by \(\langle \cdot \rangle _t\) an average for the Gibbs measure with Hamiltonian (6.1), and we write \(\nu _t=\mathsf {E}\langle \cdot \rangle _t\). By symmetry between the sites, we have
with our usual notation \(\varepsilon _\ell =\sigma _N^\ell \). Since \(S_{k,0}\) does not depend on \(\sigma _N\), it is easily seen that
because \(q=\mathsf {E}\mathrm {th}^2(z\sqrt{r})\), and thus
Thus,
where
The idea is to prove that for each \(t\) we have
This yields (1.17) when compared with (6.3). It must be stressed that weaker estimates such as
are pretty much useless.
Introducing a third replica \({\varvec{\sigma }}^3\), we have
Using the notation (1.12), and writing for simplicity \(S^\ell _{k,t}=S_{k,t}({\varvec{\sigma }}^\ell )\), \(\varepsilon _\ell =\sigma _N^\ell \), we have
making the convention that
To obtain the value of \(d\nu _t(h)/dt\), we substitute (6.6) into (6.5) and we integrate by parts with respect to the Gaussian r.v.s \(g_{kN}\) and \(z\). It is explained in great detail in [4] how to perform this type of computation so we give only the result. We introduce the following quantities
Proposition 6.1
We have
where
We will prove that under the condition \(KD \le 1\) we have
Combining these with (6.3) and (6.12) yields
and finishes the proof.
The easiest part is (6.20), which we prove now. For this, we use the formula (6.12) for \(h=(R_{1,2}-q)^2\) and we get, using trivial estimates,
To handle this, we observe that by symmetry
Let us denote by \(\langle \cdot \rangle _1\) an average for the Hamiltonian (where the difference with (6.1) is that the summation is over fewer choices of \(k_1\) and \(k_2\))
(Let us also observe that the notation \(\langle \cdot \rangle _1\) does not mean \(\langle \cdot \rangle _t\) for \(t=1\)). Thus if
we have
Now the average \(\langle \cdot \rangle _1\) does not depend on the r.v. \(\eta _{kM}\), and
Only trivial estimates are then required to show that
and that
Since \(r \le KD\), it follows that
and this proves (6.20) by integration.
The proofs of (6.17), (6.18), (6.19) are somewhat similar, and the rest of the paper is devoted to proving that
7 The approach
By symmetry we have
The Gibbs measure depends on the r.v.s \(\eta _{kM}\). Denoting as before \(\langle \cdot \rangle _1\) an average for the Gibbs measure with Hamiltonian (6.21), we have (with \(\alpha =M/N\))
where \(\mathcal {E}\) is as in (6.22). Let \(I=\{k;\, \eta _{kM}=1\}\). We will work conditionally given \(I\). We will estimate
where now
By “estimating” we mean finding a simpler quantity equal to it within an error term \(KD(\nu _t((R_{1,2}-q)^2)+1/N)\). Such a term will henceforth be called a small error term.
Let us consider \(z^1,\xi ^1,\xi ^2\) independent r.v.s and denote by \(\mathsf {E}_\xi \) expectation in the r.v.s \(\xi ^\ell \) only. An idea absolutely central to the whole paper is that we make only a small error when we replace the quantity (7.2) by
where now
There is a general principle here that will be used several times: within a small error, different replicas \(S^\ell _M\) can be replaced by the corresponding \(\theta ^\ell \), provided one takes suitable expectations \(\mathsf {E}_\xi \). This idea is central e.g. in [4, Chapter 2]. Of course the issue will be to obtain a suitable control of these error terms.
Certainly we expect that the system with Hamiltonian (6.21) will be very close to the original system. Therefore, we expect that under the corresponding Gibbs measure the quantities \(S_{k,t}\) are nearly probabilistically independent. Their laws are random probability measures on \(\mathbb {R}\), and they should resemble independent random measures with distribution \(\overline{\mu }_q\), the object constructed in Proposition 1.1. Moreover, there seems to be no reason why the laws of the \(S_{k,t}\) under Gibbs measure should be in any way seriously correlated with the action of the Gibbs measure on \(h\). Therefore one should expect that the quantity (7.3) should be nearly equal to
where now \(\varvec{x}= (x_k)_{k \in I}\) and
If \(a=\mathrm {card}I\), through symmetry and recalling the notation (1.13) and (1.14), the quantity (7.4) is nearly
The law of \(a\) is nearly Poisson of expectation \(\beta \), so that the value of \(r\) given by (1.15) is the one that should ensure that \(B(1,2) \simeq C(1,2)\).
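To illustrate why the law of \(a\) is nearly Poisson, here is a small numerical sketch; the encoding (i.i.d. Bernoulli \(\eta _{kM}\) of parameter \(\gamma /N\)) and all parameter values are our own illustrative choices, not formulas from the paper:

```python
import math

# Hypothetical numerical illustration: if the dilution variables eta_{kM} are
# i.i.d. Bernoulli(gamma/N), then a = card I is Binomial(M, gamma/N), which is
# close in total variation to a Poisson law of expectation beta = M * gamma / N.

def binom_pmf(M, p, j):
    return math.comb(M, j) * p ** j * (1 - p) ** (M - j)

def poisson_pmf(beta, j):
    # computed through lgamma to avoid huge factorials
    return math.exp(-beta + j * math.log(beta) - math.lgamma(j + 1))

def tv_distance(M, gamma, N, jmax=60):
    """Truncated total variation distance between card I and Poisson(beta)."""
    p = gamma / N
    beta = M * p
    return 0.5 * sum(abs(binom_pmf(M, p, j) - poisson_pmf(beta, j))
                     for j in range(jmax + 1))

# At fixed alpha = M/N and gamma, the approximation improves as N grows.
for N in (100, 1000, 10000):
    print(N, tv_distance(N // 2, 2.0, N))
```

By Le Cam's inequality the total variation distance is at most \(M(\gamma /N)^2=\gamma \beta /N\), so the Poisson approximation improves as \(N\) grows at fixed \(\alpha \) and \(\gamma \).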
Of course, the bulk of the work is to prove that, with respect to the Gibbs measure, the quantities \(S_{k,t}\) do have the expected behavior, within small error terms. The only way we could imagine to approach the problem is through a kind of iteration. This iteration will face the difficulties that occur in the proof of Proposition 1.2, as well as many others. In order to be able to communicate something, rather than attempting to evaluate the complicated quantity (4.1), we will focus on a project of the same nature, but where the formulas are less cumbersome. We will show how to evaluate within a small error term the quantity
Here \(h\) does not have the same meaning as above; it is the function that was previously denoted \(\varepsilon _1\varepsilon _2 h\), i.e.
It satisfies
Our error terms need to be like \(\nu _t((R_{1,2}-q)^2)\). One factor \(R_{1,2}-q\) is provided by (7.6).
It is clear that the case \(t \ne 1\) of the Hamiltonian (6.1) is somehow a perturbation “of order \(1/N\)” of the case \(t=1\). It greatly simplifies the exposition (without putting any real difficulty under the carpet) to study the quantity (7.5) only when \(t=1\), i.e., the quantity
(Moreover the letter \(t\) becomes available for other purposes!) This is what we will do. We will assume that the function \(f\) is as in Proposition 3.1. Let us note right away that the factor \(D\) that occurs in the definition of a small error term, i.e., a term that is bounded by \(KD(\nu _t((R_{1,2}-q)^2)+1/N)\), is provided by the fact that \(f\) is of “order \(D\)”.
In a first stage, there seems to be no other way than making explicit the dependence of the Gibbs measure on \(S^1_M\), as we have already done several times. That is, if
and \(\langle \cdot \rangle _1\) denotes an average with respect to the Gibbs measure with Hamiltonian
we have
To simplify this expression, we can then argue that this is about
where \(\theta ^\ell =z\sqrt{q}+\xi ^\ell \sqrt{1-q}\) and \(\mathcal {E}(\theta )=\exp \sum _{\ell \le 2}\sum _{k \in I}u(S^1_k,\theta ^\ell )\), and where \(\mathsf {E}_\xi \) denotes expectation in \(\xi \) only. (This is again the idea of (7.3).) If we wish to iterate this procedure we must then make explicit the dependence of the average \(\langle \cdot \rangle _1\) on the variables \(S_k\), but then there is simply no chance that the error terms become smaller as the iteration progresses, as is required to keep control. As a first move to solve the problem, we will prove the following
Claim 1
We make only a small error when we replace \(\mathsf {E}\langle hf(S_M^1)\rangle \) by
So, the idea is now to study the quantity (7.12). Rather than (7.8), we write
so that
Let us observe that on the right-hand side we have \(\langle h\rangle \) and NOT \(\langle h \rangle _1\). Let us write \(\theta =z\sqrt{q}+\xi \sqrt{1-q}\) and denote as usual by \(\mathsf {E}_\xi \) expectation in \(\xi \) alone.
Claim 2
We make only a small error when we replace (7.13) by
where \(\mathcal {E}(\theta )=\exp \sum _{k \in I} u(S_k,\theta ).\)
The main difficulty here is that the Gibbs average \(\langle \cdot \rangle \) does depend on \(S_M\), and consequently (7.14) is not business as usual [as was, e.g., (7.3)]. Yet we must face this difficulty. We cannot hope that the error terms will become small as the iteration progresses if at each step we make any change to the quantity \(\langle h\rangle \).
We have, with our usual notation \(v(x,y)=\exp u(x,y)-1\),
Claim 3
We make only a small error when we replace (7.14) by
At this stage the situation looks brighter: the quantities \(S_k\) are involved only through the average of the small functions \(v(S_k,\theta )\), witnessing the dampening effect that is needed to control the error terms as the induction progresses. Also, we recognize that if we denote by \(W_I\) the operator (4.1) (with \(z_M=z, q_\tau =q, \rho _\tau = 1-q\)) then the quantity (7.15) is simply
where, for \(\tau \in I\),
Our first task will be to prove Claims 1 to 3. (As it turns out, the key points of Claims 1 to 3 are nearly identical.) These claims together prove that \(\nu (h f(S_M^1))\) is nearly \(\mathsf {E}\langle h\rangle W_I (h)\), which is the first step of the basic induction. Iteration of this procedure shows that \(\nu (h f(S_M^1))\) is nearly \(\mathsf {E}\langle h\rangle W_{\mathcal {G}}(\varvec{h})\), for a suitable tree \(\mathcal {G}\) of great depth and a suitable family \(\varvec{h}\). But then \(W_\mathcal {G}(\varvec{h})\simeq W_\mathcal {G}(\mathbf 0 )\) and \(C=W_\mathcal {G}(\mathbf 0 )\) satisfies \(\mathsf {E}C \simeq \mathsf {E}\int fd\mu \). A last effort will then be required to prove that actually \(\mathsf {E}\langle h\rangle C \simeq \mathsf {E}\langle h\rangle \mathsf {E}C\), concluding the proof. This will be done in Sect. 14.
8 A decorrelation property: proof of Claim 1
Claim 1 can be called a decorrelation property because it asserts that in some sense the quantities \(h\) and \(f(S^1_M)\) are not correlated. Such properties will be essential.
To bound
the only approach we could conceive is to make explicit the dependence of the bracket on \(S_M\), to replace (modulo a small error) in the corresponding expression the quantities \(S_M^\ell \) by suitable Gaussian r.v.s \(\theta ^\ell \) as in (7.3), and then to find a way to iterate this process.
Given a subset \(J\) of \(\{1,\ldots ,M\}\) let us denote by \(\langle \cdot \rangle _J\) an average for the Gibbs measure with Hamiltonian
We start with a simple observation that will be very useful.
Lemma 8.1
If \(KD \le 1\), for any function \(g\) on 2 replicas, \(g \ge 0\), we have
Proof
Let \(U=\sum \eta _{kk'}\), where the summation is over the pairs \(k,k' \le M\) with either \(k \in J\) or \(k' \in J\). Then we obviously have
where \(\exp (-2DU) \le \mathcal {E}\le \exp 2DU\) so that
and the quantities on the right are probabilistically independent, so the result follows taking expectations. \(\square \)
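The expectation taken in this last step can be made explicit; here is a sketch, assuming (as the Poisson approximation of Sect. 7 suggests) that the \(\eta _{kk'}\) are i.i.d. Bernoulli of parameter \(\gamma /N\):

```latex
% U is a sum of m independent Bernoulli(gamma/N) variables, where m <= 2 M card J
% is the number of pairs (k,k') with k in J or k' in J.  Hence, for lambda > 0,
\[
\mathsf {E}\exp (\lambda U)
 =\Bigl(1+(e^{\lambda }-1)\frac{\gamma }{N}\Bigr)^{m}
 \le \exp \Bigl(m(e^{\lambda }-1)\frac{\gamma }{N}\Bigr)
 \le \exp \bigl(2\alpha \gamma (e^{\lambda }-1)\mathrm {card}\,J\bigr),
\]
% with alpha = M/N; applied with lambda of order D this bounds, uniformly in N,
% the expectation of the exponential factors produced above.
```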
We would like, under certain conditions, to bound the quantity
In words, we want to “quantify the cost” of exchanging (or commuting) the two terms \(g_1\) and \(g_2\) between brackets. The conditions we will impose on the various quantities are simply those that we find convenient to keep the (implicit) iteration going. We require that \(g_1=g_1(S_1^1,S_1^2)\) and \(g_2=g_2(S_1^1,S_1^2)\), where \(g_1(x,y)\) and \(g_2(x,y)\) are functions of 2 variables such that \(|g'(x,y)| \le 1\) whenever \(g'\) is a partial derivative of either \(g_1\) or \(g_2\) of order \(\le 2\). We require that for a certain set \(I_0\) with \(1 \notin I_0\), \(F_1\) and \(F_2\) are functions of the quantities \(S^1_k,S^2_k\), \(k \in I_0\). On the other hand, \(F_3\) is a function of the quantities \(S_k\) for \(k \in I_0 \cup \{1\}\).
We assume that for a certain quantity \(C\) we have
Let us assume that for some numbers \(W_1\) and \(W_2\), under the preceding conditions the quantity (8.2) is bounded by
Since \(M\) and \(N\) are fixed, and since \(\nu ((R_{1,2}-q)^2)>0\), this is certainly true if \(W_1\) and \(W_2\) are very large. The idea of the implicit iteration is to “bootstrap” the previous information to show that one can take \(W_1\) and \(W_2 \le K\).
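The bootstrap just described amounts to the following elementary fixed-point computation (a sketch with generic constants; the actual argument keeps track of the specific constants of (8.5)):

```latex
% If every admissible pair (W_1, W_2) yields the improved pair
% (K + KD W_1, K + KD W_2), iterating the map W -> K + KD W gives
\[
W^{(n+1)}=K+KD\,W^{(n)} \quad\Longrightarrow\quad
W^{(n)}\longrightarrow \frac{K}{1-KD}\le 2K \quad\text{when } KD\le \tfrac 12,
\]
% starting from any finite a priori value of W (which exists since M and N are
% fixed and nu((R_{1,2}-q)^2) > 0).
```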
To study the quantity (8.2), let us set \(J'=J \cup \{1\}\). For simplicity we will write \(\langle \cdot \rangle '\) rather than \(\langle \cdot \rangle _{J'}\). Consider a set \(I \subset \{2,\ldots ,M\}\) and
We are interested in the case where \(I\) is the random set
In that case the randomness of \(I\) is independent of the other sources of randomness in (8.6), and moreover, if \(1 \notin J\), the quantity (8.2) is \(|\mathsf {E}A_I|\). When \(1 \in J\), the quantity (8.2) is \(|\mathsf {E}A_\varnothing |\). In the sequel we consider only the case \(1 \notin J\); the much simpler case where \(1 \in J\) is left to the reader. So, the goal is to bound \(|\mathsf {E}A_I|\).
Let us observe that the trivial bound
holds for all choices of \(I\). This is the bound we will use when \(I \cap I_0 \ne \varnothing \), so let us assume that \(I \cap I_0 =\varnothing \). We follow the idea [explained after (7.3)] that the replicas \(S_1^\ell \) should behave like the variables \(\theta ^\ell =z\sqrt{q}+\xi ^\ell \sqrt{1-q}\), where \(z,\xi ,\xi ^\ell \) are independent standard normal r.v.s, and when one adds an expectation \(\mathsf {E}_\xi \) in front of each bracket. We interpolate between \(S_1^\ell \) and \(\theta ^\ell \) using
Let us set
The quantity \(F_3\) is a function of the quantities \(S_k\), \(k \in I_0 \cup \{1\}\). Let us denote by \(F_3(t)\) the value of \(F_3\) when the argument \(S_1\) is replaced by \(\theta _t=\sqrt{t}S_1+\sqrt{1-t}\theta \) (\(\theta =z\sqrt{q}+\xi \sqrt{1-q}\)), and let
Let
and finally
We observe that, since \(F_1\) and \(F_2\) do not depend on \(S_1\), all the dependence in \(S_1\) has been made explicit in the above expression. Also, \(\varphi _I(1)=A_I\), and thus
The two terms on the right-hand side will be estimated separately. The most interesting part is the bound for \(\varphi _I(0)\). It relies on the bound (8.5) (for different choices of \(I,F_1,F_2\), etc.).
First, we notice that
where
This is a function of the quantities \(S_k\) for
Also, if
we see that
whenever \(k \in I'\). This is obvious from the explicit expression (8.14), from (8.4), and since \(I\cap I_0=\varnothing \). Indeed, for a given value of \(k\), at most one of the two quantities \(F_3(0)\) and \(\exp \sum _{k\in I}u(\theta _0,S_k)\) depends on \(S_k\).
In order to be able to apply the bound (8.5) to the quantity \(\varphi _I(0)\), we define
so that, using replicas,
(It is essential here that \(\mathcal {E}'_I\) be in the same bracket as \(g_2\).) Since \(|g_1|,|g_2| \le 1\), to bound \(|\varphi _I(0)|\) it suffices to bound, for a given value of \(\theta ^\ell \), \(\ell \le 4\), the quantity
that is, the cost of exchanging the terms \(\mathcal {E}_I\) and \(\mathcal {E}_I'\) between brackets.
Set
so that
To exchange \(\mathcal {E}_I\) and \(\mathcal {E}_I'\) between brackets, we will exchange the terms \(U_k\) and \(U_k'\) one at a time. That is, if \(a=\mathrm {card}I\), \(I=\{k_1,\ldots ,k_a\}\), for \(b\le a\) we write
so that \(\mathcal {E}_I=\mathcal {E}_a=\mathcal {E}'_0\); \(\mathcal {E}'_I=\mathcal {E}'_a=\mathcal {E}_0\), and
To bound the quantity (8.17) for each \(b\) we will bound
We fix \(b\), and we write
so that \(\mathcal {E}_b=\mathcal {E}U_{k_b}\), \(\mathcal {E}_{b-1}=\mathcal {E}U_{k_b}'\), \(\mathcal {E}'_b=\mathcal {E}' U_{k_b}'\), \(\mathcal {E}'_{b-1}=\mathcal {E}' U_{k_b}\) and the quantity (8.19) is
We are going to apply the bound (8.5) to this quantity. Let \(F_1'=F_1\mathcal {E}\), \(F_2'=F_2\mathcal {E}\), so that recalling (8.15) we have
Let \(I_0'=I'{\setminus }\{k_b\}\). Then \(F'_1\) and \(F'_2\) are functions of the quantities \(S_\tau ^\ell \) for \(\tau \in I'_0\), \(\ell = 1,2\), and \(F'_3\) is a function of the quantities \(S_\tau \) for \(\tau \in I'=I'_0\cup \{k_b\}\). We then see that we can use the bound (8.5) on the quantity (8.20). (The fact that the quantity to exchange between factors depends on \(S_{k_b}^1,S_{k_b}^2\) rather than on \(S_1^1,S_1^2\) is irrelevant due to symmetry.) However, we need to be clever: we need to gain a crucial factor \(D\). So we define \(V=U_{k_b}-1\), \(V'=U'_{k_b}-1\). Replacing \(U_{k_b}\) by \(V+1\), \(U_{k_b}'\) by \(V'+1\), and expanding, we find that the quantity (8.20) is the sum of three terms, which measure respectively the costs of exchanging \(V\) and \(V'\), \(V\) and 1, or 1 and \(V'\) between brackets. To bound the first of these costs, we consider the functions
These are such that if \(g'\) is a partial derivative of order \(\le 2\) of either \(g_1\) or \(g_2\), then \(|g'| \le LD\). (Here is the factor \(D\) we need.) In this manner, setting \(a=\mathrm {card}I\), and noting that \(1+\mathrm {card}I'_0= a + \mathrm {card}I_0\), we have reached the bound
We now turn to a bound for the difference \(|\varphi _I(1)-\varphi _I(0)|\), and as usual we write
To compute \(\varphi '(t)\) we differentiate in \(t\) the formula (8.2) and we integrate by parts in the r.v.s \(z,\xi ^\ell \) and \(g_{i1}\). We use replicas to express the resulting terms. To give the inexperienced reader a feeling of what happens, let us detail a point that is rather easy now, but that will need to be addressed when we require more abstract forms of the same argument. In \(\varphi _I'(t)\), one of the terms (created by the dependence of \(g_{1,t}\) on \(\theta ^1_t\)) is
where \(g'_{1,t}={\partial \over \partial x}g_1(\theta _t^1,\theta ^2_t)\) and \(Z=(\mathsf {E}_\xi \langle F_3(t)\mathcal {E}_I(t)\rangle ')^2\). Now
The contribution of the term of \(d\theta ^1_t/dt\) containing \(S_1^1\) to (8.22) is
To integrate by parts we use the fact that for a (not too bad) function \(U(g_{i1})\) we have \(\mathsf {E}g_{i1}U(g_{i1})=\mathsf {E}U'(g_{i1})\). There are many sources of dependence of the brackets on \(g_{i1}\). We do not address all of these, and we write only the terms created by the fact that \(g_{2,t}\) depends on \(g_{i1}\) through \(S_1^1\). The other terms can be handled similarly. These terms are
where \(g'_{2,t} = {\partial \over \partial x}g_2(\theta ^1_t,\theta ^2_t)\). We use replicas to express the product of brackets as a single bracket; for example, in the second bracket we replace replicas 1 and 2 by replicas 3 and 4. In this single bracket we can combine the terms \(\sigma _i^1\sigma _i^3\) for \(i \le N\) to obtain the term \(R_{1,3}\). The magic of the formulas (8.8) is that, in each of the multiple terms that occur, we create in this fashion a factor \(R_{\ell ,\ell '}-q\). The terms with \(-q\) are created by the term with \(-\theta ^1\) in (8.23). In the function \(h\) we already have a factor \(R_{1,2}-q\). Recalling the notation \(a = \mathrm {card}I\), using the Cauchy–Schwarz inequality and trivial bounds we reach
using (8.1) in the last line. Condition (8.16) allows the control of the terms created by the (mild) dependence of \(F_3\) on \(S_1\) (and the values of \(k \not = 1\) considered in this condition allow the induction to continue). Even though the computation of \(\varphi '\) involves integration by parts, i.e., a second differentiation, the resulting expression does not contain second derivatives. This is because
Of course, this remark is unimportant, since one could control \(\partial ^2 F_3/(\partial S_k)^2\) if one so wished.
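The mechanism by which overlaps are created can be sketched as follows, with a generic function \(F\) standing for the content of the bracket (our notation, not the paper's):

```latex
% Since S^3_1 = N^{-1/2} sum_{i<=N} sigma_i^3 g_{i1}, differentiating a bracket
% in g_{i1} brings out a factor sigma_i^3/sqrt(N).  A term carrying the
% coefficient sigma_i^1/sqrt(N) (coming from the S_1^1 part of the
% interpolation) then recombines, after summation over i, into an overlap:
\[
\sum _{i\le N}\frac{\sigma ^1_i}{\sqrt N}
 \Bigl\langle \frac{\sigma ^3_i}{\sqrt N}\,F\Bigr\rangle
 =\Bigl\langle \Bigl(\frac 1N\sum _{i\le N}\sigma ^1_i\sigma ^3_i\Bigr)F\Bigr\rangle
 =\langle R_{1,3}\,F\rangle ,
\]
% and the companion term containing -theta^1 in (8.23) replaces R_{1,3} by
% R_{1,3} - q.
```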
Combining (8.25) with (8.22), (8.13) and (8.21) we arrive at the inequality
This holds whenever \(I \cap I_0 =\varnothing \). Now
and, for an integer \(b\), and since \(\beta =M\gamma /N\),
Combining these with (8.8) and the fact that \(P(I \cap I_0 \not = \emptyset ) \le K \mathrm {card}I_0/N\), we get simply from (8.26) that
Since \(\mathsf {E}|A_I|\) is an upper bound for the quantity (8.2) we see that the bound (8.7) still holds if we replace \(W_1\) by \(K+KDW_1\) and \(W_2\) by \(K+KDW_2\). If \(KD \le 1\), this implies that it holds for \(W_1=W_2=K\), and finishes the proof.
It then follows from the bound (8.5) that
the factor \(D\) being provided by the fact that \(f\) is “of order \(D\)”, and this is the content of Claim 1.
9 Proof of Claim 2: another decorrelation property
Consider the set \(I=\{k < M;\, \eta _{kM}=1\}\) and the quantity \(\mathcal {E}=\exp \sum _{k \in I}u(S_k,S_M)\). We denote by \(\langle \cdot \rangle _1\) an average for the Gibbs measure with Hamiltonian (7.9). Let
where \(\theta =z\sqrt{q}+\xi \sqrt{1-q}\), and let \(\mathcal {E}(t)=\exp \sum _{k \in I}u(S_k,S(t))\),
Until further notice all expectations are taken given \(I\), although this stays implicit in the notation. To prove Claim 2 we have to prove that \(|\varphi (1)-\varphi (0)|\) is small, and as usual we rely upon the bound
To compute \(\varphi '(t)\), we simply see that
The minor (recurrent...) problem is that there are several ways in which the last factor depends on \(t\), and each of these ways creates a different term when computing the derivative. It is very cumbersome to write all the terms, but they can be bounded similarly, so we will deal only with one such term, namely
We must integrate by parts in the Gaussian r.v.s involved in \(S'(t)\), namely \(g_{iM}\), \(z\) and \(\xi \). The difficult terms arise from the dependence of the bracket \(\langle h\rangle \) on the r.v.s \(g_{iM}\). The other terms can be nicely regrouped to create factors \(R_{\ell ,\ell '}-q\), and (since \(|h|\le 2|R_{1,2}-q|\)) they are bounded by
where \(a=\mathrm {card}I\). The difficult term can formally be written as
First, we observe that the factor \(1/\sqrt{t}\) is harmless, and it will disappear when we integrate \(t\) from 0 to 1. We write
where of course \(S_k^\ell =S_k({\varvec{\sigma }}^\ell )\), so that

We substitute in (9.3), and use replicas to transform products of brackets into single brackets. In order to absorb the various expectations \(\mathsf {E}_\xi \) into a single expectation \(\mathsf {E}_\xi \), when occurring in a replica of rank \(\ell \) we replace \(\xi \) by \(\xi ^\ell \), where \((\xi ^\ell )\) are independent copies of \(\xi \), e.g., in the formula below, we have \(S^3(t)=\sqrt{t}S^3_M+\sqrt{1-t}(z\sqrt{q}+\xi ^3\sqrt{1-q})\). In this manner we get
If we replace each occurrence of a term \(R_{\ell ,\ell '}\) by \(R_{\ell ,\ell '}-q\) in the above expression, the corresponding term will be bounded by the quantity (9.2). To control the remaining terms we have to control
a quantity that is the sum of several terms such as
Controlling these means that we can “decorrelate \(h\) and \(w(S_M^1,S_k^1)\) at a small cost”. This can be done as in Sect. 8, and the term (9.4) is bounded by
This statement might not be completely obvious; later on, however, we will study again the problem of controlling an expression similar to (9.4), but more general.
Finally, we must control the expectation of (9.5). A slight problem is that \(a=\mathrm {card}I\) is not probabilistically independent of \(\langle (R_{1,2}-q)^2\rangle \). But we simply observe that
and that \(\langle \cdot \rangle _1\) is probabilistically independent of \(a=\mathrm {card}I\). Thus, \(\mathsf {E}\) denoting now expectation also with respect to \(a\), we have
using Lemma 8.1 with \(J= \{ M\}\).
10 Description of the general procedure
For \(\tau \le M\) let \(I(\tau )=\{k \le M;\, \eta _{\tau k}=1 \text { or } \eta _{k \tau }=1\}\). We define recursively the sets \(J_0=\{M\}\), \(J_{\ell +1}=\cup \{I(\tau );\, \tau \in J_\ell \}\). These are random sets, which we think of as depending on a point \(\omega \in \Omega \), a certain probability space. Let us define \(s(\omega )\) as the largest integer for which
We observe that \(s(\omega ) \ge 1\). For \(p \le s(\omega )\), the data of the sets \(I(\tau )\) for \(\tau \in \cup _{\ell <p} J_\ell \) defines a tree of depth \(p\) in the sense of Sect. 4.
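As an illustration, the recursive construction of the sets \(J_\ell \) can be sketched as follows; the encoding of the variables \(\eta _{kk'}\) as a dictionary and the toy example are our own, and the condition defining \(s(\omega )\) is not reproduced here:

```python
# A toy sketch (our own encoding, not from the paper) of the recursion
# J_0 = {M}, J_{l+1} = union of I(tau) over tau in J_l, where eta is a dict
# holding the variables eta_{k,k'} and M plays the role of the root.

def neighbourhood(tau, eta, M):
    """I(tau) = {k <= M : eta_{tau,k} = 1 or eta_{k,tau} = 1}."""
    return {k for k in range(1, M + 1)
            if eta.get((tau, k)) == 1 or eta.get((k, tau)) == 1}

def levels(eta, M, depth):
    """Return the list [J_0, J_1, ..., J_depth]."""
    J = [{M}]
    for _ in range(depth):
        nxt = set()
        for tau in J[-1]:
            nxt |= neighbourhood(tau, eta, M)
        J.append(nxt)
    return J

# Toy example with M = 5 and edges (5,2), (5,3), (2,1): note that J_2
# contains M = 5 again; it is exactly this kind of recurrence that the
# depth s(omega) is designed to exclude, so that a genuine tree is obtained.
eta = {(5, 2): 1, (5, 3): 1, (2, 1): 1}
print(levels(eta, 5, 2))
```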
Throughout this section, given an integer \(p\), we write
and we denote by \(\mathcal {G}_p\) the tree of depth \(p'\) defined by the sets \(I(\tau )\).
For any \(\tau \in J_{p'}\), there is a unique genealogy \(k_0=M,k_1,\ldots ,k_{p'}=\tau \) such that \(k_\ell \in I(k_{\ell -1})\) for \(1 \le \ell \le p'\). Consider the Hamiltonian \(H_\tau \) given by
where the summation is restricted to the values of \(k\) and \(k'\) that are not equal to any of the values \(k_0,\ldots ,k_{p'-1}\). Consider then the function
where of course the bracket \(\langle \cdot \rangle _\tau \) denotes an average for the Gibbs measure with Hamiltonian (10.2), and consider \(\varvec{h}=(h_\tau )_{\tau \in J_{p'}}\). Recalling the operators \(W_\mathcal {G}\) of Sect. 4 (where \(q_\tau = q\), \(\rho _\tau = 1-q\), and where \(z_\tau , \xi _\tau \) are i.i.d. r.v.s), we consider the quantity
The objective is to prove that when we go from \(p\) to \(p+1\) in (10.4) we make an error of at most \((KD)^p(\nu ((R_{1,2}-q)^2)+K/N)\). Combining with the results of the previous two sections, this shows that for any \(p\) the quantities \(\mathsf {E}\langle hf(S_M)\rangle \) and (10.4) differ only by a small error term \(KD\nu ((R_{1,2}-q)^2)+K/N\).
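The bookkeeping behind this summation over \(p\) is a geometric series; a sketch, under the implicit assumption \(KD\le 1/2\):

```latex
% Each step p -> p+1 costs at most (KD)^p (nu((R_{1,2}-q)^2) + K/N), so the
% total error over all steps is at most
\[
\sum _{p\ge 1}(KD)^p\bigl(\nu ((R_{1,2}-q)^2)+K/N\bigr)
 \le \frac{KD}{1-KD}\bigl(\nu ((R_{1,2}-q)^2)+K/N\bigr)
 \le 2KD\bigl(\nu ((R_{1,2}-q)^2)+K/N\bigr),
\]
% which is of the announced form K D nu((R_{1,2}-q)^2) + K/N after adjusting K.
```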
Given \(\tau \in J_{p'}\), let
and let us denote by \(\langle \cdot \rangle _\tau '\) an average for the Gibbs measure with Hamiltonian similar to (10.2), but where we also remove the terms \(\eta _{kk'}u(S_k,S_{k'})\) where either \(k\) or \(k'\) is \(\tau \); thus
Let us recall that \(\theta _\tau =z_\tau \sqrt{q}+\xi _\tau \sqrt{1-q}\). In a first stage, when \(s(\omega ) >p\) we will replace in the quantity \(\mathsf {E}\langle h\rangle W_{\mathcal {G}_p}(\varvec{h})\) the functions \(h_\tau \) by the functions
where
(and do nothing when \(s(\omega ) \le p\)). In a second stage, when \(s(\omega ) >p\) we will replace the functions \(\overline{h}_\tau (x)\) by the functions
(and do nothing when \(s(\omega ) \le p\)). The resulting expression is then the quantity (10.4) for \(p+1\) rather than \(p\). We will show that these changes create only a small error term.
11 Starting the first stage
Here and in the next section, the estimates are made given the r.v.s \(\eta _{kk'}\). We will later average over these. Thus, until further notice, \(\mathsf {E}\) denotes expectation given the \(\eta _{kk'}\). The idea is to replace one at a time the functions \(h_\tau \) by the functions \(\overline{h}_\tau \). We can assume that \(p<s(\omega )\), for otherwise \(\mathcal {G}_p=\mathcal {G}_{p+1}\). Thus \(p'=p\). Let us fix \(\tau \in J_p\), and let
where
Let us fix a subset \(J'\) of \(J_p\), with \(\tau \not \in J'\). This is the set of those \(\tau '\) in \(J_p\) for which we have already replaced \(h_{\tau '}\) by \(\overline{h}_{\tau '}\). We define the family \(\varvec{h}^\sim _t=(h^\sim _{t,\tau '})_{\tau ' \in J_p}\) as follows. We have \(h^\sim _{t,\tau }=h_{\tau ,t}\); if \(\tau '\in J'\) we have \(h^\sim _{t,\tau '}=\overline{h}_{\tau '}\); and if \(\tau ' \in J_p{\setminus } J'\), \(\tau ' \ne \tau \), we have \(h^\sim _{t,\tau '}=h_{\tau '}\). In words, the path \(t \mapsto \varvec{h}_t^\sim \) transforms \(h_\tau \) into \(\overline{h}_\tau \) and changes nothing else. Let
so that
Here of course \(D^1_{\tau ,\mathcal {G}_p}\) is computed at the point \(\varvec{h}^\sim _t\). The proof is made (as always) cumbersome by the fact that there are several terms in \({d\over dt}h_{\tau ,t}\), but fortunately they can be treated by similar methods. We will show how to deal with
where \(\mathcal {E}_t=\exp \sum _{k\in I(\tau )}u(\theta (t),S_k)\) and \(v'(x,y)=\partial v(x,y)/\partial y\). Here,
To make sense of (11.2) we must integrate by parts in the r.v.s \(z_\tau ,\xi _\tau \), and \(g_{i\tau }\) (recall that \(S_\tau =N^{-1/2}\sum _{i \le N}\sigma _ig_{i\tau }\)). The troublesome terms are those created by integration by parts in the r.v.s \(g_{i\tau }\) and the fact that both the bracket \(\langle h\rangle \) and the components \(h^\sim _{t,\tau '}\) of \(\varvec{h}^\sim _t\) for \(\tau ' \ne \tau \) depend on these variables. The terms created by the other sources of dependence in the r.v.s \(g_{i\tau }\) (such as, e.g., the dependence of \(h^\sim _{t,\tau }=h_{\tau ,t}\)) regroup nicely with the terms created when integrating by parts in the r.v.s \(z_\tau \) and \(\xi _\tau \) to create factors \(R_{\ell ,\ell '}-q\). (It is not really obvious yet how one should proceed to regroup these terms, and this will be detailed in a moment in a similar situation.) Using (4.6) and (4.8) it is then simple to see that the global contribution of the “innocuous” terms is bounded by
where \(a(\tau )=\mathrm {card}I(\tau )\) and \(b(\tau )\) is given by (4.4).
We now turn to the study of the worrisome terms. We consider first the terms created by the dependence of \(\langle h\rangle \) on these variables \(g_{i,\tau }\). Proceeding exactly as in Sect. 9, we see that modulo terms as in (11.3) we have to control
The fact that these terms are small is a decorrelation property [more general than (9.4)] that will be examined in Sect. 12.
We consider now the terms created by the fact that for a given \(\tau ' \ne \tau \), \(\tau '\in J_p\), the function \(h^\sim _{t,\tau '}\) depends on the r.v.s \(g_{i\tau }\). For specificity we assume that \(\tau ' \notin J'\), so that
When integrating by parts in the r.v.s \(g_{i\tau }\) the quantity
the term that is created by the dependence of \(h_{\tau '}\) on \(g_{i\tau }\) is
Now
The summation over \(k\) ranges over the set \(\{k;\,\eta _{k\tau }=1 \text { or } \eta _{\tau k}=1\}\). We would like to regroup the terms for different values of \(i\) to create overlap terms \(R_{\ell ,\ell '}\). While it is not clear how to do this in the expression (11.5), this is made possible by the concrete representation (4.7). To explain the process on one of the terms above we have, using (4.7)
where
To use replicas in order to transform the product of the brackets into a single bracket we proceed as follows. Each of the measures \(\langle \cdot \rangle _{\tau '}\) and \(\langle \cdot \rangle _\tau '\) has been obtained by removing certain terms from the Hamiltonian \(\sum _{1\le k<k'\le M}\eta _{kk'}u(S_k,S_{k'})\). We denote by \(\langle \cdot \rangle _{\bullet }\) an average for the Gibbs measure where one removes both the terms that have been removed in \(\langle \cdot \rangle _\tau \) and those that have been removed in \(\langle \cdot \rangle _\tau '\). Let \(\mathcal {E}\) and \(\mathcal {E}'\) be such that, respectively
Then, giving upper indexes their usual meaning, we can write
Let us consider the quantity \(U'(x,y)\) obtained from (11.6) when we replace \(R_{1,2}\) by \(R_{1,2}-q\). Then we see from (4.8) that
Thus (provided we can control terms as above), the issue is to control terms such as
That these terms are small is yet another decorrelation property, and controlling them is done exactly as controlling the terms (11.4), the control of which is the object of the next section.
12 A third decorrelation property
In (11.4), the differential \(D^1_{\tau ,\mathcal {G}_p}\) is calculated at the point \(\varvec{h}_t^\sim \). Since it makes no essential difference in the argument, but allows simpler formulas, we will pretend instead that this differential is calculated at the point \(\overline{\varvec{h}}\) given by (10.6).
In (11.4) the term \(w(S_\tau ,S_k)\) that has to be commuted between brackets depends on both \(S_k\) and \(S_\tau \), while in Sect. 8 we have only learned how to commute between brackets terms depending on \(S^1_k,S_k^2\). Of course, if we wished, we could learn how to commute terms depending on \(S^1_k,S^2_k\) and \(S_\tau ^1,S^2_\tau \), but, to stay closer to the scheme of Sect. 8, we choose another route, and the first step of the proof is different from the subsequent steps. In this first step we replace every occurrence of \(S_\tau \) by the usual corresponding terms \(\theta _\tau \) with the suitable expectations \(\mathsf {E}_\xi \) in the right places. “Every occurrence” means just that, including the occurrences that are hidden in the definition of the brackets \(\langle \cdot \rangle _{\tau '}\). This is required to have the terms created by integration by parts regroup to create factors \((R_{\ell ,\ell '}-q)\) instead of having to consider yet another decorrelation property. The error terms created by this atypical first step are of the same form as those we will meet later in the section, so they require no extra effort to be controlled.
Let us turn to the general procedure. Given a subset \(J\) of \(\{1,\ldots ,M\}\), we use again the notation \(\langle \cdot \rangle _J\) to indicate an average for the Gibbs measure where in the Hamiltonian \(\sum \eta _{kk'}u(S_k,S_{k'})\) we have removed all the terms for which either \(k\) or \(k'\) belong to \(J\). Thus \(\langle \cdot \rangle _{\tau '}=\langle \cdot \rangle _{J(\tau ')}\) where the set \(J(\tau ')\) is obtained (as in Sect. 8) as follows. If \(k_0=M,\ldots ,k_p=\tau '\) are the unique integers such that \(k_{\ell +1}\in I(k_\ell )\) for \(0\le \ell \le p-1\), then \(J(\tau ')=\{M,k_1,\ldots ,k_{p-1}\}\). We would like under rather general conditions to be able to bound the quantity
We describe the conditions we shall impose on these various quantities. First \(g_1=g_1(S_1^1,S_1^2)\) and \( g_2=g_2(S_1^1,S_1^2)\), where the functions \(g_1\) and \(g_2\) are such that their partial derivatives of order \(\le 2\) are bounded by 1. For a certain set \(I_0\), with \(\tau ,1 \not \in I_0\), the quantities \(F_1,F_2\) are functions of the quantities \(S_k\), \(k \in I_0\). The quantity \(F_3\) is a function of the quantities \(S_k\), \(k \in I_0 \cup \{1\}\). Let
The quantity \(F_\tau \) is a function of the quantities \(S_k\), \(k \in I_0\cup I'(\tau )\). The quantity \(F'_\tau \) is a function of the quantities \(S_k\), \(k \in I_0 \cup I'(\tau ) \cup \{1\}\). For a certain constant \(C\), we have, setting \(C(\tau ')=C\exp Dc(\tau ')\) (where, as usual, \(c(\tau ') = \mathrm {card}I'(\tau ') + \mathrm {card}I(\tau ')\) )
The differential \(D^1_{\tau ,\mathcal {G}_p}\) in (12.1) is computed at the point
The function \(V_{\tau '}(x)\) depends also on the quantities \(S_k\), for \(k \in I_0 \cup I'(\tau ') \cup \{1\}\), and so does the quantity \(G_{\tau '}\). Moreover
For \(k \in (I_0 \cup \{1\}) {\setminus } I'(\tau ')\) or \(k \in I'(\tau ') {\setminus } (I_0 \cup \{1\})\) and if \(\Delta \) denotes either \(\partial V_{\tau '}(x)/\partial S_k\) or \(\partial G_{\tau '}/\partial S_k\), we have
while if \(k \in (I_0 \cup \{1\}) \cap I'(\tau ')\) we have
These conditions are simply one way (among other possible ways) to keep track of what happens as the (implicit) recursion takes place. Let us sketch this recursion. Let us set \(I=I(1)=\{k;\,2 \le k \le M, \eta _{1k}=1\}\) and \(J'=J\cup \{1\}\). We are interested in the case where \(I \cap (I_0 \cup I'(\tau ))=\varnothing \) [for, otherwise, we use a trivial bound as in (8.8)]. We replace all occurrences of \(S^\ell _1\) by the usual variables \(\theta ^\ell \), using an interpolation parameter \(0 \le s \le 1\) (and the appropriate expectations \(\mathsf {E}_\xi \) in front of each bracket). Let us denote by \(\varphi (s)\) the corresponding quantity [so \(\varphi (1)\) is the quantity (12.1)]. In \(\varphi (0)\) the differential is computed at a point \((h_{\tau '})_{\tau ' \in J_p}\) such that
where \(V_{\tau ',0}(x)\) means that \(S_1\) has been replaced by \(\theta \) in \(V_{\tau '}(x) \) (and similarly for \(G_{\tau ',0}\)) and where \(\mathcal {E}=\exp \sum _{k \in I}u(\theta ,S_k)\). The quantity \(V'_{\tau '}(x)=\mathsf {E}_\xi V_{\tau ',0}(x)\mathcal {E}\) is a function of the quantities \(S_k\) for \(k \in I_0 \cup I \cup I'(\tau ')\). Setting \(C'(\tau ')=C(\tau ')\exp D\mathrm {card}I\), we have
Setting \(G'_{\tau '} = \mathsf {E}_\xi G_{\tau ',0}\mathcal {E}\) we have
If \(k \in I'(\tau ') {\setminus } (I_0 \cup I)\), then \(\mathcal {E}\) does not depend on \(S_k\) so that using (12.3) we have
Since \(I_0 \,\cap \, I=\varnothing \), if \(k \in (I_0 \cup I) {\setminus } I'(\tau ')\) then either \(k \in I_0\), and then \( k \in (I_0 \cup \{1\}) {\setminus } I'(\tau ')\) and \(k \notin I\); or else \(k \in I\), and then \( k \not \in (I_0 \cup \{1\}) {\setminus } I(\tau ')\). In both cases (12.5) still follows from (12.3) and the definition of \(\mathcal {E}\). If \(k \in (I' \cup I) \cap I'(\tau ')\) then one proves that
in a similar manner, distinguishing the cases \(k \in I\) (and then \(k \notin (I_0 \cup \{1\}) \cap I(\tau ')\)) or \(k \notin I\), and using (12.3) in the first case and (12.4) in the second case. Of course the case of \(G'_{\tau '}\) is handled similarly.
In this manner one can check in a straightforward way that all the conditions required on the various entries of (12.1) are still satisfied when one tries to bound \(\varphi (0)\) as in Sect. 9. On the other hand, these conditions are appropriate to control \(\vert \varphi '(s)\vert \). To control \(\varphi '(s)\) we compute this derivative and we integrate by parts. This integration by parts involves differentials of \(W_{\mathcal {G}_p}\) of order up to 3, but of course for these differentials a result comparable to Lemma 4.2 is available. The bound involves 3 terms (depending on whether one deals with differentials of order 1, 2 or 3 of \(W_{\mathcal {G}_p}\)): these are, setting \(a=\mathrm {card}I\),
Here, expectation is taken in the r.v.s \(\eta _{kk'}\). The reason why we can use the brackets \(\langle \cdot \rangle _{J'}\) is that comparing these with the brackets \(\langle \cdot \rangle _{J'\cup J(\tau ')}\) involves only a factor \(\exp 4Dc(\tau ')\).
We then proceed as in (9.6) to obtain
where the expectation is now also over the r.v.s \(\eta _{kk'}\). In this manner one shows that the total error term of the first stage is \((KD)^p(\nu ((R_{1,2}-q)^2)+1/N)\), as required.
13 The second stage
In this stage we replace the functions \(\overline{h}_\tau \) of (10.6) by the functions \(h^*_\tau \) of (10.5). For that purpose, it greatly simplifies matters to change our point of view, that is, rather than considering the operator \(W_{\mathcal {G}_p}\), to consider the operators \(W^*_{\mathcal {G}_p}\) given by
where, for \(\tau \in J_p\),
We will always assume that \(\exp (-D\mathrm {card}I(\tau )) \le h_\tau (x) \le \exp D\mathrm {card}I(\tau )\). Under this condition we can control the size of the differentials of \(W^*_{\mathcal {G}_p}\) through the size of the differentials of \(W_{\mathcal {G}_p}\); the extra factor \(\exp LD\mathrm {card}I(\tau )\) makes no difference in the estimates.
We want to compare
where
There seems to be no other way to do this than to interpolate using
so as usual
Hence we have to control quantities such as
where the \(*\) means that the differential is for \(W^*_{\mathcal {G}_p}\) rather than \(W_{\mathcal {G}_p}\).
To go from \(h_\tau \) to \(h'_\tau \) we have to decorrelate \(a(\tau )=\mathrm {card}I(\tau )\) factors. Of course, we proceed one factor at a time. If we have \(I(\tau )=\{k_0\} \cup (I_1\cup I_2)\), where the union is disjoint, we go from
to
The extra difficulty compared to the situation of Sect. 12 is that we now have many factors, and when applying the method of Sect. 8 we have to keep track of what happens in a more precise way. If \(I=I(k_0)\), when replacing \(S_{k_0}\) by \(\theta \), we can afford a factor \(\exp LDa\) in the error terms, with \(a=\mathrm {card}I\), but we cannot afford a factor \(\exp LDaa(k)\). The way the difficulty is overcome (after one has faced cardiac arrest at the thought that the year of labor invested in the rest of the paper was in jeopardy) is by requiring more stringent conditions than those imposed after (12.1). Namely, the factors indexed by \(k \in I_2\) are now assumed to be of the type
with the inequality
so that we know that the quantity (13.2) is \(\le \exp D\). This condition is preserved in the “induction”, since, keeping our usual notation, if we set
we still have \(|U'_k(x)| \le F'_k\exp D\). The other regularity conditions are as in Sect. 12. Another crucial observation is that when computing \(\varphi '(s)\) the differentiation and integration by parts affect at most two of the brackets at the same time, so that the fact that there are \(a(\tau )\) factors in (13.1) creates only harmless terms such as \(a^2(\tau )\). With these observations the proof of the second stage is completed as before.
14 A last effort
We have shown that, uniformly in \(p\), \(\mathsf {E}\langle hf(S^1_M)\rangle \) is equal to \(\mathsf {E}\langle h\rangle W_{\mathcal {G}_p}(\varvec{h})\) within small error terms. Since \(M\) and \(N\) are fixed, for \(p_0\) large (e.g. \(p_0=M\)) we have \(\mathcal {G}_{p_0+1}=\mathcal {G}_{p_0}\). We write \(\mathcal {G}\) rather than \(\mathcal {G}_{p_0}\) to lighten notation.
We would like to show that \(\mathsf {E}\langle h\rangle W_\mathcal {G}(\varvec{h})\) is, within small error terms, equal to \(\mathsf {E}\langle h\rangle \mathsf {E}\int fd\overline{\mu }\), where \(\overline{\mu }\) is the random measure of Proposition 1.1.
Let us set
Thus, it suffices to show that
is a small error term. The r.v. \(W_\mathcal {G}(\varvec{h})-A\) involves the construction of the tree \(\mathcal {G}\) starting at \(M\). We could start the same construction at any other point \(k\le M\). Let us denote by \(Y(k)\) the resulting r.v. It is clear that by symmetry we have
Now, \(\langle h\rangle ^2 \le \langle h^2\rangle \le 4\langle (R_{1,2}-q)^2\rangle \), and using the inequality \(ab \le a^2+b^2\), we see that
Since \(|f|\le D\), we have \(|A|\le D\le 1\), \(|W_{\mathcal {G}}(\varvec{h})|\le D\le 1\), so, by symmetry,
and to conclude it suffices to prove that \(\vert \mathsf {E}Y(M) Y(M-1)\vert \le K/N\) and thus that
where the \('\) indicates that the construction starts at \(M-1\) rather than \(M\). We will prove only (14.3) since (14.2) is easier.
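The symmetry argument of the preceding lines can be sketched as follows; this is a hedged reconstruction using only the exchangeability of the \(Y(k)\) and the bound \(|Y(k)|\le |W_{\mathcal {G}}(\varvec{h})|+|A|\le 2\) stated above, not the paper's exact displays:

\[
\mathsf {E}\Big (\frac{1}{M}\sum _{k\le M}Y(k)\Big )^{2}
=\frac{1}{M}\,\mathsf {E}\,Y(M)^{2}+\frac{M-1}{M}\,\mathsf {E}\,Y(M)Y(M-1)
\le \frac{4}{M}+\big \vert \mathsf {E}\,Y(M)Y(M-1)\big \vert ,
\]

since exchangeability makes every diagonal term equal to \(\mathsf {E}Y(M)^2\le 4\) and every off-diagonal term equal to \(\mathsf {E}Y(M)Y(M-1)\). Thus, once \(\vert \mathsf {E}Y(M)Y(M-1)\vert \le K/N\) is known, the average of the \(Y(k)\) is small in \(L^2\), and this combines with \(\langle h\rangle ^2 \le 4\langle (R_{1,2}-q)^2\rangle \) and \(ab\le a^2+b^2\) to control \(\mathsf {E}\langle h\rangle (W_\mathcal {G}(\varvec{h})-A)\).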
In Sect. 4 we introduced what we called a Poisson random tree. In this structure we are interested in the number of offspring of a given element, but not in the location of these offspring. Thus, when we speak of “independent Poisson random trees” we are of course concerned only with the independence of these “offspring” structures.
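As an illustration only (not part of the proof), a Poisson random tree in the above sense — each element independently receives a Poisson number of offspring, and only this offspring structure matters — can be simulated as follows. The function names `poisson_knuth` and `poisson_tree` and the parameter choices are ours:

```python
import math
import random

def poisson_knuth(lam, rng):
    """Sample a Poisson(lam) r.v. by Knuth's product-of-uniforms method."""
    threshold = math.exp(-lam)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= threshold:
            return k
        k += 1

def poisson_tree(beta, depth, rng):
    """Poisson random tree of parameter beta and given depth.

    A node is represented by the list of its children's subtrees; only
    the offspring counts are modeled, not any 'location' of offspring,
    matching the convention recalled above.
    """
    if depth == 0:
        return []  # elements at maximal depth have no offspring
    return [poisson_tree(beta, depth - 1, rng)
            for _ in range(poisson_knuth(beta, rng))]

rng = random.Random(0)
tree = poisson_tree(beta=2.0, depth=3, rng=rng)
```

Two trees built from disjoint sources of randomness (e.g. two independently seeded `Random` instances) are then independent in exactly the sense required here.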
A key fact is as follows.
Proposition 14.1
There exists a pair \((\mathcal {H},\mathcal {H}')\) of independent Poisson random trees of depth \(p_0\) with the following property. Let us denote by \(\mathcal {H}_p\) and \(\mathcal {H}_{p}'\) their restriction to depth \(p\) for \(p\le p_0\). Then, for \(p<p_0\), the conditional probability that \((\mathcal {H}_{p+1},\mathcal {H}'_{p+1}) \ne (\mathcal {G}_{p+1},\mathcal {G}'_{p+1})\) given \((\mathcal {H}_{p},\mathcal {H}'_{p})\) and \((\mathcal {G}_{p},\mathcal {G}'_{p})\) is at most
provided that \((\mathcal {H}_{p},\mathcal {H}'_{p}) = (\mathcal {G}_{p},\mathcal {G}'_{p})\).
Proof of (14.3)
It follows from (5.1) that there exist independent (random) families \(\varvec{h}^*,\varvec{h}^\sim \) of functions such that
so that \(A^2=\mathsf {E}W_\mathcal {H}(\varvec{h}^*)W_{\mathcal {H}'}(\varvec{h}^\sim )\) and it suffices to show that
Let us think of the randomness of \(\mathcal {G},\mathcal {G}',\mathcal {H},\mathcal {H}'\) as a point \(\omega \) in a certain probability space, and let us define
We observe that, by the recursion property that defines \(W_\mathcal {G}(\varvec{h})\), if \(p\le p_0\) we have \(W_\mathcal {G}(\varvec{h})=W_{\mathcal {G}_p}(\varvec{h}_p)\) for a certain family \(\varvec{h}_p\) in \(\mathcal {D}^{J_p}\); and similarly for \(\mathcal {H}'\) etc. Thus the left hand side of (14.4) is
where \(\varvec{h}_\ell \), \(\ell \le 4\) are (random) families of functions of \(\mathcal {D}\). The quantity (14.5) is bounded by I+II, where
because \(\mathcal {G}_{p(\omega )}=\mathcal {H}_{p(\omega )}\) and \(\mathcal {G}'_{p(\omega )}=\mathcal {H}_{p(\omega )}'\). Using Lemma 4.1 we then get
where
and \(A'_p\) is defined similarly. Now, by definition of \(p(\omega )\), for \(p<p_0\) we have
and thus, by the property of Proposition 14.1 we have
by estimating first the conditional expectation given \(\mathcal {H}_p,\mathcal {H}'_p,\mathcal {G}_p,\mathcal {G}_p'\). One controls the right hand side by induction over \(p\), getting that
and also \(\mathsf {E}A_{p_0}\le (KD)^{p_0}\). Since \(p_0=M\), this term is of much smaller order and thus we have proved (14.3).
We turn to the proof of Proposition 14.1. The following is standard.
Lemma 14.2
If \(Y\) is Binomial \(B(m,\delta /m)\) there exists a Poisson r.v. \(Y'\) of expectation \(\delta \) such that \(P(Y\ne Y')\le \delta K(\delta )/m\) where \(K(\delta )\) stays bounded with \(\delta \).
Corollary 14.3
If \(Y\) is Binomial \(B(M-n,\gamma /N)\), there exists a Poisson r.v. \(Y'\) of expectation \(\beta =\gamma \alpha \) such that \(P(Y\ne Y')\le Kn/N\).
Proof
We use Lemma 14.2 to find a Poisson r.v. \(Y''\) of expectation \(\delta =\gamma (M-n)/N=\beta (1-n/M)\) such that \(P(Y''\ne Y)\le K/(M-n)\) and a Poisson r.v. \(Y'\) of expectation \(\beta \) with \(P(Y''\ne Y')\le L(\beta -\delta )\), so that
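The chain of inequalities that concludes this proof can plausibly be reconstructed from the two couplings just produced (a hedged reconstruction from the bounds stated above, using \(\beta -\delta =\beta n/M\) and the fact that \(M\) is proportional to \(N\)):

\[
P(Y\ne Y')\le P(Y\ne Y'')+P(Y''\ne Y')\le \frac{K}{M-n}+L(\beta -\delta )=\frac{K}{M-n}+\frac{L\beta n}{M}\le \frac{Kn}{N}.
\]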
\(\square \)
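Lemma 14.2 is the classical Poisson approximation of the Binomial: an optimal coupling achieves \(P(Y\ne Y')\) equal to the total variation distance, which by Le Cam's inequality is at most \(m(\delta /m)^2=\delta ^2/m\), a bound of exactly the form \(\delta K(\delta )/m\) of the lemma. The following numerical check is ours (function names and parameters are illustrative):

```python
import math

def binom_pmf(m, p, k):
    """P(B(m, p) = k); math.comb returns 0 when k > m."""
    return math.comb(m, k) * p**k * (1.0 - p) ** (m - k)

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam**k / math.factorial(k)

def tv_binom_poisson(m, delta, kmax=100):
    """Total variation distance between B(m, delta/m) and Poisson(delta).

    This equals the minimal P(Y != Y') over all couplings of the two laws.
    The tail beyond kmax is negligible for the parameters used here.
    """
    p = delta / m
    return 0.5 * sum(abs(binom_pmf(m, p, k) - poisson_pmf(delta, k))
                     for k in range(kmax))

# Le Cam's inequality guarantees TV <= m * p**2 = delta**2 / m.
tv = tv_binom_poisson(m=1000, delta=2.0)
```

In the proof of Corollary 14.3, two Poisson r.v.s of means \(\delta \le \beta \) are also coupled; one natural coupling writes \(Y'=Y''+Z\) with \(Z\) an independent Poisson r.v. of mean \(\beta -\delta \), so that \(P(Y'\ne Y'')=P(Z\ge 1)=1-e^{-(\beta -\delta )}\le \beta -\delta \), accounting for the factor \(L(\beta -\delta )\) above.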
Proof of Proposition 14.1
By induction over \(p\) we construct independent Poisson random trees \((\mathcal {H}_p,\mathcal {H}_p')\) of depth \(p\) and of parameter \(\beta \), such that \(\mathcal {H}_p\) and \(\mathcal {H}_p'\) are respectively the restrictions of \(\mathcal {H}_{p+1}\) and \(\mathcal {H}_{p+1}'\) to depth \(p\).
Consider the sets \(J=\bigcup _{p'<p}(J_{p'}\cup J_{p'}')\) and \(J'= J\cup (J_p \cup J'_p)\). We observe that the knowledge of the numbers \(\eta _{kk'}\) for \(k \in J\), \(k'\le M\) determines \(\mathcal {G}_p\) and \(\mathcal {G}_p'\). The construction has the following property.
Of course, here we mean that the r.v.s \(\eta _{kk'}\) are independent modulo the fact that \(\eta _{kk'} = \eta _{k'k}\).
The construction is performed recursively as follows. Assuming that \(\mathcal {H}_p\) and \(\mathcal {H}_p'\) have been constructed, we construct \(\mathcal {H}_{p+1}\) and \(\mathcal {H}_{p+1}'\). There is not much to do when \((\mathcal {H}_p,\mathcal {H}_p')\ne (\mathcal {G}_p,\mathcal {G}_p')\), so we assume that \((\mathcal {H}_p,\mathcal {H}_p')=(\mathcal {G}_p,\mathcal {G}_p')\). We work conditionally on this fact and on the r.v.s \(\eta _{kk'}\) for \(k \in J\), \(k'\le M\). We then know from (H) that the r.v.s \(\eta _{kk'}\) for \(k\in J_p\cup J_p'\), \(k'\not \in J\) are independent with \(P(\eta _{kk'}=1)=\gamma /N\).
For each \(k \in J_p\cup J_p'\), consider the set
Thus the r.v.s \(\mathrm {card}C(k)\) are independent with law \(B(M-n,\gamma /N)\), where \(n=\mathrm {card}J\). From Corollary 14.3 we see that there exist sets \((B(k))_{k \in J_p\cup J_p'}\) such that
and that \(\mathrm {card}B(k)\) is Poisson of parameter \(\beta \). We can also assume that the r.v.s \(\mathrm {card}B(k)\) are independent, that any two of the sets \(B(k){\setminus } C(k)\) are disjoint, and that these sets are disjoint from \(J\).
Consider the event \(\Omega \) given by
On the event \(\Omega \) we modify the sets \(B(k)\) into sets \(B'(k)\) of the same cardinality that are disjoint from each other and from \(J\). The sets \((B'(k))_{k\in J_p\cup J'_p}\) are now disjoint, and their cardinalities are the same as those of the sets \(B(k)\), so that they are independent Poisson r.v.s of parameter \(\beta \). If we consider \(B'(k)\) as the set of offspring of \(k\) for either \(k \in J_p\) or \(k \in J'_p\), this defines independent Poisson random trees \(\mathcal {H}_{p+1}\) and \(\mathcal {H}_{p+1}'\). Consider the event \(\Omega '\) defined by \( \exists k \not = k',\ k,k' \in J_p \cup J_p',\ \eta _{kk'}=1.\) When \(\Omega \cup \Omega '\) does not occur, \(\mathcal {G}_p\) is such that \(C(k)\) is the set of offspring of \(k \in J_p\) (and similarly for \(\mathcal {G}'_p\)). Thus, the conditional probability that \((\mathcal {H}_{p+1},\mathcal {H}_{p+1}')\ne (\mathcal {G}_{p+1},\mathcal {G}_{p+1}')\) is at most
Moreover, it should be clear that this construction can be done in such a way that condition (H) holds for \((p+1)\), thereby concluding the argument.
References
Panchenko, D.: The Sherrington–Kirkpatrick Model. Springer Monographs in Mathematics, xii+156 pp. Springer, New York (2013). ISBN: 978-1-4614-6288-0
Panchenko, D.: The Parisi ultrametricity conjecture. Ann. Math. (2) 177(1), 383–393 (2013)
Panchenko, D.: Structure of 1-RSB asymptotic Gibbs measures in the diluted \(p\)-spin models. J. Stat. Phys. 155(1), 1–22 (2014)
Talagrand, M.: Mean Field Models for Spin Glasses, vol. 10, xviii+485 pp. Springer, Berlin (2011). ISBN: 978-3-642-15201-6
Talagrand, M.: Mean Field Models for Spin Glasses, vol. 2, xii+ 629 pp. Springer, Berlin (2011). ISBN: 978-3-642-22252-8
Talagrand, M. A mean-field spin glass model based on diluted \(V\)-statistics. Probab. Theory Relat. Fields 165, 401–445 (2016). https://doi.org/10.1007/s00440-015-0634-8
Mathematics Subject Classification
- Primary 60K35
- Secondary 82D30