Abstract
In 1977, Thouless, Anderson, and Palmer (TAP) derived a system of consistent equations in terms of the effective magnetization in order to study the free energy in the Sherrington–Kirkpatrick (SK) spin glass model. The solutions to their equations were predicted to contain vital information about the landscapes in the SK Hamiltonian and the TAP free energy and moreover have direct connections to Parisi’s replica ansatz. In this work, we aim to investigate the validity of the TAP equations in the generic mixed p-spin model. By utilizing the ultrametricity of the overlaps, we show that the TAP equations are asymptotically satisfied by the conditional local magnetizations on the asymptotic pure states.
Notes
Talagrand’s result asserts that the N TAP equations asymptotically hold simultaneously with high probability, while we establish the TAP equations in the average sense.
Griffiths' lemma: Let \(f_N\) be a sequence of differentiable convex functions defined on an open interval I. Assume that \(f_N\) converges to f pointwise on I. If f is differentiable at some \(x\in I,\) then \(\lim _{N\rightarrow \infty }f_N'(x)=f'(x).\)
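As a small numerical illustration of this lemma (not taken from the paper), consider the hypothetical family \(f_N(x)=\sqrt{x^2+1/N}\), which is convex and differentiable and converges pointwise to \(f(x)=|x|\). The sketch below checks that \(f_N'(x)\) approaches \(f'(x)=1\) at \(x=0.5\), where f is differentiable. At \(x=0\) the limit is not differentiable, so the lemma makes no claim there.

```python
import math


def f_N(x: float, N: int) -> float:
    """Convex, differentiable approximation of |x| (illustrative choice)."""
    return math.sqrt(x * x + 1.0 / N)


def df_N(x: float, N: int) -> float:
    """Exact derivative of f_N with respect to x."""
    return x / math.sqrt(x * x + 1.0 / N)


x = 0.5  # f(x) = |x| is differentiable here, with f'(x) = 1
derivs = [df_N(x, N) for N in (10, 1_000, 100_000, 10_000_000)]
errors = [abs(d - 1.0) for d in derivs]  # distance to the limiting derivative

for d, e in zip(derivs, errors):
    print(f"f_N'(x) = {d:.8f}, error = {e:.2e}")
```

The errors shrink as N grows, which is the behaviour the lemma guarantees at points of differentiability of the limit.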
References
Adhikari, A., Brennecke, C., von Soosten, P., Yau, H.-T.: Dynamical approach to the TAP equations for the Sherrington–Kirkpatrick model. J. Stat. Phys. 183(35), 1–27 (2021)
Auffinger, A., Chen, W.-K.: The Parisi formula has a unique minimizer. Commun. Math. Phys. 335(3), 1429–1444 (2015)
Auffinger, A., Jagannath, A.: On spin distributions for generic \(p\)-spin models. J. Stat. Phys. 174(2), 316–332 (2019)
Auffinger, A., Jagannath, A.: Thouless-Anderson-Palmer equations for generic \(p\)-spin glasses. Ann. Probab. 47(4), 2230–2256 (2019)
Barra, A., Contucci, P., Mingione, E., Tantari, D.: Multi-species mean field spin glasses. Rigorous results. Ann. Henri Poincaré 16, 691–708 (2015)
Bayati, M., Montanari, A.: The dynamics of message passing on dense graphs, with applications to compressed sensing. IEEE Trans. Inf. Theory 57(2), 764–785 (2011)
Belius, D.: High temperature TAP upper bound for the free energy of mean field spin glasses. arXiv preprint arXiv:2204.00681, (2022)
Belius, D., Kistler, N.: The TAP-Plefka variational principle for the spherical SK model. Commun. Math. Phys. 367(3), 991–1017 (2019)
Bolthausen, E.: An iterative construction of solutions of the TAP equations for the Sherrington-Kirkpatrick model. Commun. Math. Phys. 325(1), 333–366 (2014)
Bolthausen, E.: A Morita type proof of the replica-symmetric formula for SK. In: Statistical Mechanics of Classical and Disordered Systems. volume 293 of Springer Proceedings in Mathematics & Statistics, pp. 63–93. Springer, Cham (2019)
Brennecke, C., Yau, H.-T.: The replica symmetric formula for the SK model revisited. J. Math. Phys. 63(073302), 1–12 (2022)
Chatterjee, S.: Spin glasses and Stein’s method. Probab. Theory Relat. Fields 148(3–4), 567–600 (2010)
Chen, W.-K., Panchenko, D.: On the TAP free energy in the mixed \(p\)-spin models. Commun. Math. Phys. 362(1), 219–252 (2018)
Chen, W.-K., Panchenko, D., Subag, E.: Generalized TAP free energy. Commun. Pure Appl. Math. (2018)
Chen, W.-K., Panchenko, D., Subag, E.: The generalized TAP free energy II. Commun. Math. Phys. 381(1), 257–291 (2021)
Chen, W.-K., Tang, S.: On Convergence of the Cavity and Bolthausen’s TAP Iterations to the Local Magnetization. Commun. Math. Phys. 386, 1209–1242 (2021)
de Almeida, J.R.L., Thouless, D.J.: Stability of the Sherrington-Kirkpatrick solution of a spin glass model. J. Phys. A 11(5), 983–990 (1978)
El Alaoui, A., Montanari, A., Sellke, M.: Optimization of mean-field spin glasses. Ann. Probab. 49(6), 2922–2960 (2021)
Jagannath, A.: Approximate ultrametricity for random measures and applications to spin glasses. Commun. Pure Appl. Math. 70, 611–664 (2017)
Javanmard, A., Montanari, A.: State evolution for general approximate message passing algorithms, with applications to spatial coupling. Inf. Inference J. IMA 2(2), 115–144 (2013)
Kabashima, Y., Krzakala, F., Mézard, M., Sakata, A., Zdeborová, L.: Phase transitions and sample complexity in Bayes-optimal matrix factorization. IEEE Trans. Inf. Theory 62(7), 4228–4265 (2016)
Mézard, M., Parisi, G., Virasoro, M.A.: Spin Glass Theory and Beyond. World Scientific Lecture Notes in Physics, vol. 9. World Scientific Publishing Co. Inc, Teaneck, NJ (1987)
Mézard, M., Virasoro, M.A.: The microstructure of ultrametricity. Journal de Physique 46(8), 1293–1307 (1985)
Montanari, A.: Optimization of the Sherrington–Kirkpatrick Hamiltonian. SIAM J. Comput. 0(0):FOCS19–1, (2021)
Montanari, A., Richard, E.: Non-negative principal component analysis: Message passing algorithms and sharp asymptotics. IEEE Trans. Inf. Theory 62(3), 1458–1484 (2015)
Montanari, A., Venkataramanan, R.: Estimation of low-rank matrices via approximate message passing. Ann. Stat. 49(1), 321–345 (2021)
Panchenko, D.: On differentiability of the Parisi formula. Electron. Commun. Probab. 13, 241–247 (2008)
Panchenko, D.: The Parisi ultrametricity conjecture. Ann. Math. 177(1), 383–393 (2013)
Panchenko, D.: The Sherrington-Kirkpatrick model. Springer Monographs in Mathematics. Springer, New York (2013)
Panchenko, D.: The Parisi formula for mixed \( p \)-spin models. Ann. Probab. 42(3), 946–958 (2014)
Parisi, G.: Infinite number of order parameters for spin-glasses. Phys. Rev. Lett. 43(23), 1754 (1979)
Parisi, G.: A sequence of approximated solutions to the SK model for spin glasses. J. Phys. A: Math. Gen. 13(4), L115 (1980)
Parisi, G.: Order parameter for spin-glasses. Phys. Rev. Lett. 50(24), 1946 (1983)
Sherrington, D., Kirkpatrick, S.: Solvable model of a spin glass. Phys. Rev. Lett. 35, 1792–1796 (1975)
Subag, E.: Following the ground-states of full-RSB spherical spin glasses. Commun. Pure Appl. Math. 74, 1021–1044 (2020)
Subag, E.: TAP approach for multi-species spherical spin glasses I: general theory. arXiv preprint arXiv:2111.07132, (2021)
Subag, E.: TAP approach for multi-species spherical spin glasses II: the free energy of the pure models. Ann. Probab. 51(3), 1004–1024 (2023)
Talagrand, M.: The Parisi formula. Ann. Math. 163(1), 221–263 (2006)
Talagrand, M.: Construction of pure states in mean field models for spin glasses. Probab. Theory Relat. Fields 148, 601–643 (2010)
Talagrand, M.: Mean field models for spin glasses. Volume I, volume 54 of Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics]. Springer, Berlin, 2011. Basic examples
Talagrand, M.: Mean field models for spin glasses. Volume II, volume 55 of Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics]. Springer, Heidelberg, 2011. Advanced replica-symmetry and low temperature
Thouless, D.J., Anderson, P.W., Palmer, R.G.: Solution of ‘solvable model of a spin glass’. Philos. Mag. 35(3), 593–601 (1977)
Zdeborová, L., Krzakala, F.: Statistical physics of inference: Thresholds and algorithms. Adv. Phys. 65(5), 453–552 (2016)
Acknowledgements
W.-K. Chen’s research is partly supported by NSF grants (DMS-1752184 and DMS-2246715) and the Simons fellowship (#1027727). S. Tang’s research is partly supported by the Simons Collaboration Grant (#712728) and the NSF LEAPS-MPS Award (DMS-2137614). Both authors thank A. Auffinger for explaining their work [3] and M. Sellke for pointing out a few typos. They also thank the anonymous referees for carefully reading the manuscript and providing valuable comments.
Additional information
Communicated by S. Chatterjee.
Appendices
Proofs of Propositions 4.2 and 4.3
The proofs of Propositions 4.2 and 4.3 are based on the following lemma:
Lemma A.1
There exists a constant \(K= K(\beta , h)>0\) such that for all \(N\ge 1\) and small \(\epsilon >0,\)
Proof
For notational simplicity, we suppress the superscript \(\alpha \) and write \(A_p^{\alpha }\) as \(A_p\). We handle the series of \(A_p\) first. Note that for each p, we only need to consider the case \(N\ge p\), since otherwise \(A_p=0\) by the definition (41). Write
where the second sum is over all \({{\textbf{i}}}_l = (i_{l,1},\cdots , i_{l, p-1})\), for \(l=1,2,3,4\) that are \((p-1)\)-tuples with strictly increasing coordinates from \(\{1,2, \ldots , N-1\}^{p-1}\). We note that there are \(\left( {\begin{array}{c}N-1\\ p-1\end{array}}\right) \) choices for each \({{\textbf{i}}}_l\). Write \(g_{i_{l,1},\ldots , i_{l, p-1}, N}\) as \(g_{{{\textbf{i}}}_l, N}\) and let \(\Delta _p:=\beta _{p}\sqrt{p!}{N^{-(p-1)/2}}.\) We have
Also, there exists a constant \(C>0\) independent of \(p,N,{{\textbf{i}}}\) such that for any \(0\le k_1,k_2,k_3,k_4\le 4\), if we let \(d=k_1+k_2+k_3+k_4,\) then
and
These can be established by an induction argument on d. Now we divide the collection of \(({{\textbf{i}}}_1,{{\textbf{i}}}_2,{\textbf{i}}_3,{{\textbf{i}}}_4)\) into three cases and compute, respectively, an upper bound for the summand in (62) under each case. In the following discussion, \(C_1,C_1',C_2,C_2',\ldots \) are absolute constants independent of N and p.
- Case I: all four tuples are distinct. Applying Gaussian integration by parts and the chain rule, we get
  $$\begin{aligned}&\ \ \ \Big |\mathbb EG_{N,\beta }(\alpha )g_{{{\textbf{i}}}_1,N}g_{{\textbf{i}}_2,N}g_{{{\textbf{i}}}_3,N}g_{{{\textbf{i}}}_4,N} \prod _{l=1}^{4}\prod _{k=1}^{p-1}\langle \sigma _{i_{l,k}} \rangle _{N,\beta }^{\alpha }\Big | \le C_1\Delta _p^4(p-1)^4\mathbb EG_{N,\beta }(\alpha ) . \end{aligned}$$
  Since the number of choices for \(({{\textbf{i}}}_1,{{\textbf{i}}}_2, {\textbf{i}}_3, {{\textbf{i}}}_4)\) in Case I is no more than \(\left( {\begin{array}{c}N-1\\ p-1\end{array}}\right) ^4\), the summation in (62) for Case I is bounded by
  $$\begin{aligned}&\Delta _p^4\sum _{\alpha \in \Sigma _{N}}\sum _{{\tiny \mathrm {Case\,\,I}}} \Big |\mathbb EG_{N,\beta }(\alpha )g_{{{\textbf{i}}}_1,N}g_{{{\textbf{i}}}_2,N}g_{{{\textbf{i}}}_3,N}g_{{{\textbf{i}}}_4,N} \prod _{l=1}^{4}\prod _{k=1}^{p-1}\langle \sigma _{i_{l,k}} \rangle _{N,\beta }^{\alpha }\Big |\\&\quad \le \Delta _p^4\cdot \left( {\begin{array}{c}N-1\\ p-1\end{array}}\right) ^4\cdot C_1\Delta _p^4(p-1)^4\le C_1\beta _p^8p^8. \end{aligned}$$
- Case II: there are three distinct tuples in \(({{\textbf{i}}}_1, {{\textbf{i}}}_2, {{\textbf{i}}}_3, {{\textbf{i}}}_4)\). Without loss of generality, suppose \({{\textbf{i}}}_1 = {{\textbf{i}}}_2\) and they are both different from the distinct tuples \({{\textbf{i}}}_3,{{\textbf{i}}}_4\). In this case, again using Gaussian integration by parts twice and the chain rule, each summand in (62) is bounded in absolute value by
  $$\begin{aligned} \Bigl |\mathbb EG_{N,\beta }(\alpha )g^2_{{{\textbf{i}}}_1,N}g_{{\textbf{i}}_3,N}g_{{{\textbf{i}}}_4,N}\prod _{l=1}^{4} \prod _{k=1}^{p-1}\langle \sigma _{i_{l,k}} \rangle _{N,\beta }^{\alpha }\Bigr |\le C_2 \Delta _p^2(p-1)^2\mathbb Eg_{{{\textbf{i}}}_1,N}^2 G_{N,\beta }(\alpha ). \end{aligned}$$
  It follows that
  $$\begin{aligned}&\Delta _p^4\sum _{\alpha \in \Sigma _{N}}\sum _{{\tiny \mathrm {Case\,\,II}}} \Big |\mathbb EG_{N,\beta }(\alpha )g_{{{\textbf{i}}}_1,N}g_{{{\textbf{i}}}_2,N}g_{{{\textbf{i}}}_3,N}g_{{{\textbf{i}}}_4,N} \prod _{l=1}^{4}\prod _{k=1}^{p-1}\langle \sigma _{i_{l,k}} \rangle _{N,\beta }^{\alpha }\Big |\\&\quad \le C_2'\Delta _p^4\cdot \left( {\begin{array}{c}N-1\\ p-1\end{array}}\right) ^3\cdot \Delta _p^2(p-1)^2 \cdot \mathbb Eg_{{{\textbf{i}}}_1,N}^2\\&\quad \le C_2'\Delta _p^6 \left( {\begin{array}{c}N-1\\ p-1\end{array}}\right) ^3(p-1)^2\le C_2'\beta _p^6p^5. \end{aligned}$$
- Case III: there are no more than two distinct tuples. In this case, we have three possibilities, bounded in absolute value respectively as follows:
  $$\begin{aligned} \Bigl |\mathbb EG_{N,\beta }(\alpha )g^2_{{{\textbf{i}}}_1,N}g^2_{{{\textbf{i}}}_2,N}\prod _{l=1}^{4} \prod _{k=1}^{p-1}\langle \sigma _{i_{l,k}} \rangle _{N,\beta }^{\alpha }\Bigr |&\le \mathbb EG_{N,\beta }(\alpha )g^2_{{{\textbf{i}}}_1,N}g^2_{{{\textbf{i}}}_2,N} ,\\ \Bigl |\mathbb EG_{N,\beta }(\alpha )g^1_{{{\textbf{i}}}_1,N}g^3_{{{\textbf{i}}}_2,N}\prod _{l=1}^{4} \prod _{k=1}^{p-1}\langle \sigma _{i_{l,k}} \rangle _{N,\beta }^{\alpha }\Bigr |&\le \mathbb EG_{N,\beta }(\alpha ) \left| g^1_{{{\textbf{i}}}_1,N}g^3_{{{\textbf{i}}}_2,N}\right| ,\\ \Bigl |\mathbb EG_{N,\beta }(\alpha )g^4_{{{\textbf{i}}}_1,N}\prod _{l=1}^{4} \prod _{k=1}^{p-1}\langle \sigma _{i_{l,k}} \rangle _{N,\beta }^{\alpha }\Bigr |&\le \mathbb EG_{N,\beta }(\alpha )g^4_{{{\textbf{i}}}_1,N}. \end{aligned}$$
  Consequently,
  $$\begin{aligned}&\Delta _p^4\sum _{\alpha \in \Sigma _{N}}\sum _{{\tiny \mathrm {Case\,\,III}}} \Big |\mathbb EG_{N,\beta }(\alpha )g_{{{\textbf{i}}}_1,N}g_{{{\textbf{i}}}_2,N}g_{{{\textbf{i}}}_3,N}g_{{{\textbf{i}}}_4,N} \prod _{l=1}^{4}\prod _{k=1}^{p-1}\langle \sigma _{i_{l,k}} \rangle _{N,\beta }^{\alpha }\Big |\\&\quad \le C_3\Delta _p^4\cdot \left( {\begin{array}{c}N-1\\ p-1\end{array}}\right) ^2\cdot \left[ \mathbb Eg^2_{{{\textbf{i}}}_1,N}g^2_{{{\textbf{i}}}_2,N} + \mathbb E\left| g^1_{{{\textbf{i}}}_1,N}g^3_{{{\textbf{i}}}_2,N}\right| + \mathbb Eg^4_{{{\textbf{i}}}_1,N}\right] \\&\quad \le C_3'\Delta _p^4\cdot \left( {\begin{array}{c}N-1\\ p-1\end{array}}\right) ^2 \le C_3'\beta _p^4p^2. \end{aligned}$$
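The closing inequality in each case rests on the elementary estimate \(\left( {\begin{array}{c}N-1\\ p-1\end{array}}\right) \le N^{p-1}/(p-1)!\) combined with \(\Delta _p^2=\beta _p^2\,p!\,N^{-(p-1)}\). The following Python sketch, which is only a sanity check and not part of the proof, verifies the three counting bounds exactly (constants \(C_1, C_2', C_3'\) dropped) over a small range of N and p. The names `case_bounds` and `beta_sq` are hypothetical, and `beta_sq` stands in for \(\beta _p^2\).

```python
from fractions import Fraction
from math import comb, factorial


def case_bounds(N: int, p: int, beta_sq: Fraction = Fraction(1)):
    """Return (lhs, rhs) pairs for Cases I, II, III with constants dropped.

    Exact rational arithmetic is used, so no rounding can hide a violation.
    """
    # Delta_p^2 = beta_p^2 * p! / N^(p-1)
    delta_sq = beta_sq * Fraction(factorial(p), N ** (p - 1))
    m = comb(N - 1, p - 1)  # number of increasing (p-1)-tuples in {1,...,N-1}
    lhs = (
        delta_sq**4 * m**4 * (p - 1) ** 4,  # Case I:   Delta^8 * m^4 * (p-1)^4
        delta_sq**3 * m**3 * (p - 1) ** 2,  # Case II:  Delta^6 * m^3 * (p-1)^2
        delta_sq**2 * m**2,                 # Case III: Delta^4 * m^2
    )
    rhs = (
        beta_sq**4 * p**8,  # beta_p^8 p^8
        beta_sq**3 * p**5,  # beta_p^6 p^5
        beta_sq**2 * p**2,  # beta_p^4 p^2
    )
    return list(zip(lhs, rhs))


# Check every admissible pair with 2 <= p <= N < 25.
ok = all(
    l <= r
    for N in range(2, 25)
    for p in range(2, N + 1)
    for l, r in case_bounds(N, p)
)
print("all three counting bounds hold:", ok)
```

Working with \(\Delta _p^2\) rather than \(\Delta _p\) keeps every quantity rational, since only even powers of \(\Delta _p\) appear in the three bounds.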
Combining all three cases, we have
Since \(\sum _{p\ge 2} 2^{p}\beta _{p}^{2} < \infty \), we have \(\beta _{p}^{2}= o(2^{-p})\) as \(p\rightarrow \infty \). Choosing \(p_0\) large enough such that \(\beta _{p} \le 2^{-p/2}\) and \(p^2 < 2^{p/4}\) for all \(p>p_0\), it follows that
For the summability of the series of \(B_p^\rho \), the proof is essentially the same; the only change is that in (62), \(\langle \sigma _{j} \rangle _{N,\beta }^{\alpha }\) will be replaced by \(s_{j}^\rho \). Notice that \( |s_j^\rho | \le 1\) and any partial derivatives of \(s_{j}^\rho \) of degree \(d\le 4\) with respect to the variables \((g_{{{\textbf{i}}},N})_{{{\textbf{i}}}}\) are bounded by \(\Delta _p^d\) up to an absolute constant independent of p, N and \({{\textbf{i}}}\). For example,
More general partial derivatives can be controlled by an induction argument on the number of differentiations. This implies that (64) with \(\langle \sigma _{j} \rangle _{N,\beta }^{\alpha }\) replaced by \(s_j^\rho \) is also valid. We omit the rest of the details. \(\square \)
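To see why the choice of \(p_0\) in the proof above yields a summable tail, here is a hedged worked step. It assumes that the combined bound is, up to an absolute constant, the sum of the three case bounds \(\beta _p^8p^8\), \(\beta _p^6p^5\) and \(\beta _p^4p^2\), and it uses \((a+b+c)^{1/4}\le a^{1/4}+b^{1/4}+c^{1/4}\) for nonnegative a, b, c. For \(p>p_0\), the conditions \(\beta _p\le 2^{-p/2}\) and \(p^2<2^{p/4}\) (hence \(p<2^{p/8}\)) give

$$\begin{aligned} \bigl (\beta _p^8p^8\bigr )^{1/4}&=\beta _p^2p^2\le 2^{-p}\cdot 2^{p/4}=2^{-3p/4},\\ \bigl (\beta _p^6p^5\bigr )^{1/4}&=\beta _p^{3/2}p^{5/4}\le 2^{-3p/4}\cdot 2^{5p/32}=2^{-19p/32},\\ \bigl (\beta _p^4p^2\bigr )^{1/4}&=\beta _pp^{1/2}\le 2^{-p/2}\cdot 2^{p/16}=2^{-7p/16}. \end{aligned}$$

Each right-hand side is the general term of a convergent geometric series in p, so under the stated assumption the fourth roots are summable over \(p>p_0\).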
Proof of Proposition 4.2
Similar to (65), we have
Since \(\bigl (\mathbb E\big \langle A_{p}^{4} \big \rangle _{N,\beta }\bigr )^{1/4}\) is summable, as proved in Lemma A.1, the right hand side can be made arbitrarily small by choosing \(p_0\) sufficiently large. The other assertion can be treated similarly. \(\square \)
Proof of Proposition 4.3
Note that for any \(p_1, p_2,p_3, p_{4}\ge 2\), using Hölder’s inequality yields
which implies that
By the Cauchy–Schwarz inequality and Lemma A.1,
Thus, (43) follows from (22). The proof of (44) is exactly the same. \(\square \)
Proof of Lemma 4.4
First of all, for any \(\tau \in \Sigma _{N-1}, \varvec{\tau }= (\tau ^1,\ldots ,\tau ^{p-1}) \in \Sigma _{N-1}^{p-1}\), \(X_{N,\beta }\) and \(Z_{N,p}\) are centered Gaussian random variables with variances bounded by \(C_\beta \) and p respectively, which results in
Using the nested structure (20) and the Hölder inequality with p conjugate exponents \(2r(p-1),2r(p-1),\ldots ,2r(p-1)\), and \(2r/(2r-1)\), we have
where the last inequality holds since \(\prod _{l=1}^{p-1}{\mathbb {1}}_{A_{\oplus }^{\rho }}(\tau ^l) - \prod _{l=1}^{p-1}{\mathbb {1}}_{A_{\ominus }^{\rho }}(\tau ^l)\in \{0,1\}\). Since
we have
From this, by a change of measure for \(\alpha =(\rho , \alpha _N)\sim G_{N,\beta }\) as in Lemma 3.2 and the Cauchy–Schwarz inequality, we obtain the second assertion,
For the first assertion, we similarly have
where the first inequality uses the Cauchy–Schwarz inequality and the second inequality is obtained by an analogous argument for \(|D_p^\alpha -D_p^\rho |^{2r}\). Via a change of measure for \(G_{N,\beta }\) as above, we can then apply the Hölder inequality in the expectation \(\mathbb E_{g_{\cdot N}}\) with three conjugate exponents 3, 3, 3 to get the desired bound,
\(\square \)
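The exponent choices in the two Hölder steps of the proof above can be checked directly. The generalized Hölder inequality \(\mathbb E|X_1\cdots X_n|\le \prod _{i=1}^{n}(\mathbb E|X_i|^{q_i})^{1/q_i}\) requires \(\sum _{i=1}^{n}1/q_i=1\). For the p exponents consisting of \(p-1\) copies of \(2r(p-1)\) together with \(2r/(2r-1)\), and for the three exponents 3, 3, 3, this reads

$$\begin{aligned} (p-1)\cdot \frac{1}{2r(p-1)}+\frac{2r-1}{2r}&=\frac{1}{2r}+\frac{2r-1}{2r}=1,\\ \frac{1}{3}+\frac{1}{3}+\frac{1}{3}&=1. \end{aligned}$$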
Proof of Lemma 5.3
The proof is essentially the same as that for Proposition 4.2. First of all, we claim that there exists a constant \(K= K(\beta , h)>0\) such that for all \(N\ge 1\) and any small \(\epsilon >0\),
This part of the argument is analogous to the proof of Lemma A.1, but slightly simpler. We begin by rewriting
When applying Gaussian integration by parts to control the last equation, we only need to differentiate \(G_{N,\beta }(\alpha )\) with respect to \(g_{{{\textbf{i}}}_l, N}\) and use the bounds on the partial derivatives of \(G_{N,\beta }(\alpha )\) given by (63). An argument identical to that in the proof of Lemma 4.2 then implies our claim (67), namely the summability of \(\bigl (\mathbb E\langle |E_p^\rho |^4\rangle _{N,\beta }\bigr )^{1/4}\). With this claim, our assertion follows immediately since, similar to (65),
and the right hand side can be made arbitrarily small by choosing \(p_0\) sufficiently large. \(\quad \square \)
About this article
Cite this article
Chen, WK., Tang, S. On the TAP Equations via the Cavity Approach in the Generic Mixed p-Spin Models. Commun. Math. Phys. 405, 87 (2024). https://doi.org/10.1007/s00220-024-04971-2