Abstract
We study support recovery for a \(k \times k\) principal submatrix with elevated mean \(\lambda /N\), hidden in an \(N\times N\) symmetric mean-zero Gaussian matrix. Here \(\lambda >0\) is a universal constant, and we assume \(k = N \rho \) for some constant \(\rho \in (0,1)\). We establish that there exists a constant \(C>0\) such that the MLE recovers a constant proportion of the hidden submatrix if \(\lambda \ge C \sqrt{\frac{1}{\rho } \log \frac{1}{\rho }}\), while such recovery is information-theoretically impossible if \(\lambda = o( \sqrt{\frac{1}{\rho } \log \frac{1}{\rho }} )\). The MLE is computationally intractable in general, and in fact, for \(\rho >0\) sufficiently small, this problem is conjectured to exhibit a statistical-computational gap. To provide rigorous evidence for this, we study the likelihood landscape for this problem, and establish that for some \(\varepsilon >0\) and \(\sqrt{\frac{1}{\rho } \log \frac{1}{\rho } } \ll \lambda \ll \frac{1}{\rho ^{1/2 + \varepsilon }}\), the problem exhibits a variant of the Overlap Gap Property (OGP). As a direct consequence, we establish that a family of local MCMC-based algorithms do not achieve optimal recovery. Finally, we establish that for \(\lambda > 1/\rho \), a simple spectral method recovers a constant proportion of the hidden submatrix.
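The spectral claim in the last sentence can be illustrated with a short simulation. This is a sketch, not the paper's algorithm verbatim: the noise normalization (symmetric Gaussian noise with entries of variance of order \(1/N\)) and the parameter choices below are our assumptions, made so that the planted rank-one component has eigenvalue \(\lambda \rho \) above the BBP-type threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative sizes; lam is chosen above the spectral threshold 1/rho
N, rho = 400, 0.1
k = int(rho * N)                      # k = rho * N
lam = 4.0 / rho                       # lam > 1/rho, so lam * rho > 1

S = np.arange(k)                      # planted support (first k indices, WLOG)

# symmetric mean-zero Gaussian noise with entries of variance ~ 1/N
G = rng.standard_normal((N, N))
W = (G + G.T) / np.sqrt(2.0 * N)

# observed matrix: elevated mean lam/N on the k x k principal block
A = W.copy()
A[np.ix_(S, S)] += lam / N

# spectral method: top eigenvector, keep its k largest coordinates in magnitude
evals, evecs = np.linalg.eigh(A)
v = evecs[:, -1]                      # eigenvector of the largest eigenvalue
S_hat = np.argsort(-np.abs(v))[:k]

overlap = len(set(S_hat) & set(S)) / k
print(f"recovered fraction of the support: {overlap:.2f}")
```

With these (assumed) parameters the planted eigenvalue \(\lambda \rho = 4\) separates cleanly from the noise bulk, and the top-\(k\) thresholding of the leading eigenvector recovers most of the support.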

Data availability
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
References
Abbe, E.: Community detection and stochastic block models: recent developments. J. Mach. Learn. Res. 18(1), 6446–6531 (2017)
Achlioptas, D., Coja-Oghlan, A., Ricci-Tersenghi, F.: On the solution-space geometry of random constraint satisfaction problems. Random Struct. Algorithms 38(3), 251–268 (2011)
Addario-Berry, L., Maillard, P.: The algorithmic hardness threshold for continuous random energy models. Math. Stat. Learn. 2(1), 77–101 (2020)
Aizenman, M., Sims, R., Starr, S.L.: Extended variational principle for the Sherrington–Kirkpatrick spin-glass model. Phys. Rev. B 68(21), 214403 (2003)
Alon, N., Krivelevich, M., Sudakov, B.: Finding a large hidden clique in a random graph. Random Struct. Algorithms 13(3–4), 457–466 (1998)
Amini, A.A., Wainwright, M.J.: High-dimensional analysis of semidefinite relaxations for sparse principal components. In: 2008 IEEE International Symposium on Information Theory, pp. 2454–2458. IEEE (2008)
Arguin, L.-P.: Spin glass computations and Ruelle’s probability cascades. J. Stat. Phys. 126(4–5), 951–976 (2007)
Auffinger, A., Chen, W.-K.: Parisi formula for the ground state energy in the mixed \(p\)-spin model. Ann. Probab. 45(6B), 4617–4631 (2017)
Auffinger, A., Chen, W.-K., Zeng, Q.: The SK model is full-step replica symmetry breaking at zero temperature. Commun. Pure Appl. Math. 73, 921–943 (2020)
Baffioni, F., Rosati, F.: Some exact results on the ultrametric overlap distribution in mean field spin glass models (i). Eur. Phys. J. B Condens. Matter Complex Syst. 17(3), 439–447 (2000)
Baik, J., Ben Arous, G., Péché, S.: Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. Ann. Probab. 33(5), 1643–1697 (2005)
Balakrishnan, S., Kolar, M., Rinaldo, A., Singh, A., Wasserman, L.: Statistical and computational tradeoffs in biclustering. In: NeurIPS 2011 Workshop on Computational Trade-Offs in Statistical Learning, vol. 4 (2011)
Banks, J., Moore, C., Vershynin, R., Verzelen, N., Xu, J.: Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization. IEEE Trans. Inf. Theory 64(7), 4872–4894 (2018)
Barak, B., Hopkins, S., Kelner, J., Kothari, P.K., Moitra, A., Potechin, A.: A nearly tight sum-of-squares lower bound for the planted clique problem. SIAM J. Comput. 48(2), 687–735 (2019)
Barbier, J., Dia, M., Macris, N., Krzakala, F., Lesieur, T., Zdeborová, L.: Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula. In: Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS’16, pp. 424–432 (2016)
Barbier, J., Macris, N., Rush, C.: All-or-nothing statistical and computational phase transitions in sparse spiked matrix estimation. In: Advances in Neural Information Processing Systems, pp. 14915–14926 (2020)
Barra, A., Genovese, G., Guerra, F.: Equilibrium statistical mechanics of bipartite spin systems. J. Phys. A: Math. Theor. 44(24), 245002 (2011)
Ben Arous, G., Gheissari, R., Jagannath, A.: Algorithmic thresholds for tensor PCA. Ann. Probab. 48(4), 2052–2087 (2020)
Ben Arous, G., Jagannath, A.: Spectral gap estimates in mean field spin glasses. Commun. Math. Phys. 361(1), 1–52 (2018)
Ben Arous, G., Wein, A.S., Zadik, I.: Free energy wells and overlap gap property in sparse PCA. In: Conference on Learning Theory, pp. 479–482. PMLR (2020)
Benaych-Georges, F., Nadakuditi, R.R.: The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Adv. Math. 227(1), 494–521 (2011)
Berthet, Q., Rigollet, P.: Complexity theoretic lower bounds for sparse principal component detection. In: Conference on Learning Theory, pp. 1046–1066 (2013)
Bhamidi, S., Dey, P.S., Nobel, A.B.: Energy landscape for large average submatrix detection problems in Gaussian random matrices. Probab. Theory Relat. Fields 168(3–4), 919–983 (2017)
Boucheron, S., Lugosi, G., Massart, P.: Concentration Inequalities. A Nonasymptotic Theory of Independence. Oxford University Press, Oxford (2013)
Brennan, M., Bresler, G., Huleihel, W.: Reducibility and computational lower bounds for problems with planted sparse structure. In: Proceedings of the 31st Conference On Learning Theory. PMLR, vol. 75, pp. 48–166 (2018)
Brennan, M., Bresler, G., Huleihel, W.: Universality of computational lower bounds for submatrix detection. In: Proceedings of the Thirty-Second Conference on Learning Theory. PMLR, vol. 99, pp. 417–468 (2019)
Butucea, C., Ingster, Y.I.: Detection of a sparse submatrix of a high-dimensional noisy matrix. Bernoulli 19(5B), 2652–2688 (2013)
Butucea, C., Ingster, Y.I., Suslina, I.A.: Sharp variable selection of a sparse submatrix in a high-dimensional noisy matrix. ESAIM Probab. Stat. 19, 115–134 (2015)
Cai, T.T., Liang, T., Rakhlin, A.: Computational and statistical boundaries for submatrix localization in a large noisy matrix. Ann. Stat. 45(4), 1403–1430 (2017)
Chandrasekaran, V., Jordan, M.I.: Computational and statistical tradeoffs via convex relaxation. Proc. Natl. Acad. Sci. 110(13), E1181–E1190 (2013)
Chen, W.-K., Gamarnik, D., Panchenko, D., Rahman, M.: Suboptimality of local algorithms for a class of max-cut problems. Ann. Probab. 47(3), 1587–1618 (2019)
Chen, Y., Xu, J.: Statistical-computational tradeoffs in planted problems and submatrix localization with a growing number of clusters and submatrices. J. Mach. Learn. Res. 17(1), 882–938 (2016)
Coja-Oghlan, A., Haqshenas, A., Hetterich, S.: Walksat stalls well below satisfiability. SIAM J. Discrete Math. 31(2), 1160–1173 (2017)
Cover, T.M., Thomas, J.A.: Elements of Information Theory, vol. 68, pp. 69–73. Wiley, New York (1991)
de Bruijn, N.G., Erdös, P.: Some linear and some quadratic recursion formulas. II. Nederl. Akad. Wetensch. Proc. Ser. A. Indagationes Math. 55, 152–163 (1952)
Deshpande, Y., Montanari, A.: Information-theoretically optimal sparse PCA. In: 2014 IEEE International Symposium on Information Theory, pp. 2197–2201. IEEE (2014)
Deshpande, Y., Montanari, A.: Improved sum-of-squares lower bounds for hidden clique and hidden submatrix problems. In: Conference on Learning Theory, pp. 523–562 (2015)
Ding, Y., Kunisky, D., Wein, A.S., Bandeira, A.S.: Sparse high-dimensional linear regression: estimating squared error and a phase transition. Ann. Stat. (to appear)
Feldman, V., Grigorescu, E., Reyzin, L., Vempala, S.S., Xiao, Y.: Statistical algorithms and a lower bound for detecting planted cliques. J. ACM (JACM) 64(2), 8 (2017)
Gamarnik, D., Li, Q.: Finding a large submatrix of a Gaussian random matrix. Ann. Stat. 46(6A), 2511–2561 (2018)
Gamarnik, D., Sudan, M.: Performance of sequential local algorithms for the random NAE-K-SAT problem. SIAM J. Comput. 46(2), 590–619 (2017)
Gamarnik, D., Zadik, I.: High-dimensional regression with binary coefficients: estimating squared error and a phase transition. In: Conference on Learning Theory, pp. 948–953 (2017)
Gamarnik, D., Zadik, I.: Sparse high-dimensional linear regression: algorithmic barriers and a local search algorithm. arXiv:1711.04952 (2017)
Gamarnik, D., Zadik, I.: The landscape of the planted clique problem: dense subgraphs and the overlap gap property. arXiv:1904.07174 (2019)
Gao, C., Ma, Z., Zhou, H.H.: Sparse CCA: adaptive estimation and computational barriers. Ann. Stat. 45(5), 2074–2101 (2017)
Hopkins, S.B., Kothari, P., Potechin, A.H., Raghavendra, P., Schramm, T.: On the integrality gap of degree-4 sum of squares for planted clique. ACM Trans. Algorithms (TALG) 14(3), 28 (2018)
Hopkins, S.B., Schramm, T., Shi, J., Steurer, D.: Fast spectral algorithms from sum-of-squares proofs: tensor decomposition and planted sparse vectors. In: Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, pp. 178–191. ACM (2016)
Jagannath, A.: Approximate ultrametricity for random measures and applications to spin glasses. Commun. Pure Appl. Math. 70(4), 611–664 (2017)
Jagannath, A., Ko, J., Sen, S.: Max \(\kappa \)-cut and the inhomogeneous Potts spin glass. Ann. Appl. Probab. 28(3), 1536–1572 (2018)
Jagannath, A., Lopatto, P., Miolane, L.: Statistical thresholds for tensor PCA. Ann. Appl. Probab. 30(4), 1910–1933 (2020)
Jagannath, A., Sen, S.: On the unbalanced cut problem and the generalized Sherrington-Kirkpatrick model. Ann. Inst. Henri Poincaré D 8(1), 35–88 (2020)
Jagannath, A., Tobasco, I.: A dynamic programming approach to the Parisi functional. Proc. Am. Math. Soc. 144(7), 3135–3150 (2016)
Jagannath, A., Tobasco, I.: Low temperature asymptotics of spherical mean field spin glasses. Commun. Math. Phys. 352(3), 979–1017 (2017)
Jagannath, A., Tobasco, I.: Some properties of the phase diagram for mixed p-spin glasses. Probab. Theory Relat. Fields 167(3–4), 615–672 (2017)
Kolar, M., Balakrishnan, S., Rinaldo, A., Singh, A.: Minimax localization of structural information in large noisy matrices. In: Advances in Neural Information Processing Systems, pp. 909–917 (2011)
Krzakala, F., Xu, J., Zdeborová, L.: Mutual information in rank-one matrix estimation. In: 2016 IEEE Information Theory Workshop (ITW), pp. 71–75. IEEE (2016)
Lelarge, M., Miolane, L.: Fundamental limits of symmetric low-rank matrix estimation. Probab. Theory Relat. Fields 173(3–4), 859–929 (2019)
Lesieur, T., Krzakala, F., Zdeborová, L.: Phase transitions in sparse PCA. In: 2015 IEEE International Symposium on Information Theory (ISIT), pp. 1635–1639. IEEE (2015)
Lesieur, T., Krzakala, F., Zdeborová, L.: Constrained low-rank matrix estimation: phase transitions, approximate message passing and applications. J. Stat. Mech. Theory Exp. 2017(7), 073403 (2017)
Ma, T., Wigderson, A.: Sum-of-squares lower bounds for sparse PCA. In: Advances in Neural Information Processing Systems, pp. 1612–1620 (2015)
Ma, Z., Wu, Y.: Computational barriers in minimax submatrix detection. Ann. Stat. 43(3), 1089–1116 (2015)
Meka, R., Potechin, A., Wigderson, A.: Sum-of-squares lower bounds for planted clique. In: Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pp. 87–96. ACM (2015)
Mézard, M., Parisi, G., Virasoro, M.A.: Spin Glass Theory and Beyond: An Introduction to the Replica Method and Its Applications, vol. 9. World Scientific Publishing Company (1987)
Mézard, M., Mora, T., Zecchina, R.: Clustering of solutions in the random satisfiability problem. Phys. Rev. Lett. 94(19), 197205 (2005)
Montanari, A.: Finding one community in a sparse graph. J. Stat. Phys. 161(2), 273–299 (2015)
Montanari, A.: Optimization of the Sherrington–Kirkpatrick Hamiltonian. SIAM J. Comput. (2021). https://doi.org/10.1137/20M132016X
Montanari, A., Reichman, D., Zeitouni, O.: On the limitation of spectral methods: from the Gaussian hidden clique problem to rank-one perturbations of Gaussian tensors. In: Advances in Neural Information Processing Systems, pp. 217–225 (2015)
Moore, C.: The computer science and physics of community detection: landscapes, phase transitions, and hardness. arXiv:1702.00467 (2017)
Panchenko, D.: The Parisi ultrametricity conjecture. Ann. Math. (2) 177(1), 383–393 (2013)
Panchenko, D.: The Sherrington–Kirkpatrick model. Springer, Berlin (2013)
Panchenko, D.: The Parisi formula for mixed \(p\)-spin models. Ann. Probab. 42(3), 946–958 (2014)
Panchenko, D.: The free energy in a multi-species Sherrington–Kirkpatrick model. Ann. Probab. 43(6), 3494–3513 (2015)
Panchenko, D.: Free energy in the mixed \( p \)-spin models with vector spins. Ann. Probab. 46(2), 865–896 (2018)
Panchenko, D.: Free energy in the Potts spin glass. Ann. Probab. 46(2), 829–864 (2018)
Rahman, M., Virag, B.: Local algorithms for independent sets are half-optimal. Ann. Probab. 45(3), 1543–1577 (2017)
Richard, E., Montanari, A.: A statistical model for tensor PCA. In: Advances in Neural Information Processing Systems, pp. 2897–2905 (2014)
Rossman, B.: Average-case complexity of detecting cliques. Ph.D. thesis, Massachusetts Institute of Technology (2010)
Schramm, T., Wein, A.S.: Computational barriers to estimation from low-degree polynomials. arXiv:2008.02269 (2020)
Shabalin, A.A., Weigman, V.J., Perou, C.M., Nobel, A.B.: Finding large average submatrices in high dimensional data. Ann. Appl. Stat. 3(3), 985–1012 (2009)
Steele, J.M.: Probability Theory and Combinatorial Optimization. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 69. SIAM, Philadelphia (1997)
Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Classics in Mathematics. Springer, Berlin (2006). ((Reprint of the 1997 edition))
Subag, E.: Following the ground-states of full-RSB spherical spin glasses. arXiv:1812.04588 (2018)
Talagrand, M.: Mean Field Models for Spin Glasses. Volume II: Advanced Replica-Symmetry and Low Temperature. Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge, vol. 55. Springer, Heidelberg (2011)
Wein, A.S., El Alaoui, A., Moore, C.: The Kikuchi hierarchy and tensor PCA. arXiv:1904.03858 (2019)
Wu, Y., Xu, J.: Statistical problems with planted structures: information-theoretical and computational limits. In: Information-Theoretic Methods in Data Science, p. 383 (2021)
Acknowledgements
The authors thank an anonymous referee for pointing out a substantial improvement to Theorem 1.2 as well as for several constructive comments that have improved the exposition of this paper. SS thanks Yash Deshpande for introducing him to the problem. DG gratefully acknowledges the support of ONR Grant N00014-17-1-2790. AJ gratefully acknowledges NSERC [RGPIN-2020-04597, DGECR-2020-00199] and the partial support of NSF Grant NSF OISE-1604232. This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Appendices
Ruelle probability cascades
For the convenience of the reader, we briefly review here basic properties of Ruelle probability cascades (RPCs) (sometimes called Derrida–Ruelle probability cascades) used throughout this paper.
1.1 Construction and basic properties
Let us begin by recalling the construction of RPCs and some basic properties. See, e.g., [70] or [48, Sec. 3.3].
Fix \(r\ge 1\) and let \(\mathcal {A}_{r}\) be as in Sect. 6.1. We label the vertices of this tree as \(\mathcal {A}_{r}=\mathbb {N}^{0}\cup \mathbb {N}^{1}\cup \cdots \cup \mathbb {N}^{r}\), where a vertex at depth k has label \(\alpha =(\alpha ^{1},\ldots ,\alpha ^{k})\), which corresponds to the root-vertex path \(\emptyset \rightarrow \alpha ^{1}\rightarrow (\alpha ^{1},\alpha ^{2})\rightarrow \cdots \rightarrow (\alpha ^{1},\ldots ,\alpha ^{k})\). As above, we denote this path by \(p(\alpha )\). Denote the depth of a vertex by \(|\alpha |\) and let \(\partial \mathcal {A}_{r}\) denote the leaves of \(\mathcal {A}_{r}\).
For \(r\ge 1\) and a fixed sequence \(0=\mu _{-1}<\mu _{0}<\cdots <\mu _{r}=1\), we construct the corresponding RPC as follows. Let \(m_{\theta }(dx)=\theta x^{-\theta -1}dx\). To each non-leaf vertex \(\alpha \in \mathcal {A}_{r}\backslash \partial \mathcal {A}_{r}\), we assign an independent copy of the Poisson point process \(PPP(m_{\mu _{|\alpha |}}(dx))\), arranged in decreasing order, and we assign each child of \(\alpha \) the term in the point process of corresponding rank. This yields a collection \((u_{\alpha })_{\alpha \in \mathcal {A}_{r}}\) of random variables. Let \(w_{\alpha }=\prod _{\gamma \in p(\alpha )}u_{\gamma }\) and finally consider the normalized collection \((v_{\alpha })_{\alpha \in \mathcal {A}_{r}}\) given by
$$\begin{aligned} v_{\alpha }=\frac{w_{\alpha }}{\sum _{\beta :|\beta |=|\alpha |}w_{\beta }}. \end{aligned}$$
The Ruelle probability cascade with parameters \((\mu _{k})_{k=-1}^{r}\) is the stochastic process \((v_{\alpha })_{\alpha \in \partial \mathcal {A}_{r}}\).
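The construction above can be simulated directly, at the price of truncating each Poisson point process to its \(n\) largest points (a genuine RPC has countably many children per vertex, so this is only an approximation). The sketch below uses the standard representation of the points of \(PPP(\theta x^{-\theta -1}dx)\) as \(\Gamma _k^{-1/\theta }\) for rate-one Poisson arrival times \(\Gamma _k\); the parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def ppp_decreasing(theta, n):
    """First n points, in decreasing order, of a PPP with intensity
    theta * x^(-theta-1) dx on (0, inf): if Gamma_1 < Gamma_2 < ... are
    arrival times of a rate-1 Poisson process, the points are Gamma_k^(-1/theta)."""
    gammas = np.cumsum(rng.exponential(size=n))
    return gammas ** (-1.0 / theta)

def rpc_weights(mus, n):
    """Approximate RPC leaf weights (v_alpha) on the n-ary, depth-r truncation
    of the tree, for parameters mus = [mu_0, ..., mu_{r-1}] in (0, 1)."""
    w = np.array([1.0])
    for mu in mus:                       # one level of the tree per parameter
        u = np.stack([ppp_decreasing(mu, n) for _ in range(len(w))])
        w = (w[:, None] * u).ravel()     # w_alpha = product of u's along the root path
    return w / w.sum()                   # normalize to obtain v_alpha

v = rpc_weights([0.3, 0.7], n=50)        # depth r = 2, 50 children per vertex
```

The output `v` is a random probability vector on the truncated set of leaves; the smaller the parameters \(\mu _k\), the more the mass concentrates on a few leaves.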
It will also be helpful to record the following elementary consequence of the definition. Let \(\mu \in \mathscr {M}_1([0,\rho ])\) have finite support and consider the overlap distribution \(\mathcal {R}(\mu )\) defined as in (6.21). For a proof see, e.g., [70, Eq. 2.82].
Lemma A.1
Let \(\mu \in \mathscr {M}_1([0,\rho ])\) be of finite support and consider \(\pi \) as defined in (6.21). Then \(\mathbb {E}\pi (R_{12} = q) = \mu (\{q\})\).
1.2 Calculating expectations and Parisi PDEs
Let us now recall the following well-known result connecting Ruelle probability cascades to Parisi-type PDEs. (Recall again that we may view \(\mathbb {N}^{r}=\partial \mathcal {A}_{r}\).) Results of this type appear in different notations throughout the spin glass literature and are sometimes referred to as consequences of the Bolthausen–Sznitman invariance of RPCs. The following results are taken from [7].
Theorem A.2
(Theorem 6 from [7]) Fix \(r\ge 1\), \(T>0\), and sequences
Let \(\psi \in C^{1}([0,T])\) be non-negative increasing and let \((g_{\psi }(\alpha ))_{\alpha \in \mathbb {N}^{r}}\) denote the centered Gaussian process with covariance
Finally, let \((v_{\alpha })_{\alpha \in \mathbb {N}^{r}}\) be a Ruelle probability cascade with parameters \((\mu _{k})\). Then we have the following.
-
(1)
For any smooth f of at most linear growth we have that
$$\begin{aligned} \mathbb {E}\log \sum _{\alpha \in \mathbb {N}^{r}}v_{\alpha }\exp [f(g_{\psi }(\alpha ))]=\phi _{\mu }(0,0), \end{aligned}$$where \(\phi \) is the unique solution to
$$\begin{aligned} \partial _{t}\phi +\frac{\psi '}{2}\left( \Delta \phi +\mu [0,t](\partial _{x}\phi )^{2}\right)&=0\\ \phi (T,x)&=f(x), \end{aligned}$$and \(\mu \in \mathscr {M}_{1}([0,T])\) is given by \(\mu (\{q_{k}\})=\mu _{k}-\mu _{k-1}\).
-
(2)
If \((g_{\psi }^{i})_{i=1}^{M}\) are iid copies of \(g_{\psi }\) and \((f_{i})\) are of at most linear growth, then
$$\begin{aligned} \mathbb {E}\log \sum _{\alpha }v_{\alpha }\exp \Big (\sum _{i}f_{i}(g_{\psi }^{i}(\alpha )) \Big )=\sum _{i}\mathbb {E}\log \sum v_{\alpha }\exp f_{i}(g_{\psi }(\alpha )). \end{aligned}$$
We note here the following corollary which has appeared more or less verbatim in many papers and follows by applying item 1 in the above with \(f(x)=x\) and the Cole–Hopf iteration.
Corollary A.3
(Proposition 7 from [7]) We have that
The simplicity of the formula in this case follows from noting that the heat equation with initial data \(e^x\) is exactly solvable.
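For atomic \(\mu \), the value \(\phi _{\mu }(0,0)\) in item (1) can be computed by the Cole–Hopf recursion alluded to above: set \(X_{r}=f\) and \(X_{k-1}(x)=\mu _{k-1}^{-1}\log \mathbb {E}\exp \big (\mu _{k-1}X_{k}(x+\sqrt{\psi (q_{k})-\psi (q_{k-1})}\,Z)\big )\) with \(Z\sim N(0,1)\), so that \(\phi _{\mu }(0,0)=X_{0}(0)\). The sketch below (illustrative parameters, our notation) implements this via Gauss–Hermite quadrature and checks it against the exactly solvable case \(f(x)=x\), where each step is affine and \(X_{0}(0)=\tfrac{1}{2}\sum _{k}\mu _{k-1}(\psi (q_{k})-\psi (q_{k-1}))\).

```python
import numpy as np

# probabilists' Gauss-Hermite rule: E g(Z) ~ sum_i w[i] g(z[i]) for Z ~ N(0,1)
z, w = np.polynomial.hermite_e.hermegauss(40)
w = w / w.sum()

def parisi_recursion(f, mus, dpsi, x0=0.0):
    """Backward Cole-Hopf recursion for atomic mu:
    X_{k-1}(x) = (1/mu_{k-1}) log E exp(mu_{k-1} X_k(x + sqrt(dpsi_k) Z)),
    started from X_r = f; returns phi_mu(0, x0)."""
    X = f
    for mu, d in zip(reversed(mus), reversed(dpsi)):
        def X(x, mu=mu, d=d, Xk=X):
            inner = np.array([Xk(x + np.sqrt(d) * zi) for zi in z])
            return np.log(np.dot(w, np.exp(mu * inner))) / mu
    return X(x0)

mus  = [0.3, 0.6, 0.9]     # values mu_0 < mu_1 < mu_2 of mu[0, t] on each interval
dpsi = [0.5, 0.3, 0.2]     # increments psi(q_k) - psi(q_{k-1})

val    = parisi_recursion(lambda x: x, mus, dpsi)   # f(x) = x
closed = sum(m * d for m, d in zip(mus, dpsi)) / 2  # exact value for f(x) = x
```

Here `val` agrees with `closed` up to quadrature error, illustrating both the recursion and the exact solvability for linear \(f\).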
Strict convexity
To prove strict convexity of \(\mathcal {P}\), let us introduce the following notation. For the sake of clarity, we make the dependence of \(u_{\nu }^{i}\) on \(\Lambda _{i}\) explicit by writing \(u_{\nu ,\Lambda _{i}}(t,x)=u_{\nu }^{i}(t,x)\). Furthermore, as (2.1) is invariant under a spatial translation, we see that
where \(\nu _{0}=\nu -\nu (\{\rho \})\delta _{\rho }\). It will also be helpful to recall the dynamic programming principle for \(u_{\nu ,\Lambda _i}(t,x)\) from [51, Lemma 3.5].
Lemma
For any \((\nu ,\lambda )\in \mathcal {A}\times \mathbb {R}\) of the form \(d\nu =mdt+c \delta _{\rho }\), we have that for any \(t<t'\le \rho \),
where \(X^{\alpha }\) solves
\(W_{s}\) is a standard Brownian motion and \(\mathcal {B}_{t'}\) is the space of bounded stochastic processes that are progressively measurable with respect to the filtration of \((W_{s})_{s\le t'}\). Furthermore, any optimal control \(\alpha _{s}^{*}\) satisfies
The proof will begin with the following observation.
Lemma B.1
For any \(\nu \in \mathcal {A}_{0}\), and any \(t\in [0,\rho ),\) \(u_{\nu ,0}(t,x)\) is strictly convex in x.
Proof
Fix distinct \(x_{0},x_{1}\in \mathbb {R}\), \(\theta \in (0,1)\), and let \(x_{\theta }=\theta x_{0}+(1-\theta )x_{1}\). Let \((X_{s}^{\theta })\) denote the optimal trajectory corresponding to \(u_{\nu ,0}(t,x_{\theta })\), and similarly let \(\alpha _{s}^{\theta }=\partial _{x}u_{\nu ,0}(s,X_{s}^{\theta })\) denote a corresponding optimal control.
Observe that if we let \(G_s=\int _{0}^{s}\sqrt{2}dW_{s_1}+\int _{0}^{s}2m\left( \alpha _{s_1}^{\theta }\right) ^{2}ds_1\), then \(X_{s}^{\theta }=G_s+x_{\theta }\). We first claim that the law of \(G_\rho \) charges every interval \((a,b)\subseteq \mathbb {R}\). Since it is possible that \(\int _0^\rho m^2(s)\, ds=\infty \), Novikov's condition may fail, so we cannot apply Girsanov's theorem directly to \(G_\rho \). We circumvent this by the following localization argument.
Fix \(0<s<\rho \). Since \(\sup _s|\alpha _{s}^{\theta }|\le 1\) by (2.8), the drift \(b_t = 2m(t)(\alpha _t^\theta )^2\) satisfies \(\sup _{0\le t \le s} | b_{t}| < C(s)\) for some non-random \(C(s)>0\). By Girsanov's theorem [81, Lemma 6.4.1], there is a tilting of the law of \(G_s\) under which \(G_s\) is Gaussian. Thus \(\mathbb {P}[G_{s} \in \mathcal {I}] >0\) for any interval \(\mathcal {I}\). Now, fix an interval \(\mathcal {I}= (a,b)\).
Note that
and thus
Further, \(\int _{s'}^{\rho } \sqrt{2} dW_{s_1} \sim \mathcal {N}(0, 2(\rho - s'))\). Fix \(C'>0\), and let \((c,d) \subset \mathcal {I}\) such that
Such an interval always exists once \(\rho -s'\) is sufficiently small. This implies
This, in turn, establishes that for any interval \(\mathcal {I}\), \(\mathbb {P}[G_{\rho } \in \mathcal {I}] >0\).
Let \(Y=G_{\rho }+x_{1}\) and \(Z=G_{\rho }+x_{0}\). Then we have that
where in the second line we use that if \(a<0<b\) then \(\left( \theta a+(1-\theta )b\right) _{+}<\theta a_{+}+(1-\theta )b_{+}\) and that
\(\square \)
Lemma B.2
The functional \(\mathcal {P}\) is strictly convex on \(\mathcal {A}_{0}\times \mathbb {R}\).
Remark B.3
It will be easy to see from the proof that it is also convex on \(\mathcal {A}\times \mathbb {R}\). Strict convexity, however, fails on this larger domain due to the invariance of the functional under the map \((mdt+c\delta _\rho ,\Lambda _1,\Lambda _2)\mapsto (mdt,\Lambda _1+2c,\Lambda _2+2c)\).
Proof
This proof follows the approach of [52, Theorem 20]. Fix \((\nu _{1},\lambda _{1}),(\nu _{2},\lambda _{2})\in \mathcal {A}_{0}\times \mathbb {R}\) and \(\theta \in [0,1]\), and let
and write \(\nu _{1}=m_{1}dt\), \(\nu _{2}=m_{2}dt\), \(\nu _{\theta }=m_{\theta }dt\) with \(m_{\theta }=\theta m_{1}+(1-\theta )m_{2}\).
Let \(X_{s}^{\theta }\) denote the optimal trajectory for the stochastic control problem for \(u_{\nu _{\theta },\lambda _{\theta }}\) and let \(\alpha ^{\theta }=\partial _{x}u_{\nu _{\theta },\lambda _{\theta }}(s,X_{s}^{\theta })\) denote the corresponding control. Finally, let \(Y^{\theta },Z^{\theta }\) solve the SDEs
with \(Y_{0}=Z_{0}=0.\)
Now, fix some \(0<t<\rho \). Then by the dynamic programming principle,
Since Eq. (2.1) is invariant under translations of space, we see that \(u_{\nu ,\lambda }(t,x)=u_{\nu ,0}(t,x+\lambda )\) for any \((\nu ,\lambda )\). Thus we may rewrite the above as
in the first inequality we have used the convexity of \(u_{\nu ,0}(t,\cdot )\) in the spatial variable. Note that, in fact, the first inequality is strict provided
In particular, it suffices to show that \(\mathrm {Var}(Y_{t}-Z_{t})+|\lambda _{1}-\lambda _{2}|>0\). Thus if \(\lambda _{1}\ne \lambda _{2}\) we are done. If they are equal, then we know that \(m_{1}\ne m_{2}\). In this case, by right continuity and monotonicity, there must be some \(s<\tau <\rho \) such that \(m_{1}(t')\ne m_{2}(t')\) for \(t'\in [s,\tau ]\) (that we can take \(\tau <\rho \) follows from the fact that if \(\nu _{1}\ne \nu _{2}\), then \(m_{1}\) and \(m_{2}\) must differ on a set of positive Lebesgue measure). In particular, we choose \(t=\tau \) from now on.
Note that by Itô's lemma, our choice of \(\alpha _{s}^{\theta }\) is a martingale, with
Thus by Itô's isometry, if we let \(\Delta _{s}=2(m_{1}-m_{2})\),
where
where \(p(t)=\int _{0}^{t}2\mathbb {E}\left( \partial _{xx}u_{\nu ,0}(s,X_{s}^{\theta })\right) ^{2}ds\). Notice that since \(t<\rho \), we have that \(\Delta \in L^{2}([0,t])\). Thus to show positivity of the variance, it suffices to show that K is positive definite.
Since u is \(C^{2}([0,t+\epsilon ]\times \mathbb {R})\) for some \(\epsilon >0\) small enough, and u is strictly convex, we have that \(\partial _{xx}u(t,x)>0\) for Lebesgue-a.e. x. Thus p(t) is strictly increasing, so the kernel corresponds to a monotone time change of a Brownian motion and is therefore positive definite. \(\square \)
Gamarnik, D., Jagannath, A. & Sen, S. The overlap gap property in principal submatrix recovery. Probab. Theory Relat. Fields 181, 757–814 (2021). https://doi.org/10.1007/s00440-021-01089-7
Keywords
- Submatrix recovery
- Overlap gap property
- Spin glasses
Mathematics Subject Classification
- Primary 68Q87
- 60C05
- Secondary 82B44
- 68Q25
- 62H25