Abstract
Recovery of an N-dimensional, K-sparse solution \({\mathbf {x}}\) from an M-dimensional vector of measurements \({\mathbf {y}}\) for multivariate linear regression can be accomplished by minimizing a suitably penalized least-mean-square cost \(||{\mathbf {y}}-{\mathbf {H}} {\mathbf {x}}||_2^2+\lambda V({\mathbf {x}})\). Here \({\mathbf {H}}\) is a known matrix and \(V({\mathbf {x}})\) is an algorithm-dependent sparsity-inducing penalty. For ‘random’ \({\mathbf {H}}\), in the limit \(\lambda \rightarrow 0\) and \(M,N,K\rightarrow \infty \), keeping \(\rho =K/N\) and \(\alpha =M/N\) fixed, exact recovery is possible for \(\alpha \) past a critical value \(\alpha _c = \alpha (\rho )\). Assuming \({\mathbf {x}}\) has iid entries, the critical curve exhibits some universality, in that its shape does not depend on the distribution of \({\mathbf {x}}\). However, the algorithmic phase transition occurring at \(\alpha =\alpha _c\) and the associated universality classes remain ill-understood from a statistical physics perspective, i.e. in terms of scaling exponents near the critical curve. In this article, we analyze the mean-field equations for two algorithms, Basis Pursuit (\(V({\mathbf {x}})=||{\mathbf {x}}||_{1} \)) and Elastic Net (\(V({\mathbf {x}})= ||{\mathbf {x}}||_{1} + \tfrac{g}{2} ||{\mathbf {x}}||_{2}^2\)), and show that they belong to different universality classes in the sense of scaling exponents, with the mean squared error (MSE) of the recovered vector scaling as \(\lambda ^\frac{4}{3}\) and \(\lambda \), respectively, for small \(\lambda \) on the critical line. In the presence of additive noise, we find that, when \(\alpha >\alpha _c\), the MSE is minimized at a non-zero value of \(\lambda \), whereas at \(\alpha =\alpha _c\), the MSE always increases with \(\lambda \).
Notes
Under this circumstance, if \(\xi \) remains of order one, then the error is dominated by \(\xi \), i.e. \(q\,(=\mathrm {MSE})=\sigma _\xi ^2\). However, this is not consistent with \(\sigma _\xi ^2=q/\alpha \), unless \(\sigma _\xi ^2=0\). Hence, in this regime, we need to consider a \(\sigma _\xi ^2\) that is comparable to \(\theta \). Therefore, as \(\vartheta \rightarrow 0\), we will have \(\sigma _\xi ^2\rightarrow 0\) and \(q\rightarrow 0\), making the reconstruction perfect; the relevant limit is \(\vartheta ,\theta ,\sigma _\xi ^2\rightarrow 0\) with \(\tfrac{\theta }{\sigma _\xi }\) of order one.
Note that \({{\hat{\rho }}}>\rho \), even in the perfect reconstruction phase. That is because a fraction of \(x_a\)’s remain non-zero as long as \(\vartheta >0\), and vanish only in the \(\vartheta \rightarrow 0\) limit.
Note that, when \(|\tau _0|=\tfrac{|x_0|}{\sigma _\xi }\rightarrow \infty \), \(\varPhi (\tau +\tau _0)+\varPhi (\tau -\tau _0)\rightarrow 1\) and \((\tau -\tau _0)\phi (\tau +\tau _0),\ (\tau +\tau _0)\phi (\tau -\tau _0)\rightarrow 0\). The \(\tau _0\)-dependent expression inside \([\ldots ]^{\mathrm {av}}_{x_0}\) in Eq. (34) goes from \(2\big \{(1+\tau ^2)\varPhi (\tau ) - \tau \phi (\tau )\big \}\) to \(1+\tau ^2\) as \(\tau _0\) goes from zero to infinity. We wrote this expression as \(1+\tau ^2-\psi _\xi (\tau _0,\tau )\).
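The limits quoted in this note can be checked numerically. Below is a minimal sketch using only the standard library, with \(\varPhi \) and \(\phi \) built from `math.erf` (the function names `Phi` and `phi` are our own, not from the paper):

```python
# Numerical check of the tau_0 -> infinity limits: Phi(tau+tau0)+Phi(tau-tau0) -> 1
# and (tau - tau0)*phi(tau + tau0) -> 0, for the standard normal CDF/PDF.
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal PDF."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

tau = 0.7
for tau0 in (5.0, 10.0, 20.0):
    s = Phi(tau + tau0) + Phi(tau - tau0)   # approaches 1
    t = (tau - tau0) * phi(tau + tau0)      # approaches 0
    print(tau0, s, t)
```

For \(\tau _0=20\) both quantities are already at their limiting values to double precision.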
References
Amelunxen, D., Lotz, M., McCoy, M.B., Tropp, J.A.: Living on the edge: phase transitions in convex programs with random data. Inf. Inference 3(3), 224–294 (2014)
Andersen, M., Dahl, J., Vandenberghe, L.: CVXOPT: a Python package for convex optimization (2010)
Bayati, M., Montanari, A.: The dynamics of message passing on dense graphs, with applications to compressed sensing. IEEE Trans. Inf. Theory 57(2), 764–785 (2011)
Bayati, M., Montanari, A.: The LASSO risk for Gaussian matrices. IEEE Trans. Inf. Theory 58(4), 1997–2017 (2012)
Candès, E., Romberg, J.: Sparsity and incoherence in compressive sampling. Inverse Probl. 23(3), 969 (2007)
Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489–509 (2006)
Chen, S.S., Donoho, D.L., Saunders, M.A.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20(1), 33–61 (1998)
Donoho, D.L.: For most large underdetermined systems of linear equations the minimal \(\ell _1\)-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59(6), 797–829 (2006)
Donoho, D.L., Tanner, J.: Sparse nonnegative solution of underdetermined linear equations by linear programming. Proc. Natl. Acad. Sci. USA 102(27), 9446–9451 (2005)
Donoho, D., Tanner, J.: Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing. Philos. Trans. R. Soc. A 367(1906), 4273–4293 (2009). https://doi.org/10.1098/rsta.2009.0152
Donoho, D.L., Maleki, A., Montanari, A.: Message-passing algorithms for compressed sensing. Proc. Natl. Acad. Sci. USA 106(45), 18914–18919 (2009)
El Karoui, N.: On the impact of predictor geometry on the performance on high-dimensional ridge-regularized generalized robust regression estimators. Probab. Theory Relat. Fields 170(1–2), 95–175 (2018)
Ganguli, S., Sompolinsky, H.: Statistical mechanics of compressed sensing. Phys. Rev. Lett. 104(18), 188701 (2010)
Kabashima, Y., Wadayama, T., Tanaka, T.: A typical reconstruction limit for compressed sensing based on \(\ell _p\)-norm minimization. J. Stat. Mech. 2009(09), L09003 (2009)
El Karoui, N.: Asymptotic behavior of unregularized and ridge-regularized high-dimensional robust regression estimators: rigorous results. arXiv preprint arXiv:1311.2445 (2013)
Kubo, R.: The fluctuation-dissipation theorem. Rep. Prog. Phys. 29(1), 255 (1966)
Ma, S.K.: Modern Theory of Critical Phenomena. Routledge, London (2018)
Mézard, M., Parisi, G., Virasoro, M.: SK model: the replica solution without replicas. Europhys. Lett. 1(2), 77–82 (1986)
Mézard, M., Parisi, G., Virasoro, M.A.: Spin Glass Theory and Beyond, vol. 9. World Scientific, Singapore (1987)
Ramezanali, M., Mitra, P.P., Sengupta, A.M.: The cavity method for analysis of large-scale penalized regression. arXiv preprint arXiv:1501.03194 (2015)
Rudelson, M., Vershynin, R.: On sparse reconstruction from Fourier and Gaussian measurements. Commun. Pure Appl. Math. 61(8), 1025–1045 (2008)
Stojnic, M.: Various thresholds for \(\ell _1 \)-optimization in compressed sensing. arXiv preprint arXiv:0907.3666 (2009)
Stojnic, M.: \(\ell _2/\ell _1\)-optimization in block-sparse compressed sensing and its strong thresholds. IEEE J. Sel. Top. Signal Process. 4(2), 350–357 (2010)
Stojnic, M.: A rigorous geometry-probability equivalence in characterization of \(\ell _1 \)-optimization. arXiv preprint arXiv:1303.7287 (2013)
Tibshirani, R., Wainwright, M., Hastie, T.: Statistical Learning with Sparsity: The Lasso and Generalizations. Chapman and Hall/CRC, New York (2015)
Tikhonov, A.N.: On the stability of inverse problems. Dokl. Akad. Nauk SSSR 39, 195–198 (1943)
Vershik, A.M., Sporyshev, P.: Asymptotic behavior of the number of faces of random polyhedra and the neighborliness problem. Selecta Math. Soviet 11(2), 181–201 (1992)
Xu, Y., Kabashima, Y.: Statistical mechanics approach to 1-bit compressed sensing. J. Stat. Mech. 2013(2), P02041 (2013)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67(2), 301–320 (2005)
Acknowledgements
This work was supported by the National Science Foundation INSPIRE (track 1) Award No. 1344069. Part of this paper was written while two of the authors (AMS and MR) were visiting Center for Computational Biology at Flatiron Institute. We are grateful for their hospitality.
Appendices
Appendix A: Equivalent Single Variable Optimization Problem
Here, we provide a sketch of our cavity argument; a detailed derivation can be found in our preprint [20]. From an algorithmic point of view, the cavity method is related to message-passing algorithms, but we will assume that the algorithm converges to a state. We wish to find an approximate statistical description of that state. After discussing the cavity method, we also briefly mention how to connect these results to the replica calculations found in [13, 14].
We will consider the case where the function V is twice differentiable. To construct potentials like the \(\ell _1\) norm, we use twice-differentiable surrogates such as \(r\ln (2\cosh (x/r))\), which tends to \(|x|\) as r goes to zero. We can study the solution for \(r>0\) and then take the appropriate limit.
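The convergence of this surrogate to \(|x|\) can be illustrated in a few lines. A minimal sketch (our own helper name `smooth_abs`; `logaddexp` is used because \(\cosh (x/r)\) overflows for small r):

```python
# Smooth surrogate r*ln(2*cosh(x/r)) for |x|, evaluated stably via
# 2*cosh(t) = e^t + e^{-t}, i.e. ln(2*cosh(t)) = logaddexp(t, -t).
import numpy as np

def smooth_abs(x, r):
    t = np.asarray(x, dtype=float) / r
    return r * np.logaddexp(t, -t)

x = np.linspace(-2.0, 2.0, 9)
for r in (1.0, 0.1, 0.01):
    err = np.max(np.abs(smooth_abs(x, r) - np.abs(x)))
    print(r, err)  # worst case is at x = 0, where the surrogate equals r*ln(2)
```

The maximal deviation from \(|x|\) shrinks linearly with r, consistent with taking the \(r\rightarrow 0\) limit after solving at \(r>0\).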
Minimization of the original penalized regression objective function is mathematically equivalent to the minimization of \({\mathcal {E}}({\mathbf {u}})\) over \({\mathbf {u}}\) (even if, in practice, we do not know the explicit form of \({\mathcal {E}}({\mathbf {u}})\)).
Note that the \(z_i\) variables and the \(u_a\) variables only interact via the random measurement matrix \({\mathbf {H}}\). From this point on, our arguments are similar to those of Xu and Kabashima [28]. We try to find single-variable functions \({\mathcal {E}}_a(u_a)\) and \({\mathcal {E}}_i(z_i)\) whose optimization mimics the full optimization problem.
We consider the problem with an a-cavity, meaning a problem where the variable \(x_a=u_a+x_{0a}\) has been set to zero. We also consider a problem with an i-cavity, namely one where \(z_i\) has been set to zero. The single-variable functions are constructed by introducing the inactive/missing variable into the corresponding cavity. The discussion becomes simpler if we assume that the optimization in the systems with cavities is already well-approximated by optimizing a sum of single-variable functions of the following forms (as is done in Xu and Kabashima [28]):
and
For simplicity, we will call these single-variable functions potentials.
We will set up slightly more involved notation for these parameters in a cavity system. With a missing, the potentials for \(z_i\) are represented by
Similarly, with i missing, we have
We argue that the corresponding parameters with or without cavity are nearly the same.
1.1 Step 1: Introducing \(x_a\) into the a-cavity
From here, using the parametrization of \({\mathcal {E}}_{i\rightarrow a}(z_i)\) according to Eq. 67, we find that the optimal \(z_j=\tfrac{H_{ja}u_a-K_{j\rightarrow a}}{K_{j\rightarrow a}}\), leading to
and
Since \(H_{ia}\sim \tfrac{1}{\sqrt{N}}\), \(A_a\approx A_{a\rightarrow i}\) and \(F_a\approx F_{a\rightarrow i}\).
1.2 Step 2: Introducing \(z_i\) into the i-cavity
Now, we use the parametrization of \({\mathcal {E}}_{a\rightarrow i}(u_a)\) according to Eq. 68, and find that the optimal \(u_b\) satisfies
Since \(z_iH_{ib}\sim \tfrac{1}{\sqrt{N}}\), we can expand this equation around \((z_i,u_b)=(0,{{\bar{u}}}_b)\) and find that
We could derive equations for \(B_{i\rightarrow a}\) and \(K_{i\rightarrow a}\) as before, but we know that we can ignore the difference between these parameters and \(B_i\) and \(K_i\), respectively. Also, the optimal \(u_a\approx {{\bar{u}}}_a\).
1.3 Step 3: Putting it all together
We now only deal with \(A_a,B_i\) etc. From Eqs. 71 and 77
For large M, N the expressions are self-averaging. One can essentially replace \(H_{ia}^2\) by \(\tfrac{1}{M}=\tfrac{1}{\alpha N}\) and see that \(A_a=A,B_i=B\); in other words, these quantities are essentially index-independent.
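The self-averaging step above can be illustrated numerically. A minimal sketch (our own toy setup, assuming iid Gaussian entries of variance \(1/M\)): the column sums \(\sum _i H_{ia}^2\) concentrate at 1, which is what licenses replacing each \(H_{ia}^2\) by its mean \(1/M\).

```python
# Self-averaging illustration: for H with iid N(0, 1/M) entries,
# sum_i H_ia^2 concentrates at its mean 1, with fluctuations ~ sqrt(2/M).
import numpy as np

rng = np.random.default_rng(0)
alpha, N = 0.5, 4000
M = int(alpha * N)
H = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, N))

col_norms = np.sum(H**2, axis=0)          # one value per column index a
print(col_norms.mean(), col_norms.std())  # mean ~ 1, std ~ sqrt(2/M) ~ 0.03
```

As M, N grow, the spread shrinks like \(1/\sqrt{M}\), making the index dependence of \(A_a\) and \(B_i\) negligible.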
If we identify \(B=\sigma _\mathrm {eff}^2=\tfrac{1}{A}\), we see that we recover
according to the definition in Proposition 1.
To get our final result, we need \(F_a\). Using Eqs. 72 and 78 and the various approximations
We thus identify the term inside the square bracket as \(\xi _a\). Its distribution over different choices of \({\mathbf {H}}\) and \(\varvec{\zeta }\) is approximately normal, with mean zero and variance \(\sigma _\zeta ^2+\tfrac{\alpha }{N}\sum _{b\ne a}u_b^2\approx \sigma _\zeta ^2+\alpha q\). Also, \(\xi _a\) and \(\xi _b\) are nearly uncorrelated for \(a\ne b\).
At the end we get
which is the same expression as in Proposition 1.
The cavity mean-field equations arose in the context of spin systems in solid state physics [18, 19]. These equations take into account feedback dependencies by estimating the reaction of all the other ‘spins’/variables when a single spin is removed from the system, thereby leaving a ‘cavity’. This leads to a considerable simplification by utilizing the fact that the system of variables is fully connected. The local susceptibility matrix \(\varvec{\chi }\), a common quantity in physics, measures how stable the solution is to perturbations. This quantity plays a key role in such systems [20]. In particular, in the asymptotic limit of large M and N, certain quantities (e.g. MSE and average local susceptibility, \({\overline{\chi }}({\mathbf {x}})\)) converge, i.e. become independent of the detailed realization of the matrix \({\mathbf {H}}\). In this limit, a sudden increase in susceptibility signals the error-prone phase.
These results are equivalent to those obtained by [13, 14] with the replica approach. Those approaches begin with a finite-temperature statistical mechanics model. To make a connection with these studies, one should replace \({\overline{\chi }}\) by the quantity \(\beta \varDelta Q\), where \(\beta \) plays the role of inverse temperature. The quantity \(\varDelta Q\) can be defined as
where \(\langle \cdots \rangle \) is the average over ‘thermal’ fluctuations of \({\mathbf {u}}\) within the ensemble \(P_\beta \):
The quantity \(\varDelta Q\) is nothing but the ‘thermal’ fluctuation in \({\mathbf {u}}\), and \(\beta \varDelta Q\) can in fact be identified as a local susceptibility via the fluctuation-dissipation theorem [16]. Our results are obtained in the limit \(\beta \rightarrow \infty \), where the probability distribution becomes peaked near the minimum, turning the problem into an optimization problem. The local susceptibility, however, remains well-defined in this limit.
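Why \(\beta \varDelta Q\) stays finite as \(\beta \rightarrow \infty \) can be seen in a one-variable sketch, assuming the quadratic effective potential of Appendix A (the symbols A and F are those of the cavity parametrization; this is an illustration, not the paper's full calculation):

```latex
% For a single variable with effective potential
%   \mathcal{E}(u) = \tfrac{A}{2}\,u^2 - F u,
% the ensemble P_\beta(u) \propto e^{-\beta \mathcal{E}(u)} is Gaussian, so
\langle u \rangle = \frac{F}{A},
\qquad
\varDelta Q = \langle u^2 \rangle - \langle u \rangle^2 = \frac{1}{\beta A},
\qquad\Longrightarrow\qquad
\beta\,\varDelta Q = \frac{1}{A},
% which is independent of \beta, hence finite as \beta \to \infty,
% consistent with the identification B = \sigma_{\mathrm{eff}}^2 = 1/A.
```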
Appendix B: Ridge Regression via Singular Value Decomposition
For the sake of completeness, in this appendix we derive Eqs. (17) and (18) in Sect. 4.1 using a singular value decomposition. An elementary derivation leads to the explicit expression:
where we use the singular-vector basis of the matrix \({\mathbf {H}}\), with \(s_i\) being the non-zero singular values and \({\mathcal {V}}_i\) the corresponding right singular vectors. When we take the limit of vanishing \(\sigma ^2\), we are left with the projection of the N-dimensional vector \({\mathbf {x}}_0\) onto the M-dimensional subspace spanned by the \({\mathcal {V}}_i\)'s. In other words,
\({\mathbf {P}}\) being the projection matrix. For random \({\mathbf {H}}\), the \({\mathcal {V}}_i\)'s are just a random choice of M orthonormal vectors. Thus, the properties of the estimate depend on the statistics of the projection onto a random M-dimensional subspace.
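The projection limit can be verified directly. A minimal sketch (our own toy instance, assuming Gaussian \({\mathbf {H}}\) and noiseless \({\mathbf {y}}={\mathbf {H}}{\mathbf {x}}_0\)): the ridge estimate with vanishing \(\sigma ^2\) coincides with \({\mathbf {P}}{\mathbf {x}}_0\), where \({\mathbf {P}}={\mathbf {V}}{\mathbf {V}}^T\) is built from the right singular vectors of \({\mathbf {H}}\).

```python
# Ridge estimate as sigma^2 -> 0 versus projection onto the row space of H.
import numpy as np

rng = np.random.default_rng(1)
M, N = 40, 100
H = rng.normal(size=(M, N)) / np.sqrt(M)
x0 = rng.normal(size=N)
y = H @ x0

# Ridge solution in its dual (M x M) form, which is better conditioned:
# (H^T H + s2 I)^{-1} H^T y  ==  H^T (H H^T + s2 I)^{-1} y.
sigma2 = 1e-10
x_ridge = H.T @ np.linalg.solve(H @ H.T + sigma2 * np.eye(M), y)

# Projection P x0 with P = V V^T from the thin SVD (rows of Vt are V_i^T).
_, _, Vt = np.linalg.svd(H, full_matrices=False)
x_proj = Vt.T @ (Vt @ x0)

print(np.max(np.abs(x_ridge - x_proj)))  # negligible for vanishing sigma^2
```

The dual form is used deliberately: \({\mathbf {H}}{\mathbf {H}}^T\) is full rank for \(M<N\), so the solve stays well-conditioned even as \(\sigma ^2\rightarrow 0\).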
For the variance, we need to consider second-order moments of the matrix elements of \({\mathbf {P}}\), in particular \([P_{ab}P_{ac}]^{\mathrm {av}}_{{\mathbf {H}}}\). We can parametrize \([P_{ab}P_{ac}]^{\mathrm {av}}_{{\mathbf {H}}}=A\delta _{bc}+B\delta _{ab}\delta _{bc}\). Since \({\mathbf {P}}\) is a projection operator, \({\mathbf {P}}^2={\mathbf {P}}\), and it is a symmetric matrix. Hence,
In the limit \(M,N\rightarrow \infty \) with \(\alpha \) fixed, the distribution of \(P_{aa}\) becomes highly concentrated around its mean \(\alpha \). As a result,
Using the two constraints, represented by Eqs. (91) and (92), we can determine A and B, in the large M, N limit, leading to,
The variance is now given by,
recovering our earlier result.
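The moment parametrization above can be checked by Monte Carlo. A minimal sketch (our own numerical check, assuming the large-N values \(A\approx \alpha (1-\alpha )/N\) and mean \(P_{aa}\approx \alpha \) implied by the two constraints): we build a random projection from the SVD of a Gaussian matrix and compare the empirical diagonal mean and mean off-diagonal \(P_{ab}^2\) against these predictions.

```python
# Monte-Carlo check of [P_ab P_ac] = A d_bc + B d_ab d_bc for a random
# projection P onto an M-dimensional subspace: mean P_aa ~ alpha and
# mean off-diagonal P_ab^2 ~ A = alpha*(1 - alpha)/N.
import numpy as np

rng = np.random.default_rng(2)
N, M = 400, 200
alpha = M / N

H = rng.normal(size=(M, N))
_, _, Vt = np.linalg.svd(H, full_matrices=False)
P = Vt.T @ Vt                       # projection onto a random M-dim subspace

diag = np.diag(P)
# Mean of P_ab^2 over a != b; note sum_ab P_ab^2 = tr(P^2) = tr(P) = M.
off2 = (np.sum(P**2) - np.sum(diag**2)) / (N * (N - 1))

print(diag.mean())                  # ~ alpha
print(off2 * N)                     # ~ N*A = alpha*(1 - alpha)
```

The identity \(\sum _{ab}P_{ab}^2=\mathrm {tr}({\mathbf {P}})=M\), together with the concentration of \(P_{aa}\) around \(\alpha \), is exactly what fixes A and B in the large-M, N limit.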
Ramezanali, M., Mitra, P.P. & Sengupta, A.M. Critical Behavior and Universality Classes for an Algorithmic Phase Transition in Sparse Reconstruction. J Stat Phys 175, 764–788 (2019). https://doi.org/10.1007/s10955-019-02292-6