Critical Behavior and Universality Classes for an Algorithmic Phase Transition in Sparse Reconstruction

Abstract

Recovery of an N-dimensional, K-sparse solution \({\mathbf {x}}\) from an M-dimensional vector of measurements \({\mathbf {y}}\) for multivariate linear regression can be accomplished by minimizing a suitably penalized least-mean-square cost \(||{\mathbf {y}}-{\mathbf {H}} {\mathbf {x}}||_2^2+\lambda V({\mathbf {x}})\). Here \({\mathbf {H}}\) is a known matrix and \(V({\mathbf {x}})\) is an algorithm-dependent sparsity-inducing penalty. For ‘random’ \({\mathbf {H}}\), in the limit \(\lambda \rightarrow 0\) and \(M,N,K\rightarrow \infty \), keeping \(\rho =K/N\) and \(\alpha =M/N\) fixed, exact recovery is possible for \(\alpha \) past a critical value \(\alpha _c = \alpha (\rho )\). Assuming \({\mathbf {x}}\) has iid entries, the critical curve exhibits some universality, in that its shape does not depend on the distribution of \({\mathbf {x}}\). However, the algorithmic phase transition occurring at \(\alpha =\alpha _c\) and associated universality classes remain ill-understood from a statistical physics perspective, i.e. in terms of scaling exponents near the critical curve. In this article, we analyze the mean-field equations for two algorithms, Basis Pursuit (\(V({\mathbf {x}})=||{\mathbf {x}}||_{1} \)) and Elastic Net (\(V({\mathbf {x}})= ||{\mathbf {x}}||_{1} + \tfrac{g}{2} ||{\mathbf {x}}||_{2}^2\)) and show that they belong to different universality classes in the sense of scaling exponents, with mean squared error (MSE) of the recovered vector scaling as \(\lambda ^\frac{4}{3}\) and \(\lambda \) respectively, for small \(\lambda \) on the critical line. In the presence of additive noise, we find that, when \(\alpha >\alpha _c\), MSE is minimized at a non-zero value for \(\lambda \), whereas at \(\alpha =\alpha _c\), MSE always increases with \(\lambda \).

Notes

  1. Under this circumstance, if \(\xi \) remains of order one, then the error is dominated by \(\xi \), i.e. \(q\) (the MSE) \(=\sigma _\xi ^2\). However, this is not consistent with \(\sigma _\xi ^2=q/\alpha \), unless \(\sigma _\xi ^2=0\). Hence in this regime, we need to consider a \(\sigma _\xi ^2\) that is comparable to \(\theta \). Therefore, as \(\vartheta \rightarrow 0\), we will have \(\sigma _\xi ^2\rightarrow 0\) and \(q\rightarrow 0\), making the reconstruction perfect; i.e. we take the limit \(\vartheta ,\theta ,\sigma _\xi ^2\rightarrow 0\) with \(\tfrac{\theta }{\sigma _\xi }\) of order one.

  2. Note that \({{\hat{\rho }}}>\rho \), even in the perfect reconstruction phase. That is because a fraction of \(x_a\)’s remain non-zero as long as \(\vartheta >0\), and vanish only in the \(\vartheta \rightarrow 0\) limit.

  3. Note that, when \(|\tau _0|=\tfrac{|x_0|}{\sigma _\xi }\rightarrow \infty \), \(\varPhi (\tau +\tau _0)+\varPhi (\tau -\tau _0)\rightarrow 1\) and \((\tau -\tau _0) \phi (\tau +\tau _0),( \tau +\tau _0) \phi (\tau -\tau _0)\rightarrow 0\). The \(\tau _0\) dependent expression inside \([\ldots ]^{\mathrm {av}}_{x_0}\) in Eq. (34) goes from \(2\big \{(1+\tau ^2)\varPhi (\tau ) - \tau \phi (\tau )\big \}\) to \(1+\tau ^2\) as \(\tau _0\) goes from zero to infinity. We wrote this expression as \(1+\tau ^2-\psi _\xi (\tau _0,\tau )\).

References

  1. Amelunxen, D., Lotz, M., McCoy, M.B., Tropp, J.A.: Living on the edge: phase transitions in convex programs with random data. Inf. Inference 3(3), 224–294 (2014)

  2. Andersen, M., Dahl, J., Vandenberghe, L.: CVXOPT: a Python package for convex optimization (2010)

  3. Bayati, M., Montanari, A.: The dynamics of message passing on dense graphs, with applications to compressed sensing. IEEE Trans. Inf. Theory 57(2), 764–785 (2011)

  4. Bayati, M., Montanari, A.: The LASSO risk for Gaussian matrices. IEEE Trans. Inf. Theory 58(4), 1997–2017 (2012)

  5. Candès, E., Romberg, J.: Sparsity and incoherence in compressive sampling. Inverse Probl. 23(3), 969 (2007)

  6. Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489–509 (2006)

  7. Chen, S.S., Donoho, D.L., Saunders, M.A.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20(1), 33–61 (1998)

  8. Donoho, D.L.: For most large underdetermined systems of linear equations the minimal \(\ell _1\)-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59(6), 797–829 (2006)

  9. Donoho, D.L., Tanner, J.: Sparse nonnegative solution of underdetermined linear equations by linear programming. Proc. Natl. Acad. Sci. USA 102(27), 9446–9451 (2005)

  10. Donoho, D., Tanner, J.: Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing. Philos. Trans. R. Soc. A 367(1906), 4273–4293 (2009). https://doi.org/10.1098/rsta.2009.0152

  11. Donoho, D.L., Maleki, A., Montanari, A.: Message-passing algorithms for compressed sensing. Proc. Natl. Acad. Sci. USA 106(45), 18914–18919 (2009)

  12. El Karoui, N.: On the impact of predictor geometry on the performance on high-dimensional ridge-regularized generalized robust regression estimators. Probab. Theory Relat. Fields 170(1–2), 95–175 (2018)

  13. Ganguli, S., Sompolinsky, H.: Statistical mechanics of compressed sensing. Phys. Rev. Lett. 104(18), 188701 (2010)

  14. Kabashima, Y., Wadayama, T., Tanaka, T.: A typical reconstruction limit for compressed sensing based on \(\ell _p\)-norm minimization. J. Stat. Mech. 2009(09), L09003 (2009)

  15. El Karoui, N.: Asymptotic behavior of unregularized and ridge-regularized high-dimensional robust regression estimators: rigorous results. arXiv preprint arXiv:1311.2445 (2013)

  16. Kubo, R.: The fluctuation-dissipation theorem. Rep. Prog. Phys. 29(1), 255 (1966)

  17. Ma, S.K.: Modern Theory of Critical Phenomena. Routledge, London (2018)

  18. Mézard, M., Parisi, G., Virasoro, M.: SK model: the replica solution without replicas. Europhys. Lett. 1(2), 77–82 (1986)

  19. Mézard, M., Parisi, G., Virasoro, M.A.: Spin Glass Theory and Beyond, vol. 9. World Scientific, Singapore (1987)

  20. Ramezanali, M., Mitra, P.P., Sengupta, A.M.: The cavity method for analysis of large-scale penalized regression. arXiv preprint arXiv:1501.03194 (2015)

  21. Rudelson, M., Vershynin, R.: On sparse reconstruction from Fourier and Gaussian measurements. Commun. Pure Appl. Math. 61(8), 1025–1045 (2008)

  22. Stojnic, M.: Various thresholds for \(\ell _1 \)-optimization in compressed sensing. arXiv preprint arXiv:0907.3666 (2009)

  23. Stojnic, M.: \(\ell _2/\ell _1\)-optimization in block-sparse compressed sensing and its strong thresholds. IEEE J. Sel. Top. Signal Process. 4(2), 350–357 (2010)

  24. Stojnic, M.: A rigorous geometry-probability equivalence in characterization of \(\ell _1 \)-optimization. arXiv preprint arXiv:1303.7287 (2013)

  25. Tibshirani, R., Wainwright, M., Hastie, T.: Statistical Learning with Sparsity: The Lasso and Generalizations. Chapman and Hall/CRC, New York (2015)

  26. Tikhonov, A.N.: On the stability of inverse problems. Dokl. Akad. Nauk SSSR 39, 195–198 (1943)

  27. Vershik, A.M., Sporyshev, P.: Asymptotic behavior of the number of faces of random polyhedra and the neighborliness problem. Selecta Math. Soviet 11(2), 181–201 (1992)

  28. Xu, Y., Kabashima, Y.: Statistical mechanics approach to 1-bit compressed sensing. J. Stat. Mech. 2013(2), P02041 (2013)

  29. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. 67(2), 301–320 (2005)

Acknowledgements

This work was supported by the National Science Foundation INSPIRE (track 1) Award No. 1344069. Part of this paper was written while two of the authors (AMS and MR) were visiting the Center for Computational Biology at the Flatiron Institute. We are grateful for their hospitality.

Author information


Corresponding author

Correspondence to Mohammad Ramezanali.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Equivalent Single Variable Optimization Problem

Here, we provide a sketch of our cavity argument. A detailed derivation is left for the published version of our preprint [20]. From an algorithmic point of view, the cavity method is related to message passing algorithms, but we will assume that the algorithm converges to a state. We wish to find an approximate statistical description of that state. After discussing the cavity method, we also briefly mention how to connect these results to the replica calculations found in [13, 14].

We will consider the case where the function V is twice differentiable. To construct potentials like the \(\ell _1\) norm, we use twice-differentiable functions like \(r\ln (2\cosh (x/r))\), which tends to |x| as r goes to zero. We can study the solution for \(r>0\) and then take the appropriate limit.
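
As a quick illustration (our own sketch, not from the paper), the following minimal Python snippet checks numerically that \(r\ln (2\cosh (x/r))\) approaches |x| as \(r\rightarrow 0\); the helper name smooth_abs and the use of np.logaddexp to avoid overflow are our own choices.

```python
import numpy as np

# Sketch: r*log(2*cosh(x/r)) -> |x| as r -> 0.
# np.logaddexp(a, b) = log(exp(a) + exp(b)) evaluates log(2*cosh(x/r))
# without overflowing for large |x|/r.
def smooth_abs(x, r):
    return r * np.logaddexp(x / r, -x / r)

x = np.linspace(-3.0, 3.0, 7)
for r in (1.0, 0.1, 0.01):
    err = np.max(np.abs(smooth_abs(x, r) - np.abs(x)))
    print(r, err)  # the maximal error, r*log(2) at x = 0, shrinks with r
```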

Minimization of the original penalized regression objective function is mathematically equivalent to the minimization of \({\mathcal {E}}({\mathbf {u}})\) over \({\mathbf {u}}\) (even if, in practice, we do not know the explicit form of \({\mathcal {E}}({\mathbf {u}})\)).

$$\begin{aligned}&\min _{\mathbf {u}}{\mathcal {E}}({\mathbf {u}}) \nonumber \\&\quad =\min _{\mathbf {u}}\frac{1}{2\sigma ^2} ||{\mathbf {H}}{\mathbf {u}}-\varvec{\zeta }||_2^2 + V({\mathbf {u}}+{\mathbf {x}}_0)\nonumber \\&\quad =\min _{\mathbf {u}}\max _{\mathbf {z}}-\frac{\sigma ^2}{2} ||{\mathbf {z}}||_2^2+{\mathbf {z}}^T({\mathbf {H}}{\mathbf {u}}-\varvec{\zeta }) + V({\mathbf {u}}+{\mathbf {x}}_0)\nonumber \\&\quad =\min _{\mathbf {u}}\max _{\mathbf {z}}- \sum _{i=1}^M\Big (\frac{\sigma ^2}{2}z_i^2-\zeta _iz_i\Big )-\sum _{i=1}^M\sum _{a=1}^Nz_iH_{ia}u_a + \sum _{a=1}^NU(u_a+x_{0a}) \end{aligned}$$
(64)

Note that the \(z_i\) variables and the \(u_a\) variables only interact via the random measurement matrix \({\mathbf {H}}\). From this point on, our arguments are similar to those of Xu and Kabashima [28]. We try to find single variable functions \({\mathcal {E}}_a(u_a)\) and \({\mathcal {E}}_i(z_i)\) whose optimization mimics the full optimization problem.

We consider the problem with an a-cavity, meaning a problem where the variable \(x_a=u_a+x_{0a}\) has been set to zero. We also consider a problem with an i-cavity, namely a problem where \(z_i\) has been set to zero. The single variable functions are constructed by introducing the inactive/missing variable into the corresponding cavity. The discussion becomes simpler if we assume that the optimization in the systems with cavities is already well-approximated by optimizing over a sum of single variable functions of the following forms (as is done in Xu and Kabashima [28]):

$$\begin{aligned} {\mathcal {E}}_a(u_a)=\frac{A_a}{2}u_a^2-F_au_a+U(x_{0a}+u_a) \end{aligned}$$
(65)

and

$$\begin{aligned} {\mathcal {E}}_i(z_i)=-\frac{B_i}{2}z_i^2+K_iz_i. \end{aligned}$$
(66)

For simplicity, we will call these single variable functions potentials.

We will set up slightly more involved notation for these parameters in a cavity system. With a missing, the potentials for \(z_i\) are represented by

$$\begin{aligned} {\mathcal {E}}_{i\rightarrow a}(z_i) =-\frac{B_{i\rightarrow a}}{2}z_i^2+K_{i\rightarrow a}z_i. \end{aligned}$$
(67)

Similarly, with i missing, we have

$$\begin{aligned} {\mathcal {E}}_{a\rightarrow i}(u_a)=\frac{A_{a\rightarrow i}}{2}u_a^2-F_{a\rightarrow i}u_a+U(x_{0a}+u_a). \end{aligned}$$
(68)

We argue that the corresponding parameters with or without cavity are nearly the same.

1.1 Step 1: Introducing \(x_a\) into the a-cavity

$$\begin{aligned} {\mathcal {E}}_a(u_a)= & {} V(x_{0a}+u_a) +\sum _{i=1}^M\big \{\max _{z_i} (-z_iH_{ia}u_a+{\mathcal {E}}_{i\rightarrow a}(z_i))\big \} \end{aligned}$$
(69)
$$\begin{aligned} {\mathcal {E}}_{a\rightarrow i}(u_a)= & {} V(x_{0a}+u_a) +\sum _{j\ne i}\big \{\max _{z_j} (-z_jH_{ja}u_a+{\mathcal {E}}_{j\rightarrow a}(z_j))\big \} \end{aligned}$$
(70)

From here, using the parametrization of \({\mathcal {E}}_{j\rightarrow a}(z_j)\) according to Eq. 67, the optimal \(z_j\) satisfies \(-H_{ja}u_a-B_{j\rightarrow a}z_j+K_{j\rightarrow a}=0\), i.e. \(z_j=\tfrac{K_{j\rightarrow a}-H_{ja}u_a}{B_{j\rightarrow a}}\), leading to

$$\begin{aligned} A_a&=\sum _{i=1 }^M\frac{H_{ia}^2}{B_{i\rightarrow a}} \end{aligned}$$
(71)
$$\begin{aligned} F_a&=\sum _{i=1 }^M\frac{H_{ia}K_{i\rightarrow a}}{B_{i\rightarrow a}} \end{aligned}$$
(72)

and

$$\begin{aligned} A_{a\rightarrow i}&=\sum _{j\ne i}\frac{H_{ja}^2}{B_{j\rightarrow a}} \end{aligned}$$
(73)
$$\begin{aligned} F_{a\rightarrow i}&=\sum _{j\ne i}\frac{H_{ja}K_{j\rightarrow a}}{B_{j\rightarrow a}}. \end{aligned}$$
(74)

Since \(H_{ia}\sim \tfrac{1}{\sqrt{N}}\), \(A_a\approx A_{a\rightarrow i}\) and \(F_a\approx F_{a\rightarrow i}\).

1.2 Step 2: Introducing \(z_i\) into the i-cavity

$$\begin{aligned} {\mathcal {E}}_i(z_i)= & {} -\frac{\sigma ^2}{2}z_i^2 +\zeta _iz_i +\sum _{a=1}^N\big \{\min _{u_a} (-z_iH_{ia}u_a+{\mathcal {E}}_{a\rightarrow i}(u_a))\big \} \end{aligned}$$
(75)
$$\begin{aligned} {\mathcal {E}}_{i\rightarrow a}(z_i)= & {} -\frac{\sigma ^2}{2}z_i^2 +\zeta _iz_i +\sum _{b\ne a}\big \{\min _{u_b} (-z_iH_{ib}u_b+{\mathcal {E}}_{b\rightarrow i}(u_b))\big \} \end{aligned}$$
(76)

Now, we use the parametrization of \({\mathcal {E}}_{a\rightarrow i}(u_a)\) according to Eq. 68, and find that the optimal \(u_b\) satisfies

$$\begin{aligned} -z_iH_{ib}+A_{b\rightarrow i}u_b-F_{b\rightarrow i}+U'(x_{0b}+u_b)=0. \end{aligned}$$

Since \(z_iH_{ib}\sim \tfrac{1}{\sqrt{N}}\), we can expand this equation around \(z_i=0,{{\bar{u}}}_b\) and find that

$$\begin{aligned} B_i&=\sigma ^2+\sum _{a=1 }^N\frac{H_{ia}^2}{A_{a\rightarrow i}+U''(x_{0a}+u_a)} \end{aligned}$$
(77)
$$\begin{aligned} K_i&=\zeta _i-\sum _{a=1}^NH_{ia}{{\bar{u}}}_a \end{aligned}$$
(78)

We could derive equations for \(B_{i\rightarrow a}\) and \(K_{i\rightarrow a}\), like before, but we know that we can ignore the difference between these parameters and \(B_i\) and \(K_i\) respectively. Also, we have optimal \(u_a\approx {{\bar{u}}}_a\).

1.3 Step 3: Putting it all together

We now only deal with \(A_a,B_i\) etc. From Eqs. 71 and 77

$$\begin{aligned} A_a&=\sum _{i=1 }^M\frac{H_{ia}^2}{B_i} \end{aligned}$$
(79)
$$\begin{aligned} B_i&=\sigma ^2+\sum _{a=1 }^N\frac{H_{ia}^2}{A_a+U''(x_{0a}+u_a)} \end{aligned}$$
(80)

For large M, N the expressions are self-averaging. One can essentially replace \(H_{ia}^2\) by \(\tfrac{1}{M}=\tfrac{1}{\alpha N}\) and see that \(A_a=A\), \(B_i=B\). In other words, these quantities are essentially index independent.

$$\begin{aligned} A&=\frac{1}{B} \end{aligned}$$
(81)
$$\begin{aligned} B&=\sigma ^2+\frac{1}{\alpha N}\sum _{a=1 }^N\frac{1}{A+U''(x_{0a}+u_a)} \end{aligned}$$
(82)

If we identify \(B=\sigma _\mathrm {eff}^2=\tfrac{1}{A}\), we obtain

$$\begin{aligned} \sigma _\mathrm {eff}^2=\sigma ^2+\frac{{{\bar{\chi }}}}{\alpha } \end{aligned}$$
(83)

according to the definition in Proposition 1.
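
Purely as an illustration (not part of the paper's analysis), Eqs. (81)-(83) can be closed numerically. The sketch below assumes a simple quadratic penalty \(U(x)=\tfrac{\lambda }{2}x^2\), for which \(U''=\lambda \) is constant, and iterates the two equations to a fixed point; all numerical values are arbitrary.

```python
import numpy as np

# Fixed-point iteration of Eqs. (81)-(82), assuming U(x) = (lam/2) x^2 so U'' = lam.
sigma2, lam, alpha = 0.1, 0.5, 0.6   # arbitrary illustration values
B = 1.0
for _ in range(200):
    A = 1.0 / B                             # Eq. (81)
    B = sigma2 + 1.0 / (alpha * (A + lam))  # Eq. (82) with U'' = lam for every a

A = 1.0 / B
sigma_eff2 = B              # Eq. (83): sigma_eff^2 = sigma^2 + chi_bar / alpha
chi_bar = 1.0 / (A + lam)   # average local susceptibility for this penalty
print(sigma_eff2, sigma2 + chi_bar / alpha)  # the two agree at the fixed point
```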

To get our final result, we need \(F_a\). Using Eqs. 72 and 78 and the various approximations above, we find

$$\begin{aligned} F_a=\sum _{i=1 }^M\frac{H_{ia}K_{i\rightarrow a}}{B_{i\rightarrow a}}\approx \frac{1}{B}\Bigg [\sum _{i=1}^M\Big (\zeta _i-\sum _{b\ne a}H_{ib}u_b\Big )H_{ia}\Bigg ]\equiv \frac{\xi _a}{\sigma _\mathrm {eff}^2}. \end{aligned}$$
(84)

We, thus, identify the term inside the square bracket as \(\xi _a\). Its distribution over different choices of \({\mathbf {H}}\) and \(\varvec{\zeta }\) is approximately normal with mean zero and variance \(=\sigma _\zeta ^2+\tfrac{1}{\alpha N}\sum _{b\ne a}u_b^2\approx \sigma _\zeta ^2+ q/\alpha .\) Also, \(\xi _a\) and \(\xi _b\) are nearly uncorrelated for \(a\ne b\).

At the end we get

$$\begin{aligned} {\mathcal {E}}_a(u_a)=\frac{A_a}{2}u_a^2-F_au_a+U(x_{0a}+u_a)=\frac{1}{2\sigma _\mathrm {eff}^2}u_a^2-\frac{\xi _au_a}{\sigma _\mathrm {eff}^2}+U(x_{0a}+u_a) \end{aligned}$$
(85)

which is the same expression as in Proposition 1.
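
For concreteness (our illustration, not part of the derivation), when the penalty is the \(\ell _1\) one, \(U(x)=\lambda |x|\), the single-variable problem in Eq. (85) is minimized by soft-thresholding the effective observation \(x_{0a}+\xi _a\) at level \(\lambda \sigma _\mathrm {eff}^2\). The short Python sketch below checks this against a brute-force minimization on a grid; all numbers are arbitrary.

```python
import numpy as np

def soft_threshold(z, t):
    # argmin_x (x - z)^2 / 2 + t * |x|
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Arbitrary illustration values for sigma_eff^2, lambda, x_0a and xi_a
sigma_eff2, lam, x0, xi = 0.5, 0.3, 1.2, -0.4

# Minimizer of Eq. (85) with U(x) = lam*|x|: x_hat = soft(x0 + xi, lam*sigma_eff2)
x_hat = soft_threshold(x0 + xi, lam * sigma_eff2)
u_hat = x_hat - x0

# Brute-force check of Eq. (85) on a fine grid
u = np.linspace(-5.0, 5.0, 200001)
E = u**2 / (2 * sigma_eff2) - xi * u / sigma_eff2 + lam * np.abs(x0 + u)
assert abs(u[np.argmin(E)] - u_hat) < 1e-3
print(u_hat)
```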

The cavity mean field equations arose in the context of spin systems in solid state physics [18, 19]. These equations take into account the feedback dependencies by estimating the reaction of all the other ‘spins’/variables when a single spin is removed from the system, thereby leaving a ‘cavity’. This leads to a considerable simplification by utilizing the fact that the system of variables is fully connected. The local susceptibility matrix \(\varvec{\chi }\), a common quantity in physics, measures how stable the solution is to perturbations. This quantity plays a key role in such systems [20]. In particular, in the asymptotic limit of large M and N, certain quantities (e.g. MSE and average local susceptibility, \({\overline{\chi }}({\mathbf {x}})\)) converge, i.e. become independent of the detailed realization of the matrix \({\mathbf {H}}\). In this limit, a sudden increase in susceptibility signals the error-prone phase.

These results are equivalent to those obtained in [13, 14] with the replica approach. These approaches begin with a finite-temperature statistical mechanics model. In order to make a connection with these studies, one should replace \({\overline{\chi }}\) by the quantity \(\beta \varDelta Q\), where \(\beta \) plays the role of an inverse temperature. The quantity \(\varDelta Q\) is defined as

$$\begin{aligned} \varDelta Q\equiv [ \langle (u-\langle u\rangle )^2\rangle ]^{\mathrm {av}}_{{\mathbf {x}}_0,{\mathbf {H}},\zeta }. \end{aligned}$$
(86)

where \(\langle \cdots \rangle \) is the average over ‘thermal’ fluctuations of \({\mathbf {u}}\) within the ensemble \(P_\beta \):

$$\begin{aligned} P_\beta ({\mathbf {u}}| {\mathbf {x}}_0,{\mathbf {H}},\zeta ) = \frac{1}{Z({\mathbf {x}}_0,{\mathbf {H}},\zeta )} e^{-\beta {\mathcal {E}}({\mathbf {u}};{\mathbf {x}}_0,{\mathbf {H}},\zeta )}. \end{aligned}$$
(87)

The quantity \(\varDelta Q\) is nothing but the ‘thermal’ fluctuation in \({\mathbf {u}}\), and \(\beta \varDelta Q\) can in fact be identified as a local susceptibility due to the fluctuation-dissipation theorem [16]. Our results are obtained in the limit \(\beta \rightarrow \infty \), where the probability distribution becomes peaked near the minimum, turning the problem into an optimization problem. The local susceptibility, however, remains well-defined in this limit.

Appendix B: Ridge Regression via Singular Value Decomposition

For the sake of completeness, in this appendix, we derive Eqs. (17) and (18) in Sect. 4.1 using a singular value decomposition. An elementary derivation leads us to an explicit expression:

$$\begin{aligned} {\hat{{\mathbf {x}}}}= \frac{{\mathbf {H}}^{\mathrm {T}}{\mathbf {H}}}{\sigma ^2}\Big [\frac{{\mathbf {H}}^\mathrm {T}{\mathbf {H}}}{\sigma ^2}+\lambda {\mathbf {I}}_N\Big ]^{-1}{\mathbf {x}}_0=\sum _{i=1}^M\frac{s^2_i}{s^2_i+\lambda \sigma ^2} {\mathcal {V}}_i({\mathcal {V}}^{\mathrm {T}}_i{\mathbf {x}}_0). \end{aligned}$$
(88)

where we use the singular vector basis of the matrix \({\mathbf {H}}\), with \(s_i\) being the non-zero singular values, and \({\mathcal {V}}_i\) the corresponding right singular vectors. When we take the limit of vanishing \(\sigma ^2\), we just have a projection of the N-dimensional vector \({\mathbf {x}}_0\) onto the M-dimensional subspace spanned by the \({\mathcal {V}}_i\)’s. In other words

$$\begin{aligned} {{\hat{x}}}_a= \sum _{b=1}^N\sum _{i=1}^M{{{\mathcal {V}}}}_{ia}\mathcal{V}_{ib}x_{0b}=\sum _{b=1}^NP_{ab} x_{0b} \end{aligned}$$
(89)

\({\mathbf {P}}\) being the projection matrix. For random \({\mathbf {H}}\), the \({\mathcal {V}}_i\)’s are just a random choice of M orthonormal vectors. Thus, the properties of the estimate depend on the statistics of the projection onto a random M-dimensional subspace.
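
As a sanity check (our own sketch, not from the paper), the following Python snippet verifies Eq. (88) numerically: the ridge estimate computed directly agrees with its singular-value form. The sizes, \(\lambda \) and \(\sigma ^2\) are arbitrary illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, lam, sigma2 = 50, 100, 0.2, 1.0   # arbitrary illustration values
H = rng.standard_normal((M, N)) / np.sqrt(M)
x0 = rng.standard_normal(N)

# Direct ridge solution: (H^T H / sigma^2 + lam I)^{-1} (H^T H / sigma^2) x0
G = H.T @ H / sigma2
x_direct = np.linalg.solve(G + lam * np.eye(N), G @ x0)

# SVD form of Eq. (88): sum_i s_i^2 / (s_i^2 + lam*sigma^2) * V_i (V_i^T x0)
U, s, Vt = np.linalg.svd(H, full_matrices=False)   # rows of Vt are the V_i
x_svd = Vt.T @ ((s**2 / (s**2 + lam * sigma2)) * (Vt @ x0))

print(np.allclose(x_direct, x_svd))  # True
```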

$$\begin{aligned}{}[P_{ab}]^{\mathrm {av}}_{{\mathbf {H}}}= \sum _{i=1}^M[\mathcal{V}_{ia}\mathcal{V}_{ib}]^{\mathrm {av}}_{{\mathbf {H}}}=\sum _{i=1}^M\frac{\delta _{ab}}{N}=\alpha \delta _{ab} \implies [{{\hat{x}}}_a]^{\mathrm {av}}_{{\mathbf {H}}}=\alpha x_{0a} \end{aligned}$$
(90)

For the variance, we need to think of second-order moments of the matrix elements of \({\mathbf {P}}\), particularly \([P_{ab}P_{ac}]^{\mathrm {av}}_{{\mathbf {H}}}\). We can parametrize \([P_{ab}P_{ac}]^{\mathrm {av}}_{{\mathbf {H}}}=A\delta _{bc}+B\delta _{ab}\delta _{bc}\). Since \({\mathbf {P}}\) is a projection operator, \({\mathbf {P}}^2={\mathbf {P}}\), and it is a symmetric matrix. Hence,

$$\begin{aligned} \sum _a[P_{ab}P_{ac}]^{\mathrm {av}}_{{\mathbf {H}}}=\sum _a[P_{ba}P_{ac}]^{\mathrm {av}}_{{\mathbf {H}}}=[P_{bc}]^{\mathrm {av}}_{{\mathbf {H}}}=\alpha \delta _{bc}. \end{aligned}$$
(91)

In the limit of \(M,N\rightarrow \infty \) with \(\alpha \) fixed, the distribution of \(P_{aa}\) gets highly concentrated around the mean \(\alpha \). As a result,

$$\begin{aligned}{}[P_{aa}P_{aa}]^{\mathrm {av}}_{{\mathbf {H}}}\approx \left( \left[ P_{aa}\right] ^{\mathrm {av}}_{{\mathbf {H}}}\right) ^2=\alpha ^2. \end{aligned}$$
(92)

Using the two constraints, represented by Eqs. (91) and (92), we can determine A and B in the large M, N limit, leading to

$$\begin{aligned}{}[P_{ab}P_{ac}]^{\mathrm {av}}_{{\mathbf {H}}}\approx \frac{\alpha (1-\alpha )}{N}\delta _{bc}+\alpha ^2\delta _{ab}\delta _{bc}. \end{aligned}$$
(93)

The variance is now given by,

$$\begin{aligned}&[{{\hat{x}}}_{a}^2]^{\mathrm {av}}_{{\mathbf {H}}}-([{{\hat{x}}}_{a}]^{\mathrm {av}}_{{\mathbf {H}}})^2\nonumber \\&\quad = \sum _{b,c}\left[ \frac{\alpha (1-\alpha )}{N}\delta _{bc}+\alpha ^2\delta _{ab}\delta _{bc}\right] x_{0b}x_{0c} -(\alpha x_{0a})^2\nonumber \\&\quad =(1-\alpha )\alpha \rho \big [x_0^2\big ]^{\mathrm {av}}_{x_0} \end{aligned}$$
(94)

recovering our earlier result.
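
The statistics in Eqs. (90) and (94) are also easy to probe numerically. The Monte Carlo sketch below (our illustration, with arbitrary sizes and number of trials) projects a fixed sparse \({\mathbf {x}}_0\) onto the row space of many random \({\mathbf {H}}\)’s and compares the empirical mean and variance of the estimate with \(\alpha x_{0a}\) and \(\alpha (1-\alpha )\rho \big [x_0^2\big ]^{\mathrm {av}}_{x_0}\).

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K, trials = 400, 200, 80, 300   # arbitrary illustration values
alpha, rho = M / N, K / N

x0 = np.zeros(N)
x0[:K] = rng.standard_normal(K)       # K-sparse signal

xhat = np.empty((trials, N))
for t in range(trials):
    H = rng.standard_normal((M, N))
    # lambda*sigma^2 -> 0 limit of Eq. (88): projection of x0 onto the row space of H
    xhat[t] = np.linalg.pinv(H) @ (H @ x0)

# Eq. (90): the mean of xhat_a over H is alpha * x_{0a}
print(np.allclose(xhat.mean(axis=0), alpha * x0, atol=0.1))
# Eq. (94): the variance is alpha*(1-alpha)*rho*[x0^2]_av = alpha*(1-alpha)*mean(x0^2)
print(xhat.var(axis=0).mean(), alpha * (1 - alpha) * np.mean(x0**2))
```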

Cite this article

Ramezanali, M., Mitra, P.P. & Sengupta, A.M. Critical Behavior and Universality Classes for an Algorithmic Phase Transition in Sparse Reconstruction. J Stat Phys 175, 764–788 (2019). https://doi.org/10.1007/s10955-019-02292-6
