A Provably Convergent Scheme for Compressive Sensing Under Random Generative Priors

Journal of Fourier Analysis and Applications 27, 19 (2021)

Abstract

Deep generative modeling has led to new, state-of-the-art approaches for enforcing structural priors in a variety of inverse problems. In contrast to priors given by sparsity, deep models can provide direct low-dimensional parameterizations of the manifold of images or signals belonging to a particular natural class, allowing recovery algorithms to be posed in a low-dimensional space. This dimensionality may even be lower than the sparsity level of the same signals when viewed in a fixed basis. What has not been known about these methods is whether there are computationally efficient algorithms whose sample complexity is optimal in the dimensionality of the representation given by the generative model. In this paper, we present such an algorithm and analysis. Under the assumption that the generative model is a neural network that is sufficiently expansive at each layer and has Gaussian weights, we provide a gradient descent scheme and prove that, for noisy compressive measurements of a signal in the range of the model, the algorithm converges to that signal, up to the noise level. The scaling of the sample complexity with respect to the input dimensionality of the generative prior is linear, and thus cannot be improved except for constants and factors of other variables. To the best of the authors' knowledge, this is the first recovery guarantee for compressive sensing under generative priors by a computationally efficient algorithm.
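To make the setup concrete, here is a minimal, hypothetical sketch (in NumPy) of gradient descent on the empirical risk \(f(x) = \frac{1}{2} \Vert A G(x) - y \Vert ^2\) for an expansive ReLU network \(G\) with Gaussian weights. The widths, step size, and iteration budget are illustrative choices, not the paper's prescriptions, and the loop below ignores the landscape subtleties near negative multiples of \(x_*\) that the analysis must handle.

```python
# Toy instance of compressive sensing under a random generative prior.
# All sizes, the step size, and the iteration count are illustrative.
import numpy as np

rng = np.random.default_rng(0)
k, d = 10, 3                          # latent dimension, network depth
widths = [k, 50, 250, 1250]           # expansive at each layer
m = 120                               # number of compressive measurements

# Gaussian weights with i.i.d. N(0, 1/n_i) entries, n_i = output width
Ws = [rng.normal(0, 1 / np.sqrt(widths[i + 1]), (widths[i + 1], widths[i]))
      for i in range(d)]
A = rng.normal(0, 1 / np.sqrt(m), (m, widths[-1]))  # measurement matrix

def G(x):
    for W in Ws:
        x = np.maximum(W @ x, 0)      # relu(W_i x)
    return x

def grad_f(x, y):
    # At a differentiable point, dG/dx = prod_i W_{i,+,x}, so
    # grad f(x) = (prod_i W_{i,+,x})^T A^T (A G(x) - y).
    masks, z = [], x
    for W in Ws:
        z = np.maximum(W @ z, 0)
        masks.append(z > 0)           # active ReLU pattern at this layer
    v = A.T @ (A @ z - y)
    for W, mask in zip(reversed(Ws), reversed(masks)):
        v = W.T @ (mask * v)
    return v

x_star = rng.normal(size=k)
e = 1e-3 * rng.normal(size=m)         # measurement noise
y = A @ G(x_star) + e

x = rng.normal(size=k)
step = 0.5 * 2**d                     # gradients scale like 2^{-d} (cf. Lemma A.1)
for _ in range(2000):
    x = x - step * grad_f(x, y)

print("relative error:", np.linalg.norm(x - x_star) / np.linalg.norm(x_star))
```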


Notes

  1. This implementation is available at https://www.caam.rice.edu/~optimization/L1/fpc/.

References

  1. Allen-Zhu, Z., Li, Y., Song, Z.: A convergence theory for deep learning via over-parameterization. In: Proceedings of the 36th International Conference on Machine Learning, vol. 97, pp. 242–252. PMLR (2019)

  2. Arora, S., Liang, Y., Ma, T.: Why are deep nets reversible: a simple theory, with implications for training. Preprint (2015). arXiv:1511.05653

  3. Blanchard, J.D., Cartis, C., Tanner, J.: Compressed sensing: how sharp is the restricted isometry property? SIAM Rev. 53(1), 105–125 (2011)


  4. Bora, A., Jalal, A., Price, E., Dimakis, A.G.: Compressed sensing using generative models. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 537–546. PMLR (2017)

  5. Candes, E.J., Tao, T.: Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inf. Theory 52(12), 5406–5425 (2006)


  6. Clason, C.: Nonsmooth analysis and optimization. Preprint (2017). arXiv:1708.04180

  7. Du, S.S., Zhai, X., Poczos, B., Singh, A.: Gradient descent provably optimizes over-parameterized neural networks. In: Proceedings of the 7th International Conference on Learning Representations (2019)

  8. Eldar, Y.C., Kutyniok, G.: Compressed Sensing: Theory and Applications. Cambridge University Press, Cambridge (2012)


  9. Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Birkhäuser/Springer, Boston (2013)

  10. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020)


  11. Hale, E.T., Yin, W., Zhang, Y.: Fixed-point continuation for \(\ell _1\)-minimization: methodology and convergence. SIAM J. Optim. 19(3), 1107–1130 (2008)


  12. Hand, P., Voroninski, V.: Global guarantees for enforcing deep generative priors by empirical risk. IEEE Trans. Inf. Theory 66(1), 401–418 (2019)


  13. Heckel, R., Huang, W., Hand, P., Voroninski, V.: Deep denoising: rate-optimal recovery of structured signals with a deep prior. Inf. Inference (2020, accepted)

  14. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: European Conference on Computer Vision, pp. 694–711. Springer, Cham (2016)

  15. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. Preprint (2018). arXiv:1710.10196

  16. Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1 \(\times \) 1 convolutions. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 10236–10245 (2018)

  17. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: Proceedings of the 2nd International Conference on Learning Representations (2014)

  18. Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4681–4690 (2017)

  19. Li, Y., Liang, Y.: Learning overparameterized neural networks via stochastic gradient descent on structured data. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 8168–8177 (2018)

  20. Mao, X.-J., Shen, C., Yang, Y.-B.: Image restoration using convolutional auto-encoders with symmetric skip connections. Preprint (2016). arXiv:1606.08921

  21. Mardani, M., Gong, E., Cheng, J.Y., Vasanawala, S.S., Zaharchuk, G., Xing, L., Pauly, J.M.: Deep generative adversarial neural networks for compressive sensing MRI. IEEE Trans. Med. Imaging 38(1), 167–179 (2019)


  22. Mardani, M., Monajemi, H., Papyan, V., Vasanawala, S., Donoho, D., Pauly, J.: Recurrent generative adversarial networks for proximal learning and automated compressive image recovery. Preprint (2017). arXiv:1711.10046

  23. Mousavi, A., Baraniuk, R.G.: Learning to invert: Signal recovery via deep convolutional networks. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2272–2276 (2017)

  24. Mousavi, A., Patel, A.B., Baraniuk, R.G.: A deep learning approach to structured signal recovery. In: 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1336–1343 (2015)

  25. Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Cham (2006)


  26. Oymak, S., Soltanolkotabi, M.: Toward moderate overparameterization: global convergence guarantees for training shallow neural networks. IEEE J. Sel. Areas Inf. Theory 1(1), 84–105 (2020)


  27. Rippel, O., Bourdev, L.: Real-time adaptive image compression. In: International Conference on Machine Learning, pp. 2922–2930. PMLR (2017)

  28. Sønderby, C.K., Caballero, J., Theis, L., Shi, W., Huszár, F.: Amortised MAP inference for image super-resolution. In: Proceedings of the 5th International Conference on Learning Representations (2017)

  29. Van den Oord, A., Kalchbrenner, N., Kavukcuoglu, K.: Pixel recurrent neural networks. In: International Conference on Machine Learning, pp. 1747–1756. PMLR (2016)

  30. Yeh, R.A., Chen, C., Lim, T.Y., Schwing, A.G., Hasegawa-Johnson, M., Do, M.N.: Semantic image inpainting with deep generative models. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5485–5493 (2017)

  31. Zhu, J.-Y., Krähenbühl, P., Shechtman, E., Efros, A.A.: Generative visual manipulation on the natural image manifold. In: European Conference on Computer Vision, pp. 597–613. Springer, Cham (2016)


Acknowledgements

W.H. is partially supported by the Fundamental Research Funds for the Central Universities (No. 20720190060) and the National Natural Science Foundation of China (No. 12001455). P.H. is partially supported by NSF CAREER Award DMS-1848087 and NSF Award DMS-2022205. R.H. is partially supported by NSF Award IIS-1816986.

Author information

Correspondence to Wen Huang.

Additional information

Communicated by Roman Vershynin.

Appendix A: Supporting Lemmas

Lemma A.1 is used in the proofs in Sect. 5.3 and in the proof of Lemma 5.3.

Lemma A.1

Suppose that the WDC and RRIC hold with \(\epsilon < 1/(16 \pi d^2)^2\) and that the noise \(e\) satisfies \(\Vert e\Vert \le a_5 2^{-d/2} \Vert x_*\Vert \). Then, for all x and all \(v_x \in \partial f(x)\),

$$\begin{aligned} \Vert {v}_x\Vert \le \frac{a_6 d}{2^d} \max (\Vert x\Vert , \Vert x_*\Vert ), \end{aligned}$$
(A.1)

where \(a_5\) and \(a_6\) are universal constants.

Proof

Define for convenience \(\zeta _j=\prod _{i = j}^{d - 1} \frac{\pi - {\bar{\theta }}_{i, x, x_*}}{\pi }\), with the empty-product convention \(\zeta _d = 1\). We have

$$\begin{aligned} \Vert {v}_x\Vert&\le \Vert h_{x}\Vert + \Vert h_{x} - {v}_x\Vert \\&\le \left\| \frac{1}{2^d} x - \frac{1}{2^d} \zeta _{0} x_* - \frac{1}{2^d} \sum _{i = 0}^{d - 1} \frac{\sin {\bar{\theta }}_{i,x,x_*}}{\pi } \zeta _{i + 1} \frac{\Vert x_*\Vert }{\Vert x\Vert } x \right\| \\&\quad + a_1 \frac{d^3 \sqrt{\epsilon }}{2^d} \max ( \Vert x\Vert , \Vert x_* \Vert ) + \frac{2}{2^{d/2}} \Vert e\Vert \\&\le \frac{1}{2^d} \Vert x\Vert + \left( \frac{1}{2^d} + \frac{d}{\pi 2^d} \right) \Vert x_*\Vert + a_1 \frac{d^3 \sqrt{\epsilon }}{2^d} \max (\Vert x\Vert , \Vert x_*\Vert ) + \frac{2}{2^{d/2}} \Vert e\Vert \\&\le \frac{a_6 d}{2^d} \max (\Vert x\Vert , \Vert x_*\Vert ), \end{aligned}$$

where the second inequality follows from the definition of \(h_x\) and Lemma 5.2, the third inequality uses \(| \zeta _j | \le 1\) and \(\sin {\bar{\theta }}_{i,x,x_*} \le 1\), and the last inequality uses the assumption \(\Vert e\Vert \le a_5 2^{-d/2} \Vert x_*\Vert \). \(\square \)

Lemma A.2 is used in the proof of Lemma 5.1.

Lemma A.2

Suppose \(a_i, b_i \in [0, \pi ]\) for \(i = 1, \ldots , k\), and that \(|a_i - b_i| \le |a_j - b_j|\) for all \(i \ge j\). Then it holds that

$$\begin{aligned} \left| \prod _{i = 1}^k \frac{\pi - a_i}{\pi } - \prod _{i = 1}^k \frac{\pi - b_i}{\pi }\right| \le \frac{k}{\pi } |a_1 - b_1|. \end{aligned}$$

Proof

We prove the claim by induction. It is easy to verify that the inequality holds for \(k = 1\). Suppose it holds for \(k = t - 1\). Then

$$\begin{aligned} \left| \prod _{i = 1}^{t} \frac{\pi - a_i}{\pi } - \prod _{i = 1}^t \frac{\pi - b_i}{\pi }\right|&\le \left| \prod _{i = 1}^{t} \frac{\pi - a_i}{\pi } - \frac{\pi - a_t}{\pi } \prod _{i = 1}^{t-1} \frac{\pi - b_i}{\pi }\right| \\&\quad + \left| \frac{\pi - a_t}{\pi } \prod _{i = 1}^{t-1} \frac{\pi - b_i}{\pi } - \prod _{i = 1}^t \frac{\pi - b_i}{\pi }\right| \\&\le \frac{t-1}{\pi } |a_1 - b_1| + \frac{1}{\pi } |a_t - b_t| \le \frac{t}{\pi } |a_1 - b_1|. \end{aligned}$$

\(\square \)
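As a quick numerical sanity check (no substitute for the induction above), one can test the inequality of Lemma A.2 on random instances satisfying the ordering hypothesis; a hypothetical script:

```python
# Random spot-check of Lemma A.2.
import numpy as np

rng = np.random.default_rng(1)
for _ in range(1000):
    k = int(rng.integers(1, 10))
    a = rng.uniform(0, np.pi, k)
    b = np.clip(a + rng.choice([-1, 1], k) * rng.uniform(0, 0.5, k), 0, np.pi)
    order = np.argsort(-np.abs(a - b))    # enforce |a_i - b_i| non-increasing
    a, b = a[order], b[order]
    lhs = abs(np.prod((np.pi - a) / np.pi) - np.prod((np.pi - b) / np.pi))
    assert lhs <= k / np.pi * abs(a[0] - b[0]) + 1e-12
print("inequality held on all trials")
```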

Lemma A.3 is used in the proofs of Lemmas 5.2, 5.3, and 5.5.

Lemma A.3

Suppose the WDC and RRIC hold with \(\epsilon \le 1 / (16 \pi d^2)^2\). Then we have

$$\begin{aligned} \left| x^T q_x\right| \le \frac{2}{2^{d/2}} \Vert e\Vert \Vert x\Vert , \end{aligned}$$

where \(q_x = \left( \prod _{i = d}^1 W_{i, +, x}\right) ^T A^T e\). In addition, if G is differentiable at x, then we have

$$\begin{aligned} \left\| q_x\right\| \le \frac{2}{2^{d/2}} \Vert e\Vert . \end{aligned}$$

Proof

We have

$$\begin{aligned} |x^T q_x|^2&= |e^T A G(x)|^2 \le \Vert A G(x)\Vert ^2 \Vert e\Vert ^2 \le (1 + \epsilon ) \Vert G(x)\Vert ^2 \Vert e\Vert ^2 \\&\le (1 + \epsilon ) \prod _{i = d}^1\Vert W_{i, +, x} \Vert ^2 \Vert e\Vert ^2 \Vert x\Vert ^2 \le (1 + \epsilon ) (1 + 2 \epsilon d)^2 \frac{1}{ 2^{d}} \Vert e\Vert ^2 \Vert x\Vert ^2, \end{aligned}$$

where the first inequality is the Cauchy–Schwarz inequality, the second follows from the RRIC, and the last follows from [12, (10)]. Since \(\epsilon \le 1/(16 \pi d^2)^2\), we have \((1 + \epsilon )(1 + 2 \epsilon d)^2 \le 4\), and therefore \(\left| x^T q_x\right| \le \frac{2}{2^{d/2}} \Vert e\Vert \Vert x\Vert \).

Suppose G is differentiable at x. Then the local linearity of G implies that \(G(x + z) - G(x) = \left( \prod _{i = d}^1 W_{i, +, x}\right) z\) for any sufficiently small \(z \in {\mathbb {R}}^k\). By the RRIC, we have

$$\begin{aligned}&\left| {\left\langle A \left( \prod _{i = d}^1 W_{i, +, x}\right) z ,A \left( \prod _{i = d}^1 W_{i, +, x}\right) z \right\rangle _{}} - {\left\langle \left( \prod _{i = d}^1 W_{i, +, x}\right) z ,\left( \prod _{i = d}^1 W_{i, +, x}\right) z \right\rangle _{}} \right| \\&\quad \le \epsilon \prod _{i = d}^1\Vert W_{i, +, x} \Vert ^2 \Vert z\Vert ^2, \end{aligned}$$

which implies

$$\begin{aligned} \left| {\left\langle A \left( \prod _{i = d}^1 W_{i, +, x}\right) z ,A \left( \prod _{i = d}^1 W_{i, +, x}\right) z \right\rangle _{}} \right| \le (1 + \epsilon ) \prod _{i = d}^1\Vert W_{i, +, x} \Vert ^2 \Vert z\Vert ^2. \end{aligned}$$

Therefore, we obtain

$$\begin{aligned} \left\| A \left( \prod _{i = d}^1 W_{i, +, x}\right) \right\| \le \sqrt{1 + \epsilon } \prod _{i = d}^1\Vert W_{i, +, x} \Vert . \end{aligned}$$

Combining the above inequality with \(\prod _{i = d}^1\Vert W_{i, +, x} \Vert \le (1 + 2 \epsilon d) / 2^{d/2} \le 1.5 / 2^{d/2}\) given in [12, (10)] yields

$$\begin{aligned} \left\| A \left( \prod _{i = d}^1 W_{i, +, x}\right) \right\| \le 1.5 \sqrt{1 + \epsilon } / 2^{d/2} \le 2 / 2^{d/2}, \end{aligned}$$

where the second inequality follows from the assumption on \(\epsilon \). Therefore, we obtain

$$\begin{aligned} \Vert q_x\Vert = \left\| \left( \prod _{i = d}^1 W_{i, +, x}\right) ^T A^T e\right\| \le \left\| \left( \prod _{i = d}^1 W_{i, +, x}\right) ^T A^T\right\| \Vert e\Vert \le \frac{2}{2^{d/2}} \Vert e\Vert . \end{aligned}$$

\(\square \)
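The \(2^{-d/2}\) scale in Lemma A.3 reflects \(\Vert W_{i,+,x}\Vert ^2 \approx 1/2\) per layer. A Monte Carlo sketch with Gaussian weights and measurements (for which the WDC and RRIC hold with high probability; all sizes below are illustrative choices) lets one observe the bound \(\Vert q_x\Vert \le \frac{2}{2^{d/2}} \Vert e\Vert \):

```python
# Monte Carlo illustration of the norm bound on q_x in Lemma A.3.
import numpy as np

rng = np.random.default_rng(2)
k, d, m = 10, 4, 100
widths = [k, 60, 120, 240, 480]
Ws = [rng.normal(0, 1 / np.sqrt(widths[i + 1]), (widths[i + 1], widths[i]))
      for i in range(d)]
A = rng.normal(0, 1 / np.sqrt(m), (m, widths[-1]))
e = rng.normal(size=m)

x = rng.normal(size=k)
J = np.eye(k)                          # accumulates prod_i W_{i,+,x}
z = x
for W in Ws:
    pre = W @ z
    J = (pre > 0)[:, None] * (W @ J)   # W_{i,+,x} J = diag(1_{W z > 0}) W J
    z = np.maximum(pre, 0)

q = J.T @ (A.T @ e)                    # q_x = (prod_i W_{i,+,x})^T A^T e
print(np.linalg.norm(q), "<=", 2 * 2**(-d / 2) * np.linalg.norm(e))
```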

Lemma A.4 is used in the proof of Lemma 5.5.

Lemma A.4

For all \(d \ge 2\), it holds that

$$\begin{aligned} 1/\left( a_7(d + 2)^2\right) \le 1 - \rho _d \le 250/(d + 1), \end{aligned}$$

and \(a_8 := \min _{d \ge 2} \rho _d > 0\), where \(a_7\) is a universal constant.

Proof

It holds that

$$\begin{aligned}&\log (1+x) \le x&\forall x \in [-0.5, 1] \end{aligned}$$
(A.2)
$$\begin{aligned}&\log (1-x) \ge -2 x&\forall x \in [0, 0.75]. \end{aligned}$$
(A.3)

We recall the results in [12, (35), (36), and (49)]:

$$\begin{aligned}&{\check{\theta }}_i \le \frac{3 \pi }{i + 3}\;\;\;\;\; \;\;\; \hbox { and }\;\;\;\;\; \;\;\; {\check{\theta }}_i \ge \frac{\pi }{i + 1}\;\;\;\;\; \forall i \ge 0 \\&\quad 1 - \rho _d = \prod _{i = 1}^{d - 1} \left( 1 - \frac{{\check{\theta }}_{i}}{\pi } \right) + \sum _{i = 1}^{d-1} \frac{{\check{\theta }}_{i} - \sin {\check{\theta }}_{i}}{\pi } \prod _{j = i+1}^{d-1} \left( 1 - \frac{{\check{\theta }}_{j}}{\pi } \right) . \end{aligned}$$

Therefore, we have for all \(0 \le i \le d - 2\),

$$\begin{aligned} \prod _{j = i+1}^{d-1} \left( 1 - \frac{{\check{\theta }}_{j}}{\pi } \right)&\le \prod _{j = i+1}^{d-1} \left( 1 - \frac{1}{j + 1} \right) = e^{\sum _{j = i + 1}^{d - 1} \log \left( 1 - \frac{1}{j + 1}\right) } \\&\le e^{- \sum _{j = i + 1}^{d - 1} \frac{1}{j + 1}} \le e^{- \int _{i + 1}^d \frac{1}{s + 1} d s} = \frac{i + 2}{d + 1}, \\ \prod _{j = i+1}^{d-1} \left( 1 - \frac{{\check{\theta }}_{j}}{\pi } \right)&\ge \prod _{j = i+1}^{d-1} \left( 1 - \frac{3}{j + 3} \right) = e^{\sum _{j = i + 1}^{d - 1} \log \left( 1 - \frac{3}{j + 3}\right) } \\&\ge e^{- \sum _{j = i + 1}^{d - 1} \frac{6}{j + 3}} \ge e^{- \int _{i}^{d - 1} \frac{6}{s + 3} d s} = \left( \frac{i + 3}{d + 2}\right) ^6, \end{aligned}$$

where the second and the fifth inequalities follow from (A.2) and (A.3) respectively. Since \(\pi ^3 / (12 (i + 1)^3) \le {\check{\theta }}_{i}^3 / 12 \le {\check{\theta }}_{i} - \sin {\check{\theta }}_{i} \le {\check{\theta }}_{i}^3 / 6 \le 27 \pi ^3 / (6 (i + 3)^3)\), we have that for all \(d \ge 3\)

$$\begin{aligned} 1 - \rho _d&\le \frac{2}{d + 1} + \sum _{i = 1}^{d - 1} \frac{27 \pi ^3}{6 (i + 3)^3} \frac{i + 2}{d + 1} \le \frac{2}{d + 1} + \frac{3 \pi ^5}{4 (d + 1)} \le \frac{250}{d + 1} \end{aligned}$$

and

$$\begin{aligned} 1 - \rho _d&\ge \left( \frac{3}{(d+2)}\right) ^6 + \sum _{i = 1}^{d - 1} \frac{\pi ^3}{12 (i + 3)^3} \left( \frac{i + 3}{d + 2} \right) ^6 \ge \frac{1}{a_7 (d + 2)^2}, \end{aligned}$$

where we use \(\sum _{i = 4}^\infty \frac{1}{i^2} \le \frac{\pi ^2}{6}\) and \(\sum _{i = 1}^n i^3 = O(n^4)\). Since \(\rho _d \ge 1 - 250 / (d+1) \rightarrow 1\) as \(d \rightarrow \infty \) and \(\rho _d > 0\) for each \(d \ge 2\), we have \(\min _{d \ge 2} \rho _d > 0\). \(\square \)
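Lemma A.4 can also be examined numerically. The sketch below evaluates \(1 - \rho _d\) from the displayed formula, assuming the angle recursion \({\check{\theta }}_0 = \pi \) and \({\check{\theta }}_i = g({\check{\theta }}_{i-1})\) with \(g(\theta ) = \cos ^{-1}\left( \left( (\pi - \theta )\cos \theta + \sin \theta \right) /\pi \right) \) as in [12]; this illustrates, but of course does not prove, the bounds.

```python
# Evaluate 1 - rho_d via the formula above and test the upper bound.
import numpy as np

def one_minus_rho(d):
    th = [np.pi]                      # check-theta_0 = pi
    for _ in range(d - 1):
        t = th[-1]
        th.append(np.arccos(((np.pi - t) * np.cos(t) + np.sin(t)) / np.pi))
    tail = lambda i: np.prod([1 - th[j] / np.pi for j in range(i + 1, d)])
    return (np.prod([1 - th[i] / np.pi for i in range(1, d)])
            + sum((th[i] - np.sin(th[i])) / np.pi * tail(i)
                  for i in range(1, d)))

for d in [2, 5, 10, 50, 200]:
    v = one_minus_rho(d)
    print(d, v, v <= 250 / (d + 1))
```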

Lemma A.5 is used in the proof of Lemma 5.5.

Lemma A.5

Fix \(0< a_9 < \frac{1}{4 d^2 \pi }\). For any \(\phi _d \in [\rho _d, 1]\), it holds that

$$\begin{aligned} {f^E}(x)&< \frac{1}{2^{d+1}} \left( \phi _d^2 - 2 \phi _d + \frac{10}{a_8^3} d a_9 \right) \Vert x_*\Vert ^2 + \frac{\Vert x_*\Vert ^2}{2^{d+1}}, \forall x \in {\mathcal {B}}(\phi _d x_*, a_9 \Vert x_*\Vert ) \hbox { and } \\ {f^E}(x)&> \frac{1}{2^{d+1}} \left( \phi _d^2 - 2 \phi _d \rho _d - 10 d^3 a_9 \right) \Vert x_*\Vert ^2 + \frac{\Vert x_*\Vert ^2}{2^{d+1}}, \forall x \in {\mathcal {B}}(- \phi _d x_*, a_9 \Vert x_*\Vert ), \end{aligned}$$

where \(a_8\) is defined in Lemma A.4.

Proof

If \(x \in {\mathcal {B}}(\phi _d x_*, a_9 \Vert x_*\Vert )\), then we have \(0 \le {\bar{\theta }}_{0, x, x_*} \le \arcsin (a_9 / \phi _d) \le \frac{\pi a_9}{2 \phi _d}\), \(0 \le {\bar{\theta }}_{i, x, x_*} \le {\bar{\theta }}_{0, x, x_*} \le \frac{\pi a_9}{2 \phi _d}\) for all \(i\), and \(\phi _d \Vert x_*\Vert - a_9 \Vert x_*\Vert \le \Vert x\Vert \le \phi _d \Vert x_*\Vert + a_9 \Vert x_*\Vert \). Note that \(\cos \theta \ge 1 - \frac{\theta ^2}{2}\) for all \(\theta \in [0, \pi ]\). We have

$$\begin{aligned}&{f^E}(x) - \frac{\Vert x_*\Vert ^2}{2^{d+1}} \le \frac{1}{2^{d+1}} \Vert x\Vert ^2 - \frac{1}{2^d} \left( \prod _{i = 0}^{d - 1} \frac{\pi - {\bar{\theta }}_{i, x, x_*}}{\pi } \right) x_*^T x \\&\quad \le \frac{1}{2^{d+1}} (\phi _d + a_9)^2 \Vert x_*\Vert ^2 - \frac{1}{2^d} \left( \prod _{i = 0}^{d - 1} \frac{\pi - \frac{\pi a_9}{2 \phi _d}}{\pi } \right) \Vert x_*\Vert \Vert x\Vert \cos {\bar{\theta }}_{0, x, x_*} \\&\quad \le \frac{1}{2^{d+1}} (\phi _d + a_9)^2 \Vert x_*\Vert ^2 - \frac{1}{2^d} \left( \prod _{i = 0}^{d - 1} \frac{\pi - \frac{\pi a_9}{2 \phi _d}}{\pi } \right) (\phi _d - a_9) \Vert x_*\Vert ^2 \left( 1 - \frac{\pi ^2 a_9^2}{8 \phi _d^2} \right) \\&\quad \le \frac{1}{2^{d+1}} \left( \phi _d^2 + 2 \phi _d a_9 + a_9^2 - 2 \left( 1 - \frac{d a_9}{\phi _d}\right) (\phi _d -a_9) \left( 1 - \frac{\pi ^2 a_9^2}{8 \phi _d^2}\right) \right) \Vert x_*\Vert ^2 \\&\quad \le \frac{1}{2^{d+1}} \left( \phi _d^2 - 2 \phi _d + \frac{10}{a_8^3} d a_9 \right) \Vert x_*\Vert ^2, \end{aligned}$$

where the last inequality is by Lemma A.4 and \(a_9 < 1 / (4 \pi )\).

If \(x \in {\mathcal {B}}(- \phi _d x_*, a_9 \Vert x_*\Vert )\), then we have \(0 \le \pi - {{\bar{\theta }}}_{0, x, x_*} \le \arcsin (a_9 \pi ) \le \frac{\pi ^2}{2} a_9\), and \(\phi _d \Vert x_*\Vert - a_9 \Vert x_*\Vert \le \Vert x\Vert \le \phi _d \Vert x_*\Vert + a_9 \Vert x_*\Vert \). It follows that

$$\begin{aligned} {f^E}(x) - \frac{\Vert x_*\Vert ^2}{2^{d+1}}&\ge \frac{1}{2^{d+1}} \Vert x\Vert ^2 - \frac{1}{2^d} \sum _{i = 0}^{d - 1} \frac{\sin {\bar{\theta }}_{i, x, x_*}}{\pi } \left( \prod _{j = i + 1}^{d-1} \frac{\pi - {\bar{\theta }}_{j, x, x_*}}{\pi } \right) \Vert x_*\Vert \Vert x\Vert \\&\ge \frac{1}{2^{d+1}} \Vert x\Vert ^2 - \frac{1}{2^d} \left( \rho _d + \frac{3 d^3a_9 \pi ^2}{2} \right) \Vert x_*\Vert \Vert x\Vert \;\;\;\;\; (\hbox {by }[12, (40)]) \\&\ge \frac{1}{2^{d+1}} \left( \phi _d - a_9 \right) ^2 \Vert x_*\Vert ^2 - \frac{1}{2^d} \left( \rho _d + \frac{3 d^3a_9 \pi ^2}{2} \right) \left( \phi _d + a_9\right) \Vert x_*\Vert ^2 \\&\ge \frac{1}{2^{d+1}} \left( \phi _d^2 - 2 \phi _d \rho _d - 10 d^3 a_9 \right) \Vert x_*\Vert ^2. \end{aligned}$$

\(\square \)

Lemma A.6 is used in the proof of Lemma 5.5.

Lemma A.6

If the WDC and RRIC hold with \(\epsilon < 1 / (16 \pi d^2)^2\), then we have

$$\begin{aligned} |f(x) - {f^E}(x)|\le & {} \frac{\epsilon (1 + 4 \epsilon d)}{2^d} \Vert x\Vert ^2 + \frac{\epsilon (1 + 4 \epsilon d) + 48 d^3 \sqrt{\epsilon }}{2^{d+1}} \Vert x\Vert \Vert x_*\Vert \\&+ \frac{\epsilon (1 + 4 \epsilon d)}{2^d} \Vert x_*\Vert ^2. \end{aligned}$$

Proof

For brevity of notation, let \(\Lambda _{z} = \prod _{i = d}^1 W_{i, +, z}\). We have

$$\begin{aligned} \left| f(x) - {f^E}(x) \right|&= \left| \frac{1}{2} x^T \left( \Lambda _{x}^T A^T A \Lambda _{x} - \Lambda _{x}^T \Lambda _{x} \right) x \right. + \frac{1}{2} x^T \left( \Lambda _{x}^T \Lambda _{x} - \frac{I_k}{2^d} \right) x \\&\quad - x^T \left( \Lambda _{x}^T A^T A \Lambda _{x_*} x_* - \Lambda _{x}^T \Lambda _{x_*} x_* \right) - x^T \left( \Lambda _{x}^T \Lambda _{x_*} x_* - h_{x, x_*} \right) \\&\quad + \frac{1}{2} x_*^T \left( \Lambda _{x_*}^T A^T A \Lambda _{x_*} - \Lambda _{x_*}^T \Lambda _{x_*} \right) x_* \\&\quad + \left. \frac{1}{2} x_*^T \left( \Lambda _{x_*}^T \Lambda _{x_*} - \frac{I_k}{2^d} \right) x_* \right| \\&\le \frac{\epsilon }{2} \frac{1 + 4 \epsilon d}{2^d} \Vert x\Vert ^2 + \frac{\epsilon }{2} \frac{1 + 4 \epsilon d}{2^d} \Vert x\Vert ^2 + \frac{\epsilon }{2} \frac{1 + 4 \epsilon d}{2^d} \Vert x\Vert \Vert x_*\Vert \\&\quad + \frac{24 d^3 \sqrt{\epsilon }}{2^d} \Vert x\Vert \Vert x_*\Vert \\&\quad + \frac{\epsilon }{2} \frac{1 + 4 \epsilon d}{2^d} \Vert x_*\Vert ^2 + \frac{\epsilon }{2} \frac{1 + 4 \epsilon d}{2^d} \Vert x_*\Vert ^2\\&= \frac{\epsilon (1 + 4 \epsilon d)}{2^d} \Vert x\Vert ^2 \\&\quad + \frac{\epsilon (1 + 4 \epsilon d) + 48 d^3 \sqrt{\epsilon }}{2^{d+1}} \Vert x\Vert \Vert x_*\Vert + \frac{\epsilon (1 + 4 \epsilon d)}{2^d} \Vert x_*\Vert ^2, \end{aligned}$$

where the first inequality uses the WDC, the RRIC, and [12, Lemma 8]. \(\square \)

Lemma A.7 is used in the proof of Lemma A.8.

Lemma A.7

Suppose \(W \in {\mathbb {R}}^{n \times k}\) satisfies the WDC with constant \(\epsilon \). Then for any \(x, y \in {\mathbb {R}}^k\), it holds that

$$\begin{aligned} \Vert W_{+, x} x - W_{+, y} y\Vert \le \left( \sqrt{\frac{1}{2} + \epsilon } + \sqrt{2(2\epsilon + \theta )} \right) \Vert x - y\Vert , \end{aligned}$$

where \(\theta = \angle (x, y)\).

Proof

We have

$$\begin{aligned}&\Vert W_{+, x} x - W_{+, y} y\Vert \le \Vert W_{+, x} x - W_{+, x} y\Vert + \Vert W_{+, x} y - W_{+, y} y\Vert \nonumber \\&\qquad = \Vert W_{+, x} (x - y)\Vert + \Vert (W_{+, x} - W_{+, y}) y\Vert \nonumber \\&\quad \le \Vert W_{+, x}\Vert \Vert x - y\Vert + \Vert (W_{+, x} - W_{+, y}) y\Vert . \end{aligned}$$
(A.4)

By the WDC assumption, we have

$$\begin{aligned} \Vert W_{+, x}^T (W_{+, x} - W_{+, y})\Vert&\le \left\| W_{+, x}^T W_{+, x} - I / 2\right\| + \left\| W_{+, x}^T W_{+, y} - Q_{x, y}\right\| \nonumber \\&+ \left\| Q_{x, y} - I / 2 \right\| \le 2 \epsilon + \theta . \end{aligned}$$
(A.5)

We also have

$$\begin{aligned}&\Vert (W_{+, x} - W_{+, y}) y\Vert ^2\nonumber \\&\quad = \sum _{i = 1}^n (1_{w_i \cdot x> 0} - 1_{w_i \cdot y> 0})^2 (w_i \cdot y)^2 \nonumber \\&\quad \le \sum _{i = 1}^n (1_{w_i \cdot x> 0} - 1_{w_i \cdot y> 0})^2 ((w_i \cdot x)^2 + (w_i \cdot y)^2 - 2 (w_i \cdot x) (w_i \cdot y) ) \nonumber \\&\quad = \sum _{i = 1}^n (1_{w_i \cdot x> 0} - 1_{w_i \cdot y> 0})^2 (w_i \cdot (x - y) )^2 \nonumber \\&\quad = \sum _{i = 1}^n 1_{w_i \cdot x> 0} 1_{w_i \cdot y \le 0} (w_i \cdot (x - y))^2 + \sum _{i = 1}^n 1_{w_i \cdot x \le 0} 1_{w_i \cdot y > 0} (w_i \cdot (x - y))^2 \nonumber \\&\quad = (x - y)^T W_{+, x}^T (W_{+, x} - W_{+, y}) (x - y) + (x - y)^T W_{+, y}^T (W_{+, y} - W_{+, x}) (x - y) \nonumber \\&\quad \le 2 (2 \epsilon + \theta ) \Vert x - y\Vert ^2 \;\;\; (\hbox {by (A.5)}) \end{aligned}$$
(A.6)

Combining (A.4), (A.6), and \(\Vert W_{+, x}\Vert ^2 \le 1/2 + \epsilon \) given in [12, (9)] yields the result. \(\square \)
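Since \(W_{+,x} x = \mathrm{relu}(Wx)\), the bound of Lemma A.7 is easy to spot-check for a Gaussian \(W\), for which the WDC holds with high probability; the value of \(\epsilon \) below is a guessed slack, so this is an illustration only.

```python
# Random spot-check of the Lipschitz-type bound in Lemma A.7.
import numpy as np

rng = np.random.default_rng(3)
n, k, eps = 4000, 10, 0.05
W = rng.normal(0, 1 / np.sqrt(n), (n, k))
x, y = rng.normal(size=k), rng.normal(size=k)
cos = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(np.clip(cos, -1.0, 1.0))     # angle between x and y

lhs = np.linalg.norm(np.maximum(W @ x, 0) - np.maximum(W @ y, 0))
rhs = (np.sqrt(0.5 + eps) + np.sqrt(2 * (2 * eps + theta))) \
      * np.linalg.norm(x - y)
print(lhs, "<=", rhs, lhs <= rhs)
```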

Lemma A.8 is used in the proofs of Lemma 5.6 and Lemma A.9.

Lemma A.8

Suppose \(x \in {\mathcal {B}}(x_*, d \sqrt{\epsilon } \Vert x_*\Vert )\), and the WDC holds with \(\epsilon < 1/\left( (200)^4 d^6\right) \). Then it holds that

$$\begin{aligned} \left\| \prod _{i = j}^1 W_{i, +, x} x - \prod _{i = j}^1 W_{i, +, x_*} x_*\right\| \le \frac{1.2}{2^{\frac{j}{2}}} \Vert x - x_*\Vert . \end{aligned}$$

Proof

In this proof, we denote \(\theta _{i, x, x_*}\) and \({\bar{\theta }}_{i, x, x_*}\) by \(\theta _i\) and \({\bar{\theta }}_{i}\) respectively. Since \(x \in {\mathcal {B}}(x_*, d \sqrt{\epsilon } \Vert x_*\Vert )\), we have

$$\begin{aligned} {\bar{\theta }}_{i} \le {\bar{\theta }}_{0} \le 2 d \sqrt{\epsilon }. \end{aligned}$$
(A.7)

By [12, (14)], we also have \(|\theta _{i} - {\bar{\theta }}_{i}| \le 4 i \sqrt{\epsilon } \le 4 d \sqrt{\epsilon }\). It follows that

$$\begin{aligned} 2 \sqrt{\theta _i + 2 \epsilon }&\le 2 \sqrt{{\bar{\theta }}_i + 4 d \sqrt{\epsilon } + 2\epsilon } \le 2 \sqrt{2 d \sqrt{\epsilon } + 4 d \sqrt{\epsilon } + 2\epsilon } \nonumber \\&\quad \le 2 \sqrt{8 d \sqrt{\epsilon }} \le \frac{1}{30 d}. \, (\hbox {by the assumption on }\epsilon ) \end{aligned}$$
(A.8)

Note that \(\sqrt{1 + 2 \epsilon } \le 1 + \epsilon \le 1 + \sqrt{d\sqrt{\epsilon }}\). We have

$$\begin{aligned} \prod _{i = d - 1}^0 \left( \sqrt{1+2\epsilon } + 2 \sqrt{ \theta _i + 2 \epsilon } \right)&\le \left( 1 + 7 \sqrt{d\sqrt{\epsilon }}\right) ^d \\&\le 1 + 14 d \sqrt{d\sqrt{\epsilon }} \le \frac{107}{100} < 1.2, \end{aligned}$$

where the second inequality uses the fact that \((1+x)^d \le 1 + 2dx\) whenever \(0< x d < 1\). Combining the above inequality with Lemma A.7 yields

$$\begin{aligned} \left\| \prod _{i = j}^1 W_{i, +, x}x - \prod _{i = j}^1 W_{i, +, x_*} x_*\right\|&\le \prod _{i = j - 1}^0 \left( \sqrt{\frac{1}{2}+\epsilon } + \sqrt{2} \sqrt{ \theta _i + 2 \epsilon } \right) \Vert x - x_*\Vert \\&\le \frac{1.2}{2^{\frac{j}{2}}} \Vert x - x_*\Vert . \end{aligned}$$

\(\square \)
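Empirically, the contraction factor \(1.2 \cdot 2^{-j/2}\) of Lemma A.8 can be observed by noting that \(\prod _{i = j}^1 W_{i, +, x}\, x\) is exactly the layer-\(j\) output of the ReLU network at \(x\); the widths and the radius of the ball below are illustrative choices.

```python
# Spot-check of the per-depth contraction in Lemma A.8.
import numpy as np

rng = np.random.default_rng(4)
k, d = 10, 4
widths = [k, 80, 160, 320, 640]
Ws = [rng.normal(0, 1 / np.sqrt(widths[i + 1]), (widths[i + 1], widths[i]))
      for i in range(d)]

x_star = rng.normal(size=k)
x = x_star + 0.01 * rng.normal(size=k)   # x close to x_*

zx, zs = x, x_star
for j, W in enumerate(Ws, start=1):
    zx, zs = np.maximum(W @ zx, 0), np.maximum(W @ zs, 0)
    lhs = np.linalg.norm(zx - zs)
    rhs = 1.2 * 2**(-j / 2) * np.linalg.norm(x - x_star)
    print(j, lhs, "<=", rhs, lhs <= rhs)
```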

Lemma A.9 is used in the proof of Lemma 5.6.

Lemma A.9

Suppose \(x \in {\mathcal {B}}(x_*, d \sqrt{\epsilon } \Vert x_*\Vert )\), and the WDC holds with \(\epsilon < 1/\left( (200)^4 d^6\right) \). Then it holds that

$$\begin{aligned}&\left( \prod _{i = d}^1 W_{i, +, x}\right) ^T\left[ \left( \prod _{i = d}^1 W_{i, +, x}\right) x - \left( \prod _{i = d}^1 W_{i, +, x_*}\right) x_*\right] \\&\quad = \frac{1}{2^d} (x - x_*) + \frac{1}{2^d} \frac{1}{16} \Vert x - x_*\Vert O_1(1). \end{aligned}$$

Proof

For brevity of notation, let \(\Lambda _{j, k, z} = \prod _{i = j}^k W_{i, +, z}\), and recall from [12] that \(a = b + O_1(c)\) means \(\Vert a - b\Vert \le c\). We have

$$\begin{aligned}&\Lambda _{d, 1, x}^T\left( \Lambda _{d, 1, x} x - \Lambda _{d, 1, x_*} x_*\right) \nonumber \\&\quad = \Lambda _{d, 1, x}^T\left[ \Lambda _{d, 1, x} x - \sum _{j = 1}^d \left( \Lambda _{d, j, x} \Lambda _{j-1, 1, x_*} x_*\right) \right. \nonumber \\&\qquad + \left. \sum _{j = 1}^d \left( \Lambda _{d, j, x} \Lambda _{j-1, 1, x_*} x_*\right) - \Lambda _{d, 1, x_*} x_*\right] \nonumber \\&\quad = \underbrace{\Lambda _{d, 1, x}^T \Lambda _{d, 1, x} (x - x_*)}_{T_1} + \underbrace{\Lambda _{d, 1, x}^T \sum _{j = 1}^d \Lambda _{d, j + 1, x} \left( W_{j, +, x} - W_{j, +, x_*} \right) \Lambda _{j-1, 1, x_*} x_*}_{T_2}. \end{aligned}$$
(A.9)

For \(T_1\), we have

$$\begin{aligned} T_1 = \frac{1}{2^d} (x - x_*) + \frac{4 d}{2^d} \Vert x - x_*\Vert O_1(\epsilon ). \;\; ([12, (10)]) \end{aligned}$$
(A.10)

For \(T_2\), we have

$$\begin{aligned} T_2&= O_1(1) \sum _{j = 1}^d \left( \frac{1}{2^{d - \frac{j}{2}}} + \frac{(4d - 2j)\epsilon }{2^{d - \frac{j}{2}}} \right) \left\| (W_{j, +, x} - W_{j, +, x_*}) \Lambda _{j-1, 1, x_*}x_* \right\| \nonumber \\&= O_1(1) \sum _{j = 1}^d \left( \frac{1}{2^{d - \frac{j}{2}}} + \frac{(4d - 2j)\epsilon }{2^{d - \frac{j}{2}}} \right) \left\| (\Lambda _{j-1, 1, x} x - \Lambda _{j-1, 1, x_*} x_*) \right\| \nonumber \\&\qquad \sqrt{2 (\theta _{j-1, x, x_*} + 2 \epsilon )} \nonumber \\&= O_1(1) \sum _{j = 1}^d \left( \frac{1}{2^{d - \frac{j}{2}}} + \frac{(4d - 2j)\epsilon }{2^{d - \frac{j}{2}}} \right) \frac{1.2}{2^{\frac{j}{2}}} \Vert x - x_*\Vert \frac{1}{ 30 \sqrt{2} d} \nonumber \\&= \frac{1}{16} \frac{1}{2^d} \Vert x - x_*\Vert O_1(1), \end{aligned}$$
(A.11)

where the first equality is by [12, (10)], the second by (A.6), and the third by Lemma A.8 and (A.8). The result follows from (A.9), (A.10), and (A.11). \(\square \)


Cite this article

Huang, W., Hand, P., Heckel, R. et al. A Provably Convergent Scheme for Compressive Sensing Under Random Generative Priors. J Fourier Anal Appl 27, 19 (2021). https://doi.org/10.1007/s00041-021-09830-5
