Abstract
Deep generative modeling has led to new, state-of-the-art approaches for enforcing structural priors in a variety of inverse problems. In contrast to priors given by sparsity, deep models can provide direct, low-dimensional parameterizations of the manifold of images or signals belonging to a particular natural class, allowing recovery algorithms to be posed in a low-dimensional space. This dimensionality may even be lower than the sparsity level of the same signals when viewed in a fixed basis. What has not been known about these methods is whether there are computationally efficient algorithms whose sample complexity is optimal in the dimensionality of the representation given by the generative model. In this paper, we present such an algorithm and its analysis. Under the assumption that the generative model is a neural network that is sufficiently expansive at each layer and has Gaussian weights, we provide a gradient descent scheme and prove that, for noisy compressive measurements of a signal in the range of the model, the algorithm converges to that signal, up to the noise level. The sample complexity scales linearly in the input dimensionality of the generative prior, and thus cannot be improved except for constants and factors of other variables. To the best of the authors' knowledge, this is the first recovery guarantee for compressive sensing under generative priors achieved by a computationally efficient algorithm.
Notes
This implementation is available at https://www.caam.rice.edu/~optimization/L1/fpc/.
References
Allen-Zhu, Z., Li, Y., Song, Z.: A convergence theory for deep learning via over-parameterization. In: Proceedings of the 36th International Conference on Machine Learning, vol. 97, pp. 242–252. PMLR (2019)
Arora, S., Liang, Y., Ma, T.: Why are deep nets reversible: a simple theory, with implications for training. Preprint (2015). arXiv:1511.05653
Blanchard, J.D., Cartis, C., Tanner, J.: Compressed sensing: How sharp is the restricted isometry property? SIAM Rev. Soc. Ind. Appl. Math. 53(1), 105–125 (2011)
Bora, A., Jalal, A., Price, E., Dimakis, A.G.: Compressed sensing using generative models. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 537–546. PMLR (2017)
Candes, E.J., Tao, T.: Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inf. Theory 52(12), 5406–5425 (2006)
Clason, C.: Nonsmooth analysis and optimization. Preprint (2017). arXiv:1708.04180
Du, S.S., Zhai, X., Poczos, B., Singh, A.: Gradient descent provably optimizes over-parameterized neural networks. In: Proceedings of the 7th International Conference on Learning Representations (2019)
Eldar, Y.C., Kutyniok, G.: Compressed Sensing: Theory and Applications. Cambridge University Press, Cambridge (2012)
Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Birkhäuser/Springer, Boston (2013)
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020)
Hale, E.T., Yin, W., Zhang, Y.: Fixed-point continuation for \(\ell _1\)-minimization: methodology and convergence. SIAM J. Optim. 19(3), 1107–1130 (2008)
Hand, P., Voroninski, V.: Global guarantees for enforcing deep generative priors by empirical risk. IEEE Trans. Inf. Theory 66(1), 401–418 (2019)
Heckel, R., Huang, W., Hand, P., Voroninski, V.: Deep denoising: rate-optimal recovery of structured signals with a deep prior. Inf. Inference (2020, accepted)
Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: European Conference on Computer Vision, pp. 694–711. Springer, Cham (2016)
Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. Preprint (2018). arXiv:1710.10196
Kingma, D.P., Dhariwal, P.: Glow: Generative flow with invertible 1 \(\times \) 1 convolutions. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 10236–10245 (2018)
Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: Proceedings of the 2nd International Conference on Learning Representations (2014)
Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4681–4690 (2017)
Li, Y., Liang, Y.: Learning overparameterized neural networks via stochastic gradient descent on structured data. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 8168–8177 (2018)
Mao, X.-J., Shen, C., Yang, Y.-B.: Image restoration using convolutional auto-encoders with symmetric skip connections. Preprint (2016). arXiv:1606.08921
Mardani, M., Gong, E., Cheng, J.Y., Vasanawala, S.S., Zaharchuk, G., Xing, L., Pauly, J.M.: Deep generative adversarial neural networks for compressive sensing MRI. IEEE Trans. Med. Imaging 38(1), 167–179 (2019)
Mardani, M., Monajemi, H., Papyan, V., Vasanawala, S., Donoho, D., Pauly, J.: Recurrent generative adversarial networks for proximal learning and automated compressive image recovery. Preprint (2017). arXiv:1711.10046
Mousavi, A., Baraniuk, R.G.: Learning to invert: Signal recovery via deep convolutional networks. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2272–2276 (2017)
Mousavi, A., Patel, A.B., Baraniuk, R.G.: A deep learning approach to structured signal recovery. In: 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1336–1343 (2015)
Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Cham (2006)
Oymak, S., Soltanolkotabi, M.: Toward moderate overparameterization: global convergence guarantees for training shallow neural networks. IEEE J. Sel. Areas Inf. Theory 1(1), 84–105 (2020)
Rippel, O., Bourdev, L.: Real-time adaptive image compression. In: International Conference on Machine Learning, pp. 2922–2930. PMLR (2017)
Sønderby, C.K., Caballero, J., Theis, L., Shi, W., Huszár, F.: Amortised MAP inference for image super-resolution. In: Proceedings of the 5th International Conference on Learning Representations (2017)
Van Oord, A., Kalchbrenner, N., Kavukcuoglu, K.: Pixel recurrent neural networks. In: International Conference on Machine Learning, pp. 1747–1756. PMLR (2016)
Yeh, R.A., Chen, C., Lim, T.Y., Schwing, A.G., Hasegawa-Johnson, M., Do, M.N.: Semantic image inpainting with deep generative models. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5485–5493 (2017)
Zhu, J.-Y., Krähenbühl, P., Shechtman, E., Efros, A.A.: Generative visual manipulation on the natural image manifold. In: European Conference on Computer Vision, pp. 597–613. Springer, Cham (2016)
Acknowledgements
W.H. is partially supported by the Fundamental Research Funds for the Central Universities (No. 20720190060) and the National Natural Science Foundation of China (No. 12001455). P.H. is partially supported by NSF CAREER Award DMS-1848087 and NSF Award DMS-2022205. R.H. is partially supported by NSF Award IIS-1816986.
Communicated by Roman Vershynin.
Appendix A: Supporting Lemmas
Lemma A.1 is used in proofs for Sect. 5.3 and Lemma 5.3.
Lemma A.1
Suppose that the WDC and RRIC hold with \(\epsilon < 1/(16 \pi d^2)^2\) and that the noise e satisfies \(\Vert e\Vert \le a_5 2^{-d/2} \Vert x_*\Vert \). Then, for all x and all \(v_x \in \partial f(x)\),
where \(a_5\) and \(a_6\) are universal constants.
Proof
Define for convenience \(\zeta _j=\prod _{i = j}^{d - 1} \frac{\pi - {\bar{\theta }}_{i, x, x_*}}{\pi }\). We have
where the second inequality follows from the definition of \(h_x\) and Lemma 5.2, the third inequality uses \(| \zeta _j | \le 1\), and the last inequality uses the assumption \(\Vert e\Vert \le a_5 2^{-d/2} \Vert x_*\Vert \). \(\square \)
Lemma A.2 is used in proofs for Lemma 5.1.
Lemma A.2
Suppose \(a_i, b_i \in [0, \pi ]\) for \(i = 1, \ldots , k\), and \(|a_i - b_i| \le |a_j - b_j|, \forall i \ge j\). Then it holds that
Proof
We prove the claim by induction. It is easy to verify that the inequality holds if \(k = 1\). Suppose the inequality holds with \(k = t - 1\). Then
\(\square \)
Lemma A.3 is used in proofs for Lemmas 5.2, 5.3, and 5.5.
Lemma A.3
Suppose the WDC and RRIC hold with \(\epsilon \le 1 / (16 \pi d^2)^2\). Then we have
where \(q_x = \left( \prod _{i = d}^1 W_{i, +, x}\right) ^T A^T e\). In addition, if G is differentiable at x, then we have
Proof
We have
where the second inequality follows from RRIC and the last inequality follows from [12, (10)]. Therefore, \(\left| x^T q_x\right| \le \frac{2}{2^{d/2}} \Vert e\Vert \Vert x\Vert \).
Suppose G is differentiable at x. Then the local linearity of G implies that \(G(x + z) - G(x) = \left( \prod _{i = d}^1 W_{i, +, x}\right) z\) for any sufficiently small \(z \in {\mathbb {R}}^k\). By the RRIC, we have
which implies
Therefore, we obtain
Combining the above inequality with \(\prod _{i = d}^1\Vert W_{i, +, x} \Vert \le (1 + 2 \epsilon d) / 2^{d/2} \le 1.5 / 2^{d/2}\) given in [12, (10)] yields
where the second inequality follows from the assumption on \(\epsilon \). Therefore, we obtain
\(\square \)
Lemma A.4 is used in proofs for Lemma 5.5.
Lemma A.4
For all \(d \ge 2\), it holds that
and \(a_8 = \min _{d \ge 2} \rho _d > 0\).
Proof
It holds that
where \(\theta _{x, y} = \angle (x, y)\).
We recall the results in [12, (35), (36), and (49)]:
Therefore, we have for all \(0 \le i \le d - 2\),
where the second and the fifth inequalities follow from (A.2) and (A.3) respectively. Since \(\pi ^3 / (12 (i + 1)^3) \le {\check{\theta }}_{i}^3 / 12 \le {\check{\theta }}_{i} - \sin {\check{\theta }}_{i} \le {\check{\theta }}_{i}^3 / 6 \le 27 \pi ^3 / (6 (i + 3)^3)\), we have that for all \(d \ge 3\)
and
where we use \(\sum _{i = 4}^\infty \frac{1}{i^2} \le \frac{\pi ^2}{6}\) and \(\sum _{i = 1}^n i^3 = O(n^4)\). Since \(\rho _d \ge 1 - 250 / (d+1)\) and \(\rho _d > 0\) for all \(d \ge 2\), we have \(\min _{d \ge 2} \rho _d > 0\). \(\square \)
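The proof above leans on the elementary bounds \(\theta ^3/12 \le \theta - \sin \theta \le \theta ^3/6\) for \(\theta \in (0, \pi ]\). These can be spot-checked numerically; the grid of test angles below is an arbitrary choice and the check is independent of the proof.

```python
import math

# Spot check theta^3/12 <= theta - sin(theta) <= theta^3/6 on a grid in (0, pi].
# The upper bound is the alternating Taylor series; the lower bound holds on (0, pi].
thetas = [i * math.pi / 1000 for i in range(1, 1001)]
ok = all(t ** 3 / 12 <= t - math.sin(t) <= t ** 3 / 6 for t in thetas)
```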
Lemma A.5 is used in proofs for Lemma 5.5.
Lemma A.5
Fix \(0< a_9 < \frac{1}{4 d^2 \pi }\). For any \(\phi _d \in [\rho _d, 1]\), it holds that
where \(a_8\) is defined in Lemma A.4.
Proof
If \(x \in {\mathcal {B}}(\phi _d x_*, a_9 \Vert x_*\Vert )\), then we have \(0 \le {\bar{\theta }}_{0, x, x_*} \le \arcsin (a_9 / \phi _d) \le \frac{\pi a_9}{2 \phi _d}\), \(0 \le {\bar{\theta }}_{0, x, x_*} \le {\bar{\theta }}_{i, x, x_*} \le \frac{\pi a_9}{2 \phi _d}\), and \(\phi _d \Vert x_*\Vert - a_9 \Vert x_*\Vert \le \Vert x\Vert \le \phi _d \Vert x_*\Vert + a_9 \Vert x_*\Vert \). Note that \(\cos \theta \ge 1 - \frac{\theta ^2}{2}, \forall \theta \in [0, \pi ]\). We have
where the last inequality is by Lemma A.4 and \(a_9 < 1 / (4 \pi )\).
If \(x \in {\mathcal {B}}(- \phi _d x_*, a_9 \Vert x_*\Vert )\), then we have \(0 \le \pi - {{\bar{\theta }}}_{0, x, x_*} \le \arcsin (a_9 \pi ) \le \frac{\pi ^2}{2} a_9\), and \(\phi _d \Vert x_*\Vert - a_9 \Vert x_*\Vert \le \Vert x\Vert \le \phi _d \Vert x_*\Vert + a_9 \Vert x_*\Vert \). It follows that
\(\square \)
Lemma A.6 is used in proofs for Lemma 5.5.
Lemma A.6
If the WDC and RRIC hold with \(\epsilon < 1 / (16 \pi d^2)^2\), then we have
Proof
For brevity of notation, let \(\Lambda _{z} = \prod _{i = d}^1 W_{i, +, z}\). We have
where the first inequality uses the WDC, the RRIC, and [12, Lemma 8]. \(\square \)
Lemma A.7 is used in proofs for Lemma A.8.
Lemma A.7
Suppose \(W \in {\mathbb {R}}^{n \times k}\) satisfies the WDC with constant \(\epsilon \). Then for any \(x, y \in {\mathbb {R}}^k\), it holds that
where \(\theta = \angle (x, y)\).
Proof
We have
By the WDC, we have
We also have
Combining (A.4), (A.6), and \(\Vert W_{i, +, x}\Vert ^2 \le 1/2 + \epsilon \) given in [12, (9)] yields the result. \(\square \)
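The simplest consequence of the WDC invoked here, \(\Vert W_{+, x}\Vert ^2 \approx 1/2\) and more precisely \(W_{+, x}^T W_{+, x} \approx \frac{1}{2} I\) for Gaussian W with rows of variance 1/n, can be illustrated by Monte Carlo; the sizes and seed below are arbitrary, and this is an illustration rather than a proof of concentration.

```python
import numpy as np

# For W with i.i.d. N(0, 1/n) entries and W_{+,x} = diag(1{Wx > 0}) W,
# E[W_{+,x}^T W_{+,x}] = I/2; with n >> k the empirical matrix concentrates.
rng = np.random.default_rng(1)
n, k = 20000, 10
W = rng.normal(0.0, 1.0 / np.sqrt(n), (n, k))
x = rng.normal(size=k)
Wp = ((W @ x) > 0)[:, None] * W           # keep rows of W where (Wx)_i > 0
dev = np.linalg.norm(Wp.T @ Wp - 0.5 * np.eye(k), 2)   # spectral deviation
```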
Lemma A.8 is used in proofs for Lemma 5.6 and Lemma A.9.
Lemma A.8
Suppose \(x \in {\mathcal {B}}(x_*, d \sqrt{\epsilon } \Vert x_*\Vert )\), and the WDC holds with \(\epsilon < 1/ (200)^4 / d^6\). Then it holds that
Proof
In this proof, we denote \(\theta _{i, x, x_*}\) and \({\bar{\theta }}_{i, x, x_*}\) by \(\theta _i\) and \({\bar{\theta }}_{i}\) respectively. Since \(x \in {\mathcal {B}}(x_*, d \sqrt{\epsilon } \Vert x_*\Vert )\), we have
By [12, (14)], we also have \(|\theta _{i} - {\bar{\theta }}_{i}| \le 4 i \sqrt{\epsilon } \le 4 d \sqrt{\epsilon }\). It follows that
Note that \(\sqrt{1 + 2 \epsilon } \le 1 + \epsilon \le 1 + \sqrt{d\sqrt{\epsilon }}\). We have
where the second inequality follows from the fact that \((1+x)^d \le 1 + 2dx\) whenever \(0< xd < 1\). Combining the above inequality with Lemma A.7 yields
\(\square \)
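The elementary inequality \((1+x)^d \le 1 + 2dx\) for \(0< xd < 1\), invoked at the end of the proof (it follows from \((1+x)^d \le e^{xd}\) and \(e^t \le 1 + 2t\) on \([0,1]\)), can be spot-checked numerically; the grid below is an arbitrary choice.

```python
# Check (1 + x)^d <= 1 + 2*d*x on a grid with 0 < x*d < 1.
checks = []
for d in range(1, 50):
    for j in range(1, 100):
        x = j / (100 * d)                 # ensures 0 < x*d <= 0.99 < 1
        checks.append((1 + x) ** d <= 1 + 2 * d * x)
ok = all(checks)
```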
Lemma A.9 is used in proofs for Lemma 5.6.
Lemma A.9
Suppose \(x \in {\mathcal {B}}(x_*, d \sqrt{\epsilon } \Vert x_*\Vert )\), and the WDC holds with \(\epsilon < 1/ (200)^4 / d^6\). Then it holds that
Proof
For brevity of notation, let \(\Lambda _{j, k, z} = \prod _{i = j}^k W_{i, +, z}\). We have
For \(T_1\), we have
For \(T_2\), we have
where the first equation is by [12, (10)]; the second equation is by (A.6); the third equation is by Lemma A.8 and (A.8). The result follows from (A.9), (A.10), and (A.11). \(\square \)
Huang, W., Hand, P., Heckel, R. et al. A Provably Convergent Scheme for Compressive Sensing Under Random Generative Priors. J Fourier Anal Appl 27, 19 (2021). https://doi.org/10.1007/s00041-021-09830-5