
Do Log Factors Matter? On Optimal Wavelet Approximation and the Foundations of Compressed Sensing


Abstract

A signature result in compressed sensing is that Gaussian random sampling achieves stable and robust recovery of sparse vectors under optimal conditions on the number of measurements. However, in the context of image reconstruction, it has been extensively documented that sampling strategies based on Fourier measurements outperform this purportedly optimal approach. Motivated by this seeming paradox, we investigate the problem of optimal sampling for compressed sensing. Rigorously combining the theories of wavelet approximation and infinite-dimensional compressed sensing, our analysis leads to new error bounds in terms of the total number of measurements m for the approximation of piecewise \(\alpha \)-Hölder functions. Our theoretical findings suggest that Fourier sampling outperforms random Gaussian sampling when the Hölder exponent \(\alpha \) is large enough. Moreover, we establish a provably optimal sampling strategy. This work is an important first step towards the resolution of the claimed paradox and provides a clear theoretical justification for the practical success of compressed sensing techniques in imaging problems.


Notes

  1. This also provides some explanation as to why attempts to modify devices such as MR scanners to produce Gaussian-like measurements (see, for example, [42, 58, 59]) have not been widely adopted.

  2. In particular, by ‘Fourier measurements’ we mean samples of the continuous Fourier transform of f, not its discrete Fourier transform. Not only is this more convenient for the analysis, it is also more relevant in practice, since modalities such as MRI are based on the continuous Fourier transform [10].

  3. There are some theoretical results for QCBP in the presence of unknown noise [19, 32, 37, 70]. However, except in specific cases, these involve additional factors (so-called quotients) which are difficult to estimate.

  4. The entries of the cross-Gramian matrix U (8.1) used in this sampling strategy are computed by applying the inverse discrete wavelet and Fourier transforms to the first N elements of the canonical basis of the augmented space \({\mathbb {R}}^{16N}\). Then, only the N entries corresponding to the frequencies of interest are kept. This augmentation makes the computation of U more accurate.

  5. In order to avoid discretization effects related to the wavelet crime, the vector d of wavelet coefficients is computed by sampling the function f on a uniform grid of 16N points, applying the discrete wavelet transform and then keeping the first N entries of the resulting vector.

References

  1. B. Adcock, V. Antun, and A. C. Hansen. Uniform recovery in infinite-dimensional compressed sensing and applications to structured binary sampling. arXiv:1905.00126, 2019.

  2. B. Adcock, A. Bao, and S. Brugiapaglia. Correcting for unknown errors in sparse high-dimensional function approximation. Numer. Math. (to appear), 2019.

  3. B. Adcock, C. Boyer, and S. Brugiapaglia. On the gap between local recovery guarantees in compressed sensing and oracle estimates. arXiv:1806.03789, 2018.

  4. B. Adcock and A. C. Hansen. Generalized sampling and infinite-dimensional compressed sensing. Found. Comput. Math., 16(5):1263–1323, 2016.

  5. B. Adcock and A. C. Hansen. Compressive Imaging: Structure, Sampling, Learning. Cambridge University Press (in press), 2021.

  6. B. Adcock, A. C. Hansen, G. Kutyniok, and J. Ma. Linear stable sampling rate: Optimality of 2D wavelet reconstructions from Fourier measurements. SIAM J. Math. Anal., 47(2):1196–1233, 2015.

  7. B. Adcock, A. C. Hansen, and C. Poon. On optimal wavelet reconstructions from Fourier samples: linearity and universality of the stable sampling rate. Appl. Comput. Harmon. Anal., 36(3):387–415, 2014.

  8. B. Adcock, A. C. Hansen, C. Poon, and B. Roman. Breaking the coherence barrier: A new theory for compressed sensing. Forum Math. Sigma, 5, 2017.

  9. B. Adcock, A. C. Hansen, and B. Roman. The quest for optimal sampling: computationally efficient, structure-exploiting measurements for compressed sensing. In Compressed Sensing and Its Applications. Springer, 2015.

  10. B. Adcock, A. C. Hansen, B. Roman, and G. Teschke. Generalized sampling: stable reconstructions, inverse problems and compressed sensing over the continuum. Advances in Imaging and Electron Physics, 182:187–279, 2014.

  11. G. R. Arce, D. J. Brady, L. Carin, H. Arguello, and D. Kittle. Compressive coded aperture spectral imaging: An introduction. IEEE Signal Process. Mag., 31(1):105–115, 2014.

  12. R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hedge. Model-based compressive sensing. IEEE Trans. Inform. Theory, 56(4):1982–2001, 2010.

  13. A. Bastounis, B. Adcock, and A. C. Hansen. From global to local: Getting more from compressed sensing. SIAM News, 2017.

  14. A. Bastounis and A. C. Hansen. On the absence of the RIP in real-world applications of compressed sensing and the RIP in levels. SIAM J. Imaging Sci., 2017 (to appear).

  15. A. Belloni, V. Chernozhukov, and L. Wang. Square-root lasso: pivotal recovery of sparse signals via conic programming. Biometrika, 98(4):791–806, 2011.

  16. V. Boominathan, J. K. Adams, M. S. Asif, B. W. Avants, J. T. Robinson, R. G. Baraniuk, A. C. Sankaranarayanan, and A. Veeraraghavan. Lensless imaging: A computational renaissance. IEEE Signal Process. Mag., 33(5):23–35, 2016.

  17. C. Boyer, J. Bigot, and P. Weiss. Compressed sensing with structured sparsity and structured acquisition. Appl. Comput. Harm. Anal., 46(2):312–350, 2017.

  18. D. J. Brady, K. Choi, D. L. Marks, R. Horisaki, and S. Lim. Compressive holography. Opt. Express, 17:13040–13049, 2009.

  19. S. Brugiapaglia and B. Adcock. Robustness to unknown error in sparse regularization. IEEE Trans. Inform. Theory, 64(10):6638–6661, 2018.

  20. T. Cai and A. Zhang. Sparse representation of a polytope and recovery of sparse signals and low-rank matrices. IEEE Trans. Inform. Theory, 60(1):122–132, 2014.

  21. E. Candès. The restricted isometry property and its implications for compressed sensing. C. R. Math. Acad. Sci. Paris, 346(9-10):589–592, 2008.

  22. E. J. Candès and D. L. Donoho. New tight frames of curvelets and optimal representations of objects with piecewise c2 singularities. Comm. Pure Appl. Math, 57(2):219–266, 2004.

  23. E. J. Candès and Y. Plan. A probabilistic and RIPless theory of compressed sensing. IEEE Trans. Inform. Theory, 57(11):7235–7254, 2011.

  24. E. J. Candès and J. Romberg. Sparsity and incoherence in compressive sampling. Inverse Problems, 23(3):969–985, 2007.

  25. E. J. Candès, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory, 52(2):489–509, 2006.

  26. A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vision, 40(1):120–145, 2011.

  27. N. Chauffert, P. Ciuciu, J. Kahn, and P. Weiss. Variable density sampling with continuous trajectories. SIAM J. Imaging Sci., 7(4):1962–1992, 2014.

  28. A. Chkifa, N. Dexter, H. Tran, and C. G. Webster. Polynomial approximation via compressed sensing of high-dimensional functions on lower sets. Math. Comp., 87:1415–1450, 2018.

  29. A. Cohen, W. Dahmen, and R. A. DeVore. Compressed sensing and best \(k\)-term approximation. J. Amer. Math. Soc., 22(1):211–231, 2009.

  30. I. Daubechies. Ten Lectures on Wavelets, volume 61 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992.

  31. M. A. Davenport, M. F. Duarte, Y. C. Eldar, and G. Kutyniok. Introduction to compressed sensing. In Compressed Sensing: Theory and Applications. Cambridge University Press, 2011.

  32. R. DeVore, G. Petrova, and P. Wojtaszczyk. Instance-optimality in probability with an \(\ell _1\)-minimization decoder. Appl. Comput. Harmon. Anal., 27(3):275–288, 2009.

  33. R. A. DeVore. Nonlinear approximation. Acta Numer., 7:51–150, 1998.

  34. M. F. Duarte, M. A. Davenport, D. Takhar, J. Laska, K. Kelly, and R. G. Baraniuk. Single-pixel imaging via compressive sampling. IEEE Signal Process. Mag., 25(2):83–91, 2008.

  35. M. F. Duarte and Y. C. Eldar. Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process., 59(9):4053–4085, 2011.

  36. J. A. Fessler. Optimization methods for MR image reconstruction. arXiv:1903.03510, 2019.

  37. S. Foucart. Stability and robustness of \(\ell _1\)-minimizations with Weibull matrices and redundant dictionaries. Linear Algebra Appl., 441:4–21, 2014.

  38. S. Foucart and H. Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhauser, 2013.

  39. M. Gataric and C. Poon. A practical guide to the recovery of wavelet coefficients from Fourier measurements. SIAM J. Sci. Comput., 38(2):A1075–A1099, 2016.

  40. M. E. Gehm and D. J. Brady. Compressive sensing in the EO/IR. Applied Optics, 54(8):C14–C22, 2015.

  41. C. G. Graff and E. Y. Sidky. Compressive sensing in medical imaging. Appl. Opt., 54:C23–C44, 2015.

  42. J. Haldar, D. Hernando, and Z. Liang. Compressed-sensing MRI with random encoding. IEEE Trans. Med. Imaging, 30(4):893–903, 2011.

  43. D. J. Holland, M. J. Bostock, L. F. Gladden, and D. Nietlispach. Fast multidimensional NMR spectroscopy using compressed sensing. Angew. Chem. Int. Ed., 50(29), 2011.

  44. G. Huang, H. Jiang, K. Matthews, and P. Wilford. Lensless imaging by compressive sensing. In 20th IEEE International Conference on Image Processing, 2013.

  45. http://www3.gehealthcare.in/~/media/images/product/product-categories/magnetic-resonance-imaging/optima-mr450w-1-5t-with-gem-suite/1-clinical/optima_mr450w_with_gem_suite_brainpropt2_clinical.jpg.

  46. O. Katz, Y. Bromberg, and Y. Silberberg. Compressive ghost imaging. Appl. Phys. Lett., 95:131110, 2009.

  47. K. Kazimierczuk and V. Y. Orekhov. Accelerated NMR spectroscopy by using compressed sensing. Angew. Chem. Int. Ed., 50(24), 2011.

  48. F. Krahmer and R. Ward. Stable and robust recovery from variable density frequency samples. IEEE Trans. Image Proc., 23(2):612–622, 2013.

  49. G. Kutyniok and W.-Q. Lim. Optimal compressive imaging of Fourier data. SIAM J. Imaging Sci., 11(1):507–546, 2018.

  50. C. Li and B. Adcock. Compressed sensing with local structure: uniform recovery guarantees for the sparsity in levels class. Appl. Comput. Harmon. Anal., 46(3):453–477, 2019.

  51. M. Lustig, D. L. Donoho, and J. M. Pauly. Sparse MRI: the application of compressed sensing for rapid MRI imaging. Magn. Reson. Med., 58(6):1182–1195, 2007.

  52. M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly. Compressed Sensing MRI. IEEE Signal Process. Mag., 25(2):72–82, March 2008.

  53. S. G. Mallat. A Wavelet Tour of Signal Processing: The Sparse Way. Academic Press, 3 edition, 2009.

  54. R. F. Marcia, R. M. Willett, and Z. T. Harmany. Compressive optical imaging: Architectures and algorithms. In G. Cristobal, P. Schelken, and H. Thienpont, editors, Optical and Digital Image Processing: Fundamentals and Applications, pages 485–505. Wiley New York, 2011.

  55. K. Marwah, G. Wetzstein, Y. Bando, and R. Raskar. Compressive light field photography using overcomplete dictionaries and optimized projections. ACM Trans. Graph., 32(46), 2013.

  56. C. Poon. On the role of total variation in compressed sensing. SIAM J. Imaging Sci., 8(1):682–720, 2015.

  57. C. Poon. Structure dependent sampling in compressed sensing: theoretical guarantees for tight frames. Appl. Comput. Harm. Anal., 42(3):402–451, 2017.

  58. G. Puy, J. P. Marques, R. Gruetter, J. Thiran, D. Van De Ville, P. Vandergheynst, and Y. Wiaux. Spread spectrum Magnetic Resonance Imaging. IEEE Trans. Med. Imaging, 31(3):586–598, 2012.

  59. X. Qu, Y. Chen, X. Zhuang, Z. Yan, D. Guo, and Z. Chen. Spread spectrum compressed sensing MRI using chirp radio frequency pulses. arXiv:1301.5451, 2013.

  60. B. Roman, A. Bastounis, B. Adcock, and A. C. Hansen. On fundamentals of models and sampling in compressed sensing. Preprint, 2015.

  61. B. Roman, A. C. Hansen, and B. Adcock. On asymptotic structure in compressed sensing. arXiv:1406.4178, 2014.

  62. J. Romberg. Imaging via compressive sampling. IEEE Signal Process. Mag., 25(2):14–20, 2008.

  63. V. Studer, J. Bobin, M. Chahid, H. Moussavi, E. Candès, and M. Dahan. Compressive fluorescence microscopy for biological and hyperspectral imaging. Proc. Natl Acad. Sci. USA, 109(26):1679—1687, 2011.

  64. Y. Traonmilin and R. Gribonval. Stable recovery of low-dimensional cones in Hilbert spaces: One RIP to rule them all. Appl. Comput. Harm. Anal., 45(1):170–205, 2018.

  65. Y. Tsaig and D. L. Donoho. Extensions of compressed sensing. Signal Process., 86(3):549–571, 2006.

  66. E. van den Berg and M. P. Friedlander. SPGL1: A solver for large-scale sparse reconstruction, June 2007. http://www.cs.ubc.ca/labs/scl/spgl1.

  67. E. van den Berg and M. P. Friedlander. Probing the pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput., 31(2):890–912, 2008.

  68. Z. Wang and G. R. Arce. Variable density compressed image sampling. IEEE Trans. Image Proc., 19(1):264–270, 2010.

  69. Y. Wiaux, L. Jacques, G. Puy, A. M. M. Scaife, and P. Vandergheynst. Compressed sensing imaging techniques for radio interferometry. Mon. Not. R. Astron. Soc., 395(3):1733–1742, 2009.

  70. P. Wojtaszczyk. Stability and instance optimality for Gaussian measurements in compressed sensing. Found. Comput. Math., 10(1):1–13, 2010.

  71. L. Zhu, W. Zhang, D. Elnatan, and B. Huang. Faster STORM using compressed sensing. Nature Methods, 9:721—723, 2012.

Acknowledgements

The authors extend their thanks to Vegard Antun (University of Oslo), who performed the experiment in Fig. 1. They would also like to thank Anders C. Hansen, Bradley J. Lucier and Clarice Poon. S.B. acknowledges the support of the PIMS Postdoctoral Training Centre in Stochastics, the Department of Mathematics of Simon Fraser University, NSERC, the Faculty of Arts and Science of Concordia University, and the CRM Applied Math Lab. This work was supported by the PIMS CRG in “High-dimensional Data Analysis” and by NSERC through Grant R611675.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Simone Brugiapaglia.

Additional information

Communicated by Albert Cohen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Fourier Transform and Series

Given \(f \in L^1({\mathbb {R}}) \cap L^2({\mathbb {R}})\), we define the Fourier transform as

$$\begin{aligned} {\hat{f}}(\omega ) = \int ^{\infty }_{-\infty }f(t) {\mathrm {e}}^{-{\mathrm {i}}\omega t} \,{\mathrm {d}}t. \end{aligned}$$

If \(f \in L^2([0,1])\), then we can write f as its Fourier series

$$\begin{aligned} f = \sum _{n \in {\mathbb {Z}}} \langle f, \gamma _n \rangle _{L^2} \gamma _n, \end{aligned}$$

where

$$\begin{aligned} \gamma _n(t) = {\mathrm {e}}^{2 \pi {\mathrm {i}}n t},\quad n \in {\mathbb {Z}}, \end{aligned}$$
(A.1)

is the Fourier basis for \(L^2([0,1])\). If we consider f as a function in \(L^2({\mathbb {R}})\) that is zero outside [0, 1], then \(\langle f, \gamma _n \rangle _{L^2} = {\hat{f}}(2 \pi n)\). For convenience, we also re-index this basis over \({\mathbb {N}}\) as follows:

$$\begin{aligned} \gamma _{2n-1} = {\mathrm {e}}^{-2 \pi {\mathrm {i}}(n-1) t},\qquad \gamma _{2n} = {\mathrm {e}}^{2 \pi {\mathrm {i}}n t},\quad n \in {\mathbb {N}}. \end{aligned}$$
(A.2)
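For instance, under this re-indexing the first few elements are ordered by increasing absolute frequency,

$$\begin{aligned} \gamma _1 = 1, \quad \gamma _2 = {\mathrm {e}}^{2 \pi {\mathrm {i}}t}, \quad \gamma _3 = {\mathrm {e}}^{-2 \pi {\mathrm {i}}t}, \quad \gamma _4 = {\mathrm {e}}^{4 \pi {\mathrm {i}}t}, \quad \gamma _5 = {\mathrm {e}}^{-4 \pi {\mathrm {i}}t}, \quad \ldots , \end{aligned}$$

so that \(\langle f, \gamma _{2n} \rangle _{L^2} = {\hat{f}}(2 \pi n)\) and \(\langle f, \gamma _{2n-1} \rangle _{L^2} = {\hat{f}}(-2 \pi (n-1))\) for \(n \in {\mathbb {N}}\).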

Orthogonal Wavelet Bases of \(L^2([0,1])\)

Let \(\varphi \) and \(\psi \) be the scaling function and mother wavelet, respectively, of the Daubechies’ wavelet with \(p \ge 1\) vanishing moments. Write

$$\begin{aligned} \varphi _{j,k}(x) = 2^{j/2} \varphi (2^j x - k),\ \psi _{j,k}(x) = 2^{j/2} \psi (2^j x - k),\quad j , k \in {\mathbb {Z}}. \end{aligned}$$

Since we work with functions on the interval [0, 1], we need an orthonormal wavelet basis of \(L^2([0,1])\). We construct this via periodization (see (5.1) and [53, Sect. 7.5.1] for more details). Define the coarsest scale

$$\begin{aligned} j_0 = \left\{ \begin{array}{ll} 0 &{}\quad p =1 \\ \lceil \log _2(2p) \rceil &{}\quad p \ge 2 \end{array} \right. . \end{aligned}$$
(B.1)

(In general, one could allow any fixed \(j_0\) greater than or equal to the right-hand side. However, this does not affect any of the results in the paper; hence, we simply specify \(j_0\) exactly.) We recall that Daubechies’ wavelets with p vanishing moments have the smallest possible support, of length \(2p-1\). We assume the scaling function \(\varphi \) and the mother wavelet \(\psi \) to be supported on \([0, 2p-1]\) and \([-p+1,p]\), respectively. Then, the set of functions

$$\begin{aligned} \{ \varphi ^{{\mathrm {per}}}_{j_0,k} : k = 0,\ldots ,2^{j_0}-1 \} \cup \{ \psi ^{{\mathrm {per}}}_{j,k} : k = 0,\ldots ,2^{j}-1, \ j \ge j_0 \}, \end{aligned}$$
(B.2)

is an orthonormal basis of \(L^2([0,1])\), referred to as the periodized Daubechies wavelet basis. We note in passing that

$$\begin{aligned} \psi ^{{\mathrm {per}}}_{j,k} = \psi _{j,k},\quad \varphi ^{{\mathrm {per}}}_{j,k} = \varphi _{j,k},\quad k = p-1,\ldots ,2^j-p, \end{aligned}$$

that is, wavelets that are fully supported in [0, 1] are unchanged, and

$$\begin{aligned} \varphi ^{{\mathrm {per}}}_{j,k}&= \varphi _{j,k} + \varphi _{j,2^j+k},\quad \psi ^{{\mathrm {per}}}_{j,k} = \psi _{j,k} + \psi _{j,2^j+k},\qquad k = 0,\ldots ,p-2, \\ \varphi ^{{\mathrm {per}}}_{j,k}&= \varphi _{j,k} + \varphi _{j,2^j-p-k},\quad \psi ^{{\mathrm {per}}}_{j,k} = \psi _{j,k} + \psi _{j,2^j-p-k},\qquad k = 2^j-p+1,\ldots ,2^j-1, \end{aligned}$$

where the functions in the right-hand sides are implicitly restricted to [0, 1]. As needed, we order the basis (B.2) in the usual way, rewriting it as \(\{ \phi _n \}_{n \in {\mathbb {N}}}\), where

$$\begin{aligned} \begin{aligned} \phi _{n+1}&= \varphi ^{{\mathrm {per}}}_{j_0,n}, \quad n = 0,\ldots ,2^{j_0}-1 \\ \phi _{2^{j} + n + 1}&= \psi ^{{\mathrm {per}}}_{j,n}, \quad n = 0,\ldots ,2^j-1,\ j \ge j_0. \end{aligned} \end{aligned}$$
(B.3)
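For example, for Haar wavelets (\(p = 1\), so \(j_0 = 0\) by (B.1)) this ordering reads

$$\begin{aligned} \phi _1 = \varphi ^{{\mathrm {per}}}_{0,0}, \quad \phi _2 = \psi ^{{\mathrm {per}}}_{0,0}, \quad \phi _3 = \psi ^{{\mathrm {per}}}_{1,0}, \quad \phi _4 = \psi ^{{\mathrm {per}}}_{1,1}, \quad \phi _5 = \psi ^{{\mathrm {per}}}_{2,0}, \quad \ldots , \end{aligned}$$

so that, for every \(j \ge j_0\), the first \(2^j\) basis functions span \(V^{{\mathrm {per}}}_j\). For \(p = 2\) (the db4 wavelet used in Appendix E), (B.1) gives \(j_0 = \lceil \log _2 4 \rceil = 2\) and the ordering starts with the four scaling functions \(\varphi ^{{\mathrm {per}}}_{2,0},\ldots ,\varphi ^{{\mathrm {per}}}_{2,3}\).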

Proof of Theorem 9.1

The technical tools we need to prove this theorem were introduced in [1], where a similar result was proven for the weighted quadratically constrained basis pursuit decoder.

We require several concepts from [1]. First, we introduce several additional pieces of notation. Given sparsity levels \({\mathbf {M}} = (M_1,\ldots ,M_r)\) and local sparsities \({\mathbf {s}} = (s_1,\ldots ,s_r)\), let

$$\begin{aligned} D_{{\mathbf {s}},{\mathbf {M}}} = \left\{ {\varDelta }\subseteq \{1,\ldots ,M\} : | {\varDelta }\cap \{ M_{k-1}+1,\ldots ,M_k \} | \le s_k,\ k = 1,\ldots ,r \right\} , \end{aligned}$$

be the set of all possible supports of an \(({\mathbf {s}},{\mathbf {M}})\)-sparse vector. Given positive weights \(w = (w_i)^{M}_{i=1} \in {\mathbb {C}}^{M}\) and a set \({\varDelta }\subseteq \{1,\ldots ,M\}\), we define its weighted cardinality as follows:

$$\begin{aligned} |{\varDelta }|_w = \sum _{i \in {\varDelta }} (w_i)^2. \end{aligned}$$

The conventional tool in compressed sensing for establishing recovery guarantees is the so-called Restricted Isometry Property (RIP). In our case, we require a generalized version of the RIP, which takes into account the sparsity in levels structure and the fact that the measurement matrix A satisfies (9.3), rather than the more standard condition \({\mathbb {E}}(A^*A) = I\).

Definition 9

(G-adjusted RIP in Levels) Let \({\mathbf {M}} = (M_1,\ldots ,M_r)\) be sparsity levels, \({\mathbf {s}} = (s_1,\ldots ,s_r)\) be local sparsities and \(G \in {\mathbb {C}}^{M \times M}\) be invertible, where \(M = M_r\) is the sparsity bandwidth. The \(({\mathbf {s}},{\mathbf {M}})\)th G-adjusted Restricted Isometry Constant in Levels (G-RICL) \(\delta _{{\mathbf {s}},{\mathbf {M}},G}\) of a matrix \(A \in {\mathbb {C}}^{m \times M}\) is the smallest \(\delta \ge 0\) such that

$$\begin{aligned} (1-\delta ) {\left\| G x\right\| }^2_{\ell ^2} \le {\left\| A x\right\| }^2_{\ell ^2} \le (1+\delta ) {\left\| G x\right\| }^2_{\ell ^2},\quad \forall x \in {\varSigma }_{{\mathbf {s}},{\mathbf {M}}}. \end{aligned}$$

If \(0< \delta _{{\mathbf {s}},{\mathbf {M}},G} < 1\), then the matrix is said to have the G-adjusted Restricted Isometry Property in levels (G-RIPL) of order \(({\mathbf {s}},{\mathbf {M}})\).

In our setting, if N, M are such that \(P_N UP_M\) is full rank (in particular, if the balancing property holds), then G will be taken as the unique positive definite square-root of the positive definite matrix \(P_M U^* P_N U P_M\). We write \(G = \sqrt{P_M U^* P_N U P_M}\) in this case.
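In computations, G and the quantities \({\left\| G^{-1}\right\| }_{\ell ^2}\) and \(\kappa (G)\) used below can be formed directly from the finite section of U. The following is a minimal MATLAB sketch; the matrix Usec, standing for \(P_N U P_M\), is assumed to be available (for example, assembled as described in Footnote 4) and is not defined in the text.

```matlab
% Minimal sketch (assumes Usec = P_N U P_M is available as an N-by-M matrix).
Gsq     = Usec' * Usec;                   % P_M U^* P_N U P_M
G       = sqrtm((Gsq + Gsq') / 2);        % unique positive definite square root (symmetrized)
sv      = svd(Usec);                      % singular values of the finite section
nrmGinv = 1 / min(sv);                    % ||G^{-1}||; finite iff P_N U P_M has full rank, cf. (C.3)
kappaG  = max(sv) / min(sv);              % condition number kappa(G)
```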

The following result [1, Thm. 3.6] gives conditions under which the matrix A satisfies the G-RIPL:

Theorem C.1

Let \(0< \delta ,\varepsilon <1\), \(M \ge 2\), \(1 \le {\tilde{r}} \le r \le N\) and \({\mathbf {M}} = (M_1,\ldots ,M_r)\) and \({\mathbf {s}} = (s_1,\ldots ,s_r)\) be sparsity levels and local sparsities, respectively, where \(s = s_1+\cdots +s_r \ge 2\) and \(M_r = M\). Let \({\varOmega }\) be an \(({\mathbf {N}},{\mathbf {m}})\)-multilevel random subsampling pattern with r levels and saturation \({\tilde{r}}\), and \(N = N_r\). Suppose that N, M are such that \(P_N U P_M\) is full rank, where U is as in (8.1) and consider the matrix A given by (8.4). If

$$\begin{aligned} m_k \gtrsim \delta ^{-2} \cdot {\left\| G^{-1}\right\| }^2_{\ell ^2} \cdot \left( \sum ^{r}_{l=1} s_l \mu \left( U^{(k,l)} \right) \right) \cdot L,\qquad k = {\tilde{r}}+1,\ldots ,r, \end{aligned}$$

where

$$\begin{aligned} L = r \cdot \log (m)\cdot \log ^2(s) \cdot \log (N) + \log (\varepsilon ^{-1}), \end{aligned}$$

then, with probability at least \(1-\varepsilon \), A satisfies the G-RIPL of order \(({\mathbf {s}},{\mathbf {M}})\) with constant \(\delta _{{\mathbf {s}},{\mathbf {M}},G} \le \delta \) and G given by \(G = \sqrt{P_M U^* P_N U P_M}\).

In order to establish Theorem 9.1, we next show that the G-RIPL implies stable and robust recovery. To do so, we first introduce the following generalization of the so-called robust Null Space Property (rNSP):

Definition 10

Let \({\mathbf {M}} = (M_1,\ldots ,M_r)\) be sparsity levels, \({\mathbf {s}} = (s_1,\ldots ,s_r)\) be local sparsities and \(w \in {\mathbb {C}}^{M}\) be positive weights, where \(M = M_r\). A matrix \(A \in {\mathbb {C}}^{m \times M}\) has the weighted robust null space property in levels (weighted rNSPL) of order \(({\mathbf {s}},{\mathbf {M}})\) with constants \(0< \rho < 1\) and \(\gamma > 0\) if

$$\begin{aligned} {\left\| P_{{\varDelta }} x\right\| }_{\ell ^2} \le \frac{\rho {\left\| P^{\perp }_{{\varDelta }} x\right\| }_{\ell ^1_w}}{\sqrt{|{\varDelta }|_w}} + \gamma {\left\| A x\right\| }_{\ell ^2}, \end{aligned}$$

for all \(x \in {\mathbb {C}}^M\) and \({\varDelta }\in D_{{\mathbf {s}},{\mathbf {M}}}\).

Suppose the weights \(w = (w_i)^{M}_{i=1}\) are of the form (9.8), i.e. constant on the sparsity levels, and define

$$\begin{aligned} \xi = \xi ({\mathbf {s}},{\mathbf {w}}) = \sum ^{r}_{k=1} (w^{(k)})^2 s_k,\qquad \zeta = \zeta ({\mathbf {s}},{\mathbf {w}}) = \min _{k=1,\ldots ,r} \left\{ (w^{(k)})^2 s_k \right\} . \end{aligned}$$
(C.1)
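For instance, if the weights are constant across all levels, \(w^{(k)} = 1\) for every k, then (C.1) reduces to

$$\begin{aligned} \xi = s_1 + \cdots + s_r = s, \qquad \zeta = \min _{k=1,\ldots ,r} s_k , \end{aligned}$$

so the ratio \(\xi /\zeta \) appearing in Lemma 6 below measures how unevenly the sparsity is spread across the levels.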

The following combines Lemmas 5.2 and 5.3 of [1]:

Lemma 6

Suppose that A has the weighted rNSPL of order \(({\mathbf {s}},{\mathbf {M}})\) with constants \(0< \rho < 1\) and \(\gamma > 0\). Let \(x,z \in {\mathbb {C}}^M\). Then,

$$\begin{aligned} {\left\| z - x\right\| }_{\ell ^1_w} \le \frac{1+\rho }{1-\rho } \left( 2 \sigma _{{\mathbf {s}},{\mathbf {M}}}(x)_{\ell ^1_w} + {\left\| z\right\| }_{\ell ^1_w} - {\left\| x\right\| }_{\ell ^1_w} \right) + \frac{2 \gamma }{1-\rho } \sqrt{\xi } {\left\| A (z-x)\right\| }_{\ell ^2}, \end{aligned}$$

and

$$\begin{aligned} {\left\| z - x\right\| }_{\ell ^2}\le & {} \left( \rho + (1+\rho ) (\xi / \zeta )^{1/4} / 2 \right) \frac{{\left\| z-x\right\| }_{\ell ^1_w}}{\sqrt{\xi }} \\&+ \left( 1 + (\xi / \zeta )^{1/4} / 2 \right) \gamma {\left\| A(z-x)\right\| }_{\ell ^2}. \end{aligned}$$

The G-RIPL implies the weighted rNSPL (see [1, Thm. 5.5]):

Theorem C.2

Let \(A \in {\mathbb {C}}^{m \times M}\) and \(G \in {\mathbb {C}}^{M \times M}\) be invertible. Let \({\mathbf {M}} = (M_1,\ldots ,M_r)\) and \({\mathbf {s}} = (s_1,\ldots ,s_r)\) be sparsity levels and local sparsities, respectively, and \({\mathbf {w}}\) be positive weights of the form (9.8). Let \(0< \rho < 1\), and suppose that A has the G-RIPL of order \(({\mathbf {t}},{\mathbf {M}})\) and constant 1/2, where \({\mathbf {t}} = (t_1,\ldots ,t_r)\) satisfies

$$\begin{aligned} t_l = \min \left\{ 2 \left\lceil 3 \frac{\kappa (G)^2}{\rho ^2} \frac{\xi ({\mathbf {s}},{\mathbf {w}})}{(w^{(l)})^2} \right\rceil , M_l - M_{l-1} \right\} ,\quad l =1,\ldots ,r, \end{aligned}$$
(C.2)

and \(\kappa (G)=\Vert G\Vert _{\ell ^2}\Vert G^{-1}\Vert _{\ell ^2}\) is the condition number of G with respect to the \(\ell ^2\)-norm. Then, there exists \(0 < \gamma \le \sqrt{2} {\left\| G^{-1}\right\| }_{\ell ^2}\) such that A has the weighted rNSPL of order \(({\mathbf {s}},{\mathbf {M}})\) with constants \(\rho \) and \(\gamma \).

Finally, we are now ready to prove Theorem 9.1:

Proof of Theorem 9.1

Recall that \(G^2 = P_M U^* P_N U P_M\). Hence, G is invertible since U has the balancing property (9.4), and moreover, we have

$$\begin{aligned} {\left\| G^{-1}\right\| }_{\ell ^2} \le 1/ \sqrt{\theta }. \end{aligned}$$
(C.3)

We also have \({\left\| G\right\| }_{\ell ^2} \le 1\) since U is unitary, and therefore, \(\kappa (G) \le 1/\sqrt{\theta }\).

Let \(t_l\) be given by (C.2) with \(\rho = 1/2\). Recalling (9.9) and (C.1), observe that

$$\begin{aligned} t_l \le 48 \frac{c_2^2 r s_l}{c_1^2 \theta }. \end{aligned}$$

Therefore,

$$\begin{aligned} t = t_1+\ldots +t_r \le 48 \frac{c_2^2 r }{c_1^2 \theta } s, \end{aligned}$$

and

$$\begin{aligned}&{\Vert G^{-1}\Vert }^2_{\ell ^2} \cdot \left( \sum ^{r}_{l = 1} t_l \mu \left( U^{(k,l)} \right) \right) \cdot \left( r \cdot \log (m) \cdot \log ^2(t) \cdot \log (M) + \log (\varepsilon ^{-1}) \right) \\&\quad \lesssim \theta ^{-2} \frac{c_2^2 r}{c_1^2} \cdot \left( \sum ^{r}_{l = 1} s_l \mu \left( U^{(k,l)} \right) \right) \\&\qquad \cdot \left( r \cdot \log (m) \cdot \log ^2(c_2^2 r s / (c_1^2 \theta )) \cdot \log (M) + \log (\varepsilon ^{-1}) \right) . \end{aligned}$$

Hence, condition (C) and Theorem C.1 imply that the matrix A has the G-RIPL of order \(({\mathbf {t}},{\mathbf {M}})\) with constant \(\delta _{{\mathbf {t}},{\mathbf {M}},G} \le 1/2\). It now follows from Theorem C.2 that A has the weighted rNSPL of order \(({\mathbf {s}},{\mathbf {M}})\) with constants \(\rho = 1/2\) and \(\gamma \le \sqrt{2} {\Vert G^{-1}\Vert }_{\ell ^2} \le \sqrt{2/\theta }\).

To complete the proof, we use Lemma 6 with \(z = {\hat{x}}\). Using this, (C.3) and the bounds

$$\begin{aligned} c_1^2 r s \le \xi \le c_2^2 r s,\qquad c_1^2 s \le \zeta \le c_2^2 s, \end{aligned}$$
(C.4)

we see that

$$\begin{aligned} {\left\| {\hat{x}} - x\right\| }_{\ell ^2} \le&\left( 1/2 + 3/4 (c_2^2 r / c_1^2)^{1/4} \right) \frac{{\left\| {\hat{x}} - x\right\| }_{\ell ^1_w}}{c_1\sqrt{r s}} \\&+ (1 + (c_2^2 r/c_1^2)^{1/4} /2) \sqrt{2/\theta } {\left\| A ({\hat{x}} - x)\right\| }_{\ell ^2} \\ \le&\left( 1+ (c_2^2 r / c_1^2)^{1/4} \right) \left[ \frac{{\left\| {\hat{x}} - x\right\| }_{\ell ^1_w}}{c_1\sqrt{r s}} + \sqrt{2/\theta } {\left\| A ({\hat{x}} - x)\right\| }_{\ell ^2} \right] \\ \le&\left( 1+ (c_2^2 r / c_1^2)^{1/4} \right) \left[ \frac{3}{c_1\sqrt{r s}} \left( 2 \sigma _{{\mathbf {s}},{\mathbf {M}}}(x)_{\ell ^1_w} + {\left\| {\hat{x}}\right\| }_{\ell ^1_w} - {\left\| x\right\| }_{\ell ^1_w} \right) \right. \\&\left. + 5 \sqrt{2/\theta } (c_2/c_1) {\left\| A ({\hat{x}} - x) \right\| }_{\ell ^2} \right] . \end{aligned}$$

We now use the fact that \({\hat{x}}\) is a minimizer, and therefore,

$$\begin{aligned} {\left\| {\hat{x}}\right\| }_{\ell ^1_w} - {\left\| x\right\| }_{\ell ^1_w} \le \frac{1}{\lambda } \left( {\left\| A x - y\right\| }_{\ell ^2} - {\left\| A {\hat{x}} - y\right\| }_{\ell ^2} \right) . \end{aligned}$$

Writing \({\left\| A ({\hat{x}} - x) \right\| }_{\ell ^2} \le {\left\| A {\hat{x}} - y\right\| }_{\ell ^2} + {\left\| A x - y\right\| }_{\ell ^2}\) and combining with the previous inequality now yields

$$\begin{aligned} {\left\| {\hat{x}} - x\right\| }_{\ell ^2} \le&\left( 1+ (c_2^2 r / c_1^2)^{1/4} \right) \\&\left[ \frac{6 \sigma _{{\mathbf {s}},{\mathbf {M}}}(x)_{\ell ^1_w} }{c_1 \sqrt{ r s}} + \left( 5 \sqrt{2/\theta } (c_2/c_1) +\frac{3}{c_1\sqrt{r s} \lambda } \right) {\left\| A x - y\right\| }_{\ell ^2} \right. \\&\quad \left. + \left( 5 \sqrt{2/\theta } (c_2/c_1) - \frac{3}{c_1\sqrt{r s} \lambda } \right) {\left\| A {\hat{x}} -y \right\| }_{\ell ^2} \right] \end{aligned}$$

The result now follows from the bound (D) on \(\lambda \) and the fact that \(e = y - A x\). \(\square \)

Proofs of Lemmas 2, 3 and 4

Proof of Lemma 2

We first observe that \(\theta = \inf _{|\omega | \le \pi } | {\hat{\varphi }}(\omega ) |^2 > 0\) for the Daubechies wavelet basis [7, Remark 7.1]. Now let \(x = (x_n)^{N}_{n=1} \in {\mathbb {C}}^N\) with \({\left\| x\right\| }_{\ell ^2} = 1\) and write \(g = \sum ^{N}_{n=1} x_n \phi _n\) for the corresponding finite wavelet expansion. Observe that \( {\left\| g\right\| }^2_{L^2([0,1])} = {\left\| x\right\| }^2_{\ell ^2} = 1\). Let \(V^{{\mathrm {per}}}_{j} = {\mathrm {span}}\{ \varphi _{j,n} : n = 0,\ldots ,2^j-1 \}\) and \(W^{{\mathrm {per}}}_{j} = {\mathrm {span}}\{ \psi _{j,n} : n = 0,\ldots ,2^j-1 \}\). Then,

$$\begin{aligned} g \in V^{{\mathrm {per}}}_{j_0} \oplus W^{{\mathrm {per}}}_{j_0} \oplus \cdots \oplus W^{{\mathrm {per}}}_{j_0+r-1} = V^{{\mathrm {per}}}_{j_0+r}, \end{aligned}$$

and conversely every \(g \in V^{{\mathrm {per}}}_{j_0+r}\) with \( {\left\| g\right\| }^2_{L^2([0,1])} = 1\) is equivalent to a vector of coefficients \(x \in {\mathbb {C}}^N\) with \({\left\| x\right\| }_{\ell ^2} = 1\). Note also that

$$\begin{aligned} {\left\| P_N U P_N x\right\| }^2_{2} = \sum ^{N}_{n=1} | \langle g, \gamma _n \rangle |^2. \end{aligned}$$

Hence,

$$\begin{aligned} \begin{aligned} \inf _{\begin{array}{c} x \in {\mathbb {C}}^N \\ {\left\| x\right\| }_{\ell ^2} = 1 \end{array}}&{\left\| P_N U P_N x\right\| }^2_{\ell ^2} = \inf \left\{ \sum ^{N}_{n=1} | \langle g, \gamma _n \rangle |^2 : g \in V^{{\mathrm {per}}}_{j_0+r},\ {\left\| g\right\| }_{L^2([0,1])} = 1 \right\} . \end{aligned} \end{aligned}$$
(D.1)

Fix a \(g \in V^{{\mathrm {per}}}_{j_0+r}\) with \({\left\| g\right\| }_{L^2([0,1])} = 1\) and write

$$\begin{aligned} g = \sum ^{N-1}_{k = 0} z_k \varphi ^{{\mathrm {per}}}_{r+j_0,k}, \end{aligned}$$

where \( {\left\| z\right\| }_{\ell ^2} = {\left\| g\right\| }_{L^2(0,1)} = 1\) and \(z = (z_k)^{N-1}_{k=0}\). Then, for any integer n,

$$\begin{aligned} {\hat{g}}(2 \pi n)&= N^{-1/2} {\hat{\varphi }} (2 \pi n / N) \sum ^{N-1}_{k=0} z_k {\mathrm {e}}^{-2 \pi {\mathrm {i}}n k /N} = N^{-1/2} {\hat{\varphi }} (2 \pi n / N) G( n/N), \end{aligned}$$

where \(G(x) = \sum ^{N-1}_{k=0} z_k {\mathrm {e}}^{-2 \pi {\mathrm {i}}k x}\) is a 1-periodic function. In the first equality, we have used that

$$\begin{aligned} \widehat{\varphi ^{{\mathrm {per}}}_{j,k}}(\omega ) = \widehat{\varphi _{j,k}}(\omega ) = 2^{-j/2}{\hat{\varphi }}(\omega /2^j) {\mathrm {e}}^{-{\mathrm {i}}\omega k/2^j}, \quad \forall j, k \in {\mathbb {Z}}, \; \forall \omega \in 2\pi {\mathbb {Z}}, \end{aligned}$$
(D.2)

and that \(N = 2^{j_0+r}\). Hence,

$$\begin{aligned} \sum ^{N}_{n=1} | \langle g, \gamma _n \rangle |^2= & {} \sum ^{N/2}_{n=-N/2+1} | {\hat{g}}(2 \pi n) |^2 \nonumber \\= & {} N^{-1} \sum ^{N/2}_{n=-N/2+1} \left| {\hat{\varphi }}(2 \pi n/N) \right| ^2 \left| G(n / N) \right| ^2. \end{aligned}$$
(D.3)

Using the fact that G is 1-periodic, we deduce that

$$\begin{aligned} \sum ^{N}_{n=1} | \langle g, \gamma _n \rangle |^2 \ge \inf _{|\omega | \le \pi } \left| {\hat{\varphi }}(\omega ) \right| ^2 N^{-1} \sum ^{N-1}_{n=0} |G(n/N)|^2. \end{aligned}$$

Now, since G is a trigonometric polynomial, it follows that

$$\begin{aligned} N^{-1} \sum ^{N-1}_{n=0} |G(n/N)|^2 = {\left\| G\right\| }^2_{L^2([0,1])} = {\left\| z\right\| }^2_{\ell ^2} = {\left\| g\right\| }^2_{L^2([0,1])} = 1. \end{aligned}$$

Therefore,

$$\begin{aligned} \sum ^{N}_{n=1} | \langle g, \gamma _n \rangle |^2 \ge \inf _{|\omega | \le \pi } \left| {\hat{\varphi }}(\omega ) \right| ^2 = \theta > 0. \end{aligned}$$

Since g was arbitrary, we deduce that

$$\begin{aligned} \inf _{\begin{array}{c} x \in {\mathbb {C}}^N \\ {\left\| x\right\| }_{\ell ^2} = 1 \end{array}} {\left\| P_N U P_N x\right\| }^2_{\ell ^2} \ge \theta . \end{aligned}$$

To complete the proof, we first recall that \(P_N - P_N U^* P_N U P_N\) is positive semidefinite (since U is unitary), and therefore,

$$\begin{aligned} {\left\| P_N - P_N U^* P_N U P_N\right\| }_{\ell ^2}&= \sup _{\begin{array}{c} x \in {\mathbb {C}}^N \\ {\left\| x\right\| }_{\ell ^2} = 1 \end{array}} \langle (P_N - P_N U^* P_N U P_N )x, x \rangle \\&= 1 - \inf _{\begin{array}{c} x \in {\mathbb {C}}^N \\ {\left\| x\right\| }_{\ell ^2} = 1 \end{array}} {\left\| P_N U P_N x\right\| }^2_{\ell ^2} \\&\le 1 - \theta , \end{aligned}$$

as required. \(\square \)
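As a numerical sanity check (not part of the proof), the constant \(\theta = \inf _{|\omega | \le \pi } | {\hat{\varphi }}(\omega ) |^2\) can be estimated by approximating \({\hat{\varphi }}\) by quadrature from cascade-algorithm samples of the scaling function. The following MATLAB sketch does this; the grid sizes and quadrature rule are illustrative choices, not taken from the paper. For Haar wavelets (\(p = 1\)) the output should be close to \((2/\pi )^2 \approx 0.405\).

```matlab
% Hedged sketch: estimate theta = inf_{|omega| <= pi} |phihat(omega)|^2.
wname = 'haar';                              % MATLAB's 'db2' has p = 2 vanishing moments (the text's db4)
[phi, ~, x] = wavefun(wname, 12);            % cascade-algorithm samples of the scaling function
omega = linspace(-pi, pi, 2001);             % frequency grid on [-pi, pi]
phihat = zeros(size(omega));
for j = 1:numel(omega)
    phihat(j) = trapz(x, phi .* exp(-1i * omega(j) * x));   % hat(phi)(omega) by quadrature
end
theta_est = min(abs(phihat).^2);             % approximates the infimum over |omega| <= pi
fprintf('estimated theta for %s: %.4f\n', wname, theta_est);   % ~ (2/pi)^2 = 0.405 for Haar
```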

For Lemma 3, we first require the following:

Lemma 7

The (k, l)th local coherence satisfies

$$\begin{aligned} \mu \left( U^{(k,l)} \right) \le 2^{1+ k-l} \max _{\omega \in B_k } \left| {\widehat{\psi }}(2\pi \omega /2^{l+j_0-1}) \right| ^2,\quad l > 1, \end{aligned}$$

and

$$\begin{aligned} \mu \left( U^{(k,1)} \right) \le 2^{k} \max \left\{ \max _{\omega \in B_k} \left| {\widehat{\psi }}(2\pi \omega /2^{j_0}) \right| ^2 , \max _{\omega \in B_k} \left| {\widehat{\varphi }}(2\pi \omega /2^{j_0}) \right| ^2 \right\} . \end{aligned}$$

Proof

By definition,

$$\begin{aligned} \mu \left( U^{(k,l)} \right) =|B_k| \max _{\omega \in B_k} \max _{0 \le n < 2^{j_0+l-1}} \left| \widehat{\psi ^{{\mathrm {per}}}_{j_0+l-1,n}}(2\pi \omega ) \right| ^2,\quad l > 1, \end{aligned}$$

and

$$\begin{aligned} \mu \left( U^{(k,1)} \right) =|B_k| \max \left\{ \max _{\omega \in B_k} \max _{0 \le n< 2^{j_0}} \left| \widehat{\psi ^{{\mathrm {per}}}_{j_0,n}}(2\pi \omega ) \right| ^2 , \max _{\omega \in B_k} \max _{0 \le n < 2^{j_0}} \left| \widehat{\varphi ^{{\mathrm {per}}}_{j_0,n}}(2\pi \omega ) \right| ^2 \right\} . \end{aligned}$$

Recall that \(|B_k| \le 2^{j_0+k}\). Moreover, recall relation (D.2) and note that an analogous formula holds for \(\widehat{\psi _{j,k}^{{\mathrm {per}}}}\). Since \(B_k\) is a set of integers, the result now follows immediately. \(\square \)

Proof of Lemma 3

By the previous lemma, it suffices to estimate the Fourier transform of the wavelet and scaling function in different regions of frequency space. First, suppose that \(k \ge l \ge 1\). Then, \(| \omega | \ge 2^{j_0+k-1}\) for \(\omega \in B_k\), and the smoothness conditions (2.1) give

$$\begin{aligned} | {\hat{\psi }}(2\pi \omega /2^{l+j_0-1}) | \lesssim 2^{-(q+1)(k-l)},\qquad | {\hat{\varphi }}(2\pi \omega /2^{l+j_0-1}) | \lesssim 2^{-(q+1)(k-l)}. \end{aligned}$$

The first estimate now follows from Lemma 7.

For the second estimate, we need to bound \(| {\hat{\psi }}(2 \pi \omega ) |\) for \(| \omega | \ll 1\). For this, we recall that \({\hat{\psi }}(z) = (-{\mathrm {i}}z)^{p} \chi _p(z)\) for some bounded function \(\chi _p(z)\) [53, Thm. 7.4]. Hence,

$$\begin{aligned} | {\hat{\psi }}(2 \pi \omega ) |^2 \le c_p |\omega |^{2p}. \end{aligned}$$

If \(l > k \ge 1\), then this and the previous lemma give

$$\begin{aligned} \mu \left( U^{(k,l)} \right) \le 2^{1+k-l} \max _{|\omega | \le 2^{j_0+k}} | {\hat{\psi }}(2\pi \omega /2^{l+j_0-1}) |^2 \lesssim c_p 2^{k-l} 2^{2p(k-l)}. \end{aligned}$$

The result now follows immediately. \(\square \)
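The decay established in Lemma 3 can also be observed numerically; again, this is a sanity check rather than part of the argument. The sketch below assumes that the finite section U of (8.1) has been assembled (for instance, as in Footnote 4) and that Nvec and Mvec are vectors of length r+1 holding the cumulative band and level boundaries \(0 = N_0< N_1< \cdots < N_r\) and \(0 = M_0< M_1< \cdots < M_r\); these variable names, and the plotting choice, are ours.

```matlab
% Hedged sketch: empirical local coherences mu(U^{(k,l)}) = |B_k| * max |u_ij|^2.
mu = zeros(r, r);
for k = 1:r
    rows = Nvec(k)+1 : Nvec(k+1);            % frequency band B_k (row indices of U)
    for l = 1:r
        cols = Mvec(l)+1 : Mvec(l+1);        % wavelet level l (column indices of U)
        mu(k,l) = numel(rows) * max(max(abs(U(rows, cols)).^2));
    end
end
semilogy(0:r-1, mu(r, r:-1:1), 'o-');        % decay of mu(U^{(r,l)}) as r-l grows, cf. 2^{-(2p+1)(r-l)}
```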

Proof of Lemma 4

By direct calculation

$$\begin{aligned} {\left\| P_{{\varOmega }} D U P^{\perp }_M d\right\| }^2_{\ell ^2} \le \sum ^{r}_{k=1} \frac{N_k - N_{k-1}}{m_k} m_k \max _{N_{k-1} < i \le N_k} | \langle u_i, P^{\perp }_M d \rangle |^2, \end{aligned}$$

where \(u_i = U^* e_i\) is the ith row of U. Observe that

$$\begin{aligned} | \langle u_i, P^{\perp }_M d \rangle |^2 = \left| \sum _{j> M} u_{ij} d_j \right| ^2 \le \max _{j > M} |u_{ij} |^2 {\left\| P^{\perp }_M d\right\| }^2_{\ell ^1}. \end{aligned}$$

Hence,

$$\begin{aligned} {\left\| P_{{\varOmega }} D U P^{\perp }_M d\right\| }^2_{\ell ^2}&\le \sum ^{r}_{k=1} (N_k - N_{k-1}) \max _{\begin{array}{c} N_{k-1} < i \le N_k \\ j > M \end{array}} |u_{ij} |^2 {\left\| P^{\perp }_M d\right\| }^2_{\ell ^1} \\&= \sum ^{r}_{k=1} \mu \left( P^{N_{k-1}}_{N_k} U P^{\perp }_M \right) {\left\| P^{\perp }_M d\right\| }^2_{\ell ^1}, \end{aligned}$$

which gives

$$\begin{aligned} {\Vert P_{{\varOmega }} D U P^{\perp }_M d\Vert }_{\ell ^2} \le \left( \sum ^{r}_{k=1} \mu \left( P^{N_{k-1}}_{N_k} U P^{\perp }_M \right) \right) ^{1/2} {\Vert P^{\perp }_M d\Vert }_{\ell ^1}. \end{aligned}$$

Since \(M = M_r\), we now apply Lemma 3 to get

$$\begin{aligned} \mu \left( P^{N_{k-1}}_{N_k} U P^{\perp }_M \right) = \sup _{l > r} \mu \left( U^{(k,l)} \right) \le c_p 2^{-(2p+1)(r-k)}. \end{aligned}$$

Hence,

$$\begin{aligned} \sum ^{r}_{k=1} \mu \left( P^{N_{k-1}}_{N_k} U P^{\perp }_M \right) \le c_p \sum ^{r}_{k=1} 2^{-(2p+1)(r-k)} \le c_p . \end{aligned}$$

The result now follows. \(\square \)

Numerical Experiments

In this section, we discuss some technical details behind Fig. 2. Moreover, we provide further numerical evidence to support the comparison shown therein. We consider the function

$$\begin{aligned} f_K(x) = \sum _{i = 1}^{K} (-1)^{\text {mod}(i, 5)} \; x^{\text {mod}(i,3)} \; \text {sign}(x-(1.3)^{i-9}), \quad 0 \le x \le 1 . \end{aligned}$$
(E.1)

This function has K discontinuities in (0, 1) and its plot is shown in Fig. 4. We approximate \(f_K\) for \(K = 1,10,20\) using the encoder–decoder pairs described below.
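For concreteness, a literal MATLAB transcription of (E.1) is given below; it simply evaluates the formula as written (the grid size and the choice K = 10 are arbitrary).

```matlab
% Hedged sketch: evaluate f_K of (E.1) on a grid (a literal transcription of the formula).
K = 10;
x = linspace(0, 1, 2^14).';
fK = zeros(size(x));
for i = 1:K
    fK = fK + (-1)^mod(i,5) .* x.^mod(i,3) .* sign(x - 1.3^(i-9));
end
plot(x, fK);   % reproduces the kind of plot shown in Fig. 4
```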

Fig. 4: The function \(f_K\) defined as in (E.1) for \(K=1,10,20\)

Fig. 5: Comparison of different encoder–decoder pairs for the approximation of the function \(f_K\) defined in (E.1), using Haar (left) and db4 wavelets (right) and for \(K = 1\) (top), \(K=10\) (centre) and \(K = 20\) (bottom)

(Fourier, \(\ell ^1\)): This strategy corresponds to the setting of Theorems 3.3 and 4.6 and to the error bound (1.5), up to a few minor technical modifications. The Fourier sampling strategy is as follows. We divide the frequency space into dyadic bands and consider a sampling scheme analogous to the \(({\mathbf {N}},{\mathbf {m}})\)-multilevel random subsampling strategy with saturation \({\tilde{r}}\) described in Definition 5, where symmetry of the samples is enforced in every frequency band. In particular, \({\mathbf {N}}\) is defined as in (8.5), the saturation level is \({\tilde{r}} = \text {round}(\log _2(m/2))\), and the local numbers of measurements are

$$ m_k = 2\left\lfloor \frac{m}{4(r-{\tilde{r}})}\right\rfloor , \quad k = {\tilde{r}} +1, \ldots , r-1, $$

where, in the last frequency band, we let \(m_r = m-(m_1 +\cdots + m_{r-1})\) in order to reach a total budget of m measurements exactly. The samples are then computed as follows. The first \({\tilde{r}}\) dyadic bands are saturated. For every \(k > {\tilde{r}}\), we pick \(m_k/2\) samples uniformly at random from the k-th frequency semiband (corresponding to positive frequencies) and we choose frequencies in the opposite semiband (corresponding to negative frequencies) in a symmetric way. The wavelet coefficients of f are recovered via basis pursuit (1.2). Numerically, (1.2) is solved using the MATLAB package SPGL1 (see [66, 67]) with parameters \(\texttt {bpTol}\) = 1e-6, \(\texttt {optTol}\)= 1e-6, and a maximum of 10000 iterations (see Footnote 4).
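The following MATLAB sketch illustrates the two main ingredients of this strategy: the assembly of the cross-Gramian U described in Footnote 4, and a simplified dyadic-band sampling pattern followed by the SPGL1 call. The normalizations, the frequency ordering, the decomposition depth and the band bookkeeping are our own choices; this is a rough outline, not the exact code used to produce Fig. 5 (in particular, symmetry of the samples is not enforced here).

```matlab
% Hedged sketch of the (Fourier, l1) strategy (Wavelet Toolbox and SPGL1 assumed).
dwtmode('per');                              % periodized wavelets, as in Appendix B
N = 2^10; m = 2^7; wname = 'db2';            % illustrative sizes; MATLAB 'db2' has p = 2 (the text's db4)
Naug = 16*N;                                 % augmented space R^{16N} (Footnote 4)
nlev = log2(Naug) - 2;                       % decomposition depth (our choice)
[~, L] = wavedec(zeros(Naug,1), nlev, wname);
freq = zeros(N,1);                           % frequencies ordered as in (A.2): 0, 1, -1, 2, -2, ...
freq(2:2:N) = 1:N/2; freq(1:2:N-1) = -(0:N/2-1);
U = zeros(N, N);
for n = 1:N                                  % column n: Fourier data of the n-th wavelet
    c = zeros(Naug, 1); c(n) = 1;            % canonical basis vector (Footnote 4)
    g = waverec(c, L, wname); g = g(:);      % inverse DWT: coefficients -> samples
    gh = fft(g) / Naug;                      % approximate Fourier-series coefficients of g
    U(:,n) = gh(mod(freq, Naug) + 1);        % keep the N frequencies of interest
end
% Simplified dyadic-band sampling with saturation rt = round(log2(m/2)):
rt = round(log2(m/2)); r = log2(N);
Omega = 1:2^rt;                              % saturate the first rt bands
for k = rt+1:r
    band = 2^(k-1)+1 : 2^k;                  % k-th dyadic band in the ordering (A.2)
    mk = min(round(m / (2*(r - rt))), numel(band));
    Omega = [Omega, band(randperm(numel(band), mk))];
end
% Recover wavelet coefficients from y = U(Omega,:)*d via basis pursuit (1.2):
d = randn(N,1) .* (rand(N,1) < 0.05);        % stand-in sparse vector (the real d comes from f, Footnote 5)
y = U(Omega, :) * d;
opts = spgSetParms('verbosity', 0, 'bpTol', 1e-6, 'optTol', 1e-6, 'iterations', 10000);
dhat = spg_bp(U(Omega, :), y, opts);
```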

(Fourier, \(\ell ^1_w\)): This strategy is almost identical to (Fourier, \(\ell ^1\)). The only difference is that wavelet coefficients are recovered via weighted (as opposed to unweighted) basis pursuit, i.e. by solving (1.2) where the \(\ell ^1\)-norm is replaced with the weighted \(\ell ^1_w\)-norm. The weights w are set according to the recipe described in Sect. 10.2 with \(\delta = 10^{-5}\). Weighted basis pursuit is numerically solved using the MATLAB package SPGL1 as in the previous case.

(Gauss, \(\ell ^1\)): This is the standard encoder–decoder pair of compressed sensing with random Gaussian measurements, corresponding to the setting of Theorems 3.1 and 4.3 and to the error bound (1.3). The vector \(d \in {\mathbb {R}}^N\) of wavelet coefficients of f is explicitly computed and then encoded as \(y = A d\), where \(A \in {\mathbb {R}}^{m \times N}\) has i.i.d. entries drawn from the normal distribution with mean zero and variance 1/m. The function is recovered by means of the basis pursuit decoder (1.2), numerically solved via SPGL1 as in the previous cases (see Footnote 5).
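A corresponding sketch for this pair, including the computation of the coefficient vector d described in Footnote 5, might look as follows; the wavelet name, grid size, decomposition depth and normalization are again illustrative choices, and f is an assumed function handle on [0, 1] (for instance, built from the f_K loop above).

```matlab
% Hedged sketch of the (Gauss, l1) strategy (f is an assumed function handle on [0,1]).
dwtmode('per');
N = 2^10; m = 2^7; wname = 'db2';
Naug = 16*N;
tgrid = ((0:Naug-1).') / Naug;               % uniform grid of 16N points (Footnote 5)
fvals = f(tgrid);                            % samples of f
[C, Lbk] = wavedec(fvals, log2(Naug) - 2, wname);
d = C(1:N); d = d(:) / sqrt(Naug);           % first N wavelet coefficients (our normalization)
A = randn(m, N) / sqrt(m);                   % Gaussian encoder with variance 1/m
y = A * d;
opts = spgSetParms('verbosity', 0, 'bpTol', 1e-6, 'optTol', 1e-6, 'iterations', 10000);
dhat = spg_bp(A, y, opts);
relerr = norm(dhat - d) / norm(d);           % relative error on the coefficients
```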

(Optimal, \(\ell ^1\)): This strategy corresponds to the setting of Theorems 3.1 and 4.3 and to the optimal error bound (1.3). As in the previous case, we compute the vector \(d\in {\mathbb {R}}^N\) of wavelet coefficients of f. Then, the first \(m_1 = \text {round}(m/2)\) entries of d are directly encoded into \(y^{(1)} \in {\mathbb {R}}^{m_1}\). The remaining \(m_2 = m - m_1\) measurements are computed as \(y^{(2)} = A (d_n)_{n = m_1+1}^{N}\), where \(A \in {\mathbb {R}}^{m_2 \times (N-m_1)}\) has i.i.d. entries drawn from the normal distribution with mean zero and variance \(1/m_2\). We consider the basis pursuit decoder (1.2), numerically solved using SPGL1 as in the previous cases.
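For completeness, here is a sketch of the corresponding encoder, reusing d, N, m and opts from the previous sketch.

```matlab
% Hedged sketch of the (Optimal, l1) encoder; d, N, m, opts as in the previous sketch.
m1 = round(m/2); m2 = m - m1;
A2 = randn(m2, N - m1) / sqrt(m2);                       % Gaussian block on the tail coefficients
A  = [eye(m1), zeros(m1, N - m1); zeros(m2, m1), A2];    % the first m1 coefficients are kept exactly
y  = A * d;                                              % y(1:m1) = d(1:m1), y(m1+1:end) = A2*d(m1+1:end)
dhat = spg_bp(A, y, opts);                               % same basis pursuit decoder as before
```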

(Gauss, Tree): This encoder–decoder pair corresponds to the model-based compressive sensing strategy proposed in [12]. The encoder is identical to (Gauss, \(\ell ^1\)), and the decoder explicitly promotes tree-structured sparsity in the recovered function using the model-based CoSaMP algorithm [12]. This strategy requires tuning a parameter c, which links m to the desired tree-sparsity level s as \(m = c s\). In the numerical tests, we consider \(c = 3,4,5,6,7\). We employ the Model-based Compressive Sensing Toolbox v1.1 provided by the authors of [12]. The maximum number of iterations for the outer loop of CoSaMP is set to 100.

These encoder–decoder pairs are compared with \(N = 2^{15} = 32768\) and values of m ranging from \(2^3 = 8\) to \(2^{11} = 2048\). We employ Haar and db4 wavelets, having \(p= 1\) and \(p = 2\) vanishing moments, respectively. In this setting, the weights used in (Fourier, \(\ell ^1_w\)) are constant for all \(m \le 256\). The relative \(L^2\) error is computed using the wavelet coefficients of f, approximated as in the strategies (Gauss, \(\ell ^1\)), (Optimal, \(\ell ^1\)) and (Gauss, Tree).

In Fig. 5, the encoder–decoder pairs (Fourier, \(\ell ^1\)) and (Fourier, \(\ell ^1_w\)) have almost identical performance and consistently outperform all the other strategies, with only a few exceptions. Moreover, this behaviour is independent of the number of discontinuities K. It is remarkable that (Fourier, \(\ell ^1\)) and (Fourier, \(\ell ^1_w\)) are able to numerically outperform even the theoretically optimal pair (Optimal, \(\ell ^1\)). Although our theory prescribes the use of the weighted square-root LASSO decoder in the Fourier case, the numerics show that employing (weighted or unweighted) basis pursuit (1.2) is enough to outperform the other strategies numerically.

About this article

Cite this article

Adcock, B., Brugiapaglia, S. & King–Roskamp, M. Do Log Factors Matter? On Optimal Wavelet Approximation and the Foundations of Compressed Sensing. Found Comput Math 22, 99–159 (2022). https://doi.org/10.1007/s10208-021-09501-3

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10208-021-09501-3

Keywords

Mathematics Subject Classification
