
Conjugate gradient acceleration of iteratively re-weighted least squares methods


Abstract

Iteratively re-weighted least squares (IRLS) is a method for solving minimization problems involving non-quadratic cost functions, possibly non-convex and non-smooth, which can nevertheless be described as the infimum over a family of quadratic functions. This transformation suggests an algorithmic scheme that solves a sequence of quadratic problems, each of which can be tackled efficiently by tools of numerical linear algebra. Its general scope and usually simple implementation, which transforms the initial non-convex and non-smooth minimization problem into a more familiar and easily solvable quadratic optimization problem, make IRLS a versatile algorithm. However, despite its simplicity, versatility, and elegant analysis, the complexity of IRLS depends strongly on how the successive quadratic optimizations are solved. For the important special case of compressed sensing and sparse recovery problems in signal processing, we investigate theoretically and numerically how accurately the quadratic problems need to be solved by means of the conjugate gradient (CG) method in each iteration in order to guarantee convergence. The use of the CG method may significantly speed up the numerical solution of the quadratic subproblems, in particular when fast matrix-vector multiplication (exploiting, for instance, the FFT) is available for the matrix involved. In addition, we study convergence rates. Our modified IRLS method outperforms state-of-the-art first-order methods such as Iterative Hard Thresholding (IHT) and the Fast Iterative Soft-Thresholding Algorithm (FISTA) in many situations, especially in large dimensions. Moreover, IRLS is often able to recover sparse vectors from fewer measurements than required by IHT and FISTA.
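
To make the scheme described above concrete, here is a minimal Python sketch of an IRLS loop for the constrained problem \(\min _{x} \sum _{i}(x_{i}^{2} + \varepsilon ^{2})^{\tau /2}\) subject to \(\varPhi x = y\), in which each weighted least-squares subproblem is solved by a plain conjugate gradient method applied matrix-free to \(\varPhi D \varPhi ^{T}\). It is an illustration under simplifying assumptions, not the authors' implementation: the function names (cg, irls_cg), the fixed geometric \(\varepsilon \) decrease, the stopping rules, and all parameter values are placeholders.

```python
import numpy as np

def cg(apply_A, b, x0, tol=1e-10, maxiter=500):
    """Plain conjugate gradient for a symmetric positive definite operator
    that is available only through matrix-vector products."""
    x = x0.copy()
    r = b - apply_A(x)                 # initial residual
    p = r.copy()                       # initial search direction
    rs = r @ r
    bnorm = np.linalg.norm(b)
    if np.sqrt(rs) <= tol * bnorm:     # warm start may already be accurate
        return x
    for _ in range(maxiter):
        Ap = apply_A(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * bnorm:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def irls_cg(Phi, y, tau=1.0, eps=1.0, n_iter=40):
    """Sketch of IRLS for min_x sum_i (x_i^2 + eps^2)^(tau/2) s.t. Phi x = y.
    Each subproblem min_x sum_i w_i x_i^2 s.t. Phi x = y is solved through
    the system (Phi D Phi^T) z = y with D = diag(1/w) and x = D Phi^T z,
    using CG (warm-started across outer iterations) as the inner solver."""
    m, _ = Phi.shape
    # minimal-norm feasible starting point; assumes Phi has full row rank
    x = Phi.T @ np.linalg.solve(Phi @ Phi.T, y)
    z = np.zeros(m)
    for _ in range(n_iter):
        w = (x**2 + eps**2) ** (tau / 2.0 - 1.0)            # IRLS weights
        d = 1.0 / w
        apply_A = lambda v, d=d: Phi @ (d * (Phi.T @ v))    # matrix-free Phi D Phi^T
        z = cg(apply_A, y, z)                               # warm-started inner solve
        x = d * (Phi.T @ z)
        eps *= 0.9   # illustrative epsilon decrease; the paper couples it to the iterates
    return x
```

Warm-starting the inner CG with the previous dual variable and applying \(\varPhi D \varPhi ^{T}\) only through matrix-vector products is what allows a fast transform such as the FFT to be exploited; how accurately each of these inner solves must be carried out is precisely the question analyzed in the paper.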


Acknowledgments

Massimo Fornasier acknowledges the support of the ERC Starting Grant HDSPCONTR “High-Dimensional Sparse Optimal Control” and the DFG Project “Optimal Adaptive Numerical Methods for p-Poisson Elliptic equations”. Steffen Peter acknowledges the support of the Project “SparsEO: Exploiting the Sparsity in Remote Sensing for Earth Observation” funded by Munich Aerospace. Holger Rauhut would like to thank the European Research Council (ERC) for support through the Starting Grant StG 258926 SPALORA (Sparse and Low Rank Recovery) and the Hausdorff Center for Mathematics at the University of Bonn, where this project started.

Corresponding author

Correspondence to Steffen Peter.

Appendix: Proof of Lemma 10
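
For readability of this excerpt, the two implications proved below can be summarized as follows; this summary is reconstructed from the proof itself, and the statement of Lemma 10 in the main text may carry additional context. For \(0 < \tau \leqslant 1\),

$$\begin{aligned} x = x^{{\varepsilon },1} \text { or } x\in \mathcal {X}_{{\varepsilon },\tau }(y) \quad \Longrightarrow \quad \left\langle x,\eta \right\rangle _{\hat{w}(x,{\varepsilon },\tau )} = 0 \text { for all } \eta \in \mathcal {N}_{\varPhi }, \end{aligned}$$

and, conversely, for \(\tau = 1\),

$$\begin{aligned} x\in \mathcal {F}_{\varPhi }(y) \text { and } \left\langle x,\eta \right\rangle _{\hat{w}(x,{\varepsilon },1)} = 0 \text { for all } \eta \in \mathcal {N}_{\varPhi } \quad \Longrightarrow \quad x = x^{{\varepsilon },1}. \end{aligned}$$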

“\(\Rightarrow \)” (in the case \(0 < \tau \leqslant 1\))

Let \(x = x^{{\varepsilon },1}\) or \(x\in \mathcal {X}_{{\varepsilon },\tau }(y)\), and let \(\eta \in \mathcal {N}_{\varPhi }\) be arbitrary. Recalling that \(f_{{\varepsilon },\tau }(z) = \sum _{i=1}^{N}\left( z_{i}^{2} + {\varepsilon }^{2}\right) ^{\tau /2}\), consider the function

$$\begin{aligned} G_{{\varepsilon },\tau }(t) \mathrel {\mathop :}=f_{{\varepsilon },\tau }\left( x + t\eta \right) - f_{{\varepsilon },\tau }\left( x \right) \end{aligned}$$

with its first derivative

$$\begin{aligned} G^{\prime }_{{\varepsilon },\tau }(t) = \tau \sum \limits _{i=1}^{N}\frac{x_{i}\eta _{i} +t\eta _{i}^2}{\left[ |x_{i} + t\eta _{i}|^{2} + {\varepsilon }^{2}\right] ^{\frac{2-\tau }{2}}}. \end{aligned}$$

Now \(G_{{\varepsilon },\tau }(0) = 0\), and the minimization property of \(x\) yields \(G_{{\varepsilon },\tau }(t) \ge 0\) for all \(t\), since \(x + t\eta \in \mathcal {F}_{\varPhi }(y)\). Hence \(t = 0\) minimizes \(G_{{\varepsilon },\tau }\) and, since \(\tau > 0\),

$$\begin{aligned} 0 = \tau ^{-1}G^{\prime }_{{\varepsilon },\tau }(0) = \sum \limits _{i=1}^{N}\frac{x_{i}\eta _{i}}{\left[ x_{i}^{2} + {\varepsilon }^{2}\right] ^{\frac{2-\tau }{2}}} = \left\langle x,\eta \right\rangle _{\hat{w}(x,{\varepsilon },\tau )}. \end{aligned}$$

“\(\Leftarrow \)” (only in the case \(\tau = 1\))

Now let \(x\in \mathcal {F}_{\varPhi }(y)\) with \(\left\langle x,\eta \right\rangle _{\hat{w}(x,{\varepsilon },1)} = 0\) for all \(\eta \in \mathcal {N}_{\varPhi }\). We want to show that \(x\) is the minimizer of \(f_{{\varepsilon },1}\) in \(\mathcal {F}_{\varPhi }(y)\). Consider the convex univariate function \(g(u)\mathrel {\mathop :}=[u^{2} + {\varepsilon }^{2}]^{1/2}\). For any point \(u_{0}\), convexity gives

$$\begin{aligned} {[}u^{2} + {\varepsilon }^{2}{]}^{1/2} \geqslant [u_{0}^{2} + {\varepsilon }^{2}]^{1/2} + {[}u_{0}^{2} + {\varepsilon }^{2}{]}^{-1/2}u_{0}(u-u_{0}) \end{aligned}$$

because the right-hand side is the linear function tangent to \(g\) at \(u_{0}\). It follows that for every point \(v\in \mathcal {F}_{\varPhi }(y)\) we have

$$\begin{aligned} f_{{\varepsilon },1}(v)\geqslant & {} f_{{\varepsilon },1}(x) + \sum \limits _{i=1}^{N}{[x_{i}^{2} + {\varepsilon }^{2}]^{-1/2}x_{i}(v_{i} - x_{i})}\\= & {} f_{{\varepsilon },1}(x) + \left\langle x, v-x\right\rangle _{\hat{w}(x,{\varepsilon },1)} = f_{{\varepsilon },1}(x), \end{aligned}$$

where we have used the orthogonality condition and the fact that \((v - x) \in \mathcal {N}_{\varPhi }\). Since v was chosen arbitrarily, \(x = x^{{\varepsilon },1}\) as claimed.
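
As a quick numerical sanity check of the characterization just proved (for \(\tau = 1\)), the following Python sketch minimizes \(f_{{\varepsilon },1}(x) = \sum _{i}(x_{i}^{2} + {\varepsilon }^{2})^{1/2}\) over \(\mathcal {F}_{\varPhi }(y)\) with a simple projected gradient descent, used here only as a generic stand-in solver rather than the paper's method, and then verifies that \(\left\langle x,\eta \right\rangle _{\hat{w}(x,{\varepsilon },1)}\) is numerically negligible for \(\eta \in \mathcal {N}_{\varPhi }\). All names, dimensions, and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
m, N, eps = 20, 60, 0.1
Phi = rng.standard_normal((m, N))
y = Phi @ rng.standard_normal(N)

# Orthogonal projector onto the null space N_Phi and a feasible starting point
# (assumes Phi has full row rank).
P = np.eye(N) - Phi.T @ np.linalg.solve(Phi @ Phi.T, Phi)
x = Phi.T @ np.linalg.solve(Phi @ Phi.T, y)      # minimal-norm solution, feasible

# Projected gradient descent for f_{eps,1}(x) = sum_i (x_i^2 + eps^2)^{1/2}
# over {x : Phi x = y}; the gradient of f_{eps,1} is (1/eps)-Lipschitz,
# so the constant step size eps is admissible.
for _ in range(20000):
    grad = x / np.sqrt(x**2 + eps**2)
    x -= eps * (P @ grad)

# Weighted orthogonality <x, eta>_{w_hat(x,eps,1)} with w_hat_i = (x_i^2 + eps^2)^{-1/2}
# should vanish for every eta in the null space of Phi.
w_hat = 1.0 / np.sqrt(x**2 + eps**2)
eta = P @ rng.standard_normal(N)
print(abs(np.dot(w_hat * x, eta)))   # expected: numerically close to zero
```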


Cite this article

Fornasier, M., Peter, S., Rauhut, H.: Conjugate gradient acceleration of iteratively re-weighted least squares methods. Comput Optim Appl 65, 205–259 (2016). https://doi.org/10.1007/s10589-016-9839-8
