# The Convex Geometry of Linear Inverse Problems

- 4.7k Downloads
- 328 Citations

## Abstract

In applications throughout science and engineering one is often faced with the challenge of solving an ill-posed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However in many practical situations of interest, models are constrained structurally so that they only have a few degrees of freedom relative to their ambient dimension. This paper provides a general framework to convert notions of simplicity into convex penalty functions, resulting in convex optimization solutions to linear, underdetermined inverse problems. The class of simple models considered includes those formed as the sum of a few atoms from some (possibly infinite) elementary atomic set; examples include well-studied cases from many technical fields such as sparse vectors (signal processing, statistics) and low-rank matrices (control, statistics), as well as several others including sums of a few permutation matrices (ranked elections, multiobject tracking), low-rank tensors (computer vision, neuroscience), orthogonal matrices (machine learning), and atomic measures (system identification). The convex programming formulation is based on minimizing the norm induced by the convex hull of the atomic set; this norm is referred to as the *atomic norm*. The facial structure of the atomic norm ball carries a number of favorable properties that are useful for recovering simple models, and an analysis of the underlying convex geometry provides sharp estimates of the number of generic measurements required for exact and robust recovery of models from partial information. These estimates are based on computing the Gaussian widths of tangent cones to the atomic norm ball. When the atomic set has algebraic structure the resulting optimization problems can be solved or approximated via semidefinite programming. The quality of these approximations affects the number of measurements required for recovery, and this tradeoff is characterized via some examples. Thus this work extends the catalog of simple models (beyond sparse vectors and low-rank matrices) that can be recovered from limited linear information via tractable convex programming.

## Keywords

Convex optimization Semidefinite programming Atomic norms Real algebraic geometry Gaussian width Symmetry## Mathematics Subject Classification

52A41 90C25 90C22 60D05 41A45## Notes

### Acknowledgements

This work was supported in part by AFOSR grant FA9550-08-1-0180, in part by a MURI through ARO grant W911NF-06-1-0076, in part by a MURI through AFOSR grant FA9550-06-1-0303, in part by NSF FRG 0757207, in part through ONR award N00014-11-1-0723, and NSF award CCF-1139953.

We gratefully acknowledge Holger Rauhut for several suggestions on how to improve the presentation in Sect. 3, and Amin Jalali for pointing out an error in a previous draft. We thank Santosh Vempala, Joel Tropp, Bill Helton, Martin Jaggi, and Jonathan Kelner for helpful discussions. Finally, we acknowledge the suggestions of the associate editor Emmanuel Candès as well as the comments and pointers to references made by the reviewers, all of which improved our paper.

## References

- 1.S. Aja-Fernandez, R. Garcia, D. Tao, X. Li,
*Tensors in Image Processing and Computer Vision*. Advances in Pattern Recognition (Springer, Berlin, 2009). zbMATHCrossRefGoogle Scholar - 2.N. Alon, A. Naor, Approximating the cut-norm via Grothendieck’s inequality,
*SIAM J. Comput.***35**, 787–803 (2006). MathSciNetzbMATHCrossRefGoogle Scholar - 3.A. Barron, Universal approximation bounds for superpositions of a sigmoidal function,
*IEEE Trans. Inf. Theory***39**, 930–945 (1993). MathSciNetzbMATHCrossRefGoogle Scholar - 4.A. Barvinok,
*A Course in Convexity*(American Mathematical Society, Providence, 2002). zbMATHGoogle Scholar - 5.C. Beckmann, S. Smith, Tensorial extensions of independent component analysis for multisubject FMRI analysis,
*NeuroImage***25**, 294–311 (2005). CrossRefGoogle Scholar - 6.D. Bertsekas,
*Constrained Optimization and Lagrange Multiplier Methods*(Athena Scientific, Nashua, 2007). Google Scholar - 7.D. Bertsekas, A. Nedic, A. Ozdaglar,
*Convex Analysis and Optimization*(Athena Scientific, Nashua, 2003). zbMATHGoogle Scholar - 8.P. Bickel, Y. Ritov, A. Tsybakov, Simultaneous analysis of Lasso and Dantzig selector,
*Ann. Stat.***37**, 1705–1732 (2009). MathSciNetzbMATHCrossRefGoogle Scholar - 9.J. Bochnak, M. Coste, M. Roy,
*Real Algebraic Geometry*(Springer, Berlin, 1988). Google Scholar - 10.F.F. Bonsall, A general atomic decomposition theorem and Banach’s closed range theorem,
*Q. J. Math.***42**, 9–14 (1991). MathSciNetzbMATHCrossRefGoogle Scholar - 11.A. Brieden, P. Gritzmann, R. Kannan, V. Klee, L. Lovasz, M. Simonovits, Approximation of diameters: randomization doesn’t help, in
*Proceedings of the 39th Annual Symposium on Foundations of Computer Science*(1998), pp. 244–251. Google Scholar - 12.J. Cai, E. Candès, Z. Shen, A singular value thresholding algorithm for matrix completion,
*SIAM J. Optim.***20**, 1956–1982 (2008). CrossRefGoogle Scholar - 13.J. Cai, S. Osher, Z. Shen, Linearized Bregman iterations for compressed sensing,
*Math. Comput.***78**, 1515–1536 (2009). MathSciNetzbMATHCrossRefGoogle Scholar - 14.E. Candès, X. Li, Y. Ma, J. Wright, Robust principal component analysis?
*J. ACM***58**, 1–37 (2011). CrossRefGoogle Scholar - 15.E. Candès, Y. Plan, Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements,
*IEEE Trans. Inf. Theory***57**, 2342–2359 (2011). CrossRefGoogle Scholar - 16.E.J. Candès, J. Romberg, T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,
*IEEE Trans. Inf. Theory***52**, 489–509 (2006). zbMATHCrossRefGoogle Scholar - 17.E.J. Candès, B. Recht, Exact matrix completion via convex optimization,
*Found. Comput. Math.***9**, 717–772 (2009). MathSciNetzbMATHCrossRefGoogle Scholar - 18.E. Candès, T. Tao, Decoding by linear programming,
*IEEE Trans. Inf. Theory***51**, 4203–4215 (2005). CrossRefGoogle Scholar - 19.V. Chandrasekaran, S. Sanghavi, P.A. Parrilo, A.S. Willsky, Rank-sparsity incoherence for matrix decomposition,
*SIAM J. Optim.***21**, 572–596 (2011). MathSciNetzbMATHCrossRefGoogle Scholar - 20.P. Combettes, V. Wajs, Signal recovery by proximal forward-backward splitting,
*Multiscale Model. Simul.***4**, 1168–1200 (2005). MathSciNetzbMATHCrossRefGoogle Scholar - 21.I. Daubechies, M. Defriese, C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,
*Commun. Pure Appl. Math.***LVII**, 1413–1457 (2004). CrossRefGoogle Scholar - 22.K.R. Davidson, S.J. Szarek, Local operator theory, random matrices and Banach spaces, in
*Handbook of the Geometry of Banach Spaces*, vol. I (2001), pp. 317–366. CrossRefGoogle Scholar - 23.V. de Silva, L. Lim, Tensor rank and the ill-posedness of the best low-rank approximation problem,
*SIAM J. Matrix Anal. Appl.***30**, 1084–1127 (2008). MathSciNetCrossRefGoogle Scholar - 24.R. DeVore, V. Temlyakov, Some remarks on greedy algorithms,
*Adv. Comput. Math.***5**, 173–187 (1996). MathSciNetzbMATHCrossRefGoogle Scholar - 25.M. Deza, M. Laurent,
*Geometry of Cuts and Metrics*(Springer, Berlin, 1997). zbMATHGoogle Scholar - 26.D.L. Donoho, High-dimensional centrally-symmetric polytopes with neighborliness proportional to dimension,
*Discrete Comput. Geom*. (online) (2005). Google Scholar - 27.D.L. Donoho, For most large underdetermined systems of linear equations the minimal
*ℓ*_{1}-norm solution is also the sparsest solution,*Commun. Pure Appl. Math.***59**, 797–829 (2006). MathSciNetzbMATHCrossRefGoogle Scholar - 28.D.L. Donoho, Compressed sensing,
*IEEE Trans. Inf. Theory***52**, 1289–1306 (2006). MathSciNetCrossRefGoogle Scholar - 29.D. Donoho, J. Tanner, Sparse nonnegative solution of underdetermined linear equations by linear programming,
*Proc. Natl. Acad. Sci. USA***102**, 9446–9451 (2005). MathSciNetCrossRefGoogle Scholar - 30.D. Donoho, J. Tanner, Counting faces of randomly-projected polytopes when the projection radically lowers dimension,
*J. Am. Math. Soc.***22**, 1–53 (2009). MathSciNetzbMATHCrossRefGoogle Scholar - 31.D. Donoho, J. Tanner, Counting the faces of randomly-projected hypercubes and orthants with applications,
*Discrete Comput. Geom.***43**, 522–541 (2010). MathSciNetzbMATHCrossRefGoogle Scholar - 32.R.M. Dudley, The sizes of compact subsets of Hilbert space and continuity of Gaussian processes,
*J. Funct. Anal.***1**, 290–330 (1967). MathSciNetzbMATHCrossRefGoogle Scholar - 33.M. Dyer, A. Frieze, R. Kannan, A random polynomial-time algorithm for approximating the volume of convex bodies,
*J. ACM***38**, 1–17 (1991). MathSciNetzbMATHCrossRefGoogle Scholar - 34.M. Fazel, Matrix rank minimization with applications, Ph.D. thesis, Department of Electrical Engineering, Stanford University (2002). Google Scholar
- 35.M. Figueiredo, R. Nowak, An EM algorithm for wavelet-based image restoration,
*IEEE Trans. Image Process.***12**, 906–916 (2003). MathSciNetCrossRefGoogle Scholar - 36.M. Fukushima, H. Mine, A generalized proximal point algorithm for certain non-convex minimization problems,
*Int. J. Inf. Syst. Sci.***12**, 989–1000 (1981). MathSciNetzbMATHCrossRefGoogle Scholar - 37.M. Goemans, D. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming,
*J. ACM***42**, 1115–1145 (1995). MathSciNetzbMATHCrossRefGoogle Scholar - 38.Y. Gordon, On Milman’s inequality and random subspaces which escape through a mesh in ℝ
^{n}, in*Geometric Aspects of Functional Analysis, Israel Seminar 1986–1987*. Lecture Notes in Mathematics, vol. 1317 (1988), pp. 84–106. CrossRefGoogle Scholar - 39.J. Gouveia, P. Parrilo, R. Thomas, Theta bodies for polynomial ideals,
*SIAM J. Optim.***20**, 2097–2118 (2010). MathSciNetzbMATHCrossRefGoogle Scholar - 40.T. Hale, W. Yin, Y. Zhang, A fixed-point continuation method for
*ℓ*_{1}-regularized minimization: methodology and convergence,*SIAM J. Optim.***19**, 1107–1130 (2008). MathSciNetzbMATHCrossRefGoogle Scholar - 41.J. Harris,
*Algebraic Geometry: A First Course*(Springer, Berlin). Google Scholar - 42.J. Haupt, W.U. Bajwa, G. Raz, R. Nowak, Toeplitz compressed sensing matrices with applications to sparse channel estimation,
*IEEE Trans. Inform. Theory***56**(11), 5862–5875 (2010). MathSciNetCrossRefGoogle Scholar - 43.S. Jagabathula, D. Shah, Inferring rankings using constrained sensing,
*IEEE Trans. Inf. Theory***57**, 7288–7306 (2011). MathSciNetCrossRefGoogle Scholar - 44.L. Jones, A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training,
*Ann. Stat.***20**, 608–613 (1992). zbMATHCrossRefGoogle Scholar - 45.D. Klain, G. Rota,
*Introduction to Geometric Probability*(Cambridge University Press, Cambridge, 1997). zbMATHGoogle Scholar - 46.T. Kolda, Orthogonal tensor decompositions,
*SIAM J. Matrix Anal. Appl.***23**, 243–255 (2001). MathSciNetzbMATHCrossRefGoogle Scholar - 47.T. Kolda, B. Bader, Tensor decompositions and applications,
*SIAM Rev.***51**, 455–500 (2009). MathSciNetzbMATHCrossRefGoogle Scholar - 48.M. Ledoux,
*The Concentration of Measure Phenomenon*(American Mathematical Society, Providence, 2000). Google Scholar - 49.M. Ledoux, M. Talagrand,
*Probability in Banach Spaces*(Springer, Berlin, 1991). zbMATHGoogle Scholar - 50.J. Löfberg, YALMIP: A toolbox for modeling and optimization in MATLAB, in
*Proceedings of the CACSD Conference*, Taiwan (2004). Available from http://control.ee.ethz.ch/~joloef/yalmip.php. Google Scholar - 51.S. Ma, D. Goldfarb, L. Chen, Fixed point and Bregman iterative methods for matrix rank minimization,
*Math. Program.***128**, 321–353 (2011). MathSciNetzbMATHCrossRefGoogle Scholar - 52.O. Mangasarian, B. Recht, Probability of unique integer solution to a system of linear equations,
*Eur. J. Oper. Res.***214**, 27–30 (2011). MathSciNetzbMATHCrossRefGoogle Scholar - 53.J. Matoušek,
*Lectures on Discrete Geometry*(Springer, Berlin, 2002). zbMATHCrossRefGoogle Scholar - 54.S. Negahban, P. Ravikumar, M. Wainwright, B. Yu, A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers, Preprint (2010). Google Scholar
- 55.Y. Nesterov, Quality of semidefinite relaxation for nonconvex quadratic optimization. Technical report (1997). Google Scholar
- 56.Y. Nesterov,
*Introductory Lectures on Convex Optimization*(Kluwer Academic, Amsterdam, 2004). zbMATHGoogle Scholar - 57.Y. Nesterov, Gradient methods for minimizing composite functions, CORE discussion paper 76 (2007). Google Scholar
- 58.P.A. Parrilo, Semidefinite programming relaxations for semialgebraic problems,
*Math. Program.***96**, 293–320 (2003). MathSciNetzbMATHCrossRefGoogle Scholar - 59.G. Pisier,
*Remarques sur un résultat non publié de B. Maurey*. Séminaire d’analyse fonctionnelle (Ecole Polytechnique Centre de Mathematiques, Palaiseau, 1981). Google Scholar - 60.G. Pisier, Probabilistic methods in the geometry of Banach spaces, in
*Probability and Analysis*, pp. 167–241 (1986). CrossRefGoogle Scholar - 61.E. Polak,
*Optimization: Algorithms and Consistent Approximations*(Springer, Berlin, 1997). zbMATHGoogle Scholar - 62.H. Rauhut, Circulant and Toeplitz matrices in compressed sensing, in
*Proceedings of SPARS’09*, (2009). Google Scholar - 63.B. Recht, M. Fazel, P.A. Parrilo, Guaranteed minimum rank solutions to linear matrix equations via nuclear norm minimization,
*SIAM Rev.***52**, 471–501 (2010). MathSciNetzbMATHCrossRefGoogle Scholar - 64.B. Recht, W. Xu, B. Hassibi, Null space conditions and thresholds for rank minimization,
*Math. Program., Ser. B***127**, 175–211 (2011). MathSciNetzbMATHCrossRefGoogle Scholar - 65.R.T. Rockafellar,
*Convex Analysis*(Princeton University Press, Princeton, 1970). zbMATHGoogle Scholar - 66.M. Rudelson, R. Vershynin, Sparse reconstruction by convex relaxation: Fourier and Gaussian measurements, in
*CISS 2006 (40th Annual Conference on Information Sciences and Systems)*(2006). Google Scholar - 67.R. Sanyal, F. Sottile, B. Sturmfels, Orbitopes, Preprint, arXiv:0911.5436 (2009).
- 68.N. Srebro, A. Shraibman, Rank, trace-norm and max-norm in
*18th Annual Conference on Learning Theory (COLT)*(2005). Google Scholar - 69.M. Stojnic, Various thresholds for
*ℓ*_{1}-optimization in compressed sensing, Preprint, arXiv:0907.3666 (2009). - 70.K. Toh, M. Todd, R. Tutuncu, SDPT3—a MATLAB software package for semidefinite-quadratic-linear programming. Available from. http://www.math.nus.edu.sg/~mattohkc/sdpt3.html.
- 71.K. Toh, S. Yun, An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems,
*Pac. J. Optim.***6**, 615–640 (2010). MathSciNetzbMATHGoogle Scholar - 72.S. van de Geer, P. Bühlmann, On the conditions used to prove oracle results for the Lasso,
*Electron. J. Stat.***3**, 1360–1392 (2009). MathSciNetCrossRefGoogle Scholar - 73.S. Wright, R. Nowak, M. Figueiredo, Sparse reconstruction by separable approximation,
*IEEE Trans. Signal Process.***57**, 2479–2493 (2009). MathSciNetCrossRefGoogle Scholar - 74.W. Xu, B. Hassibi, Compressive sensing over the Grassmann manifold: a unified geometric framework,
*IEEE Trans. Inform. Theory***57**(10), 6894–6919 (2011). MathSciNetCrossRefGoogle Scholar - 75.W. Yin, S. Osher, J. Darbon, D. Goldfarb, Bregman iterative algorithms for compressed sensing and related problems,
*SIAM J. Imaging Sci.***1**, 143–168 (2008). MathSciNetzbMATHCrossRefGoogle Scholar - 76.G. Ziegler,
*Lectures on Polytopes*(Springer, Berlin, 1995). zbMATHCrossRefGoogle Scholar