Nonconvex Sparse Regularization and Splitting Algorithms

Part of the Scientific Computation book series (SCIENTCOMP)

Abstract

Nonconvex regularization functions such as the ℓp quasinorm (0 < p < 1) can recover sparser solutions from fewer measurements than the convex ℓ1 regularization function. They have been widely used for compressive sensing and signal processing. This chapter briefly reviews the development of algorithms for nonconvex regularization. Because nonconvex regularizers usually have different regularity properties from the other functions in a problem, we often apply operator splitting (forward-backward splitting) to develop algorithms that treat them separately. The nonconvex regularizer is handled through its proximal mapping.
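As an illustrative sketch (not taken from the chapter), forward-backward splitting alternates a gradient step on the smooth data-fidelity term with a proximal step on the regularizer. For the ℓ0 penalty the proximal mapping is hard thresholding, which yields the well-known iterative hard thresholding (IHT) scheme. The function names below are our own:

```python
import numpy as np

def hard_threshold(x, t):
    """Proximal mapping of t * ||x||_0: zero out entries with |x_i| < sqrt(2t)."""
    out = x.copy()
    out[np.abs(x) < np.sqrt(2.0 * t)] = 0.0
    return out

def iht(A, b, lam, step, iters=500):
    """Forward-backward splitting for min_x 0.5*||Ax - b||^2 + lam*||x||_0.

    Each iteration takes a forward (gradient) step on the smooth term,
    then a backward (proximal) step on the nonconvex l0 penalty.
    """
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                          # forward step
        x = hard_threshold(x - step * grad, lam * step)   # backward step
    return x
```

This is only a minimal sketch: in practice the step size must respect the Lipschitz constant of the gradient (or a restricted-isometry-type condition, as in the normalized IHT of Blumensath and Davies), and other nonconvex penalties such as ℓ1/2 have their own closed-form thresholding formulas.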

We also review another class of algorithms, coordinate descent, which works for both convex and nonconvex functions. These algorithms split the variables into small, possibly parallel, subproblems, each of which updates one variable (or block of variables) while fixing the others. Their theory and applications have recently been extended to cover nonconvex regularization functions, which we review in this chapter.
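A minimal sketch of the coordinate-descent idea, using the convex lasso problem for concreteness (the chapter's nonconvex variants replace the soft-thresholding update with the proximal mapping of the nonconvex penalty). All function names here are our own illustration:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal mapping of t * |v| (scalar soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def cd_lasso(A, b, lam, sweeps=100):
    """Cyclic coordinate descent for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    Each subproblem minimizes the objective exactly over one coordinate
    while the other coordinates stay fixed.
    """
    m, n = A.shape
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)   # squared column norms
    r = b - A @ x                   # running residual
    for _ in range(sweeps):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            # correlation of column j with the partial residual (j excluded)
            rho = A[:, j] @ r + col_sq[j] * x[j]
            new_xj = soft_threshold(rho, lam) / col_sq[j]
            r += A[:, j] * (x[j] - new_xj)  # update residual incrementally
            x[j] = new_xj
    return x
```

Maintaining the residual incrementally makes each coordinate update cheap, which is what makes these methods attractive for parallel and asynchronous implementations.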

Finally, we also briefly mention an ADMM-based algorithm for nonconvex regularization, as well as recent algorithms for the so-called nonconvex sorted ℓ1 and ℓ1 − ℓ2 minimization.
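For intuition (an illustration of ours, not from the chapter), the ℓ1 − ℓ2 penalty is nonconvex, vanishes exactly on 1-sparse vectors, and is strictly positive on denser ones, which is why minimizing it promotes sparsity:

```python
import numpy as np

def l1_minus_l2(x):
    """The nonconvex l1 - l2 penalty: ||x||_1 - ||x||_2.

    Equals 0 exactly when x has at most one nonzero entry,
    and is strictly positive otherwise.
    """
    return np.abs(x).sum() - np.linalg.norm(x)
```

In the literature this penalty is typically minimized with difference-of-convex (DC) programming, splitting it into the convex ℓ1 term minus the convex ℓ2 term.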

Keywords

  • Coordinate Descent
  • Restricted Isometry Property
  • Hard Thresholding
  • Nonconvex Function
  • Smoothly Clipped Absolute Deviation


Notes

  1. We use “norm” loosely, to refer to such things as the ℓp quasinorm, or the ℓ0 penalty function (which has no correct norm-like name).

  2. The objective is convex in each coordinate while the other coordinates are fixed.



Corresponding author

Correspondence to Rick Chartrand.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Chartrand, R., Yin, W. (2016). Nonconvex Sparse Regularization and Splitting Algorithms. In: Glowinski, R., Osher, S., Yin, W. (eds) Splitting Methods in Communication, Imaging, Science, and Engineering. Scientific Computation. Springer, Cham. https://doi.org/10.1007/978-3-319-41589-5_7