Abstract
Nonconvex regularization functions such as the ℓ p quasinorm (0 < p < 1) can recover sparser solutions from fewer measurements than the convex ℓ 1 regularization function, and they have been widely used in compressive sensing and signal processing. This chapter briefly reviews the development of algorithms for nonconvex regularization. Because a nonconvex regularizer usually has regularity properties different from those of the other functions in a problem, we often apply operator splitting (forward-backward splitting) to develop algorithms that treat it separately. The nonconvex regularizer is handled through its proximal mapping.
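As an illustration (not from the chapter; the problem instance and parameter names are ours), the following minimal sketch applies forward-backward splitting to the ℓ 0-penalized least-squares problem (1/2)‖Ax − b‖² + λ‖x‖₀, where the backward step is the proximal mapping of the ℓ 0 penalty, i.e., hard thresholding:

```python
import numpy as np

def prox_l0(v, thresh):
    # Proximal mapping of the l0 penalty: hard thresholding.
    out = v.copy()
    out[np.abs(v) <= thresh] = 0.0
    return out

def forward_backward_l0(A, b, lam, steps=500):
    # Forward-backward splitting for (1/2)||Ax - b||^2 + lam * ||x||_0.
    t = 1.0 / np.linalg.norm(A, 2) ** 2          # step size 1/L, L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - b)                 # forward (gradient) step
        x = prox_l0(x - t * grad, np.sqrt(2 * lam * t))  # backward (prox) step
    return x
```

With A equal to the identity, the iteration reduces to hard thresholding of b at level sqrt(2λ), which matches the closed-form prox of the ℓ 0 penalty.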
We also review another class of algorithms, coordinate descent, which works for both convex and nonconvex functions. These algorithms split the variables into small, possibly parallel, subproblems, each of which updates one variable while fixing the others. Their theory and applications have recently been extended to cover nonconvex regularization functions, which we review in this chapter.
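A minimal sketch of the coordinate-descent idea, shown here on the convex lasso problem (1/2)‖Ax − b‖² + λ‖x‖₁ for concreteness (nonconvex penalties such as MCP or SCAD admit analogous per-coordinate closed-form updates). The function names and parameters are illustrative, not from the chapter:

```python
import numpy as np

def soft_threshold(v, t):
    # Closed-form scalar prox of t * |.|
    return np.sign(v) * max(abs(v) - t, 0.0)

def coordinate_descent_lasso(A, b, lam, sweeps=100):
    # Cyclic coordinate descent for (1/2)||Ax - b||^2 + lam * ||x||_1.
    # Each step exactly minimizes over one coordinate, holding the rest fixed.
    n = A.shape[1]
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)        # ||a_j||^2 for each column
    r = b - A @ x                        # running residual
    for _ in range(sweeps):
        for j in range(n):
            aj = A[:, j]
            rho = aj @ r + col_sq[j] * x[j]   # correlation with x_j's effect removed
            new_xj = soft_threshold(rho, lam) / col_sq[j]
            r += aj * (x[j] - new_xj)         # update residual incrementally
            x[j] = new_xj
    return x
```

Each coordinate subproblem is one-dimensional, so the update is cheap, and disjoint blocks of coordinates can in principle be updated in parallel.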
Finally, we briefly mention an ADMM-based algorithm for nonconvex regularization, as well as recent algorithms for the so-called nonconvex sorted ℓ 1 minimization and ℓ 1 − ℓ 2 minimization.
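The ADMM approach can be sketched as follows on the ℓ 0-regularized least-squares problem with the splitting x = z; this is our own minimal illustration under assumed parameters, not the specific algorithm from the chapter. The x-update solves a ridge-type linear system, and the z-update is the hard-thresholding prox of the ℓ 0 penalty:

```python
import numpy as np

def admm_l0(A, b, lam, rho=1.0, iters=200):
    # ADMM sketch for min (1/2)||Ax - b||^2 + lam * ||z||_0, s.t. x = z.
    n = A.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)  # u: scaled dual variable
    M = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(M, Atb + rho * (z - u))    # x-update: ridge system
        v = x + u
        thresh = np.sqrt(2 * lam / rho)
        z = np.where(np.abs(v) > thresh, v, 0.0)       # z-update: hard threshold
        u = u + x - z                                  # dual update
    return z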
Keywords
- Coordinate Descent
- Restricted Isometry Property
- Hard Thresholding
- Nonconvex Function
- Smoothly Clipped Absolute Deviation
Notes
- 1.
We use “norm” loosely to refer to such objects as the ℓ p quasinorm and the ℓ 0 penalty function (which has no proper norm-like name).
- 2.
The objective is convex in each coordinate while the other coordinates are held fixed.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Chartrand, R., Yin, W. (2016). Nonconvex Sparse Regularization and Splitting Algorithms. In: Glowinski, R., Osher, S., Yin, W. (eds) Splitting Methods in Communication, Imaging, Science, and Engineering. Scientific Computation. Springer, Cham. https://doi.org/10.1007/978-3-319-41589-5_7
DOI: https://doi.org/10.1007/978-3-319-41589-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41587-1
Online ISBN: 978-3-319-41589-5
eBook Packages: Mathematics and Statistics (R0)