
Relating \(\ell _p\) regularization and reweighted \(\ell _1\) regularization

  • Original Paper, Optimization Letters

Abstract

We propose a general framework of iteratively reweighted \(\ell _1\) algorithms for solving \(\ell _p\) regularization problems. We show that all limit points of the iterates generated by the proposed algorithms share the same sign. Moreover, after sufficiently many iterations the iterates also have the same sign as the limit points, and the nonzero components are bounded away from zero. The algorithm therefore behaves like a method for a smooth problem in the reduced space consisting of the nonzero components. We analyze the global convergence and worst-case complexity of the reweighted algorithms. In addition, we propose a smoothing parameter updating strategy that automatically stops reducing the smoothing parameters corresponding to the zero components of the limit points. We show that the \(\ell _p\) regularized regression problem is locally equivalent to a weighted \(\ell _1\) regularization problem near a stationary point, and that every stationary point corresponds to a maximum a posteriori (MAP) estimate under independent, non-identically distributed Laplace priors. Numerical experiments demonstrate the behavior and efficiency of the proposed algorithms.
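To make the framework concrete, the following Python sketch implements one possible instantiation of an iteratively reweighted \(\ell _1\) scheme for a least-squares loss, based on the smoothed approximation \(\sum _i (|x_i|+\epsilon )^p\) of \(\Vert x\Vert _p^p\). The subproblem solver (a few proximal-gradient steps with soft-thresholding) and the uniform geometric shrinking of the smoothing parameter are illustrative simplifications; in particular, the paper's adaptive strategy stops reducing the smoothing parameters tied to the zero components, which this sketch does not do.

    import numpy as np

    def soft_threshold(z, t):
        # Proximal operator of t*||.||_1: componentwise shrinkage.
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def reweighted_l1(A, b, lam, p=0.5, eps=1.0, theta=0.5, outer=50, inner=100):
        # Illustrative scheme for min 0.5*||Ax - b||^2 + lam*||x||_p^p,
        # approached through min 0.5*||Ax - b||^2 + lam*sum_i (|x_i| + eps)^p.
        x = np.zeros(A.shape[1])
        L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the least-squares gradient
        for _ in range(outer):
            # Weights from linearizing (|x_i| + eps)^p at the current iterate.
            w = p * (np.abs(x) + eps) ** (p - 1.0)
            # Approximately solve the weighted l1 subproblem by proximal gradient.
            for _ in range(inner):
                x = soft_threshold(x - A.T @ (A @ x - b) / L, lam * w / L)
            eps *= theta  # uniform shrinkage; the paper's update is adaptive
        return x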



Acknowledgements

Hao Wang was supported by the Young Scientists Fund of the National Natural Science Foundation of China under Grant 12001367.

Author information


Corresponding author

Correspondence to Hao Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

We discuss two common examples to show that the iterates \(\{x^k\}\) generated by Algorithm 1 generally have a unique limit point. By Theorem 6, we only need to verify that, for a stationary point \(x^*\), the matrix \(\nabla ^2 F( [x_{\mathcal{I}^*}^*; 0_{\mathcal{A}^*}]) \) is invertible.

\(\ell _p\) regularized linear regression The loss function is \(f(x) = \tfrac{1}{2}\Vert Ax-b\Vert ^2_2\) with \(A\in \mathbb {R}^{m\times n}\), \(x\in \mathbb {R}^n\) and \(b\in \mathbb {R}^m\). Therefore, \( \nabla ^2 F( [x_{\mathcal{I}^*}^*; 0_{\mathcal{A}^*}]) = [A^TA]_{\mathcal{I}^*\mathcal{I}^*} + \lambda \nabla ^2 \Vert x_{\mathcal{I}^*}^*\Vert ^p_p\), so \(x^*\) is the unique limit point of the iterates as long as \( [A^TA]_{\mathcal{I}^*\mathcal{I}^*} + \lambda \nabla ^2 \Vert x_{\mathcal{I}^*}^*\Vert ^p_p\) is nonsingular.
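As a quick numerical check of this condition, the sketch below forms the reduced Hessian over the estimated support and tests whether it is numerically nonsingular. The support tolerance tol and the conditioning test are illustrative choices; the diagonal term uses \(\tfrac{\mathrm{d}^2}{\mathrm{d}x^2}|x|^p = p(p-1)|x|^{p-2}\) at the nonzero components.

    import numpy as np

    def reduced_hessian_ls(A, x_star, lam, p, tol=1e-8):
        # [A^T A]_{I*I*} + lam * diag(p*(p-1)*|x_i|^(p-2)) over I* = {i : x*_i != 0}.
        support = np.abs(x_star) > tol          # estimate of the support I*
        A_I = A[:, support]
        reg = lam * p * (p - 1.0) * np.abs(x_star[support]) ** (p - 2.0)
        H = A_I.T @ A_I + np.diag(reg)
        # Treat H as nonsingular if its condition number is finite in double precision.
        return H, np.linalg.cond(H) < 1.0 / np.finfo(float).eps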

\(\ell _p\) regularized logistic regression The loss function is

$$\begin{aligned} f(x) = -\sum \limits _{i=1}^m\left[ y_i\ln \sigma (a_i^Tx) + (1-y_i)\ln (1-\sigma (a_i^Tx))\right] ,\end{aligned}$$

with \(a_i\in \mathbb {R}^n\), \(y_i\in \{0,1\}\) and \(\sigma (s) = \frac{1}{1+e^{-s}}\). In this case, \([\nabla ^2 f(x^*)]_{\mathcal{I}^*\mathcal{I}^*} = [A^TD(x^*)A]_{\mathcal{I}^*\mathcal{I}^*}\), where \(D(x^*) = \text {diag}(\sigma (a_i^Tx^*)(1-\sigma (a_i^Tx^*)), i\in \{1,\ldots ,m\})\) and \(A = [a_1, \ldots , a_m]^T\). Hence \(x^*\) is the unique limit point of the iterates as long as \([A^TD(x^*)A]_{\mathcal{I}^*\mathcal{I}^*} + \lambda \nabla ^2 \Vert x_{\mathcal{I}^*}^*\Vert ^p_p\) is nonsingular.
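The same check adapts to the logistic loss by inserting the diagonal matrix \(D(x^*)\). A sketch under the same assumptions as before (rows of A are \(a_i^T\), support identified by a tolerance) follows.

    import numpy as np

    def reduced_hessian_logistic(A, x_star, lam, p, tol=1e-8):
        # [A^T D(x*) A]_{I*I*} + lam * diag(p*(p-1)*|x_i|^(p-2)), with
        # D(x*) = diag(sigma(a_i^T x*) * (1 - sigma(a_i^T x*))).
        s = 1.0 / (1.0 + np.exp(-(A @ x_star)))  # sigma(a_i^T x*); rows of A are a_i^T
        d = s * (1.0 - s)                        # diagonal entries of D(x*)
        support = np.abs(x_star) > tol
        A_I = A[:, support]
        reg = lam * p * (p - 1.0) * np.abs(x_star[support]) ** (p - 2.0)
        H = A_I.T @ (A_I * d[:, None]) + np.diag(reg)
        return H, np.linalg.cond(H) < 1.0 / np.finfo(float).eps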

About this article

Cite this article

Wang, H., Zeng, H., Wang, J. et al. Relating \(\ell _p\) regularization and reweighted \(\ell _1\) regularization. Optim Lett 15, 2639–2660 (2021). https://doi.org/10.1007/s11590-020-01685-x
