
Relating \(\ell _p\) regularization and reweighted \(\ell _1\) regularization

Abstract

We propose a general framework of iteratively reweighted \(\ell _1\) algorithms for solving \(\ell _p\) regularization problems. We show that all limit points of the iterates generated by the proposed algorithms share the same sign. Moreover, after sufficiently many iterations the iterates themselves have the same sign as the limit points, and their nonzero components are bounded away from zero. The algorithm therefore behaves like a method for a smooth problem in the reduced space consisting of the nonzero components. We analyze the global convergence and the worst-case complexity of the reweighted algorithms. In addition, we propose a smoothing-parameter updating strategy that automatically stops reducing the smoothing parameters corresponding to the zero components of the limit points. We show that the \(\ell _p\) regularized regression problem is locally equivalent to a weighted \(\ell _1\) regularization problem near a stationary point, and that every stationary point corresponds to a Maximum A Posteriori estimate under independent, non-identically distributed Laplace priors. Numerical experiments illustrate the behavior and efficiency of the proposed algorithms.



Acknowledgements

Hao Wang was supported by the Young Scientists Fund of the National Natural Science Foundation of China under Grant 12001367.

Author information


Correspondence to Hao Wang.


Appendix

We discuss two common examples to see that the iterates \(\{x^k\}\) generated by Algorithm 1 generally have a unique limit point. By Theorem 6, we only need to verify that, at a stationary point \(x^*\), the matrix \(\nabla ^2 F( [x_{\mathcal{I}^*}^*; 0_{\mathcal{A}^*}]) \) is invertible.
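For use in both examples, note that on its support the regularizer is smooth with a diagonal Hessian; we record this standard computation here for convenience:

$$\begin{aligned} \nabla ^2 \Vert x_{\mathcal{I}^*}^*\Vert _p^p = \text {diag}\big (p(p-1)|x_i^*|^{p-2},\ i\in \mathcal{I}^*\big ). \end{aligned}$$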

\(\ell _p\) regularized linear regression The loss function is \(f(x) = \tfrac{1}{2}\Vert Ax-b\Vert ^2_2\) with \(A\in \mathbb {R}^{m\times n}\), \(x\in \mathbb {R}^n\), and \(b\in \mathbb {R}^m\). Therefore, \( \nabla ^2 F( [x_{\mathcal{I}^*}^*; 0_{\mathcal{A}^*}]) = [A^TA]_{\mathcal{I}^*\mathcal{I}^*} + \lambda \nabla ^2 \Vert x_{\mathcal{I}^*}^*\Vert ^p_p\), so the iterates converge to a unique stationary point whenever \( [A^TA]_{\mathcal{I}^*\mathcal{I}^*} + \lambda \nabla ^2 \Vert x_{\mathcal{I}^*}^*\Vert ^p_p\) is nonsingular.
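Using the diagonal Hessian recorded above, verifying invertibility is a small linear-algebra computation. Below is a minimal Python sketch of this check; the function name, the random data, and the chosen support are illustrative assumptions rather than part of the paper.

```python
import numpy as np

def lp_reduced_hessian(A, x_star, lam, p):
    """Reduced Hessian [A^T A]_{I*I*} + lam * diag(p(p-1)|x_i*|^(p-2))
    of F(x) = 0.5*||Ax - b||_2^2 + lam*||x||_p^p on the support of x*."""
    I = np.flatnonzero(x_star)                       # support I* of x*
    H_loss = A[:, I].T @ A[:, I]                     # [A^T A]_{I* I*}
    H_reg = np.diag(p * (p - 1) * np.abs(x_star[I]) ** (p - 2))
    return H_loss + lam * H_reg

# Hypothetical stationary point with support {1, 4, 7}
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
x_star = np.zeros(10)
x_star[[1, 4, 7]] = [0.5, -1.2, 0.8]
H = lp_reduced_hessian(A, x_star, lam=0.1, p=0.5)
print("nonsingular:", np.linalg.matrix_rank(H) == H.shape[0])
```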

\(\ell _p\) regularized logistic regression The loss function is the negative log-likelihood

$$\begin{aligned} f(x) = -\sum \limits _{i=1}^m\left[ y_i\ln \sigma (a_i^Tx) + (1-y_i)\ln (1-\sigma (a_i^Tx))\right] ,\end{aligned}$$

with \(a_i\in \mathbb {R}^n\), \(y_i\in \{0,1\}\), and \(\sigma (s) = \frac{1}{1+e^{-s}}\). In this case, \([\nabla ^2 f(x^*)]_{\mathcal{I}^*\mathcal{I}^*} = [A^TD(x^*)A]_{\mathcal{I}^*\mathcal{I}^*}\), where \(D(x^*) = \text {diag}(\sigma (a_i^Tx^*)(1-\sigma (a_i^Tx^*)), i\in \{1,\ldots ,m\})\) and \(A = [a_1, \ldots , a_m]^T\in \mathbb {R}^{m\times n}\). As in the linear case, the iterates converge to a unique stationary point whenever \([A^TD(x^*)A]_{\mathcal{I}^*\mathcal{I}^*} + \lambda \nabla ^2 \Vert x_{\mathcal{I}^*}^*\Vert ^p_p\) is nonsingular.
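The same invertibility check applies here, with the reduced loss Hessian replaced by \([A^TD(x^*)A]_{\mathcal{I}^*\mathcal{I}^*}\). A minimal Python sketch, assuming \(A\) stacks the samples \(a_i^T\) as rows; the function names are again our own illustrative choices:

```python
import numpy as np

def sigmoid(s):
    # logistic function sigma(s) = 1 / (1 + exp(-s))
    return 1.0 / (1.0 + np.exp(-s))

def logistic_reduced_hessian(A, x_star, lam, p):
    """Reduced Hessian [A^T D(x*) A]_{I*I*} + lam * diag(p(p-1)|x_i*|^(p-2))
    for the l_p regularized logistic loss; A is m x n with rows a_i^T."""
    I = np.flatnonzero(x_star)                       # support I* of x*
    s = sigmoid(A @ x_star)
    D = np.diag(s * (1.0 - s))                       # D(x*) as in the text
    H_loss = A[:, I].T @ D @ A[:, I]                 # [A^T D(x*) A]_{I* I*}
    H_reg = np.diag(p * (p - 1) * np.abs(x_star[I]) ** (p - 2))
    return H_loss + lam * H_reg
```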

About this article

Cite this article

Wang, H., Zeng, H., Wang, J. et al. Relating \(\ell _p\) regularization and reweighted \(\ell _1\) regularization. Optim Lett 15, 2639–2660 (2021). https://doi.org/10.1007/s11590-020-01685-x

Keywords

  • \(\ell _p\)-norm regularization
  • Iteratively reweighted algorithm
  • Nonconvex regularization
  • Non-Lipschitz differentiable
  • Maximum A Posteriori